The Metadata is the Medium

Apple’s Choice of File Format Signals a New Photograph

Josh Shagam / June 2017

Apple recently announced its adoption of an image file format standard called HEIF: High Efficiency Image File Format. Starting with the release of iOS 11, every photograph taken on an iPhone or iPad will use this nascent format (its definition was formalized in 2015). The change will be invisible to the majority of iPhone and iPad owners. Photos will look the same in the photo album app and when sharing over email or messaging services. Still, this move away from the JPEG image format signals an important paradigm shift in photography. The differences between HEIF and JPEG, the image standard of mobile phone photography and the web at large, point to a fundamental change in our expectations and use of photographs. [1]

To understand why this matters, here are the basic features that define what HEIF can do:

-it reduces file size using efficient lossy compression algorithms that are currently used for high-quality video

-a single HEIF file can “hold” a set of photographic images, meaning it can contain a burst sequence or a set of bracketed exposures used for High Dynamic Range (HDR) processing

-it supports audio and text that can be time-synchronized to image content

-it supports parametric image edit metadata (instructions for how to process the image without making those changes permanent)

-it supports lossless compression

-it supports 10-bit image data

-its metadata storage structure is highly extensible
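Two of these bullets are easy to make concrete. The sketch below is only an illustration in Python, not the HEIF data model: the `Photo` class, the `add_edit` and `render` names, and the "brighten" and "invert" instructions are all invented for this example. It shows how many tonal levels 10 bits buy over 8, and what it means for an edit to be an instruction rather than a permanent change.

```python
from dataclasses import dataclass, field

MAX_10BIT = 2 ** 10 - 1  # 1023, the highest value a 10-bit sample can hold


def levels(bit_depth):
    """Number of distinct tonal values per channel at a given bit depth."""
    return 2 ** bit_depth


@dataclass
class Photo:
    """Hypothetical stand-in for a stored image plus its edit instructions."""
    pixels: tuple                      # the original samples, never modified
    edits: list = field(default_factory=list)

    def add_edit(self, name, amount=0):
        # Record the instruction only; the stored pixels stay untouched.
        self.edits.append((name, amount))

    def render(self):
        # Apply the recorded instructions to a copy at display time.
        out = list(self.pixels)
        for name, amount in self.edits:
            if name == "brighten":
                out = [min(MAX_10BIT, p + amount) for p in out]
            elif name == "invert":
                out = [MAX_10BIT - p for p in out]
        return out


photo = Photo(pixels=(0, 512, 1000))
photo.add_edit("brighten", 100)
print(levels(8), levels(10))   # 256 1024
print(photo.render())          # [100, 612, 1023]
print(photo.pixels)            # (0, 512, 1000)
```

The rendered version changes while the stored samples do not, which is the whole point of parametric edit metadata: the edit can be revised or discarded later without any loss.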

There’s more, but only those of us who read standards documents for fun and follow the ruminations of technical working groups will find the details exhilarating. The first bullet point is particularly interesting because it means having image sets in a storage footprint akin to a small video rather than a pile of full resolution images. We all enjoy the fruits of similarly smart compression algorithms every time we sit down to watch Netflix or YouTube videos over limited-bandwidth networks. I’m not interested in doing comparison tests to find out if HEIF looks better than JPEG, though such comparisons are important. Instead, I’m intrigued by the narrative that emerges from this successor format’s utility.

JPEG, the incumbent, ubiquitous image file format, has lived for multiple decades and supports a lossy-compressed image with a limited amount of metadata (date, time, location, camera information). It uses a limited color space, limited bit depth and does not support audio or animation. It has served us all extremely well… yet it’s no secret that we’ve been waiting for a true successor so that it can retire with dignity. There have been contenders: WebP, BPG, FLIF. I’m not going to put a stake in the ground and claim that HEIF will swiftly supplant JPEG. It’s hard to establish a new standard, even with an excellent contender. That conversation involves discussing marketing strategy, developer adoption and a fair amount of speculation, i.e. endless internet forum debate fodder.

I’d rather turn your attention to the implications of Apple’s move to HEIF. One of the primary smartphone makers in the business has picked a new format to store, modify and share photographs on a platform with millions of daily users. I believe this signals a shift in what a photograph is, how it lives and changes as a visual communication medium. The shift is twofold.

1. A photograph is no longer a single image. We expect it to be more than a two-dimensional array of pixels. I wholeheartedly expect (and emotionally demand) that texted photographs of my three-year-old nephew be Live Photo vignettes, brief glimpses of life in motion; at once both an adorable still photograph and a video punctuated by audio and movement. And yeah, I desperately await a reality where Harry-Potter-newspaper-esque moving photos are commonplace across all platforms by default. I believe that “photograph” is still the best term to describe this (at the very least, it sidesteps the GIF pronunciation debate). Snapchat, Boomerang and Giphy are all evidence that micro-video-type content has grown beyond a fad and is increasingly part of our social language. It’s not so much about abbreviating video as it is a solution for extending the photographic moment. HEIF offers this multi-image or vignette-video functionality without requiring extra files alongside the standard image. It’s both at the same time and has the right version to serve up to the app or platform that requests it.

2. A photograph is more than the pixels shown onscreen. Metadata gives software situational awareness and technical clues to make smart, automatic adjustments and suggestions (those auto-generated montage videos can be creepy, though). The extensibility of the HEIF format– meaning, its ability to support yet-unrealized metadata– means room for growth as we continue to explore how to augment light recording with data. Non-visual metadata is becoming a fundamental component in constructing better photographs and generating effective delivery mechanisms. We may not realize how often an algorithm helped avoid a group photo with blinking subjects, for example.

Additionally, taking advantage of the data underlying image pixels means better-realized editing workflows. Photographers are painfully acquainted with raw files, sidecar metadata and output versions of images. This traditional way of working meant creating derivative files: an original, a JPEG for mobile, a JPEG for print, and so on. File management on mobile devices is particularly onerous. HEIF stores the original image (note: not raw) with a host of secondary, supplemental or derivative image products. Alternate exposures, depth maps, sound, and/or alternate views are stored in a single container that remembers their relation to one another. It means that the file itself handles file management and the image maker spends more time making images.
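The idea of one container that remembers how its contents relate can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a HEIF parser and not the layout the standard defines: the `Item` and `Container` classes, the `derived_from` field and the `pick` method are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    item_id: int
    kind: str                           # e.g. "image", "depth_map", "audio"
    derived_from: Optional[int] = None  # id of the item this one belongs to


@dataclass
class Container:
    """Hypothetical single-file container holding related items."""
    primary_id: int
    items: dict = field(default_factory=dict)

    def add(self, item):
        self.items[item.item_id] = item

    def related_to(self, item_id):
        # Everything that declares itself as derived from the given item.
        return [i for i in self.items.values() if i.derived_from == item_id]

    def pick(self, wanted_kind):
        # Serve the requested kind if present, else fall back to the primary.
        for item in self.items.values():
            if item.kind == wanted_kind:
                return item
        return self.items[self.primary_id]


photo = Container(primary_id=1)
photo.add(Item(1, "image"))
photo.add(Item(2, "depth_map", derived_from=1))
photo.add(Item(3, "audio", derived_from=1))

print([i.kind for i in photo.related_to(1)])  # ['depth_map', 'audio']
print(photo.pick("depth_map").item_id)        # 2
print(photo.pick("video").item_id)            # 1
```

An app that understands depth maps asks for one and gets it; an app that asks for something the file lacks still gets the primary image. The bookkeeping lives in the file rather than in a folder of loosely named derivatives.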

I find HEIF’s metadata support for depth maps particularly exciting. Starting with the iPhone 7 Plus, Apple introduced the ability to blur portrait backgrounds to emulate what larger-sensor cameras can do. This is accomplished using a secondary sensor to estimate the separation of subject and background. They were not the first to do it, nor the best: HTC introduced a similar technique a few years ago and Google has its Lens Blur app. Everyone wants to use additional sensors to bring smartphone imaging up to par with dedicated camera hardware. Apple dipping their toes in the water of computational photography is important because the move to HEIF points to a long-term play at dual-camera-derived depth data. Background blurring will get better and become commonplace, but the real hook is augmented reality (AR) applications that take advantage of depth information to superimpose graphics and interfaces. As we’ve seen from Snapchat and Pokemon Go, a photograph becomes much more than a record of reality when coupled with facial recognition, machine learning and location awareness. Augmented or “mixed reality” applications using smartphone imaging promise to further sculpt our relationship with visual media. [2]

The new photograph is a sequence, a collection of momentary data, an inhale and exhale of breath. It is space-aware, post-capture focusable, endlessly editable. It is a face-recognized, machine-learning-identified, time-stamped, GPS-logged document of who, what, when and where. These are all natural and exciting advancements for the medium and serve as evidence that our desire to record reality is intertwined with experiencing and enhancing it. The HEIF format is simply a technical mile marker of where our collective visual fluency sits in the year 2017. Photography is becoming less defined by what’s composed in the frame and more by which recorded data are exploited by the software interpreting it.

It’s always possible that the HEIF format will fail to gain traction or something better will come along. Perhaps we’ll finally tire of GIFs, Live Photos, Snapchat filters and faux-blurred backgrounds. It is a chicken-and-egg reality that introducing a new standard (especially when it’s done by a company with self-interested motives) does not make it universally supported, adopted or “the standard.” For this reason, I’m not so interested in evaluating the format in a vacuum. I’m interested in exploring our use of photographs in conversation, on social media and in our art. The technology of the day informs these uses. It’s my belief that the photograph is different from what it was five years ago. Its definition will likely change a great deal before another five pass. For now, the new photograph is more than a single image: it’s a collection of data that defines a moment.

[1] I say this in the general sense, understanding fully that professional photography in all its forms continues to live and breathe the traditional photograph. However, the rise and prominence of smartphone imaging has led to a regular-user-centric push for quality, features and functionality. The rest of this article speaks more to the influence of the billions of photographs taken by non-professionals, excluding commercial use.

[2] I’ve decided to call this (at least in the pretentious confines of my own head) the New Augmented Topographics. We’re entering an era of photographing the world that gathers immense amounts of spatial and temporal data. Environment-aware augmented reality applications will rely on us aiming our multitude of sensors at all of the things in our lives to which we assign value and attention. Artificial intelligence will gain more objective insight from our photographs than we will. The hope is that we get additional, subjective utility and enjoyment from the exchange.
