On Wearables


I've been thinking about wearable technology.

The Strategy

Google Glass has received a lot of scorn from many directions. There are the obvious privacy concerns; people don't like the idea of being under constant surveillance by strangers. Some don't see the appeal of having technology so constantly and literally in-your-face. Some complain that Glass creates a barrier between the users and the world, making it difficult to communicate with the wearer. Some just think it looks stupid.

Popular opinion is split between the positions of Google Glass being an inevitable future, and a complete flop. Both sides miss the point. Whether Glass itself is successful or not is immaterial. Glass is a proving ground. It's an expensive experiment in predictive and contextual information retrieval, hands-free interfaces, constrained information display, and the general challenges involved in producing useful wearable tech, including connectivity, prolonged battery life, material science limitations, and overcoming social bias.

The continued shrinking of the key components involved in heads-up displays is inevitable. At some point, we will have contact lens displays, or virtual retinal displays, or something similar. While unobtrusive, personal visual display is an undoubtedly difficult technical problem to solve, the real challenge is how to make it useful. New, non-traditional input and output methods will require new methods of interaction, and this is exactly where Google is investing its effort. When the technology is ready (or, indeed, spearheaded to fruition by Google themselves), guess who will be waiting with the software and systems to take full advantage of it?

Quoth Ray Kurzweil, who now works for Google (emphasis added):

Most technology projects fail not because the technology doesn’t work, but because the timing is wrong - not all of the enabling factors are at play where they are needed. So I began to study these trends in order to anticipate what the world would be like in 3-5 or 10 years and make realistic assessments. That continued to be the primary application of this study. I used these methodologies to guide the development plans of my projects, in particular when to launch a particular project, so that the software would be ready when the underlying hardware was needed, the needs of the market, and so on.

These methodologies had the side benefit of allowing us to project development 20 or 30 years in the future. There is a strong common wisdom that you can’t predict the future, but that wisdom is incorrect. Some key measures of information technology - price-performance, capacity, bandwidth - follow very smooth exponential trends. (CRN interview, 2005)

So, I see the strategy. I respect it. I also wonder if we couldn't do better.

The Question

Not long ago, I visited my grandparents and extended family in Minnesota. My grandfather damaged his hearing in the Korean War, and it's been getting progressively worse as he ages. It had been a few years since I'd last seen him, and I was surprised at how withdrawn he's become. Growing up, he was always the storyteller - the one who would chime in with a funny anecdote relevant to the conversation. But his inability to follow the conversation has robbed him of that.

Grandpa Browne has multiple types of hearing aids, but none of them work particularly well for him. I was stunned to learn how much hearing aids cost: they start at a few hundred dollars each and can run to $10,000 or more. All that, for something which only marginally improves his hearing.

While observing my family for the few days I was there, I noticed something I thought very interesting: as long as Grandpa Browne knew vaguely what we were talking about, he had a much easier time "hearing" us. This, of course, is the miracle of the brain. It's a giant memory and prediction engine. Perception of a given signal is hugely influenced by the brain's prediction of what the signal will be. The prediction and positive identification of a previous or subsequent signal significantly narrows the space of probable interpretations for the current signal. And the previous or subsequent signal may come from other senses. The feeling of fur touching your skin may not be surprising if you heard your cat's bell collar a moment before, or a moment after. Hearing thunder without seeing lightning might startle you. Seeing lightning without seeing clouds or feeling a change in the weather might do the same. Thus, Grandpa Browne's ability to correctly "hear" a word is heavily influenced by his current input (the sound of the word), simultaneous or subsequent input (seeing the motion of your lips), and previous input (the current conversational subject).
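This narrowing effect can be framed as Bayesian inference: a noisy acoustic signal combined with a contextual prior. A toy sketch (all probabilities here are invented purely for illustration):

```python
# Toy illustration of how context collapses the space of likely words.
# All numbers are made up for the example.

def posterior(acoustic_likelihood, context_prior):
    """Combine a noisy acoustic score with a contextual prior
    (Bayes' rule, up to normalization)."""
    scores = {w: acoustic_likelihood[w] * context_prior.get(w, 0.01)
              for w in acoustic_likelihood}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}

# Degraded audio makes "cheese" and "trees" nearly indistinguishable...
acoustic = {"cheese": 0.5, "trees": 0.5}

# ...but knowing the conversation is about food decides it.
food_context = {"cheese": 0.3, "trees": 0.001}

ranked = posterior(acoustic, food_context)
print(max(ranked, key=ranked.get))  # cheese
```

The acoustic signal alone is a coin flip; the context prior does almost all the work, which is exactly what Grandpa Browne's brain is doing when he knows the topic.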

After one particularly frustrating exchange between Grandma and Grandpa Browne, wherein Grandpa finally shook his head in apology and non-understanding, I shared my observation with Grandma: "He seems to do pretty well if he knows what we're talking about." She confirmed this. An obvious solution is for Grandpa Browne to carry a small notebook on which the speaker can quickly write key words. But he'll never do this, both because it's time-consuming (it breaks the flow of conversation) and because it places a burden on the speaker. The obvious question is "How can technology solve this?"

"What if he had a little device which would listen to what you say, and display the words in big text?" I asked Grandma Browne.

"Oh, that would be great! He really just needs something to get him going, and then he does pretty well."

This is a fairly simple idea, and the core of it relies on relatively simple technology that exists today. In my head, the productized version of this is a dedicated device with an E-Ink display and a 3G radio enabling access to a remote ASR (automatic speech recognition) service. A Google, Microsoft, or Amazon could whip this out in no time. The hardest part isn't the software, but usability, marketing, and determining the business viability and relevance to their core business. Grandma Browne would gladly pay $10, $15, $20 per month to be able to communicate with her husband. It removes the burden from the speaker, while enabling the listener to gain the context needed to participate more actively in conversation.
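The display side of such a device is nearly trivial. Given a transcript back from the ASR service (the service call itself is assumed and not shown), the device only needs to pull out the content words to show in big text - just enough to "get him going":

```python
# Sketch of the display logic for the hypothetical device: reduce an
# ASR transcript to a few key words for the big-text display. The
# stopword list is illustrative, not exhaustive.

STOPWORDS = {"the", "a", "an", "to", "of", "is", "are", "we", "you",
             "i", "it", "and", "or", "on", "about", "going"}

def key_words(transcript, limit=4):
    """Return up to `limit` content words, preserving spoken order."""
    words = [w.strip(".,!?").lower() for w in transcript.split()]
    return [w for w in words if w not in STOPWORDS][:limit]

print(key_words("We are going to the lake house on Saturday"))
# ['lake', 'house', 'saturday']
```

Even this crude filtering conveys the conversational subject, which is all the listener's prediction engine needs.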

While this may (or may not) be a perfectly marketable product, there is still, in my mind, a fatal flaw: the form factor. The screen (indeed, the device itself) still acts as a very present intermediary between the two partners. It's a barrier, if only a minimal one. So the next question becomes: how do we get rid of the screen?

It's this simple question which has occupied my mind for the past several months. There is an obvious answer, and it has profound implications if followed through. It would fundamentally alter not only the way we interact with technology, but how we perceive the world around us.

The Missing Bridge

Glass and smartwatches aren't technically or socially innovative; they take an existing consumption paradigm and put it on your head or wrist, respectively. As previously mentioned, this approach comes with real challenges, which Google is confronting almost singlehandedly.

Putting aside Glass and smartwatches, the major focus of wearable technology has been collecting and quantifying information for immediate or retrospective analysis. The Fitbit, Jawbone UP, Nike Fuel, etc., are tools for learning more about the activities and habits of the wearer. In fact, recent literature defines "wearables" as devices capable of five basic functions:

  • Sense
  • Process (Analyze)
  • Store
  • Transmit
  • Apply (Utilize)

While the text doesn't explicitly exclude it, no emphasis is placed on the role of wearables as general purpose output (that is, output from something else, input into the human sensory system). General purpose input is equally applicable, but I'm focusing on output at the moment. Modern wearables concern themselves with the task of providing information about the self without increasing cognitive load. While this has value, it neglects an obvious potential function: unobtrusively augmenting our perception of the world.
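The five functions above map naturally onto a minimal device interface. A sketch, using a naive step counter as a stand-in device (all names and thresholds here are illustrative, not from any real product):

```python
# The five basic wearable functions as a minimal interface, with a toy
# step counter standing in for the device.

class StepCounter:
    def __init__(self):
        self.samples = []          # Store

    def sense(self, accel):
        """Sense: record a raw accelerometer magnitude."""
        self.samples.append(accel)

    def process(self):
        """Process (Analyze): naive step detection -- count spikes."""
        return sum(1 for a in self.samples if a > 1.2)

    def transmit(self):
        """Transmit: in a real device this would sync over BLE."""
        return {"steps": self.process()}

    def apply(self):
        """Apply (Utilize): turn the data into a nudge for the wearer."""
        return "keep moving" if self.process() < 3 else "goal reached"

d = StepCounter()
for a in [0.9, 1.5, 1.6, 1.0, 1.4]:
    d.sense(a)
print(d.transmit())  # {'steps': 3}
print(d.apply())     # goal reached
```

Note that every function here points inward, at data about the wearer - exactly the self-quantification framing the definition encodes, with nothing playing the general-purpose output role discussed above.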

Imagine the following scenarios:

  1. You're 24, and have what is considered severe hearing loss. You wear hearing aids, and are reasonably good at speechreading. You're out getting dinner with a group of 4-5 friends at a moderately noisy restaurant. Conversation is lively while waiting for the food, and you're able to track most of what's happening by simply paying attention to who is speaking and who they're speaking to, speechreading and hearing bits and pieces along the way. Once the food arrives, people begin eating, and conversation dies down. You're focusing on wrapping a particularly stubborn noodle around your fork, when a sensation in your arm signals that someone said your name, though you didn't hear it. You look up, and a friend across the table asks you a question. She (quite rudely) has food in her mouth, making it difficult to speechread. Additionally, the noise of the restaurant would render normal hearing aids nearly useless. However, beneath your collar is a ring of small microphones, which beamformed on the speaker after hearing your name. With the audio input focused in the direction of the speaker, you hear most of the question: "That...good! What k...is that?" Nearly immediately, a sensation in your arms and shoulders signals the transcribed dialog to you in a kind of haptic shorthand, and you subconsciously pluck out the pieces you were missing: "looks" and "kind of cheese". You're able to answer the question with confidence, "Goat! It's delicious!"
  2. It's 2:15 pm in Seattle. Google knows you have an appointment with a new dentist at 2:45, and it's a 25-minute walk from your office to the dentist's office. You grab your stuff and rush out the door. You work in South Lake Union, and you know the dentist is downtown somewhere. As you walk briskly down Westlake toward Denny, a sensation in your arm signals you to continue straight on Westlake. As you approach Stewart, it nudges you right, then left on 4th Ave. Just after crossing 4th and Pine, you are signaled that you've reached your destination, which is on the right. The office's suite number is signaled, and you walk into Advanced Dentistry without once stopping to look at your smartphone.
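The microphone ring in Scenario 1 hinges on delay-and-sum beamforming: shift each microphone's signal so that sound arriving from one direction lines up and reinforces itself, while noise from elsewhere averages out. A minimal sketch (in a real array, the delays would be computed from microphone geometry and steering angle; here they're given directly):

```python
# Minimal delay-and-sum beamformer. Delays are in whole samples for
# simplicity; a real implementation would use fractional delays.

def delay_and_sum(channels, delays):
    """Shift each channel by its delay (in samples) and average."""
    n = len(channels[0])
    out = [0.0] * n
    for ch, d in zip(channels, delays):
        for i in range(n):
            if 0 <= i - d < n:
                out[i] += ch[i - d]
    return [x / len(channels) for x in out]

# A pulse reaching three mics with 0-, 1-, and 2-sample offsets:
mics = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]

# Steering "toward" the source realigns the pulse before summing,
# so it reinforces at a single instant instead of smearing.
print(delay_and_sum(mics, delays=[2, 1, 0]))
# [0.0, 0.0, 0.0, 1.0, 0.0]
```

Steer with the wrong delays and the same pulse smears across three samples at a third of the amplitude - which is precisely why focusing "in the direction of the speaker" recovers intelligible speech from restaurant noise.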

In Scenario 1, we see ASR and haptics providing the initial "nudge" needed to engage in conversation, via a wakeword. The microphone array powering the ASR device also "pairs" with traditional hearing aids, enhancing their function. Lastly, ASR and haptics fill the gaps in the missed dialog, providing the brain with enough data to disambiguate the original input. This is a wonderful orchestration of technology, all of which exists today, but has not successfully been put into place for this purpose. The application of wearable technology for the purpose of augmenting human senses is not unfamiliar; those with hearing aids and cochlear implants experience this already. But the majority of technology employed to overcome sensory impairments today focuses on "repairing" or augmenting the impaired sense itself. Little effort is made to enhance already healthy senses with new information, even as science tells us that our amazing brains will utilize the cortical regions typically reserved for the impaired sense to process other senses.

Introducing new signals to our somesthetic senses for the purpose of augmenting our overall perception is not new. Indeed, this is the primary use of haptics - adding either force or tactile feedback as sensory cues into an otherwise complex or opaque system or process - but we've done little to explore the depths of what's possible.

Standard Japanese has only about 100 distinct syllables. Surely it would be possible to create 100 individually recognizable pulse patterns, allowing one to "feel" the spoken language. English has quite a few more distinct sounds, but for the most part we can consider compact representation of spoken/written English a solved problem. We merely need to translate those solutions to the new medium to enable a whole new level of human-computer interaction.
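As a toy encoding of the idea: assign each syllable a unique, short pulse pattern. Seven on/off pulses give 2^7 = 128 codes, enough to cover Japanese. The sketch below builds only the 50 unvoiced base syllables (the voiced rows would roughly double it); a real scheme would of course be tuned for tactile discriminability, not mere uniqueness:

```python
# Toy syllable-to-pulse-pattern codebook. Syllable inventory and
# pattern scheme are simplified for illustration.
from itertools import product

# 9 consonant rows x 5 vowels, plus the 5 bare vowels = 50 syllables.
SYLLABLES = [c + v for c in "kstnhmyrw" for v in "aiueo"] + list("aiueo")

# Each pattern: 7 pulses, each on (1) or off (0) -- 128 possible codes.
patterns = list(product([0, 1], repeat=7))
CODEBOOK = {syl: patterns[i] for i, syl in enumerate(SYLLABLES)}

assert len(set(CODEBOOK.values())) == len(CODEBOOK)  # all codes unique

print(len(CODEBOOK))     # 50
print(CODEBOOK["ka"])    # (0, 0, 0, 0, 0, 0, 0)
```

Whether 100 such patterns are *learnable* through the skin at conversational speed is the real research question - but it is an empirical one, and exactly the kind a prototype can answer.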

This is partially illustrated in Scenario 2. One problem with traditional corrective sensory augmentations is that their target audience (e.g., the deaf) is the only audience which can meaningfully benefit from the technology. Hearing aids will marginally improve the hearing of the hearing impaired, but someone who can already hear won't get much benefit from them; in fact, they may cause hearing damage, eventually necessitating their use. But the smart application of haptics and supporting technology is potentially beneficial to all, as it provides a means of adding potentially limitless novel senses to the brain, either through simple cues (the equivalent of a spoken grunt), or complex information transfer, or both.

The applications boggle the mind. Scenario 2 portrays bird-like homing capabilities. Simple, yet magical. What if we paired it with biometric sensors, and linked devices together over a distance? I could feel my partner's anxiety level spike, and discreetly send her a questioning or reassuring cue in return. What if health care providers could feel exceptional vital signs from their patients, at a distance? Instantly "knowing" that the patient just down the hall in room 218 is crashing might be the difference between life and death. Connecting one's haptic system to a building's environmental monitoring could give occupants an immediate and visceral response to a gas leak, fire, or other hazard. Firefighters could feel nearby persons in need of aid - critical in cases with limited visibility.

But the applications needn't be so dramatic. The ability to add and incorporate novel, virtual senses has the potential to profoundly impact every industry in ways both large and small. The future of wearable technology - indeed, technology in general - is to be as subtle and contextually relevant as it is pervasive.

The Plan

Alan Kay is often quoted as saying "The best way to predict the future is to invent it." To that end, I've begun a new project: a haptic sleeve, paired with a microphone array and automatic speech recognition.

The V1 prototype is well under way. In its purest, most stripped-down form, the haptic sleeve by itself is an output-only device, and the purpose of the V1 prototype is to establish viability of the form factor for complex information transfer. If successful, V2 will pair the sleeve with simple audio-driven haptic feedback, and V3 with ASR.

I'm super excited to be exploring this space, and will use this site to provide updates on the project. If you or anyone you know would be interested in testing early prototypes, please contact me.