At the Open Source Business Conference in March 2004, Clayton Christensen gave a presentation. It’s available as an audio file for download here: Clayton Christensen | Capturing the Upside. I strongly recommend listening to the whole thing because it’s the quintessential Disruption lecture.
It has relevance in many areas of analysis, but when I was trying to think of a way to characterize the potential for Siri I recalled one particular passage that I saw as almost clairvoyant. Seven and a half years ago, Clay said:
… the next time you go to a computer superstore, go to the voice recognition software shelf and pick up a box there that’s called the IBM ViaVoice. Now don’t buy it, but just look at it! They have a picture of the customer on the box, and it’s an administrative assistant who is sitting in front of her computer wearing a headset speaking rather than word processing.
You think about the value proposition that IBM has to be making to this woman. She types 90 words a minute. She is 99% accurate. If she needs to capitalize something, she just instinctively presses shift and cruises through. And IBM has to say, “No, don’t do that anymore. I want you to put this headset on and teach yourself to speak in a slow and distinct and consistent manner in complete sentences. If you must capitalize, you must pause, speak the command “capitalize,” pause, speak the word you want to capitalize, pause, speak the command “uncapitalize,” pause, please be patient, we are 70% accurate, this will get better we promise.”
This is not an attractive proposition to this customer. And IBM has — I’ve not worked with them at all, but as I understand it — they’ve spent maybe $700 million trying to make voice recognition technology good enough that it can be used in that market. This is a very difficult technical hurdle to surmount. Meanwhile, while they are investing that aggressively, Lego comes up with these robots that recognize “stop,” “go,” “left,” “right,” and the kids are thrilled with the four word vocabulary, and then “press — or — say — one” kinds of applications take root, and now directory assistants ask you to say the city and state and so on, and much simpler, and an interesting market is emerging.
I bet maybe the next place it takes root is in chatrooms because the kids don’t spellcheck or capitalize anyway, and they would rather speak than type. And maybe then the next application would be, when you see these stubby-fingered executives with their BlackBerrys trying to peck out emails, and their fingers are four times the diameter of the keys, they’re only 70% accurate. If somebody gave them a voice recognition algorithm that really didn’t have to be very good so that they could speak their wireless email rather than peck it out, I bet they’d be thrilled with the crummy product. And ultimately, as it takes root in these new applications, it may get good enough that we can do word processing with it, but it’ll be a long time.
The thing that strikes me about Siri is how “crummy” it is. It’s trying to be an intelligent assistant but it’s not nearly good enough. Many have called it “toy-like”. It was so uninteresting that there was minimal coverage of it after the keynote launch. One can almost forgive the cynics. To them it was “yet another feature”.
However, it does seem to be good at a very limited set of tasks. It is something you can hire for some minor jobs that you’d rather not do: basic calendar booking, context-aware search and a few delightful surprises. But it’s not trying to be much more. It’s not trying to be a typist. It’s not trying to be a companion. It’s not trying to be smarter than you and make you redundant. It’s only trying to help lubricate your life. This is what makes it so exciting.
Like the toy robot that delights the child, Siri delights with simple competence. It’s not profound but, as Clay points out in his talk, it’s a necessary first step. What must follow are many more steps. Siri must get better. And because it will have at least 200 million users, I’m betting it will. So over time it will take on more tasks and will eventually help us in ways that we cannot yet conceive. This is just like the introduction of the capacitive touch screen. Popularizing the touch screen has led to experiences with phones and tablets which we did not think possible four years ago.
But it takes time. Like any truly useful breakthrough, it takes a long time to mature. And also like any disruption, the potential of Siri is rooted in four principles:
- Humble early goals which it accomplishes well
- A large population of enthusiastic adopters who give it sustenance
- Plenty of headroom in improvement giving it areas to grow into with positive feedback
- A patient sponsor who makes a stable living
There’s no magic to it. In fact it’s banal. These are only the principles that every parent uses to raise a child.