2017 was a breakthrough year for Artificial Intelligence (AI). Computers are now outperforming humans in speech recognition and game playing. This success builds on a similar breakthrough in image recognition back in 2015. But do these advances mean that AI itself has finally been solved? And what opportunities are there for businesses to make use of AI?
AI is a technology that has promised much over the past 30 years. Back in 1987, Apple revealed its vision of a future personal assistant with the “Knowledge Navigator”. Though we can see some of the features of the Knowledge Navigator in Siri, we are still a long way off that kind of capability.
So how have the most recent advances in AI been achieved? It’s fair to say that most of them have come from “Deep Learning” algorithms. Deep learning is a form of Machine Learning (ML), the branch of Artificial Intelligence in which systems learn automatically from experience rather than being explicitly programmed.
Deep learning is an advanced form of machine learning, which centres around learning layered representations of data directly from examples. It has become possible following recent advances in three main areas: better algorithms, more powerful computers (CPUs and GPUs) and data, lots of data. In simple terms, deep learning improves on conventional neural networks by using multiple hidden layers. This requires far more processing capability and huge quantities of data.
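To make the idea of "multiple hidden layers" concrete, here is a minimal Python sketch. It is an illustration only: the layer sizes, the tanh activation and the helper names (make_layer, build_network, forward, count_parameters) are assumptions chosen for this example, the weights are random, and no training takes place. It simply shows that a deeper network is the same building block stacked more times, with more parameters to learn as a result.

```python
import math
import random


def make_layer(n_inputs, n_outputs, rng):
    """Create one fully connected layer: a weight matrix and a bias vector."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
               for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases


def build_network(sizes, seed=0):
    """Build a list of layers, e.g. sizes=[4, 8, 2] has one hidden layer of 8 units."""
    rng = random.Random(seed)
    return [make_layer(n_in, n_out, rng)
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(layers, inputs):
    """Pass the inputs through every layer in turn, applying tanh after each one."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


def count_parameters(layers):
    """Count the weights and biases that training would have to adjust."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


shallow = build_network([4, 8, 2])        # conventional: one hidden layer
deep = build_network([4, 8, 8, 8, 2])     # "deep": three hidden layers

print(count_parameters(shallow), count_parameters(deep))
print(forward(deep, [0.5, -0.2, 0.1, 0.9]))
```

Even in this toy case the deeper network has several times as many parameters as the shallow one, which hints at why deep learning needs so much more processing power and data.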
Recognition of conversational speech has always been more challenging than transcribing a single speaker or shorter utterances. On the Switchboard conversational corpus (which consists of 2400 telephone conversations by 543 different speakers on a range of topics such as sport and politics), human transcription yielded a word error rate of 5.9%. In 2017, however, both Microsoft and IBM bettered this, with error rates of 5.1% and 5.5% respectively.
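For readers wondering how such an error rate is arrived at, the usual measure is word error rate: the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below is a simplified illustration with made-up sentences. The function name word_error_rate is an assumption for this example, and real evaluations apply text normalisation rules that are not shown here.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1        # reference word missing from hypothesis
            insertion = current[j - 1] + 1    # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Made-up example: one word dropped ("a") and one misheard ("latte" as "late").
reference = "i would like to order a large latte"
hypothesis = "i would like to order large late"
print(word_error_rate(reference, hypothesis))  # 2 errors / 8 words = 0.25
```

An error rate of around 5% on this measure therefore corresponds to roughly one word in every twenty being wrong, missing or added.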
In game playing, meanwhile, AlphaGo, which beat human opponents in 2016 and 2017, was superseded by AlphaGo Zero, a self-taught AI system. The successor beat AlphaGo after just three days of self-play. Similarly impressive results have been achieved in poker and Pac-Man.
With three core problems of AI (speech recognition, image recognition and game playing) seemingly solved, does this mean AI itself is solved? No. What it does mean is that some of the building blocks of Artificial Intelligence are now good enough to be used in a larger AI system. The building blocks themselves aren’t of great use on their own (except as a measure of progress). What is missing is understanding.
What exactly does speech recognition give us? In practical terms it gives us transcribed words with an error rate of about 1 word in 20. Does that mean you can say almost anything to a personal assistant, such as Siri, and it will work perfectly? Unfortunately not, because recognising words is only one part of the larger AI problem of understanding language.
Indeed, from the words recognised, the assistant then has to understand what to do. Was it a question? What was the context? Do we have to take any actions? Words alone don’t allow Siri to do anything useful. Siri is handy in the areas that it knows about and is expecting. However, once you move outside those areas, its knowledge quickly becomes limited.
Amazon's Alexa Skills work in the same way. Skill developers define a set of expected words and phrases (e.g. "I'd like to order a large latte", "can I have a regular cappuccino please") and then create a range of responses and actions that are to be executed when those words are spoken. Alexa overcomes misrecognised words by instead concentrating on identifying slots (such as the size and type of coffee). A response can then be triggered to ask for any missing information (e.g. "I'd like a latte", "What size would you like?", "Can I have a large").
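The slot-based approach can be sketched in a few lines of Python. To be clear, this is a toy illustration of the idea and not the Alexa Skills Kit itself: the slot names, the word lists and the functions extract_slots and respond are all assumptions made up for this example.

```python
# Hypothetical slot vocabularies for a coffee-ordering skill.
SIZES = {"small", "regular", "large"}
DRINKS = {"latte", "cappuccino", "espresso"}

# Question to ask when a slot has not been filled yet.
PROMPTS = {
    "drink": "What would you like to drink?",
    "size": "What size would you like?",
}


def extract_slots(utterance):
    """Pick out known slot values and ignore every other word."""
    slots = {}
    for word in utterance.lower().split():
        word = word.strip(".,?!")
        if word in SIZES:
            slots["size"] = word
        elif word in DRINKS:
            slots["drink"] = word
    return slots


def respond(utterance, slots=None):
    """Merge new slot values into the order and ask for whatever is missing."""
    slots = dict(slots or {})
    slots.update(extract_slots(utterance))
    for name in ("drink", "size"):
        if name not in slots:
            return PROMPTS[name], slots
    return f"Ordering a {slots['size']} {slots['drink']}.", slots


reply, order = respond("I'd like a latte")
print(reply)   # What size would you like?
reply, order = respond("Can I have a large", order)
print(reply)   # Ordering a large latte.
```

Because only the slot words matter, a misrecognised "I'd" or "please" makes no difference to the outcome. The flip side is that anything outside the expected vocabulary is simply ignored.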
What does image recognition enable? In practical terms, it means we can classify pictures and identify elements within pictures and video. More mature than speech recognition, image recognition already has some practical uses. For example, a system has been built to identify skin cancer from photographs at an accuracy rate that meets or exceeds that of humans.
Despite advancing the field considerably, deep learning itself is no silver bullet. It can certainly be applied across a number of fields, but it requires large amounts of data, or the capacity to run many simulations in order to generate that data and learn from it. The game of Go has simple rules. No luck is involved and there is no hidden information. This means it can be run as a perfect simulation in software. Problems that involve physical hardware, such as a car or a nuclear reactor, are much harder to simulate accurately.
However, what these advances in AI do mean is that to build an AI system, you no longer need to be an AI expert. Google, Microsoft and Amazon all offer interfaces (APIs) that can be called by other systems. Some deep learning libraries, such as TensorFlow, have been released as open source, whilst Amazon has released its core recommendation algorithm. The opportunity is there for specialists in particular sectors such as the medical, financial or automotive fields to take these AI building blocks and solve genuine problems. Only then can we even consider thinking that AI might be solved.