The machines that learned to listen
Back then, computing systems were extremely expensive and inflexible, with limited memory and computational speed. Even so, Audrey could recognise the sound of a spoken digit – zero to nine – with more than 90% accuracy, at least when uttered by its developer HK Davis. It worked with 70-80% accuracy for a few other designated speakers, but far less well with unfamiliar voices. “This was an amazing achievement for the time, but the system required a room full of electronics, with specialised circuitry to recognise each digit,” says Charlie Bahr of Bell Labs Information Analytics.
Because Audrey could recognise only the voices of designated speakers, its use was limited: for instance, it could offer voice dialling for, say, toll operators, but it wasn’t really a necessity because in most cases manual push-button dialling of numbers was cheaper and easier. Audrey was an early bird – it preceded general purpose computers, and although it was not used in production systems, “it showed that speech recognition could be made practical”, says Bahr.
But there was another goal. “I believe Audrey was initially developed to reduce bandwidth, the volume of data travelling over the wires,” says Bahr’s colleague Larry O’Gorman of Nokia Bell Labs. Recognised speech would require much less bandwidth than the original sound waves. But as telephone switches became digital in the 1970s and 80s, they enabled faster and cheaper call routing, while still depending on an operator to recognise a person’s request to dial a number. So, in the 1970s and 80s, a huge effort in Bell Labs’ speech research went into one simple task: recognising the digits zero to nine, plus ‘yes’ and ‘no’. “With recognition of these 12 words, the telephone system was able to complete the transition to machine-only telephony,” says O’Gorman.
Audrey was not the only kid on the block, though. In the 1960s, several Japanese teams worked on speech recognition, the most notable efforts being a vowel recogniser from the Radio Research Lab in Tokyo, a phoneme recogniser from Kyoto University, and a spoken-digit recogniser from NEC Laboratories.