Computer Speech and Recognition
Early Computer Speech
In the late 70s I supplied the BASIC interpreter for the Transam Triton, a single-board home computer that you could build from a kit. I was interested in computer speech, so I bought a speech chip, an SC01 made by Votrax.
The Triton was replaced by a BBC Micro (my son's machine) and the SC01 was installed in place of the BBC speech chip. Some games were written that enabled the BBC machine to speak a few words, but not much more. The main problem was the need for a word-to-phoneme lookup: the SC01 is a phoneme-driven chip, so every word had to be translated into a sequence of phonemes before it could be spoken.
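A word-to-phoneme lookup of this kind might be sketched in Java as follows. This is only a minimal sketch under stated assumptions: the class name, the two dictionary entries, the "PAUSE" marker and the phoneme symbols are all illustrative placeholders, not the SC01's actual phoneme codes and not the original program.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class WordToPhoneme {
    // Illustrative entries only. The phoneme symbols are placeholders,
    // NOT the real SC01 phoneme codes.
    private static final Map<String, List<String>> DICTIONARY = new HashMap<>();
    static {
        DICTIONARY.put("hello", List.of("H", "EH", "L", "O"));
        DICTIONARY.put("world", List.of("W", "ER", "L", "D"));
    }

    /** Returns the phonemes for a word, or an empty list if the word is unknown. */
    public static List<String> lookup(String word) {
        return DICTIONARY.getOrDefault(word.toLowerCase(Locale.ROOT), List.of());
    }

    /** Converts a sentence to one phoneme sequence, with "PAUSE" between known words. */
    public static List<String> toPhonemes(String sentence) {
        List<String> out = new ArrayList<>();
        for (String word : sentence.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank input produces one empty token
            }
            List<String> phonemes = lookup(word);
            if (phonemes.isEmpty()) {
                continue; // unknown word: skipped, which is why dictionary size matters
            }
            if (!out.isEmpty()) {
                out.add("PAUSE"); // hypothetical word-gap marker
            }
            out.addAll(phonemes);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(toPhonemes("Hello world"));
    }
}
```

Unknown words are silently dropped here, which shows why a small dictionary limits what the machine can say. A real driver would then map each symbol to the chip's own code and send it to the hardware.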
I did very little more with speech until the mid-90s. I had been programming in Java for a while and became aware of the Java Speech API. IBM produced a version of the API to work with their ViaVoice software. I installed ViaVoice and the Java Speech API on my PC and started working with it.
While it was possible to code both speech and recognition, the implementation was limited and programs were severely restricted by the performance of the available computers.
At Christmas 2009 I was given a Robotics Starter Kit from Parallax Inc. The kit included a Javelin Stamp microprocessor that was programmable in Java. One of my projects was to interface the SC01 speech chip to the Javelin. With a PC driving the system, I now had plenty of memory for lookup tables.
I was able to write Java programs that could speak, and I tried interfacing the speech to The Adventure Game.
It was obvious that I needed a larger dictionary but, more seriously, I needed a much better voice, one I could understand.
I was interested in exploring computer speech and recognition within Artificial Intelligence. Before I could think about the Intelligence side, I needed understandable speech and reliable recognition.
I discovered Cloud Garden, a superb implementation of the Java Speech API. I first used this with IBM ViaVoice. The speech was much better but recognition, with my old version of ViaVoice, was poor. I invested in Dragon NaturallySpeaking V10. These two products made Java speech and recognition both possible and practical. Add to this an MBROLA voice from eSpeak on SourceForge and I had the tools I needed.