Speech to voice?
You may have heard it when calling a company or watching a product presentation online: a computer voice produced with speech to voice. To hear what we mean, open Google Translate and try its pronunciation feature in different languages. It sounds a little robotic, right?
“Translators not needed any more?” The beginning of a new translation world!
Today, if you call a bank in the US, you will almost certainly talk to a computer that can answer simple questions about your account and connect you to a real person if necessary. Several products on the market today, including XBOX Kinect, use speech to voice input to provide simple answers or to navigate a user interface. In fact, Microsoft Windows and Office products have included speech recognition since the late 90s. This functionality has been invaluable to customers with accessibility needs.
Until recently though, even the best speech systems still had word error rates of 20-25% on arbitrary speech.
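For readers wondering what a figure like 20-25% means in practice, here is a minimal Python sketch of how a word error rate is commonly computed: the number of word substitutions, deletions and insertions needed to turn the recognizer's output into the correct transcript, divided by the number of words in the correct transcript. The function name `word_error_rate` and the sample sentences are made up for illustration, and the article does not say exactly how the quoted figures were measured, so treat this as the usual textbook definition rather than the researchers' own method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a reference word is missing
                curr[j - 1] + 1,     # insertion: an extra word was recognized
                prev[j - 1] + cost,  # substitution (or an exact match)
            ))
        prev = curr
    return prev[-1] / len(ref)


# Invented example: two of the eight spoken words are misrecognized.
reference = "please connect me to a real person now"
hypothesis = "please connect me to the real parson now"
wer = word_error_rate(reference, hypothesis)
print(f"{wer:.0%}")  # 25%
```

Two wrong words out of eight gives an error rate of 25%, which is the "one word in 4" end of the range mentioned above.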
Just over two years ago, researchers at Microsoft Research and the University of Toronto made another breakthrough. By using a technique called Deep Neural Networks, which is patterned after human brain behavior, researchers were able to train more discriminative and better speech recognizers than previous methods.
According to Rick Rashid, Microsoft’s Chief Research Officer: “During my October 25 presentation in China, I had the opportunity to showcase the latest results of this work. We have been able to reduce the word error rate for speech by over 30% compared to previous methods. This means that rather than having one word in 4 or 5 incorrect, now the error rate is one word in 7 or 8. While still far from perfect, this is the most dramatic change in accuracy since the introduction of hidden Markov modeling in 1979, and as we add more data to the training we believe that we will get even better results.”
“Of course, there are still likely to be errors in both the English text and the translation into Chinese, and the results can sometimes be humorous. Still, the technology has developed to be quite useful.”
“Most significantly, we have attained an important goal by enabling an English speaker like me to present in Chinese in his or her own voice, which is what I demonstrated in China. It required a speech to voice system that Microsoft researchers built using a few hours speech of a native Chinese speaker and properties of my own voice taken from about one hour of pre-recorded (English) data, in this case recordings of previous speeches I’d made.”
“In other words, we may not have to wait until the 22nd century for a usable equivalent of Star Trek’s universal translator, and we can also hope that as barriers to understanding language are removed, barriers to understanding each other might also be removed. The cheers from the crowd of 2000 mostly Chinese students, and the commentary that’s grown on China’s social media forums ever since, suggests a growing community of budding computer scientists who feel the same way about speech to voice.”
Every day we hear from happy clients who email us about our work. Visit us on Facebook and give us a like.
We at NordicTrans endorse this attitude, which is why we are eager to receive your translation request, whether it is medical, technical, legal, financial or in any other specialization. We will treat it with the utmost respect and professionalism, so contact us ([email protected]) at any time of day. We are here for you for translation services or even speech to voice.