Re: [Flashcoders] RE: Flash speech-to-text

Steven Sacks Wed, 26 Aug 2009 22:42:32 -0700

This is how you record sound:
http://www.getmicrophone.com/?p=69

If you're asking how to convert sound waves into text, dude, what? Do you realize how challenging speech recognition is? Wait, why am I asking you this? If you did, you wouldn't be asking people on a Flash list how to do it, as if it's some piece of code somebody can copy and paste or a few links that will tell you the secret formula.

Most speech-to-text programs are based on hidden Markov models. In speech recognition, the hidden Markov model would output a sequence of n-dimensional real-valued vectors (with n being a small integer, such as 10), outputting one of these every 10 milliseconds. The vectors would consist of cepstral coefficients, which are obtained by taking a Fourier transform of a short time window of speech and decorrelating the spectrum using a cosine transform, then taking the first (most significant) coefficients. The hidden Markov model will tend to have in each state a statistical distribution that is a mixture of diagonal covariance Gaussians which will give a likelihood for each observed vector. Each word, or (for more general speech recognition systems), each phoneme, will have a different output distribution; a hidden Markov model for a sequence of words or phonemes is made by concatenating the individual trained hidden Markov models for the separate words and phonemes.
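
To make two pieces of that paragraph concrete, here is a rough Python sketch (not ActionScript, and nowhere near a recognizer). It computes cepstral coefficients for one short frame, then scores a vector against a mixture of diagonal covariance Gaussians. The function names, the Hamming window, the 8 kHz sample rate in the usage lines, and the log taken before the cosine transform are my assumptions; the log is the usual step for cepstra but the paragraph above does not spell it out. The DFT is the naive O(n^2) version, kept only so the sketch needs nothing beyond the standard library.

```python
import cmath
import math


def cepstral_coefficients(frame, n_coeffs=10):
    """Return the first n_coeffs cepstral coefficients of one short frame."""
    size = len(frame)
    # Hamming window (an assumed choice) to reduce leakage at the frame edges.
    windowed = [
        s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (size - 1)))
        for i, s in enumerate(frame)
    ]
    # Naive DFT, keeping only the non-redundant half of the spectrum.
    half = size // 2 + 1
    log_spectrum = []
    for k in range(half):
        acc = sum(
            windowed[t] * cmath.exp(-2j * math.pi * k * t / size)
            for t in range(size)
        )
        # Log magnitude; the small floor avoids log(0).
        log_spectrum.append(math.log(abs(acc) + 1e-10))
    # DCT-II decorrelates the log spectrum; keep the leading coefficients.
    return [
        sum(
            log_spectrum[m] * math.cos(math.pi * q * (m + 0.5) / half)
            for m in range(half)
        )
        for q in range(n_coeffs)
    ]


def gmm_log_likelihood(vector, weights, means, variances):
    """Log-likelihood of one vector under a diagonal-covariance Gaussian mixture."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        # Diagonal covariance: each dimension contributes independently.
        for x, m, v in zip(vector, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over the mixture components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


# Usage: 80 samples is 10 ms of audio at an assumed 8 kHz sample rate.
frame = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(80)]
coeffs = cepstral_coefficients(frame)
score = gmm_log_likelihood(
    coeffs, [1.0], [[0.0] * len(coeffs)], [[1.0] * len(coeffs)]
)
```

In a real system each HMM state would hold its own trained weights, means, and variances, and the score above would be computed for every frame against every candidate state. The training and the search over state sequences are the hard parts, and none of that is shown here.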

There you have it. That's a high-level overview of speech to text. Do you understand anything in that paragraph? Probably not.

Unless you're willing to study and put in the time to figure out how to do this, you're not going to figure it out. Nobody is going to point you in the right direction because this is a very niche knowledge area and none of these people are on Flashcoders. They're at universities working on their doctorates or working for the military or government, or some private company and they're not sharing this information. This is the stuff patents are made of.

So either give up now (because what you want is some easy solution and there isn't one) or start doing real research, learn some serious calculus, become an expert on sound, speech, waveforms, and then figure out how to port all of this into Flash, which, in all likelihood, lacks the performance to actually achieve this.

You'll probably have to do it on the server, passing the sound to the server as an mp3 file, and then passing the text back. That's the only thing I can think of that would possibly be able to do this.

Prove me wrong. If you pull this off, you could probably build an entire company around your technology.

_______________________________________________
Flashcoders mailing list
Flashcoders@chattyfig.figleaf.com
http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
