On 10/24/2012 01:56 AM, Bruce wrote:
> I am not the person to code it. But could we, when we have sufficient
> computational resources, correct errors using a Bayesian estimation of
> the speech information likely to follow a particular speech string?
> Although the fundamental frequency in speech is very speaker-specific,
> the relative change in frequency and harmonic content might be less so.
> Having learned some corpus of this, we might be able to successfully
> predict the value of a lost frame given a number of previous frames. Or
> maybe we could learn a speaker in real-time and predict subsequent frames.
>

You should be very wary of predictive strategies in circumstances like this. Prediction cannot provide information (in the proper information-theoretic sense of the word). If the signal is highly redundant, various schemes can fill gaps with something pleasant sounding, and there is probably still enough genuine material around to preserve the original meaning of the speech. With a codec at very low bit rates, most of the redundancy has been squeezed out, and what is left has a fairly high proportion of actual information. Most "predictions" will be little more than wild speculation. If they are based on historical precedent, the result will probably sound fairly natural, but if the speaker is not repeating a commonly used phrase it may come out completely wrong.
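
To put a rough number on that, here is a minimal sketch, not Codec 2 code. It assumes a toy model in which each value is correlated with the previous one by a factor rho (a unit-variance AR(1) process), and a "lost" value is filled in with the best linear guess from its predecessor. The function names and the two rho values are made up for illustration. For this model the prediction error is about 1 - rho^2, so prediction only helps to the extent that redundancy is still present.

```python
import math
import random


def simulate_ar1(rho, n, rng):
    """Generate n samples of a unit-variance AR(1) process with correlation rho."""
    x = [rng.gauss(0.0, 1.0)]
    innovation_scale = math.sqrt(1.0 - rho * rho)
    for _ in range(n - 1):
        # New sample = predictable part + genuinely new information.
        x.append(rho * x[-1] + innovation_scale * rng.gauss(0.0, 1.0))
    return x


def concealment_error(rho, n=50000, seed=1):
    """Mean squared error when every sample is treated as lost and
    replaced by the best linear prediction from the previous sample."""
    rng = random.Random(seed)
    x = simulate_ar1(rho, n, rng)
    err = 0.0
    for prev, actual in zip(x, x[1:]):
        predicted = rho * prev
        err += (actual - predicted) ** 2
    return err / (n - 1)


if __name__ == "__main__":
    # rho values are assumptions chosen only to contrast the two cases.
    cases = [("highly redundant signal", 0.95),
             ("redundancy mostly squeezed out", 0.10)]
    for label, rho in cases:
        mse = concealment_error(rho)
        print(f"{label}: rho={rho:.2f}, prediction MSE={mse:.3f}")
```

Guessing the long-term mean (zero) gives an error of about 1.0 in both cases. With rho = 0.95 the predictor does far better than that; with rho = 0.10 it is barely better than the blind guess, which is the situation a very low bit rate codec puts you in.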
Steve
