On 10/24/2012 01:56 AM, Bruce wrote:
> I am not the person to code it. But could we, when we have sufficient
> computational resources, correct errors using a Bayesian estimation of
> the speech information likely to follow a particular speech string?
> Although the fundamental frequency in speech is very speaker-specific,
> the relative change in frequency and harmonic content might be less so.
> Having learned some corpus of this, we might be able to successfully
> predict the value of a lost frame given a number of previous frames. Or
> maybe we could learn a speaker in real-time and predict subsequent frames.
>
>
You should be very wary of predictive strategies in circumstances like
this. Prediction cannot provide information (in the proper
information-theoretic sense of the word). If the signal is highly redundant,
various schemes can work to fill gaps with something pleasant sounding, and
there's probably still enough genuine material around to maintain the
original meaning of the speech. With a codec at very low bit rates, most
of the redundancy has been squeezed out, and what is left has a fairly
high proportion of actual information. Most "predictions" will be little
more than wild speculations. If they are based on historical precedent,
the result will probably sound fairly natural, but if the speaker is
not repeating a commonly used phrase it may come out completely wrong.
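
As a rough illustration of the redundancy point, here is a toy sketch. It is
not Codec 2 code and does not model speech. It assumes a first-order
autoregressive sequence as a stand-in for a "redundant" signal and white
noise as a stand-in for parameters with the redundancy squeezed out; the
names ar1 and repeat_last_error are made up for this example. It fills each
"lost" value by repeating the previous one and measures the error relative
to the signal power.

```python
import random

def ar1(n, a, rng):
    """Toy signal: x[t] = a*x[t-1] + sqrt(1 - a^2)*noise (roughly unit variance)."""
    scale = (1.0 - a * a) ** 0.5
    x, out = 0.0, []
    for _ in range(n):
        x = a * x + scale * rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def repeat_last_error(signal):
    """Squared error of guessing x[t] = x[t-1], divided by the signal power."""
    err = sum((signal[t] - signal[t - 1]) ** 2 for t in range(1, len(signal)))
    power = sum(v * v for v in signal[1:])
    return err / power

rng = random.Random(0)              # fixed seed so runs are repeatable
redundant = ar1(20000, 0.99, rng)   # neighbouring values strongly correlated
squeezed = ar1(20000, 0.0, rng)     # neighbouring values independent

err_redundant = repeat_last_error(redundant)
err_squeezed = repeat_last_error(squeezed)
print("redundant signal, relative error:", err_redundant)
print("squeezed signal, relative error: ", err_squeezed)
```

For the correlated sequence the guess lands close to the true value, so the
relative error is small. For the independent sequence the error comes out
larger than the signal power itself, meaning the guess is worse than
inserting silence. That is the sense in which a predictor has little to work
with once most of the redundancy is gone.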

Steve
