> 
> Frank Klemm wrote on Fri, 08 Sep 2000:
> > Now a list of "a" values. Sorry about using a little bit of math.
> > 
> > See the list below.
> > Some stereo pieces of music have an "a" > 1.667 or < 0.600:
> > 
> > 1.25/1.92   Giora Feidman/Rabbi Chaim's Dance -- [15] Tishrei Saba.wav
> > 1.02/1.75   technik/Fraunhofer_Beispiele/track7.wav
> > 1.29/1.74   technik/Fraunhofer_Beispiele/The Ecstasy Of Gold (Ennio Morricone).wav
> > 0.56/0.83   technik/Fraunhofer_Beispiele/si.wav
> > 0.69/0.55   technik/Fraunhofer_Beispiele/mstest.wav
> > 0.63/0.54   Giora Feidman/Rabbi Chaim's Dance -- [03] Melody from the Film "The Dibbuk".wav
> > 0.74/0.33   technik/Fraunhofer_Beispiele/An old Song of (Pink Floyd).wav
> > 0.95/0.23   technik/Fraunhofer_Beispiele/t1.wav
> > 
> > These tracks should have problems, to a greater or lesser degree, with mode selection.
> > 
> > --- Data ------------------------------------------------------------------------------------
> >  x[AC]     y[AC]     r         type        x[DC]     y[DC]   sy/sx  File
> <snip> 
> >  67.848%   50.007%   45.091%   MS-Stereo  0.011%    0.005%   0.737  technik/Fraunhofer_Beispiele/main_theme.wav
> >  13.056%    4.277%   46.340%   MS-Stereo                     0.328  technik/Fraunhofer_Beispiele/main_theme.wav
> 
> Frank, listen to the above sample when forced to mid-side coding.
> This excerpt from "Main Theme", track 8 of Pink Floyd's soundtrack 
> from the film More, will sound awful when there are M/S frames.
> Even at 320 kbit/s you will easily detect distortions. This is
> an example where all frames should be L/R coded.
> 
> 
> Ciao Robert
> 
> 


This is sample 'mstest.wav' on the web site (which is now back up,
www.sulaco.org or www.mp3dev.org)

This sample, and several others, are what caused us to abandon
using the amount of L/R correlation to decide whether mid/side
stereo is OK.  In mstest.wav the channels are very well correlated,
but M/S stereo gives bad results.

When this signal is encoded as M/S, encoding the side channel
introduces some noise, which is then spread to both channels
when decoded.  But only one channel has enough masking to
tolerate this noise.

Here's a simple example:

Suppose there is a 50 dB signal in the L channel only.
When encoding mid/side, the side channel will see this
as a 50 dB signal and allow 30 dB of noise (for example).
When decoded, we will now have 30 dB of noise in the L
channel (where it is masked by the original 50 dB signal)
and 30 dB of noise in the R channel (where there was nothing).
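
The arithmetic of that example can be checked with a few lines of
Python.  This is only a sketch under assumptions that are mine and
not stated above: the decoder computes L = (M + S)/sqrt(2) and
R = (M - S)/sqrt(2), the noise in M and S is uncorrelated, and the
mid channel is given the same 30 dB noise allowance as the side
channel.  The function names are made up for the illustration.

```python
import math

def db_to_power(level_db):
    """Convert a level in dB to linear power."""
    return 10.0 ** (level_db / 10.0)

def power_to_db(power):
    """Convert linear power to a level in dB."""
    return 10.0 * math.log10(power)

def decoded_noise_db(noise_m_db, noise_s_db):
    """Noise level that lands in EACH of L and R after M/S decoding.

    Assumes L = (M + S) / sqrt(2), R = (M - S) / sqrt(2) and that the
    noise in M and S is uncorrelated, so the noise powers add and are
    then halved by the 1/sqrt(2) scaling.
    """
    total = db_to_power(noise_m_db) + db_to_power(noise_s_db)
    return power_to_db(total / 2.0)

# Numbers from the example: 50 dB signal in L only, 30 dB of noise allowed.
signal_l_db = 50.0
noise_db = decoded_noise_db(30.0, 30.0)

# L still has the 50 dB signal on top of the noise; R has the noise alone.
margin_l_db = signal_l_db - noise_db
print("noise in L and in R: %.1f dB" % noise_db)
print("signal-to-noise margin in L: %.1f dB" % margin_l_db)
print("R has no signal, so the same noise is unmasked there")
```

The point is that the decoder cannot steer the noise: whatever the
M and S channels are allowed, both output channels receive the same
amount, regardless of how much masking each one has.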

The fix for this was to follow MPEG AAC and encode M/S only when the
amount of masking is similar in both the L and R channels.
AAC allows M/S encoding on a band-by-band basis, which
would be nice.  But since this is not allowed in MP3, we
take some type of average of the L and R channel masking
differences, and this has to be less than some number
before M/S encoding is allowed.
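
A rough sketch of such a decision, in Python.  The details here are
my assumptions for illustration, not the actual encoder code: the
masking thresholds are given as positive linear energies per band,
the "average" is a plain mean of the absolute per-band difference in
dB, and the 5 dB limit is a made-up placeholder, not a tuned value.

```python
import math

def ms_allowed(mask_l, mask_r, max_avg_diff_db=5.0):
    """Return True if M/S may be used for this frame.

    mask_l, mask_r: per-band masking thresholds (positive linear
    energies) for the L and R channels, one entry per band.
    max_avg_diff_db: hypothetical limit on the mean L/R difference.
    """
    if len(mask_l) != len(mask_r) or not mask_l:
        raise ValueError("need the same non-zero number of bands for L and R")
    total_diff_db = 0.0
    for ml, mr in zip(mask_l, mask_r):
        # Absolute difference between the two thresholds, in dB.
        total_diff_db += abs(10.0 * math.log10(ml / mr))
    return total_diff_db / len(mask_l) < max_avg_diff_db

# Similar masking in both channels: M/S is allowed.
print(ms_allowed([100.0, 80.0, 60.0], [90.0, 85.0, 55.0]))

# Strong signal in L only (thresholds about 50 dB apart): stay with L/R.
print(ms_allowed([1e5, 1e5, 1e5], [1.0, 1.0, 1.0]))
```

Because the decision is one flag for the whole frame, a single band
with very different masking can be outvoted by the others, which is
the price of not having AAC's per-band switch.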

The algorithm was extensively tuned against mp3enc 3.1 and
gives surprisingly similar results, so I think it must be
very close to what FhG also uses.  See:

http://www.sulaco.org/gpsycho/ms_stereo.html

You could probably replace this with a correlation measure
which weights each critical band the same (rather than
weighting all frequencies equally).  But this would produce
very similar results, since a well correlated signal in
a given band will result in very similar maskings for
that band.
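
To make "weights each critical band the same" concrete, here is a
small Python sketch.  It assumes the signal has already been split
into bands (the band splitting itself is not shown), and the
function names are hypothetical.  The pooled version is included
only as a stand-in for an ordinary correlation, to show the contrast.

```python
import math

def band_correlation(left, right):
    """Normalized L/R correlation of one band, or None if a channel is silent."""
    lr = sum(a * b for a, b in zip(left, right))
    ll = sum(a * a for a in left)
    rr = sum(b * b for b in right)
    if ll == 0.0 or rr == 0.0:
        return None
    return lr / math.sqrt(ll * rr)

def band_weighted_correlation(bands):
    """Mean of the per-band correlations, each band counted once."""
    values = [band_correlation(l, r) for l, r in bands]
    values = [v for v in values if v is not None]
    if not values:
        raise ValueError("no band has energy in both channels")
    return sum(values) / len(values)

def pooled_correlation(bands):
    """Correlation over all samples at once, dominated by the loudest bands."""
    left = [a for l, _ in bands for a in l]
    right = [b for _, r in bands for b in r]
    return band_correlation(left, right)

# One loud band that is perfectly correlated, one quiet band that is
# perfectly anti-correlated between L and R.
bands = [
    ([1000.0, -1000.0, 1000.0, -1000.0], [1000.0, -1000.0, 1000.0, -1000.0]),
    ([1.0, -1.0, 1.0, -1.0], [-1.0, 1.0, -1.0, 1.0]),
]
print("band-weighted:", band_weighted_correlation(bands))
print("pooled:       ", pooled_correlation(bands))
```

In this toy input the pooled measure is close to 1 because the loud
band swamps everything, while the band-weighted measure sees that
half of the bands disagree.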


Mark


--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
