::  
::  Also, does anyone know the basis for the ISO switching criterion?  Do they
::  really mean square energy (quadrupled magnitude)?  They give no hints as to
::  how to reconcile the mid/side samples with the right/left psychoacoustics in
::  the loop section of the encoder.  Computing psychoacoustics for the sum and
::  difference signals makes no sense to me, as one is never going to listen to
::  them and thus the psychoacoustic threshold figures are irrelevant, but the
::  alternative of trying to simultaneously allocate bit/noise for both channels
::  seems overly complicated/possibly impossible (oxymoron strikes!).  I'm
::  compromising right now by just calculating the distortion thresholds using
::  the L/R SMR and the M/S signal and bandwidths (and the loop distortions with
::  the M/S signal), but this seems like a pretty silly way to do things, and it
::  doesn't sound very good.  I'm trying not to plagiarize LAME (I still use the
::  ISO/ATT psych model, for one), but I gather that LAME does some sort of
::  mid/side psychoacoustic processing?
::  
1st step:

  - Calculate diffuse field corrected ear-drum SPL°):
    L' = a(w) L + b(w) R,  R' = a(w) R + b(w) L
    (w is a small omega; a(w) and b(w) are complex)
    Note: a and b differ between loudspeakers and headphones
          (for headphones, a \approx 1 and b \approx 0)
    Note: search for HRTF to get usable approximations of a and b.
    Note: for very good results you must take into account that a and
          b change with the temporal direct/indirect sound ratio
  - ignore in-brain crosstalk (ca. -60 dB @ 1 kHz, much lower than acoustic crosstalk)
  - calculate threshold for the left and the right ear
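
A minimal sketch of the mixing in this step, per frequency bin. The
function name, the bin layout and all a/b values are made up for
illustration; real a(w), b(w) would come from HRTF data:

```python
import cmath

def ear_spectra(left, right, a, b):
    """Per-bin mixing: L' = a*L + b*R and R' = a*R + b*L (all complex)."""
    if not (len(left) == len(right) == len(a) == len(b)):
        raise ValueError("all inputs must have the same number of bins")
    ear_l = [ai * l + bi * r for l, r, ai, bi in zip(left, right, a, b)]
    ear_r = [ai * r + bi * l for l, r, ai, bi in zip(left, right, a, b)]
    return ear_l, ear_r

# Toy spectra with three bins (made-up values).
left = [1 + 0j, 0.5j, -0.25 + 0j]
right = [0j, 0.5 + 0j, 0.25 + 0j]

# Headphones: a = 1, b = 0, so each ear hears only its own channel.
hp_l, hp_r = ear_spectra(left, right, [1.0] * 3, [0.0] * 3)

# Loudspeakers (assumed toy model): the opposite channel arrives
# attenuated and delayed, i.e. scaled by 0.5 with a phase rotation.
b_ls = [0.5 * cmath.exp(-1j * 0.1 * k) for k in range(3)]
ls_l, ls_r = ear_spectra(left, right, [1.0] * 3, b_ls)
```

The thresholds for the left and right ear would then be computed from
ear_l and ear_r instead of from the raw L and R spectra.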
  
2nd step:
  if frame can be coded in LR mode
    - code L using L and threshold(L)
    - code R using R and threshold(R)
  if frame can be coded in MS mode
    - code M using M and 0.6...0.8 * min(threshold(L),threshold(R))
    - code S using S, the decoded (already quantized) M, threshold(L) and threshold(R)
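
One possible reading of the MS branch, as a sketch only. It assumes
M = (L+R)/2 and S = (L-R)/2, so that L = M+S and R = M-S, and it assumes
the quantization noise of M and S is uncorrelated, so the noise energies
simply add in each ear. A codec with a different M/S scaling needs the
budget rescaled. The name ms_thresholds and the default factor 0.7 are
hypothetical:

```python
def ms_thresholds(thr_l, thr_r, noise_m=None, m_factor=0.7):
    """Per-band noise-energy thresholds for M and S.

    thr_l, thr_r : allowed noise energy per band for the left/right ear
    noise_m      : measured noise energy of the already coded M per band;
                   if None, M is assumed to use its full threshold
    m_factor     : share of the budget given to M (0.6...0.8 in the text)
    """
    thr_m, thr_s = [], []
    for i, (tl, tr) in enumerate(zip(thr_l, thr_r)):
        budget = min(tl, tr)          # the stricter ear limits both channels
        tm = m_factor * budget
        nm = tm if noise_m is None else noise_m[i]
        thr_m.append(tm)
        # Assumed model: N_L = N_R = N_M + N_S, so S gets what M left over.
        thr_s.append(max(budget - nm, 0.0))
    return thr_m, thr_s

thr_m, thr_s = ms_thresholds([1.0, 4.0], [2.0, 3.0], noise_m=[0.5, 2.0])
```

If M happens to be coded with less noise than its threshold allows, the
unused part of the budget goes to S, which is why S is coded after M.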

3rd step:
  select MS/LR depending on:
    - br(MS)/br(LR) ratio
    - the past of the audio file
    - bit reservoir fill level
    - (the future of the audio file, using a mechanism I mailed about before)
    - user options?
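
A toy decision rule combining some of these inputs, purely as a sketch.
The hysteresis idea, the 5% margin and all names are my assumptions for
illustration, not tuned values and not what any existing encoder does:

```python
def choose_ms(bits_ms, bits_lr, prev_was_ms, reservoir_fill,
              force=None, max_margin=0.05):
    """Return True to code the frame in MS mode, False for LR.

    bits_ms, bits_lr : bits needed for the frame in each mode (bits_lr > 0)
    prev_was_ms      : mode of the previous frame (the "past")
    reservoir_fill   : bit reservoir fill level, 0.0 (empty) to 1.0 (full)
    force            : user option, "ms" or "lr", overrides everything
    """
    if force is not None:
        return force == "ms"
    ratio = bits_ms / bits_lr
    # With a full reservoir we can afford to stay in the previous mode even
    # if it is slightly more expensive; with an empty one we take the
    # cheaper mode strictly.
    margin = max_margin * reservoir_fill
    limit = 1.0 + margin if prev_was_ms else 1.0 - margin
    return ratio < limit

stay_ms = choose_ms(103, 100, prev_was_ms=True, reservoir_fill=1.0)
stay_lr = choose_ms(98, 100, prev_was_ms=False, reservoir_fill=1.0)
```

Here stay_ms is True and stay_lr is False: with a full reservoir, a 2-3%
difference is not enough to switch modes in either direction.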

-- 
Best regards
Frank Klemm

°) SPL = sound pressure level
   
 

--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
