MD: Listening to compression

Howard Chu Mon, 15 Jan 2001 20:12:25 -0800

> From: "Eric Woudenberg, Minidisc.org Editor"
> Hi Howard,

Howdy!

> "Howard Chu" <[EMAIL PROTECTED]> writes:
> > Use a bit-accurate CD-ROM drive and rip an audio file off a CD onto
> > your PC.  This is the original, "perfect" data source. Encode it as an
> > MP3 using any bitrate you care to test, and then decode it back to PCM
> > format. If you invert this file and add it to the original WAV file,
> > the result will be the difference between the two signals, the "error"
> > between the original and the MP3 file. You can listen to this error
> > signal and measure its RMS value, to give both subjective and
> > objective quality measures.
>
> As you state, the error signal (or its square) is a valid quantitative
> measure of the loss incurred by the coder. But even though you can
> listen to the error as an audio signal, it isn't a valid subjective
> measure since the error in question is never normally heard
> standalone, rather only as a deviation in the presence of the full
> audio signal (and is hence being masked by the signal).

I guess that's true; perhaps its real value is in objective measurement.
It certainly reveals a lot, though, such as how much high-frequency signal
is thrown away by a particular encoding. It can also highlight artifacts
that you can identify once you know they're present, such as the
pre-transient noise in the Microsoft Audio format. (Something that MP3 and
ATRAC don't suffer from... See http://www.real.com/msaudio/ for the story.)
Once you know it's there, your subjective perception will be more attuned
to the flaws...
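
The difference test quoted above can be sketched in a few lines. This is
only an illustration, assuming both signals have already been read into
lists of float samples at the same sample rate and are time-aligned (real
MP3 encoders and decoders add a delay, so the decoded file usually has to
be shifted and trimmed first). The function names are made up for this
example.

```python
import math

def difference_signal(original, decoded):
    """Subtract decoded from original, sample by sample.

    This is the same as inverting the decoded signal and adding it
    to the original. Extra trailing samples in either list are ignored.
    """
    return [o - d for o, d in zip(original, decoded)]

def rms(samples):
    """Root-mean-square value of a list of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def error_db(original, decoded):
    """RMS of the error relative to the RMS of the original, in dB.

    More negative means the error is quieter relative to the signal.
    """
    err = rms(difference_signal(original, decoded))
    ref = rms(original)
    if err == 0.0:
        return float("-inf")   # bit-identical: no error at all
    if ref == 0.0:
        return float("inf")    # silent original, non-silent error
    return 20.0 * math.log10(err / ref)

# Toy data standing in for the ripped WAV and the decoded MP3.
original = [1.0, -1.0, 1.0, -1.0]
decoded = [0.9, -0.9, 0.9, -0.9]
error = difference_signal(original, decoded)
```

The error list is what you would write back out as a WAV file to listen
to; rms(error) and error_db(original, decoded) give the objective figure.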

> General aside on subjective evaluation of coders:
>
> Automatic measurement of the subjective quality of perceptual encoders
> is a current research topic and involves some degree of modeling the
> human auditory system in the measurement phase. I've always thought
> that this was a bit of a tail-chasing problem, since if you've got a
> model with superior accuracy in the measurement system, why not use
> that model in the encoder as well and further reduce the perceptual
> loss?

Heh. You can't develop a high-precision model for human perception; you
are inherently forced to approximate since every person is different...
But I suppose a decent neural-net could duplicate a lot of the behavior.
It works for vision, and the nervous system's basic characteristics are
consistent throughout. (It can be said that all five of the human senses
operate in "block floating" mode. Overall you can sense a very wide dynamic
range of signals, but you only have a small window on that range at any
given time. E.g., you can see details in dim light, and you can see well
in bright light, but a sudden transition from dim to bright or vice versa
leaves you blinded: the input has exceeded your dynamic range window, your
visual system has "clipped" ... Hearing is not much different, except that
you don't have the equivalent of the AGC that the eyes (pupil, iris) have.)
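
To make the "block floating" analogy concrete, here is a minimal sketch of
block floating-point quantization: every sample in a block shares one
scale factor set by the loudest sample, and each sample keeps only a few
mantissa bits. It is meant to illustrate the number format, not to model
hearing or any particular codec; the function name and the 4-bit mantissa
are assumptions chosen for the example.

```python
import math

def bfp_quantize(block, mantissa_bits=4):
    """Quantize a block using one shared exponent and small signed mantissas."""
    peak = max(abs(x) for x in block)
    if peak == 0:
        return [0.0] * len(block)
    # Shared exponent: smallest power of two that covers the block's peak.
    exponent = math.ceil(math.log2(peak))
    # Largest mantissa magnitude available with the given bit count.
    max_mantissa = 2 ** (mantissa_bits - 1) - 1
    step = 2.0 ** exponent / max_mantissa
    # Each sample is rounded to the nearest multiple of the shared step.
    return [round(x / step) * step for x in block]

quiet = [0.01, -0.02, 0.015]
quiet_alone = bfp_quantize(quiet)           # window sits on the quiet samples
with_loud = bfp_quantize(quiet + [0.9])     # one loud sample moves the window
```

On their own, the quiet samples get a fine step and survive. Put a loud
sample in the same block and the step grows so much that the quiet samples
all round to zero: the window has moved and the detail below it is gone.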

> An automatic system that produced reliable subjective scores for
> audio and speech coders would be a great boon to developers, since
> they could save the considerable time and money spent using human
> subjects. (They could also save wear and tear on their subjects -- at
> Lucent I would occasionally participate in experimental evaluations of
> cell-phone speech coder samples (as a favor to colleagues). I found it
> to be painstaking and frequently frustrating work).
>
Heh. The worst I ever had to do was sit at a newly designed computer desk
and give opinions on its ergonomic efficiency...

  -- Howard
