I have briefly tried the "--voice" mode and the "normal" mode when
encoding a purely voice signal (with background noise) at 8kbps, and
have been very impressed with the difference. I would like to compress
the signal more... but 8 is as low as it goes.

The "normal" mode renders the voice absolutely unintelligible (I assume
the encoder tries too hard to preserve the background).

The "--voice" mode actually seems to reduce the background garbage
(noise) where there is no speech, and to also concentrate on the speech
when it is present.

I have looked at the spectrogram for each, and there is a BIG
difference.

My question (after all this guff) is "does LAME perform any smarts (like
looking for particular frequency domain patterns), and if so, what?"

I have read most of the past articles on "--voice" but they don't tell
me all I wish to know. I am also starting with an 11K samples/sec
file (mono) and having to up-sample it to 44.1K before I can process it.
Has anyone considered allowing different input sample rates (e.g. the
standard 16, 22.05, 24, 32, 48) as well as 44.1?
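
To make the up-sampling step concrete, here is a rough sketch of the kind
of thing I mean. It assumes the "11K" file is really 11.025K samples/sec,
so that 44.1K is an exact factor of 4, and it uses plain linear
interpolation purely for illustration. The function name is made up, and a
proper resampler would low-pass filter rather than draw straight lines
between samples.

```python
def upsample_linear(samples, factor):
    """Up-sample a mono signal by an integer factor (illustrative only).

    Each pair of neighbouring input samples is joined by a straight line
    and `factor` output samples are taken along it. The final input
    sample is simply held, since it has no right-hand neighbour.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if not samples:
        return []

    out = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        for k in range(factor):
            # k/factor is the fractional position between a and b
            out.append(a + (b - a) * k / factor)

    # Hold the last sample so the output is exactly factor * len(samples) long
    out.extend([samples[-1]] * factor)
    return out


# Assumed rates: 11025 Hz in, 44100 Hz out, i.e. a factor of 4
IN_RATE = 11025
OUT_RATE = 44100
FACTOR = OUT_RATE // IN_RATE
```

With these assumed rates, upsample_linear(pcm, FACTOR) turns each second
of input (11025 samples) into 44100 samples.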


--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
