----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, August 04, 2000 4:14 PM
Subject: [MP3 ENCODER] Voice encoding questions


> Howdy All,
>
> In testing my (comparatively naive) hack of the dist10 encoder, I have
> discovered that, while it does OK for music, it has real problems with
> speech signals.  (Caveat:  at our lowest overall bitrate of 300kbps for
> combined video/audio, we run the audio at 32kbit mono - though we go
> way up to 64kbps mono for higher overall bitrate signals, and are
> aiming to default at 64kbps stereo [not joint].)  In particular, the
> broadband noise bursts associated with fricatives really wreak havoc.
>
> My test signal here is spfe49_1 from the AAC SQAM test suite, which is a
> female English speaker going on about giving pills to animals.  I ran it
> through 1) my encoder, 2) LAME (3.85 w/ frame analyzer), 3) mp3enc31,
> and 4) our current Layer-II encoder.
>
> 1) With my encoder (64kbps stereo CRC), every fricative is almost
> painful to listen to, as the pink noise bursts end up being narrow band
> filtered (due to lack of bits - only the MDCT coeffs closest to the pole
> are making it into the bitstream), and there are occasional weird high
> frequency blips and arpeggiation which are very annoying.
>
> 2) LAME (-m s -h -b 64 -p --resample 44.1) (we use CRC and I haven't
> enabled LSF yet) sounds pretty good.  There are occasional minor
> glitches, but that's to be expected at this bitrate.  However, LAME (as
> above plus -k to turn off the filters) sounds pretty similar to what I'm
> getting.  I note that without the forced resampling, LAME will attempt
> to downsample to 22050.

If you want to encode voice signals, I'd suggest using --voice
or --preset voice.
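
On the forced 44.1 kHz: a rough sketch of why the encoders want to drop to
22050 at this bitrate. It only counts how many main-data bits are left per
MDCT coefficient once the frame header, CRC and side info are paid for. The
function name is made up for illustration, and the model assumes constant
bitrate, no padding and no bit reservoir, so treat the numbers as ballpark.

```python
# Rough Layer III bit budget per MDCT coefficient (illustrative sketch,
# not taken from any encoder). Assumes CBR, no padding, no bit reservoir.

def bits_per_coefficient(bitrate_bps, sample_rate_hz, channels, crc=True):
    # 32/44.1/48 kHz are MPEG-1 rates; lower rates use the LSF frame layout.
    mpeg1 = sample_rate_hz >= 32000
    samples_per_frame = 1152 if mpeg1 else 576  # LSF frames hold one granule
    granules = 2 if mpeg1 else 1
    if mpeg1:
        side_info_bytes = 17 if channels == 1 else 32
    else:
        side_info_bytes = 9 if channels == 1 else 17
    frame_bits = bitrate_bps * samples_per_frame / sample_rate_hz
    # 4-byte header, optional 2-byte CRC, then the side info.
    overhead_bits = 8 * (4 + (2 if crc else 0) + side_info_bytes)
    main_data_bits = frame_bits - overhead_bits
    # Each granule carries 576 coefficients per channel.
    return main_data_bits / (granules * channels * 576)


if __name__ == "__main__":
    for rate in (44100, 22050):
        print(rate, round(bits_per_coefficient(64000, rate, 2), 2))
```

With 64 kbps stereo and CRC this comes out at about 0.59 bits per
coefficient at 44100 and about 1.29 at 22050, which fits with the encoders
running out of bits on broadband fricatives at the forced MPEG-1 rate.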


> 3) FhG (-br 64000 -qual 9 -crc -no-is -esr 44100) sounds very good.  (Man,
> is it slow, though.)  Again, without the forced MPEG-1 sampling rate, the
> mp3enc31 will attempt to use 22050.

You're disabling intensity stereo, but not joint stereo. With those
settings, mp3enc is using m/s stereo, which gives it an advantage over
LAME, which you forced to plain stereo (-m s).
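
To illustrate why m/s matters for a speech sample like this one, here is a
minimal sketch of the mid/side rotation on a made-up, nearly mono signal.
The test signal and the helper names are assumptions for illustration only;
a real encoder applies this to the MDCT coefficients and decides per frame
whether to use it.

```python
import math

def ms_encode(left, right):
    """Rotate L/R into mid/side; the 1/sqrt(2) scaling preserves energy."""
    s = 1.0 / math.sqrt(2.0)
    mid = [(l + r) * s for l, r in zip(left, right)]
    side = [(l - r) * s for l, r in zip(left, right)]
    return mid, side

def energy(x):
    return sum(v * v for v in x)

# Hypothetical near-mono signal: right channel is the left one, slightly quieter.
left = [math.sin(0.1 * n) for n in range(576)]
right = [0.95 * v for v in left]

mid, side = ms_encode(left, right)
side_share = energy(side) / (energy(mid) + energy(side))

if __name__ == "__main__":
    print("share of energy in the side channel:", side_share)
```

For a signal like this almost all of the energy ends up in the mid channel,
so the encoder can spend nearly the whole frame on one channel instead of
splitting the bits between two almost identical ones.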


> 4) Layer-II (64 kbps stereo CRC) sounds good.

The Layer II encoder is probably using joint stereo. In Layer II, joint
stereo is quite similar to the intensity stereo of Layer III.



>And what is the capital of Assyria?
The first Assyrian capital was Assur, and it was later replaced by Kalah.

--

Gabriel Bouvigne - France
[EMAIL PROTECTED]
icq: 12138873

MP3' Tech: www.mp3-tech.org


--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
