Hello Gabriel,
Tuesday, August 22, 2000, 12:43:07 PM, you wrote:
GB> First, please note that it has been a long time since I really looked
GB> inside the Lame code, so I'll perhaps tell a few wrong statements. (btw,
GB> please could anyone explain to me when to use the word "tell" and when "say"?)
I hope you don't expect English advice from me :)).
>> If I understand correctly, the "-mj" is evaluating if a frame
>> qualifies for M/S coding beforehand, and if so, it will then be coded
>> in M/S, independent of the outcome.
GB> There is also another parameter: trying to minimise the toggling between s
GB> and ms
If this really is necessary, this condition could be left in (even the
current M/S criterion), but because "-mx" would get its results from
experiment, I'd like it to throw out as many predictions as possible.
Just let it "compute" and pick the best one.
If, of course, this kind of excessive toggling is a decoder problem,
it would need to be a criterion met in the encoder. If not, just let
the encoder encode, and pick out the frames with the lowest noise...
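To make this concrete, a rough C sketch of the idea; every type and
helper here is made up for illustration, not actual Lame internals:

    /* "-mx" sketch: encode the frame in both modes and keep whichever
     * actually comes out better; no prediction involved. */
    enum { MODE_S, MODE_MS };
    typedef struct frame frame_t;                    /* opaque here */
    typedef struct { int bits; double noise; } result_t;

    extern result_t encode_as(frame_t *f, int mode); /* hypothetical */

    int pick_mode(frame_t *f)
    {
        result_t ms = encode_as(f, MODE_MS);
        result_t s  = encode_as(f, MODE_S);
        if (ms.noise != s.noise)             /* lowest noise wins... */
            return ms.noise < s.noise ? MODE_MS : MODE_S;
        return ms.bits <= s.bits ? MODE_MS : MODE_S; /* ...then size */
    }

Twice the work per frame, of course, but the decision is based on what
the encoder actually produced instead of on a guess.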
>> I've heard my fair share of examples where lame opts for M/S, but
>> afterwards this is a bad choice, giving an M/S frame sounding much worse
>> than S would have, or in vbr, more bytes are used on the M/S frame
>> compared to the S frame.
GB> Does this really happen in vbr? Could you please try using Mp3x and see if
GB> the same frame could sometimes use more space in ms than in s?
I have more than just an educated guess when it comes to this.
btw: could someone update that stats display at the end of encoding?
I'd like a counter of how many M/S and S frames are in each bitrate.
Much easier and faster than using Mp3x.
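Something along these lines would already do; a sketch, not actual
Lame code (indices 1-14 are the usual MPEG-1 Layer III bitrate slots,
and the kbps table is assumed to come from the encoder):

    #include <stdio.h>

    /* one counter slot per bitrate index, split by stereo mode */
    static long ms_count[15], s_count[15];

    void count_frame(int bitrate_index, int used_ms)
    {
        if (used_ms) ms_count[bitrate_index]++;
        else         s_count[bitrate_index]++;
    }

    void print_mode_stats(const int *kbps_table)
    {
        int i;
        printf("kbps    M/S      S\n");
        for (i = 1; i < 15; i++)
            if (ms_count[i] || s_count[i])
                printf("%4d %6ld %6ld\n",
                       kbps_table[i], ms_count[i], s_count[i]);
    }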
GB> It seems so strange... If it's true, I think that there is a mistake
GB> somewhere in the ms bit allocation
Why is it so strange? Is it feasible that a reasonably simple formula
for deciding whether a frame is fit for M/S could _exactly_ predict how
it comes out after encoding? It can never do so 100% accurately.
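Such a formula typically boils down to an energy threshold. A made-up
sketch of the kind of check I mean (the threshold is invented, and
this is not Lame's actual criterion):

    /* predict M/S suitability: if the side signal (L-R) carries
     * little energy compared to the mid signal (L+R), assume M/S
     * coding will pay off */
    int predict_ms(const double *l, const double *r, int n)
    {
        double e_mid = 0.0, e_side = 0.0;
        int i;
        for (i = 0; i < n; i++) {
            double m = 0.5 * (l[i] + r[i]);
            double s = 0.5 * (l[i] - r[i]);
            e_mid  += m * m;
            e_side += s * s;
        }
        /* one global number decides for the whole frame -- exactly
         * the kind of guess that cannot always match the encoded
         * outcome */
        return e_side < 0.1 * e_mid;    /* made-up threshold */
    }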
To make my point, let me quote Mark Taylor himself (about JS):
> This works much better than the algorithm suggested in the ISO MP3
> spec. But you still run into trouble: what if 90% of the bands can
> handle mid/side encoding, and 10% cant? LAME has to make a decision
> in these cases, and it is possible it can make the wrong decision.
This is proven quite clearly to me by the Velvet example:
- Sounds _fine_ in 192S (-ms)
- Totally flunks in 192JS (-mj)
So even if the JS sample has only 35% M/S frames, this is still
obviously too much, because M/S frames are used where S ones would
clearly have been the better choice.
With this in the back of my mind, I looked at what vbr (-V1 -q1) did
on the same sample (mode / kbps / frame count / share):
Joint Stereo 320 113 (24.8%)
Stereo 320 99 (21.7%)
I know there are more possible causes (bit-reservoir conditions etc.)
for this behaviour, but they would be very unlikely: with the bit
reservoir, one time a JS frame is bigger, another time the S one would
be, cancelling out each other's effects in the long run.
So let's interpret this in the simplest way:
* we know M/S makes mistakes on this sample (192 cbr)
* #JS 320 frames > #S 320 frames
Educated guess: Lame opts for M/S for the same reason it did in the
cbr case, but after encoding it ends up with a very big amount of
introduced noise -> a high frame size, and maybe even maxes out at 320.
All the while it would have been better off with a 256S frame or so...
>> problem: once the criterion is met, and a frame tagged as
>> "M/S"-material, it will always be an M/S, even if S would have been
>> better.
GB> Not always: I think that if we get something like s-s-s-ms-s-s it will be
GB> converted before bitstream formatting to s-s-s-s-s-s in order to reduce the
GB> toggling artefact.
In general. Or practically: much too often ;)
>> Big advantage of this prediction method is the speed.
GB> I don't know if it's still the case, but in the past both ms and s data were
GB> computed as the mode of a frame could be changed according to the next one.
That would be nice.
>> Since you never have 100% accurate prediction this is one of the most
>> prominent causes of poor quality in -mj mode. (read that post
>> of mine referring to 192JS of the Velvet track)
GB> This Velvet track must have some (perhaps not yet known) other difficulties,
GB> as the results are quite catastrophic for every encoder, including mp2 ones.
Lame 192S sounds fine, and so does -V1 -q1, but I'm thinking the vbr
file is unnecessarily big...
>> What I'm suggesting: a "-mx" mode (or whatever letter)
GB> This is, to my mind, the goal of -mj, so any change should be made into -mj
I disagree. Initially I was also thinking this, but when I discussed
all these alleged improvements, I found a healthy amount of
reservation about this idea because of the big implications. The
suggestion was: the M/S prediction needs tweaking for this problem
wav, rather than changing the whole system.
And, in retrospect, I understand this. The current -mj mode is very
fine, and it can be tweaked to also account for the specific problems
of the wav I used. But in the end, it will still remain a system based
on prediction, with the accompanying weaknesses.
That's why I suggest the new "-mx" setting. This one would be an
utterly stupid and inefficient algorithm, but the resulting files
would be the best possible. (and slow to encode)
-mx would also offer a very nice tool to tweak the -mj M/S conditions:
you have a file at your disposal that consists only of the best
possible sequence of MS-S-MS-... frames, and you can strive to
generate a -mj predictive model that comes close.
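For that tweaking, a trivial sketch, assuming the per-frame mode
decisions of both runs can be dumped somehow:

    /* compare -mj's per-frame decisions against the optimal sequence
     * produced by -mx and report the agreement rate */
    double mj_accuracy(const int *mj, const int *mx, int nframes)
    {
        int i, hits = 0;
        for (i = 0; i < nframes; i++)
            if (mj[i] == mx[i])
                hits++;
        return nframes ? (double)hits / nframes : 0.0;
    }

Tune the -mj conditions until that number stops going up.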
"-mj" is fundamentally flawed, and I think you should build in a
post-check to make sure choosing M/S over S paid off.
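A sketch of such a post-check, reusing the made-up types from the
"-mx" sketch above; quantize() and over_threshold() are hypothetical
helpers. The point is that the S frame is only computed when the M/S
result disappoints, so it stays much cheaper than "-mx":

    extern result_t quantize(frame_t *f, int mode);  /* hypothetical */
    extern int over_threshold(result_t r);    /* noise above mask? */

    int encode_checked(frame_t *f)
    {
        result_t ms, s;
        ms = quantize(f, MODE_MS);       /* trust the -mj prediction */
        if (!over_threshold(ms))
            return MODE_MS;              /* prediction paid off */
        s = quantize(f, MODE_S);         /* only now pay for S */
        return s.noise < ms.noise ? MODE_S : MODE_MS;
    }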
Also, when "-mx" is there as an option, it's just that: an option. You
would not be forced to undergo a 200% slowdown to get minimal quality
returns.
GB> not so much slower, if both sets of data are already computed
I hope so :).
--
Best regards,
Roel mailto:[EMAIL PROTECTED]
--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )