Re: [MP3 ENCODER] various questions (was: Some suggestions for LAME - please review)

2000-09-24 Thread menno

"Gabriel Bouvigne" <[EMAIL PROTECTED]> wrote:
> Is there a scalefactor for >16kHz in AAC? (Meno, are you listening?)

AAC has scalefactor bands that cover the whole frequency range, and
scalefactors are calculated for every scalefactor band.

Bye, Menno

--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )



[MP3 ENCODER] Fwd: download vs pirate vs bootleg

2000-09-19 Thread Menno


> Undercover  -  The Rolling Stones Mailing List
>  ==
>{SWIPED} from:
>http://dailynews.yahoo.com/h/nm/2918/en/music-student_1.html
>
>Monday September 18 7:55 PM ET
>Oklahoma student may face music download charges
>OKLAHOMA CITY (Reuters) - An Oklahoma State University student could
>face criminal charges of copyright infringement after police found as
>many as 1,000 Internet music files on his computer, campus police said Monday.
>
>Police seized the personal computer and a CD recorder from the
>student's dorm room after university officials were notified by the
>Recording Industry Association of America which is campaigning against the 
>wide-spread practice of copying and moving music over the Internet.
>
>University officials said the Washington D.C.-based RIAA, which
>represents big record companies, had notified the school that it had
>detected a high volume of music downloads to the campus computer network.
>
>``My understanding is he was maintaining files of all these songs and
>making them available to others,'' said Chief Everett Eaton of the
>Oklahoma State University Police Department.
>
>A computer forensic specialist has since been busy analyzing the files
>on the computer's hard drive, said OSU police Lt. Steve Altman.
>``The computer specialist feels there may be in excess of a thousand
>files,'' Altman said. ``That could cause state felony charges to be
>filed for copyright infringement.''
>
>Altman declined to name the 19-year-old male student, who has not been
>arrested. The results of the police investigation will be turned over
>to a district attorney and it could be weeks before any charges are
>filed, Altman said.
>
>Nestor Gonzales, a spokesman for the university, said the student was
>downloading music using several different Internet protocols including
>Napster (news - web sites), a program that allows users to exchange
>music via the Internet.
>
>``That was one of the protocols he was using,'' Gonzales said. ``He
>may have been using others. It wouldn't have mattered. The high volume
>of downloads warranted action.''
>
>``It does not appear he was selling the files or profiting in any way
>from the downloads,'' Gonzales added.
>
>Reuters/Variety REUTERS




Re: [MP3 ENCODER] psymodel ?

2000-06-12 Thread Menno Bakker

Hi,

This formula is meant to calculate the threshold for MDCT from the threshold
for FFT. It indeed only works when that threshold is calculated for every
frequency bin.


>
> Hi,
>
> Tmdct(i) can be calculated directly from Tdft(i) with this formula:
>
> Tmdct(i) = (2/M) * Tdft(i) * (cos( 2*Pi*n0*(i + 0.5)/N - /_S(i) ))^2
>
> where:
> M = number of samples in frequency domain
> N = number of samples in time domain
> n0 = (M+1)/2
> S(i) = the DFT, where /_S(i) is the phase
>
> If you want the full explanation, please let me know.
>
> Bye, Menno
>





Re: [MP3 ENCODER] psymodel ?

2000-06-07 Thread Menno Bakker

Hi,

Tmdct(i) can be calculated directly from Tdft(i) with this formula:

Tmdct(i) = (2/M) * Tdft(i) * (cos( 2*Pi*n0*(i + 0.5)/N - /_S(i) ))^2

where:
M = number of samples in frequency domain
N = number of samples in time domain
n0 = (M+1)/2
S(i) = the DFT, where /_S(i) is the phase

If you want the full explanation, please let me know.

Bye, Menno


- Original Message - 
From: "Takehiro Tominaga" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, June 07, 2000 1:51 PM
Subject: [MP3 ENCODER] psymodel ?


> Hi all
> 
> I am not an expert in psychoacoustics, but I am wondering about the way
> LAME's (and ISO dist10's) psychoacoustic noise threshold is calculated.
> 
> It is calculated like this.
> 
> E(i) : energy of bark i
> T(i) : threshold of bark i
> R(i) : ratio of bark i
> E'(i) : energy of band(scalefactor band) i
> T'(i): threshold of band(scalefactor band) i
> R'(i): ratio of band(scalefactor band) i
> S: spread function to calculate the masking effect
> ATH(i): ATH threshold of band(scalefactor band) i
> 
> step1
> [T(0) T(1) ... ] = S [E(0) E(1) ... ]
> 
> step2
> R(i) = T(i) / E(i)
> 
> step3
> R'(i) = sum_{some area} R(i)
> E'(i) = sum_{some area} E(i)
> 
> step4
> T'(i) = max(R'(i) * E'(i), ATH(i))
> 
> 
> Steps 1 and 2 are done in psymodel.c, and steps 3 and 4 in quantize.c
> (calc_xmin).
> 
> Question:
> Why are there "step2" and "step4"?
> Why doesn't it calculate T'(i) directly from T(i)?
> 
> I hope we can remove these steps and get a simpler and faster psymodel.c
> --- 
> Takehiro TOMINAGA // may the source be with you!
> --
> MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
> 
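
The four steps in the quoted question can be written out as a small sketch. This is only a literal transcription of the steps as listed, under my own assumptions: the function names, the row-major layout of S, and the half-open partition range [lo, hi) standing in for "some area" are hypothetical and are not taken from psymodel.c or quantize.c.

```c
#include <assert.h>
#include <stddef.h>

/* Steps 1 and 2: spread the partition energies and form the ratios.
 * S is an n-by-n spreading matrix stored row-major (assumed layout). */
void spread_and_ratio(const double *S, const double *E,
                      double *T, double *R, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        double t = 0.0;
        for (size_t j = 0; j < n; j++)
            t += S[i * n + j] * E[j];          /* step 1: T = S * E */
        T[i] = t;
        R[i] = (E[i] > 0.0) ? t / E[i] : 0.0;  /* step 2: R(i) = T(i)/E(i) */
    }
}

/* Steps 3 and 4 for one scalefactor band covering partitions [lo, hi). */
double band_threshold(const double *R, const double *E,
                      size_t lo, size_t hi, double ath)
{
    assert(lo < hi);
    double r_sum = 0.0, e_sum = 0.0;
    for (size_t i = lo; i < hi; i++) {
        r_sum += R[i];                         /* step 3: R'(i) */
        e_sum += E[i];                         /* step 3: E'(i) */
    }
    double t = r_sum * e_sum;                  /* step 4: R'(i) * E'(i) */
    return (t > ath) ? t : ath;                /* step 4: max(..., ATH(i)) */
}

/* The direct alternative raised in the question: sum T(i) over the band. */
double band_threshold_direct(const double *T, size_t lo, size_t hi, double ath)
{
    assert(lo < hi);
    double t = 0.0;
    for (size_t i = lo; i < hi; i++)
        t += T[i];
    return (t > ath) ? t : ath;
}
```

With the steps taken literally, the two routes generally give different numbers, because a sum of ratios times a sum of energies is not the sum of the thresholds.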




Re: [MP3 ENCODER] MP4

2000-01-28 Thread Menno Bakker

The newest version of FAAC currently encodes MPEG4 GA (general audio) (but
only the AAC part with LTP). I guess it's the only MPEG4 audio encoder
currently available (correct me if I'm wrong).
You can check it out at http://www.slimline.net/aac

Bye, Menno


> This is probably the wrong place to ask this, so I hope you don't break
> your fingers hitting the delete button.
>
> I've been wondering about MP4 since reading the first stories and seeing
> the sfront program scroll over the screen on freshmeat. I've kind of read
> the sfront homepage, but I still don't get it :)
>
> What can you tell me about MP4 and its development?
>
> Ivo
>
> --
> MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
>




Re: [MP3 ENCODER] LAME M/S thresholds

1999-12-15 Thread Menno Bakker

Hello,

I don't understand all of the theory myself, but I'll show you what the AAC
document says (it is downloadable from the FAAC website). I find it rather
vague.
This is literally what it says:


M/S Stereo
The decision to code left and right coefficients as either left + right (L/R)
or mid/side (M/S) is made on a noiseless coding band by noiseless coding
band basis for all spectral coefficients in the current block. For each
noiseless coding band the following decision process is used:
1. For each noiseless coding band, not only L and R raw thresholds, but also
M=(L+R)/2 and S=(L-R)/2 raw thresholds are calculated. For the raw M and S
thresholds, rather than using the tonality for the M or S threshold, one
uses the more tonal value from the L or R calculation in each threshold
calculation band, and proceed with the psychoacoustic model for M and S from
the M and S energies and the minimum of the L or R values for C(w) in each
threshold calculation band. The values that are provided to the imaging
control process are identified in the psychoacoustic model information
section as en(b) (the spread normalized energy) and nb(b), the raw
threshold.
2. The raw thresholds for M, S, L and R, and the spread energy for M, S, L
and R, are all brought into an ``imaging control process''. The resulting
adjusted thresholds are inserted as the values for nb(b) into step 11 of the
psychoacoustic model for further processing.
3. The final, protected and adapted to coder-band thresholds for all of
M,S,L and R are directly applied to the appropriate spectrum by quantizing
the actual L, R, M and S spectral values with the appropriate calculated and
quantized threshold.
4. The number of bits actually required to code M/S, and the number of bits
required to code L/R are calculated.
5. The method that uses the least bits is used in each given noiseless
coding band, and the stereo mask is set accordingly.

With these definitions
Mthr, Sthr, Rthr, Lthr       raw thresholds (the nb(b) from step 10 of the
                             psychoacoustic model)
Mengy, Sengy, Rengy, Lengy   spread energy (en(b) from step 6 of the
                             psychoacoustic model)
Mfthr, Sfthr, Rfthr, Lfthr   final (output) thresholds (returned as nb(b) in
                             step 11 of the psychoacoustic model)
bmax(b)                      BMLD protection ratio, as can be calculated from
bmax(b) = pow(10, -3*(0.5+0.5*cos(Pi*(min(bval(b),15.5)/15.5))))

the imaging control process for each noiseless coding band is as follows:
t=Mthr/Sthr
if (t>1)
    t=1/t
Rfthr=max(Rthr*t, min(Rthr, bmax*Rengy))
Lfthr=max(Lthr*t, min(Lthr, bmax*Lengy))
t=min(Lthr, Rthr)
Mfthr=min(t, max(Mthr, min(Sengy*bmax, Sthr)))
Sfthr=min(t, max(Sthr, min(Mengy*bmax, Mthr)))


Extra:
C(w) = unpredictability

I changed this last piece of code until I found it to sound the best
(although maybe more changes are possible). I find steps 3, 4 and 5
strange, since they look only at the number of bits.
Maybe some of you really understand this and can make something of it
with a theoretical proof. I find that the way I do it now in FAAC works very
well and makes the encoder sound a lot better.

Bye, Menno

> Hi Menno,
>
> I took a look at psy_step11andahalf() and couldn't understand the
> reasoning behind it.  Maybe you could explain some?
>
> I don't have the AAC ISO docs, so I'm just using the Johnson and
> Ferreira reference.  In that paper, the MLD correction is used to
> compensate for stereo demasking, but only at low frequencies.  The MLD
> seems to be constructed so that at high frequencies, the maskings in
> either channel will use the maximum over both channels, but at low
> frequencies masking in one channel will only affect the other channel
> up to the level of the MLD.
>
> Thus at low frequencies, a signal in the mid channel will have less
> masking on the side channel.  Under this theory, MLD
> must be an increasing function of frequency.
>
> However, looking at this again and comparing with your formulas,
> I think there is a mistake in LAME's implementation:
> The comparison of mid and side maskings should be done with the true
> maskings, not the masking/energy ratios.  I will try to fix this
> today.
>
> Mark
>





RE: [MP3 ENCODER] LAME M/S thresholds

1999-12-12 Thread Menno Bakker

>Hi,
>
>That's right. therefore in FAAC you will see this:
>if ((nb[0][b] <= 1.58*nb[1][b])||(nb[1][b] <= 1.58*nb[0][b]))
>and in LAME:
>if ((ratio[0][b] >= 1.58*ratio[1][b])||(ratio[1][b] >= 1.58*ratio[0][b]))
>
>Or the other way around, I don't have access to my source code where I am now.
>Also I think that LAME is doing these calculations with the final ratios,
>while I in FAAC use the nb values (can't remember how they were called in
>LAME, sorry).
>The beta source code is available on the developers page, I hope you can find
>them, it will make this a lot clearer.
>
>Bye, Menno

Sorry, that was not entirely the reason. Basically, the values are used in a
completely different way than in LAME. Just take a look at psy_step11andahalf().

Menno




RE: [MP3 ENCODER] LAME M/S thresholds

1999-12-11 Thread Menno Bakker

>= Original Message From [EMAIL PROTECTED] =
>Menno wrote:
>> While implementing M/S threshold calculations for FAAC, I noticed some
>> things that could be improved in GPSYCHO.
>>
>> The formula for the BMLD protection ratios you reverse engineered should
>> look like this:
>> mld[b] = pow(10, -3*(0.5+0.5*(M_PI*(min(bval[b], 15.5)/15.5))));
>> I took this from the 14496-3 ISO document.
>> If you want to see what the rest of the calculations should look like, you
>> should take a look at psy_step11andahalf() in psych.c from FAAC 0.55beta.
>> Also the paper advises to use the more tonal tonality values from the left
>> and right channels for the mid and side channels. And the minimum of
>> unpredictability from the left and right channels for both the mid and side
>> channels.
>> I haven't tried this in LAME myself, but in FAAC it works very good.
>>
>> Bye, Menno
>
>Hi Menno,
>I just compared your mld with the one implemented in Lame, and they
>are totally different, even in the tendency.
>With your mld = -3*(0.5+0.5*(M_PI*(min(bval[b], 15.5)/15.5)))
>and Lame's mld = 1.25*(1-cos(PI*b/SBPSY_s))-2.5
>the one in Lame increases monotonically, yours falls monotonically.
>Here are the mlds for short blocks:
>
>Lame mld     your mld    difference
>-2.5         -1.5        -1
>-2.45741     -2.02384    -0.433572
>-2.33253     -2.54737     0.214835
>-2.13388     -3.03745     0.903571
>-1.875       -3.45245     1.57745
>-1.57352     -3.81333     2.2398
>-1.25        -4.14806     2.89806
>-0.926476    -4.43749     3.51101
>-0.625       -4.69682     4.07182
>-0.366117    -4.93336     4.56724
>-0.167468    -5.15104     4.98357
>-0.0425927   -5.33832     5.29572
>
>So, what is going on here?
>
>And where can I find the sources for FAAC 0.55beta? I only found
>the sources for FAAC 0.42.
>
>Robert
>
>--
>MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )

Hi,

That's right. Therefore, in FAAC you will see this:
if ((nb[0][b] <= 1.58*nb[1][b])||(nb[1][b] <= 1.58*nb[0][b]))
and in LAME:
if ((ratio[0][b] >= 1.58*ratio[1][b])||(ratio[1][b] >= 1.58*ratio[0][b]))

Or the other way around; I don't have access to my source code where I am now.
Also, I think that LAME does these calculations with the final ratios,
while in FAAC I use the nb values (I can't remember what they are called in
LAME, sorry).
The beta source code is available on the developers page. I hope you can find
it; it will make this a lot clearer.
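
A note on the 1.58 comparison, as a sketch only: for non-negative values the "<=" form joined with "||" holds for every pair, so a check that actually says "the two thresholds are within a factor of 1.58 of each other" needs both conditions at once. The helper below is hypothetical and written from that observation. Which exact form FAAC or LAME uses is not confirmed here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when a and b differ by at most the given
 * factor in either direction (1.58 in the messages above). */
bool thresholds_within_ratio(double a, double b, double ratio)
{
    assert(a >= 0.0 && b >= 0.0 && ratio >= 1.0);
    return a <= ratio * b && b <= ratio * a;
}
```

The ">=" form joined with "||" is the complementary test: it is true when one value exceeds the other by a factor of 1.58 or more.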

Bye, Menno




[MP3 ENCODER] LAME M/S thresholds

1999-12-09 Thread Menno Bakker

While implementing M/S threshold calculations for FAAC, I noticed some
things that could be improved in GPSYCHO.

The formula for the BMLD protection ratios you reverse engineered should
look like this:
mld[b] = pow(10, -3*(0.5+0.5*(M_PI*(min(bval[b], 15.5)/15.5))));
I took this from the 14496-3 ISO document.
If you want to see what the rest of the calculations should look like, you
should take a look at psy_step11andahalf() in psych.c from FAAC 0.55beta.
Also, the paper advises using the more tonal of the left and right channel
tonality values for the mid and side channels, and the minimum of the
unpredictability of the left and right channels for both the mid and side
channels.
I haven't tried this in LAME myself, but in FAAC it works very well.

Bye, Menno




[MP3 ENCODER] minval

1999-11-29 Thread Menno Bakker

Hi,

Can someone tell me what the minval values in the psychoacoustic model are
based on? Or are they just values found by trial and error?
Thanks

Menno




Re: [MP3 ENCODER] twinvq

1999-11-25 Thread Menno Bakker

RE: [MP3 ENCODER] twinvq
>> > this is a completely mp3 listserv, but in case any of you
>> > dont know about twinvq, its another natural audio codec with
>> > better frequency response than mp3 at the same bitrate.
>> > http://www.vqf.com, http://www.vqcentral.com, and channel
>> > #vqf on the DALnet IRC network (irc.dal.net), are good places
>> > to check it out.
>>
>> AFAIK VQF is absolutely proprietary and owned by Yamaha. There is no
>> technical information available whatsoever. This renders VQF unusable for
>> development, which manifests in no free third party players being
>> available.
>
>Yes, it is proprietary. AFAIK Yamaha don't really support it any more,
>because much of the technology it uses ended up in AAC, which is a better
>choice for low bitrate stuff anyway.  (Public format AND better sound
>quality).
>-- Mat.

I suppose you mean MPEG4. MPEG4 contains both AAC and TwinVQ.

Menno




[MP3 ENCODER] FAAC version 0.42

1999-11-10 Thread Menno Bakker

Hi,

Since interest in my project was shown before on this list, I will update
you on its status. I've now finished version 0.42 of FAAC. It has a
totally new quantizer, which now closely resembles the one from LAME. I
hope some of you who have been working on the LAME quantizer could also
have a look at my new one.
Further, I have added a pulse coder, which was the only thing still missing
from the AAC format in FAAC. The pulse coder is used to encode large pulses in
the spectral components separately, so that fewer bits are needed for that
scalefactor band. In pulse.c I now use a very brute-force method to find those
pulses, but I'm sure there is a much better method to find the right pulses
that need separate coding.
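
To make the idea concrete, here is a minimal sketch of one possible brute-force search. It is not the code in pulse.c: the function name, the min_amp cut-off, and the selection rule are assumptions for illustration, and the limit of four pulses is my understanding of the AAC pulse data syntax.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_PULSES 4   /* assumption: AAC pulse data carries at most four pulses */

/* Brute-force sketch: repeatedly pick the largest-magnitude quantized
 * coefficient in [start, end) whose magnitude exceeds min_amp.
 * Writes the chosen indices to pos[] and returns how many were found. */
int find_pulses(const int *quant, int start, int end, int min_amp,
                int pos[MAX_PULSES])
{
    assert(start >= 0 && start <= end);
    int count = 0;

    while (count < MAX_PULSES) {
        int best = -1;
        for (int i = start; i < end; i++) {
            int taken = 0;
            for (int k = 0; k < count; k++)
                if (pos[k] == i)
                    taken = 1;           /* already selected as a pulse */
            if (taken || abs(quant[i]) <= min_amp)
                continue;
            if (best < 0 || abs(quant[i]) > abs(quant[best]))
                best = i;
        }
        if (best < 0)
            break;                       /* nothing above the cut-off is left */
        pos[count++] = best;
    }
    return count;
}
```

A real encoder would also have to respect the offset and amplitude ranges that the bitstream can represent, and check that coding the pulses separately really saves bits.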

I hope some of you can help me with this project. I find that the quality
has improved quite a lot since the last version.

Bye, Menno
