Re: [MP3 ENCODER] various questions (was: Some suggestions for LAME - please review)
"Gabriel Bouvigne" <[EMAIL PROTECTED]> wrote: > Is there a scalefactor for >16kHz in AAC? (Meno, are you listening?) AAC has scalefactorbands that fill the whole frequency range, scalefactors are calculated for all scalefactorbands. Bye, Menno -- MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
[MP3 ENCODER] Fwd: download vs pirate vs bootleg
> Undercover - The Rolling Stones Mailing List
> ==
>{SWIPED} from:
>http://dailynews.yahoo.com/h/nm/2918/en/music-student_1.html
>
>Monday September 18 7:55 PM ET
>Oklahoma student may face music download charges
>OKLAHOMA CITY (Reuters) - An Oklahoma State University student could
>face criminal charges of copyright infringement after police found as
>many as 1,000 Internet music files on his computer, campus police said Monday.
>
>Police seized the personal computer and a CD recorder from the
>student's dorm room after university officials were notified by the
>Recording Industry Association of America which is campaigning against the
>wide-spread practice of copying and moving music over the Internet.
>
>University officials said the Washington D.C.-based RIAA, which
>represents big record companies, had notified the school that it had
>detected a high volume of music downloads to the campus computer network.
>
>``My understanding is he was maintaining files of all these songs and
>making them available to others,'' said Chief Everett Eaton of the
>Oklahoma State University Police Department.
>
>A computer forensic specialist has since been busy analyzing the files
>on the computer's hard drive, said OSU police Lt. Steve Altman.
>``The computer specialist feels there may be in excess of a thousand
>files,'' Altman said. ``That could cause state felony charges to be
>filed for copyright infringement.''
>
>Altman declined to name the 19-year-old male student, who has not been
>arrested. The results of the police investigation will be turned over
>to a district attorney and it could be weeks before any charges are
>filed, Altman said.
>
>Nestor Gonzales, a spokesman for the university, said the student was
>downloading music using several different Internet protocols including
>Napster, a program that allows users to exchange
>music via the Internet.
>
>``That was one of the protocols he was using,'' Gonzales said. ``He
>may have been using others. It wouldn't have mattered. The high volume
>of downloads warranted action.''
>
>``It does not appear he was selling the files or profiting in any way
>from the downloads,'' Gonzales added.
>
>Reuters/Variety REUTERS
Re: [MP3 ENCODER] psymodel ?
Hi,

This formula is meant to calculate the threshold for the MDCT from the threshold for the FFT. It indeed only works when that threshold is calculated for every frequency bin.

> Hi,
>
> Tmdct(i) can be calculated directly from Tdft(i) with this formula:
>
> Tmdct(i) = (2/M) * Tdft(i) * (cos( 2*Pi*n0*(i + 0.5)/N - /_S(i) ))^2
>
> where:
> M = number of samples in frequency domain
> N = number of samples in time domain
> n0 = (M+1)/2
> S(i) = the DFT, where /_S(i) is the phase
>
> If you want the full explanation, please let me know.
>
> Bye, Menno
Re: [MP3 ENCODER] psymodel ?
Hi,

Tmdct(i) can be calculated directly from Tdft(i) with this formula:

Tmdct(i) = (2/M) * Tdft(i) * (cos( 2*Pi*n0*(i + 0.5)/N - /_S(i) ))^2

where:
M = number of samples in frequency domain
N = number of samples in time domain
n0 = (M+1)/2
S(i) = the DFT, where /_S(i) is the phase

If you want the full explanation, please let me know.

Bye, Menno

- Original Message -
From: "Takehiro Tominaga" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, June 07, 2000 1:51 PM
Subject: [MP3 ENCODER] psymodel ?

> Hi all
>
> I am not a specialist in psychoacoustics, but I am wondering about the way
> LAME's (and ISO dist10's) psychoacoustic noise threshold is calculated.
>
> It is calculated like this:
>
> E(i) : energy of bark i
> T(i) : threshold of bark i
> R(i) : ratio of bark i
> E'(i) : energy of band (scalefactor band) i
> T'(i): threshold of band (scalefactor band) i
> R'(i): ratio of band (scalefactor band) i
> S: spread function to calculate the masking effect
> ATH(i): ATH threshold of band (scalefactor band) i
>
> step1
> [T(0) T(1) ... ] = S [E(0) E(1) ... ]
>
> step2
> R(i) = T(i) / E(i)
>
> step3
> R'(i) = sum_{some area} R(i)
> E'(i) = sum_{some area} E(i)
>
> step4
> T'(i) = max(R'(i) * E'(i), ATH(i))
>
>
> Steps 1 & 2 are done in psymodel.c, and 3 & 4 are in quantize.c (calc_xmin).
>
> Question:
> Why are there "step2" and "step4"?
> Why doesn't it calculate T'(i) directly from T(i)?
>
> I hope we can remove these steps and get a simpler and faster psymodel.c
> ---
> Takehiro TOMINAGA // may the source be with you!
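
To make the four steps in the quoted question concrete, here is a literal C transcription. It is a sketch only and is not taken from psymodel.c or quantize.c: the function names, the row-major layout of the spreading matrix, and the choice of a contiguous range [lo, hi) for "some area" are all assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Steps 1 and 2: T = S * E over the bark partitions, then R(i) = T(i) / E(i).
 * spread is an nbark x nbark matrix stored row-major (assumed layout).
 */
void bark_ratios(const double *spread, const double *energy,
                 double *ratio, size_t nbark)
{
    for (size_t i = 0; i < nbark; i++) {
        double t = 0.0;                       /* T(i) */
        for (size_t j = 0; j < nbark; j++)
            t += spread[i * nbark + j] * energy[j];
        /* Guard against division by zero for silent partitions. */
        ratio[i] = (energy[i] > 0.0) ? t / energy[i] : 0.0;
    }
}

/*
 * Steps 3 and 4 for one scalefactor band covering partitions [lo, hi):
 * R' = sum R(i), E' = sum E(i), T' = max(R' * E', ATH).
 */
double band_threshold(const double *ratio, const double *energy,
                      size_t lo, size_t hi, double ath)
{
    double r_sum = 0.0, e_sum = 0.0;
    for (size_t i = lo; i < hi; i++) {
        r_sum += ratio[i];
        e_sum += energy[i];
    }
    double t = r_sum * e_sum;
    return (t > ath) ? t : ath;
}
```

Written this way, T'(i) is the product of two sums rather than the sum of the T(i) values, which is one way to see why the question of computing T'(i) directly from T(i) comes up.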
Re: [MP3 ENCODER] MP4
The newest version of FAAC currently encodes MPEG4 GA (general audio) (but only the AAC part with LTP). I guess it's the only MPEG4 audio encoder currently available (correct me if I'm wrong).

You can check it out at http://www.slimline.net/aac

Bye, Menno

> This is probably the wrong place to ask this, so I hope you don't break your
> fingers hitting the delete button.
>
> I've been wondering about MP4 since reading the first stories and seeing
> the sfront program scroll over the screen on freshmeat. I've kind of read
> the sfront homepage, but I still don't get it :)
>
> What can you tell me about MP4 and its development?
>
> Ivo
Re: [MP3 ENCODER] LAME M/S thresholds
Hello,

I don't understand all of the theory myself, but I'll show you what the AAC document says (it is downloadable from the FAAC website). I find it rather vague. This is literally what it says:

M/S Stereo

The decision to code left and right coefficients as either left + right (L/R) or mid/side (M/S) is made on a noiseless coding band by noiseless coding band basis for all spectral coefficients in the current block. For each noiseless coding band the following decision process is used:

1. For each noiseless coding band, not only L and R raw thresholds, but also M=(L+R)/2 and S=(L-R)/2 raw thresholds are calculated. For the raw M and S thresholds, rather than using the tonality for the M or S threshold, one uses the more tonal value from the L or R calculation in each threshold calculation band, and proceeds with the psychoacoustic model for M and S from the M and S energies and the minimum of the L or R values for C(w) in each threshold calculation band. The values that are provided to the imaging control process are identified in the psychoacoustic model information section as en(b) (the spread normalized energy) and nb(b), the raw threshold.

2. The raw thresholds for M, S, L and R, and the spread energy for M, S, L and R, are all brought into an ``imaging control process''. The resulting adjusted thresholds are inserted as the values for nb(b) into step 11 of the psychoacoustic model for further processing.

3. The final, protected and adapted to coderband thresholds for all of M, S, L and R are directly applied to the appropriate spectrum by quantizing the actual L, R, M and S spectral values with the appropriate calculated and quantized threshold.

4. The number of bits actually required to code M/S, and the number of bits required to code L/R are calculated.

5. The method that uses the least bits is used in each given noiseless coding band, and the stereo mask is set accordingly.

With these definitions:

Mthr, Sthr, Rthr, Lthr      raw thresholds (the nb(b) from step 10 of the psychoacoustic model)
Mengy, Sengy, Rengy, Lengy  spread energy (en(b) from step 6 of the psychoacoustic model)
Mfthr, Sfthr, Rfthr, Lfthr  final (output) thresholds (returned as nb(b) in step 11 of the psychoacoustic model)
bmax(b)                     BMLD protection ratio, as can be calculated from
                            bmax(b) = pow(10, -3*(0.5+0.5*cos(Pi*(min(bval(b),15.5)/15.5))))

the imaging control process for each noiseless coding band is as follows:

t = Mthr/Sthr
if (t>1) t = 1/t
Rfthr = max(Rthr*t, min(Rthr, bmax*Rengy))
Lfthr = max(Lthr*t, min(Lthr, bmax*Lengy))
t = min(Lthr, Rthr)
Mfthr = min(t, max(Mthr, min(Sengy*bmax, Sthr)))
Sfthr = min(t, max(Sthr, min(Mengy*bmax, Mthr)))

Extra: C(w) = unpredictability

I changed this last piece of code until I found it to sound the best (although maybe more changes are possible). I find steps 3, 4 and 5 strange, since they look only at the number of bits. Maybe some of you really understand this and can make something of it with a theoretical proof. I find that the way I do it now in FAAC works very well and makes the encoder sound a lot better.

Bye, Menno

> Hi Menno,
>
> I took a look at psy_step11andahalf() and couldn't understand the
> reasoning behind it. Maybe you could explain some?
>
> I don't have the AAC ISO docs, so I'm just using the Johnson and
> Ferreira reference. In that paper, the MLD correction is used to
> compensate for stereo demasking, but only at low frequencies. The MLD
> seems to be constructed so that at high frequencies, the maskings in
> either channel will use the maximum over both channels, but at low
> frequencies masking in one channel will only affect the other channel
> up to the level of the MLD.
>
> Thus at low frequencies, a signal in the mid channel will have less
> masking on the side channel. Under this theory, MLD
> must be an increasing function of frequency.
>
> However, looking at this again and comparing with your formulas,
> I think there is a mistake in LAME's implementation:
> The comparison of mid and side maskings should be done with the true
> maskings, not the masking/energy ratios. I will try to fix this
> today.
>
> Mark
RE: [MP3 ENCODER] LAME M/S thresholds
>Hi,
>
>That's right. therefore in FAAC you will see this:
>if ((nb[0][b] <= 1.58*nb[1][b])||(nb[1][b] <= 1.58*nb[0][b]))
>and in LAME:
>if ((ratio[0][b] >= 1.58*ratio[1][b])||(ratio[1][b] >= 1.58*ratio[0][b]))
>
>Or the other way around, I don't have access to my source code where I am now.
>Also I think that LAME is doing these calculations with the final ratios,
>while I in FAAC use the nb values (can't remember how they were called in
>LAME, sorry).
>The beta source code is available on the developers page, I hope you can find
>them, it will make this a lot clearer.
>
>Bye, Menno

Sorry, that was not entirely the reason. Basically the values are used in a completely different way than in LAME. Just take a look at psy_step11andahalf().

Menno
RE: [MP3 ENCODER] LAME M/S thresholds
>= Original Message From [EMAIL PROTECTED] =
>Menno wrote:
>> While implementing M/S threshold calculations for FAAC, I noticed some
>> things that could be improved in GPSYCHO.
>>
>> The formula for the BMLD protection ratios you reverse engineered should
>> look like this:
>> mld[b] = pow(10, -3*(0.5+0.5*(M_PI*(min(bval[b], 15.5)/15.5))));
>> I took this from the 14496-3 ISO document.
>> If you want to see what the rest of the calculations should look like, you
>> should take a look at psy_step11andahalf() in psych.c from FAAC 0.55beta.
>> Also the paper advises to use the more tonal tonality values from the left
>> and right channels for the mid and side channels. And the minimum of
>> unpredictability from the left and right channels for both the mid and side
>> channels.
>> I haven't tried this in LAME myself, but in FAAC it works very good.
>>
>> Bye, Menno
>
>Hi Menno,
>I just compared your mld with the one implemented in Lame, and they
>are totally different, even in the tendency.
>With your mld = -3*(0.5+0.5*(M_PI*(min(bval[b], 15.5)/15.5)))
>and Lame's mld = 1.25*(1-cos(PI*b/SBPSY_s))-2.5
>the one in Lame increases monotonically, yours falls monotonically.
>Here are the mlds for short blocks:
>
>Lame mld     your mld    difference
>-2.5         -1.5        -1
>-2.45741     -2.02384    -0.433572
>-2.33253     -2.54737    0.214835
>-2.13388     -3.03745    0.903571
>-1.875       -3.45245    1.57745
>-1.57352     -3.81333    2.2398
>-1.25        -4.14806    2.89806
>-0.926476    -4.43749    3.51101
>-0.625       -4.69682    4.07182
>-0.366117    -4.93336    4.56724
>-0.167468    -5.15104    4.98357
>-0.0425927   -5.33832    5.29572
>
>So, what is going on here?
>
>And where can I find the sources for FAAC 0.55beta? I only found
>the sources for FAAC 0.42.
>
>Robert

Hi,

That's right. Therefore in FAAC you will see this:
if ((nb[0][b] <= 1.58*nb[1][b])||(nb[1][b] <= 1.58*nb[0][b]))
and in LAME:
if ((ratio[0][b] >= 1.58*ratio[1][b])||(ratio[1][b] >= 1.58*ratio[0][b]))

Or the other way around; I don't have access to my source code where I am now. Also, I think that LAME is doing these calculations with the final ratios, while in FAAC I use the nb values (I can't remember what they are called in LAME, sorry).

The beta source code is available on the developers page. I hope you can find it; it will make this a lot clearer.

Bye, Menno
[MP3 ENCODER] LAME M/S thresholds
While implementing M/S threshold calculations for FAAC, I noticed some things that could be improved in GPSYCHO.

The formula for the BMLD protection ratios you reverse engineered should look like this:
mld[b] = pow(10, -3*(0.5+0.5*(M_PI*(min(bval[b], 15.5)/15.5))));
I took this from the 14496-3 ISO document.

If you want to see what the rest of the calculations should look like, you should take a look at psy_step11andahalf() in psych.c from FAAC 0.55beta.

Also, the paper advises using the more tonal of the left and right channel tonality values for the mid and side channels, and the minimum of the left and right channel unpredictability for both the mid and side channels.

I haven't tried this in LAME myself, but in FAAC it works very well.

Bye, Menno
[MP3 ENCODER] minval
Hi,

Can someone tell me what the minval values in the psychoacoustic model are based on? Or are they just values found by trial and error?

Thanks
Menno
Re: [MP3 ENCODER] twinvq
>> > this is a completely mp3 listserv, but in case any of you
>> > dont know about twinvq, its another natural audio codec with
>> > better frequency response than mp3 at the same bitrate.
>> > http://www.vqf.com, http://www.vqcentral.com, and channel
>> > #vqf on the DALnet IRC network (irc.dal.net), are good places
>> > to check it out.
>>
>> AFAIK VQF is absolutely proprietary and owned by Yamaha. There is no
>> technical information available whatsoever. This renders VQF
>> unusable for
>> development which manifests in no free third party players
>> being available.
>
>Yes, it is proprietary. AFAIK Yamaha don't really support it any more, because much of the
>technology it uses ended up in AAC, which is a better choice for low bitrate stuff anyway. (Public format AND better sound quality).
>-- Mat.

I suppose you mean MPEG4. MPEG4 contains both AAC and TwinVQ.

Menno
[MP3 ENCODER] FAAC version 0.42
Hi,

Since interest in my project has been shown on this list before, I will update you on its status.

I've now finished version 0.42 of FAAC. It has a totally new quantizer, which now closely resembles the one from LAME. I hope some of you who have been working on the LAME quantizer could also have a look at my new one.

Further, I have added a pulse coder, which was the only thing from the AAC format still missing in FAAC. The pulse coder is used to encode large pulses in the spectral components separately, so that fewer bits are needed for that scalefactor band. In pulse.c I now use a very brute-force method to find those pulses, but I'm sure there is a much better method to find the right pulses that need separate coding.

I hope some of you can help me with this project. I find that the quality has improved quite a lot since the last version.

Bye, Menno