On Tue, Apr 14, 2015 at 6:10 AM, Michael Niedermayer <michae...@gmx.at> wrote:
> On Tue, Apr 14, 2015 at 12:33:50AM +0100, Rostislav Pehlivanov wrote:
>> This commit implements the perceptual noise substitution AAC extension. This 
>> is a proof of concept implementation, and as such, is not enabled by 
>> default. This is the second revision of this patch, made after some 
>> discussion via non-public email due to a mistake. Any changes made since the 
>> first revision have been indicated.
>>
>> In order to extend the encoder to use an additional codebook, the array 
>> holding each codebook has been extended with two additional entries: 13 for 
>> the NOISE_BT codebook and 12, which holds a placeholder function. The cost 
>> system was modified to skip the 12th entry by using arrays that map its 
>> inputs and outputs. It also refuses to use the 13th codebook for any band 
>> which is not marked as containing noise, so it cannot arbitrarily choose 
>> that codebook for bands. The use of arrays allows the 
>> system to be easily extended to allow for intensity stereo encoding, which 
>> uses additional codebooks.
>>
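
For readers following along, here is a minimal sketch of how such mapping
arrays might look. All names (cb_out_map, cb_in_map, cb_allowed) are
hypothetical and not taken from the patch, and it assumes that bands marked
as noise may still use the regular codebooks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the patch. */
enum { RESERVED_CB = 12, NOISE_CB = 13, SEARCH_CB_COUNT = 13 };

/* Search index -> codebook number. The reserved entry 12 never appears. */
static const uint8_t cb_out_map[SEARCH_CB_COUNT] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, NOISE_CB
};

/* Codebook number -> search index. Slot 12 is a dummy that the search
 * never produces, because cb_out_map skips it. */
static const uint8_t cb_in_map[NOISE_CB + 1] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 0, 12
};

/* Returns 1 if the cost search may consider this search index for a band.
 * The noise codebook is only allowed for bands marked as noise. */
static int cb_allowed(int search_idx, int band_is_noise)
{
    assert(search_idx >= 0 && search_idx < SEARCH_CB_COUNT);
    if (cb_out_map[search_idx] == NOISE_CB)
        return band_is_noise != 0;
    return 1;
}
```

With this layout the search loop only ever iterates over SEARCH_CB_COUNT
indices, so codebook 12 cannot be reached by construction.
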
>> The 12th entry in the codebook function array points to a function which 
>> stops the execution of the program by calling an assert with an always 
>> 'false' argument. In an email discussion, Claudio Freire pointed out that 
>> having a 'NULL' entry can result in unexpected behaviour and could be 
>> exploited as a security hole. There is no 
>> danger of this function being called during encoding due to the codebook 
>> maps introduced.
>>
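
A rough sketch of the placeholder idea, assuming a simplified function
pointer type. The names cost_fn, cost_stub, cost_reserved and cost_table are
made up for illustration and are not the patch's actual functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified signature; the real encoder functions differ. */
typedef float (*cost_fn)(const float *coefs, int size);

/* Stand-in for a real codebook cost function (sums squared coefficients). */
static float cost_stub(const float *coefs, int size)
{
    float sum = 0.0f;
    for (int i = 0; i < size; i++)
        sum += coefs[i] * coefs[i];
    return sum;
}

/* Placeholder for the reserved codebook 12: aborts if it is ever called. */
static float cost_reserved(const float *coefs, int size)
{
    (void)coefs;
    (void)size;
    assert(0 && "reserved codebook 12 must never be selected");
    return 0.0f;
}

/* No slot is NULL, so a stray index cannot jump through a null pointer. */
static const cost_fn cost_table[14] = {
    cost_stub, cost_stub, cost_stub, cost_stub, cost_stub, cost_stub,
    cost_stub, cost_stub, cost_stub, cost_stub, cost_stub, cost_stub,
    cost_reserved, /* 12: reserved */
    cost_stub      /* 13: noise codebook (stub here) */
};
```

Note that the standard assert() is compiled out when NDEBUG is defined, so a
production build would want a check that stays enabled; FFmpeg's av_assert0()
from libavutil/avassert.h is one that does.
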
>> Another change from version 1 of the patch is the addition of an encoder 
>> option, '-aac_pns', to enable or disable PNS. It is currently disabled by 
>> default, as PNS is experimental. The switch will be 
>> removed in the future, when the algorithm to select noise bands has been 
>> improved. The current algorithm simply compares the energy to the threshold 
>> (multiplied by a constant) to determine noise; however, the FFPsyBand 
>> structure contains other useful figures to determine which bands carry noise 
>> more accurately.
>>
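
The comparison described above might be sketched as follows. The struct and
function names are hypothetical, and the direction of the comparison (energy
below the scaled threshold means noise) is an assumption, chosen to match the
remark that larger constants substitute more noise.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-band energy and threshold values. */
struct band_stats {
    float energy;
    float threshold;
};

/* Returns 1 if the band would be treated as noise under the assumed rule:
 * energy < threshold * noise_const. */
static int band_is_noise(const struct band_stats *b, float noise_const)
{
    assert(noise_const > 0.0f);
    return b->energy < b->threshold * noise_const;
}
```

For example, a band with energy 1.5 and threshold 1.0 is not marked as noise
with a constant of 1.2, but is with a constant of 2.2.
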
>> Finally, the way energy values are converted to scalefactor indices has 
>> changed since the first commit, as per the suggestion of Claudio Freire. 
>> This may still have some drawbacks, but unlike the first commit it works 
>> without having redundant offsets and outputs what the decoder expects to 
>> have, in terms of the ranges of the scalefactor indices.
>>
>
>> Some spectral comparisons: https://0x0.st/T7.png (original), 
>> https://0x0.st/Th.png (encoded without PNS), https://0x0.st/A1.png (encoded 
>> with PNS, const = 1.2), https://0x0.st/Aj.png (spectral difference). The 
>> constant is the value which multiplies the threshold when it gets compared 
>> to the energy; larger values mean more noise will be substituted by PNS 
>> values. Example when const = 2.2: https://0x0.st/Ae.png
>
> It's probably better to upload and link to places that are more permanent
> than 0x0.st, as someone in the future might see this discussion or
> commit and want to see the pictures too.


ffmpeg's wiki?
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
