On Tue, Apr 14, 2015 at 6:10 AM, Michael Niedermayer <michae...@gmx.at> wrote:
> On Tue, Apr 14, 2015 at 12:33:50AM +0100, Rostislav Pehlivanov wrote:
>> This commit implements the perceptual noise substitution AAC extension. This
>> is a proof of concept implementation, and as such, is not enabled by
>> default. This is the second revision of this patch, made after some
>> discussion via non-public email due to a mistake. Any changes made since the
>> first revision have been indicated.
>>
>> In order to extend the encoder to use an additional codebook, the array
>> holding each codebook has been modified with two additional entries - 13 for
>> the NOISE_BT codebook and 12 which has a placeholder function. The cost
>> system was modified to skip the 12th entry, using an array to map its
>> inputs and outputs. It also does not accept using the 13th codebook for any
>> band which is not marked as containing noise, thereby restricting its
>> ability to arbitrarily choose it for bands. The use of arrays allows the
>> system to be easily extended to allow for intensity stereo encoding, which
>> uses additional codebooks.
>>
>> The 12th entry in the codebook function array points to a function which
>> stops the execution of the program by calling an assert with an always
>> 'false' argument. It was pointed out in an email discussion with Claudio
>> Freire that having a 'NULL' entry can result in unexpected behaviour and
>> could be used as a security hole. There is no danger of this function being
>> called during encoding due to the codebook maps introduced.
>>
>> Another change from version 1 of the patch is the addition of an argument to
>> the encoder, '-aac_pns', to enable and disable the PNS. This currently
>> defaults to disabled, as PNS is experimental. The switch will be
>> removed in the future, when the algorithm to select noise bands has been
>> improved. The current algorithm simply compares the energy to the threshold
>> (multiplied by a constant) to determine noise, however the FFPsyBand
>> structure contains other useful figures to determine which bands carry noise
>> more accurately.
>>
>> Finally, the way energy values are converted to scalefactor indices has
>> changed since the first commit, as per the suggestion of Claudio Freire.
>> This may still have some drawbacks, but unlike the first commit it works
>> without having redundant offsets and outputs what the decoder expects to
>> have, in terms of the ranges of the scalefactor indices.
>>
>
>> Some spectral comparisons: https://0x0.st/T7.png (original),
>> https://0x0.st/Th.png (encoded without PNS), https://0x0.st/A1.png (encoded
>> with PNS, const = 1.2), https://0x0.st/Aj.png (spectral difference). The
>> constant is the value which multiplies the threshold when it gets compared
>> to the energy, larger values mean more noise will be substituted by PNS
>> values. Example when const = 2.2: https://0x0.st/Ae.png
>
> it's probably better to upload and link to places that are more permanent
> than 0x0.st, as someone in the future might see this discussion or
> commit and could want to see the pictures too

ffmpeg's wiki?
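
For readers following the thread, here is a minimal sketch in C of two ideas from the quoted patch description: the energy-versus-threshold noise test, and an index map that skips codebook 12 and only offers codebook 13 (NOISE_BT) to bands marked as noise. This is not the code from the patch. The struct BandInfo, the names band_is_noise, search_to_codebook and codebook_for_search_index, the -1 return value, and the direction of the comparison (energy below threshold * const) are all assumptions made for illustration; the comparison direction was chosen because it matches the remark that larger constants cause more bands to be substituted.

```c
#include <assert.h>
#include <stddef.h>

#define NOISE_BT 13 /* codebook number named in the patch description */

/* Hypothetical stand-in for per-band psychoacoustic data (not FFPsyBand). */
typedef struct BandInfo {
    float energy;    /* band energy */
    float threshold; /* masking threshold for the band */
} BandInfo;

/* Assumed heuristic: a band counts as noise when its energy falls below
 * the threshold scaled by noise_const. A larger noise_const marks more
 * bands as noise. */
static int band_is_noise(const BandInfo *band, float noise_const)
{
    return band->energy < band->threshold * noise_const;
}

/* Dense search indices 0..12 mapped onto codebook numbers.
 * Codebook 12 (the placeholder slot) never appears in the map, so the
 * search cannot select it. */
static const int search_to_codebook[] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, NOISE_BT
};

#define SEARCH_COUNT (sizeof(search_to_codebook) / sizeof(search_to_codebook[0]))

/* Returns the codebook number for a search index, or -1 when the codebook
 * must not be used for this band (NOISE_BT on a band not marked as noise). */
static int codebook_for_search_index(int search_idx, int is_noise)
{
    int cb;

    assert(search_idx >= 0 && (size_t)search_idx < SEARCH_COUNT);
    cb = search_to_codebook[search_idx];
    if (cb == NOISE_BT && !is_noise)
        return -1;
    return cb;
}
```

With this layout a cost search would loop over search indices rather than raw codebook numbers and skip any index for which codebook_for_search_index() returns -1. Extending the map with further entries is one way the same approach could cover additional codebooks.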