On Mon, Apr 13, 2015 at 8:33 PM, Rostislav Pehlivanov <atomnu...@gmail.com> wrote:
> This commit implements the perceptual noise substitution AAC extension. This
> is a proof of concept implementation, and as such, is not enabled by default.
> This is the second revision of this patch, made after some discussion via
> non-public email due to a mistake. Any changes made since the first revision
> have been indicated.
>
> In order to extend the encoder to use an additional codebook, the array
> holding each codebook has been modified with two additional entries - 13 for
> the NOISE_BT codebook and 12 which has a placeholder function. The cost
> system was modified to skip the 12th entry using an array to map the input
> and outputs it has. It also does not accept using the 13th codebook for any
> band which is not marked as containing noise, thereby restricting its ability
> to arbitrarily choose it for bands. The use of arrays allows the system to be
> easily extended to allow for intensity stereo encoding, which uses additional
> codebooks.
>
> The 12th entry in the codebook function array points to a function which
> stops the execution of the program by calling an assert with an always
> 'false' argument. It was pointed out in an email discussion with Claudio
> Freire that having a 'NULL' entry can result in unexpected behaviour and
> could be used as a security hole. There is no danger of this function being
> called during encoding due to the codebook maps introduced.
>
> Another change from version 1 of the patch is the addition of an argument to
> the encoder, '-aac_pns' to enable and disable the PNS. This currently
> defaults to disable the PNS, as it is experimental. The switch will be
> removed in the future, when the algorithm to select noise bands has been
> improved. The current algorithm simply compares the energy to the threshold
> (multiplied by a constant) to determine noise, however the FFPsyBand
> structure contains other useful figures to determine which bands carry noise
> more accurately.
>
> Finally, the way energy values are converted to scalefactor indices has
> changed since the first commit, as per the suggestion of Claudio Freire. This
> may still have some drawbacks, but unlike the first commit it works without
> having redundant offsets and outputs what the decoder expects to have, in
> terms of the ranges of the scalefactor indices.
>
> Some spectral comparisons: https://0x0.st/T7.png (original),
> https://0x0.st/Th.png (encoded without PNS), https://0x0.st/A1.png (encoded
> with PNS, const = 1.2), https://0x0.st/Aj.png (spectral difference). The
> constant is the value which multiplies the threshold when it gets compared to
> the energy, larger values mean more noise will be substituted by PNS values.
> Example when const = 2.2: https://0x0.st/Ae.png
>
> Comments, tips, feedback and criticism are welcome.
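
For reference, the noise test described in the quoted text could be sketched roughly as below. This is only an illustration, not the code from the patch: the BandInfo struct and mark_noise_bands() are hypothetical names, and the direction of the comparison (energy below threshold times the constant) is an assumption inferred from the statement that larger constants cause more bands to be substituted.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-band psychoacoustic data; only the two
 * quantities mentioned in the patch description are modelled here. */
typedef struct BandInfo {
    float energy;     /* energy of the band */
    float threshold;  /* masking threshold of the band */
} BandInfo;

/* Flags band i as noise when its energy is below threshold * noise_const
 * (assumed direction of the comparison). Returns the number of bands flagged. */
static int mark_noise_bands(const BandInfo *bands, size_t n,
                            float noise_const, unsigned char *is_noise)
{
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        is_noise[i] = bands[i].energy < bands[i].threshold * noise_const;
        count += is_noise[i];
    }
    return count;
}
```

With this form, raising noise_const from 1.2 to 2.2 can only add bands to the noise set, never remove them, which matches the behaviour described for the two example spectrograms.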
This command line:

    /home/claudiofreire/src/ffmpeg/ffmpeg -i /home/claudiofreire/tmp/audiosamples/ffsamples/aac/ct_faac-adts.aac -strict -2 -c:a aac -b:a 48k -cutoff 22050 -f adts -aac_pns 1 -y test.adts

produces:

    Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:398
    Aborted

This is probably related to the fact that noise scalefactors need to be clamped to a range of SCALE_MAX_DIFF (though independently of regular scalefactors). I would suggest that, at the end of twoloop, you measure the minimum noise scalefactor and clamp the noise scalefactors to the range minscaler to minscaler+SCALE_MAX_DIFF.

You can get the ffsamples folder by configuring with --samples=/home/claudiofreire/tmp/audiosamples/ffsamples (or whatever path works for you) and then running make fate-rsync.
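
The clamping suggested above could look something like the following. Treat it as a minimal sketch rather than a drop-in patch: clamp_noise_sf() and the is_noise array are hypothetical names, the scalefactors are plain ints instead of the encoder's own structures, and SCALE_MAX_DIFF is assumed to be 60 as in libavcodec/aac.h.

```c
#include <limits.h>
#include <stddef.h>

/* Assumed to match SCALE_MAX_DIFF in libavcodec/aac.h. */
#define SCALE_MAX_DIFF 60

/* Clamps every noise-band scalefactor into
 * [minscaler, minscaler + SCALE_MAX_DIFF], where minscaler is the smallest
 * noise scalefactor found. Bands not flagged as noise are left untouched.
 * Returns minscaler, or -1 if no band is flagged as noise. */
static int clamp_noise_sf(int *sf_idx, const unsigned char *is_noise, size_t n)
{
    int minscaler = INT_MAX;

    /* First pass: find the minimum scalefactor among noise bands only. */
    for (size_t i = 0; i < n; i++)
        if (is_noise[i] && sf_idx[i] < minscaler)
            minscaler = sf_idx[i];

    if (minscaler == INT_MAX)
        return -1;

    /* Second pass: pull anything above the allowed range down to its top. */
    for (size_t i = 0; i < n; i++)
        if (is_noise[i] && sf_idx[i] > minscaler + SCALE_MAX_DIFF)
            sf_idx[i] = minscaler + SCALE_MAX_DIFF;

    return minscaler;
}
```

After this pass, any two noise scalefactors differ by at most SCALE_MAX_DIFF, so the delta between consecutive noise bands stays within the range the bitstream writer can code. Regular scalefactors are deliberately ignored, since they are clamped independently.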