On Mon, Apr 13, 2015 at 8:33 PM, Rostislav Pehlivanov
<atomnu...@gmail.com> wrote:
> This commit implements the perceptual noise substitution AAC extension. This 
> is a proof of concept implementation, and as such, is not enabled by default. 
> This is the second revision of this patch, made after some discussion via 
> non-public email due to a mistake. Any changes made since the first revision 
> have been indicated.
>
> In order to extend the encoder to use an additional codebook, the array 
> holding each codebook has been modified with two additional entries - 13 for 
> the NOISE_BT codebook and 12 which has a placeholder function. The cost 
> system was modified to skip the 12th entry using an array to map the input 
> and outputs it has. It also does not accept using the 13th codebook for any 
> band which is not marked as containing noise, thereby restricting its ability 
> to arbitrarily choose it for bands. The use of arrays allows the system to be 
> easily extended to allow for intensity stereo encoding, which uses additional 
> codebooks.
>
> The 12th entry in the codebook function array points to a function which 
> stops the execution of the program by calling an assert with an always 
> 'false' argument. After a discussion, it was pointed out in an email 
> discussion with Claudio Freire that having a 'NULL' entry can result in 
> unexpected behaviour and could be used as a security hole. There is no danger 
> of this function being called during encoding due to the codebook maps 
> introduced.
>
> Another change from version 1 of the patch is the addition of an argument to 
> the encoder, '-aac_pns' to enable and disable the PNS. This currently 
> defaults to disable the PNS, as it is experimental. The switch will be 
> removed in the future, when the algorithm to select noise bands has been 
> improved. The current algorithm simply compares the energy to the threshold 
> (multiplied by a constant) to determine noise, however the FFPsyBand 
> structure contains other useful figures to determine which bands carry noise 
> more accurately.
>
> Finally, the way energy values are converted to scalefactor indices has 
> changed since the first commit, as per the suggestion of Claudio Freire. This 
> may still have some drawbacks, but unlike the first commit it works without 
> having redundant offsets and outputs what the decoder expects to have, in 
> terms of the ranges of the scalefactor indices.
>
> Some spectral comparisons: https://0x0.st/T7.png (original), 
> https://0x0.st/Th.png (encoded without PNS), https://0x0.st/A1.png (encoded 
> with PNS, const = 1.2), https://0x0.st/Aj.png (spectral difference). The 
> constant is the value which multiplies the threshold when it gets compared to 
> the energy; larger values mean more noise will be substituted by PNS values. 
> Example when const = 2.2: https://0x0.st/Ae.png
>
> Comments, tips, feedback and criticism are welcome.


This command line:

/home/claudiofreire/src/ffmpeg/ffmpeg -i
/home/claudiofreire/tmp/audiosamples/ffsamples/aac/ct_faac-adts.aac
-strict -2 -c:a aac -b:a 48k -cutoff 22050 -f adts -aac_pns 1 -y
test.adts

Produces:

Assertion diff >= 0 && diff <= 120 failed at libavcodec/aacenc.c:398
Aborted

This is probably related to the fact that noise scalefactors need to
be clamped to a range of SCALE_MAX_DIFF (though independently of
regular scalefactors).

I would suggest that, at the end of twoloop, you find the minimum
noise scalefactor (minscaler) and clamp all noise scalefactors to the
range minscaler to minscaler+SCALE_MAX_DIFF.
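
A rough sketch of that clamping step, not tested against the patch.
The helper name clamp_noise_scalefactors, the is_noise flag array and
the flat sf_idx array are made up for illustration; the encoder keeps
this state in its own structures. SCALE_MAX_DIFF is defined locally
here and is assumed to match the value in libavcodec/aac.h.

```c
#include <limits.h>
#include <stddef.h>

/* Assumed to mirror the constant in libavcodec/aac.h. */
#define SCALE_MAX_DIFF 60

/*
 * Hypothetical helper: clamp the scalefactor indices of noise (PNS)
 * bands to [minscaler, minscaler + SCALE_MAX_DIFF], where minscaler is
 * the smallest index found among the noise bands only. Regular bands
 * are left untouched, since they are clamped independently.
 */
static void clamp_noise_scalefactors(int *sf_idx,
                                     const unsigned char *is_noise,
                                     size_t nb_bands)
{
    int minscaler = INT_MAX;
    size_t i;

    /* First pass: find the minimum noise scalefactor index. */
    for (i = 0; i < nb_bands; i++)
        if (is_noise[i] && sf_idx[i] < minscaler)
            minscaler = sf_idx[i];

    /* No noise bands at all: nothing to clamp. */
    if (minscaler == INT_MAX)
        return;

    /* Second pass: cap every noise band at minscaler + SCALE_MAX_DIFF. */
    for (i = 0; i < nb_bands; i++)
        if (is_noise[i] && sf_idx[i] > minscaler + SCALE_MAX_DIFF)
            sf_idx[i] = minscaler + SCALE_MAX_DIFF;
}
```

Only the upper bound needs an explicit check, because minscaler is by
construction the lower bound of the noise bands.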

You can get the ffsamples folder by configuring with
--samples=/home/claudiofreire/tmp/audiosamples/ffsamples (or whatever
path works for you), and then running make fate-rsync.