Hi all,
I believe I’ve found a solution to this unexpected noise issue.
In aaccoder.c, I added an upper-bound check on sfb_energy inside mark_pns().
A band is now rejected for PNS when the energy ratio
sfb_energy / (threshold * sqrtf(1.5f / freq_boost)) exceeds 100.
The change looks like this:
```
@@ static void mark_pns(AACEncContext *s, AVCodecContext *avctx, SingleChannelElement *sce)
- if (sfb_energy < threshold * sqrtf(1.5f / freq_boost) || spread < spread_threshold || min_energy < pns_transient_energy_r * max_energy) {
+ if (sfb_energy < threshold * sqrtf(1.5f / freq_boost) ||
+ sfb_energy > 100.0f * threshold * sqrtf(1.5f / freq_boost) ||
+ spread < spread_threshold ||
+ min_energy < pns_transient_energy_r * max_energy) {
```
After retesting with the problematic input samples, the noise artifacts
disappeared. I also ran tests on several songs and general audio, and the
results look promising. It seems that when sfb_energy is significantly
higher than the psy threshold, PNS can introduce unexpected noise.
I also gathered some rough statistics comparing this energy ratio across
typical audio and the noisy input samples. While the data may not be
exhaustive and could contain bias, it offers some insight: In general
audio, this ratio tends to stay below 5, whereas the noisy segments in the
input samples show much higher values.
[image: sfb_energy.png]
On Wed, Aug 6, 2025 at 07:34, Agent 45 <[email protected]> wrote:
> Hello, Dear FFmpeg team,
>
> I'm encountering a consistent issue when encoding certain human voice
> recordings with FFmpeg's built-in AAC encoder. At lower bitrates the
> encoded output contains noticeable and sometimes harsh noise artifacts.
> These artifacts gradually diminish as the bitrate increases.
>
> Here are the commands used, ffmpeg version 7.1.1:
>
> ffmpeg -i input1.wav -c:a aac -b:a 128k output1_128k.m4a
> ffmpeg -i input2.wav -c:a aac -b:a 128k output2_128k.m4a
> ffmpeg -i input3.wav -c:a aac -b:a 128k output3_128k.m4a
>
> ffmpeg -i input1.wav -c:a aac -b:a 256k output1_256k.m4a
> ffmpeg -i input2.wav -c:a aac -b:a 256k output2_256k.m4a
> ffmpeg -i input3.wav -c:a aac -b:a 256k output3_256k.m4a
>
> ffmpeg -i input1.wav -c:a aac -b:a 320k output1_320k.m4a
> ffmpeg -i input2.wav -c:a aac -b:a 320k output2_320k.m4a
> ffmpeg -i input3.wav -c:a aac -b:a 320k output3_320k.m4a
>
> ### Observations:
>
> - `output1_128k.m4a`: Severe noise throughout the file
> - `output2_128k.m4a`: Audible noise at 0.4s and 0.8s
> - `output3_128k.m4a`: Starts producing harsh noise from 0.8s
>
> - `output1_256k.m4a`: Noticeable noise at around 0.25s
> - `output2_256k.m4a`: Mild noise around 0.8s
> - `output3_256k.m4a`: Some noise still present from 0.8s
>
> - All 320k versions are clean — no noise detected
>
> I’ve attached all files (input and encoded outputs).
>
> I hope this issue can be considered for review and fixed if possible, as
> the presence of such artifacts at 128k is quite unexpected and undesirable,
> especially for speech.
>
> Please let me know if I can assist further with testing or other input.
>
> Best regards,
> jack
>
_______________________________________________
ffmpeg-user mailing list -- [email protected]