Hello!

I'm trying to normalize an audio file with FFmpeg using the loudnorm filter. The source loudness is -23 LUFS and I want to bring it to -17 LUFS. As far as I know, loudnorm has 2 modes of normalizing audio: linear (one gain value derived from the whole file) and dynamic (gain adjusted over short segments).

The problem is that in a file where someone is speaking, the pauses in the speech get progressively louder, so a hissing noise becomes clearly audible. That's why I need linear normalization. But for some reason I can't explain, FFmpeg always switches to dynamic mode.

I've gone through all the requirements for linear scaling in the loudnorm documentation, but I can't figure out what's wrong. I've specified all 4 measured values, the target LRA isn't lower than the input LRA, and when I normalize the file to -17 LUFS in Adobe Audition, I can't see any peaking.
What would be the best way to get linear normalization with FFmpeg?
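
For reference, the linear-mode conditions as I read them in the loudnorm documentation can be written out as a small check. This is only a sketch in Python: the function name and the dictionary keys are hypothetical, and the filter's real implementation may differ. The numbers are the measurements from the analysis pass shown further down in this post.

```python
def linear_mode_check(target_i, target_tp, target_lra,
                      measured_i, measured_tp, measured_lra):
    """Evaluate the documented loudnorm conditions for linear scaling.

    Hypothetical helper, not part of FFmpeg. All values are in
    LUFS / dBTP / LU, as printed by loudnorm.
    """
    # Linear mode applies one constant gain to the whole file.
    gain_db = target_i - measured_i
    # The same gain shifts the measured true peak by the same amount.
    projected_tp = measured_tp + gain_db
    lra_ok = measured_lra <= target_lra   # target LRA not lower than source LRA
    tp_ok = projected_tp <= target_tp     # shifted peak must stay under target TP
    return {
        "gain_db": gain_db,
        "projected_tp": projected_tp,
        "lra_ok": lra_ok,
        "tp_ok": tp_ok,
        "linear_possible": lra_ok and tp_ok,
    }


# Targets from my command line, measurements from the first pass.
result = linear_mode_check(target_i=-17, target_tp=-1, target_lra=9,
                           measured_i=-22.72, measured_tp=-2.67,
                           measured_lra=6.10)
for key, value in result.items():
    print(key, value)
```

With these numbers the gain is about +5.72 dB, which moves the measured true peak of -2.67 dBTP to roughly +3.05 dBTP, above the -1 dBTP target. If the filter applies the true-peak condition as documented, that may be relevant to the fallback to dynamic mode.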

Here is what I'm doing:

1. Analyze the source audio file:

ffmpeg -i input.wav -filter:a loudnorm=I=-17:TP=-1:LRA=9:print_format=json -f null -
ffmpeg version N-100942-gc596e82155-gb7251aed46+3 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 10.2.0 (Rev6, Built by MSYS2 project)
  configuration:  --pkg-config=pkgconf --cc='ccache gcc' --cxx='ccache g++' --ld='ccache g++' --disable-autodetect --enable-amf --enable-bzlib --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-iconv --enable-lzma --enable-nvenc --enable-zlib --enable-sdl2 --enable-ffnvcodec --enable-nvdec --enable-cuda-llvm --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libdav1d --enable-libaom --disable-debug --enable-fontconfig --enable-libass --enable-libbluray --enable-libfreetype --enable-libmfx --enable-libmysofa --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libwebp --enable-libxml2 --enable-libzimg --enable-libshine --enable-gpl --enable-avisynth --enable-libxvid --enable-libopenmpt --enable-version3 --enable-chromaprint --enable-decklink --enable-frei0r --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libfdk-aac --enable-libflite --enable-libfribidi --enable-libgme --enable-libgsm --enable-libilbc --enable-libsvthevc --enable-libsvtav1 --enable-libkvazaar --enable-libmodplug --enable-librtmp --enable-librubberband --enable-libtesseract --enable-libxavs --enable-libzmq --enable-libzvbi --enable-openal --enable-libvmaf --enable-libcodec2 --enable-libsrt --enable-ladspa --enable-librav1e --enable-libglslang --enable-vulkan --enable-openssl --extra-cflags=-fopenmp --extra-libs=-lgomp --extra-cflags=-DLIBTWOLAME_STATIC --extra-libs=-lstdc++ --extra-cflags=-DCACA_STATIC --extra-cflags=-DMODPLUG_STATIC --extra-cflags=-DCHROMAPRINT_NODLL --extra-libs=-lstdc++ --extra-cflags=-DZMQ_STATIC --extra-libs=-lpsapi --extra-cflags=-DLIBXML_STATIC --extra-libs=-liconv --disable-w32threads --extra-cflags=-DKVZ_STATIC_LIB --enable-nonfree --extra-cflags=-DAL_LIBTYPE_STATIC --extra-cflags='-ID:/ab-suite/local64/include/AL'
  libavutil      56. 64.100 / 56. 64.100
  libavcodec     58.120.100 / 58.120.100
  libavformat    58. 65.101 / 58. 65.101
  libavdevice    58. 11.103 / 58. 11.103
  libavfilter     7.101.100 /  7.101.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'input.wav':
  Metadata:
    encoded_by      : Adobe Adobe Media Encoder 2020.0
    encoder         : Adobe Adobe Media Encoder 2020.0 (Windows)
    date            : 2021-02-15
    creation_time   : 15:31:34
    time_reference  : 0
  Duration: 00:37:57.52, bitrate: 1539 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoded_by      : Adobe Adobe Media Encoder 2020.0
    time_reference  : 0
    date            : 2021-02-15
    encoder         : Lavf58.65.101
  Stream #0:0: Audio: pcm_s16le, 192000 Hz, stereo, s16, 6144 kb/s
    Metadata:
      encoder         : Lavc58.120.100 pcm_s16le
size=N/A time=00:37:54.62 bitrate=N/A speed=33.9x
video:0kB audio:1708140kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[Parsed_loudnorm_0 @ 00000121fcf41e00]
{
        "input_i" : "-22.72",
        "input_tp" : "-2.67",
        "input_lra" : "6.10",
        "input_thresh" : "-33.31",
        "output_i" : "-16.95",
        "output_tp" : "-1.00",
        "output_lra" : "6.00",
        "output_thresh" : "-27.53",
        "normalization_type" : "dynamic",
        "target_offset" : "-0.05"
}
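
The mapping from this JSON block to the options of the second pass can be sketched as follows. This is a minimal Python sketch, assuming the JSON text has already been cut out of ffmpeg's stderr; the function name and the `example_stats` variable are hypothetical, and the targets default to the ones I use in this post.

```python
import json


def second_pass_filter(stats_json, target_i=-17, target_tp=-1, target_lra=9):
    """Build the loudnorm filter string for the second pass.

    Hypothetical helper: stats_json is the JSON text printed by the
    first pass with print_format=json.
    """
    stats = json.loads(stats_json)
    # loudnorm prints the values as strings, so they are passed through
    # unchanged to avoid any rounding differences.
    return (
        f"loudnorm=I={target_i}:TP={target_tp}:LRA={target_lra}"
        f":measured_I={stats['input_i']}"
        f":measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}"
        ":linear=true:print_format=summary"
    )


# The input-side values reported by my analysis pass.
example_stats = """
{
    "input_i" : "-22.72",
    "input_tp" : "-2.67",
    "input_lra" : "6.10",
    "input_thresh" : "-33.31",
    "target_offset" : "-0.05"
}
"""

print(second_pass_filter(example_stats))
```

The resulting string is what goes after `-filter:a` in the second command.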

2. Encode the audio with:

ffmpeg -i input.wav -filter:a loudnorm=I=-17:TP=-1:LRA=9:measured_I=-22.72:measured_TP=-2.67:measured_LRA=6.10:measured_thresh=-33.31:offset=-0.05:linear=true:print_format=summary output.wav
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'input.wav':
  Metadata:
    encoded_by      : Adobe Adobe Media Encoder 2020.0
    encoder         : Adobe Adobe Media Encoder 2020.0 (Windows)
    date            : 2021-02-15
    creation_time   : 15:31:34
    time_reference  : 0
  Duration: 00:37:57.52, bitrate: 1539 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
File 'output.wav' already exists. Overwrite? [y/N] y
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'output.wav':
  Metadata:
    ITCH            : Adobe Adobe Media Encoder 2020.0
    time_reference  : 0
    ICRD            : 2021-02-15
    ISFT            : Lavf58.65.101
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 192000 Hz, stereo, s16, 6144 kb/s
    Metadata:
      encoder         : Lavc58.120.100 pcm_s16le
size= 1708140kB time=00:37:54.62 bitrate=6151.8kbits/s speed=33.6x
video:0kB audio:1708140kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000009%
[Parsed_loudnorm_0 @ 000001a3673c0780]
Input Integrated:    -22.7 LUFS
Input True Peak:      -2.7 dBTP
Input LRA:             6.1 LU
Input Threshold:     -33.3 LUFS

Output Integrated:   -17.0 LUFS
Output True Peak:     -1.0 dBTP
Output LRA:            6.0 LU
Output Threshold:    -27.6 LUFS

Normalization Type:   Dynamic
Target Offset:        -0.0 LU
