From: Jiawei <jia...@iscas.ac.cn>

This patch modifies the FFmpeg build system to allow GCC to use the
`-ftree-vectorize` flag when the compiler version is 13 or newer.
Enabling this flag can improve performance through better loop analysis
and auto-vectorization (SIMD) opportunities in modern GCC versions.

The explicit `-fno-tree-vectorize` flag was originally added in commit
973859f5230e (2009). A previous attempt to enable `-ftree-vectorize` was
made in commit cb8646af24bd (2016), but it was reverted in fd6dbc53855f.
The reason for the revert was that the inline x86 CABAC assembly code
caused compiler errors - the compiler would run out of available
registers during vectorization, making it unable to compile some
functions. There were also reports of GCC hitting internal compiler
errors, and of miscompilations on x86_64.

In commit 182663a58a7a (2023), the problematic CABAC function was made
non-inline. This significantly reduces the risk of register exhaustion
caused by inlining large assembly blocks, making the vectorizer safer to
enable for other functions.

Given improvements in GCC's vectorizer and the mitigation of the
original issue, we now re-enable `-ftree-vectorize` for GCC version 13
and above.

Signed-off-by: Jiawei <jia...@iscas.ac.cn>
---
Reposting as iterations v3-v5 by Jiawei haven't made it to the mailing
list.

For future reference: the issues with the CABAC assembly are easily
reproducible with current GCC versions too. Build a version before
182663a58a7a (possibly backporting
effadce6c756247ea8bae32dc13bb3e6f464f0eb and
f01fdedb69e4accb1d1555106d8f682ff1f1ddc7), configure a build for x86_32
with --cpu=haswell, and it hits errors like this:

    src/libavcodec/x86/cabac.h:199:5: error: ‘asm’ operand has impossible constraints

The last time, there were also reports of miscompilations; these were
mentioned on the mailing list in
https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2016-May/193915.html and
https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2016-May/193977.html.

If we apply this, let's hope that GCC has improved since last time. The
change only takes effect on GCC 13 and newer (but for all architectures;
in particular on aarch64 it has been seen to be generally beneficial in
e.g. dav1d).

Also, note that this is not us explicitly opting in to an experimental
feature; this is a default feature in GCC that we've explicitly opted
out of so far.
---
 configure | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 534b443f7d..117893ee93 100755
--- a/configure
+++ b/configure
@@ -7673,7 +7673,11 @@ if enabled icc; then
             disable aligned_stack
     fi
 elif enabled gcc; then
-    check_optflags -fno-tree-vectorize
+    gcc_version=$($cc -dumpversion)
+    major_version=${gcc_version%%.*}
+    if [ $major_version -lt 13 ]; then
+        check_optflags -fno-tree-vectorize
+    fi
     check_cflags -Werror=format-security
     check_cflags -Werror=implicit-function-declaration
     check_cflags -Werror=missing-prototypes
-- 
2.39.5 (Apple Git-154)
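
For anyone wanting to check how the version gate in the hunk above
behaves outside of configure, here is a minimal standalone sketch. The
helper names `gcc_major` and `wants_no_tree_vectorize` are hypothetical
and not part of FFmpeg's configure; only the `${var%%.*}` expansion and
the `-lt 13` comparison are taken from the patch. The fallback for an
unparsable version string is an assumption of this sketch, not something
the patch does.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of FFmpeg's configure.

# Print the major component of a version string such as "13.2.0" or "12".
gcc_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed (exit 0) when -fno-tree-vectorize should still be passed,
# i.e. for GCC major versions below 13, mirroring the test in the patch.
wants_no_tree_vectorize() {
    major=$(gcc_major "$1")
    case $major in
        # Assumption of this sketch: if the version is empty or not
        # numeric, stay conservative and keep the vectorizer disabled.
        ''|*[!0-9]*) return 0 ;;
    esac
    [ "$major" -lt 13 ]
}

for v in 12.3.0 13 14.1.0; do
    if wants_no_tree_vectorize "$v"; then
        echo "$v: keep -fno-tree-vectorize"
    else
        echo "$v: leave the vectorizer at the compiler default"
    fi
done
```

Quoting "$major" and rejecting non-numeric input avoids a `[` syntax
error if `$cc -dumpversion` ever prints nothing; the sketch gives the
same answer whether the compiler reports a full version or only a major
number.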