PR #23473 opened by mkver
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/23473
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/23473.patch

This avoids reg-reg moves and saves 90112B of .text here. It also makes the code less reliant on a clean upper ymm state.

Not all functions use VEX encoding yet: besides inline assembly functions, which are not influenced by x86inc.asm, there are also functions using a mixture of xmm and mmx registers (e.g. h264_intrapred.asm) that use INIT_MMX, where the automatic VEX translation is not active. This means that some parts of the code still rely on a clean upper ymm state.

Hint: One could do even more, e.g. remove all the REP_RETs or remove <AVX functions which are overridden by a <=AVX version. One could also modify the EXTERNAL_SSE2 etc. macros to actually check for AVX instead of just presuming it from the fact that the compiler is allowed to use AVX freely. And one could add an explicit option for this instead of deriving it from the __AVX__ macro (which should cover GCC, Clang and MSVC).

From ccaa1195c3e52b010946058b4bfa1d1ca85b9c2d Mon Sep 17 00:00:00 2001
From: Andreas Rheinhardt <[email protected]>
Date: Sun, 14 Jun 2026 00:44:36 +0200
Subject: [PATCH] avutil/x86/x86util: Force VEX encoding when using -mavx

This avoids reg-reg moves and saves 90112B of .text here. It also makes the code less reliant on a clean upper ymm state.

Not all functions use VEX encoding yet: besides inline assembly functions, which are not influenced by x86inc.asm, there are also functions using a mixture of xmm and mmx registers (e.g. h264_intrapred.asm) that use INIT_MMX, where the automatic VEX translation is not active. This means that some parts of the code still rely on a clean upper ymm state.

Signed-off-by: Andreas Rheinhardt <[email protected]>
---
 configure                 | 3 +++
 libavutil/x86/x86util.asm | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/configure b/configure
index e67aa362ad..ef07d4895b 100755
--- a/configure
+++ b/configure
@@ -2445,6 +2445,7 @@ ARCH_FEATURES="
     simd_align_16
     simd_align_32
     simd_align_64
+    x86_sse2avx
 "
 
 BUILTIN_LIST="
@@ -6892,6 +6893,8 @@ EOF
 
     check_cc intrinsics_sse2 emmintrin.h "__m128i test = _mm_setzero_si128()"
 
+    test_cpp_condition stddef.h "defined(__AVX__) && __AVX__" && enable x86_sse2avx
+
 elif enabled loongarch; then
     enabled lsx && check_inline_asm lsx '"vadd.b $vr0, $vr1, $vr2"' '-mlsx' && append LSXFLAGS '-mlsx'
     enabled lasx && check_inline_asm lasx '"xvadd.b $xr0, $xr1, $xr2"' '-mlasx' && append LASXFLAGS '-mlasx'
diff --git a/libavutil/x86/x86util.asm b/libavutil/x86/x86util.asm
index da41e2e5ef..6632155c99 100644
--- a/libavutil/x86/x86util.asm
+++ b/libavutil/x86/x86util.asm
@@ -27,6 +27,10 @@
 %define public_prefix avpriv
 %define cpuflags_mmxext cpuflags_mmx2
 
+%if HAVE_X86_SSE2AVX
+%define FORCE_VEX_ENCODING 1
+%endif
+
 %include "libavutil/x86/x86inc.asm"
 
 ; expands to [base],...,[base+7*stride]
--
2.52.0
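
For illustration only (not part of the patch): the configure check above keys off the compiler's predefined __AVX__ macro. A minimal C sketch of the same condition follows. The names SKETCH_HAVE_X86_SSE2AVX and sketch_have_x86_sse2avx are hypothetical and exist neither in FFmpeg nor in this patch; the real result is the x86_sse2avx configure feature, which the assembly sees as HAVE_X86_SSE2AVX.

```c
/* Illustrative sketch only; not part of the patch or of FFmpeg. */

/* Same preprocessor condition that the configure hunk passes to
 * test_cpp_condition. GCC and Clang define __AVX__ when AVX code
 * generation is enabled (e.g. -mavx); MSVC does so under /arch:AVX. */
#if defined(__AVX__) && __AVX__
#define SKETCH_HAVE_X86_SSE2AVX 1 /* hypothetical name */
#else
#define SKETCH_HAVE_X86_SSE2AVX 0 /* hypothetical name */
#endif

/* Returns 1 if this translation unit was compiled with AVX enabled,
 * i.e. the case in which the patch would define FORCE_VEX_ENCODING,
 * and 0 otherwise. */
int sketch_have_x86_sse2avx(void)
{
    return SKETCH_HAVE_X86_SSE2AVX;
}
```

With GCC or Clang targeting x86, compiling this with -mavx should make the function return 1, and compiling without any AVX-enabling flag should make it return 0. This mirrors the assumption stated in the hint: AVX availability is presumed from what the compiler is allowed to emit, not checked at runtime.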
