On Wed, 10 Jul 2024, James Almer wrote:

ffmpeg | branch: master | James Almer <jamr...@gmail.com> | Wed Jul 10 13:00:20 2024 -0300| [bd1bcb07e0f29c135103a402d71b343a09ad1690] | committer: James Almer

x86/intreadwrite: use intrinsics instead of inline asm for AV_COPY128

This has the benefit of removing any SSE -> AVX penalty that may happen when
the compiler emits VEX encoded instructions.

Signed-off-by: James Almer <jamr...@gmail.com>

http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=bd1bcb07e0f29c135103a402d71b343a09ad1690
---

configure                    |  5 ++++-
libavutil/x86/intreadwrite.h | 20 +++++++-------------
2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/configure b/configure
index f84fefeaab..7151ed1de3 100755
--- a/configure
+++ b/configure
@@ -2314,6 +2314,7 @@ HEADERS_LIST="

INTRINSICS_LIST="
    intrinsics_neon
+    intrinsics_sse
    intrinsics_sse2
"

@@ -2744,7 +2745,8 @@ armv6t2_deps="arm"
armv8_deps="aarch64"
neon_deps_any="aarch64 arm"
intrinsics_neon_deps="neon"
-intrinsics_sse2_deps="sse2"
+intrinsics_sse_deps="sse"
+intrinsics_sse2_deps="sse2 intrinsics_sse"
vfp_deps="arm"
vfpv3_deps="vfp"
setend_deps="arm"
@@ -6446,6 +6448,7 @@ elif enabled loongarch; then
fi

check_cc intrinsics_neon arm_neon.h "int16x8_t test = vdupq_n_s16(0)"
+check_cc intrinsics_sse immintrin.h "__m128 test = _mm_setzero_ps()"
check_cc intrinsics_sse2 emmintrin.h "__m128i test = _mm_setzero_si128()"

check_ldflags -Wl,--as-needed
diff --git a/libavutil/x86/intreadwrite.h b/libavutil/x86/intreadwrite.h
index 9bbef00dba..6546eb016c 100644
--- a/libavutil/x86/intreadwrite.h
+++ b/libavutil/x86/intreadwrite.h
@@ -22,29 +22,25 @@
#define AVUTIL_X86_INTREADWRITE_H

#include <stdint.h>
+#if HAVE_INTRINSICS_SSE
+#include <immintrin.h>
+#endif

This change seems to have broken builds for x86 with Clang 16 or newer. (Clang 15 and older seem to be fine.)

See e.g. http://fate.ffmpeg.org/log.cgi?slot=i686-mingw32-clang-trunk&time=20240711035948&log=compile for an example of the error. The issue is that a Clang internal intrinsics header contains "_mm_comige_sh(__m128h A,", i.e. a parameter with the name "A" (which toolchain-provided headers shouldn't use, as it is not a reserved identifier). This clashes with libavcodec/huffyuv.h, which has a "#define A 3": once that define is visible, the preprocessor replaces the parameter name with "3" and the header no longer parses.
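As a rough sketch of the mechanism (not the actual Clang header; the declaration "int f(int A)" and the helper names below are made up for illustration), stringifying a declaration after macro expansion shows what the compiler ends up seeing:

```c
#include <string.h>

/* Same kind of short macro name as the one in our header. */
#define A 3

/* Two-level stringification, so the argument is macro-expanded first. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Hypothetical declaration standing in for a header that names a
 * parameter "A". Returns the text left over after preprocessing. */
static const char *preprocessed_decl(void)
{
    return STR(int f(int A));
}

/* Nonzero if the parameter name survived preprocessing. */
static int decl_mentions_A(void)
{
    return strstr(preprocessed_decl(), "A") != NULL;
}
```

With the define visible, the parameter name is gone and a literal 3 sits in its place, which is not a valid parameter declaration.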

This is obviously a Clang intrinsics header bug, but we can't fix the existing Clang 16-18 releases that are out there, so I guess what we can do is change our "#define A" to a longer, less generic name. (IIRC there are similar name clashes with ncurses and/or Android headers too.)
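A minimal sketch of that kind of rename, assuming a prefixed name such as HUFF_IDX_A (hypothetical, not a name used in the tree) and with header_like_func() standing in for a toolchain header function that has a parameter named A:

```c
/* Hypothetical prefixed name, in place of the bare "#define A 3". */
#define HUFF_IDX_A 3

/* Stand-in for a toolchain header function whose parameters are named
 * A and B. With the prefixed define above, these names are left alone. */
static inline int header_like_func(int A, int B)
{
    return A + B;
}

/* Our own code keeps indexing the component through the macro. */
static int alpha_of(const unsigned char *px)
{
    return px[HUFF_IDX_A];
}
```

The prefixed name is a bit more typing at each use, but it no longer depends on what identifiers third-party headers happen to pick.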

// Martin
