mathops: clip constants used with shift instructions within inline assembly

James Almer Sun, 16 Jul 2023 09:04:29 -0700

On 7/16/2023 1:00 PM, Andreas Rheinhardt wrote:

James Almer:

From: Rémi Denis-Courmont <r...@remlab.net>


Fixes assembling with binutil as >= 2.41

Signed-off-by: James Almer <jamr...@gmail.com>
---
  libavcodec/x86/mathops.h | 26 +++++++++++++++++++++++---
  1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/libavcodec/x86/mathops.h b/libavcodec/x86/mathops.h
index 6298f5ed19..ca7e2dffc1 100644
--- a/libavcodec/x86/mathops.h
+++ b/libavcodec/x86/mathops.h
@@ -35,12 +35,20 @@
  static av_always_inline av_const int MULL(int a, int b, unsigned shift)
  {
      int rt, dummy;
+    if (__builtin_constant_p(shift))


We actually have av_builtin_constant_p. Is it guaranteed that all
compilers supporting inline ASM also support __builtin_constant_p?

I can use av_builtin_constant_p() if you want, but it will be expandedto __builtin_constant_p() in all supported compilers, judging by howit's defined.

      __asm__ (
          "imull %3               \n\t"
          "shrdl %4, %%edx, %%eax \n\t"
          :"=a"(rt), "=d"(dummy)
-        :"a"(a), "rm"(b), "ci"((uint8_t)shift)
+        :"a"(a), "rm"(b), "i"(shift & 0x1F)
      );
+    else
+        __asm__ (
+            "imull %3               \n\t"
+            "shrdl %4, %%edx, %%eax \n\t"
+            :"=a"(rt), "=d"(dummy)
+            :"a"(a), "rm"(b), "c"((uint8_t)shift)
+        );
      return rt;
  }

@@ -113,19 +121,31 @@ __asm__ volatile(\

  // avoid +32 for shift optimization (gcc should do that ...)
  #define NEG_SSR32 NEG_SSR32
  static inline  int32_t NEG_SSR32( int32_t a, int8_t s){
+    if (__builtin_constant_p(s))
      __asm__ ("sarl %1, %0\n\t"
           : "+r" (a)
-         : "ic" ((uint8_t)(-s))
+         : "i" (-s & 0x1F)
      );
+    else
+        __asm__ ("sarl %1, %0\n\t"
+               : "+r" (a)
+               : "c" ((uint8_t)(-s))
+        );
      return a;
  }

#define NEG_USR32 NEG_USR32

  static inline uint32_t NEG_USR32(uint32_t a, int8_t s){
+    if (__builtin_constant_p(s))
      __asm__ ("shrl %1, %0\n\t"
           : "+r" (a)
-         : "ic" ((uint8_t)(-s))
+         : "i" (-s & 0x1F)
      );
+    else
+        __asm__ ("shrl %1, %0\n\t"
+               : "+r" (a)
+               : "c" ((uint8_t)(-s))
+        );
      return a;
  }


Does this have a performance or codesize impact?


It should behave the same it has until now.

And is the inline ASM actually any good? (When I comment the inline ASM
of NEG_USR32 out, code size actually increases with GCC 11, suggesting
that the inline ASM may be counterproductive as it impairs the compilers
ability to optimize.)

I did not test nor check if removing this ricing is better or not. Itcan be looked at later. Right now, i want lavc to compile with binutils 2.41


- Andreas

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] avcodec/x86/mathops: clip constants used with shift instructions within inline assembly

Reply via email to