On Tue, 24 Oct 2023, Martin Storsjö wrote:

Clang versions before 17 (Xcode versions up to and including 15.0)
had a very annoying bug in their handling of the ".arch" directive
in assembly. If the directive only contained a level, such as
".arch armv8.2-a", it did validate the name of the level, but it
didn't apply the level to what instructions are allowed. The level
was applied if the directive contained an extra feature enabled,
such as ".arch armv8.2-a+crc" though. It was also applied on the
next ".arch_extension" directive.

This bug, combined with the fact that the same versions of Clang
didn't support the dotprod/i8mm extension names in either
".arch <level>+<feature>" or in ".arch_extension", could lead to
unexpected build failures.

As the dotprod/i8mm extensions couldn't be enabled dynamically
via the ".arch_extension" directive, someone building ffmpeg could
try to enable them by configuring their build with
--extra-cflags="-march=armv8.6-a".

During configure, we test for support for the i8mm instructions
like this:

   # Built with -march=armv8.6-a
   .arch armv8.2-a             # Has no visible effect here
   #.arch_extension i8mm       # Omitted as the extension name isn't known
   usdot v0.4s, v0.16b, v0.16b
   # Successfully assembled as armv8.6-a is the effective level,
   # and i8mm is enabled implicitly in armv8.6-a.

Thus, we would enable assembling those instructions. However, if
we then check for another extension, such as sve (which those
versions of Clang actually do support), we can run into the
following situation when building actual code:

   # Built with -march=armv8.6-a
   .arch armv8.2-a             # Has no visible effect here
   #.arch_extension i8mm       # Omitted as the extension name isn't known
   .arch_extension sve         # Included as "sve" is a supported extension name
   # .arch_extension effectively activates the previous .arch directive,
   # so the effective level is armv8.2-a+sve now.
   usdot v0.4s, v0.16b, v0.16b
   # Fails to build the instructions that require i8mm. Despite the
   # configure check, the unrelated ".arch_extension sve" directive
   # breaks the functionality of the i8mm feature.

This patch avoids this situation:
- By adding a dummy feature such as "+crc" on the .arch directive
 (if supported), we make sure that it does get applied immediately,
 avoiding it taking effect spuriously at a later unrelated
 ".arch_extension" directive.
- By checking for higher arch levels such as armv8.4-a and armv8.6-a,
 we can assemble the dotprod and i8mm extensions without the user
 needing to pass -march=armv8.6-a. This allows using the dotprod/i8mm
 codepaths via runtime detection while keeping the binary runnable
 on older versions. I.e. this enables the i8mm codepaths on Apple M2
 machines while built with Xcode's Clang.
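
As a rough illustration of the two points above (not the actual configure
code from this patch), the probing could be sketched like this. The helper
names try_as and pick_arch_level and the AS_CHECK_CC variable are
hypothetical, and the sketch assumes a compiler driver that accepts
"-x assembler" and reads the source from stdin:

```shell
# Hypothetical helper: succeeds if the assembler accepts the given line.
# AS_CHECK_CC is an assumed variable naming an AArch64 compiler driver.
try_as() {
    printf '%s\n' "$1" |
        ${AS_CHECK_CC:-cc} -x assembler -c -o /dev/null - 2>/dev/null
}

# Print the highest accepted .arch level, preferring the form with a
# dummy "+crc" feature so that the level gets applied immediately.
pick_arch_level() {
    for level in armv8.6-a armv8.4-a armv8.2-a; do
        if try_as ".arch $level+crc"; then
            printf '%s\n' "$level+crc"
            return 0
        elif try_as ".arch $level"; then
            printf '%s\n' "$level"
            return 0
        fi
    done
    return 1
}
```

The levels are tried from highest to lowest so that the first match is
the most capable one, and the plain form is only used as a fallback when
the "+crc" form is rejected.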

TL;DR: Enable the I8MM extensions for Apple M2 without the user needing
to do a custom configuration; avoid potential build breakage if a user
does such a custom configuration.

Once Xcode versions that have these issues fixed are prevalent, we
can consider reverting this change.
---
configure | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

Will push now.

// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
