libjpeg-turbo doesn't check at compile time whether the CPU supports Neon instructions. It only checks whether the C compiler can compile Neon intrinsics, which is necessary because libjpeg-turbo's Neon SIMD extensions are implemented using compiler intrinsics. libjpeg-turbo's x86 SIMD extensions are implemented using raw assembly code, so no such compile-time check is necessary for them. As long as you are using a recent version of NASM or YASM, the x86 SIMD extensions will build. In general, compile-time checks of the CPU's SIMD capabilities would not be useful, because you can always build SIMD code on a CPU that can't run it. In fact, my primary build machine has an older CPU that doesn't support AVX2 instructions, but it builds the libjpeg-turbo AVX2 SIMD extensions just fine.

libjpeg-turbo performs a run-time SIMD capability check within the body of each of the jsimd_can_*() functions, which ensures that only instructions that the CPU supports will be used. Neon instructions are always available with the AArch64 architecture, so run-time detection of Neon instructions is only performed for AArch32 builds. On Linux and Android, AArch64 builds of libjpeg-turbo do perform run-time checks, but only to determine whether to disable the use of certain Neon instructions on CPUs that have slow implementations of them. That is a legacy feature that only affects the legacy Neon GNU Assembler (GAS) code, which is only used by default with GCC 11 and earlier (because those versions of GCC do not generate optimal assembly code from Neon intrinsics). When building with Clang, the full Neon intrinsics implementation is used by default.

For AArch32 builds, the default is to perform a Neon capability check at run time, which allows the build to run on both Neon-equipped and non-Neon-equipped CPUs. However, if you pass -mfpu=neon to the compiler, it will generate code that only runs on Neon-equipped CPUs. libjpeg-turbo detects that situation at compile time (via the __ARM_NEON__ macro, which is defined with -mfpu=neon) and disables run-time Neon capability checking, since the resulting build could never run on non-Neon-equipped CPUs.
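
To make the AArch32 logic above more concrete, here is a minimal sketch. It is not libjpeg-turbo's actual code. Every name prefixed with sketch_ or SKETCH_ is hypothetical, the additional test for __ARM_NEON (the ACLE spelling of the macro) is my own addition, and the run-time probe assumes a 32-bit Arm Linux target where getauxval() and HWCAP_NEON are available. On any other target, the probe simply reports "no Neon".

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON */
#endif

/* 1 if the compiler was told that Neon is always present (for example, with
 * -mfpu=neon on AArch32), 0 otherwise.  Hypothetical macro name. */
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#define SKETCH_NEON_GUARANTEED 1
#else
#define SKETCH_NEON_GUARANTEED 0
#endif

/* Hypothetical run-time probe.  Assumption: on 32-bit Arm Linux, the kernel
 * reports Neon support through the AT_HWCAP auxiliary vector entry. */
static bool sketch_probe_neon(void)
{
#if defined(__linux__) && defined(__arm__)
  return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
  return false;  /* not a 32-bit Arm Linux target: report "no Neon" */
#endif
}

/* Hypothetical stand-in for the decision made inside a jsimd_can_*()
 * function. */
bool sketch_neon_available(void)
{
  /* The condition is a compile-time constant, so a build that can only run
   * on Neon-equipped CPUs never needs to call the run-time probe. */
  if (SKETCH_NEON_GUARANTEED)
    return true;
  return sketch_probe_neon();
}
```

A caller would use the Neon code path only when sketch_neon_available() returns true and would fall back to the plain C code otherwise.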

With x86 builds, run-time SIMD capability checking is always performed. Only the SIMD instructions that are supported by the CPU will be used, so, to answer your specific question: yes, SIMD-enabled builds of libjpeg-turbo will work properly on CPUs that only support MMX instructions. All x86-64 systems support SSE2 instructions, but x86-64 builds of libjpeg-turbo perform run-time checking to determine whether AVX2 instructions can be used.
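
The general idea of run-time dispatch on x86 can be sketched as follows. This is an illustration only, not libjpeg-turbo's actual code. It assumes GCC or Clang, whose __builtin_cpu_supports() built-in queries the CPU at run time, and all of the sketch_ and SKETCH_ names are hypothetical. The "kernels" are plain C stand-ins. In a real library, each one would be a separate implementation written with the corresponding instructions.

```c
#include <stddef.h>

/* Hypothetical capability levels, from least to most capable. */
typedef enum {
  SKETCH_NONE, SKETCH_MMX, SKETCH_SSE2, SKETCH_AVX2
} sketch_simd_level;

/* Ask the CPU, at run time, which instruction sets it supports.
 * Assumption: GCC or Clang on x86.  Elsewhere, report SKETCH_NONE. */
sketch_simd_level sketch_detect_simd(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2")) return SKETCH_AVX2;
  if (__builtin_cpu_supports("sse2")) return SKETCH_SSE2;
  if (__builtin_cpu_supports("mmx")) return SKETCH_MMX;
#endif
  return SKETCH_NONE;
}

typedef unsigned long (*sketch_sum_fn)(const unsigned char *, size_t);

/* Plain C fallback: sum the bytes in a buffer. */
static unsigned long sketch_sum_c(const unsigned char *p, size_t n)
{
  unsigned long total = 0;
  for (size_t i = 0; i < n; i++) total += p[i];
  return total;
}

/* Stand-ins for the SIMD kernels.  Here they just call the C version. */
static unsigned long sketch_sum_mmx(const unsigned char *p, size_t n)
{ return sketch_sum_c(p, n); }
static unsigned long sketch_sum_sse2(const unsigned char *p, size_t n)
{ return sketch_sum_c(p, n); }
static unsigned long sketch_sum_avx2(const unsigned char *p, size_t n)
{ return sketch_sum_c(p, n); }

/* Map a capability level to the best kernel for that level. */
sketch_sum_fn sketch_select_sum(sketch_simd_level level)
{
  switch (level) {
  case SKETCH_AVX2: return sketch_sum_avx2;
  case SKETCH_SSE2: return sketch_sum_sse2;
  case SKETCH_MMX:  return sketch_sum_mmx;
  default:          return sketch_sum_c;
  }
}

/* Public entry point: detect, select, and call.  A real library would
 * probably cache the result of the detection step. */
unsigned long sketch_sum(const unsigned char *p, size_t n)
{
  return sketch_select_sum(sketch_detect_simd())(p, n);
}
```

Because every kernel is compiled into the same binary and the choice is made when the program runs, the same build works on an MMX-only CPU and on an AVX2-capable one. The build machine's own CPU never enters into it.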


On 9/11/22 3:28 PM, Berke Yavas wrote:
Hi,

In the i386 and x86_64 architectures, there are different SIMD extensions. For example, i386 could have MMX, SSE2, or AVX2 support. AFAIK, libjpeg-turbo does not check at compile time whether these are available, as it does for NEON support. So how does libjpeg-turbo know which extension to use? Is this dispatched at run time? Is any one of these enough to run the SIMD extensions? Suppose the system only has MMX. Do the SIMD extensions (WITH_SIMD) still work without a problem?

It would be good to add these checks at compile time for x64 and i386, like ARM or MIPS.

I am new to these things. This question might be stupid; sorry if it is.

Thanks,
Berke
