Re: [libjpeg-turbo-users] SIMD instruction support based on CPU architecture

'DRC' via libjpeg-turbo User Discussion/Support Wed, 14 Sep 2022 07:59:18 -0700

libjpeg-turbo doesn't actually check whether the CPU supports Neon instructions at compile time. It only checks whether the C compiler has the ability to compile Neon intrinsics. That is necessary because libjpeg-turbo's Neon SIMD extensions are implemented using compiler intrinsics. Because libjpeg-turbo's x86 SIMD extensions are implemented using raw assembly code, no such compile-time checks are necessary. As long as you are using a recent version of NASM or YASM, the x86 SIMD extensions will build. In general, compile-time SIMD capability checks would not be useful, because you can always build SIMD code on CPUs that can't run the code. In fact, my primary build machine has an older CPU that doesn't support AVX2 instructions, but it builds the libjpeg-turbo AVX2 SIMD extensions just fine.

libjpeg-turbo performs a run-time SIMD capability check within the body of any of the jsimd_can_*() functions, which ensures that only instructions that the CPU supports will be used. Neon instructions are always available with the AArch64 architecture, so run-time detection of Neon instructions is only performed for AArch32 builds. On Linux and Android, AArch64 builds of libjpeg-turbo do perform run-time checks to determine whether to disable the use of certain Neon instructions on CPUs that have slow implementations of them. However, that is a legacy feature that only affects the legacy Neon GNU Assembler (GAS) code, which is only used by default with GCC 11 and earlier (because those versions of GCC do not generate optimal assembly code from Neon intrinsics). When building with Clang, the full Neon intrinsics implementation is used by default. For AArch32 builds, the default is to perform a Neon capability check at run time, which would allow the build to run on both Neon-equipped and non-Neon-equipped CPUs. However, if you pass -mfpu=neon to the compiler, you will produce C code that only runs on Neon-equipped CPUs. Thus, libjpeg-turbo detects that situation at compile time (via the __ARM_NEON__ macro, which is defined with -mfpu=neon) and disables run-time Neon capability checking if the resulting build could never run on non-Neon-equipped CPUs.
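To illustrate the compile-time bypass described above, here is a minimal sketch. It is not libjpeg-turbo's actual code. The function names (probe_neon_at_runtime(), neon_usable(), neon_check_is_compile_time()) are hypothetical, and the run-time probe is a stub that always reports "no Neon" so that the sketch builds on any platform. The sketch also tests __ARM_NEON, the newer ACLE spelling of the macro, in addition to __ARM_NEON__.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a platform-specific run-time probe.  A real
   probe would query the OS or CPU for Neon support; this stub always
   reports "no Neon" so that the sketch builds anywhere. */
static bool probe_neon_at_runtime(void)
{
  return false;
}

/* Returns true if Neon code paths may be used by this build. */
static bool neon_usable(void)
{
#if defined(__aarch64__) || defined(__ARM_NEON__) || defined(__ARM_NEON)
  /* AArch64 always has Neon, and an AArch32 build compiled with
     -mfpu=neon could never run on a non-Neon CPU anyway, so the
     run-time probe is skipped entirely. */
  return true;
#else
  /* Generic build: defer the decision to run time. */
  return probe_neon_at_runtime();
#endif
}

/* Returns true if the question was settled at compile time. */
static bool neon_check_is_compile_time(void)
{
#if defined(__aarch64__) || defined(__ARM_NEON__) || defined(__ARM_NEON)
  return true;
#else
  return false;
#endif
}
```

With the stub probe, neon_usable() returns true only in builds where the preprocessor branch was taken. In a real AArch32 build without -mfpu=neon, the result would depend on what the probe finds on the machine running the code.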

With x86 builds, run-time SIMD capability checking is always performed. Only the SIMD instructions that are supported by the CPU will be used, so to answer your specific question-- yes, SIMD-enabled builds of libjpeg-turbo will work properly on CPUs that only support MMX instructions. All x86-64 systems support SSE2 instructions, but x86-64 builds of libjpeg-turbo will perform run-time checking to determine whether AVX2 instructions can be used.
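The general shape of that kind of run-time selection can be sketched as follows. This is an illustration only and not how libjpeg-turbo implements its detection. The names (CAP_*, simd_level, detect_caps(), pick_level()) are hypothetical, and the sketch assumes GCC or Clang on x86, where the __builtin_cpu_init() and __builtin_cpu_supports() builtins are available. On other compilers or architectures, detect_caps() simply reports no capabilities.

```c
/* Hypothetical capability bits, one per x86 SIMD extension. */
enum { CAP_MMX = 1 << 0, CAP_SSE2 = 1 << 1, CAP_AVX2 = 1 << 2 };

/* Hypothetical code-path levels, ordered from least to most capable. */
typedef enum { LEVEL_NONE, LEVEL_MMX, LEVEL_SSE2, LEVEL_AVX2 } simd_level;

/* Ask the CPU, at run time, which extensions it supports. */
static unsigned detect_caps(void)
{
  unsigned caps = 0;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  __builtin_cpu_init();
  if (__builtin_cpu_supports("mmx"))
    caps |= CAP_MMX;
  if (__builtin_cpu_supports("sse2"))
    caps |= CAP_SSE2;
  if (__builtin_cpu_supports("avx2"))
    caps |= CAP_AVX2;
#endif
  return caps;
}

/* Choose the most capable code path that the CPU can actually run. */
static simd_level pick_level(unsigned caps)
{
  if (caps & CAP_AVX2)
    return LEVEL_AVX2;
  if (caps & CAP_SSE2)
    return LEVEL_SSE2;
  if (caps & CAP_MMX)
    return LEVEL_MMX;
  return LEVEL_NONE;
}
```

Calling pick_level(detect_caps()) on an MMX-only CPU would select LEVEL_MMX, even though the binary also contains the SSE2 and AVX2 code paths. That is the sense in which a single SIMD-enabled build works across CPUs with different capabilities.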



On 9/11/22 3:28 PM, Berke Yavas wrote:

Hi,
In the i386 and x86_64 architectures, there are different SIMD extensions. For example, i386 could have MMX, SSE2, or AVX2 support. AFAIK, libjpeg-turbo does not check whether these are available at compile time as it does for NEON support. So how does libjpeg-turbo know which extension to use? Is this dispatched at run time? Is any one of these enough to run the SIMD extensions? Suppose the system only has MMX. Do the SIMD extensions (WITH_SIMD) still work without a problem?
It would be good to add these checks at compile time for x64 and i386, as is done for ARM or MIPS.
I am new to these things. These questions might be stupid; sorry if they are.

Thanks,
Berke


