libjpeg-turbo doesn't actually check whether the CPU supports Neon
instructions at compile time. It only checks whether the C compiler has
the ability to compile Neon intrinsics. That is necessary because
libjpeg-turbo's Neon SIMD extensions are implemented using compiler
intrinsics. Because libjpeg-turbo's x86 SIMD extensions are implemented
using raw assembly code, no such compile-time checks are necessary. As
long as you are using a recent version of NASM or YASM, the x86 SIMD
extensions will build. In general, compile-time SIMD capability checks
would not be useful, because you can always build SIMD code on CPUs that
can't run the code. In fact, my primary build machine has an older CPU
that doesn't support AVX2 instructions, but it builds the libjpeg-turbo
AVX2 SIMD extensions just fine.

libjpeg-turbo performs a run-time SIMD capability check within the body
of any of the jsimd_can_*() functions, which ensures that only
instructions that the CPU supports will be used. Neon instructions are
always available with the AArch64 architecture, so run-time detection of
Neon instructions is only performed for AArch32 builds. On Linux and
Android, AArch64 builds of libjpeg-turbo do perform run-time checks to
determine whether to disable the use of certain Neon instructions on
CPUs that have slow implementations of them. However, that is a legacy
feature that only affects the legacy Neon GNU Assembler (GAS) code,
which is only used by default with GCC 11 and earlier (because those
versions of GCC do not generate optimal assembly code from Neon
intrinsics). When building with Clang, the full Neon intrinsics
implementation is used by default. For AArch32 builds, the default is
to perform a Neon capability check at run time, which would allow the
build to run on both Neon-equipped and non-Neon-equipped CPUs. However,
if you pass -mfpu=neon to the compiler, it will produce object code that
only runs on Neon-equipped CPUs. Thus, libjpeg-turbo detects that
situation at compile time (via the __ARM_NEON__ macro, which is defined
with -mfpu=neon) and disables run-time Neon capability checking if the
resulting build could never run on non-Neon-equipped CPUs.
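
The AArch32 decision described above can be sketched in C roughly as
follows. This is only an illustration, not libjpeg-turbo's actual source:
neon_usable() and probe_neon_at_run_time() are hypothetical names, and the
probe is a stub standing in for whatever platform-specific detection a real
build would perform.

```c
/* Hypothetical stand-in for a platform-specific run-time probe.  This stub
 * always reports "no Neon" so that the sketch compiles on any platform. */
static int probe_neon_at_run_time(void)
{
  return 0;
}

/* Returns 1 if Neon code paths may be used, 0 otherwise. */
static int neon_usable(void)
{
#if defined(__ARM_NEON__)
  /* The compiler was told to assume Neon (for example, via -mfpu=neon), so
   * this build could never run on a non-Neon CPU.  Skip the run-time check. */
  return 1;
#else
  /* Generic build: the decision has to be made at run time. */
  return probe_neon_at_run_time();
#endif
}
```

The point of the sketch is that the macro test costs nothing at run time,
and the probe is compiled in only when the build could actually land on a
CPU without Neon.
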
With x86 builds, run-time SIMD capability checking is always performed.
Only the SIMD instructions that are supported by the CPU will be used,
so to answer your specific question: yes, SIMD-enabled builds of
libjpeg-turbo will work properly on CPUs that only support MMX
instructions. All x86-64 systems support SSE2 instructions, but x86-64
builds of libjpeg-turbo will perform run-time checking to determine
whether AVX2 instructions can be used.
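
As a rough illustration of that kind of run-time dispatch, here is a
minimal C sketch. It assumes GCC or Clang and uses their
__builtin_cpu_supports() built-in as a stand-in for libjpeg-turbo's own
detection code, which works differently. The enum and both function names
are hypothetical.

```c
/* Hypothetical SIMD levels, ordered from least to most capable. */
enum simd_level { SIMD_NONE, SIMD_MMX, SIMD_SSE2, SIMD_AVX2 };

/* Pick the most capable instruction set that the running CPU reports.
 * On non-x86 targets or other compilers, this sketch reports SIMD_NONE. */
static enum simd_level best_simd_level(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2")) return SIMD_AVX2;
  if (__builtin_cpu_supports("sse2")) return SIMD_SSE2;
  if (__builtin_cpu_supports("mmx")) return SIMD_MMX;
#endif
  return SIMD_NONE;
}

/* Human-readable name for a level, e.g. for logging. */
static const char *simd_level_name(enum simd_level level)
{
  switch (level) {
  case SIMD_AVX2: return "AVX2";
  case SIMD_SSE2: return "SSE2";
  case SIMD_MMX:  return "MMX";
  default:        return "none";
  }
}
```

A caller would typically run best_simd_level() once, cache the result, and
use it to choose between function implementations. The same binary then
works on an MMX-only CPU and on an AVX2-capable one.
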
On 9/11/22 3:28 PM, Berke Yavas wrote:
Hi,
In the i386 architecture and x86_64 architecture, there are different
SIMD support extensions. For example, for the i386 there could be a MMX,
SSE2 or AVX2 support. Afaik, libjpeg-turbo does not check if these
supports are available in the compile time as it does for NEON support.
So how does libjpeg-turbo knows which extension to use? Is this
dispatched at the run-time? Is one of these are enough to run SIMD
extension? Suppose the system only have MMX system. Does SIMD
extension(WITH_SIMD) still works without a problem?
It would be good to add these checks at the compile time for the x64 and
i386 like ARM or MIPS.
I am new to these things. This questions might be stupid, sorry if it is.
Thanks,
Berke