On Thu, 30 Nov 2023 06:39:43 GMT, Xiaohong Gong <xg...@openjdk.org> wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Rename vmath to sleef in configure

This version looks much better, thank you! I guess cflags/SVE_CFLAGS is an 
okay-ish solution.

I'm still not 100% happy, though that might be due to my limited 
understanding. Let me write down a few numbered statements and then you can 
tell me if I'm right or wrong.

1. AArch64 supports two different SIMD instruction set extensions, Neon and 
SVE.
2. A specific instance of an AArch64 CPU can implement Neon, or SVE, or 
neither, but not both.
3. SVE is superior to Neon, and is far more common these days.
4. We would like to ship a single version of libvmath.so that uses SVE if 
it happens to be run on a CPU with SVE.
5. The same version will just use the fallback code that "works" but has lower 
performance if run on a CPU without SVE (regardless of whether it has Neon).
6. If libvmath.so is built without SVE support, and is then run on a CPU with 
SVE, it will "work", but will not utilize the SVE functionality, and so will 
have degraded performance compared to what we want.
7. To be able to build libvmath.so with SVE support, we need to be able to 
compile a simple test program using `#include <arm_sve.h>` and 
`-march=armv8-a+sve`. If this fails, we cannot build libvmath.so with SVE 
support.
8. The ability to build with SVE support should only be dependent on the gcc 
compiler and sysroot header files, and not the SIMD instruction set of the 
build machine CPU.
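
To make statements 7 and 8 concrete, here is a minimal sketch of what such a 
probe might look like. The function names are hypothetical and not taken from 
the patch, and the `#if` guard is an addition so that the file compiles with 
any flags; an actual configure check would presumably include `<arm_sve.h>` 
unconditionally, so that the compile itself fails when SVE is unavailable.

```c
#include <stdint.h>

/* __ARM_FEATURE_SVE is the ACLE macro a compiler defines when SVE code
 * generation is enabled, e.g. by -march=armv8-a+sve. It reflects the
 * compiler flags and headers, not the CPU of the build machine. */
#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Hypothetical helper: 1 if this file was compiled with SVE enabled, else 0. */
int built_with_sve(void) {
#if defined(__ARM_FEATURE_SVE)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: SVE vector length in bytes, or 0 when SVE was not
 * compiled in. When SVE was compiled in, this executes an SVE instruction,
 * so it must only be called on a CPU that actually implements SVE. */
uint64_t sve_vector_bytes(void) {
#if defined(__ARM_FEATURE_SVE)
    return svcntb();
#else
    return 0;
#endif
}
```

Whether `built_with_sve()` returns 1 is decided entirely at compile time, 
which is the point of statement 8.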

If all of these are correct, then I think the problem is that we silently 
ignore a failure to build with SVE. Instead, it should cause configure to fail.

If, for some reason, we must support build environments that cannot build for 
SVE, then we need a configure flag that allows us to require the ability to 
build with SVE, like --enable-sve-support. It would be "auto" by default and 
thus adapt to the platform, but could be set to on, in which case configure 
fails if the platform cannot compile for SVE.
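
The decision logic I have in mind can be sketched as a small pure function. 
This is only a model, not configure code: all names are hypothetical, and the 
explicit "off" value is my assumption (only "auto" and "on" are described 
above). `probe_ok` stands for the result of the SVE compile test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values of the proposed --enable-sve-support flag.
 * SVE_OFF is an assumption; the text above only names "auto" and "on". */
enum sve_request { SVE_AUTO, SVE_ON, SVE_OFF };

/* Possible outcomes of the configure step. */
enum sve_outcome { SVE_BUILD_WITH, SVE_BUILD_WITHOUT, SVE_CONFIGURE_FAIL };

/* probe_ok: whether the SVE test program compiled successfully. */
enum sve_outcome resolve_sve(enum sve_request request, bool probe_ok) {
    switch (request) {
    case SVE_OFF:
        return SVE_BUILD_WITHOUT;
    case SVE_ON:
        /* Explicitly requested: a failed probe must not be ignored. */
        return probe_ok ? SVE_BUILD_WITH : SVE_CONFIGURE_FAIL;
    case SVE_AUTO:
        /* Default: adapt to whatever the toolchain supports. */
        return probe_ok ? SVE_BUILD_WITH : SVE_BUILD_WITHOUT;
    }
    assert(!"unknown sve_request value");
    return SVE_CONFIGURE_FAIL;
}
```

The only case that differs from the current behavior is `SVE_ON` with a failed 
probe, which turns a silent downgrade into a configure error.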

We cannot just silently drop expected functionality depending on the build 
machine, or at the very least, we must have a way to prevent that from 
happening.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1833403493
