On Thu, 30 Nov 2023 06:39:43 GMT, Xiaohong Gong <xg...@openjdk.org> wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Rename vmath to sleef in configure

> Okay, now I found a few more of your comments that I missed before. I 
> apologize, the Github PR review UI can be a bit confusing when discussions 
> are taking place in multiple locations. So, here's a revision to my list 
> above:
> 
> 1. An aach64 CPU can have both Neon and SVE present at the same time.
> 2. You are assuming that Neon is always present, and what I referred to as 
> the fallback case is in fact using Neon instead of SVE.
> 3. You would like to split vect_math.c into two parts, e.g.  vect_math_neon.c 
> and vect_math_sve.c.
> 4. You will then, use heuristics in hotspot to determine at runtime if SVE or 
> Neon functionality should be used.  Even if SVE is present on the runtime 
> machine, heuristics can chose to use the Neon implementation anyway in some 
> cases.
> 5. Only vect_math_sve.c. need the -march+sve.

Yes, all the above are true.

> 6. The neon part do not need the -march+sve flag, and will fail if built with 
> this flag. (???)

Current neon code does not need the `-march=arm-a+sve` flag, but it will not 
fail if we built with it. Because the C compiler (GCC/Clang) can also supports 
SVE and NEON at the same time. My concert is that if we build the neon code 
with sve flags,  it has the possibility that compiler may generate SVE specific 
instructions inside the NEON functions in future (e.g. if sve has new features 
related to method call) , although it doesn't happen now. If so, calling the 
neon stubs (which contains sve instructions) on non-sve supported machines in 
runtime may crash. Hence, it's more safe if we can separate neon and sve code 
and use different flags for them.


> Anyway, it is straightforward to add compiler flags to individual files. You 
> do it like this:
> 
> $(eval $(call SetupJdkLibrary, BUILD_LIBVMATH, \
      NAME := vmath, \
      CFLAGS := $(CFLAGS_JDKLIB) $(LIBSLEEF_CFLAGS) -fvisibility=default, \
      vect_math_sve.c_CFLAGS := $(SVE_CFLAGS), \
 ...

Thanks so much for this. That's very helpful!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835258683

Reply via email to