Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-30 Thread Xiaohong Gong
On Wed, 22 Nov 2023 09:05:31 GMT, Andrew Haley  wrote:

>>> Have you considered the possibility of copying the sleef source to the 
>>> OpenJDK repository and thereby it becomes part of the build process? I 
>>> don't know how straightforward that is technically and IANAL but I think 
>>> it's worth exploring.
>>> 
>> 
>> Hi @PaulSandoz ! Thanks for the suggestion! Copying the sleef source sounds 
>> good. However, I actually have no idea about how to handle the third-party 
>> licence in OpenJDK project. Do you have any idea about this area? Some 
>> suggestions/guidence   from the JDK team will be much helpful. Thanks!
>
>> > Have you considered the possibility of copying the sleef source to the 
>> > OpenJDK repository and thereby it becomes part of the build process? I 
>> > don't know how straightforward that is technically and IANAL but I think 
>> > it's worth exploring.
>> 
>> Hi @PaulSandoz ! Thanks for the suggestion! Copying the sleef source sounds 
>> good. However, I actually have no idea about how to handle the third-party 
>> licence in OpenJDK project. Do you have any idea about this area? Some 
>> suggestions/guidence from the JDK team will be much helpful. Thanks!
> 
> From a legal pespective, we can do this. SLEEF is distributed under Boost 
> Software License Version 1.0., which is a GPL-compatible free software 
> licence. The only issue is whether we want to do so. It would certainly be 
> convenient.

Hi @theRealAph , @magicus ,

The latest commit renamed `vmath` in configure to `sleef`, and the Arm SVE 
cflags is removed to `flags-cflags.m4` now. I was thinking about separating the 
NEON and SVE functions in `libvmath.so` to two source files, and add the SVE 
cflags just for SVE functions as needed. But it seems I have to write the make 
commands manually, which I think is not good? But I can have a try if it's 
necessary.

Since the sve flags is just used for building `libvmath.so` now, moving it to 
`flags-cflags.m4` seems not so right? 

@magicus , may I have your comments on this part? Thanks in advance!

> Don't touch the make system.Instead, try to open the library at runtime with 
> os::dll_open(), after (or inside) VM_Version::initialize().
>
> If you're not running on an SVE system, none of the SVE routines will be 
> called, so it doesn't matter, right?

Yes, the SVE routines are called only on the SVE system. So I think it doesn't 
matter. I tested with the image built on non-sve machine and ran the Vector API 
tests on SVE machine, and it works well. It also works well on the contrary 
situation.

Best Regards,
Xiaohong

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1833278899


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-26 Thread Xiaohong Gong
On Thu, 23 Nov 2023 14:01:48 GMT, Magnus Ihse Bursie  wrote:

>> OK, I see. It makes sense that the suffix name should be choosed mainly 
>> based on the real module name that is searched/checked in configure.
>
> This still needs fixing.

Yes, I will fix this together with removing the SVE cflags which I need more 
time to handle it with a better way. Thanks!

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1405526236


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-23 Thread Magnus Ihse Bursie
On Thu, 23 Nov 2023 01:41:46 GMT, Xiaohong Gong  wrote:

>> As I said above, you should not mix the two together. Keep the library 
>> handling for libsleef. Move the march setting to where it belongs. And 
>> rename the files, functions and variables after this.
>
> OK, I see. It makes sense that the suffix name should be choosed mainly based 
> on the real module name that is searched/checked in configure.

This still needs fixing.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1403428592


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-23 Thread Magnus Ihse Bursie
On Thu, 23 Nov 2023 08:40:57 GMT, Xiaohong Gong  wrote:

>> Thanks for the advice! I will take a consideration for it.
>
>> Thirdly, I do not like at all how you just come crashing in setting -march 
>> like that. The -march flag is handled by FLAGS_SETUP_ABI_PROFILE.
> 
> `-march=armv8-a+sve` is just used in this new added module, which may not 
> expect to be used for other libraries. Per my understanding, flags handled by 
> `FLAGS_SETUP_ABI_PROFILE` is not used for a specified module? 
> 
>> Actually, now that I think of it, this is just completely wrong! You are 
>> checking on features on the build machine, to determine what target machine 
>> code to generate, with no way to override.
> 
> Yes, that's be a risk, although the usage to the SVE functions are controlled 
> by SVE feature as well in runtime. I need time to find a better solution.

It does not matter if you set the -march on just part of the build. Actually, 
there is no point in doing so. Either the JDK is run on a machine with the 
matching architecture, or it isn't. 

I don't know the details of what the aarch64 SVE feature means, but unless this 
is a special instance, any attempt to execute the compiled code on a machine 
that does not support that architecture will fail.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1403416310


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-23 Thread Xiaohong Gong
On Wed, 15 Nov 2023 01:32:00 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains five additional 
> commits since the last revision:
> 
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

Agree that we using the external library as the first step, and then is adding 
the sleef sources in JDK if needed.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1824007585


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-23 Thread Xiaohong Gong
On Thu, 23 Nov 2023 01:28:40 GMT, Xiaohong Gong  wrote:

>> Ah, now I se what you are trying to do here. First of all, in the detection 
>> part, only set `SVE_FEATURE_SUPPORT`. Then you can handle the `SVE_CFLAGS` 
>> addition elsewhere/later. 
>> 
>> Secondly, you should not mix these `SVE_CFLAGS` with the spleef C flags. 
>> Keeping them separate will allow for LIBSLEEF_CFLAGS to be named just that.
>> 
>> Thirdly, I do not like at all how you just come crashing in setting `-march` 
>> like that. The `-march` flag is handled by `FLAGS_SETUP_ABI_PROFILE`. 
>> 
>> Actually, now that I think of it, this is just completely wrong! You are 
>> checking on features on the build machine, to determine what target machine 
>> code to generate, with no way to override. 
>> 
>> You need to break out the -march handling separately. It should be moved to 
>> FLAGS_SETUP_ABI_PROFILE. I'm guessing you will need to make something like a 
>> `aarch64-sve` profile, and possibly try to auto-select it based on the 
>> result of the sve test program above. But changing 
>> `OPENJDK_TARGET_ABI_PROFILE` can have further consequences; I do not know 
>> the full extent on the top of my head.
>
> Thanks for the advice! I will take a consideration for it.

> Thirdly, I do not like at all how you just come crashing in setting -march 
> like that. The -march flag is handled by FLAGS_SETUP_ABI_PROFILE.

`-march=armv8-a+sve` is just used in this new added module, which may not 
expect to be used for other libraries. Per my understanding, flags handled by 
`FLAGS_SETUP_ABI_PROFILE` is not used for a specified module? 

> Actually, now that I think of it, this is just completely wrong! You are 
> checking on features on the build machine, to determine what target machine 
> code to generate, with no way to override.

Yes, that's be a risk, although the usage to the SVE functions are controlled 
by SVE feature as well in runtime. I need time to find a better solution.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1403052964


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-22 Thread Xiaohong Gong
On Tue, 21 Nov 2023 14:13:19 GMT, Magnus Ihse Bursie  wrote:

>> Yes, it seems weird. But the library we want to built out is `libvmath.so` 
>> instead of `libsleef.so`. And we not only check the sleef library, but also 
>> the ARM SVE feature inside it. So using `VMATH` suffix is more reasonable to 
>> me. WDYT?
>
> As I said above, you should not mix the two together. Keep the library 
> handling for libsleef. Move the march setting to where it belongs. And rename 
> the files, functions and variables after this.

OK, I see. It makes sense that the suffix name should be choosed mainly based 
on the real module name that is searched/checked in configure.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1402847627


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-22 Thread Xiaohong Gong
On Tue, 21 Nov 2023 14:12:13 GMT, Magnus Ihse Bursie  wrote:

>> This is just used to print the result of `AC_MSG_CEHCKING[if ARM SVE feature 
>> is supported]` in configure.
>
> Ah, now I se what you are trying to do here. First of all, in the detection 
> part, only set `SVE_FEATURE_SUPPORT`. Then you can handle the `SVE_CFLAGS` 
> addition elsewhere/later. 
> 
> Secondly, you should not mix these `SVE_CFLAGS` with the spleef C flags. 
> Keeping them separate will allow for LIBSLEEF_CFLAGS to be named just that.
> 
> Thirdly, I do not like at all how you just come crashing in setting `-march` 
> like that. The `-march` flag is handled by `FLAGS_SETUP_ABI_PROFILE`. 
> 
> Actually, now that I think of it, this is just completely wrong! You are 
> checking on features on the build machine, to determine what target machine 
> code to generate, with no way to override. 
> 
> You need to break out the -march handling separately. It should be moved to 
> FLAGS_SETUP_ABI_PROFILE. I'm guessing you will need to make something like a 
> `aarch64-sve` profile, and possibly try to auto-select it based on the result 
> of the sve test program above. But changing `OPENJDK_TARGET_ABI_PROFILE` can 
> have further consequences; I do not know the full extent on the top of my 
> head.

Thanks for the advice! I will take a consideration for it.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1402843737


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-22 Thread Magnus Ihse Bursie
On Wed, 15 Nov 2023 01:32:00 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains five additional 
> commits since the last revision:
> 
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

I think it can be a good idea to start with using an external library, as is 
done in this PR. For all our other bundled libraries, we also always support 
the option of building with the "system" library instead of the bundled one, so 
this code will still be required.

If we should push this PR first, and then add the source in a separate PR, or 
if we should bake everything together (both external library and bundled 
sources) into a single PR, I cannot say. But I expect it is easier to take it 
piece-wise.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1823387045


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-22 Thread Paul Sandoz
On Wed, 22 Nov 2023 09:05:31 GMT, Andrew Haley  wrote:

> > Have you considered the possibility of copying the sleef source to the 
> > OpenJDK repository and thereby it becomes part of the build process? I 
> > don't know how straightforward that is technically and IANAL but I think 
> > it's worth exploring.
> 
> Hi @PaulSandoz ! Thanks for the suggestion! Copying the sleef source sounds 
> good. However, I actually have no idea about how to handle the third-party 
> licence in OpenJDK project. Do you have any idea about this area? Some 
> suggestions/guidence from the JDK team will be much helpful. Thanks!

We (Oracle Java Platform Group) can handle the required "paperwork" on any 
third party dependencies and attribution of copyright before any PR can be 
integrated, if you can help detail what those are. First i think we need to 
determine if this is feasible e.g., copying a subset and integrating it into 
the build system, since it does not make sense to bring in the support for quad 
floats and DFT, which IIUC brings in a dependency on compiler support for 
OpenMP.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1823335443


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-22 Thread Andrew Haley
On Wed, 22 Nov 2023 01:52:51 GMT, Xiaohong Gong  wrote:

> > Have you considered the possibility of copying the sleef source to the 
> > OpenJDK repository and thereby it becomes part of the build process? I 
> > don't know how straightforward that is technically and IANAL but I think 
> > it's worth exploring.
> 
> Hi @PaulSandoz ! Thanks for the suggestion! Copying the sleef source sounds 
> good. However, I actually have no idea about how to handle the third-party 
> licence in OpenJDK project. Do you have any idea about this area? Some 
> suggestions/guidence from the JDK team will be much helpful. Thanks!

>From a legal pespective, we can do this. SLEEF is distributed under Boost 
>Software License Version 1.0., which is a GPL-compatible free software 
>licence. The only issue is whether we want to do so. It would certainly be 
>convenient.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1822364670


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-21 Thread Xiaohong Gong
On Wed, 22 Nov 2023 00:05:26 GMT, Paul Sandoz  wrote:

> Have you considered the possibility of copying the sleef source to the 
> OpenJDK repository and thereby it becomes part of the build process? I don't 
> know how straightforward that is technically and IANAL but I think it's worth 
> exploring.
> 

Hi @PaulSandoz ! Thanks for the suggestion! Copying the sleef source sounds 
good. However, I actually have no idea about how to handle the third-party 
licence in OpenJDK project. Do you have any idea about this area? Some 
suggestions/guidence   from the JDK team will be much helpful. Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1821965636


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-21 Thread Xiaohong Gong
On Tue, 21 Nov 2023 18:14:41 GMT, Paul Sandoz  wrote:

> > This looks good. As far as I can tell the choice you've made of accuracy 
> > matches what we need to meet the spec.
> 
> Same here . Sinh/cosh/tanh/expm1 are specified to be within 2.5 ulps of the 
> exact result, but i believe sleef does not offer that option, it's either 
> within 1 or 3.5, so we have to pick the former. AFAICT sleef does not say 
> anything about monotonicity, but we relax semi-monotonicity for all but sqrt 
> (we defer to IEEE 754).

Yes, that's why we at least have to use the 1up in sleef here.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1821956522


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-21 Thread Paul Sandoz
On Wed, 15 Nov 2023 01:32:00 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains five additional 
> commits since the last revision:
> 
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

Have you considered the possibility of copying the sleef source to the OpenJDK 
repository and thereby it becomes part of the build process? I don't know how 
straightforward that is technically and IANAL but I think it's worth exploring.

Also it may enable us to use sleef for other platforms where we have gaps 
(looking at Table 1.1 of https://sleef.org/). Further out it should inspire us 
to do a Java Vector API port to as indicated in a prior comment.  

> Yes, libsleef is used to build the binary if found. And at runtime, if the 
> libsleef with right version is not found, `dlopen` to the libvmath.so will 
> fail. And then all the operations will be fall-back to the java default 
> implementation. X86_64 has also bundled the Intel's SVML binary to jdk image 
> at build time. And we use the same way loading/opening the library at runtime.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1821885433


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-21 Thread Paul Sandoz
On Mon, 23 Oct 2023 09:02:35 GMT, Xiaohong Gong  wrote:

> This looks good. As far as I can tell the choice you've made of accuracy 
> matches what we need to meet the spec. 

Same here . Sinh/cosh/tanh/expm1 are specified to be within 2.5 ulps of the 
exact result, but i believe sleef does not offer that option, it's either 
within 1 or 3.5, so we have to pick the former. AFAICT sleef does not say 
anything about monotonicity, but we relax semi-monotonicity for all but sqrt 
(we defer to IEEE 754).

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1821422985


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-21 Thread Magnus Ihse Bursie
On Mon, 20 Nov 2023 01:47:34 GMT, Xiaohong Gong  wrote:

>> make/autoconf/lib-vmath.m4 line 92:
>> 
>>> 90: []
>>> 91: )
>>> 92: AC_MSG_RESULT([${SVE_FEATURE_SUPPORT}])
>> 
>> What is this test even for? I can't see any usage of SVE_FEATURE_SUPPORT 
>> outside this function.
>
> This is just used to print the result of `AC_MSG_CEHCKING[if ARM SVE feature 
> is supported]` in configure.

Ah, now I se what you are trying to do here. First of all, in the detection 
part, only set `SVE_FEATURE_SUPPORT`. Then you can handle the `SVE_CFLAGS` 
addition elsewhere/later. 

Secondly, you should not mix these `SVE_CFLAGS` with the spleef C flags. 
Keeping them separate will allow for LIBSLEEF_CFLAGS to be named just that.

Thirdly, I do not like at all how you just come crashing in setting `-march` 
like that. The `-march` flag is handled by `FLAGS_SETUP_ABI_PROFILE`. 

Actually, now that I think of it, this is just completely wrong! You are 
checking on features on the build machine, to determine what target machine 
code to generate, with no way to override. 

You need to break out the -march handling separately. It should be moved to 
FLAGS_SETUP_ABI_PROFILE. I'm guessing you will need to make something like a 
`aarch64-sve` profile, and possibly try to auto-select it based on the result 
of the sve test program above. But changing `OPENJDK_TARGET_ABI_PROFILE` can 
have further consequences; I do not know the full extent on the top of my head.

>> make/autoconf/libraries.m4 line 129:
>> 
>>> 127:   LIB_SETUP_LIBFFI
>>> 128:   LIB_SETUP_MISC_LIBS
>>> 129:   LIB_SETUP_VMATH
>> 
>> The function (and file) should be named after "sleef", not "vmath".
>
> Yes, it seems weird. But the library we want to built out is `libvmath.so` 
> instead of `libsleef.so`. And we not only check the sleef library, but also 
> the ARM SVE feature inside it. So using `VMATH` suffix is more reasonable to 
> me. WDYT?

As I said above, you should not mix the two together. Keep the library handling 
for libsleef. Move the march setting to where it belongs. And rename the files, 
functions and variables after this.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1400663476
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1400665010


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-19 Thread Xiaohong Gong
On Thu, 16 Nov 2023 13:00:03 GMT, Magnus Ihse Bursie  wrote:

>> Xiaohong Gong has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains five additional 
>> commits since the last revision:
>> 
>>  - Add a bundled native lib in jdk as a bridge to libsleef
>>  - Merge 'jdk:master' into JDK-8312425
>>  - Disable sleef by default
>>  - Merge 'jdk:master' into JDK-8312425
>>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF
>
> doc/building.md line 552:
> 
>> 550: 
>> 551: libsleef, the [SIMD Library for Evaluating Elementary Functions](
>> 552: https://sleef.org/) is required when building libvmath.so on 
>> Linux+AArch64
> 
> The conventional way we have refered to os/cpu combinations in the build 
> documentation is like this: `Linux/aarch64`.
> 
> I also think you need to expand a bit that this is optional, and if you do 
> not provide libsleef, the build will succeed but without the vector 
> performance enhancements provided by libvmath.

Thanks for the review! This sounds good to me. I will add it.

> make/autoconf/lib-vmath.m4 line 49:
> 
>> 47:  test -e ${with_libsleef}/include/sleef.h; then
>> 48: LIBSLEEF_FOUND=yes
>> 49: LIBVMATH_LIBS="-L${with_libsleef}/lib"
> 
> This should be LIBSLEEF_LIBS and ...CFLAGS.

Seems as above. The target library is `libvmath.so`, and the cflags/libs are 
used for building it instead of `libsleef.so`.

> make/autoconf/lib-vmath.m4 line 92:
> 
>> 90: []
>> 91: )
>> 92: AC_MSG_RESULT([${SVE_FEATURE_SUPPORT}])
> 
> What is this test even for? I can't see any usage of SVE_FEATURE_SUPPORT 
> outside this function.

This is just used to print the result of `AC_MSG_CEHCKING[if ARM SVE feature is 
supported]` in configure.

> make/autoconf/lib-vmath.m4 line 102:
> 
>> 100:   fi
>> 101: 
>> 102:   AC_SUBST(LIBSLEEF_FOUND)
> 
> Do not export LIBSLEEF_FOUND. It is okay to use internally here, but you 
> should instead export ENABLE_LIBSLEEF, using true/false (instead of yes/no). 
> This is the way we handle all other optional components.

Make sense to me. Thanks for the comment!

> make/autoconf/libraries.m4 line 129:
> 
>> 127:   LIB_SETUP_LIBFFI
>> 128:   LIB_SETUP_MISC_LIBS
>> 129:   LIB_SETUP_VMATH
> 
> The function (and file) should be named after "sleef", not "vmath".

Yes, it seems weird. But the library we want to built out is `libvmath.so` 
instead of `libsleef.so`. And we not only check the sleef library, but also the 
ARM SVE feature inside it. So using `VMATH` suffix is more reasonable to me. 
WDYT?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1398574834
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1398573871
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1398571212
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1398575161
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1398572906


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-16 Thread Magnus Ihse Bursie
On Wed, 15 Nov 2023 01:32:00 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains five additional 
> commits since the last revision:
> 
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

There's a lot to work with regarding the build system changes here...

make/autoconf/lib-vmath.m4 line 39:

> 37:   LIBVMATH_CFLAGS=
> 38:   LIBVMATH_LIBS=
> 39: 

There are multiple issues with this function. Please have a look at how other 
libraries are handled. Some remarks:
1) You always need to pair AC_MSG_CHECKING and AC_MSG_RESULT. Do not make any 
operations in between that can cause output.
2) If the user runs just --with-libsleef, the value will be "yes". You need to 
treat this, not as a path, but as a request to enable the library using default 
methods (like pkg-check or well known locations).

make/autoconf/lib-vmath.m4 line 49:

> 47:  test -e ${with_libsleef}/include/sleef.h; then
> 48: LIBSLEEF_FOUND=yes
> 49: LIBVMATH_LIBS="-L${with_libsleef}/lib"

This should be LIBSLEEF_LIBS and ...CFLAGS.

make/autoconf/lib-vmath.m4 line 92:

> 90: []
> 91: )
> 92: AC_MSG_RESULT([${SVE_FEATURE_SUPPORT}])

What is this test even for? I can't see any usage of SVE_FEATURE_SUPPORT 
outside this function.

make/autoconf/libraries.m4 line 129:

> 127:   LIB_SETUP_LIBFFI
> 128:   LIB_SETUP_MISC_LIBS
> 129:   LIB_SETUP_VMATH

The function (and file) should be named after "sleef", not "vmath".

-

Changes requested by ihse (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16234#pullrequestreview-1734359314
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1395684054
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1395687129
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1395684964
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1395686104


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-16 Thread Magnus Ihse Bursie
On Wed, 15 Nov 2023 01:32:00 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains five additional 
> commits since the last revision:
> 
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

doc/building.md line 552:

> 550: 
> 551: libsleef, the [SIMD Library for Evaluating Elementary Functions](
> 552: https://sleef.org/) is required when building libvmath.so on 
> Linux+AArch64

The conventional way we have refered to os/cpu combinations in the build 
documentation is like this: `Linux/aarch64`.

I also think you need to expand a bit that this is optional, and if you do not 
provide libsleef, the build will succeed but without the vector performance 
enhancements provided by libvmath.

make/autoconf/lib-vmath.m4 line 102:

> 100:   fi
> 101: 
> 102:   AC_SUBST(LIBSLEEF_FOUND)

Do not export LIBSLEEF_FOUND. It is okay to use internally here, but you should 
instead export ENABLE_LIBSLEEF, using true/false (instead of yes/no). This is 
the way we handle all other optional components.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1395657610
PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1395659965


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-16 Thread Magnus Ihse Bursie
On Wed, 15 Nov 2023 01:32:00 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains five additional 
> commits since the last revision:
> 
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

doc/building.md line 549:

> 547: files.
> 548: 
> 549: ### libsleef

You will need to regenerate building.html as well. `make update-build-docs` 
using pandoc 2.19.2.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1395653735


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-15 Thread Xiaohong Gong
On Thu, 16 Nov 2023 05:33:13 GMT, David Holmes  wrote:

>> Xiaohong Gong has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains five additional 
>> commits since the last revision:
>> 
>>  - Add a bundled native lib in jdk as a bridge to libsleef
>>  - Merge 'jdk:master' into JDK-8312425
>>  - Disable sleef by default
>>  - Merge 'jdk:master' into JDK-8312425
>>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF
>
> So to be clear, now you have to opt-in to using `libsleef` by building a 
> binary that will use it. That binary will always use `libsleef` if found, so 
> there is no way to opt-out at runtime. Is that the way it works on x86_64 too?

Thanks for adding the labels @dholmes-ora !

> So to be clear, now you have to opt-in to using libsleef by building a binary 
> that will use it. That binary will always use libsleef if found, so there is 
> no way to opt-out at runtime. Is that the way it works on x86_64 too?

Yes, libsleef is used to build the binary if found. And at runtime, if the 
libsleef with right version is not found, `dlopen` to the libvmath.so will 
fail. And then all the operations will be fall-back to the java default 
implementation. X86_64 has also bundled the Intel's SVML binary to jdk image at 
build time. And we use the same way loading/opening the library at runtime.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1813916971


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-15 Thread David Holmes
On Wed, 15 Nov 2023 01:32:00 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains five additional 
> commits since the last revision:
> 
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

So to be clear, now you have to opt-in to using `libsleef` by building a binary 
that will use it. That binary will always use `libsleef` if found, so there is 
no way to opt-out at runtime. Is that the way it works on x86_64 too?

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1813816824


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-15 Thread David Holmes
On Wed, 15 Nov 2023 01:32:00 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains five additional 
> commits since the last revision:
> 
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

Really this should go to panama-dev but core-libs will have to do.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1813811597