Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2024-02-06 Thread Ludovic Henry
On Tue, 6 Feb 2024 08:20:39 GMT, Magnus Ihse Bursie  wrote:

> I'd just hate to see all this work go to waste.

Same here!

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1929780538


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2024-02-06 Thread Magnus Ihse Bursie
On Mon, 11 Dec 2023 18:25:09 GMT, Ludovic Henry  wrote:

>> @theRealAph 
>>> Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF 
>>> support?
>> 
>> I can't say anything for sure, but I picked up some positive vibes from our 
>> internal chat. I think the idea was that libsleef could potentially cover up 
>> vector math for all platforms that the current Intel lib solution is missing 
>> (basically, everything but linux+windows x64). So I this can be seen as a 
>> bit of a trial balloon if it is worth a more complete integration of 
>> libsleef in the JDK.
>> 
>> And I can't say anything at all about what Oracle JDKs are going to release 
>> with, but I am fully open towards adding libsleef to the GHA builds. And I'm 
>> guessing that Adoptium has no reason not to include libsleef, but then 
>> again, I cannot of course speak for them.
>
>> I can't say anything for sure, but I picked up some positive vibes from our 
>> internal chat. I think the idea was that libsleef could potentially cover up 
>> vector math for all platforms that the current Intel lib solution is missing 
>> (basically, everything but linux+windows x64). So I this can be seen as a 
>> bit of a trial balloon if it is worth a more complete integration of 
>> libsleef in the JDK.
> 
> I can add that we are interested to use that for Linux + RISC-V support given 
> the RISC-V support was recently merged into sleef upstream. 
> https://github.com/shibatch/sleef/pull/477

@luhenry Maybe you are interested in helping with bringing this forward? I can 
assist with risc-v related fixes in the makefiles. I'd just hate to see all 
this work go to waste.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1928995458


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-11 Thread Ludovic Henry
On Mon, 4 Dec 2023 11:58:55 GMT, Magnus Ihse Bursie  wrote:

> I can't say anything for sure, but I picked up some positive vibes from our 
> internal chat. I think the idea was that libsleef could potentially cover up 
> vector math for all platforms that the current Intel lib solution is missing 
> (basically, everything but linux+windows x64). So I this can be seen as a bit 
> of a trial balloon if it is worth a more complete integration of libsleef in 
> the JDK.

I can add that we are interested to use that for Linux + RISC-V support given 
the RISC-V support was recently merged into sleef upstream. 
https://github.com/shibatch/sleef/pull/477

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1850636984


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-06 Thread Xiaohong Gong
On Fri, 1 Dec 2023 16:26:02 GMT, Magnus Ihse Bursie  wrote:

>> Xiaohong Gong has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains ten additional 
>> commits since the last revision:
>> 
>>  - Separate neon and sve functions into two source files
>>  - Merge branch 'jdk:master' into JDK-8312425
>>  - Rename vmath to sleef in configure
>>  - Address review comments in build system
>>  - Add a bundled native lib in jdk as a bridge to libsleef
>>  - Merge 'jdk:master' into JDK-8312425
>>  - Disable sleef by default
>>  - Merge 'jdk:master' into JDK-8312425
>>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF
>
> doc/building.md line 639:
> 
>> 637: 
>> 638: libsleef, the [SIMD Library for Evaluating Elementary Functions](
>> 639: https://sleef.org/) is required when building libvmath.so on 
>> Linux/aarch64
> 
> This is incorrect. The library is not required, but if it is present, we will 
> build libvmath with it.
> 
> Edit: Or rather, this is misleading. Technically it is correct, since you 
> state that it is required when building libvmath.so, but it is easy to 
> mistake for being required for building the JDK. The reader presumably does 
> not know what libvmath.so is or how it is used.
> 
> Please rephrase this to so that it is clear that this is optional, but will 
> provide performance benefits to the resulting JDK if present. You do not need 
> to mention libvmath.so here, for no other dependency do we declare what parts 
> of the JDK that require it -- it is not essential for this document.
> 
> Also see if you can make this paragraph and the one at the end be a bit more 
> tighter, not the last paragraph seems to be both repeat and contradict this 
> one.

Hi @magicus , the doc is updated. Thanks for your comment on this!

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1416964362


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-05 Thread Xiaohong Gong
On Tue, 5 Dec 2023 13:00:04 GMT, Magnus Ihse Bursie  wrote:

> So you need to check both the flag and the header file? Oh well, then this is 
> probably as good as it gets.

Yes, we have to check both the flag and the header file.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1841926432


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-05 Thread Magnus Ihse Bursie
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains ten additional 
> commits since the last revision:
> 
>  - Separate neon and sve functions into two source files
>  - Merge branch 'jdk:master' into JDK-8312425
>  - Rename vmath to sleef in configure
>  - Address review comments in build system
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

So you need to check both the flag and the header file? Oh well, then this is 
probably as good as it gets.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1840748314


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-04 Thread Xiaohong Gong
On Mon, 4 Dec 2023 08:31:17 GMT, Xiaohong Gong  wrote:

>> The final thing we need to resolve properly is the SVE compiler test. 
>> 
>> @theRealAph says:
>>> arm_sve.h is part of GCC. It was added to GCC in 2019.
>> 
>> A more relevant question is what version of gcc it was added, and if that 
>> also implies that the compiler knows about `-march=armv8-a+sve`. If so, then 
>> this test could basically be framed as a gcc version check.
>> 
>> I'm still leaning towards failing configure if the SVE code cannot be 
>> compiled. Under what circumstances can this test possibly fail, so 
>> SVE_CFLAGS would not be set?
>
>> The final thing we need to resolve properly is the SVE compiler test.
>> 
>> @theRealAph says:
>> 
>> > arm_sve.h is part of GCC. It was added to GCC in 2019.
>> 
>> A more relevant question is what version of gcc it was added, and if that 
>> also implies that the compiler knows about `-march=armv8-a+sve`. If so, then 
>> this test could basically be framed as a gcc version check.
>> 
>> I'm still leaning towards failing configure if the SVE code cannot be 
>> compiled. Under what circumstances can this test possibly fail, so 
>> SVE_CFLAGS would not be set?
> 
> Yes, the SVE compiler test code could be treated as a gcc/clang version 
> check. `arm_sve.h` which is included in `sleef.h` and then in 
> `vect_math_sve.c` is the SVE ACLE (Arm C Language Extensions) header file. It 
> was included in gcc start from version 10 (may not be exact, but gcc 8/9 
> would fail when compile c code including this header). We have to make sure 
> the compiler supports the SVE ACLE before using it.  Here are the different 
> scenarios:
> 
> 1. The SVE compiler test success, and `SVE_CFLAGS` is set to 
> `-march=armv8-a+sve`. All symbols in `libvmath.so` are built successfully 
> including NEON/SVE. Hence, the vector math operations with all kinds of 
> vector size on both NEON/SVE machines will be improved as expected.
> 2. The SVE compiler test fail, and `SVE_CFLAGS` is null. SVE symbols in 
> `libvmath.so` cannot be built out. Only NEON symbols exist in `libvmath.so`. 
> Hence, the enhancement for vector math operations with > 128-bit vector size 
> on SVE machines are missing.

> @XiaohongGong If we are sure that the SVE test will always succeed when 
> running on gcc 10 or higher, then I guess I don't really need a way to 
> enforce SVE support -- you'll just have to make sure you use a recent enough 
> gcc.
> 
> But, then the entire test becomes a bit unnecessary. You can just replace it 
> with a version check on gcc, or perhaps a FLAGS_COMPILER_CHECK_ARGUMENTS on 
> `-march=armv8-a+sve`.

Thanks for the suggestion @magicus ! Replacing with a version check for the c 
compiler seems fine. But I cannot see the advantange than current test. Here 
are the reasons:
1. `-march=armv8-a+sve` is the necessary cflag for the sve source, and only 
included start from some c compiler versions. The c compiler version check must 
happen before using it. So it should also happen in the make or configure 
stage? Hence, we still have to find a right place to check it (should be in 
`lib-sleef.m4` or otherwhere?).
2. We not only have to check the gcc version, but also have to check the clang 
version. Would this make the code more complex?

Regarding to using `FLAGS_COMPILER_CHECK_ARGUMENTS on "-march=armv8-a+sve"`, it 
is not right as well. Because we have to make sure the c compiler supports SVE 
ACLE completely which contains the sve header `arm_sve.h`. The compiler that 
supports option `-march=armv8-a+sve` cannot make sure the SVE ACLE is supported 
as well.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1839875881


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-04 Thread Magnus Ihse Bursie
On Mon, 4 Dec 2023 08:31:17 GMT, Xiaohong Gong  wrote:

>> The final thing we need to resolve properly is the SVE compiler test. 
>> 
>> @theRealAph says:
>>> arm_sve.h is part of GCC. It was added to GCC in 2019.
>> 
>> A more relevant question is what version of gcc it was added, and if that 
>> also implies that the compiler knows about `-march=armv8-a+sve`. If so, then 
>> this test could basically be framed as a gcc version check.
>> 
>> I'm still leaning towards failing configure if the SVE code cannot be 
>> compiled. Under what circumstances can this test possibly fail, so 
>> SVE_CFLAGS would not be set?
>
>> The final thing we need to resolve properly is the SVE compiler test.
>> 
>> @theRealAph says:
>> 
>> > arm_sve.h is part of GCC. It was added to GCC in 2019.
>> 
>> A more relevant question is what version of gcc it was added, and if that 
>> also implies that the compiler knows about `-march=armv8-a+sve`. If so, then 
>> this test could basically be framed as a gcc version check.
>> 
>> I'm still leaning towards failing configure if the SVE code cannot be 
>> compiled. Under what circumstances can this test possibly fail, so 
>> SVE_CFLAGS would not be set?
> 
> Yes, the SVE compiler test code could be treated as a gcc/clang version 
> check. `arm_sve.h` which is included in `sleef.h` and then in 
> `vect_math_sve.c` is the SVE ACLE (Arm C Language Extensions) header file. It 
> was included in gcc start from version 10 (may not be exact, but gcc 8/9 
> would fail when compile c code including this header). We have to make sure 
> the compiler supports the SVE ACLE before using it.  Here are the different 
> scenarios:
> 
> 1. The SVE compiler test success, and `SVE_CFLAGS` is set to 
> `-march=armv8-a+sve`. All symbols in `libvmath.so` are built successfully 
> including NEON/SVE. Hence, the vector math operations with all kinds of 
> vector size on both NEON/SVE machines will be improved as expected.
> 2. The SVE compiler test fail, and `SVE_CFLAGS` is null. SVE symbols in 
> `libvmath.so` cannot be built out. Only NEON symbols exist in `libvmath.so`. 
> Hence, the enhancement for vector math operations with > 128-bit vector size 
> on SVE machines are missing.

@XiaohongGong If we are sure that the SVE test will always succeed when running 
on gcc 10 or higher, then I guess I don't really need a way to enforce SVE 
support -- you'll just have to make sure you use a recent enough gcc.

But, then the entire test becomes a bit unnecessary. You can just replace it 
with a version check on gcc, or perhaps a FLAGS_COMPILER_CHECK_ARGUMENTS on 
`-march=armv8-a+sve`.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1838495553


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-04 Thread Magnus Ihse Bursie
On Fri, 1 Dec 2023 16:49:28 GMT, Andrew Haley  wrote:

>> Oh, and:
>> 
>> If we can't trust SLEEF not to break the ABI we're using, we should not be 
>> using SLEEF.
>
>> @theRealAph You are making good points.
>> 
>> You are basically saying: "we don't need any configure support for libsleef, 
>> we can just hard-code the names and dispatch to them directly to a dlopened 
>> library at runtime".
> 
> Yep.
> 
>> That is technically correct, but I'd still like to argue that the current 
>> setup have some merits:
>> 
>> * It will check that there is no typo in the function names. I agree 
>> that the likelihood of getting this wrong is low, but it is still a good 
>> practice to use official include files to have the compiler help checking 
>> this.
>> 
>> * If we would like to bundle libsleef.so with the JVM, now we have the 
>> possibility do do so easily. (Especially if it is like you say that it is 
>> not commonly installed). (If licenses allow etc)
>> 
>> * If we want to incorporate/bundle the source code of libsleef into 
>> OpenJDK, and build it as part of our internal library, we will have a good 
>> starting position, compared to starting from a hard-coded assembly file in 
>> hotspot. (I thought I heard some noise about this prospect).
>> 
>> 
>> So at this point, I am okay with the general approach of this PR. There are 
>> still some build issues to sort out, though, I'll address them separately.
> 
> I see, OK. The question in my mind is whether the common builds of OpenJDK 
> (Oracle, Adoptium, etc.) will support running with SLEEF. If by default SLEEF 
> is not required, support won't be built, and (to an nth order approximation) 
> no one will use it. But I guess it's better than nothing.
> 
> Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF 
> support?

@theRealAph 
> Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF 
> support?

I can't say anything for sure, but I picked up some positive vibes from our 
internal chat. I think the idea was that libsleef could potentially cover up 
vector math for all platforms that the current Intel lib solution is missing 
(basically, everything but linux+windows x64). So I this can be seen as a bit 
of a trial balloon if it is worth a more complete integration of libsleef in 
the JDK.

And I can't say anything at all about what Oracle JDKs are going to release 
with, but I am fully open towards adding libsleef to the GHA builds. And I'm 
guessing that Adoptium has no reason not to include libsleef, but then again, I 
cannot of course speak for them.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1838489588


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-04 Thread Xiaohong Gong
On Fri, 1 Dec 2023 16:45:49 GMT, Magnus Ihse Bursie  wrote:

> The final thing we need to resolve properly is the SVE compiler test.
> 
> @theRealAph says:
> 
> > arm_sve.h is part of GCC. It was added to GCC in 2019.
> 
> A more relevant question is what version of gcc it was added, and if that 
> also implies that the compiler knows about `-march=armv8-a+sve`. If so, then 
> this test could basically be framed as a gcc version check.
> 
> I'm still leaning towards failing configure if the SVE code cannot be 
> compiled. Under what circumstances can this test possibly fail, so SVE_CFLAGS 
> would not be set?

Yes, the SVE compiler test code could be treated as a gcc/clang version check. 
`arm_sve.h` which is included in `sleef.h` and then in `vect_math_sve.c` is the 
SVE ACLE (Arm C Language Extensions) header file. It was included in gcc start 
from version 10 (may not be exact, but gcc 8/9 would fail when compile c code 
including this header). We have to make sure the compiler supports the SVE ACLE 
before using it.  Here are the different scenarios:

1. The SVE compiler test success, and `SVE_CFLAGS` is set to 
`-march=armv8-a+sve`. All symbols in `libvmath.so` are built successfully 
including NEON/SVE. Hence, the vector math operations with all kinds of vector 
size on both NEON/SVE machines will be improved as expected.
2. The SVE compiler test fail, and `SVE_CFLAGS` is null. SVE symbols in 
`libvmath.so` cannot be built out. Only NEON symbols exist in `libvmath.so`. 
Hence, the enhancement for vector math operations with > 128-bit vector size on 
SVE machines are missing.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1838061502


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Andrew Haley
On Fri, 1 Dec 2023 10:19:01 GMT, Andrew Haley  wrote:

>> Xiaohong Gong has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains ten additional 
>> commits since the last revision:
>> 
>>  - Separate neon and sve functions into two source files
>>  - Merge branch 'jdk:master' into JDK-8312425
>>  - Rename vmath to sleef in configure
>>  - Address review comments in build system
>>  - Add a bundled native lib in jdk as a bridge to libsleef
>>  - Merge 'jdk:master' into JDK-8312425
>>  - Disable sleef by default
>>  - Merge 'jdk:master' into JDK-8312425
>>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF
>
> Oh, and:
> 
> If we can't trust SLEEF not to break the ABI we're using, we should not be 
> using SLEEF.

> @theRealAph You are making good points.
> 
> You are basically saying: "we don't need any configure support for libsleef, 
> we can just hard-code the names and dispatch to them directly to a dlopened 
> library at runtime".

Yep.

> That is technically correct, but I'd still like to argue that the current 
> setup have some merits:
> 
> * It will check that there is no typo in the function names. I agree that 
> the likelihood of getting this wrong is low, but it is still a good practice 
> to use official include files to have the compiler help checking this.
> 
> * If we would like to bundle libsleef.so with the JVM, now we have the 
> possibility do do so easily. (Especially if it is like you say that it is not 
> commonly installed). (If licenses allow etc)
> 
> * If we want to incorporate/bundle the source code of libsleef into 
> OpenJDK, and build it as part of our internal library, we will have a good 
> starting position, compared to starting from a hard-coded assembly file in 
> hotspot. (I thought I heard some noise about this prospect).
> 
> 
> So at this point, I am okay with the general approach of this PR. There are 
> still some build issues to sort out, though, I'll address them separately.

I see, OK. The question in my mind is whether the common builds of OpenJDK 
(Oracle, Adoptium, etc.) will support running with SLEEF. If by default SLEEF 
is not required, support won't be built, and (to an nth order approximation) no 
one will use it. But I guess it's better than nothing.

Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF 
support?

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1836449876


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Magnus Ihse Bursie
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains ten additional 
> commits since the last revision:
> 
>  - Separate neon and sve functions into two source files
>  - Merge branch 'jdk:master' into JDK-8312425
>  - Rename vmath to sleef in configure
>  - Address review comments in build system
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

The final thing we need to resolve properly is the SVE compiler test. 

@theRealAph says:
> arm_sve.h is part of GCC. It was added to GCC in 2019.

A more relevant question is what version of gcc it was added, and if that also 
implies that the compiler knows about `-march=armv8-a+sve`. If so, then this 
test could basically be framed as a gcc version check.

I'm still leaning towards failing configure if the SVE code cannot be compiled. 
Under what circumstances can this test possibly fail, so SVE_CFLAGS would not 
be set?

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1836444674


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Magnus Ihse Bursie
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains ten additional 
> commits since the last revision:
> 
>  - Separate neon and sve functions into two source files
>  - Merge branch 'jdk:master' into JDK-8312425
>  - Rename vmath to sleef in configure
>  - Address review comments in build system
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

doc/building.md line 639:

> 637: 
> 638: libsleef, the [SIMD Library for Evaluating Elementary Functions](
> 639: https://sleef.org/) is required when building libvmath.so on 
> Linux/aarch64

This is incorrect. The library is not required, but if it is present, we will 
build libvmath with it.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1412330808


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Magnus Ihse Bursie
On Fri, 1 Dec 2023 10:19:01 GMT, Andrew Haley  wrote:

>> Xiaohong Gong has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains ten additional 
>> commits since the last revision:
>> 
>>  - Separate neon and sve functions into two source files
>>  - Merge branch 'jdk:master' into JDK-8312425
>>  - Rename vmath to sleef in configure
>>  - Address review comments in build system
>>  - Add a bundled native lib in jdk as a bridge to libsleef
>>  - Merge 'jdk:master' into JDK-8312425
>>  - Disable sleef by default
>>  - Merge 'jdk:master' into JDK-8312425
>>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF
>
> Oh, and:
> 
> If we can't trust SLEEF not to break the ABI we're using, we should not be 
> using SLEEF.

@theRealAph You are making good points. 

You are basically saying: "we don't need any configure support for libsleef, we 
can just hard-code the names and dispatch to them directly to a dlopened 
library at runtime".

That is technically correct, but I'd still like to argue that the current setup 
have some merits:

* It will check that there is no typo in the function names. I agree that the 
likelihood of getting this wrong is low, but it is still a good practice to use 
official include files to have the compiler help checking this.
* If we would like to bundle libsleef.so with the JVM, now we have the 
possibility do do so easily. (Especially if it is like you say that it is not 
commonly installed). (If licenses allow etc)
* If we want to incorporate/bundle the source code of libsleef into OpenJDK, 
and build it as part of our internal library, we will have a good starting 
position, compared to starting from a hard-coded assembly file in hotspot. (I 
thought I heard some noise about this prospect).

So at this point, I am okay with the general approach of this PR. There are 
still some build issues to sort out, though, I'll address them separately.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1836266314


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Andrew Haley
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains ten additional 
> commits since the last revision:
> 
>  - Separate neon and sve functions into two source files
>  - Merge branch 'jdk:master' into JDK-8312425
>  - Rename vmath to sleef in configure
>  - Address review comments in build system
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

Oh, and:

If we can't trust SLEEF not to break the ABI we're using, we should not be 
using SLEEF.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835833150


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Andrew Haley
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains ten additional 
> commits since the last revision:
> 
>  - Separate neon and sve functions into two source files
>  - Merge branch 'jdk:master' into JDK-8312425
>  - Rename vmath to sleef in configure
>  - Address review comments in build system
>  - Add a bundled native lib in jdk as a bridge to libsleef
>  - Merge 'jdk:master' into JDK-8312425
>  - Disable sleef by default
>  - Merge 'jdk:master' into JDK-8312425
>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

Let me summarize the issues, as I see them:

To a high-order approximation, no one builds their own JDK.
We want people to be able to use these vector math operations.
There is no need to depend on a specific SLEEF library version.
I do not expect SLEEF to break the ABI by e.g. renaming functions. (I know, but 
let's assume.)
As long as the functions we want to use are present, we should use them.
SLEEF is not (yet) a standard part of OSs and build systems.

We don't want to fail unnecessarily at runtime because of a SLEEF ABI version 
change.
We don't want to fail to build the JDK if our GCC is too old for SVE. (Is that 
a problem now? It might be.)
We want to be able to test and run with any version of SLEEF, as long as the 
functions we need are present.

It should be possible to drop a SLEEF library into the system, and the JDK will 
use it.
The alternative is to package SLEEF with the JDK.

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835829658


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Xiaohong Gong
> Currently the vector floating-point math APIs like 
> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
> which causes large performance gap on AArch64. Note that those APIs are 
> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
> To close the gap, we would like to optimize these APIs for AArch64 by calling 
> a third-party vector library called libsleef [2], which are available in 
> mainstream Linux distros (e.g. [3] [4]).
> 
> SLEEF supports multiple accuracies. To match Vector API's requirement and 
> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
> instructions used stubs in libsleef for most of the operations by default, 
> and 2) add the vector calling convention to apply with the runtime calls to 
> stub code in libsleef. Note that for those APIs that libsleef does not 
> support 1.0 ULP, we choose 0.5 ULP instead.
> 
> To help loading the expected libsleef library, this patch also adds an 
> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
> People can use it to denote the libsleef path/name explicitly. By default, it 
> points to the system installed library. If the library does not exist or the 
> dynamic loading of it in runtime fails, the math vector ops will fall-back to 
> use the default scalar version without error. But a warning is printed out if 
> people specifies a nonexistent library explicitly.
> 
> Note that this is a part of the original proposed patch in panama-dev [5], 
> just with some initial review comments addressed. And now we'd like to get 
> some wider feedbacks from more hotspot experts.
> 
> [1] https://github.com/openjdk/jdk/pull/3638
> [2] https://sleef.org/
> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
> [4] https://packages.debian.org/bookworm/libsleef3
> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html

Xiaohong Gong has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains ten additional 
commits since the last revision:

 - Separate neon and sve functions into two source files
 - Merge branch 'jdk:master' into JDK-8312425
 - Rename vmath to sleef in configure
 - Address review comments in build system
 - Add a bundled native lib in jdk as a bridge to libsleef
 - Merge 'jdk:master' into JDK-8312425
 - Disable sleef by default
 - Merge 'jdk:master' into JDK-8312425
 - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16234/files
  - new: https://git.openjdk.org/jdk/pull/16234/files/c1ce1968..ee5caf6d

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16234=05
 - incr: https://webrevs.openjdk.org/?repo=jdk=16234=04-05

  Stats: 638331 lines in 1866 files changed: 100400 ins; 474467 del; 63464 mod
  Patch: https://git.openjdk.org/jdk/pull/16234.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16234/head:pull/16234

PR: https://git.openjdk.org/jdk/pull/16234