Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Tue, 6 Feb 2024 08:20:39 GMT, Magnus Ihse Bursie wrote: > I'd just hate to see all this work go to waste. Same here! - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1929780538
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Mon, 11 Dec 2023 18:25:09 GMT, Ludovic Henry wrote: >> @theRealAph >>> Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF >>> support? >> >> I can't say anything for sure, but I picked up some positive vibes from our >> internal chat. I think the idea was that libsleef could potentially cover up >> vector math for all platforms that the current Intel lib solution is missing >> (basically, everything but linux+windows x64). So I this can be seen as a >> bit of a trial balloon if it is worth a more complete integration of >> libsleef in the JDK. >> >> And I can't say anything at all about what Oracle JDKs are going to release >> with, but I am fully open towards adding libsleef to the GHA builds. And I'm >> guessing that Adoptium has no reason not to include libsleef, but then >> again, I cannot of course speak for them. > >> I can't say anything for sure, but I picked up some positive vibes from our >> internal chat. I think the idea was that libsleef could potentially cover up >> vector math for all platforms that the current Intel lib solution is missing >> (basically, everything but linux+windows x64). So I this can be seen as a >> bit of a trial balloon if it is worth a more complete integration of >> libsleef in the JDK. > > I can add that we are interested to use that for Linux + RISC-V support given > the RISC-V support was recently merged into sleef upstream. > https://github.com/shibatch/sleef/pull/477 @luhenry Maybe you are interested in helping with bringing this forward? I can assist with risc-v related fixes in the makefiles. I'd just hate to see all this work go to waste. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1928995458
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Mon, 4 Dec 2023 11:58:55 GMT, Magnus Ihse Bursie wrote: > I can't say anything for sure, but I picked up some positive vibes from our > internal chat. I think the idea was that libsleef could potentially cover up > vector math for all platforms that the current Intel lib solution is missing > (basically, everything but linux+windows x64). So I this can be seen as a bit > of a trial balloon if it is worth a more complete integration of libsleef in > the JDK. I can add that we are interested to use that for Linux + RISC-V support given the RISC-V support was recently merged into sleef upstream. https://github.com/shibatch/sleef/pull/477 - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1850636984
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 16:26:02 GMT, Magnus Ihse Bursie wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains ten additional >> commits since the last revision: >> >> - Separate neon and sve functions into two source files >> - Merge branch 'jdk:master' into JDK-8312425 >> - Rename vmath to sleef in configure >> - Address review comments in build system >> - Add a bundled native lib in jdk as a bridge to libsleef >> - Merge 'jdk:master' into JDK-8312425 >> - Disable sleef by default >> - Merge 'jdk:master' into JDK-8312425 >> - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF > > doc/building.md line 639: > >> 637: >> 638: libsleef, the [SIMD Library for Evaluating Elementary Functions]( >> 639: https://sleef.org/) is required when building libvmath.so on >> Linux/aarch64 > > This is incorrect. The library is not required, but if it is present, we will > build libvmath with it. > > Edit: Or rather, this is misleading. Technically it is correct, since you > state that it is required when building libvmath.so, but it is easy to > mistake for being required for building the JDK. The reader presumably does > not know what libvmath.so is or how it is used. > > Please rephrase this to so that it is clear that this is optional, but will > provide performance benefits to the resulting JDK if present. You do not need > to mention libvmath.so here, for no other dependency do we declare what parts > of the JDK that require it -- it is not essential for this document. > > Also see if you can make this paragraph and the one at the end be a bit more > tighter, not the last paragraph seems to be both repeat and contradict this > one. Hi @magicus , the doc is updated. Thanks for your comment on this! - PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1416964362
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Tue, 5 Dec 2023 13:00:04 GMT, Magnus Ihse Bursie wrote: > So you need to check both the flag and the header file? Oh well, then this is > probably as good as it gets. Yes, we have to check both the flag and the header file. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1841926432
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. >> To close the gap, we would like to optimize these APIs for AArch64 by >> calling a third-party vector library called libsleef [2], which are >> available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and >> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA >> instructions used stubs in libsleef for most of the operations by default, >> and 2) add the vector calling convention to apply with the runtime calls to >> stub code in libsleef. Note that for those APIs that libsleef does not >> support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an >> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. >> People can use it to denote the libsleef path/name explicitly. By default, >> it points to the system installed library. If the library does not exist or >> the dynamic loading of it in runtime fails, the math vector ops will >> fall-back to use the default scalar version without error. But a warning is >> printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], >> just with some initial review comments addressed. And now we'd like to get >> some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains ten additional > commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF So you need to check both the flag and the header file? Oh well, then this is probably as good as it gets. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1840748314
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Mon, 4 Dec 2023 08:31:17 GMT, Xiaohong Gong wrote: >> The final thing we need to resolve properly is the SVE compiler test. >> >> @theRealAph says: >>> arm_sve.h is part of GCC. It was added to GCC in 2019. >> >> A more relevant question is what version of gcc it was added, and if that >> also implies that the compiler knows about `-march=armv8-a+sve`. If so, then >> this test could basically be framed as a gcc version check. >> >> I'm still leaning towards failing configure if the SVE code cannot be >> compiled. Under what circumstances can this test possibly fail, so >> SVE_CFLAGS would not be set? > >> The final thing we need to resolve properly is the SVE compiler test. >> >> @theRealAph says: >> >> > arm_sve.h is part of GCC. It was added to GCC in 2019. >> >> A more relevant question is what version of gcc it was added, and if that >> also implies that the compiler knows about `-march=armv8-a+sve`. If so, then >> this test could basically be framed as a gcc version check. >> >> I'm still leaning towards failing configure if the SVE code cannot be >> compiled. Under what circumstances can this test possibly fail, so >> SVE_CFLAGS would not be set? > > Yes, the SVE compiler test code could be treated as a gcc/clang version > check. `arm_sve.h` which is included in `sleef.h` and then in > `vect_math_sve.c` is the SVE ACLE (Arm C Language Extensions) header file. It > was included in gcc start from version 10 (may not be exact, but gcc 8/9 > would fail when compile c code including this header). We have to make sure > the compiler supports the SVE ACLE before using it. Here are the different > scenarios: > > 1. The SVE compiler test success, and `SVE_CFLAGS` is set to > `-march=armv8-a+sve`. All symbols in `libvmath.so` are built successfully > including NEON/SVE. Hence, the vector math operations with all kinds of > vector size on both NEON/SVE machines will be improved as expected. > 2. The SVE compiler test fail, and `SVE_CFLAGS` is null. SVE symbols in > `libvmath.so` cannot be built out. Only NEON symbols exist in `libvmath.so`. > Hence, the enhancement for vector math operations with > 128-bit vector size > on SVE machines are missing. > @XiaohongGong If we are sure that the SVE test will always succeed when > running on gcc 10 or higher, then I guess I don't really need a way to > enforce SVE support -- you'll just have to make sure you use a recent enough > gcc. > > But, then the entire test becomes a bit unnecessary. You can just replace it > with a version check on gcc, or perhaps a FLAGS_COMPILER_CHECK_ARGUMENTS on > `-march=armv8-a+sve`. Thanks for the suggestion @magicus ! Replacing with a version check for the c compiler seems fine. But I cannot see the advantange than current test. Here are the reasons: 1. `-march=armv8-a+sve` is the necessary cflag for the sve source, and only included start from some c compiler versions. The c compiler version check must happen before using it. So it should also happen in the make or configure stage? Hence, we still have to find a right place to check it (should be in `lib-sleef.m4` or otherwhere?). 2. We not only have to check the gcc version, but also have to check the clang version. Would this make the code more complex? Regarding to using `FLAGS_COMPILER_CHECK_ARGUMENTS on "-march=armv8-a+sve"`, it is not right as well. Because we have to make sure the c compiler supports SVE ACLE completely which contains the sve header `arm_sve.h`. The compiler that supports option `-march=armv8-a+sve` cannot make sure the SVE ACLE is supported as well. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1839875881
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Mon, 4 Dec 2023 08:31:17 GMT, Xiaohong Gong wrote: >> The final thing we need to resolve properly is the SVE compiler test. >> >> @theRealAph says: >>> arm_sve.h is part of GCC. It was added to GCC in 2019. >> >> A more relevant question is what version of gcc it was added, and if that >> also implies that the compiler knows about `-march=armv8-a+sve`. If so, then >> this test could basically be framed as a gcc version check. >> >> I'm still leaning towards failing configure if the SVE code cannot be >> compiled. Under what circumstances can this test possibly fail, so >> SVE_CFLAGS would not be set? > >> The final thing we need to resolve properly is the SVE compiler test. >> >> @theRealAph says: >> >> > arm_sve.h is part of GCC. It was added to GCC in 2019. >> >> A more relevant question is what version of gcc it was added, and if that >> also implies that the compiler knows about `-march=armv8-a+sve`. If so, then >> this test could basically be framed as a gcc version check. >> >> I'm still leaning towards failing configure if the SVE code cannot be >> compiled. Under what circumstances can this test possibly fail, so >> SVE_CFLAGS would not be set? > > Yes, the SVE compiler test code could be treated as a gcc/clang version > check. `arm_sve.h` which is included in `sleef.h` and then in > `vect_math_sve.c` is the SVE ACLE (Arm C Language Extensions) header file. It > was included in gcc start from version 10 (may not be exact, but gcc 8/9 > would fail when compile c code including this header). We have to make sure > the compiler supports the SVE ACLE before using it. Here are the different > scenarios: > > 1. The SVE compiler test success, and `SVE_CFLAGS` is set to > `-march=armv8-a+sve`. All symbols in `libvmath.so` are built successfully > including NEON/SVE. Hence, the vector math operations with all kinds of > vector size on both NEON/SVE machines will be improved as expected. > 2. The SVE compiler test fail, and `SVE_CFLAGS` is null. SVE symbols in > `libvmath.so` cannot be built out. Only NEON symbols exist in `libvmath.so`. > Hence, the enhancement for vector math operations with > 128-bit vector size > on SVE machines are missing. @XiaohongGong If we are sure that the SVE test will always succeed when running on gcc 10 or higher, then I guess I don't really need a way to enforce SVE support -- you'll just have to make sure you use a recent enough gcc. But, then the entire test becomes a bit unnecessary. You can just replace it with a version check on gcc, or perhaps a FLAGS_COMPILER_CHECK_ARGUMENTS on `-march=armv8-a+sve`. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1838495553
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 16:49:28 GMT, Andrew Haley wrote: >> Oh, and: >> >> If we can't trust SLEEF not to break the ABI we're using, we should not be >> using SLEEF. > >> @theRealAph You are making good points. >> >> You are basically saying: "we don't need any configure support for libsleef, >> we can just hard-code the names and dispatch to them directly to a dlopened >> library at runtime". > > Yep. > >> That is technically correct, but I'd still like to argue that the current >> setup have some merits: >> >> * It will check that there is no typo in the function names. I agree >> that the likelihood of getting this wrong is low, but it is still a good >> practice to use official include files to have the compiler help checking >> this. >> >> * If we would like to bundle libsleef.so with the JVM, now we have the >> possibility do do so easily. (Especially if it is like you say that it is >> not commonly installed). (If licenses allow etc) >> >> * If we want to incorporate/bundle the source code of libsleef into >> OpenJDK, and build it as part of our internal library, we will have a good >> starting position, compared to starting from a hard-coded assembly file in >> hotspot. (I thought I heard some noise about this prospect). >> >> >> So at this point, I am okay with the general approach of this PR. There are >> still some build issues to sort out, though, I'll address them separately. > > I see, OK. The question in my mind is whether the common builds of OpenJDK > (Oracle, Adoptium, etc.) will support running with SLEEF. If by default SLEEF > is not required, support won't be built, and (to an nth order approximation) > no one will use it. But I guess it's better than nothing. > > Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF > support? @theRealAph > Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF > support? I can't say anything for sure, but I picked up some positive vibes from our internal chat. I think the idea was that libsleef could potentially cover up vector math for all platforms that the current Intel lib solution is missing (basically, everything but linux+windows x64). So I this can be seen as a bit of a trial balloon if it is worth a more complete integration of libsleef in the JDK. And I can't say anything at all about what Oracle JDKs are going to release with, but I am fully open towards adding libsleef to the GHA builds. And I'm guessing that Adoptium has no reason not to include libsleef, but then again, I cannot of course speak for them. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1838489588
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 16:45:49 GMT, Magnus Ihse Bursie wrote: > The final thing we need to resolve properly is the SVE compiler test. > > @theRealAph says: > > > arm_sve.h is part of GCC. It was added to GCC in 2019. > > A more relevant question is what version of gcc it was added, and if that > also implies that the compiler knows about `-march=armv8-a+sve`. If so, then > this test could basically be framed as a gcc version check. > > I'm still leaning towards failing configure if the SVE code cannot be > compiled. Under what circumstances can this test possibly fail, so SVE_CFLAGS > would not be set? Yes, the SVE compiler test code could be treated as a gcc/clang version check. `arm_sve.h` which is included in `sleef.h` and then in `vect_math_sve.c` is the SVE ACLE (Arm C Language Extensions) header file. It was included in gcc start from version 10 (may not be exact, but gcc 8/9 would fail when compile c code including this header). We have to make sure the compiler supports the SVE ACLE before using it. Here are the different scenarios: 1. The SVE compiler test success, and `SVE_CFLAGS` is set to `-march=armv8-a+sve`. All symbols in `libvmath.so` are built successfully including NEON/SVE. Hence, the vector math operations with all kinds of vector size on both NEON/SVE machines will be improved as expected. 2. The SVE compiler test fail, and `SVE_CFLAGS` is null. SVE symbols in `libvmath.so` cannot be built out. Only NEON symbols exist in `libvmath.so`. Hence, the enhancement for vector math operations with > 128-bit vector size on SVE machines are missing. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1838061502
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 10:19:01 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains ten additional >> commits since the last revision: >> >> - Separate neon and sve functions into two source files >> - Merge branch 'jdk:master' into JDK-8312425 >> - Rename vmath to sleef in configure >> - Address review comments in build system >> - Add a bundled native lib in jdk as a bridge to libsleef >> - Merge 'jdk:master' into JDK-8312425 >> - Disable sleef by default >> - Merge 'jdk:master' into JDK-8312425 >> - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF > > Oh, and: > > If we can't trust SLEEF not to break the ABI we're using, we should not be > using SLEEF. > @theRealAph You are making good points. > > You are basically saying: "we don't need any configure support for libsleef, > we can just hard-code the names and dispatch to them directly to a dlopened > library at runtime". Yep. > That is technically correct, but I'd still like to argue that the current > setup have some merits: > > * It will check that there is no typo in the function names. I agree that > the likelihood of getting this wrong is low, but it is still a good practice > to use official include files to have the compiler help checking this. > > * If we would like to bundle libsleef.so with the JVM, now we have the > possibility do do so easily. (Especially if it is like you say that it is not > commonly installed). (If licenses allow etc) > > * If we want to incorporate/bundle the source code of libsleef into > OpenJDK, and build it as part of our internal library, we will have a good > starting position, compared to starting from a hard-coded assembly file in > hotspot. (I thought I heard some noise about this prospect). > > > So at this point, I am okay with the general approach of this PR. There are > still some build issues to sort out, though, I'll address them separately. I see, OK. The question in my mind is whether the common builds of OpenJDK (Oracle, Adoptium, etc.) will support running with SLEEF. If by default SLEEF is not required, support won't be built, and (to an nth order approximation) no one will use it. But I guess it's better than nothing. Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF support? - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1836449876
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. >> To close the gap, we would like to optimize these APIs for AArch64 by >> calling a third-party vector library called libsleef [2], which are >> available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and >> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA >> instructions used stubs in libsleef for most of the operations by default, >> and 2) add the vector calling convention to apply with the runtime calls to >> stub code in libsleef. Note that for those APIs that libsleef does not >> support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an >> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. >> People can use it to denote the libsleef path/name explicitly. By default, >> it points to the system installed library. If the library does not exist or >> the dynamic loading of it in runtime fails, the math vector ops will >> fall-back to use the default scalar version without error. But a warning is >> printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], >> just with some initial review comments addressed. And now we'd like to get >> some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains ten additional > commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF The final thing we need to resolve properly is the SVE compiler test. @theRealAph says: > arm_sve.h is part of GCC. It was added to GCC in 2019. A more relevant question is what version of gcc it was added, and if that also implies that the compiler knows about `-march=armv8-a+sve`. If so, then this test could basically be framed as a gcc version check. I'm still leaning towards failing configure if the SVE code cannot be compiled. Under what circumstances can this test possibly fail, so SVE_CFLAGS would not be set? - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1836444674
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. >> To close the gap, we would like to optimize these APIs for AArch64 by >> calling a third-party vector library called libsleef [2], which are >> available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and >> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA >> instructions used stubs in libsleef for most of the operations by default, >> and 2) add the vector calling convention to apply with the runtime calls to >> stub code in libsleef. Note that for those APIs that libsleef does not >> support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an >> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. >> People can use it to denote the libsleef path/name explicitly. By default, >> it points to the system installed library. If the library does not exist or >> the dynamic loading of it in runtime fails, the math vector ops will >> fall-back to use the default scalar version without error. But a warning is >> printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], >> just with some initial review comments addressed. And now we'd like to get >> some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains ten additional > commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF doc/building.md line 639: > 637: > 638: libsleef, the [SIMD Library for Evaluating Elementary Functions]( > 639: https://sleef.org/) is required when building libvmath.so on > Linux/aarch64 This is incorrect. The library is not required, but if it is present, we will build libvmath with it. - PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1412330808
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 10:19:01 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains ten additional >> commits since the last revision: >> >> - Separate neon and sve functions into two source files >> - Merge branch 'jdk:master' into JDK-8312425 >> - Rename vmath to sleef in configure >> - Address review comments in build system >> - Add a bundled native lib in jdk as a bridge to libsleef >> - Merge 'jdk:master' into JDK-8312425 >> - Disable sleef by default >> - Merge 'jdk:master' into JDK-8312425 >> - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF > > Oh, and: > > If we can't trust SLEEF not to break the ABI we're using, we should not be > using SLEEF. @theRealAph You are making good points. You are basically saying: "we don't need any configure support for libsleef, we can just hard-code the names and dispatch to them directly to a dlopened library at runtime". That is technically correct, but I'd still like to argue that the current setup have some merits: * It will check that there is no typo in the function names. I agree that the likelihood of getting this wrong is low, but it is still a good practice to use official include files to have the compiler help checking this. * If we would like to bundle libsleef.so with the JVM, now we have the possibility do do so easily. (Especially if it is like you say that it is not commonly installed). (If licenses allow etc) * If we want to incorporate/bundle the source code of libsleef into OpenJDK, and build it as part of our internal library, we will have a good starting position, compared to starting from a hard-coded assembly file in hotspot. (I thought I heard some noise about this prospect). So at this point, I am okay with the general approach of this PR. There are still some build issues to sort out, though, I'll address them separately. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1836266314
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. >> To close the gap, we would like to optimize these APIs for AArch64 by >> calling a third-party vector library called libsleef [2], which are >> available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and >> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA >> instructions used stubs in libsleef for most of the operations by default, >> and 2) add the vector calling convention to apply with the runtime calls to >> stub code in libsleef. Note that for those APIs that libsleef does not >> support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an >> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. >> People can use it to denote the libsleef path/name explicitly. By default, >> it points to the system installed library. If the library does not exist or >> the dynamic loading of it in runtime fails, the math vector ops will >> fall-back to use the default scalar version without error. But a warning is >> printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], >> just with some initial review comments addressed. And now we'd like to get >> some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains ten additional > commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF Oh, and: If we can't trust SLEEF not to break the ABI we're using, we should not be using SLEEF. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835833150
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. >> To close the gap, we would like to optimize these APIs for AArch64 by >> calling a third-party vector library called libsleef [2], which are >> available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and >> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA >> instructions used stubs in libsleef for most of the operations by default, >> and 2) add the vector calling convention to apply with the runtime calls to >> stub code in libsleef. Note that for those APIs that libsleef does not >> support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an >> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. >> People can use it to denote the libsleef path/name explicitly. By default, >> it points to the system installed library. If the library does not exist or >> the dynamic loading of it in runtime fails, the math vector ops will >> fall-back to use the default scalar version without error. But a warning is >> printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], >> just with some initial review comments addressed. And now we'd like to get >> some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains ten additional > commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF Let me summarize the issues, as I see them: To a high-order approximation, no one builds their own JDK. We want people to be able to use these vector math operations. There is no need to depend on a specific SLEEF library version. I do not expect SLEEF to break the ABI by e.g. renaming functions. (I know, but let's assume.) As long as the functions we want to use are present, we should use them. SLEEF is not (yet) a standard part of OSs and build systems. We don't want to fail unnecessarily at runtime because of a SLEEF ABI version change. We don't want to fail to build the JDK if our GCC is too old for SVE. (Is that a problem now? It might be.) We want to be able to test and run with any version of SLEEF, as long as the functions we need are present. It should be possible to drop a SLEEF library into the system, and the JDK will use it. The alternative is to package SLEEF with the JDK. - PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835829658
Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]
> Currently the vector floating-point math APIs like > `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, > which causes large performance gap on AArch64. Note that those APIs are > optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. > To close the gap, we would like to optimize these APIs for AArch64 by calling > a third-party vector library called libsleef [2], which are available in > mainstream Linux distros (e.g. [3] [4]). > > SLEEF supports multiple accuracies. To match Vector API's requirement and > implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA > instructions used stubs in libsleef for most of the operations by default, > and 2) add the vector calling convention to apply with the runtime calls to > stub code in libsleef. Note that for those APIs that libsleef does not > support 1.0 ULP, we choose 0.5 ULP instead. > > To help loading the expected libsleef library, this patch also adds an > experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. > People can use it to denote the libsleef path/name explicitly. By default, it > points to the system installed library. If the library does not exist or the > dynamic loading of it in runtime fails, the math vector ops will fall-back to > use the default scalar version without error. But a warning is printed out if > people specifies a nonexistent library explicitly. > > Note that this is a part of the original proposed patch in panama-dev [5], > just with some initial review comments addressed. And now we'd like to get > some wider feedbacks from more hotspot experts. > > [1] https://github.com/openjdk/jdk/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: - Separate neon and sve functions into two source files - Merge branch 'jdk:master' into JDK-8312425 - Rename vmath to sleef in configure - Address review comments in build system - Add a bundled native lib in jdk as a bridge to libsleef - Merge 'jdk:master' into JDK-8312425 - Disable sleef by default - Merge 'jdk:master' into JDK-8312425 - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF - Changes: - all: https://git.openjdk.org/jdk/pull/16234/files - new: https://git.openjdk.org/jdk/pull/16234/files/c1ce1968..ee5caf6d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16234=05 - incr: https://webrevs.openjdk.org/?repo=jdk=16234=04-05 Stats: 638331 lines in 1866 files changed: 100400 ins; 474467 del; 63464 mod Patch: https://git.openjdk.org/jdk/pull/16234.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16234/head:pull/16234 PR: https://git.openjdk.org/jdk/pull/16234