RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable

2023-12-06 Thread Serguei Spitsyn
This fix targets JDK 23, with the intention of backporting it to JDK 22 in the 
RDP 1 time frame.
It fixes a deadlock between `VirtualThread` critical sections that hold the 
`interruptLock` (in the methods `unpark()`, `interrupt()`, 
`getAndClearInterrupt()`, `threadState()`, `toString()`), 
`JvmtiVTMSTransitionDisabler`, and the JVMTI `Suspend/Resume` mechanisms.
The deadlock scenario is described in detail by Patricio in a bug report comment.
In short, a virtual thread must not be suspended while it is inside an 
`interruptLock` critical section.

The fix is to record that a virtual thread is in a critical section 
(`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about 
the beginning and end of each critical section.
This bit is used in `HandshakeState::get_op_for_self()` to filter out any 
`HandshakeOperation` if a target `JavaThread` is in a critical section.
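
The filtering idea described above can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions, not HotSpot code: the class `CriticalSectionModel` and all of its methods are hypothetical names invented for illustration. Only the idea that no operation is handed out while the critical-section bit is set comes from the description above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model (hypothetical, not HotSpot code) of deferring operations during a critical section. */
class CriticalSectionModel {
    private boolean inCriticalSection;                 // models the per-thread "in critical section" bit
    private final Deque<String> pendingOps = new ArrayDeque<>();
    private boolean suspended;

    void beginCriticalSection() { inCriticalSection = true; }
    void endCriticalSection()   { inCriticalSection = false; }

    /** Another party asks this thread to suspend; the request is only queued here. */
    void requestSuspend() { pendingOps.add("suspend"); }

    /** Returns the next operation the thread may process, or null if none is allowed right now. */
    String getOpForSelf() {
        if (inCriticalSection) {
            return null;                               // filter out every operation while the bit is set
        }
        return pendingOps.poll();
    }

    /** Models a point where the thread checks for pending operations. */
    void pollPoint() {
        String op = getOpForSelf();
        if ("suspend".equals(op)) {
            suspended = true;
        }
    }

    boolean isSuspended() { return suspended; }
    int pendingCount()    { return pendingOps.size(); }

    public static void main(String[] args) {
        CriticalSectionModel thread = new CriticalSectionModel();
        thread.beginCriticalSection();
        thread.requestSuspend();
        thread.pollPoint();                            // request stays queued
        System.out.println("suspended inside critical section: " + thread.isSuspended());
        thread.endCriticalSection();
        thread.pollPoint();                            // request is honored now
        System.out.println("suspended after critical section: " + thread.isSuspended());
    }
}
```

In this model the suspend request is not lost; it is only delayed until the thread has left the critical section, which is the property that avoids suspending a thread while it holds the lock.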

Some of the new notifications made with the `notifyJvmtiSync()` method are on a 
performance-critical path, which is why this method has been intrinsified.

A new test was developed by Patricio:
 `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock`
Without the fix, the test reproduces the deadlock reliably (100% of the time).
With the fix, the test never fails.

Testing:
 - tested with newly added test: 
`test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock`
 - tested with mach5 tiers 1-6

-

Commit messages:
 - added @summary to new test SuspendWithInterruptLock.java
 - add new test SuspendWithInterruptLock.java
 - 8311218: fatal error: stuck in 
JvmtiVTMSTransitionDisabler::VTMS_transition_disable

Changes: https://git.openjdk.org/jdk/pull/17011/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8311218
  Stats: 183 lines in 15 files changed: 178 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/17011.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17011/head:pull/17011

PR: https://git.openjdk.org/jdk/pull/17011


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v8]

2023-12-06 Thread Xiaohong Gong
> Currently the vector floating-point math APIs like 
> `VectorOperators.SIN/COS/TAN...` are not intrinsified on the AArch64 platform, 
> which causes a large performance gap on AArch64. Note that those APIs are 
> optimized by the C2 compiler on x86 platforms by calling Intel's SVML code [1]. 
> To close the gap, we would like to optimize these APIs for AArch64 by calling 
> a third-party vector library called libsleef [2], which is available in 
> mainstream Linux distros (e.g. [3] [4]).
> 
> SLEEF supports multiple accuracies. To match the Vector API's requirements and 
> implement the math ops on AArch64, we 1) call the libsleef stubs that provide 
> 1.0 ULP accuracy and use FMA instructions for most of the operations by 
> default, and 2) add the vector calling convention to apply to the runtime 
> calls to stub code in libsleef. Note that for those APIs for which libsleef 
> does not support 1.0 ULP, we choose 0.5 ULP instead.
> 
> To help load the expected libsleef library, this patch also adds an 
> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
> People can use it to specify the libsleef path/name explicitly. By default, it 
> points to the system-installed library. If the library does not exist or 
> loading it dynamically at runtime fails, the math vector ops fall back to 
> the default scalar version without error. However, a warning is printed if 
> a nonexistent library is specified explicitly.
> 
> Note that this is a part of the original patch proposed on panama-dev [5], 
> just with some initial review comments addressed. We would now like to get 
> wider feedback from more HotSpot experts.
> 
> [1] https://github.com/openjdk/jdk/pull/3638
> [2] https://sleef.org/
> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
> [4] https://packages.debian.org/bookworm/libsleef3
> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
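
The fall-back behavior described in the quoted text (use the library if it loads, otherwise silently use the scalar path, and warn only when a library was named explicitly) can be sketched in plain Java. This is an illustration under stated assumptions, not the patch's implementation: the class `SleefFallbackSketch`, the `Backend` enum, and the library name used in `main` are hypothetical, and the real logic lives inside the JVM rather than in Java code.

```java
/** Hypothetical sketch of "try the native library, otherwise fall back to scalar math". */
class SleefFallbackSketch {
    enum Backend { NATIVE_VECTOR, SCALAR }

    /**
     * Tries to load a native library by name. A missing library never raises an
     * error; a warning is printed only if the caller named the library explicitly.
     */
    static Backend selectBackend(String libName, boolean explicitlyRequested) {
        try {
            System.loadLibrary(libName);
            return Backend.NATIVE_VECTOR;
        } catch (UnsatisfiedLinkError e) {
            if (explicitlyRequested) {
                System.err.println("warning: cannot load '" + libName + "', using scalar math");
            }
            return Backend.SCALAR;
        }
    }

    /** The scalar fall-back path: one Math.sin call per element. */
    static double[] scalarSin(double[] xs) {
        double[] out = new double[xs.length];
        for (int i = 0; i < xs.length; i++) {
            out[i] = Math.sin(xs[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        // The library name below is made up so that loading is expected to fail.
        Backend backend = selectBackend("hypothetical_missing_vector_lib", true);
        System.out.println("selected backend: " + backend);
        if (backend == Backend.SCALAR) {
            System.out.println("sin(0.0) = " + scalarSin(new double[] {0.0})[0]);
        }
    }
}
```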

Xiaohong Gong has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains 12 additional commits 
since the last revision:

 - Remove -fvisibility in makefile and add the attribute in source code
 - Merge branch 'jdk:master' into JDK-8312425
 - Add "--with-libsleef-lib" and "--with-libsleef-include" options
 - Separate neon and sve functions into two source files
 - Merge branch 'jdk:master' into JDK-8312425
 - Rename vmath to sleef in configure
 - Address review comments in build system
 - Add a bundled native lib in jdk as a bridge to libsleef
 - Merge 'jdk:master' into JDK-8312425
 - Disable sleef by default
 - ... and 2 more: https://git.openjdk.org/jdk/compare/6feb6794...c55357b6

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16234/files
  - new: https://git.openjdk.org/jdk/pull/16234/files/f3ff0672..c55357b6

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=06-07

  Stats: 83778 lines in 1591 files changed: 38309 ins; 39305 del; 6164 mod
  Patch: https://git.openjdk.org/jdk/pull/16234.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16234/head:pull/16234

PR: https://git.openjdk.org/jdk/pull/16234


Re: RFR: JDK-8319413: Start of release updates for JDK 23 [v8]

2023-12-06 Thread Joe Darcy
> Time to start making preparations for JDK 23.

Joe Darcy has updated the pull request incrementally with one additional commit 
since the last revision:

  Update copyright year.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16505/files
  - new: https://git.openjdk.org/jdk/pull/16505/files/4a871372..a1aa187c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16505&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16505&range=06-07

  Stats: 6 lines in 6 files changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/16505.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16505/head:pull/16505

PR: https://git.openjdk.org/jdk/pull/16505


Re: RFR: JDK-8319413: Start of release updates for JDK 23 [v7]

2023-12-06 Thread Joe Darcy
> Time to start making preparations for JDK 23.

Joe Darcy has updated the pull request with a new target base due to a merge or 
a rebase. The pull request now contains 21 commits:

 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - Regenerate JDK 22 symbol files.
 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - Add symbol files.
 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - ... and 11 more: https://git.openjdk.org/jdk/compare/632a3c56...4a871372

-

Changes: https://git.openjdk.org/jdk/pull/16505/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16505&range=06
  Stats: 4866 lines in 98 files changed: 4828 ins; 0 del; 38 mod
  Patch: https://git.openjdk.org/jdk/pull/16505.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16505/head:pull/16505

PR: https://git.openjdk.org/jdk/pull/16505


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v12]

2023-12-06 Thread Srinivas Vamsi Parasa
On Wed, 6 Dec 2023 23:09:01 GMT, Srinivas Vamsi Parasa  wrote:

>>> LGTM, thanks!
>> 
>> Thanks Jatin!
>
>> @vamsi-parasa, sorry, I was wrong. I missed that you need to check type 
>> `bt`. Latest change is more complicated than it was before. Please revert it 
>> back (undo last change). I will test previous version 09.
> @vnkozlov 
> Vladimir, please see the commit reverted in the updated changes pushed now.

> @vamsi-parasa, please, remind me which tests check that code in 
> `libsmdsort.so` is used?

@vnkozlov 
Please see the tests for simd sort code in 
`test/jdk/java/util/Arrays/Sorting.java`
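
For readers who want a quick local check of the affected API, here is a minimal sketch that sorts random `int` and `float` arrays with `Arrays.sort` and verifies the result. It is not the content of `Sorting.java`; the class `SortSanityCheck` and its helpers are hypothetical, and a passing run shows only that the results are ordered, not that the SIMD code path was actually taken.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sanity check for Arrays.sort on int[] and float[] inputs. */
class SortSanityCheck {
    static int[] randomInts(int n, long seed) {
        Random r = new Random(seed);
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = r.nextInt();
        }
        return a;
    }

    static float[] randomFloats(int n, long seed) {
        Random r = new Random(seed);
        float[] a = new float[n];
        for (int i = 0; i < n; i++) {
            a[i] = r.nextFloat();
        }
        return a;
    }

    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) return false;
        }
        return true;
    }

    static boolean isSorted(float[] a) {
        for (int i = 1; i < a.length; i++) {
            // Float.compare uses the same total order that Arrays.sort applies to floats.
            if (Float.compare(a[i - 1], a[i]) > 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] ints = randomInts(100_000, 1L);
        float[] floats = randomFloats(100_000, 2L);
        Arrays.sort(ints);
        Arrays.sort(floats);
        System.out.println("int[] sorted:   " + isSorted(ints));
        System.out.println("float[] sorted: " + isSorted(floats));
    }
}
```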

-

PR Comment: https://git.openjdk.org/jdk/pull/16534#issuecomment-1843963054


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v12]

2023-12-06 Thread Vladimir Kozlov
On Wed, 6 Dec 2023 23:09:01 GMT, Srinivas Vamsi Parasa  wrote:

>>> LGTM, thanks!
>> 
>> Thanks Jatin!
>
>> @vamsi-parasa, sorry, I was wrong. I missed that you need to check type 
>> `bt`. Latest change is more complicated than it was before. Please revert it 
>> back (undo last change). I will test previous version 09.
> @vnkozlov 
> Vladimir, please see the commit reverted in the updated changes pushed now.

@vamsi-parasa, please, remind me which tests check that code in `libsmdsort.so` 
is used?

-

PR Comment: https://git.openjdk.org/jdk/pull/16534#issuecomment-1843938518


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v12]

2023-12-06 Thread Vladimir Kozlov
On Wed, 6 Dec 2023 23:12:13 GMT, Srinivas Vamsi Parasa  wrote:

>> The goal is to develop faster sort routines for x86_64 CPUs by taking 
>> advantage of AVX2 instructions. This enhancement provides an order of 
>> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>> 
>> For serial sort on random data, this PR shows up to ~7.5x improvement for 
>> 32-bit datatypes (int, float) on an Intel Tiger Lake machine, as shown in the 
>> performance data below.
>> 
>> For parallel sort on random data, this PR shows up to ~3.4x improvement for 
>> 32-bit datatypes (int, float), as shown below.
>> 
>> **Note:** This PR also improves the performance of AVX512 sort by up to 35%.
>> 
>> Benchmark (Serial Sort) | Size | Baseline (us/op) | AVX2 (us/op) | Speedup
>> -- | -- | -- | -- | --
>> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
>> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
>> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
>> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
>> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
>> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
>> ArraysSort.intSort | 1 | 330.065 | 43.383 | 7.6
>> ArraysSort.intSort | 10 | 4099.527 | 778.943 | 5.3
>> ArraysSort.intSort | 100 | 49150.16 | 9634.335 | 5.1
>> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
>> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
>> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
>> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
>> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
>> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
>> ArraysSort.floatSort | 1 | 323.136 | 57.175 | 5.7
>> ArraysSort.floatSort | 10 | 4299.519 | 862.63 | 5.0
>> ArraysSort.floatSort | 100 | 50889.4 | 10972.19 | 4.6
>> 
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Revert "Change supported intrinsic check"
>   
>   This reverts commit 9621eb045c2958582f81ec06b237789a07481ddd.

Good. I submitted testing.

-

PR Comment: https://git.openjdk.org/jdk/pull/16534#issuecomment-1843839669


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v12]

2023-12-06 Thread Srinivas Vamsi Parasa
On Wed, 6 Dec 2023 17:44:24 GMT, Srinivas Vamsi Parasa  wrote:

>> LGTM, thanks!
>
>> LGTM, thanks!
> 
> Thanks Jatin!

> @vamsi-parasa, sorry, I was wrong. I missed that you need to check type `bt`. 
> Latest change is more complicated than it was before. Please revert it back 
> (undo last change). I will test previous version 09.
@vnkozlov 
Vladimir, please see the commit reverted in the updated changes pushed now.

-

PR Comment: https://git.openjdk.org/jdk/pull/16534#issuecomment-1843834085


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v12]

2023-12-06 Thread Srinivas Vamsi Parasa
> The goal is to develop faster sort routines for x86_64 CPUs by taking 
> advantage of AVX2 instructions. This enhancement provides an order of 
> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
> 
> For serial sort on random data, this PR shows up to ~7.5x improvement for 
> 32-bit datatypes (int, float) on an Intel Tiger Lake machine, as shown in the 
> performance data below.
> 
> For parallel sort on random data, this PR shows up to ~3.4x improvement for 
> 32-bit datatypes (int, float), as shown below.
> 
> **Note:** This PR also improves the performance of AVX512 sort by up to 35%.
> 
> Benchmark (Serial Sort) | Size | Baseline (us/op) | AVX2 (us/op) | Speedup
> -- | -- | -- | -- | --
> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
> ArraysSort.intSort | 1 | 330.065 | 43.383 | 7.6
> ArraysSort.intSort | 10 | 4099.527 | 778.943 | 5.3
> ArraysSort.intSort | 100 | 49150.16 | 9634.335 | 5.1
> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
> ArraysSort.floatSort | 1 | 323.136 | 57.175 | 5.7
> ArraysSort.floatSort | 10 | 4299.519 | 862.63 | 5.0
> ArraysSort.floatSort | 100 | 50889.4 | 10972.19 | 4.6

Srinivas Vamsi Parasa has updated the pull request incrementally with one 
additional commit since the last revision:

  Revert "Change supported intrinsic check"
  
  This reverts commit 9621eb045c2958582f81ec06b237789a07481ddd.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16534/files
  - new: https://git.openjdk.org/jdk/pull/16534/files/9621eb04..eadba369

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16534&range=11
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16534&range=10-11

  Stats: 28 lines in 4 files changed: 0 ins; 20 del; 8 mod
  Patch: https://git.openjdk.org/jdk/pull/16534.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16534/head:pull/16534

PR: https://git.openjdk.org/jdk/pull/16534


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v10]

2023-12-06 Thread Vladimir Kozlov
On Wed, 6 Dec 2023 17:44:24 GMT, Srinivas Vamsi Parasa  wrote:

>> LGTM, thanks!
>
>> LGTM, thanks!
> 
> Thanks Jatin!

@vamsi-parasa, sorry, I was wrong. I missed that you need to check type `bt`. 
Latest change is more complicated than it was before. Please revert it back 
(undo last change). I will test previous version 09.

-

PR Comment: https://git.openjdk.org/jdk/pull/16534#issuecomment-1843822075


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v11]

2023-12-06 Thread Srinivas Vamsi Parasa
> The goal is to develop faster sort routines for x86_64 CPUs by taking 
> advantage of AVX2 instructions. This enhancement provides an order of 
> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
> 
> For serial sort on random data, this PR shows up to ~7.5x improvement for 
> 32-bit datatypes (int, float) on an Intel Tiger Lake machine, as shown in the 
> performance data below.
> 
> For parallel sort on random data, this PR shows up to ~3.4x improvement for 
> 32-bit datatypes (int, float), as shown below.
> 
> **Note:** This PR also improves the performance of AVX512 sort by up to 35%.
> 
> Benchmark (Serial Sort) | Size | Baseline (us/op) | AVX2 (us/op) | Speedup
> -- | -- | -- | -- | --
> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
> ArraysSort.intSort | 1 | 330.065 | 43.383 | 7.6
> ArraysSort.intSort | 10 | 4099.527 | 778.943 | 5.3
> ArraysSort.intSort | 100 | 49150.16 | 9634.335 | 5.1
> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
> ArraysSort.floatSort | 1 | 323.136 | 57.175 | 5.7
> ArraysSort.floatSort | 10 | 4299.519 | 862.63 | 5.0
> ArraysSort.floatSort | 100 | 50889.4 | 10972.19 | 4.6

Srinivas Vamsi Parasa has updated the pull request incrementally with one 
additional commit since the last revision:

  Change supported intrinsic check

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16534/files
  - new: https://git.openjdk.org/jdk/pull/16534/files/7e124581..9621eb04

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16534&range=10
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16534&range=09-10

  Stats: 28 lines in 4 files changed: 20 ins; 0 del; 8 mod
  Patch: https://git.openjdk.org/jdk/pull/16534.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16534/head:pull/16534

PR: https://git.openjdk.org/jdk/pull/16534


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v10]

2023-12-06 Thread Srinivas Vamsi Parasa
On Wed, 6 Dec 2023 18:41:26 GMT, Vladimir Kozlov  wrote:

>> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   add missing header files
>
> src/hotspot/share/opto/library_call.cpp line 5393:
> 
>> 5391: if (!Matcher::supports_simd_sort(bt)) {
>> 5392:   return false;
>> 5393: }
> 
> This check should be in `C2Compiler::is_intrinsic_supported()`

Hi Vladimir (@vnkozlov), please see the updated changes which use 
`C2Compiler::is_intrinsic_supported(id, bt)`

> src/hotspot/share/opto/library_call.cpp line 5450:
> 
>> 5448:   if (!Matcher::supports_simd_sort(bt)) {
>> 5449: return false;
>> 5450:   }
> 
> Same.

Please see the updated changes, which use 
`C2Compiler::is_intrinsic_supported(id, bt)`

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417946689
PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417946968


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v10]

2023-12-06 Thread Vladimir Kozlov
On Wed, 6 Dec 2023 17:48:04 GMT, Srinivas Vamsi Parasa  wrote:

>> The goal is to develop faster sort routines for x86_64 CPUs by taking 
>> advantage of AVX2 instructions. This enhancement provides an order of 
>> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>> 
>> For serial sort on random data, this PR shows up to ~7.5x improvement for 
>> 32-bit datatypes (int, float) on an Intel Tiger Lake machine, as shown in the 
>> performance data below.
>> 
>> For parallel sort on random data, this PR shows up to ~3.4x improvement for 
>> 32-bit datatypes (int, float), as shown below.
>> 
>> **Note:** This PR also improves the performance of AVX512 sort by up to 35%.
>> 
>> Benchmark (Serial Sort) | Size | Baseline (us/op) | AVX2 (us/op) | Speedup
>> -- | -- | -- | -- | --
>> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
>> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
>> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
>> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
>> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
>> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
>> ArraysSort.intSort | 1 | 330.065 | 43.383 | 7.6
>> ArraysSort.intSort | 10 | 4099.527 | 778.943 | 5.3
>> ArraysSort.intSort | 100 | 49150.16 | 9634.335 | 5.1
>> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
>> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
>> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
>> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
>> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
>> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
>> ArraysSort.floatSort | 1 | 323.136 | 57.175 | 5.7
>> ArraysSort.floatSort | 10 | 4299.519 | 862.63 | 5.0
>> ArraysSort.floatSort | 100 | 50889.4 | 10972.19 | 4.6
>> 
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   add missing header files

src/hotspot/share/opto/library_call.cpp line 5393:

> 5391: if (!Matcher::supports_simd_sort(bt)) {
> 5392:   return false;
> 5393: }

This check should be in `C2Compiler::is_intrinsic_supported()`

src/hotspot/share/opto/library_call.cpp line 5450:

> 5448:   if (!Matcher::supports_simd_sort(bt)) {
> 5449: return false;
> 5450:   }

Same.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417831171
PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417832246


Re: RFR: JDK-8319413: Start of release updates for JDK 23 [v6]

2023-12-06 Thread Joe Darcy
> Time to start making preparations for JDK 23.

Joe Darcy has updated the pull request with a new target base due to a merge or 
a rebase. The pull request now contains 18 commits:

 - Regenerate JDK 22 symbol files.
 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - Add symbol files.
 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - Merge branch 'master' into JDK-8319413
 - Update symbol files to JDK 22 b26.
 - ... and 8 more: https://git.openjdk.org/jdk/compare/db5613af...efe849f6

-

Changes: https://git.openjdk.org/jdk/pull/16505/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16505&range=05
  Stats: 4866 lines in 98 files changed: 4828 ins; 0 del; 38 mod
  Patch: https://git.openjdk.org/jdk/pull/16505.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16505/head:pull/16505

PR: https://git.openjdk.org/jdk/pull/16505


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v10]

2023-12-06 Thread Sandhya Viswanathan
On Wed, 6 Dec 2023 18:26:34 GMT, Vladimir Kozlov  wrote:

>> @TobiHartmann @vnkozlov Please advise if we can go ahead and integrate this 
>> PR today before the fork.
>
>> @TobiHartmann @vnkozlov Please advise if we can go ahead and integrate this 
>> PR today before the fork.
> 
> Too late. The changes look fine to me (I am still on the fence about us 
> moving to a C++ implementation of intrinsics and requiring the latest C++ 
> compiler version). I need to run testing on the latest version of the changes 
> before approval. Let's not rush.

Thanks a lot @vnkozlov, we will wait for your approval.

-

PR Comment: https://git.openjdk.org/jdk/pull/16534#issuecomment-1843454162


Re: RFR: 8321374: Add a configure option to explicitly set CompanyName property in VersionInfo resource for Windows exe/dll

2023-12-06 Thread Erik Joelsson
On Tue, 5 Dec 2023 11:15:18 GMT, Frederic Thevenet  
wrote:

> When building OpenJDK on the Windows platform, version information is 
> embedded as compiled resources into every native library and executable. 
> It typically contains version numbers, copyright information, product name 
> and vendor name (called "Company name" in this context).
> 
> Currently, the value for "Company name" is always the same as that of 
> "vendor-name", but it would be really useful in some build scenarios to 
> allow the "Company name" property embedded in the VersionInfo compiled 
> resources to differ from the "vendor name" property that is used within 
> the JVM (e.g. java -version). 
> 
> This PR adds a "--with-jdk-rc-company-name" configure option which can be 
> used to set "Company name" to its own value, independently of "vendor-name", 
> in the same fashion as the existing "--with-jdk-rc-name", which is used to 
> override the values of "File description" and "Product name".
> 
> If "--with-jdk-rc-company-name" isn't used, then the vendor-name value is 
> used instead, reproducing the same behavior as without this patch.

Marked as reviewed by erikj (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/16972#pullrequestreview-1768356733


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v10]

2023-12-06 Thread Vladimir Kozlov
On Wed, 6 Dec 2023 18:05:22 GMT, Sandhya Viswanathan  
wrote:

> @TobiHartmann @vnkozlov Please advise if we can go ahead and integrate this PR 
> today before the fork.

Too late. The changes look fine to me (I am still on the fence about us moving 
to a C++ implementation of intrinsics and requiring the latest C++ compiler 
version). I need to run testing on the latest version of the changes before 
approval. Let's not rush.

-

PR Comment: https://git.openjdk.org/jdk/pull/16534#issuecomment-1843446956


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v10]

2023-12-06 Thread Sandhya Viswanathan
On Wed, 6 Dec 2023 17:48:04 GMT, Srinivas Vamsi Parasa  wrote:

>> The goal is to develop faster sort routines for x86_64 CPUs by taking 
>> advantage of AVX2 instructions. This enhancement provides an order of 
>> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>> 
>> For serial sort on random data, this PR shows up to ~7.5x improvement for 
>> 32-bit datatypes (int, float) on an Intel Tiger Lake machine, as shown in the 
>> performance data below.
>> 
>> For parallel sort on random data, this PR shows up to ~3.4x improvement for 
>> 32-bit datatypes (int, float), as shown below.
>> 
>> **Note:** This PR also improves the performance of AVX512 sort by up to 35%.
>> 
>> Benchmark (Serial Sort) | Size | Baseline (us/op) | AVX2 (us/op) | Speedup
>> -- | -- | -- | -- | --
>> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
>> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
>> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
>> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
>> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
>> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
>> ArraysSort.intSort | 1 | 330.065 | 43.383 | 7.6
>> ArraysSort.intSort | 10 | 4099.527 | 778.943 | 5.3
>> ArraysSort.intSort | 100 | 49150.16 | 9634.335 | 5.1
>> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
>> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
>> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
>> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
>> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
>> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
>> ArraysSort.floatSort | 1 | 323.136 | 57.175 | 5.7
>> ArraysSort.floatSort | 10 | 4299.519 | 862.63 | 5.0
>> ArraysSort.floatSort | 100 | 50889.4 | 10972.19 | 4.6
>> 
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   add missing header files

@TobiHartmann @vnkozlov Please advise if we can go ahead and integrate this PR 
today before the fork.

-

PR Comment: https://git.openjdk.org/jdk/pull/16534#issuecomment-1843407940


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v10]

2023-12-06 Thread Srinivas Vamsi Parasa
> The goal is to develop faster sort routines for x86_64 CPUs by taking 
> advantage of AVX2 instructions. This enhancement provides an order of 
> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
> 
> For serial sort on random data, this PR shows up to ~7.5x improvement for 
> 32-bit datatypes (int, float) on an Intel Tiger Lake machine, as shown in the 
> performance data below.
> 
> For parallel sort on random data, this PR shows up to ~3.4x improvement for 
> 32-bit datatypes (int, float), as shown below.
> 
> **Note:** This PR also improves the performance of AVX512 sort by up to 35%.
> 
> Benchmark (Serial Sort) | Size | Baseline (us/op) | AVX2 (us/op) | Speedup
> -- | -- | -- | -- | --
> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
> ArraysSort.intSort | 1 | 330.065 | 43.383 | 7.6
> ArraysSort.intSort | 10 | 4099.527 | 778.943 | 5.3
> ArraysSort.intSort | 100 | 49150.16 | 9634.335 | 5.1
> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
> ArraysSort.floatSort | 1 | 323.136 | 57.175 | 5.7
> ArraysSort.floatSort | 10 | 4299.519 | 862.63 | 5.0
> ArraysSort.floatSort | 100 | 50889.4 | 10972.19 | 4.6

Srinivas Vamsi Parasa has updated the pull request incrementally with one 
additional commit since the last revision:

  add missing header files

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16534/files
  - new: https://git.openjdk.org/jdk/pull/16534/files/c143e0b9..7e124581

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16534&range=09
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16534&range=08-09

  Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/16534.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16534/head:pull/16534

PR: https://git.openjdk.org/jdk/pull/16534


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v10]

2023-12-06 Thread Jatin Bhateja
On Wed, 6 Dec 2023 17:44:25 GMT, Srinivas Vamsi Parasa  wrote:

>> The goal is to develop faster sort routines for x86_64 CPUs by taking 
>> advantage of AVX2 instructions. This enhancement provides an order of 
>> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>> 
>> For serial sort on random data, this PR shows up to ~7.5x improvement for 
>> 32-bit datatypes (int, float) on an Intel TigerLake machine as shown in the 
>> performance data below.
>> 
>> For parallel sort on random data, this PR shows up to ~3.4x improvement for 
>> 32-bit datatypes (int, float) as shown below.
>> 
>> **Note:** This PR also improves the performance of AVX512 sort by up to 35%.
>> 
>> Benchmark (Serial Sort) | Size | Baseline (us/op) | AVX2 (us/op) | Speedup
>> -- | -- | -- | -- | --
>> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
>> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
>> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
>> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
>> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
>> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
>> ArraysSort.intSort | 1 | 330.065 | 43.383 | 7.6
>> ArraysSort.intSort | 10 | 4099.527 | 778.943 | 5.3
>> ArraysSort.intSort | 100 | 49150.16 | 9634.335 | 5.1
>> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
>> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
>> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
>> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
>> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
>> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
>> ArraysSort.floatSort | 1 | 323.136 | 57.175 | 5.7
>> ArraysSort.floatSort | 10 | 4299.519 | 862.63 | 5.0
>> ArraysSort.floatSort | 100 | 50889.4 | 10972.19 | 4.6
>> 
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   add missing header files

LGTM, thanks!

-

Marked as reviewed by jbhateja (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16534#pullrequestreview-1768255412


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v10]

2023-12-06 Thread Srinivas Vamsi Parasa
On Wed, 6 Dec 2023 17:42:39 GMT, Jatin Bhateja  wrote:

> LGTM, thanks!

Thanks Jatin!

-

PR Comment: https://git.openjdk.org/jdk/pull/16534#issuecomment-1843372385


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v9]

2023-12-06 Thread Srinivas Vamsi Parasa
On Tue, 5 Dec 2023 19:19:23 GMT, Jatin Bhateja  wrote:

>> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   remove unused avx2 64 bit sort functions; add assertions
>
> src/java.base/linux/native/libsimdsort/avx2-linux-qsort.cpp line 50:
> 
>> 48: case JVM_T_DOUBLE:
>> 49: avx2_fast_sort((double*)array, from_index, to_index, 
>> INSERTION_SORT_THRESHOLD_64BIT);
>> 50: break;
> 
> Please add safe assertions for missing types.

This comment refers to an older, now outdated commit. The assertions have since 
been added for the other cases.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417706670


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v8]

2023-12-06 Thread Srinivas Vamsi Parasa
On Tue, 5 Dec 2023 19:37:34 GMT, Jatin Bhateja  wrote:

>> Srinivas Vamsi Parasa has updated the pull request with a new target base 
>> due to a merge or a rebase. The incremental webrev excludes the unrelated 
>> changes brought in by the merge/rebase. The pull request contains 17 
>> additional commits since the last revision:
>> 
>>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into simdsort
>>  - add GCC version guards
>>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into simdsort
>>  - Remove C++17 from C flags
>>  - add avoid masked stores operation
>>  - update the code to check for supported simd sort cpus
>>  - Disable AVX2 sort for 64-bit types
>>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into simdsort
>>  - fix jcheck failures due to windows encoding
>>  - fix carriage return and change insertion sort thresholds
>>  - ... and 7 more: https://git.openjdk.org/jdk/compare/d4151e5b...bc590d9f
>
> src/java.base/linux/native/libsimdsort/avx2-emu-funcs.hpp line 64:
> 
>> 62: }
>> 63: return lut;
>> 64: }();
> 
> Lut64 is only needed for compress64 emulation, so it can be removed.

Removed in the latest commit...

> src/java.base/linux/native/libsimdsort/avx2-emu-funcs.hpp line 234:
> 
>> 232: 
>> 233: vtype::mask_storeu(leftStore, left, temp);
>> 234: }
> 
> Can be removed if not being used.

Removed in the latest commit...

> src/java.base/linux/native/libsimdsort/avx2-emu-funcs.hpp line 277:
> 
>> 275: 
>> 276: return _mm_popcnt_u32(shortMask);
>> 277: }
> 
> Can be removed if not being used.

Removed in the latest commit...

> src/java.base/linux/native/libsimdsort/avx2-linux-qsort.cpp line 44:
> 
>> 42: break;
>> 43: case JVM_T_FLOAT:
>> 44: avx2_fast_sort((float*)array, from_index, to_index, 
>> INSERTION_SORT_THRESHOLD_32BIT);
> 
> Assertions for unsupported types.

Added in the latest commit...

> src/java.base/linux/native/libsimdsort/avx2-linux-qsort.cpp line 56:
> 
>> 54: case JVM_T_FLOAT:
>> 55: avx2_fast_partition((float*)array, from_index, to_index, 
>> pivot_indices, index_pivot1, index_pivot2);
>> 56: break;
> 
> Please add assertion for unsupported types.

Added in the latest commit...
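
For readers following the review comments above, here is a minimal sketch of what "assertions for unsupported types" can look like in a type-dispatching switch. It is not the code from the PR: the `DemoType` tags and `demo_sort_dispatch` are hypothetical names, `std::sort` stands in for the AVX2 kernels, and the half-open range `[from_index, to_index)` is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical element-type tags; the PR itself switches on JVM_T_* constants.
enum DemoType { DEMO_T_INT, DEMO_T_FLOAT, DEMO_T_LONG, DEMO_T_DOUBLE };

// Sorts array[from_index, to_index) for the supported 32-bit element types.
void demo_sort_dispatch(DemoType type, void* array,
                        int32_t from_index, int32_t to_index) {
    switch (type) {
        case DEMO_T_INT: {
            int32_t* p = static_cast<int32_t*>(array);
            std::sort(p + from_index, p + to_index);  // placeholder for the SIMD kernel
            break;
        }
        case DEMO_T_FLOAT: {
            float* p = static_cast<float*>(array);
            std::sort(p + from_index, p + to_index);  // placeholder for the SIMD kernel
            break;
        }
        default:
            // Types without a kernel (here: the 64-bit ones) fail loudly in
            // debug builds instead of silently leaving the array unsorted.
            assert(false && "Unexpected type");
    }
}
```

Calling `demo_sort_dispatch(DEMO_T_INT, data, 0, n)` sorts the first `n` ints, while passing `DEMO_T_DOUBLE` trips the assertion in a debug build.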

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417701182
PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417702999
PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417702251
PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417701469
PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417701705


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v9]

2023-12-06 Thread Srinivas Vamsi Parasa
On Wed, 6 Dec 2023 11:59:19 GMT, Magnus Ihse Bursie  wrote:

>> Hi Magnus (@magicus),
>>  
>>> Are you saying that when compiling with GCC 6, it will just silently ignore 
>>> `-std=c++17`? I'd have assumed that it printed a warning or error about an 
>>> unknown or invalid option, if C++17 is not supported.
>> 
>> The GCC compiler for versions 6 (and even 5) silently ignores the flag 
>> `-std=c++17`. It does not print any warning or error. I tested it with a toy 
>> C++ program and also by building OpenJDK using GCC 6. 
>> 
>>> You can't check for if compiler options should be enabled or not inside 
>>> source code files.
>> 
>> What I meant was that there are #ifdef guards using predefined macros in the 
>> C++ source code that check the GCC version and make the simdsort code 
>> available for compilation, or not, based on that version:
>> 
>> 
>> // src/java.base/linux/native/libsimdsort/simdsort-support.hpp
>> #if defined(_LP64) && (defined(__GNUC__) && ((__GNUC__ > 7) || ((__GNUC__ == 7) && (__GNUC_MINOR__ >= 5))))
>> #define __SIMDSORT_SUPPORTED_LINUX
>> #endif
>> 
>> 
>> 
>> //src/java.base/linux/native/libsimdsort/avx2-linux-qsort.cpp
>> #include "simdsort-support.hpp"
>> #ifdef __SIMDSORT_SUPPORTED_LINUX
>> 
>> #endif
>
> Okay, then I guess I am fine with this.

Thank you Magnus!
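
As a rough, self-contained illustration of the guard pattern quoted above: only the version condition is taken from the thread. The macro name `DEMO_SIMDSORT_SUPPORTED` and the functions `demo_sort_backend` and `demo_sort` are hypothetical, and `std::sort` is a placeholder for the real kernels.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Same condition as the guard quoted in the thread, under a demo macro name.
#if defined(_LP64) && (defined(__GNUC__) && ((__GNUC__ > 7) || ((__GNUC__ == 7) && (__GNUC_MINOR__ >= 5))))
#define DEMO_SIMDSORT_SUPPORTED
#endif

#ifdef DEMO_SIMDSORT_SUPPORTED
// A real build would compile the SIMD sort sources inside this branch.
inline std::string demo_sort_backend() { return "simd"; }
#else
// With an older or different compiler the SIMD code is simply not compiled.
inline std::string demo_sort_backend() { return "fallback"; }
#endif

// Placeholder sort so the sketch behaves the same whichever branch was chosen.
inline void demo_sort(int* data, std::size_t n) {
    std::sort(data, data + n);
}
```

The point of the pattern is that the decision is made by the preprocessor from the compiler's predefined macros, so no compiler flag has to be probed from inside the source file.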

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417707661


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v9]

2023-12-06 Thread Srinivas Vamsi Parasa
> The goal is to develop faster sort routines for x86_64 CPUs by taking 
> advantage of AVX2 instructions. This enhancement provides an order of 
> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
> 
> For serial sort on random data, this PR shows up to ~7.5x improvement for 
> 32-bit datatypes (int, float) on an Intel TigerLake machine as shown in the 
> performance data below.
> 
> For parallel sort on random data, this PR shows up to ~3.4x improvement for 
> 32-bit datatypes (int, float) as shown below.
> 
> **Note:** This PR also improves the performance of AVX512 sort by up to 35%.
> 
> Benchmark (Serial Sort) | Size | Baseline (us/op) | AVX2 (us/op) | Speedup
> -- | -- | -- | -- | --
> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
> ArraysSort.intSort | 1 | 330.065 | 43.383 | 7.6
> ArraysSort.intSort | 10 | 4099.527 | 778.943 | 5.3
> ArraysSort.intSort | 100 | 49150.16 | 9634.335 | 5.1
> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
> ArraysSort.floatSort | 1 | 323.136 | 57.175 | 5.7
> ArraysSort.floatSort | 10 | 4299.519 | 862.63 | 5.0
> ArraysSort.floatSort | 100 | 50889.4 | 10972.19 | 4.6

Srinivas Vamsi Parasa has updated the pull request incrementally with one 
additional commit since the last revision:

  remove unused avx2 64 bit sort functions; add assertions

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16534/files
  - new: https://git.openjdk.org/jdk/pull/16534/files/bc590d9f..c143e0b9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16534&range=08
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16534&range=07-08

  Stats: 128 lines in 4 files changed: 12 ins; 116 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/16534.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16534/head:pull/16534

PR: https://git.openjdk.org/jdk/pull/16534


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v8]

2023-12-06 Thread Srinivas Vamsi Parasa
On Tue, 5 Dec 2023 19:33:48 GMT, Jatin Bhateja  wrote:

>> Srinivas Vamsi Parasa has updated the pull request with a new target base 
>> due to a merge or a rebase. The incremental webrev excludes the unrelated 
>> changes brought in by the merge/rebase. The pull request contains 17 
>> additional commits since the last revision:
>> 
>>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into simdsort
>>  - add GCC version guards
>>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into simdsort
>>  - Remove C++17 from C flags
>>  - add avoid masked stores operation
>>  - update the code to check for supported simd sort cpus
>>  - Disable AVX2 sort for 64-bit types
>>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into simdsort
>>  - fix jcheck failures due to windows encoding
>>  - fix carriage return and change insertion sort thresholds
>>  - ... and 7 more: https://git.openjdk.org/jdk/compare/d8b29378...bc590d9f
>
> src/java.base/linux/native/libsimdsort/avx512-32bit-qsort.hpp line 235:
> 
>> 233: return avx512_double_compressstore>(
>> 234: left_addr, right_addr, k, reg);
>> 235: }
> 
> Can be removed.

This is needed for AVX512 sort...

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417690992


Re: RFR: 8321374: Add a configure option to explicitly set CompanyName property in VersionInfo resource for Windows exe/dll

2023-12-06 Thread Frederic Thevenet
On Tue, 5 Dec 2023 11:15:18 GMT, Frederic Thevenet  
wrote:

> When building OpenJDK on the Windows platform, version information is 
> embedded as compiled resources into every native library and executable. 
> These resources typically contain version numbers, copyright information, 
> product name and vendor name (called "Company name" in this context).
> 
> Currently, the value for "Company name" is always the same as that of 
> "vendor-name", but it would be really useful in some build scenarios to 
> allow for the "company name" property embedded in the VersionInfo compiled 
> resources to be different from the "vendor name" property that is used within 
> the JVM (e.g. java -version). 
> 
> This PR adds a "--with-jdk-rc-company-name" configure option which can be 
> used to set "Company name" to its own value, independently of "vendor-name", 
> in the same fashion as the existing "--with-jdk-rc-name", used to override 
> the values of "File description" and "Product name".
> 
> If "--with-jdk-rc-company-name" isn't used, then the vendor-name value is 
> used instead, reproducing the same behavior as without this patch.

NB: Test "GathererTest::testMassivelyComposedGatherers" in tier1 on linux-86 is 
failing for reasons unrelated to this PR.

-

PR Comment: https://git.openjdk.org/jdk/pull/16972#issuecomment-1843058806


RFR: 8321374: Add a configure option to explicitly set CompanyName property in VersionInfo resource for Windows exe/dll

2023-12-06 Thread Frederic Thevenet
When building OpenJDK on the Windows platform, version information is embedded 
as compiled resources into every native library and executable. These resources 
typically contain version numbers, copyright information, product name and 
vendor name (called "Company name" in this context).

Currently, the value for "Company name" is always the same as that of 
"vendor-name", but it would be really useful in some build scenarios to allow 
for the "company name" property embedded in the VersionInfo compiled resources 
to be different from the "vendor name" property that is used within the JVM 
(e.g. java -version). 

This PR adds a "--with-jdk-rc-company-name" configure option which can be used 
to set "Company name" to its own value, independently of "vendor-name", in the 
same fashion as the existing "--with-jdk-rc-name", used to override the values 
of "File description" and "Product name".

If "--with-jdk-rc-company-name" isn't used, then the vendor-name value is used 
instead, reproducing the same behavior as without this patch.
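
The fallback rule in the last paragraph can be written down as a small sketch. This is only an illustration of the described behavior, not the configure logic itself: `demo_resolve_company_name` is a hypothetical helper, and an empty string is assumed to mean that the option was not given.

```cpp
#include <string>

// Hypothetical helper mirroring the described default: an explicit company
// name wins; otherwise the vendor name is reused, as before the patch.
std::string demo_resolve_company_name(const std::string& rc_company_name,
                                      const std::string& vendor_name) {
    // Assumption of this sketch: empty means "option not specified".
    return rc_company_name.empty() ? vendor_name : rc_company_name;
}
```

With this rule, builds that never pass the new option keep producing the same "Company name" value they produced before.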

-

Commit messages:
 - 8321374: Add a configure option to explicitly set CompanyName property in 
VersionInfo resource for Windows exe/dll

Changes: https://git.openjdk.org/jdk/pull/16972/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16972&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8321374
  Stats: 11 lines in 3 files changed: 10 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/16972.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16972/head:pull/16972

PR: https://git.openjdk.org/jdk/pull/16972


Re: Cross compilation discussion

2023-12-06 Thread Andrew Haley

On 12/6/23 14:35, erik.joels...@oracle.com wrote:

On 12/6/23 04:18, Andrew Haley wrote:

On 11/29/23 14:31, erik.joels...@oracle.com wrote:

Perhaps what we need to do is separate the notion of needing a separate
BUILD_JDK from the notion of cross compiling.


Isn't that what --with-sysroot= usually means? In that case, you're
building against an incompatible set of libraries. By definition,
because if you're building against a compatible set of libraries you
don't have a different sysroot.


No, that is not always the case, and was the point of my previous email.
When we build the JDK at Oracle, we provide a sysroot of an older Linux
version (the least common denominator for all our supported Linux
distros/versions). The goal is to produce a single JDK distribution that
works on all our supported Linux versions. Today we run the build on OL8
and "cross compile" it to OL6 using a devkit/sysroot. The JDK produced
is compatible with OL6, OL7, OL8 and OL9. Because of this we haven't
considered it a cross compilation setup, while it does have some aspects
of cross compiling. There is however no need to deal with a separate
BUILD_JDK in this scenario. Also the build, host and target tuples in
autoconf are the same.


OK. So you are doing what I'd usually call cross compiling, sort of, but
you don't want to treat it as a cross-compiled build. It's an unusual
case, in that it is cross compiled, but it is also expected to be able to
run on the build system.

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671



Re: Cross compilation discussion

2023-12-06 Thread erik . joelsson

On 12/6/23 04:18, Andrew Haley wrote:

On 11/29/23 14:31, erik.joels...@oracle.com wrote:

Perhaps what we need to do is separate the notion of needing a separate
BUILD_JDK from the notion of cross compiling.


Isn't that what --with-sysroot= usually means? In that case, you're
building against an incompatible set of libraries. By definition,
because if you're building against a compatible set of libraries you
don't have a different sysroot.

No, that is not always the case, and was the point of my previous email. 
When we build the JDK at Oracle, we provide a sysroot of an older Linux 
version (the least common denominator for all our supported Linux 
distros/versions). The goal is to produce a single JDK distribution that 
works on all our supported Linux versions. Today we run the build on OL8 
and "cross compile" it to OL6 using a devkit/sysroot. The JDK produced 
is compatible with OL6, OL7, OL8 and OL9. Because of this we haven't 
considered it a cross compilation setup, while it does have some aspects 
of cross compiling. There is however no need to deal with a separate 
BUILD_JDK in this scenario. Also the build, host and target tuples in 
autoconf are the same.


/Erik



Re: RFR: 8321373: Build should use LC_ALL=C.UTF-8

2023-12-06 Thread Magnus Ihse Bursie
On Tue, 5 Dec 2023 10:35:05 GMT, Magnus Ihse Bursie  wrote:

> We're currently setting LC_ALL=C. Not all tools will default to UTF-8 as 
> their encoding of choice when they see this locale; some use an arbitrary 
> encoding, which might not properly handle all UTF-8 characters. Since in 
> practice all our encoding is UTF-8, we should tell our tools this as well.
> 
> This will at least have effect on how Java treats path names including 
> unicode characters.

I'll make such a run. But I discovered now that C.UTF-8 is not available on 
(some?) macs, so I guess I'll need to add a configure check to determine if we 
should use C or C.UTF-8. The latter is the norm on Linux nowadays (or, to be 
precise, locales should in general be specified as `.UTF-8`).
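
To make the idea of such a check concrete, here is a small sketch of probing for the locale at runtime. It is not the planned configure check (which would live in the build system); `demo_pick_build_locale` is a hypothetical name, and the sketch relies only on the standard behavior that `std::setlocale` returns a null pointer when the requested locale cannot be set.

```cpp
#include <clocale>
#include <string>

// Returns "C.UTF-8" if the C library can activate it, otherwise "C".
std::string demo_pick_build_locale() {
    // Remember the current locale so the probe has no lasting side effect.
    const char* current = std::setlocale(LC_ALL, nullptr);
    const std::string saved = current ? current : "C";

    // setlocale returns nullptr when the requested locale is unavailable.
    const bool has_c_utf8 = std::setlocale(LC_ALL, "C.UTF-8") != nullptr;

    // Restore whatever was active before the probe.
    std::setlocale(LC_ALL, saved.c_str());
    return has_c_utf8 ? "C.UTF-8" : "C";
}
```

On a system without `C.UTF-8` the function returns plain `C`, which matches the fallback described above.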

-

PR Comment: https://git.openjdk.org/jdk/pull/16971#issuecomment-1842772271


Re: Cross compilation discussion

2023-12-06 Thread Andrew Haley

On 11/29/23 14:31, erik.joels...@oracle.com wrote:

Perhaps what we need to do is separate the notion of needing a separate
BUILD_JDK from the notion of cross compiling.


Isn't that what --with-sysroot= usually means? In that case, you're
building against an incompatible set of libraries. By definition,
because if you're building against a compatible set of libraries you
don't have a different sysroot.

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671



Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v8]

2023-12-06 Thread Magnus Ihse Bursie
On Mon, 4 Dec 2023 22:15:24 GMT, Srinivas Vamsi Parasa  wrote:

>> The goal is to develop faster sort routines for x86_64 CPUs by taking 
>> advantage of AVX2 instructions. This enhancement provides an order of 
>> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>> 
>> For serial sort on random data, this PR shows up to ~7.5x improvement for 
>> 32-bit datatypes (int, float) on an Intel TigerLake machine as shown in the 
>> performance data below.
>> 
>> For parallel sort on random data, this PR shows up to ~3.4x improvement for 
>> 32-bit datatypes (int, float) as shown below.
>> 
>> **Note:** This PR also improves the performance of AVX512 sort by up to 35%.
>> 
>> Benchmark (Serial Sort) | Size | Baseline (us/op) | AVX2 (us/op) | Speedup
>> -- | -- | -- | -- | --
>> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
>> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
>> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
>> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
>> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
>> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
>> ArraysSort.intSort | 1 | 330.065 | 43.383 | 7.6
>> ArraysSort.intSort | 10 | 4099.527 | 778.943 | 5.3
>> ArraysSort.intSort | 100 | 49150.16 | 9634.335 | 5.1
>> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
>> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
>> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
>> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
>> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
>> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
>> ArraysSort.floatSort | 1 | 323.136 | 57.175 | 5.7
>> ArraysSort.floatSort | 10 | 4299.519 | 862.63 | 5.0
>> ArraysSort.floatSort | 100 | 50889.4 | 10972.19 | 4.6
>> 
> Srinivas Vamsi Parasa has updated the pull request with a new target base due 
> to a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains 17 additional 
> commits since the last revision:
> 
>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into simdsort
>  - add GCC version guards
>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into simdsort
>  - Remove C++17 from C flags
>  - add avoid masked stores operation
>  - update the code to check for supported simd sort cpus
>  - Disable AVX2 sort for 64-bit types
>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into simdsort
>  - fix jcheck failures due to windows encoding
>  - fix carriage return and change insertion sort thresholds
>  - ... and 7 more: https://git.openjdk.org/jdk/compare/dbbc5f0a...bc590d9f

Build changes look fine. You will still need the usual 2 reviewers from hotspot.

-

Marked as reviewed by ihse (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16534#pullrequestreview-1767374468


Re: RFR: 8319577: x86_64 AVX2 intrinsics for Arrays.sort methods (int, float arrays) [v8]

2023-12-06 Thread Magnus Ihse Bursie
On Tue, 5 Dec 2023 17:26:06 GMT, Srinivas Vamsi Parasa  wrote:

>> That sounds weird. You can't check for if compiler options should be enabled 
>> or not inside source code files.
>> 
>> Are you saying that when compiling with GCC 6, it will just silently ignore 
>> `-std=c++17`? I'd have assumed that it printed a warning or error about an 
>> unknown or invalid option, if C++17 is not supported.
>
> Hi Magnus (@magicus),
>  
>> Are you saying that when compiling with GCC 6, it will just silently ignore 
>> `-std=c++17`? I'd have assumed that it printed a warning or error about an 
>> unknown or invalid option, if C++17 is not supported.
> 
> The GCC compiler for versions 6 (and even 5) silently ignores the flag 
> `-std=c++17`. It does not print any warning or error. I tested it with a toy 
> C++ program and also by building OpenJDK using GCC 6. 
> 
>> You can't check for if compiler options should be enabled or not inside 
>> source code files.
> 
> What I meant was that there are #ifdef guards using predefined macros in the 
> C++ source code that check the GCC version and make the simdsort code 
> available for compilation, or not, based on that version:
> 
> 
> // src/java.base/linux/native/libsimdsort/simdsort-support.hpp
> #if defined(_LP64) && (defined(__GNUC__) && ((__GNUC__ > 7) || ((__GNUC__ == 7) && (__GNUC_MINOR__ >= 5))))
> #define __SIMDSORT_SUPPORTED_LINUX
> #endif
> 
> 
> 
> //src/java.base/linux/native/libsimdsort/avx2-linux-qsort.cpp
> #include "simdsort-support.hpp"
> #ifdef __SIMDSORT_SUPPORTED_LINUX
> 
> #endif

Okay, then I guess I am fine with this.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1417170882


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v7]

2023-12-06 Thread Magnus Ihse Bursie
On Wed, 6 Dec 2023 09:14:58 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on the AArch64 
>> platform, which causes a large performance gap on AArch64. Note that those 
>> APIs are optimized by the C2 compiler on X86 platforms by calling Intel's 
>> SVML code [1]. To close the gap, we would like to optimize these APIs for 
>> AArch64 by calling a third-party vector library called libsleef [2], which 
>> is available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match the Vector API's requirements 
>> and implement the math ops on AArch64, we 1) by default call the libsleef 
>> stubs that provide 1.0 ULP accuracy and use FMA instructions for most of the 
>> operations, and 2) add the vector calling convention to apply to the runtime 
>> calls to stub code in libsleef. Note that for those APIs for which libsleef 
>> does not support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help load the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to specify the libsleef path/name explicitly. By default, 
>> it points to the system-installed library. If the library does not exist or 
>> dynamically loading it at runtime fails, the math vector ops will fall back 
>> to the default scalar version without error. But a warning is printed if a 
>> nonexistent library is specified explicitly.
>> 
>> Note that this is a part of the originally proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedback from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add "--with-libsleef-lib" and "--with-libsleef-include" options

All the makefile changes we've discussed previously now look good. However, I 
just noticed the additional -f flag. Why are you not exporting the functions 
from source code instead? That is the way we normally do it in JDK libraries. 
In your case, it seems like you only need to add the export to the macro.
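
A minimal sketch of what exporting from source code means here, assuming the library is otherwise compiled with hidden default visibility. The macro `DEMO_EXPORT` and the function names are hypothetical and only illustrate the pattern, not the vmath library itself.

```cpp
// Hypothetical export macro: marks individual symbols as visible so the
// whole library does not need to be built with -fvisibility=default.
#if defined(_WIN32)
#define DEMO_EXPORT __declspec(dllexport)
#elif defined(__GNUC__)
#define DEMO_EXPORT __attribute__((visibility("default")))
#else
#define DEMO_EXPORT
#endif

// Exported entry point: stays visible even under -fvisibility=hidden.
extern "C" DEMO_EXPORT double demo_vmath_twice(double x) {
    return 2.0 * x;
}

// Internal helper: carries no export annotation, so with hidden default
// visibility it is not part of the shared library's public symbol table.
double demo_internal_half(double x) {
    return 0.5 * x;
}
```

If the exported functions are generated through a macro, adding the annotation to that macro covers all of them in one place.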

-

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1842720243


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v7]

2023-12-06 Thread Magnus Ihse Bursie
On Wed, 6 Dec 2023 09:14:58 GMT, Xiaohong Gong  wrote:

>> Currently the vector floating-point math APIs like 
>> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, 
>> which causes large performance gap on AArch64. Note that those APIs are 
>> optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
>> To close the gap, we would like to optimize these APIs for AArch64 by 
>> calling a third-party vector library called libsleef [2], which are 
>> available in mainstream Linux distros (e.g. [3] [4]).
>> 
>> SLEEF supports multiple accuracies. To match Vector API's requirement and 
>> implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA 
>> instructions used stubs in libsleef for most of the operations by default, 
>> and 2) add the vector calling convention to apply with the runtime calls to 
>> stub code in libsleef. Note that for those APIs that libsleef does not 
>> support 1.0 ULP, we choose 0.5 ULP instead.
>> 
>> To help loading the expected libsleef library, this patch also adds an 
>> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
>> People can use it to denote the libsleef path/name explicitly. By default, 
>> it points to the system installed library. If the library does not exist or 
>> the dynamic loading of it in runtime fails, the math vector ops will 
>> fall-back to use the default scalar version without error. But a warning is 
>> printed out if people specifies a nonexistent library explicitly.
>> 
>> Note that this is a part of the original proposed patch in panama-dev [5], 
>> just with some initial review comments addressed. And now we'd like to get 
>> some wider feedbacks from more hotspot experts.
>> 
>> [1] https://github.com/openjdk/jdk/pull/3638
>> [2] https://sleef.org/
>> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
>> [4] https://packages.debian.org/bookworm/libsleef3
>> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
>
> Xiaohong Gong has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add "--with-libsleef-lib" and "--with-libsleef-include" options

make/modules/jdk.incubator.vector/Lib.gmk line 45:

> 43:   $(eval $(call SetupJdkLibrary, BUILD_LIBVMATH, \
> 44:   NAME := vmath, \
> 45:   CFLAGS := $(CFLAGS_JDKLIB) $(LIBSLEEF_CFLAGS) -fvisibility=default, 
> \

Why `-fvisibility=default`? (Sorry, only noticed this now)

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1417156921


Re: Compile Errors in OpenJDK 11 with Non-English Characters in Compilation Path

2023-12-06 Thread last last
The pull request at https://github.com/openjdk/jdk/pull/16971/files has
effectively addressed the issue I was facing.
Thank you for your time and effort in addressing this matter!

Magnus Ihse Bursie wrote on Tue, Dec 5, 2023 at 18:36:

> On closer inspection, it would probably be more prudent to set the
> value of LC_ALL to C.UTF-8 in the build. It will likely help you with your
> problem. I have published a PR at
> https://github.com/openjdk/jdk/pull/16971 (JDK-8321373).
>
> /Magnus
> On 2023-11-24 13:08, Magnus Ihse Bursie wrote:
>
> Are you running on Ubuntu proper, or Ubuntu in WSL? If it is the former,
> then it is not the same problem as the other poster had, who was running on
> Windows.
>
> The JDK build uses the "C" locale. This should be set automatically by the
> build system, so it should just work. But in case there is some bug, try
> exporting LC_ALL=C before building and see if it helps.
>
> Also, the point of LC_ALL is that you can specify it instead of all the
> detailed LC_* variables...
>
> /Magnus
> On 2023-11-24 08:41, last last wrote:
>
> I am currently facing the same error when the build path includes Chinese
> characters, after setting the environment variable with "export
> LC_ALL=zh_CN.UTF-8". The testing environment is also Ubuntu 20.04.
> The system locale settings are as follows:
>
>   LANG=zh_CN.UTF-8
>   LANGUAGE=zh_CN:zh
>   LC_CTYPE="zh_CN.UTF-8"
>   LC_NUMERIC="zh_CN.UTF-8"
>   LC_TIME="zh_CN.UTF-8"
>   LC_COLLATE="zh_CN.UTF-8"
>   LC_MONETARY="zh_CN.UTF-8"
>   LC_MESSAGES="zh_CN.UTF-8"
>   LC_PAPER="zh_CN.UTF-8"
>   LC_NAME="zh_CN.UTF-8"
>   LC_ADDRESS="zh_CN.UTF-8"
>   LC_TELEPHONE="zh_CN.UTF-8"
>   LC_MEASUREMENT="zh_CN.UTF-8"
>   LC_IDENTIFICATION="zh_CN.UTF-8"
>   LC_ALL=zh_CN.UTF-8
>
> I'm wondering if there are any other system configurations that could help
> resolve this problem. Your insights would be greatly appreciated.
>
> Magnus Ihse Bursie wrote on Thu, Nov 23, 2023 at 22:30:
>
>> Congratulations! You get to test drive the recently updated build
>> instructions for Windows locale requirements! :-)
>>
>> Please see
>>
>> https://github.com/openjdk/jdk/blob/master/doc/building.md#locale-requirements
>>
>> In particular, I believe you need to set your system locale.
>>
>> Please report back if that solves the problem or not.
>>
>> /Magnus
>>
>>
>> On 2023-11-23 03:51, last last wrote:
>> > Hi all,
>> >
>> > I tried to build an OpenJDK 11 fastdebug image under a path that includes
>> > Chinese characters. My build path is "/home/kylin/图片/jdk11u-dev", but I
>> > saw errors like the following:
>> >
>> > Exception in thread "main" java.nio.file.InvalidPathException:
>> > Malformed input or input contains unmappable characters:
>> > /home/kylin/??/jdk11u-dev/build/linux-aarch64-normal-server-fastdebug/buildtools/langtools_tools_classes/_the.BUILD_TOOLS_LANGTOOLS_batch.tmp
>> > at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:145)
>> > at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:69)
>> > at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279)
>> > at java.base/java.nio.file.Path.of(Path.java:147)
>> > at java.base/java.nio.file.Paths.get(Paths.java:69)
>> > at jdk.compiler/com.sun.tools.javac.main.CommandLine.loadCmdFile(CommandLine.java:128)
>> > at jdk.compiler/com.sun.tools.javac.main.CommandLine.appendParsedCommandArgs(CommandLine.java:71)
>> > at jdk.compiler/com.sun.tools.javac.main.CommandLine.parse(CommandLine.java:102)
>> > at jdk.compiler/com.sun.tools.javac.main.CommandLine.parse(CommandLine.java:123)
>> > at jdk.compiler/com.sun.tools.javac.main.Main.compile(Main.java:215)
>> > at jdk.compiler/com.sun.tools.javac.main.Main.compile(Main.java:170)
>> > at jdk.compiler/com.sun.tools.javac.Main.compile(Main.java:57)
>> > at jdk.compiler/com.sun.tools.javac.Main.main(Main.java:43)
>> > make[3]: *** [ToolsLangtools.gmk:40: /home/kylin/图片/jdk11u-dev/build/linux-aarch64-normal-server-fastdebug/buildtools/langtools_tools_classes/_the.BUILD_TOOLS_LANGTOOLS_batch] Error 1
>> > make[2]: *** [make/Main.gmk:73: buildtools-langtools] Error 2
>> > make[2]: *** Waiting for unfinished jobs
>> > Exception in thread "main" java.nio.file.InvalidPathException:
>> > Malformed input or input contains unmappable characters:
>> > /home/kylin/??/jdk11u-dev/build/linux-aarch64-normal-server-fastdebug/hotspot/variant-server/tools/jvmti/_the.BUILD_JVMTI_TOOLS_batch.tmp
>> > at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:145)
>> > at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:69)
>> > at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279)
>> > at java.base/java.nio.file.Path.of(Path.java:147)
>> > at java.base/java.nio.file.Paths.get(Paths.java:69)
>> > at jdk.compiler/com.sun.tools.javac.main.CommandLine.loadCmdFile(CommandLine.java:128)
>> > at jdk.compiler/com.sun.tools.javac.main.CommandLine.appendParsedCommandArgs(CommandLine.java:71)
>> > at jdk.compiler/com.sun.tools.javac.main.CommandLine.parse(CommandLine.java:102)
>> > at
>> >
>> 
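
A note on the `??` in the paths above: the two Chinese characters in the directory name cannot be represented in the encoding the JVM was using for file names, so they show up as replacement characters and a strict encoder rejects them. The sketch below illustrates that behaviour with public `java.nio.charset` APIs only. It does not reproduce the JDK-internal code in `UnixPath.encode`, and the class and method names (`UnmappablePathDemo`, `lenient`, `encodable`) are made up for this illustration. US-ASCII stands in for whatever encoding a "C" locale selects, which is an assumption.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDK or of the build system.
final class UnmappablePathDemo {

    // Lenient round trip: String.getBytes replaces every unmappable
    // character with the charset's replacement byte ('?' for US-ASCII).
    static String lenient(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    // Strict check: an encoder configured with REPORT throws instead of
    // replacing, so we can tell whether the whole string is representable.
    static boolean encodable(String s, Charset cs) {
        try {
            cs.newEncoder()
              .onMalformedInput(CodingErrorAction.REPORT)
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .encode(CharBuffer.wrap(s));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // U+56FE U+7247 is the directory name from the report, written
        // with escapes so the source file itself stays pure ASCII.
        String dir = "/home/kylin/\u56fe\u7247/jdk11u-dev";
        System.out.println(lenient(dir, StandardCharsets.US_ASCII));
        System.out.println(encodable(dir, StandardCharsets.US_ASCII));
        System.out.println(encodable(dir, StandardCharsets.UTF_8));
    }
}
```

With a UTF-8 based locale such as the C.UTF-8 value proposed in the PR, the same path is representable, which is consistent with the reporter's confirmation that the change resolved the problem.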

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v7]

2023-12-06 Thread Xiaohong Gong
> Currently the vector floating-point math APIs like 
> `VectorOperators.SIN/COS/TAN...` are not intrinsified on the AArch64 
> platform, which causes a large performance gap there. Note that those APIs 
> are optimized by the C2 compiler on x86 platforms by calling Intel's SVML 
> code [1]. To close the gap, we would like to optimize these APIs for AArch64 
> by calling a third-party vector library called libsleef [2], which is 
> available in mainstream Linux distros (e.g. [3] [4]).
> 
> SLEEF supports multiple accuracies. To match the Vector API's requirements 
> and implement the math ops on AArch64, we 1) by default call the libsleef 
> stubs that provide 1.0 ULP accuracy and use FMA instructions for most of the 
> operations, and 2) add the vector calling convention for the runtime calls 
> to the stub code in libsleef. Note that for those APIs for which libsleef 
> does not support 1.0 ULP, we choose 0.5 ULP instead.
> 
> To help load the expected libsleef library, this patch also adds an 
> experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. 
> People can use it to specify the libsleef path/name explicitly. By default, it 
> points to the system-installed library. If the library does not exist or 
> loading it dynamically at runtime fails, the vector math ops fall back to the 
> default scalar version without error. However, a warning is printed if a 
> nonexistent library is specified explicitly.
> 
> Note that this is part of the patch originally proposed in panama-dev [5], 
> with some initial review comments addressed. We would now like to get 
> wider feedback from more HotSpot experts.
> 
> [1] https://github.com/openjdk/jdk/pull/3638
> [2] https://sleef.org/
> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/
> [4] https://packages.debian.org/bookworm/libsleef3
> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html
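
For readers less familiar with the ULP bounds mentioned in the description: an error of N ULP means the result differs from the reference by at most N units in the last place of the reference value. The sketch below shows one simple way to measure that for `double` values with `Math.ulp`. It is only an illustration: the class and method names are hypothetical, and a real accuracy test would compare against a higher-precision reference rather than another `double`.

```java
// Hypothetical helper for illustration; not taken from the patch or from SLEEF.
final class UlpErrorDemo {

    // Distance between approx and reference, in units of ulp(reference).
    static double ulpError(double approx, double reference) {
        return Math.abs(approx - reference) / Math.ulp(reference);
    }

    // True if approx is within maxUlps units in the last place of reference.
    static boolean within(double approx, double reference, double maxUlps) {
        return ulpError(approx, reference) <= maxUlps;
    }

    public static void main(String[] args) {
        double reference = 1.0;
        double oneStep = Math.nextUp(reference);   // adjacent double above 1.0
        double twoSteps = Math.nextUp(oneStep);    // two doubles above 1.0

        System.out.println(within(oneStep, reference, 1.0));   // inside a 1.0 ULP bound
        System.out.println(within(twoSteps, reference, 1.0));  // outside a 1.0 ULP bound
        System.out.println(within(oneStep, reference, 0.5));   // outside a 0.5 ULP bound
    }
}
```

Note that 0.5 ULP is the tighter of the two bounds, so choosing it where no 1.0 ULP variant exists keeps the accuracy at least as good as the 1.0 ULP target.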

Xiaohong Gong has updated the pull request incrementally with one additional 
commit since the last revision:

  Add "--with-libsleef-lib" and "--with-libsleef-include" options

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16234/files
  - new: https://git.openjdk.org/jdk/pull/16234/files/ee5caf6d..f3ff0672

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=05-06

  Stats: 124 lines in 3 files changed: 67 ins; 33 del; 24 mod
  Patch: https://git.openjdk.org/jdk/pull/16234.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16234/head:pull/16234

PR: https://git.openjdk.org/jdk/pull/16234
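
The loading behaviour described for `-XX:UseSleefLib` (use the library if it loads, fall back silently otherwise, warn only when an explicitly given library is missing) can be pictured with a small Java analogy. This is a sketch under stated assumptions, not the HotSpot code: the real logic lives in the VM's native code, and `OptionalNativeLib`, `Mode` and `init` are names invented here. `System.load` and `System.loadLibrary` are the standard Java entry points and both throw `UnsatisfiedLinkError` when the library cannot be loaded.

```java
// Hypothetical Java analogy of "optional native library with scalar fallback".
final class OptionalNativeLib {

    enum Mode { VECTOR_STUBS, SCALAR_FALLBACK }

    // explicitPath: absolute path given by the user, or null if none was given.
    // defaultName:  library name to look up on the default search path.
    static Mode init(String explicitPath, String defaultName) {
        try {
            if (explicitPath != null) {
                System.load(explicitPath);        // requires an absolute path
            } else {
                System.loadLibrary(defaultName);  // searches java.library.path
            }
            return Mode.VECTOR_STUBS;
        } catch (UnsatisfiedLinkError e) {
            // Warn only when the user asked for a specific library;
            // a missing default library is not treated as an error.
            if (explicitPath != null) {
                System.err.println("warning: cannot load " + explicitPath
                        + ", using scalar fallback");
            }
            return Mode.SCALAR_FALLBACK;
        }
    }

    public static void main(String[] args) {
        System.out.println(init("/nonexistent/libsleef-demo.so", "sleef"));
        System.out.println(init(null, "no_such_sleef_demo_lib"));
    }
}
```

The design point is that the absence of the library never becomes a failure: callers only see which mode was selected.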


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-12-06 Thread Xiaohong Gong
On Tue, 5 Dec 2023 13:03:22 GMT, Magnus Ihse Bursie  wrote:

>> Thanks for the suggestion @magicus !
>> 
>> The check in the current `lib-sleef.m4` is very common:
>> 
>> - If the user has specified the libsleef root with `--with-libsleef`, we 
>> assume it is a manually built sleef lib. So only `lib/` and `include/` are 
>> checked, and the flags are set based on that path.
>> - If the user has not specified the libsleef root and no `SYSROOT` is set, 
>> we try `PKG_CHECK` (like what you suggested).
>> - Otherwise, we check for `sleef.h`.
>>    - We assume the sleef module is installed under one of the valid system 
>> paths if the header can be found, so just linking with `-lsleef` will 
>> succeed.
>> 
>> The current flow has an issue, as @theRealAph ran into. I will add options 
>> such as `--with-libsleef-lib` and `--with-libsleef-include`, as is done for 
>> ffi. Regarding extending the check for `--with-libsleef`, I think we can 
>> keep it as simple as it is now. Otherwise, we would have to check all the 
>> potentially valid lib paths like `lib/`, `lib64/`, or maybe 
>> `lib/aarch64-linux-gnu`. The same applies to the `include` part. 
>> @theRealAph @magicus , WDYT?
>
> I'm fine with adding just --with-libsleef-lib and --with-libsleef-include to 
> specify them directly. This makes it at least possible to use, if not overly 
> convenient, for people using a system like Andrew's. If it annoys someone too 
> much, we can extend it later.

Added these two options in the latest commit. Thanks!

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1416962901
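
To make the order of the three checks quoted above explicit, here is a literal reading of that list as a tiny decision function. The real logic is autoconf code in `lib-sleef.m4`, so this Java version is purely illustrative: the names `SleefLookupOrder`, `Strategy` and `choose` are invented, and details such as what happens when the pkg-config check finds nothing are not covered by the description and are left out.

```java
// Hypothetical sketch of the lookup order; the real check is written in m4.
final class SleefLookupOrder {

    enum Strategy { EXPLICIT_ROOT, PKG_CONFIG, HEADER_CHECK }

    // withLibsleef: value of --with-libsleef, or null/empty if not given.
    // sysroot:      value of SYSROOT, or null/empty if not set.
    static Strategy choose(String withLibsleef, String sysroot) {
        if (withLibsleef != null && !withLibsleef.isEmpty()) {
            // User-provided root: only lib/ and include/ under it are checked.
            return Strategy.EXPLICIT_ROOT;
        }
        if (sysroot == null || sysroot.isEmpty()) {
            // No root and no sysroot: ask pkg-config.
            return Strategy.PKG_CONFIG;
        }
        // Otherwise look for sleef.h and link with -lsleef.
        return Strategy.HEADER_CHECK;
    }

    public static void main(String[] args) {
        System.out.println(choose("/opt/sleef", null));
        System.out.println(choose(null, null));
        System.out.println(choose(null, "/my/sysroot"));
    }
}
```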


Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-06 Thread Xiaohong Gong
On Fri, 1 Dec 2023 16:26:02 GMT, Magnus Ihse Bursie  wrote:

>> Xiaohong Gong has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains ten additional 
>> commits since the last revision:
>> 
>>  - Separate neon and sve functions into two source files
>>  - Merge branch 'jdk:master' into JDK-8312425
>>  - Rename vmath to sleef in configure
>>  - Address review comments in build system
>>  - Add a bundled native lib in jdk as a bridge to libsleef
>>  - Merge 'jdk:master' into JDK-8312425
>>  - Disable sleef by default
>>  - Merge 'jdk:master' into JDK-8312425
>>  - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF
>
> doc/building.md line 639:
> 
>> 637: 
>> 638: libsleef, the [SIMD Library for Evaluating Elementary Functions](
>> 639: https://sleef.org/) is required when building libvmath.so on 
>> Linux/aarch64
> 
> This is incorrect. The library is not required, but if it is present, we will 
> build libvmath with it.
> 
> Edit: Or rather, this is misleading. Technically it is correct, since you 
> state that it is required when building libvmath.so, but it is easy to 
> mistake for being required for building the JDK. The reader presumably does 
> not know what libvmath.so is or how it is used.
> 
> Please rephrase this so that it is clear that this is optional, but will 
> provide performance benefits to the resulting JDK if present. You do not need 
> to mention libvmath.so here; for no other dependency do we declare which 
> parts of the JDK require it -- it is not essential for this document.
> 
> Also see if you can make this paragraph and the one at the end a bit 
> tighter; right now the last paragraph seems to both repeat and contradict 
> this one.

Hi @magicus , the doc is updated. Thanks for your comment on this!

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1416964362