On Fri, 25 Aug 2023 01:57:41 GMT, Srinivas Vamsi Parasa <[email protected]>
wrote:
>> The goal is to develop faster sort routines for x86_64 CPUs by taking
>> advantage of AVX512 instructions. This enhancement provides an order of
>> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>>
>> This PR shows up to ~7x improvement for 32-bit datatypes (int, float) and
>> up to ~4.5x improvement for 64-bit datatypes (long, double) as shown in the
>> performance data below.
>>
>>
>> **Arrays.sort performance data using JMH benchmarks for arrays with random data**
>>
>> | Arrays.sort benchmark | Array Size | Baseline (us/op) | AVX512 Sort (us/op) | Speedup |
>> | --- | --- | --- | --- | --- |
>> | ArraysSort.doubleSort | 10 | 0.034 | 0.035 | 1.0 |
>> | ArraysSort.doubleSort | 25 | 0.116 | 0.089 | 1.3 |
>> | ArraysSort.doubleSort | 50 | 0.282 | 0.291 | 1.0 |
>> | ArraysSort.doubleSort | 75 | 0.474 | 0.358 | 1.3 |
>> | ArraysSort.doubleSort | 100 | 0.654 | 0.623 | 1.0 |
>> | ArraysSort.doubleSort | 1000 | 9.274 | 6.331 | 1.5 |
>> | ArraysSort.doubleSort | 10000 | 323.339 | 71.228 | **4.5** |
>> | ArraysSort.doubleSort | 100000 | 4471.871 | 1002.748 | **4.5** |
>> | ArraysSort.doubleSort | 1000000 | 51660.742 | 12921.295 | **4.0** |
>> | ArraysSort.floatSort | 10 | 0.045 | 0.046 | 1.0 |
>> | ArraysSort.floatSort | 25 | 0.103 | 0.084 | 1.2 |
>> | ArraysSort.floatSort | 50 | 0.285 | 0.33 | 0.9 |
>> | ArraysSort.floatSort | 75 | 0.492 | 0.346 | 1.4 |
>> | ArraysSort.floatSort | 100 | 0.597 | 0.326 | 1.8 |
>> | ArraysSort.floatSort | 1000 | 9.811 | 5.294 | 1.9 |
>> | ArraysSort.floatSort | 10000 | 323.955 | 50.547 | **6.4** |
>> | ArraysSort.floatSort | 100000 | 4326.38 | 731.152 | **5.9** |
>> | ArraysSort.floatSort | 1000000 | 52413.88 | 8409.193 | **6.2** |
>> | ArraysSort.intSort | 10 | 0.033 | 0.033 | 1.0 |
>> | ArraysSort.intSort | 25 | 0.086 | 0.051 | 1.7 |
>> | ArraysSort.intSort | 50 | 0.236 | 0.151 | 1.6 |
>> | ArraysSort.intSort | 75 | 0.416 | 0.332 | 1.3 |
>> | ArraysSort.intSort | 100 | 0.63 | 0.521 | 1.2 |
>> | ArraysSort.intSort | 1000 | 10.518 | 4.698 | 2.2 |
>> | ArraysSort.intSort | 10000 | 309.659 | 42.518 | **7.3** |
>> | ArraysSort.intSort | 100000 | 4130.917 | 573.956 | **7.2** |
>> | ArraysSort.intSort | 1000000 | 49876.307 | 6712.812 | **7.4** |
>> | ArraysSort.longSort | 10 | 0.036 | 0.037 | 1.0 |
>> | ArraysSort.longSort | 25 | 0.094 | 0.08 | 1.2 |
>> | ArraysSort.longSort | 50 | 0.218 | 0.227 | 1.0 |
>> | ArraysSort.longSort | 75 | 0.466 | 0.402 | 1.2 |
>> | ArraysSort.longSort | 100 | 0.76 | 0.58 | 1.3 |
>> | ArraysSort.longSort | 1000 | 10.449 | 6....
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one
> additional commit since the last revision:
>
> Remove unnecessary import in Arrays.java
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4143:
> 4141: log_info(library)("Loaded library %s, handle " INTPTR_FORMAT,
> JNI_LIB_PREFIX "x86_64" JNI_LIB_SUFFIX, p2i(libx86_64));
> 4142:
> 4143: if (UseAVX > 2 && VM_Version::supports_avx512dq()) {
This check should be done before you locate and load the library.
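
To illustrate the ordering I mean, here is a minimal standalone sketch. It is
not HotSpot code: `CpuFeatures`, `Loader`, and `load_simd_sort_library` are
hypothetical names that only mirror the roles of `UseAVX`,
`VM_Version::supports_avx512dq()`, and the library lookup in the patch. The
point is that the loader is never invoked when the CPU check fails.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-ins for the VM flags; these are not HotSpot APIs.
struct CpuFeatures {
    int  use_avx;    // plays the role of the UseAVX flag
    bool avx512dq;   // plays the role of VM_Version::supports_avx512dq()
};

// Hypothetical loader callback: returns a library handle or nullptr.
using Loader = std::function<void*(const std::string&)>;

// Check the CPU features first, and only then locate and load the library.
// Returns nullptr when the check fails or when the load itself fails.
void* load_simd_sort_library(const CpuFeatures& cpu,
                             const Loader& load,
                             const std::string& name) {
    assert(static_cast<bool>(load));  // a loader callback must be supplied
    if (!(cpu.use_avx > 2 && cpu.avx512dq)) {
        return nullptr;  // unsupported CPU: skip the lookup and load entirely
    }
    return load(name);
}
```

With this ordering, a machine without AVX512DQ pays no cost for locating the
library and logs nothing about loading it.
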
src/hotspot/share/opto/library_call.cpp line 5218:
> 5216: BasicType bt = elem_type->basic_type();
> 5217: stubAddr = StubRoutines::select_array_partition_function(bt);
> 5218: if (stubAddr == nullptr) return false;
I see now how you check for AVX512 support.
You bail out here if the stub address is not set, and I see that you have the `if
(UseAVX > 2 && VM_Version::supports_avx512dq())` check in stubGenerator.
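
As I read it, the pattern is roughly the following. This is a standalone
sketch under my own assumptions, not the code in the PR: `ElemType`,
`StubTable`, `select_partition_stub`, and `try_intrinsify_partition` are
hypothetical names. A stub entry stays null unless the generator filled it in,
and the intrinsic declines whenever the selected entry is null.

```cpp
// Hypothetical element types; the real code uses BasicType.
enum class ElemType { Int, Long, Float, Double, Other };

using StubAddress = const void*;

// Hypothetical table of stub entry points. Entries stay nullptr unless the
// stub generator decided the CPU supports them and filled them in.
struct StubTable {
    StubAddress int_partition    = nullptr;
    StubAddress long_partition   = nullptr;
    StubAddress float_partition  = nullptr;
    StubAddress double_partition = nullptr;
};

// Pick the stub for the element type, or nullptr if none is available.
StubAddress select_partition_stub(const StubTable& table, ElemType type) {
    switch (type) {
        case ElemType::Int:    return table.int_partition;
        case ElemType::Long:   return table.long_partition;
        case ElemType::Float:  return table.float_partition;
        case ElemType::Double: return table.double_partition;
        default:               return nullptr;
    }
}

// Returns false (no intrinsic, fall back to the Java path) when no stub exists.
bool try_intrinsify_partition(const StubTable& table, ElemType type) {
    StubAddress stub = select_partition_stub(table, type);
    if (stub == nullptr) {
        return false;  // bail out: stub was never generated for this CPU/type
    }
    // A real compiler would emit a call to `stub` here.
    return true;
}
```

So the feature check lives in one place (stub generation), and the null stub
address is what carries that decision to the compiler.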
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1306180258
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1306179926