On Fri, 22 Sep 2023 16:53:21 GMT, Srinivas Vamsi Parasa <d...@openjdk.org> 
wrote:

>> The goal is to develop faster sort routines for x86_64 CPUs by taking 
>> advantage of AVX512 instructions. This enhancement provides an order of 
>> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>> 
>> This PR shows upto ~7x improvement for 32-bit datatypes (int, float) and 
>> upto ~4.5x improvement for 64-bit datatypes (long, double) as shown in the 
>> performance data below.
>> 
>> 
>> **Arrays.sort performance data using JMH benchmarks for arrays with random 
>> data** 
>> 
>> |    Arrays.sort benchmark   |       Array Size      |       Baseline 
>> (us/op)        |       AVX512 Sort (us/op)     |       Speedup |
>> |    ---     |       ---     |       ---     |       ---     |       ---     
>> |
>> |    ArraysSort.doubleSort   |       10      |       0.034   |       0.035   
>> |       1.0     |
>> |    ArraysSort.doubleSort   |       25      |       0.116   |       0.089   
>> |       1.3     |
>> |    ArraysSort.doubleSort   |       50      |       0.282   |       0.291   
>> |       1.0     |
>> |    ArraysSort.doubleSort   |       75      |       0.474   |       0.358   
>> |       1.3     |
>> |    ArraysSort.doubleSort   |       100     |       0.654   |       0.623   
>> |       1.0     |
>> |    ArraysSort.doubleSort   |       1000    |       9.274   |       6.331   
>> |       1.5     |
>> |    ArraysSort.doubleSort   |       10000   |       323.339 |       71.228  
>> |       **4.5** |
>> |    ArraysSort.doubleSort   |       100000  |       4471.871        |       
>> 1002.748        |       **4.5** |
>> |    ArraysSort.doubleSort   |       1000000 |       51660.742       |       
>> 12921.295       |       **4.0** |
>> |    ArraysSort.floatSort    |       10      |       0.045   |       0.046   
>> |       1.0     |
>> |    ArraysSort.floatSort    |       25      |       0.103   |       0.084   
>> |       1.2     |
>> |    ArraysSort.floatSort    |       50      |       0.285   |       0.33    
>> |       0.9     |
>> |    ArraysSort.floatSort    |       75      |       0.492   |       0.346   
>> |       1.4     |
>> |    ArraysSort.floatSort    |       100     |       0.597   |       0.326   
>> |       1.8     |
>> |    ArraysSort.floatSort    |       1000    |       9.811   |       5.294   
>> |       1.9     |
>> |    ArraysSort.floatSort    |       10000   |       323.955 |       50.547  
>> |       **6.4** |
>> |    ArraysSort.floatSort    |       100000  |       4326.38 |       731.152 
>> |       **5.9** |
>> |    ArraysSort.floatSort    |       1000000 |       52413.88        |       
>> 8409.193        |       **6.2** |
>> |    ArraysSort.intSort      |       10      |       0.033   |       0.033   
>> |       1.0     |
>> |    ArraysSort.intSort      |       25      |       0.086   |       0.051   
>> |       1.7     |
>> |    ArraysSort.intSort      |       50      |       0.236   |       0.151   
>> |       1.6     |
>> |    ArraysSort.intSort      |       75      |       0.416   |       0.332   
>> |       1.3     |
>> |    ArraysSort.intSort      |       100     |       0.63    |       0.521   
>> |       1.2     |
>> |    ArraysSort.intSort      |       1000    |       10.518  |       4.698   
>> |       2.2     |
>> |    ArraysSort.intSort      |       10000   |       309.659 |       42.518  
>> |       **7.3** |
>> |    ArraysSort.intSort      |       100000  |       4130.917        |       
>> 573.956 |       **7.2** |
>> |    ArraysSort.intSort      |       1000000 |       49876.307       |       
>> 6712.812        |       **7.4** |
>> |    ArraysSort.longSort     |       10      |       0.036   |       0.037   
>> |       1.0     |
>> |    ArraysSort.longSort     |       25      |       0.094   |       0.08    
>> |       1.2     |
>> |    ArraysSort.longSort     |       50      |       0.218   |       0.227   
>> |       1.0     |
>> |    ArraysSort.longSort     |       75      |       0.466   |       0.402   
>> |       1.2     |
>> |    ArraysSort.longSort     |       100     |       0.76    |       0.58    
>> |       1.3     |
>> |    ArraysSort.longSort     |       1000    |       10.449  |       6....
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Update CompileThresholdScaling only for the sort and partition intrinsics; 
> update build script to remove nested if

In general it looks good. I have some code style comments and file name change 
request.
After you fix that I will need to rerun testing for it before approval.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4183:

> 4181: 
> 4182:   // Load x86_64_sort library on supported hardware to enable avx512 
> sort and partition intrinsics
> 4183:     if (UseAVX > 2 && VM_Version::supports_avx512dq()) {

Indention (spacing) is wrong for lines 4183-4190 after you moved check.

src/hotspot/share/opto/library_call.cpp line 5372:

> 5370: bool LibraryCallKit::inline_array_partition() {
> 5371: 
> 5372:   address stubAddr = nullptr;

Move this declaration to first assignment at line 5387.

src/hotspot/share/opto/library_call.cpp line 5374:

> 5372:   address stubAddr = nullptr;
> 5373:   const char *stubName;
> 5374:   stubName = "array_partition_stub";

It could one line.

src/hotspot/share/opto/library_call.cpp line 5400:

> 5398: 
> 5399:   // create the pivotIndices array of type int and size = 2
> 5400:   Node* pivotIndices = nullptr;

Move to assignment at line 5403.

src/hotspot/share/opto/library_call.cpp line 5414:

> 5412:   make_runtime_call(RC_LEAF|RC_NO_FP, 
> OptoRuntime::array_partition_Type(),
> 5413:                     stubAddr, stubName, TypePtr::BOTTOM,
> 5414:                     obj_adr, elemType, fromIndex, toIndex, 
> pivotIndices_adr, indexPivot1, indexPivot2);

May be put `indexPivot*` argument on new line for fit screen better.

src/java.base/linux/native/libsimdsort/avx512-32bit-qsort.hpp line 4:

> 2:  * Copyright (c) 2021, 2023, Intel Corporation. All rights reserved.
> 3:  * Copyright (c) 2021 Serge Sans Paille. All rights reserved.
> 4:  * Intel x86-simd-sort source code.

I don't think this line here and in other new files will pass our Copyright 
header checker. Do you need this line here? You do have comment below `This 
implementation is based on x86-simd-sort`.
`DO NOT ALTER ..` line should immediately follow `Copyright` lines.

src/java.base/linux/native/libsimdsort/avxsort_linux_x86.cpp line 1:

> 1: /*

I think this file name should follow pattern of other files: 
`avx512-linux-qsort.cpp

src/java.base/share/classes/java/util/DualPivotQuicksort.java line 418:

> 416: 
> 417:         /*
> 418:         * The first and the last elements to be sorted are moved

Misaligned `/*` here and several places later. `*` should be aligned.

src/java.base/share/classes/java/util/DualPivotQuicksort.java line 1353:

> 1351:         /*
> 1352:             * Swap the pivot into its final position.
> 1353:             */

Indent

src/java.base/share/classes/java/util/DualPivotQuicksort.java line 2941:

> 2939:         /*
> 2940:             * Swap the pivot into its final position.
> 2941:             */

indent

src/java.base/share/classes/java/util/DualPivotQuicksort.java line 3794:

> 3792:         /*
> 3793:             * Swap the pivot into its final position.
> 3794:             */

Indent.

-------------

PR Review: https://git.openjdk.org/jdk/pull/14227#pullrequestreview-1660450589
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347808019
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347810630
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347809581
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347814187
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347816297
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347830769
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347834135
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347828096
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347824813
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347823957
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347823053

Reply via email to