On Thu, 31 Aug 2023 18:45:39 GMT, Srinivas Vamsi Parasa <d...@openjdk.org> 
wrote:

>> The goal is to develop faster sort routines for x86_64 CPUs by taking 
>> advantage of AVX512 instructions. This enhancement provides an order of 
>> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>> 
>> This PR shows upto ~7x improvement for 32-bit datatypes (int, float) and 
>> upto ~4.5x improvement for 64-bit datatypes (long, double) as shown in the 
>> performance data below.
>> 
>> 
>> **Arrays.sort performance data using JMH benchmarks for arrays with random 
>> data** 
>> 
>> |    Arrays.sort benchmark   |       Array Size      |       Baseline 
>> (us/op)        |       AVX512 Sort (us/op)     |       Speedup |
>> |    ---     |       ---     |       ---     |       ---     |       ---     
>> |
>> |    ArraysSort.doubleSort   |       10      |       0.034   |       0.035   
>> |       1.0     |
>> |    ArraysSort.doubleSort   |       25      |       0.116   |       0.089   
>> |       1.3     |
>> |    ArraysSort.doubleSort   |       50      |       0.282   |       0.291   
>> |       1.0     |
>> |    ArraysSort.doubleSort   |       75      |       0.474   |       0.358   
>> |       1.3     |
>> |    ArraysSort.doubleSort   |       100     |       0.654   |       0.623   
>> |       1.0     |
>> |    ArraysSort.doubleSort   |       1000    |       9.274   |       6.331   
>> |       1.5     |
>> |    ArraysSort.doubleSort   |       10000   |       323.339 |       71.228  
>> |       **4.5** |
>> |    ArraysSort.doubleSort   |       100000  |       4471.871        |       
>> 1002.748        |       **4.5** |
>> |    ArraysSort.doubleSort   |       1000000 |       51660.742       |       
>> 12921.295       |       **4.0** |
>> |    ArraysSort.floatSort    |       10      |       0.045   |       0.046   
>> |       1.0     |
>> |    ArraysSort.floatSort    |       25      |       0.103   |       0.084   
>> |       1.2     |
>> |    ArraysSort.floatSort    |       50      |       0.285   |       0.33    
>> |       0.9     |
>> |    ArraysSort.floatSort    |       75      |       0.492   |       0.346   
>> |       1.4     |
>> |    ArraysSort.floatSort    |       100     |       0.597   |       0.326   
>> |       1.8     |
>> |    ArraysSort.floatSort    |       1000    |       9.811   |       5.294   
>> |       1.9     |
>> |    ArraysSort.floatSort    |       10000   |       323.955 |       50.547  
>> |       **6.4** |
>> |    ArraysSort.floatSort    |       100000  |       4326.38 |       731.152 
>> |       **5.9** |
>> |    ArraysSort.floatSort    |       1000000 |       52413.88        |       
>> 8409.193        |       **6.2** |
>> |    ArraysSort.intSort      |       10      |       0.033   |       0.033   
>> |       1.0     |
>> |    ArraysSort.intSort      |       25      |       0.086   |       0.051   
>> |       1.7     |
>> |    ArraysSort.intSort      |       50      |       0.236   |       0.151   
>> |       1.6     |
>> |    ArraysSort.intSort      |       75      |       0.416   |       0.332   
>> |       1.3     |
>> |    ArraysSort.intSort      |       100     |       0.63    |       0.521   
>> |       1.2     |
>> |    ArraysSort.intSort      |       1000    |       10.518  |       4.698   
>> |       2.2     |
>> |    ArraysSort.intSort      |       10000   |       309.659 |       42.518  
>> |       **7.3** |
>> |    ArraysSort.intSort      |       100000  |       4130.917        |       
>> 573.956 |       **7.2** |
>> |    ArraysSort.intSort      |       1000000 |       49876.307       |       
>> 6712.812        |       **7.4** |
>> |    ArraysSort.longSort     |       10      |       0.036   |       0.037   
>> |       1.0     |
>> |    ArraysSort.longSort     |       25      |       0.094   |       0.08    
>> |       1.2     |
>> |    ArraysSort.longSort     |       50      |       0.218   |       0.227   
>> |       1.0     |
>> |    ArraysSort.longSort     |       75      |       0.466   |       0.402   
>> |       1.2     |
>> |    ArraysSort.longSort     |       100     |       0.76    |       0.58    
>> |       1.3     |
>> |    ArraysSort.longSort     |       1000    |       10.449  |       6....
>
> Srinivas Vamsi Parasa has updated the pull request with a new target base due 
> to a merge or a rebase. The pull request now contains 32 commits:
> 
>  - update build script
>  - Merge branch 'master' of https://git.openjdk.org/jdk into avx512sort
>  - Clean up parameters passed to arrayPartition; update the check to load 
> library
>  - Remove unnecessary import in Arrays.java
>  - Move sort and partition intrinsics from Arrays.java to DPQS.java
>  - Fix unused assignment in DPQS.java and space in Arrays.java
>  - add parallelSort benchmarking
>  - Update copyright for DPQS.java; replace avx512 pivot calculation with 
> scalar version
>  - Update avx512-common-qsort.h
>  - Decomposed DPQS using AVX512 partitioning and AVX512 sort (for small 
> arrays). Works for serial and parallel sort.
>  - ... and 22 more: https://git.openjdk.org/jdk/compare/c8acab1d...1746eedd

make/modules/java.base/Lib.gmk line 255:

> 253:         TARGETS += $(BUILD_LIB_X86_64)
> 254:   endif
> 255: endif

Indentation looks off here 
(https://openjdk.org/groups/build/doc/code-conventions.html)
Suggestion:

ifeq ($(call isTargetOs, linux)+$(call isTargetCpu, 
x86_64)+$(INCLUDE_COMPILER2), true+true+true)
  ifeq ($(TOOLCHAIN_TYPE), gcc)
    $(eval $(call SetupJdkLibrary, BUILD_LIB_X86_64, \
        NAME := x86_64, \
        TOOLCHAIN := TOOLCHAIN_LINK_CXX, \
        OPTIMIZATION := HIGH, \
        CFLAGS := $(CFLAGS_JDKLIB), \
        CXXFLAGS := $(CXXFLAGS_JDKLIB), \
        LDFLAGS := $(LDFLAGS_JDKLIB) \
            $(call SET_SHARED_LIBRARY_ORIGIN), \
        LIBS := $(LIBCXX), \
        LIBS_linux := -lc -lm -ldl, \
    ))

    TARGETS += $(BUILD_LIB_X86_64)
  endif
endif


I'm also still wondering about the library name. It's very generic for 
something that seems to be rather specific.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1312139279

Reply via email to