On Tue, 7 Nov 2023 00:12:41 GMT, Srinivas Vamsi Parasa <d...@openjdk.org> wrote:

> The goal is to develop faster sort routines for x86_64 CPUs by taking 
> advantage of AVX2 instructions. This enhancement provides an order of 
> magnitude speedup for Arrays.sort() using int, long, float and double arrays.
> 
> For serial sort on random data, this PR shows upto ~7.5x improvement for 
> 32-bit datatypes (int, float) and upto ~3x improvement for 64-bit datatypes 
> (long, double) on Intel TigerLake machine as shown in the performance data 
> below.
> 
> For parallel sort on random data, this PR shows upto ~3.4x for 32-bit 
> datatypes (int, float) and upto ~2.3x for 64-bit datatypes as shown below.
> 
> **Note:** This PR also improves the performance of AVX512 sort by upto 35%.
> 
> <html xmlns:v="urn:schemas-microsoft-com:vml"
> xmlns:o="urn:schemas-microsoft-com:office:office"
> xmlns:x="urn:schemas-microsoft-com:office:excel"
> xmlns="http://www.w3.org/TR/REC-html40";>
> 
> <head>
> 
> <meta name=ProgId content=Excel.Sheet>
> <meta name=Generator content="Microsoft Excel 15">
> <link id=Main-File rel=Main-File
> href="file:///C:/Users/sparasa/AppData/Local/Temp/msohtmlclip1/01/clip.htm">
> <link rel=File-List
> href="file:///C:/Users/sparasa/AppData/Local/Temp/msohtmlclip1/01/clip_filelist.xml">
> 
> 
> </head>
> 
> <body link="#0563C1" vlink="#954F72">
> 
> 
> 
> Benchmark (Serial Sort) | Size | Baseline      (us/op) | AVX2     (us/op) | 
> Speedup
> -- | -- | -- | -- | --
> ArraysSort.intSort | 10 | 0.034 | 0.029 | 1.2
> ArraysSort.intSort | 25 | 0.088 | 0.044 | 2.0
> ArraysSort.intSort | 50 | 0.239 | 0.159 | 1.5
> ArraysSort.intSort | 75 | 0.417 | 0.27 | 1.5
> ArraysSort.intSort | 100 | 0.572 | 0.265 | 2.2
> ArraysSort.intSort | 1000 | 10.098 | 4.282 | 2.4
> ArraysSort.intSort | 10000 | 330.065 | 43.383 | 7.6
> ArraysSort.intSort | 100000 | 4099.527 | 778.943 | 5.3
> ArraysSort.intSort | 1000000 | 49150.16 | 9634.335 | 5.1
> ArraysSort.floatSort | 10 | 0.045 | 0.043 | 1.0
> ArraysSort.floatSort | 25 | 0.105 | 0.073 | 1.4
> ArraysSort.floatSort | 50 | 0.278 | 0.216 | 1.3
> ArraysSort.floatSort | 75 | 0.476 | 0.241 | 2.0
> ArraysSort.floatSort | 100 | 0.583 | 0.313 | 1.9
> ArraysSort.floatSort | 1000 | 10.182 | 4.329 | 2.4
> ArraysSort.floatSort | 10000 | 323.136 | 57.175 | 5.7
> ArraysSort.floatSort | 100000 | 4299.519 | 862.63 | 5.0
> ArraysSort.floatSort | 1000000 | 50889.4 | 10972.19 | 4.6
> ArraysSort.longSort | 10 | 0.037 | 0.031 | 1.2
> ArraysSort.longSort | 25 | 0.101 | 0.073 | 1.4
> ArraysSort.longSort | 50 | 0.227 | 0.219 | 1.0
> ArraysSort.longSort | 75 | 0.446 | 0.332 | 1.3
> ArraysSort.longSort | 100 | 0.714 | 0.557 | 1.3
> ArraysSort.longSort | 1000 ...

make/modules/java.base/Lib.gmk line 245:

> 243:       TOOLCHAIN := TOOLCHAIN_LINK_CXX, \
> 244:       OPTIMIZATION := HIGH, \
> 245:       CFLAGS := $(CFLAGS_JDKLIB) -std=c++17, \

This makes me uneasy. We do not in general use C++17 in the JDK. 

Is this flag needed for the code to compile? If so, would it be difficult to 
rewrite it not to require C++17 constructs?

Or was it added since you noticed performance increases, not related to the new 
code, by forcing the compiler to use a higher language revision?

We are supporting gcc versions from 6. From what I can tell, C++17 was fully 
introduced in gcc 11. Increasing the lowest supported gcc to 11  would require 
quite a jump, just for this library.

In the worst case, you would need to make the existence of this library 
dependent on gcc version. (It is my understanding that the library is optional, 
and just produces a performance benefits if it exists).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16534#discussion_r1394350837

Reply via email to