On Fri, 17 Nov 2023 21:09:13 GMT, Vladimir Yaroslavskiy <d...@openjdk.org> 
wrote:

>> Comparison of Stock JDK (with AVX512 sort) vs. Radix sort for All 
>> (https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_RadixForAll.java)
>> 
>> Benchmark (us/op) | Builder | (size) | Stock JDK (+AVX512 sort) | Radix for All (+AVX512 sort) | Speedup
>> -- | -- | -- | -- | -- | --
>> ArraysSort.Int.testSort | RANDOM | 800 | 3.345 | 2.329 | 1.436
>> ArraysSort.Int.testSort | RANDOM | 7000 | 31.617 | 29.886 | 1.058
>> ArraysSort.Int.testSort | RANDOM | 50000 | 304.558 | 258.662 | 1.177
>> ArraysSort.Int.testSort | RANDOM | 300000 | 2097.165 | 1626.93 | 1.289
>> ArraysSort.Int.testSort | RANDOM | 2000000 | 15357.603 | 11158.73 | 1.376
>> ArraysSort.Int.testSort | REPEATED | 800 | 0.921 | 0.997 | 0.924
>> ArraysSort.Int.testSort | REPEATED | 7000 | 3.386 | 4.434 | 0.764
>> ArraysSort.Int.testSort | REPEATED | 50000 | 22.774 | 22.85 | 0.997
>> ArraysSort.Int.testSort | REPEATED | 300000 | 161.34 | 172.827 | 0.934
>> ArraysSort.Int.testSort | REPEATED | 2000000 | 1138.572 | 994.153 | 1.145
>> ArraysSort.Int.testSort | STAGGER | 800 | 2.13 | 2.383 | 0.894
>> ArraysSort.Int.testSort | STAGGER | 7000 | 17.967 | 19.506 | 0.921
>> ArraysSort.Int.testSort | STAGGER | 50000 | 121.2 | 145.089 | 0.835
>> ArraysSort.Int.testSort | STAGGER | 300000 | 728.444 | 858.927 | 0.848
>> ArraysSort.Int.testSort | STAGGER | 2000000 | 4943.958 | 5976.788 | 0.827
>> ArraysSort.Int.testSort | SHUFFLE | 800 | 2.834 | 2.348 | 1.207
>> ArraysSort.Int.testSort | SHUFFLE | 7000 | 30.086 | 38.303 | 0.785
>> ArraysSort.Int.testSort | SHUFFLE | 50000 | 268.786 | 258.078 | 1.041
>> ArraysSort.Int.testSort | SHUFFLE | 300000 | 1996.706 | 2439.81 | 0.818
>> ArraysSort.Int.testSort | SHUFFLE | 2000000 | 14108.444 | 19083.22 | 0.739
>> ArraysSort.Int.testSortParallel | RANDOM | 800 | 3.318 | 8.074 | 0.41
>> ArraysSort.Int.testSortParallel | RANDOM | 7000 | 26.533 | 44.091 | 0.60
>> ArraysSort.Int.testSor...
>
> Hello Vamsi (@vamsi-parasa),
> 
> Thank you very much for benchmarking, I appreciate your efforts!
> 
> I looked at the non-parallel sorting results with radix sort switched off 
> (DualPivotQuicksort_RadixForParallel) and cannot explain the following data 
> for STAGGER, where every speedup is below 1:
> testSort STAGGER    7000  0.92
> testSort STAGGER   50000  0.84
> testSort STAGGER  300000  0.85
> testSort STAGGER 2000000  0.83
> In these cases both versions go directly to merging sort: no quicksort, no 
> insertion sort, no radix sort at all,
> and therefore no intrinsics either: Java code only, merging sort only.
> The results are expected to be the same as when benchmarking without AVX512,
> but my benchmarking on Windows shows a speedup of 1.0 .. 1.10.
> 
> Vamsi,
> Could you please run the benchmark with the following classes, derived from 
> the JDK and from my version?
> https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_a01.java
> https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_a02.java
> https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r01.java
> I hope it allows us to detect the root of such behaviour.
> 
> Please check sequential sorting only (parallel sort is out of scope now).
> I see you did not use the latest ArraysSort; I pointed to 
> https://github.com/iaroslavski/sorting/blob/master/radixsort/ArraysSort.java
> 
> It is not critical, but it would be better to run in the same environment; see
> the increased warmup, the specified run parameters, and the updated data sets:
> @Warmup(iterations = 2, time = 3, timeUnit = TimeUnit.SECONDS)
> @Measurement(iterations = 4, time = 3, timeUnit = TimeUnit.SECONDS)
> @Fork(value=1, jvmArgsAppend={"-XX:CompileThreshold=1", "-XX:-TieredCompilation"})
> 
> Could you please spare some time and provide the performance data?

Hello Vladimir (@iaroslavski),

You're welcome!
I will provide the performance data (for the next set of experiments) within a 
few days.
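
For reference when reading the quoted table: the Speedup column appears to be
the Stock JDK time divided by the Radix for All time (for example,
3.345 / 2.329 is about 1.436), so values below 1 mean the radix variant is
slower. A minimal sketch of that calculation follows; the class and method
names are hypothetical and chosen only for illustration.

```java
public class SpeedupCheck {

    // Hypothetical helper (not part of the JDK or the benchmark sources):
    // ratio of the baseline time to the candidate time, both in us/op.
    // A result above 1 means the candidate is faster than the baseline.
    public static double speedup(double baselineUsPerOp, double candidateUsPerOp) {
        return baselineUsPerOp / candidateUsPerOp;
    }

    public static void main(String[] args) {
        // Times copied from two rows of the quoted table.
        System.out.printf("RANDOM 800:    %.3f%n", speedup(3.345, 2.329));
        System.out.printf("STAGGER 50000: %.3f%n", speedup(121.2, 145.089));
    }
}
```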

Thanks,
Vamsi

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13568#issuecomment-1817236360
