On Fri, 17 Nov 2023 21:09:13 GMT, Vladimir Yaroslavskiy <d...@openjdk.org> wrote:
>> Comparision of Stock JDK ( with AVX512sort) vs. Radix sort for All >> (https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_RadixForAll.java) >> <html xmlns:v="urn:schemas-microsoft-com:vml" >> xmlns:o="urn:schemas-microsoft-com:office:office" >> xmlns:x="urn:schemas-microsoft-com:office:excel" >> xmlns="http://www.w3.org/TR/REC-html40"> >> >> <head> >> >> <meta name=ProgId content=Excel.Sheet> >> <meta name=Generator content="Microsoft Excel 15"> >> <link id=Main-File rel=Main-File >> href="file:///C:/Users/sparasa/AppData/Local/Temp/msohtmlclip1/01/clip.htm"> >> <link rel=File-List >> href="file:///C:/Users/sparasa/AppData/Local/Temp/msohtmlclip1/01/clip_filelist.xml"> >> >> >> >> </head> >> >> <body link="#0563C1" vlink="#954F72"> >> >> >> Benchmark (us/op) | Builder | (size) | Stock JDK (+AVX512 sort) | >> Radix for All (+AVX512 sort) | Speedup >> -- | -- | -- | -- | -- | -- >> ArraysSort.Int.testSort | RANDOM | 800 | 3.345 | 2.329 | 1.436 >> ArraysSort.Int.testSort | RANDOM | 7000 | 31.617 | 29.886 | 1.058 >> ArraysSort.Int.testSort | RANDOM | 50000 | 304.558 | 258.662 | 1.177 >> ArraysSort.Int.testSort | RANDOM | 300000 | 2097.165 | 1626.93 | 1.289 >> ArraysSort.Int.testSort | RANDOM | 2000000 | 15357.603 | 11158.73 | 1.376 >> ArraysSort.Int.testSort | REPEATED | 800 | 0.921 | 0.997 | 0.924 >> ArraysSort.Int.testSort | REPEATED | 7000 | 3.386 | 4.434 | 0.764 >> ArraysSort.Int.testSort | REPEATED | 50000 | 22.774 | 22.85 | 0.997 >> ArraysSort.Int.testSort | REPEATED | 300000 | 161.34 | 172.827 | 0.934 >> ArraysSort.Int.testSort | REPEATED | 2000000 | 1138.572 | 994.153 | 1.145 >> ArraysSort.Int.testSort | STAGGER | 800 | 2.13 | 2.383 | 0.894 >> ArraysSort.Int.testSort | STAGGER | 7000 | 17.967 | 19.506 | 0.921 >> ArraysSort.Int.testSort | STAGGER | 50000 | 121.2 | 145.089 | 0.835 >> ArraysSort.Int.testSort | STAGGER | 300000 | 728.444 | 858.927 | 0.848 >> ArraysSort.Int.testSort | STAGGER | 2000000 | 4943.958 | 5976.788 | 0.827 >> ArraysSort.Int.testSort | SHUFFLE | 800 | 2.834 | 2.348 | 1.207 >> ArraysSort.Int.testSort | SHUFFLE | 7000 | 30.086 | 38.303 | 0.785 >> ArraysSort.Int.testSort | SHUFFLE | 50000 | 268.786 | 258.078 | 1.041 >> ArraysSort.Int.testSort | SHUFFLE | 300000 | 1996.706 | 2439.81 | 0.818 >> ArraysSort.Int.testSort | SHUFFLE | 2000000 | 14108.444 | 19083.22 | 0.739 >> ArraysSort.Int.testSortParallel | RANDOM | 800 | 3.318 | 8.074 | 0.41 >> ArraysSort.Int.testSortParallel | RANDOM | 7000 | 26.533 | 44.091 | 0.60 >> ArraysSort.Int.testSor... > > Hello Vamsi (@vamsi-parasa), > > Thank you very much for benchmarking, I appreciate your efforts! > > I looked at non-parallel sorting when radix sort is switched off > (DualPivotQuicksort_RadixForParallel) and cannot explain the following data > for STAGGER where all speedup < 1: > testSort STAGGER 7000 0.92 > testSort STAGGER 50000 0.84 > testSort STAGGER 300000 0.85 > testSort STAGGER 2000000 0.83 > In these cases both versions go directly to merging sort: no quicksort, no > insertion sort, no radix sort at all > and therefore, no intrinsic also, Java code only,merging sort only. > It is expected that benchmarking without AVX512 should be the same, > but my benchmarking on Windows shows speedup 1.0 .. 1.10. > > Vamsi, > Could you please run benchmarking with derived classes from jdk and my > version? > https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_a01.java > https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_a02.java > https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r01.java > II hope i allows us to detect the root of such behaviour. > > Please check sequential sorting only (parallel sort is out of scope now). > I see you used not the latest ArraysSort, I pointed to > https://github.com/iaroslavski/sorting/blob/master/radixsort/ArraysSort.java > > It is not critical, but it will be better to be in the same environment, see > increased warmup, specified parameters for run and updated data sets > @Warmup(iterations = 2, time = 3, timeUnit = TimeUnit.SECONDS) > @Measurement(iterations = 4, time = 3, timeUnit = TimeUnit.SECONDS) > @Fork(value=1, jvmArgsAppend={"-XX:CompileThreshold=1", > "-XX:-TieredCompilation"}) > > Could you please spare some time and provide the performance data? Hello Vladimir (@iaroslavski), You're welcome! Will provide the performance data (for the next set of experiments) within few days. Thanks, Vamsi ------------- PR Comment: https://git.openjdk.org/jdk/pull/13568#issuecomment-1817236360