On Thu, 24 Aug 2023 06:23:29 GMT, Srinivas Vamsi Parasa <d...@openjdk.org> 
wrote:

>>> Improvements are nice but it would not pay off if you have big regressions. 
>>> I can accept 0.9x but 0.4x - 0.8x regressions should be investigated and 
>>> implementation adjusted to avoid them.
>> 
>> Hi Vladimir,
>> 
>> Thank you for the suggestion! 
>> Currently, AVX512sort is doing well for Random, Repeated and Shuffle 
>> patterns of input data. The regressions are observed for Staggered (Wave) 
>> pattern of input data. 
>> Will investigate the regressions and adjust the implementations to address 
>> them.
>> 
>> Thanks,
>> Vamsi
>
>> Improvements are nice but it would not pay off if you have big regressions. 
>> I can accept 0.9x but 0.4x - 0.8x regressions should be investigated and 
>> implementation adjusted to avoid them.
> 
> Hello Vladimir (@vnkozlov) ,
> 
> As per your suggestion, the implementation was adjusted to address the 
> regressions caused for STAGGER and REPEATED type of input data patterns. 
> Please see below the new JMH performance data using the adjusted 
> implementation.
> 
> In the new implementation, we don't call the AVX512 sort intrinsic at the top 
> level (`Arrays.sort()`) . Instead, we take a decomposed or a 'building 
> blocks' approach where we only intrinsify  (using AVX512 instructions) the 
> partitioning and small array sort functions used in the current baseline ( 
> `DualPivotQuickSort.sort()` ). Since the current baseline has logic to detect 
> and sort special input patterns like STAGGER, we fallback to the current 
> baseline instead of using AVX512 partitioning and sorting (which works best 
> for RANDOM, REPEATED and SHUFFLE patterns).
> 
> Data below shows `Arrays.sort()` performance comparison between the current 
> **Java baseline (DPQS)** vs. **AVX512 sort** (this PR)  using the 
> `ArraysSort.java` JMH 
> [benchmark](https://github.com/openjdk/jdk/pull/13568/files#diff-dee51b13bd1872ff455cec2f29255cfd25014a5dd33dda55a2fc68638c3dd4b2)
>  provided in the PR for [JDK-8266431: Dual-Pivot Quicksort improvements 
> (Radix sort)](https://github.com/openjdk/jdk/pull/13568/files#top) ( #13568)
> 
> - The following command line was used to run the benchmarks: ` java -jar 
> $JDK_HOME/build/linux-x86_64-server-release/images/test/micro/benchmarks.jar 
> -jvmArgs "-XX:CompileThreshold=1 -XX:-TieredCompilation" ArraysSort`
> - The scores shown are the average time (us/op),  thus lower is better. The 
> last column towards the right shows the speedup. 
> 
> 
> |     Benchmark       |       Mode    |       Size    |       Baseline DPQS 
> (us/op)   |       AVX512 partitioning & sort (us/op)      |       Speedup |
> |     ---     |       ---     |       ---     |       ---     |       ---     
> |       ---     |
> |     ArraysSort.Double.testSort      |       RANDOM  |       800     |       
> 6.7     |       4.8     |       1.39x   |
> |     ArraysSort.Double.testSort      |       RANDOM  |       7000    |       
> 234.1   |       51.5    |       **4.55x**       |
> |     ArraysSort.Double.testSort      |       RANDOM  |       50000   |       
> 2155.9  |       470.0   |       **4.59x**       |
> |     ArraysSort.Double.testSort      |       RANDOM  |       300000  |       
> 15076.3 |       3391.3  |       **4.45x**       |
> |     ArraysSort.Double.testSort      |       RANDOM  |       2000000 |       
> 116445.5        |       27491.7 |       **4.24x**       |
> |     ArraysSort.Double.testSort      |       REPEATED        |       800     
> |       2.3     |       1.7     |       1.35x   |
> |     ArraysSort.Double.testSort      |       REPEATED        |       7000    
> |       23.3    |       12.1    |       **1.92x**       |
> |     ArraysSort.Double.testSort      |...

@vamsi-parasa Thank you for addressing performance issues I asked about.

First, since you add new public Java API to Arrays class this have to be 
reviewed by Core Libs.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14227#issuecomment-1692644591

Reply via email to