On Fri, 1 Sep 2023 06:12:57 GMT, Alan Bateman <al...@openjdk.org> wrote:

>> Hi team,
>> @AlanBateman, @rose00, @mbreinhold 
>> 
>> There are a big efforts now to improve sorting with x86_64 AVX512
>> https://github.com/openjdk/jdk/pull/14227, no changes of
>> Dual-Pivot Quicksort logic.
>> 
>> But at the same time this PR offers algorithmic improvements,
>> like optimized insertion sort, merging sort, better partitioning
>> and Radix sort for fully random data. Radix sort has linear complexity
>> instead of n*ln(n), much faster!. Also we handle properly situation if
>> no enough memory for additional buffer, now we will face with OOM.
>> 
>> Alan, you mentioned that DualPivotQuicksort will need detailed review.
>> Can we go ahead and start reviewing? Laurent checked performance,
>> JMH results look fine.
>> 
>> I think it would be logical to integrate new DualPivotQuicksort first,
>> and then apply AVX512 changes.
>
>> Alan, you mentioned that DualPivotQuicksort will need detailed review. Can 
>> we go ahead and start reviewing? Laurent checked performance, JMH results 
>> look fine.
> 
> As before, I think the main question with this change is whether adding radix 
> sort to the mix is worth the complexity and additional code to maintain. Also 
> as we discussed in the previous PR, the additional memory needed for the 
> radix sort may have an effect on other things that are going on concurrently. 
> I know it has been updated to handle OOME but I think potential reviewers 
> would need to be comfortable with that part.

Hi Alan (@AlanBateman) and team,

I understand your doubts about Radix sort, let's me to provide my thoughts.

Code of Radix sort is about 10% of all code and very clear for understanding: 
scan digits + then place elements according histogram. Other sorting algorithms 
have more complexity (like merging sort, quicksort partitioning, heap sort). 
Therefore, I don't expect problems with maintaining.

Why Radix sort is important? Even Quicksort is very fast, but it relies on 
comparability of elements only and knows nothing about nature of elements to be 
sorted. Radix sort knows the structure of elements (32 or 64 bits) and 
therefore works faster (on random data). I was asked many times why we don't 
use Radix sort, so the idea is not new. Introducing of Radix sort is like using 
of counting sort for byte/short/char, the same approach.

Existing version of sort() will request additional memory, if tryMergingSort() 
detects already sorted subsequences. In this case Radix sort or Quicksort will 
not be invoked at all. Radix sort is invoked, if merging sort is not completed 
(and therefore, no allocated memory). Additional memory is requested by either 
merging sort or Radix sort, not by both algorithms on the same input. It means 
these algorithms never request double memory. Memory effect (merging sort and 
Radix sort) is not summarized. In case of parallel sorting the additional 
buffer allocated for parallel merge sort will be reused by tryMergingSort() as 
well as by Radix sort.

**Summary:** the size of total requested memory always equals to the size of 
part to be sorted (not more), no any duplicates.

Need to say that Radix sort is invoked not on all inputs, many inputs are 
covered by Quicksort or merging sort. If Quicksort suggests O(n*ln(n)), Radix 
sort shows linear, much better, O(n) time! As it was mentioned before, we added 
check against OOM, sorting will be completed successfully in any case. Existing 
implementation will fail with error, if there is no enough memory.

**In few words,** Radix sort is simple and it impacts not more than already 
introduced algorithms in JDK.

I encourage reviewers to look at this PR.

Thank you,
Vladimir

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13568#issuecomment-1711730647

Reply via email to