Vladimir Yaroslavskiy <vladimir.yaroslavs...@...> writes:
> I'd like to share with you new Dual-Pivot Quicksort which is
> faster than the known implementations (theoretically and
> experimental). I'd like to propose to replace the JDK's
> Quicksort implementation by new one.

This is a great idea; as a mathematician I immediately want to generalize it.
Unfortunately, I lack the computing skill to test my generalization in
practice.  

Suppose that instead of 2 pivots we have m (randomly chosen) pivots, where
m=sqrt(n+1)-1.
Step 1: Sort the m pivots, recursively.
Step 2: Build a balanced binary tree from the pivots.
Step 3: Sort the remaining n-m elements into (m+1) categories, using the BBT.
Step 4: Sort each of the (m+1) categories, recursively.

Let S(n) denote the number of steps needed for this algorithm.  Calculating
roughly:
Step 1 requires S(m) steps.
Step 2 requires m steps (it is easy since the pivots are already sorted).
Step 3 requires (n-m)(log m) comparisons.
Step 4 requires, on average, (m+1)S((n-m)/(m+1)) steps.

m was chosen so that (n-m)/(m+1)=m, so S(n)=(m+2)S(m)+m+(n-m)(log m) <= (m+2)
S(m)+ n log m.  Crudely, we replace both m and m+2 by sqrt(n), to get as
approximation:
S(n) <= 0.5 n log n + n^0.5 [ 0.25 n^0.5 log n + n^0.25 [ 0.125 n^0.25 log n
 +...]]] = 0.5 n log n [1 + 0.5 + 0.25 + ...] = n log n.  This appears to be
better than 2 n log n.  This argument can be made precise (the only really
questionable issue is replacing m+2 by m) by first getting a crude but 
correct upper bound for 2S(m), which gets absorbed into the -mlogm term.

Unfortunately, I'm not up to the detailed comparison vs. swap analysis as 
done by Mr. Yaroslavskiy, but it seems to me that, if nothing else, having 
an efficient algorithm to partition the non-pivot elements should provide a
savings.

Vadim Ponomarenko
Dept. of Mathematics and Statistics
San Diego State University

Reply via email to