Hello, everyone! I've investigated the implementation of the Dual-Pivot Quicksort which is used for sorting primitives and here is my result:
http://cr.openjdk.java.net/~alanb/6947216/webrev.00 New implementation of Dual-Pivot Quicksort is faster than previous one of 12% for client VM and few percents for server VM. Tests on Bentley's test suite (Win XP, JDK 7, build 1.7.0-ea-b84, n = 1000000) show geometry mean 0.88 for client VM and 0.98-1.00 for server VM. In compare with sorting from JDK 6 by Jon L. Bentley and M. Douglas McIlroy's (with one pivot) the ratio is 0.61 / 0.50 (client / server). See the execution time for sorting array of 2`000`000 int elements 50 times, client / server VM, in milliseconds: random new: 16723 18776 jdk7: 17571 18975 jdk6: 22241 26524 ascendant new: 3541 4702 jdk7: 4486 4669 jdk6: 8667 7978 descendant new: 3854 4907 jdk7: 4942 5034 jdk6: 8787 8243 equal new: 234 281 jdk7: 291 230 jdk6: 602 1018 organ pipes new: 7673 8613 jdk7: 8167 8993 jdk6: 11902 14044 stagger 1 new: 7648 8591 jdk7: 8161 8979 jdk6: 11908 13810 stagger 2 new: 8349 9299 jdk7: 10968 11916 jdk6: 12194 14734 stagger 4 new: 8475 9622 jdk7: 9221 9682 jdk6: 10523 12006 stagger 8 new: 9321 10689 jdk7: 11125 12387 jdk6: 13829 16214 period 1..2 new: 758 751 jdk7: 870 754 jdk6: 1038 1227 period 1..4 new: 1004 963 jdk7: 1365 1209 jdk6: 1511 1847 period 1..8 new: 1588 1573 jdk7: 1599 1790 jdk6: 2602 3045 random 1..2 new: 1327 1125 jdk7: 1362 1496 jdk6: 1531 2182 random 1..4 new: 1830 2118 jdk7: 1851 2236 jdk6: 2292 3025 where stagger(m) is array like a[i] = i * (m + 1) % length. The summary of changes is: 1. For sorting small arrays is used insertion sort with sentinel instead of traditional, which has the structure: for (int i = left + 1; i <= right; i++) { for (j = i; j > left && a[j-1] > a[j]; j--) { swap(a[i], a[j-1]); } } Note that range check j > left is performed on each iteration, but really helps very rare. To avoid this expensive range check, it was suggested to set minimum value (negative infinity) on the first position. This type of suggestion is used in new version: if left bound > 0, we can put sentinel on a[left - 1], do insertion sort without expensive check range, and then restore a[left - 1] value. If left == 0, traditional insertion sort is used. Please, look at the webrev for details. 2. In previous implementation 5 evenly spaced elements sixth = length / 6; a[sixth], a[2 * sixth], a[3 * sixth], a[4 * sixth], a[5 * sixth] were used as candidates of pivots elements. This case is very sensitive for period inputs, especially by 6. The new suggestion is to take 5 center evenly spaced elements like this: int seventh = length / 7; int midpoint = (left + right) >>> 1; a[midpoint - 2 * seventh], a[midpoint - seventh], a[midpoint], a[midpoint + seventh], a[midpoint + 2 * seventh] and moreover, the seventh is calculated inexpensively: seventh = (length >>> 3) + (length >>> 6) + 1; This schema works the same on random, ascendant, descendant, equal inputs, but much better for period / stagger. 3. The whole structure <choose pivots> if (pivotsDiffer) { <do partitioning for two pivots> } else { <do partitioning for one pivot> } <sort left and right parts> if (!pivotsDiffer) { return; } <swap internal pivot values to ends> <sort center part> was modified to: ---------------- <choose pivots> if (pivot1 < pivot2) { <do partitioning for two pivots> <swap pivots into their final positions> <sort left and right parts> <swap internal pivot values to ends> <sort center part> } else { <do partitioning for one pivot> <sort left and right parts> } 4. Partitioning for both cases have not been changed at all. 5. Minor javadoc and format changes. Please, review new implementation, any comments / suggestions are welcome! Thank you, Vladimir