Josh, Quick note on changing
int pivot1 = a[e2[; int pivot2 = a[e4]; by int pivot1 = ae2; int pivot2 = ae4; It is extremely surprised, but version with local variables eats 5% (!) of time for client and 2% for server mode (in compare with one-pivot implementation from JDK 6), see: client server with a[e2], a[e4]: 60.75% 48.20% with ae2, ae4: 65.80% 50.42% I don't have idea why this simple change is so expensive. Does anybody can explain it? Vladimir Tue, 4 May 2010 21:57:42 -0700 письмо от Joshua Bloch <j...@google.com>: > > Vladimir, > > Hi. I'm thrilled that you were able to eke so much more perforance out of > this already optimized code. I read your changes on my flight to Pittsburgh. > Here's the code review: > > I think the comment on lines 102-105 was too useful to delete: > > 102 ... This > 103 * method differs from the public {...@code sort} method in that the > 104 * {...@code right} index is inclusive, and it does no range checking > 105 * on {...@code left} or {...@code right}. > > The sentinel technique that you use in lines 118 - 136 is questionable: you > are modifying a portion of the array outside the specified range of the call, > which arguably violates the contract of the call, and could be observed in a > multithreaded program. It's not beyond the realm of reason that it could > break existing clients. I will discuss it with Doug Lea and let you know what > he says. > > If we leave this optimization in, we should change the comments slightly. > Appearing to comment-out code (other than entire assertions in inner loops) > is never a good thing, hence the change to line 130 and the addition of the > line that follows. Also I reworded the commennt in lines 122-125 for clarity. > Changed lines are starred, added line has a "+": > > 118 // Use insertion sort on tiny arrays > 119 if (length < INSERTION_SORT_THRESHOLD) { > 120 if (left > 0) { > 121 /* > *122 * Temporarily set the value immediately preceding the range > *123 * to the minimum int value to serve as a sentinel. This > *124 * allows us to avoid the j >= left check on each iteration. > 125 */ > 126 int before = a[left - 1]; a[left - 1] = Integer.MIN_VALUE; > 127 > 128 for (int j, i = left + 1; i <= right; i++) { > 129 int ai = a[i]; > *130 for (j = i - 1; ai < a[j]; j--) { > +NEW // assert j >= left; > 131 a[j + 1] = a[j]; > 132 } > 133 a[j + 1] = ai; > 134 } > 135 a[left - 1] = before; > 136 } else { > > The comment in line 155 is misleading, and should be replace by this: > > I'd reword the comment in lines 137-140 for clarity: > > 137 /* > 138 * For case, when left == 0, traditional (without a sentinel) > 139 * insertion sort, optimized for server VM, is used. > 140 */ > > 155 // Inexpensive approximation of length / 7 > > The comment strategy for choosing sample points is subtle enough that it > deserves bit more commentary. I propose replacing line 158 with this: > > /* > * Sort five evenly spaced elements around (and including) the center > * element in the range. These elements will be used for pivot selection > * as described below. The choice for spacing these elements was empirically > * determined to work well on a wide variety of inputs. > */ > > Lines 183 and 184 are unnecessary array accesses. Probably better is: > > 183 int pivot1 = ae2; > 184 int pivot2 = ae4; > > This is essentially just a renaming of these two local variables, and > presumably the compiler will act accordingly. > > I prefer the original wording: > > 197 * Pointer k is the first index of -part > > to the revised wording: > > 217 * Pointer k is the first index of not inspected -part. > > I'd revert this change. > > I'd change 190 from: > > 190 if (pivot1 < pivot2) { > to > 190 if (pivot1 != pivot2) { > > It's clearer (this is the "pivots differ" case), and at least as fast. > > The spacing lines 249 and 250 is a bit quirky: > > 249 dualPivotQuicksort(a, left, less - 2); > 250 dualPivotQuicksort(a, great + 2, right); > > I'd replace it with the more standard: > > 249 dualPivotQuicksort(a, left, less - 2); > 250 dualPivotQuicksort(a, great + 2, right); > > Alternatively, you could make corresponding arguments line up: > > 249 dualPivotQuicksort(a, left, less - 2); > 250 dualPivotQuicksort(a, great + 2, right ); > > The policy change in line 256 is non-trivial: > > Old: > > 298 if (less < e1 && great > e5) { > > New: > > 256 if (great - less > 5 * seventh) { > > I previously experimented with the "new-style" policy (length-based, rather > than endpoint-based), but didn't get good results. Perhaps the symmetric > style combines well with the change in interval size (5/7 vs. 2/3). At any > rate, it's simpler than the old-style and conforms to the comment, so if > you're satisfied with the performance, I'm happy with it. > > In lines 362 and 363, you have the same "quirky spacing" as in linkes 249 and > 250: > > 361 // Sort left and right parts recursively > 362 dualPivotQuicksort(a, left, less - 1); > 363 dualPivotQuicksort(a, great + 1, right); > > Regards, > Josh