Josh,

Quick note on changing

int pivot1 = a[e2[;
int pivot2 = a[e4];

by

int pivot1 = ae2;
int pivot2 = ae4;

It is extremely surprised, but version with local variables eats
5% (!) of time for client and 2% for server mode (in compare
with one-pivot implementation from JDK 6), see:

                     client   server
with a[e2], a[e4]:   60.75%   48.20%
    with ae2, ae4:   65.80%   50.42%

I don't have idea why this simple change is so expensive.
Does anybody can explain it?

Vladimir

Tue, 4 May 2010 21:57:42 -0700 письмо от Joshua Bloch <j...@google.com>:
> 
> Vladimir,
> 
> Hi. I'm thrilled that you were able to eke so much more perforance out of 
> this already optimized code. I read your changes on my flight to Pittsburgh. 
> Here's the code review:
> 
> I think the comment on lines 102-105 was too useful to delete:
> 
> 102    ... This
> 103   * method differs from the public {...@code sort} method in that the
> 104   * {...@code right} index is inclusive, and it does no range checking
> 105   * on {...@code left} or {...@code right}.
> 
> The sentinel technique that you use in lines 118 - 136 is questionable: you 
> are modifying a portion of the array outside the specified range of the call, 
> which arguably violates the contract of the call, and could be observed in a 
> multithreaded program. It's not beyond the realm of reason that it could 
> break existing clients. I will discuss it with Doug Lea and let you know what 
> he says.
> 
> If we leave this optimization in, we should change the comments slightly. 
> Appearing to comment-out code (other than entire assertions in inner loops) 
> is never a good thing, hence the change to line 130 and the addition of the 
> line that follows. Also I reworded the commennt in lines 122-125 for clarity. 
> Changed lines are starred, added line has a "+":
> 
> 118     // Use insertion sort on tiny arrays
> 119     if (length < INSERTION_SORT_THRESHOLD) {
> 120       if (left > 0) {
> 121         /*
> *122         * Temporarily set the value immediately preceding the range
> *123         * to the minimum int value to serve as a sentinel. This
> *124         * allows us to avoid the j >= left check on each iteration.
> 125         */ 
> 126         int before = a[left - 1]; a[left - 1] = Integer.MIN_VALUE;
> 127
> 128         for (int j, i = left + 1; i <= right; i++) {
> 129           int ai = a[i];
> *130           for (j = i - 1; ai < a[j]; j--) {
> +NEW             // assert j >= left;
> 131             a[j + 1] = a[j];
> 132           }
> 133           a[j + 1] = ai;
> 134         }
> 135         a[left - 1] = before;
> 136       } else {
> 
> The comment in line 155 is misleading, and should be replace by this:
> 
> I'd reword the comment in lines 137-140 for clarity:
> 
> 137         /*
> 138         * For case, when left == 0, traditional (without a sentinel)
> 139         * insertion sort, optimized for server VM, is used.
> 140         */ 
> 
> 155     // Inexpensive approximation of length / 7
> 
> The comment strategy for choosing sample points is subtle enough that it 
> deserves bit more commentary. I propose replacing line 158 with this:
> 
>   /*
>   * Sort five evenly spaced elements around (and including) the center
>   * element in the range. These elements will be used for pivot selection
>   * as described below. The choice for spacing these elements was empirically
>   * determined to work well on a wide variety of inputs.
>   */ 
> 
> Lines 183 and 184 are unnecessary array accesses. Probably better is:
> 
> 183     int pivot1 = ae2;
> 184     int pivot2 = ae4;
> 
> This is essentially just a renaming of these two local variables, and 
> presumably the compiler will act accordingly.
> 
> I prefer the original wording:
> 
> 197       * Pointer k is the first index of -part
> 
> to the revised wording:
> 
> 217       * Pointer k is the first index of not inspected -part.
> 
> I'd revert this change.
> 
> I'd change 190 from:
> 
> 190     if (pivot1 < pivot2) {
> to
> 190     if (pivot1 != pivot2) {
> 
> It's clearer (this is the "pivots differ" case), and at least as fast.
> 
> The spacing lines 249 and 250 is a bit quirky:
> 
> 249       dualPivotQuicksort(a, left,  less - 2);
> 250       dualPivotQuicksort(a, great + 2, right);
> 
> I'd replace it with the more standard:
> 
> 249       dualPivotQuicksort(a, left, less - 2);
> 250       dualPivotQuicksort(a, great + 2, right);
> 
> Alternatively, you could make corresponding arguments line up:
> 
> 249       dualPivotQuicksort(a, left,   less - 2);
> 250       dualPivotQuicksort(a, great + 2, right  );
> 
> The policy change in line 256 is non-trivial:
> 
> Old:
> 
> 298     if (less < e1 && great > e5) {
> 
> New:
> 
> 256       if (great - less > 5 * seventh) {
> 
> I previously experimented with the "new-style" policy (length-based, rather 
> than endpoint-based), but didn't get good results. Perhaps the symmetric 
> style combines well with the change in interval size (5/7 vs. 2/3). At any 
> rate, it's simpler than the old-style and conforms to the comment, so if 
> you're satisfied with the performance, I'm happy with it.
> 
> In lines 362 and 363, you have the same "quirky spacing" as in linkes 249 and 
> 250:
> 
> 361       // Sort left and right parts recursively
> 362       dualPivotQuicksort(a, left,  less - 1);
> 363       dualPivotQuicksort(a, great + 1, right);
> 
>  Regards,
>  Josh

Reply via email to