On 4/19/13 6:37 PM, Ivan Kazmenko wrote:
That [SwapStrategystable] uses the TimSort that contains
the optimization you look for.
alias mySort = sort!("a < b", SwapStrategy.stable);
Thank you, these are good for now.
A seemingly easy way to provide the choice globally would be to have an
assignable SwapStrategy.default in std.algorithm... or would that be a
bad design decision to have such a global state?
Yah, that would mean one can't reason what a piece of code does without
knowing what the context was.
Consider a sorted array. Append an element that is less than all the
previous elements. Then sort the array again by the sort function
from std.algorithm.
If you know that, then don't do a sort. Make some free space in
the array, shift the items, sort just the first part with a slice.
That behavior isn't always easy to predict.
Still, I think this is a problem in Phobos which should be fixed. In
most implementations of quicksort (even middle-element, the easiest to
come up with), one expects it to perform in O(n log n) with probability
close to one, except on some artificially constructed cases.
On a side note, I'm surprised that median-of-three (getPivot in
std.algorithm) fails at such a simple test, so I had something new to
learn there, too. If I just comment out the switch in getPivot and
"return mid" directly, it becomes fast enough in this case.
I could have sworn I made getPivot choose at random at some point. I
agree this should be fixed; there are a number of possibilities:
a) choose median of five
b) if the array is large, sort a stride of it first (e.g. 1%) then
choose the pivot as the median point of that stride
c) add introspection to the algorithm, i.e. if an attempt to partition
hits the worst case or near worst case, just try another pivot instead
of moving forward with the sorting stage.
Another view at this is the following. As far as I know by now, one of
the points beyond D (and Phobos) design is to have everything safe by
default, but faster if you want it explicitly and you know what you are
doing.
That view of safety is different (memory safety).
In this particular case, that would mean having a worst-case O(n
log n) sort, and/or a stable sort, by default - but with the opportunity
to switch to the (time-wise and stability-wise) unsafe quicksort if you
really need the extra speedup and know what you are doing. So, why isn't
TimSort the default?
TimSort is slower on average.
Andrei