On Monday, 17 December 2012 at 15:28:36 UTC, bearophile wrote:
Regarding the recent Phobos improvements that introduce a Timsort:

http://forum.dlang.org/thread/50c8a4e67f79_3fdd19b7ae8146...@sh3.rs.github.com.mail

I have found a blog post that compares the performance of Timsort, Smoothsort, and std::stable_sort:

http://www.altdevblogaday.com/2012/06/15/smoothsort-vs-timsort/

Bye,
bearophile

While Smoothsort may be mathematically sound, it simply doesn't translate well to computer hardware. It's a variant of heap sort which requires a great deal of random access, whereas Timsort is a variant of merge sort which is largely sequential and benefits from the CPU cache. Furthermore, the Leonardo heap is much more computationally expensive than a typical binary or ternary heap.

Both Timsort and Smoothsort are what you call "natural sorts," meaning they typically require fewer comparisons on data with low entropy. They're also rather complex which means added overhead. When sorting primitive types like int, comparisons are inexpensive, so the overhead makes these algorithms slower. But had he tested it on a data type like strings, then we'd likely see Timsort take the lead.

On purely random data, quick sort and merge sort will win most of the time. But Timsort has an advantage over Smoothsort of being an adaptive algorithm; the so called "galloping mode," which is computationally expensive, is only activated when minGallop reaches a certain threshold and therefore beneficial. Otherwise, a simple linear merge is used (i.e. merge sort).

On another note, I highly doubt that std::sort uses a "median of medians" algorithm, which would add much overhead and essentially double the number of comparisons required with little to no benefit. More likely, it simply chooses the pivot from a median of three.

Reply via email to