On Friday, 19 April 2013 at 22:56:00 UTC, bearophile wrote:
So, why isn't TimSort the default?

Probably because someone thinks that "on average" the other sort is faster.

In theory the other should be faster, because if you relax the constraints of the sort being stable I think in theory you should be able to write a little faster sorting algorithm (I don't know if this is true).

I'm just throwing my 2c in, but I think TimSort ought to be the default. It's simply the safer option. Since worst-case performance can be designed (and it can be designed unless the pivots are, at least, pseudo-randomly chosen), there is a very real risk of the default sort being used in a way that can be compromised by an attacker. From this perspective, seems to be like TimSort being default would support the design goal #5 of D, "Make doing things the right way easier than the wrong way."

Plus, TimSort seems to be faster in most cases in my attempts. The only cases I could find that it wasn't faster is when I could guarantee that the data I was passing in had no order to it. In cases where you suspect (but can't guarantee) data is sorted when passed in (think fetching from a web API and it gives you sorted data, but the docs don't say it should), calling TimSort is nearly as fast as calling an "isSorted" ... So, my recommendation is to just call TimSort and don't worry about the extra branching code ([checking sortedness, if not sorted, call sort] vs [call sort]). This makes it so the "fast" code is also very convenient to write.

TBH, though, I think the reason that TimSort is not the default is because it was added only semi-recently. The old stable sort was not nearly as fast as the unstable sort (and, in fact, IIRC, it didn't work properly when I tried it). I doubt that someone intentionally said that quicksort was faster than TimSort on average ... it's just that no one thought to change the default when TimSort was added.

Maybe an enhancement request could be made on this?

Reply via email to