On 13 April 2012 17:42, Peter Geoghegan <pe...@2ndquadrant.com> wrote: > One insight that I had at the time was that text comparisons where the > c locale isn't used are really rather expensive, and I doubt that > there is too much that can be done to address that directly. However, > if we were to support timsort, that could substantially cut down on > the number of comparisons for text columns, which could be a real win. > Maybe that would happen through some explicit mechanism, or maybe the > planner could actually infer that it was likely to be optimal to use > timsort based on a high statistical correlation between physical row > ordering and logical ordering of a key column's values.
Further thoughts: At the time we committed our own quicksort implementation, based on the NetBSD one, we eschewed the optimisation of using insertion sort when n is fairly low. This happens to be a very common optimisation, so I'm not really super-confident that that was a good decision. However, we also added our own optimisation, which is to attempt, regardless of the size of n, to ascertain that the array is pre-sorted, in which case we don't quicksort at all. So if we attempt to quicksort an array which is almost pre-sorted, but say only has its very last element out-of-order, we'll do fairly horribly, because we waste the effort of almost an entire "bubble sort iteration". So almost-sorted data would seem to be a compelling thing to optimise for that reason, as well as for the simple observation that it isn't exactly uncommon in a relational database. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers