On Thu, Aug 20, 2015 at 8:15 AM, Simon Riggs <si...@2ndquadrant.com> wrote: > On 20 August 2015 at 03:24, Peter Geoghegan <p...@heroku.com> wrote: >> >> >> The patch is ~3.25x faster than master > > > I've tried to read this post twice and both times my work_mem overflowed. > ;-) > > Can you summarize what this patch does? I understand clearly what it doesn't > do...
The most important thing that it does is always quicksort runs, that are formed by simply filling work_mem with tuples in no particular order, rather than trying to make runs that are twice as large as work_mem on average. That's what the ~3.25x improvement concerned. That's actually a significantly simpler algorithm than replacement selection, and appears to be much faster. You might even say that it's a dumb algorithm, because it is less sophisticated than replacement selection. However, replacement selection tends to use CPU caches very poorly, while its traditional advantages have become dramatically less important due to large main memory sizes in particular. Also, it hurts that we don't currently dump tuples in batches, for several reasons. Better to do memory intense operations in batch, rather than having a huge inner loop, in order to minimize or prevent instruction cache misses. And we can better take advantage of asynchronous I/O. The complicated aspect of considering the patch is whether or not it's okay to not use replacement selection anymore -- is that an appropriate trade-off? The reason that the code has not actually been simplified by this patch is that I still want to use replacement selection for one specific case: when it is anticipated that a "quicksort with spillover" can occur, which is only possible with incremental spilling. That may avoid most I/O, by spilling just a few tuples using a heap/priority queue, and quicksorting everything else. That's compelling when you can manage it, but no reason to always use replacement selection for the first run in the common case where there well be several runs in total. Is that any clearer? To borrow a phrase from the processor architecture community, from a high level this is a "Brainiac versus Speed Demon" [1] trade-off. (I wish that there was a widely accepted name for this trade-off.) [1] http://www.lighterra.com/papers/modernmicroprocessors/#thebrainiacdebate -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers