On Thu, Jul 30, 2015 at 12:09 PM, Heikki Linnakangas <hlinn...@iki.fi>
wrote:

>
> True, you need a heap to hold the next tuple from each tape in the merge
> step. To avoid exceeding work_mem, you'd need to push some tuples from the
> in-memory array to the tape to make room for that. In practice, though, the
> memory needed for the merge step's heap is tiny. Even if you merge 1000
> tapes, you only need memory for 1000 tuples in the heap. But yeah, you'll
> need some logic to share/divide the in-memory array between the heap and
> the "in-memory tail" of the last tape.


It's a bit worse than that because we buffer up a significant chunk of
each tape to avoid seeking randomly between tapes after every tuple
read. But I think in today's era of large memory we don't need anywhere
near the entire work_mem just for buffering to avoid random access.
Something simple like a fixed buffer size per tape, probably much less
than 1MB/tape, would be enough.
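
Roughly, per tape, something like this (a sketch with a made-up
TapeReader type and an arbitrarily chosen 256kB buffer; the real
logtape.c interface differs):

    #include <stdio.h>
    #include <string.h>

    #define TAPE_BUF_SIZE (256 * 1024)  /* fixed, not a share of work_mem */

    typedef struct
    {
        FILE   *fp;                 /* underlying tape file */
        size_t  nbytes;             /* valid bytes in buf */
        size_t  pos;                /* read position in buf */
        char    buf[TAPE_BUF_SIZE];
    } TapeReader;

    /*
     * Read len bytes from the tape.  Each refill is one big sequential
     * read, so we pay for at most one seek per buffer-load rather than
     * one per tuple.  Returns 0 at end of tape.
     */
    static int
    tape_read(TapeReader *t, void *dst, size_t len)
    {
        char   *out = dst;

        while (len > 0)
        {
            if (t->pos == t->nbytes)
            {
                t->nbytes = fread(t->buf, 1, TAPE_BUF_SIZE, t->fp);
                t->pos = 0;
                if (t->nbytes == 0)
                    return 0;       /* end of tape */
            }
            size_t  chunk = t->nbytes - t->pos;

            if (chunk > len)
                chunk = len;
            memcpy(out, t->buf + t->pos, chunk);
            t->pos += chunk;
            out += chunk;
            len -= chunk;
        }
        return 1;
    }

    int
    main(void)
    {
        /* demo: spool some ints to a temp "tape", read them back */
        static TapeReader t;    /* static: keeps the big buffer off the stack */
        int     v;

        t.fp = tmpfile();
        for (int i = 0; i < 10; i++)
            fwrite(&i, sizeof(i), 1, t.fp);
        rewind(t.fp);

        while (tape_read(&t, &v, sizeof(v)))
            printf("%d ", v);
        putchar('\n');
        fclose(t.fp);
        return 0;
    }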

I'm a bit confused about where the big win comes from, though. Is
what's going on that the external sort only exceeded memory by a small
amount, so nearly all the tuples are still in memory?


-- 
greg
