On 04/14/2017 08:19 AM, Peter Geoghegan wrote:
BTW, I'm skeptical of the idea of Heikki's around killing polyphase
merge itself at this point. I think that keeping most tapes active per
pass is useful now that our memory accounting involves handing over an
even share to each maybe-active tape for every merge pass, something
established by Heikki's work on external sorting.

The pre-read buffers are only needed for input tapes; the total number of tapes doesn't matter.

For comparison, imagine that you want to perform a merge, such that you always merge 7 runs into one. With polyphase merge, you would need 8 tapes, so that you always read from 7 of them, and write onto one. With balanced merge, you would need 14 tapes: you always read from 7 tapes, and you would need up to 7 output tapes, of which one would be active at any given time.

Those extra idle output tapes are practically free in our implementation. The "pre-read buffers" are only needed for input tapes, the number of output tapes doesn't matter. Likewise, maintaining the heap is cheaper if you only merge a small number of tapes at a time, but that's also dependent on the number of *input* tapes, not the total number of tapes.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to