On Tue, Sep 6, 2016 at 10:51 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> I still don't get it. When building the initial runs, we don't need buffer
> space for maxTapes yet, because we're only writing to a single tape at a
> time. An unused tape shouldn't take much memory. In inittapes(), when we
> have built all the runs, we know how many tapes we actually needed, and we
> can allocate the buffer memory accordingly.

Right. That's correct. But we're not concerned about physically
allocated memory; we're concerned about logically allocated memory
(i.e., what goes into USEMEM()). tuplesort.c should be able to fully
use the workMem specified by the caller in the event of an external
sort, just as with an internal sort.

> [thinks a bit, looks at logtape.c]. Hmm, I guess that's wrong, because of
> the way this all is implemented. When we're building the initial runs, we're
> only writing to one tape at a time, but logtape.c nevertheless holds onto a
> BLCKSZ'd currentBuffer, plus one buffer for each indirect level, for every
> tape that has been used so far. What if we changed LogicalTapeRewind to free
> those buffers?

There isn't much point in that, because those buffers are never
physically allocated in the first place when there are thousands of
tapes. They are, however, entered into the tuplesort.c accounting as if
they were, denying tuplesort.c the full benefit of available workMem.
It doesn't matter if you USEMEM() or FREEMEM() after we first spill to
disk, but before we begin the merge. (We already refund the
unused-but-logically-allocated memory from unused tapes at the
beginning of the merge (within beginmerge()), so we can't do any better
than we already are from that point on -- that makes the batch
memtuples growth thing slightly more effective.)

--
Peter Geoghegan
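
[Illustration, not part of the original message] The distinction above
between logically and physically allocated memory can be sketched with a
toy accounting model. This is not tuplesort.c code: ToySortState, the
toy_* functions, and the one-block-per-tape overhead are hypothetical
stand-ins, chosen only to show how charging every possible tape up front
shrinks the budget available for tuples, and how refunding the unused
tapes restores it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-tape charge: one 8 kB block. The real overhead may differ. */
#define TOY_BLCKSZ 8192

/* Hypothetical stand-in for the sort's memory budget. */
typedef struct ToySortState
{
    int64_t allowedMem;   /* total budget (workMem) in bytes */
    int64_t availMem;     /* budget not yet logically charged */
} ToySortState;

static void
toy_init(ToySortState *s, int64_t workMemBytes)
{
    s->allowedMem = workMemBytes;
    s->availMem = workMemBytes;
}

/* Logical charge: only the counter moves, nothing is malloc'd here. */
static void
toy_usemem(ToySortState *s, int64_t amt)
{
    s->availMem -= amt;
}

/* Logical refund: gives budget back without freeing anything. */
static void
toy_freemem(ToySortState *s, int64_t amt)
{
    assert(s->availMem + amt <= s->allowedMem);
    s->availMem += amt;
}

/* Charge a buffer for every possible tape, used or not. */
static int64_t
toy_charge_tapes(ToySortState *s, int maxTapes)
{
    int64_t charge = (int64_t) maxTapes * TOY_BLCKSZ;

    toy_usemem(s, charge);
    return charge;
}

/* Refund the charge for tapes that never received a run. */
static int64_t
toy_refund_unused(ToySortState *s, int maxTapes, int usedTapes)
{
    int64_t refund = (int64_t) (maxTapes - usedTapes) * TOY_BLCKSZ;

    toy_freemem(s, refund);
    return refund;
}
```

With a 64 MB budget and 1000 possible tapes, toy_charge_tapes() removes
1000 * 8192 bytes from availMem even though no buffer has been
allocated; if only 6 tapes end up holding runs, toy_refund_unused()
returns the charge for the other 994. The point of the model is that the
timing of the refund decides how much of the budget is usable while the
initial runs are being built.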