Robert Haas wrote:
> On Sat, Mar 7, 2015 at 5:49 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> > On 2015-03-05 15:28:12 -0600, Jim Nasby wrote:
> >> I was thinking the simpler route of just repalloc'ing... the memcpy would
> >> suck, but much less so than the extra index pass.  64M gets us 11M tuples,
> >> which probably isn't very common.
> >
> > That has the chance of considerably increasing the peak memory usage
> > though, as you obviously need both the old and new allocation during the
> > repalloc().
> >
> > And in contrast to the unused memory at the tail of the array, which
> > will usually not be actually allocated by the OS at all, this is memory
> > that's actually read/written respectively.
>
> Yeah, I'm not sure why everybody wants to repalloc() that instead of
> making several separate allocations as needed.  That would avoid
> increasing peak memory usage, and would avoid any risk of needing to
> copy the whole array.  Also, you could grow in smaller chunks, like
> 64MB at a time instead of 1GB or more at a time.  Doubling an
> allocation that's already 1GB or more gets big in a hurry.
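To make the chunk-list idea concrete, here is a minimal sketch (not PostgreSQL's actual code; DeadTid, TidChunk, and the chunk size are all made up for illustration) of growing the dead-tuple storage one fixed-size chunk at a time, so nothing is ever repalloc'd or copied and peak usage never doubles:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heap TID; not PostgreSQL's ItemPointerData. */
typedef struct DeadTid { unsigned blk; unsigned short off; } DeadTid;

/* Each chunk holds a fixed number of TIDs; chunks are linked, never resized. */
#define CHUNK_TIDS 4096

typedef struct TidChunk
{
    struct TidChunk *next;
    int         ntids;
    DeadTid     tids[CHUNK_TIDS];
} TidChunk;

typedef struct TidList
{
    TidChunk   *head;
    TidChunk   *tail;
    long        total;
} TidList;

static void
tidlist_add(TidList *list, unsigned blk, unsigned short off)
{
    if (list->tail == NULL || list->tail->ntids == CHUNK_TIDS)
    {
        /* Grow by one fixed-size chunk: no memcpy, no doubled peak usage. */
        TidChunk *c = malloc(sizeof(TidChunk));

        c->next = NULL;
        c->ntids = 0;
        if (list->tail)
            list->tail->next = c;
        else
            list->head = c;
        list->tail = c;
    }
    list->tail->tids[list->tail->ntids].blk = blk;
    list->tail->tids[list->tail->ntids].off = off;
    list->tail->ntids++;
    list->total++;
}
```

Since TIDs are appended in physical scan order, a membership test can still binary-search within each chunk (the chunks themselves are ordered), so the index-cleanup lookup stays cheap.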
Yeah, a chunk list rather than a single chunk seemed a good idea to me
too.  Also, I think the idea of starting with an allocation assuming some
small number of dead tuples per page made sense -- and by the time that
space has run out, you have a better estimate of the actual density of
dead tuples, so you can do the second allocation based on that new
estimate (but perhaps clamp it at, say, 1 GB, just in case you just
scanned a portion of the table with an unusually high percentage of dead
tuples.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
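The re-estimate-and-clamp step above might look something like this sketch (the helper name, the 6-byte TID size, and the parameters are all illustrative assumptions, not PostgreSQL code):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizing helper: size the next dead-tuple allocation from the
 * density observed so far, clamped to a 1 GB ceiling in case the portion
 * scanned so far has an unusually high fraction of dead tuples. */
#define TID_BYTES 6                     /* size of one heap TID */
#define MAX_ALLOC (INT64_C(1) << 30)    /* 1 GB clamp */

static int64_t
next_alloc_bytes(int64_t pages_scanned, int64_t pages_total,
                 int64_t dead_so_far)
{
    /* Observed density: dead tuples per page scanned so far. */
    double      per_page = (double) dead_so_far / pages_scanned;
    int64_t     remaining = pages_total - pages_scanned;
    int64_t     bytes = (int64_t) (per_page * remaining) * TID_BYTES;

    return bytes < MAX_ALLOC ? bytes : MAX_ALLOC;
}
```

For example, having seen 50 dead tuples per page over the first 1000 of 11000 pages, the second allocation would be sized for 50 * 10000 TIDs (3 MB), rather than blindly doubling.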