On Mon, May 11, 2015 at 10:11 AM, Noah Misch <n...@leadboat.com> wrote:
> On Mon, May 11, 2015 at 08:29:05AM -0400, Robert Haas wrote:
>> Given your concerns, and the need to get a fix for this out the door
>> quickly, what I'm inclined to do for the present is go bump the
>> threshold from 25% of MaxMultiXact to 50% of MaxMultiXact without
>> changing anything else.
>
> +1
>
>> Your analysis shows that this is more in line
>> with the existing policy for multixact IDs than what I did, and it
>> will reduce the threat of frequent wraparound scans.  Now, it will
>> also increase the chances of somebody hitting the wall before
>> autovacuum can bail them out.  But maybe not that much.  If we need
>> 75% of the multixact member space to complete one cycle of
>> anti-wraparound vacuums, we're actually very close to the point where
>> the system just cannot work.  If that's one big table, we're done.
>
> Agreed.

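For anyone following along, the clamping behavior discussed above can be
sketched roughly as follows.  This is an illustration only, not the
PostgreSQL source.  The function name, the 2^32 member-space size, the
75% upper point, the linear ramp between the two points, and the final
cap are all assumptions made for the example.  Only the 50% starting
point comes from this thread.

```python
# Hypothetical model of clamping autovacuum_multixact_freeze_max_age
# as multixact member space fills up. All names here are illustrative.

MEMBER_SPACE = 2**32                                # assumed size of the member offset space
SAFE_THRESHOLD = MEMBER_SPACE // 2                  # 50%: clamping starts above this point
DANGER_THRESHOLD = MEMBER_SPACE - MEMBER_SPACE // 4 # 75%: assumed point where the age hits zero


def effective_freeze_max_age(members_in_use, multixacts_in_use, configured_max_age):
    """Return the freeze max age autovacuum would act on in this model."""
    # Below the safe threshold the configured setting is used unchanged.
    if members_in_use <= SAFE_THRESHOLD:
        return configured_max_age

    # At or past the danger threshold, force the most aggressive setting.
    if members_in_use >= DANGER_THRESHOLD:
        return 0

    # In between, give up a share of the existing multixacts that grows
    # linearly with member space usage (integer arithmetic throughout).
    victims = (multixacts_in_use * (members_in_use - SAFE_THRESHOLD)
               // (DANGER_THRESHOLD - SAFE_THRESHOLD))

    # Never return more than the configured setting (a choice made for
    # this sketch).
    return min(configured_max_age, multixacts_in_use - victims)
```

With the starting point at 50% rather than 25%, a system using 40% of
member space is left alone in this model, where previously it would
already have been pushed toward earlier anti-wraparound scans.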
OK, I have made this change.  Barring further trouble reports, this
completes the multixact work I plan to do for the next release.  Here
is what is outstanding:

1. We might want to introduce a GUC to control the point at which
member offset utilization begins clamping
autovacuum_multixact_freeze_max_age.  It doesn't seem wise to do
anything about this before pushing a minor release out.  It's not
entirely trivial, and it may be helpful to learn more about how the
changes already made work out in practice before proceeding.  Also, we
might not back-patch this anyway.

2. The recent changes adjust things - for good reason - so that the
safe threshold for multixact member creation is advanced only at
checkpoint time.  This means it's theoretically possible to have a
situation where autovacuum has done all it can, but because no
checkpoint has happened yet, the user can't create any more
multixacts.  Thanks to some good work by Thomas, autovacuum will
realize this and avoid spinning uselessly over every table in the
system, which is good, but you're still stuck with errors until the
next checkpoint.  Essentially, we're hoping that autovacuum will clean
things up far enough in advance of hitting the threshold where we have
to throw an error that a checkpoint will intervene before the error
starts happening.  It's possible we could improve this further, but I
think it would be unwise to mess with it right now.  It may be that
there is no real-world problem here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers