Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart writes: > On Thu, Dec 29, 2022 at 03:29:15PM -0500, Tom Lane wrote: >> I tried to make a patch along these lines, and soon hit a stumbling >> block: ONLY is a fully-reserved SQL keyword. I don't think this >> syntax is attractive enough to justify requiring people to >> double-quote the option, so we are back to square one. Anybody >> have a different suggestion? > ... adding both SKIP_DATABASE_STATS and ONLY_DATABASE_STATS might be the > best bet. Nobody has proposed a different bikeshed color, so I'm going to proceed with that syntax. I'll incorporate the parallel-machinery fix from your patch and push to HEAD only (since it's hard to argue this isn't a new feature). This needn't foreclose pursuing the various ideas about making vac_update_datfrozenxid faster; but none of those would eliminate the fundamental O(N^2) issue AFAICS. regards, tom lane
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
On Thu, Dec 29, 2022 at 03:29:15PM -0500, Tom Lane wrote: > Nathan Bossart writes: >> On Thu, Dec 29, 2022 at 12:22:58PM -0500, Tom Lane wrote: >>> Justin Pryzby writes: VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) > VACUUM (DATABASE_STATS {UPDATE,SKIP,ONLY}) > >> +1 for only introducing one option. IMHO UPDATE_DATABASE_STATS fits a >> little better since it states the action like most of the other options, >> but I think both choices are sufficiently clear. > > I tried to make a patch along these lines, and soon hit a stumbling > block: ONLY is a fully-reserved SQL keyword. I don't think this > syntax is attractive enough to justify requiring people to > double-quote the option, so we are back to square one. Anybody > have a different suggestion? Hm. I thought about using PreventInTransactionBlock() for the function, but that probably won't work for a few reasons. AFAICT we'd need to restrict it to only be callable via "SELECT update_database_stats()", which feels a bit unnatural. There was some discussion elsewhere [0] about adding a PROCESS_MAIN_RELATION option or expanding PROCESS_TOAST to simplify vacuuming the TOAST table directly. If such an option existed, you could call VACUUM (PROCESS_MAIN_RELATION FALSE, PROCESS_TOAST FALSE, UPDATE_DATABASE_STATES TRUE) pg_class; to achieve roughly what we need. I'll admit this is hacky, though. So, adding both SKIP_DATABASE_STATS and ONLY_DATABASE_STATS might be the best bet. [0] https://postgr.es/m/20221215191246.GA252861%40nathanxps13 -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart writes: > On Thu, Dec 29, 2022 at 12:22:58PM -0500, Tom Lane wrote: >> Justin Pryzby writes: >>> VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) VACUUM (DATABASE_STATS {UPDATE,SKIP,ONLY}) > +1 for only introducing one option. IMHO UPDATE_DATABASE_STATS fits a > little better since it states the action like most of the other options, > but I think both choices are sufficiently clear. I tried to make a patch along these lines, and soon hit a stumbling block: ONLY is a fully-reserved SQL keyword. I don't think this syntax is attractive enough to justify requiring people to double-quote the option, so we are back to square one. Anybody have a different suggestion? regards, tom lane
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
On Wed, Dec 28, 2022 at 07:03:29PM -0800, Andres Freund wrote: > Separately I wonder if it's worth micro-optimizing vac_update_datfrozenxid() a > bit. I e.g. see a noticable speedup bypassing systable_getnext() and using > heap_getnext(). It's really too bad that we want to check for "in the future" > xids, otherwise we could use a ScanKey to filter at a lower level. Another thing I'm exploring is looking up the datfrozenxid/datminmxid before starting the pg_class scan so that the scan can be stopped early if it sees that we cannot possibly advance the values. The overwrite-corrupt-values logic might make this a little more complicated, but I think it'd be sufficient to force the pg_class scan to complete if we ever see a value "in the future." Overwriting the corrupt value might be delayed, but it would eventually happen once the table ages advance. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart writes: > +1 for only introducing one option. IMHO UPDATE_DATABASE_STATS fits a > little better since it states the action like most of the other options, > but I think both choices are sufficiently clear. Consistency of VACUUM's options seems like a lost cause already :-(. Between them, DISABLE_PAGE_SKIPPING, SKIP_LOCKED, and PROCESS_TOAST cover just about the entire set of possibilities for describing a boolean option --- we couldn't even manage to be consistent about whether ON or OFF is the default, let alone where the verb is. And it's hard to argue that FULL, VERBOSE, or PARALLEL is a verb. regards, tom lane
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
On Thu, Dec 29, 2022 at 12:22:58PM -0500, Tom Lane wrote: >> Justin Pryzby writes: >>> VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) > VACUUM (DATABASE_STATS {UPDATE,SKIP,ONLY}) +1 for only introducing one option. IMHO UPDATE_DATABASE_STATS fits a little better since it states the action like most of the other options, but I think both choices are sufficiently clear. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
I wrote: > Justin Pryzby writes: >> I assumed it would look like: >> VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) > Meh. We could do it like that, but I think options that look like > booleans but aren't are messy. Note that I'm not necessarily objecting to there being just one option, only to its values being a superset-of-boolean. Perhaps this'd work: VACUUM (DATABASE_STATS {UPDATE,SKIP,ONLY}) regards, tom lane
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Hi, On 2022-12-28 15:13:23 -0500, Tom Lane wrote: > [ redirecting to -hackers because patch attached ] > > Peter Geoghegan writes: > > On Fri, Dec 16, 2022 at 6:49 AM Tom Lane wrote: > >> That is a really good point. How about teaching VACUUM to track > >> the oldest original relfrozenxid and relminmxid among the table(s) > >> it processed, and skip vac_update_datfrozenxid unless at least one > >> of those matches the database's values? For extra credit, also > >> skip if we didn't successfully advance the source rel's value. > > > Hmm. I think that that would probably work. > > I poked into that idea some more and concluded that getting VACUUM to > manage it behind the user's back is not going to work very reliably. > The key problem is explained by this existing comment in autovacuum.c: > > * Even if we didn't vacuum anything, it may still be important to do > * this, because one indirect effect of vac_update_datfrozenxid() is to > * update ShmemVariableCache->xidVacLimit. That might need to be done > * even if we haven't vacuumed anything, because relations with older > * relfrozenxid values or other databases with older datfrozenxid values > * might have been dropped, allowing xidVacLimit to advance. > > That is, if the table that's holding back datfrozenxid gets dropped > between VACUUM runs, VACUUM would never think that it might have > advanced the global minimum. I wonder if a less aggressive version of this idea might still work. Perhaps we could use ShmemVariableCache->latestCompletedXid or ShmemVariableCache->nextXid to skip at least some updates? Obviously this isn't going to help if there's a lot of concurrent activity, but the case of just running vacuumdb -a might be substantially improved. Separately I wonder if it's worth micro-optimizing vac_update_datfrozenxid() a bit. I e.g. see a noticable speedup bypassing systable_getnext() and using heap_getnext(). It's really too bad that we want to check for "in the future" xids, otherwise we could use a ScanKey to filter at a lower level. Greetings, Andres Freund
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Justin Pryzby writes: > On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: >> I'm forced to the conclusion that we have to expose some VACUUM >> options if we want this to work well. Attached is a draft patch >> that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options >> (name bikeshedding welcome) and teaches vacuumdb to use them. > I assumed it would look like: > VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) Meh. We could do it like that, but I think options that look like booleans but aren't are messy. regards, tom lane
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: > Peter Geoghegan writes: > > On Fri, Dec 16, 2022 at 6:49 AM Tom Lane wrote: > >> That is a really good point. How about teaching VACUUM to track > >> the oldest original relfrozenxid and relminmxid among the table(s) > >> it processed, and skip vac_update_datfrozenxid unless at least one > >> of those matches the database's values? For extra credit, also > >> skip if we didn't successfully advance the source rel's value. > > > Hmm. I think that that would probably work. > > I'm forced to the conclusion that we have to expose some VACUUM > options if we want this to work well. Attached is a draft patch > that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options > (name bikeshedding welcome) and teaches vacuumdb to use them. I was surprised to hear that this added *two* options. I assumed it would look like: VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) -- Justin
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart writes: > On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: >> +executeCommand(conn, "VACUUM (ONLY_DATABASE_STATS);", echo); > When I looked at this, I thought it would be better to send the command > through the parallel slot machinery so that failures would use the same > code path as the rest of the VACUUM commands. However, you also need to > adjust ParallelSlotsWaitCompletion() to mark the slots as idle so that the > slot array can be reused after it is called. Hm. I was just copying the way commands are issued further up in the same function. But I think you're right: once we've done ParallelSlotsAdoptConn(sa, conn); it's probably not entirely kosher to use the conn directly. regards, tom lane
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
On Wed, Dec 28, 2022 at 04:20:19PM -0500, Tom Lane wrote: > Nathan Bossart writes: >> I think the main difference between your patch and mine is that I've >> exposed vac_update_datfrozenxid() via a function instead of a VACUUM >> option. IMHO that feels a little more natural, but I can't say I feel too >> strongly about it. > > I thought about that but it seems fairly unsafe, because that means > that vac_update_datfrozenxid is executing inside a user-controlled > transaction. I don't think it will hurt us if the user does a > ROLLBACK afterward --- but if he sits on the open transaction, > that would be bad, if only because we're still holding the > LockDatabaseFrozenIds lock which will block other VACUUMs. > There might be more hazards besides that; certainly no one has ever > tried to run vac_update_datfrozenxid that way before. That's a good point. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: > + /* If we used SKIP_DATABASE_STATS, mop up with ONLY_DATABASE_STATS */ > + if (vacopts->skip_database_stats && stage == ANALYZE_NO_STAGE && > !failed) > + { > + executeCommand(conn, "VACUUM (ONLY_DATABASE_STATS);", echo); > + } When I looked at this, I thought it would be better to send the command through the parallel slot machinery so that failures would use the same code path as the rest of the VACUUM commands. However, you also need to adjust ParallelSlotsWaitCompletion() to mark the slots as idle so that the slot array can be reused after it is called. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart writes: > I think the main difference between your patch and mine is that I've > exposed vac_update_datfrozenxid() via a function instead of a VACUUM > option. IMHO that feels a little more natural, but I can't say I feel too > strongly about it. I thought about that but it seems fairly unsafe, because that means that vac_update_datfrozenxid is executing inside a user-controlled transaction. I don't think it will hurt us if the user does a ROLLBACK afterward --- but if he sits on the open transaction, that would be bad, if only because we're still holding the LockDatabaseFrozenIds lock which will block other VACUUMs. There might be more hazards besides that; certainly no one has ever tried to run vac_update_datfrozenxid that way before. regards, tom lane
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: > I'm forced to the conclusion that we have to expose some VACUUM > options if we want this to work well. Attached is a draft patch > that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options > (name bikeshedding welcome) and teaches vacuumdb to use them. This is the conclusion I arrived at, too. In fact, I was just about to post a similar patch set. I'm attaching it here anyway, but I'm fine with proceeding with your version. I think the main difference between your patch and mine is that I've exposed vac_update_datfrozenxid() via a function instead of a VACUUM option. IMHO that feels a little more natural, but I can't say I feel too strongly about it. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From a46934c1a6cec7a5efe8a25d49507a7a2f59c928 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 27 Dec 2022 15:21:26 -0800 Subject: [PATCH v1 1/3] add UPDATE_DATFROZENXID option to VACUUM --- doc/src/sgml/ref/vacuum.sgml | 22 ++ src/backend/commands/vacuum.c| 9 ++--- src/backend/postmaster/autovacuum.c | 2 +- src/include/commands/vacuum.h| 1 + src/test/regress/expected/vacuum.out | 2 ++ src/test/regress/sql/vacuum.sql | 3 +++ 6 files changed, 35 insertions(+), 4 deletions(-) diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml index e14ead8826..1219614507 100644 --- a/doc/src/sgml/ref/vacuum.sgml +++ b/doc/src/sgml/ref/vacuum.sgml @@ -36,6 +36,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ boolean ] TRUNCATE [ boolean ] PARALLEL integer +UPDATE_DATFROZENXID [ boolean ] and table_and_columns is: @@ -295,6 +296,27 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ boolean diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index ba965b8c7b..51537aa5ba 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -114,6 +114,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) bool full = false; bool disable_page_skipping = false; bool process_toast = true; + bool update_datfrozenxid = vacstmt->is_vacuumcmd; ListCell *lc; /* index_cleanup and truncate values unspecified for now */ @@ -200,6 +201,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) params.nworkers = nworkers; } } + else if (strcmp(opt->defname, "update_datfrozenxid") == 0) + update_datfrozenxid = defGetBoolean(opt); else ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), @@ -216,7 +219,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) (freeze ? VACOPT_FREEZE : 0) | (full ? VACOPT_FULL : 0) | (disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) | - (process_toast ? VACOPT_PROCESS_TOAST : 0); + (process_toast ? VACOPT_PROCESS_TOAST : 0) | + (update_datfrozenxid ? VACOPT_UPDATE_DATFROZENXID : 0); /* sanity checks on options */ Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE)); @@ -528,11 +532,10 @@ vacuum(List *relations, VacuumParams *params, StartTransactionCommand(); } - if ((params->options & VACOPT_VACUUM) && !IsAutoVacuumWorkerProcess()) + if (params->options & VACOPT_UPDATE_DATFROZENXID) { /* * Update pg_database.datfrozenxid, and truncate pg_xact if possible. - * (autovacuum.c does this for itself.) */ vac_update_datfrozenxid(); } diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 0746d80224..edc219c8f5 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2854,7 +2854,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map, tab->at_relid = relid; tab->at_sharedrel = classForm->relisshared; - /* Note that this skips toast relations */ + /* Note that this skips toast relations and updating datfrozenxid */ tab->at_params.options = (dovacuum ? VACOPT_VACUUM : 0) | (doanalyze ? VACOPT_ANALYZE : 0) | (!wraparound ? VACOPT_SKIP_LOCKED : 0); diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 2f274f2bec..700489040e 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -188,6 +188,7 @@ typedef struct VacAttrStats #define VACOPT_SKIP_LOCKED 0x20 /* skip if cannot get lock */ #define VACOPT_PROCESS_TOAST 0x40 /* process the TOAST table, if any */ #define VACOPT_DISABLE_PAGE_SKIPPING 0x80 /* don't skip any pages */ +#define VACOPT_UPDATE_DATFROZENXID 0x100 /* update datfrozenxid afterwards */ /* * Values used by index_cleanup and truncate params. diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out index 0035d158b7..19c405417d 100644 --- a/src/test/regress/expected/vacuum.out +++ b/src/test/regress/expected/vacuum.out @@ -282,6 +282,8 @@ ALTER TABLE vactst ALTER COLUMN t SET STORAGE EXTERNAL; VACUUM (PROCESS_TOAST FALSE
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
[ redirecting to -hackers because patch attached ] Peter Geoghegan writes: > On Fri, Dec 16, 2022 at 6:49 AM Tom Lane wrote: >> That is a really good point. How about teaching VACUUM to track >> the oldest original relfrozenxid and relminmxid among the table(s) >> it processed, and skip vac_update_datfrozenxid unless at least one >> of those matches the database's values? For extra credit, also >> skip if we didn't successfully advance the source rel's value. > Hmm. I think that that would probably work. I poked into that idea some more and concluded that getting VACUUM to manage it behind the user's back is not going to work very reliably. The key problem is explained by this existing comment in autovacuum.c: * Even if we didn't vacuum anything, it may still be important to do * this, because one indirect effect of vac_update_datfrozenxid() is to * update ShmemVariableCache->xidVacLimit. That might need to be done * even if we haven't vacuumed anything, because relations with older * relfrozenxid values or other databases with older datfrozenxid values * might have been dropped, allowing xidVacLimit to advance. That is, if the table that's holding back datfrozenxid gets dropped between VACUUM runs, VACUUM would never think that it might have advanced the global minimum. I'm forced to the conclusion that we have to expose some VACUUM options if we want this to work well. Attached is a draft patch that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options (name bikeshedding welcome) and teaches vacuumdb to use them. Light testing says that this is a win: even on the regression database, which isn't all that big, I see a drop in vacuumdb's runtime from ~260 ms to ~175 ms. Of course this is a case where VACUUM doesn't really have anything to do, so it's a best-case scenario ... but still, I was expecting the effect to be barely above noise with this many tables, yet it's a good bit more. regards, tom lane diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml index e14ead8826..443928f286 100644 --- a/doc/src/sgml/ref/vacuum.sgml +++ b/doc/src/sgml/ref/vacuum.sgml @@ -36,6 +36,8 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ boolean ] TRUNCATE [ boolean ] PARALLEL integer +SKIP_DATABASE_STATS [ boolean ] +ONLY_DATABASE_STATS [ boolean ] and table_and_columns is: @@ -295,6 +297,40 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns + list must be empty, and no other option may be enabled. + + + + boolean diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index ba965b8c7b..055c1a146b 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -114,6 +114,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) bool full = false; bool disable_page_skipping = false; bool process_toast = true; + bool skip_database_stats = false; + bool only_database_stats = false; ListCell *lc; /* index_cleanup and truncate values unspecified for now */ @@ -200,6 +202,10 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) params.nworkers = nworkers; } } + else if (strcmp(opt->defname, "skip_database_stats") == 0) + skip_database_stats = defGetBoolean(opt); + else if (strcmp(opt->defname, "only_database_stats") == 0) + only_database_stats = defGetBoolean(opt); else ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), @@ -216,7 +222,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) (freeze ? VACOPT_FREEZE : 0) | (full ? VACOPT_FULL : 0) | (disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) | - (process_toast ? VACOPT_PROCESS_TOAST : 0); + (process_toast ? VACOPT_PROCESS_TOAST : 0) | + (skip_database_stats ? VACOPT_SKIP_DATABASE_STATS : 0) | + (only_database_stats ? VACOPT_ONLY_DATABASE_STATS : 0); /* sanity checks on options */ Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE)); @@ -349,6 +357,23 @@ vacuum(List *relations, VacuumParams *params, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("PROCESS_TOAST required with VACUUM FULL"))); + /* sanity check for ONLY_DATABASE_STATS */ + if (params->options & VACOPT_ONLY_DATABASE_STATS) + { + Assert(params->options & VACOPT_VACUUM); + if (relations != NIL) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("ONLY_DATABASE_STATS cannot be specified with a list of tables"))); + /* don't require people to turn off PROCESS_TOAST explicitly */ + if (params->options & ~(VACOPT_VACUUM | +VACOPT_PROCESS_TOAST | +VACOPT_ONLY_DATABASE_STATS)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("ONLY_DATABASE_STATS cannot be specified with other VACUUM options"))); + } + /* * Create special memory context for cross-transaction storage. * @@ -376