On Wed, Jun 30, 2021 at 03:29:41PM +0900, Michael Paquier wrote: > On Tue, May 18, 2021 at 01:04:18PM +0200, Drouvot, Bertrand wrote: > > On 5/17/21 8:56 PM, Andres Freund wrote: > >> On 2021-05-17 20:14:40 +0200, Drouvot, Bertrand wrote: > >>> I was also wondering if: > >>> > >>> * We should keep the old behavior in case pg_resetwal -x is being used > >>> without -u? (The proposed patch does not set an arbitrary oldestXID > >>> anymore in case -x is used) > >> I don't think we should. I don't see anything in the old behaviour worth > >> maintaining. > > So, pg_resetwal logic with the oldest XID assignment is causing some > problem here. This open item is opened for some time now and it is > idle for a couple of weeks. It looks that we have some solution > drafted, to be able to move forward, with the following things (no > patches yet): > - More robustness safety checks in procarray.c. > - A rework of oldestXid in pg_resetwal. > > Is there somebody working on that?
Bertrand sent a patch which is awaiting review. This allows both of Tom's reproducers to pass pg_upgrade (with assertions and delays). http://cfbot.cputube.org/bertrand-drouvot.html I re-arranged the pg_upgrade output of that patch: it was in the middle of the two halves: "Setting next transaction ID and epoch for new cluster" -- Justin
>From 30204d85005d6288d2dc93bcd28ff4fb2454a703 Mon Sep 17 00:00:00 2001 From: Tom Lane <t...@sss.pgh.pa.us> Date: Sun, 16 May 2021 16:23:02 -0400 Subject: [PATCH 1/3] prion failed with ERROR: missing chunk number 0 for toast value 14334 in pg_toast_2619 It also seems like some assertions in procarray.c would be a good idea. With the attached patch, we get through core regression just fine, but the pg_upgrade test fails immediately after the "Resetting WAL archives" step. --- src/backend/storage/ipc/procarray.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 4c91e721d0..324e105c59 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -2486,6 +2486,15 @@ GetSnapshotData(Snapshot snapshot) oldestfxid); /* accurate value known */ GlobalVisTempRels.maybe_needed = GlobalVisTempRels.definitely_needed; + + /* Do basic sanity check on these XIDs */ + Assert(FullTransactionIdPrecedesOrEquals(GlobalVisSharedRels.maybe_needed, + GlobalVisSharedRels.definitely_needed)); + Assert(FullTransactionIdPrecedesOrEquals(GlobalVisCatalogRels.maybe_needed, + GlobalVisCatalogRels.definitely_needed)); + Assert(FullTransactionIdPrecedesOrEquals(GlobalVisDataRels.maybe_needed, + GlobalVisDataRels.definitely_needed)); + /* not much point in checking GlobalVisTempRels, given the above */ } RecentXmin = xmin; @@ -4020,6 +4029,8 @@ GlobalVisTestFor(Relation rel) Assert(FullTransactionIdIsValid(state->definitely_needed) && FullTransactionIdIsValid(state->maybe_needed)); + Assert(FullTransactionIdPrecedesOrEquals(state->maybe_needed, + state->definitely_needed)); return state; } @@ -4085,6 +4096,15 @@ GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons) GlobalVisDataRels.definitely_needed); GlobalVisTempRels.definitely_needed = GlobalVisTempRels.maybe_needed; + /* Do basic sanity check on these XIDs */ + Assert(FullTransactionIdPrecedesOrEquals(GlobalVisSharedRels.maybe_needed, + GlobalVisSharedRels.definitely_needed)); + Assert(FullTransactionIdPrecedesOrEquals(GlobalVisCatalogRels.maybe_needed, + GlobalVisCatalogRels.definitely_needed)); + Assert(FullTransactionIdPrecedesOrEquals(GlobalVisDataRels.maybe_needed, + GlobalVisDataRels.definitely_needed)); + /* not much point in checking GlobalVisTempRels, given the above */ + ComputeXidHorizonsResultLastXmin = RecentXmin; } -- 2.17.0
>From 544a88b1651b0db177d5883219c7ae75f1100f7b Mon Sep 17 00:00:00 2001 From: "Drouvot, Bertrand" <bdrou...@amazon.com> Date: Tue, 18 May 2021 13:26:38 +0200 Subject: [PATCH 2/3] pg_upgrade can result in early wraparound on databases with high transaction load MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On 5/4/21 10:17 AM, Drouvot, Bertrand wrote: Copy/pasting Andres feedback (Thanks Andres for this feedback) on those questions from another thread [1]. > I was also wondering if: > > * We should keep the old behavior in case pg_resetwal -x is being used > without -u? (The proposed patch does not set an arbitrary oldestXID > anymore in case -x is used) Andres: I don't think we should. I don't see anything in the old behaviour worth maintaining. > * We should ensure that the xid provided with -x or -u is > >= FirstNormalTransactionId (Currently the only check is that it is > # 0)? Andres: Applying TransactionIdIsNormal() seems like a good idea. => I am attaching a new version that makes use of TransactionIdIsNormal() checks. Andres: I think it's important to verify that the xid provided with -x is within a reasonable range of the oldest xid. => What do you mean by "a reasonable range"? Thanks Bertrand --- src/bin/pg_resetwal/pg_resetwal.c | 65 ++++++++++++++++++------------- src/bin/pg_upgrade/controldata.c | 17 +++++++- src/bin/pg_upgrade/pg_upgrade.c | 6 +++ src/bin/pg_upgrade/pg_upgrade.h | 1 + 4 files changed, 60 insertions(+), 29 deletions(-) diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c index 805dafef07..5e864760ed 100644 --- a/src/bin/pg_resetwal/pg_resetwal.c +++ b/src/bin/pg_resetwal/pg_resetwal.c @@ -65,6 +65,7 @@ static bool guessed = false; /* T if we had to guess at any values */ static const char *progname; static uint32 set_xid_epoch = (uint32) -1; static TransactionId set_xid = 0; +static TransactionId set_oldest_unfrozen_xid = 0; static TransactionId set_oldest_commit_ts_xid = 0; static TransactionId set_newest_commit_ts_xid = 0; static Oid set_oid = 0; @@ -102,6 +103,7 @@ main(int argc, char *argv[]) {"next-oid", required_argument, NULL, 'o'}, {"multixact-offset", required_argument, NULL, 'O'}, {"next-transaction-id", required_argument, NULL, 'x'}, + {"oldest-transaction-id", required_argument, NULL, 'u'}, {"wal-segsize", required_argument, NULL, 1}, {NULL, 0, NULL, 0} }; @@ -135,7 +137,7 @@ main(int argc, char *argv[]) } - while ((c = getopt_long(argc, argv, "c:D:e:fl:m:no:O:x:", long_options, NULL)) != -1) + while ((c = getopt_long(argc, argv, "c:D:e:fl:m:no:O:x:u:", long_options, NULL)) != -1) { switch (c) { @@ -176,9 +178,24 @@ main(int argc, char *argv[]) fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); exit(1); } - if (set_xid == 0) + if (!TransactionIdIsNormal(set_xid)) { - pg_log_error("transaction ID (-x) must not be 0"); + pg_log_error("transaction ID (-x) must be greater or equal to %u", FirstNormalTransactionId); + exit(1); + } + break; + + case 'u': + set_oldest_unfrozen_xid = strtoul(optarg, &endptr, 0); + if (endptr == optarg || *endptr != '\0') + { + pg_log_error("invalid argument for option %s", "-u"); + fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); + exit(1); + } + if (!TransactionIdIsNormal(set_oldest_unfrozen_xid)) + { + pg_log_error("oldest unfrozen transaction ID (-u) must be greater or equal to %u", FirstNormalTransactionId); exit(1); } break; @@ -429,21 +446,12 @@ main(int argc, char *argv[]) XidFromFullTransactionId(ControlFile.checkPointCopy.nextXid)); if (set_xid != 0) - { ControlFile.checkPointCopy.nextXid = FullTransactionIdFromEpochAndXid(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextXid), set_xid); - /* - * For the moment, just set oldestXid to a value that will force - * immediate autovacuum-for-wraparound. It's not clear whether adding - * user control of this is useful, so let's just do something that's - * reasonably safe. The magic constant here corresponds to the - * maximum allowed value of autovacuum_freeze_max_age. - */ - ControlFile.checkPointCopy.oldestXid = set_xid - 2000000000; - if (ControlFile.checkPointCopy.oldestXid < FirstNormalTransactionId) - ControlFile.checkPointCopy.oldestXid += FirstNormalTransactionId; + if (set_oldest_unfrozen_xid != 0) { + ControlFile.checkPointCopy.oldestXid = set_oldest_unfrozen_xid; ControlFile.checkPointCopy.oldestXidDB = InvalidOid; } @@ -1209,20 +1217,21 @@ usage(void) printf(_("Usage:\n %s [OPTION]... DATADIR\n\n"), progname); printf(_("Options:\n")); printf(_(" -c, --commit-timestamp-ids=XID,XID\n" - " set oldest and newest transactions bearing\n" - " commit timestamp (zero means no change)\n")); - printf(_(" [-D, --pgdata=]DATADIR data directory\n")); - printf(_(" -e, --epoch=XIDEPOCH set next transaction ID epoch\n")); - printf(_(" -f, --force force update to be done\n")); - printf(_(" -l, --next-wal-file=WALFILE set minimum starting location for new WAL\n")); - printf(_(" -m, --multixact-ids=MXID,MXID set next and oldest multitransaction ID\n")); - printf(_(" -n, --dry-run no update, just show what would be done\n")); - printf(_(" -o, --next-oid=OID set next OID\n")); - printf(_(" -O, --multixact-offset=OFFSET set next multitransaction offset\n")); - printf(_(" -V, --version output version information, then exit\n")); - printf(_(" -x, --next-transaction-id=XID set next transaction ID\n")); - printf(_(" --wal-segsize=SIZE size of WAL segments, in megabytes\n")); - printf(_(" -?, --help show this help, then exit\n")); + " set oldest and newest transactions bearing\n" + " commit timestamp (zero means no change)\n")); + printf(_(" [-D, --pgdata=]DATADIR data directory\n")); + printf(_(" -e, --epoch=XIDEPOCH set next transaction ID epoch\n")); + printf(_(" -f, --force force update to be done\n")); + printf(_(" -l, --next-wal-file=WALFILE set minimum starting location for new WAL\n")); + printf(_(" -m, --multixact-ids=MXID,MXID set next and oldest multitransaction ID\n")); + printf(_(" -n, --dry-run no update, just show what would be done\n")); + printf(_(" -o, --next-oid=OID set next OID\n")); + printf(_(" -O, --multixact-offset=OFFSET set next multitransaction offset\n")); + printf(_(" -u, --oldest-transaction-id=XID set oldest unfrozen transaction ID\n")); + printf(_(" -V, --version output version information, then exit\n")); + printf(_(" -x, --next-transaction-id=XID set next transaction ID\n")); + printf(_(" --wal-segsize=SIZE size of WAL segments, in megabytes\n")); + printf(_(" -?, --help show this help, then exit\n")); printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT); printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL); } diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c index 4f647cdf33..a4b6375403 100644 --- a/src/bin/pg_upgrade/controldata.c +++ b/src/bin/pg_upgrade/controldata.c @@ -44,6 +44,7 @@ get_control_data(ClusterInfo *cluster, bool live_check) bool got_oid = false; bool got_multi = false; bool got_oldestmulti = false; + bool got_oldestxid = false; bool got_mxoff = false; bool got_nextxlogfile = false; bool got_float8_pass_by_value = false; @@ -312,6 +313,17 @@ get_control_data(ClusterInfo *cluster, bool live_check) cluster->controldata.chkpnt_nxtmulti = str2uint(p); got_multi = true; } + else if ((p = strstr(bufin, "Latest checkpoint's oldestXID:")) != NULL) + { + p = strchr(p, ':'); + + if (p == NULL || strlen(p) <= 1) + pg_fatal("%d: controldata retrieval problem\n", __LINE__); + + p++; /* remove ':' char */ + cluster->controldata.chkpnt_oldstxid = str2uint(p); + got_oldestxid = true; + } else if ((p = strstr(bufin, "Latest checkpoint's oldestMultiXid:")) != NULL) { p = strchr(p, ':'); @@ -544,7 +556,7 @@ get_control_data(ClusterInfo *cluster, bool live_check) /* verify that we got all the mandatory pg_control data */ if (!got_xid || !got_oid || - !got_multi || + !got_multi || !got_oldestxid || (!got_oldestmulti && cluster->controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER) || !got_mxoff || (!live_check && !got_nextxlogfile) || @@ -575,6 +587,9 @@ get_control_data(ClusterInfo *cluster, bool live_check) cluster->controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER) pg_log(PG_REPORT, " latest checkpoint oldest MultiXactId\n"); + if (!got_oldestxid) + pg_log(PG_REPORT, " latest checkpoint oldestXID\n"); + if (!got_mxoff) pg_log(PG_REPORT, " latest checkpoint next MultiXactOffset\n"); diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c index e23b8ca88d..0d877f6456 100644 --- a/src/bin/pg_upgrade/pg_upgrade.c +++ b/src/bin/pg_upgrade/pg_upgrade.c @@ -485,6 +485,12 @@ copy_xact_xlog_xid(void) old_cluster.controldata.chkpnt_nxtxid, new_cluster.pgdata); check_ok(); + prep_status("Setting oldest XID for new cluster"); + exec_prog(UTILITY_LOG_FILE, NULL, true, true, + "\"%s/pg_resetwal\" -f -u %u \"%s\"", + new_cluster.bindir, old_cluster.controldata.chkpnt_oldstxid, + new_cluster.pgdata); + check_ok(); /* * If the old server is before the MULTIXACT_FORMATCHANGE_CAT_VER change diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h index f7eb2349e6..db96627ccb 100644 --- a/src/bin/pg_upgrade/pg_upgrade.h +++ b/src/bin/pg_upgrade/pg_upgrade.h @@ -207,6 +207,7 @@ typedef struct uint32 chkpnt_nxtmulti; uint32 chkpnt_nxtmxoff; uint32 chkpnt_oldstMulti; + uint32 chkpnt_oldstxid; uint32 align; uint32 blocksz; uint32 largesz; -- 2.17.0