On Wed, Jun 30, 2021 at 03:29:41PM +0900, Michael Paquier wrote:
> On Tue, May 18, 2021 at 01:04:18PM +0200, Drouvot, Bertrand wrote:
> > On 5/17/21 8:56 PM, Andres Freund wrote:
> >> On 2021-05-17 20:14:40 +0200, Drouvot, Bertrand wrote:
> >>> I was also wondering if:
> >>> 
> >>>   * We should keep the old behavior in case pg_resetwal -x is being used
> >>>     without -u?
 (The proposed patch does not set an arbitrary oldestXID
> >>>     anymore in 
case -x is used)
> >> I don't think we should. I don't see anything in the old behaviour worth
> >> maintaining.
> 
> So, pg_resetwal logic with the oldest XID assignment is causing some
> problem here.  This open item is opened for some time now and it is
> idle for a couple of weeks.  It looks that we have some solution
> drafted, to be able to move forward, with the following things (no
> patches yet): 
> - More robustness safety checks in procarray.c.
> - A rework of oldestXid in pg_resetwal.
> 
> Is there somebody working on that?

Bertrand sent a patch which is awaiting review.
This allows both of Tom's reproducers to pass pg_upgrade (with assertions and
delays).

http://cfbot.cputube.org/bertrand-drouvot.html

I re-arranged the pg_upgrade output of that patch: it was in the middle of the
two halves: "Setting next transaction ID and epoch for new cluster"

-- 
Justin
>From 30204d85005d6288d2dc93bcd28ff4fb2454a703 Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 16 May 2021 16:23:02 -0400
Subject: [PATCH 1/3] prion failed with ERROR: missing chunk number 0 for toast
 value 14334 in pg_toast_2619

It also seems like some assertions in procarray.c would be a
good idea.  With the attached patch, we get through core
regression just fine, but the pg_upgrade test fails immediately
after the "Resetting WAL archives" step.
---
 src/backend/storage/ipc/procarray.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 4c91e721d0..324e105c59 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2486,6 +2486,15 @@ GetSnapshotData(Snapshot snapshot)
 								   oldestfxid);
 		/* accurate value known */
 		GlobalVisTempRels.maybe_needed = GlobalVisTempRels.definitely_needed;
+
+		/* Do basic sanity check on these XIDs */
+		Assert(FullTransactionIdPrecedesOrEquals(GlobalVisSharedRels.maybe_needed,
+												 GlobalVisSharedRels.definitely_needed));
+		Assert(FullTransactionIdPrecedesOrEquals(GlobalVisCatalogRels.maybe_needed,
+												 GlobalVisCatalogRels.definitely_needed));
+		Assert(FullTransactionIdPrecedesOrEquals(GlobalVisDataRels.maybe_needed,
+												 GlobalVisDataRels.definitely_needed));
+		/* not much point in checking GlobalVisTempRels, given the above */
 	}
 
 	RecentXmin = xmin;
@@ -4020,6 +4029,8 @@ GlobalVisTestFor(Relation rel)
 
 	Assert(FullTransactionIdIsValid(state->definitely_needed) &&
 		   FullTransactionIdIsValid(state->maybe_needed));
+	Assert(FullTransactionIdPrecedesOrEquals(state->maybe_needed,
+											 state->definitely_needed));
 
 	return state;
 }
@@ -4085,6 +4096,15 @@ GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons)
 							   GlobalVisDataRels.definitely_needed);
 	GlobalVisTempRels.definitely_needed = GlobalVisTempRels.maybe_needed;
 
+	/* Do basic sanity check on these XIDs */
+	Assert(FullTransactionIdPrecedesOrEquals(GlobalVisSharedRels.maybe_needed,
+											 GlobalVisSharedRels.definitely_needed));
+	Assert(FullTransactionIdPrecedesOrEquals(GlobalVisCatalogRels.maybe_needed,
+											 GlobalVisCatalogRels.definitely_needed));
+	Assert(FullTransactionIdPrecedesOrEquals(GlobalVisDataRels.maybe_needed,
+											 GlobalVisDataRels.definitely_needed));
+	/* not much point in checking GlobalVisTempRels, given the above */
+
 	ComputeXidHorizonsResultLastXmin = RecentXmin;
 }
 
-- 
2.17.0

>From 544a88b1651b0db177d5883219c7ae75f1100f7b Mon Sep 17 00:00:00 2001
From: "Drouvot, Bertrand" <bdrou...@amazon.com>
Date: Tue, 18 May 2021 13:26:38 +0200
Subject: [PATCH 2/3] pg_upgrade can result in early wraparound on databases
 with high transaction load
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 5/4/21 10:17 AM, Drouvot, Bertrand wrote:

Copy/pasting Andres feedback (Thanks Andres for this feedback) on those
questions from another thread [1].

 > I was also wondering if:
 >
 > * We should keep the old behavior in case pg_resetwal -x is being used
 > without -u?
 (The proposed patch does not set an arbitrary oldestXID
 > anymore in 
case -x is used)

Andres: I don't think we should. I don't see anything in the old
behaviour worth
maintaining.

 > * We should ensure that the xid provided with -x or -u is
 > >=
FirstNormalTransactionId (Currently the only check is that it is
 > # 0)?

Andres: Applying TransactionIdIsNormal() seems like a good idea.

=> I am attaching a new version that makes use of
TransactionIdIsNormal() checks.

Andres: I think it's important to verify that the xid provided with -x
is within a reasonable range of the oldest xid.

=> What do you mean by "a reasonable range"?

Thanks

Bertrand
---
 src/bin/pg_resetwal/pg_resetwal.c | 65 ++++++++++++++++++-------------
 src/bin/pg_upgrade/controldata.c  | 17 +++++++-
 src/bin/pg_upgrade/pg_upgrade.c   |  6 +++
 src/bin/pg_upgrade/pg_upgrade.h   |  1 +
 4 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 805dafef07..5e864760ed 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -65,6 +65,7 @@ static bool guessed = false;	/* T if we had to guess at any values */
 static const char *progname;
 static uint32 set_xid_epoch = (uint32) -1;
 static TransactionId set_xid = 0;
+static TransactionId set_oldest_unfrozen_xid = 0;
 static TransactionId set_oldest_commit_ts_xid = 0;
 static TransactionId set_newest_commit_ts_xid = 0;
 static Oid	set_oid = 0;
@@ -102,6 +103,7 @@ main(int argc, char *argv[])
 		{"next-oid", required_argument, NULL, 'o'},
 		{"multixact-offset", required_argument, NULL, 'O'},
 		{"next-transaction-id", required_argument, NULL, 'x'},
+		{"oldest-transaction-id", required_argument, NULL, 'u'},
 		{"wal-segsize", required_argument, NULL, 1},
 		{NULL, 0, NULL, 0}
 	};
@@ -135,7 +137,7 @@ main(int argc, char *argv[])
 	}
 
 
-	while ((c = getopt_long(argc, argv, "c:D:e:fl:m:no:O:x:", long_options, NULL)) != -1)
+	while ((c = getopt_long(argc, argv, "c:D:e:fl:m:no:O:x:u:", long_options, NULL)) != -1)
 	{
 		switch (c)
 		{
@@ -176,9 +178,24 @@ main(int argc, char *argv[])
 					fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 					exit(1);
 				}
-				if (set_xid == 0)
+				if (!TransactionIdIsNormal(set_xid))
 				{
-					pg_log_error("transaction ID (-x) must not be 0");
+					pg_log_error("transaction ID (-x) must be greater or equal to %u", FirstNormalTransactionId);
+					exit(1);
+				}
+				break;
+
+			case 'u':
+				set_oldest_unfrozen_xid = strtoul(optarg, &endptr, 0);
+				if (endptr == optarg || *endptr != '\0')
+				{
+					pg_log_error("invalid argument for option %s", "-u");
+					fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+					exit(1);
+				}
+				if (!TransactionIdIsNormal(set_oldest_unfrozen_xid))
+				{
+					pg_log_error("oldest unfrozen transaction ID (-u) must be greater or equal to %u", FirstNormalTransactionId);
 					exit(1);
 				}
 				break;
@@ -429,21 +446,12 @@ main(int argc, char *argv[])
 											 XidFromFullTransactionId(ControlFile.checkPointCopy.nextXid));
 
 	if (set_xid != 0)
-	{
 		ControlFile.checkPointCopy.nextXid =
 			FullTransactionIdFromEpochAndXid(EpochFromFullTransactionId(ControlFile.checkPointCopy.nextXid),
 											 set_xid);
 
-		/*
-		 * For the moment, just set oldestXid to a value that will force
-		 * immediate autovacuum-for-wraparound.  It's not clear whether adding
-		 * user control of this is useful, so let's just do something that's
-		 * reasonably safe.  The magic constant here corresponds to the
-		 * maximum allowed value of autovacuum_freeze_max_age.
-		 */
-		ControlFile.checkPointCopy.oldestXid = set_xid - 2000000000;
-		if (ControlFile.checkPointCopy.oldestXid < FirstNormalTransactionId)
-			ControlFile.checkPointCopy.oldestXid += FirstNormalTransactionId;
+	if (set_oldest_unfrozen_xid != 0) {
+		ControlFile.checkPointCopy.oldestXid = set_oldest_unfrozen_xid;
 		ControlFile.checkPointCopy.oldestXidDB = InvalidOid;
 	}
 
@@ -1209,20 +1217,21 @@ usage(void)
 	printf(_("Usage:\n  %s [OPTION]... DATADIR\n\n"), progname);
 	printf(_("Options:\n"));
 	printf(_("  -c, --commit-timestamp-ids=XID,XID\n"
-			 "                                 set oldest and newest transactions bearing\n"
-			 "                                 commit timestamp (zero means no change)\n"));
-	printf(_(" [-D, --pgdata=]DATADIR          data directory\n"));
-	printf(_("  -e, --epoch=XIDEPOCH           set next transaction ID epoch\n"));
-	printf(_("  -f, --force                    force update to be done\n"));
-	printf(_("  -l, --next-wal-file=WALFILE    set minimum starting location for new WAL\n"));
-	printf(_("  -m, --multixact-ids=MXID,MXID  set next and oldest multitransaction ID\n"));
-	printf(_("  -n, --dry-run                  no update, just show what would be done\n"));
-	printf(_("  -o, --next-oid=OID             set next OID\n"));
-	printf(_("  -O, --multixact-offset=OFFSET  set next multitransaction offset\n"));
-	printf(_("  -V, --version                  output version information, then exit\n"));
-	printf(_("  -x, --next-transaction-id=XID  set next transaction ID\n"));
-	printf(_("      --wal-segsize=SIZE         size of WAL segments, in megabytes\n"));
-	printf(_("  -?, --help                     show this help, then exit\n"));
+			 "                                   set oldest and newest transactions bearing\n"
+			 "                                   commit timestamp (zero means no change)\n"));
+	printf(_(" [-D, --pgdata=]DATADIR            data directory\n"));
+	printf(_("  -e, --epoch=XIDEPOCH             set next transaction ID epoch\n"));
+	printf(_("  -f, --force                      force update to be done\n"));
+	printf(_("  -l, --next-wal-file=WALFILE      set minimum starting location for new WAL\n"));
+	printf(_("  -m, --multixact-ids=MXID,MXID    set next and oldest multitransaction ID\n"));
+	printf(_("  -n, --dry-run                    no update, just show what would be done\n"));
+	printf(_("  -o, --next-oid=OID               set next OID\n"));
+	printf(_("  -O, --multixact-offset=OFFSET    set next multitransaction offset\n"));
+	printf(_("  -u, --oldest-transaction-id=XID  set oldest unfrozen transaction ID\n"));
+	printf(_("  -V, --version                    output version information, then exit\n"));
+	printf(_("  -x, --next-transaction-id=XID    set next transaction ID\n"));
+	printf(_("      --wal-segsize=SIZE           size of WAL segments, in megabytes\n"));
+	printf(_("  -?, --help                       show this help, then exit\n"));
 	printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
 	printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL);
 }
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 4f647cdf33..a4b6375403 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -44,6 +44,7 @@ get_control_data(ClusterInfo *cluster, bool live_check)
 	bool		got_oid = false;
 	bool		got_multi = false;
 	bool		got_oldestmulti = false;
+	bool		got_oldestxid = false;
 	bool		got_mxoff = false;
 	bool		got_nextxlogfile = false;
 	bool		got_float8_pass_by_value = false;
@@ -312,6 +313,17 @@ get_control_data(ClusterInfo *cluster, bool live_check)
 			cluster->controldata.chkpnt_nxtmulti = str2uint(p);
 			got_multi = true;
 		}
+		else if ((p = strstr(bufin, "Latest checkpoint's oldestXID:")) != NULL)
+		{
+			p = strchr(p, ':');
+
+			if (p == NULL || strlen(p) <= 1)
+				pg_fatal("%d: controldata retrieval problem\n", __LINE__);
+
+			p++;				/* remove ':' char */
+			cluster->controldata.chkpnt_oldstxid = str2uint(p);
+			got_oldestxid = true;
+		}
 		else if ((p = strstr(bufin, "Latest checkpoint's oldestMultiXid:")) != NULL)
 		{
 			p = strchr(p, ':');
@@ -544,7 +556,7 @@ get_control_data(ClusterInfo *cluster, bool live_check)
 
 	/* verify that we got all the mandatory pg_control data */
 	if (!got_xid || !got_oid ||
-		!got_multi ||
+		!got_multi || !got_oldestxid ||
 		(!got_oldestmulti &&
 		 cluster->controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER) ||
 		!got_mxoff || (!live_check && !got_nextxlogfile) ||
@@ -575,6 +587,9 @@ get_control_data(ClusterInfo *cluster, bool live_check)
 			cluster->controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
 			pg_log(PG_REPORT, "  latest checkpoint oldest MultiXactId\n");
 
+		if (!got_oldestxid)
+			pg_log(PG_REPORT, "  latest checkpoint oldestXID\n");
+
 		if (!got_mxoff)
 			pg_log(PG_REPORT, "  latest checkpoint next MultiXactOffset\n");
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index e23b8ca88d..0d877f6456 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -485,6 +485,12 @@ copy_xact_xlog_xid(void)
 			  old_cluster.controldata.chkpnt_nxtxid,
 			  new_cluster.pgdata);
 	check_ok();
+	prep_status("Setting oldest XID for new cluster");
+	exec_prog(UTILITY_LOG_FILE, NULL, true, true,
+			  "\"%s/pg_resetwal\" -f -u %u \"%s\"",
+			  new_cluster.bindir, old_cluster.controldata.chkpnt_oldstxid,
+			  new_cluster.pgdata);
+	check_ok();
 
 	/*
 	 * If the old server is before the MULTIXACT_FORMATCHANGE_CAT_VER change
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index f7eb2349e6..db96627ccb 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -207,6 +207,7 @@ typedef struct
 	uint32		chkpnt_nxtmulti;
 	uint32		chkpnt_nxtmxoff;
 	uint32		chkpnt_oldstMulti;
+	uint32		chkpnt_oldstxid;
 	uint32		align;
 	uint32		blocksz;
 	uint32		largesz;
-- 
2.17.0

Reply via email to