On 24/06/2022 04:43, Andres Freund wrote:
On 2022-06-23 22:03:27 +0300, Heikki Linnakangas wrote:
In summary, I think we should:
- commit and backpatch Simon's
just_remove_TransactionIdIsKnownCompleted_call.v1.patch
- fix pg_xact_status() to check TransactionIdIsInProgress()
- in v15, remove TransationIdIsKnownCompleted function altogether

I'll try to get around to that in the next few days, unless someone beats me
to it.

Makes sense.

This is what I came up with for master. One difference from Simon's original patch is that I decided to not expose the TransactionIdIsKnownNotInProgress(), as there are no other callers of it in core, and it doesn't seem useful to extensions. I inlined it into the caller instead.

BTW, should we worry about XID wraparound with the cache? Could you have a backend sit idle for 2^32 transactions, without making any TransactionIdIsKnownNotInProgress() calls? That's not new with this patch, though, it could happen with the single-item cache in transam.c, too.

- Heikki
From de81639af2ce0ed635778d4f7cafb3dfc4fd84f5 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Sat, 25 Jun 2022 12:02:39 +0300
Subject: [PATCH 1/1] Fix visibility check when XID is committed in CLOG but
 not in procarray.

TransactionIdIsInProgress had a fast path to return 'false' if the
single-item CLOG cache said that the transaction was known to be
committed. However, that was wrong, because a transaction is first
marked as committed in the CLOG but doesn't become visible to others
until it has removed its XID from the proc array. That could lead to an
error:

    ERROR:  t_xmin is uncommitted in tuple to be updated

or for an UPDATE to go ahead without blocking, before the previous
UPDATE on the same row was made visible.

The window is usually very short, but synchronous replication makes it
much wider, because the wait for synchronous replica happens in that
window.

Another thing that makes it hard to hit is that it's hard to get such
a commit-in-progress transaction into the single item CLOG cache.
Normally, if you call TransactionIdIsInProgress on such a transaction,
it determines that the XID is in progress without checking the CLOG
and without populating the cache. One way to prime the cache is to
explicitly call pg_xact_status() on the XID. Another way is use a lot
of subtransactions, so that the subxid cache in the proc array is
overflown, making TransactionIdIsInProgress rely on CLOG checks.

This has been broken ever since it was introduced in 2008, but the race
condition is very hard to hit, especially without synchronous
replication.  There were a couple of reports of this starting from
summer 2021, but no one was able to find the root cause then.

TransactionIdIsKnownCompleted() is now unused. In 'master', remove it,
but I left in place in backbranches in case it's used by extensions.

Also change pg_xact_status() to check TransactionIdIsInProgress().
Previously, it only checked the CLOG, and returned "committed" before
the transaction was actually made visible to other queries. Note that
this also means that you cannot use pg_xact_status() to reproduce the
bug anymore, even if the code wasn't fixed.

Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with
the pg_xact_status() change added by me.

Author: Simon Riggs
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
---
 src/backend/access/transam/transam.c | 27 -------------------------
 src/backend/storage/ipc/procarray.c  | 12 ++++++++++-
 src/backend/utils/adt/xid8funcs.c    | 30 +++++++++++-----------------
 src/include/access/transam.h         |  1 -
 4 files changed, 23 insertions(+), 47 deletions(-)

diff --git a/src/backend/access/transam/transam.c b/src/backend/access/transam/transam.c
index dbc5f884e8..5865810135 100644
--- a/src/backend/access/transam/transam.c
+++ b/src/backend/access/transam/transam.c
@@ -219,33 +219,6 @@ TransactionIdDidAbort(TransactionId transactionId)
 	return false;
 }
 
-/*
- * TransactionIdIsKnownCompleted
- *		True iff transaction associated with the identifier is currently
- *		known to have either committed or aborted.
- *
- * This does NOT look into pg_xact but merely probes our local cache
- * (and so it's not named TransactionIdDidComplete, which would be the
- * appropriate name for a function that worked that way).  The intended
- * use is just to short-circuit TransactionIdIsInProgress calls when doing
- * repeated heapam_visibility.c checks for the same XID.  If this isn't
- * extremely fast then it will be counterproductive.
- *
- * Note:
- *		Assumes transaction identifier is valid.
- */
-bool
-TransactionIdIsKnownCompleted(TransactionId transactionId)
-{
-	if (TransactionIdEquals(transactionId, cachedFetchXid))
-	{
-		/* If it's in the cache at all, it must be completed. */
-		return true;
-	}
-
-	return false;
-}
-
 /*
  * TransactionIdCommitTree
  *		Marks the given transaction and children as committed
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 4da53a5b3f..dadaa958a8 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -261,6 +261,11 @@ static ProcArrayStruct *procArray;
 
 static PGPROC *allProcs;
 
+/*
+ * Cache to reduce overhead of repeated calls to TransactionIdIsInProgress()
+ */
+static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
+
 /*
  * Bookkeeping for tracking emulated transactions in recovery
  */
@@ -1396,7 +1401,7 @@ TransactionIdIsInProgress(TransactionId xid)
 	 * already known to be completed, we can fall out without any access to
 	 * shared memory.
 	 */
-	if (TransactionIdIsKnownCompleted(xid))
+	if (TransactionIdEquals(cachedXidIsNotInProgress, xid))
 	{
 		xc_by_known_xact_inc();
 		return false;
@@ -1554,6 +1559,7 @@ TransactionIdIsInProgress(TransactionId xid)
 	if (nxids == 0)
 	{
 		xc_no_overflow_inc();
+		cachedXidIsNotInProgress = xid;
 		return false;
 	}
 
@@ -1568,7 +1574,10 @@ TransactionIdIsInProgress(TransactionId xid)
 	xc_slow_answer_inc();
 
 	if (TransactionIdDidAbort(xid))
+	{
+		cachedXidIsNotInProgress = xid;
 		return false;
+	}
 
 	/*
 	 * It isn't aborted, so check whether the transaction tree it belongs to
@@ -1586,6 +1595,7 @@ TransactionIdIsInProgress(TransactionId xid)
 		}
 	}
 
+	cachedXidIsNotInProgress = xid;
 	return false;
 }
 
diff --git a/src/backend/utils/adt/xid8funcs.c b/src/backend/utils/adt/xid8funcs.c
index 6c57ec3d35..81ca6cfdd0 100644
--- a/src/backend/utils/adt/xid8funcs.c
+++ b/src/backend/utils/adt/xid8funcs.c
@@ -36,6 +36,7 @@
 #include "miscadmin.h"
 #include "postmaster/postmaster.h"
 #include "storage/lwlock.h"
+#include "storage/procarray.h"
 #include "utils/builtins.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
@@ -676,29 +677,22 @@ pg_xact_status(PG_FUNCTION_ARGS)
 	{
 		Assert(TransactionIdIsValid(xid));
 
-		if (TransactionIdIsCurrentTransactionId(xid))
+		/*
+		 * Like when doing visiblity checks on a row, check whether the transaction
+		 * is still in progress before looking into the CLOG. Otherwise we
+		 * would incorrectly return "committed" for a transaction that is
+		 * committing and has already updated the CLOG, but hasn't removed its XID
+		 * from the proc array yet. (See comment on that race condition at
+		 * the top of heapam_visibility.c)
+		 */
+		if (TransactionIdIsInProgress(xid))
 			status = "in progress";
 		else if (TransactionIdDidCommit(xid))
 			status = "committed";
-		else if (TransactionIdDidAbort(xid))
-			status = "aborted";
 		else
 		{
-			/*
-			 * The xact is not marked as either committed or aborted in clog.
-			 *
-			 * It could be a transaction that ended without updating clog or
-			 * writing an abort record due to a crash. We can safely assume
-			 * it's aborted if it isn't committed and is older than our
-			 * snapshot xmin.
-			 *
-			 * Otherwise it must be in-progress (or have been at the time we
-			 * checked commit/abort status).
-			 */
-			if (TransactionIdPrecedes(xid, GetActiveSnapshot()->xmin))
-				status = "aborted";
-			else
-				status = "in progress";
+			/* it must have aborted or crashed */
+			status = "aborted";
 		}
 	}
 	else
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 338dfca5a0..775471d2a7 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -273,7 +273,6 @@ extern PGDLLIMPORT VariableCache ShmemVariableCache;
  */
 extern bool TransactionIdDidCommit(TransactionId transactionId);
 extern bool TransactionIdDidAbort(TransactionId transactionId);
-extern bool TransactionIdIsKnownCompleted(TransactionId transactionId);
 extern void TransactionIdCommitTree(TransactionId xid, int nxids, TransactionId *xids);
 extern void TransactionIdAsyncCommitTree(TransactionId xid, int nxids, TransactionId *xids, XLogRecPtr lsn);
 extern void TransactionIdAbortTree(TransactionId xid, int nxids, TransactionId *xids);
-- 
2.30.2

Reply via email to