I think you've done a stellar job of identifying what the actual problem
was. I like the new (simpler) coding of that portion of
HeapTupleSatisfiesVacuum.
freeze-the-dead is not listed in isolation_schedule; an easy fix.
I confirm that the test crashes with an assertion failure without the
code fix, and that it doesn't with it.
I think the comparison to OldestXmin should be reversed:
if (!TransactionIdPrecedes(xmax, OldestXmin))
return HEAPTUPLE_RECENTLY_DEAD;
return HEAPTUPLE_DEAD;
This way, an xmax that has exactly the OldestXmin value will return
RECENTLY_DEAD rather DEAD, which seems reasonable to me (since
OldestXmin value itself is supposed to be still possibly visible to
somebody). Also, this way it is consistent with the other comparison to
OldestXmin at the bottom of the function. There is no reason for the
"else" or the extra braces.
Put together, I propose the attached delta for 0001.
Your commit message does a poor job of acknowledging prior work on
diagnosing the problem starting from Dan's initial test case and patch.
I haven't looked at your 0002 yet.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 8be2980116..aab03835d1 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -1324,25 +1324,23 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId
OldestXmin,
else if (TransactionIdDidCommit(xmax))
{
/*
- * The multixact might still be running due to lockers.
If the
- * updater is below the horizon we have to return DEAD
regardless
- * - otherwise we could end up with a tuple where the
updater has
- * to be removed due to the horizon, but is not pruned
away. It's
- * not a problem to prune that tuple because all the
lockers will
- * also be present in the newer tuple version.
+ * The multixact might still be running due to lockers.
If the
+ * updater is below the xid horizon, we have to return
DEAD
+ * regardless -- otherwise we could end up with a tuple
where the
+ * updater has to be removed due to the horizon, but is
not pruned
+ * away. It's not a problem to prune that tuple,
because any
+ * remaining lockers will also be present in newer
tuple versions.
*/
- if (TransactionIdPrecedes(xmax, OldestXmin))
- {
- return HEAPTUPLE_DEAD;
- }
- else
+ if (!TransactionIdPrecedes(xmax, OldestXmin))
return HEAPTUPLE_RECENTLY_DEAD;
+
+ return HEAPTUPLE_DEAD;
}
else if
(!MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false))
{
/*
* Not in Progress, Not Committed, so either Aborted or
crashed.
- * Remove the Xmax.
+ * Mark the Xmax as invalid.
*/
SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
InvalidTransactionId);
}
diff --git a/src/test/isolation/isolation_schedule
b/src/test/isolation/isolation_schedule
index e41b9164cd..eb566ebb6c 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -44,6 +44,7 @@ test: update-locked-tuple
test: propagate-lock-delete
test: tuplelock-conflict
test: tuplelock-update
+test: freeze-the-dead
test: nowait
test: nowait-2
test: nowait-3