On Thu, Sep 7, 2023 at 3:30 PM Robert Haas <robertmh...@gmail.com> wrote:
>
> On Thu, Sep 7, 2023 at 3:10 PM Melanie Plageman
> <melanieplage...@gmail.com> wrote:
> > I can fix it by changing the type of PruneResult->off_loc to be an
> > OffsetNumber pointer. This does mean that PruneResult will be
> > initialized partially by heap_page_prune() callers. I wonder if you
> > think that undermines the case for making a new struct.
>
> I think that it undermines the case for including that particular
> argument in the struct. It's not really a Prune*Result* if the caller
> initializes it in part. It seems fairly reasonable to still have a
> PruneResult struct for the other stuff, though, at least to me. How do
> you see it?

Yes, I think off_loc probably didn't belong in PruneResult to begin with.
It is inherently not a result of pruning since it will only be used in
the event that pruning doesn't complete (it errors out).

In the attached v4 patch set, I have both PruneResult and off_loc as
parameters to heap_page_prune(). I have also added a separate commit
which adds comments both above heap_page_prune()'s call site in
lazy_scan_prune() and in the heap_page_prune() function header
clarifying the various points we discussed.

> > I still want to eliminate the NULL check of off_loc in
> > heap_page_prune() by making it a required parameter. Even though
> > on-access pruning does not have an error callback mechanism which uses
> > the offset, it seems better to have a pointless local variable in
> > heap_page_prune_opt() than to do a NULL check for every tuple pruned.
>
> It doesn't seem important to me unless it improves performance. If
> it's just stylistic, I don't object, but I also don't really see a
> reason to care.

I did some performance testing but, as expected, I couldn't concoct a
scenario where the overhead was noticeable in a profile. So much else
is happening in that code, the NULL check basically doesn't matter
(even though it isn't optimized out).

I mostly wanted to remove the NULL checks because I found them
distracting (so, a stylistic complaint). However, upon further
reflection, I actually think it is better if heap_page_prune_opt()
passes NULL. heap_page_prune() has no error callback mechanism that
could use it, and passing a valid value implies otherwise. Also, as
you said, off_loc will always be InvalidOffsetNumber when
heap_page_prune() returns normally, so having heap_page_prune_opt()
pass a dummy value might actually be more confusing for future
programmers.

> > > I haven't run the regression suite with 0001 applied. I'm guessing
> > > that you did, and that they passed, which if true means that we don't
> > > have a test for this, which is a shame, although writing such a test
> > > might be a bit tricky. If there is a test case for this and you didn't
> > > run it, woops.
> >
> > There is no test coverage for the vacuum error callback context
> > currently (tests passed for me). I looked into how we might add such a
> > test. First, I investigated what kind of errors might occur during
> > heap_prune_satisfies_vacuum(). Some of the multixact functions called
> > by HeapTupleSatisfiesVacuumHorizon() could error out -- for example
> > GetMultiXactIdMembers(). It seems difficult to trigger the errors in
> > GetMultiXactIdMembers(), as we would have to cause wraparound. It
> > would be even more difficult to ensure that we hit those specific
> > errors from a call stack containing heap_prune_satisfies_vacuum(). As
> > such, I'm not sure I can think of a way to protect future developers
> > from repeating my mistake--apart from improving the comment like you
> > mentioned.
>
> 004_verify_heapam.pl has some tests that intentionally corrupt pages
> and then use pg_amcheck to detect the corruption. Such an approach
> could probably also be used here. But it's a pain to get such tests
> right, because any change to the page format due to endianness,
> different block size, or whatever can make carefully-written tests go
> boom.

Cool! I hadn't examined how these tests work until now. I took a stab
at writing a test in the existing 0004_verify_heapam.pl. The simplest
thing would be if we could just vacuum the corrupted table ("test")
after running pg_amcheck and compare the error context to our
expectation. I found that this didn't work, though. In an assert
build, vacuum trips an assert before it hits an error while vacuuming
a corrupted tuple in the "test" table. There might be a way of
modifying the existing test code to avoid this, but I tried the next
easiest thing -- corrupting a tuple in the other existing table in the
file, "junk". This is possible to do, but we have to repeat a lot of
the setup code done for the "test" table to get the line pointer array
and loop through and corrupt a tuple. In order to do this well, I
would want to refactor some of the boilerplate into a function. There
are other fiddly bits as well that I needed to consider. I think a
test like this could be useful coverage of the some of the possible
errors that could happen in heap_prune_satisfies_vacuum(), but it
definitely isn't coverage of pg_amcheck (and thus shouldn't go in that
file) and a whole new test which spins up a Postgres to cover
vacuum_error_callback() seemed like a bit much.

- Melanie
From e986940e546171d1f1d06f62a101d695a8481e7a Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 6 Sep 2023 14:57:20 -0400
Subject: [PATCH v4 2/3] Move heap_page_prune output parameters into struct

Add PruneResult, a structure containing the output parameters and result
of heap_page_prune(). Reorganizing the results of heap_page_prune() into
a struct simplifies the function signature and provides a location for
future commits to store additional output parameters.

Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com
---
 src/backend/access/heap/pruneheap.c  | 33 +++++++++++++---------------
 src/backend/access/heap/vacuumlazy.c | 17 +++++---------
 src/include/access/heapam.h          | 13 +++++++++--
 src/tools/pgindent/typedefs.list     |  1 +
 4 files changed, 33 insertions(+), 31 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 392b54f093..5ac286e152 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -155,15 +155,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 		 */
 		if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree)
 		{
-			int			ndeleted,
-						nnewlpdead;
+			PruneResult presult;
 
-			ndeleted = heap_page_prune(relation, buffer, vistest,
-									   &nnewlpdead, NULL);
+			heap_page_prune(relation, buffer, vistest, &presult, NULL);
 
 			/*
 			 * Report the number of tuples reclaimed to pgstats.  This is
-			 * ndeleted minus the number of newly-LP_DEAD-set items.
+			 * presult.ndeleted minus the number of newly-LP_DEAD-set items.
 			 *
 			 * We derive the number of dead tuples like this to avoid totally
 			 * forgetting about items that were set to LP_DEAD, since they
@@ -175,9 +173,9 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 			 * tracks ndeleted, since it will set the same LP_DEAD items to
 			 * LP_UNUSED separately.
 			 */
-			if (ndeleted > nnewlpdead)
+			if (presult.ndeleted > presult.nnewlpdead)
 				pgstat_update_heap_dead_tuples(relation,
-											   ndeleted - nnewlpdead);
+											   presult.ndeleted - presult.nnewlpdead);
 		}
 
 		/* And release buffer lock */
@@ -204,24 +202,22 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  * (see heap_prune_satisfies_vacuum and
  * HeapTupleSatisfiesVacuum).
  *
- * Sets *nnewlpdead for caller, indicating the number of items that were
- * newly set LP_DEAD during prune operation.
+ * presult contains output parameters needed by callers such as the number of
+ * tuples removed and the number of line pointers newly marked LP_DEAD.
+ * heap_page_prune() is responsible for initializing it.
  *
  * off_loc is the current offset into the line pointer array while pruning.
  * This is used by vacuum to populate the error context message. On-access
  * pruning has no such callback mechanism for populating the error context, so
  * it passes NULL. When provided by the caller, off_loc is set exclusively by
  * heap_page_prune().
- *
- * Returns the number of tuples deleted from the page during this call.
  */
-int
+void
 heap_page_prune(Relation relation, Buffer buffer,
 				GlobalVisState *vistest,
-				int *nnewlpdead,
+				PruneResult *presult,
 				OffsetNumber *off_loc)
 {
-	int			ndeleted = 0;
 	Page		page = BufferGetPage(buffer);
 	BlockNumber blockno = BufferGetBlockNumber(buffer);
 	OffsetNumber offnum,
@@ -247,6 +243,9 @@ heap_page_prune(Relation relation, Buffer buffer,
 	prstate.nredirected = prstate.ndead = prstate.nunused = 0;
 	memset(prstate.marked, 0, sizeof(prstate.marked));
 
+	presult->ndeleted = 0;
+	presult->nnewlpdead = 0;
+
 	maxoff = PageGetMaxOffsetNumber(page);
 	tup.t_tableOid = RelationGetRelid(prstate.rel);
 
@@ -321,7 +320,7 @@ heap_page_prune(Relation relation, Buffer buffer,
 			continue;
 
 		/* Process this item or chain of items */
-		ndeleted += heap_prune_chain(buffer, offnum, &prstate);
+		presult->ndeleted += heap_prune_chain(buffer, offnum, &prstate);
 	}
 
 	/* Clear the offset information once we have processed the given page. */
@@ -422,9 +421,7 @@ heap_page_prune(Relation relation, Buffer buffer,
 	END_CRIT_SECTION();
 
 	/* Record number of newly-set-LP_DEAD items for caller */
-	*nnewlpdead = prstate.ndead;
-
-	return ndeleted;
+	presult->nnewlpdead = prstate.ndead;
 }
 
 
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 102cc97358..7ead9cfe9d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1544,12 +1544,11 @@ lazy_scan_prune(LVRelState *vacrel,
 	ItemId		itemid;
 	HeapTupleData tuple;
 	HTSV_Result res;
-	int			tuples_deleted,
-				tuples_frozen,
+	PruneResult presult;
+	int			tuples_frozen,
 				lpdead_items,
 				live_tuples,
 				recently_dead_tuples;
-	int			nnewlpdead;
 	HeapPageFreeze pagefrz;
 	int64		fpi_before = pgWalUsage.wal_fpi;
 	OffsetNumber deadoffsets[MaxHeapTuplesPerPage];
@@ -1572,7 +1571,6 @@ retry:
 	pagefrz.FreezePageRelminMxid = vacrel->NewRelminMxid;
 	pagefrz.NoFreezePageRelfrozenXid = vacrel->NewRelfrozenXid;
 	pagefrz.NoFreezePageRelminMxid = vacrel->NewRelminMxid;
-	tuples_deleted = 0;
 	tuples_frozen = 0;
 	lpdead_items = 0;
 	live_tuples = 0;
@@ -1581,9 +1579,8 @@ retry:
 	/*
 	 * Prune all HOT-update chains in this page.
 	 *
-	 * We count tuples removed by the pruning step as tuples_deleted.  Its
-	 * final value can be thought of as the number of tuples that have been
-	 * deleted from the table.  It should not be confused with lpdead_items;
+	 * We count the number of tuples removed from the page by the pruning step
+	 * in presult.ndeleted. It should not be confused with lpdead_items;
 	 * lpdead_items's final value can be thought of as the number of tuples
 	 * that were deleted from indexes.
 	 *
@@ -1591,9 +1588,7 @@ retry:
 	 * current offset when populating the error context message, so it is
 	 * imperative that we pass its location to heap_page_prune.
 	 */
-	tuples_deleted = heap_page_prune(rel, buf, vacrel->vistest,
-									 &nnewlpdead,
-									 &vacrel->offnum);
+	heap_page_prune(rel, buf, vacrel->vistest, &presult, &vacrel->offnum);
 
 	/*
 	 * Now scan the page to collect LP_DEAD items and check for tuples
@@ -1933,7 +1928,7 @@ retry:
 	}
 
 	/* Finally, add page-local counts to whole-VACUUM counts */
-	vacrel->tuples_deleted += tuples_deleted;
+	vacrel->tuples_deleted += presult.ndeleted;
 	vacrel->tuples_frozen += tuples_frozen;
 	vacrel->lpdead_items += lpdead_items;
 	vacrel->live_tuples += live_tuples;
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 6598c4d7d8..2d3f149e4f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -191,6 +191,15 @@ typedef struct HeapPageFreeze
 
 } HeapPageFreeze;
 
+/*
+ * Per-page state returned from pruning
+ */
+typedef struct PruneResult
+{
+	int			ndeleted;		/* Number of tuples deleted from the page */
+	int			nnewlpdead;		/* Number of newly LP_DEAD items */
+} PruneResult;
+
 /* ----------------
  *		function prototypes for heap access method
  *
@@ -284,9 +293,9 @@ extern TransactionId heap_index_delete_tuples(Relation rel,
 /* in heap/pruneheap.c */
 struct GlobalVisState;
 extern void heap_page_prune_opt(Relation relation, Buffer buffer);
-extern int	heap_page_prune(Relation relation, Buffer buffer,
+extern void heap_page_prune(Relation relation, Buffer buffer,
 							struct GlobalVisState *vistest,
-							int *nnewlpdead,
+							PruneResult *presult,
 							OffsetNumber *off_loc);
 extern void heap_page_prune_execute(Buffer buffer,
 									OffsetNumber *redirected, int nredirected,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 0656c94416..8c30c0b1b6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2150,6 +2150,7 @@ ProjectionPath
 PromptInterruptContext
 ProtocolVersion
 PrsStorage
+PruneResult
 PruneState
 PruneStepResult
 PsqlScanCallbacks
-- 
2.37.2

From 1a621864420cd17db1bef0546990d8298e3a3d34 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 6 Sep 2023 16:54:41 -0400
Subject: [PATCH v4 3/3] Reuse heap_page_prune() tuple visibility statuses

heap_page_prune() obtains the HTSV_Result (tuple visibility status)
returned from HeapTupleSatisfiesVacuum() for every tuple on the page and
stores them in an array. By making this array available to
heap_page_prune()'s caller lazy_scan_prune(), we can avoid an additional
call to HeapTupleSatisfiesVacuum() when freezing the tuples and
recording LP_DEAD items for vacuum. This saves resources and eliminates
the possibility that vacuuming corrupted data results in a hang due to
endless retry looping.

This replaces the retry mechanism introduced in 8523492d4 to handle cases in
which a tuple's inserting transaction aborted between the visibility check in
heap_page_prune() and lazy_scan_prune()'s call to HeapTupleSatisfiesVacuum() --
rendering it dead but without a dead line pointer. We can instead reuse the
tuple's original visibility status, circumventing any disagreements.

Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com
---
 src/backend/access/heap/pruneheap.c  | 37 ++++++++++++------------
 src/backend/access/heap/vacuumlazy.c | 42 ++++++++--------------------
 src/include/access/heapam.h          | 11 ++++++++
 3 files changed, 41 insertions(+), 49 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 5ac286e152..256039c6be 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -53,16 +53,6 @@ typedef struct
 	 * 1. Otherwise every access would need to subtract 1.
 	 */
 	bool		marked[MaxHeapTuplesPerPage + 1];
-
-	/*
-	 * Tuple visibility is only computed once for each tuple, for correctness
-	 * and efficiency reasons; see comment in heap_page_prune() for details.
-	 * This is of type int8[], instead of HTSV_Result[], so we can use -1 to
-	 * indicate no visibility has been computed, e.g. for LP_DEAD items.
-	 *
-	 * Same indexing as ->marked.
-	 */
-	int8		htsv[MaxHeapTuplesPerPage + 1];
 } PruneState;
 
 /* Local functions */
@@ -71,6 +61,7 @@ static HTSV_Result heap_prune_satisfies_vacuum(PruneState *prstate,
 											   Buffer buffer);
 static int	heap_prune_chain(Buffer buffer,
 							 OffsetNumber rootoffnum,
+							 int8 *htsv,
 							 PruneState *prstate);
 static void heap_prune_record_prunable(PruneState *prstate, TransactionId xid);
 static void heap_prune_record_redirect(PruneState *prstate,
@@ -243,6 +234,10 @@ heap_page_prune(Relation relation, Buffer buffer,
 	prstate.nredirected = prstate.ndead = prstate.nunused = 0;
 	memset(prstate.marked, 0, sizeof(prstate.marked));
 
+	/*
+	 * presult->htsv is not initialized here because all ntuple spots in the
+	 * array will be set either to a valid HTSV_Result value or -1.
+	 */
 	presult->ndeleted = 0;
 	presult->nnewlpdead = 0;
 
@@ -279,7 +274,7 @@ heap_page_prune(Relation relation, Buffer buffer,
 		/* Nothing to do if slot doesn't contain a tuple */
 		if (!ItemIdIsNormal(itemid))
 		{
-			prstate.htsv[offnum] = -1;
+			presult->htsv[offnum] = -1;
 			continue;
 		}
 
@@ -295,8 +290,8 @@ heap_page_prune(Relation relation, Buffer buffer,
 		if (off_loc)
 			*off_loc = offnum;
 
-		prstate.htsv[offnum] = heap_prune_satisfies_vacuum(&prstate, &tup,
-														   buffer);
+		presult->htsv[offnum] = heap_prune_satisfies_vacuum(&prstate, &tup,
+															buffer);
 	}
 
 	/* Scan the page */
@@ -320,7 +315,8 @@ heap_page_prune(Relation relation, Buffer buffer,
 			continue;
 
 		/* Process this item or chain of items */
-		presult->ndeleted += heap_prune_chain(buffer, offnum, &prstate);
+		presult->ndeleted += heap_prune_chain(buffer, offnum,
+											  presult->htsv, &prstate);
 	}
 
 	/* Clear the offset information once we have processed the given page. */
@@ -449,6 +445,8 @@ heap_prune_satisfies_vacuum(PruneState *prstate, HeapTuple tup, Buffer buffer)
 /*
  * Prune specified line pointer or a HOT chain originating at line pointer.
  *
+ * Tuple visibility information is provided in htsv.
+ *
  * If the item is an index-referenced tuple (i.e. not a heap-only tuple),
  * the HOT chain is pruned by removing all DEAD tuples at the start of the HOT
  * chain.  We also prune any RECENTLY_DEAD tuples preceding a DEAD tuple.
@@ -476,7 +474,8 @@ heap_prune_satisfies_vacuum(PruneState *prstate, HeapTuple tup, Buffer buffer)
  * Returns the number of tuples (to be) deleted from the page.
  */
 static int
-heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
+heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
+				 int8 *htsv, PruneState *prstate)
 {
 	int			ndeleted = 0;
 	Page		dp = (Page) BufferGetPage(buffer);
@@ -497,7 +496,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
 	 */
 	if (ItemIdIsNormal(rootlp))
 	{
-		Assert(prstate->htsv[rootoffnum] != -1);
+		Assert(htsv[rootoffnum] != -1);
 		htup = (HeapTupleHeader) PageGetItem(dp, rootlp);
 
 		if (HeapTupleHeaderIsHeapOnly(htup))
@@ -520,7 +519,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
 			 * either here or while following a chain below.  Whichever path
 			 * gets there first will mark the tuple unused.
 			 */
-			if (prstate->htsv[rootoffnum] == HEAPTUPLE_DEAD &&
+			if (htsv[rootoffnum] == HEAPTUPLE_DEAD &&
 				!HeapTupleHeaderIsHotUpdated(htup))
 			{
 				heap_prune_record_unused(prstate, rootoffnum);
@@ -588,7 +587,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
 			break;
 
 		Assert(ItemIdIsNormal(lp));
-		Assert(prstate->htsv[offnum] != -1);
+		Assert(htsv[offnum] != -1);
 		htup = (HeapTupleHeader) PageGetItem(dp, lp);
 
 		/*
@@ -608,7 +607,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
 		 */
 		tupdead = recent_dead = false;
 
-		switch ((HTSV_Result) prstate->htsv[offnum])
+		switch ((HTSV_Result) htsv[offnum])
 		{
 			case HEAPTUPLE_DEAD:
 				tupdead = true;
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 7ead9cfe9d..6c328f3ed5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1524,12 +1524,12 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
  * of complexity just so we could deal with tuples that were DEAD to VACUUM,
  * but nevertheless were left with storage after pruning.
  *
- * The approach we take now is to restart pruning when the race condition is
- * detected.  This allows heap_page_prune() to prune the tuples inserted by
- * the now-aborted transaction.  This is a little crude, but it guarantees
- * that any items that make it into the dead_items array are simple LP_DEAD
- * line pointers, and that every remaining item with tuple storage is
- * considered as a candidate for freezing.
+ * As of Postgres 17, we circumvent this problem altogether by reusing the
+ * result of heap_page_prune()'s visibility check. Without the second call to
+ * HeapTupleSatisfiesVacuum(), there is no new HTSV_Result and there can be no
+ * disagreement. The tuple's TID won't be added to the array of dead tuple TIDs
+ * for vacuum, thus vacuum will never be tasked with reaping a tuple with
+ * storage.
  */
 static void
 lazy_scan_prune(LVRelState *vacrel,
@@ -1542,8 +1542,6 @@ lazy_scan_prune(LVRelState *vacrel,
 	OffsetNumber offnum,
 				maxoff;
 	ItemId		itemid;
-	HeapTupleData tuple;
-	HTSV_Result res;
 	PruneResult presult;
 	int			tuples_frozen,
 				lpdead_items,
@@ -1563,8 +1561,6 @@ lazy_scan_prune(LVRelState *vacrel,
 	 */
 	maxoff = PageGetMaxOffsetNumber(page);
 
-retry:
-
 	/* Initialize (or reset) page-level state */
 	pagefrz.freeze_required = false;
 	pagefrz.FreezePageRelfrozenXid = vacrel->NewRelfrozenXid;
@@ -1604,6 +1600,7 @@ retry:
 		 offnum <= maxoff;
 		 offnum = OffsetNumberNext(offnum))
 	{
+		HeapTupleHeader htup;
 		bool		totally_frozen;
 
 		/*
@@ -1646,22 +1643,7 @@ retry:
 
 		Assert(ItemIdIsNormal(itemid));
 
-		ItemPointerSet(&(tuple.t_self), blkno, offnum);
-		tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
-		tuple.t_len = ItemIdGetLength(itemid);
-		tuple.t_tableOid = RelationGetRelid(rel);
-
-		/*
-		 * DEAD tuples are almost always pruned into LP_DEAD line pointers by
-		 * heap_page_prune(), but it's possible that the tuple state changed
-		 * since heap_page_prune() looked.  Handle that here by restarting.
-		 * (See comments at the top of function for a full explanation.)
-		 */
-		res = HeapTupleSatisfiesVacuum(&tuple, vacrel->cutoffs.OldestXmin,
-									   buf);
-
-		if (unlikely(res == HEAPTUPLE_DEAD))
-			goto retry;
+		htup = (HeapTupleHeader) PageGetItem(page, itemid);
 
 		/*
 		 * The criteria for counting a tuple as live in this block need to
@@ -1682,7 +1664,7 @@ retry:
 		 * (Cases where we bypass index vacuuming will violate this optimistic
 		 * assumption, but the overall impact of that should be negligible.)
 		 */
-		switch (res)
+		switch ((HTSV_Result) presult.htsv[offnum])
 		{
 			case HEAPTUPLE_LIVE:
 
@@ -1704,7 +1686,7 @@ retry:
 				{
 					TransactionId xmin;
 
-					if (!HeapTupleHeaderXminCommitted(tuple.t_data))
+					if (!HeapTupleHeaderXminCommitted(htup))
 					{
 						prunestate->all_visible = false;
 						break;
@@ -1714,7 +1696,7 @@ retry:
 					 * The inserter definitely committed. But is it old enough
 					 * that everyone sees it as committed?
 					 */
-					xmin = HeapTupleHeaderGetXmin(tuple.t_data);
+					xmin = HeapTupleHeaderGetXmin(htup);
 					if (!TransactionIdPrecedes(xmin,
 											   vacrel->cutoffs.OldestXmin))
 					{
@@ -1768,7 +1750,7 @@ retry:
 		prunestate->hastup = true;	/* page makes rel truncation unsafe */
 
 		/* Tuple with storage -- consider need to freeze */
-		if (heap_prepare_freeze_tuple(tuple.t_data, &vacrel->cutoffs, &pagefrz,
+		if (heap_prepare_freeze_tuple(htup, &vacrel->cutoffs, &pagefrz,
 									  &frozen[tuples_frozen], &totally_frozen))
 		{
 			/* Save prepared freeze plan for later */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 2d3f149e4f..864ae561c4 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -198,6 +198,17 @@ typedef struct PruneResult
 {
 	int			ndeleted;		/* Number of tuples deleted from the page */
 	int			nnewlpdead;		/* Number of newly LP_DEAD items */
+
+	/*
+	 * Tuple visibility is only computed once for each tuple, for correctness
+	 * and efficiency reasons; see comment in heap_page_prune() for details.
+	 * This is of type int8[], instead of HTSV_Result[], so we can use -1 to
+	 * indicate no visibility has been computed, e.g. for LP_DEAD items.
+	 *
+	 * This needs to be MaxHeapTuplesPerPage + 1 long as FirstOffsetNumber is
+	 * 1. Otherwise every access would need to subtract 1.
+	 */
+	int8		htsv[MaxHeapTuplesPerPage + 1];
 } PruneResult;
 
 /* ----------------
-- 
2.37.2

From 661d9837ba34552fa301dc17b1fd74dd833aec72 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Thu, 7 Sep 2023 17:35:46 -0400
Subject: [PATCH v4 1/3] Clarify usage of heap_page_prune parameter off_loc

heap_page_prune() takes an optional parameter, off_loc, where it saves
the current offset while pruning. Vacuum passes the address of
vacrel->offnum, which is used by vacuum_error_callback() to populate the
error context message. On-access pruning has no such callback mechanism,
so it passes NULL. When off_loc is not provided the error message
context simply omits the offset. Because vacrel->offnum is specifically
used by vacuum_error_callback(), it is required that it be passed to
heap_page_prune() for the error message to be correctly populated.
Clarify this requirement in a comment in lazy_scan_prune() before
calling heap_page_prune().

While we are at it, provide some additional detail about off_loc in
heap_page_prune()'s function header comment.

Discussion: https://postgr.es/m/CA%2BTgmoZ8E7DXzAZkV%3DRF9tfJkQFH%3Da3KkjQhYNY9W26Jh7AN%2Bw%40mail.gmail.com
---
 src/backend/access/heap/pruneheap.c  | 7 +++++--
 src/backend/access/heap/vacuumlazy.c | 4 ++++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 18193efa23..392b54f093 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -207,8 +207,11 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  * Sets *nnewlpdead for caller, indicating the number of items that were
  * newly set LP_DEAD during prune operation.
  *
- * off_loc is the offset location required by the caller to use in error
- * callback.
+ * off_loc is the current offset into the line pointer array while pruning.
+ * This is used by vacuum to populate the error context message. On-access
+ * pruning has no such callback mechanism for populating the error context, so
+ * it passes NULL. When provided by the caller, off_loc is set exclusively by
+ * heap_page_prune().
  *
  * Returns the number of tuples deleted from the page during this call.
  */
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1a05adfa61..102cc97358 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1586,6 +1586,10 @@ retry:
 	 * deleted from the table.  It should not be confused with lpdead_items;
 	 * lpdead_items's final value can be thought of as the number of tuples
 	 * that were deleted from indexes.
+	 *
+	 * vacuum_error_callback() looks specifically at vacrel->offnum for the
+	 * current offset when populating the error context message, so it is
+	 * imperative that we pass its location to heap_page_prune.
 	 */
 	tuples_deleted = heap_page_prune(rel, buf, vacrel->vistest,
 									 &nnewlpdead,
-- 
2.37.2

Reply via email to