On Thu, Sep 7, 2023 at 3:30 PM Robert Haas <[email protected]> wrote:
>
> On Thu, Sep 7, 2023 at 3:10 PM Melanie Plageman
> <[email protected]> wrote:
> > I can fix it by changing the type of PruneResult->off_loc to be an
> > OffsetNumber pointer. This does mean that PruneResult will be
> > initialized partially by heap_page_prune() callers. I wonder if you
> > think that undermines the case for making a new struct.
>
> I think that it undermines the case for including that particular
> argument in the struct. It's not really a Prune*Result* if the caller
> initializes it in part. It seems fairly reasonable to still have a
> PruneResult struct for the other stuff, though, at least to me. How do
> you see it?
Yes, I think off_loc probably didn't belong in PruneResult to begin with.
It is inherently not a result of pruning, since it is only used in the
event that pruning doesn't complete (i.e., errors out).
In the attached v4 patch set, I have both PruneResult and off_loc as
parameters to heap_page_prune(). I have also added a separate commit
which adds comments both above heap_page_prune()'s call site in
lazy_scan_prune() and in the heap_page_prune() function header
clarifying the various points we discussed.
> > I still want to eliminate the NULL check of off_loc in
> > heap_page_prune() by making it a required parameter. Even though
> > on-access pruning does not have an error callback mechanism which uses
> > the offset, it seems better to have a pointless local variable in
> > heap_page_prune_opt() than to do a NULL check for every tuple pruned.
>
> It doesn't seem important to me unless it improves performance. If
> it's just stylistic, I don't object, but I also don't really see a
> reason to care.
I did some performance testing but, as expected, I couldn't concoct a
scenario where the overhead was noticeable in a profile. So much else
is happening in that code that the NULL check basically doesn't matter
(even though it isn't optimized out).
I mostly wanted to remove the NULL checks because I found them
distracting (so, a stylistic complaint). However, upon further
reflection, I actually think it is better if heap_page_prune_opt()
passes NULL. heap_page_prune_opt() has no error callback mechanism that
could use it, and passing a valid pointer would imply otherwise. Also, as
you said, off_loc will always be InvalidOffsetNumber when
heap_page_prune() returns normally, so having heap_page_prune_opt()
pass a dummy value might actually be more confusing for future
programmers.
> > > I haven't run the regression suite with 0001 applied. I'm guessing
> > > that you did, and that they passed, which if true means that we don't
> > > have a test for this, which is a shame, although writing such a test
> > > might be a bit tricky. If there is a test case for this and you didn't
> > > run it, woops.
> >
> > There is no test coverage for the vacuum error callback context
> > currently (tests passed for me). I looked into how we might add such a
> > test. First, I investigated what kind of errors might occur during
> > heap_prune_satisfies_vacuum(). Some of the multixact functions called
> > by HeapTupleSatisfiesVacuumHorizon() could error out -- for example
> > GetMultiXactIdMembers(). It seems difficult to trigger the errors in
> > GetMultiXactIdMembers(), as we would have to cause wraparound. It
> > would be even more difficult to ensure that we hit those specific
> > errors from a call stack containing heap_prune_satisfies_vacuum(). As
> > such, I'm not sure I can think of a way to protect future developers
> > from repeating my mistake--apart from improving the comment like you
> > mentioned.
>
> 004_verify_heapam.pl has some tests that intentionally corrupt pages
> and then use pg_amcheck to detect the corruption. Such an approach
> could probably also be used here. But it's a pain to get such tests
> right, because any change to the page format due to endianness,
> different block size, or whatever can make carefully-written tests go
> boom.
Cool! I hadn't examined how these tests work until now. I took a stab
at writing a test in the existing 004_verify_heapam.pl. The simplest
thing would be if we could just vacuum the corrupted table ("test")
after running pg_amcheck and compare the error context to our
expectation. I found that this didn't work, though. In an assert
build, vacuum trips an assert before it hits an error while vacuuming
a corrupted tuple in the "test" table. There might be a way of
modifying the existing test code to avoid this, but I tried the next
easiest thing -- corrupting a tuple in the other existing table in the
file, "junk". This is possible to do, but we have to repeat a lot of
the setup code done for the "test" table to get the line pointer array
and loop through and corrupt a tuple. In order to do this well, I
would want to refactor some of the boilerplate into a function. There
are other fiddly bits as well that I needed to consider. I think a
test like this could provide useful coverage of some of the possible
errors that could happen in heap_prune_satisfies_vacuum(), but it
definitely isn't coverage of pg_amcheck (and thus shouldn't go in that
file), and a whole new test that spins up a Postgres instance just to
cover vacuum_error_callback() seemed like a bit much.
- Melanie
From e986940e546171d1f1d06f62a101d695a8481e7a Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 6 Sep 2023 14:57:20 -0400
Subject: [PATCH v4 2/3] Move heap_page_prune output parameters into struct
Add PruneResult, a structure containing the output parameters and result
of heap_page_prune(). Reorganizing the results of heap_page_prune() into
a struct simplifies the function signature and provides a location for
future commits to store additional output parameters.
Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com
---
src/backend/access/heap/pruneheap.c | 33 +++++++++++++---------------
src/backend/access/heap/vacuumlazy.c | 17 +++++---------
src/include/access/heapam.h | 13 +++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 33 insertions(+), 31 deletions(-)
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 392b54f093..5ac286e152 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -155,15 +155,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
*/
if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree)
{
- int ndeleted,
- nnewlpdead;
+ PruneResult presult;
- ndeleted = heap_page_prune(relation, buffer, vistest,
- &nnewlpdead, NULL);
+ heap_page_prune(relation, buffer, vistest, &presult, NULL);
/*
* Report the number of tuples reclaimed to pgstats. This is
- * ndeleted minus the number of newly-LP_DEAD-set items.
+ * presult.ndeleted minus the number of newly-LP_DEAD-set items.
*
* We derive the number of dead tuples like this to avoid totally
* forgetting about items that were set to LP_DEAD, since they
@@ -175,9 +173,9 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
* tracks ndeleted, since it will set the same LP_DEAD items to
* LP_UNUSED separately.
*/
- if (ndeleted > nnewlpdead)
+ if (presult.ndeleted > presult.nnewlpdead)
pgstat_update_heap_dead_tuples(relation,
- ndeleted - nnewlpdead);
+ presult.ndeleted - presult.nnewlpdead);
}
/* And release buffer lock */
@@ -204,24 +202,22 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
* (see heap_prune_satisfies_vacuum and
* HeapTupleSatisfiesVacuum).
*
- * Sets *nnewlpdead for caller, indicating the number of items that were
- * newly set LP_DEAD during prune operation.
+ * presult contains output parameters needed by callers such as the number of
+ * tuples removed and the number of line pointers newly marked LP_DEAD.
+ * heap_page_prune() is responsible for initializing it.
*
* off_loc is the current offset into the line pointer array while pruning.
* This is used by vacuum to populate the error context message. On-access
* pruning has no such callback mechanism for populating the error context, so
* it passes NULL. When provided by the caller, off_loc is set exclusively by
* heap_page_prune().
- *
- * Returns the number of tuples deleted from the page during this call.
*/
-int
+void
heap_page_prune(Relation relation, Buffer buffer,
GlobalVisState *vistest,
- int *nnewlpdead,
+ PruneResult *presult,
OffsetNumber *off_loc)
{
- int ndeleted = 0;
Page page = BufferGetPage(buffer);
BlockNumber blockno = BufferGetBlockNumber(buffer);
OffsetNumber offnum,
@@ -247,6 +243,9 @@ heap_page_prune(Relation relation, Buffer buffer,
prstate.nredirected = prstate.ndead = prstate.nunused = 0;
memset(prstate.marked, 0, sizeof(prstate.marked));
+ presult->ndeleted = 0;
+ presult->nnewlpdead = 0;
+
maxoff = PageGetMaxOffsetNumber(page);
tup.t_tableOid = RelationGetRelid(prstate.rel);
@@ -321,7 +320,7 @@ heap_page_prune(Relation relation, Buffer buffer,
continue;
/* Process this item or chain of items */
- ndeleted += heap_prune_chain(buffer, offnum, &prstate);
+ presult->ndeleted += heap_prune_chain(buffer, offnum, &prstate);
}
/* Clear the offset information once we have processed the given page. */
@@ -422,9 +421,7 @@ heap_page_prune(Relation relation, Buffer buffer,
END_CRIT_SECTION();
/* Record number of newly-set-LP_DEAD items for caller */
- *nnewlpdead = prstate.ndead;
-
- return ndeleted;
+ presult->nnewlpdead = prstate.ndead;
}
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 102cc97358..7ead9cfe9d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1544,12 +1544,11 @@ lazy_scan_prune(LVRelState *vacrel,
ItemId itemid;
HeapTupleData tuple;
HTSV_Result res;
- int tuples_deleted,
- tuples_frozen,
+ PruneResult presult;
+ int tuples_frozen,
lpdead_items,
live_tuples,
recently_dead_tuples;
- int nnewlpdead;
HeapPageFreeze pagefrz;
int64 fpi_before = pgWalUsage.wal_fpi;
OffsetNumber deadoffsets[MaxHeapTuplesPerPage];
@@ -1572,7 +1571,6 @@ retry:
pagefrz.FreezePageRelminMxid = vacrel->NewRelminMxid;
pagefrz.NoFreezePageRelfrozenXid = vacrel->NewRelfrozenXid;
pagefrz.NoFreezePageRelminMxid = vacrel->NewRelminMxid;
- tuples_deleted = 0;
tuples_frozen = 0;
lpdead_items = 0;
live_tuples = 0;
@@ -1581,9 +1579,8 @@ retry:
/*
* Prune all HOT-update chains in this page.
*
- * We count tuples removed by the pruning step as tuples_deleted. Its
- * final value can be thought of as the number of tuples that have been
- * deleted from the table. It should not be confused with lpdead_items;
+ * We count the number of tuples removed from the page by the pruning step
+ * in presult.ndeleted. It should not be confused with lpdead_items;
* lpdead_items's final value can be thought of as the number of tuples
* that were deleted from indexes.
*
@@ -1591,9 +1588,7 @@ retry:
* current offset when populating the error context message, so it is
* imperative that we pass its location to heap_page_prune.
*/
- tuples_deleted = heap_page_prune(rel, buf, vacrel->vistest,
- &nnewlpdead,
- &vacrel->offnum);
+ heap_page_prune(rel, buf, vacrel->vistest, &presult, &vacrel->offnum);
/*
* Now scan the page to collect LP_DEAD items and check for tuples
@@ -1933,7 +1928,7 @@ retry:
}
/* Finally, add page-local counts to whole-VACUUM counts */
- vacrel->tuples_deleted += tuples_deleted;
+ vacrel->tuples_deleted += presult.ndeleted;
vacrel->tuples_frozen += tuples_frozen;
vacrel->lpdead_items += lpdead_items;
vacrel->live_tuples += live_tuples;
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 6598c4d7d8..2d3f149e4f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -191,6 +191,15 @@ typedef struct HeapPageFreeze
} HeapPageFreeze;
+/*
+ * Per-page state returned from pruning
+ */
+typedef struct PruneResult
+{
+ int ndeleted; /* Number of tuples deleted from the page */
+ int nnewlpdead; /* Number of newly LP_DEAD items */
+} PruneResult;
+
/* ----------------
* function prototypes for heap access method
*
@@ -284,9 +293,9 @@ extern TransactionId heap_index_delete_tuples(Relation rel,
/* in heap/pruneheap.c */
struct GlobalVisState;
extern void heap_page_prune_opt(Relation relation, Buffer buffer);
-extern int heap_page_prune(Relation relation, Buffer buffer,
+extern void heap_page_prune(Relation relation, Buffer buffer,
struct GlobalVisState *vistest,
- int *nnewlpdead,
+ PruneResult *presult,
OffsetNumber *off_loc);
extern void heap_page_prune_execute(Buffer buffer,
OffsetNumber *redirected, int nredirected,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 0656c94416..8c30c0b1b6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2150,6 +2150,7 @@ ProjectionPath
PromptInterruptContext
ProtocolVersion
PrsStorage
+PruneResult
PruneState
PruneStepResult
PsqlScanCallbacks
--
2.37.2
From 1a621864420cd17db1bef0546990d8298e3a3d34 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 6 Sep 2023 16:54:41 -0400
Subject: [PATCH v4 3/3] Reuse heap_page_prune() tuple visibility statuses
heap_page_prune() obtains the HTSV_Result (tuple visibility status)
returned from HeapTupleSatisfiesVacuum() for every tuple on the page and
stores them in an array. By making this array available to
heap_page_prune()'s caller lazy_scan_prune(), we can avoid an additional
call to HeapTupleSatisfiesVacuum() when freezing the tuples and
recording LP_DEAD items for vacuum. This saves resources and eliminates
the possibility that vacuuming corrupted data results in a hang due to
endless retry looping.
This replaces the retry mechanism introduced in 8523492d4 to handle cases in
which a tuple's inserting transaction aborted between the visibility check in
heap_page_prune() and lazy_scan_prune()'s call to HeapTupleSatisfiesVacuum() --
rendering it dead but without a dead line pointer. We can instead reuse the
tuple's original visibility status, circumventing any disagreements.
Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com
---
src/backend/access/heap/pruneheap.c | 37 ++++++++++++------------
src/backend/access/heap/vacuumlazy.c | 42 ++++++++--------------------
src/include/access/heapam.h | 11 ++++++++
3 files changed, 41 insertions(+), 49 deletions(-)
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 5ac286e152..256039c6be 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -53,16 +53,6 @@ typedef struct
* 1. Otherwise every access would need to subtract 1.
*/
bool marked[MaxHeapTuplesPerPage + 1];
-
- /*
- * Tuple visibility is only computed once for each tuple, for correctness
- * and efficiency reasons; see comment in heap_page_prune() for details.
- * This is of type int8[], instead of HTSV_Result[], so we can use -1 to
- * indicate no visibility has been computed, e.g. for LP_DEAD items.
- *
- * Same indexing as ->marked.
- */
- int8 htsv[MaxHeapTuplesPerPage + 1];
} PruneState;
/* Local functions */
@@ -71,6 +61,7 @@ static HTSV_Result heap_prune_satisfies_vacuum(PruneState *prstate,
Buffer buffer);
static int heap_prune_chain(Buffer buffer,
OffsetNumber rootoffnum,
+ int8 *htsv,
PruneState *prstate);
static void heap_prune_record_prunable(PruneState *prstate, TransactionId xid);
static void heap_prune_record_redirect(PruneState *prstate,
@@ -243,6 +234,10 @@ heap_page_prune(Relation relation, Buffer buffer,
prstate.nredirected = prstate.ndead = prstate.nunused = 0;
memset(prstate.marked, 0, sizeof(prstate.marked));
+ /*
+ * presult->htsv is not initialized here because all ntuple spots in the
+ * array will be set either to a valid HTSV_Result value or -1.
+ */
presult->ndeleted = 0;
presult->nnewlpdead = 0;
@@ -279,7 +274,7 @@ heap_page_prune(Relation relation, Buffer buffer,
/* Nothing to do if slot doesn't contain a tuple */
if (!ItemIdIsNormal(itemid))
{
- prstate.htsv[offnum] = -1;
+ presult->htsv[offnum] = -1;
continue;
}
@@ -295,8 +290,8 @@ heap_page_prune(Relation relation, Buffer buffer,
if (off_loc)
*off_loc = offnum;
- prstate.htsv[offnum] = heap_prune_satisfies_vacuum(&prstate, &tup,
- buffer);
+ presult->htsv[offnum] = heap_prune_satisfies_vacuum(&prstate, &tup,
+ buffer);
}
/* Scan the page */
@@ -320,7 +315,8 @@ heap_page_prune(Relation relation, Buffer buffer,
continue;
/* Process this item or chain of items */
- presult->ndeleted += heap_prune_chain(buffer, offnum, &prstate);
+ presult->ndeleted += heap_prune_chain(buffer, offnum,
+ presult->htsv, &prstate);
}
/* Clear the offset information once we have processed the given page. */
@@ -449,6 +445,8 @@ heap_prune_satisfies_vacuum(PruneState *prstate, HeapTuple tup, Buffer buffer)
/*
* Prune specified line pointer or a HOT chain originating at line pointer.
*
+ * Tuple visibility information is provided in htsv.
+ *
* If the item is an index-referenced tuple (i.e. not a heap-only tuple),
* the HOT chain is pruned by removing all DEAD tuples at the start of the HOT
* chain. We also prune any RECENTLY_DEAD tuples preceding a DEAD tuple.
@@ -476,7 +474,8 @@ heap_prune_satisfies_vacuum(PruneState *prstate, HeapTuple tup, Buffer buffer)
* Returns the number of tuples (to be) deleted from the page.
*/
static int
-heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
+heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
+ int8 *htsv, PruneState *prstate)
{
int ndeleted = 0;
Page dp = (Page) BufferGetPage(buffer);
@@ -497,7 +496,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
*/
if (ItemIdIsNormal(rootlp))
{
- Assert(prstate->htsv[rootoffnum] != -1);
+ Assert(htsv[rootoffnum] != -1);
htup = (HeapTupleHeader) PageGetItem(dp, rootlp);
if (HeapTupleHeaderIsHeapOnly(htup))
@@ -520,7 +519,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
* either here or while following a chain below. Whichever path
* gets there first will mark the tuple unused.
*/
- if (prstate->htsv[rootoffnum] == HEAPTUPLE_DEAD &&
+ if (htsv[rootoffnum] == HEAPTUPLE_DEAD &&
!HeapTupleHeaderIsHotUpdated(htup))
{
heap_prune_record_unused(prstate, rootoffnum);
@@ -588,7 +587,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
break;
Assert(ItemIdIsNormal(lp));
- Assert(prstate->htsv[offnum] != -1);
+ Assert(htsv[offnum] != -1);
htup = (HeapTupleHeader) PageGetItem(dp, lp);
/*
@@ -608,7 +607,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, PruneState *prstate)
*/
tupdead = recent_dead = false;
- switch ((HTSV_Result) prstate->htsv[offnum])
+ switch ((HTSV_Result) htsv[offnum])
{
case HEAPTUPLE_DEAD:
tupdead = true;
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 7ead9cfe9d..6c328f3ed5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1524,12 +1524,12 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
* of complexity just so we could deal with tuples that were DEAD to VACUUM,
* but nevertheless were left with storage after pruning.
*
- * The approach we take now is to restart pruning when the race condition is
- * detected. This allows heap_page_prune() to prune the tuples inserted by
- * the now-aborted transaction. This is a little crude, but it guarantees
- * that any items that make it into the dead_items array are simple LP_DEAD
- * line pointers, and that every remaining item with tuple storage is
- * considered as a candidate for freezing.
+ * As of Postgres 17, we circumvent this problem altogether by reusing the
+ * result of heap_page_prune()'s visibility check. Without the second call to
+ * HeapTupleSatisfiesVacuum(), there is no new HTSV_Result and there can be no
+ * disagreement. The tuple's TID won't be added to the array of dead tuple TIDs
+ * for vacuum, thus vacuum will never be tasked with reaping a tuple with
+ * storage.
*/
static void
lazy_scan_prune(LVRelState *vacrel,
@@ -1542,8 +1542,6 @@ lazy_scan_prune(LVRelState *vacrel,
OffsetNumber offnum,
maxoff;
ItemId itemid;
- HeapTupleData tuple;
- HTSV_Result res;
PruneResult presult;
int tuples_frozen,
lpdead_items,
@@ -1563,8 +1561,6 @@ lazy_scan_prune(LVRelState *vacrel,
*/
maxoff = PageGetMaxOffsetNumber(page);
-retry:
-
/* Initialize (or reset) page-level state */
pagefrz.freeze_required = false;
pagefrz.FreezePageRelfrozenXid = vacrel->NewRelfrozenXid;
@@ -1604,6 +1600,7 @@ retry:
offnum <= maxoff;
offnum = OffsetNumberNext(offnum))
{
+ HeapTupleHeader htup;
bool totally_frozen;
/*
@@ -1646,22 +1643,7 @@ retry:
Assert(ItemIdIsNormal(itemid));
- ItemPointerSet(&(tuple.t_self), blkno, offnum);
- tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
- tuple.t_len = ItemIdGetLength(itemid);
- tuple.t_tableOid = RelationGetRelid(rel);
-
- /*
- * DEAD tuples are almost always pruned into LP_DEAD line pointers by
- * heap_page_prune(), but it's possible that the tuple state changed
- * since heap_page_prune() looked. Handle that here by restarting.
- * (See comments at the top of function for a full explanation.)
- */
- res = HeapTupleSatisfiesVacuum(&tuple, vacrel->cutoffs.OldestXmin,
- buf);
-
- if (unlikely(res == HEAPTUPLE_DEAD))
- goto retry;
+ htup = (HeapTupleHeader) PageGetItem(page, itemid);
/*
* The criteria for counting a tuple as live in this block need to
@@ -1682,7 +1664,7 @@ retry:
* (Cases where we bypass index vacuuming will violate this optimistic
* assumption, but the overall impact of that should be negligible.)
*/
- switch (res)
+ switch ((HTSV_Result) presult.htsv[offnum])
{
case HEAPTUPLE_LIVE:
@@ -1704,7 +1686,7 @@ retry:
{
TransactionId xmin;
- if (!HeapTupleHeaderXminCommitted(tuple.t_data))
+ if (!HeapTupleHeaderXminCommitted(htup))
{
prunestate->all_visible = false;
break;
@@ -1714,7 +1696,7 @@ retry:
* The inserter definitely committed. But is it old enough
* that everyone sees it as committed?
*/
- xmin = HeapTupleHeaderGetXmin(tuple.t_data);
+ xmin = HeapTupleHeaderGetXmin(htup);
if (!TransactionIdPrecedes(xmin,
vacrel->cutoffs.OldestXmin))
{
@@ -1768,7 +1750,7 @@ retry:
prunestate->hastup = true; /* page makes rel truncation unsafe */
/* Tuple with storage -- consider need to freeze */
- if (heap_prepare_freeze_tuple(tuple.t_data, &vacrel->cutoffs, &pagefrz,
+ if (heap_prepare_freeze_tuple(htup, &vacrel->cutoffs, &pagefrz,
&frozen[tuples_frozen], &totally_frozen))
{
/* Save prepared freeze plan for later */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 2d3f149e4f..864ae561c4 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -198,6 +198,17 @@ typedef struct PruneResult
{
int ndeleted; /* Number of tuples deleted from the page */
int nnewlpdead; /* Number of newly LP_DEAD items */
+
+ /*
+ * Tuple visibility is only computed once for each tuple, for correctness
+ * and efficiency reasons; see comment in heap_page_prune() for details.
+ * This is of type int8[], instead of HTSV_Result[], so we can use -1 to
+ * indicate no visibility has been computed, e.g. for LP_DEAD items.
+ *
+ * This needs to be MaxHeapTuplesPerPage + 1 long as FirstOffsetNumber is
+ * 1. Otherwise every access would need to subtract 1.
+ */
+ int8 htsv[MaxHeapTuplesPerPage + 1];
} PruneResult;
/* ----------------
--
2.37.2
From 661d9837ba34552fa301dc17b1fd74dd833aec72 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Thu, 7 Sep 2023 17:35:46 -0400
Subject: [PATCH v4 1/3] Clarify usage of heap_page_prune parameter off_loc
heap_page_prune() takes an optional parameter, off_loc, where it saves
the current offset while pruning. Vacuum passes the address of
vacrel->offnum, which is used by vacuum_error_callback() to populate the
error context message. On-access pruning has no such callback mechanism,
so it passes NULL. When off_loc is not provided the error message
context simply omits the offset. Because vacrel->offnum is specifically
used by vacuum_error_callback(), it is required that it be passed to
heap_page_prune() for the error message to be correctly populated.
Clarify this requirement in a comment in lazy_scan_prune() before
calling heap_page_prune().
While we are at it, provide some additional detail about off_loc in
heap_page_prune()'s function header comment.
Discussion: https://postgr.es/m/CA%2BTgmoZ8E7DXzAZkV%3DRF9tfJkQFH%3Da3KkjQhYNY9W26Jh7AN%2Bw%40mail.gmail.com
---
src/backend/access/heap/pruneheap.c | 7 +++++--
src/backend/access/heap/vacuumlazy.c | 4 ++++
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 18193efa23..392b54f093 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -207,8 +207,11 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
* Sets *nnewlpdead for caller, indicating the number of items that were
* newly set LP_DEAD during prune operation.
*
- * off_loc is the offset location required by the caller to use in error
- * callback.
+ * off_loc is the current offset into the line pointer array while pruning.
+ * This is used by vacuum to populate the error context message. On-access
+ * pruning has no such callback mechanism for populating the error context, so
+ * it passes NULL. When provided by the caller, off_loc is set exclusively by
+ * heap_page_prune().
*
* Returns the number of tuples deleted from the page during this call.
*/
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1a05adfa61..102cc97358 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1586,6 +1586,10 @@ retry:
* deleted from the table. It should not be confused with lpdead_items;
* lpdead_items's final value can be thought of as the number of tuples
* that were deleted from indexes.
+ *
+ * vacuum_error_callback() looks specifically at vacrel->offnum for the
+ * current offset when populating the error context message, so it is
+ * imperative that we pass its location to heap_page_prune.
*/
tuples_deleted = heap_page_prune(rel, buf, vacrel->vistest,
&nnewlpdead,
--
2.37.2