Hi,

I noticed while writing another patch that I think there might be an
issue/oversight with how the snapshot conflict horizon is calculated
for the prune/freeze WAL record in master/17 and the freeze WAL record
in 16.

In code to determine whether or not to freeze tuples on the page in
phase I of vacuum, we ignore LP_DEAD items temporarily, assuming they
will be set LP_UNUSED and removed by phase III of vacuum.

This means that the local values of all_visible and all_frozen
(whether or not the page is all-visible/all-frozen) in the PruneState
may temporarily be true when there are LP_DEAD items.
Before returning to code that actually updates the visibility map,
all_visible and all_frozen are unset if there are lpdead items.

However, we calculate the snapshot conflict horizon for the
prune/freeze record in master/17 and the freeze record in 16 before
updating all-frozen and all-visible. That means the horizon may be too
aggressive if the page cannot actually be set all-frozen. This isn't a
correctness issue, but it may lead to transactions on the standby
being unnecessarily canceled for pages which have some tuples frozen
but not all due to remaining LP_DEAD items.

I'm not 100% certain of my analysis given that this is a complicated
area of the code. I've attached suggested fixes for master/17 and 16
and am interested in what others think.

- Melanie
From fee539ddc97f1d58a883d37fcecea94af1261e8d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Thu, 29 May 2025 10:20:18 -0400
Subject: [PATCH] Calculate prune/freeze conflict horizon with final all_frozen
 value

In phase I of heap vacuuming, while determining whether or not to freeze
tuples on a page, heap_page_prune_and_freeze() ignores LP_DEAD items
which it assumes will be set LP_UNUSED and removed by the phase III of
vacuum.

That means the local value of prstate.all_frozen it uses may be true
even when dead items are present. In this case, however, before
returning control to lazy_scan_prune() heap_page_prune_and_freeze() must
unset prstate.all_frozen to avoid lazy_scan_prune() incorrectly updating
the visibility map for that heap page.

The problem was that heap_page_prune_and_freeze() calculated the
snapshot conflict horizon for the prune and freeze record using the
contingent value of prstate.all_frozen. That could result in an overly
aggressive snapshot conflict horizon in the case that some tuples were
frozen but the whole page was not able to be set all-frozen. This would
not affect correctness -- only potentially cancel queries unnecessarily
on the standby.

Unset this sooner to correctly set the snapshot conflict horizon in the
prune/freeze record.

This issue is present in at least Postgres 17 and 16. Further
investigation is needed for earlier versions.
---
 src/backend/access/heap/pruneheap.c | 45 ++++++++++++++---------------
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index a8025889be0..14744dcd2d5 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -750,6 +750,25 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 		 */
 	}
 
+	/*
+	 * It was convenient to ignore LP_DEAD items in all_visible earlier on to
+	 * make the choice of whether or not to freeze the page unaffected by the
+	 * short-term presence of LP_DEAD items.  These LP_DEAD items were
+	 * effectively assumed to be LP_UNUSED items in the making.  It doesn't
+	 * matter which vacuum heap pass (initial pass or final pass) ends up
+	 * setting the page all-frozen, as long as the ongoing VACUUM does it.
+	 *
+	 * Now that freezing has been finalized, unset all_visible if there are
+	 * any LP_DEAD items on the page.  It needs to reflect the present state
+	 * of the page, both for calculating the snapshot conflict horizon for
+	 * this record and to communicate back to the caller.
+	 */
+	if (prstate.lpdead_items > 0)
+	{
+		prstate.all_visible = false;
+		prstate.all_frozen = false;
+	}
+
 	/* Any error while applying the changes is critical */
 	START_CRIT_SECTION();
 
@@ -852,30 +871,8 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	presult->nfrozen = prstate.nfrozen;
 	presult->live_tuples = prstate.live_tuples;
 	presult->recently_dead_tuples = prstate.recently_dead_tuples;
-
-	/*
-	 * It was convenient to ignore LP_DEAD items in all_visible earlier on to
-	 * make the choice of whether or not to freeze the page unaffected by the
-	 * short-term presence of LP_DEAD items.  These LP_DEAD items were
-	 * effectively assumed to be LP_UNUSED items in the making.  It doesn't
-	 * matter which vacuum heap pass (initial pass or final pass) ends up
-	 * setting the page all-frozen, as long as the ongoing VACUUM does it.
-	 *
-	 * Now that freezing has been finalized, unset all_visible if there are
-	 * any LP_DEAD items on the page.  It needs to reflect the present state
-	 * of the page, as expected by our caller.
-	 */
-	if (prstate.all_visible && prstate.lpdead_items == 0)
-	{
-		presult->all_visible = prstate.all_visible;
-		presult->all_frozen = prstate.all_frozen;
-	}
-	else
-	{
-		presult->all_visible = false;
-		presult->all_frozen = false;
-	}
-
+	presult->all_visible = prstate.all_visible;
+	presult->all_frozen = prstate.all_frozen;
 	presult->hastup = prstate.hastup;
 
 	/*
-- 
2.34.1

From 8b527ff196b8752809b92f7073bb7624d2329170 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Thu, 29 May 2025 10:39:35 -0400
Subject: [PATCH] Calculate freeze conflict horizon with final all_frozen value

In phase I of heap vacuuming, while determining whether or not to freeze
tuples on a page, lazy_scan_prune() ignores LP_DEAD items which it
assumes will be set LP_UNUSED and removed by the phase III of vacuum.

That means the local value of prunestate->all_frozen it uses may be true
even when dead items are present. In this case, however, before
returning control to lazy_scan_heap() lazy_scan_prune() must unset
prunsteate->all_visible to avoid lazy_scan_haep() incorrectly updating
the visibility map for that heap page.

The problem was that lazy_scan_prune() calculated the snapshot conflict
horizon for the freeze record using the contingent value of
prunestate->all_frozen. That could result in an overly aggressive
snapshot conflict horizon in the case that some tuples were frozen but
the whole page was not able to be set all-frozen. This would not affect
correctness -- only potentially cancel queries unnecessarily on the
standby.

Unset this sooner to correctly set the snapshot conflict horizon in the
freeze record. This is the minimal change to achieve this affect in
backbranches, however, the value of all_frozen is not corrected before
returning to lazy_scan_prune(). That doesn't affect correctness since it
is not used without all_visible. It seemed wiser to do the smallest
possible fix.
---
 src/backend/access/heap/vacuumlazy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9fa88960ada..8c3605eff1f 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1833,7 +1833,7 @@ retry:
 			 * once we're done with it.  Otherwise we generate a conservative
 			 * cutoff by stepping back from OldestXmin.
 			 */
-			if (prunestate->all_visible && prunestate->all_frozen)
+			if (prunestate->all_visible && prunestate->all_frozen && lpdead_items == 0)
 			{
 				/* Using same cutoff when setting VM is now unnecessary */
 				snapshotConflictHorizon = prunestate->visibility_cutoff_xid;
-- 
2.34.1

Reply via email to