On Fri, Apr 16, 2021 at 1:16 PM Peter Geoghegan <p...@bowt.ie> wrote:
> I'm not sure what to do, though. Both the INDEX_CLEANUP = off case and
> the failsafe case are only intended for emergencies. And it's hard to
> know what to do in a code path that is designed to rarely or never be
> used.

How about just documenting it in comments, as in the attached patch? I
tried to address all of the issues with LP_DEAD accounting together:
both the issue raised by Masahiko, and one or two others that were
also discussed recently on other threads. They all seem related to me.

I didn't address the INDEX_CLEANUP = off case in the comments directly
(I only addressed the failsafe case). There is no good reason to think
that the situation will resolve itself with INDEX_CLEANUP = off, so it
didn't seem wise to mention it there as well. But that should still be
okay: INDEX_CLEANUP = off has practically been superseded by the
failsafe, which is much more flexible. And anybody that uses
INDEX_CLEANUP = off cannot expect to skip index cleanup indefinitely
without seriously bad consequences all over the place.


--
Peter Geoghegan
From 5d24338e90f0f3eec7134c328d40270f2ddddc50 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Fri, 9 Apr 2021 18:50:25 -0700
Subject: [PATCH] VACUUM accounting: Explain LP_DEAD accounting.

---
 src/backend/access/heap/vacuumlazy.c | 70 +++++++++++++++++++---------
 1 file changed, 47 insertions(+), 23 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9f9ba5d308..ad37f25e2a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -686,7 +686,16 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 						new_min_multi,
 						false);
 
-	/* report results to the stats collector, too */
+	/*
+	 * Report results to the stats collector, too.
+	 *
+	 * We deliberately avoid telling the stats collector about LP_DEAD items
+	 * that remain in the table when index/heap vacuuming has been bypassed by
+	 * the failsafe mechanism.  Avoid behaving too aggressively once the
+	 * danger of wraparound failure subsides.  Autoanalyze should notice any
+	 * remaining LP_DEAD items and schedule an autovacuum if nothing else
+	 * does.
+	 */
 	pgstat_report_vacuum(RelationGetRelid(rel),
 						 rel->rd_rel->relisshared,
 						 Max(new_live_tuples, 0),
@@ -1334,6 +1343,9 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
 		 */
 		lazy_scan_prune(vacrel, buf, blkno, page, vistest, &prunestate);
 
+		Assert(!prunestate.all_visible || !prunestate.has_lpdead_items);
+		Assert(!all_visible_according_to_vm || prunestate.all_visible);
+
 		/* Remember the location of the last page with nonremovable tuples */
 		if (prunestate.hastup)
 			vacrel->nonempty_pages = blkno + 1;
@@ -1404,7 +1416,6 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
 		 * Handle setting visibility map bit based on what the VM said about
 		 * the page before pruning started, and using prunestate
 		 */
-		Assert(!prunestate.all_visible || !prunestate.has_lpdead_items);
 		if (!all_visible_according_to_vm && prunestate.all_visible)
 		{
 			uint8		flags = VISIBILITYMAP_ALL_VISIBLE;
@@ -1737,19 +1748,22 @@ retry:
 		/*
 		 * LP_DEAD items are processed outside of the loop.
 		 *
-		 * Note that we deliberately don't set hastup=true in the case of an
-		 * LP_DEAD item here, which is not how lazy_check_needs_freeze() or
+		 * Our working assumption is that any LP_DEAD items we encounter here
+		 * will become LP_UNUSED inside lazy_vacuum_heap_page() before VACUUM
+		 * finishes.
+		 *
+		 * This is why the number of LP_DEAD items won't be reported to the
+		 * stats collector -- we simply won't leave any behind.  (Actually,
+		 * cases where we bypass index vacuuming will in fact leave behind
+		 * LP_DEAD items, but that only happens in special circumstances.  We
+		 * avoid trying to compensate for that in our later report to the
+		 * stats collector.)
+		 *
+		 * This is also why we deliberately don't set hastup=true in the case
+		 * of an LP_DEAD item.  This is not how lazy_check_needs_freeze() or
 		 * count_nondeletable_pages() do it -- they only consider pages empty
 		 * when they only have LP_UNUSED items, which is important for
 		 * correctness.
-		 *
-		 * Our assumption is that any LP_DEAD items we encounter here will
-		 * become LP_UNUSED inside lazy_vacuum_heap_page() before we actually
-		 * call count_nondeletable_pages().  In any case our opinion of
-		 * whether or not a page 'hastup' (which is how our caller sets its
-		 * vacrel->nonempty_pages value) is inherently race-prone.  It must be
-		 * treated as advisory/unreliable, so we might as well be slightly
-		 * optimistic.
 		 */
 		if (ItemIdIsDead(itemid))
 		{
@@ -1901,9 +1915,6 @@ retry:
 	 * that will need to be vacuumed in indexes later, or a LP_NORMAL tuple
 	 * that remains and needs to be considered for freezing now (LP_UNUSED and
 	 * LP_REDIRECT items also remain, but are of no further interest to us).
-	 *
-	 * Add page level counters to caller's counts, and then actually process
-	 * LP_DEAD and LP_NORMAL items.
 	 */
 	vacrel->offnum = InvalidOffsetNumber;
 
@@ -1988,13 +1999,6 @@ retry:
 	}
 #endif
 
-	/* Add page-local counts to whole-VACUUM counts */
-	vacrel->tuples_deleted += tuples_deleted;
-	vacrel->lpdead_items += lpdead_items;
-	vacrel->new_dead_tuples += new_dead_tuples;
-	vacrel->num_tuples += num_tuples;
-	vacrel->live_tuples += live_tuples;
-
 	/*
 	 * Now save details of the LP_DEAD items from the page in the dead_tuples
 	 * array.  Also record that page has dead items in per-page prunestate.
@@ -2021,6 +2025,13 @@ retry:
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
 									 dead_tuples->num_tuples);
 	}
+
+	/* Finally, add page-local counts to whole-VACUUM counts */
+	vacrel->tuples_deleted += tuples_deleted;
+	vacrel->lpdead_items += lpdead_items;
+	vacrel->new_dead_tuples += new_dead_tuples;
+	vacrel->num_tuples += num_tuples;
+	vacrel->live_tuples += live_tuples;
 }
 
 /*
@@ -2095,6 +2106,12 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
 		 * not exceed 32MB.  This limits the risk that we will bypass index
 		 * vacuuming again and again until eventually there is a VACUUM whose
 		 * dead_tuples space is not CPU cache resident.
+		 *
+		 * We don't take any special steps to remember the LP_DEAD items (such
+		 * as counting them in the new_dead_tuples report to the stats
+		 * when the optimization is applied.  The tests that we apply should
+		 * avoid any noticeable disagreements between acquire_sample_rows()
+		 * and the accounting we perform in lazy_scan_prune().
 		 */
 		threshold = (double) vacrel->rel_pages * BYPASS_THRESHOLD_PAGES;
 		do_bypass_optimization =
@@ -2146,7 +2163,8 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
 	}
 
 	/*
-	 * Forget the now-vacuumed tuples -- just press on
+	 * Forget the LP_DEAD items that we just vacuumed (or just decided to not
+	 * vacuum)
 	 */
 	vacrel->dead_tuples->num_tuples = 0;
 }
@@ -3101,6 +3119,12 @@ lazy_cleanup_one_index(Relation indrel, IndexBulkDeleteResult *istat,
  *
  * Also don't attempt it if wraparound failsafe is in effect.  It's hard to
  * predict how long lazy_truncate_heap will take.  Don't take any chances.
+ * There is very little chance of truncation working out when the failsafe is
+ * in effect in any case.  It is very likely that pages that lazy_scan_prune
+ * thought were !hastup before the failsafe was in effect won't be considered
+ * !hastup by count_nondeletable_pages now.  Note that lazy_scan_prune makes
+ * the optimistic assumption that any LP_DEAD items it encounters will always
+ * be LP_UNUSED by the time we're called.
  *
  * Also don't attempt it if we are doing early pruning/vacuuming, because a
  * scan which cannot find a truncated heap page cannot determine that the
-- 
2.27.0
