On Thu, Mar 21, 2024 at 9:28 AM Heikki Linnakangas <hlinn...@iki.fi> wrote:
>
> In heap_page_prune_and_freeze(), we now do some extra work on each live
> tuple, to set the all_visible_except_removable correctly. And also to
> update live_tuples, recently_dead_tuples and hastup. When we're not
> freezing, that's a waste of cycles, the caller doesn't care. I hope it's
> enough that it doesn't matter, but is it?

Last year on an early version of the patch set I did some pgbench
tpcb-like benchmarks -- since there is a lot of on-access pruning in
that workload -- and I don't remember it being a showstopper. The code
has changed a fair bit since then. However, I think it might be safer
to pass a flag "by_vacuum" to heap_page_prune_and_freeze() and skip
the rest of the loop after heap_prune_satisifies_vacuum() when
on-access pruning invokes it. I had avoided that because it felt ugly
and error-prone, however it addresses a few other of your points as
well.

For example, I think we should set a bit in the prune/freeze WAL
record's flags to indicate if pruning was done by vacuum or on access
(mentioned in another of your recent emails).

> The first commit (after the WAL format changes) changes the all-visible
> check to use GlobalVisTestIsRemovableXid. The commit message says that
> it's because we don't have 'cutoffs' available, but we only care about
> that when we're freezing, and when we're freezing, we actually do have
> 'cutoffs' in HeapPageFreeze. Using GlobalVisTestIsRemovableXid seems
> sensible anyway, because that's what we use in
> heap_prune_satisfies_vacuum() too, but just wanted to point that out.

Yes, this is true. If we skip this part of the loop when on-access
pruning invokes it, we can go back to using the OldestXmin. I have
done that as well as some other changes in the attached patch,
conflict_horizon_updates.diff. Note that this patch may not apply on
your latest patch as I wrote it on top of an older version. Switching
back to using OldestXmin for page visibility determination makes this
patch set more similar to master as well. We could keep the
alternative check (with GlobalVisState) to maintain the illusion that
callers passing by_vacuum as True can pass NULL for pagefrz, but I was
worried about the extra overhead.

It would be nice to pick a single reasonable visibility horizon (the
oldest running xid we compare things to) at the beginning of
heap_page_prune_and_freeze() and use it for determining if we can
remove tuples, if we can freeze tuples, and if the page is all
visible. It makes it confusing that we use OldestXmin for freezing and
setting the page visibility in the VM and GlobalVisState for
determining prunability. I think using GlobalVisState for freezing has
been discussed before and ruled out for a variety of reasons, and I
definitely don't suggest making that change in this patch set.

> The 'frz_conflict_horizon' stuff is still fuzzy to me. (Not necessarily
> these patches's fault). This at least is wrong, because Max(a, b)
> doesn't handle XID wraparound correctly:
>
> >                       if (do_freeze)
> >                               conflict_xid = 
> > Max(prstate.snapshotConflictHorizon,
> >                                                                  
> > presult->frz_conflict_horizon);
> >                       else
> >                               conflict_xid = 
> > prstate.snapshotConflictHorizon;
>
> Then there's this in lazy_scan_prune():
>
> >               /* Using same cutoff when setting VM is now unnecessary */
> >               if (presult.all_frozen)
> >                       presult.frz_conflict_horizon = InvalidTransactionId;
> This does the right thing in the end, but if all the tuples are frozen
> shouldn't frz_conflict_horizon already be InvalidTransactionId? The
> comment says it's "newest xmin on the page", and if everything was
> frozen, all xmins are FrozenTransactionId. In other words, that should
> be moved to heap_page_prune_and_freeze() so that it doesn't lie to its
> caller. Also, frz_conflict_horizon is only set correctly if
> 'all_frozen==true', would be good to mention that in the comments too.

Yes, this is a good point. I've spent some time swapping all of this
back into my head. I think we should change the names of all these
conflict horizon variables and introduce some local variables again.
In the attached patch, I've updated the name of the variable in
PruneFreezeResult to vm_conflict_horizon, as it is only used for
emitting a VM update record. Now, I don't set it until the end of
heap_page_prune_and_freeze(). It is only updated from
InvalidTransactionId if the page is not all frozen. As you say, if the
page is all frozen, there can be no conflict.

I've also changed PruneState->snapshotConflictHorizon to
PruneState->latest_xid_removed.

And I introduced the local variables visibility_cutoff_xid and
frz_conflict_horizon. I think it is important we distinguish between
the latest xid pruned, the latest xmin of tuples frozen, and the
latest xid of all live tuples on the page.

Though we end up using visibility_cutoff_xid as the freeze conflict
horizon if the page is all frozen, I think it is more clear to have
the three variables and name them what they are. Then, we calculate
the correct one for the combined WAL record before emitting it. I've
done that in the attached. I've tried to reduce the scope of the
variables as much as possible to keep it as clear as I could.

I think I've also fixed the issue with using Max() to compare
TransactionIds by using TransactionIdFollows() instead.

Note that I still don't think we have a resolution on what to
correctly update new_relfrozenxid and new_relminmxid to at the end
when presult->nfrozen == 0 and presult->all_frozen is true.

        if (presult->nfrozen > 0)
        {
            presult->new_relfrozenxid = pagefrz->FreezePageRelfrozenXid;
            presult->new_relminmxid = pagefrz->FreezePageRelminMxid;
        }
        else
        {
            presult->new_relfrozenxid = pagefrz->NoFreezePageRelfrozenXid;
            presult->new_relminmxid = pagefrz->NoFreezePageRelminMxid;
        }

- Melanie
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 2b5f8ef1e80..06ee8565b80 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -38,7 +38,7 @@ typedef struct
 	bool		mark_unused_now;
 
 	TransactionId new_prune_xid;	/* new prune hint value for page */
-	TransactionId snapshotConflictHorizon;	/* latest xid removed */
+	TransactionId latest_xid_removed;
 	int			nredirected;	/* numbers of entries in arrays below */
 	int			ndead;
 	int			nunused;
@@ -176,7 +176,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 			 * not the relation has indexes, since we cannot safely determine
 			 * that during on-access pruning with the current implementation.
 			 */
-			heap_page_prune_and_freeze(relation, buffer, vistest, false, NULL,
+			heap_page_prune_and_freeze(relation, buffer, false, vistest, false, NULL,
 									   &presult, NULL);
 
 			/*
@@ -214,6 +214,15 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  * Prune and repair fragmentation and potentially freeze tuples on the
  * specified page.
  *
+ * heap_page_prune_and_freeze() may do all of the following: prune tuples,
+ * freeze tuples, determine the all-visible/all-frozen status of the apge and
+ * the associated snapshot conflict horizon, determine if truncating this page
+ * would be safe, and identify any new potential values for relfrozenxid and
+ * relminmxid. Not all of these responsibilities are useful to all callers. If
+ * by_vacuum is passed as True, heap_page_prune_and_freeze() will do all of
+ * these. Otherwise, the other parameters will determine which of these it
+ * does.
+ *
  * If the page can be marked all-frozen in the visibility map, we may
  * opportunistically freeze tuples on the page if either its tuples are old
  * enough or freezing will be cheap enough.
@@ -242,6 +251,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  */
 void
 heap_page_prune_and_freeze(Relation relation, Buffer buffer,
+						   bool by_vacuum,
 						   GlobalVisState *vistest,
 						   bool mark_unused_now,
 						   HeapPageFreeze *pagefrz,
@@ -254,6 +264,7 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 				maxoff;
 	PruneState	prstate;
 	HeapTupleData tup;
+	TransactionId visibility_cutoff_xid;
 	bool		do_freeze;
 	bool		do_prune;
 	bool		do_hint;
@@ -275,7 +286,7 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	prstate.new_prune_xid = InvalidTransactionId;
 	prstate.vistest = vistest;
 	prstate.mark_unused_now = mark_unused_now;
-	prstate.snapshotConflictHorizon = InvalidTransactionId;
+	prstate.latest_xid_removed = InvalidTransactionId;
 	prstate.nredirected = prstate.ndead = prstate.nunused = 0;
 	memset(prstate.marked, 0, sizeof(prstate.marked));
 
@@ -302,8 +313,16 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	 * all_visible is also set to true.
 	 */
 	presult->all_frozen = true;
-	/* for recovery conflicts */
-	presult->frz_conflict_horizon = InvalidTransactionId;
+
+	/*
+	 * The visibility cutoff xid is the newest xmin of live tuples on the
+	 * page. In the common case, this will be set as the conflict horizon the
+	 * caller can use for updating the VM. If, at the end of freezing and
+	 * pruning, the page is all-frozen, there is no possibility that any
+	 * running transaction on the standby does not see tuples on the page as
+	 * all-visible, so the conflict horizon remains InvalidTransactionId.
+	 */
+	presult->vm_conflict_horizon = visibility_cutoff_xid = InvalidTransactionId;
 
 	/* For advancing relfrozenxid and relminmxid */
 	presult->new_relfrozenxid = InvalidTransactionId;
@@ -362,6 +381,13 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 		prstate.htsv[offnum] = heap_prune_satisfies_vacuum(&prstate, &tup,
 														   buffer);
 
+		/*
+		 * On-access pruning does not update the VM nor provide pagefrz to
+		 * consider freezing tuples, so skip the rest of the loop.
+		 */
+		if (!by_vacuum)
+			continue;
+
 		/*
 		 * The criteria for counting a tuple as live in this block need to
 		 * match what analyze.c's acquire_sample_rows() does, otherwise VACUUM
@@ -434,17 +460,19 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 					 * don't consider the page all-visible.
 					 */
 					xmin = HeapTupleHeaderGetXmin(htup);
-					if (xmin != FrozenTransactionId &&
-						!GlobalVisTestIsRemovableXid(vistest, xmin))
+
+					/* For now always use pagefrz->cutoffs */
+					Assert(pagefrz);
+					if (!TransactionIdPrecedes(xmin, pagefrz->cutoffs->OldestXmin))
 					{
 						all_visible_except_removable = false;
 						break;
 					}
 
 					/* Track newest xmin on page. */
-					if (TransactionIdFollows(xmin, presult->frz_conflict_horizon) &&
+					if (TransactionIdFollows(xmin, visibility_cutoff_xid) &&
 						TransactionIdIsNormal(xmin))
-						presult->frz_conflict_horizon = xmin;
+						visibility_cutoff_xid = xmin;
 				}
 				break;
 			case HEAPTUPLE_RECENTLY_DEAD:
@@ -673,10 +701,9 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 
 	if (do_prune || do_freeze)
 	{
-		/*
-		 * Apply the planned item changes, then repair page fragmentation, and
-		 * update the page's hint bit about whether it has free line pointers.
-		 */
+		TransactionId frz_conflict_horizon = InvalidTransactionId;
+
+		/* Apply the planned item changes and repair page fragmentation. */
 		if (do_prune)
 		{
 			heap_page_prune_execute(buffer, false,
@@ -692,12 +719,16 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 			 * when the whole page is eligible to become all-frozen in the VM
 			 * once we're done with it.  Otherwise we generate a conservative
 			 * cutoff by stepping back from OldestXmin. This avoids false
-			 * conflicts when hot_standby_feedback is in use.
+			 * conflicts when hot_standby_feedback is in use. MFIXME: it is
+			 * possible to use presult->all_visible by now, but is it clearer
+			 * to use all_visible_except_removable?
 			 */
-			if (!(all_visible_except_removable && presult->all_frozen))
+			if (all_visible_except_removable && presult->all_frozen)
+				frz_conflict_horizon = visibility_cutoff_xid;
+			else
 			{
-				presult->frz_conflict_horizon = pagefrz->cutoffs->OldestXmin;
-				TransactionIdRetreat(presult->frz_conflict_horizon);
+				frz_conflict_horizon = pagefrz->cutoffs->OldestXmin;
+				TransactionIdRetreat(frz_conflict_horizon);
 			}
 			heap_freeze_prepared_tuples(buffer, prstate.frozen, presult->nfrozen);
 		}
@@ -709,22 +740,22 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 		 */
 		if (RelationNeedsWAL(relation))
 		{
+			TransactionId conflict_xid = InvalidTransactionId;
+
 			/*
-			 * The snapshotConflictHorizon for the whole record should be the most
-			 * conservative of all the horizons calculated for any of the possible
-			 * modifications. If this record will prune tuples, any transactions on
-			 * the standby older than the youngest xmax of the most recently removed
-			 * tuple this record will prune will conflict. If this record will freeze
-			 * tuples, any transactions on the standby with xids older than the
-			 * youngest tuple this record will freeze will conflict.
+			 * The snapshot conflict horizon for the whole record should be
+			 * the newest xid of all the horizons calculated for any of the
+			 * possible modifications. If this record will prune tuples, any
+			 * transactions on the standby older than the youngest xmax of the
+			 * most recently removed tuple this record will prune will
+			 * conflict. If this record will freeze tuples, any transactions
+			 * on the standby with xids older than the youngest tuple this
+			 * record will freeze will conflict.
 			 */
-			TransactionId conflict_xid;
-
-			if (do_freeze)
-				conflict_xid = Max(prstate.snapshotConflictHorizon,
-								   presult->frz_conflict_horizon);
+			if (TransactionIdFollows(frz_conflict_horizon, prstate.latest_xid_removed))
+				conflict_xid = frz_conflict_horizon;
 			else
-				conflict_xid = prstate.snapshotConflictHorizon;
+				conflict_xid = prstate.latest_xid_removed;
 
 			log_heap_prune_and_freeze(relation, buffer,
 									  conflict_xid,
@@ -738,6 +769,15 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 
 	END_CRIT_SECTION();
 
+	/*
+	 * For callers planning to update the visibility map, the conflict horizon
+	 * for that record must be the newest xmin on the page. However, if the
+	 * page is completely frozen, there can be no conflict and the
+	 * vm_conflict_horizon should remain InvalidTransactionId.
+	 */
+	if (!presult->all_frozen)
+		presult->vm_conflict_horizon = visibility_cutoff_xid;
+
 	/*
 	 * If we froze tuples on the page, the caller can advance relfrozenxid and
 	 * relminmxid to the values in pagefrz->FreezePageRelfrozenXid and
@@ -876,7 +916,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
 			{
 				heap_prune_record_unused(prstate, rootoffnum);
 				HeapTupleHeaderAdvanceConflictHorizon(htup,
-													  &prstate->snapshotConflictHorizon);
+													  &prstate->latest_xid_removed);
 				ndeleted++;
 			}
 
@@ -1029,7 +1069,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
 		{
 			latestdead = offnum;
 			HeapTupleHeaderAdvanceConflictHorizon(htup,
-												  &prstate->snapshotConflictHorizon);
+												  &prstate->latest_xid_removed);
 		}
 		else if (!recent_dead)
 			break;
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c3da64102cf..ee4fdc00073 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1421,7 +1421,7 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * items LP_UNUSED, so mark_unused_now should be true if no indexes and
 	 * false otherwise.
 	 */
-	heap_page_prune_and_freeze(rel, buf, vacrel->vistest, vacrel->nindexes == 0,
+	heap_page_prune_and_freeze(rel, buf, true, vacrel->vistest, vacrel->nindexes == 0,
 							   &pagefrz, &presult, &vacrel->offnum);
 
 	/*
@@ -1444,10 +1444,6 @@ lazy_scan_prune(LVRelState *vacrel,
 		 * (don't confuse that with pages newly set all-frozen in VM).
 		 */
 		vacrel->frozen_pages++;
-
-		/* Using same cutoff when setting VM is now unnecessary */
-		if (presult.all_frozen)
-			presult.frz_conflict_horizon = InvalidTransactionId;
 	}
 
 	/*
@@ -1473,7 +1469,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		Assert(presult.all_frozen == debug_all_frozen);
 
 		Assert(!TransactionIdIsValid(debug_cutoff) ||
-			   debug_cutoff == presult.frz_conflict_horizon);
+			   debug_cutoff == presult.vm_conflict_horizon);
 	}
 #endif
 
@@ -1527,7 +1523,7 @@ lazy_scan_prune(LVRelState *vacrel,
 
 		if (presult.all_frozen)
 		{
-			Assert(!TransactionIdIsValid(presult.frz_conflict_horizon));
+			Assert(!TransactionIdIsValid(presult.vm_conflict_horizon));
 			flags |= VISIBILITYMAP_ALL_FROZEN;
 		}
 
@@ -1547,7 +1543,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		PageSetAllVisible(page);
 		MarkBufferDirty(buf);
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vmbuffer, presult.frz_conflict_horizon,
+						  vmbuffer, presult.vm_conflict_horizon,
 						  flags);
 	}
 
@@ -1612,11 +1608,11 @@ lazy_scan_prune(LVRelState *vacrel,
 		/*
 		 * Set the page all-frozen (and all-visible) in the VM.
 		 *
-		 * We can pass InvalidTransactionId as our frz_conflict_horizon, since
-		 * a snapshotConflictHorizon sufficient to make everything safe for
-		 * REDO was logged when the page's tuples were frozen.
+		 * We can pass InvalidTransactionId as our snapshot conflict horizon,
+		 * since a snapshot conflict horizon sufficient to make everything
+		 * safe for REDO was logged when the page's tuples were frozen.
 		 */
-		Assert(!TransactionIdIsValid(presult.frz_conflict_horizon));
+		Assert(!TransactionIdIsValid(presult.vm_conflict_horizon));
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
 						  vmbuffer, InvalidTransactionId,
 						  VISIBILITYMAP_ALL_VISIBLE |
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index b2015f5a1ac..ba2ce01af04 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -212,8 +212,8 @@ typedef struct PruneFreezeResult
 	/* Whether or not the page can be set all frozen in the VM */
 	bool		all_frozen;
 
-	/* Number of newly frozen tuples */
-	TransactionId frz_conflict_horizon; /* Newest xmin on the page */
+	/* Newest xmin on the page */
+	TransactionId vm_conflict_horizon;
 
 	/* New value of relfrozenxid found by heap_page_prune_and_freeze() */
 	TransactionId new_relfrozenxid;
@@ -322,6 +322,7 @@ extern TransactionId heap_index_delete_tuples(Relation rel,
 struct GlobalVisState;
 extern void heap_page_prune_opt(Relation relation, Buffer buffer);
 extern void heap_page_prune_and_freeze(Relation relation, Buffer buffer,
+									   bool by_vacuum,
 									   struct GlobalVisState *vistest,
 									   bool mark_unused_now,
 									   HeapPageFreeze *pagefrz,

Reply via email to