Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)

Chao Li Sun, 21 Dec 2025 23:20:36 -0800

> On Dec 20, 2025, at 05:09, Melanie Plageman <[email protected]> wrote:
> 
> Attached v29 addresses some feedback and also corrects a small error
> with the assertion I had added in the previous version's 0009.
> 
> On Thu, Dec 18, 2025 at 10:38 PM Xuneng Zhou <[email protected]> wrote:
>> 
>> I’ve done a basic review of patches 1 and 2. Here are some comments
>> which may be somewhat immature, as this is a fairly large change set
>> and I’m new to some parts of the code.
>> 
>> 1) Potential stale old_vmbits after VM repair n v2
> 
> Good catch! I've fixed this in attached v29.
> 
>> 2) Add Assert(BufferIsDirty(buf))
>> 
>> Since the patch's core claim is "buffer must be dirty before WAL
>> registration", an assertion encodes this invariant. Should we add:
>> 
>> Assert(BufferIsValid(buf));
>> Assert(BufferIsDirty(buf));
>> 
>> right before the visibilitymap_set() call?
> 
> There are already assertions that will trip in various places -- most
> importantly in XLogRegisterBuffer(), which is the one that inspired
> this refactor.
> 
>> The comment at lines:
>>> "The only scenario where it is not already dirty is if the VM was removed…"
>> 
>> This phrasing could become misleading after future refactors. Can we
>> make it more direct like:
>> 
>>> "We must mark the heap buffer dirty before calling visibilitymap_set(), 
>>> because it may WAL-log the buffer and XLogRegisterBuffer() requires it."
> 
> I see your point about future refactors missing updating comments like
> this. But, I don't think we are going to refactor the code such that
> we can have PD_ALL_VISIBLE set without the VM bits set more often.
> Also, it is common practice in Postgres to describe very specific edge
> cases or odd scenarios in order to explain code that may seem
> confusing without the comment. It does risk that comment later
> becoming stale, but it is better that future developers understand why
> the code is there.
> 
> That being said, I take your point that the comment is confusing. I
> have updated it in a different way.
> 
>>> "Even if PD_ALL_VISIBLE is already set, we don't need to worry about 
>>> unnecessarily dirtying the heap buffer, as it must be marked dirty before 
>>> adding it to the WAL chain. The only scenario where it is not already dirty 
>>> is if the VM was removed..."
>> 
>> In this test we now call MarkBufferDirty() on the heap page even when
>> only setting the VM, so the comments claiming “does not need to modify
>> the heap buffer”/“no heap page modification” might be misleading. It
>> might be better to say the test doesn’t need to modify heap
>> tuples/page contents or doesn’t need to prune/freeze.
> 
> The point I'm trying to make is that we have to dirty the buffer even
> if we don't modify the page because of the XLOG sub-system
> requirements. And, it may seem like a waste to do that if not
> modifying the page, but the page will rarely be clean anyway. I've
> tried to make this more clear in attached v29.
> 
> - Melanie
> <v29-0001-Combine-visibilitymap_set-cases-in-lazy_scan_pru.patch><v29-0002-Eliminate-use-of-cached-VM-value-in-lazy_scan_pr.patch><v29-0003-Refactor-lazy_scan_prune-VM-clear-logic-into-hel.patch><v29-0004-Set-the-VM-in-heap_page_prune_and_freeze.patch><v29-0005-Move-VM-assert-into-prune-freeze-code.patch><v29-0006-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch><v29-0007-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch><v29-0008-Remove-XLOG_HEAP2_VISIBLE-entirely.patch><v29-0009-Simplify-heap_page_would_be_all_visible-visibili.patch><v29-0010-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch><v29-0011-Unset-all_visible-sooner-if-not-freezing.patch><v29-0012-Track-which-relations-are-modified-by-a-query.patch><v29-0013-Pass-down-information-on-table-modification-to-s.patch><v29-0014-Allow-on-access-pruning-to-set-pages-all-visible.patch><v29-0015-Set-pd_prune_xid-on-insert.patch>

A few more comments on v29:

1 - 0002 - Looks like since 0002, visibilitymap_set()’s return value is no 
longer used, so do we need to update the function and change return type to 
void? I remember in some patches, to address Coverity alerts, people had to do 
“(void) function_with_a_return_value()”.

2 - 0003
```
+ * Helper to correct any corruption detected on an heap page and its
```

Nit: “an” -> “a”

3 - 0003
```
+static bool
+identify_and_fix_vm_corruption(Relation rel, Buffer heap_buffer,
+                                                          BlockNumber 
heap_blk, Page heap_page,
+                                                          int nlpdead_items,
+                                                          Buffer vmbuffer,
+                                                          uint8 vmbits)
+{
+       Assert(visibilitymap_get_status(rel, heap_blk, &vmbuffer) == vmbits);
```

Right before this function is called:
```
        old_vmbits = visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer);
+       if (identify_and_fix_vm_corruption(vacrel->rel, buf, blkno, page,
+                                                                          
presult.lpdead_items, vmbuffer,
+                                                                          
old_vmbits))
```

So, the Assert() is checking if old_vmbits is newly returned from 
visibilitymap_get_status(), in that case, identify_and_fix_vm_corruption() can 
take vmbits as a pointer , and it calls visibilitymap_get_status() to get 
vmbits itself and returns vmbits via the pointer, so that we don’t need to call 
visibilitymap_get_status() twice.

4 - 0004
```
+        * These are only set if the HEAP_PAGE_PRUNE_UPDATE_VM option is set and
+        * we have attempted to update the VM.
+        */
+       uint8           new_vmbits;
+       uint8           old_vmbits;
```

The comment feels a little confusing to me. "HEAP_PAGE_PRUNE_UPDATE_VM option 
is set” is a clear indication, but how to decide "we have attempted to update 
the VM”? By reading the code:
```
+       prstate->attempt_update_vm =
+               (params->options & HEAP_PAGE_PRUNE_UPDATE_VM) != 0;
```

It’s just the result of HEAP_PAGE_PRUNE_UPDATE_VM being set. So, maybe we don’t 
the “and” part.

5 - 0004
```
+ * Returns true if one or both VM bits should be set, along with returning the
+ * current value of the VM bits in *old_vmbits and the desired new value of
+ * the VM bits in *new_vmbits.
+ */
+static bool
+heap_page_will_set_vm(PruneState *prstate,
+                                         Relation relation,
+                                         BlockNumber heap_blk, Buffer 
heap_buffer, Page heap_page,
+                                         Buffer vmbuffer,
+                                         int nlpdead_items,
+                                         uint8 *old_vmbits,
+                                         uint8 *new_vmbits)
+{
+       if (!prstate->attempt_update_vm)
+               return false;
```

old_vmbits and new_vmbits are purely output parameters. So, maybe we should set 
them to 0 inside this function instead of relying on callers to initialize them.

I think this is a similar case where I raised a comment earlier about 
initializing presult to {0} in the callers, and you only wanted to set presult 
in heap_page_prune_and_freeze().

6 - 0004
```
@@ -823,13 +975,19 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
                                                   MultiXactId *new_relmin_mxid)
 {
        Buffer          buffer = params->buffer;
+       Buffer          vmbuffer = params->vmbuffer;
        Page            page = BufferGetPage(buffer);
+       BlockNumber blockno = BufferGetBlockNumber(buffer);
        PruneState      prstate;
        bool            do_freeze;
        bool            do_prune;
        bool            do_hint_prune;
+       bool            do_set_vm;
        bool            did_tuple_hint_fpi;
        int64           fpi_before = pgWalUsage.wal_fpi;
+       uint8           new_vmbits = 0;
+       uint8           old_vmbits = 0;
+
 
        /* Initialize prstate */
```

Nit: an extra empty line is added.

7 - 0005
```
-        * HEAP_PAGE_PRUNE_FREEZE indicates that we will also freeze tuples, and
-        * will return 'all_visible', 'all_frozen' flags to the caller.
+        * HEAP_PAGE_PRUNE_FREEZE indicates that we will also freeze tuples
```

Nit: a tailing dot is needed in the end of the comment line.

8 - 0005
```
@@ -978,6 +1003,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
        Buffer          vmbuffer = params->vmbuffer;
        Page            page = BufferGetPage(buffer);
        BlockNumber blockno = BufferGetBlockNumber(buffer);
+       TransactionId vm_conflict_horizon = InvalidTransactionId;
```

I guess the variable name “vm_conflict_horizon” comes from the old 
"presult->vm_conflict_horizon”. But in the new logic, this variable is used 
more generic, for example Assert(debug_cutoff == vm_conflict_horizon). I see 
0006 has renamed to “conflict_xid”, so it’s up to you if or not rename it. But 
to make the commit self-contained, I’d suggest renaming it.

9 - 0006
```
@@ -3537,6 +3537,7 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
        {
                ItemId          itemid;
                HeapTupleData tuple;
+               TransactionId dead_after = InvalidTransactionId;
```

This initialization seems to not needed, as HeapTupleSatisfiesVacuumHorizon() 
will always set a value to it.

10 - 0010
```
+                                * there is any snapshot that still consider 
the newest xid on
```

Nit: consider -> considers

11 - 0011
```
+        * page. If we won't attempt freezing, just unset all-visible now, 
though.
         */
+       if (!prstate->attempt_freeze)
+       {
+               prstate->all_visible = false;
+               prstate->all_frozen = false;
+       }
```

The comment says “just unset all-visible”, but the code actually also unset 
all_frozen.

12 - 0012
```
+       /*
+        * RT indexes of relations modified by the query either through
+        * UPDATE/DELETE/INSERT/MERGE or SELECT FOR UPDATE
+        */
+       Bitmapset  *es_modified_relids;
```

As we intentionally only want indexes, does it make sense to just name the 
field es_modified_rtindexes to make it more explicit.

13 - 0012
```
+                       /* If it has a rowmark, the relation is modified */
+                       estate->es_modified_relids = 
bms_add_member(estate->es_modified_relids,
+                                                                               
                                rc->rti);
```

I think this comment is a little misleading, because SELECT FOR UPDATE/SHARE 
doesn’t always modify tuples of the relation. If a reader not associating this 
code with this patch, he may consider the comment is wrong. So, I think we 
should make the comment more explicit. Maybe rephrase like “If it has a 
rowmark, the relation may modify or lock heap pages”.

14 - 0015 - commit message
```
Setting pd_prune_xid on insert can cause a page to be dirtied and
written out when it previously would not have been, affetcting the
```

Typo: affetcting -> affecting

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)

Reply via email to