> On Dec 20, 2025, at 05:09, Melanie Plageman <[email protected]> wrote:
>
> Attached v29 addresses some feedback and also corrects a small error
> with the assertion I had added in the previous version's 0009.
>
> On Thu, Dec 18, 2025 at 10:38 PM Xuneng Zhou <[email protected]> wrote:
>>
>> I’ve done a basic review of patches 1 and 2. Here are some comments
>> which may be somewhat immature, as this is a fairly large change set
>> and I’m new to some parts of the code.
>>
>> 1) Potential stale old_vmbits after VM repair n v2
>
> Good catch! I've fixed this in attached v29.
>
>> 2) Add Assert(BufferIsDirty(buf))
>>
>> Since the patch's core claim is "buffer must be dirty before WAL
>> registration", an assertion encodes this invariant. Should we add:
>>
>> Assert(BufferIsValid(buf));
>> Assert(BufferIsDirty(buf));
>>
>> right before the visibilitymap_set() call?
>
> There are already assertions that will trip in various places -- most
> importantly in XLogRegisterBuffer(), which is the one that inspired
> this refactor.
>
>> The comment at lines:
>>> "The only scenario where it is not already dirty is if the VM was removed…"
>>
>> This phrasing could become misleading after future refactors. Can we
>> make it more direct like:
>>
>>> "We must mark the heap buffer dirty before calling visibilitymap_set(),
>>> because it may WAL-log the buffer and XLogRegisterBuffer() requires it."
>
> I see your point about future refactors missing updating comments like
> this. But, I don't think we are going to refactor the code such that
> we can have PD_ALL_VISIBLE set without the VM bits set more often.
> Also, it is common practice in Postgres to describe very specific edge
> cases or odd scenarios in order to explain code that may seem
> confusing without the comment. It does risk that comment later
> becoming stale, but it is better that future developers understand why
> the code is there.
>
> That being said, I take your point that the comment is confusing. I
> have updated it in a different way.
>
>>> "Even if PD_ALL_VISIBLE is already set, we don't need to worry about
>>> unnecessarily dirtying the heap buffer, as it must be marked dirty before
>>> adding it to the WAL chain. The only scenario where it is not already dirty
>>> is if the VM was removed..."
>>
>> In this test we now call MarkBufferDirty() on the heap page even when
>> only setting the VM, so the comments claiming “does not need to modify
>> the heap buffer”/“no heap page modification” might be misleading. It
>> might be better to say the test doesn’t need to modify heap
>> tuples/page contents or doesn’t need to prune/freeze.
>
> The point I'm trying to make is that we have to dirty the buffer even
> if we don't modify the page because of the XLOG sub-system
> requirements. And, it may seem like a waste to do that if not
> modifying the page, but the page will rarely be clean anyway. I've
> tried to make this more clear in attached v29.
>
> - Melanie
> <v29-0001-Combine-visibilitymap_set-cases-in-lazy_scan_pru.patch><v29-0002-Eliminate-use-of-cached-VM-value-in-lazy_scan_pr.patch><v29-0003-Refactor-lazy_scan_prune-VM-clear-logic-into-hel.patch><v29-0004-Set-the-VM-in-heap_page_prune_and_freeze.patch><v29-0005-Move-VM-assert-into-prune-freeze-code.patch><v29-0006-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch><v29-0007-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch><v29-0008-Remove-XLOG_HEAP2_VISIBLE-entirely.patch><v29-0009-Simplify-heap_page_would_be_all_visible-visibili.patch><v29-0010-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch><v29-0011-Unset-all_visible-sooner-if-not-freezing.patch><v29-0012-Track-which-relations-are-modified-by-a-query.patch><v29-0013-Pass-down-information-on-table-modification-to-s.patch><v29-0014-Allow-on-access-pruning-to-set-pages-all-visible.patch><v29-0015-Set-pd_prune_xid-on-insert.patch>
A few more comments on v29:
1 - 0002 - Looks like since 0002, visibilitymap_set()’s return value is no
longer used, so do we need to update the function and change return type to
void? I remember in some patches, to address Coverity alerts, people had to do
“(void) function_with_a_return_value()”.
2 - 0003
```
+ * Helper to correct any corruption detected on an heap page and its
```
Nit: “an” -> “a”
3 - 0003
```
+static bool
+identify_and_fix_vm_corruption(Relation rel, Buffer heap_buffer,
+ BlockNumber
heap_blk, Page heap_page,
+ int nlpdead_items,
+ Buffer vmbuffer,
+ uint8 vmbits)
+{
+ Assert(visibilitymap_get_status(rel, heap_blk, &vmbuffer) == vmbits);
```
Right before this function is called:
```
old_vmbits = visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer);
+ if (identify_and_fix_vm_corruption(vacrel->rel, buf, blkno, page,
+
presult.lpdead_items, vmbuffer,
+
old_vmbits))
```
So, the Assert() is checking if old_vmbits is newly returned from
visibilitymap_get_status(), in that case, identify_and_fix_vm_corruption() can
take vmbits as a pointer , and it calls visibilitymap_get_status() to get
vmbits itself and returns vmbits via the pointer, so that we don’t need to call
visibilitymap_get_status() twice.
4 - 0004
```
+ * These are only set if the HEAP_PAGE_PRUNE_UPDATE_VM option is set and
+ * we have attempted to update the VM.
+ */
+ uint8 new_vmbits;
+ uint8 old_vmbits;
```
The comment feels a little confusing to me. "HEAP_PAGE_PRUNE_UPDATE_VM option
is set” is a clear indication, but how to decide "we have attempted to update
the VM”? By reading the code:
```
+ prstate->attempt_update_vm =
+ (params->options & HEAP_PAGE_PRUNE_UPDATE_VM) != 0;
```
It’s just the result of HEAP_PAGE_PRUNE_UPDATE_VM being set. So, maybe we don’t
the “and” part.
5 - 0004
```
+ * Returns true if one or both VM bits should be set, along with returning the
+ * current value of the VM bits in *old_vmbits and the desired new value of
+ * the VM bits in *new_vmbits.
+ */
+static bool
+heap_page_will_set_vm(PruneState *prstate,
+ Relation relation,
+ BlockNumber heap_blk, Buffer
heap_buffer, Page heap_page,
+ Buffer vmbuffer,
+ int nlpdead_items,
+ uint8 *old_vmbits,
+ uint8 *new_vmbits)
+{
+ if (!prstate->attempt_update_vm)
+ return false;
```
old_vmbits and new_vmbits are purely output parameters. So, maybe we should set
them to 0 inside this function instead of relying on callers to initialize them.
I think this is a similar case where I raised a comment earlier about
initializing presult to {0} in the callers, and you only wanted to set presult
in heap_page_prune_and_freeze().
6 - 0004
```
@@ -823,13 +975,19 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
MultiXactId *new_relmin_mxid)
{
Buffer buffer = params->buffer;
+ Buffer vmbuffer = params->vmbuffer;
Page page = BufferGetPage(buffer);
+ BlockNumber blockno = BufferGetBlockNumber(buffer);
PruneState prstate;
bool do_freeze;
bool do_prune;
bool do_hint_prune;
+ bool do_set_vm;
bool did_tuple_hint_fpi;
int64 fpi_before = pgWalUsage.wal_fpi;
+ uint8 new_vmbits = 0;
+ uint8 old_vmbits = 0;
+
/* Initialize prstate */
```
Nit: an extra empty line is added.
7 - 0005
```
- * HEAP_PAGE_PRUNE_FREEZE indicates that we will also freeze tuples, and
- * will return 'all_visible', 'all_frozen' flags to the caller.
+ * HEAP_PAGE_PRUNE_FREEZE indicates that we will also freeze tuples
```
Nit: a tailing dot is needed in the end of the comment line.
8 - 0005
```
@@ -978,6 +1003,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
Buffer vmbuffer = params->vmbuffer;
Page page = BufferGetPage(buffer);
BlockNumber blockno = BufferGetBlockNumber(buffer);
+ TransactionId vm_conflict_horizon = InvalidTransactionId;
```
I guess the variable name “vm_conflict_horizon” comes from the old
"presult->vm_conflict_horizon”. But in the new logic, this variable is used
more generic, for example Assert(debug_cutoff == vm_conflict_horizon). I see
0006 has renamed to “conflict_xid”, so it’s up to you if or not rename it. But
to make the commit self-contained, I’d suggest renaming it.
9 - 0006
```
@@ -3537,6 +3537,7 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
{
ItemId itemid;
HeapTupleData tuple;
+ TransactionId dead_after = InvalidTransactionId;
```
This initialization seems to not needed, as HeapTupleSatisfiesVacuumHorizon()
will always set a value to it.
10 - 0010
```
+ * there is any snapshot that still consider
the newest xid on
```
Nit: consider -> considers
11 - 0011
```
+ * page. If we won't attempt freezing, just unset all-visible now,
though.
*/
+ if (!prstate->attempt_freeze)
+ {
+ prstate->all_visible = false;
+ prstate->all_frozen = false;
+ }
```
The comment says “just unset all-visible”, but the code actually also unset
all_frozen.
12 - 0012
```
+ /*
+ * RT indexes of relations modified by the query either through
+ * UPDATE/DELETE/INSERT/MERGE or SELECT FOR UPDATE
+ */
+ Bitmapset *es_modified_relids;
```
As we intentionally only want indexes, does it make sense to just name the
field es_modified_rtindexes to make it more explicit.
13 - 0012
```
+ /* If it has a rowmark, the relation is modified */
+ estate->es_modified_relids =
bms_add_member(estate->es_modified_relids,
+
rc->rti);
```
I think this comment is a little misleading, because SELECT FOR UPDATE/SHARE
doesn’t always modify tuples of the relation. If a reader not associating this
code with this patch, he may consider the comment is wrong. So, I think we
should make the comment more explicit. Maybe rephrase like “If it has a
rowmark, the relation may modify or lock heap pages”.
14 - 0015 - commit message
```
Setting pd_prune_xid on insert can cause a page to be dirtied and
written out when it previously would not have been, affetcting the
```
Typo: affetcting -> affecting
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/