On Sat, 20 Dec 2025 at 02:10, Melanie Plageman
<[email protected]> wrote:
>
> Attached v29 addresses some feedback and also corrects a small error
> with the assertion I had added in the previous version's 0009.
>
> On Thu, Dec 18, 2025 at 10:38 PM Xuneng Zhou <[email protected]> wrote:
> >
> > I’ve done a basic review of patches 1 and 2. Here are some comments
> > which may be somewhat immature, as this is a fairly large change set
> > and I’m new to some parts of the code.
> >
> > 1) Potential stale old_vmbits after VM repair n v2
>
> Good catch! I've fixed this in attached v29.
>
> > 2) Add Assert(BufferIsDirty(buf))
> >
> > Since the patch's core claim is "buffer must be dirty before WAL
> > registration", an assertion encodes this invariant. Should we add:
> >
> > Assert(BufferIsValid(buf));
> > Assert(BufferIsDirty(buf));
> >
> > right before the visibilitymap_set() call?
>
> There are already assertions that will trip in various places -- most
> importantly in XLogRegisterBuffer(), which is the one that inspired
> this refactor.
>
> > The comment at lines:
> > > "The only scenario where it is not already dirty is if the VM was 
> > > removed…"
> >
> > This phrasing could become misleading after future refactors. Can we
> > make it more direct like:
> >
> > > "We must mark the heap buffer dirty before calling visibilitymap_set(), 
> > > because it may WAL-log the buffer and XLogRegisterBuffer() requires it."
>
> I see your point about future refactors missing updating comments like
> this. But, I don't think we are going to refactor the code such that
> we can have PD_ALL_VISIBLE set without the VM bits set more often.
> Also, it is common practice in Postgres to describe very specific edge
> cases or odd scenarios in order to explain code that may seem
> confusing without the comment. It does risk that comment later
> becoming stale, but it is better that future developers understand why
> the code is there.
>
> That being said, I take your point that the comment is confusing. I
> have updated it in a different way.
>
> > > "Even if PD_ALL_VISIBLE is already set, we don't need to worry about 
> > > unnecessarily dirtying the heap buffer, as it must be marked dirty before 
> > > adding it to the WAL chain. The only scenario where it is not already 
> > > dirty is if the VM was removed..."
> >
> > In this test we now call MarkBufferDirty() on the heap page even when
> > only setting the VM, so the comments claiming “does not need to modify
> > the heap buffer”/“no heap page modification” might be misleading. It
> > might be better to say the test doesn’t need to modify heap
> > tuples/page contents or doesn’t need to prune/freeze.
>
> The point I'm trying to make is that we have to dirty the buffer even
> if we don't modify the page because of the XLOG sub-system
> requirements. And, it may seem like a waste to do that if not
> modifying the page, but the page will rarely be clean anyway. I've
> tried to make this more clear in attached v29.
>
> - Melanie


Hi! I checked v29-0009, about HeapTupleSatisfiesVacuumHorizon. Origins
of this code track down to fdf9e21196a6 which was committed as part of
[0], at which point
there was no HeapTupleSatisfiesVacuumHorizon function. I guess this is
the reason this optimization was not performed earlier.

I also think this patch is correct, because we do similar things for
HEAPTUPLE_DEAD & HEAPTUPLE_RECENTLY_DEAD, and
HeapTupleSatisfiesVacuumHorizon is just a proxy to
HeapTupleSatisfiesVacuumHorizon with only difference in DEAD VS
RECENTLY_DEAD handling.


Similar change could be done at heapam_scan_analyze_next_tuple

...
case HEAPTUPLE_DEAD:
case HEAPTUPLE_RECENTLY_DEAD:
/* Count dead and recently-dead rows */
*deadrows += 1;
break;
...



[0] 
https://www.postgresql.org/message-id/CABOikdP0meGuXPPWuYrP%3DvDvoqUdshF2xJAzZHWSKg03Rz_%2B9Q%40mail.gmail.com


-- 
Best regards,
Kirill Reshke


Reply via email to