Hi,
On 2026-01-28 18:16:10 -0500, Melanie Plageman wrote:
> Subject: [PATCH v34 13/14] Allow on-access pruning to set pages all-visible
>
> Many queries do not modify the underlying relation. For such queries, if
> on-access pruning occurs during the scan, we can check whether the page
> has become all-visible and update the visibility map accordingly.
> Previously, only vacuum and COPY FREEZE marked pages as all-visible or
> all-frozen.
>
> This commit implements on-access VM setting for sequential scans as well
> as for the underlying heap relation in index scans and bitmap heap
> scans.
For evaluating this, did you build anything that measures how often this succeeds, how often it leads to pages being unnecessarily un-all-visible'd again, etc. during benchmarks?
> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 1a3fa8a76aa..7d22549a290 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -617,6 +617,7 @@ heap_prepare_pagescan(TableScanDesc sscan)
> Buffer buffer = scan->rs_cbuf;
> BlockNumber block = scan->rs_cblock;
> Snapshot snapshot;
> + Buffer *vmbuffer = NULL;
> Page page;
> int lines;
> bool all_visible;
> @@ -631,7 +632,9 @@ heap_prepare_pagescan(TableScanDesc sscan)
> /*
> * Prune and repair fragmentation for the whole page, if possible.
> */
> - heap_page_prune_opt(scan->rs_base.rs_rd, buffer);
> + if (sscan->rs_flags & SO_HINT_REL_READ_ONLY)
> + vmbuffer = &scan->rs_vmbuffer;
> + heap_page_prune_opt(scan->rs_base.rs_rd, buffer, vmbuffer);
I don't love that the signalling to heap_page_prune_opt() about this happens by passing either a vmbuffer or NULL.
We clearly don't want to actually freeze rows if we're doing an update and might just update the rows again. But it's less clear to me that, if we are pruning dead row versions *and* the page ends up all-visible afterwards (say because only dead HOT versions were removed), we shouldn't mark the page as such?
> @@ -306,6 +312,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
> .cutoffs = NULL,
> };
>
> + if (vmbuffer)
> + {
> +		visibilitymap_pin(relation, BufferGetBlockNumber(buffer), vmbuffer);
> + params.options |= HEAP_PAGE_PRUNE_UPDATE_VM;
> + params.vmbuffer = *vmbuffer;
Why do we pin the buffer at this point, rather than deferring that until we actually need it? I guess we will just always access it, but that doesn't seem inherent (c.f. my earlier points about a faster exit when looking at an already all-frozen page or such).
It's also not clear to me why we pin the page in lazy_scan_heap() before it's clear that we need it. There the cost is often very low, because accesses are largely sequential; but here we might be called from an index scan, with very little locality of access.
Greetings,
Andres Freund