On Fri, Apr 1, 2022 at 11:39 AM Robert Haas <robertmh...@gmail.com> wrote:
> So I'm completely confused here. If we always start a vacuum with
> lazy_scan_heap(), as you said you wanted, then we will not save any
> heap scanning.

The term "start a VACUUM" becomes ambiguous with the conveyor belt.

What I addressed in a nearby email back in February [1] was the idea
of doing heap vacuuming of the last run (or several runs) of dead TIDs
on top of heap pruning to create the next run/runs of dead TIDs.

> What am I missing?

There is a certain sense in which we are bound to always "start a
vacuum" in lazy_scan_prune(), with any design based on the current
one. How else are we ever going to make a basic initial determination
about which heap LP_DEAD items need their TIDs deleted from indexes,
sooner or later? Obviously that information must always have
originated in lazy_scan_prune() (or in lazy_scan_noprune()).

With the conveyor belt, and a non-HOT-update heavy workload, we'll
eventually need to exhaustively do index vacuuming of all indexes
(even those that don't need it for their own sake) to make it safe to
remove heap line pointer bloat (to set heap LP_DEAD items to
LP_UNUSED). This will happen least often of all, and is the one
dependency the conveyor belt can't help with.

To answer your question: when heap vacuuming does finally happen, we
at least don't need to call lazy_scan_prune() for any pages first
(neither the pages we're vacuuming, nor any other heap pages). Plus
the decision to finally clean up line pointer bloat can be made based
on known facts about line pointer bloat, without tying that to other
processing done by lazy_scan_prune() -- so there's greater separation
of concerns.

That having been said...maybe it would make sense to also call
lazy_scan_prune() right after these relatively rare calls to
lazy_vacuum_heap_page(), opportunistically (since we already dirtied
the page once). But that would be an additional optimization, at best;
it wouldn't be the main way that we call lazy_scan_prune().

[1] https://www.postgresql.org/message-id/CAH2-WzmG%3D_vYv0p4bhV8L73_u%2BBkd0JMWe2zHH333oEujhig1g%40mail.gmail.com

--
Peter Geoghegan
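
The dependency described above (heap LP_DEAD items can only become
LP_UNUSED once every index has processed the corresponding dead TIDs)
can be illustrated with a toy model. This is only a sketch under
simplifying assumptions: it is Python rather than PostgreSQL C, and the
names ConveyorBelt, append_run, vacuum_index, heap_vacuumable_runs and
vacuum_heap are hypothetical, not taken from PostgreSQL or from any
proposed patch. A "run" here is just a list of (block, offset) pairs
standing in for dead TIDs.

```python
from dataclasses import dataclass, field


@dataclass
class ConveyorBelt:
    """Toy model: an append-only log of dead-TID runs shared by all indexes.

    All names are hypothetical; this is not PostgreSQL code.
    """
    index_names: list
    runs: list = field(default_factory=list)            # each run: list of (block, offset)
    index_progress: dict = field(default_factory=dict)  # index name -> runs already vacuumed
    heap_progress: int = 0                              # runs already set LP_UNUSED in the heap

    def __post_init__(self):
        # Every index starts out having processed no runs at all.
        self.index_progress = {name: 0 for name in self.index_names}

    def append_run(self, dead_tids):
        """Pruning step: record newly found LP_DEAD items as a new run."""
        self.runs.append(list(dead_tids))

    def vacuum_index(self, name):
        """Index vacuuming for one index: catch up on all runs so far."""
        self.index_progress[name] = len(self.runs)

    def heap_vacuumable_runs(self):
        """Runs that every index has processed but the heap has not."""
        safe_upto = min(self.index_progress.values(), default=len(self.runs))
        return self.runs[self.heap_progress:safe_upto]

    def vacuum_heap(self):
        """Heap vacuuming: return TIDs that are safe to set LP_UNUSED.

        Works purely from recorded runs; no pruning pass is needed first.
        """
        safe = self.heap_vacuumable_runs()
        self.heap_progress += len(safe)
        return [tid for run in safe for tid in run]
```

In this model, vacuuming one index on its own never unlocks heap
vacuuming; the slowest index determines how many runs are safe. That
is the sense in which all indexes eventually have to be vacuumed, even
those that don't need it for their own sake.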