On Thu, 3 Feb 2022 at 12:21, Robert Haas <robertmh...@gmail.com> wrote:
>
> VACUUM's first pass over the heap is implemented by a function called
> lazy_scan_heap(), while the second pass is implemented by a function
> called lazy_vacuum_heap_rel(). This seems to imply that the first pass
> is primarily an examination of what is present, while the second pass
> does the real work. This used to be more true than it now is.

I've been out of touch for a while, but I'm trying to catch up with the progress of the past few years.

Whatever happened to the idea to "rotate" the work of vacuum, so that all the work of the second pass would actually be deferred until the first pass of the next vacuum cycle?

That would also have the effect of eliminating the duplicate work, both the writes (with the WAL generation) and the actual scan. The only heap scan would be "remove line pointers previously cleaned from indexes, and prune dead tuples, recording them to clean from indexes in future". The index scan would remove line pointers and record them to be removed from the heap in a future heap scan.

The downside would mainly be in the latency before the actual tuples get cleaned up from the table. That is not so much of an issue as far as space goes these days, thanks to tuple pruning, but it is more and more of an issue with xid wraparound. Also, having to record the line pointers that have been cleaned from indexes somewhere on disk for the subsequent vacuum would be extra state on disk, and we've learned that means extra complexity.

--
greg
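
To make the rotation concrete, here is a toy model in Python. It is only a sketch of the idea as described above, not PostgreSQL code: the names (VacuumState, rotated_vacuum, index_cleaned) are made up for illustration, the heap is a dict of TID to item state, the index is a set of TIDs, and it assumes the index scan runs in the same cycle as the heap scan that pruned the tuples.

```python
from dataclasses import dataclass, field

# Toy line-pointer states (illustrative names, not PostgreSQL's).
LIVE, DEAD, LP_DEAD, UNUSED = "live", "dead", "lp_dead", "unused"


@dataclass
class VacuumState:
    """Hypothetical state that would have to persist between vacuum cycles."""
    index_cleaned: set = field(default_factory=set)


def rotated_vacuum(heap: dict, index: set, state: VacuumState) -> None:
    """One vacuum cycle with a single heap scan followed by one index scan."""
    newly_pruned = set()

    # The only heap scan: finish the previous cycle's work, start this one's.
    for tid, item in heap.items():
        if item == LP_DEAD and tid in state.index_cleaned:
            # The index no longer points here, so the slot can be reused.
            heap[tid] = UNUSED
        elif item == DEAD:
            # Prune the tuple now; keep a stub until the index is cleaned.
            heap[tid] = LP_DEAD
            newly_pruned.add(tid)

    # Index scan: drop entries for the stubs pruned in this cycle.
    index.difference_update(newly_pruned)

    # Remember what was cleaned so the NEXT cycle's heap scan can reclaim it.
    state.index_cleaned = newly_pruned


demo_heap = {1: LIVE, 2: DEAD, 3: DEAD}
demo_index = {1, 2, 3}
demo_state = VacuumState()

rotated_vacuum(demo_heap, demo_index, demo_state)  # TIDs 2 and 3 become stubs
rotated_vacuum(demo_heap, demo_index, demo_state)  # stubs are reclaimed only now
```

In this model the stubs for TIDs 2 and 3 stay in the heap until the second cycle, which is the latency downside mentioned above, and index_cleaned is the extra state that would have to survive on disk between cycles.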