Re: should vacuum's first heap pass be read-only?

Robert Haas Mon, 07 Feb 2022 08:37:02 -0800

On Fri, Feb 4, 2022 at 4:12 PM Peter Geoghegan <[email protected]> wrote:
> I had imagined that we'd
> want to do heap vacuuming in the same way as today with the dead TID
> conveyor belt stuff -- it just might take several VACUUM operations
> until we are ready to do a round of heap vacuuming.


I am trying to understand exactly what you are imagining here. Do you
mean we'd continue to run lazy_scan_heap() at the start of every vacuum,
and lazy_vacuum_heap_rel() at the end? I had assumed that we didn't
want to do that, because we might already know from the conveyor belt
that there are some dead TIDs that could be marked unused, and it
seems strange to just ignore that knowledge at a time when we're
scanning the heap anyway. However, on reflection, that approach has
something to recommend it, because it would be somewhat simpler to
understand what's actually being changed. We could just:

1. Teach lazy_scan_heap() that it should add TIDs to the conveyor
belt, if we're using one, unless they're already there, but otherwise
work as today.

2. Teach lazy_vacuum_heap_rel() that, if there is a conveyor belt, it
should try to clear from the indexes all of the dead TIDs that are
eligible.

3. If there is a conveyor belt, use some kind of magic to decide when
to skip vacuuming some or all indexes. When we skip one or more
indexes, the subsequent lazy_vacuum_heap_rel() can't possibly mark as
unused any of the dead TIDs we found this time, so we should just skip
it, unless somehow there are TIDs on the conveyor belt that were
already ready to be marked unused at the start of this VACUUM, in
which case we can still handle those.
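
To make the control flow of those three steps concrete, here is a rough,
self-contained C sketch. Everything in it is hypothetical: the conveyor
belt is modeled as a small in-memory array rather than real storage, the
sketch_* functions and the SketchTid, BeltEntry and ConveyorBelt types are
invented stand-ins and not PostgreSQL APIs, index vacuuming is assumed to
be all-or-nothing, and the "magic" that decides when to skip indexes is
reduced to a boolean passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BELT_CAPACITY 64

/* Hypothetical stand-in for a heap TID (block number, line pointer offset). */
typedef struct { unsigned block; unsigned offset; } SketchTid;

typedef struct {
    SketchTid tid;
    bool indexes_clean;   /* true once no index still points at this TID */
} BeltEntry;

/* Assumption: the belt is just a fixed-size array of dead TIDs. */
typedef struct {
    BeltEntry entries[BELT_CAPACITY];
    size_t nentries;
} ConveyorBelt;

static bool belt_contains(const ConveyorBelt *belt, SketchTid tid)
{
    for (size_t i = 0; i < belt->nentries; i++)
        if (belt->entries[i].tid.block == tid.block &&
            belt->entries[i].tid.offset == tid.offset)
            return true;
    return false;
}

/* Step 1: the heap scan adds dead TIDs to the belt unless already there. */
size_t sketch_scan_heap(ConveyorBelt *belt, const SketchTid *dead, size_t ndead)
{
    size_t added = 0;
    for (size_t i = 0; i < ndead; i++) {
        if (belt_contains(belt, dead[i]))
            continue;
        assert(belt->nentries < BELT_CAPACITY);
        belt->entries[belt->nentries].tid = dead[i];
        belt->entries[belt->nentries].indexes_clean = false;
        belt->nentries++;
        added++;
    }
    return added;
}

/* Step 2: index vacuuming clears every eligible dead TID from the indexes. */
size_t sketch_vacuum_indexes(ConveyorBelt *belt)
{
    size_t cleaned = 0;
    for (size_t i = 0; i < belt->nentries; i++) {
        if (!belt->entries[i].indexes_clean) {
            belt->entries[i].indexes_clean = true;
            cleaned++;
        }
    }
    return cleaned;
}

/* Heap vacuuming: only TIDs already gone from the indexes can be marked
 * unused; those are dropped from the belt, the rest stay for a later VACUUM. */
size_t sketch_vacuum_heap(ConveyorBelt *belt)
{
    size_t kept = 0;
    for (size_t i = 0; i < belt->nentries; i++)
        if (!belt->entries[i].indexes_clean)
            belt->entries[kept++] = belt->entries[i];
    size_t reclaimed = belt->nentries - kept;
    belt->nentries = kept;
    return reclaimed;
}

/* Step 3: one VACUUM; skip_indexes stands in for the "magic" decision.
 * Returns how many TIDs were marked unused in the heap. */
size_t sketch_vacuum(ConveyorBelt *belt, const SketchTid *dead, size_t ndead,
                     bool skip_indexes)
{
    sketch_scan_heap(belt, dead, ndead);
    if (!skip_indexes)
        sketch_vacuum_indexes(belt);
    return sketch_vacuum_heap(belt);
}
```

In this model, a VACUUM that skips the indexes reclaims nothing it found
itself and only grows the belt, while a later VACUUM that does vacuum the
indexes can mark unused everything accumulated so far. TIDs whose index
entries were already gone at the start of a VACUUM are still reclaimed even
when the indexes are skipped, since sketch_vacuum_heap() looks only at the
per-TID flag.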

Is this the kind of thing you had in mind?

-- 
Robert Haas
EDB: http://www.enterprisedb.com
