On 08/03/2024 02:46, Melanie Plageman wrote:
On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:
I feel heap_vac_scan_get_next_block() function could use some love. Maybe
just some rewording of the comments, or maybe some other refactoring; not
sure. But I'm pretty happy with the function signature and how it's called.

I've cleaned up the comments on heap_vac_scan_next_block() in the first
couple patches (not so much in the streaming read user). Let me know if
it addresses your feelings or if I should look for other things I could
change.

Thanks, that is better. I think I now finally understand how the function works, and now I can see some more issues and refactoring opportunities :-).

Looking at current lazy_scan_skip() code in 'master', one thing now caught my eye (and it's the same with your patches):

        *next_unskippable_allvis = true;
        while (next_unskippable_block < rel_pages)
        {
                uint8           mapbits = visibilitymap_get_status(vacrel->rel,
                                                                                
                           next_unskippable_block,
                                                                                
                           vmbuffer);

                if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
                {
                        Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
                        *next_unskippable_allvis = false;
                        break;
                }

                /*
                 * Caller must scan the last page to determine whether it has 
tuples
                 * (caller must have the opportunity to set 
vacrel->nonempty_pages).
                 * This rule avoids having lazy_truncate_heap() take 
access-exclusive
                 * lock on rel to attempt a truncation that fails anyway, just 
because
                 * there are tuples on the last page (it is likely that there 
will be
                 * tuples on other nearby pages as well, but those can be 
skipped).
                 *
                 * Implement this by always treating the last block as unsafe 
to skip.
                 */
                if (next_unskippable_block == rel_pages - 1)
                        break;

                /* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
                if (!vacrel->skipwithvm)
                {
                        /* Caller shouldn't rely on all_visible_according_to_vm 
*/
                        *next_unskippable_allvis = false;
                        break;
                }

                /*
                 * Aggressive VACUUM caller can't skip pages just because they 
are
                 * all-visible.  They may still skip all-frozen pages, which 
can't
                 * contain XIDs < OldestXmin (XIDs that aren't already frozen 
by now).
                 */
                if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
                {
                        if (vacrel->aggressive)
                                break;

                        /*
                         * All-visible block is safe to skip in non-aggressive 
case.  But
                         * remember that the final range contains such a block 
for later.
                         */
                        skipsallvis = true;
                }

                /* XXX: is it OK to remove this? */
                vacuum_delay_point();
                next_unskippable_block++;
                nskippable_blocks++;
        }

Firstly, it seems silly to check DISABLE_PAGE_SKIPPING within the loop. When DISABLE_PAGE_SKIPPING is set, we always return the next block and set *next_unskippable_allvis = false regardless of the visibility map, so why bother checking the visibility map at all?

Except at the very last block of the relation! If you look carefully, at the last block we do return *next_unskippable_allvis = true, if the VM says so, even if DISABLE_PAGE_SKIPPING is set. I think that's wrong. Surely the intention was to pretend that none of the VM bits were set if DISABLE_PAGE_SKIPPING is used, also for the last block.

This was changed in commit 980ae17310:

@@ -1311,7 +1327,11 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, 
BlockNumber next_block,
/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
                if (!vacrel->skipwithvm)
+               {
+                       /* Caller shouldn't rely on all_visible_according_to_vm 
*/
+                       *next_unskippable_allvis = false;
                        break;
+               }

Before that, *next_unskippable_allvis was set correctly according to the VM, even when DISABLE_PAGE_SKIPPING was used. It's not clear to me why that was changed. And I think setting it to 'true' would be a more failsafe value than 'false'. When *next_unskippable_allvis is set to true, the caller cannot rely on it because a concurrent modification could immediately clear the VM bit. But because VACUUM is the only process that sets VM bits, if it's set to false, the caller can assume that it's still not set later on.

One consequence of that is that with DISABLE_PAGE_SKIPPING, lazy_scan_heap() dirties all pages, even if there are no changes. The attached test script demonstrates that.

ISTM we should revert the above hunk, and backpatch it to v16. I'm a little wary because I don't understand why that change was made in the first place, though. I think it was just an ill-advised attempt at tidying up the code as part of the larger commit, but I'm not sure. Peter, do you remember?

I wonder if we should give up trying to set all_visible_according_to_vm correctly when we decide what to skip, and always do "all_visible_according_to_vm = visibilitymap_get_status(...)" in lazy_scan_prune(). It would be more expensive, but maybe it doesn't matter in practice. It would get rid of this tricky bookkeeping in heap_vac_scan_next_block().

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment: vactest.sql
Description: application/sql

Reply via email to