On Thu, Jan 11, 2024 at 4:49 PM Robert Haas <robertmh...@gmail.com> wrote: > > On Thu, Jan 11, 2024 at 2:30 PM Melanie Plageman > <melanieplage...@gmail.com> wrote: > > I'm still kind of hung up on the changes that 0001 makes to vacuumlazy.c. > > Say we didn't add the recordfreespace parameter to lazy_scan_prune(). > Couldn't the caller just compute it? lpdead_items goes out of scope, > but there's also prstate.has_lpdead_items. > > Pressing that gripe a bit more, I just figured out that "Wait until > lazy_vacuum_heap_rel() to save free space" gets turned into "If we > will likely do index vacuuming, wait until lazy_vacuum_heap_rel() to > save free space." That follows the good practice of phrasing the > comment conditionally when the comment is outside the if-statement. > But the if-statement says merely "if (recordfreespace)", which is not > obviously related to "If we will likely do index vacuuming" but under > the hood actually is. But it seems like an unpleasant amount of action > at a distance. If that condition said if (vacrel->nindexes > 0 && > vacrel->do_index_vacuuming && prstate.has_lpdead_items) > UnlockReleaseBuffer(); else { PageGetHeapFreeSpace; > UnlockReleaseBuffer; RecordPageWithFreeSpace } it would be a lot more > obvious that the code was doing what the comment says.
Yes, this is a good point. Seems like writing maintainable code is really about never lying and then figuring out when hiding the truth is also lying. Darn! My only excuse is that lazy_scan_noprune() has a similarly confusing output parameter, recordfreespace, both of which I removed in a later patch in the set. I think we should change it as you say. > That's a refactoring question, but I'm also wondering if there's a > functional issue. Pre-patch, if the table has no indexes and some > items are left LP_DEAD, then we mark them unused, forget them, > RecordPageWithFreeSpace(), and if enough pages have been visited, > FreeSpaceMapVacuumRange(). Post-patch, if the table has no indexes, no > items will be left LP_DEAD, and *recordfreespace will be set to true, > so we'll PageRecordFreeSpace(). But this seems to me to mean that > post-patch we'll PageRecordFreeSpace() in more cases than we do now. > Specifically, imagine a case where the current code wouldn't mark any > items LP_DEAD and the page also had no pre-existing items that were > LP_DEAD and the table also has no indexes. Currently, I believe we > wouldn't PageRecordFreeSpace(), but with the patch, I think we would. > Am I wrong? Ah! I think you are right. Good catch. I could fix this with logic like this: bool space_freed = vacrel->tuples_deleted > tuples_already_deleted; if ((vacrel->nindexes == 0 && space_freed) || (vacrel->nindexes > 0 && (space_freed || !vacrel->do_index_vacuuming))) I think I made this mistake when working on a different version of this that combined the prune and no prune cases. I noticed that lazy_scan_noprune() updates the FSM whenever there are no indexes. I wonder why this is (and why we don't do it in the prune case). > Note that the call to FreeSpaceMapVacuumRange() seems to try to guard > against this problem by testing for vacrel->tuples_deleted > > tuples_already_deleted. I haven't tried to verify whether that guard > is correct, but the fact that FreeSpaceMapVacuumRange() has such a > guard and RecordPageWithFreeSpace() does not have one makes me > suspicious. FreeSpaceMapVacuumRange() is not called for the no prune case, so I think this is right. > Another interesting effect of the patch is that instead of getting > lazy_vacuum_heap_page()'s handling of the all-visible status of the > page, we get the somewhat more complex handling done by > lazy_scan_heap(). I haven't fully through the consequences of that, > but if you have, I'd be interested to hear your analysis. lazy_vacuum_heap_page() calls heap_page_is_all_visible() which does another HeapTupleSatisfiesVacuum() call -- which is definitely going to be more expensive than not doing that. In one case, in lazy_scan_heap(), we might do a visibilitymap_get_status() (via VM_ALL_FROZEN()) to avoid calling visibilitymap_set() if the page is already marked all frozen in the VM. But that would pale in comparison to another HeapTupleSatisfiesVacuum() (I think). The VM update code in lazy_scan_heap() looks complicated but two out of four cases are basically to deal with uncommon data corruption. - Melanie