On Mon, Jan 15, 2024 at 12:29:57PM -0500, Robert Haas wrote: > On Fri, Jan 12, 2024 at 4:05 PM Melanie Plageman > <melanieplage...@gmail.com> wrote: > > Yea, that works for now. I mean, I think the way we should do it is > > update the FSM in lazy_scan_noprune(), but, for the purposes of this > > patch, yes. has_lpdead_items output parameter seems fine to me. > > Here's v2. > > It's not exactly clear to me why you want to update the FSM in > lazy_scan_[no]prune(). When I first looked at v7-0004, I said to > myself "well, this is dumb, because now we're just duplicating > something that is common to both cases". But then I realized that the > logic was *already* duplicated in lazy_scan_heap(), and that by > pushing the duplication down to lazy_scan_[no]prune(), you made the > two copies identical rather than, as at present, having two copies > that differ from each other. Perhaps that's a sufficient reason to > make the change all by itself, but it seems like what would be really > good is if we only needed one copy of the logic. I don't know if > that's achievable, though.
Upthread in v4-0004, I do have a version which combines the two FSM updates for lazy_scan_prune() and lazy_scan_noprune(). If you move the VM updates from lazy_scan_heap() into lazy_scan_prune(), then there is little difference between the prune and no prune cases in lazy_scan_heap(). The main difference is that, when lazy_scan_noprune() returns true, you are supposed to avoid calling lazy_scan_new_or_empty() again and have to avoid calling lazy_scan_prune(). I solved this with a local variable "do_prune" and checked it before calling lazy_scan_new_or_empty() and lazy_scan_prune(). I moved away from this approach because it felt odd to test "do_prune" before calling lazy_scan_new_or_empty(). Though, perhaps it doesn't matter if we just call lazy_scan_new_or_empty() again. We do that when lazy_scan_noprune() returns false anyway. I thought perhaps we could circumvent this issue by moving lazy_scan_new_or_empty() into lazy_scan_prune(). But, this seemed wrong to me because, as it stands now, if lazy_scan_new_or_empty() returns true, we would want to skip the VM update code and FSM update code in lazy_scan_heap() after lazy_scan_prune(). That would mean lazy_scan_prune() would have to return something to indicate that. > More generally, I somewhat question whether it's really right to push > things from lazy_scan_heap() and into lazy_scan_[no]prune(), mostly > because of the risk of having to duplicate logic. I'm not even really > convinced that it's good for us to have both of those functions. > There's an awful lot of code duplication between them already. Each > has a loop that tests the status of each item and then, for LP_USED, > switches on htsv_get_valid_status or HeapTupleSatisfiesVacuum. It > doesn't seem trivial to unify all that code, but it doesn't seem very > nice to have two copies of it, either. I agree that the duplicated logic in both places is undesirable. I think the biggest issue with combining them would be that when lazy_scan_noprune() returns false, it needs to go get a cleanup lock and then invoke the logic of lazy_scan_prune() for all the tuples on the page. This seems a little hard to get right in a single function. Then there are more trivial-to-solve differences like invoking heap_tuple_should_freeze() and not bothering with the whole heap_prepare_freeze_tuple() if we only have the share lock. I am willing to try and write a version of this to see if it is better. I will say, though, my agenda was to eventually push the actions taken in the loop in lazy_scan_prune() into heap_page_prune() and heap_prune_chain(). > From 32684f41d1dd50f726aa0dfe8a5d816aa5c42d64 Mon Sep 17 00:00:00 2001 > From: Robert Haas <rh...@postgresql.org> > Date: Mon, 15 Jan 2024 12:05:52 -0500 > Subject: [PATCH v2] Be more consistent about whether to update the FSM while > vacuuming. Few small review comments below, but, overall, LGTM. > Previously, when lazy_scan_noprune() was called and returned true, we would > update the FSM immediately if the relation had no indexes or if the page > contained no dead items. On the other hand, when lazy_scan_prune() was > called, we would update the FSM if either of those things was true or > if index vacuuming was disabled. Eliminate that behavioral difference by > considering vacrel->do_index_vacuuming in both cases. > > Also, instead of having lazy_scan_noprune() make the decision > internally and pass it back to the caller via *recordfreespace, just > have it pass the number of LP_DEAD items back to the caller, and then It doesn't pass the number of LP_DEAD items back to the caller. It passes a boolean. > let the caller make the decision. That seems less confusing, since > the caller also decides in the lazy_scan_prune() case; moreover, this > way, the whole test is in one place, instead of spread out. Perhaps it isn't important, but I find this wording confusing. You mention lazy_scan_prune() and then mention that "the whole test is in one place instead of spread out" -- which kind of makes it sound like you are consolidating FSM updates for both the lazy_scan_noprune() and lazy_scan_prune() cases. Perhaps simply flipping the order of the "since the caller" and "moreover, this way" conjunctions would solve it. I defer to your judgment. > --- > src/backend/access/heap/vacuumlazy.c | 58 ++++++++++++++-------------- > 1 file changed, 29 insertions(+), 29 deletions(-) > > diff --git a/src/backend/access/heap/vacuumlazy.c > b/src/backend/access/heap/vacuumlazy.c > index b63cad1335..f17816b81d 100644 > --- a/src/backend/access/heap/vacuumlazy.c > +++ b/src/backend/access/heap/vacuumlazy.c > @@ -1960,7 +1974,6 @@ lazy_scan_noprune(LVRelState *vacrel, > Assert(BufferGetBlockNumber(buf) == blkno); > > hastup = false; /* for now */ > - *recordfreespace = false; /* for now */ > > lpdead_items = 0; > live_tuples = 0; > @@ -2102,18 +2115,8 @@ lazy_scan_noprune(LVRelState *vacrel, > hastup = true; > missed_dead_tuples += lpdead_items; > } > - > - *recordfreespace = true; > - } > - else if (lpdead_items == 0) > - { > - /* > - * Won't be vacuuming this page later, so record page's > freespace in > - * the FSM now > - */ > - *recordfreespace = true; > } > - else > + else if (lpdead_items != 0) It stuck out to me a bit that this test is if lpdead_items != 0 and the one above it: /* Save any LP_DEAD items found on the page in dead_items array */ if (vacrel->nindexes == 0) { /* Using one-pass strategy (since table has no indexes) */ if (lpdead_items > 0) { tests if lpdead_items > 0. It is more the inconsistency that bothered me than the fact that lpdead_items is signed. > @@ -2159,6 +2156,9 @@ lazy_scan_noprune(LVRelState *vacrel, > if (hastup) > vacrel->nonempty_pages = blkno + 1; > > + /* Did we find LP_DEAD items? */ > + *has_lpdead_items = (lpdead_items > 0); I would drop this comment. The code doesn't really need it, and the reason we care if there are LP_DEAD items is not because they are dead but because we want to know if we'll touch this page again. You don't need to rehash all that here, so I think omitting the comment is enough. - Melanie