Hi, On 2024-01-04 17:37:27 -0500, Melanie Plageman wrote: > On Thu, Jan 4, 2024 at 3:03 PM Andres Freund <and...@anarazel.de> wrote: > > > > On 2023-11-17 18:12:08 -0500, Melanie Plageman wrote: > > > Assert(ItemIdIsNormal(lp)); > > > htup = (HeapTupleHeader) PageGetItem(dp, lp); > > > @@ -715,7 +733,17 @@ heap_prune_chain(Buffer buffer, OffsetNumber > > > rootoffnum, > > > * redirect the root to the correct chain member. > > > */ > > > if (i >= nchain) > > > - heap_prune_record_dead(prstate, rootoffnum); > > > + { > > > + /* > > > + * If the relation has no indexes, we can remove > > > dead tuples > > > + * during pruning instead of marking their line > > > pointers dead. Set > > > + * this tuple's line pointer LP_UNUSED. > > > + */ > > > + if (prstate->pronto_reap) > > > + heap_prune_record_unused(prstate, > > > rootoffnum); > > > + else > > > + heap_prune_record_dead(prstate, rootoffnum); > > > + } > > > else > > > heap_prune_record_redirect(prstate, rootoffnum, > > > chainitems[i]); > > > } > > > @@ -726,9 +754,12 @@ heap_prune_chain(Buffer buffer, OffsetNumber > > > rootoffnum, > > > * item. This can happen if the loop in heap_page_prune > > > caused us to > > > * visit the dead successor of a redirect item before > > > visiting the > > > * redirect item. We can clean up by setting the redirect > > > item to > > > - * DEAD state. > > > + * DEAD state. If pronto_reap is true, we can set it > > > LP_UNUSED now. > > > */ > > > - heap_prune_record_dead(prstate, rootoffnum); > > > + if (prstate->pronto_reap) > > > + heap_prune_record_unused(prstate, rootoffnum); > > > + else > > > + heap_prune_record_dead(prstate, rootoffnum); > > > } > > > > > > return ndeleted; > > > > There's three new calls to heap_prune_record_unused() and the logic got more > > nested. Is there a way to get to a nicer end result? > > So, I could do another loop through the line pointers in > heap_page_prune() (after the loop calling heap_prune_chain()) and, if > pronto_reap is true, set dead line pointers LP_UNUSED. Then, when > constructing the WAL record, I would just not add the prstate.nowdead > that I saved from heap_prune_chain() to the prune WAL record.
Hm, that seems a bit sad as well. I am wondering if we could move the pronto_reap handling into heap_prune_record_dead() or a wrapper of it. I am more concerned about the human reader than the CPU here... > > > @@ -972,20 +970,21 @@ lazy_scan_heap(LVRelState *vacrel) > > > continue; > > > } > > > > > > - /* Collect LP_DEAD items in dead_items array, count > > > tuples */ > > > - if (lazy_scan_noprune(vacrel, buf, blkno, page, > > > &hastup, > > > + /* > > > + * Collect LP_DEAD items in dead_items array, count > > > tuples, > > > + * determine if rel truncation is safe > > > + */ > > > + if (lazy_scan_noprune(vacrel, buf, blkno, page, > > > > > > &recordfreespace)) > > > { > > > Size freespace = 0; > > > > > > /* > > > * Processed page successfully (without > > > cleanup lock) -- just > > > - * need to perform rel truncation and FSM > > > steps, much like the > > > - * lazy_scan_prune case. Don't bother > > > trying to match its > > > - * visibility map setting steps, though. > > > + * need to update the FSM, much like the > > > lazy_scan_prune case. > > > + * Don't bother trying to match its > > > visibility map setting > > > + * steps, though. > > > */ > > > - if (hastup) > > > - vacrel->nonempty_pages = blkno + 1; > > > if (recordfreespace) > > > freespace = > > > PageGetHeapFreeSpace(page); > > > UnlockReleaseBuffer(buf); > > > > The comment continues to say that we "determine if rel truncation is safe" - > > but I don't see that? Oh, I see, it's done inside lazy_scan_noprune(). This > > doesn't seem like a clear improvement to me. Particularly because it's only > > set if lazy_scan_noprune() actually does something. > > I don't get what the last sentence means ("Particularly because..."). Took me a second to understand myself again too, oops. What I think I meant is that it seems error-prone that it's only set in some paths inside lazy_scan_noprune(). Previously it was at least a bit clearer in lazy_scan_heap() that it would be set for the different possible paths. > > I don't like the existing code in lazy_scan_heap(). But this kinda seems > > like > > tinkering around the edges, without getting to the heart of the issue. I > > think > > we should > > > > 1) Move everything after ReadBufferExtended() and the end of the loop into > > its > > own function > > > > 2) All the code in the loop body after the call to lazy_scan_prune() is > > specific to the lazy_scan_prune() path, it doesn't make sense that it's > > at > > the same level as the the calls to lazy_scan_noprune(), > > lazy_scan_new_or_empty() or lazy_scan_prune(). Either it should be in > > lazy_scan_prune() or a new wrapper function. > > > > 3) It's imo wrong that we have UnlockReleaseBuffer() (there are 6 different > > places unlocking if I didn't miscount!) and RecordPageWithFreeSpace() > > calls > > in this many places. I think this is largely a consequence of the > > previous > > points. Once those are addressed, we can have one common place. > > I have other patches that do versions of all of the above, but they > didn't seem to really fit with this patch set. I am taking a step to > move code out of lazy_scan_heap() that doesn't belong there. That fact > that other code should also be moved from there seems more like a "yes > and" than a "no but". That being said, do you think I should introduce > patches doing further refactoring of lazy_scan_heap() (like what you > suggest above) into this thread? It probably should not be part of this patchset. I probably shouldn't have written the above here, but after concluding that I didn't think your small refactoring patch was quite right, I couldn't stop myself from thinking about what would be right. Greetings, Andres Freund