On Fri, Jan 5, 2024 at 12:59 PM Melanie Plageman <melanieplage...@gmail.com> wrote: > I actually think we are going to want to stop referring to these steps > as pruning and vacuuming. It is confusing because vacuuming refers to > the whole process of doing garbage collection on the table and also to > the specific step of setting dead line pointers unused. If we called > these steps say, pruning and reaping, that may be more clear.
I agree that there's some terminological confusion here, but if we leave all of the current naming unchanged and start using new naming conventions in new code, I don't think that's going to clear things up. Getting rid of the weird naming with pruning, "scanning" (that is part of vacuum), and "vacuuming" (that is another part of vacuum) is gonna take some work. > Vacuuming consists of three phases -- the first pass, index vacuuming, > and the second pass. I don't think we should dictate what happens in > each pass. That is, we shouldn't expect only pruning to happen in the > first pass and only reaping to happen in the second pass. For example, > I think Andres has previously proposed doing another round of pruning > after index vacuuming. The second pass/third phase is distinguished > primarily by being after index vacuuming. > > If we think about it this way, that frees us up to set dead line > pointers unused in the first pass when the table has no indexes. For > clarity, I could add a block comment explaining that doing this is an > optimization and not a logical requirement. One way to make this even > more clear would be to set the dead line pointers unused in a separate > loop after heap_prune_chain() as I proposed upthread. I don't really disagree with any of this, but I think the question is whether the code is cleaner with mark-LP-as-unused pushed down into pruning or whether it's better the way it is. Andres seems confident that the change won't suck for performance, which is good, but I'm not quite convinced that it's the right direction to go with the code, and he doesn't seem to be either. Perhaps this all turns on this point: MP> I am planning to add a VM update into the freeze record, at which point MP> I will move the VM update code into lazy_scan_prune(). This will then MP> allow us to consolidate the freespace map update code for the prune and MP> noprune cases and make lazy_scan_heap() short and sweet. Can we see what that looks like on top of this change? -- Robert Haas EDB: http://www.enterprisedb.com