On Tue, Apr 5, 2022 at 6:26 PM Peter Geoghegan <p...@bowt.ie> wrote:
> On a long enough timeline the LP_DEAD items in heap pages are bound to
> become the dominant concern in almost any interesting case for the
> conveyor belt, for the obvious reason: you can't do anything about
> LP_DEAD items without also doing every other piece of processing
> involving those same heap pages. So in that sense, yes, they will be
> the dominant problem at times, for sure.
>
> On the other hand it seems very hard to imagine an interesting
> scenario in which LP_DEAD items are the dominant problem from the
> earliest stage of processing by VACUUM. But even if it was somehow
> possible, would it matter? That would mean that there'd be occasional
> instances of the conveyor belt being ineffective -- hardly the end of
> the world. What has it cost us to keep it as an option that wasn't
> used? I don't think we'd have to do any extra work, other than
> in-memory bookkeeping.
Well, OK, here's what I don't understand. Let's say I insert a tuple and
then delete it. Then time goes by and other things happen, but those
things do not include a heap vacuum. During that time, all transactions
that were in progress at the time of the insert-then-delete complete. At
the end of that time, the number of things that need to be cleaned out
of the heap is exactly 1: there is either a dead line pointer, or, if
the page hasn't been pruned yet, a dead tuple. The number of things that
need to be cleaned out of the index is <= 1, because the index tuple
could have been removed by kill_prior_tuple or bottom-up index deletion,
or it might still be there.

It follows that the number of dead line pointers (or tuples that can be
truncated to dead line pointers) in the heap is always greater than or
equal to the number in the index. All other things being equal, that
means the heap is always in trouble before the index is. Maybe all
things are not equal, but I don't know why that should be so. It feels
like the index has opportunistic cleanup mechanisms that can completely
eliminate index tuples, while the heap can at best replace dead tuples
with dead line pointers, which still consume some resources. And if
that's the case, then doing more index vacuum cycles than heap vacuum
cycles really isn't a sensible thing to do. You seem to think it is,
though... what am I missing?

-- 
Robert Haas
EDB: http://www.enterprisedb.com
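[Editor's note: the counting argument above can be sketched as a toy
model. This is purely illustrative bookkeeping, not PostgreSQL code; the
function names mirror the mechanisms mentioned in the email but are
invented for this sketch.]

```python
# Toy model of the argument: each insert-then-delete eventually leaves
# one dead item in the heap and at most one in the index. Opportunistic
# index cleanup (kill_prior_tuple, bottom-up deletion) can remove index
# tuples outright, while heap pruning can only truncate a dead tuple to
# an LP_DEAD line pointer that lingers until a heap vacuum. So the heap's
# dead-item count is always >= the index's.

heap_dead = 0    # dead tuples or LP_DEAD line pointers in heap pages
index_dead = 0   # dead index tuples still physically present

def insert_then_delete():
    """Once all transactions that could see the tuple finish, both the
    heap and the index carry one dead entry for it."""
    global heap_dead, index_dead
    heap_dead += 1
    index_dead += 1

def prune_heap_page():
    """Opportunistic pruning truncates dead tuples to LP_DEAD line
    pointers; the count of dead heap items does not go down."""
    pass  # heap_dead unchanged: a dead tuple becomes a dead line pointer

def opportunistic_index_cleanup(n):
    """kill_prior_tuple / bottom-up deletion remove index tuples fully."""
    global index_dead
    index_dead -= min(n, index_dead)

insert_then_delete()
insert_then_delete()
prune_heap_page()
opportunistic_index_cleanup(1)

# The claimed invariant: the heap always has at least as many dead
# items awaiting vacuum as the index does.
assert heap_dead >= index_dead
print(heap_dead, index_dead)  # 2 1
```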