On Tue, Dec 7, 2021 at 12:27 PM Robert Haas <robertmh...@gmail.com> wrote:
> Well... I mean, I think we're almost saying the same thing, then, but
> I think you're saying it more confusingly. I have no objection to
> counting the number of dead HOT chains rather than the number of dead
> tuples, because that's what affects the index contents, but there's no
> need to characterize that as "not the literal truth."

Works for me!

> Sure, but we don't *need* to be less accurate, and I don't think we
> even *benefit* from being less accurate. If we do something like count
> dead HOT chains instead of dead tuples, let's not call that a
> less-accurate count of dead tuples. Let's call it an accurate count of
> dead HOT chains.

Fair enough, but even then we still ultimately have to generate a
final number that represents how close we are to a configurable "do an
autovacuum" threshold (such as an
autovacuum_vacuum_scale_factor-based threshold) -- the autovacuum.c
side of this (the consumer side) fundamentally needs the model to
reduce everything to a one-dimensional number (even though the reality
is that there isn't just one dimension).

This single number (abstract bloat units, abstract dead tuples,
whatever) is a function of things like the count of dead HOT chains,
perhaps the concentration of dead tuples on heap pages, whatever --
but it's not the same thing as any one of those things we count.

I think that this final number needs to be denominated in abstract
units -- we need to call those abstract units *something*. I don't
care what that name ends up being, as long as it reflects reality.

--
Peter Geoghegan
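
To make the "reduce everything to one number" point concrete, here is a
minimal sketch in Python (not PostgreSQL source). The threshold follows
the documented autovacuum formula, base threshold plus scale factor
times reltuples, with the default settings of 50 and 0.2. Everything
else is an assumption made for illustration: the function names, the
`concentration` input, and especially the weighting used to combine the
inputs are hypothetical and not a proposal for the actual model.

```python
def vacuum_threshold(reltuples, base_threshold=50, scale_factor=0.2):
    """Documented autovacuum trigger point: base + scale_factor * reltuples.

    Defaults mirror autovacuum_vacuum_threshold (50) and
    autovacuum_vacuum_scale_factor (0.2).
    """
    return base_threshold + scale_factor * reltuples


def abstract_bloat_units(dead_hot_chains, concentration=0.0,
                         concentration_weight=1.0):
    """Collapse several measurements into one abstract number.

    HYPOTHETICAL weighting: `concentration` (0.0 to 1.0) stands in for
    "how concentrated dead tuples are on heap pages". The formula is an
    arbitrary placeholder; the point is only that the result is a
    function of the inputs, not equal to any one of them.
    """
    return dead_hot_chains * (1.0 + concentration_weight * concentration)


def needs_autovacuum(reltuples, dead_hot_chains, concentration=0.0):
    """Consumer side: compare the single number against the threshold."""
    units = abstract_bloat_units(dead_hot_chains, concentration)
    return units > vacuum_threshold(reltuples)
```

Note that with a nonzero `concentration` the same accurate count of dead
HOT chains yields a different number of abstract units, which is why the
final figure needs a name of its own.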