Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?

Peter Geoghegan Tue, 07 Dec 2021 12:44:38 -0800

On Tue, Dec 7, 2021 at 12:27 PM Robert Haas <[email protected]> wrote:
> Well... I mean, I think we're almost saying the same thing, then, but
> I think you're saying it more confusingly. I have no objection to
> counting the number of dead HOT chains rather than the number of dead
> tuples, because that's what affects the index contents, but there's no
> need to characterize that as "not the literal truth."


Works for me!

> Sure, but we don't *need* to be less accurate, and I don't think we
> even *benefit* from being less accurate. If we do something like count
> dead HOT chains instead of dead tuples, let's not call that a
> less-accurate count of dead tuples. Let's call it an accurate count of
> dead HOT chains.

Fair enough, but even then we still ultimately have to generate a
final number that represents how close we are to a configurable "do an
autovacuum" threshold (such as an autovacuum_vacuum_scale_factor-based
threshold) -- the autovacuum.c side of this (the consumer side)
fundamentally needs the model to reduce everything to a
one-dimensional number (even though the reality is that there isn't just
one dimension). This single number (abstract bloat units, abstract
dead tuples, whatever) is a function of things like the count of dead
HOT chains, perhaps the concentration of dead tuples on heap pages,
whatever -- but it's not the same thing as any one of those things we
count.
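
To make the reduction concrete, here is a minimal Python sketch, not
anything from an actual patch. The threshold side follows the documented
autovacuum rule (autovacuum_vacuum_threshold +
autovacuum_vacuum_scale_factor * reltuples, with the stock defaults of 50
and 0.2). Everything on the producer side is an assumption made up for
illustration: the TableStats fields, the abstract_bloat_units() function,
and its page_weight parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    reltuples: float        # estimated tuples in the table (as in pg_class.reltuples)
    dead_hot_chains: int    # hypothetical counter: fully dead HOT chains
    pages_with_dead: int    # hypothetical counter: heap pages holding dead items


def vacuum_threshold(reltuples: float,
                     base_threshold: int = 50,
                     scale_factor: float = 0.2) -> float:
    """Documented autovacuum rule: base threshold plus scale factor times reltuples."""
    return base_threshold + scale_factor * reltuples


def abstract_bloat_units(stats: TableStats, page_weight: float = 1.0) -> float:
    """Hypothetical model: collapse several counters into one number.

    The result is a function of the inputs, not a count of any one of them.
    The linear form and the weight are arbitrary placeholders.
    """
    return stats.dead_hot_chains + page_weight * stats.pages_with_dead


def needs_autovacuum(stats: TableStats) -> bool:
    """Consumer side: compare the single number against the threshold."""
    return abstract_bloat_units(stats) > vacuum_threshold(stats.reltuples)
```

The consumer only ever sees the output of abstract_bloat_units(), so the
model behind it can change (different inputs, different weighting) without
touching the comparison against the configured threshold.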

I think that this final number needs to be denominated in abstract
units -- we need to call those abstract units *something*. I don't
care what that name ends up being, as long as it reflects reality.

-- 
Peter Geoghegan
