Tue, 2 Feb 2021 at 05:27, Peter Geoghegan <p...@bowt.ie>:

> And now here is the second thing I thought of, which is much better:
>
> Sometimes 1% of the dead tuples in a heap relation will be spread
> across 90%+ of the pages. With other workloads 1% of dead tuples might
> be highly concentrated, and appear in no more than 1% of all heap
> pages. Obviously the distinction between these two cases/workloads
> matters a lot. And so the triggering criteria must be quantitative
> *and* qualitative. It should not be based on counting dead tuples,
> since that alone won't differentiate these two extreme cases - both of
> which are probably quite common (in the real world extremes are
> actually the normal and common case IME).
>
> I like the idea of basing it on counting *heap blocks*, not dead
> tuples. We can count heap blocks that have *at least* one dead tuple
> (of course it doesn't matter how they're dead, whether it was this
> VACUUM operation or some earlier opportunistic pruning). Note in
> particular that it should not matter if it's a heap block that has
> only one LP_DEAD line pointer or a heap page that is near the
> MaxHeapTuplesPerPage limit for the page -- we count either type of
> page towards the heap-page based limit used to decide if index
> vacuuming goes ahead for all indexes during VACUUM.
>

I really like this idea!

It resembles the approach used in bottom-up index deletion: block-based
accounting gives a better estimate of how useful the operation will be.
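
To make the difference concrete, here is a rough sketch (not a patch, just
an illustration in Python) of the two extreme workloads described above.
The function names, the 1,000-page table, and the rule that index vacuuming
goes ahead once the fraction of affected pages reaches the threshold are all
my assumptions for the example, not anything taken from the VACUUM code.

```python
from typing import Sequence


def fraction_of_pages_with_dead_items(dead_per_page: Sequence[int]) -> float:
    """Fraction of heap pages that hold at least one dead item.

    A page with one dead item and a page that is nearly full of them
    each count exactly once.
    """
    if not dead_per_page:
        return 0.0
    pages_with_dead = sum(1 for n in dead_per_page if n > 0)
    return pages_with_dead / len(dead_per_page)


def should_vacuum_indexes(dead_per_page: Sequence[int],
                          page_threshold: float = 0.01) -> bool:
    """Hypothetical trigger: go ahead once enough pages are affected."""
    return fraction_of_pages_with_dead_items(dead_per_page) >= page_threshold


# Two made-up tables, each with 1,000 heap pages and 1,000 dead items.
spread = [2] * 100 + [1] * 800 + [0] * 100   # dead items on 90% of pages
concentrated = [200] * 5 + [0] * 995         # dead items on 0.5% of pages

for name, table in (("spread", spread), ("concentrated", concentrated)):
    print(name, sum(table),
          fraction_of_pages_with_dead_items(table),
          should_vacuum_indexes(table))
```

A trigger based on the dead tuple count sees the same number (1,000) for
both tables, while the page-based fraction separates them cleanly.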

I suppose that 1% threshold should be configurable as a cluster-wide GUC
and also as a table storage parameter?
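
What I have in mind is the usual precedence that the existing autovacuum
settings (e.g. autovacuum_vacuum_scale_factor) already follow: a per-table
storage parameter, when set, overrides the cluster-wide value. A minimal
sketch of that lookup, with hypothetical names and a placeholder 1% default
(no such setting exists today):

```python
from typing import Optional

# Hypothetical cluster-wide default; the 1% figure is only a placeholder.
CLUSTER_PAGE_THRESHOLD = 0.01


def effective_page_threshold(
    table_setting: Optional[float],
    cluster_setting: float = CLUSTER_PAGE_THRESHOLD,
) -> float:
    """Return the per-table value if one is set, else the cluster-wide one.

    None means "not set for this table"; an explicit 0.0 is a valid
    per-table value and must not fall back to the cluster default.
    """
    if table_setting is None:
        return cluster_setting
    return table_setting
```

Testing against None rather than truthiness matters here, since a table
could reasonably set the threshold to 0.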


-- 
Victor Yegorov
