On Wed, Oct 19, 2022 at 11:19 AM Andres Freund <and...@anarazel.de> wrote:
> We e.g. currently can't track the number of blocks written out in a relation,
> because we don't have a Relation at that point. Nor can't we really get hold
> of one, as the writeback can happen in a different database without access to
> pg_class. Which is also the reason why the per-relation IO stats aren't
> populated by the startup process, even though it'd obviously sometimes be
> helpful to know where the most IO time is spent on a standby.
>
> There's also quite a bit of contortions of the bufmgr interface related to
> this.

This seems related to the difficulty with distinguishing between
internal pages and leaf pages (or some generalized AM-agnostic
definition) in views like pg_statio_*_indexes.

Differentiating between leaf pages and internal pages would definitely
be a big improvement, but it's kind of an awkward thing to implement
[1] because you have to somehow invent the general concept of multiple
distinct kinds of buffers/pages within a relation. A lot of code would
need to be taught about that.

This work would be more likely to actually happen if it was tied to
some bigger project that promised other benefits.

> 2) Split index and table statistics into different types of stats

> This e.g. would allow us keep track of the number of index entries killed via
> the killtuples mechanism, which in turn would allow us to more intelligently
> decide whether we should vacuum indexes (often the most expensive part of
> vacuum). In a lot of workload killtuples takes care of most of the cleanup,
> but in others it doesn't do much.

While I do agree that it would be nice to record information about the
number of deletion operations per index, that information will still
be tricky to interpret and act upon relative to other kinds of
information. As a general rule, we should prefer to focus on signals
that show things really aren't going well in some specific and
unambiguous way. Signals about things that are going well seem harder
to work with -- they don't generalize well.

What I really mean here is this: I think that page split stuff is
going to be much more interesting than index deletion stuff. Index
deletion exists to prevent page splits. So it's natural to ask
questions about where that seems like it ought to have happened, but
didn't actually happen. This likely requires bucketing page splits
into different categories (since most individual page splits aren't
like that at all). Then it becomes much easier to (say) compare
indexes on the same table -- the user can follow a procedure that is
likely to generalize well to many different kinds of situations.

It's not completely clear how the bucketization would work. We ought
to remember how many page splits were caused by INSERT statements
rather than non-HOT UPDATEs, though -- that much seems likely to be
very useful and actionable. The DBA can probably consume this
information in a low context way by looking at the proportions of one
kind of split to the other at the level of each index.

One type of split is mostly just a "cost of doing business" for B-Tree
indexing. The other type really isn't.

> 3) Maintain more historical statistics about vacuuming

> However, none of that allows the user to identify which relations are causing
> autovacuum to not keep up. Even just keeping track of the the total time
> autovacuum has spent on certain relations would be a significant improvement,
> with more easily imaginable (total IO [time], autovacuum delay time, xid age).

With VACUUM in particular the picture over time can be far easier to
work with than any given snapshot, from any single VACUUM operation.
Focusing on how things seem to be changing can make it a lot easier to
spot concerning trends, especially if you're a non-expert.

I would also expect a similar focus on the picture over time to be
useful with the indexing stuff, for roughly the same underlying
reasons.

[1] 
https://postgr.es/m/CAA8Fd-pB=mr42YQuoaLPO_o2=xo9yjnjq23cyjdfwc8sxgm...@mail.gmail.com
--
Peter Geoghegan


Reply via email to