Re: EXPLAIN: showing ReadStream / prefetch stats

Melanie Plageman Mon, 16 Mar 2026 10:29:47 -0700

On Mon, Mar 16, 2026 at 3:08 AM Lukas Fittl <[email protected]> wrote:
>
> I'm 50/50 if hiding this behind a new option really makes sense - if
> its cheap enough to always capture, why not always show it?
>
> e.g. we could consider doing with this what we did with BUFFERS
> recently, which is to enable it by default. If someone finds that too
> visually busy, they could still do IO OFF.


I like this idea.

> This also does make me wonder a bit what we should do about I/O
> timings. Conceptually they'd belong closer to IO now than BUFFERS..

I could see moving the timings to the I/O line.

> > The second line "I/O" is about the I/O requests actually issued - how
> > many times we had to wait for the block (when we get to process it),
> > average size of a request (in BLCKSZ blocks), and average number of
> > in-progress requests.
>
> I wonder if we could somehow consolidate this into one line for the
> text format? (specifically, moving prefetch into "I/O" at the end?)

Next release, I'm hopeful we'll get write combining in and then want
to include write IO details here. Not a reason to avoid making it one
line now, though. However, I think it will be a very long line...

> I'm also not sure if "max" is really that useful, vs capacity?

I find it very helpful. If you keep increasing
effective_io_concurrency and io_combine_limit and don't see the max
increasing, I think it is helpful to know what the configured possible
limit would be vs what the read stream actually got you up to. And
because you can set effective_io_concurrency and io_combine_limit per
query, it is nice to know what this value was at the time the query
was run.

> I feel like something is off about the complexity of having each node
> type ferry back the information. e.g. when you're implementing the
> support for index prefetching, it'll require a bunch more changes. In
> my mind, there is a reason we have a related problem that we solved
> with the current pgBufferUsage, instead of dealing with that on a
> per-node basis. I really feel we should have a more generic way of
> dealing with this.
<--snip-->
> I've attached a prototype of how that could look like (apply the other
> patch set first, v8, see commit fest entry [1] - also attached a
> preparatory refactoring of using "Instrumentation" for parallel query
> reporting, which avoids having individual structs there).

The patch footprint is _much_ nicer with your stack-based
instrumentation. Very cool. I'll leave it to Tomas whether he wants to
create a dependency on your big project a few weeks before feature
freeze, though.

> Its also worth noting that this would make it trivial to output this
> information for utility commands that have read stream support, or
> show aggregate statistics in pg_stat_statements/etc.

Yes this would be a major bonus. Even for currently possible users,
this hasn't been extended to TID Range Scan -- which involves more LOC
and boilerplate.

- Melanie

Re: EXPLAIN: showing ReadStream / prefetch stats

Reply via email to