On Mon, Mar 16, 2026 at 3:08 AM Lukas Fittl <[email protected]> wrote:
>
> I'm 50/50 if hiding this behind a new option really makes sense - if
> its cheap enough to always capture, why not always show it?
>
> e.g. we could consider doing with this what we did with BUFFERS
> recently, which is to enable it by default. If someone finds that too
> visually busy, they could still do IO OFF.

I like this idea.

> This also does make me wonder a bit what we should do about I/O
> timings. Conceptually they'd belong closer to IO now than BUFFERS..

I could see moving the timings to the I/O line.

> > The second line "I/O" is about the I/O requests actually issued - how
> > many times we had to wait for the block (when we get to process it),
> > average size of a request (in BLCKSZ blocks), and average number of
> > in-progress requests.
>
> I wonder if we could somehow consolidate this into one line for the
> text format? (specifically, moving prefetch into "I/O" at the end?)

Next release, I'm hopeful we'll get write combining in and then want
to include write IO details here. Not a reason to avoid making it one
line now, though. However, I think it will be a very long line...

> I'm also not sure if "max" is really that useful, vs capacity?

I find it very helpful. If you keep increasing effective_io_concurrency
and io_combine_limit and don't see the max increasing, I think it is
helpful to know what the configured possible limit would be vs what the
read stream actually got you up to. And because you can set
effective_io_concurrency and io_combine_limit per query, it is nice to
know what this value was at the time the query was run.

> I feel like something is off about the complexity of having each node
> type ferry back the information. e.g. when you're implementing the
> support for index prefetching, it'll require a bunch more changes. In
> my mind, there is a reason we have a related problem that we solved
> with the current pgBufferUsage, instead of dealing with that on a
> per-node basis. I really feel we should have a more generic way of
> dealing with this.

<--snip-->

> I've attached a prototype of how that could look like (apply the other
> patch set first, v8, see commit fest entry [1] - also attached a
> preparatory refactoring of using "Instrumentation" for parallel query
> reporting, which avoids having individual structs there).

The patch footprint is _much_ nicer with your stack-based
instrumentation. Very cool. I'll leave it to Tomas whether he wants to
create a dependency on your big project a few weeks before feature
freeze, though.

> Its also worth noting that this would make it trivial to output this
> information for utility commands that have read stream support, or
> show aggregate statistics in pg_stat_statements/etc.

Yes this would be a major bonus. Even for currently possible users,
this hasn't been extended to TID Range Scan -- which involves more LOC
and boilerplate.

- Melanie
