Hi,

On 2026-02-17 12:16:23 -0500, Peter Geoghegan wrote:
> On Mon, Feb 16, 2026 at 11:48 AM Andres Freund <[email protected]> wrote:
> I agree that the current heuristics (which were invented recently) are
> too conservative. I overfit the heuristics to my current set of
> adversarial queries, as a stopgap measure.

Are you doing any testing on higher latency storage?  I found it to be quite
valuable to use dm_delay to have a disk with reproducible (i.e. not cloud)
higher latency (i.e. not just a local SSD).  Low latency NVMe can reduce the
penalty of not enough readahead so much that it's hard to spot problems...
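
For reference, a minimal sketch of how the table line for such a device can be
put together. It assumes the documented format of the device-mapper "delay"
target (start, length in 512-byte sectors, "delay", device, offset, delay in
ms). The helper name, device path and 5ms delay are placeholders, not something
from this thread:

```python
def delay_table(device: str, sectors: int, delay_ms: int) -> str:
    """Build a device-mapper table line for the "delay" target.

    Assumed format: <start> <length> delay <device> <offset> <delay_ms>
    Length is in 512-byte sectors (e.g. from `blockdev --getsz <device>`).
    With a single device/offset/delay triple the delay applies to all IO.
    """
    if sectors <= 0 or delay_ms < 0:
        raise ValueError("need a positive size and a non-negative delay")
    return f"0 {sectors} delay {device} 0 {delay_ms}"


# Placeholder device and size; the resulting line would be piped into
# `dmsetup create <name>` as root to create the delayed device.
print(delay_table("/dev/sdX", 2097152, 5))
```

The resulting mapping gives the same latency on every run, which is the point
compared to cloud storage.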


> > Note that there is pretty much *no* readahead, because the yields happen more
> > frequently than an io_combine_limit sized IO can be formed.
> 
> ISTM that we need the yields to better cooperate with whatever's
> happening on the read stream side.

Plausible.  We might be able to get away with slowing down the rampup in
potentially problematic cases, without needing the yielding, but I'm not
sure.

If that doesn't work, it might just be sufficient to increase the number of
batches that trigger yields as the scan goes on (perhaps by taking the number
of already "consumed" batches into account).
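
As a toy sketch of that second idea (all names and constants here are made up
for illustration, this is not code from the patch): the yield threshold starts
small and grows with the number of batches the scan has already consumed, up
to a cap.

```python
def batches_before_yield(consumed: int, base: int = 2, step: int = 8,
                         cap: int = 64) -> int:
    """Number of batches the scan may get ahead before yielding.

    Hypothetical heuristic: start at `base`, allow one more batch for
    every `step` batches already consumed, never exceeding `cap`.
    """
    return min(cap, base + consumed // step)


def should_yield(batches_ahead: int, consumed: int) -> bool:
    # Yield once the scan is at least as far ahead as currently allowed.
    return batches_ahead >= batches_before_yield(consumed)
```

A scan that stops early (e.g. LIMIT N) still yields quickly, while a scan that
keeps consuming batches is progressively allowed to look further ahead.  The
linear growth is just one option; something steeper might be needed in
practice.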


To evaluate the amount of wasted work, it could be useful to make the read
stream stats page spit out the amount of "unconsumed" IOs at the end of the
scan.
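
Conceptually that is just "IOs started" minus "IOs consumed" when the scan
ends.  A toy model of the counters, with made-up field names that do not
correspond to anything in the existing stats code:

```python
from dataclasses import dataclass


@dataclass
class StreamStats:
    """Toy counters; the field names are hypothetical."""
    ios_started: int = 0    # IOs issued by the look-ahead
    ios_consumed: int = 0   # IOs whose buffers the scan actually used

    def start_io(self) -> None:
        self.ios_started += 1

    def consume_io(self) -> None:
        if self.ios_consumed >= self.ios_started:
            raise RuntimeError("consumed an IO that was never started")
        self.ios_consumed += 1

    def unconsumed(self) -> int:
        # Wasted work at end of scan: started but never consumed.
        return self.ios_started - self.ios_consumed
```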



> > With the yielding logic disabled:
> 
> > The comment seems to say it's about avoiding looking very far into the
> > future when using index only scans that just need a few heap lookups.
> > Certainly an important goal.
> 
> The main motivation for yielding is to deal with things like merge
> joins fed by at least one plain index scan, and plain scans for an
> "ORDER BY .... LIMIT N" query.

It would be good to document more extensively, in the comment above it, why
the yielding exists...


> I attach an example of where disabling the yield mechanism hurts
> instead of helping, to give you a sense of the problems in this area.

What data/schema is that? Looks kinda but not really TPC-H like.


I assume that there are no mark & restores in the query, given that presumably
the inner side is unique?


Greetings,

Andres Freund

