On 3/28/24 06:20, Thomas Munro wrote:
> With the unexplained but apparently somewhat systematic regression
> patterns on certain tests and settings, I wonder if they might be due
> to read_stream.c trying to form larger reads, making it a bit lazier.
> It tries to see what the next block will be before issuing the
> fadvise. I think that means that with small I/O concurrency settings,
> there might be contrived access patterns where it loses, and needs
> effective_io_concurrency to be set one notch higher to keep up, or
> something like that.

Yes, I think we've speculated this might be the root cause before, but
IIRC we didn't manage to verify it actually is the problem.

FWIW I don't think the tests use synthetic data, but I don't think it's
particularly contrived.

> One way to test that idea would be to run the
> tests with io_combine_limit = 1 (meaning 1 block). It issues advise
> eagerly when io_combine_limit is reached, so I suppose it should be
> exactly as eager as master. The only difference then should be that
> it automatically suppresses sequential fadvise calls.

Sure, I'll give that a try. What are some good values to test? Perhaps
32 and 1, i.e. the default and "no coalescing"?

If this turns out to be the problem, does that mean we would consider
using a more conservative default value? Is there some "auto tuning" we
could do? For example, could we reduce the combine limit if we start
not finding buffers in memory, or something like that?

I recognize this may not be possible with buffered I/O, because we have
no insight into the page cache. And maybe it's misguided anyway, because
how would we know if the right response is to increase or reduce the
combine limit?

Anyway, doesn't the combine limit work against the idea that
effective_io_concurrency is "prefetch distance"? With eic=32 I'd expect
we issue prefetch 32 pages ahead, i.e. if we prefetch page X, we should
then process 32 pages before we actually need X (and we expect the page
to already be in memory, thanks to the gap). But with the combine limit
set to 32, is this still true?

I've tried going through read_stream_* to determine how this will
behave, but read_stream_look_ahead/read_stream_start_pending_read do
not make this very clear. I'll have to experiment with some tracing.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
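
For reference, a minimal sketch of the settings matrix for the runs discussed above. The io_combine_limit values (1 and 32) are the ones mentioned in this message; the effective_io_concurrency values and the helper name guc_matrix are only assumptions for illustration, not what the benchmark scripts actually use.

```python
from itertools import product

# Values mentioned in the message: 1 block ("no coalescing") and 32 blocks.
IO_COMBINE_LIMITS = [1, 32]
# Assumed values for illustration; only eic=32 appears in the message.
EFFECTIVE_IO_CONCURRENCY = [1, 32]


def guc_matrix(combine_limits, eic_values):
    """Return one list of SET statements per (io_combine_limit, eic) pair."""
    runs = []
    for limit, eic in product(combine_limits, eic_values):
        runs.append([
            f"SET io_combine_limit = {limit};",
            f"SET effective_io_concurrency = {eic};",
        ])
    return runs


if __name__ == "__main__":
    # Print one line of session settings per benchmark run.
    for statements in guc_matrix(IO_COMBINE_LIMITS, EFFECTIVE_IO_CONCURRENCY):
        print(" ".join(statements))
```

Each printed line would be issued at the start of a session before running the same query set, so that results for "no coalescing" can be compared against the larger limit at each concurrency level.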