Hi,

On 2021-02-24 21:41:16 +0100, Dmitry Dolgov wrote:
> I'm curious about control knobs for this feature, it's somewhat related
> to the stats questions also discussed in this thread. I guess most
> important of those are max_aio_in_flight, io_max_concurrency etc., and
> they're going to be hard limits, right?

Yea - there's a lot more work needed in that area.

io_max_concurrency especially really should be a GUC; I was just too
lazy to do that so far.
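
For concreteness, the guc.c entry would presumably be the usual
boilerplate, very roughly along these lines - the context, description
and the default/min/max here are placeholders for illustration, not
decisions:

	{
		{"io_max_concurrency",
			PGC_POSTMASTER,
			RESOURCES_ASYNCHRONOUS,
			gettext_noop("Maximum number of asynchronous IOs a backend can have in flight."),
			NULL
		},
		&io_max_concurrency,
		128, 1, 1024,
		NULL, NULL, NULL
	},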


> I'm curious if it makes sense
> to explore the possibility of having this sort of "backpressure", e.g. if
> the number of inflight requests is too large, calculate inflight_limit a
> bit lower than possible (to avoid a hard performance deterioration when
> the db is trying to do too much IO, and instead degrade smoothly).

It's decidedly nontrivial to compute "too large" - and pretty workload
dependent (e.g. a lower QD is better for latency-sensitive OLTP, a
higher QD is better for bulk r/w heavy analytics). So I don't really
want to go there for now - the project is already very large.

What I do think is needed and feasible (there's a bunch of TODOs in the
code about it already) is to be better at only utilizing deeper queues
when shallower queues don't suffice. So we e.g. don't read ahead more
than a few blocks for a scan where the query is spending most of the
time "elsewhere".

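To make that concrete, what I have in mind is roughly the following kind
of ramp-up logic (purely a sketch with invented names, not code from the
patchset): only grow the readahead distance when the scan demonstrably
had to wait for an IO, and decay it again while blocks keep arriving
before they're needed.

	typedef struct ReadaheadState
	{
		int		distance;		/* current readahead distance in blocks, starts at 1 */
		int		max_distance;	/* upper bound, e.g. from a GUC */
	} ReadaheadState;

	static void
	readahead_adjust(ReadaheadState *ra, bool had_to_wait)
	{
		if (had_to_wait)
		{
			/* the current depth didn't suffice, ramp up quickly */
			ra->distance = Min(ra->distance * 2, ra->max_distance);
		}
		else
		{
			/* the query is spending its time elsewhere, back off slowly */
			ra->distance = Max(ra->distance - 1, 1);
		}
	}
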
There's definitely also some need for a bit better global, instead of
per-backend, control over the number of IOs in flight. That's not too
hard to implement - the hardest part probably is avoiding it becoming a
scalability issue.
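
In case it's not obvious what I mean by global control: essentially a
single counter in shared memory that backends check before submitting an
IO, very roughly like the sketch below (names made up). The scalability
worry is precisely that every submission then touches one contended
cache line, so some per-backend batching of "slots" would likely be
needed.

	static pg_atomic_uint32 *aio_global_inflight;	/* lives in shared memory */

	static bool
	aio_try_start_io(uint32 global_limit)
	{
		uint32		old = pg_atomic_fetch_add_u32(aio_global_inflight, 1);

		if (old >= global_limit)
		{
			/* over the global limit - back out, wait or fall back to sync IO */
			pg_atomic_fetch_sub_u32(aio_global_inflight, 1);
			return false;
		}
		return true;
	}

	static void
	aio_io_completed(void)
	{
		pg_atomic_fetch_sub_u32(aio_global_inflight, 1);
	}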

I think the area with the most need for improvement is figuring out how
we determine the queue depths for the different things using IO. I don't
really want to end up with 30 parameters influencing what queue depth to
use for (vacuum, index builds, sequential scans, index scans, bitmap
heap scans, ...) - but how much they benefit from a deeper queue will
differ between places.


> From what I remember io_uring does have something similar, but only for
> SQPOLL. Another similar question is whether this could be used for
> throttling of some overloaded workers in case of misconfigured clients
> or such?

You mean dynamically? Or just by setting the concurrency lower for
certain users? I think doing so dynamically is way too complicated for
now. But I'd expect configuring it on a per-user basis or such to be a
reasonable thing. That might require splitting it into two GUCs - one
SUSET one and a second one that's settable by any user, but can only
lower the depth.
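
If it did get split up that way, the user-settable GUC's check hook
could simply refuse values above the SUSET cap, something like this
(names entirely made up):

	static bool
	check_io_concurrency(int *newval, void **extra, GucSource source)
	{
		/* any user may lower the depth, but not raise it past the SUSET cap */
		if (*newval > io_max_concurrency)
		{
			GUC_check_errdetail("io_concurrency may not exceed io_max_concurrency (%d).",
								io_max_concurrency);
			return false;
		}
		return true;
	}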

I think it'll be pretty useful to e.g. configure autovacuum to have a
low queue depth instead of using the current cost limiting. That way the
impact on the overall system is limited, but autovacuum isn't slowed
down unnecessarily as much.

Greetings,

Andres Freund

