> But read latency is still something like 30ms which I would think would be
> much higher if it's saturated.

No. You're using stress, so you have some total cap on concurrency.
Given a fixed concurrency, you'll saturate at some particular average
latency, which is mostly a function of the backlog implied by the
concurrency and the average time each request actually takes to
process.

If you double the concurrency of your stress clients, you should
expect roughly twice the average latency.
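
To put rough numbers on that: by Little's law, the number of requests
in flight equals throughput times average latency. Once the cluster is
saturated, throughput is pinned at whatever it can sustain, so average
latency is about concurrency divided by that throughput. A minimal
sketch follows; the 1000 requests/s capacity is a made-up figure for
illustration, not something measured on your cluster, and the function
name is mine.

```python
def saturated_latency_ms(concurrency: int, capacity_rps: float) -> float:
    """Average latency (ms) at saturation, from Little's law.

    in_flight = throughput * latency, with throughput pinned at capacity,
    so latency = concurrency / capacity.
    """
    return concurrency / capacity_rps * 1000.0


# Hypothetical sustained capacity of the cluster, in requests per second.
CAPACITY_RPS = 1000.0

for clients in (30, 60, 120):
    latency = saturated_latency_ms(clients, CAPACITY_RPS)
    print(f"{clients:4d} concurrent requests -> ~{latency:.0f} ms average latency")
```

With these assumed numbers, 30 concurrent requests come out at about
30ms, and each doubling of concurrency doubles the latency while
throughput stays flat.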

Running a fixed-concurrency benchmark against a saturated cluster will
have vastly different effects on latency than trying to serve real
live traffic, with no feedback mechanism, on a system that is
processing fewer requests than are coming in.
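
The difference can be seen in a toy model. This is only a sketch under
strong assumptions: a single FIFO server with a fixed 1ms service time
stands in for the whole cluster, and the client count and arrival rate
are invented. The closed-loop case behaves like stress (a client sends
its next request only after the previous one completes); the open-loop
case behaves like live traffic arriving at a fixed rate regardless of
how far behind the server is.

```python
from collections import deque


def closed_loop_latencies(clients, service_s, total_requests):
    """Fixed concurrency: each client resubmits as soon as its request completes."""
    pending = deque([0.0] * clients)  # submit times, in FIFO order
    server_free = 0.0
    latencies = []
    for _ in range(total_requests):
        submitted = pending.popleft()
        start = max(server_free, submitted)
        server_free = start + service_s
        latencies.append(server_free - submitted)
        pending.append(server_free)  # the client's next request
    return latencies


def open_loop_latencies(arrival_rps, service_s, total_requests):
    """No feedback: requests arrive at a fixed rate no matter how long the queue is."""
    server_free = 0.0
    latencies = []
    for i in range(total_requests):
        arrived = i / arrival_rps
        start = max(server_free, arrived)
        server_free = start + service_s
        latencies.append(server_free - arrived)
    return latencies


SERVICE_S = 0.001  # assumed 1 ms per request, i.e. capacity of 1000 requests/s

closed = closed_loop_latencies(clients=30, service_s=SERVICE_S, total_requests=6000)
opened = open_loop_latencies(arrival_rps=1200.0, service_s=SERVICE_S, total_requests=6000)

print(f"closed loop, last request: {closed[-1] * 1000:.1f} ms")
print(f"open loop, request 3000:   {opened[2999] * 1000:.1f} ms")
print(f"open loop, last request:   {opened[-1] * 1000:.1f} ms")
```

In the closed-loop run latency settles at clients times service time
and stays there. In the open-loop run, with arrivals 20% above
capacity, the queue never drains and latency keeps growing for as long
as the overload lasts.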

-- 
/ Peter Schuller
