On 19.12.2025 at 17:43, Willy Tarreau wrote:
On Fri, Dec 19, 2025 at 05:20:08PM +0100, Hank Bowen wrote:
However, having touched on the subject of HTTP/2 - I'm wondering whether I
understand correctly why, with http-reuse set to "aggressive" or
"always", one client can cause a head-of-line blocking problem for the
rest of the clients.

Yes in theory, though since 3.1 or so, it has been significantly mitigated
by the fact that we now support a dynamic Rx buffer size and we advertise
only the allocated size. Prior to 3.1 we'd advertise 64kB despite a 16kB
per-stream buffer by default. Most of the time it would be fine thanks to
other internal buffering, but not always. But HoL blocking is inherent to H2,
and window sizing is always a trade-off between HoL risk and BDP
(bandwidth-delay product).
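Checking the BDP arithmetic for myself (a back-of-the-envelope of my own,
assuming a 100 ms round-trip, not anything haproxy computes):

    # Rough arithmetic only: the throughput ceiling a fixed receive
    # window imposes, and the window a given path would need instead.
    window = 16 * 1024      # bytes in flight per round-trip (16 kB)
    rtt = 0.100             # assumed round-trip time: 100 ms

    # At most one full window can be delivered per round-trip:
    print(window / rtt / 1024)      # 160.0 kB/s throughput ceiling

    # Window needed to fill a 100 Mbit/s path at that RTT (the BDP):
    bandwidth = 100e6 / 8           # bytes per second
    print(bandwidth * rtt / 1024)   # ~1220.7 kB, far more than 16 kB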

OK, I generally understand that, i.e. that there was overadvertising, which was counterproductive, but if you have time I'd like to learn the details. I write "if you have time" on purpose, as I'm asking this more out of curiosity than from a real need.

That is, if we advertise a 64 kB Rx buffer size, is it a problem when a stream established between the client and haproxy has a 16 kB buffer, or when a stream established between haproxy and a backend server has such a value (or do both cases cause inefficiency)? Is this buffer related to some setting that haproxy exposes? After glancing at the possible settings I cannot tell which one could correspond to it; it looks like it could be tune.h2.max-frame-size, but I'm not sure.

How exactly does advertising an Rx buffer size larger than the stream's buffer cause a problem? I suppose we then have to deal with a situation like the following (I constructed this after questioning ChatGPT on the matter; I don't know how much it can be trusted here, but its explanation looks reasonable): haproxy advertises an Rx buffer size of 64 kB, but the buffer on a stream is only 16 kB. Haproxy downloads 64 kB for that stream, exhausting the HTTP/2 connection-level window, so it cannot download any more data. Note that it only makes sense for haproxy to send WINDOW_UPDATE to the server after the client has read the data (otherwise haproxy would eventually overflow its own buffers). Haproxy passes the first 16 kB to the stream and waits for the client to read it; the client does eventually consume all 16 kB, but the process is slow, and while it lasts, downloading data for the other streams is blocked (it always is, but the point here is that it stays blocked for a long time). Only after the client completes that read does haproxy send WINDOW_UPDATE to the server, and only then does it fetch (just) 16 kB for another stream (the stream given by the next frame in the sequence). Does my description correspond to reality?
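To make the scenario above concrete, this is how I would model the
accounting (a toy Python model of my own, with made-up sizes and names,
so please correct it where it diverges from what haproxy really does):

    # Toy model: one 64 kB connection-level window shared by all
    # streams, but only 16 kB of per-stream buffer actually allocated.
    CONN_WINDOW = 64 * 1024
    STREAM_BUF = 16 * 1024

    conn_window = CONN_WINDOW
    streams = {"slow": 0, "fast": 0}    # bytes pending per stream

    def server_sends(name, nbytes):
        # The server may only send while the connection window is open.
        global conn_window
        sent = min(nbytes, conn_window)
        conn_window -= sent
        streams[name] += sent
        return sent

    def client_reads(name, nbytes):
        # Only once the client consumes data does the proxy return the
        # credit to the server via WINDOW_UPDATE.
        global conn_window
        nread = min(nbytes, streams[name], STREAM_BUF)
        streams[name] -= nread
        conn_window += nread
        return nread

    server_sends("slow", 64 * 1024)         # 64 kB queued for slow client
    print(server_sends("fast", 16 * 1024))  # 0: window exhausted, blocked
    client_reads("slow", 16 * 1024)         # slow client drains 16 kB
    print(server_sends("fast", 16 * 1024))  # 16384: only now can fast go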

Is it that the given connection's TCP buffer then significantly fills up,
and when other (fast) clients ask to download data, haproxy can only
download as much as the remaining space in its TCP buffer allows, which is
little, so it must perform the operation as many separate downloads, each
carrying some overhead (compared to the situation where all the data could
be downloaded into haproxy's TCP buffer at once)?

The issue is that if you aggregate a slow and a fast reader into the same
connection, and the connection is filled with data for the slow reader,
there's no way to make the data for the fast reader bypass it since TCP
is in-order (something that QUIC addresses).

Well, I'm not sure I properly understand it. That is, under normal circumstances, i.e. in a direct client <-> server interaction, the HTTP/2 frames must indeed be sent from server to client in order, and the TCP packets naturally have to be sent in order too. And in a client <-> proxy <-> servers setup, the frames from a given backend server also have to be transmitted (to the proxy) in order, and so do the TCP packets. But then haproxy can, as you said below, send the n+1-th frame to client B without waiting for the n-th frame to be sent to client A (assuming, of course, that clients A and B are different). So there is a sort of bypassing. Does what I wrote about the TCP buffer and the overhead of multiple downloads (instead of quite possibly just one) hold true?
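Here is how I picture that "bypassing", if it helps to be concrete (a toy
sketch of my own, not haproxy code): frames arrive strictly in order on
the single server-side TCP connection, but since each one is forwarded on
a different client connection, client B never waits behind client A:

    from collections import deque

    # Frames as they arrive, in order, on the server-side connection:
    frames_from_server = deque([("A", "frame n"), ("B", "frame n+1")])

    # Each client has its own connection, hence its own independent queue:
    client_queues = {"A": deque(), "B": deque()}

    while frames_from_server:
        client, frame = frames_from_server.popleft()
        # Forwarding to B does not wait for A's socket to drain:
        client_queues[client].append(frame)

    print(client_queues["B"])   # B already has "frame n+1", however slow A is

    # Within ONE client's TCP connection, though, delivery stays in
    # order, which is the HoL constraint described above (and what QUIC
    # relaxes with its independent streams).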

If we have a sequence of frames from the server and they are for different
clients, haproxy does not have to wait for the n-th frame to be sent to a
client in order to send the n+1-th frame to another client, am I right?

That's it, and you just cannot realistically do that, otherwise you send only
one frame per network round-trip, which can limit the connection's performance
to 16kB per round-trip, e.g. 160kB per second for 100ms. But as explained,
with 3.1+ and dynamic buffers we can now modulate what we advertise and do
our best to adjust to the number of readers on the same connection. However,
slow readers will still reserve a number of buffers that will possibly be
under-utilized and not usable by faster ones. But that's a minor issue
compared to the initial one.
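If I follow, the modulation is conceptually something like the following
(this is only my guess at the principle, certainly not haproxy's actual
policy or code):

    # Hypothetical: split a fixed Rx buffer budget across the streams
    # actively receiving, never advertising less than one frame's worth.
    def per_stream_advert(total_budget, nb_readers, floor=16 * 1024):
        return max(floor, total_budget // max(nb_readers, 1))

    for n in (1, 2, 4, 16):
        print(n, per_stream_advert(256 * 1024, n))
    # 1 -> 262144, 2 -> 131072, 4 -> 65536, 16 -> 16384 (the floor)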

I also have some more questions, although I'm not sure whether it's best to
send them here or to create a new topic; they are rather closely related to
this discussion.

If they're related, let's keep going on this thread ;-)

Willy

