Thanks for bringing this up, Kazuho. My back-of-the-envelope math also indicated that 1/3 was a better value than 1/2 when I looked into it a few years ago, but I never constructed a clean test to prove it with a real congestion controller. Unfortunately, our congestion control simulator sits below our flow control layer.
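For concreteness, the back-of-the-envelope arithmetic can be checked in units of the initial window W, assuming the model from Kazuho's example below (the receiver doubles its window on every update, and the sender is in idealized slow start). This is a sketch, not any stack's actual logic:

// Back-of-the-envelope check, in units of the initial window W, of the
// two MAX_DATA thresholds, under the assumed model: the receiver doubles
// its window on every update, and the sender's cwnd doubles (W -> 2W)
// once the first W bytes are ACKed.
fn main() {
    let w = 1.0_f64;

    // Threshold 1/2: the update fires at 0.5W consumed, advertising
    // MAX_DATA = 0.5W + 2W = 2.5W. When W is ACKed, cwnd allows 2W in
    // flight, but only 2.5W - W = 1.5W of credit remains: blocked.
    let credit_half = (0.5 * w + 2.0 * w) - w;
    assert!(credit_half < 2.0 * w);

    // Threshold 1/3: updates fire at W/3 (MAX_DATA = W/3 + 2W) and again
    // at W consumed (MAX_DATA = W + 4W = 5W), so when cwnd reaches 2W
    // there is 5W - W = 4W of credit: not blocked.
    let credit_third = 5.0 * w - w;
    assert!(credit_third >= 2.0 * w);

    println!("1/2 leaves {credit_half}W of credit; 1/3 leaves {credit_third}W");
}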
It probably makes sense to test this in the real world and see whether reducing it to 1/3 measurably reduces the number of blocked frames we receive on our servers.

There are use cases where auto-tuning is nice. Even for Chrome, there are cases where, had we started with a smaller stream flow control window, we would have avoided some bugs in which a few streams consumed the entire connection flow control window.

Thanks,
Ian

On Tue, Oct 14, 2025 at 8:49 AM Kazuho Oku <[email protected]> wrote:
>
> On Tue, Oct 14, 2025 at 19:36, Max Inden <[email protected]> wrote:
>
>> * Send MAX_DATA / MAX_STREAM_DATA no later than when 33% (i.e., 1/3) of
>> the credit is consumed.
>>
>> Firefox will send MAX_STREAM_DATA after 25% of the credit has been
>> consumed.
>>
>> https://github.com/mozilla/neqo/blob/791fd40fb7e9ee4599c07c11695d1849110e704b/neqo-transport/src/fc.rs#L30-L37
>>
>> * Instead of doubling (x2) the window size, increase it by a larger
>> factor (e.g., x4).
>>
>> Firefox will increase the window by up to 4x the overshoot of the
>> current BDP estimate.
>
> Good to know that Firefox uses these numbers. They look fine to me,
> though depending on the size of the initial credit, Careful Resume might
> get blocked.
>
>> https://github.com/mozilla/neqo/blob/791fd40fb7e9ee4599c07c11695d1849110e704b/neqo-transport/src/fc.rs#L402-L409
>>
>> * Disable auto tuning entirely (it's needed only for latency-sensitive
>> applications).
>>
>> What would be a reasonable one-size-fits-all stream data window size,
>> which at the same time doesn't expose the receiver to a memory
>> exhaustion attack?
>>
>> Because it is difficult to estimate the sender's initial window and how
>> quickly it ramps up - especially with algorithms like Careful Resume,
>> which don't use Slow Start - my preference is to disable auto tuning by
>> default.
>>
>> Wouldn't a high but reasonable start value + window auto-tuning be
>> ideal?
>
> Yeah, I think there's often confusion between two distinct aspects:
> a) the maximum buffer size that the receiver can allocate, and
> b) how fast the sender might transmit.
>
> A is what receivers need to prevent memory-exhaustion attacks. It's
> purely a local policy: the limit might be 1 MB or 10 MB, but it's
> unrelated to B - that is, it doesn't depend on how quickly the sender
> sends.
>
> For latency-sensitive applications that read slowly, it's important to
> cap the receive buffer at roughly read_speed × latency, because
> otherwise bufferbloat increases latency. But again, that consideration
> is separate from B.
>
> In my view, B mainly concerns minimizing the amount of memory allocated
> inside the kernel. Kernel-space memory management is far more
> constrained than in user space: allocations often have to be contiguous,
> and falling back to swap is not an option. Note also that the TCP/IP
> stack is decades old, from an era when memory was a much more precious
> resource than it is today.
>
> In contrast, a QUIC stack running in user space can rely on virtual
> memory, where fragmentation is rarely a real issue. When a user-space
> buffer fills up, the program can simply call realloc() and append data,
> possibly incurring operations such as virtual-memory remapping or
> paging. There is no need to pre-reserve large contiguous chunks of
> memory.
>
> To summarize, there is far less need in QUIC, if any, to minimize the
> receive window advertised to the peer, compared to what was necessary
> for in-kernel TCP.
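As a toy illustration of the read_speed × latency cap Kazuho describes above (a hypothetical helper, not taken from any stack):

use std::time::Duration;

// Sketch of the cap described above: a latency-sensitive receiver that
// drains data at `read_bytes_per_sec` never benefits from buffering more
// than one round trip's worth of data; a larger window only adds
// queueing delay (bufferbloat).
fn receive_window_cap(read_bytes_per_sec: u64, rtt: Duration) -> u64 {
    (read_bytes_per_sec as u128 * rtt.as_millis() / 1000) as u64
}

fn main() {
    // For example, an application reading 10 MB/s over a 100 ms path
    // needs only about 1 MB of receive window.
    let cap = receive_window_cap(10_000_000, Duration::from_millis(100));
    assert_eq!(cap, 1_000_000);
    println!("cap = {cap} bytes");
}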
>
>> On 14/10/2025 03.38, Kazuho Oku wrote:
>>
>> On Mon, Sep 29, 2025 at 16:28, Max Inden <[email protected]> wrote:
>>
>>> For what it is worth, also referencing previous discussion on this
>>> list:
>>>
>>> "Why isn't QUIC growing?"
>>> https://mailarchive.ietf.org/arch/msg/quic/RBhFFY3xcGRdBEdkYmTK2k926mQ/
>>
>> Reading the old thread, I'm reminded that people often assume QUIC
>> performs better than TCP. However, that is true only when the QUIC
>> stack is implemented, configured, and deployed correctly.
>>
>> One bug I've seen in multiple stacks - one that significantly affects
>> benchmark results - is the failure to auto-tune the receive window as
>> aggressively as the sender's Slow Start allows.
>>
>> Based on my understanding, Google Quiche implements receive window
>> auto-tuning as follows:
>> * Send MAX_DATA / MAX_STREAM_DATA when 50% of the credit has been
>> consumed.
>> * Double the window size when these frames are sent frequently.
>>
>> Several other stacks have adopted this approach.
>>
>> The problem with this logic is that it's too conservative and causes
>> the sender to become flow-control-blocked during Slow Start.
>>
>> Consider the following example:
>> 1. The receiver advertises an initial Maximum Data of W.
>> 2. After receiving 0.5W bytes, the receiver sends Maximum Data=2.5W
>> along with ACKs up to W/2. The next Maximum Data will be sent once the
>> receiver has received 1.5W bytes.
>> 3. The receiver receives bytes up to W and ACKs them.
>> 4. At this point, the sender's Slow Start permits transmission up to 2W
>> bytes, but the advertised receive window is only 1.5W. As a result, the
>> connection becomes flow-control-blocked.
>>
>> There are several ways to address this issue:
>> * Send MAX_DATA / MAX_STREAM_DATA no later than when 33% (i.e., 1/3)
>> of the credit is consumed.
>> * Instead of doubling (x2) the window size, increase it by a larger
>> factor (e.g., x4).
>> * Disable auto tuning entirely (it's needed only for latency-sensitive
>> applications).
>>
>> Because it is difficult to estimate the sender's initial window and how
>> quickly it ramps up - especially with algorithms like Careful Resume,
>> which don't use Slow Start - my preference is to disable auto tuning by
>> default.
>>
>> In fact, this is also the choice made by Chromium, which is why it is
>> not affected by this bug!
>>
>> For reference, Tatsuhiro addressed this issue in ngtcp2 in the
>> following PRs:
>> * https://github.com/ngtcp2/ngtcp2/pull/1396 - Tweak threshold for
>> max_stream_data and max_data transmission
>> * https://github.com/ngtcp2/ngtcp2/pull/1397 - Add note for window
>> auto-tuning
>> * https://github.com/ngtcp2/ngtcp2/pull/1398 - examples/client: Disable
>> window auto-tuning by default
>>
>> However, I suspect the bug may still exist in other stacks.
>>
>> On 29/09/2025 05.38, Lars Eggert wrote:
>>>
>>> Hi,
>>>
>>> pitch for a discussion at 124.
>>>
>>> https://radar.cloudflare.com/adoption-and-usage?dateRange=52w and
>>> similar stats have had H3 around 30% for a few years now, with little
>>> change since the initial quick ramp up to that level.
>>>
>>> Topic: why is that, and is there anything the WG or IETF can do to
>>> change it (upwards, of course)?
>>>
>>> Thanks,
>>> Lars
>>> --
>>> Sent from a mobile device; please excuse typos.
>>
>> --
>> Kazuho Oku
>
> --
> Kazuho Oku
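For readers skimming the quoted thread: the auto-tuning logic being tuned boils down to a decision of when to re-advertise credit. A minimal sketch (hypothetical names, not quiche's or ngtcp2's actual code; window growth is omitted to isolate the threshold):

struct ReceiveWindow {
    window: u64,   // current auto-tuned window size, in bytes
    max_data: u64, // highest limit advertised to the peer so far
    consumed: u64, // total bytes consumed by the application
}

impl ReceiveWindow {
    // Called as the application consumes `n` more bytes; returns
    // Some(new_limit) when a MAX_DATA frame should be sent. den = 2 is
    // the "update at 1/2" rule; den = 3 is the proposed "no later than
    // 1/3".
    fn on_consume(&mut self, n: u64, den: u64) -> Option<u64> {
        self.consumed += n;
        // Credit the sender can still use.
        let available = self.max_data - self.consumed;
        // Re-advertise once at least window/den of the credit is gone.
        if available * den <= self.window * (den - 1) {
            self.max_data = self.consumed + self.window;
            return Some(self.max_data);
        }
        None
    }
}

fn main() {
    let mut fc = ReceiveWindow { window: 100, max_data: 100, consumed: 0 };
    assert_eq!(fc.on_consume(40, 2), None);      // 40% consumed: wait
    assert_eq!(fc.on_consume(10, 2), Some(150)); // 50% consumed: update
}

The first ngtcp2 PR linked above tweaks this kind of threshold.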
