Thanks for bringing this up, Kazuho.  My back-of-the-envelope math also
indicated that 1/3 was a better value than 1/2 when I looked into it a few
years ago, but I never constructed a clean test to prove it with a real
congestion controller.  Unfortunately, our congestion control simulator
sits below our flow control layer, so it can't exercise this interaction.

It probably makes sense to test this in the real world and see if reducing
it to 1/3 measurably reduces the number of blocked frames we receive on our
servers.

There are use cases where auto-tuning is nice.  Even for Chrome, there are
cases where starting with a smaller stream flow control window would have
avoided bugs in which a few streams consume the entire connection flow
control window.
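
To make that concrete (hypothetical numbers below, not Chrome's actual
defaults): the hazard is a per-stream window large enough that a few
streams can pin the whole connection window.

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        uint64_t initial_max_data        = 15u << 20; /* 15 MB conn window */
        uint64_t initial_max_stream_data =  6u << 20; /*  6 MB per stream */

        /* Three busy streams can consume the entire connection window,
         * starving every other stream: */
        assert(3 * initial_max_stream_data > initial_max_data);

        /* Starting with a smaller stream window and auto-tuning it
         * upward avoids this failure mode. */
        return 0;
    }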

Thanks, Ian

On Tue, Oct 14, 2025 at 8:49 AM Kazuho Oku <[email protected]> wrote:

>
>
> On Tue, Oct 14, 2025 at 19:36 Max Inden <[email protected]> wrote:
>
>> * Send MAX_DATA / MAX_STREAM_DATA no later than when 33% (i.e., 1/3) of
>> the credit is consumed.
>>
>> Firefox will send MAX_STREAM_DATA after 25% of the credit has been
>> consumed.
>>
>>
>> https://github.com/mozilla/neqo/blob/791fd40fb7e9ee4599c07c11695d1849110e704b/neqo-transport/src/fc.rs#L30-L37
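>>
>> Sketched roughly in C (the names below are illustrative, not neqo's
>> actual API), that threshold check amounts to:
>>
>>     #include <stdbool.h>
>>     #include <stdint.h>
>>
>>     /* Illustrative only: send MAX_DATA / MAX_STREAM_DATA once at
>>      * least 1/4 of the current window has been consumed since the
>>      * last update; the 1/3 proposal above would use `* 3` instead. */
>>     static bool should_send_update(uint64_t consumed_since_update,
>>                                    uint64_t window) {
>>         return consumed_since_update * 4 >= window;
>>     }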
>>
>> * Instead of doubling (x2) the window size, increase it by a larger
>> factor (e.g., x4).
>>
>> Firefox will increase the window by up to 4x the overshoot of the current
>> BDP estimate.
>>
>
> Good to know that Firefox uses these numbers. They look fine to me, though
> depending on the size of the initial credit, Careful Resume might get
> blocked.
>
>>
>> https://github.com/mozilla/neqo/blob/791fd40fb7e9ee4599c07c11695d1849110e704b/neqo-transport/src/fc.rs#L402-L409
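>>
>> A rough sketch of that growth rule in C (illustrative names, not
>> neqo's actual code):
>>
>>     #include <stdint.h>
>>
>>     /* When the estimated BDP overshoots the current window, grow the
>>      * window by up to 4x the overshoot, bounded by a local hard cap. */
>>     static uint64_t next_window(uint64_t window, uint64_t bdp_estimate,
>>                                 uint64_t hard_cap) {
>>         if (bdp_estimate > window)
>>             window += 4 * (bdp_estimate - window);
>>         return window < hard_cap ? window : hard_cap;
>>     }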
>>
>> * Disable auto-tuning entirely (it's needed only for latency-sensitive
>> applications).
>>
>> What would be a reasonable one-size-fits-all stream data window size,
>> one that at the same time doesn't expose the receiver to a memory
>> exhaustion attack?
>>
>> Because it is difficult to estimate the sender's initial window and how
>> quickly it ramps up - especially with algorithms like Careful Resume, which
>> don't use Slow Start - my preference is to disable auto-tuning by default.
>>
>> Wouldn't a high but reasonable start value + window auto-tuning be ideal?
>>
>
>
> Yeah, I think there’s often confusion between two distinct aspects:
> a) the maximum buffer size that the receiver can allocate, and
> b) how fast the sender might transmit.
>
> A is what receivers need to prevent memory-exhaustion attacks. It’s purely
> a local policy: the limit might be 1 MB or 10 MB, but it’s unrelated to B —
> that is, it doesn’t depend on how quickly the sender sends.
>
> For latency-sensitive applications that read slowly, it’s important to cap
> the receive buffer at roughly read_speed × latency, because otherwise
> bufferbloat increases latency. But again, that consideration is separate
> from B.
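>
> As a toy example (numbers invented): an application draining its stream
> at 1 MB/s that can tolerate 100 ms of buffering delay would cap the
> buffer at roughly 1e6 * 0.1 = 100 KB:
>
>     #include <stdio.h>
>
>     int main(void) {
>         double read_speed = 1e6;   /* bytes per second (assumed) */
>         double latency    = 0.100; /* seconds of tolerable queueing */
>         printf("cap ~ %.0f bytes\n", read_speed * latency); /* 100000 */
>         return 0;
>     }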
>
> In my view, B mainly concerns minimizing the amount of memory allocated
> inside the kernel. Kernel-space memory management is far more constrained
> than in user space: allocations often have to be contiguous, and falling
> back to swap is not an option. Note also that the TCP/IP stack is decades
> old, from an era when memory was a much more precious resource than it is
> today.
>
> In contrast, a QUIC stack running in user space can rely on virtual
> memory, where fragmentation is rarely a real issue. When a user-space
> buffer fills up, the program can simply call realloc() and append
> data—possibly incurring operations such as virtual-memory remapping or
> paging. There is no need to pre-reserve large contiguous chunks of memory.
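>
> A minimal sketch of such a user-space buffer (plain C, illustrative
> rather than any particular stack's code):
>
>     #include <stdint.h>
>     #include <stdlib.h>
>     #include <string.h>
>
>     struct rbuf { uint8_t *data; size_t len, cap; };
>
>     /* Grow on demand with realloc(); the VM system deals with any
>      * remapping or paging, so no large region is reserved up front. */
>     static int rbuf_append(struct rbuf *b, const uint8_t *p, size_t n) {
>         if (b->len + n > b->cap) {
>             size_t cap = b->cap ? b->cap : 4096;
>             while (cap < b->len + n)
>                 cap *= 2;
>             uint8_t *d = realloc(b->data, cap);
>             if (d == NULL)
>                 return -1;
>             b->data = d;
>             b->cap = cap;
>         }
>         memcpy(b->data + b->len, p, n);
>         b->len += n;
>         return 0;
>     }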
>
> To summarize, there is far less need in QUIC, if any, to minimize the
> receive window advertised to the peer, compared to what was necessary for
> in-kernel TCP.
>
>
>> On 14/10/2025 03.38, Kazuho Oku wrote:
>>
>>
>>
>> On Mon, Sep 29, 2025 at 16:28 Max Inden <[email protected]> wrote:
>>
>>> For what it is worth, also referencing previous discussion on this list:
>>>
>>> "Why isn't QUIC growing?"
>>>
>>> https://mailarchive.ietf.org/arch/msg/quic/RBhFFY3xcGRdBEdkYmTK2k926mQ/
>>>
>>
>> Reading the old thread, I'm reminded that people often assume QUIC
>> performs better than TCP. However, that is true only when the QUIC stack is
>> implemented, configured, and deployed correctly.
>>
>> One bug I've seen in multiple stacks - one that significantly affects
>> benchmark results - is the failure to auto-tune the receive window as
>> aggressively as the sender's Slow Start allows.
>>
>> Based on my understanding, Google Quiche implements receive window
>> auto-tuning as follows:
>> * Send MAX_DATA / MAX_STREAM_DATA when 50% of the credit has been
>> consumed.
>> * Double the window size when these frames are sent frequently.
>>
>> Several other stacks have adopted this approach.
>>
>> The problem with this logic is that it's too conservative and causes the
>> sender to become flow-control-blocked during Slow Start.
>>
>> Consider the following example:
>> 1. The receiver advertises an initial Maximum Data of W.
>> 2. After receiving 0.5W bytes, the receiver sends Maximum Data = 2.5W
>> along with ACKs up to 0.5W. The next Maximum Data will be sent once the
>> receiver has received 1.5W bytes.
>> 3. The receiver receives bytes up to W and ACKs them.
>> 4. At this point, the sender's Slow Start permits transmission of up to
>> 2W additional bytes (reaching offset 3W), but the advertised limit of
>> 2.5W leaves only 1.5W of credit. As a result, the connection becomes
>> flow-control-blocked (see the sketch below).
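>>
>> A quick numeric check of those four steps (plain C; W is arbitrary):
>>
>>     #include <assert.h>
>>     #include <stdint.h>
>>
>>     int main(void) {
>>         const uint64_t W = 65536;
>>
>>         /* Step 2: at 0.5W consumed, the receiver advertises
>>          * Maximum Data = 0.5W + 2W = 2.5W (window doubled once). */
>>         uint64_t advertised = W / 2 + 2 * W;
>>
>>         /* Steps 3-4: with W bytes ACKed, Slow Start allows cwnd = 2W
>>          * new bytes in flight, i.e. up to byte offset W + 2W = 3W. */
>>         uint64_t slow_start_limit = W + 2 * W;
>>
>>         /* 3W > 2.5W: the connection is flow-control-blocked. */
>>         assert(slow_start_limit > advertised);
>>         return 0;
>>     }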
>>
>> There are several ways to address this issue:
>> * Send MAX_DATA / MAX_STREAM_DATA no later than when 33% (i.e., 1/3) of
>> the credit is consumed.
>> * Instead of doubling (x2) the window size, increase it by a larger
>> factor (e.g., x4).
>> * Disable auto-tuning entirely (it's needed only for latency-sensitive
>> applications).
>>
>> Because it is difficult to estimate the sender's initial window and how
>> quickly it ramps up - especially with algorithms like Careful Resume, which
>> don't use Slow Start - my preference is to disable auto-tuning by default.
>>
>> In fact, this is also the choice made by Chromium, which is why it is not
>> affected by this bug!
>>
>> For reference, Tatsuhiro addressed this issue in ngtcp2 in the
>> following PRs:
>> * https://github.com/ngtcp2/ngtcp2/pull/1396 - Tweak threshold for
>> max_stream_data and max_data transmission
>> * https://github.com/ngtcp2/ngtcp2/pull/1397 - Add note for window
>> auto-tuning
>> * https://github.com/ngtcp2/ngtcp2/pull/1398 - examples/client: Disable
>> window auto-tuning by default
>>
>> However, I suspect the bug may still exist in other stacks.
>>
>> On 29/09/2025 05.38, Lars Eggert wrote:
>>>
>>> Hi,
>>>
>>> a pitch for a discussion at IETF 124.
>>>
>>> https://radar.cloudflare.com/
>>> <https://radar.cloudflare.com/adoption-and-usage?dateRange=52w> and
>>> similar stats have had H3 around 30% for a few years now, with little
>>> change since the initial quick ramp up to that level.
>>>
>>> Topic: why is that and is there anything the WG or IETF can do to change
>>> it (upwards, of course)?
>>>
>>> Thanks,
>>> Lars
>>> --
>>> Sent from a mobile device; please excuse typos.
>>>
>>>
>>
>> --
>> Kazuho Oku
>>
>>
>
> --
> Kazuho Oku
>
