On 8/27/2024 8:47 AM, Neal Cardwell wrote:
This is a quick note to see if there is interest in suggesting QUIC-related
text for draft-ietf-tcpm-prr-rfc6937bis, "Proportional Rate Reduction for
TCP":
https://datatracker.ietf.org/doc/draft-ietf-tcpm-prr-rfc6937bis/
At TCPM at IETF 120 in July, there were a few people suggesting that it
would be nice if the prr-rfc6937bis draft had a section explaining how PRR
could be implemented for QUIC. Here's the discussion, if you are curious
about the context:
https://www.youtube.com/watch?v=eF_Qa475ag8&t=2553s
Are there any folks with QUIC implementation experience who are interested
in volunteering to write a few paragraphs for the prr-rfc6937bis draft to
add a section explaining how PRR could be implemented for QUIC?
Maybe this is not that urgent. The text in RFC 9002 says:
"Implementations MAY reduce the congestion window immediately upon
entering a recovery period or use other mechanisms, such as Proportional
Rate Reduction [PRR <https://www.rfc-editor.org/rfc/rfc9002.html#PRR>],
to reduce the congestion window more gradually. If the congestion window
is reduced immediately, a single packet can be sent prior to reduction.
This speeds up loss recovery if the data in the lost packet is
retransmitted and is similar to TCP as described in Section 5
<https://www.rfc-editor.org/rfc/rfc6675#section-5> of [RFC6675
<https://www.rfc-editor.org/rfc/rfc9002.html#RFC6675>]." So, it is a MAY,
with an informational reference: about as weak a claim as you can make.
Moreover, many implementers take RFC 9002 (QUIC RECOVERY) with a grain
of salt. The loss recovery part is fine, but the congestion control part
is an adaptation of Reno, because the WG charter stated that only Reno
would be considered. Some implementations do support Reno as an option,
but most deployments use Cubic or BBR. For those, the only mandates that
apply are those in RFC 9000, which are very generic. In Section 1,
Overview, we find: "QUIC depends on congestion control to avoid network
congestion. An exemplary congestion control algorithm is described in
Section 7 <https://www.rfc-editor.org/rfc/rfc9002#section-7> of
[QUIC-RECOVERY
<https://www.rfc-editor.org/rfc/rfc9000.html#QUIC-RECOVERY>]". Section
9.4 specifies that when multiple paths are used, congestion control is
"per path"; that requirement is kept in the QUIC multipath extension draft.
There are a couple of complicating factors. First, while loss detection
is rather similar to TCP, retransmission and flow control are very
different: QUIC can send data on several parallel streams, different
flow control variables apply per stream and globally, and retransmission
operates at the frame level, not the packet level. Different streams may
have different priorities, and those priorities also affect which data
is repeated first -- or maybe not repeated at all.
Then there are the differences between implementations in the kernel,
like TCP, and implementations in a user process, like QUIC. Some things
are easier, such as managing memory or handling fine-grained timers.
Others are more costly, in particular sending packets through the socket
API, which leads to a desire to use features like GSO or sendmmsg and send
packets in batches.
To come back to PRR, the main reference in RFC 9000 is in Section 4.2
on flow control limits, which says "When a sender receives credit after
being blocked, it might be able to send a large amount of data in
response, resulting in short-term congestion; see Section 7.7
<https://www.rfc-editor.org/rfc/rfc9002#section-7.7> of [QUIC-RECOVERY
<https://www.rfc-editor.org/rfc/rfc9000.html#QUIC-RECOVERY>] for a
discussion of how a sender can avoid this congestion." Different
implementations may treat that problem differently. The general
agreement is to do some form of pacing.
The implementation that I maintain uses leaky-bucket style pacing, in
which the size of the bucket allows for sending reasonable batches of
packets, and the rate of the bucket matches a pacing rate either computed
directly (e.g., with BBR) or derived from Congestion Window size and RTT
(e.g., for Cubic and Reno). Other implementations make their own
choices. AFAIK, there is no big demand inside the QUIC WG to standardize
that.
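
To make the leaky-bucket idea concrete, here is a minimal Python sketch.
It is not the code of any particular implementation: the class name, the
method names and the numbers in the usage note are made up for
illustration. The rate derived from window and RTT follows the
N * congestion_window / smoothed_rtt form given in RFC 9002 Section 7.7,
with the gain N left as a parameter.

```python
class LeakyBucketPacer:
    """Illustrative pacing bucket; all names here are hypothetical."""

    def __init__(self, rate_bytes_per_s, bucket_max_bytes, now_s=0.0):
        self.rate = float(rate_bytes_per_s)         # pacing rate
        self.bucket_max = float(bucket_max_bytes)   # largest back-to-back batch
        self.credit = float(bucket_max_bytes)       # start with a full bucket
        self.last_update = now_s

    def _refill(self, now_s):
        # Credit accrues at the pacing rate, capped at the bucket size.
        elapsed = max(0.0, now_s - self.last_update)
        self.credit = min(self.bucket_max, self.credit + elapsed * self.rate)
        self.last_update = now_s

    def can_send(self, now_s, packet_bytes):
        self._refill(now_s)
        return self.credit >= packet_bytes

    def on_packet_sent(self, now_s, packet_bytes):
        self._refill(now_s)
        self.credit -= packet_bytes

    def next_send_time(self, now_s, packet_bytes):
        # Earliest time at which enough credit exists for one more packet.
        self._refill(now_s)
        missing = packet_bytes - self.credit
        if missing <= 0:
            return now_s
        return now_s + missing / self.rate


def pacing_rate_from_cwnd(cwnd_bytes, smoothed_rtt_s, gain=1.0):
    # For window-based controllers such as Reno or Cubic; a rate-based
    # controller such as BBR would supply its own pacing rate directly.
    return gain * cwnd_bytes / smoothed_rtt_s
```

For example, with a 120,000-byte window and a 100 ms RTT the rate is
1,200,000 bytes per second. A bucket of ten 1,200-byte packets lets ten
packets go out in one batch, after which the sender waits about 1 ms per
additional packet.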
There was also a suggestion (sensible, IMHO) to set a two-month timeout,
and ship the draft without QUIC text if we don't get a contribution before
then. And that was a little over one month ago, so let's set a timeout of
one month from now: Sep 27th. If we get a contribution of a QUIC section by
Sep 27th, 2024 (any time zone) then we'll add that to the draft. Otherwise,
I think the consensus was to try to progress without a QUIC section.
That seems very reasonable. The TCPM group should publish a TCP-related
specification. If it does replace RFC 6937, QUIC developers will read it
instead of or in addition to the RFC6937 reference of RFC 9002.
-- Christian Huitema