Hi IPPM,

For the purpose of this email, I'm speaking as an individual. I'm
cross-posting to the QUIC mailing list for the visibility of others who
might also be interested. The TL;DR is that draft -03 contains text
indicating how visible bits of QUIC packets could be used, and that raises
several concerns that I think need to be addressed before this document can
progress further.

draft-ietf-ippm-explicit-flow-measurements is fairly far progressed through
its IETF journey. To drastically summarise it, it's about the various forms
of marking bits in packets that can be used for performance measurements.
That requires collaboration between client and server in order to expose
some information that on-path observers might use.

This might seem familiar to the QUIC WG. We have a spin bit formally
defined in version 1. And we last heard, explicitly in QUIC, about the
concepts presented in draft-ietf-ippm-explicit-flow-measurements in a
thread in 2021. The work was adopted by IPPM and progressed there, and that
is all good. However, I've not been able to track this draft as closely as
I would have liked and I suspect some other QUIC stakeholders may have
missed it too.

Other folks have raised some concerns during last call already and I'm
generally in agreement with them. However, for clarity of discussion,
consider this a new thread that partly overlaps with those concerns and
partly does not.

The major point that I think remains unaddressed by this document is how
bits in QUIC packets would _actually_ get used in a coordinated and
interoperable manner. The QUIC invariants spec (RFC 8999), Section 5.2 [1],
states that short header packets consist of 7 Version-Specific Bits. Per
RFC 9000 Section 17.3.1 [2], QUIC version 1 defines the 1-RTT short header
packet with the 7 bits allocated as follows:

  Fixed Bit (1) = 1,
  Spin Bit (1),
  Reserved Bits (2),
  Key Phase (1),
  Packet Number Length (2),

Where the Reserved Bits have the requirements:

_The value included prior to protection MUST be set to 0. An endpoint MUST
treat receipt of a packet that has a non-zero value for these bits, after
removing both packet and header protection, as a connection error of type
PROTOCOL_VIOLATION._
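
To make the layout above concrete, here is a minimal sketch of how a
receiving endpoint might decode that first byte. It assumes the byte is
taken after header protection has been removed (on the wire, the five
low-order bits of a short header's first byte are masked by header
protection, so an on-path observer only sees the header form, Fixed Bit and
Spin Bit). The names ShortHeaderBits, ProtocolViolation and
decode_short_header_first_byte are hypothetical and used for illustration
only; this is not taken from any particular QUIC implementation.

```python
from dataclasses import dataclass

# Bit masks for the first byte of a QUIC v1 1-RTT (short header) packet,
# per the field order in RFC 9000 Section 17.3.1.
HEADER_FORM = 0x80    # 0 for short header packets
FIXED_BIT = 0x40
SPIN_BIT = 0x20
RESERVED_BITS = 0x18
KEY_PHASE = 0x04
PN_LENGTH = 0x03      # encoded as (length in bytes - 1)


class ProtocolViolation(Exception):
    """Hypothetical stand-in for a PROTOCOL_VIOLATION connection error."""


@dataclass(frozen=True)
class ShortHeaderBits:
    fixed: int
    spin: int
    reserved: int
    key_phase: int
    packet_number_length: int  # in bytes


def decode_short_header_first_byte(first_byte: int) -> ShortHeaderBits:
    """Decode the first byte AFTER header protection has been removed."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must fit in one byte")
    if first_byte & HEADER_FORM:
        raise ValueError("not a short header packet")
    reserved = (first_byte & RESERVED_BITS) >> 3
    if reserved != 0:
        # A v1 endpoint treats non-zero Reserved Bits as a connection error.
        raise ProtocolViolation("non-zero Reserved Bits")
    return ShortHeaderBits(
        fixed=(first_byte & FIXED_BIT) >> 6,
        spin=(first_byte & SPIN_BIT) >> 5,
        reserved=reserved,
        key_phase=(first_byte & KEY_PHASE) >> 2,
        packet_number_length=(first_byte & PN_LENGTH) + 1,
    )
```

The point of the sketch is the ProtocolViolation branch: a version 1 peer
that has not agreed to anything else will tear down the connection if the
Reserved Bits carry a measurement signal.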

The Introduction of draft-ietf-ippm-explicit-flow-measurements states that:

_this document proposes adding a small number of dedicated measurement bits
to the clear portion of the transport protocol headers. These bits can be
added to an unencrypted portion of a transport-layer header, e.g. UDP
surplus space (see [UDP-OPTIONS
<https://www.ietf.org/archive/id/draft-ietf-ippm-explicit-flow-measurements-03.html#UDP-OPTIONS>
] and [UDP-SURPLUS
<https://www.ietf.org/archive/id/draft-ietf-ippm-explicit-flow-measurements-03.html#UDP-SURPLUS>
]) or reserved bits in a QUIC v1 header, as already done with the latency
Spin bit (see [QUIC-TRANSPORT
<https://www.ietf.org/archive/id/draft-ietf-ippm-explicit-flow-measurements-03.html#QUIC-TRANSPORT>
])._

This is problematic because you can't just change those Reserved Bits at
will and hope that independent implementations will interoperate. Similarly,
you can't just substitute the Spin Bit as proposed in Section 6. This
really makes me question how this has been implemented and tested to date
because, to make it function properly and operate safely on the Internet,
you'd either need to have defined new versions of QUIC or defined an
extension. For an example of the latter, see RFC 9287 [3], which describes
a mechanism for repurposing the Fixed Bit. Yes, I understand some people
have trialled these out in a limited fashion, but my question is: how would
an operator that is watching millions of QUIC flows from heterogeneous
clients and servers know which ones are using measurement bits and which
aren't?

At the point that client and server need to negotiate a version or an
extension, there is now a coordination challenge with these on-path
observers. The QUIC manageability spec has some great text detailing the
considerations that apply here. This draft is sorely lacking a reference to
Section 3 of RFC 9312 [4], and also lacks any commentary on dealing with
those considerations (but maybe it shouldn't; see the end of this email).

In addition to these concerns, I also find the protocol ossification
considerations in Section 5 to be awkward. It's not clear who the onus is
on to do anything because it talks about protocol designers and then says
implementations could decide to do something. It mentions "Latency Spin bit
greasing" in RFC 9000, but RFC 9000 does not use that term. I'm going to
presume you're alluding to Section 17.4 [5]. If so, there are two important
separate aspects: disabling the spin bit on one in every 16 paths or
connection IDs, and randomizing values in the spin bit. This is awkward
because there are many different permutations of bits described in the
specification, and operators need to somehow guess i) what permutation is
being used, ii) whether the endpoints have it enabled, and iii) how
randomization affects analysis.
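
As a rough illustration of those two aspects, here is a sketch of what an
endpoint following RFC 9000 Section 17.4 might do. It assumes the simplest
reading: a per-path random selection with probability 1/16, and a
per-packet random value when the spin bit is disabled. The function names
and the injected rng parameter are hypothetical and only there to keep the
example self-contained.

```python
import random


def spin_enabled_for_path(rng: random.Random) -> bool:
    """Decide once per network path (or connection ID) whether to spin.

    Assumption: each path is independently disabled with probability 1/16.
    """
    return rng.randrange(16) != 0


def outgoing_spin_bit(enabled: bool, spin_value: int, rng: random.Random) -> int:
    """Return the spin bit to place in an outgoing short header packet.

    When spinning is enabled, the value computed by the spin algorithm is
    used as-is. When it is disabled, this sketch picks a random value per
    packet, so an observer cannot derive an RTT from the signal.
    """
    if enabled:
        return spin_value
    return rng.getrandbits(1)
```

From the observer's side, nothing in the packet says which branch was
taken, which is the guessing problem described above.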

Finally, the draft consistently cites other QUIC documents (great!) but
lacks deep linking into the specific sections where you really want the
reader to go and focus. That places a lot of expectation on the reader to
find the right text and act on it. It would be helpful to add more-specific
links wherever possible.

All in all, this combination of concerns leads me to believe the document
in its current state is potentially unsafe, because it risks people taking
it literally and starting to write or read bits without due coordination or
feature detection. The confusion arises because the intro implies this is a
proposal to change bits but then backs off that commitment and uses
"coulds" etc. I think we need to be crystal clear. Speak about some bits in
the abstract and mention the few protocol-specific concerns or related
examples that each protocol already has defined. However, remove the
implication that we have thought through how to get these deployed for
QUIC, and punt that to future work by explicitly stating so. I think this
takes the form of removing Section 6, rewriting Section 5 to speak about
the higher-level problems and possible approaches, and removing any
straggling sentences about reusing existing or reserved bits.

Cheers
Lucas

[1] - https://datatracker.ietf.org/doc/html/rfc8999#section-5.2
[2] - https://datatracker.ietf.org/doc/html/rfc9000#section-17.3.1
[3] - https://datatracker.ietf.org/doc/html/rfc9287.html
[4] - https://datatracker.ietf.org/doc/html/rfc9312#section-3
[5] - https://datatracker.ietf.org/doc/html/rfc9000#section-17.4
