Folks- I am appending what I sent Martin & Gorry this morning. I looked quite quickly as Martin was looking for quick input. I am happy to iterate if things aren't all that understandable.
allman


[Quick hit.]

I agree with Martin's DISCUSS and Gorry's notes. A couple more things here ...

> This spec looks a mess!

A generous reading is that this is most of the problem. I think maybe there is some intent in here that isn't stated very well. It needs to be explicit and not as sloppy.

A few specific things (in addition to what Gorry said, which I absolutely agree with):

  - "Though timer values are the choice of the implementation, mishandling of the timer can lead to serious congestion problems"

    + Gorry flagged this and I am flagging it again. If this is something that can lead to serious problems, let's not just leave it to "choice of the implementation". Especially if we have some idea how to make it less problematic.

  - "Implementations SHOULD use an initial timer value of 100 msec (the minimum defined in RFC 6298 [RFC6298])"

    + I wrote RFC 6298 and I have no idea where this is coming from!

    + Even if this value of 100msec is OK for DTLS it shouldn't lean on RFC 6298 because RFC 6298 doesn't say that is OK. I.e., the parenthetical is objectively wrong.

    + RFC 6298 says the INITIAL RTO should be 1sec (point (2.1) in section 2). RFC 8961 affirms this and also says the INITIAL RTO should be 1sec (requirement (1) in section 4).

  - "Note that a 100 msec timer is recommended rather than the 3-second RFC 6298 default in order to improve latency for time-sensitive applications."

    + Again, this mis-states RFC 6298, which says the initial RTO is 1sec (not 3sec). (Previous to RFC 6298 the initial RTO was 3sec, which is probably where the notion comes from. Most of the purpose of RFC 6298 was to drop the initial RTO to 1sec.)

    + This is a statement of desire, not any sort of principled justification for using 100msec. At the least this should be much better argued.

    + To me 100msec feels much too close to the RTT of some network paths to be appropriate here. To be clear, deviations from RFC 8961 that gather consensus are fine, but you should say why that deviation is OK. And, I'd think the further you deviate the more you need to say (for me). I.e., dropping from 1sec to 900msec may not be that big of an issue. But, dropping to 1/10-th of the guideline and to something pretty close to not rare RTTs should require some care and some discussion, IMO.

    + And, I am not trying to be a picky protocol lawyer and say this document "didn't check the RFC 8085 / RFC 8961 box". Rather, RFC 8085 & 8961 say things for a reason and I don't think we should implicitly ignore them because they come from experience on how to do these sorts of things.

  - "The retransmit timer expires: the implementation transitions to the SENDING state, where it retransmits the flight, resets the retransmit timer, and returns to the WAITING state."

    + Maybe this is spec sloppiness, but boy does it sound like the recipe TCP used before VJCC to collapse the network. I.e., expire and retransmit the window. Rinse and repeat. It may be the intention is for backoff to be involved. But, that isn't what it says.

  - "When they have received part of a flight and do not immediately receive the rest of the flight (which may be in the same UDP datagram). A reasonable approach here is to set a timer for 1/4 the current retransmit timer value when the first record in the flight is received and then send an ACK when that timer expires."

    + Where does 1/4 come from? Why is it "reasonable"? This just feels like a complete WAG that was pulled out of the air.

And, +1 on all the flight size stuff Martin mentioned.

allman
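
For reference, the timer behavior the comments above point to (a 1sec initial RTO per RFC 6298 point (2.1), and backing off on each expiry rather than resending at a fixed interval) could look roughly like the sketch below. It is an illustration only, not DTLS code and not text from the draft under review. The class name, the method names, the clock granularity G, and the choice to cap at exactly 60sec are all assumptions made for the example; the constants K, alpha, and beta and the update rules follow RFC 6298 section 2 and (5.5).

```python
# Illustrative sketch of an RFC 6298-style retransmission timer.
# Names and the G value are hypothetical; this is not taken from any
# DTLS implementation or from the draft being reviewed.

class RetransmitTimer:
    INITIAL_RTO = 1.0   # seconds, RFC 6298 (2.1)
    MIN_RTO = 1.0       # seconds, RFC 6298 (2.4): round up to 1 sec
    MAX_RTO = 60.0      # seconds, RFC 6298 (2.5): a cap MAY be used if >= 60 sec
    K = 4
    ALPHA = 1 / 8
    BETA = 1 / 4
    G = 0.001           # assumed clock granularity in seconds

    def __init__(self):
        self.srtt = None      # smoothed RTT, unset until the first sample
        self.rttvar = None    # RTT variation
        self.rto = self.INITIAL_RTO

    def on_rtt_sample(self, r):
        """Update the RTO from a new RTT measurement r (in seconds)."""
        if self.srtt is None:
            # First measurement, RFC 6298 (2.2)
            self.srtt = r
            self.rttvar = r / 2
        else:
            # Subsequent measurements, RFC 6298 (2.3); RTTVAR is updated
            # before SRTT so that it uses the old SRTT value.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.G, self.K * self.rttvar)
        self.rto = min(max(rto, self.MIN_RTO), self.MAX_RTO)
        return self.rto

    def on_expiry(self):
        """Back off the timer on expiry, RFC 6298 (5.5): RTO <- RTO * 2."""
        self.rto = min(self.rto * 2, self.MAX_RTO)
        return self.rto


if __name__ == "__main__":
    timer = RetransmitTimer()
    # Successive expiries with no RTT sample in between.
    print([timer.on_expiry() for _ in range(7)])
    # prints [2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

The contrast with "expire, retransmit the flight, reset the timer" is the doubling in on_expiry(): without it, every expiry resends at the same interval no matter how congested the path is.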
_______________________________________________ TLS mailing list TLS@ietf.org https://www.ietf.org/mailman/listinfo/tls