Hi Ted, please see below.
> On 23.10.2018 at 21:51, Ted Lemon <mel...@fugue.com> wrote:
>
> On Mon, Oct 15, 2018 at 10:02 AM Mirja Kuehlewind (IETF)
> <i...@kuehlewind.net> wrote:
> sorry for the delay, however, as you performed a couple of changes it took me
> a while to re-review. I believe I’m unfortunately not fully ready to release
> my discuss at this point, but close.
>
> No worries, it's a busy time.
>
> Regarding my first discuss point (delayed ACKs and so on), I think the text
> improved and I would like to see my minor wording question (comment 2)
> below addressed before I finally release the discuss here. However, I still
> think the extensive discussion now provided in section 9.5 does not
> necessarily belong in this document. Therefore I would have preferred to
> move this text into a real appendix, or to remove it completely and maybe
> document it in its own informational RFC (in tcpm).
>
> Regarding my second discuss point (keep-alives), the text still seems not
> quite right yet, or I’m really confused. Please also see further below
> (comment 3).
>
> Anyway, here are my comments on the edited/new text in the order they appear
> in the draft:
>
> 1) I think the following text in section 3 is not fully correct:
>
> "Fast Open message: A TCP SYN packet that begins a DSO connection and
> contains early data ([RFC8446] section 2.3). Fast Open is only
> permitted when using TLS encapsulation: a TCP SYN message that does
> not use TLS encapsulation but contains early data is not permitted."
>
> If TLS 0-RTT is used this data will not be carried in the TCP SYN, it will
> "just" be sent at the same time as the TLS handshake is performed (but after
> the TCP handshake). Only if TCP Fast Open (TFO) (see RFC7413) is used, data
> can also be sent in the TCP SYN. I guess you mainly need to fix the reference
> here, or maybe name both mechanisms separately.
>
> If you look at the table on p. 18 of RFC8447, it shows early data being sent
> in the first packet.
> What am I missing here?

I guess you mean RFC8446 :-) The table there on p. 18 shows only the TLS handshake; there is a TCP handshake before that.

> 2) In section 5.5.1:
> "With a DSO request message, the TCP implementation waits for the
> application-layer client software to generate the corresponding DSO
> response message, which enables the TCP implementation to send a
> single combined IP packet containing the TCP acknowledgement, the TCP
> window update, and the application-generated DSO response message.
> This is more efficient than sending three separate IP packets."
>
> The phrasing here is a bit confusing, to me at least. It sounds a bit like
> there is a special TCP for DSO… maybe the following is a bit better:
> "With a DSO request message, the TCP delayed acknowledgment timer will
> usually make the implementation wait for the
> application-layer client software to generate the corresponding DSO
> response message before it sends out a TCP acknowledgment.
> This will generate a
> single combined IP packet containing the TCP acknowledgement, the TCP
> window update, and the application-generated DSO response message and
> is more efficient than sending three separate IP packets."
>
> (Note that the delayed ACK timer can be configured to a very small value as
> well, and as such it depends on the processing time and the value of the
> timer whether a TCP implementation will wait or not.)
>
> I think using the passive voice here makes the text harder to follow, but I
> see what you are saying. How about this:
>
> With a bidirectional exchange over TCP, as for example with a DSO request
> message, the operating system TCP implementation waits for the

"the operating system’s TCP implementation will usually wait for" or, even better, "the delayed acknowledgment timer in TCP will usually wait for"

> application-layer client software to generate the corresponding DSO
> response message.
> It can then send a
> single combined packet containing the TCP acknowledgement, the
> TCP window update, and the application-generated DSO response message.
> This is more efficient than sending three separate packets, as would occur if
> the TCP packet containing the DSO request were acknowledged immediately.
>
> 3) Section 6.5.2
> "For example, a (hypothetical and unrealistic)
> keepalive interval value of 100 ms would result in a continuous
> stream of ten messages per second or more, in both directions, to
> keep the DSO Session alive. And, in this extreme example, a single
> packet loss and retransmission over a long path could introduce a
> momentary pause in the stream of messages of over 200 ms, long enough
> to cause the server to overzealously abort the connection."
> I think this example is still not correct (and the changes might have made it
> worse: how can there be more than 10 messages?)
>
> So the point here is that there is a dependency on the RTT. Only if the RTT
> is smaller than 200ms this can happen, otherwise the connection is closed
> anyway after two keep-alives. However, if the RTT is much smaller than 100ms
> and e.g. TLP is used, it would still work even if one packet is lost.
>
> Remember that keepalives are not synchronous. That is, if we send a
> keepalive, we don't wait for the response. So it's perfectly possible for
> there to be several keepalives in flight in this situation, if the RTT is
> >200ms.

Yes, I misunderstood that initially. By the way, why did you decide to design it that way?

> In any case, I don’t think this example is actually very helpful. The point
> is that the keep-alive interval should always be much larger than the RTT to
> make this work appropriately. However, the point about keeping the network
> load low is rather independent of the question of when the mechanism actually
> breaks.
> I would recommend simply removing this example and just saying that the
> interval MUST not be smaller than 10 sec to keep the network load reasonably
> low.
>
> However, having read this and the previous section again, I think your
> implementation of the keep-alive mechanism could also be improved. Usually,
> there should be two intervals: one defines how long the connection can be
> idle before a keep-alive is sent, and one defines when a keep-alive
> should be retransmitted if it is deemed to be lost, where the first one should
> usually be larger than the second one (and both timers should always be
> larger than the RTT). That would enable faster failure if the connection is
> actually lost.
>
> A possible point of confusion is that these are not TCP keepalive packets.
> These are DSO messages being sent over the TCP transport. So it's not
> possible for a keepalive to be lost. If we don't get a response to a
> keepalive during the keepalive interval, this means that the TCP connection
> has stalled, or that the remote end is no longer reachable. There is no
> retransmission. Is that where the confusion lies, or am I misunderstanding?

Right, I actually got confused here. So you send data frequently, and if something goes wrong the connection will be closed on the sender side. Hm... it seems that if you want to test transport liveness with this (and not application liveness), you might rather want to use the existing keep-alive mechanism in TCP? Why don’t you just recommend using that?

> 4) Section 6.6.2.2. (Reconnecting After an Unexplained Connection Drop)
> "It is also possible for a server to forcibly terminate the
> connection; in this case the client doesn't know whether the
> termination was the result of a protocol error or a network outage.
> The client could determine which of the two is occurring by noticing
> if a connection is repeatedly dropped by the server; if so, the
> client can mark the server as not supporting DSO."
> How often should the client try and in which interval?
>
> I've added the following text to address this question:
>
> ### Misbehaving Clients
>
> A server may determine that a client is not following the protocol correctly.
> There may be no way for the server to recover the session, in which case the
> server forcibly terminates the connection. Since the client doesn't know why
> the connection dropped, it may reconnect immediately. If the server has
> determined that a client is not following the protocol correctly, it may
> terminate the DSO session as soon as it is established, specifying a long
> retry-delay to prevent the client from immediately reconnecting.
>
> and
>
> #### Reconnecting After an Unexplained Connection Drop {#dropreconnect}
>
> It is also possible for a server to forcibly terminate the connection; in
> this case the client doesn't know whether the termination was the result
> of a protocol error or a network outage. When the client notices that
> the connection has been dropped, it can attempt to reconnect immediately.
> However, if the connection is dropped again without the client being
> able to successfully do whatever it is trying to do, it should mark the
> server as not supporting DSO.
>
> These two bits of advice, in combination with the surrounding text, should
> address the problem you're pointing to.

Yes, I think this is fine now.

> 5) Section 9.2:
> "In principle, anycast servers could maintain sufficient state that
> they can both handle packets in the same TCP connection."
> Really? I mean in theory yes but has this ever been done in practice? I would
> think that sharing TCP state is even harder than sharing DSO state.
>
> I've just deleted this paragraph; I think we were trying to address a
> hypothetical scenario here and got a little carried away. :)

Good. Thanks!

> Please clarify that TLS 0-RTT can be used without TFO (or TFO can be used
> without TLS) and I would also recommend discussing the respective issues
> separately.
>
> As Benjamin said, I took out support for TCP Fast Open without TLS 1.3
> because I didn't think it was practical to address the potential issues with
> it.

I'm not sure why you think that, or which issues you mean. RFC7413 actually discusses all kinds of issues extensively.

> However, in looking back at what I wrote, it's easy to see why this was
> confusing. I've substantially tightened up the text about this: all cases
> where terms like "TCP Fast Open" and "0-RTT" are used now refer to "early
> data." The changes are relatively small, but sprinkled over a whole
> section, so I don't think it's practical to enumerate them here, but they
> should show up nicely in the diffs. I believe these changes address your
> concern, but please let me know if they do not.
>
> https://www.ietf.org/rfcdiff?url2=draft-ietf-dnsop-session-signal-17

Okay, I believe this is fine now. I guess you could further clarify somewhere that "early data" is always 0-RTT TLS with or _WITHOUT_ TFO.

Thanks!
Mirja

> Thanks!

_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop