HI Roman, thank you for the review. Please see comments inline.
> Hi! > > I performed a AD review of draft-ietf-ipsecme-rfc8229bis-05. Thanks for > revising RFC8229 with this new > guidance. Comments are below: > > ** The abstract notes that many of the document updates came from deployment > experience. I'm hoping to > incorporate that feedback on a particular issue. There are a number places > in this document where > qualitative recommendations are made about various network stack timers. Can > quantitative > recommendations be made in any of the following: Traditionally, IPsec specifications contain very few quantitatives concerning various timings. This is due to the belief that concrete timeouts don't affect interoperability. Instead, some very generic recommendations are usually given. See for example Section 2.4 of RFC 7296: The number of retries and length of timeouts are not covered in this specification because they do not affect interoperability. It is suggested that messages be retransmitted at least a dozen times over a period of at least several minutes before giving up on an SA, but different environments may require different rules. > -- Section 7.1 "If the TCP connection is no more associated with any active > IKE SA, the TCP Responder MAY > close the connection to clean up resources if TCP Originator didn't close it > within some reasonable period of > time." I don't think we should prescribe concrete time to wait (since it is a Responder's matter when to free up its resources), but we can add a recommendation. How about: If the TCP connection is no more associated with any active IKE SA, the TCP Responder MAY close the connection to clean up resources if TCP Originator didn't close it within some reasonable period of time (e.g. few seconds). The reason for keeping the orphan TCP connection for some short time is to allow the Initiator to re-use it in case it is ever possible. For example, if the responder returned an error notify and deletes the IKE SA, but the initiator is able to recover (e.g. after COOKIE request or INVALID_KE_PAYLOAD) then if the Responder immediately closes TCP connection, then the Initiator will have to re-establish it, thus wasting 2 RTT. So, this is just for optimization, nothing fatal happens if the responder closes orphan TCP connection immediately. > -- Section 7.4. "In particular, it is advised that the Initiator should not > act immediately after receiving error > notification and should instead wait some time for valid response, ..." This text is just a repetition of what RFC 7296 contains (Section 2.21.1). This specification recommends not to follow RFC 7296 in this situation and act upon immediately if error notification is received. > -- Section 8.1. "If no response is received within a certain period of time > after several retransmissions ..." It's hard to give any concrete recommendations here. If the initiator switches to TCP too quickly, then it may end up with TCP transport while UDP is available on this path. This is suboptimal. On the other hand, if it waits too long before switching to TCP in situation when UDP doesn't work, then it makes the connection outage longer. How about adding the following sentence: The value of timeout and the number of retransmissions may vary depending on the initiator's configuration, but it is expected that the initiators would try to get response over UDP for at least half a minute sending at least dozen retransmissions before switching to TCP. What WG members think about these values? > -- Section 8.4. "For the client, the cluster failover event may remain > undetected for long time if it has no IKE > or ESP traffic to send. " Hm, I'm a bit confused what quantitative do you want to see here. It is just an ascertaining that as long as no traffic originates from the client to the cluster then the fact that the failover takes place will not be known to the client (in case of TCP). > -- Section 8.4. "if support for High Availability in IKEv2 is negotiated and > TCP transport is used, a client that is > a TCP Originator SHOULD periodically send IKEv2 messages (e.g. by > initiating liveness check exchange) > whenever there is no IKEv2 or ESP traffic." Again, it's hard to give concrete recommendations. All depends on the client's policy. If it wants to minimize the delay it detects the cluster failover, then it would send liveness check messages more frequently. On the other hand, if it wants to save resources, it would send them less frequently. I don't think any "one size fits all" recommendation can be given. > The only place I found quantitative guidance was in Section 7.3.1. > > ** Section 6.1. Editorial. s/with new Initiators's SPI/with the Initiator's > new SPI/ > > ** Section 7.1 Editorial. > > OLD > If the TCP connection is no more > associated > > NEW > If the TCP connection is no longer associated Fixed, thank you. > ** Section 7.3 Should something be said about the utility of mixing both SYN > Cookies and IKEv2 Cookies? Can you elaborate on this? It is my understanding that no application traffic can be sent until TCP handshake completes (well, it *can* in theory, but not in practice). So I see no way how they can be mixed except for sending IKEv2 Cookie over the established TCP connection, which is discussed and discouraged. Am I missing something? > ** Section 7.3 > * the exchange Responder SHOULD NOT request a Cookie, with the > exception of Puzzles or in rare cases like preventing TCP Sequence > Number attacks. > > I'm having trouble following this guidance. Is this saying "you SHOULD NOT > send IKEv2 Cookies without > Puzzles?". If so, is this the intent: Yes. > The exchange Responder SHOULD NOT request a IKEv2 Cookie without a Puzzle, > unless mitigation against only > TCP Sequence Number attacks is desired? Changed to: the exchange Responder SHOULD NOT send an IKEv2 Cookie request without an accompanied Puzzle; an example of an exception to this rule may be a mitigation against TCP Sequence Number attacks. Note, that mitigation against TCP Sequence Number attacks is not the only possible reason, for now it's hard to think of others, but they may exist (or arise in the future). > ** Section 7.3. Typo. s/change change/change/ Typo fixed, thanks. > ** Section 7.6 > ... TCP keep-alives [RFC1122] and TLS keep-alives [RFC6520] > may be used > > Should this be "... MAY be used"? Changed to MAY. > ** Section 7.7. Editorial. s/Besides, TCP encapsulation/TCP encapsulation/ > ** Section 8.3. Typo. s/incative/inactive/ Fixed, thank you. > ** Section 8.4. > This document makes the following recommendation: if support for High > Availability in IKEv2 is negotiated and TCP transport is used, a > client that is a TCP Originator SHOULD periodically send IKEv2 > messages (e.g. by initiating liveness check exchange) whenever there > is no IKEv2 or ESP traffic. This differs from the recommendations > given in Section 2.4 of [RFC7296] in the following: the liveness > check should be periodically performed even if the client has nothing > to send over ESP. The frequency of sending such messages should be > high enough to allow quick detection and restoring of broken TCP > connection. > > -- Due to the change in behavior being suggested to RFC7296, did the WG > discuss this document formally > updating it (RFC7296)? I don't think this is needed. The new recommendations are only applicable in the context of TCP encapsulation, so all old implementations remain compliant with RFC 7296. > -- Consider using a more precise term that "TCP Originator" "TCP Originator" was used in RFC 8229 and there was quite a lot of discussions of how to call it (initially it was "TCP Initiator", that provoked mixing up with IKE Exchange Initiator, which may be different parties). Do you have some candidate term in mind? > ** Appendix A. Would there be a reason not to reference a recommended > conformance to RFC7525 or draft- > ietf-uta-rfc7525bis for TLS connections? It's a good question. The idea of using TLS encapsulation was to allow IPsec to pass where only TLS is allowed, but the real protection of traffic should still be provided by IPsec, so with TLS 1.2 NULL cipher suites were intended to be used. With TLS 1.3 NULL cipher suites are gone, so TLS will also participate in traffic protection, so referencing RFC 7525 seems to make sense. Any opinions? A github branch with the changes (xml): https://github.com/smyslov/rfc8229bis/compare/master...AD-review?quick_pull=1 Regards, Valery. > Regards, > Roman > > _______________________________________________ > IPsec mailing list > IPsec@ietf.org > https://www.ietf.org/mailman/listinfo/ipsec _______________________________________________ IPsec mailing list IPsec@ietf.org https://www.ietf.org/mailman/listinfo/ipsec