Re: TSV-DIR Review of draft-ietf-shim6-protocol-09.txt
> >My concern is about wireless media which can experience large variations in signal/noise ratio, in the process generating transient "link down" indications. This could cause those connections to migrate to other media/interfaces.
>
> Wouldn't that be something that should be fixed in the driver for that interface? Declaring a link to be down has significant implications on many systems, this shouldn't be done at the drop of a hat for links where this determination isn't easily made. Having drivers declare links down too soon and then having the next layer ignore that is not a good solution, especially because there are also link layers which can determine their up/down status much more accurately.

Drivers vary widely in their behavior. I have seen drivers that can take as long as 30 seconds to notice that the point of attachment is gone after pulling the plug. Other drivers notice the missing network announcements or low SNR on incoming data and react much sooner. However, even after sending a "link down", a "link up" can come soon after if a new point of attachment is discovered.

So rather than treating a "link down" as a trigger, it is probably best to treat it as a "hint", lowering the timers.

> >If the host has implemented the strong host model, then when the transient "link down" is resolved, the connection won't resume using the prior outbound interface. This could lead to applications experiencing sub-optimal conditions long-term based on a transient event.
>
> Hm, I must say that I don't know off the top of my head if shim6 will automatically rehome to the primary address pair after some time. I'll have to reread the specifications. Or does anyone else remember this?

I don't think it will automatically re-home in the strong host model.

___
Ietf mailing list
Ietf@ietf.org
https://www1.ietf.org/mailman/listinfo/ietf
Re: TSV-DIR Review of draft-ietf-shim6-protocol-09.txt
On 23 nov 2007, at 15:05, Bernard Aboba wrote:

> My concern is about wireless media which can experience large variations in signal/noise ratio, in the process generating transient "link down" indications. This could cause those connections to migrate to other media/interfaces.

Wouldn't that be something that should be fixed in the driver for that interface? Declaring a link to be down has significant implications on many systems, this shouldn't be done at the drop of a hat for links where this determination isn't easily made. Having drivers declare links down too soon and then having the next layer ignore that is not a good solution, especially because there are also link layers which can determine their up/down status much more accurately.

> If the host has implemented the strong host model, then when the transient "link down" is resolved, the connection won't resume using the prior outbound interface. This could lead to applications experiencing sub-optimal conditions long-term based on a transient event.

Hm, I must say that I don't know off the top of my head if shim6 will automatically rehome to the primary address pair after some time. I'll have to reread the specifications. Or does anyone else remember this?

> There are a few approaches that come to mind:
>
> 1. Continue to make decisions based on timers, perhaps using the "link down" indication as a hint to lower the timer values (e.g. requiring only two retransmissions instead of three)

I'm not a fan of timer-based decision making if it can be avoided because it's extra work and you pretty much always wait too long or not long enough.

> 2. Suggest the weak host model to be used along with SHIM6, so that if the "link down" proves to be transient, the connection will migrate back to its former outgoing interface.

That would be good, yes.

>> It would be possible to adjust the keepalive interval based on RTT estimates, though.
>
> If the information is available, this might be the best approach.

But is it worth the trouble? The timeout will remain the same (10 seconds unless something else is established during the shim context setup) so the only difference is that if the RTT is 10 ms you could choose to send a keepalive after 9980 ms but if it's 1500 ms you send the keepalive after 6000 ms.

Implementers will probably just use 3 seconds so 3 keepalives are seen before a timeout or 4 seconds so 2 are seen before a timeout.
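The keepalive counts quoted above (a 3 second interval giving 3 keepalives, a 4 second interval giving 2, both against the 10 second default timeout) can be checked with a short sketch. The function name and the counting rule (keepalives at fixed multiples of the interval, counted only if sent strictly before the timeout expires) are assumptions made for illustration and are not taken from the drafts.

```python
def keepalives_before_timeout(interval_ms: int, timeout_ms: int = 10_000) -> int:
    """Count keepalives sent at interval_ms, 2*interval_ms, ... strictly before timeout_ms.

    Hypothetical helper: assumes a fixed keepalive interval and integer milliseconds.
    """
    if interval_ms <= 0 or timeout_ms <= 0:
        raise ValueError("interval_ms and timeout_ms must be positive")
    # Number of integers k >= 1 with k * interval_ms < timeout_ms.
    return (timeout_ms - 1) // interval_ms


if __name__ == "__main__":
    for interval in (3_000, 4_000):
        print(interval, "ms ->", keepalives_before_timeout(interval), "keepalives")
```

Under these assumptions a 5 second interval would leave only one keepalive inside the window, which may explain why 3 or 4 seconds looks like the natural choice.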
Re: TSV-DIR Review of draft-ietf-shim6-protocol-09.txt
> Note that we explicitly say that this applies to local knowledge that an address doesn't work, although then the issue is confused by mentioning deprecation, which leaves the address involved still reachable.

Yes.

> The issue is also different from regular operation, because if I pull out my ethernet cable when I'm not running shim6, the only two options are keeping the session open until it times out or giving up immediately. The latter is suboptimal because I could put the cable back in before the session times out. But with shim6, it's possible to rehome the connection to a different interface so it keeps running whether or not the cable is plugged in again.

My concern is about wireless media which can experience large variations in signal/noise ratio, in the process generating transient "link down" indications. This could cause those connections to migrate to other media/interfaces.

If the host has implemented the strong host model, then when the transient "link down" is resolved, the connection won't resume using the prior outbound interface. This could lead to applications experiencing sub-optimal conditions long-term based on a transient event.

There are a few approaches that come to mind:

1. Continue to make decisions based on timers, perhaps using the "link down" indication as a hint to lower the timer values (e.g. requiring only two retransmissions instead of three)

2. Suggest the weak host model to be used along with SHIM6, so that if the "link down" proves to be transient, the connection will migrate back to its former outgoing interface.

> It would be possible to adjust the keepalive interval based on RTT estimates, though.

If the information is available, this might be the best approach.
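The first approach, treating "link down" as a hint that lowers the threshold rather than as a trigger, could look roughly like the sketch below. The class, its method names and the event model are hypothetical; only the thresholds (three retransmissions normally, two after a hint) come from the message above.

```python
from dataclasses import dataclass


@dataclass
class FailureDetector:
    """Hypothetical detector: "link down" only lowers the threshold, it never triggers re-homing."""

    normal_threshold: int = 3  # retransmissions required without a hint
    hinted_threshold: int = 2  # retransmissions required after a "link down" hint
    link_down_hint: bool = False
    missed: int = 0

    def on_link_down(self) -> None:
        self.link_down_hint = True

    def on_link_up(self) -> None:
        self.link_down_hint = False

    def on_reply(self) -> None:
        # Any inbound traffic shows the path still works.
        self.missed = 0

    def on_retransmission_timeout(self) -> bool:
        """Record one unanswered retransmission; return True when re-homing should start."""
        self.missed += 1
        threshold = self.hinted_threshold if self.link_down_hint else self.normal_threshold
        return self.missed >= threshold


if __name__ == "__main__":
    detector = FailureDetector()
    detector.on_link_down()
    print([detector.on_retransmission_timeout() for _ in range(2)])
```

In this sketch a transient "link down" that is followed by "link up" and a reply leaves the detector in its normal state, so the connection is not moved on the strength of the indication alone.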
Re: TSV-DIR Review of draft-ietf-shim6-protocol-09.txt
> When traffic is flowing in both directions or when there is no traffic, there is no need to send keepalives.

That makes sense. It would be helpful for the protocol document to include this explanation in the keepalive section (and also for the Failure Detection specification to remove the reference to TCP Keepalives, since they're unrelated).

>> The SCTP algorithms make extensive use of transport layer information such as retransmission counts, which the SHIM6 Failure Detection document seems to assume will be unavailable.
>
> Right. Shim6 must work for all kinds of communication. However, it would be good to make use of transport protocol knowledge when available. You feel there are missed opportunities in this area?

Yes. If the transport layer can make the information available, then it seems to me that Failure Detection could be improved, providing for TCP a better approximation of the functionality in SCTP.

>> In general, it would not be desirable for SHIM6 to initiate the re-homing of a TCP connection due to a transient failure. Link layer "down" indications or resulting address deprecations are examples of this.
>
> The trouble is, how do you know a problem is transient?

You don't. That's why "link down" indications are best ignored by both the Internet and Transport layers.

> About address deprecation: I do seem to remember a discussion where the conclusion was that deprecation is no reason to stop using an address just because it's deprecated. Telling the other end that an address should no longer be used when it's deprecated would have that effect, so if the proto document mandates that, that could be problematic.

It is suggested, not mandated. However, it's hard to see a circumstance in which this would be helpful (and it will often hurt), so I'd prefer to see the suggestion removed.

> (One scenario is a router that no longer sends RAs but still continues to route, it would be possible to use the addresses after they've become deprecated until they become invalid in this case.)

Yes.

>> 6. Interactions of SHIM6 with congestion control. Section 4.3 of the Failure Detection document talks about exploration timeout values. Exploration can be kicked off if no inbound traffic is received within Send Timeout (default = 10 seconds).
>
>> The first observation is that the Send Timeout should probably depend on the RTO estimate, as it does in SCTP. Otherwise we could have a network with a high RTO and SHIM6 exploration could commence after RTO is backed off only a few times. This would be undesirable from a congestion control point of view.
>
> We need the timeout to be somewhat long to accommodate the case where a host receives a packet, then does processing and finally sends an answer. However, it also needs to be fairly short so that we have time to repair a failure before the user, application or transport protocol give up. I don't think alignment with the transport's retransmission timeout makes sense here.

The RTO represents the best estimate of the maximum time that can expire until an ACK is expected. So while I'd agree that failover should occur prior to transport connection teardown, it is not desirable for this to occur before a minimum number of RTOs has expired. The time that this takes will depend on the RTO.

For example, if the goal is to re-home after 3 timeouts, using an RTOmin of 1 second, three timeouts will take 7 seconds. However, where the RTO is much larger, 10 seconds might correspond to fewer timeouts (maybe only 2).

>> The suggested value of the Initial Probe Timeout (500ms) is less than RTOmin and 4 probes can be sent before initiating exponential backoff. This seems like it could violate "conservation of packets". Why doesn't exponential backoff begin immediately?
>
> Then you'd either have to send the first few probes in quick succession without leaving a reasonable amount of time for responses to come back, or it would take very long for the first 5 or so probes to go out. 500 ms is still relatively aggressive as it's well below the maximum observed RTTs on the internet.

The issue is kicking off SHIM6 exploration simultaneously with transport layer congestive backoff. While SHIM6 exploration is designed to find alternate paths, the paths could still share a bottleneck. So while transport layer congestive backoff is attempting to let packets drain from the network, SHIM6 will be injecting more packets. In these situations, aggressively sending Probes will not improve the likelihood that they will get through.

With respect to 500ms being "well below the maximum observed RTT on the Internet", I'd observe that RTOmin is set at 1 second. So my recommendation would be to set the minimum Initial Probe Timeout to RTOmin, and allow upwards adjustment based on the RTO estimate, if available.
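The "three timeouts will take 7 seconds" figure follows from doubling the RTO after each timeout (1 + 2 + 4 seconds). The sketch below reproduces that sum and counts how many backed-off timeouts fit inside the 10 second Send Timeout. It assumes plain doubling with no upper bound on the RTO, and the 3 second RTO used as the "much larger" case is a value picked here for illustration.

```python
def time_for_timeouts(rto_s: float, timeouts: int) -> float:
    """Total time for `timeouts` consecutive timeouts, doubling the RTO after each one."""
    return sum(rto_s * 2**i for i in range(timeouts))


def timeouts_within(rto_s: float, window_s: float) -> int:
    """Number of consecutive backed-off timeouts that fit completely inside window_s."""
    count, elapsed, rto = 0, 0.0, rto_s
    while elapsed + rto <= window_s:
        elapsed += rto
        rto *= 2  # exponential backoff; no RTO cap is modelled (assumption)
        count += 1
    return count


if __name__ == "__main__":
    print(time_for_timeouts(1.0, 3))    # 1 + 2 + 4 seconds
    print(timeouts_within(1.0, 10.0))   # RTO = RTOmin = 1 s
    print(timeouts_within(3.0, 10.0))   # illustrative larger RTO
```

With an initial RTO of 1 second three timeouts fit in the 10 second window, while with 3 seconds only two do, which matches the "maybe only 2" estimate.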
Re: TSV-DIR Review of draft-ietf-shim6-protocol-09.txt
On 22 nov 2007, at 17:17, Bernard Aboba wrote:

> However, this relationship is not explored. The TCP keepalive interval is generally kept quite large, partly out of a desire not to tear down idle TCP connections due to a transient failure. The SHIM6 keepalive interval during idle is not defined in the Failure Detection document, but my impression was that it could be much shorter and this would seem to collide with the philosophy of TCP keepalives.

Shim6 REAP keepalives aren't sent when a shim6 context is idle. The REAP protocol assumes that traffic must always be bidirectional, so when there has been outgoing traffic but no incoming traffic, there must be a failure. Keepalives exist to accommodate the cases where there is legitimately only incoming traffic but no return traffic. When traffic is flowing in both directions or when there is no traffic, there is no need to send keepalives.

> The SCTP algorithms make extensive use of transport layer information such as retransmission counts, which the SHIM6 Failure Detection document seems to assume will be unavailable.

Right. Shim6 must work for all kinds of communication. However, it would be good to make use of transport protocol knowledge when available. You feel there are missed opportunities in this area?

> In general, it would not be desirable for SHIM6 to initiate the re-homing of a TCP connection due to a transient failure. Link layer "down" indications or resulting address deprecations are examples of this.

The trouble is, how do you know a problem is transient?

About address deprecation: I do seem to remember a discussion where the conclusion was that deprecation is no reason to stop using an address just because it's deprecated. Telling the other end that an address should no longer be used when it's deprecated would have that effect, so if the proto document mandates that, that could be problematic.

(One scenario is a router that no longer sends RAs but still continues to route, it would be possible to use the addresses after they've become deprecated until they become invalid in this case.)

> 6. Interactions of SHIM6 with congestion control. Section 4.3 of the Failure Detection document talks about exploration timeout values. Exploration can be kicked off if no inbound traffic is received within Send Timeout (default = 10 seconds).
>
> The first observation is that the Send Timeout should probably depend on the RTO estimate, as it does in SCTP. Otherwise we could have a network with a high RTO and SHIM6 exploration could commence after RTO is backed off only a few times. This would be undesirable from a congestion control point of view.

We need the timeout to be somewhat long to accommodate the case where a host receives a packet, then does processing and finally sends an answer. However, it also needs to be fairly short so that we have time to repair a failure before the user, application or transport protocol give up. I don't think alignment with the transport's retransmission timeout makes sense here.

> The suggested value of the Initial Probe Timeout (500ms) is less than RTOmin and 4 probes can be sent before initiating exponential backoff. This seems like it could violate "conservation of packets". Why doesn't exponential backoff begin immediately?

Then you'd either have to send the first few probes in quick succession without leaving a reasonable amount of time for responses to come back, or it would take very long for the first 5 or so probes to go out. 500 ms is still relatively aggressive as it's well below the maximum observed RTTs on the internet.

Iljitsch
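The trade-off in the last exchange can be made concrete by comparing probe send times under the two schedules. This is a sketch under a simplified model, not the schedule from the Failure Detection draft: each probe is followed by a wait, the first `flat_probes` waits equal the initial timeout, and every later wait doubles. The function and parameter names are hypothetical; 500 ms and 4 probes are the values mentioned in the thread.

```python
def probe_send_times(n: int, initial_ms: int = 500, flat_probes: int = 4) -> list[int]:
    """Send times in ms of the first n probes under the simplified model.

    The wait after probe i (0-based) is initial_ms while i < flat_probes,
    then doubles for each further probe. flat_probes=1 models backoff that
    begins immediately after the first wait.
    """
    times, t = [], 0
    for i in range(n):
        times.append(t)
        exponent = max(0, i - flat_probes + 1)
        t += initial_ms * 2**exponent
    return times


if __name__ == "__main__":
    print(probe_send_times(5))                    # four flat waits, then backoff
    print(probe_send_times(5, flat_probes=1))     # backoff begins immediately
    print(probe_send_times(5, initial_ms=1000))   # Initial Probe Timeout = RTOmin
```

In this model the fifth probe goes out at 2000 ms with four flat waits but only at 7500 ms with immediate backoff, which is the delay the reply objects to. Raising the initial timeout to 1 second keeps the flat phase while halving the probe rate.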
TSV-DIR Review of draft-ietf-shim6-protocol-09.txt
I have reviewed this document as part of the transport area directorate's ongoing effort to review key IETF documents. These comments were written primarily for the transport area directors, but are copied to the document's authors for their information and to allow them to address any issues raised.

In preparing this review, in addition to the protocol document, I have also read the other SHIM6 WG drafts such as the Applicability Statement and the Failure Detection document.

Since this is a review for the Transport Directorate, the interaction between SHIM6 and the Transport layer was my primary focus. The basic mechanics are laid out in the protocol document, and the ability for applications (and transports) to avoid or choose use of SHIM6 is described in the API document. The Failure Detection document describes the reachability detection algorithms.

Overall, I think that the document could do a better job of describing the interaction of SHIM6 with the transport layer. While SHIM6 layering is clearly explained, in several places within SHIM6 WG documents interactions are described but details or recommendations are not fully fleshed out. The transport area has traditionally demanded a higher level of detail with respect to algorithms (particularly relating to parameter estimation and congestion control). Overall, I think that this document could benefit from addition of a subsection within Section 1 devoted to SHIM6-Transport layer interaction.

Overall, the biggest issue appears to be integration of dynamically estimated transport parameters (RTT, RTO, etc.) with SHIM6 re-homing. The impact of SHIM6 on parameter estimation is covered in draft-schuetz-tcpm-tcp-rlci-01 which is an informative reference even though it is "recommended".

Here are the issues that I noted within the document set:

1. MTU discovery/MSS negotiation. This is briefly discussed in Section 15.3 of the protocol document. As noted there, SHIM6 failover may result in a change in MTU. Some specific recommendations might be helpful here (such as a recommendation to use Packetization Layer Path MTU discovery). The insertion of a Payload Extension (or common shim control message) header may also result in an MTU change in mid-connection; however, this seems easier to handle assuming that the transport layer is made aware of it and can reduce the MSS accordingly.

2. Keepalive Messages. Section 5.12 refers to the Failure Detection document (a normative reference) for the definition of the Keepalive Message format. Although I understand that the details of Keepalive algorithms might belong in a separate document, support for Keepalive appears to be required, so that the message format needs to be defined in the protocol document, and I would also like to see a discussion of the philosophy of Failure Detection in Section 1.

Negotiation of a static SHIM6 Keepalive timeout is allowed, if different from the default value. Section 4.1 of the Failure Detection document states:

   The setting of these values is also related to various parameters in transport protocols, such as TCP keepalive interval.

However, this relationship is not explored. The TCP keepalive interval is generally kept quite large, partly out of a desire not to tear down idle TCP connections due to a transient failure. The SHIM6 keepalive interval during idle is not defined in the Failure Detection document, but my impression was that it could be much shorter and this would seem to collide with the philosophy of TCP keepalives. So I'm not clear what the above sentence means.

3. Interactions with SCTP. The applicability statement raises some potential issues:

   However, since SCTP and shim6 both aim to exchange addressing information between hosts in order to meet the same general goal, it is possible that their simultaneous use might result in unexpected behaviour, e.g. due to race conditions. It is recommended that shim6 is not used for SCTP sessions, and that path maintenance is provided solely by SCTP.

The API document provides details on how SCTP can request that SHIM6 not be used with it. However, the protocol document does not discuss this issue, which could be handled in the transport interactions section.

Given that SHIM6 is not of much use for SCTP, I wondered whether SHIM6 would bring equivalent functionality to TCP. Comparing the reachability detection algorithms described in the Failure Detection document with the corresponding SCTP algorithms described in RFC 4960 Section 8, the answer appears to be "no". The SCTP algorithms make extensive use of transport layer information such as retransmission counts, which the SHIM6 Failure Detection document seems to assume will be unavailable.

As described later on, it would appear to me that Failure Detection and the Transport layer need to be closely integrated to be effective; this led me to wonder whether this represents a potential archi