RE: MTU handling in 6RD deployments
Hi Mark,

> -----Original Message-----
> From: Mark Townsley [mailto:m...@townsley.net]
> Sent: Friday, January 17, 2014 12:41 AM
> To: Mikael Abrahamsson
> Cc: Templin, Fred L; ipv6-ops@lists.cluenet.de
> Subject: Re: MTU handling in 6RD deployments
>
> On Jan 17, 2014, at 9:24 AM, Mikael Abrahamsson wrote:
>
>> On Thu, 16 Jan 2014, Templin, Fred L wrote:
>>
>>> The key is that we want to probe the path between the BR and CE (in
>>> both directions) *before* allowing regular data packets to flow. We
>>> want to know ahead of time whether to allow large packets into the
>>> tunnel or whether we need to shut the MTU down to 1480 (or 1472 or
>>> something) and clamp the MSS, because once we restrict the tunnel
>>> MTU, hosts will be stuck with a degenerate MTU indefinitely, or at
>>> least for a long time.
>>
>> This method makes some sense, but since network conditions can change,
>> I would like to see periodic re-checks that the tunnel still works
>> with the larger packet size, perhaps by the CE pinging itself over the
>> tunnel once per minute with the larger packet size whenever it is in
>> use.
>
> Section 8 of RFC 5969 could be relevant here.

In that section, I see:

   The link-local address of a 6rd virtual interface performing the 6rd
   encapsulation would, if needed, be formed as described in Section 3.7
   of [RFC4213].  However, no communication using link-local addresses
   will occur.

So, if we were to construct the pings at the IPv6 level, we would want
to use link-local source and destination addresses. But that raises a
question that would need to be addressed: should the pings be
constructed at the IPv6 level, the IPv4 level, or some mid-level like
SEAL?

One other thing: we are specifically not trying to determine an exact
path MTU. We are only trying to answer the binary question of whether or
not the tunnel can pass a 1500 byte IPv6 packet.

Thanks - Fred
fred.l.temp...@boeing.com

> - Mark

>> -- 
>> Mikael Abrahamsson    email: swm...@swm.pp.se
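For concreteness, the arithmetic behind the 1480/1472 figures and the
corresponding TCP MSS clamp can be sketched as follows. This is a
minimal Python illustration, not text from RFC 5969; the helper names
are made up, and the 8-byte "extra" margin is an assumed allowance for
access links with additional encapsulation overhead.

    # Standard header sizes, in bytes.
    IPV4_HDR = 20   # 6rd encapsulates IPv6 directly in IPv4 (protocol 41)
    IPV6_HDR = 40
    TCP_HDR  = 20

    def tunnel_mtu(link_mtu: int = 1500, extra: int = 0) -> int:
        """IPv6 MTU of the 6rd tunnel after subtracting encapsulation."""
        return link_mtu - IPV4_HDR - extra

    def clamped_mss(ipv6_mtu: int) -> int:
        """Largest TCP MSS that fits in one IPv6 packet of this MTU."""
        return ipv6_mtu - IPV6_HDR - TCP_HDR

    print(tunnel_mtu())          # 1480
    print(tunnel_mtu(extra=8))   # 1472, allowing 8 bytes of extra overhead
    print(clamped_mss(1480))     # 1420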
RE: MTU handling in 6RD deployments
Hi Mark,

> -----Original Message-----
> From: ipv6-ops-bounces+fred.l.templin=boeing@lists.cluenet.de
> [mailto:ipv6-ops-bounces+fred.l.templin=boeing@lists.cluenet.de]
> On Behalf Of Templin, Fred L
> Sent: Friday, January 17, 2014 7:57 AM
> To: Mark Townsley; Mikael Abrahamsson
> Cc: ipv6-ops@lists.cluenet.de
> Subject: RE: MTU handling in 6RD deployments
>
> Hi Mark,
>
>> -----Original Message-----
>> From: Mark Townsley [mailto:m...@townsley.net]
>> Sent: Friday, January 17, 2014 12:41 AM
>> To: Mikael Abrahamsson
>> Cc: Templin, Fred L; ipv6-ops@lists.cluenet.de
>> Subject: Re: MTU handling in 6RD deployments
>>
>> On Jan 17, 2014, at 9:24 AM, Mikael Abrahamsson wrote:
>>
>>> On Thu, 16 Jan 2014, Templin, Fred L wrote:
>>>
>>>> The key is that we want to probe the path between the BR and CE (in
>>>> both directions) *before* allowing regular data packets to flow. We
>>>> want to know ahead of time whether to allow large packets into the
>>>> tunnel or whether we need to shut the MTU down to 1480 (or 1472 or
>>>> something) and clamp the MSS, because once we restrict the tunnel
>>>> MTU, hosts will be stuck with a degenerate MTU indefinitely, or at
>>>> least for a long time.
>>>
>>> This method makes some sense, but since network conditions can
>>> change, I would like to see periodic re-checks that the tunnel still
>>> works with the larger packet size, perhaps by the CE pinging itself
>>> over the tunnel once per minute with the larger packet size whenever
>>> it is in use.
>>
>> Section 8 of RFC 5969 could be relevant here.
>
> In that section, I see:
>
>    The link-local address of a 6rd virtual interface performing the
>    6rd encapsulation would, if needed, be formed as described in
>    Section 3.7 of [RFC4213].  However, no communication using
>    link-local addresses will occur.

Sorry, I was looking at the wrong section. I see now that Section 8 is
talking about a method for a CE to send an ordinary data packet that
loops back via the BR. That method is fine, but it is no more immune to
someone abusing the mechanism than sending a ping (or some other NUD
message) would be. By using a ping, the BR can impose rate-limiting on
its ping responses, whereas with a looped-back data packet the BR really
can't do rate limiting.

Also, Section 8 of RFC5969 only talks about the CE testing the forward
path to the BR. Unless the BR also tests the reverse path to the CE, it
has no way of knowing whether the CE can accept large packets.

Thanks - Fred
fred.l.temp...@boeing.com

> So, if we were to construct the pings at the IPv6 level, we would want
> to use link-local source and destination addresses. But that raises a
> question that would need to be addressed: should the pings be
> constructed at the IPv6 level, the IPv4 level, or some mid-level like
> SEAL?
>
> One other thing: we are specifically not trying to determine an exact
> path MTU. We are only trying to answer the binary question of whether
> or not the tunnel can pass a 1500 byte IPv6 packet.
>
> Thanks - Fred
> fred.l.temp...@boeing.com
>
>> - Mark
>
>>> -- 
>>> Mikael Abrahamsson    email: swm...@swm.pp.se
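To make the rate-limiting point concrete: if the BR answers explicit
MTU-test pings, it can meter the replies with something like a token
bucket. The following is a hypothetical Python sketch; the class name
and the rate/burst values are illustrative, not taken from RFC 5969 or
from this thread.

    import time

    class TokenBucket:
        """Allow at most `rate` replies per second, with bursts of `burst`."""
        def __init__(self, rate: float, burst: float):
            self.rate = rate              # tokens (replies) added per second
            self.burst = burst            # maximum bucket depth
            self.tokens = burst
            self.stamp = time.monotonic()

        def allow(self) -> bool:
            now = time.monotonic()
            self.tokens = min(self.burst,
                              self.tokens + (now - self.stamp) * self.rate)
            self.stamp = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True               # send the ping reply
            return False                  # drop it: CE is probing too fast

    # e.g., at most 5 ping replies per second per CE, bursts of up to 10:
    limiter = TokenBucket(rate=5.0, burst=10.0)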
RE: MTU handling in 6RD deployments
On Fri, 17 Jan 2014, Mikael Abrahamsson wrote:

> On Fri, 17 Jan 2014, Templin, Fred L wrote:
>
>> Sorry, I was looking at the wrong section. I see now that Section 8
>> is talking about a method for a CE to send an ordinary data packet
>> that loops back via the BR. That method is fine, but it is no more
>> immune to someone abusing the mechanism than sending a ping (or some
>> other NUD message) would be. By using a ping, the BR can impose
>> rate-limiting on its ping responses, whereas with a looped-back data
>> packet the BR really can't do rate limiting.
>
> You don't ping the BR, you ping yourself via the BR. The BR only
> forwards the packet.

> My bad, I didn't read your text properly.

Why would the BR want to rate-limit data plane traffic?

-- 
Mikael Abrahamsson    email: swm...@swm.pp.se
RE: MTU handling in 6RD deployments
On Fri, 17 Jan 2014, Templin, Fred L wrote:

> Sorry, I was looking at the wrong section. I see now that Section 8 is
> talking about a method for a CE to send an ordinary data packet that
> loops back via the BR. That method is fine, but it is no more immune
> to someone abusing the mechanism than sending a ping (or some other
> NUD message) would be. By using a ping, the BR can impose rate-limiting
> on its ping responses, whereas with a looped-back data packet the BR
> really can't do rate limiting.

You don't ping the BR, you ping yourself via the BR. The BR only
forwards the packet.

> Also, Section 8 of RFC5969 only talks about the CE testing the forward
> path to the BR. Unless the BR also tests the reverse path to the CE,
> it has no way of knowing whether the CE can accept large packets.

You misread the text.

-- 
Mikael Abrahamsson    email: swm...@swm.pp.se
RE: MTU handling in 6RD deployments
Hi Mikael,

> -----Original Message-----
> From: Mikael Abrahamsson [mailto:swm...@swm.pp.se]
> Sent: Friday, January 17, 2014 8:15 AM
> To: Templin, Fred L
> Cc: Mark Townsley; ipv6-ops@lists.cluenet.de
> Subject: RE: MTU handling in 6RD deployments
>
> On Fri, 17 Jan 2014, Templin, Fred L wrote:
>
>> Sorry, I was looking at the wrong section. I see now that Section 8
>> is talking about a method for a CE to send an ordinary data packet
>> that loops back via the BR. That method is fine, but it is no more
>> immune to someone abusing the mechanism than sending a ping (or some
>> other NUD message) would be. By using a ping, the BR can impose
>> rate-limiting on its ping responses, whereas with a looped-back data
>> packet the BR really can't do rate limiting.
>
> You don't ping the BR, you ping yourself via the BR. The BR only
> forwards the packet.
>
>> Also, Section 8 of RFC5969 only talks about the CE testing the
>> forward path to the BR. Unless the BR also tests the reverse path to
>> the CE, it has no way of knowing whether the CE can accept large
>> packets.
>
> You misread the text.

I don't see anywhere where it says that the BR should also ping the CE
and cache a boolean ACCEPTS_BIG_PACKETS for this CE. If the BR doesn't
do that, it needs to set its MTU to the CE to 1480 (or 1472 or
something).

Thanks - Fred
fred.l.temp...@boeing.com

> -- 
> Mikael Abrahamsson    email: swm...@swm.pp.se
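The kind of per-CE state Fred has in mind could look roughly like the
following Python sketch. It is purely illustrative: the function names,
the 1480 fallback, and the cache lifetime are assumptions, not anything
specified in RFC 5969.

    import time

    CACHE_LIFETIME = 600.0   # seconds; an illustrative value
    _accepts_big = {}        # CE IPv4 address -> expiry (monotonic seconds)

    def record_big_packet_ok(ce_ipv4: str) -> None:
        """Call when a 1500-byte probe to/from this CE has succeeded."""
        _accepts_big[ce_ipv4] = time.monotonic() + CACHE_LIFETIME

    def mtu_for_ce(ce_ipv4: str) -> int:
        """MTU the BR uses toward this CE, based on the cached probe result."""
        expiry = _accepts_big.get(ce_ipv4)
        if expiry is not None and expiry > time.monotonic():
            return 1500   # a probe succeeded recently: allow big packets
        return 1480       # unknown or expired: assume the conservative MTU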
RE: MTU handling in 6RD deployments
> cache a boolean ACCEPTS_BIG_PACKETS for this CE.

BTW, the reason I am saying that the only thing we are trying to
determine is whether or not the CE-BR path can pass a 1500 byte packet
is that 1500 bytes is the de facto Internet cell size that most end
systems expect to see without getting an ICMP PTB back. So, if we can
give the hosts at least 1500, then if they want to try for a larger size
they should use RFC4821. This makes things much easier than trying to
probe the CE-BR path for an exact size.

Thanks - Fred
fred.l.temp...@boeing.com
Re: MTU handling in 6RD deployments
On Jan 17, 2014, at 5:14 PM, Mikael Abrahamsson wrote:

> On Fri, 17 Jan 2014, Templin, Fred L wrote:
>
>> Sorry, I was looking at the wrong section. I see now that Section 8
>> is talking about a method for a CE to send an ordinary data packet
>> that loops back via the BR. That method is fine, but it is no more
>> immune to someone abusing the mechanism than sending a ping (or some
>> other NUD message) would be. By using a ping, the BR can impose
>> rate-limiting on its ping responses, whereas with a looped-back data
>> packet the BR really can't do rate limiting.
>
> You don't ping the BR, you ping yourself via the BR. The BR only
> forwards the packet.

Precisely. The whole idea is to stay on the data plane.

- Mark

>> Also, Section 8 of RFC5969 only talks about the CE testing the
>> forward path to the BR. Unless the BR also tests the reverse path to
>> the CE, it has no way of knowing whether the CE can accept large
>> packets.
>
> You misread the text.
>
> -- 
> Mikael Abrahamsson    email: swm...@swm.pp.se
RE: MTU handling in 6RD deployments
On Fri, 17 Jan 2014, Templin, Fred L wrote:

> But, if the BR doesn't examine the packet, it could get caught up in a
> flood-ping initiated by a malicious CE.

The BR should have enough data-plane forwarding capacity to handle this.

> I am considering a specific ping, rather than an ordinary data packet,
> as a way for the BR to know whether the CE is testing the MTU or just
> looping back packets. If the BR knows the CE is testing the MTU, it
> can send ping replies subject to rate limiting so a malicious CE can't
> swamp the BR with excessive pings.

Why does it need to know? The CE is pinging itself CE-BR-CE, and if the
CE doesn't receive the packet back, then the MTU is obviously limited.

So the CE sends out a packet towards the BR, with the destination IPv6
address being the CE itself. The packet arrives at the BR, gets
decapsulated, gets an IPv6 destination address lookup, gets
re-encapsulated, and is then sent back to the CE. Pure data plane. I
don't see why the BR should need to get involved in anything more
complicated than that.

-- 
Mikael Abrahamsson    email: swm...@swm.pp.se
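A CE-side probe of this shape can be sketched with Scapy. This is a
minimal illustration of the CE-BR-CE loop Mikael describes, not
normative 6rd behavior; the addresses are documentation-prefix
placeholders, and running it requires raw-socket privileges.

    from scapy.all import IP, IPv6, ICMPv6EchoRequest, Raw, send

    BR_IPV4 = "192.0.2.1"          # 6rd border relay (placeholder)
    CE_IPV4 = "198.51.100.2"       # CE WAN address (placeholder)
    CE_IPV6 = "2001:db8:a:b::1"    # the CE's own 6rd-derived address

    # Inner IPv6 packet: 40 (IPv6 header) + 8 (echo header) + 1452
    # (padding) = 1500 bytes. The outer IPv4 packet is then 1520 bytes,
    # which is exactly what the CE-BR path must carry for hosts behind
    # the CE to use a 1500-byte IPv6 MTU.
    probe = (IP(src=CE_IPV4, dst=BR_IPV4, proto=41, flags="DF") /
             IPv6(src=CE_IPV6, dst=CE_IPV6) /
             ICMPv6EchoRequest() /
             Raw(b"\x00" * 1452))

    # DF is set on the outer header, so an undersized IPv4 hop drops the
    # probe rather than fragmenting it. If the encapsulated echo request
    # arrives back at the CE, the CE-BR-CE round trip passes 1500-byte
    # IPv6 packets.
    send(probe)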
RE: MTU handling in 6RD deployments
Hi,

>> You don't ping the BR, you ping yourself via the BR. The BR only
>> forwards the packet.
>
> Precisely. The whole idea is to stay on the data plane.

I do not work for a network equipment manufacturer, so I'll take your
word that remaining in the data plane is critical for 6rd BRs and that
high data rate loopbacks are not a problem.

So, a looped-back MTU test exercises both the forward and reverse path
MTUs between the CE and BR. This is important to the CE because, if it
were only to test the forward path to the BR, it would not know whether
the reverse path MTU is big enough, and allowing an IPv6 destination
outside of the 6rd site to discover a too-large MSS could then result in
communication failures.

In terms of the BR's knowledge of the path MTU to the CE: if we can
assume that the BR will receive the necessary ICMP messages from the 6rd
site, then it can passively translate ICMPv4 PTB messages coming from
the 6rd site into corresponding ICMPv6 PTB messages to send back to the
remote IPv6 correspondent. So, the BR should be able to set an infinite
IPv6 MTU on its tunnel interface and passively translate any PTB
messages it receives. That, plus the fact that the two IPv6 hosts have
to agree on an MSS, excuses the BR from having to do any active probing
itself.

So, take what is already in RFC5969, and add that a successful test of a
1500 byte probe allows the CE to set an infinite IPv6 MTU, with the
understanding that IPv6 hosts that want to use sizes larger than 1500
are expected to use RFC4821.

BTW, by "infinite" I mean 4GB minus the encapsulation overhead.

Thanks - Fred
fred.l.temp...@boeing.com
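The passive translation described above amounts to a small MTU
computation at the BR: the MTU in an ICMPv4 "fragmentation needed"
applies to the outer IPv4 packet, so the inner IPv6 packet must be 20
bytes smaller. A minimal Python sketch, with a made-up function name
(this is an illustration, not code from any 6rd implementation):

    IPV4_HDR = 20
    IPV6_MIN_MTU = 1280   # the IPv6 minimum link MTU

    def ptb_mtu_v4_to_v6(next_hop_mtu_v4: int) -> int:
        """IPv6 MTU to advertise in the translated ICMPv6 Packet Too Big.

        Subtract the 20-byte IPv4 encapsulation header, and never
        advertise less than the IPv6 minimum MTU.
        """
        return max(IPV6_MIN_MTU, next_hop_mtu_v4 - IPV4_HDR)

    print(ptb_mtu_v4_to_v6(1500))   # 1480
    print(ptb_mtu_v4_to_v6(1400))   # 1380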
RE: MTU handling in 6RD deployments
> BTW, by "infinite" I mean 4GB minus the encapsulation overhead.

Umm, sorry; that is only for tunnels over IPv6. For tunnels over IPv4,
"infinite" means 64KB minus the overhead.

Thanks - Fred
fred.l.temp...@boeing.com
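As a worked check of those two figures (an illustration, not thread
text): the IPv4 total length field is 16 bits, while the IPv6 jumbogram
payload length (RFC 2675) is 32 bits, so:

    print(2**16 - 1 - 20)   # 65515: "64KB minus the overhead" over IPv4
    print(2**32 - 1)        # ~4GB: jumbogram payload bound over IPv6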
Re: Anybody else unable to reach sixxs.net?
On Jan 17, 2014, at 7:33 PM, Brian E Carpenter
brian.e.carpen...@gmail.com wrote:

> Tracing route to nginx.sixxs.net [2620:0:6b0:a:250:56ff:fe99:78f7]
> over a maximum of 30 hops:

It works for me from Chicago and Los Angeles. But fifteen minutes has
passed since your message, so maybe transient transit problems...

Good luck,

% traceroute6 www.sixxs.net
traceroute6: Warning: nginx.sixxs.net has multiple addresses; using 2620:0:6b0:a:250:56ff:fe99:78f7
traceroute6 to nginx.sixxs.net (2620:0:6b0:a:250:56ff:fe99:78f7) from 2607:ff50:0:20::, 64 hops max, 12 byte packets
 1  2607:ff50:0:20::1  6.022 ms  1.584 ms  1.049 ms
 2  gigabitethernet11-18.core1.chi1.he.net  7.627 ms  2.154 ms  1.716 ms
 3  10ge5-4.fr1.ord.llnw.net  25.908 ms  26.680 ms  31.710 ms
 4  ve8.fr3.ord4.ipv6.llnw.net  28.149 ms  27.552 ms  26.848 ms
 5  2607:f4e8:2:1::2  3.466 ms  2.473 ms  2.573 ms
 6  uschi03.sixxs.net  5.009 ms  3.342 ms  3.091 ms

% traceroute6 www.sixxs.net
traceroute6: Warning: nginx.sixxs.net has multiple addresses; using 2620::6b0:a:250:56ff:fe99:78f7
traceroute6 to nginx.sixxs.net (2620::6b0:a:250:56ff:fe99:78f7) from 2605:e000:1504:8010::3, 64 hops max, 12 byte packets
 1  2605:e000:1504:8010::a  0.363 ms  0.240 ms  0.184 ms
 2  * * *
 3  2605:e000:0:4::6:1b1  12.968 ms  11.802 ms  12.592 ms
 4  2605:e000:0:4::6:e4  13.888 ms  25.503 ms  15.987 ms
 5  2605:e000:0:4::34  15.358 ms  13.417 ms  15.802 ms
 6  2001:1998:0:8::86  15.453 ms  26.863 ms  12.077 ms
 7  2001:1998:0:4::11c  11.338 ms  2001:1998:0:4::11a  15.702 ms  12.215 ms
 8  10gigabitethernet17.switch2.lax2.he.net  23.888 ms  21.888 ms  10.843 ms
 9  2001:504:13::16  12.843 ms  12.355 ms  13.151 ms
10  ve7.fr3.lax.ipv6.llnw.net  28.727 ms  21.286 ms  11.649 ms
11  10ge5-3.fr1.sjc.llnw.net  42.653 ms  25.235 ms  38.264 ms
12  tge13-4.fr3.ord.ipv6.llnw.net  74.541 ms  85.577 ms  72.086 ms
13  ve8.fr3.ord4.ipv6.llnw.net  84.359 ms  74.412 ms  73.434 ms
14  2607:f4e8:2:1::2  68.140 ms  67.624 ms  66.310 ms
15  uschi03.sixxs.net  64.603 ms  66.013 ms  64.188 ms