Re: [Int-area] Tunneling/MTU BCP, was: Re: Mucking with IP ID

Iljitsch van Beijnum Thu, 02 Aug 2007 05:58:23 -0700

On 2-aug-2007, at 13:57, Iljitsch van Beijnum wrote:

[...]

Looks like I wasn't done yet... Sorry for the bandwidth use, I should really optimize my Maximum Message Unit.


Thinking about tunneling/MTU/PMTUD:

There is a recommendation floating around that suggests setting the outer tunnel header DF bit to the value of the inner header DF bit. The trouble with this is that it only works if:

A. RFC 4821 is used, which is generally something the tunnel operator can't know (also note that RFC 4821 capability is per transport, so even if TCP has it other transports are likely to still fail on PMTUD black holes), or
B. Packet too big messages generated for the outer tunnel header are translated into packet too big messages relating to the inner tunnel header, something that I don't think can be done in all cases with IPv4 because not enough of the original packet is returned.
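To make point B concrete, here is a rough sketch of how little of the inner packet comes back in an IPv4 ICMP error. It assumes the router quotes only the RFC 792 minimum (the offending IP header plus the first 8 bytes behind it); many routers quote more, which is why this only fails "in some cases". The function names and the `encap_header_len` parameter are made up for illustration.

```python
# Sketch: how much of the inner packet is visible in a minimal IPv4 ICMP error.
# Assumption: the router quotes the outer IP header plus only 8 payload bytes
# (the RFC 792 minimum). Routers are allowed to quote more than that.

ICMP_QUOTED_PAYLOAD = 8   # bytes quoted after the outer IPv4 header
INNER_SRC_ADDR_END = 16   # inner IPv4 source address sits in header bytes 12..15


def inner_bytes_quoted(encap_header_len, quoted_payload=ICMP_QUOTED_PAYLOAD):
    """Bytes of the inner IP header that survive in the ICMP error.

    encap_header_len is whatever sits between the outer IPv4 header and the
    inner IP header (0 for plain IP-in-IP, 4 for a basic GRE header).
    """
    return max(0, quoted_payload - encap_header_len)


def can_identify_inner_source(encap_header_len, quoted_payload=ICMP_QUOTED_PAYLOAD):
    """True if the quoted bytes reach past the inner source address."""
    return inner_bytes_quoted(encap_header_len, quoted_payload) >= INNER_SRC_ADDR_END
```

With the minimum quote, plain IP-in-IP returns 8 bytes of the inner header and GRE only 4, so the tunnel ingress can't even see who sent the inner packet, let alone build a too big for it.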

So for IPv4, I don't think copying the DF bit makes sense. This leaves two options:


1. Do PMTUD for the tunnel
2. Don't do PMTUD for the tunnel

In the first case, the outer header DF bit should be set and the tunnel source should dynamically adjust the tunnel MTU based on too big messages that come back for tunneled packets. Then, any large packets with the DF bit set will be dropped and a too big is sent back to the source. Large packets with the DF bit cleared will have to be fragmented. (Yes, you can always try to send back a too big for these too, but if they're ignored you'll have to implement logic to rate limit the too bigs.) So you do have fragmentation overhead on the tunnel ingress point but no reassembly overhead at the tunnel egress point.
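The ingress behaviour for this first case could be sketched roughly as follows. This is an illustration only, not any particular implementation: the `Tunnel` class, its method names and the returned action strings are invented here, and it assumes plain IPv4-in-IPv4 encapsulation (20 bytes of overhead) with an initial path MTU of 1500.

```python
from dataclasses import dataclass

OUTER_IPV4_HEADER = 20  # assumed overhead: plain IPv4-in-IPv4, no options


@dataclass
class Tunnel:
    # Current estimate of the path MTU between the tunnel endpoints.
    path_mtu: int = 1500

    @property
    def tunnel_mtu(self):
        # Largest inner packet that fits once the outer header is added.
        return self.path_mtu - OUTER_IPV4_HEADER

    def on_too_big(self, reported_mtu):
        # A too big came back for a tunneled packet (outer DF was set):
        # lower the estimate, never raise it from an ICMP message.
        if reported_mtu < self.path_mtu:
            self.path_mtu = reported_mtu

    def handle(self, inner_len, inner_df):
        # Decide what the ingress does with one inner packet.
        if inner_len <= self.tunnel_mtu:
            return "forward"
        if inner_df:
            return "drop_and_send_too_big"
        # DF clear: fragment the inner packet before encapsulating, so the
        # egress has nothing to reassemble.
        return "fragment_inner"
```

For example, a fresh `Tunnel()` forwards a 1480-byte packet, drops a 1500-byte packet with DF set, and fragments one with DF cleared; after `on_too_big(1400)` the tunnel MTU shrinks to 1380.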

In the second case, the tunnel MTU will still probably be something like 1480 so incoming packets with DF set will be dropped and a too big sent back, but an administrator could choose to set the tunnel MTU to 1500 (or bigger) and accept the fact that all large packets will be fragmented at the tunnel ingress point and reassembled at the tunnel egress point. Since the DF bit is cleared, any hops with a reduced MTU will lead to fragmentation in the middle with reassembly at the tunnel egress point.
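To get a feel for what that fragmentation looks like, here is a small sketch that computes IPv4 fragment sizes. It assumes a 20-byte outer IPv4 header without options; the function name `fragment_sizes` is made up for this example.

```python
IPV4_HEADER = 20  # assumed: outer IPv4 header without options


def fragment_sizes(packet_len, mtu, header_len=IPV4_HEADER):
    """Total sizes (header + data) of the fragments of one IPv4 packet.

    Every fragment except the last must carry a multiple of 8 data bytes,
    because the fragment offset field counts in 8-byte units.
    """
    if packet_len <= mtu:
        return [packet_len]
    payload = packet_len - header_len
    per_fragment = (mtu - header_len) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(header_len + chunk)
        payload -= chunk
    return sizes
```

With the tunnel MTU set to 1500, a full-size inner packet becomes a 1520-byte outer packet, and `fragment_sizes(1520, 1500)` gives `[1500, 40]`: every large packet costs two packets on the wire plus reassembly at the egress.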

Then we have IPv6. Since DF is implied here, translating too bigs when the outer header is IPv6 is possible, because ICMPv6 returns enough of the original packet. But since it's impossible to carry a minimum maximum sized IPv6 packet of 1280 bytes through a tunnel running over a link with the minimum MTU of 1280 bytes, it's extremely important to set link MTUs higher than 1280 bytes wherever possible, or black holes that can't be recovered from will be the result.
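The arithmetic behind the 1280 problem, as a minimal sketch. The default overhead assumes IPv6-in-IPv6 (one extra 40-byte header) and the helper names are invented for illustration; any other encapsulation just changes the overhead value.

```python
IPV6_HEADER = 40     # fixed IPv6 header size in bytes
IPV6_MIN_MTU = 1280  # every IPv6 link must support at least this


def inner_mtu(link_mtu, encap_overhead=IPV6_HEADER):
    """Largest inner packet a tunnel can carry unfragmented over the link."""
    return link_mtu - encap_overhead


def can_carry_min_ipv6(link_mtu, encap_overhead=IPV6_HEADER):
    """True if a 1280-byte inner IPv6 packet fits through the tunnel."""
    return inner_mtu(link_mtu, encap_overhead) >= IPV6_MIN_MTU
```

Over a 1280-byte link the tunnel can only carry 1240 bytes, and the inner sender isn't required to go below 1280, so a too big can't fix it. Even a 20-byte IPv4 outer header leaves just 1260.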

An interesting question in this regard is what the maximum packet size should be for applications/transports that can't adjust their packet size dynamically and/or don't implement RFC 4821. For IPv6, the obvious answer is 1280, but maybe this is a bit too conservative for a general recommendation. I know that 1450 is often recommended for use with streaming video, but in practice this becomes 1478 with IP and UDP headers. (1498 for IPv6...) For IPv4, this leaves enough room for an extra IP header (20 bytes) OR a PPPoE header (8 bytes), but not for both or for GRE encapsulation (at least 24 bytes).
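The 1478 and 1498 figures can be checked with a few lines. This sketch assumes the 1450 bytes are the UDP payload, so that an 8-byte UDP header plus the IP header is what gets added, and it assumes a 1500-byte Ethernet MTU; the function names are made up.

```python
MTU = 1500       # assumed Ethernet MTU
UDP_HEADER = 8
IPV4_HEADER = 20
IPV6_HEADER = 40


def packet_size(payload, ip_header):
    """On-the-wire IP packet size for a UDP payload of the given length."""
    return payload + UDP_HEADER + ip_header


def headroom(payload, ip_header, mtu=MTU):
    """Bytes left for tunnel or PPPoE headers before hitting the MTU."""
    return mtu - packet_size(payload, ip_header)
```

`headroom(1450, IPV4_HEADER)` comes out at 22 bytes, which is why either a 20-byte IP header or an 8-byte PPPoE header fits but 24 bytes of GRE encapsulation does not. For IPv6 only 2 bytes are left.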


Some choices and the extra headers they allow for:

1492: PPPoE
1480: PPPoE / IPv4
1476: PPPoE / IPv4 / IPv4 + GRE
1472: PPPoE + IPv4 / IPv4 + GRE
1460: PPPoE + IPv4 / IPv4 + GRE / 2 x IPv4 / IPv6
1452: PPPoE + 2 x IPv4 / 2 x IPv4 + GRE / PPPoE + IPv6
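The list above can be reproduced from per-header overheads. This is a sketch under a few assumptions: a 1500-byte MTU, 8 bytes for PPPoE, 20 for an IPv4 header without options, 40 for IPv6, and a 4-byte basic GRE header on top of its IPv4 header. The `OVERHEAD` table and function name are invented here.

```python
ETHERNET_MTU = 1500  # assumed underlying MTU

# Extra bytes added by each encapsulation (combinations are simple sums).
OVERHEAD = {
    "PPPoE": 8,
    "IPv4": 20,
    "IPv4 + GRE": 24,
    "PPPoE + IPv4": 28,
    "2 x IPv4": 40,
    "IPv6": 40,
    "2 x IPv4 + GRE": 44,
    "PPPoE + 2 x IPv4": 48,
    "PPPoE + IPv6": 48,
}


def encapsulations_that_fit(packet_size, mtu=ETHERNET_MTU):
    """Names of all encapsulations a packet of this size can still go through."""
    room = mtu - packet_size
    return [name for name, cost in OVERHEAD.items() if cost <= room]
```

Unlike the list above, this returns every encapsulation that fits rather than only the largest ones, so `encapsulations_that_fit(1472)` also includes plain PPPoE and plain IPv4.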


_______________________________________________
Int-area mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/int-area
