Iljitsch van Beijnum wrote:
> On 2-aug-2007, at 15:30, Joe Touch wrote:
...
>> tunnel endpoints MUST either:
>
>> 1. set outer DF=0 and allow fragmentation (including at the
>> tunnel source
>
> So far so good...
>
>> 2. set outer DF=1 when their payload fits,
>
> ...but this makes no sense at all. The whole point of PMTUD is to _find_
> _out_ whether stuff fits. You can't know that in advance.
Fits in what you currently think the path MTU is (I should have been
more specific).
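To make the two rules concrete, here's a rough Python sketch of the choice at the tunnel ingress; the names and the plain 20-byte outer IPv4 header are my assumptions for illustration, not anything from a spec:

    # Sketch only: OUTER_HDR and the helper names are assumptions.
    OUTER_HDR = 20   # plain outer IPv4 header, no options (assumed)

    def encapsulate(inner: bytes, tunnel_pmtu: int) -> list[tuple[int, bytes]]:
        """Return (outer_df, outer_payload) pairs to emit on the tunnel."""
        if len(inner) + OUTER_HDR <= tunnel_pmtu:
            # Rule 2: fits what we currently think the path MTU is ->
            # one outer packet with DF=1, so outer PMTUD keeps working.
            return [(1, inner)]
        # Rule 1: outer DF=0, and fragment here at the tunnel source so each
        # outer packet (payload chunk + outer header) fits the tunnel path MTU.
        chunk = ((tunnel_pmtu - OUTER_HDR) // 8) * 8   # offsets come in 8-byte units
        return [(0, inner[i:i + chunk]) for i in range(0, len(inner), chunk)]

The point of rule 2 is just that DF=1 on the outer packet lets too-bigs come back to the tunnel source so its estimate can shrink.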
>> Receipt of a too-big at the tunnel source should not be expected to be
>> translated to be sent to the original packet's source;
>
> Not for IPv4. For IPv6, that would be a valid choice but handling this
> in the same way as IPv4 would also be fine.
It's a valid choice, but should not be EXPECTED.
>> The primary benefit of receiving such
>> messages is for subsequent packets; the tunnel source would decrease its
>> MTU, and then **other** packets from that source (or any other source)
>> would correct the actions above (#1 would make smaller fragments, #2
>> would generate ICMPs back to the source).
>
> Right. Note that TCP tends to send out two packets at a time, so with
> this in effect the first packet will trigger PMTUD in the tunnel, but by
> then, the second packet is also on its way, so both packets will be lost
> and TCP will probably stall for some time. Then when the third packet
> comes, the sending host finally sees the too big.
Yes. Transport protocols will react poorly to this - but only once. Once the
tunnel source has learned the smaller MTU, subsequent connections shouldn't
hit the problem.
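I.e., roughly this at the tunnel source (again a sketch with made-up names; the only state that changes is the tunnel's own MTU estimate):

    class TunnelSource:
        """Sketch: a too-big only updates our own estimate for later packets."""
        def __init__(self, initial_pmtu: int = 1500):
            self.tunnel_pmtu = initial_pmtu

        def on_icmp_too_big(self, reported_mtu: int) -> None:
            # Remember the smaller MTU for *subsequent* packets; the packet
            # that triggered the error is simply lost, and transports recover.
            if 68 <= reported_mtu < self.tunnel_pmtu:
                self.tunnel_pmtu = reported_mtu
            # Deliberately: no ICMP is synthesized toward the inner source.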
>> These rules apply equally to IPv4 and IPv6; in neither case should
>> tunnels fragment the encapsulated packet, IMO.
>
> Why not?
>
> Fragmentation needs to happen in certain cases with IPv4. The only
> choice is who is going to reassemble.
I like treating IPv4 and IPv6 similarly. Tunnels should not put undue
burden on endpoints. Since a tunnel destination MUST exist (to
decapsulate), it ought to be saddled with the work of reassembly, rather
than dropping it on the endpoint.
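As a sketch of why that keeps the burden at the tunnel destination (toy code, no timers or overlap handling, all names made up): reassembly of the outer fragments has to finish before decapsulation, so the inner endpoint never sees a fragment at all.

    from collections import defaultdict

    class TunnelEgress:
        def __init__(self):
            self.frags = defaultdict(dict)   # outer packet id -> {offset: data}
            self.total = {}                  # outer packet id -> total length, once known

        def on_outer_fragment(self, pkt_id, offset, more_fragments, data):
            self.frags[pkt_id][offset] = data
            if not more_fragments:
                self.total[pkt_id] = offset + len(data)
            need = self.total.get(pkt_id)
            have = sum(len(d) for d in self.frags[pkt_id].values())
            if need is None or have < need:
                return None                  # still waiting; the work stays here
            inner = b"".join(d for _, d in sorted(self.frags[pkt_id].items()))
            del self.frags[pkt_id], self.total[pkt_id]
            return inner                     # decapsulated payload, forwarded whole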
>>> Some choices and the extra headers they allow for:
>
>>> 1492: PPPoE
>>> 1480: PPPoE / IPv4
>>> 1476: PPPoE / IPv4 / IPv4 + GRE
>>> 1472: PPPoE + IPv4 / IPv4 + GRE
>>> 1460: PPPoE + IPv4 / IPv4 + GRE / 2 x IPv4 / IPv6
>>> 1452: PPPoE + 2 x IPv4 / 2 x IPv4 + GRE / PPPoE + IPv6
>
>> There are many other cases - notably IPsec tunnels, which consume even
>> more bytes. Tunnel endpoints may employ header compression which may
>> somewhat compensate for size inflation too. IMO, it's not useful to
>> guess these sizes or expected layerings, as the use of layered VPNs and
>> overlays is likely to increase over time.
>
> I'm aware that there is a race going on to see who can be the first to
> implement 1500 bytes of overhead per packet. Obviously whatever maximum
> packet size above 68 bytes a sender of a packet chooses, there will be
> some configuration that can't carry packets of that size. And since
> datagram based applications can't arbitrarily reduce their packet size,
> there will always be _some_ fragmentation. (Or black holes if people
> prevent fragmentation from working properly.) Reducing packet sizes a
> few percent for applications / transports that require a one time packet
> size choice seems like a good idea to avoid triggering these issues
> unnecessarily.
Sure. Let's pick one we won't have to move too often, though. That means
leaving room for a few layers of possible IPsec, e.g., 1300 or 1200. That's
still close enough to 1500 to be efficient.
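For reference, the arithmetic behind the numbers quoted above and behind 1300/1200 (the fixed header sizes are the usual ones; the ESP figure is a rough guess, since its overhead varies with cipher, padding and mode):

    HDR = {"pppoe": 8, "ipv4": 20, "gre": 4, "ipv6": 40, "esp_tunnel": 60}  # esp approximate

    def inner_mtu(link_mtu: int, *layers: str) -> int:
        return link_mtu - sum(HDR[l] for l in layers)

    print(inner_mtu(1500, "pppoe"))                       # 1492
    print(inner_mtu(1500, "ipv4", "gre"))                 # 1476
    print(inner_mtu(1500, "pppoe", "ipv4", "gre"))        # 1468
    print(inner_mtu(1500, "esp_tunnel", "esp_tunnel", "pppoe"))  # 1372: two IPsec layers already cost ~130 bytes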
Joe
--
----------------------------------------------------------------------
Joe Touch Sr. Network Engineer, USAF TSAT Space Segment
Postel Center Director & Research Assoc. Prof., USC/ISI
