Hi,

We _are_ seeing IPv6 packet fragmentation in TCP over IPv6, and the causes are 
systematic, rather than random chance.

What is happening is that the client is, we _assume_, behind some form of 
tunnel (MPLS). The server (a recent version of FreeBSD) has an MTU of 1500, as 
does the client.

The client opens 5 ports in parallel (possibly Firefox?). The client makes a 
request on port 1 and then, a little later, makes another request on port 2. 
The server sends large packets down the TCP session open to port 1. A gateway 
on the path sends an ICMPv6 Packet Too Big (PTB) message back to the server for 
this TCP session, with a new MTU of 1492. The TCP session on port 1 now adjusts 
its MSS to 1432 (= 1492 - 60) and the server resends its data as a sequence of 
packets, each of which is 1492 octets in size. All good: no fragmentation of 
TCP packets so far.
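
To make the arithmetic above concrete, here is a minimal Python sketch. It 
assumes a fixed 40 octet IPv6 header and a 20 octet TCP header with no options 
and no extension headers; the function name is made up for illustration.

```python
IPV6_HEADER = 40  # fixed IPv6 header, in octets
TCP_HEADER = 20   # TCP header without options (assumption)

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv6 packet."""
    return mtu - IPV6_HEADER - TCP_HEADER

print(mss_for_mtu(1500))  # 1440, the MSS before the PTB message
print(mss_for_mtu(1492))  # 1432, the MSS after the PTB message
```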

But then, on the session open to the client's port 2, the server assembles its 
set of packets for the second request. The TCP session on the server that is 
bound to port 2 of the client is unaware of the MTU change (after all, the 
ICMPv6 PTB message was directed to the TCP session bound to client port 1, not 
port 2), and the server's IPv6 driver gets a set of packets with 1440 octets of 
TCP payload to send to the client. But there is a data structure in the server 
(I think FreeBSD uses the routing table, but the FreeBSD folk probably know 
better than I do which internal table is used to cache path MTU data derived 
from ICMPv6 PTB messages) which says that for this IPv6 destination address the 
MTU is no longer 1500, but 1492. The IPv6 output driver now fragments the 
queued TCP packets for the session connected to client port 2, sending 2 
packets for each single packet passed to it from the upper-level TCP driver. 
What heads onto the wire is a 1420 octet payload and a 20 octet payload, with 
IPv6 fragment headers binding them together.
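
The 1420 + 20 split can be reproduced with a short Python sketch. It assumes 
the sender fills the first fragment as full as the 8 octet alignment rule for 
non-final fragments allows, and that the TCP header carries no options; the 
function name is hypothetical and this is not FreeBSD's code.

```python
IPV6_HEADER = 40  # fixed IPv6 header, in octets
FRAG_HEADER = 8   # IPv6 Fragment extension header
TCP_HEADER = 20   # TCP header without options (assumption)

def fragment_sizes(fragmentable_len: int, path_mtu: int) -> list:
    """Split the fragmentable part of a packet (here TCP header + data)
    into per-fragment lengths. Assumes fragmentation is actually needed."""
    room = path_mtu - IPV6_HEADER - FRAG_HEADER
    per_fragment = room - (room % 8)  # non-final fragments: multiple of 8
    sizes = []
    remaining = fragmentable_len
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes

segment = TCP_HEADER + 1440           # what the port 2 session hands down
frags = fragment_sizes(segment, 1492)
print(frags)                          # [1440, 20]
print(frags[0] - TCP_HEADER)          # 1420 octets of TCP data in fragment 1
```

The first fragment carries the TCP header plus 1420 octets of data, and the 
second carries the remaining 20 octets.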

As far as I can see this is not "bad" behaviour. The second session has not 
received its own ICMPv6 Packet Too Big, so this TCP session (and every other 
parallel TCP session) is unaware of the need to drop its MSS. Because of the 
cached per-host entry with the new MTU, the IP layer knows that it cannot send 
these packets without fragmentation, and it correctly fragments the TCP packets 
to ensure delivery. I don't believe this is a bug per se. I think it's an 
unintended side effect of the way in which ICMPv6 PTB messages are processed by 
the recipient.

How prevalent is this behaviour?

Well, how prevalent are browsers that open up parallel ports to the same 
destination?

Gee, with today's browsers just about everyone does this!

If you want to deprecate IPv6 fragmentation and still allow this form of 
parallel session behaviour to work rather than wedge, then, as far as I can 
tell, the internal handling of ICMPv6 PTB messages in the host needs to be 
reworked.
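
As an illustration only, the toy Python model below contrasts the behaviour 
described above with one possible rework, in which a PTB message lowers the MSS 
of every session to the same destination. The class, its methods and the 
`update_all_sessions` flag are all hypothetical; this is not how FreeBSD or any 
other stack is actually implemented, and the address is from the documentation 
prefix.

```python
IPV6_HEADER = 40  # fixed IPv6 header, in octets
TCP_HEADER = 20   # TCP header without options (assumption)

class Host:
    """Toy sender: per-destination PMTU cache plus per-session MSS."""

    def __init__(self, link_mtu=1500):
        self.link_mtu = link_mtu
        self.pmtu_cache = {}  # destination -> path MTU learned from PTB
        self.sessions = {}    # (destination, client port) -> current MSS

    def open_session(self, dst, port):
        self.sessions[(dst, port)] = self.link_mtu - IPV6_HEADER - TCP_HEADER

    def on_packet_too_big(self, dst, port, mtu, update_all_sessions=False):
        # The IP layer always remembers the new path MTU for the destination.
        self.pmtu_cache[dst] = mtu
        new_mss = mtu - IPV6_HEADER - TCP_HEADER
        for key in self.sessions:
            # Observed behaviour: only the session named in the PTB adjusts.
            # Hypothetical rework: every session to that destination adjusts.
            if key[0] == dst and (update_all_sessions or key[1] == port):
                self.sessions[key] = min(self.sessions[key], new_mss)

    def must_fragment(self, dst, port):
        packet_len = self.sessions[(dst, port)] + IPV6_HEADER + TCP_HEADER
        return packet_len > self.pmtu_cache.get(dst, self.link_mtu)

for rework in (False, True):
    host = Host()
    host.open_session("2001:db8::1", 1)
    host.open_session("2001:db8::1", 2)
    host.on_packet_too_big("2001:db8::1", 1, 1492, update_all_sessions=rework)
    print(rework,
          host.must_fragment("2001:db8::1", 1),
          host.must_fragment("2001:db8::1", 2))
# prints:
# False False True
# True False False
```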

thanks,

   Geoff




On 25/06/2013, at 12:22 PM, Ronald Bonica <rbon...@juniper.net> wrote:

> Hi Mark,
> 
> Thanks for this good empirical data!
> 
> I would like to verify your assertion that most of the IPv6 fragments carry 
> UDP. Do you have any way to be sure?
> 
>                                      Ron
> 
> 
>> -----Original Message-----
>> From: Mark Andrews [mailto:ma...@isc.org]
>> Sent: Monday, June 24, 2013 9:53 PM
>> To: George Michaelson
>> Cc: Ronald Bonica; ipv6@ietf.org 6man-wg
>> Subject: Re: New Version Notification for draft-bonica-6man-frag-deprecate-00.txt
>> 
>> 
>> In message <CAKr6gn2zu2n-pJMirG-seN5WX=Evyquu9EqqLOV-zf-
>> rkq9...@mail.gmail.com>
>> , George Michaelson writes:
>>> On Tue, Jun 25, 2013 at 2:38 AM, Ronald Bonica <rbon...@juniper.net> wrote:
>>> 
>>>> I'd like to understand the basis of these assertions. I believe what
>>>> I am seeing, on the edge, suggests there is in fact V6 fragmentation
>>>> in both TCP and UDP.
>>>> 
>>>> Hi George,
>>>> 
>>>> It would be helpful if you could describe:
>>>> 
>>>> - Where your observations are being made
>>>> 
>>> 
>>> On our own web services (www.apnic.net, and an associated whois
>>> service which attracts more wide ranging traffic)
>>> 
>>> On 'high in the tree' DNS servers for reverse DNS, including an NS of
>>> in-addr.arpa and ip6.arpa (note: dns transport is disjoint from the
>>> namespace being searched: we see queries over v6 transport to v4
>>> domains, and to ccTLD we secondary)
>>> 
>>> In a packet capture of 2400::/12 run in conjunction with Merit, as
>>> research into darknets.
>>> 
>>> 
>>>> - What percentage of traffic is fragmented
>>>> 
>>> 
>>> our own web: practically none.
>>> 
>>> our own dns: 0.01%. in a sequence of 10 minute samples. consistently,
>>> I might add.
>>> 
>>> the 2400::/12:  around 0.25% to 1%. so more variable, but higher.
>>> 
>>> 
>>>> - What kinds of packets are being fragmented
>>>> 
>>> 
>>> our own DNS: port 53. little TCP.
>>> 
>>> 2400::/12 capture. mostly port 53. TCP doesn't get captured in the
>>> darknet research. It's impossible to establish the end-to-end
>>> relationship.
>>> 
>>> I am not sure I call up to 1% of something 'rare'. I'm not even sure I
>>> call 0.1% or 0.01% of something 'rare'. Otherwise, since IPv6 adoption
>>> rates are at this class of deployment by end user, perhaps it also
>>> should be considered for deprecation.
>>> 
>>> It really would be helpful to understand your assertion about the
>>> rarity of
>>> IPv6 fragmentation. I want to understand how you got to this point of
>>> view on IPv6 frags.
>>> 
>>> -George
>> 
>> .58% of my IPv6 traffic is fragmented.  Assuming that it is mostly UDP,
>> I get that 14% of my IPv6 UDP traffic is fragmented.  Most of that
>> traffic is non local.  I would assume most of the drops are due to PMTUD
>> blocking the initial fragment but letting the tail fragment through as
>> this machine is behind a tunnel.
>> 
>> Mark
>> 
>> ip6:
>>      381915 total packets received
>>      0 with size smaller than minimum
>>      0 with data size < data length
>>      0 with bad options
>>      0 with incorrect version number
>>      2213 fragments received
>>      0 fragments dropped (dup or out of space)
>>      48 fragments dropped after timeout
>>      0 fragments that exceeded limit
>>      1077 packets reassembled ok
>>      217810 packets for this host
>>      0 packets forwarded
>>      93958 packets not forwardable
>>      0 redirects sent
>>      297719 packets sent from this host
>>      0 packets sent with fabricated ip header
>>      0 output packets dropped due to no bufs, etc.
>>      5031 output packets discarded due to no route
>>      33 output datagrams fragmented
>>      66 fragments created
>>      0 datagrams that can't be fragmented
>>      0 packets that violated scope rules
>>      93924 multicast packets which we don't join
>>      Input histogram:
>>              hop by hop: 132
>>              TCP: 202894
>>              UDP: 15103
>>              fragment: 2213
>>              ICMP6: 161573
>> 
>> --
>> Mark Andrews, ISC
>> 1 Seymour St., Dundas Valley, NSW 2117, Australia
>> PHONE: +61 2 9871 4742                 INTERNET: ma...@isc.org
>> 
> 
> 
> 
> --------------------------------------------------------------------
> IETF IPv6 working group mailing list
> ipv6@ietf.org
> Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
> --------------------------------------------------------------------

