Short version:     Ben Stasiewicz's research into significant
                   (33/404 = 8%) levels of PMTUD failure in IPv6 -
                   are tunnels causing a lot of this?

                   Geoff Huston's investigation of an IPv6 problem
                   presumably caused by one or more tunnels.

                   Tunnel ingress routers have a tough time
                   generating valid PTBs - I think they would need
                   to cache the initial parts of the packets
                   before they were encapsulated in order to be
                   able to generate a valid PTB to the sending host
                   if a PTB came back from a router in the tunnel
                   due to an encapsulated packet exceeding next-hop
                   MTU.

                   PTBs not making it correctly through NAT to the
                   sending host?  Google finds Norton Security on
                   a PC blocking IPv6 PTBs.

                   None of this makes me think that PTB-based
                   PMTUD must be abandoned and completely replaced
                   with end-to-end PMTUD, such as RFC 4821.

Hi Dan,

Thanks for your message, in which you wrote, in a different order:

> Sounds like you want a research paper?

Yes.  Fortunately Matthew Luckie wrote to the MTU list, pointing
to the work of one of his students - Ben Stasiewicz:

  
  http://listserver.internetnz.net.nz/pipermail/ipv6-techsig/2009-October/000708.html
  http://www.wand.net.nz/~bss7/pmtud-ben-final.pdf

This concerns a test of IPv6 websites.  0.11% of the Alexa top
million websites had an IPv6 version.  PMTUD problems were found to
a significant degree (figures quoted below), and it seems that some
servers had been configured to use a small packet size to avoid
such problems.

There's a discussion:

  
  http://listserver.internetnz.net.nz/pipermail/ipv6-techsig/2009-October/date.html#708


>> Where is the occurrence of this documented in anything more than the
>> anecdotal way you describe it above?  I don't see any reason why it
>> would always be hush-hush.
> 
> A recent accidental block was discussed in the thread
> http://www.gossamer-threads.com/lists/nsp/ipv6/20779, where Google
> was accidentally blocking ICMPv6 PTB.
> 
> I have heard of many similar problems where ICMPv6 PTB is being 
> blocked; these take hours and occasionally days to resolve.  
> Some of those can be attributed to growing pains of IPv6,
> though -- perhaps a too-low limit for ICMPv6-per-second
> from a router, and that just needed to be increased, or an overly-
> aggressive ICMPv6 filter.
> 
> All modern routers allow limiting the number of ICMPs per second 
> they send.  So if there is some reason they're sending a lot of ICMPs,
> such as during an attack or when the interface is full of non-attack
> traffic, they won't be able to send a legitimate PTB.  Forwarding real
> traffic is more important than filtering ICMPs.  
> 
> The Cisco command is "ip icmp rate-limit unreachable", Juniper's is 
> icmpv4-rate-limit.  These limits are actively discussed on operations 
> lists, which implies those limits are actively used on the Internet.

OK - thanks for pointing to this Google incident.

What you describe doesn't seem to me to be a fundamental problem
with RFC 1191 / RFC 1981 PMTUD.  If there were a fundamental
problem, I would expect it to appear frequently in IPv4, and to
affect not only PTBs but other ICMP messages as well - though I
guess PTBs are the ones which would cause the most trouble if they
were filtered out, or not sent in the first place.
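To make concrete what "RFC 1191 / RFC 1981 PMTUD" relies on, here is a
minimal sketch of the PTB-driven state a sending host keeps.  The class
and names are my own, purely illustrative - not from either RFC:

```python
# Minimal sketch of classical PTB-driven PMTUD state on a sending host.
# Illustrative only; PathMtuCache and its methods are invented names.

IPV6_MIN_MTU = 1280   # IPv6 guarantees at least this much per link

class PathMtuCache:
    """Per-destination path MTU estimates, updated from PTB messages."""
    def __init__(self, link_mtu):
        self.link_mtu = link_mtu
        self.estimates = {}   # destination -> current path MTU estimate

    def mtu_for(self, dest):
        # Start optimistically at the local link MTU.
        return self.estimates.get(dest, self.link_mtu)

    def on_packet_too_big(self, dest, reported_mtu):
        # A router on the path could not forward our packet; adopt the
        # reported next-hop MTU, but never go below the IPv6 minimum.
        new_mtu = max(reported_mtu, IPV6_MIN_MTU)
        if new_mtu < self.mtu_for(dest):
            self.estimates[dest] = new_mtu
        return self.mtu_for(dest)

cache = PathMtuCache(link_mtu=9000)              # jumboframe-capable link
cache.on_packet_too_big("2001:db8::1", 1480)     # PTB from a tunnel router
print(cache.mtu_for("2001:db8::1"))              # -> 1480
```

The whole mechanism hinges on that one PTB arriving; if it is filtered
or never generated, the estimate stays at the link MTU and big packets
blackhole.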

My understanding of Tony's message:

  http://www.ietf.org/mail-archive/web/rrg/current/msg05820.html

is that he argues that the PTB approach to PMTUD - and indeed
anything which relies on ICMP messages - is irretrievably broken,
and so new end-to-end protocols would be needed, such as RFC 4821.

But I have yet to read anything which convinces me of this.


>> Maybe many UDP-based applications do this already - locally doing the
>> same thing that RFC 4821 suggests, but without sharing any
>> information with other packetization layers.
> 
> draft-petithuguenin-behave-stun-pmtud does it for UDP.  I know one 
> video application that plans to use the technique described in 
> that I-D.

OK.


>> But TCP is a protocol which frequently is ready to send "long"
>> packets - as long as its local MSS allows. 
> 
> And fixing PMTUD is often done by tweaking the MSS.  Cisco 
> equipment has long supported that functionality (and it was
> mentioned as a quick fix in the thread I cited above).  This
> "fixes" PMTUD failures.  Cisco command is "ip tcp adjust-mss",
> Juniper commands are 'set flow all-tcp-mss' and 'set flow tcp-mss'.

If PTBs are either being dropped due to filtering, or not being
generated (such as due to competition with other ICMP messages
during an attack) and this is causing a significant problem with
PMTUD, then it seems the possible solutions are:

  1 - Stop the filtering and configure the routers so (within
      practical limits) attacks don't stop the generation of
      PTBs.

  2 - Adjust down MSS on sending hosts to fix the PMTU problems
      for TCP - but not other protocols.

  3 - Have TCP and other packetization layers in the stack and
      applications no longer reliant on PTBs, but implementing
      RFC 4821 fully, or performing similar end-to-end PMTUD
      independently, as you noted above was being done for a
      video app.

1 would clearly produce the best outcomes.
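For option 2, the arithmetic behind MSS clamping (what "ip tcp
adjust-mss" style features do) is simple: rewrite the MSS option in TCP
SYNs so that a full-sized segment plus headers fits within some known
safe MTU.  A sketch, with helper names of my own:

```python
# Sketch of the arithmetic behind MSS clamping: a middlebox rewrites
# the TCP MSS option in SYNs so full-sized segments fit a safe MTU.
# clamped_mss() is my own illustrative helper, not a real router API.

IPV4_HEADER = 20   # bytes, without options
IPV6_HEADER = 40
TCP_HEADER  = 20   # bytes, without options

def clamped_mss(path_mtu, ipv6=False):
    ip_hdr = IPV6_HEADER if ipv6 else IPV4_HEADER
    return path_mtu - ip_hdr - TCP_HEADER

# A defensively low 1400-byte MTU gives these MSS values:
print(clamped_mss(1400))              # IPv4 -> 1360
print(clamped_mss(1400, ipv6=True))   # IPv6 -> 1340
```

As noted, this "fixes" only TCP; UDP and other protocols get no help
from it.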

Benedikt, at the end of http://www.gossamer-threads.com/lists/nsp/ipv6/20779
strongly objected to fixing the MSS at a defensively low value.

Likewise Daniel Griggs:
  
  http://listserver.internetnz.net.nz/pipermail/ipv6-techsig/2009-October/000714.html

Geoff Huston thinks the problem may be caused by tunnels or PTBs
not getting back to sending hosts behind NAT:

  
  http://listserver.internetnz.net.nz/pipermail/ipv6-techsig/2009-October/000721.html

I hadn't thought of NAT boxes.  The NAT box would need to look
into the PTB, find the part of the original packet - IP header
and next 8 bytes at least - and then figure out which host behind it
to send the PTB to.  Presumably any self-respecting NAT box would
do this.  However, if actual PTB usage is minimal due to manual or
default low MTU/MSS settings, NAT boxes which don't do this could
become widely deployed before anyone notices the problem
sufficiently to push their developers to add proper PTB handling.
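To illustrate what that per-PTB work looks like, here is a sketch of a
NAT box using the embedded original packet to reverse its translation.
The table layout and function names are my own, not from any real NAT
implementation:

```python
# Sketch of what a NAT box must do with an inbound ICMP "packet too
# big": parse the embedded original packet to recover the translated
# source port, then find the internal host that sent it.  Illustrative
# only; the table layout and names are invented for this example.
import struct

# NAT translation table: external source port -> internal host address.
nat_table = {40001: "192.168.1.10", 40002: "192.168.1.11"}

def deliver_ptb(icmp_payload):
    """icmp_payload = embedded original IP header + first 8 data bytes."""
    ihl = (icmp_payload[0] & 0x0F) * 4   # embedded IPv4 header length
    # The first 8 bytes after the IP header carry the TCP/UDP source
    # and destination ports of the original outbound packet.
    src_port, _dst_port = struct.unpack("!HH", icmp_payload[ihl:ihl + 4])
    internal_host = nat_table.get(src_port)   # reverse the translation
    return internal_host                      # None => PTB is dropped

# Embedded packet: minimal 20-byte IPv4 header, then ports 40001 -> 80.
example = bytes([0x45]) + bytes(19) + struct.pack("!HH", 40001, 80)
print(deliver_ptb(example))                   # -> 192.168.1.10
```

A NAT box which skips this parsing step silently drops the PTB, and the
sending host's PMTUD blackholes.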

Here is a Google report on "Norton Internet Security was disabling
all inbound ICMPv6 traffic not part of an existing session (i.e.
traffic that was not part of a response to outbound traffic
initiated by the host).":

  http://sites.google.com/site/ipv6center/icmpv6-is-non-optional


On tunnels, Geoff Huston wrote, in part:

   This additional header overhead implies that the tunnel's
   MTU is smaller than the "raw" interface MTU. The second
   problem with a tunnel is that there may be further tunnels
   "inside" the tunnel, so that the tunnel ingress is not
   necessarily aware of the true tunnel MTU. The third problem
   is that the routing of the interior of the tunnel may change,
   so that the tunnel MTU may be variable.

But for a tunnel to support PMTUD, I think the entry router needs
to maintain some kind of cache of recently sent packets so it can
construct a valid PTB to the sending host, when an encapsulated
packet in the tunnel hits an MTU limit.

This could be very expensive for large tunnels.  For how long
would the head of each packet need to be held?
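A sketch of the caching scheme described above, assuming the ingress
keeps the head of each recently encapsulated packet.  All names are my
own and the structure is invented for illustration, not taken from any
tunnel implementation:

```python
# Sketch of a tunnel-ingress cache: hold the head of each recently
# encapsulated packet so a valid PTB can be synthesized for the
# original sender when a PTB arrives from inside the tunnel.
# Illustrative only; IngressPtbCache and its methods are invented.
import collections

ENCAP_OVERHEAD = 40   # outer header size, e.g. IPv6-in-IPv6

class IngressPtbCache:
    def __init__(self, max_entries=10000, head_bytes=64):
        self.max_entries = max_entries
        self.head_bytes = head_bytes
        self.cache = collections.OrderedDict()  # flow key -> packet head

    def remember(self, flow_key, inner_packet):
        # Keep only the head: enough to build the PTB's quoted data.
        self.cache[flow_key] = inner_packet[:self.head_bytes]
        self.cache.move_to_end(flow_key)
        if len(self.cache) > self.max_entries:
            self.cache.popitem(last=False)      # evict the oldest entry

    def on_tunnel_ptb(self, flow_key, reported_mtu):
        head = self.cache.get(flow_key)
        if head is None:
            return None   # no cached head: cannot build a valid PTB
        # The MTU reported to the sender must leave room for the
        # encapsulation header this ingress adds.
        return ("PTB", reported_mtu - ENCAP_OVERHEAD, head)
```

The memory cost scales with the tunnel's packet rate times however long
the heads are retained, which is exactly the expense question above.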


   But if there is a condition that prevents the source from
   receiving packet-too-big ICMPv6 messages then the algorithm
   fails, and the application may hang when full-sized TCP packets
   are passed through the network. In some cases this may happen
   at a point well distanced from the two endpoints of the TCP
   session, so that the ICMPv6 filtering may be occurring at a
   point that is not under the control of the source or the
   destination.

   This is the basic reason why so many web server systems are
   averse to configuring themselves as dual stack IPv4 and IPv6
   servers. The problem is that through no fault of their own in
   the local configuration of the IPv6 server, and through no
   fault in the configuration of the IPv6 client, there are
   situations where the application fails, even though every part
   of the system appears to be functioning.


The first item on Geoff's list of solutions is to fix the problem
in the network - whatever is dropping or failing to generate PTBs.

He contemplates changes to clients and servers, and in the end,
for practical reasons, to get his dual-stack server working
reliably, he drops the MTU to 1400:

  Waiting for every filter in the Internet to do the right thing
  with ICMP messages may well be a fruitless task, and adding
  further complexity into applications or the TCP protocol
  behaviour seems to take the long way around the problem. The
  most effective approach appears to be the simplest one as well
  - whether you are a dual stack client, or a dual stack server,
  the best way to get more reliable service under these rather
  strange corner cases is to drop your MTU.


This gets things going for now, but allows bad tunnels, PTB
filtering etc. to persist without complaint.  If Geoff Huston
lacks the time and inclination to figure out where the errant
tunnel is and provide the necessary feedback to its
administrators - who is going to do it?

So do we pull our horns in, wind back MTU/MSS values to
something timid and safe - and then forever be locked into
these limits even when most of the paths to other hosts around
the world support 9kbyte jumboframes?  This is just hobbling and
blinkering our hosts.

Tony seems to think we should give up on ICMP PTBs and expect OS
and app developers to implement RFC 4821.
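For contrast, here is a minimal sketch of what RFC 4821-style probing
amounts to: instead of trusting PTBs, the packetization layer searches
for the largest size that is actually delivered end-to-end.  The
function is my own illustration; `probe(size)` stands in for sending a
probe packet of that size and seeing whether it is acknowledged:

```python
# Minimal sketch of RFC 4821-style path MTU search: find the largest
# packet size that survives end-to-end, using only delivery feedback.
# Illustrative only; plpmtud_search and probe are invented names.

def plpmtud_search(probe, low=1280, high=9000):
    """Binary search for the path MTU between a known-good low bound
    (the IPv6 minimum) and the local interface MTU."""
    if probe(high):
        return high   # the full interface MTU works end-to-end
    while high - low > 1:
        mid = (low + high) // 2
        if probe(mid):
            low = mid    # probe delivered: path MTU is at least mid
        else:
            high = mid   # probe lost: assume it exceeded the path MTU
    return low

# Example: a path whose true MTU is 1480.
print(plpmtud_search(lambda size: size <= 1480))   # -> 1480
```

Note it treats every lost probe as an MTU signal, so it converges more
slowly than a single honest PTB would - which is part of why I'd rather
see the PTBs fixed.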

I think that networks which filter out PTBs should change their
ways.

If there are tunnels not supporting PMTUD, then these need to be
fixed or changed.

Assuming IPv6 becomes much more widely adopted, these tunnel
problems will tend to go away, since I assume they are mainly
IPv6 over IPv4 tunnels.  So I think the prevalence of PMTUD
problems in IPv6, assuming it can be largely attributed to bad
tunnels, is no reason to abandon PTB-based PMTUD.

Ben Stasiewicz found:

  371 PMTUD_SUCCESS  ---]  OK    371
  214 RX_TOOSMALL
   22 RX_NODATA      ---]
    9 PMTUD_FAIL        ]  Fail   33
    2 RX_NOACK       ---]
   13 TCP_NOCONN
    5 TCP_RST
    1 TCP_ERROR

The 214 servers classed as "RX_TOOSMALL" were those where the
HTTP server returned packets too small to perform the PMTUD
test.  Perhaps this was a tweak to avoid PMTUD problems, which
are apparently not uncommon (33/404 = 8%) in IPv6 at present.

But what is the incidence of such troubles in IPv4?

  - Robin                   http://www.firstpr.com.au/ip/ivip/


_______________________________________________
rrg mailing list
rrg@irtf.org
http://www.irtf.org/mailman/listinfo/rrg
