Hi Iljitsch,

On Tue, 26 Jul 2005 13:05:31 +0200
Iljitsch van Beijnum <[EMAIL PROTECTED]> wrote:

> On 26-jul-2005, at 4:41, Mark Smith wrote:
> 
> > I'd like to suggest a two phased development approach, based on
> > whether layer 2 can be assumed to fully support the larger MTUs or
> > not (I think the following is also probably a summary of the couple
> > of threads of discussion in the emails of the last couple of days) :
> 
> > (1) Making the assumption that layer 2 can support larger,
> > non-standard MTUs (as it has been engineered to by a network
> > designer), develop mechanisms that allow nodes to announce their
> > non-standard, larger MTU support. If a pair of nodes find they both
> > support larger MTUs/MRUs, then they'll take advantage of it for
> > unicast traffic. If the unicast communication fails because layer 2
> > doesn't support the MTU/MRUs it is supposed to, then layer 2 is
> > broken and it is up to network admin to fix it.
> 
> You mean: if I hook up two hosts that support 9k jumboframes to a  
> cheap unmanaged 10/100 switch (all out of the box) and it doesn't  
> work, I have to go in and reconfigure the hosts so it does work?
> 
> Sorry, but that's not acceptable.
> 

I think that is an uncommon scenario. How many hosts with jumbo frame
capable interfaces come with jumbo frame capability _enabled by
default_? The complex "cope with any MTU on any device" solution is
really trying to address what I think would currently be the worst case
scenario. I don't think that scenario will be all that common, at least
in the short to medium term, because:

(a) devices that are jumbo frame capable don't come with them enabled
by default

(b) people who enable jumbo frames usually know what they're doing and
what is required for them to work, i.e. a layer 2 infrastructure that
will support them.

If you enable jumbo frames on end-nodes without a layer 2 infrastructure
that supports them, then you get what you deserve - a network that won't
work. The solution of course is to go back to standard size frames, or
to buy jumbo capable layer 2 infrastructure and enable it. I think it's
really as simple as that.

Don't fall into the trap of assuming the worst case scenario will be the
common one. I think the common one is that people who specifically buy
_and want to enable_ jumbo frame capable end-nodes will also make sure
they buy jumbo frame capable layer 2 infrastructure.

> And the case where I have to enable jumboframes on the hosts and then  
> everything fails because the switch can't handle it is barely usable.  
> It allows for jumbo and non-jumbo capable hosts to live on the same  
> subnet, which is an important advantage, but it's still extremely  
> fragile for no good reason.
> 

See above. You broke the network by enabling a feature on your end-nodes
that your network doesn't support.

> What we need in addition to this would be (as outlined in my message  
> to Bob):
> 
> - a mechanism to distribute the MTU for the layer 2 network to jumbo- 
> capable systems

There are probably a few RA-announced MTU values that need to be
distributed around:

(a) the original MTU that is currently in RAs - the "whole of link" one.
This is the MTU that all nodes on the link must be capable of receiving,
as it will be used as the maximum size for multicasts. It is not really
necessary for ethernets, as this value would normally be 1500 bytes,
although it has some uses on ethernets if all offlink destinations are
via a link with a slightly smaller MTU, e.g. an IPsec tunnel, to avoid
a PMTUD cycle. The other reason to specifically set it is to cope with
layer 2 technologies such as token ring which don't have a standardised
MTU.

(b) the "link maximum jumbo", i.e. the absolute largest MTU that the
layer 2 infrastructure can support. Nodes that can support jumbo frames
can raise their MTU above the standard value in one of two ways. If the
largest MTU they support is less than this value, they use their own
maximum (e.g. 7k if they don't support 9K jumbos. Such NICs exist, even
in fast ethernet: my Netgear Fa312s under Linux support MTUs of 2024
bytes!). Otherwise they set their MTU to this value, which may actually
be smaller than the jumbo size they're capable of supporting (I think
I've read that some Intel GigE NICs can support 16K jumbos). This value
would be configured on the routers making RA announcements after the
layer 2 infrastructure has had jumbo frame capability enabled.
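
As a rough illustration of (b), here is a minimal Python sketch of how a
node might pick its interface MTU from the announced "link maximum
jumbo". The function name, parameter names and the 1500 byte default are
assumptions made for this example only; no such RA option exists yet.

```python
STANDARD_ETHERNET_MTU = 1500  # assumed standard link MTU for this sketch


def choose_node_mtu(node_max_mtu, link_jumbo_max=None,
                    standard_mtu=STANDARD_ETHERNET_MTU):
    """Return the MTU a node would configure on its interface.

    node_max_mtu   -- largest MTU the NIC/driver can handle (hypothetical input)
    link_jumbo_max -- "link maximum jumbo" learned from an RA, or None if
                      no such value was announced (hypothetical option)
    """
    if link_jumbo_max is None:
        # Nothing announced: behave exactly as nodes do today.
        return standard_mtu
    # Never exceed what the layer 2 infrastructure or the NIC supports,
    # and never drop below the standard MTU every node must handle.
    return max(standard_mtu, min(node_max_mtu, link_jumbo_max))


if __name__ == "__main__":
    print(choose_node_mtu(7000, 9000))    # NIC is the limit
    print(choose_node_mtu(16000, 9000))   # link is the limit
    print(choose_node_mtu(9000))          # no announcement
```

A node that never sees the announcement keeps the standard MTU, so
today's plug-and-play behaviour is unchanged.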

> - validation that jumboframes indeed work to minimize the impact of  
> misconfiguration
> 

Once your layer 2 infrastructure is known to support a specific jumbo
frame size, I don't think it is all that necessary to keep checking for
it. The overhead of checking all the time may be too high compared with
how often unauthorised standard-MTU or unconfigured switches are plugged
in to the network.


> > In all other cases (eg. multicast, unicast without these
> > announcements), the MTU used will be either the link layer MTU
> > standard or the link MTU value as announced in RAs
> 
> Right.
> 
> > There is a corner case that needs to be
> > covered where a larger than standard MTU has been announced for a
> > technology that does have a fixed MTU e.g. 1500 byte ethernet, and a
> > sub-set of nodes only support the standard size).

I think my earlier "link maximum jumbo" description provides a solution
to this corner case.

> 
> If the nodes announce their MTU/MRU in neighbor advertisements this  
> isn't a problem: jumbo-capable hosts wouldn't send jumboframes to  
> neighbors that don't support them.
> 

Agree.
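
To make the per-neighbor behaviour concrete, here is a minimal Python
sketch, assuming each neighbor's announced MRU has been stored in some
neighbor cache. The function names and the idea of passing None for a
neighbor that announced nothing are assumptions for illustration, not
part of any existing ND option.

```python
def unicast_mtu(local_mtu, neighbor_mru, link_mtu=1500):
    """MTU to use when sending unicast traffic to one neighbor.

    local_mtu    -- MTU configured on our own interface
    neighbor_mru -- MRU the neighbor announced, or None if it announced
                    nothing (hypothetical neighbor cache entry)
    link_mtu     -- standard link MTU, or the link MTU announced in RAs
    """
    if neighbor_mru is None:
        # No announcement: fall back to the MTU every node must accept.
        return link_mtu
    # Send no more than we can transmit or the neighbor can receive,
    # but never less than the whole-of-link MTU.
    return max(link_mtu, min(local_mtu, neighbor_mru))


def multicast_mtu(link_mtu=1500):
    """Multicast must be receivable by every node, so use the link MTU."""
    return link_mtu


if __name__ == "__main__":
    print(unicast_mtu(9000, 9000))  # both ends jumbo capable
    print(unicast_mtu(9000, None))  # neighbor made no announcement
    print(multicast_mtu())
```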

> > These mechanisms only come into play if non-standard MTU support
> > has been enabled
> 
> Yes, this is important.
> 
> > (in some mechanism specific manner, which may be via a new RA
> > option,
> 
> Suboptimal because switches can't easily do this.
> 

Which is why I'd have the routers do it in their RAs.

> > or even just configuring larger, non-standard  MTUs on the
> > interfaces that are capable of larger frames).
> 
> Very suboptimal because it requires manual configuration of all nodes.
> 
> > The out-of-the-box
> > plug-and-play IPv6 functionality that exists today will be  
> > preserved if
> > these mechanisms aren't enabled,
> 
> Yes, and the destructive potential of nuclear bombs is mitigated if  
> you don't detonate them... This is 2005 and IPv6. Autoconf is the  
> word, is the word that you heard. It's got groove, it's got meaning.  
> Autoconf is the time, is the place is the motion. Autoconf is the way  
> we are feeling.
> 

The problem is going too far with the goal of PnP, and then ending up
with monstrously complex solutions that attempt to address every
possible scenario, including the most esoteric ones, rather than just
the common ones (i.e. common operational scenarios, common failure
scenarios).

> > (2) Develop mechanisms that can dynamically discover MTU/MRU sizes,
> > limitations and variations over time, including within intermediary
> > layer 2 devices e.g., switches. It seems that some solutions have
> > already started to be developed in this area, including the email
> > discussion between Iljitsch and Perry,
> 
> :-)
> 

Don't get me wrong, I'd like this solution to be developed. I'm just
wondering how hard it might be to develop, and if there is a simpler
interim "sub-solution" that would still be useful to a fair number of
people, namely those who don't expect to just "plug-and-play" when they
want to use non-standard protocol parameters and features, such as jumbo
frames.

Once a fully fledged "cope with any MTU" solution exists, then
manufacturers of NICs and switches could enable jumbo frames by default.

> We can certainly experiment with this while we roll out the more  
> conservative approach.
> 
> > Matt Mathis in
> > draft-ietf-pmtud-method-04.txt and Fred Templin in
> > draft-templin-ipvlx-02.txt.
> 
> I don't really see how those apply.
> 

Only because, from what I remember of Matt's draft and what Fred said
about his, they perform some sort of MTU discovery without relying on a
feedback mechanism, e.g. ICMP Dest Unreachable or Packet Too Big. I
think your solution would have to deal with the same lack of feedback,
assuming that there wasn't a requirement to add protocols to the
switches.
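
As a generic illustration of discovery without ICMP feedback (not the
method specified in either draft), here is a minimal Python sketch that
binary-searches for the largest frame size that gets through. The
send_probe callback is hypothetical: it is assumed to send a probe of
the given size and return True only if the neighbor acknowledged it. The
sketch also assumes that the lower bound is known to work and that any
size below a working size also works.

```python
def probe_max_mtu(send_probe, low=1500, high=9000):
    """Return the largest size in [low, high] for which send_probe succeeds.

    send_probe -- hypothetical callable: send_probe(size) -> bool, True if
                  a probe of that size was acknowledged by the far end.
    low        -- size assumed to work already (e.g. the standard MTU)
    high       -- largest size worth trying (e.g. our own maximum MTU)
    """
    best = low
    while low <= high:
        mid = (low + high) // 2
        if send_probe(mid):
            best = mid        # mid got through, try something larger
            low = mid + 1
        else:
            high = mid - 1    # silently dropped, try something smaller
    return best


if __name__ == "__main__":
    # Simulate a switch in the path that silently drops frames over 4470.
    print(probe_max_mtu(lambda size: size <= 4470))
```

A lost probe looks the same as an oversized one here, so a real
mechanism would need retries and timeouts on top of this.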

> > If the above phased approach is followed, I think it would be
> > useful to allow any ND or other options developed for phase 1 to be
> > re-used for a phase 2 solution if they can.
> 
> Of course.


Regards,
Mark.

--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www1.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
