On Sat, 23 Jul 2005 14:16:46 -0500
"Stephen Sprunk" <[EMAIL PROTECTED]> wrote:

> Thus spake "Mark Smith" <[EMAIL PROTECTED]>
> > On Fri, 22 Jul 2005 13:20:40 -0500
> > "Stephen Sprunk" <[EMAIL PROTECTED]> wrote:
> >> The hole is that there may be an L2 device in the middle which has
> >> a lower MTU than the end hosts.  The neighbor MTU is an upper
> >> bound, but it's not guaranteed to work -- you need to probe to see
> >> what really works.
> >
> > I agree that probing would be nice to have, however I don't think not
> > having it is a "black" hole.
> >
> > When a larger than standard MTU is used (e.g., 9K jumbos over
> > GigE versus the 802.3 standard of 1500), the operator has made a
> > conscious decision to increase the MTU, and therefore needs to
> > take responsibility for ensuring that intermediary layer 2 devices
> > support the increased MTU. People who want to use jumbo frames
> > on GigE interfaces usually know that they also have to have
> > jumbo frame capable switches.
> 
> Except we nearly have this situation today -- jumbos only work if the admin 
> goes through a lot of painstaking effort to make sure they do.  You're 
> proposing we eliminate the per-host effort but not the network-side effort; 
> to me that's a half-solution (as you granted in a section I snipped).  I 
> want jumbos to work out of the box without the admin (which might be my 
> grandmother at home) even knowing what jumbos are.
> 

Remember that the per-host side is where a lot of the effort lies -
there are just so many more hosts than there are layer 2
infrastructure devices.

I'd like "ultimate plug-and-play" for this scenario too; however, for
quite valid reasons, that problem is either very hard or impossible to
solve.

The fundamental issue I think I'm really addressing is that even if the
administrator goes to the effort of ensuring that the layer 2
intermediary devices support jumbo frames, IPv6 currently prevents you
from taking advantage of that on the hosts that support them, the
moment you plug in one legacy device that doesn't.

For example, you may have a few hundred hosts that support jumbo frames,
and layer 2 switching infrastructure that also supports jumbo frames.
You then want to plug in an older 10 or 100Mbps network-attached printer
(one that has been software upgraded to support IPv6) that doesn't
support them. That printer hobbles all the hosts to using 1500 byte
frames for IPv6. Sure, you can fix that by buying or building a
two-interface router that supports jumbo ethernet frames on at least
one interface, and then plugging the printer into the other interface,
creating a separate segment with a different MTU. That's a fairly $$$$
or time-intensive solution. I'd think modifying IPv6 to support this
scenario (again, assuming the layer 2 infrastructure supports the
largest MTU/MRU) would be very much appreciated by people who design a
network like this and are then faced with the idea of having to
buy/build a router just to support an existing, legacy printer.

"Legacy" printers could with any other "legacy hardware" device that has
been software upgraded to support IPv6, e.g. router, host, scientific
equipment etc.

> > If we make the assumption that the intermediary link layer devices
> > support the largest node MTU/MRU on the segment, I think the
> > problem becomes a lot simpler, and then the issues to solve are :
> >
> > (a) how to ensure large MTU/MRU capable end nodes use them
> > when communicating between each other.
> >
> > (b) how to ensure end-nodes with smaller MTU/MRUs are not sent
> > too large frames by the large MTU/MRU capable end-nodes.
> >
> > I think both of these issues could be solved by an ND NS/NA
> > MRU-type option, and, because RAs can carry a link-layer address
> > (negating the need for a node to perform an ND NS/NA transaction
> > with the router, at least initially), by having an RA also carry
> > an MRU indication.
> 
> The RA's MRU would make a good upper bound, but you still need to probe to 
> make sure a host you're talking to isn't sitting behind some $30 hub that 
> silently drops jumbos (er, giants).  And you have to keep probing in case 
> the host moves or the L1/L2 path changes, unless you've recently received 
> from that host a jumbo or ACK for your own jumbo.
> 

If you've gone to the effort of consciously building a jumbo capable
infrastructure, and then chosen to use it, and then somebody else
('cause it won't be you) plugs in a $30 hub, then 

(a) you can reprimand them

(b) you can be sure that the problem will be localised to the devices on the
other side of the hub, if they try to plug jumbo capable devices in
behind it, and attempt to use jumbo frames. The existing, properly
supported jumbo capable devices plugged into the proper jumbo capable
layer 2 infrastructure would continue to work properly. If they plug in
jumbo frame capable devices behind the hub (forgetting for the moment
that hubs don't come in GigE flavour), but don't enable jumbo frames on
the end-nodes, things will work as normal.

Again, remember that my suggested solution is _constrained_ to
scenarios where the layer 2 infrastructure has been specifically
designed to support jumbo frames, and a conscious effort has been made
to enable them. My solution has the fundamental goal of allowing
small-MTU, "legacy" end-nodes (e.g., printers, routers, etc.) to be
plugged into the same layer 2 infrastructure without those legacy
devices hobbling the MTU/MRUs the jumbo capable devices support.
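
To make that concrete, here's a rough sketch of the sort of option I
have in mind. It borrows the layout of the existing RA MTU option from
RFC 2461 section 4.6.4; the type value and field names below are only
illustrative, and a real option would need an IANA assignment:

    #include <stdint.h>

    /* Hypothetical ND "MRU" option, carried in NS/NA (and possibly RA)
     * messages.  The layout mirrors the existing RA MTU option. */
    struct nd_opt_mru {
        uint8_t  type;      /* illustrative only - no assigned value */
        uint8_t  length;    /* 1, in units of 8 octets */
        uint16_t reserved;  /* zero on send, ignored on receipt */
        uint32_t mru;       /* largest frame this node can receive */
    };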

I think this scenario is or will be quite common in the future. I'd
suggest that buying GigE layer 2 switching without looking at its
features, capabilities and specifications is limited to the
residential, SOHO and maybe small business markets. IOW, any scenario
where the network is actually designed may benefit from this MTU/MRU
discovery solution for unicast traffic. Those networks are present in
medium to large businesses, and service providers, and I think there
will be lots of "_designed_ to support jumbo" frame networks in the
future.

> The reason I think this is necessary is that it allows hosts supporting 
> jumbos to have them on by default and gracefully fall back to non-jumbo 
> sizes in the presence of non-jumbo-aware hosts or network devices.  You can 
> only have them on by default if you're sure you can handle the most perverse 
> case (which I think I presented).
> 
> > For multicast, the link standard MTU or the link RA announced MTU
> > would be used.
> 
> I'm not sure using an MTU above 1500 for multicast should be legal; there 
> may be hosts on the subnet which we don't know are neighbors and thus we 
> don't know the lowest MRU out there.  Also, aren't there cases where we'll 
> know of a neighbor but wouldn't have had to do ND and thus wouldn't have 
> learned their MRU?
>

They wouldn't be legal, as the multicast scenario is exactly why RFC2461
says the MTU has to be of a fixed size for the segment. However, if we
apply this constraint only to multicast traffic, we may then be able to
allow differing MTU/MRUs for unicast traffic between hosts on the same
link (of course, again, assuming that the layer 2 infrastructure
supports the largest MTU/MRU).
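
In other words, the per-destination choice might look something like
the sketch below (the names are mine and only illustrative;
neighbor_mru would be the value cached from a neighbour's NS/NA MRU
option, or 0 if the neighbour never advertised one):

    #include <stdint.h>

    #define LINK_DEFAULT_MTU 1500  /* the 802.3 standard MTU */

    uint32_t select_mtu(int is_multicast, uint32_t our_mtu,
                        uint32_t neighbor_mru)
    {
        /* Multicast keeps the fixed per-link MTU, since we can't know
         * the smallest MRU among all possible listeners. */
        if (is_multicast)
            return LINK_DEFAULT_MTU;

        /* Legacy neighbour: behave exactly as IPv6 does today. */
        if (neighbor_mru == 0)
            return LINK_DEFAULT_MTU;

        /* Both ends are jumbo capable: use the largest size both
         * support, trusting the L2 infrastructure to carry it. */
        return our_mtu < neighbor_mru ? our_mtu : neighbor_mru;
    }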

 
> > For nodes that don't implement this "MRU" option, they'll use the link
> > standard MTU to communicate with their neighbours and vice-versa,
> > as per IPv6 operation today.
> 
> Of course.
> 
> And, since I know this is the IPv6 WG list, which WG would be appropriate to 
> discuss back-porting this feature into IPv4 after we solve it for IPv6?
> 

We could choose not to back-port it (it may not be possible anyway, as
I don't think ARP is extensible the way IPv6 ND is), and that would
then provide another reason to migrate to IPv6 ... :-)

Back-porting nice features from IPv6 to IPv4 is possibly false economy.
It takes extra effort (even though it may be a small amount), which
instead could be used to further improve IPv6, and also provides
additional disincentives to adopt IPv6. In some ways it reminds me of
the effort that is spent overcoming NAT limitations, rather than more
usefully either improving the application by adding features, or porting
the application to IPv6, thereby avoiding NAT issues completely.

Regards,
Mark.
