Thus spake "Iljitsch van Beijnum" <[EMAIL PROTECTED]>
On 22-jul-2005, at 20:20, Stephen Sprunk wrote:
Thinking about this a bit more, this could probably be fairly easy to
achieve by creating an "onlink-MRU" or "interface-MRU" option for ND
Neighbour Advertisements.

If there aren't any big holes in what I'm suggesting, I'm willing to
spend some time co-authoring an Internet draft on this.

I'm sure we can find a nice restaurant or café in Paris to discuss the matter further. :-)

I'm not able to attend, so I'd appreciate it if someone would give my comments a bit of airtime even if none of the attendees agree with me.

The hole is that there may be an L2 device in the middle which has a lower MTU than the end hosts. The neighbor MTU is an upper bound, but it's not guaranteed to work -- you need to probe to see what really works.

(Layer 1 devices can also impose MTU limits.)

True; read the above L2 as L1/L2.

It's important to keep this simple, unless the IEEE guys also want
to play and add mechanisms to exchange MTU information between
switches.

Given their past stance on jumbos, I don't see that happening.

Just like PMTUD, you need to periodically probe and adjust to changing network conditions, including detecting "black holes". Fred Baker suggested the host send both minimum MTU (576 for
v4 and 1280 for v6) and maximum MTU frames in a given burst
and track what gets through.
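
The probing idea above can be sketched roughly as follows. This is an illustration only, not a proposed mechanism: `delivered` is a hypothetical callback that sends one probe frame of the given size and reports whether the neighbor confirmed it, and the binary search between the minimum and the advertised bound is an added assumption (the thread only mentions sending minimum- and maximum-size frames and tracking what gets through).

```python
from typing import Callable

IPV6_MIN_MTU = 1280  # v6 minimum quoted in the thread


def probe_link_mtu(local_mtu: int, neighbor_mru: int,
                   delivered: Callable[[int], bool],
                   floor: int = IPV6_MIN_MTU) -> int:
    """Return the largest frame size observed to reach the neighbor.

    `delivered(size)` is hypothetical: it sends one probe of `size`
    bytes and returns True if the neighbor confirmed receipt.
    """
    # The advertised values give an upper bound only; an L1/L2 device
    # in the middle may still drop frames of this size.
    upper = min(local_mtu, neighbor_mru)
    floor = min(floor, upper)

    if not delivered(floor):
        raise RuntimeError("neighbor unreachable even at the minimum size")
    if delivered(upper):
        return upper

    # Invariant: frames of size `lo` get through, frames of size `hi` do not.
    lo, hi = floor, upper
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if delivered(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

A host would have to rerun something like this periodically, since the result reflects only the path at the time of probing.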

I'm not really comfortable with this... It makes more sense to me to have a router or two, or maybe one or two non-router hosts, send out "MTU announcements", and other hosts only announce the non-standard
MTU in neighbor advertisements when they recently heard
one of those announcements. When the MTU suddenly decreases,
the announcements are no longer heard, hosts put 1500 in their
neighbor advertisements and neighbor unreachability detection does
the rest.
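
As a rough sketch of the announcement scheme described above: a host advertises a jumbo value only while it has recently heard an "MTU announcement", and otherwise falls back to 1500. The class name, the method names, and the 30-second lifetime are all hypothetical; nothing here corresponds to an existing ND option.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_MTU = 1500        # standard Ethernet value used in the thread
ANNOUNCE_LIFETIME = 30.0  # seconds; an arbitrary value chosen for illustration


@dataclass
class JumboState:
    """Tracks the most recent jumbo "MTU announcement" heard on the link."""
    announced_mtu: int = DEFAULT_MTU
    last_heard: Optional[float] = None

    def on_announcement(self, mtu: int, now: float) -> None:
        # Remember the announced value and when it was heard.
        self.announced_mtu = mtu
        self.last_heard = now

    def advertised_mtu(self, own_mtu: int, now: float) -> int:
        """Value this host would put in its neighbor advertisements."""
        fresh = (self.last_heard is not None
                 and now - self.last_heard <= ANNOUNCE_LIFETIME)
        if fresh:
            # Never advertise more than this host can actually receive.
            return min(own_mtu, self.announced_mtu)
        # Announcements stopped (or never arrived): fall back to 1500.
        return min(own_mtu, DEFAULT_MTU)
```

If a low-MTU device appears in the path, the jumbo-sized announcements stop arriving, the state goes stale, and the host reverts to 1500 without any explicit signalling.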

I'm okay with hosts dropping to 1500 (assuming Ethernet) for neighbors that can't receive jumbos. It'd be desirable for them to find a higher value that works but which is still less than both hosts' MTUs. Trying to ratchet the MTU back up after it's been lowered is probably more trouble than it's worth.

The fact that ethernet is supposed to have a tree topology makes
things slightly simpler.

Some 802 networks are not trees, and there are non-802 networks out there. I'd like to have a single jumbo spec for all L1/L2 types; it doesn't seem to require much tap-dancing around the specific numbers to generalize it to all media types, though obviously Ethernet is the most common and most in need of help.

The most perverse scenario I can envision is a network where one host has an MTU of 9k, another has 8k, one network path has 10k, another path has 3k, and the path varies every few minutes (and isn't necessarily symmetric).

Real-life ethernet isn't supposed to be like that...

I've been traumatized by some of the networks I've seen in the wild. The scenario I gave was deliberately perverse, but it's not very far from the worst I've encountered -- and I guarantee such things will occur if we standardize jumbos. I'm betting that's why the IEEE refused to tackle it.

For those who think there isn't a real problem here: it takes a little
over 800 packets per second to saturate a 10 Mbps ethernet link.
At GE speeds that's 80000 packets per second. It is very hard to
achieve decent performance when you have to stop what you're doing 80000 times per second...

Modern hosts can do it, but it'd be nice to reduce the CPU load due to NIC interrupts if possible.
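
The packet rates quoted above can be checked with a little arithmetic. The sketch below assumes full-size frames and counts 38 bytes of per-frame Ethernet overhead (14 header, 4 FCS, 8 preamble and start delimiter, 12 inter-frame gap); the function name is made up for this example.

```python
ETHERNET_OVERHEAD = 38  # bytes: 14 header + 4 FCS + 8 preamble/SFD + 12 inter-frame gap


def packets_per_second(link_bps: float, mtu: int,
                       overhead: int = ETHERNET_OVERHEAD) -> float:
    """Full-size frames per second needed to saturate a link."""
    bits_per_frame = (mtu + overhead) * 8
    return link_bps / bits_per_frame


for label, bps, mtu in [("10 Mbps, 1500", 10e6, 1500),
                        ("1 Gbps, 1500", 1e9, 1500),
                        ("1 Gbps, 9000", 1e9, 9000)]:
    print(f"{label}: {packets_per_second(bps, mtu):,.0f} packets/s")
```

This gives a little over 800 packets per second at 10 Mbps and a little over 80,000 at GE speeds with 1500-byte packets, in line with the figures above, and roughly 14,000 at GE speeds with 9000-byte packets.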

There is also the environment to consider because the amount of
power switches use is strongly related to the number of packets that
flow through the switch. So increasing the MTU from (for instance)
1500 to 9000 bytes means it only takes 3 packets to transfer 18000
bytes (2 data, 1 ack), while it takes (best case) 13 packets at 1500
bytes (12 data, 1 ack) but usually 18 (6 acks).  That saves a LOT
of power.

The power consideration didn't even occur to me; I was thinking of CPU load on end hosts, per-packet overhead on the wire, and pps limits on network gear.
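
The packet counts in the 18000-byte example can be reproduced with a small calculation. Like the example itself, this sketch ignores header bytes and treats the MTU as the payload carried per packet; `segments_per_ack` is a hypothetical parameter, where 2 models an ack for every second segment and None models the best case of a single ack for the whole transfer.

```python
from typing import Optional


def packets_for_transfer(payload_bytes: int, mtu: int,
                         segments_per_ack: Optional[int] = 2) -> int:
    """Total packets (data plus acks) needed to move payload_bytes."""
    data = -(-payload_bytes // mtu)  # ceiling division
    if segments_per_ack is None:
        acks = 1                     # best case: one ack for everything
    else:
        acks = -(-data // segments_per_ack)
    return data + acks


print(packets_for_transfer(18000, 9000))        # 2 data + 1 ack
print(packets_for_transfer(18000, 1500, None))  # 12 data + 1 ack
print(packets_for_transfer(18000, 1500))        # 12 data + 6 acks
```

The three calls give 3, 13, and 18 packets respectively, matching the numbers in the quoted text.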

S

Stephen Sprunk      "Those people who think they know everything
CCIE #3723         are a great annoyance to those of us who do."
K5SSS                                             --Isaac Asimov

--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www1.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
