Iljitsch van Beijnum wrote:

On 28 okt 2003, at 0:12, Fred Templin wrote:

> If I hook up two boxes that can do 9000 bytes over ethernet, but the
> switch is 1500 bytes only, I had better make sure those two boxes
> stick to 1500.


Yes, I know. By "negotiate a larger MTU" I mean not only an
initial indication of how much the *neighbor* can handle but
also ongoing and continuous attention to how much the *L2
media* can handle.


But unfortunately this isn't something that can easily be determined. Is it ok to just send large probe packets, or can sending packets that are too large be harmful?


Well, there's no use in sending large probe packets if you aren't going
to allow sending large data packets. The purpose of the probes is to
determine a window of opportunity during which it should be OK
to send large data packets.
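A rough sketch (in Python) of what that probing might look like, assuming a hypothetical cooperating UDP echo responder on the neighbor; the port number and probe sizes below are made up for illustration, and a real implementation would repeat the probes periodically:

    import socket

    # Candidate payload sizes to try, from the standard Ethernet MTU up to a
    # 9000-byte jumboframe (sizes are illustrative, not normative).
    PROBE_SIZES = [1500, 4000, 9000]
    ECHO_PORT = 7777          # hypothetical echo responder on the neighbor
    PROBE_TIMEOUT = 1.0       # seconds to wait for each probe to come back

    def probe_neighbor_mtu(neighbor_addr: str) -> int:
        """Return the largest probe size the neighbor echoed back.

        Falls back to 1280 (the IPv6 minimum MTU) if even the smallest
        probe is lost.  A real implementation would re-probe periodically,
        since the usable size can change when the L2 topology changes.
        """
        best = 1280
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as s:
            s.settimeout(PROBE_TIMEOUT)
            dontfrag = getattr(socket, "IPV6_DONTFRAG", None)
            if dontfrag is not None:
                # Keep the local stack from fragmenting the probes itself.
                s.setsockopt(socket.IPPROTO_IPV6, dontfrag, 1)
            for size in PROBE_SIZES:
                try:
                    s.sendto(b"\x00" * size, (neighbor_addr, ECHO_PORT))
                    data, _ = s.recvfrom(size + 1)
                    if len(data) == size:
                        best = size     # this size made it there and back
                except (socket.timeout, OSError):
                    break               # larger sizes are unlikely to work
        return best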

> The good thing is that if we allow hosts to discover the maximum
> usable MTU between them, we can now use the regular RA MTU option to
> broadcast a much smaller MTU for this type of traffic.


But, if there are many routers on the link, what do you do if
they aren't all broadcasting a consistent smaller MTU?


Beat the administrator.


I wish!

RFC 2461 says to accept the most recently heard MTU - is that
good enough?


I'd say: trust the router you've chosen as your default gateway. I considered picking the lowest from those advertised, but this would make it trivial to do "MTU reduction attacks".
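A rough sketch of that selection policy, assuming the RA MTU options have already been parsed into a per-router table (the addresses and values below are illustrative):

    # MTU values advertised in Router Advertisements, keyed by the
    # advertising router's link-local address (illustrative data).
    ra_mtu = {
        "fe80::1": 9000,   # the router we selected as default gateway
        "fe80::2": 1500,
        "fe80::3": 1280,   # a rogue or misconfigured router
    }

    def link_mtu(default_router: str, fallback: int = 1500) -> int:
        """Use the MTU advertised by our chosen default router.

        Taking the minimum of everything advertised would let any single
        rogue router force the whole link down to 1280 bytes -- the "MTU
        reduction attack" mentioned above.
        """
        return ra_mtu.get(default_router, fallback)

    print(link_mtu("fe80::1"))   # 9000, not min(ra_mtu.values()) == 1280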



> Then there would be a new RA option that would function as the
> "maximum MTU": hosts are not allowed to transmit packets larger than
> this. This value must be equal to or higher than the value in the relevant IP-over-<link layer> RFC
> if it's present. There could be a bit in this option that indicates
> "if lower layers tell you it's ok to use this value" or "forget what
> lower layers tell you, this value is the correct one".
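For concreteness, here is a sketch of how such an option might be laid out on the wire, modeled loosely on the existing RFC 2461 MTU option; the option type, flag bit and field layout are purely hypothetical:

    import struct

    HYPOTHETICAL_OPTION_TYPE = 0xFE    # made-up ND option type, not assigned
    FLAG_IF_L2_AGREES = 0x80           # "use this only if lower layers agree"

    def pack_max_mtu_option(max_mtu: int, only_if_l2_agrees: bool) -> bytes:
        """Pack a hypothetical 'maximum MTU' ND option.

        Layout (8 octets):
          type (1) | length in 8-octet units (1) | flags (1) | reserved (1) | MTU (4)
        """
        flags = FLAG_IF_L2_AGREES if only_if_l2_agrees else 0
        return struct.pack("!BBBBI", HYPOTHETICAL_OPTION_TYPE, 1, flags, 0, max_mtu)

    # Advertise a 9000-byte maximum, honoured only if L2 says it is usable.
    print(pack_max_mtu_option(9000, only_if_l2_agrees=True).hex())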


I'm afraid I'm not bought into this one as being necessary (yet). We know
from our physical/logical point of attachment what the largest possible
MTU for the attached L2 media is - this is a given.


Yes, but who is "we"? The administrator knows this after perusing the documentation, and obviously the box itself knows at some level. We currently don't have any mechanisms to transfer this knowledge to layer 3 on hosts that are considering using jumboframes.


By "we", I was referring to the interface that is configured over the L2 link.
In the case of Ethernet - I hear you and Erik telling me that the interface has
no way of knowing whether the physical link is capable of supporting an
MTU of 9KB for jumboframes or only 1500 bytes - do I have this correct?


The IEEE was understandably reluctant to increase the gigabit ethernet MTU beyond 1500 bytes because this would break interoperation with existing 10 and 100 Mbps ethernet. But at some point something has to give, because two hosts that want to utilize 10 gigabit ethernet between them would have to transmit/receive almost a million packets per second with a 1500 byte MTU.
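The arithmetic behind that packet rate, as a quick check (counting frame payload only and ignoring preamble, inter-frame gap and header overhead):

    LINE_RATE = 10_000_000_000        # 10 gigabit ethernet, in bits per second

    def packets_per_second(frame_payload_bytes: int) -> float:
        """Packets per second needed to fill the line at a given frame size."""
        return LINE_RATE / (frame_payload_bytes * 8)

    print(f"{packets_per_second(1500):,.0f} pps at 1500 bytes")   # ~833,333
    print(f"{packets_per_second(9000):,.0f} pps at 9000 bytes")   # ~138,889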


So, you are saying that the IEEE was reluctant to increase the Gig-E MTU
beyond 1500 bytes. The MTU speaks only to the *send* side - but, what
about the MRU (receive side)? Can Ethernet interfaces determine whether
they are configured over a Gig-E link and (if so) make sure the MRU is
at least 9KB?

So, supplying a maximum
MTU that is larger than the attached L2 media can support in a single packet
amounts to saying that we expect the sender to do L2 fragmentation locally.


Obviously announcing an MTU that's larger than what layer 2 can support doesn't make sense. But often layer 2 can support a larger MTU than what is specified for that particular layer 2 protocol and/or for IP over that protocol. This "unofficial" MTU is the one we're interested in. (At least when increasing the MTU.)


Well, if an L2 infrastructure has a mix of 10/100 and Gig-E
Ethernet elements then I suppose routers on the L2 could advertise
Max/Min MTU values such that:

1280 <= Min_MTU <= 1500 <= Max_MTU <= 9KB

Then, each neighbor could announce a per-neighbor MTU
such that:

1280 <= NBR_MTU <= Max_MTU

Do I have this right?
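As a sanity check, those constraints can be written down directly (the Min/Max and per-neighbor values below are illustrative):

    IPV6_MIN_MTU = 1280
    ETHERNET_MTU = 1500
    JUMBO_MTU = 9000          # the "9KB" upper bound above, taken as 9000 here

    def constraints_hold(min_mtu: int, max_mtu: int, nbr_mtu: int) -> bool:
        """Check 1280 <= Min_MTU <= 1500 <= Max_MTU <= 9KB
        and      1280 <= NBR_MTU <= Max_MTU."""
        return (IPV6_MIN_MTU <= min_mtu <= ETHERNET_MTU <= max_mtu <= JUMBO_MTU
                and IPV6_MIN_MTU <= nbr_mtu <= max_mtu)

    print(constraints_hold(1500, 9000, 4000))   # True
    print(constraints_hold(1500, 9000, 9600))   # False: neighbor exceeds Max_MTU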

Is this what you are saying, and is this really desirable? (Maybe; I'm willing to be convinced.)


If layer 2 fragmentation is done there should also be layer 2 reassembly. Then the whole procedure becomes transparent to IP so we don't have a problem. This is what 802.11 does to achieve interference robustness. AAL5 in ATM is basically the same thing.


Yes, if the L2 supports fragmentation then the L3 can send and receive
whole packets transparently - but this should only be allowed up to the
point that triggering the L2 fragmentation does not introduce problems such as
excessive fragment loss due to congestion.

See sections 3.4 ("Use of Transparent Fragmentation") and 3.5 ("Careful
use of Intentional Fragmentation") of "Fragmentation Considered Harmful"
for a more thorough discussion of this.
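As an illustration of the transparent fragmentation/reassembly idea (in the spirit of 802.11 fragmentation or AAL5, not any particular protocol's actual framing):

    from typing import List

    def l2_fragment(packet: bytes, max_frag: int) -> List[bytes]:
        """Split an IP packet into L2-sized fragments; IP never sees this."""
        return [packet[i:i + max_frag] for i in range(0, len(packet), max_frag)]

    def l2_reassemble(fragments: List[bytes]) -> bytes:
        """Reassemble before handing the packet back up to IP.

        A real link layer also needs sequencing and a "more fragments"
        indication to detect loss; losing any one fragment costs the whole
        packet, which is why congestion-driven fragment loss hurts so much.
        """
        return b"".join(fragments)

    packet = bytes(9000)                  # a 9000-byte IP packet
    frags = l2_fragment(packet, 1500)     # six 1500-byte link-layer fragments
    assert l2_reassemble(frags) == packet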

Fred
[EMAIL PROTECTED]



