How PMTU Discovery works in the network:

   Path MTU Discovery depends upon two IP capabilities. IP packets are
   sent with the "don't fragment" (DF) flag set in the IP header. When
   an outbound interface in the path can not transmit the packet
   because its MTU is smaller than the packet, the router drops the
   packet and sends an ICMP "fragmentation needed" message (Destination
   Unreachable, code 4) back to the source. This ICMP message contains
   the actual MTU of the problem interface. Hence, a normal PMTU
   discovery should fail only once for each interface in the path with
   a smaller MTU. PMTU is not normally determined by experimentation.
   Also, MTU is a configuration parameter of the IP hosts or routers.
   It must be compatible with what the actual network can support, but
   the network itself does not "tell" the hosts what is correct. A
   mismatch where the hosts or routers have an MTU larger than the
   network can support will cause problems. The other way around will
   go unnoticed.
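
To make the "fail only once per smaller interface" point concrete, here
is a rough Python sketch of the discovery loop. It is a simulation only,
not how any particular stack implements it; the function name and the
list-of-MTUs path model are made up for illustration.

```python
def discover_pmtu(local_mtu, hop_mtus):
    """Simulate Path MTU Discovery along a path.

    local_mtu: MTU configured on the sending interface.
    hop_mtus:  MTU of each outbound interface along the path, in order.
    Returns (pmtu, attempts), where attempts lists each packet size tried.
    """
    size = local_mtu
    attempts = []
    while True:
        attempts.append(size)
        # Find the first interface that cannot forward a DF packet this large.
        blocking = next((m for m in hop_mtus if m < size), None)
        if blocking is None:
            return size, attempts
        # The ICMP error reports that interface's MTU; retry at that size.
        size = blocking


# One smaller interface in the path: one failure, then success.
print(discover_pmtu(57344, [57344, 1492]))
```

With a single 1492-byte interface in the path, the sketch tries 57344
once, is told 1492, and succeeds on the second attempt.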

In your case, if the path MTU really were 1492, discovery should succeed
on the second try, with the MTU dropping immediately from 57344 to 1492.
Instead it steps down through several intermediate values, so something
else is going on with the tracepath program. Without a packet trace of
what is actually going on, as seen by the network and the program, it is
difficult to figure out what is really occurring. In this case the
problem appears to be in the host from which the tracepath is issued,
SLES9.
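
One thing that can be checked without a packet trace is whether the
intermediate values are arbitrary. The sketch below compares the pmtu
values from the SLES9 output quoted below against the MTU plateau table
in RFC 1191, section 7.1, which that RFC suggests as a fallback when an
ICMP message carries no usable next-hop MTU. This is offered as a
hypothesis to test against a trace, not a diagnosis; the function names
are made up for illustration.

```python
# MTU plateau table from RFC 1191, section 7.1, largest first.
RFC1191_PLATEAUS = [65535, 32000, 17914, 8166, 4352, 2002, 1492,
                    1006, 508, 296, 68]

# pmtu values reported by tracepath on SLES9 in the quoted output.
observed = [57344, 32000, 17914, 8166, 4352, 2002, 1492]


def next_plateau(size):
    """Return the largest plateau strictly below size, or None."""
    return next((p for p in RFC1191_PLATEAUS if p < size), None)


def follows_plateaus(values):
    """True if each value is the next plateau below the one before it."""
    return all(next_plateau(a) == b for a, b in zip(values, values[1:]))


print(follows_plateaus(observed))
```

If this prints True, the step-down pattern matches table-driven guessing
rather than an MTU reported by the network, which would be worth
confirming in a packet trace.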

There are two questions: is the MTU really 1492, and why is tracepath
acting as it is? The first is of greater concern.

I would recommend using a ping with the do-not-fragment option enabled
and a payload size of 57316, which is the 57344 MTU less 20 bytes of IP
header and 8 bytes of ICMP header. If this does not work, then remove
the do-not-fragment option and see if it succeeds. This will tell you
whether packets are being fragmented to reach the destination. If it
works without the option, but fails with it, you are definitely
fragmenting packets. You can experiment with the payload size to
determine exactly what the largest unfragmented size is. This will
establish what physical packet size is being used and whether
fragmentation is really occurring. Neither of these is clear from the
tracepath output.
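
As a rough sketch of that experiment in Python: the helpers below work
out the ICMP payload that exactly fills a given MTU, build a ping command
line, and binary-search for the largest payload that gets through with
do-not-fragment set. The flags assume the Linux iputils ping (-s for
payload size, -M do / -M dont for the fragmentation policy, -c for
count); check your ping's man page, as other versions differ. The
function names are made up, and the probe is passed in so that the
search can be tried without touching the network.

```python
IP_HEADER = 20    # IPv4 header without options, in bytes
ICMP_HEADER = 8   # ICMP echo header, in bytes


def payload_for_mtu(mtu):
    """ICMP payload size that produces an IP packet of exactly mtu bytes."""
    return mtu - IP_HEADER - ICMP_HEADER


def ping_command(host, payload, dont_fragment=True):
    """Build a one-shot ping command (assumes Linux iputils ping flags)."""
    policy = "do" if dont_fragment else "dont"
    return ["ping", "-c", "1", "-s", str(payload), "-M", policy, host]


def largest_unfragmented_payload(probe, low, high):
    """Binary search for the largest payload where probe(payload) is True.

    Assumes probe(low) succeeds and that once a size fails, all larger
    sizes fail too.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


print(payload_for_mtu(57344))
print(" ".join(ping_command("192.168.252.1", payload_for_mtu(57344))))
```

For a real run, probe could be a small function that executes
ping_command(host, n) with subprocess.run and returns True when the exit
status is 0.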

The fact that you are experiencing difficulty determining what is
happening now raises the question of whether it really was working
before. While the output appears to suggest the issue is with the SLES9
side of the communication, you might also want to try the above ping
test from the z/OS side and see if any additional information is seen.

Also, just because you end up fragmenting does not necessarily mean that
the hipersocket I/F can not support larger packets. MTU is a host
configuration parameter. Correcting the MTU parameter to match what the
network can physically do will resolve the problem. Note that both hosts
must be correctly configured. You appear to have checked this already,
so the question would be why the settings are being ignored if it isn't
working.

As for the tracepath program, ping is ICMP based and traceroute is
usually UDP based. How this specific executable is implemented or what
type of socket API is used, I do not know. Check the source.

Harold Grovesteen


James Melin wrote:

We've been seeing some disturbing differences between the packet size being 
(apparently) sent on SLES9 vs SLES8 over the hipersocket interface.

The tool we have been using to tell us the packet length going over the 
hipersocket is tracepath.

On SLES8:

nokomis:~ # uname -r
2.4.21-83-default
nokomis:~ # tracepath 192.168.252.1
1?: [LOCALHOST]     pmtu 57344
1:  192.168.252.1 (192.168.252.1)                          2.133ms reached
    Resume: pmtu 57344 hops 1 back 1
nokomis:~ #

on SLES9:

abinodji:~ # tracepath 192.168.252.1
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.267ms pmtu 57344
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.129ms pmtu 32000
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.107ms pmtu 17914
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.055ms pmtu 8166
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.046ms pmtu 4352
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.047ms pmtu 2002
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.035ms pmtu 1492
1:  hawk.hipersocket (192.168.252.1)                       0.834ms reached
    Resume: pmtu 1492 hops 1 back 1

At first we thought it was a problem with tracepath between the releases.

So I took the executable for tracepath on SLES-8 and copied it to SLES-9 - 
Wasn't even sure it would run... But it did and revealed different
behaviour, identical to that of the SLES9 tracepath:

abinodji:~ # ./tracepath_sles8 192.168.252.1
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.258ms pmtu 57344
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.127ms pmtu 32000
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.157ms pmtu 17914
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.057ms pmtu 8166
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.045ms pmtu 4352
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.046ms pmtu 2002
1:  abinodji.co.hennepin.mn.us (192.168.252.22)            0.037ms pmtu 1492
1:  hawk.hipersocket (192.168.252.1)                       0.342ms reached
    Resume: pmtu 1492 hops 1 back 1

So there is some fundamental difference between SLES-8 and SLES-9 and from what 
I can see no matter WHAT my hipersocket MTU size is (in this case
57344) I appear to be getting packet sizes of 1492 on SLES9.

Note that the PMTU discovery is hitting the SAME interface/IP address 7 times 
before it completed to the target. 192.168.252.22 is the HSI IP address of
Abinodji, and 192.168.252.1 is the address of the target z/OS LPAR image on 
which we run DB2/Shadow direct.

Anyone have any idea as to whether or not I am actually having my packets 
shredded to 1492, and if so, why?

The CHPID, the Linuxes and the z/OS target all have the correct settings for an 
MTU of that size (chpid 64K, the MTU for interfaces 64K less 8K
overhead).

I've had this open with Novell for about three weeks and they're of absolutely 
NO help.

Thanks,

-J

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390



