How PMTU Discovery works in the network: Path MTU Discovery depends upon two IP capabilities. First, IP packets are sent with the "don't fragment" (DF) flag set in the IP header. Second, when an outbound interface in the path cannot transmit the packet because its MTU is smaller than the packet, an ICMP "fragmentation needed" message (Destination Unreachable, type 3 code 4) is sent back to the source. This ICMP response contains the actual MTU of the problem interface. Hence, a normal PMTU discovery should fail only once for each interface with a smaller MTU encountered in the path. PMTU is not normally determined by experimentation.

Also, MTU is a configuration parameter of the IP hosts or routers. It must be compatible with what the actual network can support, but the network itself does not "tell" the hosts what is correct. A mismatch where the hosts or routers have an MTU larger than the network can support will cause problems. The reverse, an MTU smaller than the network can support, will go unnoticed.
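As a rough illustration of the mechanism described above, here is a minimal Python sketch that simulates discovery over a list of interface MTUs. The function name discover_pmtu and its parameters are hypothetical, and the sketch assumes every hop reports its real MTU in the ICMP message and that no ICMP messages are filtered along the way.

```python
def discover_pmtu(local_mtu, link_mtus):
    """Simulate Path MTU Discovery with the DF flag set.

    local_mtu -- MTU configured on the sending interface (starting packet size)
    link_mtus -- MTU of each outbound interface along the path, in order
    Returns (path_mtu, failures); failures holds one (hop, reported_mtu)
    entry for every ICMP "fragmentation needed" reply received.
    """
    size = local_mtu
    failures = []
    while True:
        for hop, mtu in enumerate(link_mtus, start=1):
            if size > mtu:
                # Packet is too large and DF is set: the hop drops it and
                # reports its own MTU, which the sender adopts for the retry.
                failures.append((hop, mtu))
                size = mtu
                break
        else:
            # No hop rejected the packet, so it reached the destination.
            return size, failures


# Assumed scenario: a single hop whose real MTU is 1492, sender set to 57344.
print(discover_pmtu(57344, [1492]))   # (1492, [(1, 1492)])
```

Under these assumptions the sender fails once per smaller interface and then settles, which is why seven steps against a single interface does not look like normal discovery.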
In your case, discovery should succeed on the second try and the MTU would immediately drop to 1492. So something else is going on with the tracepath program. Without a packet trace of what is actually happening, as seen by the network and the program, it is difficult to figure out what is really occurring. In this case the problem appears to be in the host from which the traceroute is issued, SLES9.

There are two questions: is the MTU really 1492, and why is tracepath acting as it is? The first is of the most concern. I would recommend using a ping that fills the 57344-byte MTU, with the do-not-fragment option enabled. Note that the IPv4 and ICMP headers take 28 bytes of the packet, so the ping payload for a 57344-byte packet is 57316 bytes. If this does not work, remove the do-not-fragment option and see if it succeeds. This will tell you whether packets are being fragmented to reach the destination. If it works without the option but fails with it, you are definitely fragmenting packets. You can experiment with the payload size to determine exactly what the limit is. This will establish what physical packet size is being used and whether fragmentation is really occurring. Neither of these is clear from the tracepath output.

The fact that you are having difficulty determining what is happening now raises the question of whether it really was working before. While the output appears to suggest the issue is on the SLES9 side of the communication, you might also want to try the above ping test from the z/OS side and see if it reveals anything more.

Also, just because you end up fragmenting does not necessarily mean that the hipersocket interface cannot support larger packets. MTU is a host configuration parameter. Correcting the MTU parameter to match what the network can physically do will resolve the problem. Note that both hosts must be correctly configured. You appear to have checked this already, so if it isn't working, the question is why the settings might be ignored.

As for the tracepath program, ping is ICMP based and traceroute is usually UDP based.
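For the ping test recommended above, the payload has to leave room for the headers, or the packet will exceed the MTU before it ever leaves the host. The Python sketch below works out the payload size and builds a command line. The helper names are hypothetical, and the -M do / -M dont and -s options are those of the Linux iputils ping; I am assuming the ping shipped with SLES9 accepts them, so check its man page first.

```python
IPV4_HEADER = 20  # bytes, assuming an IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def ping_payload_for_mtu(mtu):
    """Largest ICMP echo payload that still fits in one packet of size mtu."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(target, mtu, dont_fragment=True):
    """Build an iputils-style ping command line (flags are an assumption)."""
    mode = "do" if dont_fragment else "dont"  # "do" sets DF, "dont" clears it
    return ["ping", "-c", "3", "-M", mode,
            "-s", str(ping_payload_for_mtu(mtu)), target]


# First with DF set, then without, against the z/OS hipersocket address.
print(" ".join(ping_command("192.168.252.1", 57344)))
print(" ".join(ping_command("192.168.252.1", 57344, dont_fragment=False)))
```

If the first command fails and the second succeeds, packets of that size are being fragmented. Lowering the mtu argument step by step should then locate the largest size that passes with DF set.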
How this specific executable is implemented, or what type of socket API it uses, I do not know. Check the source.

Harold Grovesteen

James Melin wrote:

We've been seeing some disturbing differences between the packet size (apparently) being sent on SLES9 vs SLES8 over the hipersocket interface. The tool we have been using to tell us the packet length going over the hipersocket is tracepath.

On SLES8:

nokomis:~ # uname -r
2.4.21-83-default
nokomis:~ # tracepath 192.168.252.1
 1?: [LOCALHOST]     pmtu 57344
 1:  192.168.252.1 (192.168.252.1)                       2.133ms reached
     Resume: pmtu 57344 hops 1 back 1
nokomis:~ #

On SLES9:

abinodji:~ # tracepath 192.168.252.1
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.267ms pmtu 57344
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.129ms pmtu 32000
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.107ms pmtu 17914
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.055ms pmtu 8166
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.046ms pmtu 4352
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.047ms pmtu 2002
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.035ms pmtu 1492
 1:  hawk.hipersocket (192.168.252.1)                    0.834ms reached
     Resume: pmtu 1492 hops 1 back 1

At first we thought it was a problem with tracepath between the releases. So I took the executable for tracepath on SLES-8 and copied it to SLES-9. Wasn't even sure it would run...
But it did, and it behaved differently than it does on SLES8, identical to the SLES9 tracepath:

abinodji:~ # ./tracepath_sles8 192.168.252.1
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.258ms pmtu 57344
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.127ms pmtu 32000
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.157ms pmtu 17914
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.057ms pmtu 8166
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.045ms pmtu 4352
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.046ms pmtu 2002
 1:  abinodji.co.hennepin.mn.us (192.168.252.22)         0.037ms pmtu 1492
 1:  hawk.hipersocket (192.168.252.1)                    0.342ms reached
     Resume: pmtu 1492 hops 1 back 1

So there is some fundamental difference between SLES-8 and SLES-9, and from what I can see, no matter WHAT my hipersocket MTU size is (in this case 57344), I appear to be getting packet sizes of 1492 on SLES9. Note that the PMTU discovery is hitting the SAME interface/IP address 7 times before it completes to the target. 192.168.252.22 is the HSI IP address of Abinodji, and 192.168.252.1 is the address of the target z/OS LPAR image on which we run DB2/Shadow Direct.

Anyone have any idea whether I am actually having my packets shredded to 1492, and if so, why? The CHPID, the Linuxes and the z/OS target all have the correct settings for an MTU of that size (CHPID 64K, the MTU for interfaces 64K less 8K overhead). I've had this open with Novell for about three weeks and they're of absolutely NO help.

Thanks,
-J

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390