(Apologies in advance if this arrives as a duplicate of the
messages I have been trying to send for the past few hours;
I think my posts were lost to the list after my return
address changed, and I hope re-subscribing fixed that.)
Here goes:
Hello all,
When the IPF 4.1.32_rc5 module is loaded on Solaris 10u6
x86_64 (even with empty rule files), the host running ipf
drops out of traceroute output (UDP as well as ICMP).
My take (explained in detail below) is that this is caused
by a byte-swapping problem in the IP header.
I had not seen this kind of error before, so I guessed it was
a relatively new regression. However, I confirmed that it also
happens with IPF 4.1.29 on Solaris 10u4 x86_64.
It doesn't happen on IPF 4.1.28 on Solaris 8mu7 (x86 32-bit).
IMHO this may point to trouble with 64-bitness or with
kernel changes made over the past 8 years...
I then booted in 32-bit mode, and the problem is present
there as well, so it is not linked to the 64-bitness of the
kernel/tools.
Then I built IPFilter 4.1.28 on Solaris 10u6 x86_64, and the
host fails to appear in the traceroute output in the same way:
with a bad packet length in the IP header and a wrong checksum.
So now I think this is some fault of the new sol10u4/sol10u6
kernels... I wonder what the gurus think - I'm annoyed by
making so many wrong assumptions in such a short timeframe ;)
I have also checked another system running stock, unpatched
Solaris 10u6 x86_64 with its bundled IPFilter 4.1.9, and
could not reproduce the problem there. This may be because
that system has only one NIC, or because the stock IPF
was tweaked to work. If the latter is true, please consider
backporting those tweaks to the main tree :)
DETAILS
According to tcpdump, here's what happens:
18:19:56.928991 IP 81.5.113.125 > 192.168.1.141: ICMP echo request, id
1024, seq 19380, length 72
18:19:56.929014 IP truncated-ip - 28560 bytes missing! 81.5.113.2 >
81.5.113.125: icmp
For UDP it looks like this:
18:41:01.694741 IP (tos 0x0, ttl 1, id 62849, offset 0, flags [none],
proto: UDP (17), length: 40) 81.5.113.98.47318 > 192.168.1.141.33434:
[udp sum ok] UDP, length 12
18:41:01.694761 IP truncated-ip - 17340 bytes missing! (tos 0x0, ttl
255, id 61609, offset 512, flags [none], proto: ICMP (1), length: 17408,
bad cksum 6a0 (->2a4)!) 81.5.113.2 > 81.5.113.98: icmp
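
A quick sanity check of the byte-swapping theory against the
"bytes missing" figures above. This is only a sketch, under two
assumptions of mine: that tcpdump reports the length claimed in
the IP header minus the bytes actually present, and that the real
replies are shorter than 256 bytes, so a byte-swapped length field
is exactly 256 times the real length. The helper names are made up
for illustration.

```python
def swap16(value: int) -> int:
    """Swap the two bytes of a 16-bit value (0x0044 <-> 0x4400)."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


def real_length_from_missing(missing: int) -> int:
    """Recover the real packet length n from tcpdump's 'bytes missing'.

    Assumption: the length field holds n with its bytes swapped and
    n < 256, so the claimed length is 256 * n and the shortfall is
    256 * n - n = 255 * n.
    """
    if missing <= 0 or missing % 255 != 0:
        raise ValueError("shortfall is not a multiple of 255")
    return missing // 255


if __name__ == "__main__":
    # The two shortfalls reported by tcpdump in the captures above.
    for missing in (28560, 17340):
        n = real_length_from_missing(missing)
        print(f"missing={missing}: real length {n}, swapped field {swap16(n)}")
```

For the verbose capture this gives a real length of 68 bytes and
a swapped field of 17408, which is the length tcpdump printed.
For the first capture it gives 112 bytes; tcpdump did not print
the length field there, so that value is only inferred.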
I found a thread about a similar problem from a few years ago:
http://readlist.com/lists/netbsd.org/current-users/0/3580.html
Posters there suggested that the byte ordering could be broken
and that the fix would be trivial once they found where to
apply it. I have not seen any follow-ups.
Apparently the same thing happens here: the total-length field
in the response packet's IP header is wrong (0x4400 = 17408,
instead of 0x0044 = 68, I guess):
18:46:04.131031 IP (tos 0x0, ttl 1, id 31946, offset 0, flags [none],
proto: ICMP (1), length: 40) 81.5.113.98 > 192.168.1.140: ICMP echo
request, id 47321, seq 1, length 20
0x0000: 4500 0028 7cca 0000 0101 b86f 5105 7162
0x0010: c0a8 018c 0800 0b22 b8d9 0001 0101 0000
0x0020: 4269 f849 ef4e 0900 0000 0000 0000
18:46:04.131046 IP truncated-ip - 17340 bytes missing! (tos 0x0, ttl
255, id 61616, offset 512, flags [none], proto: ICMP (1), length: 17408,
bad cksum 699 (->29d)!) 81.5.113.2 > 81.5.113.98: icmp
0x0000: 4500 4400 f0b0 0040 ff01 0699 5105 7102
0x0010: 5105 7162 0b00 f4ff 0000 0000 4500 0028
0x0020: 7cca 0000 0001 b96f 5105 7162 c0a8 018c
0x0030: 0800 0b22 b8d9
UDP:
18:44:51.683621 IP (tos 0x0, ttl 1, id 62856, offset 0, flags [none],
proto: UDP (17), length: 40) 81.5.113.98.47320 > 192.168.1.141.33434:
[udp sum ok] UDP, length 12
0x0000: 4500 0028 f588 0000 0111 3fa0 5105 7162
0x0010: c0a8 018d b8d8 829a 0014 867b 0001 0000
0x0020: fa68 f849 c486 0200 0000 0000 0000
18:44:51.683635 IP truncated-ip - 17340 bytes missing! (tos 0x0, ttl
255, id 61612, offset 512, flags [none], proto: ICMP (1), length: 17408,
bad cksum 69d (->2a1)!) 81.5.113.2 > 81.5.113.98: icmp
0x0000: 4500 4400 f0ac 0040 ff01 069d 5105 7102
0x0010: 5105 7162 0b00 79c2 0000 0000 4500 0028
0x0020: f588 0000 0011 40a0 5105 7162 c0a8 018d
0x0030: b8d8 829a 0014
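
The two reply headers above can be checked mechanically. The
sketch below is an illustration only, not IPF code, and the
function names are hypothetical. It recomputes the standard IPv4
header checksum over the 20 header bytes as captured, then again
after swapping the bytes of the total-length field (bytes 2-3)
and of the flags/fragment-offset field (bytes 6-7). Swapping the
second field is my own assumption, based on tcpdump reporting
"offset 512" where a DF flag (0x4000) would be plausible.

```python
import struct

# First 20 bytes of the two ICMP replies, copied from the dumps above.
ICMP_CASE = bytes.fromhex("4500 4400 f0b0 0040 ff01 0699 5105 7102 5105 7162")
UDP_CASE = bytes.fromhex("4500 4400 f0ac 0040 ff01 069d 5105 7102 5105 7162")


def stored_checksum(header: bytes) -> int:
    """Return the checksum field (bytes 10-11) as found in the header."""
    return struct.unpack("!H", header[10:12])[0]


def ip_checksum(header: bytes) -> int:
    """Ones'-complement checksum of a 20-byte header, checksum field zeroed."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    total = sum(struct.unpack("!10H", zeroed))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def swap_len_and_off(header: bytes) -> bytes:
    """Byte-swap the total-length and flags/offset fields only."""
    fixed = bytearray(header)
    fixed[2], fixed[3] = fixed[3], fixed[2]
    fixed[6], fixed[7] = fixed[7], fixed[6]
    return bytes(fixed)


if __name__ == "__main__":
    for name, hdr in (("ICMP", ICMP_CASE), ("UDP", UDP_CASE)):
        fixed = swap_len_and_off(hdr)
        print(name,
              "stored=%#06x" % stored_checksum(hdr),
              "as captured=%#06x" % ip_checksum(hdr),
              "after swap=%#06x" % ip_checksum(fixed),
              "length after swap=%d" % struct.unpack("!H", fixed[2:4])[0])
```

For both replies the checksum over the bytes as captured comes
out to the value tcpdump says it expected (0x029d and 0x02a1),
while after swapping those two fields it matches the checksum
actually stored in the packet (0x0699 and 0x069d) and the length
becomes 68. That is consistent with the checksum having been
computed over a correctly ordered header and only these two
16-bit fields going out in the wrong byte order, though it does
not show where in the stack that happens.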
I am not yet sure whether this affects any connectivity
besides traceroute, but it looks very fishy ;(
PS: One minor modification was needed to get 4.1.28 to build
on sol10u6; I backported a few lines from 4.1.32rc5:
--- ip_compat.h.orig 2007-10-10 13:51:42.000000000 +0400
+++ ip_compat.h.jimfix 2009-04-30 10:22:52.716355861 +0400
@@ -175,7 +175,12 @@
# endif
# include <inet/ip.h>
# undef COPYOUT
-# include <inet/ip_ire.h>
+
+// backported ifndef from 4.1.32rc5 - jim 20090430
+# if !defined(_SYS_NETI_H)
+# include <inet/ip_ire.h>
+# endif
+
# ifndef KERNEL
# undef _KERNEL
# endif
--
+============================================================+
| |
| Климов Евгений, Jim Klimov |
| технический директор CTO |
| ЗАО "ЦОС и ВТ" JSC COS&HT |
| |
| +7-903-7705859 (cellular) mailto:[email protected] |
| CC:[email protected],[email protected] |
+============================================================+