Patrick McHardy wrote:
> Chinh Nguyen wrote:
>
>>Patrick McHardy wrote:
>>
>>>What values does skb->ip_summed have before that?
>>
>>the skb->ip_summed value before the checksum check in tcp_v4_rcv is
>>CHECKSUM_NONE. Hence tcp_v4_rcv checks its value, which is incorrect
>>because the checksum is with regards to the private IP but the NAT
>>device has modified the source IP.
>
> Netfilter recalculates the checksum when NATing it.
The NATing is not done by netfilter but by the NAT device between the
IPsec peers.

>>I believe that skb->ip_summed is set to CHECKSUM_NONE by esp_input
>>(net/ipv4/esp4.c:180) which is called by xfrm4_rcv_encap
>>(net/ipv4/xfrm4_input.c:101).
>
> The question is why the checksum is invalid. Please start by describing
> what you're trying to do.

[Linux ipsec client C] ------ [NAT device] ---------- [Linux ipsec server S]

C negotiates an IPsec Transport Mode connection with S. Because of
Transport Mode/NAT-T, 2 things happen to an IPsec packet.

1. It is UDP-encapsulated, typically on port 4500/udp.

2. Transport Mode traffic leaves the original IP header alone, whereas
   tunnel mode wraps the entire traffic in a second IP header.

As such, when the packet passes through the NAT device, its source IP is
rewritten to N, the NAT device's address. However, the original
unencrypted packet had source IP C. S rips off the UDP-encap header,
decrypts the payload, and "joins" the content back to the IP header. If
the decrypted content is UDP or TCP, the UDP/TCP checksum is now
incorrect, because it was computed with source IP C and the IP header now
says N. (In tunnel mode, we would ignore the NAT-ted outer IP header
because the decrypted content has an entire IP header + UDP/TCP etc.)

This is a well-known problem with transport mode/NAT. One solution is to
use NAT-OA and NAT-OR to recalculate the checksum. The linux kernel does
the simpler thing of ignoring the UDP/TCP checksum altogether in this
particular case:

function esp_post_input (net/ipv4/esp4.c)

290             /*
291              * 2) ignore UDP/TCP checksums in case
292              *    of NAT-T in Transport Mode, or
293              *    perform other post-processing fixes
294              *    as per draft-ietf-ipsec-udp-encaps-06,
295              *    section 3.1.2
296              */
297             if (!x->props.mode)
298                     skb->ip_summed = CHECKSUM_UNNECESSARY;
299
300             break;

As noted, esp_post_input is called in xfrm4_policy_check.

Decrypted UDP traffic through transport mode/NAT also has bad checksums.
However, since it is passed through udp_queue_rcv_skb after decryption,
and this function calls xfrm4_policy_check before checking the UDP
checksum, line 298 means the kernel ignores the bad checksum.

Decrypted TCP traffic has bad checksums too. But since tcp_v4_rcv checks
the TCP checksum before calling xfrm4_policy_check, the bad checksum
means the TCP packet is dropped as a bad segment.

The end result is that UDP and other traffic (e.g., ICMP) can pass through
transport mode/NAT but TCP cannot.

I don't know what the correct fix is. Adding an extra call to
xfrm4_policy_check in tcp_v4_rcv before the checksum check fixes this
problem and doesn't seem to break anything else. On the other hand,
moving some of the code in esp_post_input into esp_input (especially
line 298) will work, too.