Re: Problem with Ipsec transport mode over NAT
Patrick McHardy wrote:
>> I don't know what correct fix is. Adding an extra call to
>> xfrm4_policy_check in tcp_v4_rcv before the checksum check fixes this
>> problem and doesn't seem to break anything else. On the other hand,
>> moving some of the code in esp_post_input into esp_input (especially
>> line 298) will work, too.
>
> So we could move checksum validation behind xfrm4_policy_check or
> already set ip_summed to CHECKSUM_UNNECESSARY in esp_input. Already
> setting ip_summed in esp4_input looks easier. But this still leaves
> one problem. With netfilter and local NAT, a decapsulated transport
> mode packet might be forwarded to another host. In that case the
> checksum contained in the packet is invalid. Any ideas how to fix
> this anyone?

I don't know what the functional separation or difference between a
packet input function and a packet post input function is, but the
entire code in esp_post_input doesn't seem like it would cause any
problem just by placing it at the end of esp_input instead of its
current location.

A forwarded decapsulated packet would have the destination IP changed
from server S to another IP. Shouldn't that cause the stack to
automatically recalculate the checksum?

Anyway, enough speculation. I will leave the solution to those who
know the Linux kernel networking code.

Thanks.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Problem with Ipsec transport mode over NAT
On Fri, Feb 24, 2006 at 04:57:33AM, Patrick McHardy wrote:
>
> So we could move checksum validation behind xfrm4_policy_check or
> already set ip_summed to CHECKSUM_UNNECESSARY in esp_input. Already
> setting ip_summed in esp4_input looks easier. But this still leaves

Absolutely. The only reason post_input exists at all is that it gives
us the potential to adjust the checksums incrementally in future which
we ought to do.

However, after thinking about it for a bit we can adjust the checksums
without using this post_input stuff at all. The crucial point is that
only the inner-most NAT-T SA needs to be considered when adjusting
checksums. What's more, the checksum adjustment comes down to a single
u32 due to the linearity of IP checksums.

We just happen to have a spare u32 lying around in our skb structure :)
When ip_summed is set to CHECKSUM_NONE on input, the value of skb->csum
is currently unused. All we have to do is to make that the checksum
adjustment and voila, there goes all the post_input and decap
structures!

I'll send patches to get rid of post_input now.

> one problem. With netfilter and local NAT, a decapsulated transport
> mode packet might be forwarded to another host. In that case the
> checksum contained in the packet is invalid. Any ideas how to fix
> this anyone?

I suppose you should treat CHECKSUM_UNNECESSARY as an indication that
you need to recompute the checksum from scratch instead of adjusting
it. So start by getting skb_checksum_help to only zap CHECKSUM_HW, and
then test on this in the *_manip_pkt functions.

BTW, the original address (nat_oa) structure is wrong. We need the
original src as well as the original dst addresses to incrementally
adjust the checksum. I wonder why everyone keeps getting this wrong.
Fortunately it shouldn't be too hard to fix up, for netlink at least.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
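For anyone following the "single u32" remark above, here is a minimal user-space sketch of incremental checksum adjustment in the style of RFC 1624. It is not the kernel patch being discussed. The names sum_words, fold, l4_checksum, addr_delta and csum_apply_delta are hypothetical helpers made up for illustration, and addresses are assumed to be plain host-order integers. The point it tries to show is that the deltas for a changed source and a changed destination address can be summed into one 32-bit value and applied in a single step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes, read as big-endian 16-bit words, to accumulator acc. */
static uint32_t sum_words(const uint8_t *buf, size_t len, uint32_t acc)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        acc += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        acc += (uint32_t)buf[len - 1] << 8;   /* odd byte, zero padded */
    return acc;
}

/* Fold the carries back in (one's-complement addition). */
static uint16_t fold(uint32_t acc)
{
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)acc;
}

/* Full TCP/UDP checksum: pseudo-header (src, dst, proto, len) + segment. */
static uint16_t l4_checksum(uint32_t src, uint32_t dst, uint8_t proto,
                            const uint8_t *seg, size_t len)
{
    uint32_t acc = 0;

    assert(len < 65536);                      /* length field is 16 bits */
    acc += (src >> 16) + (src & 0xffff);
    acc += (dst >> 16) + (dst & 0xffff);
    acc += proto;
    acc += (uint32_t)len;
    return (uint16_t)~fold(sum_words(seg, len, acc));
}

/*
 * Adjustment for replacing old_addr with new_addr in the pseudo-header:
 * the one's-complement sum of ~old and new. Deltas for several changed
 * fields may simply be added together (linearity).
 */
static uint32_t addr_delta(uint32_t old_addr, uint32_t new_addr)
{
    uint32_t d = 0;

    d += (~old_addr >> 16) & 0xffff;
    d += ~old_addr & 0xffff;
    d += new_addr >> 16;
    d += new_addr & 0xffff;
    return d;
}

/* Apply an accumulated delta: HC' = ~(~HC + delta), per RFC 1624. */
static uint16_t csum_apply_delta(uint16_t check, uint32_t delta)
{
    uint32_t acc = (uint16_t)~check;

    acc += delta;
    return (uint16_t)~fold(acc);
}
```

A caller would compute addr_delta(original_src, seen_src) + addr_delta(original_dst, seen_dst) once and keep that sum as the adjustment. This is also why both original addresses are needed, as noted above about nat_oa.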
Re: Problem with Ipsec transport mode over NAT
Patrick McHardy wrote:
> Chinh Nguyen wrote:
>> I discovered that the bug is in the function tcp_v4_rcv for kernel
>> 2.6.16-rc1.
>>
>> After the ESP packet is decapped and decrypted in
>> xfrm4_rcv_encap_finish, the unencrypted packet is pushed back through
>> ip_local_deliver. For a UDP packet, it goes (back) to function
>> udp_queue_rcv_skb. The first thing this function does is call
>> xfrm4_policy_check. As noted previously, in xfrm4_policy_check, if
>> skb->sp != NULL, the esp_post_input function is called. The post
>> input function sets skb->ip_summed to CHECKSUM_UNNECESSARY if we are
>> in transport mode. Therefore, further down in udp_queue_rcv_skb, we
>> skip the checksum check and the packet is passed up the stack.
>>
>> However, for a decrypted TCP packet, the packet goes to tcp_v4_rcv.
>> This function does the checksum check right away if
>> skb->ip_summed != CHECKSUM_UNNECESSARY while xfrm4_policy_check is
>> called a little later in the function. Therefore, the esp post input
>> has not yet set the ip_summed to unnecessary. The decrypted packet
>> fails the checksum and is discarded.
>>
>> To confirm this, I added another call to xfrm4_policy_check before
>> the checksum check in tcp_v4_rcv (to call esp post input). Once
>> patched, my systems were able to initiate TCP connections using
>> Transport Mode/NAT.
>
> What values does skb->ip_summed have before that?

The skb->ip_summed value before the checksum check in tcp_v4_rcv is
CHECKSUM_NONE. Hence tcp_v4_rcv checks its value, which is incorrect
because the checksum is with regards to the private IP but the NAT
device has modified the source IP.

I believe that skb->ip_summed is set to CHECKSUM_NONE by esp_input
(net/ipv4/esp4.c:180) which is called by xfrm4_rcv_encap
(net/ipv4/xfrm4_input.c:101).
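The ordering difference described above can be shown with a toy model. This is only a sketch under stated assumptions: struct toy_skb and every toy_* function are invented for illustration and merely mimic the order of operations reported for udp_queue_rcv_skb and tcp_v4_rcv. None of it is the actual kernel code.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_CHECKSUM_NONE, TOY_CHECKSUM_UNNECESSARY };

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct toy_skb {
    int ip_summed;       /* TOY_CHECKSUM_NONE after decapsulation */
    int has_sec_path;    /* models skb->sp != NULL */
    int transport_mode;  /* models !x->props.mode */
    int csum_ok;         /* 0: checksum in the packet does not verify */
};

/* Models the side effect of esp_post_input reached via the policy check. */
static void toy_policy_check(struct toy_skb *skb)
{
    assert(skb != NULL);
    if (skb->has_sec_path && skb->transport_mode)
        skb->ip_summed = TOY_CHECKSUM_UNNECESSARY;
}

/* The checksum step passes if it is skipped or the checksum is good. */
static int toy_csum_passes(const struct toy_skb *skb)
{
    return skb->ip_summed == TOY_CHECKSUM_UNNECESSARY || skb->csum_ok;
}

/* UDP order: policy check first, checksum second. Returns 1 if delivered. */
static int toy_udp_rcv(struct toy_skb *skb)
{
    toy_policy_check(skb);
    return toy_csum_passes(skb);
}

/* TCP order as reported: checksum first, so the packet is dropped. */
static int toy_tcp_rcv(struct toy_skb *skb)
{
    if (!toy_csum_passes(skb))
        return 0;
    toy_policy_check(skb);
    return 1;
}

/* TCP with the extra policy check moved ahead of the checksum step. */
static int toy_tcp_rcv_patched(struct toy_skb *skb)
{
    toy_policy_check(skb);
    return toy_csum_passes(skb);
}
```

With a decapsulated transport-mode packet whose checksum no longer verifies, the UDP and patched TCP paths deliver it while the unpatched TCP path drops it. That matches the behaviour reported in this thread.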
Re: Problem with Ipsec transport mode over NAT
Chinh Nguyen wrote:
> Patrick McHardy wrote:
>> What values does skb->ip_summed have before that?
>
> The skb->ip_summed value before the checksum check in tcp_v4_rcv is
> CHECKSUM_NONE. Hence tcp_v4_rcv checks its value, which is incorrect
> because the checksum is with regards to the private IP but the NAT
> device has modified the source IP.

Netfilter recalculates the checksum when NATing it.

> I believe that skb->ip_summed is set to CHECKSUM_NONE by esp_input
> (net/ipv4/esp4.c:180) which is called by xfrm4_rcv_encap
> (net/ipv4/xfrm4_input.c:101).

The question is why the checksum is invalid. Please start by
describing what you're trying to do.
Re: Problem with Ipsec transport mode over NAT
Patrick McHardy wrote:
> Chinh Nguyen wrote:
>> Patrick McHardy wrote:
>>> What values does skb->ip_summed have before that?
>>
>> The skb->ip_summed value before the checksum check in tcp_v4_rcv is
>> CHECKSUM_NONE. Hence tcp_v4_rcv checks its value, which is incorrect
>> because the checksum is with regards to the private IP but the NAT
>> device has modified the source IP.
>
> Netfilter recalculates the checksum when NATing it.

The NATing is not done by netfilter but by the NAT device between the
IPsec peers.

>> I believe that skb->ip_summed is set to CHECKSUM_NONE by esp_input
>> (net/ipv4/esp4.c:180) which is called by xfrm4_rcv_encap
>> (net/ipv4/xfrm4_input.c:101).
>
> The question is why the checksum is invalid. Please start by
> describing what you're trying to do.

[Linux ipsec client C] -- [NAT device] -- [Linux ipsec server S]

C negotiates an IPsec Transport Mode with S. Because of Transport
Mode/NAT-T, 2 things happen to an IPsec packet.

1. It is UDP-encapsulated, typically on port 4500/udp.
2. Transport Mode traffic leaves the original IP header alone whereas
   tunnel mode wraps the entire traffic in a second IP header.

As such, when the packet passes through the NAT device, the source IP
is N. However, the original unencrypted packet had source IP C.

S rips off the UDP-encap header, decrypts the payload, and joins the
content back to the IP header. If the decrypted content is UDP or TCP,
the UDP/TCP checksum is now incorrect because the source IP is now N
not C. (In tunnel mode, we would ignore the NAT-ted outer IP header
because the decrypted content has an entire IP header + UDP/TCP etc.)

This is a well-known problem with transport mode/NAT. One solution is
to use NAT-OA and NAT-OR to recalculate the checksum. The linux kernel
does the simpler thing of ignoring the UDP/TCP checksum altogether in
this particular case:

function esp_post_input (net/ipv4/esp4.c)

    290   /*
    291    * 2) ignore UDP/TCP checksums in case
    292    *    of NAT-T in Transport Mode, or
    293    *    perform other post-processing fixes
    294    *    as per * draft-ietf-ipsec-udp-encaps-06,
    295    *    section 3.1.2
    296    */
    297   if (!x->props.mode)
    298           skb->ip_summed = CHECKSUM_UNNECESSARY;
    299
    300   break;

As noted, esp_post_input is called in xfrm4_policy_check.

Decrypted UDP traffic through transport mode/nat also has bad
checksums. However, since it is passed through udp_queue_rcv_skb after
decryption, and this function calls xfrm4_policy_check before checking
the UDP checksum, line 298 means the kernel ignores the bad checksum.

Decrypted TCP traffic has bad checksums too. But since tcp_v4_rcv
checks the TCP checksum before calling xfrm4_policy_check, the bad
checksum means the TCP packet is dropped as a bad segment.

The end result is that UDP and other traffic (eg, ICMP) can pass
through transport mode/nat but not TCP.

I don't know what correct fix is. Adding an extra call to
xfrm4_policy_check in tcp_v4_rcv before the checksum check fixes this
problem and doesn't seem to break anything else. On the other hand,
moving some of the code in esp_post_input into esp_input (especially
line 298) will work, too.
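To make the checksum mismatch above concrete, here is a small user-space sketch. It assumes a simplified TCP/UDP checksum over the usual IPv4 pseudo-header. sum_words, fold, l4_checksum and l4_verify are hypothetical helper names made up for illustration, not kernel functions, and addresses are plain host-order integers. A segment checksummed with source C verifies against C but not against the NAT address N.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes, read as big-endian 16-bit words, to accumulator acc. */
static uint32_t sum_words(const uint8_t *buf, size_t len, uint32_t acc)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        acc += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        acc += (uint32_t)buf[len - 1] << 8;   /* odd byte, zero padded */
    return acc;
}

/* Fold the carries back in (one's-complement addition). */
static uint16_t fold(uint32_t acc)
{
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)acc;
}

/*
 * Checksum over the pseudo-header (src, dst, protocol, length) followed
 * by the segment. With the checksum field zeroed this yields the value
 * to store; with the field filled in, a valid segment yields 0.
 */
static uint16_t l4_checksum(uint32_t src, uint32_t dst, uint8_t proto,
                            const uint8_t *seg, size_t len)
{
    uint32_t acc = 0;

    assert(len < 65536);                      /* length field is 16 bits */
    acc += (src >> 16) + (src & 0xffff);
    acc += (dst >> 16) + (dst & 0xffff);
    acc += proto;
    acc += (uint32_t)len;
    return (uint16_t)~fold(sum_words(seg, len, acc));
}

/* Returns 1 if the checksum stored in seg verifies for this src/dst. */
static int l4_verify(uint32_t src, uint32_t dst, uint8_t proto,
                     const uint8_t *seg, size_t len)
{
    return l4_checksum(src, dst, proto, seg, len) == 0;
}
```

The sender C fills in the checksum using its private address. The receiver S sees N in the IP header after decapsulation, so l4_verify with N as the source fails even though the segment itself is intact.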
Re: Problem with Ipsec transport mode over NAT
Chinh Nguyen wrote:
> Patrick McHardy wrote:
>> Netfilter recalculates the checksum when NATing it.
>
> The NATing is not done by netfilter but by the NAT device between the
> IPsec peers.

I see, so the TCP checksum includes the wrong IPs.

> [Linux ipsec client C] -- [NAT device] -- [Linux ipsec server S]
>
> C negotiates an IPsec Transport Mode with S. Because of Transport
> Mode/NAT-T, 2 things happen to an IPsec packet.
>
> 1. It is UDP-encapsulated, typically on port 4500/udp.
> 2. Transport Mode traffic leaves the original IP header alone whereas
>    tunnel mode wraps the entire traffic in a second IP header.
>
> As such, when the packet passes through the NAT device, the source IP
> is N. However, the original unencrypted packet had source IP C.
>
> S rips off the UDP-encap header, decrypts the payload, and joins the
> content back to the IP header. If the decrypted content is UDP or TCP,
> the UDP/TCP checksum is now incorrect because the source IP is now N
> not C. (In tunnel mode, we would ignore the NAT-ted outer IP header
> because the decrypted content has an entire IP header + UDP/TCP etc.)
>
> This is a well-known problem with transport mode/NAT. One solution is
> to use NAT-OA and NAT-OR to recalculate the checksum. The linux kernel
> does the simpler thing of ignoring the UDP/TCP checksum altogether in
> this particular case:
>
> function esp_post_input (net/ipv4/esp4.c)
>
>     290   /*
>     291    * 2) ignore UDP/TCP checksums in case
>     292    *    of NAT-T in Transport Mode, or
>     293    *    perform other post-processing fixes
>     294    *    as per * draft-ietf-ipsec-udp-encaps-06,
>     295    *    section 3.1.2
>     296    */
>     297   if (!x->props.mode)
>     298           skb->ip_summed = CHECKSUM_UNNECESSARY;
>     299
>     300   break;
>
> As noted, esp_post_input is called in xfrm4_policy_check.
>
> Decrypted UDP traffic through transport mode/nat also has bad
> checksums. However, since it is passed through udp_queue_rcv_skb after
> decryption, and this function calls xfrm4_policy_check before checking
> the UDP checksum, line 298 means the kernel ignores the bad checksum.
>
> Decrypted TCP traffic has bad checksums too. But since tcp_v4_rcv
> checks the TCP checksum before calling xfrm4_policy_check, the bad
> checksum means the TCP packet is dropped as a bad segment.
>
> The end result is that UDP and other traffic (eg, ICMP) can pass
> through transport mode/nat but not TCP.
>
> I don't know what correct fix is. Adding an extra call to
> xfrm4_policy_check in tcp_v4_rcv before the checksum check fixes this
> problem and doesn't seem to break anything else. On the other hand,
> moving some of the code in esp_post_input into esp_input (especially
> line 298) will work, too.

So we could move checksum validation behind xfrm4_policy_check or
already set ip_summed to CHECKSUM_UNNECESSARY in esp_input. Already
setting ip_summed in esp4_input looks easier. But this still leaves
one problem. With netfilter and local NAT, a decapsulated transport
mode packet might be forwarded to another host. In that case the
checksum contained in the packet is invalid. Any ideas how to fix
this anyone?
Re: Problem with Ipsec transport mode over NAT
Chinh Nguyen wrote:
> I discovered that the bug is in the function tcp_v4_rcv for kernel
> 2.6.16-rc1.
>
> After the ESP packet is decapped and decrypted in
> xfrm4_rcv_encap_finish, the unencrypted packet is pushed back through
> ip_local_deliver. For a UDP packet, it goes (back) to function
> udp_queue_rcv_skb. The first thing this function does is call
> xfrm4_policy_check. As noted previously, in xfrm4_policy_check, if
> skb->sp != NULL, the esp_post_input function is called. The post
> input function sets skb->ip_summed to CHECKSUM_UNNECESSARY if we are
> in transport mode. Therefore, further down in udp_queue_rcv_skb, we
> skip the checksum check and the packet is passed up the stack.
>
> However, for a decrypted TCP packet, the packet goes to tcp_v4_rcv.
> This function does the checksum check right away if
> skb->ip_summed != CHECKSUM_UNNECESSARY while xfrm4_policy_check is
> called a little later in the function. Therefore, the esp post input
> has not yet set the ip_summed to unnecessary. The decrypted packet
> fails the checksum and is discarded.
>
> To confirm this, I added another call to xfrm4_policy_check before
> the checksum check in tcp_v4_rcv (to call esp post input). Once
> patched, my systems were able to initiate TCP connections using
> Transport Mode/NAT.

What values does skb->ip_summed have before that?