Chinh Nguyen wrote:
> Patrick McHardy wrote:
> 
>>Netfilter recalculates the checksum when NATing it.
> 
> 
> The NATing is not done by netfilter but by the NAT device between the IPsec 
> peers.

I see, so the TCP checksum includes the wrong IPs.
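
To make that concrete, here is a rough userspace sketch (not kernel code) of how the TCP checksum depends on the IP addresses through the pseudo-header. The function names and the fixed checksum offset are made up for illustration; the only assumptions are the standard pseudo-header layout (source, destination, zero, protocol, length) and a plain TCP header with the checksum at byte offset 16.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add big-endian 16-bit words of a buffer to a running one's-complement sum. */
uint32_t csum_add(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += (uint32_t)p[0] << 8 | p[1];
        p += 2;
        len -= 2;
    }
    if (len)                       /* odd trailing byte is padded with zero */
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Fold carries back in and return the complemented 16-bit result. */
uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Checksum over pseudo-header + segment, as used by TCP and UDP over IPv4. */
uint16_t l4_csum(const uint8_t saddr[4], const uint8_t daddr[4],
                 uint8_t proto, const uint8_t *seg, size_t len)
{
    uint8_t ph[12];

    memcpy(ph, saddr, 4);
    memcpy(ph + 4, daddr, 4);
    ph[8] = 0;
    ph[9] = proto;
    ph[10] = (uint8_t)(len >> 8);
    ph[11] = (uint8_t)(len & 0xff);
    return csum_fold(csum_add(csum_add(0, ph, sizeof(ph)), seg, len));
}

/* Sender side: write the checksum into the TCP header (offset 16). */
void tcp_fill_csum(const uint8_t saddr[4], const uint8_t daddr[4],
                   uint8_t *seg, size_t len)
{
    uint16_t c;

    seg[16] = seg[17] = 0;
    c = l4_csum(saddr, daddr, 6, seg, len);
    seg[16] = (uint8_t)(c >> 8);
    seg[17] = (uint8_t)(c & 0xff);
}

/* Receiver side: a segment verifies if the sum over everything folds to 0. */
int tcp_csum_ok(const uint8_t saddr[4], const uint8_t daddr[4],
                const uint8_t *seg, size_t len)
{
    return l4_csum(saddr, daddr, 6, seg, len) == 0;
}
```

If the sender fills the checksum using its own (pre-NAT) source address and the receiver verifies against the post-NAT source address from the IP header, tcp_csum_ok() fails whenever the two addresses differ in their one's-complement sum, which is the usual case.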

> [Linux ipsec client C] ------ [NAT device] ---------- [Linux ipsec server S]
> 
> C negotiates IPsec Transport Mode with S. Because of Transport Mode/NAT-T, two
> things happen to an IPsec packet.
> 
> 1. It is UDP-encapsulated, typically on port 4500/udp.
> 2. Transport Mode traffic leaves the original IP header alone whereas tunnel
> mode wraps the entire traffic in a second IP header. As such, when the packet
> passes through the NAT device, the source IP is N. However, the original
> unencrypted packet had source IP C.
> 
> S rips off the UDP-encap header, decrypts the payload, and "joins" the content
> back to the IP header. If the decrypted content is UDP or TCP, the UDP/TCP
> checksum is now incorrect because the source IP is now N not C.
> 
> (In tunnel mode, we would ignore the NAT-ted outer IP header because the
> decrypted content has an entire IP header + UDP/TCP etc)
> 
> This is a well-known problem with transport mode/NAT. One solution is to use
> NAT-OA and NAT-OR to recalculate the checksum. The Linux kernel does the
> simpler thing of ignoring the UDP/TCP checksum altogether in this particular
> case:
> 
> function esp_post_input (net/ipv4/esp4.c)
>     290             /*
>     291              * 2) ignore UDP/TCP checksums in case
>     292              *    of NAT-T in Transport Mode, or
>     293              *    perform other post-processing fixes
>     294              *    as per * draft-ietf-ipsec-udp-encaps-06,
>     295              *    section 3.1.2
>     296              */
>     297             if (!x->props.mode)
>     298                 skb->ip_summed = CHECKSUM_UNNECESSARY;
>     299
>     300             break;
> 
> 
> As noted, esp_post_input is called in xfrm4_policy_check. Decrypted UDP
> traffic through transport mode/NAT also has bad checksums. However, since it
> is passed through udp_queue_rcv_skb after decryption, and this function calls
> xfrm4_policy_check before checking the UDP checksum, line 298 means the kernel
> ignores the bad checksum.
> 
> Decrypted TCP traffic has bad checksums too. But since tcp_v4_rcv checks the
> TCP checksum before calling xfrm4_policy_check, the bad checksum means the TCP
> packet is dropped as a bad segment.
> 
> The end result is that UDP and other traffic (eg, ICMP) can pass through
> transport mode/nat but not TCP.
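
The ordering difference described above can be modelled with a toy sketch. Everything here (struct toy_skb, the toy_* functions, the CSUM_* constants) is a hypothetical stand-in, not the kernel's real types or call chain; the policy check is reduced to its one side effect from line 298, and the third function models the "extra xfrm4_policy_check before the checksum check" idea.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's CHECKSUM_* values (illustrative only). */
enum { CSUM_NONE = 0, CSUM_UNNECESSARY = 1 };

struct toy_skb {
    int  ip_summed;        /* CSUM_NONE or CSUM_UNNECESSARY */
    bool csum_valid;       /* would the L4 checksum verify against the IP header? */
    bool nat_t_transport;  /* decapsulated from UDP-encapsulated transport mode ESP */
};

/* Models only the side effect of esp_post_input, reached via the policy check. */
void toy_policy_check(struct toy_skb *skb)
{
    if (skb->nat_t_transport)
        skb->ip_summed = CSUM_UNNECESSARY;
}

bool toy_csum_ok(const struct toy_skb *skb)
{
    return skb->ip_summed == CSUM_UNNECESSARY || skb->csum_valid;
}

/* UDP order: policy check first, checksum second. */
bool toy_udp_rcv(struct toy_skb skb)
{
    toy_policy_check(&skb);
    return toy_csum_ok(&skb);
}

/* TCP order: checksum first, so the flag is set too late. */
bool toy_tcp_rcv(struct toy_skb skb)
{
    if (!toy_csum_ok(&skb))
        return false;          /* dropped as a bad segment */
    toy_policy_check(&skb);
    return true;
}

/* Proposed variant: run the policy check before validating the checksum. */
bool toy_tcp_rcv_fixed(struct toy_skb skb)
{
    toy_policy_check(&skb);
    return toy_csum_ok(&skb);
}
```

For a packet with nat_t_transport set and csum_valid clear, the UDP path and the reordered TCP path accept it, while the current TCP ordering drops it.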
> 
> I don't know what the correct fix is. Adding an extra call to
> xfrm4_policy_check in tcp_v4_rcv before the checksum check fixes this problem
> and doesn't seem to break anything else. On the other hand, moving some of the
> code in esp_post_input into esp_input (especially line 298) will work, too.

So we could either move checksum validation after xfrm4_policy_check or
set ip_summed to CHECKSUM_UNNECESSARY already in esp_input. Setting
ip_summed in esp_input looks easier. But this still leaves one
problem: with netfilter and local NAT, a decapsulated transport mode
packet might be forwarded to another host, and in that case the
checksum contained in the packet is invalid. Does anyone have ideas
on how to fix this?
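
For reference, the recalculation alternative mentioned in the quoted text (as opposed to ignoring the checksum) could look roughly like the sketch below. It assumes the receiver has learned the original pre-NAT address somehow, e.g. from a NAT-OA payload, and uses the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m')). The function name is made up, this is untested userspace code, and it says nothing about how the kernel would carry the original address down to this point.

```c
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Adjust a TCP/UDP checksum that was computed over old_addr so that it
 * verifies against new_addr instead (RFC 1624, eqn. 3, applied to both
 * 16-bit halves of the IPv4 address). 'check' is the checksum field
 * value in host order.
 */
uint16_t csum_replace_addr(uint16_t check, const uint8_t old_addr[4],
                           const uint8_t new_addr[4])
{
    uint32_t sum = (uint16_t)~check;
    int i;

    for (i = 0; i < 4; i += 2) {
        uint16_t m_old = (uint16_t)(old_addr[i] << 8 | old_addr[i + 1]);
        uint16_t m_new = (uint16_t)(new_addr[i] << 8 | new_addr[i + 1]);

        sum += (uint16_t)~m_old;   /* remove the old word */
        sum += m_new;              /* add the new word */
    }
    return (uint16_t)~fold16(sum);
}
```

Here old_addr would be the original source address the sender used and new_addr the post-NAT address in the received IP header, so that a packet forwarded onwards carries a checksum consistent with its header.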