Patrick McHardy wrote:
> Chinh Nguyen wrote:
> 
>>Patrick McHardy wrote:
>>
>>
>>>What values does skb->ip_summed have before that?
>>
>>
>>the skb->ip_summed value before the checksum check in tcp_v4_rcv is
>>CHECKSUM_NONE. Hence tcp_v4_rcv verifies the checksum, which is incorrect
>>because it was computed over the private IP but the NAT device has modified
>>the source IP.
> 
> 
> Netfilter recalculates the checksum when NATing it.

The NATing is not done by netfilter but by the NAT device between the IPsec 
peers.

> 
>>I believe that skb->ip_summed is set to CHECKSUM_NONE by esp_input
>>(net/ipv4/esp4.c:180) which is called by xfrm4_rcv_encap
>>(net/ipv4/xfrm4_input.c:101).
> 
> 
> The question is why the checksum is invalid. Please start by describing
> what you're trying to do.

[Linux ipsec client C] ------ [NAT device] ---------- [Linux ipsec server S]

C negotiates IPsec transport mode with S. Because of transport mode plus NAT-T,
two things happen to an IPsec packet:

1. It is UDP-encapsulated, typically on port 4500/udp.
2. Transport mode leaves the original IP header in place, whereas tunnel mode
wraps the entire packet in a second IP header. As a result, when the packet
passes through the NAT device, its source IP is rewritten to N, even though the
original unencrypted packet had source IP C.

S rips off the UDP-encap header, decrypts the payload, and "joins" the content
back to the IP header. If the decrypted content is UDP or TCP, the UDP/TCP
checksum is now incorrect because the source IP is now N not C.
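
To make the failure concrete, here is a minimal user-space sketch (not kernel
code) of the TCP/UDP checksum, which covers a pseudo-header containing the
source and destination IP. The helper names and the bare 20-byte TCP header
are assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the 16-bit big-endian words of buf to a running 32-bit sum. */
static uint32_t sum_words(const uint8_t *buf, size_t len, uint32_t acc)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        acc += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        acc += (uint32_t)buf[len - 1] << 8;   /* odd trailing byte, zero padded */
    return acc;
}

/*
 * Internet checksum over the IPv4 pseudo-header plus the TCP/UDP segment.
 * With the checksum field zeroed this yields the value to store; with the
 * field filled in it yields 0 when the segment verifies.
 */
static uint16_t l4_checksum(uint32_t src, uint32_t dst, uint8_t proto,
                            const uint8_t *seg, size_t len)
{
    uint32_t acc = 0;

    acc += (src >> 16) + (src & 0xffff);
    acc += (dst >> 16) + (dst & 0xffff);
    acc += proto;
    acc += (uint32_t)len;
    acc = sum_words(seg, len, acc);
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);   /* fold carries back in */
    return (uint16_t)~acc;
}

/*
 * Hypothetical demo: the sender builds a bare 20-byte TCP header using
 * src_sent, the receiver verifies it against src_seen.
 * Returns 1 if the checksum verifies, 0 otherwise.
 */
static int verifies_with_source(uint32_t src_sent, uint32_t src_seen,
                                uint32_t dst)
{
    uint8_t seg[20] = { 0x04, 0xd2, 0x00, 0x50 };  /* ports 1234 -> 80 */
    uint16_t ck;

    seg[12] = 0x50;                   /* data offset: 5 words */
    seg[13] = 0x02;                   /* SYN */
    ck = l4_checksum(src_sent, dst, 6, seg, sizeof(seg));
    seg[16] = (uint8_t)(ck >> 8);     /* store the checksum the sender computed */
    seg[17] = (uint8_t)(ck & 0xff);
    return l4_checksum(src_seen, dst, 6, seg, sizeof(seg)) == 0;
}
```

With C = 10.0.0.2 (0x0a000002), N = 198.51.100.7 (0xc6336407) and
S = 203.0.113.10 (0xcb00710a), verifies_with_source(C, C, S) returns 1 while
verifies_with_source(C, N, S) returns 0, which is the situation S is in after
decryption.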

(In tunnel mode, we would ignore the NAT-ted outer IP header because the
decrypted content carries its own complete IP header plus UDP/TCP etc.)

This is a well-known problem with transport mode/NAT. One solution is to use
NAT-OA and NAT-OR to recalculate the checksum. The Linux kernel does the simpler
thing of ignoring the UDP/TCP checksum altogether in this particular case:

function esp_post_input (net/ipv4/esp4.c)
    290             /*
    291              * 2) ignore UDP/TCP checksums in case
    292              *    of NAT-T in Transport Mode, or
    293              *    perform other post-processing fixes
    294              *    as per * draft-ietf-ipsec-udp-encaps-06,
    295              *    section 3.1.2
    296              */
    297             if (!x->props.mode)
    298                 skb->ip_summed = CHECKSUM_UNNECESSARY;
    299
    300             break;
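
For comparison, the recalculation alternative mentioned above could look
roughly like the following user-space sketch. It assumes S has learned the
original source address C during negotiation (for example from a NAT-OA
payload) and applies the incremental update from RFC 1624,
HC' = ~(~HC + ~m + m'), to each 16-bit half of the address. The function names
are made up for illustration and this is not what the kernel does.

```c
#include <stdint.h>

/* One's-complement addition of two 16-bit values with end-around carry. */
static uint16_t oc_add(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;

    return (uint16_t)((s & 0xffff) + (s >> 16));
}

/*
 * Adjust a stored TCP/UDP checksum for a changed IPv4 address without
 * touching the rest of the segment: remove old_addr from the sum, add
 * new_addr, and complement the result again.
 */
static uint16_t csum_replace_addr(uint16_t check, uint32_t old_addr,
                                  uint32_t new_addr)
{
    uint16_t acc = (uint16_t)~check;                     /* ~HC */

    acc = oc_add(acc, (uint16_t)~(old_addr >> 16));      /* + ~m (high half) */
    acc = oc_add(acc, (uint16_t)~(old_addr & 0xffff));   /* + ~m (low half) */
    acc = oc_add(acc, (uint16_t)(new_addr >> 16));       /* + m' (high half) */
    acc = oc_add(acc, (uint16_t)(new_addr & 0xffff));    /* + m' (low half) */
    return (uint16_t)~acc;                               /* HC' */
}
```

On S, calling csum_replace_addr(check, C, N) would turn the checksum computed
by the client into one that verifies against the NAT-ted header.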


As noted, esp_post_input is called from xfrm4_policy_check. Decrypted UDP
traffic through transport mode/NAT has bad checksums. However, it is passed
through udp_queue_rcv_skb after decryption, and that function calls
xfrm4_policy_check before checking the UDP checksum, so line 298 makes the
kernel ignore the bad checksum.

Decrypted TCP traffic has bad checksums too. But since tcp_v4_rcv checks the TCP
checksum before calling xfrm4_policy_check, the bad checksum means the TCP
packet is dropped as a bad segment.

The end result is that UDP and other traffic (e.g., ICMP) can pass through
transport mode/NAT but not TCP.
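
The ordering difference can be modelled in a few lines of user-space C. This
is only a toy model of the two receive paths as described above; the struct,
enum and function names are stand-ins and not the kernel's.

```c
#include <stdbool.h>

enum summed { SUM_NONE, SUM_UNNECESSARY };   /* stand-ins for CHECKSUM_* */

struct pkt {
    enum summed ip_summed;
    bool csum_ok;          /* would a full checksum verification pass? */
    bool transport_natt;   /* decrypted from transport mode with NAT-T? */
};

/* Models the quoted esp_post_input lines, reached via the policy check. */
static void policy_check_model(struct pkt *p)
{
    if (p->transport_natt)
        p->ip_summed = SUM_UNNECESSARY;
}

/* A checksum marked unnecessary is never verified. */
static bool csum_passes(const struct pkt *p)
{
    return p->ip_summed == SUM_UNNECESSARY || p->csum_ok;
}

/* UDP order: policy check first, checksum second. */
static bool udp_rcv_model(struct pkt p)
{
    policy_check_model(&p);
    return csum_passes(&p);
}

/* TCP order: checksum first, policy check second. */
static bool tcp_rcv_model(struct pkt p)
{
    if (!csum_passes(&p))
        return false;      /* dropped as a bad segment */
    policy_check_model(&p);
    return true;
}
```

For a packet with ip_summed = SUM_NONE, csum_ok = false and
transport_natt = true, udp_rcv_model accepts it and tcp_rcv_model drops it.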

I don't know what the correct fix is. Adding an extra call to
xfrm4_policy_check in tcp_v4_rcv before the checksum check fixes this problem
and doesn't seem to break anything else. On the other hand, moving some of the
code in esp_post_input into esp_input (especially line 298) will work, too.
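
As a rough illustration of the two options, here is a toy user-space model of
the TCP path for a transport mode/NAT-T segment. It is not a patch; every name
in it is hypothetical and it only captures the order of operations.

```c
#include <stdbool.h>

enum fix_option { FIX_NONE, FIX_POLICY_CHECK_FIRST, FIX_MARK_AT_DECRYPT };

struct seg {
    bool csum_unnecessary;  /* stand-in for ip_summed == CHECKSUM_UNNECESSARY */
    bool csum_valid;        /* result a real checksum verification would give */
};

/* Stand-in for line 298: tell later code to skip checksum verification. */
static void mark_unnecessary(struct seg *s)
{
    s->csum_unnecessary = true;
}

/* Stand-in for decryption: after NAT the inner TCP checksum is bad. */
static struct seg decrypt_model(enum fix_option f)
{
    struct seg s = { false, false };

    if (f == FIX_MARK_AT_DECRYPT)
        mark_unnecessary(&s);        /* option 2: set the mark at decrypt time */
    return s;
}

/* Stand-in for the TCP receive path; returns true if the segment is accepted. */
static bool tcp_accepts(enum fix_option f)
{
    struct seg s = decrypt_model(f);

    if (f == FIX_POLICY_CHECK_FIRST)
        mark_unnecessary(&s);        /* option 1: extra policy check up front */
    if (!s.csum_unnecessary && !s.csum_valid)
        return false;                /* dropped as a bad segment */
    mark_unnecessary(&s);            /* existing policy check, too late to help */
    return true;
}
```

In this model tcp_accepts(FIX_NONE) is false, while both
tcp_accepts(FIX_POLICY_CHECK_FIRST) and tcp_accepts(FIX_MARK_AT_DECRYPT) are
true.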
