Hi Jiri

Did you have the chance to take a look at this?

Thanks,
Jaime.

-----Original Message-----
From: Jaime Caamaño Ruiz <jcaam...@suse.de>
Reply-To: jcaam...@suse.com
To: Jiri Benc <jb...@redhat.com>, Yi Y Yang <yi.y.y...@intel.com>
Cc: ovs-discuss@openvswitch.org <ovs-discuss@openvswitch.org>
Subject: Re: [ovs-discuss] Bad checksums observed with nsh
encapsulation
Date: Fri, 29 Jun 2018 15:59:48 +0200

Hello Jiri

I will try to explain in more detail and provide a simple scenario to
reproduce.

Let's look first at the vm2vm scenario with no nsh: just a tcp packet
taken from one vm port and put into the other vm port. The originating
vm requests checksum offload. OVS just clones the buffer into the
destination port, keeping the same values for
ip_summed=CHECKSUM_PARTIAL, csum_offset and csum_start. The destination
vm assumes that it does not need to verify the checksum as it deems the
packet to be local (see [1]).
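
As a rough illustration of the receiver's decision, the check that [1]
appears to point at (skb_csum_unnecessary(), as far as I can tell) can
be modelled in Python. The SkbMeta class and its field names are
hypothetical stand-ins for the relevant struct sk_buff fields; this is
a sketch of the logic under that assumption, not kernel code.

```python
from dataclasses import dataclass

# Values mirror the ip_summed constants in include/linux/skbuff.h.
CHECKSUM_NONE = 0
CHECKSUM_UNNECESSARY = 1
CHECKSUM_COMPLETE = 2
CHECKSUM_PARTIAL = 3


@dataclass
class SkbMeta:
    """Hypothetical stand-in for the checksum metadata of struct sk_buff."""
    ip_summed: int = CHECKSUM_NONE
    csum_valid: bool = False
    # Assumed to model skb_checksum_start_offset(skb).
    csum_start_offset: int = 0


def csum_unnecessary(skb: SkbMeta) -> bool:
    """Return True when the receiver may skip checksum verification."""
    return (
        skb.ip_summed == CHECKSUM_UNNECESSARY
        or skb.csum_valid
        or (skb.ip_summed == CHECKSUM_PARTIAL and skb.csum_start_offset >= 0)
    )
```

A buffer that still carries CHECKSUM_PARTIAL passes this check, while a
buffer with CHECKSUM_NONE does not and has to verify the tcp checksum.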

Now, let's assume the packet gets eth+nsh encapsulated and sent to
an intermediary service function vm on the same host, before being
decapsulated and sent to the final destination vm. In my case the
intermediary vm is just a user space app that emulates a service
function: it opens a raw socket, swaps the outer eth header macs, and
decrements the path index before sending the packet back to OVS. At
this point, the nsh payload, which is the original tcp packet, has a
bad tcp checksum. Depending on what the service function does, this
might already be a problem. In any case, when the packet is back at
OVS, it no longer holds any of the previous csum metadata, as it is a
new buffer generated by the service function user space app that is
just eth+nsh+payload. OVS decapsulates and sends the packet to the
final destination vm, where it is no longer deemed a local packet
because ip_summed=CHECKSUM_NONE, so it is rejected when the tcp
checksum does not verify.
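
For reference, the per-frame work of such a service function can be
sketched in Python as below. This is not the actual app, only a minimal
model of the transformation described above; bounce_frame is a
hypothetical name, and the sketch assumes an untagged outer Ethernet
header (no VLAN) and the RFC 8300 NSH layout, where the service index
(the "path index" above) is the eighth byte of the NSH header.

```python
ETH_HLEN = 14        # dst mac (6) + src mac (6) + ethertype (2)
ETH_P_NSH = 0x894F   # ethertype assigned to NSH
NSH_MIN_LEN = 8      # base header (4) + service path header (4)
NSH_SI_OFFSET = 7    # service index: last byte of the service path header


def bounce_frame(frame: bytes) -> bytes:
    """Swap the outer macs and decrement the NSH service index.

    Everything after the service path header is copied untouched, so an
    inner tcp checksum that was never filled in stays wrong.
    """
    if len(frame) < ETH_HLEN + NSH_MIN_LEN:
        raise ValueError("frame too short for eth+nsh")
    if int.from_bytes(frame[12:14], "big") != ETH_P_NSH:
        raise ValueError("outer ethertype is not NSH")

    si_pos = ETH_HLEN + NSH_SI_OFFSET
    si = frame[si_pos]
    if si == 0:
        raise ValueError("service index already 0")

    out = bytearray(frame)
    out[0:6] = frame[6:12]   # new destination = old source
    out[6:12] = frame[0:6]   # new source = old destination
    out[si_pos] = si - 1
    return bytes(out)
```

The frame that comes back is a plain byte buffer written to a raw
socket, which is why none of the original csum metadata can survive the
round trip.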

As a way to reproduce:

---

ovs-vsctl add-br br0

ip link add blueth0 type veth peer name veth0
ip netns add blue
ip link set veth0 netns blue
ovs-vsctl add-port br0 blueth0
ip netns exec blue ip addr add 10.0.0.1/24 dev veth0
ip link set blueth0 up
ip netns exec blue ip link set veth0 up

ip link add redeth0 type veth peer name veth0
ip netns add red
ip link set veth0 netns red
ovs-vsctl add-port br0 redeth0
ip netns exec red ip addr add 10.0.0.2/24 dev veth0
ip link set redeth0 up
ip netns exec red ip link set veth0 up

ovs-ofctl -Oopenflow13 add-flow br0 \
  "priority=1,tcp,in_port=1,actions=encap(nsh),encap(ethernet),2"
ovs-ofctl -Oopenflow13 add-flow br0 \
  "priority=1,tcp,in_port=2,actions=encap(nsh),encap(ethernet),1"

Capture on one namespace:
ip netns exec red tcpdump -i veth0 -U -w test.pcap

Connect from the other:
ip netns exec blue nc 10.0.0.2 80

----

If you inspect the capture (pic attached), you will see the SYN
connection attempts nsh encapsulated. On the first attempt the checksum
is ok: there is no flow in the kernel datapath yet, and the checksum is
calculated prior to the upcall (see [2]). Successive attempts have a
wrong checksum, since the flow has already been installed in the kernel
datapath and there is no upcall.
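
To check the inner tcp checksum of a captured packet by hand, something
like the following Python sketch can be used. It assumes IPv4 and that
the caller has already stripped the outer eth+nsh headers and extracted
the inner addresses and tcp segment; the function names are
hypothetical.

```python
import struct

IPPROTO_TCP = 6


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def tcp_checksum_ok(src_ip: bytes, dst_ip: bytes, tcp_segment: bytes) -> bool:
    """Verify a tcp segment (header + data) against its IPv4 pseudo header.

    src_ip and dst_ip are the 4-byte inner addresses. Summing the pseudo
    header and the segment, checksum field included, gives 0 after
    complementing when the checksum is valid.
    """
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, IPPROTO_TCP, len(tcp_segment))
    return internet_checksum(pseudo + tcp_segment) == 0
```

With the capture above, the expectation is that the first SYN passes
this check and the retransmitted ones do not.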

BR
Jaime

[1] https://github.com/torvalds/linux/blob/ea5f39f2f994e6fb8cb8d0304aa5f422ae3bbf83/include/linux/skbuff.h#L3589
[2] https://github.com/openvswitch/ovs/blob/d22f8927c3c9034128df3859e98e486ba1f06d60/datapath/datapath.c#L439

-----Original Message-----
From: Jiri Benc <jb...@redhat.com>
To: Yi Y Yang <yi.y.y...@intel.com>
Cc: ovs-discuss@openvswitch.org <ovs-discuss@openvswitch.org>, jcaamano
@suse.de
Subject: Re: [ovs-discuss] Bad checksums observed with nsh
encapsulation
Date: Tue, 26 Jun 2018 17:07:11 +0200

> But when we are pushing nsh headers, the first receiver may not be
> the final receiver and CHECKSUM_PARTIAL may not reach the final
> receiver which will then verify and reject a bad checksum.

I don't understand this. Could you please provide a minimal test case?
I will then reproduce locally and take a look. From the data provided,
it's unclear what's going on - the ofproto trace is too complex and
it's unclear what the individual interfaces are.

> So I think it may be necessary to handle the CHECKSUM_PARTIAL case on
> nsh_push, something like adding
> 
> if (skb->ip_summed == CHECKSUM_PARTIAL) {
>     skb_checksum_help(skb);
> }
> 
> Tried that and got rid of my problem.

That's an interesting datapoint. However, this is not the right fix;
there's no reason for such code. The bug lies elsewhere.

 Jiri
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
