Hello Mike, On Mon, Feb 12, 2024 at 8:50 PM Mike Pattrick <m...@redhat.com> wrote: > > Previously if an OVS configuration nested multiple layers of UDP tunnels > like VXLAN or GENEVE on top of each other through netdev-linux > interfaces, the vnet header would be incorrectly set to the outermost > UDP tunnel layer instead of the intermediary tunnel layer. > > This resulted in the middle UDP tunnel not checksum offloading properly. > > Fixes: 3337e6d91c5b ("userspace: Enable L4 checksum offloading by default.") > Reported-by: David Marchand <david.march...@redhat.com> > Signed-off-by: Mike Pattrick <m...@redhat.com>
I have some trouble relating this patch to the issue I faced :-). Could you detail a test that shows the issue you fix here? After applying (only this patch), I still reproduce an issue with inner checksums. As I reported this issue to you offlist, let me put the details in public here. I wrote a system-traffic.at unit test that stacks 3 vxlan tunnels (separate topic, but for the context, my goal was to stress DPDK dp-packets wrt headroom). If I try this unit test before commit 084c8087292c ("userspace: Support VXLAN and GENEVE TSO."), I have no issue. The topology is as follows: ############################################################## # # at_ns0 . init_net # . # at_vxlan1 (10.1.1.1/24) . br0 (10.1.1.100/24) # (remote 172.31.1.100) . | # . at_vxlan0 # . (remote 172.31.1.1) # . # at_vxlan3 (172.31.1.1/24) . br-underlay0 (172.31.1.100/24) # (remote 172.31.2.100) . | # . at_vxlan2 # . (remote 172.31.2.1) # . # at_vxlan5 (172.31.2.1/24) . br-underlay1 (172.31.2.100/24) # (remote 172.31.3.100) . | # . at_vxlan4 # . (remote 172.31.3.1) # . # p0 (172.31.3.1/24) . br-underlay2 (172.31.3.100/24) # | . | # \-------------------------.-ovs-p0 # ############################################################## (gmail will probably bust this copy/paste, so putting a link to the actual test: https://github.com/david-marchand/ovs/commit/manyvxlan~2#diff-45a77f85f9679bc66ac97300392c0d5d9f5c53264fa8a82d735a553246e71faeR400) With this setup, I try to ping, from at_ns0 netns, the ip address of the br tap iface plugged with the other side of each tunnel: - Most outter level, no encapsulation, all good: 16:24:51.590966 a6:0a:bf:e2:f3:f2 > 82:cf:78:de:ed:46, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 63550, offset 0, flags [DF], proto ICMP (1), length 84) 172.31.3.1 > 172.31.3.100: ICMP echo request, id 26707, seq 1, length 64 16:24:51.591084 82:cf:78:de:ed:46 > a6:0a:bf:e2:f3:f2, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 28720, offset 0, flags [none], proto ICMP (1), length 84) 172.31.3.100 > 172.31.3.1: ICMP echo reply, id 26707, seq 1, length 64 - One tunnel encap all good: 16:24:54.140629 a6:0a:bf:e2:f3:f2 > 82:cf:78:de:ed:46, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 61052, offset 0, flags [none], proto UDP (17), length 134) 172.31.3.1.36831 > 172.31.3.100.vxlan: [udp sum ok] VXLAN, flags [I] (0x08), vni 0 1e:db:ec:e5:28:6d > 9a:39:be:e8:18:4b, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 54399, offset 0, flags [DF], proto ICMP (1), length 84) 172.31.2.1 > 172.31.2.100: ICMP echo request, id 51488, seq 1, length 64 16:24:54.140772 82:cf:78:de:ed:46 > a6:0a:bf:e2:f3:f2, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 134) 172.31.3.100.39912 > 172.31.3.1.vxlan: [no cksum] VXLAN, flags [I] (0x08), vni 0 9a:39:be:e8:18:4b > 1e:db:ec:e5:28:6d, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 29701, offset 0, flags [none], proto ICMP (1), length 84) 172.31.2.100 > 172.31.2.1: ICMP echo reply, id 51488, seq 1, length 64 - Two tunnels encap: 16:24:58.578900 a6:0a:bf:e2:f3:f2 > 82:cf:78:de:ed:46, ethertype IPv4 (0x0800), length 142: (tos 0x0, ttl 64, id 61719, offset 0, flags [none], proto UDP (17), length 128) 172.31.3.1.50673 > 172.31.3.100.vxlan: [udp sum ok] VXLAN, flags [I] (0x08), vni 0 1e:db:ec:e5:28:6d > 9a:39:be:e8:18:4b, ethertype IPv4 (0x0800), length 92: (tos 0x0, ttl 64, id 35175, offset 0, flags [none], proto UDP (17), length 78) 172.31.2.1.44060 > 172.31.2.100.vxlan: [udp sum ok] VXLAN, flags [I] (0x08), vni 1 62:53:3f:82:da:56 > Broadcast, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 172.31.1.100 tell 172.31.1.1, length 28 16:24:58.579021 82:cf:78:de:ed:46 > a6:0a:bf:e2:f3:f2, ethertype IPv4 (0x0800), length 142: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 128) 172.31.3.100.56325 > 172.31.3.1.vxlan: [no cksum] VXLAN, flags [I] (0x08), vni 0 9a:39:be:e8:18:4b > 1e:db:ec:e5:28:6d, ethertype IPv4 (0x0800), length 92: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 78, bad cksum 0 (->ddfb)!) 172.31.2.100.56325 > 172.31.2.1.vxlan: [no cksum] VXLAN, flags [I] (0x08), vni 1 26:8d:e7:0e:41:45 > 62:53:3f:82:da:56, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Reply 172.31.1.100 is-at 26:8d:e7:0e:41:45, length 28 In this last packet capture, the inner vxlan tunnel header has wrong (well no) IP checksum. If I set the tunnel endpoint neighbors in each netns, the same issue can be seen for icmp encapsulated traffic: 17:41:39.303170 32:a4:65:eb:df:49 > a6:0a:bf:e2:f3:f2, ethertype IPv4 (0x0800), length 198: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 184) 172.31.3.100.59550 > 172.31.3.1.vxlan: [no cksum] VXLAN, flags [I] (0x08), vni 0 fe:72:7b:39:24:49 > 7e:9d:b3:a5:a2:f8, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 134, bad cksum 0 (->ddc3)!) 172.31.2.100.59550 > 172.31.2.1.vxlan: [no cksum] VXLAN, flags [I] (0x08), vni 1 56:21:48:f3:f3:42 > 72:64:65:f9:38:0a, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 49609, offset 0, flags [none], proto ICMP (1), length 84) 172.31.1.100 > 172.31.1.1: ICMP echo reply, id 35387, seq 1, length 64 Running in GHA: https://github.com/david-marchand/ovs/actions/runs/7903675971/job/21572151269 Afaics, this issue is not reproduced when using the kernel datapath (check-kernel) but it is seen in GHA for check-system-userspace / check-dpdk / check-afxdp. I can reproduce those locally. There is also a failure with check-offloads in GHA which I can't reproduce locally (for the latter, it may be a kernel related issue (and I think there were some known issues..)). Now, if I apply below diff, my issue seems to go away for check-system-userspace / check-dpdk / check-afxdp. $ git diff -b HEAD^ diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c index 0d6d803fe4..7f3e2c0ae4 100644 --- a/lib/netdev-native-tnl.c +++ b/lib/netdev-native-tnl.c @@ -280,6 +280,7 @@ dp_packet_tnl_ol_process(struct dp_packet *packet, dp_packet_hwol_set_tx_ipv6(packet); } } + } /* Attention please, tunnel inner l2 len is consist of udp header * len and tunnel header len and inner l2 len. */ @@ -299,7 +300,6 @@ dp_packet_tnl_ol_process(struct dp_packet *packet, (char *) dp_packet_eth(packet) + VXLAN_HLEN); } - } } void But IPv6 tso tests have some failures, and some nsh test fails too regardless of which datapath is used, so this is not a right fix. Actual commit: https://github.com/david-marchand/ovs/commit/manyvxlan https://github.com/david-marchand/ovs/actions/runs/7904287294 -- David Marchand _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev