Hello Brendan,

This resembles an issue I have seen with CX5 when not using OVS flow offload. The resolution in my case was to apply a fix [0] to the mlx5 kernel driver.
0: https://www.spinics.net/lists/netdev/msg711911.html

--
Frode Nordahl

On Wed, May 5, 2021 at 5:00 PM Brendan Doyle <[email protected]> wrote:
>
> Folks,
>
> I had posted a question to this alias a while back with the subject:
> "TCP tunnel traffic stops working when move from RHEL 7.7 to 7.9"
>
> I finally got to the bottom of this and discovered that the issue is
> with UDP checksum offload when the underlay is in a VLAN, which seems
> to break OVN. Is this a known issue?
>
> To cut to the chase, I got things working with the following command
> on each chassis:
>
> ethtool --offload genev_sys_6081 rx on tx off
>
> When I looked at the tcpdumps on the underlay NIC, I noticed that in
> the old working OS (OEL 7.7, RHEL 7.7 based) the outer UDP packet
> always had "[udp sum ok]", meaning that the OS was doing the checksum,
> whereas in the new broken OS (OEL 7.9, RHEL 7.9 based) the first few
> packets had "[bad udp cksum". These packets got through, but then the
> next few had "[udp sum ok]" and these did not get through to the other
> chassis across the tunnel. Oddly, when I removed the VLAN, with no
> ethtool changes, things worked. It only broke when there was a VLAN in
> the mix. Then, after much trial and error with ethtool settings on the
> NIC, the VIF, ovs-system and finally genev_sys_6081, I got it to work.
>
> Seems like a bit of a performance limitation that OVN does not work
> with NIC checksum offload?
>
> Brendan
>
>
> On 29/04/2021 10:54, Brendan Doyle wrote:
>
> Hi Folks,
>
> In a very basic OVN config, where I have two VMs on different chassis:
>
> switch 7b89d593-05f3-41a7-a246-8dade975df48 (ls_vcn1)
>     port a6a358c5-5db4-49c7-b68a-3a7429161ab4
>         addresses: ["52:54:00:71:ad:a0 192.16.1.5"]
>     port b6c5ef1a-acd9-4053-9986-88e1a6a12b81
>         addresses: ["52:54:00:40:8f:dc 192.16.1.6"]
>
> When I upgrade the chassis from OEL 7.7 (RHEL 7.7 based) to OEL 7.9
> (RHEL 7.9 based), TCP traffic stops working; ping and UDP are fine.
> When I look at tcpdump of the traffic on both chassis, I see the
> initial handshake encapsulated traffic being sent and received on both
> nodes. The initial TCP handshake seems to get through on the sender and
> it sends the first data packet, but the receive side does not get the
> data packets and keeps sending the initial handshake ack (see traces
> below).
>
> I'm thinking it is something to do with TCP checksum or some other NIC
> offload? The NICs are CX5s. Just wondering, has anyone come across this?
>
> Thanks
>
> Brendan
>
>
> Sender
> ---------
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 132:
> (tos 0x0, ttl 64, id 29694, offset 0, flags [DF], proto UDP (17), length 118)
> 253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xfc99 -> 0xa576!]
> Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 (0x0800),
> length 74: (tos 0x0, ttl 64, id 61068, offset 0, flags [DF], proto TCP (6),
> length 60)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [S], cksum 0x0a2b (correct), seq
> 3225335796, win 27200, options [mss 1360,sackOK,TS val 1242625918 ecr
> 0,nop,wscale 7], length 0
>
> 98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length 132:
> (tos 0x0, ttl 64, id 5167, offset 0, flags [DF], proto UDP (17), length 118)
> 253.255.0.18.28454 > 253.255.0.21.6081: [udp sum ok] Geneve, Flags [C],
> vni 0x1, proto TEB (0x6558), options [class Open Virtual Networking (OVN)
> (0x102) type 0x80(C) len 8 data 00020001]
> 52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4 (0x0800),
> length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6),
> length 60)
> 192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0xb82f (correct), seq
> 3217262113, ack 3225335797, win 26960, options [mss 1360,sackOK,TS val
> 3343009202 ecr 1242625918,nop,wscale 7], length 0
>
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 124:
> (tos 0x0, ttl 64, id 29695, offset 0, flags [DF], proto UDP (17), length 110)
> 253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa57e -> 0x723d!]
> Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 (0x0800),
> length 66: (tos 0x0, ttl 64, id 61069, offset 0, flags [DF], proto TCP (6),
> length 52)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [.], cksum 0x8252 (incorrect ->
> 0x4f11), seq 1, ack 1, win 213, options [nop,nop,TS val 1242625920 ecr
> 3343009202], length 0
>
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 145:
> (tos 0x0, ttl 64, id 29696, offset 0, flags [DF], proto UDP (17), length 131)
> 253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa569 -> 0xae4d!]
> Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 (0x0800),
> length 87: (tos 0x0, ttl 64, id 61070, offset 0, flags [DF], proto TCP (6),
> length 73)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [P.], cksum 0x8267 (incorrect ->
> 0x8b4b), seq 1:22, ack 1, win 213, options [nop,nop,TS val 1242625920 ecr
> 3343009202], length 21
>
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 145:
> (tos 0x0, ttl 64, id 29775, offset 0, flags [DF], proto UDP (17), length 131)
> 253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa569 -> 0xad7f!]
> Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 (0x0800),
> length 87: (tos 0x0, ttl 64, id 61071, offset 0, flags [DF], proto TCP (6),
> length 73)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [P.], cksum 0x8267 (incorrect ->
> 0x8a7d), seq 1:22, ack 1, win 213, options [nop,nop,TS val 1242626126 ecr
> 3343009202], length 21
>
> This just repeats; I don't see anything else from the receiver.
>
> Receiver
> ------------
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 132:
> (tos 0x0, ttl 64, id 29694, offset 0, flags [DF], proto UDP (17), length 118)
> 253.255.0.21.62384 > 253.255.0.18.6081: [udp sum ok] Geneve, Flags [C],
> vni 0x1, proto TEB (0x6558), options [class Open Virtual Networking (OVN)
> (0x102) type 0x80(C) len 8 data 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 (0x0800),
> length 74: (tos 0x0, ttl 64, id 61068, offset 0, flags [DF], proto TCP (6),
> length 60)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [S], cksum 0x0a2b (correct), seq
> 3225335796, win 27200, options [mss 1360,sackOK,TS val 1242625918 ecr
> 0,nop,wscale 7], length 0
>
> 98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length 132:
> (tos 0x0, ttl 64, id 5167, offset 0, flags [DF], proto UDP (17), length 118)
> 253.255.0.18.28454 > 253.255.0.21.6081: [bad udp cksum 0xfc99 -> 0x2a01!]
> Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00020001]
> 52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4 (0x0800),
> length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6),
> length 60)
> 192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0xb82f (correct), seq
> 3217262113, ack 3225335797, win 26960, options [mss 1360,sackOK,TS val
> 3343009202 ecr 1242625918,nop,wscale 7], length 0
>
> 98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length 132:
> (tos 0x0, ttl 64, id 6137, offset 0, flags [DF], proto UDP (17), length 118)
> 253.255.0.18.28454 > 253.255.0.21.6081: [bad udp cksum 0x2a01 -> 0x5bc0!]
> Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00020001]
> 52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4 (0x0800),
> length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6),
> length 60)
> 192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0x825a (incorrect ->
> 0xb419), seq 3217262113, ack 3225335797, win 26960, options [mss
> 1360,sackOK,TS val 3343010248 ecr 1242625918,nop,wscale 7], length 0
>
> This repeats; I don't see anything else from the sender.

_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
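
For anyone comparing captures the way the thread does, here is a minimal sketch in Python that tallies the outer UDP checksum verdicts ("[udp sum ok]" versus "[bad udp cksum") per direction from verbose tcpdump text. The function name `tally_outer_checksums` and the regular expression are illustrative assumptions, not part of any tool mentioned above; the script only assumes the Geneve port 6081 and the line layout seen in the traces. Keep in mind that a "bad" verdict captured on the sending host is commonly seen when TX checksum offload is enabled, because the capture can happen before the NIC fills in the checksum, so the tally is a starting point for comparison rather than proof of a fault.

```python
import re
import sys
from collections import Counter

# Matches the outer UDP header as printed by verbose tcpdump for traffic to
# the Geneve port (6081), e.g.
#   253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xfc99 -> 0xa576!]
# The pattern is an assumption based on the traces quoted in this thread.
OUTER_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.\d+ > (?P<dst>\d+\.\d+\.\d+\.\d+)\.6081: "
    r"\[(?P<status>udp sum ok|bad udp cksum)"
)


def tally_outer_checksums(text):
    """Count outer UDP checksum verdicts keyed by (src IP, dst IP, verdict)."""
    counts = Counter()
    for match in OUTER_RE.finditer(text):
        verdict = "ok" if match.group("status") == "udp sum ok" else "bad"
        counts[(match.group("src"), match.group("dst"), verdict)] += 1
    return counts


if __name__ == "__main__":
    # Reads tcpdump text from stdin and prints one summary line per direction.
    for (src, dst, verdict), n in sorted(tally_outer_checksums(sys.stdin.read()).items()):
        print(f"{src} -> {dst}: {verdict} x{n}")
```

Running it once on the sender capture and once on the receiver capture makes it easy to see which direction flips between "ok" and "bad" after a change such as the ethtool setting above.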
