I also checked the OVN trace, and it looks correct.

The packets should reach the destination if the driver were behaving
correctly:

ubuntu@juju-7639a4-3-lxd-6:~$ network="provider1-net-external" inport="4679fbb8-3d4d-4dcd-b986-4ec4c0fb9000" ip4_src="10.99.0.88" ip4_dst="10.99.0.254" eth_src="fa:16:3e:ab:87:ad"
ubuntu@juju-7639a4-3-lxd-6:~$ sudo ovn-trace "$network" "inport == \"$inport\" && ip4.src == "$ip4_src" && ip4.dst == "$ip4_dst" && eth.src == "$eth_src" && ip.ttl == 64 && icmp4.type == 8"
# icmp,reg14=0x3,vlan_tci=0x0000,dl_src=fa:16:3e:ab:87:ad,dl_dst=00:00:00:00:00:00,nw_src=10.99.0.88,nw_dst=10.99.0.254,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0

ingress(dp="provider1-net-external", inport="4679fb")
---------------------------------------------------
 0. ls_in_port_sec_l2 (northd.c:5516): inport == "4679fb", priority 50, uuid 03e0a90b
        next;
 6. ls_in_pre_lb (northd.c:5663): ip && inport == "4679fb", priority 110, uuid f5e890b2
        next;
24. ls_in_l2_lkup (northd.c:7577): 1, priority 0, uuid 3aba4e5b
        outport = get_fdb(eth.dst);
        next;
25. ls_in_l2_unknown (northd.c:7581): outport == "none", priority 50, uuid 0bf357af
        outport = "_MC_unknown";
        output;

multicast(dp="provider1-net-external", mcgroup="_MC_unknown")
-----------------------------------------------------------

        egress(dp="provider1-net-external", inport="4679fb", outport="provnet-8d6ece")
        ----------------------------------------------------------------------------
        0. ls_out_pre_lb (northd.c:5666): ip && outport == "provnet-8d6ece", priority 110, uuid ddd2f5b5
                next;
        9. ls_out_port_sec_l2 (northd.c:5613): outport == "provnet-8d6ece", priority 50, uuid 966e0b90
                output;
                /* output to "provnet-8d6ece", type "localnet" */

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2008781

Title:
  OVN provider network type vlan packets cannot go outside the bond on
  Intel E810-XXV card

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Ubuntu 20.04.5 LTS
  ubuntu@compute-09:~$ uname -a
  Linux compute-09 5.4.0-139-generic #156-Ubuntu SMP Fri Jan 20 17:27:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

  ubuntu@compute-09:~$ sudo update-pciids
  ubuntu@compute-09:~$ lspci |grep Intel|grep -i Ether
  31:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
  31:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
  ca:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
  ca:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)

  The test instance with provider network floating IP 10.99.0.213 cannot reach
  the provider network gateway (10.99.0.254). The instance was created with:
  openstack server create --key-name ubuntu-keypair --image auto-sync/ubuntu-jammy-22.04-amd64-server-20230210-disk1.img --flavor m1.small --net provider1-private-net ubuntu-provider1

  ubuntu@compute-05:~$ sudo -E ip netns exec ovnmeta-fcd1b354-6f41-42dc-ae73-87df28856ee5 ssh ubuntu@192.168.100.123
  ubuntu@ubuntu-provider1:~$ ping 10.99.0.254
  PING 10.99.0.254 (10.99.0.254) 56(84) bytes of data.
  ^C
  --- 10.99.0.254 ping statistics ---
  419 packets transmitted, 0 received, 100% packet loss, time 428035ms

  I found the compute node through which the outbound traffic leaves,
  and there I see ARP requests with no response:
  compute-09:~$ sudo tcpdump -vteni bond1 '(vlan 300)'
  tcpdump: listening on bond1, link-type EN10MB (Ethernet), capture size 262144 bytes
  fa:16:3e:ab:87:ad > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 300, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.99.0.254 tell 10.99.0.88, length 28
  fa:16:3e:ab:87:ad > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 300, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.99.0.254 tell 10.99.0.88, length 28
  fa:16:3e:ab:87:ad > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 300, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.99.0.254 tell 10.99.0.88, length 28
  To reproduce, you can ping .254 indefinitely.

  The TX error count grows on bond1 and on the card ens2f0 (which
  happens to carry the traffic):
  ubuntu@compute-09:~$ sudo ethtool -S ens2f0|grep error
       tx_errors: 12
       tx_errors.nic: 0
       rx_length_errors.nic: 0
       rx_crc_errors.nic: 0
  ubuntu@compute-09:~$ ifconfig ens2f0
  ens2f0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 9000
          ether b4:83:51:00:83:d1  txqueuelen 1000  (Ethernet)
          RX packets 53784  bytes 22064970 (22.0 MB)
          RX errors 0  dropped 0  overruns 0  frame 0
          TX packets 52163  bytes 18393142 (18.3 MB)
          TX errors 12  dropped 0 overruns 0  carrier 0  collisions 0
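
One way to confirm that the counter keeps climbing while the ping runs is to sample it periodically. This is only a minimal sketch, not something taken from the report: the helper names `tx_errors_from_stats` and `watch_tx_errors` are made up for illustration, and it assumes the `tx_errors: N` line format shown in the `ethtool -S` output above.

```shell
# Print the aggregate tx_errors counter from `ethtool -S` output read on stdin.
# Assumption: the counter appears as "tx_errors: N", as in the output above.
tx_errors_from_stats() {
    awk -F': *' '$1 ~ /^[ \t]*tx_errors$/ { print $2; exit }'
}

# watch_tx_errors DEV [INTERVAL_SECONDS]
# Print the counter and its change since the previous sample until interrupted.
watch_tx_errors() {
    wt_dev=$1
    wt_interval=${2:-5}
    wt_prev=$(sudo ethtool -S "$wt_dev" | tx_errors_from_stats)
    while sleep "$wt_interval"; do
        wt_cur=$(sudo ethtool -S "$wt_dev" | tx_errors_from_stats)
        echo "$(date +%T) $wt_dev tx_errors=${wt_cur:-0} delta=$(( ${wt_cur:-0} - ${wt_prev:-0} ))"
        wt_prev=$wt_cur
    done
}
```

Running `watch_tx_errors ens2f0 5` on compute-09 while the instance pings 10.99.0.254 would show whether the delta is non-zero in each interval.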

  If I create a VLAN interface directly on bond1, I can ping the gateway with
  no problem, which gives us
  WORKAROUND 1: set the network type to flat and push the traffic through VLAN
  interfaces on the computes, used as the physnet device
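
The direct test on bond1 can be sketched as below. This is an illustration under assumptions, not the exact commands used in the report: VLAN ID 300 is taken from the tcpdump capture, while the test address 10.99.0.250/24, the interface name bond1.300, and the helper names `run` and `vlan_ping_test` are hypothetical. With DRY_RUN=1 the commands are printed instead of executed.

```shell
# Run a command with sudo, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        sudo "$@"
    fi
}

# vlan_ping_test PARENT VLAN_ID TEST_ADDR/PREFIX GATEWAY
# Create a temporary VLAN interface on PARENT, ping GATEWAY through it,
# then remove the interface again. Returns the exit status of ping.
vlan_ping_test() {
    vt_parent=$1; vt_vid=$2; vt_addr=$3; vt_gw=$4
    vt_dev="$vt_parent.$vt_vid"
    run ip link add link "$vt_parent" name "$vt_dev" type vlan id "$vt_vid"
    run ip addr add "$vt_addr" dev "$vt_dev"
    run ip link set "$vt_dev" up
    run ping -c 3 -I "$vt_dev" "$vt_gw"
    vt_rc=$?
    run ip link del "$vt_dev"
    return "$vt_rc"
}
```

For example, `DRY_RUN=1 vlan_ping_test bond1 300 10.99.0.250/24 10.99.0.254` prints the five commands without touching the host.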

  Another thing I tried was to install the HWE kernel

  ubuntu@compute-09:~$ uname -a
  Linux compute-09 5.15.0-60-generic #66~20.04.1-Ubuntu SMP Wed Jan 25 09:41:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

  Fortunately, the traffic was still going out through compute-09 after the
  reboot, and the new kernel fixed the issue,
  so we have WORKAROUND 2: install the HWE kernel
  ubuntu@ubuntu-provider2:~$ ping 10.99.0.254
  PING 10.99.0.254 (10.99.0.254) 56(84) bytes of data.
  64 bytes from 10.99.0.254: icmp_seq=1 ttl=63 time=2.15 ms
  64 bytes from 10.99.0.254: icmp_seq=2 ttl=63 time=0.896 ms
  64 bytes from 10.99.0.254: icmp_seq=3 ttl=63 time=1.12 ms
  ^C
  ubuntu@infra-1:~$ ping 10.99.0.213
  PING 10.99.0.213 (10.99.0.213) 56(84) bytes of data.
  64 bytes from 10.99.0.213: icmp_seq=1 ttl=62 time=5.12 ms
  64 bytes from 10.99.0.213: icmp_seq=2 ttl=62 time=2.17 ms
  64 bytes from 10.99.0.213: icmp_seq=3 ttl=62 time=0.948 ms
  64 bytes from 10.99.0.213: icmp_seq=4 ttl=62 time=1.00 ms
  64 bytes from 10.99.0.213: icmp_seq=5 ttl=62 time=0.891 ms
  64 bytes from 10.99.0.213: icmp_seq=6 ttl=62 time=1.05 ms

  Now I can ping both ways

  However, I am afraid that we may hit the same issue seen on Jammy with these
  cards at boot, as it happens randomly with the same kernel version, 5.15.0-60.
  Here's the bug I am referring to:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2004262
