apport information

** Attachment added: "ProcCpuinfoMinimal.txt"
   https://bugs.launchpad.net/bugs/2008781/+attachment/5650462/+files/ProcCpuinfoMinimal.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2008781

Title:
  OVN provider network type vlan packets cannot go outside the bond on
  Intel E810-XXV card

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Ubuntu 20.04.5 LTS
  ubuntu@compute-09:~$ uname -a
  Linux compute-09 5.4.0-139-generic #156-Ubuntu SMP Fri Jan 20 17:27:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

  ubuntu@compute-09:~$ sudo update-pciids
  ubuntu@compute-09:~$ lspci |grep Intel|grep -i Ether
  31:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
  31:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
  ca:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
  ca:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)

  The test instance, with provider network floating IP 10.99.0.213, cannot reach the provider network gateway. It was created with:
  openstack server create --key-name ubuntu-keypair \
    --image auto-sync/ubuntu-jammy-22.04-amd64-server-20230210-disk1.img \
    --flavor m1.small --net provider1-private-net ubuntu-provider1

  ubuntu@compute-05:~$ sudo -E ip netns exec ovnmeta-fcd1b354-6f41-42dc-ae73-87df28856ee5 ssh ubuntu@192.168.100.123
  ubuntu@ubuntu-provider1:~$ ping 10.99.0.254
  PING 10.99.0.254 (10.99.0.254) 56(84) bytes of data.
  ^C
  --- 10.99.0.254 ping statistics ---
  419 packets transmitted, 0 received, 100% packet loss, time 428035ms

  I found the compute node through which the outbound traffic leaves,
  and there I see ARP requests with no reply:
  compute-09:~$ sudo tcpdump -vteni bond1 '(vlan 300)'
  tcpdump: listening on bond1, link-type EN10MB (Ethernet), capture size 262144 bytes
  fa:16:3e:ab:87:ad > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 300, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.99.0.254 tell 10.99.0.88, length 28
  fa:16:3e:ab:87:ad > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 300, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.99.0.254 tell 10.99.0.88, length 28
  fa:16:3e:ab:87:ad > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 300, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.99.0.254 tell 10.99.0.88, length 28
  To reproduce, you can ping 10.99.0.254 indefinitely.

  The TX error count grows on bond1 and on ens2f0 (the bond member that happens to carry the traffic):
  ubuntu@compute-09:~$ sudo ethtool -S ens2f0|grep error
       tx_errors: 12
       tx_errors.nic: 0
       rx_length_errors.nic: 0
       rx_crc_errors.nic: 0
  ubuntu@compute-09:~$ ifconfig ens2f0
  ens2f0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 9000
          ether b4:83:51:00:83:d1  txqueuelen 1000  (Ethernet)
          RX packets 53784  bytes 22064970 (22.0 MB)
          RX errors 0  dropped 0  overruns 0  frame 0
          TX packets 52163  bytes 18393142 (18.3 MB)
          TX errors 12  dropped 0 overruns 0  carrier 0  collisions 0
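
A minimal sketch for confirming that the counter keeps growing while the ping runs. It assumes the `ethtool -S` output format shown above; the helper name `tx_errors` and the 10-second interval are illustrative only and are not part of the original report.

```shell
#!/bin/sh
# Extract the aggregate tx_errors counter from `ethtool -S` output on stdin.
# Only the exact "tx_errors:" key is matched, not "tx_errors.nic:".
tx_errors() {
    awk '$1 == "tx_errors:" { print $2 }'
}

# Example use on the affected host (needs root; ens2f0 is the interface
# named in the report, the 10 s interval is an arbitrary choice):
#   before=$(sudo ethtool -S ens2f0 | tx_errors)
#   sleep 10
#   after=$(sudo ethtool -S ens2f0 | tx_errors)
#   echo "tx_errors grew by $((after - before))"
```

If the difference stays above zero only while the instance is sending ARP requests, that would tie the TX errors to the tagged frames rather than to other traffic on the bond.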

  If I create a VLAN interface directly on bond1, I can ping the gateway with no problem,
  which gives us
  WORKAROUND 1: set the network type to flat and push the traffic over VLAN interfaces on the computes, used as the physnet device
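
A sketch of how such a VLAN interface on bond1 could be created with iproute2. The report does not show the exact commands used, so this is an assumption; the function name `vlan_cmds` and the address in the usage line are placeholders. The function only prints the commands, so nothing changes on the host until the output is reviewed and run as root.

```shell
#!/bin/sh
# Print the iproute2 commands that create a tagged sub-interface on a
# parent device. Nothing is executed here.
# Arguments: parent device, VLAN ID, address in CIDR form.
vlan_cmds() {
    parent=$1
    vid=$2
    addr=$3
    dev="$parent.$vid"
    echo "ip link add link $parent name $dev type vlan id $vid"
    echo "ip addr add $addr dev $dev"
    echo "ip link set dev $dev up"
}

# Usage (192.0.2.10/24 is a placeholder; substitute a free address and
# the real prefix length of the provider subnet):
#   vlan_cmds bond1 300 192.0.2.10/24
```

With a suitable address in place, `ping 10.99.0.254` from the host would exercise the same VLAN 300 path without going through OVN.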

  Another thing I tried was to install the HWE kernel

  ubuntu@compute-09:~$ uname -a
  Linux compute-09 5.15.0-60-generic #66~20.04.1-Ubuntu SMP Wed Jan 25 09:41:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

  Fortunately, the traffic was still going out from compute-09 after the reboot,
  and the HWE kernel fixed the issue,
  so we have WORKAROUND 2: install the HWE kernel
  ubuntu@ubuntu-provider2:~$ ping 10.99.0.254
  PING 10.99.0.254 (10.99.0.254) 56(84) bytes of data.
  64 bytes from 10.99.0.254: icmp_seq=1 ttl=63 time=2.15 ms
  64 bytes from 10.99.0.254: icmp_seq=2 ttl=63 time=0.896 ms
  64 bytes from 10.99.0.254: icmp_seq=3 ttl=63 time=1.12 ms
  ^C
  ubuntu@infra-1:~$ ping 10.99.0.213
  PING 10.99.0.213 (10.99.0.213) 56(84) bytes of data.
  64 bytes from 10.99.0.213: icmp_seq=1 ttl=62 time=5.12 ms
  64 bytes from 10.99.0.213: icmp_seq=2 ttl=62 time=2.17 ms
  64 bytes from 10.99.0.213: icmp_seq=3 ttl=62 time=0.948 ms
  64 bytes from 10.99.0.213: icmp_seq=4 ttl=62 time=1.00 ms
  64 bytes from 10.99.0.213: icmp_seq=5 ttl=62 time=0.891 ms
  64 bytes from 10.99.0.213: icmp_seq=6 ttl=62 time=1.05 ms

  Now I can ping both ways

  However, I am afraid that we may hit the same issue as on Jammy with these cards at boot, which happens randomly on a kernel with the same version number, 5.15.0-60.
  Here is the bug I am referring to:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2004262
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Feb 27 13:33 seq
   crw-rw---- 1 root audio 116, 33 Feb 27 13:33 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu27.25
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  CasperMD5CheckResult: skip
  DistroRelease: Ubuntu 20.04
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb:
   Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
   Bus 001 Device 004: ID 1604:10c0 Tascam 
   Bus 001 Device 003: ID 1604:10c0 Tascam 
   Bus 001 Device 002: ID 1604:10c0 Tascam 
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  Lsusb-t:
   /:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/10p, 5000M
   /:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/16p, 480M
       |__ Port 14: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
           |__ Port 1: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
           |__ Port 4: Dev 4, If 0, Class=Hub, Driver=hub/4p, 480M
  MachineType: Dell Inc. PowerEdge R650
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:
   
  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-139-generic root=UUID=70799655-ec36-47e0-a10b-d647a84ac9be ro
  ProcVersionSignature: Ubuntu 5.4.0-139.156-generic 5.4.224
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-139-generic N/A
   linux-backports-modules-5.4.0-139-generic  N/A
   linux-firmware                             1.187.36
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags:  focal uec-images
  Uname: Linux 5.4.0-139-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 09/14/2022
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.8.2
  dmi.board.name: 0PJ7YJ
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A01
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr1.8.2:bd09/14/2022:svnDellInc.:pnPowerEdgeR650:pvr:rvnDellInc.:rn0PJ7YJ:rvrA01:cvnDellInc.:ct23:cvr:
  dmi.product.family: PowerEdge
  dmi.product.name: PowerEdge R650
  dmi.product.sku: SKU=0912;ModelName=PowerEdge R650
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2008781/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
