Hi Frode,

Sorry for reaching out directly, but I figured it might be easier for
you to report this to the Ubuntu kernel development team (if not, I can
also try to open a Launchpad bug myself, but my knowledge of that area
is limited).

Our OVN CI (in GitHub Actions) has been broken since Friday, e.g.:
https://github.com/ovn-org/ovn/actions/runs/18622640352/job/53147265121

These tests fail:
 252: system-ovn-kmod.at:1006 Load Balancer LS hairpin IPv6 UDP - larger than MTU -- parallelization=yes -- ovn_monitor_all=yes
      lb
 253: system-ovn-kmod.at:1006 Load Balancer LS hairpin IPv6 UDP - larger than MTU -- parallelization=yes -- ovn_monitor_all=no
      lb

They fail with:

 (cat datafile; sleep 3) | nc -6 -u 8800::0088 4040 -p 20000 -o udp_frag_test_c1.recvd
NS_EXEC_HEREDOC
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Listening on 4200::1:2021
stderr:
Ncat: Message too long.
stdout:
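
For anyone who wants to poke at the failure outside the test suite, here
is a minimal sketch of the same kind of send in Python. It is not part of
the OVN tests; the helper name send_udp6 and the 4000-byte payload in the
usage note are assumptions made for illustration. It sends one UDP
datagram over IPv6 and reports the errno name if the kernel rejects it
("Message too long" corresponds to EMSGSIZE).

```python
import errno
import socket


def send_udp6(dest, port, payload, src_port=0):
    """Send one UDP datagram over IPv6.

    Returns "ok" on success, otherwise the symbolic errno name
    (for example "EMSGSIZE", which ncat reports as "Message too long").
    """
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
            # Bind to a fixed source port only if one was requested
            # (0 lets the kernel pick one).
            sock.bind(("::", src_port))
            sock.sendto(payload, (dest, port))
    except OSError as exc:
        # Covers socket creation, bind, and send failures alike.
        return errno.errorcode.get(exc.errno, "errno %s" % exc.errno)
    return "ok"
```

Inside the client namespace of the test, something like
send_udp6("8800::0088", 4040, b"x" * 4000, src_port=20000) would mirror
the nc invocation above with a payload larger than a typical 1500-byte
MTU.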

As there were no OVN (or OVS) user space changes that could've caused
this (the last good run was on Thursday), we had a look at other
components that might have changed.

It seems the GitHub ubuntu:24.04 runner image has changed since then.
The new version is:

  Image: ubuntu-24.04
  Version: 20251014.76.1
  Included Software: https://github.com/actions/runner-images/blob/ubuntu24/20251014.76/images/ubuntu/Ubuntu2404-Readme.md
  Image Release: https://github.com/actions/runner-images/releases/tag/ubuntu24%2F20251014.76

which uses kernel version 6.14.0-1012-azure.

Our last known good CI runs were using kernel
version 6.11.0-1018-azure.
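
A CI job could flag which of these two kernels it is running on. The
sketch below is only an illustration: the function names are made up,
and the only data points it knows about are the two releases mentioned
above, so every other release is reported as "unknown" rather than
guessed.

```python
import re

# The only two data points available so far.
KNOWN_GOOD = "6.11.0-1018-azure"
KNOWN_BAD = "6.14.0-1012-azure"


def parse_release(release):
    """Split 'X.Y.Z-ABI-flavour' into ((X, Y, Z), ABI, flavour)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(\d+)-(\S+)", release)
    if match is None:
        raise ValueError("unexpected release string: %r" % release)
    major, minor, patch, abi, flavour = match.groups()
    return (int(major), int(minor), int(patch)), int(abi), flavour


def classify(release):
    """Return 'known good', 'known bad', or 'unknown' for a release string."""
    if release == KNOWN_BAD:
        return "known bad"
    if release == KNOWN_GOOD:
        return "known good"
    return "unknown"
```

On a runner, classify(platform.release()) (with platform imported from
the standard library) would give the verdict for the running kernel.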

I had a look at the linux-image-unsigned-6.14.0-1012-azure Ubuntu
kernel sources and it seems that there might be a patch missing
there.  I think we might be hitting the same issue as in:

https://lore.kernel.org/stable/[email protected]/

Checking the unpacked Ubuntu kernel sources it seems the 6.11 kernel
didn't have the buggy patch:
a18dfa9925b9ef6107ea3aa5814ca3c704d34a8a "ipv6: save dontfrag in cork"

While kernel 6.14.0-1012-azure includes the code from the buggy patch
but only has the first of the followup fixes:
- 54580ccdd8a9c6821fd6f72171d435480867e4c3 "ipv6: remove leftover ip6 cookie initializer"
- 096208592b09c2f5fc0c1a174694efa41c04209d "ipv6: replace ipcm6_init calls with ipcm6_init_sk" <<< the code doesn't have this commit.
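
To double check this kind of finding against a kernel git tree rather
than unpacked sources, one option is to compare commit subjects. The
sketch below assumes the subjects are available as a list of strings
(for instance from `git log --format=%s` in the tree being checked) and
that the distro tree keeps the upstream subject lines unchanged, which
may not hold for every backport. The helper name missing_fixes is made
up for illustration.

```python
# Upstream follow-up fixes for "ipv6: save dontfrag in cork",
# keyed by upstream commit hash.
FOLLOWUP_FIXES = {
    "54580ccdd8a9c6821fd6f72171d435480867e4c3":
        "ipv6: remove leftover ip6 cookie initializer",
    "096208592b09c2f5fc0c1a174694efa41c04209d":
        "ipv6: replace ipcm6_init calls with ipcm6_init_sk",
}


def missing_fixes(subjects, expected=FOLLOWUP_FIXES):
    """Return upstream hashes whose subject is absent from `subjects`.

    Matching is by exact subject line (after stripping whitespace),
    because cherry-picked commits get new hashes in the distro tree.
    """
    present = {subject.strip() for subject in subjects}
    return [sha for sha, title in expected.items() if title not in present]
```

An empty result means both subjects were found; it does not prove the
backports are complete or correct.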

Would you happen to have some time to double check my findings and maybe
report this to the Ubuntu kernel team?

Also, it seems ovn-kubernetes CI is affected by this too:
https://github.com/ovn-kubernetes/ovn-kubernetes/actions/runs/18638480933/job/53134392523#step:16:4399

[FAIL] [sig-network] Networking Granular Checks: Services [It] should be able to handle large requests: udp [sig-network]

Thank you!

Best regards,
Dumitru


_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev