On 1/4/24 15:03, Ilya Maximets wrote:
> On 1/4/24 11:57, Simon Horman wrote:
>> On Thu, Jan 04, 2024 at 04:27:49PM +1300, Brad Cowie wrote:
>>> Linux kernel commit ebddb1404900 ("net: move the nat function to
>>> nf_nat_ovs for ovs and tc") introduced a regression into the kernel
>>> datapath which prevented the openvswitch match key from being updated
>>> when nat was undone for packets in the related conntrack state. This
>>> issue caused these packets (usually ICMP/ICMPv6 error packets) to
>>> match the wrong openflow rule.
>>>
>>> This issue was fixed in linux kernel commit e6345d2824a3 ("netfilter:
>>> nf_nat: fix action not being set for all ct states").
>>>
>>> This test will reproduce the issue and fail for kernel versions
>>> v6.2 to v6.6, and will pass on earlier kernel versions where the issue
>>> wasn't present, or on later kernel versions that have the fix applied.
>>>
>>> Link: https://lore.kernel.org/netdev/20231221224311.130319-1-b...@faucet.nz/
>>> Suggested-by: Aaron Conole <acon...@redhat.com>
>>> Signed-off-by: Brad Cowie <b...@faucet.nz>
>>
>> Hi Brad,
>>
>> thanks for following-up on this.
>>
>> One question from my side is, given that this is currently broken in many
>> kernels in use today, how we should integrate this. For one thing,
>> applying this patch causes the CI to fail.
>>
>> https://github.com/ovsrobot/ovs/actions/runs/7405341045
>>
>> It might be nice if we could detect known to be broken kernels.
>> But I'm not sure, there is an easy way to do that, other than
>> running the test itself.
>>
>> Do you have any thoughts on this?
>
> One option could be to exclude the kernels below 6.7 with:
>
>   OVS_CHECK_KERNEL(6, 7)
>
> Unfortunately the original issue seems to be backported to some
> distribution kernels, but all the kernels 6.7+ should be fine.
>
> However, I'm not convinced the test failed in CI because of this.
> They failed due to difference on packet and by counters.
> In case of offload tests they also have matching packet counters,
> but different byte counters. I know there were cases where TC
> counts bytes differently. So, the counters seem to be not a
> very reliable source of information. Is there any other way
> we can detect the issue without comparing exact values of the
> packet/byte counters?
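
As a rough illustration of comparing flows without byte counts, a test could pipe its flow dump through a small filter first. This is only a sketch: the helper name strip_byte_counters and the sample flow line are hypothetical, and it assumes the counters appear as "n_bytes=N, " (OpenFlow dumps) or "bytes:N, " (datapath dumps).

```shell
# Hypothetical helper (not an existing OVS test macro): remove byte
# counters from flow dump lines read on stdin, leaving packet counters,
# matches and actions untouched.
strip_byte_counters() {
    sed -e 's/n_bytes=[0-9]*, //' -e 's/bytes:[0-9]*, //'
}

# Made-up OpenFlow-style sample line, for illustration only.
printf '%s\n' ' n_packets=2, n_bytes=196, ct_state=+rel+trk,icmp actions=output:1' \
    | strip_byte_counters
```

The expected output in the test would then list only packet counters, so a difference in how TC counts bytes would no longer cause a mismatch.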

Stripping out the byte counters might be enough, I guess, when paired
with the kernel version check. But if there is a better way to check,
it might be better not to rely on packet counters either.

>
> Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev