On 1/4/24 15:03, Ilya Maximets wrote:
> On 1/4/24 11:57, Simon Horman wrote:
>> On Thu, Jan 04, 2024 at 04:27:49PM +1300, Brad Cowie wrote:
>>> Linux kernel commit ebddb1404900 ("net: move the nat function to
>>> nf_nat_ovs for ovs and tc") introduced a regression into the kernel
>>> datapath which prevented the openvswitch match key from being updated
>>> when nat was undone for packets in the related conntrack state. This
>>> issue caused these packets (usually ICMP/ICMPv6 error packets) to
>>> match the wrong openflow rule.
>>>
>>> This issue was fixed in linux kernel commit e6345d2824a3 ("netfilter:
>>> nf_nat: fix action not being set for all ct states").
>>>
>>> This test will reproduce the issue and fail for kernel versions
>>> v6.2 to v6.6, and will pass on earlier kernel versions where the issue
>>> wasn't present, or on later kernel versions that have the fix applied.
>>>
>>> Link: https://lore.kernel.org/netdev/20231221224311.130319-1-b...@faucet.nz/
>>> Suggested-by: Aaron Conole <acon...@redhat.com>
>>> Signed-off-by: Brad Cowie <b...@faucet.nz>
>>
>> Hi Brad,
>>
>> thanks for following-up on this.
>>
>> One question from my side is, given that this is currently broken in many
>> kernels in use today, how we should integrate this.  For one thing,
>> applying this patch causes the CI to fail.
>>
>>   https://github.com/ovsrobot/ovs/actions/runs/7405341045
>>
>> It might be nice if we could detect known to be broken kernels.
>> But I'm not sure, there is an easy way to do that, other than
>> running the test itself.
>>
>> Do you have any thoughts on this?
> 
> One option could be to exclude the kernels below 6.7 with:
> 
> OVS_CHECK_KERNEL(6, 7)
> 
> Unfortunately the original issue seems to be backported to some
> distribution kernels, but all the kernels 6.7+ should be fine.
> 
> However, I'm not convinced the test failed in CI because of this.
> They failed due to difference on packet and by counters.
> In case of offload tests they also have matching packet counters,
> but different byte counters.  I know there were cases where TC
> counts bytes differently.  So, the counters seem to be not a
> very reliable source of information.  Is there any other way
> we can detect the issue without comparing exact values of the
> packet/byte counters?

Stripping out the byte counters might be enough, I guess, in
pair with the kernel version check.  But if there is a better
way to check, it might be better to not rely on packet counters
as well.

> 
> Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to