On 5/31/22 23:48, Ilya Maximets wrote: > On 5/31/22 21:15, Frode Nordahl wrote: >> On Mon, May 30, 2022 at 5:25 PM Frode Nordahl
<snip> >> I've pushed the first part of the fix here: >> https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html > > Thanks! I saw that and I tend to think that it is correct. > I'll try to test it and apply in the next couple of days. > > One question about the test above: which entity actually adds > the ct_state to the packet or at which moment that happens? > I see it, but I'm not sure I fully understand that. Looks > like I'm missing smething obvious. I found what is going on there - kernel simply tracks everything if not told to do so, so ICMP packets create the ct entry and subsequent packets re-use it, so icmp replies have +trk set while entering OVS. ---- Let's have some summary of the issues discovered here so far, including a few new issues: 1. ct states set externally are not tracked correctly by OVS. Fix: https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html Status: LGTM, will apply soon. This fixes the problem originally reported by Liam, IIUC. Right? 2. Kernel ct() actions are trying to re-use the cached connection after the tuple changes. This ends up to be the OVN hairpin issue as reported here: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856 Proposed Fix: https://lore.kernel.org/netdev/20220606221140.488984-1-i.maxim...@ovn.org/T/#u Status: Needs review. 3. ct_clear is not a reversible action, because it required to pass the packet through conntrack again in order to restore the conntrack state. Fix: To move the CT_CLEAR to a group of irreversible actions in the reversible_actions() in ofproto/ofproto-dpif-xlate.c. Status: An easy fix that does nothing useful without fixes for later issues, because we clear the ct_state before the patch_port_output() processing and optimizing out the CT_CLEAR action. 4. (new!) reversible_actions() optimization is logically incorrect. The reason is that reversible_actions() doesn't look further than a list of actions of the first OF rule in the second bridge. If the list of action contains resubmit, there can still be irreversible actions and packet will be irreversibly modified by the patch port processing without cloning. Simple reproducer: diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index dbb3b6dda..be30ad5bf 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -8736,7 +8736,8 @@ OVS_VSWITCHD_START( AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats bands=type=drop rate=2']) AT_CHECK([ovs-ofctl del-flows br0]) AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 in_port=local,ip,actions=2,1]) -AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 in_port=1,ip,actions=meter:1,3]) +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 "in_port=1,ip,actions=resubmit(,7)"]) +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 table=7,in_port=1,ip,actions=meter:1,3]) AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(100),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=10.1.1.22,dst=10.0.0.3,proto=6,tos=0,ttl=64,frag=no),tcp(src=53295,dst=8080)'], [0], [stdout]) AT_CHECK([tail -1 stdout], [0], --- Status: unclear. One idea is to reverse the logic and check datapath actions for being reversible when going up in recursion instead of trying to predict all the future actions while going down. Ideas are welcome. Hammerhead approach: Mark 'resubmit' as irreversible action. But that will effectively mean that we're always cloning on patch port output, which is not great, see the issue #8. 5. Packets after the ct() in a non-forked pipeline must be untracked. Status: unclear. Fix for the issue #2 will cover issues for already tracked packets, but if the packet is supposed to be untracked, but it is tracked, then we must emit the ct_clear action from the userspace. The most viable approach, I guess, is the previously proposed 'pending_ct_clear' fix. Alternative is to always emit ct_clear right after the ct() action and not loose our minds trying to track the pending action in all the weird cases. 6. Conntrack state must not be preserved while going through the patch port. State: unclear. The right solution would be to emit the ct_clear datapath action after cloning for the patch port. Few problems here: - current code in ofproto/ofproto-dpif-xlate.c will not actually allow us to do that, we need to fix the issue #4 first to be able to inject new actions post factum. - ct_clear is non-reversible (issue #3) meaning if we're adding ct-clear, we have to clone even if the patch port processing doesn't have any ct() actions or recirculations. Excessive ct_clear's and clones can be avoided by fixing the issue #4 and examining the list of datapath actions created by the patch port processing afterwards to decide if we need to clone or insert ct_clear. 7. (new!) Userspace conntrack implementation fails to perform correct lookup if the cached connection was cleared during the tuple change. The issue can be seen with the OVN hairpin issue reproducer: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6 by running 'make check-system-userspace'. I'm not 100% sure what is going on there, but the test fails because packets are marked as invalid due to conn_not_found() -> valid_new() failure for the icmp reply. 8. Not really a bug, but all above implies adding more clone()'s and they are not playing well with HW offloading, i.e. simply breaking it. So, it makes sense to think how we can minimize the impact. Best regards, Ilya Maximets. _______________________________________________ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss