3. ct_clear is not a reversible action, because it required
to pass the packet through conntrack again in order to restore
the conntrack state.
Fix: To move the CT_CLEAR to a group of irreversible actions
in the reversible_actions() in ofproto/ofproto-dpif-xlate.c.
Status: An easy fix that does nothing useful without fixes for
later issues, because we clear the ct_state before
the patch_port_output() processing and optimizing out
the CT_CLEAR action.
4. (new!) reversible_actions() optimization is logically incorrect.
The reason is that reversible_actions() doesn't look further
than a list of actions of the first OF rule in the second bridge.
If the list of action contains resubmit, there can still be
irreversible actions and packet will be irreversibly modified by
the patch port processing without cloning.
Simple reproducer:
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index dbb3b6dda..be30ad5bf 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -8736,7 +8736,8 @@ OVS_VSWITCHD_START(
AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats
bands=type=drop rate=2'])
AT_CHECK([ovs-ofctl del-flows br0])
AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 in_port=local,ip,actions=2,1])
-AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 in_port=1,ip,actions=meter:1,3])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
"in_port=1,ip,actions=resubmit(,7)"])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
table=7,in_port=1,ip,actions=meter:1,3])
AT_CHECK([ovs-appctl ofproto/trace ovs-dummy
'in_port(100),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=10.1.1.22,dst=10.0.0.3,proto=6,tos=0,ttl=64,frag=no),tcp(src=53295,dst=8080)'],
[0], [stdout])
AT_CHECK([tail -1 stdout], [0],
---
Status: unclear.
One idea is to reverse the logic and check datapath actions for
being reversible when going up in recursion instead of trying to
predict all the future actions while going down.
Ideas are welcome.
Hammerhead approach: Mark 'resubmit' as irreversible action. But that
will effectively mean that we're always cloning on patch port output,
which is not great, see the issue #8.
5. Packets after the ct() in a non-forked pipeline must be untracked.
Status: unclear.
Fix for the issue #2 will cover issues for already tracked packets,
but if the packet is supposed to be untracked, but it is tracked,
then we must emit the ct_clear action from the userspace.
The most viable approach, I guess, is the previously proposed
'pending_ct_clear' fix. Alternative is to always emit ct_clear
right after the ct() action and not loose our minds trying to
track the pending action in all the weird cases.
6. Conntrack state must not be preserved while going through the
patch port.
State: unclear.
The right solution would be to emit the ct_clear datapath action
after cloning for the patch port. Few problems here:
- current code in ofproto/ofproto-dpif-xlate.c will not actually
allow us to do that, we need to fix the issue #4 first to be able to
inject new actions post factum.
- ct_clear is non-reversible (issue #3) meaning if we're adding
ct-clear, we have to clone even if the patch port processing
doesn't have any ct() actions or recirculations.
Excessive ct_clear's and clones can be avoided by fixing the
issue #4 and examining the list of datapath actions created by
the patch port processing afterwards to decide if we need to
clone or insert ct_clear.
7. (new!) Userspace conntrack implementation fails to perform correct
lookup if the cached connection was cleared during the tuple change.
The issue can be seen with the OVN hairpin issue reproducer:
https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856/comments/6
by running 'make check-system-userspace'.
I'm not 100% sure what is going on there, but the test fails because
packets are marked as invalid due to conn_not_found() -> valid_new()
failure for the icmp reply.
8. Not really a bug, but all above implies adding more clone()'s and
they are not playing well with HW offloading, i.e. simply breaking it.
So, it makes sense to think how we can minimize the impact.