> Hi Lorenzo, Mark > > Thanks for the patch. > Just to try to clarify the case which would not work anymore if we apply > this patch: see below. > > Thanks > Xavier
Hi Xavier, > > On Wed, Sep 24, 2025 at 7:22 PM Mark Michelson via dev < > [email protected]> wrote: > > > Hi Lorenzo, > > > > Clearing CT in this scenario fixes the sampling issue, but I fear it > > could break some ACLs. "allow-related" ACLs that are expected to be > > applied in this scenario may no longer be applied since conntrack > > state is being cleared > > With this patch, ACL applied on 'full' switches (i.e. w/o outport == ..) > ACL will behave differently than before in some cases. > This is highlighted by the system test change. > > In the following configuration > sw0p1 - sw0 - lr0 - sw1 - sw1p1 (all running on same hv1) > With the following ACLs > [0] acl-add sw1 to-lport 1002 ip4 && tcp && tcp.dst == 80' allow-related > [1] acl-add sw1 to-lport 1001 ip4' drop > [2] acl-add sw1 from-lport 1001 'inport == "sw1p1" && ip4' drop > > Without the patch, wget from sw0p1 towards sw1p1 was working (despite > looking at a "wrong" zone). > With the patch, replies from sw1p1 are dropped in sw1p1 egress pipeline (as > ct is skipped as towards router port, so allow-related rule does nor work). > > In more details: > Sending packets from sw0p1 towards sw1p1 port 80. > sw0 -> lr0 -> sw1 > - (sw1 ingress) received from sw1-lr0 > no ct (skipped as from router port) > - (sw1 egress) sent to sw1p1. > sent to ct (zone=sw1-p1) -> hits ACL [0] > > Reply (SYN/ACK) from sw1p1 > - (sw1 ingress) received from sw1p1 > sent to ct (zone=sw1-p1) -> hits 'related' part of [0] > - (sw1 egress) sent to sw1-lr0 > With this patch: not sent to ct => -trk -> allow-related does not apply > -> ACL [1] is applied, packet is dropped. > Without the patch, packet was sent to ct (zone=zone=sw1-p1), packet was > +trk and hits 'related' part of [0] yes, I was aware of this possible problem (in fact I modified the system-test :)). I was thinking it was not a valid use-case. I will investigate the solution suggested by Dumitru. Regards, Lorenzo > > . Would it be possible to just suppress the > > sampling in this scenario instead of clearing CT completely? > > > > I have a couple of other questions inline below > > > > On Mon, Sep 22, 2025 at 8:58 AM Lorenzo Bianconi via dev > > <[email protected]> wrote: > > > > > > > Clear connection tracking state in the logical switch egress pipeline > > if > > > > the packet is sent to a distributed logical router via a patch port. > > > > This patch will fix the oversampling issue reported in > > > > > > https://mail.openvswitch.org/pipermail/ovs-discuss/2025-May/053626.html > > > > > > > > Reported-at: https://issues.redhat.com/browse/FDP-1408 > > > > Signed-off-by: Lorenzo Bianconi <[email protected]> > > > > > > Hi Oscar, > > > > > > any chance to test if this patch fixes the reported problem? Thanks in > > advance. > > > > > > Regards, > > > Lorenzo > > > > > > > --- > > > > northd/northd.c | 16 ++++-- > > > > tests/ovn-northd.at | 4 +- > > > > tests/system-ovn.at | 129 > > +++++++++++++++++++++++++++++++++++++++++++- > > > > 3 files changed, 141 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/northd/northd.c b/northd/northd.c > > > > index 8b5413ef3..2e433bfdd 100644 > > > > --- a/northd/northd.c > > > > +++ b/northd/northd.c > > > > @@ -5832,7 +5832,7 @@ build_lswitch_output_port_sec_od(struct > > ovn_datapath *od, > > > > > > > > static void > > > > skip_port_from_conntrack(const struct ovn_datapath *od, struct > > ovn_port *op, > > > > - bool has_stateful_acl, enum ovn_stage > > in_stage, > > > > + bool apply_ct, enum ovn_stage in_stage, > > > > enum ovn_stage out_stage, uint16_t priority, > > > > struct lflow_table *lflows, > > > > struct lflow_ref *lflow_ref) > > > > @@ -5848,9 +5848,17 @@ skip_port_from_conntrack(const struct > > ovn_datapath *od, struct ovn_port *op, > > > > * conntrack state across all chassis. */ > > > > > > > > const char *ingress_action = "next;"; > > > > - const char *egress_action = has_stateful_acl > > > > - ? "next;" > > > > - : "ct_clear; next;"; > > > > + struct ovn_port *peer = op->peer; > > > > + if (peer && lsp_is_router(op->nbsp)) { > > > > + struct ovn_datapath *peer_od = peer->od; > > > > + > > > > + if (peer_od && > > > > + !peer_od->is_gw_router && > > vector_is_empty(&peer_od->l3dgw_ports)) { > > > > + /* We can clear ct for distributed routers. */ > > > > + apply_ct = false; > > > > + } > > > > I'm a bit confused about why apply_ct remains true for gateway routers > > and routers with DGPs. Wouldn't this lead to the same double sampling > > issue as before? Consider a scenario with a gateway router where both > > vms are bound to the gateway chassis. The traffic pattern is identical > > to what is done in the system test you've added, but in this case, CT > > is not cleared because the router is a gateway router. But even if the > > vms are bound to a different chassis than the gateway router, then why > > would we not clear CT in this case? Wouldn't the double sampling still > > happen then? > > > > I think this reinforces the suggestion I had that instead of clearing > > CT, we should just suppress the sampling on "router" type logical > > switch ports, no matter what attributes the router has. > > > > > > + } > > > > + const char *egress_action = apply_ct ? "next;" : "ct_clear; > > next;"; > > > > > > > > char *ingress_match = xasprintf("ip && inport == %s", > > op->json_key); > > > > char *egress_match = xasprintf("ip && outport == %s", > > op->json_key); > > > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > > > > index c9e998129..8eb4bae1c 100644 > > > > --- a/tests/ovn-northd.at > > > > +++ b/tests/ovn-northd.at > > > > @@ -4819,7 +4819,7 @@ check_stateful_flows() { > > > > table=??(ls_out_pre_lb ), priority=100 , match=(ip), > > action=(reg0[[2]] = 1; next;) > > > > table=??(ls_out_pre_lb ), priority=110 , match=(eth.mcast), > > action=(next;) > > > > table=??(ls_out_pre_lb ), priority=110 , match=(eth.src == > > \$svc_monitor_mac), action=(next;) > > > > - table=??(ls_out_pre_lb ), priority=110 , match=(ip && outport > > == "sw0-lr0"), action=($action) > > > > The $action variable is only used in this one line of > > check_stateful_flows(). If the variable is not going to be used > > anymore, then you can remove it entirely and clean up the calls to > > check_stateful_flows() so they do not pass an argument any longer. > > > > Depending on what is done for v2, this comment may not be relevant though. > > > > > > + table=??(ls_out_pre_lb ), priority=110 , match=(ip && outport > > == "sw0-lr0"), action=(ct_clear; next;) > > > > table=??(ls_out_pre_lb ), priority=110 , match=(nd || nd_rs > > || nd_ra || mldv1 || mldv2), action=(next;) > > > > table=??(ls_out_pre_lb ), priority=110 , match=(reg0[[16]] == > > 1), action=(next;) > > > > ]) > > > > @@ -4886,7 +4886,7 @@ AT_CHECK([grep "ls_out_pre_lb" sw0flows | > > ovn_strip_lflows], [0], [dnl > > > > table=??(ls_out_pre_lb ), priority=0 , match=(1), > > action=(next;) > > > > table=??(ls_out_pre_lb ), priority=110 , match=(eth.mcast), > > action=(next;) > > > > table=??(ls_out_pre_lb ), priority=110 , match=(eth.src == > > $svc_monitor_mac), action=(next;) > > > > - table=??(ls_out_pre_lb ), priority=110 , match=(ip && outport > > == "sw0-lr0"), action=(next;) > > > > + table=??(ls_out_pre_lb ), priority=110 , match=(ip && outport > > == "sw0-lr0"), action=(ct_clear; next;) > > > > table=??(ls_out_pre_lb ), priority=110 , match=(nd || nd_rs > > || nd_ra || mldv1 || mldv2), action=(next;) > > > > table=??(ls_out_pre_lb ), priority=110 , match=(reg0[[16]] == > > 1), action=(next;) > > > > ]) > > > > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > > > > index 8e356df6f..3e4ff32d1 100644 > > > > --- a/tests/system-ovn.at > > > > +++ b/tests/system-ovn.at > > > > @@ -8542,8 +8542,8 @@ check ovn-nbctl lsp-set-addresses sw1p1 > > "50:54:00:00:00:03 20.0.0.3" > > > > > > > > check ovn-nbctl --apply-after-lb acl-add sw1 from-lport 1001 'inport > > == "sw1p1" && ip4' drop > > > > > > > > -check ovn-nbctl acl-add sw1 to-lport 1002 'ip4 && tcp && tcp.dst == > > 80' allow-related > > > > -check ovn-nbctl acl-add sw1 to-lport 1001 'ip4' drop > > > > +check ovn-nbctl acl-add sw1 to-lport 1002 'outport == "sw1p1" && ip4 > > && tcp && tcp.dst == 80' allow-related > > > > +check ovn-nbctl acl-add sw1 to-lport 1001 'outport == "sw1p1" && ip4' > > drop > > > > Why are these changes necessary? > > > Applying ACL on the switch (w/ outport=..) applies ACL also on the switch > ports, and the patch is breaking this use case. > > > > > > > > > > > ADD_NAMESPACES(sw0p1) > > > > ADD_VETH(sw0p1, sw0p1, br-int, "10.0.0.3/24", "50:54:00:00:00:02", \ > > > > @@ -13969,6 +13969,131 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query > > port patch-.*/d > > > > AT_CLEANUP > > > > ]) > > > > > > > > +OVN_FOR_EACH_NORTHD([ > > > > +AT_SETUP([ovn -- ACL Sampling - Stateful ACL - routed traffic]) > > > > +AT_KEYWORDS([ACL]) > > > > + > > > > +CHECK_CONNTRACK() > > > > +CHECK_CONNTRACK_NAT() > > > > +ovn_start > > > > +OVS_TRAFFIC_VSWITCHD_START() > > > > +ADD_BR([br-int]) > > > > + > > > > +dnl Set external-ids in br-int needed for ovn-controller > > > > +check ovs-vsctl \ > > > > + -- set Open_vSwitch . external-ids:system-id=hv1 \ > > > > + -- set Open_vSwitch . > > external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ > > > > + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ > > > > + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ > > > > + -- set bridge br-int fail-mode=secure > > other-config:disable-in-band=true > > > > + > > > > +dnl Start ovn-controller > > > > +start_daemon ovn-controller > > > > + > > > > +dnl Logical network: > > > > +dnl 1 router > > > > +dnl 2 logical switch > > > > +dnl 2 VIF > > > > + > > > > +check ovn-nbctl \ > > > > + -- lr-add lr \ > > > > + -- lrp-add lr lrp1 00:00:00:00:01:00 42.42.42.1/24 \ > > > > + -- lrp-add lr lrp2 00:00:00:00:02:00 43.43.43.1/24 \ > > > > + -- ls-add ls1 \ > > > > + -- ls-add ls2 \ > > > > + -- lsp-add ls1 vm1 \ > > > > + -- lsp-add ls2 vm2 \ > > > > + -- lsp-set-addresses vm1 00:00:00:00:00:01 \ > > > > + -- lsp-set-addresses vm2 00:00:00:00:00:02 \ > > > > + -- lsp-add ls1 ls1-lr \ > > > > + -- lsp-add ls2 ls2-lr \ > > > > + -- lsp-set-type ls1-lr router \ > > > > + -- lsp-set-type ls2-lr router \ > > > > + -- lsp-set-options ls1-lr router-port=lrp1 \ > > > > + -- lsp-set-options ls2-lr router-port=lrp2 \ > > > > + -- lsp-set-addresses ls1-lr router \ > > > > + -- lsp-set-addresses ls2-lr router > > > > +check ovn-nbctl --wait=hv sync > > > > + > > > > +ADD_NAMESPACES(vm1) > > > > +ADD_VETH(vm1, vm1, br-int, "42.42.42.2/24", "00:00:00:00:00:01", > > "42.42.42.1") > > > > +ADD_NAMESPACES(vm2) > > > > +ADD_VETH(vm2, vm2, br-int, "43.43.43.2/24", "00:00:00:00:00:02", > > "43.43.43.1") > > > > + > > > > +wait_for_ports_up > > > > + > > > > +NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 43.43.43.2 | > > FORMAT_PING], \ > > > > +[0], [dnl > > > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > > > +]) > > > > + > > > > +collector1=$(ovn-nbctl create Sample_Collector id=1 name=c1 > > probability=65535 set_id=100) > > > > +collector2=$(ovn-nbctl create Sample_Collector id=2 name=c2 > > probability=65535 set_id=200) > > > > +check_row_count nb:Sample_Collector 2 > > > > + > > > > +check_uuid ovn-nbctl create Sampling_App type="acl-est" id="43" > > > > +check_row_count nb:Sampling_App 1 > > > > + > > > > +check ovn-nbctl pg-add pg vm1 vm2 > > > > +check_uuid ovn-nbctl > > \ > > > > + -- --id=@sample1 create Sample collector="$collector1" > > metadata=1001 \ > > > > + -- --sample-est=@sample1 acl-add pg from-lport 200 "inport == @pg > > && udp" allow-related \ > > > > + -- --id=@sample2 create Sample collector="$collector2" > > metadata=1004 \ > > > > + -- --sample-est=@sample2 acl-add pg to-lport 200 "outport == @pg > > && udp" allow-related > > > > +check ovn-nbctl --wait=hv sync > > > > +check_row_count nb:ACL 2 > > > > +check_row_count nb:Sample 2 > > > > + > > > > +NETNS_DAEMONIZE([vm1], [nc -l -u 1234], [udp1.pid]) > > > > +NETNS_DAEMONIZE([vm2], [nc -l -u 1235], [udp2.pid]) > > > > + > > > > +dnl Start an IPFIX collector. > > > > +DAEMONIZE([nfcapd -B 1024000 -w . -p 4242 2> collector.err], > > [collector.pid]) > > > > +dnl Wait for the collector to be up. > > > > +OVS_WAIT_UNTIL([grep -q 'Startup nfcapd.' collector.err]) > > > > + > > > > +dnl Configure the OVS flow sample collector. > > > > +ovs-vsctl --id=@br get Bridge br-int \ > > > > + -- --id=@ipfix1 create IPFIX targets=\"127.0.0.1:4242\" > > template_interval=1 \ > > > > + -- --id=@cs1 create Flow_Sample_Collector_Set id=100 bridge=@br > > ipfix=@ipfix1 \ > > > > + -- --id=@ipfix2 create IPFIX targets=\"127.0.0.1:4242\" > > template_interval=1 \ > > > > + -- --id=@cs2 create Flow_Sample_Collector_Set id=200 bridge=@br > > ipfix=@ipfix2 > > > > + > > > > +check ovn-nbctl --wait=hv sync > > > > +dnl And wait for it to be up and running. > > > > +OVS_WAIT_UNTIL([ovs-ofctl dump-ipfix-flow br-int | grep -q '2 ids']) > > > > + > > > > +# Create the connection in the CT table. > > > > +echo "Hello" > Hello > > > > +NS_EXEC([vm1], [nc -p 1234 -u 43.43.43.2 1235 < Hello]) > > > > +NS_EXEC([vm2], [nc -p 1235 -u 42.42.42.2 1234 < Hello]) > > > > +# Send some more packets. > > > > +for i in $(seq 10); do > > > > +NS_EXEC([vm1], [nc -p 1234 -u 43.43.43.2 1235 < Hello]) > > > > +NS_EXEC([vm2], [nc -p 1235 -u 42.42.42.2 1234 < Hello]) > > > > +done > > > > + > > > > +OVS_WAIT_UNTIL([ovs-ofctl dump-ipfix-flow br-int | grep 'id 100' | > > grep -q 'sampled pkts=21']) > > > > +OVS_WAIT_UNTIL([ovs-ofctl dump-ipfix-flow br-int | grep 'id 200' | > > grep -q 'sampled pkts=21']) > > > > + > > > > +OVN_CLEANUP_CONTROLLER([hv1]) > > > > + > > > > +as ovn-sb > > > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > > > + > > > > +as ovn-nb > > > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > > > + > > > > +as northd > > > > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > > > > + > > > > +as > > > > +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d > > > > +/connection dropped.*/d"]) > > > > + > > > > +AT_CLEANUP > > > > +]) > > > > + > > > > OVN_FOR_EACH_NORTHD([ > > > > AT_SETUP([SB Disconnect - MAC_Binding]) > > > > ovn_start > > > > -- > > > > 2.51.0 > > > > > > > _______________________________________________ > > > dev mailing list > > > [email protected] > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > _______________________________________________ > > dev mailing list > > [email protected] > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >
_______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
