> On 9/25/25 3:10 PM, Lorenzo Bianconi wrote: > >> Hi Lorenzo, > >> > >> Clearing CT in this scenario fixes the sampling issue, but I fear it > >> could break some ACLs. "allow-related" ACLs that are expected to be > >> applied in this scenario may no longer be applied since conntrack > >> state is being cleared. Would it be possible to just suppress the > >> sampling in this scenario instead of clearing CT completely? > > > > Hi Mark, > > > > Hi Lorenzo, Mark, > > > thx for the review. Sampling decision is based only on ct state info, > > logical > > flows does not depend on the input or output port used (e.g. ingress ACL > > sampling flows): > > > > table=10(ls_in_acl_sample ), priority=1000 , match=(ip && ct.trk && > > (ct.est || ct.rel) && ct_label.obs_unused == 0 && !ct.rpl && > > ct_mark.obs_collector_id == 1 && ct_mark.obs_stage == 0), > > action=(sample(probability=65535,collector_set=100,obs_domain=43,obs_point=ct_label.obs_point_id); > > next;) > > table=10(ls_in_acl_sample ), priority=1000 , match=(ip && ct.trk && > > (ct.est || ct.rel) && ct_label.obs_unused == 0 && ct.rpl && > > ct_mark.obs_collector_id == 2), > > action=(sample(probability=65535,collector_set=200,obs_domain=43,obs_point=ct_label.obs_point_id); > > next;) > > > > IIUC the only way to avoid sampling is to clear ct state, right? Another > > approach would be to just avoid sampling on all possible patch ports > > connected > > to a router but this probably will introduce new lflows. > > Do you think it is better? @Dumitru: what do you think? > > > > I'm afraid there are valid cases (e.g., no stateful ACLs) when ACLs are > correctly applied to traffic being sent out from logical switches to > logical routers (via OVN patch ports). It would be bad if we couldn't > sample that kind of traffic. > > I have a different suggestion: can't we use a bit in the per-packet > "flags" register to track whether a packet has been sampled in the > current conntrack zone? The "ct_clear" action would have to be changed > to zero this flag. > > Would that work to address the problem this patch is trying to fix?
Hi Dumitru, I will investigate this approach, thx for pointing this out. Regards, Lorenzo > > > Moreover, the patch clears ct_state just for patch ports on distributed > > routers. > > Is it correct to use LSP ct zone in this case or is it just an 'abuse' of > > previous behaviour? > > > > I agree with Mark on this one. It feels weird to only change behavior > for one type of router ports. > > Regards, > Dumitru > > >> > >> I have a couple of other questions inline below > >> > >> On Mon, Sep 22, 2025 at 8:58 AM Lorenzo Bianconi via dev > >> <[email protected]> wrote: > >>> > >>>> Clear connection tracking state in the logical switch egress pipeline if > >>>> the packet is sent to a distributed logical router via a patch port. > >>>> This patch will fix the oversampling issue reported in > >>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2025-May/053626.html > >>>> > >>>> Reported-at: https://issues.redhat.com/browse/FDP-1408 > >>>> Signed-off-by: Lorenzo Bianconi <[email protected]> > >>> > >>> Hi Oscar, > >>> > >>> any chance to test if this patch fixes the reported problem? Thanks in > >>> advance. > >>> > >>> Regards, > >>> Lorenzo > >>> > >>>> --- > >>>> northd/northd.c | 16 ++++-- > >>>> tests/ovn-northd.at | 4 +- > >>>> tests/system-ovn.at | 129 +++++++++++++++++++++++++++++++++++++++++++- > >>>> 3 files changed, 141 insertions(+), 8 deletions(-) > >>>> > >>>> diff --git a/northd/northd.c b/northd/northd.c > >>>> index 8b5413ef3..2e433bfdd 100644 > >>>> --- a/northd/northd.c > >>>> +++ b/northd/northd.c > >>>> @@ -5832,7 +5832,7 @@ build_lswitch_output_port_sec_od(struct > >>>> ovn_datapath *od, > >>>> > >>>> static void > >>>> skip_port_from_conntrack(const struct ovn_datapath *od, struct ovn_port > >>>> *op, > >>>> - bool has_stateful_acl, enum ovn_stage in_stage, > >>>> + bool apply_ct, enum ovn_stage in_stage, > >>>> enum ovn_stage out_stage, uint16_t priority, > >>>> struct lflow_table *lflows, > >>>> struct lflow_ref *lflow_ref) > >>>> @@ -5848,9 +5848,17 @@ skip_port_from_conntrack(const struct > >>>> ovn_datapath *od, struct ovn_port *op, > >>>> * conntrack state across all chassis. */ > >>>> > >>>> const char *ingress_action = "next;"; > >>>> - const char *egress_action = has_stateful_acl > >>>> - ? "next;" > >>>> - : "ct_clear; next;"; > >>>> + struct ovn_port *peer = op->peer; > >>>> + if (peer && lsp_is_router(op->nbsp)) { > >>>> + struct ovn_datapath *peer_od = peer->od; > >>>> + > >>>> + if (peer_od && > >>>> + !peer_od->is_gw_router && > >>>> vector_is_empty(&peer_od->l3dgw_ports)) { > >>>> + /* We can clear ct for distributed routers. */ > >>>> + apply_ct = false; > >>>> + } > >> > >> I'm a bit confused about why apply_ct remains true for gateway routers > >> and routers with DGPs. Wouldn't this lead to the same double sampling > >> issue as before? Consider a scenario with a gateway router where both > >> vms are bound to the gateway chassis. The traffic pattern is identical > >> to what is done in the system test you've added, but in this case, CT > >> is not cleared because the router is a gateway router. But even if the > >> vms are bound to a different chassis than the gateway router, then why > >> would we not clear CT in this case? Wouldn't the double sampling still > >> happen then? > >> > >> I think this reinforces the suggestion I had that instead of clearing > >> CT, we should just suppress the sampling on "router" type logical > >> switch ports, no matter what attributes the router has. > >> > >>>> + } > >>>> + const char *egress_action = apply_ct ? "next;" : "ct_clear; next;"; > >>>> > >>>> char *ingress_match = xasprintf("ip && inport == %s", op->json_key); > >>>> char *egress_match = xasprintf("ip && outport == %s", op->json_key); > >>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > >>>> index c9e998129..8eb4bae1c 100644 > >>>> --- a/tests/ovn-northd.at > >>>> +++ b/tests/ovn-northd.at > >>>> @@ -4819,7 +4819,7 @@ check_stateful_flows() { > >>>> table=??(ls_out_pre_lb ), priority=100 , match=(ip), > >>>> action=(reg0[[2]] = 1; next;) > >>>> table=??(ls_out_pre_lb ), priority=110 , match=(eth.mcast), > >>>> action=(next;) > >>>> table=??(ls_out_pre_lb ), priority=110 , match=(eth.src == > >>>> \$svc_monitor_mac), action=(next;) > >>>> - table=??(ls_out_pre_lb ), priority=110 , match=(ip && outport > >>>> == "sw0-lr0"), action=($action) > >> > >> The $action variable is only used in this one line of > >> check_stateful_flows(). If the variable is not going to be used > >> anymore, then you can remove it entirely and clean up the calls to > >> check_stateful_flows() so they do not pass an argument any longer. > >> > >> Depending on what is done for v2, this comment may not be relevant though. > >> > >>>> + table=??(ls_out_pre_lb ), priority=110 , match=(ip && outport > >>>> == "sw0-lr0"), action=(ct_clear; next;) > >>>> table=??(ls_out_pre_lb ), priority=110 , match=(nd || nd_rs || > >>>> nd_ra || mldv1 || mldv2), action=(next;) > >>>> table=??(ls_out_pre_lb ), priority=110 , match=(reg0[[16]] == > >>>> 1), action=(next;) > >>>> ]) > >>>> @@ -4886,7 +4886,7 @@ AT_CHECK([grep "ls_out_pre_lb" sw0flows | > >>>> ovn_strip_lflows], [0], [dnl > >>>> table=??(ls_out_pre_lb ), priority=0 , match=(1), > >>>> action=(next;) > >>>> table=??(ls_out_pre_lb ), priority=110 , match=(eth.mcast), > >>>> action=(next;) > >>>> table=??(ls_out_pre_lb ), priority=110 , match=(eth.src == > >>>> $svc_monitor_mac), action=(next;) > >>>> - table=??(ls_out_pre_lb ), priority=110 , match=(ip && outport > >>>> == "sw0-lr0"), action=(next;) > >>>> + table=??(ls_out_pre_lb ), priority=110 , match=(ip && outport > >>>> == "sw0-lr0"), action=(ct_clear; next;) > >>>> table=??(ls_out_pre_lb ), priority=110 , match=(nd || nd_rs || > >>>> nd_ra || mldv1 || mldv2), action=(next;) > >>>> table=??(ls_out_pre_lb ), priority=110 , match=(reg0[[16]] == > >>>> 1), action=(next;) > >>>> ]) > >>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at > >>>> index 8e356df6f..3e4ff32d1 100644 > >>>> --- a/tests/system-ovn.at > >>>> +++ b/tests/system-ovn.at > >>>> @@ -8542,8 +8542,8 @@ check ovn-nbctl lsp-set-addresses sw1p1 > >>>> "50:54:00:00:00:03 20.0.0.3" > >>>> > >>>> check ovn-nbctl --apply-after-lb acl-add sw1 from-lport 1001 'inport == > >>>> "sw1p1" && ip4' drop > >>>> > >>>> -check ovn-nbctl acl-add sw1 to-lport 1002 'ip4 && tcp && tcp.dst == 80' > >>>> allow-related > >>>> -check ovn-nbctl acl-add sw1 to-lport 1001 'ip4' drop > >>>> +check ovn-nbctl acl-add sw1 to-lport 1002 'outport == "sw1p1" && ip4 && > >>>> tcp && tcp.dst == 80' allow-related > >>>> +check ovn-nbctl acl-add sw1 to-lport 1001 'outport == "sw1p1" && ip4' > >>>> drop > >> > >> Why are these changes necessary? > > > > Because the test was relying on the previous ct zone assumption. Discussing > > with Dunitru we agreed this is wrong. > > @Dumitru: any input on it? > > > > Regards, > > Lorenzo > > > >> > >>>> > >>>> ADD_NAMESPACES(sw0p1) > >>>> ADD_VETH(sw0p1, sw0p1, br-int, "10.0.0.3/24", "50:54:00:00:00:02", \ > >>>> @@ -13969,6 +13969,131 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query > >>>> port patch-.*/d > >>>> AT_CLEANUP > >>>> ]) > >>>> > >>>> +OVN_FOR_EACH_NORTHD([ > >>>> +AT_SETUP([ovn -- ACL Sampling - Stateful ACL - routed traffic]) > >>>> +AT_KEYWORDS([ACL]) > >>>> + > >>>> +CHECK_CONNTRACK() > >>>> +CHECK_CONNTRACK_NAT() > >>>> +ovn_start > >>>> +OVS_TRAFFIC_VSWITCHD_START() > >>>> +ADD_BR([br-int]) > >>>> + > >>>> +dnl Set external-ids in br-int needed for ovn-controller > >>>> +check ovs-vsctl \ > >>>> + -- set Open_vSwitch . external-ids:system-id=hv1 \ > >>>> + -- set Open_vSwitch . > >>>> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ > >>>> + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ > >>>> + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ > >>>> + -- set bridge br-int fail-mode=secure > >>>> other-config:disable-in-band=true > >>>> + > >>>> +dnl Start ovn-controller > >>>> +start_daemon ovn-controller > >>>> + > >>>> +dnl Logical network: > >>>> +dnl 1 router > >>>> +dnl 2 logical switch > >>>> +dnl 2 VIF > >>>> + > >>>> +check ovn-nbctl \ > >>>> + -- lr-add lr \ > >>>> + -- lrp-add lr lrp1 00:00:00:00:01:00 42.42.42.1/24 \ > >>>> + -- lrp-add lr lrp2 00:00:00:00:02:00 43.43.43.1/24 \ > >>>> + -- ls-add ls1 \ > >>>> + -- ls-add ls2 \ > >>>> + -- lsp-add ls1 vm1 \ > >>>> + -- lsp-add ls2 vm2 \ > >>>> + -- lsp-set-addresses vm1 00:00:00:00:00:01 \ > >>>> + -- lsp-set-addresses vm2 00:00:00:00:00:02 \ > >>>> + -- lsp-add ls1 ls1-lr \ > >>>> + -- lsp-add ls2 ls2-lr \ > >>>> + -- lsp-set-type ls1-lr router \ > >>>> + -- lsp-set-type ls2-lr router \ > >>>> + -- lsp-set-options ls1-lr router-port=lrp1 \ > >>>> + -- lsp-set-options ls2-lr router-port=lrp2 \ > >>>> + -- lsp-set-addresses ls1-lr router \ > >>>> + -- lsp-set-addresses ls2-lr router > >>>> +check ovn-nbctl --wait=hv sync > >>>> + > >>>> +ADD_NAMESPACES(vm1) > >>>> +ADD_VETH(vm1, vm1, br-int, "42.42.42.2/24", "00:00:00:00:00:01", > >>>> "42.42.42.1") > >>>> +ADD_NAMESPACES(vm2) > >>>> +ADD_VETH(vm2, vm2, br-int, "43.43.43.2/24", "00:00:00:00:00:02", > >>>> "43.43.43.1") > >>>> + > >>>> +wait_for_ports_up > >>>> + > >>>> +NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 43.43.43.2 | > >>>> FORMAT_PING], \ > >>>> +[0], [dnl > >>>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms > >>>> +]) > >>>> + > >>>> +collector1=$(ovn-nbctl create Sample_Collector id=1 name=c1 > >>>> probability=65535 set_id=100) > >>>> +collector2=$(ovn-nbctl create Sample_Collector id=2 name=c2 > >>>> probability=65535 set_id=200) > >>>> +check_row_count nb:Sample_Collector 2 > >>>> + > >>>> +check_uuid ovn-nbctl create Sampling_App type="acl-est" id="43" > >>>> +check_row_count nb:Sampling_App 1 > >>>> + > >>>> +check ovn-nbctl pg-add pg vm1 vm2 > >>>> +check_uuid ovn-nbctl > >>>> \ > >>>> + -- --id=@sample1 create Sample collector="$collector1" > >>>> metadata=1001 \ > >>>> + -- --sample-est=@sample1 acl-add pg from-lport 200 "inport == @pg > >>>> && udp" allow-related \ > >>>> + -- --id=@sample2 create Sample collector="$collector2" > >>>> metadata=1004 \ > >>>> + -- --sample-est=@sample2 acl-add pg to-lport 200 "outport == @pg && > >>>> udp" allow-related > >>>> +check ovn-nbctl --wait=hv sync > >>>> +check_row_count nb:ACL 2 > >>>> +check_row_count nb:Sample 2 > >>>> + > >>>> +NETNS_DAEMONIZE([vm1], [nc -l -u 1234], [udp1.pid]) > >>>> +NETNS_DAEMONIZE([vm2], [nc -l -u 1235], [udp2.pid]) > >>>> + > >>>> +dnl Start an IPFIX collector. > >>>> +DAEMONIZE([nfcapd -B 1024000 -w . -p 4242 2> collector.err], > >>>> [collector.pid]) > >>>> +dnl Wait for the collector to be up. > >>>> +OVS_WAIT_UNTIL([grep -q 'Startup nfcapd.' collector.err]) > >>>> + > >>>> +dnl Configure the OVS flow sample collector. > >>>> +ovs-vsctl --id=@br get Bridge br-int \ > >>>> + -- --id=@ipfix1 create IPFIX targets=\"127.0.0.1:4242\" > >>>> template_interval=1 \ > >>>> + -- --id=@cs1 create Flow_Sample_Collector_Set id=100 bridge=@br > >>>> ipfix=@ipfix1 \ > >>>> + -- --id=@ipfix2 create IPFIX targets=\"127.0.0.1:4242\" > >>>> template_interval=1 \ > >>>> + -- --id=@cs2 create Flow_Sample_Collector_Set id=200 bridge=@br > >>>> ipfix=@ipfix2 > >>>> + > >>>> +check ovn-nbctl --wait=hv sync > >>>> +dnl And wait for it to be up and running. > >>>> +OVS_WAIT_UNTIL([ovs-ofctl dump-ipfix-flow br-int | grep -q '2 ids']) > >>>> + > >>>> +# Create the connection in the CT table. > >>>> +echo "Hello" > Hello > >>>> +NS_EXEC([vm1], [nc -p 1234 -u 43.43.43.2 1235 < Hello]) > >>>> +NS_EXEC([vm2], [nc -p 1235 -u 42.42.42.2 1234 < Hello]) > >>>> +# Send some more packets. > >>>> +for i in $(seq 10); do > >>>> +NS_EXEC([vm1], [nc -p 1234 -u 43.43.43.2 1235 < Hello]) > >>>> +NS_EXEC([vm2], [nc -p 1235 -u 42.42.42.2 1234 < Hello]) > >>>> +done > >>>> + > >>>> +OVS_WAIT_UNTIL([ovs-ofctl dump-ipfix-flow br-int | grep 'id 100' | grep > >>>> -q 'sampled pkts=21']) > >>>> +OVS_WAIT_UNTIL([ovs-ofctl dump-ipfix-flow br-int | grep 'id 200' | grep > >>>> -q 'sampled pkts=21']) > >>>> + > >>>> +OVN_CLEANUP_CONTROLLER([hv1]) > >>>> + > >>>> +as ovn-sb > >>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > >>>> + > >>>> +as ovn-nb > >>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > >>>> + > >>>> +as northd > >>>> +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > >>>> + > >>>> +as > >>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d > >>>> +/connection dropped.*/d"]) > >>>> + > >>>> +AT_CLEANUP > >>>> +]) > >>>> + > >>>> OVN_FOR_EACH_NORTHD([ > >>>> AT_SETUP([SB Disconnect - MAC_Binding]) > >>>> ovn_start > >>>> -- > >>>> 2.51.0 > >>>> > >>> _______________________________________________ > >>> dev mailing list > >>> [email protected] > >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> >
_______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
