On Tue, Dec 9, 2025 at 4:27 AM Dumitru Ceara <[email protected]> wrote:
>
> On 11/18/25 4:50 AM, Xie Liu wrote:
> > Consider the case of stateful Firewall for N-S traffic:
> >
> > PUBLIC---S1-(S1-R1)---------(R1-S1)-R1 -------- S2 ---- VM1
> >
> > Configuration:
> >
> > ovn-nbctl pg-add pg_dgw
> > ovn-nbctl pg-set-ports pg_dgw S1-R1
> > ovn-nbctl acl-add pg_dgw from-lport 2000 "inport == @pg_dgw && ip4 &&
> > icmp4" allow-related
> > ovn-nbctl acl-add pg_dgw from-lport 1000 "inport == @pg_dgw && ip4" drop
> > ovn-nbctl acl-add pg_dgw to-lport 1000 "outport == @pg_dgw && ip4" drop
> > ovn-nbctl lsp-set-options S1-R1 router-port=R1-S1
> > enable_router_port_acl=true
> >
> > VM1 pings external network.
> >
> > Through this patch[1], the ovn-controller assigned a CT zone ID
> > to the localnet LSP but not the dgw LSP.
> >
> > This caused ACL failures: ICMP reply packets from external networks
> > performed CT lookups in the wrong zone, couldn't match established
> > connections, and were incorrectly dropped.
> >
> > This commit enables CT zone allocation for patch ports that correspond
> > to router gateway ports when enable_router_port_acl=true is set.
> >
> > Changes:
> > - northd: Add enable-router-port-acl option to southbound port binding
> > - binding: Handle patch port CT zone requirements in local_lports
> > - controller: Add/Delete CT zone for patch ports enabled/disabled ACL
> >
> > [1]https://github.com/ovn-org/ovn/commit/5ae7d2cb60a50541e88e8f5c74a669e2aa7acdda
> >
> > Reported-at: https://github.com/ovn-org/ovn/issues/264
> > Signed-off-by: Xie Liu <[email protected]>
> > ---
>
> Hi Xie Liu,
>
> Thanks for this new version and sorry for the delay in reviewing it.
>
> > controller/binding.c | 31 ++++++++
> > controller/ovn-controller.c | 23 +++---
> > controller/physical.c | 4 +-
> > northd/northd.c | 5 ++
> > tests/system-ovn.at | 145 ++++++++++++++++++++++++++++++++++++
> > 5 files changed, 197 insertions(+), 11 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index b429323a9..6e138f1df 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -791,6 +791,16 @@ update_active_pb_ras_pd(const struct
> > sbrec_port_binding *pb,
> > }
> > }
> >
> > +/* Check the patch port needs CT zone based on enable-router-port-acl.
> > + * Returns true if the patch port should be added to local_lports for
> > + * CT zone allocation.
> > + */
> > +static bool
> > +patch_port_needs_ct_zone(const struct sbrec_port_binding *pb)
> > +{
> > + return smap_get_bool(&pb->options, "enable-router-port-acl", false);
>
> To avoid confusion I'd use the same option name as we have in the
> northbound database, enable_router_port_acl.
>
> > +}
> > +
> > static bool is_ext_id_changed(const struct smap *a, const struct smap *b,
> > const char *key);
> > static struct local_binding *local_binding_create(
> > @@ -2259,6 +2269,12 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct
> > binding_ctx_out *b_ctx_out)
> > case LP_PATCH:
> > update_related_lport(pb, b_ctx_out);
> > consider_patch_port_for_local_datapaths(pb, b_ctx_in,
> > b_ctx_out);
> > + /* Check if this patch port needs CT zone allocation and add it
> > + * to local_lports if required. */
> > + if (patch_port_needs_ct_zone(pb)) {
> > + update_local_lports(pb->logical_port, b_ctx_out,
> > + LPORT_STATUS_BOUND);
> > + }
> > break;
> >
> > case LP_VTEP:
> > @@ -3158,6 +3174,21 @@ handle_updated_port(struct binding_ctx_in *b_ctx_in,
> > update_related_lport(pb, b_ctx_out);
> > handled = consider_patch_port_for_local_datapaths(pb, b_ctx_in,
> > b_ctx_out);
> > + /* Handle dynamic changes to enable-router-port-acl option.
> > + * Check CT zone requirement if options updated. */
> > + if (sbrec_port_binding_is_updated(pb,
> > + SBREC_PORT_BINDING_COL_OPTIONS))
> > {
> > + tracked_datapath_lport_add(pb, TRACKED_RESOURCE_UPDATED,
> > + b_ctx_out->tracked_dp_bindings);
> > + if (patch_port_needs_ct_zone(pb)) {
> > + update_local_lports(pb->logical_port, b_ctx_out,
> > + LPORT_STATUS_BOUND);
> > + } else {
> > + /* Option was removed, remove from local_lports to
> > + * release CT zone */
> > + remove_local_lports(pb->logical_port, b_ctx_out);
> > + }
> > + }
> > break;
> >
> > case LP_VTEP:
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 14b9b9e28..b3317eb15 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -2486,20 +2486,23 @@ ct_zones_runtime_data_handler(struct engine_node
> > *node, void *data)
> > if (strcmp(t_lport->pb->type, "")
> > && strcmp(t_lport->pb->type, "localport")
> > && strcmp(t_lport->pb->type, "l3gateway")
> > - && strcmp(t_lport->pb->type, "localnet")) {
> > + && strcmp(t_lport->pb->type, "localnet")
> > + && strcmp(t_lport->pb->type, "patch")) {
> > /* We allocate zone-id's only to VIF, localport, l3gateway,
> > - * and localnet lports. */
> > - if (sbrec_port_binding_is_updated(t_lport->pb,
> > -
> > SBREC_PORT_BINDING_COL_TYPE)) {
> > - updated |=
> > ct_zone_handle_port_update(&ct_zones_data->ctx,
> > - t_lport->pb,
> > - false, &scan_start,
> > - min_ct_zone, max_ct_zone);
> > - }
> > -
> > + * localnet and patch (enabled ACL) lports. */
> > continue;
> > }
> >
> > + /* For patch ports without ACL flag, delete the ct_zone */
> > + if (!strcmp(t_lport->pb->type, "patch") &&
> > + !smap_get_bool(&t_lport->pb->options,
> > + "enable-router-port-acl", false)) {
> > + updated |= ct_zone_handle_port_update(&ct_zones_data->ctx,
> > + t_lport->pb,
> > + false, &scan_start,
> > + min_ct_zone, max_ct_zone);
> > + continue;
> > + }
> > bool port_updated =
> > t_lport->tracked_type == TRACKED_RESOURCE_NEW ||
> > t_lport->tracked_type == TRACKED_RESOURCE_UPDATED;
> > diff --git a/controller/physical.c b/controller/physical.c
> > index 6ac5dcd3f..a4bb29403 100644
> > --- a/controller/physical.c
> > +++ b/controller/physical.c
> > @@ -2012,8 +2012,10 @@ consider_port_binding(const struct physical_ctx *ctx,
> > ofpact_put_CT_CLEAR(ofpacts_p);
> > put_load(0, MFF_LOG_DNAT_ZONE, 0, 32, ofpacts_p);
> > put_load(0, MFF_LOG_SNAT_ZONE, 0, 32, ofpacts_p);
> > - put_load(0, MFF_LOG_CT_ZONE, 0, 16, ofpacts_p);
> > struct zone_ids peer_zones = get_zone_ids(peer, ctx->ct_zones);
> > + if (!peer_zones.ct) {
> > + put_load(0, MFF_LOG_CT_ZONE, 0, 16, ofpacts_p);
> > + }
>
> I don't think we need this change.
>
> > load_logical_ingress_metadata(peer, &peer_zones, ctx->n_encap_ips,
> > ctx->encap_ips, ofpacts_p, false);
> > put_load(0, MFF_LOG_FLAGS, 0, 32, ofpacts_p);
> > diff --git a/northd/northd.c b/northd/northd.c
> > index cdf12ec86..0dc631fe4 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -2856,6 +2856,11 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn
> > *ovnsb_txn,
> > } else {
> > smap_add(&new, "peer", router_port);
> > }
> > + /* Add enable-router-port-acl option for CT zone alloc
> > + * to patch ports that need ACL processing. */
> > + if (op->enable_router_port_acl) {
> > + smap_add(&new, "enable-router-port-acl", "true");
> > + }
> > }
> > if (chassis) {
> > smap_add(&new, "l3gateway-chassis", chassis);
> > diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> > index 76f73d96e..742d7df5f 100644
> > --- a/tests/system-ovn.at
> > +++ b/tests/system-ovn.at
> > @@ -19283,3 +19283,148 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> > /.*terminating with signal 15.*/d"])
> > AT_CLEANUP
> > ])
> > +
> > +OVN_FOR_EACH_NORTHD([
> > +AT_SETUP([Distributed gw port CT zone allocation])
> > +CHECK_CONNTRACK()
> > +
> > +ovn_start
> > +
> > +# Topo: vm1 -- ls -- lr -- ext (with ACLs on ext_lr)
> > +
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +ADD_BR([br-int])
> > +ADD_BR([br-ext], [set Bridge br-ext fail-mode=standalone])
> > +
> > +# Set external-ids in br-int needed for ovn-controller
> > +check ovs-vsctl \
> > + -- set Open_vSwitch . external-ids:system-id=hv1 \
> > + -- set Open_vSwitch .
> > external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> > + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> > + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> > + -- set bridge br-int fail-mode=secure
> > other-config:disable-in-band=true \
> > + -- set Open_vSwitch .
> > external-ids:ovn-bridge-mappings=phynet:br-ext
> > +
> > +start_daemon ovn-controller
> > +
> > +# Create logical switches: internal and external networks
> > +check ovn-nbctl ls-add ls
> > +check ovn-nbctl ls-add ext
> > +
> > +# Add localnet port to external switch for physical connectivity
> > +check ovn-nbctl lsp-add-localnet-port ext ln_port phynet
> > +
> > +# Add VM port to internal switch
> > +check ovn-nbctl lsp-add ls vm1 \
> > + -- lsp-set-addresses vm1 "00:00:00:00:00:11 192.168.100.11"
> > +
> > +# Create logical_router_ports: lr_ls and lr_ext
> > +check ovn-nbctl lr-add lr
> > +check ovn-nbctl lrp-add lr lr_ls 00:00:00:00:01:01 192.168.100.1/24
> > +check ovn-nbctl lrp-add lr lr_ext 00:00:00:00:01:02 172.16.0.1/24
> > +check ovn-nbctl lrp-set-gateway-chassis lr_ext hv1
> > +
> > +# Create peer logical_switch_ports: ls_lr and ext_lr
> > +check ovn-nbctl lsp-add ls ls_lr \
> > + -- lsp-set-type ls_lr router \
> > + -- lsp-set-addresses ls_lr router \
> > + -- lsp-set-options ls_lr router-port=lr_ls
> > +
> > +check ovn-nbctl lsp-add ext ext_lr \
> > + -- lsp-set-type ext_lr router \
> > + -- lsp-set-addresses ext_lr router \
> > + -- lsp-set-options ext_lr router-port=lr_ext
> > enable_router_port_acl=true
> > +
> > +# Create port group for external gateway ACLs
> > +check ovn-nbctl pg-add gw_pg ext_lr
> > +
> > +# Add ACL rule to enable connection tracking for external gateway traffic
> > +# Allowing ICMP from vm1 to external
> > +check ovn-nbctl acl-add gw_pg to-lport 2001 "inport == @gw_pg && ip4 &&
> > icmp4" allow-related
> > +check ovn-nbctl acl-add gw_pg to-lport 1001 "inport == @gw_pg && ip4" drop
> > +check ovn-nbctl acl-add gw_pg to-lport 1001 "outport == @gw_pg && ip4" drop
> > +check ovn-nbctl --wait=sb sync
> > +
> > +# Create namespaces vm1
> > +ADD_NAMESPACES(vm1)
> > +ADD_VETH(vm1, vm1, br-int, "192.168.100.11/24", "00:00:00:00:00:11", \
> > + "192.168.100.1")
> > +
> > +# Create namespace external
> > +ADD_NAMESPACES(external)
> > +ADD_VETH(external, external, br-ext, "172.16.0.100/24",
> > "00:00:00:00:01:10", \
> > + "172.16.0.1")
> > +NS_CHECK_EXEC([external], [ip route add 192.168.100.0/24 via 172.16.0.1])
>
> We don't need this, 172.16.0.1 is already the default gateway in the
> external namespace.
>
> > +
> > +wait_for_ports_up
>
> I'd add an ovn-nbctl --wait=hv sync here, and remove the --wait=sb above.
>
> > +
> > +# ext_lr has enable-router-port-acl option
> > +AT_CHECK([ovn-sbctl get Port_Binding ext_lr
> > options:enable-router-port-acl], [0], [dnl
> > +["true"]
> > +])
> > +
> > +# ls_lr does not have enable-router-port-acl option
> > +AT_CHECK([ovn-sbctl --columns=options list Port_Binding ls_lr | grep
> > enable-router-port-acl], [1], [])
> > +
>
> I'd write these as:
>
> # Check that ext_lr has enable_router_port_acl option.
> AT_CHECK([ovn-sbctl get Port_Binding ext_lr options:enable_router_port_acl],
> [0], [dnl
> "true"
> ])
>
> # Check that ls_lr does not have enable_router_port_acl option.
> AT_CHECK([ovn-sbctl --columns options list Port_Binding ls_lr | grep -q
> enable_router_port_acl], [1], [])
>
> > +# Allocated CT zone for ext_lr
> > +AT_CHECK([ovn-appctl -t ovn-controller ct-zone-list | sed "s/ [[0-9]]*/
> > ??/" | sort], [0], [stdout])
>
> Nit: no need to sort.
>
> > +AT_CHECK([grep "ext_lr" stdout], [0], [dnl
> > +ext_lr ??
> > +])
> > +
> > +# No CT zone for ls_lr
> > +AT_CHECK([grep "ls_lr" stdout], [1], [])
> > +
> > +# Generate ICMP traffic to create conntrack entries
> > +NS_CHECK_EXEC([vm1], [ping -q -c 2 -W 1 172.16.0.100 | FORMAT_PING], \
> > +[0], [dnl
> > +2 packets transmitted, 2 received, 0% packet loss, time 0ms
> > +])
> > +
> > +# Check that ICMP conntrack entries exist
> > +zone_id=$(ovn-appctl -t ovn-controller ct-zone-list | grep ext_lr | cut -d
> > ' ' -f2)
> > +AT_CHECK_UNQUOTED([ovs-appctl dpctl/dump-conntrack |
> > FORMAT_CT(172.16.0.100) | grep "zone=$zone_id"], [0], [dnl
> > +icmp,orig=(src=192.168.100.11,dst=172.16.0.100,id=<cleared>,type=8,code=0),reply=(src=172.16.0.100,dst=192.168.100.11,id=<cleared>,type=0,code=0),zone=$zone_id
> > +])
> > +
> > +# Test TCP traffic - should be blocked by ACLs
> > +OVS_START_L7([external], [http])
> > +AT_CHECK([ip netns exec vm1 wget 172.16.0.100:80 -t 3 -T 1
> > --retry-connrefused], [4], [ignore], [ignore])
>
> On my test machine this fails because wget exits with 0 even if we
> failed to get the page.
>
> We can simplify the check and use nc instead:
>
> NETNS_DAEMONIZE([external], [nc -l -k 172.16.0.100 80], [external.pid])
> NS_CHECK_EXEC([vm1], [nc -z 172.16.0.100 80], [1])
>
> This seems to be more reliable.
>
> > +
> > +# Remove enable_router_port_acl and verify CT zone release
> > +check ovn-nbctl lsp-set-options ext_lr router-port=lr_ext
> > +check ovn-nbctl --wait=sb sync
> > +AT_CHECK([ovn-appctl -t ovn-controller ct-zone-list | grep "ext_lr"], [1],
> > [])
> > +
> > +# Re-enable enable_router_port_acl and verify CT zone reallocation
> > +check ovn-nbctl lsp-set-options ext_lr router-port=lr_ext
> > enable_router_port_acl=true
> > +check ovn-nbctl --wait=sb sync
> > +AT_CHECK([ovn-appctl -t ovn-controller ct-zone-list | sed "s/ [[0-9]]*/
> > ??/" | sort], [0], [stdout])
> > +AT_CHECK([grep "ext_lr" stdout], [0], [dnl
> > +ext_lr ??
> > +])
> > +
> > +# Test ICMP traffic still works after dynamic ACL changes
> > +NS_CHECK_EXEC([vm1], [ping -q -c 2 -W 1 172.16.0.100 | FORMAT_PING], \
> > +[0], [dnl
> > +2 packets transmitted, 2 received, 0% packet loss, time 0ms
> > +])
> > +
> > +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> > +
> > +as ovn-sb
> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> > +
> > +as ovn-nb
> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> > +
> > +as northd
> > +OVS_APP_EXIT_AND_WAIT([ovn-northd])
>
> We should use OVN_CLEANUP_NORTHD instead.
>
> > +
> > +as
> > +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> > +/connection dropped.*/d"])
> > +
> > +AT_CLEANUP
> > +])
> > +])
>
> I took care of the minor issues above (also changed some other minor style
> issues) and applied the patch to main.
>
> I also pushed the patch to 25.09. It didn't apply cleanly on 25.03,
> 24.09 and 24.03, would you happen to have some time to prepare a custom
> backport patch for those branches?
>
> Regards,
> Dumitru
>
Thanks, will do. I’ll work on backport patches for 25.03, 24.09 and 24.03.
Regards,
Liushy
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev