Hi Xie Liu, Thanks for the patch!
On 11/7/25 5:48 AM, [email protected] wrote: > From: Xie Liu <[email protected]> > > Consider the case of stateful Firewall for N-S traffic: > > PUBLIC---S1-(S1-R1)---------(R1-S1)-R1 -------- S2 ---- VM1 > > Configuration: > > ovn-nbctl pg-add pg_dgw > ovn-nbctl pg-set-ports pg_dgw S1-R1 > ovn-nbctl acl-add pg_dgw from-lport 2000 "inport == @pg_dgw && ip4 && icmp4" > allow-related > ovn-nbctl acl-add pg_dgw from-lport 1000 "outport == @pg_dgw && ip4" drop > ovn-nbctl acl-add pg_dgw to-lport 1000 "outport == @pg_dgw && ip4" drop > ovn-nbctl lsp-set-options S1-R1 router-port=R1-S1 enable_router_port_acl=true > > VM1 pings external network. > > Through this patch[1], the ovn-controller assigned a CT zone ID > to the localnet LSP but not the dgw LSP. > > This caused ACL failures: ICMP reply packets from external networks > performed CT lookups in the wrong zone, couldn't match established > connections, and were incorrectly dropped. While I understand why the CT lookup in the wrong zone (in the VIF zone) causes issues, I have my concerns about this behavior change. > > Fix by ensuring ports without CT zone allocation use default zone 0, > preventing incorrect zone inheritance and restoring proper ACL behavior > for distributed gateway scenarios. > Default zone 0 is most likely already used by the host traffic. Which means we might have unexpected collisions here. Please see the ovn-kubernetes use case where traffic from the host itself is also sent through conntrack (also in zone 0). Would a correct approach be to instead allocate explicit zone ids for ports that have enable_router_port_acl=true? > [1]https://github.com/ovn-org/ovn/commit/5ae7d2cb60a50541e88e8f5c74a669e2aa7acdda > > Reported-at: https://github.com/ovn-org/ovn/issues/264 > Signed-off-by: Xie Liu <[email protected]> > --- > controller/physical.c | 11 ++++------- > 1 file changed, 4 insertions(+), 7 deletions(-) > > diff --git a/controller/physical.c b/controller/physical.c > index 6ac5dcd3f..127f21e59 100644 > --- a/controller/physical.c > +++ b/controller/physical.c > @@ -1141,9 +1141,9 @@ static void > put_zones_ofpacts(const struct zone_ids *zone_ids, struct ofpbuf *ofpacts_p) > { > if (zone_ids) { > - if (zone_ids->ct) { > - put_load(zone_ids->ct, MFF_LOG_CT_ZONE, 0, 16, ofpacts_p); > - } > + /* Always load CT zone (0 if unallocated) to ensure > + * consistent connection tracking metadata. */ > + put_load(zone_ids->ct, MFF_LOG_CT_ZONE, 0, 16, ofpacts_p); > if (zone_ids->dnat) { > put_load(zone_ids->dnat, MFF_LOG_DNAT_ZONE, 0, 32, ofpacts_p); > } > @@ -2012,7 +2012,6 @@ consider_port_binding(const struct physical_ctx *ctx, > ofpact_put_CT_CLEAR(ofpacts_p); > put_load(0, MFF_LOG_DNAT_ZONE, 0, 32, ofpacts_p); > put_load(0, MFF_LOG_SNAT_ZONE, 0, 32, ofpacts_p); > - put_load(0, MFF_LOG_CT_ZONE, 0, 16, ofpacts_p); > struct zone_ids peer_zones = get_zone_ids(peer, ctx->ct_zones); > load_logical_ingress_metadata(peer, &peer_zones, ctx->n_encap_ips, > ctx->encap_ips, ofpacts_p, false); > @@ -2560,9 +2559,7 @@ static void > local_set_ct_zone_and_output_pb(int tunnel_key, int64_t zone_id, > struct ofpbuf *ofpacts) > { > - if (zone_id) { > - put_load(zone_id, MFF_LOG_CT_ZONE, 0, 16, ofpacts); > - } > + put_load(zone_id, MFF_LOG_CT_ZONE, 0, 16, ofpacts); > put_load(tunnel_key, MFF_LOG_OUTPORT, 0, 32, ofpacts); > put_resubmit(OFTABLE_CHECK_LOOPBACK, ofpacts); > } It would also be great if you could add a test for this use case. Thank you! Regards, Dumitru _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
