On Fri, Jul 25, 2025 at 8:19 AM Frode Nordahl <fnord...@ubuntu.com> wrote:
>
> While there is a hardware offload friendly approach to fixing
> processing of fragmented traffic to load balancers in 8e6f9a8355e2
> ("northd: Fix HW offload problem related to ct_tuple."), the change
> is quite invasive, and not compatible with our requirement to make
> UDP fragmentation work for switch egress pipeline on user space
> data path in existing releases.
>
> In a subsequent patch an optional workaround to this problem will be
> provided, which relies on these fields being available.

Hi Frode, thanks for the patch.

To my understanding, this patch itself is the workaround for supporting
fragmented UDP in ACLs by just putting fields like ct_udp.dst in the match,
correct? The work-in-progress [0] seems to be the real solution instead of
a workaround. Could you clarify a little?

>
> The documentation for OVN Northbound ACL table already states that
> the match column uses the same expression language as the OVN
> Southbound Logical_Flow match column.
>
> As we know, the use of these fields may be problematic for hardware
> offload, consequently a note is added to discourage their use.
>
> Related: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/2115795
> Related: https://issues.redhat.com/browse/FDP-684
> Signed-off-by: Frode Nordahl <fnord...@ubuntu.com>
> ---
>
> As discussed in the IRC meeting July 24th, I post a prerequisite patch
> in anticipation of a patch that provides a workaround for a bug
> preventing fragmented UDP packets to pass ACLs that match on protocol
> and port numbers.
>
> There is a work in progress available [0], and we intend to propose
> this optional workaround behind a NB_Global bug flag which can be used
> until a CT_EXTRACT based stage is made available for the switch egress
> pipeline.
>

Just want to clarify here. I think the point (as emphasized by @Ilya
Maximets ) was that the CT fields are at a lower layer. So the real reason
for this workaround is not about missing CT_EXTRACT stage but about the
complexity of parsing and rewriting tcp/udp/sctp fields from user defined
ACLs to use CT metadata. Of course if we move forward to implement the
rewriting approach, CT_EXTRACT stage would be needed to make sure HW
offload works.

> 0: https://github.com/sombrafam/ovn/commits/fix-fragmentation-for-acls/
>
>  lib/logical-fields.c | 15 +++++++++++++++
>  ovn-nb.xml           |  8 ++++++++
>  tests/ovn.at         |  7 +++++++
>  3 files changed, 30 insertions(+)
>
> diff --git a/lib/logical-fields.c b/lib/logical-fields.c
> index e479a78c1..efbc75b72 100644
> --- a/lib/logical-fields.c
> +++ b/lib/logical-fields.c
> @@ -359,6 +359,21 @@ ovn_init_symtab(struct shash *symtab)
>
>      expr_symtab_add_ovn_field(symtab, "icmp4.frag_mtu",
OVN_ICMP4_FRAG_MTU);
>      expr_symtab_add_ovn_field(symtab, "icmp6.frag_mtu",
OVN_ICMP6_FRAG_MTU);
> +
> +    expr_symtab_add_field(symtab, "ct_proto", MFF_CT_NW_PROTO,
> +                          "ct.trk", false);
> +
> +    expr_symtab_add_predicate(symtab, "ct_udp", "ct_proto == 17");
> +    expr_symtab_add_field(symtab, "ct_udp.dst", MFF_CT_TP_DST,
> +                          "ct_udp", false);
> +
> +    expr_symtab_add_predicate(symtab, "ct_tcp", "ct_proto == 6");
> +    expr_symtab_add_field(symtab, "ct_tcp.dst", MFF_CT_TP_DST,
> +                          "ct_tcp", false);
> +
> +    expr_symtab_add_predicate(symtab, "ct_sctp", "ct_proto == 132");
> +    expr_symtab_add_field(symtab, "ct_sctp.dst", MFF_CT_TP_DST,
> +                          "ct_sctp", false);
>  }
>

These symbols need to be documented in ovn-sb.xml.

>  const char *
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index 4a7581807..caf946eac 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -2576,6 +2576,14 @@ or
>          Note that you can not create an ACL matching on a port with
>          type=router or type=localnet.
>        </p>
> +
> +      <p>
> +        Note that matching directly on ct_state subfields such as
> +        <code>ct.new</code>, <code>ct.trk</code>, or connection tracker
> +        metadata fields such as <code>ct_proto</code>,
<code>ct_udp</code>,
> +        <code>ct_tcp</code>, <code>ct_sctp</code> is not recommended as
their
> +        use may impact performance and prevent hardware offload from
working.

As mentioned above, the reason for discouraging the use of these CT fields
is not only about performance, but also because they are at a lower level,
and related to internal implementation of the upper level (user-friendly)
fields. Shall we be clear about this in this note? Probably we should also
mention that it is only needed for supporting fragmented packets in ACL,
mainly for UDP.

In general I am ok with this workaround if we make these notes clear so
that the users understand the risks. But we should also understand that
once these fields are released, it would be hard to remove them in the
future considering backward compatibility. Thoughts?

Thanks,
Han

> +      </p>
>      </column>
>
>      <column name="action">
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 0dabec8d9..3ba958dec 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -154,7 +154,14 @@ ct_mark.natted = ct_mark[1]
>  ct_mark.obs_collector_id = ct_mark[16..23]
>  ct_mark.obs_stage = ct_mark[4..5]
>  ct_mark.skip_snat = ct_mark[2]
> +ct_proto = NXM_NX_CT_NW_PROTO
> +ct_sctp = ct_proto == 132
> +ct_sctp.dst = NXM_NX_CT_TP_DST
>  ct_state = NXM_NX_CT_STATE
> +ct_tcp = ct_proto == 6
> +ct_tcp.dst = NXM_NX_CT_TP_DST
> +ct_udp = ct_proto == 17
> +ct_udp.dst = NXM_NX_CT_TP_DST
>  ]])
>  AT_CLEANUP
>
> --
> 2.43.0
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to