Hey Ales,

On Mon, Jan 26, 2026 at 10:18 AM Ales Musil <[email protected]> wrote:

>
>
> On Fri, Jan 23, 2026 at 8:00 PM Erlon R. Cruz <[email protected]> wrote:
>
>> v2: Rebased on current upstream main, removed external python
>> code traffic generators, added scenario tests for: dhcp, negative
>> udp and ovn rule generation, documentation and many code clean ups.
>> DHCP traffic is being dropped for some reason. Still need to figure
>> that out for v3.
>>
>> At the current state, OVN can not handle fragmented traffic for ACLs
>> in the userspace datapath (DPDK). Just like in the case of LB
>> (commit 20a96b9), the kernel DP will try to reassemble the fragments
>> during CT lookup, however userspace won't reassemble them.
>>
>> This patch allows OVN to handle fragmented traffic by defining a
>> translation table on southbound that leverages OpenFlow connection
>> tracking capabilities. When a stateful flow is created on NB, we add
>> a hint in the flow. This hint will be read in SB and if the
>> connection tracking is set to be used, SB will use the alternative
>> translation table that will use the connection tracking information.
>>
>> This approach should not change the current behavior and it's only
>> enabled if acl_ct_translation is set:
>>
>> ovn-nbctl set NB_Global . options:acl_ct_translation=true
>>
>> Signed-off-by: Erlon R. Cruz <[email protected]>
>> ---
>>
>
> Hi Erlon,
>
> thank you for the v2. It needs a rebase. I have also some comments down
> below.
>
>
>>  NEWS                         |   5 +
>>  controller/lflow.c           |  24 ++-
>>  controller/lflow.h           |   1 +
>>  include/ovn/logical-fields.h |   3 +
>>  lib/logical-fields.c         |  67 ++++++
>>  northd/en-global-config.c    |   5 +
>>  northd/lflow-mgr.c           |  51 ++++-
>>  northd/lflow-mgr.h           |   8 +-
>>  northd/northd.c              |  41 +++-
>>  ovn-nb.xml                   |  20 ++
>>  tests/atlocal.in             |   3 +
>>  tests/ovn.at                 |  71 +++++++
>>  tests/system-ovn.at          | 389 +++++++++++++++++++++++++++++++++++
>>  13 files changed, 668 insertions(+), 20 deletions(-)
>>
>> diff --git a/NEWS b/NEWS
>> index 9883fb81d..7062701c8 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -1,5 +1,10 @@
>>  Post v25.09.0
>>  -------------
>> +   - Fixed support for fragmented traffic in the userspace datapath.
>> Added the
>> +     "acl_ct_translation" NB_Global option to enable connection tracking
>> +     based L4 field translation for stateful ACLs. When enabled allows
>> proper
>> +     handling of IP fragmentation in userspace datapaths. This option
>> breaks
>> +     hardware offloading and is disabled by default.
>>
>
> We are usually adding NEWS at the end.
>
>
>>     - Added support for TLS Server Name Indication (SNI) with the new
>>       --ssl-server-name option in OVN utilities and daemons. This allows
>>       specifying the server name for SNI, which is useful when connecting
>> diff --git a/controller/lflow.c b/controller/lflow.c
>> index 784a0d2dd..139181546 100644
>> --- a/controller/lflow.c
>> +++ b/controller/lflow.c
>> @@ -51,10 +51,19 @@ COVERAGE_DEFINE(consider_logical_flow);
>>  /* Contains "struct expr_symbol"s for fields supported by OVN lflows. */
>>  static struct shash symtab;
>>
>> +/* Alternative symbol table for ACL CT translation.
>> + * This symbol table maps L4 port fields (tcp/udp/sctp) to their
>> connection
>> + * tracking equivalents (ct_tp_src/ct_tp_dst with ct_proto predicates).
>> + * This allows matching on all IP fragments (not just the first fragment)
>> + * so that all fragments can be matched based on the connection tracking
>> state.
>> + */
>> +static struct shash acl_ct_symtab;
>> +
>>  void
>>  lflow_init(void)
>>  {
>>      ovn_init_symtab(&symtab);
>> +    ovn_init_acl_ct_symtab(&acl_ct_symtab);
>>  }
>>
>>  struct lookup_port_aux {
>> @@ -984,7 +993,16 @@ convert_match_to_expr(const struct
>> sbrec_logical_flow *lflow,
>>                       lflow->match);
>>          return NULL;
>>      }
>> -    struct expr *e = expr_parse_string(lex_str_get(&match_s), &symtab,
>> +
>> +    /* Check if this logical flow requires ACL CT translation.
>> +     * If the tags contains "acl_ct_translation"="true", we use the
>> +     * alternative symbol table that maps L4 fields (tcp/udp/sctp ports)
>> +     * to their CT equivalents. */
>> +    const char *ct_trans = smap_get(&lflow->tags, "acl_ct_translation");
>> +    struct shash *symtab_to_use = (ct_trans && !strcmp(ct_trans, "true")
>> +                                   ? &acl_ct_symtab : &symtab);
>>
>
> nit: smap_get_bool(..., false);
>
>
>> +
>> +    struct expr *e = expr_parse_string(lex_str_get(&match_s),
>> symtab_to_use,
>>                                         addr_sets, port_groups,
>> &addr_sets_ref,
>>                                         &port_groups_ref,
>>                                         ldp->datapath->tunnel_key,
>> @@ -1016,7 +1034,7 @@ convert_match_to_expr(const struct
>> sbrec_logical_flow *lflow,
>>              e = expr_combine(EXPR_T_AND, e, *prereqs);
>>              *prereqs = NULL;
>>          }
>> -        e = expr_annotate(e, &symtab, &error);
>> +        e = expr_annotate(e, symtab_to_use, &error);
>>      }
>>      if (error) {
>>          static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
>> @@ -2045,6 +2063,8 @@ lflow_destroy(void)
>>  {
>>      expr_symtab_destroy(&symtab);
>>      shash_destroy(&symtab);
>> +    expr_symtab_destroy(&acl_ct_symtab);
>> +    shash_destroy(&acl_ct_symtab);
>>  }
>>
>>  bool
>> diff --git a/controller/lflow.h b/controller/lflow.h
>> index 30c6dbebd..3e762d49b 100644
>> --- a/controller/lflow.h
>> +++ b/controller/lflow.h
>> @@ -156,6 +156,7 @@ struct lflow_ctx_out {
>>  };
>>
>>  void lflow_init(void);
>> +void ovn_init_acl_ct_symtab(struct shash *acl_symtab);
>>
>
> nit: Please remove this declaration it moved into logical-fields.h.
>
>
>>  void lflow_run(struct lflow_ctx_in *, struct lflow_ctx_out *);
>>  void lflow_handle_cached_flows(struct lflow_cache *,
>>                                 const struct sbrec_logical_flow_table *);
>> diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h
>> index f0d34196a..8f23249a0 100644
>> --- a/include/ovn/logical-fields.h
>> +++ b/include/ovn/logical-fields.h
>> @@ -236,6 +236,9 @@ const char *event_to_string(enum ovn_controller_event
>> event);
>>  int string_to_event(const char *s);
>>  const struct ovn_field *ovn_field_from_name(const char *name);
>>
>> +void ovn_init_acl_ct_symtab(struct shash *symtab);
>> +void expr_symtab_remove(struct shash *symtab, const char *name);
>> +
>>  /* OVN CT label values
>>   * ===================
>>   * These are specific ct.label bit values OVN uses to track different
>> types
>> diff --git a/lib/logical-fields.c b/lib/logical-fields.c
>> index c8bddcdc5..334a7e244 100644
>> --- a/lib/logical-fields.c
>> +++ b/lib/logical-fields.c
>> @@ -451,3 +451,70 @@ ovn_field_from_name(const char *name)
>>
>>      return shash_find_data(&ovnfield_by_name, name);
>>  }
>> +
>> +/* Removes the symbol with 'name' from 'symtab', freeing its memory. */
>> +void
>> +expr_symtab_remove(struct shash *symtab, const char *name)
>> +{
>> +    struct expr_symbol *symbol = shash_find_and_delete(symtab, name);
>> +    if (symbol) {
>> +        free(symbol->name);
>> +        free(symbol->prereqs);
>> +        free(symbol->predicate);
>> +        free(symbol);
>> +    }
>> +}
>> +
>> +/* Initialize a symbol table for ACL CT translation.
>> + * This creates an alternative symbol table that maps L4 port fields
>> + * (tcp/udp/sctp) to their connection tracking equivalents. This allows
>> + * matching on all IP fragments (not just the first) by using CT state
>> + * which is available for all fragments. */
>> +void
>> +ovn_init_acl_ct_symtab(struct shash *acl_symtab)
>> +{
>> +    /* Initialize with the standard symbol table. */
>> +    ovn_init_symtab(acl_symtab);
>> +
>> +    /* Remove the original tcp/udp/sctp symbols that we will override. */
>> +    expr_symtab_remove(acl_symtab, "tcp.src");
>> +    expr_symtab_remove(acl_symtab, "tcp.dst");
>> +    expr_symtab_remove(acl_symtab, "tcp");
>> +    expr_symtab_remove(acl_symtab, "udp.src");
>> +    expr_symtab_remove(acl_symtab, "udp.dst");
>> +    expr_symtab_remove(acl_symtab, "udp");
>> +    expr_symtab_remove(acl_symtab, "sctp.src");
>> +    expr_symtab_remove(acl_symtab, "sctp.dst");
>> +    expr_symtab_remove(acl_symtab, "sctp");
>> +
>> +    /* Add ct_proto field - CT original direction protocol. Used in the
>> +     * tcp/udp/sctp predicate expansions below. */
>> +    expr_symtab_add_field(acl_symtab, "ct_proto", MFF_CT_NW_PROTO,
>> +                          "ct.trk", false);
>> +
>> +    /* Override TCP protocol and port fields to use CT equivalents.
>> +     * When "tcp" is used as a predicate, it expands to "ct_proto == 6"
>> +     * instead of "ip.proto == 6". */
>> +    expr_symtab_add_predicate(acl_symtab, "tcp",
>> +                              "ct.trk && !ct.inv && ct_proto == 6");
>> +    expr_symtab_add_field(acl_symtab, "tcp.src", MFF_CT_TP_SRC,
>> +                          "tcp", false);
>> +    expr_symtab_add_field(acl_symtab, "tcp.dst", MFF_CT_TP_DST,
>> +                          "tcp", false);
>> +
>> +    /* Override UDP protocol and port fields */
>> +    expr_symtab_add_predicate(acl_symtab, "udp",
>> +                              "ct.trk && !ct.inv && ct_proto == 17");
>> +    expr_symtab_add_field(acl_symtab, "udp.src", MFF_CT_TP_SRC,
>> +                          "udp", false);
>> +    expr_symtab_add_field(acl_symtab, "udp.dst", MFF_CT_TP_DST,
>> +                          "udp", false);
>> +
>> +    /* Override SCTP protocol and port fields */
>> +    expr_symtab_add_predicate(acl_symtab, "sctp",
>> +                              "ct.trk && !ct.inv && ct_proto == 132");
>> +    expr_symtab_add_field(acl_symtab, "sctp.src", MFF_CT_TP_SRC,
>> +                          "sctp", false);
>> +    expr_symtab_add_field(acl_symtab, "sctp.dst", MFF_CT_TP_DST,
>> +                          "sctp", false);
>> +}
>> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
>> index 2556b2888..a6f37b0c3 100644
>> --- a/northd/en-global-config.c
>> +++ b/northd/en-global-config.c
>> @@ -717,6 +717,11 @@ check_nb_options_out_of_sync(
>>          return true;
>>      }
>>
>> +    if (config_out_of_sync(&nb->options, &config_data->nb_options,
>> +                           "acl_ct_translation", false)) {
>> +        return true;
>> +    }
>> +
>>      return false;
>>  }
>>
>> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c
>> index f21024903..66a519780 100644
>> --- a/northd/lflow-mgr.c
>> +++ b/northd/lflow-mgr.c
>> @@ -184,6 +184,7 @@ struct ovn_lflow {
>>      struct ovn_dp_group *dpg;    /* Link to unique Sb datapath group. */
>>      const char *where;
>>      const char *flow_desc;
>> +    bool acl_ct_translation;     /* Use CT-based L4 field translation. */
>>
>>      struct uuid sb_uuid;         /* SB DB row uuid, specified by northd.
>> */
>>      struct ovs_list referenced_by;  /* List of struct lflow_ref_node. */
>> @@ -725,7 +726,7 @@ lflow_ref_sync_lflows(struct lflow_ref *lflow_ref,
>>   * then it may corrupt the hmap.  Caller should ensure thread safety
>>   * for such scenarios.
>>   */
>> -static void
>> +static struct ovn_lflow *
>>  lflow_table_add_lflow__(struct lflow_table *lflow_table,
>>                         const struct ovn_synced_datapath *sdp,
>>                         const unsigned long *dp_bitmap, size_t
>> dp_bitmap_len,
>> @@ -798,9 +799,17 @@ lflow_table_add_lflow__(struct lflow_table
>> *lflow_table,
>>      ovn_dp_group_add_with_reference(lflow, sdp, dp_bitmap,
>> dp_bitmap_len);
>>
>>      lflow_hash_unlock(hash_lock);
>> +    return lflow;
>>  }
>>
>>  void
>> +ovn_lflow_set_acl_ct_translation(struct ovn_lflow *lflow,
>> +                                 bool acl_ct_translation)
>> +{
>> +    lflow->acl_ct_translation = acl_ct_translation;
>> +}
>> +
>> +struct ovn_lflow *
>>  lflow_table_add_lflow(struct lflow_table_add_args *args)
>>  {
>>      /* It is invalid for both args->dp_bitmap and args->sdp to be
>> @@ -811,11 +820,13 @@ lflow_table_add_lflow(struct lflow_table_add_args
>> *args)
>>          args->sdp = NULL;
>>      }
>>
>> -    lflow_table_add_lflow__(args->table, args->sdp, args->dp_bitmap,
>> -                            args->dp_bitmap_len, args->stage,
>> args->priority,
>> -                            args->match, args->actions, args->io_port,
>> -                            args->ctrl_meter, args->stage_hint,
>> args->where,
>> -                            args->flow_desc, args->lflow_ref);
>> +    return lflow_table_add_lflow__(args->table, args->sdp,
>> args->dp_bitmap,
>> +                                    args->dp_bitmap_len, args->stage,
>> +                                    args->priority, args->match,
>> +                                    args->actions, args->io_port,
>> +                                    args->ctrl_meter, args->stage_hint,
>> +                                    args->where, args->flow_desc,
>> +                                    args->lflow_ref);
>>  }
>>
>>  struct ovn_dp_group *
>> @@ -945,6 +956,7 @@ ovn_lflow_init(struct ovn_lflow *lflow,
>>      lflow->stage_hint = stage_hint;
>>      lflow->ctrl_meter = ctrl_meter;
>>      lflow->flow_desc = flow_desc;
>> +    lflow->acl_ct_translation = false;
>>      lflow->dpg = NULL;
>>      lflow->where = where;
>>      lflow->sb_uuid = sbuuid;
>> @@ -1122,12 +1134,20 @@ sync_lflow_to_sb(struct ovn_lflow *lflow,
>>          sbrec_logical_flow_set_match(sbflow, lflow->match);
>>          sbrec_logical_flow_set_actions(sbflow, lflow->actions);
>>          sbrec_logical_flow_set_flow_desc(sbflow, lflow->flow_desc);
>> -        if (lflow->io_port) {
>> +
>> +        /* Set tags for io_port and/or acl_ct_translation if needed. */
>> +        if (lflow->io_port || lflow->acl_ct_translation) {
>>              struct smap tags = SMAP_INITIALIZER(&tags);
>> -            smap_add(&tags, "in_out_port", lflow->io_port);
>> +            if (lflow->io_port) {
>> +                smap_add(&tags, "in_out_port", lflow->io_port);
>> +            }
>> +            if (lflow->acl_ct_translation) {
>> +                smap_add(&tags, "acl_ct_translation", "true");
>> +            }
>>              sbrec_logical_flow_set_tags(sbflow, &tags);
>>              smap_destroy(&tags);
>>          }
>> +
>>          sbrec_logical_flow_set_controller_meter(sbflow,
>> lflow->ctrl_meter);
>>
>>          /* Trim the source locator lflow->where, which looks something
>> like
>> @@ -1193,6 +1213,21 @@ sync_lflow_to_sb(struct ovn_lflow *lflow,
>>                  }
>>              }
>>          }
>> +
>> +        /* Update acl_ct_translation marker in tags if needed.
>> +         * This must be outside ovn_internal_version_changed check
>> because
>> +         * the option can be enabled/disabled at runtime. */
>> +        bool cur_has_ct_trans = smap_get_bool(&sbflow->tags,
>> +                                              "acl_ct_translation",
>> false);
>> +        if (lflow->acl_ct_translation != cur_has_ct_trans) {
>> +            if (lflow->acl_ct_translation) {
>> +                sbrec_logical_flow_update_tags_setkey(
>> +                    sbflow, "acl_ct_translation", "true");
>> +            } else {
>> +                sbrec_logical_flow_update_tags_delkey(
>> +                    sbflow, "acl_ct_translation");
>> +            }
>> +        }
>>      }
>>
>>      if (lflow->dp) {
>> diff --git a/northd/lflow-mgr.h b/northd/lflow-mgr.h
>> index 4a1655c35..16aa0dd07 100644
>> --- a/northd/lflow-mgr.h
>> +++ b/northd/lflow-mgr.h
>> @@ -24,6 +24,7 @@
>>  struct ovsdb_idl_txn;
>>  struct ovn_datapath;
>>  struct ovsdb_idl_row;
>> +struct ovn_lflow;
>>
>>  /* lflow map which stores the logical flows. */
>>  struct lflow_table {
>> @@ -89,7 +90,12 @@ struct lflow_table_add_args {
>>      const char *where;
>>  };
>>
>> -void lflow_table_add_lflow(struct lflow_table_add_args *args);
>> +struct ovn_lflow *lflow_table_add_lflow(struct lflow_table_add_args
>> *args);
>> +
>> +/* Set the acl_ct_translation flag on an lflow. When true,
>> ovn-controller will
>> + * use a symbol table that maps L4 port fields to CT equivalents. */
>> +void ovn_lflow_set_acl_ct_translation(struct ovn_lflow *lflow,
>> +                                      bool acl_ct_translation);
>>
>
> We should use the new macro approach e.g. WITH_CT_TRANSLATION
> as we have for io port etc. instead of defining some extra function.
>
>
>>
>>
>>  #define WITH_HINT(HINT) .stage_hint = HINT
>> diff --git a/northd/northd.c b/northd/northd.c
>> index d79fe40c9..52233260f 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -83,12 +83,18 @@ static bool check_lsp_is_up;
>>  static bool install_ls_lb_from_router;
>>
>>  /* Use common zone for SNAT and DNAT if this option is set to "true". */
>> -static bool use_common_zone = false;
>> +static bool use_common_zone;
>>
>>  /* If this option is 'true' northd will make use of ct.inv match fields.
>>   * Otherwise, it will avoid using it.  The default is true. */
>>  static bool use_ct_inv_match = true;
>>
>> +/* If this option is 'true' northd will flag the related ACL flows to use
>> + * connection tracking fields to properly handle IP fragments. By
>> default this
>> + * option is set to 'false'.
>> + */
>> +static bool acl_ct_translation = false;
>> +
>>  /* If this option is 'true' northd will implicitly add a lowest-priority
>>   * drop rule in the ACL stage of logical switches that have at least one
>>   * ACL.
>> @@ -7199,6 +7205,12 @@ consider_acl(struct lflow_table *lflows, const
>> struct ovn_datapath *od,
>>          match_tier_len = match->length;
>>      }
>>
>> +    /* Check if this ACL needs CT translation for fragment handling.
>> +     * All stateful ACLs are marked when the option is enabled; the
>> actual
>> +     * translation only affects L4 port fields in ovn-controller. */
>> +    bool needs_ct_trans = has_stateful && acl_ct_translation;
>> +    struct ovn_lflow *lflow;
>> +
>>      if (!has_stateful
>>          || !strcmp(acl->action, "pass")
>>          || !strcmp(acl->action, "allow-stateless")) {
>> @@ -7216,6 +7228,7 @@ consider_acl(struct lflow_table *lflows, const
>> struct ovn_datapath *od,
>>
>>          ds_put_cstr(actions, "next;");
>>          ds_put_format(match, "(%s)", acl->match);
>> +        /* Stateless ACLs don't need CT translation. */
>>          ovn_lflow_add(lflows, od, stage, priority, ds_cstr(match),
>>                        ds_cstr(actions), lflow_ref,
>> WITH_HINT(&acl->header_));
>>          return;
>> @@ -7284,8 +7297,10 @@ consider_acl(struct lflow_table *lflows, const
>> struct ovn_datapath *od,
>>                            (uint8_t) acl->network_function_group->id);
>>          }
>>          ds_put_cstr(actions, "next;");
>> -        ovn_lflow_add(lflows, od, stage, priority, ds_cstr(match),
>> -                      ds_cstr(actions), lflow_ref,
>> WITH_HINT(&acl->header_));
>> +        lflow = ovn_lflow_add(lflows, od, stage, priority,
>> ds_cstr(match),
>> +                              ds_cstr(actions), lflow_ref,
>> +                              WITH_HINT(&acl->header_));
>> +        ovn_lflow_set_acl_ct_translation(lflow, needs_ct_trans);
>>
>>          /* Match on traffic in the request direction for an established
>>           * connection tracking entry that has not been marked for
>> @@ -7314,8 +7329,10 @@ consider_acl(struct lflow_table *lflows, const
>> struct ovn_datapath *od,
>>                            (uint8_t) acl->network_function_group->id);
>>          }
>>          ds_put_cstr(actions, "next;");
>> -        ovn_lflow_add(lflows, od, stage, priority, ds_cstr(match),
>> -                      ds_cstr(actions), lflow_ref,
>> WITH_HINT(&acl->header_));
>> +        lflow = ovn_lflow_add(lflows, od, stage, priority,
>> ds_cstr(match),
>> +                              ds_cstr(actions), lflow_ref,
>> +                              WITH_HINT(&acl->header_));
>> +        ovn_lflow_set_acl_ct_translation(lflow, needs_ct_trans);
>>      } else if (!strcmp(acl->action, "drop")
>>                 || !strcmp(acl->action, "reject")) {
>>          if (acl->network_function_group) {
>> @@ -7340,8 +7357,10 @@ consider_acl(struct lflow_table *lflows, const
>> struct ovn_datapath *od,
>>          build_acl_sample_label_action(actions, acl, acl->sample_new,
>> NULL,
>>                                        obs_stage);
>>          ds_put_cstr(actions, "next;");
>> -        ovn_lflow_add(lflows, od, stage, priority, ds_cstr(match),
>> -                      ds_cstr(actions), lflow_ref,
>> WITH_HINT(&acl->header_));
>> +        lflow = ovn_lflow_add(lflows, od, stage, priority,
>> ds_cstr(match),
>> +                              ds_cstr(actions), lflow_ref,
>> +                              WITH_HINT(&acl->header_));
>> +        ovn_lflow_set_acl_ct_translation(lflow, needs_ct_trans);
>>          /* For an existing connection without ct_mark.blocked set, we've
>>           * encountered a policy change. ACLs previously allowed
>>           * this connection and we committed the connection tracking
>> @@ -7366,8 +7385,10 @@ consider_acl(struct lflow_table *lflows, const
>> struct ovn_datapath *od,
>>          ds_put_format(actions,
>>                        "ct_commit { ct_mark.blocked = 1; "
>>                        "ct_label.obs_point_id = %"PRIu32"; }; next;",
>> obs_pid);
>> -        ovn_lflow_add(lflows, od, stage, priority, ds_cstr(match),
>> -                      ds_cstr(actions), lflow_ref,
>> WITH_HINT(&acl->header_));
>> +        lflow = ovn_lflow_add(lflows, od, stage, priority,
>> ds_cstr(match),
>> +                              ds_cstr(actions), lflow_ref,
>> +                              WITH_HINT(&acl->header_));
>> +        ovn_lflow_set_acl_ct_translation(lflow, needs_ct_trans);
>>      }
>>  }
>>
>> @@ -20386,6 +20407,8 @@ ovnnb_db_run(struct northd_input *input_data,
>>
>>      use_ct_inv_match = smap_get_bool(input_data->nb_options,
>>                                       "use_ct_inv_match", true);
>> +    acl_ct_translation = smap_get_bool(input_data->nb_options,
>> +                                       "acl_ct_translation", false);
>>
>>      /* deprecated, use --event instead */
>>      controller_event_en = smap_get_bool(input_data->nb_options,
>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>> index e74c0d010..2b610ff1c 100644
>> --- a/ovn-nb.xml
>> +++ b/ovn-nb.xml
>> @@ -327,6 +327,26 @@
>>          </p>
>>        </column>
>>
>> +      <column name="options" key="acl_ct_translation">
>> +        <p>
>> +          If set to <code>true</code>, <code>ovn-northd</code> will
>> enable
>> +          connection tracking based L4 field translation for stateful
>> ACLs.
>> +          When enabled, ACL matches on L4 port fields (tcp/udp/sctp) use
>> +          connection tracking state instead of packet headers. This
>> allows
>> +          proper handling of IP fragmentation in userspace datapaths
>> (e.g.,
>> +          DPDK) where fragments are not automatically reassembled during
>> +          connection tracking lookup.
>> +        </p>
>> +        <p>
>> +          <em>Important:</em> Enabling this option breaks hardware
>> offloading
>> +          of flows. Only enable this option if you need to handle
>> fragmented
>> +          traffic with stateful ACLs in userspace datapaths.
>> +        </p>
>> +        <p>
>> +          The default value is <code>false</code>.
>> +        </p>
>> +      </column>
>> +
>>        <column name="options" key="default_acl_drop">
>>          <p>
>>            If set to <code>true</code>., <code>ovn-northd</code> will
>> diff --git a/tests/atlocal.in b/tests/atlocal.in
>> index 9a3cf2d16..f0d5eb243 100644
>> --- a/tests/atlocal.in
>> +++ b/tests/atlocal.in
>> @@ -189,6 +189,9 @@ find_command dhcpd
>>  # Set HAVE_DHCLIENT
>>  find_command dhclient
>>
>> +# Set HAVE_DNSMASQ
>> +find_command dnsmasq
>> +
>>  # Set HAVE_BFDD_BEACON
>>  find_command bfdd-beacon
>>
>> diff --git a/tests/ovn.at b/tests/ovn.at
>> index 445a74ce5..9399bcb5a 100644
>> --- a/tests/ovn.at
>> +++ b/tests/ovn.at
>> @@ -44152,3 +44152,74 @@ check ovn-nbctl --wait=hv lsp-set-type down_ext
>> localnet
>>  OVN_CLEANUP([hv1],[hv2])
>>  AT_CLEANUP
>>  ])
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([ACL CT translation - Rules])
>> +AT_KEYWORDS([acl_ct_translation_rules])
>> +ovn_start
>> +
>> +check ovn-nbctl ls-add ls1
>> +
>> +check ovn-nbctl lsp-add ls1 lp1 \
>> +    -- lsp-set-addresses lp1 "f0:00:00:00:00:01 10.0.0.1"
>> +check ovn-nbctl lsp-add ls1 lp2 \
>> +    -- lsp-set-addresses lp2 "f0:00:00:00:00:02 10.0.0.2"
>> +
>> +net_add n1
>> +sim_add hv1
>> +
>> +as hv1
>> +ovs-vsctl add-br br-phys
>> +ovn_attach n1 br-phys 192.168.0.1
>> +ovs-vsctl -- add-port br-int hv1-vif1 -- \
>> +    set interface hv1-vif1 external-ids:iface-id=lp1 ofport-request=1
>
> +ovs-vsctl -- add-port br-int hv1-vif2 -- \
>> +    set interface hv1-vif2 external-ids:iface-id=lp2 ofport-request=2
>>
>
> nit: Please drop the ofport-request.
>
>
>> +
>> +# Create port group and add ACLs with TCP/UDP matches
>> +lp1_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=lp1)
>> +lp2_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=lp2)
>> +check ovn-nbctl pg-add pg1 $lp1_uuid $lp2_uuid
>> +
>> +check ovn-nbctl acl-add pg1 from-lport 1002 \
>> +    "inport == @pg1 && ip4 && tcp && tcp.dst == 80" allow-related
>> +check ovn-nbctl acl-add pg1 from-lport 1002 \
>> +    "inport == @pg1 && ip4 && udp && udp.dst == 53" allow-related
>> +check ovn-nbctl acl-add pg1 to-lport 1002 \
>> +    "outport == @pg1 && ip4 && tcp && tcp.src == 80" allow-related
>> +check ovn-nbctl acl-add pg1 to-lport 1002 \
>> +    "outport == @pg1 && ip4 && udp && udp.src == 53" allow-related
>> +check ovn-nbctl --wait=hv --log acl-add pg1 to-lport 100 "outport ==
>> @pg1" drop
>> +
>> +# Verify lflows do NOT have acl_ct_translation tag when option is
>> disabled (default)
>> +AT_CHECK([ovn-sbctl --bare --columns=tags find Logical_Flow | grep -c
>> "acl_ct_translation"], [1], [ignore], [ignore])
>>
>
> nit: Please use "row_count" instead. That applies to the whole test.
>
>
>> +
>> +# Verify OpenFlows use packet header fields (tcp.dst, udp.dst) not CT
>> fields
>> +# When CT translation is OFF, we should see tp_dst matches
>> +as hv1
>> +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "tp_dst=80|tp_dst=53" |
>> head -1 | grep -q "tp_dst"], [0])
>>
>
> We can actually use getting a table directly via "ovn-debug
> lflow-stage-to-oftable <STAGE_NAME>".
> That also applies to the whole test.
>
> +
>> +# Now enable ACL CT translation
>> +check ovn-nbctl --wait=hv set NB_Global . options:acl_ct_translation=true
>> +
>> +# Verify lflows now have acl_ct_translation tag in the tags column
>> +AT_CHECK([ovn-sbctl --bare --columns=tags find Logical_Flow | grep -c
>> "acl_ct_translation"], [0], [ignore])
>>
> +
>> +# Verify OpenFlows now use CT fields (ct_tp_dst) instead of packet
>> headers
>> +# When CT translation is ON, we should see ct_tp_dst matches
>> +as hv1
>> +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E
>> "ct_tp_dst=80|ct_tp_dst=53" | head -1 | grep -q "ct_tp_dst"], [0])
>> +
>> +# Now disable ACL CT translation again
>> +check ovn-nbctl --wait=hv set NB_Global .
>> options:acl_ct_translation=false
>> +
>> +# Verify lflows no longer have acl_ct_translation tag
>> +AT_CHECK([ovn-sbctl --bare --columns=tags find Logical_Flow | grep -c
>> "acl_ct_translation"], [1], [ignore], [ignore])
>> +
>> +# Verify OpenFlows reverted to packet header fields
>> +as hv1
>> +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "tp_dst=80|tp_dst=53" |
>> head -1 | grep -q "tp_dst"], [0])
>> +
>> +OVN_CLEANUP([hv1])
>> +AT_CLEANUP
>> +])
>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>> index 5b3dc47fd..539a8fa3d 100644
>> --- a/tests/system-ovn.at
>> +++ b/tests/system-ovn.at
>> @@ -20347,3 +20347,392 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error
>> receiving.*/d
>>  /.*terminating with signal 15.*/d"])
>>  AT_CLEANUP
>>  ])
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([ACL CT translation - UDP fragmentation])
>> +AT_KEYWORDS([acl_ct_translation_udp_fragmentation])
>> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
>> +
>> +CHECK_CONNTRACK()
>> +
>> +ovn_start
>> +OVS_TRAFFIC_VSWITCHD_START()
>> +ADD_BR([br-int])
>> +
>> +# Set external-ids in br-int needed for ovn-controller
>> +check ovs-vsctl \
>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>> +        -- set Open_vSwitch .
>> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>> +        -- set bridge br-int fail-mode=secure
>> other-config:disable-in-band=true
>> +
>> +# Start ovn-controller
>> +start_daemon ovn-controller
>> +
>> +# Set the minimal fragment size for userspace DP.
>> +ovs-appctl dpctl/ipf-set-min-frag v4 500
>> +
>> +check ovn-nbctl ls-add internal
>> +check ovn-nbctl lsp-add internal client \
>> +    -- lsp-set-addresses client "f0:00:00:01:02:03 172.16.1.3"
>> +check ovn-nbctl lsp-add internal server \
>> +    -- lsp-set-addresses server "f0:00:0f:01:02:03 172.16.1.2"
>> +
>> +ADD_NAMESPACES(client)
>> +ADD_VETH(client, client, br-int, "172.16.1.3/24", "f0:00:00:01:02:03", \
>> +         "172.16.1.1")
>> +NS_EXEC([client], [ip l set dev client mtu 900])
>> +
>> +ADD_NAMESPACES(server)
>> +ADD_VETH(server, server, br-int, "172.16.1.2/24", "f0:00:0f:01:02:03", \
>> +         "172.16.1.1")
>> +NS_EXEC([server], [ip l set dev server mtu 900])
>> +
>> +# Create data file for traffic testing (8000 bytes will be fragmented
>> with MTU 900)
>> +printf %8000s > datafile
>> +
>> +# Create port group with both client and server
>> +client_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=client)
>> +server_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=server)
>> +check ovn-nbctl pg-add internal_vms $client_uuid $server_uuid
>> +
>> +# ACL rules - allow outgoing traffic
>> +check ovn-nbctl acl-add internal_vms from-lport 1002 "inport ==
>> @internal_vms && ip4 && udp" allow-related
>> +check ovn-nbctl acl-add internal_vms from-lport 1002 "inport ==
>> @internal_vms && ip4" allow-related
>> +
>> +# ACL rules - allow incoming UDP on specific ports
>> +check ovn-nbctl acl-add internal_vms to-lport 1002 "outport ==
>> @internal_vms && ip4 && udp && udp.dst == 5060" allow-related
>> +check ovn-nbctl acl-add internal_vms to-lport 1002 "outport ==
>> @internal_vms && ip4 && udp && udp.dst == 5061" allow-related
>> +
>> +# Allow ARP
>> +check ovn-nbctl acl-add internal_vms to-lport 1002 "outport ==
>> @internal_vms && arp" allow-related
>> +
>> +# Drop rule
>> +check ovn-nbctl --log --severity=info acl-add internal_vms to-lport 100
>> "outport == @internal_vms" drop
>> +
>> +# Enable ACL CT translation for fragmentation handling
>> +check ovn-nbctl --wait=hv set NB_Global . options:acl_ct_translation=true
>> +
>> +check ovn-nbctl --wait=hv sync
>> +check ovs-appctl dpctl/flush-conntrack
>> +
>> +# Start tcpdump to capture IP fragments on both sides
>> +NETNS_START_TCPDUMP([server], [-U -i server -Q in -nn ip and '(ip[[6:2]]
>> & 0x3fff != 0)'], [tcpdump-udp-server])
>> +NETNS_START_TCPDUMP([client], [-U -i client -Q in -nn ip and '(ip[[6:2]]
>> & 0x3fff != 0)'], [tcpdump-udp-client])
>> +
>> +# Start UDP listeners on both sides
>> +NETNS_DAEMONIZE([server], [nc -l -u 172.16.1.2 5060 > udp_server.rcvd],
>> [server.pid])
>> +NETNS_DAEMONIZE([client], [nc -l -u 172.16.1.3 5061 > udp_client.rcvd],
>> [client.pid])
>> +
>> +# Client sends to server (will be fragmented due to MTU 900)
>> +NS_CHECK_EXEC([client], [cat datafile | nc -u 172.16.1.2 5060 -q 1],
>> [0], [ignore], [ignore])
>>
>
> My nc version doesn't support -q. We probably need a different trick to
> make it wait.
>
>
>> +
>> +# Server sends to client (will be fragmented due to MTU 900)
>> +NS_CHECK_EXEC([server], [cat datafile | nc -u 172.16.1.3 5061 -q 1],
>> [0], [ignore], [ignore])
>> +
>> +OVS_WAIT_UNTIL([test -s udp_server.rcvd])
>> +OVS_WAIT_UNTIL([test -s udp_client.rcvd])
>> +
>> +# Verify both sides received data
>> +check cmp datafile udp_server.rcvd
>> +check cmp datafile udp_client.rcvd
>> +
>> +# Verify IP fragments were received on both sides (at least 5 fragments
>> confirms fragmentation)
>> +OVS_WAIT_UNTIL([test $(cat tcpdump-udp-server.tcpdump | wc -l) -ge 5])
>> +OVS_WAIT_UNTIL([test $(cat tcpdump-udp-client.tcpdump | wc -l) -ge 5])
>> +
>> +AS_BOX([UDP fragmentation test passed - bi-directional traffic with IP
>> fragments])
>>
>
> nit: We can drop this box the test would indicate failure.
>
>
>> +
>> +OVN_CLEANUP_CONTROLLER([hv1])
>> +OVN_CLEANUP_NORTHD
>> +
>> +as
>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>> +/connection dropped.*/d
>> +/WARN|netdev@ovs-netdev: execute.*/d
>> +/dpif|WARN|system@ovs-system: execute.*/d
>> +"])
>> +AT_CLEANUP
>> +])
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([ACL CT translation - TCP traffic])
>> +AT_KEYWORDS([acl_ct_translation_tcp_fragmentation])
>> +
>> +CHECK_CONNTRACK()
>> +
>> +ovn_start
>> +OVS_TRAFFIC_VSWITCHD_START()
>> +ADD_BR([br-int])
>> +
>> +# Set external-ids in br-int needed for ovn-controller
>> +check ovs-vsctl \
>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>> +        -- set Open_vSwitch .
>> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>> +        -- set bridge br-int fail-mode=secure
>> other-config:disable-in-band=true
>> +
>> +# Start ovn-controller
>> +start_daemon ovn-controller
>> +
>> +check ovn-nbctl ls-add internal
>> +check ovn-nbctl lsp-add internal client \
>> +    -- lsp-set-addresses client "f0:00:00:01:02:03 172.16.1.3"
>> +check ovn-nbctl lsp-add internal server \
>> +    -- lsp-set-addresses server "f0:00:0f:01:02:03 172.16.1.2"
>> +
>> +ADD_NAMESPACES(client)
>> +ADD_VETH(client, client, br-int, "172.16.1.3/24", "f0:00:00:01:02:03", \
>> +         "172.16.1.1")
>> +
>> +ADD_NAMESPACES(server)
>> +ADD_VETH(server, server, br-int, "172.16.1.2/24", "f0:00:0f:01:02:03", \
>> +         "172.16.1.1")
>> +
>> +# Create data file for traffic testing
>> +printf %8000s > datafile
>> +
>> +# Create port group with both client and server
>> +client_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=client)
>> +server_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=server)
>> +check ovn-nbctl pg-add internal_vms $client_uuid $server_uuid
>> +
>> +# ACL rules - allow outgoing traffic
>> +check ovn-nbctl acl-add internal_vms from-lport 1002 "inport ==
>> @internal_vms && ip4 && tcp" allow-related
>> +check ovn-nbctl acl-add internal_vms from-lport 1002 "inport ==
>> @internal_vms && ip4" allow-related
>> +
>> +# ACL rules - allow incoming TCP on specific ports
>> +check ovn-nbctl acl-add internal_vms to-lport 1002 "outport ==
>> @internal_vms && ip4 && tcp && tcp.dst == 8080" allow-related
>> +check ovn-nbctl acl-add internal_vms to-lport 1002 "outport ==
>> @internal_vms && ip4 && tcp && tcp.dst == 8081" allow-related
>> +
>> +# Allow ARP
>> +check ovn-nbctl acl-add internal_vms to-lport 1002 "outport ==
>> @internal_vms && arp" allow-related
>> +
>> +# Drop rule
>> +check ovn-nbctl --log --severity=info acl-add internal_vms to-lport 100
>> "outport == @internal_vms" drop
>> +
>> +# Enable ACL CT translation
>> +check ovn-nbctl --wait=hv set NB_Global . options:acl_ct_translation=true
>> +
>> +check ovn-nbctl --wait=hv sync
>> +check ovs-appctl dpctl/flush-conntrack
>> +
>> +# Test client -> server direction
>> +NETNS_DAEMONIZE([server], [nc -l 172.16.1.2 8080 > tcp_server.rcvd],
>> [server.pid])
>> +NS_CHECK_EXEC([client], [cat datafile | nc 172.16.1.2 8080 -q 1], [0],
>> [ignore], [ignore])
>> +
>> +OVS_WAIT_UNTIL([test -s tcp_server.rcvd])
>> +check cmp datafile tcp_server.rcvd
>> +
>> +# Clean up first direction
>> +kill $(cat server.pid) 2>/dev/null || true
>> +
>> +# Test server -> client direction
>> +NETNS_DAEMONIZE([client], [nc -l 172.16.1.3 8081 > tcp_client.rcvd],
>> [client.pid])
>> +NS_CHECK_EXEC([server], [cat datafile | nc 172.16.1.3 8081 -q 1], [0],
>> [ignore], [ignore])
>> +
>> +OVS_WAIT_UNTIL([test -s tcp_client.rcvd])
>> +check cmp datafile tcp_client.rcvd
>> +
>> +AS_BOX([TCP traffic test passed - bi-directional traffic])
>> +
>> +OVN_CLEANUP_CONTROLLER([hv1])
>> +OVN_CLEANUP_NORTHD
>> +
>> +as
>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>> +/connection dropped.*/d
>> +"])
>> +AT_CLEANUP
>> +])
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([ACL CT translation - negative test])
>>
>
> The UDP + TCP + negative use the same infrastructure, maybe
> we can squash them in a single test that just changes the ACL rules,
> WDYT?
>
>
I was conducting all the tests within the same infrastructure, but this
made it more difficult to debug and identify which tests were failing, and
the test was becoming very large.



> +AT_KEYWORDS([acl_ct_translation_udp_fragmentation_negative])
>> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
>> +
>> +CHECK_CONNTRACK()
>> +
>> +ovn_start
>> +OVS_TRAFFIC_VSWITCHD_START()
>> +ADD_BR([br-int])
>> +
>> +# Set external-ids in br-int needed for ovn-controller
>> +check ovs-vsctl \
>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>> +        -- set Open_vSwitch .
>> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>> +        -- set bridge br-int fail-mode=secure
>> other-config:disable-in-band=true
>> +
>> +# Start ovn-controller
>> +start_daemon ovn-controller
>> +
>> +check ovn-nbctl ls-add internal
>> +check ovn-nbctl lsp-add internal client \
>> +    -- lsp-set-addresses client "f0:00:00:01:02:03 172.16.1.3"
>> +check ovn-nbctl lsp-add internal server \
>> +    -- lsp-set-addresses server "f0:00:0f:01:02:03 172.16.1.2"
>> +
>> +ADD_NAMESPACES(client)
>> +ADD_VETH(client, client, br-int, "172.16.1.3/24", "f0:00:00:01:02:03", \
>> +         "172.16.1.1")
>> +
>> +ADD_NAMESPACES(server)
>> +ADD_VETH(server, server, br-int, "172.16.1.2/24", "f0:00:0f:01:02:03", \
>> +         "172.16.1.1")
>> +
>> +# Create port group with both client and server
>> +client_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=client)
>> +server_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=server)
>> +check ovn-nbctl pg-add internal_vms $client_uuid $server_uuid
>> +
>> +# ACL rules - allow outgoing traffic
>> +check ovn-nbctl acl-add internal_vms from-lport 1002 "inport ==
>> @internal_vms && ip4 && udp" allow-related
>> +check ovn-nbctl acl-add internal_vms from-lport 1002 "inport ==
>> @internal_vms && ip4" allow-related
>> +
>> +# ACL rules - allow incoming UDP ONLY on port 5060 (NOT 4000)
>> +check ovn-nbctl acl-add internal_vms to-lport 1002 "outport ==
>> @internal_vms && ip4 && udp && udp.dst == 5060" allow-related
>> +
>> +# Allow ARP
>> +check ovn-nbctl acl-add internal_vms to-lport 1002 "outport ==
>> @internal_vms && arp" allow-related
>> +
>> +# Drop rule - this should drop traffic on port 4000
>> +check ovn-nbctl --log --severity=info acl-add internal_vms to-lport 100
>> "outport == @internal_vms" drop
>> +
>> +# Enable ACL CT translation
>> +check ovn-nbctl --wait=hv set NB_Global . options:acl_ct_translation=true
>> +
>> +check ovn-nbctl --wait=hv sync
>> +check ovs-appctl dpctl/flush-conntrack
>> +
>> +# Start tcpdump to capture any incoming packets on server
>> +NETNS_START_TCPDUMP([server], [-U -i server -Q in -nn udp port 4000],
>> [tcpdump-drop-server])
>> +
>> +# Start UDP listener on server (on port 4000 which is NOT allowed by
>> ACLs)
>> +NETNS_DAEMONIZE([server], [nc -l -u 172.16.1.2 4000 > udp_drop.rcvd],
>> [drop_server.pid])
>> +
>> +# Client sends to server on disallowed port
>> +NS_CHECK_EXEC([client], [echo "test" | timeout 2 nc -u 172.16.1.2 4000
>> -q 1], [0], [ignore], [ignore])
>> +
>> +# Wait a bit for any packets to arrive
>> +sleep 2
>> +
>> +# Verify server did NOT receive any data (file should be empty or not
>> exist)
>> +AT_CHECK([test ! -s udp_drop.rcvd], [0])
>> +
>> +# Verify tcpdump captured no packets (traffic was dropped by ACL)
>> +AT_CHECK([test $(cat tcpdump-drop-server.tcpdump | wc -l) -eq 0], [0])
>> +
>> +AS_BOX([Negative test passed - traffic on disallowed port was dropped])
>> +
>> +OVN_CLEANUP_CONTROLLER([hv1])
>> +OVN_CLEANUP_NORTHD
>> +
>> +as
>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>> +/connection dropped.*/d
>> +"])
>> +AT_CLEANUP
>> +])
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([ACL CT translation with DHCP traffic])
>> +AT_KEYWORDS([acl_ct_translation_dhcp_fragmentation])
>> +AT_SKIP_IF([test $HAVE_DNSMASQ = no])
>> +AT_SKIP_IF([test $HAVE_DHCLIENT = no])
>>
>
> If we want to keep this test we need to install both dnsmasq and dhclient
> to the container images.
>
>
>> +
>> +CHECK_CONNTRACK()
>> +
>> +ovn_start
>> +OVS_TRAFFIC_VSWITCHD_START()
>> +ADD_BR([br-int])
>> +
>> +# Set external-ids in br-int needed for ovn-controller
>> +check ovs-vsctl \
>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>> +        -- set Open_vSwitch .
>> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>> +        -- set bridge br-int fail-mode=secure
>> other-config:disable-in-band=true
>> +
>> +# Start ovn-controller
>> +start_daemon ovn-controller
>> +
>> +# Create logical switch
>> +check ovn-nbctl ls-add ls1
>> +
>> +# Create DHCP server port
>> +check ovn-nbctl lsp-add ls1 dhcp-server \
>> +    -- lsp-set-addresses dhcp-server "00:00:00:00:02:00 192.168.1.2"
>> +
>> +# Create DHCP client port
>> +check ovn-nbctl lsp-add ls1 dhcp-client \
>> +    -- lsp-set-addresses dhcp-client "00:00:00:00:02:01"
>> +
>> +# Add OVS ports
>> +ADD_NAMESPACES(server, client)
>> +ADD_VETH(server, server, br-int, "192.168.1.2/24", "00:00:00:00:02:00")
>> +ADD_VETH(client, client, br-int, "0.0.0.0/0", "00:00:00:00:02:01")
>> +
>> +# Bind logical ports to OVS ports
>> +check ovs-vsctl set Interface ovs-server
>> external_ids:iface-id=dhcp-server
>> +check ovs-vsctl set Interface ovs-client
>> external_ids:iface-id=dhcp-client
>> +
>> +# Create port group with both ports
>> +server_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=dhcp-server)
>> +client_uuid=$(ovn-nbctl --bare --columns=_uuid find logical_switch_port
>> name=dhcp-client)
>> +check ovn-nbctl pg-add dhcp_vms $server_uuid $client_uuid
>> +
>> +# Add ACL rules - allow outgoing traffic from both ports
>> +check ovn-nbctl acl-add dhcp_vms from-lport 1002 "inport == @dhcp_vms &&
>> ip4" allow-related
>> +check ovn-nbctl acl-add dhcp_vms from-lport 1002 "inport == @dhcp_vms &&
>> udp" allow-related
>> +
>> +# Add ACL rules for DHCP traffic (ports 67 and 68)
>> +check ovn-nbctl acl-add dhcp_vms to-lport 1002 "outport == @dhcp_vms &&
>> ip4 && udp && udp.dst == 67" allow-related
>> +check ovn-nbctl acl-add dhcp_vms to-lport 1002 "outport == @dhcp_vms &&
>> ip4 && udp && udp.dst == 68" allow-related
>> +
>> +# Allow ARP and broadcast
>> +check ovn-nbctl acl-add dhcp_vms to-lport 1002 "outport == @dhcp_vms &&
>> arp" allow-related
>> +
>> +# Add drop rule
>> +check ovn-nbctl --log --severity=info acl-add dhcp_vms to-lport 100
>> "outport == @dhcp_vms" drop
>> +
>> +# Enable ACL CT translation
>> +check ovn-nbctl --wait=hv set NB_Global .
>> options:acl_ct_translation=false
>> +
>> +# Wait for flows to be installed
>> +check ovn-nbctl --wait=hv sync
>> +
>> +# Start dnsmasq as DHCP server in the server namespace
>> +# dnsmasq doesn't have AppArmor restrictions like dhcpd
>> +NETNS_DAEMONIZE([server], [dnsmasq --no-daemon --no-resolv --no-hosts \
>> +    --interface=server --bind-interfaces \
>> +    --dhcp-range=192.168.1.100,192.168.1.100,255.255.255.0,1h \
>> +    --dhcp-option=option:router,192.168.1.2 \
>> +    --log-dhcp], [dnsmasq.pid])
>> +
>> +# Give dnsmasq time to start
>> +sleep 1
>> +
>> +# Request IP via DHCP using dhclient
>> +NS_CHECK_EXEC([client], [dhclient -1 -v client], [0], [ignore], [ignore])
>>
>
> This is leaking the dhclient, in other words it keeps running even when
> the test is done.
>
>
Good catch.


> +
>> +# Verify client got an IP address from DHCP
>> +NS_CHECK_EXEC([client], [ip addr show client | grep -q "192.168.1.100"],
>> [0])
>> +
>> +AS_BOX([DHCP test passed - client received IP 192.168.1.100 via DHCP
>> with ACL CT translation enabled])
>> +
>> +OVN_CLEANUP_CONTROLLER([hv1])
>> +
>> +OVN_CLEANUP_NORTHD
>> +
>> +as
>> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>> +/connection dropped.*/d
>> +/dnsmasq.*/d
>> +"])
>> +AT_CLEANUP
>> +])
>> --
>> 2.43.0
>>
>>
>
> I was able to figure out why is it happening, dunno if we should keep the
> test around
> but it can be reproduced even with multicast UDP traffic so it might be
> enough to adjust
> the UDP test to also send multicast.  The following two logical flows are
> to "blame":
>
>
Well, I think the broader the coverage, the better. The advantage of having
a DHCP client/server is that the traffic will contain all the nuances of
real traffic, and we don't have to encode all protocol data. For example,
the return traffic should be difficult to generate since it will not be
multicast. I tried to reuse NETNS_START_DHCPD, but I encountered various
permission errors and couldn't get it to work. Dnsmasq seems easier to use
and doesn't rely on an external config file.


>         /* Do not send multicast packets to conntrack. */
>         ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, "eth.mcast",
>                       "next;", lflow_ref);
>         ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, "eth.mcast",
>                       "next;", lflow_ref);
>
> We basically skip CT in case of multicast, and because the translation
> adds new
> prereq of +ct.trk we will never match that flow. I'm afraid that there
> isn't any
> easy/good solution. If we disable those flows we are imposing performance
> penalty on multicast traffic. If we say that's acceptable we need to
> clearly
> state that too in the documentation it's in the same ballpark with "broken"
> HW offload.
>

What if we discard it only for DHCP traffic? I got it passing with
something like this:

        if (acl_udp_ct_translation) {
            ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 120,
                          "eth.mcast && udp && "
                          "(udp.src == 67 || udp.src == 68 || "
                          "udp.dst == 67 || udp.dst == 68)",
                          REGBIT_CONNTRACK_DEFRAG" = 1; next;",
                          lflow_ref);
        }

This takes higher priority than the more general multicast rule, so
performance should not be a problem if we apply it for DHCP.


> Regards,
> Ales
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to