Hi Alexandra,

Thanks for the fix!

On 5/8/26 10:00 PM, Alexandra Rukomoinikova via dev wrote:
> ARP requests sent through localnet port from physical network are
> processed only on chassis hosting the HA group, while requests for
> distributed NAT are also distributed across chassis.
> 
> Previously, the case where a VIF port exists on the same logical switch
> connected to the physical network was not handled. Now, for each VIF port
> on such an external switch, ARP requests are processed on the chassis that
> hosts that port.
> 
> Code simply iterates over all ports within the logical switch, which is
> considered acceptable since there are typically not a lot other non-VIF
> ports on an external switch.
> 
> Fixed I-P: vif ports are processed incrementally, so if switch with connection
> to the physical network was created and then a vif port was added,
> ls_arp processing node would not work process this update.
>

Maybe "would not process this update" instead of "would not work..."?
> Fixes: 1b4058b ("northd: Process external arps on ha chassis.")

The Fixes format is not exactly correct, the correct way to format it is:

git log -1 --pretty=format:"CC: %an <%ae>%nFixes: %h (\"%s\")"
--abbrev=12 COMMIT_REF

As documented in the submitting-patches.rst file.

> Reported-at: https://redhat.atlassian.net/browse/FDP-3767

Our internal automation is "special".  This should actually be:
https://redhat.atlassian.net/browse/FDP-3774

> Signed-off-by: Alexandra Rukomoinikova <[email protected]>
> ---
>  northd/northd.c     | 25 ++++++++++++++++++
>  northd/northd.h     |  6 +++++
>  tests/ovn-northd.at | 17 +++++++-----
>  tests/system-ovn.at | 64 +++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 106 insertions(+), 6 deletions(-)
> 
> diff --git a/northd/northd.c b/northd/northd.c
> index 6ff505ca1..f55dbc1d5 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -4812,6 +4812,10 @@ ls_handle_lsp_changes(struct ovsdb_idl_txn 
> *ovnsb_idl_txn,
>              struct nbrec_logical_switch_port *new_nbsp = 
> changed_ls->ports[j];
>              op = ovn_port_find_in_datapath(od, new_nbsp);
>  
> +            if (!hmapx_is_empty(&od->phys_ports)) {
> +                goto fail;
> +            }
> +

This could go outside the loop, it doesn't depend on ports[j[.

However, this is a bit worrying, it will fail incremental processing for
any LSP addition if it's part of a localnet logical switch which will
have performance implications at scale I guess.

Can't we handle the case of "changed lsp" (or new/deleted lsp) in
ls_arp_northd_handler() too?  And then incrementally in
lflow_handle_ls_arp_changes().

>              if (!op) {
>                  if (!lsp_can_be_inc_processed(new_nbsp)) {
>                      goto fail;
> @@ -10066,6 +10070,8 @@ 
> build_drop_arp_nd_flows_for_unbound_router_ports(struct ovn_port *op,
>   * router connection ports that requires chassis residence.
>   * ARP requests coming from localnet/l2gateway ports
>   * allowed for processing on resident chassis only.
> + * If logical switch has VIF port, ARP requests are allowed
> + * to be processed on the chassis hosting this VIF port.
>   */
>  static void
>  build_lswitch_arp_chassis_resident(const struct ovn_datapath *od,
> @@ -10133,6 +10139,25 @@ build_lswitch_arp_chassis_resident(const struct 
> ovn_datapath *od,
>              }
>          }
>  
> +        HMAP_FOR_EACH (op, dp_node, &od->ports) {
> +            if (!port_is_vif(op)) {
> +                continue;
> +            }
> +
> +            for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> +                for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
> +                    ds_clear(&match);
> +                    ds_put_format(&match,
> +                                  REGBIT_EXT_ARP " == 1 && arp.tpa == %s "
> +                                  "&& is_chassis_resident(%s)",
> +                                  op->lsp_addrs[i].ipv4_addrs[j].addr_s,
> +                                  op->json_key);
> +                    ovn_lflow_add(lflows, od, S_SWITCH_IN_APPLY_PORT_SEC, 85,
> +                                  ds_cstr(&match), "next;", ar->lflow_ref);
> +                }
> +            }

I'm a bit worried about these new flows.  Also about the flows we added
for NATs in 16b79a66d2c3 ("northd: Restrict external ARP request to
logical_ip for dnat_and_snat.") (CC Ales).

We already add ARP responder flows for VIFs in table S_SWITCH_IN_ARP_ND_RSP.

Shouldn't we rework the original 1b4058b ("northd: Process external arps
on ha chassis.") and do the drop for ARPs with REGBIT_EXT_ARP == 1 in
the switch arp responder stage?

Then we wouldn't need to add another set of logical flows for each LSP
IP.  There are a few cases we need to carefully account for then but it
would be more scalable in my opinion.

What do you think?

> +        }
> +
>          ovn_lflow_add(lflows, od, S_SWITCH_IN_APPLY_PORT_SEC, 70,
>                        REGBIT_EXT_ARP" == 1", "drop;", ar->lflow_ref);
>      }
> diff --git a/northd/northd.h b/northd/northd.h
> index 81ab07600..7092f6001 100644
> --- a/northd/northd.h
> +++ b/northd/northd.h
> @@ -1178,6 +1178,12 @@ od_is_centralized(const struct ovn_datapath *od)
>      return !od->is_distributed;
>  }
>  
> +static inline bool
> +port_is_vif(const struct ovn_port *op)
> +{
> +    return op->sb ? !strcmp(op->sb->type, "") : 0;

Should be ": false" instead of ": 0".

Also, I'd rely on the NB type instead of the port_binding type here.

We could use this helper in a bunch of other places in northd.c where we
check LSP type.

> +}
> +
>  struct ovn_port *ovn_port_find(const struct hmap *ports, const char *name);
>  
>  void build_igmp_lflows(struct hmap *igmp_groups,
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 3f237b076..9d1d00380 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -19702,15 +19702,10 @@ check ovn-nbctl lr-add lr1
>  check ovn-nbctl lrp-add lr1 down_link f0:00:00:00:00:f1 192.168.1.1/24
>  
>  check ovn-nbctl ls-add ls1
> -check ovn-nbctl lsp-add ls1 up_link
>  check ovn-nbctl lsp-add ls1 down_vif1
>  check ovn-nbctl lsp-add ls1 down_vif2
>  check ovn-nbctl lsp-add ls1 down_ext
> -
> -check ovn-nbctl set Logical_Switch_Port up_link \
> -    type=router \
> -    options:router-port=down_link \
> -    addresses=router
> +check ovn-nbctl lsp-add-router-port ls1 up_link down_link
>  
>  check ovn-nbctl lsp-set-addresses down_vif1 'f0:00:00:00:00:01 192.168.1.101'
>  check ovn-nbctl lsp-set-addresses down_vif2 'f0:00:00:00:00:02 192.168.1.102'
> @@ -19728,6 +19723,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 | grep 
> ls_in_apply_port_sec | ovn_strip_lflow
>    table=??(ls_in_apply_port_sec), priority=50   , match=(reg0[[15]] == 1), 
> action=(drop;)
>    table=??(ls_in_apply_port_sec), priority=70   , match=(reg0[[22]] == 1), 
> action=(drop;)
>    table=??(ls_in_apply_port_sec), priority=75   , match=(reg0[[22]] == 1 && 
> is_chassis_resident("cr-down_link")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.101 && is_chassis_resident("down_vif1")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.102 && is_chassis_resident("down_vif2")), action=(next;)
>  ])
>  
>  # Check nat adding to dgr attached to logical switch trigger ls-arp flow.
> @@ -19740,6 +19737,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 | grep 
> ls_in_apply_port_sec | ovn_strip_lflow
>    table=??(ls_in_apply_port_sec), priority=70   , match=(reg0[[22]] == 1), 
> action=(drop;)
>    table=??(ls_in_apply_port_sec), priority=75   , match=(reg0[[22]] == 1 && 
> is_chassis_resident("cr-down_link")), action=(next;)
>    table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.0.3 && is_chassis_resident("down_vif1")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.101 && is_chassis_resident("down_vif1")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.102 && is_chassis_resident("down_vif2")), action=(next;)
>  ])
>  
>  check ovn-nbctl --wait=sb lr-nat-del lr1 dnat_and_snat 192.168.0.3
> @@ -19748,6 +19747,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 | grep 
> ls_in_apply_port_sec | ovn_strip_lflow
>    table=??(ls_in_apply_port_sec), priority=50   , match=(reg0[[15]] == 1), 
> action=(drop;)
>    table=??(ls_in_apply_port_sec), priority=70   , match=(reg0[[22]] == 1), 
> action=(drop;)
>    table=??(ls_in_apply_port_sec), priority=75   , match=(reg0[[22]] == 1 && 
> is_chassis_resident("cr-down_link")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.101 && is_chassis_resident("down_vif1")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.102 && is_chassis_resident("down_vif2")), action=(next;)
>  ])
>  
>  # Check changing logical port type to l2gateway.
> @@ -19757,6 +19758,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 | grep 
> ls_in_apply_port_sec | ovn_strip_lflow
>    table=??(ls_in_apply_port_sec), priority=50   , match=(reg0[[15]] == 1), 
> action=(drop;)
>    table=??(ls_in_apply_port_sec), priority=70   , match=(reg0[[22]] == 1), 
> action=(drop;)
>    table=??(ls_in_apply_port_sec), priority=75   , match=(reg0[[22]] == 1 && 
> is_chassis_resident("cr-down_link")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.101 && is_chassis_resident("down_vif1")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.102 && is_chassis_resident("down_vif2")), action=(next;)
>  ])
>  
>  # Check changing logical port type to vif.
> @@ -19773,6 +19776,8 @@ AT_CHECK([ovn-sbctl lflow-list ls1 | grep 
> ls_in_apply_port_sec | ovn_strip_lflow
>    table=??(ls_in_apply_port_sec), priority=50   , match=(reg0[[15]] == 1), 
> action=(drop;)
>    table=??(ls_in_apply_port_sec), priority=70   , match=(reg0[[22]] == 1), 
> action=(drop;)
>    table=??(ls_in_apply_port_sec), priority=75   , match=(reg0[[22]] == 1 && 
> is_chassis_resident("cr-down_link")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.101 && is_chassis_resident("down_vif1")), action=(next;)
> +  table=??(ls_in_apply_port_sec), priority=85   , match=(reg0[[22]] == 1 && 
> arp.tpa == 192.168.1.102 && is_chassis_resident("down_vif2")), action=(next;)
>  ])
>  
>  # Check changing removing logical port.
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index 582ed194b..0648c44b6 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -21814,3 +21814,67 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port 
> patch-.*/d
>  
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([VIF port connected to localnet network])
> +#
> +# Topology:
> +#    (fabric) -- localnet-port -- LS --- DGP(chassis2) -- LR
> +#                                 |
> +#                                 |
> +#                               VM (chassis1)
> +#
> +It is expected that ARP requests to this port are allowed on the chesis that 
> hosts this port.

Missing "#".

Typo: chesis

> +
> +ovn_start
> +OVS_TRAFFIC_VSWITCHD_START()
> +ADD_BR([br-int])
> +ADD_BR([br-ext])
> +
> +ovs-ofctl add-flow br-ext action=normal
> +# Set external-ids in br-int needed for ovn-controller
> +ovs-vsctl \

Missing "check".

> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> +        -- set Open_vSwitch . 
> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> +        -- set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext

Missing "\" at end of line which means we essentially won't configure
the line below.

> +        -- set bridge br-int fail-mode=secure 
> other-config:disable-in-band=true
> +
> +# Start ovn-controller
> +start_daemon ovn-controller
> +
> +check ovn-nbctl lr-add lr1
> +check ovn-nbctl ls-add public
> +
> +check ovn-nbctl lrp-add lr1 rp-public 00:00:02:01:02:03 172.31.1.1/24
> +check ovn-nbctl lsp-add-router-port public public-rp rp-public
> +check ovn-nbctl lsp-add-localnet-port public localnet phynet
> +check ovn-nbctl lrp-set-gateway-chassis rp-public hv2
> +check ovn-nbctl --wait=hv sync
> +
> +ADD_NAMESPACES(ext)
> +ADD_VETH(ext, ext, br-ext, "172.31.1.3/24", "f0:00:00:01:02:04", \
> +         "172.31.1.1")
> +
> +ADD_NAMESPACES(lsp)
> +ADD_VETH(lsp, lsp, br-int, "172.31.1.2/24", "f0:00:00:01:02:03", \
> +         "172.31.1.1")
> +
> +check ovn-nbctl lsp-add public lsp
> +check ovn-nbctl lsp-set-addresses lsp "f0:00:00:01:02:03 172.31.1.2"
> +check ovn-nbctl --wait=hv sync
> +
> +NS_CHECK_EXEC([ext], [ping -q -c 3 -i 0.3 -w 2 172.31.1.2 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +OVN_CLEANUP_CONTROLLER([hv1])
> +OVN_CLEANUP_NORTHD
> +as
> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> +/connection dropped.*/d"])
> +
> +AT_CLEANUP
> +])

Regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to