On Thu, Aug 28, 2025 at 5:52 PM Felix Huettner via dev <
[email protected]> wrote:

> When installing a new flow then its idle_age will start ticking from 0.
> We previously treated this as the flow being used and bumped the
> MAC_Binding
> timestamp. However if the flow has just been freshly installed it was
> never used and therefore should not be bumped.
> This is especially important for chassis that add the flows newly
> because they became part of the relevant datapath.
>
> In case of a large enough mac_binding_age_threshold and if the datapath
> is added and removed across chassis faster than the
> mac_binding_age_threshold this can actually lead to MAC_Binding that are
> never removed even though they are inactive.
>
> We also need to fix the test "MAC binding aging - probing GW router
> Dynamic Neigh" since it previously relied on the fact that a newly
> created flow directly triggers an update. This pushed the time of the
> first probe request sufficiently for the test to work. As we now no
> longer do this we can at most wait for 'cooldown_period' as afterwards a
> probe might be send immediately.
>
> Signed-off-by: Felix Huettner <[email protected]>
> ---
>

Hi Felix,

thank you for the fix. I would suggest slightly different approach
that is simpler IMO and doesn't increase memory usage.
See down below.


>  controller/mac-cache.c |  7 +++--
>  controller/mac-cache.h |  2 ++
>  tests/ovn.at           | 66 ++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 71 insertions(+), 4 deletions(-)
>
> diff --git a/controller/mac-cache.c b/controller/mac-cache.c
> index fcf5fa7d4..173e39145 100644
> --- a/controller/mac-cache.c
> +++ b/controller/mac-cache.c
> @@ -348,6 +348,7 @@ mac_binding_stats_process_flow_stats(struct vector
> *stats_vec,
>  {
>      struct mac_cache_stats stats = (struct mac_cache_stats) {
>          .idle_age_ms = ofp_stats->idle_age * 1000,
> +        .has_ever_hit = ofp_stats->idle_age < ofp_stats->duration_sec,
>

Instead of adding new bool we can simply do the following if at the
start of the "*_stats_process_flow_stats()":

    if (ofp_stats->idle_age == ofp_stats->duration_sec) {
        return;
    }

WDYT?

         .data.mb = (struct mac_binding_data) {
>              .cookie = ntohll(ofp_stats->cookie),
>              /* The port_key must be zero to match
> @@ -419,7 +420,7 @@ mac_binding_stats_run(
>
>          /* If "idle_age" is under threshold it means that the mac binding
> is
>           * used on this chassis. */
> -        if (stats->idle_age_ms < threshold->value) {
> +        if (stats->has_ever_hit && stats->idle_age_ms < threshold->value)
> {
>              if (since_updated_ms >= threshold->cooldown_period) {
>                  mac_binding_update_log("Updating active", &mb->data, true,
>                                         threshold, stats->idle_age_ms,
> @@ -452,6 +453,7 @@ fdb_stats_process_flow_stats(struct vector *stats_vec,
>  {
>      struct mac_cache_stats stats = (struct mac_cache_stats) {
>          .idle_age_ms = ofp_stats->idle_age * 1000,
> +        .has_ever_hit = ofp_stats->idle_age < ofp_stats->duration_sec,
>          .data.fdb = (struct fdb_data) {
>              .port_key = ofp_stats->match.flow.regs[MFF_LOG_INPORT -
> MFF_REG0],
>              .dp_key = ntohll(ofp_stats->match.flow.metadata),
> @@ -514,7 +516,7 @@ fdb_stats_run(struct rconn *swconn OVS_UNUSED,
>
>          /* If "idle_age" is under threshold it means that the fdb entry is
>           * used on this chassis. */
> -        if (stats->idle_age_ms < threshold->value) {
> +        if (stats->has_ever_hit && stats->idle_age_ms < threshold->value)
> {
>              if (since_updated_ms >= threshold->cooldown_period) {
>                  fdb_update_log("Updating active", &fdb->data, true,
>                                 threshold, stats->idle_age_ms,
> @@ -838,6 +840,7 @@ mac_binding_probe_stats_process_flow_stats(
>  {
>      struct mac_cache_stats stats = (struct mac_cache_stats) {
>          .idle_age_ms = ofp_stats->idle_age * 1000,
> +        .has_ever_hit = ofp_stats->idle_age < ofp_stats->duration_sec,
>          .data.mb = (struct mac_binding_data) {
>              .cookie = ntohll(ofp_stats->cookie),
>              /* The port_key must be zero to match
> diff --git a/controller/mac-cache.h b/controller/mac-cache.h
> index d7b5b9cd5..fa462e82c 100644
> --- a/controller/mac-cache.h
> +++ b/controller/mac-cache.h
> @@ -93,6 +93,8 @@ struct fdb {
>
>  struct mac_cache_stats {
>      int64_t idle_age_ms;
> +    bool has_ever_hit; /* Defines if the entry might have been hit at all,
> +                        * independent of idle_age_ms. */
>
>      union {
>          /* Common data to identify MAC binding. */
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 292ca0dae..81fbfc629 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -36578,6 +36578,68 @@ OVN_CLEANUP([hv1])
>  AT_CLEANUP
>  ])
>
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([MAC binding aging - unused entries])
> +ovn_start
> +
> +net_add n1
> +sim_add hv1
> +as hv1
> +check ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.1
> +ovn-appctl -t ovn-controller vlog/set mac_cache:file:dbg pinctrl:file:dbg
> +
> +check ovn-nbctl                                                     \
> +    -- ls-add ls1                                                   \
> +    -- lr-add lr                                                    \
> +    -- set logical_router lr options:mac_binding_age_threshold=4    \
> +    -- lrp-add lr lr-ls1 00:00:00:00:10:00 192.168.10.1/24          \
> +    -- lsp-add ls1 ls1-lr                                           \
> +    -- lsp-set-type ls1-lr router                                   \
> +    -- lsp-set-addresses ls1-lr router                              \
> +    -- lsp-set-options ls1-lr router-port=lr-ls1                    \
> +    -- lsp-add ls1 vif1                                             \
> +    -- lsp-set-addresses vif1 "00:00:00:00:10:10 192.168.10.10"
> +
> +check ovs-vsctl                                      \
> +    -- add-port br-int vif1                          \
> +    -- set interface vif1 external-ids:iface-id=vif1
> +
> +OVN_POPULATE_ARP
> +wait_for_ports_up
> +check ovn-nbctl --wait=hv sync
> +
> +dnl Wait for pinctrl thread to be connected.
> +OVS_WAIT_UNTIL([grep pinctrl hv1/ovn-controller.log | grep -q connected])
> +
> +dnl Create the MAC_Binding directly in the database to ensure it is never
> used.
> +dnl The timestamp is set to now.
> +dp=$(fetch_column Datapath_Binding _uuid external_ids:name=lr)
> +now=$(date +%s)000
> +check_uuid ovn-sbctl create MAC_Binding datapath=$dp logical_port=lr-ls1 \
> +           ip=192.168.10.20 'mac="00:00:00:11:22:33"' timestamp=$now
> +
> +dnl Wait until the flow is installed locally
> +OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int table=OFTABLE_MAC_CACHE_USE |
> grep 192.168.10.20])
> +
> +dnl Wait long enough to give ovn-controller the chance to decide if the
> entry
> +dnl should be bumped or not.
> +sleep 2
> +
> +dnl The entry should still be there and the timestamp should not have been
> +dnl updated
> +check_row_count mac_binding 1 ip="192.168.10.20"
> +new_ts=$(fetch_column MAC_Binding timestamp ip="192.168.10.20")
> +check test "$now" = "$new_ts"
> +
> +dnl After waiting until the age threshold the entry should be deleted
> +sleep 3
> +check_row_count mac_binding 0 ip="192.168.10.20"
> +
> +OVN_CLEANUP([hv1])
> +AT_CLEANUP
> +])
> +
>  OVN_FOR_EACH_NORTHD([
>  AT_SETUP([MAC binding aging - probing])
>  AT_SKIP_IF([test $HAVE_SCAPY = no])
> @@ -36844,7 +36906,7 @@ OVS_WAIT_UNTIL([$(ovs-ofctl dump-flows br-int
> table=OFTABLE_MAC_BINDING | \
>  n_arp=$(ovs-ofctl dump-flows br-int table=OFTABLE_MAC_CACHE_USE |awk
> '/arp_spa=192.168.20.2/{print <http://192.168.20.2/%7Bprint>
> substr($4,11,1)}')
>  # Check GW router does not send any ARP requests in this case.
>  OVS_WAIT_UNTIL([test $(ovs-ofctl dump-flows br-int
> table=OFTABLE_MAC_CACHE_USE | \
> -                       awk '/nw_src=192.168.10.100/{print
> <http://192.168.10.100/%7Bprint> substr($6,10,1)}') -ge $((aging_th/2))])
> +                       awk '/nw_src=192.168.10.100/{print
> <http://192.168.10.100/%7Bprint> substr($6,10,1)}') -ge
> $((aging_th*3/16))])
>  AT_CHECK([test $(ovs-ofctl dump-flows br-int table=OFTABLE_MAC_CACHE_USE
> |awk '/arp_spa=192.168.20.2/{print <http://192.168.20.2/%7Bprint>
> substr($4,11,1)}') -eq $n_arp])
>
>  send_imcp_echo_req hv1 public 00:00:00:00:10:00 00:00:00:00:10:1a
> 192.168.20.2 192.168.10.100
> @@ -36865,7 +36927,7 @@ check_row_count mac_binding 1
> mac=\"00:00:00:00:30:00\" ip=\"fd12::2\"
>
>  n_ipv6=$(ovs-ofctl dump-flows br-int table=OFTABLE_MAC_CACHE_USE |awk
> '/ipv6_src=fd12::2/{print substr($4,11,1)}')
>  OVS_WAIT_UNTIL([test $(ovs-ofctl dump-flows br-int
> table=OFTABLE_MAC_CACHE_USE | \
> -                       awk '/ipv6_src=fd12::2/{print substr($6,10,1)}')
> -ge $((aging_th/2))])
> +                       awk '/ipv6_src=fd12::2/{print substr($6,10,1)}')
> -ge $((aging_th*3/16))])
>  # Check GW router does not send any NS requests in this case.
>  AT_CHECK([test $(ovs-ofctl dump-flows br-int table=OFTABLE_MAC_CACHE_USE
> |awk '/ipv6_src=fd12::2/{print substr($4,11,1)}') -eq $n_ipv6])
>  send_icmp6_echo_req hv1 public 00:00:00:00:10:00 00:00:00:00:50:01
> fd12::2 fd11::64
>
> base-commit: 0935bd0fa6afa24ce013c08a2796758cd26d196b
> --
> 2.43.0
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Thanks,
Ales
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to