On Fri, Aug 12, 2022 at 12:21 AM Ales Musil <amu...@redhat.com> wrote:
>
> On Thu, Aug 11, 2022 at 2:45 PM Dumitru Ceara <dce...@redhat.com> wrote:
>
> > On 8/10/22 18:37, Han Zhou wrote:
> > > On Tue, Aug 9, 2022 at 11:22 PM Ales Musil <amu...@redhat.com> wrote:
> > >>
> > >>
> > >>
> > >> On Wed, Aug 10, 2022 at 8:04 AM Han Zhou <hz...@ovn.org> wrote:
> > >>
> > >> Hi Han,
> > >>
> > >> thank you for the review, see my reply inline below.
> > >>
> > >>> On Tue, Aug 9, 2022 at 4:05 AM Ales Musil <amu...@redhat.com> wrote:
> > >>>>
> > >>>> Add MAC binding aging mechanism, that utilizes
> > >>>> the timestamp column of MAC_Binding table.
> > >>>> When the MAC binding exceeds the threshold it is
> > >>>> removed from SB DB, this is postponed only in case
> > >>>> we receive update ARP with update to MAC address.
> > >>>>
> > >>>> The threshold is configurable via option
> > >>>> "mac_binding_age_threshold" that can be specified
> > >>>> for each logical router. The option is defaulting to
> > >>>> 0 which means that by default the aging is disabled
> > >>>> and the MAC binding rows will be persisted the same
> > >>>> way as before.
> > >>>>
> > >>>> Reported-at: https://bugzilla.redhat.com/2084668
> > >>>> Signed-off-by: Ales Musil <amu...@redhat.com>
> > >>>> ---
> > >>>> v3: Rebase on top of current main.
> > >>>>     Update according to the final conclusion.
> > >>>> v4: Rebase on top of current main.
> > >>>>     Skip one row instead of all if time elapsed is negative.
> > >>>>     Add note to NEWS.
> > >>>> ---
> > >>>>  NEWS                       |   3 +
> > >>>>  northd/automake.mk         |   2 +
> > >>>>  northd/inc-proc-northd.c   |  13 ++++
> > >>>>  northd/mac-binding-aging.c | 151
> > +++++++++++++++++++++++++++++++++++++
> > >>>>  northd/mac-binding-aging.h |  33 ++++++++
> > >>>>  ovn-nb.xml                 |   7 ++
> > >>>>  6 files changed, 209 insertions(+)
> > >>>>  create mode 100644 northd/mac-binding-aging.c
> > >>>>  create mode 100644 northd/mac-binding-aging.h
> > >>>>
> > >>>> diff --git a/NEWS b/NEWS
> > >>>> index 20cea579e..eff8e472a 100644
> > >>>> --- a/NEWS
> > >>>> +++ b/NEWS
> > >>>> @@ -14,6 +14,9 @@ Post v22.06.0
> > >>>>      NAT-T UDP encapsulation. Requires OVS support for IPsec custom
> > > tunnel
> > >>>>      options (which will be available in OVS 2.18).
> > >>>>    - Removed possibility of disabling logical datapath groups.
> > >>>> +  - Added MAC binding aging mechanism, that is disabled by default.
> > >>>> +    It can be enabled per logical router with option
> > >>>> +    "mac_binding_age_threshold".
> > >>>>
> > >>>>  OVN v22.06.0 - 03 Jun 2022
> > >>>>  --------------------------
> > >>>> diff --git a/northd/automake.mk b/northd/automake.mk
> > >>>> index 4862ec7b7..81582867d 100644
> > >>>> --- a/northd/automake.mk
> > >>>> +++ b/northd/automake.mk
> > >>>> @@ -1,6 +1,8 @@
> > >>>>  # ovn-northd
> > >>>>  bin_PROGRAMS += northd/ovn-northd
> > >>>>  northd_ovn_northd_SOURCES = \
> > >>>> +       northd/mac-binding-aging.c \
> > >>>> +       northd/mac-binding-aging.h \
> > >>>>         northd/northd.c \
> > >>>>         northd/northd.h \
> > >>>>         northd/ovn-northd.c \
> > >>>> diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
> > >>>> index 43093cb5a..4a3699106 100644
> > >>>> --- a/northd/inc-proc-northd.c
> > >>>> +++ b/northd/inc-proc-northd.c
> > >>>> @@ -22,9 +22,11 @@
> > >>>>  #include "ip-mcast-index.h"
> > >>>>  #include "static-mac-binding-index.h"
> > >>>>  #include "lib/inc-proc-eng.h"
> > >>>> +#include "lib/mac-binding-index.h"
> > >>>>  #include "lib/ovn-nb-idl.h"
> > >>>>  #include "lib/ovn-sb-idl.h"
> > >>>>  #include "mcast-group-index.h"
> > >>>> +#include "northd/mac-binding-aging.h"
> > >>>>  #include "openvswitch/poll-loop.h"
> > >>>>  #include "openvswitch/vlog.h"
> > >>>>  #include "inc-proc-northd.h"
> > >>>> @@ -149,6 +151,8 @@ enum sb_engine_node {
> > >>>>   * avoid sparse errors. */
> > >>>>  static ENGINE_NODE(northd, "northd");
> > >>>>  static ENGINE_NODE(lflow, "lflow");
> > >>>> +static ENGINE_NODE(mac_binding_aging, "mac_binding_aging");
> > >>>> +static ENGINE_NODE(mac_binding_aging_waker,
> > > "mac_binding_aging_waker");
> > >>>>
> > >>>>  void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
> > >>>>                            struct ovsdb_idl_loop *sb)
> > >>>> @@ -211,12 +215,16 @@ void inc_proc_northd_init(struct ovsdb_idl_loop
> > > *nb,
> > >>>>      engine_add_input(&en_northd, &en_sb_load_balancer, NULL);
> > >>>>      engine_add_input(&en_northd, &en_sb_fdb, NULL);
> > >>>>      engine_add_input(&en_northd, &en_sb_static_mac_binding, NULL);
> > >>>> +    engine_add_input(&en_mac_binding_aging, &en_sb_mac_binding,
> > NULL);
> > >>>> +    engine_add_input(&en_mac_binding_aging, &en_northd, NULL);
> > >>>> +    engine_add_input(&en_mac_binding_aging,
> > > &en_mac_binding_aging_waker, NULL);
> > >>>>      engine_add_input(&en_lflow, &en_nb_bfd, NULL);
> > >>>>      engine_add_input(&en_lflow, &en_sb_bfd, NULL);
> > >>>>      engine_add_input(&en_lflow, &en_sb_logical_flow, NULL);
> > >>>>      engine_add_input(&en_lflow, &en_sb_multicast_group, NULL);
> > >>>>      engine_add_input(&en_lflow, &en_sb_igmp_group, NULL);
> > >>>>      engine_add_input(&en_lflow, &en_northd, NULL);
> > >>>> +    engine_add_input(&en_lflow, &en_mac_binding_aging, NULL);
> > >>>>
> > >>>>      struct engine_arg engine_arg = {
> > >>>>          .nb_idl = nb->idl,
> > >>>> @@ -235,6 +243,8 @@ void inc_proc_northd_init(struct ovsdb_idl_loop
> > > *nb,
> > >>>>          chassis_hostname_index_create(sb->idl);
> > >>>>      struct ovsdb_idl_index *sbrec_static_mac_binding_by_lport_ip
> > >>>>          = static_mac_binding_index_create(sb->idl);
> > >>>> +    struct ovsdb_idl_index *sbrec_mac_binding_by_datapath
> > >>>> +        = mac_binding_by_datapath_index_create(sb->idl);
> > >>>>
> > >>>>      engine_init(&en_lflow, &engine_arg);
> > >>>>
> > >>>> @@ -256,6 +266,9 @@ void inc_proc_northd_init(struct ovsdb_idl_loop
> > > *nb,
> > >>>>      engine_ovsdb_node_add_index(&en_sb_static_mac_binding,
> > >>>>
> > >  "sbrec_static_mac_binding_by_lport_ip",
> > >>>>
> > sbrec_static_mac_binding_by_lport_ip);
> > >>>> +    engine_ovsdb_node_add_index(&en_sb_mac_binding,
> > >>>> +                                "sbrec_mac_binding_by_datapath",
> > >>>> +                                sbrec_mac_binding_by_datapath);
> > >>>>  }
> > >>>>
> > >>>>  void inc_proc_northd_run(struct ovsdb_idl_txn *ovnnb_txn,
> > >>>> diff --git a/northd/mac-binding-aging.c b/northd/mac-binding-aging.c
> > >>>> new file mode 100644
> > >>>> index 000000000..e8217e8bc
> > >>>> --- /dev/null
> > >>>> +++ b/northd/mac-binding-aging.c
> > >>>> @@ -0,0 +1,151 @@
> > >>>> +/* Copyright (c) 2022, Red Hat, Inc.
> > >>>> + *
> > >>>> + * Licensed under the Apache License, Version 2.0 (the "License");
> > >>>> + * you may not use this file except in compliance with the License.
> > >>>> + * You may obtain a copy of the License at:
> > >>>> + *
> > >>>> + *     http://www.apache.org/licenses/LICENSE-2.0
> > >>>> + *
> > >>>> + * Unless required by applicable law or agreed to in writing,
> > software
> > >>>> + * distributed under the License is distributed on an "AS IS" BASIS,
> > >>>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
> > > implied.
> > >>>> + * See the License for the specific language governing permissions
> > and
> > >>>> + * limitations under the License.
> > >>>> + */
> > >>>> +
> > >>>> +#include <config.h>
> > >>>> +
> > >>>> +#include "lib/inc-proc-eng.h"
> > >>>> +#include "lib/ovn-nb-idl.h"
> > >>>> +#include "lib/ovn-sb-idl.h"
> > >>>> +#include "lib/timeval.h"
> > >>>> +#include "northd/mac-binding-aging.h"
> > >>>> +#include "northd/northd.h"
> > >>>> +#include "openvswitch/hmap.h"
> > >>>> +#include "openvswitch/poll-loop.h"
> > >>>> +#include "openvswitch/util.h"
> > >>>> +#include "openvswitch/vlog.h"
> > >>>> +
> > >>>> +VLOG_DEFINE_THIS_MODULE(mac_binding_aging);
> > >>>> +
> > >>>> +struct mac_binding_waker {
> > >>>> +    bool should_schedule;
> > >>>> +    long long next_wake_msec;
> > >>>> +};
> > >>>> +
> > >>>> +static struct mac_binding_waker waker;
> > >>>> +
> > >>>> +static void
> > >>>> +mac_binding_aging_run_for_datapath(const struct
> > > sbrec_datapath_binding *dp,
> > >>>> +                                   const struct nbrec_logical_router
> > > *nbr,
> > >>>> +                                   struct ovsdb_idl_index
> > > *mb_by_datapath,
> > >>>> +                                   int64_t now, int64_t *wake_delay)
> > >>>> +{
> > >>>> +    uint64_t threshold = smap_get_uint(&nbr->options,
> > >>>> +                                         "mac_binding_age_threshold",
> > >>>> +                                         0) * 1000;
> >
> > Nit: indentation.
> >
> > >>>> +    if (!threshold) {
> > >>>> +        return;
> > >>>> +    }
> > >>>> +
> > >>>> +    struct sbrec_mac_binding *mb_index_row =
> > >>>> +        sbrec_mac_binding_index_init_row(mb_by_datapath);
> > >>>> +    sbrec_mac_binding_index_set_datapath(mb_index_row, dp);
> > >>>> +
> > >>>> +    const struct sbrec_mac_binding *mb;
> > >>>> +    SBREC_MAC_BINDING_FOR_EACH_EQUAL (mb, mb_index_row,
> > > mb_by_datapath) {
> > >>>> +        int64_t elapsed = now - mb->timestamp;
> > >>>> +
> > >>>> +        if (elapsed < 0) {
> > >>>> +            continue;
> > >>>> +        } else if (elapsed >= threshold) {
> > >>>> +            sbrec_mac_binding_delete(mb);
> > >>>> +        } else {
> > >>>> +            *wake_delay = MIN(*wake_delay, threshold - elapsed);
> > >>>> +        }
> > >>>> +    }
> > >>>> +    sbrec_mac_binding_index_destroy_row(mb_index_row);
> > >>>> +}
> > >>>> +
> > >>>> +void
> > >>>> +en_mac_binding_aging_run(struct engine_node *node, void *data
> > > OVS_UNUSED)
> > >>>> +{
> > >>>> +    const struct engine_context *eng_ctx = engine_get_context();
> > >>>> +
> > >>>> +    if (!eng_ctx->ovnsb_idl_txn) {
> > >>>> +        return;
> > >>>> +    }
> > >>>> +
> > >>>> +    int64_t next_expire_msec = INT64_MAX;
> > >>>> +    int64_t now = time_wall_msec();
> > >>>> +    struct northd_data *northd_data = engine_get_input_data("northd",
> > > node);
> > >>>> +    struct ovsdb_idl_index *sbrec_mac_binding_by_datapath =
> > >>>> +
> > >  engine_ovsdb_node_get_index(engine_get_input("SB_mac_binding", node),
> > >>>> +                                    "sbrec_mac_binding_by_datapath");
> > >>>> +
> > >>>> +    struct ovn_datapath *od;
> > >>>> +    HMAP_FOR_EACH (od, key_node, &northd_data->datapaths) {
> > >>>> +        if (od->sb && od->nbr) {
> > >>>> +            mac_binding_aging_run_for_datapath(od->sb, od->nbr,
> > >>>> +
> > > sbrec_mac_binding_by_datapath,
> > >>>> +                                               now,
> > > &next_expire_msec);
> > >>>> +        }
> > >>>> +    }
> > >>>> +
> > >>>> +    if (next_expire_msec < INT64_MAX) {
> > >>>> +        waker.should_schedule = true;
> > >>>> +        waker.next_wake_msec = time_msec() + next_expire_msec;
> > >>>> +        /* Run the engine right after so the waker can reflect on the
> > > new wake
> > >>>> +         * time. */
> > >>>> +        poll_immediate_wake();
> > >>>
> > >>> Why would we immediately wake up? Wouldn't it cause busy-loop?  Shall
> > we
> > > use poll_timer_wait_until? Sorry if I missed anything here and please
> > > correct me.
> > >>
> > >>
> > >> That's because of the dependency of the nodes, we need to run the waker
> > > that will schedule the sleep.
> > >> We could probably use poll_timer_wait_until as well and it should have
> > > the same effect.
> > >>
> > > Thanks, I think I did misunderstand a little and now I got the idea of
> > the
> > > waker node. It is an input node just to trigger the timer, which wakes up
> > > the aging processing when needed and at the same time avoids running the
> > > aging scanning when the process is woken up for other reasons. So the
> > > poll_immediate_wake() here won't cause busy-loop. However, it does wake
> > up
> > > the process at least one extra time for each run, although the cost may
> > be
> > > small. I'd still use poll_timer_wait_until() here since we already know
> > the
> > > next_wake_msec.
> > >
> > > In addition, it is a little weird from I-P engine's point of view that
> > the
> > > output node execution updates the input node's data, but for this timer
> > use
> > > case I think it should work and I haven't come up with anything better.
> > It
> > > would be helpful to leave some comments for the waker node to explain the
> > > motivation and make this part easier to understand.
> > >
> >
> > +1 for both points (comment + poll_timer_wait_until()).
> >
> > Also, can we at least use engine_get_input_data() to get the input
> > node's data?  This will make it more clear that we're using data that's
> > related to the waker node.  The 'waker' variable would have to be moved
> > inside the en_mac_binding_aging_waker data.
> >
> > > Thanks,
> > > Han
> > >
> > >>>
> > >>>
> > >>>> +    } else {
> > >>>> +        waker.should_schedule = false;
> > >>>> +    }
> > >>>> +
> > >>>> +    /* This node is part of lflow, but lflow does not depend on it.
> > > Setting
> > >>>> +     * state as unchanged does not trigger lflow node when it is not
> > > needed. */
> > >>>
> > >>> This should work, but it is like a hack. Ideally we should allow the
> > I-P
> > > engine to have multiple "root" nodes. This can be a future improvement
> > > though. Maybe add a "XXX" comment here (a TODO item).
> > >>
> > >>
> > >> Yes unfortunately this is not the cleanest solution, but I did not come
> > > with a better one. I will add a TODO comment here about moving it once we
> > > allow multiple root nodes.
> > >>
> > >>>
> > >>>
> > >>> Thanks,
> > >>> Han
> > >>>
> > >>
> > >> Thanks,
> > >> Ales
> > >>
> > >>>
> > >>>
> > >>>> +    engine_set_node_state(node, EN_UNCHANGED);
> > >>>> +}
> > >>>> +
> > >>>> +void *
> > >>>> +en_mac_binding_aging_init(struct engine_node *node OVS_UNUSED,
> > >>>> +                          struct engine_arg *arg OVS_UNUSED)
> > >>>> +{
> > >>>> +    return NULL;
> > >>>> +}
> > >>>> +
> > >>>> +void
> > >>>> +en_mac_binding_aging_cleanup(void *data OVS_UNUSED)
> > >>>> +{
> > >>>> +}
> > >>>> +
> > >>>> +void
> > >>>> +en_mac_binding_aging_waker_run(struct engine_node *node, void *data
> > > OVS_UNUSED)
> > >>>> +{
> > >>>> +    if (!waker.should_schedule) {
> > >>>> +        return;
> > >>>> +    }
> > >>>> +
> > >>>> +    if (time_msec() >= waker.next_wake_msec) {
> > >>>> +        waker.should_schedule = false;
> > >>>> +        engine_set_node_state(node, EN_UPDATED);
> > >>>> +        return;
> > >>>> +    }
> > >>>> +
> > >>>> +    engine_set_node_state(node, EN_UNCHANGED);
> > >>>> +    poll_timer_wait_until(waker.next_wake_msec);
> > >>>> +}
> > >>>> +
> > >>>> +void *
> > >>>> +en_mac_binding_aging_waker_init(struct engine_node *node OVS_UNUSED,
> > >>>> +                                struct engine_arg *arg OVS_UNUSED)
> > >>>> +{
> > >>>> +    waker.should_schedule = false;
> > >>>> +    waker.next_wake_msec = 0;
> > >>>> +    return NULL;
> > >>>> +}
> > >>>> +
> > >>>> +void
> > >>>> +en_mac_binding_aging_waker_cleanup(void *data OVS_UNUSED)
> > >>>> +{
> > >>>> +}
> > >>>> diff --git a/northd/mac-binding-aging.h b/northd/mac-binding-aging.h
> > >>>> new file mode 100644
> > >>>> index 000000000..296a7ab38
> > >>>> --- /dev/null
> > >>>> +++ b/northd/mac-binding-aging.h
> > >>>> @@ -0,0 +1,33 @@
> > >>>> +/* Copyright (c) 2022, Red Hat, Inc.
> > >>>> + *
> > >>>> + * Licensed under the Apache License, Version 2.0 (the "License");
> > >>>> + * you may not use this file except in compliance with the License.
> > >>>> + * You may obtain a copy of the License at:
> > >>>> + *
> > >>>> + *     http://www.apache.org/licenses/LICENSE-2.0
> > >>>> + *
> > >>>> + * Unless required by applicable law or agreed to in writing,
> > software
> > >>>> + * distributed under the License is distributed on an "AS IS" BASIS,
> > >>>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
> > > implied.
> > >>>> + * See the License for the specific language governing permissions
> > and
> > >>>> + * limitations under the License.
> > >>>> + */
> > >>>> +
> > >>>> +#ifndef MAC_BINDING_AGING_H
> > >>>> +#define MAC_BINDING_AGING_H 1
> > >>>> +
> > >>>> +#include "lib/inc-proc-eng.h"
> > >>>> +
> > >>>> +/* The MAC binding aging node functions. */
> > >>>> +void en_mac_binding_aging_run(struct engine_node *node, void *data);
> > >>>> +void *en_mac_binding_aging_init(struct engine_node *node,
> > >>>> +                                struct engine_arg *arg);
> > >>>> +void en_mac_binding_aging_cleanup(void *data);
> > >>>> +
> > >>>> +/* The MAC binding aging waker node functions. */
> > >>>> +void en_mac_binding_aging_waker_run(struct engine_node *node, void
> > > *data);
> > >>>> +void *en_mac_binding_aging_waker_init(struct engine_node *node,
> > >>>> +                                      struct engine_arg *arg);
> > >>>> +void en_mac_binding_aging_waker_cleanup(void *data);
> > >>>> +
> > >>>> +#endif /* northd/mac-binding-aging.h */
> > >>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
> > >>>> index e26afd83c..d3fba9bdc 100644
> > >>>> --- a/ovn-nb.xml
> > >>>> +++ b/ovn-nb.xml
> > >>>> @@ -2392,6 +2392,13 @@
> > >>>>          and other sources. This way, OVN and the other sources can
> > > make use of
> > >>>>          the same conntrack zone.
> > >>>>        </column>
> > >>>> +
> > >>>> +      <column name="options" key="mac_binding_age_threshold"
> > >>>> +              type='{"type": "integer", "minInteger": 0,
> > > "maxInteger": 4294967295}'>
> > >>>> +        MAC binding aging <code>threshold</code> value in secs. MAC
> > > binding
> >
> > Nit: I'd use 'seconds'.
> >
> > Thanks,
> > Dumitru
> >
> > >>>> +        exceeding this timeout will be automatically removed. The
> > > value
> > >>>> +        defaults to 0, which means disabled.
> > >>>> +      </column>
> > >>>>      </group>
> > >>>>
> > >>>>      <group title="Common Columns">
> > >>>> --
> > >>>> 2.37.1
> > >>>>
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >> Ales Musil
> > >>
> > >> Senior Software Engineer - OVN Core
> > >>
> > >> Red Hat EMEA
> > >>
> > >> amu...@redhat.com    IM: amusil
> > >
> >
> >
> Thank you for the reviews.
>
> v5 should address most of the comments except the open one from Numan about
> upgrade.
>

IMO upgrading one is very important.  As I mentioned earlier,
openstack neutron or ovn-kubernetes would not know
which OVN version is being run and if all the ovn-controllers are
upgraded to the version which has the mac_binding timestamp
support.  This config option is not something which an administrator
can enable or disable depending on the upgrade status.

There have been a few  upgrade issues with OVN and because of which
openstack CMS had to do special handling.
This patch series can potentially affect the existing datapath traffic
after partial upgrades.  So I think its better we handle this
properly.

Numan

> Thanks,
> Ales
>
> --
>
> Ales Musil
>
> Senior Software Engineer - OVN Core
>
> Red Hat EMEA <https://www.redhat.com>
>
> amu...@redhat.com    IM: amusil
> <https://red.ht/sig>
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to