On 8/26/25 3:23 PM, Ales Musil wrote:
>
>
> On Tue, Aug 26, 2025 at 2:37 PM Ilya Maximets <[email protected]
> <mailto:[email protected]>> wrote:
>
> On 8/26/25 12:38 PM, Ales Musil wrote:
> > The SB MAC_Binding and Static_MAC_Binding use the same columns
> > as an indication of logical port and datapath. However, the check
> > if any of those rows is stale wasn't consistent. Unify it in a single
> > function to make sure that both tables have the same condition to
> > identify stale entries.
>
> Hi, Ales. Thanks for the fix!
>
>
> Hi Ilya,
> thank you for the review.
>
>
>
> The commit message is focusing on entirely wrong thing. Reading it
> it sounds like just a refactoring patch, which is not the case.
> Please, describe what the actual issue is and which condition is
> causing it.
>
>
> It's actually focusing on the right thing. I can add more details
> on the scenario when that happens, but it's still unification of
> the logic.
>
>
>
> I'm also not sure if we need to combine the checks for both types
> of mac bindings as it will complicate the code and make it more
> error prone, see below.
>
>
> I disagree, the issue is caused by a difference between how we remove
> Static MAC Bindings and MAC Bindings. There shouldn't be any difference
> whatsoever.
The dynamic MAC bindings are created by ovn-controller, while the static
ones are coming from the user and SB entries are created by northd.
The logic for creation of these entries is different. And the logic of
when they should be removed is also different. Dynamic entries must be
cleaned up when the datapath goes away, because it's the controller that
will need to re-create them. Static ones can just be updated without any
logical issues. So, I don't think we should unify the logic of removal.
>
>
>
> >
> > Fixes: 417f1415b2f5 ("northd: Clean up SB MAC bindings for deleted
> ports.")
>
> I don't think this commit should be in the Fixes tag, it achieved
> what it meant to fix, i.e. it solved the case of router removal.
> This patch is aiming to solve the router re-creation, which is,
> though related, a different use case.
>
>
> Again I don't agree, the fix solved just part of the problem.
The change fixes the issue described in the report and the commit message.
>
>
>
> > Fixes: 6856adc616a7 ("ovn-northd: Clear SB records depending on stale
> datapaths.")
>
> Is there a bug in handling of non-static MAC bindings? If not, then
> this tag should also not be here.
>
>
> As Dumitru pointed out there was a missing check for !op->nbrp
> unlikely to happen, but still missing.
It should not be possible to have a port without nbrp inside the lr_ports
hash map. This check is likely redundant after the hash map split.
>
>
>
> > Signed-off-by: Ales Musil <[email protected] <mailto:[email protected]>>
> > ---
> > northd/northd.c | 35 ++++++++++++++++++++++++++---------
> > tests/ovn-northd.at <http://ovn-northd.at> | 8 ++++++++
> > 2 files changed, 34 insertions(+), 9 deletions(-)
> >
> > diff --git a/northd/northd.c b/northd/northd.c
> > index 015f30a35..f9f132386 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -2867,6 +2867,26 @@ common:
> > }
> > }
> >
> > +static bool
> > +mac_binding_is_stale(const struct hmap *lr_datapaths,
> > + const struct hmap *lr_ports,
> > + const struct sbrec_datapath_binding *dp,
> > + const char *logical_port)
> > +{
> > + const struct ovn_datapath *od =
> > + ovn_datapath_from_sbrec(NULL, lr_datapaths, dp);
> > + if (!od || ovn_datapath_is_stale(od)) {
> > + return true;
> > + }
> > +
> > + const struct ovn_port *op = ovn_port_find(lr_ports, logical_port);
> > + if (!op || !op->nbrp) {
> > + return true;
> > + }
> > +
> > + return false;
> > +}
> > +
> > /* Remove mac_binding entries that refer to logical_ports which are
> > * deleted. */
> > static void
> > @@ -2876,11 +2896,8 @@ cleanup_mac_bindings(
> > {
> > const struct sbrec_mac_binding *b;
> > SBREC_MAC_BINDING_TABLE_FOR_EACH_SAFE (b, sbrec_mac_binding_table)
> {
> > - const struct ovn_datapath *od =
> > - ovn_datapath_from_sbrec(NULL, lr_datapaths, b->datapath);
> > -
> > - if (!od || ovn_datapath_is_stale(od) ||
> > - !ovn_port_find(lr_ports, b->logical_port)) {
> > + if (mac_binding_is_stale(lr_datapaths, lr_ports, b->datapath,
> > + b->logical_port)) {
> > sbrec_mac_binding_delete(b);
> > }
> > }
> > @@ -18850,7 +18867,7 @@ build_static_mac_binding_table(
> > struct ovsdb_idl_txn *ovnsb_txn,
> > const struct nbrec_static_mac_binding_table *nbrec_static_mb_table,
> > const struct sbrec_static_mac_binding_table *sbrec_static_mb_table,
> > - struct hmap *lr_ports)
> > + const struct hmap *lr_datapaths, const struct hmap *lr_ports)
> > {
> > /* Cleanup SB Static_MAC_Binding entries which do not have
> corresponding
> > * NB Static_MAC_Binding entries, and SB Static_MAC_Binding
> entries for
> > @@ -18866,8 +18883,8 @@ build_static_mac_binding_table(
> > continue;
> > }
> >
> > - struct ovn_port *op = ovn_port_find(lr_ports,
> nb_smb->logical_port);
> > - if (!op || !op->nbrp || !op->od || !op->od->sdp->sb_dp) {
> > + if (mac_binding_is_stale(lr_datapaths, lr_ports,
> sb_smb->datapath,
> > + sb_smb->logical_port)) {
>
> I don't think we should do that. The reason the code was written this
> way is so we can easily compare the conditions for creation below with
> the conditions for removal. Hiding the conditions and also changing
> them here creates a discrepancy in the logic between creation and the
> removal, which may cause similar issues in the future again.
>
>
> Again as said above the logic for removal of static and dynamic
> MBs shouldn't be different.
Again, I do not agree with that.
>
>
> The actual problem is not with the removal, but with the update below.
> The update logic doesn't consider that the datapath can change, but it
> can, because we search only by the names and the port and ip and the
> datapath is not part of the index.
>
>
> But the datapath isn't changed, it is removed and a new one is created.
From the point of view of the static MAC binding entry, the datapath
changed underneath it. The NB entry is not bound to the datapath it only
references the port name and if the port changed the datapath, then it's
the update for the MAC binding. MAC binding was never removed from the NB,
it's the same entry, just updated.
This is not the case for dynamic bindings that are created for a specific
datapath by the ovn-controller and hence must be removed on deletion of
that datapath.
>
>
>
> So, I think, we need to keep the addition and removal in sync as it is
> now and fix the logic for the update to set the new datapath reference
> in case it changed. And the correct Fixes tag should be:
>
>
> This would create more inconsistencies between static and dynamic MBs,
> which we shouldn't do.
Again, disagree. Static and dynamic entries are different entities.
One alternative solution would be to re-write the build of static mac bindings
in a way that we first create+update and then remove everything that wasn't
touched on that run. This will eliminate the duplication of checks between
creation and removal and likely make this whole discussion obsolete.
>
>
>
> Fixes: b22684d4e37c ("Add a northbound interface to program MAC_Binding
> table")
>
> Best regards, Ilya Maximets.
>
>
>
> Thanks,
> Ales
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev