On 9/18/25 2:43 PM, Xavier Simonart wrote:
> Hi Rose
>
> Thanks for looking into this issue and providing a patch.
> I have two additional comments - see below
>
> Thanks
> Xavier
>
> On Thu, Sep 18, 2025 at 8:47 AM Ales Musil via dev <[email protected]
> <mailto:[email protected]>> wrote:
>
> On Thu, Sep 18, 2025 at 12:02 AM Rosemarie O'Riorden via dev <
> [email protected] <mailto:[email protected]>> wrote:
>
> > If you set a chassis' encap ip to be the same as that of another
> > chassis, OVN should detect it, log an error, and delete the chassis.
> >
> > But this was not the case; OVN would allow the transaction. However, the
> > commit to the southbound database would fail, because there was already
> > an encap record with that same ip and type (from the other chassis).
> > The OVS db on the problematic chassis would then rapidly increase in
> > size. Essentially, the controller would get stuck in a loop of deleting
> > and recreating the tunnels.
> >
> > This change makes it so before building encaps, the controller detects
> > if there is an existing encap tunnel of a certain ip and type. If so,
> > it logs an error, refuses to build the encaps, and deletes the
> > associated chassis record.
>
> Note that this is a different behavior from what OVN does today when the
> ovn-encap-ip is removed.
> If the ovn-encap-ip is removed, the ovn-controller will log a message, but
> will not delete the encap and chassis entries in sb.
> I think that removing the encap entry, as you propose, is a good thing to do,
> so I entered a different FDP tracking this part:
> https://issues.redhat.com/browse/FDP-1711
> <https://issues.redhat.com/browse/FDP-1711>.
> See below about deletion of the chassis record.
>
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-1481
> <https://issues.redhat.com/browse/FDP-1481>
> > Co-authored-by: Nicholas Hubbard <[email protected]
> <mailto:[email protected]>>
> > Signed-off-by: Nicholas Hubbard <[email protected]
> <mailto:[email protected]>>
> > Signed-off-by: Rosemarie O'Riorden <[email protected]
> <mailto:[email protected]>>
> > ---
> > v2: Fix typo in title.
> > ---
> >
>
> Hi Rosemarie,
>
> thank you for the patch. I have a couple of comments inline.
>
> controller/chassis.c | 97 +++++++++++++++++++++++++------------
> > controller/chassis.h | 3 +-
> > controller/ovn-controller.c | 5 +-
> > lib/chassis-index.c | 20 ++++++++
> > lib/chassis-index.h | 3 ++
> > tests/ovn-controller.at <http://ovn-controller.at> | 34
> ++++++++++++-
> > 6 files changed, 128 insertions(+), 34 deletions(-)
> >
> > diff --git a/controller/chassis.c b/controller/chassis.c
> > index 07bef1c56..ac204c779 100644
> > --- a/controller/chassis.c
> > +++ b/controller/chassis.c
> > @@ -77,6 +77,12 @@ struct ovs_chassis_cfg {
> > bool ct_label_flush;
> > };
> >
> > +enum chassis_update_status {
> > + CHASSIS_UPDATED,
> > + CHASSIS_NOT_UPDATED,
> > + CHASSIS_NEED_DELETE
> > +};
> > +
> > static void
> > ovs_chassis_cfg_init(struct ovs_chassis_cfg *cfg)
> > {
> > @@ -699,7 +705,8 @@ chassis_build_encaps(struct ovsdb_idl_txn
> > *ovnsb_idl_txn,
> > const char *encap_ip_default,
> > const char *chassis_id,
> > const char *encap_csum,
> > - size_t *n_encap)
> > + size_t *n_encap,
> > + struct ovsdb_idl_index *sbrec_encaps)
> > {
> > size_t tunnel_count = 0;
> >
> > @@ -713,6 +720,18 @@ chassis_build_encaps(struct ovsdb_idl_txn
> > *ovnsb_idl_txn,
> >
> > SSET_FOR_EACH (encap_ip, encap_ip_set) {
> > SSET_FOR_EACH (encap_type, encap_type_set) {
> > + const struct sbrec_encap *encap_rec =
> > + encap_lookup_by_ip_and_type(sbrec_encaps,
> > + encap_ip, encap_type);
> > + if (encap_rec && strcmp(encap_rec->chassis_name,
> chassis_id))
> > {
> > + static struct vlog_rate_limit rl =
> > VLOG_RATE_LIMIT_INIT(5, 1);
> > + VLOG_WARN_RL(&rl, "'%s' already has encap ip '%s' and "
> > + "type '%s', cannot duplicate on '%s'",
> > + encap_rec->chassis_name, encap_rec->ip,
> > + encap_rec->type, chassis_id);
> > + continue;
> > + }
> > +
> > struct sbrec_encap *encap =
> sbrec_encap_insert(ovnsb_idl_txn);
> >
> > sbrec_encap_set_type(encap, encap_type);
> > @@ -778,7 +797,7 @@ update_supported_sset(struct sset *supported)
> >
> > static void
> > remove_unsupported_options(const struct sbrec_chassis *chassis_rec,
> > - bool *updated)
> > + enum chassis_update_status *update_status)
> > {
> > struct sset supported_options =
> SSET_INITIALIZER(&supported_options);
> > update_supported_sset(&supported_options);
> > @@ -789,7 +808,7 @@ remove_unsupported_options(const struct
> sbrec_chassis
> > *chassis_rec,
> > VLOG_WARN("Removing unsupported key \"%s\" from chassis
> > record.",
> > node->key);
> > sbrec_chassis_update_other_config_delkey(chassis_rec,
> > node->key);
> > - *updated = true;
> > + *update_status = CHASSIS_UPDATED;
> > }
> > }
> >
> > @@ -827,25 +846,27 @@ chassis_get_record(struct ovsdb_idl_txn
> > *ovnsb_idl_txn,
> > }
> >
> > /* Update a Chassis record based on the config in the ovs config.
> > - * Returns true if 'chassis_rec' was updated, false otherwise.
> > + * Returns 1 if 'chassis_rec' was updated, -1 if another chassis exists
> > + * with an encap with same IP and type as 'chassis', and returns 0
> > otherwise.
> >
>
> This comment is very misleading, I don't see the function returning
> integers, it returns enum.
>
>
> > */
> > -static bool
> > +static enum chassis_update_status
> > chassis_update(const struct sbrec_chassis *chassis_rec,
> > struct ovsdb_idl_txn *ovnsb_idl_txn,
> > const struct ovs_chassis_cfg *ovs_cfg,
> > const char *chassis_id,
> > - const struct sset *transport_zones)
> > + const struct sset *transport_zones,
> > + struct ovsdb_idl_index *sbrec_encaps)
> > {
> > - bool updated = false;
> > + enum chassis_update_status update_status = CHASSIS_NOT_UPDATED;
> >
> > if (strcmp(chassis_id, chassis_rec->name)) {
> > sbrec_chassis_set_name(chassis_rec, chassis_id);
> > - updated = true;
> > + update_status = CHASSIS_UPDATED;
> > }
> >
> > if (strcmp(ovs_cfg->hostname, chassis_rec->hostname)) {
> > sbrec_chassis_set_hostname(chassis_rec, ovs_cfg->hostname);
> > - updated = true;
> > + update_status = CHASSIS_UPDATED;
> > }
> >
> > if (chassis_other_config_changed(ovs_cfg, chassis_rec)) {
> > @@ -856,12 +877,12 @@ chassis_update(const struct sbrec_chassis
> > *chassis_rec,
> > sbrec_chassis_verify_other_config(chassis_rec);
> > sbrec_chassis_set_other_config(chassis_rec, &other_config);
> > smap_destroy(&other_config);
> > - updated = true;
> > + update_status = CHASSIS_UPDATED;
> > }
> >
> > update_chassis_transport_zones(transport_zones, chassis_rec);
> >
> > - remove_unsupported_options(chassis_rec, &updated);
> > + remove_unsupported_options(chassis_rec, &update_status);
> >
> > /* If any of the encaps should change, update them. */
> > bool tunnels_changed =
> > @@ -871,20 +892,27 @@ chassis_update(const struct sbrec_chassis
> > *chassis_rec,
> > ovs_cfg->encap_csum,
> > chassis_rec);
> > if (!tunnels_changed) {
> > - return updated;
> > + return update_status;
> > }
> >
> > struct sbrec_encap **encaps;
> > size_t n_encap;
> >
> > encaps =
> > - chassis_build_encaps(ovnsb_idl_txn, &ovs_cfg->encap_type_set,
> > + chassis_build_encaps(ovnsb_idl_txn,
> > + &ovs_cfg->encap_type_set,
> > &ovs_cfg->encap_ip_set,
> > ovs_cfg->encap_ip_default, chassis_id,
> > - ovs_cfg->encap_csum, &n_encap);
> > + ovs_cfg->encap_csum, &n_encap,
> > + sbrec_encaps);
> > sbrec_chassis_set_encaps(chassis_rec, encaps, n_encap);
> > free(encaps);
> > - return true;
> > +
> > + if (!n_encap) {
> > + return CHASSIS_NEED_DELETE;
> > + }
> > +
> > + return CHASSIS_UPDATED;
> > }
> >
> > /* If this is a chassis_private config update after we initialized the
> > record
> > @@ -935,7 +963,8 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
> > const char *chassis_id,
> > const struct ovsrec_bridge *br_int,
> > const struct sset *transport_zones,
> > - const struct sbrec_chassis_private **chassis_private)
> > + const struct sbrec_chassis_private **chassis_private,
> > + struct ovsdb_idl_index *sbrec_encaps)
> > {
> > struct ovs_chassis_cfg ovs_cfg;
> >
> > @@ -956,22 +985,30 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
> > * modified in the ovs table.
> > */
> > if (chassis_rec && ovnsb_idl_txn) {
> > - bool updated = chassis_update(chassis_rec, ovnsb_idl_txn,
> > &ovs_cfg,
> > - chassis_id, transport_zones);
> > -
> > - if (!existed || updated) {
> > - ovsdb_idl_txn_add_comment(ovnsb_idl_txn,
> > - "ovn-controller: %s chassis
> '%s'",
> > - !existed ? "registering" :
> > "updating",
> > - chassis_id);
> > - }
> > + enum chassis_update_status update_status =
> > + chassis_update(chassis_rec, ovnsb_idl_txn, &ovs_cfg,
> > + chassis_id,
> > + transport_zones, sbrec_encaps);
>
> +
> > + if (update_status == CHASSIS_NEED_DELETE) {
> > + sbrec_chassis_delete(chassis_rec);
> >
>
> Shouldn't we also delete the chassis private record if it exists?
> I don't think there is a point in having one of them to remain.
> I'm also not sure if the sbrec deletion is the right way to do it,
> we could relax the schema and decrease the min for encaps
> to 0, which would allow us to keep the chassis record without
> any encap being configured. There is only single place that
> expects at least 1 encap, that is ovn-ic, but looking at the
> code nothing prevents us from skipping chassis without
> encap instead of crashing.
>
> I agree that we should probably not delete the chassis record.
I'm not sure if keeping the record is a good thing for a few reasons:
1. ovn-controller doesn't create the chassis record if not fully configured.
So, it's natural for it to remove the record when the configuration
turns into something unusable.
2. Keeping the record will block --wait=hv calls, if I'm not mistaken.
Sure, such commands are always dangerous due to ovn-controller maybe
being down for some reason, but it's better if we do not expand the
cases where this can happen. Node bing down in most cases would be
a temporary condition, while misconfic can persist for a long time.
3. Chaning database schema this way is not backward compatibe. So, we'll
need to check if we're working with the new or old schema on the server
side, otherwise we will be sending transactions that can't succeed.
Sure, this will not make the situation here worse, but we will not
be fixing the problem for the upgrade time, which can span many weeks
on large production clusters. Or we'll need to delete for old schema
and not delete for a new schema, which is double the code and seems
unnecessary. Either way we shouldn't be sending transactions that
we know will fail.
Since this is actually a bug fix (the bug is that ovn-controller
drives the node out of disk space), backporting the schema change
will also be a little problematic, as usual.
> Removing the chassis might also result in tunnels not being deleted in some
> cases. With removing tunnels we would ensure
> that traffic will not be sent to this wrongly configured chassis.
This shouldn't be a problem. Encaps is not a root table, so any time the
chassis record is removed all the encaps are also removed. And ovn-controller
should handle this kind of update properly, because that's a normal scenario
for a node removal. If somehow we do not handle that correctly, then that
should be fixed. But it would be a completely separate issue from this one.
>
>
>
> > + chassis_rec = NULL;
>
> + } else {
>
>
> This adds yet another nesting, please consider using goto
> to reduce it.
>
> + if (!existed || update_status == CHASSIS_UPDATED) {
> > + ovsdb_idl_txn_add_comment(ovnsb_idl_txn,
> > + "ovn-controller: %s chassis
> > '%s'",
> > + !existed ? "registering"
> > + : "updating", chassis_id);
> > + }
> >
> > - *chassis_private =
> > - chassis_private_get_record(ovnsb_idl_txn,
> > - sbrec_chassis_private_by_name,
> > + *chassis_private =
> > + chassis_private_get_record(ovnsb_idl_txn,
> > +
> sbrec_chassis_private_by_name,
> > + chassis_id);
> > + if (*chassis_private) {
> > + chassis_private_update(*chassis_private, chassis_rec,
> > chassis_id);
> > - if (*chassis_private) {
> > - chassis_private_update(*chassis_private, chassis_rec,
> > chassis_id);
> > + }
> > }
> > }
> >
> > diff --git a/controller/chassis.h b/controller/chassis.h
> > index 03cc2f906..14251bdad 100644
> > --- a/controller/chassis.h
> > +++ b/controller/chassis.h
> > @@ -44,7 +44,8 @@ const struct sbrec_chassis *chassis_run(
> > const struct ovsrec_open_vswitch_table *,
> > const char *chassis_id, const struct ovsrec_bridge *br_int,
> > const struct sset *transport_zones,
> > - const struct sbrec_chassis_private **chassis_private);
> > + const struct sbrec_chassis_private **chassis_private,
> > + struct ovsdb_idl_index *sbrec_chassis_encaps);
> > bool chassis_cleanup(struct ovsdb_idl_txn *ovs_idl_txn,
> > struct ovsdb_idl_txn *ovnsb_idl_txn,
> > const struct ovsrec_open_vswitch_table *,
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 330227933..7faf1c38f 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -6570,6 +6570,9 @@ main(int argc, char *argv[])
> > struct ovsdb_idl_index *sbrec_learned_route_index_by_datapath
> > = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
> > &sbrec_learned_route_col_datapath);
> > + struct ovsdb_idl_index *sbrec_encaps
> > + = ovsdb_idl_index_create2(ovnsb_idl_loop.idl,
> > + &sbrec_encap_col_type,
> > &sbrec_encap_col_ip);
> >
> > ovsdb_idl_track_add_all(ovnsb_idl_loop.idl);
> > ovsdb_idl_omit_alert(ovnsb_idl_loop.idl,
> > @@ -7275,7 +7278,7 @@ main(int argc, char *argv[])
> > sbrec_chassis_private_by_name,
> > ovs_table, chassis_id,
> > br_int, &transport_zones,
> > - &chassis_private);
> > + &chassis_private, sbrec_encaps);
> > }
> >
> > /* If any OVS feature support changed, force a full
> recompute.
> > diff --git a/lib/chassis-index.c b/lib/chassis-index.c
> > index 4b38036cb..953cc6ed4 100644
> > --- a/lib/chassis-index.c
> > +++ b/lib/chassis-index.c
> > @@ -115,3 +115,23 @@ ha_chassis_group_lookup_by_name(
> >
> > return retval;
> > }
> > +
> > +/* Finds and returns the encap with the given ip and type, or NULL if
> no
> > such
> > + * encap exists. */
> > +const struct sbrec_encap *
> > +encap_lookup_by_ip_and_type(struct ovsdb_idl_index
> *sbrec_chassis_encaps,
> > + const char *ip, const char *type)
> > +{
> > + struct sbrec_encap *target = sbrec_encap_index_init_row(
> > + sbrec_chassis_encaps);
> > +
> > + sbrec_encap_index_set_ip(target, ip);
> > + sbrec_encap_index_set_type(target, type);
> > +
> > + struct sbrec_encap *retval = sbrec_encap_index_find(
> > + sbrec_chassis_encaps, target);
> > +
> > + sbrec_encap_index_destroy_row(target);
> > +
> > + return retval;
> > +}
> > diff --git a/lib/chassis-index.h b/lib/chassis-index.h
> > index bc654da13..dd94596c4 100644
> > --- a/lib/chassis-index.h
> > +++ b/lib/chassis-index.h
> > @@ -35,5 +35,8 @@ chassis_private_lookup_by_name(
> > struct ovsdb_idl_index *ha_chassis_group_index_create(struct ovsdb_idl
> > *idl);
> > const struct sbrec_ha_chassis_group *ha_chassis_group_lookup_by_name(
> > struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name, const char
> > *name);
> > +const struct sbrec_encap *
> > +encap_lookup_by_ip_and_type(struct ovsdb_idl_index
> *sbrec_chassis_encaps,
> > + const char *ip, const char *type);
> >
>
> nit: For declaration the return type and name should be at the same line.
>
>
> >
> > #endif /* lib/chassis-index.h */
> > diff --git a/tests/ovn-controller.at <http://ovn-controller.at>
> b/tests/ovn-controller.at <http://ovn-controller.at>
> > index a8a9a2da2..9d158ebc4 100644
> > --- a/tests/ovn-controller.at <http://ovn-controller.at>
> > +++ b/tests/ovn-controller.at <http://ovn-controller.at>
> > @@ -277,7 +277,8 @@ ovs-vsctl -- set Open_vSwitch .
> > external-ids:hostname="${new_sysid}" \
> > -- set Open_vSwitch .
> > external-ids:ovn-remote="${current_remote}"
> >
> > OVS_WAIT_UNTIL([
> > - grep -q 'Transaction causes multiple rows in \\"Encap\\" table to
> > have identical values' hv/ovn-controller.log
> > + grep -q "'hv' already has encap ip '192.168.0.1' and type
> 'geneve', "
> > \
> > + "cannot duplicate on 'hv-foo'" hv/ovn-controller.log
> > ])
> >
> > # Destroy the stale entries manually and ovn-controller should now be
> able
> > @@ -310,7 +311,8 @@ check ovn-sbctl destroy chassis_private . -- destroy
> > chassis .
> > OVN_CLEANUP_SBOX([hv], ["/foo/d
> > /bar/d
> > /Transaction causes multiple rows/d
>
> I guess we should delete the "Transaction causes multiple rows" potential
> error.
>
> > -/failed to create bridge/d"])
> > +/failed to create bridge/d
> > +/cannot duplicate on/d"])
> >
> > OVN_CLEANUP_VSWITCH([main])
> > as ovn-sb
> > @@ -3876,3 +3878,31 @@ wait_ports 0
> > OVN_CLEANUP([hv1
> > /Invalid VXLAN port number.*/d])
> > AT_CLEANUP
> > +
> > +OVN_FOR_EACH_NORTHD([
> > +AT_SETUP([ovn-controller - qwerty two encaps same IP and type])
> > +AT_KEYWORDS([ovn])
> > +ovn_start
> > +net_add n1
> > +
> > +sim_add hv1
> > +as hv1
> > +check ovs-vsctl add-br br-phys
> > +ovn_attach n1 br-phys 192.168.0.11
> > +
> > +sim_add hv2
> > +as hv2
> > +check ovs-vsctl add-br br-phys
> > +ovn_attach n1 br-phys 192.168.0.12
> > +
> > +# Set hv2's encap IP to be the same as hv1's, and save hv2's db size.
> > +check ovs-vsctl set Open_vSwitch .
> > external-ids:ovn-encap-ip="192.168.0.11"
> > +old_size=$(wc -c hv2/conf.db)
> >
>
> This is racy, you're saving the size *after* the change. It would be
> better
> to actually wait for the 192.168.0.12 encap to disappear and then
> check count of encap table, possibly chassis table (depending if we still
> want to go with delete or not).
FWIW, while we can add some check for the Sb DB records here, I think
it's important that we also check that the local OVS database doesn't
show unrestricted growth, since that's the actual problem this patch
is attempting to fix.
Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev