On Thu, Oct 2, 2025 at 12:01 PM Ilya Maximets <[email protected]> wrote:
> On 9/18/25 2:43 PM, Xavier Simonart wrote: > > Hi Rose > > > > Thanks for looking into this issue and providing a patch. > > I have two additional comments - see below > > > > Thanks > > Xavier > > > > On Thu, Sep 18, 2025 at 8:47 AM Ales Musil via dev < > [email protected] <mailto:[email protected]>> wrote: > > > > On Thu, Sep 18, 2025 at 12:02 AM Rosemarie O'Riorden via dev < > > [email protected] <mailto:[email protected]>> wrote: > > > > > If you set a chassis' encap ip to be the same as that of another > > > chassis, OVN should detect it, log an error, and delete the > chassis. > > > > > > But this was not the case; OVN would allow the transaction. > However, the > > > commit to the southbound database would fail, because there was > already > > > an encap record with that same ip and type (from the other > chassis). > > > The OVS db on the problematic chassis would then rapidly increase > in > > > size. Essentially, the controller would get stuck in a loop of > deleting > > > and recreating the tunnels. > > > > > > This change makes it so before building encaps, the controller > detects > > > if there is an existing encap tunnel of a certain ip and type. If > so, > > > it logs an error, refuses to build the encaps, and deletes the > > > associated chassis record. > > > > Note that this is a different behavior from what OVN does today when > the ovn-encap-ip is removed. > > If the ovn-encap-ip is removed, the ovn-controller will log a message, > but will not delete the encap and chassis entries in sb. > > I think that removing the encap entry, as you propose, is a good thing > to do, so I entered a different FDP tracking this part: > > https://issues.redhat.com/browse/FDP-1711 < > https://issues.redhat.com/browse/FDP-1711>. > > See below about deletion of the chassis record. > > > > > > > > Reported-at: https://issues.redhat.com/browse/FDP-1481 < > https://issues.redhat.com/browse/FDP-1481> > > > Co-authored-by: Nicholas Hubbard <[email protected] > <mailto:[email protected]>> > > > Signed-off-by: Nicholas Hubbard <[email protected] > <mailto:[email protected]>> > > > Signed-off-by: Rosemarie O'Riorden <[email protected] <mailto: > [email protected]>> > > > --- > > > v2: Fix typo in title. > > > --- > > > > > > > Hi Rosemarie, > > > > thank you for the patch. I have a couple of comments inline. > > > > controller/chassis.c | 97 > +++++++++++++++++++++++++------------ > > > controller/chassis.h | 3 +- > > > controller/ovn-controller.c | 5 +- > > > lib/chassis-index.c | 20 ++++++++ > > > lib/chassis-index.h | 3 ++ > > > tests/ovn-controller.at <http://ovn-controller.at> | 34 > ++++++++++++- > > > 6 files changed, 128 insertions(+), 34 deletions(-) > > > > > > diff --git a/controller/chassis.c b/controller/chassis.c > > > index 07bef1c56..ac204c779 100644 > > > --- a/controller/chassis.c > > > +++ b/controller/chassis.c > > > @@ -77,6 +77,12 @@ struct ovs_chassis_cfg { > > > bool ct_label_flush; > > > }; > > > > > > +enum chassis_update_status { > > > + CHASSIS_UPDATED, > > > + CHASSIS_NOT_UPDATED, > > > + CHASSIS_NEED_DELETE > > > +}; > > > + > > > static void > > > ovs_chassis_cfg_init(struct ovs_chassis_cfg *cfg) > > > { > > > @@ -699,7 +705,8 @@ chassis_build_encaps(struct ovsdb_idl_txn > > > *ovnsb_idl_txn, > > > const char *encap_ip_default, > > > const char *chassis_id, > > > const char *encap_csum, > > > - size_t *n_encap) > > > + size_t *n_encap, > > > + struct ovsdb_idl_index *sbrec_encaps) > > > { > > > size_t tunnel_count = 0; > > > > > > @@ -713,6 +720,18 @@ chassis_build_encaps(struct ovsdb_idl_txn > > > *ovnsb_idl_txn, > > > > > > SSET_FOR_EACH (encap_ip, encap_ip_set) { > > > SSET_FOR_EACH (encap_type, encap_type_set) { > > > + const struct sbrec_encap *encap_rec = > > > + encap_lookup_by_ip_and_type(sbrec_encaps, > > > + encap_ip, encap_type); > > > + if (encap_rec && strcmp(encap_rec->chassis_name, > chassis_id)) > > > { > > > + static struct vlog_rate_limit rl = > > > VLOG_RATE_LIMIT_INIT(5, 1); > > > + VLOG_WARN_RL(&rl, "'%s' already has encap ip '%s' > and " > > > + "type '%s', cannot duplicate on > '%s'", > > > + encap_rec->chassis_name, > encap_rec->ip, > > > + encap_rec->type, chassis_id); > > > + continue; > > > + } > > > + > > > struct sbrec_encap *encap = > sbrec_encap_insert(ovnsb_idl_txn); > > > > > > sbrec_encap_set_type(encap, encap_type); > > > @@ -778,7 +797,7 @@ update_supported_sset(struct sset *supported) > > > > > > static void > > > remove_unsupported_options(const struct sbrec_chassis > *chassis_rec, > > > - bool *updated) > > > + enum chassis_update_status > *update_status) > > > { > > > struct sset supported_options = > SSET_INITIALIZER(&supported_options); > > > update_supported_sset(&supported_options); > > > @@ -789,7 +808,7 @@ remove_unsupported_options(const struct > sbrec_chassis > > > *chassis_rec, > > > VLOG_WARN("Removing unsupported key \"%s\" from > chassis > > > record.", > > > node->key); > > > sbrec_chassis_update_other_config_delkey(chassis_rec, > > > node->key); > > > - *updated = true; > > > + *update_status = CHASSIS_UPDATED; > > > } > > > } > > > > > > @@ -827,25 +846,27 @@ chassis_get_record(struct ovsdb_idl_txn > > > *ovnsb_idl_txn, > > > } > > > > > > /* Update a Chassis record based on the config in the ovs config. > > > - * Returns true if 'chassis_rec' was updated, false otherwise. > > > + * Returns 1 if 'chassis_rec' was updated, -1 if another chassis > exists > > > + * with an encap with same IP and type as 'chassis', and returns 0 > > > otherwise. > > > > > > > This comment is very misleading, I don't see the function returning > > integers, it returns enum. > > > > > > > */ > > > -static bool > > > +static enum chassis_update_status > > > chassis_update(const struct sbrec_chassis *chassis_rec, > > > struct ovsdb_idl_txn *ovnsb_idl_txn, > > > const struct ovs_chassis_cfg *ovs_cfg, > > > const char *chassis_id, > > > - const struct sset *transport_zones) > > > + const struct sset *transport_zones, > > > + struct ovsdb_idl_index *sbrec_encaps) > > > { > > > - bool updated = false; > > > + enum chassis_update_status update_status = > CHASSIS_NOT_UPDATED; > > > > > > if (strcmp(chassis_id, chassis_rec->name)) { > > > sbrec_chassis_set_name(chassis_rec, chassis_id); > > > - updated = true; > > > + update_status = CHASSIS_UPDATED; > > > } > > > > > > if (strcmp(ovs_cfg->hostname, chassis_rec->hostname)) { > > > sbrec_chassis_set_hostname(chassis_rec, > ovs_cfg->hostname); > > > - updated = true; > > > + update_status = CHASSIS_UPDATED; > > > } > > > > > > if (chassis_other_config_changed(ovs_cfg, chassis_rec)) { > > > @@ -856,12 +877,12 @@ chassis_update(const struct sbrec_chassis > > > *chassis_rec, > > > sbrec_chassis_verify_other_config(chassis_rec); > > > sbrec_chassis_set_other_config(chassis_rec, > &other_config); > > > smap_destroy(&other_config); > > > - updated = true; > > > + update_status = CHASSIS_UPDATED; > > > } > > > > > > update_chassis_transport_zones(transport_zones, chassis_rec); > > > > > > - remove_unsupported_options(chassis_rec, &updated); > > > + remove_unsupported_options(chassis_rec, &update_status); > > > > > > /* If any of the encaps should change, update them. */ > > > bool tunnels_changed = > > > @@ -871,20 +892,27 @@ chassis_update(const struct sbrec_chassis > > > *chassis_rec, > > > ovs_cfg->encap_csum, > > > chassis_rec); > > > if (!tunnels_changed) { > > > - return updated; > > > + return update_status; > > > } > > > > > > struct sbrec_encap **encaps; > > > size_t n_encap; > > > > > > encaps = > > > - chassis_build_encaps(ovnsb_idl_txn, > &ovs_cfg->encap_type_set, > > > + chassis_build_encaps(ovnsb_idl_txn, > > > + &ovs_cfg->encap_type_set, > > > &ovs_cfg->encap_ip_set, > > > ovs_cfg->encap_ip_default, > chassis_id, > > > - ovs_cfg->encap_csum, &n_encap); > > > + ovs_cfg->encap_csum, &n_encap, > > > + sbrec_encaps); > > > sbrec_chassis_set_encaps(chassis_rec, encaps, n_encap); > > > free(encaps); > > > - return true; > > > + > > > + if (!n_encap) { > > > + return CHASSIS_NEED_DELETE; > > > + } > > > + > > > + return CHASSIS_UPDATED; > > > } > > > > > > /* If this is a chassis_private config update after we > initialized the > > > record > > > @@ -935,7 +963,8 @@ chassis_run(struct ovsdb_idl_txn > *ovnsb_idl_txn, > > > const char *chassis_id, > > > const struct ovsrec_bridge *br_int, > > > const struct sset *transport_zones, > > > - const struct sbrec_chassis_private **chassis_private) > > > + const struct sbrec_chassis_private **chassis_private, > > > + struct ovsdb_idl_index *sbrec_encaps) > > > { > > > struct ovs_chassis_cfg ovs_cfg; > > > > > > @@ -956,22 +985,30 @@ chassis_run(struct ovsdb_idl_txn > *ovnsb_idl_txn, > > > * modified in the ovs table. > > > */ > > > if (chassis_rec && ovnsb_idl_txn) { > > > - bool updated = chassis_update(chassis_rec, ovnsb_idl_txn, > > > &ovs_cfg, > > > - chassis_id, > transport_zones); > > > - > > > - if (!existed || updated) { > > > - ovsdb_idl_txn_add_comment(ovnsb_idl_txn, > > > - "ovn-controller: %s chassis > '%s'", > > > - !existed ? "registering" : > > > "updating", > > > - chassis_id); > > > - } > > > + enum chassis_update_status update_status = > > > + chassis_update(chassis_rec, ovnsb_idl_txn, &ovs_cfg, > > > + chassis_id, > > > + transport_zones, sbrec_encaps); > > > > + > > > + if (update_status == CHASSIS_NEED_DELETE) { > > > + sbrec_chassis_delete(chassis_rec); > > > > > > > Shouldn't we also delete the chassis private record if it exists? > > I don't think there is a point in having one of them to remain. > > I'm also not sure if the sbrec deletion is the right way to do it, > > we could relax the schema and decrease the min for encaps > > to 0, which would allow us to keep the chassis record without > > any encap being configured. There is only single place that > > expects at least 1 encap, that is ovn-ic, but looking at the > > code nothing prevents us from skipping chassis without > > encap instead of crashing. > > > > I agree that we should probably not delete the chassis record. > > I'm not sure if keeping the record is a good thing for a few reasons: > > 1. ovn-controller doesn't create the chassis record if not fully > configured. > So, it's natural for it to remove the record when the configuration > turns into something unusable. > > 2. Keeping the record will block --wait=hv calls, if I'm not mistaken. > Sure, such commands are always dangerous due to ovn-controller maybe > being down for some reason, but it's better if we do not expand the > cases where this can happen. Node bing down in most cases would be > a temporary condition, while misconfic can persist for a long time. > > 3. Chaning database schema this way is not backward compatibe. So, we'll > need to check if we're working with the new or old schema on the server > side, otherwise we will be sending transactions that can't succeed. > Sure, this will not make the situation here worse, but we will not > be fixing the problem for the upgrade time, which can span many weeks > on large production clusters. Or we'll need to delete for old schema > and not delete for a new schema, which is double the code and seems > unnecessary. Either way we shouldn't be sending transactions that > we know will fail. > > Since this is actually a bug fix (the bug is that ovn-controller > drives the node out of disk space), backporting the schema change > will also be a little problematic, as usual. > > > Removing the chassis might also result in tunnels not being deleted in > some > > cases. With removing tunnels we would ensure > > that traffic will not be sent to this wrongly configured chassis. > > This shouldn't be a problem. Encaps is not a root table, so any time the > chassis record is removed all the encaps are also removed. And > ovn-controller > should handle this kind of update properly, because that's a normal > scenario > for a node removal. If somehow we do not handle that correctly, then that > should be fixed. But it would be a completely separate issue from this > one. > > > > > > > > > > + chassis_rec = NULL; > > > > + } else { > > > > > > This adds yet another nesting, please consider using goto > > to reduce it. > > > > + if (!existed || update_status == CHASSIS_UPDATED) { > > > + ovsdb_idl_txn_add_comment(ovnsb_idl_txn, > > > + "ovn-controller: %s > chassis > > > '%s'", > > > + !existed ? "registering" > > > + : "updating", > chassis_id); > > > + } > > > > > > - *chassis_private = > > > - chassis_private_get_record(ovnsb_idl_txn, > > > - > sbrec_chassis_private_by_name, > > > + *chassis_private = > > > + chassis_private_get_record(ovnsb_idl_txn, > > > + > sbrec_chassis_private_by_name, > > > + chassis_id); > > > + if (*chassis_private) { > > > + chassis_private_update(*chassis_private, > chassis_rec, > > > chassis_id); > > > - if (*chassis_private) { > > > - chassis_private_update(*chassis_private, chassis_rec, > > > chassis_id); > > > + } > > > } > > > } > > > > > > diff --git a/controller/chassis.h b/controller/chassis.h > > > index 03cc2f906..14251bdad 100644 > > > --- a/controller/chassis.h > > > +++ b/controller/chassis.h > > > @@ -44,7 +44,8 @@ const struct sbrec_chassis *chassis_run( > > > const struct ovsrec_open_vswitch_table *, > > > const char *chassis_id, const struct ovsrec_bridge *br_int, > > > const struct sset *transport_zones, > > > - const struct sbrec_chassis_private **chassis_private); > > > + const struct sbrec_chassis_private **chassis_private, > > > + struct ovsdb_idl_index *sbrec_chassis_encaps); > > > bool chassis_cleanup(struct ovsdb_idl_txn *ovs_idl_txn, > > > struct ovsdb_idl_txn *ovnsb_idl_txn, > > > const struct ovsrec_open_vswitch_table *, > > > diff --git a/controller/ovn-controller.c > b/controller/ovn-controller.c > > > index 330227933..7faf1c38f 100644 > > > --- a/controller/ovn-controller.c > > > +++ b/controller/ovn-controller.c > > > @@ -6570,6 +6570,9 @@ main(int argc, char *argv[]) > > > struct ovsdb_idl_index *sbrec_learned_route_index_by_datapath > > > = ovsdb_idl_index_create1(ovnsb_idl_loop.idl, > > > > &sbrec_learned_route_col_datapath); > > > + struct ovsdb_idl_index *sbrec_encaps > > > + = ovsdb_idl_index_create2(ovnsb_idl_loop.idl, > > > + &sbrec_encap_col_type, > > > &sbrec_encap_col_ip); > > > > > > ovsdb_idl_track_add_all(ovnsb_idl_loop.idl); > > > ovsdb_idl_omit_alert(ovnsb_idl_loop.idl, > > > @@ -7275,7 +7278,7 @@ main(int argc, char *argv[]) > > > > sbrec_chassis_private_by_name, > > > ovs_table, chassis_id, > > > br_int, &transport_zones, > > > - &chassis_private); > > > + &chassis_private, > sbrec_encaps); > > > } > > > > > > /* If any OVS feature support changed, force a full > recompute. > > > diff --git a/lib/chassis-index.c b/lib/chassis-index.c > > > index 4b38036cb..953cc6ed4 100644 > > > --- a/lib/chassis-index.c > > > +++ b/lib/chassis-index.c > > > @@ -115,3 +115,23 @@ ha_chassis_group_lookup_by_name( > > > > > > return retval; > > > } > > > + > > > +/* Finds and returns the encap with the given ip and type, or > NULL if no > > > such > > > + * encap exists. */ > > > +const struct sbrec_encap * > > > +encap_lookup_by_ip_and_type(struct ovsdb_idl_index > *sbrec_chassis_encaps, > > > + const char *ip, const char *type) > > > +{ > > > + struct sbrec_encap *target = sbrec_encap_index_init_row( > > > + sbrec_chassis_encaps); > > > + > > > + sbrec_encap_index_set_ip(target, ip); > > > + sbrec_encap_index_set_type(target, type); > > > + > > > + struct sbrec_encap *retval = sbrec_encap_index_find( > > > + sbrec_chassis_encaps, target); > > > + > > > + sbrec_encap_index_destroy_row(target); > > > + > > > + return retval; > > > +} > > > diff --git a/lib/chassis-index.h b/lib/chassis-index.h > > > index bc654da13..dd94596c4 100644 > > > --- a/lib/chassis-index.h > > > +++ b/lib/chassis-index.h > > > @@ -35,5 +35,8 @@ chassis_private_lookup_by_name( > > > struct ovsdb_idl_index *ha_chassis_group_index_create(struct > ovsdb_idl > > > *idl); > > > const struct sbrec_ha_chassis_group > *ha_chassis_group_lookup_by_name( > > > struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name, const > char > > > *name); > > > +const struct sbrec_encap * > > > +encap_lookup_by_ip_and_type(struct ovsdb_idl_index > *sbrec_chassis_encaps, > > > + const char *ip, const char *type); > > > > > > > nit: For declaration the return type and name should be at the same > line. > > > > > > > > > > #endif /* lib/chassis-index.h */ > > > diff --git a/tests/ovn-controller.at <http://ovn-controller.at> > b/tests/ovn-controller.at <http://ovn-controller.at> > > > index a8a9a2da2..9d158ebc4 100644 > > > --- a/tests/ovn-controller.at <http://ovn-controller.at> > > > +++ b/tests/ovn-controller.at <http://ovn-controller.at> > > > @@ -277,7 +277,8 @@ ovs-vsctl -- set Open_vSwitch . > > > external-ids:hostname="${new_sysid}" \ > > > -- set Open_vSwitch . > > > external-ids:ovn-remote="${current_remote}" > > > > > > OVS_WAIT_UNTIL([ > > > - grep -q 'Transaction causes multiple rows in \\"Encap\\" > table to > > > have identical values' hv/ovn-controller.log > > > + grep -q "'hv' already has encap ip '192.168.0.1' and type > 'geneve', " > > > \ > > > + "cannot duplicate on 'hv-foo'" hv/ovn-controller.log > > > ]) > > > > > > # Destroy the stale entries manually and ovn-controller should > now be able > > > @@ -310,7 +311,8 @@ check ovn-sbctl destroy chassis_private . -- > destroy > > > chassis . > > > OVN_CLEANUP_SBOX([hv], ["/foo/d > > > /bar/d > > > /Transaction causes multiple rows/d > > > > I guess we should delete the "Transaction causes multiple rows" > potential error. > > > > > -/failed to create bridge/d"]) > > > +/failed to create bridge/d > > > +/cannot duplicate on/d"]) > > > > > > OVN_CLEANUP_VSWITCH([main]) > > > as ovn-sb > > > @@ -3876,3 +3878,31 @@ wait_ports 0 > > > OVN_CLEANUP([hv1 > > > /Invalid VXLAN port number.*/d]) > > > AT_CLEANUP > > > + > > > +OVN_FOR_EACH_NORTHD([ > > > +AT_SETUP([ovn-controller - qwerty two encaps same IP and type]) > > > +AT_KEYWORDS([ovn]) > > > +ovn_start > > > +net_add n1 > > > + > > > +sim_add hv1 > > > +as hv1 > > > +check ovs-vsctl add-br br-phys > > > +ovn_attach n1 br-phys 192.168.0.11 > > > + > > > +sim_add hv2 > > > +as hv2 > > > +check ovs-vsctl add-br br-phys > > > +ovn_attach n1 br-phys 192.168.0.12 > > > + > > > +# Set hv2's encap IP to be the same as hv1's, and save hv2's db > size. > > > +check ovs-vsctl set Open_vSwitch . > > > external-ids:ovn-encap-ip="192.168.0.11" > > > +old_size=$(wc -c hv2/conf.db) > > > > > > > This is racy, you're saving the size *after* the change. It would be > better > > to actually wait for the 192.168.0.12 encap to disappear and then > > check count of encap table, possibly chassis table (depending if we > still > > want to go with delete or not). > > FWIW, while we can add some check for the Sb DB records here, I think > it's important that we also check that the local OVS database doesn't > show unrestricted growth, since that's the actual problem this patch > is attempting to fix. > > Best regards, Ilya Maximets. > > As agreed during the offline discussion let's try to go with the removal of chassis and chassis_private, hopefully it won't prove to complex. Thanks, Ales _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
