On Tue, Sep 8, 2020 at 12:58 PM Han Zhou <zhou...@gmail.com> wrote: > On Mon, Sep 7, 2020 at 3:11 AM Dumitru Ceara <dce...@redhat.com> wrote: > > > > ovn-controller always stores the last configured system-id/chassis-id in > > memory regardless if the connection to the SB is up or down. This is OK > > as long as the change can be committed successfully when the SB DB > > connection comes back up. > > > > Without this change, if the chassis-id changes while the SB connection is > > down, ovn-controller will fail to create the new record but nevertheless > > update its in-memory chassis-id. When the SB connection is restored > > ovn-controller tries to find the record corresponding to the chassis-id > > it stored in memory. This fails causing ovn-controller to try to insert > > a new record. But at this point a constraint violation is hit in the SB > > because the Encap records of the "stale" chassis still exist in the DB, > > along with the old chassis record. > > > > This commit changes the way we search for a "stale" chassis record in the > > SB to cover the above mentioned case. Also, in such cases there's no need > > to recreate the Encaps, it's sufficient to update the chassis_name field. > > > > Fixes: 5344f24ecb1a ("ovn-controller: Refactor chassis.c to abstract the > string parsing") > > Signed-off-by: Dumitru Ceara <dce...@redhat.com> > > > > (cherry-picked from master commit > 94a32fca2d2b825fece0ef5b1873459bd9857dd3) > > Hi Dumitru, > > Sorry that I missed the original review, but this patch seems causing > problem in master already. When RBAC is enabled, it will result in below > error: > 2020-09-08T07:20:31.991Z|00014|binding|INFO|Claiming lport lsp1 for this > chassis. > 2020-09-08T07:20:31.992Z|00015|ovsdb_idl|WARN|transaction error: > {"details":"RBAC rules for client \"chassis-1\" role \"ovn-controller\" > prohibit modification of table \"Encap\".","error":"permission error"} > > You can simply reproduce by: make sandbox, and then run: ovn-nbctl > --wait=hv sync, which will hang forever because of the problem. > This patch seems to be updating the "name" field of Encap table, which > violates the design of RBAC, which uses "name" as the identity of the > client. We shouldn't allow an user to change system-id/chassis-id directly. > I think the proper way is firstly destroying the old chassis and then > creating a new chassis, which would also avoid the complex logic in > ovn-controller regarding the "stale record" handling. What do you think? > > Hi Han,
Yes. Dumitru submitted the patch to fix this issue in master and I applied it just now. Can you please test again with the latest master. https://github.com/ovn-org/ovn/commit/849c5a492a26337789db1a2dc97570989ef08c34 We see the same issue with branch-20.06 too. Thanks Numan Thanks, > Han > > > --- > > controller/chassis.c | 60 > ++++++++++++++++++++++++++++++------------------- > > tests/ovn-controller.at | 17 ++++++++++++++ > > 2 files changed, 54 insertions(+), 23 deletions(-) > > > > diff --git a/controller/chassis.c b/controller/chassis.c > > index 522893e..d525463 100644 > > --- a/controller/chassis.c > > +++ b/controller/chassis.c > > @@ -380,10 +380,7 @@ chassis_tunnels_changed(const struct sset > *encap_type_set, > > { > > size_t encap_type_count = 0; > > > > - for (int i = 0; i < chassis_rec->n_encaps; i++) { > > - if (strcmp(chassis_rec->name, > chassis_rec->encaps[i]->chassis_name)) { > > - return true; > > - } > > + for (size_t i = 0; i < chassis_rec->n_encaps; i++) { > > > > if (!sset_contains(encap_type_set, > chassis_rec->encaps[i]->type)) { > > return true; > > @@ -456,6 +453,19 @@ chassis_build_encaps(struct ovsdb_idl_txn > *ovnsb_idl_txn, > > } > > > > /* > > + * Updates encaps for a given chassis. This can happen when the chassis > > + * name has changed. Also, the only thing we support updating is the > > + * chassis_name. For other changes the encaps will be recreated. > > + */ > > +static void > > +chassis_update_encaps(const struct sbrec_chassis *chassis) > > +{ > > + for (size_t i = 0; i < chassis->n_encaps; i++) { > > + sbrec_encap_set_chassis_name(chassis->encaps[i], chassis->name); > > + } > > +} > > + > > +/* > > * Returns a pointer to a chassis record from 'chassis_table' that > > * matches at least one tunnel config. > > */ > > @@ -486,9 +496,10 @@ chassis_get_stale_record(const struct > sbrec_chassis_table *chassis_table, > > /* If this is a chassis config update after we initialized the record > once > > * then we should always be able to find it with the ID we saved in > > * chassis_state. > > - * Otherwise (i.e., first time we create the record) then we check if > there's > > - * a stale record from a previous controller run that didn't end > gracefully > > - * and reuse it. If not then we create a new record. > > + * Otherwise (i.e., first time we create the record or if the system-id > > + * changed) then we check if there's a stale record from a previous > > + * controller run that didn't end gracefully and reuse it. If not then > we > > + * create a new record. > > */ > > static const struct sbrec_chassis * > > chassis_get_record(struct ovsdb_idl_txn *ovnsb_idl_txn, > > @@ -497,29 +508,31 @@ chassis_get_record(struct ovsdb_idl_txn > *ovnsb_idl_txn, > > const struct ovs_chassis_cfg *ovs_cfg, > > const char *chassis_id) > > { > > - const struct sbrec_chassis *chassis_rec; > > + const struct sbrec_chassis *chassis = NULL; > > > > if (chassis_info_id_inited(&chassis_state)) { > > - chassis_rec = chassis_lookup_by_name(sbrec_chassis_by_name, > > - > chassis_info_id(&chassis_state)); > > - if (!chassis_rec) { > > - VLOG_DBG("Could not find Chassis, will create it" > > - ": stored (%s) ovs (%s)", > > + chassis = chassis_lookup_by_name(sbrec_chassis_by_name, > > + > chassis_info_id(&chassis_state)); > > + if (!chassis) { > > + VLOG_DBG("Could not find Chassis, will check if the id > changed: " > > + "stored (%s) ovs (%s)", > > chassis_info_id(&chassis_state), chassis_id); > > - if (ovnsb_idl_txn) { > > - /* Recreate the chassis record. */ > > - chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn); > > - } > > } > > - } else { > > - chassis_rec = > > - chassis_get_stale_record(chassis_table, ovs_cfg, > chassis_id); > > + } > > > > - if (!chassis_rec && ovnsb_idl_txn) { > > - chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn); > > + if (!chassis) { > > + chassis = chassis_get_stale_record(chassis_table, ovs_cfg, > chassis_id); > > + } > > + > > + if (!chassis) { > > + /* Recreate the chassis record. */ > > + VLOG_DBG("Could not find Chassis, will create it: %s", > chassis_id); > > + if (ovnsb_idl_txn) { > > + return sbrec_chassis_insert(ovnsb_idl_txn); > > } > > } > > - return chassis_rec; > > + > > + return chassis; > > } > > > > /* Update a Chassis record based on the config in the ovs config. */ > > @@ -567,6 +580,7 @@ chassis_update(const struct sbrec_chassis > *chassis_rec, > > &ovs_cfg->encap_ip_set, > ovs_cfg->encap_csum, > > chassis_rec); > > if (!tunnels_changed) { > > + chassis_update_encaps(chassis_rec); > > return; > > } > > > > diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at > > index 63b2581..1c7aa58 100644 > > --- a/tests/ovn-controller.at > > +++ b/tests/ovn-controller.at > > @@ -196,6 +196,23 @@ OVS_WAIT_UNTIL([ > > test "${sysid}" = "${chassis_id}" > > ]) > > > > +# Simulate system-id changing while ovn-controller is disconnected from > the > > +# SB. > > +valid_remote=$(ovs-vsctl get Open_vSwitch . external_ids:ovn-remote) > > +invalid_remote=tcp:0.0.0.0:4242 > > +ovs-vsctl set Open_vSwitch . external_ids:ovn-remote=${invalid_remote} > > +expected_state="not connected" > > +OVS_WAIT_UNTIL([ > > + test "${expected_state}" = "$(ovn-appctl -t ovn-controller > connection-status)" > > +]) > > +sysid=${sysid}-bar > > +ovs-vsctl set Open_vSwitch . external-ids:system-id="${sysid}" > > +ovs-vsctl set Open_vSwitch . external_ids:ovn-remote=${valid_remote} > > +OVS_WAIT_UNTIL([ > > + chassis_id=$(ovn-sbctl get Chassis "${sysid}" name) > > + test "${sysid}" = "${chassis_id}" > > +]) > > + > > # Gracefully terminate daemons > > OVN_CLEANUP_SBOX([hv]) > > OVN_CLEANUP_VSWITCH([main]) > > -- > > 1.8.3.1 > > > > _______________________________________________ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev