On Tue, Sep 8, 2020 at 1:16 PM Numan Siddique <num...@ovn.org> wrote:

>
>
> On Tue, Sep 8, 2020 at 12:58 PM Han Zhou <zhou...@gmail.com> wrote:
>
>> On Mon, Sep 7, 2020 at 3:11 AM Dumitru Ceara <dce...@redhat.com> wrote:
>> >
>> > ovn-controller always stores the last configured system-id/chassis-id in
>> > memory regardless if the connection to the SB is up or down. This is OK
>> > as long as the change can be committed successfully when the SB DB
>> > connection comes back up.
>> >
>> > Without this change, if the chassis-id changes while the SB connection
>> is
>> > down, ovn-controller will fail to create the new record but nevertheless
>> > update its in-memory chassis-id. When the SB connection is restored
>> > ovn-controller tries to find the record corresponding to the chassis-id
>> > it stored in memory. This fails causing ovn-controller to try to insert
>> > a new record. But at this point a constraint violation is hit in the SB
>> > because the Encap records of the "stale" chassis still exist in the DB,
>> > along with the old chassis record.
>> >
>> > This commit changes the way we search for a "stale" chassis record in
>> the
>> > SB to cover the above mentioned case. Also, in such cases there's no
>> need
>> > to recreate the Encaps, it's sufficient to update the chassis_name
>> field.
>> >
>> > Fixes: 5344f24ecb1a ("ovn-controller: Refactor chassis.c to abstract the
>> string parsing")
>> > Signed-off-by: Dumitru Ceara <dce...@redhat.com>
>> >
>> > (cherry-picked from master commit
>> 94a32fca2d2b825fece0ef5b1873459bd9857dd3)
>>
>> Hi Dumitru,
>>
>> Sorry that I missed the original review, but this patch seems causing
>> problem in master already. When RBAC is enabled, it will result in below
>> error:
>> 2020-09-08T07:20:31.991Z|00014|binding|INFO|Claiming lport lsp1 for this
>> chassis.
>> 2020-09-08T07:20:31.992Z|00015|ovsdb_idl|WARN|transaction error:
>> {"details":"RBAC rules for client \"chassis-1\" role \"ovn-controller\"
>> prohibit modification of table \"Encap\".","error":"permission error"}
>>
>> You can simply reproduce by: make sandbox, and then run: ovn-nbctl
>> --wait=hv sync, which will hang forever because of the problem.
>> This patch seems to be updating the "name" field of Encap table, which
>> violates the design of RBAC, which uses "name" as the identity of the
>> client. We shouldn't allow an user to change system-id/chassis-id
>> directly.
>> I think the proper way is firstly destroying the old chassis and then
>> creating a new chassis, which would also avoid the complex logic in
>> ovn-controller regarding the "stale record" handling. What do you think?
>>
>>
> Hi Han,
>
> Yes. Dumitru submitted the patch to fix this issue in master and I applied
> it just now.
> Can you please test again with the latest master.
>
>
> https://github.com/ovn-org/ovn/commit/849c5a492a26337789db1a2dc97570989ef08c34
>
> We see the same issue with branch-20.06 too.
>


Hi Dumitru,

Can you please make a 2 patch series for branch-20.03 which includes this
patch and the other one
which fixes the RBAC issue.

Thanks
Numan


>
> Thanks
> Numan
>
> Thanks,
>> Han
>>
>> > ---
>> >  controller/chassis.c    | 60
>> ++++++++++++++++++++++++++++++-------------------
>> >  tests/ovn-controller.at | 17 ++++++++++++++
>> >  2 files changed, 54 insertions(+), 23 deletions(-)
>> >
>> > diff --git a/controller/chassis.c b/controller/chassis.c
>> > index 522893e..d525463 100644
>> > --- a/controller/chassis.c
>> > +++ b/controller/chassis.c
>> > @@ -380,10 +380,7 @@ chassis_tunnels_changed(const struct sset
>> *encap_type_set,
>> >  {
>> >      size_t encap_type_count = 0;
>> >
>> > -    for (int i = 0; i < chassis_rec->n_encaps; i++) {
>> > -        if (strcmp(chassis_rec->name,
>> chassis_rec->encaps[i]->chassis_name)) {
>> > -            return true;
>> > -        }
>> > +    for (size_t i = 0; i < chassis_rec->n_encaps; i++) {
>> >
>> >          if (!sset_contains(encap_type_set,
>> chassis_rec->encaps[i]->type)) {
>> >              return true;
>> > @@ -456,6 +453,19 @@ chassis_build_encaps(struct ovsdb_idl_txn
>> *ovnsb_idl_txn,
>> >  }
>> >
>> >  /*
>> > + * Updates encaps for a given chassis. This can happen when the chassis
>> > + * name has changed. Also, the only thing we support updating is the
>> > + * chassis_name. For other changes the encaps will be recreated.
>> > + */
>> > +static void
>> > +chassis_update_encaps(const struct sbrec_chassis *chassis)
>> > +{
>> > +    for (size_t i = 0; i < chassis->n_encaps; i++) {
>> > +        sbrec_encap_set_chassis_name(chassis->encaps[i],
>> chassis->name);
>> > +    }
>> > +}
>> > +
>> > +/*
>> >   * Returns a pointer to a chassis record from 'chassis_table' that
>> >   * matches at least one tunnel config.
>> >   */
>> > @@ -486,9 +496,10 @@ chassis_get_stale_record(const struct
>> sbrec_chassis_table *chassis_table,
>> >  /* If this is a chassis config update after we initialized the record
>> once
>> >   * then we should always be able to find it with the ID we saved in
>> >   * chassis_state.
>> > - * Otherwise (i.e., first time we create the record) then we check if
>> there's
>> > - * a stale record from a previous controller run that didn't end
>> gracefully
>> > - * and reuse it. If not then we create a new record.
>> > + * Otherwise (i.e., first time we create the record or if the system-id
>> > + * changed) then we check if there's a stale record from a previous
>> > + * controller run that didn't end gracefully and reuse it. If not then
>> we
>> > + * create a new record.
>> >   */
>> >  static const struct sbrec_chassis *
>> >  chassis_get_record(struct ovsdb_idl_txn *ovnsb_idl_txn,
>> > @@ -497,29 +508,31 @@ chassis_get_record(struct ovsdb_idl_txn
>> *ovnsb_idl_txn,
>> >                     const struct ovs_chassis_cfg *ovs_cfg,
>> >                     const char *chassis_id)
>> >  {
>> > -    const struct sbrec_chassis *chassis_rec;
>> > +    const struct sbrec_chassis *chassis = NULL;
>> >
>> >      if (chassis_info_id_inited(&chassis_state)) {
>> > -        chassis_rec = chassis_lookup_by_name(sbrec_chassis_by_name,
>> > -
>> chassis_info_id(&chassis_state));
>> > -        if (!chassis_rec) {
>> > -            VLOG_DBG("Could not find Chassis, will create it"
>> > -                     ": stored (%s) ovs (%s)",
>> > +        chassis = chassis_lookup_by_name(sbrec_chassis_by_name,
>> > +
>> chassis_info_id(&chassis_state));
>> > +        if (!chassis) {
>> > +            VLOG_DBG("Could not find Chassis, will check if the id
>> changed: "
>> > +                     "stored (%s) ovs (%s)",
>> >                       chassis_info_id(&chassis_state), chassis_id);
>> > -            if (ovnsb_idl_txn) {
>> > -                /* Recreate the chassis record.  */
>> > -                chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
>> > -            }
>> >          }
>> > -    } else {
>> > -        chassis_rec =
>> > -            chassis_get_stale_record(chassis_table, ovs_cfg,
>> chassis_id);
>> > +    }
>> >
>> > -        if (!chassis_rec && ovnsb_idl_txn) {
>> > -            chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
>> > +    if (!chassis) {
>> > +        chassis = chassis_get_stale_record(chassis_table, ovs_cfg,
>> chassis_id);
>> > +    }
>> > +
>> > +    if (!chassis) {
>> > +        /* Recreate the chassis record. */
>> > +        VLOG_DBG("Could not find Chassis, will create it: %s",
>> chassis_id);
>> > +        if (ovnsb_idl_txn) {
>> > +            return sbrec_chassis_insert(ovnsb_idl_txn);
>> >          }
>> >      }
>> > -    return chassis_rec;
>> > +
>> > +    return chassis;
>> >  }
>> >
>> >  /* Update a Chassis record based on the config in the ovs config. */
>> > @@ -567,6 +580,7 @@ chassis_update(const struct sbrec_chassis
>> *chassis_rec,
>> >                                  &ovs_cfg->encap_ip_set,
>> ovs_cfg->encap_csum,
>> >                                  chassis_rec);
>> >      if (!tunnels_changed) {
>> > +        chassis_update_encaps(chassis_rec);
>> >          return;
>> >      }
>> >
>> > diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
>> > index 63b2581..1c7aa58 100644
>> > --- a/tests/ovn-controller.at
>> > +++ b/tests/ovn-controller.at
>> > @@ -196,6 +196,23 @@ OVS_WAIT_UNTIL([
>> >      test "${sysid}" = "${chassis_id}"
>> >  ])
>> >
>> > +# Simulate system-id changing while ovn-controller is disconnected from
>> the
>> > +# SB.
>> > +valid_remote=$(ovs-vsctl get Open_vSwitch . external_ids:ovn-remote)
>> > +invalid_remote=tcp:0.0.0.0:4242
>> > +ovs-vsctl set Open_vSwitch . external_ids:ovn-remote=${invalid_remote}
>> > +expected_state="not connected"
>> > +OVS_WAIT_UNTIL([
>> > +    test "${expected_state}" = "$(ovn-appctl -t ovn-controller
>> connection-status)"
>> > +])
>> > +sysid=${sysid}-bar
>> > +ovs-vsctl set Open_vSwitch . external-ids:system-id="${sysid}"
>> > +ovs-vsctl set Open_vSwitch . external_ids:ovn-remote=${valid_remote}
>> > +OVS_WAIT_UNTIL([
>> > +    chassis_id=$(ovn-sbctl get Chassis "${sysid}" name)
>> > +    test "${sysid}" = "${chassis_id}"
>> > +])
>> > +
>> >  # Gracefully terminate daemons
>> >  OVN_CLEANUP_SBOX([hv])
>> >  OVN_CLEANUP_VSWITCH([main])
>> > --
>> > 1.8.3.1
>> >
>> > _______________________________________________
>> > dev mailing list
>> > d...@openvswitch.org
>> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> _______________________________________________
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to