On Fri, Feb 2, 2024 at 12:19 PM Ales Musil <amu...@redhat.com> wrote:
> > > On Tue, Jan 30, 2024 at 2:59 PM Mohammad Heib <mh...@redhat.com> wrote: > >> ovn-controller immediately removes the vport_bindings requests that were >> generated by VIFs after handling them locally, this approach is intended >> to avoid binding the vport to one VIF only and allocate the vport >> between the different VIFs that exist in the vport:virtual-parents. >> >> Although the behavior mentioned above is correct, in some cases when the >> SB Database is busy the transaction that binds this vport to the desired >> VIF/chassis can fail and the controller will not re-try to bind the >> vport again because we deleted the bind_vport request in the previous >> loop/TXN. >> >> This patch aims to change the above behavior by storing the bind_vport >> requests for a bit longer time and this is done by the following: >> 1. add relevancy_time for each new bind_vport request and >> mark this request as new. >> >> 2. loop0: ovn-controller will try to handle this bind_vport request >> for the first time as usual (no change). >> >> 3. loop0: ovn-controller will try to delete the already handled >> bind_vport >> request as usual but first, it will check if this request is marked >> as new and >> if the relevancy_time is still valid if so the controller will mark >> this >> request as an old request and keep it, otherwise remove it. >> >> 4.loop1: ovn-controller will try to commit the same change again for >> the old request, if the previous commit in loop0 succeeded the >> change will not have any effect on SB, otherwise we will try to >> commit the same vport_bind request again. >> >> 5. loop1: delete the old bind_vport request. >> >> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1954659 >> Signed-off-by: Mohammad Heib <mh...@redhat.com> >> --- >> > > Hi Mohammad, > > overall the change makes sense, I have a couple of comments see down below. > > controller/pinctrl.c | 58 +++++++++++++++++++++++++++++++++++++++----- >> 1 file changed, 52 insertions(+), 6 deletions(-) >> >> diff --git a/controller/pinctrl.c b/controller/pinctrl.c >> index bd3bd3d81..152962448 100644 >> --- a/controller/pinctrl.c >> +++ b/controller/pinctrl.c >> @@ -6519,10 +6519,52 @@ struct put_vport_binding { >> uint32_t vport_key; >> >> uint32_t vport_parent_key; >> + >> + /* This vport record Only relevant if "relevancy_time" >> + * is earlier than the current_time, "new_record" is true. >> + */ >> + long long int relevancy_time; >> > > The intention of the variable should be probably clearer e.g. > relevant_until_ms. > > Also reading through the rest of the code it doesn't seem possible that > the binding wouldn't be deleted, hence I think there isn't any need for the > relevancy time, it should be enough to have a flag that will be flipped. In > any case we will try to commit twice. I would leave out the whole relevancy > and keep the flag flipping it on first commit WDYT? > > >> + bool new_record; >> }; >> >> /* Contains "struct put_vport_binding"s. */ >> static struct hmap put_vport_bindings; >> +/* the relevance time in ms of vport record before deleteing. */ >> +#define VPORT_RELEVANCE_TIME 1500 >> + >> +/* >> + * Validate if the vport_binding record that was added >> + * by the pinctrl thread is still relevant and needs >> + * to be updated in the SBDB or not. >> + * >> + * vport_binding record is only relevant and needs to be updated in SB >> if: >> + * 1. The put_vport_binding:relevancy_time still valid. >> + * 2. The put_vport_binding:new_record is true: >> + * The new_record will be set to "true" when this vport record is >> created >> + * by function "pinctrl_handle_bind_vport". >> + * >> + * After the first attempt to bind this vport to the chassis and >> + * virtual_parent by function "run_put_vport_bindings" we will set >> the >> + * value of vpb:new_record to "false" and keep it in >> "put_vport_bindings" >> + * >> + * After the second attempt of binding the vpb it will be removed >> by >> + * this function. >> + * >> + * The above guarantees that we will try to bind the vport twice in >> + * a certain amount of time. >> + * >> +*/ >> +static bool >> +is_vport_binding_relevant(struct put_vport_binding *vpb) >> +{ >> + long long int cur_time = time_msec(); >> + >> + if (vpb->new_record && vpb->relevancy_time > cur_time) { >> + vpb->new_record = false; >> > > This is a nasty side effect that I wouldn't expect from starting with is_. > > >> + return true; >> + } >> + return false; >> +} >> >> static void >> init_put_vport_bindings(void) >> @@ -6531,18 +6573,21 @@ init_put_vport_bindings(void) >> } >> >> static void >> -flush_put_vport_bindings(void) >> +flush_put_vport_bindings(bool force_flush) >> { >> struct put_vport_binding *vport_b; >> - HMAP_FOR_EACH_POP (vport_b, hmap_node, &put_vport_bindings) { >> - free(vport_b); >> + HMAP_FOR_EACH_SAFE (vport_b, hmap_node, &put_vport_bindings) { >> + if (!is_vport_binding_relevant(vport_b) || force_flush) { >> + hmap_remove(&put_vport_bindings, &vport_b->hmap_node); >> + free(vport_b); >> + } >> } >> } >> >> static void >> destroy_put_vport_bindings(void) >> { >> - flush_put_vport_bindings(); >> + flush_put_vport_bindings(true); >> hmap_destroy(&put_vport_bindings); >> } >> >> @@ -6620,7 +6665,7 @@ run_put_vport_bindings(struct ovsdb_idl_txn >> *ovnsb_idl_txn, >> sbrec_port_binding_by_key, chassis, vpb); >> } >> >> - flush_put_vport_bindings(); >> + flush_put_vport_bindings(false); >> } >> >> /* Called with in the pinctrl_handler thread context. */ >> @@ -6658,7 +6703,8 @@ pinctrl_handle_bind_vport( >> vpb->dp_key = dp_key; >> vpb->vport_key = vport_key; >> vpb->vport_parent_key = vport_parent_key; >> - >> + vpb->new_record = true; >> + vpb->relevancy_time = time_msec() + VPORT_RELEVANCE_TIME; >> notify_pinctrl_main(); >> } >> >> -- >> 2.34.3 >> >> _______________________________________________ >> dev mailing list >> d...@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >> > Thanks, > Ales > -- > > Ales Musil > > Senior Software Engineer - OVN Core > > Red Hat EMEA <https://www.redhat.com> > > amu...@redhat.com > <https://red.ht/sig> > I forgot to mention, is it possible to come up with a test that would check this behavior? -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> amu...@redhat.com <https://red.ht/sig> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev