Re: [ovs-dev] [PATCH ovn] controller: avoid recomputes triggered by SBDB Port_Binding updates.

2022-05-19 Thread Dumitru Ceara
On 5/17/22 19:36, Han Zhou wrote:
> On Tue, May 17, 2022 at 2:41 AM Xavier Simonart  wrote:
>>
>> Hi Han

Hi Xavier, Han,


>> Thanks for looking into this and for your feedback.
>> Please see comments below
>>
>> On Tue, May 17, 2022 at 8:03 AM Han Zhou  wrote:
>>>
>>>
>>>
>>> On Thu, May 12, 2022 at 2:04 AM Xavier Simonart 
> wrote:

 When VIF ports are claimed on a chassis, SBDB Port_Binding table is
> updated.
 If the SBDB IDL is still is read-only ("in transaction") when such a
> update
 is required, the update is not possible and recompute is triggered
> through
 I+P failure.

 This situation can happen:
 - after updating Port_Binding->chassis to SBDB for one port, in a
> following
   iteration, ovn-controller handles
> Interface:external_ids:ovn-installed
   (for the same port) while SBDB is still read-only.
 - after updating Port_Binding->chassis to SBDB for one port, in a
> following
   iteration, ovn-controller updates Port_Binding->chassis for another
> port,
   while SBDB is still read-only.

 This patch prevent the recompute, by having the if-status module
 updating the Port_Binding chassis (if needed), when possible.
 This does not delay Port_Binding chassis update compared to before this
 patch: Port_Binding chassis will be updated as soon as SBDB is again
 writable, as it was the case, through a recompute, before this patch.

>>> Thanks Xavier. I think the approach is good: moving the SB update from
> the I-P engine to if-status module, so it wouldn't need to trigger I-P
> engine recompute, and just let the if-status module update the SB as soon
> as it is writable, through the if-status's state-machine.
>>> However, I have some comments/questions regarding the implementation
> details, primarily confusion on the state transitions. Please see below.
>>>
 Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
 Signed-off-by: Xavier Simonart 
 ---
  controller/binding.c| 124 ++--
  controller/binding.h|   9 ++-
  controller/if-status.c  |  31 +++--
  controller/if-status.h  |   6 +-
  controller/ovn-controller.c |   6 +-
  tests/ovn.at|  55 +++-
  6 files changed, 184 insertions(+), 47 deletions(-)

 diff --git a/controller/binding.c b/controller/binding.c
 index e5ba56b25..d917d8775 100644
 --- a/controller/binding.c
 +++ b/controller/binding.c
 @@ -657,11 +657,15 @@ local_binding_get_lport_ofport(const struct
> shash *local_bindings,
  }

  bool
 -local_binding_is_up(struct shash *local_bindings, const char *pb_name)
 +local_binding_is_up(struct shash *local_bindings, const char *pb_name,
 +const struct sbrec_chassis *chassis_rec)
  {
  struct local_binding *lbinding =
  local_binding_find(local_bindings, pb_name);
  struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
 +if (b_lport && b_lport->pb->chassis != chassis_rec) {
 +return false;
 +}
>>>
>>> Why need the change here? I understand that it is obvious that the
> binding should not be considered up if the chassis is not updated in SB
> port_binding, but I wonder why we need the change in *this* patch.
>>> The function is called to see if an interface state should be moved from
> MARK_UP to INSTALLED, but if the chassis is not set, it shouldn't even be
> moved to MARK_UP.
>>>
>> I probably need to provide a better explanation of the states.
>> In fact, my two goals were: 1) avoid/decrease recomputes and 2) do not
> increase latency (time between interface is claimed and it's up in SB and
> ovn-installed in OVS).
>> There is not really an additional state covering the fact that the
> chassis is set in sbdb.
>> As soon as the state is CLAIMED, we try to set the pb->chassis (by 'try'
> I mean we update pb->chassis if sb is not readonly).
>> The fact that sb is readonly in CLAIMED state does not prevent us to move
> to INSTALL_FLOWS state, where we are installing flows in OVS.
> 
> This may be a problem. If we haven't updated pb->chassis, then the physical
> flows being installed are incomplete, and if we move the state forward, we
> would end up updating the port status (SB pb->up and OVS
> interface->external_ids:installed) too early, which completely void the
> original purpose of the if-statue module, which is to manage the interface
> state and tell CMS when a lport is ready to receive traffic.
> 

Sounds like an issue indeed.

>> As soon as the seqno for those flows was received, we moved to MARK_UP
> state.
>> In those three states, we can update pb->chassis (if not already done).
>> In MARK_UP we set the interface up.
>> We moved to INSTALLED when both up and chassis have been written.
>>
  if (lbinding && b_lport && lbinding->iface) {
  if (b_lport

Re: [ovs-dev] [PATCH ovn] controller: avoid recomputes triggered by SBDB Port_Binding updates.

2022-05-17 Thread Han Zhou
On Tue, May 17, 2022 at 2:41 AM Xavier Simonart  wrote:
>
> Hi Han
> Thanks for looking into this and for your feedback.
> Please see comments below
>
> On Tue, May 17, 2022 at 8:03 AM Han Zhou  wrote:
>>
>>
>>
>> On Thu, May 12, 2022 at 2:04 AM Xavier Simonart 
wrote:
>> >
>> > When VIF ports are claimed on a chassis, SBDB Port_Binding table is
updated.
>> > If the SBDB IDL is still is read-only ("in transaction") when such a
update
>> > is required, the update is not possible and recompute is triggered
through
>> > I+P failure.
>> >
>> > This situation can happen:
>> > - after updating Port_Binding->chassis to SBDB for one port, in a
following
>> >   iteration, ovn-controller handles
Interface:external_ids:ovn-installed
>> >   (for the same port) while SBDB is still read-only.
>> > - after updating Port_Binding->chassis to SBDB for one port, in a
following
>> >   iteration, ovn-controller updates Port_Binding->chassis for another
port,
>> >   while SBDB is still read-only.
>> >
>> > This patch prevent the recompute, by having the if-status module
>> > updating the Port_Binding chassis (if needed), when possible.
>> > This does not delay Port_Binding chassis update compared to before this
>> > patch: Port_Binding chassis will be updated as soon as SBDB is again
>> > writable, as it was the case, through a recompute, before this patch.
>> >
>> Thanks Xavier. I think the approach is good: moving the SB update from
the I-P engine to if-status module, so it wouldn't need to trigger I-P
engine recompute, and just let the if-status module update the SB as soon
as it is writable, through the if-status's state-machine.
>> However, I have some comments/questions regarding the implementation
details, primarily confusion on the state transitions. Please see below.
>>
>> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
>> > Signed-off-by: Xavier Simonart 
>> > ---
>> >  controller/binding.c| 124 ++--
>> >  controller/binding.h|   9 ++-
>> >  controller/if-status.c  |  31 +++--
>> >  controller/if-status.h  |   6 +-
>> >  controller/ovn-controller.c |   6 +-
>> >  tests/ovn.at|  55 +++-
>> >  6 files changed, 184 insertions(+), 47 deletions(-)
>> >
>> > diff --git a/controller/binding.c b/controller/binding.c
>> > index e5ba56b25..d917d8775 100644
>> > --- a/controller/binding.c
>> > +++ b/controller/binding.c
>> > @@ -657,11 +657,15 @@ local_binding_get_lport_ofport(const struct
shash *local_bindings,
>> >  }
>> >
>> >  bool
>> > -local_binding_is_up(struct shash *local_bindings, const char *pb_name)
>> > +local_binding_is_up(struct shash *local_bindings, const char *pb_name,
>> > +const struct sbrec_chassis *chassis_rec)
>> >  {
>> >  struct local_binding *lbinding =
>> >  local_binding_find(local_bindings, pb_name);
>> >  struct binding_lport *b_lport =
local_binding_get_primary_lport(lbinding);
>> > +if (b_lport && b_lport->pb->chassis != chassis_rec) {
>> > +return false;
>> > +}
>>
>> Why need the change here? I understand that it is obvious that the
binding should not be considered up if the chassis is not updated in SB
port_binding, but I wonder why we need the change in *this* patch.
>> The function is called to see if an interface state should be moved from
MARK_UP to INSTALLED, but if the chassis is not set, it shouldn't even be
moved to MARK_UP.
>>
> I probably need to provide a better explanation of the states.
> In fact, my two goals were: 1) avoid/decrease recomputes and 2) do not
increase latency (time between interface is claimed and it's up in SB and
ovn-installed in OVS).
> There is not really an additional state covering the fact that the
chassis is set in sbdb.
> As soon as the state is CLAIMED, we try to set the pb->chassis (by 'try'
I mean we update pb->chassis if sb is not readonly).
> The fact that sb is readonly in CLAIMED state does not prevent us to move
to INSTALL_FLOWS state, where we are installing flows in OVS.

This may be a problem. If we haven't updated pb->chassis, then the physical
flows being installed are incomplete, and if we move the state forward, we
would end up updating the port status (SB pb->up and OVS
interface->external_ids:installed) too early, which completely void the
original purpose of the if-statue module, which is to manage the interface
state and tell CMS when a lport is ready to receive traffic.

> As soon as the seqno for those flows was received, we moved to MARK_UP
state.
> In those three states, we can update pb->chassis (if not already done).
> In MARK_UP we set the interface up.
> We moved to INSTALLED when both up and chassis have been written.
>
>> >  if (lbinding && b_lport && lbinding->iface) {
>> >  if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
>> >  return false;
>> > @@ -673,13 +677,23 @@ local_binding_is_up(struct shash
*local_bindings, const char *pb_name)
>> >  }
>>

Re: [ovs-dev] [PATCH ovn] controller: avoid recomputes triggered by SBDB Port_Binding updates.

2022-05-17 Thread Xavier Simonart
Hi Han
Thanks for looking into this and for your feedback.
Please see comments below

On Tue, May 17, 2022 at 8:03 AM Han Zhou  wrote:

>
>
> On Thu, May 12, 2022 at 2:04 AM Xavier Simonart 
> wrote:
> >
> > When VIF ports are claimed on a chassis, SBDB Port_Binding table is
> updated.
> > If the SBDB IDL is still is read-only ("in transaction") when such a
> update
> > is required, the update is not possible and recompute is triggered
> through
> > I+P failure.
> >
> > This situation can happen:
> > - after updating Port_Binding->chassis to SBDB for one port, in a
> following
> >   iteration, ovn-controller handles Interface:external_ids:ovn-installed
> >   (for the same port) while SBDB is still read-only.
> > - after updating Port_Binding->chassis to SBDB for one port, in a
> following
> >   iteration, ovn-controller updates Port_Binding->chassis for another
> port,
> >   while SBDB is still read-only.
> >
> > This patch prevent the recompute, by having the if-status module
> > updating the Port_Binding chassis (if needed), when possible.
> > This does not delay Port_Binding chassis update compared to before this
> > patch: Port_Binding chassis will be updated as soon as SBDB is again
> > writable, as it was the case, through a recompute, before this patch.
> >
> Thanks Xavier. I think the approach is good: moving the SB update from the
> I-P engine to if-status module, so it wouldn't need to trigger I-P engine
> recompute, and just let the if-status module update the SB as soon as it is
> writable, through the if-status's state-machine.
> However, I have some comments/questions regarding the implementation
> details, primarily confusion on the state transitions. Please see below.
>
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
> > Signed-off-by: Xavier Simonart 
> > ---
> >  controller/binding.c| 124 ++--
> >  controller/binding.h|   9 ++-
> >  controller/if-status.c  |  31 +++--
> >  controller/if-status.h  |   6 +-
> >  controller/ovn-controller.c |   6 +-
> >  tests/ovn.at|  55 +++-
> >  6 files changed, 184 insertions(+), 47 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index e5ba56b25..d917d8775 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -657,11 +657,15 @@ local_binding_get_lport_ofport(const struct shash
> *local_bindings,
> >  }
> >
> >  bool
> > -local_binding_is_up(struct shash *local_bindings, const char *pb_name)
> > +local_binding_is_up(struct shash *local_bindings, const char *pb_name,
> > +const struct sbrec_chassis *chassis_rec)
> >  {
> >  struct local_binding *lbinding =
> >  local_binding_find(local_bindings, pb_name);
> >  struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
> > +if (b_lport && b_lport->pb->chassis != chassis_rec) {
> > +return false;
> > +}
>
> Why need the change here? I understand that it is obvious that the binding
> should not be considered up if the chassis is not updated in SB
> port_binding, but I wonder why we need the change in *this* patch.
> The function is called to see if an interface state should be moved from
> MARK_UP to INSTALLED, but if the chassis is not set, it shouldn't even be
> moved to MARK_UP.
>
> I probably need to provide a better explanation of the states.
In fact, my two goals were: 1) avoid/decrease recomputes and 2) do not
increase latency (time between interface is claimed and it's up in SB and
ovn-installed in OVS).
There is not really an additional state covering the fact that the chassis
is set in sbdb.
As soon as the state is CLAIMED, we try to set the pb->chassis (by 'try' I
mean we update pb->chassis if sb is not readonly).
The fact that sb is readonly in CLAIMED state does not prevent us to move
to INSTALL_FLOWS state, where we are installing flows in OVS.
As soon as the seqno for those flows was received, we moved to MARK_UP
state.
In those three states, we can update pb->chassis (if not already done).
In MARK_UP we set the interface up.
We moved to INSTALLED when both up and chassis have been written.

>  if (lbinding && b_lport && lbinding->iface) {
> >  if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
> >  return false;
> > @@ -673,13 +677,23 @@ local_binding_is_up(struct shash *local_bindings,
> const char *pb_name)
> >  }
> >
> >  bool
> > -local_binding_is_down(struct shash *local_bindings, const char *pb_name)
> > +local_binding_is_down(struct shash *local_bindings, const char *pb_name,
> > +  const struct sbrec_chassis *chassis_rec)
> >  {
> >  struct local_binding *lbinding =
> >  local_binding_find(local_bindings, pb_name);
> >
> >  struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
> >
> > +if (b_lport) {
> > +if (b_lport->pb->chassis == chassis_rec) {
> > +

Re: [ovs-dev] [PATCH ovn] controller: avoid recomputes triggered by SBDB Port_Binding updates.

2022-05-16 Thread Han Zhou
On Thu, May 12, 2022 at 2:04 AM Xavier Simonart  wrote:
>
> When VIF ports are claimed on a chassis, SBDB Port_Binding table is
updated.
> If the SBDB IDL is still is read-only ("in transaction") when such a
update
> is required, the update is not possible and recompute is triggered through
> I+P failure.
>
> This situation can happen:
> - after updating Port_Binding->chassis to SBDB for one port, in a
following
>   iteration, ovn-controller handles Interface:external_ids:ovn-installed
>   (for the same port) while SBDB is still read-only.
> - after updating Port_Binding->chassis to SBDB for one port, in a
following
>   iteration, ovn-controller updates Port_Binding->chassis for another
port,
>   while SBDB is still read-only.
>
> This patch prevent the recompute, by having the if-status module
> updating the Port_Binding chassis (if needed), when possible.
> This does not delay Port_Binding chassis update compared to before this
> patch: Port_Binding chassis will be updated as soon as SBDB is again
> writable, as it was the case, through a recompute, before this patch.
>
Thanks Xavier. I think the approach is good: moving the SB update from the
I-P engine to if-status module, so it wouldn't need to trigger I-P engine
recompute, and just let the if-status module update the SB as soon as it is
writable, through the if-status's state-machine.
However, I have some comments/questions regarding the implementation
details, primarily confusion on the state transitions. Please see below.

> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
> Signed-off-by: Xavier Simonart 
> ---
>  controller/binding.c| 124 ++--
>  controller/binding.h|   9 ++-
>  controller/if-status.c  |  31 +++--
>  controller/if-status.h  |   6 +-
>  controller/ovn-controller.c |   6 +-
>  tests/ovn.at|  55 +++-
>  6 files changed, 184 insertions(+), 47 deletions(-)
>
> diff --git a/controller/binding.c b/controller/binding.c
> index e5ba56b25..d917d8775 100644
> --- a/controller/binding.c
> +++ b/controller/binding.c
> @@ -657,11 +657,15 @@ local_binding_get_lport_ofport(const struct shash
*local_bindings,
>  }
>
>  bool
> -local_binding_is_up(struct shash *local_bindings, const char *pb_name)
> +local_binding_is_up(struct shash *local_bindings, const char *pb_name,
> +const struct sbrec_chassis *chassis_rec)
>  {
>  struct local_binding *lbinding =
>  local_binding_find(local_bindings, pb_name);
>  struct binding_lport *b_lport =
local_binding_get_primary_lport(lbinding);
> +if (b_lport && b_lport->pb->chassis != chassis_rec) {
> +return false;
> +}

Why need the change here? I understand that it is obvious that the binding
should not be considered up if the chassis is not updated in SB
port_binding, but I wonder why we need the change in *this* patch.
The function is called to see if an interface state should be moved from
MARK_UP to INSTALLED, but if the chassis is not set, it shouldn't even be
moved to MARK_UP.

>  if (lbinding && b_lport && lbinding->iface) {
>  if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
>  return false;
> @@ -673,13 +677,23 @@ local_binding_is_up(struct shash *local_bindings,
const char *pb_name)
>  }
>
>  bool
> -local_binding_is_down(struct shash *local_bindings, const char *pb_name)
> +local_binding_is_down(struct shash *local_bindings, const char *pb_name,
> +  const struct sbrec_chassis *chassis_rec)
>  {
>  struct local_binding *lbinding =
>  local_binding_find(local_bindings, pb_name);
>
>  struct binding_lport *b_lport =
local_binding_get_primary_lport(lbinding);
>
> +if (b_lport) {
> +if (b_lport->pb->chassis == chassis_rec) {
> +return false;
> +} else if (b_lport->pb->chassis) {
> +VLOG_DBG("lport %s already claimed by other chassis",
> + b_lport->pb->logical_port);
> +}
> +}
> +

Same as above, it is not clear to me why this change here. It would be
better to clarify the state transition first.

>  if (!lbinding) {
>  return true;
>  }
> @@ -894,6 +908,48 @@ get_lport_type_str(enum en_lport_type lport_type)
>  OVS_NOT_REACHED();
>  }
>
> +static void set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
> +  const struct sbrec_chassis *chassis_rec)
> +{
> +if (pb->chassis != chassis_rec) {
> +if (pb->chassis) {
> +VLOG_INFO("Changing chassis for lport %s from %s to %s.",
> +pb->logical_port, pb->chassis->name,
> +chassis_rec->name);
> +} else {
> +VLOG_INFO("Claiming lport %s for this chassis.",
pb->logical_port);
> +}
> +for (int i = 0; i < pb->n_mac; i++) {
> +VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]);
> +}
> +sbrec_port_bi

Re: [ovs-dev] [PATCH ovn] controller: avoid recomputes triggered by SBDB Port_Binding updates.

2022-05-13 Thread Dumitru Ceara
On 5/13/22 15:41, Xavier Simonart wrote:
> Hi Dumitru
> 
> Thanks for the review.
> I'll go through your suggestions and submit a v2.
> 
> Just commented here on your question - see embedded.
> Thanks
> Xavier
> 
> On Fri, May 13, 2022 at 2:46 PM Dumitru Ceara  wrote:
> 
>> Hi Xavier,
>>
>> On 5/12/22 11:04, Xavier Simonart wrote:
>>> When VIF ports are claimed on a chassis, SBDB Port_Binding table is
>> updated.
>>> If the SBDB IDL is still is read-only ("in transaction") when such a
>> update
>>> is required, the update is not possible and recompute is triggered
>> through
>>> I+P failure.
>>>
>>> This situation can happen:
>>> - after updating Port_Binding->chassis to SBDB for one port, in a
>> following
>>>   iteration, ovn-controller handles Interface:external_ids:ovn-installed
>>>   (for the same port) while SBDB is still read-only.
>>> - after updating Port_Binding->chassis to SBDB for one port, in a
>> following
>>>   iteration, ovn-controller updates Port_Binding->chassis for another
>> port,
>>>   while SBDB is still read-only.
>>>
>>> This patch prevent the recompute, by having the if-status module
>>> updating the Port_Binding chassis (if needed), when possible.
>>> This does not delay Port_Binding chassis update compared to before this
>>> patch: Port_Binding chassis will be updated as soon as SBDB is again
>>> writable, as it was the case, through a recompute, before this patch.
>>
>> Just to confirm, with your patch the Port_Binding chassis will be
>> updated when the SBDB is again writable and will *not* cause a
>> recompute, right?
>>
>> Yes, confirmed. I can reformulate as:
> 
> This patch prevents the recompute, by having the if-status module
> updating the Port_Binding chassis (if needed), when possible.
> This does not delay Port_Binding chassis update compared to before this
> patch.
> - With the patch, Port_Binding chassis will be updated as soon as SBDB is
> again
> writable, without recompute.
> - Without the patch, Port_Binding chassis will be updated as soon as SBDB
> is again
> writable, through a recompute.
> 

Thanks for the confirmation!

> As a slide note, the patch only prevents the recomputes when up is set
> through notification (notify_up=true).
> I think we can later fix the recomputes in the (notify_up=false) case if
> needed, but I thought it was better to see if
> 1) the approach used by the patch is accepted

IMO the approach is fine.

> 2) the use case notify_up=false is really needed

Container and virtual ports are used by OpenStack/Neutron so we might
want to try to fix this case too.

> In that case, additional states would be needed in if-status state machine.
> 

Why can't we reuse the state machine by adding a flag to the ovs_iface
to mark that this is a child interface?  In which case the transition
from OIF_INSTALL_FLOWS to OIF_MARK_UP would happen only if the parent
interface is already in OIF_MARK_UP.

OIF_CLAIMED -> OIF_INSTALL_FLOWS -> OIF_MARK_UP -> OIF_INSTALLED

I'm OK to address this second case (notify_up=false) as a follow up though.

Thanks,
Dumitru

>>
>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
>>> Signed-off-by: Xavier Simonart 
>>> ---
>>>  controller/binding.c| 124 ++--
>>>  controller/binding.h|   9 ++-
>>>  controller/if-status.c  |  31 +++--
>>>  controller/if-status.h  |   6 +-
>>>  controller/ovn-controller.c |   6 +-
>>>  tests/ovn.at|  55 +++-
>>>  6 files changed, 184 insertions(+), 47 deletions(-)
>>>
>>> diff --git a/controller/binding.c b/controller/binding.c
>>> index e5ba56b25..d917d8775 100644
>>> --- a/controller/binding.c
>>> +++ b/controller/binding.c
>>> @@ -657,11 +657,15 @@ local_binding_get_lport_ofport(const struct shash
>> *local_bindings,
>>>  }
>>>
>>>  bool
>>> -local_binding_is_up(struct shash *local_bindings, const char *pb_name)
>>> +local_binding_is_up(struct shash *local_bindings, const char *pb_name,
>>> +const struct sbrec_chassis *chassis_rec)
>>>  {
>>>  struct local_binding *lbinding =
>>>  local_binding_find(local_bindings, pb_name);
>>>  struct binding_lport *b_lport =
>> local_binding_get_primary_lport(lbinding);
>>> +if (b_lport && b_lport->pb->chassis != chassis_rec) {
>>> +return false;
>>> +}
>>>  if (lbinding && b_lport && lbinding->iface) {
>>>  if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
>>>  return false;
>>> @@ -673,13 +677,23 @@ local_binding_is_up(struct shash *local_bindings,
>> const char *pb_name)
>>>  }
>>>
>>>  bool
>>> -local_binding_is_down(struct shash *local_bindings, const char *pb_name)
>>> +local_binding_is_down(struct shash *local_bindings, const char *pb_name,
>>> +  const struct sbrec_chassis *chassis_rec)
>>>  {
>>>  struct local_binding *lbinding =
>>>  local_binding_find(local_bindings, pb_name);
>>>
>>>  struct binding_lport *b_lport =
>> local_bin

Re: [ovs-dev] [PATCH ovn] controller: avoid recomputes triggered by SBDB Port_Binding updates.

2022-05-13 Thread Xavier Simonart
Hi Dumitru

Thanks for the review.
I'll go through your suggestions and submit a v2.

Just commented here on your question - see embedded.
Thanks
Xavier

On Fri, May 13, 2022 at 2:46 PM Dumitru Ceara  wrote:

> Hi Xavier,
>
> On 5/12/22 11:04, Xavier Simonart wrote:
> > When VIF ports are claimed on a chassis, SBDB Port_Binding table is
> updated.
> > If the SBDB IDL is still is read-only ("in transaction") when such a
> update
> > is required, the update is not possible and recompute is triggered
> through
> > I+P failure.
> >
> > This situation can happen:
> > - after updating Port_Binding->chassis to SBDB for one port, in a
> following
> >   iteration, ovn-controller handles Interface:external_ids:ovn-installed
> >   (for the same port) while SBDB is still read-only.
> > - after updating Port_Binding->chassis to SBDB for one port, in a
> following
> >   iteration, ovn-controller updates Port_Binding->chassis for another
> port,
> >   while SBDB is still read-only.
> >
> > This patch prevent the recompute, by having the if-status module
> > updating the Port_Binding chassis (if needed), when possible.
> > This does not delay Port_Binding chassis update compared to before this
> > patch: Port_Binding chassis will be updated as soon as SBDB is again
> > writable, as it was the case, through a recompute, before this patch.
>
> Just to confirm, with your patch the Port_Binding chassis will be
> updated when the SBDB is again writable and will *not* cause a
> recompute, right?
>
> Yes, confirmed. I can reformulate as:

This patch prevents the recompute, by having the if-status module
updating the Port_Binding chassis (if needed), when possible.
This does not delay Port_Binding chassis update compared to before this
patch.
- With the patch, Port_Binding chassis will be updated as soon as SBDB is
again
writable, without recompute.
- Without the patch, Port_Binding chassis will be updated as soon as SBDB
is again
writable, through a recompute.

As a slide note, the patch only prevents the recomputes when up is set
through notification (notify_up=true).
I think we can later fix the recomputes in the (notify_up=false) case if
needed, but I thought it was better to see if
1) the approach used by the patch is accepted
2) the use case notify_up=false is really needed
In that case, additional states would be needed in if-status state machine.

>
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
> > Signed-off-by: Xavier Simonart 
> > ---
> >  controller/binding.c| 124 ++--
> >  controller/binding.h|   9 ++-
> >  controller/if-status.c  |  31 +++--
> >  controller/if-status.h  |   6 +-
> >  controller/ovn-controller.c |   6 +-
> >  tests/ovn.at|  55 +++-
> >  6 files changed, 184 insertions(+), 47 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index e5ba56b25..d917d8775 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -657,11 +657,15 @@ local_binding_get_lport_ofport(const struct shash
> *local_bindings,
> >  }
> >
> >  bool
> > -local_binding_is_up(struct shash *local_bindings, const char *pb_name)
> > +local_binding_is_up(struct shash *local_bindings, const char *pb_name,
> > +const struct sbrec_chassis *chassis_rec)
> >  {
> >  struct local_binding *lbinding =
> >  local_binding_find(local_bindings, pb_name);
> >  struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
> > +if (b_lport && b_lport->pb->chassis != chassis_rec) {
> > +return false;
> > +}
> >  if (lbinding && b_lport && lbinding->iface) {
> >  if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
> >  return false;
> > @@ -673,13 +677,23 @@ local_binding_is_up(struct shash *local_bindings,
> const char *pb_name)
> >  }
> >
> >  bool
> > -local_binding_is_down(struct shash *local_bindings, const char *pb_name)
> > +local_binding_is_down(struct shash *local_bindings, const char *pb_name,
> > +  const struct sbrec_chassis *chassis_rec)
> >  {
> >  struct local_binding *lbinding =
> >  local_binding_find(local_bindings, pb_name);
> >
> >  struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
> >
> > +if (b_lport) {
> > +if (b_lport->pb->chassis == chassis_rec) {
> > +return false;
> > +} else if (b_lport->pb->chassis) {
> > +VLOG_DBG("lport %s already claimed by other chassis",
> > + b_lport->pb->logical_port);
> > +}
> > +}
> > +
> >  if (!lbinding) {
> >  return true;
> >  }
> > @@ -894,6 +908,48 @@ get_lport_type_str(enum en_lport_type lport_type)
> >  OVS_NOT_REACHED();
> >  }
> >
> > +static void set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
>
> Nit: this should be:
>
> static void
> set_pb_chassis_in_sbrec(const struct sbrec_por

Re: [ovs-dev] [PATCH ovn] controller: avoid recomputes triggered by SBDB Port_Binding updates.

2022-05-13 Thread Dumitru Ceara
Hi Xavier,

On 5/12/22 11:04, Xavier Simonart wrote:
> When VIF ports are claimed on a chassis, SBDB Port_Binding table is updated.
> If the SBDB IDL is still is read-only ("in transaction") when such a update
> is required, the update is not possible and recompute is triggered through
> I+P failure.
> 
> This situation can happen:
> - after updating Port_Binding->chassis to SBDB for one port, in a following
>   iteration, ovn-controller handles Interface:external_ids:ovn-installed
>   (for the same port) while SBDB is still read-only.
> - after updating Port_Binding->chassis to SBDB for one port, in a following
>   iteration, ovn-controller updates Port_Binding->chassis for another port,
>   while SBDB is still read-only.
> 
> This patch prevent the recompute, by having the if-status module
> updating the Port_Binding chassis (if needed), when possible.
> This does not delay Port_Binding chassis update compared to before this
> patch: Port_Binding chassis will be updated as soon as SBDB is again
> writable, as it was the case, through a recompute, before this patch.

Just to confirm, with your patch the Port_Binding chassis will be
updated when the SBDB is again writable and will *not* cause a
recompute, right?

> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
> Signed-off-by: Xavier Simonart 
> ---
>  controller/binding.c| 124 ++--
>  controller/binding.h|   9 ++-
>  controller/if-status.c  |  31 +++--
>  controller/if-status.h  |   6 +-
>  controller/ovn-controller.c |   6 +-
>  tests/ovn.at|  55 +++-
>  6 files changed, 184 insertions(+), 47 deletions(-)
> 
> diff --git a/controller/binding.c b/controller/binding.c
> index e5ba56b25..d917d8775 100644
> --- a/controller/binding.c
> +++ b/controller/binding.c
> @@ -657,11 +657,15 @@ local_binding_get_lport_ofport(const struct shash 
> *local_bindings,
>  }
>  
>  bool
> -local_binding_is_up(struct shash *local_bindings, const char *pb_name)
> +local_binding_is_up(struct shash *local_bindings, const char *pb_name,
> +const struct sbrec_chassis *chassis_rec)
>  {
>  struct local_binding *lbinding =
>  local_binding_find(local_bindings, pb_name);
>  struct binding_lport *b_lport = 
> local_binding_get_primary_lport(lbinding);
> +if (b_lport && b_lport->pb->chassis != chassis_rec) {
> +return false;
> +}
>  if (lbinding && b_lport && lbinding->iface) {
>  if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
>  return false;
> @@ -673,13 +677,23 @@ local_binding_is_up(struct shash *local_bindings, const 
> char *pb_name)
>  }
>  
>  bool
> -local_binding_is_down(struct shash *local_bindings, const char *pb_name)
> +local_binding_is_down(struct shash *local_bindings, const char *pb_name,
> +  const struct sbrec_chassis *chassis_rec)
>  {
>  struct local_binding *lbinding =
>  local_binding_find(local_bindings, pb_name);
>  
>  struct binding_lport *b_lport = 
> local_binding_get_primary_lport(lbinding);
>  
> +if (b_lport) {
> +if (b_lport->pb->chassis == chassis_rec) {
> +return false;
> +} else if (b_lport->pb->chassis) {
> +VLOG_DBG("lport %s already claimed by other chassis",
> + b_lport->pb->logical_port);
> +}
> +}
> +
>  if (!lbinding) {
>  return true;
>  }
> @@ -894,6 +908,48 @@ get_lport_type_str(enum en_lport_type lport_type)
>  OVS_NOT_REACHED();
>  }
>  
> +static void set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,

Nit: this should be:

static void
set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,

> +  const struct sbrec_chassis *chassis_rec)
> +{
> +if (pb->chassis != chassis_rec) {
> +if (pb->chassis) {
> +VLOG_INFO("Changing chassis for lport %s from %s to %s.",
> +pb->logical_port, pb->chassis->name,
> +chassis_rec->name);

Nit: indentation.

> +} else {
> +VLOG_INFO("Claiming lport %s for this chassis.", 
> pb->logical_port);
> +}
> +for (int i = 0; i < pb->n_mac; i++) {
> +VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]);
> +}
> +sbrec_port_binding_set_chassis(pb, chassis_rec);
> +}
> +}
> +
> +static void clear_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb)

Here too, the function name should start on a new line.

> +{
> +if (pb->chassis) {
> +sbrec_port_binding_set_chassis(pb, NULL);
> +}
> +}
> +
> +void
> +local_binding_set_pb(struct shash *local_bindings, const char *pb_name,
> +const struct sbrec_chassis *chassis_rec)

Nit: indentation.

> +{
> +struct local_binding *lbinding =
> +local_binding_find(local_bindings, pb_name);
> +struct binding_lport *b_lport = 
> local_binding_get

[ovs-dev] [PATCH ovn] controller: avoid recomputes triggered by SBDB Port_Binding updates.

2022-05-12 Thread Xavier Simonart
When VIF ports are claimed on a chassis, SBDB Port_Binding table is updated.
If the SBDB IDL is still is read-only ("in transaction") when such a update
is required, the update is not possible and recompute is triggered through
I+P failure.

This situation can happen:
- after updating Port_Binding->chassis to SBDB for one port, in a following
  iteration, ovn-controller handles Interface:external_ids:ovn-installed
  (for the same port) while SBDB is still read-only.
- after updating Port_Binding->chassis to SBDB for one port, in a following
  iteration, ovn-controller updates Port_Binding->chassis for another port,
  while SBDB is still read-only.

This patch prevent the recompute, by having the if-status module
updating the Port_Binding chassis (if needed), when possible.
This does not delay Port_Binding chassis update compared to before this
patch: Port_Binding chassis will be updated as soon as SBDB is again
writable, as it was the case, through a recompute, before this patch.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059253
Signed-off-by: Xavier Simonart 
---
 controller/binding.c| 124 ++--
 controller/binding.h|   9 ++-
 controller/if-status.c  |  31 +++--
 controller/if-status.h  |   6 +-
 controller/ovn-controller.c |   6 +-
 tests/ovn.at|  55 +++-
 6 files changed, 184 insertions(+), 47 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index e5ba56b25..d917d8775 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -657,11 +657,15 @@ local_binding_get_lport_ofport(const struct shash 
*local_bindings,
 }
 
 bool
-local_binding_is_up(struct shash *local_bindings, const char *pb_name)
+local_binding_is_up(struct shash *local_bindings, const char *pb_name,
+const struct sbrec_chassis *chassis_rec)
 {
 struct local_binding *lbinding =
 local_binding_find(local_bindings, pb_name);
 struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding);
+if (b_lport && b_lport->pb->chassis != chassis_rec) {
+return false;
+}
 if (lbinding && b_lport && lbinding->iface) {
 if (b_lport->pb->n_up && !b_lport->pb->up[0]) {
 return false;
@@ -673,13 +677,23 @@ local_binding_is_up(struct shash *local_bindings, const 
char *pb_name)
 }
 
 bool
-local_binding_is_down(struct shash *local_bindings, const char *pb_name)
+local_binding_is_down(struct shash *local_bindings, const char *pb_name,
+  const struct sbrec_chassis *chassis_rec)
 {
 struct local_binding *lbinding =
 local_binding_find(local_bindings, pb_name);
 
 struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding);
 
+if (b_lport) {
+if (b_lport->pb->chassis == chassis_rec) {
+return false;
+} else if (b_lport->pb->chassis) {
+VLOG_DBG("lport %s already claimed by other chassis",
+ b_lport->pb->logical_port);
+}
+}
+
 if (!lbinding) {
 return true;
 }
@@ -894,6 +908,48 @@ get_lport_type_str(enum en_lport_type lport_type)
 OVS_NOT_REACHED();
 }
 
+static void set_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb,
+  const struct sbrec_chassis *chassis_rec)
+{
+if (pb->chassis != chassis_rec) {
+if (pb->chassis) {
+VLOG_INFO("Changing chassis for lport %s from %s to %s.",
+pb->logical_port, pb->chassis->name,
+chassis_rec->name);
+} else {
+VLOG_INFO("Claiming lport %s for this chassis.", pb->logical_port);
+}
+for (int i = 0; i < pb->n_mac; i++) {
+VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]);
+}
+sbrec_port_binding_set_chassis(pb, chassis_rec);
+}
+}
+
+static void clear_pb_chassis_in_sbrec(const struct sbrec_port_binding *pb)
+{
+if (pb->chassis) {
+sbrec_port_binding_set_chassis(pb, NULL);
+}
+}
+
+void
+local_binding_set_pb(struct shash *local_bindings, const char *pb_name,
+const struct sbrec_chassis *chassis_rec)
+{
+struct local_binding *lbinding =
+local_binding_find(local_bindings, pb_name);
+struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding);
+
+if (b_lport) {
+if (chassis_rec) {
+set_pb_chassis_in_sbrec(b_lport->pb, chassis_rec);
+} else {
+clear_pb_chassis_in_sbrec(b_lport->pb);
+}
+}
+}
+
 /* For newly claimed ports, if 'notify_up' is 'false':
  * - set the 'pb.up' field to true if 'pb' has no 'parent_pb'.
  * - set the 'pb.up' field to true if 'parent_pb.up' is 'true' (e.g., for
@@ -906,23 +962,37 @@ get_lport_type_str(enum en_lport_type lport_type)
  *   it's explicitly set to 'false'.
  *   This is to ensure compatibility with older versions of ovn-northd.
  */
-static void
+static