Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-15 Thread Dumitru Ceara
On 8/14/23 18:51, Han Zhou wrote:
> On Mon, Aug 14, 2023 at 4:46 AM Dumitru Ceara  wrote:
>>
>> On 8/12/23 07:08, Han Zhou wrote:
>>> On Fri, Aug 11, 2023 at 6:07 AM Dumitru Ceara  wrote:

 On 8/10/23 18:38, Ilya Maximets wrote:
> On 8/10/23 17:34, Dumitru Ceara wrote:
>> On 8/10/23 17:20, Han Zhou wrote:
>>> On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara 
>>> wrote:

 On 8/10/23 15:34, Han Zhou wrote:
> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara 
>>> wrote:
>>
>> On 8/10/23 08:12, Ales Musil wrote:
>>> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson <
> mmich...@redhat.com

> wrote:
>>>
 Hi Ales,

 I have some high-level comments/questions about this patch.

>>>
>>> Hi Mark,
>>>
>>
>> Hi Ales, Mark,
>>
>>> thank you for the review. See my answers inline below.
>>>
>>>
 I have been privy to the conversations that led to this change.
>>> My
 understanding is that by having ovn-northd wake up immediately,
>>> it
>>> can
 be more CPU-intensive than waiting a bit for changes to
>>> accumulate
>>> and
 handling all of those at once instead. However, nothing in
>>> either the
 commit message or ovn-nb.xml explains what the purpose of this
>>> new
 configuration option is. I think you should add a sentence or
>>> two to
 explain why someone would want to enable this option.


>>> Yeah that's my bad, I have v2 prepared with some explanation in
>>> the
> commit
>>> message
>>> together with results from scale run.
>>>
>>
>> +1 we really need to explain why this change is needed.
>>
>>>

 Next, the algorithm used here strikes me as odd. We use the
>>> previous
> run
 time of ovn-northd to determine how long to wait before running
>>> again.
 This delay is capped by the configured backoff time. Let's say
>>> that
 we've configured the backoff interval to be 200 ms. If
> ovn-northd
>>> has a
 super quick run and only takes 10 ms, then we will only delay
> the
>>> next
 run by 10 ms. IMO, this seems like it would not mitigate the
>>> original
 issue by much, since we are only allowing a maximum of 20 ms
> (10
>>> ms
>>> for
 the run of ovn-northd + 10 ms delay) of NB changes to
> accumulate.
 Conversely, if northd has a huge recompute and it takes 5000 ms
>>> to
 complete, then we would delay the next run by 200ms. In this
>>> case,
 delaying at all seems like it's not necessary since we
>>> potentially
>>> have
 5000 ms worth of NB DB updates that have not been addressed. I
>>> would
 have expected the opposite approach to be taken. If someone
>>> configures
 200ms as their backoff interval, I would expect us to always
>>> allow a
 *minimum* of 200ms of NB changes to accumulate before running
>>> again.
>>> So
 for instance, if northd runs quickly and is done in 10 ms, then
>>> we
> would
 wait 200 - 10 = 190 ms before processing changes again. If
> northd
>>> takes
 a long time to recompute and takes 5000 ms, then we would not
>>> wait at
 all before processing changes again. Was the algorithm chosen
>>> based
>>> on
 experimentation? Is it a well-known method I'm just not
> familiar
>>> with?
>>
>> I think the main assumption (that should probably be made
> explicit
>>> in
>> the commit log and/or documentation) is that on average changes
>>> happen
>> in a uniform way.  This might not always be accurate.
>>
>> However, if we're off with the estimate, in the worst case we'd
> be
>> adding the configured max delay to the latency of processing
>>> changes.
>> So, as long as the value is not extremely high, the impact is not
>>> that
>> high either.
>>
>> Last but not least, as this value would be configured by the CMS,
>>> we
>> assume the CMS knows what they're doing. :)
>>

>>>
>>> I'm not sure if the algorithm is well known.
>>>
>>> The thing is that at scale we almost always cap at the backoff
> so
>>> it
>>> has
>>> probably
>>> the same effect as your suggestion with the difference that we
>>> actually
>>> delay even
>>> after long runs. And that is actually desired, it's true that in
>>> the
> let's
>>> say 500 ms
>>> should be enough to accumulate more changes however that can
> lead
>>> to
> another

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-15 Thread Dumitru Ceara
On 8/14/23 19:04, Han Zhou wrote:
> On Mon, Aug 14, 2023 at 9:54 AM Ilya Maximets  wrote:
>>
>> On 8/11/23 15:07, Dumitru Ceara wrote:
>>> On 8/10/23 18:38, Ilya Maximets wrote:
 On 8/10/23 17:34, Dumitru Ceara wrote:
> On 8/10/23 17:20, Han Zhou wrote:
>> On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara 
> wrote:
>>>
>>> On 8/10/23 15:34, Han Zhou wrote:
 On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara 
> wrote:
>
> On 8/10/23 08:12, Ales Musil wrote:
>> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson <
> mmich...@redhat.com>
 wrote:
>>
>>> Hi Ales,
>>>
>>> I have some high-level comments/questions about this patch.
>>>
>>
>> Hi Mark,
>>
>
> Hi Ales, Mark,
>
>> thank you for the review. See my answers inline below.
>>
>>
>>> I have been privy to the conversations that led to this change.
> My
>>> understanding is that by having ovn-northd wake up immediately,
> it
>> can
>>> be more CPU-intensive than waiting a bit for changes to
> accumulate
>> and
>>> handling all of those at once instead. However, nothing in
> either the
>>> commit message or ovn-nb.xml explains what the purpose of this
> new
>>> configuration option is. I think you should add a sentence or
> two to
>>> explain why someone would want to enable this option.
>>>
>>>
>> Yeah that's my bad, I have v2 prepared with some explanation in
> the
 commit
>> message
>> together with results from scale run.
>>
>
> +1 we really need to explain why this change is needed.
>
>>
>>>
>>> Next, the algorithm used here strikes me as odd. We use the
> previous
 run
>>> time of ovn-northd to determine how long to wait before running
>> again.
>>> This delay is capped by the configured backoff time. Let's say
> that
>>> we've configured the backoff interval to be 200 ms. If
> ovn-northd
>> has a
>>> super quick run and only takes 10 ms, then we will only delay
> the
>> next
>>> run by 10 ms. IMO, this seems like it would not mitigate the
> original
>>> issue by much, since we are only allowing a maximum of 20 ms
> (10 ms
>> for
>>> the run of ovn-northd + 10 ms delay) of NB changes to
> accumulate.
>>> Conversely, if northd has a huge recompute and it takes 5000 ms
> to
>>> complete, then we would delay the next run by 200ms. In this
> case,
>>> delaying at all seems like it's not necessary since we
> potentially
>> have
>>> 5000 ms worth of NB DB updates that have not been addressed. I
> would
>>> have expected the opposite approach to be taken. If someone
>> configures
>>> 200ms as their backoff interval, I would expect us to always
> allow a
>>> *minimum* of 200ms of NB changes to accumulate before running
> again.
>> So
>>> for instance, if northd runs quickly and is done in 10 ms, then
> we
 would
>>> wait 200 - 10 = 190 ms before processing changes again. If
> northd
>> takes
>>> a long time to recompute and takes 5000 ms, then we would not
> wait at
>>> all before processing changes again. Was the algorithm chosen
> based
>> on
>>> experimentation? Is it a well-known method I'm just not familiar
>> with?
>
> I think the main assumption (that should probably be made
> explicit in
> the commit log and/or documentation) is that on average changes
> happen
> in a uniform way.  This might not always be accurate.
>
> However, if we're off with the estimate, in the worst case we'd be
> adding the configured max delay to the latency of processing
> changes.
> So, as long as the value is not extremely high, the impact is not
> that
> high either.
>
> Last but not least, as this value would be configured by the CMS,
> we
> assume the CMS knows what they're doing. :)
>
>>>
>>
>> I'm not sure if the algorithm is well known.
>>
>> The thing is that at scale we almost always cap at the backoff
> so it
>> has
>> probably
>> the same effect as your suggestion with the difference that we
>> actually
>> delay even
>> after long runs. And that is actually desired, it's true that in
> the
 let's
>> say 500 ms
>> should be enough to accumulate more changes however that can
> lead to
 another
>> 500 ms run and so on. That in the end means that northd will
> spin at
 100%
>> CPU
>> anyway which is what we want to avoid. So from another point of
> view
>> the
>> accumulation
>> of 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-14 Thread Han Zhou
On Mon, Aug 14, 2023 at 9:54 AM Ilya Maximets  wrote:
>
> On 8/11/23 15:07, Dumitru Ceara wrote:
> > On 8/10/23 18:38, Ilya Maximets wrote:
> >> On 8/10/23 17:34, Dumitru Ceara wrote:
> >>> On 8/10/23 17:20, Han Zhou wrote:
>  On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara 
wrote:
> >
> > On 8/10/23 15:34, Han Zhou wrote:
> >> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara 
wrote:
> >>>
> >>> On 8/10/23 08:12, Ales Musil wrote:
>  On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson <
mmich...@redhat.com>
> >> wrote:
> 
> > Hi Ales,
> >
> > I have some high-level comments/questions about this patch.
> >
> 
>  Hi Mark,
> 
> >>>
> >>> Hi Ales, Mark,
> >>>
>  thank you for the review. See my answers inline below.
> 
> 
> > I have been privy to the conversations that led to this change.
My
> > understanding is that by having ovn-northd wake up immediately,
it
>  can
> > be more CPU-intensive than waiting a bit for changes to
accumulate
>  and
> > handling all of those at once instead. However, nothing in
either the
> > commit message or ovn-nb.xml explains what the purpose of this
new
> > configuration option is. I think you should add a sentence or
two to
> > explain why someone would want to enable this option.
> >
> >
>  Yeah that's my bad, I have v2 prepared with some explanation in
the
> >> commit
>  message
>  together with results from scale run.
> 
> >>>
> >>> +1 we really need to explain why this change is needed.
> >>>
> 
> >
> > Next, the algorithm used here strikes me as odd. We use the
previous
> >> run
> > time of ovn-northd to determine how long to wait before running
>  again.
> > This delay is capped by the configured backoff time. Let's say
that
> > we've configured the backoff interval to be 200 ms. If
ovn-northd
>  has a
> > super quick run and only takes 10 ms, then we will only delay
the
>  next
> > run by 10 ms. IMO, this seems like it would not mitigate the
original
> > issue by much, since we are only allowing a maximum of 20 ms
(10 ms
>  for
> > the run of ovn-northd + 10 ms delay) of NB changes to
accumulate.
> > Conversely, if northd has a huge recompute and it takes 5000 ms
to
> > complete, then we would delay the next run by 200ms. In this
case,
> > delaying at all seems like it's not necessary since we
potentially
>  have
> > 5000 ms worth of NB DB updates that have not been addressed. I
would
> > have expected the opposite approach to be taken. If someone
>  configures
> > 200ms as their backoff interval, I would expect us to always
allow a
> > *minimum* of 200ms of NB changes to accumulate before running
again.
>  So
> > for instance, if northd runs quickly and is done in 10 ms, then
we
> >> would
> > wait 200 - 10 = 190 ms before processing changes again. If
northd
>  takes
> > a long time to recompute and takes 5000 ms, then we would not
wait at
> > all before processing changes again. Was the algorithm chosen
based
>  on
> > experimentation? Is it a well-known method I'm just not familiar
>  with?
> >>>
> >>> I think the main assumption (that should probably be made
explicit in
> >>> the commit log and/or documentation) is that on average changes
happen
> >>> in a uniform way.  This might not always be accurate.
> >>>
> >>> However, if we're off with the estimate, in the worst case we'd be
> >>> adding the configured max delay to the latency of processing
changes.
> >>> So, as long as the value is not extremely high, the impact is not
that
> >>> high either.
> >>>
> >>> Last but not least, as this value would be configured by the CMS,
we
> >>> assume the CMS knows what they're doing. :)
> >>>
> >
> 
>  I'm not sure if the algorithm is well known.
> 
>  The thing is that at scale we almost always cap at the backoff
so it
>  has
>  probably
>  the same effect as your suggestion with the difference that we
>  actually
>  delay even
>  after long runs. And that is actually desired, it's true that in
the
> >> let's
>  say 500 ms
>  should be enough to accumulate more changes however that can
lead to
> >> another
>  500 ms run and so on. That in the end means that northd will
spin at
> >> 100%
>  CPU
>  anyway which is what we want to avoid. So from another point of
view
>  the
>  accumulation
>  of IDL changes is a secondary effect which is still desired, but
not
>  the
>  main purpose.
> 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-14 Thread Ilya Maximets
On 8/11/23 15:07, Dumitru Ceara wrote:
> On 8/10/23 18:38, Ilya Maximets wrote:
>> On 8/10/23 17:34, Dumitru Ceara wrote:
>>> On 8/10/23 17:20, Han Zhou wrote:
 On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara  wrote:
>
> On 8/10/23 15:34, Han Zhou wrote:
>> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara  wrote:
>>>
>>> On 8/10/23 08:12, Ales Musil wrote:
 On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson 
>> wrote:

> Hi Ales,
>
> I have some high-level comments/questions about this patch.
>

 Hi Mark,

>>>
>>> Hi Ales, Mark,
>>>
 thank you for the review. See my answers inline below.


> I have been privy to the conversations that led to this change. My
> understanding is that by having ovn-northd wake up immediately, it
 can
> be more CPU-intensive than waiting a bit for changes to accumulate
 and
> handling all of those at once instead. However, nothing in either the
> commit message or ovn-nb.xml explains what the purpose of this new
> configuration option is. I think you should add a sentence or two to
> explain why someone would want to enable this option.
>
>
 Yeah that's my bad, I have v2 prepared with some explanation in the
>> commit
 message
 together with results from scale run.

>>>
>>> +1 we really need to explain why this change is needed.
>>>

>
> Next, the algorithm used here strikes me as odd. We use the previous
>> run
> time of ovn-northd to determine how long to wait before running
 again.
> This delay is capped by the configured backoff time. Let's say that
> we've configured the backoff interval to be 200 ms. If ovn-northd
 has a
> super quick run and only takes 10 ms, then we will only delay the
 next
> run by 10 ms. IMO, this seems like it would not mitigate the original
> issue by much, since we are only allowing a maximum of 20 ms (10 ms
 for
> the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
> Conversely, if northd has a huge recompute and it takes 5000 ms to
> complete, then we would delay the next run by 200ms. In this case,
> delaying at all seems like it's not necessary since we potentially
 have
> 5000 ms worth of NB DB updates that have not been addressed. I would
> have expected the opposite approach to be taken. If someone
 configures
> 200ms as their backoff interval, I would expect us to always allow a
> *minimum* of 200ms of NB changes to accumulate before running again.
 So
> for instance, if northd runs quickly and is done in 10 ms, then we
>> would
> wait 200 - 10 = 190 ms before processing changes again. If northd
 takes
> a long time to recompute and takes 5000 ms, then we would not wait at
> all before processing changes again. Was the algorithm chosen based
 on
> experimentation? Is it a well-known method I'm just not familiar
 with?
>>>
>>> I think the main assumption (that should probably be made explicit in
>>> the commit log and/or documentation) is that on average changes happen
>>> in a uniform way.  This might not always be accurate.
>>>
>>> However, if we're off with the estimate, in the worst case we'd be
>>> adding the configured max delay to the latency of processing changes.
>>> So, as long as the value is not extremely high, the impact is not that
>>> high either.
>>>
>>> Last but not least, as this value would be configured by the CMS, we
>>> assume the CMS knows what they're doing. :)
>>>
>

 I'm not sure if the algorithm is well known.

 The thing is that at scale we almost always cap at the backoff so it
 has
 probably
 the same effect as your suggestion with the difference that we
 actually
 delay even
 after long runs. And that is actually desired, it's true that in the
>> let's
 say 500 ms
 should be enough to accumulate more changes however that can lead to
>> another
 500 ms run and so on. That in the end means that northd will spin at
>> 100%
 CPU
 anyway which is what we want to avoid. So from another point of view
 the
 accumulation
 of IDL changes is a secondary effect which is still desired, but not
 the
 main purpose.

 Also delaying by short time if the previous run was short is fine, you
>> are
 right that we don't
 accumulate enough however during short running times there is a high
>> chance
 that the
 northd would get to sleep anyway (We will help it to sleep at least a

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-14 Thread Han Zhou
On Mon, Aug 14, 2023 at 4:46 AM Dumitru Ceara  wrote:
>
> On 8/12/23 07:08, Han Zhou wrote:
> > On Fri, Aug 11, 2023 at 6:07 AM Dumitru Ceara  wrote:
> >>
> >> On 8/10/23 18:38, Ilya Maximets wrote:
> >>> On 8/10/23 17:34, Dumitru Ceara wrote:
>  On 8/10/23 17:20, Han Zhou wrote:
> > On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara 
> > wrote:
> >>
> >> On 8/10/23 15:34, Han Zhou wrote:
> >>> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara 
> > wrote:
> 
>  On 8/10/23 08:12, Ales Musil wrote:
> > On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson <
mmich...@redhat.com
> >>
> >>> wrote:
> >
> >> Hi Ales,
> >>
> >> I have some high-level comments/questions about this patch.
> >>
> >
> > Hi Mark,
> >
> 
>  Hi Ales, Mark,
> 
> > thank you for the review. See my answers inline below.
> >
> >
> >> I have been privy to the conversations that led to this change.
> > My
> >> understanding is that by having ovn-northd wake up immediately,
> > it
> > can
> >> be more CPU-intensive than waiting a bit for changes to
> > accumulate
> > and
> >> handling all of those at once instead. However, nothing in
> > either the
> >> commit message or ovn-nb.xml explains what the purpose of this
> > new
> >> configuration option is. I think you should add a sentence or
> > two to
> >> explain why someone would want to enable this option.
> >>
> >>
> > Yeah that's my bad, I have v2 prepared with some explanation in
> > the
> >>> commit
> > message
> > together with results from scale run.
> >
> 
>  +1 we really need to explain why this change is needed.
> 
> >
> >>
> >> Next, the algorithm used here strikes me as odd. We use the
> > previous
> >>> run
> >> time of ovn-northd to determine how long to wait before running
> > again.
> >> This delay is capped by the configured backoff time. Let's say
> > that
> >> we've configured the backoff interval to be 200 ms. If
ovn-northd
> > has a
> >> super quick run and only takes 10 ms, then we will only delay
the
> > next
> >> run by 10 ms. IMO, this seems like it would not mitigate the
> > original
> >> issue by much, since we are only allowing a maximum of 20 ms
(10
> > ms
> > for
> >> the run of ovn-northd + 10 ms delay) of NB changes to
accumulate.
> >> Conversely, if northd has a huge recompute and it takes 5000 ms
> > to
> >> complete, then we would delay the next run by 200ms. In this
> > case,
> >> delaying at all seems like it's not necessary since we
> > potentially
> > have
> >> 5000 ms worth of NB DB updates that have not been addressed. I
> > would
> >> have expected the opposite approach to be taken. If someone
> > configures
> >> 200ms as their backoff interval, I would expect us to always
> > allow a
> >> *minimum* of 200ms of NB changes to accumulate before running
> > again.
> > So
> >> for instance, if northd runs quickly and is done in 10 ms, then
> > we
> >>> would
> >> wait 200 - 10 = 190 ms before processing changes again. If
northd
> > takes
> >> a long time to recompute and takes 5000 ms, then we would not
> > wait at
> >> all before processing changes again. Was the algorithm chosen
> > based
> > on
> >> experimentation? Is it a well-known method I'm just not
familiar
> > with?
> 
>  I think the main assumption (that should probably be made
explicit
> > in
>  the commit log and/or documentation) is that on average changes
> > happen
>  in a uniform way.  This might not always be accurate.
> 
>  However, if we're off with the estimate, in the worst case we'd
be
>  adding the configured max delay to the latency of processing
> > changes.
>  So, as long as the value is not extremely high, the impact is not
> > that
>  high either.
> 
>  Last but not least, as this value would be configured by the CMS,
> > we
>  assume the CMS knows what they're doing. :)
> 
> >>
> >
> > I'm not sure if the algorithm is well known.
> >
> > The thing is that at scale we almost always cap at the backoff
so
> > it
> > has
> > probably
> > the same effect as your suggestion with the difference that we
> > actually
> > delay even
> > after long runs. And that is actually desired, it's true that in
> > the
> >>> let's
> > say 500 ms
> > should be enough to accumulate more changes however that can
lead
> > to
> >>> another
> > 500 ms run and so on. That in the end means that 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-14 Thread Ales Musil
On Mon, Aug 14, 2023 at 1:46 PM Dumitru Ceara  wrote:

> On 8/12/23 07:08, Han Zhou wrote:
> > On Fri, Aug 11, 2023 at 6:07 AM Dumitru Ceara  wrote:
> >>
> >> On 8/10/23 18:38, Ilya Maximets wrote:
> >>> On 8/10/23 17:34, Dumitru Ceara wrote:
>  On 8/10/23 17:20, Han Zhou wrote:
> > On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara 
> > wrote:
> >>
> >> On 8/10/23 15:34, Han Zhou wrote:
> >>> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara 
> > wrote:
> 
>  On 8/10/23 08:12, Ales Musil wrote:
> > On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson <
> mmich...@redhat.com
> >>
> >>> wrote:
> >
> >> Hi Ales,
> >>
> >> I have some high-level comments/questions about this patch.
> >>
> >
> > Hi Mark,
> >
> 
>  Hi Ales, Mark,
> 
> > thank you for the review. See my answers inline below.
> >
> >
> >> I have been privy to the conversations that led to this change.
> > My
> >> understanding is that by having ovn-northd wake up immediately,
> > it
> > can
> >> be more CPU-intensive than waiting a bit for changes to
> > accumulate
> > and
> >> handling all of those at once instead. However, nothing in
> > either the
> >> commit message or ovn-nb.xml explains what the purpose of this
> > new
> >> configuration option is. I think you should add a sentence or
> > two to
> >> explain why someone would want to enable this option.
> >>
> >>
> > Yeah that's my bad, I have v2 prepared with some explanation in
> > the
> >>> commit
> > message
> > together with results from scale run.
> >
> 
>  +1 we really need to explain why this change is needed.
> 
> >
> >>
> >> Next, the algorithm used here strikes me as odd. We use the
> > previous
> >>> run
> >> time of ovn-northd to determine how long to wait before running
> > again.
> >> This delay is capped by the configured backoff time. Let's say
> > that
> >> we've configured the backoff interval to be 200 ms. If
> ovn-northd
> > has a
> >> super quick run and only takes 10 ms, then we will only delay
> the
> > next
> >> run by 10 ms. IMO, this seems like it would not mitigate the
> > original
> >> issue by much, since we are only allowing a maximum of 20 ms (10
> > ms
> > for
> >> the run of ovn-northd + 10 ms delay) of NB changes to
> accumulate.
> >> Conversely, if northd has a huge recompute and it takes 5000 ms
> > to
> >> complete, then we would delay the next run by 200ms. In this
> > case,
> >> delaying at all seems like it's not necessary since we
> > potentially
> > have
> >> 5000 ms worth of NB DB updates that have not been addressed. I
> > would
> >> have expected the opposite approach to be taken. If someone
> > configures
> >> 200ms as their backoff interval, I would expect us to always
> > allow a
> >> *minimum* of 200ms of NB changes to accumulate before running
> > again.
> > So
> >> for instance, if northd runs quickly and is done in 10 ms, then
> > we
> >>> would
> >> wait 200 - 10 = 190 ms before processing changes again. If
> northd
> > takes
> >> a long time to recompute and takes 5000 ms, then we would not
> > wait at
> >> all before processing changes again. Was the algorithm chosen
> > based
> > on
> >> experimentation? Is it a well-known method I'm just not familiar
> > with?
> 
>  I think the main assumption (that should probably be made explicit
> > in
>  the commit log and/or documentation) is that on average changes
> > happen
>  in a uniform way.  This might not always be accurate.
> 
>  However, if we're off with the estimate, in the worst case we'd be
>  adding the configured max delay to the latency of processing
> > changes.
>  So, as long as the value is not extremely high, the impact is not
> > that
>  high either.
> 
>  Last but not least, as this value would be configured by the CMS,
> > we
>  assume the CMS knows what they're doing. :)
> 
> >>
> >
> > I'm not sure if the algorithm is well known.
> >
> > The thing is that at scale we almost always cap at the backoff so
> > it
> > has
> > probably
> > the same effect as your suggestion with the difference that we
> > actually
> > delay even
> > after long runs. And that is actually desired, it's true that in
> > the
> >>> let's
> > say 500 ms
> > should be enough to accumulate more changes however that can lead
> > to
> >>> another
> > 500 ms run and so on. That in the end 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-14 Thread Dumitru Ceara
On 8/12/23 07:08, Han Zhou wrote:
> On Fri, Aug 11, 2023 at 6:07 AM Dumitru Ceara  wrote:
>>
>> On 8/10/23 18:38, Ilya Maximets wrote:
>>> On 8/10/23 17:34, Dumitru Ceara wrote:
 On 8/10/23 17:20, Han Zhou wrote:
> On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara 
> wrote:
>>
>> On 8/10/23 15:34, Han Zhou wrote:
>>> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara 
> wrote:

 On 8/10/23 08:12, Ales Musil wrote:
> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson >
>>> wrote:
>
>> Hi Ales,
>>
>> I have some high-level comments/questions about this patch.
>>
>
> Hi Mark,
>

 Hi Ales, Mark,

> thank you for the review. See my answers inline below.
>
>
>> I have been privy to the conversations that led to this change.
> My
>> understanding is that by having ovn-northd wake up immediately,
> it
> can
>> be more CPU-intensive than waiting a bit for changes to
> accumulate
> and
>> handling all of those at once instead. However, nothing in
> either the
>> commit message or ovn-nb.xml explains what the purpose of this
> new
>> configuration option is. I think you should add a sentence or
> two to
>> explain why someone would want to enable this option.
>>
>>
> Yeah that's my bad, I have v2 prepared with some explanation in
> the
>>> commit
> message
> together with results from scale run.
>

 +1 we really need to explain why this change is needed.

>
>>
>> Next, the algorithm used here strikes me as odd. We use the
> previous
>>> run
>> time of ovn-northd to determine how long to wait before running
> again.
>> This delay is capped by the configured backoff time. Let's say
> that
>> we've configured the backoff interval to be 200 ms. If ovn-northd
> has a
>> super quick run and only takes 10 ms, then we will only delay the
> next
>> run by 10 ms. IMO, this seems like it would not mitigate the
> original
>> issue by much, since we are only allowing a maximum of 20 ms (10
> ms
> for
>> the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
>> Conversely, if northd has a huge recompute and it takes 5000 ms
> to
>> complete, then we would delay the next run by 200ms. In this
> case,
>> delaying at all seems like it's not necessary since we
> potentially
> have
>> 5000 ms worth of NB DB updates that have not been addressed. I
> would
>> have expected the opposite approach to be taken. If someone
> configures
>> 200ms as their backoff interval, I would expect us to always
> allow a
>> *minimum* of 200ms of NB changes to accumulate before running
> again.
> So
>> for instance, if northd runs quickly and is done in 10 ms, then
> we
>>> would
>> wait 200 - 10 = 190 ms before processing changes again. If northd
> takes
>> a long time to recompute and takes 5000 ms, then we would not
> wait at
>> all before processing changes again. Was the algorithm chosen
> based
> on
>> experimentation? Is it a well-known method I'm just not familiar
> with?

 I think the main assumption (that should probably be made explicit
> in
 the commit log and/or documentation) is that on average changes
> happen
 in a uniform way.  This might not always be accurate.

 However, if we're off with the estimate, in the worst case we'd be
 adding the configured max delay to the latency of processing
> changes.
 So, as long as the value is not extremely high, the impact is not
> that
 high either.

 Last but not least, as this value would be configured by the CMS,
> we
 assume the CMS knows what they're doing. :)

>>
>
> I'm not sure if the algorithm is well known.
>
> The thing is that at scale we almost always cap at the backoff so
> it
> has
> probably
> the same effect as your suggestion with the difference that we
> actually
> delay even
> after long runs. And that is actually desired, it's true that in
> the
>>> let's
> say 500 ms
> should be enough to accumulate more changes however that can lead
> to
>>> another
> 500 ms run and so on. That in the end means that northd will spin
> at
>>> 100%
> CPU
> anyway which is what we want to avoid. So from another point of
> view
> the
> accumulation
> of IDL changes is a secondary effect which is still desired, but
> not
> the
> main purpose.
>
> Also delaying by short time if the previous run was short is
> 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-11 Thread Han Zhou
On Fri, Aug 11, 2023 at 6:07 AM Dumitru Ceara  wrote:
>
> On 8/10/23 18:38, Ilya Maximets wrote:
> > On 8/10/23 17:34, Dumitru Ceara wrote:
> >> On 8/10/23 17:20, Han Zhou wrote:
> >>> On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara 
wrote:
> 
>  On 8/10/23 15:34, Han Zhou wrote:
> > On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara 
wrote:
> >>
> >> On 8/10/23 08:12, Ales Musil wrote:
> >>> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson 
> > wrote:
> >>>
>  Hi Ales,
> 
>  I have some high-level comments/questions about this patch.
> 
> >>>
> >>> Hi Mark,
> >>>
> >>
> >> Hi Ales, Mark,
> >>
> >>> thank you for the review. See my answers inline below.
> >>>
> >>>
>  I have been privy to the conversations that led to this change.
My
>  understanding is that by having ovn-northd wake up immediately,
it
> >>> can
>  be more CPU-intensive than waiting a bit for changes to
accumulate
> >>> and
>  handling all of those at once instead. However, nothing in
either the
>  commit message or ovn-nb.xml explains what the purpose of this
new
>  configuration option is. I think you should add a sentence or
two to
>  explain why someone would want to enable this option.
> 
> 
> >>> Yeah that's my bad, I have v2 prepared with some explanation in
the
> > commit
> >>> message
> >>> together with results from scale run.
> >>>
> >>
> >> +1 we really need to explain why this change is needed.
> >>
> >>>
> 
>  Next, the algorithm used here strikes me as odd. We use the
previous
> > run
>  time of ovn-northd to determine how long to wait before running
> >>> again.
>  This delay is capped by the configured backoff time. Let's say
that
>  we've configured the backoff interval to be 200 ms. If ovn-northd
> >>> has a
>  super quick run and only takes 10 ms, then we will only delay the
> >>> next
>  run by 10 ms. IMO, this seems like it would not mitigate the
original
>  issue by much, since we are only allowing a maximum of 20 ms (10
ms
> >>> for
>  the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
>  Conversely, if northd has a huge recompute and it takes 5000 ms
to
>  complete, then we would delay the next run by 200ms. In this
case,
>  delaying at all seems like it's not necessary since we
potentially
> >>> have
>  5000 ms worth of NB DB updates that have not been addressed. I
would
>  have expected the opposite approach to be taken. If someone
> >>> configures
>  200ms as their backoff interval, I would expect us to always
allow a
>  *minimum* of 200ms of NB changes to accumulate before running
again.
> >>> So
>  for instance, if northd runs quickly and is done in 10 ms, then
we
> > would
>  wait 200 - 10 = 190 ms before processing changes again. If northd
> >>> takes
>  a long time to recompute and takes 5000 ms, then we would not
wait at
>  all before processing changes again. Was the algorithm chosen
based
> >>> on
>  experimentation? Is it a well-known method I'm just not familiar
> >>> with?
> >>
> >> I think the main assumption (that should probably be made explicit
in
> >> the commit log and/or documentation) is that on average changes
happen
> >> in a uniform way.  This might not always be accurate.
> >>
> >> However, if we're off with the estimate, in the worst case we'd be
> >> adding the configured max delay to the latency of processing
changes.
> >> So, as long as the value is not extremely high, the impact is not
that
> >> high either.
> >>
> >> Last but not least, as this value would be configured by the CMS,
we
> >> assume the CMS knows what they're doing. :)
> >>
> 
> >>>
> >>> I'm not sure if the algorithm is well known.
> >>>
> >>> The thing is that at scale we almost always cap at the backoff so
it
> >>> has
> >>> probably
> >>> the same effect as your suggestion with the difference that we
> >>> actually
> >>> delay even
> >>> after long runs. And that is actually desired, it's true that in
the
> > let's
> >>> say 500 ms
> >>> should be enough to accumulate more changes however that can lead
to
> > another
> >>> 500 ms run and so on. That in the end means that northd will spin
at
> > 100%
> >>> CPU
> >>> anyway which is what we want to avoid. So from another point of
view
> >>> the
> >>> accumulation
> >>> of IDL changes is a secondary effect which is still desired, but
not
> >>> the
> >>> main purpose.
> >>>
> >>> Also delaying by short time if the previous run was short is
fine, you
> > are
> >>> right that we don't
> >>> accumulate enough however during short running 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-11 Thread Dumitru Ceara
On 8/10/23 18:38, Ilya Maximets wrote:
> On 8/10/23 17:34, Dumitru Ceara wrote:
>> On 8/10/23 17:20, Han Zhou wrote:
>>> On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara  wrote:

 On 8/10/23 15:34, Han Zhou wrote:
> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara  wrote:
>>
>> On 8/10/23 08:12, Ales Musil wrote:
>>> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson 
> wrote:
>>>
 Hi Ales,

 I have some high-level comments/questions about this patch.

>>>
>>> Hi Mark,
>>>
>>
>> Hi Ales, Mark,
>>
>>> thank you for the review. See my answers inline below.
>>>
>>>
 I have been privy to the conversations that led to this change. My
 understanding is that by having ovn-northd wake up immediately, it
>>> can
 be more CPU-intensive than waiting a bit for changes to accumulate
>>> and
 handling all of those at once instead. However, nothing in either the
 commit message or ovn-nb.xml explains what the purpose of this new
 configuration option is. I think you should add a sentence or two to
 explain why someone would want to enable this option.


>>> Yeah that's my bad, I have v2 prepared with some explanation in the
> commit
>>> message
>>> together with results from scale run.
>>>
>>
>> +1 we really need to explain why this change is needed.
>>
>>>

 Next, the algorithm used here strikes me as odd. We use the previous
> run
 time of ovn-northd to determine how long to wait before running
>>> again.
 This delay is capped by the configured backoff time. Let's say that
 we've configured the backoff interval to be 200 ms. If ovn-northd
>>> has a
 super quick run and only takes 10 ms, then we will only delay the
>>> next
 run by 10 ms. IMO, this seems like it would not mitigate the original
 issue by much, since we are only allowing a maximum of 20 ms (10 ms
>>> for
 the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
 Conversely, if northd has a huge recompute and it takes 5000 ms to
 complete, then we would delay the next run by 200ms. In this case,
 delaying at all seems like it's not necessary since we potentially
>>> have
 5000 ms worth of NB DB updates that have not been addressed. I would
 have expected the opposite approach to be taken. If someone
>>> configures
 200ms as their backoff interval, I would expect us to always allow a
 *minimum* of 200ms of NB changes to accumulate before running again.
>>> So
 for instance, if northd runs quickly and is done in 10 ms, then we
> would
 wait 200 - 10 = 190 ms before processing changes again. If northd
>>> takes
 a long time to recompute and takes 5000 ms, then we would not wait at
 all before processing changes again. Was the algorithm chosen based
>>> on
 experimentation? Is it a well-known method I'm just not familiar
>>> with?
>>
>> I think the main assumption (that should probably be made explicit in
>> the commit log and/or documentation) is that on average changes happen
>> in a uniform way.  This might not always be accurate.
>>
>> However, if we're off with the estimate, in the worst case we'd be
>> adding the configured max delay to the latency of processing changes.
>> So, as long as the value is not extremely high, the impact is not that
>> high either.
>>
>> Last but not least, as this value would be configured by the CMS, we
>> assume the CMS knows what they're doing. :)
>>

>>>
>>> I'm not sure if the algorithm is well known.
>>>
>>> The thing is that at scale we almost always cap at the backoff so it
>>> has
>>> probably
>>> the same effect as your suggestion with the difference that we
>>> actually
>>> delay even
>>> after long runs. And that is actually desired, it's true that in the
> let's
>>> say 500 ms
>>> should be enough to accumulate more changes however that can lead to
> another
>>> 500 ms run and so on. That in the end means that northd will spin at
> 100%
>>> CPU
>>> anyway which is what we want to avoid. So from another point of view
>>> the
>>> accumulation
>>> of IDL changes is a secondary effect which is still desired, but not
>>> the
>>> main purpose.
>>>
>>> Also delaying by short time if the previous run was short is fine, you
> are
>>> right that we don't
>>> accumulate enough however during short running times there is a high
> chance
>>> that the
>>> northd would get to sleep anyway (We will help it to sleep at least a
> bit
>>> nevertheless).
>>>
>>>

 Next, I notice that you've added new poll_timer_wait() calls but
> haven't
 changed 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-10 Thread Ilya Maximets
On 8/10/23 17:34, Dumitru Ceara wrote:
> On 8/10/23 17:20, Han Zhou wrote:
>> On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara  wrote:
>>>
>>> On 8/10/23 15:34, Han Zhou wrote:
 On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara  wrote:
>
> On 8/10/23 08:12, Ales Musil wrote:
>> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson 
 wrote:
>>
>>> Hi Ales,
>>>
>>> I have some high-level comments/questions about this patch.
>>>
>>
>> Hi Mark,
>>
>
> Hi Ales, Mark,
>
>> thank you for the review. See my answers inline below.
>>
>>
>>> I have been privy to the conversations that led to this change. My
>>> understanding is that by having ovn-northd wake up immediately, it
>> can
>>> be more CPU-intensive than waiting a bit for changes to accumulate
>> and
>>> handling all of those at once instead. However, nothing in either the
>>> commit message or ovn-nb.xml explains what the purpose of this new
>>> configuration option is. I think you should add a sentence or two to
>>> explain why someone would want to enable this option.
>>>
>>>
>> Yeah that's my bad, I have v2 prepared with some explanation in the
 commit
>> message
>> together with results from scale run.
>>
>
> +1 we really need to explain why this change is needed.
>
>>
>>>
>>> Next, the algorithm used here strikes me as odd. We use the previous
 run
>>> time of ovn-northd to determine how long to wait before running
>> again.
>>> This delay is capped by the configured backoff time. Let's say that
>>> we've configured the backoff interval to be 200 ms. If ovn-northd
>> has a
>>> super quick run and only takes 10 ms, then we will only delay the
>> next
>>> run by 10 ms. IMO, this seems like it would not mitigate the original
>>> issue by much, since we are only allowing a maximum of 20 ms (10 ms
>> for
>>> the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
>>> Conversely, if northd has a huge recompute and it takes 5000 ms to
>>> complete, then we would delay the next run by 200ms. In this case,
>>> delaying at all seems like it's not necessary since we potentially
>> have
>>> 5000 ms worth of NB DB updates that have not been addressed. I would
>>> have expected the opposite approach to be taken. If someone
>> configures
>>> 200ms as their backoff interval, I would expect us to always allow a
>>> *minimum* of 200ms of NB changes to accumulate before running again.
>> So
>>> for instance, if northd runs quickly and is done in 10 ms, then we
 would
>>> wait 200 - 10 = 190 ms before processing changes again. If northd
>> takes
>>> a long time to recompute and takes 5000 ms, then we would not wait at
>>> all before processing changes again. Was the algorithm chosen based
>> on
>>> experimentation? Is it a well-known method I'm just not familiar
>> with?
>
> I think the main assumption (that should probably be made explicit in
> the commit log and/or documentation) is that on average changes happen
> in a uniform way.  This might not always be accurate.
>
> However, if we're off with the estimate, in the worst case we'd be
> adding the configured max delay to the latency of processing changes.
> So, as long as the value is not extremely high, the impact is not that
> high either.
>
> Last but not least, as this value would be configured by the CMS, we
> assume the CMS knows what they're doing. :)
>
>>>
>>
>> I'm not sure if the algorithm is well known.
>>
>> The thing is that at scale we almost always cap at the backoff so it
>> has
>> probably
>> the same effect as your suggestion with the difference that we
>> actually
>> delay even
>> after long runs. And that is actually desired, it's true that in the
 let's
>> say 500 ms
>> should be enough to accumulate more changes however that can lead to
 another
>> 500 ms run and so on. That in the end means that northd will spin at
 100%
>> CPU
>> anyway which is what we want to avoid. So from another point of view
>> the
>> accumulation
>> of IDL changes is a secondary effect which is still desired, but not
>> the
>> main purpose.
>>
>> Also delaying by short time if the previous run was short is fine, you
 are
>> right that we don't
>> accumulate enough however during short running times there is a high
 chance
>> that the
>> northd would get to sleep anyway (We will help it to sleep at least a
 bit
>> nevertheless).
>>
>>
>>>
>>> Next, I notice that you've added new poll_timer_wait() calls but
 haven't
>>> changed the ovsdb_idl_loop_run() or ovsdb_idl_loop_commit_and_wait()
>>> calls. Is there any danger of ovn-northd getting in a busy loop of
>>> sleeping and waking because 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-10 Thread Han Zhou
On Thu, Aug 10, 2023 at 8:34 AM Dumitru Ceara  wrote:
>
> On 8/10/23 17:20, Han Zhou wrote:
> > On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara  wrote:
> >>
> >> On 8/10/23 15:34, Han Zhou wrote:
> >>> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara 
wrote:
> 
>  On 8/10/23 08:12, Ales Musil wrote:
> > On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson 
> >>> wrote:
> >
> >> Hi Ales,
> >>
> >> I have some high-level comments/questions about this patch.
> >>
> >
> > Hi Mark,
> >
> 
>  Hi Ales, Mark,
> 
> > thank you for the review. See my answers inline below.
> >
> >
> >> I have been privy to the conversations that led to this change. My
> >> understanding is that by having ovn-northd wake up immediately, it
> > can
> >> be more CPU-intensive than waiting a bit for changes to accumulate
> > and
> >> handling all of those at once instead. However, nothing in either
the
> >> commit message or ovn-nb.xml explains what the purpose of this new
> >> configuration option is. I think you should add a sentence or two
to
> >> explain why someone would want to enable this option.
> >>
> >>
> > Yeah that's my bad, I have v2 prepared with some explanation in the
> >>> commit
> > message
> > together with results from scale run.
> >
> 
>  +1 we really need to explain why this change is needed.
> 
> >
> >>
> >> Next, the algorithm used here strikes me as odd. We use the
previous
> >>> run
> >> time of ovn-northd to determine how long to wait before running
> > again.
> >> This delay is capped by the configured backoff time. Let's say that
> >> we've configured the backoff interval to be 200 ms. If ovn-northd
> > has a
> >> super quick run and only takes 10 ms, then we will only delay the
> > next
> >> run by 10 ms. IMO, this seems like it would not mitigate the
original
> >> issue by much, since we are only allowing a maximum of 20 ms (10 ms
> > for
> >> the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
> >> Conversely, if northd has a huge recompute and it takes 5000 ms to
> >> complete, then we would delay the next run by 200ms. In this case,
> >> delaying at all seems like it's not necessary since we potentially
> > have
> >> 5000 ms worth of NB DB updates that have not been addressed. I
would
> >> have expected the opposite approach to be taken. If someone
> > configures
> >> 200ms as their backoff interval, I would expect us to always allow
a
> >> *minimum* of 200ms of NB changes to accumulate before running
again.
> > So
> >> for instance, if northd runs quickly and is done in 10 ms, then we
> >>> would
> >> wait 200 - 10 = 190 ms before processing changes again. If northd
> > takes
> >> a long time to recompute and takes 5000 ms, then we would not wait
at
> >> all before processing changes again. Was the algorithm chosen based
> > on
> >> experimentation? Is it a well-known method I'm just not familiar
> > with?
> 
>  I think the main assumption (that should probably be made explicit in
>  the commit log and/or documentation) is that on average changes
happen
>  in a uniform way.  This might not always be accurate.
> 
>  However, if we're off with the estimate, in the worst case we'd be
>  adding the configured max delay to the latency of processing changes.
>  So, as long as the value is not extremely high, the impact is not
that
>  high either.
> 
>  Last but not least, as this value would be configured by the CMS, we
>  assume the CMS knows what they're doing. :)
> 
> >>
> >
> > I'm not sure if the algorithm is well known.
> >
> > The thing is that at scale we almost always cap at the backoff so it
> > has
> > probably
> > the same effect as your suggestion with the difference that we
> > actually
> > delay even
> > after long runs. And that is actually desired, it's true that in the
> >>> let's
> > say 500 ms
> > should be enough to accumulate more changes however that can lead to
> >>> another
> > 500 ms run and so on. That in the end means that northd will spin at
> >>> 100%
> > CPU
> > anyway which is what we want to avoid. So from another point of view
> > the
> > accumulation
> > of IDL changes is a secondary effect which is still desired, but not
> > the
> > main purpose.
> >
> > Also delaying by short time if the previous run was short is fine,
you
> >>> are
> > right that we don't
> > accumulate enough however during short running times there is a high
> >>> chance
> > that the
> > northd would get to sleep anyway (We will help it to sleep at least
a
> >>> bit
> > nevertheless).
> >
> >
> >>
> >> Next, I notice that you've added new poll_timer_wait() calls but
> >>> haven't
> >> changed the ovsdb_idl_loop_run() or

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-10 Thread Dumitru Ceara
On 8/10/23 17:20, Han Zhou wrote:
> On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara  wrote:
>>
>> On 8/10/23 15:34, Han Zhou wrote:
>>> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara  wrote:

 On 8/10/23 08:12, Ales Musil wrote:
> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson 
>>> wrote:
>
>> Hi Ales,
>>
>> I have some high-level comments/questions about this patch.
>>
>
> Hi Mark,
>

 Hi Ales, Mark,

> thank you for the review. See my answers inline below.
>
>
>> I have been privy to the conversations that led to this change. My
>> understanding is that by having ovn-northd wake up immediately, it
> can
>> be more CPU-intensive than waiting a bit for changes to accumulate
> and
>> handling all of those at once instead. However, nothing in either the
>> commit message or ovn-nb.xml explains what the purpose of this new
>> configuration option is. I think you should add a sentence or two to
>> explain why someone would want to enable this option.
>>
>>
> Yeah that's my bad, I have v2 prepared with some explanation in the
>>> commit
> message
> together with results from scale run.
>

 +1 we really need to explain why this change is needed.

>
>>
>> Next, the algorithm used here strikes me as odd. We use the previous
>>> run
>> time of ovn-northd to determine how long to wait before running
> again.
>> This delay is capped by the configured backoff time. Let's say that
>> we've configured the backoff interval to be 200 ms. If ovn-northd
> has a
>> super quick run and only takes 10 ms, then we will only delay the
> next
>> run by 10 ms. IMO, this seems like it would not mitigate the original
>> issue by much, since we are only allowing a maximum of 20 ms (10 ms
> for
>> the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
>> Conversely, if northd has a huge recompute and it takes 5000 ms to
>> complete, then we would delay the next run by 200ms. In this case,
>> delaying at all seems like it's not necessary since we potentially
> have
>> 5000 ms worth of NB DB updates that have not been addressed. I would
>> have expected the opposite approach to be taken. If someone
> configures
>> 200ms as their backoff interval, I would expect us to always allow a
>> *minimum* of 200ms of NB changes to accumulate before running again.
> So
>> for instance, if northd runs quickly and is done in 10 ms, then we
>>> would
>> wait 200 - 10 = 190 ms before processing changes again. If northd
> takes
>> a long time to recompute and takes 5000 ms, then we would not wait at
>> all before processing changes again. Was the algorithm chosen based
> on
>> experimentation? Is it a well-known method I'm just not familiar
> with?

 I think the main assumption (that should probably be made explicit in
 the commit log and/or documentation) is that on average changes happen
 in a uniform way.  This might not always be accurate.

 However, if we're off with the estimate, in the worst case we'd be
 adding the configured max delay to the latency of processing changes.
 So, as long as the value is not extremely high, the impact is not that
 high either.

 Last but not least, as this value would be configured by the CMS, we
 assume the CMS knows what they're doing. :)

>>
>
> I'm not sure if the algorithm is well known.
>
> The thing is that at scale we almost always cap at the backoff so it
> has
> probably
> the same effect as your suggestion with the difference that we
> actually
> delay even
> after long runs. And that is actually desired, it's true that in the
>>> let's
> say 500 ms
> should be enough to accumulate more changes however that can lead to
>>> another
> 500 ms run and so on. That in the end means that northd will spin at
>>> 100%
> CPU
> anyway which is what we want to avoid. So from another point of view
> the
> accumulation
> of IDL changes is a secondary effect which is still desired, but not
> the
> main purpose.
>
> Also delaying by short time if the previous run was short is fine, you
>>> are
> right that we don't
> accumulate enough however during short running times there is a high
>>> chance
> that the
> northd would get to sleep anyway (We will help it to sleep at least a
>>> bit
> nevertheless).
>
>
>>
>> Next, I notice that you've added new poll_timer_wait() calls but
>>> haven't
>> changed the ovsdb_idl_loop_run() or ovsdb_idl_loop_commit_and_wait()
>> calls. Is there any danger of ovn-northd getting in a busy loop of
>> sleeping and waking because of this? I don't think it should, since
>> presumably ovsdb_idl_loop_run() should clear the conditions waited on
>>> by
>> ovsdb_idl_loop_commit_and_wait(), but I 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-10 Thread Han Zhou
On Thu, Aug 10, 2023 at 6:36 AM Dumitru Ceara  wrote:
>
> On 8/10/23 15:34, Han Zhou wrote:
> > On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara  wrote:
> >>
> >> On 8/10/23 08:12, Ales Musil wrote:
> >>> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson 
> > wrote:
> >>>
>  Hi Ales,
> 
>  I have some high-level comments/questions about this patch.
> 
> >>>
> >>> Hi Mark,
> >>>
> >>
> >> Hi Ales, Mark,
> >>
> >>> thank you for the review. See my answers inline below.
> >>>
> >>>
>  I have been privy to the conversations that led to this change. My
>  understanding is that by having ovn-northd wake up immediately, it
can
>  be more CPU-intensive than waiting a bit for changes to accumulate
and
>  handling all of those at once instead. However, nothing in either the
>  commit message or ovn-nb.xml explains what the purpose of this new
>  configuration option is. I think you should add a sentence or two to
>  explain why someone would want to enable this option.
> 
> 
> >>> Yeah that's my bad, I have v2 prepared with some explanation in the
> > commit
> >>> message
> >>> together with results from scale run.
> >>>
> >>
> >> +1 we really need to explain why this change is needed.
> >>
> >>>
> 
>  Next, the algorithm used here strikes me as odd. We use the previous
> > run
>  time of ovn-northd to determine how long to wait before running
again.
>  This delay is capped by the configured backoff time. Let's say that
>  we've configured the backoff interval to be 200 ms. If ovn-northd
has a
>  super quick run and only takes 10 ms, then we will only delay the
next
>  run by 10 ms. IMO, this seems like it would not mitigate the original
>  issue by much, since we are only allowing a maximum of 20 ms (10 ms
for
>  the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
>  Conversely, if northd has a huge recompute and it takes 5000 ms to
>  complete, then we would delay the next run by 200ms. In this case,
>  delaying at all seems like it's not necessary since we potentially
have
>  5000 ms worth of NB DB updates that have not been addressed. I would
>  have expected the opposite approach to be taken. If someone
configures
>  200ms as their backoff interval, I would expect us to always allow a
>  *minimum* of 200ms of NB changes to accumulate before running again.
So
>  for instance, if northd runs quickly and is done in 10 ms, then we
> > would
>  wait 200 - 10 = 190 ms before processing changes again. If northd
takes
>  a long time to recompute and takes 5000 ms, then we would not wait at
>  all before processing changes again. Was the algorithm chosen based
on
>  experimentation? Is it a well-known method I'm just not familiar
with?
> >>
> >> I think the main assumption (that should probably be made explicit in
> >> the commit log and/or documentation) is that on average changes happen
> >> in a uniform way.  This might not always be accurate.
> >>
> >> However, if we're off with the estimate, in the worst case we'd be
> >> adding the configured max delay to the latency of processing changes.
> >> So, as long as the value is not extremely high, the impact is not that
> >> high either.
> >>
> >> Last but not least, as this value would be configured by the CMS, we
> >> assume the CMS knows what they're doing. :)
> >>
> 
> >>>
> >>> I'm not sure if the algorithm is well known.
> >>>
> >>> The thing is that at scale we almost always cap at the backoff so it
has
> >>> probably
> >>> the same effect as your suggestion with the difference that we
actually
> >>> delay even
> >>> after long runs. And that is actually desired, it's true that in the
> > let's
> >>> say 500 ms
> >>> should be enough to accumulate more changes however that can lead to
> > another
> >>> 500 ms run and so on. That in the end means that northd will spin at
> > 100%
> >>> CPU
> >>> anyway which is what we want to avoid. So from another point of view
the
> >>> accumulation
> >>> of IDL changes is a secondary effect which is still desired, but not
the
> >>> main purpose.
> >>>
> >>> Also delaying by short time if the previous run was short is fine, you
> > are
> >>> right that we don't
> >>> accumulate enough however during short running times there is a high
> > chance
> >>> that the
> >>> northd would get to sleep anyway (We will help it to sleep at least a
> > bit
> >>> nevertheless).
> >>>
> >>>
> 
>  Next, I notice that you've added new poll_timer_wait() calls but
> > haven't
>  changed the ovsdb_idl_loop_run() or ovsdb_idl_loop_commit_and_wait()
>  calls. Is there any danger of ovn-northd getting in a busy loop of
>  sleeping and waking because of this? I don't think it should, since
>  presumably ovsdb_idl_loop_run() should clear the conditions waited on
> > by
>  ovsdb_idl_loop_commit_and_wait(), but I want to double-check.
> 
> >>>
> >>> AFAIK it shouldn't cause any 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-10 Thread Dumitru Ceara
On 8/10/23 15:34, Han Zhou wrote:
> On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara  wrote:
>>
>> On 8/10/23 08:12, Ales Musil wrote:
>>> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson 
> wrote:
>>>
 Hi Ales,

 I have some high-level comments/questions about this patch.

>>>
>>> Hi Mark,
>>>
>>
>> Hi Ales, Mark,
>>
>>> thank you for the review. See my answers inline below.
>>>
>>>
 I have been privy to the conversations that led to this change. My
 understanding is that by having ovn-northd wake up immediately, it can
 be more CPU-intensive than waiting a bit for changes to accumulate and
 handling all of those at once instead. However, nothing in either the
 commit message or ovn-nb.xml explains what the purpose of this new
 configuration option is. I think you should add a sentence or two to
 explain why someone would want to enable this option.


>>> Yeah that's my bad, I have v2 prepared with some explanation in the
> commit
>>> message
>>> together with results from scale run.
>>>
>>
>> +1 we really need to explain why this change is needed.
>>
>>>

 Next, the algorithm used here strikes me as odd. We use the previous
> run
 time of ovn-northd to determine how long to wait before running again.
 This delay is capped by the configured backoff time. Let's say that
 we've configured the backoff interval to be 200 ms. If ovn-northd has a
 super quick run and only takes 10 ms, then we will only delay the next
 run by 10 ms. IMO, this seems like it would not mitigate the original
 issue by much, since we are only allowing a maximum of 20 ms (10 ms for
 the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
 Conversely, if northd has a huge recompute and it takes 5000 ms to
 complete, then we would delay the next run by 200ms. In this case,
 delaying at all seems like it's not necessary since we potentially have
 5000 ms worth of NB DB updates that have not been addressed. I would
 have expected the opposite approach to be taken. If someone configures
 200ms as their backoff interval, I would expect us to always allow a
 *minimum* of 200ms of NB changes to accumulate before running again. So
 for instance, if northd runs quickly and is done in 10 ms, then we
> would
 wait 200 - 10 = 190 ms before processing changes again. If northd takes
 a long time to recompute and takes 5000 ms, then we would not wait at
 all before processing changes again. Was the algorithm chosen based on
 experimentation? Is it a well-known method I'm just not familiar with?
>>
>> I think the main assumption (that should probably be made explicit in
>> the commit log and/or documentation) is that on average changes happen
>> in a uniform way.  This might not always be accurate.
>>
>> However, if we're off with the estimate, in the worst case we'd be
>> adding the configured max delay to the latency of processing changes.
>> So, as long as the value is not extremely high, the impact is not that
>> high either.
>>
>> Last but not least, as this value would be configured by the CMS, we
>> assume the CMS knows what they're doing. :)
>>

>>>
>>> I'm not sure if the algorithm is well known.
>>>
>>> The thing is that at scale we almost always cap at the backoff so it has
>>> probably
>>> the same effect as your suggestion with the difference that we actually
>>> delay even
>>> after long runs. And that is actually desired, it's true that in the
> let's
>>> say 500 ms
>>> should be enough to accumulate more changes however that can lead to
> another
>>> 500 ms run and so on. That in the end means that northd will spin at
> 100%
>>> CPU
>>> anyway which is what we want to avoid. So from another point of view the
>>> accumulation
>>> of IDL changes is a secondary effect which is still desired, but not the
>>> main purpose.
>>>
>>> Also delaying by short time if the previous run was short is fine, you
> are
>>> right that we don't
>>> accumulate enough however during short running times there is a high
> chance
>>> that the
>>> northd would get to sleep anyway (We will help it to sleep at least a
> bit
>>> nevertheless).
>>>
>>>

 Next, I notice that you've added new poll_timer_wait() calls but
> haven't
 changed the ovsdb_idl_loop_run() or ovsdb_idl_loop_commit_and_wait()
 calls. Is there any danger of ovn-northd getting in a busy loop of
 sleeping and waking because of this? I don't think it should, since
 presumably ovsdb_idl_loop_run() should clear the conditions waited on
> by
 ovsdb_idl_loop_commit_and_wait(), but I want to double-check.

>>>
>>> AFAIK it shouldn't cause any issues as ovsdb_idl_loop_run() will process
>>> anything
>>> that it can and wait will be fine. The problem would be if we would
> skip the
>>> ovsdb_idl_loop_run() for some reason.
>>>
>>>

 Next, does this have any negative impact on our ability to perform
 incremental 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-10 Thread Han Zhou
On Thu, Aug 10, 2023 at 2:29 AM Dumitru Ceara  wrote:
>
> On 8/10/23 08:12, Ales Musil wrote:
> > On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson 
wrote:
> >
> >> Hi Ales,
> >>
> >> I have some high-level comments/questions about this patch.
> >>
> >
> > Hi Mark,
> >
>
> Hi Ales, Mark,
>
> > thank you for the review. See my answers inline below.
> >
> >
> >> I have been privy to the conversations that led to this change. My
> >> understanding is that by having ovn-northd wake up immediately, it can
> >> be more CPU-intensive than waiting a bit for changes to accumulate and
> >> handling all of those at once instead. However, nothing in either the
> >> commit message or ovn-nb.xml explains what the purpose of this new
> >> configuration option is. I think you should add a sentence or two to
> >> explain why someone would want to enable this option.
> >>
> >>
> > Yeah that's my bad, I have v2 prepared with some explanation in the
commit
> > message
> > together with results from scale run.
> >
>
> +1 we really need to explain why this change is needed.
>
> >
> >>
> >> Next, the algorithm used here strikes me as odd. We use the previous
run
> >> time of ovn-northd to determine how long to wait before running again.
> >> This delay is capped by the configured backoff time. Let's say that
> >> we've configured the backoff interval to be 200 ms. If ovn-northd has a
> >> super quick run and only takes 10 ms, then we will only delay the next
> >> run by 10 ms. IMO, this seems like it would not mitigate the original
> >> issue by much, since we are only allowing a maximum of 20 ms (10 ms for
> >> the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
> >> Conversely, if northd has a huge recompute and it takes 5000 ms to
> >> complete, then we would delay the next run by 200ms. In this case,
> >> delaying at all seems like it's not necessary since we potentially have
> >> 5000 ms worth of NB DB updates that have not been addressed. I would
> >> have expected the opposite approach to be taken. If someone configures
> >> 200ms as their backoff interval, I would expect us to always allow a
> >> *minimum* of 200ms of NB changes to accumulate before running again. So
> >> for instance, if northd runs quickly and is done in 10 ms, then we
would
> >> wait 200 - 10 = 190 ms before processing changes again. If northd takes
> >> a long time to recompute and takes 5000 ms, then we would not wait at
> >> all before processing changes again. Was the algorithm chosen based on
> >> experimentation? Is it a well-known method I'm just not familiar with?
>
> I think the main assumption (that should probably be made explicit in
> the commit log and/or documentation) is that on average changes happen
> in a uniform way.  This might not always be accurate.
>
> However, if we're off with the estimate, in the worst case we'd be
> adding the configured max delay to the latency of processing changes.
> So, as long as the value is not extremely high, the impact is not that
> high either.
>
> Last but not least, as this value would be configured by the CMS, we
> assume the CMS knows what they're doing. :)
>
> >>
> >
> > I'm not sure if the algorithm is well known.
> >
> > The thing is that at scale we almost always cap at the backoff so it has
> > probably
> > the same effect as your suggestion with the difference that we actually
> > delay even
> > after long runs. And that is actually desired, it's true that in the
let's
> > say 500 ms
> > should be enough to accumulate more changes however that can lead to
another
> > 500 ms run and so on. That in the end means that northd will spin at
100%
> > CPU
> > anyway which is what we want to avoid. So from another point of view the
> > accumulation
> > of IDL changes is a secondary effect which is still desired, but not the
> > main purpose.
> >
> > Also delaying by short time if the previous run was short is fine, you
are
> > right that we don't
> > accumulate enough however during short running times there is a high
chance
> > that the
> > northd would get to sleep anyway (We will help it to sleep at least a
bit
> > nevertheless).
> >
> >
> >>
> >> Next, I notice that you've added new poll_timer_wait() calls but
haven't
> >> changed the ovsdb_idl_loop_run() or ovsdb_idl_loop_commit_and_wait()
> >> calls. Is there any danger of ovn-northd getting in a busy loop of
> >> sleeping and waking because of this? I don't think it should, since
> >> presumably ovsdb_idl_loop_run() should clear the conditions waited on
by
> >> ovsdb_idl_loop_commit_and_wait(), but I want to double-check.
> >>
> >
> > AFAIK it shouldn't cause any issues as ovsdb_idl_loop_run() will process
> > anything
> > that it can and wait will be fine. The problem would be if we would
skip the
> > ovsdb_idl_loop_run() for some reason.
> >
> >
> >>
> >> Next, does this have any negative impact on our ability to perform
> >> incremental processing in ovn-northd? My concern is that since we are
> >> still running the ovsdb IDL 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-10 Thread Dumitru Ceara
On 8/10/23 08:12, Ales Musil wrote:
> On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson  wrote:
> 
>> Hi Ales,
>>
>> I have some high-level comments/questions about this patch.
>>
> 
> Hi Mark,
> 

Hi Ales, Mark,

> thank you for the review. See my answers inline below.
> 
> 
>> I have been privy to the conversations that led to this change. My
>> understanding is that by having ovn-northd wake up immediately, it can
>> be more CPU-intensive than waiting a bit for changes to accumulate and
>> handling all of those at once instead. However, nothing in either the
>> commit message or ovn-nb.xml explains what the purpose of this new
>> configuration option is. I think you should add a sentence or two to
>> explain why someone would want to enable this option.
>>
>>
> Yeah that's my bad, I have v2 prepared with some explanation in the commit
> message
> together with results from scale run.
> 

+1 we really need to explain why this change is needed.

> 
>>
>> Next, the algorithm used here strikes me as odd. We use the previous run
>> time of ovn-northd to determine how long to wait before running again.
>> This delay is capped by the configured backoff time. Let's say that
>> we've configured the backoff interval to be 200 ms. If ovn-northd has a
>> super quick run and only takes 10 ms, then we will only delay the next
>> run by 10 ms. IMO, this seems like it would not mitigate the original
>> issue by much, since we are only allowing a maximum of 20 ms (10 ms for
>> the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
>> Conversely, if northd has a huge recompute and it takes 5000 ms to
>> complete, then we would delay the next run by 200ms. In this case,
>> delaying at all seems like it's not necessary since we potentially have
>> 5000 ms worth of NB DB updates that have not been addressed. I would
>> have expected the opposite approach to be taken. If someone configures
>> 200ms as their backoff interval, I would expect us to always allow a
>> *minimum* of 200ms of NB changes to accumulate before running again. So
>> for instance, if northd runs quickly and is done in 10 ms, then we would
>> wait 200 - 10 = 190 ms before processing changes again. If northd takes
>> a long time to recompute and takes 5000 ms, then we would not wait at
>> all before processing changes again. Was the algorithm chosen based on
>> experimentation? Is it a well-known method I'm just not familiar with?

I think the main assumption (that should probably be made explicit in
the commit log and/or documentation) is that on average changes happen
in a uniform way.  This might not always be accurate.

However, if we're off with the estimate, in the worst case we'd be
adding the configured max delay to the latency of processing changes.
So, as long as the value is not extremely high, the impact is not that
high either.

Last but not least, as this value would be configured by the CMS, we
assume the CMS knows what they're doing. :)

>>
> 
> I'm not sure if the algorithm is well known.
> 
> The thing is that at scale we almost always cap at the backoff so it has
> probably
> the same effect as your suggestion with the difference that we actually
> delay even
> after long runs. And that is actually desired, it's true that in the let's
> say 500 ms
> should be enough to accumulate more changes however that can lead to another
> 500 ms run and so on. That in the end means that northd will spin at 100%
> CPU
> anyway which is what we want to avoid. So from another point of view the
> accumulation
> of IDL changes is a secondary effect which is still desired, but not the
> main purpose.
> 
> Also delaying by short time if the previous run was short is fine, you are
> right that we don't
> accumulate enough however during short running times there is a high chance
> that the
> northd would get to sleep anyway (We will help it to sleep at least a bit
> nevertheless).
> 
> 
>>
>> Next, I notice that you've added new poll_timer_wait() calls but haven't
>> changed the ovsdb_idl_loop_run() or ovsdb_idl_loop_commit_and_wait()
>> calls. Is there any danger of ovn-northd getting in a busy loop of
>> sleeping and waking because of this? I don't think it should, since
>> presumably ovsdb_idl_loop_run() should clear the conditions waited on by
>> ovsdb_idl_loop_commit_and_wait(), but I want to double-check.
>>
> 
> AFAIK it shouldn't cause any issues as ovsdb_idl_loop_run() will process
> anything
> that it can and wait will be fine. The problem would be if we would skip the
> ovsdb_idl_loop_run() for some reason.
> 
> 
>>
>> Next, does this have any negative impact on our ability to perform
>> incremental processing in ovn-northd? My concern is that since we are
>> still running the ovsdb IDL loop that if multiple NB changes occur
>> during our delay, then we might have to fall back to a full recompute
>> instead of being able to incrementally process the changes. Are my
>> concerns valid?
>>
> 
> I suppose that can happen if there are changes 

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-10 Thread Ales Musil
On Wed, Aug 9, 2023 at 5:13 PM Mark Michelson  wrote:

> Hi Ales,
>
> I have some high-level comments/questions about this patch.
>

Hi Mark,

thank you for the review. See my answers inline below.


> I have been privy to the conversations that led to this change. My
> understanding is that by having ovn-northd wake up immediately, it can
> be more CPU-intensive than waiting a bit for changes to accumulate and
> handling all of those at once instead. However, nothing in either the
> commit message or ovn-nb.xml explains what the purpose of this new
> configuration option is. I think you should add a sentence or two to
> explain why someone would want to enable this option.
>
>
Yeah that's my bad, I have v2 prepared with some explanation in the commit
message
together with results from scale run.


>
> Next, the algorithm used here strikes me as odd. We use the previous run
> time of ovn-northd to determine how long to wait before running again.
> This delay is capped by the configured backoff time. Let's say that
> we've configured the backoff interval to be 200 ms. If ovn-northd has a
> super quick run and only takes 10 ms, then we will only delay the next
> run by 10 ms. IMO, this seems like it would not mitigate the original
> issue by much, since we are only allowing a maximum of 20 ms (10 ms for
> the run of ovn-northd + 10 ms delay) of NB changes to accumulate.
> Conversely, if northd has a huge recompute and it takes 5000 ms to
> complete, then we would delay the next run by 200ms. In this case,
> delaying at all seems like it's not necessary since we potentially have
> 5000 ms worth of NB DB updates that have not been addressed. I would
> have expected the opposite approach to be taken. If someone configures
> 200ms as their backoff interval, I would expect us to always allow a
> *minimum* of 200ms of NB changes to accumulate before running again. So
> for instance, if northd runs quickly and is done in 10 ms, then we would
> wait 200 - 10 = 190 ms before processing changes again. If northd takes
> a long time to recompute and takes 5000 ms, then we would not wait at
> all before processing changes again. Was the algorithm chosen based on
> experimentation? Is it a well-known method I'm just not familiar with?
>

I'm not sure if the algorithm is well known.

The thing is that at scale we almost always cap at the backoff so it has
probably
the same effect as your suggestion with the difference that we actually
delay even
after long runs. And that is actually desired, it's true that in the let's
say 500 ms
should be enough to accumulate more changes however that can lead to another
500 ms run and so on. That in the end means that northd will spin at 100%
CPU
anyway which is what we want to avoid. So from another point of view the
accumulation
of IDL changes is a secondary effect which is still desired, but not the
main purpose.

Also delaying by short time if the previous run was short is fine, you are
right that we don't
accumulate enough however during short running times there is a high chance
that the
northd would get to sleep anyway (We will help it to sleep at least a bit
nevertheless).


>
> Next, I notice that you've added new poll_timer_wait() calls but haven't
> changed the ovsdb_idl_loop_run() or ovsdb_idl_loop_commit_and_wait()
> calls. Is there any danger of ovn-northd getting in a busy loop of
> sleeping and waking because of this? I don't think it should, since
> presumably ovsdb_idl_loop_run() should clear the conditions waited on by
> ovsdb_idl_loop_commit_and_wait(), but I want to double-check.
>

AFAIK it shouldn't cause any issues as ovsdb_idl_loop_run() will process
anything
that it can and wait will be fine. The problem would be if we would skip the
ovsdb_idl_loop_run() for some reason.


>
> Next, does this have any negative impact on our ability to perform
> incremental processing in ovn-northd? My concern is that since we are
> still running the ovsdb IDL loop that if multiple NB changes occur
> during our delay, then we might have to fall back to a full recompute
> instead of being able to incrementally process the changes. Are my
> concerns valid?
>

I suppose that can happen if there are changes that could result in
"conflict"
and full recompute. During the test we haven't seen anything like that.
The odds of that happening are small because as stated previously we are
doing
basically the same as if the engine was running for a long time always from
the IDL
point of view except that we give IDL a chance to process whatever has
pilled up
within the sleep period.


>
> Next, has scale testing shown that this change has made a positive
> impact? If so, is there any recommendation for how to determine what to
> configure the value to?
>
>
It has a huge impact actually the value tested was 200 ms, the
recommendation
would be < 500 ms. After that point the latency on components creation
would be
very noticable. I will put the recommendation into the ovn-nb.xml with the
latency

Re: [ovs-dev] [PATCH ovn] northd: Allow delay of northd engine runs

2023-08-09 Thread Mark Michelson

Hi Ales,

I have some high-level comments/questions about this patch.

I have been privy to the conversations that led to this change. My 
understanding is that by having ovn-northd wake up immediately, it can 
be more CPU-intensive than waiting a bit for changes to accumulate and 
handling all of those at once instead. However, nothing in either the 
commit message or ovn-nb.xml explains what the purpose of this new 
configuration option is. I think you should add a sentence or two to 
explain why someone would want to enable this option.


Next, the algorithm used here strikes me as odd. We use the previous run 
time of ovn-northd to determine how long to wait before running again. 
This delay is capped by the configured backoff time. Let's say that 
we've configured the backoff interval to be 200 ms. If ovn-northd has a 
super quick run and only takes 10 ms, then we will only delay the next 
run by 10 ms. IMO, this seems like it would not mitigate the original 
issue by much, since we are only allowing a maximum of 20 ms (10 ms for 
the run of ovn-northd + 10 ms delay) of NB changes to accumulate. 
Conversely, if northd has a huge recompute and it takes 5000 ms to 
complete, then we would delay the next run by 200ms. In this case, 
delaying at all seems like it's not necessary since we potentially have 
5000 ms worth of NB DB updates that have not been addressed. I would 
have expected the opposite approach to be taken. If someone configures 
200ms as their backoff interval, I would expect us to always allow a 
*minimum* of 200ms of NB changes to accumulate before running again. So 
for instance, if northd runs quickly and is done in 10 ms, then we would 
wait 200 - 10 = 190 ms before processing changes again. If northd takes 
a long time to recompute and takes 5000 ms, then we would not wait at 
all before processing changes again. Was the algorithm chosen based on 
experimentation? Is it a well-known method I'm just not familiar with?


Next, I notice that you've added new poll_timer_wait() calls but haven't 
changed the ovsdb_idl_loop_run() or ovsdb_idl_loop_commit_and_wait() 
calls. Is there any danger of ovn-northd getting in a busy loop of 
sleeping and waking because of this? I don't think it should, since 
presumably ovsdb_idl_loop_run() should clear the conditions waited on by 
ovsdb_idl_loop_commit_and_wait(), but I want to double-check.


Next, does this have any negative impact on our ability to perform 
incremental processing in ovn-northd? My concern is that since we are 
still running the ovsdb IDL loop that if multiple NB changes occur 
during our delay, then we might have to fall back to a full recompute 
instead of being able to incrementally process the changes. Are my 
concerns valid?


Next, has scale testing shown that this change has made a positive 
impact? If so, is there any recommendation for how to determine what to 
configure the value to?


Finally, is this change expected to be a long-term necessity? This 
option seems to be useful for cases where northd recomputes are 
required. Performing recomputes less frequently seems like it would 
lower the CPU usage of ovn-northd while still processing the same amount 
of changes. However, once northd can handle most changes incrementally, 
is there still a benefit to delaying running? If each run of northd 
handles all DB changes incrementally, then is there any point in putting 
delays between those incremental runs?


On 8/9/23 01:29, Ales Musil wrote:

Add config option called "northd-backoff-interval-ms" that allows
to delay northd engine runs capped by the config option.
When the config option is set to 0 or unspecified, the engine
will run without any restrictions. If the value >0 we will delay
northd engine run by the previous run time capped by the
config option.

Signed-off-by: Ales Musil 
---
  NEWS |  2 ++
  northd/inc-proc-northd.c | 23 ++-
  northd/inc-proc-northd.h |  3 ++-
  northd/ovn-northd.c  |  9 +++--
  ovn-nb.xml   |  7 +++
  5 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 8275877f9..6109f13a2 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,8 @@ Post v23.06.0
- To allow optimizing ovn-controller's monitor conditions for the regular
  VIF case, ovn-controller now unconditionally monitors all sub-ports
  (ports with parent_port set).
+  - Add "northd-backoff-interval-ms" config option to delay northd engine
+runs capped at the set value.
  
  OVN v23.06.0 - 01 Jun 2023

  --
diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
index d328deb22..87db50ad1 100644
--- a/northd/inc-proc-northd.c
+++ b/northd/inc-proc-northd.c
@@ -42,6 +42,8 @@ VLOG_DEFINE_THIS_MODULE(inc_proc_northd);
  
  static unixctl_cb_func chassis_features_list;
  
+static int64_t next_northd_run_ms = 0;

+
  #define NB_NODES \
  NB_NODE(nb_global, "nb_global") \
  NB_NODE(logical_switch,