On 8/24/22 17:49, Ilya Maximets wrote:
> On 8/24/22 17:43, Ilya Maximets wrote:
>> On 8/23/22 16:42, Dumitru Ceara wrote:
>>> After removing the possibility of disabling logical datapath groups, the
>>> above-mentioned state should only be relevant when ovn-northd is started
>>> with more than one lflow processing thread (--n-threads > 1), to avoid
>>> a very inefficient first lflow processing run due to a suboptimally
>>> sized hmap.
>>>
>>> There's no need to ever go back to this state now that we don't allow
>>> users to disable logical datapath groups.  Moreover, the previous code
>>> was wrongfully setting the state to STATE_INIT_HASH_SIZES at every
>>> iteration, essentially disabling parallelization.
>>>
>>> Fixes: 9dea0c090e91 ("nb: Remove possibility of disabling logical datapath groups.")
>>> Reported-by: Ilya Maximets <i.maxim...@ovn.org>
>>> Signed-off-by: Dumitru Ceara <dce...@redhat.com>
>>> ---
>>
>> Hi, Dumitru.  Thanks for the patch!

Hi Ilya,

>> I've been using it in multiple scale tests since it was posted,
>> and it does re-enable parallel lflow build.
>>

Thanks for trying it out!

>> One issue though.  Not sure how to fix that or what is the desired
>> behavior, but what I noticed is:
>>
>>  1. Cluster with 3 northd instances starts up.
>>  2. After a while, all of them have been active at least once, due
>>     to leadership transfers or other events.
>>  3. If one instance stays active for long, a lot of database changes
>>     can be accumulated.
>>  4. Once the leadership flips, the new active northd is not prepared
>>     for the number of logical flows it suddenly needs to manage.  So,
>>     it starts parallel processing with an hmap significantly smaller
>>     than it actually needs right now.  That iteration takes noticeably
>>     more time than a usual one.  Only after the first iteration after
>>     becoming active can it process lflows at normal speed.
>>
>> So, I wonder if we can fall into a catastrophic scenario where
>> the current hash map size is very small and the number of flows after
>> re-connection is very high, so the threads are only fighting for locks
>> all the time.
>>
>> Do you have some thoughts on this?
>>

This seems like a possible scenario (although I wonder how often we'd
hit it in practice).

But this was a possible scenario before 9dea0c090e91 ("nb: Remove
possibility of disabling logical datapath groups.") too, right?
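
For what it's worth, the contention comes from the way the parallel
build picks a hash lock: the lock index is derived from the hmap's
current mask (roughly hash & hmap_mask & LOCK_MASK), so an hmap sized
for far fewer lflows than we actually have funnels most insertions
onto the same few locks.  A standalone toy (not OVN code; N_LOCKS and
the masks below are made-up values) that illustrates the effect:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define N_LOCKS 512
#define LOCK_MASK (N_LOCKS - 1)

/* Count how many distinct locks ever get selected for a given hmap
 * mask.  A small mask means only a handful of locks are used, so the
 * worker threads mostly wait on each other. */
static int
count_locks_in_use(uint32_t hmap_mask)
{
    bool used[N_LOCKS];
    int n = 0;

    memset(used, 0, sizeof used);
    for (uint32_t hash = 0; hash < 1000000; hash++) {
        uint32_t idx = hash & hmap_mask & LOCK_MASK;
        if (!used[idx]) {
            used[idx] = true;
            n++;
        }
    }
    return n;
}

int
main(void)
{
    /* hmap pre-sized for only ~64 lflows vs. one sized for the real
     * flow count. */
    printf("undersized hmap:     %d of %d locks in use\n",
           count_locks_in_use(0x3f), N_LOCKS);
    printf("properly sized hmap: %d of %d locks in use\n",
           count_locks_in_use(0xfffff), N_LOCKS);
    return 0;
}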

>> Should we disable parallelism when northd goes into stand-up state?
> 
> * I meant 'standby', of course.  :/
> 

We could, but that means that every time northd becomes active, its
first run will be inefficient.  Can we instead store the
max_seen_lflow_size value in the SB database and use the value we read
from the DB?
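
Something along these lines is what I have in mind; completely
untested, the option name is made up, and it assumes the usual
northd.c includes (lib/smap.h, util.h, the SB IDL headers):

/* Untested sketch: persist the last seen lflow count in the Southbound
 * SB_Global:options column so an instance that just became active can
 * size its lflow hmap before its first run.  The option name
 * "max_seen_lflow_size" is made up. */
static void
store_max_seen_lflow_size(const struct sbrec_sb_global *sb, size_t n_lflows)
{
    struct smap options;
    char *value = xasprintf("%"PRIuSIZE, n_lflows);

    smap_clone(&options, &sb->options);
    smap_replace(&options, "max_seen_lflow_size", value);
    sbrec_sb_global_set_options(sb, &options);
    smap_destroy(&options);
    free(value);
}

static size_t
read_max_seen_lflow_size(const struct sbrec_sb_global *sb)
{
    /* Fall back to a small default when the option isn't present yet. */
    return smap_get_int(&sb->options, "max_seen_lflow_size", 128);
}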

>>
>> Best regards, Ilya Maximets.
>>

Thanks,
Dumitru

>>>  northd/northd.c | 10 ----------
>>>  1 file changed, 10 deletions(-)
>>>
>>> diff --git a/northd/northd.c b/northd/northd.c
>>> index 7e2681865..33943bfaf 100644
>>> --- a/northd/northd.c
>>> +++ b/northd/northd.c
>>> @@ -14164,15 +14164,6 @@ void run_update_worker_pool(int n_threads)
>>>      }
>>>  }
>>>  
>>> -static void init_worker_pool(void)
>>> -{
>>> -    /* If parallelization is enabled, make sure locks are initialized. */
>>> -    if (parallelization_state != STATE_NULL) {
>>> -        lflow_hash_lock_init();
>>> -        parallelization_state = STATE_INIT_HASH_SIZES;
>>> -    }
>>> -}
>>> -
>>>  static void
>>>  build_mcast_groups(struct lflow_input *data,
>>>                     const struct hmap *datapaths,
>>> @@ -15762,7 +15753,6 @@ void northd_run(struct northd_input *input_data,
>>>                  struct ovsdb_idl_txn *ovnsb_txn)
>>>  {
>>>      stopwatch_start(OVNNB_DB_RUN_STOPWATCH_NAME, time_msec());
>>> -    init_worker_pool();
>>>      ovnnb_db_run(input_data, data, ovnnb_txn, ovnsb_txn,
>>>                   input_data->sbrec_chassis_by_name,
>>>                   input_data->sbrec_chassis_by_hostname);
>>
> 
