On 8/24/22 17:49, Ilya Maximets wrote:
> On 8/24/22 17:43, Ilya Maximets wrote:
>> On 8/23/22 16:42, Dumitru Ceara wrote:
>>> After removing the possibility of disabling logical datapath groups,
>>> the above mentioned state should only be relevant when ovn-northd is
>>> started with more than one lflow processing thread (--n-threads > 1),
>>> to avoid a very inefficient first lflow processing run due to a
>>> suboptimally sized hmap.
>>>
>>> There's no need to ever go back to this state now that we don't allow
>>> users to disable logical datapath groups.  Moreover, the previous code
>>> was wrongfully setting the state to STATE_INIT_HASH_SIZES at every
>>> iteration, essentially disabling parallelization.
>>>
>>> Fixes: 9dea0c090e91 ("nb: Remove possibility of disabling logical
>>> datapath groups.")
>>> Reported-by: Ilya Maximets <i.maxim...@ovn.org>
>>> Signed-off-by: Dumitru Ceara <dce...@redhat.com>
>>> ---
>>
>> Hi, Dumitru.  Thanks for the patch!

Hi Ilya,

>> I've been using it in multiple scale tests since it was posted,
>> and it does re-enable parallel lflow build.

Thanks for trying it out!

>> One issue, though.  I'm not sure how to fix it or what the desired
>> behavior is, but here is what I noticed:
>>
>> 1. A cluster with 3 northd instances starts up.
>> 2. After a while, all of them have been active at least once, due
>>    to leadership transfers or other things.
>> 3. If one instance stays active for long, a lot of database changes
>>    can accumulate.
>> 4. Once the leadership flips, the new active northd is not prepared
>>    for the number of logical flows it suddenly needs to manage.  So
>>    it starts parallel processing with an hmap significantly smaller
>>    than it actually needs.  That iteration takes noticeably more
>>    time than a usual one; after this first iteration as active, it
>>    processes lflows at normal speed.
>>
>> So, I wonder if we can fall into a catastrophic scenario where the
>> current hash map size is very small and the number of flows after
>> reconnection is very high, so threads are only fighting for locks
>> all the time.
>>
>> Do you have some thoughts on this?

This seems like a possible scenario (although I wonder how often we'd
hit it in practice).  But this was a possible scenario before
9dea0c090e91 ("nb: Remove possibility of disabling logical datapath
groups.") too, right?

>> Should we disable parallelism when northd goes into stand-up state?
>
> * I meant 'standby', of course. :/

We could, but that means that every time northd becomes active it will
do one inefficient first run.  Can we instead store the
max_seen_lflow_size value in the SB database and use the one we read
from the DB?
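Something along these lines, perhaps (a rough, untested sketch; it
assumes we'd piggyback on the SB_Global options column, and the
"max-seen-lflow-size" key and helper names are purely illustrative):

/* Untested sketch: persist the last seen lflow hmap size in the
 * Southbound SB_Global:options column so that an instance that just
 * became active can pre-size its hmap on the very first run.  Assumes
 * the usual northd.c context (lib/smap.h, util.h, SB IDL headers). */

static size_t
recover_max_seen_lflow_size(const struct sbrec_sb_global *sb)
{
    /* Fall back to the compiled-in default when the key has never
     * been written. */
    return smap_get_int(&sb->options, "max-seen-lflow-size", 128);
}

static void
store_max_seen_lflow_size(const struct sbrec_sb_global *sb, size_t size)
{
    char *value = xasprintf("%"PRIuSIZE, size);

    /* Partial map update; must run inside an open SB transaction. */
    sbrec_sb_global_update_options_setkey(sb, "max-seen-lflow-size",
                                          value);
    free(value);
}

A newly active instance would seed max_seen_lflow_size from
recover_max_seen_lflow_size() before its first parallel build, and
write the value back after each run.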
>> Best regards, Ilya Maximets.

Thanks,
Dumitru

>>>  northd/northd.c | 10 ----------
>>>  1 file changed, 10 deletions(-)
>>>
>>> diff --git a/northd/northd.c b/northd/northd.c
>>> index 7e2681865..33943bfaf 100644
>>> --- a/northd/northd.c
>>> +++ b/northd/northd.c
>>> @@ -14164,15 +14164,6 @@ void run_update_worker_pool(int n_threads)
>>>      }
>>>  }
>>>  
>>> -static void init_worker_pool(void)
>>> -{
>>> -    /* If parallelization is enabled, make sure locks are initialized. */
>>> -    if (parallelization_state != STATE_NULL) {
>>> -        lflow_hash_lock_init();
>>> -        parallelization_state = STATE_INIT_HASH_SIZES;
>>> -    }
>>> -}
>>> -
>>>  static void
>>>  build_mcast_groups(struct lflow_input *data,
>>>                     const struct hmap *datapaths,
>>> @@ -15762,7 +15753,6 @@ void northd_run(struct northd_input *input_data,
>>>                  struct ovsdb_idl_txn *ovnsb_txn)
>>>  {
>>>      stopwatch_start(OVNNB_DB_RUN_STOPWATCH_NAME, time_msec());
>>> -    init_worker_pool();
>>>      ovnnb_db_run(input_data, data, ovnnb_txn, ovnsb_txn,
>>>                   input_data->sbrec_chassis_by_name,
>>>                   input_data->sbrec_chassis_by_hostname);

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev