On Fri, May 23, 2025 at 10:02:59AM +0000, Smirnov Aleksandr (K2 Cloud) wrote:
> Hi Felix,
>
> Answering your proposal: yes, I can reuse this solution instead of the
> direct check of seqno vs expected seqno.
> However, I have inspected what you are doing in your patch and want to
> point out one mistake (at least in my opinion).
Hi Alexander, thanks for the review.

> When I tried to play with seqno/expected seqno numbers I realized that
> they are only set to meaningful values BELOW the line
>
>     ovnsb_expected_cond_seqno =
>         update_sb_monitors(ovnsb_idl_loop.idl, chassis,
>                            &runtime_data->local_lports,
>                            &runtime_data->lbinding_data.bindings,
>                            &runtime_data->local_datapaths,
>                            sb_monitor_all);

So by default ovnsb_expected_cond_seqno is initialized to UINT_MAX. Then I
see two locations where it might be set to an actual value:
1. At the start of the main loop in update_sb_db, but only if monitor_all
   is set.
2. The point you mentioned, which happens after the engine has computed
   completely.

> So, if you want to make a meaningful comparison of the seqnos to answer
> the question 'are there upcoming updates' you have to do that check after
> the update_sb_monitors() call and before the next iteration of the main
> loop.

Or at the start of the main loop if we are sure that
ovnsb_expected_cond_seqno is valid. I did this in my patch with an
`ovnsb_expected_cond_seqno != UINT_MAX` check.

> Otherwise, (again in my opinion) your code will always catch the guard
> checks (time, max number, etc).

I am not sure if I get that point correctly. I want to share an example
case based on my understanding of what you mean; you can tell me if I got
it right:

We assume a case where monitor-all=false. When starting ovn-controller we
will connect to the local ovsdb and southbound. We then run the engine the
first time and set the initial monitoring conditions. At this point
ovnsb_expected_cond_seqno is valid and larger than ovnsb_cond_seqno. Before
we now get the initial monitor results from southbound, something happens
that prevents the engine from running (e.g. northd_version_match is false).
At some point the initial monitor updates arrive, which causes
ovnsb_expected_cond_seqno == ovnsb_cond_seqno.
At least in my patch, any further main loop iteration would now cause
daemon_started_recently_countdown to be triggered, even though the engine
never had the chance to run and define a new ovnsb_expected_cond_seqno. If
the engine can then start running again, daemon_started_recently() might
already return false, even though we did not yet allow for up to 20 initial
condition changes.

> This is still a dark area for me, but at least I need to share my
> understanding of the seqno life cycle.

Same for me, and I think you found a valid point. However, it would be
great if you could confirm the above understanding. If you meant a
different point then maybe I need to fix something else as well.

I will try to fix the above and send a new version of the patchset once
you confirm that this was the point you meant.

So thank you very much for finding that.

Thanks,
Felix

>
> Thank you,
>
> Alexander
>
> On 5/7/25 5:26 PM, Felix Huettner wrote:
> > On Tue, Mar 18, 2025 at 01:08:12PM +0300, Aleksandr Smirnov wrote:
> >> Delay the initial flow upload to OVS until all monitored updates are
> >> received. This is a kind of replacement for the wait-before-clear
> >> parameter. Instead of waiting a certain amount of time we will wait
> >> until the final monitored update comes from the SB DB.
> >> The event we watch in the controller's main loop is the current
> >> condition sequence number compared to the expected condition sequence
> >> number. If they are not equal we still have to receive updates in
> >> response to a recent monitor condition change request. This check
> >> makes sense only in code that lies after the update_sb_monitors()
> >> function call.
> >>
> >> Note, this update will only work if wait-before-clear == 0,
> >> i.e. you can still rely on the wait-before-clear behavior.
> >>
> >> Signed-off-by: Aleksandr Smirnov <[email protected]>
> >> ---
> >>  controller/ofctrl.c         | 29 ++++++++++++++++-
> >>  controller/ofctrl.h         |  3 +-
> >>  controller/ovn-controller.c |  4 ++-
> >>  tests/ovn-controller.at     | 62 +++++++++++++++++++++++++++++++++++++
> >>  4 files changed, 95 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/controller/ofctrl.c b/controller/ofctrl.c
> >> index 4a3d35b97..304b9bbc8 100644
> >> --- a/controller/ofctrl.c
> >> +++ b/controller/ofctrl.c
> >> @@ -349,6 +349,9 @@ static uint64_t cur_cfg;
> >>  /* Current state. */
> >>  static enum ofctrl_state state;
> >>
> >> +/* Release wait before clear stage. */
> >> +static bool wait_before_clear_proceed = false;
> >> +
> >>  /* The time (ms) to stay in the state S_WAIT_BEFORE_CLEAR. Read from
> >>   * external_ids: ovn-ofctrl-wait-before-clear. */
> >>  static unsigned int wait_before_clear_time = 0;
> >> @@ -446,6 +449,7 @@ run_S_NEW(void)
> >>      struct ofpbuf *buf = ofpraw_alloc(OFPRAW_NXT_TLV_TABLE_REQUEST,
> >>                                        rconn_get_version(swconn), 0);
> >>      xid = queue_msg(buf);
> >> +    wait_before_clear_proceed = false;
> >>      state = S_TLV_TABLE_REQUESTED;
> >>  }
> >>
> >> @@ -638,6 +642,14 @@ error:
> >>  static void
> >>  run_S_WAIT_BEFORE_CLEAR(void)
> >>  {
> >> +    if (wait_before_clear_time == 0) {
> >> +        if (wait_before_clear_proceed) {
> >> +            state = S_CLEAR_FLOWS;
> >> +        }
> >> +
> >> +        return;
> >> +    }
> >> +
> >>      if (!wait_before_clear_time ||
> >>          (wait_before_clear_expire &&
> >>           time_msec() >= wait_before_clear_expire)) {
> >> @@ -2695,11 +2707,26 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table,
> >>             uint64_t req_cfg,
> >>             bool lflows_changed,
> >>             bool pflows_changed,
> >> -           struct tracked_acl_ids *tracked_acl_ids)
> >> +           struct tracked_acl_ids *tracked_acl_ids,
> >> +           bool monitor_cond_complete)
> >>  {
> >>      static bool skipped_last_time = false;
> >>      static uint64_t old_req_cfg = 0;
> >>      bool need_put = false;
> >> +
> >> +    if (state == S_WAIT_BEFORE_CLEAR) {
> >> +        /* If no more monitored condition changes are expected,
> >> +           release the wait before clear stage and skip
> >> +           over the poll wait. */
> >> +        if (monitor_cond_complete) {
> >> +            wait_before_clear_proceed = true;
> >> +            poll_immediate_wake();
> >> +        }
> >> +
> >> +        skipped_last_time = true;
> >> +        return;
> >> +    }
> >> +
> >>      if (lflows_changed || pflows_changed || skipped_last_time ||
> >>          ofctrl_initial_clear) {
> >>          need_put = true;
> >> diff --git a/controller/ofctrl.h b/controller/ofctrl.h
> >> index d1ee69cb0..76e2fbece 100644
> >> --- a/controller/ofctrl.h
> >> +++ b/controller/ofctrl.h
> >> @@ -68,7 +68,8 @@ void ofctrl_put(struct ovn_desired_flow_table *lflow_table,
> >>                  uint64_t nb_cfg,
> >>                  bool lflow_changed,
> >>                  bool pflow_changed,
> >> -                struct tracked_acl_ids *tracked_acl_ids);
> >> +                struct tracked_acl_ids *tracked_acl_ids,
> >> +                bool monitor_cond_complete);
> >>  bool ofctrl_has_backlog(void);
> >>  void ofctrl_wait(void);
> >>  void ofctrl_destroy(void);
> >> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> >> index bc8acddf1..2623fc758 100644
> >> --- a/controller/ovn-controller.c
> >> +++ b/controller/ovn-controller.c
> >> @@ -6384,7 +6384,9 @@ main(int argc, char *argv[])
> >>                             ofctrl_seqno_get_req_cfg(),
> >>                             engine_node_changed(&en_lflow_output),
> >>                             engine_node_changed(&en_pflow_output),
> >> -                           tracked_acl_ids);
> >> +                           tracked_acl_ids,
> >> +                           ovnsb_cond_seqno
> >> +                           == ovnsb_expected_cond_seqno);
> >
> > Hi Aleksandr,
> >
> > I just wanted to let you know that we could probably combine this with
> > https://patchwork.ozlabs.org/project/ovn/patch/67010dca0a8dcc390e873349d662c9163e01050f.1746623091.git.felix.huettner@stackit.cloud/
> > then you could use "!daemon_started_recently()" here.
> >
> > But that is just for information.
> >
> > Thanks a lot,
> > Felix
> >
> >>                             stopwatch_stop(OFCTRL_PUT_STOPWATCH_NAME,
> >>                                            time_msec());
> >>                         }
> >>                         stopwatch_start(OFCTRL_SEQNO_RUN_STOPWATCH_NAME,
> >> diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> >> index efb0a1741..812616711 100644
> >> --- a/tests/ovn-controller.at
> >> +++ b/tests/ovn-controller.at
> >> @@ -2463,6 +2463,68 @@ AT_CHECK([ovs-ofctl dump-flows br-int | grep group -c], [0], [3
> >>  OVN_CLEANUP([hv1])
> >>  AT_CLEANUP
> >>
> >> +OVN_FOR_EACH_NORTHD([
> >> +AT_SETUP([ovn-controller - ofctrl delay until all monitored updates come])
> >> +
> >> +# Prepare testing configuration
> >> +ovn_start
> >> +
> >> +net_add n1
> >> +sim_add hv1
> >> +as hv1
> >> +check ovs-vsctl add-br br-phys
> >> +ovn_attach n1 br-phys 192.168.0.1
> >> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
> >> +    set interface hv1-vif1 external-ids:iface-id=ls1-lp1
> >> +
> >> +check ovn-nbctl ls-add ls1
> >> +
> >> +check ovn-nbctl lsp-add ls1 ls1-lp1 \
> >> +-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.1.2.3"
> >> +
> >> +check ovn-nbctl lb-add lb1 1.1.1.1 10.1.2.3 \
> >> +-- ls-lb-add ls1 lb1
> >> +
> >> +check ovn-nbctl lb-add lb2 2.2.2.2 10.1.2.4 \
> >> +-- ls-lb-add ls1 lb2
> >> +
> >> +check ovn-nbctl --wait=hv sync
> >> +
> >> +# Stop ovn-controller
> >> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> >> +
> >> +# Since we test the initial flow upload, the controller is restarted.
> >> +# Clear the log file and start the controller.
> >> +rm -f hv1/ovn-controller.log
> >> +start_daemon ovn-controller -vfile:jsonrpc:dbg -vfile:ofctrl:dbg
> >> +
> >> +# Monitor the log file until flows are finally uploaded to OVS
> >> +OVS_WAIT_UNTIL([grep -q 'Setting lport.*in OVS' hv1/ovn-controller.log])
> >> +
> >> +# Analyse the log file, select records about:
> >> +# 1. monitor_cond changes made for the SB DB (message class is 'jsonrpc')
> >> +# 2. the 'clearing all flows' message which is issued after the 'wait
> >> +#    before clear' stage is released (message class is 'ofctrl')
> >> +#
> >> +# We expect here that all monitoring condition changes are made before
> >> +# the OVS flows are cleared / uploaded.
> >> +# For now all monitoring updates come in three iterations: initial,
> >> +# direct dps, indirect dps, which corresponds to
> >> +# three messages of type 1 followed by one message of type 2.
> >> +#
> >> +# For monitor-all=true: one message of type 1 followed by one message
> >> +# of type 2.
> >> +#
> >> +# Then we cut off the message class and take the first letter
> >> +# (j for jsonrpc and o for ofctrl).
> >> +#
> >> +call_seq=`grep -E \
> >> +    "(clearing all flows)|(monitor_cond.*South)" \
> >> +    hv1/ovn-controller.log | cut -d "|" -f 3- | cut -b 1 | tr -d '\n'`
> >> +AT_CHECK([echo $call_seq | grep -qE "^j+o$"], [0])
> >> +
> >> +OVN_CLEANUP([hv1])
> >> +AT_CLEANUP
> >> +])
> >>
> >>  AT_SETUP([ovn-controller - check ovn-chassis-mac-mappings])
> >>
> >> --
> >> 2.47.0
> >>
> >> _______________________________________________
> >> dev mailing list
> >> [email protected]
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
