On 5/27/25 5:54 PM, Felix Huettner wrote:

Hi Felix,
I am sorry, I misunderstood your plan for the countdown. If you want to
trigger it after the first update is utilized, all is correct. So I withdraw
my review complaint and will use the suggested function in my code instead
of directly comparing ovnsb_expected_cond_seqno vs ovnsb_cond_seqno.

Thank you,
Alexander

> On Fri, May 23, 2025 at 10:02:59AM +0000, Smirnov Aleksandr (K2 Cloud) wrote:
>> Hi Felix,
>>
>> Answering your proposal: yes, I can reuse this solution instead of a
>> direct check of seqno vs expected seqno.
>> However, I have inspected what you are doing in your patch and want to
>> point out one mistake (at least in my opinion).
> Hi Alexander,
>
> thanks for the review.
>
>> When I tried to play with the seqno/expected seqno numbers I realized that
>> they are only set to meaningful values BELOW the line
>>
>>     ovnsb_expected_cond_seqno =
>>         update_sb_monitors(
>>             ovnsb_idl_loop.idl, chassis,
>>             &runtime_data->local_lports,
>>             &runtime_data->lbinding_data.bindings,
>>             &runtime_data->local_datapaths,
>>             sb_monitor_all);
>>
> So per default ovnsb_expected_cond_seqno is initialized to UINT_MAX.
> Then I see two locations where it might be set to an actual value:
>
> 1. At the start of the main loop in update_sb_db, but only if
>    monitor_all is set.
> 2. The point you mentioned, which happens after the engine computed
>    completely.
>
>> So, if you want to make a meaningful comparison of the seqno(s) to answer
>> the question 'are there upcoming updates', you have to do that check after
>> the update_sb_monitors() call and before the next iteration of the main
>> loop.
> Or at the start of the main loop if we are sure that ovnsb_expected_cond_seqno
> is valid. I did this in my patch with a
> `ovnsb_expected_cond_seqno != UINT_MAX` check.
>
>> Otherwise (again, in my opinion) your code will always hit the guard
>> checks (time, max number, etc).
> I am not sure if I get that point correctly. I want to share an example
> case based on my understanding of what you mean.
> You can tell me if I got it right:
>
> We assume a case where monitor-all=false.
> When starting ovn-controller we will connect to the local ovsdb and
> southbound. We then run the engine the first time and set the initial
> monitoring conditions.
> At this point ovnsb_expected_cond_seqno is valid and larger than
> ovnsb_cond_seqno.
>
> Before we now get the initial monitor results from southbound, something
> happens that prevents the engine from running (e.g. northd_version_match
> is false).
> At some point the initial monitor updates arrive, which will cause
> ovnsb_expected_cond_seqno == ovnsb_cond_seqno. At least in my patch any
> further main loop iteration would now cause daemon_started_recently_countdown
> to be triggered, even though the engine never had the chance to update
> and define a new ovnsb_expected_cond_seqno.
>
> If the engine can then start running again, daemon_started_recently()
> might return true, even though we did not yet allow for up to 20
> initial condition changes.
>
>> This is still a dark area for me, but at least I need to share my
>> understanding of the seqno life cycle.
> Same for me, and I think you found a valid point.
>
> However, it would be great if you could confirm the above understanding.
> If you meant a different point then maybe I need to fix something else
> as well.
>
> I will try to fix the above and send a new version of the patchset if
> you confirm that this was the point you meant.
>
> So thank you very much for finding that.
>
> Thanks,
> Felix
>
>> Thank you,
>>
>> Alexander
>>
>> On 5/7/25 5:26 PM, Felix Huettner wrote:
>>> On Tue, Mar 18, 2025 at 01:08:12PM +0300, Aleksandr Smirnov wrote:
>>>> Delay initial flow upload to OVS until all monitored updates are received.
>>>> This is a kind of replacement for the wait-before-clear parameter.
>>>> Instead of waiting a certain amount of time we will wait
>>>> until the final monitored update comes from the SB DB.
>>>> An event we watch in the controller's main loop is the current condition
>>>> sequence number compared to the expected condition sequence number.
>>>> If they are not equal, we still have to receive updates in response
>>>> to the most recent monitor condition change request. This check makes
>>>> sense only in code that lies after the update_sb_monitors() function call.
>>>>
>>>> Note, this update will only take effect if wait-before-clear == 0,
>>>> i.e. you can still rely on the wait-before-clear behavior.
>>>>
>>>> Signed-off-by: Aleksandr Smirnov <[email protected]>
>>>> ---
>>>>  controller/ofctrl.c         | 29 ++++++++++++++++-
>>>>  controller/ofctrl.h         |  3 +-
>>>>  controller/ovn-controller.c |  4 ++-
>>>>  tests/ovn-controller.at     | 62 +++++++++++++++++++++++++++++++++++++
>>>>  4 files changed, 95 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/controller/ofctrl.c b/controller/ofctrl.c
>>>> index 4a3d35b97..304b9bbc8 100644
>>>> --- a/controller/ofctrl.c
>>>> +++ b/controller/ofctrl.c
>>>> @@ -349,6 +349,9 @@ static uint64_t cur_cfg;
>>>>  /* Current state. */
>>>>  static enum ofctrl_state state;
>>>>
>>>> +/* Release the wait-before-clear stage. */
>>>> +static bool wait_before_clear_proceed = false;
>>>> +
>>>>  /* The time (ms) to stay in the state S_WAIT_BEFORE_CLEAR. Read from
>>>>   * external_ids: ovn-ofctrl-wait-before-clear.
>>>>   */
>>>>  static unsigned int wait_before_clear_time = 0;
>>>> @@ -446,6 +449,7 @@ run_S_NEW(void)
>>>>      struct ofpbuf *buf = ofpraw_alloc(OFPRAW_NXT_TLV_TABLE_REQUEST,
>>>>                                        rconn_get_version(swconn), 0);
>>>>      xid = queue_msg(buf);
>>>> +    wait_before_clear_proceed = false;
>>>>      state = S_TLV_TABLE_REQUESTED;
>>>>  }
>>>>
>>>> @@ -638,6 +642,14 @@ error:
>>>>  static void
>>>>  run_S_WAIT_BEFORE_CLEAR(void)
>>>>  {
>>>> +    if (wait_before_clear_time == 0) {
>>>> +        if (wait_before_clear_proceed) {
>>>> +            state = S_CLEAR_FLOWS;
>>>> +        }
>>>> +
>>>> +        return;
>>>> +    }
>>>> +
>>>>      if (!wait_before_clear_time ||
>>>>          (wait_before_clear_expire &&
>>>>           time_msec() >= wait_before_clear_expire)) {
>>>> @@ -2695,11 +2707,26 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table,
>>>>             uint64_t req_cfg,
>>>>             bool lflows_changed,
>>>>             bool pflows_changed,
>>>> -           struct tracked_acl_ids *tracked_acl_ids)
>>>> +           struct tracked_acl_ids *tracked_acl_ids,
>>>> +           bool monitor_cond_complete)
>>>>  {
>>>>      static bool skipped_last_time = false;
>>>>      static uint64_t old_req_cfg = 0;
>>>>      bool need_put = false;
>>>> +
>>>> +    if (state == S_WAIT_BEFORE_CLEAR) {
>>>> +        /* If no more monitored condition changes are expected,
>>>> +         * release the wait-before-clear stage and skip
>>>> +         * over the poll wait.
>>>> +         */
>>>> +        if (monitor_cond_complete) {
>>>> +            wait_before_clear_proceed = true;
>>>> +            poll_immediate_wake();
>>>> +        }
>>>> +
>>>> +        skipped_last_time = true;
>>>> +        return;
>>>> +    }
>>>> +
>>>>      if (lflows_changed || pflows_changed || skipped_last_time ||
>>>>          ofctrl_initial_clear) {
>>>>          need_put = true;
>>>> diff --git a/controller/ofctrl.h b/controller/ofctrl.h
>>>> index d1ee69cb0..76e2fbece 100644
>>>> --- a/controller/ofctrl.h
>>>> +++ b/controller/ofctrl.h
>>>> @@ -68,7 +68,8 @@ void ofctrl_put(struct ovn_desired_flow_table *lflow_table,
>>>>                  uint64_t nb_cfg,
>>>>                  bool lflow_changed,
>>>>                  bool pflow_changed,
>>>> -                struct tracked_acl_ids *tracked_acl_ids);
>>>> +                struct tracked_acl_ids *tracked_acl_ids,
>>>> +                bool monitor_cond_complete);
>>>>  bool ofctrl_has_backlog(void);
>>>>  void ofctrl_wait(void);
>>>>  void ofctrl_destroy(void);
>>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>>>> index bc8acddf1..2623fc758 100644
>>>> --- a/controller/ovn-controller.c
>>>> +++ b/controller/ovn-controller.c
>>>> @@ -6384,7 +6384,9 @@ main(int argc, char *argv[])
>>>>                        ofctrl_seqno_get_req_cfg(),
>>>>                        engine_node_changed(&en_lflow_output),
>>>>                        engine_node_changed(&en_pflow_output),
>>>> -                      tracked_acl_ids);
>>>> +                      tracked_acl_ids,
>>>> +                      ovnsb_cond_seqno
>>>> +                      == ovnsb_expected_cond_seqno);
>>> Hi Aleksandr,
>>>
>>> I just wanted to let you know that we could probably combine this with
>>> https://patchwork.ozlabs.org/project/ovn/patch/67010dca0a8dcc390e873349d662c9163e01050f.1746623091.git.felix.huettner@stackit.cloud/
>>> then you could use "!daemon_started_recently()" here.
>>>
>>> But that is just for information.
>>>
>>> Thanks a lot,
>>> Felix
>>>
>>>>              stopwatch_stop(OFCTRL_PUT_STOPWATCH_NAME, time_msec());
>>>>          }
>>>>          stopwatch_start(OFCTRL_SEQNO_RUN_STOPWATCH_NAME,
>>>> diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
>>>> index efb0a1741..812616711 100644
>>>> --- a/tests/ovn-controller.at
>>>> +++ b/tests/ovn-controller.at
>>>> @@ -2463,6 +2463,68 @@ AT_CHECK([ovs-ofctl dump-flows br-int | grep group -c], [0], [3
>>>>  OVN_CLEANUP([hv1])
>>>>  AT_CLEANUP
>>>>
>>>> +OVN_FOR_EACH_NORTHD([
>>>> +AT_SETUP([ovn-controller - ofctrl delay until all monitored updates come])
>>>> +
>>>> +# Prepare the testing configuration
>>>> +ovn_start
>>>> +
>>>> +net_add n1
>>>> +sim_add hv1
>>>> +as hv1
>>>> +check ovs-vsctl add-br br-phys
>>>> +ovn_attach n1 br-phys 192.168.0.1
>>>> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
>>>> +    set interface hv1-vif1 external-ids:iface-id=ls1-lp1
>>>> +
>>>> +check ovn-nbctl ls-add ls1
>>>> +
>>>> +check ovn-nbctl lsp-add ls1 ls1-lp1 \
>>>> +-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.1.2.3"
>>>> +
>>>> +check ovn-nbctl lb-add lb1 1.1.1.1 10.1.2.3 \
>>>> +-- ls-lb-add ls1 lb1
>>>> +
>>>> +check ovn-nbctl lb-add lb2 2.2.2.2 10.1.2.4 \
>>>> +-- ls-lb-add ls1 lb2
>>>> +
>>>> +check ovn-nbctl --wait=hv sync
>>>> +
>>>> +# Stop ovn-controller
>>>> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>>> +
>>>> +# Since we test the initial flow upload, the controller is restarted.
>>>> +# Clear the log file and start the controller.
>>>> +rm -f hv1/ovn-controller.log
>>>> +start_daemon ovn-controller -vfile:jsonrpc:dbg -vfile:ofctrl:dbg
>>>> +
>>>> +# Monitor the log file until flows are finally uploaded to OVS
>>>> +OVS_WAIT_UNTIL([grep -q 'Setting lport.*in OVS' hv1/ovn-controller.log])
>>>> +
>>>> +# Analyse the log file, selecting records about:
>>>> +# 1. monitor_cond changes made for the SB DB (message class is 'jsonrpc')
>>>> +# 2.
>>>> +#    the 'clearing all flows' message, which is issued after the 'wait
>>>> +#    before clear' stage is released (message class is 'ofctrl')
>>>> +#
>>>> +# We expect here that all monitoring condition changes are made before
>>>> +# OVS flows are cleared / uploaded.
>>>> +# For now all monitoring updates come in three iterations: initial,
>>>> +# direct dps, indirect dps, which corresponds to
>>>> +# three messages of type 1 followed by one message of type 2.
>>>> +#
>>>> +# For monitor-all=true: one message of type 1 followed by one message
>>>> +# of type 2.
>>>> +#
>>>> +# Then we cut off the message class and take its first letter
>>>> +# (j for jsonrpc and o for ofctrl).
>>>> +#
>>>> +call_seq=`grep -E \
>>>> +    "(clearing all flows)|(monitor_cond.*South)" \
>>>> +    hv1/ovn-controller.log | cut -d "|" -f 3- | cut -b 1 | tr -d '\n'`
>>>> +AT_CHECK([echo $call_seq | grep -qE "^j+o$"], [0])
>>>> +
>>>> +OVN_CLEANUP([hv1])
>>>> +AT_CLEANUP
>>>> +])
>>>>
>>>>  AT_SETUP([ovn-controller - check ovn-chassis-mac-mappings])
>>>>
>>>> --
>>>> 2.47.0
>>>>
>>>> _______________________________________________
>>>> dev mailing list
>>>> [email protected]
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
