On 3/20/25 13:20, Ilya Maximets wrote:
> On 3/3/25 10:19, Martin Morgenstern via dev wrote:
>> From: Felix Huettner <[email protected]>
>>
>> This change makes the ovsdb_idl_loop_run_completion() function from the
>> northd code in OVN usable for other parts of the code and adapts the
>> client synchronisation layer functions accordingly.
>>
>> It allows users to continue processing idl messages until some timeout
>> arrives or no more messages exist.
>>
>> Signed-off-by: Martin Morgenstern <[email protected]>
>> Signed-off-by: Felix Huettner <[email protected]>
>> ---
>>  lib/ovsdb-cs.c  | 10 ++++++++--
>>  lib/ovsdb-cs.h  |  2 +-
>>  lib/ovsdb-idl.c | 46 ++++++++++++++++++++++++++++++++++++++++++++--
>>  lib/ovsdb-idl.h |  5 ++++-
>>  4 files changed, 57 insertions(+), 6 deletions(-)
>>
>> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c
>> index 81a09da6e..e4d5c4f2d 100644
>> --- a/lib/ovsdb-cs.c
>> +++ b/lib/ovsdb-cs.c
>> @@ -611,12 +611,12 @@ ovsdb_cs_db_add_event(struct ovsdb_cs_db *db, enum 
>> ovsdb_cs_event_type type)
>>   *
>>   * Initializes 'events' with a list of events that occurred on 'cs'.  The
>>   * caller must process and destroy all of the events. */
>> -void
>> +int
>>  ovsdb_cs_run(struct ovsdb_cs *cs, struct ovs_list *events)
>>  {
>>      ovs_list_init(events);
>>      if (!cs->session) {
>> -        return;
>> +        return EINVAL;
>>      }
>>  
>>      ovsdb_cs_send_cond_change(cs);
>> @@ -671,6 +671,12 @@ ovsdb_cs_run(struct ovsdb_cs *cs, struct ovs_list 
>> *events)
>>      jsonrpc_session_gratuitous_echo_reply(cs->session);
>>  
>>      ovs_list_push_back_all(events, &cs->data.events);
>> +
>> +    if (ret == EAGAIN) {
>> +        return EAGAIN;
>> +    } else {
>> +        return 0;
>> +    }
>>  }
>>  
>>  /* Arranges for poll_block() to wake up when ovsdb_cs_run() has something to
>> diff --git a/lib/ovsdb-cs.h b/lib/ovsdb-cs.h
>> index bcc3dcd71..7ba2a8d70 100644
>> --- a/lib/ovsdb-cs.h
>> +++ b/lib/ovsdb-cs.h
>> @@ -118,7 +118,7 @@ struct ovsdb_cs *ovsdb_cs_create(const char *database, 
>> int max_version,
>>                                   void *ops_aux);
>>  void ovsdb_cs_destroy(struct ovsdb_cs *);
>>  
>> -void ovsdb_cs_run(struct ovsdb_cs *, struct ovs_list *events);
>> +int ovsdb_cs_run(struct ovsdb_cs *, struct ovs_list *events);
>>  void ovsdb_cs_wait(struct ovsdb_cs *);
>>  
>>  /* Network connection. */
>> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
>> index 4c2a3e3aa..b90df90ac 100644
>> --- a/lib/ovsdb-idl.c
>> +++ b/lib/ovsdb-idl.c
>> @@ -45,6 +45,7 @@
>>  #include "simap.h"
>>  #include "sset.h"
>>  #include "svec.h"
>> +#include "timeval.h"
>>  #include "util.h"
>>  #include "uuid.h"
>>  #include "openvswitch/vlog.h"
>> @@ -439,13 +440,14 @@ ovsdb_idl_clear(struct ovsdb_idl *db)
>>  /* Processes a batch of messages from the database server on 'idl'.  This 
>> may
>>   * cause the IDL's contents to change.  The client may check for that with
>>   * ovsdb_idl_get_seqno(). */
>> -void
>> +int
>>  ovsdb_idl_run(struct ovsdb_idl *idl)
>>  {
>>      ovs_assert(!idl->txn);
>>  
>>      struct ovs_list events;
>> -    ovsdb_cs_run(idl->cs, &events);
>> +    int ret;
>> +    ret = ovsdb_cs_run(idl->cs, &events);
>>  
>>      struct ovsdb_cs_event *event;
>>      LIST_FOR_EACH_POP (event, list_node, &events) {
>> @@ -479,6 +481,8 @@ ovsdb_idl_run(struct ovsdb_idl *idl)
>>      ovsdb_idl_reparse_refs_to_inserted(idl);
>>      ovsdb_idl_reparse_deleted(idl);
>>      ovsdb_idl_row_destroy_postprocess(idl);
>> +
>> +    return ret;
>>  }
>>  
>>  /* Arranges for poll_block() to wake up when ovsdb_idl_run() has something 
>> to
>> @@ -4391,6 +4395,44 @@ ovsdb_idl_loop_run(struct ovsdb_idl_loop *loop)
>>      return loop->open_txn;
>>  }
>>  
>> +/* Run the ovsdb_idl_loop until there is no more data to be received.
>> + * A timeout in milliseconds can be provided. The actual duration of the
>> + * function is returned in the duration parameter. */
>> +struct ovsdb_idl_txn *
>> +ovsdb_idl_loop_run_completion(struct ovsdb_idl_loop *idl_loop,
>> +                              unsigned long long timeout,
>> +                              uint64_t *idl_duration)
>> +{
>> +    unsigned long long duration, start = time_msec();
>> +    struct ovsdb_idl_txn *txn;
>> +    int n = 0;
>> +
>> +    /* Accumulate database changes as long as there are some,
>> +     * but no longer than the given timeout. */
>> +    while (time_msec() - start < timeout) {
>> +        if (ovsdb_idl_run(idl_loop->idl) == EAGAIN) {
>> +            break;
>> +        }
>> +        n++;
>> +    }
>> +
>> +    txn = ovsdb_idl_loop_run(idl_loop);
>> +
>> +    duration = time_msec() - start;
>> +    if (idl_duration) {
>> +        *idl_duration = duration;
>> +    }
>> +    /* ovsdb_idl_run() is called at least 2 times.  Once directly and
>> +     * once in the ovsdb_idl_loop_run().  n > 2 means that we received
>> +     * data on at least 2 subsequent calls. */
>> +    if (n > 2 || duration > 100) {
>> +        VLOG(duration > timeout ? VLL_INFO : VLL_DBG,
>> +             "%d iterations in %lld ms", n + 1, duration);
>> +    }
>> +
>> +    return txn;
>> +}

Hi Ilya,

> I'll summarize my thoughts on the last 3 patches of the set here.
> 
> The function above was written this way for OVN originally because there was 
> no
> IDL API to use that would allow timing the calls and batching things together.
> But since we're already changing IDL, CS and jsonrpc APIs in this patch set, 
> we
> may as well add a proper timing API instead for applications to use.

That sounds reasonable.

> The function above also doesn't allow to be precise enough with the timings 
> since
> it doesn't control what is happening underneath, i.e. doesn't actually know 
> how
> many iterations the libraries below will make and how much time will that 
> take.

This is exactly what we see in our setup. With the timeout set to 500ms,
we are constantly getting logs reporting N iterations with a duration
slightly higher than 500ms and, very seldomly, even between 10s and 15s.

> I'd suggest we add proper time-based APIs throughout the stack instead.  E.g.
> create _until() flavors of the idl_loop_run, ovsdb_idl_run, ovsdb_cs_run,
> jsonrpc_session_run and jsonrpc_recv, i.e. all the functions throughout the 
> stack,
> adding a deadline argument.  These functions would try to receive something 
> until
> they get a complete message (for functions that receive a single message) or 
> will
> accumulate messages / events until deadline is reached  (for functions that 
> can
> process or gather multiple messages / events).  Functions may still return 
> early
> if there is nothing to receive from the lower level at the moment.
> 
> The time accounting will still be far from perfect in this case, but should be
> much more accurate.
> 
> Having a 'deadline' semantics instead of a 'timeout' would make it much easier
> to pass the argument through the stack of calls, as it will not need any 
> updates.
> 
> What do you think?

Sounds good to me. Thanks a lot for these suggestions! We will have a
look at this and prepare another patch set.

For me it would make sense to have a separate patch set for the timing
API changes and rework v2 of the current patch set to only include the
small (and revised) changes from PATCH 2/6 (jsonrpc: Add coverage for
retries of jsonrpc_recv()) and PATCH 3/6 (jsonrpc: Optimize receive
buffer size). What do you think?

Best regards,
Martin

> 
> Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to