On 1/12/24 23:47, Ilya Maximets wrote:
> On 1/12/24 23:10, Frode Nordahl wrote:
>> On Fri, Jan 12, 2024 at 5:20 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
>>>
>>> On 1/12/24 01:27, Ilya Maximets wrote:
>>>> On 1/10/24 20:29, Frode Nordahl wrote:
>>>>> During testing of the implementation of cooperative multitasking
>>>>> in the ovsdb-server we noticed that the very first yield being
>>>>> called in the jsonrpc-server always fired.
>>>>>
>>>>> This indicates that the operations being after/before storage run
>>>>> is taking too long as it currently is.
>>>>>
>>>>> Moving the storage run section to the top of the main loop lead
>>>>> to successful results using less yield calls as documented in the
>>>>> commit message of the next commit in this series.
>>>>
>>>> I don't really understand what exactly is taking so much time and
>>>> why moving this part of the code helps.  Could you elaborate?
>>>
>>> I think, I got it.  In the current version of a patch set we have:
>>>
>>> main():
>>> 0.    open_db()
>>>           raft_alloc()  <-- register callback
>>>           read_db()     <-- takes a lot of time
>>>       main_loop()
>>> 1.        run jsonrpc   <-- yields and warns because read_db
>>>                             already took a lot of time.
>>>               yield()
>>>                   raft_run()  <-- updates the callback
>>> 2.        storage_run()
>>>               raft_run()      <-- updates the callback
>>>
>>> The solution proposed in this patch is to swap 1 and 2,
>>> so we call raft_run() directly before we yield for the
>>> first time.
>>>
>>> I suppose, my seggestion to not have _register() function would
>>> solve that problem, because the callback will not be registered
>>> until the first call to raft_run().
>>>
>>> Is that correct?
>>
>> Unfortunately, also when only having the _set() function the first
>> yield always fires.
>>
>> Putting log lines before/after every major function call in the
>> main_loop provides something like this:
>> 2024-01-12T22:01:46.968Z|00204|ovsdb_server|DBG|HELLO 0
>> 2024-01-12T22:01:46.968Z|00205|ovsdb_server|DBG|HELLO 1
>> 2024-01-12T22:01:46.968Z|00206|ovsdb_server|DBG|HELLO 2
>> 2024-01-12T22:01:46.968Z|00207|ovsdb_server|DBG|HELLO 3
>> 2024-01-12T22:01:46.968Z|00208|ovsdb_server|DBG|HELLO 4
>> 2024-01-12T22:01:46.968Z|00209|ovsdb_server|DBG|HELLO 5
>> 2024-01-12T22:01:46.968Z|00210|ovsdb_server|DBG|HELLO 6
>> 2024-01-12T22:01:47.134Z|00211|ovsdb_server|DBG|HELLO 0
>> 2024-01-12T22:01:47.134Z|00212|ovsdb_server|DBG|HELLO 1
>> 2024-01-12T22:01:47.134Z|00213|ovsdb_server|DBG|HELLO 2
>> 2024-01-12T22:01:47.134Z|00214|ovsdb_server|DBG|HELLO 3
>> 2024-01-12T22:01:47.134Z|00215|ovsdb_server|DBG|HELLO 4
>> 2024-01-12T22:01:47.134Z|00216|ovsdb_server|DBG|HELLO 5
>> 2024-01-12T22:01:47.135Z|00217|cooperative_multitasking|DBG|ovsdb/jsonrpc-server.c:604:
>> yield for raft_run(0x560b47e225e0): elapsed(167) >= threshold(166),
>> overrun: 1
> 
> But this is a debug log, i.e. it doesn't actually fire.
> The overrun here is just 1ms, which should be fine.
> Or am I missing something?

Sorry, it is firing, it's just not a warning at this point.
But the logic below about poll_timer_wait_until(raft->ping_timeout);
waking the thread up at exactly ping timeout and causing the next
possible yield to fire still stands.

> 
>>
>> So I actually wonder if the wait timers set up in raft_wait() are too
>> optimistic? Then again, those timers can't ever take into account the
>> number of clients serviced by for example the call to
>> ovsdb_jsonrpc_server_run().
> 
> You seem to use 500ms as a RAFT election timer here, that sounds like
> a small number to use.  I'd not recommend anything smaller than a
> default 1 second.  300ms ping cadence should be fine.  You can't predict
> the total workload due to unknown number of clients, but hundreds of
> milliseconds is a lot of time for a modern CPU, so should generally be
> fine.
> 
> If I understand the log above correctly, It shows 2 iterations of the
> loop, right?  And between them ovsdb-server sleeps for 168-ish ms
> in the poll_block.  Likely because it is a timeout for a ping timer.
> 
> I we want the yield to not fire when poll_block wakes up on exact time
> the ping timer should fire (it always will wake up at that time), we
> may increase the threshold.  E.g. add something like 10-100 ms to it.
> i.e. mkae yilding slightly less frequent than scheduled wakeups for
> raft_wait.
> 
> Does that make sense?
> 
>>
>> I know that reordering the whole thing is a bit drastic, and I kind of
>> regret proposing that.
>>
>> Would an alternative be to slip in a yield at the top of the main_loop
>> function or a SHASH_FOR_EACH_SAFE (node, all_dbs) that just does
>> ovsdb_storage_run()?
>>
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to