On 1/12/24 23:47, Ilya Maximets wrote: > On 1/12/24 23:10, Frode Nordahl wrote: >> On Fri, Jan 12, 2024 at 5:20 PM Ilya Maximets <i.maxim...@ovn.org> wrote: >>> >>> On 1/12/24 01:27, Ilya Maximets wrote: >>>> On 1/10/24 20:29, Frode Nordahl wrote: >>>>> During testing of the implementation of cooperative multitasking >>>>> in the ovsdb-server we noticed that the very first yield being >>>>> called in the jsonrpc-server always fired. >>>>> >>>>> This indicates that the operations being after/before storage run >>>>> is taking too long as it currently is. >>>>> >>>>> Moving the storage run section to the top of the main loop lead >>>>> to successful results using less yield calls as documented in the >>>>> commit message of the next commit in this series. >>>> >>>> I don't really understand what exactly is taking so much time and >>>> why moving this part of the code helps. Could you elaborate? >>> >>> I think, I got it. In the current version of a patch set we have: >>> >>> main(): >>> 0. open_db() >>> raft_alloc() <-- register callback >>> read_db() <-- takes a lot of time >>> main_loop() >>> 1. run jsonrpc <-- yields and warns because read_db >>> already took a lot of time. >>> yield() >>> raft_run() <-- updates the callback >>> 2. storage_run() >>> raft_run() <-- updates the callback >>> >>> The solution proposed in this patch is to swap 1 and 2, >>> so we call raft_run() directly before we yield for the >>> first time. >>> >>> I suppose, my seggestion to not have _register() function would >>> solve that problem, because the callback will not be registered >>> until the first call to raft_run(). >>> >>> Is that correct? >> >> Unfortunately, also when only having the _set() function the first >> yield always fires. >> >> Putting log lines before/after every major function call in the >> main_loop provides something like this: >> 2024-01-12T22:01:46.968Z|00204|ovsdb_server|DBG|HELLO 0 >> 2024-01-12T22:01:46.968Z|00205|ovsdb_server|DBG|HELLO 1 >> 2024-01-12T22:01:46.968Z|00206|ovsdb_server|DBG|HELLO 2 >> 2024-01-12T22:01:46.968Z|00207|ovsdb_server|DBG|HELLO 3 >> 2024-01-12T22:01:46.968Z|00208|ovsdb_server|DBG|HELLO 4 >> 2024-01-12T22:01:46.968Z|00209|ovsdb_server|DBG|HELLO 5 >> 2024-01-12T22:01:46.968Z|00210|ovsdb_server|DBG|HELLO 6 >> 2024-01-12T22:01:47.134Z|00211|ovsdb_server|DBG|HELLO 0 >> 2024-01-12T22:01:47.134Z|00212|ovsdb_server|DBG|HELLO 1 >> 2024-01-12T22:01:47.134Z|00213|ovsdb_server|DBG|HELLO 2 >> 2024-01-12T22:01:47.134Z|00214|ovsdb_server|DBG|HELLO 3 >> 2024-01-12T22:01:47.134Z|00215|ovsdb_server|DBG|HELLO 4 >> 2024-01-12T22:01:47.134Z|00216|ovsdb_server|DBG|HELLO 5 >> 2024-01-12T22:01:47.135Z|00217|cooperative_multitasking|DBG|ovsdb/jsonrpc-server.c:604: >> yield for raft_run(0x560b47e225e0): elapsed(167) >= threshold(166), >> overrun: 1 > > But this is a debug log, i.e. it doesn't actually fire. > The overrun here is just 1ms, which should be fine. > Or am I missing something?
Sorry, it is firing, it's just not a warning at this point. But the logic below about poll_timer_wait_until(raft->ping_timeout); waking the thread up at exactly ping timeout and causing the next possible yield to fire still stands. > >> >> So I actually wonder if the wait timers set up in raft_wait() are too >> optimistic? Then again, those timers can't ever take into account the >> number of clients serviced by for example the call to >> ovsdb_jsonrpc_server_run(). > > You seem to use 500ms as a RAFT election timer here, that sounds like > a small number to use. I'd not recommend anything smaller than a > default 1 second. 300ms ping cadence should be fine. You can't predict > the total workload due to unknown number of clients, but hundreds of > milliseconds is a lot of time for a modern CPU, so should generally be > fine. > > If I understand the log above correctly, It shows 2 iterations of the > loop, right? And between them ovsdb-server sleeps for 168-ish ms > in the poll_block. Likely because it is a timeout for a ping timer. > > I we want the yield to not fire when poll_block wakes up on exact time > the ping timer should fire (it always will wake up at that time), we > may increase the threshold. E.g. add something like 10-100 ms to it. > i.e. mkae yilding slightly less frequent than scheduled wakeups for > raft_wait. > > Does that make sense? > >> >> I know that reordering the whole thing is a bit drastic, and I kind of >> regret proposing that. >> >> Would an alternative be to slip in a yield at the top of the main_loop >> function or a SHASH_FOR_EACH_SAFE (node, all_dbs) that just does >> ovsdb_storage_run()? >> > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev