* Phil Howard ([email protected]) wrote:
> On Mon, Jun 6, 2011 at 12:21 PM, Mathieu Desnoyers
> <[email protected]> wrote:
> > * Mathieu Desnoyers ([email protected]) wrote:
> >> I notice that the "poll(NULL, 0, 10);" delay is executed both for the RT
> >> and non-RT code. So given that my goal is to get the call_rcu thread to
> >> GC memory as quickly as possible to diminish the overhead of cache
> >> misses, I decided to try removing this delay for !RT: the call_rcu
> >> thread then wakes up ASAP when the thread invoking call_rcu wakes it. My
> >> updates jump to 76349/s (getting there!) ;).
> >>
> >> This improvement can be explained by a lower delay between call_rcu and
> >> execution of its callback, which decreases the amount of cache used, and
> >> therefore provides better cache locality.
> >
> > I just wonder if it's worth it: removing this delay from the !RT
> > call_rcu thread can cause a high rate of synchronize_rcu() calls. So
> > although there might be an advantage in terms of update rate, it will
> > likely cause extra cache-line bounces between the call_rcu threads and
> > the reader threads.
> >
> > test_urcu_rbtree 7 1 20 -g 1000000
> >
> > With the delay in the call_rcu thread:
> > search: 1842857 items/reader thread/s (7 reader threads)
> > updates: 21066 items/s (1 update thread)
> > ratio: 87 search/update
> >
> > Without the delay in the call_rcu thread:
> > search: 3064285 items/reader thread/s (7 reader threads)
> > updates: 45096 items/s (1 update thread)
> > ratio: 68 search/update
> >
> > So basically, adding the delay doubles the update performance, at the
> > cost of being 33% slower for reads. My first thought is that if an
> > application has very frequent updates, then maybe it wants to have fast
> > updates because the update throughput is then important.
> > If the
> > application has infrequent updates, then the reads will be fast anyway,
> > because rare call_rcu invocation will trigger less cache-line bouncing
> > between readers and writers. Any other thoughts on this trade-off and
> > how to deal with it?
>
> Did I miss something here? It looks like you more than doubled the
> update rate and almost doubled the lookup rate. The search/update
> ratio is less, but if both the raw rates improved so much, how is
> this a bad thing?
Actually, my discussion of the results was good, but I mis-entered the raw results. Here is the re-run of the tests, with the results correctly entered this time. I notice that on repeated runs, the update rates seem to be much closer between delay vs. no-delay than the original difference I noticed.

test_urcu_rbtree 7 1 20 -g 1000000

With the delay in the call_rcu thread:
search: 3064285 items/reader thread/s (7 reader threads)
updates: 43051 items/s (1 update thread)
ratio: 71 search/update

Without the delay in the call_rcu thread:
search: 1550000 items/reader thread/s (7 reader threads)
updates: 47221 items/s (1 update thread)
ratio: 33 search/update

So removing the delay seems to hurt read performance quite a lot, and does not benefit updates as much as I initially thought (it's only 9.7%). I would be tempted to just leave the delay in place for the !RT case, then.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
