Hi,

I did some benchmarking on the rbtree2 branch (my RCU red-black tree implementation) and found some significant slowdowns caused by the current call_rcu implementation. This patch series works through the problems I noticed and explains the solutions. The three main problems were:
- Lack of per-cpu affinity for per-cpu call_rcu threads.
- Use of pthread_cond, which requires a mutex, on each call_rcu invocation to signal the call_rcu thread, causing heavy mutex contention.
- More delay than necessary between executions of the call_rcu thread in the wakeup-based scenario, which increased the cache footprint.

I'm proposing these patches as an RFC. If you think they are acceptable, I'll pull them into the liburcu mainline.

Comments are welcome.

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
