Hello!

This series reduces lock contention on the root rcu_node structure,
and is also the first precursor to TBD changes to consolidate the
three RCU flavors (RCU-bh, RCU-preempt, and RCU-sched) into one.

1.  Improve non-root rcu_cbs_completed() accuracy, thus reducing the
    need to acquire the root rcu_node structure's ->lock.  This also
    eliminates the need to reassign callbacks to an earlier grace
    period, which enables introduction of funnel locking in a later
    commit, which further reduces contention.

2.  Make rcu_start_future_gp()'s grace-period check more precise,
    eliminating one need for forward-progress failsafe checks that
    acquire the root rcu_node structure's ->lock.

3.  Create (and make use of) accessors for the ->need_future_gp[]
    array to enable easy changes in size.

4.  Make rcu_gp_kthread() check for early-boot activity, which was
    another situation needing failsafe checks.

5.  Make rcu_gp_cleanup() more accurately predict the need for a new
    GP.  This eliminates the need for both failsafe checks and extra
    grace-period kthread wakeups.

6.  Avoid losing ->need_future_gp[] values due to GP start/end races
    by expanding this array from two elements to four.

7.  Make rcu_future_needs_gp() check all ->need_future_gp[] elements,
    again to eliminate a need for both failsafe checks and extra
    grace-period kthread wakeups.

8.  Convert the ->need_future_gp[] array to boolean, given that there
    is no longer a need to count the number of requests for a future
    grace period.

9.  Make rcu_migrate_callbacks() wake the GP kthread when needed,
    which again eliminates a need for failsafe checks.

10. Avoid __call_rcu_core() root rcu_node ->lock acquisition, which
    was one of the failsafe checks that many of the above patches
    were making safe to remove.

11. Switch __rcu_process_callbacks() to rcu_accelerate_cbs(), which
    was one of the failsafe checks that many of the above patches
    were making safe to remove.  (Yes, this one also acquired the
    root rcu_node structure's ->lock, and was in fact the lock
    acquisition that was showing up in Nick Piggin's traces.)

12. Put ->completed into an unsigned long instead of an int.  (The
    "int" was harmless because only the low-order bits were used,
    but it was still an accident waiting to happen.)

13. Clear requests other than RCU_GP_FLAG_INIT at grace-period end.
    This prevents premature quiescent-state forcing that might
    otherwise occur due to requests posted when the grace period was
    already almost done.

14. Inline rcu_start_gp_advanced() into rcu_start_future_gp().  This
    brings RCU down to only one function to start a grace period, in
    happy contrast to the need to choose correctly between three of
    them before this patch series.

15. Make rcu_start_future_gp() caller select grace period to avoid
    duplicate grace-period selection.  (We are going to like this
    grace period so much that we selected it twice!)

16. Add funnel locking to rcu_start_this_gp(), the point being to
    reduce lock contention, especially on large systems.

17. Make rcu_start_this_gp() check for out-of-range requests.  If
    this check triggers, that indicates a bug in a caller of
    rcu_start_this_gp() or that the ->need_future_gp[] array needs to
    be even bigger, most likely the former.  More importantly, it
    avoids one possible cause of otherwise silent grace-period hangs.

18. The rcu_gp_cleanup() function does not need
    cpu_needs_another_gp() because funnel locking summarizes the need
    for future grace periods in the root rcu_node structure's ->lock,
    which rcu_gp_cleanup() already holds for other reasons.

19. Simplify and inline cpu_needs_another_gp(), which used to be a
    key part of the no-longer-required forward-progress failsafe
    checks.

20. Drop early GP request check from rcu_gp_kthread().  Yes, it was
    added above in order to avoid grace-period hangs, but at this
    point in the series it is no longer needed.  All in the name of
    bisectability.

21. Update list of rcu_future_grace_period() trace events to reflect
    strings added above.

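For anyone unfamiliar with the funnel locking mentioned in item 16, the
general idea can be sketched in user-space C as shown below.  This is
only an illustration under stated assumptions, not the code in
kernel/rcu/tree.c: struct node, node_init(), request_gp(), NEED_SLOTS,
and the lock_acquisitions counter are all hypothetical names, and the
kernel's actual locking details differ.  A request walks from a leaf
toward the root and stops at the first node where the same grace period
has already been requested, so only the first requester needs the
root's lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NEED_SLOTS 4	/* Hypothetical; echoes the four-element array in item 6. */

/* Hypothetical stand-in for a tree node; not the kernel's rcu_node. */
struct node {
	pthread_mutex_t lock;
	struct node *parent;			/* NULL at the root. */
	bool need_gp[NEED_SLOTS];		/* Requests, indexed by gp % NEED_SLOTS. */
	unsigned long lock_acquisitions;	/* Instrumentation for this sketch only. */
};

static void node_init(struct node *np, struct node *parent)
{
	int i;

	pthread_mutex_init(&np->lock, NULL);
	np->parent = parent;
	for (i = 0; i < NEED_SLOTS; i++)
		np->need_gp[i] = false;
	np->lock_acquisitions = 0;
}

/*
 * Record a request for grace period "gp", starting at "leaf".
 * Returns true only if this call was the first to record the
 * request at the root, in which case a real implementation would
 * go on to start or wake whatever drives grace periods.
 */
static bool request_gp(struct node *leaf, unsigned long gp)
{
	struct node *np;
	bool reached_root = false;

	assert(leaf != NULL);
	for (np = leaf; np; np = np->parent) {
		bool already;

		pthread_mutex_lock(&np->lock);
		np->lock_acquisitions++;
		already = np->need_gp[gp % NEED_SLOTS];
		np->need_gp[gp % NEED_SLOTS] = true;
		if (!already && !np->parent)
			reached_root = true;
		pthread_mutex_unlock(&np->lock);
		if (already)
			break;	/* Someone else already funneled this request upward. */
	}
	return reached_root;
}
```

In this sketch, a second request for the same grace period from the
same leaf touches only that leaf's lock, and a request from a sibling
leaf stops at the first shared ancestor that already has the request
recorded.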
							Thanx, Paul

------------------------------------------------------------------------

 include/trace/events/rcu.h |   13 -
 kernel/rcu/rcu_segcblist.c |   18 -
 kernel/rcu/rcu_segcblist.h |    2
 kernel/rcu/tree.c          |  406 ++++++++++++++++-----------------------------
 kernel/rcu/tree.h          |   24 ++
 kernel/rcu/tree_plugin.h   |   28 ---
 6 files changed, 182 insertions(+), 309 deletions(-)