On Fri, Oct 04, 2013 at 05:23:48PM -0700, Paul E. McKenney wrote:
> The underlying problem is that perf is invoking call_rcu() with the
> scheduler locks held, but in NOCB mode, call_rcu() will with high
> probability invoke the scheduler -- which just might want to use its
> locks.  The reason that call_rcu() needs to invoke the scheduler is
> to wake up the corresponding rcuo callback-offload kthread, which
> does the job of starting up a grace period and invoking the callbacks
> afterwards.
>
> One solution (championed on a related problem by Lai Jiangshan) is to

That's rcu_read_unlock_special(), right?

> simply defer the wakeup to some point where scheduler locks are no longer
> held.  Since we don't want to unnecessarily incur the cost of such
> deferral, the task before us is threefold:
>
> 1.	Determine when it is likely that a relevant scheduler lock is held.
>
> 2.	Defer the wakeup in such cases.
>
> 3.	Ensure that all deferred wakeups eventually happen, preferably
> 	sooner rather than later.
>
> We use irqs_disabled_flags() as a proxy for relevant scheduler locks
> being held.  This works because the relevant locks are always acquired
> with interrupts disabled.  We may defer more often than needed, but that
> is at least safe.

Fair enough; do you feel the need for something more specific?

> The wakeup deferral is tracked via a new field in the per-CPU and
> per-RCU-flavor rcu_data structure, namely ->nocb_defer_wakeup.
>
> This flag is checked by the RCU core processing.  The __rcu_pending()
> function now checks this flag, which causes rcu_check_callbacks()
> to initiate RCU core processing at each scheduling-clock interrupt
> where this flag is set.  Of course this is not sufficient because
> scheduling-clock interrupts are often turned off (the things we used to
> be able to count on!).  So the flags are also checked on entry to any
> state that RCU considers to be idle, which includes both NO_HZ_IDLE idle
> state and NO_HZ_FULL user-mode-execution state.

So RCU doesn't currently differentiate between EQS for nr_running==1 and
nr_running==0?

> This approach should allow call_rcu() to be invoked regardless of what
> locks you might be holding, the key word being "should".

Agreed. Except it looks like you've inverted the deferred wakeup
condition :-)

> @@ -2314,6 +2323,22 @@ static int rcu_nocb_kthread(void *arg)
>  	return 0;
>  }
>
> +/* Is a deferred wakeup of rcu_nocb_kthread() required? */
> +static bool rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp)
> +{
> +	return ACCESS_ONCE(rdp->nocb_defer_wakeup);
> +}
> +
> +/* Do a deferred wakeup of rcu_nocb_kthread(). */
> +static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
> +{
> +	if (rcu_nocb_need_deferred_wakeup(rdp))

!rcu_nocb_need_deferred_wakeup() ?

> +		return;
> +	ACCESS_ONCE(rdp->nocb_defer_wakeup) = false;
> +	wake_up(&rdp->nocb_wq);
> +	trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("DeferredWakeEmpty"));
> +}
> +
>  /* Initialize per-rcu_data variables for no-CBs CPUs. */
>  static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
>  {
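
To make the inverted-condition point concrete, here is a small userspace model of the deferral logic with the early return negated. It is only a sketch and not kernel code: struct rdp_model, its wakeups counter, and wake_or_defer() are hypothetical stand-ins (for struct rcu_data, wake_up(&rdp->nocb_wq), and the call_rcu()-side decision based on irqs_disabled_flags()), and it ignores ACCESS_ONCE() and all concurrency.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the nocb-related part of struct rcu_data. */
struct rdp_model {
	bool nocb_defer_wakeup;	/* set when a wakeup had to be deferred */
	int wakeups;		/* counts what would be wake_up() calls */
};

/* Is a deferred wakeup required? */
static bool need_deferred_wakeup(const struct rdp_model *rdp)
{
	return rdp->nocb_defer_wakeup;
}

/* Do a deferred wakeup, but only if one is actually pending. */
static void do_deferred_wakeup(struct rdp_model *rdp)
{
	if (!need_deferred_wakeup(rdp))
		return;			/* nothing was deferred */
	rdp->nocb_defer_wakeup = false;
	rdp->wakeups++;			/* models wake_up(&rdp->nocb_wq) */
}

/*
 * Hypothetical model of the call_rcu() side: defer whenever interrupts
 * are disabled (the proxy for "scheduler locks might be held"),
 * otherwise wake immediately.
 */
static void wake_or_defer(struct rdp_model *rdp, bool irqs_disabled)
{
	if (irqs_disabled)
		rdp->nocb_defer_wakeup = true;
	else
		rdp->wakeups++;
}
```

With the test as posted (no negation), do_deferred_wakeup() would return early exactly when a wakeup is pending, and would issue a spurious wakeup when none is.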