Re: dyntick-idle CPU and node's qsmask

Paul E. McKenney Wed, 21 Nov 2018 06:40:09 -0800

On Tue, Nov 20, 2018 at 08:37:22PM -0800, Joel Fernandes wrote:
> On Tue, Nov 20, 2018 at 06:41:07PM -0800, Paul E. McKenney wrote:
> [...] 
> > > > > > I was thinking if we could simplify rcu_note_context_switch (the
> > > > > > parts that call rcu_momentary_dyntick_idle), if we did the
> > > > > > following in rcu_implicit_dynticks_qs.
> > > > > > 
> > > > > > Since we already call rcu_qs in rcu_note_context_switch, that would
> > > > > > clear the rdp->cpu_no_qs flag. Then there should be no need to call
> > > > > > rcu_momentary_dyntick_idle from rcu_note_context_switch.
> > > > 
> > > > But does this also work for the rcu_all_qs() code path?
> > > 
> > > Could we not do something like this in rcu_all_qs()? As some
> > > over-simplified pseudocode:
> > > 
> > > rcu_all_qs() {
> > >   if (!urgent_qs || !heavy_qs)
> > >      return;
> > > 
> > >   rcu_qs();   // This clears the rdp->cpu_no_qs flags, which we can
> > >               // monitor in the diff in my last email (from
> > >               // rcu_implicit_dynticks_qs)
> > > }
> > 
> > Except that rcu_qs() doesn't necessarily report the quiescent state to
> > the RCU core.  Keeping down context-switch overhead and all that.
> 
> Sure yeah, but I think the QS will be reported indirectly anyway by the
> force_qs_rnp() path if we detect that rcu_qs() happened on the CPU?


The force_qs_rnp() path won't see anything that has not already been
reported to the RCU core.
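
To make the distinction concrete, here is a minimal user-space sketch in C.
It is an illustration only, not kernel code: every name ending in _model is
hypothetical, and the real implementation uses per-CPU data, locking, and a
tree of rcu_node structures.  The sketch assumes a scan that consults only
the shared qsmask, and shows that such a scan keeps seeing the CPU as
pending until the CPU itself reports.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; NOT the kernel's definitions. */
struct cpu_model {
	bool cpu_no_qs;        /* CPU-local: touched only by the owning CPU */
};

struct node_model {
	unsigned long qsmask;  /* shared: bit set => that CPU still owes a QS */
};

/* Owning CPU notes a quiescent state locally; no shared state is touched. */
static void note_qs_model(struct cpu_model *cpu)
{
	cpu->cpu_no_qs = false;
}

/* Owning CPU later reports upward, clearing its bit in the shared mask. */
static void report_qs_model(struct cpu_model *cpu, struct node_model *rnp,
			    int cpunum)
{
	assert(cpunum >= 0 && cpunum < (int)(8 * sizeof(unsigned long)));
	if (!cpu->cpu_no_qs)
		rnp->qsmask &= ~(1UL << cpunum);
}

/* Remote scan: looks only at shared state, never at cpu_no_qs. */
static bool scan_sees_pending_model(const struct node_model *rnp, int cpunum)
{
	return (rnp->qsmask & (1UL << cpunum)) != 0;
}
```

In this model, calling note_qs_model() alone changes nothing that the scan
can observe; only report_qs_model() does.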

> > > > > > I think this would simplify cond_resched as well.  Could this avoid
> > > > > > the need for having an rcu_all_qs at all? Hopefully I didn't miss
> > > > > > some Tasks-RCU corner cases.
> > > > 
> > > > > There is also the code path from cond_resched() in PREEMPT=n kernels.
> > > > > This needs rcu_all_qs().  Though it is quite possible that some
> > > > > additional code collapsing is possible.
> > > > 
> > > > > > Basically for some background, I was thinking can we simplify the
> > > > > > code that calls "rcu_momentary_dyntick_idle" since we already
> > > > > > register a qs in other ways (like by resetting cpu_no_qs).
> > > > 
> > > > One complication is that rcu_all_qs() is invoked with interrupts
> > > > and preemption enabled, while rcu_note_context_switch() is
> > > > invoked with interrupts disabled.  Also, as you say, Tasks RCU.
> > > > Plus rcu_all_qs() wants to exit immediately if there is nothing to
> > > > do, while rcu_note_context_switch() must unconditionally do rcu_qs()
> > > > -- yes, it could check, but that would be redundant with the checks
> > > 
> > > This immediate exit is taken care of in the above pseudocode, would that
> > > help the cond_resched performance?
> > 
> > It looks like you are cautiously edging towards the two wrapper functions
> > calling common code, relying on inlining and simplification.  Why not just
> > try doing it?  ;-)
> 
> Sure yeah. I was more thinking of the ambitious goal of getting rid of the
> complexity and exploring the general design idea, than containing/managing
> the complexity by reducing code duplication. :D
> 
> > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > index c818e0c91a81..5aa0259c014d 100644
> > > > > --- a/kernel/rcu/tree.c
> > > > > +++ b/kernel/rcu/tree.c
> > > > > @@ -1063,7 +1063,7 @@ static int rcu_implicit_dynticks_qs(struct 
> > > > > rcu_data *rdp)
> > > > >        * read-side critical section that started before the beginning
> > > > >        * of the current RCU grace period.
> > > > >        */
> > > > > -     if (rcu_dynticks_in_eqs_since(rdp, rdp->dynticks_snap)) {
> > > > > +     if (rcu_dynticks_in_eqs_since(rdp, rdp->dynticks_snap) || 
> > > > > !rdp->cpu_no_qs.b.norm) {
> > > > 
> > > > If I am not too confused, this change could cause trouble for
> > > > nohz_full CPUs looping in the kernel.  Such CPUs don't necessarily take
> > > > scheduler-clock interrupts, last I checked, and this could prevent the
> > > > CPU from reporting its quiescent state to core RCU.
> > > 
> > > Would that still be a problem if rcu_all_qs called rcu_qs? Also the above
> > > diff is an OR condition so it is more relaxed than before.
> > 
> > Yes, because rcu_qs() is only guaranteed to capture the quiescent
> > state on the current CPU, not necessarily report it to the RCU core.
> 
> The reporting to the core is necessary to call rcu_report_qs_rnp so that the
> QS information is propagated up the tree, right?
> 
> Wouldn't that reporting be done anyway by:
> 
> force_qs_rnp
>   -> rcu_implicit_dynticks_qs  (which returns 1 because rdp->cpu_no_qs.b.norm
>                               was cleared by rcu_qs() and we detect that
>                               with help of above diff)

Ah.  It is not safe to sample rdp->cpu_no_qs.b.norm off-CPU, and that
is what your patch would do.  This is intentional -- if it were safe to
sample off-CPU, then it would be more expensive to read/update on-CPU.
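
One way to picture that tradeoff is the following user-space sketch using
C11 atomics.  It is an assumption-laden illustration, not the kernel's
mechanism: all _model names are hypothetical, and the kernel has its own
primitives for this rather than <stdatomic.h>.  Variant A is a flag that
only its owner touches, so plain accesses suffice.  Variant B is what a
remotely readable flag might look like, where every owner-side update pays
for ordering so that another CPU can sample it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. */

/* Variant A: owner-only flag.  Plain accesses; a remote reader would get
 * no ordering or freshness guarantees from these. */
struct local_flag_model {
	bool no_qs;
};

static void local_clear_model(struct local_flag_model *f)
{
	assert(f);
	f->no_qs = false;
}

/* Variant B: remotely readable flag.  The owner's store carries release
 * ordering and the remote load carries acquire ordering. */
struct shared_flag_model {
	atomic_bool no_qs;
};

static void shared_clear_model(struct shared_flag_model *f)
{
	assert(f);
	atomic_store_explicit(&f->no_qs, false, memory_order_release);
}

static bool shared_read_model(struct shared_flag_model *f)
{
	assert(f);
	return atomic_load_explicit(&f->no_qs, memory_order_acquire);
}
```

Both variants give the same answer when used from a single thread; the
difference is only in what a concurrent remote reader may rely on, and in
what the owner pays on each update.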

>   -> rcu_report_qs_rnp is called with mask bit set for corresponding CPU that
>                               has the !rdp->cpu_no_qs.b.norm
> 
> 
> I think that's what I am missing: why wouldn't the above scheme work?
> The only difference is reporting to the RCU core might invoke pending
> callbacks but I'm not sure if that matters for this. I'll try these changes,
> trace it out, and study it more.  Thanks for the patience,

There are a lot of moving parts and you have not yet gotten to all
of them.  I suggest next taking a look at the relationship between
rcu_check_callbacks() and rcu_process_callbacks(), including the
open_softirq().  These have old names -- they handle the interface
between the CPU and RCU code, among other things.  Including invoking
callbacks, but only for some configurations.  :-/
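
As a rough map of that relationship, here is a small user-space sketch in C.
It rests on an assumption that should be checked against the source: that
in kernels of this era rcu_check_callbacks() runs from the scheduler-clock
interrupt and raises RCU_SOFTIRQ when there is work, and that
rcu_process_callbacks() is the handler registered for it via
open_softirq().  Every _model name is hypothetical and merely stands in for
the kernel function named in its comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the tick -> softirq handoff. */
typedef void (*softirq_fn_model)(void);

static softirq_fn_model rcu_softirq_handler_model; /* set once at "init" */
static bool rcu_softirq_raised_model;
static bool rcu_work_pending_model;
static int rcu_core_runs_model;

/* Stands in for rcu_process_callbacks(): the softirq-side RCU work. */
static void process_model(void)
{
	rcu_work_pending_model = false;
	rcu_core_runs_model++;
}

/* Stands in for open_softirq(): register the handler. */
static void open_softirq_model(softirq_fn_model fn)
{
	rcu_softirq_handler_model = fn;
}

/* Stands in for rcu_check_callbacks(): called from the tick, it only
 * decides whether to raise the softirq; it does not do the work itself. */
static void check_model(void)
{
	if (rcu_work_pending_model)
		rcu_softirq_raised_model = true;
}

/* Stands in for softirq dispatch after the interrupt handler returns. */
static void run_softirqs_model(void)
{
	if (!rcu_softirq_raised_model)
		return;
	assert(rcu_softirq_handler_model);
	rcu_softirq_raised_model = false;
	rcu_softirq_handler_model();
}
```

The point of the split in this model is that the tick-side check stays
cheap, and the heavier work runs later from the registered handler.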

                                                        Thanx, Paul
