On Wed, Sep 09, 2020 at 04:22:41AM -0700, Paul E. McKenney wrote:
> On Wed, Sep 09, 2020 at 07:03:39AM +0000, Zhang, Qiang wrote:
> > 
> > When config preempt RCU,  and then  there are multiple levels  node,  the 
> > current task is preempted  in rcu  read critical region.
> > the current task be add to "rnp->blkd_tasks" link list,  and the 
> > "rnp->gp_tasks"  may be assigned a value .  these rnp is leaf node in RCU 
> > tree.
> > 
> > But in "rcu_gp_fqs_loop" func, we check blocked readers in root node. 
> > 
> > static void rcu_gp_fqs_loop(void)
> >  {
> >             .....
> >             struct rcu_node *rnp = rcu_get_root();
> >             .....
> >             if (!READ_ONCE(rnp->qsmask) &&
> >                                !rcu_preempt_blocked_readers_cgp(rnp))    
> > ------> rnp is root node
> >                      break;
> >             ....
> > }
> > 
> > the root node's blkd_tasks never add task, the "rnp->gp_tasks" is never be 
> > assigned value,  this check is invailed.
> >  Should we check leaf nodes like this 
> 
> There are two cases:
> 
> 1.    There is only a single rcu_node structure, which is both root
>       and leaf.  In this case, the current check is required:  Both
>       ->qsmask and the ->blkd_tasks list must be checked.  Your
>       rcu_preempt_blocked_readers() would work in this case, but
>       the current code is a bit faster because it does not need
>       to acquire the ->lock nor does it need the loop overhead.
> 
> 2.    There are multiple levels.  In this case, as you say, the root
>       rcu_node structure's ->blkd_tasks list will always be empty.
>       But also in this case, the root rcu_node structure's ->qsmask
>       cannot be zero until all the leaf rcu_node structures' ->qsmask
>       fields are zero and their ->blkd_tasks lists no longer have
>       tasks blocking the current grace period.  This means that your
>       rcu_preempt_blocked_readers() function would never return
>       true in this case.
> 
> So the current code is fine.
> 
> Are you seeing failures on mainline kernels?  If so, what is the failure
> mode?

Nice question! He had me wondering there too for a bit ;-)

Thanks,

 - Joel

Reply via email to