On Thu, Jul 27, 2017 at 07:36:58AM -0700, Paul E. McKenney wrote:
> On Thu, Jul 27, 2017 at 10:29:55PM +0800, Boqun Feng wrote:
> > On Thu, Jul 27, 2017 at 07:16:33AM -0700, Paul E. McKenney wrote:
> > > On Thu, Jul 27, 2017 at 09:55:51PM +0800, Boqun Feng wrote:
> > > > Hi Paul,
> > > > 
> > > > I have a side question out of curiosity:
> > > > 
> > > > How does synchronize_sched() work properly for sys_membarrier()?
> > > > 
> > > > sys_membarrier() requires every other CPU does a smp_mb() before it
> > > > returns, and I know synchronize_sched() will wait until all CPUs running
> > > > a kernel thread do a context-switch, which has a smp_mb(). However, I
> > > > believe sched flavor RCU treat CPU running a user thread as a quiesent
> > > > state, so synchronize_sched() could return without that CPU does a
> > > > context switch. 
> > > > 
> > > > So why could we use synchronize_sched() for sys_membarrier()?
> > > > 
> > > > In particular, could the following happens?
> > > > 
> > > >         CPU 0:                          CPU 1:
> > > >         =========================       ==========================
> > > >         <in user space>                 <in user space>
> > > >                                         {read Y}(reordered) 
> > > > <------------------------------+
> > > >         store Y;                                                        
> > > >                    |
> > > >                                         read X; 
> > > > --------------------------------------+    |
> > > >         sys_membarrier():               <timer interrupt>               
> > > >               |    |
> > > >           synchronize_sched();            update_process_times(user): 
> > > > //user == true  |    |
> > > >                                             rcu_check_callbacks(usr):   
> > > >               |    |
> > > >                                               if (user || ..) {         
> > > >               |    |
> > > >                                                 rcu_sched_qs()          
> > > >               |    |
> > > >                                                 ...                     
> > > >               |    |
> > > >                                                 <report quesient state 
> > > > in softirq>    |    |
> > > 
> > > The reporting of the quiescent state will acquire the leaf rcu_node
> > > structure's lock, with an smp_mb__after_unlock_lock(), which will
> > > one way or another be a full memory barrier.  So the reorderings
> > > cannot happen.
> > > 
> > > Unless I am missing something subtle.  ;-)
> > > 
> > 
> > Well, smp_mb__after_unlock_lock() in ARM64 is a no-op, and ARM64's lock
> > doesn't provide a smp_mb().
> > 
> > So my point is more like: synchronize_sched() happens to be a
> > sys_membarrier() because of some implementation detail, and if some day
> > we come up with a much cheaper way to implement sched flavor
> > RCU(hopefully!), synchronize_sched() may be not good for the job. So at
> > least, we'd better document this somewhere?
> 
> Last I heard, ARM's unlock/lock acted as a full barrier.  Will?

Yeah, should do. unlock is release, lock is acquire and we're RCsc.

Will

Reply via email to