On Fri, 2 Sep 2016, Paul E. McKenney wrote:

> On Fri, Sep 02, 2016 at 02:10:13PM -0400, Alan Stern wrote:
> > Paul, Peter, and Ingo:
> > 
> > This must have come up before, but I don't know what was decided.
> > 
> > Isn't it often true that a memory barrier is needed before a call to 
> > wake_up_process()?  A typical scenario might look like this:
> > 
> >     CPU 0
> >     -----
> >     for (;;) {
> >             set_current_state(TASK_INTERRUPTIBLE);
> >             if (signal_pending(current))
> >                     break;
> >             if (wakeup_flag)
> >                     break;
> >             schedule();
> >     }
> >     __set_current_state(TASK_RUNNING);
> >     wakeup_flag = 0;
> > 
> > 
> >     CPU 1
> >     -----
> >     wakeup_flag = 1;
> >     wake_up_process(my_task);
> > 
> > The underlying pattern is:
> > 
> >     CPU 0                           CPU 1
> >     -----                           -----
> >     write current->state            write wakeup_flag
> >     smp_mb();
> >     read wakeup_flag                read my_task->state
> > 
> > where set_current_state() does the write to current->state and 
> > automatically adds the smp_mb(), and wake_up_process() reads 
> > my_task->state to see whether the task needs to be woken up.
> > 
> > The kerneldoc for wake_up_process() says that it has no implied memory
> > barrier if it doesn't actually wake anything up.  And even when it
> > does, the implied barrier is only smp_wmb, not smp_mb.
> > 
> > This is the so-called SB (Store Buffer) pattern, which is well known to
> > require a full smp_mb on both sides.  Since wake_up_process() doesn't
> > include smp_mb(), isn't it correct that the caller must add it
> > explicitly?
> > 
> > In other words, shouldn't the code for CPU 1 really be:
> > 
> >     wakeup_flag = 1;
> >     smp_mb();
> >     wake_up_process(task);
> > 
> > If my reasoning is correct, then why doesn't wake_up_process() include 
> > this memory barrier automatically, the way set_current_state() does?  
> > There could be an alternate version (__wake_up_process()) which omits 
> > the barrier, just like __set_current_state().
> 
> A common case uses locking, in which case additional memory barriers
> inside of the wait/wakeup functions are not needed.  Any accesses made
> while holding the lock before invoking the wakeup function (e.g.,
> wake_up()) are guaranteed to be seen after acquiring that same
> lock following return from the wait function (e.g., wait_event()).
> In this case, adding barriers to the wait and wakeup functions would
> just add overhead.
> 
> But yes, this decision does mean that people using the wait/wakeup
> functions without locking need to be more careful.  Something like
> this:
> 
>       /* prior accesses. */
>       smp_mb();
>       wakeup_flag = 1;
>       wake_up(...);
> 
> And on the other task:
> 
>       wait_event(... wakeup_flag == 1 ...);
>       smp_mb();
>       /* The waker's prior accesses will be visible here. */
> 
> Or am I missing your point?

I'm afraid so.  The code doesn't use wait_event(), in part because
there's no wait_queue (since only one task is involved).

But maybe there's another barrier which needs to be fixed.  Felipe, can
you check to see if received_cbw() is getting called in
get_next_command(), and if so, what value it returns?  Or is the
preceding sleep_thread() the one that never wakes up?

It could be that the smp_wmb() in wakeup_thread() needs to be smp_mb(),
because get_next_command() runs outside the protection of the
spinlock.
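
To make the SB concern concrete, here is a rough userspace analog using
C11 atomics and pthreads.  It is only a sketch, not kernel code:
task_state, sleeper(), waker() and count_missed_wakeups() are made-up
names, and atomic_thread_fence(memory_order_seq_cst) merely stands in
for smp_mb().  Each thread writes its own variable, executes a full
fence, then reads the other thread's variable.

```c
/* Userspace analog of the SB (Store Buffer) pattern.
 * All names here are hypothetical stand-ins, not kernel symbols.
 * Build with something like: cc -std=c11 -pthread sb.c */
#include <pthread.h>
#include <stdatomic.h>

static atomic_int task_state;   /* stands in for my_task->state */
static atomic_int wakeup_flag;  /* stands in for wakeup_flag */
static int sleeper_saw_flag;    /* what the "CPU 0" side read */
static int waker_saw_sleeping;  /* what the "CPU 1" side read */

/* "CPU 0": write state, full barrier, read flag. */
static void *sleeper(void *arg)
{
        (void)arg;
        atomic_store_explicit(&task_state, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);  /* plays smp_mb() */
        sleeper_saw_flag =
                atomic_load_explicit(&wakeup_flag, memory_order_relaxed);
        return NULL;
}

/* "CPU 1": write flag, full barrier, read state. */
static void *waker(void *arg)
{
        (void)arg;
        atomic_store_explicit(&wakeup_flag, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);  /* plays smp_mb() */
        waker_saw_sleeping =
                atomic_load_explicit(&task_state, memory_order_relaxed);
        return NULL;
}

/* Run the two sides concurrently 'trials' times.  Returns how many
 * trials ended with both loads reading 0 (the missed-wakeup outcome),
 * or -1 if a thread could not be created. */
int count_missed_wakeups(int trials)
{
        int missed = 0;

        for (int i = 0; i < trials; i++) {
                pthread_t a, b;

                atomic_store(&task_state, 0);
                atomic_store(&wakeup_flag, 0);
                if (pthread_create(&a, NULL, sleeper, NULL))
                        return -1;
                if (pthread_create(&b, NULL, waker, NULL)) {
                        pthread_join(a, NULL);
                        return -1;
                }
                pthread_join(a, NULL);
                pthread_join(b, NULL);
                if (!sleeper_saw_flag && !waker_saw_sleeping)
                        missed++;
        }
        return missed;
}
```

With both fences in place, the C11 memory model rules out the outcome
where both loads return 0, so the count should stay at zero.  If either
fence is removed or weakened to a release fence (roughly the smp_wmb()
case), that outcome is permitted, which is the missed wakeup in
question.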

Alan Stern
