On Wed, 10 Aug 2016, Manfred Spraul wrote:

On 08/10/2016 02:05 AM, Benjamin Herrenschmidt wrote:
On Tue, 2016-08-09 at 20:52 +0200, Manfred Spraul wrote:
Hi Benjamin, Hi Michael,

regarding commit 51d7d5205d33 ("powerpc: Add smp_mb() to
arch_spin_is_locked()"):

For the ipc/sem code, I would like to replace the spin_is_locked() with
an smp_load_acquire(); see:

http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/tree/ipc/sem.c#n367

http://www.ozlabs.org/~akpm/mmots/broken-out/ipc-semc-fix-complex_count-vs-simple-op-race.patch
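
The simple-op fast path in sem_lock() would then look roughly like this
(a sketch with details elided; complex_mode is the new flag from the
patch):

  /* simple-op fast path, post-patch (sketch) */
  spin_lock(&sem->lock);
  /* pairs with the store + barrier on the complex-op side */
  if (!smp_load_acquire(&sma->complex_mode)) {
          /* fast path: no complex operation in flight */
          return sops->sem_num;
  }
  spin_unlock(&sem->lock);
  /* otherwise fall back to the global sma->sem_perm.lock */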

To my understanding, I must now add an smp_mb(), otherwise it would be
broken on PowerPC:

The approach of adding the memory barrier into spin_is_locked() doesn't
work here, because the code doesn't use spin_is_locked().

Correct?
Right, otherwise you aren't properly ordered. The current powerpc locks
provide good protection between what's inside vs. what's outside the
lock, but not vs. the lock *value* itself. So if, as you do in the sem
code, you use the lock value as something that is relevant in terms of
ordering, you probably need an explicit full barrier.
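
That is, a pattern like the following needs the barrier spelled out
(illustrative only; my_lock, other_lock and busy are made-up names):

  spin_lock(&my_lock);
  smp_mb();       /* order our lock-taken store vs. the load below */
  busy = spin_is_locked(&other_lock);     /* lock value used for ordering */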

But the problem here is spin_unlock_wait() (for ll/sc spin_lock) not
seeing the store that makes the lock visibly taken, so both threads end
up exiting sem_lock(); a similar scenario to the spin_is_locked() commit
mentioned above, which involves crossing of locks.

Now that spin_unlock_wait() always implies at least a load-acquire
barrier (for both ticket locks and qspinlocks, the latter still
x86-only), we wait on the full critical region.

So this patch takes this locking scheme:

  CPU0                        CPU1
  spin_lock(L)                spin_lock(l)
  spin_unlock_wait(l)         if (spin_is_locked(L))
  foo()                           foo()
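
In sem_lock() terms, L is the global sma->sem_perm.lock, l a
per-semaphore lock, CPU0 the complex-op path and CPU1 the simple-op
fast path; roughly (a sketch, details elided):

  /* CPU0: complex op */
  spin_lock(&sma->sem_perm.lock);
  for (i = 0; i < sma->sem_nsems; i++)    /* sem_wait_array() */
          spin_unlock_wait(&sma->sem_base[i].lock);

  /* CPU1: simple op */
  spin_lock(&sem->lock);
  if (!spin_is_locked(&sma->sem_perm.lock))
          return sops->sem_num;           /* fast path */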

... and converts it now to:

  CPU0                        CPU1
  complex_mode = true         spin_lock(l)
  smp_mb()                    <--- do we want an smp_mb() here?
  spin_unlock_wait(l)         if (!smp_load_acquire(&complex_mode))
  foo()                           foo()
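
On the CPU0 side this would presumably sit in the patch's complex-op
entry path; something like the following sketch, where smp_store_mb()
could fold the store and the barrier together:

  sma->complex_mode = true;
  smp_mb();       /* the barrier in question */
  for (i = 0; i < sma->sem_nsems; i++)
          spin_unlock_wait(&sma->sem_base[i].lock);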

We should not be doing an smp_mb() right after a spin_lock(); it makes
no sense. The spinlock machinery should guarantee us the barriers in
unorthodox locking cases such as this.

Thanks,
Davidlohr
