Hi Paul,
On 08/10/2016 11:00 PM, Paul E. McKenney wrote:
On Wed, Aug 10, 2016 at 12:17:57PM -0700, Davidlohr Bueso wrote:
[...]
    CPU0                    CPU1
    complex_mode = true     spin_lock(l)
    smp_mb()                  <--- do we want a smp_mb() here?
    spin_unlock_wait(l)     if (!smp_load_acquire(complex_mode))
    foo()                   foo()
We should not be doing an smp_mb() right after a spin_lock(), makes no sense.
The spinlock machinery should guarantee us the barriers in the unorthodox
locking cases, such as this.
In this case, from what I can see, we do need a store-load fence.
That said, yes, it really should be smp_mb__after_unlock_lock() rather
than smp_mb(). So if this code pattern is both desired and legitimate,
the smp_mb__after_unlock_lock() definitions probably need to move out
of kernel/rcu/tree.h to barrier.h or some such.
Can you explain the function name, why smp_mb__after_unlock_lock()?
I would have called it smp_mb__after_spin_lock().
For ipc/sem.c, the use case is:
[sorry, I only now notice that the mailer ate the formatting]:
cpu 1: complex_mode_enter():
    smp_store_mb(sma->complex_mode, true);

    for (i = 0; i < sma->sem_nsems; i++) {
        sem = sma->sem_base + i;
        spin_unlock_wait(&sem->lock);
    }

cpu 2: sem_lock():
    spin_lock(&sem->lock);
    smp_mb();
    if (!smp_load_acquire(&sma->complex_mode)) {
What is forbidden is that both cpu1 and cpu2 proceed.
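
To make the forbidden outcome concrete, here is a rough user-space sketch of the same store/fence/load shape using C11 atomics and pthreads. It is not kernel code and not the ipc/sem.c implementation: the two atomic_bool variables are hypothetical stand-ins for sma->complex_mode and for "sem->lock is held", the seq_cst fences only approximate smp_store_mb() and smp_mb(), and spin_unlock_wait() is reduced to a single sample of the lock state. Under those assumptions, the C11 rules for seq_cst fences guarantee that at least one side observes the other's store, so the "both proceed" count stays at zero.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for sma->complex_mode and the state of sem->lock. */
static atomic_bool complex_mode;
static atomic_bool lock_held;

/* Models complex_mode_enter(): publish the flag, fence, then look at the lock. */
static void *enter_side(void *arg)
{
    bool *proceeded = arg;

    atomic_store_explicit(&complex_mode, true, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* rough smp_store_mb() analogue */
    /* spin_unlock_wait() would spin here; the sketch records one sample only. */
    *proceeded = !atomic_load_explicit(&lock_held, memory_order_relaxed);
    return NULL;
}

/* Models sem_lock(): take the lock, fence, then check the flag. */
static void *lock_side(void *arg)
{
    bool *proceeded = arg;

    atomic_exchange_explicit(&lock_held, true, memory_order_acquire);
    atomic_thread_fence(memory_order_seq_cst);  /* the smp_mb() under discussion */
    *proceeded = !atomic_load_explicit(&complex_mode, memory_order_acquire);
    return NULL;
}

/*
 * Runs n independent trials and returns how many ended with both sides
 * proceeding, or -1 if a thread could not be created.
 */
int count_both_proceed(int n)
{
    int bad = 0;

    for (int i = 0; i < n; i++) {
        pthread_t t1, t2;
        bool p1 = false, p2 = false;

        atomic_store(&complex_mode, false);
        atomic_store(&lock_held, false);

        if (pthread_create(&t1, NULL, enter_side, &p1))
            return -1;
        if (pthread_create(&t2, NULL, lock_side, &p2)) {
            pthread_join(t1, NULL);
            return -1;
        }
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        bad += (p1 && p2);
    }
    return bad;
}
```

Removing the fence in lock_side() makes the outcome where both sides proceed permitted by the C11 model, which is the store-load reordering the fence is there to rule out. Whether a given run actually shows it depends on the hardware and on timing.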
--
Manfred