Hi,

What do you think about the following patch for sem_lock()?

Other options:

1) I don't like

        #define smp_mb__after_unlock_wait()     smp_rmb()

        I think it is too specific: the last block in sem_lock uses

                if (sma->complex_count == 0) {
                        smp_rmb();
                        return;
                }

2) What about

        #define smp_aquire__after_control_barrier()     smp_rmb()


Best regards,
        Manfred


xxxxx

sem_lock() does not properly pair memory barriers.

Theoretially an acquire barrier would the right operation.
But since the existing control boundary is a write memory barrier,
it is cheaper use an smp_rmb().

Signed-off-by: Manfred Spraul <manf...@colorfullife.com>
---
 ipc/sem.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index 9284211..d43011d 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -267,6 +267,10 @@ static void sem_wait_array(struct sem_array *sma)
        if (sma->complex_count)  {
                /* The thread that increased sma->complex_count waited on
                 * all sem->lock locks. Thus we don't need to wait again.
+                *
+                * The is no need for memory barriers: with
+                * complex_count>0, all threads acquire/release
+                * sem_perm.lock, thus spin_lock/unlock is the barrier.
                 */
                return;
        }
@@ -275,6 +279,20 @@ static void sem_wait_array(struct sem_array *sma)
                sem = sma->sem_base + i;
                spin_unlock_wait(&sem->lock);
        }
+       /*
+        * We own sem_perm.lock, all owners of sma->sem_base[i].lock have
+        * dropped their locks. But we still need a memory barrier:
+        * - The lock dropping thread did a spin_unlock(), which is the
+        *   release memory barrier.
+        * - But the spin_unlock(&sma->sem_base[i].lock) might have
+        *   happened after this thread did spin_lock(&sma->sem_perm.lock),
+        *   thus the acquire memory barrier in this thread is missing.
+        * - spin_unlock_wait() is internally a loop, thus we have a control
+        *   boundary. As writes are not speculated, we have already a barrier
+        *   for writes. Reads can be performed speculatively, therefore a
+        *   smp_rmb() is necessary.
+        */
+       smp_rmb();
 }
 
 /*
@@ -341,7 +359,13 @@ static inline int sem_lock(struct sem_array *sma, struct 
sembuf *sops,
                         * Thus: if is now 0, then it will stay 0.
                         */
                        if (sma->complex_count == 0) {
-                               /* fast path successful! */
+                               /*
+                                * Fast path successful!
+                                * We only need a final memory barrier.
+                                * (see sem_wait_array() for details).
+                                */
+                               smp_rmb();
+
                                return sops->sem_num;
                        }
                }
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to