On Tue, 7 Aug 2012, Will Deacon wrote:

> Hello,
>
> ARM recently moved to asm-generic/mutex-xchg.h for its mutex implementation
> after our previous implementation was found to be missing some crucial
> memory barriers. However, I'm seeing some problems running hackbench on
> SMP platforms due to the way in which the MUTEX_SPIN_ON_OWNER code operates.
>
> The symptoms are that a bunch of hackbench tasks are left waiting on an
> unlocked mutex and therefore never get woken up to claim it. I think this
> boils down to the following sequence:
>
>
>         Task A        Task B        Task C        Lock value
> 0                                                     1
> 1       lock()                                        0
> 2                     lock()                          0
> 3                     spin(A)                         0
> 4       unlock()                                      1
> 5                                   lock()            0
> 6                     cmpxchg(1,0)                    0
> 7                     contended()                    -1
> 8       lock()                                        0
> 9       spin(C)                                       0
> 10                                  unlock()          1
> 11      cmpxchg(1,0)                                  0
> 12      unlock()                                      1
>
>
> At this point, the lock is unlocked, but Task B is in an uninterruptible
> sleep with nobody to wake it up.
>
> The following patch fixes the problem by ensuring we put the lock into
> the contended state if we acquire it from the spin loop on the slowpath,
> but I'd like to be sure that this won't cause problems with other mutex
> implementations:
>
>
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index a307cc9..27b7887 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -170,7 +170,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		if (owner && !mutex_spin_on_owner(lock, owner))
>  			break;
>  
> -		if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
> +		if (atomic_cmpxchg(&lock->count, 1, -1) == 1) {
>  			lock_acquired(&lock->dep_map, ip);
>  			mutex_set_owner(lock);
>  			preempt_enable();
>
>
> All comments welcome.
Well... after thinking about this for a while, I came to the conclusion
that the mutex_spin_on_owner code is indeed breaking the contract between
the xchg lock fast path and the slow path.  The xchg fast path will
always set the count to 0 and rely on the slow path to restore a possible
pre-existing negative count.  So the above change would be needed for
correctness in the xchg case, even if it always forces the unlock into
the slow path.

As I mentioned previously, we don't want to force the slow path in all
cases though.  The atomic decrement based fast path doesn't need that,
so I'd suggest this fix instead:

diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
index 580a6d35c7..60964844c8 100644
--- a/include/asm-generic/mutex-xchg.h
+++ b/include/asm-generic/mutex-xchg.h
@@ -108,4 +108,6 @@ __mutex_fastpath_trylock(atomic_t *count, int (*fail_fn)(atomic_t *))
 	return prev;
 }
 
+#define __MUTEX_XCHG_FAST_PATH
+
 #endif
diff --git a/kernel/mutex.c b/kernel/mutex.c
index a307cc9c95..c6a26a4f1c 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -161,6 +161,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 
 	for (;;) {
 		struct task_struct *owner;
+		int locked_val;
 
 		/*
 		 * If there's an owner, wait for it to either
@@ -170,7 +171,19 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		if (owner && !mutex_spin_on_owner(lock, owner))
 			break;
 
-		if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
+#ifdef __MUTEX_XCHG_FAST_PATH
+		/*
+		 * The fast path based on xchg sets a transient 0 count,
+		 * relying on the slow path to restore a possible
+		 * pre-existing contended count.  Without checking the
+		 * waiters' list we must presume possible contention here.
+		 */
+		locked_val = -1;
+#else
+		locked_val = 0;
+#endif
+
+		if (atomic_cmpxchg(&lock->count, 1, locked_val) == 1) {
 			lock_acquired(&lock->dep_map, ip);
 			mutex_set_owner(lock);
 			preempt_enable();

That would be needed for the stable tree as well.
A further cleanup could remove all definitions of
__mutex_slowpath_needs_to_unlock(), given that they're all set to 1
except for the xchg fast path, in which case __MUTEX_XCHG_FAST_PATH
could be reused instead.

Of course that might tilt the balance towards using mutex-dec.h on
ARM v6 and above instead of mutex-xchg.h.  But that is an orthogonal
issue, and it doesn't remove the need for fixing the xchg based case
for correctness.


Nicolas