On 10/17/14 11:38, Catalin Marinas wrote: > Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's > nothing to wake up) changes the futex code to avoid taking a lock when > there are no waiters. This code has been subsequently fixed in commit > 11d4616bd07f (futex: revert back to the explicit waiter counting code). > Both the original commit and the fix-up rely on get_futex_key_refs() to > always imply a barrier. > > However, for private futexes, none of the cases in the switch statement > of get_futex_key_refs() would be hit and the function completes without > a memory barrier as required before checking the "waiters" in > futex_wake() -> hb_waiters_pending(). The consequence is a race with a > thread waiting on a futex on another CPU, allowing the waker thread to > read "waiters == 0" while the waiter thread to have read "futex_val == > locked" (in kernel).
Verified that this is: a) how it is documented to work b) not how it actually works currently Nice catch indeed. ... > diff --git a/kernel/futex.c b/kernel/futex.c > index 815d7af2ffe8..f3a3a071283c 100644 > --- a/kernel/futex.c > +++ b/kernel/futex.c > @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key) > case FUT_OFF_MMSHARED: > futex_get_mm(key); /* implies MB (B) */ > break; > + default: A comment here indicating this covers the PROCESS_PRIVATE futex case would be welcome, given the complexity involved. > + smp_mb(); /* explicit MB (B) */ Also, the "Basic" futex operation and ordering guarantees documentation currently reads: * Where (A) orders the waiters increment and the futex value read through * atomic operations (see hb_waiters_inc) and where (B) orders the write * to futex and the waiters read -- this is done by the barriers in * get_futex_key_refs(), through either ihold or atomic_inc, depending on the * futex type. Which is not incomplete (lacking the explicit smp_mb()) added by this patch. Perhaps the MB implementation of get_futex_key_refs() need not be explicitly enumerated here? -- Darren Hart Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/