On Tue, 10 Dec 2013, Dave Jones wrote:
> On Tue, Dec 10, 2013 at 11:55:06AM -0800, Linus Torvalds wrote:
> > On Tue, Dec 10, 2013 at 11:18 AM, Thomas Gleixner <t...@linutronix.de> wrote:
> > >
> > > So this is pretty unlikely. The retry requires:
> > >
> > >   get_futex_value_locked() == EFAULT;
> > >
> > > Now we drop the hash bucket locks and do:
> > >
> > >   get_user();
> > >
> > > And if that get_user() faults again, we bail out.
> >
> > I think you need to look closer.
> >
> > We have at least also that "futex_proxy_trylock_atomic() returns
> > -EAGAIN" case. Which triggers at some exit condition. Another thread
> > in the same group, perhaps never completing the exit because it's
> > waiting for this one? I dunno, I didn't look any closer (but this does
> > make me think "Hey, we should add Oleg to the Cc too", since
> > PF_EXITING is involved).. So maybe there is some situation where that
> > EAGAIN will keep happening, forever..
> >
> > Now, I'm *not* saying that that is it. It's quite possible/likely some
> > other loop, but I do have to say that it sure isn't _obvious_. And
> > that whole EAGAIN return case is quite deep and special, so ...
> >
> >                  Linus
> >
> > PS: Oleg - the whole thread is on lkml. Ping me if you need more context.
>
> btw, I've left the machine in that state, and will for as long as necessary
> in case someone has any ideas for further tracing experiments.
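
For reference, the two retry paths quoted above can be modelled in
userspace roughly as follows. This is only a sketch and not the kernel
code: struct model, the stub_*() helpers, the max_loops bound and the
-ELOOP return are all made up for illustration, and the trylock is
called unconditionally here, which is a simplification. The point is
that the EFAULT path can only go around once per fault, while nothing in
the EAGAIN path itself limits the number of retries.

```c
#include <errno.h>

/* Hypothetical stand-in for kernel state; none of this exists in futex.c. */
struct model {
    int locked_faults;  /* times the read under the bucket locks faults */
    int user_faults;    /* times the sleeping get_user() faults */
    int owner_exiting;  /* times the trylock sees an exiting owner,
                           negative means "forever" */
    int max_loops;      /* bound for the model only, the real loop has none */
    int loops;          /* iterations actually taken */
};

/* Models get_futex_value_locked(): may fail with -EFAULT. */
static int stub_read_locked(struct model *m)
{
    if (m->locked_faults > 0) {
        m->locked_faults--;
        return -EFAULT;
    }
    return 0;
}

/* Models get_user() after the hash bucket locks were dropped. */
static int stub_get_user(struct model *m)
{
    if (m->user_faults > 0) {
        m->user_faults--;
        return -EFAULT;
    }
    return 0;
}

/* Models futex_proxy_trylock_atomic() returning -EAGAIN. */
static int stub_proxy_trylock(struct model *m)
{
    if (m->owner_exiting < 0)
        return -EAGAIN;
    if (m->owner_exiting > 0) {
        m->owner_exiting--;
        return -EAGAIN;
    }
    return 0;
}

/* Returns 0 on success, -EFAULT on bail-out, -ELOOP if the bound was hit. */
int model_requeue(struct model *m)
{
    m->loops = 0;
    while (m->loops < m->max_loops) {
        m->loops++;
        /* hash bucket locks would be taken here */
        if (stub_read_locked(m) == -EFAULT) {
            /* drop the locks, then touch the page where we may sleep */
            if (stub_get_user(m))
                return -EFAULT; /* faulted again: bail out */
            continue;           /* retry from the top */
        }
        if (stub_proxy_trylock(m) == -EAGAIN)
            continue;           /* owner exiting: drop locks, retry */
        return 0;
    }
    return -ELOOP;
}
```

With owner_exiting set to a negative value the model only stops because
of max_loops, which is the situation the EAGAIN question is about.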

Can you gather a trace with the function tracer? That will tell us
what the thing is actually doing.

Thanks,

	tglx