Re: kernel bug: futex_wait hang

2005-03-23 Thread Jakub Jelinek
On Wed, Mar 23, 2005 at 05:12:59AM -0800, [EMAIL PROTECTED] wrote:
> the hang occurs during an attempted thread cancel+join. we know from
> strace that one thread calls tgkill() on the other. the other thread is
> blocked in a poll call on a FIFO. after tgkill, the first thread enters a
> futex wait, apparently waiting for the thread ID of the cancelled thread
> to appear at some location (just a guess based on the info from strace).
> the wait never returns, and so the first thread ends up hung in
> pthread_join(). there are no user-defined mutexes or condvars involved.

If the thread that is to be cancelled is in async cancel state (it should
be when waiting in a poll and if cancellation is not disabled in that thread),
then pthread_cancel sends a SIGCANCEL signal to it via tgkill.
If tgkill succeeds (and thus pthread_cancel succeeds too) and you call
pthread_join on it, then in the likely case that the thread is still alive,
pthread_join will FUTEX_WAIT on pd->tid, waiting until the thread dies.
NPTL threads are created with CLONE_CHILD_CLEARTID &self->tid, so this
futex will be FUTEX_WAKEd by mm_release in kernel whenever the thread is
exiting (or dying in some other way).

So, if pthread_join waits for the thread forever, the thread must be
around (otherwise pthread_join would not block on it; well, there could
be memory corruption in the program and anything would be possible then).
This would mean either that the poll has not been awakened by the SIGCANCEL
signal, or e.g. that one of the registered cleanup handlers (or C++
destructors) in the thread that is being cancelled gets stuck for whatever
reason (deadlock, etc.).

Jakub
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kernel bug: futex_wait hang

2005-03-23 Thread paul
> Paul is on vacation for a week so I suspect this will have to wait for
> his return.  But he's been right about similar issues in the past so I'm
> inclined to believe him.
>
> In the meantime if anyone cares to investigate, the problem is trivial
> to reproduce.  All you need is JACK, XMMS, xmms-jack and any 2.6 kernel.

fortunately (or perhaps not), by the powers of webmail, here i am to just
mention a couple more details for anyone who cares to think about this.

the hang occurs during an attempted thread cancel+join. we know from
strace that one thread calls tgkill() on the other. the other thread is
blocked in a poll call on a FIFO. after tgkill, the first thread enters a
futex wait, apparently waiting for the thread ID of the cancelled thread
to appear at some location (just a guess based on the info from strace).
the wait never returns, and so the first thread ends up hung in
pthread_join(). there are no user-defined mutexes or condvars involved.

what is rather odd is that when we have checked for the persistence (or
otherwise) of the cancelled thread, it appears as if several other threads
have died, not just the cancelled one. the primary tester was unclear if
this was to be expected or not (chris morgan, the author of xmms-jack),
and i cannot say definitively whether or not we know for certain that the
cancelled thread no longer exists.

as lee mentioned, i am on vacation right now, using some xp machine at a
relative's house, so i can't test anything till i return.


Re: kernel bug: futex_wait hang

2005-03-22 Thread Lee Revell
On Tue, 2005-03-22 at 01:34 -0500, Jakub Jelinek wrote:
> On Tue, Mar 22, 2005 at 12:30:53AM -0500, Lee Revell wrote:
> > On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
> > > Jamie Lokier <[EMAIL PROTECTED]> wrote:
> > > > 
> > > > The most recent messages under "Futex queue_me/get_user ordering",
> > > > with a patch from Jakub Jelinek will fix this problem by changing the
> > > > kernel.  Yes, you should apply Jakub's most recent patch, message-ID
> > > > "<[EMAIL PROTECTED]>".
> > > > 
> > > > I have not tested the patch, but it looks convincing.
> > > 
> > > OK, thanks.  Lee && Paul, that's at
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
> > > 
> > 
> > Does not fix the problem.
> 
> Have you analyzed the use of mutexes/condvars in the program?
> The primary suspect is a deadlock, race of some kind or other bug
> in the program.  All these will show up as a hang in FUTEX_WAIT.
> The argument that it works with LinuxThreads doesn't count,
> the timing and internals of both threading libraries are so different
> that a program bug can only show up with one of the threading libraries
> and not both.
> Only once you distill a minimal self-contained testcase that proves
> the program is correct and it gets analyzed, it is time to talk about
> NPTL or kernel bugs.

Paul is on vacation for a week so I suspect this will have to wait for
his return.  But he's been right about similar issues in the past so I'm
inclined to believe him.

In the meantime if anyone cares to investigate, the problem is trivial
to reproduce.  All you need is JACK, XMMS, xmms-jack and any 2.6 kernel.

Lee



Re: kernel bug: futex_wait hang

2005-03-22 Thread Lee Revell
On Tue, 2005-03-22 at 15:30 +, Jamie Lokier wrote:
> Lee Revell wrote:
> > On Tue, 2005-03-22 at 04:48 +, Jamie Lokier wrote:
> > > I argued for fixing Glibc on the grounds that the changed kernel
> > > behaviour, or more exactly having Glibc depend on it, loses a certain
> > > semantic property useful for unusual operations on multiple futexes at
> > > the same time.  But I appear to have lost the argument, and Jakub's
> > > latest patch does clean up some cruft quite nicely, with no expected
> > > performance hit.
> > 
> > A glibc fix will take forever to get to users compared to a kernel fix.
> 
> Interesting perspective.  On my systems Glibc is upgraded more often
> than the kernel.
> 

Blame the Debian maintainers.  This bug, reported August 2004, is still
unfixed even in unstable!!!

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=266507

Apparently they think marking a bug "fixed upstream" does something to
solve the problem.

Lee



Re: kernel bug: futex_wait hang

2005-03-22 Thread Jamie Lokier
Lee Revell wrote:
> On Tue, 2005-03-22 at 04:48 +, Jamie Lokier wrote:
> > I argued for fixing Glibc on the grounds that the changed kernel
> > behaviour, or more exactly having Glibc depend on it, loses a certain
> > semantic property useful for unusual operations on multiple futexes at
> > the same time.  But I appear to have lost the argument, and Jakub's
> > latest patch does clean up some cruft quite nicely, with no expected
> > performance hit.
> 
> A glibc fix will take forever to get to users compared to a kernel fix.

Interesting perspective.  On my systems Glibc is upgraded more often
than the kernel.

-- Jamie


Re: kernel bug: futex_wait hang

2005-03-21 Thread Jakub Jelinek
On Tue, Mar 22, 2005 at 12:30:53AM -0500, Lee Revell wrote:
> On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
> > Jamie Lokier <[EMAIL PROTECTED]> wrote:
> > > 
> > > The most recent messages under "Futex queue_me/get_user ordering",
> > > with a patch from Jakub Jelinek will fix this problem by changing the
> > > kernel.  Yes, you should apply Jakub's most recent patch, message-ID
> > > "<[EMAIL PROTECTED]>".
> > > 
> > > I have not tested the patch, but it looks convincing.
> > 
> > OK, thanks.  Lee && Paul, that's at
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
> > 
> 
> Does not fix the problem.

Have you analyzed the use of mutexes/condvars in the program?
The primary suspect is a deadlock, race of some kind or other bug
in the program.  All these will show up as a hang in FUTEX_WAIT.
The argument that it works with LinuxThreads doesn't count,
the timing and internals of both threading libraries are so different
that a program bug can only show up with one of the threading libraries
and not both.
Only once you distill a minimal self-contained testcase that proves
the program is correct and it gets analyzed, it is time to talk about
NPTL or kernel bugs.

Jakub


Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
> Jamie Lokier <[EMAIL PROTECTED]> wrote:
> > 
> > The most recent messages under "Futex queue_me/get_user ordering",
> > with a patch from Jakub Jelinek will fix this problem by changing the
> > kernel.  Yes, you should apply Jakub's most recent patch, message-ID
> > "<[EMAIL PROTECTED]>".
> > 
> > I have not tested the patch, but it looks convincing.
> 
> OK, thanks.  Lee && Paul, that's at
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
> 

Does not fix the problem.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Andrew Morton
Jamie Lokier <[EMAIL PROTECTED]> wrote:
>
> Andrew Morton wrote:
> > iirc we ended up deciding that the futex problems around that time were due
> > to userspace problems (a version of libc).  But then, there's no discussion
> > around Seto's patch and it didn't get applied.  So I don't know what
> > happened to that work - it's all a bit mysterious.
> 
> It can be fixed _either_ in Glibc, or by changing the kernel.
> 
> That problem is caused by differing assumptions between Glibc and the
> kernel about subtle futex semantics.  Which goes to show they are
> really clever, or something.
> 
> I provided pseudo-code for the Glibc fix, but not an actual patch
> because NPTL is quite complicated and I wanted to know whether the Glibc
> people were interested, but apparently they were too busy at the time
> - benchmarks would have made sense for such a patch.
> 
> Scott Snyder started fixing part of Glibc, and that did fix his
> instance of this problem so we know the approach works.  But a full
> patch for Glibc was not prepared.
> 
> The most recent messages under "Futex queue_me/get_user ordering",
> with a patch from Jakub Jelinek will fix this problem by changing the
> kernel.  Yes, you should apply Jakub's most recent patch, message-ID
> "<[EMAIL PROTECTED]>".
> 
> I have not tested the patch, but it looks convincing.

OK, thanks.  Lee && Paul, that's at
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch


> I argued for fixing Glibc on the grounds that the changed kernel
> behaviour, or more exactly having Glibc depend on it, loses a certain
> semantic property useful for unusual operations on multiple futexes at
> the same time.  But I appear to have lost the argument, and Jakub's
> latest patch does clean up some cruft quite nicely, with no expected
> performance hit.

Futexes were initially so simple.


Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Tue, 2005-03-22 at 04:48 +, Jamie Lokier wrote:
> I argued for fixing Glibc on the grounds that the changed kernel
> behaviour, or more exactly having Glibc depend on it, loses a certain
> semantic property useful for unusual operations on multiple futexes at
> the same time.  But I appear to have lost the argument, and Jakub's
> latest patch does clean up some cruft quite nicely, with no expected
> performance hit.

A glibc fix will take forever to get to users compared to a kernel fix.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Mon, 2005-03-21 at 20:20 -0800, Andrew Morton wrote:
> Lee Revell <[EMAIL PROTECTED]> wrote:
> >
> > Paul Davis and Chris Morgan have been chasing down a problem with
> > xmms_jack and it really looks like this bug, thought to have been fixed
> > in 2.6.10, is the culprit.
> > 
> > http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.0/2044.html
> > 
> > (for more info google "futex_wait 2.6 hang")
> > 
> > It's simple to reproduce.  Run JACK and launch xmms with the JACK output
> > plugin.  Close XMMS.  The xmms process hangs.  Strace looks like this:
> > 
> > [EMAIL PROTECTED]:~$ strace -p 7935
> > Process 7935 attached - interrupt to quit
> > futex(0xb5341bf8, FUTEX_WAIT, 7939, NULL
> > 
> > Just like in the above bug report, if xmms is run with
> > LD_ASSUME_KERNEL=2.4.19, it works perfectly.
> > 
> > I have reproduced the bug with 2.6.12-rc1.
> > 
> 
> iirc we ended up deciding that the futex problems around that time were due
> to userspace problems (a version of libc).  But then, there's no discussion
> around Seto's patch and it didn't get applied.  So I don't know what
> happened to that work - it's all a bit mysterious.
> 

It does seem like it could be a different problem.  Maybe Paul can
provide some more evidence that it's a kernel and not a glibc/NPTL bug.
I'm really just posting this on Paul's behalf; I don't claim to
understand the issue. ;-)

> Is this a 100% repeatable hang, or is it some occasional race?
> 

100% repeatable.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Jamie Lokier
Andrew Morton wrote:
> iirc we ended up deciding that the futex problems around that time were due
> to userspace problems (a version of libc).  But then, there's no discussion
> around Seto's patch and it didn't get applied.  So I don't know what
> happened to that work - it's all a bit mysterious.

It can be fixed _either_ in Glibc, or by changing the kernel.

That problem is caused by differing assumptions between Glibc and the
kernel about subtle futex semantics.  Which goes to show they are
really clever, or something.

I provided pseudo-code for the Glibc fix, but not an actual patch
because NPTL is quite complicated and I wanted to know whether the Glibc
people were interested, but apparently they were too busy at the time
- benchmarks would have made sense for such a patch.

Scott Snyder started fixing part of Glibc, and that did fix his
instance of this problem so we know the approach works.  But a full
patch for Glibc was not prepared.

The most recent messages under "Futex queue_me/get_user ordering",
with a patch from Jakub Jelinek will fix this problem by changing the
kernel.  Yes, you should apply Jakub's most recent patch, message-ID
"<[EMAIL PROTECTED]>".

I have not tested the patch, but it looks convincing.

I argued for fixing Glibc on the grounds that the changed kernel
behaviour, or more exactly having Glibc depend on it, loses a certain
semantic property useful for unusual operations on multiple futexes at
the same time.  But I appear to have lost the argument, and Jakub's
latest patch does clean up some cruft quite nicely, with no expected
performance hit.

-- Jamie


Re: kernel bug: futex_wait hang

2005-03-21 Thread Jamie Lokier
Lee Revell wrote:
> > iirc we ended up deciding that the futex problems around that time were due
> > to userspace problems (a version of libc).  But then, there's no discussion
> > around Seto's patch and it didn't get applied.  So I don't know what
> > happened to that work - it's all a bit mysterious.
> 
> It does seem like it could be a different problem.  Maybe Paul can
> provide some more evidence that it's a kernel and not a glibc/NPTL bug.
> I'm really just posting this on Paul's behalf; I don't claim to
> understand the issue. ;-)

Try applying the patch below, which was recently posted by Jakub Jelinek.

If it fixes the problem, it's the same thing as Hidetoshi Seto
noticed, although this patch is much improved thanks to
"preempt_count technology" (tm).  If not, it's a whole new problem.

-- Jamie

--- linux-2.6.11/kernel/futex.c.jj  2005-03-17 04:42:29.0 -0500
+++ linux-2.6.11/kernel/futex.c 2005-03-18 05:45:29.0 -0500
@@ -97,7 +97,6 @@ struct futex_q {
  */
 struct futex_hash_bucket {
spinlock_t  lock;
-   unsigned intnqueued;
struct list_head   chain;
 };
 
@@ -265,7 +264,6 @@ static inline int get_futex_value_locked
inc_preempt_count();
ret = __copy_from_user_inatomic(dest, from, sizeof(int));
dec_preempt_count();
-   preempt_check_resched();
 
return ret ? -EFAULT : 0;
 }
@@ -339,7 +337,6 @@ static int futex_requeue(unsigned long u
struct list_head *head1;
struct futex_q *this, *next;
int ret, drop_count = 0;
-   unsigned int nqueued;
 
  retry:
down_read(&current->mm->mmap_sem);
@@ -354,23 +351,22 @@ static int futex_requeue(unsigned long u
bh1 = hash_futex(&key1);
bh2 = hash_futex(&key2);
 
-   nqueued = bh1->nqueued;
+   if (bh1 < bh2)
+   spin_lock(&bh1->lock);
+   spin_lock(&bh2->lock);
+   if (bh1 > bh2)
+   spin_lock(&bh1->lock);
+
if (likely(valp != NULL)) {
int curval;
 
-   /* In order to avoid doing get_user while
-  holding bh1->lock and bh2->lock, nqueued
-  (monotonically increasing field) must be first
-  read, then *uaddr1 fetched from userland and
-  after acquiring lock nqueued field compared with
-  the stored value.  The smp_mb () below
-  makes sure that bh1->nqueued is read from memory
-  before *uaddr1.  */
-   smp_mb();
-
ret = get_futex_value_locked(&curval, (int __user *)uaddr1);
 
if (unlikely(ret)) {
+   spin_unlock(&bh1->lock);
+   if (bh1 != bh2)
+   spin_unlock(&bh2->lock);
+
/* If we would have faulted, release mmap_sem, fault
 * it in and start all over again.
 */
@@ -385,21 +381,10 @@ static int futex_requeue(unsigned long u
}
if (curval != *valp) {
ret = -EAGAIN;
-   goto out;
+   goto out_unlock;
}
}
 
-   if (bh1 < bh2)
-   spin_lock(&bh1->lock);
-   spin_lock(&bh2->lock);
-   if (bh1 > bh2)
-   spin_lock(&bh1->lock);
-
-   if (unlikely(nqueued != bh1->nqueued && valp != NULL)) {
-   ret = -EAGAIN;
-   goto out_unlock;
-   }
-
head1 = &bh1->chain;
list_for_each_entry_safe(this, next, head1, list) {
if (!match_futex (&this->key, &key1))
@@ -435,13 +420,9 @@ out:
return ret;
 }
 
-/*
- * queue_me and unqueue_me must be called as a pair, each
- * exactly once.  They are called with the hashed spinlock held.
- */
-
 /* The key must be already stored in q->key. */
-static void queue_me(struct futex_q *q, int fd, struct file *filp)
+static inline struct futex_hash_bucket *
+queue_lock(struct futex_q *q, int fd, struct file *filp)
 {
struct futex_hash_bucket *bh;
 
@@ -455,11 +436,35 @@ static void queue_me(struct futex_q *q, 
q->lock_ptr = &bh->lock;
 
spin_lock(&bh->lock);
-   bh->nqueued++;
+   return bh;
+}
+
+static inline void __queue_me(struct futex_q *q, struct futex_hash_bucket *bh)
+{
list_add_tail(&q->list, &bh->chain);
spin_unlock(&bh->lock);
 }
 
+static inline void
+queue_unlock(struct futex_q *q, struct futex_hash_bucket *bh)
+{
+   spin_unlock(&bh->lock);
+   drop_key_refs(&q->key);
+}
+
+/*
+ * queue_me and unqueue_me must be called as a pair, each
+ * exactly once.  They are called with the hashed spinlock held.
+ */
+
+/* The key must be already stored in q->key. */
+static void queue_me(struct futex_q *q, int fd, struct file *filp)
+{
+   struct futex_hash_bucket *bh;
+   bh = queue_lock(q, fd, filp);
+   __queue_me(q, bh);
+}
+
 /* Return 1 if we were still queued (ie. 0 means we were woken) */
 static int unqueue_me(struct futex_q *q)
 {
@@ 

Re: kernel bug: futex_wait hang

2005-03-21 Thread Andrew Morton
Lee Revell <[EMAIL PROTECTED]> wrote:
>
> Paul Davis and Chris Morgan have been chasing down a problem with
> xmms_jack and it really looks like this bug, thought to have been fixed
> in 2.6.10, is the culprit.
> 
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.0/2044.html
> 
> (for more info google "futex_wait 2.6 hang")
> 
> It's simple to reproduce.  Run JACK and launch xmms with the JACK output
> plugin.  Close XMMS.  The xmms process hangs.  Strace looks like this:
> 
> [EMAIL PROTECTED]:~$ strace -p 7935
> Process 7935 attached - interrupt to quit
> futex(0xb5341bf8, FUTEX_WAIT, 7939, NULL
> 
> Just like in the above bug report, if xmms is run with
> LD_ASSUME_KERNEL=2.4.19, it works perfectly.
> 
> I have reproduced the bug with 2.6.12-rc1.
> 

iirc we ended up deciding that the futex problems around that time were due
to userspace problems (a version of libc).  But then, there's no discussion
around Seto's patch and it didn't get applied.  So I don't know what
happened to that work - it's all a bit mysterious.

Is this a 100% repeatable hang, or is it some occasional race?


Re: kernel bug: futex_wait hang

2005-03-21 Thread Andrew Morton
Lee Revell [EMAIL PROTECTED] wrote:

 Paul Davis and Chris Morgan have been chasing down a problem with
 xmms_jack and it really looks like this bug, thought to have been fixed
 in 2.6.10, is the culprit.
 
 http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.0/2044.html
 
 (for more info google futex_wait 2.6 hang)
 
 It's simple to reproduce.  Run JACK and launch xmms with the JACK output
 plugin.  Close XMMS.  The xmms process hangs.  Strace looks like this:
 
 [EMAIL PROTECTED]:~$ strace -p 7935
 Process 7935 attached - interrupt to quit
 futex(0xb5341bf8, FUTEX_WAIT, 7939, NULL
 
 Just like in the above bug report, if xmms is run with
 LD_ASSUME_KERNEL=2.4.19, it works perfectly.
 
 I have reproduced the bug with 2.6.12-rc1.
 

iirc we ended up deciding that the futex problems around that time were due
to userspace problems (a version of libc).  But then, there's no discussion
around Seto's patch and it didn't get applied.  So I don't know what
happened to that work - it's all a bit mysterious.

Is this a 100% repeatable hang, or is it some occasional race?
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kernel bug: futex_wait hang

2005-03-21 Thread Jamie Lokier
Lee Revell wrote:
  iirc we ended up deciding that the futex problems around that time were due
  to userspace problems (a version of libc).  But then, there's no discussion
  around Seto's patch and it didn't get applied.  So I don't know what
  happened to that work - it's all a bit mysterious.
 
 It does seem like it could be a different problem.  Maybe Paul can
 provide some more evidence that it's a kernel and not a glibc/NPTL bug.
 I'm really just posting this on Paul's behalf; I don't claim to
 understand the issue. ;-)

Try applying the patch below, which was recently posted by Jakub Jelinek.

If it fixes the problem, it's the same thing as Hidetoshi Seto
noticed, although this patch is much improved thanks to
preempt_count technology (tm).  If not, it's a whole new problem.

-- Jamie

--- linux-2.6.11/kernel/futex.c.jj  2005-03-17 04:42:29.0 -0500
+++ linux-2.6.11/kernel/futex.c 2005-03-18 05:45:29.0 -0500
@@ -97,7 +97,6 @@ struct futex_q {
  */
 struct futex_hash_bucket {
        spinlock_t              lock;
-       unsigned int            nqueued;
        struct list_head        chain;
 };
 
@@ -265,7 +264,6 @@ static inline int get_futex_value_locked
        inc_preempt_count();
        ret = __copy_from_user_inatomic(dest, from, sizeof(int));
        dec_preempt_count();
-       preempt_check_resched();
 
        return ret ? -EFAULT : 0;
 }
@@ -339,7 +337,6 @@ static int futex_requeue(unsigned long u
        struct list_head *head1;
        struct futex_q *this, *next;
        int ret, drop_count = 0;
-       unsigned int nqueued;
 
  retry:
        down_read(&current->mm->mmap_sem);
@@ -354,23 +351,22 @@ static int futex_requeue(unsigned long u
        bh1 = hash_futex(&key1);
        bh2 = hash_futex(&key2);
 
-       nqueued = bh1->nqueued;
+       if (bh1 < bh2)
+               spin_lock(&bh1->lock);
+       spin_lock(&bh2->lock);
+       if (bh1 > bh2)
+               spin_lock(&bh1->lock);
+
        if (likely(valp != NULL)) {
                int curval;
 
-               /* In order to avoid doing get_user while
-                  holding bh1->lock and bh2->lock, nqueued
-                  (monotonically increasing field) must be first
-                  read, then *uaddr1 fetched from userland and
-                  after acquiring lock nqueued field compared with
-                  the stored value.  The smp_mb () below
-                  makes sure that bh1->nqueued is read from memory
-                  before *uaddr1.  */
-               smp_mb();
-
                ret = get_futex_value_locked(&curval, (int __user *)uaddr1);
 
                if (unlikely(ret)) {
+                       spin_unlock(&bh1->lock);
+                       if (bh1 != bh2)
+                               spin_unlock(&bh2->lock);
+
                        /* If we would have faulted, release mmap_sem, fault
                         * it in and start all over again.
                         */
@@ -385,21 +381,10 @@ static int futex_requeue(unsigned long u
                }
                if (curval != *valp) {
                        ret = -EAGAIN;
-                       goto out;
+                       goto out_unlock;
                }
        }
 
-       if (bh1 < bh2)
-               spin_lock(&bh1->lock);
-       spin_lock(&bh2->lock);
-       if (bh1 > bh2)
-               spin_lock(&bh1->lock);
-
-       if (unlikely(nqueued != bh1->nqueued && valp != NULL)) {
-               ret = -EAGAIN;
-               goto out_unlock;
-       }
-
        head1 = &bh1->chain;
        list_for_each_entry_safe(this, next, head1, list) {
                if (!match_futex (&this->key, &key1))
@@ -435,13 +420,9 @@ out:
        return ret;
 }
 
-/*
- * queue_me and unqueue_me must be called as a pair, each
- * exactly once.  They are called with the hashed spinlock held.
- */
-
 /* The key must be already stored in q->key. */
-static void queue_me(struct futex_q *q, int fd, struct file *filp)
+static inline struct futex_hash_bucket *
+queue_lock(struct futex_q *q, int fd, struct file *filp)
 {
        struct futex_hash_bucket *bh;
 
@@ -455,11 +436,35 @@ static void queue_me(struct futex_q *q, 
        q->lock_ptr = &bh->lock;
 
        spin_lock(&bh->lock);
-       bh->nqueued++;
+       return bh;
+}
+
+static inline void __queue_me(struct futex_q *q, struct futex_hash_bucket *bh)
+{
        list_add_tail(&q->list, &bh->chain);
        spin_unlock(&bh->lock);
 }
 
+static inline void
+queue_unlock(struct futex_q *q, struct futex_hash_bucket *bh)
+{
+       spin_unlock(&bh->lock);
+       drop_key_refs(&q->key);
+}
+
+/*
+ * queue_me and unqueue_me must be called as a pair, each
+ * exactly once.  They are called with the hashed spinlock held.
+ */
+
+/* The key must be already stored in q->key. */
+static void queue_me(struct futex_q *q, int fd, struct file *filp)
+{
+       struct futex_hash_bucket *bh;
+       bh = queue_lock(q, fd, filp);
+       __queue_me(q, bh);
+}
+
 /* Return 1 if we were still queued (ie. 0 means we were woken) */
 static int 

Re: kernel bug: futex_wait hang

2005-03-21 Thread Jamie Lokier
Andrew Morton wrote:
 iirc we ended up deciding that the futex problems around that time were due
 to userspace problems (a version of libc).  But then, there's no discussion
 around Seto's patch and it didn't get applied.  So I don't know what
 happened to that work - it's all a bit mysterious.

It can be fixed _either_ in Glibc, or by changing the kernel.

That problem is caused by differing assumptions between Glibc and the
kernel about subtle futex semantics.  Which goes to show they are
really clever, or something.

I provided pseudo-code for the Glibc fix, but not an actual patch,
because NPTL is quite complicated and I wanted to know first whether
the Glibc people were interested; apparently they were too busy at the
time - benchmarks would have made sense for such a patch.

Scott Snyder started fixing part of Glibc, and that did fix his
instance of this problem so we know the approach works.  But a full
patch for Glibc was not prepared.

The most recent messages under "Futex queue_me/get_user ordering",
with a patch from Jakub Jelinek, will fix this problem by changing the
kernel.  Yes, you should apply Jakub's most recent patch, message-ID
[EMAIL PROTECTED].

I have not tested the patch, but it looks convincing.

I argued for fixing Glibc on the grounds that the changed kernel
behaviour, or more exactly having Glibc depend on it, loses a certain
semantic property useful for unusual operations on multiple futexes at
the same time.  But I appear to have lost the argument, and Jakub's
latest patch does clean up some cruft quite nicely, with no expected
performance hit.

-- Jamie


Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Mon, 2005-03-21 at 20:20 -0800, Andrew Morton wrote:
 Lee Revell [EMAIL PROTECTED] wrote:
 
  Paul Davis and Chris Morgan have been chasing down a problem with
  xmms_jack and it really looks like this bug, thought to have been fixed
  in 2.6.10, is the culprit.
  
  http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.0/2044.html
  
  (for more info google futex_wait 2.6 hang)
  
  It's simple to reproduce.  Run JACK and launch xmms with the JACK output
  plugin.  Close XMMS.  The xmms process hangs.  Strace looks like this:
  
  [EMAIL PROTECTED]:~$ strace -p 7935
  Process 7935 attached - interrupt to quit
  futex(0xb5341bf8, FUTEX_WAIT, 7939, NULL
  
  Just like in the above bug report, if xmms is run with
  LD_ASSUME_KERNEL=2.4.19, it works perfectly.
  
  I have reproduced the bug with 2.6.12-rc1.
  
 
 iirc we ended up deciding that the futex problems around that time were due
 to userspace problems (a version of libc).  But then, there's no discussion
 around Seto's patch and it didn't get applied.  So I don't know what
 happened to that work - it's all a bit mysterious.
 

It does seem like it could be a different problem.  Maybe Paul can
provide some more evidence that it's a kernel and not a glibc/NPTL bug.
I'm really just posting this on Paul's behalf; I don't claim to
understand the issue. ;-)

 Is this a 100% repeatable hang, or is it some occasional race?
 

100% repeatable.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Tue, 2005-03-22 at 04:48 +, Jamie Lokier wrote:
 I argued for fixing Glibc on the grounds that the changed kernel
 behaviour, or more exactly having Glibc depend on it, loses a certain
 semantic property useful for unusual operations on multiple futexes at
 the same time.  But I appear to have lost the argument, and Jakub's
 latest patch does clean up some cruft quite nicely, with no expected
 performance hit.

A glibc fix will take forever to get to users compared to a kernel fix.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Andrew Morton
Jamie Lokier [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  iirc we ended up deciding that the futex problems around that time were due
  to userspace problems (a version of libc).  But then, there's no discussion
  around Seto's patch and it didn't get applied.  So I don't know what
  happened to that work - it's all a bit mysterious.
 
 It can be fixed _either_ in Glibc, or by changing the kernel.
 
 That problem is caused by differing assumptions between Glibc and the
 kernel about subtle futex semantics.  Which goes to show they are
 really clever, or something.
 
 I provided pseudo-code for the Glibc fix, but not an actual patch
 because NPTL is quite complicated and I wanted to know the Glibc
 people were interested, but apparently they were too busy at the time
 - benchmarks would have made sense for such a patch.
 
 Scott Snyder started fixing part of Glibc, and that did fix his
 instance of this problem so we know the approach works.  But a full
 patch for Glibc was not prepared.
 
 The most recent messages under "Futex queue_me/get_user ordering",
 with a patch from Jakub Jelinek will fix this problem by changing the
 kernel.  Yes, you should apply Jakub's most recent patch, message-ID
 [EMAIL PROTECTED].
 
 I have not tested the patch, but it looks convincing.

OK, thanks.  Lee & Paul, that's at
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch


 I argued for fixing Glibc on the grounds that the changed kernel
 behaviour, or more exactly having Glibc depend on it, loses a certain
 semantic property useful for unusual operations on multiple futexes at
 the same time.  But I appear to have lost the argument, and Jakub's
 latest patch does clean up some cruft quite nicely, with no expected
 performance hit.

Futexes were initially so simple.


Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
 Jamie Lokier [EMAIL PROTECTED] wrote:
  
  The most recent messages under "Futex queue_me/get_user ordering",
  with a patch from Jakub Jelinek will fix this problem by changing the
  kernel.  Yes, you should apply Jakub's most recent patch, message-ID
  [EMAIL PROTECTED].
  
  I have not tested the patch, but it looks convincing.
 
 OK, thanks.  Lee & Paul, that's at
 ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
 

Does not fix the problem.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Jakub Jelinek
On Tue, Mar 22, 2005 at 12:30:53AM -0500, Lee Revell wrote:
 On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
  Jamie Lokier [EMAIL PROTECTED] wrote:
   
   The most recent messages under Futex queue_me/get_user ordering,
   with a patch from Jakub Jelinek will fix this problem by changing the
   kernel.  Yes, you should apply Jakub's most recent patch, message-ID
   [EMAIL PROTECTED].
   
   I have not tested the patch, but it looks convincing.
  
  OK, thanks.  Lee  Paul, that's at
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
  
 
 Does not fix the problem.

Have you analyzed the use of mutexes/condvars in the program?
The primary suspect is a deadlock, a race of some kind, or some other
bug in the program.  All of these will show up as a hang in FUTEX_WAIT.
The argument that it works with LinuxThreads doesn't count:
the timing and internals of the two threading libraries are so different
that a program bug can easily show up with one of them and not the other.
Only once you have distilled a minimal self-contained testcase that proves
the program is correct, and that testcase has been analyzed, is it time
to talk about NPTL or kernel bugs.

Jakub