Re: kernel bug: futex_wait hang

2005-03-23 Thread Jakub Jelinek
On Wed, Mar 23, 2005 at 05:12:59AM -0800, [EMAIL PROTECTED] wrote:
> the hang occurs during an attempted thread cancel+join. we know from
> strace that one thread calls tgkill() on the other. the other thread is
> blocked in a poll call on a FIFO. after tgkill, the first thread enters a
> futex wait, apparently waiting for the thread ID of the cancelled thread
> to appear at some location (just a guess based on the info from strace).
> the wait never returns, and so the first thread ends up hung in
> pthread_join(). there are no user-defined mutexes or condvars involved.

If the thread that is to be cancelled is in async cancel state (it should
be when waiting in a poll and if cancellation is not disabled in that thread),
then pthread_cancel sends a SIGCANCEL signal to it via tgkill.
If tgkill succeeds (and thus pthread_cancel succeeds too) and you call
pthread_join on it, then in the likely case that the thread is still alive,
pthread_join will FUTEX_WAIT on pd->tid, waiting until the thread dies.
NPTL threads are created with CLONE_CHILD_CLEARTID &self->tid, so this
futex will be FUTEX_WAKEd by mm_release in kernel whenever the thread is
exiting (or dying in some other way).

So, if pthread_join waits for the thread forever, the thread must be
around (otherwise pthread_join would not block on it; well, there could
be memory corruption in the program and anything would be possible then).
This would mean either that the poll has not been awakened by the SIGCANCEL
signal, or e.g. that one of the registered cleanup handlers (or C++
destructors) in the thread that is being cancelled gets stuck for whatever
reason (deadlock, etc.).

Jakub
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kernel bug: futex_wait hang

2005-03-23 Thread paul
> Paul is on vacation for a week so I suspect this will have to wait for
> his return.  But he's been right about similar issues in the past so I'm
> inclined to believe him.
>
> In the meantime if anyone cares to investigate, the problem is trivial
> to reproduce.  All you need is JACK, XMMS, xmms-jack and any 2.6 kernel.

fortunately (or perhaps not), by the powers of webmail, here i am to just
mention a couple more details for anyone who cares to think about this.

the hang occurs during an attempted thread cancel+join. we know from
strace that one thread calls tgkill() on the other. the other thread is
blocked in a poll call on a FIFO. after tgkill, the first thread enters a
futex wait, apparently waiting for the thread ID of the cancelled thread
to appear at some location (just a guess based on the info from strace).
the wait never returns, and so the first thread ends up hung in
pthread_join(). there are no user-defined mutexes or condvars involved.

what is rather odd is that when we have checked for the persistence (or
otherwise) of the cancelled thread, it appears as if several other threads
have died, not just the cancelled one. the primary tester was unclear if
this was to be expected or not (chris morgan, the author of xmms-jack),
and i cannot say definitively whether or not we know for certain that the
cancelled thread no longer exists.

as lee mentioned, i am on vacation right now, using some xp machine at a
relative's house, so i can't test anything till i return.


Re: kernel bug: futex_wait hang

2005-03-22 Thread Lee Revell
On Tue, 2005-03-22 at 01:34 -0500, Jakub Jelinek wrote:
> On Tue, Mar 22, 2005 at 12:30:53AM -0500, Lee Revell wrote:
> > On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
> > > Jamie Lokier <[EMAIL PROTECTED]> wrote:
> > > > 
> > > > The most recent messages under "Futex queue_me/get_user ordering",
> > > > with a patch from Jakub Jelinek will fix this problem by changing the
> > > > kernel.  Yes, you should apply Jakub's most recent patch, message-ID
> > > > "<[EMAIL PROTECTED]>".
> > > > 
> > > > I have not tested the patch, but it looks convincing.
> > > 
> > > OK, thanks.  Lee && Paul, that's at
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
> > > 
> > 
> > Does not fix the problem.
> 
> Have you analyzed the use of mutexes/condvars in the program?
> The primary suspect is a deadlock, race of some kind or other bug
> in the program.  All these will show up as a hang in FUTEX_WAIT.
> The argument that it works with LinuxThreads doesn't count,
> the timing and internals of both threading libraries are so different
> that a program bug can only show up with one of the threading libraries
> and not both.
> Only once you distill a minimal self-contained testcase that proves
> the program is correct and it gets analyzed, it is time to talk about
> NPTL or kernel bugs.

Paul is on vacation for a week so I suspect this will have to wait for
his return.  But he's been right about similar issues in the past so I'm
inclined to believe him.

In the meantime if anyone cares to investigate, the problem is trivial
to reproduce.  All you need is JACK, XMMS, xmms-jack and any 2.6 kernel.

Lee



Re: kernel bug: futex_wait hang

2005-03-22 Thread Lee Revell
On Tue, 2005-03-22 at 15:30 +, Jamie Lokier wrote:
> Lee Revell wrote:
> > On Tue, 2005-03-22 at 04:48 +, Jamie Lokier wrote:
> > > I argued for fixing Glibc on the grounds that the changed kernel
> > > behaviour, or more exactly having Glibc depend on it, loses a certain
> > > semantic property useful for unusual operations on multiple futexes at
> > > the same time.  But I appear to have lost the argument, and Jakub's
> > > latest patch does clean up some cruft quite nicely, with no expected
> > > performance hit.
> > 
> > A glibc fix will take forever to get to users compared to a kernel fix.
> 
> Interesting perspective.  On my systems Glibc is upgraded more often
> than the kernel.
> 

Blame the Debian maintainers.  This bug, reported August 2004, is still
unfixed even in unstable!!!

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=266507

Apparently they think marking a bug "fixed upstream" does something to
solve the problem.

Lee



Re: kernel bug: futex_wait hang

2005-03-22 Thread Jamie Lokier
Lee Revell wrote:
> On Tue, 2005-03-22 at 04:48 +, Jamie Lokier wrote:
> > I argued for fixing Glibc on the grounds that the changed kernel
> > behaviour, or more exactly having Glibc depend on it, loses a certain
> > semantic property useful for unusual operations on multiple futexes at
> > the same time.  But I appear to have lost the argument, and Jakub's
> > latest patch does clean up some cruft quite nicely, with no expected
> > performance hit.
> 
> A glibc fix will take forever to get to users compared to a kernel fix.

Interesting perspective.  On my systems Glibc is upgraded more often
than the kernel.

-- Jamie


Re: kernel bug: futex_wait hang

2005-03-21 Thread Jakub Jelinek
On Tue, Mar 22, 2005 at 12:30:53AM -0500, Lee Revell wrote:
> On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
> > Jamie Lokier <[EMAIL PROTECTED]> wrote:
> > > 
> > > The most recent messages under "Futex queue_me/get_user ordering",
> > > with a patch from Jakub Jelinek will fix this problem by changing the
> > > kernel.  Yes, you should apply Jakub's most recent patch, message-ID
> > > "<[EMAIL PROTECTED]>".
> > > 
> > > I have not tested the patch, but it looks convincing.
> > 
> > OK, thanks.  Lee && Paul, that's at
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
> > 
> 
> Does not fix the problem.

Have you analyzed the use of mutexes/condvars in the program?
The primary suspect is a deadlock, race of some kind or other bug
in the program.  All these will show up as a hang in FUTEX_WAIT.
The argument that it works with LinuxThreads doesn't count,
the timing and internals of both threading libraries are so different
that a program bug can only show up with one of the threading libraries
and not both.
Only once you distill a minimal self-contained testcase that proves
the program is correct and it gets analyzed, it is time to talk about
NPTL or kernel bugs.

Jakub


Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
> Jamie Lokier <[EMAIL PROTECTED]> wrote:
> > 
> > The most recent messages under "Futex queue_me/get_user ordering",
> > with a patch from Jakub Jelinek will fix this problem by changing the
> > kernel.  Yes, you should apply Jakub's most recent patch, message-ID
> > "<[EMAIL PROTECTED]>".
> > 
> > I have not tested the patch, but it looks convincing.
> 
> OK, thanks.  Lee && Paul, that's at
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
> 

Does not fix the problem.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Andrew Morton
Jamie Lokier <[EMAIL PROTECTED]> wrote:
>
> Andrew Morton wrote:
> > iirc we ended up deciding that the futex problems around that time were due
> > to userspace problems (a version of libc).  But then, there's no discussion
> > around Seto's patch and it didn't get applied.  So I don't know what
> > happened to that work - it's all a bit mysterious.
> 
> It can be fixed _either_ in Glibc, or by changing the kernel.
> 
> That problem is caused by differing assumptions between Glibc and the
> kernel about subtle futex semantics.  Which goes to show they are
> really clever, or something.
> 
> I provided pseudo-code for the Glibc fix, but not an actual patch
> because NPTL is quite complicated and I wanted to know whether the Glibc
> people were interested, but apparently they were too busy at the time
> - benchmarks would have made sense for such a patch.
> 
> Scott Snyder started fixing part of Glibc, and that did fix his
> instance of this problem so we know the approach works.  But a full
> patch for Glibc was not prepared.
> 
> The most recent messages under "Futex queue_me/get_user ordering",
> with a patch from Jakub Jelinek will fix this problem by changing the
> kernel.  Yes, you should apply Jakub's most recent patch, message-ID
> "<[EMAIL PROTECTED]>".
> 
> I have not tested the patch, but it looks convincing.

OK, thanks.  Lee && Paul, that's at
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch


> I argued for fixing Glibc on the grounds that the changed kernel
> behaviour, or more exactly having Glibc depend on it, loses a certain
> semantic property useful for unusual operations on multiple futexes at
> the same time.  But I appear to have lost the argument, and Jakub's
> latest patch does clean up some cruft quite nicely, with no expected
> performance hit.

Futexes were initially so simple.


Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Tue, 2005-03-22 at 04:48 +, Jamie Lokier wrote:
> I argued for fixing Glibc on the grounds that the changed kernel
> behaviour, or more exactly having Glibc depend on it, loses a certain
> semantic property useful for unusual operations on multiple futexes at
> the same time.  But I appear to have lost the argument, and Jakub's
> latest patch does clean up some cruft quite nicely, with no expected
> performance hit.

A glibc fix will take forever to get to users compared to a kernel fix.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Mon, 2005-03-21 at 20:20 -0800, Andrew Morton wrote:
> Lee Revell <[EMAIL PROTECTED]> wrote:
> >
> > Paul Davis and Chris Morgan have been chasing down a problem with
> > xmms_jack and it really looks like this bug, thought to have been fixed
> > in 2.6.10, is the culprit.
> > 
> > http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.0/2044.html
> > 
> > (for more info google "futex_wait 2.6 hang")
> > 
> > It's simple to reproduce.  Run JACK and launch xmms with the JACK output
> > plugin.  Close XMMS.  The xmms process hangs.  Strace looks like this:
> > 
> > [EMAIL PROTECTED]:~$ strace -p 7935
> > Process 7935 attached - interrupt to quit
> > futex(0xb5341bf8, FUTEX_WAIT, 7939, NULL
> > 
> > Just like in the above bug report, if xmms is run with
> > LD_ASSUME_KERNEL=2.4.19, it works perfectly.
> > 
> > I have reproduced the bug with 2.6.12-rc1.
> > 
> 
> iirc we ended up deciding that the futex problems around that time were due
> to userspace problems (a version of libc).  But then, there's no discussion
> around Seto's patch and it didn't get applied.  So I don't know what
> happened to that work - it's all a bit mysterious.
> 

It does seem like it could be a different problem.  Maybe Paul can
provide some more evidence that it's a kernel and not a glibc/NPTL bug.
I'm really just posting this on Paul's behalf; I don't claim to
understand the issue. ;-)

> Is this a 100% repeatable hang, or is it some occasional race?
> 

100% repeatable.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Jamie Lokier
Andrew Morton wrote:
> iirc we ended up deciding that the futex problems around that time were due
> to userspace problems (a version of libc).  But then, there's no discussion
> around Seto's patch and it didn't get applied.  So I don't know what
> happened to that work - it's all a bit mysterious.

It can be fixed _either_ in Glibc, or by changing the kernel.

That problem is caused by differing assumptions between Glibc and the
kernel about subtle futex semantics.  Which goes to show they are
really clever, or something.

I provided pseudo-code for the Glibc fix, but not an actual patch
because NPTL is quite complicated and I wanted to know whether the Glibc
people were interested, but apparently they were too busy at the time
- benchmarks would have made sense for such a patch.

Scott Snyder started fixing part of Glibc, and that did fix his
instance of this problem so we know the approach works.  But a full
patch for Glibc was not prepared.

The most recent messages under "Futex queue_me/get_user ordering",
with a patch from Jakub Jelinek will fix this problem by changing the
kernel.  Yes, you should apply Jakub's most recent patch, message-ID
"<[EMAIL PROTECTED]>".

I have not tested the patch, but it looks convincing.

I argued for fixing Glibc on the grounds that the changed kernel
behaviour, or more exactly having Glibc depend on it, loses a certain
semantic property useful for unusual operations on multiple futexes at
the same time.  But I appear to have lost the argument, and Jakub's
latest patch does clean up some cruft quite nicely, with no expected
performance hit.

-- Jamie


Re: kernel bug: futex_wait hang

2005-03-21 Thread Jamie Lokier
Lee Revell wrote:
> > iirc we ended up deciding that the futex problems around that time were due
> > to userspace problems (a version of libc).  But then, there's no discussion
> > around Seto's patch and it didn't get applied.  So I don't know what
> > happened to that work - it's all a bit mysterious.
> 
> It does seem like it could be a different problem.  Maybe Paul can
> provide some more evidence that it's a kernel and not a glibc/NPTL bug.
> I'm really just posting this on Paul's behalf; I don't claim to
> understand the issue. ;-)

Try applying the patch below, which was recently posted by Jakub Jelinek.

If it fixes the problem, it's the same thing as Hidetoshi Seto
noticed, although this patch is much improved thanks to
"preempt_count technology" (tm).  If not, it's a whole new problem.

-- Jamie

--- linux-2.6.11/kernel/futex.c.jj  2005-03-17 04:42:29.0 -0500
+++ linux-2.6.11/kernel/futex.c 2005-03-18 05:45:29.0 -0500
@@ -97,7 +97,6 @@ struct futex_q {
  */
 struct futex_hash_bucket {
spinlock_t  lock;
-   unsigned intnqueued;
struct list_head   chain;
 };
 
@@ -265,7 +264,6 @@ static inline int get_futex_value_locked
inc_preempt_count();
ret = __copy_from_user_inatomic(dest, from, sizeof(int));
dec_preempt_count();
-   preempt_check_resched();
 
return ret ? -EFAULT : 0;
 }
@@ -339,7 +337,6 @@ static int futex_requeue(unsigned long u
struct list_head *head1;
struct futex_q *this, *next;
int ret, drop_count = 0;
-   unsigned int nqueued;
 
  retry:
down_read(&current->mm->mmap_sem);
@@ -354,23 +351,22 @@ static int futex_requeue(unsigned long u
bh1 = hash_futex(&key1);
bh2 = hash_futex(&key2);
 
-   nqueued = bh1->nqueued;
+   if (bh1 < bh2)
+   spin_lock(&bh1->lock);
+   spin_lock(&bh2->lock);
+   if (bh1 > bh2)
+   spin_lock(&bh1->lock);
+
if (likely(valp != NULL)) {
int curval;
 
-   /* In order to avoid doing get_user while
-  holding bh1->lock and bh2->lock, nqueued
-  (monotonically increasing field) must be first
-  read, then *uaddr1 fetched from userland and
-  after acquiring lock nqueued field compared with
-  the stored value.  The smp_mb () below
-  makes sure that bh1->nqueued is read from memory
-  before *uaddr1.  */
-   smp_mb();
-
ret = get_futex_value_locked(&curval, (int __user *)uaddr1);
 
if (unlikely(ret)) {
+   spin_unlock(&bh1->lock);
+   if (bh1 != bh2)
+   spin_unlock(&bh2->lock);
+
/* If we would have faulted, release mmap_sem, fault
 * it in and start all over again.
 */
@@ -385,21 +381,10 @@ static int futex_requeue(unsigned long u
}
if (curval != *valp) {
ret = -EAGAIN;
-   goto out;
+   goto out_unlock;
}
}
 
-   if (bh1 < bh2)
-   spin_lock(&bh1->lock);
-   spin_lock(&bh2->lock);
-   if (bh1 > bh2)
-   spin_lock(&bh1->lock);
-
-   if (unlikely(nqueued != bh1->nqueued && valp != NULL)) {
-   ret = -EAGAIN;
-   goto out_unlock;
-   }
-
head1 = &bh1->chain;
list_for_each_entry_safe(this, next, head1, list) {
if (!match_futex (&this->key, &key1))
@@ -435,13 +420,9 @@ out:
return ret;
 }
 
-/*
- * queue_me and unqueue_me must be called as a pair, each
- * exactly once.  They are called with the hashed spinlock held.
- */
-
 /* The key must be already stored in q->key. */
-static void queue_me(struct futex_q *q, int fd, struct file *filp)
+static inline struct futex_hash_bucket *
+queue_lock(struct futex_q *q, int fd, struct file *filp)
 {
struct futex_hash_bucket *bh;
 
@@ -455,11 +436,35 @@ static void queue_me(struct futex_q *q, 
q->lock_ptr = &bh->lock;
 
spin_lock(&bh->lock);
-   bh->nqueued++;
+   return bh;
+}
+
+static inline void __queue_me(struct futex_q *q, struct futex_hash_bucket *bh)
+{
list_add_tail(&q->list, &bh->chain);
spin_unlock(&bh->lock);
 }
 
+static inline void
+queue_unlock(struct futex_q *q, struct futex_hash_bucket *bh)
+{
+   spin_unlock(&bh->lock);
+   drop_key_refs(&q->key);
+}
+
+/*
+ * queue_me and unqueue_me must be called as a pair, each
+ * exactly once.  They are called with the hashed spinlock held.
+ */
+
+/* The key must be already stored in q->key. */
+static void queue_me(struct futex_q *q, int fd, struct file *filp)
+{
+   struct futex_hash_bucket *bh;
+   bh = queue_lock(q, fd, filp);
+   __queue_me(q, bh);
+}
+
 /* Return 1 if we were still queued (ie. 0 means we were woken) */
 static int unqueue_me(struct futex_q *q)
 {
@@ 

Re: kernel bug: futex_wait hang

2005-03-21 Thread Andrew Morton
Lee Revell <[EMAIL PROTECTED]> wrote:
>
> Paul Davis and Chris Morgan have been chasing down a problem with
> xmms_jack and it really looks like this bug, thought to have been fixed
> in 2.6.10, is the culprit.
> 
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.0/2044.html
> 
> (for more info google "futex_wait 2.6 hang")
> 
> It's simple to reproduce.  Run JACK and launch xmms with the JACK output
> plugin.  Close XMMS.  The xmms process hangs.  Strace looks like this:
> 
> [EMAIL PROTECTED]:~$ strace -p 7935
> Process 7935 attached - interrupt to quit
> futex(0xb5341bf8, FUTEX_WAIT, 7939, NULL
> 
> Just like in the above bug report, if xmms is run with
> LD_ASSUME_KERNEL=2.4.19, it works perfectly.
> 
> I have reproduced the bug with 2.6.12-rc1.
> 

iirc we ended up deciding that the futex problems around that time were due
to userspace problems (a version of libc).  But then, there's no discussion
around Seto's patch and it didn't get applied.  So I don't know what
happened to that work - it's all a bit mysterious.

Is this a 100% repeatable hang, or is it some occasional race?


Re: kernel bug: futex_wait hang

2005-03-21 Thread Andrew Morton
Lee Revell [EMAIL PROTECTED] wrote:

 Paul Davis and Chris Morgan have been chasing down a problem with
 xmms_jack and it really looks like this bug, thought to have been fixed
 in 2.6.10, is the culprit.
 
 http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.0/2044.html
 
 (for more info google futex_wait 2.6 hang)
 
 It's simple to reproduce.  Run JACK and launch xmms with the JACK output
 plugin.  Close XMMS.  The xmms process hangs.  Strace looks like this:
 
 [EMAIL PROTECTED]:~$ strace -p 7935
 Process 7935 attached - interrupt to quit
 futex(0xb5341bf8, FUTEX_WAIT, 7939, NULL
 
 Just like in the above bug report, if xmms is run with
 LD_ASSUME_KERNEL=2.4.19, it works perfectly.
 
 I have reproduced the bug with 2.6.12-rc1.
 

iirc we ended up deciding that the futex problems around that time were due
to userspace problems (a version of libc).  But then, there's no discussion
around Seto's patch and it didn't get applied.  So I don't know what
happened to that work - it's all a bit mysterious.

Is this a 100% repeatable hang, or is it some occasional race?
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kernel bug: futex_wait hang

2005-03-21 Thread Jamie Lokier
Lee Revell wrote:
  iirc we ended up deciding that the futex problems around that time were due
  to userspace problems (a version of libc).  But then, there's no discussion
  around Seto's patch and it didn't get applied.  So I don't know what
  happened to that work - it's all a bit mysterious.
 
 It does seem like it could be a different problem.  Maybe Paul can
 provide some more evidence that it's a kernel and not a glibc/NPTL bug.
 I'm really just posting this on Paul's behalf; I don't claim to
 understand the issue. ;-)

Try applying the patch below, which was recently posted by Jakub Jelinek.

If it fixes the problem, it's the same thing as Hidetoshi Seto
noticed, although this patch is much improved thanks to
preempt_count technology (tm).  If not, it's a whole new problem.

-- Jamie

--- linux-2.6.11/kernel/futex.c.jj  2005-03-17 04:42:29.0 -0500
+++ linux-2.6.11/kernel/futex.c 2005-03-18 05:45:29.0 -0500
@@ -97,7 +97,6 @@ struct futex_q {
  */
 struct futex_hash_bucket {
        spinlock_t              lock;
-       unsigned int            nqueued;
        struct list_head        chain;
 };
 
@@ -265,7 +264,6 @@ static inline int get_futex_value_locked
        inc_preempt_count();
        ret = __copy_from_user_inatomic(dest, from, sizeof(int));
        dec_preempt_count();
-       preempt_check_resched();
 
        return ret ? -EFAULT : 0;
 }
@@ -339,7 +337,6 @@ static int futex_requeue(unsigned long u
        struct list_head *head1;
        struct futex_q *this, *next;
        int ret, drop_count = 0;
-       unsigned int nqueued;
 
  retry:
        down_read(&current->mm->mmap_sem);
@@ -354,23 +351,22 @@ static int futex_requeue(unsigned long u
        bh1 = hash_futex(&key1);
        bh2 = hash_futex(&key2);
 
-       nqueued = bh1->nqueued;
+       if (bh1 < bh2)
+               spin_lock(&bh1->lock);
+       spin_lock(&bh2->lock);
+       if (bh1 > bh2)
+               spin_lock(&bh1->lock);
+
        if (likely(valp != NULL)) {
                int curval;
 
-               /* In order to avoid doing get_user while
-                  holding bh1->lock and bh2->lock, nqueued
-                  (monotonically increasing field) must be first
-                  read, then *uaddr1 fetched from userland and
-                  after acquiring lock nqueued field compared with
-                  the stored value.  The smp_mb () below
-                  makes sure that bh1->nqueued is read from memory
-                  before *uaddr1.  */
-               smp_mb();
-
                ret = get_futex_value_locked(&curval, (int __user *)uaddr1);
 
                if (unlikely(ret)) {
+                       spin_unlock(&bh1->lock);
+                       if (bh1 != bh2)
+                               spin_unlock(&bh2->lock);
+
                        /* If we would have faulted, release mmap_sem, fault
                         * it in and start all over again.
                         */
@@ -385,21 +381,10 @@ static int futex_requeue(unsigned long u
                }
                if (curval != *valp) {
                        ret = -EAGAIN;
-                       goto out;
+                       goto out_unlock;
                }
        }
 
-       if (bh1 < bh2)
-               spin_lock(&bh1->lock);
-       spin_lock(&bh2->lock);
-       if (bh1 > bh2)
-               spin_lock(&bh1->lock);
-
-       if (unlikely(nqueued != bh1->nqueued && valp != NULL)) {
-               ret = -EAGAIN;
-               goto out_unlock;
-       }
-
        head1 = &bh1->chain;
        list_for_each_entry_safe(this, next, head1, list) {
                if (!match_futex (&this->key, &key1))
@@ -435,13 +420,9 @@ out:
        return ret;
 }
 
-/*
- * queue_me and unqueue_me must be called as a pair, each
- * exactly once.  They are called with the hashed spinlock held.
- */
-
 /* The key must be already stored in q->key. */
-static void queue_me(struct futex_q *q, int fd, struct file *filp)
+static inline struct futex_hash_bucket *
+queue_lock(struct futex_q *q, int fd, struct file *filp)
 {
        struct futex_hash_bucket *bh;
 
@@ -455,11 +436,35 @@ static void queue_me(struct futex_q *q, 
        q->lock_ptr = &bh->lock;
 
        spin_lock(&bh->lock);
-       bh->nqueued++;
+       return bh;
+}
+
+static inline void __queue_me(struct futex_q *q, struct futex_hash_bucket *bh)
+{
        list_add_tail(&q->list, &bh->chain);
        spin_unlock(&bh->lock);
 }
 
+static inline void
+queue_unlock(struct futex_q *q, struct futex_hash_bucket *bh)
+{
+       spin_unlock(&bh->lock);
+       drop_key_refs(&q->key);
+}
+
+/*
+ * queue_me and unqueue_me must be called as a pair, each
+ * exactly once.  They are called with the hashed spinlock held.
+ */
+
+/* The key must be already stored in q->key. */
+static void queue_me(struct futex_q *q, int fd, struct file *filp)
+{
+       struct futex_hash_bucket *bh;
+       bh = queue_lock(q, fd, filp);
+       __queue_me(q, bh);
+}
+
 /* Return 1 if we were still queued (ie. 0 means we were woken) */
 static int 

Re: kernel bug: futex_wait hang

2005-03-21 Thread Jamie Lokier
Andrew Morton wrote:
 iirc we ended up deciding that the futex problems around that time were due
 to userspace problems (a version of libc).  But then, there's no discussion
 around Seto's patch and it didn't get applied.  So I don't know what
 happened to that work - it's all a bit mysterious.

It can be fixed _either_ in Glibc, or by changing the kernel.

That problem is caused by differing assumptions between Glibc and the
kernel about subtle futex semantics.  Which goes to show they are
really clever, or something.

I provided pseudo-code for the Glibc fix, but not an actual patch,
because NPTL is quite complicated and I wanted to know first whether
the Glibc people were interested; apparently they were too busy at the
time - benchmarks would have made sense for such a patch.

Scott Snyder started fixing part of Glibc, and that did fix his
instance of this problem so we know the approach works.  But a full
patch for Glibc was not prepared.

The most recent messages under "Futex queue_me/get_user ordering",
with a patch from Jakub Jelinek, will fix this problem by changing the
kernel.  Yes, you should apply Jakub's most recent patch, message-ID
[EMAIL PROTECTED].

I have not tested the patch, but it looks convincing.

I argued for fixing Glibc on the grounds that the changed kernel
behaviour, or more exactly having Glibc depend on it, loses a certain
semantic property useful for unusual operations on multiple futexes at
the same time.  But I appear to have lost the argument, and Jakub's
latest patch does clean up some cruft quite nicely, with no expected
performance hit.

-- Jamie


Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Mon, 2005-03-21 at 20:20 -0800, Andrew Morton wrote:
 Lee Revell [EMAIL PROTECTED] wrote:
 
  Paul Davis and Chris Morgan have been chasing down a problem with
  xmms_jack and it really looks like this bug, thought to have been fixed
  in 2.6.10, is the culprit.
  
  http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.0/2044.html
  
  (for more info google futex_wait 2.6 hang)
  
  It's simple to reproduce.  Run JACK and launch xmms with the JACK output
  plugin.  Close XMMS.  The xmms process hangs.  Strace looks like this:
  
  [EMAIL PROTECTED]:~$ strace -p 7935
  Process 7935 attached - interrupt to quit
  futex(0xb5341bf8, FUTEX_WAIT, 7939, NULL
  
  Just like in the above bug report, if xmms is run with
  LD_ASSUME_KERNEL=2.4.19, it works perfectly.
  
  I have reproduced the bug with 2.6.12-rc1.
  
 
 iirc we ended up deciding that the futex problems around that time were due
 to userspace problems (a version of libc).  But then, there's no discussion
 around Seto's patch and it didn't get applied.  So I don't know what
 happened to that work - it's all a bit mysterious.
 

It does seem like it could be a different problem.  Maybe Paul can
provide some more evidence that it's a kernel and not a glibc/NPTL bug.
I'm really just posting this on Paul's behalf; I don't claim to
understand the issue. ;-)

 Is this a 100% repeatable hang, or is it some occasional race?
 

100% repeatable.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Tue, 2005-03-22 at 04:48 +, Jamie Lokier wrote:
 I argued for fixing Glibc on the grounds that the changed kernel
 behaviour, or more exactly having Glibc depend on it, loses a certain
 semantic property useful for unusual operations on multiple futexes at
 the same time.  But I appear to have lost the argument, and Jakub's
 latest patch does clean up some cruft quite nicely, with no expected
 performance hit.

A glibc fix will take forever to get to users compared to a kernel fix.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Andrew Morton
Jamie Lokier [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  iirc we ended up deciding that the futex problems around that time were due
  to userspace problems (a version of libc).  But then, there's no discussion
  around Seto's patch and it didn't get applied.  So I don't know what
  happened to that work - it's all a bit mysterious.
 
 It can be fixed _either_ in Glibc, or by changing the kernel.
 
 That problem is caused by differing assumptions between Glibc and the
 kernel about subtle futex semantics.  Which goes to show they are
 really clever, or something.
 
 I provided pseudo-code for the Glibc fix, but not an actual patch
 because NPTL is quite complicated and I wanted to know the Glibc
 people were interested, but apparently they were too busy at the time
 - benchmarks would have made sense for such a patch.
 
 Scott Snyder started fixing part of Glibc, and that did fix his
 instance of this problem so we know the approach works.  But a full
 patch for Glibc was not prepared.
 
 The most recent messages under "Futex queue_me/get_user ordering",
 with a patch from Jakub Jelinek will fix this problem by changing the
 kernel.  Yes, you should apply Jakub's most recent patch, message-ID
 [EMAIL PROTECTED].
 
 I have not tested the patch, but it looks convincing.

OK, thanks.  Lee & Paul, that's at
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch


 I argued for fixing Glibc on the grounds that the changed kernel
 behaviour, or more exactly having Glibc depend on it, loses a certain
 semantic property useful for unusual operations on multiple futexes at
 the same time.  But I appear to have lost the argument, and Jakub's
 latest patch does clean up some cruft quite nicely, with no expected
 performance hit.

Futexes were initially so simple.


Re: kernel bug: futex_wait hang

2005-03-21 Thread Lee Revell
On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
 Jamie Lokier [EMAIL PROTECTED] wrote:
  
  The most recent messages under "Futex queue_me/get_user ordering",
  with a patch from Jakub Jelinek will fix this problem by changing the
  kernel.  Yes, you should apply Jakub's most recent patch, message-ID
  [EMAIL PROTECTED].
  
  I have not tested the patch, but it looks convincing.
 
 OK, thanks.  Lee & Paul, that's at
 ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
 

Does not fix the problem.

Lee



Re: kernel bug: futex_wait hang

2005-03-21 Thread Jakub Jelinek
On Tue, Mar 22, 2005 at 12:30:53AM -0500, Lee Revell wrote:
 On Mon, 2005-03-21 at 21:08 -0800, Andrew Morton wrote:
  Jamie Lokier [EMAIL PROTECTED] wrote:
   
   The most recent messages under Futex queue_me/get_user ordering,
   with a patch from Jakub Jelinek will fix this problem by changing the
   kernel.  Yes, you should apply Jakub's most recent patch, message-ID
   [EMAIL PROTECTED].
   
   I have not tested the patch, but it looks convincing.
  
  OK, thanks.  Lee  Paul, that's at
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm1/broken-out/futex-queue_me-get_user-ordering-fix.patch
  
 
 Does not fix the problem.

Have you analyzed the use of mutexes/condvars in the program?
The primary suspect is a deadlock, a race of some kind, or some other
bug in the program.  All of these will show up as a hang in FUTEX_WAIT.
The argument that it works with LinuxThreads doesn't count:
the timing and internals of the two threading libraries are so different
that a program bug can easily show up with one of them and not the other.
Only once you have distilled a minimal self-contained testcase that proves
the program is correct, and that testcase has been analyzed, is it time
to talk about NPTL or kernel bugs.

Jakub