Launchpad has imported 6 comments from the remote bug at
https://bugzilla.redhat.com/show_bug.cgi?id=737387.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2011-09-11T17:56:01+00:00 Ryan wrote:

Created attachment 522617
test case demonstrating the issue

There appears to be a strange bug in glibc that causes deadlocks when
calling fork() from threads. We had a testcase in GLib failing from time
to time because of this.

I've attached a minimal testcase that uses only pure pthreads + libc.
Compile it with -pthread and run it. It should fill your screen with
dots for a while, then hang when it hits the bug (which happens randomly
anywhere between 1 dot and hundreds). I've already received independent
verification that this testcase hangs on several people's computers.

I believe this to be an upstream issue since the bug is also visible on
Ubuntu, but the glibc website says I should file bugs against
distributions first. I also believe the issue to be a regression since
older Fedora and RHEL releases are unaffected. The problem appears to
affect both 32-bit and 64-bit systems.

Some notes:

 - compiling the testcase with -static has the side-effect of causing the
   bug to go away

 - compiling the testcase with -DFORK_DIRECTLY also appears to solve the
   problem

 - replacing the execv() with a direct exit(0) doesn't solve the problem
   but causes the frequency to change

The fact that both static linking and making the fork() syscall directly
cause the problem to disappear leads me to believe that this is a libc
bug rather than a kernel bug (which is the only other possibility). I'm
not 100% sure of that, though, since libc actually uses the clone()
syscall to implement fork(), so there could be a difference inside the
kernel because of that.
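
For readers without access to the attachment, here is a rough sketch of
the kind of test described above. It is not the attached testcase: the
names (do_fork, worker, run_fork_stress, FORK_STRESS_MAIN), the thread
and iteration counts, and the use of /bin/true as the exec target are
all assumptions made for illustration. Only the -DFORK_DIRECTLY switch
and the fork-then-execv pattern come from the report. Unlike the
original, which apparently runs until it hangs, this version stops after
a fixed number of iterations.

```c
/* Sketch only: an illustration of the scenario, not attachment 522617. */
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork through libc, or through the raw syscall with -DFORK_DIRECTLY.
 * Architectures without SYS_fork fall back to the libc wrapper. */
static pid_t do_fork(void)
{
#if defined(FORK_DIRECTLY) && defined(SYS_fork)
    return (pid_t)syscall(SYS_fork);
#else
    return fork();
#endif
}

struct worker_arg {
    int iterations; /* fork/wait cycles to attempt */
    int completed;  /* cycles that finished */
};

/* Each thread forks, the child execs a trivial program, the parent waits. */
static void *worker(void *data)
{
    struct worker_arg *arg = data;

    for (int i = 0; i < arg->iterations; i++) {
        pid_t pid = do_fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            char *child_argv[] = { "true", NULL };
            execv("/bin/true", child_argv); /* assumed exec target */
            _exit(0);                       /* reached only if execv fails */
        }
        int status;
        if (waitpid(pid, &status, 0) != pid)
            break;
        arg->completed++;
        fputc('.', stdout); /* one dot per completed cycle */
        fflush(stdout);
    }
    return NULL;
}

/* Runs nthreads workers; returns the total number of completed cycles,
 * or -1 if memory allocation fails. */
int run_fork_stress(int nthreads, int iterations)
{
    assert(nthreads > 0 && iterations >= 0);

    pthread_t *threads = calloc((size_t)nthreads, sizeof *threads);
    struct worker_arg *args = calloc((size_t)nthreads, sizeof *args);
    int started = 0, total = 0;

    if (threads == NULL || args == NULL) {
        free(threads);
        free(args);
        return -1;
    }
    for (int i = 0; i < nthreads; i++) {
        args[i].iterations = iterations;
        if (pthread_create(&threads[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        total += args[i].completed;
    }
    free(threads);
    free(args);
    return total;
}

#ifdef FORK_STRESS_MAIN
int main(void)
{
    int total = run_fork_stress(4, 200); /* arbitrary counts */
    fputc('\n', stdout);
    return total == 4 * 200 ? 0 : 1;
}
#endif
```

Assuming the sketch is saved as fork-stress.c (a hypothetical name), it
can be built with "cc -pthread -DFORK_STRESS_MAIN fork-stress.c", adding
-DFORK_DIRECTLY or -static to compare the variants listed in the notes.
On a glibc with the fix it should print 800 dots and exit; whether it
reproduces the hang on an affected glibc has not been verified.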

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/838975/comments/4

------------------------------------------------------------------------
On 2011-09-16T14:28:37+00:00 Fedora wrote:

glibc-2.14.90-9 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/glibc-2.14.90-9

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/838975/comments/5

------------------------------------------------------------------------
On 2011-09-16T21:11:20+00:00 Ryan wrote:

Thanks for the awesome turnaround.  I installed the update from testing
on my F16 system and it appears to fix the problem.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/838975/comments/6

------------------------------------------------------------------------
On 2011-09-17T19:34:50+00:00 Fedora wrote:

Package glibc-2.14.90-9:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing glibc-2.14.90-9'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/glibc-2.14.90-9
then log in and leave karma (feedback).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/838975/comments/10

------------------------------------------------------------------------
On 2011-09-28T18:52:27+00:00 Fedora wrote:

Package glibc-2.14.90-10:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing glibc-2.14.90-10'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/glibc-2.14.90-10
then log in and leave karma (feedback).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/838975/comments/13

------------------------------------------------------------------------
On 2011-10-02T18:12:47+00:00 Fedora wrote:

glibc-2.14.90-10 has been pushed to the Fedora 16 stable repository.  If
problems still persist, please make note of it in this bug report.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/838975/comments/15


** Changed in: glibc (Fedora)
       Status: Unknown => Fix Released

** Changed in: glibc (Fedora)
   Importance: Unknown => Undecided

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to eglibc in Ubuntu.
https://bugs.launchpad.net/bugs/838975

Title:
  weird pthread/fork race/deadlock

Status in eglibc package in Ubuntu:
  Fix Released
Status in eglibc source package in Natty:
  Invalid
Status in eglibc source package in Oneiric:
  Fix Released
Status in glibc package in Fedora:
  Fix Released

Bug description:
  There appears to be a strange bug in glibc that causes deadlocks when
  calling fork() from threads.  We had a testcase in GLib failing from
  time to time because of this.

  I've attached a minimal testcase that uses only pure pthreads + libc.
  Compile it with -pthread and run it.  It should fill your screen with
  dots for a while, then hang when it hits the bug (which happens
  randomly anywhere between 1 dot and hundreds).  I've already received
  independent verification that this testcase hangs on several people's
  computers.

  I believe this to be an upstream issue since this bug is visible on
  Fedora 15 and 16, but the glibc website says I should file bugs
  against distributions first.  I also believe the issue to be a
  regression since Lucid is fine but Oneiric is not.  The problem
  appears to affect both 32-bit and 64-bit systems.

  Some notes:

   - compiling the testcase with -static has the side-effect of causing
     the bug to go away

   - compiling the testcase with -DFORK_DIRECTLY also appears to solve
     the problem

   - replacing the execv() with a direct exit(0) doesn't solve the
     problem but causes the frequency to change

  
  The fact that both static linking and making the fork() syscall
  directly cause the problem to disappear leads me to believe that this
  is a libc bug rather than a kernel bug (which is the only other
  possibility).  I'm not 100% sure of that, though, since libc actually
  uses the clone() syscall to implement fork(), so there could be a
  difference inside the kernel because of that.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/838975/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
