Bug#479952: libc6/s390 - __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

2008-10-27 Thread Carlos O'Donell
On Sat, Oct 25, 2008 at 1:21 PM, Julien Danjou [EMAIL PROTECTED] wrote:
> Is there anything from an outsider that could help?

I've seen this on-and-off again on the hppa-linux port. The issue has,
in my experience, been a compiler problem. My standard operating
procedure is to methodically add volatile to the atomic.h operations
until it goes away, and then work out the compiler mis-optimization.

The bug is almost always a situation where the lll_unlock is scheduled
before owner = 0, and the assert catches the race condition where you
unlock but have not yet cleared the owner.
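A minimal sketch of the ordering described above, using C11 atomics. The toy_mutex type and the toy_trylock/toy_unlock functions are hypothetical names made up for illustration; this is not glibc's mutex layout or its lll_unlock. The point is only that the owner field has to be cleared before the lock word is released, and that the release must not let the compiler or CPU move the owner write after it.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical, simplified mutex; NOT glibc's actual data layout. */
struct toy_mutex {
    atomic_int lock;   /* 0 = free, 1 = held */
    int owner;         /* id of the holder, 0 when free */
};

/* Try to take the lock; returns 0 on success, -1 if it is already held. */
static int toy_trylock(struct toy_mutex *m, int tid)
{
    int expected = 0;
    if (!atomic_compare_exchange_strong_explicit(&m->lock, &expected, 1,
                                                 memory_order_acquire,
                                                 memory_order_relaxed))
        return -1;
    /* Same kind of check as the failing assertion in this bug report:
       a freshly acquired lock must not still have an owner recorded. */
    assert(m->owner == 0);
    m->owner = tid;
    return 0;
}

static void toy_unlock(struct toy_mutex *m)
{
    /* Clear the owner first ... */
    m->owner = 0;
    /* ... then release. With release ordering the owner write above may
       not be moved after this store. If it were, another thread could
       acquire the lock and trip the assertion in toy_trylock. */
    atomic_store_explicit(&m->lock, 0, memory_order_release);
}
```

If the store in toy_unlock were relaxed, or the two statements were swapped, a second thread could win the compare-and-swap while owner was still nonzero, which is the race the assertion detects.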

$0.02.

Cheers,
Carlos.



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#479952: libc6/s390 - __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

2008-10-27 Thread Carlos O'Donell
On Mon, Oct 27, 2008 at 10:05 AM, Andrew Haley [EMAIL PROTECTED] wrote:
>> I've seen this on-and-off again on the hppa-linux port. The issue has,
>> in my experience, been a compiler problem. My standard operating
>> procedure is to methodically add volatile to the atomic.h operations
>> until it goes away, and then work out the compiler mis-optimization.

>> The bug is almost always a situation where the lll_unlock is scheduled
>> before owner = 0, and the assert catches the race condition where you
>> unlock but have not yet cleared the owner.

> Are you sure this is a compiler problem?  Unless you use explicit atomic
> memory accesses or volatile the compiler is supposed to re-order memory
> access.  Perhaps I'm misunderstanding you.

Sorry, parsing the above statement requires knowing something about
how lll_unlock is implemented in glibc.

The lll_unlock function is supposed to be a memory barrier.

The function is usually an explicit atomic operation, or a volatile
asm implementing the futex syscall i.e. INTERNAL_SYSCALL macro.
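As a rough illustration of what "memory barrier" means at the compiler level: with GCC, an asm statement only stops the compiler from moving ordinary memory accesses across it when it carries a "memory" clobber; volatile on its own does not guarantee that. The toy_state struct and toy_release function below are hypothetical stand-ins, not glibc's lll_unlock or INTERNAL_SYSCALL, and the sketch constrains only the compiler, not the hardware.

```c
/* GCC/Clang extension: an empty asm with a "memory" clobber. The compiler
   must assume it reads and writes all memory, so it cannot move loads or
   stores across it. It emits no instruction and is not a CPU fence. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical state, for illustration only. */
struct toy_state {
    int owner;  /* id of the holder, 0 when free */
    int lock;   /* 1 = held, 0 = free */
};

static void toy_release(struct toy_state *s)
{
    s->owner = 0;        /* must become visible first */
    compiler_barrier();  /* keeps the compiler from sinking the owner
                            write after the unlock below */
    s->lock = 0;         /* stand-in for the real unlock operation */
}
```

Whether the asm used for the unlock on a given port actually carries such a clobber is something to check in that port's sources; the sketch makes no claim about it.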

Cheers,
Carlos.



Bug#479952: libc6/s390 - __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

2008-10-27 Thread Andrew Haley
Carlos O'Donell wrote:
> On Mon, Oct 27, 2008 at 10:05 AM, Andrew Haley [EMAIL PROTECTED] wrote:
>>> I've seen this on-and-off again on the hppa-linux port. The issue has,
>>> in my experience, been a compiler problem. My standard operating
>>> procedure is to methodically add volatile to the atomic.h operations
>>> until it goes away, and then work out the compiler mis-optimization.

>>> The bug is almost always a situation where the lll_unlock is scheduled
>>> before owner = 0, and the assert catches the race condition where you
>>> unlock but have not yet cleared the owner.
>> Are you sure this is a compiler problem?  Unless you use explicit atomic
>> memory accesses or volatile the compiler is supposed to re-order memory
>> access.  Perhaps I'm misunderstanding you.
>
> Sorry, parsing the above statement requires knowing something about
> how lll_unlock is implemented in glibc.
>
> The lll_unlock function is supposed to be a memory barrier.
>
> The function is usually an explicit atomic operation, or a volatile
> asm implementing the futex syscall i.e. INTERNAL_SYSCALL macro.

I understand all that, but the question still stands: is the compiler
really moving a memory write past a memory barrier?  ISTR we did have
a discussion on gcc-list about that, but it was a while ago and should
now be fixed.

Andrew.



Bug#479952: libc6/s390 - __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

2008-10-27 Thread Carlos O'Donell
On Mon, Oct 27, 2008 at 11:27 AM, Andrew Haley [EMAIL PROTECTED] wrote:
> I understand all that, but the question still stands: is the compiler
> really moving a memory write past a memory barrier?  ISTR we did have
> a discussion on gcc-list about that, but it was a while ago and should
> now be fixed.

This issue no longer affects the PA port, but I can't speak for s390.

The PA port is the only port for which I do regular gcc / glibc testing.

Cheers,
Carlos.



Bug#479952: libc6/s390 - __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

2008-10-25 Thread Julien Danjou
At 1210458182 time_t, Aurelien Jarno wrote:
> Looking quickly at the code the problem is that LLL_MUTEX_LOCK (mutex)
> fails to acquire the mutex. It can be a bug in atomic.h or a bug in the
> futexes implementation of the kernel.
>
> It would be nice to have an strace of the problem to see the futex
> syscall before this assertion.

Here's what I can get from #468793.
In this test, if the number of threads is <= 2, it's ok.
With something like ./tchmttest typical casket 3 1000 1000 it fails 50%
of the time.

I've tried to strace the test but unfortunately when stracing,
everything is fine.

Is there anything from an outsider that could help?

Cheers,
-- 
Julien Danjou
.''`.  Debian Developer
: :' : http://julien.danjou.info
`. `'  http://people.debian.org/~acid
  `-   9A0D 5FD9 EB42 22F6 8974  C95C A462 B51E C2FE E5CD


Bug#479952: libc6/s390 - __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

2008-05-10 Thread Aurelien Jarno
On Wed, May 07, 2008 at 11:29:49AM +0200, Bastian Blank wrote:
> Package: libc6
> Version: 2.7-10
> Severity: important
>
> On Wed, May 07, 2008 at 09:34:12AM +0200, Matthias Klose wrote:
>> the build failure on s390 is unexpected; is it possible to extract a
>> test case?
>
> | java: pthread_mutex_lock.c:71: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
>
> So another package failed about that (after mono and libto$bla). It
> looks like a race condition somewhere in the libpthread.
>

Looking quickly at the code the problem is that LLL_MUTEX_LOCK (mutex)
fails to acquire the mutex. It can be a bug in atomic.h or a bug in the
futexes implementation of the kernel.

It would be nice to have an strace of the problem to see the futex
syscall before this assertion.

Also a small testcase of the problem would be really helpful to debug
it.
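One possible starting point for such a testcase is a plain contention loop: several threads repeatedly lock and unlock one default mutex. This is only a sketch under assumptions; the names stress_ctx, stress_worker and run_stress are made up here, and nothing confirms that this particular pattern reproduces the assertion on s390.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64

/* Shared state for the hypothetical stress test. */
struct stress_ctx {
    pthread_mutex_t mutex;
    long counter;      /* protected by mutex */
    long iterations;   /* lock/unlock cycles per thread */
};

static void *stress_worker(void *arg)
{
    struct stress_ctx *ctx = arg;
    for (long i = 0; i < ctx->iterations; i++) {
        pthread_mutex_lock(&ctx->mutex);
        ctx->counter++;
        pthread_mutex_unlock(&ctx->mutex);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter value,
   or -1 if setup failed or a thread could not be created. */
static long run_stress(int nthreads, long iterations)
{
    pthread_t threads[MAX_THREADS];
    struct stress_ctx ctx;
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    if (pthread_mutex_init(&ctx.mutex, NULL) != 0)
        return -1;
    ctx.counter = 0;
    ctx.iterations = iterations;

    for (; started < nthreads; started++)
        if (pthread_create(&threads[started], NULL, stress_worker, &ctx) != 0)
            break;
    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&ctx.mutex);
    return started == nthreads ? ctx.counter : -1;
}
```

Calling run_stress(3, 1000) from a small main and building with -pthread gives a program that either finishes with a counter of 3000 or aborts inside pthread_mutex_lock if the assertion fires.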

-- 
  .''`.  Aurelien Jarno | GPG: 1024D/F1BCDB73
 : :' :  Debian developer   | Electrical Engineer
 `. `'   [EMAIL PROTECTED] | [EMAIL PROTECTED]
   `-people.debian.org/~aurel32 | www.aurel32.net



Bug#479952: libc6/s390 - __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

2008-05-07 Thread Bastian Blank
Package: libc6
Version: 2.7-10
Severity: important

On Wed, May 07, 2008 at 09:34:12AM +0200, Matthias Klose wrote:
> the build failure on s390 is unexpected; is it possible to extract a
> test case?

| java: pthread_mutex_lock.c:71: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

So another package failed about that (after mono and libto$bla). It
looks like a race condition somewhere in the libpthread.

Bastian

-- 
The more complex the mind, the greater the need for the simplicity of play.
-- Kirk, Shore Leave, stardate 3025.8


