On 08.07.2010 13:23, Konstantin Serebryany wrote:
> The stack trace from gdb suggests that your program is blocked on
> pthread_cond_wait, which does not necessarily mean there is a mutex
> deadlock.
> You might be waiting for some condition which never becomes true.
>   

Thanks for the help, this was indeed the case. With some
combinations of sort | uniq I found the cause.

Stefan
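
To illustrate the diagnosis confirmed above, here is a minimal sketch in Python (not the pthread/GLib code from this thread; `cond`, `ready`, `waiter` and `run_demo` are made-up names). Only one lock is involved, so there is no lock-order cycle for a tool to report, yet the waiter would block forever if it did not use a timeout, because nothing ever makes the condition true.

```python
import threading

# Hypothetical shared state: `ready` stands in for whatever condition the
# blocked threads were waiting on. Nothing in this sketch ever sets it.
cond = threading.Condition()
ready = False

def waiter(result, timeout):
    # Usual condition-variable pattern: take the lock, wait on a predicate.
    with cond:
        # wait_for() returns the last value of the predicate, so a False
        # result means the wait timed out with the condition still false.
        result.append(cond.wait_for(lambda: ready, timeout=timeout))

def run_demo(timeout=0.2):
    result = []
    t = threading.Thread(target=waiter, args=(result, timeout))
    t.start()
    t.join()
    return result[0]

if __name__ == "__main__":
    print("condition became true:", run_demo())
```

Without the timeout, a backtrace of the waiter would end in the condition wait, much like the pthread_cond_wait frames quoted below, even though no mutex deadlock exists.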

> --kcc
>
> On Thu, Jul 8, 2010 at 2:18 PM, Stefan Kost <[email protected]> wrote:
>   
>> On 08.07.2010 12:30, Konstantin Serebryany wrote:
>>     
>>> On Thu, Jul 8, 2010 at 1:09 PM, Stefan Kost <[email protected]> wrote:
>>>
>>>       
>>>> On 08.07.2010 11:34, Konstantin Serebryany wrote:
>>>>
>>>>         
>>>>> --tool=helgrind
>>>>>
>>>>>
>>>>>           
>>>> Nope. helgrind does not complain. Does it run cycle checks on-the-fly?
>>>>
>>>>         
>>> Yes, http://valgrind.org/docs/manual/hg-manual.html#hg-manual.lock-orders
>>>
>>>       
>> hm, then it should detect the problem indeed.
>>     
>>>       
>>>> Or how would it detect that the app deadlocked?
>>>>
>>>>         
>>> helgrind finds cycles in lock ordering; a deadlock does not have to
>>> actually happen during the execution.
>>>
>>> Does your program use pthread_mutex_ or something else?
>>> Is the program dynamically linked?
>>>
>>>       
>> The application is a benchmark for GStreamer, using GLib's GThread
>> (which uses pthreads on Linux). The program is dynamically linked. If I
>> Ctrl-C the app under gdb and dump all stack frames, I see a lot of
>> stack frames like the two below:
>> #0  0x0012d422 in __kernel_vsyscall ()
>> #1  0x00325af9 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
>> #2  0x00328e1c in _L_cond_lock_826 () from /lib/tls/i686/cmov/libpthread.so.0
>> #3  0x00328c40 in __pthread_mutex_cond_lock (mutex=0x824e6b0) at ../nptl/pthread_mutex_lock.c:61
>> #4  0x003230b3 in pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_wait.S:203
>> ...
>> and
>> #0  0x0012d422 in __kernel_vsyscall ()
>> #1  0x00323015 in pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_wait.S:122
>> ...
>>
>> Stefan
>>
>>
>>     
>>> --kcc
>>>
>>>
>>>       
>>>> I was thinking of
>>>> writing an LD_PRELOAD-based toy; there I would Ctrl-C the app and then
>>>> run the cycle checks and dump the results. I have found no evidence in
>>>> the docs that I can signal helgrind to tell it that the app has now deadlocked.
>>>>
>>>> Stefan
>>>>
>>>>
>>>>
>>>>         
>>>>> On Thu, Jul 8, 2010 at 12:30 PM, Stefan Kost <[email protected]> wrote:
>>>>>
>>>>>
>>>>>           
>>>>>> Hi,
>>>>>>
>>>>>> Is anyone aware of a valgrind tool that can help me debug a deadlock
>>>>>> in a highly threaded program? The program can easily create hundreds of
>>>>>> threads.
>>>>>> What I am looking for is a tool that tracks for each thread which
>>>>>> mutexes are locked (incl. the stack frame of the lock) and whether it is
>>>>>> waiting on a mutex (also including the stack frame). When the app
>>>>>> deadlocks, the collected data can be represented as a directed graph
>>>>>> ("thread -> mutex" for a held lock and "mutex -> thread" for a pending
>>>>>> lock) and one could run Tarjan's strongly connected components algorithm
>>>>>> [1][2] to detect cycles. For each found cycle it could print the
>>>>>> involved threads with the backtraces.
>>>>>>
>>>>>> Stefan
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>> http://en.wikipedia.org/wiki/Tarjan%E2%80%99s_strongly_connected_components_algorithm
>>>>>> [2] http://www.logarithmic.net/pfh/blog/01208083168
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> This SF.net email is sponsored by Sprint
>>>>>> What will you do first with EVO, the first 4G phone?
>>>>>> Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
>>>>>> _______________________________________________
>>>>>> Valgrind-users mailing list
>>>>>> [email protected]
>>>>>> https://lists.sourceforge.net/lists/listinfo/valgrind-users
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>
>>     
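
The graph idea from the original question in this thread ("thread -> mutex" for a held lock, "mutex -> thread" for a pending lock, then Tarjan's strongly connected components) can be sketched as follows. This is only an illustration under stated assumptions: the `held` and `waiting` dictionaries are hypothetical inputs that some tracing layer would have to collect, thread and mutex names are assumed to be distinct strings, and backtraces are left out.

```python
from collections import defaultdict

def build_graph(held, waiting):
    """held: {thread: [mutex, ...]}, waiting: {thread: mutex}.
    Edge convention from the question: thread -> mutex for a held lock,
    mutex -> thread for a pending lock."""
    graph = defaultdict(list)
    for thread, mutexes in held.items():
        for mutex in mutexes:
            graph[thread].append(mutex)
    for thread, mutex in waiting.items():
        graph[mutex].append(thread)
    return graph

def strongly_connected_components(graph):
    """Recursive Tarjan SCC; fine for small graphs, a very large graph
    would need an iterative version to stay under the recursion limit."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in list(graph):
        if v not in index:
            visit(v)
    return components

def find_deadlocks(held, waiting):
    graph = build_graph(held, waiting)
    # Edges only run between a thread and a mutex, so there are no
    # self-loops: any component with more than one node contains a cycle.
    return [sorted(c) for c in strongly_connected_components(graph) if len(c) > 1]

if __name__ == "__main__":
    held = {"T1": ["A"], "T2": ["B"], "T3": ["C"]}
    waiting = {"T1": "B", "T2": "A"}
    print(find_deadlocks(held, waiting))
```

As the replies above point out, such a graph only sees threads blocked on a mutex. A thread parked in a condition wait whose condition never becomes true produces no edge and therefore no cycle.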

