(2012/12/19 3:57), Philippe Waroquiers wrote:
> On Tue, 2012-12-18 at 21:00 +0900, ISHIKAWA,chiaki wrote:
>> (2012/12/18 8:07), Philippe Waroquiers wrote:
>>> Destruction of unknown cond var is probably/maybe bug
>>> https://bugs.kde.org/show_bug.cgi?id=307082
>>>
>> I have produced a patch to take care of the issue.
>> But before that, I have a question.
>>
>> Q1: Why does valgrind not complain if I compile & link
>> Marc's code (in the bug entry which was given as a reminder that
>> "unknown cond var" may be a bug or false positive.) in the following manner,
>>
>>       cc -o /tmp/a.out marc.c
> No idea. Maybe a problem of redirection caused by static linking ?
>
>
>> Q2: I have produced a work-in-progress patch to take care of this issue.
>> I wonder if the developers in the know can take a look and improve it.
>>
>> The patch is posted to the bug entry
>> https://bugs.kde.org/show_bug.cgi?id=307082
> I took a quick look at the patch, approach looks ok to me.
> No time to look more in depth at this now however :(.
>

Thank you again.


With my patch, I tested the mozilla thunderbird mail client under helgrind and
found that most of the warning messages (destruction of unknown cond var) were
bogus. The few warnings that remain come from external libraries, so in this
sense mozilla thunderbird is OK.
[And the patch does not seem to introduce serious bugs so far.]

However, I am still struggling to figure out whether I can learn
which tasks are possibly waiting on a cond var being destroyed.
The message is something like this:
==4103== Thread #1: pthread_cond_destroy: destruction of condition variable being waited upon
==4103==    at 0x4027B9E: pthread_cond_destroy_WRK (hg_intercepts.c:940)
==4103==    by 0x4029A01: pthread_cond_destroy@* (hg_intercepts.c:958)
==4103==    by 0x47193BA: PR_DestroyCondVar (ptsynch.c:372)
==4103==    by 0x5947C40: nsHTTPListener::~nsHTTPListener() (CondVar.h:56)
==4103==    by 0x5947D82: nsHTTPListener::Release() (nsNSSCallbacks.cpp:536)
==4103==    by 0x603EFCA: nsCOMPtr_base::assign_with_AddRef(nsISupports*) (nsCOMPtr.h:442)

I wanted to print out the IDs of the tasks that are waiting on this cond variable.

Now my tentative conclusion is that it is impossible to know which tasks are
waiting, even when going outside helgrind.

Here is my reasoning. I wonder if I am right or wrong.
I am discussing the situation on Linux.

Let nWaiters be the number of tasks waiting.

1. The specification of pthread_cond_signal() does not say which task is
unblocked, so all helgrind can do is decrement nWaiters by one.
(pthread_cond_broadcast() releases all the waiting tasks instead.)

helgrind can't really know which task is being removed from the waiting list,
so decrementing nWaiters is all it does (I think).

2. My desire was just printing out the task ids still waiting.
OK, let me go outside helgrind.
Is it possible to do so by modifying libpthread?

So I thought I could tweak libpthread and print the task list if it
maintains a list of tasks that are waiting.

Under Debian GNU/Linux, which I use, the pthread library seems to
come from libc. The package is libc6, which is actually eglibc,
a streamlined libc that can also be used on embedded systems.

So this is the source I looked at.
Inside the source code I found that, since the pthread semantics related to
cond vars only require the library to release ONE unspecified task on
pthread_cond_signal(), the library does not seem to contain an explicit list
of waiting tasks.
The thread library relies on the "futex" kernel mechanism to take care of
blocking and releasing the tasks. futex is a kernel mechanism developed to
support the 1-1 mapping between user-level and kernel-level tasks, and the
thread functions seem to use futex for synchronization by directly invoking
this kernel API.
Basically, the pthread functions don't keep a library-level task list and such,
but rely exclusively on the futex mechanism inside the kernel to take care of
task synchronization.
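
To make the futex point concrete, here is a minimal Linux-only sketch of
calling futex directly. The helper names futex_wake() and futex_wait_if() are
made up for this mail and this is not code taken from eglibc. What matters is
the return value of FUTEX_WAKE: the kernel reports how many waiters it woke,
not which ones.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not from eglibc): wake up to n tasks blocked on *addr.
   Returns the NUMBER of tasks woken (a count, not a list of task ids),
   or -1 with errno set on error. */
long futex_wake(uint32_t *addr, int n)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE, n, NULL, NULL, 0);
}

/* Hypothetical helper: block only if *addr still holds `expected`.
   If the value differs, the kernel does not block and the call
   returns -1 with errno == EAGAIN. */
long futex_wait_if(uint32_t *addr, uint32_t expected)
{
    return syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
}
```

With no task blocked on the word, futex_wake() simply returns 0, so even at
this level the caller only ever sees a count.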

So at user level, it is not possible to print the tasks waiting on a cond var
when the cond var is being destroyed (or, for that matter, to know the task ids
to begin with).

So my tentative conclusion is it is impossible to know
which task(s) are still blocking on a cond var when the cond var
is being destroyed.
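
For comparison, this is roughly what the program itself would have to do to
have that information, since libpthread does not keep it. It is only a sketch
that I have not tried on thunderbird: the type tracked_cond, its functions and
the MAX_WAITERS limit are all made up for illustration, and it assumes every
wait on the cond var goes through the wrapper.

```c
#include <pthread.h>
#include <stdio.h>

#define MAX_WAITERS 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical wrapper: a cond var plus a record of who is waiting on it.
   waiters[] and nWaiters are protected by the mutex passed to the wait. */
typedef struct {
    pthread_cond_t cond;
    pthread_t waiters[MAX_WAITERS];
    int nWaiters;
} tracked_cond;

void tracked_cond_init(tracked_cond *tc)
{
    pthread_cond_init(&tc->cond, NULL);
    tc->nWaiters = 0;
}

/* Caller must hold m, exactly as with pthread_cond_wait(). */
int tracked_cond_wait(tracked_cond *tc, pthread_mutex_t *m)
{
    pthread_t self = pthread_self();
    int rc, i;

    /* Record ourselves; if the table is full the wait is simply untracked. */
    if (tc->nWaiters < MAX_WAITERS)
        tc->waiters[tc->nWaiters++] = self;

    rc = pthread_cond_wait(&tc->cond, m);

    /* m is held again here, so we can safely remove our own entry. */
    for (i = 0; i < tc->nWaiters; i++) {
        if (pthread_equal(tc->waiters[i], self)) {
            tc->waiters[i] = tc->waiters[--tc->nWaiters];
            break;
        }
    }
    return rc;
}

/* Caller must hold the mutex used for waiting. Prints the recorded waiters
   and returns how many there are. Casting pthread_t to unsigned long is a
   Linux-specific assumption. */
int tracked_cond_report(tracked_cond *tc)
{
    int i;

    for (i = 0; i < tc->nWaiters; i++)
        fprintf(stderr, "still waiting: thread %lu\n",
                (unsigned long)tc->waiters[i]);
    return tc->nWaiters;
}
```

Calling tracked_cond_report() just before pthread_cond_destroy() would list
the recorded waiters. A task that has been signalled but has not yet
reacquired the mutex is still listed, which is consistent with the fact that
nobody outside the kernel knows which task a signal released.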

Maybe at the kernel level we can know (not sure), but
invoking extra kernel calls just to learn this internal data structure
(if possible at all) may introduce extra thread context switches due to
such kernel calls (being cancellation points, maybe) and disturb
libpthread and helgrind operation...
So I am inclined to avoid it and decided to forget about it.

So I am stuck.
I thought it was easy, but going down to kernel level seems to be
too heavy-weight an operation (AND it is not portable, and I am not sure
it is possible to begin with).

Actually, pthread_cond_destroy() ought to return EBUSY when there is at
least one task waiting, so a careful program
can do something about such a situation. (But in the mozilla thunderbird case,
it looks like the error is printed when a class object is destroyed, and
the whole memory area in which the cond var is located seems to be released
along with the class object or something, and the error code is being ignored,
I am afraid. So anything goes.)
Granted, most of the observed cases seem to be related to the shutdown of
the thunderbird mail client (when many objects are destroyed) and may
not bring serious consequences.
BUT shutdown is where many crashes are reported today, and
I have a feeling that this destruction of cond variables which still have tasks
waiting may contribute to a portion of the crashes.
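
What I mean by a careful program is something like the sketch below. The
helper name checked_cond_destroy() is hypothetical and this is not code from
thunderbird or NSPR. It also assumes the thread library really reports EBUSY,
which I believe an implementation is not obliged to do.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: destroy a cond var and report a failure instead of
   silently dropping the error code. Returns whatever
   pthread_cond_destroy() returned (0 on success). */
int checked_cond_destroy(pthread_cond_t *cv, const char *where)
{
    int rc = pthread_cond_destroy(cv);

    if (rc == EBUSY)
        fprintf(stderr, "%s: cond var %p still has waiters\n",
                where, (void *)cv);
    else if (rc != 0)
        fprintf(stderr, "%s: pthread_cond_destroy: %s\n",
                where, strerror(rc));
    return rc;
}
```

A destructor could call this and at least log the problem before the memory
holding the cond var is released.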

So I wonder if people who have worked on helgrind
agree that it is indeed very difficult to figure out
exactly which tasks are waiting when a cond var is being destroyed
at user level.

Also, does anyone have a clever idea about how to debug this situation?

TIA




_______________________________________________
Valgrind-users mailing list
https://lists.sourceforge.net/lists/listinfo/valgrind-users
