[email protected] (Ludovic Courtès) writes:
> Hi,
>
> Neil Jerram <[email protected]> writes:
>
>> [email protected] (Ludovic Courtès) writes:
>
>>> This must be related to http://savannah.gnu.org/bugs/?27457 .
>>> Contributions welcome! ;-)
>>
>> I will start looking at this later this evening. (Unless you're already
>> investigating - in which case please let me know!)
>
> I won’t look into it in the next days, so go ahead! ;-)

Well... I don't see the "throw from within critical section" problem.
Instead, after a while, I see a hang, with one thread doing:
#0 0xb7f06424 in __kernel_vsyscall ()
#1 0xb7acd255 in sem_wait@@GLIBC_2.1 () from /lib/i686/cmov/libpthread.so.0
#2 0xb7dc2018 in GC_stop_world () from /usr/lib/libgc.so.1
and all the others:
#0 0xb7f06424 in __kernel_vsyscall ()
#1 0xb7b05837 in sigsuspend () from /lib/i686/cmov/libc.so.6
#2 0xb7dc222b in GC_suspend_handler_inner () from /usr/lib/libgc.so.1
#3 0xb7dc22b5 in GC_suspend_handler () from /usr/lib/libgc.so.1
#4 <signal handler called>

In theory, each of the other threads must have called sem_post() from its
suspend handler, and the number of those sem_post()s should equal the
number of times that the GC_stop_world thread calls sem_wait(), so the
GC_stop_world thread shouldn't still be waiting.

I wonder if there's a way that the pthread_kill(p->id, SIG_SUSPEND) in
GC_stop_world can appear to succeed (by returning 0) even though the
signalled thread never gets the signal, or dies before it does the
sem_post()?

Regarding the "throw from within critical section" problem, I guess I'm
not seeing it because I'm not running on a multi-core machine. Can
someone who does see this problem
- run under GDB
- set a breakpoint on the line in throw.c that does
  fprintf (stderr, "throw from within critical section.\n")
- post the thread backtraces (thread apply all bt), when this breakpoint
is hit?
Thanks.
Neil