On Sat, 2004-01-17 at 12:12, Leopold Toetsch wrote:

> But I'm a bit worried about the reason, why it actually hangs here:
> 
> > (gdb) bac
> > #0  0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
> ...
> > #3  0x0fd0a694 in exit () from /lib/libc.so.6
> > #4  0x0fcf23a8 in __libc_start_main () from /lib/libc.so.6
> 
> These are exit calls from the main thread. That should AFAIK just kill
> all threads that got started eventually and finish the process.

What if the event thread is stuck?  When the tests hang, suspending and
resuming the process unsticks it, though the current test will fail.  I
wonder if it's waiting for a masked signal or a signal that never
arrives.

(Warning: I've just about exhausted my knowledge of pthreads programming
coming up with that idea, so if it sounds crack-addled, it's definitely
due to gaps in my knowledge.)

I base that idea on this backtrace:

/home/chromatic/dev/parrot/t/src/extend_8, process 19479
Reading symbols from /lib/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 19479)]
[New Thread 32769 (LWP 19480)]
[New Thread 16386 (LWP 19481)]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libutil.so.1...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld.so.1...done.
Loaded symbols for /lib/ld.so.1
0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
(gdb) info threads
  3 Thread 16386 (LWP 19481)  0x0ff976a4 in __pthread_sigsuspend ()
   from /lib/libpthread.so.0
  2 Thread 32769 (LWP 19480)  0x0ff9c140 in waitpid ()
   from /lib/libpthread.so.0
  1 Thread 16384 (LWP 19479)  0x0ff976a4 in __pthread_sigsuspend ()
   from /lib/libpthread.so.0
(gdb) bac
#0  0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1  0x0ff973e0 in __pthread_wait_for_restart_signal ()
   from /lib/libpthread.so.0
#2  0x0ff96fe4 in pthread_onexit_process () from /lib/libpthread.so.0
#3  0x0fd0a694 in exit () from /lib/libc.so.6
#4  0x0fcf23a8 in __libc_start_main () from /lib/libc.so.6
(gdb) thread 2
[Switching to thread 2 (Thread 32769 (LWP 19480))]#0  0x0ff9c140 in
waitpid ()
   from /lib/libpthread.so.0
(gdb) bac
#0  0x0ff9c140 in waitpid () from /lib/libpthread.so.0
#1  0x0ff9c128 in waitpid () from /lib/libpthread.so.0
#2  0x0ff958d0 in pthread_handle_exit () from /lib/libpthread.so.0
#3  0x0ff94c90 in __pthread_manager () from /lib/libpthread.so.0
#4  0x0fdb3118 in clone () from /lib/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 16386 (LWP 19481))]#0  0x0ff976a4 in
__pthread_sigsuspend () from /lib/libpthread.so.0
(gdb) bac
#0  0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1  0x0ff973e0 in __pthread_wait_for_restart_signal ()
   from /lib/libpthread.so.0
#2  0x0ff93f9c in [EMAIL PROTECTED] () from
/lib/libpthread.so.0
#3  0x101d7614 in queue_wait (queue=0x10279e30) at src/tsq.c:159
#4  0x1009c968 in event_thread (data=0x10279e30) at src/events.c:349
#5  0x0ff94d98 in pthread_start_thread () from /lib/libpthread.so.0
#6  0x0fdb3118 in clone () from /lib/libc.so.6
(gdb) q

and the fact that the attached patch seems to fix things.  I don't
expect the patch to be correct; it may only paper over a real problem,
perhaps one specific to my system.  It's food for thought, though.

If it turns out that the problem does lie here, can anyone suggest a
very small test program that would demonstrate the real problem?
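
As a starting point, here is a rough sketch of the pattern the patch
touches.  It is not a reproduction of the hang itself.  Everything in it
(event_queue, send_terminate_event, run_event_thread_demo) is a
hypothetical stand-in, not Parrot's API: an event thread blocks in
pthread_cond_wait() the way thread 3 does in queue_wait(), and the main
thread queues a terminate event before joining it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an event queue; not Parrot's real type. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             pending;    /* number of ordinary queued events */
    int             terminate;  /* nonzero once a terminate event is queued */
} event_queue;

/* Event thread: sleeps on the condition variable until an event or a
 * terminate request arrives, and exits only on terminate. */
static void *event_thread(void *data)
{
    event_queue *q = data;

    assert(q != NULL);
    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (!q->pending && !q->terminate)
            pthread_cond_wait(&q->cond, &q->lock);
        if (q->terminate)
            break;
        q->pending--;           /* "handle" one ordinary event */
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Queue a terminate event and wake the event thread. */
static void send_terminate_event(event_queue *q)
{
    assert(q != NULL);
    pthread_mutex_lock(&q->lock);
    q->terminate = 1;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
}

/* Start the event thread, optionally send the terminate event, then
 * join.  Returns 0 once the thread has been joined, -1 on error. */
int run_event_thread_demo(int send_terminate)
{
    event_queue q;
    pthread_t   tid;

    if (pthread_mutex_init(&q.lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&q.cond, NULL) != 0)
        return -1;
    q.pending   = 0;
    q.terminate = 0;

    if (pthread_create(&tid, NULL, event_thread, &q) != 0)
        return -1;
    if (send_terminate)
        send_terminate_event(&q);

    /* Without the terminate event nothing wakes the thread, so this
     * join would never return. */
    if (pthread_join(tid, NULL) != 0)
        return -1;

    pthread_cond_destroy(&q.cond);
    pthread_mutex_destroy(&q.lock);
    return 0;
}
```

Calling run_event_thread_demo(1) returns 0 once the thread has been
joined.  With an argument of 0 no terminate event is queued, and
pthread_join() would block indefinitely because nothing ever wakes the
waiting thread.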

-- c


Index: src/interpreter.c
===================================================================
RCS file: /cvs/public/parrot/src/interpreter.c,v
retrieving revision 1.258
diff -u -u -r1.258 interpreter.c
--- src/interpreter.c	16 Jan 2004 20:47:23 -0000	1.258
+++ src/interpreter.c	19 Jan 2004 02:10:34 -0000
@@ -1123,6 +1123,7 @@
      * wait for threads to complete if needed
      */
     if (!interpreter->parent_interpreter) {
+        Parrot_new_terminate_event(interpreter);
         pt_join_threads(interpreter);
     }
     /* if something needs destruction (e.g. closing PIOs)
