--- Begin Message ---
Package: subterfugue
Version: 0.2.1a-9.5
Severity: normal
Steps to reproduce:
1) apt-get source subterfugue
2) cd subterfugue*/test
3) make
4) sf ./fork
Expected results:
4) Both parent and child print something in the test:
parent: I see kid with pid = 12956
child: I'm here
Actual results:
4) Only parent manages to print something. After 10 seconds we get
asserotion failure from fork (probably on child side?):
parent: I see kid with pid = 22737
fork: ../nptl/sysdeps/unix/sysv/linux/i386/../fork.c:138: __libc_fork:
Assertion `({ __typeof (self->tid) __value; if (sizeof (__value) == 1) asm
volatile ("movb %%gs:%P2,%b0" : "=q" (__value) : "0" (0), "i"
(__builtin_offsetof (struct pthread, tid))); else if (sizeof (__value) == 4)
asm volatile ("movl %%gs:%P1,%0" : "=r" (__value) : "i" (__builtin_offsetof
(struct pthread, tid))); else { if (sizeof (__value) != 8) abort (); asm
volatile ("movl %%gs:%P1,%%eax\n\t" "movl %%gs:%P2,%%edx" : "=A" (__value) :
"i" (__builtin_offsetof (struct pthread, tid)), "i" (__builtin_offsetof (struct
pthread, tid) + 4)); } __value; }) != ppid' failed.
More info:
1) During those 10 seconds the child is T state according to ps.
2) "ps -eo wchan,cmd|grep fork" reveals that the wchan field of the
child is "ptrace" which means that the child probably does nothing
until the parent issues PTRACE_SYSCALL but it never does?
3) I see the problem in debian etch with kernels
linux-image-2.6.18-5-686-bigmem 2.6.18.dfsg.1-13etch4 and
linux-image-2.6.18-4-k7 2.6.18.dfsg.1-12etch2
4) The problem does not appear in debian sarge with
kernel-image-2.4.27-3-k7 2.4.27-10sarge5
5) Debug output (sf -d ./fork) under sarge says
child is 13014
pid 13014 stopped, signal = 5, ORIG_EAX = 11, EAX = 0
pid 13014 stopped, signal = 133, ORIG_EAX = 2, EAX = -38
ptrace.pokeuser(13014, 0, 8209)
ptrace.pokeuser(13014, 4, 0)
ptrace.pokeuser(13014, 12, 0)
ptrace.pokeuser(13014, 16, 13014)
ptrace.pokeuser(13014, 20, 2)
ptrace.pokeuser(13014, 44, 120)
[13015] new child, parent is 13014
pid 13015 stopped, signal = 133, ORIG_EAX = 120, EAX = 0
ptrace.pokeuser(13015, 0, 1075116224)
ptrace.pokeuser(13015, 4, 1)
ptrace.pokeuser(13015, 12, 1073833280)
ptrace.pokeuser(13015, 16, -1073743100)
ptrace.pokeuser(13015, 20, -1073743224)
ptrace.pokeuser(13015, 44, 2)
pid 13014 stopped, signal = 133, ORIG_EAX = 120, EAX = 13015
ptrace.pokeuser(13014, 0, 1075116224)
ptrace.pokeuser(13014, 4, 1)
ptrace.pokeuser(13014, 12, 1073833280)
ptrace.pokeuser(13014, 16, -1073743100)
ptrace.pokeuser(13014, 20, -1073743224)
ptrace.pokeuser(13014, 44, 2)
pid 13015 stopped, signal = 19, ORIG_EAX = 2, EAX = 0
pid 13014 stopped, signal = 133, ORIG_EAX = 4, EAX = -38child: I'm here
pid 13015 stopped, signal = 133, ORIG_EAX = 4, EAX = 16
parent: I see kid with pid = 13015
pid 13014 stopped, signal = 133, ORIG_EAX = 4, EAX = 35
pid 13015 stopped, signal = 133, ORIG_EAX = 175, EAX = 0
pid 13015 stopped, signal = 133, ORIG_EAX = 174, EAX = -38
pid 13014 stopped, signal = 133, ORIG_EAX = 175, EAX = -38
pid 13015 stopped, signal = 133, ORIG_EAX = 174, EAX = 0
pid 13014 stopped, signal = 133, ORIG_EAX = 175, EAX = 0
pid 13015 stopped, signal = 133, ORIG_EAX = 175, EAX = 0
pid 13014 stopped, signal = 133, ORIG_EAX = 174, EAX = -38
pid 13014 stopped, signal = 133, ORIG_EAX = 174, EAX = 0
# 10 second pause here
pid 13015 stopped, signal = 133, ORIG_EAX = 162, EAX = 0
pid 13014 stopped, signal = 133, ORIG_EAX = 162, EAX = 0
pid 13015 stopped, signal = 133, ORIG_EAX = 252, EAX = -38
pid 13015 stopped, signal = 133, ORIG_EAX = 252, EAX = -38
pid 13014 stopped, signal = 133, ORIG_EAX = 252, EAX = -38
pid 13015 stopped, signal = 133, ORIG_EAX = 1, EAX = -38
pid 13014 stopped, signal = 133, ORIG_EAX = 252, EAX = -38
pid 13014 stopped, signal = 17, ORIG_EAX = 252, EAX = -38
6) Debug output under etch says
child is 22820
pid 22820 stopped, signal = 5, ORIG_EAX = 11, EAX = 0
pid 22820 stopped, signal = 133, ORIG_EAX = 120, EAX = -38
ptrace.pokeuser(22820, 0, 18882577)
ptrace.pokeuser(22820, 16, 22820)
ptrace.pokeuser(22820, 20, 2)
pid 22820 stopped, signal = 133, ORIG_EAX = 120, EAX = 22821
ptrace.pokeuser(22820, 0, 18874385)
ptrace.pokeuser(22820, 16, -1209927384)
ptrace.pokeuser(22820, 20, -1078520376)
[22821] new child, parent is 22820
pid 22821 stopped, signal = 19, ORIG_EAX = 120, EAX = 0
pid 22820 stopped, signal = 133, ORIG_EAX = 4, EAX = -38
parent: I see kid with pid = 22821
pid 22820 stopped, signal = 133, ORIG_EAX = 4, EAX = 35
pid 22820 stopped, signal = 133, ORIG_EAX = 174, EAX = -38
pid 22820 stopped, signal = 133, ORIG_EAX = 174, EAX = 0
# 10 second pause here
fork: ../nptl/sysdeps/unix/sysv/linux/i386/../fork.c:138: __libc_fork:
Assertion `({ __typeof (self->tid) __value; if (sizeof (__value) == 1) asm
volatile ("movb %%gs:%P2,%b0" : "=q" (__value) : "0" (0), "i"
(__builtin_offsetof (struct pthread, tid))); else if (sizeof (__value) == 4)
asm volatile ("movl %%gs:%P1,%0" : "=r" (__value) : "i" (__builtin_offsetof
(struct pthread, tid))); else { if (sizeof (__value) != 8) abort (); asm
volatile ("movl %%gs:%P1,%%eax\n\t" "movl %%gs:%P2,%%edx" : "=A" (__value) :
"i" (__builtin_offsetof (struct pthread, tid)), "i" (__builtin_offsetof (struct
pthread, tid) + 4)); } __value; }) != ppid' failed.
7) I'll try to attach the strace output of both cases to this bug
report (but I'll do it in separate mail just in case it gets rejected
by BTS). Here's the imho relevant part from sarge output (comments
after #):
ptrace(PTRACE_POKEUSER, 13041, 4*ORIG_EAX, 0x78) = 0 # 0x78 = __NR_clone
ptrace(PTRACE_SYSCALL, 13041, 0, SIG_0) = 0 # let parent proceed with
clone
--- SIGCHLD (Child exited) @ 0 (0) ---
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == 133}], WUNTRACED|__WALL, NULL) =
13042 # got child pid
ptrace(PTRACE_PEEKUSER, 13042, 4*EDI, [0x32f1]) = 0
ptrace(PTRACE_PEEKUSER, 13042, 4*EBP, [0x2]) = 0
getpgid(0x32f2) = 13039
ptrace(PTRACE_PEEKUSER, 13042, 4*ORIG_EAX, [0x78]) = 0
ptrace(PTRACE_PEEKUSER, 13042, 4*EAX, [0]) = 0 # clone returned 0 in
child
open("/proc/13042/mem", O_RDWR|O_LARGEFILE) = 3
ptrace(PTRACE_POKEUSER, 13042, 4*EBX, 0x4014f8c0) = 0
ptrace(PTRACE_POKEUSER, 13042, 4*ECX, 0x1) = 0
ptrace(PTRACE_POKEUSER, 13042, 4*ESI, 0x40016540) = 0
ptrace(PTRACE_POKEUSER, 13042, 4*EDI, 0xbffffb04) = 0
ptrace(PTRACE_POKEUSER, 13042, 4*EBP, 0xbffffa88) = 0
ptrace(PTRACE_POKEUSER, 13042, 4*ORIG_EAX, 0x2) = 0 # 0x2 = __NR_fork
ptrace(PTRACE_SYSCALL, 13042, 0, SIG_0) = 0 # sf forces the child to
fork!
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], WUNTRACED|__WALL, NULL)
= 13042
ptrace(PTRACE_SYSCALL, 13042, 0, SIG_0) = 0 # sf lets child continue
after fork
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == 133}], WUNTRACED|__WALL, NULL) =
13042
ptrace(PTRACE_PEEKUSER, 13042, 4*ORIG_EAX, [0x4]) = 0 # 0x4 = __NR_write
ptrace(PTRACE_PEEKUSER, 13042, 4*EAX, [0xffffffda]) = 0
ptrace(PTRACE_SYSCALL, 13042, 0, SIG_0) = 0 # sf lets child do its
first write
strace in etch:
ptrace(PTRACE_PEEKUSER, 22868, 4*ORIG_EAX, [0x78]) = 0 # 0x78 = __NR_clone
ptrace(PTRACE_PEEKUSER, 22868, 4*ORIG_EAX, [0x78]) = 0
ptrace(PTRACE_PEEKUSER, 22868, 4*EAX, [0xffffffda]) = 0
ptrace(PTRACE_PEEKUSER, 22868, 4*EBX, [0x1200011]) = 0
ptrace(PTRACE_PEEKUSER, 22868, 4*ECX, [0]) = 0
ptrace(PTRACE_POKEUSER, 22868, 4*EBX, 0x1202011) = 0
ptrace(PTRACE_PEEKUSER, 22868, 4*ECX, [0]) = 0
ptrace(PTRACE_PEEKUSER, 22868, 4*EDX, [0]) = 0
ptrace(PTRACE_PEEKUSER, 22868, 4*ESI, [0]) = 0
ptrace(PTRACE_PEEKUSER, 22868, 4*EDI, [0xb7e57928]) = 0
ptrace(PTRACE_POKEUSER, 22868, 4*EDI, 0x5954) = 0
ptrace(PTRACE_PEEKUSER, 22868, 4*EBP, [0xbf869ab8]) = 0
ptrace(PTRACE_POKEUSER, 22868, 4*EBP, 0x2) = 0
ptrace(PTRACE_SYSCALL, 22868, 0, SIG_0) = 0 # let parent proceed
with clone
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == 133}], WSTOPPED|__WALL, NULL) =
22868 # parent finished syscall before child
--- SIGCHLD (Child exited) @ 0 (0) ---
ptrace(PTRACE_PEEKUSER, 22868, 4*ORIG_EAX, [0x78]) = 0
ptrace(PTRACE_PEEKUSER, 22868, 4*EAX, [0x5955]) = 0 # parent got child pid
in clone return value 0x5955 = 22869
open("/proc/22868/mem", O_RDWR|O_LARGEFILE) = 3
ptrace(PTRACE_POKEUSER, 22868, 4*EBX, 0x1200011) = 0
ptrace(PTRACE_POKEUSER, 22868, 4*EDI, 0xb7e57928) = 0
ptrace(PTRACE_POKEUSER, 22868, 4*EBP, 0xbf869ab8) = 0
ptrace(PTRACE_SYSCALL, 22868, 0, SIG_0) = 0 # let parent proceed
after clone
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], WSTOPPED|__WALL, NULL) =
22869 # child is done with clone
--- SIGCHLD (Child exited) @ 0 (0) ---
ptrace(PTRACE_PEEKUSER, 22869, 4*EDI, [0x5954]) = 0
ptrace(PTRACE_PEEKUSER, 22869, 4*EBP, [0x2]) = 0 # why does sf not say
PTRACE_SYSCALL to child to let it continue?
getpgid(0x5955) = 22866 # child is forgotten
and won't generate any more SIGCHLDs because it's stopped
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == 133}], WSTOPPED|__WALL, NULL) =
22868
--- SIGCHLD (Child exited) @ 0 (0) ---
ptrace(PTRACE_PEEKUSER, 22868, 4*ORIG_EAX, [0x4]) = 0 # 0x4 = __NR_write
ptrace(PTRACE_PEEKUSER, 22868, 4*EAX, [0xffffffda]) = 0
ptrace(PTRACE_SYSCALL, 22868, 0, SIG_0) = 0 # sf lets parent write
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == 133}], WSTOPPED|__WALL, NULL) =
22868
--- SIGCHLD (Child exited) @ 0 (0) ---
ptrace(PTRACE_PEEKUSER, 22868, 4*ORIG_EAX, [0x4]) = 0 # 0x4 = __NR_write
ptrace(PTRACE_PEEKUSER, 22868, 4*EAX, [0x23]) = 0 # parent write
returned 0x23 = 35 characters: "parent: I see kid with pid = 22869\"
ptrace(PTRACE_SYSCALL, 22868, 0, SIG_0) = 0 # let parent do more
syscalls
-- System Information:
Debian Release: 4.0
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-4-k7
Locale: LANG=C, LC_CTYPE=fi_FI (charmap=ISO-8859-1)
Versions of packages subterfugue depends on:
ii python 2.4.4-2 An interactive high-level object-o
ii python-central 0.5.12 register and build utility for Pyt
Versions of packages subterfugue recommends:
ii python-gtk2 2.8.6-8 Python bindings for the GTK+ widge
-- no debconf information
--- End Message ---