> IOW, 2 threads T1 and T2. T2 forks the child C. T1 ptraces C. C dies
> and becomes EXIT_ZOMBIE. It sends the notification to thread-group.
>
> Then, any thread does do_wait(). But since ptrace_reparented() = T
> we don't release C but send the notification again. This doesn't
> look right.

Technically, I think this really is "right". It just seems screwy because, well, the whole ptrace+wait interface is indeed screwy.

T1 is the ptracer, and is not the natural parent. Consider that T1 runs a piece of code (a library, or an isolated chunk in a giant complex program) that just got asked to trace C. It doesn't know anything about C, it just knows that PTRACE_ATTACH worked on it. So, it expects the usual behavior when it does waitpid(C) and gets !WIFSTOPPED: automatic detach, notification of the real parent, and the real parent's waits work.

Imagine T2 runs another piece of code that forks and waits for that child, and doesn't know anything else, e.g. it called system(). That code is isolated in the function, and all it expects of the rest of the (unknown) code in the process is that any wait calls are waitpid() selecting only a known child (or are in other threads using __WNOTHREAD, etc.), so nobody will steal its child.

These two isolated chunks of code have limiting (and perhaps short-sighted) assumptions. But things work out just right for them. (Naturally they have problems if both calls are in the same thread leaving the child alive in between, but imagine some current application that never does it that way.)

Now C dies and the sequence is:

	C dies -> wake_up_parent
	T1 wakes up, enters wait loop
	T2 wakes up, enters wait loop
	T1 sees C in wait_task_zombie() -> will report, about to untrace it
	T2 sees C in wait_task_zombie() -> task_ptrace(C) still true, skip it
	T1 untraces C
	T2 blocks again til 2nd wake_up_parent

If we were to omit the second do_notify_parent() as you suggest, then T2 stays blocked forever instead of reaping C. If we were to change ptrace_reparented() as you contemplate, then even after some other wakeup, T2 would get -ECHILD.

Either way, the system call ABI compatibility is broken. It's just not an option, merits of interface choices aside.

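To make T2's side concrete, here is a minimal userspace sketch of the kind of isolated fork-and-wait code I have in mind. The helper name run_and_reap() is hypothetical and purely illustrative; it is not taken from any real program or from the kernel. It only shows the pattern of waiting for one known child and nothing else:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): fork a child that exits with
 * exit_code, then wait for exactly that child, much as system() would.
 * Returns the child's exit status, or -1 on error.
 */
int run_and_reap(int exit_code)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exit_code);    /* child: stand-in for exec'ing real work */

    int status;
    pid_t ret;

    /* Select only our own child, so we never steal anybody else's. */
    do {
        ret = waitpid(pid, &status, 0);
    } while (ret < 0 && errno == EINTR);

    if (ret != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If another thread happens to be ptracing the child when it dies, this waitpid() is the call that has to sleep through the first wakeup and be woken a second time once the tracer has let go. Code like this has no way to know that, and it should not need to.
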
Note for this case it now works right when both use just __WNOTHREAD, which a caller "trying to be smart about it" might reasonably do. T1 is seeing C on its ->ptraced, and T2 is seeing (skipping) C on its ->children list. When everybody uses __WNOTHREAD, I bet they'd think that ptrace_reparented() losing that distinction is pretty counterintuitive.

Thanks,
Roland