I seem to have found the problem, and I do not see a simple solution. Will try to think more tomorrow with a fresh head, but perhaps you can help. Looks like we need some changes in the utrace/signal layer.

Suppose that a tracee is going to report, say, PTRACE_EVENT_FORK. The callback returns UTRACE_STOP. Of course, utrace_report_xxx() does not do finish_resume_report() or utrace_stop(); we rely on utrace_resume() or utrace_get_signal(), which should call utrace_stop() eventually.

Now suppose that the tracee's sub-thread initiates the group-stop. In this case the tracee will call do_signal_stop() without doing utrace_resume()/utrace_get_signal(). And this means the tracee will stop in TASK_STOPPED with the wrong ->exit_code.

ptrace can hook ->report_jctl(), but this can't help. We can do

--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1569,6 +1569,10 @@ static int do_signal_stop(int signr)
 				signal_wake_up(t, 0);
 			}
 	}
+
+	// tracehook_notify_jctl() can change ->exit_code.
+	// If we won't stop ->exit_code will be cleared anyway
+	current->exit_code = sig->group_exit_code;
 	/*
 	 * If there are no other threads in the group, or if there is
 	 * a group stop in progress and we are the last to stop, report
@@ -1584,7 +1588,6 @@ static int do_signal_stop(int signr)
 	if (sig->group_stop_count) {
 		if (!--sig->group_stop_count)
 			sig->flags = SIGNAL_STOP_STOPPED;
-		current->exit_code = sig->group_exit_code;
 		__set_current_state(TASK_STOPPED);
 	}
 	spin_unlock_irq(&current->sighand->siglock);

But this doesn't really help either. tracehook_notify_jctl() unlocks ->siglock; if SIGCONT comes in between, we can't notice it. In this case the stop reports will be wrong again.

What do you think?

Oleg.