I'll point out that the only symptom that matters to ptrace is ->exit_code and that issue disappears if you stop overloading it for ptrace purposes, which is the clean thing to do in the long run anyway. But there are subtler issues that haven't yet directly affected ptrace.
In this scenario, the event has already happened and that's externally visible (the child exists). (In the equivalent case for exec, you can tell from /proc/pid/exe, the effects of setuid have happened, etc.) We don't have the option of saying that the group-stop happened first. If that were so, this event would not have happened yet, but it's already been both reported via utrace and become observable by userland. So it should appear as if the other thread started its group-stop after we were already in TASK_TRACED. There is no need or benefit I see in any substantive change to the utrace API reflecting this. It's just that group-stop is an event that can happen after other events, like syscall-exit is. In other versions of this race, we'll already be in utrace_stop() or on the way to it, and hit "we must participate in the bookkeeping." But the bookkeeping is not really enough. In the case where we were the last thread we do set SIGNAL_STOP_STOPPED, but we swallow the parent's notification. Of course you don't notice this in ptrace since it swallows that notification anyway (and since this is a rare race). But for the general case of utrace, it's wrong as it is. So utrace_stop() ought to do the parent notification in this case. That is, the whole middle part of the do_signal_stop() and tracehook_notify_jctl() logic. i.e., utrace_report_jctl() and then both set ->exit_code and call do_notify_parent_cldstop(), modulo ptrace interference. Hmm, perhaps ->exit_code is only ever set now because of ptrace, at least since your task_stopped_code() changes? Perhaps we should just nix setting it in do_signal_stop() too (after new ptrace). Now, back to your case, a different ordering of that same race. There is nothing really wrong with having utrace's hook be before the group_stop_count check in get_signal_to_deliver(). But that would require changing the tracehook interface there, and the existing tracehook_get_signal() seems about as clean as we can make it. It seems fine to me that we get into do_signal_stop(). If we got into utrace_get_signal()->utrace_stop() first, it would just be the case above where we are acting like do_signal_stop() ran anyway. We can change tracehook_notify_jctl() to call utrace_report_jctl() for any nonzero utrace_flags, and check UTRACE_EVENT(JCTL) inside. (Really it could check (UTRACE_EVENT(JCTL) | ENGINE_STOP), but that's an internal bit that only utrace.c itself knows about.) If (after the report_jctl pass if any) anybody wants UTRACE_STOP, then we handle it with utrace_stop() right there (for what==CLD_STOPPED only). That handles the pending group_stop_count by the (fixed) logic above, so that when we come back out from TASK_TRACED (assuming no unrelated new stops) group_stop_count==0 and do_signal_stop() falls through its no-op case. For that plan, we have to give utrace_report_jctl() a return value that tracehook_notify_jctl() passes back so it can return 0 when utrace_stop() ran (and did any notification itself). But I think that flies, is perhaps moderately clean, and does not need any changes outside tracehook/utrace. What do you think? Thanks, Roland