Incomplete reply, just can't read/think/concentrate today...

On 10/07, Roland McGrath wrote:
>
> > We had a lengthy discussion about this.
>
> Yes.  I only ever wanted that revert then because it was too late in the
> 2.6.30 cycle to hash this all out and get it really right.  I meant that
> we should leave wrong enough alone in 2.6.30 but get it all worked out
> more properly in 2.6.31, but I forgot to follow up on it.  If we can
> iron out the behavior now and the upstream version of implementing it is
> not big new hair, it might still be possible to get it fixed in 2.6.32.
>
> That piece of implementation is 100% wrong.  But we have to figure out
> what the manifest semantics are today from the userland perspective and
> decide what exactly we want them to be before we implement those precise
> semantics in some sensible way.

Yes. In particular, ptrace(PTRACE_DETACH, SIGKILL) should cancel
SIGNAL_STOP_STOPPED, yes?

> > -		sig->flags = SIGNAL_STOP_STOPPED;
> > +		sig->flags = SIGNAL_STOP_STOPPED |
> > 				SIGNAL_STOP_DEQUEUED;
>
> Boy, do I not understand why that does anything about this at all!
> But I am barely awake tonight.  Ok, I guess I do sort of if it goes
> along with some other patch to set SIGNAL_STOP_STOPPED.  But since
> you've verified you really understand what happens, you can tell us!

Two threads T1 and T2, both ptraced by P, both TASK_TRACED, T2 sleeps in
ptrace_signal().

P does:

	ptrace(DETACH, T1, SIGSTOP);
	ptrace(DETACH, T2, SIGSTOP);

The first DETACH wakes up T1, it dequeues SIGSTOP, calls do_signal_stop().
T2 is still TASK_TRACED, this means T1 completes the group-stop and sets
sig->flags = SIGNAL_STOP_STOPPED.

The second detach wakes up T2, it returns from ptrace_signal() and calls
do_signal_stop() which does nothing without SIGNAL_STOP_DEQUEUED.

But please remember, the patch above is not complete of course and
currently I do not see the good solution. I am starting to think we
should forget about these bugs, merge utrace-ptrace, and then try to fix
them.

Even the first detach can fail to stop T1, because SIGNAL_STOP_DEQUEUED
can be cleared before.

I never knew what user-space actually does with ptrace, now I am really
surprised gdb/etc assume it can trust ptrace(SIGSTOP). Sometimes it works,
but only by accident.

Oleg.
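
To make the T1/T2 sequence easier to follow, here is a toy user-space
model of it. This is only a sketch and not kernel code: the flag values,
the toy_* names, and the simplified "stop completes when no thread is
running" rule are all assumptions made for illustration. It only tries to
show why the second do_signal_stop() is a no-op unless the completed
group-stop leaves SIGNAL_STOP_DEQUEUED set.

```c
#include <stdbool.h>

/* Illustrative flag values only, not copied from the kernel headers. */
enum {
	SIGNAL_STOP_STOPPED  = 1 << 0,
	SIGNAL_STOP_DEQUEUED = 1 << 1,
};

enum toy_state { TOY_RUNNING, TOY_TRACED, TOY_STOPPED };

/* Hypothetical stand-in for the shared signal state of a two-thread group. */
struct toy_group {
	unsigned flags;          /* models sig->flags */
	unsigned stopped_flags;  /* value stored when the group-stop completes */
	enum toy_state state[2]; /* state[0] is T1, state[1] is T2 */
};

/*
 * Toy do_signal_stop(): does nothing unless SIGNAL_STOP_DEQUEUED is set.
 * Otherwise the caller stops, and if no thread is left running the
 * group-stop is complete and flags are overwritten with stopped_flags.
 */
static bool toy_do_signal_stop(struct toy_group *g, int self)
{
	int i;

	if (!(g->flags & SIGNAL_STOP_DEQUEUED))
		return false;

	g->state[self] = TOY_STOPPED;
	for (i = 0; i < 2; i++)
		if (g->state[i] == TOY_RUNNING)
			return true;	/* stop still in progress */

	g->flags = g->stopped_flags;
	return true;
}

/* Replays the two detaches; returns true if T2 ends up stopped. */
static bool toy_t2_stops(unsigned stopped_flags)
{
	struct toy_group g = {
		.flags = 0,
		.stopped_flags = stopped_flags,
		.state = { TOY_TRACED, TOY_TRACED },
	};

	/* ptrace(DETACH, T1, SIGSTOP): T1 wakes and dequeues SIGSTOP. */
	g.state[0] = TOY_RUNNING;
	g.flags |= SIGNAL_STOP_DEQUEUED;
	toy_do_signal_stop(&g, 0);	/* T2 is traced, so the stop completes */

	/* ptrace(DETACH, T2, SIGSTOP): T2 returns from ptrace_signal(). */
	g.state[1] = TOY_RUNNING;
	toy_do_signal_stop(&g, 1);

	return g.state[1] == TOY_STOPPED;
}
```

In this model toy_t2_stops(SIGNAL_STOP_STOPPED) returns false (T2 keeps
running), while toy_t2_stops(SIGNAL_STOP_STOPPED | SIGNAL_STOP_DEQUEUED)
returns true. It does not model the other problem mentioned above, where
SIGNAL_STOP_DEQUEUED is cleared before T1 gets to do_signal_stop().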