On 10/15, Roland McGrath wrote:
>
> > Yes, except this doesn't really work. We have a lot of races, afaics,
> > even without ptrace. The problem is, once we drop ->siglock, we can't
> > trust SIGNAL_STOP_DEQUEUED. And siglock is dropped by dequeue_signal(),
> > it is dropped by get_signal_to_deliver(), by ptrace_signal/utrace too.
>
> If you think there are real problems unrelated to ptrace, we should be
> discussing that separately on LKML.

Agreed. And iirc we already did, but I can't recall how that discussion
ended. My point was, the discussed problems with ptrace && stop probably
are not ptrace-only, we need other changes. Hopefully we can address them
after v1.

> Myself, I do care that we get all the cases right by the anal POSIX rules.
> But I'm pretty sure that upstream at large will only ever actually care
> about the SIGKILL case. So that's the caveat about how much to worry.

Yes, I think we can't race with SIGKILL.

> > Example. get_signal_to_deliver() dequeues, say, SIGTTIN, drops ->siglock
> > for is_current_pgrp_orphaned().
> >
> > SIGCONT comes, clears SIGNAL_STOP_DEQUEUED - we shouldn't stop.
>
> I've been trying to think of a way to define it away is OK, but I can't
> quite get there. So I think we do indeed have a problem there. I wonder
> if we can hash that out independent of ptrace stuff.

Yes, I think this is solvable. Iirc, I even did a fix long ago, but I
can't recall it.

But I'd prefer to delay this discussion unless you think we should fix
this right now. I mean, I just can't think about this until we solve all
known problems in utrace-ptrace. When I try to think about 2 different
problems at the same time, I can't make any progress ;)

> > Say, ptrace(DETACH/CONT, SIGSTOP). This should work, this means
> > SIGNAL_STOP_DEQUEUED should be set even before the tracee calls
> > do_signal_stop(). But otoh it doesn't look right to set this flag
> > each time the tracee sees a stop signal from the debugger (especially
> > on detach), ->real_parent should not see multiple notifications.
>
> I don't think an extra SIGCHLD/wakeup here is going to be considered a
> problem, in the grand scheme of things. I can't really see any plausible
> way we'd (want to) preserve whether the real_parent already thought it was
> stopped or not.

I dunno. I am not arguing, I just don't know.

But stopped-attach-transparency seems to check this.
I am not sure, but "Excessive waiting SIGSTOP after it was already waited for\n"
looks like it does. I'll re-check.

Oleg.