> > T1 takes T2->ptrace_mutex to examine T2->ptraced. > > In that case we are fine. But we want to not take the lock when > list_empty(&tsk->ptraced), and we may see list_empty() == T.
Yes. I still haven't thought through how to make that fast-path check right. > OK. For the moment, please forget about these changes. Let's recall > do_wait() has the ancient bug. wait_task_zombie() sets EXIT_DEAD > unconditionally, then drops tasklist. If we are not the real_parent > and the child was traced, we may restore >exit_state = EXIT_ZOMBIE > later. But, if the ->real_parent calls do_wait() in between, it can > see the child in EXIT_DEAD state and return -ECHLD. Really? Won't that do_wait hit (wait_consider_task): if (likely(!ptrace) && unlikely(p->ptrace)) { /* * This child is hidden by ptrace. * We aren't allowed to see it now, but eventually we will. */ *notask_error = 0; return 0; } ? p->ptrace is still set until the tracer gets write_lock_irq(&tasklist_lock). > (and we have other minor problems with EXIT_DEAD tasks on ->children). Which? (We can make this a separate thread if it's not really apropos just to ptrace.) > Perhaps we can start with this change, > > wait_task_zombie: I don't think I followed the intent of this change. Using write_lock_irq at all in the non-ptrace reap case seems bad. > Now return to the topic. Perhaps we can do something "hybrid" for the start? > To avoid the discussed races with do_wait(), PTRACE_ATTACH/PTRACE_DETACH can > temporary take tasklist just for setting PT_PTRACED and adding to ->ptraced > list. write_lock_irq inside ptrace_mutex should be safe. But does that leave us still with the current ugly tasklist_lock+task_lock dance? Moreover, we want to find the path that gets us (eventually) to not using tasklist_lock at all for ptrace. > Or, _perhaps_ we can just add a barrier, so that if do_wait() sees > PT_PTRACED it must see !list_empty(->ptraced). That's nice if it works. Another wrinkle to consider is that in the long run we probably ->ptrace to go away entirely. (If ptrace becomes purely utrace-based, we don't need any tracee state details in task_struct directly except to make the do_wait exclusion work.) So eventually we should consider different ways to store the "stolen by ptrace" bit for real_parent's wait to track, since that bit will be the only state left for "core" code. > In that case, ptrace_do_wait() does not need ->ptrace_mutex, it can iterate > over ->ptraced list lockless (but perhaps we should use list_for_each_rcu). There is no inherent reason to avoid ptrace_mutex or try real hard to be lockless for its own sake. It's only the common case of no ptrace use where we'd like to short-circuit. Note that another angle on this could be to exploit the idea I'd intended to be a later optimization: mutex, ptraced in a struct allocated on demand. Then the short-circuit test is simply if (!tsk->ptrace_info). On the first use of ptrace that allocates ptrace_info, it can do some extra futzing (that needn't be very efficient) to avoid the do_wait race cases we've been coming up with. (Of course, this all is very vague, just for discussion. ;-) > We only need ->ptrace_mutex to untrace the task, so we just modify > wait_task_zombie() further: I don't think I can grok this before I've figured out what I missed about the idea above. > Of course, this all is very vague, just for discussion. That's the way I like it! ;-) Thanks, Roland