On 04/16, Roland McGrath wrote: > > (I just didn't want anything easier to get blocked > while waiting for feedback from me.)
I have a couple of questions about other cleanups... Will write another email. > Yes. I think we can do it with two hacks, which means no more than four or > five actual hacks. > > [... snip ...] I _seem_ to understand what you suggest, and I _think_ this can work. But I need to re-read this again (probably more than 10 times ;) to convince myself I really understand. But see below. > > Two threads T1 and T2. T1 has a TASK_STOPPED child C. T2 in the middle > > of sys_ptrace(PTRACE_ATTACH, C). > > You are a bad, bad man. > > > T1 does do_wait(WSTOPPED). It is possible that we already see C->ptrace > > (so do_wait_thread(T1->children) just clears *notask_error), but we > > don't see C in T2->ptraced list. > > "In the middle" of PTRACE_ATTACH means T2 holds T2->ptrace_mutex. Before > it takes the mutex, C->ptrace is clear (unless racing just after D does > PTRACE_DETACH to make T2's PTRACE_ATTACH possible). > > T1 takes T2->ptrace_mutex to examine T2->ptraced. In that case we are fine. But we want to not take the lock when list_empty(&tsk->ptraced), and we may see list_empty() == T. > Before that, say C->ptrace was set from prior attach by D. T1 saw > C->ptrace in do_wait_thread and did not report C. Now, D does > sys_ptrace(PTRACE_DETACH, C). Hmm, yes, another corner case... OK. For the moment, please forget about these changes. Let's recall do_wait() has the ancient bug. wait_task_zombie() sets EXIT_DEAD unconditionally, then drops tasklist. If we are not the real_parent and the child was traced, we may restore >exit_state = EXIT_ZOMBIE later. But, if the ->real_parent calls do_wait() in between, it can see the child in EXIT_DEAD state and return -ECHLD. (and we have other minor problems with EXIT_DEAD tasks on ->children). Perhaps we can start with this change, wait_task_zombie: get_task_struct(p); read_unlock(&tasklist_lock); write_lock_irq(&tasklist_lock); if (p->exit_state != TASK_ZOMBIE) { write_unlock_irq(&tasklist_lock); // This is extremely unlikely case // return something which forces // "goto repeat" in do_wait(). return -E_GOTO_REPEAT; } if (ptrace_reparented(p)) { ptrace_unlink(p); .... } p->exit_state = EXIT_DEAD; list_del_init(p->sibling); // not really needed but good ... Now return to the topic. Perhaps we can do something "hybrid" for the start? To avoid the discussed races with do_wait(), PTRACE_ATTACH/PTRACE_DETACH can temporary take tasklist just for setting PT_PTRACED and adding to ->ptraced list. Or, _perhaps_ we can just add a barrier, so that if do_wait() sees PT_PTRACED it must see !list_empty(->ptraced). In that case, ptrace_do_wait() does not need ->ptrace_mutex, it can iterate over ->ptraced list lockless (but perhaps we should use list_for_each_rcu). We only need ->ptrace_mutex to untrace the task, so we just modify wait_task_zombie() further: get_task_struct(p); read_unlock(&tasklist_lock); // since we dropped tasklist, we should return either // success or -E_GOTO_REPEAT; if (p->ptrace) { lock_parents_ptrace_mutex(p); if (p is not traced) { mutex_unlock(); return -E_GOTO_REPEAT; } untrace(); mutex_unlock(); } write_lock_irq(&tasklist_lock); ... try to reap ... Now. With these changes, nobody except de_thread() can call release_task(p) when p is ptraced. So the "untrace" part above can go into the separate helper, de_thread() calls it before release_task(leader). tracehook_finish_release_task() can be killed. Of course, this all is very vague, just for discussion. Oleg.