On 04/13, Roland McGrath wrote: > > > I tried to think about the first steps in ptrace-cleanup, and I > > need your help. > > I think I said that list was "not necessarily in this order" and I > certainly meant it so.
Yes sure. But I think this part is most important and most hard, and other changes may depend on locking very much. > We want the new lock to be a mutex. I don't think we want to hold both > locks, though it won't hurt to have tasklist_lock taken (appropriately > briefly) while holding the mutex. Agreed. > But I want to point out here that the > big-picture cleanup we are getting here is having ptrace machinations never > take tasklist_lock. Yes. > That is not something you can see as an immediate good > by reading diffstats or whatever such micro-level view of "cleanup". But > with a larger view, it is a huge boon for the desireable characteristics in > the kernel overall, in performance scaling, proper separation of concerns, > etc. So this large good is what to consider against the smaller costs of > some repetition or complexity in the code at the micro-level. (Of course, > we always want to make it as clean and concise as we can at every level.) Sure! The problem is I still can't see how can we do this without complicating the current code "too much". I spent several hours doing nothing except thinking about these changes, but all I can say now is: I need to think more ;) And of course I appreciate your ideas. Just for example, it would be really nice to avoid taking current->ptrace_mutex in exit_ptrace() when there are no tracees, but I don't see how can we do this without races with ptrace_traceme(). But this is a minor detail. > The tasklist_lock for our purposes here is only taken to keep the > next_thread() loop valid. (For non-ptrace do_wait_thread() it serves other > purposes too, but leave that aside for now.) In ptrace_do_wait(), we just > need to make sure that loop (in its caller) does not go awry. What I think > we can do is some version of: > > get_task_struct(tsk); > read_unlock(&tasklist_lock); > mutex_lock(&tsk->ptrace_mutex); > ... > mutex_unlock(&tsk->ptrace_mutex); > read_lock(&tasklist_lock); > ... check for tsk->thread_group having been invalidated ... > put_task_struct(tsk); Yes we need ptrace_mutex in ptrace_do_wait(). And I also thought about dropping tasklist_lock in ptrace_do_wait(). But this doesn't work. Consider 2 threads, T1 and T2. T2 forks the child C. T1 calls do_wait(). It scans T->children and finds nothing. Then it calls ptrace_do_wait() which temporary drops tasklist. T2 exits, reparents C to T1. T1 finds nothing interesting in ->ptraced (not that it matters, but it is possible the list was not empty when we enter ptrace_do_wait, but it is empty when we take ->ptrace_mutex). T1 takes tasklist again. We check tsk == T1 has not exited (or we some other check) and continue. do_wait() returns -ECHILD. We need something more clever here. > 1. We really want to short-circuit and not do any lock fiddling in the > common case of list_empty(&tsk->ptraced). Completely agreed. > (Or that it just doesn't > matter, because the attach/traceme can be said to be "after" the wait > call for this race.) Yes... but we can have some nasty corener cases which are not bugs, but still not good. Two threads T1 and T2. T1 has a TASK_STOPPED child C. T2 in the middle of sys_ptrace(PTRACE_ATTACH, C). T1 does do_wait(WSTOPPED). It is possible that we already see C->ptrace (so do_wait_thread(T1->children) just clears *notask_error), but we don't see C in T2->ptraced list. Oh. Will try to think more ;) Oleg.