On 04/16, Roland McGrath wrote:
>
> Using write_lock_irq at all in the non-ptrace reap case seems bad.

Agreed. But note that, once we change wait_task_zombie() to untrace
the task first (under ->ptrace_mutex), we can switch to read_lock()
again.

> > Now return to the topic. Perhaps we can do something "hybrid" for the start?
> > To avoid the discussed races with do_wait(), PTRACE_ATTACH/PTRACE_DETACH can
> > temporary take tasklist just for setting PT_PTRACED and adding to ->ptraced
> > list.
>
> write_lock_irq inside ptrace_mutex should be safe.  But does that leave us
> still with the current ugly tasklist_lock+task_lock dance?

Oh, this dance should be killed anyway. In fact I think currently we
don't need task_lock() at all, but I am not sure, I forgot the results
of my previous grepping... will do again.

> Moreover, we
> want to find the path that gets us (eventually) to not using tasklist_lock
> at all for ptrace.

Yes. But only DETACH/ATTACH needs tasklist. Perhaps this can be
"good enough" for the start... Not that I am sure it will be easy to
improve things later, though.

> > In that case, ptrace_do_wait() does not need ->ptrace_mutex, it can iterate
> > over ->ptraced list lockless (but perhaps we should use list_for_each_rcu).
>
> There is no inherent reason to avoid ptrace_mutex or try real hard to be
> lockless for its own sake.  It's only the common case of no ptrace use
> where we'd like to short-circuit.

Agreed, but still it is better to delay unlock(tasklist)+mutex_lock(ptrace)
until we know this is really needed. Only wait_task_zombie's path needs
this. See also below.

> Note that another angle on this could be to exploit the idea I'd intended
> to be a later optimization: mutex, ptraced in a struct allocated on demand.

Sure, I think this should be covered by "ptracer (parent) data structures 
cleanup"
we didn't discuss yet.

> Then the short-circuit test is simply if (!tsk->ptrace_info).

But, from the correctness pov, this doesn't differ from checking
!list_empty(->ptraced) ? I mean, the condition can be changed under us
in both cases, and we seem to have the same corner cases.

But of course I agree, this is better. I didn't try to pay much attention
to these changes just because I thought (perhaps wrongly) we can ignore
them in this discussion.

> I don't think I can grok this before I've figured out what I missed about
> the idea above.

I think I see the very "strategic" reason to fix wait_task_zombie first ;)

It would be nice if the locking changes will not uglify the code from the
friendly lkml reviwers pov. But surely, this unlock/lock/recheck/repeat
dance (which we can't avoid) doesn't look very good.

But what if we have a reson for this dance? Say, the old bug in do_wait.
Then we can introduce this complication to fix the bug, and only then
send these changes on top.


Perhaps I should try to make some draft patches, just to see how it can
look. But of course the discussion is still in progress ;)

Or I should start with "ptrace child" cleanup first... (will reply to your
email about the child cleanups later).

Oleg.

Reply via email to