I took it from the lack of Signed-off-by: lines that you probably didn't
want me to merge these.  Instead I applied them all to a new branch
utrace-ptrace-reuse-engine:
        magilla 135 % git l utrace-ptrace..utrace-ptrace-reuse-engine
        85f8b3a detach should reset the context of self-detaching engine
        a27233a attach: try to re-use the self-detaching engine
        8667615 ptrace_notify_stop: fix engine leak
        3d5d053 ptrace_detach_task: don't use engine ptr before IS_ERR(engine)
        01875c7 fold detach_signal() into ptrace_detach_task()
        464c2b7 don't detach the engine with the parting signal
        97b345c implement the basic detach-with-signal logic

I found it far easier to read utrace-ptrace..utrace-ptrace-reuse-engine as
a single diff.  Some miscellaneous comments:

+                       if (valid_signal(sig))
+                               context->signr = sig;
+                       context->options = 0;
+                       context->eventmsg = 0;
+                       context->stop_code = 0;
+                       /* make sure do_wait() can't succeed */
+                       tracee->exit_code = 0;

Since you allocate struct ptrace_context with kzalloc, it would seem
cleanest here to use memset to clear it all first.  The only thing you
don't clear explicitly here is siginfo.  You need it preserved for the
resume_signal logic to use.  But IMHO it's cleaner to make that explicit in
some comments and extract and restore it around the memset.  That should be
the one difference in the data structures between fully detaching and then
doing a separate full attach, right?

Also it feels like there should be a way to consolidate that switch somehow
with the first one in do_ptrace_resume, so e.g. the syscall->send_sig logic
is all in one place. 

+               if (context->resume == UTRACE_DETACH) {
+                       /* we rely on ->cred_guard_mutex */
+                       context->resume = UTRACE_RESUME;

Ok, so this means that when ->resume == UTRACE_DETACH then the only other
thing that could change ->resume is another thread in the same code path.
That path olds cred_guard_mutex, so it excludes other threads.  Right?

+                       /*
+                        * Make sure we don't race with ptrace_report_signal()
+                        */
+                       utrace_barrier(tracee, engine);
+                       if (engine->ops == &ptrace_utrace_ops)
+                               goto finish;

I really don't understand this part.  Firstly, what's the race?  Secondly,
what is that ->ops check about??  That is clearly not something that the
ptrace layer ought to be fiddling with.  If what you mean to detect here is
that the tracee died or was detached while we waited for things to calm
down, then you should just check the return value from utrace_barrier.
(That will handily avoid the new __must_check warning you should be seeing
for calling utrace_barrier without checking its return value, too!)


I know we went around on this idea before, and we don't need to rehash the
idea if you still don't like it.  But the hair like clearing out the data
structure for this approach makes me think the "second special engine"
approach really might look better.  That is, detaching with a signal
attaches a new engine with a different ops vector and then detaches the
real ptrace engine.  Then on successful attach, you do the look-up for any
ptrace_detached_ops engine, transfer over its ctx->signr and ->siginfo to
your fresh ptrace_context, detach it and free the old context.  It has more
data structures live momentarily than you really need, but it seems cleaner.


Thanks,
Roland

Reply via email to