On 10/11, Roland McGrath wrote: > > I took it from the lack of Signed-off-by: lines that you probably didn't > want me to merge these.
Ah, no. I never added mine sob's in these long series, because I thought it doesn't matter. > Instead I applied them all to a new branch > utrace-ptrace-reuse-engine: > magilla 135 % git l utrace-ptrace..utrace-ptrace-reuse-engine > 85f8b3a detach should reset the context of self-detaching engine > a27233a attach: try to re-use the self-detaching engine > 8667615 ptrace_notify_stop: fix engine leak > 3d5d053 ptrace_detach_task: don't use engine ptr before IS_ERR(engine) > 01875c7 fold detach_signal() into ptrace_detach_task() > 464c2b7 don't detach the engine with the parting signal > 97b345c implement the basic detach-with-signal logic OK, great. > I found it far easier to read utrace-ptrace..utrace-ptrace-reuse-engine as > a single diff. Some miscellaneous comments: Yes, as usual the changes were chaotic. I tend to send the changes even if I know they should be redone/cleanuped later, to get your early review and to discuss asap. > + if (valid_signal(sig)) > + context->signr = sig; > + context->options = 0; > + context->eventmsg = 0; > + context->stop_code = 0; > + /* make sure do_wait() can't succeed */ > + tracee->exit_code = 0; > > Since you allocate struct ptrace_context with kzalloc, it would seem > cleanest here to use memset to clear it all first. The only thing you > don't clear explicitly here is siginfo. You need it preserved for the > resume_signal logic to use. But IMHO it's cleaner to make that explicit in > some comments and extract and restore it around the memset. That should be > the one difference in the data structures between fully detaching and then > doing a separate full attach, right? Mostly yes. But I think it would be much better to move this re-initialization into ptrace_attach_task(), in this case we can't do memset(0) + restore siginfo. > Also it feels like there should be a way to consolidate that switch somehow > with the first one in do_ptrace_resume, so e.g. the syscall->send_sig logic > is all in one place. I thought about this too. Currently I don't see how to do this cleanly. This all needs some obvious cleanups, then we will see if we can cleanup the code more and do some consolidation. > + if (context->resume == UTRACE_DETACH) { > + /* we rely on ->cred_guard_mutex */ > + context->resume = UTRACE_RESUME; > > Ok, so this means that when ->resume == UTRACE_DETACH then the only other > thing that could change ->resume is another thread in the same code path. > That path olds cred_guard_mutex, so it excludes other threads. Right? Yes. Other callers of ptrace_attach_task() are ptrace_traceme() and ptrace_clone_attach(), they can not hit this case. > + /* > + * Make sure we don't race with ptrace_report_signal() > + */ > + utrace_barrier(tracee, engine); > + if (engine->ops == &ptrace_utrace_ops) > + goto finish; > > I really don't understand this part. Firstly, what's the race? Secondly, > what is that ->ops check about?? That is clearly not something that the > ptrace layer ought to be fiddling with. If what you mean to detect here is > that the tracee died or was detached while we waited for things to calm > down, then you should just check the return value from utrace_barrier. No, the return calue from utrace_barrier() does not matter. And, if the tracee is killed we don't care. The race is different. So. The tracee reports the signal and stops. The tracer detaches, sets ->resume = UTRACE_DETACH and wakes up the tracee. A new tracer attaches, and changes ->resume back to UTRACE_RESUME. But, since the tracee is not stopped, this can be too later. It is possible that the tracee has already returned UTRACE_DETACH | ... from ptrace_report_signal(), in this case finish_callback() will mark this engine as detached soon or it has already detached it. That is why we do: context->resume = UTRACE_RESUME; utrace_barrier(); if (engine->ops == &ptrace_utrace_ops) SUCCESS! If ptrace_report_signal() has already seen resume = UTRACE_DETACH, this engine should be detached after utrace_barrier(). If ptrace_report_signal() was not called yet, it must see UTRACE_RESUME when it will be called. > (That will handily avoid the new __must_check warning you should be seeing > for calling utrace_barrier without checking its return value, too!) Argh. I didn't even know it is __must_check! OK, will fix... Not that I personally like these __must_check's though ;) > I know we went around on this idea before, and we don't need to rehash the > idea if you still don't like it. But the hair like clearing out the data > structure for this approach makes me think the "second special engine" > approach really might look better. That is, detaching with a signal > attaches a new engine with a different ops vector and then detaches the > real ptrace engine. Agreed, this all is very subjective. I hate the idea to attach another engine here, but can't prove it is really bad. Firstly, I dislike it because it is very unlikely another tracer will attach before the tracee handles this signal, a bit ugly we should arrange the new engine "just in case". And in theory kmem_cache_alloc() can fail, not good. But mostly I dislike the fact we must teach old and new engines to cooperate. > Then on successful attach, you do the look-up for any > ptrace_detached_ops engine, transfer over its ctx->signr and ->siginfo to > your fresh ptrace_context, detach it and free the old context. It has more > data structures live momentarily than you really need, but it seems cleaner. Again, this is subjective, but I'd like to avoid these complications. And. probably I missed something, but it is not trivial to implement. A lot of problems to solve. Just for example, when we attach the new engine, its ->flag has no UTRACE_EVENT(SIGNAL) yet. If we steal ->siginfo we can "lose" this signal. If we set UTRACE_EVENT(SIGNAL) first, the new and old engine can race. Yes I thought about this, but afaics we have to fight with multiple annoying races. Again, again, I do not pretend I am sure this is really true. Oleg.