Re: PTRACE_EVENT_SIGTRAP
On 10/27, Roland McGrath wrote: This is used only in the UTRACE_SIGNAL_HANDLER case. That means after tracehook_signal_handler(), which is where a signal handler has just been set up. For reference, in old ptrace, tracehook_signal_handler() is: if (stepping) ptrace_notify(SIGTRAP); This means that when PTRACE_SINGLE{STEP,BLOCK},signr was used to deliver the handled signal, we stop there in ptrace_notify. This is in lieu of a proper single-step trap, with the theory that what you meant to step to next was the first instruction of the signal handler, rather than the second instruction of the handler. The only purpose of PTRACE_EVENT_SIGTRAP is to distinguish this case from PTRACE_EVENT_SIGNAL as it would be for a real SIGTRAP after a normal instruction step. The only purposes of this distinction are to make PTRACE_SETSIGINFO have no effect, and to make resumptions ignore the data/signr argument. The former makes it consistent with other cases in utrace-ptrace that replace ptrace_notify() calls in the old ptrace, but that itself is a known nit incompatibility with the old behavior, so it's superfluous at best. The latter is just intended for pure bug-compatibility with the old behavior. Aside from these, there are no other reasons not to have the UTRACE_SIGNAL_HANDLER case result in PTRACE_EVENT_SIGNAL, with siginfo_t filled in to match what the old ptrace_notify() does. Is that all correct? In short, PTRACE_EVENT_SIGTRAP provides bug compatibility for PTRACE_SINGLESTEP,signr silently acting like PTRACE_SINGLESTEP,0 when (s/PTRACE_SINGLESTEP/PTRACE_ANY/) called at the stop just provoked by the last PTRACE_SINGLESTEP,signr. Is that it? Yes. I think this is a piece of bug compatibility that upstream will be happy to have broken. That is, I bet most people don't realize at all PTRACE_SINGLESTEP,signr will sometimes swallow a signal after what appears to be a normal single-step stop. In fact, I suspect many might say that this is worse than the previous behavior my addition of this ptrace_notify() call (sometime before git so 2.6.12; Sep 2004, not sure what version that was, maybe 2.6.9). That behavior was that PTRACE_SINGLESTEP,signr would stop at the second instruction of the signal handler and never the first--but that would be a real step (like you now get after PTRACE_SINGLESTEP,0 when stopped in ptrace_notify() at the first instruction of the handler). I'm quite sure that at the time I never contemplated the signal-swallowing implication of using ptrace_notify() here. So, get rid of PTRACE_EVENT_SIGTRAP. Instead for the case of UTRACE_SIGNAL_HANDLER when stepping, initialize *info to look like a vanilla SIGTRAP and then fall into the default case for real signals. Yes, I already thought about this. The patch is trivial, please see below. Does that all sound right to you, or did I miss something? Don't ask me, I never know what can be changed ;) But yes, I agree. The current behaviour looks strange (if not wrong). With this change ptrace(PTRACE_WHATEVER, signr) after PTRACE_SINGLESTEP never ignores signr. Even if perhaps this is not really useful, this is consistent. Afaics, this case was a single exception when ptrace(signr) or PTRACE_SETSIGINFO) was ignored after PTRACE_SINGLESTEP. The only problem with this change (if we do it now), it is always nice to have a separate changelog to document the behaviour change. But hopefully nobody cares, I mean, nobody will notice the difference. Oleg. --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -73,7 +73,6 @@ struct ptrace_context { #define PTRACE_EVENT_SYSCALL_ENTRY (1 16) #define PTRACE_EVENT_SYSCALL_EXIT (2 16) -#define PTRACE_EVENT_SIGTRAP (3 16) #define PTRACE_EVENT_SIGNAL(4 16) /* events visible to user-space */ #define PTRACE_EVENT_MASK 0x @@ -495,8 +494,8 @@ static u32 ptrace_report_signal(u32 acti context-siginfo = NULL; if (resume != UTRACE_RESUME) { - set_stop_code(context, PTRACE_EVENT_SIGTRAP); - return UTRACE_STOP | UTRACE_SIGNAL_IGN; + info-si_signo = SIGTRAO; + break; } case UTRACE_SIGNAL_REPORT: @@ -509,21 +508,21 @@ static u32 ptrace_report_signal(u32 acti return resume | resume_signal(task, context-signr, info, return_ka); - default: - WARN_ON(context-siginfo); - context-siginfo = info; - /* -* context-siginfo points to the caller's stack. -* Make sure the subsequent UTRACE_SIGNAL_REPORT clears -* -siginfo before return from get_signal_to_deliver(). -*/ - utrace_control(task, engine, UTRACE_INTERRUPT); + } - context-stop_code =
Re: utrace_control(,,UTRACE_SINGLESTEP)
On 10/29, Oleg Nesterov wrote: In this case I confused again. Let's forget about get_user_pages() under spin_lock(), pretend it works. Two engines, E1 and E2, the tracee sleeps in utrace_resume()-utrace_stop(). E1 does utrace_control(UTRACE_RESUME), E2 does utrace_control(UTRACE_SINGLESTEP). How this can work? If E2 calls utrace_control() first, the subsequent UTRACE_RESUME does user_disable_single_step(), and (in general) E2 has no chance to re-assert SINGLESTEP. Looks like, 26fefca sticky resume action fixes this... Oleg.
Re: utrace-indirect branch
On 10/28, Roland McGrath wrote: Unlike the old code that freaked upstream reviewers all the way out, this has very simple lifetime rules for struct utrace. Once allocated, it lives until task_struct is freed. The utrace_task_alloc() logic covers the only race (at first attach), and that seems pretty easy to understand and be confident in. Yep, I already noticed utrace-indirect branch before. While I didn't look at the code closely, this change looks very simple and imho desirable. I'll try to read it carefully later. I am a bit concerned about letting the ugly utrace_struct.h way ever get merged in, so that to fix it later we have to convince people all over again about changing what works. But of course I am really not at all sure whether even this innocuous version will upset the reviewers. Yes, this is the question ;) My opinion, it would be better to send this as a separate patch, when the utrace is already merged. Because they were too many complaints about task_struct-utrace pointer before. And given that nobody actually read the code during review (at least this happened before), I am afraid we can have ugh, it is a pointer again? complaints. But of course I don't really sure. Speaking of this change. Do you think we can remove void *cookie from exit_notify() pathes before sending utrace/ptrace to lkml? Afaics, given that task-utrace never goes away utrace-indirect doesn't need this hack. Oleg.
Re: [PATCH 7/7] implement utrace-ptrace
On 10/27, Roland McGrath wrote: --- /dev/null 2009-10-25 19:46:00.608018007 +0100 +++ V1/kernel/ptrace-utrace.c 2009-10-26 03:56:46.0 +0100 Generally, needs more comments. That's not news, I'm sure. But just giving reactions as I would if doing upstream review without other context. +struct ptrace_context { + int options; + + int signr; + siginfo_t *siginfo; + + int stop_code; + unsigned long eventmsg; + + enum utrace_resume_action resume; +}; If you are lining up whitespace in field decls, do them all. This struct merits a top comment about what it is and where it hangs. Either in more top comment or interspersed, at least a little on what the fields are for. Oh, comments. Yes we need more. I can only say I will try to add more documentation, but this is not easy. +static inline bool ptrace_event_pending(struct ptrace_context *context) s/context/ctx/ throughout for variable names (but leave ptrace_context as the struct name). OK. + if (action != UTRACE_REPORT) + ptrace_context(engine)-stop_code = 0; + utrace_control(tracee, engine, action); __must_check OK. +/** + * ptrace_traceme -- helper for PTRACE_TRACEME Don't use /** kerneldoc magic. You're not matching the whole form, and these are not functions we extract docs for anyway. Heh, this was blindly copied from the old code, will remove that part. +void ptrace_notify_stop(struct task_struct *tracee) +{ + struct utrace_engine *engine = ptrace_lookup_engine(tracee); + + if (IS_ERR(engine)) { + // XXX: temporary check, wrong with mutlitracing + WARN_ON(tracee-state != TASK_RUNNING); + return; Explain this scenario at least a little in the comment. This WARN (and another XXX comment in ptrace_resume) should go away. +#ifdef PTRACE_SINGLESTEP + case PTRACE_SINGLESTEP: +#endif +#ifdef PTRACE_SINGLEBLOCK + case PTRACE_SINGLEBLOCK: +#endif +#ifdef PTRACE_SYSEMU + case PTRACE_SYSEMU: + case PTRACE_SYSEMU_SINGLESTEP: +#endif + case PTRACE_SYSCALL: + case PTRACE_CONT: + ret = ptrace_resume(child, engine, request, data); + break; Since ptrace_resume_action has a switch with these #ifdef's anyway, maybe make this the default case here and give that other switch a default error case. Hmm. Yes, at first glance this would be cleaner... + case PTRACE_KILL: + ret = 0; + if (!child-exit_state) /* already dead */ + ret = ptrace_resume(child, engine, request, SIGKILL); + break; This could be inside ptrace_resume too. Yes, and we don't even need to check exit_state. That fits better if you fold ptrace_resume_action into ptrace_resume, and I don't see any reason at all not to do that anyway. Well, I'd like to keep ptrace_resume_action(). ptrace_resume() is fat enough, and while ptrace_resume_action's code is simple, it is not easily readable. But yes, we can fold it although I don't understand why you don't like it, to me its logic deserves a separate function. If nothing else, ptrace_resume() becomes 100 lines after the folding. Oleg.
Re: utrace-cleanup branch
Btw, why does utrace_set_events() check utrace-stopped? If a tracee is stopped then -reporting == engine is not possible, -reporting must be NULL. To optimize out other checks + mb() in the likely stopped case? Oleg.
Re: utrace-cleanup branch
To optimize out other checks + mb() in the likely stopped case? Yes. Thanks, Roland
Re: utrace-cleanup branch
Can't comment right now, need to read the code. Such gentlemanly restraint. Afaics, we can't just remove utrace_finish_jctl() and the similar code in utrace_stop(). We need void utrace_finish_jctl(void) { struct utrace *utrace = task_utrace_struct(current); /* * While in TASK_STOPPED, we can be considered safely stopped by * utrace_do_stop(). Make sure we can do nothing until the tracer * drops utrace-lock */ if (unlikely(__fatal_signal_pending())) spin_unlock_wait(utrace-lock); } and utrace_stop() should do the same. I see. The comments should be more clear that SIGKILL breaking TASK_TRACED is the only way this can arise. In the jctl case, that is in particular after utrace_do_stop() changed TASK_STOPPED into TASK_TRACED. Otherwise, the killed tracee can start another reporting loop and list_for_each() can race with, say, utrace_reset(DETACH)-utrace_reset(). More generally, if the tracer sees it is stopped under utrace-lock, the tracee must be really stopped until we drop utrace-lock(), it must not escape from utrace_stop() or do_signal_stop(). Correct. So today -stopped really does still mean one other thing. It means that it's either still in TASK_TRACED or is just waking up for SIGKILL. I think we've discussed the related stuff before. A contrary approach would be to ensure that in the SIGKILL-breaks-UTRACE_STOP case we never make any more normal reports at all before utrace_report_death. It seems like a good API choice on its face: if you used UTRACE_STOP, you didn't expect to see any SYSCALL_EXIT, SIGNAL, etc. events--the special case is the instant-death scenario, so straight to report_death makes some sense. I was attracted by this until I started going through it. It's all good until you consider report_clone. If a SIGKILL is pending when you get to utrace_report_clone(), the clone has already happened. If you skip that callback, the engine doesn't find out about the new child, and that matters if it's not a CLONE_THREAD. Thanks, Roland
Re: utrace-cleanup branch
On 10/29, Roland McGrath wrote: void utrace_finish_jctl(void) { struct utrace *utrace = task_utrace_struct(current); /* * While in TASK_STOPPED, we can be considered safely stopped by * utrace_do_stop(). Make sure we can do nothing until the tracer * drops utrace-lock */ if (unlikely(__fatal_signal_pending())) spin_unlock_wait(utrace-lock); } and utrace_stop() should do the same. I see. The comments should be more clear that SIGKILL breaking TASK_TRACED is the only way this can arise. In the jctl case, that is in particular after utrace_do_stop() changed TASK_STOPPED into TASK_TRACED. Agreed. So today -stopped really does still mean one other thing. It means that it's either still in TASK_TRACED or is just waking up for SIGKILL. I think we've discussed the related stuff before. A contrary approach would be to ensure that in the SIGKILL-breaks-UTRACE_STOP case we never make any more normal reports at all before utrace_report_death. Yes, I agree. I was attracted by this until I started going through it. It's all good until you consider report_clone. If a SIGKILL is pending when you get to utrace_report_clone(), the clone has already happened. If you skip that callback, the engine doesn't find out about the new child, and that matters if it's not a CLONE_THREAD. another nasty case is report_exit(). Even if we fix the discussed problems with explicit/implicit SIGKILL's. We shouldn't afaics skip this report if the tracee was killed by execing sub-thread. Both can be fixed if we add spin_unlock_wait() before REPORT(). But imho it would be safer if we start with spin_unlock_wait() in utrace_stop(), perhaps there is something else to consider. Oleg.
Re: utrace-indirect branch
Yes, this is the question ;) At this point I have an irrational distaste for utrace_struct.h and a pathological displeasure at having a contentious upstream discussion on this again. We need to get the right code in upstream, and in the right state before another lifetime passes. I'm leaving it entirely in your discretion how best to make that happen. My ruminations are just there for whatever they're worth. Do you think we can remove void *cookie from exit_notify() pathes before sending utrace/ptrace to lkml? Afaics, given that task-utrace never goes away utrace-indirect doesn't need this hack. My inclination is that we not fiddle with any tracehook or other changes that do not have a direct positive effect on keeping the utrace code simpler today. If upstream wants us to fiddle with them, then we will. Otherwise we can get things merged, do more rounds of internal changes, and let the dust settle, before we look for cosmetic cleanups. IMHO today it's just more trivia to read over and get distracted by, more fiddling for anyone trying a backport to an older base kernel, etc. I don't really recall the exact details in previous generations of code that made it important to save that pointer across that locking window. Now that you have cleaned up and fixed the attach vs release_task interactions, introduced utrace_finish_jctl, etc., I suspect we could (later) introduce free-on-last-detach behavior and have the code be far easier to follow and more reliable than the past situation. Even then, chances are it wouldn't wind up needing that pointer. But, who knows? It's a trivium compiled away from the tracehook inlines today. Thanks, Roland
Re: PTRACE_EVENT_SIGTRAP
On 10/29, Roland McGrath wrote: The only problem with this change (if we do it now), it is always nice to have a separate changelog to document the behaviour change. But hopefully nobody cares, I mean, nobody will notice the difference. This seems like something we could change upstream first, to clarify and separate the behavior change. If the arch bits about choosing si_code et al are resolved, then it is simple to make the old kernel's tracehook calls post a signal instead of using ptrace_notify. Ah, yes. I forgot about si_code/etc filling. Of course, the discussed change which kills PTRACE_EVENT_SIGTRAP has to fill *info correctly too. Oleg.
Re: PTRACE_EVENT_SIGTRAP
This seems like something we could change upstream first, to clarify and separate the behavior change. If the arch bits about choosing si_code et al are resolved, then it is simple to make the old kernel's tracehook calls post a signal instead of using ptrace_notify. Ah, yes. I forgot about si_code/etc filling. Of course, the discussed change which kills PTRACE_EVENT_SIGTRAP has to fill *info correctly too. Yes. We previously discussed si_code et al in the context of what utrace-ptrace could do, and that's when I first raised having some kind of arch hook for it. My suggestion above assumed that we'd invent the arch hook to use first in the upstream version in such a way that it would be natural to use the same hook in utrace-ptrace. Thanks, Roland
Re: utrace_control(,,UTRACE_SINGLESTEP)
On 10/29, Roland McGrath wrote: Another example is E1 does UTRACE_BLOCKSTEP and then E2 does UTRACE_SINGLESTEP. Now single-step will win out, which is right. But E2 gets no notice about this, so it can't know that the step is for an insn rather than a block. Yes. In contrast, say utrace_control reduces both to UTRACE_REPORT, either implicitly as we had been discussing before, or because E3 comes along (before or after) and calls utrace_control(,,UTRACE_REPORT). Then E[123] get report_quiesce(0) in their callback order. (We can't control that order much today, but that is still to consolidate the utrace_control ordering issue into that existing general ordering issue.) Yes, we can't control the callback order, and this means we have the same problem even without utrace_control(). Suppose that we enter utrace_resume(), E1-report_quiesce() returns UTRACE_BLOCKSTEP, then E2-report_quiesce() returns UTRACE_SINGLESTEP. UTRACE_SINGLESTEP wins, but E1 can't know about this. Yes, I hope I understand utrace.c ;) I meant user_enable_single_step(). I don't really understand it even in x86 case, to many low level magic. Ok, good. You officially don't need to understand that unless you want to. I want ;) Just every time I recall about enable_step/etc I have more prioritized pending problems. Afaics, we can kill -stopped, but utrace_stop() and utrace_finish_jctl() should do unconditial spin_unlock_wait(utrace-lock) after resume (or lock/unlock). Ok. Please send a patch relative to utrace-cleanup with that code including comments that explain those needs. OK, will do. Oleg.
Re: utrace_control(,,UTRACE_SINGLESTEP)
Yes, we can't control the callback order, and this means we have the same problem even without utrace_control(). Right, but that is the known item on the list to consider for API refinements. If at least utrace_control doesn't add any new wrinkles to that issue, then we know that if we later address the general callback order question so we're happy, we will be done altogether with the subject. Thanks, Roland
[PATCH 122-124] PTRACE_KILL tweaks + cosmetic cleanups
PTRACE_KILL tweaks + cosmetic cleanups Oleg.
[PATCH 123] ptrace_notify_stop: kill the temporary WARN_ON()
- Remove the temporary debugging check in ptrace_notify_stop(). - Remove the XXX marker in ptrace_resume(), the recent changes hopefully fixed the problems with jctl nested stops. --- kernel/ptrace.c |9 + 1 file changed, 1 insertion(+), 8 deletions(-) --- PU/kernel/ptrace.c~123_KILL_XXX 2009-10-30 04:00:15.0 +0100 +++ PU/kernel/ptrace.c 2009-10-30 04:01:41.0 +0100 @@ -807,11 +807,8 @@ void ptrace_notify_stop(struct task_stru { struct utrace_engine *engine = ptrace_lookup_engine(tracee); - if (IS_ERR(engine)) { - // XXX: temporary check, wrong with mutlitracing - WARN_ON(tracee-state != TASK_RUNNING); + if (IS_ERR(engine)) return; - } do_ptrace_notify_stop(ptrace_context(engine), tracee); utrace_engine_put(engine); @@ -921,10 +918,6 @@ static int ptrace_resume(struct task_str case PTRACE_EVENT_SIGNAL: context-signr = data; break; - - case 0: - // XXX: JCTL stop - break; } context-resume = action;
[PATCH 124] s/context/ctx/
No changes in ptrace.o - s/context/ctx/ - line up ptrace_context-resume decl - remove kerneldoc bits from ptrace_traceme(). The comment is not very nice, I'd better remove it completely. --- kernel/ptrace.c | 192 +++- 1 file changed, 95 insertions(+), 97 deletions(-) --- PU/kernel/ptrace.c~124_S_CONTEXT_CTX2009-10-30 04:01:41.0 +0100 +++ PU/kernel/ptrace.c 2009-10-30 04:25:53.0 +0100 @@ -56,15 +56,15 @@ void __ptrace_unlink(struct task_struct } struct ptrace_context { - int options; + int options; - int signr; - siginfo_t *siginfo; + int signr; + siginfo_t *siginfo; - int stop_code; - unsigned long eventmsg; + int stop_code; + unsigned long eventmsg; - enum utrace_resume_action resume; + enum utrace_resume_action resume; }; #define PT_UTRACED 0x1000 @@ -78,19 +78,19 @@ struct ptrace_context { /* events visible to user-space */ #define PTRACE_EVENT_MASK 0x -static inline bool ptrace_event_pending(struct ptrace_context *context) +static inline bool ptrace_event_pending(struct ptrace_context *ctx) { - return context-stop_code != 0; + return ctx-stop_code != 0; } -static inline int get_stop_event(struct ptrace_context *context) +static inline int get_stop_event(struct ptrace_context *ctx) { - return context-stop_code 8; + return ctx-stop_code 8; } -static inline void set_stop_code(struct ptrace_context *context, int event) +static inline void set_stop_code(struct ptrace_context *ctx, int event) { - context-stop_code = (event 8) | SIGTRAP; + ctx-stop_code = (event 8) | SIGTRAP; } static inline struct ptrace_context * @@ -111,27 +111,27 @@ static struct utrace_engine * ptrace_reuse_engine(struct task_struct *tracee) { struct utrace_engine *engine; - struct ptrace_context *context; + struct ptrace_context *ctx; int err = -EPERM; engine = ptrace_lookup_engine(tracee); if (IS_ERR(engine)) return engine; - context = ptrace_context(engine); - if (unlikely(context-resume == UTRACE_DETACH)) { + ctx = ptrace_context(engine); + if (unlikely(ctx-resume == UTRACE_DETACH)) { /* * Try to reuse this self-detaching engine. * The only caller which can hit this case is ptrace_attach(), * it holds -cred_guard_mutex. */ - context-options = 0; - context-eventmsg = 0; + ctx-options = 0; + ctx-eventmsg = 0; /* make sure we don't get unwanted reports */ err = utrace_set_events(tracee, engine, UTRACE_EVENT(QUIESCE)); if (!err || err == -EINPROGRESS) { - context-resume = UTRACE_RESUME; + ctx-resume = UTRACE_RESUME; /* synchronize with ptrace_report_signal() */ err = utrace_barrier(tracee, engine); } @@ -149,7 +149,7 @@ static struct utrace_engine * ptrace_attach_engine(struct task_struct *tracee) { struct utrace_engine *engine; - struct ptrace_context *context; + struct ptrace_context *ctx; if (unlikely(task_utrace_flags(tracee))) { engine = ptrace_reuse_engine(tracee); @@ -157,21 +157,21 @@ ptrace_attach_engine(struct task_struct return engine; } - context = kzalloc(sizeof(*context), GFP_KERNEL); - if (unlikely(!context)) + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (unlikely(!ctx)) return ERR_PTR(-ENOMEM); - context-resume = UTRACE_RESUME; + ctx-resume = UTRACE_RESUME; engine = utrace_attach_task(tracee, UTRACE_ATTACH_CREATE | UTRACE_ATTACH_EXCLUSIVE | UTRACE_ATTACH_MATCH_OPS, - ptrace_utrace_ops, context); + ptrace_utrace_ops, ctx); if (unlikely(IS_ERR(engine))) { if (engine != ERR_PTR(-ESRCH) engine != ERR_PTR(-ERESTARTNOINTR)) engine = ERR_PTR(-EPERM); - kfree(context); + kfree(ctx); } return engine; @@ -181,7 +181,7 @@ static inline int ptrace_set_events(stru struct utrace_engine *engine, unsigned long options) { - struct ptrace_context *context = ptrace_context(engine); +