Re: copy_process utrace_init_task (Was: [PATCH 133] stepping, accommodate to utrace-cleanup changes)
Found the trivial but nasty problem. Ah! Good catch. I added tracehook_init_task() in my tree. I don't see much benefit in sending any tracehook patch upstream for this. tracehook_init_task() corresponds to tracehook_free_task(), which is only added by utrace (and both would just be empty in a separate preparatory patch). I don't see any reason to fiddle the ptrace_init_task() call. The setup it does is all stuff that only matters for teardown done by release_task() or even earlier in the exit sequence. So it makes enough sense that the setup side should happen later as it does now. In the long run, the ptrace init stuff will all just go away. Before then I can't see any rationale for touching it. Thanks, Roland
Re: copy_process utrace_init_task (Was: [PATCH 133] stepping, accommodate to utrace-cleanup changes)
On 11/18, Roland McGrath wrote: I added tracehook_init_task() in my tree. I don't see much benefit in sending any tracehook patch upstream for this. tracehook_init_task() corresponds to tracehook_free_task(), which is only added by utrace (and both would just be empty in a separate preparatory patch). I don't see any reason to fiddle the ptrace_init_task() call. ... In the long run, the ptrace init stuff will all just go away. OK, agreed. Oleg.
Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes
On 11/18, Roland McGrath wrote: Once again. The tracee sleeps in SYSCALL_ENTER. The tracer resumes the tracee via utrace_control(UTRACE_SINGLESTEP). Right. This is another strange corner. It doesn't necessarily make especially good sense intrinsically for single-step to have this meaning from this place. But it's the status quo in the ptrace interface. Yes. So, I think we should do something like tracehook_report_syscall_exit(step) // do not use step at all if (task_utrace_flags() != 0) utrace_report_syscall_exit(); The trouble here is that the arch code looks like this: step = test_thread_flag(TIF_SINGLESTEP); if (step || test_thread_flag(TIF_SYSCALL_TRACE)) tracehook_report_syscall_exit(regs, step); i.e. unless syscall tracing is enabled, we won't even get there at all. You are right, I missed this. utrace_report_syscall_exit() if (UTRACE_EVENT(SYSCALL_EXIT)) REPORT(report_syscall_exit); if (utrace-resume == UTRACE_SINGLESTEP || utrace-resume == UTRACE_BLOCKSTEP) send_sigtrap(); What do you think? This can't work in the REPORT case. There start_report() will have reset utrace-resume, Yes, yes, the pseudo-code above was just for illustration. (This would go with un-reverting f19442c and reverting 3bd4be9.) --- a/kernel/utrace.c +++ b/kernel/utrace.c @@ -1584,20 +1584,25 @@ static inline u32 do_report_syscall_entr UTRACE_EVENT(SYSCALL_ENTRY), report_syscall_entry, resume_report | report-result | report-action, engine, current, regs); - finish_report(task, utrace, report, false); + + /* + * Now we'll stop if asked. If we're proceeding with the syscall, + * now we'll first enable single-step if asked. That leaves the + * task in stepping state so tracehook_report_syscall_exit() and + * the arch code that calls it will know. + */ + finish_resume_report(task, utrace, report); Hmm. not sure I understand how this can work. I mean, this won't enable stepping (in this particular case we are discussing). - } else if (utrace-resume == UTRACE_REPORT) { + } else if (utrace-resume = UTRACE_REPORT) { perhaps, you meant = UTRACE_REPORT ? In this case yes, UTRACE_SINGLESTEP will be noticed after resume, and ptrace or another engine can reassert it during the next reporting loop, and then finish_resume_report() will enable the stepping. But we can't trust =, another engine can do utrace_control(INTERRUPT) and change -resume before the tracee resumes. OK, probably I missed the implemenation details, but I think I got the idea, and this should fix the step-after-syscall_enter problem. But. We have the same problem with utrace_finish_vfork(), no? Instead of fixing utrace_report_syscall_entry(), perhaps we should change utrace_stop() path to enable the stepping if needed, but this means we return to synchronous utrace_control(UTRACE_XXXSTEP). Oleg.
Q: UTRACE_SYSCALL_RESUMED logic
Just can't understand UTRACE_SYSCALL_RESUMED code. To the pointed, I tried to read the docs: * When %UTRACE_STOP is used in @report_syscall_entry, then @task * stops before attempting the system call. In this case, another * @report_syscall_entry callback follows after @task resumes; Probably I misread this comment, or code (or both) but this is not what utrace_report_syscall_entry(). The second reporting loop is done if the tracee stops, and its -resume = UTRACE_REPORT after wakeup. This can happen if, during the first report, one engine returns UTRACE_STOP and another engine returns UTRACE_REPORT. Or, no matter how many engines we have, the tracer uses UTRACE_REPORT to wakeup the tracee after SYSCALL_ENTRY stop. In any case, what is the rationality? And how can we trust if (utrace-resume == UTRACE_REPORT) ? If we have multiple engines, some engine can use UTRACE_INTERRUPT ? Oleg.
Re: [HACK] utrace: fix utrace_resume()-finish_resume_report() logic
On 11/16, Roland McGrath wrote: You cited the one most obvious case: utrace_get_signal() has just run, so finish_resume_report() has just run and everything is already as we want. What else? I think, we can say that finish_resume_report() must be never called without reporting loop if -resume = UTRACE_RESUME. --- a/kernel/utrace.c +++ b/kernel/utrace.c @@ -1866,8 +1866,18 @@ void utrace_resume(struct task_struct *t */ report.action = start_report(utrace); - if (report.action == UTRACE_REPORT - likely(task-utrace_flags UTRACE_EVENT(QUIESCE))) { + switch (report.action) { + case UTRACE_RESUME: + /* + * Anything we might have done was already handled by + * utrace_get_signal(), or this is an entirely spurious + * call. (The arch might use TIF_NOTIFY_RESUME for other + * purposes as well as calling us.) + */ + return; Yes, I think this change is right. Will test and report later, but it obviously should fix the testing. I feel we need some cleanups, but can't suggest anything ;) And can't convince myself I am 100% sure we don't have other similar issues. At least, don't we also need the patch below? Oleg. --- a/kernel/utrace.c +++ b/kernel/utrace.c @@ -2002,7 +2002,7 @@ int utrace_get_signal(struct task_struct spin_unlock_irq(task-sighand-siglock); } - if (resume UTRACE_REPORT) { + if (resume UTRACE_REPORT utrace UTRACE_RESUME) { /* * We only got here to process utrace-resume. * Despite no callbacks, this report is not spurious.
[PATCH 0/4] utrace-resume fixes
On top of your patch in https://www.redhat.com/archives/utrace-devel/2009-November/msg00109.html I attached this patch below just in case. As expected, it fixes the problem with the lost TIF_SINGLESTEP. Oleg. --- UTRACE-PTRACE/kernel/utrace.c~__ROLAND_RESUME_FIX 2009-11-18 21:16:23.0 +0100 +++ UTRACE-PTRACE/kernel/utrace.c 2009-11-19 02:17:26.0 +0100 @@ -1882,8 +1882,18 @@ void utrace_resume(struct task_struct *t */ report.action = start_report(utrace); - if (report.action == UTRACE_REPORT - likely(task-utrace_flags UTRACE_EVENT(QUIESCE))) { + switch (report.action) { + case UTRACE_RESUME: + /* +* Anything we might have done was already handled by +* utrace_get_signal(), or this is an entirely spurious +* call. (The arch might use TIF_NOTIFY_RESUME for other +* purposes as well as calling us.) +*/ + return; + case UTRACE_REPORT: + if (unlikely(!(task-utrace_flags UTRACE_EVENT(QUIESCE + break; /* * Do a simple reporting pass, with no specific * callback after report_quiesce. @@ -1891,13 +1901,15 @@ void utrace_resume(struct task_struct *t report.action = UTRACE_RESUME; list_for_each_entry(engine, utrace-attached, entry) start_callback(utrace, report, engine, task, 0); - } else { + break; + default: /* * Even if this report was truly spurious, there is no need * for utrace_reset() now. TIF_NOTIFY_RESUME was already * cleared--it doesn't stay spuriously set. */ report.spurious = false; + break; } /*
[PATCH 1/4] finish_resume_report(UTRACE_RESUME) must not be called without report
We should never call finish_resume_report(report) when report-action == UTRACE_RESUME, this can destroy the result of the previous finish_resume_report(). Signed-off-by: Oleg Nesterov o...@redhat.com --- kernel/utrace.c |8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) --- UTRACE-PTRACE/kernel/utrace.c~1_UGS_FIX_FINISH_RESUME_REPORT 2009-11-19 02:17:26.0 +0100 +++ UTRACE-PTRACE/kernel/utrace.c 2009-11-19 02:24:41.0 +0100 @@ -2019,9 +2019,11 @@ int utrace_get_signal(struct task_struct * We only got here to process utrace-resume. * Despite no callbacks, this report is not spurious. */ - report.action = resume; - report.spurious = false; - finish_resume_report(task, utrace, report); + if (resume != UTRACE_RESUME) { + report.action = resume; + report.spurious = false; + finish_resume_report(task, utrace, report); + } return -1; } else if (!(task-utrace_flags UTRACE_EVENT(QUIESCE))) { /*
[PATCH 2/4] fix finish_report() vs utrace_control() race
finish_report: if (resume utrace-resume) { spin_lock(utrace_lock); utrace-resume = resume; this can race with utrace_control(). If we are going to change utrace-resume we must always check it under utrace-lock to ensure the new value is less than. Signed-off-by: Oleg Nesterov o...@redhat.com --- kernel/utrace.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) --- UTRACE-PTRACE/kernel/utrace.c~2_FINISH_REPORT_CK_RESUME_UNDER_LOCK 2009-11-19 02:24:41.0 +0100 +++ UTRACE-PTRACE/kernel/utrace.c 2009-11-19 02:40:52.0 +0100 @@ -1375,11 +1375,13 @@ static void finish_report(struct task_st if (resume utrace-resume) { spin_lock(utrace-lock); - utrace-resume = resume; - if (resume == UTRACE_INTERRUPT) - set_tsk_thread_flag(task, TIF_SIGPENDING); - else - set_tsk_thread_flag(task, TIF_NOTIFY_RESUME); + if (resume utrace-resume) { + utrace-resume = resume; + if (resume == UTRACE_INTERRUPT) + set_tsk_thread_flag(task, TIF_SIGPENDING); + else + set_tsk_thread_flag(task, TIF_NOTIFY_RESUME); + } spin_unlock(utrace-lock); }
Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes
I'll reply tomorrow, it is to late for me. But I noticed the funny detail in your email, On 11/18, Roland McGrath wrote: Yes. But, hmm. utrace_stop(task, utrace, UTRACE_RESUME); /* XXX */ This XXX was there about passing UTRACE_RESUME, because that's not really right. I didn't notice this XXX marker, but I guess I just sent the patch which makes this case correct. (of course this patch doesn't fix the problem we are dicussing in this thread). Oleg.