Re: utrace unwanted traps
On 10/19, Roland McGrath wrote: But that's how it used to be, and then we changed it. So it merits digging up our old discussion threads that led to doing that and see what all we had in mind then. I tried to grep my mail archives and google, but failed to find anything related. Based on the date of the commit you just mentioned, I browsed in: https://www.redhat.com/archives/utrace-devel/2009-October/thread.html and saw: https://www.redhat.com/archives/utrace-devel/2009-October/msg00287.html which is the day before that commit. I didn't read through all that to refamiliarize myself with all the details. Yes, I saw this thread, but didn't read it carefully. When I read it again, I started to believe that yes, it can explain the current -resume logic. However, I think this is false alarm. With a lot of efforts, I seem to fully understand our discussion. In short: I believe we never discussed why utrace-resume should be *STEP, especially if task is stopped or it is going to stop. Although I recall I thought this is obviously nice when I tried to review utrace: sticky resume action. As for this thread. Yes, I tried to suggest the (ugly) utrace-set_singlestep which should be checked after wakeup. But my motivation was quite different, what I was trying to preserve was TIF_SINGLESTEP, not resume_action. This is why the top email says I no longer think utrace_control() should just turn SINGLESTEP into REPORT silently. Please recall that, by that time 1. utrace_control(SINGLESTEP) called user_enable_single_step() 2. To handle the stepping over syscall correctly, both ptrace and ptrace-utrace relied on syscall_trace_leave() paths which checked TIF_SINGLESTEP to send SIGTRAP if needed. 1 was wrong, user_enable_single_step() was called under utrace-lock. It was decided that the tracee itself should call this helper before it returns to user-mode. To implement this, we could change utrace_control(SINGLESTEP) to set TIF_NOTIFY_RESUME, so that -report_quiesce() could return UTRACE_SINGLESTEP. However, this breaks 2, the stepping over syscall. By the time -report_quiesce() is called we already passed syscall_trace_leave() without TIF_SINGLESTEP, and -report_quiesce() can't just return UTRACE_SINGLESTEP but otoh it can't know that we should synthesize a trap before return to user-mode. So. I do not think there is any reason to ever keep UTRACE_*STEP in utrace-resume. Except, if utrace_control(STEP) sets -resume = STEP before wakeup, we avoid the unnecessary reporting loop in utrace_resume. But at the same time, this leads to the problems we are discussing. I even tested the kernel with the patch below, everything _seems_ to work (not that I think this can proves something). What do you think? Oleg. --- kstub/kernel/utrace.c~XXX_RESUME2010-10-20 18:18:40.0 +0200 +++ kstub/kernel/utrace.c 2010-10-21 18:48:52.0 +0200 @@ -768,11 +768,13 @@ relock: /* * Ensure a reporting pass when we're resumed. */ - utrace-resume = action; if (action == UTRACE_INTERRUPT) set_thread_flag(TIF_SIGPENDING); - else + else { + action = UTRACE_REPORT; set_thread_flag(TIF_NOTIFY_RESUME); + } + utrace-resume = action; } /* @@ -1195,6 +1197,7 @@ int utrace_control(struct task_struct *t * In that case, utrace_get_signal() will be reporting soon. */ clear_engine_wants_stop(engine); + action = UTRACE_REPORT; if (action utrace-resume) { utrace-resume = action; set_notify_resume(target); @@ -1381,11 +1384,13 @@ static void finish_report(struct task_st if (resume utrace-resume) { spin_lock(utrace-lock); - utrace-resume = resume; if (resume == UTRACE_INTERRUPT) set_tsk_thread_flag(task, TIF_SIGPENDING); - else + else { + resume = UTRACE_REPORT; set_tsk_thread_flag(task, TIF_NOTIFY_RESUME); + } + utrace-resume = resume; spin_unlock(utrace-lock); }
Re: utrace unwanted traps
On 10/14, Roland McGrath wrote: In short: I no longer understand why utrace-resume can be UTRACE_*STEP when the tracee is stopped, and in fact I think this is not right. I think we didn't have that originally. Yes. The rationale that I remember is the idea that you should be able to implement a debugger interface feature like PTRACE_SINGLESTEP without noticeably more overhead than it had in pre-utrace ptrace. That is, in vanilla ptrace, the tracer thread calls user_enable_single_step itself and then just lets the tracee resume directly to user mode. We wanted to avoid always having an additional trip through tracehook_notify_resume and the utrace reporting loop and then back through the return-to-user assembly path a second time, between the tracee waking up and actually going to user mode. Probably, I am not sure. I can't recall the previous discussions. IIRC, the main problem with user_enable_single_step() was: we wanted to move the callsite from tracer to tracee by many reasons. I can't recall if avoid another reporting loop was the target, and additional trip through tracehook_notify_resume() is not avoidable anyway. Afaics, this was introduced by 26fefca955cc7c3c83014be2e10b773209d32eea utrace: sticky resume action. Before that patch the stopped tracee could only remember the pending UTRACE_REPORT or UTRACE_INTERRUPT. Right. Said another way, any time an engine that previously used UTRACE_*STEP now uses UTRACE_RESUME, we must degrade utrace-resume back to UTRACE_REPORT at most. Since we don't keep track of which engine it was that put UTRACE_*STEP into utrace-resume before, we'd have to just always degrade it any time anybody uses UTRACE_RESUME. Agreed. That has almost the whole effect of punting utrace-resume back to just a utrace-report flag. (see below) But that's how it used to be, and then we changed it. So it merits digging up our old discussion threads that led to doing that and see what all we had in mind then. I tried to grep my mail archives and google, but failed to find anything related. _Perhaps_ this was not intended actually. Before the commit above, another problem was fixed by 8ad60bbd4c665a11be179a0bff41483cca3b3560 utrace_stop: preserve report/interrupt requests across stop/resume, until this one it was possible to lose UTRACE_INTERRUPT request if it races with UTRACE_STOP from another engine. So, perhaps cleanup was the main reason for 26fefca9. See also http://www.mail-archive.com/utrace-devel@redhat.com/msg01647.html Oleg.
Re: utrace unwanted traps
Probably, I am not sure. I can't recall the previous discussions. Me either (except hazily the one notion I mentioned), but that's why we have mailing list archives. Afaics, this was introduced by 26fefca955cc7c3c83014be2e10b773209d32eea utrace: sticky resume action. Before that patch the stopped tracee could only remember the pending UTRACE_REPORT or UTRACE_INTERRUPT. Correct. But that's how it used to be, and then we changed it. So it merits digging up our old discussion threads that led to doing that and see what all we had in mind then. I tried to grep my mail archives and google, but failed to find anything related. Based on the date of the commit you just mentioned, I browsed in: https://www.redhat.com/archives/utrace-devel/2009-October/thread.html and saw: https://www.redhat.com/archives/utrace-devel/2009-October/msg00287.html which is the day before that commit. I didn't read through all that to refamiliarize myself with all the details. Thanks, Roland
utrace unwanted traps
On 09/23, Roland McGrath wrote: OK, finish_callback_report() and utrace_control(DETACH) can set TIF_NOTIFY_RESUME. Right. Those utrace_resume has the report.action==UTRACE_RESUME bail-out case. So either that would change or detach would also do UTRACE_REPORT. But what if there are no more attached engines? Looks like, utrace_resume(UTRACE_RESUME) needs to handle this special case. And utrace_reset() shouldn't clear task-utrace_flags, otherwise utrace_resume/utrace_get_signal won't be called. Right. Or else tracehook_notify_resume could call utrace_resume unconditionally, but I'm not at all sure that is not worse. The original theory was that it should always be OK to have some utrace_flags bits stay set when they are stale, because any kind of reporting pass that got enabled would hit the report-spurious case and clean the state up synchronously when it's safe. I tried to make the patch which addresses this issue, but surprisingly I failed. Because I think there are more (minor) problems here. When it comes to multitracing, we can have the unwanted trap even without detaching. In short: I no longer understand why utrace-resume can be UTRACE_*STEP when the tracee is stopped, and in fact I think this is not right. For example. Consider two engines, E1 and E2. To simplify the discussion suppose that both engines have engine-data-resume and -report_quiesce() just returns -resume. So, if the debugger wants, say, single-step, it does engine-data-resume = UTRACE_SINGLESTEP; utrace_control(UTRACE_SINGLESTEP); Actually, any engine should do something like this. Now suppose that E1 wants UTRACE_SINGLESTEP, E2 asks for UTRACE_STOP. In this case utrace_resume() results in utrace_stop() which sets utrace-resume = UTRACE_SINGLESTEP before sleeping. First of all - why? Yes, the theory is that we have another reporting loop after wake_up, but utrace-resume = UTRACE_REPORT is enough, E1 should always acquire UTRACE_SINGLESTEP if needed or it is buggy. But more importantly, this doesn't look right or I missed something. Suppose that, before E2 resumes the tracee, E1 does utrace_control(STOP) which immediately succeeds and allows E1 to play with the stopped tracee. Again, to simplify the discussion, suppose that E2 does UTRACE_RESUME and nothing more after that. Now. E1 wants to resume the tracee too and it does engine-data-resume = UTRACE_RESUME; utrace_control(UTRACE_RESUME); utrace_control(RESUME) correctly clears TIF_SINGLESTEP, but it doesn't change utrace-resume. This means utrace_resume() doesn't do the reporting loop, it just calls user_enable_single_step() and returns to user-mode. So. Could you explain why utrace_stop() should ever keep UTRACE_STEP in utrace-resume? finish_report() does the same but this looks fine. Also. I am not sure utrace_control(UTRACE_STEP) should always set utrace-resume = UTRACE_STEP. If the tracee is stopped but there is another ENGINE_STOP engine (iow, the tracee won't be resumed right now) we have the same problem. Off-topic, #define UTRACE_RESUME_BITS (ilog2(UTRACE_RESUME_MAX) + 1) why UTRACE_RESUME_MAX? it should be (ilog2(UTRACE_RESUME_MAX - 1) + 1), no? Oleg.