Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-10-22 Thread Oleg Nesterov
On 10/22, Roland McGrath wrote:

  Hmm. If task is not stopped then it is current (except
  utrace_control(DETACH) can play with the dying task).

 Right, asynchronous detach was the problematic case I was concerned with.

but asynchronous detach doesn't do utrace_reset(), unless the tracee
is stopped or exiting (-exit_state != 0).

 But the whole problem we have is that we aren't getting to
 that path when we've done a detach, right?

Confused. We already discussed this before,

 OK, I'll do some testing and resend right now. In UTRACE_DETACH case
 reset can be true but the tracee is running. But I don't think it
 makes sense to check target-exit_state == 0, correct?

I think it's OK if it's running with -exit_state set (or even with just
PF_EXITING set), i.e. already in kernel mode and never going back to
user mode.  (Then it's essentially equivalent to calling user_*_step
while racing with a SIGKILL death, which has to be OK.)  Any other kind
of running would not be OK.

Probably we misunderstood each other. In any case I agree, that patch
was not good.

Oleg.



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-10-14 Thread Roland McGrath
 The 1st patch I sent initially:
 
   --- kstub/kernel/utrace.c~10_utrace_reset_should_clear_ss   
 2010-09-20 03:55:12.0 +0200
   +++ kstub/kernel/utrace.c   2010-09-20 21:33:47.0 +0200
   @@ -709,6 +709,7 @@ static bool utrace_reset(struct task_str
*/
   utrace-resume = UTRACE_RESUME;
   utrace-signal_handler = 0;
   +   user_disable_single_step(task);
   }

   /*
 
 It clears TIF_SINGLESTEP when the last engine detaches.

This we didn't want because utrace_reset can be called when the tracee is
not stopped, and it's not safe to call user_*_step then (with the one
caveat that the SIGKILL race is OK).

 The 2nd one which changes utrace_control(DETACH),
 
   @@ -1139,7 +1139,9 @@ int utrace_control(struct task_struct *t
   target-utrace_flags = ~ENGINE_STOP;
   mark_engine_detached(engine);
   reset = reset || utrace_do_stop(target, utrace);
   -   if (!reset) {
   +   if (reset) {
   +   user_disable_single_step(target);
   +   } else {
   /*
* As in utrace_set_events(), this barrier 
 ensures
* that our engine-flags changes have hit 
 before we
 
 It doesn't cover the case when the TIF_SINGLESTEP tracee detaches
 itself, but a) currently the are no such engines, and b) it looks
 more clean.

If we're trying to meet the advertised API--how it's documented now is
the only reasonable thing if we are actually fixing things properly at
all--we do need to cover the case where the only engine to have
requested single-step earlier then later returns UTRACE_DETACH from a
callback.  That is, if that engine could actually know that the tracee
never reached user mode between its use of UTRACE_SINGLESTEP and its
later UTRACE_DETACH.

 However, I am starting to think that if you prefer this change we
 have a better option. Instead of duplicating UTRACE_RESUME logic,
 perhaps the patch below makes more sense?

Hmm, that does seem like it might be pretty reasonable.  If your
engine had ever permitted the tracee to get back to user mode after
requesting UTRACE_*STEP, then it's just desserts to cause that SIGTRAP
regardless of whether other engines might have kept it stopped in
actuality.  

But, there might be plausible corners where an engine really could know
reliably (without outside knowledge of what other engines might be
doing) that the tracee can't really have been back to user mode yet.
For example, if it checked signal_pending(tracee) before it used
UTRACE_*STEP, then it could know for sure that its report_signal
callback will have been called before it ever got to user mode, so if
that callback returns UTRACE_DETACH, it better not be left with stepping
enabled.  In that particular case, it shouldn't be a problem, since
after its report_signal we will always hit finish_resume_report later.
But we should think about if there are any holes of a similar nature.

 Please tell me which fix you prefer? Or just commit the fix you
 like more.

I'm only going to merge a specific commit of yours that you have tested.


Thanks,
Roland



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-23 Thread Roland McGrath
 I think we should start with changing utrace_control(DETACH) anyway,
 then try to improve. I'll ressend the one-liner I already showed.

Ok.

 Hmm. I'll try to think more. Right now I don't really understand
 how to do this correctly.

I wasn't immediately sure either.

 OK, finish_callback_report() and utrace_control(DETACH) can set
 TIF_NOTIFY_RESUME. 

Right.  Those utrace_resume has the report.action==UTRACE_RESUME bail-out
case.  So either that would change or detach would also do UTRACE_REPORT.

 But what if there are no more attached engines?
 Looks like, utrace_resume(UTRACE_RESUME) needs to handle this special
 case. And utrace_reset() shouldn't clear task-utrace_flags, otherwise
 utrace_resume/utrace_get_signal won't be called.

Right.  Or else tracehook_notify_resume could call utrace_resume
unconditionally, but I'm not at all sure that is not worse.  The
original theory was that it should always be OK to have some
utrace_flags bits stay set when they are stale, because any kind of
reporting pass that got enabled would hit the report-spurious case
and clean the state up synchronously when it's safe.

 So, probably detach should set TIF_NOTIFY_RESUME, but utrace_reset()
 should do user_disable_single_step() too if no more engines. Confused.

If there are no more engines but the tracee is still running, we still
shouldn't do it there because it still might not be entirely safe.
If the tracee is not stopped, it's only safe to call in current.

 And in fact I don't understand why this is important. When it comes
 to multiracing, any engine can hit the unwanted/unexpected trap
 because another engine can ask for UTRACE_*STEP. 

Right.  An engine earlier in the list could swallow the signal so the
next engines' callbacks didn't see it.  But it doesn't know that some
later engines didn't also ask for stepping.  So there would have to be
some understood convention between engines.  For example, a later
engine could see info-si_signo==SIGTRAP et al and act on that even
when the incoming utrace_signal_action(action)==UTRACE_SIGNAL_IGN.
Of course, that doesn't help a non-stepping engine that is earlier
in the list to know that a later engine will be swallowing the signal.

The original theory on this was that we'd one day stop overloading
user signals for debugger-induced traps.  In some past TODO lists and
postings I referred to extension events.  The idea (in part) was
that things like hardware stepping would generate a special new flavor
of utrace event rather than a real signal that has to be intercepted.
Then engines' callbacks would easily see that this was a debugging
event induced by some engine and ignore it (or more likely, just never
get any callback unless your engine registered interest in stepping).
This would also address the case of asynchronous engine detach just
after a trap has actually hit, where today the SIGTRAP is queued and
then later won't be intercepted by a debugger, and instead kills the
user process.

But we don't have any of that now, and don't yet know if we will
really pursue any big improvements at this API level.

 The only really
 important (I think) case is when the last engine detaches.

That's the most important case, sure.  But in any case that is not
actually racy, we should avoid later spurious traps.

 IOW. Suppose that eninge E does utrace_control(STEP) or its callback
 returns UTRACE_*STEP. If we do not detach this engine, other engines
 will see the trap. 

That's only so if the tracee actually gets back to user mode before we
have another reporting pass.

 So why it is so important to clear X86_EFLAGS_TF if we detach E ?

Perhaps I am worrying too much about it.  The worst thing is if it
could really get stuck.  But that shouldn't be possible if there are
any engines at all, perhaps only any with UTRACE_EVENT(QUIESCE) set.
Worst case, one spurious SIGTRAP will get to a report_signal pass, but
nobody will return UTRACE_*STEP again, so there won't be another.  (Of
course, nobody will swallow that SIGTRAP and so it will terminate the
process first anyway, but that's another problem.)  

What seems important is any non-racy scenarios.  That is, where
perhaps it wasn't properly stopped and E detached asynchronously,
but in practice the tracee was known to be otherwise blocked elsewhere
or something, so the detach should have full effect before it returns
to user mode.  But that is just vague theory off hand.


Thanks,
Roland



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-22 Thread Oleg Nesterov
On 09/21, Roland McGrath wrote:

  OK, so what should we do for now?

 If we can't come to a straightforward answer for the general case
 fairly quickly, then we can do the simple change to start with.

I think we should start with changing utrace_control(DETACH) anyway,
then try to improve. I'll ressend the one-liner I already showed.

  Without the multitracing utrace_resume()-user_disable_single_step()
  fixes the problem. Otherwise we probably need ENGINE_WANTS_SINGLESTEP.

 No, no, we don't want that.  We don't need more state.  And, remember,
 we really don't need to microoptimize cases when single-step was used
 recently--the cost of taking one more single-step trap and percolating
 through the signal paths was going to be pretty large anyway.

OK,

 It's better to have a spurious report (preferably just spurious
 TIF_NOTIFY_RESUME with no actual callbacks) following any detach,
 or whatever it takes to ensure user_disable_single_step() always
 runs if user_enable_*_step did before and there is no engine around
 to see a report_quiesce callback pass.  If there is a report_quiesce
 or report_signal callback pass where nobody returns UTRACE_*STEP,
 then we should never leave stepping enabled when we return to user mode.

Hmm. I'll try to think more. Right now I don't really understand
how to do this correctly.

OK, finish_callback_report() and utrace_control(DETACH) can set
TIF_NOTIFY_RESUME. But what if there are no more attached engines?
Looks like, utrace_resume(UTRACE_RESUME) needs to handle this special
case. And utrace_reset() shouldn't clear task-utrace_flags, otherwise
utrace_resume/utrace_get_signal won't be called.

So, probably detach should set TIF_NOTIFY_RESUME, but utrace_reset()
should do user_disable_single_step() too if no more engines. Confused.

And in fact I don't understand why this is important. When it comes
to multiracing, any engine can hit the unwanted/unexpected trap
because another engine can ask for UTRACE_*STEP. The only really
important (I think) case is when the last engine detaches.

IOW. Suppose that eninge E does utrace_control(STEP) or its callback
returns UTRACE_*STEP. If we do not detach this engine, other engines
will see the trap. So why it is so important to clear X86_EFLAGS_TF
if we detach E ?

Oleg.



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-22 Thread Oleg Nesterov
On 09/23, Oleg Nesterov wrote:

 On 09/21, Roland McGrath wrote:
 
  It's better to have a spurious report (preferably just spurious
  TIF_NOTIFY_RESUME with no actual callbacks) following any detach,
  or whatever it takes to ensure user_disable_single_step() always
  runs if user_enable_*_step did before and there is no engine around
  to see a report_quiesce callback pass.  If there is a report_quiesce
  or report_signal callback pass where nobody returns UTRACE_*STEP,
  then we should never leave stepping enabled when we return to user mode.

 Hmm. I'll try to think more. Right now I don't really understand
 how to do this correctly.

 OK, finish_callback_report() and utrace_control(DETACH) can set
 TIF_NOTIFY_RESUME. But what if there are no more attached engines?
 Looks like, utrace_resume(UTRACE_RESUME) needs to handle this special
 case. And utrace_reset() shouldn't clear task-utrace_flags, otherwise
 utrace_resume/utrace_get_signal won't be called.

 So, probably detach should set TIF_NOTIFY_RESUME, but utrace_reset()
 should do user_disable_single_step() too if no more engines. Confused.

Or, detach can always do user_disable_single_step() and set
TIF_NOTIFY_RESUME. If another engine wants stepping its report_quiesce()
should re-ack UTRACE_SINGLESTEP anyway, otherwise it is buggy.

 And in fact I don't understand why this is important. When it comes
 to multiracing, any engine can hit the unwanted/unexpected trap
 because another engine can ask for UTRACE_*STEP. The only really
 important (I think) case is when the last engine detaches.

Aaah. please ignore. Somehow I assumed every engine should hook
SIGTRAP.

Oleg.



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-21 Thread Roland McGrath
 Change utrace_reset() to do user_disable_single_step(). Alternatively
 we could change ptrace layer, but I think this is not ptrace specific.

Yes, it's right to fix this in the utrace layer.  
But I'm not sure that's the right place in the code to fix it.

The expectation is that either we'll get to finish_resume_report,
which calls user_*_step, or that utrace_control is resuming us
without any expectation of a report.  For UTRACE_RESUME in that
case, utrace_control calls user_disable_single_step.  So probably
the UTRACE_DETACH case there should just do it to.


Thanks,
Roland



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-21 Thread Oleg Nesterov
On 09/21, Roland McGrath wrote:

  Change utrace_reset() to do user_disable_single_step(). Alternatively
  we could change ptrace layer, but I think this is not ptrace specific.

 Yes, it's right to fix this in the utrace layer.
 But I'm not sure that's the right place in the code to fix it.

 The expectation is that either we'll get to finish_resume_report,
 which calls user_*_step, or that utrace_control is resuming us
 without any expectation of a report.  For UTRACE_RESUME in that
 case, utrace_control calls user_disable_single_step.  So probably
 the UTRACE_DETACH case there should just do it to.

Yes, initially I was going to change utrace_control(DETACH), and this
certainly looks more clean.

I was worried about the case when the TIF_SINGLESTEP tracee detaches
itself and escapes finish_resume_report()-user_disable_single_step(),
say, utrace_report_exec(). But probably we can ignore this.

Another concern was implicit detach, but thinking more I do not see
why utrace_resume() is better.

OK, I'll do some testing and resend right now. In UTRACE_DETACH case
reset can be true but the tracee is running. But I don't think it
makes sense to check target-exit_state == 0, correct?

Oleg.



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-21 Thread Oleg Nesterov
On 09/21, Roland McGrath wrote:

  I was worried about the case when the TIF_SINGLESTEP tracee detaches
  itself and escapes finish_resume_report()-user_disable_single_step(),
  say, utrace_report_exec(). But probably we can ignore this.

 No, I think that is indeed a problem.  If no engine is still attached
 whose last callback returned UTRACE_SINGLESTEP, we should never return
 to user mode with single-step enabled.

OK, so what should we do for now?

Without the multitracing utrace_resume()-user_disable_single_step()
fixes the problem. Otherwise we probably need ENGINE_WANTS_SINGLESTEP.

  Another concern was implicit detach, but thinking more I do not see
  why utrace_resume() is better.
 
  OK, I'll do some testing and resend right now. In UTRACE_DETACH case
  reset can be true but the tracee is running. But I don't think it
  makes sense to check target-exit_state == 0, correct?

 I think it's OK if it's running with -exit_state set (or even with just
 PF_EXITING set), i.e. already in kernel mode and never going back to
 user mode.  (Then it's essentially equivalent to calling user_*_step
 while racing with a SIGKILL death, which has to be OK.)  Any other kind
 of running would not be OK.

Yes, my concern was running and !current, not exiting. This is OK on
x86 but user_disable_single_step() is arch specific.

Oleg.



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-21 Thread Oleg Nesterov
On 09/22, Oleg Nesterov wrote:

 On 09/21, Roland McGrath wrote:
 
   I was worried about the case when the TIF_SINGLESTEP tracee detaches
   itself and escapes finish_resume_report()-user_disable_single_step(),
   say, utrace_report_exec(). But probably we can ignore this.
 
  No, I think that is indeed a problem.  If no engine is still attached
  whose last callback returned UTRACE_SINGLESTEP, we should never return
  to user mode with single-step enabled.

 OK, so what should we do for now?

 Without the multitracing utrace_resume()-user_disable_single_step()
 fixes the problem. Otherwise we probably need ENGINE_WANTS_SINGLESTEP.

As expected, the patch below equally fixes the problem and passes
ptrace-tests.

Which one do you prefer?

Note that, since we are going to change UTRACE_RESUME to remove
ENGINE_STOP from utrace_flags, we can probably unify DETACH/RESUME

case UTRACE_DETACH:
mark_engine_detached(engine);
reset = reset || utrace_do_stop(target, utrace);
if (!reset) {
/*
 * As in utrace_set_events(), this barrier ensures
 * that our engine-flags changes have hit before we
 * examine utrace-reporting, pairing with the barrier
 * in start_callback().  If @target has not yet hit
 * finish_callback() to clear utrace-reporting, we
 * might be in the middle of a callback to @engine.
 */
smp_mb();
if (utrace-reporting == engine)
ret = -EINPROGRESS;
}
/* fall */

case UTRACE_RESUME:
clear_engine_wants_stop(engine);
target-utrace_flags = ~ENGINE_STOP;

/*
 * This and all other cases imply resuming if stopped.
 * There might not be another report before it just
 * resumes, so make sure single-step is not left set.
 */
if (likely(reset))
user_disable_single_step(target);
break;


--- kstub/kernel/utrace.c~10_utrace_reset_should_clear_ss   2010-09-20 
03:55:12.0 +0200
+++ kstub/kernel/utrace.c   2010-09-22 01:54:24.0 +0200
@@ -1139,7 +1139,9 @@ int utrace_control(struct task_struct *t
target-utrace_flags = ~ENGINE_STOP;
mark_engine_detached(engine);
reset = reset || utrace_do_stop(target, utrace);
-   if (!reset) {
+   if (reset) {
+   user_disable_single_step(target);
+   } else {
/*
 * As in utrace_set_events(), this barrier ensures
 * that our engine-flags changes have hit before we



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-21 Thread Roland McGrath
 OK, so what should we do for now?

If we can't come to a straightforward answer for the general case
fairly quickly, then we can do the simple change to start with.

 Without the multitracing utrace_resume()-user_disable_single_step()
 fixes the problem. Otherwise we probably need ENGINE_WANTS_SINGLESTEP.

No, no, we don't want that.  We don't need more state.  And, remember,
we really don't need to microoptimize cases when single-step was used
recently--the cost of taking one more single-step trap and percolating
through the signal paths was going to be pretty large anyway.

It's better to have a spurious report (preferably just spurious
TIF_NOTIFY_RESUME with no actual callbacks) following any detach,
or whatever it takes to ensure user_disable_single_step() always
runs if user_enable_*_step did before and there is no engine around
to see a report_quiesce callback pass.  If there is a report_quiesce
or report_signal callback pass where nobody returns UTRACE_*STEP,
then we should never leave stepping enabled when we return to user mode.

 Yes, my concern was running and !current, not exiting. This is OK on
 x86 but user_disable_single_step() is arch specific.

It's not really OK on x86 either, with either SMP or preemption.


Thanks,
Roland



[PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-20 Thread Oleg Nesterov
Test-case:

#include stdio.h
#include signal.h
#include unistd.h
#include sys/ptrace.h
#include sys/wait.h
#include assert.h

int main(void)
{
int pid, status;

pid = fork();
if (!pid) {
assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
kill(getpid(), SIGSTOP);
return 0x23;
}

assert(wait(status) == pid);
assert(WIFSTOPPED(status)  WSTOPSIG(status) == SIGSTOP);

assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
assert(wait(status) == pid);
assert(WIFSTOPPED(status)  WSTOPSIG(status) == SIGTRAP);

assert(ptrace(PTRACE_DETACH, pid, 0,0) == 0);
assert(wait(status) == pid);

if (WIFEXITED(status)  WEXITSTATUS(status) == 0x23)
return 0;

printf(ERR!! exit status: %04x\n, status);
return 1;
}

It fails because TIF_SINGLESTEP is not cleared after PTRACE_DETACH and
the tracee gets another trap after resume.

Change utrace_reset() to do user_disable_single_step(). Alternatively
we could change ptrace layer, but I think this is not ptrace specific.

Signed-off-by: Oleg Nesterov o...@redhat.com
---

 kernel/utrace.c |1 +
 1 file changed, 1 insertion(+)

--- kstub/kernel/utrace.c~10_utrace_reset_should_clear_ss   2010-09-20 
03:55:12.0 +0200
+++ kstub/kernel/utrace.c   2010-09-20 21:33:47.0 +0200
@@ -709,6 +709,7 @@ static bool utrace_reset(struct task_str
 */
utrace-resume = UTRACE_RESUME;
utrace-signal_handler = 0;
+   user_disable_single_step(task);
}
 
/*



Re: [PATCH] utrace: utrace_reset() should clear TIF_SINGLESTEP if no more engines

2010-09-20 Thread Jan Kratochvil
On Mon, 20 Sep 2010 21:42:19 +0200, Oleg Nesterov wrote:
 Test-case:

Checked in http://sourceware.org/systemtap/wiki/utrace/tests as step-detach.


Thanks,
Jan