Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless

2009-08-04 Thread Oleg Nesterov
On 08/03, Roland McGrath wrote:

  But I think we do not care. What we do care is: -stopped must be F
  after finish_utrace_stop(), whoever checks it under utrace-lock.

 We do care about a false positive if it makes finish_utrace_stop() hit its
 WARN_ON and return true when there was no SIGKILL.

Yes sure. Now I see I misunderstood your previous email. I think
that this type of false positive is not possible. Please see below.

  when utrace_do_stop() sets -stopped, we hold -siglock, this means
  we can't race with SIGCONT/SIGKILL. If the signal somes after that,
  we can rely on rq-lock, both try_to_wake_up() and schedule() take
  this lock, it should act as a barrier. I think.

 Ok.  If we are in fact relying on any implicit barrier semantics like
 that, we should have a clear comment about it.

Ah. I guess I just added the unnecessary confusion. Indeed, we should
not rely on any implicit barrier semantics with rq-lock.

What I meant, we can rely on the fact that any wakeup (try_to_wakeup()
actually) must be a barrier wrt schedule(). IOW, if we do

VAR = 1;
try_to_wake_up(TASK);

Then TASK can safely do

set_current_state();
schedule();
BUG_ON(VAR != 1);

this is true even if the task does not sleep, in case it is woken
before it enters schedule().

So. If utrace_control() path changes -stopped and wakes up the
tracee, it must see the correct value of -stopped. This is simple.

But what if the tracee is woken by CONT/KILL? Should the tracee
see the last change in -stopped? The signal can race with
utrace_wakeup() or utrace_do_stop(), there are a lot of cases
when the woken tracee can look at -stopped while it is changed
by utrace. But I think all we need is to be sure that:

- if -stopped == T, we have a pending SIGKILL

  If we were woken by SIGCONT which comes right after
  utrace_wakeup() does s/TRACED/STOPPED/, I think we must
  see the result of -stopped = 0 in utrace_wakeup().
  Both utrace_wakeup() and signal_wake_up() take -siglock,
  the memory change in -stopped must be visible to
  utrace_wakeup() and the wakee.

- If we were killed, we must see -stopped == T _unless_
  utrace_wakeup() is in progress or we are already woken
  from utrace pov.

  Now we should worry about SIGKILL racing with utrace_do_stop().
  Again, both take -siglock, etc.

But in short, I can't understand how utrace-lock can really help,
given that a signal changes task's state without this lock. It can
only help to avoid the races with utrace playing with -stopped in
parallel, but I see nothing dangerous here.

  BTW, I think that finish_utrace_stop() doesn't need -siglock to
  check the pending SIGKILL.

 Shouldn't one always hold siglock to use task-pending at all?

Why? It only checks the bit. Note that fatal_signal_pending() is
used lockless.

But this discussion makes me wonder, what report-killed and
utrace_stop's returned value buy us? Can't we kill report-killed
and make utrace_stop() return void?

With or without this patch. Suppose that (say) utrace_report_syscall_entry()
does utrace_stop() and sleeps in TASK_TRACED. Another thread starts the
group stop and sets SIGNAL_STOP_STOPPED. utrace_wakeup() clears -stopped,
notices SIGNAL_STOP_STOPPED and doesn't wake up.

Now, we can sleep a long time. Then SIGKILL kills us, but finish_utrace_stop()
returns false.

Afaics, only utrace_get_signal() uses report.kill, and only
utrace_report_syscall_entry() uses the result of utrace_stop(). Both
can just check fatal_signal_pending().


  - Currently, if we see -stopped == T under utrace-lock we
know that the tracee can do nothing interesting from utrace
pov. It can't be waked up by utrace_wakeup(). If the tracee
is killed it must take -lock to clear -stopped before it
can do anything else.

 Right.  My original thinking was that it must always get into
 utrace_get_signal() right then, and that would be responsible for any
 synchronization that mattered.

If we remove -stopped, we can't rely on TASK_TRACED in the
same manner. For example, a killed tracee can can call
utrace_report_jctl()-REPORT() while utrace thinks it is
stopped.

 You mean the tracehook_notify_jctl(why, CLD_CONTINUED) case, right?

Not sure. utrace_get_signal() can call -report_signal() before it
takes utrace-lock.

 I wonder if it might not make sense anyway to suppress this for the
 SIGKILL case.  Not just the tracing, but the entire CLD_CONTINUED
 notification.  A parent doesn't really need a SIGCHLD,CLD_CONTINUED
 immediately followed by a SIGCHLD,CLD_KILLED.

Heh. Actually, I was wrong. utrace_report_jctl() is not possible if the
tracee is killed. complete_signal(SIGKILL) sets -flags = SIGNAL_GROUP_EXIT.

  - The exiting task with _UTRACE_DEATH_EVENTS can be considered
 ^dead/dying   ^out
as quiescent. But, without 

Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless

2009-08-04 Thread Roland McGrath
 What I meant, we can rely on the fact that any wakeup (try_to_wakeup()
 actually) must be a barrier wrt schedule(). 

Ok.

 But what if the tracee is woken by CONT/KILL? Should the tracee
 see the last change in -stopped? The signal can race with
 utrace_wakeup() or utrace_do_stop(), [...]

It can't actually race with utrace_do_stop, because that holds siglock
while touching -stopped.  Any signal case should already exclude
utrace_do_stop that way.  So it's only utrace_wakeup, which clears
-stopped before taking siglock--and that's the harmless direction to race.

 But I think all we need is to be sure that:
 
   - if -stopped == T, we have a pending SIGKILL
 
 If we were woken by SIGCONT which comes right after
 utrace_wakeup() does s/TRACED/STOPPED/, I think we must
 see the result of -stopped = 0 in utrace_wakeup().
 Both utrace_wakeup() and signal_wake_up() take -siglock,
 the memory change in -stopped must be visible to
 utrace_wakeup() and the wakee.

Ok.

   - If we were killed, we must see -stopped == T _unless_
 utrace_wakeup() is in progress or we are already woken
 from utrace pov.

Right.  In those cases it doesn't matter whether we think we were killed or
not.  It can be before the wakeup or after the wakeup.

 Now we should worry about SIGKILL racing with utrace_do_stop().
 Again, both take -siglock, etc.
 
 But in short, I can't understand how utrace-lock can really help,
 given that a signal changes task's state without this lock. It can
 only help to avoid the races with utrace playing with -stopped in
 parallel, but I see nothing dangerous here.

Right, I just meant about keeping -stopped correct in the signal case.
i.e., in case of signal wakeup, it could never see an old -stopped value.

  Shouldn't one always hold siglock to use task-pending at all?
 
 Why? It only checks the bit. Note that fatal_signal_pending() is
 used lockless.

Ok.

 But this discussion makes me wonder, what report-killed and
 utrace_stop's returned value buy us? Can't we kill report-killed
 and make utrace_stop() return void?

I think at some point earlier I was not confident that a local pending
SIGKILL would always be there.  This was before we had fatal_signal_pending
et al.  It might not have been a valid worry even then.

 With or without this patch. Suppose that (say) utrace_report_syscall_entry()
 does utrace_stop() and sleeps in TASK_TRACED. Another thread starts the
 group stop and sets SIGNAL_STOP_STOPPED. utrace_wakeup() clears -stopped,
 notices SIGNAL_STOP_STOPPED and doesn't wake up.
 
 Now, we can sleep a long time. Then SIGKILL kills us, but finish_utrace_stop()
 returns false.

Yes, that's a good point.

 Afaics, only utrace_get_signal() uses report.kill, and only
 utrace_report_syscall_entry() uses the result of utrace_stop(). Both
 can just check fatal_signal_pending().

Ok.

  You mean the tracehook_notify_jctl(why, CLD_CONTINUED) case, right?
 
 Not sure. utrace_get_signal() can call -report_signal() before it
 takes utrace-lock.

Right.  It couldn't before af3eb44.

 Heh. Actually, I was wrong. utrace_report_jctl() is not possible if the
 tracee is killed. complete_signal(SIGKILL) sets -flags = SIGNAL_GROUP_EXIT.

Ah, good.

  OTOH we can just keep utrace_finish_jctl() and have it and
  finish_utrace_stop() just always take the utrace lock purely
  to synchronize (and then drop it without changing anything).
 
 or we can just do spin_unlock_wait().

I guess that has the same effect as lock+unlock, sure.

 Anyway, I believe that this change, even if good, is low priority. I mean,
 it doesn't immediately makes possible other improvements, afaics. But this
 all is up to you.

No, but it is up to you! ;-)  The point of the exercise is to get utrace
merged upstream.  We will of course keep refining it in the future.  My
attention to the idea of dropping -stopped now was purely on the thought
that it might make the code easier to understand/review/believe correct 
so it could help with upstream reviewers liking it for merge.  So do it if
it helps or do the opposite if that helps, or play it by ear as feels right.


Thanks,
Roland



Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless

2009-08-03 Thread Oleg Nesterov
On 08/01, Roland McGrath wrote:

  finish_utrace_stop() can check -stopped lockless. It was set by us,
  we can't miss it.

 We enter utrace_stop() for some stop.  Either then or later, a group jctl
 stop finishes and sets SIGNAL_STOP_STOPPED.  Later, utrace_wakeup() sees
 that and we switch to TASK_STOPPED after clearing -stopped.  We stay in
 jctl stop for a few days.

 Some new debugger comes along with utrace_control(,,UTRACE_STOP) and
 utrace_do_stop() switches us to TASK_TRACED after setting -stopped.
 Meanwhile, SIGCONT is coming along and clearing SIGNAL_STOP_STOPPED
 (sibling threads run again, etc.).  Now the debugger calls
 utrace_control(,,UTRACE_RESUME), so utrace_wakeup clears -stopped and
 wakes us up.

 Now -stopped was just an instant ago set and then cleared on the other
 CPU and we are running.  Are we really sure that we see it as clear rather
 than set?

Of course, we can have the false positive. Even simpler: by the time
finish_utrace_stop() takes utrace-lock the tracer can clear -stopped
even if it was really set after schedule().

But I think we do not care. What we do care is: -stopped must be F
after finish_utrace_stop(), whoever checks it under utrace-lock.

And I think this must be true. If -stopped was ever cleared before
finish_utrace_stop(), nobody can set it again in such a way that we
can miss it.

when utrace_do_stop() sets -stopped, we hold -siglock, this means
we can't race with SIGCONT/SIGKILL. If the signal somes after that,
we can rely on rq-lock, both try_to_wake_up() and schedule() take
this lock, it should act as a barrier. I think.

Perhaps it makes sense to move utrace_stop()-recalc_sigpending()
code up, before finish_utrace_stop(). This way it would be a bit
clearer why we can't race with signals.

BTW, I think that finish_utrace_stop() doesn't need -siglock to
check the pending SIGKILL.


Anyway, this change is very minor. The only reason I sent this patch
is that I spent some time trying to understand which races this
unconditional spin_lock(utrace-lock) tries to close.


 But I think that I can even worry about this is a good
 indicator that we'd probably be happier if we can indeed get rid of
 -stopped altogether.

I think you are right, and -stopped _can_ go away. But, as a devil's
advocate, I'd like to give a couple of weak arguments against this
change.

- Currently, if we see -stopped == T under utrace-lock we
  know that the tracee can do nothing interesting from utrace
  pov. It can't be waked up by utrace_wakeup(). If the tracee
  is killed it must take -lock to clear -stopped before it
  can do anything else.

  If we remove -stopped, we can't rely on TASK_TRACED in the
  same manner. For example, a killed tracee can can call
  utrace_report_jctl()-REPORT() while utrace thinks it is
  stopped.

- The exiting task with _UTRACE_DEATH_EVENTS can be considered
  as quiescent. But, without -stopped, looking at this task we
  can't know if some engine wants this task to be stopped. IOW,
  if we see such a task we can't figure out was utrace_do_stop()
  called or not.

- I _think_ that -stopped makes the code a bit more readable and
  understandable. More explicit. But this is really subjective.

On the other hand I agree with your arguments. And, as you pointed out,
we can kill utrace_finish_jctl().

In short - I do not know.

Oleg.



Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless

2009-08-03 Thread Roland McGrath
 Of course, we can have the false positive. Even simpler: by the time
 finish_utrace_stop() takes utrace-lock the tracer can clear -stopped
 even if it was really set after schedule().

Right.  It didn't even occur to me that the only scenario I thought of was
a false positive.  I was just thinking through the statement you made about
why lockless was OK (we will have set it ourselves).

 But I think we do not care. What we do care is: -stopped must be F
 after finish_utrace_stop(), whoever checks it under utrace-lock.

We do care about a false positive if it makes finish_utrace_stop() hit its
WARN_ON and return true when there was no SIGKILL.

 when utrace_do_stop() sets -stopped, we hold -siglock, this means
 we can't race with SIGCONT/SIGKILL. If the signal somes after that,
 we can rely on rq-lock, both try_to_wake_up() and schedule() take
 this lock, it should act as a barrier. I think.

Ok.  If we are in fact relying on any implicit barrier semantics like
that, we should have a clear comment about it.

 Perhaps it makes sense to move utrace_stop()-recalc_sigpending()
 code up, before finish_utrace_stop(). This way it would be a bit
 clearer why we can't race with signals.

Ok.  I'm not quite seeing just now why this is clearer.  But if you
have a patch with comments in it, I will probably understand why.

 BTW, I think that finish_utrace_stop() doesn't need -siglock to
 check the pending SIGKILL.

Shouldn't one always hold siglock to use task-pending at all?

 Anyway, this change is very minor. The only reason I sent this patch
 is that I spent some time trying to understand which races this
 unconditional spin_lock(utrace-lock) tries to close.

Good!  Indeed I don't want to waste too much time on minor things, and
we can always clean up more later.  But it is crucial that you are
thinking thoroughly about all the locking and race questions in the
code, so that is always time well spent.

 I think you are right, and -stopped _can_ go away. But, as a devil's
 advocate, I'd like to give a couple of weak arguments against this
 change.

Sure!  We want to go whichever way makes the code overall easiest to
follow, review, and be convinced it's right.

   - Currently, if we see -stopped == T under utrace-lock we
 know that the tracee can do nothing interesting from utrace
 pov. It can't be waked up by utrace_wakeup(). If the tracee
 is killed it must take -lock to clear -stopped before it
 can do anything else.

Right.  My original thinking was that it must always get into
utrace_get_signal() right then, and that would be responsible for any
synchronization that mattered.

 If we remove -stopped, we can't rely on TASK_TRACED in the
 same manner. For example, a killed tracee can can call
 utrace_report_jctl()-REPORT() while utrace thinks it is
 stopped.

You mean the tracehook_notify_jctl(why, CLD_CONTINUED) case, right?
That is literally the only thing that happens after do_signal_stop()
wakes up and before utrace_get_signal() is called.

I wonder if it might not make sense anyway to suppress this for the
SIGKILL case.  Not just the tracing, but the entire CLD_CONTINUED
notification.  A parent doesn't really need a SIGCHLD,CLD_CONTINUED
immediately followed by a SIGCHLD,CLD_KILLED.  

OTOH, there is still the general point.  utrace_get_signal() doesn't
take the utrace lock unconditionally, and I don't think we want it to.
Even if the only thing that happens is dequeuing SIGKILL, then that
could still get to utrace_report_exit().

   - The exiting task with _UTRACE_DEATH_EVENTS can be considered
  ^dead/dying   ^out
 as quiescent. But, without -stopped, looking at this task we
 can't know if some engine wants this task to be stopped. IOW,
 if we see such a task we can't figure out was utrace_do_stop()
 called or not.

(I try to reserve exiting for PF_EXITING and dying for running
with -exit_state set, because the difference is important.)

Who cares?  There is no meaning to UTRACE_STOP for a dying or dead
task.  It counts as quiescent for purposes of utrace_control knowing
what it can do, that's all.  If it's not quiescent yet, it won't ever
stop, it will just finish dying (i.e. report_death callbacks will
finish).

   - I _think_ that -stopped makes the code a bit more readable and
 understandable. More explicit. But this is really subjective.

Ok.  I'll leave it to you.  My first reaction was just that fewer
state bits means less bookkeeping to convince oneself is correct,
so easier to read in that way.  But it's not a strong opinion.

 On the other hand I agree with your arguments. And, as you pointed out,
 we can kill utrace_finish_jctl().
 
 In short - I do not know.

I don't either.  Your first point above is fairly compelling.
OTOH we can just keep utrace_finish_jctl() and have it and
finish_utrace_stop() just always take the utrace lock purely
to synchronize (and then 

Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless

2009-08-01 Thread Roland McGrath
 finish_utrace_stop() can check -stopped lockless. It was set by us,
 we can't miss it.

We enter utrace_stop() for some stop.  Either then or later, a group jctl
stop finishes and sets SIGNAL_STOP_STOPPED.  Later, utrace_wakeup() sees
that and we switch to TASK_STOPPED after clearing -stopped.  We stay in
jctl stop for a few days.

Some new debugger comes along with utrace_control(,,UTRACE_STOP) and
utrace_do_stop() switches us to TASK_TRACED after setting -stopped.
Meanwhile, SIGCONT is coming along and clearing SIGNAL_STOP_STOPPED
(sibling threads run again, etc.).  Now the debugger calls
utrace_control(,,UTRACE_RESUME), so utrace_wakeup clears -stopped and
wakes us up.

Now -stopped was just an instant ago set and then cleared on the other
CPU and we are running.  Are we really sure that we see it as clear rather
than set?

(I think there is also a racier version without jctl stops at all,
just where utrace_wakeup+utrace_do_stop+utrace_wakeup all came and
went before we actually got scheduled.)

I might be overlooking some interlock that covers this.  I put the patch
in, anyway.  But I think that I can even worry about this is a good
indicator that we'd probably be happier if we can indeed get rid of
-stopped altogether.


Thanks,
Roland