Re: Q: utrace-stopped utrace_report_jctl()

2009-03-15 Thread Roland McGrath
 I was wrong, I forgot that tracehook_get_signal() doesn't need JCTL.

Right, that is key.

 OK, let's look at utrace_do_stop:
 
   if (task_is_stopped(target) 
   !(target-utrace_flags  UTRACE_EVENT(JCTL))) {
   utrace-stopped = 1;
   return true;
   }
 
 This doesn't look correct. We don't hold -siglock, the task can be
 SIGCONT'ed and return from get_signal_to_deliver(), and then we set
 -stopped. Or I missed something again?

I think you're right.  The logic there was supposed to be, TASK_STOPPED
means it will get into utrace_get_signal().  That much is true, but
nothing inside utrace_get_signal() actually synchronizes with this to make
that matter.

All this check does is try to optimize the TASK_STOPPED case not to take
the siglock.  That doesn't seem worth much, so we can just drop it.

 Then we re-do this (well, almost) check under -siglock,
 
   } else if (task_is_stopped(target)) {
   if (!(target-utrace_flags  UTRACE_EVENT(JCTL)))
   utrace-stopped = stopped = true;
   }
 
 But this is not nice. Let's suppose the task is already stopped, we do
 UTRACE_ATTACH + utrace_set_events(JCTL).

This is exactly why utrace_set_events() sets -stopped preemptively for
that case.

 Now, utrace_control(UTRACE_STOP) can do nothing until SIGCONT. We don't
 even set -report. Yes, we can't set -stopped if JCTL, we can race with
 utrace_report_jctl() which does REPORT().

Setting JCTL while in TASK_STOPPED made it set -stopped, so
utrace_control() succeeds without calling utrace_do_stop().

 BTW, afaics utrace_report_jctl() has another bug,
 
   REPORT(task, utrace, report, UTRACE_EVENT(JCTL),
  report_jctl, was_stopped ? CLD_STOPPED : CLD_CONTINUED, what);
 
 I think it should do
 
   REPORT(task, utrace, report, UTRACE_EVENT(JCTL),
  report_jctl, what, notify);
 
 instead.

There is a bug, but your fix changes a key API choice.
I've put in this fix:

-  report_jctl, was_stopped ? CLD_STOPPED : CLD_CONTINUED, what);
+  report_jctl, was_stopped ? CLD_STOPPED : CLD_CONTINUED,
+  notify ? what : 0);

There are two things a tracer might be tracking: state or events.
The state is whether the thread is in job control stop or is running.
The events are the SIGCHLD notifications that the thread tries to post to
its parent.

The @type argument shows the state we will be in after the callback.
If the state changes, there will be another callback.  That's what a
state-tracking tracer needs, e.g. to keep a little light on the screen red
while the thread is stopped and green while it's running.

The @notify argument shows what SIGCHLD the parent sees (if it were
dequeuing all possible SIGCHLD postings as quickly as they come).  That's
what an event-tracking tracer needs, e.g. to match up with what SIGCHLDs
are expected in the parent.

Your change to @type would break state-trackers in the case where
tracehook_notify_jctl() is called from get_signal_to_deliver() with
CLD_STOPPED.

 With the first patch, we call utrace_report_jctl() before we actually
 stop. do_signal_stop() can fail then, but I think this is OK, we can
 pretend that SIGCONT/SIGKILL happened after we stopped. It is not complete,
 and with this patch we always call -report_jctl with notify == 0. Just for
 discussion.

I think I sort of understand the intent of your patch.  If we change the
calling convention for tracehook_notify_jctl, I think we want to preserve
the aspect that the hook decides about sending the notification.  That's
how the ptrace quirks can be reimplemented differently later without
changing the tracehook layer again.  Also, we certainly don't want one
tracehook call with two different locking conditions.

It seems right in principle to do the reporting before we change -state,
given that we have to allow for it changing during the callbacks.  And
indeed, that avoids the JCTL special case mess entirely.


Thanks,
Roland



Re: utrace_set_events/utrace_control death/reap checks

2009-03-15 Thread Roland McGrath
 utrace_set_events:
 
   (utrace-death  ((old_flags  ~events)  DEATH_EVENTS))
 
 (old_flags  ~events)  DEATH_EVENTS) means the caller tries to
 clear DEATH/QUIESCE. Why this is not allowed? And why this is not
 allowed _only_ when the target runs utrace_report_death()-REPORT()?

This is specifically documented for -EALREADY, and in the DocBook section
Interlock with final callbacks.  The idea is this:

For most utrace events, you don't know whether you'll get some callbacks.
It could be, the task got SIGKILL first thing after you attached, and it
will never report anything.  That is fine for the most part.  But for the
lifetime events it becomes a real burden on the users of the API.  They
have to manage their data structures, and so they have to know reliably
when they can and can't get what callbacks.

So, the utrace_set_events rules try to ensure that the caller knows for
sure whether it will or won't get a callback when the task dies and/or is
reaped.  You can clear DEATH/QUIESCE, and be sure from the return value
that it is now impossible that there is a report_death/report_quiesce
callback racing with you because the guy just got a SIGKILL.  If you can't
be sure of that, then you do know for sure that your callback is being made
right now or very soon.

 I think this line can be just killed. I guess the intent was to
 prevent utrace_release_task() from doing utrace_reap() in parallel
 with utrace_report_death(), but note that utrace_set_events() can
 never shrinks -utrace_flags, it only sets new bits.

It's not -utrace_flags that matters here, it's engine-flags.

That is one of the intents, but not the only one.  It's just as important
that the user of the API can rely on the ordering of its callbacks wrt its
utrace_set_events/utrace_control calls as that it can rely on the ordering
of its death and reap callbacks.

 The next line looks strange too, don't we need
 
   (utrace-reap  ((events  ~old_flags)  UTRACE_EVENT(REAP)))
 
 ?

get_utrace_lock() already returned -ESRCH if it was in EXIT_DEAD, so this
is probably moot.

 And I don't understand why do we need utrace-death at all. Apart from
 utrace_set_events (which I think doesn't need it), it is only used by
 utrace_control(UTRACE_DETACH). But I can't see how can we race with
 utrace_report_death(). If it can be called, we have DEATH_EVENTS bits
 set. But in that case utrace_do_stop() can't succeed, so UTRACE_DETACH
 can only do mark_engine_wants_stop() but not utrace_reset().

It is used by utrace_set_events and utrace_control for the same purpose.
Those calls must know for sure that report_death cannot happen, or else
that it will (or it's already happening).

Many tracers only keep track until death.  For them, the simple thing is to
have report_death clean up their data structures and return UTRACE_DETACH.
But then they also want to do asynchronous detach.  So they can do
utrace_set_events or utrace_control as the synchronizing step of
asynchronous tear-down.  If it returns 0, then report_death will not and it
is safe to destroy data structures the callback code would use.  If it
returns -EALREADY, then report_death will shortly be called and we can rely
on our callback code to take care of the data structures before it returns
UTRACE_DETACH.


Thanks,
Roland



Re: [PATCH 2/2] UTRACE_STOP: nesting engine management (updated)

2009-03-15 Thread Roland McGrath
 When a report function of an engine returns UTRACE_STOP, it means (may mean)
 that it wants to change the status of the process before resuming it.
 VM monitors often change the status, sometimes debugger users want to set
 some variables too.

Yes.  In ideal cases, it can decide up front quickly what it wants to do,
and change the user state right in the callback without stopping.  But when
it needs another agent to decide what to do, it uses UTRACE_STOP.

 IMHO, utrace should stop it *before* calling the report function of the 
 next engine, 

No, we'll never want to do it this way.  One engine doesn't get to
arbitrarily delay the reporting to other engines of the thread's events.
This is both an efficiency point and a robustness point.  It's important to
remember that utrace is about the primitive events: the user thread had an
event ... the user thread is about to run again.  The high-level notion of
what did the other engine do? is built from examining the state at these
events, and knowing about the delays that other engines are imposing via
UTRACE_STOP.

 otherwise we need to set up another structure to synchronize
 the engines (that may even be unknown one to the other).
 If there is a tracer/debugger among the engines, it is not even possible to 
 know
 which snapshot it gets, after or before the modification created by the VM
 monitor?

This is where the broader discussion of callback order comes in.

When a previous engine has decided to use UTRACE_STOP, your callback's
@action argument reflects this.  You know that another engine is going to
do something asynchronous before it lets the user thread run.  If your own
engine doesn't especially want it stopped now but wants to see what it
looks like when other engines are done fiddling with it, then you can use
UTRACE_REPORT.  That ensures that you'll get a report_quiesce callback
after those other engines have done their thing.


Thanks,
Roland