Re: Possible problem with utrace_control

2010-06-23 Thread Oleg Nesterov
On 06/22, Roland McGrath wrote:

  OTOH, I am wondering if we can give more power to utrace_control(STOP).
  Currently the debugger can't stop the tracee unless it cooperates. However, if
  the tracee doesn't have UTRACE_EVENT(QUIESCE), then utrace_control(STOP)
  works. This is strange.

 Why is it strange?  If you don't have any callback, then all your engine
 does is get stopped.  If you have callbacks, then you always get callbacks
 before the thread does anything (including stop).  Since your callback's
 return value always replaces any other resume action

OK. But then perhaps UTRACE_STICKY_STOP makes sense even without the
races we discussed. It can simplify the cooperation in this case.

  Stupid question: what if we just remove the clear_engine_wants_stop()
  code in finish_callback_report() ?

 That would change the API so that a callback's resume action is always
 overridden by a utrace_control(,,UTRACE_STOP).  Today the practice from a
 third-party debugger thread is to do:

   ret = utrace_control(task, engine, UTRACE_STOP);
   if (ret == 0) {
     // Already stopped, e.g. was in jctl or another engine's stop.
     // This means we can do our asynchronous examination here.
   } else if (ret == -EINPROGRESS) {
     // it will get to report_quiesce soonish (modulo blocked in kernel)
   }

 If the examination you wanted to do was some predetermined sequence of
 fetching and fiddling before resuming, then the report_* callback can just
 do all that itself.  For example, fetch the registers for a sample and keep
 going; insert some breakpoints and keep going; etc.

OK, agreed.

 With this
 change, it would have to explicitly request not stopping by calling
 utrace_control(current, engine, UTRACE_RESUME) inside the callback.

Ah, indeed, this can't be good. Thanks.

  Btw, before I forget about this: shouldn't utrace_control(UTRACE_RESUME)
  remove ENGINE_STOP from ->utrace_flags like DETACH does?

 Perhaps so.  My memory is dim on the use of that bit in utrace_flags.

(don't assume I was able to quickly recall why DETACH does this ;)

 It's
 to close certain races, but we'll have to remind each other of the details.

Yes. Let's consider a concrete example. The tracee is going to
stop and calls utrace_stop(). Before it takes utrace->lock and
sets state = TASK_TRACED, the debugger does utrace_control(DETACH).

In this case utrace_stop() shouldn't stop, otherwise nobody will
ever wake it up. That is why we clear this bit in ->utrace_flags
to provoke utrace_reset(), which will carefully check whether the tracee
has an engine with ENGINE_STOP.

The original motivation for this patch was ptrace-utrace, the
implicit DETACH if the tracer exits.

UTRACE_RESUME should probably do the same, otherwise the tracee
can stop right after utrace_control(RESUME). This case is less
important compared to UTRACE_DETACH (and just in case, ptrace
doesn't need this) but still.

  Hmm. Not sure... at least I don't immediately see something simple.
 
  Except, just add wait_queue_head into struct utrace. This way
  utrace_barrier() can wait longer than needed if it notices
  utrace->reporting == engine. Or we can add it into utrace_engine, in
  this case utrace_stop() needs list_for_each(engine, utrace->attached).

 You're talking about two different things here.  Whatever implementation
 details change in utrace_barrier() is not directly apropos.

Yes, agreed.

 For the new API idea, I was talking about something like a utrace_wake_up()
 call.

Now I don't understand you again...

The 1st question: should the new API help us to kill ptrace_notify_stop()
in utrace_stop() ?

 What I'm most interested in right now is thinking through the utility of
 particular new API details for engine writers.

Hmm... After the previous email I thought that this utrace_wake_up()
or whatever shouldn't be visible outside of utrace.c.

Ignoring the ptrace_notify_stop() problem, could you give us any
example of how it can be used?

Or. Do you literally mean something like

    utrace_wait_for_event();    // for debugger
    utrace_wake_up();           // for tracee

which should (at least) cover all races with exit/detach/etc ?

Oleg.



Re: Possible problem with utrace_control

2010-06-23 Thread Roland McGrath
 OK. But then perhaps UTRACE_STICKY_STOP makes sense even without the
 races we discussed. It can simplify the cooperation in this case.

The only cooperation methods we should consider are those that cover all
their races.

 Yes. Let's consider a concrete example. The tracee is going to
 stop and calls utrace_stop(). Before it takes utrace->lock and
 sets state = TASK_TRACED, the debugger does utrace_control(DETACH).
 
 In this case utrace_stop() shouldn't stop, otherwise nobody will
 ever wake it up. That is why we clear this bit in ->utrace_flags
 to provoke utrace_reset(), which will carefully check whether the tracee
 has an engine with ENGINE_STOP.

Right.  I agree this race seems to apply to all kinds of resumption, not
just detach.  The question for the non-detach cases is what kind of
guarantee the utrace API is claiming to provide.  For detach it matters
strongly because the loser of the race is gone and there is no way for
it to be responsible for cleaning up the fallout.

For non-detach cases, the picture is more hazy.  The conclusions on what
matters might be different if we have utrace_wake_up() or equivalent.

Off hand I think it does matter today insofar as it violates the
documented guarantees of the utrace_barrier API.  If utrace_barrier
returns 0 it is said to mean that "Any effect of its return value (such
as %UTRACE_STOP) has already been applied to @engine."  So e.g. if you
get a wake-up sent by your callback before it returns UTRACE_STOP, and
then call utrace_barrier followed by utrace_control(,,UTRACE_RESUME),
then you should be guaranteed that your resumption cleared the
callback's stop.

Another example, you call utrace_set_events and disable some events,
including all the ones for which your callbacks ever return UTRACE_STOP.
If that returns 0, or you then call utrace_barrier and it returns 0,
then you know that either it was already stopped before you called
utrace_set_events, or that it's not stopped now.  In either case, you
know that utrace_control(,,UTRACE_RESUME) ensures that it's no longer
stopped.  (That is, stopped by your engine--it of course might still
be stopped by a different engine.)

  For the new API idea, I was talking about something like a utrace_wake_up()
  call.
 
 Now I don't understand you again...
 
 The 1st question: should the new API help us to kill ptrace_notify_stop()
 in utrace_stop() ?

Ideally yes.  This might still have to be some other kind of special case,
we'll have to figure that out.  That would be survivable if it's necessary.
The wait_chldexit wakeup would be covered by what we're discussing, it's
just another wait_queue_head_t.  The signal sending is more problematic.

But I wanted first to contemplate what API would cover new code cleanly
written.  The dismal old ptrace signal/wait synchronization might always be
a difficult outlying case.

 Hmm... After the previous email I thought that this utrace_wake_up()
 or whatever shouldn't be visible outside of utrace.c.

I'm sorry if I was not clear.  I have been talking entirely about new
utrace API features, not internals.  This is a discussion for your
engine-writer hat more than your utrace-maintainer hat.  I'm only
mentioning any utrace implementation details to elucidate what is
realistic or problematic to consider for the new API features.

 Or. Do you literally mean something like
 
    utrace_wait_for_event();    // for debugger
    utrace_wake_up();           // for tracee
 
 which should (at least) cover all races with exit/detach/etc ?

I meant utrace_wake_up(wait_queue_head_t *) used in place of wake_up()
inside a utrace callback.  The waiter (tracer) side would use normal
linux/wait.h interfaces (wait_event macro, etc.).
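In sketch form (purely illustrative pseudocode -- utrace_wake_up() does not
exist yet, and tracee_is_stopped() is a made-up stand-in for whatever
bookkeeping the engine keeps):

```
/* Tracee side, inside a report_* callback: instead of a bare
 * wake_up(&data->waitq) -- which can race with the tracer's
 * utrace_control() arriving before our UTRACE_STOP is recorded --
 * queue the wake-up to be issued only after the stop is in force. */
	utrace_wake_up(&data->waitq);
	return UTRACE_STOP;

/* Tracer side: plain linux/wait.h usage, no utrace_barrier() dance. */
	wait_event(data->waitq, tracee_is_stopped(data));
	/* the tracee is stopped; examine it, then: */
	utrace_control(task, engine, UTRACE_RESUME);
```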


Thanks,
Roland



Re: Possible problem with utrace_control

2010-06-22 Thread Oleg Nesterov
On 06/21, Roland McGrath wrote:

  I am wondering if there is another way to handle such a race...
  Suppose we
 
  - add UTRACE_STOP_XXX
 
  - modify utrace_control(UTRACE_STOP_XXX) to do
mark_engine_wants_stop() and nothing more.

 utrace_control(current,,UTRACE_STOP) doesn't really do anything more.

It does utrace_do_stop() which sets UTRACE_REPORT/TIF_NOTIFY_RESUME.
Harmless, but makes sense to avoid.

  Then, probably producer could do
 
  utrace_control(current, UTRACE_STOP_XXX);
 
  slot = book_message_in_buffer(buf);
  if (slot < 0)
  return UTRACE_STOP_XXX;
 
  ...
  return UTRACE_RESUME;
 
  Even better, UTRACE_STOP_XXX can live outside of UTRACE_RESUME_MASK.

 Right, it doesn't belong in enum utrace_resume_action--it doesn't make
 sense as an argument for utrace_control, for example.  So it could be
 something like a return-value flag bit saying "stop if marked for stop",
 e.g. UTRACE_SINGLESTEP|UTRACE_STICKY_STOP for "single-step, unless marked
 to stop, in which case stay stopped".

Yes, this is what I meant. And yes, if UTRACE_STICKY_STOP doesn't
fall into UTRACE_RESUME_MASK utrace_control(enum utrace_resume_action)
shouldn't be used.

OTOH, I am wondering if we can give more power to utrace_control(STOP).
Currently the debugger can't stop the tracee unless it cooperates. However, if
the tracee doesn't have UTRACE_EVENT(QUIESCE), then utrace_control(STOP)
works. This is strange.

Stupid question: what if we just remove the clear_engine_wants_stop()
code in finish_callback_report() ?

 So this would fix that race.
 ...
 (But note if there is
 another engine that decides to stop anyway, then we'll have used a
 second CPU and/or ping-pong scheduled to run the tracer thread in
 parallel with a very short amount of work in the tracee before it just
 blocks anyway.)

Hmm. Not sure I understand... Yes, the tracee still can stop if another
tracer demands, but this is fine?

Btw, before I forget about this: shouldn't utrace_control(UTRACE_RESUME)
remove ENGINE_STOP from ->utrace_flags like DETACH does?

 Here
 what the callback does is wake up the tracer, and return UTRACE_STOP.

 For this sort of scenario, it is clearly better by far to have the
 wake_up_tracer() done as late as possible in the sequence.

 In the past, Oleg has suggested something very general, like a
 utrace_engine_ops hook that gets called right before schedule().
 I shy away from something like that, since I think it degrades the
 intended property of the API that it makes it harder to write truly
 arcane bugs with terrible effects on the system, not easier.  It's
 just too big a sledgehammer to wield above the thinly-protected toes
 of the utrace engine writer.

 To avoid something dangerously general, the only real option is to give
 something safely constrained and specific.

 It's straightforward enough to
 come up with something for any single synchronization method.  For
 example, if every need for this kind of synchronization with tracees is
 based on linux/wait.h facilities, then we can have a utrace API feature
 whereby a callback can queue up a wait_queue_head_t pointer to be passed
 to wake_up_interruptible_sync() after all the callbacks are done and
 UTRACE_STOP is being performed.  It's probably sufficient to allow one
 pending wake-up per engine, which makes it pretty cheap and simple to
 implement.

Heh, it turns out that I misunderstood you in the past; I thought
you disliked the whole idea of wake_up from utrace_stop().

I agree, the utrace_engine_ops->notify() hook is dangerous. But it
is simple and allows us to kill utrace_stop()->ptrace_notify_stop().

(yes I understand, this is not only about utrace_stop(), utrace_barrier
 wants notification too).

 It's straightforward enough to
 come up with something for any single synchronization method.

Hmm. Not sure... at least I don't immediately see something simple.

Except, just add wait_queue_head into struct utrace. This way utrace_barrier()
can wait longer than needed if it notices utrace->reporting == engine. Or we
can add it into utrace_engine, in this case utrace_stop() needs
list_for_each(engine, utrace->attached).

Or I misunderstood you again?

Oleg.



Re: Possible problem with utrace_control

2010-06-22 Thread Roland McGrath
  utrace_control(current,,UTRACE_STOP) doesn't really do anything more.
 
 It does utrace_do_stop() which sets UTRACE_REPORT/TIF_NOTIFY_RESUME.
 Harmless, but makes sense to avoid.

Right.  I was talking about the API perspective.  Those internal details
are more or less invisible to a given engine.

 OTOH, I am wondering if we can give more power to utrace_control(STOP).
 Currently the debugger can't stop the tracee unless it cooperates. However, if
 the tracee doesn't have UTRACE_EVENT(QUIESCE), then utrace_control(STOP)
 works. This is strange.

Why is it strange?  If you don't have any callback, then all your engine
does is get stopped.  If you have callbacks, then you always get callbacks
before the thread does anything (including stop).  Since your callback's
return value always replaces any other resume action for your engine, you
have to cooperate with yourself in not changing your mind the next time
your code is in charge of the tracee.

 Stupid question: what if we just remove the clear_engine_wants_stop()
 code in finish_callback_report() ?

That would change the API so that a callback's resume action is always
overridden by a utrace_control(,,UTRACE_STOP).  Today the practice from a
third-party debugger thread is to do:

ret = utrace_control(task, engine, UTRACE_STOP);
if (ret == 0) {
  // Already stopped, e.g. was in jctl or another engine's stop.
  // This means we can do our asynchronous examination here.
} else if (ret == -EINPROGRESS) {
  // it will get to report_quiesce soonish (modulo blocked in kernel)
}

If the examination you wanted to do was some predetermined sequence of
fetching and fiddling before resuming, then the report_* callback can just
do all that itself.  For example, fetch the registers for a sample and keep
going; insert some breakpoints and keep going; etc.

Today, the callback that goes with that asynchronous-side code has the
choice to do something and stop or to do something and not stop.  With this
change, it would have to explicitly request not stopping by calling
utrace_control(current, engine, UTRACE_RESUME) inside the callback.  It
would need to use its own bookkeeping above the utrace layer to know that
there was a stop it was supposed to clear (i.e. something the code sample
above would set before calling utrace_control).

This changes the set of races with asynchronous stop/resume scenarios.
But it doesn't remove the asynchronous-resume vs callback-stop race.
In concert with wake_up_tracer-after-stop perhaps it gets better.
We'd have to really think through the races with the new API.

  So this would fix that race.
  ...
  (But note if there is
  another engine that decides to stop anyway, then we'll have used a
  second CPU and/or ping-pong scheduled to run the tracer thread in
  parallel with a very short amount of work in the tracee before it just
  blocks anyway.)
 
 Hmm. Not sure I understand... Yes, the tracee still can stop if another
 tracer demands, but this is fine?

Correct.  When that is what's going to happen (e.g. there is another engine
attached that is a slow stopper like ptrace), then it's suboptimal to have
the scheduler decide to yield the CPU immediately on waking your first
tracer.

 Btw, before I forget about this: shouldn't utrace_control(UTRACE_RESUME)
 remove ENGINE_STOP from ->utrace_flags like DETACH does?

Perhaps so.  My memory is dim on the use of that bit in utrace_flags.  It's
to close certain races, but we'll have to remind each other of the details.

 Heh, it turns out that I misunderstood you in the past; I thought
 you disliked the whole idea of wake_up from utrace_stop().

No, just the over-general hook or the over-specific ptrace hack.
I talked about a utrace_wake_up() that would work this way.

 Hmm. Not sure... at least I don't immediately see something simple.
 
 Except, just add wait_queue_head into struct utrace. This way utrace_barrier()
 can wait longer than needed if it notices utrace->reporting == engine. Or we
 can add it into utrace_engine, in this case utrace_stop() needs
 list_for_each(engine, utrace->attached).

You're talking about two different things here.  Whatever implementation
details change in utrace_barrier() is not directly apropos.  That would not
change the API-visible details at all, just make utrace_barrier() perform
better.  Let's not get into that before we hash out the new API ideas.

For the new API idea, I was talking about something like a utrace_wake_up()
call.  That might be implemented by a wait_queue_head_t * in utrace_engine,
for example.  The implementation internals could involve a simple
iteration, or involve more hidden bookkeeping.  I'm only talking about the
API idea for the moment.  I only mentioned implementation details at all to
indicate that a restricted API (one wake-up per callback per engine) could
be done with constant storage overhead and thus avoid a whole potential can
of worms--which is the only real reason to 

Re: Possible problem with utrace_control

2010-06-21 Thread Oleg Nesterov
On 06/17, Alpár Török wrote:

 The producer is a utrace monitored task. At any time there is an
 arbitrary number of producers. On sys_entry and sys_exit the producers
 do:

 slot = book_message_in_buffer(buf);
 if (slot < 0) {
   /* no slot can be allocated, producer must sleep */
   enque_producer_on_wait_list(list, producer);
   return UTRACE_STOP;
 }

I think Roland is right. Let's forget about utrace for the moment,
this code looks like

    if (!CONDITION) {
        set_task_state(TASK_XXX);
        schedule();
    }

and it can obviously race with

    CONDITION = true;
    wake_up_xxx();



I am wondering if there is another way to handle such a race...
Suppose we

- add UTRACE_STOP_XXX

- modify utrace_control(UTRACE_STOP_XXX) to do
  mark_engine_wants_stop() and nothing more.

- finish_callback_report(UTRACE_STOP_XXX) does

    spin_lock(&utrace->lock);
    if (engine_wants_stop(engine))
        action = UTRACE_STOP;
    else
        action = UTRACE_RESUME;
    spin_unlock(&utrace->lock);

Then, probably the producer could do

    utrace_control(current, UTRACE_STOP_XXX);

    slot = book_message_in_buffer(buf);
    if (slot < 0)
        return UTRACE_STOP_XXX;

    ...
    return UTRACE_RESUME;

Even better, UTRACE_STOP_XXX can live outside of UTRACE_RESUME_MASK.

Unlikely this makes sense, just a random thought.

Oleg.



Re: Possible problem with utrace_control

2010-06-21 Thread Roland McGrath
 I think Roland is right. Let's forget about utrace for the moment,
 this code looks like
 
   if (!CONDITION) {
   set_task_state(TASK_XXX);
   schedule();
   }
 
 and it can obviously race with
 
   CONDITION = true;
   wake_up_xxx();

This is Linux in-kernel synchronization 101.
The correct blocks of this sort do:

    set_current_state(TASK_UNINTERRUPTIBLE);
    while (!CONDITION) {
        schedule();
    }
    set_current_state(TASK_RUNNING);

This is entirely orthogonal to anything having to do with utrace per se.
This pattern is race-free because wake_up_* will reset the waiter to
TASK_RUNNING after making CONDITION true.  So in the race where the
while (!CONDITION) clause lets it get into schedule(), that just
returns immediately, after which the second iteration sees CONDITION true.

This is why I recommended using higher-level things like linux/wait.h
facilities, where the correct usage patterns are more obvious.

But the race we're actually talking about here is on the other side of
the coin.

 I am wondering if there is another way to handle such a race...
 Suppose we
 
   - add UTRACE_STOP_XXX
 
   - modify utrace_control(UTRACE_STOP_XXX) to do
 mark_engine_wants_stop() and nothing more.

utrace_control(current,,UTRACE_STOP) doesn't really do anything more.

So a callback doing this closes the old race: UTRACE_STOP is in force
when the tracer side wakes up, since utrace_control-on-current was done
before wake_up_tracer().  But...

   - finish_callback_report(UTRACE_STOP_XXX) does
 
   spin_lock(&utrace->lock);
   if (engine_wants_stop(engine))
       action = UTRACE_STOP;
   else
       action = UTRACE_RESUME;
   spin_unlock(&utrace->lock);

Let's clarify what this means for the people following the utrace API
but not its implementation internals: currently, the callback's return
value overrides the engine's last-set state such as UTRACE_STOP, whether
that came from a previous utrace_control before the callback, a
utrace_control call inside the callback, or from the return value of a
previous callback.

So now we've moved to a different race (or a different instance of the
original race, if you like): after wake_up_tracer(), the tracer can come
and clear our engine's UTRACE_STOP state with its utrace_control call;
but then our callback return value just reestablishes UTRACE_STOP and we
still stop, having lost the asynchronous resumption from the tracer.

So Oleg has an idea for a new kind of callback return value, which I'd call
perhaps UTRACE_NOP.  This would mean to stay in UTRACE_STOP if the last
utrace_control call on this engine used UTRACE_STOP, or the last callback
return value left the engine in that state, otherwise UTRACE_RESUME.

 Then, probably producer could do
 
   utrace_control(current, UTRACE_STOP_XXX);
 
   slot = book_message_in_buffer(buf);
   if (slot < 0)
   return UTRACE_STOP_XXX;
 
   ...
   return UTRACE_RESUME;
 
 Even better, UTRACE_STOP_XXX can live outside of UTRACE_RESUME_MASK.

Right, it doesn't belong in enum utrace_resume_action--it doesn't make
sense as an argument for utrace_control, for example.  So it could be
something like a return-value flag bit saying "stop if marked for stop",
e.g. UTRACE_SINGLESTEP|UTRACE_STICKY_STOP for "single-step, unless marked
to stop, in which case stay stopped".

So this would fix that race.  For the scenario we're discussing here,
that's probably just fine.  The only reason to stop was for the consumer
to run.  If it woke up quickly and ran enough to call utrace_control
while we were still finishing the return of the callback and utrace
bookkeeping, then we just don't stop at all.  (But note if there is
another engine that decides to stop anyway, then we'll have used a
second CPU and/or ping-pong scheduled to run the tracer thread in
parallel with a very short amount of work in the tracee before it just
blocks anyway.)

Now let's consider another scenario.  Take the normal debugger scenario,
i.e. a ptrace implementation done cleanly on the utrace API without the
magic kludges we have today, or a ptrace replacement built cleanly from
scratch in the ideal way, or the pending gdbstub implementation.  Here
what the callback does is wake up the tracer, and return UTRACE_STOP.
What the tracer thread does is block in some fashion until the tracee's
wake_up_tracer(), then do whatever logic (in ptrace, that means return
to user mode and let it make another ptrace system call, but that all
might be quite fast), and decide to examine the thread.  To make sure
its own callback is finished and UTRACE_STOP will be in force, it has to
synchronize with utrace_barrier.  So in maximum parallelism, this means,
having preempted the tracee just now by waking up, we now yield back to
the tracee to finish our callback, other engines' callbacks, and utrace
bookkeeping. 

Re: Possible problem with utrace_control

2010-06-17 Thread Alpár Török
2010/6/13 Roland McGrath rol...@redhat.com:
 Thanks for your interest in utrace.

 It's correct that passing UTRACE_REPORT to utrace_control should
 always ensure you get some report_* callback before the tracee returns
 to user mode.

 However, I think your use may be susceptible to a race between
 resumption by utrace_control, and the UTRACE_STOP done by the
 callback.  If your callback looks something like:

        wake_up_consumer();
        return UTRACE_STOP;

 then there is a window when the consumer can wake up and call
 utrace_control before that callback has actually returned UTRACE_STOP
 and had it processed.  In that event, your UTRACE_REPORT comes before
 the UTRACE_STOP, and then you do nothing afterwards to resume it.  If
 it ever did resume, it would indeed report (the UTRACE_REPORT is still
 pending).  But that doesn't help you.

 Off hand, I think there is only one way to address this race.  When
 you get the notification to consume, before calling utrace_control,
 call utrace_barrier first.  This will block the consumer until the
 producer's callback has actually finished and had UTRACE_STOP recorded
 for your engine.  At that point, you can be sure that your
 utrace_control call will indeed wake up the stopped tracee.

 Oleg or others here can double-check my reasoning and help explain the
 situation if you share your code for them to see.

 Because of this possibility, it might make sense to have utrace_control
 return -EINPROGRESS for this situation.  That at least would alert you
 that something might be amiss.  Unlike other cases of -EINPROGRESS, this
 would not just mean that you need to call utrace_barrier to be sure the
 effect has completed.  Rather, it means there is a danger of the
 scenario I described above, where the effect you asked for is going to
 be trumped by the callback's UTRACE_STOP.  So it doesn't quite fit in
 with the other situations--it's not really an indication that you have
 to follow up with a synchronization, it's an indication that your
 synchronization logic is already wrong.  But at least it would relieve
 the mystery of such a situation, assuming you are checking return values
 and noticing in debugging.  I'm certainly open to opinions on this API
 change.

 The need to use utrace_barrier as a second synchronization after the
 main wake-up (done by whatever normal means you are using) is rather
 clunky, and when it matters (as we suspect it does in the occasional
 race in your case) it leads to ping-pong scheduling.  So we would like
 to improve the utrace API in this area at some point.  One idea we have
 discussed before is a utrace_wake_up call.  (This assumes that some
 variant of linux/wait.h wake_up* is what you are using in the guts of
 wake_up_consumer.)  This would replace wake_up* for use in a callback,
 and simply do the wake-up after all the callbacks are finished (or
 perhaps just yours, but probably all), so your return value action is
 already recorded.  In fact, it might do the wakeup only after the
 UTRACE_STOP is not only recorded but actually performed, so that by the
 time your consumer wakes up, the tracee is actually stopped and
 available for third-party examination.  (You don't actually care about
 that for your case--for your purposes, it would be optimal just to delay
 the wakeup until your own callback's return value is recorded.  That
 way, if your consumer acts quickly enough on another CPU, it might even
 resume the tracee before it actually yields the processor.)

 As you might tell from the complexity of that paragraph, there are many
 fuzzy nuances of exactly what that better API might be.  (And I didn't
 even delve into the possibility that you're using some synchronization
 mechanism other than wake_up* calls on a wait_queue_head_t.)  Hence we
 haven't tried to work it out exactly yet.  We hope that seeing a variety
 of uses such as yours will help us figure out how best to improve the
 utrace API for tracee-callback-to-tracer synchronization handshakes.


 Thanks,
 Roland


Sorry for the long delay. Here's my scenario for better understanding.

The buffer that's used for communication is in essence a circular
buffer, but instead of the buffer wrapping, the consumer is
suspended.

I have *a single* consumer at any time. The consumer waits for data
to be available. The code looks like this:

set_current_state(TASK_INTERRUPTIBLE);
while (atomic_read(&buf->tail) == atomic_read(&buf->head)) {
    pc_debug("Reader task going to sleep!");
    schedule();
    if (signal_pending(current))
        return -ERESTARTSYS;
}
set_current_state(TASK_RUNNING);

Once data is available it reads some, freeing up some space. After
that it wakes up all producers. Producers are utrace-monitored tasks.
I use a standard linked list, protected by a spin lock, to hold the
references to the utrace engine and task structures that I need for
utrace_control. Is it possible to use a wait_queue for this? I know it can

Re: Possible problem with utrace_control

2010-06-17 Thread Roland McGrath
You should use linux/wait.h or similar facilities rather than calling
schedule() and wake_up_process() directly.  This doesn't have anything to
do with utrace, it's just the clean practice for normal kinds of blocking
in the kernel, for a variety of reasons.  That has to do with how your
control thread blocks, which is just the normal set of issues for any
thread in kernel code.  You are using the lowest-level way of doing things,
which is not recommended--that's why various higher-level facilities exist.

The admonition to use utrace facilities for all blocks has to do with when
you make a traced thread block.  You are on the correct path in that regard.

The relevant parts of your scenario match what I described.  The race I
talked about between your control thread's utrace_control() and tracee
callbacks returning UTRACE_STOP explains what you are seeing.


Thanks,
Roland



Re: Possible problem with utrace_control

2010-06-13 Thread Roland McGrath
Thanks for your interest in utrace.

It's correct that passing UTRACE_REPORT to utrace_control should
always ensure you get some report_* callback before the tracee returns
to user mode.

However, I think your use may be susceptible to a race between
resumption by utrace_control, and the UTRACE_STOP done by the
callback.  If your callback looks something like:

wake_up_consumer();
return UTRACE_STOP;

then there is a window when the consumer can wake up and call
utrace_control before that callback has actually returned UTRACE_STOP
and had it processed.  In that event, your UTRACE_REPORT comes before
the UTRACE_STOP, and then you do nothing afterwards to resume it.  If
it ever did resume, it would indeed report (the UTRACE_REPORT is still
pending).  But that doesn't help you.

Off hand, I think there is only one way to address this race.  When
you get the notification to consume, before calling utrace_control,
call utrace_barrier first.  This will block the consumer until the
producer's callback has actually finished and had UTRACE_STOP recorded
for your engine.  At that point, you can be sure that your
utrace_control call will indeed wake up the stopped tracee.

Oleg or others here can double-check my reasoning and help explain the
situation if you share your code for them to see.

Because of this possibility, it might make sense to have utrace_control
return -EINPROGRESS for this situation.  That at least would alert you
that something might be amiss.  Unlike other cases of -EINPROGRESS, this
would not just mean that you need to call utrace_barrier to be sure the
effect has completed.  Rather, it means there is a danger of the
scenario I described above, where the effect you asked for is going to
be trumped by the callback's UTRACE_STOP.  So it doesn't quite fit in
with the other situations--it's not really an indication that you have
to follow up with a synchronization, it's an indication that your
synchronization logic is already wrong.  But at least it would relieve
the mystery of such a situation, assuming you are checking return values
and noticing in debugging.  I'm certainly open to opinions on this API
change.

The need to use utrace_barrier as a second synchronization after the
main wake-up (done by whatever normal means you are using) is rather
clunky, and when it matters (as we suspect it does in the occasional
race in your case) it leads to ping-pong scheduling.  So we would like
to improve the utrace API in this area at some point.  One idea we have
discussed before is a utrace_wake_up call.  (This assumes that some
variant of linux/wait.h wake_up* is what you are using in the guts of
wake_up_consumer.)  This would replace wake_up* for use in a callback,
and simply do the wake-up after all the callbacks are finished (or
perhaps just yours, but probably all), so your return value action is
already recorded.  In fact, it might do the wakeup only after the
UTRACE_STOP is not only recorded but actually performed, so that by the
time your consumer wakes up, the tracee is actually stopped and
available for third-party examination.  (You don't actually care about
that for your case--for your purposes, it would be optimal just to delay
the wakeup until your own callback's return value is recorded.  That
way, if your consumer acts quickly enough on another CPU, it might even
resume the tracee before it actually yields the processor.)

As you might tell from the complexity of that paragraph, there are many
fuzzy nuances of exactly what that better API might be.  (And I didn't
even delve into the possibility that you're using some synchronization
mechanism other than wake_up* calls on a wait_queue_head_t.)  Hence we
haven't tried to work it out exactly yet.  We hope that seeing a variety
of uses such as yours will help us figure out how best to improve the
utrace API for tracee-callback-to-tracer synchronization handshakes.


Thanks,
Roland



Possible problem with utrace_control

2010-06-11 Thread Alpár Török
Hi,

 I have the following scenario:
Producer consumer model. The traced tasks are producers, and there is
one consumer. Producers produce data into a buffer. If the buffer is
full, the producer registers itself into a list (lock protected) and
returns UTRACE_STOP. The consumer always checks the list and
utrace_control(REPORT)s the producers (all of them). The monitoring
callback is called again, and the producer produces what it should
have in the first place.

This works well most of the time. Sometimes it would seem that
utrace_control doesn't do its job, and doesn't return an error
either. I can observe this when there is high pressure on the
buffer and it's almost full, so the STOP / RESUME cycle happens
often.

What I would like to know is how to go about clarifying this. First
of all, is my assumption correct that UTRACE_REPORT will always cause
another call to the monitoring callback? What other precautions
should I take? If I am using utrace correctly, how could I find the
problem within it?

This problem leads to a deadlock in my model, since the consumer goes
to sleep relying on the producer to wake it up, which in turn never
wakes up itself.

Alpar Torok