Re: Possible problem with utrace_control

Roland McGrath Wed, 23 Jun 2010 13:51:54 -0700

> OK. But then perhaps UTRACE_STICKY_STOP makes sense even without the
> races we discussed. It can simplify the cooperation in this case.


The only cooperation methods we should consider are those that cover all
their races.

> Yes. Let's consider the concrete example. The tracee is going to
> stop and calls utrace_stop(). Before it takes utrace->lock and
> sets state = TASK_TRACED, the debugger does utrace_control(DETACH).
> 
> In this case utrace_stop() shouldn't stop, otherwise nobody will
> ever wake it up. That is why we clear this bit in ->utrace_flags
> to provoke utrace_reset() which will check carefully the tracee
> has an engine with ENGINE_STOP.

Right.  I agree this race seems to apply to all kinds of resumption, not
just detach.  The question for the non-detach cases is what kind of
guarantee the utrace API is claiming to provide.  For detach it matters
strongly because the "loser" of the race is gone and there is no way for
it to be responsible for cleaning up the fallout.

For non-detach cases, the picture is more hazy.  The conclusions on what
matters might be different if we have utrace_wake_up() or equivalent.

Offhand, I think it does matter today insofar as it violates the
documented guarantees of the utrace_barrier API.  If utrace_barrier
returns 0 it is said to mean that, "Any effect of its return value (such
as %UTRACE_STOP) has already been applied to @engine."  So e.g. if you
get a wake-up sent by your callback before it returns UTRACE_STOP, and
then call utrace_barrier followed by utrace_control(,,UTRACE_RESUME),
then you should be guaranteed that your resumption cleared the
callback's stop.
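
As a rough illustration of that ordering guarantee, here is a toy
single-threaded model in user-space C.  It is not kernel code: struct
model_engine and every model_* function are hypothetical names made up
for the sketch, and model_barrier only imitates "the callback has
finished and its return value has been applied" rather than the real
utrace_barrier.  The no-barrier case shows why the ordering matters in
the model; it is not a claim about what utrace itself does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT kernel code: one engine attached to one tracee. */
struct model_engine {
    bool callback_running; /* callback entered, return value not yet applied */
    bool stopped;          /* engine is holding the tracee stopped */
};

/* Tracee side: the callback has sent its wake-up but has not returned yet. */
static void model_callback_enter(struct model_engine *e)
{
    e->callback_running = true;
}

/* Tracee side: the callback returns UTRACE_STOP and the stop is applied. */
static void model_callback_return_stop(struct model_engine *e)
{
    e->stopped = true;
    e->callback_running = false;
}

/* Tracer side: stand-in for a barrier that returns 0.  The real call
 * waits for the callback; this model just runs it to completion. */
static int model_barrier(struct model_engine *e)
{
    if (e->callback_running)
        model_callback_return_stop(e);
    return 0;
}

/* Tracer side: stand-in for a resume; clears this engine's stop. */
static void model_resume(struct model_engine *e)
{
    e->stopped = false;
}

/* Resume runs ahead of the callback's return: the stop lands afterwards. */
bool model_resume_without_barrier(void)
{
    struct model_engine e = { false, false };

    model_callback_enter(&e);       /* wake-up already sent to the tracer */
    model_resume(&e);               /* too early, nothing to clear yet */
    model_callback_return_stop(&e); /* stop applied after the resume */
    return e.stopped;
}

/* Barrier first: the stop is already applied when the resume clears it. */
bool model_resume_after_barrier(void)
{
    struct model_engine e = { false, false };

    model_callback_enter(&e);
    if (model_barrier(&e) != 0)
        return true;                /* not reached in this model */
    assert(!e.callback_running);    /* the barrier's postcondition */
    model_resume(&e);
    return e.stopped;
}
```

In this model, model_resume_without_barrier() leaves the engine stopped
while model_resume_after_barrier() does not.  That difference is what a
0 return from the barrier is supposed to buy you.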

Another example: you call utrace_set_events and disable some events,
including all the ones for which your callbacks ever return UTRACE_STOP.
If that returns 0, or if you then call utrace_barrier and it returns 0,
then you know that either it was already stopped before you called
utrace_set_events, or that it's not stopped now.  In either case, you
know that utrace_control(,,UTRACE_RESUME) ensures that it's no longer
stopped.  (That is, "stopped by your engine"--it of course might still
be stopped by a different engine.)

> > For the new API idea, I was talking about something like a utrace_wake_up()
> > call.
> 
> Now I don't understand you again...
> 
> The 1st question: should the new API help us to kill ptrace_notify_stop()
> in utrace_stop() ?

Ideally, yes.  This might still have to be some other kind of special case;
we'll have to figure that out.  That would be survivable if it's necessary.
The wait_chldexit wakeup would be covered by what we're discussing; it's
just another wait_queue_head_t.  The signal sending is more problematic.

But I wanted first to contemplate what API would cover new code cleanly
written.  The dismal old ptrace signal/wait synchronization might always be
a difficult outlying case.

> Hmm... After the previous email I thought that this utrace_wake_up()
> or whatever shouldn't be visible outside of utrace.c.

I'm sorry if I was not clear.  I have been talking entirely about new
utrace API features, not internals.  This is a discussion for your
engine-writer hat more than your utrace-maintainer hat.  I'm only
mentioning any utrace implementation details to elucidate what is
realistic or problematic to consider for the new API features.

> Or. Do you literally mean something like
> 
>       utrace_wait_for_event();                // for debugger
>       utrace_wake_up();                       // for tracee
> 
> which should (at least) cover all races with exit/detach/etc ?

I meant utrace_wake_up(wait_queue_head_t *) used in place of wake_up()
inside a utrace callback.  The waiter (tracer) side would use normal
<linux/wait.h> interfaces (wait_event macro, etc.).
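
To make the shape of that concrete, here is a user-space analogy using
pthreads, with a mutex and condition variable standing in for the
wait_queue_head_t.  This is only a sketch of the waiter/waker split:
struct model_waitq and the model_* functions are hypothetical names, and
nothing here says how a utrace_wake_up would cover the exit and detach
races, which is the part still under discussion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for a wait queue plus the waited-for condition. */
struct model_waitq {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            event_seen;
};

/* Tracee side: what a callback would do where it calls wake_up() today. */
static void *model_tracee_callback(void *arg)
{
    struct model_waitq *wq = arg;

    pthread_mutex_lock(&wq->lock);
    wq->event_seen = true;              /* publish the condition first */
    pthread_cond_broadcast(&wq->cond);  /* then wake any waiter */
    pthread_mutex_unlock(&wq->lock);
    return NULL;
}

/* Tracer side: the analogue of waiting with wait_event() on the queue. */
bool model_wait_for_event(void)
{
    struct model_waitq wq;
    pthread_t tracee;
    bool seen;

    if (pthread_mutex_init(&wq.lock, NULL) != 0)
        return false;
    if (pthread_cond_init(&wq.cond, NULL) != 0) {
        pthread_mutex_destroy(&wq.lock);
        return false;
    }
    wq.event_seen = false;

    if (pthread_create(&tracee, NULL, model_tracee_callback, &wq) != 0) {
        pthread_cond_destroy(&wq.cond);
        pthread_mutex_destroy(&wq.lock);
        return false;
    }

    pthread_mutex_lock(&wq.lock);
    while (!wq.event_seen)              /* recheck after every wake-up */
        pthread_cond_wait(&wq.cond, &wq.lock);
    assert(wq.event_seen);              /* holds once the loop exits */
    seen = wq.event_seen;
    pthread_mutex_unlock(&wq.lock);

    pthread_join(tracee, NULL);
    pthread_cond_destroy(&wq.cond);
    pthread_mutex_destroy(&wq.lock);
    return seen;
}
```

The waiter side stays an ordinary "sleep until the condition is true"
loop.  Only the waker side would change, which is why the proposal
touches the callback and leaves the tracer on the normal interfaces.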


Thanks,
Roland
