Re: Possible problem with utrace_control

Oleg Nesterov Mon, 28 Jun 2010 09:48:22 -0700

On 06/23, Roland McGrath wrote:
>
> > OK. But then perhaps UTRACE_STICKY_STOP makes sense even without the
> > races we discussed. It can simplify the cooperation in this case.
>
> The only cooperation methods we should consider are those that cover all
> their races.
>
> > Yes. Let's consider the concrete example. The tracee is going to
> > stop and calls utrace_stop(). Before it takes utrace->lock and
> > sets state = TASK_TRACED, the debugger does utrace_control(DETACH).
> >
> > In this case utrace_stop() shouldn't stop, otherwise nobody will
> > ever wake it up. That is why we clear this bit in ->utrace_flags
> > to provoke utrace_reset() which will check carefully the tracee
> > has an engine with ENGINE_STOP.
>
> Right.  I agree this race seems to apply to all kinds of resumption, not
> just detach.  The question for the non-detach cases is what kind of
> guarantee the utrace API is claiming to provide.  For detach it matters
> strongly because the "loser" of the race is gone and there is no way for
> it to be responsible for cleaning up the fallout.
>
> For non-detach cases, the picture is more hazy.  The conclusions on what
> matters might be different if we have utrace_wake_up() or equivalent.
>
> Off hand I think it does matter today insofar as it violates the
> documented guarantees of the utrace_barrier API.  If utrace_barrier
> returns 0 it is said to mean that, "Any effect of its return value (such
> as %UTRACE_STOP) has already been applied to @engine."  So e.g. if you
> get a wake-up sent by your callback before it returns UTRACE_STOP, and
> then call utrace_barrier followed by utrace_control(,,UTRACE_RESUME),
> then you should be guaranteed that your resumption cleared the
> callback's stop.


Not sure I understand how this connects to utrace_wake_up().

If a callback can return return UTRACE_STOP due to the "extern"
reason, then we can do nothing to cancel this UTRACE_STOP reliably.
I meant something like

        // say, engine->data->please_stop = false;
        instruct_the_tracee_to_not_return_UTRACE_STOP();

        utrace_barrier();
        utrace_control(UTRACE_RESUME);

> Another example, you call utrace_set_events and disable some events,
> including all the ones for which your callbacks ever return UTRACE_STOP.
> If that returns 0, or then you call utrace_barrier and it returns 0,
> then you know that either it was already stopped before you called
> utrace_set_events, or that it's not stopped now.  In either case, you
> know that utrace_control(,,UTRACE_RESUME) ensures that it's no longer
> stopped.  (That is, "stopped by your engine"--it of course might still
> be stopped by a different engine.)

Yes, this is close to my example above.

Well. This reminds me the old and unfinished discussion about utrace_barrier()
and memory barriers.

At least with the current implementation, I'm afraid that "->reporting = NULL"
in finish_callback() can fool the tracer. Unless finish_callback_report()
takes utrace->lock to change flags/ops. Of course, this is pure theoretical.

> > > For the new API idea, I was talking about something like a 
> > > utrace_wake_up()
> > > call.
> >
> > Now I don't understand you again...
> >
> > The 1st question: should the new API help us to kill ptrace_notify_stop()
> > in utrace_stop() ?
>
> Ideally yes.  This might still have to be some other kind of special case,
> we'll have to figure that out.  That would be survivable if it's necessary.
> The wait_chldexit wakeup would be covered by what we're discussing, it's
> just another wait_queue_head_t.  The signal sending is more problematic.

Yes, the signal sending is problematic, but wait_chldexit also looks
problematic. How can utrace_wake_up() know which wait_queue_head_t should
be used for wake up?

> But I wanted first to contemplate what API would cover new code cleanly
> written.  The dismal old ptrace signal/wait synchronization might always be
> a difficult outlying case.

OK.

> > Hmm... After the previous email I thought that this utrace_wake_up()
> > or whatever shouldn't be visible outside of utrace.c.
>
> I'm sorry if I was not clear.  I have been talking entirely about new
> utrace API features, not internals.  This is a discussion for your
> engine-writer hat more than your utrace-maintainer hat.  I'm only
> mentioning any utrace implementation details to elucidate what is
> realistic or problematic to consider for the new API features.

OK, let's forget about "internal" utrace problems like the spinning
inside utrace_barriers().

> > Or. Do you literally mean something like
> >
> >     utrace_wait_for_event();                // for debugger
> >     utrace_wake_up();                       // for tracee
> >
> > which should (at least) cover all races with exit/detach/etc ?
>
> I meant utrace_wake_up(wait_queue_head_t *) used in place of wake_up()
> inside a utrace callback.  The waiter (tracer) side would use normal
> <linux/wait.h> interfaces (wait_event macro, etc.).

Sorry Roland, I have the blind spot here.

Let's talk about API visible to engine writers. Which arguments
should utrace_wake_up() have, wait_queue_head_t ? When this
helper should be used by engine writers?

Sure, the tracer and tracee can have many reasons to synchronize.
They can engine->data to place wait_queue_head_t. Or they can
use another wq. I still can't understand what a simple wrapper
on top of __wake_up() can buy us.

I am sure I am missing something important and simple her.

Oleg.

Re: Possible problem with utrace_control

Reply via email to