> The problem is, utrace_control(UTRACE_RESUME) can't prevent the stop if
> the tracee has already returned UTRACE_STOP, but utrace_stop() didn't
> take utrace->lock yet.

So you are saying that utrace_barrier does not meet its documented API,
right?  It says "effect ... has been applied", but that's not true if a
UTRACE_STOP return value will not be cleared by an immediately following
utrace_control(,,UTRACE_RESUME).

> Basically, ptrace_detach_task(sig => -1) should do:
> 
>       - if we are going to do utrace_control(UTRACE_DETACH), we
>         should first instruct the (running) tracee to not report
>         a signal, otherwise that signal will be lost.

Right.

>       - if the tracee has already reported a signal, we should
>         set ->resume = UTRACE_DETACH and resume the tracee, like
>         explicit detach does.

Right.

>         We can check ctx->siginfo != NULL to detect this case.

Ok.
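
The two cases above can be sketched as a small decision function.  This
is a user-space toy under stated assumptions, not the kernel code:
toy_ctx, toy_choose_detach_action and the enum values are hypothetical
names.  Only the siginfo != NULL test and the two outcomes come from the
discussion.

```c
#include <stddef.h>

/* Hypothetical stand-in for the ptrace context; only siginfo matters here. */
struct toy_ctx {
    const void *siginfo;   /* non-NULL once the tracee has reported a signal */
};

/* The two outcomes described in the quoted list (names are illustrative). */
enum toy_detach_action {
    /* Tracee still running: tell it not to report a signal, then detach. */
    TOY_SUPPRESS_REPORT_THEN_DETACH,
    /* Signal already reported: set resume = detach and resume the tracee. */
    TOY_RESUME_WITH_DETACH
};

/* Pick the detach path from whether a signal has already been reported. */
static enum toy_detach_action
toy_choose_detach_action(const struct toy_ctx *ctx)
{
    if (ctx->siginfo != NULL)
        return TOY_RESUME_WITH_DETACH;
    return TOY_SUPPRESS_REPORT_THEN_DETACH;
}
```

The sketch only picks the path.  What each path does to the tracee
(suppressing the report, setting ->resume) is left out on purpose.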

> > Especially if it's as I suspect, that we can do
> > that without changing the utrace layer.
> 
> No, this problem is orthogonal, or I missed something.
> 
> Please look at this message
> 
>       https://www.redhat.com/archives/utrace-devel/2010-June/msg00075.html

Yes, I'd forgotten about that.  We do need to fix utrace_barrier to match
its documented guarantee, or else we cannot rely on it for ptrace.

> In particular:
> 
>       Off hand I think it does matter today insofar as it violates the
>       documented guarantees of the utrace_barrier API.  If utrace_barrier
>       returns 0 it is said to mean that, "Any effect of its return value (such
>       as %UTRACE_STOP) has already been applied to @engine."  So e.g. if you
>       get a wake-up sent by your callback before it returns UTRACE_STOP, and
>       then call utrace_barrier followed by utrace_control(,,UTRACE_RESUME),
>       then you should be guaranteed that your resumption cleared the
>       callback's stop.
> 
> Yes, but currently UTRACE_RESUME can't guarantee this. 

From the API perspective I had been thinking in, it's not utrace_control
that's supposed to guarantee it.  It's utrace_barrier that's not
supposed to return yet.  But, that is indeed a sort of inside-out way of
looking at it, really.  What utrace_barrier guarantees is that the
callback bookkeeping is done, and it's not supposed to wait for e.g. the
next engine's callback to run.

> utrace_control(RESUME) should remove ENGINE_STOP like UTRACE_DETACH does

I think you've now talked me into this.  There is no other way that
utrace_barrier can keep its guarantee about the return value effect
without also delaying while other engines' callbacks might run, which
seems much worse.
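
To make the window concrete, here is a single-threaded toy model in
user-space C.  It is a sketch under assumptions, not the utrace
implementation: every toy_* name is hypothetical, wants_stop merely
stands in for ENGINE_STOP, and the clear_flag switch selects between the
behaviour described in this thread as current (resume only acts on an
already-stopped tracee) and the proposed one (resume also drops the
engine's stop request).

```c
#include <stdbool.h>

/* Toy model only; wants_stop stands in for the engine's ENGINE_STOP flag. */
struct toy_engine {
    bool wants_stop;
};

struct toy_tracee {
    struct toy_engine engine;
    bool stopped;          /* true once the tracee has really stopped */
};

/* Tracee side: the callback has returned "stop"; the request is recorded. */
static void toy_callback_returned_stop(struct toy_tracee *t)
{
    t->engine.wants_stop = true;
}

/* Tracee side, later: stop only if the engine still wants a stop. */
static void toy_stop(struct toy_tracee *t)
{
    if (t->engine.wants_stop)
        t->stopped = true;
}

/*
 * Tracer side: resume.  An already-stopped tracee is always woken.
 * With clear_flag == true (the proposal) a pending stop request is
 * dropped too; with clear_flag == false it is left in place.
 */
static void toy_resume(struct toy_tracee *t, bool clear_flag)
{
    if (t->stopped) {
        t->stopped = false;
        t->engine.wants_stop = false;
    } else if (clear_flag) {
        t->engine.wants_stop = false;
    }
}

/* Replay the window: callback returns, resume arrives, then the stop runs. */
static bool toy_race_leaves_tracee_stopped(bool clear_flag)
{
    struct toy_tracee t = { { false }, false };

    toy_callback_returned_stop(&t);
    toy_resume(&t, clear_flag);
    toy_stop(&t);
    return t.stopped;
}
```

In this model toy_race_leaves_tracee_stopped(false) reports a tracee
left stopped after the resume, while toy_race_leaves_tracee_stopped(true)
does not.  That is the property a barrier followed by a resume is
supposed to give.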


Thanks,
Roland
