> The problem is, utrace_control(UTRACE_RESUME) can't prevent the stop if
> the tracee has already returned UTRACE_STOP, but utrace_stop() didn't
> take utrace->lock yet.

So you are saying that utrace_barrier does not meet its documented API.
Right? It says "effect ... has been applied". But that's not true if a
UTRACE_STOP return value will not be cleared by an immediate subsequent
utrace_control(,,UTRACE_RESUME).

> Basically, ptrace_detach_task(sig => -1) should do:
>
> - if we are going to do utrace_control(UTRACE_DETACH), we
>   should first instruct the (running) tracee to not report
>   a signal, otherwise that signal will be lost.

Right.

> - if the tracee has already reported a signal, we should
>   set ->resume = UTRACE_DETACH and resume the tracee, like
>   explicit detach does.

Right.

> We can check ctx->siginfo != NULL to detect this case.

Ok.

> > Especially if it's as I suspect, that we can do
> > that without changing the utrace layer.
>
> No, this problem is orthogonal, or I missed something.
>
> Please look at this message
>
> https://www.redhat.com/archives/utrace-devel/2010-June/msg00075.html

Yes, I'd forgotten about that. We do need to fix utrace_barrier to
match its documented guarantee, or else we cannot rely on it for ptrace.

> In particular:
>
>     Off hand I think it does matter today insofar as it violates the
>     documented guarantees of the utrace_barrier API. If utrace_barrier
>     returns 0 it is said to mean that, "Any effect of its return value (such
>     as %UTRACE_STOP) has already been applied to @engine." So e.g. if you
>     get a wake-up sent by your callback before it returns UTRACE_STOP, and
>     then call utrace_barrier followed by utrace_control(,,UTRACE_RESUME),
>     then you should be guaranteed that your resumption cleared the
>     callback's stop.
>
> Yes, but currently UTRACE_RESUME can't guarantee this.

From the API perspective I had been thinking in, it's not utrace_control
that's supposed to guarantee it. It's utrace_barrier that's not
supposed to return yet. But, that is indeed a sort of inside-out way of
looking at it, really.
What utrace_barrier guarantees is that the callback bookkeeping is done,
and it's not supposed to wait for e.g. the next engine's callback to run.

> utrace_control(RESUME) should remove ENGINE_STOP like UTRACE_DETACH does

I think you've now talked me into this. There is no other way that
utrace_barrier can keep its guarantee about the return value effect
without also delaying while other engines' callbacks might run, which
seems much worse.


Thanks,
Roland
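
For illustration only, the window discussed above can be replayed in a
small single-threaded user-space model. This is a sketch under loud
assumptions: every toy_* name is hypothetical, nothing here is the real
utrace code or its locking, and wants_stop merely plays the role of
ENGINE_STOP. It assumes the current resume path only acts on a tracee
that has already stopped, and compares that with a resume that always
clears the stop request.

```c
#include <stdbool.h>

/* Toy stand-ins only; these are NOT the real utrace structures. */
struct toy_engine {
    bool wants_stop;   /* plays the role of ENGINE_STOP */
};

struct toy_tracee {
    bool stopped;      /* true once the tracee has actually stopped */
};

/* Callback bookkeeping: the engine's callback returned "stop". */
static void toy_finish_callback(struct toy_engine *e)
{
    e->wants_stop = true;
}

/* Tracee side: honours a still-pending stop request when it gets here. */
static void toy_stop(struct toy_tracee *t, const struct toy_engine *e)
{
    if (e->wants_stop)
        t->stopped = true;
}

/* Assumed current behaviour: resume does nothing unless already stopped. */
static void toy_resume_if_stopped(struct toy_tracee *t, struct toy_engine *e)
{
    if (!t->stopped)
        return;
    e->wants_stop = false;
    t->stopped = false;
}

/* Proposed behaviour: resume always clears the stop request. */
static void toy_resume_always(struct toy_tracee *t, struct toy_engine *e)
{
    e->wants_stop = false;
    t->stopped = false;
}

/*
 * Replay the ordering from the thread: the callback returns "stop", the
 * tracer resumes, and only then does the tracee reach its stop path.
 * Returns true if the tracee ends up stopped despite the resume.
 */
static bool toy_replay(void (*resume)(struct toy_tracee *, struct toy_engine *))
{
    struct toy_engine e = { false };
    struct toy_tracee t = { false };

    toy_finish_callback(&e);
    resume(&t, &e);
    toy_stop(&t, &e);
    return t.stopped;
}
```

In this model toy_replay(toy_resume_if_stopped) leaves the tracee
stopped, while toy_replay(toy_resume_always) does not, which is the
difference the ENGINE_STOP change is meant to make.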