> The usage of ->ptrace in utrace-ptrace is racy [...]

Sure.  Replace it.  I'm not sure what purpose ->ptrace (or ctx->flags)
serves for attach/detach.  You need somewhere to store the
PTRACE_SETOPTIONS state and so forth, sure.  But you can probably just
handle the attachedness at the utrace level.  That's what
UTRACE_ATTACH_EXCLUSIVE is for.

> I think we need another variable which should be used instead, and
> the only "good" place where it can live is engine->data, it should
> point to "struct ptrace_xxx".

Of course!  That was the main point of the whole ptrace-tracee data
structure cleanup.

> Yes, I see. That is why I named it "notify_stopped", not report_xxx,
> to emphasize it is really special and should be used with care and only
> when really needed.

Right, I got that.

> So, if we do not want to add the special (and not-a-report) hooks
> into engine->ops, then perhaps it makes sense to do
[...]

That is exactly the kludge the bad old utrace-ptrace.patch does.

> So, perhaps we can just add engine->ops->notify_stopped() (or even [...]

My main point was that we should look for an entirely different way to
attack the problem.  If I thought this hook would only be used for
ptrace, it might be fine.  (OTOH then it's not all that much better than
the terrible ptrace-only kludge.)  But AFAICT we'd likely wind up with
every other new utrace module that cares about doing synchronization
with debugger threads needing to use this hook too.  That's why I want
to talk about the general case in the utrace API, and not just ptrace.

Let's ignore SIGCHLD for a moment, and just think about ptrace's
report_* waking up wait.  That is like what we expect will be the
general case of this: a debugger thread wants to sleep somehow and have
its report_* callbacks say "wake the debugger so it can examine me".

So, how does the utrace module writer build that?  You'd probably use
wait_event() et al in your debugger thread, and wake_up() in report_*
callbacks.  (That's approximately what you have in ptrace+wait.)

So how much is satisfied with a "delayed wake-up" (rather than an
open-ended hook).  Let's say:

        utrace_wake_up(task, engine, wqh);      // replaces wake_up(wqh);
        return UTRACE_STOP; // | UTRACE_SIGNAL_IGN|UTRACE_SIGNAL_HOLD

That would mean, approximately, as if wake_up(wqh) and then the wakee
had called utrace_barrier(task,engine), but without the sched flutter.

Formally, I think the guarantee should be "when utrace_barrier would
return for you".  Technically this means that if your callback did not
use UTRACE_STOP, it's just sometime after your callback returns.  But if
you did use UTRACE_STOP, then it means after the task will definitely
stop.  That's all it means today, but I think we want to strengthen that
guarantee so that when you have used UTRACE_STOP (either in a callback
already in progress when you called utrace_barrier, or in a utrace_control
call before the utrace_barrier call), utrace_barrier doesn't return until
"utrace_prepare_examine will work", i.e. actually in TASK_TRACED/STOPPED.

In ptrace, I think this alone might suffice to cover things "cleanly" in
the utrace API.  (That is, clean for utrace. ;-) The ptrace layer could
use the trick of a custom wake function to implement SIGCHLD without
needing anything but utrace_wake_up from the utrace layer.  That is at
base the same thing as your notify_stopped hook--ptrace will supply a hook
that runs in very special circumstances.  But for the more common case in
other modules, using utrace_wake_up/wait_event this way can be very
straightforward and foolproof.

Thoughts?


Thanks,
Roland

Reply via email to