> If ->report_xxx() returns UTRACE_STOP the tracee will do utrace_stop
> eventually. But how can the tracer know the tracee is already
> TASK_TRACED/->stopped?

We don't really have a way. It is something we need to address better. This is the most essential thing that made the old utrace-ptrace.patch a temporary kludge rather than a good model.

> Looks like a horrible hack, imho. Utrace should know nothing about
> ptrace.

Agreed!

> Can't we do something like
[...]
> +	// SPECIAL!!! must not block/etc
> +	void (*notify_stopped)(...);

Part of the point about utrace is to make it easier not to foul up other engines when your engine has a bug. Many of the rules about well-behaved callbacks have to be met by all engines or they will interfere; we can't really avoid that reality when we call engines' code. But part of the plan is not to make those rules too hard to meet. That's why the "don't block (for long)" rule is such a loose constraint in real kernel terms.

Having engine-writers commonly need to write code that can't even call mutex_lock or kmalloc, etc., is IMHO just too much to ask. It's far better if we don't have such tight constraints on any callback that engine-writers will really need to use, even if just in this special case. (Because it really is not a rare thing--this is the essential kind of synchronization that any engine-writer who handles stopping at all will want.)

The report_* callback is the engine's opportunity to synchronize with its asynchronous engine code (debugger threads, etc.). The crux of the issue is that when your "synchronizing" callback is done, there always might be another engine's callback running afterwards. Hence, the actual effect of UTRACE_STOP, the thread entering TASK_TRACED state, doesn't happen until after all the callbacks have run. So your synchronization partner really needs a second synchronization, with the thread itself actually stopping.

Let's talk about the utrace API semantics picture for this purely from the engine-writer's point of view for a bit before we get into the implementation details inside utrace.

The pieces we have now are engines doing their own wakeups in callbacks, utrace_barrier, and utrace_prepare_examine. Currently utrace_barrier only makes guarantees about the ordering of callbacks and the various asynchronous utrace_* calls, but not about true thread stoppedness.

utrace_prepare_examine verifies that the target is blocked already, and synchronizes with low-level context switch details for that, but does not do any synchronization/delay *before*/with target->state entering its blocked/stopped state. Note in particular that it does not require the target to be stopped, though it does require that the caller's engine has used UTRACE_STOP. This is intended for non-perturbing examination of a thread that might be blocked in a syscall (a la task_current_syscall). But as it stands, it would also try to accept a thread that is just blocked in kmalloc/mutex_lock in the next engine's callback. Unlike the syscall case, that thread is likely to wake up again very soon, and that results in an -EAGAIN hiccup from utrace_prepare_examine or utrace_finish_examine.

So in today's utrace, the best you can do is:

    report_* does:
        wake_up(you);
        return UTRACE_STOP;
    you wake up and do:
        utrace_barrier          => your callback is finished, its UTRACE_STOP is logged
        utrace_prepare_examine  => if ok, it's really stopped now
                                |> if -EAGAIN, you have to loop(?)
        user_regset->get, etc. (aka arch_ptrace)
        utrace_finish_examine   => if ok, it really was stopped when I said it was
                                |> if -EAGAIN, you have to loop(?)

A first notion is to give utrace_barrier a stronger guarantee: when your engine has UTRACE_STOP in force and then you call utrace_barrier, it blocks until the target enters a truly stopped state. But OTOH that is *too* strong if what you want is to see ASAP what the thread is doing, be it properly stoppable now or blocked in a syscall you don't want to interrupt.

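To make the shape of that "best you can do" sequence concrete, here is a user-space toy model of the tracer side. It is only a sketch: struct mock_target and all the mock_* functions are hypothetical stand-ins invented for illustration, not the utrace calls themselves, and they mimic nothing more than the 0 / -EAGAIN outcomes described above. Whether looping is even the right response is exactly the open question marked "loop(?)".

```c
/* Toy user-space model; every mock_* name is a hypothetical stand-in. */
#include <assert.h>
#include <errno.h>

struct mock_target {
    int prepare_hiccups; /* times the "prepare" step will still see the target running */
    int finish_hiccups;  /* times the target will have woken up during an examination */
    int reads;           /* how many times we (re)read the target's state */
};

/* Models utrace_barrier: on return our callback is done and its stop is logged. */
int mock_barrier(struct mock_target *t)
{
    (void)t;
    return 0;
}

/* Models utrace_prepare_examine: -EAGAIN while the target is not (yet) blocked. */
int mock_prepare_examine(struct mock_target *t)
{
    if (t->prepare_hiccups > 0) {
        t->prepare_hiccups--;
        return -EAGAIN;
    }
    return 0;
}

/* Models the actual examination (user_regset->get and friends). */
void mock_read_state(struct mock_target *t)
{
    t->reads++;
}

/* Models utrace_finish_examine: -EAGAIN if the target ran in the meantime. */
int mock_finish_examine(struct mock_target *t)
{
    if (t->finish_hiccups > 0) {
        t->finish_hiccups--;
        return -EAGAIN;
    }
    return 0;
}

/*
 * Tracer side, run after the report_* callback has woken us up.
 * Returns 0 once one examination was bracketed by a successful
 * prepare/finish pair, or -EAGAIN if max_tries attempts all failed.
 */
int examine_stopped_target(struct mock_target *t, int max_tries)
{
    int ret = mock_barrier(t);
    if (ret)
        return ret;

    for (int i = 0; i < max_tries; i++) {
        ret = mock_prepare_examine(t);
        if (ret == -EAGAIN)
            continue;           /* not blocked yet: try again */
        if (ret)
            return ret;

        mock_read_state(t);     /* may turn out to be stale */

        ret = mock_finish_examine(t);
        if (ret != -EAGAIN)
            return ret;         /* 0: the data we read is trustworthy */
        /* target woke up under us: discard what we read and retry */
    }
    return -EAGAIN;
}
```

In this model a target that hiccups twice in prepare and once in finish costs four loop iterations and two (one wasted) reads of its state, which is the kind of juggling a real second synchronization with the thread actually stopping would make unnecessary.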
I guess maybe if you want to get a thing positively stopped as distinct from maybe-just-blocked-in-syscall (but with UTRACE_STOP preventing it from reaching userland), then you should just start out by using utrace_control(,,UTRACE_INTERRUPT) (perhaps followed by utrace_control(,,UTRACE_STOP) to get the "already stopped" return value?) and have your callback return UTRACE_STOP.

So then perhaps instead what makes sense is that utrace_prepare_examine should roll in an actual synchronization, waiting for target->state to be blocked outside of any other utrace engine callbacks (after them all). If you want to do a "non-blocking" asynchronous sample of a thread, then it's safe to use this only after your engine callback has notified you that UTRACE_STOP is in force or utrace_control(,,UTRACE_STOP) returned zero.

Or perhaps we can make it so utrace_prepare_examine diagnoses the "waiting for utrace to settle down" and the "real blocks" differently. The point there is that a "non-blocking sample" doesn't want to have a window where the debugger saw the target was starting to block in some real kernel code (had changed ->state) but then the debugger winds up blocking after the target has woken again and then runs for a long time in the kernel without a block. (In that case, the debugger should see "it's still running a lot, you will get your callback eventually".)

A related issue to keep in mind in all this is the "preemption flutter" problem that someone cited with ptrace (it was biting UML folks, you may recall). The control flow in old ptrace is simple enough that we could defuse that adequately just by omitting an explicit preemption check or two so that voluntary preemption would not tend to occur between the tracer wakeup and the actual schedule() blocking the tracee. In the utrace world, there will be far more going on and it would be unlikely we could manage that approach for long.

Moreover, a core tenet of utrace is to help tracing engines not clog the system up generally, so IMHO we very much want extra preemption checks between callbacks and the like (as we have now in finish_callback). So we'd really like to address that sort of problem with really proper synchronization between what in the tracer really needs to wait for what in the tracee, rather than more of the juggled-together-but-not-actually-tied dances going on now.

Working in the other direction, we have the (my) idea in utrace that no particular operation should be any more synchronous than it has to be. This motivates the approach of low-level moving parts the engine-writer fits together, the only blocking synchronizations being explicit calls like utrace_barrier, etc.

So, what does it all mean? Well, I don't know exactly. It means that I feel strongly about retaining the flexibility in utrace to do things in very asynchronous and opportunistic ways, and I want to keep the core API focus on a fairly small number of calls. At the same time we need to do something that really makes good sense both for ptrace and for other engines that will have a similarly synchronous sort of control flow but different (and I hope less idiosyncratic) means to synchronize with debugger calls.

I have a thought or two about ptrace in particular. But perhaps first we can explore the general case of the synchronization picture in the API some.


Thanks,
Roland