Re: Q: utrace-stopped utrace_report_jctl()
I was wrong, I forgot that tracehook_get_signal() doesn't need JCTL. Right, that is key. OK, let's look at utrace_do_stop: if (task_is_stopped(target) !(target-utrace_flags UTRACE_EVENT(JCTL))) { utrace-stopped = 1; return true; } This doesn't look correct. We don't hold -siglock, the task can be SIGCONT'ed and return from get_signal_to_deliver(), and then we set -stopped. Or I missed something again? I think you're right. The logic there was supposed to be, TASK_STOPPED means it will get into utrace_get_signal(). That much is true, but nothing inside utrace_get_signal() actually synchronizes with this to make that matter. All this check does is try to optimize the TASK_STOPPED case not to take the siglock. That doesn't seem worth much, so we can just drop it. Then we re-do this (well, almost) check under -siglock, } else if (task_is_stopped(target)) { if (!(target-utrace_flags UTRACE_EVENT(JCTL))) utrace-stopped = stopped = true; } But this is not nice. Let's suppose the task is already stopped, we do UTRACE_ATTACH + utrace_set_events(JCTL). This is exactly why utrace_set_events() sets -stopped preemptively for that case. Now, utrace_control(UTRACE_STOP) can do nothing until SIGCONT. We don't even set -report. Yes, we can't set -stopped if JCTL, we can race with utrace_report_jctl() which does REPORT(). Setting JCTL while in TASK_STOPPED made it set -stopped, so utrace_control() succeeds without calling utrace_do_stop(). BTW, afaics utrace_report_jctl() has another bug, REPORT(task, utrace, report, UTRACE_EVENT(JCTL), report_jctl, was_stopped ? CLD_STOPPED : CLD_CONTINUED, what); I think it should do REPORT(task, utrace, report, UTRACE_EVENT(JCTL), report_jctl, what, notify); instead. There is a bug, but your fix changes a key API choice. I've put in this fix: - report_jctl, was_stopped ? CLD_STOPPED : CLD_CONTINUED, what); + report_jctl, was_stopped ? CLD_STOPPED : CLD_CONTINUED, + notify ? what : 0); There are two things a tracer might be tracking: state or events. The state is whether the thread is in job control stop or is running. The events are the SIGCHLD notifications that the thread tries to post to its parent. The @type argument shows the state we will be in after the callback. If the state changes, there will be another callback. That's what a state-tracking tracer needs, e.g. to keep a little light on the screen red while the thread is stopped and green while it's running. The @notify argument shows what SIGCHLD the parent sees (if it were dequeuing all possible SIGCHLD postings as quickly as they come). That's what an event-tracking tracer needs, e.g. to match up with what SIGCHLDs are expected in the parent. Your change to @type would break state-trackers in the case where tracehook_notify_jctl() is called from get_signal_to_deliver() with CLD_STOPPED. With the first patch, we call utrace_report_jctl() before we actually stop. do_signal_stop() can fail then, but I think this is OK, we can pretend that SIGCONT/SIGKILL happened after we stopped. It is not complete, and with this patch we always call -report_jctl with notify == 0. Just for discussion. I think I sort of understand the intent of your patch. If we change the calling convention for tracehook_notify_jctl, I think we want to preserve the aspect that the hook decides about sending the notification. That's how the ptrace quirks can be reimplemented differently later without changing the tracehook layer again. Also, we certainly don't want one tracehook call with two different locking conditions. It seems right in principle to do the reporting before we change -state, given that we have to allow for it changing during the callbacks. And indeed, that avoids the JCTL special case mess entirely. Thanks, Roland
Re: utrace_set_events/utrace_control death/reap checks
utrace_set_events: (utrace-death ((old_flags ~events) DEATH_EVENTS)) (old_flags ~events) DEATH_EVENTS) means the caller tries to clear DEATH/QUIESCE. Why this is not allowed? And why this is not allowed _only_ when the target runs utrace_report_death()-REPORT()? This is specifically documented for -EALREADY, and in the DocBook section Interlock with final callbacks. The idea is this: For most utrace events, you don't know whether you'll get some callbacks. It could be, the task got SIGKILL first thing after you attached, and it will never report anything. That is fine for the most part. But for the lifetime events it becomes a real burden on the users of the API. They have to manage their data structures, and so they have to know reliably when they can and can't get what callbacks. So, the utrace_set_events rules try to ensure that the caller knows for sure whether it will or won't get a callback when the task dies and/or is reaped. You can clear DEATH/QUIESCE, and be sure from the return value that it is now impossible that there is a report_death/report_quiesce callback racing with you because the guy just got a SIGKILL. If you can't be sure of that, then you do know for sure that your callback is being made right now or very soon. I think this line can be just killed. I guess the intent was to prevent utrace_release_task() from doing utrace_reap() in parallel with utrace_report_death(), but note that utrace_set_events() can never shrinks -utrace_flags, it only sets new bits. It's not -utrace_flags that matters here, it's engine-flags. That is one of the intents, but not the only one. It's just as important that the user of the API can rely on the ordering of its callbacks wrt its utrace_set_events/utrace_control calls as that it can rely on the ordering of its death and reap callbacks. The next line looks strange too, don't we need (utrace-reap ((events ~old_flags) UTRACE_EVENT(REAP))) ? get_utrace_lock() already returned -ESRCH if it was in EXIT_DEAD, so this is probably moot. And I don't understand why do we need utrace-death at all. Apart from utrace_set_events (which I think doesn't need it), it is only used by utrace_control(UTRACE_DETACH). But I can't see how can we race with utrace_report_death(). If it can be called, we have DEATH_EVENTS bits set. But in that case utrace_do_stop() can't succeed, so UTRACE_DETACH can only do mark_engine_wants_stop() but not utrace_reset(). It is used by utrace_set_events and utrace_control for the same purpose. Those calls must know for sure that report_death cannot happen, or else that it will (or it's already happening). Many tracers only keep track until death. For them, the simple thing is to have report_death clean up their data structures and return UTRACE_DETACH. But then they also want to do asynchronous detach. So they can do utrace_set_events or utrace_control as the synchronizing step of asynchronous tear-down. If it returns 0, then report_death will not and it is safe to destroy data structures the callback code would use. If it returns -EALREADY, then report_death will shortly be called and we can rely on our callback code to take care of the data structures before it returns UTRACE_DETACH. Thanks, Roland
Re: [PATCH 2/2] UTRACE_STOP: nesting engine management (updated)
When a report function of an engine returns UTRACE_STOP, it means (may mean) that it wants to change the status of the process before resuming it. VM monitors often change the status, sometimes debugger users want to set some variables too. Yes. In ideal cases, it can decide up front quickly what it wants to do, and change the user state right in the callback without stopping. But when it needs another agent to decide what to do, it uses UTRACE_STOP. IMHO, utrace should stop it *before* calling the report function of the next engine, No, we'll never want to do it this way. One engine doesn't get to arbitrarily delay the reporting to other engines of the thread's events. This is both an efficiency point and a robustness point. It's important to remember that utrace is about the primitive events: the user thread had an event ... the user thread is about to run again. The high-level notion of what did the other engine do? is built from examining the state at these events, and knowing about the delays that other engines are imposing via UTRACE_STOP. otherwise we need to set up another structure to synchronize the engines (that may even be unknown one to the other). If there is a tracer/debugger among the engines, it is not even possible to know which snapshot it gets, after or before the modification created by the VM monitor? This is where the broader discussion of callback order comes in. When a previous engine has decided to use UTRACE_STOP, your callback's @action argument reflects this. You know that another engine is going to do something asynchronous before it lets the user thread run. If your own engine doesn't especially want it stopped now but wants to see what it looks like when other engines are done fiddling with it, then you can use UTRACE_REPORT. That ensures that you'll get a report_quiesce callback after those other engines have done their thing. Thanks, Roland