Re: Possible problem with utrace_control
On 06/22, Roland McGrath wrote: OTOH, I am wondering if we can give more power to utrace_control(STOP). Currently debugger can't stop the tracee unless it cooperates. However, if the tracee doesn't have UTRACE_EVENT(QUIESCE), then utrace_control(STOP) works. This is strange. Why is it strange? If you don't have any callback, then all your engine does is get stopped. If you have callbacks, then you always get callbacks before the thread does anything (including stop). Since your callback's return value always replaces any other resume action OK. But then perhaps UTRACE_STICKY_STOP makes sense even without the races we discussed. It can simplify the cooperation in this case. Stupid question: what if we just remove the clear_engine_wants_stop() code in finish_callback_report() ? That would change the API so that a callback's resume action is always overridden by a utrace_control(,,UTRACE_STOP). Today the practice from a third-party debugger thread is to do: ret = utrace_control(task, engine, UTRACE_STOP); if (ret == 0) { // Already stopped, e.g. was in jctl or another engine's stop. // This means we can do our asynchronous examination here. } else if (ret == -EINPROGRESS) { // it will get to report_quiesce soonish (modulo blocked in kernel) } If the examination you wanted to do was some predetermined sequence of fetching and fiddling before resuming, then the report_* callback can just do all that itself. For example, fetch the registers for a sample and keep going; insert some breakpoints and keep going; etc. OK, agreed. With this change, it would have to explicitly request not stopping by calling utrace_control(current, engine, UTRACE_RESUME) inside the callback. Ah, indeed, this can't be good. Thanks. Btw, before I forgot about this. shouldn't utrace_control(UTRACE_RESUME) remove ENGINE_STOP from ->utrace_flags like DETACH does? Perhaps so. My memory is dim on the use of that bit in utrace_flags.
(don't assume I was able to quickly recall why DETACH does this ;) It's to close certain races, but we'll have to remind each other of the details. Yes. Let's consider the concrete example. The tracee is going to stop and calls utrace_stop(). Before it takes utrace->lock and sets state = TASK_TRACED, the debugger does utrace_control(DETACH). In this case utrace_stop() shouldn't stop, otherwise nobody will ever wake it up. That is why we clear this bit in ->utrace_flags to provoke utrace_reset() which will check carefully the tracee has an engine with ENGINE_STOP. The original motivation for this patch was ptrace-utrace, the implicit DETACH if the tracer exits. UTRACE_RESUME should probably do the same, otherwise the tracee can stop right after utrace_control(RESUME). This case is less important compared to UTRACE_DETACH (and just in case, ptrace doesn't need this) but still. Hmm. Not sure... at least I don't immediately see something simple. Except, just add wait_queue_head into struct utrace. This way utrace_barrier() can wait longer than needed if it notices utrace->reporting == engine. Or we can add it into utrace_engine, in this case utrace_stop() needs list_for_each(engine, utrace->attached). You're talking about two different things here. Whatever implementation details change in utrace_barrier() is not directly apropos. Yes, agreed. For the new API idea, I was talking about something like a utrace_wake_up() call. Now I don't understand you again... The 1st question: should the new API help us to kill ptrace_notify_stop() in utrace_stop() ? What I'm most interested in right now is thinking through the utility of particular new API details for engine writers. Hmm... After the previous email I thought that this utrace_wake_up() or whatever shouldn't be visible outside of utrace.c. Ignoring the ptrace_notify_stop() problem, could you give us any example of how it can be used? Or.
Do you literally mean something like utrace_wait_for_event();// for debugger utrace_wake_up(); // for tracee which should (at least) cover all races with exit/detach/etc ? Oleg.
Re: Possible problem with utrace_control
OK. But then perhaps UTRACE_STICKY_STOP makes sense even without the races we discussed. It can simplify the cooperation in this case. The only cooperation methods we should consider are those that cover all their races. Yes. Let's consider the concrete example. The tracee is going to stop and calls utrace_stop(). Before it takes utrace->lock and sets state = TASK_TRACED, the debugger does utrace_control(DETACH). In this case utrace_stop() shouldn't stop, otherwise nobody will ever wake it up. That is why we clear this bit in ->utrace_flags to provoke utrace_reset() which will check carefully the tracee has an engine with ENGINE_STOP. Right. I agree this race seems to apply to all kinds of resumption, not just detach. The question for the non-detach cases is what kind of guarantee the utrace API is claiming to provide. For detach it matters strongly because the loser of the race is gone and there is no way for it to be responsible for cleaning up the fallout. For non-detach cases, the picture is more hazy. The conclusions on what matters might be different if we have utrace_wake_up() or equivalent. Off hand I think it does matter today insofar as it violates the documented guarantees of the utrace_barrier API. If utrace_barrier returns 0 it is said to mean that, Any effect of its return value (such as %UTRACE_STOP) has already been applied to @engine. So e.g. if you get a wake-up sent by your callback before it returns UTRACE_STOP, and then call utrace_barrier followed by utrace_control(,,UTRACE_RESUME), then you should be guaranteed that your resumption cleared the callback's stop. Another example, you call utrace_set_events and disable some events, including all the ones for which your callbacks ever return UTRACE_STOP. If that returns 0, or you then call utrace_barrier and it returns 0, then you know that either it was already stopped before you called utrace_set_events, or that it's not stopped now.
In either case, you know that utrace_control(,,UTRACE_RESUME) ensures that it's no longer stopped. (That is, stopped by your engine--it of course might still be stopped by a different engine.) For the new API idea, I was talking about something like a utrace_wake_up() call. Now I don't understand you again... The 1st question: should the new API help us to kill ptrace_notify_stop() in utrace_stop() ? Ideally yes. This might still have to be some other kind of special case, we'll have to figure that out. That would be survivable if it's necessary. The wait_chldexit wakeup would be covered by what we're discussing, it's just another wait_queue_head_t. The signal sending is more problematic. But I wanted first to contemplate what API would cover new code cleanly written. The dismal old ptrace signal/wait synchronization might always be a difficult outlying case. Hmm... After the previous email I thought that this utrace_wake_up() or whatever shouldn't be visible outside of utrace.c. I'm sorry if I was not clear. I have been talking entirely about new utrace API features, not internals. This is a discussion for your engine-writer hat more than your utrace-maintainer hat. I'm only mentioning any utrace implementation details to elucidate what is realistic or problematic to consider for the new API features. Or. Do you literally mean something like utrace_wait_for_event();// for debugger utrace_wake_up(); // for tracee which should (at least) cover all races with exit/detach/etc ? I meant utrace_wake_up(wait_queue_head_t *) used in place of wake_up() inside a utrace callback. The waiter (tracer) side would use normal linux/wait.h interfaces (wait_event macro, etc.). Thanks, Roland
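[To make the API idea concrete: this is only a sketch of how the *proposed* utrace_wake_up() might look to an engine writer. utrace_wake_up() does not exist; the callback name, signature details, and consumer_wq are illustrative.]

/* Tracee side: the callback queues the wakeup instead of performing it;
 * the wakeup would fire only after the UTRACE_STOP return value has
 * been recorded (or performed) for this engine. */
static u32 my_report_quiesce(u32 action, struct utrace_engine *engine,
                             unsigned long event)
{
        utrace_wake_up(&consumer_wq);   /* hypothetical deferred wake_up() */
        return UTRACE_STOP;
}

/* Tracer side: ordinary linux/wait.h, no utrace-specific waiting. */
wait_event(consumer_wq, data_ready(buf));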
Re: Possible problem with utrace_control
On 06/21, Roland McGrath wrote: I am wondering if there is another way to handle such a race... Suppose we - add UTRACE_STOP_XXX - modify utrace_control(UTRACE_STOP_XXX) to do mark_engine_wants_stop() and nothing more. utrace_control(current,,UTRACE_STOP) doesn't really do anything more. It does utrace_do_stop() which sets UTRACE_REPORT/TIF_NOTIFY_RESUME. Harmless, but makes sense to avoid. Then, probably producer could do utrace_control(current, UTRACE_STOP_XXX); slot = book_message_in_buffer(buf); if (slot < 0) return UTRACE_STOP_XXX; ... return UTRACE_RESUME; Even better, UTRACE_STOP_XXX can live outside of UTRACE_RESUME_MASK. Right, it doesn't belong in enum utrace_resume_action--it doesn't make sense as an argument for utrace_control, for example. So it could be something like a return-value flag bit saying stop if marked for stop, e.g. UTRACE_SINGLESTEP|UTRACE_STICKY_STOP for single-step, unless marked to stop, in which case stay stopped. Yes, this is what I meant. And yes, if UTRACE_STICKY_STOP doesn't fall into UTRACE_RESUME_MASK utrace_control(enum utrace_resume_action) shouldn't be used. OTOH, I am wondering if we can give more power to utrace_control(STOP). Currently debugger can't stop the tracee unless it cooperates. However, if the tracee doesn't have UTRACE_EVENT(QUIESCE), then utrace_control(STOP) works. This is strange. Stupid question: what if we just remove the clear_engine_wants_stop() code in finish_callback_report() ? So this would fix that race. ... (But note if there is another engine that decides to stop anyway, then we'll have used a second CPU and/or ping-pong scheduled to run the tracer thread in parallel with a very short amount of work in the tracee before it just blocks anyway.) Hmm. Not sure I understand... Yes, the tracee still can stop if another tracer demands, but this is fine? Btw, before I forgot about this. shouldn't utrace_control(UTRACE_RESUME) remove ENGINE_STOP from ->utrace_flags like DETACH does?
Here what the callback does is wake up the tracer, and return UTRACE_STOP. For this sort of scenario, it is clearly better by far to have the wake_up_tracer() done as late as possible in the sequence. In the past, Oleg has suggested something very general, like a utrace_engine_ops hook that gets called right before schedule(). I shy away from something like that, since I think it degrades the intended property of the API that it make it harder to write truly arcane bugs with terrible effects on the system, not easier. It's just too big a sledgehammer to wield above the thinly-protected toes of the utrace engine writer. To avoid something dangerously general, the only real option is to give something safely constrained and specific. It's straightforward enough to come up with something for any single synchronization method. For example, if every need for this kind of synchronization with tracees is based on linux/wait.h facilities, then we can have a utrace API feature whereby a callback can queue up a wait_queue_head_t pointer to be passed to wake_up_interruptible_sync() after all the callbacks are done and UTRACE_STOP is being performed. It's probably sufficient to allow one pending wake-up per engine, which makes it pretty cheap and simple to implement. Heh. it turns out that I misunderstood you in the past, I thought you dislike the whole idea of wake_up from utrace_stop(). I agree, the utrace_engine_ops->notify() hook is dangerous. But, it is simple and allows us to kill utrace_stop()->ptrace_notify_stop(). (yes I understand, this is not only about utrace_stop(), utrace_barrier wants notification too). It's straightforward enough to come up with something for any single synchronization method. Hmm. Not sure... at least I don't immediately see something simple. Except, just add wait_queue_head into struct utrace. This way utrace_barrier() can wait longer than needed if it notices utrace->reporting == engine.
Or we can add it into utrace_engine, in this case utrace_stop() needs list_for_each(engine, utrace->attached). Or I misunderstood you again? Oleg.
Re: Possible problem with utrace_control
utrace_control(current,,UTRACE_STOP) doesn't really do anything more. It does utrace_do_stop() which sets UTRACE_REPORT/TIF_NOTIFY_RESUME. Harmless, but makes sense to avoid. Right. I was talking about the API perspective. Those internal details are more or less invisible to a given engine. OTOH, I am wondering if we can give more power to utrace_control(STOP). Currently debugger can't stop the tracee unless it cooperates. However, if the tracee doesn't have UTRACE_EVENT(QUIESCE), then utrace_control(STOP) works. This is strange. Why is it strange? If you don't have any callback, then all your engine does is get stopped. If you have callbacks, then you always get callbacks before the thread does anything (including stop). Since your callback's return value always replaces any other resume action for your engine, you have to cooperate with yourself in not changing your mind the next time your code is in charge of the tracee. Stupid question: what if we just remove the clear_engine_wants_stop() code in finish_callback_report() ? That would change the API so that a callback's resume action is always overridden by a utrace_control(,,UTRACE_STOP). Today the practice from a third-party debugger thread is to do: ret = utrace_control(task, engine, UTRACE_STOP); if (ret == 0) { // Already stopped, e.g. was in jctl or another engine's stop. // This means we can do our asynchronous examination here. } else if (ret == -EINPROGRESS) { // it will get to report_quiesce soonish (modulo blocked in kernel) } If the examination you wanted to do was some predetermined sequence of fetching and fiddling before resuming, then the report_* callback can just do all that itself. For example, fetch the registers for a sample and keep going; insert some breakpoints and keep going; etc. Today, the callback that goes with that asynchronous-side code has the choice to do something and stop or to do something and not stop. 
With this change, it would have to explicitly request not stopping by calling utrace_control(current, engine, UTRACE_RESUME) inside the callback. It would need to use its own bookkeeping above the utrace layer to know that there was a stop it was supposed to clear (i.e. something the code sample above would set before calling utrace_control). This changes the set of races with asynchronous stop/resume scenarios. But it doesn't remove the asynchronous-resume vs callback-stop race. In concert with wake_up_tracer-after-stop perhaps it gets better. We'd have to really think through the races with the new API. So this would fix that race. ... (But note if there is another engine that decides to stop anyway, then we'll have used a second CPU and/or ping-pong scheduled to run the tracer thread in parallel with a very short amount of work in the tracee before it just blocks anyway.) Hmm. Not sure I understand... Yes, the tracee still can stop if another tracer demands, but this is fine? Correct. When that is what's going to happen (e.g. there is another engine attached that is a slow stopper like ptrace), then it's suboptimal to have the scheduler decide to yield the CPU immediately on waking your first tracer. Btw, before I forgot about this. shouldn't utrace_control(UTRACE_RESUME) remove ENGINE_STOP from ->utrace_flags like DETACH does? Perhaps so. My memory is dim on the use of that bit in utrace_flags. It's to close certain races, but we'll have to remind each other of the details. Heh. it turns out that I misunderstood you in the past, I thought you dislike the whole idea of wake_up from utrace_stop(). No, just the over-general hook or the over-specific ptrace hack. I talked about a utrace_wake_up() that would work this way. Hmm. Not sure... at least I don't immediately see something simple. Except, just add wait_queue_head into struct utrace. This way utrace_barrier() can wait longer than needed if it notices utrace->reporting == engine.
Or we can add it into utrace_engine, in this case utrace_stop() needs list_for_each(engine, utrace->attached). You're talking about two different things here. Whatever implementation details change in utrace_barrier() is not directly apropos. That would not change the API-visible details at all, just make utrace_barrier() perform better. Let's not get into that before we hash out the new API ideas. For the new API idea, I was talking about something like a utrace_wake_up() call. That might be implemented by a wait_queue_head_t * in utrace_engine, for example. The implementation internals could involve a simple iteration, or involve more hidden bookkeeping. I'm only talking about the API idea for the moment. I only mentioned implementation details at all to indicate that a restricted API (one wake-up per callback per engine) could be done with constant storage overhead and thus avoid a whole potential can of worms--which is the only real reason to
Re: Possible problem with utrace_control
On 06/17, Alpár Török wrote: The producer is a utrace monitored task. At any time there is an arbitrary number of producers. On sys_entry and sys_exit the producers do : slot = book_message_in_buffer(buf); if (slot < 0) { /* no slot can be allocated, producer must sleep */ enque_producer_on_wait_list(list, producer); return UTRACE_STOP; } I think Roland is right. Let's forget about utrace for the moment, this code looks like if (!CONDITION) { set_task_state(TASK_XXX); schedule(); } and it can obviously race with CONDITION = true; wake_up_xxx(); I am wondering if there is another way to handle such a race... Suppose we - add UTRACE_STOP_XXX - modify utrace_control(UTRACE_STOP_XXX) to do mark_engine_wants_stop() and nothing more. - finish_callback_report(UTRACE_STOP_XXX) does spin_lock(&utrace->lock); if (engine_wants_stop(engine)) action = UTRACE_STOP; else action = UTRACE_RESUME; spin_unlock(&utrace->lock); Then, probably producer could do utrace_control(current, UTRACE_STOP_XXX); slot = book_message_in_buffer(buf); if (slot < 0) return UTRACE_STOP_XXX; ... return UTRACE_RESUME; Even better, UTRACE_STOP_XXX can live outside of UTRACE_RESUME_MASK. Unlikely this makes sense, just a random thought. Oleg.
Re: Possible problem with utrace_control
I think Roland is right. Let's forget about utrace for the moment, this code looks like if (!CONDITION) { set_task_state(TASK_XXX); schedule(); } and it can obviously race with CONDITION = true; wake_up_xxx(); This is Linux in-kernel synchronization 101. The correct blocks of this sort do: set_current_state(TASK_UNINTERRUPTIBLE); while (!CONDITION) { schedule(); } set_current_state(TASK_RUNNING); This is entirely orthogonal to anything having to do with utrace per se. This pattern is race-free because wake_up_* will reset the waiter to TASK_RUNNING after making CONDITION true. So in the race where the while (!CONDITION) clause lets it get into schedule(), that just returns immediately, after which the second iteration sees CONDITION true. This is why I recommended using higher-level things like linux/wait.h facilities, where the correct usage patterns are more obvious. But the race we're actually talking about here is on the other side of the coin. I am wondering if there is another way to handle such a race... Suppose we - add UTRACE_STOP_XXX - modify utrace_control(UTRACE_STOP_XXX) to do mark_engine_wants_stop() and nothing more. utrace_control(current,,UTRACE_STOP) doesn't really do anything more. So a callback doing this closes the old race: UTRACE_STOP is in force when the tracer side wakes up, since utrace_control-on-current was done before wake_up_tracer(). But... - finish_callback_report(UTRACE_STOP_XXX) does spin_lock(&utrace->lock); if (engine_wants_stop(engine)) action = UTRACE_STOP; else action = UTRACE_RESUME; spin_unlock(&utrace->lock); Let's clarify what this means for the people following the utrace API but not its implementation internals: currently, the callback's return value overrides the engine's last-set state such as UTRACE_STOP, whether that came from a previous utrace_control before the callback, a utrace_control call inside the callback, or from the return value of a previous callback.
So now we've moved to a different race (or a different instance of the original race, if you like): after wake_up_tracer(), the tracer can come and clear our engine's UTRACE_STOP state with its utrace_control call; but then our callback return value just reestablishes UTRACE_STOP and we still stop, having lost the asynchronous resumption from the tracer. So Oleg has an idea for a new kind of callback return value, which I'd call perhaps UTRACE_NOP. This would mean to stay in UTRACE_STOP if the last utrace_control call on this engine used UTRACE_STOP, or the last callback return value left the engine in that state, otherwise UTRACE_RESUME. Then, probably producer could do utrace_control(current, UTRACE_STOP_XXX); slot = book_message_in_buffer(buf); if (slot < 0) return UTRACE_STOP_XXX; ... return UTRACE_RESUME; Even better, UTRACE_STOP_XXX can live outside of UTRACE_RESUME_MASK. Right, it doesn't belong in enum utrace_resume_action--it doesn't make sense as an argument for utrace_control, for example. So it could be something like a return-value flag bit saying stop if marked for stop, e.g. UTRACE_SINGLESTEP|UTRACE_STICKY_STOP for single-step, unless marked to stop, in which case stay stopped. So this would fix that race. For the scenario we're discussing here, that's probably just fine. The only reason to stop was for the consumer to run. If it woke up quickly and ran enough to call utrace_control while we were still finishing the return of the callback and utrace bookkeeping, then we just don't stop at all. (But note if there is another engine that decides to stop anyway, then we'll have used a second CPU and/or ping-pong scheduled to run the tracer thread in parallel with a very short amount of work in the tracee before it just blocks anyway.) Now let's consider another scenario. Take the normal debugger scenario, i.e.
a ptrace implementation done cleanly on the utrace API without the magic kludges we have today, or a ptrace replacement built cleanly from scratch in the ideal way, or the pending gdbstub implementation. Here what the callback does is wake up the tracer, and return UTRACE_STOP. What the tracer thread does is block in some fashion until the tracee's wake_up_tracer(), then do whatever logic (in ptrace, that means return to user mode and let it make another ptrace system call, but that all might be quite fast), and decide to examine the thread. To make sure its own callback is finished and UTRACE_STOP will be in force, it has to synchronize with utrace_barrier. So in maximum parallelism, this means, having preempted the tracee just now by waking up, we now yield back to the tracee to finish our callback, other engines' callbacks, and utrace bookkeeping.
Re: Possible problem with utrace_control
2010/6/13 Roland McGrath rol...@redhat.com: Thanks for your interest in utrace. It's correct that passing UTRACE_REPORT to utrace_control should always ensure you get some report_* callback before the tracee returns to user mode. However, I think your use may be susceptible to a race between resumption by utrace_control, and the UTRACE_STOP done by the callback. If your callback looks something like: wake_up_consumer(); return UTRACE_STOP; then there is a window when the consumer can wake up and call utrace_control before that callback has actually returned UTRACE_STOP and had it processed. In that event, your UTRACE_REPORT comes before the UTRACE_STOP, and then you do nothing afterwards to resume it. If it ever did resume, it would indeed report (the UTRACE_REPORT is still pending). But that doesn't help you. Off hand, I think there is only one way to address this race. When you get the notification to consume, before calling utrace_control, call utrace_barrier first. This will block the consumer until the producer's callback has actually finished and had UTRACE_STOP recorded for your engine. At that point, you can be sure that your utrace_control call will indeed wake up the stopped tracee. Oleg or others here can double-check my reasoning and help explain the situation if you share your code for them to see. Because of this possibility, it might make sense to have utrace_control return -EINPROGRESS for this situation. That at least would alert you that something might be amiss. Unlike other cases of -EINPROGRESS, this would not just mean that you need to call utrace_barrier to be sure the effect has completed. Rather, it means there is a danger of the scenario I described above, where the effect you asked for is going to be trumped by the callback's UTRACE_STOP. So it doesn't quite fit in with the other situations--it's not really an indication that you have to follow up with a synchronization, it's an indication that your synchronization logic is already wrong. 
But at least it would relieve the mystery of such a situation, assuming you are checking return values and noticing in debugging. I'm certainly open to opinions on this API change. The need to use utrace_barrier as a second synchronization after the main wake-up (done by whatever normal means you are using) is rather clunky, and when it matters (as we suspect it does in the occasional race in your case) it leads to ping-pong scheduling. So we would like to improve the utrace API in this area at some point. One idea we have discussed before is a utrace_wake_up call. (This assumes that some variant of linux/wait.h wake_up* is what you are using in the guts of wake_up_consumer.) This would replace wake_up* for use in a callback, and simply do the wake-up after all the callbacks are finished (or perhaps just yours, but probably all), so your return value action is already recorded. In fact, it might do the wakeup only after the UTRACE_STOP is not only recorded but actually performed, so that by the time your consumer wakes up, the tracee is actually stopped and available for third-party examination. (You don't actually care about that for your case--for your purposes, it would be optimal just to delay the wakeup until your own callback's return value is recorded. That way, if your consumer acts quickly enough on another CPU, it might even resume the tracee before it actually yields the processor.) As you might tell from the complexity of that paragraph, there are many fuzzy nuances of exactly what that better API might be. (And I didn't even delve into the possibility that you're using some synchronization mechanism other than wake_up* calls on a wait_queue_head_t.) Hence we haven't tried to work it out exactly yet. We hope that seeing a variety of uses such as yours will help us figure out how best to improve the utrace API for tracee-callback-to-tracer synchronization handshakes. Thanks, Roland Sorry for the long delay. Here's my scenario for better understanding.
The buffer that's used for communication is in essence a circular buffer, but instead of the buffer wrapping, the consumer is suspended. I have *a single* consumer at any time. The consumer waits for data to be available. The code looks like this: set_current_state(TASK_INTERRUPTIBLE); while (atomic_read(&buf->tail) == atomic_read(&buf->head)) { pc_debug("Reader task going to sleep!"); schedule(); if (signal_pending(current)) { return -ERESTARTSYS; } } set_current_state(TASK_RUNNING); once data is available it reads some, freeing up some space. After that it wakes up all producers. Producers are utrace monitored tasks. I use a standard linked list, protected by a spin lock, to hold references to the utrace engine and task structures that I need for utrace control. Is it possible to use a wait_queue for this ? I know it can
Re: Possible problem with utrace_control
You should use linux/wait.h or similar facilities rather than calling schedule() and wake_up_process() directly. This doesn't have anything to do with utrace, it's just the clean practice for normal kinds of blocking in the kernel, for a variety of reasons. That has to do with how your control thread blocks, which is just the normal set of issues for any thread in kernel code. You are using the lowest-level way of doing things, which is not recommended--that's why various higher-level facilities exist. The admonition to use utrace facilities for all blocks has to do with when you make a traced thread block. You are on the correct path in that regard. The relevant parts of your scenario match what I described. The race I talked about between your control thread's utrace_control() and tracee callbacks returning UTRACE_STOP explains what you are seeing. Thanks, Roland
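[For the reader loop quoted earlier in the thread, the linux/wait.h form Roland recommends would be roughly the following sketch. It assumes a wait_queue_head_t reader_wq that the producers' wake-up path passes to wake_up_interruptible(); reader_wq and the surrounding buffer layout are from the reporter's code, not a fixed API.]

/* Consumer side: wait_event_interruptible() performs the
 * set-state / check / schedule loop correctly, including the
 * signal_pending() handling done by hand in the original code. */
ret = wait_event_interruptible(reader_wq,
                atomic_read(&buf->tail) != atomic_read(&buf->head));
if (ret)
        return ret;     /* -ERESTARTSYS on signal */

/* Producer side, wherever space is freed or data is published: */
wake_up_interruptible(&reader_wq);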
Re: Possible problem with utrace_control
Thanks for your interest in utrace. It's correct that passing UTRACE_REPORT to utrace_control should always ensure you get some report_* callback before the tracee returns to user mode. However, I think your use may be susceptible to a race between resumption by utrace_control, and the UTRACE_STOP done by the callback. If your callback looks something like: wake_up_consumer(); return UTRACE_STOP; then there is a window when the consumer can wake up and call utrace_control before that callback has actually returned UTRACE_STOP and had it processed. In that event, your UTRACE_REPORT comes before the UTRACE_STOP, and then you do nothing afterwards to resume it. If it ever did resume, it would indeed report (the UTRACE_REPORT is still pending). But that doesn't help you. Off hand, I think there is only one way to address this race. When you get the notification to consume, before calling utrace_control, call utrace_barrier first. This will block the consumer until the producer's callback has actually finished and had UTRACE_STOP recorded for your engine. At that point, you can be sure that your utrace_control call will indeed wake up the stopped tracee. Oleg or others here can double-check my reasoning and help explain the situation if you share your code for them to see. Because of this possibility, it might make sense to have utrace_control return -EINPROGRESS for this situation. That at least would alert you that something might be amiss. Unlike other cases of -EINPROGRESS, this would not just mean that you need to call utrace_barrier to be sure the effect has completed. Rather, it means there is a danger of the scenario I described above, where the effect you asked for is going to be trumped by the callback's UTRACE_STOP. So it doesn't quite fit in with the other situations--it's not really an indication that you have to follow up with a synchronization, it's an indication that your synchronization logic is already wrong. 
But at least it would relieve the mystery of such a situation, assuming you are checking return values and noticing in debugging. I'm certainly open to opinions on this API change. The need to use utrace_barrier as a second synchronization after the main wake-up (done by whatever normal means you are using) is rather clunky, and when it matters (as we suspect it does in the occasional race in your case) it leads to ping-pong scheduling. So we would like to improve the utrace API in this area at some point. One idea we have discussed before is a utrace_wake_up call. (This assumes that some variant of linux/wait.h wake_up* is what you are using in the guts of wake_up_consumer.) This would replace wake_up* for use in a callback, and simply do the wake-up after all the callbacks are finished (or perhaps just yours, but probably all), so your return value action is already recorded. In fact, it might do the wakeup only after the UTRACE_STOP is not only recorded but actually performed, so that by the time your consumer wakes up, the tracee is actually stopped and available for third-party examination. (You don't actually care about that for your case--for your purposes, it would be optimal just to delay the wakeup until your own callback's return value is recorded. That way, if your consumer acts quickly enough on another CPU, it might even resume the tracee before it actually yields the processor.) As you might tell from the complexity of that paragraph, there are many fuzzy nuances of exactly what that better API might be. (And I didn't even delve into the possibility that you're using some synchronization mechanism other than wake_up* calls on a wait_queue_head_t.) Hence we haven't tried to work it out exactly yet. We hope that seeing a variety of uses such as yours will help us figure out how best to improve the utrace API for tracee-callback-to-tracer synchronization handshakes. Thanks, Roland
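[The barrier-then-control handshake recommended above might look like this on the consumer side. This is a sketch against the utrace kernel API; resume_producer() and the surrounding bookkeeping (how the task/engine pair was saved at attach time) are illustrative.]

/* Called by the consumer after its wake-up, to resume one producer. */
static int resume_producer(struct task_struct *task,
                           struct utrace_engine *engine)
{
        int ret;

        /*
         * Wait until the producer's callback has actually finished,
         * so its UTRACE_STOP return value is recorded for this engine.
         */
        ret = utrace_barrier(task, engine);
        if (ret)
                return ret;

        /* Now the resumption cannot be trumped by a pending UTRACE_STOP. */
        return utrace_control(task, engine, UTRACE_RESUME);
}

Without the utrace_barrier() call, utrace_control() can land in the window between the callback's wake-up and the recording of its UTRACE_STOP, which is exactly the lost-resume described above.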
Possible problem with utrace_control
Hi, I have the following scenario: Producer consumer model. The traced tasks are producers, and there is one consumer. Producers produce data in a buffer. If the buffer is full, the producer registers itself into a list (lock protected) and returns UTRACE_STOP. The consumer always checks the list and utrace_control(REPORT)s the producers (all of them). The monitoring callback is called again, and the producer produces what it should have in the first place. This works well most of the time. Sometimes it would seem that utrace_control doesn't do its job, and doesn't return an error either. I can observe this when there is high pressure on the buffer, and it's almost full so the STOP / RESUME cycle happens often. What I would like to know is how to go about clarifying this. First of all, is my assumption correct that UTRACE_REPORT will always cause another call to the monitoring callback? What other precautions should I take? If I am using utrace correctly, how could I find the problem within it? This problem leads to a deadlock in my model, since the consumer goes to sleep relying on the producer to wake it up, which in turn never wakes up itself. Alpar Torok