[PATCH 1/2] UTRACE_STOP race condition (updated)
Dear Roland, dear utrace developers, I have updated my patch #1 (it solves the race condition on utrace_stop but not the nesting issue) for the latest version of utrace. I am trying to get the patches updated downloading, compiling and testing the fixes every week or so... Things would be easier if these patch could be merged in the mainstream ;-) renzo diff -Naur linux-2.6.29-rc7-git5-utrace/kernel/utrace.c linux-2.6.29-rc7-git5-utrace-p1/kernel/utrace.c --- linux-2.6.29-rc7-git5-utrace/kernel/utrace.c2009-03-12 11:00:09.0 +0100 +++ linux-2.6.29-rc7-git5-utrace-p1/kernel/utrace.c 2009-03-12 11:05:50.0 +0100 @@ -376,6 +376,13 @@ return killed; } +static void mark_engine_wants_stop(struct utrace_engine *engine); +static void clear_engine_wants_stop(struct utrace_engine *engine); +static bool engine_wants_stop(struct utrace_engine *engine); +static void mark_engine_wants_resume(struct utrace_engine *engine); +static void clear_engine_wants_resume(struct utrace_engine *engine); +static bool engine_wants_resume(struct utrace_engine *engine); + /* * Perform %UTRACE_STOP, i.e. block in TASK_TRACED until woken up. * @task == current, @utrace == current-utrace, which is not locked. @@ -385,6 +392,7 @@ static bool utrace_stop(struct task_struct *task, struct utrace *utrace) { bool killed; + struct utrace_engine *engine, *next; /* * @utrace-stopped is the flag that says we are safely @@ -406,7 +414,23 @@ return true; } - utrace-stopped = 1; + /* final check: it is really needed to stop? */ + list_for_each_entry_safe(engine, next, utrace-attached, entry) { + if ((engine-ops != utrace_detached_ops) engine_wants_stop(engine)) { + if (engine_wants_resume(engine)) { + clear_engine_wants_stop(engine); + clear_engine_wants_resume(engine); + } + else + utrace-stopped = 1; + } + } + if (unlikely(!utrace-stopped)) { + spin_unlock_irq(task-sighand-siglock); + spin_unlock(utrace-lock); + return false; + } + __set_current_state(TASK_TRACED); /* @@ -632,6 +656,7 @@ * to record whether the engine is keeping the target thread stopped. */ #define ENGINE_STOP(1UL _UTRACE_NEVENTS) +#define ENGINE_RESUME (1UL (_UTRACE_NEVENTS+1)) static void mark_engine_wants_stop(struct utrace_engine *engine) { @@ -648,6 +673,21 @@ return (engine-flags ENGINE_STOP) != 0; } +static void mark_engine_wants_resume(struct utrace_engine *engine) +{ + engine-flags |= ENGINE_RESUME; +} + +static void clear_engine_wants_resume(struct utrace_engine *engine) +{ + engine-flags = ~ENGINE_RESUME; +} + +static bool engine_wants_resume(struct utrace_engine *engine) +{ + return (engine-flags ENGINE_RESUME) != 0; +} + /** * utrace_set_events - choose which event reports a tracing engine gets * @target:thread to affect @@ -906,6 +946,10 @@ list_move(engine-entry, detached); } else { flags |= engine-flags | UTRACE_EVENT(REAP); + if (engine_wants_resume(engine)) { + clear_engine_wants_stop(engine); + clear_engine_wants_resume(engine); + } wake = wake !engine_wants_stop(engine); } } @@ -1133,6 +1177,7 @@ * There might not be another report before it just * resumes, so make sure single-step is not left set. */ + mark_engine_wants_resume(engine); if (likely(resume)) user_disable_single_step(target); break;
Re: [PATCH] #2 UTRACE_STOP race condition nesting
Dear Roland, dear utrace developers, I have update also the second patch (which includes the first). This patch fixes the utrace_stop race condition and implements a consistent model of tracing engine nesting. renzo On Sat, Feb 14, 2009 at 10:11:55AM +0100, Renzo Davoli wrote: This is an updated patch. It solves the race condition + it gives a quick (a bit dirty) solution to issues 34. 3- Nesting, is it really useful to run all the reports in a row and (eventually) stop and the end waiting for all the engines? The patch waits for each engine to resume before notifying the next registered engine. 4- report_syscall_entry engines evaluation order should be reversed REPORT macros have an extra reverse argument. The macros append this string to the list_for_each_entry_safe function name. All the macro calls skip this argument except the one in report_syscall_entry where it is set to _reverse. With this patch it is possible to run nested kmview machines and ptrace works inside the virtual machines. This patch is a bit dirty because variables and sections of code needed to count and test the stopped engines are useless here: a task can be kept stopped for at most one engine at a time. This patch is a proof-of concept to show what I meant in my previous message. For what concerns 12 (not included in this patch): 1- Virtual Machines may need to change the system call THis is just to simplify the implementation of arch. independent virtual machine. I have kept the definition of missing functions in the kmview module code. 2- UTRACE_SYSCALL_ABORT: is it really useful as a return value for report_syscall_entry? It is useless for kmview as the decision of aborting the system call is taken while the process is stopped, I am currently setting the syscall number to -1 to skip the syscall. For the sake of completeness there is another way to implement the partial virtual machine stuff by introducing another quiescence state inside the report upcalls. I mean: when utrace calls a report function (say for example report_syscall_entry), the function in the module puts the process in a stopped state (maybe its TASK_TRACED and calls the schedule). From utrace's point of view the report function does not return until all the changes in the task state have been completed and the decision UTRACE_RESUME/UTRACE_SYSCALL_ABORT has been taken. In this way UTRACE_STOP is never used because the module has to implement another feature similar to UTRACE_STOP on its own. So what is UTRACE_STOP for? ciao renzo --- --- kernel/utrace.c.mcgrath 2009-03-05 15:09:57.0 +0100 +++ kernel/utrace.c 2009-03-06 11:49:15.0 +0100 @@ -369,6 +369,13 @@ return killed; } +static void mark_engine_wants_stop(struct utrace_engine *engine); +static void clear_engine_wants_stop(struct utrace_engine *engine); +static bool engine_wants_stop(struct utrace_engine *engine); +static void mark_engine_wants_resume(struct utrace_engine *engine); +static void clear_engine_wants_resume(struct utrace_engine *engine); +static bool engine_wants_resume(struct utrace_engine *engine); + /* * Perform %UTRACE_STOP, i.e. block in TASK_TRACED until woken up. * @task == current, @utrace == current-utrace, which is not locked. @@ -378,6 +385,7 @@ static bool utrace_stop(struct task_struct *task, struct utrace *utrace) { bool killed; + struct utrace_engine *engine, *next; /* * @utrace-stopped is the flag that says we are safely @@ -399,7 +407,23 @@ return true; } - utrace-stopped = 1; + /* final check: is really needed to stop? */ + list_for_each_entry_safe(engine, next, utrace-attached, entry) { + if ((engine-ops != utrace_detached_ops) engine_wants_stop(engine)) { + if (engine_wants_resume(engine)) { + clear_engine_wants_stop(engine); + clear_engine_wants_resume(engine); + } + else + utrace-stopped = 1; + } + } + if (unlikely(!utrace-stopped)) { + spin_unlock_irq(task-sighand-siglock); + spin_unlock(utrace-lock); + return false; + } + __set_current_state(TASK_TRACED); /* @@ -625,6 +649,7 @@ * to record whether the engine is keeping the target thread stopped. */ #define ENGINE_STOP(1UL _UTRACE_NEVENTS) +#define ENGINE_RESUME (1UL (_UTRACE_NEVENTS+1)) static void mark_engine_wants_stop(struct utrace_engine *engine) { @@ -641,6 +666,21 @@ return (engine-flags ENGINE_STOP) != 0; } +static void mark_engine_wants_resume(struct utrace_engine *engine) +{ + engine-flags |= ENGINE_RESUME; +} + +static void clear_engine_wants_resume(struct
Re: UTRACE_STOP race condition?
Renzo Davoli re...@cs.unibo.it writes: [...] If the report function returns UTRACE_STOP the traced process stays in a quiescent state and the module wakes it up by a utrace_control(...,UTRACE_RESUME) call *later*. [...] If the module wakes the traced process too quickly, utrace has not yet put it into a stopped state, therefore UTRACE_RESUME gets lost. [...] The module has decided UTRACE_STOP at t=1, then the module can call utrace_control(...,UTRACE_RESUME) at any t1. [...] This may not answer your question, but I believe it is not proper to to make this call at any time t1, only once you receive the quiesce callback. - FChE
Re: UTRACE_STOP race condition?
On Wed, Feb 11, 2009 at 09:45:15AM -0500, Frank Ch. Eigler wrote: This may not answer your question, but I believe it is not proper to to make this call at any time t1, only once you receive the quiesce callback. Maybe I am wrong but the quiesce callback gets called *before* the other report_* (say syscall_entry). So when I capture UTRACE_QUIESCE, I got the report call before t=1. Some communication from utrace to the module should happen *after* utrace-stopped is set to 1 (something similar to the code Roland added for ptrace). Even if it worked this way (i.e. return STOP and wait for report_quiesce, I think the race condition there is in any case) the interface to the module would be horrible. When the module receives a report callback, it returns UTRACE_STOP and then it needs to use some data structure to wait for a report_quiesce to restart the traced process. With the idea of patch included in my previous mail there is no need of such a complexity. Thank you for taking part to this discussion renzo