Suppose that we want to detach the engine and free engine->data. To avoid the races with our calbacks which can use ->data we should do either
err = utrace_set_events(0); if (err == ...) utrace_barrier(); utrace_control(DETACH); or err = utrace_control(DETACH); if (err = ...) utrace_barrier(); Neither variant works if we should care about report_quiesce() or report_death(). Let's discuss the 2nd case, the 1st one have the similar problems. The problem is, utrace_control(DETACH) does nothing and returns -EALREADY if utrace->death is set, this is not right. We can and should detach in this case, we only should skip utrace_reset() to avoid the race with utrace_report_death()->REPORT_CALLBACKS(). This was the main problem I hit during the testing. Consider the exiting tracee with the valid engine->ops, suppose that utrace_control(DETACH) is called right after utrace_report_death() unlocks utrace->lock and before it does REPORT_CALLBACKS(). utrace_control() fails, but utrace_barrier() does not help after that. It takes get_utrace_lock() successfully and returns 0. The caller frees engine->data, but utrace_report_death() calls ->report_death() after that. Kill utrace_control_dead(). Change utrace_control() to always allow UTRACE_DETACH if engine has the valid ->ops. This means that mark_engine_detached() can race with REPORT_CALLBACKS() but there is nothing new. utrace_do_stop() is unnecessary and pointless in this case, but it doesn't hurt. Signed-off-by: Oleg Nesterov <o...@redhat.com> --- kernel/utrace.c | 49 ++++++++++--------------------------------------- 1 file changed, 10 insertions(+), 39 deletions(-) --- kstub/kernel/utrace.c~6_fix_control_dead 2010-08-16 11:17:05.000000000 +0200 +++ kstub/kernel/utrace.c 2010-08-16 11:17:51.000000000 +0200 @@ -928,37 +928,6 @@ void utrace_maybe_reap(struct task_struc } } -/* - * You can't do anything to a dead task but detach it. - * If release_task() has been called, you can't do that. - * - * On the exit path, DEATH and QUIESCE event bits are set only - * before utrace_report_death() has taken the lock. At that point, - * the death report will come soon, so disallow detach until it's - * done. This prevents us from racing with it detaching itself. - * - * Called only when @target->exit_state is nonzero. - */ -static inline int utrace_control_dead(struct task_struct *target, - struct utrace *utrace, - enum utrace_resume_action action) -{ - lockdep_assert_held(&utrace->lock); - - if (action != UTRACE_DETACH || unlikely(utrace->reap)) - return -ESRCH; - - if (unlikely(utrace->death)) - /* - * We have already started the death report. We can't - * prevent the report_death and report_reap callbacks, - * so tell the caller they will happen. - */ - return -EALREADY; - - return 0; -} - /** * utrace_control - control a thread being traced by a tracing engine * @target: thread to affect @@ -1115,18 +1084,20 @@ int utrace_control(struct task_struct *t ret = 0; /* - * ->exit_state can change under us, this doesn't matter. - * We do not care about ->exit_state in fact, but we do - * care about ->reap and ->death. If either flag is set, - * we must also see ->exit_state != 0. + * ->exit_state can change under us, this doesn't matter. But, + * if utrace->death is set we must also see ->exit_state != 0, + * this is what we care about. */ if (unlikely(target->exit_state)) { - ret = utrace_control_dead(target, utrace, action); - if (ret) { + /* You can't do anything to a dead task but detach it */ + if (action != UTRACE_DETACH) { spin_unlock(&utrace->lock); - return ret; + return -ESRCH; } - reset = true; + + /* If it is not set, we can safely do utrace_reset() */ + if (likely(!utrace->death)) + reset = true; } switch (action) {