On 07/20, Roland McGrath wrote:
>
> Yeah, it might just need this:
>
> diff --git a/kernel/utrace.c b/kernel/utrace.c
> index 74b5fc5..60af805 100644
> --- a/kernel/utrace.c
> +++ b/kernel/utrace.c
> @@ -549,6 +549,7 @@ restart:
> list_for_each_entry_safe(engine, next, &utrace->attached, entry) {
> ops = engine->ops;
> engine->ops = NULL;
> + engine->flags = 0;
> list_move(&engine->entry, &detached);
I think this makes sense regardless.
> put_detached_list(&detached);
> +
> + /*
> + * Make sure all engine->flags and engine->ops updates are done
> + * before the current task_struct might really die.
> + */
> + smp_wmb();
I don't think we need additional barriers here or in prepare/finish,
rcu_call() should play correctly with rcu_read_lock().
IOW, rcu_read_lock() must have enough "barrierness" wrt rcu_call().
> Let me explain the general notion. It's always been a little iffy, and it
> definitely deserves an airing now in your fresh review of all things utrace.
>
> The root of the thing is that modules can't use get_task_struct().
Aaaah.
> The other calls use get_utrace_lock to implement this guarantee. It checks
> that your engine is attached before it looks at the task_struct, and this
> is (supposedly) synchronized with detach and reap safely. (You should be
> checking on that supposition, Oleg. ;-)
I still think EXIT_DEAD check must die ;)
> For a direct poke at task->thread.something you can make it work.
>
> rcu_read_lock();
> ret = utrace_prepare_examine(task, engine, &exam);
> if (ret) goto hell;
> fiddle = task->faddle;
> rcu_read_unlock();
> ret = utrace_finish_examine(task, engine, &exam);
> if (ret) goto hell;
>
> Even though utrace_prepare_examine (wait_task_inactive) will block and
> that lets RCU stuff happen, when you've come back again you will be in
> rcu_read_lock across the put_task_struct it does. So task is live until
> rcu_read_unlock(). (I have no idea if intentionally allowing a block
> inside rcu_read_lock freaks out lockdep or something.)
Hmm. I think without CONFIG_PREEMPT_RCU schedule_debug() will complain
if wait_task_inactive() blocks.
In any case, without CONFIG_PREEMPT_RCU we must not schedule() under
rcu_read_lock(), because rcuclassic considers schedule() call as
"->passed_quiesc = true".
> This suggests perhaps utrace_prepare_examine ought to keep a task ref to
> be released by utrace_finish_examine. But then it's a back-door way for
> modules to use get/put_task_struct and leak task_structs. Is that OK?
Oh, don't ask me ;)
But I think this is very natural. Yes, with this change the caller must
always do utrace_finish_examine() if _prepare_ succeeds, but a buggy
utrace module can leak memory or just crash the kernel in many ways.
Or we can export __put_task_struct(), this means more work for utrace
user.
Or. Actually, when I started read utrace.c I was surprised there is
no "task_struct *tracee" in "struct utrace_engine". Perhaps engine
itself should ping task_struct ? And engine->tracee can be very handy.
Oleg.