On Mon, Mar 06, 2017 at 02:14:59PM +0100, Peter Zijlstra wrote:
> On Mon, Mar 06, 2017 at 10:57:07AM +0100, Dmitry Vyukov wrote:
>
> > ==================================================================
> > BUG: KASAN: use-after-free in atomic_dec_and_test
> > arch/x86/include/asm/atomic.h:123 [inline] at addr ffff880079c30158
> > BUG: KASAN: use-after-free in put_task_struct
> > include/linux/sched/task.h:93 [inline] at addr ffff880079c30158
> > BUG: KASAN: use-after-free in put_ctx+0xcf/0x110
>
> FWIW, this output is very confusing, is this a result of your
> post-processing replicating the line for every 'inlined' part?
>
> > kernel/events/core.c:1131 at addr ffff880079c30158
> > Write of size 4 by task syz-executor6/25698
>
> > atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
> > put_task_struct include/linux/sched/task.h:93 [inline]
> > put_ctx+0xcf/0x110 kernel/events/core.c:1131
> > perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
> > perf_release+0x37/0x50 kernel/events/core.c:4338
> > __fput+0x332/0x800 fs/file_table.c:209
> > ____fput+0x15/0x20 fs/file_table.c:245
> > task_work_run+0x197/0x260 kernel/task_work.c:116
> > exit_task_work include/linux/task_work.h:21 [inline]
> > do_exit+0xb38/0x29c0 kernel/exit.c:880
> > do_group_exit+0x149/0x420 kernel/exit.c:984
> > get_signal+0x7e0/0x1820 kernel/signal.c:2318
> > do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
> > exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
> > syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
> > do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
> > entry_SYSCALL64_slow_path+0x25/0x25
>
> So this is fput()..
> >
> > Freed:
> > PID = 25681
> > save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
> > save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> > set_track mm/kasan/kasan.c:525 [inline]
> > kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:589
> > __cache_free mm/slab.c:3514 [inline]
> > kmem_cache_free+0x71/0x240 mm/slab.c:3774
> > free_task_struct kernel/fork.c:158 [inline]
> > free_task+0x151/0x1d0 kernel/fork.c:370
> > copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
> > copy_process kernel/fork.c:1531 [inline]
> > _do_fork+0x200/0x1010 kernel/fork.c:1994
> > SYSC_clone kernel/fork.c:2104 [inline]
> > SyS_clone+0x37/0x50 kernel/fork.c:2098
> > do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
> > return_from_SYSCALL_64+0x0/0x7a
>
> and this is a failed fork().
>
>
> However, inherited events don't have a filedesc to fput(), and
> similarly, a task that fails fork() has never been visible to attach a perf
> event to because it never hits the pid-hash.
>
> Or so it is assumed.
>
> I'm forever getting lost in the PID code. Oleg, is there any way
> find_task_by_vpid() can return a task that can still fail fork() ?

So I _think_ find_task_by_vpid() can return an already dead task; and
we'll happily increase task->usage.

Dmitry; I have no idea how easy it is for you to reproduce the thing;
but so far I've not had much success. Could you perhaps stick the below
in?

Once we convert task_struct to refcount_t that should generate a WARN of
its own I suppose.

---
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 000fdb2..612d652 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -763,6 +763,7 @@ struct perf_event_context {
 #ifdef CONFIG_CGROUP_PERF
 	int				nr_cgroups;	 /* cgroup evts */
 #endif
+	int				switches;
 	void				*task_ctx_data; /* pmu specific data */
 	struct rcu_head			rcu_head;
 };
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6f41548f..6455b7a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2902,6 +2902,8 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 	if (!parent && !next_parent)
 		goto unlock;
 
+	ctx->switches++;
+
 	if (next_parent == ctx || next_ctx == parent || next_parent == parent) {
 		/*
 		 * Looks like the two contexts are clones, so we might be
@@ -3780,6 +3782,12 @@ find_lively_task_by_vpid(pid_t vpid)
 		task = current;
 	else
 		task = find_task_by_vpid(vpid);
+
+	if (task) {
+		if (WARN_ON_ONCE(task->flags & PF_EXITING))
+			task = NULL;
+	}
+
 	if (task)
 		get_task_struct(task);
 	rcu_read_unlock();
@@ -10432,6 +10440,10 @@ void perf_event_free_task(struct task_struct *task)
 
 		mutex_unlock(&ctx->mutex);
 
+		WARN_ON_ONCE(ctx->switches);
+		WARN_ON_ONCE(atomic_read(&ctx->refcount) != 1);
+		WARN_ON_ONCE(ctx->task != task);
+
 		put_ctx(ctx);
 	}
 }
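
To illustrate why a refcount_t conversion would be expected to warn by itself when a reference is taken on an already dead task, here is a minimal single-threaded userspace sketch. It is not kernel code: struct model_ref and the model_* helpers are hypothetical names invented for this example, nothing here is atomic, and the real refcount_t API differs in its details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a refcounted object such as a task. */
struct model_ref {
	int count;	/* 0 means the last reference is gone, object is dead */
	int warnings;	/* how many suspicious increments were seen */
};

/* Plain increment, like a bare atomic_inc(): silently revives a dead object. */
static void model_get_unchecked(struct model_ref *r)
{
	r->count++;
}

/*
 * Checked increment: refuse to go from 0 to 1 and record a warning.
 * Returns true only if a reference was actually taken.
 */
static bool model_get_checked(struct model_ref *r)
{
	if (r->count == 0) {
		r->warnings++;
		fprintf(stderr, "model_ref: increment on 0; possible use-after-free\n");
		return false;
	}
	r->count++;
	return true;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool model_put(struct model_ref *r)
{
	assert(r->count > 0);
	return --r->count == 0;
}
```

With the unchecked variant, a lookup that races with the final put brings the count back to 1 on an object that is already being freed, which is the pattern the KASAN report would be consistent with. The checked variant flags it at the point of the increment instead.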