On 02/22, Prashant Bhole wrote:
>
> Hi,
> I encountered following BUG caught by KASAN with recent kernels when trying
> out [BCC project] bcc/testing/python/test_usdt2.py
> Tried with v4.12, it was reproducible.
>
> --- KASAN log ---
> BUG: KASAN: use-after-free in uprobe_perf_close+0x118/0x1a0
> Read of size 4 at addr ffff8800bb2db4cc by task test_usdt2.py/1265
>
> CPU: 2 PID: 1265 Comm: test_usdt2.py Not tainted 4.16.0-rc2-next-20180220+
> #38
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26
> 04/01/2014
> Call Trace:
>  dump_stack+0x5c/0x80
>  print_address_description+0x73/0x290
>  kasan_report+0x257/0x380
>  ? uprobe_perf_close+0x118/0x1a0
>  uprobe_perf_close+0x118/0x1a0

...

> Freed by task 1265:
>  __kasan_slab_free+0x135/0x180
>  kmem_cache_free+0xaf/0x230
>  rcu_process_callbacks+0x559/0xd90
>  __do_softirq+0x125/0x3a2

Hmm. this looks strange, I do not see no __put_task_struct/free_task in this
trace... OK, nevermind.

> After debugging, found that uprobe_perf_close() is called after task has
> been terminated and uprobe_perf_close() tries to access task_struct of the
> terminated process.

Oh. You can't imagine how much I forgot this code ;) I will recheck, but at
first glance you are right. We can't rely on _free_event()->put_ctx() which
does put_task_struct() after event->destroy(), the exiting task does
put_task_struct(current) itself and sets child_ctx->task = TASK_TOMBSTONE in
perf_event_exit_task_context().

In short, nothing protects event->hw.target. But uprobe_perf_open() should be
safe, perf_init_event() is called when the caller has the additional reference.

I am wondering if this was wrong from the very beginning or it was broken later,
but I won't even try to check.

>  static int uprobe_perf_close(struct trace_uprobe *tu, struct perf_event
> *event)
>  {
> +    int err = 0;
>      bool done;
> 
>      write_lock(&tu->filter.rwlock);
> @@ -1054,9 +1055,12 @@ static int uprobe_perf_close(struct trace_uprobe *tu,
> struct perf_event *event)
>      write_unlock(&tu->filter.rwlock);
> 
>      if (!done)
> -        return uprobe_apply(tu->inode, tu->offset, &tu->consumer, false);
> +        err =  uprobe_apply(tu->inode, tu->offset, &tu->consumer, false);
> 
> -    return 0;
> +    if (event->hw.target)
> +        put_task_struct(event->hw.target);
> +
> +    return err;
>  }

No need to delay put_task_struct(), you can call it right after "done = ..."
in the "if (event->hw.target)" block and simplify this change.

However. Probably this is the simplest fix, but it doesn't look nice. We do not
really need task_struct, we need its ->mm. PF_EXITING check can be removed, this
is a minor optimization.

perhaps we can add _something_ like

        struct mm_struct *uprobe_event_mm(event)
        {
                // needs rcu_read_lock/READ_ONCE/etc
                if (event->ctx &&
                    event->ctx->task && event->ctx->task != TASK_TOMBSTONE)
                        return event->ctx->task->mm;

                return NULL;
        }

and use it instead of event->hw.target->mm ... Not sure.


And. What about other users of event->hw.target? Say, task_bp_pinned(). It 
doesn't
dereference this pointer, How can we trust the result of "iter->hw.target == 
tsk"
if hw.target can be freed and then re-alloced?


This all makes me think that we should change (fix) kernel/events/core.c...

Oleg.

Reply via email to