On Wed, 2017-02-15 at 22:41 +0000, Ben Hutchings wrote:
> 3.2.85-rc1 review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Peter Zijlstra <pet...@infradead.org>
> 
> commit 321027c1fe77f892f4ea07846aeae08cefbbb290 upstream.
> 
> Di Shen reported a race between two concurrent sys_perf_event_open()
> calls where both try and move the same pre-existing software group
> into a hardware context.
> 
> The problem is exactly that described in commit:
> 
>   f63a8daa5812 ("perf: Fix event->ctx locking")
> 
> ... where, while we wait for a ctx->mutex acquisition, the event->ctx
> relation can have changed under us.
> 
> That very same commit failed to recognise sys_perf_event_context() as an
> external access vector to the events and thereby didn't apply the
> established locking rules correctly.
> 
> So while one sys_perf_event_open() call is stuck waiting on
> mutex_lock_double(), the other (which owns said locks) moves the group
> about. So by the time the former sys_perf_event_open() acquires the
> locks, the context we've acquired is stale (and possibly dead).
> 
> Apply the established locking rules as per perf_event_ctx_lock_nested()
> to the mutex_lock_double() for the 'move_group' case. This obviously means
> we need to validate state after we acquire the locks.
[...]
>               /*
>                * See perf_event_ctx_lock() for comments on the details
>                * of swizzling perf_event::ctx.
>                */
> -             mutex_lock_double(&gctx->mutex, &ctx->mutex);
> -
>               perf_remove_from_context(group_leader, false);
>  
>               /*
> @@ -6709,10 +6757,8 @@ SYSCALL_DEFINE5(perf_event_open,
>       ++ctx->generation;
>       perf_unpin_context(ctx);
>  
> -     if (move_group) {
> -             mutex_unlock(&gctx->mutex);
> -             put_ctx(gctx);
> -     }
> +     if (move_group)
> +             perf_event_ctx_unlock(group_leader, gctx);
>       mutex_unlock(&ctx->mutex);
>  
>       event->owner = current;
[...]

Peter has clarified that the last call to put_ctx(gctx) corresponds to
the reference cleared by perf_remove_from_context(group_leader, false)
above.  So although perf_event_ctx_unlock() also calls put_ctx(gctx),
we really do want to drop two references here now and should keep the
direct call.

I made the same error when backporting to 3.16, and will fix that as
well.

Ben.

-- 
Ben Hutchings
73.46% of all statistics are made up.

Attachment: signature.asc
Description: This is a digitally signed message part

Reply via email to