On Wed, Jan 11, 2017 at 05:03:58PM +0100, Peter Zijlstra wrote:
> On Wed, Jan 11, 2017 at 02:59:20PM +0000, Mark Rutland wrote:
> > On Fri, Dec 09, 2016 at 02:59:00PM +0100, Peter Zijlstra wrote:

> > > +  * If we get a false negative, things are complicated. If we are after
> > > +  * perf_event_context_sched_in() ctx::lock will serialize us, and the
> > > +  * value must be correct. If we're before, it doesn't matter since
> > > +  * perf_event_context_sched_in() will program the counter.
> > > +  *
> > > +  * However, this hinges on the remote context switch having observed
> > > +  * our task->perf_event_ctxp[] store, such that it will in fact take
> > > +  * ctx::lock in perf_event_context_sched_in().
> > 
> > Sorry if I'm being thick here, but which store are we describing above?
> > i.e. which function, how does that relate to perf_install_in_context()?
> 
> The only store to perf_event_ctxp[] of interest is the initial one in
> find_get_context().

Ah, I see. I'd missed the rcu_assign_pointer() when looking around for
an assignment.

> > I haven't managed to wrap my head around why this matters. :/
> 
> See the scenario from:
> 
> https://lkml.kernel.org/r/[email protected]
> 
> It's installing the first event on 't', which concurrently with the
> install gets migrated to a third CPU.

I was completely failing to consider that this was the installation of
the first event; I should have read the existing comment. Things make a
lot more sense now.

> CPU0            CPU1            CPU2
> 
>                 (current == t)
> 
> t->perf_event_ctxp[] = ctx;
> smp_mb();
> cpu = task_cpu(t);
> 
>                 switch(t, n);
>                                 migrate(t, 2);
>                                 switch(p, t);
> 
>                                 ctx = t->perf_event_ctxp[]; // must not be NULL
> 
> smp_function_call(cpu, ..);
> 
>                 generic_exec_single()
>                   func();
>                     spin_lock(ctx->lock);
>                     if (task_curr(t)) // false
> 
>                     add_event_to_ctx();
>                     spin_unlock(ctx->lock);
> 
>                                 perf_event_context_sched_in();
>                                   spin_lock(ctx->lock);
>                                   // sees event
> 
> 
> 
> So it's CPU0's store of t->perf_event_ctxp[] that must not go 'missing'.
> Because if CPU2's load of that variable were to observe NULL, it would
> not try to schedule the ctx and we'd have a task running without its
> counter, which would be 'bad'.
> 
> As long as we observe !NULL, we'll acquire ctx->lock. If we acquire it
> first and not see the event yet, then CPU0 must observe task_running()
> and retry. If the install happens first, then we must see the event on
> sched-in and all is well.

I think I follow now. Thanks for bearing with me!
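
For my own reference, the ordering requirement looks like a classic
store-buffering pattern, which can be modelled in userspace with C11
atomics. This is only a rough sketch and not kernel code: ctxp,
task_cpu_val, dummy_ctx and run_litmus() are hypothetical stand-ins, the
seq_cst fences stand in for smp_mb(), and it assumes the context-switch
path provides a full barrier between updating the task's CPU and loading
perf_event_ctxp[]. The outcome that must never happen is CPU0 still
seeing the old CPU while CPU2 sees a NULL ctx.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's data structures. */
static _Atomic(void *) ctxp;     /* models t->perf_event_ctxp[]           */
static atomic_int task_cpu_val;  /* models task_cpu(t): 1 = old, 2 = new  */
static int dummy_ctx;            /* models the freshly allocated ctx      */

static int installer_saw_cpu;    /* what "CPU0" read for task_cpu(t)      */
static void *sched_saw_ctx;      /* what "CPU2" read for perf_event_ctxp  */

/* "CPU0": publish the ctx, full barrier, then sample the task's CPU. */
static void *installer(void *arg)
{
	(void)arg;
	atomic_store_explicit(&ctxp, (void *)&dummy_ctx, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);      /* smp_mb() */
	installer_saw_cpu =
		atomic_load_explicit(&task_cpu_val, memory_order_relaxed);
	return NULL;
}

/* "CPU2": take over the task, full barrier (assumed), then load the ctx. */
static void *scheduler(void *arg)
{
	(void)arg;
	atomic_store_explicit(&task_cpu_val, 2, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);      /* assumed barrier */
	sched_saw_ctx = atomic_load_explicit(&ctxp, memory_order_relaxed);
	return NULL;
}

/*
 * Run the two sides concurrently 'iterations' times and count how often
 * both miss each other's store: CPU0 sees the old CPU (1) *and* CPU2
 * sees a NULL ctx. With both fences in place this count should stay 0.
 */
static int run_litmus(int iterations)
{
	int forbidden = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t a, b;

		atomic_store(&ctxp, NULL);
		atomic_store(&task_cpu_val, 1);

		assert(pthread_create(&a, NULL, installer, NULL) == 0);
		assert(pthread_create(&b, NULL, scheduler, NULL) == 0);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		if (installer_saw_cpu == 1 && sched_saw_ctx == NULL)
			forbidden++;
	}
	return forbidden;
}
```

Calling run_litmus() with a few thousand iterations should return 0.
Dropping either fence makes the "both missed" outcome permitted by the
memory model, which is the missing-counter case described above.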

> In any case, I'll try and write a proper Changelog for this...

If it's just the commit message and/or comments changing, feel free to
add:

Tested-by: Mark Rutland <[email protected]>

Thanks,
Mark.
