Peter Zijlstra <pet...@infradead.org> writes:

> On Fri, Mar 04, 2016 at 03:42:46PM +0200, Alexander Shishkin wrote:
>> @@ -4649,10 +4679,22 @@ static void perf_mmap_close(struct vm_area_struct *vma)
>>       */
>>      if (rb_has_aux(rb) && vma->vm_pgoff == rb->aux_pgoff &&
>>          atomic_dec_and_mutex_lock(&rb->aux_mmap_count, &event->mmap_mutex)) {
>> +            /*
>> +             * Stop all aux events that are writing to this here buffer,
>> +             * so that we can free its aux pages and corresponding pmu
>> +             * data. Note that after rb::aux_mmap_count dropped to zero,
>> +             * they won't start any more (see perf_aux_output_begin()).
>> +             */
>> +            perf_pmu_output_stop(event);
>
> So to me it seems like we're interested in rb, we don't particularly
> care about @event in this case.

Yeah, @event is used only for its rb and pmu down this path.

>> +    if (!has_aux(event))
>> +            return;
>> +
>
>       if (!parent)
>               parent = event;
>
>> +    if (rcu_dereference(event->rb) == rb)
> s/event/parent/
>
>> +            ro->err = __perf_event_stop(event);
>
>> +    else if (parent && rcu_dereference(parent->rb) == rb)
>> +            ro->err = __perf_event_stop(event);
>
> and these can go.. However..

You're right: it's the parent that's got the ->rb, but it may well be
the child that's actually running and writing data there, so we need to
stop any running children as well. I think I actually broke this bit
since the previous version, which is why there needs to be a comment
about it.
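To make the above concrete, here's a sketch (not the actual patch; names
and details follow the quoted code, but the folded-in parent handling is
my reading of Peter's suggestion) of what __perf_event_output_stop could
look like with the parent fallback and a comment about inherited children:

    static void __perf_event_output_stop(struct perf_event *event, void *data)
    {
    	struct perf_event *parent = event->parent;
    	struct remote_output *ro = data;
    	struct ring_buffer *rb = ro->rb;

    	if (!has_aux(event))
    		return;

    	if (!parent)
    		parent = event;

    	/*
    	 * In case of inheritance, it will be the parent that links to the
    	 * ring buffer, but it will be the child that's actually using it;
    	 * so match on the parent's rb, but stop @event itself.
    	 */
    	if (rcu_dereference(parent->rb) == rb)
    		ro->err = __perf_event_stop(event);
    }

This folds the two `rcu_dereference()` checks from the quoted hunk into one,
as suggested.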

>
>> +}
>> +
>> +static int __perf_pmu_output_stop(void *info)
>> +{
>> +    struct perf_event *event = info;
>> +    struct pmu *pmu = event->pmu;
>> +    struct perf_cpu_context *cpuctx = get_cpu_ptr(pmu->pmu_cpu_context);
>> +    struct remote_output ro = {
>> +            .rb     = event->rb,
>> +    };
>> +
>> +    rcu_read_lock();
>> +    perf_event_aux_ctx(&cpuctx->ctx, __perf_event_output_stop, &ro);
>> +    if (cpuctx->task_ctx)
>> +            perf_event_aux_ctx(cpuctx->task_ctx, __perf_event_output_stop,
>> +                               &ro);
>> +    rcu_read_unlock();
>> +
>> +    return ro.err;
>> +}
>> +
>> +static void perf_pmu_output_stop(struct perf_event *event)
>> +{
>> +    int cpu, err;
>> +
>> +    /* better be thorough */
>> +    get_online_cpus();
>> +restart:
>> +    for_each_online_cpu(cpu) {
>> +            err = cpu_function_call(cpu, __perf_pmu_output_stop, event);
>> +            if (err)
>> +                    goto restart;
>> +    }
>> +    put_online_cpus();
>> +}
>
> This seems wildly overkill, could we not iterate rb->event_list like we
> do for the normal buffer?

Actually we can. One problem, though, is that iterating rb::event_list
requires an rcu read section or the irqsafe rb::event_lock, while we
also need to send IPIs. The normal buffer case tears down
rb::event_list as it goes, so it can close the rcu read section right
after it fetches one event from the list. In this case, however, we
must keep the list intact.
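For illustration, an rb::event_list-based perf_pmu_output_stop() might look
roughly like the sketch below (my assumption of how it could be structured,
not code from this patch): iterate the list under rcu, IPI the cpu each
event is on, and restart from scratch if an IPI races with the event
migrating away:

    static void perf_pmu_output_stop(struct perf_event *event)
    {
    	struct perf_event *iter;
    	int err, cpu;

    restart:
    	rcu_read_lock();
    	list_for_each_entry_rcu(iter, &event->rb->event_list, rb_entry) {
    		/*
    		 * For per-cpu events, use event::cpu; for per-task
    		 * events, the cpu it's currently running on (if any).
    		 */
    		cpu = iter->cpu;
    		if (cpu == -1)
    			cpu = READ_ONCE(iter->oncpu);

    		if (cpu == -1)
    			continue;

    		err = cpu_function_call(cpu, __perf_pmu_output_stop, iter);
    		if (err == -EAGAIN) {
    			rcu_read_unlock();
    			goto restart;
    		}
    	}
    	rcu_read_unlock();
    }

Note the rcu read section stays open across the whole walk, which is what
makes this different from the normal-buffer teardown path.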

> Sure, we need to IPI for each event found, but that seems better than
> unconditionally sending IPIs to all CPUs.

Actually, won't it "often" be the case that the number of events is a
multiple of the number of cpus? The usual use case is one event per
task per cpu with inheritance enabled, in which case we zap multiple
events per IPI.

Regards,
--
Alex
