> -----Original Message-----
> From: Peter Zijlstra [mailto:[email protected]]
> Sent: Friday, February 26, 2016 5:37 AM
> To: Liang, Kan
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH 1/1] perf/core: find auxiliary events in running pmus
> list
> 
> > index 9> +static void account_running_pmu(struct perf_event *event)
> > +{
> > +   struct running_pmu *pmu;
> > +
> > +   mutex_lock(&running_pmus_lock);
> > +
> > +   list_for_each_entry(pmu, &running_pmus, entry) {
> > +           if (pmu->pmu == event->pmu) {
> > +                   pmu->nr_event++;
> > +                   goto out;
> > +           }
> > +   }
> > +
> > +   pmu = kzalloc(sizeof(struct running_pmu), GFP_KERNEL);
> > +   if (pmu != NULL) {
> > +           pmu->nr_event++;
> > +           pmu->pmu = event->pmu;
> > +           list_add_rcu(&pmu->entry, &running_pmus);
> > +   }
> 
> That kzalloc() doesn't make any sense, why not simply add a member to
> struct pmu?
> 
> > +out:
> > +   mutex_unlock(&running_pmus_lock);
> > +}
> 
> In any case, can't we replace the whole perf_event_aux muck with a data
> structure for finding interested events?
> 
> Because not only is iterating all PMUs silly, we then further iterate all
> events in whatever contexts we find from them. Even if none of the
> events in these contexts is interested in the side-band event.
> 
> We have:
>       mmap,comm,task,mmap_data,mmap2,comm_exec,context_switc
> h
> 
> 
> Which I think we can reduce like:
> 
>       {mmap,mmap_data,mmap2} -> mmap
>       {comm,comm_exec} -> comm
> 
>       mmap,comm,task,context_switch
> 
> This would allow us to keep 4 per-cpu lists of events like:
> 
> 
> struct pmu_event_list {
>       raw_spinlock_t          lock;
>       struct list_head        list;
> };
> 
> enum event_sb_channel {
>       sb_mmap = 0,
>       sb_comm,
>       sb_task,
>       sb_switch,
>       sb_nr,
> }
> 
> static DEFINE_PER_CPU(struct pmu_event_list, pmu_sb_events[sb_nr]);
> 
> static void attach_sb_event(struct perf_event *event, enum
> event_sb_channel sb) {
>       struct pmu_event_list *pel = per_cpu_ptr(&pmu_sb_events[sb],
> event->cpu);
> 
>       raw_spin_lock(&pel->lock);
>       list_add_rcu(&event->sb_list[sb], &pel->list);
>       raw_spin_unlock(&pel->lock);
> }
> 
> static void account_pmu_sb_event(struct perf_event *event) {
>       if (event->parent)
>               return;
> 
>       if (event->attach & ATTACH_TASK)
>               return;
> 
>       if (event->attr.mmap || event->attr.mmap_data || event-
> >attr.mmap2)
>               attach_sb_event(event, sb_mmap);
> 
>       if (event->attr.comm || event->attr.comm_exec)
>               attach_sb_event(event, sb_comm);
> 
>       if (event->attr.task)
>               attach_sb_event(event, sb_task);
> 
>       if (event->attr.context_switch)
>               attach_sb_event(event, sb_switch);
> }
> 
> /* matching unaccount */
> 
> 
> static void perf_event_sb_iterate(enum event_sb_channel sb,
>                                 perf_event_sb_output_f output, void
> *data) {
>       struct pmu_event_list *pel =
> __this_cpu_ptr(&pmu_sb_events[sb]);
>       struct perf_event *event;
> 
>       list_for_each_entry_rcu(event, &pel->list, sb_list[sb]) {
>               if (event->state < PERF_EVENT_STATE_INACTIVE)
>                       continue;
>               if (!event_filter_match(event))
>                       continue;
>               output(event, data);
>       }
> }
> 
> static void perf_event_sb_mask(unsigned int sb_mask,
> perf_event_sb_output_f output, void *data) {
>       int sb;
> 
>       for (sb = 0; sb < sb_nr; sb++) {
>               if (!(sb_mask & (1 << sb)))
>                       continue;
>               perf_event_sb_iterate(sb, output, data);
>       }
> }
> 
> Note: the mask is needed because a task event (as per perf_event_task)
> needs to go out to sb_comm, sb_mmap and sb_task, see
> perf_event_task_match().
> 
> And then you can replace the whole global part of perf_event_aux (which I
> would rename to perf_event_sb), with this.
> 
> You still have to do something like:
> 
>       for_each_task_context_nr(ctxn) {
>               ctx = rcu_dereference(current->perf_event_ctxp[ctxn]);
>               if (!ctx)
>                       continue;
>               perf_event_sb_ctx(ctx, output, data);
>       }
>

I'm not sure why we need something like for_each_task_context_nr(ctxn) in
perf_event_aux/perf_event_sb.
Isn't perf_event_sb_mask enough?
It looks perf_event_sb_mask will go through all possible interested events
(includes both per cpu and per task events). That's what we want to do in
perf_event_aux, right?
 
Thanks,
Kan

> To get at the per task events, because I don't think we want to go update
> more global state on context switch, nor am I entirely sure its worth it to
> keep per sb ctx->event_list[]s.
> 
> 

Reply via email to