> -----Original Message-----
> From: Peter Zijlstra [mailto:[email protected]]
> Sent: Friday, February 26, 2016 5:37 AM
> To: Liang, Kan
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH 1/1] perf/core: find auxiliary events in running pmus
> list
>
> > +static void account_running_pmu(struct perf_event *event)
> > +{
> > +	struct running_pmu *pmu;
> > +
> > +	mutex_lock(&running_pmus_lock);
> > +
> > +	list_for_each_entry(pmu, &running_pmus, entry) {
> > +		if (pmu->pmu == event->pmu) {
> > +			pmu->nr_event++;
> > +			goto out;
> > +		}
> > +	}
> > +
> > +	pmu = kzalloc(sizeof(struct running_pmu), GFP_KERNEL);
> > +	if (pmu != NULL) {
> > +		pmu->nr_event++;
> > +		pmu->pmu = event->pmu;
> > +		list_add_rcu(&pmu->entry, &running_pmus);
> > +	}
>
> That kzalloc() doesn't make any sense, why not simply add a member to
> struct pmu?
>
> > +out:
> > +	mutex_unlock(&running_pmus_lock);
> > +}
>
> In any case, can't we replace the whole perf_event_aux muck with a data
> structure for finding interested events?
>
> Because not only is iterating all PMUs silly, we then further iterate all
> events in whatever contexts we find from them. Even if none of the
> events in these contexts is interested in the side-band event.
> We have:
>
>   mmap, comm, task, mmap_data, mmap2, comm_exec, context_switch
>
> Which I think we can reduce like:
>
>   {mmap, mmap_data, mmap2} -> mmap
>   {comm, comm_exec}        -> comm
>
>   mmap, comm, task, context_switch
>
> This would allow us to keep 4 per-cpu lists of events like:
>
> struct pmu_event_list {
> 	raw_spinlock_t lock;
> 	struct list_head list;
> };
>
> enum event_sb_channel {
> 	sb_mmap = 0,
> 	sb_comm,
> 	sb_task,
> 	sb_switch,
> 	sb_nr,
> };
>
> static DEFINE_PER_CPU(struct pmu_event_list, pmu_sb_events[sb_nr]);
>
> static void attach_sb_event(struct perf_event *event, enum event_sb_channel sb)
> {
> 	struct pmu_event_list *pel = per_cpu_ptr(&pmu_sb_events[sb], event->cpu);
>
> 	raw_spin_lock(&pel->lock);
> 	list_add_rcu(&event->sb_list[sb], &pel->list);
> 	raw_spin_unlock(&pel->lock);
> }
>
> static void account_pmu_sb_event(struct perf_event *event)
> {
> 	if (event->parent)
> 		return;
>
> 	if (event->attach_state & PERF_ATTACH_TASK)
> 		return;
>
> 	if (event->attr.mmap || event->attr.mmap_data || event->attr.mmap2)
> 		attach_sb_event(event, sb_mmap);
>
> 	if (event->attr.comm || event->attr.comm_exec)
> 		attach_sb_event(event, sb_comm);
>
> 	if (event->attr.task)
> 		attach_sb_event(event, sb_task);
>
> 	if (event->attr.context_switch)
> 		attach_sb_event(event, sb_switch);
> }
>
> /* matching unaccount */
>
> static void perf_event_sb_iterate(enum event_sb_channel sb,
> 				  perf_event_sb_output_f output, void *data)
> {
> 	struct pmu_event_list *pel = this_cpu_ptr(&pmu_sb_events[sb]);
> 	struct perf_event *event;
>
> 	list_for_each_entry_rcu(event, &pel->list, sb_list[sb]) {
> 		if (event->state < PERF_EVENT_STATE_INACTIVE)
> 			continue;
> 		if (!event_filter_match(event))
> 			continue;
> 		output(event, data);
> 	}
> }
>
> static void perf_event_sb_mask(unsigned int sb_mask,
> 			       perf_event_sb_output_f output, void *data)
> {
> 	int sb;
>
> 	for (sb = 0; sb < sb_nr; sb++) {
> 		if (!(sb_mask & (1 << sb)))
> 			continue;
> 		perf_event_sb_iterate(sb, output, data);
> 	}
> }
>
> Note: the mask is needed because
> a task event (as per perf_event_task) needs to go out to sb_comm,
> sb_mmap and sb_task, see perf_event_task_match().
>
> And then you can replace the whole global part of perf_event_aux (which
> I would rename to perf_event_sb) with this.
>
> You still have to do something like:
>
> 	for_each_task_context_nr(ctxn) {
> 		ctx = rcu_dereference(current->perf_event_ctxp[ctxn]);
> 		if (!ctx)
> 			continue;
> 		perf_event_sb_ctx(ctx, output, data);
> 	}
>
I'm not sure why we need something like for_each_task_context_nr(ctxn)
in perf_event_aux/perf_event_sb. Isn't perf_event_sb_mask enough?

It looks like perf_event_sb_mask will go through all possible interested
events (both per-CPU and per-task events). That's what we want to do in
perf_event_aux, right?

Thanks,
Kan

> To get at the per task events, because I don't think we want to go
> update more global state on context switch, nor am I entirely sure it's
> worth it to keep per-sb ctx->event_list[]s.
>

