RE: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-10 Thread Dey, Megha


>-Original Message-
>From: Jiri Olsa [mailto:jo...@redhat.com]
>Sent: Saturday, November 4, 2017 6:26 AM
>To: Megha Dey <megha@linux.intel.com>
>Cc: x...@kernel.org; linux-ker...@vger.kernel.org; linux-
>d...@vger.kernel.org; t...@linutronix.de; mi...@redhat.com;
>h...@zytor.com; andriy.shevche...@linux.intel.com;
>kstew...@linuxfoundation.org; Yu, Yu-cheng <yu-cheng...@intel.com>;
>Brown, Len <len.br...@intel.com>; gre...@linuxfoundation.org;
>pet...@infradead.org; a...@kernel.org;
>alexander.shish...@linux.intel.com; namhy...@kernel.org;
>vikas.shiva...@linux.intel.com; pombreda...@nexb.com;
>m...@kylehuey.com; b...@suse.de; Andrejczuk, Grzegorz
><grzegorz.andrejc...@intel.com>; Luck, Tony <tony.l...@intel.com>;
>cor...@lwn.net; Shankar, Ravi V <ravi.v.shan...@intel.com>; Dey, Megha
><megha@intel.com>
>Subject: Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch
>Monitoring support
>
>On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:
>
>SNIP
>
>> +
>> +static void intel_bm_event_update(struct perf_event *event) {
>> +union bm_detect_status cur_stat;
>> +
>> +rdmsrl(BR_DETECT_STATUS_MSR, cur_stat.raw);
>> +local64_set(>hw.prev_count, (uint64_t)cur_stat.raw); }
>> +
>> +static void intel_bm_event_stop(struct perf_event *event, int mode) {
>> +wrmsrl(BR_DETECT_COUNTER_CONFIG_BASE + event->id,
>> +   (event->hw.bm_counter_conf & ~1));
>> +
>> +intel_bm_event_update(event);
>> +}
>> +
>> +static void intel_bm_event_del(struct perf_event *event, int flags) {
>> +intel_bm_event_stop(event, flags);
>> +}
>> +
>> +static void intel_bm_event_read(struct perf_event *event) { }
>
>should you call intel_bm_event_update in here? so the read syscall gets
>updated data in case the case the event is active

We do not update the event->count in the intel_bm_event_update function. We are 
basically saving the status register contents when a task is being scheduled 
out. So it has nothing to do with the read syscall. Having said that, I will 
probably put what stop() does in del() and get rid of the stop() function.
>
>jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-10 Thread Dey, Megha


>-Original Message-
>From: Jiri Olsa [mailto:jo...@redhat.com]
>Sent: Saturday, November 4, 2017 6:26 AM
>To: Megha Dey <megha@linux.intel.com>
>Cc: x...@kernel.org; linux-ker...@vger.kernel.org; linux-
>d...@vger.kernel.org; t...@linutronix.de; mi...@redhat.com;
>h...@zytor.com; andriy.shevche...@linux.intel.com;
>kstew...@linuxfoundation.org; Yu, Yu-cheng <yu-cheng...@intel.com>;
>Brown, Len <len.br...@intel.com>; gre...@linuxfoundation.org;
>pet...@infradead.org; a...@kernel.org;
>alexander.shish...@linux.intel.com; namhy...@kernel.org;
>vikas.shiva...@linux.intel.com; pombreda...@nexb.com;
>m...@kylehuey.com; b...@suse.de; Andrejczuk, Grzegorz
><grzegorz.andrejc...@intel.com>; Luck, Tony <tony.l...@intel.com>;
>cor...@lwn.net; Shankar, Ravi V <ravi.v.shan...@intel.com>; Dey, Megha
><megha@intel.com>
>Subject: Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch
>Monitoring support
>
>On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:
>
>SNIP
>
>> +static unsigned int bm_threshold = BM_MAX_THRESHOLD; static
>unsigned
>> +int bm_mispred_evt_cnt;
>> +
>> +/* Branch monitoring counter owners */ static struct perf_event
>> +*bm_counter_owner[2];
>
>SNIP
>
>> + * Find a hardware counter for the target task
>> + */
>> +for (i = 0; i < bm_num_counters; i++) {
>> +if ((bm_counter_owner[i] == NULL) ||
>> +(bm_counter_owner[i]->state ==
>PERF_EVENT_STATE_DEAD)) {
>> +counter_to_use = i;
>> +bm_counter_owner[i] = event;
>> +break;
>> +}
>> +}
>> +
>> +if (counter_to_use == -1)
>> +return -EBUSY;
>
>not sure I understand, your docs says: "There are 2 8-bit counters that each..
>"
>
>so there are 2 counters per CPU? if that's corrent, isn't this check too strict
>then? you could have more events configured running on other CPUs for
>another tasks
>
>given that we do task only events here, should bm_counter_owner be part
>of task, together with the limit..? I'm probably missing something..

Yes you are right. Initially, we had support for 2 events(from one or 2 tasks) 
to be monitored for the entire system. This indeed seems very limiting. In the 
next patchset, I will add support for 2 events per task (This is what the 
hardware can support).
>
>thanks,
>jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-10 Thread Dey, Megha


>-Original Message-
>From: Jiri Olsa [mailto:jo...@redhat.com]
>Sent: Saturday, November 4, 2017 6:25 AM
>To: Megha Dey <megha@linux.intel.com>
>Cc: x...@kernel.org; linux-ker...@vger.kernel.org; linux-
>d...@vger.kernel.org; t...@linutronix.de; mi...@redhat.com;
>h...@zytor.com; andriy.shevche...@linux.intel.com;
>kstew...@linuxfoundation.org; Yu, Yu-cheng <yu-cheng...@intel.com>;
>Brown, Len <len.br...@intel.com>; gre...@linuxfoundation.org;
>pet...@infradead.org; a...@kernel.org;
>alexander.shish...@linux.intel.com; namhy...@kernel.org;
>vikas.shiva...@linux.intel.com; pombreda...@nexb.com;
>m...@kylehuey.com; b...@suse.de; Andrejczuk, Grzegorz
><grzegorz.andrejc...@intel.com>; Luck, Tony <tony.l...@intel.com>;
>cor...@lwn.net; Shankar, Ravi V <ravi.v.shan...@intel.com>; Dey, Megha
><megha@intel.com>
>Subject: Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch
>Monitoring support
>
>On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:
>
>SNIP
>
>> +event->event_caps |= PERF_EV_CAP_BM;
>> +/*
>> + * cfg contains one of the 6 possible Branch Monitoring events
>> + */
>> +cfg = event->attr.config;
>> +if (cfg < 0 || cfg > (BM_MAX_EVENTS - 1))
>> +return -EINVAL;
>> +
>> +if (event->attr.sample_period) /* no sampling */
>> +return -EINVAL;
>
>you can use the 'is_sampling_event' function

Will make the change.
>
>jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-10 Thread Dey, Megha


>-Original Message-
>From: Jiri Olsa [mailto:jo...@redhat.com]
>Sent: Saturday, November 4, 2017 6:25 AM
>To: Megha Dey <megha@linux.intel.com>
>Cc: x...@kernel.org; linux-ker...@vger.kernel.org; linux-
>d...@vger.kernel.org; t...@linutronix.de; mi...@redhat.com;
>h...@zytor.com; andriy.shevche...@linux.intel.com;
>kstew...@linuxfoundation.org; Yu, Yu-cheng <yu-cheng...@intel.com>;
>Brown, Len <len.br...@intel.com>; gre...@linuxfoundation.org;
>pet...@infradead.org; a...@kernel.org;
>alexander.shish...@linux.intel.com; namhy...@kernel.org;
>vikas.shiva...@linux.intel.com; pombreda...@nexb.com;
>m...@kylehuey.com; b...@suse.de; Andrejczuk, Grzegorz
><grzegorz.andrejc...@intel.com>; Luck, Tony <tony.l...@intel.com>;
>cor...@lwn.net; Shankar, Ravi V <ravi.v.shan...@intel.com>; Dey, Megha
><megha@intel.com>
>Subject: Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch
>Monitoring support
>
>On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:
>
>SNIP
>
>> +
>> +static int intel_bm_event_nmi_handler(unsigned int cmd, struct
>> +pt_regs *regs) {
>> +struct perf_event *event;
>> +union bm_detect_status stat;
>> +struct perf_sample_data data;
>> +int i;
>> +unsigned long x;
>> +
>> +rdmsrl(BR_DETECT_STATUS_MSR, stat.raw);
>> +
>> +if (stat.event) {
>> +wrmsrl(BR_DETECT_STATUS_MSR, 0);
>> +apic_write(APIC_LVTPC, APIC_DM_NMI);
>> +/*
>> + * Issue wake-up to corrresponding polling event
>> + */
>> +x = stat.ctrl_hit;
>> +for_each_set_bit(i, , bm_num_counters) {
>> +event = bm_counter_owner[i];
>> +perf_sample_data_init(, 0, event-
>>hw.last_period);
>> +perf_event_overflow(event, , regs);
>
>hum, it's non sampling events only right? then you don't need any of the
>perf_sample_data stuff.. the perf_event_overflow call is basicaly nop

Yeah you are right. We were supporting sampling initially, forgot to get rid of 
this code.
Will change this in the next version.
>
>> +local64_inc(>count);
>> +atomic_set(>hw.bm_poll, POLLIN);
>> +event->pending_wakeup = 1;
>> +irq_work_queue(>pending);
>
>also this is for sampling events only
>
>seems like you only want to increment the event->count in here

We are currently working on a library similar to libperf which user space apps 
could make use of instead of perf command line monitoring. This code has been 
added so that the user can poll on when/if an interrupt is triggered and let 
the user know of its occurrence. If you think there is a better way of doing 
this, please let me know :)

>
>thanks,
>jirka
>
>> +}
>> +return NMI_HANDLED;
>> +}
>> +return NMI_DONE;
>> +}
>
>
>SNIP
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-10 Thread Megha Dey
On Mon, 2017-11-06 at 13:49 +0200, Alexander Shishkin wrote:
> On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:
> > +static int intel_bm_event_init(struct perf_event *event)
> > +{
> 
> ...
> 
> > +   /*
> > +* Find a hardware counter for the target task
> > +*/
> > +   for (i = 0; i < bm_num_counters; i++) {
> > +   if ((bm_counter_owner[i] == NULL) ||
> > +   (bm_counter_owner[i]->state == PERF_EVENT_STATE_DEAD)) {
> > +   counter_to_use = i;
> > +   bm_counter_owner[i] = event;
> > +   break;
> 
> How are two concurrent perf_event_open()s not going to race here?
> Also, I'm not sure what's the value of looking at the ->state here.
> Shouldn't the ->destroy() method clear the corresponding array slot?

Yes you are right. I will add a locking mechanism here to prevent racing
and remove the ->state in the next version.
> 
> > +   }
> > +   }
> 
> ...
> 
> > +   wrmsrl(BR_DETECT_COUNTER_CONFIG_BASE + counter_to_use,
> > +   event->hw.bm_counter_conf);
> > +   wrmsrl(BR_DETECT_STATUS_MSR, 0);
> 
> These wrmsrs will happen on whatever CPU perf_event_open() is called on,
> as opposed to the CPU where the event will be scheduled. You probably want
> to keep the MSR accesses in the start()/stop() callbacks.

Agreed, don't think we need this code here. We are writing to the MSRs
in start() anyways.
> 
> Regards,
> --
> Alex
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-04 Thread Jiri Olsa
On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:

SNIP

> +
> +static int intel_bm_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
> +{
> + struct perf_event *event;
> + union bm_detect_status stat;
> + struct perf_sample_data data;
> + int i;
> + unsigned long x;
> +
> + rdmsrl(BR_DETECT_STATUS_MSR, stat.raw);
> +
> + if (stat.event) {
> + wrmsrl(BR_DETECT_STATUS_MSR, 0);
> + apic_write(APIC_LVTPC, APIC_DM_NMI);
> + /*
> +  * Issue wake-up to corrresponding polling event

s/rrr/rr/ ;-)

jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-04 Thread Jiri Olsa
On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:

SNIP

> +
> +static void intel_bm_event_update(struct perf_event *event)
> +{
> + union bm_detect_status cur_stat;
> +
> + rdmsrl(BR_DETECT_STATUS_MSR, cur_stat.raw);
> + local64_set(>hw.prev_count, (uint64_t)cur_stat.raw);
> +}
> +
> +static void intel_bm_event_stop(struct perf_event *event, int mode)
> +{
> + wrmsrl(BR_DETECT_COUNTER_CONFIG_BASE + event->id,
> +(event->hw.bm_counter_conf & ~1));
> +
> + intel_bm_event_update(event);
> +}
> +
> +static void intel_bm_event_del(struct perf_event *event, int flags)
> +{
> + intel_bm_event_stop(event, flags);
> +}
> +
> +static void intel_bm_event_read(struct perf_event *event)
> +{
> +}

should you call intel_bm_event_update in here? so the read syscall
gets updated data in case the case the event is active

jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-04 Thread Jiri Olsa
On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:

SNIP

> +static unsigned int bm_threshold = BM_MAX_THRESHOLD;
> +static unsigned int bm_mispred_evt_cnt;
> +
> +/* Branch monitoring counter owners */
> +static struct perf_event *bm_counter_owner[2];

SNIP

> +  * Find a hardware counter for the target task
> +  */
> + for (i = 0; i < bm_num_counters; i++) {
> + if ((bm_counter_owner[i] == NULL) ||
> + (bm_counter_owner[i]->state == PERF_EVENT_STATE_DEAD)) {
> + counter_to_use = i;
> + bm_counter_owner[i] = event;
> + break;
> + }
> + }
> +
> + if (counter_to_use == -1)
> + return -EBUSY;

not sure I understand, your docs says: "There are 2 8-bit counters that each.. "

so there are 2 counters per CPU? if that's corrent, isn't this
check too strict then? you could have more events configured
running on other CPUs for another tasks

given that we do task only events here, should bm_counter_owner be part of task,
together with the limit..? I'm probably missing something..

thanks,
jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-04 Thread Jiri Olsa
On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:

SNIP

> + event->hw.bm_ctrl = (bm_window_size << BM_WINDOW_SIZE_SHIFT) |
> + (bm_guest_disable << BM_GUEST_DISABLE_SHIFT) |
> + (bm_lbr_freeze << BM_LBR_FREEZE_SHIFT) |
> + (bm_window_cnt_sel << BM_WINDOW_CNT_SEL_SHIFT) |
> + (bm_cnt_and_mode << BM_CNT_AND_MODE_SHIFT) |
> + BM_ENABLE;
> + event->hw.bm_counter_conf = (bm_threshold << BM_THRESHOLD_SHIFT) |
> + (bm_mispred_evt_cnt << BM_MISPRED_EVT_CNT_SHIFT) |
> + (cfg << BM_EVENT_TYPE_SHIFT);
> +
> + wrmsrl(BR_DETECT_COUNTER_CONFIG_BASE + counter_to_use,
> + event->hw.bm_counter_conf);
> + wrmsrl(BR_DETECT_STATUS_MSR, 0);
> + event->id = counter_to_use;

I think you need to add your own id under hw_perf_event::intel_bm

we use event->id as event unique id expected also in perf stat
for group reading or stat record

thanks,
jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-04 Thread Jiri Olsa
On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:

SNIP

> +
> +static int intel_bm_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
> +{
> + struct perf_event *event;
> + union bm_detect_status stat;
> + struct perf_sample_data data;
> + int i;
> + unsigned long x;
> +
> + rdmsrl(BR_DETECT_STATUS_MSR, stat.raw);
> +
> + if (stat.event) {
> + wrmsrl(BR_DETECT_STATUS_MSR, 0);
> + apic_write(APIC_LVTPC, APIC_DM_NMI);
> + /*
> +  * Issue wake-up to corrresponding polling event
> +  */
> + x = stat.ctrl_hit;
> + for_each_set_bit(i, , bm_num_counters) {
> + event = bm_counter_owner[i];
> + perf_sample_data_init(, 0, event->hw.last_period);
> + perf_event_overflow(event, , regs);

hum, it's non sampling events only right? then you don't need
any of the perf_sample_data stuff.. the perf_event_overflow call
is basicaly nop

> + local64_inc(>count);
> + atomic_set(>hw.bm_poll, POLLIN);
> + event->pending_wakeup = 1;
> + irq_work_queue(>pending);

also this is for sampling events only

seems like you only want to increment the event->count in here

thanks,
jirka

> + }
> + return NMI_HANDLED;
> + }
> + return NMI_DONE;
> +}


SNIP
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V0 2/3] perf/x86/intel/bm.c: Add Intel Branch Monitoring support

2017-11-04 Thread Jiri Olsa
On Fri, Nov 03, 2017 at 11:00:05AM -0700, Megha Dey wrote:

SNIP

> + event->event_caps |= PERF_EV_CAP_BM;
> + /*
> +  * cfg contains one of the 6 possible Branch Monitoring events
> +  */
> + cfg = event->attr.config;
> + if (cfg < 0 || cfg > (BM_MAX_EVENTS - 1))
> + return -EINVAL;
> +
> + if (event->attr.sample_period) /* no sampling */
> + return -EINVAL;

you can use the 'is_sampling_event' function

jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html