Re: [RFC PATCH 0/7] Share events between metrics

2020-12-15 Thread Andi Kleen
> Or, is the concern more about trying to time-slice the results in a
> fairly granular way and still expecting accurate results?

Usually the latter. It's especially important for divisions. You want
both divisor and dividend to be in the same time slice, otherwise
the result usually doesn't make a lot of sense.
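
A toy C illustration of that failure mode (made-up numbers; a sketch,
not anything from perf itself):

#include <stdio.h>

/* Extrapolate a multiplexed count the way perf stat does:
 * estimate = raw_count * time_enabled / time_running. */
static double scale(double raw, double enabled, double running)
{
        return raw * enabled / running;
}

int main(void)
{
        /* Two-phase workload: a hot phase retiring 4e9 instructions in
         * 1e9 cycles, then a cold phase retiring 0.5e9 in 1e9 cycles.
         * True IPC over the whole run: 4.5e9 / 2e9 = 2.25. */

        /* Suppose instructions were counted only during the hot phase
         * and cycles only during the cold phase, each on a counter for
         * half of the enabled time: */
        double insns  = scale(4.0e9, 2.0, 1.0);  /* -> 8.0e9 */
        double cycles = scale(1.0e9, 2.0, 1.0);  /* -> 2.0e9 */

        printf("apparent IPC = %.2f, true IPC = 2.25\n", insns / cycles);
        /* Prints 4.00: each scaled count is individually plausible, but
         * the ratio is far off because the divisor and dividend were
         * live in different time slices. */
        return 0;
}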

-Andi


Re: [RFC PATCH 0/7] Share events between metrics

2020-12-15 Thread Paul A. Clarke
On Thu, May 07, 2020 at 10:43:43PM -0700, Ian Rogers wrote:
> On Thu, May 7, 2020 at 2:47 PM Andi Kleen  wrote:
> >
> > > > - without this change events within a metric may get scheduled
> > > >   together; after it they may appear as part of a larger group and be
> > > >   multiplexed at different times, lowering accuracy - however, less
> > > >   multiplexing may compensate for this.

Does multiplexing somewhat related events at different times actually reduce
accuracy, or is it just more likely to give that appearance?

It seems that perf measurements are only useful if the workload is in a
fairly steady state.  If there is some wobbling, then measuring at the
same time is more accurate for the periods where the events are being
measured simultaneously, but may be far off for when they are not being
measured at all.  Spreading them out over a longer duration may actually
increase accuracy by sampling over more varied intervals.
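
For reference on how this shows up in the numbers: perf stat extrapolates
each multiplexed count as

  estimate = raw_count * time_enabled / time_running

and the percentage printed after each event (e.g. "(49.96%)" in the outputs
later in this thread) is time_running / time_enabled, i.e. the fraction of
the run during which the event was actually on a counter.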

Or, is the concern more about trying to time-slice the results in a
fairly granular way and still expecting accurate results?

(Or, maybe my ignorance is showing again.  :-)

PC


Re: [RFC PATCH 0/7] Share events between metrics

2020-05-07 Thread Ian Rogers
On Thu, May 7, 2020 at 2:47 PM Andi Kleen  wrote:
>
> > > - without this change events within a metric may get scheduled
> > >   together; after it they may appear as part of a larger group and be
> > >   multiplexed at different times, lowering accuracy - however, less
> > >   multiplexing may compensate for this.
> >
> > I agree the heuristic in this patch set is naive and would welcome help
> > improving it based on your toplev experience. I think this change is
> > progress on TopDownL1 - would you agree?
>
> > TopdownL1 in non-SMT mode should always fit. Inside a group,
> > deduping always makes sense.
>
> The problem is SMT mode where it doesn't fit. toplev tries
> to group each node and each level together.

Thanks Andi, I've provided some examples of TopDownL3_SMT in the cover
letter of the v3 patch set:
https://lore.kernel.org/lkml/20200508053629.210324-1-irog...@google.com/
I tested sandybridge and cascadelake and the results look similar to
the non-SMT version. Let me know if there's a different variant to
test.

> >
> > I'm wondering if what is needed are flags to control behavior. For
> > example, avoiding the use of groups altogether. For TopDownL1 I see.
>
> Yes the current situation isn't great.
>
> For Topdown your patch clearly is an improvement; I'm not sure
> it is one for everything, though.
>
> Probably the advanced heuristics are only useful for a few
> formulas; most are very simple. So maybe it's ok. I guess it
> would need some testing over the existing formulas.

Agreed; do you have a pointer to a metric group where things would
obviously be worse? I started off with cache miss and hit rate
metrics, and as with topdown this approach is a benefit.

In v3 I've added a --metric-no-merge option to retain the existing
grouping behavior, and a --metric-no-group option that avoids groups
for all metrics. The latter may be useful if the NMI watchdog can't
be disabled.
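
For example (illustrative invocations; the option names are the ones
added in v3):

  perf stat -M TopDownL1 --metric-no-merge -a sleep 1
  perf stat -M TopDownL1 --metric-no-group -a sleep 1

The first keeps each metric's events in their own group instead of
sharing them across metrics; the second creates no groups at all, which
helps when the NMI watchdog is occupying a counter and groups fail to
schedule.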

Thanks for the input!
Ian

> -Andi


Re: [RFC PATCH 0/7] Share events between metrics

2020-05-07 Thread Andi Kleen
> > - without this change events within a metric may get scheduled
> >   together; after it they may appear as part of a larger group and be
> >   multiplexed at different times, lowering accuracy - however, less
> >   multiplexing may compensate for this.
> 
> I agree the heuristic in this patch set is naive and would welcome help
> improving it based on your toplev experience. I think this change is
> progress on TopDownL1 - would you agree?

TopdownL1 in non-SMT mode should always fit. Inside a group,
deduping always makes sense.

The problem is SMT mode where it doesn't fit. toplev tries
to group each node and each level together.
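
For reference, the level-1 formulas behind these events are roughly (per
the Top-Down method; non-SMT case, where SLOTS = 4 * cycles):

  Frontend_Bound  = IDQ_UOPS_NOT_DELIVERED.CORE / SLOTS
  Bad_Speculation = (UOPS_ISSUED.ANY - UOPS_RETIRED.RETIRE_SLOTS
                     + 4 * INT_MISC.RECOVERY_CYCLES) / SLOTS
  Retiring        = UOPS_RETIRED.RETIRE_SLOTS / SLOTS
  Backend_Bound   = 1 - Frontend_Bound - Bad_Speculation - Retiring

With cycles on a fixed counter, the four remaining events fit the four
generic counters, so the non-SMT group always schedules. The SMT variant
derives slots from core-wide clocks and needs additional events, which is
what pushes it past the counter budget. (The SLOTS values in the outputs
further down check out as 4 * cycles, e.g. 4 * 24,868,717,684 =
99,474,870,736.)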

> 
> I'm wondering if what is needed are flags to control behavior. For
> example, avoiding the use of groups altogether. For TopDownL1 I see.

Yes the current situation isn't great.

For Topdown your patch clearly is an improvement; I'm not sure
it is one for everything, though.

Probably the advanced heuristics are only useful for a few
formulas; most are very simple. So maybe it's ok. I guess it
would need some testing over the existing formulas.


-Andi


Re: [RFC PATCH 0/7] Share events between metrics

2020-05-07 Thread Ian Rogers
On Thu, May 7, 2020 at 10:48 AM Andi Kleen  wrote:
>
> On Thu, May 07, 2020 at 01:14:29AM -0700, Ian Rogers wrote:
> > Metric groups contain metrics. Metrics create groups of events to
> > ideally be scheduled together. Often metrics refer to the same events,
> > for example, a cache hit and cache miss rate. Using separate event
> > groups means these metrics are multiplexed at different times and the
> > counts don't sum to 100%. More multiplexing also decreases the
> > accuracy of the measurement.
> >
> > This change orders metrics from groups or the command line, so that
> > the ones with the most events are set up first. Later metrics see if
> > groups already provide their events, and reuse them if
> > possible. Unnecessary events and groups are eliminated.
>
> Note this actually may make multiplexing errors worse.
>
> For metrics it is often important that all the input values to
> the metric run at the same time.
>
> So e.g. if you have two metrics and they each fit into a group,
> but not together, keeping them separate gives more accurate
> results for each metric, even though there is more multiplexing.
>
> I think your change can make sense for metrics that don't fit
> into a single group anyway. perf currently doesn't quite know
> this but some heuristic could be added.
>
> But I wouldn't do it for simple metrics that fit into groups.
> The result may well be worse.
>
> My toplev tool has some heuristics for this, and some more
> sophisticated ones that track subexpressions. Those would
> likely be far too complicated for perf.
>
> -Andi

Thanks Andi!

I was trying to be mindful of the multiplexing issue in the description:

> - without this change events within a metric may get scheduled
>   together, after they may appear as part of a larger group and be
>   multiplexed at different times, lowering accuracy - however, less
>   multiplexing may compensate for this.

I agree the heuristic in this patch set is naive and would welcome help
improving it based on your toplev experience. I think this change is
progress on TopDownL1 - would you agree?

I'm wondering if what is needed are flags to control behavior. For
example, avoiding the use of groups altogether. For TopDownL1 I see:

Currently:
27,294,614,172  idq_uops_not_delivered.core  #  0.3 Frontend_Bound   (49.96%)
24,498,363,923  cycles                                               (49.96%)
21,449,143,854  uops_issued.any              #  0.1 Bad_Speculation  (16.68%)
16,450,676,961  uops_retired.retire_slots                            (16.68%)
   880,423,103  int_misc.recovery_cycles                             (16.68%)
24,180,775,568  cycles                                               (16.68%)
27,662,201,567  idq_uops_not_delivered.core  #  0.5 Backend_Bound    (16.67%)
25,354,331,290  cycles                                               (16.67%)
22,642,218,398  uops_issued.any                                      (16.67%)
17,564,211,383  uops_retired.retire_slots                            (16.67%)
   896,938,527  int_misc.recovery_cycles                             (16.67%)
17,872,454,517  uops_retired.retire_slots    #  0.2 Retiring         (16.68%)
25,122,100,836  cycles                                               (16.68%)
15,101,167,608  inst_retired.any             #  0.6 IPC              (33.34%)
24,977,816,793  cpu_clk_unhalted.thread                              (33.34%)
24,868,717,684  cycles                       # 99474870736.0 SLOTS   (49.98%)

With proposed (RFC) sharing of events over metrics:
22,780,823,620  cycles                       # 91123294480.0 SLOTS
                                             #  0.2 Retiring
                                             #  0.3 Frontend_Bound
                                             #  0.1 Bad_Speculation
                                             #  0.4 Backend_Bound    (50.01%)
26,097,362,439  idq_uops_not_delivered.core                          (50.01%)
   790,521,504  int_misc.recovery_cycles                             (50.01%)
21,025,308,329  uops_issued.any                                      (50.01%)
17,041,506,149  uops_retired.retire_slots                            (50.01%)
22,964,891,526  cpu_clk_unhalted.thread      #  0.6 IPC              (49.99%)
14,531,724,741  inst_retired.any                                     (49.99%)

No groups:
 1,517,455,258  cycles                       # 6069821032.0 SLOTS
                                             #  0.1 Retiring
                                             #  0.3 Frontend_Bound
                                             #  0.1 Bad_Speculation
                                             #  0.5 Backend_Bound    (85.64%)
 1,943,047,724  idq_uops_not_delivered.core                          (85.61%)
    54,257,713  int_misc.recovery_cycles                             (85.63%)
 1,050,787,137
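
For concreteness, a minimal sketch of the sharing heuristic described in
the cover letter (illustrative only, not the actual metricgroup.c code):
metrics are sorted so the largest event sets are set up first, and each
later metric reuses an earlier group that already provides all of its
events.

#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct metric {
        const char **events;          /* names of the events it needs */
        size_t nr_events;
        const struct metric *shares;  /* earlier group we reuse, or NULL */
};

/* qsort comparator: most events first */
static int cmp_nr_events_desc(const void *a, const void *b)
{
        const struct metric *ma = a, *mb = b;

        if (ma->nr_events == mb->nr_events)
                return 0;
        return ma->nr_events < mb->nr_events ? 1 : -1;
}

/* Does group g already schedule every event metric m needs? */
static bool provides(const struct metric *g, const struct metric *m)
{
        for (size_t i = 0; i < m->nr_events; i++) {
                bool found = false;

                for (size_t j = 0; j < g->nr_events; j++)
                        found |= !strcmp(m->events[i], g->events[j]);
                if (!found)
                        return false;
        }
        return true;
}

static void share_events(struct metric *metrics, size_t n)
{
        qsort(metrics, n, sizeof(*metrics), cmp_nr_events_desc);
        for (size_t i = 1; i < n; i++) {
                metrics[i].shares = NULL;
                for (size_t j = 0; j < i; j++) {
                        if (provides(&metrics[j], &metrics[i])) {
                                /* reuse the earlier, larger group; this
                                 * metric adds no events of its own */
                                metrics[i].shares = &metrics[j];
                                break;
                        }
                }
        }
}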

Re: [RFC PATCH 0/7] Share events between metrics

2020-05-07 Thread Andi Kleen
On Thu, May 07, 2020 at 01:14:29AM -0700, Ian Rogers wrote:
> Metric groups contain metrics. Metrics create groups of events to
> ideally be scheduled together. Often metrics refer to the same events,
> for example, a cache hit and cache miss rate. Using separate event
> groups means these metrics are multiplexed at different times and the
> counts don't sum to 100%. More multiplexing also decreases the
> accuracy of the measurement.
> 
> This change orders metrics from groups or the command line, so that
> the ones with the most events are set up first. Later metrics see if
> groups already provide their events, and reuse them if
> possible. Unnecessary events and groups are eliminated.

Note this actually may make multiplexing errors worse.

For metrics it is often important that all the input values to
the metric run at the same time. 

So e.g. if you have two metrics and they each fit into a group,
but not together, keeping them separate gives more accurate
results for each metric, even though there is more multiplexing.

I think your change can make sense for metrics that don't fit
into a single group anyway. perf currently doesn't quite know
this but some heuristic could be added.
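
Something along these lines, presumably (hypothetical sketch; the real
counter budget varies by CPU, and fixed-counter events don't consume
generic counters):

#include <stdbool.h>
#include <stddef.h>

/* Only share a metric's events with other metrics when the metric
 * couldn't be scheduled as a single group anyway. */
static bool fits_in_one_group(size_t nr_events, size_t nr_gp_counters)
{
        return nr_events <= nr_gp_counters;
}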

But I wouldn't do it for simple metrics that fit into groups.
The result may well be worse.

My toplev tool has some heuristics for this, and some more
sophisticated ones that track subexpressions. Those would
likely be far too complicated for perf.

-Andi


Re: [RFC PATCH 0/7] Share events between metrics

2020-05-07 Thread Ian Rogers
On Thu, May 7, 2020 at 6:49 AM Jiri Olsa  wrote:
>
> On Thu, May 07, 2020 at 01:14:29AM -0700, Ian Rogers wrote:
> > Metric groups contain metrics. Metrics create groups of events to
> > ideally be scheduled together. Often metrics refer to the same events,
> > for example, a cache hit and cache miss rate. Using separate event
> > groups means these metrics are multiplexed at different times and the
> > counts don't sum to 100%. More multiplexing also decreases the
> > accuracy of the measurement.
> >
> > This change orders metrics from groups or the command line, so that
> > the ones with the most events are set up first. Later metrics see if
> > groups already provide their events, and reuse them if
> > possible. Unnecessary events and groups are eliminated.
> >
> > RFC because:
> >  - without this change events within a metric may get scheduled
> >    together; after it they may appear as part of a larger group and be
> >    multiplexed at different times, lowering accuracy - however, less
> >    multiplexing may compensate for this.
> >  - libbpf's hashmap is used; however, libbpf is an optional
> >    requirement for building perf.
> >  - other things I'm not thinking of.
>
> hi,
> I can't apply this, what branch/commit is this based on?
>
> Applying: perf expr: migrate expr ids table to libbpf's hashmap
> error: patch failed: tools/perf/tests/pmu-events.c:428
> error: tools/perf/tests/pmu-events.c: patch does not apply
> error: patch failed: tools/perf/util/expr.h:2
> error: tools/perf/util/expr.h: patch does not apply
> error: patch failed: tools/perf/util/expr.y:73
> error: tools/perf/util/expr.y: patch does not apply
> Patch failed at 0001 perf expr: migrate expr ids table to libbpf's hashmap
>
> thanks,
> jirka

Thanks for trying! I have resent the entire patch series here:
https://lore.kernel.org/lkml/20200507140819.126960-1-irog...@google.com/
It is acme's perf/core tree + metric fix/test CLs + some minor fixes.
Details in the cover letter.

Thanks,
Ian

> >
> > Thanks!
> >
> > Ian Rogers (7):
> >   perf expr: migrate expr ids table to libbpf's hashmap
> >   perf metricgroup: change evlist_used to a bitmap
> >   perf metricgroup: free metric_events on error
> >   perf metricgroup: always place duration_time last
> >   perf metricgroup: delay events string creation
> >   perf metricgroup: order event groups by size
> >   perf metricgroup: remove duped metric group events
> >
> >  tools/perf/tests/expr.c   |  32 ++---
> >  tools/perf/tests/pmu-events.c |  22 ++--
> >  tools/perf/util/expr.c| 125 ++
> >  tools/perf/util/expr.h|  22 ++--
> >  tools/perf/util/expr.y|  22 +---
> >  tools/perf/util/metricgroup.c | 242 +-
> >  tools/perf/util/stat-shadow.c |  46 ---
> >  7 files changed, 280 insertions(+), 231 deletions(-)
> >
> > --
> > 2.26.2.526.g744177e7f7-goog
> >
>


Re: [RFC PATCH 0/7] Share events between metrics

2020-05-07 Thread Jiri Olsa
On Thu, May 07, 2020 at 01:14:29AM -0700, Ian Rogers wrote:
> Metric groups contain metrics. Metrics create groups of events to
> ideally be scheduled together. Often metrics refer to the same events,
> for example, a cache hit and cache miss rate. Using separate event
> groups means these metrics are multiplexed at different times and the
> counts don't sum to 100%. More multiplexing also decreases the
> accuracy of the measurement.
> 
> This change orders metrics from groups or the command line, so that
> the ones with the most events are set up first. Later metrics see if
> groups already provide their events, and reuse them if
> possible. Unnecessary events and groups are eliminated.
> 
> RFC because:
>  - without this change events within a metric may get scheduled
>    together; after it they may appear as part of a larger group and be
>    multiplexed at different times, lowering accuracy - however, less
>    multiplexing may compensate for this.
>  - libbpf's hashmap is used; however, libbpf is an optional
>    requirement for building perf.
>  - other things I'm not thinking of.

hi,
I can't apply this, what branch/commit is this based on?

Applying: perf expr: migrate expr ids table to libbpf's hashmap
error: patch failed: tools/perf/tests/pmu-events.c:428
error: tools/perf/tests/pmu-events.c: patch does not apply
error: patch failed: tools/perf/util/expr.h:2
error: tools/perf/util/expr.h: patch does not apply
error: patch failed: tools/perf/util/expr.y:73
error: tools/perf/util/expr.y: patch does not apply
Patch failed at 0001 perf expr: migrate expr ids table to libbpf's hashmap

thanks,
jirka

> 
> Thanks!
> 
> Ian Rogers (7):
>   perf expr: migrate expr ids table to libbpf's hashmap
>   perf metricgroup: change evlist_used to a bitmap
>   perf metricgroup: free metric_events on error
>   perf metricgroup: always place duration_time last
>   perf metricgroup: delay events string creation
>   perf metricgroup: order event groups by size
>   perf metricgroup: remove duped metric group events
> 
>  tools/perf/tests/expr.c   |  32 ++---
>  tools/perf/tests/pmu-events.c |  22 ++--
>  tools/perf/util/expr.c| 125 ++
>  tools/perf/util/expr.h|  22 ++--
>  tools/perf/util/expr.y|  22 +---
>  tools/perf/util/metricgroup.c | 242 +-
>  tools/perf/util/stat-shadow.c |  46 ---
>  7 files changed, 280 insertions(+), 231 deletions(-)
> 
> -- 
> 2.26.2.526.g744177e7f7-goog
>