On Wed, Sep 9, 2020 at 10:52 AM Ian Rogers <irog...@google.com> wrote:
>
> On Wed, Sep 9, 2020 at 2:53 AM Namhyung Kim <namhy...@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Wed, Sep 9, 2020 at 5:02 PM Ian Rogers <irog...@google.com> wrote:
> > >
> > > A metric like DRAM_BW_Use has on SkylakeX events
> > > uncore_imc/cas_count_read/
> > > and uncore_imc/cas_count_write/. These events open 6 events per socket
> > > with pmu names of uncore_imc_[0-5]. The current metric setup code in
> > > find_evsel_group assumes one ID will map to 1 event to be recorded in
> > > metric_events. For events with multiple matches, the first event is
> > > recorded in metric_events (avoiding matching >1 event with the same
> > > name) and the evlist_used updated so that duplicate events aren't
> > > removed when the evlist has unused events removed.
> > >
> > > Before this change:
> > > $ /tmp/perf/perf stat -M DRAM_BW_Use -a -- sleep 1
> > >
> > >  Performance counter stats for 'system wide':
> > >
> > >              41.14 MiB  uncore_imc/cas_count_read/
> > >      1,002,614,251 ns   duration_time
> > >
> > >        1.002614251 seconds time elapsed
> > >
> > > After this change:
> > > $ /tmp/perf/perf stat -M DRAM_BW_Use -a -- sleep 1
> > >
> > >  Performance counter stats for 'system wide':
> > >
> > >             157.47 MiB  uncore_imc/cas_count_read/ #     0.00 DRAM_BW_Use
> >
> > Hmm.. I guess the 0.00 result is incorrect, no?
>
> Agreed. There are a number of pre-existing bugs in this code. I'll try
> to look into this one.
There was a bug where the metric_leader wasn't being set correctly and
so the accumulation of values wasn't working as expected. There's a
small fix in:
https://lore.kernel.org/lkml/20200910032632.511566-3-irog...@google.com/T/#u
that also does the change I mentioned below.

The metric still remains at 0.0 following this change as I believe
there is a bug in the metric itself. The metric expression is:

"( 64 * ( uncore_imc@cas_count_read@ + uncore_imc@cas_count_write@ ) /
1000000000 ) / duration_time"

i.e. the counts of uncore_imc/cas_count_read/ and
uncore_imc/cas_count_write/ are first scaled up by 64 to turn a count
of cache lines into bytes, and the result is then divided by
1000000000 (a GB) to supposedly give GB/s. However, the counts are
read and already scaled:

...
void perf_stat__update_shadow_stats(struct evsel *counter, u64 count,
...
        count *= counter->scale;
...

The scale value from
/sys/devices/uncore_imc_0/events/cas_count_read.scale is
6.103515625e-5, or 64/(1024*1024), so the count has already been
converted to MiB. This means the result of the expression is 64 times
too large but in PB/s, and so too small to display. (A rough worked
example of the arithmetic is at the bottom of this mail.) I don't rule
out there being other issues but this appears to be a significant one.
Printing using intervals also looks suspicious as the count appears to
just increase rather than vary up and down. Jin Yao, I don't know if
you can take a look?

Thanks,
Ian

> > >             126.97 MiB  uncore_imc/cas_count_write/
> > >      1,003,019,728 ns   duration_time
> > >
> > > Erroneous duplication introduced in:
> > > commit 2440689d62e9 ("perf metricgroup: Remove duped metric group
> > > events").
> > >
> > > Fixes: ded80bda8bc9 ("perf expr: Migrate expr ids table to a hashmap").
> > > Reported-by: Jin, Yao <yao....@linux.intel.com>
> > > Signed-off-by: Ian Rogers <irog...@google.com>
> > > ---
> > [SNIP]
> > > @@ -248,6 +260,16 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
> > >                 ev = metric_events[i];
> > >                 ev->metric_leader = ev;
> > >                 set_bit(ev->idx, evlist_used);
> > > +               /*
> > > +                * Mark two events with identical names in the same group as
> > > +                * being in use as uncore events may be duplicated for each pmu.
> > > +                */
> > > +               evlist__for_each_entry(perf_evlist, ev) {
> > > +                       if (metric_events[i]->leader == ev->leader &&
> > > +                           !strcmp(metric_events[i]->name, ev->name)) {
> > > +                               set_bit(ev->idx, evlist_used);
> >
> > I'm not sure whether they are grouped together.
> > But if so, you can use for_each_group_member(ev, leader).
>
> Good suggestion, unfortunately the groups may be removed for things
> like NMI watchdog or --metric-no-group and so there wouldn't be a
> leader to follow. We could do something like this:
>
> if (metric_events[i]->leader) {
>         for_each_group_member(ev, leader) {
>                 if (!strcmp(metric_events[i]->name, ev->name))
>                         set_bit(ev->idx, evlist_used);
>         }
> } else {
>         evlist__for_each_entry(perf_evlist, ev) {
>                 if (!ev->leader && !strcmp(metric_events[i]->name, ev->name))
>                         set_bit(ev->idx, evlist_used);
>         }
> }
> }
>
> What do you think?
>
> Thanks,
> Ian
>
> > Thanks
> > Namhyung
> >
> >
> > > +                       }
> > > +               }
> > >         }
> > >
> > >         return metric_events[0];
> > > --
> > > 2.28.0.526.ge36021eeef-goog
> > >
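For what it's worth, here is a rough back-of-the-envelope check of the
double scaling described above, written as standalone C rather than
perf code. The raw CAS count and the ~1 second duration are made-up
numbers, picked only so that the scaled value roughly matches the
157.47 MiB reading in the example output:

#include <stdio.h>

int main(void)
{
        /* Hypothetical raw CAS count, chosen so count * 64 bytes is
         * roughly the 157.47 MiB printed in the example above. */
        double raw_count = 2580000;
        /* sysfs scale for cas_count_read: 64/(1024*1024), MiB per CAS. */
        double scale = 64.0 / (1024 * 1024);
        /* Treat the 'sleep 1' run as ~1 second for simplicity. */
        double duration_s = 1.0;

        /* What the expression appears to intend: raw cache-line count
         * times 64 bytes, divided by 1e9 to give GB, per second. */
        double intended = 64.0 * raw_count / 1e9 / duration_s;

        /* What actually happens: perf has already multiplied the count
         * by the sysfs scale, and the expression then applies the factor
         * of 64 and the 1e9 divisor on top of that already-scaled (MiB)
         * value. */
        double scaled_count = raw_count * scale; /* ~157.47 MiB */
        double actual = 64.0 * scaled_count / 1e9 / duration_s;

        printf("intended ~ %.3f GB/s\n", intended); /* ~0.165 */
        printf("actual   ~ %.8f\n", actual);        /* ~0.00001, shown as 0.00 */
        return 0;
}

The tiny "actual" value here is consistent with the metric still
printing as 0.00 after the metric_leader fix.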