On Mon, Jul 24, 2017 at 04:40:04PM -0700, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
> 
> Setting up groups can be complicated due to the
> complex scheduling restrictions of different PMUs.
> User tools usually don't understand all these restrictions.
> Still, in many cases it is useful to set up groups and
> they work most of the time. However, if the group
> is set up wrong, some members will not report any values
> because they never get scheduled.
> 
> Add a concept of a 'weak group': try to set up a group,
> but if it's not schedulable, fall back to not using
> a group. That gives us the best of both worlds:
> groups if they work, but still a usable fallback if they don't.
> 
> In theory it would be possible to have more complex fallback
> strategies (e.g. try to split the group in half), but
> the simple fallback of not using a group seems to work for now.
> 
> So far the weak group is only implemented for perf stat,
> not for record.
> 
> Here's an unschedulable group (on IvyBridge with SMT on)
> 
> % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
> 
>         73,806,067      branches
>          4,848,144      branch-misses             #    6.57% of all branches
>         14,754,458      l1d.replacement
>         24,905,558      l2_lines_in.all
>    <not supported>      l2_rqsts.all_code_rd         <------- will never report anything
> 
> With the weak group:
> 
> % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1
> 
>        125,366,055      branches                                    (80.02%)
>          9,208,402      branch-misses             #    7.35% of all branches   (80.01%)
>         24,560,249      l1d.replacement                             (80.00%)
>         43,174,971      l2_lines_in.all                             (80.05%)
>         31,891,457      l2_rqsts.all_code_rd                        (79.92%)

looks handy, few comments below

thanks,
jirka

> 
> The extra event is now scheduled, at the cost of some extra multiplexing.
> 
> Signed-off-by: Andi Kleen <[email protected]>
> ---

SNIP

> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 97d6b6c42014..551ed938e05c 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -564,7 +564,7 @@ static int __run_perf_stat(int argc, const char **argv)
>       int interval = stat_config.interval;
>       char msg[BUFSIZ];
>       unsigned long long t0, t1;
> -     struct perf_evsel *counter;
> +     struct perf_evsel *counter, *c2, *leader;
>       struct timespec ts;
>       size_t l;
>       int status = 0;
> @@ -595,6 +595,32 @@ static int __run_perf_stat(int argc, const char **argv)
>       evlist__for_each_entry(evsel_list, counter) {
>  try_again:
>               if (create_perf_stat_counter(counter) < 0) {
> +                     /* Weak group failed. Reset the group. */
> +                     if (errno == EINVAL &&
> +                         counter->leader != counter &&
> +                         counter->weak_group) {

could you please put this de-grouping code into a function?

> +                             bool is_open = true;
> +
> +                             pr_debug("Weak group for %s/%d failed\n",
> +                                             counter->leader->name, counter->nr_members);
> +                             leader = counter->leader;
> +                             evlist__for_each_entry(evsel_list, c2) {

we have for_each_group_member

> +                                     if (c2 == counter)
> +                                             is_open = false;
> +                                     if (c2->leader == leader) {
> +                                             if (is_open)
> +                                                     perf_evsel__close(c2,
> +                                                             c2->cpus ? c2->cpus->nr :
> +                                                             cpu_map__nr(evsel_list->cpus),
> +                                                             thread_map__nr(evsel_list->threads));
> +                                             c2->leader = c2;
> +                                             c2->nr_members = 0;
> +                                     }
> +                             }
> +                             counter = leader;
> +                             goto try_again;
> +                     }
> +
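
for the two comments above, something like this completely untested
sketch is what I have in mind (the reset_weak_group name and where it
lives are just placeholders). Note the leader still needs explicit
handling, because for_each_group_member does not include it:

static struct perf_evsel *reset_weak_group(struct perf_evsel *counter)
{
        struct perf_evsel *leader = counter->leader;
        struct perf_evsel *c2;
        bool is_open = true;
        int nthreads = thread_map__nr(evsel_list->threads);

        pr_debug("Weak group for %s/%d failed\n",
                 leader->name, leader->nr_members);

        /* the leader was opened before the failing member, close it too */
        perf_evsel__close(leader,
                          leader->cpus ? leader->cpus->nr :
                          cpu_map__nr(evsel_list->cpus),
                          nthreads);

        /* walk the members; for_each_group_member skips the leader */
        for_each_group_member(c2, leader) {
                if (c2 == counter)
                        is_open = false;
                /* only close members opened before the failing one */
                if (is_open)
                        perf_evsel__close(c2,
                                          c2->cpus ? c2->cpus->nr :
                                          cpu_map__nr(evsel_list->cpus),
                                          nthreads);
                /* make every member a group of one again */
                c2->leader = c2;
                c2->nr_members = 0;
        }

        leader->nr_members = 0;
        return leader;
}

the caller in __run_perf_stat would then shrink to:

        if (errno == EINVAL && counter->leader != counter &&
            counter->weak_group) {
                counter = reset_weak_group(counter);
                goto try_again;
        }
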
>                       /*
>                        * PPC returns ENXIO for HW counters until 2.6.37
>                        * (behavior changed with commit b0a873e).

SNIP
