On 22/11/17 01:33, Kim Phillips wrote:
> 'perf record' and 'perf report --dump-raw-trace' supported in this
> release.
> 
> Example usage:
> 
> $ ./perf record -e arm_spe_0/ts_enable=1,pa_enable=1/ \
>               dd if=/dev/zero of=/dev/null count=10000
> 
> perf report --dump-raw-trace
> 
> Note that the perf.data file is portable, so the report can be run on
> another architecture host if necessary.
> 
> Output will contain raw SPE data and its textual representation, such
> as:
> 
> 0x550 [0x30]: PERF_RECORD_AUXTRACE size: 0xc408  offset: 0  ref: 0x30005619  idx: 3  tid: 2109  cpu: 3
> .
> . ... ARM SPE data: size 50184 bytes
> .  00000000:  49 00                                           LD
> .  00000002:  b2 00 9c 7b 7a 00 80 ff ff                      VA 0xffff80007a7b9c00
> .  0000000b:  9a 00 00                                        LAT 0 XLAT
> .  0000000e:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
> .  00000010:  b0 b0 c9 15 08 00 00 ff ff                      PC 0xff00000815c9b0 el3 ns=1
> .  00000019:  98 00 00                                        LAT 0 TOT
> .  0000001c:  71 00 20 fa fd 16 00 00 00                      TS 98750308352
> .  00000025:  49 01                                           ST
> .  00000027:  b2 60 bc 0c 0f 00 00 ff ff                      VA 0xffff00000f0cbc60
> .  00000030:  9a 00 00                                        LAT 0 XLAT
> .  00000033:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
> .  00000035:  b0 48 cc 15 08 00 00 ff ff                      PC 0xff00000815cc48 el3 ns=1
> .  0000003e:  98 00 00                                        LAT 0 TOT
> .  00000041:  71 00 20 fa fd 16 00 00 00                      TS 98750308352
> .  0000004a:  48 00                                           INSN-OTHER
> .  0000004c:  42 02                                           EV RETIRED
> .  0000004e:  b0 ac 47 0c 08 00 00 ff ff                      PC 0xff0000080c47ac el3 ns=1
> .  00000057:  98 00 00                                        LAT 0 TOT
> .  0000005a:  71 00 20 fa fd 16 00 00 00                      TS 98750308352
> .  00000063:  49 00                                           LD
> .  00000065:  b2 18 48 e5 7a 00 80 ff ff                      VA 0xffff80007ae54818
> .  0000006e:  9a 00 00                                        LAT 0 XLAT
> .  00000071:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
> .  00000073:  b0 08 f8 15 08 00 00 ff ff                      PC 0xff00000815f808 el3 ns=1
> .  0000007c:  98 00 00                                        LAT 0 TOT
> .  0000007f:  71 00 20 fa fd 16 00 00 00                      TS 98750308352
> ...
> 
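
As a cross-check of the dump format, here is a minimal sketch of how a header
byte and its little-endian payload turn into the decoded value. It follows the
"bits 5:4 encode the payload size" rule from the decoder further down, but the
helper names (spe_payload_len, spe_read_payload) are made up for illustration
and are not functions from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Payload size in bytes, encoded in bits 5:4 of the header byte. */
static size_t spe_payload_len(uint8_t header)
{
	return (size_t)1 << ((header & 0x30) >> 4);
}

/*
 * Assemble a little-endian payload one byte at a time, which avoids
 * unaligned loads and host-endianness concerns.
 * Returns the packet length (header + payload), or -1 if the buffer is
 * too short. Hypothetical helper, not part of the patch.
 */
static int spe_read_payload(const uint8_t *buf, size_t len, uint64_t *payload)
{
	size_t i, n;

	if (!len)
		return -1;

	n = spe_payload_len(buf[0]);
	if (len < 1 + n)
		return -1;

	*payload = 0;
	for (i = 0; i < n; i++)
		*payload |= (uint64_t)buf[1 + i] << (8 * i);

	return (int)(1 + n);
}
```

Fed the TS bytes from the dump above (71 00 20 fa fd 16 00 00 00), this gives a
9-byte packet with payload 98750308352, matching the decoded line.
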
> Other release notes:
> 
> - applies to acme's perf/{core,urgent} branches, likely elsewhere
> 
> - Report is self-contained within the tool.  Record requires enabling
>   the kernel SPE driver by setting CONFIG_ARM_SPE_PMU.
> 
> - the intel-bts implementation was used as a starting point; its
>   min/default/max buffer sizes and power of 2 pages granularity need to be
>   revisited for ARM SPE
> 
> - recording across multiple SPE clusters/domains not supported
> 
> - snapshot support (record -S), and conversion to native perf events
>   (e.g., via 'perf inject --itrace'), are also not supported
> 
> - technically both cs-etm and spe can be used simultaneously, however
>   disabled for simplicity in this release
> 
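
For reference while the buffer sizing is revisited, the constraint inherited
from intel-bts can be sketched as below. This only restates the check in
arm_spe_recording_options() (at least 8KiB and a power of 2); the names
spe_aux_size_ok and SPE_MIN_AUX_SIZE are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SPE_MIN_AUX_SIZE (8 * 1024)	/* 8KiB floor, as in the patch's check */

/* True if n is a non-zero power of two. */
static bool is_pow2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* True if pages * page_size is an acceptable AUX buffer size. */
static bool spe_aux_size_ok(size_t pages, size_t page_size)
{
	size_t sz = pages * page_size;

	return sz >= SPE_MIN_AUX_SIZE && is_pow2(sz);
}
```

With 4KiB pages, the privileged default of 4MiB (1024 pages) passes, a single
page fails the minimum, and 3 pages fail the power-of-2 test.
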
> Signed-off-by: Kim Phillips <kim.phill...@arm.com>

For what is there now, it looks fine from the auxtrace point of view.  There
are a couple of minor points below but nevertheless:

Acked-by: Adrian Hunter <adrian.hun...@intel.com>

> ---
> v4: rebased onto acme's perf/core, whitespace fixes.
> 
> v3: trying to address comments from v2:
> 
> - despite adding a find_all_arm_spe_pmus() function to scan for all
>   arm_spe_<n> device instances, in order to ensure auxtrace_record__init
>   successfully matches the evsel type with the correct arm_spe_pmu type,
>   I am still having trouble running in multi-SPE PPI (heterogeneous)
>   environments (mmap fails with EOPNOTSUPP, as does running with
>   --per-thread on homogeneous systems).
> 
> - arm_spe_reference: use gettime instead of direct cntvct register access
> 
> - spe-decoder: add a comment for why SPE_EVENTS code sets packet->index.
> 
> - added arm_spe_pmu_default_config that accesses the driver
>   caps/min_interval and sets the default sampling period to it.  This way
>   users don't have to specify -c explicitly.  Also set is_uncore to false.
> 
> - set more sampling bits in the arm_spe and its tracking evsel.  Still
>   unsure if too liberal, and not sure whether it needs another context
>   switch tracking evsel.  Comments welcome!
> 
> - https://www.spinics.net/lists/arm-kernel/msg614361.html
> 
> v2: mostly addressing Mark Rutland's comments as much as possible without his
> feedback to my feedback:
> 
> - decoder refactored with a get_payload, not extended to with-ext_len ones like
>   get_addr, named the constants
> 
> - 0x-ified %x output formats, but decided to not sign extend the addresses in
>   the raw dump, rather do so if necessary in the synthesis stage:
>   SPE implementations differ in this area, and raw dump should reflect that.
> 
> - CPU mask / new record behaviour bisected to commit e3ba76deef23064 "perf
>   tools: Force uncore events to system wide monitoring".  Waiting to hear back
>   on why driver can't do system wide monitoring, even across PPIs, by e.g.,
>   sharing the SPE interrupts in one handler (SPE's don't differ in this record
>   regard).
> 
> - addressed off-list comment from M. Williams:
>   "Instruction Type" packet was renamed as "Operation Type".
>    so in the spe packet decoder: INSN_TYPE -> OP_TYPE
> 
> - do_get_packet fixed to handle excessive, successive PADding from a new source
>   of raw SPE data, so instead of:
> 
>       .  000011ae:  00                                              PAD
>       .  000011af:  00                                              PAD
>       .  000011b0:  00                                              PAD
>       .  000011b1:  00                                              PAD
>       .  000011b2:  00                                              PAD
>       .  000011b3:  00                                              PAD
>       .  000011b4:  00                                              PAD
>       .  000011b5:  00                                              PAD
>       .  000011b6:  00                                              PAD
> 
>   we now get:
> 
>       .  000011ae:  00 00 00 00 00 00 00 00 00                      PAD
> 
> - fixed 52 00 00 decoded with an empty events clause, adding 'EV' for all events
>   clauses now.  parser writers can detect for empty event clauses by finding
>   nothing after it.
> 
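
The PAD coalescing described in the v2 notes can be sketched as follows. This
is an illustration only: spe_pad_run and SPE_DUMP_BYTES_PER_LINE are
hypothetical names, and the cap of 16 is taken from the "16 bytes per line"
comment in arm_spe_get_packet().

```c
#include <stddef.h>

#define SPE_DUMP_BYTES_PER_LINE 16	/* fixed-width dump format */

/*
 * Length of the run of PAD (0x00) bytes starting at buf[0], capped at one
 * dump line. Returns 0 if the buffer does not start with a PAD byte.
 */
static size_t spe_pad_run(const unsigned char *buf, size_t len)
{
	size_t n = 0;

	while (n < len && n < SPE_DUMP_BYTES_PER_LINE && buf[n] == 0x00)
		n++;

	return n;
}
```

Nine PAD bytes followed by a non-zero header are reported as one 9-byte packet,
as in the example above; longer runs are split into chunks of at most 16.
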
>  tools/perf/arch/arm/util/auxtrace.c   |  75 +++++-
>  tools/perf/arch/arm/util/pmu.c        |   5 +-
>  tools/perf/arch/arm64/util/Build      |   3 +-
>  tools/perf/arch/arm64/util/arm-spe.c  | 235 +++++++++++++++++
>  tools/perf/util/Build                 |   2 +
>  tools/perf/util/arm-spe-pkt-decoder.c | 471 ++++++++++++++++++++++++++++++++++
>  tools/perf/util/arm-spe-pkt-decoder.h |  52 ++++
>  tools/perf/util/arm-spe.c             | 318 +++++++++++++++++++++++
>  tools/perf/util/arm-spe.h             |  42 +++
>  tools/perf/util/auxtrace.c            |   3 +
>  tools/perf/util/auxtrace.h            |   1 +
>  11 files changed, 1199 insertions(+), 8 deletions(-)
>  create mode 100644 tools/perf/arch/arm64/util/arm-spe.c
>  create mode 100644 tools/perf/util/arm-spe-pkt-decoder.c
>  create mode 100644 tools/perf/util/arm-spe-pkt-decoder.h
>  create mode 100644 tools/perf/util/arm-spe.c
>  create mode 100644 tools/perf/util/arm-spe.h
> 
> diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
> index 8edf2cb71564..8e7c1ad18224 100644
> --- a/tools/perf/arch/arm/util/auxtrace.c
> +++ b/tools/perf/arch/arm/util/auxtrace.c
> @@ -22,6 +22,42 @@
>  #include "../../util/evlist.h"
>  #include "../../util/pmu.h"
>  #include "cs-etm.h"
> +#include "arm-spe.h"
> +
> +static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
> +{
> +     struct perf_pmu **arm_spe_pmus = NULL;
> +     int ret, i, nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
> +     /* arm_spe_xxxxxxxxx\0 */
> +     char arm_spe_pmu_name[sizeof(ARM_SPE_PMU_NAME) + 10];
> +
> +     arm_spe_pmus = zalloc(sizeof(struct perf_pmu *) * nr_cpus);
> +     if (!arm_spe_pmus) {
> +             pr_err("spes alloc failed\n");
> +             *err = -ENOMEM;
> +             return NULL;
> +     }
> +
> +     for (i = 0; i < nr_cpus; i++) {
> +             ret = sprintf(arm_spe_pmu_name, "%s%d", ARM_SPE_PMU_NAME, i);
> +             if (ret < 0) {
> +                     pr_err("sprintf failed\n");
> +                     *err = -ENOMEM;
> +                     return NULL;
> +             }
> +
> +             arm_spe_pmus[*nr_spes] = perf_pmu__find(arm_spe_pmu_name);
> +             if (arm_spe_pmus[*nr_spes]) {
> +                     pr_debug2("%s %d: arm_spe_pmu %d type %d name %s\n",
> +                              __func__, __LINE__, *nr_spes,
> +                              arm_spe_pmus[*nr_spes]->type,
> +                              arm_spe_pmus[*nr_spes]->name);
> +                     (*nr_spes)++;
> +             }
> +     }
> +
> +     return arm_spe_pmus;
> +}
>  
>  struct auxtrace_record
>  *auxtrace_record__init(struct perf_evlist *evlist, int *err)
> @@ -29,22 +65,49 @@ struct auxtrace_record
>       struct perf_pmu *cs_etm_pmu;
>       struct perf_evsel *evsel;
>       bool found_etm = false;
> +     bool found_spe = false;
> +     static struct perf_pmu **arm_spe_pmus = NULL;
> +     static int nr_spes = 0;
> +     int i;
> +
> +     if (!evlist)
> +             return NULL;
>  
>       cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
>  
> -     if (evlist) {
> -             evlist__for_each_entry(evlist, evsel) {
> -                     if (cs_etm_pmu &&
> -                         evsel->attr.type == cs_etm_pmu->type)
> -                             found_etm = true;
> +     if (!arm_spe_pmus)
> +             arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
> +
> +     evlist__for_each_entry(evlist, evsel) {
> +             if (cs_etm_pmu &&
> +                 evsel->attr.type == cs_etm_pmu->type)
> +                     found_etm = true;
> +
> +             if (!nr_spes)
> +                     continue;
> +
> +             for (i = 0; i < nr_spes; i++) {
> +                     if (evsel->attr.type == arm_spe_pmus[i]->type) {
> +                             found_spe = true;
> +                             break;
> +                     }
>               }
>       }
>  
> +     if (found_etm && found_spe) {
> +             pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
> +             *err = -EOPNOTSUPP;
> +             return NULL;
> +     }
> +
>       if (found_etm)
>               return cs_etm_record_init(err);
>  
> +     if (found_spe)
> +             return arm_spe_recording_init(err, arm_spe_pmus[i]);
> +
>       /*
> -      * Clear 'err' even if we haven't found a cs_etm event - that way perf
> +      * Clear 'err' even if we haven't found an event - that way perf
>        * record can still be used even if tracers aren't present.  The NULL
>        * return value will take care of telling the infrastructure HW tracing
>        * isn't available.
> diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
> index 98d67399a0d6..4c06a25ae6b1 100644
> --- a/tools/perf/arch/arm/util/pmu.c
> +++ b/tools/perf/arch/arm/util/pmu.c
> @@ -20,6 +20,7 @@
>  #include <linux/perf_event.h>
>  
>  #include "cs-etm.h"
> +#include "arm-spe.h"
>  #include "../../util/pmu.h"
>  
>  struct perf_event_attr
> @@ -30,7 +31,9 @@ struct perf_event_attr
>               /* add ETM default config here */
>               pmu->selectable = true;
>               pmu->set_drv_config = cs_etm_set_drv_config;
> -     }
> +     } else
> +             if (strstarts(pmu->name, ARM_SPE_PMU_NAME))
> +                     return arm_spe_pmu_default_config(pmu);

More conventional kernel style would be:

        } else if (strstarts(pmu->name, ARM_SPE_PMU_NAME)) {
                return arm_spe_pmu_default_config(pmu);
        }

Also, it looks like arm_spe_pmu_default_config() is only compiled for arm64,
so what happens if you build for arm?
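
One possible way to keep the arm build linking, purely as a sketch: guard the
SPE branch so it is only compiled on arm64. The struct and function names below
are stand-ins (not the perf types), and the "arm_spe_" prefix is assumed from
the arm_spe_0 event name used in the commit message.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for the perf types; hypothetical, for illustration only. */
struct pmu_sketch { const char *name; };
struct attr_sketch { unsigned long long sample_period; };

#define SPE_PMU_PREFIX "arm_spe_"	/* assumed value of ARM_SPE_PMU_NAME */

#if defined(__aarch64__)
/* Only built on arm64, mirroring where arm-spe.o is compiled. */
static struct attr_sketch spe_attr = { .sample_period = 4096 };

static struct attr_sketch *spe_default_config(struct pmu_sketch *pmu)
{
	(void)pmu;
	return &spe_attr;
}
#endif

/* Returns a default attr for SPE PMUs on arm64, NULL otherwise. */
static struct attr_sketch *default_config(struct pmu_sketch *pmu)
{
#if defined(__aarch64__)
	if (!strncmp(pmu->name, SPE_PMU_PREFIX, strlen(SPE_PMU_PREFIX)))
		return spe_default_config(pmu);
#endif
	(void)pmu;
	return NULL;
}
```

On a 32-bit arm build the SPE branch disappears and the function falls through
to the existing NULL return, so no undefined reference is left behind.
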

>  #endif
>       return NULL;
>  }
> diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
> index cef6fb38d17e..f9969bb88ccb 100644
> --- a/tools/perf/arch/arm64/util/Build
> +++ b/tools/perf/arch/arm64/util/Build
> @@ -3,4 +3,5 @@ libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
>  
>  libperf-$(CONFIG_AUXTRACE) += ../../arm/util/pmu.o \
>                             ../../arm/util/auxtrace.o \
> -                           ../../arm/util/cs-etm.o
> +                           ../../arm/util/cs-etm.o \
> +                           arm-spe.o
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> new file mode 100644
> index 000000000000..ef576b52c850
> --- /dev/null
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -0,0 +1,235 @@
> +/*
> + * ARM Statistical Profiling Extensions (SPE) support
> + * Copyright (c) 2017, ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.

Might as well switch to SPDX license identifiers, here and elsewhere.
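
Something along these lines, assuming the GPL version 2 terms quoted above map
to the GPL-2.0 identifier (kernel convention is a // comment on the first line
of .c files and a /* */ comment on the first line of headers):

```c
// SPDX-License-Identifier: GPL-2.0
/*
 * ARM Statistical Profiling Extensions (SPE) support
 * Copyright (c) 2017, ARM Ltd.
 */
```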

> + *
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/bitops.h>
> +#include <linux/log2.h>
> +#include <time.h>
> +
> +#include "../../util/cpumap.h"
> +#include "../../util/evsel.h"
> +#include "../../util/evlist.h"
> +#include "../../util/session.h"
> +#include "../../util/util.h"
> +#include "../../util/pmu.h"
> +#include "../../util/debug.h"
> +#include "../../util/tsc.h"

tsc.h is not needed here.

> +#include "../../util/auxtrace.h"
> +#include "../../util/arm-spe.h"
> +
> +#define KiB(x) ((x) * 1024)
> +#define MiB(x) ((x) * 1024 * 1024)
> +
> +struct arm_spe_recording {
> +     struct auxtrace_record          itr;
> +     struct perf_pmu                 *arm_spe_pmu;
> +     struct perf_evlist              *evlist;
> +};
> +
> +static size_t
> +arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
> +                    struct perf_evlist *evlist __maybe_unused)
> +{
> +     return ARM_SPE_AUXTRACE_PRIV_SIZE;
> +}
> +
> +static int arm_spe_info_fill(struct auxtrace_record *itr,
> +                          struct perf_session *session,
> +                          struct auxtrace_info_event *auxtrace_info,
> +                          size_t priv_size)
> +{
> +     struct arm_spe_recording *sper =
> +                     container_of(itr, struct arm_spe_recording, itr);
> +     struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
> +
> +     if (priv_size != ARM_SPE_AUXTRACE_PRIV_SIZE)
> +             return -EINVAL;
> +
> +     if (!session->evlist->nr_mmaps)
> +             return -EINVAL;
> +
> +     auxtrace_info->type = PERF_AUXTRACE_ARM_SPE;
> +     auxtrace_info->priv[ARM_SPE_PMU_TYPE] = arm_spe_pmu->type;
> +
> +     return 0;
> +}
> +
> +static int arm_spe_recording_options(struct auxtrace_record *itr,
> +                                  struct perf_evlist *evlist,
> +                                  struct record_opts *opts)
> +{
> +     struct arm_spe_recording *sper =
> +                     container_of(itr, struct arm_spe_recording, itr);
> +     struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
> +     struct perf_evsel *evsel, *arm_spe_evsel = NULL;
> +     bool privileged = geteuid() == 0 || perf_event_paranoid() < 0;
> +     struct perf_evsel *tracking_evsel;
> +     int err;
> +
> +     sper->evlist = evlist;
> +
> +     evlist__for_each_entry(evlist, evsel) {
> +             if (evsel->attr.type == arm_spe_pmu->type) {
> +                     if (arm_spe_evsel) {
> +                             pr_err("There may be only one " ARM_SPE_PMU_NAME "x event\n");
> +                             return -EINVAL;
> +                     }
> +                     evsel->attr.freq = 0;
> +                     evsel->attr.sample_period = 1;
> +                     arm_spe_evsel = evsel;
> +                     opts->full_auxtrace = true;
> +             }
> +     }
> +
> +     if (!opts->full_auxtrace)
> +             return 0;
> +
> +     /* We are in full trace mode but '-m,xyz' wasn't specified */
> +     if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) {
> +             if (privileged) {
> +                     opts->auxtrace_mmap_pages = MiB(4) / page_size;
> +             } else {
> +                     opts->auxtrace_mmap_pages = KiB(128) / page_size;
> +                     if (opts->mmap_pages == UINT_MAX)
> +                             opts->mmap_pages = KiB(256) / page_size;
> +             }
> +     }
> +
> +     /* Validate auxtrace_mmap_pages */
> +     if (opts->auxtrace_mmap_pages) {
> +             size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size;
> +             size_t min_sz = KiB(8);
> +
> +             if (sz < min_sz || !is_power_of_2(sz)) {
> +                     pr_err("Invalid mmap size for ARM SPE: must be at least %zuKiB and a power of 2\n",
> +                            min_sz / 1024);
> +                     return -EINVAL;
> +             }
> +     }
> +
> +
> +     /*
> +      * To obtain the auxtrace buffer file descriptor, the auxtrace event
> +      * must come first.
> +      */
> +     perf_evlist__to_front(evlist, arm_spe_evsel);
> +
> +     perf_evsel__set_sample_bit(arm_spe_evsel, CPU);
> +     perf_evsel__set_sample_bit(arm_spe_evsel, TIME);
> +     perf_evsel__set_sample_bit(arm_spe_evsel, TID);
> +
> +     /* Add dummy event to keep tracking */
> +     err = parse_events(evlist, "dummy:u", NULL);
> +     if (err)
> +             return err;
> +
> +     tracking_evsel = perf_evlist__last(evlist);
> +     perf_evlist__set_tracking_event(evlist, tracking_evsel);
> +
> +     tracking_evsel->attr.freq = 0;
> +     tracking_evsel->attr.sample_period = 1;
> +     perf_evsel__set_sample_bit(tracking_evsel, TIME);
> +     perf_evsel__set_sample_bit(tracking_evsel, CPU);
> +     perf_evsel__reset_sample_bit(tracking_evsel, BRANCH_STACK);
> +
> +     return 0;
> +}
> +
> +static u64 arm_spe_reference(struct auxtrace_record *itr __maybe_unused)
> +{
> +     struct timespec ts;
> +
> +     clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
> +
> +     return ts.tv_sec ^ ts.tv_nsec;
> +}
> +
> +static void arm_spe_recording_free(struct auxtrace_record *itr)
> +{
> +     struct arm_spe_recording *sper =
> +                     container_of(itr, struct arm_spe_recording, itr);
> +
> +     free(sper);
> +}
> +
> +static int arm_spe_read_finish(struct auxtrace_record *itr, int idx)
> +{
> +     struct arm_spe_recording *sper =
> +                     container_of(itr, struct arm_spe_recording, itr);
> +     struct perf_evsel *evsel;
> +
> +     evlist__for_each_entry(sper->evlist, evsel) {
> +             if (evsel->attr.type == sper->arm_spe_pmu->type)
> +                     return perf_evlist__enable_event_idx(sper->evlist,
> +                                                          evsel, idx);
> +     }
> +     return -EINVAL;
> +}
> +
> +struct auxtrace_record *arm_spe_recording_init(int *err,
> +                                            struct perf_pmu *arm_spe_pmu)
> +{
> +     struct arm_spe_recording *sper;
> +
> +     if (!arm_spe_pmu) {
> +             *err = -ENODEV;
> +             return NULL;
> +     }
> +
> +     sper = zalloc(sizeof(struct arm_spe_recording));
> +     if (!sper) {
> +             *err = -ENOMEM;
> +             return NULL;
> +     }
> +
> +     sper->arm_spe_pmu = arm_spe_pmu;
> +     sper->itr.recording_options = arm_spe_recording_options;
> +     sper->itr.info_priv_size = arm_spe_info_priv_size;
> +     sper->itr.info_fill = arm_spe_info_fill;
> +     sper->itr.free = arm_spe_recording_free;
> +     sper->itr.reference = arm_spe_reference;
> +     sper->itr.read_finish = arm_spe_read_finish;
> +     sper->itr.alignment = 0;
> +
> +     return &sper->itr;
> +}
> +
> +struct perf_event_attr
> +*arm_spe_pmu_default_config(struct perf_pmu *arm_spe_pmu)
> +{
> +     struct perf_event_attr *attr;
> +
> +     attr = zalloc(sizeof(struct perf_event_attr));
> +     if (!attr) {
> +             pr_err("arm_spe default config cannot allocate a perf_event_attr\n");
> +             return NULL;
> +     }
> +
> +     /*
> +      * If kernel driver doesn't advertise a minimum,
> +      * use max allowable by PMSIDR_EL1.INTERVAL
> +      */
> +     if (perf_pmu__scan_file(arm_spe_pmu, "caps/min_interval", "%llu",
> +                               &attr->sample_period) != 1) {
> +             pr_debug("arm_spe driver doesn't advertise a min. interval. Using 4096\n");
> +             attr->sample_period = 4096;
> +     }
> +
> +     arm_spe_pmu->selectable = true;
> +     arm_spe_pmu->is_uncore = false;
> +
> +     return attr;
> +}
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index a3de7916fe63..7c6a8b461e24 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -86,6 +86,8 @@ libperf-$(CONFIG_AUXTRACE) += auxtrace.o
>  libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
>  libperf-$(CONFIG_AUXTRACE) += intel-pt.o
>  libperf-$(CONFIG_AUXTRACE) += intel-bts.o
> +libperf-$(CONFIG_AUXTRACE) += arm-spe.o
> +libperf-$(CONFIG_AUXTRACE) += arm-spe-pkt-decoder.o
>  libperf-y += parse-branch-options.o
>  libperf-y += dump-insn.o
>  libperf-y += parse-regs-options.o
> diff --git a/tools/perf/util/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-pkt-decoder.c
> new file mode 100644
> index 000000000000..234943471d30
> --- /dev/null
> +++ b/tools/perf/util/arm-spe-pkt-decoder.c
> @@ -0,0 +1,471 @@
> +/*
> + * ARM Statistical Profiling Extensions (SPE) support
> + * Copyright (c) 2017, ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + */
> +
> +#include <stdio.h>
> +#include <string.h>
> +#include <endian.h>
> +#include <byteswap.h>
> +
> +#include "arm-spe-pkt-decoder.h"
> +
> +#define BIT(n)               (1ULL << (n))
> +
> +#define NS_FLAG              BIT(63)
> +#define EL_FLAG              (BIT(62) | BIT(61))
> +
> +#define SPE_HEADER0_PAD                      0x0
> +#define SPE_HEADER0_END                      0x1
> +#define SPE_HEADER0_ADDRESS          0x30 /* address packet (short) */
> +#define SPE_HEADER0_ADDRESS_MASK     0x38
> +#define SPE_HEADER0_COUNTER          0x18 /* counter packet (short) */
> +#define SPE_HEADER0_COUNTER_MASK     0x38
> +#define SPE_HEADER0_TIMESTAMP                0x71
> +#define SPE_HEADER0_TIMESTAMP                0x71
> +#define SPE_HEADER0_EVENTS           0x2
> +#define SPE_HEADER0_EVENTS_MASK              0xf
> +#define SPE_HEADER0_SOURCE           0x3
> +#define SPE_HEADER0_SOURCE_MASK              0xf
> +#define SPE_HEADER0_CONTEXT          0x24
> +#define SPE_HEADER0_CONTEXT_MASK     0x3c
> +#define SPE_HEADER0_OP_TYPE          0x8
> +#define SPE_HEADER0_OP_TYPE_MASK     0x3c
> +#define SPE_HEADER1_ALIGNMENT                0x0
> +#define SPE_HEADER1_ADDRESS          0xb0 /* address packet (extended) */
> +#define SPE_HEADER1_ADDRESS_MASK     0xf8
> +#define SPE_HEADER1_COUNTER          0x98 /* counter packet (extended) */
> +#define SPE_HEADER1_COUNTER_MASK     0xf8
> +
> +#if __BYTE_ORDER == __BIG_ENDIAN
> +#define le16_to_cpu bswap_16
> +#define le32_to_cpu bswap_32
> +#define le64_to_cpu bswap_64
> +#define memcpy_le64(d, s, n) do { \
> +     memcpy((d), (s), (n));    \
> +     *(d) = le64_to_cpu(*(d)); \
> +} while (0)
> +#else
> +#define le16_to_cpu
> +#define le32_to_cpu
> +#define le64_to_cpu
> +#define memcpy_le64 memcpy
> +#endif
> +
> +static const char * const arm_spe_packet_name[] = {
> +     [ARM_SPE_PAD]           = "PAD",
> +     [ARM_SPE_END]           = "END",
> +     [ARM_SPE_TIMESTAMP]     = "TS",
> +     [ARM_SPE_ADDRESS]       = "ADDR",
> +     [ARM_SPE_COUNTER]       = "LAT",
> +     [ARM_SPE_CONTEXT]       = "CONTEXT",
> +     [ARM_SPE_OP_TYPE]       = "OP-TYPE",
> +     [ARM_SPE_EVENTS]        = "EVENTS",
> +     [ARM_SPE_DATA_SOURCE]   = "DATA-SOURCE",
> +};
> +
> +const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
> +{
> +     return arm_spe_packet_name[type];
> +}
> +
> +/* return ARM SPE payload size from its encoding,
> + * which is in bits 5:4 of the byte.
> + * 00 : byte
> + * 01 : halfword (2)
> + * 10 : word (4)
> + * 11 : doubleword (8)
> + */
> +static int payloadlen(unsigned char byte)
> +{
> +     return 1 << ((byte & 0x30) >> 4);
> +}
> +
> +static int arm_spe_get_payload(const unsigned char *buf, size_t len,
> +                            struct arm_spe_pkt *packet)
> +{
> +     size_t payload_len = payloadlen(buf[0]);
> +
> +     if (len < 1 + payload_len)
> +             return ARM_SPE_NEED_MORE_BYTES;
> +
> +     buf++;
> +
> +     switch (payload_len) {
> +     case 1: packet->payload = *(uint8_t *)buf; break;
> +     case 2: packet->payload = le16_to_cpu(*(uint16_t *)buf); break;
> +     case 4: packet->payload = le32_to_cpu(*(uint32_t *)buf); break;
> +     case 8: packet->payload = le64_to_cpu(*(uint64_t *)buf); break;
> +     default: return ARM_SPE_BAD_PACKET;
> +     }
> +
> +     return 1 + payload_len;
> +}
> +
> +static int arm_spe_get_pad(struct arm_spe_pkt *packet)
> +{
> +     packet->type = ARM_SPE_PAD;
> +     return 1;
> +}
> +
> +static int arm_spe_get_alignment(const unsigned char *buf, size_t len,
> +                              struct arm_spe_pkt *packet)
> +{
> +     unsigned int alignment = 1 << ((buf[0] & 0xf) + 1);
> +
> +     if (len < alignment)
> +             return ARM_SPE_NEED_MORE_BYTES;
> +
> +     packet->type = ARM_SPE_PAD;
> +     return alignment - (((uint64_t)buf) & (alignment - 1));
> +}
> +
> +static int arm_spe_get_end(struct arm_spe_pkt *packet)
> +{
> +     packet->type = ARM_SPE_END;
> +     return 1;
> +}
> +
> +static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
> +                              struct arm_spe_pkt *packet)
> +{
> +     packet->type = ARM_SPE_TIMESTAMP;
> +     return arm_spe_get_payload(buf, len, packet);
> +}
> +
> +static int arm_spe_get_events(const unsigned char *buf, size_t len,
> +                           struct arm_spe_pkt *packet)
> +{
> +     int ret = arm_spe_get_payload(buf, len, packet);
> +
> +     packet->type = ARM_SPE_EVENTS;
> +
> +     /* we use index to identify Events with a less number of
> +      * comparisons in arm_spe_pkt_desc(): E.g., the LLC-ACCESS,
> +      * LLC-REFILL, and REMOTE-ACCESS events are identified iff
> +      * index > 1.
> +      */
> +     packet->index = ret - 1;
> +
> +     return ret;
> +}
> +
> +static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
> +                                struct arm_spe_pkt *packet)
> +{
> +     packet->type = ARM_SPE_DATA_SOURCE;
> +     return arm_spe_get_payload(buf, len, packet);
> +}
> +
> +static int arm_spe_get_context(const unsigned char *buf, size_t len,
> +                            struct arm_spe_pkt *packet)
> +{
> +     packet->type = ARM_SPE_CONTEXT;
> +     packet->index = buf[0] & 0x3;
> +
> +     return arm_spe_get_payload(buf, len, packet);
> +}
> +
> +static int arm_spe_get_op_type(const unsigned char *buf, size_t len,
> +                            struct arm_spe_pkt *packet)
> +{
> +     packet->type = ARM_SPE_OP_TYPE;
> +     packet->index = buf[0] & 0x3;
> +     return arm_spe_get_payload(buf, len, packet);
> +}
> +
> +static int arm_spe_get_counter(const unsigned char *buf, size_t len,
> +                            const unsigned char ext_hdr, struct arm_spe_pkt *packet)
> +{
> +     if (len < 2)
> +             return ARM_SPE_NEED_MORE_BYTES;
> +
> +     packet->type = ARM_SPE_COUNTER;
> +     if (ext_hdr)
> +             packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> +     else
> +             packet->index = buf[0] & 0x7;
> +
> +     packet->payload = le16_to_cpu(*(uint16_t *)(buf + 1));
> +
> +     return 1 + ext_hdr + 2;
> +}
> +
> +static int arm_spe_get_addr(const unsigned char *buf, size_t len,
> +                         const unsigned char ext_hdr, struct arm_spe_pkt *packet)
> +{
> +     if (len < 8)
> +             return ARM_SPE_NEED_MORE_BYTES;
> +
> +     packet->type = ARM_SPE_ADDRESS;
> +     if (ext_hdr)
> +             packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> +     else
> +             packet->index = buf[0] & 0x7;
> +
> +     memcpy_le64(&packet->payload, buf + 1, 8);
> +
> +     return 1 + ext_hdr + 8;
> +}
> +
> +static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
> +                              struct arm_spe_pkt *packet)
> +{
> +     unsigned int byte;
> +
> +     memset(packet, 0, sizeof(struct arm_spe_pkt));
> +
> +     if (!len)
> +             return ARM_SPE_NEED_MORE_BYTES;
> +
> +     byte = buf[0];
> +     if (byte == SPE_HEADER0_PAD)
> +             return arm_spe_get_pad(packet);
> +     else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
> +             return arm_spe_get_end(packet);
> +     else if (byte & 0xc0 /* 0y11xxxxxx */) {
> +             if (byte & 0x80) {
> +                     if ((byte & SPE_HEADER0_ADDRESS_MASK) == SPE_HEADER0_ADDRESS)
> +                             return arm_spe_get_addr(buf, len, 0, packet);
> +                     if ((byte & SPE_HEADER0_COUNTER_MASK) == SPE_HEADER0_COUNTER)
> +                             return arm_spe_get_counter(buf, len, 0, packet);
> +             } else
> +                     if (byte == SPE_HEADER0_TIMESTAMP)
> +                             return arm_spe_get_timestamp(buf, len, packet);
> +                     else if ((byte & SPE_HEADER0_EVENTS_MASK) == SPE_HEADER0_EVENTS)
> +                             return arm_spe_get_events(buf, len, packet);
> +                     else if ((byte & SPE_HEADER0_SOURCE_MASK) == SPE_HEADER0_SOURCE)
> +                             return arm_spe_get_data_source(buf, len, packet);
> +                     else if ((byte & SPE_HEADER0_CONTEXT_MASK) == SPE_HEADER0_CONTEXT)
> +                             return arm_spe_get_context(buf, len, packet);
> +                     else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == SPE_HEADER0_OP_TYPE)
> +                             return arm_spe_get_op_type(buf, len, packet);
> +     } else if ((byte & 0xe0) == 0x20 /* 0y001xxxxx */) {
> +             /* 16-bit header */
> +             byte = buf[1];
> +             if (byte == SPE_HEADER1_ALIGNMENT)
> +                     return arm_spe_get_alignment(buf, len, packet);
> +             else if ((byte & SPE_HEADER1_ADDRESS_MASK) == SPE_HEADER1_ADDRESS)
> +                     return arm_spe_get_addr(buf, len, 1, packet);
> +             else if ((byte & SPE_HEADER1_COUNTER_MASK) == SPE_HEADER1_COUNTER)
> +                     return arm_spe_get_counter(buf, len, 1, packet);
> +     }
> +
> +     return ARM_SPE_BAD_PACKET;
> +}
> +
> +int arm_spe_get_packet(const unsigned char *buf, size_t len,
> +                    struct arm_spe_pkt *packet)
> +{
> +     int ret;
> +
> +     ret = arm_spe_do_get_packet(buf, len, packet);
> +     /* put multiple consecutive PADs on the same line, up to
> +      * the fixed-width output format of 16 bytes per line.
> +      */
> +     if (ret > 0 && packet->type == ARM_SPE_PAD) {
> +             while (ret < 16 && len > (size_t)ret && !buf[ret])
> +                     ret += 1;
> +     }
> +     return ret;
> +}
> +
> +int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> +                  size_t buf_len)
> +{
> +     int ret, ns, el, index = packet->index;
> +     unsigned long long payload = packet->payload;
> +     const char *name = arm_spe_pkt_name(packet->type);
> +
> +     switch (packet->type) {
> +     case ARM_SPE_BAD:
> +     case ARM_SPE_PAD:
> +     case ARM_SPE_END:
> +             return snprintf(buf, buf_len, "%s", name);
> +     case ARM_SPE_EVENTS: {
> +             size_t blen = buf_len;
> +
> +             ret = 0;
> +             ret = snprintf(buf, buf_len, "EV");
> +             buf += ret;
> +             blen -= ret;
> +             if (payload & 0x1) {
> +                     ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
> +                     buf += ret;
> +                     blen -= ret;
> +             }
> +             if (payload & 0x2) {
> +                     ret = snprintf(buf, buf_len, " RETIRED");
> +                     buf += ret;
> +                     blen -= ret;
> +             }
> +             if (payload & 0x4) {
> +                     ret = snprintf(buf, buf_len, " L1D-ACCESS");
> +                     buf += ret;
> +                     blen -= ret;
> +             }
> +             if (payload & 0x8) {
> +                     ret = snprintf(buf, buf_len, " L1D-REFILL");
> +                     buf += ret;
> +                     blen -= ret;
> +             }
> +             if (payload & 0x10) {
> +                     ret = snprintf(buf, buf_len, " TLB-ACCESS");
> +                     buf += ret;
> +                     blen -= ret;
> +             }
> +             if (payload & 0x20) {
> +                     ret = snprintf(buf, buf_len, " TLB-REFILL");
> +                     buf += ret;
> +                     blen -= ret;
> +             }
> +             if (payload & 0x40) {
> +                     ret = snprintf(buf, buf_len, " NOT-TAKEN");
> +                     buf += ret;
> +                     blen -= ret;
> +             }
> +             if (payload & 0x80) {
> +                     ret = snprintf(buf, buf_len, " MISPRED");
> +                     buf += ret;
> +                     blen -= ret;
> +             }
> +             if (index > 1) {
> +                     if (payload & 0x100) {
> +                             ret = snprintf(buf, buf_len, " LLC-ACCESS");
> +                             buf += ret;
> +                             blen -= ret;
> +                     }
> +                     if (payload & 0x200) {
> +                             ret = snprintf(buf, buf_len, " LLC-REFILL");
> +                             buf += ret;
> +                             blen -= ret;
> +                     }
> +                     if (payload & 0x400) {
> +                             ret = snprintf(buf, buf_len, " REMOTE-ACCESS");
> +                             buf += ret;
> +                             blen -= ret;
> +                     }
> +             }
> +             if (ret < 0)
> +                     return ret;
> +             blen -= ret;
> +             return buf_len - blen;
> +     }
> +     case ARM_SPE_OP_TYPE:
> +             switch (index) {
> +             case 0: return snprintf(buf, buf_len, "%s", payload & 0x1 ?
> +                                     "COND-SELECT" : "INSN-OTHER");
> +             case 1: {
> +                     size_t blen = buf_len;
> +
> +                     if (payload & 0x1)
> +                             ret = snprintf(buf, buf_len, "ST");
> +                     else
> +                             ret = snprintf(buf, buf_len, "LD");
> +                     buf += ret;
> +                     blen -= ret;
> +                     if (payload & 0x2) {
> +                             if (payload & 0x4) {
> +                                     ret = snprintf(buf, buf_len, " AT");
> +                                     buf += ret;
> +                                     blen -= ret;
> +                             }
> +                             if (payload & 0x8) {
> +                                     ret = snprintf(buf, buf_len, " EXCL");
> +                                     buf += ret;
> +                                     blen -= ret;
> +                             }
> +                             if (payload & 0x10) {
> +                                     ret = snprintf(buf, buf_len, " AR");
> +                                     buf += ret;
> +                                     blen -= ret;
> +                             }
> +                     } else if (payload & 0x4) {
> +                             ret = snprintf(buf, buf_len, " SIMD-FP");
> +                             buf += ret;
> +                             blen -= ret;
> +                     }
> +                     if (ret < 0)
> +                             return ret;
> +                     blen -= ret;
> +                     return buf_len - blen;
> +             }
> +             case 2: {
> +                     size_t blen = buf_len;
> +
> +                     ret = snprintf(buf, buf_len, "B");
> +                     buf += ret;
> +                     blen -= ret;
> +                     if (payload & 0x1) {
> +                             ret = snprintf(buf, buf_len, " COND");
> +                             buf += ret;
> +                             blen -= ret;
> +                     }
> +                     if (payload & 0x2) {
> +                             ret = snprintf(buf, buf_len, " IND");
> +                             buf += ret;
> +                             blen -= ret;
> +                     }
> +                     if (ret < 0)
> +                             return ret;
> +                     blen -= ret;
> +                     return buf_len - blen;
> +                     }
> +             default: return 0;
> +             }
> +     case ARM_SPE_DATA_SOURCE:
> +     case ARM_SPE_TIMESTAMP:
> +             return snprintf(buf, buf_len, "%s %lld", name, payload);
> +     case ARM_SPE_ADDRESS:
> +             switch (index) {
> +             case 0:
> +             case 1: ns = !!(packet->payload & NS_FLAG);
> +                     el = (packet->payload & EL_FLAG) >> 61;
> +                     payload &= ~(0xffULL << 56);
> +                     return snprintf(buf, buf_len, "%s 0x%llx el%d ns=%d",
> +                                     (index == 1) ? "TGT" : "PC", payload, el, ns);
> +             case 2: return snprintf(buf, buf_len, "VA 0x%llx", payload);
> +             case 3: ns = !!(packet->payload & NS_FLAG);
> +                     payload &= ~(0xffULL << 56);
> +                     return snprintf(buf, buf_len, "PA 0x%llx ns=%d",
> +                                     payload, ns);
> +             default: return 0;
> +             }
> +     case ARM_SPE_CONTEXT:
> +             return snprintf(buf, buf_len, "%s 0x%lx el%d", name,
> +                             (unsigned long)payload, index + 1);
> +     case ARM_SPE_COUNTER: {
> +             size_t blen = buf_len;
> +
> +             ret = snprintf(buf, buf_len, "%s %d ", name,
> +                            (unsigned short)payload);
> +             buf += ret;
> +             blen -= ret;
> +             switch (index) {
> +             case 0: ret = snprintf(buf, buf_len, "TOT"); break;
> +             case 1: ret = snprintf(buf, buf_len, "ISSUE"); break;
> +             case 2: ret = snprintf(buf, buf_len, "XLAT"); break;
> +             default: ret = 0;
> +             }
> +             if (ret < 0)
> +                     return ret;
> +             blen -= ret;
> +             return buf_len - blen;
> +     }
> +     default:
> +             break;
> +     }
> +
> +     return snprintf(buf, buf_len, "%s 0x%llx (%d)",
> +                     name, payload, packet->index);
> +}
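
The description builders above track the remaining space by hand after
every snprintf() call. One way to keep that bookkeeping in a single
place is sketched below. This is only an illustration, not code from
the patch: desc_append() and events_desc() are hypothetical names, and
only the low eight EV bits from the quoted switch are covered.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in the patch): append formatted text at *buf,
 * then advance *buf and shrink *blen by the number of bytes stored.
 */
static int desc_append(char **buf, size_t *blen, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = vsnprintf(*buf, *blen, fmt, ap);
	va_end(ap);
	if (ret < 0)
		return ret;
	/* On truncation vsnprintf() reports the untruncated length; clamp it. */
	if ((size_t)ret >= *blen)
		ret = *blen ? (int)(*blen - 1) : 0;
	*buf += ret;
	*blen -= ret;
	return ret;
}

/* Hypothetical table-driven take on the low eight EV bits in the patch. */
static int events_desc(unsigned long long payload, char *buf, size_t buf_len)
{
	static const char * const names[] = {
		"EXCEPTION-GEN", "RETIRED", "L1D-ACCESS", "L1D-REFILL",
		"TLB-ACCESS", "TLB-REFILL", "NOT-TAKEN", "MISPRED",
	};
	size_t blen = buf_len;
	size_t i;
	int ret;

	ret = desc_append(&buf, &blen, "EV");
	if (ret < 0)
		return ret;
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (!(payload & (1ULL << i)))
			continue;
		ret = desc_append(&buf, &blen, " %s", names[i]);
		if (ret < 0)
			return ret;
	}
	/* Number of characters stored, excluding the terminating NUL. */
	return (int)(buf_len - blen);
}
```

With a payload of 0x16 this produces "EV RETIRED L1D-ACCESS TLB-ACCESS",
the same text as the "42 16" lines in the example output.
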
> diff --git a/tools/perf/util/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-pkt-decoder.h
> new file mode 100644
> index 000000000000..f146f4143447
> --- /dev/null
> +++ b/tools/perf/util/arm-spe-pkt-decoder.h
> @@ -0,0 +1,52 @@
> +/*
> + * ARM Statistical Profiling Extensions (SPE) support
> + * Copyright (c) 2017, ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + */
> +
> +#ifndef INCLUDE__ARM_SPE_PKT_DECODER_H__
> +#define INCLUDE__ARM_SPE_PKT_DECODER_H__
> +
> +#include <stddef.h>
> +#include <stdint.h>
> +
> +#define ARM_SPE_PKT_DESC_MAX         256
> +
> +#define ARM_SPE_NEED_MORE_BYTES              -1
> +#define ARM_SPE_BAD_PACKET           -2
> +
> +enum arm_spe_pkt_type {
> +     ARM_SPE_BAD,
> +     ARM_SPE_PAD,
> +     ARM_SPE_END,
> +     ARM_SPE_TIMESTAMP,
> +     ARM_SPE_ADDRESS,
> +     ARM_SPE_COUNTER,
> +     ARM_SPE_CONTEXT,
> +     ARM_SPE_OP_TYPE,
> +     ARM_SPE_EVENTS,
> +     ARM_SPE_DATA_SOURCE,
> +};
> +
> +struct arm_spe_pkt {
> +     enum arm_spe_pkt_type   type;
> +     unsigned char           index;
> +     uint64_t                payload;
> +};
> +
> +const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
> +
> +int arm_spe_get_packet(const unsigned char *buf, size_t len,
> +                    struct arm_spe_pkt *packet);
> +
> +int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf, size_t len);
> +#endif
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> new file mode 100644
> index 000000000000..67965e26b5b1
> --- /dev/null
> +++ b/tools/perf/util/arm-spe.c
> @@ -0,0 +1,318 @@
> +/*
> + * ARM Statistical Profiling Extensions (SPE) support
> + * Copyright (c) 2017, ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + */
> +
> +#include <endian.h>
> +#include <errno.h>
> +#include <byteswap.h>
> +#include <inttypes.h>
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/bitops.h>
> +#include <linux/log2.h>
> +
> +#include "cpumap.h"
> +#include "color.h"
> +#include "evsel.h"
> +#include "evlist.h"
> +#include "machine.h"
> +#include "session.h"
> +#include "util.h"
> +#include "thread.h"
> +#include "debug.h"
> +#include "auxtrace.h"
> +#include "arm-spe.h"
> +#include "arm-spe-pkt-decoder.h"
> +
> +struct arm_spe {
> +     struct auxtrace                 auxtrace;
> +     struct auxtrace_queues          queues;
> +     struct auxtrace_heap            heap;
> +     u32                             auxtrace_type;
> +     struct perf_session             *session;
> +     struct machine                  *machine;
> +     u32                             pmu_type;
> +};
> +
> +struct arm_spe_queue {
> +     struct arm_spe          *spe;
> +     unsigned int            queue_nr;
> +     struct auxtrace_buffer  *buffer;
> +     bool                    on_heap;
> +     bool                    done;
> +     pid_t                   pid;
> +     pid_t                   tid;
> +     int                     cpu;
> +};
> +
> +static void arm_spe_dump(struct arm_spe *spe __maybe_unused,
> +                      unsigned char *buf, size_t len)
> +{
> +     struct arm_spe_pkt packet;
> +     size_t pos = 0;
> +     int ret, pkt_len, i;
> +     char desc[ARM_SPE_PKT_DESC_MAX];
> +     const char *color = PERF_COLOR_BLUE;
> +
> +     color_fprintf(stdout, color,
> +                   ". ... ARM SPE data: size %zu bytes\n",
> +                   len);
> +
> +     while (len) {
> +             ret = arm_spe_get_packet(buf, len, &packet);
> +             if (ret > 0)
> +                     pkt_len = ret;
> +             else
> +                     pkt_len = 1;
> +             printf(".");
> +             color_fprintf(stdout, color, "  %08x: ", pos);
> +             for (i = 0; i < pkt_len; i++)
> +                     color_fprintf(stdout, color, " %02x", buf[i]);
> +             for (; i < 16; i++)
> +                     color_fprintf(stdout, color, "   ");
> +             if (ret > 0) {
> +                     ret = arm_spe_pkt_desc(&packet, desc,
> +                                            ARM_SPE_PKT_DESC_MAX);
> +                     if (ret > 0)
> +                             color_fprintf(stdout, color, " %s\n", desc);
> +             } else {
> +                     color_fprintf(stdout, color, " Bad packet!\n");
> +             }
> +             pos += pkt_len;
> +             buf += pkt_len;
> +             len -= pkt_len;
> +     }
> +}
> +
> +static void arm_spe_dump_event(struct arm_spe *spe, unsigned char *buf,
> +                            size_t len)
> +{
> +     printf(".\n");
> +     arm_spe_dump(spe, buf, len);
> +}
> +
> +static struct arm_spe_queue *arm_spe_alloc_queue(struct arm_spe *spe,
> +                                              unsigned int queue_nr)
> +{
> +     struct arm_spe_queue *speq;
> +
> +     speq = zalloc(sizeof(struct arm_spe_queue));
> +     if (!speq)
> +             return NULL;
> +
> +     speq->spe = spe;
> +     speq->queue_nr = queue_nr;
> +     speq->pid = -1;
> +     speq->tid = -1;
> +     speq->cpu = -1;
> +
> +     return speq;
> +}
> +
> +static int arm_spe_setup_queue(struct arm_spe *spe,
> +                            struct auxtrace_queue *queue,
> +                            unsigned int queue_nr)
> +{
> +     struct arm_spe_queue *speq = queue->priv;
> +
> +     if (list_empty(&queue->head))
> +             return 0;
> +
> +     if (!speq) {
> +             speq = arm_spe_alloc_queue(spe, queue_nr);
> +             if (!speq)
> +                     return -ENOMEM;
> +             queue->priv = speq;
> +
> +             if (queue->cpu != -1)
> +                     speq->cpu = queue->cpu;
> +             speq->tid = queue->tid;
> +     }
> +
> +     if (!speq->on_heap && !speq->buffer) {
> +             int ret;
> +
> +             speq->buffer = auxtrace_buffer__next(queue, NULL);
> +             if (!speq->buffer)
> +                     return 0;
> +
> +             ret = auxtrace_heap__add(&spe->heap, queue_nr,
> +                                      speq->buffer->reference);
> +             if (ret)
> +                     return ret;
> +             speq->on_heap = true;
> +     }
> +
> +     return 0;
> +}
> +
> +static int arm_spe_setup_queues(struct arm_spe *spe)
> +{
> +     unsigned int i;
> +     int ret;
> +
> +     for (i = 0; i < spe->queues.nr_queues; i++) {
> +             ret = arm_spe_setup_queue(spe, &spe->queues.queue_array[i],
> +                                         i);
> +             if (ret)
> +                     return ret;
> +     }
> +     return 0;
> +}
> +
> +static inline int arm_spe_update_queues(struct arm_spe *spe)
> +{
> +     if (spe->queues.new_data) {
> +             spe->queues.new_data = false;
> +             return arm_spe_setup_queues(spe);
> +     }
> +     return 0;
> +}
> +
> +static int arm_spe_process_event(struct perf_session *session __maybe_unused,
> +                              union perf_event *event __maybe_unused,
> +                              struct perf_sample *sample __maybe_unused,
> +                              struct perf_tool *tool __maybe_unused)
> +{
> +     return 0;
> +}
> +
> +static int arm_spe_process_auxtrace_event(struct perf_session *session,
> +                                       union perf_event *event,
> +                                       struct perf_tool *tool __maybe_unused)
> +{
> +     struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
> +                                          auxtrace);
> +     struct auxtrace_buffer *buffer;
> +     off_t data_offset;
> +     int fd = perf_data__fd(session->data);
> +     int err;
> +
> +     if (perf_data__is_pipe(session->data)) {
> +             data_offset = 0;
> +     } else {
> +             data_offset = lseek(fd, 0, SEEK_CUR);
> +             if (data_offset == -1)
> +                     return -errno;
> +     }
> +
> +     err = auxtrace_queues__add_event(&spe->queues, session, event,
> +                                      data_offset, &buffer);
> +     if (err)
> +             return err;
> +
> +     /* Dump here now we have copied a piped trace out of the pipe */
> +     if (dump_trace) {
> +             if (auxtrace_buffer__get_data(buffer, fd)) {
> +                     arm_spe_dump_event(spe, buffer->data,
> +                                          buffer->size);
> +                     auxtrace_buffer__put_data(buffer);
> +             }
> +     }
> +
> +     return 0;
> +}
> +
> +static int arm_spe_flush(struct perf_session *session __maybe_unused,
> +                      struct perf_tool *tool __maybe_unused)
> +{
> +     return 0;
> +}
> +
> +static void arm_spe_free_queue(void *priv)
> +{
> +     struct arm_spe_queue *speq = priv;
> +
> +     if (!speq)
> +             return;
> +     free(speq);
> +}
> +
> +static void arm_spe_free_events(struct perf_session *session)
> +{
> +     struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
> +                                          auxtrace);
> +     struct auxtrace_queues *queues = &spe->queues;
> +     unsigned int i;
> +
> +     for (i = 0; i < queues->nr_queues; i++) {
> +             arm_spe_free_queue(queues->queue_array[i].priv);
> +             queues->queue_array[i].priv = NULL;
> +     }
> +     auxtrace_queues__free(queues);
> +}
> +
> +static void arm_spe_free(struct perf_session *session)
> +{
> +     struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
> +                                          auxtrace);
> +
> +     auxtrace_heap__free(&spe->heap);
> +     arm_spe_free_events(session);
> +     session->auxtrace = NULL;
> +     free(spe);
> +}
> +
> +static const char * const arm_spe_info_fmts[] = {
> +     [ARM_SPE_PMU_TYPE]              = "  PMU Type           %"PRId64"\n",
> +};
> +
> +static void arm_spe_print_info(u64 *arr)
> +{
> +     if (!dump_trace)
> +             return;
> +
> +     fprintf(stdout, arm_spe_info_fmts[ARM_SPE_PMU_TYPE], arr[ARM_SPE_PMU_TYPE]);
> +}
> +
> +int arm_spe_process_auxtrace_info(union perf_event *event,
> +                               struct perf_session *session)
> +{
> +     struct auxtrace_info_event *auxtrace_info = &event->auxtrace_info;
> +     size_t min_sz = sizeof(u64) * ARM_SPE_PMU_TYPE;
> +     struct arm_spe *spe;
> +     int err;
> +
> +     if (auxtrace_info->header.size < sizeof(struct auxtrace_info_event) +
> +                                     min_sz)
> +             return -EINVAL;
> +
> +     spe = zalloc(sizeof(struct arm_spe));
> +     if (!spe)
> +             return -ENOMEM;
> +
> +     err = auxtrace_queues__init(&spe->queues);
> +     if (err)
> +             goto err_free;
> +
> +     spe->session = session;
> +     spe->machine = &session->machines.host; /* No kvm support */
> +     spe->auxtrace_type = auxtrace_info->type;
> +     spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
> +
> +     spe->auxtrace.process_event = arm_spe_process_event;
> +     spe->auxtrace.process_auxtrace_event = arm_spe_process_auxtrace_event;
> +     spe->auxtrace.flush_events = arm_spe_flush;
> +     spe->auxtrace.free_events = arm_spe_free_events;
> +     spe->auxtrace.free = arm_spe_free;
> +     session->auxtrace = &spe->auxtrace;
> +
> +     arm_spe_print_info(&auxtrace_info->priv[0]);
> +
> +     return 0;
> +
> +err_free:
> +     free(spe);
> +     return err;
> +}
> diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
> new file mode 100644
> index 000000000000..80752b20d850
> --- /dev/null
> +++ b/tools/perf/util/arm-spe.h
> @@ -0,0 +1,42 @@
> +/*
> + * ARM Statistical Profiling Extensions (SPE) support
> + * Copyright (c) 2017, ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + */
> +
> +#ifndef INCLUDE__PERF_ARM_SPE_H__
> +#define INCLUDE__PERF_ARM_SPE_H__
> +
> +#define ARM_SPE_PMU_NAME "arm_spe_"
> +
> +enum {
> +     ARM_SPE_PMU_TYPE,
> +     ARM_SPE_PER_CPU_MMAPS,
> +     ARM_SPE_AUXTRACE_PRIV_MAX,
> +};
> +
> +#define ARM_SPE_AUXTRACE_PRIV_SIZE (ARM_SPE_AUXTRACE_PRIV_MAX * sizeof(u64))
> +
> +struct auxtrace_record;
> +struct perf_tool;

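The forward declarations of struct auxtrace_record and struct perf_tool
look unnecessary in this header: struct perf_tool is not referenced at
all, and struct auxtrace_record appears only as a return type, which
does not need a prior declaration.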
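
For illustration, a minimal sketch of when a header needs a forward
declaration at all. The demo_* names are hypothetical and are not taken
from the patch: a struct tag that first appears in a parameter list
needs an earlier declaration, while one that appears only as a return
type at file scope does not.

```c
/*
 * Needed: without this, the tag would first appear inside the parameter
 * list below and would only have prototype scope.
 */
struct demo_pmu;

/*
 * No forward declaration of struct demo_record is required: naming it as
 * the return type here introduces the (incomplete) type at file scope.
 */
struct demo_record *demo_recording_init(int *err, struct demo_pmu *pmu);

/* Full definitions, as they would appear in some other header. */
struct demo_pmu {
	int type;
};

struct demo_record {
	int pmu_type;
};

/* Hypothetical implementation, just enough to exercise the prototype. */
struct demo_record *demo_recording_init(int *err, struct demo_pmu *pmu)
{
	static struct demo_record rec;

	rec.pmu_type = pmu->type;
	*err = 0;
	return &rec;
}
```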

> +union perf_event;
> +struct perf_session;
> +struct perf_pmu;
> +
> +struct auxtrace_record *arm_spe_recording_init(int *err,
> +                                            struct perf_pmu *arm_spe_pmu);
> +
> +int arm_spe_process_auxtrace_info(union perf_event *event,
> +                               struct perf_session *session);
> +
> +struct perf_event_attr *arm_spe_pmu_default_config(struct perf_pmu *arm_spe_pmu);
> +#endif
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index a33491416400..f682f7a58a02 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -57,6 +57,7 @@
>  
>  #include "intel-pt.h"
>  #include "intel-bts.h"
> +#include "arm-spe.h"
>  
>  #include "sane_ctype.h"
>  #include "symbol/kallsyms.h"
> @@ -913,6 +914,8 @@ int perf_event__process_auxtrace_info(struct perf_tool *tool __maybe_unused,
>               return intel_pt_process_auxtrace_info(event, session);
>       case PERF_AUXTRACE_INTEL_BTS:
>               return intel_bts_process_auxtrace_info(event, session);
> +     case PERF_AUXTRACE_ARM_SPE:
> +             return arm_spe_process_auxtrace_info(event, session);
>       case PERF_AUXTRACE_CS_ETM:
>       case PERF_AUXTRACE_UNKNOWN:
>       default:
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index d19e11b68de7..453c148d2158 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -43,6 +43,7 @@ enum auxtrace_type {
>       PERF_AUXTRACE_INTEL_PT,
>       PERF_AUXTRACE_INTEL_BTS,
>       PERF_AUXTRACE_CS_ETM,
> +     PERF_AUXTRACE_ARM_SPE,
>  };
>  
>  enum itrace_period_type {
> 
