From: John Jacques <john.jacq...@intel.com>

repository: https://github.com/Linaro/OpenCSD.git
branch: perf-opencsd-4.9
tip: commit e38cbdb2928a36c5a ("cs-etm: Update to perf cs-etm decoder for OpenCSD v0.5")
Signed-off-by: John Jacques <john.jacq...@intel.com>
---
 Documentation/trace/coresight.txt                  |  138 +-
 drivers/hwtracing/coresight/coresight-etm-perf.c   |   31 +-
 drivers/hwtracing/coresight/coresight-etm.h        |    5 +
 .../hwtracing/coresight/coresight-etm3x-sysfs.c    |   12 +-
 drivers/hwtracing/coresight/coresight-priv.h       |    4 +-
 drivers/hwtracing/coresight/coresight-stm.c        |    9 +-
 drivers/hwtracing/coresight/coresight-tmc-etf.c    |   48 +-
 drivers/hwtracing/coresight/coresight-tmc-etr.c    |  286 +++-
 drivers/hwtracing/coresight/coresight-tmc.h        |    2 +-
 drivers/hwtracing/coresight/coresight.c            |   62 +-
 tools/perf/Makefile.config                         |   18 +
 tools/perf/Makefile.perf                           |    3 +
 tools/perf/arch/arm/util/cs-etm.c                  |    2 -
 tools/perf/builtin-script.c                        |    3 +-
 tools/perf/scripts/python/cs-trace-disasm.py       |  134 ++
 tools/perf/scripts/python/cs-trace-ranges.py       |   44 +
 tools/perf/util/Build                              |    2 +
 tools/perf/util/auxtrace.c                         |    2 +
 tools/perf/util/cs-etm-decoder/Build               |    6 +
 .../perf/util/cs-etm-decoder/cs-etm-decoder-stub.c |   99 ++
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c    |  527 +++++++
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h    |  117 ++
 tools/perf/util/cs-etm.c                           | 1501 ++++++++++++++++++++
 tools/perf/util/cs-etm.h                           |   10 +
 tools/perf/util/machine.c                          |   46 +-
 .../util/scripting-engines/trace-event-python.c    |    2 +
 tools/perf/util/symbol-minimal.c                   |    3 +-
 27 files changed, 3001 insertions(+), 115 deletions(-)
 create mode 100644 tools/perf/scripts/python/cs-trace-disasm.py
 create mode 100644 tools/perf/scripts/python/cs-trace-ranges.py
 create mode 100644 tools/perf/util/cs-etm-decoder/Build
 create mode 100644 tools/perf/util/cs-etm-decoder/cs-etm-decoder-stub.c
 create mode 100644 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
 create mode 100644 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
 create mode 100644 tools/perf/util/cs-etm.c

diff --git a/Documentation/trace/coresight.txt b/Documentation/trace/coresight.txt
index a33c88c..a2e7ccb 100644
--- a/Documentation/trace/coresight.txt
+++ b/Documentation/trace/coresight.txt
@@ -20,13 +20,13 @@ Components are generally categorised as source, link and sinks and are
 
 "Sources" generate a compressed stream representing the processor instruction
 path based on tracing scenarios as configured by users. From there the stream
-flows through the coresight system (via ATB bus) using links that are connecting
-the emanating source to a sink(s). Sinks serve as endpoints to the coresight
+flows through the Coresight system (via ATB bus) using links that are connecting
+the emanating source to a sink(s). Sinks serve as endpoints to the Coresight
 implementation, either storing the compressed stream in a memory buffer or
 creating an interface to the outside world where data can be transferred to a
-host without fear of filling up the onboard coresight memory buffer.
+host without fear of filling up the onboard Coresight memory buffer.
 
-At typical coresight system would look like this:
+A typical Coresight system would look like this:
 
 *****************************************************************
 **************************** AMBA AXI ****************************===||
@@ -83,8 +83,8 @@ While on target configuration of the components is done via the APB bus,
 all trace data are carried out-of-band on the ATB bus. The CTM provides
 a way to aggregate and distribute signals between CoreSight components.
 
-The coresight framework provides a central point to represent, configure and
-manage coresight devices on a platform. This first implementation centers on
+The Coresight framework provides a central point to represent, configure and
+manage Coresight devices on a platform. This first implementation centers on
 the basic tracing functionality, enabling components such ETM/PTM, funnel,
 replicator, TMC, TPIU and ETB. Future work will enable more intricate IP
 blocks such as STM and CTI.
@@ -129,11 +129,11 @@ expected to be added as the solution matures.
 Framework and implementation
 ----------------------------
 
-The coresight framework provides a central point to represent, configure and
-manage coresight devices on a platform. Any coresight compliant device can
+The Coresight framework provides a central point to represent, configure and
+manage Coresight devices on a platform. Any Coresight compliant device can
 register with the framework for as long as they use the right APIs:
 
 struct coresight_device *coresight_register(struct coresight_desc *desc);
 void coresight_unregister(struct coresight_device *csdev);
 
 The registering function is taking a "struct coresight_device *csdev" and
@@ -193,10 +193,120 @@ the information carried in "THIS_MODULE".
 How to use the tracer modules
 -----------------------------
 
-Before trace collection can start, a coresight sink needs to be identify.
-There is no limit on the amount of sinks (nor sources) that can be enabled at
-any given moment. As a generic operation, all device pertaining to the sink
-class will have an "active" entry in sysfs:
+There are two ways to use the Coresight framework: 1) using the perf cmd line
+tool and 2) interacting directly with the Coresight devices using the sysFS
+interface. The latter will slowly be faded out as more functionality becomes
+available from the perf cmd line tool but for the time being both are still
+supported. The following sections provide details on using both methods.
+
+1) Using perf framework:
+
+Coresight tracers like ETM and PTM are represented using the Perf framework's
+Performance Monitoring Unit (PMU). As such the perf framework takes charge of
+controlling when tracing happens based on when the process(es) of interest are
+scheduled. When configured in a system, Coresight PMUs will be listed when
+queried by the perf command line tool:
+
+linaro@linaro-nano:~$ ./perf list pmu
+
+List of pre-defined events (to be used in -e):
+
+ cs_etm// [Kernel PMU event]
+
+linaro@linaro-nano:~$
+
+Regardless of the number of ETM/PTM IP blocks in a system (usually equal to the
+number of processor cores), the "cs_etm" PMU will be listed only once.
+
+Before a trace can be configured and started a Coresight sink needs to be
+selected using the sysFS method (see below). This is only temporary until
+sink selection can be made from the command line tool.
+
+linaro@linaro-nano:~$ ls /sys/bus/coresight/devices
+20010000.etb 20030000.tpiu 20040000.funnel 2201c000.ptm
+2201d000.ptm 2203c000.etm 2203d000.etm 2203e000.etm replicator
+
+linaro@linaro-nano:~$ echo 1 > /sys/bus/coresight/devices/20010000.etb/enable_sink
+
+Once a sink has been selected configuring a Coresight PMU works the same way as
+any other PMU. As such tracing can happen for a single CPU, a group of CPUs, per
+thread or a combination of those:
+
+linaro@linaro-nano:~$ perf record -e cs_etm// --per-thread <command>
+
+linaro@linaro-nano:~$ perf record -C 0,2-3 -e cs_etm// <command>
+
+Tracing limited to user or kernel space can also be used to narrow the amount
+of collected traces:
+
+linaro@linaro-nano:~$ perf record -e cs_etm//u --per-thread <command>
+
+linaro@linaro-nano:~$ perf record -C 0,2-3 -e cs_etm//k <command>
+
+As of this writing two ETM/PTM specific options are available: cycle
+accurate and timestamp (please refer to the Embedded Trace Macrocell reference
+manual for details on these options). By default both are disabled but using
+the "cycacc" and "timestamp" mnemonics within the double '/' will see those
+options configured for the upcoming trace run:
+
+linaro@linaro-nano:~$ perf record -e cs_etm/cycacc/ --per-thread <command>
+
+linaro@linaro-nano:~$ perf record -C 0,2-3 -e cs_etm/cycacc,timestamp/ <command>
+
+The Coresight PMUs can be configured to work in "full trace" or "snapshot" mode.
+In full trace mode trace acquisition is enabled from beginning to end with trace
+data being recorded continuously:
+
+linaro@linaro-nano:~$ perf record -e cs_etm// dd if=/dev/random of=./test.txt bs=1k count=1000
+
+Since this can lead to a significant amount of data and because some devices are
+limited in disk space snapshot mode can be used instead. In snapshot mode
+traces are still collected in the ring buffer but not communicated to user
+space. The ring buffer is allowed to wrap around, providing the latest
+information before an event of interest happens. Significant events are
+communicated by sending a USR2 signal to the user space command line tool.
+From there the tool will stop trace collection and harvest data from the ring
+buffer before re-enabling traces. Snapshot mode can be invoked using '-S' when
+launching a trace collection:
+
+linaro@linaro-nano:~$ perf record -S -e cs_etm// dd if=/dev/random of=./test.txt bs=1k count=1000
+
+Trace data collected during trace runs ends up in the "perf.data" file. Trace
+configuration information necessary for trace decoding is also embedded in the
+"perf.data" file. Two new headers, 'PERF_RECORD_AUXTRACE_INFO' and
+'PERF_RECORD_AUXTRACE' have been added to the list of event types in order to
+find out where the different sections start.
+
+It is worth noting that a set of metadata information exists for each tracer
+that participated in a trace run. As such if 5 processors have been engaged,
+5 sets of metadata will be found in the perf.data file. This is to ensure that
+tracer decompression tools have all the information they need in order to
+process the trace data.
+
+Metadata information is collected directly from the ETM/PTM management registers
+using the sysFS interface. Since there is no way for the perf command line
+tool to associate a CPU with a tracer, a symbolic link has been created between
+the cs_etm sysFS event directory and each Coresight tracer:
+
+linaro@linaro-nano:~$ ls /sys/bus/event_source/devices/cs_etm
+cpu0 cpu1 cpu2 cpu3 cpu4 format perf_event_mux_interval_ms
+power subsystem type uevent
+
+linaro@linaro-nano:~$ ls /sys/bus/event_source/devices/cs_etm/cpu0/mgmt/
+etmccer etmccr etmcr etmidr etmscr etmtecr1 etmtecr2
+etmteevr etmtraceidr etmtssvr
+
+2) Using the sysFS interface:
+
+Most, if not all, configuration registers are made available to users via the
+sysFS interface. Until all Coresight ETM drivers have been converted to perf,
+it will also be possible to start and stop traces from sysFS.
+
+As with the perf method described above, a Coresight sink needs to be identified
+before trace collection can commence. Using the sysFS method _only_, there is
+no limit on the number of sinks (nor sources) that can be enabled at
+any given moment. As a generic operation, all devices pertaining to the sink
+class will have an "enable_sink" entry in sysfs:
 
 root:/sys/bus/coresight/devices# ls
 replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm
@@ -246,7 +356,7 @@ The file cstrace.bin can be decompressed using "ptm2human", DS-5 or Trace32.
 
 Following is a DS-5 output of an experimental loop that increments a variable up
 to a certain value. The example is simple and yet provides a glimpse of the
-wealth of possibilities that coresight provides.
+wealth of possibilities that Coresight provides.
Info Tracing enabled Instruction 106378866 0x8026B53C E52DE004 false PUSH {lr} diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 2cd7c71..1774196 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -202,6 +202,21 @@ static void *etm_setup_aux(int event_cpu, void **pages, if (!event_data) return NULL; + /* + * In theory nothing prevent tracers in a trace session from being + * associated with different sinks, nor having a sink per tracer. But + * until we have HW with this kind of topology we need to assume tracers + * in a trace session are using the same sink. Therefore go through + * the coresight bus and pick the first enabled sink. + * + * When operated from sysFS users are responsible to enable the sink + * while from perf, the perf tools will do it based on the choice made + * on the cmd line. As such the "enable_sink" flag in sysFS is reset. + */ + sink = coresight_get_enabled_sink(true); + if (!sink) + goto err; + INIT_WORK(&event_data->work, free_event_data); mask = &event_data->mask; @@ -219,25 +234,11 @@ static void *etm_setup_aux(int event_cpu, void **pages, * list of devices from source to sink that can be * referenced later when the path is actually needed. */ - event_data->path[cpu] = coresight_build_path(csdev); + event_data->path[cpu] = coresight_build_path(csdev, sink); if (IS_ERR(event_data->path[cpu])) goto err; } - /* - * In theory nothing prevent tracers in a trace session from being - * associated with different sinks, nor having a sink per tracer. But - * until we have HW with this kind of topology and a way to convey - * sink assignement from the perf cmd line we need to assume tracers - * in a trace session are using the same sink. Therefore pick the sink - * found at the end of the first available path. 
- */ - cpu = cpumask_first(mask); - /* Grab the sink at the end of the path */ - sink = coresight_get_sink(event_data->path[cpu]); - if (!sink) - goto err; - if (!sink_ops(sink)->alloc_buffer) goto err; diff --git a/drivers/hwtracing/coresight/coresight-etm.h b/drivers/hwtracing/coresight/coresight-etm.h index 4a18ee4..ad063d7 100644 --- a/drivers/hwtracing/coresight/coresight-etm.h +++ b/drivers/hwtracing/coresight/coresight-etm.h @@ -89,11 +89,13 @@ /* ETMCR - 0x00 */ #define ETMCR_PWD_DWN BIT(0) #define ETMCR_STALL_MODE BIT(7) +#define ETMCR_BRANCH_BROADCAST BIT(8) #define ETMCR_ETM_PRG BIT(10) #define ETMCR_ETM_EN BIT(11) #define ETMCR_CYC_ACC BIT(12) #define ETMCR_CTXID_SIZE (BIT(14)|BIT(15)) #define ETMCR_TIMESTAMP_EN BIT(28) +#define ETMCR_RETURN_STACK BIT(29) /* ETMCCR - 0x04 */ #define ETMCCR_FIFOFULL BIT(23) /* ETMPDCR - 0x310 */ @@ -110,8 +112,11 @@ #define ETM_MODE_STALL BIT(2) #define ETM_MODE_TIMESTAMP BIT(3) #define ETM_MODE_CTXID BIT(4) +#define ETM_MODE_BBROAD BIT(5) +#define ETM_MODE_RET_STACK BIT(6) #define ETM_MODE_ALL (ETM_MODE_EXCLUDE | ETM_MODE_CYCACC | \ ETM_MODE_STALL | ETM_MODE_TIMESTAMP | \ + ETM_MODE_BBROAD | ETM_MODE_RET_STACK | \ ETM_MODE_CTXID | ETM_MODE_EXCL_KERN | \ ETM_MODE_EXCL_USER) diff --git a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c index e9b0719..ca98ad1 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c @@ -146,7 +146,7 @@ static ssize_t mode_store(struct device *dev, goto err_unlock; } config->ctrl |= ETMCR_STALL_MODE; - } else + } else config->ctrl &= ~ETMCR_STALL_MODE; if (config->mode & ETM_MODE_TIMESTAMP) { @@ -164,6 +164,16 @@ static ssize_t mode_store(struct device *dev, else config->ctrl &= ~ETMCR_CTXID_SIZE; + if (config->mode & ETM_MODE_BBROAD) + config->ctrl |= ETMCR_BRANCH_BROADCAST; + else + config->ctrl &= ~ETMCR_BRANCH_BROADCAST; + + if (config->mode & ETM_MODE_RET_STACK) + 
config->ctrl |= ETMCR_RETURN_STACK; + else + config->ctrl &= ~ETMCR_RETURN_STACK; + if (config->mode & (ETM_MODE_EXCL_KERN | ETM_MODE_EXCL_USER)) etm_config_trace_mode(config); diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index 196a14b..ef9d8e9 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -111,7 +111,9 @@ static inline void CS_UNLOCK(void __iomem *addr) void coresight_disable_path(struct list_head *path); int coresight_enable_path(struct list_head *path, u32 mode); struct coresight_device *coresight_get_sink(struct list_head *path); -struct list_head *coresight_build_path(struct coresight_device *csdev); +struct coresight_device *coresight_get_enabled_sink(bool reset); +struct list_head *coresight_build_path(struct coresight_device *csdev, + struct coresight_device *sink); void coresight_release_path(struct list_head *path); #ifdef CONFIG_CORESIGHT_SOURCE_ETM3X diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c index 8e79056..2d16260 100644 --- a/drivers/hwtracing/coresight/coresight-stm.c +++ b/drivers/hwtracing/coresight/coresight-stm.c @@ -419,10 +419,10 @@ static ssize_t stm_generic_packet(struct stm_data *stm_data, struct stm_drvdata, stm); if (!(drvdata && local_read(&drvdata->mode))) - return 0; + return -EACCES; if (channel >= drvdata->numsp) - return 0; + return -EINVAL; ch_addr = (unsigned long)stm_channel_addr(drvdata, channel); @@ -920,6 +920,11 @@ static struct amba_id stm_ids[] = { .mask = 0x0003ffff, .data = "STM32", }, + { + .id = 0x0003b963, + .mask = 0x0003ffff, + .data = "STM500", + }, { 0, 0}, }; diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index d6941ea..1549436 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -70,7 +70,7 @@ static void 
tmc_etb_disable_hw(struct tmc_drvdata *drvdata) * When operating in sysFS mode the content of the buffer needs to be * read before the TMC is disabled. */ - if (local_read(&drvdata->mode) == CS_MODE_SYSFS) + if (drvdata->mode == CS_MODE_SYSFS) tmc_etb_dump_hw(drvdata); tmc_disable_hw(drvdata); @@ -103,19 +103,14 @@ static void tmc_etf_disable_hw(struct tmc_drvdata *drvdata) CS_LOCK(drvdata->base); } -static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev, u32 mode) +static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev) { int ret = 0; bool used = false; char *buf = NULL; - long val; unsigned long flags; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - /* This shouldn't be happening */ - if (WARN_ON(mode != CS_MODE_SYSFS)) - return -EINVAL; - /* * If we don't have a buffer release the lock and allocate memory. * Otherwise keep the lock and move along. @@ -138,13 +133,12 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev, u32 mode) goto out; } - val = local_xchg(&drvdata->mode, mode); /* * In sysFS mode we can have multiple writers per sink. Since this * sink is already enabled no memory is needed and the HW need not be * touched. 
*/ - if (val == CS_MODE_SYSFS) + if (drvdata->mode == CS_MODE_SYSFS) goto out; /* @@ -163,6 +157,7 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev, u32 mode) drvdata->buf = buf; } + drvdata->mode = CS_MODE_SYSFS; tmc_etb_enable_hw(drvdata); out: spin_unlock_irqrestore(&drvdata->spinlock, flags); @@ -177,34 +172,29 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev, u32 mode) return ret; } -static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, u32 mode) +static int tmc_enable_etf_sink_perf(struct coresight_device *csdev) { int ret = 0; - long val; unsigned long flags; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - /* This shouldn't be happening */ - if (WARN_ON(mode != CS_MODE_PERF)) - return -EINVAL; - spin_lock_irqsave(&drvdata->spinlock, flags); if (drvdata->reading) { ret = -EINVAL; goto out; } - val = local_xchg(&drvdata->mode, mode); /* * In Perf mode there can be only one writer per sink. There * is also no need to continue if the ETB/ETR is already operated * from sysFS. 
*/ - if (val != CS_MODE_DISABLED) { + if (drvdata->mode != CS_MODE_DISABLED) { ret = -EINVAL; goto out; } + drvdata->mode = CS_MODE_PERF; tmc_etb_enable_hw(drvdata); out: spin_unlock_irqrestore(&drvdata->spinlock, flags); @@ -216,9 +206,9 @@ static int tmc_enable_etf_sink(struct coresight_device *csdev, u32 mode) { switch (mode) { case CS_MODE_SYSFS: - return tmc_enable_etf_sink_sysfs(csdev, mode); + return tmc_enable_etf_sink_sysfs(csdev); case CS_MODE_PERF: - return tmc_enable_etf_sink_perf(csdev, mode); + return tmc_enable_etf_sink_perf(csdev); } /* We shouldn't be here */ @@ -227,7 +217,6 @@ static int tmc_enable_etf_sink(struct coresight_device *csdev, u32 mode) static void tmc_disable_etf_sink(struct coresight_device *csdev) { - long val; unsigned long flags; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); @@ -237,10 +226,11 @@ static void tmc_disable_etf_sink(struct coresight_device *csdev) return; } - val = local_xchg(&drvdata->mode, CS_MODE_DISABLED); /* Disable the TMC only if it needs to */ - if (val != CS_MODE_DISABLED) + if (drvdata->mode != CS_MODE_DISABLED) { tmc_etb_disable_hw(drvdata); + drvdata->mode = CS_MODE_DISABLED; + } spin_unlock_irqrestore(&drvdata->spinlock, flags); @@ -260,7 +250,7 @@ static int tmc_enable_etf_link(struct coresight_device *csdev, } tmc_etf_enable_hw(drvdata); - local_set(&drvdata->mode, CS_MODE_SYSFS); + drvdata->mode = CS_MODE_SYSFS; spin_unlock_irqrestore(&drvdata->spinlock, flags); dev_info(drvdata->dev, "TMC-ETF enabled\n"); @@ -280,7 +270,7 @@ static void tmc_disable_etf_link(struct coresight_device *csdev, } tmc_etf_disable_hw(drvdata); - local_set(&drvdata->mode, CS_MODE_DISABLED); + drvdata->mode = CS_MODE_DISABLED; spin_unlock_irqrestore(&drvdata->spinlock, flags); dev_info(drvdata->dev, "TMC disabled\n"); @@ -383,7 +373,7 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev, return; /* This shouldn't happen */ - if (WARN_ON_ONCE(local_read(&drvdata->mode) != CS_MODE_PERF)) + if 
(WARN_ON_ONCE(drvdata->mode != CS_MODE_PERF)) return; CS_UNLOCK(drvdata->base); @@ -504,7 +494,6 @@ const struct coresight_ops tmc_etf_cs_ops = { int tmc_read_prepare_etb(struct tmc_drvdata *drvdata) { - long val; enum tmc_mode mode; int ret = 0; unsigned long flags; @@ -528,9 +517,8 @@ int tmc_read_prepare_etb(struct tmc_drvdata *drvdata) goto out; } - val = local_read(&drvdata->mode); /* Don't interfere if operated from Perf */ - if (val == CS_MODE_PERF) { + if (drvdata->mode == CS_MODE_PERF) { ret = -EINVAL; goto out; } @@ -542,7 +530,7 @@ int tmc_read_prepare_etb(struct tmc_drvdata *drvdata) } /* Disable the TMC if need be */ - if (val == CS_MODE_SYSFS) + if (drvdata->mode == CS_MODE_SYSFS) tmc_etb_disable_hw(drvdata); drvdata->reading = true; @@ -573,7 +561,7 @@ int tmc_read_unprepare_etb(struct tmc_drvdata *drvdata) } /* Re-enable the TMC if need be */ - if (local_read(&drvdata->mode) == CS_MODE_SYSFS) { + if (drvdata->mode == CS_MODE_SYSFS) { /* * The trace run will continue with the same allocated trace * buffer. As such zero-out the buffer so that we don't end diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 886ea83..2db4857 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -15,11 +15,30 @@ * this program. If not, see <http://www.gnu.org/licenses/>. 
*/ +#include <linux/circ_buf.h> #include <linux/coresight.h> #include <linux/dma-mapping.h> +#include <linux/slab.h> + #include "coresight-priv.h" #include "coresight-tmc.h" +/** + * struct cs_etr_buffer - keep track of a recording session' specifics + * @tmc: generic portion of the TMC buffers + * @paddr: the physical address of a DMA'able contiguous memory area + * @vaddr: the virtual address associated to @paddr + * @size: how much memory we have, starting at @paddr + * @dev: the device @vaddr has been tied to + */ +struct cs_etr_buffers { + struct cs_buffers tmc; + dma_addr_t paddr; + void __iomem *vaddr; + u32 size; + struct device *dev; +}; + static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata) { u32 axictl; @@ -86,26 +105,22 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata) * When operating in sysFS mode the content of the buffer needs to be * read before the TMC is disabled. */ - if (local_read(&drvdata->mode) == CS_MODE_SYSFS) + if (drvdata->mode == CS_MODE_SYSFS) tmc_etr_dump_hw(drvdata); tmc_disable_hw(drvdata); CS_LOCK(drvdata->base); } -static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev, u32 mode) +static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) { int ret = 0; bool used = false; - long val; unsigned long flags; void __iomem *vaddr = NULL; dma_addr_t paddr; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - /* This shouldn't be happening */ - if (WARN_ON(mode != CS_MODE_SYSFS)) - return -EINVAL; /* * If we don't have a buffer release the lock and allocate memory. @@ -134,13 +149,12 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev, u32 mode) goto out; } - val = local_xchg(&drvdata->mode, mode); /* * In sysFS mode we can have multiple writers per sink. Since this * sink is already enabled no memory is needed and the HW need not be * touched. 
*/ - if (val == CS_MODE_SYSFS) + if (drvdata->mode == CS_MODE_SYSFS) goto out; /* @@ -155,8 +169,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev, u32 mode) drvdata->buf = drvdata->vaddr; } - memset(drvdata->vaddr, 0, drvdata->size); - + drvdata->mode = CS_MODE_SYSFS; tmc_etr_enable_hw(drvdata); out: spin_unlock_irqrestore(&drvdata->spinlock, flags); @@ -171,34 +184,29 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev, u32 mode) return ret; } -static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, u32 mode) +static int tmc_enable_etr_sink_perf(struct coresight_device *csdev) { int ret = 0; - long val; unsigned long flags; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - /* This shouldn't be happening */ - if (WARN_ON(mode != CS_MODE_PERF)) - return -EINVAL; - spin_lock_irqsave(&drvdata->spinlock, flags); if (drvdata->reading) { ret = -EINVAL; goto out; } - val = local_xchg(&drvdata->mode, mode); /* * In Perf mode there can be only one writer per sink. There * is also no need to continue if the ETR is already operated * from sysFS. 
*/ - if (val != CS_MODE_DISABLED) { + if (drvdata->mode != CS_MODE_DISABLED) { ret = -EINVAL; goto out; } + drvdata->mode = CS_MODE_PERF; tmc_etr_enable_hw(drvdata); out: spin_unlock_irqrestore(&drvdata->spinlock, flags); @@ -210,9 +218,9 @@ static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode) { switch (mode) { case CS_MODE_SYSFS: - return tmc_enable_etr_sink_sysfs(csdev, mode); + return tmc_enable_etr_sink_sysfs(csdev); case CS_MODE_PERF: - return tmc_enable_etr_sink_perf(csdev, mode); + return tmc_enable_etr_sink_perf(csdev); } /* We shouldn't be here */ @@ -221,7 +229,6 @@ static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode) static void tmc_disable_etr_sink(struct coresight_device *csdev) { - long val; unsigned long flags; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); @@ -231,19 +238,244 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev) return; } - val = local_xchg(&drvdata->mode, CS_MODE_DISABLED); /* Disable the TMC only if it needs to */ - if (val != CS_MODE_DISABLED) + if (drvdata->mode != CS_MODE_DISABLED) { tmc_etr_disable_hw(drvdata); + drvdata->mode = CS_MODE_DISABLED; + } spin_unlock_irqrestore(&drvdata->spinlock, flags); dev_info(drvdata->dev, "TMC-ETR disabled\n"); } +static void *tmc_alloc_etr_buffer(struct coresight_device *csdev, int cpu, + void **pages, int nr_pages, bool overwrite) +{ + int node; + struct cs_etr_buffers *buf; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + if (cpu == -1) + cpu = smp_processor_id(); + node = cpu_to_node(cpu); + + /* Allocate memory structure for interaction with Perf */ + buf = kzalloc_node(sizeof(struct cs_etr_buffers), GFP_KERNEL, node); + if (!buf) + return NULL; + + buf->dev = drvdata->dev; + buf->size = drvdata->size; + buf->vaddr = dma_alloc_coherent(buf->dev, buf->size, + &buf->paddr, GFP_KERNEL); + if (!buf->vaddr) { + kfree(buf); + return NULL; + } + + buf->tmc.snapshot = overwrite; + buf->tmc.nr_pages = 
nr_pages; + buf->tmc.data_pages = pages; + + return buf; +} + +static void tmc_free_etr_buffer(void *config) +{ + struct cs_etr_buffers *buf = config; + + dma_free_coherent(buf->dev, buf->size, buf->vaddr, buf->paddr); + kfree(buf); +} + +static int tmc_set_etr_buffer(struct coresight_device *csdev, + struct perf_output_handle *handle, + void *sink_config) +{ + int ret = 0; + unsigned long head; + struct cs_etr_buffers *buf = sink_config; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + /* wrap head around to the amount of space we have */ + head = handle->head & ((buf->tmc.nr_pages << PAGE_SHIFT) - 1); + + /* find the page to write to */ + buf->tmc.cur = head / PAGE_SIZE; + + /* and offset within that page */ + buf->tmc.offset = head % PAGE_SIZE; + + local_set(&buf->tmc.data_size, 0); + + /* Tell the HW where to put the trace data */ + drvdata->vaddr = buf->vaddr; + drvdata->paddr = buf->paddr; + memset(drvdata->vaddr, 0, drvdata->size); + + return ret; +} + +static unsigned long tmc_reset_etr_buffer(struct coresight_device *csdev, + struct perf_output_handle *handle, + void *sink_config, bool *lost) +{ + long size = 0; + struct cs_etr_buffers *buf = sink_config; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + if (buf) { + /* + * In snapshot mode ->data_size holds the new address of the + * ring buffer's head. The size itself is the whole address + * range since we want the latest information. + */ + if (buf->tmc.snapshot) { + size = buf->tmc.nr_pages << PAGE_SHIFT; + handle->head = local_xchg(&buf->tmc.data_size, size); + } + + /* + * Tell the tracer PMU how much we got in this run and if + * something went wrong along the way. Nobody else can use + * this cs_etr_buffers instance until we are done. As such + * resetting parameters here and squaring off with the ring + * buffer API in the tracer PMU is fine. 
+ */ + *lost = !!local_xchg(&buf->tmc.lost, 0); + size = local_xchg(&buf->tmc.data_size, 0); + } + + /* Get ready for another run */ + drvdata->vaddr = NULL; + drvdata->paddr = 0; + + return size; +} + +static void tmc_update_etr_buffer(struct coresight_device *csdev, + struct perf_output_handle *handle, + void *sink_config) +{ + int i, cur; + u32 *buf_ptr; + u32 read_ptr, write_ptr; + u32 status, to_read; + unsigned long offset; + struct cs_buffers *buf = sink_config; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + if (!buf) + return; + + /* This shouldn't happen */ + if (WARN_ON_ONCE(drvdata->mode != CS_MODE_PERF)) + return; + + CS_UNLOCK(drvdata->base); + + tmc_flush_and_stop(drvdata); + + read_ptr = readl_relaxed(drvdata->base + TMC_RRP); + write_ptr = readl_relaxed(drvdata->base + TMC_RWP); + + /* + * Get a hold of the status register and see if a wrap around + * has occurred. If so adjust things accordingly. + */ + status = readl_relaxed(drvdata->base + TMC_STS); + if (status & TMC_STS_FULL) { + local_inc(&buf->lost); + to_read = drvdata->size; + } else { + to_read = CIRC_CNT(write_ptr, read_ptr, drvdata->size); + } + + /* + * The TMC RAM buffer may be bigger than the space available in the + * perf ring buffer (handle->size). If so advance the RRP so that we + * get the latest trace data. + */ + if (to_read > handle->size) { + u32 buffer_start, mask = 0; + + /* Read buffer start address in system memory */ + buffer_start = readl_relaxed(drvdata->base + TMC_DBALO); + + /* + * The value written to RRP must be byte-address aligned to + * the width of the trace memory databus _and_ to a frame + * boundary (16 byte), whichever is the biggest. For example, + * for 32-bit, 64-bit and 128-bit wide trace memory, the four + * LSBs must be 0s. For 256-bit wide trace memory, the five + * LSBs must be 0s. 
+ */ + switch (drvdata->memwidth) { + case TMC_MEM_INTF_WIDTH_32BITS: + case TMC_MEM_INTF_WIDTH_64BITS: + case TMC_MEM_INTF_WIDTH_128BITS: + mask = GENMASK(31, 5); + break; + case TMC_MEM_INTF_WIDTH_256BITS: + mask = GENMASK(31, 6); + break; + } + + /* + * Make sure the new size is aligned in accordance with the + * requirement explained above. + */ + to_read = handle->size & mask; + /* Move the RAM read pointer up */ + read_ptr = (write_ptr + drvdata->size) - to_read; + /* Make sure we are still within our limits */ + if (read_ptr > (buffer_start + (drvdata->size - 1))) + read_ptr -= drvdata->size; + /* Tell the HW */ + writel_relaxed(read_ptr, drvdata->base + TMC_RRP); + local_inc(&buf->lost); + } + + cur = buf->cur; + offset = buf->offset; + + /* for every byte to read */ + for (i = 0; i < to_read; i += 4) { + buf_ptr = buf->data_pages[cur] + offset; + *buf_ptr = readl_relaxed(drvdata->base + TMC_RRD); + + offset += 4; + if (offset >= PAGE_SIZE) { + offset = 0; + cur++; + /* wrap around at the end of the buffer */ + cur &= buf->nr_pages - 1; + } + } + + /* + * In snapshot mode all we have to do is communicate to + * perf_aux_output_end() the address of the current head. In full + * trace mode the same function expects a size to move rb->aux_head + * forward. 
+ */ + if (buf->snapshot) + local_set(&buf->data_size, (cur * PAGE_SIZE) + offset); + else + local_add(to_read, &buf->data_size); + + CS_LOCK(drvdata->base); +} + static const struct coresight_ops_sink tmc_etr_sink_ops = { .enable = tmc_enable_etr_sink, .disable = tmc_disable_etr_sink, + .alloc_buffer = tmc_alloc_etr_buffer, + .free_buffer = tmc_free_etr_buffer, + .set_buffer = tmc_set_etr_buffer, + .reset_buffer = tmc_reset_etr_buffer, + .update_buffer = tmc_update_etr_buffer, }; const struct coresight_ops tmc_etr_cs_ops = { @@ -253,7 +485,6 @@ const struct coresight_ops tmc_etr_cs_ops = { int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) { int ret = 0; - long val; unsigned long flags; /* config types are set a boot time and never change */ @@ -266,9 +497,8 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) goto out; } - val = local_read(&drvdata->mode); /* Don't interfere if operated from Perf */ - if (val == CS_MODE_PERF) { + if (drvdata->mode == CS_MODE_PERF) { ret = -EINVAL; goto out; } @@ -280,7 +510,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) } /* Disable the TMC if need be */ - if (val == CS_MODE_SYSFS) + if (drvdata->mode == CS_MODE_SYSFS) tmc_etr_disable_hw(drvdata); drvdata->reading = true; @@ -303,7 +533,7 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) spin_lock_irqsave(&drvdata->spinlock, flags); /* RE-enable the TMC if need be */ - if (local_read(&drvdata->mode) == CS_MODE_SYSFS) { + if (drvdata->mode == CS_MODE_SYSFS) { /* * The trace run will continue with the same allocated trace * buffer. 
The trace buffer is cleared in tmc_etr_enable_hw(), diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 44b3ae3..51c0185 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -117,7 +117,7 @@ struct tmc_drvdata { void __iomem *vaddr; u32 size; u32 len; - local_t mode; + u32 mode; enum tmc_config_type config_type; enum tmc_mem_intf_width memwidth; u32 trigger_cntr; diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c index 7bf00a0..40ede64 100644 --- a/drivers/hwtracing/coresight/coresight.c +++ b/drivers/hwtracing/coresight/coresight.c @@ -368,6 +368,40 @@ struct coresight_device *coresight_get_sink(struct list_head *path) return csdev; } +static int coresight_enabled_sink(struct device *dev, void *data) +{ + bool *reset = data; + struct coresight_device *csdev = to_coresight_device(dev); + + if ((csdev->type == CORESIGHT_DEV_TYPE_SINK || + csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) && + csdev->activated) { + /* + * Now that we have a handle on the sink for this session, + * disable the sysFS "enable_sink" flag so that possible + * concurrent perf session that wish to use another sink don't + * trip on it. Doing so has no ramification for the current + * session. + */ + if (*reset) + csdev->activated = false; + + return 1; + } + + return 0; +} + +struct coresight_device *coresight_get_enabled_sink(bool reset) +{ + struct device *dev = NULL; + + dev = bus_find_device(&coresight_bustype, NULL, &reset, + coresight_enabled_sink); + + return dev ? to_coresight_device(dev) : NULL; +} + /** * _coresight_build_path - recursively build a path from a @csdev to a sink. * @csdev: The device to start from. @@ -380,6 +414,7 @@ struct coresight_device *coresight_get_sink(struct list_head *path) * last one. 
*/ static int _coresight_build_path(struct coresight_device *csdev, + struct coresight_device *sink, struct list_head *path) { int i; @@ -387,15 +422,15 @@ static int _coresight_build_path(struct coresight_device *csdev, struct coresight_node *node; /* An activated sink has been found. Enqueue the element */ - if ((csdev->type == CORESIGHT_DEV_TYPE_SINK || - csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) && csdev->activated) + if (csdev == sink) goto out; /* Not a sink - recursively explore each port found on this element */ for (i = 0; i < csdev->nr_outport; i++) { struct coresight_device *child_dev = csdev->conns[i].child_dev; - if (child_dev && _coresight_build_path(child_dev, path) == 0) { + if (child_dev && + _coresight_build_path(child_dev, sink, path) == 0) { found = true; break; } @@ -422,18 +457,22 @@ static int _coresight_build_path(struct coresight_device *csdev, return 0; } -struct list_head *coresight_build_path(struct coresight_device *csdev) +struct list_head *coresight_build_path(struct coresight_device *source, + struct coresight_device *sink) { struct list_head *path; int rc; + if (!sink) + return ERR_PTR(-EINVAL); + path = kzalloc(sizeof(struct list_head), GFP_KERNEL); if (!path) return ERR_PTR(-ENOMEM); INIT_LIST_HEAD(path); - rc = _coresight_build_path(csdev, path); + rc = _coresight_build_path(source, sink, path); if (rc) { kfree(path); return ERR_PTR(rc); @@ -497,6 +536,7 @@ static int coresight_validate_source(struct coresight_device *csdev, int coresight_enable(struct coresight_device *csdev) { int cpu, ret = 0; + struct coresight_device *sink; struct list_head *path; mutex_lock(&coresight_mutex); @@ -508,7 +548,17 @@ int coresight_enable(struct coresight_device *csdev) if (csdev->enable) goto out; - path = coresight_build_path(csdev); + /* + * Search for a valid sink for this session but don't reset the + * "enable_sink" flag in sysFS. Users get to do that explicitly. 
+ */ + sink = coresight_get_enabled_sink(false); + if (!sink) { + ret = -EINVAL; + goto out; + } + + path = coresight_build_path(csdev, sink); if (IS_ERR(path)) { pr_err("building path(s) failed\n"); ret = PTR_ERR(path); diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index 4ec127b..fe465b5 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -533,6 +533,24 @@ endif grep-libs = $(filter -l%,$(1)) strip-libs = $(filter-out -l%,$(1)) +ifdef CSTRACE_PATH + ifeq (${IS_64_BIT}, 1) + CSTRACE_LNX = linux64 + else + CSTRACE_LNX = linux + endif + ifeq (${DEBUG}, 1) + LIBCSTRACE = -lcstraced_c_api -lcstraced + CSTRACE_LIB_PATH = $(CSTRACE_PATH)/lib/$(CSTRACE_LNX)/dbg + else + LIBCSTRACE = -lcstraced_c_api -lcstraced + CSTRACE_LIB_PATH = $(CSTRACE_PATH)/lib/$(CSTRACE_LNX)/rel + endif + $(call detected,CSTRACE) + $(call detected_var,CSTRACE_PATH) + EXTLIBS += -L$(CSTRACE_LIB_PATH) $(LIBCSTRACE) -lstdc++ +endif + ifdef NO_LIBPERL CFLAGS += -DNO_LIBPERL else diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index a10f064..2810a64 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -86,6 +86,9 @@ include ../scripts/utilities.mak # # Define FEATURES_DUMP to provide features detection dump file # and bypass the feature detection +# +# Define NO_CSTRACE if you do not want CoreSight trace decoding support +# # As per kernel Makefile, avoid funny character set dependencies unexport LC_ALL diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 47d584d..dfea6b6 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -575,8 +575,6 @@ static FILE *cs_device__open_file(const char *name) snprintf(path, PATH_MAX, "%s" CS_BUS_DEVICE_PATH "%s", sysfs, name); - printf("path: %s\n", path); - if (stat(path, &st) < 0) return NULL; diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index 7228d14..667f8c2 100644 --- 
a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -109,7 +109,8 @@ static struct { .fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID | PERF_OUTPUT_CPU | PERF_OUTPUT_TIME | - PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP | + PERF_OUTPUT_EVNAME | PERF_OUTPUT_ADDR | + PERF_OUTPUT_IP | PERF_OUTPUT_SYM | PERF_OUTPUT_DSO | PERF_OUTPUT_PERIOD, diff --git a/tools/perf/scripts/python/cs-trace-disasm.py b/tools/perf/scripts/python/cs-trace-disasm.py new file mode 100644 index 0000000..c370e26 --- /dev/null +++ b/tools/perf/scripts/python/cs-trace-disasm.py @@ -0,0 +1,134 @@ +# perf script event handlers, generated by perf script -g python +# Licensed under the terms of the GNU GPL License version 2 + +# The common_* event handler fields are the most useful fields common to +# all events. They don't necessarily correspond to the 'common_*' fields +# in the format files. Those fields not available as handler params can +# be retrieved using Python functions of the form common_*(context). +# See the perf-trace-python Documentation for the list of available functions. 
+ +import os +import sys + +sys.path.append(os.environ['PERF_EXEC_PATH'] + \ + '/scripts/python/Perf-Trace-Util/lib/Perf/Trace') + +from perf_trace_context import * +from subprocess import * +from Core import * +import re; + +from optparse import OptionParser + +# +# Add options to specify vmlinux file and the objdump executable +# +parser = OptionParser() +parser.add_option("-k", "--vmlinux", dest="vmlinux_name", + help="path to vmlinux file") +parser.add_option("-d", "--objdump", dest="objdump_name", + help="name of objdump executable (in path)") +(options, args) = parser.parse_args() + +if (options.objdump_name == None): + sys.exit("No objdump executable specified - use -d or --objdump option") + +# initialize global dicts and regular expression + +build_ids = dict(); +mmaps = dict(); +disasm_cache = dict(); +disasm_re = re.compile("^\s*([0-9a-fA-F]+):") + +cache_size = 16*1024 + +def trace_begin(): + cmd_output = check_output(["perf", "buildid-list"]).split('\n'); + bid_re = re.compile("([a-fA-F0-9]+)[ \t]([^ \n]+)") + for line in cmd_output: + m = bid_re.search(line) + if (m != None) : + if (m.group(2) == "[kernel.kallsyms]") : + append = "/kallsyms" + dirname = "/" + m.group(2) + elif (m.group(2) == "[vdso]") : + append = "/vdso" + dirname = "/" + m.group(2) + else: + append = "/elf" + dirname = m.group(2) + + build_ids[m.group(2)] = \ + os.environ['PERF_BUILDID_DIR'] + \ + dirname + "/" + m.group(1) + append; + + if ((options.vmlinux_name != None) and ("[kernel.kallsyms]" in build_ids)): + build_ids['[kernel.kallsyms]'] = options.vmlinux_name; + else: + build_ids.pop('[kernel.kallsyms]', None) + + mmap_re = re.compile("PERF_RECORD_MMAP2 -?[0-9]+/[0-9]+: \[(0x[0-9a-fA-F]+).*:\s.*\s(\S*)") + cmd_output= check_output("perf script --show-mmap-events | fgrep PERF_RECORD_MMAP2",shell=True).split('\n') + for line in cmd_output: + m = mmap_re.search(line) + if (m != None) : + mmaps[m.group(2)] = int(m.group(1),0) + + + +def trace_end(): + pass + +def process_event(t): + 
global cache_size + global options + + sample = t['sample'] + dso = t['dso'] + + # don't let the cache get too big, but don't bother with a fancy replacement policy + # just clear it when it hits max size + + if (len(disasm_cache) > cache_size): + disasm_cache.clear(); + + cpu = format(sample['cpu'], "d"); + addr_range = format(sample['ip'],"x") + ":" + format(sample['addr'],"x"); + + try: + disasm_output = disasm_cache[addr_range]; + except: + try: + fname = build_ids[dso]; + except KeyError: + if (dso == '[kernel.kallsyms]'): + return; + fname = dso; + + if (dso in mmaps): + offset = mmaps[dso]; + disasm = [options.objdump_name,"-d","-z", "--adjust-vma="+format(offset,"#x"),"--start-address="+format(sample['ip'],"#x"),"--stop-address="+format(sample['addr'],"#x"), fname] + else: + offset = 0 + disasm = [options.objdump_name,"-d","-z", "--start-address="+format(sample['ip'],"#x"),"--stop-address="+format(sample['addr'],"#x"),fname] + disasm_output = check_output(disasm).split('\n') + disasm_cache[addr_range] = disasm_output; + + print "FILE: %s\tCPU: %s" % (dso, cpu); + for line in disasm_output: + m = disasm_re.search(line) + if (m != None) : + try: + print "\t",line + except: + exit(1); + else: + continue; + +def trace_unhandled(event_name, context, event_fields_dict): + print ' '.join(['%s=%s'%(k,str(v))for k,v in sorted(event_fields_dict.items())]) + +def print_header(event_name, cpu, secs, nsecs, pid, comm): + print "print_header" + print "%-20s %5u %05u.%09u %8u %-20s " % \ + (event_name, cpu, secs, nsecs, pid, comm), diff --git a/tools/perf/scripts/python/cs-trace-ranges.py b/tools/perf/scripts/python/cs-trace-ranges.py new file mode 100644 index 0000000..c8edacb --- /dev/null +++ b/tools/perf/scripts/python/cs-trace-ranges.py @@ -0,0 +1,44 @@ +# +# Copyright(C) 2016 Linaro Limited. All rights reserved. 
+# Author: Tor Jeremiassen <tor.jeremias...@linaro.org> +# +# This program is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License version 2 as published by +# the Free Software Foundation. +# +# This program is distributed in the hope that it will be useful, but WITHOUT +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +# more details. +# +# You should have received a copy of the GNU General Public License along with +# this program. If not, see <http://www.gnu.org/licenses/>. +# + +import os +import sys + +sys.path.append(os.environ['PERF_EXEC_PATH'] + \ + '/scripts/python/Perf-Trace-Util/lib/Perf/Trace') + +from perf_trace_context import * + +def trace_begin(): + pass; + +def trace_end(): + pass + +def process_event(t): + + sample = t['sample'] + + print "range:",format(sample['ip'],"x"),"-",format(sample['addr'],"x") + +def trace_unhandled(event_name, context, event_fields_dict): + print ' '.join(['%s=%s'%(k,str(v))for k,v in sorted(event_fields_dict.items())]) + +def print_header(event_name, cpu, secs, nsecs, pid, comm): + print "print_header" + print "%-20s %5u %05u.%09u %8u %-20s " % \ + (event_name, cpu, secs, nsecs, pid, comm), diff --git a/tools/perf/util/Build b/tools/perf/util/Build index 1dc67ef..2bbb725 100644 --- a/tools/perf/util/Build +++ b/tools/perf/util/Build @@ -80,6 +80,8 @@ libperf-$(CONFIG_AUXTRACE) += auxtrace.o libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/ libperf-$(CONFIG_AUXTRACE) += intel-pt.o libperf-$(CONFIG_AUXTRACE) += intel-bts.o +libperf-$(CONFIG_AUXTRACE) += cs-etm.o +libperf-$(CONFIG_AUXTRACE) += cs-etm-decoder/ libperf-y += parse-branch-options.o libperf-y += parse-regs-options.o libperf-y += term.o diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index c5a6e0b..4dbd500 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -58,6 
+58,7 @@ #include "intel-pt.h" #include "intel-bts.h" +#include "cs-etm.h" int auxtrace_mmap__mmap(struct auxtrace_mmap *mm, struct auxtrace_mmap_params *mp, @@ -902,6 +903,7 @@ int perf_event__process_auxtrace_info(struct perf_tool *tool __maybe_unused, case PERF_AUXTRACE_INTEL_BTS: return intel_bts_process_auxtrace_info(event, session); case PERF_AUXTRACE_CS_ETM: + return cs_etm__process_auxtrace_info(event, session); case PERF_AUXTRACE_UNKNOWN: default: return -EINVAL; diff --git a/tools/perf/util/cs-etm-decoder/Build b/tools/perf/util/cs-etm-decoder/Build new file mode 100644 index 0000000..e097599 --- /dev/null +++ b/tools/perf/util/cs-etm-decoder/Build @@ -0,0 +1,6 @@ +ifeq ($(CSTRACE_PATH),) +libperf-$(CONFIG_AUXTRACE) += cs-etm-decoder-stub.o +else +CFLAGS_cs-etm-decoder.o += -I$(CSTRACE_PATH)/include +libperf-$(CONFIG_AUXTRACE) += cs-etm-decoder.o +endif diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder-stub.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder-stub.c new file mode 100644 index 0000000..d2ebbd2 --- /dev/null +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder-stub.c @@ -0,0 +1,99 @@ +/* + * + * Copyright(C) 2015 Linaro Limited. All rights reserved. + * Author: Tor Jeremiassen <tor.jeremias...@linaro.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General + * Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program. If not, see <http://www.gnu.org/licenses/>. 
+ */ + +#include <stdlib.h> + +#include "cs-etm-decoder.h" +#include "../util.h" + + +struct cs_etm_decoder { + void *state; + int dummy; +}; + +int cs_etm_decoder__flush(struct cs_etm_decoder *decoder) +{ + (void) decoder; + return -1; +} + +int cs_etm_decoder__add_bin_file(struct cs_etm_decoder *decoder, + uint64_t offset, + uint64_t address, + uint64_t len, + const char *fname) +{ + (void) decoder; + (void) offset; + (void) address; + (void) len; + (void) fname; + return -1; +} + +const struct cs_etm_state *cs_etm_decoder__process_data_block( + struct cs_etm_decoder *decoder, + uint64_t indx, + const uint8_t *buf, + size_t len, + size_t *consumed) +{ + (void) decoder; + (void) indx; + (void) buf; + (void) len; + (void) consumed; + return NULL; +} + +int cs_etm_decoder__add_mem_access_cb(struct cs_etm_decoder *decoder, + uint64_t address, + uint64_t len, + cs_etm_mem_cb_type cb_func) +{ + (void) decoder; + (void) address; + (void) len; + (void) cb_func; + return -1; +} + +int cs_etm_decoder__get_packet(struct cs_etm_decoder *decoder, + struct cs_etm_packet *packet) +{ + (void) decoder; + (void) packet; + return -1; +} + +struct cs_etm_decoder *cs_etm_decoder__new(uint32_t num_cpu, + struct cs_etm_decoder_params *d_params, + struct cs_etm_trace_params t_params[]) +{ + (void) num_cpu; + (void) d_params; + (void) t_params; + return NULL; +} + +void cs_etm_decoder__free(struct cs_etm_decoder *decoder) +{ + (void) decoder; + return; +} diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c new file mode 100644 index 0000000..ee2e02f --- /dev/null +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -0,0 +1,527 @@ +/* + * + * Copyright(C) 2015 Linaro Limited. All rights reserved. 
+ * Author: Tor Jeremiassen <tor.jeremias...@linaro.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General + * Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program. If not, see <http://www.gnu.org/licenses/>. + */ + +#include <linux/err.h> +#include <stdlib.h> + +#include "../cs-etm.h" +#include "cs-etm-decoder.h" +#include "../util.h" +#include "../util/intlist.h" + +#include "c_api/opencsd_c_api.h" +#include "ocsd_if_types.h" +#include "etmv4/trc_pkt_types_etmv4.h" + +#define MAX_BUFFER 1024 + +struct cs_etm_decoder { + struct cs_etm_state state; + dcd_tree_handle_t dcd_tree; + void (*packet_printer)(const char *); + cs_etm_mem_cb_type mem_access; + ocsd_datapath_resp_t prev_return; + size_t prev_processed; + bool trace_on; + bool discontinuity; + struct cs_etm_packet packet_buffer[MAX_BUFFER]; + uint32_t packet_count; + uint32_t head; + uint32_t tail; + uint32_t end_tail; +}; + +static uint32_t cs_etm_decoder__mem_access(const void *context, + const ocsd_vaddr_t address, + const ocsd_mem_space_acc_t mem_space, + const uint32_t req_size, + uint8_t *buffer) +{ + struct cs_etm_decoder *decoder = (struct cs_etm_decoder *) context; + (void) mem_space; + + return decoder->mem_access(decoder->state.data, address, req_size, buffer); +} + +static int cs_etm_decoder__gen_etmv4_config(struct cs_etm_trace_params *params, + ocsd_etmv4_cfg *config) +{ + config->reg_configr = params->reg_configr; + config->reg_traceidr = params->reg_traceidr; + config->reg_idr0 = params->reg_idr0; + config->reg_idr1 = params->reg_idr1; + config->reg_idr2 = 
params->reg_idr2; + config->reg_idr8 = params->reg_idr8; + + config->reg_idr9 = 0; + config->reg_idr10 = 0; + config->reg_idr11 = 0; + config->reg_idr12 = 0; + config->reg_idr13 = 0; + config->arch_ver = ARCH_V8; + config->core_prof = profile_CortexA; + + return 0; +} + +static int cs_etm_decoder__flush_packet(struct cs_etm_decoder *decoder) +{ + int err = 0; + + if (decoder == NULL) + return -1; + + if (decoder->packet_count >= 31) + return -1; + + if (decoder->tail != decoder->end_tail) { + decoder->tail = (decoder->tail + 1) & (MAX_BUFFER - 1); + decoder->packet_count++; + } + + return err; +} + +int cs_etm_decoder__flush(struct cs_etm_decoder *decoder) +{ + return cs_etm_decoder__flush_packet(decoder); +} + +static int cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder, + const ocsd_generic_trace_elem *elem, + const uint8_t trace_chan_id, + enum cs_etm_sample_type sample_type) +{ + int err = 0; + uint32_t et = 0; + struct int_node *inode = NULL; + + if (decoder == NULL) + return -1; + + if (decoder->packet_count >= 31) + return -1; + + err = cs_etm_decoder__flush_packet(decoder); + + if (err) + return err; + + et = decoder->end_tail; + /* Search the RB tree for the cpu associated with this traceID */ + inode = intlist__find(traceid_list, trace_chan_id); + if (!inode) + return -1; + + decoder->packet_buffer[et].sample_type = sample_type; + decoder->packet_buffer[et].start_addr = elem->st_addr; + decoder->packet_buffer[et].end_addr = elem->en_addr; + decoder->packet_buffer[et].exc = false; + decoder->packet_buffer[et].exc_ret = false; + decoder->packet_buffer[et].cpu = *((int *)inode->priv); + + et = (et + 1) & (MAX_BUFFER - 1); + + decoder->end_tail = et; + + return err; +} + +static int cs_etm_decoder__mark_exception(struct cs_etm_decoder *decoder) +{ + int err = 0; + + if (decoder == NULL) + return -1; + + decoder->packet_buffer[decoder->end_tail].exc = true; + + return err; +} + +static int cs_etm_decoder__mark_exception_return(struct 
cs_etm_decoder *decoder) +{ + int err = 0; + + if (decoder == NULL) + return -1; + + decoder->packet_buffer[decoder->end_tail].exc_ret = true; + + return err; +} + +static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( + const void *context, + const ocsd_trc_index_t indx, + const uint8_t trace_chan_id, + const ocsd_generic_trace_elem *elem) +{ + ocsd_datapath_resp_t resp = OCSD_RESP_CONT; + struct cs_etm_decoder *decoder = (struct cs_etm_decoder *) context; + + (void) indx; + (void) trace_chan_id; + + switch (elem->elem_type) { + case OCSD_GEN_TRC_ELEM_UNKNOWN: + break; + case OCSD_GEN_TRC_ELEM_NO_SYNC: + decoder->trace_on = false; + break; + case OCSD_GEN_TRC_ELEM_TRACE_ON: + decoder->trace_on = true; + break; + case OCSD_GEN_TRC_ELEM_INSTR_RANGE: + cs_etm_decoder__buffer_packet(decoder, elem, + trace_chan_id, CS_ETM_RANGE); + resp = OCSD_RESP_WAIT; + break; + case OCSD_GEN_TRC_ELEM_EXCEPTION: + cs_etm_decoder__mark_exception(decoder); + break; + case OCSD_GEN_TRC_ELEM_EXCEPTION_RET: + cs_etm_decoder__mark_exception_return(decoder); + break; + case OCSD_GEN_TRC_ELEM_PE_CONTEXT: + case OCSD_GEN_TRC_ELEM_EO_TRACE: + case OCSD_GEN_TRC_ELEM_ADDR_NACC: + case OCSD_GEN_TRC_ELEM_TIMESTAMP: + case OCSD_GEN_TRC_ELEM_CYCLE_COUNT: + case OCSD_GEN_TRC_ELEM_ADDR_UNKNOWN: + case OCSD_GEN_TRC_ELEM_EVENT: + case OCSD_GEN_TRC_ELEM_SWTRACE: + case OCSD_GEN_TRC_ELEM_CUSTOM: + default: + break; + } + + decoder->state.err = 0; + + return resp; +} + +static ocsd_datapath_resp_t cs_etm_decoder__etmv4i_packet_printer( + const void *context, + const ocsd_datapath_op_t op, + const ocsd_trc_index_t indx, + const ocsd_etmv4_i_pkt *pkt) +{ + const size_t PACKET_STR_LEN = 1024; + ocsd_datapath_resp_t ret = OCSD_RESP_CONT; + char packet_str[PACKET_STR_LEN]; + size_t offset; + struct cs_etm_decoder *decoder = (struct cs_etm_decoder *) context; + + sprintf(packet_str, "%ld: ", (long int) indx); + offset = strlen(packet_str); + + switch (op) { + case OCSD_OP_DATA: + if 
(ocsd_pkt_str(OCSD_PROTOCOL_ETMV4I, + (void *)pkt, + packet_str+offset, + PACKET_STR_LEN-offset) != OCSD_OK) + ret = OCSD_RESP_FATAL_INVALID_PARAM; + break; + case OCSD_OP_EOT: + sprintf(packet_str, "**** END OF TRACE ****\n"); + break; + case OCSD_OP_FLUSH: + case OCSD_OP_RESET: + default: + break; + } + + decoder->packet_printer(packet_str); + + return ret; +} + +static int cs_etm_decoder__create_etmv4i_packet_printer(struct cs_etm_decoder_params *d_params, + struct cs_etm_trace_params *t_params, + struct cs_etm_decoder *decoder) +{ + ocsd_etmv4_cfg trace_config; + int ret = 0; + unsigned char CSID; /* CSID extracted from the config data */ + + if (d_params->packet_printer == NULL) + return -1; + + ret = cs_etm_decoder__gen_etmv4_config(t_params, &trace_config); + + if (ret != 0) + return -1; + + decoder->packet_printer = d_params->packet_printer; + + ret = ocsd_dt_create_decoder(decoder->dcd_tree, + OCSD_BUILTIN_DCD_ETMV4I, + OCSD_CREATE_FLG_PACKET_PROC, + (void *)&trace_config, + &CSID); + + if (ret != 0) + return -1; + + ret = ocsd_dt_attach_packet_callback(decoder->dcd_tree, + CSID, + OCSD_C_API_CB_PKT_SINK, + cs_etm_decoder__etmv4i_packet_printer, + decoder); + return ret; +} + +static int cs_etm_decoder__create_etmv4i_packet_decoder(struct cs_etm_decoder_params *d_params, + struct cs_etm_trace_params *t_params, + struct cs_etm_decoder *decoder) +{ + ocsd_etmv4_cfg trace_config; + int ret = 0; + unsigned char CSID; /* CSID extracted from the config data */ + + decoder->packet_printer = d_params->packet_printer; + + ret = cs_etm_decoder__gen_etmv4_config(t_params, &trace_config); + + if (ret != 0) + return -1; + + ret = ocsd_dt_create_decoder(decoder->dcd_tree, + OCSD_BUILTIN_DCD_ETMV4I, + OCSD_CREATE_FLG_FULL_DECODER, + (void *)&trace_config, + &CSID); + + if (ret != 0) + return -1; + + ret = ocsd_dt_set_gen_elem_outfn(decoder->dcd_tree, + cs_etm_decoder__gen_trace_elem_printer, decoder); + return ret; +} + +int cs_etm_decoder__add_mem_access_cb(struct 
cs_etm_decoder *decoder, + uint64_t address, + uint64_t len, + cs_etm_mem_cb_type cb_func) +{ + int err; + + decoder->mem_access = cb_func; + err = ocsd_dt_add_callback_mem_acc(decoder->dcd_tree, + address, + address+len-1, + OCSD_MEM_SPACE_ANY, + cs_etm_decoder__mem_access, + decoder); + return err; +} + + +int cs_etm_decoder__add_bin_file(struct cs_etm_decoder *decoder, + uint64_t offset, + uint64_t address, + uint64_t len, + const char *fname) +{ + int err = 0; + ocsd_file_mem_region_t region; + + (void) len; + if (NULL == decoder) + return -1; + + if (NULL == decoder->dcd_tree) + return -1; + + region.file_offset = offset; + region.start_address = address; + region.region_size = len; + err = ocsd_dt_add_binfile_region_mem_acc(decoder->dcd_tree, + &region, + 1, + OCSD_MEM_SPACE_ANY, + fname); + + return err; +} + +const struct cs_etm_state *cs_etm_decoder__process_data_block(struct cs_etm_decoder *decoder, + uint64_t indx, + const uint8_t *buf, + size_t len, + size_t *consumed) +{ + int ret = 0; + ocsd_datapath_resp_t dp_ret = decoder->prev_return; + size_t processed = 0; + + if (decoder->packet_count > 0) { + decoder->state.err = ret; + *consumed = processed; + return &(decoder->state); + } + + while ((processed < len) && (0 == ret)) { + + if (OCSD_DATA_RESP_IS_CONT(dp_ret)) { + uint32_t count; + dp_ret = ocsd_dt_process_data(decoder->dcd_tree, + OCSD_OP_DATA, + indx+processed, + len - processed, + &buf[processed], + &count); + processed += count; + + } else if (OCSD_DATA_RESP_IS_WAIT(dp_ret)) { + dp_ret = ocsd_dt_process_data(decoder->dcd_tree, + OCSD_OP_FLUSH, + 0, + 0, + NULL, + NULL); + break; + } else + ret = -1; + } + if (OCSD_DATA_RESP_IS_WAIT(dp_ret)) { + if (OCSD_DATA_RESP_IS_CONT(decoder->prev_return)) { + decoder->prev_processed = processed; + } + processed = 0; + } else if (OCSD_DATA_RESP_IS_WAIT(decoder->prev_return)) { + processed = decoder->prev_processed; + decoder->prev_processed = 0; + } + *consumed = processed; + decoder->prev_return = dp_ret; + 
decoder->state.err = ret; + return &(decoder->state); +} + +int cs_etm_decoder__get_packet(struct cs_etm_decoder *decoder, + struct cs_etm_packet *packet) +{ + if (decoder->packet_count == 0) + return -1; + + if (packet == NULL) + return -1; + + *packet = decoder->packet_buffer[decoder->head]; + + decoder->head = (decoder->head + 1) & (MAX_BUFFER - 1); + + decoder->packet_count--; + + return 0; +} + +static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder) +{ + unsigned i; + + decoder->head = 0; + decoder->tail = 0; + decoder->end_tail = 0; + decoder->packet_count = 0; + for (i = 0; i < MAX_BUFFER; i++) { + decoder->packet_buffer[i].start_addr = 0xdeadbeefdeadbeefUL; + decoder->packet_buffer[i].end_addr = 0xdeadbeefdeadbeefUL; + decoder->packet_buffer[i].exc = false; + decoder->packet_buffer[i].exc_ret = false; + decoder->packet_buffer[i].cpu = INT_MIN; + } +} + +struct cs_etm_decoder *cs_etm_decoder__new(uint32_t num_cpu, + struct cs_etm_decoder_params *d_params, + struct cs_etm_trace_params t_params[]) +{ + struct cs_etm_decoder *decoder; + ocsd_dcd_tree_src_t format; + uint32_t flags; + int ret; + size_t i; + + if ((t_params == NULL) || (d_params == 0)) + return NULL; + + decoder = zalloc(sizeof(struct cs_etm_decoder)); + + if (decoder == NULL) + return NULL; + + decoder->state.data = d_params->data; + decoder->prev_return = OCSD_RESP_CONT; + cs_etm_decoder__clear_buffer(decoder); + format = (d_params->formatted ? OCSD_TRC_SRC_FRAME_FORMATTED : + OCSD_TRC_SRC_SINGLE); + flags = 0; + flags |= (d_params->fsyncs ? OCSD_DFRMTR_HAS_FSYNCS : 0); + flags |= (d_params->hsyncs ? OCSD_DFRMTR_HAS_HSYNCS : 0); + flags |= (d_params->frame_aligned ? 
OCSD_DFRMTR_FRAME_MEM_ALIGN : 0); + + /* Create decode tree for the data source */ + decoder->dcd_tree = ocsd_create_dcd_tree(format, flags); + + if (decoder->dcd_tree == 0) + goto err_free_decoder; + + for (i = 0; i < num_cpu; ++i) { + switch (t_params[i].protocol) { + case CS_ETM_PROTO_ETMV4i: + if (d_params->operation == CS_ETM_OPERATION_PRINT) + ret = cs_etm_decoder__create_etmv4i_packet_printer(d_params, &t_params[i], decoder); + else if (d_params->operation == CS_ETM_OPERATION_DECODE) + ret = cs_etm_decoder__create_etmv4i_packet_decoder(d_params, &t_params[i], decoder); + else + ret = -CS_ETM_ERR_PARAM; + if (ret != 0) + goto err_free_decoder_tree; + break; + default: + goto err_free_decoder_tree; + break; + } + } + + + return decoder; + +err_free_decoder_tree: + ocsd_destroy_dcd_tree(decoder->dcd_tree); +err_free_decoder: + free(decoder); + return NULL; +} + + +void cs_etm_decoder__free(struct cs_etm_decoder *decoder) +{ + if (decoder == NULL) + return; + + ocsd_destroy_dcd_tree(decoder->dcd_tree); + decoder->dcd_tree = NULL; + + free(decoder); +} diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h new file mode 100644 index 0000000..7e9db4c --- /dev/null +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -0,0 +1,117 @@ +/* + * Copyright(C) 2015 Linaro Limited. All rights reserved. + * Author: Tor Jeremiassen <tor.jeremias...@linaro.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General + * Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program. 
If not, see <http://www.gnu.org/licenses/>. + */ + +#ifndef INCLUDE__CS_ETM_DECODER_H__ +#define INCLUDE__CS_ETM_DECODER_H__ + +#include <linux/types.h> +#include <stdio.h> + +struct cs_etm_decoder; + +struct cs_etm_buffer { + const unsigned char *buf; + size_t len; + uint64_t offset; + //bool consecutive; + uint64_t ref_timestamp; + //uint64_t trace_nr; +}; + +enum cs_etm_sample_type { + CS_ETM_RANGE = 1 << 0, +}; + +struct cs_etm_state { + int err; + void *data; + unsigned isa; + uint64_t start; + uint64_t end; + uint64_t timestamp; +}; + +struct cs_etm_packet { + enum cs_etm_sample_type sample_type; + uint64_t start_addr; + uint64_t end_addr; + bool exc; + bool exc_ret; + int cpu; +}; + + +struct cs_etm_queue; +typedef uint32_t (*cs_etm_mem_cb_type)(struct cs_etm_queue *, uint64_t, size_t, uint8_t *); + +struct cs_etm_trace_params { + void *etmv4i_packet_handler; + uint32_t reg_idr0; + uint32_t reg_idr1; + uint32_t reg_idr2; + uint32_t reg_idr8; + uint32_t reg_configr; + uint32_t reg_traceidr; + int protocol; +}; + +struct cs_etm_decoder_params { + int operation; + void (*packet_printer)(const char *); + cs_etm_mem_cb_type mem_acc_cb; + bool formatted; + bool fsyncs; + bool hsyncs; + bool frame_aligned; + void *data; +}; + +enum { + CS_ETM_PROTO_ETMV3 = 1, + CS_ETM_PROTO_ETMV4i, + CS_ETM_PROTO_ETMV4d, +}; + +enum { + CS_ETM_OPERATION_PRINT = 1, + CS_ETM_OPERATION_DECODE, +}; + +enum { + CS_ETM_ERR_NOMEM = 1, + CS_ETM_ERR_NODATA, + CS_ETM_ERR_PARAM, +}; + + +struct cs_etm_decoder *cs_etm_decoder__new(uint32_t num_cpu, struct cs_etm_decoder_params *, struct cs_etm_trace_params []); + +int cs_etm_decoder__add_mem_access_cb(struct cs_etm_decoder *, uint64_t, uint64_t, cs_etm_mem_cb_type); + +int cs_etm_decoder__flush(struct cs_etm_decoder *); +void cs_etm_decoder__free(struct cs_etm_decoder *); +int cs_etm_decoder__get_packet(struct cs_etm_decoder *, struct cs_etm_packet *); + +int cs_etm_decoder__add_bin_file(struct cs_etm_decoder *, uint64_t, uint64_t, uint64_t, 
const char *); + +const struct cs_etm_state *cs_etm_decoder__process_data_block(struct cs_etm_decoder *, + uint64_t, + const uint8_t *, + size_t, + size_t *); + +#endif /* INCLUDE__CS_ETM_DECODER_H__ */ diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c new file mode 100644 index 0000000..91d6a8a --- /dev/null +++ b/tools/perf/util/cs-etm.c @@ -0,0 +1,1501 @@ +/* + * Copyright(C) 2016 Linaro Limited. All rights reserved. + * Author: Tor Jeremiassen <tor.jeremias...@linaro.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see <http://www.gnu.org/licenses/>. 
+ */ + +#include <linux/err.h> +#include <linux/kernel.h> +#include <linux/types.h> +#include <linux/bitops.h> +#include <linux/log2.h> + +#include "perf.h" +#include "thread_map.h" +#include "thread.h" +#include "thread-stack.h" +#include "callchain.h" +#include "auxtrace.h" +#include "evlist.h" +#include "machine.h" +#include "util.h" +#include "util/intlist.h" +#include "color.h" +#include "cs-etm.h" +#include "cs-etm-decoder/cs-etm-decoder.h" +#include "debug.h" + +#include <stdlib.h> + +#define KiB(x) ((x) * 1024) +#define MiB(x) ((x) * 1024 * 1024) +#define MAX_TIMESTAMP (~0ULL) + +struct cs_etm_auxtrace { + struct auxtrace auxtrace; + struct auxtrace_queues queues; + struct auxtrace_heap heap; + u64 **metadata; + u32 auxtrace_type; + struct perf_session *session; + struct machine *machine; + struct perf_evsel *switch_evsel; + struct thread *unknown_thread; + uint32_t num_cpu; + bool timeless_decoding; + bool sampling_mode; + bool snapshot_mode; + bool data_queued; + bool sync_switch; + bool synth_needs_swap; + int have_sched_switch; + + bool sample_instructions; + u64 instructions_sample_type; + u64 instructions_sample_period; + u64 instructions_id; + struct itrace_synth_opts synth_opts; + unsigned pmu_type; +}; + +struct cs_etm_queue { + struct cs_etm_auxtrace *etm; + unsigned queue_nr; + struct auxtrace_buffer *buffer; + const struct cs_etm_state *state; + struct ip_callchain *chain; + union perf_event *event_buf; + bool on_heap; + bool step_through_buffers; + bool use_buffer_pid_tid; + pid_t pid, tid; + int cpu; + struct thread *thread; + u64 time; + u64 timestamp; + bool stop; + struct cs_etm_decoder *decoder; + u64 offset; + bool eot; + bool kernel_mapped; +}; + +static int cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq); +static int cs_etm__update_queues(struct cs_etm_auxtrace *); +static int cs_etm__process_queues(struct cs_etm_auxtrace *, u64); +static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *, pid_t, u64); 
+static uint32_t cs_etm__mem_access(struct cs_etm_queue *, uint64_t, size_t, uint8_t *); + +static void cs_etm__packet_dump(const char *pkt_string) +{ + const char *color = PERF_COLOR_BLUE; + + color_fprintf(stdout, color, " %s\n", pkt_string); + fflush(stdout); +} + +static void cs_etm__dump_event(struct cs_etm_auxtrace *etm, + struct auxtrace_buffer *buffer) +{ + const char *color = PERF_COLOR_BLUE; + struct cs_etm_decoder_params d_params; + struct cs_etm_trace_params *t_params; + struct cs_etm_decoder *decoder; + size_t buffer_used = 0; + size_t i; + + fprintf(stdout, "\n"); + color_fprintf(stdout, color, + ". ... CoreSight ETM Trace data: size %zu bytes\n", + buffer->size); + + t_params = zalloc(sizeof(struct cs_etm_trace_params) * etm->num_cpu); + for (i = 0; i < etm->num_cpu; ++i) { + t_params[i].protocol = CS_ETM_PROTO_ETMV4i; + t_params[i].reg_idr0 = etm->metadata[i][CS_ETMV4_TRCIDR0]; + t_params[i].reg_idr1 = etm->metadata[i][CS_ETMV4_TRCIDR1]; + t_params[i].reg_idr2 = etm->metadata[i][CS_ETMV4_TRCIDR2]; + t_params[i].reg_idr8 = etm->metadata[i][CS_ETMV4_TRCIDR8]; + t_params[i].reg_configr = etm->metadata[i][CS_ETMV4_TRCCONFIGR]; + t_params[i].reg_traceidr = etm->metadata[i][CS_ETMV4_TRCTRACEIDR]; + //[CS_ETMV4_TRCAUTHSTATUS] = " TRCAUTHSTATUS %"PRIx64"\n", + } + d_params.packet_printer = cs_etm__packet_dump; + d_params.operation = CS_ETM_OPERATION_PRINT; + d_params.formatted = true; + d_params.fsyncs = false; + d_params.hsyncs = false; + d_params.frame_aligned = true; + + decoder = cs_etm_decoder__new(etm->num_cpu, &d_params, t_params); + + zfree(&t_params); + + if (decoder == NULL) + return; + do { + size_t consumed; + cs_etm_decoder__process_data_block(decoder, buffer->offset, + &(((uint8_t *)buffer->data)[buffer_used]), + buffer->size - buffer_used, &consumed); + buffer_used += consumed; + } while (buffer_used < buffer->size); + cs_etm_decoder__free(decoder); +} + +static int cs_etm__flush_events(struct perf_session *session, struct perf_tool *tool) +{ 
+ struct cs_etm_auxtrace *etm = container_of(session->auxtrace, + struct cs_etm_auxtrace, + auxtrace); + + int ret; + + if (dump_trace) + return 0; + + if (!tool->ordered_events) + return -EINVAL; + + ret = cs_etm__update_queues(etm); + + if (ret < 0) + return ret; + + if (etm->timeless_decoding) + return cs_etm__process_timeless_queues(etm, -1, MAX_TIMESTAMP - 1); + + return cs_etm__process_queues(etm, MAX_TIMESTAMP); +} + +static void cs_etm__set_pid_tid_cpu(struct cs_etm_auxtrace *etm, + struct auxtrace_queue *queue) +{ + struct cs_etm_queue *etmq = queue->priv; + + if ((queue->tid == -1) || (etm->have_sched_switch)) { + etmq->tid = machine__get_current_tid(etm->machine, etmq->cpu); + thread__zput(etmq->thread); + } + + if ((!etmq->thread) && (etmq->tid != -1)) + etmq->thread = machine__find_thread(etm->machine, -1, etmq->tid); + + if (etmq->thread) { + etmq->pid = etmq->thread->pid_; + if (queue->cpu == -1) + etmq->cpu = etmq->thread->cpu; + } +} + +static void cs_etm__free_queue(void *priv) +{ + struct cs_etm_queue *etmq = priv; + + if (!etmq) + return; + + thread__zput(etmq->thread); + cs_etm_decoder__free(etmq->decoder); + zfree(&etmq->event_buf); + zfree(&etmq->chain); + free(etmq); +} + +static void cs_etm__free_events(struct perf_session *session) +{ + struct cs_etm_auxtrace *aux = container_of(session->auxtrace, + struct cs_etm_auxtrace, + auxtrace); + + struct auxtrace_queues *queues = &(aux->queues); + + unsigned i; + + for (i = 0; i < queues->nr_queues; ++i) { + cs_etm__free_queue(queues->queue_array[i].priv); + queues->queue_array[i].priv = 0; + } + + auxtrace_queues__free(queues); + +} + +static void cs_etm__free(struct perf_session *session) +{ + + size_t i; + struct int_node *inode, *tmp; + struct cs_etm_auxtrace *aux = container_of(session->auxtrace, + struct cs_etm_auxtrace, + auxtrace); + auxtrace_heap__free(&aux->heap); + cs_etm__free_events(session); + session->auxtrace = NULL; + + /* First remove all traceID/CPU# nodes from the RB tree */ + 
intlist__for_each_entry_safe(inode, tmp, traceid_list) + intlist__remove(traceid_list, inode); + /* Then the RB tree itself */ + intlist__delete(traceid_list); + + //thread__delete(aux->unknown_thread); + for (i = 0; i < aux->num_cpu; ++i) + zfree(&aux->metadata[i]); + zfree(&aux->metadata); + free(aux); +} + +static void cs_etm__use_buffer_pid_tid(struct cs_etm_queue *etmq, + struct auxtrace_queue *queue, + struct auxtrace_buffer *buffer) +{ + if ((queue->cpu == -1) && (buffer->cpu != -1)) + etmq->cpu = buffer->cpu; + + etmq->pid = buffer->pid; + etmq->tid = buffer->tid; + + thread__zput(etmq->thread); + + if (etmq->tid != -1) { + if (etmq->pid != -1) { + etmq->thread = machine__findnew_thread(etmq->etm->machine, + etmq->pid, + etmq->tid); + } else { + etmq->thread = machine__findnew_thread(etmq->etm->machine, + -1, + etmq->tid); + } + } +} + + +static int cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq) +{ + struct auxtrace_buffer *aux_buffer = etmq->buffer; + struct auxtrace_buffer *old_buffer = aux_buffer; + struct auxtrace_queue *queue; + + if (etmq->stop) { + buff->len = 0; + return 0; + } + + queue = &etmq->etm->queues.queue_array[etmq->queue_nr]; + + aux_buffer = auxtrace_buffer__next(queue, aux_buffer); + + if (!aux_buffer) { + if (old_buffer) + auxtrace_buffer__drop_data(old_buffer); + buff->len = 0; + return 0; + } + + etmq->buffer = aux_buffer; + + if (!aux_buffer->data) { + int fd = perf_data_file__fd(etmq->etm->session->file); + + aux_buffer->data = auxtrace_buffer__get_data(aux_buffer, fd); + if (!aux_buffer->data) + return -ENOMEM; + } + + if (old_buffer) + auxtrace_buffer__drop_data(old_buffer); + + if (aux_buffer->use_data) { + buff->offset = aux_buffer->offset; + buff->len = aux_buffer->use_size; + buff->buf = aux_buffer->use_data; + } else { + buff->offset = aux_buffer->offset; + buff->len = aux_buffer->size; + buff->buf = aux_buffer->data; + } + /* + * buff->offset = 0; + * buff->len = sizeof(cstrace); + * buff->buf = 
cstrace; + */ + + buff->ref_timestamp = aux_buffer->reference; + + if (etmq->use_buffer_pid_tid && + ((etmq->pid != aux_buffer->pid) || + (etmq->tid != aux_buffer->tid))) { + cs_etm__use_buffer_pid_tid(etmq, queue, aux_buffer); + } + + if (etmq->step_through_buffers) + etmq->stop = true; + + return buff->len; +} + +static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, + unsigned int queue_nr) +{ + struct cs_etm_decoder_params d_params; + struct cs_etm_trace_params *t_params; + struct cs_etm_queue *etmq; + size_t i; + + etmq = zalloc(sizeof(struct cs_etm_queue)); + if (!etmq) + return NULL; + + if (etm->synth_opts.callchain) { + size_t sz = sizeof(struct ip_callchain); + + sz += etm->synth_opts.callchain_sz * sizeof(u64); + etmq->chain = zalloc(sz); + if (!etmq->chain) + goto out_free; + } else { + etmq->chain = NULL; + } + + etmq->event_buf = malloc(PERF_SAMPLE_MAX_SIZE); + if (!etmq->event_buf) + goto out_free; + + etmq->etm = etm; + etmq->queue_nr = queue_nr; + etmq->pid = -1; + etmq->tid = -1; + etmq->cpu = -1; + etmq->stop = false; + etmq->kernel_mapped = false; + + t_params = zalloc(sizeof(struct cs_etm_trace_params)*etm->num_cpu); + + for (i = 0; i < etm->num_cpu; ++i) { + t_params[i].reg_idr0 = etm->metadata[i][CS_ETMV4_TRCIDR0]; + t_params[i].reg_idr1 = etm->metadata[i][CS_ETMV4_TRCIDR1]; + t_params[i].reg_idr2 = etm->metadata[i][CS_ETMV4_TRCIDR2]; + t_params[i].reg_idr8 = etm->metadata[i][CS_ETMV4_TRCIDR8]; + t_params[i].reg_configr = etm->metadata[i][CS_ETMV4_TRCCONFIGR]; + t_params[i].reg_traceidr = etm->metadata[i][CS_ETMV4_TRCTRACEIDR]; + t_params[i].protocol = CS_ETM_PROTO_ETMV4i; + } + d_params.packet_printer = cs_etm__packet_dump; + d_params.operation = CS_ETM_OPERATION_DECODE; + d_params.formatted = true; + d_params.fsyncs = false; + d_params.hsyncs = false; + d_params.frame_aligned = true; + d_params.data = etmq; + + etmq->decoder = cs_etm_decoder__new(etm->num_cpu, &d_params, t_params); + + + zfree(&t_params); + + if 
(!etmq->decoder) + goto out_free; + + etmq->offset = 0; + etmq->eot = false; + + return etmq; + +out_free: + zfree(&etmq->event_buf); + zfree(&etmq->chain); + free(etmq); + return NULL; +} + +static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, + struct auxtrace_queue *queue, + unsigned int queue_nr) +{ + struct cs_etm_queue *etmq = queue->priv; + + if (list_empty(&(queue->head))) + return 0; + + if (etmq == NULL) { + etmq = cs_etm__alloc_queue(etm, queue_nr); + + if (etmq == NULL) + return -ENOMEM; + + queue->priv = etmq; + + if (queue->cpu != -1) + etmq->cpu = queue->cpu; + + etmq->tid = queue->tid; + + if (etm->sampling_mode) + if (etm->timeless_decoding) + etmq->step_through_buffers = true; + if (etm->timeless_decoding || !etm->have_sched_switch) + etmq->use_buffer_pid_tid = true; + } + + if (!etmq->on_heap && + (!etm->sync_switch)) { + const struct cs_etm_state *state; + int ret = 0; + + if (etm->timeless_decoding) + return ret; + + //cs_etm__log("queue %u getting timestamp\n", queue_nr); + //cs_etm__log("queue %u decoding cpu %d pid %d tid %d\n", + //queue_nr, etmq->cpu, etmq->pid, etmq->tid); + (void) state; + return ret; + /* + while (1) { + state = cs_etm_decoder__decode(etmq->decoder); + if (state->err) { + if (state->err == CS_ETM_ERR_NODATA) { + //cs_etm__log("queue %u has no timestamp\n", + //queue_nr); + return 0; + } + continue; + } + if (state->timestamp) + break; + } + + etmq->timestamp = state->timestamp; + //cs_etm__log("queue %u timestamp 0x%"PRIx64 "\n", + //queue_nr, etmq->timestamp); + etmq->state = state; + etmq->have_sample = true; + //cs_etm__sample_flags(etmq); + ret = auxtrace_heap__add(&etm->heap, queue_nr, etmq->timestamp); + if (ret) + return ret; + etmq->on_heap = true; + */ + } + + return 0; +} + + +static int cs_etm__setup_queues(struct cs_etm_auxtrace *etm) +{ + unsigned int i; + int ret; + + for (i = 0; i < etm->queues.nr_queues; i++) { + ret = cs_etm__setup_queue(etm, &(etm->queues.queue_array[i]), i); + if (ret) + return 
ret; + } + return 0; +} + +#if 0 +struct cs_etm_cache_entry { + struct auxtrace_cache_entry entry; + uint64_t icount; + uint64_t bcount; +}; + +static size_t cs_etm__cache_divisor(void) +{ + static size_t d = 64; + + return d; +} + +static size_t cs_etm__cache_size(struct dso *dso, + struct machine *machine) +{ + off_t size; + + size = dso__data_size(dso, machine); + size /= cs_etm__cache_divisor(); + + if (size < 1000) + return 10; + + if (size > (1 << 21)) + return 21; + + return 32 - __builtin_clz(size); +} + +static struct auxtrace_cache *cs_etm__cache(struct dso *dso, + struct machine *machine) +{ + struct auxtrace_cache *c; + size_t bits; + + if (dso->auxtrace_cache) + return dso->auxtrace_cache; + + bits = cs_etm__cache_size(dso, machine); + + c = auxtrace_cache__new(bits, sizeof(struct cs_etm_cache_entry), 200); + + dso->auxtrace_cache = c; + + return c; +} + +static int cs_etm__cache_add(struct dso *dso, struct machine *machine, + uint64_t offset, uint64_t icount, uint64_t bcount) +{ + struct auxtrace_cache *c = cs_etm__cache(dso, machine); + struct cs_etm_cache_entry *e; + int err; + + if (!c) + return -ENOMEM; + + e = auxtrace_cache__alloc_entry(c); + if (!e) + return -ENOMEM; + + e->icount = icount; + e->bcount = bcount; + + err = auxtrace_cache__add(c, offset, &e->entry); + + if (err) + auxtrace_cache__free_entry(c, e); + + return err; +} + +static struct cs_etm_cache_entry *cs_etm__cache_lookup(struct dso *dso, + struct machine *machine, + uint64_t offset) +{ + struct auxtrace_cache *c = cs_etm__cache(dso, machine); + + if (!c) + return NULL; + + return auxtrace_cache__lookup(dso->auxtrace_cache, offset); +} +#endif + +static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq, + struct cs_etm_packet *packet) +{ + int ret = 0; + struct cs_etm_auxtrace *etm = etmq->etm; + union perf_event *event = etmq->event_buf; + struct perf_sample sample = {.ip = 0,}; + uint64_t start_addr = packet->start_addr; + uint64_t end_addr = packet->end_addr; + + 
event->sample.header.type = PERF_RECORD_SAMPLE; + event->sample.header.misc = PERF_RECORD_MISC_USER; + event->sample.header.size = sizeof(struct perf_event_header); + + + sample.ip = start_addr; + sample.pid = etmq->pid; + sample.tid = etmq->tid; + sample.addr = end_addr; + sample.id = etmq->etm->instructions_id; + sample.stream_id = etmq->etm->instructions_id; + sample.period = (end_addr - start_addr) >> 2; + sample.cpu = packet->cpu; + sample.flags = 0; // etmq->flags; + sample.insn_len = 1; // etmq->insn_len; + sample.cpumode = event->header.misc; + + //etmq->last_insn_cnt = etmq->state->tot_insn_cnt; + +#if 0 + { + struct addr_location al; + uint64_t offset; + struct thread *thread; + struct machine *machine = etmq->etm->machine; + uint8_t cpumode; + struct cs_etm_cache_entry *e; + uint8_t buf[256]; + size_t bufsz; + + thread = etmq->thread; + + if (!thread) + thread = etmq->etm->unknown_thread; + + if (start_addr > 0xffffffc000000000UL) + cpumode = PERF_RECORD_MISC_KERNEL; + else + cpumode = PERF_RECORD_MISC_USER; + + thread__find_addr_map(thread, cpumode, MAP__FUNCTION, start_addr, &al); + if (!al.map || !al.map->dso) + goto endTest; + if (al.map->dso->data.status == DSO_DATA_STATUS_ERROR && + dso__data_status_seen(al.map->dso, DSO_DATA_STATUS_SEEN_ITRACE)) + goto endTest; + + offset = al.map->map_ip(al.map, start_addr); + + + e = cs_etm__cache_lookup(al.map->dso, machine, offset); + + if (e) + (void) e; + else { + int len; + map__load(al.map, machine->symbol_filter); + + bufsz = sizeof(buf); + len = dso__data_read_offset(al.map->dso, machine, + offset, buf, bufsz); + + if (len <= 0) + goto endTest; + + cs_etm__cache_add(al.map->dso, machine, offset, (end_addr - start_addr) >> 2, end_addr - start_addr); + + } +endTest: + (void) offset; + } +#endif + + ret = perf_session__deliver_synth_event(etm->session, event, &sample); + + if (ret) + pr_err("CS ETM Trace: failed to deliver instruction event, error %d\n", ret); + + return ret; +} + +struct cs_etm_synth { + 
struct perf_tool dummy_tool; + struct perf_session *session; +}; + + +static int cs_etm__event_synth(struct perf_tool *tool, + union perf_event *event, + struct perf_sample *sample, + struct machine *machine) +{ + struct cs_etm_synth *cs_etm_synth = + container_of(tool, struct cs_etm_synth, dummy_tool); + + (void) sample; + (void) machine; + + return perf_session__deliver_synth_event(cs_etm_synth->session, event, NULL); + +} + + +static int cs_etm__synth_event(struct perf_session *session, + struct perf_event_attr *attr, u64 id) +{ + struct cs_etm_synth cs_etm_synth; + + memset(&cs_etm_synth, 0, sizeof(struct cs_etm_synth)); + cs_etm_synth.session = session; + + return perf_event__synthesize_attr(&cs_etm_synth.dummy_tool, attr, 1, + &id, cs_etm__event_synth); +} + +static int cs_etm__synth_events(struct cs_etm_auxtrace *etm, + struct perf_session *session) +{ + struct perf_evlist *evlist = session->evlist; + struct perf_evsel *evsel; + struct perf_event_attr attr; + bool found = false; + u64 id; + int err; + + evlist__for_each_entry(evlist, evsel) { + + if (evsel->attr.type == etm->pmu_type) { + found = true; + break; + } + } + + if (!found) { + pr_debug("There are no selected events with CoreSight Trace data\n"); + return 0; + } + + memset(&attr, 0, sizeof(struct perf_event_attr)); + attr.size = sizeof(struct perf_event_attr); + attr.type = PERF_TYPE_HARDWARE; + attr.sample_type = evsel->attr.sample_type & PERF_SAMPLE_MASK; + attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID | + PERF_SAMPLE_PERIOD; + if (etm->timeless_decoding) + attr.sample_type &= ~(u64)PERF_SAMPLE_TIME; + else + attr.sample_type |= PERF_SAMPLE_TIME; + + attr.exclude_user = evsel->attr.exclude_user; + attr.exclude_kernel = evsel->attr.exclude_kernel; + attr.exclude_hv = evsel->attr.exclude_hv; + attr.exclude_host = evsel->attr.exclude_host; + attr.exclude_guest = evsel->attr.exclude_guest; + attr.sample_id_all = evsel->attr.sample_id_all; + attr.read_format = evsel->attr.read_format; + + id 
= evsel->id[0] + 1000000000; + + if (!id) + id = 1; + + if (etm->synth_opts.instructions) { + attr.config = PERF_COUNT_HW_INSTRUCTIONS; + attr.sample_period = etm->synth_opts.period; + etm->instructions_sample_period = attr.sample_period; + err = cs_etm__synth_event(session, &attr, id); + + if (err) { + pr_err("%s: failed to synthesize 'instructions' event type\n", + __func__); + return err; + } + etm->sample_instructions = true; + etm->instructions_sample_type = attr.sample_type; + etm->instructions_id = id; + id += 1; + } + + etm->synth_needs_swap = evsel->needs_swap; + return 0; +} + +static int cs_etm__sample(struct cs_etm_queue *etmq) +{ + //const struct cs_etm_state *state = etmq->state; + struct cs_etm_packet packet; + //struct cs_etm_auxtrace *etm = etmq->etm; + int err; + + err = cs_etm_decoder__get_packet(etmq->decoder, &packet); + // if there is no sample, it returns err = -1, no real error + + if (!err && packet.sample_type & CS_ETM_RANGE) { + err = cs_etm__synth_instruction_sample(etmq, &packet); + if (err) + return err; + } + return 0; +} + +static int cs_etm__run_decoder(struct cs_etm_queue *etmq, u64 *timestamp) +{ + struct cs_etm_buffer buffer; + size_t buffer_used; + int err = 0; + + /* Go through each buffer in the queue and decode them one by one */ +more: + buffer_used = 0; + memset(&buffer, 0, sizeof(buffer)); + err = cs_etm__get_trace(&buffer, etmq); + if (err <= 0) + return err; + + do { + size_t processed = 0; + etmq->state = cs_etm_decoder__process_data_block(etmq->decoder, + etmq->offset, + &buffer.buf[buffer_used], + buffer.len-buffer_used, + &processed); + err = etmq->state->err; + etmq->offset += processed; + buffer_used += processed; + if (!err) + cs_etm__sample(etmq); + } while (!etmq->eot && (buffer.len > buffer_used)); +goto more; + + (void) timestamp; + + return err; +} + +static int cs_etm__update_queues(struct cs_etm_auxtrace *etm) +{ + if (etm->queues.new_data) { + etm->queues.new_data = false; + return 
cs_etm__setup_queues(etm); + } + return 0; +} + +static int cs_etm__process_queues(struct cs_etm_auxtrace *etm, u64 timestamp) +{ + unsigned int queue_nr; + u64 ts; + int ret; + + while (1) { + struct auxtrace_queue *queue; + struct cs_etm_queue *etmq; + + if (!etm->heap.heap_cnt) + return 0; + + if (etm->heap.heap_array[0].ordinal >= timestamp) + return 0; + + queue_nr = etm->heap.heap_array[0].queue_nr; + queue = &etm->queues.queue_array[queue_nr]; + etmq = queue->priv; + + //cs_etm__log("queue %u processing 0x%" PRIx64 " to 0x%" PRIx64 "\n", + //queue_nr, etm->heap.heap_array[0].ordinal, + //timestamp); + + auxtrace_heap__pop(&etm->heap); + + if (etm->heap.heap_cnt) { + ts = etm->heap.heap_array[0].ordinal + 1; + if (ts > timestamp) + ts = timestamp; + } else { + ts = timestamp; + } + + cs_etm__set_pid_tid_cpu(etm, queue); + + ret = cs_etm__run_decoder(etmq, &ts); + + if (ret < 0) { + auxtrace_heap__add(&etm->heap, queue_nr, ts); + return ret; + } + + if (!ret) { + ret = auxtrace_heap__add(&etm->heap, queue_nr, ts); + if (ret < 0) + return ret; + } else { + etmq->on_heap = false; + } + } + return 0; +} + +static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm, + pid_t tid, + u64 time_) +{ + struct auxtrace_queues *queues = &etm->queues; + unsigned int i; + u64 ts = 0; + + for (i = 0; i < queues->nr_queues; ++i) { + struct auxtrace_queue *queue = &(etm->queues.queue_array[i]); + struct cs_etm_queue *etmq = queue->priv; + + if (etmq && ((tid == -1) || (etmq->tid == tid))) { + etmq->time = time_; + cs_etm__set_pid_tid_cpu(etm, queue); + cs_etm__run_decoder(etmq, &ts); + + } + } + return 0; +} + +static struct cs_etm_queue *cs_etm__cpu_to_etmq(struct cs_etm_auxtrace *etm, + int cpu) +{ + unsigned q, j; + + if (etm->queues.nr_queues == 0) + return NULL; + + if (cpu < 0) + q = 0; + else if ((unsigned) cpu >= etm->queues.nr_queues) + q = etm->queues.nr_queues - 1; + else + q = cpu; + + if (etm->queues.queue_array[q].cpu == cpu) + return 
etm->queues.queue_array[q].priv; + + for (j = 0; q > 0; j++) { + if (etm->queues.queue_array[--q].cpu == cpu) + return etm->queues.queue_array[q].priv; + } + + for (; j < etm->queues.nr_queues; j++) { + if (etm->queues.queue_array[j].cpu == cpu) + return etm->queues.queue_array[j].priv; + + } + + return NULL; +} + +static uint32_t cs_etm__mem_access(struct cs_etm_queue *etmq, uint64_t address, size_t size, uint8_t *buffer) +{ + struct addr_location al; + uint64_t offset; + struct thread *thread; + struct machine *machine; + uint8_t cpumode; + int len; + + if (etmq == NULL) + return -1; + + machine = etmq->etm->machine; + thread = etmq->thread; + if (address > 0xffffffc000000000UL) + cpumode = PERF_RECORD_MISC_KERNEL; + else + cpumode = PERF_RECORD_MISC_USER; + + thread__find_addr_map(thread, cpumode, MAP__FUNCTION, address, &al); + + if (!al.map || !al.map->dso) + return 0; + + if (al.map->dso->data.status == DSO_DATA_STATUS_ERROR && + dso__data_status_seen(al.map->dso, DSO_DATA_STATUS_SEEN_ITRACE)) + return 0; + + offset = al.map->map_ip(al.map, address); + + map__load(al.map); + + len = dso__data_read_offset(al.map->dso, machine, + offset, buffer, size); + + if (len <= 0) + return 0; + + return len; +} + +static bool check_need_swap(int file_endian) +{ + const int data = 1; + u8 *check = (u8 *)&data; + int host_endian; + + if (check[0] == 1) + host_endian = ELFDATA2LSB; + else + host_endian = ELFDATA2MSB; + + return host_endian != file_endian; +} + +static int cs_etm__read_elf_info(const char *fname, uint64_t *foffset, uint64_t *fstart, uint64_t *fsize) +{ + FILE *fp; + u8 e_ident[EI_NIDENT]; + int ret = -1; + bool need_swap = false; + size_t buf_size; + void *buf; + int i; + + fp = fopen(fname, "r"); + if (fp == NULL) + return -1; + + if (fread(e_ident, sizeof(e_ident), 1, fp) != 1) + goto out; + + if (memcmp(e_ident, ELFMAG, SELFMAG) || + e_ident[EI_VERSION] != EV_CURRENT) + goto out; + + need_swap = check_need_swap(e_ident[EI_DATA]); + + /* for simplicity */ + 
fseek(fp, 0, SEEK_SET); + + if (e_ident[EI_CLASS] == ELFCLASS32) { + Elf32_Ehdr ehdr; + Elf32_Phdr *phdr; + + if (fread(&ehdr, sizeof(ehdr), 1, fp) != 1) + goto out; + + if (need_swap) { + ehdr.e_phoff = bswap_32(ehdr.e_phoff); + ehdr.e_phentsize = bswap_16(ehdr.e_phentsize); + ehdr.e_phnum = bswap_16(ehdr.e_phnum); + } + + buf_size = ehdr.e_phentsize * ehdr.e_phnum; + buf = malloc(buf_size); + if (buf == NULL) + goto out; + + fseek(fp, ehdr.e_phoff, SEEK_SET); + if (fread(buf, buf_size, 1, fp) != 1) + goto out_free; + + for (i = 0, phdr = buf; i < ehdr.e_phnum; i++, phdr++) { + + if (need_swap) { + phdr->p_type = bswap_32(phdr->p_type); + phdr->p_offset = bswap_32(phdr->p_offset); + phdr->p_filesz = bswap_32(phdr->p_filesz); + } + + if (phdr->p_type != PT_LOAD) + continue; + + *foffset = phdr->p_offset; + *fstart = phdr->p_vaddr; + *fsize = phdr->p_filesz; + ret = 0; + break; + } + } else { + Elf64_Ehdr ehdr; + Elf64_Phdr *phdr; + + if (fread(&ehdr, sizeof(ehdr), 1, fp) != 1) + goto out; + + if (need_swap) { + ehdr.e_phoff = bswap_64(ehdr.e_phoff); + ehdr.e_phentsize = bswap_16(ehdr.e_phentsize); + ehdr.e_phnum = bswap_16(ehdr.e_phnum); + } + + buf_size = ehdr.e_phentsize * ehdr.e_phnum; + buf = malloc(buf_size); + if (buf == NULL) + goto out; + + fseek(fp, ehdr.e_phoff, SEEK_SET); + if (fread(buf, buf_size, 1, fp) != 1) + goto out_free; + + for (i = 0, phdr = buf; i < ehdr.e_phnum; i++, phdr++) { + + if (need_swap) { + phdr->p_type = bswap_32(phdr->p_type); + phdr->p_offset = bswap_64(phdr->p_offset); + phdr->p_filesz = bswap_64(phdr->p_filesz); + } + + if (phdr->p_type != PT_LOAD) + continue; + + *foffset = phdr->p_offset; + *fstart = phdr->p_vaddr; + *fsize = phdr->p_filesz; + ret = 0; + break; + } + } +out_free: + free(buf); +out: + fclose(fp); + return ret; +} + +static int cs_etm__process_event(struct perf_session *session, + union perf_event *event, + struct perf_sample *sample, + struct perf_tool *tool) +{ + struct cs_etm_auxtrace *etm = 
container_of(session->auxtrace, + struct cs_etm_auxtrace, + auxtrace); + + u64 timestamp; + int err = 0; + + if (dump_trace) + return 0; + + if (!tool->ordered_events) { + pr_err("CoreSight ETM Trace requires ordered events\n"); + return -EINVAL; + } + + if (sample->time && (sample->time != (u64)-1)) + timestamp = sample->time; + else + timestamp = 0; + + if (timestamp || etm->timeless_decoding) { + err = cs_etm__update_queues(etm); + if (err) + return err; + + } + + if (event->header.type == PERF_RECORD_MMAP2) { + struct dso *dso; + int cpu; + struct cs_etm_queue *etmq; + + cpu = sample->cpu; + + etmq = cs_etm__cpu_to_etmq(etm, cpu); + + if (!etmq) + return -1; + + dso = dsos__find(&(etm->machine->dsos), event->mmap2.filename, false); + if (NULL != dso) { + err = cs_etm_decoder__add_mem_access_cb( + etmq->decoder, + event->mmap2.start, + event->mmap2.len, + cs_etm__mem_access); + } + + if ((symbol_conf.vmlinux_name != NULL) && (!etmq->kernel_mapped)) { + uint64_t foffset; + uint64_t fstart; + uint64_t fsize; + + err = cs_etm__read_elf_info(symbol_conf.vmlinux_name, + &foffset, &fstart, &fsize); + + if (!err) { + cs_etm_decoder__add_bin_file( + etmq->decoder, + foffset, + fstart, + fsize & ~0xULL, + symbol_conf.vmlinux_name); + + etmq->kernel_mapped = true; + } + } + + } + + if (etm->timeless_decoding) { + if (event->header.type == PERF_RECORD_EXIT) { + err = cs_etm__process_timeless_queues(etm, + event->fork.tid, + sample->time); + } + } else if (timestamp) { + err = cs_etm__process_queues(etm, timestamp); + } + + //cs_etm__log("event %s (%u): cpu %d time%"PRIu64" tsc %#"PRIx64"\n", + //perf_event__name(event->header.type), event->header.type, + //sample->cpu, sample->time, timestamp); + return err; +} + +static int cs_etm__process_auxtrace_event(struct perf_session *session, + union perf_event *event, + struct perf_tool *tool) +{ + struct cs_etm_auxtrace *etm = container_of(session->auxtrace, + struct cs_etm_auxtrace, + auxtrace); + + (void) tool; + + if 
(!etm->data_queued) { + struct auxtrace_buffer *buffer; + off_t data_offset; + int fd = perf_data_file__fd(session->file); + bool is_pipe = perf_data_file__is_pipe(session->file); + int err; + + if (is_pipe) + data_offset = 0; + else { + data_offset = lseek(fd, 0, SEEK_CUR); + if (data_offset == -1) + return -errno; + } + + err = auxtrace_queues__add_event(&etm->queues, + session, + event, + data_offset, + &buffer); + if (err) + return err; + + if (dump_trace) + if (auxtrace_buffer__get_data(buffer, fd)) { + cs_etm__dump_event(etm, buffer); + auxtrace_buffer__put_data(buffer); + } + } + + return 0; + +} + +static const char * const cs_etm_global_header_fmts[] = { + [CS_HEADER_VERSION_0] = " Header version %"PRIx64"\n", + [CS_PMU_TYPE_CPUS] = " PMU type/num cpus %"PRIx64"\n", + [CS_ETM_SNAPSHOT] = " Snapshot %"PRIx64"\n", +}; + +static const char * const cs_etm_priv_fmts[] = { + [CS_ETM_MAGIC] = " Magic number %"PRIx64"\n", + [CS_ETM_CPU] = " CPU %"PRIx64"\n", + [CS_ETM_ETMCR] = " ETMCR %"PRIx64"\n", + [CS_ETM_ETMTRACEIDR] = " ETMTRACEIDR %"PRIx64"\n", + [CS_ETM_ETMCCER] = " ETMCCER %"PRIx64"\n", + [CS_ETM_ETMIDR] = " ETMIDR %"PRIx64"\n", +}; + +static const char * const cs_etmv4_priv_fmts[] = { + [CS_ETM_MAGIC] = " Magic number %"PRIx64"\n", + [CS_ETM_CPU] = " CPU %"PRIx64"\n", + [CS_ETMV4_TRCCONFIGR] = " TRCCONFIGR %"PRIx64"\n", + [CS_ETMV4_TRCTRACEIDR] = " TRCTRACEIDR %"PRIx64"\n", + [CS_ETMV4_TRCIDR0] = " TRCIDR0 %"PRIx64"\n", + [CS_ETMV4_TRCIDR1] = " TRCIDR1 %"PRIx64"\n", + [CS_ETMV4_TRCIDR2] = " TRCIDR2 %"PRIx64"\n", + [CS_ETMV4_TRCIDR8] = " TRCIDR8 %"PRIx64"\n", + [CS_ETMV4_TRCAUTHSTATUS] = " TRCAUTHSTATUS %"PRIx64"\n", +}; + +static void cs_etm__print_auxtrace_info(u64 *val, size_t num) +{ + unsigned i, j, cpu; + + for (i = 0, cpu = 0; cpu < num; ++cpu) + if (val[i] == __perf_cs_etmv3_magic) + for (j = 0; j < CS_ETM_PRIV_MAX; ++j, ++i) + fprintf(stdout, cs_etm_priv_fmts[j], val[i]); + else if (val[i] == __perf_cs_etmv4_magic) + for (j = 0; j < 
CS_ETMV4_PRIV_MAX; ++j, ++i) + fprintf(stdout, cs_etmv4_priv_fmts[j], val[i]); + else + // failure.. return + return; +} + +int cs_etm__process_auxtrace_info(union perf_event *event, + struct perf_session *session) +{ + struct auxtrace_info_event *auxtrace_info = &(event->auxtrace_info); + size_t event_header_size = sizeof(struct perf_event_header); + size_t info_header_size = 8; + size_t total_size = auxtrace_info->header.size; + size_t priv_size = 0; + size_t num_cpu; + struct cs_etm_auxtrace *etm = 0; + int err = 0, idx = -1; + u64 *ptr; + u64 *hdr = NULL; + u64 **metadata = NULL; + size_t i, j, k; + unsigned pmu_type; + struct int_node *inode; + + /* + * sizeof(auxtrace_info_event::type) + + * sizeof(auxtrace_info_event::reserved) == 8 + */ + info_header_size = 8; + + if (total_size < (event_header_size + info_header_size)) + return -EINVAL; + + priv_size = total_size - event_header_size - info_header_size; + + // First the global part + + ptr = (u64 *) auxtrace_info->priv; + if (ptr[0] == 0) { + hdr = zalloc(sizeof(u64) * CS_HEADER_VERSION_0_MAX); + if (hdr == NULL) + return -ENOMEM; + for (i = 0; i < CS_HEADER_VERSION_0_MAX; ++i) + hdr[i] = ptr[i]; + num_cpu = hdr[CS_PMU_TYPE_CPUS] & 0xffffffff; + pmu_type = (unsigned) ((hdr[CS_PMU_TYPE_CPUS] >> 32) & 0xffffffff); + } else + return -EINVAL; + + /* + * Create an RB tree for traceID-CPU# tuple. Since the conversion has + * to be made for each packet that gets decoded, optimizing access in + * anything other than a sequential array is worth doing. 
+ */ + traceid_list = intlist__new(NULL); + if (!traceid_list) + return -ENOMEM; + + metadata = zalloc(sizeof(u64 *) * num_cpu); + if (!metadata) { + err = -ENOMEM; + goto err_free_traceid_list; + } + + if (metadata == NULL) + return -EINVAL; + + for (j = 0; j < num_cpu; ++j) { + if (ptr[i] == __perf_cs_etmv3_magic) { + metadata[j] = zalloc(sizeof(u64)*CS_ETM_PRIV_MAX); + if (metadata[j] == NULL) + return -ENOMEM; + for (k = 0; k < CS_ETM_PRIV_MAX; k++) + metadata[j][k] = ptr[i+k]; + + /* The traceID is our handle */ + idx = metadata[j][CS_ETM_ETMTRACEIDR]; + i += CS_ETM_PRIV_MAX; + } else if (ptr[i] == __perf_cs_etmv4_magic) { + metadata[j] = zalloc(sizeof(u64)*CS_ETMV4_PRIV_MAX); + if (metadata[j] == NULL) + return -ENOMEM; + for (k = 0; k < CS_ETMV4_PRIV_MAX; k++) + metadata[j][k] = ptr[i+k]; + + /* The traceID is our handle */ + idx = metadata[j][CS_ETMV4_TRCTRACEIDR]; + i += CS_ETMV4_PRIV_MAX; + } + + /* Get an RB node for this CPU */ + inode = intlist__findnew(traceid_list, idx); + + /* Something went wrong, no need to continue */ + if (!inode) { + err = -ENOMEM; + goto err_free_metadata; + } + + /* + * The node for that CPU should not have been taken already. + * Back out if that's the case. 
+ */ + if (inode->priv) { + err = -EINVAL; + goto err_free_metadata; + } + + /* All good, associate the traceID with the CPU# */ + inode->priv = &metadata[j][CS_ETM_CPU]; + + } + + if (i*8 != priv_size) + return -EINVAL; + + if (dump_trace) + cs_etm__print_auxtrace_info(auxtrace_info->priv, num_cpu); + + etm = zalloc(sizeof(struct cs_etm_auxtrace)); + + if (!etm) + return -ENOMEM; + + etm->num_cpu = num_cpu; + etm->pmu_type = pmu_type; + etm->snapshot_mode = (hdr[CS_ETM_SNAPSHOT] != 0); + etm->machine = &session->machines.host; + + + err = auxtrace_queues__init(&etm->queues); + if (err) + goto err_free; + + etm->unknown_thread = thread__new(999999999, 999999999); + if (etm->unknown_thread == NULL) { + err = -ENOMEM; + goto err_free_queues; + } + err = thread__set_comm(etm->unknown_thread, "unknown", 0); + if (err) + goto err_delete_thread; + + if (thread__init_map_groups(etm->unknown_thread, + etm->machine)) { + err = -ENOMEM; + goto err_delete_thread; + } + + etm->timeless_decoding = true; + etm->sampling_mode = false; + etm->metadata = metadata; + etm->session = session; + etm->auxtrace_type = auxtrace_info->type; + + etm->auxtrace.process_event = cs_etm__process_event; + etm->auxtrace.process_auxtrace_event = cs_etm__process_auxtrace_event; + etm->auxtrace.flush_events = cs_etm__flush_events; + etm->auxtrace.free_events = cs_etm__free_events; + etm->auxtrace.free = cs_etm__free; + session->auxtrace = &(etm->auxtrace); + + if (dump_trace) + return 0; + + if (session->itrace_synth_opts && session->itrace_synth_opts->set) + etm->synth_opts = *session->itrace_synth_opts; + else + itrace_synth_opts__set_default(&etm->synth_opts); + etm->synth_opts.branches = false; + etm->synth_opts.callchain = false; + etm->synth_opts.calls = false; + etm->synth_opts.returns = false; + + err = cs_etm__synth_events(etm, session); + if (err) + goto err_delete_thread; + + err = auxtrace_queues__process_index(&etm->queues, session); + if (err) + goto err_delete_thread; + + 
etm->data_queued = etm->queues.populated; + + return 0; + +err_delete_thread: + thread__delete(etm->unknown_thread); +err_free_queues: + auxtrace_queues__free(&etm->queues); + session->auxtrace = NULL; +err_free: + free(etm); +err_free_metadata: + /* No need to check @metadata[j], free(NULL) is supported */ + for (j = 0; j < num_cpu; ++j) + free(metadata[j]); + free(metadata); +err_free_traceid_list: + intlist__delete(traceid_list); + + return err; +} diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 3cc6bc3..32400ac 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -18,6 +18,10 @@ #ifndef INCLUDE__UTIL_PERF_CS_ETM_H__ #define INCLUDE__UTIL_PERF_CS_ETM_H__ +#include "util/event.h" +#include "util/intlist.h" +#include "util/session.h" + /* Versionning header in case things need tro change in the future. That way * decoding of old snapshot is still possible. */ @@ -61,6 +65,9 @@ enum { CS_ETMV4_PRIV_MAX, }; +/* RB tree for quick conversion between traceID and CPUs */ +struct intlist *traceid_list; + #define KiB(x) ((x) * 1024) #define MiB(x) ((x) * 1024 * 1024) @@ -71,4 +78,7 @@ static const u64 __perf_cs_etmv4_magic = 0x4040404040404040ULL; #define CS_ETMV3_PRIV_SIZE (CS_ETM_PRIV_MAX * sizeof(u64)) #define CS_ETMV4_PRIV_SIZE (CS_ETMV4_PRIV_MAX * sizeof(u64)) +int cs_etm__process_auxtrace_info(union perf_event *event, + struct perf_session *session); + #endif diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index df85b9e..bcec333 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -1,3 +1,4 @@ +#include "build-id.h" #include "callchain.h" #include "debug.h" #include "event.h" @@ -483,7 +484,8 @@ int machine__process_comm_event(struct machine *machine, union perf_event *event } int machine__process_lost_event(struct machine *machine __maybe_unused, - union perf_event *event, struct perf_sample *sample __maybe_unused) + union perf_event *event, + struct perf_sample *sample 
__maybe_unused) { dump_printf(": id:%" PRIu64 ": lost:%" PRIu64 "\n", event->lost.id, event->lost.lost); @@ -491,7 +493,8 @@ int machine__process_lost_event(struct machine *machine __maybe_unused, } int machine__process_lost_samples_event(struct machine *machine __maybe_unused, - union perf_event *event, struct perf_sample *sample) + union perf_event *event, + struct perf_sample *sample) { dump_printf(": id:%" PRIu64 ": lost samples :%" PRIu64 "\n", sample->id, event->lost_samples.lost); @@ -711,8 +714,16 @@ static struct dso *machine__get_kernel(struct machine *machine) DSO_TYPE_GUEST_KERNEL); } - if (kernel != NULL && (!kernel->has_build_id)) - dso__read_running_kernel_build_id(kernel, machine); + if (kernel != NULL && (!kernel->has_build_id)) { + if (symbol_conf.vmlinux_name != NULL) { + filename__read_build_id(symbol_conf.vmlinux_name, + kernel->build_id, + sizeof(kernel->build_id)); + kernel->has_build_id = 1; + } else { + dso__read_running_kernel_build_id(kernel, machine); + } + } return kernel; } @@ -726,8 +737,19 @@ static void machine__get_kallsyms_filename(struct machine *machine, char *buf, { if (machine__is_default_guest(machine)) scnprintf(buf, bufsz, "%s", symbol_conf.default_guest_kallsyms); - else - scnprintf(buf, bufsz, "%s/proc/kallsyms", machine->root_dir); + else { + if (symbol_conf.vmlinux_name != 0) { + unsigned char build_id[BUILD_ID_SIZE]; + char build_id_hex[SBUILD_ID_SIZE]; + filename__read_build_id(symbol_conf.vmlinux_name, + build_id, + sizeof(build_id)); + build_id__sprintf(build_id, sizeof(build_id), build_id_hex); + build_id_cache__linkname((char *)build_id_hex, buf, bufsz); + } else { + scnprintf(buf, bufsz, "%s/proc/kallsyms", machine->root_dir); + } + } } const char *ref_reloc_sym_names[] = {"_text", "_stext", NULL}; @@ -736,7 +758,7 @@ const char *ref_reloc_sym_names[] = {"_text", "_stext", NULL}; * Returns the name of the start symbol in *symbol_name. Pass in NULL as * symbol_name if it's not that important. 
*/ -static u64 machine__get_running_kernel_start(struct machine *machine, +static u64 machine__get_kallsyms_kernel_start(struct machine *machine, const char **symbol_name) { char filename[PATH_MAX]; @@ -764,7 +786,7 @@ static u64 machine__get_running_kernel_start(struct machine *machine, int __machine__create_kernel_maps(struct machine *machine, struct dso *kernel) { enum map_type type; - u64 start = machine__get_running_kernel_start(machine, NULL); + u64 start = machine__get_kallsyms_kernel_start(machine, NULL); /* In case of renewal the kernel map, destroy previous one */ machine__destroy_kernel_maps(machine); @@ -1126,10 +1148,10 @@ int machine__create_kernel_maps(struct machine *machine) { struct dso *kernel = machine__get_kernel(machine); const char *name; - u64 addr; + u64 addr = machine__get_kallsyms_kernel_start(machine, &name); int ret; - if (kernel == NULL) + if (!addr || kernel == NULL) return -1; ret = __machine__create_kernel_maps(machine, kernel); @@ -1151,7 +1173,7 @@ int machine__create_kernel_maps(struct machine *machine) */ map_groups__fixup_end(&machine->kmaps); - addr = machine__get_running_kernel_start(machine, &name); + addr = machine__get_kallsyms_kernel_start(machine, &name); if (!addr) { } else if (maps__set_kallsyms_ref_reloc_sym(machine->vmlinux_maps, name, addr)) { machine__destroy_kernel_maps(machine); @@ -1901,7 +1923,7 @@ static int thread__resolve_callchain_sample(struct thread *thread, ip = chain->ips[j]; if (ip < PERF_CONTEXT_MAX) - ++nr_entries; + ++nr_entries; err = add_callchain_ip(thread, cursor, parent, root_al, &cpumode, ip); diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c index fdbbf04..7698291 100644 --- a/tools/perf/util/scripting-engines/trace-event-python.c +++ b/tools/perf/util/scripting-engines/trace-event-python.c @@ -833,6 +833,8 @@ static void python_process_general_event(struct perf_sample *sample, PyInt_FromLong(sample->cpu)); 
pydict_set_item_string_decref(dict_sample, "ip", PyLong_FromUnsignedLongLong(sample->ip)); + pydict_set_item_string_decref(dict_sample, "addr", + PyLong_FromUnsignedLongLong(sample->addr)); pydict_set_item_string_decref(dict_sample, "time", PyLong_FromUnsignedLongLong(sample->time)); pydict_set_item_string_decref(dict_sample, "period", diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c index 11cdde9..c094091 100644 --- a/tools/perf/util/symbol-minimal.c +++ b/tools/perf/util/symbol-minimal.c @@ -342,9 +342,8 @@ int dso__load_sym(struct dso *dso, struct map *map __maybe_unused, if (ret >= 0) dso->is_64_bit = ret; - if (filename__read_build_id(ss->name, build_id, BUILD_ID_SIZE) > 0) { + if ((!dso->has_build_id) && (filename__read_build_id(ss->name, build_id, BUILD_ID_SIZE) > 0)) dso__set_build_id(dso, build_id); - } return 0; } -- 2.7.4