From: Kan Liang <kan.li...@linux.intel.com>

With the LBR stitching approach, the reconstructed LBR call stack
can exceed the HW limitation. However, it may reconstruct invalid call
stacks in some cases, e.g. exception handling such as setjmp/longjmp.
Also, it may increase the processing time, especially when the number of
samples with stitched LBRs is huge.

Add an option to enable the approach.

Reviewed-by: Andi Kleen <a...@linux.intel.com>
Signed-off-by: Kan Liang <kan.li...@linux.intel.com>
---
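For readers skimming the builtin-script.c hunk, the following is a minimal
sketch of how the option reaches the per-thread flag. It is not the perf
code: struct script_sketch, struct thread_sketch and prepare_thread() are
hypothetical stand-ins for struct perf_script, struct thread and the lines
added to process_event().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for perf's struct thread. */
struct thread_sketch {
	bool lbr_stitch_enable;	/* flag the callchain resolver would read */
};

/* Hypothetical, simplified stand-in for struct perf_script. */
struct script_sketch {
	bool stitch_lbr;	/* set when --stitch-lbr is given */
};

/*
 * Mirrors the lines added to process_event(): opt the sample's thread
 * in to stitching before its callchain is resolved. Returns the value
 * of the flag after the call.
 */
static bool prepare_thread(const struct script_sketch *script,
			   struct thread_sketch *thread)
{
	assert(script != NULL && thread != NULL);

	if (script->stitch_lbr)
		thread->lbr_stitch_enable = true;

	return thread->lbr_stitch_enable;
}
```

As in the patch, the flag is only ever set, never cleared, so a thread stays
opted in once the option has been seen.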
 tools/perf/Documentation/perf-script.txt | 11 +++++++++++
 tools/perf/builtin-script.c              |  6 ++++++
 2 files changed, 17 insertions(+)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 2599b057e47b..472f20f1e479 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -426,6 +426,17 @@ include::itrace.txt[]
 --show-on-off-events::
        Show the --switch-on/off events too.
 
+--stitch-lbr::
+       Show callgraph with stitched LBRs, which may give a more complete
+       callgraph. The perf.data file must have been obtained using
+       perf record --call-graph lbr.
+       Disabled by default. In common cases with call stack overflows,
+       it can recreate better call stacks than the default lbr call stack
+       output. But this approach is not foolproof. There can be cases
+       where it creates incorrect call stacks from incorrect matches.
+       Known limitations include exception handling such as
+       setjmp/longjmp, where calls and returns do not match.
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 67be8d31afab..0fc4d07864d1 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1641,6 +1641,7 @@ struct perf_script {
        bool                    show_bpf_events;
        bool                    allocated;
        bool                    per_event_dump;
+       bool                    stitch_lbr;
        struct evswitch         evswitch;
        struct perf_cpu_map     *cpus;
        struct perf_thread_map *threads;
@@ -1867,6 +1868,9 @@ static void process_event(struct perf_script *script,
        if (PRINT_FIELD(IP)) {
                struct callchain_cursor *cursor = NULL;
 
+               if (script->stitch_lbr)
+                       al->thread->lbr_stitch_enable = true;
+
                if (symbol_conf.use_callchain && sample->callchain &&
                    thread__resolve_callchain(al->thread, &callchain_cursor, evsel,
                                              sample, NULL, NULL, scripting_max_stack) == 0)
@@ -3556,6 +3560,8 @@ int cmd_script(int argc, const char **argv)
                   "file", "file saving guest os /proc/kallsyms"),
        OPT_STRING(0, "guestmodules", &symbol_conf.default_guest_modules,
                   "file", "file saving guest os /proc/modules"),
+       OPT_BOOLEAN('\0', "stitch-lbr", &script.stitch_lbr,
+                   "Enable LBR callgraph stitching approach"),
        OPTS_EVSWITCH(&script.evswitch),
        OPT_END()
        };
-- 
2.17.1
