Re: [RFC PATCH 1/1] proc: introduce /proc/pid/lbr_stack
On Fri, Feb 27, 2015 at 03:57:02PM -0800, Andi Kleen wrote:
> On Fri, Feb 27, 2015 at 11:05:45PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 27, 2015 at 09:54:34AM -0800, Andi Kleen wrote:
> > > > > perf record doesn't show where you're currently blocked.
> > > >
> > > > Of course it does; look at perf inject -s.
> > >
> > > Trace points don't support the LBR stack.
> >
> > Yes, indeed. But would it not make much more sense to squirrel the LBR
> > state into sched:sched_switch and teach that inject -s thing to dtrt,
> > than to make a proc file that's available on all archs but will only
> > work on 1-2 x86 uarchs and only if you're also running the right magic
> > perf record at the same time?
>
> Yes. It would be nice to capture the whole PMU state in trace points.
> There are use models for this where it can work better than sampling.
>
> But that would be a lot bigger project than this simple file, which is
> already quite useful with minimal effort.

It's also the most horrible hack of an interface ever, so no go.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/1] proc: introduce /proc/pid/lbr_stack
On Fri, Feb 27, 2015 at 08:58:29AM +0100, Peter Zijlstra wrote:
> On Mon, Feb 23, 2015 at 09:44:48AM -0800, Andi Kleen wrote:
> > On Mon, Feb 23, 2015 at 05:49:57PM +0100, Peter Zijlstra wrote:
> > > On Mon, Feb 23, 2015 at 03:43:41AM +, kan.li...@intel.com wrote:
> > > > From: Kan Liang <kan.li...@intel.com>
> > > >
> > > > Haswell has a new feature that utilizes the existing Last Branch
> > > > Record facility to record call chains. It has been implemented in
> > > > perf. The call chains information is saved during perf event
> > > > context. This patch exposes a /proc/pid/lbr_stack file that shows
> > > > the saved LBR call chain information.
> > >
> > > But why? I mean, this thing is only useful if you have a concurrently
> > > running perf record that selects the LBR-stack stuff. And if you have
> > > that, you might as well look at its output instead.
> > >
> > > Why add this unconditional proc file that doesn't function on its own?
> >
> > perf record doesn't show where you're currently blocked.
>
> Of course it does; look at perf inject -s.

Trace points don't support the LBR stack.

-Andi

--
a...@linux.intel.com -- Speaking for myself only
Re: [RFC PATCH 1/1] proc: introduce /proc/pid/lbr_stack
On Fri, Feb 27, 2015 at 09:54:34AM -0800, Andi Kleen wrote:
> > > perf record doesn't show where you're currently blocked.
> >
> > Of course it does; look at perf inject -s.
>
> Trace points don't support the LBR stack.

Yes, indeed. But would it not make much more sense to squirrel the LBR
state into sched:sched_switch and teach that inject -s thing to dtrt,
than to make a proc file that's available on all archs but will only
work on 1-2 x86 uarchs and only if you're also running the right magic
perf record at the same time?
Re: [RFC PATCH 1/1] proc: introduce /proc/pid/lbr_stack
On Fri, Feb 27, 2015 at 11:05:45PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 27, 2015 at 09:54:34AM -0800, Andi Kleen wrote:
> > > > perf record doesn't show where you're currently blocked.
> > >
> > > Of course it does; look at perf inject -s.
> >
> > Trace points don't support the LBR stack.
>
> Yes, indeed. But would it not make much more sense to squirrel the LBR
> state into sched:sched_switch and teach that inject -s thing to dtrt,
> than to make a proc file that's available on all archs but will only
> work on 1-2 x86 uarchs and only if you're also running the right magic
> perf record at the same time?

Yes. It would be nice to capture the whole PMU state in trace points.
There are use models for this where it can work better than sampling.

But that would be a lot bigger project than this simple file, which is
already quite useful with minimal effort.

-Andi

--
a...@linux.intel.com -- Speaking for myself only
Re: [RFC PATCH 1/1] proc: introduce /proc/pid/lbr_stack
On Mon, Feb 23, 2015 at 09:44:48AM -0800, Andi Kleen wrote:
> On Mon, Feb 23, 2015 at 05:49:57PM +0100, Peter Zijlstra wrote:
> > On Mon, Feb 23, 2015 at 03:43:41AM +, kan.li...@intel.com wrote:
> > > From: Kan Liang <kan.li...@intel.com>
> > >
> > > Haswell has a new feature that utilizes the existing Last Branch
> > > Record facility to record call chains. It has been implemented in
> > > perf. The call chains information is saved during perf event
> > > context. This patch exposes a /proc/pid/lbr_stack file that shows
> > > the saved LBR call chain information.
> >
> > But why? I mean, this thing is only useful if you have a concurrently
> > running perf record that selects the LBR-stack stuff. And if you have
> > that, you might as well look at its output instead.
> >
> > Why add this unconditional proc file that doesn't function on its own?
>
> perf record doesn't show where you're currently blocked.

Of course it does; look at perf inject -s.

http://article.gmane.org/gmane.linux.kernel/1225774
http://article.gmane.org/gmane.linux.kernel/1225775
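For readers who have not used the workflow referred to as "perf inject -s"
above, here is a rough sketch of how it is usually driven. This is not taken
from the thread. It assumes a perf build with the sched-stat merge option
(-s / --sched-stat), enough privilege to record scheduler tracepoints, and a
kernel that exposes sched:sched_stat_sleep (some kernels only emit it with
schedstats enabled). The function name sleep_profile and the file names
perf.data.raw / perf.data are made up for illustration. Passing "echo" as the
first argument only prints the commands.

```shell
#!/bin/sh
# Sketch only: sleep_profile and the data file names are hypothetical.
# $1 is a prefix for every command: "echo" for a dry run, "" to execute.
# The remaining arguments are the workload to profile.
sleep_profile() {
    run=$1
    shift
    raw=perf.data.raw
    merged=perf.data

    # 1. Record scheduler tracepoints with call graphs for the workload.
    $run perf record -e sched:sched_stat_sleep -e sched:sched_switch \
        -e sched:sched_process_exit -g -o "$raw" "$@"

    # 2. Merge sched_stat and sched_switch events so that sleep time is
    #    attributed to the call chain where the task went to sleep.
    $run perf inject -s -i "$raw" -o "$merged"

    # 3. Report on the merged data.
    $run perf report --stdio -i "$merged"
}

# Dry run: print the three commands for a workload called ./tchain_edit.
sleep_profile echo ./tchain_edit
```

Note that this relies on tracepoint call chains, which is exactly why the
question of LBR support in trace points comes up in the replies.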
[RFC PATCH 1/1] proc: introduce /proc/pid/lbr_stack
From: Kan Liang <kan.li...@intel.com>

Haswell has a new feature that utilizes the existing Last Branch Record
facility to record call chains. It has been implemented in perf. The call
chains information is saved during perf event context. This patch exposes
a /proc/pid/lbr_stack file that shows the saved LBR call chain information.

Currently, there are already some tools which can dump the stack (e.g.
gstack). However, all of these tools rely on frame pointer or dwarf
information. The LBR call stack facility provides an alternative way to
get the stack. It doesn't need the debug information to construct the
stack. One common case is backtracing through the libpthread library in
glibc, which is partially in assembler and has neither full dwarf
annotation nor frame pointers. It's also helpful for jited code.

Here are some examples. perf_stack uses /proc/pid/lbr_stack to dump
stack information.

Example 1: tchain_edit is a binary with debug information.

  ./tchain_edit &
  [1] 8058
  gstack 8058
  #0 0x0040054d in f3 ()
  #1 0x00400587 in f2 ()
  #2 0x004005b3 in f1 ()
  #3 0x004005f4 in main ()
  ./perf_stack 8058
  #0 0x00400540: f3 at ??:?
  #1 0x0040057d: f2 at ??:?
  #2 0x004005ae: f1 at ??:?
  #3 0x004005f9: main at ??:?

With debug information, both gstack and perf_stack dump stack
information.

Example 2: tchain_edit_ch is a binary which doesn't include either
dwarf or frame pointer information.

  ./tchain_edit_ch &
  [1] 8084
  gstack 8084
  #0 0x00400568 in ?? ()
  #1 0x7fff134a7960 in ?? ()
  #2 0x00400587 in ?? ()
  #3 0x7fff134a7aa8 in ?? ()
  #4 0x0046 in ?? ()
  #5 0x7fff134a7980 in ?? ()
  #6 0x004005b8 in ?? ()
  #7 0x in ?? ()

gstack shows the wrong stack.

  ./perf_stack 8084
  #0 0x00400540: ?? ??:0
  #1 0x00400582: ?? ??:0
  #2 0x004005ae: ?? ??:0
  #3 0x004005f9: ?? ??:0

LBR call stack shows the correct stack.

Here is the perf_stack script.

  # start an LBR call-graph session on the target pid in the background
  perf record --call-graph lbr --pid $1 &
  perf_pid=$!

  # field 39 of /proc/<pid>/stat is the CPU the task last ran on
  running_cpu=`cat /proc/$1/stat | awk '{print $39}'`
  cpu_tmp=$((1 << $running_cpu))
  cpu=`printf 0x%X $cpu_tmp`

  # run something to force context switch
  taskset $cpu sleep 2

  # dump LBR call stack
  i=0
  while read -r line
  do
          function=$(addr2line $line -e /proc/$1/exe -fap)
          echo "#$i $function"
          i=`expr $i + 1`
  done < /proc/$1/lbr_stack

  kill -9 $perf_pid

The LBR call stack has the following known limitations:
 - Only available on Haswell and later platforms
 - Only dumps the user stack
 - Exception handling such as setjmp/longjmp will cause calls/returns
   not to match
 - Pushing a different return address onto the stack will cause
   calls/returns not to match
 - If the call stack is deeper than the LBR, only the last entries are
   captured

Signed-off-by: Kan Liang <kan.li...@intel.com>
---
 arch/x86/include/asm/perf_event.h          |  2 ++
 arch/x86/kernel/cpu/perf_event.c           |  9 +++
 arch/x86/kernel/cpu/perf_event.h           |  9 +--
 arch/x86/kernel/cpu/perf_event_intel.c     |  1 +
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 16 +++
 fs/proc/base.c                             | 43 ++
 include/linux/perf_event.h                 |  8 +-
 7 files changed, 85 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index dc0f6ed..70f07fd 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -11,6 +11,8 @@
 #define X86_PMC_IDX_MAX 64

+#define MAX_LBR_ENTRIES 16
+
 #define MSR_ARCH_PERFMON_PERFCTR0 0xc1
 #define MSR_ARCH_PERFMON_PERFCTR1 0xc2

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index e0dab5c..0b39f72 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1922,6 +1922,14 @@ static void x86_pmu_sched_task(struct perf_event_context *ctx, bool sched_in)
         x86_pmu.sched_task(ctx, sched_in);
 }

+static void x86_pmu_save_lbr_stack(struct perf_event_context *ctx,
+                                   __u64 *lbr_nr,
+                                   struct perf_branch_entry *lbr_entries)
+{
+        if (x86_pmu.save_lbr_stack)
+                x86_pmu.save_lbr_stack(ctx, lbr_nr, lbr_entries);
+}
+
 void perf_check_microcode(void)
 {
         if (x86_pmu.check_microcode)
@@ -1952,6 +1960,7 @@ static struct pmu pmu = {
         .event_idx = x86_pmu_event_idx,
         .sched_task = x86_pmu_sched_task,
+        .save_lbr_stack = x86_pmu_save_lbr_stack,
         .task_ctx_size = sizeof(struct x86_perf_task_context),
 };

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index a371d27..29d8b14 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
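
A side note on the perf_stack script in the patch description: the step that
pins "sleep 2" to the CPU the target last ran on turns field 39 of
/proc/<pid>/stat (the "processor" field, per proc(5)) into a taskset hex
mask. Splitting the stat line on whitespace miscounts fields when the
command name contains spaces, so the sketch below strips everything up to
the closing parenthesis of the comm field first. The helper names stat_cpu
and cpu_to_mask are made up for illustration and are not part of the patch.

```shell
#!/bin/sh
# Sketch only: stat_cpu and cpu_to_mask are hypothetical helper names.

# Print the "processor" field (field 39) of a /proc/<pid>/stat line.
# $1 is the full contents of the stat file.
stat_cpu() {
    rest=${1##*") "}   # drop "pid (comm) "; comm may itself contain spaces
    set -- $rest       # $1 is now field 3 (state), $2 is field 4, ...
    shift 36           # field 39 is the 37th remaining field
    echo "$1"
}

# Print the single-CPU affinity mask for CPU number $1 as hex.
# Relies on shell integer arithmetic, so it only suits small CPU numbers.
cpu_to_mask() {
    printf '0x%X\n' $((1 << $1))
}

cpu_to_mask 3    # prints 0x8
```

With these helpers, the context-switch step could be written as
taskset "$(cpu_to_mask "$(stat_cpu "$(cat /proc/$1/stat)")")" sleep 2,
assuming the target pid is in $1 as in the original script.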
Re: [RFC PATCH 1/1] proc: introduce /proc/pid/lbr_stack
On Mon, Feb 23, 2015 at 03:43:41AM +, kan.li...@intel.com wrote:
> From: Kan Liang <kan.li...@intel.com>
>
> Haswell has a new feature that utilizes the existing Last Branch Record
> facility to record call chains. It has been implemented in perf. The
> call chains information is saved during perf event context. This patch
> exposes a /proc/pid/lbr_stack file that shows the saved LBR call chain
> information.

But why? I mean, this thing is only useful if you have a concurrently
running perf record that selects the LBR-stack stuff. And if you have
that, you might as well look at its output instead.

Why add this unconditional proc file that doesn't function on its own?
Re: [RFC PATCH 1/1] proc: introduce /proc/pid/lbr_stack
On Mon, Feb 23, 2015 at 05:49:57PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 23, 2015 at 03:43:41AM +, kan.li...@intel.com wrote:
> > From: Kan Liang <kan.li...@intel.com>
> >
> > Haswell has a new feature that utilizes the existing Last Branch
> > Record facility to record call chains. It has been implemented in
> > perf. The call chains information is saved during perf event context.
> > This patch exposes a /proc/pid/lbr_stack file that shows the saved
> > LBR call chain information.
>
> But why? I mean, this thing is only useful if you have a concurrently
> running perf record that selects the LBR-stack stuff. And if you have
> that, you might as well look at its output instead.
>
> Why add this unconditional proc file that doesn't function on its own?

perf record doesn't show where you're currently blocked.

-Andi

--
a...@linux.intel.com -- Speaking for myself only