* Stephane Eranian <eran...@google.com> wrote:

> This patch enables support for PERF_SAMPLE_BRANCH_CALL
> on Intel x86 processors. When the processor supports LBR filtering,
> the selection is done in hardware. Otherwise, the filter is
> applied by software. Note that we chose to include zero-length calls
> because they also represent calls.
>
> Signed-off-by: Stephane Eranian <eran...@google.com>
> ---
>  arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> index ad0b8b0..bfd0b71 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> @@ -555,6 +555,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
>  	if (br_type & PERF_SAMPLE_BRANCH_IND_JUMP)
>  		mask |= X86_BR_IND_JMP;
>
> +	if (br_type & PERF_SAMPLE_BRANCH_CALL)
> +		mask |= X86_BR_CALL | X86_BR_ZERO_CALL;
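To make the effect of this hunk concrete, here is a minimal user-space sketch of a software call filter. It is not the kernel code: the BR_* and SAMPLE_* constants and the helpers build_sw_mask(), classify_call() and keep_branch() are hypothetical names with made-up values, and the classifier only handles the direct near call encoding (opcode 0xE8 followed by a 32-bit displacement) on a little-endian host. It treats a call with a zero displacement, i.e. one whose target is the very next instruction, as a zero-length call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical branch classes; the values are NOT the kernel's X86_BR_* bits. */
enum {
	BR_NONE      = 0,
	BR_CALL      = 1 << 0,	/* ordinary direct call */
	BR_ZERO_CALL = 1 << 1,	/* call whose target is the next instruction */
	BR_IND_JMP   = 1 << 2,	/* indirect jump */
};

/* Hypothetical sample-type request bits, standing in for PERF_SAMPLE_BRANCH_*. */
enum {
	SAMPLE_IND_JUMP = 1 << 0,
	SAMPLE_CALL     = 1 << 1,
};

/* Translate the requested sample types into a software filter mask. */
static unsigned int build_sw_mask(unsigned int br_type)
{
	unsigned int mask = 0;

	if (br_type & SAMPLE_IND_JUMP)
		mask |= BR_IND_JMP;

	/* As in the quoted hunk: a request for calls also selects zero-length calls. */
	if (br_type & SAMPLE_CALL)
		mask |= BR_CALL | BR_ZERO_CALL;

	return mask;
}

/* Classify a direct near call: opcode 0xE8 followed by a 32-bit displacement. */
static unsigned int classify_call(const uint8_t *insn, size_t len)
{
	int32_t rel;

	assert(insn != NULL);

	if (len < 5 || insn[0] != 0xE8)
		return BR_NONE;

	/* Assumes a little-endian host, matching the x86 encoding. */
	memcpy(&rel, insn + 1, sizeof(rel));

	/* Displacement 0 means the call lands on the instruction that follows it. */
	return rel == 0 ? BR_ZERO_CALL : BR_CALL;
}

/* A branch record is kept only if its class is selected by the mask. */
static int keep_branch(unsigned int type, unsigned int mask)
{
	return (type & mask) != 0;
}
```

With this shape, a request for calls keeps both BR_CALL and BR_ZERO_CALL records. Splitting them into separate sample types would only mean giving BR_ZERO_CALL its own request bit in build_sw_mask().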

I'm wondering how frequent zero-length calls are. If they still occur
in typical user-space, would it make sense to also have a separate
branch sampling type for zero length calls?

Intel documents zero length calls as ones that (ab-)use the call
instruction to push the current IP on the stack:

	call next_addr
  next_addr:
	pop %reg

which can take over 10 cycles on certain microarchitectures (and it
unbalances whatever call stack tracking/caching the CPU does as well).

So it might make sense to analyze them separately. I guess that's the
reason why Intel added a separate flag for them in the PMU.

Thanks,

	Ingo