* Stephane Eranian <eran...@google.com> wrote:

> This patch enables support for PERF_SAMPLE_BRANCH_CALL
> on Intel x86 processors. When the processor supports LBR filtering,
> the selection is done in hardware. Otherwise, the filter is
> applied by software. Note that we chose to include zero-length calls
> because they also represent calls.
> 
> Signed-off-by: Stephane Eranian <eran...@google.com>
> ---
>  arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> index ad0b8b0..bfd0b71 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> @@ -555,6 +555,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
>       if (br_type & PERF_SAMPLE_BRANCH_IND_JUMP)
>               mask |= X86_BR_IND_JMP;
>  
> +     if (br_type & PERF_SAMPLE_BRANCH_CALL)
> +             mask |= X86_BR_CALL | X86_BR_ZERO_CALL;

I'm wondering how frequent zero-length calls are. If they still occur in typical
user-space, would it make sense to also have a separate branch sampling type for
zero-length calls?
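
To make the question concrete, here is a rough user-space sketch of how a software filter could treat a separate zero-length-call type next to the existing call type. Every name and bit value in it is a hypothetical stand-in chosen for illustration, not the kernel's PERF_SAMPLE_BRANCH_* or X86_BR_* definitions, and no such separate sampling type is claimed to exist:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-visible sampling types (stand-ins, not kernel values). */
#define SAMPLE_BRANCH_CALL      (1u << 0)
#define SAMPLE_BRANCH_ZERO_CALL (1u << 1) /* the separate type being asked about */

/* Hypothetical internal branch classes used by the software filter. */
#define BR_CALL      (1u << 0)
#define BR_ZERO_CALL (1u << 1)

/* Translate the requested sampling types into a software filter mask. */
static uint32_t sw_filter_mask(uint32_t br_type)
{
	uint32_t mask = 0;

	/* As in the patch: a request for calls also covers zero-length calls. */
	if (br_type & SAMPLE_BRANCH_CALL)
		mask |= BR_CALL | BR_ZERO_CALL;

	/* A separate type would select only the zero-length ones. */
	if (br_type & SAMPLE_BRANCH_ZERO_CALL)
		mask |= BR_ZERO_CALL;

	return mask;
}

/* Return 1 if a branch of the given class passes the filter, 0 otherwise. */
static int keep_branch(uint32_t mask, uint32_t branch_class)
{
	/* Each sampled branch is expected to carry exactly one class bit. */
	assert(branch_class != 0 && (branch_class & (branch_class - 1)) == 0);

	return (mask & branch_class) != 0;
}
```

With a mask built from the separate type alone, keep_branch() would pass only zero-length calls, which is what would let them be counted apart from ordinary calls.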

Intel documents zero length calls as ones that (ab-)use the call instruction to 
push the current IP on the stack:

        call next_addr
next_addr:
        pop %reg

which can take over 10 cycles on certain microarchitectures (and it unbalances 
whatever call stack tracking/caching the CPU does as well).
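
For illustration, the pattern above can be recognized from the instruction bytes alone: a near relative call whose displacement is zero targets the instruction right after itself. The following is a minimal sketch under simplifying assumptions: it only looks at the unprefixed 5-byte "call rel32" encoding (opcode 0xE8), and the enum and function names are made up for this example rather than taken from the kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classification result, for illustration only. */
enum call_kind {
	NOT_NEAR_CALL = 0,    /* not an unprefixed "call rel32" */
	NEAR_CALL = 1,        /* ordinary near relative call */
	ZERO_LENGTH_CALL = 2  /* call whose target is the next instruction */
};

/*
 * Classify the instruction starting at buf.
 * Assumption: only the 5-byte encoding E8 <rel32> is handled; prefixes,
 * indirect calls and far calls are all reported as NOT_NEAR_CALL.
 */
static enum call_kind classify_call(const uint8_t *buf, size_t len)
{
	int32_t rel;

	assert(buf != NULL);

	if (len < 5 || buf[0] != 0xE8)
		return NOT_NEAR_CALL;

	/* Copy the 32-bit displacement; only a comparison with 0 is needed. */
	memcpy(&rel, buf + 1, sizeof(rel));

	/* Displacement 0 means "call next_addr" with next_addr right here. */
	return rel == 0 ? ZERO_LENGTH_CALL : NEAR_CALL;
}
```

A real decoder would have to handle prefixes and the other call encodings; the sketch only shows why such calls can be told apart from ordinary ones.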

So it might make sense to analyze them separately. I guess that's the reason why
Intel added a separate flag for them in the PMU.

Thanks,

        Ingo