RE: [PATCH V4 1/2] perf ignore LBR and extra_regs.

2014-07-09 Thread Liang, Kan
On Tue, Jul 08, 2014 at 09:49:40AM -0700, kan.li...@intel.com wrote: --- a/arch/x86/kernel/cpu/perf_event.h +++ b/arch/x86/kernel/cpu/perf_event.h @@ -464,6 +464,12 @@ struct x86_pmu { */ struct extra_reg *extra_regs; unsigned int er_flags; + /* +* EXTRA REG

RE: [PATCH V4 1/2] perf ignore LBR and extra_regs.

2014-07-09 Thread Liang, Kan
On Tue, Jul 08, 2014 at 09:49:40AM -0700, kan.li...@intel.com wrote: +/* + * Under certain circumstances, accessing certain MSRs may cause a #GP. + * The function tests whether the input MSR can be safely accessed. + */ +static inline bool check_msr(unsigned long msr) { + u64 value; + +
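A minimal sketch of the probe being discussed, assuming the usual fault-safe MSR accessors: try to read the MSR and write the same value back; if either access faults (as it does when a KVM guest PMU does not implement the register), report the MSR as unusable. The helper name follows the thread; the body is illustrative, not the final upstream implementation.

static bool check_msr(unsigned long msr)
{
        u64 val;

        /* rdmsrl_safe()/wrmsrl_safe() return non-zero if the access #GPs */
        if (rdmsrl_safe(msr, &val))
                return false;
        if (wrmsrl_safe(msr, val))
                return false;

        return true;
}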

RE: [PATCH V4 1/2] perf ignore LBR and extra_regs.

2014-07-09 Thread Liang, Kan
-Original Message- From: Peter Zijlstra [mailto:pet...@infradead.org] Sent: Wednesday, July 09, 2014 10:58 AM To: Liang, Kan Cc: a...@firstfloor.org; linux-kernel@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH V4 1/2] perf ignore LBR and extra_regs. On Wed, Jul 09

RE: [PATCH V4 1/2] perf ignore LBR and extra_regs.

2014-07-09 Thread Liang, Kan
On Wed, Jul 09, 2014 at 02:32:28PM +, Liang, Kan wrote: On Tue, Jul 08, 2014 at 09:49:40AM -0700, kan.li...@intel.com wrote: +/* + * Under certain circumstances, accessing certain MSRs may cause a #GP. + * The function tests whether the input MSR can be safely

RE: [PATCH V2 2/3] perf protect LBR when Intel PT is enabled.

2014-07-07 Thread Liang, Kan
On Thu, Jul 03, 2014 at 05:52:37PM +0200, Andi Kleen wrote: If there's active LBR users out there, we should refuse to enable PT and vice versa. This doesn't work, e.g. hardware debuggers can take over at any time. Tough cookies. Hardware debuggers get to deal with whatever crap

RE: [PATCH V3 1/2] perf ignore LBR and offcore_rsp.

2014-07-08 Thread Liang, Kan
On Mon, Jul 07, 2014 at 06:34:25AM -0700, kan.li...@intel.com wrote: + /* +* Accessing LBR MSRs may cause #GP under certain circumstances. +* E.g. KVM doesn't support LBR MSRs. +* Check all LBR MSRs here. +* Disable LBR access if any LBR MSR cannot be accessed. +
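An illustrative way to apply such a probe to the LBR case described above: walk the TOS and every from/to MSR, and drop LBR support entirely if any of them faults. The x86_pmu fields (lbr_tos, lbr_from, lbr_to, lbr_nr) are the real ones; the wrapper function and check_msr() are sketches carried over from the discussion.

static void check_lbr_access(void)
{
        int i;

        if (!check_msr(x86_pmu.lbr_tos))
                goto disable;

        for (i = 0; i < x86_pmu.lbr_nr; i++) {
                if (!check_msr(x86_pmu.lbr_from + i) ||
                    !check_msr(x86_pmu.lbr_to + i))
                        goto disable;
        }
        return;

disable:
        /* No usable LBR MSRs: behave as if the CPU had no LBRs at all */
        x86_pmu.lbr_nr = 0;
}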

RE: [PATCH V3 1/2] perf ignore LBR and offcore_rsp.

2014-07-08 Thread Liang, Kan
-Original Message- From: Peter Zijlstra [mailto:pet...@infradead.org] Sent: Tuesday, July 08, 2014 5:29 AM To: Liang, Kan Cc: a...@firstfloor.org; linux-kernel@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH V3 1/2] perf ignore LBR and offcore_rsp. On Mon, Jul 07, 2014

RE: [PATCH V5 1/2] perf ignore LBR and extra_regs

2014-07-14 Thread Liang, Kan
To reproduce the issue, please build the kernel with CONFIG_KVM_INTEL=y (for the host kernel), and CONFIG_PARAVIRT=n and CONFIG_KVM_GUEST=n (for the guest kernel). I'm not sure this is a useful patch. This is #GP'ing just because of a limitation in the PMU; just compile the

RE: [PATCH V5 1/2] perf ignore LBR and extra_regs

2014-07-14 Thread Liang, Kan
-Original Message- From: Paolo Bonzini [mailto:pbonz...@redhat.com] Sent: Monday, July 14, 2014 9:40 AM To: Liang, Kan; Peter Zijlstra Cc: a...@firstfloor.org; linux-kernel@vger.kernel.org; k...@vger.kernel.org Subject: Re: [PATCH V5 1/2] perf ignore LBR and extra_regs Il 14/07

RE: [PATCH V5 1/2] perf ignore LBR and extra_regs

2014-07-14 Thread Liang, Kan
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h index 3b2f9bd..992c678 100644 --- a/arch/x86/kernel/cpu/perf_event.h +++ b/arch/x86/kernel/cpu/perf_event.h @@ -464,6 +464,12 @@ struct x86_pmu { */ struct extra_reg *extra_regs;

RE: [PATCH V6 1/2] perf ignore LBR and extra_rsp

2014-07-15 Thread Liang, Kan
Since nobody ever treats EVENT_EXTRA_END as an actual event, the value of .extra_msr_access is irrelevant, this leaves the only 'possible' value 'true' and we can delete all those changes. Right. Which, combined with a few whitespace cleanups, gives the below patch. Thanks. Your

RE: [PATCH V2 1/3] perf ignore LBR and offcore_rsp.

2014-07-02 Thread Liang, Kan
Signed-off-by: Andi Kleen a...@linux.intel.com I did not contribute to this patch, so please remove that SOB. OK Signed-off-by: Kan Liang kan.li...@intel.com struct extra_reg *extra_regs; unsigned int er_flags; + bool extra_msr_access; /* EXTRA

RE: [PATCH V2 1/3] perf ignore LBR and offcore_rsp.

2014-07-02 Thread Liang, Kan
On Wed, Jul 2, 2014 at 2:14 PM, kan.li...@intel.com wrote: From: Kan Liang kan.li...@intel.com x86, perf: Protect LBR and offcore rsp against KVM lying. With -cpu host, KVM reports LBR and offcore support if the host has it. When the guest perf driver tries to access LBR or
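A small sketch of how the extra_msr_access flag quoted above is meant to be consumed: gate the write to the offcore/extra MSR on the boot-time probe so a guest whose KVM PMU does not implement the register never hits the #GP. The exact home of the flag differs between revisions of the series (x86_pmu here, per struct extra_reg later); the helper below is illustrative only.

static void maybe_write_extra_reg(struct hw_perf_event *hwc)
{
        if (!x86_pmu.extra_msr_access)
                return;         /* MSR unusable, e.g. under KVM: skip the write */

        wrmsrl(hwc->extra_reg.reg, hwc->extra_reg.config);
}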

RE: [PATCH V6 13/17] perf, x86: enable LBR callstack when recording callchain

2014-10-24 Thread Liang, Kan
On Sun, Oct 19, 2014 at 05:55:08PM -0400, Kan Liang wrote: Only enable LBR callstack when the user requires an fp callgraph. The feature is not available when PERF_SAMPLE_BRANCH_STACK or PERF_SAMPLE_STACK_USER is required. Also, this feature only affects how the user callchain is obtained. The kernel
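A sketch of the implicit-enable rule debated in this thread, using the real perf_event_attr sample_type bits; the helper name and the exact condition layout are illustrative.

static bool wants_implicit_lbr_callstack(struct perf_event_attr *attr)
{
        /* the user asked for a callchain ... */
        if (!(attr->sample_type & PERF_SAMPLE_CALLCHAIN))
                return false;
        /* ... including user space ... */
        if (attr->exclude_user)
                return false;
        /* ... without an explicit branch-stack or user-stack request */
        if (attr->sample_type & (PERF_SAMPLE_BRANCH_STACK | PERF_SAMPLE_STACK_USER))
                return false;

        return true;
}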

RE: [PATCH V6 17/17] perf tools: choose to dump callchain from LBR and FP

2014-10-24 Thread Liang, Kan
On Fri, Oct 24, 2014 at 03:36:00PM +0200, Jiri Olsa wrote: On Sun, Oct 19, 2014 at 05:55:12PM -0400, Kan Liang wrote: SNIP - return 0; - } - continue; + mix_chain_nr = i + 2 + lbr_nr; + if

RE: [PATCH v5 00/16] perf, x86: Haswell LBR call stack support

2014-09-05 Thread Liang, Kan
Hi Peter and all, Did you get a chance to review these patches? Zheng is away. Should I re-send the patches? Thanks, Kan For many profiling tasks we need the callgraph. For example we often need to see the caller of a lock or the caller of a memcpy or other library function to actually

RE: [PATCH v4 3/3] perf tools: Add support to new style format of kernel PMU event

2014-09-08 Thread Liang, Kan
On Tue, Sep 02, 2014 at 11:29:30AM -0400, kan.li...@intel.com wrote: From: Kan Liang kan.li...@intel.com SNIP } +| +PE_KERNEL_PMU_EVENT +{ + struct parse_events_evlist *data = _data; + struct list_head *head = malloc(sizeof(*head)); + struct parse_events_term *term;

RE: [PATCH V5 11/16] perf, core: Pass perf_sample_data to perf_callchain()

2014-10-07 Thread Liang, Kan
So I don't like this. Why not use the regular PERF_SAMPLE_BRANCH_STACK output to generate the stuff from? We already have two different means, with different transport, for callchains anyhow, so a third really won't matter. I'm not sure what you mean by using the regular

RE: [PATCH V5 14/16] perf, x86: enable LBR callstack when recording callchain

2014-10-07 Thread Liang, Kan
On Tue, Oct 07, 2014 at 03:00:43AM +, Liang, Kan wrote: On Wed, Sep 10, 2014 at 10:09:11AM -0400, kan.li...@intel.com wrote: From: Kan Liang kan.li...@intel.com If a task specific event wants user space callchain but does not want branch stack sampling, enable

RE: [PATCH V7 0/4] perf tools: pmu event new style format fix

2014-10-05 Thread Liang, Kan
Kan Liang (4): Revert perf tools: Default to cpu// for events v5 perf tools: parse the pmu event prefix and suffix perf tools: Add support to new style format of kernel PMU event perf tools: Add test case for pmu event new style format got test failure with your patchset:

RE: [PATCH V5 08/16] perf, x86: track number of events that use LBR callstack

2014-10-06 Thread Liang, Kan
On Wed, Sep 10, 2014 at 10:09:05AM -0400, kan.li...@intel.com wrote: @@ -204,9 +204,15 @@ void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in) } } +static inline bool branch_user_callstack(unsigned br_sel) { + return (br_sel & X86_BR_USER) && (br_sel

RE: [PATCH V5 14/16] perf, x86: enable LBR callstack when recording callchain

2014-10-06 Thread Liang, Kan
On Wed, Sep 10, 2014 at 10:09:11AM -0400, kan.li...@intel.com wrote: From: Kan Liang kan.li...@intel.com If a task specific event wants user space callchain but does not want branch stack sampling, enable the LBR call stack facility implicitly. The LBR call stack facility can help

RE: [PATCH V5 11/16] perf, core: Pass perf_sample_data to perf_callchain()

2014-10-06 Thread Liang, Kan
-Original Message- From: Peter Zijlstra [mailto:pet...@infradead.org] Sent: Wednesday, September 24, 2014 10:15 AM To: Liang, Kan Cc: eran...@google.com; linux-kernel@vger.kernel.org; mi...@redhat.com; pau...@samba.org; a...@kernel.org; a...@linux.intel.com; Yan, Zheng Subject: Re

RE: [PATCH V6 2/3] perf tools: parse the pmu event prefix and surfix

2014-10-02 Thread Liang, Kan
+static int +comp_pmu(const void *p1, const void *p2) { + struct perf_pmu_event_symbol *pmu1 = + (struct perf_pmu_event_symbol *) p1; + struct perf_pmu_event_symbol *pmu2 = + (struct perf_pmu_event_symbol *) p2; please keep it on one line,
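The comparator above exists so the PMU event names read from sysfs can be kept in a sorted array and probed during parsing; below is a self-contained illustration, with the struct layout taken from the patch and the qsort()/bsearch() harness added here as an assumption.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct perf_pmu_event_symbol {
        char *symbol;
        int   type;     /* prefix-only vs. full-name style, per the series */
};

static int comp_pmu(const void *p1, const void *p2)
{
        const struct perf_pmu_event_symbol *pmu1 = p1;
        const struct perf_pmu_event_symbol *pmu2 = p2;

        return strcmp(pmu1->symbol, pmu2->symbol);
}

int main(void)
{
        struct perf_pmu_event_symbol tbl[] = {
                { "cycles-ct", 1 }, { "mem-loads", 1 }, { "tx-start", 1 },
        };
        struct perf_pmu_event_symbol key = { "mem-loads", 0 };
        struct perf_pmu_event_symbol *found;

        qsort(tbl, 3, sizeof(tbl[0]), comp_pmu);
        found = bsearch(&key, tbl, 3, sizeof(tbl[0]), comp_pmu);
        printf("%s\n", found ? "known PMU event" : "not a PMU event");
        return 0;
}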

RE: [PATCH V6 0/3] perf tools: pmu event new style format fix

2014-10-02 Thread Liang, Kan
On Thu, Sep 11, 2014 at 03:08:56PM -0400, kan.li...@intel.com wrote: From: Kan Liang kan.li...@intel.com There are two types of pmu event style formats, pmu_event_name or cpu/pmu_event_name/. However, there is a bug in supporting these two formats, especially when they are mixed with

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-18 Thread Liang, Kan
On Tue, Nov 18, 2014 at 03:13:50PM +0900, Namhyung Kim wrote: SNIP + * in from register, while the callee is stored + * in to register. + * For example, there is a call stack + *

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-18 Thread Liang, Kan
whole stack. + */ Andi is using some sanity checks: http://marc.info/?l=linux-kernel&m=141584447819894&w=2 I guess this could be applied in here, once his patch gets in. Are you suggesting that I remove the comments, or rebase the whole

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-19 Thread Liang, Kan
On Tue, 18 Nov 2014 14:01:06 +, Kan Liang wrote: On Fri, 14 Nov 2014 08:44:12 -0500, kan liang wrote: +/* LBR only affects the user callchain */ +if (i != chain_nr) { +struct branch_stack *lbr_stack = sample-

RE: [PATCH V4 3/3] perf tool: Add sort key symoff for perf diff

2014-11-19 Thread Liang, Kan
On Tue, 18 Nov 2014 11:38:20 -0500, kan liang wrote: From: Kan Liang kan.li...@intel.com Sometimes, especially when debugging scaling issues, the function-level diff may be at too high a granularity. The user may want to do deeper diff analysis for some cache or lock issue. The symoff key can let

RE: [PATCH V4 1/3] perf tools: enable LBR call stack support

2014-11-19 Thread Liang, Kan
On Tue, 18 Nov 2014 16:36:55 -0500, kan liang wrote: From: Kan Liang kan.li...@intel.com Currently, there are two call chain recording options, fp and dwarf. Haswell has a new feature that utilizes the existing LBR facility to record call chains. So it provides a third option to

RE: [PATCH V4 3/3] perf tool: Add sort key symoff for perf diff

2014-11-19 Thread Liang, Kan
Em Tue, Nov 18, 2014 at 11:38:20AM -0500, kan.li...@intel.com escreveu: From: Kan Liang kan.li...@intel.com Sometimes, especially when debugging scaling issues, the function-level diff may be at too high a granularity. The user may want to do deeper diff analysis for some cache or lock issue. The

RE: [PATCH V4 1/3] perf tools: enable LBR call stack support

2014-11-20 Thread Liang, Kan
On Thu, Nov 20, 2014 at 7:32 AM, Namhyung Kim namhy...@kernel.org wrote: On Wed, 19 Nov 2014 14:32:08 +, Kan Liang wrote: On Tue, 18 Nov 2014 16:36:55 -0500, kan liang wrote: + if (attr->exclude_user) { + attr->exclude_user = 0;

RE: [PATCH V5 2/3] perf tools: parse the pmu event prefix and surfix

2014-09-11 Thread Liang, Kan
SNIP return 0; } +static int +comp_pmu(const void *p1, const void *p2) { + struct perf_pmu_event_symbol *pmu1 = + (struct perf_pmu_event_symbol *) p1; + struct perf_pmu_event_symbol *pmu2 = + (struct perf_pmu_event_symbol *) p2;

RE: [PATCH V5 2/3] perf tools: parse the pmu event prefix and surfix

2014-09-11 Thread Liang, Kan
On Wed, Sep 10, 2014 at 01:55:31PM -0400, kan.li...@intel.com wrote: SNIP + struct perf_pmu_event_symbol *pmu2 = + (struct perf_pmu_event_symbol *) p2; + + return strcmp(pmu1->symbol, pmu2->symbol); } + +/* + * Read the pmu events list from sysfs + *

RE: [PATCH V3 2/3] perf tool: Move cpumode resolve code to add_callchain_ip

2014-11-21 Thread Liang, Kan
-Original Message- From: Jiri Olsa [mailto:jo...@redhat.com] Sent: Tuesday, November 18, 2014 3:25 AM To: Liang, Kan Cc: a...@kernel.org; a.p.zijls...@chello.nl; eran...@google.com; linux- ker...@vger.kernel.org; mi...@redhat.com; pau...@samba.org; a...@linux.intel.com Subject

RE: [PATCH V8 00/14] perf, x86: Haswell LBR call stack support (kernel)

2014-11-10 Thread Liang, Kan
On Thu, Nov 06, 2014 at 09:54:17AM -0500, Kan Liang wrote: Yan, Zheng (13): perf, x86: Reduce lbr_sel_map size perf, core: introduce pmu context switch callback perf, x86: use context switch callback to flush LBR stack perf, x86: Basic Haswell LBR call stack support

RE: [PATCH 0/2] perf tool: Haswell LBR call stack support (user)

2014-11-10 Thread Liang, Kan
acme, jolsa, ACK on these two? These patches are pure user tool patches. I usually send the tool patches to them for review. Also, Jolsa had some comments on the previous perf tool part. So I would like them to have a look at the new changes to the user tool. Thanks, Kan

RE: [PATCH 1/2] perf tools: enable LBR call stack support

2014-11-12 Thread Liang, Kan
PERF_SAMPLE_BRANCH_USER | + PERF_SAMPLE_BRANCH_CALL_STACK; + attr->exclude_user = 0; I think we shouldn't silently change attr->exclude_user; if it was defined, we need to display a warning that we are changing it, or fail. Right, I will display a warning here.
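A sketch of the evsel setup being reviewed here: branch-stack sampling is switched into call-stack mode for user space, and exclude_user is cleared with a warning rather than silently, as requested. Names mirror the patch context (pr_warning() is the tool's helper); the exact message and function boundaries are illustrative.

static void setup_lbr_callstack(struct perf_event_attr *attr)
{
        attr->sample_type        |= PERF_SAMPLE_BRANCH_STACK;
        attr->branch_sample_type  = PERF_SAMPLE_BRANCH_USER |
                                    PERF_SAMPLE_BRANCH_CALL_STACK;

        if (attr->exclude_user) {
                pr_warning("LBR callstack needs user-space samples; clearing exclude_user\n");
                attr->exclude_user = 0;
        }
}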

RE: [PATCH 2/2] perf tools: Construct LBR call chain

2014-11-12 Thread Liang, Kan
+ + printf("... chain: nr:%" PRIu64 "\n", total_nr); + + for (i = 0; i < callchain_nr + 1; i++) printf(". %2d: %016" PRIx64 "\n", i, sample->callchain->ips[i]); so if there's lbr callstack info we don't display the user stack part from the standard callchain? I

RE: [PATCH 1/1] perf tools: perf diff for different binaries

2014-11-03 Thread Liang, Kan
@@ -1164,6 +1164,9 @@ int cmd_diff(int argc, const char **argv, const char *prefix __maybe_unused) if (setup_sorting() 0) usage_with_options(diff_usage, options); + if (sort__has_sym_name) + tool.mmap2 = perf_event__process_mmap2; why is the mmap2

RE: [PATCH V6 01/17] perf, x86: Reduce lbr_sel_map size

2014-11-03 Thread Liang, Kan
Hi Peter, Did you get a chance to review the rest of the patch set? Thanks, Kan On Sun, Oct 19, 2014 at 05:54:56PM -0400, Kan Liang wrote: This should still very much have: From: Yan, Zheng zheng.z@intel.com Seeing how you did not write this patch, probably true for all the

RE: [PATCH V5 3/3] perf tool: check buildid for symoff

2014-11-25 Thread Liang, Kan
+ data__for_each_file_new(i, d) { + k_dsos_tmp = d->session->machines.host.kernel_dsos; + u_dsos_tmp = d->session->machines.host.user_dsos; + + if (!dsos__build_ids_equal(base_k_dsos, k_dsos_tmp)) + pr_warning("The perf.data come from

RE: [PATCH V5 3/3] perf tool: check buildid for symoff

2014-11-26 Thread Liang, Kan
On Mon, Nov 24, 2014 at 11:00:29AM -0500, Kan Liang wrote: From: Kan Liang kan.li...@intel.com symoff can support both the same and different binaries. However, the offset may change between different binaries. This patch checks the buildid of perf.data. If they are from

RE: [PATCH 1/1] perf tools: perf diff for different binaries

2014-11-06 Thread Liang, Kan
Hi Kan, On Thu, Nov 6, 2014 at 2:28 AM, Liang, Kan kan.li...@intel.com wrote: Hi Kan, On Tue, 4 Nov 2014 17:07:43 +, Kan Liang wrote: What about setting the sort_sym.se_collapse in data_process() so that hists__match() can use symbol names? Yes, we can set it if we

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-17 Thread Liang, Kan
SNIP diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index f4478ce..335c3a9 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -557,15 +557,63 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event, return 0;

RE: [PATCH V3 1/3] perf tools: enable LBR call stack support

2014-11-18 Thread Liang, Kan
On Fri, 14 Nov 2014 08:44:10 -0500, kan liang wrote: From: Kan Liang kan.li...@intel.com Currently, there are two call chain recording options, fp and dwarf. Haswell has a new feature that utilizes the existing LBR facility to record call chains. So it provides a third option to

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-18 Thread Liang, Kan
On Fri, 14 Nov 2014 08:44:12 -0500, kan liang wrote: + /* LBR only affects the user callchain */ + if (i != chain_nr) { + struct branch_stack *lbr_stack = sample->branch_stack; + int lbr_nr = lbr_stack->nr; + /*
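A sketch of the merge step under discussion, assuming i is the index where PERF_CONTEXT_USER was found in the sampled callchain: the kernel-side frames (and the marker) are copied as-is, the interrupted user instruction comes from the first LBR target, and the remaining user callers come from the LBR sources. mix_chain_nr = i + 2 + lbr_nr matches the quoted patch; the helper name and everything else here is illustrative.

static void append_lbr_user_chain(struct perf_sample *sample, int i)
{
        struct branch_stack *lbr_stack = sample->branch_stack;
        int lbr_nr = lbr_stack->nr;
        int mix_chain_nr = i + 2 + lbr_nr;
        int j;

        for (j = 0; j < mix_chain_nr; j++) {
                u64 ip;

                if (j <= i)
                        ip = sample->callchain->ips[j];          /* kernel frames + marker */
                else if (j == i + 1)
                        ip = lbr_stack->entries[0].to;           /* interrupted user IP */
                else
                        ip = lbr_stack->entries[j - i - 2].from; /* user callers */

                /* ... resolve 'ip' to a map/symbol as in the FP-only path ... */
                (void)ip;
        }
}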

RE: [PATCH V5 1/1] perf tool:perf diff support for different binaries

2014-12-02 Thread Liang, Kan
Em Fri, Nov 21, 2014 at 10:55:48AM -0500, kan.li...@intel.com escreveu: From: Kan Liang kan.li...@intel.com Currently, the perf diff only works with the same binaries. That's because it compares the symbol start address. It doesn't work if the perf.data comes from different binaries.

RE: [PATCH 1/1] perf, x86: bug fix for cycles:p and cycles:pp on SLM

2014-12-22 Thread Liang, Kan
On Mon, Dec 08, 2014 at 06:27:43AM -0800, kan.li...@intel.com wrote: +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -568,8 +568,8 @@ struct event_constraint intel_atom_pebs_event_constraints[] = { }; struct event_constraint intel_slm_pebs_event_constraints[] = { - /*

RE: [PATCH 1/1] perf, core: Use sample period avg as child event's initial period

2014-12-15 Thread Liang, Kan
On Fri, Dec 12, 2014 at 10:10:35AM -0500, kan.li...@intel.com wrote: That's because, in inherit_event(), the period for the child event is inherited from the parent's parent's event, which usually has the default sample_period of 1. Each child event has to recalculate the period from 1 every time.
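A sketch of the proposal in this thread: in frequency mode, seed a freshly inherited child event from the parent's current period instead of restarting at 1, so the child does not have to re-converge. The struct fields (attr.freq, hw.sample_period, hw.period_left) are real; the helper and the policy of copying the parent's current period (the patch itself tracks an average) are illustrative.

static void seed_child_period(struct perf_event *child,
                              struct perf_event *parent)
{
        if (!child->attr.freq)
                return;                 /* fixed-period events keep their period */

        if (parent->hw.sample_period > 1)
                child->hw.sample_period = parent->hw.sample_period;

        local64_set(&child->hw.period_left, 0);
}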

RE: [PATCH 1/1] perf, core: Use sample period avg as child event's initial period

2014-12-16 Thread Liang, Kan
On Mon, Dec 15, 2014 at 09:17:33PM +, Liang, Kan wrote: This doesn't seem to make any kind of sense, and it's weirdly implemented. So why would you push anything to the original parent? Your description states that the parent event usually has 1, and then you argue about

RE: [PATCH V8 0/4] perf tools: pmu event new style format fix

2014-10-13 Thread Liang, Kan
Hi Jolsa, Does the new patch set work on your machine? I tested the V8 patch set on Haswell, Ivybridge and Romley platforms, and I cannot reproduce the issue you mentioned. Could you please try the latest V8 patch? Thanks, Kan From: Kan Liang kan.li...@intel.com There are two types of pmu event

RE: [PATCH 1/1] perf tools: perf diff for different binaries

2014-11-04 Thread Liang, Kan
Hi Namhyung, tchain_edit[.] f1 0.14%3.913444 tchain_edit[.] f2 99.82%1.005478 tchain_edit[.] f3 Hmm.. I think it should be a default behavior for perf diff, otherwise -s symbol is almost meaningless IMHO. I

RE: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Liang, Kan
Thanks for your comments. There has been a lot of discussion about the patch, and it's hard to reply to each point one by one, so I'll try to address all the concerns here. The patchset doesn't try to introduce a 3rd independent callchain option. That's because LBR callstack has some limitations (only available for

RE: [PATCH V7 00/17] perf, x86: Haswell LBR call stack support

2014-11-05 Thread Liang, Kan
So if I take all except 11,13,16,17 but instead do something like the below, everything will work just fine, right? Or am I missing something? Yes, it should work. Then LBR callstack will rely on the user to enable it. But the user never gets the LBR callstack data even if it's available. I'm

RE: [PATCH V7 00/17] perf, x86: Haswell LBR call stack support

2014-11-05 Thread Liang, Kan
On Wed, Nov 05, 2014 at 04:22:09PM +, Liang, Kan wrote: So if I take all except 11,13,16,17 but instead do something like the below, everything will work just fine, right? Or am I missing something? Yes, it should work. Then LBR callstack will rely on user

RE: [PATCH 1/1] perf tools: perf diff for different binaries

2014-11-05 Thread Liang, Kan
Hi Kan, On Tue, 4 Nov 2014 17:07:43 +, Kan Liang wrote: Hi Namhyung, tchain_edit[.] f1 0.14%3.913444 tchain_edit[.] f2 99.82%1.005478 tchain_edit[.] f3 Hmm.. I think it should be a default

RE: [PATCH V5 3/3] perf tool: check buildid for symoff

2014-11-27 Thread Liang, Kan
Hi Kan, On Mon, 24 Nov 2014 11:00:29 -0500, Kan Liang wrote: From: Kan Liang kan.li...@intel.com symoff can support both the same and different binaries. However, the offset may change between different binaries. This patch checks the buildid of perf.data. If they are from

RE: [PATCH V5 3/3] perf tool: check buildid for symoff

2014-11-28 Thread Liang, Kan
On Thu, Nov 27, 2014 at 02:09:51PM +, Liang, Kan wrote: Hi Kan, On Mon, 24 Nov 2014 11:00:29 -0500, Kan Liang wrote: From: Kan Liang kan.li...@intel.com symoff can support both the same and different binaries. However, the offset may change

RE: [PATCH V6 1/3] perf tool: Add sort key symoff for perf diff

2014-12-01 Thread Liang, Kan
On Mon, Dec 01, 2014 at 09:40:10AM -0500, Kan Liang wrote: SNIP +static int64_t +sort__symoff_collapse(struct hist_entry *left, struct hist_entry +*right) { + struct symbol *sym_l = left->ms.sym; + struct symbol *sym_r = right->ms.sym; + u64 symoff_l, symoff_r; +
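A sketch of what the symoff collapse function goes on to compare: not the absolute address (which differs between binaries) but the symbol plus the sample's offset inside it. left->ms.sym, right->ms.sym and ->ip are real hist_entry fields; the comparison policy below is only an illustration of the idea, not the patch's exact code.

static int64_t sort__symoff_cmp_sketch(struct hist_entry *left,
                                       struct hist_entry *right)
{
        struct symbol *sym_l = left->ms.sym;
        struct symbol *sym_r = right->ms.sym;
        u64 symoff_l, symoff_r;

        if (!sym_l || !sym_r)
                return !!sym_l - !!sym_r;       /* unresolved samples sort apart */

        symoff_l = left->ip - sym_l->start;     /* offset inside the symbol */
        symoff_r = right->ip - sym_r->start;

        if (symoff_l != symoff_r)
                return (int64_t)(symoff_r - symoff_l);

        return strcmp(sym_r->name, sym_l->name);
}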

RE: [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user)

2014-12-04 Thread Liang, Kan
On Tue, Dec 02, 2014 at 10:06:51AM -0500, kan.li...@intel.com wrote: From: Kan Liang kan.li...@intel.com This is the user space patch for Haswell LBR call stack support. For many profiling tasks we need the callgraph. For example we often need to see the caller of a lock or the caller

RE: [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user)

2014-12-04 Thread Liang, Kan
On Thu, Dec 04, 2014 at 12:51:42PM -0300, Arnaldo Carvalho de Melo wrote: Em Thu, Dec 04, 2014 at 02:49:52PM +, Liang, Kan escreveu: Jiri Wrote: looks ok to me.. Thanks for the review. I'll test it once I get my hands on a Haswell server again; I guess we wait

RE: [PATCH V6 1/1] perf tool: perf diff support for different binaries

2015-01-26 Thread Liang, Kan
Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.li...@intel.com escreveu: From: Kan Liang kan.li...@intel.com Currently, the perf diff only works with the same binaries. That's because it compares the symbol start address. It doesn't work if the perf.data comes from different binaries.

RE: [PATCH V2 1/1] perf, core: Use sample period avg as child event's initial period

2015-01-19 Thread Liang, Kan
Hi Peter, The patch is a month old. I checked that it still applies to the current tip. Could you please take a look? Thanks, Kan From: Kan Liang kan.li...@intel.com For perf record frequency mode, the initial sample_period is 1. That's because perf doesn't know what period should be set. It

RE: [PATCH V8 03/14] perf, x86: use context switch callback to flush LBR stack

2015-01-14 Thread Liang, Kan
On Thu, Nov 06, 2014 at 09:54:20AM -0500, Kan Liang wrote: --- a/kernel/events/core.c @@ -2673,64 +2666,6 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx, } /* - * When sampling the branch stack in system-wide, it may be necessary - * to flush the
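A sketch of the mechanism this hunk replaces the open-coded flush with: the core gains an optional per-PMU context-switch callback (pmu->sched_task, refcounted via perf_sched_cb_inc()/perf_sched_cb_dec()), and the x86 side uses it to reset or restore the LBR stack only while LBR users exist. The body below is illustrative; the real callback also saves and restores the call-stack LBRs.

static void x86_pmu_sched_task_sketch(struct perf_event_context *ctx,
                                      bool sched_in)
{
        /* Flush stale branches so the new task does not see the old task's LBRs */
        if (x86_pmu.lbr_nr && sched_in)
                intel_pmu_lbr_reset();
}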

RE: [PATCH V6 1/1] perf tool: perf diff support for different binaries

2015-01-12 Thread Liang, Kan
Em Tue, Jan 06, 2015 at 11:53:56AM -0300, Arnaldo Carvalho de Melo escreveu: Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.li...@intel.com escreveu: Currently, the perf diff only works with the same binaries. That's because it compares the symbol start address. It doesn't work if the

RE: [PATCH V2 1/1] perf, core: Use sample period avg as child event's initial period

2015-02-10 Thread Liang, Kan
Hi Peter, Could you please review the patch? Thanks, Kan Hi Peter, The patch is a month old. I checked that it still applies to the current tip. Could you please take a look? Thanks, Kan From: Kan Liang kan.li...@intel.com For perf record frequency mode, the initial sample_period

RE: [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user)

2015-01-05 Thread Liang, Kan
On Thu, Dec 04, 2014 at 02:49:52PM +, Liang, Kan wrote: I'll test it once I get my hands on a Haswell server again; I guess we wait for the kernel change to go in first anyway, right? I'm not sure, let's ask Peter. Peter? Ok, so only 3/3 was missing, right? I handed the kernel

RE: [PATCH V6 1/1] perf tool: perf diff support for different binaries

2015-01-05 Thread Liang, Kan
Hi Arnaldo, The patch is one month old. Kim and Jirka have reviewed it. There is also another perf diff related patch which has similar situation. https://lkml.org/lkml/2014/12/1/380 It was also reviewed by Jirka a month ago. Both of them still apply to current perf/core. Should I re-post

RE: [PATCH V6 1/1] perf tool: perf diff support for different binaries

2015-01-06 Thread Liang, Kan
Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.li...@intel.com escreveu: From: Kan Liang kan.li...@intel.com Currently, the perf diff only works with the same binaries. That's because it compares the symbol start address. It doesn't work if the perf.data comes from different binaries. This

RE: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps

2015-03-18 Thread Liang, Kan
Em Tue, Mar 03, 2015 at 05:09:29PM +, Liang, Kan escreveu: Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo escreveu: Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.li...@intel.com escreveu: From: Kan Liang kan.li...@intel.com With the patch

RE: [PATCH 1/1] perf, tool: partial callgrap and time support in perf record

2015-03-16 Thread Liang, Kan
Hi Kan, On Fri, Mar 13, 2015 at 02:18:07AM +, kan.li...@intel.com wrote: From: Kan Liang kan.li...@intel.com When multiple events are sampled it may not be needed to collect callgraphs for all of them. The sample sites are usually nearby, and it's enough to collect the

RE: [PATCH V5 4/6] perf, x86: handle multiple records in PEBS buffer

2015-03-30 Thread Liang, Kan
One corner case that needs to be mentioned is that the PEBS hardware doesn't deal well with collisions, when PEBS events happen close to each other. The records for the events can be collapsed into a single one, and it's not possible to reconstruct all events that caused the PEBS record. However

RE: [PATCH V5 4/6] perf, x86: handle multiple records in PEBS buffer

2015-03-30 Thread Liang, Kan
-Original Message- From: Andi Kleen [mailto:a...@firstfloor.org] Sent: Monday, March 30, 2015 1:26 PM To: Liang, Kan Cc: Peter Zijlstra; linux-kernel@vger.kernel.org; mi...@kernel.org; a...@infradead.org; eran...@google.com; a...@firstfloor.org Subject: Re: [PATCH V5 4/6] perf

RE: [PATCH 1/1] perf/x86: filter branches for PEBS event

2015-03-27 Thread Liang, Kan
On Thu, Mar 26, 2015 at 02:13:23PM -0400, kan.li...@intel.com wrote: This patch moves intel_shared_regs_constraints for branch_reg ahead of intel_pebs_constraints. Why not all shared regs? Yes, all shared regs can also be moved ahead. The patch is named for modifying the branch filter. I

RE: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps

2015-03-03 Thread Liang, Kan
Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo escreveu: Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.li...@intel.com escreveu: From: Kan Liang kan.li...@intel.com With the patch 1/5, it's possible to group read events from different pmus. -C can be used to

RE: [tip:perf/core] Revert perf: Remove the extra validity check on nr_pages

2015-03-03 Thread Liang, Kan
* tip-bot for Kan Liang tip...@zytor.com wrote: --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -4446,7 +4446,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) * If we have rb pages ensure they're a power-of-two number, so we * can do

RE: [tip:perf/core] Revert perf: Remove the extra validity check on nr_pages

2015-03-04 Thread Liang, Kan
* Liang, Kan kan.li...@intel.com wrote: * tip-bot for Kan Liang tip...@zytor.com wrote: --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -4446,7 +4446,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) * If we have rb pages

RE: [PATCH V7 1/1] perf tool:perf diff support for different binaries

2015-02-25 Thread Liang, Kan
Hi Arnaldo, Could you please review the patch? I've already updated the patch description to try to address your concern. Please let me know if you have any questions. Thanks, Kan From: Kan Liang kan.li...@intel.com Currently, the perf diff only works with the same binaries. That's because it

RE: [PATCH v1] perf callchain: fix kernel symbol resolution by remembering the cpumode

2015-03-27 Thread Liang, Kan
Commit 2e77784bb7d8 (perf callchain: Move cpumode resolve code to add_callchain_ip) promised "No change in behavior". As this commit breaks callchains on s390x (symbols not getting resolved, observed when profiling the kernel), this statement is wrong. The cpumode must be kept when

RE: [PATCH 1/1] perf/x86: filter branches for PEBS event

2015-03-26 Thread Liang, Kan
Subject: Re: [PATCH 1/1] perf/x86: filter branches for PEBS event On Thu, Mar 26, 2015 at 11:13 AM, kan.li...@intel.com wrote: From: Kan Liang kan.li...@intel.com To support Intel LBR branch filtering, the Intel LBR sharing logic mechanism was introduced in commit b36817e88630

RE: [PATCH] perf/x86/intel/uncore: fix IMC missing box initialization

2015-04-26 Thread Liang, Kan
This leads me to believe that this patch: commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc Author: Kan Liang kan.li...@intel.com Date: Tue Jan 20 04:54:25 2015 + perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization If I revert it, I bet things

RE: [PATCH V8 7/8] perf, x86: introduce PERF_RECORD_LOST_SAMPLES

2015-05-07 Thread Liang, Kan
So I changed it slightly to the below; changes are: - record 'lost' events to all set bits; after all we really do not know which event this sample belonged to, only logging to the first set bit seems 'wrong'. If so, the same dropped sample will be counted multiple times. It's

RE: [PATCH V8 8/8] perf tools: handle PERF_RECORD_LOST_SAMPLES

2015-05-07 Thread Liang, Kan
On Wed, May 06, 2015 at 03:33:54PM -0400, Kan Liang wrote: From: Kan Liang kan.li...@intel.com This patch modified the perf tool to handle the new RECORD type, PERF_RECORD_LOST_SAMPLES. The number of lost-sample events is stored in .nr_events[PERF_EVENT_LOST_SAMPLES]. While the

RE: [PATCH V7 3/6] perf, x86: handle multiple records in PEBS buffer

2015-05-05 Thread Liang, Kan
On Mon, Apr 20, 2015 at 04:07:47AM -0400, Kan Liang wrote: +static inline void * +get_next_pebs_record_by_bit(void *base, void *top, int bit) { + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + void *at; + u64 pebs_status; + + if (base == NULL) +
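A condensed sketch of the helper quoted above: with a large interrupt threshold the DS area holds many PEBS records, and each record's status field says which counter(s) it belongs to, so the walk returns the next record carrying the requested bit. pebs_record_nhm and x86_pmu.pebs_record_size are the real structures; error handling and the v3 record layout are left out.

static void *next_pebs_record_by_bit_sketch(void *base, void *top, int bit)
{
        void *at;

        if (base == NULL)
                return NULL;

        for (at = base; at < top; at += x86_pmu.pebs_record_size) {
                struct pebs_record_nhm *p = at;

                if (test_bit(bit, (unsigned long *)&p->status))
                        return at;
        }

        return NULL;
}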

RE: [PATCH V7 3/6] perf, x86: handle multiple records in PEBS buffer

2015-05-05 Thread Liang, Kan
On Tue, May 05, 2015 at 03:07:23PM +0200, Peter Zijlstra wrote: On Mon, Apr 20, 2015 at 04:07:47AM -0400, Kan Liang wrote: From: Yan, Zheng zheng.z@intel.com +static void perf_log_lost(struct perf_event *event) { + struct perf_output_handle handle; + struct perf_sample_data
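For reference, a sketch of how such a lost-samples notification is emitted through the regular output path, following the structure quoted above; the perf_output_*() and header-init helpers are the real kernel API, and the record layout mirrors what PERF_RECORD_LOST_SAMPLES ended up looking like. Treat the function as illustrative rather than the final upstream code.

static void perf_log_lost_sketch(struct perf_event *event, u64 lost)
{
        struct perf_output_handle handle;
        struct perf_sample_data sample;
        struct {
                struct perf_event_header header;
                u64 lost;
        } lost_event = {
                .header = {
                        .type = PERF_RECORD_LOST_SAMPLES,
                        .misc = 0,
                        .size = sizeof(lost_event),
                },
                .lost = lost,
        };

        perf_event_header__init_id(&lost_event.header, &sample, event);

        if (perf_output_begin(&handle, event, lost_event.header.size))
                return;

        perf_output_put(&handle, lost_event);
        perf_event__output_id_sample(event, &handle, &sample);
        perf_output_end(&handle);
}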

RE: [PATCH V7 3/6] perf, x86: handle multiple records in PEBS buffer

2015-05-05 Thread Liang, Kan
On Tue, May 05, 2015 at 04:30:25PM +, Liang, Kan wrote: + for (at = base; at < top; at += x86_pmu.pebs_record_size) { struct pebs_record_nhm *p = at; for_each_set_bit(bit, (unsigned long *)&p->status

RE: [PATCH V9 8/8] perf tools: handle PERF_RECORD_LOST_SAMPLES

2015-05-11 Thread Liang, Kan
Em Sun, May 10, 2015 at 03:13:15PM -0400, Kan Liang escreveu: From: Kan Liang kan.li...@intel.com This patch modified the perf tool to handle the new RECORD type, PERF_RECORD_LOST_SAMPLES. The number of lost-sample events is stored in .nr_events[PERF_RECORD_LOST_SAMPLES]. While the

RE: [PATCH V9 0/8] large PEBS interrupt threshold

2015-05-11 Thread Liang, Kan
On Sun, May 10, 2015 at 03:13:07PM -0400, Kan Liang wrote: changes since v8: - Record 'lost' events to all set bits - dropped the @id field from the lost samples record - Print lost samples event nr in perf report --stdio output Only the last two patches changed, right?

RE: [RFC][PATCH] perf, pebs: Add PEBS v3 record decoding

2015-05-12 Thread Liang, Kan
I did some tests on HSX platform. It works well. Tested-by: Kan Liang kan.li...@intel.com Kan On Tue, May 12, 2015 at 03:25:57PM +0200, Peter Zijlstra wrote: So seeing how I have both this series and Andi's SKL patches, I did the below on top of them both. Could someone try

RE: [PATCH V6 3/6] perf, x86: large PEBS interrupt threshold

2015-04-15 Thread Liang, Kan
-Original Message- From: Peter Zijlstra [mailto:pet...@infradead.org] Sent: Wednesday, April 15, 2015 1:15 PM To: Liang, Kan Cc: linux-kernel@vger.kernel.org; mi...@kernel.org; a...@infradead.org; eran...@google.com; a...@firstfloor.org Subject: Re: [PATCH V6 3/6] perf, x86: large

RE: [PATCH V6 3/6] perf, x86: large PEBS interrupt threshold

2015-04-15 Thread Liang, Kan
On Thu, Apr 09, 2015 at 12:37:43PM -0400, Kan Liang wrote: @@ -280,8 +280,9 @@ static int alloc_pebs_buffer(int cpu) ds->pebs_absolute_maximum = ds->pebs_buffer_base + max * x86_pmu.pebs_record_size; - ds->pebs_interrupt_threshold = ds->pebs_buffer_base + -

RE: [PATCH V2 1/6] perf,core: allow invalid context events to be part of sw/hw groups

2015-04-16 Thread Liang, Kan
On Wed, Apr 15, 2015 at 03:56:11AM -0400, Kan Liang wrote: The event count can only be read when the event is already sched_in. Yeah, so no. This breaks what groups are. Group events _must_ be co-scheduled. You cannot guarantee you can schedule events from another PMU. Why? I think it's

RE: [PATCH V6 4/6] perf, x86: handle multiple records in PEBS buffer

2015-04-17 Thread Liang, Kan
A) the CTRn value reaches 0: - the corresponding bit in GLOBAL_STATUS gets set - we start arming the hardware assist some unspecified amount of time later -- this could cover multiple events of interest B) the hardware assist is armed, any next event

RE: [PATCH V6 4/6] perf, x86: handle multiple records in PEBS buffer

2015-04-17 Thread Liang, Kan
-Original Message- From: Peter Zijlstra [mailto:pet...@infradead.org] Sent: Friday, April 17, 2015 9:13 AM To: Liang, Kan Cc: linux-kernel@vger.kernel.org; mi...@kernel.org; a...@infradead.org; eran...@google.com; a...@firstfloor.org Subject: Re: [PATCH V6 4/6] perf, x86: handle

RE: [PATCH V4 1/2] perf,tools: add time out to force stop proc map processing

2015-06-22 Thread Liang, Kan
Em Wed, Jun 17, 2015 at 09:51:10AM -0400, kan.li...@intel.com escreveu: From: Kan Liang kan.li...@intel.com System-wide sampling like 'perf top' or 'perf record -a' reads all threads' /proc/xxx/maps before sampling. If there are any threads which keep generating huge, ever-growing maps,
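A user-space sketch of the safeguard described here: bound the time spent parsing one thread's /proc/<pid>/maps so a map file that grows faster than it can be read cannot stall 'perf top' or 'perf record -a' forever. The 500ms figure and the function are illustrative; the merged series made the limit configurable from the command line.

#include <stdio.h>
#include <stdbool.h>
#include <time.h>

#define PROC_MAP_PARSE_TIMEOUT_NS (500ULL * 1000 * 1000)

static unsigned long long now_ns(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

/* Returns 1 if parsing was cut short by the timeout, 0 otherwise. */
static int parse_maps_bounded(FILE *fp)
{
        unsigned long long start = now_ns();
        char line[4096];
        bool truncated = false;

        while (fgets(line, sizeof(line), fp)) {
                /* ... synthesize an MMAP event from 'line' here ... */
                if (now_ns() - start > PROC_MAP_PARSE_TIMEOUT_NS) {
                        truncated = true;       /* give up, keep what we already have */
                        break;
                }
        }
        return truncated ? 1 : 0;
}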

RE: [PATCH 1/1] perf,tools: error out unsupported group leader immediately for perf stat

2015-06-11 Thread Liang, Kan
Em Thu, Jun 11, 2015 at 02:32:40AM -0400, kan.li...@intel.com escreveu: perf stat ignores the unsupported event and continues to count the supported events. But if the unsupported event is the group leader, the perf tool will crash. After applying this patch, the unsupported group leader will error

RE: [PATCH 1/1] perf,tools: add time out to force stop endless mmap processing

2015-06-11 Thread Liang, Kan
Em Wed, Jun 10, 2015 at 03:46:04AM -0400, kan.li...@intel.com escreveu: perf top reads all threads' /proc/xxx/maps. If there is any thread which keeps generating a huge, ever-growing /proc/xxx/maps, perf will loop forever in perf_event__synthesize_mmap_events. This patch fixes this

RE: [PATCH 1/1] perf,tools: add time out to force stop endless mmap processing

2015-06-16 Thread Liang, Kan
Em Fri, Jun 12, 2015 at 10:24:36PM -0600, David Ahern escreveu: coming back to this ... On 6/12/15 2:39 PM, Liang, Kan wrote: Yes, perf can always read the proc file. The problem is that the proc file is huge and keeps growing faster than the proc reader. So perf top does loop

RE: [PATCH 1/1] perf,tools: add time out to force stop endless mmap processing

2015-06-12 Thread Liang, Kan
On 6/12/15 2:39 PM, Liang, Kan wrote: Here are the test results. Please note that I get synthesized threads took... after the test case exits. It means both ways have the same issue. Got it. So what you really mean is that when launching perf on an already running process, perf never finishes
