Re: [PATCH v2] perf/x86/intel/uncore: Use boot_cpu_data.logical_proc_id as the pkg id

2018-09-21 Thread Liang, Kan
On 9/20/2018 9:07 PM, Masayoshi Mizuma wrote: From: Masayoshi Mizuma Physical package id 0 does not always exist. We should use boot_cpu_data.logical_proc_id directly as the pkg id here. Signed-off-by: Masayoshi Mizuma Reviewed-by: Kan Liang Thanks, Kan ---

Re: [PATCH] perf/x86/intel/uncore: Fix PCI BDF address of M3UPI on SKX

2018-09-21 Thread Liang, Kan
On 9/21/2018 10:14 AM, Peter Zijlstra wrote: On Fri, Sep 21, 2018 at 07:07:06AM -0700, kan.li...@linux.intel.com wrote: From: Kan Liang The counters on M3UPI Link 0 and Link 3 don't count. Writing 0 to these counters may cause a system crash on some machines. The PCI BDF addresses of M3UPI

Re: [PATCH 1/3] perf/x86/intel: Factor out common code of PMI handler

2018-08-07 Thread Liang, Kan
On 8/6/2018 2:20 PM, Peter Zijlstra wrote: On Mon, Aug 06, 2018 at 10:23:41AM -0700, kan.li...@linux.intel.com wrote: + if (++loops > 100) { + static bool warned; + + if (!warned) { + WARN(1, "perfevents: irq loop stuck!\n"); +
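
The quoted hunk is cut off, but the visible part shows a loop-count guard combined with a warn-once flag. A minimal userspace sketch of that pattern follows. The names demo_loop_stuck and DEMO_MAX_LOOPS are hypothetical, and fprintf() stands in for the kernel's WARN(); this is an illustration only, not the code from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical limit; the quoted hunk compares against 100. */
#define DEMO_MAX_LOOPS 100

/*
 * Increment the caller's loop counter and report whether the loop
 * has run more than DEMO_MAX_LOOPS times. The warning is printed
 * only on the first time the limit is exceeded, however often the
 * function is called afterwards.
 */
static bool demo_loop_stuck(int *loops)
{
	static bool warned;

	if (++(*loops) <= DEMO_MAX_LOOPS)
		return false;

	if (!warned) {
		fprintf(stderr, "demo: irq loop stuck!\n");
		warned = true;
	}
	return true;
}
```

A caller would typically break out of its retry loop as soon as the helper returns true.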

Re: [PATCH 2/3] x86, perf: Add a separate Arch Perfmon v4 PMI handler

2018-08-07 Thread Liang, Kan
On 8/6/2018 2:35 PM, Peter Zijlstra wrote: On Mon, Aug 06, 2018 at 10:23:42AM -0700, kan.li...@linux.intel.com wrote: @@ -2044,6 +2056,14 @@ static void intel_pmu_disable_event(struct perf_event *event) if (unlikely(event->attr.precise_ip))

Re: [PATCH 3/3] perf/x86/intel: Add quirk for Goldmont Plus

2018-08-07 Thread Liang, Kan
On 8/6/2018 2:39 PM, Peter Zijlstra wrote: On Mon, Aug 06, 2018 at 10:23:43AM -0700, kan.li...@linux.intel.com wrote: +static bool intel_glk_counter_freezing_broken(int cpu) case INTEL_FAM6_ATOM_GEMINI_LAKE: + x86_add_quirk(intel_counter_freezing_quirk); We really

Re: [PATCH] perf stat: Add hint for SMI cost measurement

2019-04-25 Thread Liang, Kan
On 4/25/2019 2:39 AM, Ingo Molnar wrote: * kan.li...@linux.intel.com wrote: +static void smi_env_check(void) +{ + char *name; + size_t len; + + if (sysfs__read_str(CPUIDLE_CUR_DRV, &name, &len)) { + pr_warning("Failed to check cstate status.\n"); What a

Re: [PATCH] perf stat: Add hint for SMI cost measurement

2019-04-25 Thread Liang, Kan
On 4/25/2019 1:47 PM, Ingo Molnar wrote: * Liang, Kan wrote: On 4/25/2019 2:39 AM, Ingo Molnar wrote: * kan.li...@linux.intel.com wrote: +static void smi_env_check(void) +{ + char *name; + size_t len; + + if (sysfs__read_str(CPUIDLE_CUR_DRV

Re: [PATCH 21/22] perf/x86/intel/uncore: renames in response to multi-die/pkg support

2019-05-09 Thread Liang, Kan
On 5/6/2019 5:26 PM, Len Brown wrote: From: Len Brown Syntax update only -- no logical or functional change. In response to the new multi-die/package changes, update variable names to use the more generic "box" terminology, instead of "pkg", as the boxes can refer to either packages or

Re: [PATCH 22/22] perf/x86/intel/rapl: rename internal variables in response to multi-die/pkg support

2019-05-09 Thread Liang, Kan
On 5/6/2019 5:26 PM, Len Brown wrote: From: Len Brown Syntax update only -- no logical or functional change. In response to the new multi-die/package changes, update variable names to use the more generic "pmuid" terminology, instead of "pkgid", as the pmu can refer to either packages or

Re: [PATCH 1/5] perf/x86/intel/uncore: Add uncore support for Snow Ridge server

2019-04-22 Thread Liang, Kan
Hi Peter, Have you got a chance to take a look at the series for Snow Ridge server? Here is the link for the document. https://cdrdv2.intel.com/v1/dl/getContent/611319 Thanks, Kan On 4/15/2019 2:41 PM, kan.li...@linux.intel.com wrote: From: Kan Liang The uncore subsystem on Snow Ridge is

Re: [PATCH 1/3] perf, tools: Add support for recording and printing XMM registers

2019-04-23 Thread Liang, Kan
Hi Arnaldo and Jirka, Have you got a chance to review this patch series? This series is user space tool support for Icelake and Tremont. Thanks, Kan On 4/16/2019 11:24 AM, kan.li...@linux.intel.com wrote: From: Andi Kleen Icelake and later platforms support collecting XMM registers with

Re: [RESEND PATCH 1/3] perf, tools: Add support for recording and printing XMM registers

2019-05-13 Thread Liang, Kan
On 5/13/2019 2:37 PM, Arnaldo Carvalho de Melo wrote: Em Mon, May 13, 2019 at 01:37:16PM -0400, Arnaldo Carvalho de Melo escreveu: Em Mon, May 06, 2019 at 07:19:24AM -0700, kan.li...@linux.intel.com escreveu: From: Andi Kleen Icelake and later platforms support collecting XMM registers

Re: [PATCH] perf vendor events intel: Add uncore_upi JSON support

2019-05-13 Thread Liang, Kan
Hi Arnaldo, Could you please apply this fix? Thanks, Kan On 5/7/2019 9:16 AM, kan.li...@linux.intel.com wrote: From: Kan Liang Perf cannot parse UPI events. #perf stat -e UPI_DATA_BANDWIDTH_TX event syntax error: 'UPI_DATA_BANDWIDTH_TX' \___ parser error

Re: [PATCH] perf vendor events intel: Add uncore_upi JSON support

2019-05-14 Thread Liang, Kan
On 5/14/2019 8:59 AM, Arnaldo Carvalho de Melo wrote: Em Mon, May 13, 2019 at 05:29:30PM -0400, Liang, Kan escreveu: Hi Arnaldo, Could you please apply this fix? Sure, please next time specify which arch this should be tested on, as I tried it here on a skylake notebook (lenovo t480s

Re: [PATCH 2/3] perf parse-regs: Add generic support for non-gprs

2019-05-14 Thread Liang, Kan
On 5/14/2019 2:19 PM, Arnaldo Carvalho de Melo wrote: Em Tue, May 14, 2019 at 07:39:12AM -0700, kan.li...@linux.intel.com escreveu: From: Kan Liang Some non general purpose registers, e.g. XMM, can be collected on some platforms, e.g. X86 Icelake. Add a weak function

Re: [PATCH V2 1/3] perf parse-regs: Split parse_regs

2019-05-15 Thread Liang, Kan
On 5/15/2019 2:49 AM, Ravi Bangoria wrote: On 5/15/19 1:49 AM, kan.li...@linux.intel.com wrote: From: Kan Liang The available registers for --int-regs and --user-regs may be different, e.g. XMM registers. Split parse_regs into two dedicated functions for --int-regs and --user-regs

Re: [PATCH 1/4] perf: Fix system-wide events miscounting during cgroup monitoring

2019-04-29 Thread Liang, Kan
On 4/29/2019 11:04 AM, Mark Rutland wrote: On Mon, Apr 29, 2019 at 07:44:02AM -0700, kan.li...@linux.intel.com wrote: From: Kan Liang When counting system-wide events and cgroup events simultaneously, the value of system-wide events are miscounting. For example, perf stat -e

Re: [PATCH 2/4] perf: Add filter_match() as a parameter for pinned/flexible_sched_in()

2019-04-29 Thread Liang, Kan
On 4/29/2019 11:12 AM, Mark Rutland wrote: On Mon, Apr 29, 2019 at 07:44:03AM -0700, kan.li...@linux.intel.com wrote: From: Kan Liang A fast path will be introduced in the following patches to speed up the cgroup events sched in, which only needs a simpler filter_match(). Add

Re: [PATCH 1/4] perf: Fix system-wide events miscounting during cgroup monitoring

2019-04-30 Thread Liang, Kan
On 4/30/2019 4:56 AM, Peter Zijlstra wrote: On Mon, Apr 29, 2019 at 07:44:02AM -0700, kan.li...@linux.intel.com wrote: diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index e47ef76..039e2f2 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@

Re: [PATCH 3/4] perf cgroup: Add cgroup ID as a key of RB tree

2019-04-30 Thread Liang, Kan
On 4/30/2019 5:03 AM, Peter Zijlstra wrote: On Mon, Apr 29, 2019 at 07:44:04AM -0700, kan.li...@linux.intel.com wrote: Add unique cgrp_id for each cgroup, which is composed by CPU ID and css subsys-unique ID. *WHY* ?! that doesn't make any kind of sense.. In fact you mostly then use the

Re: [PATCH 3/4] perf cgroup: Add cgroup ID as a key of RB tree

2019-04-30 Thread Liang, Kan
On 4/30/2019 5:08 AM, Peter Zijlstra wrote: On Mon, Apr 29, 2019 at 04:02:33PM -0700, Ian Rogers wrote: This is very interesting. How does the code handle cgroup hierarchies? For example, if we have: cgroup0 is the cgroup root cgroup1 whose parent is cgroup0 cgroup2 whose parent is cgroup1

Re: [PATCH 03/22] perf/x86/intel: Support adaptive PEBSv4

2019-03-19 Thread Liang, Kan
On 3/19/2019 10:47 AM, Peter Zijlstra wrote: @@ -933,6 +998,19 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc, struct pmu *pmu) update = true; } + if (x86_pmu.intel_cap.pebs_baseline && add) { + u64 pebs_data_cfg; + +

Re: [PATCH V3 01/23] perf/x86: Support outputting XMM registers

2019-03-25 Thread Liang, Kan
On 3/23/2019 5:56 AM, Peter Zijlstra wrote: On Fri, Mar 22, 2019 at 10:22:50AM -0700, Andi Kleen wrote: diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h index f3329cabce5c..b33995313d17 100644 --- a/arch/x86/include/uapi/asm/perf_regs.h +++

Re: [PATCH V3 01/23] perf/x86: Support outputting XMM registers

2019-03-26 Thread Liang, Kan
On 3/25/2019 8:11 PM, Thomas Gleixner wrote: On Fri, 22 Mar 2019, kan.li...@linux.intel.com wrote: + PERF_REG_X86_XMM15 = 62, + + /* All registers include the XMMX registers */ + PERF_REG_X86_MAX = PERF_REG_X86_XMM15 + 2, Ergo: PERF_REG_X86_MAX == 64 -#define

Re: [PATCH V3 01/23] perf/x86: Support outputting XMM registers

2019-03-26 Thread Liang, Kan
On 3/26/2019 9:47 AM, Thomas Gleixner wrote: On Tue, 26 Mar 2019, Liang, Kan wrote: On 3/25/2019 8:11 PM, Thomas Gleixner wrote: -#define REG_RESERVED (~((1ULL << PERF_REG_X86_MAX) - 1ULL)) +#define REG_RESERVED 0 What's the point of having this around? I once thought it may b
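
For context on the REG_RESERVED expression quoted above: earlier in the thread PERF_REG_X86_MAX is worked out to be 64, and in C a shift of a 64-bit value by 64 is undefined behaviour. The truncated messages do not show whether that was the stated reason for the change, so the following is only a sketch of how such a mask could be computed safely. The helper name demo_reserved_mask is hypothetical.

```c
#include <stdint.h>

/*
 * Return a mask with every bit at position >= nr_regs set, i.e. the
 * bits that do not correspond to a valid register index.
 * When nr_regs is 64 or more, all 64 bits are valid, so nothing is
 * reserved and the shift (which would be undefined) is skipped.
 */
static inline uint64_t demo_reserved_mask(unsigned int nr_regs)
{
	if (nr_regs >= 64)
		return 0;

	return ~((1ULL << nr_regs) - 1ULL);
}
```

With this shape the 64-register case yields 0, which matches the replacement value shown in the quoted diff.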

Re: [PATCH V4 04/23] perf/x86/intel: Support adaptive PEBSv4

2019-03-27 Thread Liang, Kan
On 3/26/2019 6:24 PM, Andi Kleen wrote: + for (at = base; at < top; at += cpuc->pebs_record_size) { + u64 pebs_status; + + pebs_status = get_pebs_status(at) & cpuc->pebs_enabled; + pebs_status &= mask; + + for_each_set_bit(bit,

Re: [PATCH] perf pmu: Fix parser error for uncore event alias

2019-03-27 Thread Liang, Kan
On 3/18/2019 4:53 AM, Jiri Olsa wrote: On Fri, Mar 15, 2019 at 11:00:14AM -0700, kan.li...@linux.intel.com wrote: From: Kan Liang Perf fails to parse uncore event alias, for example: #perf stat -e unc_m_clockticks -a --no-merge sleep 1 event syntax error: 'unc_m_clockticks'

Re: [PATCH V4 01/23] perf/x86: Support outputting XMM registers

2019-04-01 Thread Liang, Kan
On 4/1/2019 3:18 PM, Stephane Eranian wrote: On Tue, Mar 26, 2019 at 9:11 AM wrote: From: Kan Liang Starting from Icelake, XMM registers can be collected in PEBS record. But current code only output the pt_regs. Add a new struct x86_perf_regs for both pt_regs and xmm_regs. XMM registers

Re: [PATCH V4 01/23] perf/x86: Support outputting XMM registers

2019-04-01 Thread Liang, Kan
On 4/1/2019 5:11 PM, Stephane Eranian wrote: diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index e2b1447192a8..9378c6b2128f 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -560,6 +560,16 @@ int x86_pmu_hw_config(struct perf_event *event)

Re: [PATCH V5 08/12] perf/x86/intel: Add Icelake support

2019-04-08 Thread Liang, Kan
On 4/8/2019 11:06 AM, Peter Zijlstra wrote: On Tue, Apr 02, 2019 at 12:45:05PM -0700, kan.li...@linux.intel.com wrote: +static struct event_constraint * +icl_get_event_constraints(struct cpu_hw_events *cpuc, int idx, + struct perf_event *event) +{ + /* +

Re: [PATCH V5 00/12] perf: Add Icelake support (kernel only, except Topdown)

2019-04-08 Thread Liang, Kan
On 4/8/2019 11:41 AM, Peter Zijlstra wrote: I currently have something like the below on top, is that correct? Yes, it's correct. If so, I'll fold it back in. Thanks. It's really appreciated. Kan --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -563,16 +563,17 @@ int

Re: [PATCH V5 00/12] perf: Add Icelake support (kernel only, except Topdown)

2019-04-08 Thread Liang, Kan
@@ -963,40 +963,42 @@ static u64 pebs_update_adaptive_cfg(stru u64 pebs_data_cfg = 0; bool gprs, tsx_weight; - if ((sample_type & ~(PERF_SAMPLE_IP|PERF_SAMPLE_TIME)) || - attr->precise_ip < 2) { + if (!(sample_type & ~(PERF_SAMPLE_IP|PERF_SAMPLE_TIME)) && +

Re: [PATCH V5 00/12] perf: Add Icelake support (kernel only, except Topdown)

2019-04-08 Thread Liang, Kan
On 4/8/2019 12:06 PM, Liang, Kan wrote: @@ -1875,7 +1868,7 @@ static void intel_pmu_drain_pebs_nhm(str counts[bit]++; } - for (bit = 0; bit < size; bit++) { + for_each_set_bit(bit, (unsigned long *), size) { if ((counts[bit] == 0) && (erro
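
The hunk above swaps a plain index loop for the kernel's for_each_set_bit(), so that only the set bits of a mask are visited. A rough userspace equivalent is sketched below; demo_collect_set_bits is a hypothetical helper operating on a single 64-bit word, not the kernel macro, and the mask argument that the archive dropped from the quoted line is not reconstructed here.

```c
#include <stdint.h>

/*
 * Store the index of every set bit of mask below size into out[]
 * (which must have room for up to 64 entries) and return how many
 * indices were stored. Clear bits are skipped entirely, which is the
 * effect the quoted change is after.
 */
static unsigned int demo_collect_set_bits(uint64_t mask, unsigned int size,
					  unsigned int *out)
{
	unsigned int n = 0;

	for (unsigned int bit = 0; bit < size && bit < 64; bit++) {
		if (mask & (1ULL << bit))
			out[n++] = bit;
	}
	return n;
}
```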

Re: [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters

2019-04-10 Thread Liang, Kan
On 4/10/2019 3:41 AM, Peter Zijlstra wrote: On Tue, Apr 09, 2019 at 06:09:59PM -0700, kan.li...@linux.intel.com wrote: From: Kan Liang Fixed counters can also generate adaptive PEBS record, if the corresponding bit in IA32_FIXED_CTR_CTRL is set. Otherwise, only basic record is generated.

Re: [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support

2019-04-10 Thread Liang, Kan
On 4/10/2019 3:51 AM, Peter Zijlstra wrote: On Tue, Apr 09, 2019 at 06:10:00PM -0700, kan.li...@linux.intel.com wrote: The generic purpose counter 0 and fixed counter 0 have less skid. Force :ppp events on generic purpose counter 0. Force instruction:ppp always on fixed counter 0.

Re: [PATCH V5 08/12] perf/x86/intel: Add Icelake support

2019-04-10 Thread Liang, Kan
On 4/8/2019 11:45 AM, Liang, Kan wrote: On 4/8/2019 11:06 AM, Peter Zijlstra wrote: On Tue, Apr 02, 2019 at 12:45:05PM -0700, kan.li...@linux.intel.com wrote: +static struct event_constraint * +icl_get_event_constraints(struct cpu_hw_events *cpuc, int idx, + struct perf_event

Re: [PATCH V5 08/12] perf/x86/intel: Add Icelake support

2019-04-11 Thread Liang, Kan
On 4/11/2019 5:00 AM, Peter Zijlstra wrote: On Wed, Apr 10, 2019 at 09:47:20PM +0200, Peter Zijlstra wrote: Sure, those are actually forced 0 with the existing thing. I'll go fold something like back in. Thanks! @@ -3472,7 +3475,7 @@ icl_get_event_constraints(struct cpu_hw_events *cpuc,

Re: [PATCH V2 2/2] perf/x86/intel: Add Tremont core PMU support

2019-04-11 Thread Liang, Kan
On 4/11/2019 5:06 AM, Peter Zijlstra wrote: On Wed, Apr 10, 2019 at 11:57:09AM -0700, kan.li...@linux.intel.com wrote: +static struct event_constraint * +tnt_get_event_constraints(struct cpu_hw_events *cpuc, int idx, + struct perf_event *event) That 'tnt' still

Re: [PATCH V2 2/2] perf/x86/intel: Add Tremont core PMU support

2019-04-11 Thread Liang, Kan
On 4/11/2019 9:33 AM, Peter Zijlstra wrote: On Thu, Apr 11, 2019 at 09:30:10AM -0400, Liang, Kan wrote: I changed that like so: --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3508,7 +3508,7 @@ tnt_get_event_constraints(struct cpu_hw_ */ if (event

Re: [PATCH 06/22] perf/x86/intel: Add Icelake support

2019-03-20 Thread Liang, Kan
On 3/19/2019 8:08 PM, Stephane Eranian wrote: On Mon, Mar 18, 2019 at 2:44 PM wrote: From: Kan Liang Add Icelake core PMU perf code, including constraint tables and the main enable code. Icelake expanded the generic counters to always 8 even with HT on, but a range of events cannot be

Re: [PATCH V2 04/23] perf/x86/intel: Support adaptive PEBSv4

2019-03-21 Thread Liang, Kan
On 3/21/2019 5:20 PM, Peter Zijlstra wrote: On Thu, Mar 21, 2019 at 01:56:44PM -0700, kan.li...@linux.intel.com wrote: @@ -933,6 +1001,34 @@ pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc, struct pmu *pmu) update = true; } + /* +* The PEBS

Re: [PATCH V4 00/23] perf: Add Icelake support

2019-04-01 Thread Liang, Kan
Hi Peter and Thomas, Have you got a chance to review this series? Any comments are much appreciated. Thanks, Kan On 3/26/2019 12:08 PM, kan.li...@linux.intel.com wrote: From: Kan Liang The patch series intends to add Icelake support for Linux perf. PATCH 1-18: Kernel patches to support

Re: [PATCH 03/13] mm: Add generic p?d_large() macros

2019-02-18 Thread Liang, Kan
On 2/18/2019 9:19 AM, Steven Price wrote: On 18/02/2019 11:31, Peter Zijlstra wrote: On Fri, Feb 15, 2019 at 05:02:24PM +, Steven Price wrote: From: James Morse Exposing the pud/pgd levels of the page tables to walk_page_range() means we may come across the exotic large mappings that

Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-19 Thread Liang, Kan
On 2/18/2019 10:40 PM, Len Brown wrote: From: Len Brown Some new systems have multiple software-visible die within each package. The new CPUID.1F leaf can enumerate this multi-die/package topology. CPUID.1F is a super-set of the CPUID.B "Extended Topology Leaf", and a common updated routine

Re: [PATCH 05/11] x86 topology: export die_siblings

2019-02-19 Thread Liang, Kan
On 2/18/2019 10:40 PM, Len Brown wrote: From: Len Brown like core_siblings, except it shows which die are in the same package. This is needed for lscpu(1) to correctly display die topology. Signed-off-by: Len Brown Cc: linux-...@vger.kernel.org Signed-off-by: Len Brown ---

Re: [PATCH 05/11] x86 topology: export die_siblings

2019-02-19 Thread Liang, Kan
On 2/19/2019 1:43 PM, Brown, Len wrote: Thanks for the comments, Kan, diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt index 287213b4517b..7dd2ae3df233 100644 --- a/Documentation/cputopology.txt +++ b/Documentation/cputopology.txt @@ -56,6 +56,16 @@

Re: [PATCH 01/10] perf/x86/intel: Introduce a concept "domain" as the scope of counters

2019-02-20 Thread Liang, Kan
On 2/20/2019 6:12 AM, Peter Zijlstra wrote: On Tue, Feb 19, 2019 at 12:00:02PM -0800, kan.li...@linux.intel.com wrote: It's very useful to abstract several common topology related codes for these modules to reduce the code redundancy. 3 files changed, 96 insertions(+), 1 deletion(-)

Re: [RFC] perf/x86/rapl: Getting zero on energy-cores event

2019-03-01 Thread Liang, Kan
On 3/1/2019 6:42 AM, Jiri Olsa wrote: hi, I'm getting zero counts for energy-cores event on broadwell-x server (model 0x4f) I checked intel_rapl powercap driver and it won't export the counter if it rdmsr returns zero on it the SDM also says the rdmsr returns zero for some models I made

Re: [PATCH 01/10] perf/x86/intel: Introduce a concept "domain" as the scope of counters

2019-03-05 Thread Liang, Kan
platforms. Thanks, Kan On 2/20/2019 9:36 AM, Liang, Kan wrote: On 2/20/2019 6:12 AM, Peter Zijlstra wrote: On Tue, Feb 19, 2019 at 12:00:02PM -0800, kan.li...@linux.intel.com wrote: It's very useful to abstract several common topology related codes for these modules to reduce the code

Re: [PATCH V3 01/13] perf/core, x86: Add PERF_SAMPLE_DATA_PAGE_SIZE

2019-01-31 Thread Liang, Kan
On 1/31/2019 7:37 AM, Peter Zijlstra wrote: On Wed, Jan 30, 2019 at 06:23:42AM -0800, kan.li...@linux.intel.com wrote: diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 374a197..03bf45d 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2578,3 +2578,45 @@

Re: [PATCH V4 01/13] perf/core, x86: Add PERF_SAMPLE_DATA_PAGE_SIZE

2019-02-01 Thread Liang, Kan
On 2/1/2019 4:22 AM, Peter Zijlstra wrote: On Thu, Jan 31, 2019 at 12:27:54PM -0800, kan.li...@linux.intel.com wrote: diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 374a197..229a73b 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2578,3 +2578,34 @@

Re: [PATCH V4 01/13] perf/core, x86: Add PERF_SAMPLE_DATA_PAGE_SIZE

2019-02-01 Thread Liang, Kan
On 2/1/2019 7:43 AM, Peter Zijlstra wrote: On Fri, Feb 01, 2019 at 01:36:00PM +0300, Kirill A. Shutemov wrote: On Fri, Feb 01, 2019 at 11:03:58AM +0100, Peter Zijlstra wrote: Will just mentioned a lovely feature where some archs have multi entry large pages. Possible something like:

Re: [PATCH 10/12] perf script: Add support for PERF_SAMPLE_CODE_PAGE_SIZE

2019-01-23 Thread Liang, Kan
On 1/22/2019 12:11 PM, Andi Kleen wrote: + PERF_OUTPUT_CODE_PAGE_SIZE = 1UL << 32, That won't work on 32bit. You need 1ULL Also might want to audit that no one puts these flags into an int. I checked the code, and no one puts the flags into an int. I will use ULL in V2.
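
To illustrate the review comment above: on a 32-bit build unsigned long is 32 bits wide, so 1UL << 32 shifts by the full width of the type, which is undefined in C, whereas 1ULL is at least 64 bits everywhere. The sketch below uses hypothetical DEMO_ names; only the bit position 32 is taken from the thread. It also shows why the audit for int storage matters, since narrowing to 32 bits drops the flag.

```c
#include <stdint.h>

/* Hypothetical flag names; bit 32 mirrors the position quoted above. */
#define DEMO_OUTPUT_LOW_FLAG       (1ULL << 0)
#define DEMO_OUTPUT_CODE_PAGE_SIZE (1ULL << 32)

/* Test a flag while keeping the whole set in a 64-bit type. */
static inline int demo_flag_is_set(uint64_t flags, uint64_t flag)
{
	return (flags & flag) != 0;
}

/*
 * Narrow the flag set to 32 bits, as would happen if it were stored
 * in an int-sized variable. Any flag at bit 32 or above is lost.
 */
static inline uint32_t demo_truncate_to_32(uint64_t flags)
{
	return (uint32_t)flags;
}
```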

Re: [PATCH v4 04/10] KVM/x86: intel_pmu_lbr_enable

2019-01-07 Thread Liang, Kan
On 1/5/2019 5:09 AM, Wei Wang wrote: On 01/04/2019 11:57 PM, Liang, Kan wrote: On 1/4/2019 4:58 AM, Wei Wang wrote: On 01/03/2019 12:33 AM, Liang, Kan wrote: On 12/26/2018 4:25 AM, Wei Wang wrote: + + /* + * It could be possible that people have vcpus of old model run

Re: [PATCH v4 04/10] KVM/x86: intel_pmu_lbr_enable

2019-01-08 Thread Liang, Kan
On 1/8/2019 1:13 AM, Wei Wang wrote: On 01/07/2019 10:22 PM, Liang, Kan wrote: Thanks for sharing. I understand the point of maintaining those models at one place, but this factor-out doesn't seem very elegant to me, like below __intel_pmu_init (int model, struct x86_pmu *x86_pmu

Re: [PATCH v4 04/10] KVM/x86: intel_pmu_lbr_enable

2019-01-02 Thread Liang, Kan
On 12/26/2018 4:25 AM, Wei Wang wrote: + + /* +* It could be possible that people have vcpus of old model run on +* physical cpus of newer model, for example a BDW guest on a SKX +* machine (but not possible to be the other way around). +* The BDW guest

Re: [PATCH v4 04/10] KVM/x86: intel_pmu_lbr_enable

2019-01-04 Thread Liang, Kan
On 1/4/2019 4:58 AM, Wei Wang wrote: On 01/03/2019 12:33 AM, Liang, Kan wrote: On 12/26/2018 4:25 AM, Wei Wang wrote: + + /* + * It could be possible that people have vcpus of old model run on + * physical cpus of newer model, for example a BDW guest on a SKX + * machine

Re: [PATCH 1/4] x86/perf/intel: Introduce PMU flag for Extended PEBS

2018-07-23 Thread Liang, Kan
On 7/23/2018 11:16 AM, Peter Zijlstra wrote: On Thu, Mar 08, 2018 at 06:15:39PM -0800, kan.li...@linux.intel.com wrote: From: Kan Liang The Extended PEBS feature, introduced in Goldmont Plus microarchitecture, supports all events as "Extended PEBS". Introduce flag PMU_FL_PEBS_ALL to

Re: [PATCH 3/4] perf/x86/intel/ds: Handle PEBS overflow for fixed counters

2018-07-23 Thread Liang, Kan
On 7/23/2018 12:21 PM, Peter Zijlstra wrote: On Mon, Jul 23, 2018 at 04:59:44PM +0200, Peter Zijlstra wrote: On Thu, Mar 08, 2018 at 06:15:41PM -0800, kan.li...@linux.intel.com wrote: diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index ef47a418d819..86149b87cce8

Re: [PATCH 3/4] perf/x86/intel/ds: Handle PEBS overflow for fixed counters

2018-07-23 Thread Liang, Kan
On 7/23/2018 12:56 PM, Liang, Kan wrote: On 7/23/2018 12:21 PM, Peter Zijlstra wrote: On Mon, Jul 23, 2018 at 04:59:44PM +0200, Peter Zijlstra wrote: On Thu, Mar 08, 2018 at 06:15:41PM -0800, kan.li...@linux.intel.com wrote: diff --git a/arch/x86/events/intel/core.c b/arch/x86/events

Re: [PATCH V2 2/3] x86, perf: Add a separate Arch Perfmon v4 PMI handler

2018-09-12 Thread Liang, Kan
Hi Peter, Any comments on the patch series regarding the v4 PMI handler? Thanks, Kan On 8/8/2018 3:12 AM, kan.li...@linux.intel.com wrote: From: Andi Kleen Implements counter freezing for Arch Perfmon v4 (Skylake and newer). This allows to speed up the PMI handler by avoiding unnecessary

Re: [PATCH 1/2] perf vendor events: Add stepping in CPUID string for x86

2018-11-15 Thread Liang, Kan
On 11/15/2018 8:53 AM, Jiri Olsa wrote: On Wed, Nov 14, 2018 at 01:24:15PM -0800, kan.li...@linux.intel.com wrote: SNIP diff --git a/tools/perf/arch/x86/util/header.c b/tools/perf/arch/x86/util/header.c index fb0d71afee8b..b428a4b00bf7 100644 --- a/tools/perf/arch/x86/util/header.c +++

Re: [PATCH 1/2] perf vendor events: Add stepping in CPUID string for x86

2018-11-15 Thread Liang, Kan
On 11/15/2018 9:26 AM, Liang, Kan wrote: On 11/15/2018 8:53 AM, Jiri Olsa wrote: On Wed, Nov 14, 2018 at 01:24:15PM -0800, kan.li...@linux.intel.com wrote: SNIP diff --git a/tools/perf/arch/x86/util/header.c b/tools/perf/arch/x86/util/header.c index fb0d71afee8b..b428a4b00bf7 100644

Re: [PATCH 1/2] perf vendor events: Add stepping in CPUID string for x86

2018-11-15 Thread Liang, Kan
+ /* +* Full CPUID format is required to identify a platform. +* Error out if the cpuid string is incomplete. +*/ + if (full_mapcpuid && !full_cpuid) { + pr_info("Invalid CPUID %s. Full CPUID is required, " +

Re: [PATCH 1/2] perf vendor events: Add stepping in CPUID string for x86

2018-11-15 Thread Liang, Kan
On 11/15/2018 3:44 PM, Jiri Olsa wrote: On Wed, Nov 14, 2018 at 01:24:15PM -0800, kan.li...@linux.intel.com wrote: From: Kan Liang Perf tools cannot find the proper event list for Cascadelake server. Because Cascadelake server and Skylake server have the same CPU model number, which are

Re: [PATCH] perf/x86/intel/uncore: Fix client IMC events return huge result

2018-11-16 Thread Liang, Kan
On 11/16/2018 11:12 AM, Sasha Levin wrote: On Fri, Nov 16, 2018 at 05:19:45AM -0800, kan.li...@linux.intel.com wrote: From: Kan Liang The client IMC bandwidth events return very huge result. perf stat -e uncore_imc/data_reads/ -e uncore_imc/data_writes/ -I 1 -a 10.000117222 34,788.76

Re: [PATCH 1/2] perf: Add munmap callback

2018-11-01 Thread Liang, Kan
On 10/24/2018 3:30 PM, Stephane Eranian wrote: The need for this new record type extends beyond physical address conversions and PEBS. A long while ago, someone reported issues with symbolization related to perf lacking munmap tracking. It had to do with vma merging. I think the sequence of

Re: [PATCH 1/2] perf: Add munmap callback

2018-10-24 Thread Liang, Kan
On 10/24/2018 12:32 PM, Arnaldo Carvalho de Melo wrote: Em Wed, Oct 24, 2018 at 09:23:34AM -0700, Andi Kleen escreveu: +void perf_event_munmap(void) +{ + struct perf_cpu_context *cpuctx; + unsigned long flags; + struct pmu *pmu; + + local_irq_save(flags); +

Re: [PATCH 1/2] perf: Add munmap callback

2018-10-25 Thread Liang, Kan
On 10/24/2018 8:29 PM, Peter Zijlstra wrote: On Wed, Oct 24, 2018 at 08:11:15AM -0700, kan.li...@linux.intel.com wrote: +void perf_event_munmap(void) +{ + struct perf_cpu_context *cpuctx; + unsigned long flags; + struct pmu *pmu; + + local_irq_save(flags); It is

Re: [PATCH V2 2/3] x86, perf: Add a separate Arch Perfmon v4 PMI handler

2018-09-27 Thread Liang, Kan
On 9/27/2018 8:51 AM, Peter Zijlstra wrote: On Wed, Aug 08, 2018 at 12:12:07AM -0700, kan.li...@linux.intel.com wrote: @@ -4325,6 +4428,8 @@ __init int intel_pmu_init(void) x86_pmu.extra_regs = intel_skl_extra_regs; x86_pmu.pebs_aliases =

Re: [PATCH 1/2] perf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake

2018-11-08 Thread Liang, Kan
Hi All, Ping. Any comments on the series? Thanks, Kan On 10/19/2018 1:04 PM, kan.li...@linux.intel.com wrote: From: Kan Liang KabyLake and CoffeeLake have the same client uncore events as SkyLake. Add the PCI IDs for KabyLake Y, U, S processor lines and CoffeeLake U, H, S processor lines.

Re: [PATCH 1/2] perf: Add munmap callback

2018-11-05 Thread Liang, Kan
NONYMOUS|MAP_PRIVATE, -1, 0); printf("addr2=%p\n", addr2); if (addr2 == (void *)MAP_FAILED) err(1, "mmap 2 failed"); if (addr2 != (addr1 + pgsz)) errx(1, "wrong mmap2 address"); sleep(1); return 0; } On Thu, Nov 1, 2018 at 7:10 AM Liang, Kan wrote: On 10/24/2018

Re: [PATCH 1/2] perf: Add munmap callback

2018-11-06 Thread Liang, Kan
On 11/6/2018 10:00 AM, Stephane Eranian wrote: /* * mmap 1 page at the location of the unmap page (should reuse virtual space) * This creates a continuous region built from two mmaps and potentially two different sources * especially with jitted runtimes */ The two mmaps are both anon. As my

Re: [PATCH] perf/x86/intel/lbr: Optimize context switches for LBR

2018-09-14 Thread Liang, Kan
On 9/14/2018 5:22 AM, Alexey Budankov wrote: Hi Andi, On 14.09.2018 11:54, Andi Kleen wrote: In principle the LBRs need to be flushed between threads. So does current code. IMHO, ideally, LBRs stack would be preserved and restored when switching between execution stacks. That would allow

Re: [PATCH] perf/x86/intel/lbr: Optimize context switches for LBR

2018-09-14 Thread Liang, Kan
On 9/14/2018 10:27 AM, Andi Kleen wrote: On Fri, Sep 14, 2018 at 08:39:36AM -0400, Liang, Kan wrote: On 9/14/2018 5:22 AM, Alexey Budankov wrote: Hi Andi, On 14.09.2018 11:54, Andi Kleen wrote: In principle the LBRs need to be flushed between threads. So does current code. IMHO

RE: [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user)

2014-12-04 Thread Liang, Kan
> On Tue, Dec 02, 2014 at 10:06:51AM -0500, kan.li...@intel.com wrote: > > From: Kan Liang > > > > This is the user space patch for Haswell LBR call stack support. > > For many profiling tasks we need the callgraph. For example we often > > need to see the caller of a lock or the caller of a

RE: [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user)

2014-12-04 Thread Liang, Kan
> > On Thu, Dec 04, 2014 at 12:51:42PM -0300, Arnaldo Carvalho de Melo wrote: > > Em Thu, Dec 04, 2014 at 02:49:52PM +0000, Liang, Kan escreveu: > > > Jiri Wrote: > > > > looks ok to me.. > > > > > Thanks for the review. > > > > >

RE: [PATCH V4 3/3] perf tool: Add sort key symoff for perf diff

2014-11-19 Thread Liang, Kan
> > Em Tue, Nov 18, 2014 at 11:38:20AM -0500, kan.li...@intel.com escreveu: > > From: Kan Liang > > > > Sometime, especially debugging scaling issue, the function level diff > > may be high granularity. The user may want to do deeper diff analysis > > for some cache or lock issue. The "symoff"

RE: [PATCH V4 1/3] perf tools: enable LBR call stack support

2014-11-20 Thread Liang, Kan
> > On Thu, Nov 20, 2014 at 7:32 AM, Namhyung Kim > wrote: > > > > On Wed, 19 Nov 2014 14:32:08 +, Kan Liang wrote: > > >> On Tue, 18 Nov 2014 16:36:55 -0500, kan liang wrote: > > >> > + if (attr->exclude_user) { > > >> > + attr->exclude_user = 0;

RE: [PATCH V3 2/3] perf tool: Move cpumode resolve code to add_callchain_ip

2014-11-21 Thread Liang, Kan
> -Original Message- > From: Jiri Olsa [mailto:jo...@redhat.com] > Sent: Tuesday, November 18, 2014 3:25 AM > To: Liang, Kan > Cc: a...@kernel.org; a.p.zijls...@chello.nl; eran...@google.com; linux- > ker...@vger.kernel.org; mi...@redhat.com; pau...@samba.org; >

RE: [PATCH V6 1/1] perf tool: perf diff support for different binaries

2015-01-06 Thread Liang, Kan
> Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.li...@intel.com escreveu: > > From: Kan Liang > > > > Currently, the perf diff only works with same binaries. That's because > > it compares the symbol start address. It doesn't work if the perf.data > > comes from different binaries. This patch

RE: [PATCH V5 3/3] perf tool: check buildid for symoff

2014-11-28 Thread Liang, Kan
> > On Thu, Nov 27, 2014 at 02:09:51PM +, Liang, Kan wrote: > > > > > > > Hi Kan, > > > > > > On Mon, 24 Nov 2014 11:00:29 -0500, Kan Liang wrote: > > > > From: Kan Liang > > > > > > > > symoff can s

RE: [PATCH V6 1/3] perf tool: Add sort key symoff for perf diff

2014-12-01 Thread Liang, Kan
> On Mon, Dec 01, 2014 at 09:40:10AM -0500, Kan Liang wrote: > > SNIP > > > +static int64_t > > +sort__symoff_collapse(struct hist_entry *left, struct hist_entry > > +*right) { > > + struct symbol *sym_l = left->ms.sym; > > + struct symbol *sym_r = right->ms.sym; > > + u64 symoff_l,

RE: [PATCH V5 1/1] perf tool:perf diff support for different binaries

2014-12-02 Thread Liang, Kan
> > Em Fri, Nov 21, 2014 at 10:55:48AM -0500, kan.li...@intel.com escreveu: > > From: Kan Liang > > > > Currently, the perf diff only works with same binaries. That's because > > it compares the symbol start address. It doesn't work if the perf.data > > comes from different binaries. This

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-17 Thread Liang, Kan
> SNIP > > > diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c > > index f4478ce..335c3a9 100644 > > --- a/tools/perf/util/session.c > > +++ b/tools/perf/util/session.c > > @@ -557,15 +557,63 @@ int perf_session_queue_event(struct > perf_session *s, union perf_event *event, > >

RE: [PATCH V3 1/3] perf tools: enable LBR call stack support

2014-11-18 Thread Liang, Kan
> > On Fri, 14 Nov 2014 08:44:10 -0500, kan liang wrote: > > From: Kan Liang > > > > Currently, there are two call chain recording options, fp and dwarf. > > Haswell has a new feature that utilizes the existing LBR facility to > > record call chains. So it provides the third options to record

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-18 Thread Liang, Kan
> On Fri, 14 Nov 2014 08:44:12 -0500, kan liang wrote: > > + /* LBR only affects the user callchain */ > > + if (i != chain_nr) { > > + struct branch_stack *lbr_stack = sample- > >branch_stack; > > + int lbr_nr = lbr_stack->nr; > > +

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-18 Thread Liang, Kan
> On Tue, Nov 18, 2014 at 03:13:50PM +0900, Namhyung Kim wrote: > > SNIP > > > >> > + * in "from" register, while the callee is > stored > > >> > + * in "to" register. > > >> > + * For example, there is a call stack > > >> > +

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-18 Thread Liang, Kan
> > whole > > > >> stack. > > > >> > + */ > > > >> > > > >> Andi is using some sanity checks: > > > >> http://marc.info/?l=linux-kernel&m=141584447819894&w=2 > > > >> I guess this could be applied in here, once his patch gets in. > > > >> > > > > > > > > Are you suggesting me to

RE: [PATCH V3 3/3] perf tools: Construct LBR call chain

2014-11-19 Thread Liang, Kan
> > On Tue, 18 Nov 2014 14:01:06 +, Kan Liang wrote: > >> On Fri, 14 Nov 2014 08:44:12 -0500, kan liang wrote: > >> > +/* LBR only affects the user callchain */ > >> > +if (i != chain_nr) { > >> > +struct branch_stack *lbr_stack =

RE: [PATCH V4 3/3] perf tool: Add sort key symoff for perf diff

2014-11-19 Thread Liang, Kan
> > On Tue, 18 Nov 2014 11:38:20 -0500, kan liang wrote: > > From: Kan Liang > > > > Sometimes, especially when debugging a scaling issue, the function level diff > > may be high granularity. The user may want to do deeper diff analysis > > for some cache or lock issue. The "symoff" key can let the

RE: [PATCH V4 1/3] perf tools: enable LBR call stack support

2014-11-19 Thread Liang, Kan
> On Tue, 18 Nov 2014 16:36:55 -0500, kan liang wrote: > > From: Kan Liang > > > > Currently, there are two call chain recording options, fp and dwarf. > > Haswell has a new feature that utilizes the existing LBR facility to > > record call chains. So it provides the third options to record

RE: [PATCH 1/2] perf tools: enable LBR call stack support

2014-11-12 Thread Liang, Kan
> PERF_SAMPLE_BRANCH_USER | > > + > PERF_SAMPLE_BRANCH_CALL_STACK; > > + attr->exclude_user = 0; > > I think we shouldn't silently change attr->exclude_user, if it was defined, we > need to display a warning that we are changing that or fail > Right, I will display a

RE: [PATCH 2/2] perf tools: Construct LBR call chain

2014-11-12 Thread Liang, Kan
> > + > > + printf("... chain: nr:%" PRIu64 "\n", total_nr); > > + > > + for (i = 0; i < callchain_nr + 1; i++) > > printf(". %2d: %016" PRIx64 "\n", > >i, sample->callchain->ips[i]); > > so if there's lbr callstack info we don't display user stack part

RE: [PATCH v5 00/16] perf, x86: Haswell LBR call stack support

2014-09-05 Thread Liang, Kan
Hi Peter and all, Did you get a chance to review these patches? Zheng is away. Should I re-send the patches? Thanks, Kan > > For many profiling tasks we need the callgraph. For example we often need > to see the caller of a lock or the caller of a memcpy or other library > function > to

RE: [PATCH V5 2/3] perf tools: parse the pmu event prefix and surfix

2014-09-11 Thread Liang, Kan
> SNIP > > > return 0; > > } > > > > +static int > > +comp_pmu(const void *p1, const void *p2) { > > + struct perf_pmu_event_symbol *pmu1 = > > + (struct perf_pmu_event_symbol *) p1; > > + struct perf_pmu_event_symbol *pmu2 = > > + (struct

RE: [PATCH V5 2/3] perf tools: parse the pmu event prefix and surfix

2014-09-11 Thread Liang, Kan
> On Wed, Sep 10, 2014 at 01:55:31PM -0400, kan.li...@intel.com wrote: > > SNIP > > > + struct perf_pmu_event_symbol *pmu2 = > > + (struct perf_pmu_event_symbol *) p2; > > + > > + return strcmp(pmu1->symbol, pmu2->symbol); } > > + > > +/* > > + * Read the pmu events list

RE: [PATCH v4 3/3] perf tools: Add support to new style format of kernel PMU event

2014-09-08 Thread Liang, Kan
> > On Tue, Sep 02, 2014 at 11:29:30AM -0400, kan.li...@intel.com wrote: > > From: Kan Liang > > SNIP > > > } > > +| > > +PE_KERNEL_PMU_EVENT > > +{ > > + struct parse_events_evlist *data = _data; > > + struct list_head *head = malloc(sizeof(*head)); > > + struct parse_events_term

RE: [PATCH 1/1] perf/x86: filter branches for PEBS event

2015-03-26 Thread Liang, Kan
> Subject: Re: [PATCH 1/1] perf/x86: filter branches for PEBS event > > On Thu, Mar 26, 2015 at 11:13 AM, wrote: > > From: Kan Liang > > > > For supporting Intel LBR branches filtering, Intel LBR sharing logic > > mechanism is introduced from commit b36817e88630 ("perf/x86: Add > Intel > >

RE: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps

2015-03-18 Thread Liang, Kan
> > Em Tue, Mar 03, 2015 at 05:09:29PM +, Liang, Kan escreveu: > > > > > > > Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo > > > escreveu: > > > > Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.li...
