On Tue, Jul 08, 2014 at 09:49:40AM -0700, kan.li...@intel.com wrote:
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -464,6 +464,12 @@ struct x86_pmu {
*/
struct extra_reg *extra_regs;
unsigned int er_flags;
+	/*
+	 * EXTRA REG
On Tue, Jul 08, 2014 at 09:49:40AM -0700, kan.li...@intel.com wrote:
+/*
+ * Under certain circumstances, accessing certain MSRs may cause a #GP.
+ * This function tests whether the input MSR can be safely accessed.
+ */
+static inline bool check_msr(unsigned long msr)
+{
+	u64 value;
+
+
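The probe idea quoted above (test whether an MSR can be accessed without a #GP) can be sketched in plain C. The callbacks below stand in for the kernel's rdmsrl_safe()/wrmsrl_safe(); all names and the bit-flip constant are illustrative assumptions, not the actual kernel patch:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Probe an MSR: read it, write back a bit-flipped value, read again to
 * verify the write took, then restore the original value.  The rd/wr
 * callbacks model rdmsrl_safe()/wrmsrl_safe() returning failure on #GP.
 */
typedef bool (*msr_read_fn)(unsigned long msr, uint64_t *val);
typedef bool (*msr_write_fn)(unsigned long msr, uint64_t val);

static bool check_msr(unsigned long msr, msr_read_fn rd, msr_write_fn wr)
{
	uint64_t val_old, val_new, val_tmp;

	/* A faulting read means the MSR cannot be safely accessed. */
	if (!rd(msr, &val_old))
		return false;

	/* Flip two bits, write back, and re-read to verify the write took. */
	val_tmp = val_old ^ 0x3UL;
	if (!wr(msr, val_tmp) || !rd(msr, &val_new))
		return false;

	if (val_tmp != val_new)
		return false;

	/* Restore the original value. */
	return wr(msr, val_old);
}

/* Fake MSR backends, used only to exercise the probe. */
static uint64_t fake_msr;
static bool ok_rd(unsigned long msr, uint64_t *val) { (void)msr; *val = fake_msr; return true; }
static bool ok_wr(unsigned long msr, uint64_t val)  { (void)msr; fake_msr = val; return true; }
static bool gp_rd(unsigned long msr, uint64_t *val) { (void)msr; (void)val; return false; }
```

With a working backend the probe succeeds and leaves the value restored; with a faulting one (e.g. a KVM guest without LBR MSR support) it reports the MSR as unusable.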
-Original Message-
From: Peter Zijlstra [mailto:pet...@infradead.org]
Sent: Wednesday, July 09, 2014 10:58 AM
To: Liang, Kan
Cc: a...@firstfloor.org; linux-kernel@vger.kernel.org; k...@vger.kernel.org
Subject: Re: [PATCH V4 1/2] perf ignore LBR and extra_regs.
On Wed, Jul 09
On Wed, Jul 09, 2014 at 02:32:28PM +, Liang, Kan wrote:
On Tue, Jul 08, 2014 at 09:49:40AM -0700, kan.li...@intel.com wrote:
+/*
+ * Under certain circumstances, accessing certain MSRs may cause a #GP.
+ * This function tests whether the input MSR can be safely
On Thu, Jul 03, 2014 at 05:52:37PM +0200, Andi Kleen wrote:
If there's active LBR users out there, we should refuse to enable PT
and vice versa.
This doesn't work, e.g. hardware debuggers can take over at any time.
Tough cookies. Hardware debuggers get to deal with whatever crap
On Mon, Jul 07, 2014 at 06:34:25AM -0700, kan.li...@intel.com wrote:
+ /*
+ * Accessing LBR MSRs may cause a #GP under certain circumstances.
+ * E.g. KVM doesn't support LBR MSRs.
+ * Check all LBR MSRs here.
+ * Disable LBR access if any LBR MSR cannot be accessed.
+
-Original Message-
From: Peter Zijlstra [mailto:pet...@infradead.org]
Sent: Tuesday, July 08, 2014 5:29 AM
To: Liang, Kan
Cc: a...@firstfloor.org; linux-kernel@vger.kernel.org; k...@vger.kernel.org
Subject: Re: [PATCH V3 1/2] perf ignore LBR and offcore_rsp.
On Mon, Jul 07, 2014
To reproduce the issue, please build the kernel with
CONFIG_KVM_INTEL=y (for the host kernel),
and CONFIG_PARAVIRT=n and CONFIG_KVM_GUEST=n (for the guest
kernel).
I'm not sure this is a useful patch.
This is #GP'ing just because of a limitation in the PMU; just compile
the
-Original Message-
From: Paolo Bonzini [mailto:pbonz...@redhat.com]
Sent: Monday, July 14, 2014 9:40 AM
To: Liang, Kan; Peter Zijlstra
Cc: a...@firstfloor.org; linux-kernel@vger.kernel.org; k...@vger.kernel.org
Subject: Re: [PATCH V5 1/2] perf ignore LBR and extra_regs
Il 14/07
diff --git a/arch/x86/kernel/cpu/perf_event.h
b/arch/x86/kernel/cpu/perf_event.h
index 3b2f9bd..992c678 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -464,6 +464,12 @@ struct x86_pmu {
*/
struct extra_reg *extra_regs;
Since nobody ever treats EVENT_EXTRA_END as an actual event, the value
of .extra_msr_access is irrelevant; this leaves 'true' as the only
possible value, and we can delete all those changes.
Right.
Which, combined with a few whitespace cleanups, gives the below patch.
Thanks. Your
Signed-off-by: Andi Kleen a...@linux.intel.com
I did not contribute to this patch, so please remove that SOB.
OK
Signed-off-by: Kan Liang kan.li...@intel.com
struct extra_reg *extra_regs;
unsigned int er_flags;
+	bool	extra_msr_access; /* EXTRA
On Wed, Jul 2, 2014 at 2:14 PM, kan.li...@intel.com wrote:
From: Kan Liang kan.li...@intel.com
x86, perf: Protect LBR and offcore rsp against KVM lying
With -cpu host, KVM reports LBR and offcore support, if the host has
support.
When the guest perf driver tries to access LBR or
On Sun, Oct 19, 2014 at 05:55:08PM -0400, Kan Liang wrote:
Only enable the LBR callstack when the user requires an fp callgraph. The feature
is not available when PERF_SAMPLE_BRANCH_STACK or
PERF_SAMPLE_STACK_USER is required.
Also, this feature only affects how to get the user callchain. The kernel
On Fri, Oct 24, 2014 at 03:36:00PM +0200, Jiri Olsa wrote:
On Sun, Oct 19, 2014 at 05:55:12PM -0400, Kan Liang wrote:
SNIP
- return 0;
- }
- continue;
+ mix_chain_nr = i + 2 + lbr_nr;
+ if
Hi Peter and all,
Did you get a chance to review these patches?
Zheng is away. Should I re-send the patches?
Thanks,
Kan
For many profiling tasks we need the callgraph. For example we often need
to see the caller of a lock or the caller of a memcpy or other library
function
to actually
On Tue, Sep 02, 2014 at 11:29:30AM -0400, kan.li...@intel.com wrote:
From: Kan Liang kan.li...@intel.com
SNIP
}
+|
+PE_KERNEL_PMU_EVENT
+{
+ struct parse_events_evlist *data = _data;
+ struct list_head *head = malloc(sizeof(*head));
+ struct parse_events_term *term;
So I don't like this. Why not use the regular
PERF_SAMPLE_BRANCH_STACK output to generate the stuff from? We
already have two different means, with different transport, for callchains
anyhow, so a third really won't matter.
I'm not sure what you mean by using the regular
On Tue, Oct 07, 2014 at 03:00:43AM +, Liang, Kan wrote:
On Wed, Sep 10, 2014 at 10:09:11AM -0400, kan.li...@intel.com wrote:
From: Kan Liang kan.li...@intel.com
If a task specific event wants user space callchain but does not
want branch stack sampling, enable
Kan Liang (4):
Revert perf tools: Default to cpu// for events v5
perf tools: parse the pmu event prefix and suffix
perf tools: Add support to new style format of kernel PMU event
perf tools: Add test case for pmu event new style format
got test failure with your patchset:
On Wed, Sep 10, 2014 at 10:09:05AM -0400, kan.li...@intel.com wrote:
@@ -204,9 +204,15 @@ void intel_pmu_lbr_sched_task(struct
perf_event_context *ctx, bool sched_in)
}
}
+static inline bool branch_user_callstack(unsigned br_sel)
+{
+	return (br_sel & X86_BR_USER) && (br_sel
On Wed, Sep 10, 2014 at 10:09:11AM -0400, kan.li...@intel.com wrote:
From: Kan Liang kan.li...@intel.com
If a task specific event wants user space callchain but does not want
branch stack sampling, enable the LBR call stack facility implicitly.
The LBR call stack facility can help
-Original Message-
From: Peter Zijlstra [mailto:pet...@infradead.org]
Sent: Wednesday, September 24, 2014 10:15 AM
To: Liang, Kan
Cc: eran...@google.com; linux-kernel@vger.kernel.org; mi...@redhat.com;
pau...@samba.org; a...@kernel.org; a...@linux.intel.com; Yan, Zheng
Subject: Re
+static int
+comp_pmu(const void *p1, const void *p2)
+{
+ struct perf_pmu_event_symbol *pmu1 =
+ (struct perf_pmu_event_symbol *) p1;
+ struct perf_pmu_event_symbol *pmu2 =
+ (struct perf_pmu_event_symbol *) p2;
please keep it on one line,
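The comparator under review pairs naturally with bsearch() once the sysfs event list is sorted. A self-contained userspace sketch (the struct is simplified from the patch; the lookup helper name is an assumption):

```c
#include <stdlib.h>
#include <string.h>

/* Simplified from the patch: one entry per sysfs PMU event name. */
struct perf_pmu_event_symbol {
	const char *symbol;
};

/* Compare two table entries by event name, for qsort()/bsearch(). */
static int comp_pmu(const void *p1, const void *p2)
{
	const struct perf_pmu_event_symbol *pmu1 = p1;
	const struct perf_pmu_event_symbol *pmu2 = p2;

	return strcmp(pmu1->symbol, pmu2->symbol);
}

/* Return the matching entry, or NULL if @name is not a PMU event. */
static struct perf_pmu_event_symbol *
pmu_event_lookup(struct perf_pmu_event_symbol *table, size_t n,
		 const char *name)
{
	struct perf_pmu_event_symbol key = { .symbol = name };

	return bsearch(&key, table, n, sizeof(key), comp_pmu);
}
```

The table must be sorted with the same comparator before any lookup, otherwise bsearch() results are undefined.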
On Thu, Sep 11, 2014 at 03:08:56PM -0400, kan.li...@intel.com wrote:
From: Kan Liang kan.li...@intel.com
There are two types of pmu event style formats, pmu_event_name
or cpu/pmu_event_name/. However, there is a bug in supporting these
two formats, especially when they are mixed with
On Tue, Nov 18, 2014 at 03:13:50PM +0900, Namhyung Kim wrote:
SNIP
+ * in "from" register, while the callee is stored
+ * in "to" register.
+ * For example, there is a call stack
+ *
whole
stack.
+ */
Andi is using some sanity checks:
http://marc.info/?l=linux-kernel&m=141584447819894&w=2
I guess this could be applied in here, once his patch gets in.
Are you suggesting me to remove the comments, or rebase the
whole
On Tue, 18 Nov 2014 14:01:06 +, Kan Liang wrote:
On Fri, 14 Nov 2014 08:44:12 -0500, kan liang wrote:
+	/* LBR only affects the user callchain */
+	if (i != chain_nr) {
+		struct branch_stack *lbr_stack = sample->
On Tue, 18 Nov 2014 11:38:20 -0500, kan liang wrote:
From: Kan Liang kan.li...@intel.com
Sometimes, especially when debugging scaling issues, the function-level diff
may be too high granularity. The user may want to do a deeper diff analysis
for some cache or lock issues. The symoff key can let
On Tue, 18 Nov 2014 16:36:55 -0500, kan liang wrote:
From: Kan Liang kan.li...@intel.com
Currently, there are two call chain recording options, fp and dwarf.
Haswell has a new feature that utilizes the existing LBR facility to
record call chains. So it provides the third options to
Em Tue, Nov 18, 2014 at 11:38:20AM -0500, kan.li...@intel.com escreveu:
From: Kan Liang kan.li...@intel.com
Sometimes, especially when debugging scaling issues, the function-level diff
may be too high granularity. The user may want to do a deeper diff analysis
for some cache or lock issues. The
On Thu, Nov 20, 2014 at 7:32 AM, Namhyung Kim namhy...@kernel.org
wrote:
On Wed, 19 Nov 2014 14:32:08 +, Kan Liang wrote:
On Tue, 18 Nov 2014 16:36:55 -0500, kan liang wrote:
+	if (attr->exclude_user) {
+		attr->exclude_user = 0;
SNIP
return 0;
}
+static int
+comp_pmu(const void *p1, const void *p2)
+{
+ struct perf_pmu_event_symbol *pmu1 =
+ (struct perf_pmu_event_symbol *) p1;
+ struct perf_pmu_event_symbol *pmu2 =
+ (struct perf_pmu_event_symbol *) p2;
On Wed, Sep 10, 2014 at 01:55:31PM -0400, kan.li...@intel.com wrote:
SNIP
+ struct perf_pmu_event_symbol *pmu2 =
+ (struct perf_pmu_event_symbol *) p2;
+
+	return strcmp(pmu1->symbol, pmu2->symbol);
+}
+
+/*
+ * Read the pmu events list from sysfs
+ *
-Original Message-
From: Jiri Olsa [mailto:jo...@redhat.com]
Sent: Tuesday, November 18, 2014 3:25 AM
To: Liang, Kan
Cc: a...@kernel.org; a.p.zijls...@chello.nl; eran...@google.com; linux-
ker...@vger.kernel.org; mi...@redhat.com; pau...@samba.org;
a...@linux.intel.com
Subject
On Thu, Nov 06, 2014 at 09:54:17AM -0500, Kan Liang wrote:
Yan, Zheng (13):
perf, x86: Reduce lbr_sel_map size
perf, core: introduce pmu context switch callback
perf, x86: use context switch callback to flush LBR stack
perf, x86: Basic Haswell LBR call stack support
acme, jolsa, ACK on these two?
These patches are pure user tool patches. I usually send the tool
patches to them for review. Also, Jolsa had some comments on
the previous perf tool part. So I would like them to have a look
at the new changes to the user tool.
Thanks,
Kan
--
To unsubscribe from
+			PERF_SAMPLE_BRANCH_USER |
+			PERF_SAMPLE_BRANCH_CALL_STACK;
+	attr->exclude_user = 0;
I think we shouldn't silently change attr->exclude_user; if it was defined, we
need to display a warning that we are changing it, or fail
Right, I will display a warning here.
+
+	printf("... chain: nr:%" PRIu64 "\n", total_nr);
+
+	for (i = 0; i < callchain_nr + 1; i++)
		printf(".  %2d: %016" PRIx64 "\n",
			i, sample->callchain->ips[i]);
so if there's LBR callstack info we don't display the user stack part from the standard
callchain? I
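The spliced format strings in the dump code above rely on the C99 `<inttypes.h>` macros, which expand to the right conversion specifier for 64-bit integers on any platform. A minimal standalone illustration (the helper name is an assumption, not perf code):

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format one callchain entry the way the dump code above does:
 * ". %2d: %016<64-bit hex>".  PRIx64 expands to "llx" or "lx"
 * depending on the platform, so uint64_t always prints correctly.
 */
static int format_ip(char *buf, size_t size, int idx, uint64_t ip)
{
	return snprintf(buf, size, ". %2d: %016" PRIx64, idx, ip);
}
```

Without the macros, a plain `%lx` would be wrong on 32-bit targets where `uint64_t` is `unsigned long long`.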
@@ -1164,6 +1164,9 @@ int cmd_diff(int argc, const char **argv, const
char *prefix __maybe_unused)
if (setup_sorting() < 0)
usage_with_options(diff_usage, options);
+ if (sort__has_sym_name)
+ tool.mmap2 = perf_event__process_mmap2;
why is the mmap2
Hi Peter,
Did you get a chance to review the rest of the patch set?
Thanks,
Kan
On Sun, Oct 19, 2014 at 05:54:56PM -0400, Kan Liang wrote:
This should still very much have:
From: Yan, Zheng zheng.z@intel.com
Seeing how you did not write this patch, probably true for all the
+	data__for_each_file_new(i, d) {
+		k_dsos_tmp = d->session->machines.host.kernel_dsos;
+		u_dsos_tmp = d->session->machines.host.user_dsos;
+
+		if (!dsos__build_ids_equal(base_k_dsos, k_dsos_tmp))
+			pr_warning("The perf.data come from
On Mon, Nov 24, 2014 at 11:00:29AM -0500, Kan Liang wrote:
From: Kan Liang kan.li...@intel.com
symoff can support both same binaries and different binaries. However,
the offset may be changed for different binaries. This patch checks
the buildid of perf.data. If they are from
Hi Kan,
On Thu, Nov 6, 2014 at 2:28 AM, Liang, Kan kan.li...@intel.com wrote:
Hi Kan,
On Tue, 4 Nov 2014 17:07:43 +, Kan Liang wrote:
What about setting the
sort_sym.se_collapse in data_process() so that hists__match() can
use symbol names?
Yes, we can set it if we
SNIP
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f4478ce..335c3a9 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -557,15 +557,63 @@ int perf_session_queue_event(struct
perf_session *s, union perf_event *event,
return 0;
On Fri, 14 Nov 2014 08:44:10 -0500, kan liang wrote:
From: Kan Liang kan.li...@intel.com
Currently, there are two call chain recording options, fp and dwarf.
Haswell has a new feature that utilizes the existing LBR facility to
record call chains. So it provides the third options to
On Fri, 14 Nov 2014 08:44:12 -0500, kan liang wrote:
+	/* LBR only affects the user callchain */
+	if (i != chain_nr) {
+		struct branch_stack *lbr_stack = sample->branch_stack;
+		int lbr_nr = lbr_stack->nr;
+		/*
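The mixing discussed in these snippets (the fp chain supplies the kernel part, the LBR records replace the user part) can be sketched in plain C. The layout and names below are simplified assumptions, not the patch itself; PERF_CONTEXT_USER is the real perf ABI marker value:

```c
#include <stdint.h>
#include <stddef.h>

/* Marker separating kernel and user entries in a perf callchain. */
#define PERF_CONTEXT_USER ((uint64_t)-512)

/*
 * Copy the fp chain up to (and including) the PERF_CONTEXT_USER marker,
 * then append the LBR "from" addresses as the user callstack.
 * Returns the mixed chain length, or 0 if it would not fit.
 */
static size_t mix_chains(const uint64_t *fp_chain, size_t fp_nr,
			 const uint64_t *lbr_from, size_t lbr_nr,
			 uint64_t *out, size_t out_max)
{
	size_t i, n = 0;

	/* Kernel part: everything up to the user-context marker. */
	for (i = 0; i < fp_nr; i++) {
		if (n >= out_max)
			return 0;
		out[n++] = fp_chain[i];
		if (fp_chain[i] == PERF_CONTEXT_USER)
			break;
	}

	/* User part: taken from the LBR records, not the fp chain. */
	for (i = 0; i < lbr_nr; i++) {
		if (n >= out_max)
			return 0;
		out[n++] = lbr_from[i];
	}
	return n;
}
```

This mirrors the `mix_chain_nr = i + 2 + lbr_nr` sizing seen earlier in the thread: the kernel entries found so far plus the LBR records (the real patch also accounts for extra marker entries).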
Em Fri, Nov 21, 2014 at 10:55:48AM -0500, kan.li...@intel.com escreveu:
From: Kan Liang kan.li...@intel.com
Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries.
On Mon, Dec 08, 2014 at 06:27:43AM -0800, kan.li...@intel.com wrote:
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -568,8 +568,8 @@ struct event_constraint
intel_atom_pebs_event_constraints[] = { };
struct event_constraint intel_slm_pebs_event_constraints[] = {
- /*
On Fri, Dec 12, 2014 at 10:10:35AM -0500, kan.li...@intel.com wrote:
That's because, in inherit_event, the period for a child event is
inherited from its parent's parent's event, which usually has the default
sample_period of 1. Each child event has to recalculate the period from 1
every time.
On Mon, Dec 15, 2014 at 09:17:33PM +, Liang, Kan wrote:
This doesn't seem to make any kind of sense, and its weirdly
implemented.
So why would you push anything to the original parent? Your
description states that the parent event usually has 1, and then you
argue about
Hi Jolsa,
Does the new patch set work on your machine?
I tested the V8 patch set on Haswell, Ivybridge and Romley platforms,
and I cannot reproduce the issue you mentioned.
Could you please try the latest V8 patch?
Thanks,
Kan
From: Kan Liang kan.li...@intel.com
There are two types of pmu event
Hi Namhyung,
                    tchain_edit  [.] f1
 0.14%   3.913444   tchain_edit  [.] f2
99.82%   1.005478   tchain_edit  [.] f3
Hmm.. I think it should be a default behavior for perf diff, otherwise -s
symbol is almost meaningless IMHO.
I
Thanks for your comments. There has been lots of discussion about the patch.
It's hard to reply to each one individually, so I'll try to address all the concerns here.
The patchset doesn't try to introduce the 3rd independent callchain option
That's because LBR callstack has some limitations (only available for
So if I take all except 11,13,16,17 but instead do something like the below,
everything will work just fine, right?
Or am I missing something?
Yes, it should work. Then the LBR callstack will rely on the user to enable it.
But the user never gets the LBR callstack data even if it's available.
I'm
On Wed, Nov 05, 2014 at 04:22:09PM +, Liang, Kan wrote:
So if I take all except 11,13,16,17 but instead do something like
the below, everything will work just fine, right?
Or am I missing something?
Yes, it should work. Then LBR callstack will rely on user
Hi Kan,
On Tue, 4 Nov 2014 17:07:43 +, Kan Liang wrote:
Hi Namhyung,
                    tchain_edit  [.] f1
 0.14%   3.913444   tchain_edit  [.] f2
99.82%   1.005478   tchain_edit  [.] f3
Hmm.. I think it should be a default
Hi Kan,
On Mon, 24 Nov 2014 11:00:29 -0500, Kan Liang wrote:
From: Kan Liang kan.li...@intel.com
symoff can support both same binaries and different binaries. However,
the offset may be changed for different binaries. This patch checks
the buildid of perf.data. If they are from
On Thu, Nov 27, 2014 at 02:09:51PM +, Liang, Kan wrote:
Hi Kan,
On Mon, 24 Nov 2014 11:00:29 -0500, Kan Liang wrote:
From: Kan Liang kan.li...@intel.com
symoff can support both same binaries and different binaries.
However, the offset may be changed
On Mon, Dec 01, 2014 at 09:40:10AM -0500, Kan Liang wrote:
SNIP
+static int64_t
+sort__symoff_collapse(struct hist_entry *left, struct hist_entry
+*right) {
+	struct symbol *sym_l = left->ms.sym;
+	struct symbol *sym_r = right->ms.sym;
+ u64 symoff_l, symoff_r;
+
On Tue, Dec 02, 2014 at 10:06:51AM -0500, kan.li...@intel.com wrote:
From: Kan Liang kan.li...@intel.com
This is the user space patch for Haswell LBR call stack support.
For many profiling tasks we need the callgraph. For example we often
need to see the caller of a lock or the caller
On Thu, Dec 04, 2014 at 12:51:42PM -0300, Arnaldo Carvalho de Melo wrote:
Em Thu, Dec 04, 2014 at 02:49:52PM +, Liang, Kan escreveu:
Jiri Wrote:
looks ok to me..
Thanks for the review.
I'll test it once I get my hands on a Haswell server again, I guess we
wait
Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.li...@intel.com escreveu:
From: Kan Liang kan.li...@intel.com
Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries.
Hi Peter,
The patch is a month old. I checked that it still applies to the current tip.
Could you please take a look?
Thanks,
Kan
From: Kan Liang kan.li...@intel.com
For perf record frequency mode, the initial sample_period is 1. That's
because perf doesn't know what period should be set. It
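The frequency-mode behaviour described here, an initial sample_period of 1 that is then adapted toward the requested rate, can be illustrated with a simplified recalculation. This is loosely modelled on the kernel's perf_calculate_period(); the formula and names are an illustrative assumption, not the exact kernel code:

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Given how many events were counted over the last interval of @nsec
 * nanoseconds, pick a new period so that roughly @freq samples/sec
 * are taken.  While nothing has been counted yet, the period stays
 * at the initial value of 1.
 */
static uint64_t calculate_period(uint64_t count, uint64_t nsec, uint64_t freq)
{
	uint64_t events_per_sec, period;

	if (nsec == 0 || freq == 0)
		return 1;

	events_per_sec = count * NSEC_PER_SEC / nsec;
	period = events_per_sec / freq;

	return period ? period : 1;
}
```

The complaint in the thread is that inherited child events restart from period 1 instead of reusing the parent's already-adapted period, so each child pays this ramp-up cost again.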
On Thu, Nov 06, 2014 at 09:54:20AM -0500, Kan Liang wrote:
--- a/kernel/events/core.c
@@ -2673,64 +2666,6 @@ static void
perf_event_context_sched_in(struct
perf_event_context *ctx, }
/*
- * When sampling the branch stack in system-wide, it may be necessary
- * to flush the
Em Tue, Jan 06, 2015 at 11:53:56AM -0300, Arnaldo Carvalho de Melo
escreveu:
Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.li...@intel.com escreveu:
Currently, the perf diff only works with same binaries. That's
because it compares the symbol start address. It doesn't work if the
Hi Peter,
Could you please review the patch?
Thanks,
Kan
Hi Peter,
The patch is a month old. I checked that it still applies to the current tip.
Could you please take a look?
Thanks,
Kan
From: Kan Liang kan.li...@intel.com
For perf record frequency mode, the initial sample_period
On Thu, Dec 04, 2014 at 02:49:52PM +, Liang, Kan wrote:
I'll test it once I get my hands on a Haswell server again, I guess we
wait for the kernel change to go in first anyway, right?
I'm not sure, let's ask Peter.
Peter?
Ok so only 3/3 was missing right? I handed the kernel
Hi Arnaldo,
The patch is one month old. Kim and Jirka have reviewed it.
There is also another perf diff related patch which has similar situation.
https://lkml.org/lkml/2014/12/1/380
It was also reviewed by Jirka a month ago.
Both of them still apply to current perf/core.
Should I re-post
Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.li...@intel.com escreveu:
From: Kan Liang kan.li...@intel.com
Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries. This
Em Tue, Mar 03, 2015 at 05:09:29PM +, Liang, Kan escreveu:
Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo
escreveu:
Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.li...@intel.com
escreveu:
From: Kan Liang kan.li...@intel.com
With the patch
Hi Kan,
On Fri, Mar 13, 2015 at 02:18:07AM +, kan.li...@intel.com wrote:
From: Kan Liang kan.li...@intel.com
When multiple events are sampled it may not be needed to collect
callgraphs for all of them. The sample sites are usually nearby, and
it's enough to collect the
One corner case that needs to be mentioned is that the PEBS hardware doesn't
deal well with collisions, when PEBS events happen near to each other.
The records for the events can be collapsed into a single one, and
it's not possible to reconstruct all the events that caused the PEBS
record. However
-Original Message-
From: Andi Kleen [mailto:a...@firstfloor.org]
Sent: Monday, March 30, 2015 1:26 PM
To: Liang, Kan
Cc: Peter Zijlstra; linux-kernel@vger.kernel.org; mi...@kernel.org;
a...@infradead.org; eran...@google.com; a...@firstfloor.org
Subject: Re: [PATCH V5 4/6] perf
On Thu, Mar 26, 2015 at 02:13:23PM -0400, kan.li...@intel.com wrote:
This patch moves intel_shared_regs_constraints for branch_reg ahead of
intel_pebs_constraints.
Why not all shared regs?
Yes, all shared regs can also be moved ahead.
The patch is named for modifying the branch filter. I
Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo
escreveu:
Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.li...@intel.com escreveu:
From: Kan Liang kan.li...@intel.com
With the patch 1/5, it's possible to group read events from
different pmus. -C can be used to
* tip-bot for Kan Liang tip...@zytor.com wrote:
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4446,7 +4446,7 @@ static int perf_mmap(struct file *file, struct
vm_area_struct *vma)
* If we have rb pages ensure they're a power-of-two number, so
we
* can do
* Liang, Kan kan.li...@intel.com wrote:
* tip-bot for Kan Liang tip...@zytor.com wrote:
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4446,7 +4446,7 @@ static int perf_mmap(struct file *file, struct
vm_area_struct *vma)
* If we have rb pages
Hi Arnaldo,
Could you please review the patch?
I've already updated the patch description to try to address your concern.
Please let me know if you have any questions.
Thanks,
Kan
From: Kan Liang kan.li...@intel.com
Currently, the perf diff only works with same binaries. That's because it
Commit 2e77784bb7d8 (perf callchain: Move cpumode resolve code to
add_callchain_ip) promised No change in behavior..
As this commit breaks callchains on s390x (symbols not getting resolved,
observed when profiling the kernel), this statement is wrong. The
cpumode must be kept when
Subject: Re: [PATCH 1/1] perf/x86: filter branches for PEBS event
On Thu, Mar 26, 2015 at 11:13 AM, kan.li...@intel.com wrote:
From: Kan Liang kan.li...@intel.com
For supporting Intel LBR branches filtering, Intel LBR sharing logic
mechanism is introduced from commit b36817e88630
This leads me to believe that this patch:
commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc
Author: Kan Liang kan.li...@intel.com
Date: Tue Jan 20 04:54:25 2015 +
perf/x86/intel/uncore: Move uncore_box_init() out of driver
initialization
If I revert it, I bet things
So I changed it slightly to the below; changes are:
- record 'lost' events to all set bits; after all we really do not
know which event this sample belonged to, only logging to the first
set bit seems 'wrong'.
If so, the same dropped sample will be counted multiple times. It's
On Wed, May 06, 2015 at 03:33:54PM -0400, Kan Liang wrote:
From: Kan Liang kan.li...@intel.com
This patch modified the perf tool to handle the new RECORD type,
PERF_RECORD_LOST_SAMPLES.
The number of lost-sample events is stored in
.nr_events[PERF_EVENT_LOST_SAMPLES]. While the
On Mon, Apr 20, 2015 at 04:07:47AM -0400, Kan Liang wrote:
+static inline void *
+get_next_pebs_record_by_bit(void *base, void *top, int bit)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ void *at;
+ u64 pebs_status;
+
+ if (base == NULL)
+
On Tue, May 05, 2015 at 03:07:23PM +0200, Peter Zijlstra wrote:
On Mon, Apr 20, 2015 at 04:07:47AM -0400, Kan Liang wrote:
From: Yan, Zheng zheng.z@intel.com
+static void perf_log_lost(struct perf_event *event)
+{
+ struct perf_output_handle handle;
+ struct perf_sample_data
On Tue, May 05, 2015 at 04:30:25PM +, Liang, Kan wrote:
+	for (at = base; at < top; at += x86_pmu.pebs_record_size) {
		struct pebs_record_nhm *p = at;
		for_each_set_bit(bit, (unsigned long *)&p->status
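The for_each_set_bit() walk over the PEBS status word above visits each counter index that may have contributed to a record. A userspace sketch of the same bit-walk, using the GCC/Clang builtin in place of the kernel helper (the function name is an assumption):

```c
#include <stdint.h>

/*
 * Collect the indices of all set bits in @status, lowest first,
 * into @bits (at most @max entries).  Returns how many were found.
 */
static int collect_status_bits(uint64_t status, int *bits, int max)
{
	int n = 0;

	while (status && n < max) {
		/* Index of the lowest set bit... */
		bits[n++] = __builtin_ctzll(status);
		/* ...then clear it and continue. */
		status &= status - 1;
	}
	return n;
}
```

For a collided PEBS record with several status bits set, this is exactly why the thread debates whether to log the sample to the first set bit only or to all of them.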
Em Sun, May 10, 2015 at 03:13:15PM -0400, Kan Liang escreveu:
From: Kan Liang kan.li...@intel.com
This patch modified the perf tool to handle the new RECORD type,
PERF_RECORD_LOST_SAMPLES.
The number of lost-sample events is stored in
.nr_events[PERF_RECORD_LOST_SAMPLES]. While the
On Sun, May 10, 2015 at 03:13:07PM -0400, Kan Liang wrote:
changes since v8:
- Record 'lost' events to all set bits
- dropped the @id field from the lost samples record
- Print lost samples event nr in perf report --stdio output
Only the last two patches changed, right?
I did some tests on HSX platform. It works well.
Tested-by: Kan Liang kan.li...@intel.com
Kan
On Tue, May 12, 2015 at 03:25:57PM +0200, Peter Zijlstra wrote:
So seeing how I have both this series and Andi's SKL patches, I did
the below on top of them both.
Could someone try
-Original Message-
From: Peter Zijlstra [mailto:pet...@infradead.org]
Sent: Wednesday, April 15, 2015 1:15 PM
To: Liang, Kan
Cc: linux-kernel@vger.kernel.org; mi...@kernel.org;
a...@infradead.org; eran...@google.com; a...@firstfloor.org
Subject: Re: [PATCH V6 3/6] perf, x86: large
On Thu, Apr 09, 2015 at 12:37:43PM -0400, Kan Liang wrote:
@@ -280,8 +280,9 @@ static int alloc_pebs_buffer(int cpu)
	ds->pebs_absolute_maximum = ds->pebs_buffer_base +
		max * x86_pmu.pebs_record_size;
-	ds->pebs_interrupt_threshold = ds->pebs_buffer_base +
-
On Wed, Apr 15, 2015 at 03:56:11AM -0400, Kan Liang wrote:
The event count can only be read when the event is already sched_in.
Yeah, so no. This breaks what groups are. Group events _must_ be
co-scheduled. You cannot guarantee you can schedule events from another
PMU.
Why? I think it's
A) the CTRn value reaches 0:
- the corresponding bit in GLOBAL_STATUS gets set
- we start arming the hardware assist
some unspecified amount of time later --
this could cover multiple events of interest
B) the hardware assist is armed, any next event
-Original Message-
From: Peter Zijlstra [mailto:pet...@infradead.org]
Sent: Friday, April 17, 2015 9:13 AM
To: Liang, Kan
Cc: linux-kernel@vger.kernel.org; mi...@kernel.org;
a...@infradead.org; eran...@google.com; a...@firstfloor.org
Subject: Re: [PATCH V6 4/6] perf, x86: handle
Em Wed, Jun 17, 2015 at 09:51:10AM -0400, kan.li...@intel.com escreveu:
From: Kan Liang kan.li...@intel.com
System-wide sampling like 'perf top' or 'perf record -a' reads all
threads' /proc/xxx/maps before sampling. If there are any threads which
keep generating a huge, growing maps file,
Em Thu, Jun 11, 2015 at 02:32:40AM -0400, kan.li...@intel.com escreveu:
perf stat ignores the unsupported event and continues to count
supported events. But if the unsupported event is a group leader, the perf
tool will crash. After applying this patch, the unsupported group
leader will error
Em Wed, Jun 10, 2015 at 03:46:04AM -0400, kan.li...@intel.com escreveu:
perf top reads all threads' /proc/xxx/maps. If there are any threads
which keep generating a huge, growing /proc/xxx/maps, perf will loop
infinitely in perf_event__synthesize_mmap_events.
This patch fixes this
Em Fri, Jun 12, 2015 at 10:24:36PM -0600, David Ahern escreveu:
coming back to this ...
On 6/12/15 2:39 PM, Liang, Kan wrote:
Yes, perf can always read the proc file. The problem is that the proc
file is huge and keeps growing faster than the proc reader.
So perf top does loop
On 6/12/15 2:39 PM, Liang, Kan wrote:
Here are the test results.
Please note that I get synthesized threads took... after the test case
exits.
It means both ways have the same issue.
Got it. So what you really mean is: launching perf on an already running
process, perf never finishes