On 7/11/2017 10:28 AM, Michael Ellerman wrote:
"Jin, Yao" writes:
On 7/10/2017 9:46 PM, Peter Zijlstra wrote:
On Mon, Jul 10, 2017 at 08:10:50AM -0500, Segher Boessenkool wrote:
PERF_BR_INT is triggered by the "int" instruction.
PERF_BR_IRQ is triggered by interrupts.
_BR_IND = 3, /* indirect */
PERF_BR_CALL = 4, /* call */
PERF_BR_IND_CALL = 5, /* indirect call */
PERF_BR_RET = 6, /* return */
Thanks
Jin Yao
*/
+ PERF_BR_IND_CALL = 5, /* indirect call */
+ PERF_BR_RET = 6, /* return */
I have decided to define only these types in this patch set. Other, more
arch-specific branch types can be added in the future.
Is this OK?
Thanks
Jin Yao
On 7/10/2017 9:10 PM, Segher Boessenkool
prepare the new patch.
Thanks
Jin Yao
On 7/10/2017 6:32 PM, Michael Ellerman wrote:
"Jin, Yao" writes:
On 7/10/2017 2:05 PM, Michael Ellerman wrote:
Jin Yao writes:
It is often useful to know the branch types while analyzing branch
data. For example, a call is very different
On 7/10/2017 2:05 PM, Michael Ellerman wrote:
Hi Jin Yao,
Sorry I haven't commented until now, but it got lost in the flood of
patches.
Never mind, it's no problem. :)
Just a few nit-picks below ...
Jin Yao writes:
It is often useful to know the branch types while analyzing b
Hi Arnaldo,
Could this series be merged? It has been more than 2 months since Jiri Olsa
last gave his ack.
Thanks
Jin Yao
On 6/26/2017 2:24 PM, Jin, Yao wrote:
Hi maintainers,
Is this patch series OK, or is there anything I should update?
Thanks
Jin Yao
On 6/2/2017 4:02 PM, Jin, Yao
Hi maintainers,
Is this patch series OK, or is there anything I should update?
Thanks
Jin Yao
On 6/2/2017 4:02 PM, Jin, Yao wrote:
Hi maintainers,
Is this patch series (v6) OK for merging?
Thanks
Jin Yao
On 4/20/2017 5:36 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:48PM +0800, Jin Yao
Hi maintainers,
Is this patch series (v6) OK for merging?
Thanks
Jin Yao
On 4/20/2017 5:36 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:48PM +0800, Jin Yao wrote:
v6:
Update according to the review comments from
Jiri Olsa . Major modifications are:
1. Move that multiline
On 5/9/2017 8:39 PM, Jiri Olsa wrote:
On Tue, May 09, 2017 at 07:57:11PM +0800, Jin, Yao wrote:
SNIP
+
+ type >>= 2; /* skip X86_BR_USER and X86_BR_KERNEL */
+ mask = ~(~0 << 1);
is that a fancy way to get 1 into the mask? what do I miss?
you did not comment on thi
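
For reference, a minimal sketch of what the quoted expression evaluates to. The general idiom ~(~0 << n) sets the low n bits, so with n = 1 it is indeed just 1. The helper name low_bits_mask is hypothetical and not part of the patch, and the sketch uses unsigned arithmetic (an assumption on my side) because left-shifting a negative signed value is not well defined in C.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper, not from the patch: build a mask with the low
 * n bits set using the ~(~0 << n) idiom. Unsigned arithmetic keeps the
 * shift well defined.
 */
static unsigned int low_bits_mask(unsigned int n)
{
	/* Shifting by the full width of the type is undefined in C. */
	assert(n < sizeof(unsigned int) * CHAR_BIT);

	return ~(~0u << n);
}
```

With n = 1 the helper returns 1, which is the value the review comment asks about; a plain constant would express the same thing.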
On 5/9/2017 4:26 PM, Jiri Olsa wrote:
On Mon, Apr 24, 2017 at 08:47:14AM +0800, Jin, Yao wrote:
On 4/23/2017 9:55 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:50PM +0800, Jin Yao wrote:
SNIP
+#define X86_BR_TYPE_MAP_MAX 16
+
+static int
+common_branch_type(int type
On 4/24/2017 8:47 AM, Jin, Yao wrote:
On 4/23/2017 9:55 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:50PM +0800, Jin Yao wrote:
SNIP
+#define X86_BR_TYPE_MAP_MAX 16
+
+static int
+common_branch_type(int type)
+{
+int i, mask;
+const int branch_map[X86_BR_TYPE_MAP_MAX
On 4/23/2017 9:55 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:50PM +0800, Jin Yao wrote:
SNIP
+#define X86_BR_TYPE_MAP_MAX 16
+
+static int
+common_branch_type(int type)
+{
+ int i, mask;
+ const int branch_map[X86_BR_TYPE_MAP_MAX] = {
+ PERF_BR_CALL
On 4/20/2017 5:36 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:48PM +0800, Jin Yao wrote:
v6:
Update according to the review comments from
Jiri Olsa . Major modifications are:
1. Move that multiline conditional code inside {} brackets.
2. Move branch_type_stat_display
entry which just contains the to ip.
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 38 +-
tools/perf/util/callchain.h | 5 -
tools/perf/util/machine.c | 26 +-
3 files changed, 50 insertions(+), 19 deletions(-)
diff --
ode checking in
hist_iter__branch_callback().
v4: Compared to the previous version, the major changes are:
Add the computation of JCC forward/JCC backward and the cross-page check,
using the from and to addresses.
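
As an illustration of the forward/backward part, here is a minimal sketch of how a conditional branch could be classified in user space from its two addresses. The struct and function names are hypothetical and not taken from the patch; treating a branch whose target equals its source as backward is also an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical counters for conditional-branch direction. */
struct ex_jcc_stat {
	uint64_t fwd;	/* target above the branch instruction */
	uint64_t bwd;	/* target at or below the branch instruction */
};

/*
 * Classify one conditional branch by comparing its source (from) and
 * target (to) addresses. A target equal to the source is counted as
 * backward here, which is a choice made for this sketch.
 */
static void ex_jcc_count(struct ex_jcc_stat *st, uint64_t from, uint64_t to)
{
	if (to > from)
		st->fwd++;
	else
		st->bwd++;
}
```

Only the from and to addresses are needed, which is why this can be done after sampling rather than in the kernel.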
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 25 +
a new patch in v5 patch series.
Signed-off-by: Jin Yao
---
tools/perf/util/Build | 1 +
tools/perf/util/branch.c | 168 +++
tools/perf/util/branch.h | 25 +++
tools/perf/util/event.h | 3 +-
4 files changed, 196 insertions(+), 1 deletion(-)
c
into {} brackets in
counts_str_build()
2. Keep the original display order, that is:
predicted, abort, cycles, iterations
v5: It's a new patch in v5 patch series.
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 106
1
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Change log
--
v6: Not changed.
v5: Not changed.
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch
intel_pmu_lbr_read_32 and
intel_pmu_lbr_read_64
Signed-off-by: Jin Yao
---
arch/x86/events/intel/lbr.c | 53 -
1 file changed, 52 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index f924629..f10a7ed
changes are:
1. Remove PERF_BR_JCC_FWD/PERF_BR_JCC_BWD; they will be
computed later in userspace.
2. Remove the "cross" field in perf_branch_entry. The cross-page
computation will be done later in userspace.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h
h
types defined in perf_event.h:
Jin Yao (7):
perf/core: Define the common branch type classification
perf/x86/intel: Record branch type
perf record: Create a new option save_type in --branch-filter
perf report: Refactor the branch info printing code
perf util: Create branch.c/.h
On 4/19/2017 10:15 PM, Jiri Olsa wrote:
On Wed, Apr 19, 2017 at 11:48:14PM +0800, Jin Yao wrote:
SNIP
+static int branch_type_str(struct branch_type_stat *stat,
+ char *bf, int bfsize)
+{
+ int i, j = 0, printed = 0;
+ u64 total = 0;
+
+ for (i = 0
compute the JCC forward/JCC backward and cross-page
checks in user space from the from and to addresses, but each
callchain entry contains only one ip (either from or to), so
this patch appends the branch from address to the callchain
entry that contains only the to ip.
Signed-off-by: Jin Yao
rsion, the major changes are:
Add the computation of JCC forward/JCC backward and the cross-page check,
using the from and to addresses.
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 69 +
tools/perf/util/hist.c | 5 +---
2 files c
in v5 patch series.
Signed-off-by: Jin Yao
---
tools/perf/util/Build | 1 +
tools/perf/util/branch.c | 63
tools/perf/util/branch.h | 23 ++
tools/perf/util/event.h | 3 ++-
4 files changed, 89 insertions(+), 1 deletion(-)
create
eries.
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 106 +++-
1 file changed, 45 insertions(+), 61 deletions(-)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 2e5eff5..8cae8a6 100644
--- a/tools/perf/util/callchain.c
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Change log
--
v5: Not changed.
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch-options.c | 1 +
2
intel_pmu_lbr_read_64
Signed-off-by: Jin Yao
---
arch/x86/events/intel/lbr.c | 53 -
1 file changed, 52 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index f924629..f10a7ed 100644
--- a/arch/x86
. Remove PERF_BR_JCC_FWD/PERF_BR_JCC_BWD; they will be
computed later in userspace.
2. Remove the "cross" field in perf_branch_entry. The cross-page
computation will be done later in userspace.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h
ed:
perf record: Create a new option save_type in --branch-filter
v2:
---
1. Use 4 bits in perf_branch_entry to record branch type.
2. Pull out some common branch types from FAR_BRANCH. Now the branch
types defined in perf_event.h:
Jin Yao (7):
perf/core: Define the common branch type clas
On 4/19/2017 8:53 AM, Jin, Yao wrote:
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:05AM +0800, Jin Yao wrote:
SNIP
+const char *branch_type_name(int type)
+{
+const char *branch_names[PERF_BR_MAX] = {
+"N/A",
+"JCC"
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:05AM +0800, Jin Yao wrote:
SNIP
+const char *branch_type_name(int type)
+{
+ const char *branch_names[PERF_BR_MAX] = {
+ "N/A",
+ "JCC",
+ "JMP"
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:05AM +0800, Jin Yao wrote:
SNIP
+static int hist_iter__branch_callback(struct hist_entry_iter *iter,
+ struct addr_location *al __maybe_unused,
+ bool
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:06AM +0800, Jin Yao wrote:
SNIP
+static int branch_type_str(struct branch_type_stat *stat,
+ char *bf, int bfsize)
+{
+ int i, j = 0, printed = 0;
+ u64 total = 0;
+
+ for (i = 0
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:06AM +0800, Jin Yao wrote:
SNIP
static int counts_str_build(char *bf, int bfsize,
u64 branch_count, u64 predicted_count,
u64 abort_count, u64 cycles_count
On 4/13/2017 10:00 AM, Jin, Yao wrote:
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
SNIP
3. Use 2 bits in perf_branch_entry for a "cross" metric that checks
whether a branch crosses a 4K or 2M area. It's an approximate computation
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
SNIP
3. Use 2 bits in perf_branch_entry for a "cross" metric that checks
whether a branch crosses a 4K or 2M area. It's an approximate computation
for checking whether the branch crosses a 4K p
On 4/12/2017 10:26 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 08:25:34PM +0800, Jin, Yao wrote:
SNIP
# Overhead Command Source Shared Object Source Symbol
Target Symbol Basic Block Cycles
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
SNIP
3. Use 2 bits in perf_branch_entry for a "cross" metric that checks
whether a branch crosses a 4K or 2M area. It's an approximate computation
for checking whether the branch crosses a 4K p
forward/JCC backward and cross-page
checks in user space from the from and to addresses, but each
callchain entry contains only one ip (either from or to), so
this patch appends the branch from address to the callchain
entry that contains only the to ip.
Signed-off-by: Jin Yao
---
tools/perf/util
e from and to addresses.
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 70 +
tools/perf/util/event.h | 3 +-
tools/perf/util/hist.c | 5 +---
tools/perf/util/util.c | 59 ++
tools/perf/u
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch-options.c | 1 +
2 files changed, 2 insertions(+)
diff
are:
1. Uses a lookup table to convert x86 branch type to common branch
type.
2. Move the JCC forward/JCC backward and cross page computing to
user space.
3. Initialize branch type to 0 in intel_pmu_lbr_read_32 and
intel_pmu_lbr_read_64
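
To illustrate item 1, here is a minimal sketch of a table-driven conversion from one-bit-per-kind arch flags to a common classification. All names, flag positions and enum values below (the EX_ prefix) are hypothetical and chosen only for this example; they are not the values used by the patch or by perf_event.h.

```c
#include <stddef.h>

/* Hypothetical arch-specific flags, one bit per branch kind. */
enum ex_arch_br {
	EX_ARCH_CALL     = 1 << 0,
	EX_ARCH_RET      = 1 << 1,
	EX_ARCH_JCC      = 1 << 2,
	EX_ARCH_IND_CALL = 1 << 3,
};

/* Hypothetical common classification shared by all architectures. */
enum ex_common_br {
	EX_BR_UNKNOWN = 0,
	EX_BR_JCC,
	EX_BR_CALL,
	EX_BR_IND_CALL,
	EX_BR_RET,
};

#define EX_MAP_MAX 4

/* Entry i is the common type for arch flag bit i. */
static const int ex_branch_map[EX_MAP_MAX] = {
	EX_BR_CALL,	/* bit 0 */
	EX_BR_RET,	/* bit 1 */
	EX_BR_JCC,	/* bit 2 */
	EX_BR_IND_CALL,	/* bit 3 */
};

/* Return the common type for the lowest flag bit set in type. */
static int ex_common_branch_type(unsigned int type)
{
	size_t i;

	for (i = 0; i < EX_MAP_MAX; i++) {
		if (type & (1u << i))
			return ex_branch_map[i];
	}

	return EX_BR_UNKNOWN;
}
```

The table keeps the mapping in one place, so adding a branch kind means adding one entry rather than another branch in a switch or if chain.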
Signed-off-by: Jin Yao
---
arch/x86/events
. Remove the "cross" field in perf_branch_entry. The cross-page
computation will be done later in userspace.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h | 29 -
tools/include/uapi/linux/perf_event.h | 29
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (RET cycles:9)
Jin Yao (5):
perf/core: Define the common branch type classification
perf
On 4/11/2017 4:18 PM, Peter Zijlstra wrote:
On Tue, Apr 11, 2017 at 09:52:19AM +0200, Peter Zijlstra wrote:
On Tue, Apr 11, 2017 at 06:56:30PM +0800, Jin Yao wrote:
@@ -960,6 +1006,11 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
cpuc->lbr_entries[i].from
On 4/11/2017 4:35 PM, Peter Zijlstra wrote:
On Tue, Apr 11, 2017 at 04:11:21PM +0800, Jin, Yao wrote:
On 4/11/2017 3:52 PM, Peter Zijlstra wrote:
This is still a completely inadequate changelog. I really will not
accept patches like this.
Hi,
The changelog is added in the cover-letter
ch patch's description?
That's fine, I can add and resend this patch.
Thanks
Jin Yao
forward CROSS_4K cycles:1)
__random random.c:295 (JCC backward CROSS_2M cycles:1)
__random random.c:295 (JCC forward CROSS_4K cycles:1)
__random random.c:295 (CROSS_2M RET cycles:9)
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 195
g. We don't know if the area is 4K or
2MB, so always compute both.
To keep the output simple, if a branch crosses a 2M area, CROSS_4K
is not incremented.
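
The rule above can be sketched as follows. This is an illustration under stated assumptions, not the patch code: the names carry a hypothetical EX_ prefix, and "crossing an area" is taken to mean that the from and to addresses fall in different size-aligned regions.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_AREA_4K (UINT64_C(1) << 12)
#define EX_AREA_2M (UINT64_C(1) << 21)

/* Hypothetical counters for the two cross-area cases. */
struct ex_cross_stat {
	uint64_t cross_4k;
	uint64_t cross_2m;
};

/* True when a and b lie in different size-aligned areas (size is a power of 2). */
static bool ex_cross_area(uint64_t a, uint64_t b, uint64_t size)
{
	return (a & ~(size - 1)) != (b & ~(size - 1));
}

/*
 * Check both sizes, since the real page size is not known, but count a
 * branch that crosses a 2M area only as CROSS_2M, not also as CROSS_4K.
 */
static void ex_cross_count(struct ex_cross_stat *st, uint64_t from, uint64_t to)
{
	if (ex_cross_area(from, to, EX_AREA_2M))
		st->cross_2m++;
	else if (ex_cross_area(from, to, EX_AREA_4K))
		st->cross_4k++;
}
```

Checking the 2M case first is what keeps the two counters mutually exclusive, since any branch that crosses a 2M boundary also crosses a 4K boundary.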
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 70 +
tools/perf/util/event.
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch-options.c | 1 +
2 files changed, 2 insertions(+)
diff
Perf already has support for disassembling the branch instruction
and using the branch type for filtering. The patch just records
the branch type in perf_branch_entry.
Before recording, the patch converts the x86 branch classification
to the common branch classification.
Signed-off-by: Jin Yao
disassemble the branch instruction and record the branch
type.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h | 29 -
tools/include/uapi/linux/perf_event.h | 29 -
2 files changed, 56 insertions(+), 2 deletions(-)
diff --git a
_random random.c:297 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (RET cycles:9)
Jin Yao (5):
perf/core: Defi
d/backward computing to user-space, though
it makes the user-space code more complicated.
Thanks
Jin Yao
(JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (RET cycles:9)
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 221
tools/perf/util/callchain.h | 20
2 files changed, 182
g. We don't know if the area is 4K or
2MB, so always compute both.
To keep the output simple, if a branch crosses a 2M area, CROSS_4K
is not incremented.
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 212
tools/perf/util/event.h
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch-options.c | 1 +
2 files changed, 2 insertions(+)
diff
the
branches cross 4K or 2MB areas. It's an approximation of
crossing a 4K page or a 2MB page.
Signed-off-by: Jin Yao
---
arch/x86/events/intel/lbr.c | 106 +++-
1 file changed, 105 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/lb
record the branch
type.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h | 37 ++-
tools/include/uapi/linux/perf_event.h | 37 ++-
2 files changed, 72 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux
1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (RET cycles:9)
Jin Yao (5):
perf/core: Define the common branch type classification
perf/x86/intel: Record branch type
perf record: Create a new option save_type in --branch-filter
perf