[tip:perf/core] perf event-parse: Use fixed size string for comms

2018-09-06 Thread tip-bot for Chris Phlipot
Commit-ID:  c9f23d2bc21cb263ae931f3e264d003d746107bb
Gitweb: https://git.kernel.org/tip/c9f23d2bc21cb263ae931f3e264d003d746107bb
Author: Chris Phlipot 
AuthorDate: Wed, 29 Aug 2018 19:19:50 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 30 Aug 2018 14:51:45 -0300

perf event-parse: Use fixed size string for comms

Some implementations of libc do not support the 'm' width modifier as
part of the scanf string format specifier. This can cause the parsing to
fail.  Since the parser never checks if the scanf parsing was
successesful, this can result in a crash.

Change the comm string to be allocated as a fixed size instead of
dynamically using 'm' scanf width modifier. This can be safely done
since comm size is limited to 16 bytes by TASK_COMM_LEN within the
kernel.

This change prevents perf from crashing when linked against bionic as
well as reduces the total number of heap allocations and frees invoked
while accomplishing the same task.

Signed-off-by: Chris Phlipot 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20180830021950.15563-1-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/trace-event-parse.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/trace-event-parse.c 
b/tools/perf/util/trace-event-parse.c
index 920b1d58a068..e76214f8d596 100644
--- a/tools/perf/util/trace-event-parse.c
+++ b/tools/perf/util/trace-event-parse.c
@@ -164,16 +164,15 @@ void parse_ftrace_printk(struct tep_handle *pevent,
 void parse_saved_cmdline(struct tep_handle *pevent,
 char *file, unsigned int size __maybe_unused)
 {
-   char *comm;
+   char comm[17]; /* Max comm length in the kernel is 16. */
char *line;
char *next = NULL;
int pid;
 
line = strtok_r(file, "\n", &next);
while (line) {
-   sscanf(line, "%d %ms", &pid, &comm);
-   tep_register_comm(pevent, comm, pid);
-   free(comm);
+   if (sscanf(line, "%d %16s", &pid, comm) == 2)
+   tep_register_comm(pevent, comm, pid);
line = strtok_r(NULL, "\n", &next);
}
 }


[tip:perf/core] perf util: Fix bad memory access in trace info.

2018-09-06 Thread tip-bot for Chris Phlipot
Commit-ID:  a72f64261359b7451f8478f2a2bf357b4e6c757f
Gitweb: https://git.kernel.org/tip/a72f64261359b7451f8478f2a2bf357b4e6c757f
Author: Chris Phlipot 
AuthorDate: Tue, 28 Aug 2018 23:19:54 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 30 Aug 2018 14:50:50 -0300

perf util: Fix bad memory access in trace info.

In the write to the output_fd in the error condition of
record_saved_cmdline(), we are writing 8 bytes from a memory location on
the stack that contains a primitive that is only 4 bytes in size.
Change the primitive to 8 bytes in size to match the size of the write
in order to avoid reading unknown memory from the stack.

Signed-off-by: Chris Phlipot 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20180829061954.18871-1-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/trace-event-info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/trace-event-info.c 
b/tools/perf/util/trace-event-info.c
index c85d0d1a65ed..7b0ca7cbb7de 100644
--- a/tools/perf/util/trace-event-info.c
+++ b/tools/perf/util/trace-event-info.c
@@ -377,7 +377,7 @@ out:
 
 static int record_saved_cmdline(void)
 {
-   unsigned int size;
+   unsigned long long size;
char *path;
struct stat st;
int ret, err = 0;


[tip:perf/core] perf tools: Update android build documentation

2016-07-05 Thread tip-bot for Chris Phlipot
Commit-ID:  3d0376113ed9cf92b86885bf5102944b61523f5b
Gitweb: http://git.kernel.org/tip/3d0376113ed9cf92b86885bf5102944b61523f5b
Author: Chris Phlipot 
AuthorDate: Thu, 30 Jun 2016 22:12:35 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 4 Jul 2016 20:27:27 -0300

perf tools: Update android build documentation

Update the android build documentation according to recent android build
fixes. The instructions for step 1a and step 2 were updated to work with
NDK version 11(oldest supported version) and NDK version 12(current
version).

Signed-off-by: Chris Phlipot 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1467349955-1135-5-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/android.txt | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Documentation/android.txt 
b/tools/perf/Documentation/android.txt
index 8484c3a..24a5999 100644
--- a/tools/perf/Documentation/android.txt
+++ b/tools/perf/Documentation/android.txt
@@ -12,14 +12,14 @@ Set the NDK variable to point to the path where you 
installed the NDK:
 
 2. Set cross-compiling environment variables for NDK toolchain and sysroot.
 For arm:
-  export 
NDK_TOOLCHAIN=${NDK}/toolchains/arm-linux-androideabi-4.6/prebuilt/linux-x86/bin/arm-linux-androideabi-
-  export NDK_SYSROOT=${NDK}/platforms/android-9/arch-arm
+  export 
NDK_TOOLCHAIN=${NDK}/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-
+  export NDK_SYSROOT=${NDK}/platforms/android-24/arch-arm
 For x86:
-  export 
NDK_TOOLCHAIN=${NDK}/toolchains/x86-4.6/prebuilt/linux-x86/bin/i686-linux-android-
-  export NDK_SYSROOT=${NDK}/platforms/android-9/arch-x86
+  export 
NDK_TOOLCHAIN=${NDK}/toolchains/x86-4.9/prebuilt/linux-x86_64/bin/i686-linux-android-
+  export NDK_SYSROOT=${NDK}/platforms/android-24/arch-x86
 
-This method is not working for Android NDK versions up to Revision 8b.
-perf uses some bionic enhancements that are not included in these NDK versions.
+This method is only tested for Android NDK versions Revision 11b and later.
+perf uses some bionic enhancements that are not included in prior NDK versions.
 You can use method (b) described below instead.
 
 (b). Use the Android source tree
@@ -49,9 +49,9 @@ II. Compile perf for Android
 
 You need to run make with the NDK toolchain and sysroot defined above:
 For arm:
-  make ARCH=arm CROSS_COMPILE=${NDK_TOOLCHAIN} 
CFLAGS="--sysroot=${NDK_SYSROOT}"
+  make WERROR=0 ARCH=arm CROSS_COMPILE=${NDK_TOOLCHAIN} EXTRA_CFLAGS="-pie 
--sysroot=${NDK_SYSROOT}"
 For x86:
-  make ARCH=x86 CROSS_COMPILE=${NDK_TOOLCHAIN} 
CFLAGS="--sysroot=${NDK_SYSROOT}"
+  make WERROR=0 ARCH=x86 CROSS_COMPILE=${NDK_TOOLCHAIN} EXTRA_CFLAGS="-pie 
--sysroot=${NDK_SYSROOT}"
 
 III. Install perf
 ---


[tip:perf/core] tools lib api: Respect WERROR=0 for build

2016-07-05 Thread tip-bot for Chris Phlipot
Commit-ID:  b983d54473344a9ef524a231943478047a779796
Gitweb: http://git.kernel.org/tip/b983d54473344a9ef524a231943478047a779796
Author: Chris Phlipot 
AuthorDate: Thu, 30 Jun 2016 22:12:32 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 4 Jul 2016 20:27:26 -0300

tools lib api: Respect WERROR=0 for build

This enables the workaround for compilers that generate warnings when
compiling libapi.

Signed-off-by: Chris Phlipot 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1467349955-1135-2-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/api/Makefile | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/api/Makefile b/tools/lib/api/Makefile
index 67ff93e..c7ceea6 100644
--- a/tools/lib/api/Makefile
+++ b/tools/lib/api/Makefile
@@ -17,7 +17,13 @@ MAKEFLAGS += --no-print-directory
 LIBFILE = $(OUTPUT)libapi.a
 
 CFLAGS := $(EXTRA_WARNINGS) $(EXTRA_CFLAGS)
-CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -Werror -O6 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -fPIC
+CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -O6 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -fPIC
+
+# Treat warnings as errors unless directed not to
+ifneq ($(WERROR),0)
+  CFLAGS += -Werror
+endif
+
 CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
 CFLAGS += -I$(srctree)/tools/lib/api
 


[tip:perf/core] tools lib subcmd: Respect WERROR=0 for build

2016-07-05 Thread tip-bot for Chris Phlipot
Commit-ID:  fd01d06ae33be63cff7d133e650cd1eb32f1d548
Gitweb: http://git.kernel.org/tip/fd01d06ae33be63cff7d133e650cd1eb32f1d548
Author: Chris Phlipot 
AuthorDate: Thu, 30 Jun 2016 22:12:33 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 4 Jul 2016 20:27:26 -0300

tools lib subcmd: Respect WERROR=0 for build

this enables the workaround for compilers that generate warnings when
compiling libsubcmd.

Signed-off-by: Chris Phlipot 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1467349955-1135-3-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/subcmd/Makefile | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/subcmd/Makefile b/tools/lib/subcmd/Makefile
index a810370..ce4b7e5 100644
--- a/tools/lib/subcmd/Makefile
+++ b/tools/lib/subcmd/Makefile
@@ -19,7 +19,13 @@ MAKEFLAGS += --no-print-directory
 LIBFILE = $(OUTPUT)libsubcmd.a
 
 CFLAGS := $(EXTRA_WARNINGS) $(EXTRA_CFLAGS)
-CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -Werror -O6 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -fPIC
+CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -O6 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -fPIC
+
+# Treat warnings as errors unless directed not to
+ifneq ($(WERROR),0)
+  CFLAGS += -Werror
+endif
+
 CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE
 
 CFLAGS += -I$(srctree)/tools/include/


Re: [PATCH 3/4] perf tool: Fix build when sysconf doesn't support cache line size

2016-07-04 Thread Chris Phlipot



On 07/04/2016 05:26 PM, Arnaldo Carvalho de Melo wrote:

Em Mon, Jul 04, 2016 at 05:19:20PM -0700, Chris Phlipot escreveu:



On 07/04/2016 03:48 PM, Arnaldo Carvalho de Melo wrote:

Em Thu, Jun 30, 2016 at 10:12:34PM -0700, Chris Phlipot escreveu:

Enable perf to build on libc implementations where sysconf() doesn't
support _SC_LEVEL1_DCACHE_LINESIZE as a parameter.

For example, the Bionic implementation does not support this as a
paremter. Older versions of Bionic will throw an error when this is passed
in as a parameter, and more recent versions will just return 0 as the
cache line size.

Signed-off-by: Chris Phlipot 
---
   tools/perf/perf.c | 4 
   1 file changed, 4 insertions(+)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 8f21922..113ca5b 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -509,7 +509,11 @@ int main(int argc, const char **argv)

/* The page_size is placed in util object. */
page_size = sysconf(_SC_PAGE_SIZE);
+#ifdef _SC_LEVEL1_DCACHE_LINESIZE
cacheline_size = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
+#else
+   cacheline_size = 0;
+#endif


Couldn't we instead fallback to:

sysfs__read_int("devices/system/cpu/cpu0/cache/index0/coherency_line_size", 
&cacheline_size)

?


I agree that in general this would be a better fallback, but in  all Android
images I have tested so far, "devices/system/cpu/cpu0/cache" does not exist.
I know not know of a good way to retrieve cache line size in this case.

I would be ok with attempting to get cacheline size using using the
following methods, unless you have other ideas:

1. attempt to use sysconf(_SC_LEVEL1_DCACHE_LINESIZE)
2. attempt to use
sysfs__read_int("devices/system/cpu/cpu0/cache/index0/coherency_line_size",
&cacheline_size)
3. set to zero if both of the above fail.


Ok, but perhaps we should have some sort of warning in places using
this?

- Arnaldo


Such as printing a warning when cacheline_size is set to zero, or simply 
adding comments to the code in areas where cacheline_size is used?


-Chris








if (sysctl__read_int("kernel/perf_event_max_stack", &value) == 0)
sysctl_perf_event_max_stack = value;
--
2.7.4


Re: [PATCH 3/4] perf tool: Fix build when sysconf doesn't support cache line size

2016-07-04 Thread Chris Phlipot



On 07/04/2016 03:48 PM, Arnaldo Carvalho de Melo wrote:

Em Thu, Jun 30, 2016 at 10:12:34PM -0700, Chris Phlipot escreveu:

Enable perf to build on libc implementations where sysconf() doesn't
support _SC_LEVEL1_DCACHE_LINESIZE as a parameter.

For example, the Bionic implementation does not support this as a
paremter. Older versions of Bionic will throw an error when this is passed
in as a parameter, and more recent versions will just return 0 as the
cache line size.

Signed-off-by: Chris Phlipot 
---
  tools/perf/perf.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 8f21922..113ca5b 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -509,7 +509,11 @@ int main(int argc, const char **argv)

/* The page_size is placed in util object. */
page_size = sysconf(_SC_PAGE_SIZE);
+#ifdef _SC_LEVEL1_DCACHE_LINESIZE
cacheline_size = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
+#else
+   cacheline_size = 0;
+#endif


Couldn't we instead fallback to:

sysfs__read_int("devices/system/cpu/cpu0/cache/index0/coherency_line_size", 
&cacheline_size)

?


I agree that in general this would be a better fallback, but in  all 
Android images I have tested so far, "devices/system/cpu/cpu0/cache" 
does not exist. I know not know of a good way to retrieve cache line 
size in this case.


I would be ok with attempting to get cacheline size using using the 
following methods, unless you have other ideas:


1. attempt to use sysconf(_SC_LEVEL1_DCACHE_LINESIZE)
2. attempt to use 
sysfs__read_int("devices/system/cpu/cpu0/cache/index0/coherency_line_size", 
&cacheline_size)

3. set to zero if both of the above fail.





if (sysctl__read_int("kernel/perf_event_max_stack", &value) == 0)
sysctl_perf_event_max_stack = value;
--
2.7.4


[PATCH 3/4] perf tool: Fix build when sysconf doesn't support cache line size

2016-06-30 Thread Chris Phlipot
Enable perf to build on libc implementations where sysconf() doesn't
support _SC_LEVEL1_DCACHE_LINESIZE as a parameter.

For example, the Bionic implementation does not support this as a
paremter. Older versions of Bionic will throw an error when this is passed
in as a parameter, and more recent versions will just return 0 as the
cache line size.

Signed-off-by: Chris Phlipot 
---
 tools/perf/perf.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 8f21922..113ca5b 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -509,7 +509,11 @@ int main(int argc, const char **argv)
 
/* The page_size is placed in util object. */
page_size = sysconf(_SC_PAGE_SIZE);
+#ifdef _SC_LEVEL1_DCACHE_LINESIZE
cacheline_size = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
+#else
+   cacheline_size = 0;
+#endif
 
if (sysctl__read_int("kernel/perf_event_max_stack", &value) == 0)
sysctl_perf_event_max_stack = value;
-- 
2.7.4



[PATCH 1/4] tools lib api: Respect WERROR=0 for build

2016-06-30 Thread Chris Phlipot
This enables the workaround for compilers that generate warnings when
compiling libapi.

Signed-off-by: Chris Phlipot 
---
 tools/lib/api/Makefile | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/api/Makefile b/tools/lib/api/Makefile
index 67ff93e..c7ceea6 100644
--- a/tools/lib/api/Makefile
+++ b/tools/lib/api/Makefile
@@ -17,7 +17,13 @@ MAKEFLAGS += --no-print-directory
 LIBFILE = $(OUTPUT)libapi.a
 
 CFLAGS := $(EXTRA_WARNINGS) $(EXTRA_CFLAGS)
-CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -Werror -O6 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -fPIC
+CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -O6 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -fPIC
+
+# Treat warnings as errors unless directed not to
+ifneq ($(WERROR),0)
+  CFLAGS += -Werror
+endif
+
 CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
 CFLAGS += -I$(srctree)/tools/lib/api
 
-- 
2.7.4



[PATCH 2/4] tools lib subcmd: Respect WERROR=0 for build

2016-06-30 Thread Chris Phlipot
this enables the workaround for compilers that generate warnings when
compiling libsubcmd.

Signed-off-by: Chris Phlipot 
---
 tools/lib/subcmd/Makefile | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/subcmd/Makefile b/tools/lib/subcmd/Makefile
index a810370..ce4b7e5 100644
--- a/tools/lib/subcmd/Makefile
+++ b/tools/lib/subcmd/Makefile
@@ -19,7 +19,13 @@ MAKEFLAGS += --no-print-directory
 LIBFILE = $(OUTPUT)libsubcmd.a
 
 CFLAGS := $(EXTRA_WARNINGS) $(EXTRA_CFLAGS)
-CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -Werror -O6 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -fPIC
+CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -O6 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -fPIC
+
+# Treat warnings as errors unless directed not to
+ifneq ($(WERROR),0)
+  CFLAGS += -Werror
+endif
+
 CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE
 
 CFLAGS += -I$(srctree)/tools/include/
-- 
2.7.4



[PATCH 0/4] perf tool: Fix Android build

2016-06-30 Thread Chris Phlipot
It looks like the tools/perf/Documentation/android.txt hasn't been updated
in a while. Following the instructions in this document to cross-compile
perf for Android results in several build errors.

This patch-set aims to fix/workaround the incompatibilities introduced
since the android perf build was last tested.

The changes were tested to build for ubuntu 16.04 as well as cross compile
for android using NDK Versions 11 and 12.

to test android arm cross compile:

$ wget http://dl.google.com/android/repository/android-ndk-r12-linux-x86_64.zip
$ unzip android-ndk-r12-linux-x86_64.zip
$ export 
NDK_TOOLCHAIN=`pwd`/android-ndk-r12/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-
$ export NDK_SYSROOT=`pwd`/android-ndk-r12/platforms/android-24/arch-arm
$ make WERROR=0 ARCH=arm CROSS_COMPILE=${NDK_TOOLCHAIN} EXTRA_CFLAGS="-pie 
--sysroot=${NDK_SYSROOT}"


Chris Phlipot (4):
  tools lib api: Respect WERROR=0 for build
  tools lib subcmd: Respect WERROR=0 for build
  perf tool: Fix build when sysconf doesn't support cache line size
  perf tool: Update android build documentation

 tools/lib/api/Makefile   |  8 +++-
 tools/lib/subcmd/Makefile|  8 +++-
 tools/perf/Documentation/android.txt | 16 
 tools/perf/perf.c|  4 
 4 files changed, 26 insertions(+), 10 deletions(-)

-- 
2.7.4



[PATCH 4/4] perf tool: Update android build documentation

2016-06-30 Thread Chris Phlipot
Update the android build documentation according to recent android build
fixes. The instructions for step 1a and step 2 were updated to work with
NDK version 11(oldest supported version) and NDK version 12(current
version).

Signed-off-by: Chris Phlipot 
---
 tools/perf/Documentation/android.txt | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Documentation/android.txt 
b/tools/perf/Documentation/android.txt
index 8484c3a..24a5999 100644
--- a/tools/perf/Documentation/android.txt
+++ b/tools/perf/Documentation/android.txt
@@ -12,14 +12,14 @@ Set the NDK variable to point to the path where you 
installed the NDK:
 
 2. Set cross-compiling environment variables for NDK toolchain and sysroot.
 For arm:
-  export 
NDK_TOOLCHAIN=${NDK}/toolchains/arm-linux-androideabi-4.6/prebuilt/linux-x86/bin/arm-linux-androideabi-
-  export NDK_SYSROOT=${NDK}/platforms/android-9/arch-arm
+  export 
NDK_TOOLCHAIN=${NDK}/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-
+  export NDK_SYSROOT=${NDK}/platforms/android-24/arch-arm
 For x86:
-  export 
NDK_TOOLCHAIN=${NDK}/toolchains/x86-4.6/prebuilt/linux-x86/bin/i686-linux-android-
-  export NDK_SYSROOT=${NDK}/platforms/android-9/arch-x86
+  export 
NDK_TOOLCHAIN=${NDK}/toolchains/x86-4.9/prebuilt/linux-x86_64/bin/i686-linux-android-
+  export NDK_SYSROOT=${NDK}/platforms/android-24/arch-x86
 
-This method is not working for Android NDK versions up to Revision 8b.
-perf uses some bionic enhancements that are not included in these NDK versions.
+This method is only tested for Android NDK versions Revision 11b and later.
+perf uses some bionic enhancements that are not included in prior NDK versions.
 You can use method (b) described below instead.
 
 (b). Use the Android source tree
@@ -49,9 +49,9 @@ II. Compile perf for Android
 
 You need to run make with the NDK toolchain and sysroot defined above:
 For arm:
-  make ARCH=arm CROSS_COMPILE=${NDK_TOOLCHAIN} 
CFLAGS="--sysroot=${NDK_SYSROOT}"
+  make WERROR=0 ARCH=arm CROSS_COMPILE=${NDK_TOOLCHAIN} EXTRA_CFLAGS="-pie 
--sysroot=${NDK_SYSROOT}"
 For x86:
-  make ARCH=x86 CROSS_COMPILE=${NDK_TOOLCHAIN} 
CFLAGS="--sysroot=${NDK_SYSROOT}"
+  make WERROR=0 ARCH=x86 CROSS_COMPILE=${NDK_TOOLCHAIN} EXTRA_CFLAGS="-pie 
--sysroot=${NDK_SYSROOT}"
 
 III. Install perf
 ---
-- 
2.7.4



[tip:perf/core] perf script: Fix export of callchains with recursion in db-export

2016-05-12 Thread tip-bot for Chris Phlipot
Commit-ID:  83302e79b18f75266e4a44281e8432f61d57d441
Gitweb: http://git.kernel.org/tip/83302e79b18f75266e4a44281e8432f61d57d441
Author: Chris Phlipot 
AuthorDate: Tue, 10 May 2016 20:26:49 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 11 May 2016 12:24:58 -0300

perf script: Fix export of callchains with recursion in db-export

When an IP with an unresolved symbol occurs in the callchain more than
once (ie. recursion), then duplicate symbols can be created because
the callchain nodes are never updated after they are first created.

To fix this issue we call dso__find_symbol whenever we encounter a NULL
symbol, in case we already added a symbol at that IP since we started
traversing the callchain.

This change prevents duplicate symbols from being exported when duplicate
IPs are present in the callchain.

Signed-off-by: Chris Phlipot 
Cc: Adrian Hunter 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1462937209-6032-5-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/db-export.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 8ca4186..8d96c80 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -326,6 +326,10 @@ static struct call_path *call_path_from_sample(struct 
db_export *dbe,
al.machine = machine;
al.addr = node->ip;
 
+   if (al.map && !al.sym)
+   al.sym = dso__find_symbol(al.map->dso, MAP__FUNCTION,
+ al.addr);
+
db_ids_from_al(dbe, &al, &dso_db_id, &sym_db_id, &offset);
 
/* add node to the call path tree if it doesn't exist */


[tip:perf/core] perf script: Fix callchain addresses in db-export

2016-05-12 Thread tip-bot for Chris Phlipot
Commit-ID:  7a2544c004a6c576b1e307f30925b165affe6a22
Gitweb: http://git.kernel.org/tip/7a2544c004a6c576b1e307f30925b165affe6a22
Author: Chris Phlipot 
AuthorDate: Tue, 10 May 2016 20:26:48 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 11 May 2016 12:24:58 -0300

perf script: Fix callchain addresses in db-export

Remove the call to map_ip() to adjust al.addr, because it has already
been called when assembling the callchain, in:

  thread__resolve_callchain_sample(perf_sample)
  add_callchain_ip(ip = perf_sample->callchain->ips[j])
  thread__find_addr_location(addr = ip)
  thread__find_addr_map(addr) {
  al->addr = addr
  if (al->map)
  al->addr = al->map->map_ip(al->map, al->addr);
  }

Calling it a second time can result in incorrect addresses being used.
This can have effects such as duplicate symbols being created and
exported.

Signed-off-by: Chris Phlipot 
Cc: Adrian Hunter 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1462937209-6032-4-git-send-email-cphlip...@gmail.com
[ Show the callchain where it is done, to help reviewing this change down the 
line ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/db-export.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 2ef1f69..8ca4186 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -324,10 +324,7 @@ static struct call_path *call_path_from_sample(struct 
db_export *dbe,
al.sym = node->sym;
al.map = node->map;
al.machine = machine;
-   if (al.map)
-   al.addr = al.map->map_ip(al.map, node->ip);
-   else
-   al.addr = node->ip;
+   al.addr = node->ip;
 
db_ids_from_al(dbe, &al, &dso_db_id, &sym_db_id, &offset);
 


[tip:perf/core] perf script: Fix symbol insertion behavior in db-export

2016-05-12 Thread tip-bot for Chris Phlipot
Commit-ID:  bd0a51dd2794f1d17d4e7a34ad66db845cef3e5a
Gitweb: http://git.kernel.org/tip/bd0a51dd2794f1d17d4e7a34ad66db845cef3e5a
Author: Chris Phlipot 
AuthorDate: Tue, 10 May 2016 20:26:47 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 11 May 2016 12:24:57 -0300

perf script: Fix symbol insertion behavior in db-export

Use the dso__insert_symbol function instead of symbols__insert() in
order to properly update the dso symbol cache.

If the cache is not updated, then duplicate symbols can be
unintentionally created, inserted, and exported.

This change prevents duplicate symbols from being exported due to
dso__find_symbol() using a stale symbol cache.

Signed-off-by: Chris Phlipot 
Cc: Adrian Hunter 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1462937209-6032-3-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/db-export.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index f8e3057..2ef1f69 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -260,8 +260,7 @@ static int db_ids_from_al(struct db_export *dbe, struct 
addr_location *al,
if (!al->sym) {
al->sym = symbol__new(al->addr, 0, 0, "unknown");
if (al->sym)
-   symbols__insert(&dso->symbols[al->map->type],
-   al->sym);
+   dso__insert_symbol(dso, al->map->type, al->sym);
}
 
if (al->sym) {


[tip:perf/core] perf symbols: Add dso__insert_symbol function

2016-05-12 Thread tip-bot for Chris Phlipot
Commit-ID:  ae93a6c70838b87151ac12589dc507dbf4f2f067
Gitweb: http://git.kernel.org/tip/ae93a6c70838b87151ac12589dc507dbf4f2f067
Author: Chris Phlipot 
AuthorDate: Tue, 10 May 2016 20:26:46 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 11 May 2016 12:24:57 -0300

perf symbols: Add dso__insert_symbol function

The current method for inserting symbols is to use the symbols__insert()
function. However symbols__insert() does not update the dso symbol
cache.  This causes problems in the following scenario:

1. symbol not found at addr using dso__find_symbol

2. symbol inserted at addr using the existing symbols__insert function

3. symbol still not found at addr using dso__find_symbol() because cache isn't
   updated. This is undesired behavior.

The undesired behavior in (3) is addressed by creating a new function,
dso__insert_symbol() to both insert the symbol and update the symbol
cache if necessary.

If dso__insert_symbol() is used in (2) instead of symbols__insert(),
then the undesired behavior in (3) is avoided.

Signed-off-by: Chris Phlipot 
Cc: Adrian Hunter 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1462937209-6032-2-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/symbol.c | 12 
 tools/perf/util/symbol.h |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 2946295..21af8e8 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -413,6 +413,18 @@ void dso__reset_find_symbol_cache(struct dso *dso)
}
 }
 
+void dso__insert_symbol(struct dso *dso, enum map_type type, struct symbol 
*sym)
+{
+   symbols__insert(&dso->symbols[type], sym);
+
+   /* update the symbol cache if necessary */
+   if (dso->last_find_result[type].addr >= sym->start &&
+   (dso->last_find_result[type].addr < sym->end ||
+   sym->start == sym->end)) {
+   dso->last_find_result[type].symbol = sym;
+   }
+}
+
 struct symbol *dso__find_symbol(struct dso *dso,
enum map_type type, u64 addr)
 {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 07211c2..2b5e4ed 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -246,6 +246,9 @@ int __dso__load_kallsyms(struct dso *dso, const char 
*filename, struct map *map,
 int dso__load_kallsyms(struct dso *dso, const char *filename, struct map *map,
   symbol_filter_t filter);
 
+void dso__insert_symbol(struct dso *dso, enum map_type type,
+   struct symbol *sym);
+
 struct symbol *dso__find_symbol(struct dso *dso, enum map_type type,
u64 addr);
 struct symbol *dso__find_symbol_by_name(struct dso *dso, enum map_type type,


[PATCH v2 0/4] perf script: fix duplicate symbols in db-export

2016-05-10 Thread Chris Phlipot
Changes since v1:
 - fixed scripts/checkpatch.pl errors

This patch set contains 3 fixes for duplicate symbol creation in the
db-export implementation and one new symbol API required for the fixes.

commit 9c7b37cd63d0 ("perf symbols: Fix handling of zero-length symbols.")
already removed the majority of duplicates, but these fixes take care of
the remaining corner cases.

each patch (except for the 1st, which is a dependency for patch 2) reduces
the number of duplicate symbols exported. When all patches are applied,
my test workload has no more duplicate symbols being exported.

Tests ran:

$perf record --call-graph=dwarf stress -c 2 -t 20
$perf script -s scripts/python/export-to-postgresql.py test all callchains
$psql test

To show the effect of the changes we run the following query before/after
the changes on a database created using the export-to-postgresql.py script
with callchains enabled. If this query returns any value greater than 1,
then it means that there are duplicates present. 


In the test workload, at least one symbol occurs 299 times before applying
the fixes:

  test=# select count(*) as cnt from symbols group by 
sym_start,sym_end,dso_id order by cnt desc limit 1;
   cnt 
  -
   299
  (1 row)

After applying the fixes no symbol occurs more than once:

  test=# select count(*) as cnt from symbols group by
sym_start,sym_end,dso_id order by cnt desc limit 1;
   cnt 
  -
 1
  (1 row)

Chris Phlipot (4):
  perf symbols: add dso__insert_symbol function
  perf script: fix symbol insertion behavior in db-export
  perf script: fix callchain addresses in db-export
  perf script: fix export of callchains with recursion in db-export

 tools/perf/util/db-export.c | 12 ++--
 tools/perf/util/symbol.c| 12 
 tools/perf/util/symbol.h|  3 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

-- 
2.7.4



[PATCH v2 3/4] perf script: fix callchain addresses in db-export

2016-05-10 Thread Chris Phlipot
Remove the call to map_ip, because it has already been called when
assembling the callchain. Calling it a second time can result in incorrect
addresses being used. This can have effects such as duplicate symbols
being created and exported.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 2ef1f69..8ca4186 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -324,10 +324,7 @@ static struct call_path *call_path_from_sample(struct 
db_export *dbe,
al.sym = node->sym;
al.map = node->map;
al.machine = machine;
-   if (al.map)
-   al.addr = al.map->map_ip(al.map, node->ip);
-   else
-   al.addr = node->ip;
+   al.addr = node->ip;
 
db_ids_from_al(dbe, &al, &dso_db_id, &sym_db_id, &offset);
 
-- 
2.7.4



[PATCH v2 2/4] perf script: fix symbol insertion behavior in db-export

2016-05-10 Thread Chris Phlipot
Use the dso__insert_symbol function instead of symbols__insert in order to
properly update the dso symbol cache.

If the cache is not updated, then duplicate symbols can be unintentionally
created, inserted, and exported.

This change prevents duplicate symbols from being exported due to
dso__find_symbol using a stale symbol cache.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index f8e3057..2ef1f69 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -260,8 +260,7 @@ static int db_ids_from_al(struct db_export *dbe, struct 
addr_location *al,
if (!al->sym) {
al->sym = symbol__new(al->addr, 0, 0, "unknown");
if (al->sym)
-   symbols__insert(&dso->symbols[al->map->type],
-   al->sym);
+   dso__insert_symbol(dso, al->map->type, al->sym);
}
 
if (al->sym) {
-- 
2.7.4



[PATCH v2 4/4] perf script: fix export of callchains with recursion in db-export

2016-05-10 Thread Chris Phlipot
When an IP with an unresolved symbol occurs in the callchain more than
once (ie. recursion), then duplicate symbols can be created because
the callchain nodes are never updated after they are first created.

To fix this issue we call dso__find_symbol whenever we encounter a NULL
symbol, in case we already added a symbol at that IP since we started
traversing the callchain.

This change prevents duplicate symbols from being exported when duplicate
IPs are present in the callchain.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 8ca4186..8d96c80 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -326,6 +326,10 @@ static struct call_path *call_path_from_sample(struct 
db_export *dbe,
al.machine = machine;
al.addr = node->ip;
 
+   if (al.map && !al.sym)
+   al.sym = dso__find_symbol(al.map->dso, MAP__FUNCTION,
+ al.addr);
+
db_ids_from_al(dbe, &al, &dso_db_id, &sym_db_id, &offset);
 
/* add node to the call path tree if it doesn't exist */
-- 
2.7.4



[PATCH v2 1/4] perf symbols: add dso__insert_symbol function

2016-05-10 Thread Chris Phlipot
The current method for inserting symbols is to use the symbols__insert
function. However symbols__insert does not update the dso symbol cache.
This causes problems in the following scenario:

1. symbol not found at ip using dso__find_symbol

2. symbol inserted at ip using the existing symbols__insert function

3. symbol still not found at ip using dso__find_symbol because cache isn't
updated. This is undesired behavior.

The undesired behavior in (3) is addressed by creating a new function,
dso__insert_symbol to both insert the symbol and update the symbol cache
if necessary.

If dso__insert_symbol is used in (2) instead of symbols__insert, then the
undesired behavior in (3) is avoided.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/symbol.c | 12 
 tools/perf/util/symbol.h |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 2946295..21af8e8 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -413,6 +413,18 @@ void dso__reset_find_symbol_cache(struct dso *dso)
}
 }
 
+void dso__insert_symbol(struct dso *dso, enum map_type type, struct symbol 
*sym)
+{
+   symbols__insert(&dso->symbols[type], sym);
+
+   /* update the symbol cache if necessary */
+   if (dso->last_find_result[type].addr >= sym->start &&
+   (dso->last_find_result[type].addr < sym->end ||
+   sym->start == sym->end)) {
+   dso->last_find_result[type].symbol = sym;
+   }
+}
+
 struct symbol *dso__find_symbol(struct dso *dso,
enum map_type type, u64 addr)
 {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 07211c2..2b5e4ed 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -246,6 +246,9 @@ int __dso__load_kallsyms(struct dso *dso, const char 
*filename, struct map *map,
 int dso__load_kallsyms(struct dso *dso, const char *filename, struct map *map,
   symbol_filter_t filter);
 
+void dso__insert_symbol(struct dso *dso, enum map_type type,
+   struct symbol *sym);
+
 struct symbol *dso__find_symbol(struct dso *dso, enum map_type type,
u64 addr);
 struct symbol *dso__find_symbol_by_name(struct dso *dso, enum map_type type,
-- 
2.7.4



[PATCH 2/4] perf script: fix symbol insertion behavior in db-export

2016-05-10 Thread Chris Phlipot
Use the dso__insert_symbol function instead of symbols__insert in order to
properly update the dso symbol cache.

If the cache is not updated, then duplicate symbols can be unintentionally
created, inserted, and exported.

This change prevents duplicate symbols from being exported due to
dso__find_symbol using a stale symbol cache.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index f8e3057..2ef1f69 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -260,8 +260,7 @@ static int db_ids_from_al(struct db_export *dbe, struct 
addr_location *al,
if (!al->sym) {
al->sym = symbol__new(al->addr, 0, 0, "unknown");
if (al->sym)
-   symbols__insert(&dso->symbols[al->map->type],
-   al->sym);
+   dso__insert_symbol(dso, al->map->type, al->sym);
}
 
if (al->sym) {
-- 
2.7.4



[PATCH 3/4] perf script: fix callchain addresses in db-export

2016-05-10 Thread Chris Phlipot
Remove the call to map_ip, because it has already been called when
assembling the callchain. Calling it a second time can result in incorrect
addresses being used. This can have effects such as duplicate symbols
being created and exported.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 2ef1f69..8ca4186 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -324,10 +324,7 @@ static struct call_path *call_path_from_sample(struct 
db_export *dbe,
al.sym = node->sym;
al.map = node->map;
al.machine = machine;
-   if (al.map)
-   al.addr = al.map->map_ip(al.map, node->ip);
-   else
-   al.addr = node->ip;
+   al.addr = node->ip;
 
db_ids_from_al(dbe, &al, &dso_db_id, &sym_db_id, &offset);
 
-- 
2.7.4



[PATCH 4/4] perf script: fix export of callchains with recursion in db-export

2016-05-10 Thread Chris Phlipot
When an IP with an unresolved symbol occurs in the callchain more than
once (ie. recursion), then duplicate symbols can be created because
the callchain nodes are never updated after they are first created.

To fix this issue we call dso__find_symbol whenever we encounter a NULL
symbol, in case we already added a symbol at that IP since we started
traversing the callchain.

This change prevents duplicate symbols from being exported when duplicate
IPs are present in the callchain.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 8ca4186..424c77c 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -326,6 +326,10 @@ static struct call_path *call_path_from_sample(struct 
db_export *dbe,
al.machine = machine;
al.addr = node->ip;
 
+   if(al.map && !al.sym)
+   al.sym = dso__find_symbol(al.map->dso, MAP__FUNCTION,
+ al.addr);
+
db_ids_from_al(dbe, &al, &dso_db_id, &sym_db_id, &offset);
 
/* add node to the call path tree if it doesn't exist */
-- 
2.7.4



[PATCH 0/4] perf script: fix duplicate symbols in db-export

2016-05-10 Thread Chris Phlipot
This patch set contains 3 fixes for duplicate symbol creation in the
db-export implementation and one new symbol API required for the fixes.

commit 9c7b37cd63d0 already removed the majority of duplicates, but these
fixes take care of the remaining corner cases.

each patch (except for the 1st, which is a dependency for patch 2) reduces
the number of duplicate symbols exported. When all patches are applied,
my test workload has no more duplicate symbols being exported.

Tests ran:

$perf record --call-graph=dwarf stress -c 2 -t 20
$perf script -s scripts/python/export-to-postgresql.py test all callchains
$psql test

To show the effect of the changes we run the following query before/after
the changes on a database created using the export-to-postgresql.py script
with callchains enabled. If this query returns any value greater than 1,
then it means that there are duplicates present. 


In the test workload, at least one symbol occurs 299 times before applying
the fixes:

  test=# select count(*) as cnt from symbols group by 
sym_start,sym_end,dso_id order by cnt desc limit 1;
   cnt 
  -
   299
  (1 row)

After applying the fixes no symbol occurs more than once:

  test=# select count(*) as cnt from symbols group by
sym_start,sym_end,dso_id order by cnt desc limit 1;
   cnt 
  -
 1
  (1 row)



Chris Phlipot (4):
  perf symbols: add dso__insert_symbol function
  perf script: fix symbol insertion behavior in db-export
  perf script: fix callchain addresses in db-export
  perf script: fix export of callchains with recursion in db-export

 tools/perf/util/db-export.c | 12 ++--
 tools/perf/util/symbol.c| 12 
 tools/perf/util/symbol.h|  3 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

-- 
2.7.4



[PATCH 1/4] perf symbols: add dso__insert_symbol function

2016-05-10 Thread Chris Phlipot
The current method for inserting symbols is to use the symbols__insert
function. However symbols__insert does not update the dso symbol cache.
This causes problems in the following scenario:

1. symbol not found at ip using dso__find_symbol

2. symbol inserted at ip using the existing symbols__insert function

3. symbol still not found at ip using dso__find_symbol because cache isn't
updated. This is undesired behavior.

The undesired behavior in (3) is addressed by creating a new function,
dso__insert_symbol to both insert the symbol and update the symbol cache
if necessary.

If dso__insert_symbol is used in (2) instead of symbols__insert, then the
undesired behavior in (3) is avoided.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/symbol.c | 12 
 tools/perf/util/symbol.h |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 2946295..21af8e8 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -413,6 +413,18 @@ void dso__reset_find_symbol_cache(struct dso *dso)
}
 }
 
+void dso__insert_symbol(struct dso *dso, enum map_type type, struct symbol 
*sym)
+{
+   symbols__insert(&dso->symbols[type], sym);
+
+   /* update the symbol cache if necessary */
+   if (dso->last_find_result[type].addr >= sym->start &&
+   (dso->last_find_result[type].addr < sym->end ||
+   sym->start == sym->end)) {
+   dso->last_find_result[type].symbol = sym;
+   }
+}
+
 struct symbol *dso__find_symbol(struct dso *dso,
enum map_type type, u64 addr)
 {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 07211c2..2b5e4ed 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -246,6 +246,9 @@ int __dso__load_kallsyms(struct dso *dso, const char 
*filename, struct map *map,
 int dso__load_kallsyms(struct dso *dso, const char *filename, struct map *map,
   symbol_filter_t filter);
 
+void dso__insert_symbol(struct dso *dso, enum map_type type,
+   struct symbol *sym);
+
 struct symbol *dso__find_symbol(struct dso *dso, enum map_type type,
u64 addr);
 struct symbol *dso__find_symbol_by_name(struct dso *dso, enum map_type type,
-- 
2.7.4



[tip:perf/core] perf symbols: Fix handling of zero-length symbols.

2016-05-10 Thread tip-bot for Chris Phlipot
Commit-ID:  9c7b37cd63d0d910c531233209286f169993cbd9
Gitweb: http://git.kernel.org/tip/9c7b37cd63d0d910c531233209286f169993cbd9
Author: Chris Phlipot 
AuthorDate: Sat, 7 May 2016 02:16:59 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 9 May 2016 18:40:03 -0300

perf symbols: Fix handling of zero-length symbols.

This change introduces a fix to symbols__find, so that it is able to
find symbols of length zero (where start == end).

The current code has the following problem:

- The current implementation of symbols__find is unable to find any symbols
  of length zero.

- The db-export framework explicitly creates zero length symbols at
  locations where no symbol currently exists.

The combination of the two above behaviors results in behavior similar
to the example below.

1. addr_location is created for a sample, but symbol is unable to be
   resolved.

2. db export creates an "unknown" symbol of length zero at that address
   and inserts it into the dso.

3. A new sample comes in at the same address, but symbol__find is unable
   to find the zero length symbol, so it is still unresolved.

4. db export sees the symbol is unresolved, and allocated a duplicate
   symbol, even though it already did this in step 2.

This behavior continues every time an address without symbol information
is seen, which causes a very large number of these symbols to be
allocated.

The effect of this fix can be observed by looking at the contents of an
exported database before/after the fix (generated with
scripts/python/export-to-postgresql.py)

Ex.
BEFORE THE CHANGE:

  example_db=# select count(*) from symbols;
   count
  
   900213
  (1 row)

  example_db=# select count(*) from symbols where symbols.name='unknown';
   count
  
   897355
  (1 row)

  example_db=# select count(*) from symbols where symbols.name!='unknown';
   count
  ---
2858
  (1 row)

AFTER THE CHANGE:

  example_db=# select count(*) from symbols;
   count
  ---
   25217
  (1 row)

  example_db=# select count(*) from symbols where name='unknown';
   count
  ---
   22359
  (1 row)

  example_db=# select count(*) from symbols where name!='unknown';
   count
  ---
2858
  (1 row)

Signed-off-by: Chris Phlipot 
Cc: Adrian Hunter 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1462612620-25008-1-git-send-email-cphlip...@gmail.com
[ Moved the test to later in the rb_tree tests, as this not the likely case ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/symbol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 415c4f6..2946295 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -301,7 +301,7 @@ static struct symbol *symbols__find(struct rb_root 
*symbols, u64 ip)
 
if (ip < s->start)
n = n->rb_left;
-   else if (ip >= s->end)
+   else if (ip > s->end || (ip == s->end && ip != s->start))
n = n->rb_right;
else
return s;


[tip:perf/core] perf script: Fix incorrect python db-export error message

2016-05-10 Thread tip-bot for Chris Phlipot
Commit-ID:  aff633406ca2772554bad7b37f2dfbc409b6ea74
Gitweb: http://git.kernel.org/tip/aff633406ca2772554bad7b37f2dfbc409b6ea74
Author: Chris Phlipot 
AuthorDate: Sat, 7 May 2016 02:17:00 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 9 May 2016 14:08:39 -0300

perf script: Fix incorrect python db-export error message

Fix the error message printed when attempting and failing to create the
call path root incorrectly references the call return process.

This change fixes the message to properly reference the failure to
create the call path root.

Signed-off-by: Chris Phlipot 
Cc: Adrian Hunter 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1462612620-25008-2-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/scripting-engines/trace-event-python.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index 091bce6..1546b74 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1062,7 +1062,7 @@ static void set_table_handlers(struct tables *tables)
tables->dbe.cpr = call_path_root__new();
 
if (!tables->dbe.cpr)
-   Py_FatalError("failed to create calls processor");
+   Py_FatalError("failed to create call path root");
}
 
tables->db_export_mode = true;


Re: [PATCH 1/2] perf tools: fix handling of zero-length symbols.

2016-05-09 Thread Chris Phlipot

> On May 9, 2016, at 10:06 AM, Arnaldo Carvalho de Melo 
>  wrote:
> 
> Em Sat, May 07, 2016 at 02:16:59AM -0700, Chris Phlipot escreveu:
>> This change introduces a fix to symbols__find, so that it is able to find
>> symbols of length zero (where start==end)
>> 
>> The current code has the following problem:
>> -The current implementation of symbols__find is unable to find any symbols
>> of length zero.
>> -The db-export framework explicitly creates zero length symbols at
>> locations where no symbol currently exists.
>> 
>> The combination of the two above behaviors results in behavior similar to
>> the example below.
> 
> Ok, but you made the unlikely case be the first test, how about this one
> liner instead?
> 
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 415c4f6d98fd..7a0917569fb3 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -301,7 +301,7 @@ static struct symbol *symbols__find(struct rb_root 
> *symbols, u64 ip)
> 
>if (ip < s->start)
>n = n->rb_left;
> -else if (ip >= s->end)
> +else if (ip > s->end || (ip == s->end && ip != s->start)
>n = n->rb_right;
>else
>return s;
> 
> 

I agree. This looks like a better fix.

>> +++ b/tools/perf/util/symbol.c
>> @@ -299,7 +299,9 @@ static struct symbol *symbols__find(struct rb_root 
>> *symbols, u64 ip)
>>while (n) {
>>struct symbol *s = rb_entry(n, struct symbol, rb_node);
>> 
>> -if (ip < s->start)
>> +if (ip == s->start && s->start == s->end)
>> +return s;
>> +else if (ip < s->start)
>>n = n->rb_left;
>>else if (ip >= s->end)
>>n = n->rb_right;
>> -- 
>> 2.7.4


[PATCH 2/2] perf script: fix incorrect python db-export error message

2016-05-07 Thread Chris Phlipot
fix the error message printed when attempting and failing to create the
call path root incorrectly references the call return process.

this change fixes the message to properly reference the failure to create
the call path root.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/scripting-engines/trace-event-python.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index 091bce6..1546b74 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1062,7 +1062,7 @@ static void set_table_handlers(struct tables *tables)
tables->dbe.cpr = call_path_root__new();
 
if (!tables->dbe.cpr)
-   Py_FatalError("failed to create calls processor");
+   Py_FatalError("failed to create call path root");
}
 
tables->db_export_mode = true;
-- 
2.7.4



[PATCH 1/2] perf tools: fix handling of zero-length symbols.

2016-05-07 Thread Chris Phlipot
This change introduces a fix to symbols__find, so that it is able to find
symbols of length zero (where start==end)

The current code has the following problem:
-The current implementation of symbols__find is unable to find any symbols
 of length zero.
-The db-export framework explicitly creates zero length symbols at
 locations where no symbol currently exists.

The combination of the two above behaviors results in behavior similar to
the example below.

1. addr_location is created for a sample, but symbol is unable to be
   resolved.
2. db export creates an "unknown" symbol of length zero at that address
   and inserts it into the dso.
3. A new sample comes in at the same address, but symbol__find is unable
   to find the zero length symbol, so it is still unresolved.
4. db export sees the symbol is unresolved, and allocated a duplicate
   symbol, even though it already did this in step 2.

This behavior continues every time an address without symbol information
is seen, which causes a very large number of these symbols to be
allocated.

The effect of this fix can be observed by looking at the contents of an
exported database before/after the fix
(generated with scripts/python/export-to-postgresql.py)

Ex.
BEFORE THE CHANGE:
  example_db=# select count(*) from symbols;
   count
  
   900213
  (1 row)

  example_db=# select count(*) from symbols where symbols.name='unknown';
   count
  
   897355
  (1 row)

  example_db=# select count(*) from symbols where symbols.name!='unknown';
   count
  ---
2858
  (1 row)

AFTER THE CHANGE:

  example_db=# select count(*) from symbols;
   count
  ---
   25217
  (1 row)

  example_db=# select count(*) from symbols where name='unknown';
   count
  ---
   22359
  (1 row)

  example_db=# select count(*) from symbols where name!='unknown';
   count
  ---
2858
  (1 row)

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/symbol.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 415c4f6..e42bf9a 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -299,7 +299,9 @@ static struct symbol *symbols__find(struct rb_root 
*symbols, u64 ip)
while (n) {
struct symbol *s = rb_entry(n, struct symbol, rb_node);
 
-   if (ip < s->start)
+   if (ip == s->start && s->start == s->end)
+   return s;
+   else if (ip < s->start)
n = n->rb_left;
else if (ip >= s->end)
n = n->rb_right;
-- 
2.7.4



[tip:perf/core] perf script: Update export-to-postgresql to support callchain export

2016-05-06 Thread tip-bot for Chris Phlipot
Commit-ID:  3521f3bc9dae4a79cfb9cc9ffcf6d961bbb7cbac
Gitweb: http://git.kernel.org/tip/3521f3bc9dae4a79cfb9cc9ffcf6d961bbb7cbac
Author: Chris Phlipot 
AuthorDate: Thu, 28 Apr 2016 01:19:11 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 6 May 2016 13:00:55 -0300

perf script: Update export-to-postgresql to support callchain export

Update the export-to-postgresql.py to support the newly introduced
callchain export.

callchains are added into the existing call_paths table and can now
be associated with samples when the "callpaths" commandline option
is used with the script.

Ex.:

  $ perf script -s export-to-postgresql.py example_db all callchains

Includes the following changes to enable callchain export via the python export
APIs:

- Add the "callchains" commandline option, which is used to enable
  callchain export by setting the perf_db_export_callchains global
- Add perf_db_export_callchains checks for call_path table creation
  and population.
- Add call_path_id to samples_table to conform with the new API

example usage and output using a small test app:

  test_app.c:

volatile int x = 0;
void inc_x_loop()
{
int i;
for(i=0; i<1; i++)
x++;
}

void a()
{
inc_x_loop();
}

void b()
{
inc_x_loop();
}

int main()
{
a();
b();
return 0;
}

example usage:

  $ gcc -g -O0 test_app.c
  $ perf record --call-graph=dwarf ./a.out
  [ perf record: Woken up 77 times to write data ]
  [ perf record: Captured and wrote 19.373 MB perf.data (2404 samples) ]

  $ perf script -s scripts/python/export-to-postgresql.py
example_db all callchains

  $ psql example_db

  example_db=#
  SELECT
  (SELECT name FROM symbols WHERE id = cps.symbol_id) as symbol,
  (SELECT name FROM symbols WHERE id =
(SELECT symbol_id from call_paths where id = cps.parent_id))
as parent_symbol,
  sum(period) as event_count
  FROM samples join call_paths as cps on call_path_id = cps.id
  GROUP BY cps.id,evsel_id
  ORDER BY event_count DESC
  LIMIT 5;

symbol  |  parent_symbol   | event_count
  --+--+-
   inc_x_loop   | a|   734250982
   inc_x_loop   | b|   731028057
   unknown  | unknown  | 1335858
   task_tick_fair   | scheduler_tick   | 1238842
   update_wall_time | tick_do_update_jiffies64 |  650373
  (5 rows)

The above data shows total "self time" in cycles for each call path that was
sampled. It is intended to demonstrate how it accounts separately for the two
ways to reach the "inc_x_loop" function(via "a" and "b").  Recursive common
table expressions can be used as well to get cumulative time spent in a
function as well, but that is beyond the scope of this basic example.

Signed-off-by: Chris Phlipot 
Acked-by: Adrian Hunter 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1461831551-12213-7-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/scripts/python/export-to-postgresql.py | 47 +++
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/tools/perf/scripts/python/export-to-postgresql.py 
b/tools/perf/scripts/python/export-to-postgresql.py
index 6f0ca68..7656ff8 100644
--- a/tools/perf/scripts/python/export-to-postgresql.py
+++ b/tools/perf/scripts/python/export-to-postgresql.py
@@ -223,11 +223,14 @@ sys.path.append(os.environ['PERF_EXEC_PATH'] + \
 
 perf_db_export_mode = True
 perf_db_export_calls = False
+perf_db_export_callchains = False
+
 
 def usage():
-   print >> sys.stderr, "Usage is: export-to-postgresql.py  
[] []"
+   print >> sys.stderr, "Usage is: export-to-postgresql.py  
[] [] []"
print >> sys.stderr, "where:columns 'all' or 'branches'"
-   print >> sys.stderr, "  calls   'calls' => create calls 
table"
+   print >> sys.stderr, "  calls   'calls' => create calls 
and call_paths table"
+   print >> sys.stderr, "  callchains  'callchains' => create 
call_paths table"
raise Exception("Too few arguments")
 
 if (len(sys.argv) < 2):
@@ -245,9 +248,11 @@ if columns not in ("all", "branches"):
 
 branches = (columns == "branches")
 
-if (len(sys.argv) >= 4):
-   if (sys.argv[3] == "calls"):
+for i in range(3,len(sys.argv)):
+   if (sys.argv[i] == "calls"):
perf_db_export_c

[tip:perf/core] perf script: Expose usage of the callchain db export via the python api

2016-05-06 Thread tip-bot for Chris Phlipot
Commit-ID:  2c15f5eb04e9e7e19a2c8be6b50c63a4c6062a44
Gitweb: http://git.kernel.org/tip/2c15f5eb04e9e7e19a2c8be6b50c63a4c6062a44
Author: Chris Phlipot 
AuthorDate: Thu, 28 Apr 2016 01:19:10 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 6 May 2016 13:00:54 -0300

perf script: Expose usage of the callchain db export via the python api

This change allows python scripts to be able to utilize the recent
changes to the db export api allowing the export of call_paths derived
from sampled callchains. These call paths are also now associated with
the samples from which they were derived.

- This feature is enabled by setting "perf_db_export_callchains" to true

- When enabled, samples that have callchain information will have the
  callchains exported via call_path_table

- The call_path_id field is added to sample_table to enable association of
  samples with the corresponding callchain stored in the call paths
  table. A call_path_id of 0 will be exported if there is no
  corresponding callchain.

- When "perf_db_export_callchains" and "perf_db_export_calls" are both
  set to True, the call path root data structure will be shared. This
  prevents duplicating of data and call path ids that would result from
  building two separate call path trees in memory.

- The call_return_processor structure definition was relocated to the header
  file to make its contents visible to db-export.c. This enables the
  sharing of call path trees between the two features, as mentioned
  above.

This change is visible to python scripts using the python db export api.

The change is backwards compatible with scripts written against the
previous API, assuming that the scripts model the sample_table function
after the one in export-to-postgresql.py script by allowing for
additional arguments to be added in the future. ie. using *x as the
final argument of the sample_table function.

Signed-off-by: Chris Phlipot 
Acked-by: Adrian Hunter 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1461831551-12213-6-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 .../util/scripting-engines/trace-event-python.c| 35 --
 tools/perf/util/thread-stack.c | 13 
 tools/perf/util/thread-stack.h | 14 -
 3 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index 7bb8592..091bce6 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -682,7 +682,7 @@ static int python_export_sample(struct db_export *dbe,
struct tables *tables = container_of(dbe, struct tables, dbe);
PyObject *t;
 
-   t = tuple_new(21);
+   t = tuple_new(22);
 
tuple_set_u64(t, 0, es->db_id);
tuple_set_u64(t, 1, es->evsel->db_id);
@@ -705,6 +705,7 @@ static int python_export_sample(struct db_export *dbe,
tuple_set_u64(t, 18, es->sample->data_src);
tuple_set_s32(t, 19, es->sample->flags & PERF_BRANCH_MASK);
tuple_set_s32(t, 20, !!(es->sample->flags & PERF_IP_FLAG_IN_TX));
+   tuple_set_u64(t, 21, es->call_path_id);
 
call_object(tables->sample_handler, t, "sample_table");
 
@@ -999,8 +1000,10 @@ static void set_table_handlers(struct tables *tables)
 {
const char *perf_db_export_mode = "perf_db_export_mode";
const char *perf_db_export_calls = "perf_db_export_calls";
-   PyObject *db_export_mode, *db_export_calls;
+   const char *perf_db_export_callchains = "perf_db_export_callchains";
+   PyObject *db_export_mode, *db_export_calls, *db_export_callchains;
bool export_calls = false;
+   bool export_callchains = false;
int ret;
 
memset(tables, 0, sizeof(struct tables));
@@ -1017,6 +1020,7 @@ static void set_table_handlers(struct tables *tables)
if (!ret)
return;
 
+   /* handle export calls */
tables->dbe.crp = NULL;
db_export_calls = PyDict_GetItemString(main_dict, perf_db_export_calls);
if (db_export_calls) {
@@ -1034,6 +1038,33 @@ static void set_table_handlers(struct tables *tables)
Py_FatalError("failed to create calls processor");
}
 
+   /* handle export callchains */
+   tables->dbe.cpr = NULL;
+   db_export_callchains = PyDict_GetItemString(main_dict,
+   perf_db_export_callchains);
+   if (db_export_callchains) {
+   ret = PyObject_IsTrue(db_export_callchains);
+   if (ret == -1)
+   handler_call_die(perf_db_export_callchains);
+   export_callchains = !!ret;
+   }

[tip:perf/core] perf script: Add call path id to exported sample in db export

2016-05-06 Thread tip-bot for Chris Phlipot
Commit-ID:  568850eaad8cdd3783c3347623dfcad4f043cf1c
Gitweb: http://git.kernel.org/tip/568850eaad8cdd3783c3347623dfcad4f043cf1c
Author: Chris Phlipot 
AuthorDate: Thu, 28 Apr 2016 01:19:09 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 6 May 2016 13:00:53 -0300

perf script: Add call path id to exported sample in db export

The exported sample now contains a reference to the call_path_id that
represents its callchain.

While callchains themselves are nice to have, being able to associate
them with samples makes them much more useful, and can allow for such
things as determining how much cumulative time is spent in a particular
function. This information is normally possible to get from the call
return processor. However, when doing normal sampling, call/return
information is not available, thus necessitating the need for
associating samples directly with call paths.

This commit include changes to db-export layer to make this information
available for subsequent patches in this change set, but by itself, does
not make any changes visible to the user.

Signed-off-by: Chris Phlipot 
Acked-by: Adrian Hunter 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1461831551-12213-5-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/db-export.c | 4 +++-
 tools/perf/util/db-export.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index a0ca90c..f8e3057 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -408,8 +408,10 @@ int db_export__sample(struct db_export *dbe, union 
perf_event *event,
struct call_path *cp = call_path_from_sample(dbe, al->machine,
 thread, sample,
 evsel);
-   if (cp)
+   if (cp) {
db_export__call_path(dbe, cp);
+   es.call_path_id = cp->db_id;
+   }
}
 
if ((evsel->attr.sample_type & PERF_SAMPLE_ADDR) &&
diff --git a/tools/perf/util/db-export.h b/tools/perf/util/db-export.h
index f5daf55..67bc6b8 100644
--- a/tools/perf/util/db-export.h
+++ b/tools/perf/util/db-export.h
@@ -44,6 +44,7 @@ struct export_sample {
u64 addr_dso_db_id;
u64 addr_sym_db_id;
u64 addr_offset; /* addr offset from symbol start */
+   u64 call_path_id;
 };
 
 struct db_export {


[tip:perf/core] perf script: Enable db export to output sampled callchains

2016-05-06 Thread tip-bot for Chris Phlipot
Commit-ID:  0a3eba3ad613fa9d5af754f7ae8c4b46047cb2a7
Gitweb: http://git.kernel.org/tip/0a3eba3ad613fa9d5af754f7ae8c4b46047cb2a7
Author: Chris Phlipot 
AuthorDate: Thu, 28 Apr 2016 01:19:08 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 6 May 2016 13:00:52 -0300

perf script: Enable db export to output sampled callchains

This change enables the db export api to export callchains. This is
accomplished by adding callchains obtained from samples to the
call_path_root structure and exporting them via the current call path
export API.

While the current API does support exporting call paths, this is not
supported when sampling. This commit addresses that missing feature by
allowing the export of call paths when callchains are present in
samples.

Summary:

- This feature is activated by initializing the call_path_root member
  inside the db_export structure to a non-null value.

- Callchains are resolved with thread__resolve_callchain() and then stored
  and exported by adding a call path under call path root.
- Symbol and DSO for each callchain node are exported via db_ids_from_al()

This commit puts in place infrastructure to be used by subsequent commits,
and by itself, does not introduce any user-visible changes.

Signed-off-by: Chris Phlipot 
Acked-by: Adrian Hunter 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1461831551-12213-4-git-send-email-cphlip...@gmail.com
[ Made adjustments suggested by Adrian Hunter, see thread via this cset's Link: 
tag ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/db-export.c | 82 +
 tools/perf/util/db-export.h |  2 ++
 2 files changed, 84 insertions(+)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 4fc607c..a0ca90c 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -23,6 +23,7 @@
 #include "event.h"
 #include "util.h"
 #include "thread-stack.h"
+#include "callchain.h"
 #include "call-path.h"
 #include "db-export.h"
 
@@ -277,6 +278,79 @@ static int db_ids_from_al(struct db_export *dbe, struct 
addr_location *al,
return 0;
 }
 
+static struct call_path *call_path_from_sample(struct db_export *dbe,
+  struct machine *machine,
+  struct thread *thread,
+  struct perf_sample *sample,
+  struct perf_evsel *evsel)
+{
+   u64 kernel_start = machine__kernel_start(machine);
+   struct call_path *current = &dbe->cpr->call_path;
+   enum chain_order saved_order = callchain_param.order;
+   int err;
+
+   if (!symbol_conf.use_callchain || !sample->callchain)
+   return NULL;
+
+   /*
+* Since the call path tree must be built starting with the root, we
+* must use ORDER_CALL for call chain resolution, in order to process
+* the callchain starting with the root node and ending with the leaf.
+*/
+   callchain_param.order = ORDER_CALLER;
+   err = thread__resolve_callchain(thread, &callchain_cursor, evsel,
+   sample, NULL, NULL,
+   sysctl_perf_event_max_stack);
+   if (err) {
+   callchain_param.order = saved_order;
+   return NULL;
+   }
+   callchain_cursor_commit(&callchain_cursor);
+
+   while (1) {
+   struct callchain_cursor_node *node;
+   struct addr_location al;
+   u64 dso_db_id = 0, sym_db_id = 0, offset = 0;
+
+   memset(&al, 0, sizeof(al));
+
+   node = callchain_cursor_current(&callchain_cursor);
+   if (!node)
+   break;
+   /*
+* Handle export of symbol and dso for this node by
+* constructing an addr_location struct and then passing it to
+* db_ids_from_al() to perform the export.
+*/
+   al.sym = node->sym;
+   al.map = node->map;
+   al.machine = machine;
+   if (al.map)
+   al.addr = al.map->map_ip(al.map, node->ip);
+   else
+   al.addr = node->ip;
+
+   db_ids_from_al(dbe, &al, &dso_db_id, &sym_db_id, &offset);
+
+   /* add node to the call path tree if it doesn't exist */
+   current = call_path__findnew(dbe->cpr, current,
+al.sym, node->ip,
+kernel_start);
+
+   callchain_cursor_advance(&callchain_cursor);
+   }
+
+   /* Reset the callchain order to its prior value. */
+   c

[tip:perf/core] perf tools: Refactor code to move call path handling out of thread-stack

2016-05-06 Thread tip-bot for Chris Phlipot
Commit-ID:  451db12617bc6ff1bb8ed456ed4f257594134255
Gitweb: http://git.kernel.org/tip/451db12617bc6ff1bb8ed456ed4f257594134255
Author: Chris Phlipot 
AuthorDate: Thu, 28 Apr 2016 01:19:07 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 6 May 2016 13:00:43 -0300

perf tools: Refactor code to move call path handling out of thread-stack

Move the call path handling code out of thread-stack.c and
thread-stack.h to allow other components that are not part of
thread-stack to create call paths.

Summary:

- Create call-path.c and call-path.h and add them to the build.

- Move all call path related code out of thread-stack.c and thread-stack.h
  and into call-path.c and call-path.h.

- A small subset of structures and functions are now visible through
  call-path.h, which is required for thread-stack.c to continue to
  compile.

This change is a prerequisite for subsequent patches in this change set
and by itself contains no user-visible changes.

Signed-off-by: Chris Phlipot 
Acked-by: Adrian Hunter 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1461831551-12213-3-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/Build  |   1 +
 tools/perf/util/call-path.c| 122 
 tools/perf/util/call-path.h|  77 +
 tools/perf/util/db-export.c|   1 +
 .../util/scripting-engines/trace-event-python.c|   1 +
 tools/perf/util/thread-stack.c | 126 +
 tools/perf/util/thread-stack.h |  25 +---
 7 files changed, 204 insertions(+), 149 deletions(-)

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 90229a8..027bb2b 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -74,6 +74,7 @@ libperf-y += srcline.o
 libperf-y += data.o
 libperf-y += tsc.o
 libperf-y += cloexec.o
+libperf-y += call-path.o
 libperf-y += thread-stack.o
 libperf-$(CONFIG_AUXTRACE) += auxtrace.o
 libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
diff --git a/tools/perf/util/call-path.c b/tools/perf/util/call-path.c
new file mode 100644
index 000..904a170
--- /dev/null
+++ b/tools/perf/util/call-path.c
@@ -0,0 +1,122 @@
+/*
+ * call-path.h: Manipulate a tree data structure containing function call paths
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#include 
+#include 
+
+#include "util.h"
+#include "call-path.h"
+
+static void call_path__init(struct call_path *cp, struct call_path *parent,
+   struct symbol *sym, u64 ip, bool in_kernel)
+{
+   cp->parent = parent;
+   cp->sym = sym;
+   cp->ip = sym ? 0 : ip;
+   cp->db_id = 0;
+   cp->in_kernel = in_kernel;
+   RB_CLEAR_NODE(&cp->rb_node);
+   cp->children = RB_ROOT;
+}
+
+struct call_path_root *call_path_root__new(void)
+{
+   struct call_path_root *cpr;
+
+   cpr = zalloc(sizeof(struct call_path_root));
+   if (!cpr)
+   return NULL;
+   call_path__init(&cpr->call_path, NULL, NULL, 0, false);
+   INIT_LIST_HEAD(&cpr->blocks);
+   return cpr;
+}
+
+void call_path_root__free(struct call_path_root *cpr)
+{
+   struct call_path_block *pos, *n;
+
+   list_for_each_entry_safe(pos, n, &cpr->blocks, node) {
+   list_del(&pos->node);
+   free(pos);
+   }
+   free(cpr);
+}
+
+static struct call_path *call_path__new(struct call_path_root *cpr,
+   struct call_path *parent,
+   struct symbol *sym, u64 ip,
+   bool in_kernel)
+{
+   struct call_path_block *cpb;
+   struct call_path *cp;
+   size_t n;
+
+   if (cpr->next < cpr->sz) {
+   cpb = list_last_entry(&cpr->blocks, struct call_path_block,
+ node);
+   } else {
+   cpb = zalloc(sizeof(struct call_path_block));
+   if (!cpb)
+   return NULL;
+   list_add_tail(&cpb->node, &cpr->blocks);
+   cpr->sz += CALL_PATH_BLOCK_SIZE;
+   }
+
+   n = cpr->next++ & CALL_PATH_BLOCK_MASK;
+   cp = &cpb->cp[n];
+
+   call_path__init(cp, parent, sym, ip, in_kernel);
+
+   return cp;
+}
+
+struct

[tip:perf/core] perf callchain: Fix incorrect ordering of entries

2016-05-06 Thread tip-bot for Chris Phlipot
Commit-ID:  9919a65ec532799544dfdfd6df6f994b74c12b42
Gitweb: http://git.kernel.org/tip/9919a65ec532799544dfdfd6df6f994b74c12b42
Author: Chris Phlipot 
AuthorDate: Thu, 28 Apr 2016 01:19:06 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 6 May 2016 08:59:47 -0300

perf callchain: Fix incorrect ordering of entries

The existing implementation of thread__resolve_callchain, under certain
circumstances, can assemble callchain entries in the incorrect order.

The callchain entries are resolved incorrectly for a sample when all of
the following conditions are met:

1. callchain_param.order is set to ORDER_CALLER

2. thread__resolve_callchain_sample is able to resolve callchain entries
   for the sample.

3. unwind__get_entries is also able to resolve callchain entries for the
   sample.

The fix is accomplished by reversing the order in which
thread__resolve_callchain_sample and unwind__get_entries are called when
callchain_param.order is set to ORDER_CALLER.

Unwind specific code from thread__resolve_callchain is also moved into a
new static function to improve readability of the fix.

How to Reproduce the Existing Bug:

Modifying perf script to print call trees in the opposite order or
applying the remaining patches from this series and comparing the
results output from export-to-postgtresql.py are the easiest ways to see
the bug, however it can still be seen in current builds using perf
report.

Here is how i can reproduce the bug using perf report:

  # perf record --call-graph=dwarf stress -c 1 -t 5

when i run this command:

  # perf report --call-graph=flat,0,0,callee

This callchain, containing kernel (handle_irq_event, etc) and userspace
samples (__libc_start_main, etc) is contained in the output, which looks
correct (callee order):

gen8_irq_handler
handle_irq_event_percpu
handle_irq_event
handle_edge_irq
handle_irq
do_IRQ
ret_from_intr
__random
rand
0x558f2a04dded
0x558f2a04c774
__libc_start_main
0x558f2a04dcd9

Now run this command using caller order:

  # perf report --call-graph=flat,0,0,caller

It is expected to see the exact reverse of the above when using caller
order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the
bottom) in the output, but it is nowhere to be found.

instead you see this:

ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler
0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random

Notice how internally the kernel symbols are reversed and the user space
symbols are reversed, but the kernel symbols still appear above the user
space symbols.

if this patch is applied and perf script is re-run, you will see the
expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler"
at the bottom):

0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random
ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
    gen8_irq_handler

Signed-off-by: Chris Phlipot 
Tested-by: Arnaldo Carvalho de Melo 
Acked-by: Jiri Olsa 
Cc: Adrian Hunter 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1461831551-12213-2-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/machine.c | 56 ++-
 1 file changed, 41 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 8c7bf4d..639a290 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1817,8 +1817,6 @@ static int thread__resolve_callchain_sample(struct thread 
*thread,
int skip_idx = -1;
int first_call = 0;
 
-   callchain_cursor_reset(cursor);
-
if (perf_evsel__has_branch_callstack(evsel)) {
err = resolve_lbr_callchain_sample(thread, cursor, sample, 
parent,
   root_al, max_stack);
@@ -1929,20 +1927,12 @@ static int unwind_entry(struct unwind_entry *entry, 
void *arg)
   entry->map, entry->sym);
 }
 
-int thread__resolve_callchain(struct thread *thread,
- struct callchain_cursor *cursor,
- struct perf_evsel *evsel,
- struct perf_sample *sample,
- 

Re: [PATCH 2/5] perf script: extend db-export api to include callchains for samples

2016-04-28 Thread Chris Phlipot

Hi Adrian,

I have just resubmitted these changes as a new patch set, which I believe
should address most of your concerns. Please review the new patch set
instead of continuing with this one.

https://lkml.org/lkml/2016/4/28/75

Thanks,
Chris

On 04/22/2016 09:41 PM, Chris Phlipot wrote:



On 04/22/2016 12:56 AM, Adrian Hunter wrote:
The call_paths table already has symbol_id which belongs uniquely to 
a DSO,

so why do we need dso_id as well?
If the symbol_id is 0 because the IP could not be resolved to a 
symbol, this is not necessarily a valid assumption. Without a dso_id 
in the call_paths table, it is not possible to resolve the dso when 
symbol information is missing. the db_export api currently does not 
have enough information to match a DSO with an IP.


It is often useful to still have the call path associated with a DSO 
even if there is no symbol, which i why i recommend keeping the dso_id 
in the call_paths table.

Why do you need a callback?  Seems like the only thing you need from
thread-stack.c is the call path tree.  You could move that to its own 
.c/.h

files and then process the call chain in db-export.c
My original intent was to reuse existing code with minimal changes and 
conform the existing design patterns they used. Thread-stack.c, for 
example, currently uses a callback to populate the call_return table, 
so I used a callback as well to populate the call_path table.


I am open to making this change if it is believed it will result in a 
cleaner implementation.


Also a list of changes like the one above heavily implies you are not
obeying the one patch == one change rule.  Please try to make patches 
that

only do one thing and also run checkpatch.
While i can split this into a few smaller patches there is only really 
justification for applying all of them all together. If this is still 
preferred i can resubmit this in smaller parts.


If you don't mind, I'll let you respond to my comments before I 
comment on

any other patches.


Let me know if you have any additional comments.

Thanks,
Chris




[PATCH 3/6] perf script: enable db export to output sampled callchains

2016-04-28 Thread Chris Phlipot
This change enables the db export api to export callchains. This
is accomplished by adding callchains obtained from samples to the
call_path_root structure and exporting them via the current call path
export API.

While the current API does support exporting call paths, this is not
supported when sampling. This commit addresses that missing feature by
allowing the export of call paths when callchains are present in samples.

Summary:
-This feature is activated by initializing the call_path_root member
inside the db_export structure to a non-null value.
-Callchains are resolved with thread__resolve_callchain() and then stored
and exported by adding a call path under call path root.
-Symbol and DSO for each callchain node are exported via db_ids_from_al()

This commit puts in place infrastructure to be used by subsequent commits,
and by itself, does not introduce any user-visible changes.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c | 86 +
 tools/perf/util/db-export.h |  2 ++
 2 files changed, 88 insertions(+)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 4fc607c..cb96591 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -15,6 +15,8 @@
 
 #include 
 
+#include 
+
 #include "evsel.h"
 #include "machine.h"
 #include "thread.h"
@@ -23,6 +25,7 @@
 #include "event.h"
 #include "util.h"
 #include "thread-stack.h"
+#include "callchain.h"
 #include "call-path.h"
 #include "db-export.h"
 
@@ -277,6 +280,81 @@ static int db_ids_from_al(struct db_export *dbe, struct 
addr_location *al,
return 0;
 }
 
+static struct call_path *call_path_from_sample(struct db_export *dbe,
+  struct machine *machine,
+  struct thread *thread,
+  struct perf_sample *sample,
+  struct perf_evsel *evsel
+  )
+{
+   u64 kernel_start = machine__kernel_start(machine);
+
+   struct call_path *current = &dbe->cpr->call_path;
+   enum chain_order saved_order = callchain_param.order;
+
+   int err = 0;
+
+   if (!symbol_conf.use_callchain || !sample->callchain)
+   return NULL;
+
+   /* Since the call path tree must be built starting with the root, we
+* must use ORDER_CALL for call chain resolution, in order to process
+* the callchain starting with the root node and ending with the leaf.
+*/
+   callchain_param.order = ORDER_CALLER;
+   err = thread__resolve_callchain(thread, &callchain_cursor, evsel,
+   sample, NULL, NULL,
+   PERF_MAX_STACK_DEPTH);
+   if (err) {
+   callchain_param.order = saved_order;
+   return NULL;
+   }
+   callchain_cursor_commit(&callchain_cursor);
+
+   while (1) {
+   struct callchain_cursor_node *node;
+   struct addr_location al;
+   u64 dso_db_id = 0, sym_db_id = 0, offset = 0;
+
+   memset(&al, 0, sizeof(al));
+
+   node = callchain_cursor_current(&callchain_cursor);
+   if (!node)
+   break;
+
+   /* handle export of symbol and dso for this node by
+* constructing an addr_location struct and then passing it to
+* db_ids_from_al() to perform the export.
+*/
+   al.sym = node->sym;
+   al.map = node->map;
+   al.machine = machine;
+   if (al.map)
+   al.addr = al.map->map_ip(al.map, node->ip);
+   else
+   al.addr = node->ip;
+
+   db_ids_from_al(dbe, &al, &dso_db_id, &sym_db_id, &offset);
+
+   /* add node to the call path tree if it doesn't exist */
+   current = call_path__findnew(dbe->cpr, current,
+al.sym, node->ip,
+kernel_start);
+
+   callchain_cursor_advance(&callchain_cursor);
+   }
+
+   /* Reset the callchain order to its prior value. */
+   callchain_param.order = saved_order;
+
+   if (current == &dbe->cpr->call_path) {
+   /* Bail because the callchain was empty. */
+   return NULL;
+   }
+
+   return current;
+}
+
 int db_export__branch_type(struct db_export *dbe, u32 branch_type,
   const char *name)
 {
@@ -330,6 +408,14 @@ int db_export__sample(struct db_export *dbe, union 
perf_event *event,
if (err)
goto out_pu

[PATCH 1/6] perf tools: fix incorrect ordering of callchain entries

2016-04-28 Thread Chris Phlipot
The existing implementation of thread__resolve_callchain, under certain
circumstances, can assemble callchain entries in the incorrect order.

The callchain entries are resolved incorrectly for a sample when all
of the following conditions are met:

1. callchain_param.order is set to ORDER_CALLER

2. thread__resolve_callchain_sample is able to resolve callchain entries
   for the sample.

3. unwind__get_entries is also able to resolve callchain entries for the
   sample.

The fix is accomplished by reversing the order in which
thread__resolve_callchain_sample and unwind__get_entries are called
when callchain_param.order is set to ORDER_CALLER.

Unwind specific code from thread__resolve_callchain is also moved into a
new static function to improve readability of the fix.

How to Reproduce the Existing Bug:

Modifying perf script to print call trees in the opposite order or
applying the remaining patches from this series and comparing the
results output from export-to-postgtresql.py are the easiest ways
to see the bug, however it can still be seen in current builds
using perf report.

Here is how i can reproduce the bug using perf report:
$ ./perf record --call-graph=dwarf stress -c 1 -t 5

when i run this command:
$./perf report --call-graph=flat,0,0,callee

This callchain is contained in the output, which looks correct
(callee order):

gen8_irq_handler
handle_irq_event_percpu
handle_irq_event
handle_edge_irq
handle_irq
do_IRQ
ret_from_intr
__random
rand
0x558f2a04dded
0x558f2a04c774
__libc_start_main
0x558f2a04dcd9

Now run this command using caller order:
$./perf report --call-graph=flat,0,0,caller

It is expected to see the exact reverse of the above when using caller
order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the
bottom) in the output, but it is nowhere to be found.

instead you see this:

ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler
0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random

Notice how internally the kernel symbols are reversed and the user space
symbols are reversed, but the kernel symbols still appear above the user
space symbols.

if this patch is applied and perf script is re-run, you will see the 
expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler"
at the bottom):

0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random
ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
    gen8_irq_handler

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/machine.c | 56 ++-
 1 file changed, 41 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 2cb95bb..baec208 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1812,8 +1812,6 @@ static int thread__resolve_callchain_sample(struct thread 
*thread,
int skip_idx = -1;
int first_call = 0;
 
-   callchain_cursor_reset(cursor);
-
if (perf_evsel__has_branch_callstack(evsel)) {
err = resolve_lbr_callchain_sample(thread, cursor, sample, 
parent,
   root_al, max_stack);
@@ -1924,20 +1922,12 @@ static int unwind_entry(struct unwind_entry *entry, 
void *arg)
   entry->map, entry->sym);
 }
 
-int thread__resolve_callchain(struct thread *thread,
- struct callchain_cursor *cursor,
- struct perf_evsel *evsel,
- struct perf_sample *sample,
- struct symbol **parent,
- struct addr_location *root_al,
- int max_stack)
+static int thread__resolve_callchain_unwind(struct thread *thread,
+   struct callchain_cursor *cursor,
+   struct perf_evsel *evsel,
+   struct perf_sample *sample,
+   int max_stack)
 {
-   int ret = thread__resolve_callchain_sample(thread, cursor, evsel,
-  sample, parent,
-  

[PATCH 5/6] perf script: expose usage of the callchain db export via the python api

2016-04-28 Thread Chris Phlipot
This change allows python scripts to be able to utilize the recent
changes to the db export api allowing the export of call_paths derived
from sampled callchains. These call paths are also now associated with
the samples from which they were derived.

-This feature is enabled by setting "perf_db_export_callchains" to true

-When enabled, samples that have callchain information will have the
callchains exported via call_path_table

-The call_path_id field is added to sample_table to enable association of
samples with the corresponding callchain stored in the call paths
table. A call_path_id of 0 will be exported if there is no
corresponding callchain.

-When "perf_db_export_callchains" and "perf_db_export_calls" are both
set to True, the call path root data structure will be shared. This
prevents duplicating of data and call path ids that would result from
building two separate call path trees in memory.

-The call_return_processor structure definition was relocated to the header
file to make its contents visible to db-export.c. This enables the
sharing of call path trees between the two features, as mentioned
above.

This change is visible to python scripts using the python db export api.
The change is backwards compatible with scripts written against the
previous API, assuming that the scripts model the sample_table function
after the one in export-to-postgresql.py script by allowing for additional
arguments to be added in the future. ie. using *x as the final argument of
the sample_table function.

Signed-off-by: Chris Phlipot 
---
 .../util/scripting-engines/trace-event-python.c| 35 --
 tools/perf/util/thread-stack.c | 13 
 tools/perf/util/thread-stack.h | 14 -
 3 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index 7bb8592..091bce6 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -682,7 +682,7 @@ static int python_export_sample(struct db_export *dbe,
struct tables *tables = container_of(dbe, struct tables, dbe);
PyObject *t;
 
-   t = tuple_new(21);
+   t = tuple_new(22);
 
tuple_set_u64(t, 0, es->db_id);
tuple_set_u64(t, 1, es->evsel->db_id);
@@ -705,6 +705,7 @@ static int python_export_sample(struct db_export *dbe,
tuple_set_u64(t, 18, es->sample->data_src);
tuple_set_s32(t, 19, es->sample->flags & PERF_BRANCH_MASK);
tuple_set_s32(t, 20, !!(es->sample->flags & PERF_IP_FLAG_IN_TX));
+   tuple_set_u64(t, 21, es->call_path_id);
 
call_object(tables->sample_handler, t, "sample_table");
 
@@ -999,8 +1000,10 @@ static void set_table_handlers(struct tables *tables)
 {
const char *perf_db_export_mode = "perf_db_export_mode";
const char *perf_db_export_calls = "perf_db_export_calls";
-   PyObject *db_export_mode, *db_export_calls;
+   const char *perf_db_export_callchains = "perf_db_export_callchains";
+   PyObject *db_export_mode, *db_export_calls, *db_export_callchains;
bool export_calls = false;
+   bool export_callchains = false;
int ret;
 
memset(tables, 0, sizeof(struct tables));
@@ -1017,6 +1020,7 @@ static void set_table_handlers(struct tables *tables)
if (!ret)
return;
 
+   /* handle export calls */
tables->dbe.crp = NULL;
db_export_calls = PyDict_GetItemString(main_dict, perf_db_export_calls);
if (db_export_calls) {
@@ -1034,6 +1038,33 @@ static void set_table_handlers(struct tables *tables)
Py_FatalError("failed to create calls processor");
}
 
+   /* handle export callchains */
+   tables->dbe.cpr = NULL;
+   db_export_callchains = PyDict_GetItemString(main_dict,
+   perf_db_export_callchains);
+   if (db_export_callchains) {
+   ret = PyObject_IsTrue(db_export_callchains);
+   if (ret == -1)
+   handler_call_die(perf_db_export_callchains);
+   export_callchains = !!ret;
+   }
+
+   if (export_callchains) {
+   /*
+* Attempt to use the call path root from the call return
+* processor, if the call return processor is in use. Otherwise,
+* we allocate a new call path root. This prevents exporting
+* duplicate call path ids when both are in use simultaniously.
+*/
+   if (tables->dbe.crp)
+   tables->dbe.cpr = tables->dbe.crp->cpr;
+   else
+

[PATCH 6/6] perf script: update export-to-postgresql to support callchain export

2016-04-28 Thread Chris Phlipot
Update the export-to-postgresql.py to support the newly introduced
callchain export.

callchains are added into the existing call_paths table and can now
be associated with samples when the "callpaths" commandline option
is used with the script.

ex. $perf script -s export-to-postgresql.py example_db all callchains

Includes the following changes to enable callchain export via the
python export APIs:

-Add the "callchains" commandline option, which is used to enable
callchain export by setting the perf_db_export_callchains global
-Add perf_db_export_callchains checks for call_path table creation
 and population.
-Add call_path_id to samples_table to conform with the new API

example usage and output using a small test app:

test_app.c:

volatile int x = 0;
void inc_x_loop()
{
int i;
for(i=0; i<1; i++)
x++;
}

void a()
{
inc_x_loop();
}

void b()
{
inc_x_loop();
}

int main()
{
a();
b();
return 0;
}

example usage:
$ gcc -g -O0 test_app.c
$ ./perf record --call-graph=dwarf ./a.out
[ perf record: Woken up 77 times to write data ]
[ perf record: Captured and wrote 19.373 MB perf.data (2404 samples) ]

$ ./perf script -s scripts/python/export-to-postgresql.py
example_db all callchains

$ psql example_db

example_db=#
SELECT
(SELECT name FROM symbols WHERE id = cps.symbol_id) as symbol,
(SELECT name FROM symbols WHERE id =
(SELECT symbol_id from call_paths where id = cps.parent_id))
as parent_symbol,
sum(period) as event_count
FROM samples join call_paths as cps on call_path_id = cps.id
GROUP BY cps.id,evsel_id
ORDER BY event_count DESC
LIMIT 5;

  symbol  |  parent_symbol   | event_count
--+--+-
 inc_x_loop   | a|   734250982
 inc_x_loop   | b|   731028057
 unknown  | unknown  | 1335858
 task_tick_fair   | scheduler_tick   | 1238842
 update_wall_time | tick_do_update_jiffies64 |  650373
(5 rows)

The above data shows total "self time" in cycles for each call path that
was sampled. It is intended to demonstrate how it accounts separately
for the two ways to reach the "inc_x_loop" function(via "a" and "b").
Recursive common table expressions can be used as well to get cumulative
time spent in a function as well, but that is beyond the scope of this
basic example.

Signed-off-by: Chris Phlipot 
---
 tools/perf/scripts/python/export-to-postgresql.py | 47 +++
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/tools/perf/scripts/python/export-to-postgresql.py 
b/tools/perf/scripts/python/export-to-postgresql.py
index 6f0ca68..7656ff8 100644
--- a/tools/perf/scripts/python/export-to-postgresql.py
+++ b/tools/perf/scripts/python/export-to-postgresql.py
@@ -223,11 +223,14 @@ sys.path.append(os.environ['PERF_EXEC_PATH'] + \
 
 perf_db_export_mode = True
 perf_db_export_calls = False
+perf_db_export_callchains = False
+
 
 def usage():
-   print >> sys.stderr, "Usage is: export-to-postgresql.py  
[] []"
+   print >> sys.stderr, "Usage is: export-to-postgresql.py  
[] [] []"
print >> sys.stderr, "where:columns 'all' or 'branches'"
-   print >> sys.stderr, "  calls   'calls' => create calls 
table"
+   print >> sys.stderr, "  calls   'calls' => create calls 
and call_paths table"
+   print >> sys.stderr, "  callchains  'callchains' => create 
call_paths table"
raise Exception("Too few arguments")
 
 if (len(sys.argv) < 2):
@@ -245,9 +248,11 @@ if columns not in ("all", "branches"):
 
 branches = (columns == "branches")
 
-if (len(sys.argv) >= 4):
-   if (sys.argv[3] == "calls"):
+for i in range(3,len(sys.argv)):
+   if (sys.argv[i] == "calls"):
perf_db_export_calls = True
+   elif (sys.argv[i] == "callchains"):
+   perf_db_export_callchains = True
else:
usage()
 
@@ -358,14 +363,16 @@ else:
'transactionbigint,'
'data_src   bigint,'
'branch_typeinteger,'
-   'in_tx  boolean)')
+   'in_tx  boolean,'
+   'call_path_id   bigint)')
 
-if perf_db_export_calls:
+if perf_db_export_calls or perf_db_export_callchains:

[PATCH 2/6] perf tools: refractor code to move call path handling out of thread-stack

2016-04-28 Thread Chris Phlipot
Move the call path handling code out of thread-stack.c and thread-stack.h
to allow other components that are not part of thread-stack to create call
paths.

Summary:
-Create call-path.c and call-path.h and add them to the build.
-Move all call path related code out of thread-stack.c and thread-stack.h
and into call-path.c and call-path.h.
-A small subset of structures and functions are now visible through
call-path.h, which is required for thread-stack.c to continue to
compile.

This change is a prerequisite for subsequent patches in this change set
and by itself contains no user-visible changes.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/Build  |   1 +
 tools/perf/util/call-path.c| 122 
 tools/perf/util/call-path.h|  77 +
 tools/perf/util/db-export.c|   1 +
 .../util/scripting-engines/trace-event-python.c|   1 +
 tools/perf/util/thread-stack.c | 126 +
 tools/perf/util/thread-stack.h |  25 +---
 7 files changed, 204 insertions(+), 149 deletions(-)
 create mode 100644 tools/perf/util/call-path.c
 create mode 100644 tools/perf/util/call-path.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 90229a8..027bb2b 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -74,6 +74,7 @@ libperf-y += srcline.o
 libperf-y += data.o
 libperf-y += tsc.o
 libperf-y += cloexec.o
+libperf-y += call-path.o
 libperf-y += thread-stack.o
 libperf-$(CONFIG_AUXTRACE) += auxtrace.o
 libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
diff --git a/tools/perf/util/call-path.c b/tools/perf/util/call-path.c
new file mode 100644
index 000..904a170
--- /dev/null
+++ b/tools/perf/util/call-path.c
@@ -0,0 +1,122 @@
+/*
+ * call-path.h: Manipulate a tree data structure containing function call paths
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#include 
+#include 
+
+#include "util.h"
+#include "call-path.h"
+
+static void call_path__init(struct call_path *cp, struct call_path *parent,
+   struct symbol *sym, u64 ip, bool in_kernel)
+{
+   cp->parent = parent;
+   cp->sym = sym;
+   cp->ip = sym ? 0 : ip;
+   cp->db_id = 0;
+   cp->in_kernel = in_kernel;
+   RB_CLEAR_NODE(&cp->rb_node);
+   cp->children = RB_ROOT;
+}
+
+struct call_path_root *call_path_root__new(void)
+{
+   struct call_path_root *cpr;
+
+   cpr = zalloc(sizeof(struct call_path_root));
+   if (!cpr)
+   return NULL;
+   call_path__init(&cpr->call_path, NULL, NULL, 0, false);
+   INIT_LIST_HEAD(&cpr->blocks);
+   return cpr;
+}
+
+void call_path_root__free(struct call_path_root *cpr)
+{
+   struct call_path_block *pos, *n;
+
+   list_for_each_entry_safe(pos, n, &cpr->blocks, node) {
+   list_del(&pos->node);
+   free(pos);
+   }
+   free(cpr);
+}
+
+static struct call_path *call_path__new(struct call_path_root *cpr,
+   struct call_path *parent,
+   struct symbol *sym, u64 ip,
+   bool in_kernel)
+{
+   struct call_path_block *cpb;
+   struct call_path *cp;
+   size_t n;
+
+   if (cpr->next < cpr->sz) {
+   cpb = list_last_entry(&cpr->blocks, struct call_path_block,
+ node);
+   } else {
+   cpb = zalloc(sizeof(struct call_path_block));
+   if (!cpb)
+   return NULL;
+   list_add_tail(&cpb->node, &cpr->blocks);
+   cpr->sz += CALL_PATH_BLOCK_SIZE;
+   }
+
+   n = cpr->next++ & CALL_PATH_BLOCK_MASK;
+   cp = &cpb->cp[n];
+
+   call_path__init(cp, parent, sym, ip, in_kernel);
+
+   return cp;
+}
+
+struct call_path *call_path__findnew(struct call_path_root *cpr,
+struct call_path *parent,
+struct symbol *sym, u64 ip, u64 ks)
+{
+   struct rb_node **p;
+   struct rb_node *node_parent = NULL;
+   struct call_path *cp;
+   bool in_kernel = ip >= ks;
+
+   if (sym)
+   ip = 0;
+
+   if (!parent)
+   return call_path__new(cpr, parent, 

[PATCH 4/6] perf script: add call path id to exported sample in db export

2016-04-28 Thread Chris Phlipot
The exported sample now contains a reference to the call_path_id that
represents its callchain.

While callchains themselves are nice to have, being able to associate
them with samples makes them much more useful, and can allow for such
things as determining how much cumulative time is spent in a particular
function. This information is normally possible to get from the
call return processor. However, when doing normal sampling, call/return
information is not available, thus necessitating the need for associating
samples directly with call paths.

This commit include changes to db-export layer to make this information
available for subsequent patches in this change set, but by itself, does
not make any changes visible to the user.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c | 4 +++-
 tools/perf/util/db-export.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index cb96591..74f6875 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -412,8 +412,10 @@ int db_export__sample(struct db_export *dbe, union 
perf_event *event,
struct call_path *cp = call_path_from_sample(dbe, al->machine,
 thread, sample,
 evsel);
-   if (cp)
+   if (cp) {
db_export__call_path(dbe, cp);
+   es.call_path_id = cp->db_id;
+   }
}
 
if ((evsel->attr.sample_type & PERF_SAMPLE_ADDR) &&
diff --git a/tools/perf/util/db-export.h b/tools/perf/util/db-export.h
index f5daf55..67bc6b8 100644
--- a/tools/perf/util/db-export.h
+++ b/tools/perf/util/db-export.h
@@ -44,6 +44,7 @@ struct export_sample {
u64 addr_dso_db_id;
u64 addr_sym_db_id;
u64 addr_offset; /* addr offset from symbol start */
+   u64 call_path_id;
 };
 
 struct db_export {
-- 
2.7.4



[PATCH 0/6] perf script: export sampled callchains to database

2016-04-28 Thread Chris Phlipot
This patch set contains a set of changes to allow the export of sampled
callchains, and to associate them with samples, via the Python db export
API and export-to-postgresql.py script.

Call path information is currently only available in the database when
call/return info is available, but not when doing normal sampling. These
changes make this information available for normal sampling runs as well.

Patches 2-6 are required to make this information available in the
database.

Patch 1 is needed to fix an existing issue where callchains are
processed incorrectly which can cause the other patches to export
incorrect call paths for a small percentage of samples 
(depending on the workload).

Chris Phlipot (6):
  perf tools: fix incorrect ordering of callchain entries
  perf tools: refractor code to move call path handling out of
thread-stack
  perf script: enable db export to output sampled callchains
  perf script: add call path id to exported sample in db export
  perf script: expose usage of the callchain db export via the python
api
  perf script: update export-to-postgresql to support callchain export

 tools/perf/scripts/python/export-to-postgresql.py  |  47 ---
 tools/perf/util/Build  |   1 +
 tools/perf/util/call-path.c| 122 ++
 tools/perf/util/call-path.h|  77 
 tools/perf/util/db-export.c|  89 +
 tools/perf/util/db-export.h|   3 +
 tools/perf/util/machine.c  |  56 ++---
 .../util/scripting-engines/trace-event-python.c|  36 +-
 tools/perf/util/thread-stack.c | 139 +
 tools/perf/util/thread-stack.h |  31 ++---
 10 files changed, 408 insertions(+), 193 deletions(-)
 create mode 100644 tools/perf/util/call-path.c
 create mode 100644 tools/perf/util/call-path.h

-- 
2.7.4



[tip:perf/core] perf script: Fix segfault when printing callchains

2016-04-27 Thread tip-bot for Chris Phlipot
Commit-ID:  e557b674a9470dae99916be6105e6780b3a072ca
Gitweb: http://git.kernel.org/tip/e557b674a9470dae99916be6105e6780b3a072ca
Author: Chris Phlipot 
AuthorDate: Tue, 19 Apr 2016 19:32:11 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 25 Apr 2016 12:49:17 -0300

perf script: Fix segfault when printing callchains

This fixes a bug caused by an unitialized callchain cursor. The crash
frist appeared in:

6f736735e30f ("perf evsel: Require that callchains be resolved before
calling fprintf_{sym,callchain}")

The callchain cursor is a struct that contains pointers, that when
uninitialized will cause unpredictable behavior (usually a crash)
when trying to append to the callchain.

The existing implementation has the following issues:

1. The callchain cursor used is not initialized, resulting in
unpredictable behavior when used.
2. The cursor is declared on the stack. Even if it is properly initalized,
the implmentation will leak memory when the function returns,
since all the references to the callchain_nodes allocated by
callchain_cursor_append will be lost when the cursor goes out of
scope.
3. Storing the cursor on the stack is inefficient. Even if memory is
properly freed when it goes out of scope, a performance penalty
will be incurred due to reallocation of callchain nodes.
callchain_cursor_append is designed to avoid these reallocations
when an existing cursor is reused.

This patch fixes the crash by replacing cursor_callchain with a reference
to the global callchain_cursor which also resolves all 3 issues mentioned
above.

How to reproduce the crash:

  $ perf record --call-graph=dwarf stress -t 1 -c 1
  $ perf script > /dev/null
  Segfault

Signed-off-by: Chris Phlipot 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Peter Zijlstra 
Fixes: 6f736735e30f ("perf evsel: Require that callchains be resolved before 
calling fprintf_{sym,callchain}")
Link: 
http://lkml.kernel.org/r/1461119531-2529-1-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-script.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 5099740..f43b0c6 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -570,12 +570,12 @@ static void print_sample_bts(struct perf_sample *sample,
/* print branch_from information */
if (PRINT_FIELD(IP)) {
unsigned int print_opts = output[attr->type].print_ip_opts;
-   struct callchain_cursor *cursor = NULL, cursor_callchain;
+   struct callchain_cursor *cursor = NULL;
 
if (symbol_conf.use_callchain && sample->callchain &&
-   thread__resolve_callchain(al->thread, &cursor_callchain, 
evsel,
+   thread__resolve_callchain(al->thread, &callchain_cursor, 
evsel,
  sample, NULL, NULL, 
scripting_max_stack) == 0)
-   cursor = &cursor_callchain;
+   cursor = &callchain_cursor;
 
if (cursor == NULL) {
putchar(' ');
@@ -789,12 +789,12 @@ static void process_event(struct perf_script *script,
printf("%16" PRIu64, sample->weight);
 
if (PRINT_FIELD(IP)) {
-   struct callchain_cursor *cursor = NULL, cursor_callchain;
+   struct callchain_cursor *cursor = NULL;
 
if (symbol_conf.use_callchain && sample->callchain &&
-   thread__resolve_callchain(al->thread, &cursor_callchain, 
evsel,
+   thread__resolve_callchain(al->thread, &callchain_cursor, 
evsel,
  sample, NULL, NULL, 
scripting_max_stack) == 0)
-   cursor = &cursor_callchain;
+   cursor = &callchain_cursor;
 
putchar(cursor ? '\n' : ' ');
sample__fprintf_sym(sample, al, 0, 
output[attr->type].print_ip_opts, cursor, stdout);


[tip:perf/core] perf script: Fix postgresql ubuntu install instructions

2016-04-23 Thread tip-bot for Chris Phlipot
Commit-ID:  d6632dd59b66c89724ef28e2723586d1429382aa
Gitweb: http://git.kernel.org/tip/d6632dd59b66c89724ef28e2723586d1429382aa
Author: Chris Phlipot 
AuthorDate: Tue, 19 Apr 2016 01:56:02 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Apr 2016 12:36:54 -0300

perf script: Fix postgresql ubuntu install instructions

The current instructions for setting up an Ubuntu system for using the
export-to-postgresql.py script are incorrect.

The instructions in the script have been updated to work on newer
versions of ubuntu.

-Add missing dependencies to apt-get command:
python-pyside.qtsql, libqt4-sql-psql
-Add '-s' option to createuser command to force the user to be a
superuser since the command doesn't prompt as indicated in the
current instructions.

Tested on: Ubuntu 14.04, Ubuntu 16.04(beta)

Signed-off-by: Chris Phlipot 
Cc: Adrian Hunter 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1461056164-14914-3-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/scripts/python/export-to-postgresql.py | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/scripts/python/export-to-postgresql.py 
b/tools/perf/scripts/python/export-to-postgresql.py
index 1b02cdc..6f0ca68 100644
--- a/tools/perf/scripts/python/export-to-postgresql.py
+++ b/tools/perf/scripts/python/export-to-postgresql.py
@@ -34,10 +34,9 @@ import datetime
 #
 # ubuntu:
 #
-#  $ sudo apt-get install postgresql
+#  $ sudo apt-get install postgresql python-pyside.qtsql libqt4-sql-psql
 #  $ sudo su - postgres
-#  $ createuser 
-#  Shall the new role be a superuser? (y/n) y
+#  $ createuser -s 
 #
 # An example of using this script with Intel PT:
 #


Re: [PATCH 2/5] perf script: extend db-export api to include callchains for samples

2016-04-22 Thread Chris Phlipot



On 04/22/2016 12:56 AM, Adrian Hunter wrote:

The call_paths table already has symbol_id which belongs uniquely to a DSO,
so why do we need dso_id as well?
If the symbol_id is 0 because the IP could not be resolved to a symbol, 
this is not necessarily a valid assumption. Without a dso_id in the 
call_paths table, it is not possible to resolve the dso when symbol 
information is missing. the db_export api currently does not have enough 
information to match a DSO with an IP.


It is often useful to still have the call path associated with a DSO 
even if there is no symbol, which i why i recommend keeping the dso_id 
in the call_paths table.

Why do you need a callback?  Seems like the only thing you need from
thread-stack.c is the call path tree.  You could move that to its own .c/.h
files and then process the call chain in db-export.c
My original intent was to reuse existing code with minimal changes and 
conform the existing design patterns they used. Thread-stack.c, for 
example, currently uses a callback to populate the call_return table, so 
I used a callback as well to populate the call_path table.


I am open to making this change if it is believed it will result in a 
cleaner implementation.


Also a list of changes like the one above heavily implies you are not
obeying the one patch == one change rule.  Please try to make patches that
only do one thing and also run checkpatch.
While i can split this into a few smaller patches there is only really 
justification for applying all of them all together. If this is still 
preferred i can resubmit this in smaller parts.


If you don't mind, I'll let you respond to my comments before I comment on
any other patches.


Let me know if you have any additional comments.

Thanks,
Chris


Re: [PATCH 1/5] perf tools: fix incorrect ordering of callchain entries

2016-04-22 Thread Chris Phlipot
Modifying perf script to print call trees in the opposite order or 
applying patch 2 from this series and comparing the results output from 
export-to-postgtresql.py are the easiest ways to see the bug, however it 
can still be seen in current builds using perf report.


Here is how i can reproduce the bug using perf report:
$ ./perf record --call-graph=dwarf stress -c 1 -t 5


when i run this command:
$./perf report --call-graph=flat,0,0,callee

i see this calltree contained in the output, which looks correct (callee 
order):


gen8_irq_handler
handle_irq_event_percpu
handle_irq_event
handle_edge_irq
handle_irq
do_IRQ
ret_from_intr
__random
rand
0x558f2a04dded
0x558f2a04c774
__libc_start_main
0x558f2a04dcd9


Now I run this command:
$./perf report --call-graph=flat,0,0,caller




I would expect to see the exact reverse of the above when using caller 
order(with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the 
bottom) in the output, but it is nowhere to be found.


instead you see this:

ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler
0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random

Notice how internally the kernel symbols are reversed and the user space 
symbols are reversed, but the kernel symbols still appear above the user 
space symbols.


if you apply this patch and re-run, you will see the expected output 
(with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the bottom):


0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random
ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler

Thanks,
Chris

On 04/22/2016 12:59 AM, Adrian Hunter wrote:

+Jiri since he wrote the original code

On 22/04/16 10:55, Adrian Hunter wrote:

On 19/04/16 11:56, Chris Phlipot wrote:

The existing implentation implementation of thread__resolve_callchain,

Remove 'implentation'


under certain circumstanes, can assemble callchain entries in the

'circumstanes' -> 'circumstances'


incorrect order.

A the callchain entries are resolved incorrectly for a sample when all
of the following conditions are met:

1. callchain_param.order is set to ORDER_CALLER

2. thread__resolve_callchain_sample is able to resolve callchain entries
for the sample.

3. unwind__get_entries is also able to resolve callchain entries for the
sample.

The fix is accomplished by reversing the order in which
thread__resolve_callchain_sample and unwind__get_entries are called
when callchain_param.order is set to ORDER_CALLER.

Can you give an example of the commands you used and what the call chain
looked like before and after.

Also please run ./scripts/checkpatch.pl


Unwind specific code from thread__resolve_callchain is also moved into a
new static function to improve readability of the fix.

Signed-off-by: Chris Phlipot 
---
  tools/perf/util/machine.c | 57 ++-
  1 file changed, 41 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0c4dabc..dd086c8 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1806,8 +1806,6 @@ static int thread__resolve_callchain_sample(struct thread 
*thread,
int skip_idx = -1;
int first_call = 0;
  
-	callchain_cursor_reset(cursor);

-
if (has_branch_callstack(evsel)) {
err = resolve_lbr_callchain_sample(thread, cursor, sample, 
parent,
   root_al, max_stack);
@@ -1918,20 +1916,12 @@ static int unwind_entry(struct unwind_entry *entry, 
void *arg)
   entry->map, entry->sym);
  }
  
-int thread__resolve_callchain(struct thread *thread,

- struct callchain_cursor *cursor,
- struct perf_evsel *evsel,
- struct perf_sample *sample,
- struct symbol **parent,
- struct addr_location *root_al,
- int max_stack)
-{
-   int ret = thread__resolve_callchain_sample(thread, cursor, evsel,
-  sample, parent,
-  

[PATCH] perf script: fix segfault when printing callchains using builtin-script

2016-04-19 Thread Chris Phlipot
This fixes a bug caused by an unitialized callchain cursor. The crash
frist appeared in:
6f736735e30f ("perf evsel: Require that callchains be resolved before
calling fprintf_{sym,callchain}")

The callchain cursor is a struct that contains pointers, that when
uninitialized will cause unpredictable behavior (usually a crash)
when trying to append to the callchain.

The existing implementation has the following issues:
1. The callchain cursor used is not initialized, resulting in
unpredictable behavior when used.
2. The cursor is declared on the stack. Even if it is properly initalized,
the implmentation will leak memory when the function returns,
since all the references to the callchain_nodes allocated by
callchain_cursor_append will be lost when the cursor goes out of
scope.
3. Storing the cursor on the stack is inefficient. Even if memory is
properly freed when it goes out of scope, a performance penalty
will be incurred due to reallocation of callchain nodes.
callchain_cursor_append is designed to avoid these reallocations
when an existing cursor is reused.

This patch fixes the crash by replacing cursor_callchain with a reference
to the global callchain_cursor which also resolves all 3 issues mentioned
above.

How to reproduce the crash:
$ perf record --call-graph=dwarf stress -t 1 -c 1
$ perf script > /dev/null
Segfault

Signed-off-by: Chris Phlipot 
---
 tools/perf/builtin-script.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 5099740..f43b0c6 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -570,12 +570,12 @@ static void print_sample_bts(struct perf_sample *sample,
/* print branch_from information */
if (PRINT_FIELD(IP)) {
unsigned int print_opts = output[attr->type].print_ip_opts;
-   struct callchain_cursor *cursor = NULL, cursor_callchain;
+   struct callchain_cursor *cursor = NULL;
 
if (symbol_conf.use_callchain && sample->callchain &&
-   thread__resolve_callchain(al->thread, &cursor_callchain, 
evsel,
+   thread__resolve_callchain(al->thread, &callchain_cursor, 
evsel,
  sample, NULL, NULL, 
scripting_max_stack) == 0)
-   cursor = &cursor_callchain;
+   cursor = &callchain_cursor;
 
if (cursor == NULL) {
putchar(' ');
@@ -789,12 +789,12 @@ static void process_event(struct perf_script *script,
printf("%16" PRIu64, sample->weight);
 
if (PRINT_FIELD(IP)) {
-   struct callchain_cursor *cursor = NULL, cursor_callchain;
+   struct callchain_cursor *cursor = NULL;
 
if (symbol_conf.use_callchain && sample->callchain &&
-   thread__resolve_callchain(al->thread, &cursor_callchain, 
evsel,
+   thread__resolve_callchain(al->thread, &callchain_cursor, 
evsel,
  sample, NULL, NULL, 
scripting_max_stack) == 0)
-   cursor = &cursor_callchain;
+   cursor = &callchain_cursor;
 
putchar(cursor ? '\n' : ' ');
sample__fprintf_sym(sample, al, 0, 
output[attr->type].print_ip_opts, cursor, stdout);
-- 
2.7.4



[PATCH 3/5] perf script: fix postgresql ubuntu install instructions

2016-04-19 Thread Chris Phlipot
The current instructions for setting up an Ubuntu system for using
the export-to-postgresql.py script are incorrect.

The instructions in the script have been updated to work on newer
versions of ubuntu.

-Add missing dependencies to apt-get command:
python-pyside.qtsql, libqt4-sql-psql
-Add '-s' option to createuser command to force the user to be a
superuser since the command doesn't prompt as indicated in the
current instructions.

Tested on: Ubuntu 14.04, Ubuntu 16.04(beta)

Signed-off-by: Chris Phlipot 
---
 tools/perf/scripts/python/export-to-postgresql.py | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/scripts/python/export-to-postgresql.py 
b/tools/perf/scripts/python/export-to-postgresql.py
index 1b02cdc..6f0ca68 100644
--- a/tools/perf/scripts/python/export-to-postgresql.py
+++ b/tools/perf/scripts/python/export-to-postgresql.py
@@ -34,10 +34,9 @@ import datetime
 #
 # ubuntu:
 #
-#  $ sudo apt-get install postgresql
+#  $ sudo apt-get install postgresql python-pyside.qtsql libqt4-sql-psql
 #  $ sudo su - postgres
-#  $ createuser 
-#  Shall the new role be a superuser? (y/n) y
+#  $ createuser -s 
 #
 # An example of using this script with Intel PT:
 #
-- 
2.7.4



[PATCH 2/5] perf script: extend db-export api to include callchains for samples

2016-04-19 Thread Chris Phlipot
The current implementation of the python database export API only
includes call path information when using some form of call/return
tracing, but is unable to do so when sampling.

The following API extensions allow exporting of data collected by
perf record when using --call-graph.

The additions to the python api include the following:
- add call_path_id to sample_table to allow association of samples
  with call paths

- add dso_id to call_path_table to more closely align the data with that
  of a callchain_node

db-export and trace-script-python were both extended to accomidate the API
changes listed above.

Thread-stack's functionality was expanded to allow storage and exporting
of callchains that result from individual samples.

- Introduced a new static function (thread_stack__process_callchain) to
  resolve call paths using the existing callchain resolution provided by
  thread__resolve_callchain

- The existing call_path tree in call_return_processor is used for storing
  the data from the resolved callchain.

- Call_return_processor was also extended to allow the ability to export
  full call paths in addtion to the existing individual call/return pairs,
  since call/return pairs are not available when doing sampling.

The code was tested using call graphs from fp and dwarf.
export-to-postgresqlwas utilized with intel-pt data to verify that changes
did not negatively affect existing behavior of the db-export api.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c|  21 ++-
 tools/perf/util/db-export.h|   2 +
 .../util/scripting-engines/trace-event-python.c|  20 ++-
 tools/perf/util/thread-stack.c | 162 +
 tools/perf/util/thread-stack.h |  14 +-
 5 files changed, 184 insertions(+), 35 deletions(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 049438d..69c9a9d 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -329,6 +329,13 @@ int db_export__sample(struct db_export *dbe, union 
perf_event *event,
if (err)
goto out_put;
 
+   dbe->call_path_last_seen_db_id = 0;
+   if(dbe->crp) {
+   thread_stack__process_callchain(thread, comm, evsel,
+   al->machine, sample,
+   PERF_MAX_STACK_DEPTH, dbe->crp);
+   }
+
if ((evsel->attr.sample_type & PERF_SAMPLE_ADDR) &&
sample_addr_correlates_sym(&evsel->attr)) {
struct addr_location addr_al;
@@ -346,6 +353,7 @@ int db_export__sample(struct db_export *dbe, union 
perf_event *event,
goto out_put;
}
}
+   es.call_path_db_id = dbe->call_path_last_seen_db_id;
 
if (dbe->export_sample)
err = dbe->export_sample(dbe, &es);
@@ -397,9 +405,10 @@ int db_export__branch_types(struct db_export *dbe)
 int db_export__call_path(struct db_export *dbe, struct call_path *cp)
 {
int err;
-
-   if (cp->db_id)
+   if (cp->db_id) {
+   dbe->call_path_last_seen_db_id = cp->db_id;
return 0;
+   }
 
if (cp->parent) {
err = db_export__call_path(dbe, cp->parent);
@@ -409,8 +418,14 @@ int db_export__call_path(struct db_export *dbe, struct 
call_path *cp)
 
cp->db_id = ++dbe->call_path_last_db_id;
 
-   if (dbe->export_call_path)
+   if (dbe->export_call_path) {
+   if (cp->dso)
+   db_export__dso(dbe, cp->dso, cp->machine);
+   if (cp->sym && cp->dso)
+   db_export__symbol(dbe, cp->sym, cp->dso);
+   dbe->call_path_last_seen_db_id = cp->db_id;
return dbe->export_call_path(dbe, cp);
+   }
 
return 0;
 }
diff --git a/tools/perf/util/db-export.h b/tools/perf/util/db-export.h
index 25e22fd..40e3b07 100644
--- a/tools/perf/util/db-export.h
+++ b/tools/perf/util/db-export.h
@@ -43,6 +43,7 @@ struct export_sample {
u64 addr_dso_db_id;
u64 addr_sym_db_id;
u64 addr_offset; /* addr offset from symbol start */
+   u64 call_path_db_id;
 };
 
 struct db_export {
@@ -73,6 +74,7 @@ struct db_export {
u64 symbol_last_db_id;
u64 sample_last_db_id;
u64 call_path_last_db_id;
+   u64 call_path_last_seen_db_id; /* last db_id seen(exported or not) */
u64 call_return_last_db_id;
struct list_head deferred;
 };
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index 525eb49..ca3f9c6 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c

[PATCH 4/5] perf script: add option to control callchain export from python

2016-04-19 Thread Chris Phlipot
resolving the callchain for every sample can be very expensive.

This commit introduces a new option "perf_db_export_callchains"
which will be enabled when it is set to True from within the python
script.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/db-export.c|  2 +-
 tools/perf/util/db-export.h|  1 +
 .../util/scripting-engines/trace-event-python.c| 22 +-
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 69c9a9d..3bd8247 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -330,7 +330,7 @@ int db_export__sample(struct db_export *dbe, union 
perf_event *event,
goto out_put;
 
dbe->call_path_last_seen_db_id = 0;
-   if(dbe->crp) {
+   if(dbe->crp && dbe->export_callchains) {
thread_stack__process_callchain(thread, comm, evsel,
al->machine, sample,
PERF_MAX_STACK_DEPTH, dbe->crp);
diff --git a/tools/perf/util/db-export.h b/tools/perf/util/db-export.h
index 40e3b07..b75fea9 100644
--- a/tools/perf/util/db-export.h
+++ b/tools/perf/util/db-export.h
@@ -77,6 +77,7 @@ struct db_export {
u64 call_path_last_seen_db_id; /* last db_id seen(exported or not) */
u64 call_return_last_db_id;
struct list_head deferred;
+   bool export_callchains;
 };
 
 int db_export__init(struct db_export *dbe);
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index ca3f9c6..2e67b35 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1008,8 +1008,10 @@ error:
 static void set_table_handlers(struct tables *tables)
 {
const char *perf_db_export_mode = "perf_db_export_mode";
+   const char *perf_db_export_callchains = "perf_db_export_callchains";
const char *perf_db_export_calls = "perf_db_export_calls";
-   PyObject *db_export_mode, *db_export_calls;
+   PyObject *db_export_mode, *db_export_callchains, *db_export_calls;
+   bool export_callchains = false;
bool export_calls = false;
int ret;
 
@@ -1028,6 +1030,16 @@ static void set_table_handlers(struct tables *tables)
return;
 
tables->dbe.crp = NULL;
+
+   db_export_callchains = PyDict_GetItemString(main_dict,
+   perf_db_export_callchains);
+   if (db_export_callchains) {
+   ret = PyObject_IsTrue(db_export_callchains);
+   if (ret == -1)
+   handler_call_die(perf_db_export_callchains);
+   export_callchains = !!ret;
+   }
+
db_export_calls = PyDict_GetItemString(main_dict, perf_db_export_calls);
if (db_export_calls) {
ret = PyObject_IsTrue(db_export_calls);
@@ -1043,9 +1055,17 @@ static void set_table_handlers(struct tables *tables)
   &tables->dbe);
if (!tables->dbe.crp)
Py_FatalError("failed to create calls processor");
+   } else if (export_callchains) {
+   tables->dbe.crp =
+   call_return_processor__new(python_process_call_path,
+  NULL,
+  &tables->dbe);
+   if (!tables->dbe.crp)
+   Py_FatalError("failed to create calls processor");
}
 
tables->db_export_mode = true;
+   tables->dbe.export_callchains = export_callchains;
/*
 * Reserve per symbol space for symbol->db_id via symbol__priv()
 */
-- 
2.7.4



[PATCH 1/5] perf tools: fix incorrect ordering of callchain entries

2016-04-19 Thread Chris Phlipot
The existing implentation implementation of thread__resolve_callchain,
under certain circumstanes, can assemble callchain entries in the
incorrect order.

A the callchain entries are resolved incorrectly for a sample when all
of the following conditions are met:

1. callchain_param.order is set to ORDER_CALLER

2. thread__resolve_callchain_sample is able to resolve callchain entries
   for the sample.

3. unwind__get_entries is also able to resolve callchain entries for the
   sample.

The fix is accomplished by reversing the order in which
thread__resolve_callchain_sample and unwind__get_entries are called
when callchain_param.order is set to ORDER_CALLER.

Unwind specific code from thread__resolve_callchain is also moved into a
new static function to improve readability of the fix.

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/machine.c | 57 ++-
 1 file changed, 41 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0c4dabc..dd086c8 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1806,8 +1806,6 @@ static int thread__resolve_callchain_sample(struct thread 
*thread,
int skip_idx = -1;
int first_call = 0;
 
-   callchain_cursor_reset(cursor);
-
if (has_branch_callstack(evsel)) {
err = resolve_lbr_callchain_sample(thread, cursor, sample, 
parent,
   root_al, max_stack);
@@ -1918,20 +1916,12 @@ static int unwind_entry(struct unwind_entry *entry, 
void *arg)
   entry->map, entry->sym);
 }
 
-int thread__resolve_callchain(struct thread *thread,
- struct callchain_cursor *cursor,
- struct perf_evsel *evsel,
- struct perf_sample *sample,
- struct symbol **parent,
- struct addr_location *root_al,
- int max_stack)
-{
-   int ret = thread__resolve_callchain_sample(thread, cursor, evsel,
-  sample, parent,
-  root_al, max_stack);
-   if (ret)
-   return ret;
-
+static int thread__resolve_callchain_unwind(struct thread *thread,
+struct callchain_cursor *cursor,
+   struct perf_evsel *evsel,
+   struct perf_sample *sample,
+   int max_stack)
+{
/* Can we do dwarf post unwind? */
if (!((evsel->attr.sample_type & PERF_SAMPLE_REGS_USER) &&
  (evsel->attr.sample_type & PERF_SAMPLE_STACK_USER)))
@@ -1944,7 +1934,42 @@ int thread__resolve_callchain(struct thread *thread,
 
return unwind__get_entries(unwind_entry, cursor,
   thread, sample, max_stack);
+}
+
+int thread__resolve_callchain(struct thread *thread,
+  struct callchain_cursor *cursor,
+  struct perf_evsel *evsel,
+  struct perf_sample *sample,
+  struct symbol **parent,
+  struct addr_location *root_al,
+  int max_stack)
+{
+   int ret = 0;
+   callchain_cursor_reset(&callchain_cursor);
+
+   if (callchain_param.order == ORDER_CALLEE) {
+ret = thread__resolve_callchain_sample(thread, cursor,
+   evsel, sample,
+  parent, root_al,
+  max_stack);
+   if (ret)
+   return ret;
+ret = thread__resolve_callchain_unwind(thread, cursor,
+   evsel, sample,
+  max_stack);
+   } else {
+ret = thread__resolve_callchain_unwind(thread, cursor,
+   evsel, sample,
+  max_stack);
+   if (ret)
+   return ret;
+ret = thread__resolve_callchain_sample(thread, cursor,
+   evsel, sample,
+  parent, root_al,
+  max_stack);
+   }
 
+   return ret;
 }
 
 int machine__for_each_thread(struct machine *machine,
-- 
2.7.4



[PATCH 5/5] perf script: Update export-to-postgresql to support callchains

2016-04-19 Thread Chris Phlipot
Update the export-to-postgresql.py to support the newly introduced
callchain export.

callchains are added into the existing call_paths table and can now
be associated with samples when the "callpaths" commandline option
is used with the script.

Includes the following changes:
-Now set the perf_db_export_callchains global
-Add "callchains" commandline option"
-Add perf_db_export_callchains checks for call_path table creation
 and population.
-Add call_path_id to samples_table
-Add dso_id to call_paths_table

Signed-off-by: Chris Phlipot 
---
 tools/perf/scripts/python/export-to-postgresql.py | 60 ++-
 1 file changed, 37 insertions(+), 23 deletions(-)

diff --git a/tools/perf/scripts/python/export-to-postgresql.py 
b/tools/perf/scripts/python/export-to-postgresql.py
index 6f0ca68..e64009f 100644
--- a/tools/perf/scripts/python/export-to-postgresql.py
+++ b/tools/perf/scripts/python/export-to-postgresql.py
@@ -223,11 +223,14 @@ sys.path.append(os.environ['PERF_EXEC_PATH'] + \
 
 perf_db_export_mode = True
 perf_db_export_calls = False
+perf_db_export_callchains = False
+
 
 def usage():
-   print >> sys.stderr, "Usage is: export-to-postgresql.py  
[] []"
+   print >> sys.stderr, "Usage is: export-to-postgresql.py  
[] [] []"
print >> sys.stderr, "where:columns 'all' or 'branches'"
-   print >> sys.stderr, "  calls   'calls' => create calls 
table"
+   print >> sys.stderr, "  calls   'calls' => create calls 
and call_paths table"
+   print >> sys.stderr, "  callchains  'callchains' => create 
call_paths table"
raise Exception("Too few arguments")
 
 if (len(sys.argv) < 2):
@@ -245,12 +248,14 @@ if columns not in ("all", "branches"):
 
 branches = (columns == "branches")
 
-if (len(sys.argv) >= 4):
-   if (sys.argv[3] == "calls"):
+for i in range(3,len(sys.argv)):
+   if (sys.argv[i] == "calls"):
perf_db_export_calls = True
+   elif (sys.argv[i] == "callchains"):
+   perf_db_export_callchains = True
else:
usage()
-
+   
 output_dir_name = os.getcwd() + "/" + dbname + "-perf-data"
 os.mkdir(output_dir_name)
 
@@ -358,14 +363,17 @@ else:
'transactionbigint,'
'data_src   bigint,'
'branch_typeinteger,'
-   'in_tx  boolean)')
+   'in_tx  boolean,'
+   'call_path_id   bigint)')
 
-if perf_db_export_calls:
+if perf_db_export_calls or perf_db_export_callchains:
do_query(query, 'CREATE TABLE call_paths ('
'id bigint  NOT NULL,'
'parent_id  bigint,'
'symbol_id  bigint,'
-   'ip bigint)')
+   'ip bigint,'
+   'dso_id bigint)')
+if perf_db_export_calls:
do_query(query, 'CREATE TABLE calls ('
'id bigint  NOT NULL,'
'thread_id  bigint,'
@@ -427,7 +435,7 @@ do_query(query, 'CREATE VIEW comm_threads_view AS '
'(SELECT tid FROM threads WHERE id = thread_id) AS tid'
' FROM comm_threads')
 
-if perf_db_export_calls:
+if perf_db_export_calls or perf_db_export_callchains:
do_query(query, 'CREATE VIEW call_paths_view AS '
'SELECT '
'c.id,'
@@ -443,6 +451,7 @@ if perf_db_export_calls:
'(SELECT dso_id FROM symbols WHERE id = p.symbol_id) AS 
parent_dso_id,'
'(SELECT dso FROM symbols_view  WHERE id = p.symbol_id) 
AS parent_dso_short_name'
' FROM call_paths c INNER JOIN call_paths p ON p.id = 
c.parent_id')
+if perf_db_export_calls:
do_query(query, 'CREATE VIEW calls_view AS '
'SELECT '
'calls.id,'
@@ -540,8 +549,9 @@ dso_file= open_output_file("dso_table.bin")
 symbol_file= open_output_file("symbol_table.bin")
 branch_type_file   = open_output_file("branch_type_table.bin")
 sample_file= open_output_file("sample_table.bin")
-if perf_db_export_calls:
+if perf_db_export_calls or perf_db_export_callchains:
call_path_file  = open_output_file("call_path_table.bin")
+if perf_db_export_calls:
 

[tip:perf/core] perf tools: Fix perf script python database export crash

2016-03-11 Thread tip-bot for Chris Phlipot
Commit-ID:  616df645d7238e45d3b369933a30fee4e4e305e2
Gitweb: http://git.kernel.org/tip/616df645d7238e45d3b369933a30fee4e4e305e2
Author: Chris Phlipot 
AuthorDate: Tue, 8 Mar 2016 21:11:54 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 9 Mar 2016 10:31:02 -0300

perf tools: Fix perf script python database export crash

Remove the union in evsel so that the database id and priv pointer can
be used simultainously without conflicting and crashing.

Detailed Description for the fixed bug follows:

perf script crashes with a segmentation fault on user space tool version
4.5.rc7.ge2857b when using the python database export API. It works
properly in 4.4 and prior versions.

the crash fist appeared in:

cfc8874a4859 ("perf script: Process cpu/threads maps")

How to reproduce the bug:

Remove any temporary files left over from a previous crash (if you have
already attemped to reproduce the bug):

  $ rm -r test_db-perf-data
  $ dropdb test_db

  $ perf record timeout 1 yes >/dev/null
  $ perf script -s scripts/python/export-to-postgresql.py test_db

  Stack Trace:
  Program received signal SIGSEGV, Segmentation fault.
  __GI___libc_free (mem=0x1) at malloc.c:2929
  2929  malloc.c: No such file or directory.
  (gdb) bt
at util/stat.c:122
argv=, prefix=) at builtin-script.c:2231
argc=argc@entry=4, argv=argv@entry=0x7fffdf70) at perf.c:390
at perf.c:451

Signed-off-by: Chris Phlipot 
Acked-by: Jiri Olsa 
Cc: Adrian Hunter 
Cc: Peter Zijlstra 
Fixes: cfc8874a4859 ("perf script: Process cpu/threads maps")
Link: 
http://lkml.kernel.org/r/1457500314-8912-1-git-send-email-cphlip...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/evsel.h | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index efad78f..501ea6e 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -93,10 +93,8 @@ struct perf_evsel {
const char  *unit;
struct event_format *tp_format;
off_t   id_offset;
-   union {
-   void*priv;
-   u64 db_id;
-   };
+   void*priv;
+   u64 db_id;
struct cgroup_sel   *cgrp;
void*handler;
struct cpu_map  *cpus;


[PATCH 1/1] Fixes: cfc8874a485 ("perf script: Process cpu/threads maps")

2016-03-08 Thread Chris Phlipot
fix the perf script python database export crash.
Remove the union in evsel so that the database id and priv pointer can be
used simultainously without conflicting and crashing.

Detailed Description for the fixed bug follows:

perf script crashes with a segmentaiton fault on user space tool version
4.5.rc7.ge2857b when using the python database export API. It works
properly in 4.4 and prior versions.

the crash fist appeared in
cfc8874a485 ("perf script: Process cpu/threads maps")

How to reprodcue the bug:

remove any temporary files left over from a previous crash
(if you have already attemped to reproduce the bug):
$ rm -r test_db-perf-data
$ dropdb test_db

$ ./perf record timeout 1 yes >/dev/null
$ ./perf script -s scripts/python/export-to-postgresql.py test_db

Stack Trace:
Program received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x1) at malloc.c:2929
2929malloc.c: No such file or directory.
(gdb) bt
at util/stat.c:122
argv=, prefix=) at builtin-script.c:2231
argc=argc@entry=4, argv=argv@entry=0x7fffdf70) at perf.c:390
at perf.c:451

Signed-off-by: Chris Phlipot 
---
 tools/perf/util/evsel.h | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8e75434..4d8037a 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -93,10 +93,8 @@ struct perf_evsel {
const char  *unit;
struct event_format *tp_format;
off_t   id_offset;
-   union {
-   void*priv;
-   u64 db_id;
-   };
+   void*priv;
+   u64 db_id;
struct cgroup_sel   *cgrp;
void*handler;
struct cpu_map  *cpus;
-- 
1.9.1