[PATCH v2] Add a python script to statistic direct io behavior

2013-02-05 Thread chenggang
From: chenggang@taobao.com

The previous version of this patch needed to introduce 2 new tracepoint events
in VFS, but introducing new tracepoint events into VFS is not a good idea. So I
modified this patch to use only an existing tracepoint event
(ext4:ext4_direct_IO_exit).

If engineers want to analyze the direct IO behavior of applications without
source code, the perf tools combined with appropriate tracepoint events in the
VFS subsystem are an excellent choice.

Many database systems use their own page cache subsystems and use direct IO
to access the disks. Sometimes, system engineers need to know the miss rate
of the database system's page cache. This requirement can be satisfied by
recording the database's file accesses that go through direct IO. So, we use
the tracepoint event ext4:ext4_direct_IO_exit to record system-wide direct IO
behavior.
The script direct-io.py introduced by this patch records the tracepoint event
ext4:ext4_direct_IO_exit, analyses the sample data, and gives a concise report.
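
As a rough illustration of the kind of report described above, the following
standalone Python 3 sketch aggregates direct IO samples per process and file.
It is not the code of direct-io.py. The field names (ino, dev, rw, ret) follow
the handler signature in the patch, and MINORBITS = 20 is the constant the
script defines. The helper names summarize() and parse_filter() are
hypothetical, and treating rw == 0 as a read and a positive ret as the number
of bytes transferred are assumptions.

```python
from collections import defaultdict

MINORBITS = 20                      # same constant the script defines
MINORMASK = (1 << MINORBITS) - 1


def major(dev):
    """Major device number encoded in a dev value."""
    return dev >> MINORBITS


def minor(dev):
    """Minor device number encoded in a dev value."""
    return dev & MINORMASK


def parse_filter(arg):
    """Interpret the optional [comm|pid] argument: an integer is a pid."""
    try:
        return int(arg), None
    except ValueError:
        return None, arg


def summarize(samples, for_pid=None, for_comm=None):
    """Aggregate samples per (comm, pid, major, minor, ino).

    Each sample is a dict with the keys comm, pid, ino, dev, rw and ret.
    Assumption: rw == 0 is a read, any other value is a write, and only a
    positive ret counts as transferred bytes.
    """
    totals = defaultdict(
        lambda: {"reads": 0, "writes": 0, "read_bytes": 0, "write_bytes": 0})
    for s in samples:
        if for_pid is not None and s["pid"] != for_pid:
            continue
        if for_comm is not None and s["comm"] != for_comm:
            continue
        key = (s["comm"], s["pid"], major(s["dev"]), minor(s["dev"]), s["ino"])
        kind = "read" if s["rw"] == 0 else "write"
        totals[key][kind + "s"] += 1
        if s["ret"] > 0:
            totals[key][kind + "_bytes"] += s["ret"]
    return dict(totals)
```

Keying on (major, minor, ino) instead of a file name mirrors the fact that this
tracepoint reports a device and inode number, not a path.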

usage:
perf script record direct-io
perf script report direct-io [comm|pid]

Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/scripts/python/bin/direct-io-record |2 +
 tools/perf/scripts/python/bin/direct-io-report |   21 +++
 tools/perf/scripts/python/direct-io.py |  197 
 3 files changed, 220 insertions(+)
 create mode 100755 tools/perf/scripts/python/bin/direct-io-record
 create mode 100644 tools/perf/scripts/python/bin/direct-io-report
 create mode 100644 tools/perf/scripts/python/direct-io.py

diff --git a/tools/perf/scripts/python/bin/direct-io-record 
b/tools/perf/scripts/python/bin/direct-io-record
new file mode 100755
index 000..f38d5fc
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-record
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf record -e ext4:ext4_direct_IO_exit  $@
diff --git a/tools/perf/scripts/python/bin/direct-io-report 
b/tools/perf/scripts/python/bin/direct-io-report
new file mode 100644
index 000..828d9c6
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-report
@@ -0,0 +1,21 @@
+#!/bin/bash
+# description: direct_io statistic
+# args: [comm|pid]
+n_args=0
+for i in "$@"
+do
+if expr match "$i" "-" > /dev/null ; then
+   break
+fi
+n_args=$(( $n_args + 1 ))
+done
+if [ "$n_args" -gt 1 ] ; then
+echo "usage: perf script report direct-io [comm|pid]"
+exit
+fi
+
+if [ "$n_args" -gt 0 ] ; then
+comm=$1
+shift
+fi
+perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/direct-io.py $comm
diff --git a/tools/perf/scripts/python/direct-io.py 
b/tools/perf/scripts/python/direct-io.py
new file mode 100644
index 000..b609e95
--- /dev/null
+++ b/tools/perf/scripts/python/direct-io.py
@@ -0,0 +1,197 @@
+# direct IO counts
+# (c) 2013, Chenggang Qin 
+# Licensed under the terms of the GNU GPL License version 2
+
+# Displays system-wide file direct IO behavior.
+# It helps us to investigate which processes trigger a direct IO,
+# and what files are accessed by these processes.
+#
+# options
+# comm, pid: show details of the file r/w behavior of a special process.
+
+import os, sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+   '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+MINORBITS = 20
+MINORMASK = ((1 << MINORBITS) - 1)
+
+usage = "perf script record direct-io\n" \
+   "perf script report direct-io [comm|pid]\n"
+
+for_comm = None
+for_pid = None
+pid_2_comm = None
+
+if len(sys.argv) > 2:
+   sys.exit(usage)
+
+if len(sys.argv) > 1:
+   try:
+      for_pid = int(sys.argv[1])
+   except:
+      for_comm = sys.argv[1]
+
+direct_write = autodict()
+direct_read = autodict()
+
+direct_write_bytes = autodict()
+direct_read_bytes = autodict()
+
+comm_read_info = autodict()
+comm_write_info = autodict()
+
+wevent_count = 0
+revent_count = 0
+
+comm_revent_count = 0;
+comm_wevent_count = 0;
+
+def MAJOR(dev):
+   return (dev) >> MINORBITS
+
+def MINOR(dev):
+   return (dev) & MINORMASK
+
+def trace_begin():
+   print "Press control+C to stop and show the summary"
+
+def trace_end():
+   if (for_comm is not None) or (for_pid is not None):
+      print_direct_io_event_for_comm()
+   else:
+      print_direct_io_event_totals()
+
+def ext4__ext4_direct_IO_exit(event_name, context, common_cpu,
+   common_secs, common_nsecs, common_pid, common_comm,
+   ino, dev, pos, len, rw, ret):
+

[PATCH v2] Perf Script: Add a python script to statistic direct io behavior

2013-02-06 Thread chenggang
From: chenggang@taobao.com

The previous version of this patch needed to introduce 2 new tracepoint events
in VFS, but introducing new tracepoint events into VFS is not a good idea. So I
modified this patch to use only an existing tracepoint event
(ext4:ext4_direct_IO_exit).

If engineers want to analyze the direct IO behavior of applications without
source code, the perf tools combined with appropriate tracepoint events in the
VFS subsystem are an excellent choice.

Many database systems use their own page cache subsystems and use direct IO
to access the disks. Sometimes, system engineers need to know the miss rate
of the database system's page cache. This requirement can be satisfied by
recording the database's file accesses that go through direct IO. So, we use
the tracepoint event ext4:ext4_direct_IO_exit to record system-wide direct IO
behavior.
The script direct-io.py introduced by this patch records the tracepoint event
ext4:ext4_direct_IO_exit, analyses the sample data, and gives a concise report.

usage:
perf script record direct-io
perf script report direct-io [comm|pid]

Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/scripts/python/bin/direct-io-record |2 +
 tools/perf/scripts/python/bin/direct-io-report |   21 +++
 tools/perf/scripts/python/direct-io.py |  197 
 3 files changed, 220 insertions(+)
 create mode 100755 tools/perf/scripts/python/bin/direct-io-record
 create mode 100644 tools/perf/scripts/python/bin/direct-io-report
 create mode 100644 tools/perf/scripts/python/direct-io.py

diff --git a/tools/perf/scripts/python/bin/direct-io-record 
b/tools/perf/scripts/python/bin/direct-io-record
new file mode 100755
index 000..f38d5fc
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-record
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf record -e ext4:ext4_direct_IO_exit  $@
diff --git a/tools/perf/scripts/python/bin/direct-io-report 
b/tools/perf/scripts/python/bin/direct-io-report
new file mode 100644
index 000..828d9c6
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-report
@@ -0,0 +1,21 @@
+#!/bin/bash
+# description: direct_io statistic
+# args: [comm|pid]
+n_args=0
+for i in "$@"
+do
+if expr match "$i" "-" > /dev/null ; then
+   break
+fi
+n_args=$(( $n_args + 1 ))
+done
+if [ "$n_args" -gt 1 ] ; then
+echo "usage: perf script report direct-io [comm|pid]"
+exit
+fi
+
+if [ "$n_args" -gt 0 ] ; then
+comm=$1
+shift
+fi
+perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/direct-io.py $comm
diff --git a/tools/perf/scripts/python/direct-io.py 
b/tools/perf/scripts/python/direct-io.py
new file mode 100644
index 000..b609e95
--- /dev/null
+++ b/tools/perf/scripts/python/direct-io.py
@@ -0,0 +1,197 @@
+# direct IO counts
+# (c) 2013, Chenggang Qin 
+# Licensed under the terms of the GNU GPL License version 2
+
+# Displays system-wide file direct IO behavior.
+# It helps us to investigate which processes trigger a direct IO,
+# and what files are accessed by these processes.
+#
+# options
+# comm, pid: show details of the file r/w behavior of a special process.
+
+import os, sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+   '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+MINORBITS = 20
+MINORMASK = ((1 << MINORBITS) - 1)
+
+usage = "perf script record direct-io\n" \
+   "perf script report direct-io [comm|pid]\n"
+
+for_comm = None
+for_pid = None
+pid_2_comm = None
+
+if len(sys.argv) > 2:
+   sys.exit(usage)
+
+if len(sys.argv) > 1:
+   try:
+      for_pid = int(sys.argv[1])
+   except:
+      for_comm = sys.argv[1]
+
+direct_write = autodict()
+direct_read = autodict()
+
+direct_write_bytes = autodict()
+direct_read_bytes = autodict()
+
+comm_read_info = autodict()
+comm_write_info = autodict()
+
+wevent_count = 0
+revent_count = 0
+
+comm_revent_count = 0;
+comm_wevent_count = 0;
+
+def MAJOR(dev):
+   return (dev) >> MINORBITS
+
+def MINOR(dev):
+   return (dev) & MINORMASK
+
+def trace_begin():
+   print "Press control+C to stop and show the summary"
+
+def trace_end():
+   if (for_comm is not None) or (for_pid is not None):
+      print_direct_io_event_for_comm()
+   else:
+      print_direct_io_event_totals()
+
+def ext4__ext4_direct_IO_exit(event_name, context, common_cpu,
+   common_secs, common_nsecs, common_pid, common_comm,
+   ino, dev, pos, len, rw, ret):
+

[PATCH] Tracepoint Event: Add 4 tracepoint events for vfs subsystem.

2013-01-28 Thread chenggang
From: chenggang@gmail.com

If engineers want to analyze the file access behavior of applications without
source code, the perf tools combined with appropriate tracepoint events in the
VFS subsystem are an excellent choice.

System engineers or developers of server software need to know which files
are accessed by the target processes within a period of time. Then they can
find the hot applications and the hot files. For this requirement, we added 2
tracepoint events at the beginning of generic_file_aio_read() and
generic_file_aio_write().

Many database systems use their own page cache subsystems and use direct IO
to access the disks. Sometimes, system engineers want to know the miss rate
of the database system's page cache. This requirement can be satisfied by
recording the database's file accesses that go through direct IO. So, we added
2 tracepoint events in the direct IO branch of generic_file_aio_read() and
generic_file_aio_write().

Next, we will extend perf's functionality with Python scripts that use these
new tracepoint events.

The 4 new tracepoint events are:
1) generic_file_aio_read
   Format:
   field:unsigned short common_type;          offset:0;   size:2;   signed:0;
   field:unsigned char common_flags;          offset:2;   size:1;   signed:0;
   field:unsigned char common_preempt_count;  offset:3;   size:1;   signed:0;
   field:int common_pid;                      offset:4;   size:4;   signed:1;
   field:int common_padding;                  offset:8;   size:4;   signed:1;

   field:long long pos;                       offset:16;  size:8;   signed:1;
   field:unsigned long bytes;                 offset:24;  size:8;   signed:0;
   field:unsigned char fname[100];            offset:32;  size:100; signed:0;

2) generic_file_aio_write
   Format:
   field:unsigned short common_type;          offset:0;   size:2;   signed:0;
   field:unsigned char common_flags;          offset:2;   size:1;   signed:0;
   field:unsigned char common_preempt_count;  offset:3;   size:1;   signed:0;
   field:int common_pid;                      offset:4;   size:4;   signed:1;
   field:int common_padding;                  offset:8;   size:4;   signed:1;

   field:long long pos;                       offset:16;  size:8;   signed:1;
   field:unsigned long bytes;                 offset:24;  size:8;   signed:0;
   field:unsigned char fname[100];            offset:32;  size:100; signed:0;

3) direct_io_read
   Format:
   field:unsigned short common_type;          offset:0;   size:2;   signed:0;
   field:unsigned char common_flags;          offset:2;   size:1;   signed:0;
   field:unsigned char common_preempt_count;  offset:3;   size:1;   signed:0;
   field:int common_pid;                      offset:4;   size:4;   signed:1;
   field:int common_padding;                  offset:8;   size:4;   signed:1;

   field:long long pos;                       offset:16;  size:8;   signed:1;
   field:unsigned long bytes;                 offset:24;  size:8;   signed:0;
   field:unsigned char fname[100];            offset:32;  size:100; signed:0;

4) direct_io_write
   Format:
   field:unsigned short common_type;          offset:0;   size:2;   signed:0;
   field:unsigned char common_flags;          offset:2;   size:1;   signed:0;
   field:unsigned char common_preempt_count;  offset:3;   size:1;   signed:0;
   field:int common_pid;                      offset:4;   size:4;   signed:1;
   field:int common_padding;                  offset:8;   size:4;   signed:1;

   field:long long pos;                       offset:16;  size:8;   signed:1;
   field:unsigned long bytes;                 offset:24;  size:8;   signed:0;
   field:unsigned char fname[100];            offset:32;  size:100; signed:0;
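
To make the layout above concrete, here is a small Python 3 sketch that decodes
one raw record laid out as in the format listing: common fields at offsets 0
to 11, a 4-byte hole, pos at 16, bytes at 24 and fname[100] at 32. It only
illustrates the offsets. The name decode_record() is hypothetical, and a
little-endian 64-bit machine is assumed (the listing shows unsigned long with
size 8).

```python
import struct

# H B B i i  -> common_type, common_flags, common_preempt_count,
#               common_pid, common_padding (offsets 0, 2, 3, 4, 8)
# 4x         -> hole between offset 12 and offset 16
# q Q 100s   -> pos (16), bytes (24), fname[100] (32)
# Assumption: little-endian byte order.
RECORD = struct.Struct("<HBBii4xqQ100s")


def decode_record(raw):
    """Decode one raw event record into a dict of the interesting fields."""
    (common_type, common_flags, common_preempt_count, common_pid,
     common_padding, pos, nbytes, fname) = RECORD.unpack_from(raw)
    return {
        "common_pid": common_pid,
        "pos": pos,
        "bytes": nbytes,
        # fname is a fixed 100-byte array; keep the part before the first NUL.
        "fname": fname.split(b"\0", 1)[0].decode("utf-8", "replace"),
    }
```

The record size under these assumptions is 132 bytes, of which 100 are spent on
the fixed-size file name.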

Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 include/trace/events/vfs.h |  110 
 mm/filemap.c   |   18 
 2 files changed, 128 insertions(+)
 create mode 100644 include/trace/events/vfs.h

diff --git a/include/trace/events/vfs.h b/include/trace/events/vfs.h
new file mode 100644
index 000..33498e1
--- /dev/null
+++ b/include/trace/events/vfs.h
@@ -0,0 +1,110 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM vfs
+#define TRACE_INCLUDE_FILE vfs
+
+#if !defined(_TRACE_EVENTS_VFS_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_EVENTS_VFS_H
+
+#include 
+
+#include 
+
+TRACE_EVENT(generic_file_aio_read,
+
+   TP_PROTO(long long pos, unsigned long bytes, unsigned char *fname),
+
+   TP_ARGS(pos, bytes, fname),
+
+   TP_STRUCT__entry(
+   __field(long long,  pos )
+   __field(unsigned long,  bytes   )
+   __array(unsigned cha

[PATCH v2] Add 4 tracepoint events for vfs

2013-01-29 Thread chenggang
From: chenggang@gmail.com

If engineers want to analyze the file access behavior of applications without
source code, the perf tools combined with appropriate tracepoint events in the
VFS subsystem are an excellent choice.

System engineers or developers of server software need to know which files
are accessed by the target processes within a period of time. Then they can
find the hot applications and the hot files. For this requirement, we added 2
tracepoint events at the beginning of generic_file_aio_read() and
generic_file_aio_write().

Many database systems use their own page cache subsystems and use direct IO
to access the disks. Sometimes, system engineers want to know the miss rate
of the database system's page cache. This requirement can be satisfied by
recording the database's file accesses that go through direct IO. So, we added
2 tracepoint events in the direct IO branch of generic_file_aio_read() and
generic_file_aio_write().

Next, we will extend perf's functionality with Python scripts that use these
new tracepoint events.

The 4 new tracepoint events are:
1) generic_file_aio_read
   Format:
   field:unsigned short common_type;          offset:0;   size:2;   signed:0;
   field:unsigned char common_flags;          offset:2;   size:1;   signed:0;
   field:unsigned char common_preempt_count;  offset:3;   size:1;   signed:0;
   field:int common_pid;                      offset:4;   size:4;   signed:1;
   field:int common_padding;                  offset:8;   size:4;   signed:1;

   field:long long pos;                       offset:16;  size:8;   signed:1;
   field:unsigned long bytes;                 offset:24;  size:8;   signed:0;
   field:__data_loc char[] fname;             offset:32;  size:4;   signed:1;

2) generic_file_aio_write
   Format:
   field:unsigned short common_type;          offset:0;   size:2;   signed:0;
   field:unsigned char common_flags;          offset:2;   size:1;   signed:0;
   field:unsigned char common_preempt_count;  offset:3;   size:1;   signed:0;
   field:int common_pid;                      offset:4;   size:4;   signed:1;
   field:int common_padding;                  offset:8;   size:4;   signed:1;

   field:long long pos;                       offset:16;  size:8;   signed:1;
   field:unsigned long bytes;                 offset:24;  size:8;   signed:0;
   field:__data_loc char[] fname;             offset:32;  size:4;   signed:1;

3) direct_io_read
   Format:
   field:unsigned short common_type;          offset:0;   size:2;   signed:0;
   field:unsigned char common_flags;          offset:2;   size:1;   signed:0;
   field:unsigned char common_preempt_count;  offset:3;   size:1;   signed:0;
   field:int common_pid;                      offset:4;   size:4;   signed:1;
   field:int common_padding;                  offset:8;   size:4;   signed:1;

   field:long long pos;                       offset:16;  size:8;   signed:1;
   field:unsigned long bytes;                 offset:24;  size:8;   signed:0;
   field:unsigned char fname[100];            offset:32;  size:100; signed:0;

4) direct_io_write
   Format:
   field:unsigned short common_type;          offset:0;   size:2;   signed:0;
   field:unsigned char common_flags;          offset:2;   size:1;   signed:0;
   field:unsigned char common_preempt_count;  offset:3;   size:1;   signed:0;
   field:int common_pid;                      offset:4;   size:4;   signed:1;
   field:int common_padding;                  offset:8;   size:4;   signed:1;

   field:long long pos;                       offset:16;  size:8;   signed:1;
   field:unsigned long bytes;                 offset:24;  size:8;   signed:0;
   field:unsigned char fname[100];            offset:32;  size:100; signed:0;
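
In this version the fname field of the first two events is a 4-byte __data_loc
descriptor instead of a fixed 100-byte array. The Python 3 sketch below shows
how such a descriptor could be resolved. It assumes the usual ftrace convention
that the low 16 bits hold the offset of the string from the start of the record
and the high 16 bits hold its length, and it assumes little-endian byte order.
The function name read_data_loc_string() is hypothetical.

```python
import struct


def read_data_loc_string(record, field_offset=32):
    """Resolve a __data_loc char[] field stored at field_offset.

    Assumption: the 32-bit descriptor encodes (length << 16) | offset,
    where offset is counted from the start of the record.
    """
    (loc,) = struct.unpack_from("<I", record, field_offset)
    start = loc & 0xFFFF
    length = loc >> 16
    data = bytes(record[start:start + length])
    # The stored string is NUL-terminated; drop the terminator.
    return data.split(b"\0", 1)[0].decode("utf-8", "replace")
```

With this encoding the record only carries as many bytes as the file name
needs, instead of always reserving 100 bytes.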

Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 include/trace/events/vfs.h |   62 
 mm/filemap.c   |   18 +
 2 files changed, 80 insertions(+)
 create mode 100644 include/trace/events/vfs.h

diff --git a/include/trace/events/vfs.h b/include/trace/events/vfs.h
new file mode 100644
index 000..384ff29
--- /dev/null
+++ b/include/trace/events/vfs.h
@@ -0,0 +1,62 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM vfs
+#define TRACE_INCLUDE_FILE vfs
+
+#if !defined(_TRACE_EVENTS_VFS_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_EVENTS_VFS_H
+
+#include 
+
+#include 
+
+DECLARE_EVENT_CLASS(vfs_filerw_template,
+
+   TP_PROTO(long long pos, unsigned long bytes, unsigned char *fname),
+
+   TP_ARGS(pos, bytes, fname),
+
+   TP_STRUCT__entry(
+   __field(long long,  pos )
+   __field(unsigned long,  bytes   )
+   __string(   fname,

[PATCH 2/2] perf script: add python script to show system's file r/w behavior

2013-01-30 Thread chenggang
From: chenggang@gmail.com

This patch depends on another patch: https://lkml.org/lkml/2013/1/29/47
because it uses 2 tracepoint events that are introduced by that patch.

If engineers want to analyze the file access behavior of applications
without source code, the perf script mechanism with appropriate tracepoint
events in the VFS subsystem is an excellent choice.

System engineers or developers of server software need to know which files
are accessed by the target processes within a period of time. Then they can
find the hot applications and the hot files. Based on the two tracepoint events
vfs:generic_file_aio_read and vfs:generic_file_aio_write (introduced by the
patch https://lkml.org/lkml/2013/1/29/47), the Python script introduced by this
patch records the system context and other related information whenever a
process accesses a file. The script can then show the details of the file
access behavior of every process. The detailed information includes the
process's pid and comm, the number of file reads/writes, and the names of the
related files.

The usage of this script is:
perf script record filerw
perf script report filerw [comm|pid]
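
The "hot applications and hot files" idea described above can be sketched in a
few lines of standalone Python 3. This is not the code of filerw.py. The
function name hot_files() and the tuple layout of the input events are
hypothetical and only serve the illustration.

```python
from collections import Counter


def hot_files(events, top=3):
    """Rank files and tasks by the number of read/write events.

    events is an iterable of (comm, pid, op, fname, nbytes) tuples, where
    op is 'r' or 'w'. Returns (top files, top tasks), each as a list of
    (key, event count) pairs sorted by descending count.
    """
    per_file = Counter()
    per_task = Counter()
    for comm, pid, op, fname, nbytes in events:
        per_file[fname] += 1
        per_task[(comm, pid)] += 1
    return per_file.most_common(top), per_task.most_common(top)
```

Counting events is the simplest ranking. Summing nbytes instead would rank by
traffic volume.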

Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/scripts/python/bin/filerw-record |2 +
 tools/perf/scripts/python/bin/filerw-report |   21 +++
 tools/perf/scripts/python/filerw.py |  189 +++
 3 files changed, 212 insertions(+)
 create mode 100755 tools/perf/scripts/python/bin/filerw-record
 create mode 100644 tools/perf/scripts/python/bin/filerw-report
 create mode 100644 tools/perf/scripts/python/filerw.py

diff --git a/tools/perf/scripts/python/bin/filerw-record 
b/tools/perf/scripts/python/bin/filerw-record
new file mode 100755
index 000..80f358c
--- /dev/null
+++ b/tools/perf/scripts/python/bin/filerw-record
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf record -e vfs:generic_file_aio_read -e vfs:generic_file_aio_write $@
diff --git a/tools/perf/scripts/python/bin/filerw-report 
b/tools/perf/scripts/python/bin/filerw-report
new file mode 100644
index 000..5a4dac9
--- /dev/null
+++ b/tools/perf/scripts/python/bin/filerw-report
@@ -0,0 +1,21 @@
+#!/bin/bash
+# description: file read/write operations statistic
+# args: [comm]
+n_args=0
+for i in "$@"
+do
+if expr match "$i" "-" > /dev/null ; then
+   break
+fi
+n_args=$(( $n_args + 1 ))
+done
+if [ "$n_args" -gt 1 ] ; then
+echo "usage: perf script report filerw [comm|pid]"
+exit
+fi
+
+if [ "$n_args" -gt 0 ] ; then
+comm=$1
+shift
+fi
+perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/filerw.py $comm
diff --git a/tools/perf/scripts/python/filerw.py 
b/tools/perf/scripts/python/filerw.py
new file mode 100644
index 000..f5bd820
--- /dev/null
+++ b/tools/perf/scripts/python/filerw.py
@@ -0,0 +1,189 @@
+# file read/write counts
+# (c) 2013, Chenggang Qin 
+# Licensed under the terms of the GNU GPL License version 2
+
+# Displays system-wide file aio read/write behavior.
+# It helps us to investigate what files are accessed by all
+# processes or a special process.
+#
+# options
+# comm: show details of the file r/w behavior of a special process.
+
+import os, sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+   '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+usage = "perf script record filerw\n" \
+   "perf script report filerw [comm|pid]\n"
+
+for_comm = None
+for_pid = None
+pid_2_comm = None
+
+if len(sys.argv) > 2:
+   sys.exit(usage)
+
+if len(sys.argv) > 1:
+   try:
+      for_pid = int(sys.argv[1])
+   except:
+      for_comm = sys.argv[1]
+
+file_write = autodict()
+file_read = autodict()
+
+file_write_bytes = autodict()
+file_read_bytes = autodict()
+
+comm_read_info = autodict()
+comm_write_info = autodict()
+
+wevent_count = 0
+revent_count = 0
+
+comm_revent_count = 0;
+comm_wevent_count = 0;
+
+def trace_begin():
+   print "Press control+C to stop and show the summary"
+
+def trace_end():
+   if (for_comm is not None) or (for_pid is not None):
+      print_file_event_for_comm()
+   else:
+      print_file_event_totals()
+
+def vfs__generic_file_aio_write(event_name, context, common_cpu,
+   common_secs, common_nsecs, common_pid, common_comm,
+   pos, bytes, fname):
+   global wevent_count
+   global comm_wevent_count
+   global pid_2_comm
+
+   if (for_comm is not None) or (for_pid is not None):
+   if (common_co

[PATCH] Add a python script to statistic direct io behavior

2013-01-30 Thread chenggang
From: chenggang@gmail.com

This patch depends on a previous patch: https://lkml.org/lkml/2013/1/29/47

If engineers want to analyze the direct IO behavior of applications without
source code, the perf tools combined with appropriate tracepoint events in the
VFS subsystem are an excellent choice.

Many database systems use their own page cache subsystems and use direct IO
to access the disks. Sometimes, system engineers need to know the miss rate
of the database system's page cache. This requirement can be satisfied by
recording the database's file accesses that go through direct IO. So, we use 2
tracepoint events to record system-wide direct IO behavior. The 2 tracepoint
events are:
1) vfs:direct_io_read
2) vfs:direct_io_write
They were introduced by the patch: https://lkml.org/lkml/2013/1/29/47
The script direct-io.py introduced by this patch records the 2 tracepoint
events, analyses the sample data, and gives a concise report.

usage:
perf script record direct-io
perf script report direct-io [comm|pid]

Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
---
 tools/perf/scripts/python/bin/direct-io-record |2 +
 tools/perf/scripts/python/bin/direct-io-report |   21 +++
 tools/perf/scripts/python/direct-io.py |  185 
 3 files changed, 208 insertions(+)
 create mode 100755 tools/perf/scripts/python/bin/direct-io-record
 create mode 100644 tools/perf/scripts/python/bin/direct-io-report
 create mode 100644 tools/perf/scripts/python/direct-io.py

diff --git a/tools/perf/scripts/python/bin/direct-io-record 
b/tools/perf/scripts/python/bin/direct-io-record
new file mode 100755
index 000..4857097
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-record
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf record -e vfs:direct_io_read -e vfs:direct_io_write $@
diff --git a/tools/perf/scripts/python/bin/direct-io-report 
b/tools/perf/scripts/python/bin/direct-io-report
new file mode 100644
index 000..828d9c6
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-report
@@ -0,0 +1,21 @@
+#!/bin/bash
+# description: direct_io statistic
+# args: [comm|pid]
+n_args=0
+for i in "$@"
+do
+if expr match "$i" "-" > /dev/null ; then
+   break
+fi
+n_args=$(( $n_args + 1 ))
+done
+if [ "$n_args" -gt 1 ] ; then
+echo "usage: perf script report direct-io [comm|pid]"
+exit
+fi
+
+if [ "$n_args" -gt 0 ] ; then
+comm=$1
+shift
+fi
+perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/direct-io.py $comm
diff --git a/tools/perf/scripts/python/direct-io.py 
b/tools/perf/scripts/python/direct-io.py
new file mode 100644
index 000..321ff8e
--- /dev/null
+++ b/tools/perf/scripts/python/direct-io.py
@@ -0,0 +1,185 @@
+# direct IO counts
+# (c) 2013, Chenggang Qin 
+# Licensed under the terms of the GNU GPL License version 2
+
+# Displays system-wide file direct IO behavior.
+# It helps us to investigate which processes trigger a direct IO,
+# and what files are accessed by these processes.
+#
+# options
+# comm, pid: show details of the file r/w behavior of a special process.
+
+import os, sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+   '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+usage = "perf script record direct-io\n" \
+   "perf script report direct-io [comm|pid]\n"
+
+for_comm = None
+for_pid = None
+pid_2_comm = None
+
+if len(sys.argv) > 2:
+   sys.exit(usage)
+
+if len(sys.argv) > 1:
+   try:
+      for_pid = int(sys.argv[1])
+   except:
+      for_comm = sys.argv[1]
+
+file_write = autodict()
+file_read = autodict()
+
+file_write_bytes = autodict()
+file_read_bytes = autodict()
+
+comm_read_info = autodict()
+comm_write_info = autodict()
+
+wevent_count = 0
+revent_count = 0
+
+comm_revent_count = 0;
+comm_wevent_count = 0;
+
+def trace_begin():
+   print "Press control+C to stop and show the summary"
+
+def trace_end():
+   if (for_comm is not None) or (for_pid is not None):
+      print_direct_io_event_for_comm()
+   else:
+      print_direct_io_event_totals()
+
+def vfs__direct_io_write(event_name, context, common_cpu,
+   common_secs, common_nsecs, common_pid, common_comm,
+   pos, bytes, fname):
+   global wevent_count
+   global comm_wevent_count
+   global pid_2_comm
+
+   if (for_comm is not None) or (for_pid is not None):
+   if (common_comm != for_comm) and (common_pid != for_pid):
+   

[PATCH v2 0/4] perf: Make the 'perf top -p $pid' can perceive the new forked threads.

2013-02-26 Thread chenggang
From: chenggang@taobao.com

This patch set makes 'perf top -p $pid' able to perceive the new threads that
are forked by the target processes. 'perf top{record} -p $pid' can perceive the
threads that were forked before perf was started, but it cannot perceive the
threads that are forked after perf was started. This is an important defect in
perf, because many applications fork new threads on the fly.
For performance reasons, the event inherit mechanism is disabled when per-task
counters are used. Some internal data structures, such as thread_map,
evlist->mmap, evsel->fd, evsel->id and evsel->sample_id, are implemented as
arrays at the initialization phase. Their size is fixed, and they cannot be
extended or shrunk easily when we want to adjust them for newly forked threads
and exited threads.

So, we have done the following work:
1) Transformed xyarray to a linked list.
   Implemented the interfaces to extend and shrink an existing xyarray.
   The xyarray is a 2-dimensional structure. The rows are still an array
   (because the number of CPUs is fixed), and the columns are linked lists.
2) Transformed evlist->mmap, evsel->fd, evsel->id and evsel->sample_id to lists
   built on the new xyarray.
   Implemented interfaces to expand and shrink these structures.
   The nodes in these structures can be referenced by predefined macros, such
   as FD(cpu, thread), MMAP(cpu, thread), ID(cpu, thread), etc.
3) Transformed thread_map to a linked list.
   Implemented the interfaces to extend and shrink an existing thread_map.
4) Added 2 callback functions to top->perf_tool. They are called when the
   PERF_RECORD_FORK and PERF_RECORD_EXIT events are received.
   When a PERF_RECORD_FORK event is received, all related data structures are
   expanded, and a new fd and mmap are opened.
   When a PERF_RECORD_EXIT event is received, the nodes in the related data
   structures are removed, and the fd and mmap are closed.

The linked list is flexible: list_add and list_del can be used easily. In
addition, the performance penalty (especially in CPU utilization) is low.
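
The fork/exit handling described in this cover letter can be modelled in a few
lines of Python 3. This is only a conceptual model, not perf code: the class
ThreadTracker, its methods and the event dictionaries are hypothetical, and a
tuple stands in for the per-CPU fd and mmap that perf would open.

```python
class ThreadTracker:
    """Keep one per-CPU handle list for every monitored thread."""

    def __init__(self, ncpus, initial_tids):
        self.ncpus = ncpus
        self.threads = {}           # tid -> list of per-CPU handles
        for tid in initial_tids:
            self.on_fork(tid)

    def on_fork(self, tid):
        """Grow the structures: one new handle per CPU for the new thread."""
        if tid not in self.threads:
            self.threads[tid] = [("counter", cpu, tid)
                                 for cpu in range(self.ncpus)]

    def on_exit(self, tid):
        """Shrink the structures: drop every handle of the exited thread."""
        return self.threads.pop(tid, None)

    def handle(self, event):
        """Dispatch an event dict such as {"type": "FORK", "tid": 11}."""
        if event["type"] == "FORK":
            self.on_fork(event["tid"])
        elif event["type"] == "EXIT":
            self.on_exit(event["tid"])
```

The point of the model is that both callbacks change the set of tracked threads
at run time, which a fixed-size array cannot accommodate.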

This function has already been implemented for 'perf top -p $pid' in patch
[4/4] of this patch set. As a next step, 'perf record -p $pid' should be
modified in the same way.

Thanks to David Ahern for his suggestion.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

chenggang (4):
  Transform xyarray to linked list.
  Transform thread_map to linked list.
  Transform mmap and other related structures to list with new xyarray.
  Add fork and exit callback functions into top->perf_tool.

 tools/perf/builtin-record.c   |6 +-
 tools/perf/builtin-stat.c |2 +-
 tools/perf/builtin-top.c  |  100 -
 tools/perf/tests/open-syscall-tp-fields.c |2 +-
 tools/perf/util/event.c   |   10 +-
 tools/perf/util/evlist.c  |  171 +++---
 tools/perf/util/evlist.h  |6 +-
 tools/perf/util/evsel.c   |   98 +++--
 tools/perf/util/evsel.h   |8 +-
 tools/perf/util/header.c  |   31 ++--
 tools/perf/util/header.h  |3 +-
 tools/perf/util/python.c  |2 +-
 tools/perf/util/thread_map.c  |  223 +++--
 tools/perf/util/thread_map.h  |   16 ++-
 tools/perf/util/xyarray.c |   85 ++-
 tools/perf/util/xyarray.h |   25 +++-
 16 files changed, 641 insertions(+), 147 deletions(-)

-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 1/4] Transform xyarray to linked list

2013-02-26 Thread chenggang
From: chenggang 

The 2-dimensional array cannot be expanded and shrunk easily when we want to
respond to thread fork and exit events on the fly.
We transform xyarray into a 2-dimensional linked list. The rows are still an
array, but each column is implemented as a list. The number of nodes in every
row is the same.
Interfaces to append to and shrink an existing xyarray are provided:
1) xyarray__append()
   append a column to all rows.
2) xyarray__remove()
   remove a column from all rows.
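
The structure described above can be modelled in Python 3 as follows. This is a
conceptual sketch, not a translation of the C code in the patch: the class
XYArray and its method names are hypothetical, a Python list stands in for each
row's linked list, and a dict stands in for an entry's contents.

```python
class XYArray:
    """Fixed number of rows (x); columns (y) can be added and removed."""

    def __init__(self, xlen, ylen, make_entry=dict):
        self.make_entry = make_entry
        self.rows = [[] for _ in range(xlen)]   # the row array never resizes
        for _ in range(ylen):
            self.append()

    def append(self):
        """Add one column: a new entry at the tail of every row."""
        for row in self.rows:
            row.append(self.make_entry())

    def remove(self, y):
        """Remove column y from every row."""
        for row in self.rows:
            del row[y]

    def entry(self, x, y):
        """Return the entry at row x, column y."""
        return self.rows[x][y]

    @property
    def ncols(self):
        """Number of columns; identical for all rows by construction."""
        return len(self.rows[0]) if self.rows else 0
```

Because append() and remove() always touch every row, all rows keep the same
number of nodes, which is the invariant stated in the description.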

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/xyarray.c |   85 +
 tools/perf/util/xyarray.h |   25 +++--
 2 files changed, 101 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c
index 22afbf6..fc48bda 100644
--- a/tools/perf/util/xyarray.c
+++ b/tools/perf/util/xyarray.c
@@ -1,20 +1,93 @@
 #include "xyarray.h"
 #include "util.h"
 
-struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
+/*
+ * Add a column for all rows;
+ */
+int xyarray__append(struct xyarray *xy)
 {
-   size_t row_size = ylen * entry_size;
-   struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);
+   struct xyentry *new_entry;
+   unsigned int x;
+
+   for (x = 0; x < xy->row_count; x++) {
+   new_entry = zalloc(sizeof(*new_entry));
+   if (new_entry == NULL)
+   return -1;
+
+   new_entry->contents = zalloc(xy->entry_size);
+   if (new_entry->contents == NULL) {
+   free(new_entry);
+   return -1;
+   }
 
-   if (xy != NULL) {
-   xy->entry_size = entry_size;
-   xy->row_size   = row_size;
+   list_add_tail(&new_entry->next, &xy->rows[x].head);
}
 
+   return 0;
+}
+
+struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
+{
+   struct xyarray *xy = zalloc(sizeof(*xy) + xlen * sizeof(struct row));
+   int i;
+
+   if (xy == NULL)
+   return NULL;
+
+   xy->row_count = xlen;
+   xy->entry_size = entry_size;
+
+   for (i = 0; i < xlen; i++)
+   INIT_LIST_HEAD(&xy->rows[i].head);
+
+   for (i = 0; i < ylen; i++)
+   if (xyarray__append(xy) < 0) {
+   xyarray__delete(xy);
+   return NULL;
+   }
+
return xy;
 }
 
+/*
+ * remove a column for all rows;
+ */
+int xyarray__remove(struct xyarray *xy, int y)
+{
+   struct xyentry *entry;
+   unsigned int x;
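+   int count, removed = 0;
+
+   if (!xy)
+   return 0;
+
+   /* Drop column y from every row, not only the first one. */
+   for (x = 0; x < xy->row_count; x++) {
+   count = 0;
+   list_for_each_entry(entry, &xy->rows[x].head, next)
+   if (count++ == y) {
+   list_del(&entry->next);
+   free(entry->contents);
+   free(entry);
+   removed = 1;
+   break;
+   }
+   }
+
+   return removed ? 0 : -1;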
+}
+
+/*
+ * All nodes in every rows should be deleted before delete @xy.
+ */
 void xyarray__delete(struct xyarray *xy)
 {
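+   unsigned int i;
+   struct xyentry *entry, *tmp;
+
+   if (!xy)
+   return;
+
+   /* The _safe iterator is needed because entries are freed while walking. */
+   for (i = 0; i < xy->row_count; i++)
+   list_for_each_entry_safe(entry, tmp, &xy->rows[i].head, next) {
+   list_del(&entry->next);
+   free(entry->contents);
+   free(entry);
+   }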
+
free(xy);
 }
diff --git a/tools/perf/util/xyarray.h b/tools/perf/util/xyarray.h
index c488a07..07fa370 100644
--- a/tools/perf/util/xyarray.h
+++ b/tools/perf/util/xyarray.h
@@ -2,19 +2,38 @@
 #define _PERF_XYARRAY_H_ 1
 
 #include 
+#include 
+
+struct row {
+   struct list_head head;
+};
+
+struct xyentry {
+   struct list_head next;
+   char *contents;
+};
 
 struct xyarray {
-   size_t row_size;
+   size_t row_count;
size_t entry_size;
-   char contents[];
+   struct row rows[];
 };
 
 struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size);
 void xyarray__delete(struct xyarray *xy);
+int xyarray__append(struct xyarray *xy);
+int xyarray__remove(struct xyarray *xy, int y);
 
 static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
 {
-   return &xy->contents[x * xy->row_size + y * xy->entry_size];
+   struct xyentry *entry;
+   int columns = 0;
+
+   list_for_each_entry(entry, &xy->rows[x].head, next)
+   if (columns++ == y)
+   return entry->contents;
+
+   return NULL;
 }
 
 #endif /* _PERF_XYARRAY_H_ */
-- 
1.7.9.5


[PATCH v2 2/4] Transform thread_map to linked list

2013-02-26 Thread chenggang
From: chenggang 

The size of thread_map is fixed at initialization according to the
files in /proc/{$pid}. It cannot easily be expanded or shrunk when we
want to respond to thread fork and exit events.
We transform the thread_map structure into a linked list and implement some
interfaces to expand and shrink it. To stay compatible with the existing
code, a thread can still be looked up by its index in the thread_map.
1) thread_map__append()
   Append a new thread to the thread_map by the new thread's pid.
2) thread_map__remove()
   Remove an existing thread from the thread_map by the thread's index in
   the thread_map.
3) thread_map__init()
   Allocate and initialize a thread_map. The thread_map is empty after
   this call; use thread_map__append() to insert threads.
4) thread_map__delete()
   Delete an existing thread_map.
5) thread_map__get_pid()
   Get a thread's pid by its index in the thread_map.
6) thread_map__get_idx_by_pid()
   Get a thread's index in the thread_map by its pid.
   When we receive a PERF_RECORD_EXIT event, we only know the pid of the
   exited thread.
7) thread_map__empty_thread_map()
   Return an empty thread_map that contains only a dummy thread.
   This function replaces the global variable empty_thread_map.
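
As a rough standalone sketch of the interface listed above, the following
keeps pids in a singly linked list while still allowing lookup by index.
The names (struct tmap, tmap_append(), and so on) are hypothetical and are not
the functions added by this patch. Returning 1 for a pid that is already
present is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical names; a sketch, not perf's struct thread_map. */
struct tnode {
	struct tnode *next;
	pid_t pid;
};

struct tmap {
	int nr;              /* number of threads currently in the list */
	struct tnode *head;
};

/* Return the index of pid in the map, or -1 if it is not there. */
int tmap_get_idx_by_pid(const struct tmap *m, pid_t pid)
{
	const struct tnode *n;
	int idx = 0;

	assert(m != NULL);
	for (n = m->head; n; n = n->next, idx++)
		if (n->pid == pid)
			return idx;
	return -1;
}

/* Append pid at the tail: 0 on success, 1 if present, -1 on ENOMEM. */
int tmap_append(struct tmap *m, pid_t pid)
{
	struct tnode **link = &m->head;
	struct tnode *n;

	if (tmap_get_idx_by_pid(m, pid) >= 0)
		return 1;
	n = calloc(1, sizeof(*n));
	if (n == NULL)
		return -1;
	n->pid = pid;
	while (*link)                /* keep existing indexes stable */
		link = &(*link)->next;
	*link = n;
	m->nr++;
	return 0;
}

/* Remove the thread at index idx; later threads shift down by one. */
int tmap_remove(struct tmap *m, int idx)
{
	struct tnode **link = &m->head;
	struct tnode *victim;

	if (idx < 0 || idx >= m->nr)
		return -1;
	while (idx-- > 0)
		link = &(*link)->next;
	victim = *link;
	*link = victim->next;
	free(victim);
	m->nr--;
	return 0;
}

/* Return the pid stored at index idx, or -1 if idx is out of range. */
pid_t tmap_get_pid(const struct tmap *m, int idx)
{
	const struct tnode *n = m->head;

	if (idx < 0 || idx >= m->nr)
		return -1;
	while (idx-- > 0)
		n = n->next;
	return n->pid;
}
```

On an exit event only the pid is known, so a caller would first map it to an
index with tmap_get_idx_by_pid() and then pass that index to tmap_remove().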

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-stat.c |2 +-
 tools/perf/tests/open-syscall-tp-fields.c |2 +-
 tools/perf/util/event.c   |   10 +-
 tools/perf/util/evlist.c  |2 +-
 tools/perf/util/evsel.c   |   16 +--
 tools/perf/util/python.c  |2 +-
 tools/perf/util/thread_map.c  |  210 +++--
 tools/perf/util/thread_map.h  |   19 ++-
 tools/perf/util/xyarray.c |4 +-
 9 files changed, 171 insertions(+), 96 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9984876..f5fe0da 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -401,7 +401,7 @@ static int __run_perf_stat(int argc __maybe_unused, const 
char **argv)
}
 
if (perf_target__none(&target))
-   evsel_list->threads->map[0] = child_pid;
+   thread_map__append(evsel_list->threads, child_pid);
 
/*
 * Wait for the child to be ready to exec.
diff --git a/tools/perf/tests/open-syscall-tp-fields.c 
b/tools/perf/tests/open-syscall-tp-fields.c
index 1c52fdc..39eb770 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -43,7 +43,7 @@ int test__syscall_open_tp_fields(void)
 
perf_evsel__config(evsel, &opts);
 
-   evlist->threads->map[0] = getpid();
+   thread_map__append(evlist->threads, getpid());
 
err = perf_evlist__open(evlist);
if (err < 0) {
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 5cd13d7..91d2848 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -327,8 +327,8 @@ int perf_event__synthesize_thread_map(struct perf_tool 
*tool,
err = 0;
for (thread = 0; thread < threads->nr; ++thread) {
if (__event__synthesize_thread(comm_event, mmap_event,
-  threads->map[thread], 0,
-  process, tool, machine)) {
+  thread_map__get_pid(threads,
+  thread), 0, process, tool,
+  machine)) {
err = -1;
break;
}
@@ -337,12 +337,14 @@ int perf_event__synthesize_thread_map(struct perf_tool 
*tool,
 * comm.pid is set to thread group id by
 * perf_event__synthesize_comm
 */
-   if ((int) comm_event->comm.pid != threads->map[thread]) {
+   if ((int) comm_event->comm.pid
+   != thread_map__get_pid(threads, thread)) {
bool need_leader = true;
 
/* is thread group leader in thread_map? */
for (j = 0; j < threads->nr; ++j) {
-   if ((int) comm_event->comm.pid == 
threads->map[j]) {
+   if ((int) comm_event->comm.pid
+   == thread_map__get_pid(threads, j)) {
need_leader = false;
break;

[PATCH v2 3/4] Transform mmap and other related structures to list with new xyarray

2013-02-26 Thread chenggang
From: chenggang 

evlist->mmap, evsel->id and evsel->sample_id are arrays. They cannot easily be
expanded or shrunk for forked and exited threads when we receive fork and exit
events.
We transform them into linked lists using the new xyarray.
xyarray is a 2-dimensional structure. The rows are still an array, and each
row represents a cpu.
Each column is a linked-list node, and a column represents a thread.

Some functions are implemented to expand and shrink mmap, id and sample_id as
well:
1) perf_evsel__append_id_thread()
   Append an id to an evsel when a new thread is detected.
2) perf_evsel__append_fd_thread()
   Append an fd to an evsel when a new thread is detected.
3) perf_evlist__append_mmap_thread()
   Append a new node to evlist->mmap when a new thread is detected.
4) perf_evsel__open_thread()
   Open the fd for the new thread with sys_perf_event_open.
5) perf_evsel__close_thread()
   Close the fd when a thread exits.
6) perf_evlist__mmap_thread()
   mmap a new thread's fd.
7) perf_evlist__munmap_thread()
   munmap an exited thread's fd.

The following macros can be used to reference a specific fd, id, mmap,
sample_id, etc.
1) FD(cpu, thread)
2) SID(cpu, thread)
3) ID(cpu, thread)
4) MMAP(cpu, thread)

evlist->pollfd is the argument to the poll() syscall, so it must remain an
array. We implement a function (perf_evlist__append_pollfd_thread) to expand
and shrink it.
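
Because poll() needs one contiguous array, growing the set means
reallocating it. The sketch below shows one way to do that in standalone C.
It uses realloc() rather than the allocate-and-copy approach in the patch, and
struct fdset and fdset_grow() are hypothetical names. The point to note is
that sizes passed to the allocator are in bytes, so the entry count is
multiplied by sizeof(struct pollfd).

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>

/* Hypothetical container; not perf's struct perf_evlist. */
struct fdset {
	struct pollfd *pfd;  /* contiguous array handed to poll() */
	size_t nr;           /* number of slots in pfd */
};

/*
 * Grow the array by 'extra' slots. Existing entries are preserved.
 * New slots get fd = -1, which poll() ignores until a real fd is set.
 * Returns 0 on success, -1 on allocation failure (set left unchanged).
 */
int fdset_grow(struct fdset *s, size_t extra)
{
	struct pollfd *p;
	size_t i;

	assert(s != NULL);
	if (extra == 0)
		return 0;

	/* Byte count is entries * sizeof(struct pollfd), not entries. */
	p = realloc(s->pfd, (s->nr + extra) * sizeof(*p));
	if (p == NULL)
		return -1;

	for (i = s->nr; i < s->nr + extra; i++) {
		p[i].fd = -1;
		p[i].events = 0;
		p[i].revents = 0;
	}
	s->pfd = p;
	s->nr += extra;
	return 0;
}
```

With realloc() the old block is released or reused by the allocator, so
there is no separate copy step and no old array left to free.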

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-record.c |6 +-
 tools/perf/util/evlist.c|  169 ++-
 tools/perf/util/evlist.h|6 +-
 tools/perf/util/evsel.c |   83 -
 tools/perf/util/evsel.h |8 +-
 tools/perf/util/header.c|   31 
 tools/perf/util/header.h|3 +-
 7 files changed, 263 insertions(+), 43 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 774c907..13112c6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -31,6 +31,8 @@
 #include 
 #include 
 
+#define MMAP(e, y) (*(struct perf_mmap *)xyarray__entry(e->mmap, 0, y))
+
 #ifndef HAVE_ON_EXIT
 #ifndef ATEXIT_MAX
 #define ATEXIT_MAX 32
@@ -367,8 +369,8 @@ static int perf_record__mmap_read_all(struct perf_record 
*rec)
int rc = 0;
 
for (i = 0; i < rec->evlist->nr_mmaps; i++) {
-   if (rec->evlist->mmap[i].base) {
-   if (perf_record__mmap_read(rec, &rec->evlist->mmap[i]) 
!= 0) {
+   if (MMAP(rec->evlist, i).base) {
+   if (perf_record__mmap_read(rec, &MMAP(rec->evlist, i)) 
!= 0) {
rc = -1;
goto out;
}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d5063d6..90cfbb6 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -25,6 +25,8 @@
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
+#define ID(e, y) (*(u64 *)xyarray__entry(e->id, 0, y))
+#define MMAP(e, y) (*(struct perf_mmap *)xyarray__entry(e->mmap, 0, y))
 
 void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
   struct thread_map *threads)
@@ -85,7 +87,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
-   free(evlist->mmap);
+   xyarray__delete(evlist->mmap);
free(evlist->pollfd);
evlist->mmap = NULL;
evlist->pollfd = NULL;
@@ -256,6 +258,32 @@ void perf_evlist__enable(struct perf_evlist *evlist)
}
 }
 
+/*
+ * If threads->nr > 1, the cpu_map__nr() must be 1.
+ * If the cpu_map__nr() > 1, we should not append pollfd.
+ */
+static int perf_evlist__append_pollfd_thread(struct perf_evlist *evlist)
+{
+   int new_nfds;
+
+   if (cpu_map__all(evlist->cpus)) {
+   struct pollfd *pfd;
+
+   new_nfds = evlist->threads->nr * evlist->nr_entries;
+   pfd = zalloc(sizeof(struct pollfd) * new_nfds);
+
+   if (!pfd)
+   return -1;
+
+   memcpy(pfd, evlist->pollfd, sizeof(struct pollfd) *
+   (evlist->threads->nr - 1) * evlist->nr_entries);
+
+   free(evlist->pollfd);
+   evlist->pollfd = pfd;
+   return 0;
+   }
+
+   return 1;
+}
+
 static int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
int nfds = cpu_map__nr(evlist->cpus) * evlist->threads->nr * 
evlist->nr_entries;
@@ -288,7 +316,7 @@ void perf_evlist__id_add(struct perf_evlist *evlist, struct 
perf_evsel *evsel,
 int cpu, int thread, u

[PATCH v2 4/4] Add fork and exit callback functions into top->perf_tool

2013-02-26 Thread chenggang
From: chenggang 

Many applications fork threads on the fly, and these threads may exit before
the main thread exits. The perf top tool should detect newly forked threads
while we profile a specific application.
If the target process forks a thread or a thread exits, we get a
PERF_RECORD_FORK or PERF_RECORD_EXIT event. The following callback functions
process these events.
1) perf_top__process_event_fork()
   Open a new fd for the newly forked thread, and expand the related data
   structures.
2) perf_top__process_event_exit()
   Close the fd of the exited thread, and destroy the nodes in the related
   data structures.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-top.c |  100 +-
 tools/perf/util/evlist.c |   30 ++---
 tools/perf/util/evsel.c  |   13 +++---
 tools/perf/util/thread_map.c |   13 ++
 tools/perf/util/thread_map.h |3 --
 5 files changed, 133 insertions(+), 26 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 72f6eb7..94aab11 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -806,7 +806,7 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
int idx)
struct perf_evsel *evsel;
struct perf_session *session = top->session;
union perf_event *event;
-   struct machine *machine;
+   struct machine *machine = NULL;
u8 origin;
int ret;
 
@@ -825,6 +825,20 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
int idx)
if (event->header.type == PERF_RECORD_SAMPLE)
++top->samples;
 
+   if (cpu_map__all(top->evlist->cpus) &&
+   event->header.type == PERF_RECORD_FORK)
+   (&top->tool)->fork(&top->tool, event, &sample, machine);
+
+   if (cpu_map__all(top->evlist->cpus) &&
+   event->header.type == PERF_RECORD_EXIT) {
+   int tidx;
+
+   tidx = (&top->tool)->exit(&top->tool, event,
+   &sample, machine);
+   if (tidx == idx)
+   break;
+   }
+
switch (origin) {
case PERF_RECORD_MISC_USER:
++top->us_samples;
@@ -1024,11 +1038,95 @@ parse_callchain_opt(const struct option *opt, const 
char *arg, int unset)
return record_parse_callchain_opt(opt, arg, unset);
 }
 
+static int perf_top__append_thread(struct perf_top *top, int tidx)
+{
+   struct perf_evsel *counter;
+   struct perf_evlist *evlist = top->evlist;
+   struct cpu_map *cpus = evlist->cpus;
+
+   list_for_each_entry(counter, &evlist->entries, node)
+   if (perf_evsel__open_thread(counter, cpus, evlist->threads, 
tidx) < 0) {
+   printf("errno: %d\n", errno);
+   return -1;
+   }
+
+   if (perf_evlist__mmap_thread(evlist, false, tidx) < 0)
+   return -1;
+
+   return 0;
+}
+
+static int perf_top__process_event_fork(struct perf_tool *tool __maybe_unused,
+   union perf_event *event __maybe_unused,
+   struct perf_sample *sample 
__maybe_unused,
+   struct machine *machine __maybe_unused)
+{
+   pid_t tid = event->fork.tid;
+   pid_t ptid = event->fork.ptid;
+   struct perf_top *top = container_of(tool, struct perf_top, tool);
+   struct thread_map *threads = top->evlist->threads;
+   struct perf_evsel *evsel;
+   int i, ret;
+
+   if (!cpu_map__all(top->evlist->cpus))
+   return -1;
+
+   ret = thread_map__append(threads, tid);
+   if (ret == 1)
+   return ret;
+   if (ret == -1)
+   return ret;
+
+   for(i = 0; i < threads->nr; i++) {
+   if (ptid == thread_map__get_pid(threads, i)) {
+   if (perf_top__append_thread(top, threads->nr - 1) < 0)
+   goto free_new_thread;
+   break;
+   }
+   }
+
+   return 0;
+
+free_new_thread:
+   list_for_each_entry(evsel, &top->evlist->entries, node)
+   perf_evsel__close_thread(evsel, top->evlist->cpus->nr, 
threads->nr - 1);
+   thread_map__remove(threads, threads->nr - 1);
+   return -1;
+}
+
+static int perf_top__process_event_exit(struct perf_tool *tool __maybe_unused,
+   union perf_e

[PATCH v2 4/4] Add fork and exit callback functions into top->perf_tool

2013-02-26 Thread chenggang
From: chenggang 

Many applications will fork threads on-the-fly, these threads could exit before
the main thread exit. The perf top tool should perceive the new forked threads
while we profile a special application.
If the target process fork a thread or a thread exit, we will get a 
PERF_RECORD_FORK
 or PERF_RECORD_EXIT events. The following callback functions can process these 
events.
1) perf_top__process_event_fork()
   Open a new fd for the new forked, and expend the related data structures.
2) perf_top__process_event_exit()
   Close the fd of exit threadsd, and destroy the nodes in the related data 
structures.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Cc: linux-kernel 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-top.c |  100 +-
 tools/perf/util/evlist.c |   30 ++---
 tools/perf/util/evsel.c  |   13 +++---
 tools/perf/util/thread_map.c |   13 ++
 tools/perf/util/thread_map.h |3 --
 5 files changed, 133 insertions(+), 26 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 72f6eb7..94aab11 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -806,7 +806,7 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
int idx)
struct perf_evsel *evsel;
struct perf_session *session = top->session;
union perf_event *event;
-   struct machine *machine;
+   struct machine *machine = NULL;
u8 origin;
int ret;
 
@@ -825,6 +825,20 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
int idx)
if (event->header.type == PERF_RECORD_SAMPLE)
++top->samples;
 
+   if (cpu_map__all(top->evlist->cpus) &&
+   event->header.type == PERF_RECORD_FORK)
+   (&top->tool)->fork(&top->tool, event, &sample, machine);
+
+   if (cpu_map__all(top->evlist->cpus) &&
+   event->header.type == PERF_RECORD_EXIT) {
+   int tidx;
+
+   tidx = (&top->tool)->exit(&top->tool, event,
+   &sample, machine);
+   if (tidx == idx)
+   break;
+   }
+
switch (origin) {
case PERF_RECORD_MISC_USER:
++top->us_samples;
@@ -1024,11 +1038,95 @@ parse_callchain_opt(const struct option *opt, const 
char *arg, int unset)
return record_parse_callchain_opt(opt, arg, unset);
 }
 
+static int perf_top__append_thread(struct perf_top *top, int tidx)
+{
+   struct perf_evsel *counter;
+   struct perf_evlist *evlist = top->evlist;
+   struct cpu_map *cpus = evlist->cpus;
+
+   list_for_each_entry(counter, &evlist->entries, node)
+   if (perf_evsel__open_thread(counter, cpus, evlist->threads, 
tidx) < 0) {
+   printf("errno: %d\n", errno);
+   return -1;
+   }
+
+   if (perf_evlist__mmap_thread(evlist, false, tidx) < 0)
+   return -1;
+
+   return 0;
+}
+
+static int perf_top__process_event_fork(struct perf_tool *tool __maybe_unused,
+   union perf_event *event __maybe_unused,
+   struct perf_sample *sample 
__maybe_unused,
+   struct machine *machine __maybe_unused)
+{
+   pid_t tid = event->fork.tid;
+   pid_t ptid = event->fork.ptid;
+   struct perf_top *top = container_of(tool, struct perf_top, tool);
+   struct thread_map *threads = top->evlist->threads;
+   struct perf_evsel *evsel;
+   int i, ret;
+
+   if (!cpu_map__all(top->evlist->cpus))
+   return -1;
+
+   ret = thread_map__append(threads, tid);
+   if (ret == 1)
+   return ret;
+   if (ret == -1)
+   return ret;
+
+   for(i = 0; i < threads->nr; i++) {
+   if (ptid == thread_map__get_pid(threads, i)) {
+   if (perf_top__append_thread(top, threads->nr - 1) < 0)
+   goto free_new_thread;
+   break;
+   }
+   }
+
+   return 0;
+
+free_new_thread:
+   list_for_each_entry(evsel, &top->evlist->entries, node)
+   perf_evsel__close_thread(evsel, top->evlist->cpus->nr, 
threads->nr - 1);
+   thread_map__remove(threads, threads->nr - 1);
+   return -1;
+}
+
+static int perf_top__process_event_exit(struct perf_tool *tool __maybe_unused,
+  

[PATCH v2 3/4] Transform mmap and other related structures to list with new xyarray

2013-02-26 Thread chenggang
From: chenggang 

evlist->mmap, evsel->id, evsel->sample_id are arrays. They cannot be expended or
shrinked easily for the forked and exited threads while we get the fork and exit
events.
We transfromed them to linked list with the new xyarray.
xyarray is a 2-dimensional structure. The row is a array still, and a row 
represents a cpu.
The column is a linked list, and a column represents a thread.

Some functions are implemented to expand and shrink the mmap, id and sample_id 
too.
1) perf_evsel__append_id_thread()
   Append a id for a evsel while a new thread is perceived.
2) perf_evsel__append_fd_thread()
   Append a fd for a evsel while a new thread is perceived.
3) perf_evlist__append_mmap_thread()
   Append a new node into evlist->mmap while a new thread is perceived.
3) perf_evsel__open_thread()
   Open the fd for the new thread with sys_perf_event_open.
4) perf_evsel__close_thread()
   Close the fd while a thread exit.
5) perf_evlist__mmap_thread()
   mmap a new thread's fd.
6) perf_evlist__munmap_thread()
   unmmap a exit thread's fd.

The following macros can be used to reference a special fd, id, mmap, sample_id 
etc.
1) FD(cpu, thread)
2) SID(cpu, thread)
3) ID(cpu, thread)
4) MMAP(cpu, thread)

evlist->pollfd is the parameter of syscall poll(), it must be a array. But we
implement a function (perf_evlist__append_pollfd_thread) to expand and shrink 
it.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Cc: linux-kernel 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-record.c |6 +-
 tools/perf/util/evlist.c|  169 ++-
 tools/perf/util/evlist.h|6 +-
 tools/perf/util/evsel.c |   83 -
 tools/perf/util/evsel.h |8 +-
 tools/perf/util/header.c|   31 
 tools/perf/util/header.h|3 +-
 7 files changed, 263 insertions(+), 43 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 774c907..13112c6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -31,6 +31,8 @@
 #include 
 #include 
 
+#define MMAP(e, y) (*(struct perf_mmap *)xyarray__entry(e->mmap, 0, y))
+
 #ifndef HAVE_ON_EXIT
 #ifndef ATEXIT_MAX
 #define ATEXIT_MAX 32
@@ -367,8 +369,8 @@ static int perf_record__mmap_read_all(struct perf_record 
*rec)
int rc = 0;
 
for (i = 0; i < rec->evlist->nr_mmaps; i++) {
-   if (rec->evlist->mmap[i].base) {
-   if (perf_record__mmap_read(rec, &rec->evlist->mmap[i]) 
!= 0) {
+   if (MMAP(rec->evlist, i).base) {
+   if (perf_record__mmap_read(rec, &MMAP(rec->evlist, i)) 
!= 0) {
rc = -1;
goto out;
}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d5063d6..90cfbb6 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -25,6 +25,8 @@
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
+#define ID(e, y) (*(u64 *)xyarray__entry(e->id, 0, y))
+#define MMAP(e, y) (*(struct perf_mmap *)xyarray__entry(e->mmap, 0, y))
 
 void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
   struct thread_map *threads)
@@ -85,7 +87,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
-   free(evlist->mmap);
+   xyarray__delete(evlist->mmap);
free(evlist->pollfd);
evlist->mmap = NULL;
evlist->pollfd = NULL;
@@ -256,6 +258,32 @@ void perf_evlist__enable(struct perf_evlist *evlist)
}
 }
 
+/*
+ * If threads->nr > 1, the cpu_map__nr() must be 1.
+ * If the cpu_map__nr() > 1, we should not append pollfd.
+ */
+static int perf_evlist__append_pollfd_thread(struct perf_evlist *evlist)
+{
+   int new_nfds;
+
+   if (cpu_map__all(evlist->cpus)) {
+   struct pollfd *pfd;
+
+   new_nfds = evlist->threads->nr * evlist->nr_entries;
+   pfd = zalloc(sizeof(struct pollfd) * new_nfds);
+
+   if (!pfd)
+   return -1;
+
+   memcpy(pfd, evlist->pollfd, (evlist->threads->nr - 1) * 
evlist->nr_entries);
+
+   evlist->pollfd = pfd;
+   return 0;
+   }
+
+   return 1;
+}
+
 static int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
int nfds = cpu_map__nr(evlist->cpus) * evlist->threads->nr * 
evlist->nr_entries;
@@ -288,7 +316,7 @@ void perf_evlist__id_add(struct perf_evlist *evlist, struct 
perf_evsel *evsel,
 int 

[PATCH v2 2/4] Transform thread_map to linked list

2013-02-26 Thread chenggang
From: chenggang 

The size of thread_map is fixed at initialized phase according to the
files in /proc/{$pid}. It cannot be expanded and shrinked easily while we
want to response the thread fork and exit events.
We transform the thread_map structure to a linked list, and implement some
interfaces to expend and shrink it. In order to improve compatibility with
the existing code, we can get a thread by its index in the thread_map also.
1) thread_map__append()
   Append a new thread into thread_map according to new thread's pid.
2) thread_map__remove()
   Remove a exist thread from thread_map according to the index of the
   thread in thread_map.
3) thread_map__init()
   Allocate a thread_map, and initialize it. But the thread_map is empty after
   we called this function. We should call thread_map__append() to insert
   threads.
4) thread_map__delete()
   Delete a exist thread_map.
5) thread_map__get_pid()
   Got a thread's pid by its index in the thread_map.
6) thread_map__get_idx_by_pid()
   Got a thread's index in the thread_map according to its pid.
   While we got a PERF_RECORD_EXIT event, we only know the pid of the exited 
thread.
7) thread_map__empty_thread_map()
   Return a empty thread_map, there is only a dumb thread in it.
   This function is used to instead of the global varible empty_thread_map.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Cc: linux-kernel 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-stat.c |2 +-
 tools/perf/tests/open-syscall-tp-fields.c |2 +-
 tools/perf/util/event.c   |   10 +-
 tools/perf/util/evlist.c  |2 +-
 tools/perf/util/evsel.c   |   16 +--
 tools/perf/util/python.c  |2 +-
 tools/perf/util/thread_map.c  |  210 +++--
 tools/perf/util/thread_map.h  |   19 ++-
 tools/perf/util/xyarray.c |4 +-
 9 files changed, 171 insertions(+), 96 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9984876..f5fe0da 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -401,7 +401,7 @@ static int __run_perf_stat(int argc __maybe_unused, const 
char **argv)
}
 
if (perf_target__none(&target))
-   evsel_list->threads->map[0] = child_pid;
+   thread_map__append(evsel_list->threads, child_pid);
 
/*
 * Wait for the child to be ready to exec.
diff --git a/tools/perf/tests/open-syscall-tp-fields.c 
b/tools/perf/tests/open-syscall-tp-fields.c
index 1c52fdc..39eb770 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -43,7 +43,7 @@ int test__syscall_open_tp_fields(void)
 
perf_evsel__config(evsel, &opts);
 
-   evlist->threads->map[0] = getpid();
+   thread_map__append(evlist->threads, getpid());
 
err = perf_evlist__open(evlist);
if (err < 0) {
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 5cd13d7..91d2848 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -327,8 +327,8 @@ int perf_event__synthesize_thread_map(struct perf_tool 
*tool,
err = 0;
for (thread = 0; thread < threads->nr; ++thread) {
if (__event__synthesize_thread(comm_event, mmap_event,
-  threads->map[thread], 0,
-  process, tool, machine)) {
+  thread_map__get_pid(threads,
+  thread), 0, process, tool,
+  machine)) {
err = -1;
break;
}
@@ -337,12 +337,14 @@ int perf_event__synthesize_thread_map(struct perf_tool 
*tool,
 * comm.pid is set to thread group id by
 * perf_event__synthesize_comm
 */
-   if ((int) comm_event->comm.pid != threads->map[thread]) {
+   if ((int) comm_event->comm.pid
+   != thread_map__get_pid(threads, thread)) {
bool need_leader = true;
 
/* is thread group leader in thread_map? */
for (j = 0; j < threads->nr; ++j) {
-   if ((int) comm_event->comm.pid == 
threads->map[j]) {
+   if ((int) comm_event->comm.pid
+   == thread_map__get_pid(threads, thread)) {
need_leader = false;
b

[PATCH v2 1/4] Transform xyarray to linked list

2013-02-26 Thread chenggang
From: chenggang 

The 2-dimensional array cannot expand and shrink easily while we want to
response the thread's fork and exit events on-the-fly.
We transform xyarray to a 2-demesional linked list. The row is still a array,
but column is implemented as a list. The number of nodes in every row are same.
The interface to append and shrink a exist xyarray is provided.
1) xyarray__append()
   append a column for all rows.
2) xyarray__remove()
   remove a column for all rows.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Cc: linux-kernel 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/xyarray.c |   85 +
 tools/perf/util/xyarray.h |   25 +++--
 2 files changed, 101 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c
index 22afbf6..fc48bda 100644
--- a/tools/perf/util/xyarray.c
+++ b/tools/perf/util/xyarray.c
@@ -1,20 +1,93 @@
 #include "xyarray.h"
 #include "util.h"
 
-struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
+/*
+ * Add a column for all rows;
+ */
+int xyarray__append(struct xyarray *xy)
 {
-   size_t row_size = ylen * entry_size;
-   struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);
+   struct xyentry *new_entry;
+   unsigned int x;
+
+   for (x = 0; x < xy->row_count; x++) {
+   new_entry = zalloc(sizeof(*new_entry));
+   if (new_entry == NULL)
+   return -1;
+
+   new_entry->contents = zalloc(xy->entry_size);
+   if (new_entry->contents == NULL)
+   return -1;
 
-   if (xy != NULL) {
-   xy->entry_size = entry_size;
-   xy->row_size   = row_size;
+   list_add_tail(&new_entry->next, &xy->rows[x].head);
}
 
+   return 0;
+}
+
+struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
+{
+   struct xyarray *xy = zalloc(sizeof(*xy) + xlen * sizeof(struct row));
+   int i;
+
+   if (xy == NULL)
+   return NULL;
+
+   xy->row_count = xlen;
+   xy->entry_size = entry_size;
+
+   for (i = 0; i < xlen; i++)
+   INIT_LIST_HEAD(&xy->rows[i].head);
+
+   for (i = 0; i < ylen; i++)
+   if (xyarray__append(xy) < 0) {
+   xyarray__delete(xy);
+   return NULL;
+   }
+
return xy;
 }
 
+/*
+ * remove a column for all rows;
+ */
+int xyarray__remove(struct xyarray *xy, int y)
+{
+   struct xyentry *entry;
+   unsigned int x;
+   int count;
+
+   if (!xy)
+   return 0;
+
+   for (x = 0; x < xy->row_count; x++) {
+   count = 0;
+   list_for_each_entry(entry, &xy->rows[x].head, next)
+   if (count++ == y) {
+   list_del(&entry->next);
+   free(entry);
+   return 0;
+   }
+   }
+
+   return -1;
+}
+
+/*
+ * All nodes in every rows should be deleted before delete @xy.
+ */
 void xyarray__delete(struct xyarray *xy)
 {
+   unsigned int i;
+   struct xyentry *entry;
+
+   if (!xy)
+   return;
+
+   for (i = 0; i < xy->row_count; i++)
+   list_for_each_entry(entry, &xy->rows[i].head, next) {
+   list_del(&entry->next);
+   free(entry);
+   }
+
free(xy);
 }
diff --git a/tools/perf/util/xyarray.h b/tools/perf/util/xyarray.h
index c488a07..07fa370 100644
--- a/tools/perf/util/xyarray.h
+++ b/tools/perf/util/xyarray.h
@@ -2,19 +2,38 @@
 #define _PERF_XYARRAY_H_ 1
 
 #include 
+#include 
+
+struct row {
+   struct list_head head;
+};
+
+struct xyentry {
+   struct list_head next;
+   char *contents;
+};
 
 struct xyarray {
-   size_t row_size;
+   size_t row_count;
size_t entry_size;
-   char contents[];
+   struct row rows[];
 };
 
 struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size);
 void xyarray__delete(struct xyarray *xy);
+int xyarray__append(struct xyarray *xy);
+int xyarray__remove(struct xyarray *xy, int y);
 
 static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
 {
-   return &xy->contents[x * xy->row_size + y * xy->entry_size];
+   struct xyentry *entry;
+   int columns = 0;
+
+   list_for_each_entry(entry, &xy->rows[x].head, next)
+   if (columns++ == y)
+   return entry->contents;
+
+   return NULL;
 }
 
 #endif /* _PERF_XYARRAY_H_ */
-- 
1.7.9.5


[PATCH v2 0/4] perf: Make the 'perf top -p $pid' can perceive the new forked threads.

2013-02-26 Thread chenggang
From: chenggang@taobao.com

This patch set makes 'perf top -p $pid' aware of the threads that the target
process forks after perf has started. 'perf top -p $pid' and
'perf record -p $pid' already see the threads that were forked before perf
was executed, but they cannot see the threads that are forked afterwards.
This is an important defect in perf, because many applications fork new
threads on the fly.
For performance reasons, the event inherit mechanism cannot be used with
per-task counters. Some internal data structures, such as thread_map,
evlist->mmap, evsel->fd, evsel->id and evsel->sample_id, are implemented as
arrays during the initialization phase. Their size is fixed, so they cannot
easily be extended or shrunk when we want to adjust them for newly forked
threads and exited threads.

So, we have done the following work:
1) Transformed xyarray to a linked list.
   Implemented the interfaces to extend and shrink an existing xyarray.
   The xyarray is a 2-dimensional structure. The rows are still an array
   (because the number of CPUs is fixed), and the columns are linked lists.
2) Transformed evlist->mmap, evsel->fd, evsel->id and evsel->sample_id to
   lists built on the new xyarray.
   Implemented interfaces to expand and shrink these structures.
   The nodes in these structures can be referenced through predefined macros
   such as FD(cpu, thread), MMAP(cpu, thread), ID(cpu, thread), etc.
3) Transformed thread_map to a linked list.
   Implemented the interfaces to extend and shrink an existing thread_map.
4) Added 2 callback functions to top->perf_tool. They are called when
   PERF_RECORD_FORK and PERF_RECORD_EXIT events are received.
   When a PERF_RECORD_FORK event is received, all related data structures
   are expanded, and a new fd and mmap are opened.
   When a PERF_RECORD_EXIT event is received, the nodes in the related data
   structures are removed, and the fd and mmap are closed.
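
As an illustration of the layout described in item 1, here is a minimal,
self-contained sketch of a 2-dimensional structure whose rows are a fixed
array and whose columns are linked lists. It is not the code from this patch
set: the names grid, col_node, grid_new, grid_append_column, grid_entry and
grid_free are hypothetical, and a plain singly linked list stands in for the
kernel-style list_head.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the patch's structures, not the perf code. */
struct col_node {
	struct col_node *next;
	void *contents;			/* one zeroed entry of entry_size bytes */
};

struct grid {
	size_t row_count;		/* fixed, e.g. one row per CPU */
	size_t entry_size;
	struct col_node *rows[];	/* head of each row's column list */
};

static struct grid *grid_new(size_t row_count, size_t entry_size)
{
	struct grid *g = calloc(1, sizeof(*g) + row_count * sizeof(g->rows[0]));

	if (g == NULL)
		return NULL;
	g->row_count = row_count;
	g->entry_size = entry_size;
	return g;
}

/*
 * Append one column to every row. Returns 0 on success, -1 on allocation
 * failure (in which case the rows may differ in length).
 */
static int grid_append_column(struct grid *g)
{
	for (size_t x = 0; x < g->row_count; x++) {
		struct col_node **pp = &g->rows[x];
		struct col_node *n = calloc(1, sizeof(*n));

		if (n == NULL)
			return -1;
		n->contents = calloc(1, g->entry_size);
		if (n->contents == NULL) {
			free(n);
			return -1;
		}
		while (*pp != NULL)	/* walk to the tail of this row */
			pp = &(*pp)->next;
		*pp = n;
	}
	return 0;
}

/* Return entry (x, y), or NULL if either index is out of range. */
static void *grid_entry(struct grid *g, size_t x, size_t y)
{
	struct col_node *n;

	if (x >= g->row_count)
		return NULL;
	for (n = g->rows[x]; n != NULL && y > 0; y--)
		n = n->next;
	return n != NULL ? n->contents : NULL;
}

/* Free every node, every node's contents, and the grid itself. */
static void grid_free(struct grid *g)
{
	if (g == NULL)
		return;
	for (size_t x = 0; x < g->row_count; x++) {
		struct col_node *n = g->rows[x];

		while (n != NULL) {
			struct col_node *next = n->next; /* save before freeing */

			free(n->contents);
			free(n);
			n = next;
		}
	}
	free(g);
}
```

The trade-off in this sketch is that grid_entry() costs O(y) per lookup
instead of O(1), in exchange for columns that can be added without
reallocating the whole structure.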

A linked list is flexible: list_add and list_del are easy to use.
Additionally, the performance penalty (especially in CPU utilization) is low.

This function is already implemented for 'perf top -p $pid' in patch
[4/4] of this patch set. As a next step, 'perf record -p $pid' should be
modified in the same way.

Thanks to David Ahern for his suggestion.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Cc: linux-kernel 
Signed-off-by: Chenggang Qin 

chenggang (4):
  Transform xyarray to linked list.
  Transform thread_map to linked list.
  Transform mmap and other related structures to list with new xyarray.
  Add fork and exit callback functions into top->perf_tool.

 tools/perf/builtin-record.c   |6 +-
 tools/perf/builtin-stat.c |2 +-
 tools/perf/builtin-top.c  |  100 -
 tools/perf/tests/open-syscall-tp-fields.c |2 +-
 tools/perf/util/event.c   |   10 +-
 tools/perf/util/evlist.c  |  171 +++---
 tools/perf/util/evlist.h  |6 +-
 tools/perf/util/evsel.c   |   98 +++--
 tools/perf/util/evsel.h   |8 +-
 tools/perf/util/header.c  |   31 ++--
 tools/perf/util/header.h  |3 +-
 tools/perf/util/python.c  |2 +-
 tools/perf/util/thread_map.c  |  223 +++--
 tools/perf/util/thread_map.h  |   16 ++-
 tools/perf/util/xyarray.c |   85 ++-
 tools/perf/util/xyarray.h |   25 +++-
 16 files changed, 641 insertions(+), 147 deletions(-)

-- 
1.7.9.5



[PATCH v3 8/8]Perf: Add some callback functions to process fork & exit events

2013-03-13 Thread chenggang
From: chenggang 

Many applications fork threads on the fly, and these threads may exit before
the main thread exits. The perf top tool should notice newly forked threads
while it profiles a specific application.
If the target process forks a thread, or a thread exits, we get a
PERF_RECORD_FORK or PERF_RECORD_EXIT event. The following callback functions
process these events.
1) perf_top__process_event_fork()
   Open a new fd for the newly forked thread, and extend the related data
   structures.
2) perf_top__process_event_exit()
   Close the fd of the exited thread, and destroy the nodes in the related
   data structures.
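
The dispatch pattern described above can be sketched in isolation as follows.
This is only an illustration under stated assumptions, not the perf
implementation: the types event, tracker and tool, the EV_* constants and the
handler names are all hypothetical, and a fixed array of thread ids stands in
for the fd, mmap and id structures that the real callbacks adjust.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical event model; real perf events carry much more data. */
enum ev_type { EV_SAMPLE, EV_FORK, EV_EXIT };

struct event {
	enum ev_type type;
	pid_t tid;
};

#define MAX_TRACKED 64

struct tracker {
	pid_t tids[MAX_TRACKED];
	size_t nr;
};

struct tool {
	int (*fork)(struct tracker *t, const struct event *ev);
	int (*exit)(struct tracker *t, const struct event *ev);
};

/* Start tracking a new thread. Real code would open an fd and mmap here. */
static int handle_fork(struct tracker *t, const struct event *ev)
{
	if (t->nr == MAX_TRACKED)
		return -1;
	t->tids[t->nr++] = ev->tid;
	return 0;
}

/*
 * Stop tracking an exited thread. Returns the index it occupied, or -1 if
 * the thread was not tracked. Real code would close the fd and munmap here.
 */
static int handle_exit(struct tracker *t, const struct event *ev)
{
	for (size_t i = 0; i < t->nr; i++) {
		if (t->tids[i] != ev->tid)
			continue;
		for (size_t j = i + 1; j < t->nr; j++)
			t->tids[j - 1] = t->tids[j];
		t->nr--;
		return (int)i;
	}
	return -1;
}

/* Route fork and exit events to the callbacks; ignore everything else. */
static int dispatch(const struct tool *tool, struct tracker *t,
		    const struct event *ev)
{
	switch (ev->type) {
	case EV_FORK:
		return tool->fork(t, ev);
	case EV_EXIT:
		return tool->exit(t, ev);
	default:
		return 0;
	}
}
```

Returning the removed index from the exit handler lets the caller notice that
the structure it is currently iterating over has just been torn down, which
is the situation the tidx == idx check in the diff below guards against.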

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-top.c |  109 +-
 1 file changed, 107 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index cff58e5..a591b96 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -800,7 +800,8 @@ static void perf_event__process_sample(struct perf_tool 
*tool,
return;
 }
 
-static void perf_top__mmap_read_idx(struct perf_top *top, struct perf_mmap *md)
+static int perf_top__mmap_read_idx(struct perf_top *top, struct perf_mmap *md,
+   int idx)
 {
struct perf_sample sample;
struct perf_evsel *evsel;
@@ -825,6 +826,20 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
struct perf_mmap *md)
if (event->header.type == PERF_RECORD_SAMPLE)
++top->samples;
 
+   if (cpu_map__all(top->evlist->cpus) &&
+   event->header.type == PERF_RECORD_FORK)
+   (&top->tool)->fork(&top->tool, event, &sample, NULL);
+
+   if (cpu_map__all(top->evlist->cpus) &&
+   event->header.type == PERF_RECORD_EXIT) {
+   int tidx;
+
+   tidx = (&top->tool)->exit(&top->tool, event,
+ &sample, NULL);
+   if (tidx == idx)
+   return -1;
+   }
+
switch (origin) {
case PERF_RECORD_MISC_USER:
++top->us_samples;
@@ -863,14 +878,18 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
struct perf_mmap *md)
} else
++session->stats.nr_unknown_events;
}
+   return 0;
 }
 
 static void perf_top__mmap_read(struct perf_top *top)
 {
struct perf_mmap *md;
+   int i = 0;
 
for_each_mmap(md, top->evlist) {
-   perf_top__mmap_read_idx(top, md);
+   if (perf_top__mmap_read_idx(top, md, i) == -1)
+   break;
+   i++;
}
 }
 
@@ -1025,11 +1044,97 @@ parse_callchain_opt(const struct option *opt, const 
char *arg, int unset)
return record_parse_callchain_opt(opt, arg, unset);
 }
 
+static int perf_top__append_thread(struct perf_top *top, pid_t pid)
+{
+   char msg[512];
+   struct perf_evsel *counter, *counter_err;
+   struct perf_evlist *evlist = top->evlist;
+   struct cpu_map *cpus = evlist->cpus;
+
+   counter_err = list_entry(evlist->entries.prev, struct perf_evsel, node);
+
+   list_for_each_entry(counter, &evlist->entries, node) {
+   if (perf_evsel__open_single_thread(counter, cpus, pid) < 0) {
+   if (verbose) {
+   perf_evsel__open_strerror(counter,
+ 
&top->record_opts.target,
+ errno, msg, 
sizeof(msg));
+   ui__warning("%s\n", msg);
+   }
+   counter_err = counter;
+   goto close_opened_fd;
+   }
+   }
+
+   if (perf_evlist__mmap_thread(evlist, false) < 0)
+   goto close_opened_fd;
+
+   return 0;
+
+close_opened_fd:
+   list_for_each_entry(counter, &evlist->entries, node) {
+   perf_evsel__close_single_thread(counter, cpus->nr, -1);
+   if (counter == counter_err)
+   break;
+   }
+   return -1;
+}
+
+static int perf_top__process_event_fork(struct perf_tool *tool __maybe_unused,
+   union perf_event *event __maybe_unused,
+   struct perf_sample *sample 
__maybe_unused,
+   struct machine *machine __maybe_unused)
+{
+   pid_t tid =

[PATCH v3 7/8]Perf: changed the method to traverse mmap list

2013-03-13 Thread chenggang
From: chenggang 

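Change the method used to traverse the evlist->mmap list. The list is
traversed very frequently, so the traversal needs to be efficient. Instead
of looking up each entry by index, callers now walk the list with
for_each_mmap() and pass the perf_mmap pointer to perf_evlist__mmap_read().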
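
To show why the traversal method matters on a linked list, here is a small
sketch that counts the work done by the two approaches. It is an
illustration only: mmap_node, get_by_index, for_each_node, hops_by_index and
visits_by_pointer are hypothetical names, and the list is a plain singly
linked list rather than perf's own structures.

```c
#include <stddef.h>

/* Hypothetical list node standing in for one mmap entry. */
struct mmap_node {
	struct mmap_node *next;
	int id;
};

/* Look up the idx-th node from the head, counting each link followed. */
static struct mmap_node *get_by_index(struct mmap_node *head, size_t idx,
				      size_t *hops)
{
	struct mmap_node *n = head;

	while (n != NULL && idx > 0) {
		n = n->next;
		(*hops)++;
		idx--;
	}
	return n;
}

/* Walk the list once, node by node. */
#define for_each_node(pos, head) \
	for ((pos) = (head); (pos) != NULL; (pos) = (pos)->next)

/* Visit all nodes through index lookups; returns total links followed. */
static size_t hops_by_index(struct mmap_node *head, size_t count)
{
	size_t hops = 0;

	for (size_t i = 0; i < count; i++)
		(void)get_by_index(head, i, &hops);
	return hops;
}

/* Visit all nodes through the iterator; returns the number of visits. */
static size_t visits_by_pointer(struct mmap_node *head)
{
	struct mmap_node *pos;
	size_t visits = 0;

	for_each_node(pos, head)
		visits++;
	return visits;
}
```

For n nodes, the index-based loop follows n * (n - 1) / 2 links in total,
while the pointer-based loop touches each node once, so the difference grows
quickly as threads are added.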

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-top.c  |   11 ++-
 tools/perf/tests/mmap-basic.c |4 +++-
 tools/perf/tests/open-syscall-tp-fields.c |7 ---
 tools/perf/tests/perf-record.c|7 ---
 tools/perf/util/evlist.c  |4 ++--
 tools/perf/util/evlist.h  |3 ++-
 tools/perf/util/python.c  |4 +++-
 7 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 72f6eb7..cff58e5 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -800,7 +800,7 @@ static void perf_event__process_sample(struct perf_tool 
*tool,
return;
 }
 
-static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
+static void perf_top__mmap_read_idx(struct perf_top *top, struct perf_mmap *md)
 {
struct perf_sample sample;
struct perf_evsel *evsel;
@@ -810,7 +810,7 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
int idx)
u8 origin;
int ret;
 
-   while ((event = perf_evlist__mmap_read(top->evlist, idx)) != NULL) {
+   while ((event = perf_evlist__mmap_read(top->evlist, md)) != NULL) {
ret = perf_evlist__parse_sample(top->evlist, event, &sample);
if (ret) {
pr_err("Can't parse sample, err = %d\n", ret);
@@ -867,10 +867,11 @@ static void perf_top__mmap_read_idx(struct perf_top *top, 
int idx)
 
 static void perf_top__mmap_read(struct perf_top *top)
 {
-   int i;
+   struct perf_mmap *md;
 
-   for (i = 0; i < top->evlist->nr_mmaps; i++)
-   perf_top__mmap_read_idx(top, i);
+   for_each_mmap(md, top->evlist) {
+   perf_top__mmap_read_idx(top, md);
+   }
 }
 
 static int perf_top__start_counters(struct perf_top *top)
diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index cdd5075..93639a8 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -19,6 +19,7 @@ int test__basic_mmap(void)
 {
int err = -1;
union perf_event *event;
+   struct perf_mmap *md;
struct thread_map *threads;
struct cpu_map *cpus;
struct perf_evlist *evlist;
@@ -97,7 +98,8 @@ int test__basic_mmap(void)
++foo;
}
 
-   while ((event = perf_evlist__mmap_read(evlist, 0)) != NULL) {
+   md = perf_evlist__get_mmap(evlist, 0);
+   while ((event = perf_evlist__mmap_read(evlist, md)) != NULL) {
struct perf_sample sample;
 
if (event->header.type != PERF_RECORD_SAMPLE) {
diff --git a/tools/perf/tests/open-syscall-tp-fields.c 
b/tools/perf/tests/open-syscall-tp-fields.c
index 39eb770..cb12e82 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -20,7 +20,7 @@ int test__syscall_open_tp_fields(void)
int flags = O_RDONLY | O_DIRECTORY;
struct perf_evlist *evlist = perf_evlist__new(NULL, NULL);
struct perf_evsel *evsel;
-   int err = -1, i, nr_events = 0, nr_polls = 0;
+   int err = -1, nr_events = 0, nr_polls = 0;
 
if (evlist == NULL) {
pr_debug("%s: perf_evlist__new\n", __func__);
@@ -66,11 +66,12 @@ int test__syscall_open_tp_fields(void)
 
while (1) {
int before = nr_events;
+   struct perf_mmap *md;
 
-   for (i = 0; i < evlist->nr_mmaps; i++) {
+   for_each_mmap(md, evlist) {
union perf_event *event;
 
-   while ((event = perf_evlist__mmap_read(evlist, i)) != 
NULL) {
+   while ((event = perf_evlist__mmap_read(evlist, md)) != 
NULL) {
const u32 type = event->header.type;
int tp_flags;
struct perf_sample sample;
diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c
index 1e8e512..8aef6d2 100644
--- a/tools/perf/tests/perf-record.c
+++ b/tools/perf/tests/perf-record.c
@@ -56,7 +56,7 @@ int test__PERF_RECORD(void)
 found_libc_mmap = false,
 found_vdso_mmap = false,
 found_ld_mmap = false;
-   int err = -1, errs = 0, i, wakeups = 0;
+   int err = -1, errs = 0, wakeups = 0;
u32 cpu;
int total_events = 0, nr_events[PERF_RECORD_MAX] = { 0, };
 
@@ -158,11 +158,12 @@ int test__PERF_RECORD(v

[PATCH v3 6/8]Perf: Add extend mechanism for mmap & pollfd.

2013-03-13 Thread chenggang
From: chenggang 

Add an extend mechanism for mmap and pollfd, so that they can be adjusted
when threads are forked or exit.
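
One common way to grow a pollfd array while keeping the existing entries is
realloc(), sketched below. This is not the code in this patch, which
allocates a new array and copies into it; pollset, pollset_reserve,
pollset_add and pollset_remove are hypothetical names. Note that the sizes
passed to realloc(), memset() and memmove() are in bytes, so each element
count is multiplied by sizeof(struct pollfd).

```c
#include <poll.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable set of pollfd entries. */
struct pollset {
	struct pollfd *fds;
	size_t nr;	/* entries in use */
	size_t cap;	/* entries allocated */
};

/* Make room for at least new_cap entries; existing entries are preserved. */
static int pollset_reserve(struct pollset *ps, size_t new_cap)
{
	struct pollfd *p;

	if (new_cap <= ps->cap)
		return 0;
	p = realloc(ps->fds, new_cap * sizeof(*p));
	if (p == NULL)
		return -1;	/* the old array is still valid */
	memset(p + ps->cap, 0, (new_cap - ps->cap) * sizeof(*p));
	ps->fds = p;
	ps->cap = new_cap;
	return 0;
}

/* Add fd to the set, doubling the capacity when it is full. */
static int pollset_add(struct pollset *ps, int fd)
{
	if (ps->nr == ps->cap &&
	    pollset_reserve(ps, ps->cap ? ps->cap * 2 : 4) < 0)
		return -1;
	ps->fds[ps->nr].fd = fd;
	ps->fds[ps->nr].events = POLLIN;
	ps->fds[ps->nr].revents = 0;
	ps->nr++;
	return 0;
}

/* Remove fd from the set, keeping the order of the remaining entries. */
static int pollset_remove(struct pollset *ps, int fd)
{
	for (size_t i = 0; i < ps->nr; i++) {
		if (ps->fds[i].fd != fd)
			continue;
		memmove(&ps->fds[i], &ps->fds[i + 1],
			(ps->nr - i - 1) * sizeof(ps->fds[0]));
		ps->nr--;
		return 0;
	}
	return -1;
}
```

Doubling the capacity keeps the number of reallocations small when many
threads are forked in a burst; the caller remains responsible for freeing
ps->fds at teardown.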

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/evlist.c |  151 +-
 tools/perf/util/evlist.h |3 +
 tools/perf/util/evsel.c  |7 ++-
 3 files changed, 156 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c1cd8f9..74af9bb 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -85,7 +85,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
-   free(evlist->mmap);
+   xyarray__delete(evlist->mmap);
free(evlist->pollfd);
evlist->mmap = NULL;
evlist->pollfd = NULL;
@@ -256,6 +256,32 @@ void perf_evlist__enable(struct perf_evlist *evlist)
}
 }
 
+/*
+ * If threads->nr > 1, the cpu_map__nr() must be 1.
+ * If the cpu_map__nr() > 1, we should not append pollfd.
+ */
+static int perf_evlist__extend_pollfd(struct perf_evlist *evlist)
+{
+   int new_nfds;
+
+   if (cpu_map__all(evlist->cpus)) {
+   struct pollfd *pfd;
+
+   new_nfds = evlist->threads->nr * evlist->nr_entries;
+   pfd = zalloc(sizeof(struct pollfd) * new_nfds);
+
+   if (!pfd)
+   return -1;
+
+   memcpy(pfd, evlist->pollfd, (evlist->threads->nr - 1) * 
evlist->nr_entries);
+
+   evlist->pollfd = pfd;
+   return 0;
+   }
+
+   return 1;
+}
+
 static int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
int nfds = cpu_map__nr(evlist->cpus) * evlist->threads->nr * 
evlist->nr_entries;
@@ -416,6 +442,20 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
evlist->mmap = NULL;
 }
 
+static struct perf_mmap * perf_evlist__extend_mmap(struct perf_evlist *evlist)
+{
+   struct perf_mmap **new_mmap = NULL;
+
+   new_mmap = (struct perf_mmap **)xyarray__append(evlist->mmap, NULL);
+
+   if (new_mmap != NULL) {
+   evlist->nr_mmaps++;
+   return *new_mmap;
+   }
+
+   return NULL;
+}
+
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
@@ -433,7 +473,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist,
pmmap->prev = 0;
pmmap->mask = mask;
pmmap->base = mmap(NULL, evlist->mmap_len, prot,
- MAP_SHARED, fd, 0);
+  MAP_SHARED, fd, 0);
if (pmmap->base == MAP_FAILED) {
pmmap->base = NULL;
return -1;
@@ -527,6 +567,111 @@ out_unmap:
return -1;
 }
 
+int perf_evlist__mmap_thread(struct perf_evlist *evlist, bool overwrite)
+{
+   struct perf_evsel *evsel;
+   int prot = PROT_READ | (overwrite ? 0 : PROT_WRITE);
+   int mask = evlist->mmap_len - page_size -1;
+   int output = -1;
+   struct pollfd *old_pollfd = evlist->pollfd;
+   struct perf_mmap *pmmap;
+
+   if (!cpu_map__all(evlist->cpus))
+   return 1;
+
+   if ((pmmap = perf_evlist__extend_mmap(evlist)) == NULL)
+   return -ENOMEM;
+
+   if (perf_evlist__extend_pollfd(evlist) < 0)
+   goto free_append_mmap;
+
+   list_for_each_entry(evsel, &evlist->entries, node) {
+   if (evsel->attr.read_format & PERF_FORMAT_ID) {
+   if (perf_evsel__extend_id(evsel) < 0)
+   goto free_append_pollfd;
+   }
+   }
+
+   list_for_each_entry(evsel, &evlist->entries, node) {
+   int fd = FD(evsel, 0, -1);
+
+   if (output == -1) {
+   output = fd;
+
+   pmmap->prev = 0;
+   pmmap->mask = mask;
+   pmmap->base = mmap(NULL, evlist->mmap_len, prot,
+  MAP_SHARED, fd, 0);
+
+   if (pmmap->base == MAP_FAILED) {
+   pmmap->base = NULL;
+   goto out_unmap;
+   }
+   perf_evlist__add_pollfd(evlist, fd);
+   } else {
+   if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, output) != 0)
+   goto out_unmap;
+   }
+   if ((evsel->attr.read_format & PERF_FORMAT_ID) &&
+   perf_evlist__id_add_fd(evlist, evsel, 0, -1, fd) < 0)

[PATCH v3 5/8]Perf: add extend mechanism for evsel->id & evsel->fd

2013-03-13 Thread chenggang
From: chenggang 

Add extend mechanism for evsel->id & evsel->fd.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/evsel.c  |   76 ++
 tools/perf/util/evsel.h  |8 +
 tools/perf/util/thread_map.c |2 +-
 3 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 015321f..2eb75f9 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -599,6 +599,16 @@ int perf_evsel__alloc_fd(struct perf_evsel *evsel, int 
ncpus, int nthreads)
return evsel->fd != NULL ? 0 : -ENOMEM;
 }
 
+/*
+ * Return the pointer to new fds (fds for the new thread at all cpus).
+ */
+static int** perf_evsel__extend_fd(struct perf_evsel *evsel)
+{
+   int init_fd = -1;
+
+   return (int**)xyarray__append(evsel->fd, (char *)&init_fd);
+}
+
 int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
   const char *filter)
 {
@@ -617,6 +627,26 @@ int perf_evsel__set_filter(struct perf_evsel *evsel, int 
ncpus, int nthreads,
return 0;
 }
 
+int perf_evsel__extend_id(struct perf_evsel *evsel)
+{
+   if (xyarray__append(evsel->sample_id, NULL) == NULL)
+   return -ENOMEM;
+
+   if (xyarray__append(evsel->id, NULL) == NULL) {
+   xyarray__remove(evsel->sample_id, -1);
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
+void perf_evsel__remove_id(struct perf_evsel *evsel, int tidx)
+{
+   xyarray__remove(evsel->id, tidx);
+   evsel->ids--;
+   xyarray__remove(evsel->sample_id, tidx);
+}
+
 int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 {
evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct 
perf_sample_id));
@@ -937,6 +967,52 @@ int perf_evsel__open_per_thread(struct perf_evsel *evsel,
return __perf_evsel__open(evsel, &empty_cpu_map.map, threads);
 }
 
+void perf_evsel__close_single_thread(struct perf_evsel *evsel, int cpu_nr,
+int tidx)
+{
+   int cpu;
+
+   for (cpu = 0; cpu < cpu_nr; cpu++) {
+   if (FD(evsel, cpu, tidx) >= 0)
+   close(FD(evsel, cpu, tidx));
+   }
+   xyarray__remove(evsel->fd, tidx); 
+}
+
+int perf_evsel__open_single_thread(struct perf_evsel *evsel,
+   struct cpu_map *cpus, int tid)
+{
+   int cpu;
+   int pid = -1;
+   unsigned long flags = 0;
+   int **new_fds;
+
+   if ((new_fds = perf_evsel__extend_fd(evsel)) == NULL)
+   return -1;
+
+   if (evsel->cgrp) {
+   flags = PERF_FLAG_PID_CGROUP;
+   pid = evsel->cgrp->fd;
+   }
+
+   for (cpu = 0; cpu < cpus->nr; cpu++) {
+   int group_fd;
+
+   if (!evsel->cgrp)
+   pid = tid;
+
+   group_fd = get_group_fd(evsel, cpu, -1);
+   evsel->attr.disabled = 0;
+   *new_fds[cpu] = sys_perf_event_open(&evsel->attr, pid,
+   cpus->map[cpu], group_fd,
+   flags);
+   if (*new_fds[cpu] < 0)
+   return -errno;
+   }
+
+   return 0;
+}
+
 static int perf_evsel__parse_id_sample(const struct perf_evsel *evsel,
   const union perf_event *event,
   struct perf_sample *sample)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 7adb116..ae391d4 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -128,6 +128,9 @@ void perf_evsel__close_fd(struct perf_evsel *evsel, int 
ncpus, int nthreads);
 void perf_evsel__id_new(struct perf_evsel *evsel, int nr);
 u64 *perf_evsel__get_id(struct perf_evsel *evsel, int idx);
 
+int perf_evsel__extend_id(struct perf_evsel *evsel);
+void perf_evsel__remove_id(struct perf_evsel *evsel, int tidx);
+
 void __perf_evsel__set_sample_bit(struct perf_evsel *evsel,
  enum perf_event_sample_format bit);
 void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
@@ -152,6 +155,11 @@ int perf_evsel__open(struct perf_evsel *evsel, struct 
cpu_map *cpus,
 struct thread_map *threads);
 void perf_evsel__close(struct perf_evsel *evsel, int ncpus, int nthreads);
 
+int perf_evsel__open_single_thread(struct perf_evsel *evsel,
+  struct cpu_map *cpus, int tid);
+void perf_evsel__close_single_thread(struct perf_evsel *evsel, int cpu_nr,
+int ti

[PATCH v3 4/8]perf: Transform evsel->id to xyarray

2013-03-13 Thread chenggang
From: chenggang 

Transform evsel->id to an xyarray, so that it becomes a linked list
instead of an array.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/evlist.c |4 +++-
 tools/perf/util/evsel.c  |   19 +--
 tools/perf/util/evsel.h  |5 -
 tools/perf/util/header.c |   28 ++--
 tools/perf/util/header.h |3 ++-
 5 files changed, 44 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7515651..c1cd8f9 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -287,8 +287,10 @@ static void perf_evlist__id_hash(struct perf_evlist 
*evlist,
 void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
 int cpu, int thread, u64 id)
 {
+   u64* idp = perf_evsel__get_id(evsel, -1);
perf_evlist__id_hash(evlist, evsel, cpu, thread, id);
-   evsel->id[evsel->ids++] = id;
+   *idp = id;
+   evsel->ids++;
 }
 
 static int perf_evlist__id_add_fd(struct perf_evlist *evlist,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 57c569d..015321f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -623,7 +623,7 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int 
ncpus, int nthreads)
if (evsel->sample_id == NULL)
return -ENOMEM;
 
-   evsel->id = zalloc(ncpus * nthreads * sizeof(u64));
+   evsel->id = xyarray__new(1, ncpus * nthreads, sizeof(u64));
if (evsel->id == NULL) {
xyarray__delete(evsel->sample_id);
evsel->sample_id = NULL;
@@ -633,6 +633,21 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int 
ncpus, int nthreads)
return 0;
 }
 
+void perf_evsel__id_new(struct perf_evsel *evsel, int nr)
+{
+   if (evsel->id)
+   xyarray__delete(evsel->id);
+
+   evsel->id = NULL;
+
+   evsel->id = xyarray__new(1, nr, sizeof(u64));
+}
+
+u64 *perf_evsel__get_id(struct perf_evsel *evsel, int idx)
+{
+   return (u64 *)xyarray__entry(evsel->id, 0, idx);
+}
+
 int perf_evsel__alloc_counts(struct perf_evsel *evsel, int ncpus)
 {
evsel->counts = zalloc((sizeof(*evsel->counts) +
@@ -650,7 +665,7 @@ void perf_evsel__free_id(struct perf_evsel *evsel)
 {
xyarray__delete(evsel->sample_id);
evsel->sample_id = NULL;
-   free(evsel->id);
+   xyarray__delete(evsel->id);
evsel->id = NULL;
 }
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 52021c3..7adb116 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -51,7 +51,7 @@ struct perf_evsel {
char*filter;
struct xyarray  *fd;
struct xyarray  *sample_id;
-   u64 *id;
+   struct xyarray  *id;
struct perf_counts  *counts;
struct perf_counts  *prev_raw_counts;
int idx;
@@ -125,6 +125,9 @@ void perf_evsel__free_id(struct perf_evsel *evsel);
 void perf_evsel__free_counts(struct perf_evsel *evsel);
 void perf_evsel__close_fd(struct perf_evsel *evsel, int ncpus, int nthreads);
 
+void perf_evsel__id_new(struct perf_evsel *evsel, int nr);
+u64 *perf_evsel__get_id(struct perf_evsel *evsel, int idx);
+
 void __perf_evsel__set_sample_bit(struct perf_evsel *evsel,
  enum perf_event_sample_format bit);
 void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f4bfd79..d344e61 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1325,19 +1325,18 @@ read_event_desc(struct perf_header *ph, int fd)
if (!nr)
continue;
 
-   id = calloc(nr, sizeof(*id));
-   if (!id)
-   goto error;
evsel->ids = nr;
-   evsel->id = id;
+   perf_evsel__id_new(evsel, nr);
+   if (!evsel->id)
+   goto error;
 
for (j = 0 ; j < nr; j++) {
+   id = perf_evsel__get_id(evsel, j);
ret = readn(fd, id, sizeof(*id));
if (ret != (ssize_t)sizeof(*id))
goto error;
if (ph->needs_swap)
*id = bswap_64(*id);
-   id++;
}
}
 out:
@@ -1384,7 +1383,8 @@ static void print_event_desc(struct perf_header *ph, int 
fd, FILE *fp)
 
if (evsel->ids) {
fprintf(fp, &quo

[PATCH v3 3/8]Perf: Transform evlist->mmap to xyarray

2013-03-13 Thread chenggang
From: chenggang 

Transform evlist->mmap to an xyarray, so that evlist->mmap becomes a linked
list as well.

1) perf_evlist__mmap_thread()
   mmap a new fd for a thread forked on the fly.
2) perf_evlist__munmap_thread()
   munmap the fd of a thread that exited on the fly.
3) perf_evlist__get_mmap()
   get a perf_mmap struct in the evlist->mmap list by its index.
4) for_each_mmap(md, evlist)
   traverse all perf_mmap structures in the evlist->mmap list.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/Makefile |3 ++-
 tools/perf/builtin-record.c |8 +++
 tools/perf/util/evlist.c|   49 ++-
 tools/perf/util/evlist.h|8 ++-
 4 files changed, 43 insertions(+), 25 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..7f3f066 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -209,7 +209,8 @@ BASIC_CFLAGS = \
-Iutil \
-I. \
-I$(TRACE_EVENT_DIR) \
-   -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE
+   -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE \
+   -std=gnu99
 
 BASIC_LDFLAGS =
 
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 774c907..3bca0b2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -363,12 +363,12 @@ static struct perf_event_header finished_round_event = {
 
 static int perf_record__mmap_read_all(struct perf_record *rec)
 {
-   int i;
int rc = 0;
+   struct perf_mmap *pmmap = NULL;
 
-   for (i = 0; i < rec->evlist->nr_mmaps; i++) {
-   if (rec->evlist->mmap[i].base) {
-   if (perf_record__mmap_read(rec, &rec->evlist->mmap[i]) 
!= 0) {
+   for_each_mmap(pmmap, rec->evlist) {
+   if (pmmap->base) {
+   if (perf_record__mmap_read(rec, pmmap) != 0) {
rc = -1;
goto out;
}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d5063d6..7515651 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -336,7 +336,7 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist 
*evlist, u64 id)
 
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 {
-   struct perf_mmap *md = &evlist->mmap[idx];
+   struct perf_mmap *md = perf_evlist__get_mmap(evlist, idx);
unsigned int head = perf_mmap__read_head(md);
unsigned int old = md->prev;
unsigned char *data = md->base + page_size;
@@ -401,16 +401,16 @@ union perf_event *perf_evlist__mmap_read(struct 
perf_evlist *evlist, int idx)
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
-   int i;
+   struct perf_mmap *pmmap = NULL;
 
-   for (i = 0; i < evlist->nr_mmaps; i++) {
-   if (evlist->mmap[i].base != NULL) {
-   munmap(evlist->mmap[i].base, evlist->mmap_len);
-   evlist->mmap[i].base = NULL;
+   for_each_mmap(pmmap, evlist) {
+   if (pmmap->base != NULL) {
+   munmap(pmmap->base, evlist->mmap_len);
+   pmmap->base = NULL;
}
}
 
-   free(evlist->mmap);
+   xyarray__delete(evlist->mmap);
evlist->mmap = NULL;
 }
 
@@ -419,19 +419,21 @@ static int perf_evlist__alloc_mmap(struct perf_evlist 
*evlist)
evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
if (cpu_map__all(evlist->cpus))
evlist->nr_mmaps = evlist->threads->nr;
-   evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
+   evlist->mmap = xyarray__new(1, evlist->nr_mmaps, sizeof(struct 
perf_mmap));
return evlist->mmap != NULL ? 0 : -ENOMEM;
 }
 
 static int __perf_evlist__mmap(struct perf_evlist *evlist,
   int idx, int prot, int mask, int fd)
 {
-   evlist->mmap[idx].prev = 0;
-   evlist->mmap[idx].mask = mask;
-   evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, prot,
+   struct perf_mmap *pmmap = perf_evlist__get_mmap(evlist, idx);
+
+   pmmap->prev = 0;
+   pmmap->mask = mask;
+   pmmap->base = mmap(NULL, evlist->mmap_len, prot,
  MAP_SHARED, fd, 0);
-   if (evlist->mmap[idx].base == MAP_FAILED) {
-   evlist->mmap[idx].base = NULL;
+   if (pmmap->base == MAP_FAILED) {
+   pmmap->base = NULL;
return -1;
}
 
@@ -472,9 +474,11 @@ static int perf_evlist__mmap_

[PATCH v3 2/8]Perf: Transform xyarray to linked list

2013-03-13 Thread chenggang
From: chenggang 

A 2-dimensional array cannot easily expand and shrink when we want to
track thread fork and exit events on the fly.
We transform xyarray into a 2-dimensional linked list. The x dimension is
the CPUs and is still an array. The y dimension is the threads of interest
and becomes a linked list.
Interfaces to extend and shrink an existing xyarray are provided.
1) xyarray__append()
   append a column to all rows.
2) xyarray__remove()
   remove a column from all rows.
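
Removing a column means unlinking the y-th node from every row and releasing
both the node and the buffer it owns. The sketch below shows one way to do
that; it is a simplified illustration, not the patch's code. The names node,
row_append, row_length, row_remove_at and rows_remove_column are
hypothetical, and a singly linked list with a pointer-to-pointer cursor
replaces the kernel-style list helpers.

```c
#include <stdlib.h>

/* Hypothetical list node: one entry of one row. */
struct node {
	struct node *next;
	void *contents;
};

/* Append a zeroed entry to one row; returns its contents, or NULL. */
static void *row_append(struct node **head, size_t entry_size)
{
	struct node *n = calloc(1, sizeof(*n));

	if (n == NULL)
		return NULL;
	n->contents = calloc(1, entry_size);
	if (n->contents == NULL) {
		free(n);
		return NULL;
	}
	while (*head != NULL)
		head = &(*head)->next;
	*head = n;
	return n->contents;
}

static size_t row_length(const struct node *head)
{
	size_t len = 0;

	for (; head != NULL; head = head->next)
		len++;
	return len;
}

/* Remove the y-th node of one row; returns -1 if y is out of range. */
static int row_remove_at(struct node **head, size_t y)
{
	struct node **pp = head;
	struct node *victim;

	while (*pp != NULL && y > 0) {
		pp = &(*pp)->next;
		y--;
	}
	if (*pp == NULL)
		return -1;
	victim = *pp;
	*pp = victim->next;	/* unlink before freeing */
	free(victim->contents);	/* the node owns its contents buffer */
	free(victim);
	return 0;
}

/* Remove column y from every row; returns -1 if any row lacked it. */
static int rows_remove_column(struct node **rows, size_t row_count, size_t y)
{
	int rc = 0;

	for (size_t x = 0; x < row_count; x++)
		if (row_remove_at(&rows[x], y) < 0)
			rc = -1;
	return rc;
}
```

In this sketch the node is unlinked before it is freed and the traversal
never reads a freed node, which is the same concern the _safe list iterators
address in the diff below.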

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/xyarray.c |  125 +++--
 tools/perf/util/xyarray.h |   68 ++--
 2 files changed, 185 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c
index 22afbf6..ddb3bff 100644
--- a/tools/perf/util/xyarray.c
+++ b/tools/perf/util/xyarray.c
@@ -1,20 +1,135 @@
 #include "xyarray.h"
 #include "util.h"
 
+/*
+ * Add a column for all rows;
+ * @init_cont stores the initialize value for new entries.
+ * The return value is the array of new contents.
+ */
+char** xyarray__append(struct xyarray *xy, char *init_cont)
+{
+   struct xyentry *new_entry;
+   unsigned int x;
+   char **new_conts;
+
+   new_conts = zalloc(sizeof(char *) * xy->row_count);
+   if (new_conts == NULL)
+   return NULL;
+
+   for (x = 0; x < xy->row_count; x++) {
+   new_entry = zalloc(sizeof(*new_entry));
+   if (new_entry == NULL) {
+   free(new_conts);
+   return NULL;
+   }
+
+   new_entry->contents = zalloc(xy->entry_size);
+   if (new_entry->contents == NULL) {
+   free(new_entry);
+   free(new_conts);
+   return NULL;
+   }
+
+   if (init_cont)
+   memcpy(new_entry->contents, init_cont, xy->entry_size);
+
+   new_conts[x] = new_entry->contents;
+
+   list_add_tail(&new_entry->next, &xy->rows[x].head);
+   }
+
+   return new_conts;
+}
+
 struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
 {
-   size_t row_size = ylen * entry_size;
-   struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);
+   struct xyarray *xy = zalloc(sizeof(*xy) + xlen * sizeof(struct row));
+   int i;
+
+   if (xy == NULL)
+   return NULL;
+
+   xy->row_count = xlen;
+   xy->entry_size = entry_size;
 
-   if (xy != NULL) {
-   xy->entry_size = entry_size;
-   xy->row_size   = row_size;
+   for (i = 0; i < xlen; i++)
+   INIT_LIST_HEAD(&xy->rows[i].head);
+
+   for (i = 0; i< ylen; i++) {
+   if (xyarray__append(xy, NULL) == NULL) {
+   xyarray__delete(xy);
+   return NULL;
+   }
}
 
return xy;
 }
 
+static inline int xyarray__remove_last(struct xyarray *xy)
+{
+   struct xyentry *entry;
+   unsigned int x;
+
+   if (xy == NULL)
+   return -1; 
+
+   for (x = 0; x < xy->row_count; x++) {
+   if (!list_empty(&xy->rows[x].head)) {
+   entry = list_entry(xy->rows[x].head.prev,
+  struct xyentry, next);
+   list_del(&entry->next);
+   free(entry);
+   }
+   }
+
+   return 0;
+}
+
+/*  
+ * remove a column for all rows;
+ */
+int xyarray__remove(struct xyarray *xy, int y)
+{
+   struct xyentry *entry, *tmp;
+   unsigned int x;
+   int count;
+
+   if (xy == NULL)
+   return -1;
+
+   if (y == -1)
+   return xyarray__remove_last(xy);
+
+   for (x = 0; x < xy->row_count; x++) {
+   count = 0;
+   list_for_each_entry_safe(entry, tmp, &xy->rows[x].head, next) {
+   if (count++ == y) {
+   list_del(&entry->next);
+   free(entry);
+   }
+   }
+   }
+
+   return 0;
+}
+
+/*
+ * delete @xy and all its nodes.
+ */
 void xyarray__delete(struct xyarray *xy)
 {
+   unsigned int i;
+   struct xyentry *entry, *tmp;
+
+   if (!xy)
+   return;
+
+   for (i = 0; i < xy->row_count; i++) {
+   list_for_each_entry_safe(entry, tmp, &xy->rows[i].head, next) {
+   list_del(&entry->next);
+   free(entry);
+ 

[PATCH v3 1/8]Perf: Transform thread_map to linked list

2013-03-13 Thread chenggang
From: chenggang 

The size of thread_map is fixed at initialization according to the
files in /proc/{$pid}. It cannot be expanded or shrunk when we want
to track thread fork and exit events.
We transform the thread_map structure into a linked list and implement some
interfaces to expand and shrink it. To stay compatible with the existing
code, a thread can still be looked up by its index in the thread_map.
1) thread_map__append()
   Append a new thread to the thread_map by the new thread's pid.
2) thread_map__remove()
   Remove an existing thread from the thread_map by the index of the
   thread in the thread_map.
3) thread_map__init()
   Allocate a thread_map and initialize it. The thread_map is empty after
   this function returns; call thread_map__append() to insert
   threads.
4) thread_map__delete()
   Delete an existing thread_map.
5) thread_map__set_pid()
   Set the pid of a thread by its index in the thread_map.
6) thread_map__get_pid()
   Get a thread's pid by its index in the thread_map.
7) thread_map__get_idx_by_pid()
   Get a thread's index in the thread_map by its pid.
   When we receive a PERF_RECORD_EXIT event, we only know the pid of the thread.
8) thread_map__empty_thread_map()
   Return an empty thread_map that contains only a dummy thread.
   This function replaces the global variable empty_thread_map.
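
The list-backed map with index-based accessors can be sketched as follows.
This is only an illustration under assumed names (struct tmap, tmap_append()
and so on are hypothetical, not the functions added by this patch), using a
plain singly linked list instead of list_head.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical sketch; these names are not perf's thread_map API. */
struct tnode {
	struct tnode *next;
	pid_t pid;
};

struct tmap {
	struct tnode *head;
	int nr;			/* number of threads in the list */
};

/* Append pid at the tail; return its index, or -1 on allocation failure. */
int tmap_append(struct tmap *m, pid_t pid)
{
	struct tnode **link;
	struct tnode *n;

	assert(m != NULL);
	n = malloc(sizeof(*n));
	if (n == NULL)
		return -1;
	n->pid = pid;
	n->next = NULL;
	for (link = &m->head; *link != NULL; link = &(*link)->next)
		;
	*link = n;
	return m->nr++;
}

/* Remove the thread at index idx; return 0, or -1 if idx is out of range. */
int tmap_remove(struct tmap *m, int idx)
{
	struct tnode **link = &m->head;
	struct tnode *victim;

	if (idx < 0 || idx >= m->nr)
		return -1;
	while (idx-- > 0)
		link = &(*link)->next;
	victim = *link;
	*link = victim->next;
	free(victim);
	m->nr--;
	return 0;
}

/* Index-based lookup keeps array-style callers working; O(n) per call. */
pid_t tmap_get_pid(const struct tmap *m, int idx)
{
	const struct tnode *n = m->head;

	if (idx < 0 || idx >= m->nr)
		return -1;
	while (idx-- > 0)
		n = n->next;
	return n->pid;
}

/* Reverse lookup, as needed when an exit event carries only a pid. */
int tmap_get_idx_by_pid(const struct tmap *m, pid_t pid)
{
	const struct tnode *n;
	int idx = 0;

	for (n = m->head; n != NULL; n = n->next, idx++)
		if (n->pid == pid)
			return idx;
	return -1;
}
```

Note the cost of the compatibility accessor: a loop that calls the index-based
getter for every index walks the list once per call, so it is quadratic in the
number of threads.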

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-stat.c |2 +-
 tools/perf/tests/open-syscall-tp-fields.c |2 +-
 tools/perf/util/event.c   |   12 +-
 tools/perf/util/evlist.c  |2 +-
 tools/perf/util/evsel.c   |   16 +-
 tools/perf/util/python.c  |2 +-
 tools/perf/util/thread_map.c  |  281 ++---
 tools/perf/util/thread_map.h  |   17 +-
 8 files changed, 244 insertions(+), 90 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9984876..293b09c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -401,7 +401,7 @@ static int __run_perf_stat(int argc __maybe_unused, const char **argv)
}
 
if (perf_target__none(&target))
-   evsel_list->threads->map[0] = child_pid;
+   thread_map__set_pid(evsel_list->threads, 0, child_pid);
 
/*
 * Wait for the child to be ready to exec.
diff --git a/tools/perf/tests/open-syscall-tp-fields.c b/tools/perf/tests/open-syscall-tp-fields.c
index 1c52fdc..39eb770 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -43,7 +43,7 @@ int test__syscall_open_tp_fields(void)
 
perf_evsel__config(evsel, &opts);
 
-   evlist->threads->map[0] = getpid();
+   thread_map__append(evlist->threads, getpid());
 
err = perf_evlist__open(evlist);
if (err < 0) {
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 5cd13d7..d093460 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -326,9 +326,11 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 
err = 0;
for (thread = 0; thread < threads->nr; ++thread) {
+   pid_t pid = thread_map__get_pid(threads, thread);
+
if (__event__synthesize_thread(comm_event, mmap_event,
-  threads->map[thread], 0,
-  process, tool, machine)) {
+  pid, 0, process, tool, 
+  machine)) {
err = -1;
break;
}
@@ -337,12 +339,14 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 * comm.pid is set to thread group id by
 * perf_event__synthesize_comm
 */
-   if ((int) comm_event->comm.pid != threads->map[thread]) {
+   if ((int) comm_event->comm.pid != pid) {
bool need_leader = true;
 
/* is thread group leader in thread_map? */
for (j = 0; j < threads->nr; ++j) {
-   if ((int) comm_event->comm.pid == threads->map[j]) {
+   pid_t pidj = thread_map__get_pid(threads, j);
+
+   if ((int) comm_event->comm.pid == pidj) {
need_leader = false;
break;
}
diff --git

[PATCH v3 0/8]Perf: Make the 'perf top -p $pid' can perceive the new forked threads.

2013-03-13 Thread chenggang
From: chenggang 

This patch set is based on the 3.8.rc7 kernel.

This is version 3; I optimized the performance and the structure in this
version.

This patch set makes 'perf top -p $pid' able to detect new threads that are
forked by the target processes. 'perf top{record} -p $pid' can see the threads
that were forked before perf was started, but it cannot see threads that are
forked after perf was started. This is an important defect in perf, because
many applications fork new threads on the fly.
For performance reasons, the event inherit mechanism is disabled when we use
per-task counters. Some internal data structures, such as thread_map,
evlist->mmap, evsel->fd, evsel->id and evsel->sample_id, are implemented as
arrays allocated in the initialization phase. Their size is fixed, and they
cannot easily be extended for newly forked threads.

So, we have done the following work:
1) Transformed thread_map into a linked list.
   Implemented the interfaces to extend and shrink an existing thread_map.
2) Transformed xyarray into a linked list. Implemented the interfaces to extend
   and shrink an existing xyarray.
   The xyarray is a 2-dimensional structure.
   The x-dimension is cpus, and it is still an array.
   The y-dimension is the threads of interest, and it is a linked list.
3) Implemented evlist->mmap, evsel->fd, evsel->id and evsel->sample_id with the
   new xyarray.
   Implemented interfaces to expand and shrink these structures.
4) Added 2 callback functions to top->perf_tool; they are called when
   PERF_RECORD_FORK and PERF_RECORD_EXIT events are received.
   When a PERF_RECORD_FORK event is received, all related data structures are
   expanded, and a new fd and mmap are opened.
   When a PERF_RECORD_EXIT event is received, all nodes for the thread in the
   related data structures are removed.

The linked list is flexible: list_add & list_del can be used easily.
In addition, the performance penalty (especially the CPU utilization) is low.

At the end of this cover letter, I attached a test program and its Makefile.
After it is executed, we get its pid. Then, use this command:
'perf top -p *pid*'
perf top will then show the functions called by the threads forked
on the fly.
We can use the 'top' tool to monitor the overhead of 'perf'. The result shows
that the CPU overhead of this patch set is less than 3%. I think this overhead
is acceptable.

My test environment is as follows:
# 
# captured on: Wed Mar 13 15:23:55 2013
# perf version : 3.8.rc7.ga39f52
# arch : x86_64
# nrcpus online : 2
# nrcpus avail : 2
# cpudesc : Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz
# cpuid : GenuineIntel,6,23,10
# total memory : 3034932 kB
#

This function has already been implemented for 'perf top -p $pid' in patch
[8/8] of this patch set. As a next step, 'perf record -p $pid' should be
modified in the same way.

Thanks to David Ahern for his suggestion.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

chenggang (8):
  changed thread_map to list
  changed xyarray to list
  changed mmap to xyarray
  changed evsel->id to xyarray
  extend mechanism for evsel->id & evsel->fd
  add some operations for mmap
  changed the method to traverse mmap list
  fork & exit event perceived

 tools/perf/Makefile   |3 +-
 tools/perf/builtin-record.c   |8 +-
 tools/perf/builtin-stat.c |2 +-
 tools/perf/builtin-top.c  |  116 -
 tools/perf/tests/mmap-basic.c |4 +-
 tools/perf/tests/open-syscall-tp-fields.c |9 +-
 tools/perf/tests/perf-record.c|7 +-
 tools/perf/util/event.c   |   12 +-
 tools/perf/util/evlist.c  |  206 +++---
 tools/perf/util/evlist.h  |   14 +-
 tools/perf/util/evsel.c   |  118 +++--
 tools/perf/util/evsel.h   |   13 +-
 tools/perf/util/header.c  |   28 +--
 tools/perf/util/header.h  |3 +-
 tools/perf/util/python.c  |6 +-
 tools/perf/util/thread_map.c  |  265 +
 tools/perf/util/thread_map.h  |   16 +-
 tools/perf/util/xyarray.c |  125 +-
 tools/perf/util/xyarray.h |   68 +++-
 19 files changed, 866 insertions(+), 157 deletions(-)

---
Here is a program to test the patch set.

---
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

#define CHILDREN_NUM 15000
#define UINT_MAX (~0U)

unsigned int ne

[PATCH]Perf: Fix Makefile to remove all "*.o" files while "make clean"

2013-03-13 Thread chenggang
From: chenggang 

After we run "make clean" in perf's directory and then run the command:
"find ./ -name *.o"
we get:

./arch/x86/util/unwind.o
./arch/x86/util/header.o
./arch/x86/util/dwarf-regs.o
./util/scripting-engines/trace-event-python.o
./util/scripting-engines/trace-event-perl.o
./util/probe-finder.o
./util/dwarf-aux.o
./util/unwind.o
./lib/rbtree.o
./ui/browser.o
./ui/browsers/map.o
./ui/browsers/annotate.o
./ui/browsers/scripts.o
./ui/browsers/hists.o
./ui/tui/setup.o
./ui/tui/util.o
./ui/tui/helpline.o
./ui/tui/progress.o
./ui/gtk/browser.o
./ui/gtk/setup.o
./ui/gtk/util.o
./ui/gtk/helpline.o
./ui/gtk/annotate.o
./ui/gtk/progress.o
./ui/gtk/hists.o
./scripts/perl/Perf-Trace-Util/Context.o
./scripts/python/Perf-Trace-Util/Context.o

These ".o" files are not cleaned.

This patch fixes this problem.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/Makefile |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..20ed83c 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -1173,6 +1173,16 @@ clean: $(LIBTRACEEVENT)-clean
$(RM) $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)PERF-CFLAGS
$(RM) $(OUTPUT)util/*-bison*
$(RM) $(OUTPUT)util/*-flex*
+   $(RM) $(OUTPUT)util/*.o
+   $(RM) $(OUTPUT)util/scripting-engines/*.o
+   $(RM) $(OUTPUT)scripts/perl/Perf-Trace-Util/*.o
+   $(RM) $(OUTPUT)scripts/python/Perf-Trace-Util/*.o
+   $(RM) $(OUTPUT)ui/*.o
+   $(RM) $(OUTPUT)ui/browsers/*.o
+   $(RM) $(OUTPUT)ui/tui/*.o
+   $(RM) $(OUTPUT)ui/gtk/*.o
+   $(RM) $(OUTPUT)lib/*.o
+   $(RM) $(OUTPUT)arch/$(ARCH)/util/*.o
$(python-clean)
 
 .PHONY: all install clean strip $(LIBTRACEEVENT)
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2]Perf: Fix Makefile to clean all object files

2013-03-14 Thread chenggang
From: Chenggang Qin 

If we execute "make clean" in perf's directory, many object files are not
removed in the current version.
For example, after we run "make clean" in perf's directory and then run the
command:
"find ./ -name "*.o""
we get:

./arch/x86/util/unwind.o
./arch/x86/util/header.o
./arch/x86/util/dwarf-regs.o
./util/scripting-engines/trace-event-python.o
./util/scripting-engines/trace-event-perl.o
./util/probe-finder.o
./util/dwarf-aux.o
./util/unwind.o
... ...

These ".o" files are not cleaned.

The reason is:
These object files are added to "BUILTIN_OBJS" while the "make" process checks
the environment.
If the make target is "clean", the environment check is not executed, so
these object files are not added to "BUILTIN_OBJS" when we execute "make
clean".

This patch fixes this problem.
We only add one command:
"find . -name "*.o" -exec rm -f {} \;"

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/Makefile |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..dec08ba 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -1174,6 +1174,7 @@ clean: $(LIBTRACEEVENT)-clean
$(RM) $(OUTPUT)util/*-bison*
$(RM) $(OUTPUT)util/*-flex*
$(python-clean)
+   find . -name "*.o" -exec rm -f {} \;
 
 .PHONY: all install clean strip $(LIBTRACEEVENT)
 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
-- 
1.7.8.rc2.5.g815b



[PATCH v3]Perf: Fix Makefile to clean all object files

2013-03-14 Thread chenggang
From: Chenggang Qin 

If we execute "make clean" in perf's directory, many object files are not
removed in the current version.
For example, after we run "make clean" in perf's directory and then run the
command:
"find ./ -name "*.o""
we get:

./arch/x86/util/unwind.o
./arch/x86/util/header.o
./arch/x86/util/dwarf-regs.o
./util/scripting-engines/trace-event-python.o
./util/scripting-engines/trace-event-perl.o
./util/probe-finder.o
./util/dwarf-aux.o
./util/unwind.o
... ...

These ".o" files are not cleaned.

The reason is:
These object files are added to "BUILTIN_OBJS" while the "make" process checks
the environment.
If the make target is "clean", the environment check is not executed, so
these object files are not added to "BUILTIN_OBJS" when we execute "make
clean".

This patch fixes this problem.
We only add one command:
"$(FIND) . -name "*.o" -exec rm -f {} \;"

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/Makefile |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..dec08ba 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -1174,6 +1174,7 @@ clean: $(LIBTRACEEVENT)-clean
$(RM) $(OUTPUT)util/*-bison*
$(RM) $(OUTPUT)util/*-flex*
$(python-clean)
+   $(FIND) . -name "*.o" -exec rm -f {} \;
 
 .PHONY: all install clean strip $(LIBTRACEEVENT)
 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
-- 
1.7.8.rc2.5.g815b



[PATCH v4]Perf: Fix Makefile to clean all object files

2013-03-14 Thread chenggang
From: Chenggang Qin 

If we execute "make clean" in perf's directory, many object files are not
removed in the current version.
For example, after we run "make clean" in perf's directory and then run the
command:
"find ./ -name "*.o""
we get:

./arch/x86/util/unwind.o
./arch/x86/util/header.o
./arch/x86/util/dwarf-regs.o
./util/scripting-engines/trace-event-python.o
./util/scripting-engines/trace-event-perl.o
./util/probe-finder.o
./util/dwarf-aux.o
./util/unwind.o
... ...

These ".o" files are not cleaned.

The reason is:
These object files are added to "BUILTIN_OBJS" while the "make" process checks
the environment.
If the make target is "clean", the environment check is not executed, so
these object files are not added to "BUILTIN_OBJS" when we execute "make
clean".

This patch fixes this problem.
We only add one command:
"$(FIND) $(OUTPUT) -name "*.o" -delete"

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/Makefile |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..dec08ba 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -1174,6 +1174,7 @@ clean: $(LIBTRACEEVENT)-clean
$(RM) $(OUTPUT)util/*-bison*
$(RM) $(OUTPUT)util/*-flex*
$(python-clean)
+   $(FIND) $(OUTPUT) -name "*.o" -delete
 
 .PHONY: all install clean strip $(LIBTRACEEVENT)
 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
-- 
1.7.8.rc2.5.g815b



[PATCH 3/5] perf tools: Add bitmap field to thread_map to support new threads aware.

2012-12-20 Thread chenggang
During a target thread's life cycle, it may fork many threads. But in the
current version of 'perf top{record} -p $pid', the newly forked threads cannot
be detected by perf. The content of thread_map and other related structures
needs to be refreshed on the fly to track thread fork and exit. A large
pre-allocated array with a bitmap that records which positions are in use is a
simple way to do this. This patch adds a bitmap field to struct thread_map and
modifies the related code in thread_map.c & evsel.c.
With this patch alone, the bitmap mechanism cannot yet be used, because the
interfaces of evlist and evsel have not been modified.
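
The idea of a pre-allocated array whose free positions are tracked by a bitmap
can be sketched as below. The names (struct slot_map, slot_map_add() and so on)
and the capacity of 128 are assumptions made for this illustration only; they
are not taken from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define SLOT_COUNT 128	/* hypothetical capacity, chosen for the sketch */
#define WORD_BITS  (sizeof(unsigned long) * CHAR_BIT)
#define WORD_COUNT ((SLOT_COUNT + WORD_BITS - 1) / WORD_BITS)

struct slot_map {
	unsigned long used[WORD_COUNT];	/* bit i set: pid[i] is occupied */
	pid_t pid[SLOT_COUNT];
};

void slot_map_init(struct slot_map *m)
{
	memset(m, 0, sizeof(*m));
}

/* Claim the lowest free slot for pid; return its index, or -1 if full. */
int slot_map_add(struct slot_map *m, pid_t pid)
{
	size_t i;

	for (i = 0; i < SLOT_COUNT; i++) {
		unsigned long mask = 1UL << (i % WORD_BITS);

		if ((m->used[i / WORD_BITS] & mask) == 0) {
			m->used[i / WORD_BITS] |= mask;
			m->pid[i] = pid;
			return (int)i;
		}
	}
	return -1;
}

/* Release slot idx so that a later add can reuse it. */
void slot_map_del(struct slot_map *m, size_t idx)
{
	assert(idx < SLOT_COUNT);
	m->used[idx / WORD_BITS] &= ~(1UL << (idx % WORD_BITS));
	m->pid[idx] = -1;
}
```

Because a slot keeps its index for as long as the thread lives, other
per-thread arrays can share the same index without being compacted when a
thread exits.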

Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
---
 tools/perf/util/evsel.c  |   19 -
 tools/perf/util/thread_map.c |  171 ++
 tools/perf/util/thread_map.h |8 ++
 3 files changed, 116 insertions(+), 82 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1b16dd1..a34167f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -790,11 +790,21 @@ static struct {
.cpus   = { -1, },
 };
 
+/*
+ * while we use empty_thread_map, we should clear the empty_thread_bitmap,
+ * and set the first bit.
+ */
+static DECLARE_BITMAP(empty_thread_bitmap, PID_MAX_DEFAULT);
+
 static struct {
struct thread_map map;
int threads[1];
 } empty_thread_map = {
-   .map.nr  = 1,
+   .map = {
+   .max_nr = MAX_THREADS_NR_DEFAULT,
+   .nr = 1,
+   .bitmap = empty_thread_bitmap,
+   },
.threads = { -1, },
 };
 
@@ -806,8 +816,11 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
cpus = &empty_cpu_map.map;
}
 
-   if (threads == NULL)
+   if (threads == NULL) {
threads = &empty_thread_map.map;
+   bitmap_zero(threads->bitmap, PID_MAX_DEFAULT);
+   set_bit(0, threads->bitmap);
+   }
 
return __perf_evsel__open(evsel, cpus, threads);
 }
@@ -815,6 +828,8 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
 struct cpu_map *cpus)
 {
+   bitmap_zero(empty_thread_map.map.bitmap, PID_MAX_DEFAULT);
+   set_bit(0, empty_thread_map.map.bitmap);
return __perf_evsel__open(evsel, cpus, &empty_thread_map.map);
 }
 
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 9b5f856..7966f3f 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -9,6 +9,8 @@
 #include "strlist.h"
 #include 
 #include "thread_map.h"
+#include 
+#include "debug.h"
 
 /* Skip "." and ".." directories */
 static int filter(const struct dirent *dir)
@@ -21,7 +23,7 @@ static int filter(const struct dirent *dir)
 
 struct thread_map *thread_map__new_by_pid(pid_t pid)
 {
-   struct thread_map *threads;
+   struct thread_map *threads = NULL;
char name[256];
int items;
struct dirent **namelist = NULL;
@@ -32,11 +34,12 @@ struct thread_map *thread_map__new_by_pid(pid_t pid)
if (items <= 0)
return NULL;
 
-   threads = malloc(sizeof(*threads) + sizeof(pid_t) * items);
-   if (threads != NULL) {
-   for (i = 0; i < items; i++)
-   threads->map[i] = atoi(namelist[i]->d_name);
-   threads->nr = items;
+   for (i = 0; i < items; i++) {
+   bool re_alloc;
+
+   if (thread_map__update(&threads, atoi(namelist[i]->d_name),
+  &re_alloc) < 0)
+   return NULL;
}
 
for (i=0; imap[0] = tid;
-   threads->nr = 1;
-   }
+   if (thread_map__update(&threads, tid, &re_alloc) < 0)
+   return NULL;
 
return threads;
 }
@@ -61,23 +63,17 @@ struct thread_map *thread_map__new_by_tid(pid_t tid)
 struct thread_map *thread_map__new_by_uid(uid_t uid)
 {
DIR *proc;
-   int max_threads = 32, items, i;
+   int items, i;
char path[256];
struct dirent dirent, *next, **namelist = NULL;
-   struct thread_map *threads = malloc(sizeof(*threads) +
-   max_threads * sizeof(pid_t));
-   if (threads == NULL)
-   goto out;
+   struct thread_map *threads = NULL;
 
proc = opendir("/proc");
if (proc == NULL)
-   goto out_free_threads;
-
-   threads->nr = 0;
+   goto out;
 
while (!readdir_r(proc, &dirent, &

[PATCH 5/5] perf top: Add the function to make 'perf top -p $pid' could be aware of new forked thread.

2012-12-20 Thread chenggang
This patch implements a fork function and an exit function in perf_top->tool to
respond to PERF_RECORD_FORK & PERF_RECORD_EXIT events. In the fork function
(perf_top__process_event_fork), the information of the new thread is added to
thread_map. The fd and mmap of the new thread are also created in this function.
In the exit function (perf_top__process_event_exit), the information of the
exited thread is removed from thread_map. The fd and mmap of this thread are
also closed in this function.
With this patch, 'perf top -p $pid' is aware of thread fork and exit
on the fly.
The sample events of newly forked threads are received by 'perf top', and the
symbols of the newly forked threads are displayed in the UI.
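
The fork and exit handling can be summarized in a small stand-alone sketch.
It assumes a simplified event record and a fixed-size set of tracked thread
ids; struct ev, struct tracked and tracked_handle() are hypothetical names, and
the real fd and mmap work is only indicated by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_TRACKED 64	/* hypothetical fixed capacity, for brevity */

enum ev_type { EV_FORK, EV_EXIT };

struct ev {
	enum ev_type type;
	pid_t tid;	/* thread that was forked or that exited */
	pid_t ptid;	/* parent thread; only meaningful for EV_FORK */
};

struct tracked {
	pid_t tids[MAX_TRACKED];
	size_t nr;
};

static bool tracked_has(const struct tracked *t, pid_t tid)
{
	size_t i;

	for (i = 0; i < t->nr; i++)
		if (t->tids[i] == tid)
			return true;
	return false;
}

/* Return 1 if the tracked set changed, 0 if the event was ignored,
 * -1 if a new thread could not be added. */
int tracked_handle(struct tracked *t, const struct ev *e)
{
	size_t i;

	assert(t != NULL && e != NULL);
	switch (e->type) {
	case EV_FORK:
		if (tracked_has(t, e->tid))	/* duplicate fork event */
			return 0;
		if (!tracked_has(t, e->ptid))	/* parent is not a target */
			return 0;
		if (t->nr == MAX_TRACKED)
			return -1;
		/* A real tool would open the counter fd and mmap here. */
		t->tids[t->nr++] = e->tid;
		return 1;
	case EV_EXIT:
		for (i = 0; i < t->nr; i++) {
			if (t->tids[i] == e->tid) {
				/* A real tool would close the fd and munmap here. */
				t->tids[i] = t->tids[--t->nr];
				return 1;
			}
		}
		return 0;
	}
	return 0;
}
```

The duplicate check comes first because the same fork event may be delivered
more than once; the parent check keeps threads of unrelated processes out of
the set.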

Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
---
 tools/perf/builtin-top.c |  135 ++
 1 file changed, 135 insertions(+)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index b3650e3..e7978ce 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -844,6 +844,17 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
if (event->header.type == PERF_RECORD_SAMPLE)
++top->samples;
 
+   if (cpu_map__all(top->evlist->cpus) &&
+   event->header.type == PERF_RECORD_FORK)
+   (&top->tool)->fork(&top->tool, event, &sample, machine);
+
+   if (cpu_map__all(top->evlist->cpus) &&
+   event->header.type == PERF_RECORD_EXIT) {
+   int close_nr;
+
+   close_nr = (&top->tool)->exit(&top->tool, event,
+ &sample, machine);
+   if (close_nr == idx)
+   return;
+   }
+
switch (origin) {
case PERF_RECORD_MISC_USER:
++top->us_samples;
@@ -896,6 +907,26 @@ static void perf_top__mmap_read(struct perf_top *top)
perf_top__mmap_read_idx(top, i);
 }
 
+static void perf_top__append_thread(struct perf_top *top, int append_nr,
+bool need_realloc)
+{
+   struct perf_evsel *counter;
+   struct perf_evlist *evlist = top->evlist;
+   int err;
+
+   list_for_each_entry(counter, &evlist->entries, node) {
+   err = perf_evsel__append_open(counter, top->evlist->cpus,
+ top->evlist->threads,
+ append_nr, need_realloc);
+
+   if (err == ESRCH) {
+   top->evlist->threads->map[append_nr] = -1;
+   clear_bit(append_nr, top->evlist->threads->bitmap);
+   return;
+   } else if (err < 0)
+   ui__error("append open error: %d\n", errno);
+   }
+}
+
 static void perf_top__start_counters(struct perf_top *top)
 {
struct perf_evsel *counter;
@@ -1174,12 +1205,116 @@ setup:
return 0;
 }
 
+static int perf_top__process_event_fork(struct perf_tool *tool __maybe_unused,
+union perf_event *event __maybe_unused,
+struct perf_sample *sample __maybe_unused,
+struct machine *machine __maybe_unused)
+{
+   struct perf_top *top = container_of(tool, struct perf_top, tool);
+   pid_t tid = event->fork.tid;
+   pid_t ptid = event->fork.ptid;
+   int append_nr = -1;
+   int thread;
+
+   /*
+* There are 2 same fork events are received while a thread was forked.
+* This may be a kernel bug.
+*/
+   for_each_set_bit(thread, top->evlist->threads->bitmap, PID_MAX_DEFAULT)
+   if (tid == top->evlist->threads->map[thread])
+   return -1;
+
+   for_each_set_bit(thread, top->evlist->threads->bitmap, PID_MAX_DEFAULT) {
+   /*
+* If new thread's parent is not target task, just ignore it.
+*/
+   if (ptid == top->evlist->threads->map[thread]) {
+   bool realloc_need;
+
+   append_nr = thread_map__update(&(top->evlist->threads),
+  tid, &realloc_need);
+   /*
+* Open counters for new thread.
+*/
+  

[PATCH 0/5] perf top: Add the function that make the 'perf top -p $pid' can be aware of the new threads.

2012-12-20 Thread chenggang
This patch set makes 'perf top -p $pid' aware of dynamically forked threads.
The perf top{record} tools are not aware of new threads forked by the target
threads when we use the 'perf top{record} -p $pid' mode. Some critical
structures, such as thread_map, mmap, fd, pollfd and id, are allocated as
fixed-size arrays in the initialization phase. These structures cannot easily
be extended for new threads. Also, for performance reasons, the event inherit
mechanism is disabled in the '-p $pid' mode.
So, these structures should be changed to a flexible form with a low
performance penalty (especially in CPU utilization). A bitmap is a simple
choice. A larger thread_map->map[] can be allocated in the initialization
phase, for example with 32 entries. When the number of new threads exceeds 32,
the size of this array can be doubled by realloc. The bitmap records which
positions in map[] are occupied by a thread, and which position can be used by
the next new thread. I inserted a bitmap field into thread_map, and modified
the related code in thread_map.c, xyarray.c, evlist.c, evsel.c etc.
The fork and exit events (PERF_RECORD_FORK & PERF_RECORD_EXIT) can be caught
while we read events from the existing mmaps. Then we can allocate resources,
open an fd, record the event id, and create an mmap for the newly forked
threads without excessive cost. We can also easily release the related
resources of exited threads.
This function has already been implemented for 'perf top -p $pid' in patch
[5/5] of this patch set. As a next step, 'perf record -p $pid' should be
changed to use the interfaces in evlist & evsel modified by this patch set,
just like 'perf top'.
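
The growth policy described above (start with 32 entries, double with realloc
when full) can be sketched as follows. struct pid_vec and pid_vec_push() are
hypothetical names for this illustration and do not appear in the patch set.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

#define INITIAL_SLOTS 32	/* initial size used as the example in the text */

struct pid_vec {
	pid_t *map;
	size_t nr;	/* slots in use */
	size_t max_nr;	/* slots allocated */
};

/* Append pid, doubling the array when it is full. Return 0, or -1 on
 * allocation failure, in which case the vector is left unchanged. */
int pid_vec_push(struct pid_vec *v, pid_t pid)
{
	assert(v != NULL);
	if (v->nr == v->max_nr) {
		size_t new_max = v->max_nr ? v->max_nr * 2 : INITIAL_SLOTS;
		pid_t *tmp = realloc(v->map, new_max * sizeof(*tmp));

		if (tmp == NULL)
			return -1;
		v->map = tmp;
		v->max_nr = new_max;
	}
	v->map[v->nr++] = pid;
	return 0;
}
```

Assigning the realloc result to a temporary first matters: on failure realloc
returns NULL but leaves the old block valid, so overwriting v->map directly
would leak it.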

Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

chenggang (5):
  perf tools: Add some functions to bitops.h to support more bitmap operations.
  perf tools: Add xyarray__realloc function in xyarray.c to expend xyarray.
  perf tools: Add bitmap field to thread_map to support new threads aware.
  perf tools: Change some interfaces of evlist & evsel to support thread's
  creation and destroy with thread_map's bitmap.
  perf top: Add the function to make 'perf top -p $pid' could be aware of new
forked thread.

 tools/perf/builtin-record.c   |   25 ++-
 tools/perf/builtin-stat.c |7 +-
 tools/perf/builtin-top.c  |  149 +-
 tools/perf/tests/mmap-basic.c |4 +-
 tools/perf/tests/open-syscall-all-cpus.c  |2 +-
 tools/perf/tests/open-syscall-tp-fields.c |3 +-
 tools/perf/tests/open-syscall.c   |3 +-
 tools/perf/tests/perf-record.c|2 +-
 tools/perf/util/evlist.c  |  236 +++--
 tools/perf/util/evlist.h  |   39 +++--
 tools/perf/util/evsel.c   |  164 +---
 tools/perf/util/evsel.h   |   38 +++--
 tools/perf/util/include/linux/bitops.h|   85 +--
 tools/perf/util/python.c  |3 +-
 tools/perf/util/thread_map.c  |  171 +++--
 tools/perf/util/thread_map.h  |8 +
 tools/perf/util/xyarray.c |   26 
 tools/perf/util/xyarray.h |2 +
 18 files changed, 755 insertions(+), 212 deletions(-)

-- 
1.7.9.5



[PATCH 1/5] perf tools: Add some functions to bitops.h to support more bitmap operations.

2012-12-20 Thread chenggang
Add bitmap_copy() & find_first_zero_bit() to 'util/include/linux/bitops.h'.
These functions may be needed if we want to change the thread_map or any other
mechanism to use a bitmap.
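
For reference, the two operations can be written portably, without inline
assembly, roughly as below. This is a sketch under assumed names
(first_zero_bit() and bits_copy() are hypothetical) and is not the
implementation in this patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(n)	(((n) + WORD_BITS - 1) / WORD_BITS)

/* Return the index of the first clear bit among the low nbits bits of
 * map, or nbits if every one of them is set. */
size_t first_zero_bit(const unsigned long *map, size_t nbits)
{
	size_t bit = 0;

	while (bit < nbits) {
		unsigned long w = map[bit / WORD_BITS];

		if (w == ~0UL) {	/* word is full, skip it whole */
			bit += WORD_BITS;
			continue;
		}
		while (w & 1UL) {	/* walk to the lowest clear bit */
			w >>= 1;
			bit++;
		}
		return bit < nbits ? bit : nbits;
	}
	return nbits;
}

/* Copy the words that hold the first nbits bits from src to dst. */
void bits_copy(unsigned long *dst, const unsigned long *src, size_t nbits)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, WORDS_FOR(nbits) * sizeof(unsigned long));
}
```

The copy works on whole words, so both buffers must be sized in words, not
bits; bits beyond nbits in the last word are copied along with the rest.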

Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
---
 tools/perf/util/include/linux/bitops.h |   85 ++--
 1 file changed, 69 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/include/linux/bitops.h b/tools/perf/util/include/linux/bitops.h
index a55d8cf..644504a 100644
--- a/tools/perf/util/include/linux/bitops.h
+++ b/tools/perf/util/include/linux/bitops.h
@@ -4,6 +4,12 @@
 #include 
 #include 
 #include 
+#include 
+
+typedef unsigned long  BITMAP;
+
+#define PID_MAX_DEFAULT 0x8000
+#define CPU_MAX_DEFAULT 0x40
 
 #ifndef __WORDSIZE
 #define __WORDSIZE (__SIZEOF_LONG__ * 8)
@@ -26,36 +32,57 @@
 (bit) < (size);\
 (bit) = find_next_bit((addr), (size), (bit) + 1))
 
-static inline void set_bit(int nr, unsigned long *addr)
+static inline void set_bit(int nr, BITMAP *addr)
 {
addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
 }
 
-static inline void clear_bit(int nr, unsigned long *addr)
+static inline void clear_bit(int nr, BITMAP *addr)
 {
addr[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG));
 }
 
-static __always_inline int test_bit(unsigned int nr, const unsigned long *addr)
+static __always_inline int test_bit(unsigned int nr, const BITMAP *addr)
 {
return ((1UL << (nr % BITS_PER_LONG)) &
-   (((unsigned long *)addr)[nr / BITS_PER_LONG])) != 0;
+   (((BITMAP *)addr)[nr / BITS_PER_LONG])) != 0;
 }
 
-static inline unsigned long hweight_long(unsigned long w)
+static inline BITMAP hweight_long(BITMAP w)
 {
return sizeof(w) == 4 ? hweight32(w) : hweight64(w);
 }
 
+static inline void bitmap_copy(BITMAP *dst, const BITMAP *src,
+  int nbits)
+{
+   int len = BITS_TO_LONGS(nbits) * sizeof(BITMAP);
+   memcpy(dst, src, len);
+}
+
 #define BITOP_WORD(nr) ((nr) / BITS_PER_LONG)
 
+/*
+ * ffz - find first zero bit in word
+ * @word: The word to search
+ *
+ * Undefined if no zero exists, so code should check against ~0UL first.
+ */
+static __always_inline BITMAP ffz(BITMAP word)
+{
+   asm("rep; bsf %1,%0"
+   : "=r" (word)
+   : "r" (~word));
+   return word;
+}
+
 /**
  * __ffs - find first bit in word.
  * @word: The word to search
  *
  * Undefined if no bit exists, so code should check against 0 first.
  */
-static __always_inline unsigned long __ffs(unsigned long word)
+static __always_inline BITMAP __ffs(BITMAP word)
 {
int num = 0;
 
@@ -87,14 +114,40 @@ static __always_inline unsigned long __ffs(unsigned long word)
 }
 
 /*
+ * Find the first cleared bit in a memory region.
+ */
+static inline BITMAP
+find_first_zero_bit(const BITMAP *addr, BITMAP size)
+{
+   const BITMAP *p = addr;
+   BITMAP result = 0;
+   BITMAP tmp;
+
+   while (size & ~(BITS_PER_LONG-1)) {
+   if (~(tmp = *(p++)))
+   goto found;
+   result += BITS_PER_LONG;
+   size -= BITS_PER_LONG;
+   }
+   if (!size)
+   return result;
+
+   tmp = (*p) | (~0UL << size);
+   if (tmp == ~0UL)/* Are any bits zero? */
+   return result + size;   /* Nope. */
+found:
+   return result + __ffs(~tmp);
+}
+
+/*
  * Find the first set bit in a memory region.
  */
-static inline unsigned long
-find_first_bit(const unsigned long *addr, unsigned long size)
+static inline BITMAP
+find_first_bit(const BITMAP *addr, BITMAP size)
 {
-   const unsigned long *p = addr;
-   unsigned long result = 0;
-   unsigned long tmp;
+   const BITMAP *p = addr;
+   BITMAP result = 0;
+   BITMAP tmp;
 
while (size & ~(BITS_PER_LONG-1)) {
if ((tmp = *(p++)))
@@ -115,12 +168,12 @@ found:
 /*
  * Find the next set bit in a memory region.
  */
-static inline unsigned long
-find_next_bit(const unsigned long *addr, unsigned long size, unsigned long offset)
+static inline BITMAP
+find_next_bit(const BITMAP *addr, BITMAP size, BITMAP offset)
 {
-   const unsigned long *p = addr + BITOP_WORD(offset);
-   unsigned long result = offset & ~(BITS_PER_LONG-1);
-   unsigned long tmp;
+   const BITMAP *p = addr + BITOP_WORD(offset);
+   BITMAP result = offset & ~(BITS_PER_LONG-1);
+   BITMAP tmp;
 
if (offset >= size)
return size;
-- 
1.7.9.5
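
The word-by-word search in find_first_zero_bit() is easier to follow outside diff form. The following is a small Python model of the same idea, not the perf code itself. It assumes 64-bit words held in a list of integers, and the constant WORD_MASK is introduced here only for illustration. As in the C version, bits beyond `size` in a partial last word are treated as set, and `size` is returned when no clear bit exists.

```python
BITS_PER_LONG = 64                      # assumption: 64-bit unsigned long
WORD_MASK = (1 << BITS_PER_LONG) - 1    # illustrative helper, not in the patch


def find_first_zero_bit(words, size):
    """Return the index of the first clear bit in the low `size` bits.

    `words` is a list of BITS_PER_LONG-wide integers, lowest word first.
    Returns `size` when every bit in range is set.
    """
    result = 0
    for word in words:
        if result >= size:
            break
        remaining = size - result
        if remaining < BITS_PER_LONG:
            # Partial last word: pretend the bits past the end are set.
            word |= (WORD_MASK << remaining) & WORD_MASK
        inverted = ~word & WORD_MASK
        if inverted:
            # Lowest set bit of the inverted word is the lowest clear bit.
            return result + (inverted & -inverted).bit_length() - 1
        result += BITS_PER_LONG
    return size
```

A caller that wants a free slot would compare the result against `size` before using it, since a full bitmap yields `size` rather than a valid index.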


[PATCH 2/5] perf tools: Add xyarray__realloc function in xyarray.c to expand xyarray.

2012-12-20 Thread chenggang
xyarray__realloc() can be used to extend evsel->fd, evsel->sample_id, or any
other xyarray on the fly.

Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
---
 tools/perf/util/xyarray.c |   26 ++
 tools/perf/util/xyarray.h |2 ++
 2 files changed, 28 insertions(+)

diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c
index 22afbf6..4e76377 100644
--- a/tools/perf/util/xyarray.c
+++ b/tools/perf/util/xyarray.c
@@ -18,3 +18,29 @@ void xyarray__delete(struct xyarray *xy)
 {
free(xy);
 }
+
+int xyarray__realloc(struct xyarray **xy_old, int xlen_old, int xlen_new,
+ int ylen_new)
+{
+   size_t row_size_new = ylen_new * (*xy_old)->entry_size;
+   struct xyarray *xy_new = zalloc(sizeof(*xy_new) + xlen_new
+   * row_size_new);
+   int x;
+
+   if (xy_new != NULL) {
+   for (x = 0; x < xlen_old; x++)
+   memcpy(&xy_new->contents[x * row_size_new],
+  &((*xy_old)->contents[x * (*xy_old)->row_size]),
+  (*xy_old)->row_size);
+
+   xy_new->row_size = row_size_new;
+   xy_new->entry_size = (*xy_old)->entry_size;
+
+   xyarray__delete(*xy_old);
+
+   *xy_old = xy_new;
+
+   return 0;
+   }
+
+   return -1;
+}
+
diff --git a/tools/perf/util/xyarray.h b/tools/perf/util/xyarray.h
index c488a07..ad41649 100644
--- a/tools/perf/util/xyarray.h
+++ b/tools/perf/util/xyarray.h
@@ -11,6 +11,8 @@ struct xyarray {
 
 struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size);
 void xyarray__delete(struct xyarray *xy);
+int xyarray__realloc(struct xyarray **xy_old, int xlen_old, int xlen_new,
+int ylen_new);
 
 static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
 {
-- 
1.7.9.5
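
The row-by-row copy in xyarray__realloc() can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than the perf implementation: the flat `contents` buffer is a bytearray, and the function name and parameters are chosen here for illustration. The patch copies a full old row into each new row, which presumes the new rows are at least as long as the old ones; the `min()` clamps below are an addition of this sketch so that shrinking does not overrun the new buffer.

```python
def xyarray_realloc(contents, xlen_old, row_size_old, xlen_new, row_size_new):
    """Return a new zero-filled buffer with the old rows copied across.

    Row x of the old buffer starts at x * row_size_old; in the new buffer
    it starts at x * row_size_new, so each row must be moved separately.
    """
    new_contents = bytearray(xlen_new * row_size_new)   # zalloc() equivalent
    copy_len = min(row_size_old, row_size_new)          # clamp: sketch only
    for x in range(min(xlen_old, xlen_new)):            # clamp: sketch only
        src = x * row_size_old
        dst = x * row_size_new
        new_contents[dst:dst + copy_len] = contents[src:src + copy_len]
    return new_contents
```

Growing a 3x2 array to 4x3 this way leaves one zero byte of padding at the end of each copied row and one fully zeroed extra row.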



[PATCH 4/5] perf tools: Change some interfaces of evlist & evsel to support thread creation and destruction with thread_map's bitmap.

2012-12-20 Thread chenggang
Building on [PATCH 3/5], this patch changes the related interfaces in evlist and
evsel to support operations on thread_map's bitmap. We can then use these
interfaces to insert a newly forked thread into, or remove an exited thread from,
thread_map and the other related data structures.
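
As a rough illustration of the bookkeeping described above, here is a Python model of a bitmap whose set bits mark live entries. It is a sketch, not the perf data structure: the class name ThreadBitmap is hypothetical, and what a bit index stands for (a thread_map slot, for example) is defined by [PATCH 3/5], which is not shown here. The method names mirror set_bit(), clear_bit(), test_bit() and for_each_set_bit() from bitops.h.

```python
BITS_PER_LONG = 64          # assumption: 64-bit unsigned long
PID_MAX_DEFAULT = 0x8000    # same value this series defines in bitops.h


class ThreadBitmap:
    """Hypothetical model: one bit per slot, set while the slot is in use."""

    def __init__(self, nbits=PID_MAX_DEFAULT):
        self.nbits = nbits
        self.words = [0] * ((nbits + BITS_PER_LONG - 1) // BITS_PER_LONG)

    def set_bit(self, nr):
        # Called when a new thread is inserted.
        self.words[nr // BITS_PER_LONG] |= 1 << (nr % BITS_PER_LONG)

    def clear_bit(self, nr):
        # Called when an exited thread is removed.
        self.words[nr // BITS_PER_LONG] &= ~(1 << (nr % BITS_PER_LONG))

    def test_bit(self, nr):
        return bool((self.words[nr // BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1)

    def for_each_set_bit(self):
        # Visit only the live slots, in increasing order.
        for nr in range(self.nbits):
            if self.test_bit(nr):
                yield nr
```

Iterating with for_each_set_bit() skips the holes left by exited threads, which is the reason a plain `for (i = 0; i < nr; i++)` loop no longer suffices once entries can be removed.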

Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
---
 tools/perf/builtin-record.c   |   25 ++-
 tools/perf/builtin-stat.c |7 +-
 tools/perf/builtin-top.c  |   14 +-
 tools/perf/tests/mmap-basic.c |4 +-
 tools/perf/tests/open-syscall-all-cpus.c  |2 +-
 tools/perf/tests/open-syscall-tp-fields.c |3 +-
 tools/perf/tests/open-syscall.c   |3 +-
 tools/perf/tests/perf-record.c|2 +-
 tools/perf/util/evlist.c  |  236 +++--
 tools/perf/util/evlist.h  |   39 +++--
 tools/perf/util/evsel.c   |  147 +++---
 tools/perf/util/evsel.h   |   38 +++--
 tools/perf/util/python.c  |3 +-
 13 files changed, 408 insertions(+), 115 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f3151d3..277303f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -359,7 +359,7 @@ try_again:
goto out;
}
 
-   if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
+   if (perf_evlist__mmap(evlist, opts->mmap_pages, false, -1, false) < 0) {
if (errno == EPERM) {
pr_err("Permission error mapping pages.\n"
   "Consider increasing "
@@ -472,12 +472,21 @@ static int perf_record__mmap_read_all(struct perf_record *rec)
int i;
int rc = 0;
 
-   for (i = 0; i < rec->evlist->nr_mmaps; i++) {
-   if (rec->evlist->mmap[i].base) {
-   if (perf_record__mmap_read(rec, &rec->evlist->mmap[i]) != 0) {
-   rc = -1;
-   goto out;
-   }
+   if (cpu_map__all(rec->evlist->cpus)) {
+   for_each_set_bit(i, rec->evlist->threads->bitmap,
+PID_MAX_DEFAULT) {
+   if (rec->evlist->mmap[i].base)
+   if (perf_record__mmap_read(rec,
+   &rec->evlist->mmap[i]) != 0){
+   rc = -1;
+   goto out;
+   }
+   }
+   } else {
+   for (i = 0; i < rec->evlist->nr_mmaps; i++) {
+   if (rec->evlist->mmap[i].base)
+   if (perf_record__mmap_read(rec,
+   &rec->evlist->mmap[i]) != 0) {
+   rc = -1;
+   goto out;
+   }
}
}
 
@@ -1161,7 +1170,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
err = -EINVAL;
goto out_free_fd;
}
-
+
err = __cmd_record(&record, argc, argv);
 out_free_fd:
perf_evlist__delete_maps(evsel_list);
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c247fac..74d5311 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -229,7 +229,7 @@ static int read_counter_aggr(struct perf_evsel *counter)
int i;
 
if (__perf_evsel__read(counter, perf_evsel__nr_cpus(counter),
-  evsel_list->threads->nr, scale) < 0)
+  evsel_list->threads->bitmap, scale) < 0)
return -1;
 
for (i = 0; i < 3; i++)
@@ -394,13 +394,14 @@ static int __run_perf_stat(int argc __maybe_unused, const char **argv)
if (no_aggr) {
list_for_each_entry(counter, &evsel_list->entries, node) {
read_counter(counter);
-   perf_evsel__close_fd(counter, perf_evsel__nr_cpus(counter), 1);
+   perf_evsel__close_fd(counter,
+perf_evsel__nr_cpus(counter),
+evsel_list->threads->bitmap);
}
} else {
list_for_each_entry(counter, &evsel_list->entries, node) {
read_counter_aggr(counter);
perf_evsel__close_fd(counter, 
perf_e

[PATCH v3] Add 4 tracepoint events for vfs

2013-01-30 Thread chenggang . qin
From: chenggang@gmail.com

This version changes some type definitions according to Steven's advice.
Thanks, Steven.

If engineers want to analyze the file access behavior of applications
without source code, perf tools with appropriate tracepoint events in the
VFS subsystem are an excellent choice.

System engineers and developers of server software need to know which
files are accessed by the target processes within a period of time, so that
they can find the hot applications and the hot files. For this requirement, we
added 2 tracepoint events at the beginning of generic_file_aio_read() and
generic_file_aio_write().

Many database systems use their own page cache subsystems and use direct IO
to access the disks. Sometimes, system engineers want to know the miss
rate of the database system's page cache. This requirement can be satisfied by
recording the database's file accesses made through direct IO. So,
we added 2 tracepoint events at the direct IO branch in generic_file_aio_read()
and generic_file_aio_write().

Then, we will extend perf's functionality with a Python script that uses these
new tracepoint events.

The 4 new tracepoint events are:
1) generic_file_aio_read
   Format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;
        field:int common_padding;       offset:8;       size:4; signed:1;

        field:long long pos;    offset:16;      size:8; signed:1;
        field:unsigned long bytes;      offset:24;      size:8; signed:0;
        field:__data_loc char[] fname;  offset:32;      size:4; signed:1;

2) generic_file_aio_write
   Format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;
        field:int common_padding;       offset:8;       size:4; signed:1;

        field:long long pos;    offset:16;      size:8; signed:1;
        field:unsigned long bytes;      offset:24;      size:8; signed:0;
        field:__data_loc char[] fname;  offset:32;      size:4; signed:1;

3) direct_io_read
   Format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;
        field:int common_padding;       offset:8;       size:4; signed:1;

        field:long long pos;    offset:16;      size:8; signed:1;
        field:unsigned long bytes;      offset:24;      size:8; signed:0;
        field:unsigned char fname[100]; offset:32;      size:100;       signed:0;

4) direct_io_write
   Format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;
        field:int common_padding;       offset:8;       size:4; signed:1;

        field:long long pos;    offset:16;      size:8; signed:1;
        field:unsigned long bytes;      offset:24;      size:8; signed:0;
        field:unsigned char fname[100]; offset:32;      size:100;       signed:0;
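
For readers who want to consume these format listings programmatically, the following is a minimal Python sketch that turns the `field:...; offset:...; size:...; signed:...;` entries into dictionaries. The function name, the regular expression and the SAMPLE text are illustrative assumptions; the sample reuses three of the field lines listed above.

```python
import re

# One entry looks like: field:<C declaration>; offset:N; size:N; signed:0|1;
FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*"
    r"offset:(?P<offset>\d+);\s*"
    r"size:(?P<size>\d+);\s*"
    r"signed:(?P<signed>[01]);"
)


def parse_format_fields(text):
    """Return one dict per field entry found in `text`."""
    fields = []
    for m in FIELD_RE.finditer(text):
        # The last space-separated token of the declaration is the name.
        ctype, _, name = m.group("decl").strip().rpartition(" ")
        fields.append({
            "type": ctype,
            "name": name,
            "offset": int(m.group("offset")),
            "size": int(m.group("size")),
            "signed": m.group("signed") == "1",
        })
    return fields


SAMPLE = """\
field:long long pos;\toffset:16;\tsize:8;\tsigned:1;
field:unsigned long bytes;\toffset:24;\tsize:8;\tsigned:0;
field:unsigned char fname[100];\toffset:32;\tsize:100;\tsigned:0;
"""
```

Because the pattern allows arbitrary whitespace between the four parts, it also copes with entries whose `signed:` part has wrapped onto the next line.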

Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 include/trace/events/vfs.h |   62 
 mm/filemap.c   |   18 +
 2 files changed, 80 insertions(+)
 create mode 100644 include/trace/events/vfs.h

diff --git a/include/trace/events/vfs.h b/include/trace/events/vfs.h
new file mode 100644
index 000..11c9acc
--- /dev/null
+++ b/include/trace/events/vfs.h
@@ -0,0 +1,62 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM vfs
+#define TRACE_INCLUDE_FILE vfs
+
+#if !defined(_TRACE_EVENTS_VFS_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_EVENTS_VFS_H
+
+#include 
+
+#include 
+
+DECLARE_EVENT_CLASS(vfs_filerw_template,
+
+   TP_PROTO(long long pos, unsigned long bytes, const unsigned char *fname),
+
+   TP_ARGS(pos, bytes, fname),
+
+TP_STRUCT__entry(
+   __field(long long,  pos )
+  

[PATCH] perf script: Add a python script to collect statistics on direct IO behavior

2013-01-31 Thread chenggang . qin
From: chenggang@gmail.com

This patch depends on a previous patch: https://lkml.org/lkml/2013/1/29/47

If engineers want to analyze the direct IO behavior of applications
without source code, perf tools with appropriate tracepoint events in the
VFS subsystem are an excellent choice.

Many database systems use their own page cache subsystems and use direct IO
to access the disks. Sometimes, system engineers need to know the miss rate
of the database system's page cache. This requirement can be satisfied by
recording the database's file accesses made through direct IO. So, we use 2
tracepoint events to record system-wide direct IO behavior. The 2 tracepoint
events are:
1) vfs:direct_io_read
2) vfs:direct_io_write
They were introduced by the patch: https://lkml.org/lkml/2013/1/29/47
The script direct-io.py introduced by this patch records the 2 tracepoint
events, analyses the sample data, and gives a concise report.

usage:
"perf script record direct-io\n"
"perf script report direct-io [comm|pid]\n"
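
The kind of per-process, per-file summary such a report produces can be sketched as follows. This is an illustrative Python 3 model, not the direct-io.py script from this patch: the event dictionaries, their keys and the function name are assumptions made for the example, while the optional comm/pid filter follows the `[comm|pid]` argument described above.

```python
from collections import defaultdict


def summarize_direct_io(events, for_comm=None, for_pid=None):
    """Count events and bytes per (comm, pid, fname, direction).

    `events` is an iterable of dicts with the hypothetical keys
    "comm", "pid", "fname", "rw" ("read" or "write") and "bytes".
    """
    totals = defaultdict(lambda: {"count": 0, "bytes": 0})
    for ev in events:
        # Apply the optional filter given on the command line.
        if for_comm is not None and ev["comm"] != for_comm:
            continue
        if for_pid is not None and ev["pid"] != for_pid:
            continue
        key = (ev["comm"], ev["pid"], ev["fname"], ev["rw"])
        totals[key]["count"] += 1
        totals[key]["bytes"] += ev["bytes"]
    return dict(totals)
```

Keying on the file name as well as the process makes it possible to see both which processes issue direct IO and which files they touch.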

Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
---
 tools/perf/scripts/python/bin/direct-io-record |2 +
 tools/perf/scripts/python/bin/direct-io-report |   21 +++
 tools/perf/scripts/python/direct-io.py |  185 
 3 files changed, 208 insertions(+)
 create mode 100755 tools/perf/scripts/python/bin/direct-io-record
 create mode 100644 tools/perf/scripts/python/bin/direct-io-report
 create mode 100644 tools/perf/scripts/python/direct-io.py

diff --git a/tools/perf/scripts/python/bin/direct-io-record b/tools/perf/scripts/python/bin/direct-io-record
new file mode 100755
index 000..4857097
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-record
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf record -e vfs:direct_io_read -e vfs:direct_io_write $@
diff --git a/tools/perf/scripts/python/bin/direct-io-report b/tools/perf/scripts/python/bin/direct-io-report
new file mode 100644
index 000..828d9c6
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-report
@@ -0,0 +1,21 @@
+#!/bin/bash
+# description: direct_io statistic
+# args: [comm|pid]
+n_args=0
+for i in "$@"
+do
+if expr match "$i" "-" > /dev/null ; then
+   break
+fi
+n_args=$(( $n_args + 1 ))
+done
+if [ "$n_args" -gt 1 ] ; then
+echo "usage: perf script report direct-io [comm|pid]"
+exit
+fi
+
+if [ "$n_args" -gt 0 ] ; then
+comm=$1
+shift
+fi
+perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/direct-io.py $comm
diff --git a/tools/perf/scripts/python/direct-io.py b/tools/perf/scripts/python/direct-io.py
new file mode 100644
index 000..321ff8e
--- /dev/null
+++ b/tools/perf/scripts/python/direct-io.py
@@ -0,0 +1,185 @@
+# direct IO counts
+# (c) 2013, Chenggang Qin 
+# Licensed under the terms of the GNU GPL License version 2
+
+# Displays system-wide file direct IO behavior.
+# It helps us to investigate which processes trigger a direct IO,
+# and what files are accessed by these processes.
+#
+# options
+# comm, pid: show details of the file r/w behavior of a special process.
+
+import os, sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+   '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+usage = "perf script record direct-io\n" \
+   "perf script report direct-io [comm|pid]\n"
+
+for_comm = None
+for_pid = None
+pid_2_comm = None
+
+if len(sys.argv) > 2:
+   sys.exit(usage)
+
+if len(sys.argv) > 1:
+   try:
+   for_pid = int(sys.argv[1])
+   except:
+   for_comm = sys.argv[1]
+
+file_write = autodict()
+file_read = autodict()
+
+file_write_bytes = autodict()
+file_read_bytes = autodict()
+
+comm_read_info = autodict()
+comm_write_info = autodict()
+
+wevent_count = 0
+revent_count = 0
+
+comm_revent_count = 0;
+comm_wevent_count = 0;
+
+def trace_begin():
+   print "Press control+C to stop and show the summary"
+
+def trace_end():
+   if (for_comm is not None) or (for_pid is not None):
+   print_direct_io_event_for_comm()
+   else:
+   print_direct_io_event_totals()
+
+def vfs__direct_io_write(event_name, context, common_cpu,
+   common_secs, common_nsecs, common_pid, common_comm,
+   pos, bytes, fname):
+   global wevent_count
+   global comm_wevent_count
+   global pid_2_comm
+
+   if (for_comm is not None) or (for_pid is not None):
+   if (common_comm != for_comm) and (common_pid != for_pid):
+   

linux-kernel@vger.kernel.org

2013-02-01 Thread chenggang . qin
From: chenggang 

Yesterday, I implemented these tracepoint events in the VFS subsystem.
That was not a good idea.
Now, I have modified two existing tracepoint events in the ext4 subsystem to
provide the same functionality.

Many database systems use their own page cache subsystems and use direct IO
to access the disks. Sometimes, system engineers want to know the miss
rate of the database system's page cache. They also need to know which files
are accessed by the target processes through direct IO. These requirements
can be satisfied by recording the database's file accesses made through
direct IO. So, we add the file name as a parameter of the tracepoint events
ext4:ext4_direct_IO_enter and ext4:ext4_direct_IO_exit.

Then, we will extend perf or blktrace to use these tracepoint
events.
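
To show how the extra field could be consumed, here is a small Python sketch that parses one text line in the layout of the modified ext4_direct_IO_enter TP_printk format in this patch ("dev %d,%d ino %lu pos %lld len %lu rw %d fname %s"). The function name and the regular expression are illustrative assumptions; only the field order is taken from the patch.

```python
import re

# Mirrors: "dev %d,%d ino %lu pos %lld len %lu rw %d fname %s"
ENTER_RE = re.compile(
    r"dev (?P<major>\d+),(?P<minor>\d+) ino (?P<ino>\d+) "
    r"pos (?P<pos>-?\d+) len (?P<len>\d+) rw (?P<rw>-?\d+) "
    r"fname (?P<fname>.*)$"
)


def parse_direct_io_enter(line):
    """Return the fields of one event line as a dict, or None if no match."""
    m = ENTER_RE.search(line.rstrip("\n"))
    if m is None:
        return None
    fields = {k: int(v) for k, v in m.groupdict().items() if k != "fname"}
    # fname is last in the format, so it may safely contain spaces.
    fields["fname"] = m.group("fname")
    return fields
```

Placing the string at the end of the format is what allows a file name with embedded spaces to be recovered unambiguously.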

Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Signed-off-by: Chenggang Qin 

---
 fs/ext4/inode.c |7 +--
 include/trace/events/ext4.h |   22 ++
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index cbfe13b..92a379f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3202,6 +3202,7 @@ static ssize_t ext4_direct_IO(int rw, struct kiocb *iocb,
struct file *file = iocb->ki_filp;
struct inode *inode = file->f_mapping->host;
ssize_t ret;
+   const unsigned char *fname;
 
/*
 * If we are doing data journalling we don't support O_DIRECT
@@ -3213,13 +3214,15 @@ static ssize_t ext4_direct_IO(int rw, struct kiocb *iocb,
if (ext4_has_inline_data(inode))
return 0;
 
-   trace_ext4_direct_IO_enter(inode, offset, iov_length(iov, nr_segs), rw);
+   fname = file->f_path.dentry->d_name.name;
+   trace_ext4_direct_IO_enter(inode, offset, iov_length(iov, nr_segs), rw,
+  fname);
if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
ret = ext4_ext_direct_IO(rw, iocb, iov, offset, nr_segs);
else
ret = ext4_ind_direct_IO(rw, iocb, iov, offset, nr_segs);
trace_ext4_direct_IO_exit(inode, offset,
-   iov_length(iov, nr_segs), rw, ret);
+   iov_length(iov, nr_segs), rw, ret, fname);
return ret;
 }
 
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 7e8c36b..532bbb4 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -1211,9 +1211,10 @@ DEFINE_EVENT(ext4__bitmap_load, ext4_load_inode_bitmap,
 );
 
 TRACE_EVENT(ext4_direct_IO_enter,
-   TP_PROTO(struct inode *inode, loff_t offset, unsigned long len, int rw),
+   TP_PROTO(struct inode *inode, loff_t offset, unsigned long len, int rw,
+const unsigned char *fname),
 
-   TP_ARGS(inode, offset, len, rw),
+   TP_ARGS(inode, offset, len, rw, fname),
 
TP_STRUCT__entry(
__field(dev_t,  dev )
@@ -1221,6 +1222,7 @@ TRACE_EVENT(ext4_direct_IO_enter,
__field(loff_t, pos )
__field(unsigned long,  len )
__field(int,rw  )
+   __string(   fname,  fname   )
),
 
TP_fast_assign(
@@ -1229,19 +1231,20 @@ TRACE_EVENT(ext4_direct_IO_enter,
__entry->pos= offset;
__entry->len= len;
__entry->rw = rw;
+   __assign_str(fname, fname);
),
 
-   TP_printk("dev %d,%d ino %lu pos %lld len %lu rw %d",
+   TP_printk("dev %d,%d ino %lu pos %lld len %lu rw %d fname %s",
  MAJOR(__entry->dev), MINOR(__entry->dev),
  (unsigned long) __entry->ino,
- __entry->pos, __entry->len, __entry->rw)
+ __entry->pos, __entry->len, __entry->rw, __get_str(fname))
 );
 
 TRACE_EVENT(ext4_direct_IO_exit,
TP_PROTO(struct inode *inode, loff_t offset, unsigned long len,
-int rw, int ret),
+int rw, int ret, const unsigned char *fname),
 
-   TP_ARGS(inode, offset, len, rw, ret),
+   TP_ARGS(inode, offset, len, rw, ret, fname),
 
TP_STRUCT__entry(
__field(dev_t,  dev )
@@ -1250,6 +1253,7 @@ TRACE_EVENT(ext4_direct_IO_exit,
__field(unsigned long,  len )
__field(int,rw  )
__field(int,ret )
+   __string(   fname,  fname   )
),
 
TP_fast_assign(
@@ -1259,13 +1263,15 @@ TRACE_EVENT(ext4_direct_IO_exit,
__entry->len= len;
  

[PATCH] perf core: Fix a bug that leads to mmap() & munmap() mismatch

2013-08-29 Thread Chenggang Qin
From: Chenggang Qin 

In filename__read_debuglink(), after elf_begin() has mmapped the dso file,
execution may jump to "out_close". In that case elf_end() is skipped and
munmap() is never executed.

When perf runs for a long time, the files that are never munmapped consume
a large amount of memory.

This patch fixes the bug.

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Frederic Weisbecker 
Cc: Mike Galbraith 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Arnaldo Carvalho de Melo 
Signed-off-by: Chenggang Qin 
---
 tools/perf/util/symbol-elf.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4b12bf8..b4df870 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -471,27 +471,27 @@ int filename__read_debuglink(const char *filename, char *debuglink,
 
ek = elf_kind(elf);
if (ek != ELF_K_ELF)
-   goto out_close;
+   goto out_elf_end;
 
if (gelf_getehdr(elf, &ehdr) == NULL) {
pr_err("%s: cannot get elf header.\n", __func__);
-   goto out_close;
+   goto out_elf_end;
}
 
sec = elf_section_by_name(elf, &ehdr, &shdr,
  ".gnu_debuglink", NULL);
if (sec == NULL)
-   goto out_close;
+   goto out_elf_end;
 
data = elf_getdata(sec, NULL);
if (data == NULL)
-   goto out_close;
+   goto out_elf_end;
 
/* the start of this section is a zero-terminated string */
strncpy(debuglink, data->d_buf, size);
 
+out_elf_end:
elf_end(elf);
-
 out_close:
close(fd);
 out:
-- 
1.7.8.rc2.5.g815b
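
The rule behind this fix is that every early exit taken after the mapping exists must still release the mapping before the file descriptor is closed. The Python sketch below models that ordering with nested try/finally blocks and the standard mmap module. It is an analogy to the C goto labels, not perf code: the function names and the `log` parameter are hypothetical and exist only to make the release order visible.

```python
import mmap
import os
import tempfile


def has_magic(path, magic, log):
    """Check whether the file at `path` starts with `magic`.

    `log` records the acquire/release steps in order.
    """
    fd = os.open(path, os.O_RDONLY)
    log.append("open")
    try:
        mm = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
        log.append("mmap")
        try:
            if mm[:len(magic)] != magic:
                return False        # early exit: the mapping is still released
            return True
        finally:
            mm.close()              # plays the role of out_elf_end
            log.append("munmap")
    finally:
        os.close(fd)                # plays the role of out_close
        log.append("close")


def make_sample_file(data):
    """Write `data` to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(data)
        return f.name
```

Both the success path and the early-exit path leave the same open, mmap, munmap, close trace, which is the property the relabelled gotos restore.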



[PATCH 1/2] perf core: avoid traversing the dsos list when finding the vdso

2013-09-04 Thread Chenggang Qin
From: Chenggang Qin 

There is only one vdso in a system. It is not necessary to traverse the
machine->user_dsos list when looking for the dso of the vdso.
The flag vdso_found should be replaced by a pointer that points to the dso of
the vdso. If the pointer is NULL, the dso of the vdso has not been created yet.
Otherwise, the pointer can be returned directly from vdso__dso_findnew().
This avoids the list traversal.
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/vdso.c |   22 --
 1 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c
index 3915982..8022ef0 100644
--- a/tools/perf/util/vdso.c
+++ b/tools/perf/util/vdso.c
@@ -13,7 +13,7 @@
 #include "symbol.h"
 #include "linux/string.h"
 
-static bool vdso_found;
+static struct dso *vdso_dso = NULL;
 static char vdso_file[] = "/tmp/perf-vdso.so-XX";
 
 static int find_vdso_map(void **start, void **end)
@@ -55,9 +55,6 @@ static char *get_file(void)
size_t size;
int fd;
 
-   if (vdso_found)
-   return vdso_file;
-
if (find_vdso_map(&start, &end))
return NULL;
 
@@ -79,33 +76,30 @@ static char *get_file(void)
  out:
free(buf);
 
-   vdso_found = (vdso != NULL);
return vdso;
 }
 
 void vdso__exit(void)
 {
-   if (vdso_found)
+   if (vdso_dso)
unlink(vdso_file);
 }
 
 struct dso *vdso__dso_findnew(struct list_head *head)
 {
-   struct dso *dso = dsos__find(head, VDSO__MAP_NAME, true);
-
-   if (!dso) {
+   if (!vdso_dso) {
char *file;
 
file = get_file();
if (!file)
return NULL;
 
-   dso = dso__new(VDSO__MAP_NAME);
-   if (dso != NULL) {
-   dsos__add(head, dso);
-   dso__set_long_name(dso, file);
+   vdso_dso = dso__new(VDSO__MAP_NAME);
+   if (vdso_dso != NULL) {
+   dsos__add(head, vdso_dso);
+   dso__set_long_name(vdso_dso, file);
}
}
 
-   return dso;
+   return vdso_dso;
 }
-- 
1.7.8.rc2.5.g815b
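
The change from a boolean flag to a cached pointer can be modelled compactly in Python. This is a sketch of the idea only: the class name VdsoCache and the `factory` callable (standing in for get_file() followed by dso__new()) are hypothetical, and a single cache shared by every list is an assumption carried over from the single static pointer in the patch.

```python
class VdsoCache:
    """Hypothetical model: remember the one vdso object after first creation."""

    def __init__(self, factory):
        self._factory = factory   # stands in for get_file() + dso__new()
        self._dso = None          # plays the role of the static vdso_dso pointer

    def findnew(self, dsos):
        """Return the cached object, creating and registering it once."""
        if self._dso is None:
            dso = self._factory()
            if dso is None:
                return None       # creation failed; try again on the next call
            dsos.append(dso)      # dsos__add() equivalent
            self._dso = dso
        return self._dso
```

After the first successful call, every later lookup is a single comparison against None instead of a walk over the whole list.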



[PATCH 2/2] perf core: remove short name compare in dsos__find()

2013-09-04 Thread Chenggang Qin
From: Chenggang Qin 

Once the list traversal is avoided by the previous patch, the short name
comparison in dsos__find() is unnecessary. The only purpose of the short name
comparison is to find the dso of the vdso. If the vdso can be found through a
pointer, the short name comparison can be removed.
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/dso.c |   10 ++
 tools/perf/util/dso.h |3 +--
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index c4374f0..6f7d5a9 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -513,16 +513,10 @@ void dsos__add(struct list_head *head, struct dso *dso)
list_add_tail(&dso->node, head);
 }
 
-struct dso *dsos__find(struct list_head *head, const char *name, bool cmp_short)
+struct dso *dsos__find(struct list_head *head, const char *name)
 {
struct dso *pos;
 
-   if (cmp_short) {
-   list_for_each_entry(pos, head, node)
-   if (strcmp(pos->short_name, name) == 0)
-   return pos;
-   return NULL;
-   }
list_for_each_entry(pos, head, node)
if (strcmp(pos->long_name, name) == 0)
return pos;
@@ -531,7 +525,7 @@ struct dso *dsos__find(struct list_head *head, const char *name, bool cmp_short)
 
 struct dso *__dsos__findnew(struct list_head *head, const char *name)
 {
-   struct dso *dso = dsos__find(head, name, false);
+   struct dso *dso = dsos__find(head, name);
 
if (!dso) {
dso = dso__new(name);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index d51aaf2..450199a 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -133,8 +133,7 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
const char *short_name, int dso_type);
 
 void dsos__add(struct list_head *head, struct dso *dso);
-struct dso *dsos__find(struct list_head *head, const char *name,
-  bool cmp_short);
+struct dso *dsos__find(struct list_head *head, const char *name);
 struct dso *__dsos__findnew(struct list_head *head, const char *name);
 bool __dsos__read_build_ids(struct list_head *head, bool with_hits);
 
-- 
1.7.8.rc2.5.g815b



[PATCH 2/2] perf tools: remove short name compare in dsos__find()

2013-09-09 Thread Chenggang Qin
From: Chenggang Qin 

Once the list traversal is avoided by the previous patch, the short name
comparison in dsos__find() is unnecessary. The only purpose of the short name
comparison is to find the dso of the vdso. If the vdso can be found through a
pointer, the short name comparison can be removed.
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/dso.c |   10 ++
 tools/perf/util/dso.h |3 +--
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index c4374f0..6f7d5a9 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -513,16 +513,10 @@ void dsos__add(struct list_head *head, struct dso *dso)
list_add_tail(&dso->node, head);
 }
 
-struct dso *dsos__find(struct list_head *head, const char *name, bool cmp_short)
+struct dso *dsos__find(struct list_head *head, const char *name)
 {
struct dso *pos;
 
-   if (cmp_short) {
-   list_for_each_entry(pos, head, node)
-   if (strcmp(pos->short_name, name) == 0)
-   return pos;
-   return NULL;
-   }
list_for_each_entry(pos, head, node)
if (strcmp(pos->long_name, name) == 0)
return pos;
@@ -531,7 +525,7 @@ struct dso *dsos__find(struct list_head *head, const char *name, bool cmp_short)
 
 struct dso *__dsos__findnew(struct list_head *head, const char *name)
 {
-   struct dso *dso = dsos__find(head, name, false);
+   struct dso *dso = dsos__find(head, name);
 
if (!dso) {
dso = dso__new(name);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index d51aaf2..450199a 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -133,8 +133,7 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
const char *short_name, int dso_type);
 
 void dsos__add(struct list_head *head, struct dso *dso);
-struct dso *dsos__find(struct list_head *head, const char *name,
-  bool cmp_short);
+struct dso *dsos__find(struct list_head *head, const char *name);
 struct dso *__dsos__findnew(struct list_head *head, const char *name);
 bool __dsos__read_build_ids(struct list_head *head, bool with_hits);
 
-- 
1.7.8.rc2.5.g815b



[PATCH 1/2] perf tools: avoid traversing the dsos list when finding the vdso

2013-09-09 Thread Chenggang Qin
From: Chenggang Qin 

There is only one vdso in a system. It is not necessary to traverse the
machine->user_dsos list when looking for the dso of the vdso.
The flag vdso_found should be replaced by a pointer that points to the dso of
the vdso. If the pointer is NULL, the dso of the vdso has not been created yet.
Otherwise, the pointer can be returned directly from vdso__dso_findnew().
This avoids the list traversal.
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/vdso.c |   22 --
 1 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c
index 3915982..8022ef0 100644
--- a/tools/perf/util/vdso.c
+++ b/tools/perf/util/vdso.c
@@ -13,7 +13,7 @@
 #include "symbol.h"
 #include "linux/string.h"
 
-static bool vdso_found;
+static struct dso *vdso_dso = NULL;
 static char vdso_file[] = "/tmp/perf-vdso.so-XX";
 
 static int find_vdso_map(void **start, void **end)
@@ -55,9 +55,6 @@ static char *get_file(void)
size_t size;
int fd;
 
-   if (vdso_found)
-   return vdso_file;
-
if (find_vdso_map(&start, &end))
return NULL;
 
@@ -79,33 +76,30 @@ static char *get_file(void)
  out:
free(buf);
 
-   vdso_found = (vdso != NULL);
return vdso;
 }
 
 void vdso__exit(void)
 {
-   if (vdso_found)
+   if (vdso_dso)
unlink(vdso_file);
 }
 
 struct dso *vdso__dso_findnew(struct list_head *head)
 {
-   struct dso *dso = dsos__find(head, VDSO__MAP_NAME, true);
-
-   if (!dso) {
+   if (!vdso_dso) {
char *file;
 
file = get_file();
if (!file)
return NULL;
 
-   dso = dso__new(VDSO__MAP_NAME);
-   if (dso != NULL) {
-   dsos__add(head, dso);
-   dso__set_long_name(dso, file);
+   vdso_dso = dso__new(VDSO__MAP_NAME);
+   if (vdso_dso != NULL) {
+   dsos__add(head, vdso_dso);
+   dso__set_long_name(vdso_dso, file);
}
}
 
-   return dso;
+   return vdso_dso;
 }
-- 
1.7.8.rc2.5.g815b



[PATCH 2/3] perf core: Fix a mmap & munmap mismatches bug in dso__load

2013-10-10 Thread Chenggang Qin
From: Chenggang Qin 

Some dsos' symsrc is used as neither syms_ss nor runtime_ss. In this situation,
the corresponding ELF file is opened and mmapped in symsrc__init(), but it is
never closed and munmapped anywhere.
This leads to mismatched mmap and munmap calls: the mmapped areas persist for
the lifetime of perf, which is effectively a memory leak.
This patch fixes the bug by calling symsrc__destroy() when the opened and
mmapped ELF file has none of the symtab, dynsym and opdsec sections.
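The ownership rule behind the fix can be sketched as follows. This is a
simplified model, not the dso__load() code: struct toy_src and the toy_*
functions are hypothetical, and a malloc'ed buffer stands in for the mmapped
ELF image.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a symsrc: owns one heap buffer. */
struct toy_src {
	void *mapping;      /* stand-in for the mmapped ELF image */
	bool has_symtab;
};

static int toy_live_mappings;   /* acquired minus released */

int toy_src_init(struct toy_src *ss, bool has_symtab)
{
	ss->mapping = malloc(64);
	if (!ss->mapping)
		return -1;
	ss->has_symtab = has_symtab;
	toy_live_mappings++;
	return 0;
}

void toy_src_destroy(struct toy_src *ss)
{
	free(ss->mapping);
	ss->mapping = NULL;
	toy_live_mappings--;
}

/*
 * Try each candidate in turn. The first usable one is handed to the caller
 * through *kept; every candidate that is not kept is destroyed on the spot.
 * Returns the number of sources kept (0 or 1).
 */
int toy_pick_source(const bool *candidates, int nr, struct toy_src *kept)
{
	int i;

	for (i = 0; i < nr; i++) {
		struct toy_src ss;

		if (toy_src_init(&ss, candidates[i]) < 0)
			continue;
		if (ss.has_symtab) {
			*kept = ss;           /* ownership moves to the caller */
			return 1;
		}
		toy_src_destroy(&ss);         /* the branch this patch adds */
	}
	return 0;
}
```

Without the toy_src_destroy() call in the loop, every rejected candidate would
stay allocated until the process exits.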
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
Acked-by: Namhyung Kim 

---
 tools/perf/util/symbol.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index d5528e1..9675866 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -828,7 +828,8 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 
if (syms_ss && runtime_ss)
break;
-   }
+   } else
+   symsrc__destroy(ss);
 
}
 
-- 
1.7.8.rc2.5.g815b



[PATCH 1/3] perf core: Fix a mmap and munmap mismatched bug

2013-10-10 Thread Chenggang Qin
From: root 

In filename__read_debuglink(), the ELF file is opened and mmapped by
elf_begin(). If the file then turns out to be unusable in the code that
follows, we jump directly to close(fd) and skip elf_end(), so the mmapped ELF
file is never munmapped. The mmapped areas persist for the lifetime of perf.
This is a memory leak.
This patch fixes the bug.
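The label ordering can be illustrated with a small model in which counters
stand in for the file descriptor and the libelf state. The function
toy_read_debuglink() and its fail_at parameter are hypothetical and exist only
to exercise each exit path.

```c
static int toy_open_handles;    /* stand-in for open file descriptors */
static int toy_open_mappings;   /* stand-in for state created by elf_begin() */

/* Returns 0 on success, -1 on failure; fail_at selects the failing step. */
int toy_read_debuglink(int fail_at)
{
	int err = -1;

	if (fail_at == 0)
		goto out;               /* open failed: nothing to release */
	toy_open_handles++;

	if (fail_at == 1)
		goto out_close;         /* elf_begin failed: only the fd is held */
	toy_open_mappings++;

	if (fail_at == 2)
		goto out_elf_end;       /* a later check failed: release both */

	err = 0;

out_elf_end:
	toy_open_mappings--;            /* stand-in for elf_end() */
out_close:
	toy_open_handles--;             /* stand-in for close() */
out:
	return err;
}
```

Each label releases exactly what was acquired before the corresponding jump,
in reverse order of acquisition, so both counters return to zero on every path.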
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
Reviewed-by: Namhyung Kim 

---
 tools/perf/util/symbol-elf.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4b12bf8..b4df870 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -471,27 +471,27 @@ int filename__read_debuglink(const char *filename, char *debuglink,
 
ek = elf_kind(elf);
if (ek != ELF_K_ELF)
-   goto out_close;
+   goto out_elf_end;
 
if (gelf_getehdr(elf, &ehdr) == NULL) {
pr_err("%s: cannot get elf header.\n", __func__);
-   goto out_close;
+   goto out_elf_end;
}
 
sec = elf_section_by_name(elf, &ehdr, &shdr,
  ".gnu_debuglink", NULL);
if (sec == NULL)
-   goto out_close;
+   goto out_elf_end;
 
data = elf_getdata(sec, NULL);
if (data == NULL)
-   goto out_close;
+   goto out_elf_end;
 
/* the start of this section is a zero-terminated string */
strncpy(debuglink, data->d_buf, size);
 
+out_elf_end:
elf_end(elf);
-
 out_close:
close(fd);
 out:
-- 
1.7.8.rc2.5.g815b



[PATCH 3/3] perf core: Fix a memory leak bug because symbol__delete is ignored

2013-10-10 Thread Chenggang Qin
From: Chenggang Qin 

In symbols__fixup_duplicate(), when duplicated symbols are found, only the
rb_node is erased from the tree. The symbol structures themselves are never
freed, so their memory is leaked.
This patch fixes the bug by calling symbol__delete() on each erased symbol.
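The difference between unlinking a node and freeing it can be shown with a
plain singly linked list instead of an rbtree. This is a sketch only: struct
toy_sym and the toy_* helpers are hypothetical, and the list is assumed to be
sorted by start address.

```c
#include <stdlib.h>

struct toy_sym {
	unsigned long start;
	struct toy_sym *next;
};

static int toy_live_syms;   /* allocated minus freed */

struct toy_sym *toy_sym_new(unsigned long start, struct toy_sym *next)
{
	struct toy_sym *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	s->start = start;
	s->next = next;
	toy_live_syms++;
	return s;
}

void toy_sym_delete(struct toy_sym *s)
{
	free(s);
	toy_live_syms--;
}

/* Drop every entry whose start equals that of the entry before it. */
void toy_fixup_duplicates(struct toy_sym *head)
{
	struct toy_sym *curr = head;

	while (curr && curr->next) {
		struct toy_sym *next = curr->next;

		if (next->start == curr->start) {
			curr->next = next->next;   /* unlink, like rb_erase() */
			toy_sym_delete(next);      /* free, like symbol__delete() */
		} else {
			curr = next;
		}
	}
}
```

Dropping the toy_sym_delete() call leaves the list looking correct while
toy_live_syms stays too high, which is the kind of leak described above.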
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 
Acked-by: Namhyung Kim 

---
 tools/perf/util/symbol.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 9675866..3c9aa6f 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -148,10 +148,12 @@ again:
 
if (choose_best_symbol(curr, next) == SYMBOL_A) {
rb_erase(&next->rb_node, symbols);
+   symbol__delete(next);
goto again;
} else {
nd = rb_next(&curr->rb_node);
rb_erase(&curr->rb_node, symbols);
+   symbol__delete(curr);
}
}
 }
-- 
1.7.8.rc2.5.g815b



[PATCH 1/3] perf core: Fix a mmap and munmap mismatched bug

2013-09-01 Thread Chenggang Qin
From: root 

In filename__read_debuglink(), the ELF file is opened and mmapped by
elf_begin(). If the file then turns out to be unusable in the code that
follows, we jump directly to close(fd) and skip elf_end(), so the mmapped ELF
file is never munmapped. The mmapped areas persist for the lifetime of perf.
This is a memory leak.
This patch fixes the bug.
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/symbol-elf.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4b12bf8..b4df870 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -471,27 +471,27 @@ int filename__read_debuglink(const char *filename, char *debuglink,
 
ek = elf_kind(elf);
if (ek != ELF_K_ELF)
-   goto out_close;
+   goto out_elf_end;
 
if (gelf_getehdr(elf, &ehdr) == NULL) {
pr_err("%s: cannot get elf header.\n", __func__);
-   goto out_close;
+   goto out_elf_end;
}
 
sec = elf_section_by_name(elf, &ehdr, &shdr,
  ".gnu_debuglink", NULL);
if (sec == NULL)
-   goto out_close;
+   goto out_elf_end;
 
data = elf_getdata(sec, NULL);
if (data == NULL)
-   goto out_close;
+   goto out_elf_end;
 
/* the start of this section is a zero-terminated string */
strncpy(debuglink, data->d_buf, size);
 
+out_elf_end:
elf_end(elf);
-
 out_close:
close(fd);
 out:
-- 
1.7.8.rc2.5.g815b



[PATCH 2/3] perf core: Fix a mmap & munmap mismatches bug in dso__load

2013-09-01 Thread Chenggang Qin
From: Chenggang Qin 

Some dsos' symsrc is used as neither syms_ss nor runtime_ss. In this situation,
the corresponding ELF file is opened and mmapped in symsrc__init(), but it is
never closed and munmapped anywhere.
This leads to mismatched mmap and munmap calls: the mmapped areas persist for
the lifetime of perf, which is effectively a memory leak.
This patch fixes the bug by calling symsrc__destroy() when the opened and
mmapped ELF file has none of the symtab, dynsym and opdsec sections.
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/symbol.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index d5528e1..9675866 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -828,7 +828,8 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 
if (syms_ss && runtime_ss)
break;
-   }
+   } else
+   symsrc__destroy(ss);
 
}
 
-- 
1.7.8.rc2.5.g815b



[PATCH 3/3] perf core: Fix a memory leak bug because symbol__delete is ignored

2013-09-01 Thread Chenggang Qin
From: Chenggang Qin 

In symbols__fixup_duplicate(), when duplicated symbols are found, only the
rb_node is erased from the tree. The symbol structures themselves are never
freed, so their memory is leaked.
This patch fixes the bug by calling symbol__delete() on each erased symbol.
Thanks.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/symbol.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 9675866..3c9aa6f 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -148,10 +148,12 @@ again:
 
if (choose_best_symbol(curr, next) == SYMBOL_A) {
rb_erase(&next->rb_node, symbols);
+   symbol__delete(next);
goto again;
} else {
nd = rb_next(&curr->rb_node);
rb_erase(&curr->rb_node, symbols);
+   symbol__delete(curr);
}
}
 }
-- 
1.7.8.rc2.5.g815b



[PATCH] perf/core: Fix a warning in util/trace-event-parse.c

2013-04-07 Thread chenggang qin
From: 

While compiling perf on Red Hat Enterprise Linux Server release 5.4 (Tikanga),
I got a warning:

CC util/trace-event-parse.o
cc1: warnings being treated as errors
util/trace-event-parse.c: In function 'parse_proc_kallsyms':
util/trace-event-parse.c:232: warning: 'fmt' may be used uninitialized in this function
make: *** [util/trace-event-parse.o] Error 1

The version of gcc is: 4.1.2

The reason is that the local variable 'fmt' is not initialized before it may be
used. This patch fixes the warning by initializing 'fmt' to NULL.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/trace-event-parse.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/trace-event-parse.c b/tools/perf/util/trace-event-parse.c
index 3aabcd6..630e331 100644
--- a/tools/perf/util/trace-event-parse.c
+++ b/tools/perf/util/trace-event-parse.c
@@ -229,7 +229,7 @@ void parse_proc_kallsyms(struct pevent *pevent,
char *next = NULL;
char *addr_str;
char *mod;
-   char *fmt;
+   char *fmt = NULL;
 
line = strtok_r(file, "\n", &next);
while (line) {
-- 
1.5.5.6



[PATCH]Perf top: Add ability to detect new threads dynamically during 'perf top -p 'pid'' is running

2012-08-22 Thread chenggang qin
From: Chenggang Qin 

When "perf top -p 'pid'" is used to monitor the symbols of specified
processes, the monitored processes may create new threads while "perf top"
is running. In the current version, these new threads and their symbols are
not shown.
This patch adds the ability to show these new threads.
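Detecting a change comes down to comparing the previously known set of thread
ids with a freshly collected one. A minimal sketch of such a comparison is
shown here; toy_same_threads() is a hypothetical helper that works on plain
arrays rather than on perf's struct thread_map, and it assumes that no tid
appears twice within one array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Return true when both arrays hold the same thread ids, in any order.
 * Assumes there are no duplicate ids inside either array.
 */
bool toy_same_threads(const pid_t *a, size_t na, const pid_t *b, size_t nb)
{
	size_t i, j;

	if (na != nb)
		return false;

	for (i = 0; i < nb; i++) {
		for (j = 0; j < na; j++)
			if (b[i] == a[j])
				break;
		if (j == na)
			return false;   /* b[i] is not present in a */
	}
	return true;
}
```

The nested loop is quadratic in the number of threads. That is probably
acceptable for small thread counts, while sorting both arrays first would scale
better for processes with many threads.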

Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-top.c |   86 --
 tools/perf/util/evlist.c |2 ++
 2 files changed, 85 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 68cd61e..54c9cc1 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -882,7 +882,7 @@ static void perf_top__mmap_read(struct perf_top *top)
perf_top__mmap_read_idx(top, i);
 }
 
-static void perf_top__start_counters(struct perf_top *top)
+static int perf_top__start_counters(struct perf_top *top)
 {
struct perf_evsel *counter, *first;
struct perf_evlist *evlist = top->evlist;
@@ -929,6 +929,10 @@ try_again:
 group_fd) < 0) {
int err = errno;
 
+   if (err == ESRCH) {
+   return err;
+   }
+
if (err == EPERM || err == EACCES) {
ui__error_paranoid();
goto out_err;
@@ -994,7 +998,7 @@ try_again:
goto out_err;
}
 
-   return;
+   return 0;
 
 out_err:
exit_browser(0);
@@ -1018,6 +1022,77 @@ static int perf_top__setup_sample_type(struct perf_top *top)
return 0;
 }
 
+static int thread_map_cmp(struct thread_map *threads_a,
+ struct thread_map *threads_b)
+{
+   int i, j;
+
+   if (threads_a->nr != threads_b->nr) {
+   return 1;
+   } else {
+   for (i = 0; i < threads_b->nr; i++) {
+   for (j = 0; j < threads_a->nr; j++)
+   if (threads_b->map[i] == threads_a->map[j])
+   break;
+
+   if (j == threads_a->nr)
+   return 1;
+   }
+
+   return 0;
+   }
+}
+
+static void check_new_threads(struct perf_top *top)
+{
+   struct thread_map *new_thread_map;
+   struct perf_evsel *counter;
+   struct perf_evlist *evlist = top->evlist;
+
+retry:
+   new_thread_map = thread_map__new_str(top->target.pid, top->target.tid,
+top->target.uid);
+   if (!new_thread_map)
+   return;
+
+   if (thread_map_cmp(top->evlist->threads, new_thread_map) == 0) {
+   free(new_thread_map);
+   return;
+   } else {
+   list_for_each_entry(counter, &evlist->entries, node) {
+   perf_evsel__close(counter, top->evlist->cpus->nr,
+ top->evlist->threads->nr);
+   }
+
+   if (top->evlist->mmap)
+   perf_evlist__munmap(top->evlist);
+
+   if (top->evlist->pollfd) {
+   free(top->evlist->pollfd);
+   top->evlist->pollfd = NULL;
+   }
+
+   top->evlist->nr_fds = 0;
+
+   thread_map__delete(top->evlist->threads);
+   top->evlist->threads = new_thread_map;
+
+   if (perf_top__start_counters(top) == ESRCH) {
+   while (thread_map_cmp(top->evlist->threads,
+ new_thread_map) == 0) {
+   new_thread_map = thread_map__new_str(top->target.pid,
+        top->target.tid,
+        top->target.uid);
+   if (!new_thread_map)
+   return;
+   }
+   goto retry;
+   }
+
+   return;
+   }
+}
+
 static int __cmd_top(struct perf_top *top)
 {
pthread_t thread;
@@ -1067,7 +1142,12 @@ static int __cmd_top(struct perf_top *top)
}
 
while (1) {
-   u64 hits = top->samples;
+   u64 hits;
+
+   if (perf_target__has_task(&top->target))
+   check_new_threads(top);
+
+   hits = top->samples;
 
perf_top__mmap_read(top);
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9b38681..293eca7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -452,6 +452,8 @@ void perf_evlist__munmap(struct 

[PATCH] perf tool: remove an unnecessary function call while process pipe events

2013-11-01 Thread Chenggang Qin
From: Chenggang Qin 

perf_session_free_sample_buffers() can be removed from
__perf_session__process_pipe_events(), since the ordered_samples buffer is not
used while samples are read from a pipe.
__perf_session__process_pipe_events() is only used to process events from a
pipe, and ordered_samples is disabled while samples are read from a pipe.
Refer to the following code in perf_session__new():
	if (tool && tool->ordering_requires_timestamps &&
	    tool->ordered_samples && !perf_evlist__sample_id_all(self->evlist)) {
		dump_printf("WARNING: No sample_id_all support, falling back to unordered processing\n");
		tool->ordered_samples = false;
	}
If a pipe is used, perf_evlist__sample_id_all(self->evlist) always returns 0,
because session->evlist is empty until an attr_event is read.

Thanks
Chenggang Qin

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/session.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 568b750..b69c28a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1251,7 +1251,6 @@ done:
 out_err:
free(buf);
perf_session__warn_about_errors(self, tool);
-   perf_session_free_sample_buffers(self);
return err;
 }
 
-- 
1.7.8.rc2.5.g815b



[PATCH 0/4] perf report: add parameters 'start' & 'end' to specify analysis interval

2013-11-01 Thread Chenggang Qin
This patch set introduces a feature to analyze the samples in a specified time
interval.
After a perf.data file has been generated by perf record, the user may want to
analyze a sub-interval of the whole record period.
For some functions, the percentage of samples in a certain sub-interval differs
from the percentage over the total record period. Showing the picture for a
certain time interval can make it easier for users to troubleshoot performance
problems. The samples' timestamps are recorded in the perf.data file, and the
samples are sorted by timestamp in ordered_samples while perf report processes
them. So it is easy to find the samples whose timestamps fall in a certain time
interval.
We add 2 parameters, --start and --end, to specify the time interval:
perf report --start x --end x
The smallest granularity of the time interval is one millisecond.
For example:
If the whole record period of a perf.data file is 1 to 2, we can use the
following command to analyze the samples in [15000, 16000):
perf report --start 15000 --end 16000
The time is the uptime, that is, it is measured from system boot.
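The interval semantics can be sketched in isolation like this. The names
struct toy_interval, toy_interval_from_ms() and toy_interval_contains() are
hypothetical and not part of this series. The sketch assumes that sample
timestamps are in nanoseconds, that an end value of 0 means "no upper bound",
and it uses an integer constant for the millisecond-to-nanosecond conversion.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NSEC_PER_MSEC 1000000ULL

/* Half-open interval [tstart, tend) in nanoseconds. */
struct toy_interval {
	uint64_t tstart;
	uint64_t tend;
};

/* Build the interval from millisecond arguments; end_ms == 0 means unbounded. */
struct toy_interval toy_interval_from_ms(uint64_t start_ms, uint64_t end_ms)
{
	struct toy_interval iv;

	iv.tstart = start_ms * TOY_NSEC_PER_MSEC;
	iv.tend = end_ms ? end_ms * TOY_NSEC_PER_MSEC : UINT64_MAX;
	return iv;
}

/* True when the nanosecond timestamp lies inside [tstart, tend). */
bool toy_interval_contains(const struct toy_interval *iv, uint64_t ts_ns)
{
	return ts_ns >= iv->tstart && ts_ns < iv->tend;
}
```

With --start 15000 --end 16000, a sample stamped exactly 16000 ms after boot
would fall outside the interval, while one stamped a nanosecond earlier would
fall inside it.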

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

Chenggang Qin (4):
  perf tools: add parameter 'start' & 'end' to perf report
  perf tools: relate 'start' & 'end' to perf_session
  perf tools: record min_timestamp of samples queue in ordered_samples
  perf tools: add the feature to assign analysis interval to perf
report

 tools/perf/builtin-report.c |   14 
 tools/perf/util/session.c   |   49 +-
 tools/perf/util/session.h   |3 ++
 3 files changed, 64 insertions(+), 2 deletions(-)

-- 
1.7.8.rc2.5.g815b



[PATCH 4/4] perf tools: add the feature to assign analysis interval to perf report

2013-11-01 Thread Chenggang Qin
Only process the samples whose timestamp is in [start, end).
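Because the queued samples are sorted by timestamp, the filtering can be done
in a single pass: skip entries before the start and stop at the first entry at
or past the end. The following is a simplified sketch on a plain array, with
toy_flush_window() as a hypothetical name. It follows the half-open interval
stated above and is not the flush_sample_queue() code from this patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the timestamps that fall in [tstart, tend).
 * The array must be sorted in ascending order.
 */
size_t toy_flush_window(const uint64_t *ts, size_t nr,
			uint64_t tstart, uint64_t tend)
{
	size_t i, processed = 0;

	for (i = 0; i < nr; i++) {
		if (ts[i] < tstart)
			continue;       /* before the window: skip */
		if (ts[i] >= tend)
			break;          /* sorted input: nothing later can match */
		processed++;            /* a real tool would deliver the sample here */
	}
	return processed;
}
```

The early break is what makes the sorted order useful, since everything after
the first out-of-range timestamp can be ignored.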

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/session.c |   43 +--
 1 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 4e9dd66..d50e29e 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -532,6 +532,9 @@ static int flush_sample_queue(struct perf_session *s,
bool show_progress = limit == ULLONG_MAX;
int ret;
 
+   if (limit > s->tend)
+   limit = s->tend;
+
if (!tool->ordered_samples || !limit)
return 0;
 
@@ -539,6 +542,9 @@ static int flush_sample_queue(struct perf_session *s,
if (session_done())
return 0;
 
+   if (iter->timestamp < s->tstart)
+   continue;
+
if (iter->timestamp > limit)
break;
 
@@ -617,7 +623,26 @@ static int process_finished_round(struct perf_tool *tool,
  union perf_event *event __maybe_unused,
  struct perf_session *session)
 {
-   int ret = flush_sample_queue(session, tool);
+   int ret = 0;
+
+   /*
+* The next round should be processed continue.
+* But, this round is skipped.
+*/
+   if (session->ordered_samples.next_flush < session->tstart) {
+   session->ordered_samples.next_flush = session->ordered_samples.max_timestamp;
+   return ret;
+   }
+
+   /*
+* This round & all followed rounds are skipped.
+*/
+   if (session->ordered_samples.min_timestamp > session->tend) {
+   session->ordered_samples.next_flush = ULLONG_MAX;
+   return ret;
+   }
+
+   ret = flush_sample_queue(session, tool);
if (!ret)
session->ordered_samples.next_flush = session->ordered_samples.max_timestamp;
 
@@ -1373,6 +1398,14 @@ more:
goto out_err;
}
 
+   /*
+* After process a finished round event:
+* The minimal timestamp in os->samples is greater than
+* tend, so, the followed  events couldn't be processed.
+*/
+   if (session->ordered_samples.next_flush == ULLONG_MAX)
+   goto out_err;
+
head += size;
file_pos += size;
 
@@ -1389,8 +1422,14 @@ more:
if (file_pos < file_size)
goto more;
 
+   if (session->ordered_samples.max_timestamp < session->tstart)
+   goto out_err;
+
+   if (session->ordered_samples.min_timestamp > session->tend)
+   goto out_err;
+
/* do the final flush for ordered samples */
-   session->ordered_samples.next_flush = ULLONG_MAX;
+   session->ordered_samples.next_flush = session->tend;
err = flush_sample_queue(session, tool);
 out_err:
ui_progress__finish();
-- 
1.7.8.rc2.5.g815b



[PATCH 1/4] perf report: add parameter 'start' & 'end' to perf report

2013-11-01 Thread Chenggang Qin
perf report --start time1 --end time2
The unit of time1 & time2 is milliseconds.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-report.c |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 72eae74..e9e9d0a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -733,6 +733,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 {
struct perf_session *session;
struct stat st;
+   u64 tstart = 0, tend = 0;
bool has_br_stack = false;
int branch_mode = -1;
int ret = -1;
@@ -843,6 +844,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN(0, "mem-mode", &report.mem_mode, "mem access profile"),
OPT_CALLBACK(0, "percent-limit", &report, "percent",
 "Don't show entries under that percent", 
parse_percent_limit),
+   OPT_U64(0, "start", &tstart, "Start time of analysis interval. (Unit: ms)"),
+   OPT_U64(0, "end", &tend, "End time of analysis interval. (Unit: ms)"),
OPT_END()
};
 
@@ -850,6 +853,12 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 
argc = parse_options(argc, argv, options, report_usage, 0);
 
+   if (tend && tstart >= tend) {
+   fprintf(stderr, "start [%" PRIu64 "] is greater than end [%"
+   PRIu64 "].\n", tstart, tend);
+   return -1;
+   }
+
if (report.use_stdio)
use_browser = 0;
else if (report.use_tui)
-- 
1.7.8.rc2.5.g815b



[PATCH 2/4] perf tools: relate 'start' & 'end' to perf_session

2013-11-01 Thread Chenggang Qin
Copy the values of start and end into struct perf_session.

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/builtin-report.c |5 +
 tools/perf/util/session.c   |3 +++
 tools/perf/util/session.h   |2 ++
 3 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index e9e9d0a..d3c1c8a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -889,6 +889,11 @@ repeat:
if (session == NULL)
return -ENOMEM;
 
+   if (tstart)
+   session->tstart = tstart * 1e6;
+   if (tend)
+   session->tend = tend * 1e6;
+
report.session = session;
 
has_br_stack = perf_header__has_feat(&session->header,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 568b750..193bb6a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -134,6 +134,9 @@ struct perf_session *perf_session__new(const char *filename, int mode,
INIT_LIST_HEAD(&self->ordered_samples.to_free);
machines__init(&self->machines);
 
+   self->tstart = 0;
+   self->tend = ULLONG_MAX;
+
if (mode == O_RDONLY) {
if (perf_session__open(self, force) < 0)
goto out_delete;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 04bf737..c9a6c27 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -37,6 +37,8 @@ struct perf_session {
int fd;
boolfd_pipe;
boolrepipe;
+   u64 tstart;
+   u64 tend;
struct ordered_samples  ordered_samples;
charfilename[1];
 };
-- 
1.7.8.rc2.5.g815b



[PATCH 3/4] perf tools: record min_timestamp of samples queue in ordered_samples

2013-11-01 Thread Chenggang Qin
Add a field 'min_timestamp' to struct ordered_samples to record the minimal
timestamp of the samples in ordered_samples->samples.
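The bookkeeping amounts to updating a running minimum each time a sample is
queued. In this sketch, struct toy_queue and toy_queue_add() are hypothetical,
and a counter replaces the real sample list. It also tracks the maximum for
symmetry, and it does not model samples leaving the queue.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a sample queue; the samples themselves are omitted. */
struct toy_queue {
	size_t nr;                  /* number of queued samples */
	uint64_t min_timestamp;     /* valid only when nr > 0 */
	uint64_t max_timestamp;     /* valid only when nr > 0 */
};

/* Account for one newly queued sample with the given timestamp. */
void toy_queue_add(struct toy_queue *q, uint64_t ts)
{
	if (q->nr == 0 || ts < q->min_timestamp)
		q->min_timestamp = ts;
	if (q->nr == 0 || ts > q->max_timestamp)
		q->max_timestamp = ts;
	q->nr++;
}
```

Checking for an empty queue first means the fields need no sentinel initial
value, which mirrors the list_empty() test in the hunk below.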

Cc: David Ahern 
Cc: Peter Zijlstra 
Cc: Paul Mackerras 
Cc: Ingo Molnar 
Cc: Arnaldo Carvalho de Melo 
Cc: Arjan van de Ven 
Cc: Namhyung Kim 
Cc: Yanmin Zhang 
Cc: Wu Fengguang 
Cc: Mike Galbraith 
Cc: Andrew Morton 
Signed-off-by: Chenggang Qin 

---
 tools/perf/util/session.c |3 +++
 tools/perf/util/session.h |1 +
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 193bb6a..4e9dd66 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -708,6 +708,9 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
new->file_offset = file_offset;
new->event = event;
 
+   if (list_empty(&os->samples) || os->min_timestamp > timestamp)
+   os->min_timestamp = timestamp;
+
__queue_event(new, s);
 
return 0;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index c9a6c27..7d411b9 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -18,6 +18,7 @@ struct ordered_samples {
u64 last_flush;
u64 next_flush;
u64 max_timestamp;
+   u64 min_timestamp;
struct list_headsamples;
struct list_headsample_cache;
struct list_headto_free;
-- 
1.7.8.rc2.5.g815b



[tip:perf/core] perf symbols: Fix a memory leak due to symbol__delete not being used

2013-10-14 Thread tip-bot for Chenggang Qin
Commit-ID:  d4f74eb89199dc7bde5579783e9188841e1271e3
Gitweb: http://git.kernel.org/tip/d4f74eb89199dc7bde5579783e9188841e1271e3
Author: Chenggang Qin 
AuthorDate: Fri, 11 Oct 2013 08:27:59 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 14 Oct 2013 12:21:20 -0300

perf symbols: Fix a memory leak due to symbol__delete not being used

In function symbols__fixup_duplicate(), when duplicated symbols are
found, only the rb_node is removed from the tree. The symbol structures
themselves are never freed, so their memory is leaked.

Signed-off-by: Chenggang Qin 
Acked-by: Namhyung Kim 
Cc: Andrew Morton 
Cc: Arjan van de Ven 
Cc: David Ahern 
Cc: Ingo Molnar 
Cc: Mike Galbraith 
Cc: Namhyung Kim 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Wu Fengguang 
Cc: Yanmin Zhang 
Link: http://lkml.kernel.org/r/1381451279-4109-3-git-send-email-chenggang@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/symbol.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b66c1ee..c0c3696 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -160,10 +160,12 @@ again:
 
if (choose_best_symbol(curr, next) == SYMBOL_A) {
rb_erase(&next->rb_node, symbols);
+   symbol__delete(next);
goto again;
} else {
nd = rb_next(&curr->rb_node);
rb_erase(&curr->rb_node, symbols);
+   symbol__delete(curr);
}
}
 }


[tip:perf/core] perf symbols: Fix a mmap and munmap mismatched bug

2013-10-14 Thread tip-bot for Chenggang Qin
Commit-ID:  784f3390f9bd900adfb3b0373615e105a0d9749a
Gitweb: http://git.kernel.org/tip/784f3390f9bd900adfb3b0373615e105a0d9749a
Author: Chenggang Qin 
AuthorDate: Fri, 11 Oct 2013 08:27:57 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 14 Oct 2013 12:21:23 -0300

perf symbols: Fix a mmap and munmap mismatched bug

In function filename__read_debuglink(), the ELF file is opened and
mmapped in elf_begin(), but if the file is then considered unusable by
the following code, we goto the close(fd) directly. The elf_end() is
skipped, so the mmapped ELF file cannot be munmapped. The mmapped areas
exist during the life of perf.

This is a memory leak.  This patch fixes the bug.

Reviewed-by: Namhyung Kim 
Signed-off-by: Chenggang Qin 
Cc: Andrew Morton 
Cc: Arjan van de Ven 
Cc: Chenggang Qin 
Cc: David Ahern 
Cc: Ingo Molnar 
Cc: Mike Galbraith 
Cc: Namhyung Kim 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Wu Fengguang 
Cc: Yanmin Zhang 
Link: http://lkml.kernel.org/r/1381451279-4109-1-git-send-email-chenggang@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/symbol-elf.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index d6b8af3..eed0b96 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -487,27 +487,27 @@ int filename__read_debuglink(const char *filename, char *debuglink,
 
ek = elf_kind(elf);
if (ek != ELF_K_ELF)
-   goto out_close;
+   goto out_elf_end;
 
if (gelf_getehdr(elf, &ehdr) == NULL) {
pr_err("%s: cannot get elf header.\n", __func__);
-   goto out_close;
+   goto out_elf_end;
}
 
sec = elf_section_by_name(elf, &ehdr, &shdr,
  ".gnu_debuglink", NULL);
if (sec == NULL)
-   goto out_close;
+   goto out_elf_end;
 
data = elf_getdata(sec, NULL);
if (data == NULL)
-   goto out_close;
+   goto out_elf_end;
 
/* the start of this section is a zero-terminated string */
strncpy(debuglink, data->d_buf, size);
 
+out_elf_end:
elf_end(elf);
-
 out_close:
close(fd);
 out: