[PATCH v2] Add a python script to statistic direct io behavior
From: chenggang@taobao.com

The previous version of this patch needed 2 new tracepoint events in VFS, but introducing new tracepoint events into VFS is not a good idea. So I reworked the patch to use only an existing tracepoint event (ext4:ext4_direct_IO_exit).

When engineers want to analyze the direct IO behavior of applications whose source code is unavailable, the perf tools combined with appropriate tracepoint events in the VFS subsystem are an excellent choice.

Many database systems implement their own page cache and use direct IO to access the disks. System engineers sometimes need to know the miss rate of the database's page cache. Since a miss in the database's own cache has to be served by a direct IO request, this requirement can be satisfied by recording the database's file accesses made through direct IO.

We therefore use the tracepoint event ext4:ext4_direct_IO_exit to record system-wide direct IO behavior. The script direct-io.py introduced by this patch records the ext4:ext4_direct_IO_exit tracepoint events, analyzes the sample data, and gives a concise report.
usage: "perf script record direct-io\n" "perf script report direct-io [comm|pid]\n" Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: David Ahern Cc: Arjan van de Ven Cc: Namhyung Kim Cc: Yanmin Zhang Cc: Wu Fengguang Cc: Mike Galbraith Cc: Andrew Morton Signed-off-by: Chenggang Qin --- tools/perf/scripts/python/bin/direct-io-record |2 + tools/perf/scripts/python/bin/direct-io-report | 21 +++ tools/perf/scripts/python/direct-io.py | 197 3 files changed, 220 insertions(+) create mode 100755 tools/perf/scripts/python/bin/direct-io-record create mode 100644 tools/perf/scripts/python/bin/direct-io-report create mode 100644 tools/perf/scripts/python/direct-io.py diff --git a/tools/perf/scripts/python/bin/direct-io-record b/tools/perf/scripts/python/bin/direct-io-record new file mode 100755 index 000..f38d5fc --- /dev/null +++ b/tools/perf/scripts/python/bin/direct-io-record @@ -0,0 +1,2 @@ +#!/bin/bash +perf record -e ext4:ext4_direct_IO_exit $@ diff --git a/tools/perf/scripts/python/bin/direct-io-report b/tools/perf/scripts/python/bin/direct-io-report new file mode 100644 index 000..828d9c6 --- /dev/null +++ b/tools/perf/scripts/python/bin/direct-io-report @@ -0,0 +1,21 @@ +#!/bin/bash +# description: direct_io statistic +# args: [comm|pid] +n_args=0 +for i in "$@" +do +if expr match "$i" "-" > /dev/null ; then + break +fi +n_args=$(( $n_args + 1 )) +done +if [ "$n_args" -gt 1 ] ; then +echo "usage: perf script report direct-io [comm|pid]" +exit +fi + +if [ "$n_args" -gt 0 ] ; then +comm=$1 +shift +fi +perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/direct-io.py $comm diff --git a/tools/perf/scripts/python/direct-io.py b/tools/perf/scripts/python/direct-io.py new file mode 100644 index 000..b609e95 --- /dev/null +++ b/tools/perf/scripts/python/direct-io.py @@ -0,0 +1,197 @@ +# direct IO counts +# (c) 2013, Chenggang Qin +# Licensed under the terms of the GNU GPL License version 2 + +# Displays system-wide file direct IO behavior. 
+# It helps us to investigate which processes trigger a direct IO, +# and what files are accessed by these processes. +# +# options +# comm, pid: show details of the file r/w behavior of a special process. + +import os, sys + +sys.path.append(os.environ['PERF_EXEC_PATH'] + \ + '/scripts/python/Perf-Trace-Util/lib/Perf/Trace') + +from perf_trace_context import * +from Core import * +from Util import * + +MINORBITS = 20 +MINORMASK = ((1 << MINORBITS) - 1) + +usage = "perf script record direct-io\n" \ + "perf script report direct-io [comm|pid]\n" + +for_comm = None +for_pid = None +pid_2_comm = None + +if len(sys.argv) > 2: + sys.exit(usage) + +if len(sys.argv) > 1: + try: + for_pid = int(sys.argv[1]) + except: + for_comm = sys.argv[1] + +direct_write = autodict() +direct_read = autodict() + +direct_write_bytes = autodict() +direct_read_bytes = autodict() + +comm_read_info = autodict() +comm_write_info = autodict() + +wevent_count = 0 +revent_count = 0 + +comm_revent_count = 0; +comm_wevent_count = 0; + +def MAJOR(dev): + return (dev) >> MINORBITS + +def MINOR(dev): + return (dev) & MINORMASK + +def trace_begin(): + print "Press control+C to stop and show the summary" + +def trace_end(): + if (for_comm is not None) or (for_pid is not None): + print_direct_io_event_for_comm() + else: + print_direct_io_event_totals() + +def ext4__ext4_direct_IO_exit(event_name, context, common_cpu, + common_secs, common_nsecs, common_pid, common_comm, + ino, dev, pos, len, rw, ret): +
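The MAJOR() and MINOR() helpers in direct-io.py above split the tracepoint's dev field into major and minor device numbers using a 20-bit minor field (MINORBITS = 20). A minimal Python 3 sketch of that decoding follows; the names split_dev and make_dev and the sample device 8:1 are illustrative and not part of the patch.

```python
MINORBITS = 20                      # same constant as in direct-io.py
MINORMASK = (1 << MINORBITS) - 1    # low 20 bits hold the minor number

def split_dev(dev):
    """Return (major, minor) for a dev value laid out as major << 20 | minor."""
    return dev >> MINORBITS, dev & MINORMASK

def make_dev(major, minor):
    """Inverse of split_dev, used here only to build a sample value."""
    return (major << MINORBITS) | minor

dev = make_dev(8, 1)                # hypothetical device 8:1
major, minor = split_dev(dev)
print("%d:%d" % (major, minor))
```

Printing the pair as major:minor gives the same form that tools such as lsblk show, which makes it easier to map a reported dev back to a block device.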
[PATCH] Tracepoint Event: Add 4 tracepoint events for vfs subsystem.
From: chenggang@gmail.com

When engineers want to analyze the file access behavior of applications whose source code is unavailable, the perf tools combined with appropriate tracepoint events in the VFS subsystem are an excellent choice.

System engineers and developers of server software need to know which files the target processes access within a period of time, so that they can find the hot applications and the hot files. For this requirement, we added 2 tracepoint events at the beginning of generic_file_aio_read() and generic_file_aio_write().

Many database systems implement their own page cache and use direct IO to access the disks. System engineers sometimes want to know the miss rate of the database's page cache. Since a miss in the database's own cache has to be served by a direct IO request, this requirement can be satisfied by recording the database's file accesses made through direct IO. So we added 2 tracepoint events in the direct IO branch of generic_file_aio_read() and generic_file_aio_write().

Next, we will extend perf with Python scripts that use these new tracepoint events.
The 4 new tracepoint events are:

1) generic_file_aio_read
Format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;
    field:long long pos; offset:16; size:8; signed:1;
    field:unsigned long bytes; offset:24; size:8; signed:0;
    field:unsigned char fname[100]; offset:32; size:100; signed:0;

2) generic_file_aio_write
Format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;
    field:long long pos; offset:16; size:8; signed:1;
    field:unsigned long bytes; offset:24; size:8; signed:0;
    field:unsigned char fname[100]; offset:32; size:100; signed:0;

3) direct_io_read
Format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;
    field:long long pos; offset:16; size:8; signed:1;
    field:unsigned long bytes; offset:24; size:8; signed:0;
    field:unsigned char fname[100]; offset:32; size:100; signed:0;

4) direct_io_write
Format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;
    field:long long pos; offset:16; size:8; signed:1;
    field:unsigned long bytes; offset:24; size:8; signed:0;
    field:unsigned char fname[100]; offset:32;
size:100; signed:0; Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: David Ahern Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Arjan van de Ven Cc: Namhyung Kim Cc: Yanmin Zhang Cc: Wu Fengguang Cc: Mike Galbraith Cc: Andrew Morton Signed-off-by: Chenggang Qin --- include/trace/events/vfs.h | 110 mm/filemap.c | 18 2 files changed, 128 insertions(+) create mode 100644 include/trace/events/vfs.h diff --git a/include/trace/events/vfs.h b/include/trace/events/vfs.h new file mode 100644 index 000..33498e1 --- /dev/null +++ b/include/trace/events/vfs.h @@ -0,0 +1,110 @@ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM vfs +#define TRACE_INCLUDE_FILE vfs + +#if !defined(_TRACE_EVENTS_VFS_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_EVENTS_VFS_H + +#include + +#include + +TRACE_EVENT(generic_file_aio_read, + + TP_PROTO(long long pos, unsigned long bytes, unsigned char *fname), + + TP_ARGS(pos, bytes, fname), + + TP_STRUCT__entry( + __field(long long, pos ) + __field(unsigned long, bytes ) + __array(unsigned cha
[PATCH v2] Add 4 tracepoint events for vfs
From: chenggang@gmail.com

When engineers want to analyze the file access behavior of applications whose source code is unavailable, the perf tools combined with appropriate tracepoint events in the VFS subsystem are an excellent choice.

System engineers and developers of server software need to know which files the target processes access within a period of time, so that they can find the hot applications and the hot files. For this requirement, we added 2 tracepoint events at the beginning of generic_file_aio_read() and generic_file_aio_write().

Many database systems implement their own page cache and use direct IO to access the disks. System engineers sometimes want to know the miss rate of the database's page cache. Since a miss in the database's own cache has to be served by a direct IO request, this requirement can be satisfied by recording the database's file accesses made through direct IO. So we added 2 tracepoint events in the direct IO branch of generic_file_aio_read() and generic_file_aio_write().

Next, we will extend perf with Python scripts that use these new tracepoint events.
The 4 new tracepoint events are:

1) generic_file_aio_read
Format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;
    field:long long pos; offset:16; size:8; signed:1;
    field:unsigned long bytes; offset:24; size:8; signed:0;
    field:__data_loc char[] fname; offset:32; size:4; signed:1;

2) generic_file_aio_write
Format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;
    field:long long pos; offset:16; size:8; signed:1;
    field:unsigned long bytes; offset:24; size:8; signed:0;
    field:__data_loc char[] fname; offset:32; size:4; signed:1;

3) direct_io_read
Format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;
    field:long long pos; offset:16; size:8; signed:1;
    field:unsigned long bytes; offset:24; size:8; signed:0;
    field:unsigned char fname[100]; offset:32; size:100; signed:0;

4) direct_io_write
Format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;
    field:long long pos; offset:16; size:8; signed:1;
    field:unsigned long bytes; offset:24; size:8; signed:0;
    field:unsigned char fname[100]; offset:32;
size:100; signed:0; Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: David Ahern Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Arjan van de Ven Cc: Namhyung Kim Cc: Yanmin Zhang Cc: Wu Fengguang Cc: Mike Galbraith Cc: Andrew Morton Signed-off-by: Chenggang Qin --- include/trace/events/vfs.h | 62 mm/filemap.c | 18 + 2 files changed, 80 insertions(+) create mode 100644 include/trace/events/vfs.h diff --git a/include/trace/events/vfs.h b/include/trace/events/vfs.h new file mode 100644 index 000..384ff29 --- /dev/null +++ b/include/trace/events/vfs.h @@ -0,0 +1,62 @@ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM vfs +#define TRACE_INCLUDE_FILE vfs + +#if !defined(_TRACE_EVENTS_VFS_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_EVENTS_VFS_H + +#include + +#include + +DECLARE_EVENT_CLASS(vfs_filerw_template, + + TP_PROTO(long long pos, unsigned long bytes, unsigned char *fname), + + TP_ARGS(pos, bytes, fname), + + TP_STRUCT__entry( + __field(long long, pos ) + __field(unsigned long, bytes ) + __string( fname,
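In v2 the fname field of the first two events is declared as __data_loc char[] with size 4 instead of a 100-byte array: the fixed part of the record holds only a 32-bit descriptor, and the string itself is stored in the variable-length tail of the record. To my knowledge, ftrace packs the string's offset from the start of the record into the low 16 bits of that descriptor and its length into the high 16 bits. The Python 3 sketch below decodes such a field under that assumption. The helper name read_data_loc_string, the little-endian byte order, and the hand-built sample record are illustrative only.

```python
import struct

def read_data_loc_string(record, field_offset):
    """Decode a __data_loc char[] field from a raw trace record.

    Assumes a little-endian 32-bit descriptor: low 16 bits = offset of the
    string from the start of the record, high 16 bits = its length.
    """
    (desc,) = struct.unpack_from("<I", record, field_offset)
    offset = desc & 0xFFFF
    length = desc >> 16
    raw = record[offset:offset + length]
    return raw.split(b"\0", 1)[0].decode()

# Hand-built sample: 32 bytes of fixed fields, the descriptor at offset 32
# (as in the format above), and the string data right after it at offset 36.
name = b"data.db\0"
record = bytearray(36) + name
struct.pack_into("<I", record, 32, (len(name) << 16) | 36)
print(read_data_loc_string(bytes(record), 32))
```

In practice perf script performs this decoding itself and passes fname to the Python handler as a string; the sketch only shows why the field occupies 4 bytes in the format.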
[PATCH 2/2] perf script: add python script to show system's file r/w behavior
From: chenggang@gmail.com

This patch depends on another patch, https://lkml.org/lkml/2013/1/29/47, because it uses 2 tracepoint events introduced there.

When engineers want to analyze the file access behavior of applications whose source code is unavailable, the perf script mechanism combined with appropriate tracepoint events in the VFS subsystem is an excellent choice.

System engineers and developers of server software need to know which files the target processes access within a period of time, so that they can find the hot applications and the hot files.

Based on the two tracepoint events vfs:generic_file_aio_read and vfs:generic_file_aio_write (introduced by the patch https://lkml.org/lkml/2013/1/29/47), the Python script introduced by this patch records the system context and other related information whenever a process accesses a file. It can then show the details of the file access behavior of every process: the process's pid and comm, the number of file reads and writes, and the names of the files involved.
The usage of this script is: perf script record filerw perf script report filerw [comm|pid] Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: David Ahern Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Arjan van de Ven Cc: Namhyung Kim Cc: Yanmin Zhang Cc: Wu Fengguang Cc: Mike Galbraith Cc: Andrew Morton Signed-off-by: Chenggang Qin --- tools/perf/scripts/python/bin/filerw-record |2 + tools/perf/scripts/python/bin/filerw-report | 21 +++ tools/perf/scripts/python/filerw.py | 189 +++ 3 files changed, 212 insertions(+) create mode 100755 tools/perf/scripts/python/bin/filerw-record create mode 100644 tools/perf/scripts/python/bin/filerw-report create mode 100644 tools/perf/scripts/python/filerw.py diff --git a/tools/perf/scripts/python/bin/filerw-record b/tools/perf/scripts/python/bin/filerw-record new file mode 100755 index 000..80f358c --- /dev/null +++ b/tools/perf/scripts/python/bin/filerw-record @@ -0,0 +1,2 @@ +#!/bin/bash +perf record -e vfs:generic_file_aio_read -e vfs:generic_file_aio_write $@ diff --git a/tools/perf/scripts/python/bin/filerw-report b/tools/perf/scripts/python/bin/filerw-report new file mode 100644 index 000..5a4dac9 --- /dev/null +++ b/tools/perf/scripts/python/bin/filerw-report @@ -0,0 +1,21 @@ +#!/bin/bash +# description: file read/write operations statistic +# args: [comm] +n_args=0 +for i in "$@" +do +if expr match "$i" "-" > /dev/null ; then + break +fi +n_args=$(( $n_args + 1 )) +done +if [ "$n_args" -gt 1 ] ; then +echo "usage: perf script report filerw [comm|pid]" +exit +fi + +if [ "$n_args" -gt 0 ] ; then +comm=$1 +shift +fi +perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/filerw.py $comm diff --git a/tools/perf/scripts/python/filerw.py b/tools/perf/scripts/python/filerw.py new file mode 100644 index 000..f5bd820 --- /dev/null +++ b/tools/perf/scripts/python/filerw.py @@ -0,0 +1,189 @@ +# file read/write counts +# (c) 2013, Chenggang Qin +# Licensed under the terms of the GNU GPL License 
version 2 + +# Displays system-wide file aio read/write behavior. +# It helps us to investigate what files are accessed by all +# processes or a special process. +# +# options +# comm: show details of the file r/w behavior of a special process. + +import os, sys + +sys.path.append(os.environ['PERF_EXEC_PATH'] + \ + '/scripts/python/Perf-Trace-Util/lib/Perf/Trace') + +from perf_trace_context import * +from Core import * +from Util import * + +usage = "perf script record filerw\n" \ + "perf script report filerw [comm|pid]\n" + +for_comm = None +for_pid = None +pid_2_comm = None + +if len(sys.argv) > 2: + sys.exit(usage) + +if len(sys.argv) > 1: + try: + for_pid = int(sys.argv[1]) + except: + for_comm = sys.argv[1] + +file_write = autodict() +file_read = autodict() + +file_write_bytes = autodict() +file_read_bytes = autodict() + +comm_read_info = autodict() +comm_write_info = autodict() + +wevent_count = 0 +revent_count = 0 + +comm_revent_count = 0; +comm_wevent_count = 0; + +def trace_begin(): + print "Press control+C to stop and show the summary" + +def trace_end(): + if (for_comm is not None) or (for_pid is not None): + print_file_event_for_comm() + else: + print_file_event_totals() + +def vfs__generic_file_aio_write(event_name, context, common_cpu, + common_secs, common_nsecs, common_pid, common_comm, + pos, bytes, fname): + global wevent_count + global comm_wevent_count + global pid_2_comm + + if (for_comm is not None) or (for_pid is not None): + if (common_co
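The handlers of filerw.py are cut off above, but the report described in the commit message (pid, comm, number of reads and writes, file name) amounts to a per-process, per-file counter. The Python 3 sketch below shows one way to keep such counters. It uses collections.defaultdict in place of perf's autodict, and the names on_read and report, the table layout, and the sample events are assumptions made for illustration, not code from the patch.

```python
from collections import defaultdict

# (pid, comm) -> file name -> [number of events, total bytes]
reads = defaultdict(lambda: defaultdict(lambda: [0, 0]))

def on_read(common_pid, common_comm, nbytes, fname):
    """Account one read event, as a tracepoint handler would."""
    entry = reads[(common_pid, common_comm)][fname]
    entry[0] += 1
    entry[1] += nbytes

def report(table):
    """Return one formatted line per (process, file) pair."""
    lines = []
    for (pid, comm), files in sorted(table.items()):
        for fname, (count, total) in sorted(files.items()):
            lines.append("%-6d %-16s %6d %10d %s"
                         % (pid, comm, count, total, fname))
    return lines

# Sample events (made up for the example).
on_read(1201, "mysqld", 4096, "ibdata1")
on_read(1201, "mysqld", 8192, "ibdata1")
on_read(1350, "nginx", 512, "access.log")

for line in report(reads):
    print(line)
```

A second table of the same shape would hold the write events; keeping reads and writes separate mirrors the file_read and file_write dictionaries in the script.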
[PATCH] Add a python script to statistic direct io behavior
From: chenggang@gmail.com

This patch depends on a previous patch: https://lkml.org/lkml/2013/1/29/47

When engineers want to analyze the direct IO behavior of applications whose source code is unavailable, the perf tools combined with appropriate tracepoint events in the VFS subsystem are an excellent choice.

Many database systems implement their own page cache and use direct IO to access the disks. System engineers sometimes need to know the miss rate of the database's page cache. Since a miss in the database's own cache has to be served by a direct IO request, this requirement can be satisfied by recording the database's file accesses made through direct IO. So we use 2 tracepoint events to record system-wide direct IO behavior:
1) vfs:direct_io_read
2) vfs:direct_io_write
They were introduced by the patch https://lkml.org/lkml/2013/1/29/47

The script direct-io.py introduced by this patch records the 2 tracepoint events, analyzes the sample data, and gives a concise report.

usage: "perf script record direct-io\n" "perf script report direct-io [comm|pid]\n" Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: David Ahern Cc: Arjan van de Ven Cc: Namhyung Kim Cc: Yanmin Zhang Cc: Wu Fengguang Cc: Mike Galbraith Cc: Andrew Morton Signed-off-by: Chenggang Qin --- tools/perf/scripts/python/bin/direct-io-record |2 + tools/perf/scripts/python/bin/direct-io-report | 21 +++ tools/perf/scripts/python/direct-io.py | 185 3 files changed, 208 insertions(+) create mode 100755 tools/perf/scripts/python/bin/direct-io-record create mode 100644 tools/perf/scripts/python/bin/direct-io-report create mode 100644 tools/perf/scripts/python/direct-io.py diff --git a/tools/perf/scripts/python/bin/direct-io-record b/tools/perf/scripts/python/bin/direct-io-record new file mode 100755 index 000..4857097 --- /dev/null +++ b/tools/perf/scripts/python/bin/direct-io-record @@ -0,0 +1,2 @@ +#!/bin/bash +perf record -e vfs:direct_io_read -e vfs:direct_io_write $@ diff --git 
a/tools/perf/scripts/python/bin/direct-io-report b/tools/perf/scripts/python/bin/direct-io-report new file mode 100644 index 000..828d9c6 --- /dev/null +++ b/tools/perf/scripts/python/bin/direct-io-report @@ -0,0 +1,21 @@ +#!/bin/bash +# description: direct_io statistic +# args: [comm|pid] +n_args=0 +for i in "$@" +do +if expr match "$i" "-" > /dev/null ; then + break +fi +n_args=$(( $n_args + 1 )) +done +if [ "$n_args" -gt 1 ] ; then +echo "usage: perf script report direct-io [comm|pid]" +exit +fi + +if [ "$n_args" -gt 0 ] ; then +comm=$1 +shift +fi +perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/direct-io.py $comm diff --git a/tools/perf/scripts/python/direct-io.py b/tools/perf/scripts/python/direct-io.py new file mode 100644 index 000..321ff8e --- /dev/null +++ b/tools/perf/scripts/python/direct-io.py @@ -0,0 +1,185 @@ +# direct IO counts +# (c) 2013, Chenggang Qin +# Licensed under the terms of the GNU GPL License version 2 + +# Displays system-wide file direct IO behavior. +# It helps us to investigate which processes trigger a direct IO, +# and what files are accessed by these processes. +# +# options +# comm, pid: show details of the file r/w behavior of a special process. 
+ +import os, sys + +sys.path.append(os.environ['PERF_EXEC_PATH'] + \ + '/scripts/python/Perf-Trace-Util/lib/Perf/Trace') + +from perf_trace_context import * +from Core import * +from Util import * + +usage = "perf script record direct-io\n" \ + "perf script report direct-io [comm|pid]\n" + +for_comm = None +for_pid = None +pid_2_comm = None + +if len(sys.argv) > 2: + sys.exit(usage) + +if len(sys.argv) > 1: + try: + for_pid = int(sys.argv[1]) + except: + for_comm = sys.argv[1] + +file_write = autodict() +file_read = autodict() + +file_write_bytes = autodict() +file_read_bytes = autodict() + +comm_read_info = autodict() +comm_write_info = autodict() + +wevent_count = 0 +revent_count = 0 + +comm_revent_count = 0; +comm_wevent_count = 0; + +def trace_begin(): + print "Press control+C to stop and show the summary" + +def trace_end(): + if (for_comm is not None) or (for_pid is not None): + print_direct_io_event_for_comm() + else: + print_direct_io_event_totals() + +def vfs__direct_io_write(event_name, context, common_cpu, + common_secs, common_nsecs, common_pid, common_comm, + pos, bytes, fname): + global wevent_count + global comm_wevent_count + global pid_2_comm + + if (for_comm is not None) or (for_pid is not None): + if (common_comm != for_comm) and (common_pid != for_pid): +
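The script above treats its optional argument as a pid when it parses as an integer and as a comm otherwise, and the handler then skips events that match neither. The Python 3 sketch below isolates that selection logic. The function names parse_target and wanted are illustrative, and the sketch catches ValueError explicitly where the script uses a bare except.

```python
def parse_target(argv):
    """Return (for_pid, for_comm) from an argument list like sys.argv."""
    if len(argv) > 2:
        raise SystemExit("perf script report direct-io [comm|pid]")
    if len(argv) < 2:
        return None, None            # no filter: report all processes
    try:
        return int(argv[1]), None    # numeric argument: filter by pid
    except ValueError:
        return None, argv[1]         # anything else: filter by comm

def wanted(common_pid, common_comm, for_pid, for_comm):
    """True if an event from (common_pid, common_comm) passes the filter."""
    if for_pid is None and for_comm is None:
        return True
    return common_pid == for_pid or common_comm == for_comm

for_pid, for_comm = parse_target(["direct-io.py", "mysqld"])
print(wanted(1201, "mysqld", for_pid, for_comm))
print(wanted(1350, "nginx", for_pid, for_comm))
```

One consequence of this scheme is that a process whose comm consists only of digits can be selected only by pid, since the argument is always tried as an integer first.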
[PATCH v2 0/4] perf: Make the 'perf top -p $pid' can perceive the new forked threads.
From: chenggang@taobao.com

This patch set makes 'perf top -p $pid' aware of new threads forked by the target processes.

'perf top{record} -p $pid' sees the threads that were forked before perf was started, but it cannot see threads forked after perf was started. This is an important defect in perf, because many applications fork new threads on the fly.

For performance reasons, the event inherit mechanism is disabled when per-task counters are used. Some internal data structures, such as thread_map, evlist->mmap, evsel->fd, evsel->id and evsel->sample_id, are implemented as arrays allocated at initialization. Their size is fixed, so they cannot easily be extended or shrunk to follow newly forked and exited threads.

So we have done the following work:

1) Transformed xyarray to a linked list and implemented the interfaces to extend and shrink an existing xyarray. The xyarray is a 2-dimensional structure. The rows are still an array (because the number of CPUs is fixed), and the columns are linked lists.

2) Transformed evlist->mmap, evsel->fd, evsel->id and evsel->sample_id to lists built on the new xyarray, and implemented interfaces to expand and shrink these structures. The nodes in these structures can be referenced through predefined macros such as FD(cpu, thread), MMAP(cpu, thread), ID(cpu, thread), etc.

3) Transformed thread_map to a linked list and implemented the interfaces to extend and shrink an existing thread_map.

4) Added 2 callback functions to top->perf_tool. They are called when PERF_RECORD_FORK and PERF_RECORD_EXIT events are received. On a PERF_RECORD_FORK event, all related data structures are expanded and a new fd and mmap are opened. On a PERF_RECORD_EXIT event, the corresponding nodes in the related data structures are removed and the fd and mmap are closed.

The linked list is flexible: list_add and list_del are easy to use.
In addition, the performance penalty (especially in CPU utilization) is low.

This function is already implemented for 'perf top -p $pid' in patch [4/4] of this patch set. As a next step, 'perf record -p $pid' should be modified in the same way.

Thanks to David Ahern for his suggestions.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin

chenggang (4):
  Transform xyarray to linked list.
  Transform thread_map to linked list.
  Transform mmap and other related structures to list with new xyarray.
  Add fork and exit callback functions into top->perf_tool.

 tools/perf/builtin-record.c               |   6 +-
 tools/perf/builtin-stat.c                 |   2 +-
 tools/perf/builtin-top.c                  | 100 -
 tools/perf/tests/open-syscall-tp-fields.c |   2 +-
 tools/perf/util/event.c                   |  10 +-
 tools/perf/util/evlist.c                  | 171 +++---
 tools/perf/util/evlist.h                  |   6 +-
 tools/perf/util/evsel.c                   |  98 +++--
 tools/perf/util/evsel.h                   |   8 +-
 tools/perf/util/header.c                  |  31 ++--
 tools/perf/util/header.h                  |   3 +-
 tools/perf/util/python.c                  |   2 +-
 tools/perf/util/thread_map.c              | 223 +++--
 tools/perf/util/thread_map.h              |  16 ++-
 tools/perf/util/xyarray.c                 |  85 ++-
 tools/perf/util/xyarray.h                 |  25 +++-
 16 files changed, 641 insertions(+), 147 deletions(-)

-- 
1.7.9.5
[PATCH v2 1/4] Transform xyarray to linked list
From: chenggang

A 2-dimensional array cannot easily be expanded or shrunk when we want to respond to thread fork and exit events on the fly. We transform xyarray into a 2-dimensional linked list. The rows are still held in an array, but the columns within each row are implemented as a list. Every row has the same number of nodes. Interfaces to append to and shrink an existing xyarray are provided:
1) xyarray__append() appends a column to all rows.
2) xyarray__remove() removes a column from all rows.

Cc: David Ahern Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Arjan van de Ven Cc: Namhyung Kim Cc: Yanmin Zhang Cc: Wu Fengguang Cc: Mike Galbraith Cc: Andrew Morton Signed-off-by: Chenggang Qin --- tools/perf/util/xyarray.c | 85 + tools/perf/util/xyarray.h | 25 +++-- 2 files changed, 101 insertions(+), 9 deletions(-) diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c index 22afbf6..fc48bda 100644 --- a/tools/perf/util/xyarray.c +++ b/tools/perf/util/xyarray.c @@ -1,20 +1,93 @@ #include "xyarray.h" #include "util.h" -struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size) +/* + * Add a column for all rows; + */ +int xyarray__append(struct xyarray *xy) { - size_t row_size = ylen * entry_size; - struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size); + struct xyentry *new_entry; + unsigned int x; + + for (x = 0; x < xy->row_count; x++) { + new_entry = zalloc(sizeof(*new_entry)); + if (new_entry == NULL) + return -1; + + new_entry->contents = zalloc(xy->entry_size); + if (new_entry->contents == NULL) + return -1; - if (xy != NULL) { - xy->entry_size = entry_size; - xy->row_size = row_size; + list_add_tail(&new_entry->next, &xy->rows[x].head); } + return 0; +} + +struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size) +{ + struct xyarray *xy = zalloc(sizeof(*xy) + xlen * sizeof(struct row)); + int i; + + if (xy == NULL) + return NULL; + + xy->row_count = xlen; + xy->entry_size = entry_size; + + for (i = 0; i < xlen; i++) + 
INIT_LIST_HEAD(&xy->rows[i].head); + + for (i = 0; i < ylen; i++) + if (xyarray__append(xy) < 0) { + xyarray__delete(xy); + return NULL; + } + return xy; } +/* + * remove a column for all rows; + */ +int xyarray__remove(struct xyarray *xy, int y) +{ + struct xyentry *entry; + unsigned int x; + int count; + + if (!xy) + return 0; + + for (x = 0; x < xy->row_count; x++) { + count = 0; + list_for_each_entry(entry, &xy->rows[x].head, next) + if (count++ == y) { + list_del(&entry->next); + free(entry); + return 0; + } + } + + return -1; +} + +/* + * All nodes in every rows should be deleted before delete @xy. + */ void xyarray__delete(struct xyarray *xy) { + unsigned int i; + struct xyentry *entry; + + if (!xy) + return; + + for (i = 0; i < xy->row_count; i++) + list_for_each_entry(entry, &xy->rows[i].head, next) { + list_del(&entry->next); + free(entry); + } + free(xy); } diff --git a/tools/perf/util/xyarray.h b/tools/perf/util/xyarray.h index c488a07..07fa370 100644 --- a/tools/perf/util/xyarray.h +++ b/tools/perf/util/xyarray.h @@ -2,19 +2,38 @@ #define _PERF_XYARRAY_H_ 1 #include +#include + +struct row { + struct list_head head; +}; + +struct xyentry { + struct list_head next; + char *contents; +}; struct xyarray { - size_t row_size; + size_t row_count; size_t entry_size; - char contents[]; + struct row rows[]; }; struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size); void xyarray__delete(struct xyarray *xy); +int xyarray__append(struct xyarray *xy); +int xyarray__remove(struct xyarray *xy, int y); static inline void *xyarray__entry(struct xyarray *xy, int x, int y) { - return &xy->contents[x * xy->row_size + y * xy->entry_size]; + struct xyentry *entry; + int columns = 0; + + list_for_each_entry(entry, &xy->rows[x].head, next) + if (columns++ == y) + return entry->contents; + + return NULL; } #endif /* _PERF_XYARRAY_H_ */ -- 1.7.9.5 -- To unsubscribe from this list: send the line "uns
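For reference, the row-array/column-list layout described in this patch can be sketched outside the perf tree roughly as follows. This is a minimal illustration under stated assumptions, not the patch's code: the names grid, cell and grid_* are hypothetical, a plain singly linked list stands in for struct list_head, and calloc() stands in for zalloc(). The posted xyarray__remove() appears to return after unlinking the node in the first row, and frees the node but not its contents; the sketch instead removes the column from every row, frees the payload, and rolls back a partially appended column when an allocation fails.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names for this sketch; this is not perf's xyarray API. */
struct cell {
	struct cell *next;
	void *contents;
};

struct grid {
	size_t row_count;
	size_t col_count;
	size_t entry_size;
	struct cell *rows[];	/* one list head per row (cpu) */
};

/* Address of the link that points at column y in row x (y <= col_count). */
struct cell **grid_link(struct grid *g, size_t x, size_t y)
{
	struct cell **link = &g->rows[x];

	while (y-- > 0 && *link != NULL)
		link = &(*link)->next;
	return link;
}

/* Remove column y from every row, freeing node and payload. */
int grid_remove(struct grid *g, size_t y)
{
	size_t x;

	if (y >= g->col_count)
		return -1;
	for (x = 0; x < g->row_count; x++) {
		struct cell **link = grid_link(g, x, y);
		struct cell *victim = *link;

		*link = victim->next;
		free(victim->contents);
		free(victim);
	}
	g->col_count--;
	return 0;
}

/* Append one zeroed column to every row; undo partial work on failure. */
int grid_append(struct grid *g)
{
	size_t x;

	for (x = 0; x < g->row_count; x++) {
		struct cell *c = calloc(1, sizeof(*c));

		if (c != NULL)
			c->contents = calloc(1, g->entry_size);
		if (c == NULL || c->contents == NULL) {
			free(c);
			/* rows 0..x-1 already hold the new column: drop it */
			while (x-- > 0) {
				struct cell **link = grid_link(g, x, g->col_count);

				free((*link)->contents);
				free(*link);
				*link = NULL;
			}
			return -1;
		}
		*grid_link(g, x, g->col_count) = c;
	}
	g->col_count++;
	return 0;
}

/* Free every column, then the grid itself. */
void grid_delete(struct grid *g)
{
	if (g == NULL)
		return;
	while (g->col_count > 0)
		grid_remove(g, g->col_count - 1);
	free(g);
}

/* Allocate a grid with xlen rows and ylen zeroed columns. */
struct grid *grid_new(size_t xlen, size_t ylen, size_t entry_size)
{
	struct grid *g = calloc(1, sizeof(*g) + xlen * sizeof(g->rows[0]));
	size_t y;

	if (g == NULL)
		return NULL;
	g->row_count = xlen;
	g->entry_size = entry_size;
	for (y = 0; y < ylen; y++) {
		if (grid_append(g) < 0) {
			grid_delete(g);
			return NULL;
		}
	}
	return g;
}

/* Payload at (x, y), or NULL when the position does not exist. */
void *grid_entry(struct grid *g, size_t x, size_t y)
{
	if (x >= g->row_count || y >= g->col_count)
		return NULL;
	return (*grid_link(g, x, y))->contents;
}
```

Lookup by index walks the list, so grid_entry() costs O(columns) rather than O(1). Callers that touch every thread of every cpu pay that cost on each access, which is the trade-off for cheap append and remove.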
[PATCH v2 2/4] Transform thread_map to linked list
From: chenggang

The size of thread_map is fixed at initialization, according to the files
in /proc/{$pid}. It cannot easily be expanded or shrunk when we want to
respond to thread fork and exit events. We transform the thread_map
structure into a linked list and implement interfaces to expand and shrink
it. To stay compatible with the existing code, a thread can still be looked
up by its index in the thread_map.

1) thread_map__append()
   Append a new thread to the thread_map, given the new thread's pid.
2) thread_map__remove()
   Remove an existing thread from the thread_map, given its index in the
   thread_map.
3) thread_map__init()
   Allocate and initialize a thread_map. The thread_map is empty after this
   call; use thread_map__append() to insert threads.
4) thread_map__delete()
   Delete an existing thread_map.
5) thread_map__get_pid()
   Get a thread's pid from its index in the thread_map.
6) thread_map__get_idx_by_pid()
   Get a thread's index in the thread_map from its pid. When we receive a
   PERF_RECORD_EXIT event, we know only the pid of the exited thread.
7) thread_map__empty_thread_map()
   Return an empty thread_map that contains only a dummy thread. This
   function replaces the global variable empty_thread_map.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-stat.c                 |   2 +-
 tools/perf/tests/open-syscall-tp-fields.c |   2 +-
 tools/perf/util/event.c                   |  10 +-
 tools/perf/util/evlist.c                  |   2 +-
 tools/perf/util/evsel.c                   |  16 +--
 tools/perf/util/python.c                  |   2 +-
 tools/perf/util/thread_map.c              | 210 +++--
 tools/perf/util/thread_map.h              |  19 ++-
 tools/perf/util/xyarray.c                 |   4 +-
 9 files changed, 171 insertions(+), 96 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9984876..f5fe0da 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -401,7 +401,7 @@ static int __run_perf_stat(int argc __maybe_unused, const char **argv)
 		}
 
 		if (perf_target__none(&target))
-			evsel_list->threads->map[0] = child_pid;
+			thread_map__append(evsel_list->threads, child_pid);
 
 		/*
 		 * Wait for the child to be ready to exec.
diff --git a/tools/perf/tests/open-syscall-tp-fields.c b/tools/perf/tests/open-syscall-tp-fields.c
index 1c52fdc..39eb770 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -43,7 +43,7 @@ int test__syscall_open_tp_fields(void)
 
 	perf_evsel__config(evsel, &opts);
 
-	evlist->threads->map[0] = getpid();
+	thread_map__append(evlist->threads, getpid());
 
 	err = perf_evlist__open(evlist);
 	if (err < 0) {
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 5cd13d7..91d2848 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -327,8 +327,8 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 	err = 0;
 	for (thread = 0; thread < threads->nr; ++thread) {
 		if (__event__synthesize_thread(comm_event, mmap_event,
-					       threads->map[thread], 0,
-					       process, tool, machine)) {
+				thread_map__get_pid(threads,
+				thread), 0, process, tool,
+				machine)) {
 			err = -1;
 			break;
 		}
@@ -337,12 +337,14 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 		 * comm.pid is set to thread group id by
 		 * perf_event__synthesize_comm
 		 */
-		if ((int) comm_event->comm.pid != threads->map[thread]) {
+		if ((int) comm_event->comm.pid
+			!= thread_map__get_pid(threads, thread)) {
 			bool need_leader = true;
 
 			/* is thread group leader in thread_map? */
 			for (j = 0; j < threads->nr; ++j) {
-				if ((int) comm_event->comm.pid == threads->map[j]) {
+				if ((int) comm_event->comm.pid
+					== thread_map__get_pid(threads, thread)) {
 					need_leader = false;
 					break;
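The thread_map interface listed in this commit message (append by pid, remove by index, lookup in both directions) can be sketched as a small standalone list. This is only an illustration: the tmap and tnode names are hypothetical, and the return convention of tmap_append() (0 on success, 1 if the pid is already present, -1 on allocation failure) is an assumption. The thread_map.c hunk is not visible here, so what a return value of 1 means in the real thread_map__append() is not confirmed.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical types for this sketch; the real struct thread_map differs. */
struct tnode {
	struct tnode *next;
	pid_t pid;
};

struct tmap {
	int nr;
	struct tnode *head;
};

/*
 * Append pid at the tail.
 * Assumed convention: 0 on success, 1 if already present, -1 on OOM.
 */
int tmap_append(struct tmap *m, pid_t pid)
{
	struct tnode **link = &m->head;
	struct tnode *n;

	for (; *link != NULL; link = &(*link)->next)
		if ((*link)->pid == pid)
			return 1;
	n = malloc(sizeof(*n));
	if (n == NULL)
		return -1;
	n->pid = pid;
	n->next = NULL;
	*link = n;
	m->nr++;
	return 0;
}

/* Remove the thread at position idx; later threads shift down by one. */
int tmap_remove(struct tmap *m, int idx)
{
	struct tnode **link = &m->head;
	struct tnode *victim;

	if (idx < 0 || idx >= m->nr)
		return -1;
	while (idx-- > 0)
		link = &(*link)->next;
	victim = *link;
	*link = victim->next;
	free(victim);
	m->nr--;
	return 0;
}

/* Pid stored at position idx, or -1 if idx is out of range. */
pid_t tmap_get_pid(const struct tmap *m, int idx)
{
	const struct tnode *n = m->head;

	if (idx < 0 || idx >= m->nr)
		return -1;
	while (idx-- > 0)
		n = n->next;
	return n->pid;
}

/* Position of pid in the map, or -1 if it is not tracked. */
int tmap_idx_by_pid(const struct tmap *m, pid_t pid)
{
	const struct tnode *n;
	int idx = 0;

	for (n = m->head; n != NULL; n = n->next, idx++)
		if (n->pid == pid)
			return idx;
	return -1;
}
```

Because indices are positions in the list, removing a thread renumbers every thread behind it. Any per-thread structure indexed the same way (fd, id, mmap) has to drop the same position at the same time to stay aligned.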
[PATCH v2 3/4] Transform mmap and other related structures to list with new xyarray
From: chenggang

evlist->mmap, evsel->id and evsel->sample_id are arrays. They cannot easily
be expanded or shrunk for forked and exited threads when we receive fork and
exit events. We transform them into linked lists using the new xyarray.
xyarray is a 2-dimensional structure. The rows are still an array, and a row
represents a cpu. The columns are linked lists, and a column represents a
thread.

Functions are also implemented to expand and shrink mmap, id and sample_id:

1) perf_evsel__append_id_thread()
   Append an id to an evsel when a new thread is detected.
2) perf_evsel__append_fd_thread()
   Append an fd to an evsel when a new thread is detected.
3) perf_evlist__append_mmap_thread()
   Append a new node to evlist->mmap when a new thread is detected.
4) perf_evsel__open_thread()
   Open the fd for the new thread with sys_perf_event_open.
5) perf_evsel__close_thread()
   Close the fd when a thread exits.
6) perf_evlist__mmap_thread()
   mmap a new thread's fd.
7) perf_evlist__munmap_thread()
   munmap an exited thread's fd.

The following macros can be used to reference a specific fd, id, mmap,
sample_id, etc.:

1) FD(cpu, thread)
2) SID(cpu, thread)
3) ID(cpu, thread)
4) MMAP(cpu, thread)

evlist->pollfd is passed to the poll() syscall, so it must remain an array.
We nevertheless implement a function, perf_evlist__append_pollfd_thread(),
to expand and shrink it.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-record.c |   6 +-
 tools/perf/util/evlist.c    | 169 ++-
 tools/perf/util/evlist.h    |   6 +-
 tools/perf/util/evsel.c     |  83 -
 tools/perf/util/evsel.h     |   8 +-
 tools/perf/util/header.c    |  31
 tools/perf/util/header.h    |   3 +-
 7 files changed, 263 insertions(+), 43 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 774c907..13112c6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -31,6 +31,8 @@
 #include
 #include
 
+#define MMAP(e, y) (*(struct perf_mmap *)xyarray__entry(e->mmap, 0, y))
+
 #ifndef HAVE_ON_EXIT
 #ifndef ATEXIT_MAX
 #define ATEXIT_MAX 32
@@ -367,8 +369,8 @@ static int perf_record__mmap_read_all(struct perf_record *rec)
 	int rc = 0;
 
 	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
-		if (rec->evlist->mmap[i].base) {
-			if (perf_record__mmap_read(rec, &rec->evlist->mmap[i]) != 0) {
+		if (MMAP(rec->evlist, i).base) {
+			if (perf_record__mmap_read(rec, &MMAP(rec->evlist, i)) != 0) {
 				rc = -1;
 				goto out;
 			}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d5063d6..90cfbb6 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -25,6 +25,8 @@
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
+#define ID(e, y) (*(u64 *)xyarray__entry(e->id, 0, y))
+#define MMAP(e, y) (*(struct perf_mmap *)xyarray__entry(e->mmap, 0, y))
 
 void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 		       struct thread_map *threads)
@@ -85,7 +87,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
-	free(evlist->mmap);
+	xyarray__delete(evlist->mmap);
 	free(evlist->pollfd);
 	evlist->mmap = NULL;
 	evlist->pollfd = NULL;
@@ -256,6 +258,32 @@ void perf_evlist__enable(struct perf_evlist *evlist)
 	}
 }
 
+/*
+ * If threads->nr > 1, the cpu_map__nr() must be 1.
+ * If the cpu_map__nr() > 1, we should not append pollfd.
+ */
+static int perf_evlist__append_pollfd_thread(struct perf_evlist *evlist)
+{
+	int new_nfds;
+
+	if (cpu_map__all(evlist->cpus)) {
+		struct pollfd *pfd;
+
+		new_nfds = evlist->threads->nr * evlist->nr_entries;
+		pfd = zalloc(sizeof(struct pollfd) * new_nfds);
+
+		if (!pfd)
+			return -1;
+
+		memcpy(pfd, evlist->pollfd, (evlist->threads->nr - 1) * evlist->nr_entries);
+
+		evlist->pollfd = pfd;
+		return 0;
+	}
+
+	return 1;
+}
+
 static int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
 	int nfds = cpu_map__nr(evlist->cpus) * evlist->threads->nr * evlist->nr_entries;
@@ -288,7 +316,7 @@ void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
 			 int cpu, int thread, u
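Since poll() needs a contiguous array, growing evlist->pollfd means reallocating and preserving the existing entries. A hedged sketch of that step, outside perf, is shown below; pollfd_grow() is a hypothetical helper, not a function from this patch. It uses realloc() rather than the zalloc() plus memcpy() pair in the posted perf_evlist__append_pollfd_thread(), where the memcpy() length looks like an element count rather than a byte count and the old array does not appear to be freed.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: grow *pfds from old_n to new_n entries, keeping the
 * existing entries and zero-filling the new tail. Returns 0 on success and
 * -1 on failure, in which case *pfds is untouched and still valid.
 */
int pollfd_grow(struct pollfd **pfds, size_t old_n, size_t new_n)
{
	struct pollfd *bigger;

	if (new_n == 0 || new_n < old_n || new_n > SIZE_MAX / sizeof(**pfds))
		return -1;

	/* realloc() carries the old entries over and releases the old block */
	bigger = realloc(*pfds, new_n * sizeof(**pfds));
	if (bigger == NULL)
		return -1;

	/* sizes passed to memset() are bytes, so scale the element count */
	memset(bigger + old_n, 0, (new_n - old_n) * sizeof(*bigger));
	*pfds = bigger;
	return 0;
}
```

A caller would pass the previous and the new number of entries (for example threads->nr * nr_entries before and after the append) and fill in fd and events for the new tail afterwards.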
[PATCH v2 4/4] Add fork and exit callback functions into top->perf_tool
From: chenggang

Many applications fork threads on the fly, and these threads may exit before
the main thread does. The perf top tool should detect newly forked threads
while we profile a specific application. When the target process forks a
thread, or a thread exits, we get a PERF_RECORD_FORK or PERF_RECORD_EXIT
event. The following callback functions process these events.

1) perf_top__process_event_fork()
   Open a new fd for the newly forked thread and expand the related data
   structures.
2) perf_top__process_event_exit()
   Close the fd of the exited thread and destroy its nodes in the related
   data structures.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-top.c     | 100 +-
 tools/perf/util/evlist.c     |  30 ++---
 tools/perf/util/evsel.c      |  13 +++---
 tools/perf/util/thread_map.c |  13 ++
 tools/perf/util/thread_map.h |   3 --
 5 files changed, 133 insertions(+), 26 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 72f6eb7..94aab11 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -806,7 +806,7 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
 	struct perf_evsel *evsel;
 	struct perf_session *session = top->session;
 	union perf_event *event;
-	struct machine *machine;
+	struct machine *machine = NULL;
 	u8 origin;
 	int ret;
 
@@ -825,6 +825,20 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
 		if (event->header.type == PERF_RECORD_SAMPLE)
 			++top->samples;
 
+		if (cpu_map__all(top->evlist->cpus) &&
+		    event->header.type == PERF_RECORD_FORK)
+			(&top->tool)->fork(&top->tool, event, &sample, machine);
+
+		if (cpu_map__all(top->evlist->cpus) &&
+		    event->header.type == PERF_RECORD_EXIT) {
+			int tidx;
+
+			tidx = (&top->tool)->exit(&top->tool, event,
+				&sample, machine);
+			if (tidx == idx)
+				break;
+		}
+
 		switch (origin) {
 		case PERF_RECORD_MISC_USER:
 			++top->us_samples;
@@ -1024,11 +1038,95 @@ parse_callchain_opt(const struct option *opt, const char *arg, int unset)
 	return record_parse_callchain_opt(opt, arg, unset);
 }
 
+static int perf_top__append_thread(struct perf_top *top, int tidx)
+{
+	struct perf_evsel *counter;
+	struct perf_evlist *evlist = top->evlist;
+	struct cpu_map *cpus = evlist->cpus;
+
+	list_for_each_entry(counter, &evlist->entries, node)
+		if (perf_evsel__open_thread(counter, cpus, evlist->threads, tidx) < 0) {
+			printf("errno: %d\n", errno);
+			return -1;
+		}
+
+	if (perf_evlist__mmap_thread(evlist, false, tidx) < 0)
+		return -1;
+
+	return 0;
+}
+
+static int perf_top__process_event_fork(struct perf_tool *tool __maybe_unused,
+					union perf_event *event __maybe_unused,
+					struct perf_sample *sample __maybe_unused,
+					struct machine *machine __maybe_unused)
+{
+	pid_t tid = event->fork.tid;
+	pid_t ptid = event->fork.ptid;
+	struct perf_top *top = container_of(tool, struct perf_top, tool);
+	struct thread_map *threads = top->evlist->threads;
+	struct perf_evsel *evsel;
+	int i, ret;
+
+	if (!cpu_map__all(top->evlist->cpus))
+		return -1;
+
+	ret = thread_map__append(threads, tid);
+	if (ret == 1)
+		return ret;
+	if (ret == -1)
+		return ret;
+
+	for(i = 0; i < threads->nr; i++) {
+		if (ptid == thread_map__get_pid(threads, i)) {
+			if (perf_top__append_thread(top, threads->nr - 1) < 0)
+				goto free_new_thread;
+			break;
+		}
+	}
+
+	return 0;
+
+free_new_thread:
+	list_for_each_entry(evsel, &top->evlist->entries, node)
+		perf_evsel__close_thread(evsel, top->evlist->cpus->nr, threads->nr - 1);
+	thread_map__remove(threads, threads->nr - 1);
+	return -1;
+}
+
+static int perf_top__process_event_exit(struct perf_tool *tool __maybe_unused,
+					union perf_e
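The bookkeeping behind the two callbacks can be modelled without any perf structures: a fork event adds the child only if its parent is already being profiled, and an exit event removes the thread and reports which index disappeared so the reader can stop using that slot. The sketch below is a simplified model under stated assumptions; struct ev, struct tracked and the tracked_* functions are hypothetical and do not exist in perf. Note that it checks the parent before adding the child, whereas the posted fork callback appends the new tid first and searches for ptid afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event model for this sketch; union perf_event differs. */
enum ev_type { EV_FORK, EV_EXIT };

struct ev {
	enum ev_type type;
	int tid;	/* thread that was created or that exited */
	int ptid;	/* parent thread, meaningful for EV_FORK only */
};

#define MAX_TRACKED 64

struct tracked {
	int tids[MAX_TRACKED];
	int nr;
};

/* Index of tid in the tracked set, or -1 if it is not tracked. */
int tracked_find(const struct tracked *t, int tid)
{
	int i;

	for (i = 0; i < t->nr; i++)
		if (t->tids[i] == tid)
			return i;
	return -1;
}

/*
 * Apply one event. Returns the index that was added (fork) or removed
 * (exit), or -1 if the event did not change the tracked set.
 */
int tracked_apply(struct tracked *t, const struct ev *e)
{
	int idx, i;

	switch (e->type) {
	case EV_FORK:
		/* follow only children of threads that are already profiled */
		if (tracked_find(t, e->ptid) < 0 ||
		    tracked_find(t, e->tid) >= 0 || t->nr == MAX_TRACKED)
			return -1;
		t->tids[t->nr] = e->tid;
		return t->nr++;
	case EV_EXIT:
		idx = tracked_find(t, e->tid);
		if (idx < 0)
			return -1;
		/* close the gap so later indices shift down by one */
		for (i = idx; i + 1 < t->nr; i++)
			t->tids[i] = t->tids[i + 1];
		t->nr--;
		return idx;
	}
	return -1;
}
```

The index returned on exit plays the role of tidx in the hunk above: if it equals the index of the buffer currently being read, the reader has to stop, because that slot has just been torn down.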
[PATCH v2 0/4] perf: Make the 'perf top -p $pid' can perceive the new forked threads.
From: chenggang@taobao.com This patch set add a function that make the 'perf top -p $pid' is able to perceive the new threads that is forked by target processes. 'perf top{record} -p $pid' can perceive the threads are forked before we execute perf, but it cannot perceive the new threads are forked after we started perf. This is perf's important defect, because the applications who will fork new threads on-the-fly are very much. For performance reasons, the event inherit mechanism is forbidden while we use per-task counters. Some internal data structures, such as, thread_map, evlist->mmap, evsel->fd, evsel->id, evsel->sample_id are implemented as arrays at the initialization phase. Their size is fixed, and they cannot be extended or shrinked easily while we want to adjust them for new forked threads and exit threads. So, we have done the following work: 1) Transformed xyarray to linked list. Implementd the interfaces to extand and shrink a exist xyarray. The xyarray is a 2-dimensional structure. The row is still a array (because the number of CPU is fixed forever), the columns are linked list. 2) Transformed evlist->mmap, evsel->fd, evsel->id and evsel->sample_id to list with the new xyarray. Implemented interfaces to expand and shrink these structures. The nodes in these structures can be referenced by some predefined macros, such as FD(cpu, thread), MMAP(cpu, thread), ID(cpu, thread), etc. 3) Transformed thread_map to linked list. Implemented the interfaces to extand and shrink a exist thread_map. 4) Added 2 callback functions to top->perf_tool, they are called while the PERF_RECORD_FORK & PERF_RECORD_EXIT events are got. While a PERF_RECORD_FORK event is got, all related data structures are expanded, a new fd and mmap are opened. While a PERF_RECORD_EXIT event is got, all nodes in the related data structures are removed, the fd and mmap are closed. The linked list is flexible, list_add & list_del can be used easily. 
In addition, the performance penalty (especially in CPU utilization) is low.

This function has already been implemented for 'perf top -p $pid' in patch [4/4] of this patch set. As a next step, 'perf record -p $pid' should be modified in the same way.

Thanks to David Ahern for his suggestion.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Cc: linux-kernel
Signed-off-by: Chenggang Qin

chenggang (4):
  Transform xyarray to linked list.
  Transform thread_map to linked list.
  Transform mmap and other related structures to list with new xyarray.
  Add fork and exit callback functions into top->perf_tool.

 tools/perf/builtin-record.c |6 +-
 tools/perf/builtin-stat.c |2 +-
 tools/perf/builtin-top.c | 100 -
 tools/perf/tests/open-syscall-tp-fields.c |2 +-
 tools/perf/util/event.c | 10 +-
 tools/perf/util/evlist.c | 171 +++--
 tools/perf/util/evlist.h |6 +-
 tools/perf/util/evsel.c | 98 +++--
 tools/perf/util/evsel.h |8 +-
 tools/perf/util/header.c | 31 ++--
 tools/perf/util/header.h |3 +-
 tools/perf/util/python.c |2 +-
 tools/perf/util/thread_map.c | 223 +++--
 tools/perf/util/thread_map.h | 16 ++-
 tools/perf/util/xyarray.c | 85 ++-
 tools/perf/util/xyarray.h | 25 +++-
 16 files changed, 641 insertions(+), 147 deletions(-)

--
1.7.9.5
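The xyarray layout described in this cover letter (a fixed array of rows, one per CPU, with each row holding a linked list of per-thread columns) can be illustrated with a small standalone sketch. This is not the code from the patch set: the names `grid`, `cell`, `grid_append()`, `grid_remove()` and `grid_entry()` are hypothetical, and a plain singly linked list stands in for the kernel-style list.h used by perf.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one xyarray entry: a list node plus payload. */
struct cell {
	struct cell *next;
	void *contents;
};

/* Rows are a fixed array (one per CPU); columns grow and shrink. */
struct grid {
	size_t row_count;
	size_t col_count;
	size_t entry_size;
	struct cell **rows;	/* rows[x] is the head of row x */
};

/* Walk to the link that points at cell y (or at the tail's NULL link). */
static struct cell **row_slot(struct cell **link, size_t y)
{
	while (y-- > 0 && *link)
		link = &(*link)->next;
	return link;
}

/* Unlink and free the cell that *link points at. */
static void drop_cell(struct cell **link)
{
	struct cell *victim = *link;

	*link = victim->next;
	free(victim->contents);
	free(victim);
}

struct grid *grid_new(size_t rows, size_t entry_size)
{
	struct grid *g;

	assert(entry_size > 0);
	g = calloc(1, sizeof(*g));
	if (!g)
		return NULL;
	g->rows = calloc(rows, sizeof(*g->rows));
	if (!g->rows) {
		free(g);
		return NULL;
	}
	g->row_count = rows;
	g->entry_size = entry_size;
	return g;
}

/* Append one column to every row; entries are zeroed or copied from init. */
int grid_append(struct grid *g, const void *init)
{
	size_t x;

	for (x = 0; x < g->row_count; x++) {
		struct cell *c = calloc(1, sizeof(*c));

		if (c)
			c->contents = calloc(1, g->entry_size);
		if (!c || !c->contents) {
			free(c);
			/* Roll back the rows that already got the new column. */
			while (x-- > 0)
				drop_cell(row_slot(&g->rows[x], g->col_count));
			return -1;
		}
		if (init)
			memcpy(c->contents, init, g->entry_size);
		*row_slot(&g->rows[x], g->col_count) = c;
	}
	g->col_count++;
	return 0;
}

/* Remove column y from every row. */
int grid_remove(struct grid *g, size_t y)
{
	if (y >= g->col_count)
		return -1;
	for (size_t x = 0; x < g->row_count; x++)
		drop_cell(row_slot(&g->rows[x], y));
	g->col_count--;
	return 0;
}

void *grid_entry(struct grid *g, size_t x, size_t y)
{
	if (x >= g->row_count || y >= g->col_count)
		return NULL;
	return (*row_slot(&g->rows[x], y))->contents;
}

void grid_delete(struct grid *g)
{
	if (!g)
		return;
	while (g->col_count > 0)
		grid_remove(g, g->col_count - 1);
	free(g->rows);
	free(g);
}
```

In this sketch, looking up an entry by index walks the row's list, so indexed access costs time proportional to the column index; iterating over a row directly avoids that repeated walk.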
[PATCH v3 8/8]Perf: Add some callback functions to process fork & exit events
From: chenggang

Many applications fork threads on the fly, and these threads may exit before the main thread exits. The perf top tool should perceive the newly forked threads while we profile a specific application. If the target process forks a thread, or a thread exits, we get a PERF_RECORD_FORK or PERF_RECORD_EXIT event. The following callback functions process these events.

1) perf_top__process_event_fork()
   Open a new fd for the newly forked thread, and extend the related data structures.
2) perf_top__process_event_exit()
   Close the fd of the exited thread, and destroy the nodes in the related data structures.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-top.c | 109 +-
 1 file changed, 107 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index cff58e5..a591b96 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -800,7 +800,8 @@ static void perf_event__process_sample(struct perf_tool *tool,
 	return;
 }

-static void perf_top__mmap_read_idx(struct perf_top *top, struct perf_mmap *md)
+static int perf_top__mmap_read_idx(struct perf_top *top, struct perf_mmap *md,
+				   int idx)
 {
 	struct perf_sample sample;
 	struct perf_evsel *evsel;
@@ -825,6 +826,20 @@ static void perf_top__mmap_read_idx(struct perf_top *top, struct perf_mmap *md)
 		if (event->header.type == PERF_RECORD_SAMPLE)
 			++top->samples;

+		if (cpu_map__all(top->evlist->cpus) &&
+		    event->header.type == PERF_RECORD_FORK)
+			(&top->tool)->fork(&top->tool, event, &sample, NULL);
+
+		if (cpu_map__all(top->evlist->cpus) &&
+		    event->header.type == PERF_RECORD_EXIT) {
+			int tidx;
+
+			tidx = (&top->tool)->exit(&top->tool, event,
+						  &sample, NULL);
+			if (tidx == idx)
+				return -1;
+		}
+
 		switch (origin) {
 		case PERF_RECORD_MISC_USER:
 			++top->us_samples;
@@ -863,14 +878,18 @@ static void perf_top__mmap_read_idx(struct perf_top *top, struct perf_mmap *md)
 		} else
 			++session->stats.nr_unknown_events;
 	}
+	return 0;
 }

 static void perf_top__mmap_read(struct perf_top *top)
 {
 	struct perf_mmap *md;
+	int i = 0;

 	for_each_mmap(md, top->evlist) {
-		perf_top__mmap_read_idx(top, md);
+		if (perf_top__mmap_read_idx(top, md, i) == -1)
+			break;
+		i++;
 	}
 }

@@ -1025,11 +1044,97 @@ parse_callchain_opt(const struct option *opt, const char *arg, int unset)
 	return record_parse_callchain_opt(opt, arg, unset);
 }

+static int perf_top__append_thread(struct perf_top *top, pid_t pid)
+{
+	char msg[512];
+	struct perf_evsel *counter, *counter_err;
+	struct perf_evlist *evlist = top->evlist;
+	struct cpu_map *cpus = evlist->cpus;
+
+	counter_err = list_entry(evlist->entries.prev, struct perf_evsel, node);
+
+	list_for_each_entry(counter, &evlist->entries, node) {
+		if (perf_evsel__open_single_thread(counter, cpus, pid) < 0) {
+			if (verbose) {
+				perf_evsel__open_strerror(counter,
+							  &top->record_opts.target,
+							  errno, msg, sizeof(msg));
+				ui__warning("%s\n", msg);
+			}
+			counter_err = counter;
+			goto close_opened_fd;
+		}
+	}
+
+	if (perf_evlist__mmap_thread(evlist, false) < 0)
+		goto close_opened_fd;
+
+	return 0;
+
+close_opened_fd:
+	list_for_each_entry(counter, &evlist->entries, node) {
+		perf_evsel__close_single_thread(counter, cpus->nr, -1);
+		if (counter == counter_err)
+			break;
+	}
+	return -1;
+}
+
+static int perf_top__process_event_fork(struct perf_tool *tool __maybe_unused,
+					union perf_event *event __maybe_unused,
+					struct perf_sample *sample __maybe_unused,
+					struct machine *machine __maybe_unused)
+{
+	pid_t tid =
[PATCH v3 7/8]Perf: changed the method to traverse mmap list
From: chenggang

Change the method used to traverse the evlist->mmap list. The evlist->mmap list is traversed very frequently, so the traversal needs to be efficient.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-top.c | 11 ++-
 tools/perf/tests/mmap-basic.c |4 +++-
 tools/perf/tests/open-syscall-tp-fields.c |7 ---
 tools/perf/tests/perf-record.c|7 ---
 tools/perf/util/evlist.c |4 ++--
 tools/perf/util/evlist.h |3 ++-
 tools/perf/util/python.c |4 +++-
 7 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 72f6eb7..cff58e5 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -800,7 +800,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
 	return;
 }

-static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
+static void perf_top__mmap_read_idx(struct perf_top *top, struct perf_mmap *md)
 {
 	struct perf_sample sample;
 	struct perf_evsel *evsel;
@@ -810,7 +810,7 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
 	u8 origin;
 	int ret;

-	while ((event = perf_evlist__mmap_read(top->evlist, idx)) != NULL) {
+	while ((event = perf_evlist__mmap_read(top->evlist, md)) != NULL) {
 		ret = perf_evlist__parse_sample(top->evlist, event, &sample);
 		if (ret) {
 			pr_err("Can't parse sample, err = %d\n", ret);
@@ -867,10 +867,11 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)

 static void perf_top__mmap_read(struct perf_top *top)
 {
-	int i;
+	struct perf_mmap *md;

-	for (i = 0; i < top->evlist->nr_mmaps; i++)
-		perf_top__mmap_read_idx(top, i);
+	for_each_mmap(md, top->evlist) {
+		perf_top__mmap_read_idx(top, md);
+	}
 }

 static int perf_top__start_counters(struct perf_top *top)
diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index cdd5075..93639a8 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -19,6 +19,7 @@ int test__basic_mmap(void)
 {
 	int err = -1;
 	union perf_event *event;
+	struct perf_mmap *md;
 	struct thread_map *threads;
 	struct cpu_map *cpus;
 	struct perf_evlist *evlist;
@@ -97,7 +98,8 @@ int test__basic_mmap(void)
 			++foo;
 		}

-	while ((event = perf_evlist__mmap_read(evlist, 0)) != NULL) {
+	md = perf_evlist__get_mmap(evlist, 0);
+	while ((event = perf_evlist__mmap_read(evlist, md)) != NULL) {
 		struct perf_sample sample;

 		if (event->header.type != PERF_RECORD_SAMPLE) {
diff --git a/tools/perf/tests/open-syscall-tp-fields.c b/tools/perf/tests/open-syscall-tp-fields.c
index 39eb770..cb12e82 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -20,7 +20,7 @@ int test__syscall_open_tp_fields(void)
 	int flags = O_RDONLY | O_DIRECTORY;
 	struct perf_evlist *evlist = perf_evlist__new(NULL, NULL);
 	struct perf_evsel *evsel;
-	int err = -1, i, nr_events = 0, nr_polls = 0;
+	int err = -1, nr_events = 0, nr_polls = 0;

 	if (evlist == NULL) {
 		pr_debug("%s: perf_evlist__new\n", __func__);
@@ -66,11 +66,12 @@ int test__syscall_open_tp_fields(void)

 	while (1) {
 		int before = nr_events;
+		struct perf_mmap *md;

-		for (i = 0; i < evlist->nr_mmaps; i++) {
+		for_each_mmap(md, evlist) {
 			union perf_event *event;

-			while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+			while ((event = perf_evlist__mmap_read(evlist, md)) != NULL) {
 				const u32 type = event->header.type;
 				int tp_flags;
 				struct perf_sample sample;
diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c
index 1e8e512..8aef6d2 100644
--- a/tools/perf/tests/perf-record.c
+++ b/tools/perf/tests/perf-record.c
@@ -56,7 +56,7 @@ int test__PERF_RECORD(void)
 	     found_libc_mmap = false,
 	     found_vdso_mmap = false,
 	     found_ld_mmap = false;
-	int err = -1, errs = 0, i, wakeups = 0;
+	int err = -1, errs = 0, wakeups = 0;
 	u32 cpu;
 	int total_events = 0, nr_events[PERF_RECORD_MAX] = { 0, };

@@ -158,11 +158,12 @@ int test__PERF_RECORD(v
[PATCH v3 6/8]Perf: Add extend mechanism for mmap & pollfd.
From: chenggang

Add an extension mechanism for mmap and pollfd, so that they can be adjusted when threads are forked or exit.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/evlist.c | 151 +-
 tools/perf/util/evlist.h |3 +
 tools/perf/util/evsel.c |7 ++-
 3 files changed, 156 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c1cd8f9..74af9bb 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -85,7 +85,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)

 void perf_evlist__exit(struct perf_evlist *evlist)
 {
-	free(evlist->mmap);
+	xyarray__delete(evlist->mmap);
 	free(evlist->pollfd);
 	evlist->mmap = NULL;
 	evlist->pollfd = NULL;
@@ -256,6 +256,32 @@ void perf_evlist__enable(struct perf_evlist *evlist)
 	}
 }

+/*
+ * If threads->nr > 1, the cpu_map__nr() must be 1.
+ * If the cpu_map__nr() > 1, we should not append pollfd.
+ */
+static int perf_evlist__extend_pollfd(struct perf_evlist *evlist)
+{
+	int new_nfds;
+
+	if (cpu_map__all(evlist->cpus)) {
+		struct pollfd *pfd;
+
+		new_nfds = evlist->threads->nr * evlist->nr_entries;
+		pfd = zalloc(sizeof(struct pollfd) * new_nfds);
+
+		if (!pfd)
+			return -1;
+
+		memcpy(pfd, evlist->pollfd, (evlist->threads->nr - 1) * evlist->nr_entries);
+
+		evlist->pollfd = pfd;
+		return 0;
+	}
+
+	return 1;
+}
+
 static int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 {
 	int nfds = cpu_map__nr(evlist->cpus) * evlist->threads->nr * evlist->nr_entries;
@@ -416,6 +442,20 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 	evlist->mmap = NULL;
 }

+static struct perf_mmap * perf_evlist__extend_mmap(struct perf_evlist *evlist)
+{
+	struct perf_mmap **new_mmap = NULL;
+
+	new_mmap = (struct perf_mmap **)xyarray__append(evlist->mmap, NULL);
+
+	if (new_mmap != NULL) {
+		evlist->nr_mmaps++;
+		return *new_mmap;
+	}
+
+	return NULL;
+}
+
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
@@ -433,7 +473,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist,
 	pmmap->prev = 0;
 	pmmap->mask = mask;
 	pmmap->base = mmap(NULL, evlist->mmap_len, prot,
-				      MAP_SHARED, fd, 0);
+			   MAP_SHARED, fd, 0);
 	if (pmmap->base == MAP_FAILED) {
 		pmmap->base = NULL;
 		return -1;
@@ -527,6 +567,111 @@ out_unmap:
 	return -1;
 }

+int perf_evlist__mmap_thread(struct perf_evlist *evlist, bool overwrite)
+{
+	struct perf_evsel *evsel;
+	int prot = PROT_READ | (overwrite ? 0 : PROT_WRITE);
+	int mask = evlist->mmap_len - page_size -1;
+	int output = -1;
+	struct pollfd *old_pollfd = evlist->pollfd;
+	struct perf_mmap *pmmap;
+
+	if (!cpu_map__all(evlist->cpus))
+		return 1;
+
+	if ((pmmap = perf_evlist__extend_mmap(evlist)) == NULL)
+		return -ENOMEM;
+
+	if (perf_evlist__extend_pollfd(evlist) < 0)
+		goto free_append_mmap;
+
+	list_for_each_entry(evsel, &evlist->entries, node) {
+		if (evsel->attr.read_format & PERF_FORMAT_ID) {
+			if (perf_evsel__extend_id(evsel) < 0)
+				goto free_append_pollfd;
+		}
+	}
+
+	list_for_each_entry(evsel, &evlist->entries, node) {
+		int fd = FD(evsel, 0, -1);
+
+		if (output == -1) {
+			output = fd;
+
+			pmmap->prev = 0;
+			pmmap->mask = mask;
+			pmmap->base = mmap(NULL, evlist->mmap_len, prot,
+					   MAP_SHARED, fd, 0);
+
+			if (pmmap->base == MAP_FAILED) {
+				pmmap->base = NULL;
+				goto out_unmap;
+			}
+			perf_evlist__add_pollfd(evlist, fd);
+		} else {
+			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, output) != 0)
+				goto out_unmap;
+		}
+		if ((evsel->attr.read_format & PERF_FORMAT_ID) &&
+		    perf_evlist__id_add_fd(evlist, evsel, 0, -1, fd) < 0)
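
The pollfd extension step in this patch can be looked at in isolation. The following is a minimal standalone sketch, not the patch's code: `pollfd_grow()` is a hypothetical helper that grows an array of `struct pollfd`, copies the old entries, and marks the new slots unused. Two details are worth spelling out: `memcpy()` takes a size in bytes, so the entry count is multiplied by `sizeof(struct pollfd)`, and the old array is freed once its contents have been copied.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: grow *pfds from old_n to new_n entries.
 * Returns 0 on success and -1 on allocation failure, in which case
 * *pfds is left untouched.
 */
int pollfd_grow(struct pollfd **pfds, size_t old_n, size_t new_n)
{
	struct pollfd *bigger;

	assert(new_n > 0 && new_n >= old_n);

	bigger = calloc(new_n, sizeof(*bigger));
	if (!bigger)
		return -1;

	/* memcpy() counts bytes, so scale the entry count by the entry size. */
	if (old_n)
		memcpy(bigger, *pfds, old_n * sizeof(*bigger));

	/* poll() ignores entries whose fd is negative. */
	for (size_t i = old_n; i < new_n; i++)
		bigger[i].fd = -1;

	free(*pfds);
	*pfds = bigger;
	return 0;
}
```

A caller that needs to roll back after a later failure would keep its own copy of the old contents, since this sketch releases the old array as soon as the copy succeeds.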
[PATCH v3 5/8]Perf: add extend mechanism for evsel->id & evsel->fd
From: chenggang

Add an extension mechanism for evsel->id and evsel->fd.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/evsel.c | 76 ++
 tools/perf/util/evsel.h |8 +
 tools/perf/util/thread_map.c |2 +-
 3 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 015321f..2eb75f9 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -599,6 +599,16 @@ int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
 	return evsel->fd != NULL ? 0 : -ENOMEM;
 }

+/*
+ * Return the pointer to new fds (fds for the new thread at all cpus).
+ */
+static int** perf_evsel__extend_fd(struct perf_evsel *evsel)
+{
+	int init_fd = -1;
+
+	return (int**)xyarray__append(evsel->fd, (char *)&init_fd);
+}
+
 int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
 			   const char *filter)
 {
@@ -617,6 +627,26 @@ int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
 	return 0;
 }

+int perf_evsel__extend_id(struct perf_evsel *evsel)
+{
+	if (xyarray__append(evsel->sample_id, NULL) == NULL)
+		return -ENOMEM;
+
+	if (xyarray__append(evsel->id, NULL) == NULL) {
+		xyarray__remove(evsel->sample_id, -1);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+void perf_evsel__remove_id(struct perf_evsel *evsel, int tidx)
+{
+	xyarray__remove(evsel->id, tidx);
+	evsel->ids--;
+	xyarray__remove(evsel->sample_id, tidx);
+}
+
 int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 {
 	evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
@@ -937,6 +967,52 @@ int perf_evsel__open_per_thread(struct perf_evsel *evsel,
 	return __perf_evsel__open(evsel, &empty_cpu_map.map, threads);
 }

+void perf_evsel__close_single_thread(struct perf_evsel *evsel, int cpu_nr,
+				     int tidx)
+{
+	int cpu;
+
+	for (cpu = 0; cpu < cpu_nr; cpu++) {
+		if (FD(evsel, cpu, tidx) >= 0)
+			close(FD(evsel, cpu, tidx));
+	}
+	xyarray__remove(evsel->fd, tidx);
+}
+
+int perf_evsel__open_single_thread(struct perf_evsel *evsel,
+				   struct cpu_map *cpus, int tid)
+{
+	int cpu;
+	int pid = -1;
+	unsigned long flags = 0;
+	int **new_fds;
+
+	if ((new_fds = perf_evsel__extend_fd(evsel)) == NULL)
+		return -1;
+
+	if (evsel->cgrp) {
+		flags = PERF_FLAG_PID_CGROUP;
+		pid = evsel->cgrp->fd;
+	}
+
+	for (cpu = 0; cpu < cpus->nr; cpu++) {
+		int group_fd;
+
+		if (!evsel->cgrp)
+			pid = tid;
+
+		group_fd = get_group_fd(evsel, cpu, -1);
+		evsel->attr.disabled = 0;
+		*new_fds[cpu] = sys_perf_event_open(&evsel->attr, pid,
+						    cpus->map[cpu], group_fd,
+						    flags);
+		if (*new_fds[cpu] < 0)
+			return -errno;
+	}
+
+	return 0;
+}
+
 static int perf_evsel__parse_id_sample(const struct perf_evsel *evsel,
 				       const union perf_event *event,
 				       struct perf_sample *sample)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 7adb116..ae391d4 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -128,6 +128,9 @@ void perf_evsel__close_fd(struct perf_evsel *evsel, int ncpus, int nthreads);
 void perf_evsel__id_new(struct perf_evsel *evsel, int nr);
 u64 *perf_evsel__get_id(struct perf_evsel *evsel, int idx);

+int perf_evsel__extend_id(struct perf_evsel *evsel);
+void perf_evsel__remove_id(struct perf_evsel *evsel, int tidx);
+
 void __perf_evsel__set_sample_bit(struct perf_evsel *evsel,
 				  enum perf_event_sample_format bit);
 void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
@@ -152,6 +155,11 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		     struct thread_map *threads);
 void perf_evsel__close(struct perf_evsel *evsel, int ncpus, int nthreads);

+int perf_evsel__open_single_thread(struct perf_evsel *evsel,
+				   struct cpu_map *cpus, int tid);
+void perf_evsel__close_single_thread(struct perf_evsel *evsel, int cpu_nr,
+				     int ti
[PATCH v3 4/8]perf: Transform evsel->id to xyarray
From: chenggang

Transform evsel->id to an xyarray, so that it becomes a linked list instead of an array.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/evlist.c |4 +++-
 tools/perf/util/evsel.c | 19 +--
 tools/perf/util/evsel.h |5 -
 tools/perf/util/header.c | 28 ++--
 tools/perf/util/header.h |3 ++-
 5 files changed, 44 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7515651..c1cd8f9 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -287,8 +287,10 @@ static void perf_evlist__id_hash(struct perf_evlist *evlist,
 void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
 			 int cpu, int thread, u64 id)
 {
+	u64* idp = perf_evsel__get_id(evsel, -1);
 	perf_evlist__id_hash(evlist, evsel, cpu, thread, id);
-	evsel->id[evsel->ids++] = id;
+	*idp = id;
+	evsel->ids++;
 }

 static int perf_evlist__id_add_fd(struct perf_evlist *evlist,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 57c569d..015321f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -623,7 +623,7 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 	if (evsel->sample_id == NULL)
 		return -ENOMEM;

-	evsel->id = zalloc(ncpus * nthreads * sizeof(u64));
+	evsel->id = xyarray__new(1, ncpus * nthreads, sizeof(u64));
 	if (evsel->id == NULL) {
 		xyarray__delete(evsel->sample_id);
 		evsel->sample_id = NULL;
@@ -633,6 +633,21 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 	return 0;
 }

+void perf_evsel__id_new(struct perf_evsel *evsel, int nr)
+{
+	if (evsel->id)
+		xyarray__delete(evsel->id);
+
+	evsel->id = NULL;
+
+	evsel->id = xyarray__new(1, nr, sizeof(u64));
+}
+
+u64 *perf_evsel__get_id(struct perf_evsel *evsel, int idx)
+{
+	return (u64 *)xyarray__entry(evsel->id, 0, idx);
+}
+
 int perf_evsel__alloc_counts(struct perf_evsel *evsel, int ncpus)
 {
 	evsel->counts = zalloc((sizeof(*evsel->counts) +
@@ -650,7 +665,7 @@ void perf_evsel__free_id(struct perf_evsel *evsel)
 {
 	xyarray__delete(evsel->sample_id);
 	evsel->sample_id = NULL;
-	free(evsel->id);
+	xyarray__delete(evsel->id);
 	evsel->id = NULL;
 }

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 52021c3..7adb116 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -51,7 +51,7 @@ struct perf_evsel {
 	char			*filter;
 	struct xyarray		*fd;
 	struct xyarray		*sample_id;
-	u64			*id;
+	struct xyarray		*id;
 	struct perf_counts	*counts;
 	struct perf_counts	*prev_raw_counts;
 	int			idx;
@@ -125,6 +125,9 @@ void perf_evsel__free_id(struct perf_evsel *evsel);
 void perf_evsel__free_counts(struct perf_evsel *evsel);
 void perf_evsel__close_fd(struct perf_evsel *evsel, int ncpus, int nthreads);

+void perf_evsel__id_new(struct perf_evsel *evsel, int nr);
+u64 *perf_evsel__get_id(struct perf_evsel *evsel, int idx);
+
 void __perf_evsel__set_sample_bit(struct perf_evsel *evsel,
 				  enum perf_event_sample_format bit);
 void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f4bfd79..d344e61 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1325,19 +1325,18 @@ read_event_desc(struct perf_header *ph, int fd)
 		if (!nr)
 			continue;

-		id = calloc(nr, sizeof(*id));
-		if (!id)
-			goto error;
 		evsel->ids = nr;
-		evsel->id = id;
+		perf_evsel__id_new(evsel, nr);
+		if (!evsel->id)
+			goto error;

 		for (j = 0 ; j < nr; j++) {
+			id = perf_evsel__get_id(evsel, j);
 			ret = readn(fd, id, sizeof(*id));
 			if (ret != (ssize_t)sizeof(*id))
 				goto error;
 			if (ph->needs_swap)
 				*id = bswap_64(*id);
-			id++;
 		}
 	}
 out:
@@ -1384,7 +1383,8 @@ static void print_event_desc(struct perf_header *ph, int fd, FILE *fp)
 		if (evsel->ids) {
 			fprintf(fp, "
[PATCH v3 3/8]Perf: Transform evlist->mmap to xyarray
From: chenggang

Transform evlist->mmap to an xyarray, so that evlist->mmap becomes a linked list as well.

1) perf_evlist__mmap_thread()
   mmap a new fd for a thread forked on the fly.
2) perf_evlist__munmap_thread()
   munmap the fd of a thread that exited on the fly.
3) perf_evlist__get_mmap()
   Get a perf_mmap struct from the evlist->mmap list by its index.
4) for_each_mmap(md, evlist)
   Traverse all perf_mmap structures in the evlist->mmap list.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/Makefile |3 ++-
 tools/perf/builtin-record.c |8 +++
 tools/perf/util/evlist.c| 49 ++-
 tools/perf/util/evlist.h|8 ++-
 4 files changed, 43 insertions(+), 25 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..7f3f066 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -209,7 +209,8 @@ BASIC_CFLAGS = \
 	-Iutil \
 	-I. \
 	-I$(TRACE_EVENT_DIR) \
-	-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE
+	-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE \
+	-std=gnu99

 BASIC_LDFLAGS =

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 774c907..3bca0b2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -363,12 +363,12 @@ static struct perf_event_header finished_round_event = {

 static int perf_record__mmap_read_all(struct perf_record *rec)
 {
-	int i;
 	int rc = 0;
+	struct perf_mmap *pmmap = NULL;

-	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
-		if (rec->evlist->mmap[i].base) {
-			if (perf_record__mmap_read(rec, &rec->evlist->mmap[i]) != 0) {
+	for_each_mmap(pmmap, rec->evlist) {
+		if (pmmap->base) {
+			if (perf_record__mmap_read(rec, pmmap) != 0) {
 				rc = -1;
 				goto out;
 			}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d5063d6..7515651 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -336,7 +336,7 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)

 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
+	struct perf_mmap *md = perf_evlist__get_mmap(evlist, idx);
 	unsigned int head = perf_mmap__read_head(md);
 	unsigned int old = md->prev;
 	unsigned char *data = md->base + page_size;
@@ -401,16 +401,16 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)

 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
-	int i;
+	struct perf_mmap *pmmap = NULL;

-	for (i = 0; i < evlist->nr_mmaps; i++) {
-		if (evlist->mmap[i].base != NULL) {
-			munmap(evlist->mmap[i].base, evlist->mmap_len);
-			evlist->mmap[i].base = NULL;
+	for_each_mmap(pmmap, evlist) {
+		if (pmmap->base != NULL) {
+			munmap(pmmap->base, evlist->mmap_len);
+			pmmap->base = NULL;
 		}
 	}

-	free(evlist->mmap);
+	xyarray__delete(evlist->mmap);
 	evlist->mmap = NULL;
 }

@@ -419,19 +419,21 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__all(evlist->cpus))
 		evlist->nr_mmaps = evlist->threads->nr;
-	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
+	evlist->mmap = xyarray__new(1, evlist->nr_mmaps, sizeof(struct perf_mmap));
 	return evlist->mmap != NULL ? 0 : -ENOMEM;
 }

 static int __perf_evlist__mmap(struct perf_evlist *evlist,
 			       int idx, int prot, int mask, int fd)
 {
-	evlist->mmap[idx].prev = 0;
-	evlist->mmap[idx].mask = mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, prot,
+	struct perf_mmap *pmmap = perf_evlist__get_mmap(evlist, idx);
+
+	pmmap->prev = 0;
+	pmmap->mask = mask;
+	pmmap->base = mmap(NULL, evlist->mmap_len, prot,
 				      MAP_SHARED, fd, 0);
-	if (evlist->mmap[idx].base == MAP_FAILED) {
-		evlist->mmap[idx].base = NULL;
+	if (pmmap->base == MAP_FAILED) {
+		pmmap->base = NULL;
 		return -1;
 	}

@@ -472,9 +474,11 @@ static int perf_evlist__mmap_
[PATCH v3 2/8]Perf: Transform xyarray to linked list
From: chenggang

The 2-dimensional array cannot easily be expanded or shrunk when we want to track thread fork and exit events on the fly. We transform xyarray to a 2-dimensional linked list. The x dimension is the CPUs and is still an array. The y dimension is the threads of interest and is transformed to a linked list. Interfaces to extend and shrink an existing xyarray are provided.

1) xyarray__append()
   Append a column to all rows.
2) xyarray__remove()
   Remove a column from all rows.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/xyarray.c | 125 +++--
 tools/perf/util/xyarray.h | 68 ++--
 2 files changed, 185 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c
index 22afbf6..ddb3bff 100644
--- a/tools/perf/util/xyarray.c
+++ b/tools/perf/util/xyarray.c
@@ -1,20 +1,135 @@
 #include "xyarray.h"
 #include "util.h"

+/*
+ * Add a column for all rows;
+ * @init_cont stores the initialize value for new entries.
+ * The return value is the array of new contents.
+ */
+char** xyarray__append(struct xyarray *xy, char *init_cont)
+{
+	struct xyentry *new_entry;
+	unsigned int x;
+	char **new_conts;
+
+	new_conts = zalloc(sizeof(char *) * xy->row_count);
+	if (new_conts == NULL)
+		return NULL;
+
+	for (x = 0; x < xy->row_count; x++) {
+		new_entry = zalloc(sizeof(*new_entry));
+		if (new_entry == NULL) {
+			free(new_conts);
+			return NULL;
+		}
+
+		new_entry->contents = zalloc(xy->entry_size);
+		if (new_entry->contents == NULL) {
+			free(new_entry);
+			free(new_conts);
+			return NULL;
+		}
+
+		if (init_cont)
+			memcpy(new_entry->contents, init_cont, xy->entry_size);
+
+		new_conts[x] = new_entry->contents;
+
+		list_add_tail(&new_entry->next, &xy->rows[x].head);
+	}
+
+	return new_conts;
+}
+
 struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
 {
-	size_t row_size = ylen * entry_size;
-	struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);
+	struct xyarray *xy = zalloc(sizeof(*xy) + xlen * sizeof(struct row));
+	int i;
+
+	if (xy == NULL)
+		return NULL;
+
+	xy->row_count = xlen;
+	xy->entry_size = entry_size;

-	if (xy != NULL) {
-		xy->entry_size = entry_size;
-		xy->row_size = row_size;
+	for (i = 0; i < xlen; i++)
+		INIT_LIST_HEAD(&xy->rows[i].head);
+
+	for (i = 0; i< ylen; i++) {
+		if (xyarray__append(xy, NULL) == NULL) {
+			xyarray__delete(xy);
+			return NULL;
+		}
 	}

 	return xy;
 }

+static inline int xyarray__remove_last(struct xyarray *xy)
+{
+	struct xyentry *entry;
+	unsigned int x;
+
+	if (xy == NULL)
+		return -1;
+
+	for (x = 0; x < xy->row_count; x++) {
+		if (!list_empty(&xy->rows[x].head)) {
+			entry = list_entry(xy->rows[x].head.prev,
+					   struct xyentry, next);
+			list_del(&entry->next);
+			free(entry);
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * remove a column for all rows;
+ */
+int xyarray__remove(struct xyarray *xy, int y)
+{
+	struct xyentry *entry, *tmp;
+	unsigned int x;
+	int count;
+
+	if (xy == NULL)
+		return -1;
+
+	if (y == -1)
+		return xyarray__remove_last(xy);
+
+	for (x = 0; x < xy->row_count; x++) {
+		count = 0;
+		list_for_each_entry_safe(entry, tmp, &xy->rows[x].head, next) {
+			if (count++ == y) {
+				list_del(&entry->next);
+				free(entry);
+			}
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * delete @xy and all its nodes.
+ */
 void xyarray__delete(struct xyarray *xy)
 {
+	unsigned int i;
+	struct xyentry *entry, *tmp;
+
+	if (!xy)
+		return;
+
+	for (i = 0; i < xy->row_count; i++) {
+		list_for_each_entry_safe(entry, tmp, &xy->rows[i].head, next) {
+			list_del(&entry->next);
+			free(entry);
+
[PATCH v3 1/8]Perf: Transform thread_map to linked list
From: chenggang

The size of thread_map is fixed at initialization according to the files in /proc/{$pid}. It cannot be expanded or shrunk when we want to track thread fork and exit events. We transform the thread_map structure to a linked list, and implement some interfaces to extend and shrink it. To stay compatible with the existing code, a thread can still be looked up by its index in the thread_map.

1) thread_map__append()
   Append a new thread to the thread_map according to the new thread's pid.
2) thread_map__remove()
   Remove an existing thread from the thread_map according to its index in the thread_map.
3) thread_map__init()
   Allocate a thread_map and initialize it. The thread_map is empty after this function returns; call thread_map__append() to insert threads.
4) thread_map__delete()
   Delete an existing thread_map.
5) thread_map__set_pid()
   Set the pid of a thread by its index in the thread_map.
6) thread_map__get_pid()
   Get a thread's pid by its index in the thread_map.
7) thread_map__get_idx_by_pid()
   Get a thread's index in the thread_map according to its pid. When we get a PERF_RECORD_EXIT event, we only know the pid of the thread.
8) thread_map__empty_thread_map()
   Return an empty thread_map that contains only a dummy thread. This function replaces the global variable empty_thread_map.
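
The interface listed above can be pictured with a small standalone sketch. It is not the implementation from this patch: the `tmap_*` names are hypothetical, a plain singly linked list is used instead of perf's list.h, and only the append, remove, get-by-index and find-by-pid operations are shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical node and map types, standing in for the real thread_map. */
struct tnode {
	struct tnode *next;
	pid_t pid;
};

struct tmap {
	int nr;			/* number of threads currently in the map */
	struct tnode *head;
};

/* Allocate an empty map; threads are added afterwards with tmap_append(). */
struct tmap *tmap_init(void)
{
	return calloc(1, sizeof(struct tmap));
}

/* Append a thread at the tail; returns its index, or -1 on failure. */
int tmap_append(struct tmap *m, pid_t pid)
{
	struct tnode **link;
	struct tnode *n;

	assert(m != NULL);
	n = malloc(sizeof(*n));
	if (!n)
		return -1;
	n->pid = pid;
	n->next = NULL;

	for (link = &m->head; *link; link = &(*link)->next)
		;
	*link = n;
	return m->nr++;
}

/* Index-based lookup, kept for callers that still think in array terms. */
pid_t tmap_get_pid(const struct tmap *m, int idx)
{
	const struct tnode *n = m->head;

	if (idx < 0 || idx >= m->nr)
		return -1;
	while (idx-- > 0)
		n = n->next;
	return n->pid;
}

/* Reverse lookup: an exit event only carries the pid, not the index. */
int tmap_get_idx_by_pid(const struct tmap *m, pid_t pid)
{
	int idx = 0;

	for (const struct tnode *n = m->head; n; n = n->next, idx++)
		if (n->pid == pid)
			return idx;
	return -1;
}

/* Remove the thread at idx; later threads shift down by one index. */
int tmap_remove(struct tmap *m, int idx)
{
	struct tnode **link = &m->head;
	struct tnode *victim;

	if (idx < 0 || idx >= m->nr)
		return -1;
	while (idx-- > 0)
		link = &(*link)->next;
	victim = *link;
	*link = victim->next;
	free(victim);
	m->nr--;
	return 0;
}

void tmap_delete(struct tmap *m)
{
	if (!m)
		return;
	while (m->nr > 0)
		tmap_remove(m, 0);
	free(m);
}
```

Because removal shifts the indices of all later threads, any other per-thread structure indexed the same way has to drop the same position at the same time to stay aligned.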
Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-stat.c |2 +-
 tools/perf/tests/open-syscall-tp-fields.c |2 +-
 tools/perf/util/event.c | 12 +-
 tools/perf/util/evlist.c |2 +-
 tools/perf/util/evsel.c | 16 +-
 tools/perf/util/python.c |2 +-
 tools/perf/util/thread_map.c | 281 ++---
 tools/perf/util/thread_map.h | 17 +-
 8 files changed, 244 insertions(+), 90 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9984876..293b09c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -401,7 +401,7 @@ static int __run_perf_stat(int argc __maybe_unused, const char **argv)
 	}

 	if (perf_target__none(&target))
-		evsel_list->threads->map[0] = child_pid;
+		thread_map__set_pid(evsel_list->threads, 0, child_pid);

 	/*
 	 * Wait for the child to be ready to exec.
diff --git a/tools/perf/tests/open-syscall-tp-fields.c b/tools/perf/tests/open-syscall-tp-fields.c
index 1c52fdc..39eb770 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -43,7 +43,7 @@ int test__syscall_open_tp_fields(void)

 	perf_evsel__config(evsel, &opts);

-	evlist->threads->map[0] = getpid();
+	thread_map__append(evlist->threads, getpid());

 	err = perf_evlist__open(evlist);
 	if (err < 0) {
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 5cd13d7..d093460 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -326,9 +326,11 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,

 	err = 0;
 	for (thread = 0; thread < threads->nr; ++thread) {
+		pid_t pid = thread_map__get_pid(threads, thread);
+
 		if (__event__synthesize_thread(comm_event, mmap_event,
-					       threads->map[thread], 0,
-					       process, tool, machine)) {
+					       pid, 0, process, tool,
+					       machine)) {
 			err = -1;
 			break;
 		}
@@ -337,12 +339,14 @@ int perf_event__synthesize_thread_map(struct perf_tool *tool,
 		 * comm.pid is set to thread group id by
 		 * perf_event__synthesize_comm
 		 */
-		if ((int) comm_event->comm.pid != threads->map[thread]) {
+		if ((int) comm_event->comm.pid != pid) {
 			bool need_leader = true;

 			/* is thread group leader in thread_map? */
 			for (j = 0; j < threads->nr; ++j) {
-				if ((int) comm_event->comm.pid == threads->map[j]) {
+				pid_t pidj = thread_map__get_pid(threads, j);
+
+				if ((int) comm_event->comm.pid == pidj) {
 					need_leader = false;
 					break;
 				}
diff --git
[PATCH v3 0/8] Perf: Make 'perf top -p $pid' perceive newly forked threads.
From: chenggang

This patch set is based on the 3.8-rc7 kernel. This is version 3, in which I optimized the performance and the structure.

This patch set makes 'perf top -p $pid' able to perceive new threads that are forked by the target processes. 'perf top{record} -p $pid' can perceive the threads that were forked before perf was started, but it cannot perceive the threads that are forked after perf has started. This is an important defect in perf, because many applications fork new threads on the fly.

For performance reasons, the event inherit mechanism is not allowed when per-task counters are used. Some internal data structures, such as thread_map, evlist->mmap, evsel->fd, evsel->id and evsel->sample_id, are implemented as arrays allocated at the initialization phase. Their size is fixed, and they cannot easily be expanded for newly forked threads.

So, we have done the following work:

1) Transformed thread_map into a linked list. Implemented the interfaces to extend and shrink an existing thread_map.
2) Transformed xyarray into a linked list. Implemented the interfaces to extend and shrink an existing xyarray. The xyarray is a 2-dimensional structure. The x-dimension is cpus, and it is still an array. The y-dimension is the threads of interest, and it is now a linked list.
3) Implemented evlist->mmap, evsel->fd, evsel->id and evsel->sample_id with the new xyarray. Implemented interfaces to expand and shrink these structures.
4) Added 2 callback functions to top->perf_tool. They are called when PERF_RECORD_FORK & PERF_RECORD_EXIT events are received. When a PERF_RECORD_FORK event is received, all related data structures are expanded, and a new fd and mmap are opened. When a PERF_RECORD_EXIT event is received, all corresponding nodes in the related data structures are removed.

The linked list is flexible: list_add & list_del can be used easily. In addition, the performance penalty (especially in CPU utilization) is low.
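As a rough illustration of item 1 (a thread_map kept as a linked list that can be extended and shrunk), here is a minimal, self-contained sketch in C. The names tm_node, tm_list and the tm_list__* helpers are hypothetical and are not the interfaces added by this patch set; the real code uses perf's own list helpers (list_add & list_del), while this sketch uses a plain singly linked list to stay standalone.

```c
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical node and list types, for illustration only. */
struct tm_node {
	pid_t pid;
	struct tm_node *next;
};

struct tm_list {
	struct tm_node *head;
	int nr;			/* number of threads in the list */
};

/* Append a pid at the tail; returns 0 on success, -1 if malloc fails. */
int tm_list__append(struct tm_list *l, pid_t pid)
{
	struct tm_node *n = malloc(sizeof(*n));
	struct tm_node **pp = &l->head;

	if (n == NULL)
		return -1;
	n->pid = pid;
	n->next = NULL;
	while (*pp != NULL)
		pp = &(*pp)->next;
	*pp = n;
	l->nr++;
	return 0;
}

/* Remove the first node holding pid; returns 0 if removed, -1 if absent. */
int tm_list__remove(struct tm_list *l, pid_t pid)
{
	struct tm_node **pp = &l->head;

	while (*pp != NULL) {
		if ((*pp)->pid == pid) {
			struct tm_node *dead = *pp;

			*pp = dead->next;
			free(dead);
			l->nr--;
			return 0;
		}
		pp = &(*pp)->next;
	}
	return -1;
}

/* Return the pid stored at position idx, or -1 if idx is out of range. */
pid_t tm_list__get_pid(const struct tm_list *l, int idx)
{
	const struct tm_node *n = l->head;

	while (n != NULL && idx-- > 0)
		n = n->next;
	return n ? n->pid : -1;
}
```

With a list, adding the tid from a PERF_RECORD_FORK event or dropping the tid from a PERF_RECORD_EXIT event does not require reallocating the whole map, at the cost of O(n) lookups by index.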
At the end of this cover letter, I attached a test program and its Makefile. After it is started, we get its pid. Then, use this command:

'perf top -p *pid*'

perf top will then show the functions called by the threads that are forked on the fly. We can use the 'top' tool to monitor the overhead of 'perf'. The result shows that the CPU overhead of this patch set is less than 3%. I think this overhead is acceptable.

My test environment is as follows:

#
# captured on: Wed Mar 13 15:23:55 2013
# perf version : 3.8.rc7.ga39f52
# arch : x86_64
# nrcpus online : 2
# nrcpus avail : 2
# cpudesc : Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz
# cpuid : GenuineIntel,6,23,10
# total memory : 3034932 kB
#

This function has already been implemented for 'perf top -p $pid' in patch [8/8] of this patch set. As a next step, 'perf record -p $pid' should be modified in the same way.

Thanks to David Ahern for his suggestions.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin

chenggang (8):
  changed thread_map to list
  changed xyarray to list
  changed mmap to xyarray
  changed evsel->id to xyarray
  extend mechanism for evsel->id & evsel->fd
  add some operations for mmap
  changed the method to traverse mmap list
  fork & exit event perceived

 tools/perf/Makefile | 3 +-
 tools/perf/builtin-record.c | 8 +-
 tools/perf/builtin-stat.c | 2 +-
 tools/perf/builtin-top.c | 116 -
 tools/perf/tests/mmap-basic.c | 4 +-
 tools/perf/tests/open-syscall-tp-fields.c | 9 +-
 tools/perf/tests/perf-record.c | 7 +-
 tools/perf/util/event.c | 12 +-
 tools/perf/util/evlist.c | 206 +++---
 tools/perf/util/evlist.h | 14 +-
 tools/perf/util/evsel.c | 118 +++--
 tools/perf/util/evsel.h | 13 +-
 tools/perf/util/header.c | 28 +--
 tools/perf/util/header.h | 3 +-
 tools/perf/util/python.c | 6 +-
 tools/perf/util/thread_map.c | 265 +
 tools/perf/util/thread_map.h | 16 +-
 tools/perf/util/xyarray.c | 125 +-
 tools/perf/util/xyarray.h | 68 +++-
 19 files changed, 866 insertions(+), 157 deletions(-)

---
Here is a program to test the patch set.
---
#include
#include
#include
#include
#include
#include
#include
#include
#include
#define CHILDREN_NUM 15000
#define UINT_MAX (~0U)
unsigned int ne
[PATCH] Perf: Fix Makefile to remove all "*.o" files on "make clean"
From: chenggang

If we run "make clean" in perf's directory and then run the command:

"find ./ -name *.o"

we will get:

./arch/x86/util/unwind.o
./arch/x86/util/header.o
./arch/x86/util/dwarf-regs.o
./util/scripting-engines/trace-event-python.o
./util/scripting-engines/trace-event-perl.o
./util/probe-finder.o
./util/dwarf-aux.o
./util/unwind.o
./lib/rbtree.o
./ui/browser.o
./ui/browsers/map.o
./ui/browsers/annotate.o
./ui/browsers/scripts.o
./ui/browsers/hists.o
./ui/tui/setup.o
./ui/tui/util.o
./ui/tui/helpline.o
./ui/tui/progress.o
./ui/gtk/browser.o
./ui/gtk/setup.o
./ui/gtk/util.o
./ui/gtk/helpline.o
./ui/gtk/annotate.o
./ui/gtk/progress.o
./ui/gtk/hists.o
./scripts/perl/Perf-Trace-Util/Context.o
./scripts/python/Perf-Trace-Util/Context.o

These ".o" files are not cleaned. This patch fixes this problem.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/Makefile | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..20ed83c 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -1173,6 +1173,16 @@ clean: $(LIBTRACEEVENT)-clean
 	$(RM) $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)PERF-CFLAGS
 	$(RM) $(OUTPUT)util/*-bison*
 	$(RM) $(OUTPUT)util/*-flex*
+	$(RM) $(OUTPUT)util/*.o
+	$(RM) $(OUTPUT)util/scripting-engines/*.o
+	$(RM) $(OUTPUT)scripts/perl/Perf-Trace-Util/*.o
+	$(RM) $(OUTPUT)scripts/python/Perf-Trace-Util/*.o
+	$(RM) $(OUTPUT)ui/*.o
+	$(RM) $(OUTPUT)ui/browsers/*.o
+	$(RM) $(OUTPUT)ui/tui/*.o
+	$(RM) $(OUTPUT)ui/gtk/*.o
+	$(RM) $(OUTPUT)lib/*.o
+	$(RM) $(OUTPUT)arch/$(ARCH)/util/*.o
 	$(python-clean)
 
 .PHONY: all install clean strip $(LIBTRACEEVENT)
--
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] Perf: Fix Makefile to clean all object files
From: Chenggang Qin

If we execute "make clean" in perf's directory, many object files are not removed in the current version.

For example, if we run "make clean" in perf's directory and then run the command:

"find ./ -name "*.o""

we will get:

./arch/x86/util/unwind.o
./arch/x86/util/header.o
./arch/x86/util/dwarf-regs.o
./util/scripting-engines/trace-event-python.o
./util/scripting-engines/trace-event-perl.o
./util/probe-finder.o
./util/dwarf-aux.o
./util/unwind.o
... ...

These ".o" files are not cleaned. The reason is that these object files are added to "BUILTIN_OBJS" while the "make" process checks the environment. If the make target is "clean", the environment check is not executed, so these object files are not added to "BUILTIN_OBJS" when we execute "make clean".

This patch fixes this problem. We only add one command:

"find . -name "*.o" -exec rm -f {} \;"

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/Makefile | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..dec08ba 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -1174,6 +1174,7 @@ clean: $(LIBTRACEEVENT)-clean
 	$(RM) $(OUTPUT)util/*-bison*
 	$(RM) $(OUTPUT)util/*-flex*
 	$(python-clean)
+	find . -name "*.o" -exec rm -f {} \;
 
 .PHONY: all install clean strip $(LIBTRACEEVENT)
 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
--
1.7.8.rc2.5.g815b
[PATCH v3] Perf: Fix Makefile to clean all object files
From: Chenggang Qin

If we execute "make clean" in perf's directory, many object files are not removed in the current version.

For example, if we run "make clean" in perf's directory and then run the command:

"find ./ -name "*.o""

we will get:

./arch/x86/util/unwind.o
./arch/x86/util/header.o
./arch/x86/util/dwarf-regs.o
./util/scripting-engines/trace-event-python.o
./util/scripting-engines/trace-event-perl.o
./util/probe-finder.o
./util/dwarf-aux.o
./util/unwind.o
... ...

These ".o" files are not cleaned. The reason is that these object files are added to "BUILTIN_OBJS" while the "make" process checks the environment. If the make target is "clean", the environment check is not executed, so these object files are not added to "BUILTIN_OBJS" when we execute "make clean".

This patch fixes this problem. We only add one command:

"find . -name "*.o" -exec rm -f {} \;"

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/Makefile | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..dec08ba 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -1174,6 +1174,7 @@ clean: $(LIBTRACEEVENT)-clean
 	$(RM) $(OUTPUT)util/*-bison*
 	$(RM) $(OUTPUT)util/*-flex*
 	$(python-clean)
+	$(FIND) . -name "*.o" -exec rm -f {} \;
 
 .PHONY: all install clean strip $(LIBTRACEEVENT)
 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
--
1.7.8.rc2.5.g815b
[PATCH v4] Perf: Fix Makefile to clean all object files
From: Chenggang Qin

If we execute "make clean" in perf's directory, many object files are not removed in the current version.

For example, if we run "make clean" in perf's directory and then run the command:

"find ./ -name "*.o""

we will get:

./arch/x86/util/unwind.o
./arch/x86/util/header.o
./arch/x86/util/dwarf-regs.o
./util/scripting-engines/trace-event-python.o
./util/scripting-engines/trace-event-perl.o
./util/probe-finder.o
./util/dwarf-aux.o
./util/unwind.o
... ...

These ".o" files are not cleaned. The reason is that these object files are added to "BUILTIN_OBJS" while the "make" process checks the environment. If the make target is "clean", the environment check is not executed, so these object files are not added to "BUILTIN_OBJS" when we execute "make clean".

This patch fixes this problem. We only add one command:

"find . -name "*.o" -exec rm -f {} \;"

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/Makefile | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index a2108ca..dec08ba 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -1174,6 +1174,7 @@ clean: $(LIBTRACEEVENT)-clean
 	$(RM) $(OUTPUT)util/*-bison*
 	$(RM) $(OUTPUT)util/*-flex*
 	$(python-clean)
+	$(FIND) $(OUTPUT) -name "*.o" -delete
 
 .PHONY: all install clean strip $(LIBTRACEEVENT)
 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
--
1.7.8.rc2.5.g815b
[PATCH 3/5] perf tools: Add bitmap field to thread_map to support awareness of new threads.
During its life cycle, a target thread may fork many threads. But in the current version of 'perf top{record} -p $pid', the newly forked threads cannot be perceived by perf. The contents of thread_map and other related structures need to be refreshed on the fly to track the threads' fork and exit. A simple way is to pre-allocate a large array and use a bitmap to record which positions are in use. This patch adds a bitmap field to struct thread_map and modifies the related code in thread_map.c & evsel.c. In this patch, however, the bitmap mechanism cannot be fully used yet, because the interfaces of evlist and evsel have not been modified.

Cc: David Ahern
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/evsel.c | 19 -
 tools/perf/util/thread_map.c | 171 ++
 tools/perf/util/thread_map.h | 8 ++
 3 files changed, 116 insertions(+), 82 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 1b16dd1..a34167f 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -790,11 +790,21 @@ static struct { .cpus = { -1, }, }; +/* + * while we use empty_thread_map, we should clear the empty_thread_bitmap, + * and set the first bit. 
+ */ +static DECLARE_BITMAP(empty_thread_bitmap, PID_MAX_DEFAULT); + static struct { struct thread_map map; int threads[1]; } empty_thread_map = { - .map.nr = 1, + .map = { + .max_nr = MAX_THREADS_NR_DEFAULT, + .nr = 1, + .bitmap = empty_thread_bitmap, + }, .threads = { -1, }, }; @@ -806,8 +816,11 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus, cpus = &empty_cpu_map.map; } - if (threads == NULL) + if (threads == NULL) { threads = &empty_thread_map.map; + bitmap_zero(threads->bitmap, PID_MAX_DEFAULT); + set_bit(0, threads->bitmap); + } return __perf_evsel__open(evsel, cpus, threads); } @@ -815,6 +828,8 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus, int perf_evsel__open_per_cpu(struct perf_evsel *evsel, struct cpu_map *cpus) { + bitmap_zero(empty_thread_map.map.bitmap, PID_MAX_DEFAULT); + set_bit(0, empty_thread_map.map.bitmap); return __perf_evsel__open(evsel, cpus, &empty_thread_map.map); } diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c index 9b5f856..7966f3f 100644 --- a/tools/perf/util/thread_map.c +++ b/tools/perf/util/thread_map.c @@ -9,6 +9,8 @@ #include "strlist.h" #include #include "thread_map.h" +#include +#include "debug.h" /* Skip "." and ".." 
directories */ static int filter(const struct dirent *dir) @@ -21,7 +23,7 @@ static int filter(const struct dirent *dir) struct thread_map *thread_map__new_by_pid(pid_t pid) { - struct thread_map *threads; + struct thread_map *threads = NULL; char name[256]; int items; struct dirent **namelist = NULL; @@ -32,11 +34,12 @@ struct thread_map *thread_map__new_by_pid(pid_t pid) if (items <= 0) return NULL; - threads = malloc(sizeof(*threads) + sizeof(pid_t) * items); - if (threads != NULL) { - for (i = 0; i < items; i++) - threads->map[i] = atoi(namelist[i]->d_name); - threads->nr = items; + for (i = 0; i < items; i++) { + bool re_alloc; + + if (thread_map__update(&threads, atoi(namelist[i]->d_name), + &re_alloc) < 0) + return NULL; } for (i=0; imap[0] = tid; - threads->nr = 1; - } + if (thread_map__update(&threads, tid, &re_alloc) < 0) + return NULL; return threads; } @@ -61,23 +63,17 @@ struct thread_map *thread_map__new_by_tid(pid_t tid) struct thread_map *thread_map__new_by_uid(uid_t uid) { DIR *proc; - int max_threads = 32, items, i; + int items, i; char path[256]; struct dirent dirent, *next, **namelist = NULL; - struct thread_map *threads = malloc(sizeof(*threads) + - max_threads * sizeof(pid_t)); - if (threads == NULL) - goto out; + struct thread_map *threads = NULL; proc = opendir("/proc"); if (proc == NULL) - goto out_free_threads; - - threads->nr = 0; + goto out; while (!readdir_r(proc, &dirent, &
[PATCH 5/5] perf top: Add the function to make 'perf top -p $pid' aware of newly forked threads.
This patch implements a fork function and an exit function in perf_top->tool to respond to PERF_RECORD_FORK & PERF_RECORD_EXIT events. In the fork function (perf_top__process_event_fork), the information of the new thread is added to thread_map. The fd and mmap of the new thread are also created in this function. In the exit function (perf_top__process_event_exit), the information of the exited thread is removed from thread_map. The fd and mmap of this thread are also closed in this function. With this patch, 'perf top -p $pid' is aware of threads' fork and exit on the fly. The sample events of the newly forked threads are received by 'perf top', and the symbols of the newly forked threads are displayed on the UI.

Cc: David Ahern
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-top.c | 135 ++
 1 file changed, 135 insertions(+)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index b3650e3..e7978ce 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -844,6 +844,17 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx) if (event->header.type == PERF_RECORD_SAMPLE) ++top->samples; + if (cpu_map__all(top->evlist->cpus) && + event->header.type == PERF_RECORD_FORK) + (&top->tool)->fork(&top->tool, event, &sample, machine); + + if (cpu_map__all(top->evlist->cpus) && + event->header.type == PERF_RECORD_EXIT) { + int close_nr; + + close_nr = (&top->tool)->exit(&top->tool, event, + &sample, machine); + if (close_nr == idx) + return; + } + switch (origin) { case PERF_RECORD_MISC_USER: ++top->us_samples; @@ -896,6 +907,26 @@ static void perf_top__mmap_read(struct perf_top *top) perf_top__mmap_read_idx(top, i); } +static void perf_top__append_thread(struct perf_top *top, int append_nr, +bool need_realloc) +{ + struct perf_evsel *counter; + 
struct perf_evlist *evlist = top->evlist; + int err; + + list_for_each_entry(counter, &evlist->entries, node) { + err = perf_evsel__append_open(counter, top->evlist->cpus, + top->evlist->threads, + append_nr, need_realloc); + + if (err == ESRCH) { + top->evlist->threads->map[append_nr] = -1; + clear_bit(append_nr, top->evlist->threads->bitmap); + return; + } else if (err < 0) + ui__error("append open error: %d\n", errno); + } +} + static void perf_top__start_counters(struct perf_top *top) { struct perf_evsel *counter; @@ -1174,12 +1205,116 @@ setup: return 0; } +static int perf_top__process_event_fork(struct perf_tool *tool __maybe_unused, +union perf_event *event __maybe_unused, +struct perf_sample *sample __maybe_unused, +struct machine *machine __maybe_unused) +{ + struct perf_top *top = container_of(tool, struct perf_top, tool); + pid_t tid = event->fork.tid; + pid_t ptid = event->fork.ptid; + int append_nr = -1; + int thread; + + /* +* There are 2 same fork events are received while a thread was forked. +* This may be a kernel bug. +*/ + for_each_set_bit(thread, top->evlist->threads->bitmap, PID_MAX_DEFAULT) + if (tid == top->evlist->threads->map[thread]) + return -1; + + for_each_set_bit(thread, top->evlist->threads->bitmap, PID_MAX_DEFAULT) { + /* +* If new thread's parent is not target task, just ignore it. +*/ + if (ptid == top->evlist->threads->map[thread]) { + bool realloc_need; + + append_nr = thread_map__update(&(top->evlist->threads), + tid, &realloc_need); + /* +* Open counters for new thread. +*/ +
[PATCH 0/5] perf top: Add the function that makes 'perf top -p $pid' aware of new threads.
This patch set makes 'perf top -p $pid' aware of dynamically forked threads. The perf top{record} tools are not aware of the new threads forked by the target threads when we use the 'perf top{record} -p $pid' mode.

Some critical structures, such as thread_map, mmap, fd, pollfd and id, are stored in fixed-size arrays allocated at the initialization phase. These structures cannot easily be extended for the new threads. Also, for performance reasons, the event inherit mechanism is not allowed in the '-p $pid' mode. So these structures should be changed to a flexible form with a low performance penalty (especially in CPU utilization).

A bitmap is a simple choice. A larger thread_map->map[] can be allocated at the initialization phase, for example with 32 entries. When the number of new threads exceeds 32, the size of this array can be doubled by realloc. The bitmap records which positions in map[] are occupied by a thread, and which positions can be used by the next new thread. I inserted a bitmap field in thread_map, and modified other related code in thread_map.c, xyarray.c, evlist.c, evsel.c, etc.

The fork and exit events (PERF_RECORD_FORK & PERF_RECORD_EXIT) can be caught while we read events from the existing mmaps. Then we can allocate resources, open an fd, record the event id, and create an mmap for the newly forked threads without excessive cost. We can also easily release the related resources of the exited threads.

This function has already been implemented for 'perf top -p $pid' in patch [5/5] of this patch set. As a next step, 'perf record -p $pid' should be changed to use the interfaces in evlist & evsel modified by this patch set, just like 'perf top'.
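The bitmap scheme described above can be sketched as follows. This is only an illustration under stated assumptions: struct slot_map and the slot_map__* functions are hypothetical names, not the thread_map interfaces from this patch set, and the sketch scans for a free bit with a plain loop instead of find_first_zero_bit().

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define SLOT_INIT_NR	32	/* initial capacity, as in the cover letter */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define WORDS(nbits)	(((nbits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for a thread_map with an occupancy bitmap. */
struct slot_map {
	int max_nr;		/* capacity of map[] */
	int nr;			/* number of occupied slots */
	unsigned long *bitmap;	/* bit i set: map[i] is in use */
	pid_t *map;
};

int slot_map__init(struct slot_map *m)
{
	m->max_nr = SLOT_INIT_NR;
	m->nr = 0;
	m->bitmap = calloc(WORDS(m->max_nr), sizeof(unsigned long));
	m->map = malloc(m->max_nr * sizeof(pid_t));
	if (m->bitmap == NULL || m->map == NULL) {
		free(m->bitmap);
		free(m->map);
		return -1;
	}
	return 0;
}

/* Double the capacity of map[] and extend the bitmap with zero bits. */
static int slot_map__grow(struct slot_map *m)
{
	int new_max = m->max_nr * 2;
	size_t old_words = WORDS(m->max_nr), new_words = WORDS(new_max);
	unsigned long *bm = realloc(m->bitmap, new_words * sizeof(*bm));
	pid_t *map;

	if (bm == NULL)
		return -1;
	m->bitmap = bm;
	memset(bm + old_words, 0, (new_words - old_words) * sizeof(*bm));

	map = realloc(m->map, new_max * sizeof(*map));
	if (map == NULL)
		return -1;
	m->map = map;
	m->max_nr = new_max;
	return 0;
}

/* Claim the lowest free slot for pid; returns the slot index or -1. */
int slot_map__add(struct slot_map *m, pid_t pid)
{
	int i;

	if (m->nr == m->max_nr && slot_map__grow(m) < 0)
		return -1;
	for (i = 0; i < m->max_nr; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_WORD);
		unsigned long *w = &m->bitmap[i / BITS_PER_WORD];

		if (!(*w & mask)) {
			*w |= mask;
			m->map[i] = pid;
			m->nr++;
			return i;
		}
	}
	return -1;
}

/* Release a slot so that a later fork can reuse it. */
void slot_map__del(struct slot_map *m, int slot)
{
	unsigned long mask;

	if (slot < 0 || slot >= m->max_nr)
		return;
	mask = 1UL << (slot % BITS_PER_WORD);
	if (!(m->bitmap[slot / BITS_PER_WORD] & mask))
		return;		/* slot was not in use */
	m->bitmap[slot / BITS_PER_WORD] &= ~mask;
	m->map[slot] = -1;
	m->nr--;
}
```

A fork handler would call slot_map__add() with the new tid and an exit handler would call slot_map__del() with the slot of the exited thread; slot indexes of the surviving threads never change, which is what lets per-thread fd and mmap tables stay indexed by slot.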
Cc: David Ahern
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Andrew Morton
Signed-off-by: Chenggang Qin

chenggang (5):
  perf tools: Add some functions to bitops.h to support more bitmap operations.
  perf tools: Add xyarray__realloc function in xyarray.c to expand xyarray.
  perf tools: Add bitmap field to thread_map to support awareness of new threads.
  perf tools: Change some interfaces of evlist & evsel to support thread creation and destruction with thread_map's bitmap.
  perf top: Add the function to make 'perf top -p $pid' aware of newly forked threads.

 tools/perf/builtin-record.c | 25 ++-
 tools/perf/builtin-stat.c | 7 +-
 tools/perf/builtin-top.c | 149 +-
 tools/perf/tests/mmap-basic.c | 4 +-
 tools/perf/tests/open-syscall-all-cpus.c | 2 +-
 tools/perf/tests/open-syscall-tp-fields.c | 3 +-
 tools/perf/tests/open-syscall.c | 3 +-
 tools/perf/tests/perf-record.c | 2 +-
 tools/perf/util/evlist.c | 236 +++--
 tools/perf/util/evlist.h | 39 +++--
 tools/perf/util/evsel.c | 164 +---
 tools/perf/util/evsel.h | 38 +++--
 tools/perf/util/include/linux/bitops.h | 85 +--
 tools/perf/util/python.c | 3 +-
 tools/perf/util/thread_map.c | 171 +++--
 tools/perf/util/thread_map.h | 8 +
 tools/perf/util/xyarray.c | 26
 tools/perf/util/xyarray.h | 2 +
 18 files changed, 755 insertions(+), 212 deletions(-)
--
1.7.9.5
[PATCH 1/5] perf tools: Add some functions to bitops.h to support more bitmap operations.
Add bitmap_copy() & find_first_zero_bit() to 'util/include/linux/bitops.h'. These functions are needed if we want to manage thread_map, or any other mechanism, with a bitmap.

Cc: David Ahern
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/include/linux/bitops.h | 85 ++--
 1 file changed, 69 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/include/linux/bitops.h b/tools/perf/util/include/linux/bitops.h index a55d8cf..644504a 100644 --- a/tools/perf/util/include/linux/bitops.h +++ b/tools/perf/util/include/linux/bitops.h @@ -4,6 +4,12 @@ #include #include #include +#include + +typedef unsigned long BITMAP; + +#define PID_MAX_DEFAULT 0x8000 +#define CPU_MAX_DEFAULT 0x40 #ifndef __WORDSIZE #define __WORDSIZE (__SIZEOF_LONG__ * 8) @@ -26,36 +32,57 @@ (bit) < (size);\ (bit) = find_next_bit((addr), (size), (bit) + 1)) -static inline void set_bit(int nr, unsigned long *addr) +static inline void set_bit(int nr, BITMAP *addr) { addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG); } -static inline void clear_bit(int nr, unsigned long *addr) +static inline void clear_bit(int nr, BITMAP *addr) { addr[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG)); } -static __always_inline int test_bit(unsigned int nr, const unsigned long *addr) +static __always_inline int test_bit(unsigned int nr, const BITMAP *addr) { return ((1UL << (nr % BITS_PER_LONG)) & - (((unsigned long *)addr)[nr / BITS_PER_LONG])) != 0; + (((BITMAP *)addr)[nr / BITS_PER_LONG])) != 0; } -static inline unsigned long hweight_long(unsigned long w) +static inline BITMAP hweight_long(BITMAP w) { return sizeof(w) == 4 ? 
hweight32(w) : hweight64(w); } +static inline void bitmap_copy(BITMAP *dst, const BITMAP *src, + int nbits) +{ + int len = BITS_TO_LONGS(nbits) * sizeof(BITMAP); + memcpy(dst, src, len); +} + #define BITOP_WORD(nr) ((nr) / BITS_PER_LONG) +/* + * ffz - find first zero bit in word + * @word: The word to search + * + * Undefined if no zero exists, so code should check against ~0UL first. + */ +static __always_inline BITMAP ffz(BITMAP word) +{ + asm("rep; bsf %1,%0" + : "=r" (word) + : "r" (~word)); + return word; +} + /** * __ffs - find first bit in word. * @word: The word to search * * Undefined if no bit exists, so code should check against 0 first. */ -static __always_inline unsigned long __ffs(unsigned long word) +static __always_inline BITMAP __ffs(BITMAP word) { int num = 0; @@ -87,14 +114,40 @@ static __always_inline unsigned long __ffs(unsigned long word) } /* + * Find the first cleared bit in a memory region. + */ +static inline BITMAP +find_first_zero_bit(const BITMAP *addr, BITMAP size) +{ + const BITMAP *p = addr; + BITMAP result = 0; + BITMAP tmp; + + while (size & ~(BITS_PER_LONG-1)) { + if (~(tmp = *(p++))) + goto found; + result += BITS_PER_LONG; + size -= BITS_PER_LONG; + } + if (!size) + return result; + + tmp = (*p) | (~0UL << size); + if (tmp == ~0UL)/* Are any bits zero? */ + return result + size; /* Nope. */ +found: + return result + __ffs(~tmp); +} + +/* * Find the first set bit in a memory region. */ -static inline unsigned long -find_first_bit(const unsigned long *addr, unsigned long size) +static inline BITMAP +find_first_bit(const BITMAP *addr, BITMAP size) { - const unsigned long *p = addr; - unsigned long result = 0; - unsigned long tmp; + const BITMAP *p = addr; + BITMAP result = 0; + BITMAP tmp; while (size & ~(BITS_PER_LONG-1)) { if ((tmp = *(p++))) @@ -115,12 +168,12 @@ found: /* * Find the next set bit in a memory region. 
*/ -static inline unsigned long -find_next_bit(const unsigned long *addr, unsigned long size, unsigned long offset) +static inline BITMAP +find_next_bit(const BITMAP *addr, BITMAP size, BITMAP offset) { - const unsigned long *p = addr + BITOP_WORD(offset); - unsigned long result = offset & ~(BITS_PER_LONG-1); - unsigned long tmp; + const BITMAP *p = addr + BITOP_WORD(offset); + BITMAP result = offset & ~(BITS_PER_LONG-1); + BITMAP tmp; if (offset >= size) return size; -- 1.7.9.5
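For readers who want to see what find_first_zero_bit() computes without the x86 inline assembly used by the patch's ffz(), here is a minimal portable sketch. The function name first_zero_bit and the WORD_BITS macro are made up for this example; it is not the code from the patch and it makes no attempt to be as fast.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first zero bit in the first `size` bits of
 * addr[], or `size` if every one of those bits is set.
 */
unsigned long first_zero_bit(const unsigned long *addr, unsigned long size)
{
	unsigned long base, bit;

	for (base = 0; base < size; base += WORD_BITS) {
		unsigned long word = addr[base / WORD_BITS];

		if (word == ~0UL)
			continue;	/* every bit in this word is set */

		/* Scan this word, but never look past bit `size - 1`. */
		for (bit = 0; bit < WORD_BITS && base + bit < size; bit++)
			if (!(word & (1UL << bit)))
				return base + bit;
	}
	return size;
}
```

Returning `size` when no zero bit exists mirrors the convention of the find_first_bit() family in the diff above, so a caller can test the result against the bitmap size to detect a full map.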
[PATCH 2/5] perf tools: Add xyarray__realloc function in xyarray.c to expand xyarray.
xyarray__realloc() can be used to extend evsel->fd, evsel->sample_id or any other xyarray on the fly.

Cc: David Ahern
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/xyarray.c | 26 ++
 tools/perf/util/xyarray.h | 2 ++
 2 files changed, 28 insertions(+)

diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c index 22afbf6..4e76377 100644 --- a/tools/perf/util/xyarray.c +++ b/tools/perf/util/xyarray.c @@ -18,3 +18,29 @@ void xyarray__delete(struct xyarray *xy) { free(xy); } + +int xyarray__realloc(struct xyarray **xy_old, int xlen_old, int xlen_new, + int ylen_new) +{ + size_t row_size_new = ylen_new * (*xy_old)->entry_size; + struct xyarray *xy_new = zalloc(sizeof(*xy_new) + xlen_new + * row_size_new); + int x; + + if (xy_new != NULL) { + for (x = 0; x < xlen_old; x++) + memcpy(&xy_new->contents[x * row_size_new], + &((*xy_old)->contents[x * (*xy_old)->row_size]), + (*xy_old)->row_size); + + xy_new->row_size = row_size_new; + xy_new->entry_size = (*xy_old)->entry_size; + + xyarray__delete(*xy_old); + + *xy_old = xy_new; + + return 0; + } + + return -1; +} + diff --git a/tools/perf/util/xyarray.h b/tools/perf/util/xyarray.h index c488a07..ad41649 100644 --- a/tools/perf/util/xyarray.h +++ b/tools/perf/util/xyarray.h @@ -11,6 +11,8 @@ struct xyarray { struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size); void xyarray__delete(struct xyarray *xy); +int xyarray__realloc(struct xyarray **xy_old, int xlen_old, int xlen_new, +int ylen_new); static inline void *xyarray__entry(struct xyarray *xy, int x, int y) { -- 1.7.9.5
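The copy-into-a-larger-allocation idea behind xyarray__realloc() can be sketched standalone as below. struct grid and the grid__* functions are hypothetical stand-ins for struct xyarray and its helpers, calloc() replaces perf's zalloc(), and the sketch only supports growing; treat it as an illustration, not as the patch's implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical 2-dimensional array with a flexible payload. */
struct grid {
	size_t row_size;	/* bytes per row: ylen * entry_size */
	size_t entry_size;	/* bytes per entry */
	char contents[];	/* xlen rows stored back to back */
};

struct grid *grid__new(int xlen, int ylen, size_t entry_size)
{
	size_t row_size = ylen * entry_size;
	struct grid *g = calloc(1, sizeof(*g) + xlen * row_size);

	if (g != NULL) {
		g->row_size = row_size;
		g->entry_size = entry_size;
	}
	return g;
}

void *grid__entry(struct grid *g, int x, int y)
{
	return &g->contents[x * g->row_size + y * g->entry_size];
}

/*
 * Grow *gp to xlen_new rows of ylen_new entries. Each old row is copied
 * to the start of the matching new row; the new tail stays zeroed.
 * Returns 0 on success, -1 on failure (the old grid is left untouched).
 */
int grid__grow(struct grid **gp, int xlen_old, int xlen_new, int ylen_new)
{
	struct grid *old = *gp;
	size_t row_new = ylen_new * old->entry_size;
	struct grid *g;
	int x;

	if (xlen_new < xlen_old || row_new < old->row_size)
		return -1;	/* this sketch never shrinks */

	g = calloc(1, sizeof(*g) + xlen_new * row_new);
	if (g == NULL)
		return -1;

	g->row_size = row_new;
	g->entry_size = old->entry_size;
	for (x = 0; x < xlen_old; x++)
		memcpy(&g->contents[x * row_new],
		       &old->contents[x * old->row_size],
		       old->row_size);

	free(old);
	*gp = g;
	return 0;
}
```

Because the rows are stored back to back, widening the y-dimension changes the offset of every row after the first, which is why a plain realloc() of the buffer is not enough and each row has to be copied to its new position.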
[PATCH 4/5] perf tools: Change some interfaces of evlist & evsel to support thread creation and destruction with thread_map's bitmap.
Based on [PATCH 3/5], this patch changes the related interfaces in evlist & evsel to support operations on thread_map's bitmap. We can then use these interfaces to insert a newly forked thread into, or remove an exited thread from, thread_map and the other related data structures.

Cc: David Ahern
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-record.c | 25 ++-
 tools/perf/builtin-stat.c | 7 +-
 tools/perf/builtin-top.c | 14 +-
 tools/perf/tests/mmap-basic.c | 4 +-
 tools/perf/tests/open-syscall-all-cpus.c | 2 +-
 tools/perf/tests/open-syscall-tp-fields.c | 3 +-
 tools/perf/tests/open-syscall.c | 3 +-
 tools/perf/tests/perf-record.c | 2 +-
 tools/perf/util/evlist.c | 236 +++--
 tools/perf/util/evlist.h | 39 +++--
 tools/perf/util/evsel.c | 147 +++---
 tools/perf/util/evsel.h | 38 +++--
 tools/perf/util/python.c | 3 +-
 13 files changed, 408 insertions(+), 115 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index f3151d3..277303f 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -359,7 +359,7 @@ try_again: goto out; } - if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) { + if (perf_evlist__mmap(evlist, opts->mmap_pages, false, -1, false) < 0) { if (errno == EPERM) { pr_err("Permission error mapping pages.\n" "Consider increasing " @@ -472,12 +472,21 @@ static int perf_record__mmap_read_all(struct perf_record *rec) int i; int rc = 0; - for (i = 0; i < rec->evlist->nr_mmaps; i++) { - if (rec->evlist->mmap[i].base) { - if (perf_record__mmap_read(rec, &rec->evlist->mmap[i]) != 0) { - rc = -1; - goto out; - } + if (cpu_map__all(rec->evlist->cpus)) { + for_each_set_bit(i, rec->evlist->threads->bitmap, +PID_MAX_DEFAULT) { + if (rec->evlist->mmap[i].base) + if (perf_record__mmap_read(rec, + &rec->evlist->mmap[i]) != 0){ + rc = 
-1; + goto out; + } + } + } else { + for (i = 0; i < rec->evlist->nr_mmaps; i++) { + if (rec->evlist->mmap[i].base) + if (perf_record__mmap_read(rec, + &rec->evlist->mmap[i]) != 0) { + rc = -1; + goto out; + } } } @@ -1161,7 +1170,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused) err = -EINVAL; goto out_free_fd; } - + err = __cmd_record(&record, argc, argv); out_free_fd: perf_evlist__delete_maps(evsel_list); diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index c247fac..74d5311 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -229,7 +229,7 @@ static int read_counter_aggr(struct perf_evsel *counter) int i; if (__perf_evsel__read(counter, perf_evsel__nr_cpus(counter), - evsel_list->threads->nr, scale) < 0) + evsel_list->threads->bitmap, scale) < 0) return -1; for (i = 0; i < 3; i++) @@ -394,13 +394,14 @@ static int __run_perf_stat(int argc __maybe_unused, const char **argv) if (no_aggr) { list_for_each_entry(counter, &evsel_list->entries, node) { read_counter(counter); - perf_evsel__close_fd(counter, perf_evsel__nr_cpus(counter), 1); + perf_evsel__close_fd(counter, +perf_evsel__nr_cpus(counter), +evsel_list->threads->bitmap); } } else { list_for_each_entry(counter, &evsel_list->entries, node) { read_counter_aggr(counter); perf_evsel__close_fd(counter, perf_e
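The idea behind this patch, tracking live threads in a bitmap indexed by PID and visiting only the set bits, can be sketched in a few lines. The sketch is Python rather than the C used by perf, and the `ThreadBitmap` class with its method names is hypothetical; only the `for_each_set_bit`-style iteration and the `PID_MAX_DEFAULT` bound come from the diff, and the value 32768 used here is an assumption.

```python
PID_MAX_DEFAULT = 32768  # assumed bound; the diff only names the constant


class ThreadBitmap:
    """Hypothetical stand-in for thread_map's bitmap, one bit per PID."""

    def __init__(self, nbits=PID_MAX_DEFAULT):
        self.nbits = nbits
        self.bits = 0  # a Python int used as an arbitrary-width bitmap

    def _check(self, pid):
        if not 0 <= pid < self.nbits:
            raise ValueError("pid out of range: %d" % pid)

    def insert(self, pid):
        """Mark a newly forked thread as present."""
        self._check(pid)
        self.bits |= 1 << pid

    def remove(self, pid):
        """Clear the bit of an exited thread."""
        self._check(pid)
        self.bits &= ~(1 << pid)

    def for_each_set_bit(self):
        """Yield set bit indices in ascending order, skipping clear bits."""
        remaining = self.bits
        while remaining:
            lowest = remaining & -remaining  # isolate the lowest set bit
            yield lowest.bit_length() - 1
            remaining ^= lowest              # clear it and continue


threads = ThreadBitmap()
for pid in (4242, 17, 300):
    threads.insert(pid)      # three threads were forked
threads.remove(300)          # thread 300 exited
live = list(threads.for_each_set_bit())
print(live)
```

Iterating over set bits keeps the per-thread loop proportional to the number of live threads instead of the size of the PID space, which is presumably why the diff switches to `for_each_set_bit()` in the per-thread case.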
[PATCH v3] Add 4 tracepoint events for vfs
From: chenggang@gmail.com

This version changes some type definitions according to Steven's advice. Thanks, Steven.

If engineers want to analyze the file access behavior of applications without source code, perf tools combined with appropriate tracepoint events in the VFS subsystem are an excellent choice.

System engineers and developers of server software need to know which files are accessed by the target processes within a period of time. Then they can find the hot applications and the hot files. For this requirement, we added 2 tracepoint events at the beginning of generic_file_aio_read() and generic_file_aio_write().

Many database systems use their own page cache subsystems and use direct IO to access the disks. Sometimes, system engineers want to know the miss rate of the database system's page cache. This requirement can be satisfied by recording the database's file accesses made through direct IO. So, we added 2 tracepoint events at the direct IO branch in generic_file_aio_read() and generic_file_aio_write().

Then, we will extend perf's functionality with a python script that uses these new tracepoint events.

The 4 new tracepoint events are:

1) generic_file_aio_read
Format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;
	field:int common_padding;	offset:8;	size:4;	signed:1;
	field:long long pos;	offset:16;	size:8;	signed:1;
	field:unsigned long bytes;	offset:24;	size:8;	signed:0;
	field:__data_loc char[] fname;	offset:32;	size:4;	signed:1;

2) generic_file_aio_write
Format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;
	field:int common_padding;	offset:8;	size:4;	signed:1;
	field:long long pos;	offset:16;	size:8;	signed:1;
	field:unsigned long bytes;	offset:24;	size:8;	signed:0;
	field:__data_loc char[] fname;	offset:32;	size:4;	signed:1;

3) direct_io_read
Format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;
	field:int common_padding;	offset:8;	size:4;	signed:1;
	field:long long pos;	offset:16;	size:8;	signed:1;
	field:unsigned long bytes;	offset:24;	size:8;	signed:0;
	field:unsigned char fname[100];	offset:32;	size:100;	signed:0;

4) direct_io_write
Format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;
	field:int common_padding;	offset:8;	size:4;	signed:1;
	field:long long pos;	offset:16;	size:8;	signed:1;
	field:unsigned long bytes;	offset:24;	size:8;	signed:0;
	field:unsigned char fname[100];	offset:32;	size:100;	signed:0;

Cc: Steven Rostedt
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 include/trace/events/vfs.h | 62
 mm/filemap.c               | 18 +
 2 files changed, 80 insertions(+)
 create mode 100644 include/trace/events/vfs.h

diff --git a/include/trace/events/vfs.h b/include/trace/events/vfs.h
new file mode 100644
index 000..11c9acc
--- /dev/null
+++ b/include/trace/events/vfs.h
@@ -0,0 +1,62 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM vfs
+#define TRACE_INCLUDE_FILE vfs
+
+#if !defined(_TRACE_EVENTS_VFS_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_EVENTS_VFS_H
+
+#include
+
+#include
+
+DECLARE_EVENT_CLASS(vfs_filerw_template,
+
+	TP_PROTO(long long pos, unsigned long bytes, const unsigned char *fname),
+
+	TP_ARGS(pos, bytes, fname),
+
+TP_STRUCT__entry(
+	__field(long long, pos )
+
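The field offsets and sizes in the direct_io_read listing fully determine the binary layout of one record, so a raw record could in principle be decoded by hand. The following is a minimal sketch of that, assuming a little-endian 64-bit machine (the listing shows `unsigned long` as 8 bytes). The helper name `decode_direct_io_read` and the sample file name are hypothetical, and the sample record is synthetic, not captured data.

```python
import struct

# Layout copied from the direct_io_read format listing:
#   common_type           u16       offset 0
#   common_flags          u8        offset 2
#   common_preempt_count  u8        offset 3
#   common_pid            s32       offset 4
#   common_padding        s32       offset 8
#   (4 unused bytes, since pos starts at offset 16)
#   pos                   s64       offset 16
#   bytes                 u64       offset 24
#   fname                 100 bytes offset 32
# "<" selects little-endian with no implicit alignment (an assumption).
DIRECT_IO_READ = struct.Struct("<HBBii4xqQ100s")


def decode_direct_io_read(raw):
    """Unpack one raw record into a dict of the event-specific fields."""
    (_type, _flags, _preempt, pid, _padding,
     pos, nbytes, fname) = DIRECT_IO_READ.unpack(raw)
    return {
        "common_pid": pid,
        "pos": pos,
        "bytes": nbytes,
        # fname is a fixed 100-byte array; keep the part before the first NUL
        "fname": fname.split(b"\0", 1)[0].decode("utf-8", "replace"),
    }


# Synthetic record: pid 1234 reads 512 bytes at offset 4096 of "ibdata1".
sample = DIRECT_IO_READ.pack(1, 0, 0, 1234, 0, 4096, 512, b"ibdata1")
record = decode_direct_io_read(sample)
print(DIRECT_IO_READ.size, record)
```

In practice perf does this decoding itself and passes the fields to the script handlers as arguments; the sketch only makes the listed layout concrete. Note the difference between events 1/2 and 3/4: a `__data_loc char[]` field stores a 4-byte locator for a variable-length string, while `fname[100]` embeds a fixed 100-byte array in every record.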
[PATCH] perf script: Add a python script to statistic direct io behavior
From: chenggang@gmail.com

This patch depends on a previous patch: https://lkml.org/lkml/2013/1/29/47

If engineers want to analyze the direct IO behavior of applications without source code, perf tools combined with appropriate tracepoint events in the VFS subsystem are an excellent choice.

Many database systems use their own page cache subsystems and use direct IO to access the disks. Sometimes, system engineers need to know the miss rate of the database system's page cache. This requirement can be satisfied by recording the database's file accesses made through direct IO.

So, we use 2 tracepoint events to record system-wide direct IO behavior. The 2 tracepoint events are:
1) vfs:direct_io_read
2) vfs:direct_io_write
They were introduced by the patch: https://lkml.org/lkml/2013/1/29/47

The script direct-io.py introduced by this patch records the 2 tracepoint events, analyses the sample data, and gives a concise report.

usage:
	perf script record direct-io
	perf script report direct-io [comm|pid]

Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: David Ahern
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/scripts/python/bin/direct-io-record |   2 +
 tools/perf/scripts/python/bin/direct-io-report |  21 +++
 tools/perf/scripts/python/direct-io.py         | 185
 3 files changed, 208 insertions(+)
 create mode 100755 tools/perf/scripts/python/bin/direct-io-record
 create mode 100644 tools/perf/scripts/python/bin/direct-io-report
 create mode 100644 tools/perf/scripts/python/direct-io.py

diff --git a/tools/perf/scripts/python/bin/direct-io-record b/tools/perf/scripts/python/bin/direct-io-record
new file mode 100755
index 000..4857097
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-record
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf record -e vfs:direct_io_read -e vfs:direct_io_write $@
diff --git a/tools/perf/scripts/python/bin/direct-io-report b/tools/perf/scripts/python/bin/direct-io-report
new file mode 100644
index 000..828d9c6
--- /dev/null
+++ b/tools/perf/scripts/python/bin/direct-io-report
@@ -0,0 +1,21 @@
+#!/bin/bash
+# description: direct_io statistic
+# args: [comm|pid]
+n_args=0
+for i in "$@"
+do
+if expr match "$i" "-" > /dev/null ; then
+	break
+fi
+n_args=$(( $n_args + 1 ))
+done
+if [ "$n_args" -gt 1 ] ; then
+echo "usage: perf script report direct-io [comm|pid]"
+exit
+fi
+
+if [ "$n_args" -gt 0 ] ; then
+comm=$1
+shift
+fi
+perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/direct-io.py $comm
diff --git a/tools/perf/scripts/python/direct-io.py b/tools/perf/scripts/python/direct-io.py
new file mode 100644
index 000..321ff8e
--- /dev/null
+++ b/tools/perf/scripts/python/direct-io.py
@@ -0,0 +1,185 @@
+# direct IO counts
+# (c) 2013, Chenggang Qin
+# Licensed under the terms of the GNU GPL License version 2
+
+# Displays system-wide file direct IO behavior.
+# It helps us to investigate which processes trigger a direct IO,
+# and what files are accessed by these processes.
+#
+# options
+# comm, pid: show details of the file r/w behavior of a special process.
+
+import os, sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+	'/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+usage = "perf script record direct-io\n" \
+	"perf script report direct-io [comm|pid]\n"
+
+for_comm = None
+for_pid = None
+pid_2_comm = None
+
+if len(sys.argv) > 2:
+	sys.exit(usage)
+
+if len(sys.argv) > 1:
+	try:
+		for_pid = int(sys.argv[1])
+	except:
+		for_comm = sys.argv[1]
+
+file_write = autodict()
+file_read = autodict()
+
+file_write_bytes = autodict()
+file_read_bytes = autodict()
+
+comm_read_info = autodict()
+comm_write_info = autodict()
+
+wevent_count = 0
+revent_count = 0
+
+comm_revent_count = 0;
+comm_wevent_count = 0;
+
+def trace_begin():
+	print "Press control+C to stop and show the summary"
+
+def trace_end():
+	if (for_comm is not None) or (for_pid is not None):
+		print_direct_io_event_for_comm()
+	else:
+		print_direct_io_event_totals()
+
+def vfs__direct_io_write(event_name, context, common_cpu,
+	common_secs, common_nsecs, common_pid, common_comm,
+	pos, bytes, fname):
+	global wevent_count
+	global comm_wevent_count
+	global pid_2_comm
+
+	if (for_comm is not None) or (for_pid is not None):
+		if (common_comm != for_comm) and (common_pid != for_pid):
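The archived script is cut off before its accounting logic, so the following is only a sketch of the kind of per-file aggregation the commit message describes (count events and bytes per file, optionally filtered by comm or pid). It is standalone Python 3, whereas the script in the patch targets the Python 2 interpreter embedded in perf. The function `summarize`, the tuple layout of `events`, and all sample data are hypothetical.

```python
from collections import defaultdict


def summarize(events, for_comm=None, for_pid=None):
    """Aggregate (comm, pid, rw, nbytes, fname) tuples per (rw, file).

    At most one of for_comm / for_pid is expected to be set, mirroring
    the single [comm|pid] argument of the report script.
    """
    counts = defaultdict(int)  # (rw, fname) -> number of events
    volume = defaultdict(int)  # (rw, fname) -> bytes transferred
    for comm, pid, rw, nbytes, fname in events:
        if for_comm is not None and comm != for_comm:
            continue
        if for_pid is not None and pid != for_pid:
            continue
        counts[(rw, fname)] += 1
        volume[(rw, fname)] += nbytes
    return dict(counts), dict(volume)


# Synthetic events standing in for vfs:direct_io_read / vfs:direct_io_write.
events = [
    ("mysqld", 901, "read", 16384, "ibdata1"),
    ("mysqld", 901, "read", 16384, "ibdata1"),
    ("mysqld", 901, "write", 512, "ib_logfile0"),
    ("dd", 1200, "write", 4096, "out.img"),
]

counts, volume = summarize(events, for_comm="mysqld")
for (rw, fname), n in sorted(counts.items()):
    print("%-5s %-12s %3d events %8d bytes" % (rw, fname, n, volume[(rw, fname)]))
```

In a real perf script the same bookkeeping would live in the per-event handlers (for example `vfs__direct_io_write`) and the report would be printed from `trace_end()`.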
linux-kernel@vger.kernel.org
From: chenggang

Yesterday, I implemented these tracepoint events in the VFS subsystem. That was not a good idea. Now, I have modified two existing tracepoint events in the ext4 subsystem to implement the same function.

Many database systems use their own page cache subsystems and use direct IO to access the disks. Sometimes, system engineers want to know the miss rate of the database system's page cache. They also need to know which files are accessed by the target processes through direct IO. These requirements can be satisfied by recording the database's file accesses made through direct IO.

So, we add 'file name' as a parameter of the tracepoint events ext4:ext4_direct_IO_enter & ext4:ext4_direct_IO_exit. Then, we will extend perf's or blktrace's functionality to use these tracepoint events.

Cc: Steven Rostedt
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Signed-off-by: Chenggang Qin
---
 fs/ext4/inode.c             |  7 +--
 include/trace/events/ext4.h | 22 ++
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index cbfe13b..92a379f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3202,6 +3202,7 @@ static ssize_t ext4_direct_IO(int rw, struct kiocb *iocb,
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = file->f_mapping->host;
 	ssize_t ret;
+	const unsigned char *fname;
 
 	/*
 	 * If we are doing data journalling we don't support O_DIRECT
@@ -3213,13 +3214,15 @@ static ssize_t ext4_direct_IO(int rw, struct kiocb *iocb,
 	if (ext4_has_inline_data(inode))
 		return 0;
 
-	trace_ext4_direct_IO_enter(inode, offset, iov_length(iov, nr_segs), rw);
+	fname = file->f_path.dentry->d_name.name;
+	trace_ext4_direct_IO_enter(inode, offset, iov_length(iov, nr_segs), rw,
+				   fname);
 	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
 		ret = ext4_ext_direct_IO(rw, iocb, iov, offset, nr_segs);
 	else
 		ret = ext4_ind_direct_IO(rw, iocb, iov, offset, nr_segs);
 	trace_ext4_direct_IO_exit(inode, offset,
-				iov_length(iov, nr_segs), rw, ret);
+				iov_length(iov, nr_segs), rw, ret, fname);
 	return ret;
 }
 
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 7e8c36b..532bbb4 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -1211,9 +1211,10 @@ DEFINE_EVENT(ext4__bitmap_load, ext4_load_inode_bitmap,
 );
 
 TRACE_EVENT(ext4_direct_IO_enter,
-	TP_PROTO(struct inode *inode, loff_t offset, unsigned long len, int rw),
+	TP_PROTO(struct inode *inode, loff_t offset, unsigned long len, int rw,
+const unsigned char *fname),
 
-	TP_ARGS(inode, offset, len, rw),
+	TP_ARGS(inode, offset, len, rw, fname),
 
 	TP_STRUCT__entry(
 		__field(dev_t, dev )
@@ -1221,6 +1222,7 @@ TRACE_EVENT(ext4_direct_IO_enter,
 		__field(loff_t, pos )
 		__field(unsigned long, len )
 		__field(int,rw )
+		__string( fname, fname )
 	),
 
 	TP_fast_assign(
@@ -1229,19 +1231,20 @@ TRACE_EVENT(ext4_direct_IO_enter,
 		__entry->pos= offset;
 		__entry->len= len;
 		__entry->rw = rw;
+		__assign_str(fname, fname);
 	),
 
-	TP_printk("dev %d,%d ino %lu pos %lld len %lu rw %d",
+	TP_printk("dev %d,%d ino %lu pos %lld len %lu rw %d fname %s",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
-		  __entry->pos, __entry->len, __entry->rw)
+		  __entry->pos, __entry->len, __entry->rw, __get_str(fname))
 );
 
 TRACE_EVENT(ext4_direct_IO_exit,
 	TP_PROTO(struct inode *inode, loff_t offset, unsigned long len,
-int rw, int ret),
+int rw, int ret, const unsigned char *fname),
 
-	TP_ARGS(inode, offset, len, rw, ret),
+	TP_ARGS(inode, offset, len, rw, ret, fname),
 
 	TP_STRUCT__entry(
 		__field(dev_t, dev )
@@ -1250,6 +1253,7 @@ TRACE_EVENT(ext4_direct_IO_exit,
 		__field(unsigned long, len )
 		__field(int,rw )
 		__field(int,ret )
+		__string( fname, fname )
 	),
 
 	TP_fast_assign(
@@ -1259,13 +1263,15 @@ TRACE_EVENT(ext4_direct_IO_exit,
 		__entry->len= len;
[PATCH] perf core: Fix a bug that lead to mmap() & munmap() mismatch
From: Chenggang Qin

In filename__read_debuglink(), after elf_begin() has mmapped the dso file, execution may jump to "out_close". elf_end() is then skipped and the munmap() is never executed. When perf runs for a long time, the files that are never munmapped consume a large amount of memory. This patch fixes the bug.

Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Arnaldo Carvalho de Melo
Signed-off-by: Chenggang Qin
---
 tools/perf/util/symbol-elf.c | 10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4b12bf8..b4df870 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -471,27 +471,27 @@ int filename__read_debuglink(const char *filename, char *debuglink,
 
 	ek = elf_kind(elf);
 	if (ek != ELF_K_ELF)
-		goto out_close;
+		goto out_elf_end;
 
 	if (gelf_getehdr(elf, &ehdr) == NULL) {
 		pr_err("%s: cannot get elf header.\n", __func__);
-		goto out_close;
+		goto out_elf_end;
 	}
 
 	sec = elf_section_by_name(elf, &ehdr, &shdr,
 				  ".gnu_debuglink", NULL);
 	if (sec == NULL)
-		goto out_close;
+		goto out_elf_end;
 
 	data = elf_getdata(sec, NULL);
 	if (data == NULL)
-		goto out_close;
+		goto out_elf_end;
 
 	/* the start of this section is a zero-terminated string */
 	strncpy(debuglink, data->d_buf, size);
 
+out_elf_end:
 	elf_end(elf);
-
 out_close:
 	close(fd);
 out:
-- 
1.7.8.rc2.5.g815b

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/2] perf core: avoid traverse dsos list while find vdso
From: Chenggang Qin

There is only one vdso in a system. It is not necessary to traverse the machine->user_dsos list while finding the dso of the vdso. The flag vdso_found should be replaced by a pointer that points to the dso of the vdso. If the pointer is NULL, the dso of the vdso has not been created yet. Otherwise, the pointer can be returned directly from vdso__dso_findnew(). The list traversal is avoided by this method.

Thanks.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/vdso.c | 22 --
 1 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c
index 3915982..8022ef0 100644
--- a/tools/perf/util/vdso.c
+++ b/tools/perf/util/vdso.c
@@ -13,7 +13,7 @@
 #include "symbol.h"
 #include "linux/string.h"
 
-static bool vdso_found;
+static struct dso *vdso_dso = NULL;
 static char vdso_file[] = "/tmp/perf-vdso.so-XX";
 
 static int find_vdso_map(void **start, void **end)
@@ -55,9 +55,6 @@ static char *get_file(void)
 	size_t size;
 	int fd;
 
-	if (vdso_found)
-		return vdso_file;
-
 	if (find_vdso_map(&start, &end))
 		return NULL;
 
@@ -79,33 +76,30 @@ static char *get_file(void)
  out:
 	free(buf);
 
-	vdso_found = (vdso != NULL);
 	return vdso;
 }
 
 void vdso__exit(void)
 {
-	if (vdso_found)
+	if (vdso_dso)
 		unlink(vdso_file);
 }
 
 struct dso *vdso__dso_findnew(struct list_head *head)
 {
-	struct dso *dso = dsos__find(head, VDSO__MAP_NAME, true);
-
-	if (!dso) {
+	if (!vdso_dso) {
 		char *file;
 
 		file = get_file();
 		if (!file)
 			return NULL;
 
-		dso = dso__new(VDSO__MAP_NAME);
-		if (dso != NULL) {
-			dsos__add(head, dso);
-			dso__set_long_name(dso, file);
+		vdso_dso = dso__new(VDSO__MAP_NAME);
+		if (vdso_dso != NULL) {
+			dsos__add(head, vdso_dso);
+			dso__set_long_name(vdso_dso, file);
 		}
 	}
 
-	return dso;
+	return vdso_dso;
 }
-- 
1.7.8.rc2.5.g815b
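The caching scheme this patch describes (keep one pointer to the vdso dso, create it on first use, return it directly afterwards) can be modelled in a few lines. This is a Python sketch of the control flow only, not perf code: the `Dso` class, the temp-file path, and the `extractions` counter are hypothetical, and `"[vdso]"` is an assumed value for VDSO__MAP_NAME.

```python
VDSO_MAP_NAME = "[vdso]"  # assumed value of VDSO__MAP_NAME


class Dso:
    """Hypothetical stand-in for struct dso."""

    def __init__(self, short_name):
        self.short_name = short_name
        self.long_name = short_name


_vdso_dso = None  # plays the role of the static vdso_dso pointer
extractions = 0   # counts how often the expensive file step runs


def get_file():
    """Stand-in for get_file(): pretend to dump the vdso to a temp file."""
    global extractions
    extractions += 1
    return "/tmp/perf-vdso.so-example"


def vdso_dso_findnew(head):
    """Return the cached vdso Dso, creating and registering it on first use."""
    global _vdso_dso
    if _vdso_dso is None:
        path = get_file()
        if path is None:
            return None
        _vdso_dso = Dso(VDSO_MAP_NAME)
        head.append(_vdso_dso)       # dsos__add()
        _vdso_dso.long_name = path   # dso__set_long_name()
    return _vdso_dso


user_dsos = [Dso("libc.so.6"), Dso("libm.so.6")]
first = vdso_dso_findnew(user_dsos)
second = vdso_dso_findnew(user_dsos)  # no list scan, no second extraction
print(first is second, extractions, len(user_dsos))
```

The second call returns without touching `user_dsos` at all, which is the list traversal the commit message says is avoided.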
[PATCH 2/2] perf core: remove short name compare in dsos__find()
From: Chenggang Qin

Once the list traversal is avoided by the previous patch, the short name comparison in dsos__find() is unnecessary. The only purpose of the short name comparison was to find the dso of the vdso. Since the vdso can now be found through a pointer, the short name comparison can be removed.

Thanks

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/dso.c | 10 ++
 tools/perf/util/dso.h |  3 +--
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index c4374f0..6f7d5a9 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -513,16 +513,10 @@ void dsos__add(struct list_head *head, struct dso *dso)
 	list_add_tail(&dso->node, head);
 }
 
-struct dso *dsos__find(struct list_head *head, const char *name, bool cmp_short)
+struct dso *dsos__find(struct list_head *head, const char *name)
 {
 	struct dso *pos;
 
-	if (cmp_short) {
-		list_for_each_entry(pos, head, node)
-			if (strcmp(pos->short_name, name) == 0)
-				return pos;
-		return NULL;
-	}
 	list_for_each_entry(pos, head, node)
 		if (strcmp(pos->long_name, name) == 0)
 			return pos;
@@ -531,7 +525,7 @@ struct dso *dsos__find(struct list_head *head, const char *name, bool cmp_short)
 
 struct dso *__dsos__findnew(struct list_head *head, const char *name)
 {
-	struct dso *dso = dsos__find(head, name, false);
+	struct dso *dso = dsos__find(head, name);
 
 	if (!dso) {
 		dso = dso__new(name);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index d51aaf2..450199a 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -133,8 +133,7 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
 				const char *short_name, int dso_type);
 
 void dsos__add(struct list_head *head, struct dso *dso);
-struct dso *dsos__find(struct list_head *head, const char *name,
-		       bool cmp_short);
+struct dso *dsos__find(struct list_head *head, const char *name);
 struct dso *__dsos__findnew(struct list_head *head, const char *name);
 bool __dsos__read_build_ids(struct list_head *head, bool with_hits);
 
-- 
1.7.8.rc2.5.g815b
[PATCH 2/2] perf tools: remove short name compare in dsos__find()
From: Chenggang Qin

Once the list traversal is avoided by the previous patch, the short name comparison in dsos__find() is unnecessary. The only purpose of the short name comparison was to find the dso of the vdso. Since the vdso can now be found through a pointer, the short name comparison can be removed.

Thanks

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/dso.c | 10 ++
 tools/perf/util/dso.h |  3 +--
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index c4374f0..6f7d5a9 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -513,16 +513,10 @@ void dsos__add(struct list_head *head, struct dso *dso)
 	list_add_tail(&dso->node, head);
 }
 
-struct dso *dsos__find(struct list_head *head, const char *name, bool cmp_short)
+struct dso *dsos__find(struct list_head *head, const char *name)
 {
 	struct dso *pos;
 
-	if (cmp_short) {
-		list_for_each_entry(pos, head, node)
-			if (strcmp(pos->short_name, name) == 0)
-				return pos;
-		return NULL;
-	}
 	list_for_each_entry(pos, head, node)
 		if (strcmp(pos->long_name, name) == 0)
 			return pos;
@@ -531,7 +525,7 @@ struct dso *dsos__find(struct list_head *head, const char *name, bool cmp_short)
 
 struct dso *__dsos__findnew(struct list_head *head, const char *name)
 {
-	struct dso *dso = dsos__find(head, name, false);
+	struct dso *dso = dsos__find(head, name);
 
 	if (!dso) {
 		dso = dso__new(name);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index d51aaf2..450199a 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -133,8 +133,7 @@ struct dso *dso__kernel_findnew(struct machine *machine, const char *name,
 				const char *short_name, int dso_type);
 
 void dsos__add(struct list_head *head, struct dso *dso);
-struct dso *dsos__find(struct list_head *head, const char *name,
-		       bool cmp_short);
+struct dso *dsos__find(struct list_head *head, const char *name);
 struct dso *__dsos__findnew(struct list_head *head, const char *name);
 bool __dsos__read_build_ids(struct list_head *head, bool with_hits);
 
-- 
1.7.8.rc2.5.g815b
[PATCH 1/2] perf tools: avoid traverse dsos list while find vdso
From: Chenggang Qin

There is only one vdso in a system. It is not necessary to traverse the machine->user_dsos list when looking for the dso of the vdso. The flag vdso_found should be replaced by a pointer that points to the dso of the vdso. If the pointer is NULL, the dso of the vdso has not been created yet. Otherwise, the pointer can be returned directly from vdso__dso_findnew(). The list traversal is avoided by this method.

Thanks.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/vdso.c | 22 --
 1 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c
index 3915982..8022ef0 100644
--- a/tools/perf/util/vdso.c
+++ b/tools/perf/util/vdso.c
@@ -13,7 +13,7 @@
 #include "symbol.h"
 #include "linux/string.h"
 
-static bool vdso_found;
+static struct dso *vdso_dso = NULL;
 static char vdso_file[] = "/tmp/perf-vdso.so-XX";
 
 static int find_vdso_map(void **start, void **end)
@@ -55,9 +55,6 @@ static char *get_file(void)
 	size_t size;
 	int fd;
 
-	if (vdso_found)
-		return vdso_file;
-
 	if (find_vdso_map(&start, &end))
 		return NULL;
 
@@ -79,33 +76,30 @@ static char *get_file(void)
  out:
 	free(buf);
 
-	vdso_found = (vdso != NULL);
 	return vdso;
 }
 
 void vdso__exit(void)
 {
-	if (vdso_found)
+	if (vdso_dso)
 		unlink(vdso_file);
 }
 
 struct dso *vdso__dso_findnew(struct list_head *head)
 {
-	struct dso *dso = dsos__find(head, VDSO__MAP_NAME, true);
-
-	if (!dso) {
+	if (!vdso_dso) {
 		char *file;
 
 		file = get_file();
 		if (!file)
 			return NULL;
 
-		dso = dso__new(VDSO__MAP_NAME);
-		if (dso != NULL) {
-			dsos__add(head, dso);
-			dso__set_long_name(dso, file);
+		vdso_dso = dso__new(VDSO__MAP_NAME);
+		if (vdso_dso != NULL) {
+			dsos__add(head, vdso_dso);
+			dso__set_long_name(vdso_dso, file);
 		}
 	}
 
-	return dso;
+	return vdso_dso;
 }
-- 
1.7.8.rc2.5.g815b
[PATCH 2/3] perf core: Fix a mmap & munmap mismatches bug in dso__load
From: Chenggang Qin

For some dsos, a symsrc becomes neither syms_ss nor runtime_ss. In this situation, the corresponding ELF file is opened and mmapped in symsrc__init(), but it is never closed and munmapped anywhere. This bug leads to mismatched mmap & munmap calls: the mmapped areas exist for the life of perf. We can regard this as a memory leak.

This patch fixes the bug. symsrc__destroy() is now called when the opened and mmapped ELF file has no symtab section, no dynsym section, and no opdsec section.

Thanks.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
Acked-by: Namhyung Kim
---
 tools/perf/util/symbol.c |  3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index d5528e1..9675866 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -828,7 +828,8 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 
 			if (syms_ss && runtime_ss)
 				break;
-		}
+		} else
+			symsrc__destroy(ss);
 
 	}
 
-- 
1.7.8.rc2.5.g815b
[PATCH 1/3] perf core: Fix a mmap and munmap mismatched bug
From: root

In filename__read_debuglink(), the ELF file is opened and mmapped in elf_begin(). If the file is then found to be unusable by the code that follows, we jump directly to close(fd) and elf_end() is skipped. So the mmapped ELF file is never munmapped, and the mapped memory areas exist for the life of perf. This is a memory leak.

This patch fixes the bug.

Thanks.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
Reviewed-by: Namhyung Kim
---
 tools/perf/util/symbol-elf.c | 10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4b12bf8..b4df870 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -471,27 +471,27 @@ int filename__read_debuglink(const char *filename, char *debuglink,
 
 	ek = elf_kind(elf);
 	if (ek != ELF_K_ELF)
-		goto out_close;
+		goto out_elf_end;
 
 	if (gelf_getehdr(elf, &ehdr) == NULL) {
 		pr_err("%s: cannot get elf header.\n", __func__);
-		goto out_close;
+		goto out_elf_end;
 	}
 
 	sec = elf_section_by_name(elf, &ehdr, &shdr,
 				  ".gnu_debuglink", NULL);
 	if (sec == NULL)
-		goto out_close;
+		goto out_elf_end;
 
 	data = elf_getdata(sec, NULL);
 	if (data == NULL)
-		goto out_close;
+		goto out_elf_end;
 
 	/* the start of this section is a zero-terminated string */
 	strncpy(debuglink, data->d_buf, size);
 
+out_elf_end:
 	elf_end(elf);
-
 out_close:
 	close(fd);
 out:
-- 
1.7.8.rc2.5.g815b
[PATCH 3/3] perf core: Fix a memory leak bug because symbol__delete is ignored
From: Chenggang Qin

In symbols__fixup_duplicate(), when duplicated symbols are found, only the rb_node is erased from the tree. The symbol structures themselves are not freed, so their memory is lost.

This patch fixes the bug.

Thanks.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
Acked-by: Namhyung Kim
---
 tools/perf/util/symbol.c |  2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 9675866..3c9aa6f 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -148,10 +148,12 @@ again:
 
 		if (choose_best_symbol(curr, next) == SYMBOL_A) {
 			rb_erase(&next->rb_node, symbols);
+			symbol__delete(next);
 			goto again;
 		} else {
 			nd = rb_next(&curr->rb_node);
 			rb_erase(&curr->rb_node, symbols);
+			symbol__delete(curr);
 		}
 	}
 }
-- 
1.7.8.rc2.5.g815b
[PATCH 1/3] perf core: Fix a mmap and munmap mismatched bug
From: root In function filename__read_debuglink(), while the ELF file is opend and mmapped in elf_begin(), but if this file is considered to not be usable during the following code, we will goto the close(fd) directly. The elf_end() is skipped. So, the mmaped ELF file cannot be munmapped. The memory areas are mmapped is exist during the life of perf. This is a memory leak. This patch fixed this bug. Thanks. Cc: David Ahern Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Arjan van de Ven Cc: Namhyung Kim Cc: Yanmin Zhang Cc: Wu Fengguang Cc: Mike Galbraith Cc: Andrew Morton Signed-off-by: Chenggang Qin --- tools/perf/util/symbol-elf.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c index 4b12bf8..b4df870 100644 --- a/tools/perf/util/symbol-elf.c +++ b/tools/perf/util/symbol-elf.c @@ -471,27 +471,27 @@ int filename__read_debuglink(const char *filename, char *debuglink, ek = elf_kind(elf); if (ek != ELF_K_ELF) - goto out_close; + goto out_elf_end; if (gelf_getehdr(elf, &ehdr) == NULL) { pr_err("%s: cannot get elf header.\n", __func__); - goto out_close; + goto out_elf_end; } sec = elf_section_by_name(elf, &ehdr, &shdr, ".gnu_debuglink", NULL); if (sec == NULL) - goto out_close; + goto out_elf_end; data = elf_getdata(sec, NULL); if (data == NULL) - goto out_close; + goto out_elf_end; /* the start of this section is a zero-terminated string */ strncpy(debuglink, data->d_buf, size); +out_elf_end: elf_end(elf); - out_close: close(fd); out: -- 1.7.8.rc2.5.g815b -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/3] perf core: Fix a mmap & munmap mismatches bug in dso__load
From: Chenggang Qin

For some dsos, a symsrc ends up being used as neither syms_ss nor
runtime_ss. In this situation, the corresponding ELF file is opened and
mmapped in symsrc__init(), but it is never closed or munmapped
anywhere. This leaves mmap and munmap calls mismatched, and the mapped
areas persist for the lifetime of perf. This is effectively a memory
leak.

This patch fixes the bug. symsrc__destroy() is now called when the
opened and mmapped ELF file has neither a symtab section, nor a dynsym
section, nor an opd section.

Thanks.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/symbol.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index d5528e1..9675866 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -828,7 +828,8 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)

 			if (syms_ss && runtime_ss)
 				break;
-		}
+		} else
+			symsrc__destroy(ss);

 	}

--
1.7.8.rc2.5.g815b
[PATCH 3/3] perf core: Fix a memory leak bug because symbol__delete is ignored
From: Chenggang Qin

In symbols__fixup_duplicate(), when duplicated symbols are found, only
the rb_node is erased from the tree. The symbol structures themselves
are never freed, so their memory is leaked. This patch fixes the bug by
calling symbol__delete() on each erased symbol.

Thanks.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/symbol.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 9675866..3c9aa6f 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -148,10 +148,12 @@ again:

 		if (choose_best_symbol(curr, next) == SYMBOL_A) {
 			rb_erase(&next->rb_node, symbols);
+			symbol__delete(next);
 			goto again;
 		} else {
 			nd = rb_next(&curr->rb_node);
 			rb_erase(&curr->rb_node, symbols);
+			symbol__delete(curr);
 		}
 	}
 }
--
1.7.8.rc2.5.g815b
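The difference between unlinking a node and freeing it can be shown
with a much simpler container than an rbtree. The sketch below is an
assumption-laden illustration, not perf code: struct sym, sym_new(),
sym_delete(), fixup_duplicates() and the live_syms counter are all
hypothetical, and a singly linked list stands in for the tree.

```c
#include <stdlib.h>

struct sym {
	unsigned long start;
	struct sym *next;
};

int live_syms;	/* allocations that have not been freed yet */

struct sym *sym_new(unsigned long start, struct sym *next)
{
	struct sym *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	s->start = start;
	s->next = next;
	live_syms++;
	return s;
}

void sym_delete(struct sym *s)
{
	free(s);
	live_syms--;
}

/* Drop every entry that has the same start address as its predecessor. */
void fixup_duplicates(struct sym *head)
{
	struct sym *curr = head;

	while (curr && curr->next) {
		struct sym *next = curr->next;

		if (next->start == curr->start) {
			curr->next = next->next;	/* unlink, like rb_erase() */
			sym_delete(next);		/* the step the bug omitted */
		} else {
			curr = next;
		}
	}
}
```

Without the sym_delete() call the list would look correct afterwards,
but live_syms would never drop, which is the leak this patch closes.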
[PATCH] perf/core: Fix a warning in util/trace-event-parse.c
From:

While compiling perf on Red Hat Enterprise Linux Server release 5.4
(Tikanga), I got a warning:

  CC util/trace-event-parse.o
  cc1: warnings being treated as errors
  util/trace-event-parse.c: In function 'parse_proc_kallsyms':
  util/trace-event-parse.c:232: warning: 'fmt' may be used uninitialized in this function
  make: *** [util/trace-event-parse.o] Error 1

The gcc version is 4.1.2.

The reason is that the local variable 'fmt' is not initialized before
it is used. This patch fixes that by initializing it to NULL.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/trace-event-parse.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/trace-event-parse.c b/tools/perf/util/trace-event-parse.c
index 3aabcd6..630e331 100644
--- a/tools/perf/util/trace-event-parse.c
+++ b/tools/perf/util/trace-event-parse.c
@@ -229,7 +229,7 @@ void parse_proc_kallsyms(struct pevent *pevent,
 	char *next = NULL;
 	char *addr_str;
 	char *mod;
-	char *fmt;
+	char *fmt = NULL;

 	line = strtok_r(file, "\n", &next);
 	while (line) {
--
1.5.5.6
[PATCH] perf top: Add ability to detect new threads dynamically while 'perf top -p pid' is running
From: Chenggang Qin

When "perf top -p 'pid'" is used to monitor the symbols of specified
processes, the monitored processes may create new threads while
"perf top" is running. In the current version, these new threads and
their symbols are not shown. This patch adds the ability to show them.

Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-top.c |   86 --
 tools/perf/util/evlist.c |    2 ++
 2 files changed, 85 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 68cd61e..54c9cc1 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -882,7 +882,7 @@ static void perf_top__mmap_read(struct perf_top *top)
 		perf_top__mmap_read_idx(top, i);
 }

-static void perf_top__start_counters(struct perf_top *top)
+static int perf_top__start_counters(struct perf_top *top)
 {
 	struct perf_evsel *counter, *first;
 	struct perf_evlist *evlist = top->evlist;
@@ -929,6 +929,10 @@ try_again:
 						 group_fd) < 0) {
 			int err = errno;

+			if (err == ESRCH) {
+				return err;
+			}
+
 			if (err == EPERM || err == EACCES) {
 				ui__error_paranoid();
 				goto out_err;
@@ -994,7 +998,7 @@ try_again:
 		goto out_err;
 	}

-	return;
+	return 0;

 out_err:
 	exit_browser(0);
@@ -1018,6 +1022,77 @@ static int perf_top__setup_sample_type(struct perf_top *top)
 	return 0;
 }

+static int thread_map_cmp(struct thread_map *threads_a,
+			  struct thread_map *threads_b)
+{
+	int i, j;
+
+	if (threads_a->nr != threads_b->nr) {
+		return 1;
+	} else {
+		for (i = 0; i < threads_b->nr; i++) {
+			for (j = 0; j < threads_a->nr; j++)
+				if (threads_b->map[i] == threads_a->map[j])
+					break;
+
+			if (j == threads_a->nr)
+				return 1;
+		}
+
+		return 0;
+	}
+}
+
+static void check_new_threads(struct perf_top *top)
+{
+	struct thread_map *new_thread_map;
+	struct perf_evsel *counter;
+	struct perf_evlist *evlist = top->evlist;
+
+retry:
+	new_thread_map = thread_map__new_str(top->target.pid, top->target.tid,
+					     top->target.uid);
+	if (!new_thread_map)
+		return;
+
+	if (thread_map_cmp(top->evlist->threads, new_thread_map) == 0) {
+		free(new_thread_map);
+		return;
+	} else {
+		list_for_each_entry(counter, &evlist->entries, node) {
+			perf_evsel__close(counter, top->evlist->cpus->nr,
+					  top->evlist->threads->nr);
+		}
+
+		if (top->evlist->mmap)
+			perf_evlist__munmap(top->evlist);
+
+		if (top->evlist->pollfd) {
+			free(top->evlist->pollfd);
+			top->evlist->pollfd = NULL;
+		}
+
+		top->evlist->nr_fds = 0;
+
+		thread_map__delete(top->evlist->threads);
+		top->evlist->threads = new_thread_map;
+
+		if (perf_top__start_counters(top) == ESRCH) {
+			while (thread_map_cmp(top->evlist->threads,
+					      new_thread_map) == 0) {
+				new_thread_map = thread_map__new_str(top->target.pid,
+								     top->target.tid,
+								     top->target.uid);
+				if (!new_thread_map)
+					return;
+			}
+			goto retry;
+		}
+
+		return;
+	}
+}
+
 static int __cmd_top(struct perf_top *top)
 {
 	pthread_t thread;
@@ -1067,7 +1142,12 @@ static int __cmd_top(struct perf_top *top)
 	}

 	while (1) {
-		u64 hits = top->samples;
+		u64 hits;
+
+		if (perf_target__has_task(&top->target))
+			check_new_threads(top);
+
+		hits = top->samples;

 		perf_top__mmap_read(top);

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9b38681..293eca7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -452,6 +452,8 @@ void perf_evlist__munmap(struct
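The comparison performed by thread_map_cmp() in this patch is an
order-insensitive equality test on two lists of thread ids. A minimal
standalone sketch of that idea follows. The helper same_thread_set()
is hypothetical and works on plain int arrays rather than on perf's
struct thread_map, and it assumes neither array contains duplicate
ids.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of perf): returns true when both
 * arrays hold the same thread ids, in any order. Assumes that neither
 * array contains duplicates.
 */
bool same_thread_set(const int *a, size_t na, const int *b, size_t nb)
{
	size_t i, j;

	if (na != nb)
		return false;

	for (i = 0; i < nb; i++) {
		for (j = 0; j < na; j++)
			if (b[i] == a[j])
				break;

		if (j == na)	/* b[i] does not appear in a */
			return false;
	}

	return true;
}
```

The nested loop is quadratic in the number of threads. That is
probably acceptable for small thread counts, but sorting both arrays
first would be an option if the check runs on every loop iteration for
a process with many threads.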
[PATCH] perf tool: remove an unnecessary function call while process pipe events
From: Chenggang Qin

perf_session_free_sample_buffers() can be removed from
__perf_session__process_pipe_events(), since the ordered_samples buffer
is not used while samples are read from a pipe.

__perf_session__process_pipe_events() is only used to process events
that come from a pipe, and ordered_samples is disabled in that case.
Refer to the following code in perf_session__new():

	if (tool && tool->ordering_requires_timestamps &&
	    tool->ordered_samples && !perf_evlist__sample_id_all(self->evlist)) {
		dump_printf("WARNING: No sample_id_all support, falling back to unordered processing\n");
		tool->ordered_samples = false;
	}

If a pipe is used, perf_evlist__sample_id_all(self->evlist) always
returns 0, because session->evlist is empty until an attr_event is
read.

Thanks
Chenggang Qin

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/session.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 568b750..b69c28a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1251,7 +1251,6 @@ done:
 out_err:
 	free(buf);
 	perf_session__warn_about_errors(self, tool);
-	perf_session_free_sample_buffers(self);
 	return err;
 }

--
1.7.8.rc2.5.g815b
[PATCH 0/4] perf report: add parameters 'start' & 'end' to specify analysis interval
This patch set introduces a feature to analyze the samples in a
specified time interval.

After a perf.data file has been generated by perf record, the user may
want to analyze a sub-interval of the whole record period. For some
functions, the percentage of samples in a certain sub-interval differs
from the percentage over the whole record period. Showing the picture
for a certain time interval makes it easier for users to troubleshoot
performance problems.

The timestamps of the samples are recorded in the perf.data file, and
perf report sorts the samples by timestamp in ordered_samples while
processing them. So it is easy to find the samples whose timestamps
fall in a certain time interval.

We add 2 parameters, --start and --end, to specify the time interval:

  perf report --start x --end x

The smallest granularity of the time interval is one millisecond.

For example: if the whole record period of a perf.data file is 1 to 2,
we can use the following command to analyze the samples in
[15000, 16000):

  perf report --start 15000 --end 16000

The time is the uptime, that is, it is counted from system start.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin

Chenggang Qin (4):
  perf tools: add parameter 'start' & 'end' to perf report
  perf tools: relate 'start' & 'end' to perf_session
  perf tools: record min_timestamp of samples queue in ordered_samples
  perf tools: add the feature to assign analysis interval to perf report

 tools/perf/builtin-report.c |   14
 tools/perf/util/session.c   |   49 +-
 tools/perf/util/session.h   |    3 ++
 3 files changed, 64 insertions(+), 2 deletions(-)

--
1.7.8.rc2.5.g815b
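As an illustration of the interval semantics described in this cover
letter, the sketch below converts millisecond bounds to nanosecond
timestamps and tests membership in the half-open interval
[start, end). It is not the code from the series: struct interval,
interval_from_ms() and interval_contains() are hypothetical names, and
the use of integer multiplication is an assumption of this sketch
(patch 2/4 multiplies by 1e6 instead). A value of 0 means "not given",
matching the defaults of 0 and ULLONG_MAX set in patch 2/4.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/* Hypothetical container for the analysis interval, in nanoseconds. */
struct interval {
	uint64_t tstart;
	uint64_t tend;
};

/* 0 for either bound means "unbounded on that side". */
struct interval interval_from_ms(uint64_t start_ms, uint64_t end_ms)
{
	struct interval iv = { 0, UINT64_MAX };

	if (start_ms)
		iv.tstart = start_ms * NSEC_PER_MSEC;
	if (end_ms)
		iv.tend = end_ms * NSEC_PER_MSEC;

	return iv;
}

/* Half-open test: tstart is included, tend is excluded. */
bool interval_contains(const struct interval *iv, uint64_t ts_ns)
{
	return ts_ns >= iv->tstart && ts_ns < iv->tend;
}
```

With --start 15000 --end 16000, a sample stamped exactly 15000 ms after
boot is inside the interval under this definition and one stamped
exactly 16000 ms is outside.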
[PATCH 4/4] perf tools: add the feature to assign analysis interval to perf report
Only process the samples whose timestamp is in [start, end).

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/session.c |   43 +--
 1 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 4e9dd66..d50e29e 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -532,6 +532,9 @@ static int flush_sample_queue(struct perf_session *s,
 	bool show_progress = limit == ULLONG_MAX;
 	int ret;

+	if (limit > s->tend)
+		limit = s->tend;
+
 	if (!tool->ordered_samples || !limit)
 		return 0;

@@ -539,6 +542,9 @@ static int flush_sample_queue(struct perf_session *s,
 		if (session_done())
 			return 0;

+		if (iter->timestamp < s->tstart)
+			continue;
+
 		if (iter->timestamp > limit)
 			break;

@@ -617,7 +623,26 @@ static int process_finished_round(struct perf_tool *tool,
 				  union perf_event *event __maybe_unused,
 				  struct perf_session *session)
 {
-	int ret = flush_sample_queue(session, tool);
+	int ret = 0;
+
+	/*
+	 * The next round should be processed continue.
+	 * But, this round is skipped.
+	 */
+	if (session->ordered_samples.next_flush < session->tstart) {
+		session->ordered_samples.next_flush = session->ordered_samples.max_timestamp;
+		return ret;
+	}
+
+	/*
+	 * This round & all followed rounds are skipped.
+	 */
+	if (session->ordered_samples.min_timestamp > session->tend) {
+		session->ordered_samples.next_flush = ULLONG_MAX;
+		return ret;
+	}
+
+	ret = flush_sample_queue(session, tool);
 	if (!ret)
 		session->ordered_samples.next_flush = session->ordered_samples.max_timestamp;

@@ -1373,6 +1398,14 @@ more:
 		goto out_err;
 	}

+	/*
+	 * After process a finished round event:
+	 * The minimal timestamp in os->samples is greater than
+	 * tend, so, the followed events couldn't be processed.
+	 */
+	if (session->ordered_samples.next_flush == ULLONG_MAX)
+		goto out_err;
+
 	head += size;
 	file_pos += size;

@@ -1389,8 +1422,14 @@ more:
 	if (file_pos < file_size)
 		goto more;

+	if (session->ordered_samples.max_timestamp < session->tstart)
+		goto out_err;
+
+	if (session->ordered_samples.min_timestamp > session->tend)
+		goto out_err;
+
 	/* do the final flush for ordered samples */
-	session->ordered_samples.next_flush = ULLONG_MAX;
+	session->ordered_samples.next_flush = session->tend;
 	err = flush_sample_queue(session, tool);
 out_err:
 	ui_progress__finish();
--
1.7.8.rc2.5.g815b
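The per-round decision added to process_finished_round() can be read as
a three-way classification. The sketch below restates it as a pure
function so the three cases are easier to see. The enum round_action
and classify_round() are hypothetical names introduced only for this
illustration, and the parameters mirror the fields used in the diff
(next_flush, min_timestamp, tstart, tend).

```c
#include <stdint.h>

/* Hypothetical names, used only to label the three cases in the patch. */
enum round_action {
	ROUND_SKIP,	/* round ends before tstart: skip it, keep reading */
	ROUND_STOP,	/* queue starts after tend: nothing later can match */
	ROUND_FLUSH	/* round overlaps the interval: flush as usual */
};

enum round_action classify_round(uint64_t next_flush, uint64_t min_timestamp,
				 uint64_t tstart, uint64_t tend)
{
	if (next_flush < tstart)
		return ROUND_SKIP;

	if (min_timestamp > tend)
		return ROUND_STOP;

	return ROUND_FLUSH;
}
```

In the patch, the ROUND_STOP case is signalled by setting next_flush to
ULLONG_MAX, which the main read loop then checks in order to stop
reading the file.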
[PATCH 1/4] perf report: add parameter 'start' & 'end' to perf report
  perf report --start time1 --end time2

time1 and time2 are given in milliseconds.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-report.c |    9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 72eae74..e9e9d0a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -733,6 +733,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 {
 	struct perf_session *session;
 	struct stat st;
+	u64 tstart = 0, tend = 0;
 	bool has_br_stack = false;
 	int branch_mode = -1;
 	int ret = -1;
@@ -843,6 +844,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_BOOLEAN(0, "mem-mode", &report.mem_mode, "mem access profile"),
 	OPT_CALLBACK(0, "percent-limit", &report, "percent",
 		     "Don't show entries under that percent", parse_percent_limit),
+	OPT_U64(0, "start", &tstart, "Start time of analysis interval. (Unit: ms)"),
+	OPT_U64(0, "end", &tend, "End time of analysis interval. (Unit: ms)"),
 	OPT_END()
 	};

@@ -850,6 +853,12 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)

 	argc = parse_options(argc, argv, options, report_usage, 0);

+	if (tend && tstart >= tend) {
+		fprintf(stderr, "start [%" PRIu64 "] is greater than end [%"
+			PRIu64 "].\n", tstart, tend);
+		return -1;
+	}
+
 	if (report.use_stdio)
 		use_browser = 0;
 	else if (report.use_tui)
--
1.7.8.rc2.5.g815b
[PATCH 2/4] perf tools: relate 'start' & 'end' to perf_session
Copy the values of start and end into struct perf_session, converting
them from milliseconds to nanoseconds.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/builtin-report.c |    5 +
 tools/perf/util/session.c   |    3 +++
 tools/perf/util/session.h   |    2 ++
 3 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index e9e9d0a..d3c1c8a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -889,6 +889,11 @@ repeat:
 	if (session == NULL)
 		return -ENOMEM;

+	if (tstart)
+		session->tstart = tstart * 1e6;
+	if (tend)
+		session->tend = tend * 1e6;
+
 	report.session = session;

 	has_br_stack = perf_header__has_feat(&session->header,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 568b750..193bb6a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -134,6 +134,9 @@ struct perf_session *perf_session__new(const char *filename, int mode,
 	INIT_LIST_HEAD(&self->ordered_samples.to_free);
 	machines__init(&self->machines);

+	self->tstart = 0;
+	self->tend = ULLONG_MAX;
+
 	if (mode == O_RDONLY) {
 		if (perf_session__open(self, force) < 0)
 			goto out_delete;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 04bf737..c9a6c27 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -37,6 +37,8 @@ struct perf_session {
 	int			fd;
 	bool			fd_pipe;
 	bool			repipe;
+	u64			tstart;
+	u64			tend;
 	struct ordered_samples	ordered_samples;
 	char			filename[1];
 };
--
1.7.8.rc2.5.g815b
[PATCH 3/4] perf tools: record min_timestamp of samples queue in ordered_samples
Add a field 'min_timestamp' to struct ordered_samples to record the
minimum timestamp of the samples in ordered_samples->samples.

Cc: David Ahern
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Arjan van de Ven
Cc: Namhyung Kim
Cc: Yanmin Zhang
Cc: Wu Fengguang
Cc: Mike Galbraith
Cc: Andrew Morton
Signed-off-by: Chenggang Qin
---
 tools/perf/util/session.c |    3 +++
 tools/perf/util/session.h |    1 +
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 193bb6a..4e9dd66 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -708,6 +708,9 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
 	new->file_offset = file_offset;
 	new->event = event;

+	if (list_empty(&os->samples) || os->min_timestamp > timestamp)
+		os->min_timestamp = timestamp;
+
 	__queue_event(new, s);

 	return 0;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index c9a6c27..7d411b9 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -18,6 +18,7 @@ struct ordered_samples {
 	u64			last_flush;
 	u64			next_flush;
 	u64			max_timestamp;
+	u64			min_timestamp;
 	struct list_head	samples;
 	struct list_head	sample_cache;
 	struct list_head	to_free;
--
1.7.8.rc2.5.g815b
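The bookkeeping added here is a running minimum that is reset whenever
the queue is empty. A minimal standalone sketch of the same update
rule follows. struct queue_stats and queue_stats_add() are
hypothetical, a simple element count stands in for the
list_empty(&os->samples) test, and removal of samples on flush is not
modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a sample queue; not a perf structure. */
struct queue_stats {
	size_t nr;		/* number of queued samples */
	uint64_t min_ts;	/* smallest queued timestamp */
	uint64_t max_ts;	/* largest queued timestamp */
};

/* Record one queued timestamp; the first sample initializes both bounds. */
void queue_stats_add(struct queue_stats *qs, uint64_t ts)
{
	if (qs->nr == 0 || ts < qs->min_ts)
		qs->min_ts = ts;
	if (qs->nr == 0 || ts > qs->max_ts)
		qs->max_ts = ts;
	qs->nr++;
}
```

Checking for an empty queue first matters because a zero-initialized
min_ts would otherwise never be replaced by a larger real timestamp.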
[tip:perf/core] perf symbols: Fix a memory leak due to symbol__delete not being used
Commit-ID:  d4f74eb89199dc7bde5579783e9188841e1271e3
Gitweb:     http://git.kernel.org/tip/d4f74eb89199dc7bde5579783e9188841e1271e3
Author:     Chenggang Qin
AuthorDate: Fri, 11 Oct 2013 08:27:59 +0800
Committer:  Arnaldo Carvalho de Melo
CommitDate: Mon, 14 Oct 2013 12:21:20 -0300

perf symbols: Fix a memory leak due to symbol__delete not being used

In function symbols__fixup_duplicate(), when duplicated symbols are
found, only the rb_node is removed from the tree. The symbol structures
themselves are not freed, so their memory is leaked.

Signed-off-by: Chenggang Qin
Acked-by: Namhyung Kim
Cc: Andrew Morton
Cc: Arjan van de Ven
Cc: David Ahern
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Wu Fengguang
Cc: Yanmin Zhang
Link: http://lkml.kernel.org/r/1381451279-4109-3-git-send-email-chenggang@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/symbol.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b66c1ee..c0c3696 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -160,10 +160,12 @@ again:

 		if (choose_best_symbol(curr, next) == SYMBOL_A) {
 			rb_erase(&next->rb_node, symbols);
+			symbol__delete(next);
 			goto again;
 		} else {
 			nd = rb_next(&curr->rb_node);
 			rb_erase(&curr->rb_node, symbols);
+			symbol__delete(curr);
 		}
 	}
 }
[tip:perf/core] perf symbols: Fix a mmap and munmap mismatched bug
Commit-ID:  784f3390f9bd900adfb3b0373615e105a0d9749a
Gitweb:     http://git.kernel.org/tip/784f3390f9bd900adfb3b0373615e105a0d9749a
Author:     Chenggang Qin
AuthorDate: Fri, 11 Oct 2013 08:27:57 +0800
Committer:  Arnaldo Carvalho de Melo
CommitDate: Mon, 14 Oct 2013 12:21:23 -0300

perf symbols: Fix a mmap and munmap mismatched bug

In function filename__read_debuglink(), the ELF file is opened and
mmapped in elf_begin(). If the file later turns out to be unusable, the
code jumps directly to close(fd) and elf_end() is skipped, so the
mmapped ELF file cannot be munmapped. The mmapped areas persist for the
lifetime of perf, which is a memory leak. This patch fixes the bug.

Reviewed-by: Namhyung Kim
Signed-off-by: Chenggang Qin
Cc: Andrew Morton
Cc: Arjan van de Ven
Cc: Chenggang Qin
Cc: David Ahern
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Wu Fengguang
Cc: Yanmin Zhang
Link: http://lkml.kernel.org/r/1381451279-4109-1-git-send-email-chenggang@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/symbol-elf.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index d6b8af3..eed0b96 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -487,27 +487,27 @@ int filename__read_debuglink(const char *filename, char *debuglink,

 	ek = elf_kind(elf);
 	if (ek != ELF_K_ELF)
-		goto out_close;
+		goto out_elf_end;

 	if (gelf_getehdr(elf, &ehdr) == NULL) {
 		pr_err("%s: cannot get elf header.\n", __func__);
-		goto out_close;
+		goto out_elf_end;
 	}

 	sec = elf_section_by_name(elf, &ehdr, &shdr,
 				  ".gnu_debuglink", NULL);
 	if (sec == NULL)
-		goto out_close;
+		goto out_elf_end;

 	data = elf_getdata(sec, NULL);
 	if (data == NULL)
-		goto out_close;
+		goto out_elf_end;

 	/* the start of this section is a zero-terminated string */
 	strncpy(debuglink, data->d_buf, size);

+out_elf_end:
 	elf_end(elf);
-
 out_close:
 	close(fd);
 out: