As the --children option changes the output of perf report (and perf
top), it sometimes confuses users.  Add more words and examples to help
users understand the option's behavior, and how to disable it ;-).

Cc: Taeung Song <treeze.tae...@gmail.com>
Signed-off-by: Namhyung Kim <namhy...@kernel.org>
---
 tools/perf/Documentation/overhead.txt    | 96 ++++++++++++++++++++++++++++++++
 tools/perf/Documentation/perf-report.txt |  4 ++
 tools/perf/Documentation/perf-top.txt    |  3 +-
 3 files changed, 102 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/Documentation/overhead.txt

diff --git a/tools/perf/Documentation/overhead.txt b/tools/perf/Documentation/overhead.txt
new file mode 100644
index 000000000000..a7d624229087
--- /dev/null
+++ b/tools/perf/Documentation/overhead.txt
@@ -0,0 +1,96 @@
+Overhead calculation
+--------------------
+The overhead can be shown in two columns as 'Children' and 'Self' when
+perf collects callchains.  The self overhead is simply calculated by
+adding all period values of the entry - usually a function (symbol).
+This is the value that perf shows traditionally and the sum of all
+self overhead values should be 100%.
+
+The children overhead is calculated by adding the period values of all
+child functions to the self overhead, so that it shows the total
+overhead of the higher level functions.  'Children' here means
+functions that are called from another function.
+
+It might be confusing that the sum of all the children overhead values
+exceeds 100% since each of them is already an accumulation of (self)
+overhead of its children.  But with this enabled, users can find which
+function has the most overhead even if samples are spread over the
+children.
+
+Consider the following example; there are three functions like below.
+
+-----------------------
+void foo(void) {
+    /* something */
+}
+
+void bar(void) {
+    /* do something */
+    foo();
+}
+
+int main(void) {
+    bar();
+    return 0;
+}
+-----------------------
+
+In this case 'foo' is a child of 'bar', and 'bar' is an immediate
+child of 'main' so 'foo' also is a child of 'main'.  In other words,
+'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'.
+
+Suppose all samples are recorded in 'foo' and 'bar' only.  When you
+record with callchains you'll see something like below in the usual
+(self-overhead-only) output of perf report:
+
+----------------------------------
+Overhead  Symbol
+........  .....................
+  60.00%  foo
+          |
+          --- foo
+              bar
+              main
+              __libc_start_main
+
+  40.00%  bar
+          |
+          --- bar
+              main
+              __libc_start_main
+----------------------------------
+
+When the --children option is enabled, the (self) overhead values of
+the children (in this case 'foo' and 'bar') are added to their parents
+to calculate the children overhead.  The report could then look like:
+
+-------------------------------------------
+Children      Self  Symbol
+........  ........  ....................
+ 100.00%     0.00%  __libc_start_main
+          |
+          --- __libc_start_main
+
+ 100.00%     0.00%  main
+          |
+          --- main
+              __libc_start_main
+
+ 100.00%    40.00%  bar
+          |
+          --- bar
+              main
+              __libc_start_main
+
+  60.00%    60.00%  foo
+          |
+          --- foo
+              bar
+              main
+              __libc_start_main
+-------------------------------------------
+
+Since v3.16 the children overhead is shown by default and the output
+is sorted by its values.  To disable it, pass the --no-children option
+on the command line, or add 'report.children = false' or
+'top.children = false' to the perfconfig file.
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 4879cf638824..bd97e5c3c9b6 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -193,6 +193,7 @@ OPTIONS
 	Accumulate callchain of children to parent entry so that then can
 	show up in the output.  The output will have a new "Children" column
 	and will be sorted on the data.  It requires callchains are recorded.
+	See the `overhead calculation' section for more details.
 
 --max-stack::
 	Set the stack depth limit when parsing the callchain, anything
@@ -323,6 +324,9 @@ OPTIONS
 
 --header-only::
 	Show only perf.data header (forces --stdio).
+
+include::overhead.txt[]
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-annotate[1]
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 3265b1070518..526d6bebec1f 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -168,7 +168,7 @@ Default is to monitor all CPUS.
 	Accumulate callchain of children to parent entry so that then can
 	show up in the output.  The output will have a new "Children" column
 	and will be sorted on the data.  It requires -g/--call-graph option
-	enabled.
+	enabled.  See the `overhead calculation' section for more details.
 
 --max-stack::
 	Set the stack depth limit when parsing the callchain, anything
@@ -234,6 +234,7 @@ INTERACTIVE PROMPTING KEYS
 
 Pressing any unmapped key displays a menu, and prompts for input.
 
+include::overhead.txt[]
 
 SEE ALSO
 --------
--
2.3.5
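
For anyone who wants to check the arithmetic in overhead.txt by hand, the accumulation can be sketched in a few lines of Python. This is not part of the patch and does not claim to mirror how perf implements it internally; `SAMPLES` and `overhead()` are hypothetical names, and the sample periods (60 and 40) are made up to match the foo/bar/main example. The sketch also assumes that a symbol is counted only once per sample, even if it appears more than once in a callchain.

```python
from collections import defaultdict

# Hypothetical samples matching the overhead.txt example.
# Each entry is (callchain ordered from the sampled function up to the
# outermost caller, period).
SAMPLES = [
    (("foo", "bar", "main", "__libc_start_main"), 60),
    (("bar", "main", "__libc_start_main"), 40),
]


def overhead(samples):
    """Return {symbol: (children_pct, self_pct)} for the given samples."""
    total = sum(period for _, period in samples)
    self_period = defaultdict(int)
    children_period = defaultdict(int)

    for chain, period in samples:
        # Self overhead: only the function the sample was taken in.
        self_period[chain[0]] += period
        # Children overhead: every function on the callchain gets the
        # period.  set() is an assumption of this sketch so that a
        # recursive function is not counted twice for one sample.
        for symbol in set(chain):
            children_period[symbol] += period

    return {
        symbol: (100.0 * children_period[symbol] / total,
                 100.0 * self_period[symbol] / total)
        for symbol in children_period
    }


if __name__ == "__main__":
    # Sort by children overhead (descending), then by name for stability.
    rows = sorted(overhead(SAMPLES).items(),
                  key=lambda item: (-item[1][0], item[0]))
    print("Children      Self  Symbol")
    for symbol, (children_pct, self_pct) in rows:
        print(f"{children_pct:7.2f}%  {self_pct:7.2f}%  {symbol}")
```

With these samples the self column still sums to 100%, while the children column sums to more than 100% because each period is credited to every function on its callchain.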