On Mon, Nov 2, 2015 at 2:12 PM, Namhyung Kim <namhy...@kernel.org> wrote:
> Hi Arnaldo,
>
> On Mon, Nov 02, 2015 at 06:30:21PM -0300, Arnaldo Carvalho de Melo wrote:
>> On Mon, Nov 02, 2015 at 12:37:28PM -0800, Brendan Gregg wrote:
>> > G'Day Namhyung,
>> >
>> > On Mon, Nov 2, 2015 at 4:57 AM, Namhyung Kim <namhy...@kernel.org> wrote:
>> > > Hello,
>> > >
>> > > This is what Brendan requested on the perf-users mailing list [1] to
>> > > support FlameGraphs [2] more efficiently. This patchset adds a few
>> > > more callchain options to adjust the output for it.
>> > >
>> > > At first, 'folded' output mode was added. The folded output puts all
>> > > callchain nodes in a line separated by semicolons, a space and the
>> > > value. For now it only supports --stdio, as the other UIs provide
>> > > some way of folding/expanding callchains dynamically.
>> > >
>> > > The value can now be one of 'percent', 'period', or 'count'. The
>> > > percent is the current default output and the period is the raw number
>> > > of sample periods. The count is the number of samples for each callchain.
>> > >
>> > > Here's an example:
>> > >
>> > >   $ perf report --no-children --show-nr-samples --stdio -g folded,count
>> > >   ...
>> > >     39.93%     80  swapper  [kernel.vmlinux]  [k] intel_idle
>> > >
>> > > intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
>> > > 57
>> > >
>> > > intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;...
>> > > 23
>> >
>> > Thanks!
>> >
>> > So for the folded output I don't need the summary line (the row of
>> > columns printed by hist_entry__snprintf()), and don't need anything
>> > except folded stacks and the counts. If working with the existing
>> > stdio interface is making it harder than it needs to be, might it be
>>
>> I don't think so, just add some flag asking for that
>> hist_entry__snprintf() to be suppressed, ideas for a long option name?
>>
>> Having it as Namhyung did may have value for some people as a more
>> compact way to show the callchains together with the hist_entry line.
>
> Yeah, I'd keep the hist entry line unless it's too hard to
> parse/filter. IMHO it's just a way to show callchains, so no need to
> have separate output mode.

Ok, good point, it can be thought of as a different stack
representation format.

>
> Brendan, I guess you still need to know other info like cpu or pid, no?
>

Yes, I just realized that I either include the process name (Command
column) or name-PID, as the first folded element. E.g., output can be:

mkdir;getopt_long;page_fault;do_page_fault;__do_page_fault;filemap_map_pages 3

Or:

mkdir-21918;getopt_long;page_fault;do_page_fault;__do_page_fault;filemap_map_pages 2

Usually the first, but sometimes it's helpful to split on PID as well.

As for what to call such options (which may be a follow-on patch
anyway) ... maybe something like:

"folded": fold stacks as single lines
"nameonly,folded": suppress summary line and include process name in
the folded stack
"pidonly,folded": suppress summary line and include process_name-PID
in the folded stack

> And I feel like it'd be better to put the count before the callchains
> for consistency like below. Is it OK with you?
>
>   $ perf report --no-children --show-nr-samples --stdio -g folded,count
>   ...
>     39.93%     80  swapper  [kernel.vmlinux]  [k] intel_idle
>     57
> intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
>     23
> intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;...
>

If it was printing with the perf report summary, sure, but if we have
a way to only emit folded output, then counts last would be perfect
and maybe a bit more intuitive (key then value).

>
>>
>> With this in mind, do you have any other issues with Namhyung's
>> patchkit? An acked-by/tested-by from you would be nice to have, and then
>> we could work out the new option to suppress that hist_entry__snprintf()
>> in a follow up patch.

Acked and tested, yes.
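
To make the name vs. name-PID distinction concrete, here is a minimal
sketch of how a consumer could read such folded lines and merge the
per-PID variant back into per-name totals. The helper names
parse_folded() and merge_pids() are hypothetical (they are not part of
perf or the FlameGraph scripts), and the sketch assumes the count is the
last space-separated field and that a trailing "-<digits>" on the first
element is always a PID.

```python
import re
from collections import Counter

def parse_folded(line):
    """Split 'frame1;frame2;... COUNT' into (list of frames, int count)."""
    # Assumption: the count is the last space-separated field on the line.
    stack, _, value = line.rstrip().rpartition(" ")
    return stack.split(";"), int(value)

def merge_pids(lines):
    """Sum counts per stack after dropping a trailing -PID from the first element."""
    totals = Counter()
    for line in lines:
        frames, count = parse_folded(line)
        # Assumption: a trailing '-<digits>' on the first element is the PID.
        frames[0] = re.sub(r"-\d+$", "", frames[0])
        totals[";".join(frames)] += count
    return totals

sample = [
    "mkdir;getopt_long;page_fault;do_page_fault;__do_page_fault;filemap_map_pages 3",
    "mkdir-21918;getopt_long;page_fault;do_page_fault;__do_page_fault;filemap_map_pages 2",
]

for stack, count in merge_pids(sample).items():
    print(stack, count)
```

A process whose name itself ends in "-<digits>" would be merged
incorrectly by this sketch, which is one reason to keep the choice
between the two forms explicit in the option.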

Looks like I'd be using caller ordering, eg, to get lines like this:

__GI___libc_read;entry_SYSCALL_64_fastpath;sys_read;vfs_read;__vfs_read;urandom_read;extract_entropy_user;extract_buf;check_events;xen_hypercall_xen_version 91

Which I can do just by using "-g folded,count,caller".

>>
>> > easier to make it a separate interface (ui/folded), that just emitted
>> > the folded output? Just an idea. This existing patchset is working for
>> > me, I'd just be filtering the output.
>> >
>> > Having the option for percentages and periods is nice. I can envisage
>> > using periods (for latency flame graphs).
>
> Glad to see you like it. :)
>
> Thanks,
> Namhyung
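
For reference, the filtering mentioned above could look roughly like
the following. This is only a sketch under stated assumptions: each
folded stack and its count sit on one line separated by whitespace (as
in the patch description, not as wrapped in the quoted example), frames
contain no spaces, and summary lines never consist of exactly two
fields ending in an integer. folded_only() is a hypothetical helper
name, not anything provided by perf.

```python
import re

# Assumption: a folded line is exactly two fields, a ';'-joined stack
# without spaces followed by an integer count.
FOLDED_LINE = re.compile(r"^\s*(\S+)\s+(\d+)\s*$")

def folded_only(lines):
    """Yield (stack, count) for folded lines and skip everything else."""
    for line in lines:
        if line.lstrip().startswith("#"):
            continue  # header/comment lines of the report
        match = FOLDED_LINE.match(line)
        if match:
            yield match.group(1), int(match.group(2))

report = [
    "    39.93%     80  swapper  [kernel.vmlinux]  [k] intel_idle",
    "intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 57",
    "",
    "intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;... 23",
]

for stack, count in folded_only(report):
    print(stack, count)
```

The summary line is dropped because it has more than two fields;
symbols that contain spaces (some C++ names, for instance) would need a
looser pattern than this one.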