Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On Tue, 30 Oct 2012 10:01:10 +0100, Ingo Molnar wrote:
> * Peter Zijlstra wrote:
>
>> On Tue, 2012-10-30 at 15:59 +0900, Namhyung Kim wrote:
>> > Yes, the callchain part needs to be improved. Peter's idea
>> > indeed looks good to me too.
>>
>> FWIW, I think this is exactly what sysprof does, except that
>> tool isn't usable for other reasons.. You might want to look
>> at it though.
>
> I always found the fundamental sysprof system-wide call graph
> profiling output/view superior - and so do many Xorg developers
> who are using SysProf that I talked to - so I'd strongly
> encourage to use that ordering and grouping for the default perf
> call-graph profiling output/view.

Okay, I'll look at the sysprof.

Anyway, do you have any other comments for the general --cumulate
approach in this series (esp. with --branch-stack)?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
* Peter Zijlstra wrote:

> On Tue, 2012-10-30 at 15:59 +0900, Namhyung Kim wrote:
> > Yes, the callchain part needs to be improved. Peter's idea
> > indeed looks good to me too.
>
> FWIW, I think this is exactly what sysprof does, except that
> tool isn't usable for other reasons.. You might want to look
> at it though.

I always found the fundamental sysprof system-wide call graph
profiling output/view superior - and so do many Xorg developers
who are using SysProf that I talked to - so I'd strongly
encourage to use that ordering and grouping for the default perf
call-graph profiling output/view.

Thanks,

	Ingo
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On Tue, 2012-10-30 at 15:59 +0900, Namhyung Kim wrote:
> Yes, the callchain part needs to be improved. Peter's idea indeed looks
> good to me too.

FWIW, I think this is exactly what sysprof does, except that tool isn't
usable for other reasons.. You might want to look at it though.
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
Hi Arun and Peter,

On Mon, 29 Oct 2012 14:36:01 -0700, Arun Sharma wrote:
> On 10/29/12 12:08 PM, Peter Zijlstra wrote:
>
>> Right, so I tried this and I would expect the callchains to be inverted
>> too, so that when I expand say 'c' I would see that 'c' calls 'b' for
>> 100% which calls 'a' for 100%.
>>
>> Instead I get the regular callchains, expanding 'c' gives me main calls
>> it for 100%.
>>
>> Adding -G (invert callchains) doesn't make it better, in that case, when
>> I expand 'c' we start at '__libc_start_main' instead of 'c'.
>>
>> Is there anything I'm missing?
>>
>
> Sounds like a reasonable expectation.
>
> I tested mainly:
>
> perf report --cumulate -g graph,100,callee
>
> to find the functions with a large amount of CPU time underneath. Then
> examined the callgraph without --cumulate. But yeah - it'd be nice to
> be able to do both in a single invocation.

Yes, the callchain part needs to be improved. Peter's idea indeed looks
good to me too.

But before doing that, I'd like to get agreement on how to
design/implement this feature. Sorry to Frederic (and Stephane), I'm
bothering you multiple times with this, but I didn't understand exactly
what you want. IIUC you don't want a separate --cumulate option but
want it to share the branch sampling code instead, right? But the branch
sampling output does not look like a good fit for the --cumulate usage,
IMHO. Could you give me some advice?

>
> Also, when callgraphs are displayed, the percentages are off (> 100%).
> Namhyung probably needs to use he->stat_acc->period in a few
> places as the denominator instead of he->period.

I will look into it later.

Thanks,
Namhyung
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On 10/29/12 12:08 PM, Peter Zijlstra wrote:

> Right, so I tried this and I would expect the callchains to be inverted
> too, so that when I expand say 'c' I would see that 'c' calls 'b' for
> 100% which calls 'a' for 100%.
>
> Instead I get the regular callchains, expanding 'c' gives me main calls
> it for 100%.
>
> Adding -G (invert callchains) doesn't make it better, in that case, when
> I expand 'c' we start at '__libc_start_main' instead of 'c'.
>
> Is there anything I'm missing?

Sounds like a reasonable expectation.

I tested mainly:

perf report --cumulate -g graph,100,callee

to find the functions with a large amount of CPU time underneath. Then
examined the callgraph without --cumulate. But yeah - it'd be nice to
be able to do both in a single invocation.

Also, when callgraphs are displayed, the percentages are off (> 100%).
Namhyung probably needs to use he->stat_acc->period in a few places as
the denominator instead of he->period.

 -Arun
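A minimal sketch of the denominator issue Arun describes, under stated assumptions: the struct names and the helper callchain_percent() below are hypothetical simplifications, not the code in the patchset. Only the field names stat_acc and period come from the thread. The point is that a callchain node's period can exceed the entry's self period once periods are accumulated, so dividing by the self period can yield more than 100% (or a division by zero when the self period is 0).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the structures discussed above. */
struct he_stat_sketch {
	unsigned long long period;
};

struct hist_entry_sketch {
	struct he_stat_sketch stat;       /* self period of the entry */
	struct he_stat_sketch *stat_acc;  /* accumulated period, NULL without --cumulate */
};

/*
 * Percentage of a callchain node relative to its hist entry.
 * Uses the accumulated period as the denominator when it exists,
 * and returns 0 instead of dividing by zero.
 */
static double callchain_percent(const struct hist_entry_sketch *he,
				unsigned long long node_period)
{
	unsigned long long total;

	assert(he != NULL);
	total = he->stat_acc ? he->stat_acc->period : he->stat.period;
	if (total == 0)
		return 0.0;
	return 100.0 * (double)node_period / (double)total;
}
```

With an entry such as 'main' in the abc example (self period 0, accumulated period equal to the whole chain), the accumulated denominator keeps every child node at or below 100%.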
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On Thu, 2012-09-13 at 16:19 +0900, Namhyung Kim wrote:
> When --cumulate option is given, it'll be shown like this:
>
>   $ perf report --cumulate
>   (...)
>   +  93.63%  abc  libc-2.15.so        [.] __libc_start_main
>   +  93.35%  abc  abc                 [.] main
>   +  93.35%  abc  abc                 [.] c
>   +  93.35%  abc  abc                 [.] b
>   +  93.35%  abc  abc                 [.] a
>   +   5.17%  abc  ld-2.15.so          [.] _dl_map_object
>   +   5.17%  abc  ld-2.15.so          [.] _dl_map_object_from_fd
>   +   1.13%  abc  ld-2.15.so          [.] _dl_start_user
>   +   1.13%  abc  ld-2.15.so          [.] _dl_start
>   +   0.29%  abc  perf                [.] main
>   +   0.29%  abc  perf                [.] run_builtin
>   +   0.29%  abc  perf                [.] cmd_record
>   +   0.29%  abc  libpthread-2.15.so  [.] __libc_close
>   +   0.07%  abc  ld-2.15.so          [.] _start
>   +   0.07%  abc  [kernel.kallsyms]   [k] page_fault
>
> (This output came from TUI since stdio bothered by callchains)

Right, so I tried this and I would expect the callchains to be inverted
too, so that when I expand say 'c' I would see that 'c' calls 'b' for
100% which calls 'a' for 100%.

Instead I get the regular callchains, expanding 'c' gives me main calls
it for 100%.

Adding -G (invert callchains) doesn't make it better, in that case, when
I expand 'c' we start at '__libc_start_main' instead of 'c'.

Is there anything I'm missing?
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On Fri, Sep 28, 2012 at 5:14 PM, Frederic Weisbecker wrote:
> On Fri, Sep 28, 2012 at 09:07:57AM +0200, Stephane Eranian wrote:
>> On Fri, Sep 28, 2012 at 7:49 AM, Namhyung Kim wrote:
>> > Hi Frederic,
>> >
>> > On Fri, 28 Sep 2012 01:01:48 +0200, Frederic Weisbecker wrote:
>> >> When Arun was working on this, I asked him to explore if it could
>> >> make sense to reuse the "-b, --branch-stack" perf report option.
>> >> Because after all, this feature is doing about the same as "-b"
>> >> except it's using callchains instead of full branch tracing.
>> >> But callchains are branches. Just a limited subset of all branches
>> >> taken on execution.
>> >> So you can probably reuse some interface and even ground code there.
>> >>
>> >> What do you think?
>> >
>> > Umm.. first of all, I'm not familiar with the branch stack thing. It's
>> > intel-specific, right?
>> >
>> The kernel API is NOT specific to Intel. It is abstracted to be portable
>> across architecture. The implementation only exists on certain Intel
>> X86 processors.
>>
>> > Also I don't understand what exactly you want here. What kind of
>> > interface did you say? Can you elaborate it bit more?
>> >
>> Not clear to me either.
>>
>> > And AFAIK branch stack can collect much more branch information than
>> > just callstacks. Can we differentiate which is which easily? Is there
>> > any limitation on using it? What if callstacks are not sync'ed with
>> > branch stacks - is it possible though?
>> >
>> First of all branch stack is not a branch tracing mechanism. This is a
>> branch sampling mechanism. Not all branches are captured. Only the
>> last N consecutive branches leading to a PMU interrupt are captured
>> in each sample.
>>
>> Yes, the branch stack mechanism as it exists on Intel processors
>> can capture more than call branches. It is HW based and provides
>> a branch type filter. Filtering capability is exposed at the API level
>> in a generic fashion. The hw filter is based on opcodes. Call branches
>> all cover call, syscall instructions. As such, the branch stack mechanism
>> cannot be used to capture callstacks to shared libraries, simply because
>> there is a non call instruction in the trampoline. To obtain a better
>> quality callstack you have instead to sample return branches. So yes,
>> callstacks are not sync'ed with branch stack even if limited to call
>> branches.
>>
>
> You're right. One doesn't simply sample callchains on top of branch
> tracing. Not easily at least.
> But that's not what we want here. We want the other way round: use
> callchains as branch sampling.
> And a callchain _is_ a branch sampling. Just a specialized one.
>
> PERF_SAMPLE_BRANCH_STACK either records only calls, only ret, or
> everything, or
> You can define the filter with "-j" option. Now callchains can be
> considered as the result of a specific "-j" filter option. It's just a
> high level filtering. ie: not just based on opcode types but on semantic
> post-processing. As if we applied a specific filter on a pure branch
> tracing that cancelled calls that had matching ret.
>
A callstack mode will be added to the PERF_SAMPLE_BRANCH_STACK generic
filter because this becomes available in HW starting with Haswell
(see Vol3b August 2012, section 17.8). This will still be a statistical
approach and not a complete callstack trace (only the last 16 calls).

So yes, you could piggyback your callstack on top of that. You could
return the full trace with the existing perf_branch_entry data
structure. You'd have to fill in the prediction flags as N/A.

But now with Haswell, one would have to decide whether to use the
'SW callstack' or the 'HW callstack'. It all depends on the quality of
the data returned by the HW callstack.

> But in the end, what we have is just branches. Some branch layout that
> is biased, that already passed through a semantic wheel, still it's
> just _branches_.
>
> Note I'm not arguing about adding a "-j callchain" option, just trying
> to show you that callchains are not really different from other
> filtered source of branch sampling.
>
>
>> > But I think it'd be good if the branch stack can be changed to call
>> > stack in general. Did you mean this?
>> >
>> That's not going to happen. The mechanism is much more generic than
>> that.
>>
>> Quite frankly, I don't understand Frederic's motivation here. The
>> mechanism are not quite the same.
>
> So, considering that callchains are just "branches", why can't we use
> them as a branch source, just like PERF_SAMPLE_BRANCH_STACK data
> samples, that we can reuse in "perf report -b".
>
> Look at commit b50311dc2ac1c04ad19163c2359910b25e16caf6
> "perf report: Add support for taken branch sampling". It's doing (except
> for a few details like the period weight of branch samples) the same as
> in Namhyung patch, just with PERF_SAMPLE_BRANCH_STACK instead of
> callchains.
>
> I don't understand what justifies this duplication.
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On Fri, Sep 28, 2012 at 02:49:55PM +0900, Namhyung Kim wrote:
> Hi Frederic,
>
> On Fri, 28 Sep 2012 01:01:48 +0200, Frederic Weisbecker wrote:
> > When Arun was working on this, I asked him to explore if it could make
> > sense to reuse the "-b, --branch-stack" perf report option. Because
> > after all, this feature is doing about the same as "-b" except it's
> > using callchains instead of full branch tracing.
> > But callchains are branches. Just a limited subset of all branches
> > taken on execution.
> > So you can probably reuse some interface and even ground code there.
> >
> > What do you think?
>
> Umm.. first of all, I'm not familiar with the branch stack thing. It's
> intel-specific, right?
>
> Also I don't understand what exactly you want here. What kind of
> interface did you say? Can you elaborate it bit more?

Look at commit b50311dc2ac1c04ad19163c2359910b25e16caf6
"perf report: Add support for taken branch sampling". It's doing almost
the same as you do, just using PERF_SAMPLE_BRANCH_STACK instead of
callchains.

> And AFAIK branch stack can collect much more branch information than
> just callstacks.

That's not a problem. Callchains are just a high-level filtered source
of branch samples. You don't need full branches to use "-b". Just use
the flavour of branch samples you want to make the sense you want on
your branch sampling.

> Can we differentiate which is which easily?

Sure. If you have both sources in your perf.data
(PERF_SAMPLE_BRANCH_STACK and callchains), ask the user which one he
wants. Otherwise default to what's there.

> Is there
> any limitation on using it? What if callstacks are not sync'ed with
> branch stacks - is it possible though?

It's better to make both sources mutually exclusive. Otherwise it's
going to be over-complicated.

>
> But I think it'd be good if the branch stack can be changed to call
> stack in general. Did you mean this?

That's different. We might be able to post-process branch tracing and
build a callchain on top of it (following calls and ret). Maybe we will
one day. But they are different issues altogether.

Thanks.
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On Fri, Sep 28, 2012 at 09:07:57AM +0200, Stephane Eranian wrote:
> On Fri, Sep 28, 2012 at 7:49 AM, Namhyung Kim wrote:
> > Hi Frederic,
> >
> > On Fri, 28 Sep 2012 01:01:48 +0200, Frederic Weisbecker wrote:
> >> When Arun was working on this, I asked him to explore if it could
> >> make sense to reuse the "-b, --branch-stack" perf report option.
> >> Because after all, this feature is doing about the same as "-b"
> >> except it's using callchains instead of full branch tracing.
> >> But callchains are branches. Just a limited subset of all branches
> >> taken on execution.
> >> So you can probably reuse some interface and even ground code there.
> >>
> >> What do you think?
> >
> > Umm.. first of all, I'm not familiar with the branch stack thing. It's
> > intel-specific, right?
> >
> The kernel API is NOT specific to Intel. It is abstracted to be portable
> across architecture. The implementation only exists on certain Intel
> X86 processors.
>
> > Also I don't understand what exactly you want here. What kind of
> > interface did you say? Can you elaborate it bit more?
> >
> Not clear to me either.
>
> > And AFAIK branch stack can collect much more branch information than
> > just callstacks. Can we differentiate which is which easily? Is there
> > any limitation on using it? What if callstacks are not sync'ed with
> > branch stacks - is it possible though?
> >
> First of all branch stack is not a branch tracing mechanism. This is a
> branch sampling mechanism. Not all branches are captured. Only the
> last N consecutive branches leading to a PMU interrupt are captured
> in each sample.
>
> Yes, the branch stack mechanism as it exists on Intel processors
> can capture more than call branches. It is HW based and provides
> a branch type filter. Filtering capability is exposed at the API level
> in a generic fashion. The hw filter is based on opcodes. Call branches
> all cover call, syscall instructions. As such, the branch stack mechanism
> cannot be used to capture callstacks to shared libraries, simply because
> there is a non call instruction in the trampoline. To obtain a better
> quality callstack you have instead to sample return branches. So yes,
> callstacks are not sync'ed with branch stack even if limited to call
> branches.
>

You're right. One doesn't simply sample callchains on top of branch
tracing. Not easily at least.
But that's not what we want here. We want the other way round: use
callchains as branch sampling.
And a callchain _is_ a branch sampling. Just a specialized one.

PERF_SAMPLE_BRANCH_STACK either records only calls, only ret, or
everything, or
You can define the filter with "-j" option. Now callchains can be
considered as the result of a specific "-j" filter option. It's just a
high level filtering. ie: not just based on opcode types but on semantic
post-processing. As if we applied a specific filter on a pure branch
tracing that cancelled calls that had matching ret.

But in the end, what we have is just branches. Some branch layout that
is biased, that already passed through a semantic wheel, still it's just
_branches_.

Note I'm not arguing about adding a "-j callchain" option, just trying
to show you that callchains are not really different from other filtered
source of branch sampling.

> > But I think it'd be good if the branch stack can be changed to call
> > stack in general. Did you mean this?
> >
> That's not going to happen. The mechanism is much more generic than
> that.
>
> Quite frankly, I don't understand Frederic's motivation here. The
> mechanism are not quite the same.

So, considering that callchains are just "branches", why can't we use
them as a branch source, just like PERF_SAMPLE_BRANCH_STACK data
samples, that we can reuse in "perf report -b".

Look at commit b50311dc2ac1c04ad19163c2359910b25e16caf6
"perf report: Add support for taken branch sampling". It's doing (except
for a few details like the period weight of branch samples) the same as
in Namhyung patch, just with PERF_SAMPLE_BRANCH_STACK instead of
callchains.

I don't understand what justifies this duplication.
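Frederic's view of a callchain as "a pure branch tracing that cancelled calls that had matching ret" can be sketched as a small stack machine. This is only an illustration under assumptions: the types and the function callchain_from_trace() are hypothetical and are not part of perf or the kernel API, and a real trace would carry addresses rather than symbol names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical branch record: kind of branch plus the call target's name. */
enum branch_kind { BR_CALL, BR_RET, BR_OTHER };

struct branch {
	enum branch_kind kind;
	const char *target;	/* only meaningful for BR_CALL */
};

/*
 * Walk a complete branch trace in execution order. Every ret cancels the
 * most recent unmatched call; jumps and conditional branches are ignored.
 * What remains in stack[] is the callchain, outermost caller first.
 * Returns the number of entries stored (at most max_depth).
 */
static size_t callchain_from_trace(const struct branch *trace, size_t n,
				   const char **stack, size_t max_depth)
{
	size_t depth = 0;	/* true call depth, may exceed max_depth */

	assert(stack != NULL || max_depth == 0);
	for (size_t i = 0; i < n; i++) {
		switch (trace[i].kind) {
		case BR_CALL:
			if (depth < max_depth)
				stack[depth] = trace[i].target;
			depth++;
			break;
		case BR_RET:
			if (depth > 0)
				depth--;
			break;
		case BR_OTHER:
			break;
		}
	}
	return depth < max_depth ? depth : max_depth;
}
```

The sketch needs the whole trace from program start. As Stephane notes in the quoted text, the hardware branch stack only keeps the last N branches per sample, so this post-processing cannot simply be run on PERF_SAMPLE_BRANCH_STACK data.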
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On Fri, Sep 28, 2012 at 7:49 AM, Namhyung Kim wrote:
> Hi Frederic,
>
> On Fri, 28 Sep 2012 01:01:48 +0200, Frederic Weisbecker wrote:
>> When Arun was working on this, I asked him to explore if it could make
>> sense to reuse the "-b, --branch-stack" perf report option. Because
>> after all, this feature is doing about the same as "-b" except it's
>> using callchains instead of full branch tracing.
>> But callchains are branches. Just a limited subset of all branches
>> taken on execution.
>> So you can probably reuse some interface and even ground code there.
>>
>> What do you think?
>
> Umm.. first of all, I'm not familiar with the branch stack thing. It's
> intel-specific, right?
>
The kernel API is NOT specific to Intel. It is abstracted to be portable
across architecture. The implementation only exists on certain Intel
X86 processors.

> Also I don't understand what exactly you want here. What kind of
> interface did you say? Can you elaborate it bit more?
>
Not clear to me either.

> And AFAIK branch stack can collect much more branch information than
> just callstacks. Can we differentiate which is which easily? Is there
> any limitation on using it? What if callstacks are not sync'ed with
> branch stacks - is it possible though?
>
First of all branch stack is not a branch tracing mechanism. This is a
branch sampling mechanism. Not all branches are captured. Only the
last N consecutive branches leading to a PMU interrupt are captured
in each sample.

Yes, the branch stack mechanism as it exists on Intel processors
can capture more than call branches. It is HW based and provides
a branch type filter. Filtering capability is exposed at the API level
in a generic fashion. The hw filter is based on opcodes. Call branches
all cover call, syscall instructions. As such, the branch stack mechanism
cannot be used to capture callstacks to shared libraries, simply because
there is a non call instruction in the trampoline. To obtain a better
quality callstack you have instead to sample return branches. So yes,
callstacks are not sync'ed with branch stack even if limited to call
branches.

> But I think it'd be good if the branch stack can be changed to call
> stack in general. Did you mean this?
>
That's not going to happen. The mechanism is much more generic than
that.

Quite frankly, I don't understand Frederic's motivation here. The
mechanism are not quite the same.
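The "last N consecutive branches" behaviour Stephane describes can be modelled as a fixed-size ring that is snapshotted at sample time. The following is a software illustration only, assuming a depth of 4; the names branch_ring, record_branch() and snapshot() are hypothetical and do not correspond to any kernel or hardware interface.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 4	/* assumed depth for the sketch; real hardware defines its own N */

struct branch_rec {
	unsigned long from;
	unsigned long to;
};

struct branch_ring {
	struct branch_rec slot[NR_SLOTS];
	unsigned long nr_recorded;	/* total branches seen so far */
};

/* Record one taken branch, overwriting the oldest slot once the ring is full. */
static void record_branch(struct branch_ring *r, unsigned long from,
			  unsigned long to)
{
	assert(r != NULL);
	r->slot[r->nr_recorded % NR_SLOTS] = (struct branch_rec){ from, to };
	r->nr_recorded++;
}

/*
 * Copy the surviving branches into out[], newest first, as one sample
 * would see them at interrupt time. Returns how many were copied.
 */
static size_t snapshot(const struct branch_ring *r, struct branch_rec *out)
{
	size_t n = r->nr_recorded < NR_SLOTS ? (size_t)r->nr_recorded : NR_SLOTS;

	for (size_t i = 0; i < n; i++)
		out[i] = r->slot[(r->nr_recorded - 1 - i) % NR_SLOTS];
	return n;
}
```

Anything older than the last NR_SLOTS branches is lost, which is why this is sampling rather than tracing.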
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
Hi Frederic,

On Fri, 28 Sep 2012 01:01:48 +0200, Frederic Weisbecker wrote:
> When Arun was working on this, I asked him to explore if it could make
> sense to reuse the "-b, --branch-stack" perf report option. Because
> after all, this feature is doing about the same as "-b" except it's
> using callchains instead of full branch tracing.
> But callchains are branches. Just a limited subset of all branches
> taken on execution.
> So you can probably reuse some interface and even ground code there.
>
> What do you think?

Umm.. first of all, I'm not familiar with the branch stack thing. It's
intel-specific, right?

Also I don't understand what exactly you want here. What kind of
interface did you mean? Can you elaborate a bit more?

And AFAIK branch stack can collect much more branch information than
just callstacks. Can we differentiate which is which easily? Is there
any limitation on using it? What if callstacks are not sync'ed with
branch stacks - is it possible though?

But I think it'd be good if the branch stack can be changed to call
stack in general. Did you mean this?

Thanks,
Namhyung
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On Tue, Sep 25, 2012 at 01:57:26PM +0900, Namhyung Kim wrote:
> Ping. Any comments for this?
>
> Arun, thanks for testing!
> Namhyung

When Arun was working on this, I asked him to explore if it could make
sense to reuse the "-b, --branch-stack" perf report option. Because
after all, this feature is doing about the same as "-b" except it's
using callchains instead of full branch tracing.
But callchains are branches. Just a limited subset of all branches taken
on execution.
So you can probably reuse some interface and even ground code there.

What do you think?
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
Ping. Any comments for this?

Arun, thanks for testing!
Namhyung


On Thu, 13 Sep 2012 16:19:56 +0900, Namhyung Kim wrote:
> Hi,
>
> This is my first attempt to implement cumulative hist period report.
> This work begins from Arun's SORT_INCLUSIVE patch [1] but I completely
> rewrote it from scratch.
>
> It basically adds period in a sample to every node in the callchain.
> A hist_entry now has additional fields to keep the cumulative
> period if --cumulate option is given on perf report.
>
> Let me show you an example:
>
>   $ cat abc.c
>   #define barrier() asm volatile("" ::: "memory")
>
>   void a(void)
>   {
>   	int i;
>
>   	for (i = 0; i < 100; i++)
>   		barrier();
>   }
>
>   void b(void)
>   {
>   	a();
>   }
>
>   void c(void)
>   {
>   	b();
>   }
>
>   int main(void)
>   {
>   	c();
>
>   	return 0;
>   }
>
> With this simple program I ran perf record and report:
>
>   $ perf record -g -e cycles:u ./abc
>   $ perf report -g none --stdio
>   [snip]
>   # Overhead  Command  Shared Object       Symbol
>   # ........  .......  ..................  ..........................
>   #
>       93.35%  abc      abc                 [.] a
>        5.17%  abc      ld-2.15.so          [.] _dl_map_object_from_fd
>        1.13%  abc      ld-2.15.so          [.] _dl_start
>        0.29%  abc      libpthread-2.15.so  [.] __libc_close
>        0.07%  abc      [kernel.kallsyms]   [k] page_fault
>        0.00%  abc      ld-2.15.so          [.] _start
>
> When --cumulate option is given, it'll be shown like this:
>
>   $ perf report --cumulate
>   (...)
>   +  93.63%  abc  libc-2.15.so        [.] __libc_start_main
>   +  93.35%  abc  abc                 [.] main
>   +  93.35%  abc  abc                 [.] c
>   +  93.35%  abc  abc                 [.] b
>   +  93.35%  abc  abc                 [.] a
>   +   5.17%  abc  ld-2.15.so          [.] _dl_map_object
>   +   5.17%  abc  ld-2.15.so          [.] _dl_map_object_from_fd
>   +   1.13%  abc  ld-2.15.so          [.] _dl_start_user
>   +   1.13%  abc  ld-2.15.so          [.] _dl_start
>   +   0.29%  abc  perf                [.] main
>   +   0.29%  abc  perf                [.] run_builtin
>   +   0.29%  abc  perf                [.] cmd_record
>   +   0.29%  abc  libpthread-2.15.so  [.] __libc_close
>   +   0.07%  abc  ld-2.15.so          [.] _start
>   +   0.07%  abc  [kernel.kallsyms]   [k] page_fault
>
> (This output came from TUI since stdio bothered by callchains)
>
> As you can see, the __libc_start_main -> main -> c -> b -> a callchain
> shows up in the output.
>
> It might have some rough edges or even bugs, but I really want to
> release it and get reviews. In fact I saw some very large percentage
> or 'inf' on some callchain nodes when expanding.
>
> It currently ignores samples that don't have symbol info when
> accumulating periods along the callchain. Otherwise it resulted in
> very strangely large output since every node in the callchain would be
> added into a single entry which has NULL dso/sym. Simply ignoring them
> solved the problem and I couldn't come up with a better solution.
>
> This patchset is based on current acme/perf/core + my small fixes [2],[3].
> You can also get this series on my tree at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git perf/cumulate-v1
>
> Any comments are welcome, thanks.
> Namhyung
>
> [1] https://lkml.org/lkml/2012/3/31/6
> [2] https://lkml.org/lkml/2012/9/11/546
> [3] https://lkml.org/lkml/2012/9/12/51
>
>
> Namhyung Kim (15):
>   perf hists: Add missing period_* fields when collapsing a hist entry
>   perf hists: Introduce struct he_stat
>   perf hists: Move he->stat.nr_events initialization to a template
>   perf hists: Convert hist entry functions to use struct he_stat
>   perf hists: Add more helpers for hist entry stat
>   perf hists: Add support for accumulated stat of hist entry
>   perf hists: Check if accumulated when adding a hist entry
>   perf callchain: Add a couple of callchain helpers
>   perf hists: Let add_hist_entry to make a hist entry template
>   perf hists: Accumulate hist entry stat based on the callchain
>   perf hists: Sort hist entries by accumulated period
>   perf ui/hist: Add support to accumulated hist stat
>   perf ui/browser: Add support to accumulated hist stat
>   perf ui/gtk: Add support to accumulated hist stat
>   perf report: Add --cumulate option
>
>  tools/perf/builtin-report.c    |   8 ++
>  tools/perf/ui/browsers/hists.c |  12 +-
>  tools/perf/ui/gtk/browser.c    |   5 +-
>  tools/perf/ui/hist.c           |  74 ++---
>  tools/perf/ui/stdio/hist.c     |   2 +-
>  tools/perf/util/callchain.c    |  15 +++
>  tools/perf/util/callchain.h    |  17 +++
>  tools/perf/util/hist.c         | 242 +
>  tools/perf/util/sort.h         |  17 ++-
>
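The core idea of the cover letter, adding a sample's period to every node in its callchain while skipping nodes without symbol info, can be sketched as follows. This is a toy model under assumptions, not the patchset's code: struct entry, struct hists_sketch, find_or_add() and add_sample() are hypothetical names, entries live in a flat array instead of perf's rbtree, and symbols are compared by name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 32	/* arbitrary limit for the sketch */

/* Hypothetical, simplified stand-in for a hist entry. */
struct entry {
	const char *sym;
	unsigned long long period;	/* self: samples that hit this symbol */
	unsigned long long period_acc;	/* cumulative: self plus callees */
};

struct hists_sketch {
	struct entry entries[MAX_ENTRIES];
	size_t nr;
};

/* Look up the entry for sym, creating a zeroed one if it does not exist. */
static struct entry *find_or_add(struct hists_sketch *h, const char *sym)
{
	struct entry *e;

	for (size_t i = 0; i < h->nr; i++)
		if (strcmp(h->entries[i].sym, sym) == 0)
			return &h->entries[i];

	assert(h->nr < MAX_ENTRIES);
	e = &h->entries[h->nr++];
	e->sym = sym;
	e->period = 0;
	e->period_acc = 0;
	return e;
}

/*
 * chain[0] is the sampled function, chain[n - 1] the outermost caller.
 * A NULL element stands for a callchain node without symbol info; such
 * nodes are skipped, as the cover letter describes.
 */
static void add_sample(struct hists_sketch *h, const char *const *chain,
		       size_t n, unsigned long long period)
{
	if (n > 0 && chain[0])
		find_or_add(h, chain[0])->period += period;

	for (size_t i = 0; i < n; i++) {
		if (!chain[i])
			continue;
		find_or_add(h, chain[i])->period_acc += period;
	}
}
```

For the abc example, one sample in a() with chain a, b, c, main gives all four entries the same cumulative period, which matches the identical 93.35% rows above. Note that this sketch would count a recursive function once per occurrence in the chain.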
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On 9/13/12 12:19 AM, Namhyung Kim wrote:

> Hi,
>
> This is my first attempt to implement cumulative hist period report.
> This work begins from Arun's SORT_INCLUSIVE patch [1] but I completely
> rewrote it from scratch.

Tested-by: Arun Sharma

Our typical use case:

perf record -g fp ./foo
perf report --stdio --cumulate -g graph,100,callee

 -Arun