2014-08-15 (Fri), 21:51 +0200, Jiri Olsa:
> On Fri, Aug 15, 2014 at 10:57:14AM +0900, Namhyung Kim wrote:
> > Hi Jiri,
> >
> > 2014-08-14 (Thu), 16:10 +0200, Jiri Olsa:
> > > On Thu, Aug 14, 2014 at 03:01:40PM +0900, Namhyung Kim wrote:
> > >
> > > SNIP
> > >
> > > > However, with --children feature added, it now can show all callees of
> > > > the entry.  For example, "start_kernel" entry now can display it calls
> > > > rest_init and in turn cpu_idle and then cpuidle_idle_call (95.72%).
> > > >
> > > >      6.14%     0.00%  swapper  [kernel.kallsyms]  [k] start_kernel
> > > >                    |
> > > >                    --- start_kernel
> > > >                        rest_init
> > > >                        cpu_idle
> > > >                        |
> > > >                        |--97.52%-- cpuidle_idle_call
> > > >                        |           cpuidle_enter_tk
> > > >                        |           |
> > > >                        |           |--99.91%-- cpuidle_wrap_enter
> > > >                        |           |           cpuidle_enter
> > > >                        |           |           intel_idle
> > > >                        |            --0.09%-- [...]
> > > >                         --2.48%-- [...]
> > > >
> > > > Note that start_kernel has no self overhead - meaning that it never
> > > > get sampled by itself but constructs such a nice callgraph.  But,
> > > > sadly, if an entry has self overhead, callchain will get confused with
> > > > generated callchain (like above) and self callchains (which reversed
> > > > order) like the earlier example.
> > > >
> > > > To be consistent with other entries, I'd like to make it just to show
> > > > a single entry - itself - like below since it doesn't have callees
> > > > (children) at all.  But still use the whole callchain to construct
> > > > children entries (like the start_kernel) as usual.
> > > >
> > > >     40.53%    40.53%  swapper  [kernel.kallsyms]  [k] intel_idle
> > > >                    |
> > > >                    --- intel_idle
> > >
> > > I understand the consistency point, but I think we'd lose
> > > useful info by cutting this off
> > >
> > > I guess I can run 'report -g callee' to find out who called intel_idle
> > > instead.. but I would not need to if the callchain stays here
> >
> > Yeah, but current behavior intermixes caller-callchains and
> > callee-callchains together so adds confusion to users.  This is a
> > problem IMHO.
>
> hum, where is it callee/caller mixed? with following example:
>
> ---
> void c(void)
> {
> }
>
> void b(void)
> {
>         c();
> }
>
> void a(void)
> {
>         b();
> }
>
> int main(void)
> {
>         while(1) {
>                 a();
>                 b();
>                 c();
>         }
> }
> ---
>
> for 'c' the current code will display:
>
> -   43.74%    43.74%  t  t                  [.] c
>    - __libc_start_main
>       - 86.33% main
>            67.08% c
>          - 32.91% a
>               99.44% c
>             - 0.56% b
>                  c
>         13.67% c
>
> and with this patch:
>
> -   43.74%    43.74%  t  t                  [.] c
>      c
>
>
> The 'c' callchain is still in caller order. IMO we should
> keep whole callchain here.

The problem is not with pure self entries (self overhead = children
overhead) or pure cumulative entries (self overhead = 0).  It is with
mixed entries; please see the last two examples in the description of 0/3.

Thanks,
Namhyung