Re: perf backtraces off-by-1
On 8/28/12 9:34 AM, Peter Zijlstra wrote:
>> It used to look like this:
>> http://git.savannah.gnu.org/gitweb/?p=libunwind.git;a=commitdiff;h=92cc7fd78a5a79c4bb5f85bfb7d7fb025df9cd5a
>
> Hmm, that's not too bad, but a long stretch from pretty ;-) How would you 'encode' this in the perf callchain data?
>
>> These days we just look at dwarf augmentation string:
>> http://git.savannah.gnu.org/gitweb/?p=libunwind.git;a=blob;f=src/dwarf/Gfde.c;h=8659624b0320c514057861a259b6efe1b605bbf3;hb=HEAD#l189
>
> Right, except of course we don't have that in kernel..

The ip-- transformation could happen in user space. The kernel doesn't have to know any of this :)

-Arun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On 10/29/12 12:08 PM, Peter Zijlstra wrote:
> Right, so I tried this and I would expect the callchains to be inverted too, so that when I expand say 'c' I would see that 'c' calls 'b' for 100% which calls 'a' for 100%. Instead I get the regular callchains, expanding 'c' gives me main calls it for 100%.
>
> Adding -G (invert callchains) doesn't make it better, in that case, when I expand 'c' we start at '__libc_start_main' instead of 'c'.
>
> Is there anything I'm missing?

Sounds like a reasonable expectation. I tested mainly:

perf report --cumulate -g graph,100,callee

to find the functions with a large amount of CPU time underneath. Then examined the callgraph without --cumulate. But yeah - it'd be nice to be able to do both in a single invocation.

Also, when callgraphs are displayed, the percentages are off (> 100%). Namhyung probably needs to use he->stat_acc->period in a few places as the denominator instead of he->period.

-Arun
clock_gettime_ns
A couple of years ago Andy posted this patch series:

http://thread.gmane.org/gmane.linux.kernel/1233209/

These patches have been in use at Facebook for a couple of years and along with a vDSO implementation of thread_cpu_time(), they have proven useful for our profilers.

I didn't see any arguments against this patch series. Did I miss some discussion on the topic?

-Arun
Re: clock_gettime_ns
On 9/5/13 12:47 AM, John Stultz wrote:
> If we're going to add a new interface that uses something other than a timespec, we likely need to put some serious thought into that new type, and see how it could be used across a number of syscalls. Some of the discussion around dealing with the 2038 issue touched on this.

[ I know you're not asking for perf data, but it may be useful for new readers ]

Here's the benchmarking I did in 2011:

http://thread.gmane.org/gmane.linux.kernel/1233758/focus=1233781

Switching from timespec to s64 was worth 21%. My experience over the years is that this performance delta causes userspace guys to implement their own TSC based timers, against the advice from kernel developers.

http://code.ohloh.net/search?s=wall%20now%20tsc%20hz&pp=0&fl=C&fl=C%2B%2B&ff=1&mp=1&ml=1&me=1&md=1&filterChecked=true

I worry that trying to solve other clock problems will cause the kernel to continue to pass the time in memory instead of registers, giving the userspace TSC based implementations a reason to exist.

-Arun
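To make the "memory instead of registers" point concrete for new readers, here is a minimal sketch of the interface shape being discussed: one signed 64-bit nanosecond count returned by value, rather than a struct timespec filled in through a pointer. The name clock_gettime_ns_sketch is hypothetical, and since this version simply wraps the existing clock_gettime() it shows only the calling convention and has none of the speedup measured above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical userspace stand-in for the proposed call (assumption:
 * the real interface would return the time as a single s64).
 * Returns nanoseconds on the given clock, or -1 on failure.
 */
static int64_t clock_gettime_ns_sketch(clockid_t clk)
{
	struct timespec ts;

	if (clock_gettime(clk, &ts) != 0)
		return -1;

	/* tv_nsec is always in [0, 1e9), so this cannot lose precision. */
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
	return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}
```

A profiler would then compute an interval as a plain subtraction of two such values, with no tv_sec/tv_nsec carry handling.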
Re: [RFC v2] Support volatile range for anon vma
On Wed, Oct 31, 2012 at 06:56:05PM -0400, KOSAKI Motohiro wrote:
> glibc malloc discard freed memory by using MADV_DONTNEED
> as tcmalloc. and it is often a source of large performance decrease.
> because of MADV_DONTNEED discard memory immediately and
> right after malloc() call fall into page fault and pagesize memset() path.
> then, using DONTNEED increased zero fill and cache miss rate.

The memcg based solution that I posted a few months ago is working well for us. We see significantly less CPU spent zeroing pages.

Not everyone was comfortable with the security implications of recycling pages between processes in a memcg, although it was disabled by default and had to be explicitly opted-in.

Also, memory allocators have a second motivation in using madvise: to create virtually contiguous regions of memory from a fragmented address space, without increasing the RSS.

-Arun
Re: [RFC v2] Support volatile range for anon vma
On 11/5/12 5:49 PM, Minchan Kim wrote:
>> Also, memory allocators have a second motivation in using madvise: to create virtually contiguous regions of memory from a fragmented address space, without increasing the RSS.
>
> I don't get it. How do we create a contiguous region by madvise? Just out of curiosity. Could you elaborate on that use case? :)

By using a new anonymous map and faulting pages in. The fragmented virtual memory is released via MADV_DONTNEED and if the malloc/free activity on the system is dominated by one process, chances are that the newly faulted in page is the one released by the same process :)

The net effect is that physical pages within a single address space are rearranged so larger allocations can be satisfied.

-Arun
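For readers unfamiliar with the mechanism, the following is a minimal, Linux-specific sketch of the two behaviours this thread relies on: MADV_DONTNEED drops the physical page behind a private anonymous mapping, and the next access faults in a fresh zero-filled page. The function name is hypothetical and this is not code from any allocator; it only demonstrates the primitive.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 1 if a dirtied anonymous page reads back as
 * zero after MADV_DONTNEED, 0 if it does not, and -1 on error.
 */
static int dontneed_zero_fills(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int zeroed;

	if (page <= 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Fault the page in and dirty it, as a live allocation would. */
	memset(p, 0xab, (size_t)page);

	/* Release the physical page; the virtual range stays mapped. */
	if (madvise(p, (size_t)page, MADV_DONTNEED) != 0) {
		munmap(p, (size_t)page);
		return -1;
	}

	/* This access takes a page fault and gets a zero-filled page. */
	zeroed = (p[0] == 0 && p[page - 1] == 0);

	assert(munmap(p, (size_t)page) == 0);
	return zeroed;
}
```

The zero fill on the final access is the cost discussed earlier in the thread; the fact that the virtual range survives while RSS drops is what lets an allocator trade fragmented ranges for a new contiguous one.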
Re: [RFC v4 0/3] Support volatile for anonymous range
On 12/17/12 10:47 PM, Minchan Kim wrote:
> I hope more inputs from user-space allocator people and test patch with their allocator because it might need design change of arena management for getting real value.

jemalloc knows how to handle MADV_FREE on platforms that support it. This looks similar (we'll need a SIGBUS handler that does the right thing = zero the page + mark it as non-volatile in the common case).

All of this of course assumes that apps madvise the kernel through APIs exposed by the malloc implementation - not via a raw syscall.

In other words, some new user space code needs to be written to test this out fully. Sounds feasible though.

-Arun
perf backtraces off-by-1
Some of our language runtimes like to map IP addresses in a perf backtrace to specific byte codes. The way things stand now, the addresses on the backtrace are return addresses, rather than the addresses of the calling instructions.

I think this issue may be present for other unusual call/return sequences where the user may be more interested in the calling instruction rather than the instruction control flow would return to.

A simple hack such as the one below makes our JIT guys happy. But the code is not right if there was an asynchronous transfer of control (eg: signal handler or interrupt).

libunwind contains similar code, but has the additional info in the unwind information to recognize async control transfer.

Wondering if this has been discussed before. One option is to support this for user mode only, with code to detect signal frames. Any other ideas?

-Arun

--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -296,6 +296,7 @@ int machine__resolve_callchain(struct machine *self, struct perf_evsel *evsel,
 	u8 cpumode = PERF_RECORD_MISC_USER;
 	unsigned int i;
 	int err;
+	int async;
 	callchain_cursor_reset(&evsel->hists.callchain_cursor);
@@ -322,6 +323,11 @@ int machine__resolve_callchain(struct machine *self, struct perf_evsel *evsel,
 			continue;
 		}
+		/* XXX: check if this was an async control transfer */
+		async = 0;
+		if (!async) {
+			ip--;
+		}
 		al.filtered = false;
 		thread__find_addr_location(thread, self, cpumode, MAP__FUNCTION, ip, &al, NULL);
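The adjustment in the hack above can be isolated into a small helper, sketched here under stated assumptions: the function name and the way the async flag is obtained are hypothetical, and detecting an asynchronous transfer (signal frame, interrupt) is exactly the part the patch leaves as an XXX. The idea is only that a return address points just past the call instruction, so backing up by one byte lands inside the call and resolves to the calling line or byte code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of perf: choose the address used for
 * symbol or byte code lookup for one callchain entry.
 *
 * ip     - address recorded in the callchain (a return address for
 *          ordinary call frames)
 * async  - true if control reached this frame asynchronously, in which
 *          case ip is the interrupted instruction and must not change
 */
static uint64_t callsite_lookup_ip(uint64_t ip, bool async)
{
	if (async || ip == 0)
		return ip;

	/* Any address inside the call instruction is good enough. */
	assert(ip > 0);
	return ip - 1;
}
```

Because the result is only used for lookup, subtracting one byte is sufficient even on architectures with variable-length instructions; there is no need to know the exact length of the call.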
Re: perf backtraces off-by-1
On 8/26/12 9:10 AM, Peter Zijlstra wrote:
> On Fri, 2012-08-24 at 15:13 -0700, Arun Sharma wrote:
>> One option is to support this for user mode only, with code to detect signal frames. Any other ideas?
>
> I guess we'd need to see what that patch would look like... :-)

It used to look like this:

http://git.savannah.gnu.org/gitweb/?p=libunwind.git;a=commitdiff;h=92cc7fd78a5a79c4bb5f85bfb7d7fb025df9cd5a

These days we just look at dwarf augmentation string:

http://git.savannah.gnu.org/gitweb/?p=libunwind.git;a=blob;f=src/dwarf/Gfde.c;h=8659624b0320c514057861a259b6efe1b605bbf3;hb=HEAD#l189

-Arun
Re: [RFC/PATCHSET 00/15] perf report: Add support to accumulate hist periods
On 9/13/12 12:19 AM, Namhyung Kim wrote:
> Hi,
>
> This is my first attempt to implement cumulative hist period report. This work begins from Arun's SORT_INCLUSIVE patch [1] but I completely rewrote it from scratch.

Tested-by: Arun Sharma

Our typical use case:

perf record -g fp ./foo
perf report --stdio --cumulate -g graph,100,callee

-Arun
Re: [PATCH] perf: Add a new sort order: SORT_INCLUSIVE (v6)
On 3/30/12 10:43 PM, Arun Sharma wrote:
> [ Meant to include v6 ChangeLog as well. Technical difficulties.. ]
>
> v6 ChangeLog:
>
> rebased to tip:perf/core and fixed a minor problem in computing the total period in hists__remove_entry_filter(). Needed to use period_self instead of period.

This patch breaks perf top (symptom: percentages > 100%). Fixed by the following patch.

Namhyung: if you're still working on forward porting this, please add this fix to your queue.

-Arun

commit 75a1c409a529c9741f8a2f493868d1fc7ce7e06d
Author: Arun Sharma
Date: Wed Aug 8 11:47:02 2012 -0700

    perf: update period_self as well on collapsing

    When running perf top, we have a series of incoming samples, which get aggregated in various user specified ways.

    Suppose function "foo" had the following samples:

    101, 103, 99, 105, ...

    ->period for the corresponding entry looks as follows:

    101, 204, 303, 408, ...

    However, due to this bug, ->period_self contains:

    101, 103, 99, 105, ...

    and therefore breaks the invariant period == period_self in the default mode (no sort inclusive).

    Since total_period is computed by summing up period_self, period/total_period can be > 100%

    Fix the bug by updating period_self as well.

    Signed-off-by: Arun Sharma

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index a2a8d91..adc891e 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -462,6 +462,7 @@ static bool hists__collapse_insert_entry(struct hists *hists,
 		if (!cmp) {
 			iter->period += he->period;
+			iter->period_self += he->period_self;
 			iter->nr_events += he->nr_events;
 			if (symbol_conf.use_callchain) {
 				callchain_cursor_reset(&hists->callchain_cursor);
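The invariant in the commit message can be checked with a standalone model. This is a sketch only: struct entry_sketch and both helpers are hypothetical simplifications of perf's hist entries, keeping just the three counters the patch touches. It replays the samples 101, 103, 99, 105 from the commit message and shows that, once period_self is accumulated alongside period, the entry's share of the total cannot exceed 100%.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for perf's struct hist_entry. */
struct entry_sketch {
	uint64_t period;      /* time attributed to this entry */
	uint64_t period_self; /* time spent in the entry itself */
	uint64_t nr_events;   /* number of samples merged in */
};

/* Merge a new sample 'he' into the existing entry 'iter'. */
static void collapse_into(struct entry_sketch *iter,
			  const struct entry_sketch *he)
{
	iter->period += he->period;
	iter->period_self += he->period_self; /* the line the fix adds */
	iter->nr_events += he->nr_events;
}

/* Percentage as displayed: period relative to the summed period_self. */
static double percent_of_total(const struct entry_sketch *e,
			       uint64_t total_period)
{
	assert(e != 0);
	if (total_period == 0)
		return 0.0;
	return 100.0 * (double)e->period / (double)total_period;
}
```

Without the period_self line, the same replay would leave period at 408 and period_self at 101, so the displayed share would be about 404%, which matches the symptom reported for perf top.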
Re: [PATCH] perf: Add a new sort order: SORT_INCLUSIVE (v6)
On 8/8/12 12:16 PM, Arun Sharma wrote:
> and therefore breaks the invariant period == period_self in the default mode (no sort inclusive).

hist_entry__decay() also needs an update to maintain the invariant.

--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -138,6 +138,7 @@ static void hist_entry__add_cpumode_period(struct hist_entry *he,
 static void hist_entry__decay(struct hist_entry *he)
 {
 	he->period = (he->period * 7) / 8;
+	he->period_self = (he->period_self * 7) / 8;
 	he->nr_events = (he->nr_events * 7) / 8;
 }

-Arun
Re: [RFC][PATCH] perf: Add a few generic stalled-cycles events
On 10/15/12 8:55 AM, Robert Richter wrote:
> [..]
> Perf tool works then out-of-the-box with:
>
> $ perf record -e cpu/stalled-cycles-fixed-point/ ...
>
> The event string can easily be reused by other architectures as a quasi standard.

I like Robert's proposal better. It's hard to model all the stall events (eg: instruction decoder related stalls on x86) in a hardware independent way.

Another area to think about: software engineers are generally busy and have a limited amount of time to devote to hardware event based optimizations. The most common question I hear is: what is the expected perf gain if I fix this? It's hard to answer that with just the stall events.

-Arun
Re: [PATCHSET 00/24] perf tools: Add support to accumulate hist periods (v6)
On 1/22/14 5:20 AM, Jiri Olsa wrote:
> I have changes on top of this patchset and all looks great, I was just going through this again and wanted to send my ack, but it no longer merges to the acme's perf/core.
>
> Could you please send updated version, and I'll finish the review.. I promise ;-)

We've been testing v5 of this patch series and haven't found any major problems on our end.

Tested-By: Arun Sharma

-Arun
Re: State of "perf: Add a new sort order: SORT_INCLUSIVE"
On 10/28/13 2:29 AM, Rodrigo Campos wrote:
> On Mon, Oct 28, 2013 at 06:09:30PM +0900, Namhyung Kim wrote:
>> On Mon, 28 Oct 2013 08:42:44 +, Rodrigo Campos wrote:
>>> On Mon, Oct 28, 2013 at 02:09:49PM +0900, Namhyung Kim wrote:
>>>> Anyway, You can find the series and discussion on the link below:
>>>>
>>>> https://lkml.org/lkml/2012/9/13/81
>>>
>>> I've read the cover letter for that series and probably because I don't know about perf internals I have a question: How will "--cumulate" interact with "--sort=dso" for example ? I mean, is it possible for that to show more than 100% ? (if you add all the 93.35% in your example in the cover letter, or something similar). Or "--cumulate --sort=dso" will just group together all entries that have a dso in the call chain ?
>>
>> Hmm.. I think --cumulate option is only meaningful when sort order includes symbol. Maybe I can add support for --sort=dso case as well but not sure it's worth. Do you think it's really needed?
>
> I don't know if it is *needed*, but that was what I need :)

I suspect that users will find creative ways of using these options to solve real world problems and we shouldn't restrict usage any more than we need to to protect against obvious bugs/crashes.

Also, what's the reasoning for --cumulate not being an option under perf record -g ..,?

In order to integrate perf record -b and --cumulate, we'll have to sort out the underlying infrastructure for processing callgraphs and branch stacks. I think the main roadblock here is that one is statistical and on many CPUs incomplete (only top N branches are reported).

Given that there are clear use cases in production involving complex callgraphs, I'm for getting this support in first and then reconciling the differences with perf record -b later.

-Arun
Re: State of "perf: Add a new sort order: SORT_INCLUSIVE"
On 10/28/13 8:11 PM, Namhyung Kim wrote:

Hey Namhyung:

>> Also, what's the reasoning for --cumulate not being an option under perf record -g ..,?
>
> Sorry, I cannot understand you. The 'perf record' just saves sample data (and callchains) from the ring-buffer. All the processing happens in 'perf report'. I can't see what you expect from the 'perf record --cumulate'. Am I missing something?

Yes - I meant to say perf report -g :)

> -g [type,min[,limit],order]

Specifically, along with callee, caller, we could have a third option. Or we could have a new type (graph, fractal, cumulative).

>> Given that there are clear use cases in production involving complex callgraphs, I'm for getting this support in first and then reconciling the differences with perf record -b later.
>
> I think what Frederic said is that the code de-duplication of 'perf report' side. The branch stack and --cumulate are different - branch stack concentrates on the branch itself but --cumulate uses callchains to find parents and give some credit to them as side information.

Me too. I brought it up with Stephane at some point in the last year or so and there wasn't an obvious way to de-duplicate because of these differences.

-Arun
Re: [PATCH 2/2] perf callchain: Use global caching provided by libunwind
On 9/23/14, 12:00 PM, Namhyung Kim wrote:
> + unw_set_caching_policy(addr_space, UNW_CACHE_GLOBAL);

The result is a bit surprising for me. In micro benchmarking (eg: Lperf-simple), the per-thread policy is generally faster because it doesn't involve locking.

libunwind/tests/Lperf-simple
unw_getcontext : cold avg= 109.673 nsec, warm avg= 28.610 nsec
unw_init_local : cold avg= 259.876 nsec, warm avg= 9.537 nsec

no cache:
unw_step : 1st= 3258.387 min= 2922.331 avg= 3002.384 nsec

global cache:
unw_step : 1st= 1192.093 min= 960.486 avg= 982.208 nsec

per-thread cache:
unw_step : 1st= 429.153 min= 113.533 avg= 121.762 nsec

I can see how the global policy would involve less memory allocation because of shared data structures. Curious about the reason for the speedup (specifically if libunwind should change the defaults for the non-local unwinding case).

-Arun
Re: [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v9)
On 3/20/14, 11:06 AM, Namhyung Kim wrote:
> Hello,
>
> This is a new attempt to implement cumulative hist period report. This work begins from Arun's SORT_INCLUSIVE patch [1] but I completely rewrote it from scratch.

While testing this patch series, we found error messages which look like this:

Out of bounds address found:
Addr: 10370
DSO: /usr/local/lib/libgcc_s.so.1 d
Map: 7f1b0c953000-7f1b0c968000
Symbol: 102d0-102e9 g _Unwind_DeleteException
Arch: x86_64
Kernel: 3.10.23+
Tools: 3.13.rc1.g374a4d

Not all samples will be on the annotation output.

Please report to linux-kernel@vger.kernel.org

I first suspected it to be caused by this patch series, but I'm able to reproduce without these patches as of this commit:

a51e87c perf tools: Remove unused simple_strtoul() function

gdb attributes 0x10370 to a different/known symbol.

(gdb) x /i 0x10370
0x10370 : cmp $0x4c,%dl

Is this known? Could this possibly be caused by stale histogram entries from unmapped/remapped shared libs?

-Arun
Re: [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v9)
On 4/1/14, 12:58 PM, Namhyung Kim wrote:
>> gdb attributes 0x10370 to a different/known symbol.
>>
>> (gdb) x /i 0x10370
>> 0x10370 : cmp $0x4c,%dl
>>
>> Is this known? Could this possibly be caused by stale histogram entries from unmapped/remapped shared libs?
>
> Possibly. Anyway the addr which perf reported is a mapped address so that it's pointless to use the addr directly - it's 7f1b0c963370 in fact.

Right - that's the address I'd use if the process in question is still running. But gdb followed by relative addresses could still tell us what the right symbol was?

> What was the exact command line though - did you use any filter (--comms, --dsos, --symbols) or event modifiers? Those are other possible culprits since the map searching code was touched by recent changes.

There were no other filters. The command used was just "perf top".

> I'm not able to reproduce the problem on my machine. It'd be great if you could bisect or let me know how to reproduce it easily.

I don't have a solid repro either. It involves building a binary, running "perf top" and waiting for a few minutes until that warning popup appears.

Will try to git bisect and figure out potential culprits.

-Arun
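The relationship between the two addresses in this exchange can be written down as a short sketch. The struct and function names below are hypothetical and only loosely modelled on how a profiler keeps maps and symbols; the page offset of the mapping is assumed to be 0 because the warning does not print it, and the symbol range is treated as half-open. With the numbers from the warning, the runtime address 7f1b0c963370 in the map starting at 7f1b0c953000 becomes the DSO-relative address 10370, which falls outside the 102d0-102e9 range of _Unwind_DeleteException.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one mmap'ed region of a DSO. */
struct map_sketch {
	uint64_t start; /* first runtime address of the mapping */
	uint64_t end;   /* one past the last runtime address */
	uint64_t pgoff; /* file offset of the mapping (assumed 0 here) */
};

/* Hypothetical symbol range in DSO-relative addresses. */
struct sym_sketch {
	uint64_t start;
	uint64_t end;
};

/* Convert a runtime address inside 'm' to a DSO-relative address. */
static uint64_t map_relative(const struct map_sketch *m, uint64_t addr)
{
	assert(addr >= m->start && addr < m->end);
	return addr - m->start + m->pgoff;
}

/* True if the DSO-relative address lies inside the symbol. */
static bool sym_contains(const struct sym_sketch *s, uint64_t rel)
{
	return rel >= s->start && rel < s->end;
}
```

A failed containment check of this kind is what the "Out of bounds address found" message reports: the sample was attached to a symbol whose range does not cover the sample's relative address.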
Re: [Libunwind-devel] [RFC PATCH 0/3] Add support for dwarf compat mode unwinding
On Mon, Feb 3, 2014 at 7:28 AM, Jean Pihet wrote:
>> Something like ./configure --target=arm on aarch64.
>
> Thanks for the link and info.
>
> Is there a concrete example of cross-unwinding with multiple targets,
> for example on x86_64 using native and x86_32 libunwind libraries
> simultaneously?
> I am trying to assess the impact of multiple unwinding libs in the perf code.

Might want to check with folks who worked on Frysk. There was some criticism that it was non-trivial, but it's been done.

http://lists.nongnu.org/archive/html/libunwind-devel/2007-05/msg6.html

-Arun
Re: [RFC/PATCHSET 00/18] perf report: Add support to accumulate hist periods (v3)
On 12/18/13 3:16 PM, Ingo Molnar wrote:
> My main complaint that any variation of 'cumulative' or 'cumulate' is a tongue-twister to users. I certainly won't be able to remember it and will have to call up the manpage every time I use it - which will be very annoying. I'd probably not use the feature much.

I can remember it, mainly because it's such an unusual word :) Agree that we could use something simpler/easier to remember.

My other user space projects have been keeping me busy. But still very interested in this feature. Will test this patch series over the holidays.

-Arun
Re: [RFC/PATCHSET 00/18] perf report: Add support to accumulate hist periods (v3)
On 12/18/13 8:09 PM, Namhyung Kim wrote:
> Hi Arun,
>
> 2013-12-18 (Wed), 16:08 +0530, Arun Sharma:
>> On 12/18/13 3:16 PM, Ingo Molnar wrote:
>>> My main complaint that any variation of 'cumulative' or 'cumulate' is a tongue-twister to users. I certainly won't be able to remember it and will have to call up the manpage every time I use it - which will be very annoying. I'd probably not use the feature much.
>>
>> I can remember it, mainly because it's such an unusual word :) Agree that we could use something simpler/easier to remember.
>>
>> My other user space projects have been keeping me busy. But still very interested in this feature. Will test this patch series over the holidays.
>
> Thanks in advance for your test and feedback! :)

Looks great! One of the features I'm missing compared to my earlier --sort inclusive series of patches is that, in the --children mode, I'd still like to see callchains (at least in --tui). Looks like --children and -G -s pid can't be used together in this implementation.

-Arun