Re: [PATCH v2] perf: libdw support for powerpc [ping]
On Tue, Jun 20, 2017 at 10:06:35PM -0300, Arnaldo Carvalho de Melo wrote: > Em Thu, Jun 15, 2017 at 01:16:32PM +0200, Mark Wielaard escreveu: > > On Thu, 2017-06-15 at 10:46 +0200, Milian Wolff wrote: > > > Just a quick question: Have you guys applied my recent patch: > > > > > > commit 5ea0416f51cc93436bbe497c62ab49fd9cb245b6 > > > Author: Milian Wolff > > > Date: Thu Jun 1 23:00:21 2017 +0200 > > > > > > perf report: Include partial stacks unwound with libdw > > > > > > So far the whole stack was thrown away when any error occurred before > > > the maximum stack depth was unwound. This is actually a very common > > > scenario though. The stacks that got unwound so far are still > > > interesting. This removes a large chunk of differences when comparing > > > perf script output for libunwind and libdw perf unwinding. > > > > > > If not, then this could explain the issue you are seeing. > > > > Thanks! No, I didn't have that patch (*) yet. It makes a huge > > difference. With that, Paolo's patch and the elfutils libdw powerpc64 > > fallback unwinder patch, it looks like I get user stack traces for > > everything now on ppc64le. > > Can I take that as a Tested-by: you? Sure. Tested-by: Mark Wielaard Thanks, Mark
Re: [PATCH v2] perf: libdw support for powerpc [ping]
On Thu, 2017-06-15 at 10:46 +0200, Milian Wolff wrote: > Just a quick question: Have you guys applied my recent patch: > > commit 5ea0416f51cc93436bbe497c62ab49fd9cb245b6 > Author: Milian Wolff > Date: Thu Jun 1 23:00:21 2017 +0200 > > perf report: Include partial stacks unwound with libdw > > So far the whole stack was thrown away when any error occurred before > the maximum stack depth was unwound. This is actually a very common > scenario though. The stacks that got unwound so far are still > interesting. This removes a large chunk of differences when comparing > perf script output for libunwind and libdw perf unwinding. > > If not, then this could explain the issue you are seeing. Thanks! No, I didn't have that patch (*) yet. It makes a huge difference. With that, Paolo's patch and the elfutils libdw powerpc64 fallback unwinder patch, it looks like I get user stack traces for everything now on ppc64le. Cheers, Mark (*) It just this one-liner, but what a difference that makes: --- a/tools/perf/util/unwind-libdw.c +++ b/tools/perf/util/unwind-libdw.c @@ -224,7 +224,7 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg, err = dwfl_getthread_frames(ui->dwfl, thread->tid, frame_callback, ui); - if (err && !ui->max_stack) + if (err && ui->max_stack != max_stack) err = 0; /*
Re: [PATCH v2] perf: libdw support for powerpc [ping]
Hi Ravi, On Mon, 2017-06-12 at 17:28 +0530, Ravi Bangoria wrote: > So, I tested this patch along with Mark's patch[1] on elfutils an looks > like it's not working. Steps on what I did: > > After applying Mark's patch on upstream elfutils: > > $ aclocal > $ autoheader > $ autoconf > $ automake --add-missing > $ ./configure > $ make > $ make install DESTDIR=/home/ravi/elfutils-git > > After applying your patch on upstream perf: > > $ make > $ ./perf record --call-graph=dwarf ls > $ LD_LIBRARY_PATH=/home/ravi/elfutils-git/usr/local/lib:\ > /home/ravi/elfutils-git/usr/local/lib/elfutils/:$LD_LIBRARY_PATH \ > ./perf script > > ls 44159 1800.878468: 191408 cycles:u: > > ls 44159 1800.878673: 419356 cycles:u: >8a97c hpte_need_flush > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) >835f4 flush_hash_page > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) >8acec hpte_need_flush > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) > 3468f4 ptep_clear_flush > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) > 328b10 wp_page_copy > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) > 32ebe4 do_wp_page > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) > 33434c __handle_mm_fault > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) > 335040 handle_mm_fault > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) >7bf94 do_page_fault > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) >1a4f8 handle_page_fault > (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux) > > ls 44159 1800.878961: 430876 cycles:u: > > ls 44159 1800.879195: 423785 cycles:u: > > ls 44159 1800.879360: 427359 cycles:u: > > Here I don't see userspace callchain getting unwound. Please let me know > if I'm doing anything wrong. I see the same on very short runs. But when doing a slightly longer run, even just using ls -lahR, which does some more work, then I do see user backtraces. They are still missing for some of the early samples though. It is as if there is a stack/memory address mismatch when the probe is "too early" in ld.so. Could you do a test run on some program that does some more work to see if you never get any user stack traces, or if you only not get them for some specific probes? Thanks, Mark
Re: [PATCH] perf/probe: Search both .eh_frame and .debug_frame sections for probe location
On Thu, 2015-09-24 at 10:56 +, 平松雅巳 / HIRAMATU,MASAMI wrote: > >I am not too familiar with the code so there might be a reason for > >setting and reusing the pf->cfi to do the search twice. But might it not > >be more clear to just store both pf->cfi_eh and pf->cfi_debug and then > >check both in call_probe_finder () with the dwarf_cfi_addrframe () call? > >Which is the only place I see actually using the cfi. > > Right, but since call_probe_finder can be called repeatedly on same binary, > we should keep pf->cfi for caching CFI too. Yes. I was suggesting to rename pf->cfi to pf->cfi_eh and add pf->cfi_debug, to make clear why there are two. > >BTW. Not really related to this patch since the following was already in > >the code, and is most likely always correct anyway: > > > >> + if (elf_section_by_name(elf, &ehdr, &shdr, ".eh_frame", NULL) && > >> + shdr.sh_type == SHT_PROGBITS) { > >> + pf->cfi = dwarf_getcfi_elf(elf); > > > >But that SHT_PROGBITS check is only necessary because of a bug in > >elfutils < 0.156. For 0.156+ dwarf_getcfi_elf () will properly return > >NULL in case you happen to be looking at a separate debug file that > >has .eh_frame as NOBITS. In theory this prevents getting the CFI if the > >file has stripped away the shdrs. Which is reasonable, there are > >probably also other things that rely on the shdrs. > > Ah, I had just wanted to avoid introducing new ifdefs. Yes, understandable. It is a weird corner case anyway. > > But dwarf_getcfi_elf > >is able to also get you the CFI with just the phdrs. > > Hmm, how can I make such binary? I can fix it, but we need a > testcase for that. eu-strip --strip-sections can create such binaries. But honestly I wouldn't bother. They are basically useless and nobody (should) do that. The only reason you might want to support them is for getting a backtrace anyway through the CFI. But you won't be able to get any other symbol or debug information from them. I only really mentioned it because I am embarrassed about the bug in elfutils. Just glad that it got fixed and the workaround isn't necessary anymore. But it probably is not worth it now to try to remove the workaround. There is no real benefit in this case. Cheers, Mark ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] perf/probe: Search both .eh_frame and .debug_frame sections for probe location
Hi Hemant, On Thu, 2015-09-24 at 07:46 +0530, Hemant Kumar wrote: > perf probe through debuginfo__find_probes() in util/probe-finder.c > checks for the functions' frame descriptions in either .eh_frame section > of an ELF or the .debug_frame. The check is based on whether either one > of these sections is present. But sometimes, it may happen that, > .eh_frame, even if present, may not be complete and may miss some > descriptions. Right. Depending on distro, toolchain defaults, arch, build flags, etc. CFI might be found in either .eh_frame and/or .debug_frame. To be sure you find the CFI covering an address you will always have to investigate both if available. I am not too familiar with the code so there might be a reason for setting and reusing the pf->cfi to do the search twice. But might it not be more clear to just store both pf->cfi_eh and pf->cfi_debug and then check both in call_probe_finder () with the dwarf_cfi_addrframe () call? Which is the only place I see actually using the cfi. BTW. Not really related to this patch since the following was already in the code, and is most likely always correct anyway: > + if (elf_section_by_name(elf, &ehdr, &shdr, ".eh_frame", NULL) && > + shdr.sh_type == SHT_PROGBITS) { > + pf->cfi = dwarf_getcfi_elf(elf); But that SHT_PROGBITS check is only necessary because of a bug in elfutils < 0.156. For 0.156+ dwarf_getcfi_elf () will properly return NULL in case you happen to be looking at a separate debug file that has .eh_frame as NOBITS. In theory this prevents getting the CFI if the file has stripped away the shdrs. Which is reasonable, there are probably also other things that rely on the shdrs. But dwarf_getcfi_elf is able to also get you the CFI with just the phdrs. Cheers, Mark ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 6/8] Use proper types for do_div
Hi Anton, On Wed, 2009-12-09 at 19:09 +0300, Anton Vorontsov wrote: > I think this should be fixed by this patch: > > http://sourceware.org/ml/systemtap/2009-q4/msg00800.html > > Can you try it? With this patch I see no new regressions on x86_64. Yes, that fixes everything. Sorry I didn't see that patch earlier. I see Wenji also tested it already. I have pushed it for you. Thanks, Mark ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 6/8] Use proper types for do_div
Hi Anton, On Sat, 2009-11-28 at 01:33 +0300, Anton Vorontsov wrote: > do_div accepts unsigned 64-bit integer type for dividend, signed types > would cause do_div's typecheck fail: > > stat-common.c: In function 'needed_space': > stat-common.c:50: error: comparison of distinct pointer types lacks a cast > ...same errors in time.c and tapset-timers.cxx's generated code... > > A fix for time.c is special, on ppc32 cycles_t is 32-bit, so technically > we don't need do_div, but since the whole _stp_gettimeofday_ns() operates > on 64-bit types we'd better be safe and use uint64_t for the math. > > Signed-off-by: Anton Vorontsov > --- > runtime/stat-common.c |8 > runtime/time.c|3 ++- > tapset-timers.cxx |2 +- > 3 files changed, 7 insertions(+), 6 deletions(-) I don't immediately see anything wrong with the code. But on x86_64 this does cause some regression test failures in testsuite/systemtap.maps. In particular the following tests now fail (they all pass with this one patch reverted): Running /home/mark/src/systemtap/testsuite/systemtap.maps/elision.exp ... FAIL: elision-1 FAIL: elision0 FAIL: elision1 FAIL: elision2 FAIL: elision3 Running /home/mark/src/systemtap/testsuite/systemtap.maps/ix.exp ... FAIL: systemtap.maps/ix.stp Running /home/mark/src/systemtap/testsuite/systemtap.maps/linear_large_neg.exp ... FAIL: systemtap.maps/linear_large_neg.stp Running /home/mark/src/systemtap/testsuite/systemtap.maps/linear_overunder.exp ... FAIL: systemtap.maps/linear_overunder.stp Running /home/mark/src/systemtap/testsuite/systemtap.maps/linear_under.exp ... FAIL: systemtap.maps/linear_under.stp Running /home/mark/src/systemtap/testsuite/systemtap.maps/log.exp ... FAIL: systemtap.maps/log.stp Running /home/mark/src/systemtap/testsuite/systemtap.maps/log_edge.exp ... FAIL: systemtap.maps/log_edge.stp Could you have a look? Thanks, Mark ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev