Re: [PATCH v2] perf: libdw support for powerpc [ping]

2017-06-21 Thread Mark Wielaard
On Tue, Jun 20, 2017 at 10:06:35PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 15, 2017 at 01:16:32PM +0200, Mark Wielaard escreveu:
> > On Thu, 2017-06-15 at 10:46 +0200, Milian Wolff wrote:
> > > Just a quick question: Have you guys applied my recent patch:
> > > 
> > > commit 5ea0416f51cc93436bbe497c62ab49fd9cb245b6
> > > Author: Milian Wolff 
> > > Date:   Thu Jun 1 23:00:21 2017 +0200
> > > 
> > > perf report: Include partial stacks unwound with libdw
> > > 
> > > So far the whole stack was thrown away when any error occurred before
> > > the maximum stack depth was unwound. This is actually a very common
> > > scenario though. The stacks that got unwound so far are still
> > > interesting. This removes a large chunk of differences when comparing
> > > perf script output for libunwind and libdw perf unwinding.
> > > 
> > > If not, then this could explain the issue you are seeing.
> > 
> > Thanks! No, I didn't have that patch (*) yet. It makes a huge
> > difference. With that, Paolo's patch and the elfutils libdw powerpc64
> > fallback unwinder patch, it looks like I get user stack traces for
> > everything now on ppc64le.
> 
> Can I take that as a Tested-by: you?

Sure.
Tested-by: Mark Wielaard 

Thanks,

Mark


Re: [PATCH v2] perf: libdw support for powerpc [ping]

2017-06-15 Thread Mark Wielaard
On Thu, 2017-06-15 at 10:46 +0200, Milian Wolff wrote:
> Just a quick question: Have you guys applied my recent patch:
> 
> commit 5ea0416f51cc93436bbe497c62ab49fd9cb245b6
> Author: Milian Wolff 
> Date:   Thu Jun 1 23:00:21 2017 +0200
> 
> perf report: Include partial stacks unwound with libdw
> 
> So far the whole stack was thrown away when any error occurred before
> the maximum stack depth was unwound. This is actually a very common
> scenario though. The stacks that got unwound so far are still
> interesting. This removes a large chunk of differences when comparing
> perf script output for libunwind and libdw perf unwinding.
> 
> If not, then this could explain the issue you are seeing.

Thanks! No, I didn't have that patch (*) yet. It makes a huge
difference. With that, Paolo's patch and the elfutils libdw powerpc64
fallback unwinder patch, it looks like I get user stack traces for
everything now on ppc64le.

Cheers,

Mark

(*) It just this one-liner, but what a difference that makes:

--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -224,7 +224,7 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 
err = dwfl_getthread_frames(ui->dwfl, thread->tid, frame_callback, ui);
 
-   if (err && !ui->max_stack)
+   if (err && ui->max_stack != max_stack)
err = 0;
 
/*



Re: [PATCH v2] perf: libdw support for powerpc [ping]

2017-06-13 Thread Mark Wielaard
Hi Ravi,

On Mon, 2017-06-12 at 17:28 +0530, Ravi Bangoria wrote:
> So, I tested this patch along with Mark's patch[1] on elfutils an looks
> like it's not working. Steps on what I did:
> 
> After applying Mark's patch on upstream elfutils:
> 
>   $ aclocal
>   $ autoheader
>   $ autoconf
>   $ automake --add-missing
>   $ ./configure
>   $ make
>   $ make install DESTDIR=/home/ravi/elfutils-git
> 
> After applying your patch on upstream perf:
> 
>   $ make
>   $ ./perf record --call-graph=dwarf ls
>   $ LD_LIBRARY_PATH=/home/ravi/elfutils-git/usr/local/lib:\
> /home/ravi/elfutils-git/usr/local/lib/elfutils/:$LD_LIBRARY_PATH \
> ./perf script
> 
> ls 44159  1800.878468: 191408 cycles:u:
> 
> ls 44159  1800.878673: 419356 cycles:u:
>8a97c hpte_need_flush 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
>835f4 flush_hash_page 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
>8acec hpte_need_flush 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
>   3468f4 ptep_clear_flush 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
>   328b10 wp_page_copy 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
>   32ebe4 do_wp_page 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
>   33434c __handle_mm_fault 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
>   335040 handle_mm_fault 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
>7bf94 do_page_fault 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
>1a4f8 handle_page_fault 
> (/usr/lib/debug/lib/modules/4.11.0-3.el7.ppc64le/vmlinux)
> 
> ls 44159  1800.878961: 430876 cycles:u:
> 
> ls 44159  1800.879195: 423785 cycles:u:
> 
> ls 44159  1800.879360: 427359 cycles:u:
> 
> Here I don't see userspace callchain getting unwound. Please let me know
> if I'm doing anything wrong.

I see the same on very short runs. But when doing a slightly longer run,
even just using ls -lahR, which does some more work, then I do see user
backtraces. They are still missing for some of the early samples though.
It is as if there is a stack/memory address mismatch when the probe is
"too early" in ld.so.

Could you do a test run on some program that does some more work to see
if you never get any user stack traces, or if you only not get them for
some specific probes?

Thanks,

Mark


Re: [PATCH] perf/probe: Search both .eh_frame and .debug_frame sections for probe location

2015-09-24 Thread Mark Wielaard
On Thu, 2015-09-24 at 10:56 +, 平松雅巳 / HIRAMATU,MASAMI wrote:
> >I am not too familiar with the code so there might be a reason for
> >setting and reusing the pf->cfi to do the search twice. But might it not
> >be more clear to just store both pf->cfi_eh and pf->cfi_debug and then
> >check both in call_probe_finder () with the dwarf_cfi_addrframe () call?
> >Which is the only place I see actually using the cfi.
> 
> Right, but since call_probe_finder can be called repeatedly on same binary,
> we should keep pf->cfi for caching CFI too.

Yes. I was suggesting to rename pf->cfi to pf->cfi_eh and add
pf->cfi_debug, to make clear why there are two.

> >BTW. Not really related to this patch since the following was already in
> >the code, and is most likely always correct anyway:
> >
> >> +  if (elf_section_by_name(elf, &ehdr, &shdr, ".eh_frame", NULL) &&
> >> +  shdr.sh_type == SHT_PROGBITS) {
> >> +  pf->cfi = dwarf_getcfi_elf(elf);
> >
> >But that SHT_PROGBITS check is only necessary because of a bug in
> >elfutils < 0.156. For 0.156+ dwarf_getcfi_elf () will properly return
> >NULL in case you happen to be looking at a separate debug file that
> >has .eh_frame as NOBITS. In theory this prevents getting the CFI if the
> >file has stripped away the shdrs. Which is reasonable, there are
> >probably also other things that rely on the shdrs.
> 
> Ah, I had just wanted to avoid introducing new ifdefs.

Yes, understandable. It is a weird corner case anyway.

> > But dwarf_getcfi_elf
> >is able to also get you the CFI with just the phdrs.
> 
> Hmm, how can I make such binary? I can fix it, but we need a
> testcase for that.

eu-strip --strip-sections can create such binaries. But honestly I
wouldn't bother. They are basically useless and nobody (should) do that.
The only reason you might want to support them is for getting a
backtrace anyway through the CFI. But you won't be able to get any other
symbol or debug information from them.

I only really mentioned it because I am embarrassed about the bug in
elfutils. Just glad that it got fixed and the workaround isn't necessary
anymore. But it probably is not worth it now to try to remove the
workaround. There is no real benefit in this case.

Cheers,

Mark
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] perf/probe: Search both .eh_frame and .debug_frame sections for probe location

2015-09-24 Thread Mark Wielaard
Hi Hemant,

On Thu, 2015-09-24 at 07:46 +0530, Hemant Kumar wrote:
> perf probe through debuginfo__find_probes() in util/probe-finder.c
> checks for the functions' frame descriptions in either .eh_frame section
> of an ELF or the .debug_frame. The check is based on whether either one
> of these sections is present. But sometimes, it may happen that,
> .eh_frame, even if present, may not be complete and may miss some
> descriptions.

Right. Depending on distro, toolchain defaults, arch, build flags, etc.
CFI might be found in either .eh_frame and/or .debug_frame. To be sure
you find the CFI covering an address you will always have to investigate
both if available.

I am not too familiar with the code so there might be a reason for
setting and reusing the pf->cfi to do the search twice. But might it not
be more clear to just store both pf->cfi_eh and pf->cfi_debug and then
check both in call_probe_finder () with the dwarf_cfi_addrframe () call?
Which is the only place I see actually using the cfi.

BTW. Not really related to this patch since the following was already in
the code, and is most likely always correct anyway:

> + if (elf_section_by_name(elf, &ehdr, &shdr, ".eh_frame", NULL) &&
> + shdr.sh_type == SHT_PROGBITS) {
> + pf->cfi = dwarf_getcfi_elf(elf);

But that SHT_PROGBITS check is only necessary because of a bug in
elfutils < 0.156. For 0.156+ dwarf_getcfi_elf () will properly return
NULL in case you happen to be looking at a separate debug file that
has .eh_frame as NOBITS. In theory this prevents getting the CFI if the
file has stripped away the shdrs. Which is reasonable, there are
probably also other things that rely on the shdrs. But dwarf_getcfi_elf
is able to also get you the CFI with just the phdrs.

Cheers,

Mark
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 6/8] Use proper types for do_div

2009-12-09 Thread Mark Wielaard
Hi Anton,

On Wed, 2009-12-09 at 19:09 +0300, Anton Vorontsov wrote:
> I think this should be fixed by this patch:
> 
> http://sourceware.org/ml/systemtap/2009-q4/msg00800.html
> 
> Can you try it? With this patch I see no new regressions on x86_64.

Yes, that fixes everything. Sorry I didn't see that patch earlier. I see
Wenji also tested it already. I have pushed it for you.

Thanks,

Mark


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 6/8] Use proper types for do_div

2009-12-09 Thread Mark Wielaard
Hi Anton,

On Sat, 2009-11-28 at 01:33 +0300, Anton Vorontsov wrote: 
> do_div accepts unsigned 64-bit integer type for dividend, signed types
> would cause do_div's typecheck fail:
> 
> stat-common.c: In function 'needed_space':
> stat-common.c:50: error: comparison of distinct pointer types lacks a cast
> ...same errors in time.c and tapset-timers.cxx's generated code...
> 
> A fix for time.c is special, on ppc32 cycles_t is 32-bit, so technically
> we don't need do_div, but since the whole _stp_gettimeofday_ns() operates
> on 64-bit types we'd better be safe and use uint64_t for the math.
> 
> Signed-off-by: Anton Vorontsov 
> ---
>  runtime/stat-common.c |8 
>  runtime/time.c|3 ++-
>  tapset-timers.cxx |2 +-
>  3 files changed, 7 insertions(+), 6 deletions(-)

I don't immediately see anything wrong with the code. But on x86_64 this
does cause some regression test failures in testsuite/systemtap.maps. In
particular the following tests now fail (they all pass with this one
patch reverted):

Running /home/mark/src/systemtap/testsuite/systemtap.maps/elision.exp ...
FAIL: elision-1
FAIL: elision0
FAIL: elision1
FAIL: elision2
FAIL: elision3
Running /home/mark/src/systemtap/testsuite/systemtap.maps/ix.exp ...
FAIL: systemtap.maps/ix.stp
Running /home/mark/src/systemtap/testsuite/systemtap.maps/linear_large_neg.exp 
...
FAIL: systemtap.maps/linear_large_neg.stp
Running /home/mark/src/systemtap/testsuite/systemtap.maps/linear_overunder.exp 
...
FAIL: systemtap.maps/linear_overunder.stp
Running /home/mark/src/systemtap/testsuite/systemtap.maps/linear_under.exp ...
FAIL: systemtap.maps/linear_under.stp
Running /home/mark/src/systemtap/testsuite/systemtap.maps/log.exp ...
FAIL: systemtap.maps/log.stp
Running /home/mark/src/systemtap/testsuite/systemtap.maps/log_edge.exp ...
FAIL: systemtap.maps/log_edge.stp

Could you have a look?

Thanks,

Mark

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev