Re: ARM64 perf stacktraces on tracepoint events
On Thu, 2 Mar 2017 10:44:11 + Mark Rutlandwrote: > [Adding James, Takahiro-san, and Will, for stacktracing and arm64 perf] > > On Wed, Mar 01, 2017 at 04:42:17PM -0800, Joel Fernandes wrote: > > Hi, > > > > On 3.18 kernel running on an arm64 device, I tried running: > > perf record -a -g -e sched:sched_wakeup > > > > I was expecting something like the output at: > > www.brendangregg.com/perf.html > > > > But I don't see stacktraces, could I be missing any patches you could > > point me to? > > I can't spot much obvious, looking at arm64's stacktrace.c and > perf_callchain.c > since v3.18. There's commit: > > 9702970c7bd3e2d6 (Revert "ARM64: unwind: Fix PC calculation") > > ... though that was Cc'd stable. > > I can't immediately see why that would truncate unwinding, though I'm also not > sure how perf and ftrace interact here, so I may be missing something obvious. > > > I do see stack traces if I do regular perf profiling > > (perf record -F 99 -a -g). > > > Below is the output I get. Thanks for any insight into this. > > > > Regards, > > Joel > > > > sailfish:/data # ./perf report > ># Samples: 22 of event 'sched:sched_wakeup' > > # Event count (approx.): 410333 > > # > > # Children Self Trace output > > # > > . > > # > > 61.29%61.29% comm=ksoftirqd/0 pid=3 prio=120 success=1 > > target_cpu=000 > > | > > ---0 > > > > 33.25%33.25% comm=hw pid=12777 prio=120 success=1 target_cpu=002 > > | > > ---0 > > > > 2.46% 2.46% comm=kworker/u8:2 pid=12482 prio=120 success=1 > > target_cpu=002 > > | > > ---0 > > > > 1.39% 1.39% comm=migration/2 pid=21 prio=0 success=1 > > target_cpu=002 > > | > > ---0 > > FWIW, on a recent HEAD (86292b33d4b79ee0), I see reasonable looking > backtraces, > e.g. I cross-built the perf tool on an Ubuntu 16.10 system, after rewinding its base linux tree to ~v4.6 (e.g., commit 8beeb00 because it precedes a perf libunwind series), and still see reasonable output: # # Total Lost Samples: 0 # # Samples: 30 of event 'sched:sched_wakeup' # Event count (approx.): 30 # # Children Self Trace output # . # 20.00%20.00% comm=rcu_preempt pid=7 prio=120 target_cpu=006 | ---0x807f71c4 secondary_start_kernel cpu_startup_entry do_idle call_cpuidle cpuidle_enter cpuidle_enter_state el1_irq gic_handle_irq __handle_domain_irq irq_exit __do_softirq | |--16.67%--run_timer_softirq | expire_timers | call_timer_fn | process_timeout | wake_up_process | try_to_wake_up | ttwu_do_activate | ttwu_do_wakeup | ttwu_do_wakeup | --3.33%--rcu_process_callbacks note_gp_changes rcu_gp_kthread_wake swake_up swake_up_locked wake_up_process try_to_wake_up ttwu_do_activate ttwu_do_wakeup ttwu_do_wakeup I tried to go even further back, e.g., v4.1-rc4-15-g10b48f7 but the compiler wouldn't build perf any more, e.g.,: util/event.c:416:2: error: ‘readdir_r’ is deprecated [-Werror=deprecated-declarations] Is it possible to try a newer perf binary (assuming it's also 3.18 based), or check whether libunwind contains aarch64 support? Note that it's possible to 'perf archive' the aarch64 run and run perf report on a more up-to-date x86 host also. Thanks, Kim
Re: ARM64 perf stacktraces on tracepoint events
On Thu, 2 Mar 2017 10:44:11 + Mark Rutland wrote: > [Adding James, Takahiro-san, and Will, for stacktracing and arm64 perf] > > On Wed, Mar 01, 2017 at 04:42:17PM -0800, Joel Fernandes wrote: > > Hi, > > > > On 3.18 kernel running on an arm64 device, I tried running: > > perf record -a -g -e sched:sched_wakeup > > > > I was expecting something like the output at: > > www.brendangregg.com/perf.html > > > > But I don't see stacktraces, could I be missing any patches you could > > point me to? > > I can't spot much obvious, looking at arm64's stacktrace.c and > perf_callchain.c > since v3.18. There's commit: > > 9702970c7bd3e2d6 (Revert "ARM64: unwind: Fix PC calculation") > > ... though that was Cc'd stable. > > I can't immediately see why that would truncate unwinding, though I'm also not > sure how perf and ftrace interact here, so I may be missing something obvious. > > > I do see stack traces if I do regular perf profiling > > (perf record -F 99 -a -g). > > > Below is the output I get. Thanks for any insight into this. > > > > Regards, > > Joel > > > > sailfish:/data # ./perf report > ># Samples: 22 of event 'sched:sched_wakeup' > > # Event count (approx.): 410333 > > # > > # Children Self Trace output > > # > > . > > # > > 61.29%61.29% comm=ksoftirqd/0 pid=3 prio=120 success=1 > > target_cpu=000 > > | > > ---0 > > > > 33.25%33.25% comm=hw pid=12777 prio=120 success=1 target_cpu=002 > > | > > ---0 > > > > 2.46% 2.46% comm=kworker/u8:2 pid=12482 prio=120 success=1 > > target_cpu=002 > > | > > ---0 > > > > 1.39% 1.39% comm=migration/2 pid=21 prio=0 success=1 > > target_cpu=002 > > | > > ---0 > > FWIW, on a recent HEAD (86292b33d4b79ee0), I see reasonable looking > backtraces, > e.g. I cross-built the perf tool on an Ubuntu 16.10 system, after rewinding its base linux tree to ~v4.6 (e.g., commit 8beeb00 because it precedes a perf libunwind series), and still see reasonable output: # # Total Lost Samples: 0 # # Samples: 30 of event 'sched:sched_wakeup' # Event count (approx.): 30 # # Children Self Trace output # . # 20.00%20.00% comm=rcu_preempt pid=7 prio=120 target_cpu=006 | ---0x807f71c4 secondary_start_kernel cpu_startup_entry do_idle call_cpuidle cpuidle_enter cpuidle_enter_state el1_irq gic_handle_irq __handle_domain_irq irq_exit __do_softirq | |--16.67%--run_timer_softirq | expire_timers | call_timer_fn | process_timeout | wake_up_process | try_to_wake_up | ttwu_do_activate | ttwu_do_wakeup | ttwu_do_wakeup | --3.33%--rcu_process_callbacks note_gp_changes rcu_gp_kthread_wake swake_up swake_up_locked wake_up_process try_to_wake_up ttwu_do_activate ttwu_do_wakeup ttwu_do_wakeup I tried to go even further back, e.g., v4.1-rc4-15-g10b48f7 but the compiler wouldn't build perf any more, e.g.,: util/event.c:416:2: error: ‘readdir_r’ is deprecated [-Werror=deprecated-declarations] Is it possible to try a newer perf binary (assuming it's also 3.18 based), or check whether libunwind contains aarch64 support? Note that it's possible to 'perf archive' the aarch64 run and run perf report on a more up-to-date x86 host also. Thanks, Kim
Re: ARM64 perf stacktraces on tracepoint events
[Adding James, Takahiro-san, and Will, for stacktracing and arm64 perf] On Wed, Mar 01, 2017 at 04:42:17PM -0800, Joel Fernandes wrote: > Hi, > > On 3.18 kernel running on an arm64 device, I tried running: > perf record -a -g -e sched:sched_wakeup > > I was expecting something like the output at: > www.brendangregg.com/perf.html > > But I don't see stacktraces, could I be missing any patches you could > point me to? I can't spot much obvious, looking at arm64's stacktrace.c and perf_callchain.c since v3.18. There's commit: 9702970c7bd3e2d6 (Revert "ARM64: unwind: Fix PC calculation") ... though that was Cc'd stable. I can't immediately see why that would truncate unwinding, though I'm also not sure how perf and ftrace interact here, so I may be missing something obvious. > I do see stack traces if I do regular perf profiling > (perf record -F 99 -a -g). > Below is the output I get. Thanks for any insight into this. > > Regards, > Joel > > sailfish:/data # ./perf report ># Samples: 22 of event 'sched:sched_wakeup' > # Event count (approx.): 410333 > # > # Children Self Trace output > # > . > # > 61.29%61.29% comm=ksoftirqd/0 pid=3 prio=120 success=1 target_cpu=000 > | > ---0 > > 33.25%33.25% comm=hw pid=12777 prio=120 success=1 target_cpu=002 > | > ---0 > > 2.46% 2.46% comm=kworker/u8:2 pid=12482 prio=120 success=1 > target_cpu=002 > | > ---0 > > 1.39% 1.39% comm=migration/2 pid=21 prio=0 success=1 target_cpu=002 > | > ---0 FWIW, on a recent HEAD (86292b33d4b79ee0), I see reasonable looking backtraces, e.g. 77.48% 0.00% swapper [kernel.kallsyms] [k] do_idle | ---do_idle | |--68.84%--call_cpuidle | cpuidle_enter | cpuidle_enter_state | | | --61.05%--el1_irq | gic_handle_irq | __handle_domain_irq | irq_exit | __do_softirq | | | |--22.29%--run_timer_softirq Thanks, Mark.
Re: ARM64 perf stacktraces on tracepoint events
[Adding James, Takahiro-san, and Will, for stacktracing and arm64 perf] On Wed, Mar 01, 2017 at 04:42:17PM -0800, Joel Fernandes wrote: > Hi, > > On 3.18 kernel running on an arm64 device, I tried running: > perf record -a -g -e sched:sched_wakeup > > I was expecting something like the output at: > www.brendangregg.com/perf.html > > But I don't see stacktraces, could I be missing any patches you could > point me to? I can't spot much obvious, looking at arm64's stacktrace.c and perf_callchain.c since v3.18. There's commit: 9702970c7bd3e2d6 (Revert "ARM64: unwind: Fix PC calculation") ... though that was Cc'd stable. I can't immediately see why that would truncate unwinding, though I'm also not sure how perf and ftrace interact here, so I may be missing something obvious. > I do see stack traces if I do regular perf profiling > (perf record -F 99 -a -g). > Below is the output I get. Thanks for any insight into this. > > Regards, > Joel > > sailfish:/data # ./perf report ># Samples: 22 of event 'sched:sched_wakeup' > # Event count (approx.): 410333 > # > # Children Self Trace output > # > . > # > 61.29%61.29% comm=ksoftirqd/0 pid=3 prio=120 success=1 target_cpu=000 > | > ---0 > > 33.25%33.25% comm=hw pid=12777 prio=120 success=1 target_cpu=002 > | > ---0 > > 2.46% 2.46% comm=kworker/u8:2 pid=12482 prio=120 success=1 > target_cpu=002 > | > ---0 > > 1.39% 1.39% comm=migration/2 pid=21 prio=0 success=1 target_cpu=002 > | > ---0 FWIW, on a recent HEAD (86292b33d4b79ee0), I see reasonable looking backtraces, e.g. 77.48% 0.00% swapper [kernel.kallsyms] [k] do_idle | ---do_idle | |--68.84%--call_cpuidle | cpuidle_enter | cpuidle_enter_state | | | --61.05%--el1_irq | gic_handle_irq | __handle_domain_irq | irq_exit | __do_softirq | | | |--22.29%--run_timer_softirq Thanks, Mark.
ARM64 perf stacktraces on tracepoint events
Hi, On 3.18 kernel running on an arm64 device, I tried running: perf record -a -g -e sched:sched_wakeup I was expecting something like the output at: www.brendangregg.com/perf.html But I don't see stacktraces, could I be missing any patches you could point me to? I do see stack traces if I do regular perf profiling (perf record -F 99 -a -g). Below is the output I get. Thanks for any insight into this. Regards, Joel sailfish:/data # ./perf report # Samples: 22 of event 'sched:sched_wakeup' # Event count (approx.): 410333 # # Children Self Trace output # . # 61.29%61.29% comm=ksoftirqd/0 pid=3 prio=120 success=1 target_cpu=000 | ---0 33.25%33.25% comm=hw pid=12777 prio=120 success=1 target_cpu=002 | ---0 2.46% 2.46% comm=kworker/u8:2 pid=12482 prio=120 success=1 target_cpu=002 | ---0 1.39% 1.39% comm=migration/2 pid=21 prio=0 success=1 target_cpu=002 | ---0
ARM64 perf stacktraces on tracepoint events
Hi, On 3.18 kernel running on an arm64 device, I tried running: perf record -a -g -e sched:sched_wakeup I was expecting something like the output at: www.brendangregg.com/perf.html But I don't see stacktraces, could I be missing any patches you could point me to? I do see stack traces if I do regular perf profiling (perf record -F 99 -a -g). Below is the output I get. Thanks for any insight into this. Regards, Joel sailfish:/data # ./perf report # Samples: 22 of event 'sched:sched_wakeup' # Event count (approx.): 410333 # # Children Self Trace output # . # 61.29%61.29% comm=ksoftirqd/0 pid=3 prio=120 success=1 target_cpu=000 | ---0 33.25%33.25% comm=hw pid=12777 prio=120 success=1 target_cpu=002 | ---0 2.46% 2.46% comm=kworker/u8:2 pid=12482 prio=120 success=1 target_cpu=002 | ---0 1.39% 1.39% comm=migration/2 pid=21 prio=0 success=1 target_cpu=002 | ---0