On Wednesday, 7 November 2018 23:41:31 CET Milian Wolff wrote:
> On Tuesday, 6 November 2018 21:24:11 CET Andi Kleen wrote:
> > > Where would I look for the source to change here? So far, I only
> > > concentrated on the userspace side of perf in tools/perf.
> >
> > Kind of similar to
> >
> > a405bad5ad20 perf/x86: Add Haswell specific transaction flag reporting
> > fdfbbd07e91f perf: Add generic transaction flags
> >
> > Report the original (not overwritten) regs->ip and regs->sp
>
> Thanks a lot Andi! With your help, I have managed to find the exact issue
> for my scenario. Turns out, it really is "just" the instruction pointer
> that is wrong. I.e. originally we have IP = 0x7feda32ca68c, but with PEBS
> we correct that to IP = 7feda32ca688. The SP register value stays the same
> according to my printk output. Using the original IP value, we can unwind
> correctly since we point to the correct place in the .eh_frame section. The
> PEBS IP points to a different position in the .eh_frame section, which is
> "too early".
>
> That brings up some questions:
>
> - I noticed `perf record --intr-regs`, but the values recorded in the
> perf.data file are always the same. I.e. comparing uregs and iregs, I always
> see the same values printed by `perf script`. This smells like a bug to me,
> but so far I haven't figured out why this happens...

The reason seems to be that perf_event_output only takes one set of registers,
which then gets handed down into perf_prepare_sample where it gets sampled.
Thus if sample type has both PERF_SAMPLE_REGS_USER and PERF_SAMPLE_REGS_INTR
set, then by design both will store the same values for user space samples.

Can we change this, such that perf_event_output also takes a second set of
registers (iregs) that get sampled for PERF_SAMPLE_REGS_INTR? I'm very new to
real kernel development, what kind of ABI/API stability guarantees exist for
something like "perf_event_output"?

> - Independently, when I add a custom printk manually in
> `arch/x86/events/intel/ds.c` at the end of `setup_pebs_sample_data`, then
> I'm never seeing any differences between SP in iregs/pebs/regs. Shouldn't
> it also be recorded via PEBS? Or is it just chance that I'm never seeing
> any difference in setup_pebs_sample_data between iregs->sp and regs->sp?

The reason here seems to be that the registers stored in "pebs" are
essentially the same as iregs for the setup for `perf record --call-graph
dwarf`. The difference is the availability of `pebs->real_ip` which gets used
on my system to fixup the IP. SP stays untouched and is thus only truly valid
for the untouched IP (which is discarded currently - see above).

> - Generally, how do we want to handle this bug? If `--intr-regs` would
> actually record a different IP than stored in uregs in the perf.data file,
> then we could use that as a fallback for unwinding, when it fails the first
> time. Or should we always unwind from that IP? How do we mark the "actual"
> frame/IP then, if that differs?
>
> Thanks

--
Milian Wolff | milian.wo...@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts
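
For illustration, the single-register-set behaviour described above can be
modelled with a small user-space toy in C. This is only a sketch under stated
assumptions, not kernel code: every toy_* name, the flag values and the struct
layouts are hypothetical, and nothing here reflects the real signatures of
perf_event_output or perf_prepare_sample. toy_prepare_sample models the
current situation as described (one register set feeds both sample fields);
toy_prepare_sample_split models the proposed change with a separate iregs
argument.

```c
#include <stdint.h>

/* Hypothetical stand-ins for PERF_SAMPLE_REGS_USER / PERF_SAMPLE_REGS_INTR. */
#define TOY_SAMPLE_REGS_USER (1u << 0)
#define TOY_SAMPLE_REGS_INTR (1u << 1)

/* Hypothetical, heavily reduced register set: only IP and SP. */
struct toy_regs {
    uint64_t ip;
    uint64_t sp;
};

/* Hypothetical sample record holding both register dumps. */
struct toy_sample {
    struct toy_regs user;
    struct toy_regs intr;
};

/*
 * Model of the behaviour described in the mail: a single register set is
 * handed down, so both dumps end up identical for a user space sample.
 */
void toy_prepare_sample(struct toy_sample *s, unsigned int type,
                        const struct toy_regs *regs)
{
    if (type & TOY_SAMPLE_REGS_USER)
        s->user = *regs;
    if (type & TOY_SAMPLE_REGS_INTR)
        s->intr = *regs;
}

/*
 * Model of the proposed change: a second register set (iregs) is passed
 * along and used only for the interrupt-register dump.
 */
void toy_prepare_sample_split(struct toy_sample *s, unsigned int type,
                              const struct toy_regs *regs,
                              const struct toy_regs *iregs)
{
    if (type & TOY_SAMPLE_REGS_USER)
        s->user = *regs;
    if (type & TOY_SAMPLE_REGS_INTR)
        s->intr = *iregs;
}
```

With both flags set, regs->ip = 0x7feda32ca688 (the PEBS-corrected IP) and
iregs->ip = 0x7feda32ca68c (the original IP), toy_prepare_sample stores the
corrected IP in both dumps, whereas toy_prepare_sample_split keeps the
original IP in the intr dump, where an unwinder could pick it up.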