On Thu, Jul 13, 2017 at 11:19:11AM +0200, Ingo Molnar wrote:
>
> * Peter Zijlstra <pet...@infradead.org> wrote:
>
> > > One gloriously ugly hack would be to delay the userspace unwind to
> > > return-to-userspace, at which point we have a schedulable context and can
> > > take faults.
>
> I don't think it's ugly, and it has various advantages:
>
> > > Of course, then you have to somehow identify this later unwind sample
> > > with all relevant prior samples and stitch the whole thing back together,
> > > but that should be doable.
> > >
> > > In fact, it would not be at all hard to do, just queue a task_work from
> > > the NMI and have that do the EH based unwind.
>
> This would have a couple of advantages:
>
>  - as you mention, being able to fault in debug info and generally do
>    IO/scheduling,
>
>  - profiling overhead would be accounted to the task context that generates
>    it, not the NMI context,
>
>  - there would be a natural batching/coalescing optimization if multiple
>    events hit the same system call: the user-space backtrace would only have
>    to be looked up once for all samples that got collected.
>
> This could be done by separating the user-space backtrace into a separate
> event, and perf tooling would then apply the same user-space backtrace to all
> prior kernel samples.
>
> I.e. the ring-buffer would have trace entries like:
>
>  [ kernel sample #1, with kernel backtrace #1 ]
>  [ kernel sample #2, with kernel backtrace #2 ]
>  [ kernel sample #3, with kernel backtrace #3 ]
>  [ user-space backtrace #1 at syscall return ]
>  ...
>
> Note how the three kernel samples didn't have to do any user-space unwinding
> at all, so the user-space unwinding overhead got reduced by a factor of 3.
>
> Tooling would know that 'user-space backtrace #1' applies to the previous
> three kernel samples.
>
> Or so?

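
For illustration, the tooling-side stitching described in the quoted text
could look roughly like the sketch below. It is not perf's actual record
format or API: the KernelSample and UserBacktrace classes, the tid field and
stitch() are all hypothetical. It assumes the deferred user-space backtrace
shows up in the buffer after the kernel samples it covers, and that records
are matched per thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class KernelSample:
    # Hypothetical record: a sample taken in NMI context, kernel stack only.
    tid: int
    kernel_stack: List[int]
    user_stack: Optional[List[int]] = None  # filled in by stitch()


@dataclass
class UserBacktrace:
    # Hypothetical record: the deferred unwind done at return-to-userspace.
    tid: int
    user_stack: List[int]


def stitch(records: List[Union[KernelSample, UserBacktrace]]) -> List[KernelSample]:
    """Apply each user-space backtrace to all prior pending samples of that thread."""
    pending: Dict[int, List[KernelSample]] = {}  # tid -> samples awaiting a backtrace
    samples: List[KernelSample] = []
    for rec in records:
        if isinstance(rec, KernelSample):
            pending.setdefault(rec.tid, []).append(rec)
            samples.append(rec)
        else:
            # One user-space unwind serves every sample queued for this thread.
            for sample in pending.pop(rec.tid, []):
                sample.user_stack = rec.user_stack
    # Samples with no later backtrace keep user_stack=None.
    return samples
```

Keying the pending list by thread keeps samples from other tasks that are
interleaved in the same buffer from picking up the wrong user-space stack.
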
BTW, while we're throwing out ideas for this, here's another idea, though
it's almost certainly not a good one :-)

For user space stack unwinding, the kernel could emulate what the kernel
'guess' unwinder does by scanning the user space stack and returning all
the text addresses it finds.  The results wouldn't be 100% accurate, but
they could end up being useful over time.

-- 
Josh
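
A minimal sketch of the stack-scanning idea above, written in Python for
brevity rather than as kernel code. Everything here is an assumption made for
illustration: scan_stack_for_text(), its parameters, and text_ranges (standing
in for the task's executable mappings) are hypothetical names, and the stack is
assumed to be a little-endian byte snapshot aligned to the word size.

```python
import struct
from typing import List, Tuple


def scan_stack_for_text(stack: bytes,
                        text_ranges: List[Tuple[int, int]],
                        word_size: int = 8) -> List[int]:
    """Return every stack word that falls inside an executable range.

    Like a 'guess' unwinder, this cannot tell a live return address from a
    stale one or from a stored function pointer, so false positives are
    expected.
    """
    fmt = "<Q" if word_size == 8 else "<I"
    hits = []
    # Walk the snapshot one aligned word at a time, lowest address first.
    for off in range(0, len(stack) - word_size + 1, word_size):
        (word,) = struct.unpack_from(fmt, stack, off)
        # Half-open [start, end) ranges, hypothetical stand-ins for text mappings.
        if any(start <= word < end for start, end in text_ranges):
            hits.append(word)
    return hits
```
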