On Thu, Dec 4, 2014 at 8:01 AM, Berthier, Emmanuel <emmanuel.berth...@intel.com> wrote: >> From: Andy Lutomirski [mailto:l...@amacapital.net] >> Sent: Wednesday, December 3, 2014 8:30 PM >> To: Berthier, Emmanuel >> Cc: Thomas Gleixner; H. Peter Anvin; X86 ML; Jarzmik, Robert; LKML >> Subject: Re: [PATCH v2] [LBR] Dump LBRs on Exception >> > The final patch will bypass the new code in case of UserSpace page fault, >> > so >> performance impact will be very low. >> > LBRs copy takes much more time than LBR stop/start. >> > >> > The simple is the better: >> > >> > .macro STOP_LBR >> > #ifdef CONFIG_LBR_DUMP_ON_EXCEPTION >> > testl $3,CS(%rsp) /* Kernel Space? */ >> > jnz 1f >> > testl $3, PER_CPU_VAR(lbr_dump_state) /* Disabled? */ >> > jnz 1f >> >> But that just wasted two of your LBR slots. > > No: false test does not generate Branch record, ex: > > Last Branch Records: > to: [<ffffffff828122a0>] page_fault+0x0/0x90 > from: [<ffffffff823c0e06>] sysrq_handle_crash+0x16/0x20 > to: [<ffffffff823c0df0>] sysrq_handle_crash+0x0/0x20 > from: [<ffffffff823c156c>] __handle_sysrq+0x9c/0x170 > to: [<ffffffff823c1562>] __handle_sysrq+0x92/0x170 > >> > push %rax >> > push %rcx >> > push %rdx >> > movl $MSR_IA32_DEBUGCTLMSR, %ecx >> > rdmsr >> > and $~1, %eax /* Disable LBR recording */ >> > wrmsr >> > pop %rdx >> > pop %rcx >> > pop %rax >> >> And the general problem with this approach (even ignoring the performance >> hit, and kernel faults on user addresses really do happen in real workloads) >> is >> that you're not saving and restoring MSR_IA32_DEBUGCTL. >> It may be that >> the rest of your patch does whatever magic is needed to make this work, but >> from just this code it's not at all obvious that this is correct. > > The algorithm is quite simple: > When I enter in Exception handler, I stop LBR recording, and dump its content > later if needed. > When I leave Exception Handler, I restart LBR recording. > So, after the first exception, LBR in On. > In case of nested Exceptions and crash, you're right, LBR will probably not > be relevant. > But your proposal does not solve this issue: If we save registers during 1rst > exception, and then overwrite them during 2nd level, > we will lose relevant info if crash is due to the 1rst exception. > >> Hence my suggestion for rdmsr -- if you're willing to enable this and take >> the >> performance hit, you can simplify it a lot and save some branch slots by >> unconditionally doing the rdmsrs if you've enabled the LBR tracing IDT entry. >> The simplification from using rdmsr isn't that the save code is simplified >> -- it's >> that there's no state change on exception entry, so you don't need to worry >> about restoring state correctly on the way out or during a context switch. >> And you can enable/disable the whole thing just by writing to the IDT, so >> there's no performance hit at all in the disabled case. > > Concerning performances: if it's really matter, the better is to disable the > CONFIG switch. > But if we enable it, it's for using it I guess, and in that case, bypassing > UserSpace page faults is better. > You're proposal of "unconditionally doing the rdmsrs" is not good in that > case. > The only small gain is when CONFIG is enable and feature is disabled by > cmdline: > - with my proposal, we get 1 test and 1 jmp more (if I switch Kernel test > with LBR state test): for an exception treatment, does it really matter? > > We can mix our proposals: keep my STOP/START code, and replace the dynamic > disabling test by IDT change. > I hope the code will stay readable. > Do we really want to save 2 instructions?
I don't really care about the number of instructions. But there are still all the nasty cases: - Context switch during exception processing (both in the C handler and in the retint code). - PMI during exception processing. - Exception while perf is poking at LBR msrs. Where are you planning on saving the start/stop previous state? --Andy > > Thanks, > > Emmanuel. -- Andy Lutomirski AMA Capital Management, LLC -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/