2017-08-23 1:51 GMT+08:00 Josh Poimboeuf <jpoim...@redhat.com>: > On Tue, Aug 22, 2017 at 05:09:19PM +0000, yang oliver wrote: >> From: Yong Yang <yangoli...@gmail.com> >> >> While NMI interrupts the very beginning of SYSCALL, the rsp could >> be still user space address without loading kernel address. Then >> the pt_regs constructed by the NMI entry point could have a user >> space rsp. If a NMI handler tried to dump stack by using this rsp, >> it can cause the kernel panic. > > To me this sounds like an unwinder bug. The unwinder is supposed to > have checks to prevent it from accessing user space.
What kind of checking should we do here? While the NMI interrupts the first instruction of entry_SYSCALL_64, the rip is kernel address, but rsp is still a user space address. Usually, kernel uses the "user_mode" check to determine user space pt_regs, but this check doesn't work for this situation. > I know you had > previously reported this for an older (pre-4.9) kernel. Have you > verified the bug still exists on a recent kernel? Yes, I ran into kernel panic on 3.10 kernel, and the root case is show_stack_log_lvl in nmi handler dump the user space stack, but that memory got swapped out or freed. On latest kernel, show_stack_log_lvl never dump stack contents. But I'm not sure other NMI handler, or kernel trace code could dump a user space stack dump while the kernel mode pt_regs which has a user space sp. I wrote a module which register a NMI handler, it did the check like below, if (!user_mode(regs)) { /* kernel mode, but could contain user address */ rsp = regs->sp; if (rsp < __PAGE_OFFSET) { pr_err("User Stack Pointer: %016lx", rsp); pr_err("User Stack Dump: %016lx", *((unsigned long *)rsp)); } } I manage to make NMI interrupted the entry_SYSCALL_64 boundary, then I could trigger the issue. Not sure whether it is good enough. Let me know if you have other suggestions. > > -- > Josh -- ------------------ Oliver Yang