On Sun, Apr 14, 2019 at 05:59:57PM +0200, Thomas Gleixner wrote: > The debug IST stack is actually two separate debug stacks to handle #DB > recursion. This is required because the CPU starts always at top of stack > on exception entry, which means on #DB recursion the second #DB would > overwrite the stack of the first. > > The low level entry code therefore adjusts the top of stack on entry so a > secondary #DB starts from a different stack page. But the stack pages are > adjacent without a guard page between them. > > Split the debug stack into 3 stacks which are separated by guard pages. The > 3rd stack is never mapped into the cpu_entry_area and is only there to > catch triple #DB nesting: > > --- top of DB_stack <- Initial stack > --- end of DB_stack > guard page > > --- top of DB1_stack <- Top of stack after entering first #DB > --- end of DB1_stack > guard page > > --- top of DB2_stack <- Top of stack after entering second #DB > --- end of DB2_stack > guard page > > If DB2 would not act as the final guard hole, a second #DB would point the > top of #DB stack to the stack below #DB1 which would be valid and not catch > the not so desired triple nesting. > > The backing store does not allocate any memory for DB2 and its guard page > as it is not going to be mapped into the cpu_entry_area. > > - Adjust the low level entry code so it adjusts top of #DB with the offset > between the stacks instead of exception stack size. > > - Make the dumpstack code aware of the new stacks. > > - Adjust the in_debug_stack() implementation and move it into the NMI code > where it belongs. As this is NMI hotpath code, it just checks the full > area between top of DB_stack and bottom of DB1_stack without checking > for the guard page. That's correct because the NMI cannot hit a > stackpointer pointing to the guard page between DB and DB1 stack. Even > if it would, then the NMI operation still is unaffected, but the resume > of the debug exception on the topmost DB stack will crash by touching > the guard page. > > Suggested-by: Andy Lutomirski <[email protected]> > Signed-off-by: Thomas Gleixner <[email protected]> > ---
One nit below on the docs, otherwise: Reviewed-by: Sean Christopherson <[email protected]> > V2 -> V3: Fix off by one in in_debug_stack() > --- > Documentation/x86/kernel-stacks | 7 ++++++- > arch/x86/entry/entry_64.S | 8 ++++---- > arch/x86/include/asm/cpu_entry_area.h | 14 ++++++++++---- > arch/x86/include/asm/debugreg.h | 2 -- > arch/x86/include/asm/page_64_types.h | 3 --- > arch/x86/kernel/asm-offsets_64.c | 2 ++ > arch/x86/kernel/cpu/common.c | 11 ----------- > arch/x86/kernel/dumpstack_64.c | 12 ++++++++---- > arch/x86/kernel/nmi.c | 20 +++++++++++++++++++- > arch/x86/mm/cpu_entry_area.c | 4 +++- > 10 files changed, 52 insertions(+), 31 deletions(-) > > --- a/Documentation/x86/kernel-stacks > +++ b/Documentation/x86/kernel-stacks > @@ -76,7 +76,7 @@ The currently assigned IST stacks are :- > middle of switching stacks. Using IST for NMI events avoids making > assumptions about the previous state of the kernel stack. > > -* ESTACK_DB. DEBUG_STKSZ > +* ESTACK_DB. EXCEPTION_STKSZ (PAGE_SIZE). > > Used for hardware debug interrupts (interrupt 1) and for software > debug interrupts (INT3). > @@ -86,6 +86,11 @@ The currently assigned IST stacks are :- > avoids making assumptions about the previous state of the kernel > stack. > > + To handle nested #DB correctly there exist two instances of DB stacks. On > + #DB entry the IST stackpointer for #DB is switched to the second instance > + so a nested #DB starts from a clean stack. The nested #DB switches to Pretty sure the "to" at the end is unwanted. > + the IST stackpointer to a guard hole to catch triple nesting. > + > * ESTACK_MCE. EXCEPTION_STKSZ (PAGE_SIZE). > > Used for interrupt 18 - Machine Check Exception (#MC).

