> On Apr 5, 2019, at 06:50, Andy Lutomirski <[email protected]> wrote: > > > >> On Apr 5, 2019, at 2:35 AM, Thomas Gleixner <[email protected]> wrote: >> >>> On Mon, 25 Mar 2019, Thomas Gleixner wrote: >>>> On Fri, 15 Mar 2019, Chang S. Bae wrote: >>>> ENTRY(paranoid_exit) >>>> UNWIND_HINT_REGS >>>> DISABLE_INTERRUPTS(CLBR_ANY) >>>> TRACE_IRQS_OFF_DEBUG >>>> + ALTERNATIVE "jmp .Lparanoid_exit_no_fsgsbase", "nop",\ >>>> + X86_FEATURE_FSGSBASE >>>> + wrgsbase %rbx >>>> + jmp .Lparanoid_exit_no_swapgs; >>> >>> Again. A few newlines would make it more readable. >>> >>> This modifies the semantics of paranoid_entry and paranoid_exit. Looking at >>> the usage sites there is the following code in the nmi maze: >>> >>> /* >>> * Use paranoid_entry to handle SWAPGS, but no need to use paranoid_exit >>> * as we should not be calling schedule in NMI context. >>> * Even with normal interrupts enabled. An NMI should not be >>> * setting NEED_RESCHED or anything that normal interrupts and >>> * exceptions might do. >>> */ >>> call paranoid_entry >>> UNWIND_HINT_REGS >>> >>> /* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */ >>> movq %rsp, %rdi >>> movq $-1, %rsi >>> call do_nmi >>> >>> /* Always restore stashed CR3 value (see paranoid_entry) */ >>> RESTORE_CR3 scratch_reg=%r15 save_reg=%r14 >>> >>> testl %ebx, %ebx /* swapgs needed? */ >>> jnz nmi_restore >>> nmi_swapgs: >>> SWAPGS_UNSAFE_STACK >>> nmi_restore: >>> POP_REGS >>> >>> I might be missing something, but how is that supposed to work when >>> paranoid_entry uses FSGSBASE? I think it's broken, but if it's not then >>> there is a big fat comment missing explaining why. >> >> So this _is_ broken. >> >> On entry: >> >> rbx = rdgsbase() >> wrgsbase(KERNEL_GS) >> >> On exit: >> >> if (ebx == 0) >> swapgs >> >> The resulting matrix: >> >> | ENTRY GS | RBX | EXIT | GS on IRET | RESULT >> | | | | | >> 1 | KERNEL_GS | KERNEL_GS | EBX == 0 | USER_GS | FAIL >> | | | | | >> 2 | KERNEL_GS | KERNEL_GS | EBX != 0 | KERNEL_GS | ok >> | | | | | >> 3 | USER_GS | USER_GS | EBX == 0 | USER_GS | ok >> | | | | | >> 4 | USER_GS | USER_GS | EBX != 0 | KERNEL_GS | FAIL >> >> >> #1 Just works by chance because it's unlikely that the lower 32bits of a >> per CPU kernel GS are all 0. >> >> But it's just a question of probability that this turns into a >> non-debuggable once per year crash (think KASLR). >> >> #4 This can happen when the NMI hits the kernel in some other entry code >> _BEFORE_ or _AFTER_ swapgs. >> >> User space using GS addressing with GS[31:0] != 0 will crash and burn. >> >> > > Hi all- > > In a previous incarnation of these patches, I complained about the use of > SWAPGS in the paranoid path. Now I’m putting my maintainer foot down. On a > non-FSGSBASE system, the paranoid path known, definitively, which GS is > where, so SWAPGS is annoying. With FSGSBASE, unless you start looking at the > RIP that you interrupted, you cannot know whether you have user or kernel > GSBASE live, since they can have literally the same value. One of the > numerous versions of this patch compared the values and just said “well, it’s > harmless to SWAPGS if user code happens to use the same value as the kernel”. > I complained that it was far too fragile. > > So I’m putting my foot down. If you all want my ack, you’re going to save the > old GS, load the new one with WRGSBASE, and, on return, you’re going to > restore the old one with WRGSBASE. You will not use SWAPGS in the paranoid > path. > > Obviously, for the non-paranoid path, it all keeps working exactly like it > does now.
Although I can see some other concerns with this, looks like it is still worth pursuing. > > Furthermore, if you folks even want me to review this series, the ptrace > tests need to be in place. On inspection of the current code (after the > debacle a few releases back), it appears the SETREGSET’s effect depends on > the current values in the registers — it does not actually seem to reliably > load the whole state. So my confidence will be greatly increased if your > series first adds a test that detects that bug (and fails!), then fixes the > bug in a tiny little patch, then adds FSGSBASE, and keeps the test working. > I think I need to understand the issue. Appreciate if you can elaborate a little bit. > —Andy

