On Tue, Sep 4, 2018 at 2:21 PM, Dave Hansen <dave.han...@intel.com> wrote:
> On 09/04/2018 12:56 PM, Andy Lutomirski wrote:
>> I have no objection to this patch.
>>
>> Dave, why did you think that we could get a PK fault on the vsyscall
>> page, even on kernels that still marked it executable? Sure, you
>> could get an instruction in the vsyscall page to get a PK fault, but
>> CR2 wouldn't point to the vsyscall page, right?
>
> I'm inferring the CR2 value from the page fault trace point. I see
> entries like this:
>
>   protection_keys-4313 [002] d... 420257.094541: page_fault_user:
>   address=_end ip=_end error_code=0x15
>
> But, that's not a PK fault, and it triggers the "misaligned vsyscall
> (exploit attempt or buggy program)" stuff in dmesg. It's just the
> symptom of trying to execute the non-executable vsyscall page.
>
> I'm not a super big fan of this particular patch, though. The
> fault_in_kernel_space() check is really presuming two things:
> 1. pkey faults (PF_PK=1) only occur on user pages (_PAGE_USER=1)
> 2. fault_in_kernel_space()==1 addresses are never user pages
>
> #1 is a hardware expectation. We *can* look for that directly by just
> making sure that X86_PF_PK is only set when it also comes with
> X86_PF_USER in the hardware page fault error code.
>
> (...
> Aside: We should probably explicitly separate out the hardware
> error code from the software-munged version, like we do here:
> >         if (user_mode(regs)) {
> >                 local_irq_enable();
> >                 error_code |= X86_PF_USER;
> )
>
> But, #2 is a bit of a more loose check. It wasn't true for the recent
> vsyscall, and I've also seen goofy drivers map memory out to userspace
> quite a few times in the kernel address space.
>
> So, I'd much rather see a X86_PF_USER check than a
> fault_in_kernel_space() check.
>
> But, as for pkeys...
>
> The original intent here was to relay: "protection key faults can never
> be spurious". The reason in my silly comment was that we don't do lazy
> flushing, but that's imprecise: the real reasoning is that we don't ever
> have kernel pages on which we can take protection key faults.
>
> IOW, I think the check here should be for "protection key faults only
> occur on user pages", and all the *spurious* checking should be looking
> at *just* user vs. kernel pages, like:
>
> static int spurious_fault_check(unsigned long error_code, pte_t *pte)
> {
>         /* Only expect spurious faults on kernel pages: */
>         WARN_ON_ONCE(pte_flags(*pte) & _PAGE_USER);
>         /* Only expect spurious faults originating from kernel code: */
>         WARN_ON_ONCE(error_code & X86_PF_USER);
>         ...
>
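For anyone following along, the error_code=0x15 in the trace quoted above can be decoded by hand. Below is a minimal sketch in plain userspace C, using the architectural page-fault error code bit positions under their X86_PF_* names. pf_is_pkey() and pf_decode() are made-up helper names for illustration, not kernel functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* x86 page-fault error code bits; names follow the kernel's X86_PF_* flags. */
#define X86_PF_PROT  (1u << 0) /* 1: protection violation, 0: page not present */
#define X86_PF_WRITE (1u << 1) /* 1: write access, 0: read access */
#define X86_PF_USER  (1u << 2) /* 1: user-mode access, 0: supervisor access */
#define X86_PF_RSVD  (1u << 3) /* 1: reserved bit set in a paging entry */
#define X86_PF_INSTR (1u << 4) /* 1: instruction fetch */
#define X86_PF_PK    (1u << 5) /* 1: protection-key violation */

/* Hypothetical helper: true if the error code reports a pkey violation. */
static bool pf_is_pkey(unsigned int error_code)
{
	return (error_code & X86_PF_PK) != 0;
}

/* Hypothetical helper: render the set bits as "A|B|C" into buf. */
static const char *pf_decode(unsigned int error_code, char *buf, size_t len)
{
	static const struct { unsigned int bit; const char *name; } flags[] = {
		{ X86_PF_PROT, "PROT" },   { X86_PF_WRITE, "WRITE" },
		{ X86_PF_USER, "USER" },   { X86_PF_RSVD, "RSVD" },
		{ X86_PF_INSTR, "INSTR" }, { X86_PF_PK, "PK" },
	};
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		if (!(error_code & flags[i].bit))
			continue;
		int n = snprintf(buf + used, len - used, "%s%s",
				 used ? "|" : "", flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break; /* output truncated; stop appending */
		used += (size_t)n;
	}
	return buf;
}
```

With this, pf_decode(0x15, ...) gives "PROT|USER|INSTR" and pf_is_pkey(0x15) is false: a user-mode instruction fetch that hit a protection violation, with the PK bit (0x20) clear.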
Want to just send an alternative patch?

Also, I doubt it matters right here, but !X86_PF_USER isn't quite the
same thing as "originating from kernel code" -- it can also be user
code that does a CPL0 access due to exception delivery or access to a
descriptor table. Which you saw plenty of times while debugging
PTI... :) I doubt any of those should be spurious, though.

--Andy
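To make the !X86_PF_USER point concrete, here is a rough userspace model, not kernel code. struct fake_regs, fake_user_mode() and classify() are hypothetical names. fake_user_mode() assumes the 64-bit convention that the low two bits of CS carry the privilege level, which is roughly what user_mode(regs) tests:

```c
#include <stdbool.h>

#define X86_PF_USER (1u << 2) /* hardware error code: access was made in user mode */

/* Simplified stand-in for saved register state; not the kernel's struct pt_regs. */
struct fake_regs {
	unsigned long cs; /* code segment selector at the time of the fault */
};

/* Assumption: privilege level lives in the low two bits of CS. */
static bool fake_user_mode(const struct fake_regs *regs)
{
	return (regs->cs & 3) != 0;
}

enum fault_origin {
	ORIGIN_KERNEL_CODE,        /* kernel code, supervisor access */
	ORIGIN_USER_CODE,          /* user code, ordinary user-mode access */
	ORIGIN_USER_IMPLICIT_CPL0, /* user code, implicit supervisor access */
};

/*
 * The error code bit describes the access; the saved CS describes the
 * interrupted code. The two disagree when user code triggers an implicit
 * supervisor access (exception delivery, descriptor table access).
 */
static enum fault_origin classify(const struct fake_regs *regs,
				  unsigned int hw_error_code)
{
	if (hw_error_code & X86_PF_USER)
		return ORIGIN_USER_CODE;
	return fake_user_mode(regs) ? ORIGIN_USER_IMPLICIT_CPL0
				    : ORIGIN_KERNEL_CODE;
}
```

With a user CS such as 0x33 and an error code with the USER bit clear, classify() lands in the third bucket, which is the exception-delivery / descriptor-table case.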