On Mon, Dec 4, 2017 at 6:07 AM, Thomas Gleixner <t...@linutronix.de> wrote:
> From: Dave Hansen <dave.han...@linux.intel.com>
>
> Add the pagetable helper functions do manage the separate user space page
> tables.
>
> [ tglx: Split out from the big combo kaiser patch ]

> +/*
> + * Take a PGD location (pgdp) and a pgd value that needs to be set there.
> + * Populates the user and returns the resulting PGD that must be set in
> + * the kernel copy of the page tables.
> + */
> +static inline pgd_t kpti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
> +{
> +#ifdef CONFIG_KERNEL_PAGE_TABLE_ISOLATION
> +	if (!static_cpu_has_bug(X86_BUG_CPU_SECURE_MODE_KPTI))
> +		return pgd;
> +
> +	if (pgd_userspace_access(pgd)) {
> +		if (pgdp_maps_userspace(pgdp)) {
> +			/*
> +			 * The user page tables get the full PGD,
> +			 * accessible from userspace:
> +			 */
> +			kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;
> +			/*
> +			 * For the copy of the pgd that the kernel uses,
> +			 * make it unusable to userspace.  This ensures on
> +			 * in case that a return to userspace with the
> +			 * kernel CR3 value, userspace will crash instead
> +			 * of running.
> +			 *
> +			 * Note: NX might be not available or disabled.
> +			 */
> +			if (__supported_pte_mask & _PAGE_NX)
> +				pgd.pgd |= _PAGE_NX;
> +		}
> +	} else if (pgd_userspace_access(*pgdp)) {
> +		/*
> +		 * We are clearing a _PAGE_USER PGD for which we presumably
> +		 * populated the user PGD.  We must now clear the user PGD
> +		 * entry.
> +		 */
> +		if (pgdp_maps_userspace(pgdp)) {
> +			kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;
> +		} else {
> +			/*
> +			 * Attempted to clear a _PAGE_USER PGD which is in
> +			 * the kernel porttion of the address space.  PGDs
> +			 * are pre-populated and we never clear them.
> +			 */
> +			WARN_ON_ONCE(1);
> +		}
> +	} else {
> +		/*
> +		 * _PAGE_USER was not set in either the PGD being set or
> +		 * cleared.  All kernel PGDs should be pre-populated so
> +		 * this should never happen after boot.
> +		 */
> +		WARN_ON_ONCE(system_state == SYSTEM_RUNNING);
> +	}
> +#endif
> +	/* return the copy of the PGD we want the kernel to use: */
> +	return pgd;
> +}
> +

I mentioned this earlier, but I think this should be:

	VM_BUG_ON(pgdp points to a usermode table);

	if (pgdp_maps_userspace(pgdp)) {
		/* Install the pgd as requested into the usermode tables. */
		kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;

		if (pgd_val(pgd) & _PAGE_USER) {
			/*
			 * This is a normal user pgd -- the kernelmode mapping
			 * should have NX set to prevent erroneous usermode
			 * execution with the kernel tables.
			 */
			return __pgd(pgd_val(pgd) | _PAGE_NX);
		} else {
			/* This is a weird mapping, e.g. EFI.  Map it straight through. */
			return pgd;
		}
	} else {
		/*
		 * We can get here due to vmalloc, a vmalloc fault, memory
		 * hot-add, or initial setup of kernelmode page tables.
		 * Regardless of which particular code path we're in, these
		 * mappings should not be automatically propagated to the
		 * usermode tables.
		 */
		return pgd;
	}

That should make all the VSYSCALL nastiness go away.
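
To make the control flow of that suggestion concrete, here is a minimal
user-space model in C. It is only a sketch and not kernel code. The
model_pgds array, the MODEL_PAGE_* bit values, points_to_user_table() and
model_set_user_pgd() are hypothetical names made up for illustration.
It assumes the user copy of the PGD sits one page (512 entries) after the
kernel copy and that the lower half of the entries covers user addresses.
As in the pseudocode above, it sets NX unconditionally and does not
consult __supported_pte_mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of this is the kernel's real code. */
#define PTRS_PER_PGD    512
#define MODEL_PAGE_USER (1ULL << 2)   /* stands in for _PAGE_USER */
#define MODEL_PAGE_NX   (1ULL << 63)  /* stands in for _PAGE_NX */

typedef struct { uint64_t pgd; } pgd_t;

/* Assumed layout: kernel copy in [0, 512), user copy in [512, 1024). */
static pgd_t model_pgds[2 * PTRS_PER_PGD];

/* True if pgdp lies in the user copy rather than the kernel copy. */
static bool points_to_user_table(const pgd_t *pgdp)
{
	return (pgdp - model_pgds) >= PTRS_PER_PGD;
}

/* True if the entry covers user addresses (lower half of a table). */
static bool pgdp_maps_userspace(const pgd_t *pgdp)
{
	return ((pgdp - model_pgds) % PTRS_PER_PGD) < PTRS_PER_PGD / 2;
}

/* Map a kernel-copy entry to the matching user-copy entry. */
static pgd_t *kernel_to_user_pgdp(pgd_t *pgdp)
{
	return pgdp + PTRS_PER_PGD;
}

/* Returns the value the caller should store in the kernel copy. */
static pgd_t model_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
{
	/* Stands in for VM_BUG_ON(pgdp points to a usermode table). */
	assert(!points_to_user_table(pgdp));

	/* Kernel-half entry: never propagated to the user copy. */
	if (!pgdp_maps_userspace(pgdp))
		return pgd;

	/* Install the pgd as requested into the user copy. */
	kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;

	/* Normal user pgd: the kernel copy additionally gets NX. */
	if (pgd.pgd & MODEL_PAGE_USER)
		pgd.pgd |= MODEL_PAGE_NX;

	/* Otherwise (e.g. an EFI-style mapping) pass it straight through. */
	return pgd;
}
```

In this model a caller would write the return value into the kernel
copy, i.e. *pgdp = model_set_user_pgd(pgdp, pgd). Three cases remain and
none of them needs a WARN: user-half entries with _PAGE_USER, user-half
entries without it, and kernel-half entries.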