On 26/10/2018 00:11, Andy Lutomirski wrote:
> On Thu, Oct 25, 2018 at 4:09 PM Andrew Cooper <andrew.coop...@citrix.com> wrote:
>> On 25/10/2018 07:09, Juergen Gross wrote:
>>> On 24/10/2018 21:41, Andrew Cooper wrote:
>>>> On 24/10/18 20:16, Andy Lutomirski wrote:
>>>>> On Tue, Oct 23, 2018 at 11:43 AM Chang S. Bae <chang.seok....@intel.com> wrote:
>>>>>> The helper functions will switch on faster accesses to FSBASE and GSBASE
>>>>>> when the FSGSBASE feature is enabled.
>>>>>>
>>>>>> Accessing user GSBASE needs a couple of SWAPGS operations. It is avoidable
>>>>>> if the user GSBASE is saved at kernel entry, being updated as changes, and
>>>>>> restored back at kernel exit. However, it seems to spend more cycles for
>>>>>> savings and restorations. Little or no benefit was measured from
>>>>>> experiments.
>>>>>>
>>>>>> Signed-off-by: Chang S. Bae <chang.seok....@intel.com>
>>>>>> Reviewed-by: Andi Kleen <a...@linux.intel.com>
>>>>>> Cc: Andy Lutomirski <l...@kernel.org>
>>>>>> Cc: H. Peter Anvin <h...@zytor.com>
>>>>>> Cc: Thomas Gleixner <t...@linutronix.de>
>>>>>> Cc: Ingo Molnar <mi...@kernel.org>
>>>>>> Cc: Dave Hansen <dave.han...@linux.intel.com>
>>>>>> ---
>>>>>>  arch/x86/include/asm/fsgsbase.h | 17 +++----
>>>>>>  arch/x86/kernel/process_64.c    | 82 +++++++++++++++++++++++++++------
>>>>>>  2 files changed, 75 insertions(+), 24 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
>>>>>> index b4d4509b786c..e500d771155f 100644
>>>>>> --- a/arch/x86/include/asm/fsgsbase.h
>>>>>> +++ b/arch/x86/include/asm/fsgsbase.h
>>>>>> @@ -57,26 +57,23 @@ static __always_inline void wrgsbase(unsigned long gsbase)
>>>>>>  		: "memory");
>>>>>>  }
>>>>>>
>>>>>> +#include <asm/cpufeature.h>
>>>>>> +
>>>>>>  /* Helper functions for reading/writing FS/GS base */
>>>>>>
>>>>>>  static inline unsigned long x86_fsbase_read_cpu(void)
>>>>>>  {
>>>>>>  	unsigned long fsbase;
>>>>>>
>>>>>> -	rdmsrl(MSR_FS_BASE, fsbase);
>>>>>> +	if (static_cpu_has(X86_FEATURE_FSGSBASE))
>>>>>> +		fsbase = rdfsbase();
>>>>>> +	else
>>>>>> +		rdmsrl(MSR_FS_BASE, fsbase);
>>>>>>
>>>>>>  	return fsbase;
>>>>>>  }
>>>>>>
>>>>>> -static inline unsigned long x86_gsbase_read_cpu_inactive(void)
>>>>>> -{
>>>>>> -	unsigned long gsbase;
>>>>>> -
>>>>>> -	rdmsrl(MSR_KERNEL_GS_BASE, gsbase);
>>>>>> -
>>>>>> -	return gsbase;
>>>>>> -}
>>>>>> -
>>>>>> +extern unsigned long x86_gsbase_read_cpu_inactive(void);
>>>>>>  extern void x86_fsbase_write_cpu(unsigned long fsbase);
>>>>>>  extern void x86_gsbase_write_cpu_inactive(unsigned long gsbase);
>>>>>>
>>>>>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
>>>>>> index 31b4755369f0..fcf18046c3d6 100644
>>>>>> --- a/arch/x86/kernel/process_64.c
>>>>>> +++ b/arch/x86/kernel/process_64.c
>>>>>> @@ -159,6 +159,36 @@ enum which_selector {
>>>>>>  	GS
>>>>>>  };
>>>>>>
>>>>>> +/*
>>>>>> + * Interrupts are disabled here. Out of line to be protected from kprobes.
>>>>>> + */
>>>>>> +static noinline __kprobes unsigned long rd_inactive_gsbase(void)
>>>>>> +{
>>>>>> +	unsigned long gsbase, flags;
>>>>>> +
>>>>>> +	local_irq_save(flags);
>>>>>> +	native_swapgs();
>>>>>> +	gsbase = rdgsbase();
>>>>>> +	native_swapgs();
>>>>>> +	local_irq_restore(flags);
>>>>>> +
>>>>>> +	return gsbase;
>>>>>> +}
>>>>> Please fold this into its only caller and make *that* noinline.
>>>>>
>>>>> Also, this function, and its "write" equivalent, will access the
>>>>> *active* gsbase. So it either needs to be fixed for Xen PV or some
>>>>> clear comment and careful auditing needs to be added to ensure that
>>>>> it's not used on Xen PV. Or it needs to be renamed
>>>>> native_x86_fsgsbase_... and add paravirt hooks, since Xen PV allows a
>>>>> very efficient but different implementation, I think. The latter is
>>>>> probably the right solution.
>>>>>
>>>>> (Hi Xen people -- how does CR4.FSGSBASE work on Xen? Is it always
>>>>> set? Never set? Set only if the guest tries to set it?)
>>>> FML. Seriously - whoever put this code into the hypervisor in the past
>>>> did an atrocious job. After some experimentation, you're going to be
>>>> sad and I'm declaring this borderline unusable.
>>>>
>>>> Looks like Xen unconditionally enabled CR4.FSGSBASE if it is available.
>>>> Therefore, PV guests can use the instructions, even if the bit is clear
>>>> in vCR4.
>>>>
>>>> The CPUID bits are exposed to guests by default, and Xen will emulate
>>>> vCR4.FSGSBASE being set and cleared.
>>>>
>>>> We don't however emulate swapgs (which is a cpl0 instruction). The
>>>> guest gets handed a #GP[0] instead.
>>>>
>>>> The Linux WRMSR PVop uses the set_segment_base() hypercall instead of
>>>> going through the full wrmsr emulation path.
>>>>
>>>> There is no equivalent get hypercall, so the only way I can see of
>>>> getting the value is to actually read MSR_KERNEL_GS_BASE and take the
>>>> full rdmsr emulation path.
>>> Or shadow the value in a percpu variable.
>> Hmm true, so long as no paths try to use native_rd{fs,gs}base() to
>> bypass the PVop.
> But *user* code can change the base. How is the kernel supposed to
> context-switch the user gsbase?

User code can change the user gs base. Xen will switch the user/kernel base
as appropriate on context switch, so the kernel is entered on the kernel gs
base.

But you are right - there is no way for Linux to peek at the current user gs
base without reading MSR_GS_SHADOW. (The user gs base can be set via a
hypercall, but not obtained.)

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
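
For anyone following the thread, here is a rough userspace toy model in C of the two points above: the swapgs / rdgsbase / swapgs read of the inactive base from the quoted patch, and why a shadow of the user gs base goes stale once user code can write the base itself. This is only a sketch under stated assumptions. It is not kernel or Xen code, every toy_* name is made up for illustration, and it models native swapgs semantics (on Xen PV the guest kernel gets #GP[0] on swapgs, as noted above).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one CPU's GS state. Hypothetical names, not kernel code. */
struct toy_cpu {
	uint64_t gs_base;        /* active base: what rdgsbase/wrgsbase touch */
	uint64_t kernel_gs_base; /* inactive base: what MSR_KERNEL_GS_BASE holds */
};

/* swapgs exchanges the active and the inactive base. */
static void toy_swapgs(struct toy_cpu *cpu)
{
	uint64_t tmp = cpu->gs_base;

	cpu->gs_base = cpu->kernel_gs_base;
	cpu->kernel_gs_base = tmp;
}

/* Models the swapgs / rdgsbase / swapgs sequence of rd_inactive_gsbase(). */
static uint64_t toy_rd_inactive_gsbase(struct toy_cpu *cpu)
{
	uint64_t base;

	toy_swapgs(cpu);
	base = cpu->gs_base;
	toy_swapgs(cpu);

	return base;
}

/* Kernel entry and exit each swap the bases once. */
static void toy_enter_kernel(struct toy_cpu *cpu) { toy_swapgs(cpu); }
static void toy_exit_kernel(struct toy_cpu *cpu)  { toy_swapgs(cpu); }

/* User-mode wrgsbase: no kernel involvement, so no shadow gets updated. */
static void toy_user_wrgsbase(struct toy_cpu *cpu, uint64_t val)
{
	cpu->gs_base = val;
}

/* Returns 1 if a kernel-side shadow of the user base went stale. */
static int toy_shadow_goes_stale(void)
{
	struct toy_cpu cpu = { .gs_base = 0x1000, .kernel_gs_base = 0xffff0000 };
	uint64_t shadow = cpu.gs_base; /* recorded when the kernel last set it */
	uint64_t real;

	toy_user_wrgsbase(&cpu, 0x2000); /* user changes the base on its own */

	toy_enter_kernel(&cpu);
	real = toy_rd_inactive_gsbase(&cpu); /* the user base is now inactive */
	toy_exit_kernel(&cpu);

	assert(real == 0x2000);
	return shadow != real;
}
```

In this model the shadow still holds 0x1000 while the real user base is 0x2000, so toy_shadow_goes_stale() returns 1. That is the gap being discussed: a shadow only tracks writes that go through the kernel.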