> Date: Sat, 3 Apr 2021 22:21:02 -0500
> From: Scott Cheloha <scottchel...@gmail.com>
> 
> On Fri, Apr 02, 2021 at 10:37:36AM -0700, Mike Larkin wrote:
> > On Thu, Apr 01, 2021 at 06:43:30PM -0500, Scott Cheloha wrote:
> > > 
> > > [...]
> > > 
> > > Hmmm.  Being able to work around this would be nice.
> > >
> > > FreeBSD has code that uses WRMSR to synchronize the TSC:
> > >
> > > https://cgit.freebsd.org/src/commit/sys/x86/x86/tsc.c?id=b2c63698d4b81576e0c8842263ee86e86cd34e76
> > >
> > > My guess is that support for writing the TSC is not implemented by
> > > every hypervisor, so we would need to be very careful in deciding when
> > > to try it.  Otherwise we end up with protection faults and other crap
> > > we don't want.
> > >
> > 
> > We implemented rdmsr_safe for things like this. We could probably do the
> > same for wrmsr.
> 
> Like this?
> 
> Sorry if this is not idiomatic.  I don't write much assembly.
> 
> I tested this a bit on my laptop with something like:
> 
>       wrmsr_safe(MSR_TSC, rdtsc() + 100);
> 
> which seems to desync the normally synchronized TSCs here.

Before you go further down this path, please realize that "fixing" the
TSC this way is racy in much the same ways as determining the skew is.
If an SMI or a VM exit happens between reading the TSC and the actual
write to the MSR, the new value of the TSC will be off by quite a bit.

IIRC, newer CPUs have an MSR (IA32_TSC_ADJUST) that can apply a delta
to the TSC, which circumvents that race.
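
To illustrate the difference (a toy model, not real MSR access; the
latency value stands in for an SMI or VM exit of unknown length): a
direct TSC write loses whatever ticks elapse between computing the
target value and the write landing, while a relative adjustment is
immune to that delay:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of one CPU's TSC: a reading is now + base + adjust, where
 * "now" is global time, "base" models a raw TSC write, and "adjust"
 * models an IA32_TSC_ADJUST-style delta register.  "latency" stands
 * in for ticks lost to an SMI or VM exit before the wrmsr lands.
 */
static uint64_t now, tsc_base, tsc_adjust;

static uint64_t
model_rdtsc(void)
{
	return now + tsc_base + tsc_adjust;
}

/*
 * wrmsr(MSR_TSC, target): racy -- ticks that elapse before the write
 * lands are counted by the hardware but missing from "target".
 */
static void
model_wrmsr_tsc(uint64_t target, uint64_t latency)
{
	now += latency;			/* SMI/VM exit happens here */
	tsc_base = target - now;
	tsc_adjust = 0;
}

/*
 * wrmsr(MSR_TSC_ADJUST, adjust + delta): the delta is relative to the
 * running counter, so the interruption doesn't matter.
 */
static void
model_wrmsr_tsc_adjust(int64_t delta, uint64_t latency)
{
	now += latency;			/* same interruption */
	tsc_adjust += delta;
}
```

With a 50-tick skew and a large interruption, the adjust path removes
the skew exactly; the direct-write path ends up off by exactly the
interruption length.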

In the end I'm fairly skeptical about all the attempts to fix a
fundamentally broken aspect of the architecture by throwing more and
ever more complicated code at the problem.

> Unclear what the rules are for RETGUARD.  I just copied what was in
> rdmsr_safe().  Since we're not otherwise using R10, it's free to use?
> 
> -Scott
> 
> Index: include/cpufunc.h
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/include/cpufunc.h,v
> retrieving revision 1.36
> diff -u -p -r1.36 cpufunc.h
> --- include/cpufunc.h 13 Sep 2020 11:53:16 -0000      1.36
> +++ include/cpufunc.h 4 Apr 2021 03:16:48 -0000
> @@ -398,6 +398,7 @@ struct cpu_info_full;
>  void cpu_enter_pages(struct cpu_info_full *);
>  
>  int rdmsr_safe(u_int msr, uint64_t *);
> +int wrmsr_safe(uint32_t msr, uint64_t);
>  
>  #endif /* _KERNEL */
>  
> Index: amd64/locore.S
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/locore.S,v
> retrieving revision 1.122
> diff -u -p -r1.122 locore.S
> --- amd64/locore.S    3 Nov 2020 18:19:31 -0000       1.122
> +++ amd64/locore.S    4 Apr 2021 03:16:48 -0000
> @@ -1154,6 +1154,30 @@ NENTRY(rdmsr_resume)
>       ret
>  END(rdmsr_safe)
>  
> +/* int wrmsr_safe(uint32_t msr, uint64_t val) */
> +ENTRY(wrmsr_safe)
> +     RETGUARD_SETUP(wrmsr_safe, r10)
> +
> +     movl    %edi,   %ecx    /* uint32_t msr */
> +
> +     movl    %esi,   %eax    /* uint64_t val */
> +     sarq    $32,    %rsi
> +     movl    %esi,   %edx
> +
> +     .globl  wrmsr_safe_fault
> +wrmsr_safe_fault:
> +     wrmsr
> +
> +     xorq    %rax,   %rax
> +     RETGUARD_CHECK(wrmsr_safe, r10)
> +     ret
> +
> +NENTRY(wrmsr_resume)
> +     movq    $0x1,   %rax
> +     RETGUARD_CHECK(wrmsr_safe, r10)
> +     ret
> +END(wrmsr_safe)
> +
>  #if NXEN > 0
>       /* Hypercall page needs to be page aligned */
>       .text
> 
> 
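
As an aside on the stub itself: WRMSR consumes the 64-bit value split
across EDX:EAX (high:low halves), which is what the movl/sarq/movl
sequence is doing.  The same split in C, as a standalone illustration
(not kernel code, and split_for_wrmsr is just a name for this sketch):

```c
#include <assert.h>
#include <stdint.h>

/*
 * WRMSR takes the 64-bit value split across two 32-bit registers:
 * EAX holds bits 0-31 and EDX holds bits 32-63.  This mirrors the
 * movl/sarq/movl sequence in the stub above.
 */
static void
split_for_wrmsr(uint64_t val, uint32_t *eax, uint32_t *edx)
{
	*eax = (uint32_t)val;		/* movl %esi, %eax */
	*edx = (uint32_t)(val >> 32);	/* sarq $32, %rsi; movl %esi, %edx */
}
```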
