On Thu, Aug 25, 2022 at 03:57:48PM +1000, Jonathan Gray wrote:
> On Wed, Aug 24, 2022 at 11:05:30PM -0500, Scott Cheloha wrote:
> > On Wed, Aug 24, 2022 at 05:51:14PM +1000, Jonathan Gray wrote:
> > > On Tue, Aug 23, 2022 at 12:20:39PM -0500, Scott Cheloha wrote:
> > > > > Hyper-V generation 1 VMs are bios boot with emulation of the usual
> > > > > devices.  32-bit and 64-bit guests.
> > > > > 
> > > > > Hyper-V generation 2 VMs are 64-bit uefi with paravirtualised devices.
> > > > > 64-bit guests only.
> > > > > 
> > > > > There is no 8254 in generation 2.
> > > > > No HPET in either generation.
> > > > > 
> > > > > hv_delay uses the "Partition Reference Counter MSR" described in
> > > > > https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/timers
> > > > > It seems it is available in both generations and could be used from 
> > > > > i386?
> > > > > 
> > > > > From reading that page hv_delay() should be preferred over 
> > > > > lapic_delay()
> > > > 
> > > > Alright, I have nudged hv_delay's quality up over lapic_delay's
> > > > quality.
> > > 
> > > Before these changes, tsc is probed before pvbus.  Do the tsc sanity
> > > checks result in it not being considered an option on hyper-v?  I think
> > > the tsc_delay and hv_delay numbers should be swapped in a later commit.
> > > It is unclear if that would change the final delay_func setting.
> > 
> > Why would we prefer hv_delay() to tsc_delay() if we had a
> > constant/invariant TSC available in our Hyper-V guest?
> > 
> > When patrick@ emailed me last year about issues with delay(9) on
> > Hyper-V, he started by saying that the root of the problem was that
> > the OpenBSD guest was not opting to use tsc_delay() because the host
> > wasn't reporting a constant/invariant TSC.  So the guest was trying to
> > use i8254_delay(), which was impossible because Hyper-V Gen2 guests
> > don't have an i8254.  Hence, hv_delay() was added to the tree.
> > 
> > So, my understanding is that the addition of hv_delay() does not mean
> > tsc_delay() is worse than hv_delay().  hv_delay() was added because
> > tsc_delay() isn't always an option and (to our surprise) neither is
> > i8254_delay().
> 
> I'm not clear on when rdtsc and rdmsr would cause a vm exit.
> Presumably the reference tsc page is provided to avoid that,
> but we don't use it.  rdtsc and rdmsr don't always cause an exit.
> 
> The wording of Microsoft's "Hypervisor Top Level Functional
> Specification" reads as the interface is only available when
> the underlying machine has a constant frequency tsc.  It also
> makes the point that the interface being in time not cycles avoids
> problems when the tsc frequency changes on live migration.
> 
> "12.3 Partition Reference Time Enlightenment
> 
> The partition reference time enlightenment presents a reference
> time source to a partition which does not require an intercept into
> the hypervisor. This enlightenment is available only when the
> underlying platform provides support of an invariant processor Time
> Stamp Counter (TSC), or iTSC. In such platforms, the processor TSC
> frequency remains constant irrespective of changes in the processor's
> clock frequency due to the use of power management states such as
> ACPI processor performance states, processor idle sleep states (ACPI
> C-states), etc.
> 
> The partition reference time enlightenment uses a virtual TSC value,
> an offset and a multiplier to enable a guest partition to compute
> the normalized reference time since partition creation, in 100nS
> units. The mechanism also allows a guest partition to atomically
> compute the reference time when the guest partition is migrated to
> a platform with a different TSC rate, and provides a fallback
> mechanism to support migration to platforms without the constant
> rate TSC feature.
> 
> This facility is not intended to be used a source of wall clock
> time, since the reference time computed using this facility will
> appear to stop during the time that a guest partition is saved until
> the subsequent restore."

If hv_delay() never causes a vm exit, but tsc_delay() *might* cause a
vm exit, and both have microsecond or better resolution, then
hv_delay() is the preferable delay(9) implementation where it is
available because vm exits have ambiguous overhead.

If that seems sensible to you, I'll commit this switch.

Index: arch/amd64/amd64/tsc.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
retrieving revision 1.26
diff -u -p -r1.26 tsc.c
--- arch/amd64/amd64/tsc.c      25 Aug 2022 17:38:16 -0000      1.26
+++ arch/amd64/amd64/tsc.c      29 Aug 2022 16:58:25 -0000
@@ -109,7 +109,7 @@ tsc_identify(struct cpu_info *ci)
 
        tsc_frequency = tsc_freq_cpuid(ci);
        if (tsc_frequency > 0)
-               delay_init(tsc_delay, 5000);
+               delay_init(tsc_delay, 4000);
 }
 
 static inline int
Index: dev/pv/pvbus.c
===================================================================
RCS file: /cvs/src/sys/dev/pv/pvbus.c,v
retrieving revision 1.25
diff -u -p -r1.25 pvbus.c
--- dev/pv/pvbus.c      25 Aug 2022 17:38:16 -0000      1.25
+++ dev/pv/pvbus.c      29 Aug 2022 16:58:26 -0000
@@ -320,7 +320,7 @@ pvbus_hyperv(struct pvbus_hv *hv)
 
 #if NHYPERV > 0
        if (hv->hv_features & CPUID_HV_MSR_TIME_REFCNT)
-               delay_init(hv_delay, 4000);
+               delay_init(hv_delay, 5000);
 #endif
 }
 

Reply via email to