Re: 7.2: tsc timecounter running too fast on ESXi 7.5

2022-10-26 Thread Kalabic S.




On 10/26/22 23:07, Scott Cheloha wrote:


> In summary:
>
> - OpenBSD 7.2 amd64 kernel TSC and lapic calibration is broken on
>   (at least) some ESXi 6.0 and ESXi 7.5 hosts under the VM configuration
>   "FreeBSD (32-bit)".  The ACPI PM timer seemingly accelerates when we
>   read it repeatedly during boot.
>
> - Workaround 1 is to change the configuration to "FreeBSD (64-bit)".
>
> - Workaround 2 is to not install acpitimer_delay() with delay_init()
>   during acpitimerattach().



Maybe you noticed already, but on the openbsd-misc list I suggested
making the VMware Tools driver advertise the OS as "FreeBSD (64-bit)"
rather than the 32-bit version, which would make workaround 1 the
default system setting.


https://marc.info/?l=openbsd-misc&m=166680569110622&w=2



Re: 7.2: tsc timecounter running too fast on ESXi 7.5

2022-10-26 Thread Scott Cheloha
On Wed, Oct 26, 2022 at 03:23:51PM +0200, Kalabic S. wrote:
> On 26/10/2022 11:33, Scott Cheloha wrote:
> > There might be a second workaround.  Kalabic mentions here in the
> > other thread about this problem:
> > 
> > https://marc.info/?l=openbsd-bugs=14949825616=2
> > 
> > ... that changing the ESXi option "Guest OS Version" from "FreeBSD
> > (32-bit)" to "FreeBSD (64-bit)" seemed to fix the problem on his
> > version of ESXi.  Does that work for you?  I don't know what the other
> > consequences of that configuration change are, but it might be worth a
> > try if you prefer to run 7.2-RELEASE or 7.2-STABLE instead of patching
> > -current.
> > 
> > Do you have VMware support?  Is there any way for you to report this
> > problem to them?  It's unlikely they explicitly support running an
> > OpenBSD guest, but it's plausible this issue could affect other
> > operating systems.  I can't imagine OpenBSD is reading the ACPI PM
> > timer differently than Linux or FreeBSD.
> > 
> 
> This may or may not be related, but there is an official paper from VMware
> that describes several known timekeeping issues and how to correct or work
> around them:
> https://www.cse.psu.edu/~buu1/teaching/spring06/papers/vmware-timing.pdf

I did see this, thanks for posting it.  It's not immediately useful here,
though.

> Also, pardon my ignorance about TSC counters and related matters, but from a
> look at the corresponding FreeBSD code, it seems to take into account that it
> is running as a hypervisor guest (ESXi or Xen).
> https://github.com/freebsd/freebsd-src/blob/main/sys/x86/x86/tsc.c
> 
> Is there a detail there that makes a difference when a different "Guest OS
> Version" is used? Note that I have no idea what is happening there.
> 
> So, just as some AMD-related TSC improvements were introduced into
> OpenBSD recently, maybe this issue can only be solved properly by doing
> something similar for hypervisor guests?

I would like to derive the TSC and lapic frequency from the hypervisor
CPUID leaves when they are available to avoid calibration.  It's on my
todo list.
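
A rough sketch of what deriving the frequencies from the hypervisor
leaves could look like, written in Python for readability rather than
as kernel code.  It assumes the VMware timing leaf 0x40000010, which
(as the FreeBSD tsc.c linked above also relies on) reports the TSC
frequency in kHz in EAX and, by the same convention, the bus/lapic
timer frequency in kHz in EBX.  The function names and the sample
register values are hypothetical; a real implementation would execute
CPUID itself and first check the maximum leaf reported by 0x40000000.

```python
HV_BASE_LEAF = 0x40000000    # EAX of this leaf gives the highest hypervisor leaf
HV_TIMING_LEAF = 0x40000010  # timing leaf: EAX = TSC kHz, EBX = bus/lapic kHz


def timing_leaf_available(max_hv_leaf):
    """Return True if the hypervisor advertises the timing leaf."""
    return max_hv_leaf >= HV_TIMING_LEAF


def freqs_from_timing_leaf(eax, ebx):
    """Convert the kHz values from the timing leaf into Hz.

    Returns (tsc_hz, lapic_hz).  No calibration loop is involved, so a
    misbehaving reference timer cannot skew the result.
    """
    return eax * 1000, ebx * 1000


# Hypothetical register contents, NOT read from the machine in this thread.
if timing_leaf_available(max_hv_leaf=0x40000010):
    tsc_hz, lapic_hz = freqs_from_timing_leaf(eax=1_900_000, ebx=66_000)
    print(tsc_hz, lapic_hz)
```

Calibration against acpitimer0 or acpihpet0 would then only be needed
as a fallback when the leaf is absent.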



Re: 7.2: tsc timecounter running too fast on ESXi 7.5

2022-10-26 Thread Scott Cheloha
On Wed, Oct 26, 2022 at 07:36:28AM -0700, James J. Lippard wrote:
> On Wed, Oct 26, 2022 at 04:33:23AM -0500, Scott Cheloha wrote:
> > Thank you for testing, let's take a look.
> > [...]
> > I don't know how to explain this.  Maybe another developer will read
> > this and spot something I'm missing.  Or maybe this is a known issue
> > and I'm just not finding a reference to it online.
> > 
> > The simplest workaround is to skip installing acpitimer_delay() with
> > delay_init() during acpitimerattach().  The attached patch does this.
> 
> Can confirm that this works.

Good.

> > I don't know if this problem persists after boot.  If it does, using
> > the acpitimer0 timecounter may yield strange results in the VM.  I
> > recommend not using the acpitimer0 timecounter until the problem is
> > better understood.  A calibrated TSC is going to be a better
> > timecounter anyway.
> > 
> > There might be a second workaround.  Kalabic mentions here in the
> > other thread about this problem:
> > 
> > https://marc.info/?l=openbsd-bugs=14949825616=2
> > 
> > ... that changing the ESXi option "Guest OS Version" from "FreeBSD
> > (32-bit)" to "FreeBSD (64-bit)" seemed to fix the problem on his
> > version of ESXi.  Does that work for you?  I don't know what the other
> > consequences of that configuration change are, but it might be worth a
> > try if you prefer to run 7.2-RELEASE or 7.2-STABLE instead of patching
> > -current.
> 
> I can also confirm that this works as a workaround on the stock 7.2 kernel.
> I also booted with the last kernel with debugging info with this workaround;
> dmesg for that is below.

Even better, and thank you for double-checking with the patched
kernel.

> > Do you have VMware support?  Is there any way for you to report this
> > problem to them?  It's unlikely they explicitly support running an
> > OpenBSD guest, but it's plausible this issue could affect other
> > operating systems.  I can't imagine OpenBSD is reading the ACPI PM
> > timer differently than Linux or FreeBSD.
> 
> Unfortunately not, I only use the free vSphere ESXi.

Drat.

> OpenBSD 7.2-current (GENERIC.MP) #1: Tue Oct 25 20:09:51 MST 2022
> lipp...@chaos.int.discord.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> [snip]
> measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: 
> count 14211444 14569397 tsc 38873326169 39063325035 usecs 9: 197660 Hz
> measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: 
> cycles 357958 tsc 190001742: 188842 Hz
> measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: 
> count 14939119 15297049 tsc 39259571275 39449557759 usecs 3: 187839 Hz
> measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: 
> cycles 357955 tsc 18897: 186316 Hz
> measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: 
> count 15666102 16024022 tsc 39645448713 39835430133 usecs 0: 194200 Hz
> measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: 
> cycles 357954 tsc 18157: 184223 Hz
> [snip]
> acpihpet0 at acpi0: 14318179 Hz
> measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: 
> count 8315 1439858 tsc 42184173245 42374137028 usecs 99980: 1900017833 Hz
> measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: 
> cycles 1431819 tsc 18907: 187610 Hz
> measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: 
> count 2894563 4326110 tsc 42567173659 42757137699 usecs 99981: 191400 Hz
> measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: 
> cycles 1431826 tsc 19836: 187611 Hz
> measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: 
> count 5781139 7212684 tsc 42950217351 43140181114 usecs 99980: 1900017633 Hz
> measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: 
> cycles 1431826 tsc 19909: 188341 Hz

This looks right.

In summary:

- OpenBSD 7.2 amd64 kernel TSC and lapic calibration is broken on
  (at least) some ESXi 6.0 and ESXi 7.5 hosts under the VM configuration
  "FreeBSD (32-bit)".  The ACPI PM timer seemingly accelerates when we
  read it repeatedly during boot.

- Workaround 1 is to change the configuration to "FreeBSD (64-bit)".

- Workaround 2 is to not install acpitimer_delay() with delay_init()
  during acpitimerattach().



Re: 7.2: tsc timecounter running too fast on ESXi 7.5

2022-10-26 Thread James J. Lippard
On Wed, Oct 26, 2022 at 04:33:23AM -0500, Scott Cheloha wrote:
> Thank you for testing, let's take a look.
> [...]
> I don't know how to explain this.  Maybe another developer will read
> this and spot something I'm missing.  Or maybe this is a known issue
> and I'm just not finding a reference to it online.
> 
> The simplest workaround is to skip installing acpitimer_delay() with
> delay_init() during acpitimerattach().  The attached patch does this.

Can confirm that this works.

> I don't know if this problem persists after boot.  If it does, using
> the acpitimer0 timecounter may yield strange results in the VM.  I
> recommend not using the acpitimer0 timecounter until the problem is
> better understood.  A calibrated TSC is going to be a better
> timecounter anyway.
> 
> There might be a second workaround.  Kalabic mentions here in the
> other thread about this problem:
> 
> https://marc.info/?l=openbsd-bugs=14949825616=2
> 
> ... that changing the ESXi option "Guest OS Version" from "FreeBSD
> (32-bit)" to "FreeBSD (64-bit)" seemed to fix the problem on his
> version of ESXi.  Does that work for you?  I don't know what the other
> consequences of that configuration change are, but it might be worth a
> try if you prefer to run 7.2-RELEASE or 7.2-STABLE instead of patching
> -current.

I can also confirm that this works as a workaround on the stock 7.2 kernel.
I also booted with the last kernel with debugging info with this workaround;
dmesg for that is below.

> Do you have VMware support?  Is there any way for you to report this
> problem to them?  It's unlikely they explicitly support running an
> OpenBSD guest, but it's plausible this issue could affect other
> operating systems.  I can't imagine OpenBSD is reading the ACPI PM
> timer differently than Linux or FreeBSD.

Unfortunately not, I only use the free vSphere ESXi.

OpenBSD 7.2-current (GENERIC.MP) #1: Tue Oct 25 20:09:51 MST 2022
lipp...@chaos.int.discord.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 6424494080 (6126MB)
avail mem = 6212374528 (5924MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xe0010 (242 entries)
bios0: vendor Phoenix Technologies LTD version "6.00" date 11/12/2020
bios0: VMware, Inc. VMware Virtual Platform
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP BOOT APIC MCFG SRAT HPET WAET
acpi0: wakeup devices PCI0(S3) USB_(S1) P2P0(S3) S1F0(S3) S2F0(S3) S8F0(S3) 
S16F(S3) S18F(S3) S22F(S3) S23F(S3) S24F(S3) S25F(S3) PE40(S3) S1F0(S3) 
PE50(S3) S1F0(S3) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.76 MHz, 06-56-03
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 
8-way L2 cache, 9MB 64b/line 12-way L3 cache
measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: count 
14211444 14569397 tsc 38873326169 39063325035 usecs 9: 197660 Hz
measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: cycles 
357958 tsc 190001742: 188842 Hz
measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: count 
14939119 15297049 tsc 39259571275 39449557759 usecs 3: 187839 Hz
measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: cycles 
357955 tsc 18897: 186316 Hz
measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: count 
15666102 16024022 tsc 39645448713 39835430133 usecs 0: 194200 Hz
measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: cycles 
357954 tsc 18157: 184223 Hz
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 65MHz
delay_init: changing delay implementation: 0 -> 3000
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.69 MHz, 06-56-03
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 
8-way L2 cache, 9MB 64b/line 12-way L3 cache
cpu1: smt 0, core 0, package 2

Re: 7.2: tsc timecounter running too fast on ESXi 7.5

2022-10-26 Thread Kalabic S.

On 26/10/2022 11:33, Scott Cheloha wrote:

> There might be a second workaround.  Kalabic mentions here in the
> other thread about this problem:
>
> https://marc.info/?l=openbsd-bugs=14949825616=2
>
> ... that changing the ESXi option "Guest OS Version" from "FreeBSD
> (32-bit)" to "FreeBSD (64-bit)" seemed to fix the problem on his
> version of ESXi.  Does that work for you?  I don't know what the other
> consequences of that configuration change are, but it might be worth a
> try if you prefer to run 7.2-RELEASE or 7.2-STABLE instead of patching
> -current.
>
> Do you have VMware support?  Is there any way for you to report this
> problem to them?  It's unlikely they explicitly support running an
> OpenBSD guest, but it's plausible this issue could affect other
> operating systems.  I can't imagine OpenBSD is reading the ACPI PM
> timer differently than Linux or FreeBSD.



This may or may not be related, but there is an official paper from
VMware that describes several known timekeeping issues and how to
correct or work around them:
https://www.cse.psu.edu/~buu1/teaching/spring06/papers/vmware-timing.pdf



Also, pardon my ignorance about TSC counters and related matters, but
from a look at the corresponding FreeBSD code, it seems to take into
account that it is running as a hypervisor guest (ESXi or Xen).

https://github.com/freebsd/freebsd-src/blob/main/sys/x86/x86/tsc.c

Is there a detail there that makes a difference when a different "Guest
OS Version" is used? Note that I have no idea what is happening there.


So, just as some AMD-related TSC improvements were introduced into
OpenBSD recently, maybe this issue can only be solved properly by doing
something similar for hypervisor guests?


I have found that similar issues were previously reported for FreeBSD
and other virtual machines:
- "Time drift/system clock too fast on a PFSense VM":
  https://forum.netgate.com/topic/108653/time-drift-system-clock-too-fast-on-a-pfsense-vm
- "Clock on ADC VPX hosted on VMware is running very fast causing
  exchange issues":
  https://support.citrix.com/article/CTX335923/clock-on-adc-vpx-hosted-on-vmware-is-running-very-fast-causing-exchange-issues
- ... and more can easily be googled.



Re: 7.2: tsc timecounter running too fast on ESXi 7.5

2022-10-26 Thread Scott Cheloha
On Tue, Oct 25, 2022 at 09:00:33PM -0700, James J. Lippard wrote:
> On Tue, Oct 25, 2022 at 09:20:05PM -0500, Scott Cheloha wrote:
> > On Tue, Oct 25, 2022 at 02:24:24PM -0700, James J. Lippard wrote:
> > > I'm one of several people experiencing this issue with OpenBSD 7.2 on
> > > VMware ESXi 7.5. Scott C. has given me help in trying to track the issue
> > > down; a patched -current kernel to remove the acpi_delay code added in
> > > 7.2 makes the issue go away.
> > 
> > Thanks for your report.
> > 
> > I have one more patch for you to try.  Attached at the end.  Hopefully
> > it will confirm the root problem.  Send the resulting dmesg and we'll
> > see whether the problem is actually the acpitimer(4).
> 
> >[...]
> > Okay, here is the third patch.  Revert the earlier one and boot this.
> 
> Here's the dmesg output running with this new patch:

Thank you for testing, let's take a look.

> OpenBSD 7.2-current (GENERIC.MP) #1: Tue Oct 25 20:09:51 MST 2022
> lipp...@chaos.int.discord.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> [snip]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.77 MHz, 06-56-03
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 
> 64b/line 8-way L2 cache, 9MB 64b/line 12-way L3 cache
> measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: 
> count 12840801 13198720 tsc 8350048970 8540029885 usecs 0: 189149 Hz
> measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: 
> cycles 357969 tsc 62919804: 629172553 Hz
> measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: 
> count 13562994 13686416 tsc 8608912502 8798895525 usecs 34479: (failed)
> measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: 
> count 13692684 14050605 tsc 880961 8992204988 usecs 0: 1900010271 Hz
> measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: 
> cycles 357969 tsc 64754894: 647522710 Hz

When we do "indirect calibration," we're using delay(9) to spin for
~100,000 microseconds in between reads of the reference timer and the
TSC.  In this case, the underlying delay(9) implementation is
i8254_delay().  This method calibrates the TSC to 1900 MHz.  For
example, in the first indirect calibration round we get:

  (tsc2 - tsc1) * acpitimer-frequency / (acpitimer2 - acpitimer1)
= (8540029885 - 8350048970) * 3579545 / (13198720 - 12840801)
= 1899997581

or roughly 1900 MHz.  The result printed in the dmesg (189149) is
a little different because the math in the kernel is a little
different.  The third indirect calibration round yields basically the
same result (1900010271).
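
The indirect-calibration arithmetic above can be reproduced with a
short Python sketch.  The function name and argument order are
illustrative, not the kernel's, and integer division is an assumption
about how the result is rounded.  The inputs are the counter values
from the first indirect round in the dmesg.

```python
ACPI_TIMER_HZ = 3_579_545  # nominal ACPI PM timer frequency from the dmesg


def indirect_tsc_hz(tsc1, tsc2, ref1, ref2, ref_hz=ACPI_TIMER_HZ):
    """Estimate the TSC frequency from two (reference timer, TSC) samples
    taken before and after a delay(9) spin.

    The elapsed time is inferred from the reference timer delta, so the
    estimate is only as good as the reference timer's real tick rate.
    Counter wraparound (the ACPI timer is 24 bits wide) is not handled.
    """
    return (tsc2 - tsc1) * ref_hz // (ref2 - ref1)


# First indirect calibration round: count 12840801 13198720,
# tsc 8350048970 8540029885.
first_round = indirect_tsc_hz(8350048970, 8540029885, 12840801, 13198720)
print(first_round)  # 1899997581
```

Because the reference timer is read only twice here, it behaves at its
nominal rate and the estimate lands within a few kHz of 1900 MHz.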

When we do "direct calibration," we're reading the reference timer
itself repeatedly to spin for ~100,000 microseconds and accumulating a
count of reference timer cycles and TSC cycles as we spin.  This
method calibrates the TSC to ~630 MHz.  For example, in the first
direct calibration round we get:

  tsc-cycles * acpitimer-frequency / acpitimer-cycles
= 62919804 * 3579545 / 357969
= 629172553

or roughly 630 MHz.  The second direct calibration round yields a
similar result (647522710).
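
The same check for the direct method, again as an illustrative Python
sketch with hypothetical names and assumed integer division.  The
cycle counts are the ones printed by the two direct calibration rounds
in the dmesg above.

```python
ACPI_TIMER_HZ = 3_579_545  # nominal ACPI PM timer frequency from the dmesg


def direct_tsc_hz(tsc_cycles, ref_cycles, ref_hz=ACPI_TIMER_HZ):
    """Estimate the TSC frequency from cycle counts accumulated while
    spinning on the reference timer itself."""
    return tsc_cycles * ref_hz // ref_cycles


def nominal_spin_usecs(ref_cycles, ref_hz=ACPI_TIMER_HZ):
    """How long the spin lasted, if the timer ticked at its nominal rate."""
    return ref_cycles * 1_000_000 // ref_hz


# (tsc cycles, acpitimer cycles) from the two direct rounds.
rounds = [(62919804, 357969), (64754894, 357969)]
results = [direct_tsc_hz(tsc, ref) for tsc, ref in rounds]
print(results)                     # [629172553, 647522710]
print(nominal_spin_usecs(357969))  # 100004
```

The timer claims that about 100,000 microseconds elapsed in each round,
yet the TSC advanced by only about a third of what a 1900 MHz counter
would cover in that time.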

Based on these numbers, I think the virtual ACPI PM Timer on this ESXi
VM accelerates beyond 3579545 Hz when it is read repeatedly and then
decelerates back down to the nominal frequency when it is read less
frequently.
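
To put a rough number on that suspected acceleration, the following
Python sketch works backwards from the two calibration results.  It
assumes the TSC really does run at the constant ~1900 MHz seen in the
indirect rounds; under that assumption the ratio between the two
results is the average factor by which the timer ran fast during the
spin.  The function name is hypothetical.

```python
NOMINAL_ACPI_HZ = 3_579_545  # nominal ACPI PM timer frequency


def implied_timer_hz(true_tsc_hz, measured_tsc_hz, nominal_hz=NOMINAL_ACPI_HZ):
    """Average reference timer rate that would explain a too-low TSC
    measurement, assuming the TSC itself is constant."""
    return nominal_hz * true_tsc_hz / measured_tsc_hz


true_tsc_hz = 1_900_010_271    # third indirect round
measured_tsc_hz = 629_172_553  # first direct round

speedup = true_tsc_hz / measured_tsc_hz
implied_hz = implied_timer_hz(true_tsc_hz, measured_tsc_hz)
print(f"speedup ~{speedup:.2f}x, implied timer rate ~{implied_hz / 1e6:.1f} MHz")
```

On these numbers the timer would have been ticking roughly three times
faster than nominal while it was being polled.  This is only an
average over the spin, not a measurement of the timer itself.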

I don't think the TSC itself has a non-constant frequency.  When we
calibrate it later with the HPET, both indirect calibration using the
local apic timer to spin and direct calibration using only the HPET
yield a TSC frequency of ~1900 MHz:

> [snip]
> cpu0: apic clock running at 65MHz
> delay_init: changing delay implementation: 0 -> 3000

(Here we switch from i8254_delay() to lapic_delay().)

> [snip]
> acpihpet0 at acpi0: 14318179 Hz
> measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: 
> count 7984 1439544 tsc 11218877272 11408843078 usecs 99981: 1900019063 Hz
> measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: 
> cycles 1431817 tsc 18744: 188634 Hz
> measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: 
> count 2894172 4325743 tsc 11601869571 11791837035 usecs 99982: 1900016642 Hz
> measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: 
> cycles 1431826 tsc 19912: 188371 Hz
> measure_tsc_freq: indirect calibration with 

Re: 7.2: tsc timecounter running too fast on ESXi 7.5

2022-10-25 Thread James J. Lippard
On Tue, Oct 25, 2022 at 09:20:05PM -0500, Scott Cheloha wrote:
> On Tue, Oct 25, 2022 at 02:24:24PM -0700, James J. Lippard wrote:
> > I'm one of several people experiencing this issue with OpenBSD 7.2 on
> > VMware ESXi 7.5. Scott C. has given me help in trying to track the issue
> > down; a patched -current kernel to remove the acpi_delay code added in
> > 7.2 makes the issue go away.
> 
> Thanks for your report.
> 
> I have one more patch for you to try.  Attached at the end.  Hopefully
> it will confirm the root problem.  Send the resulting dmesg and we'll
> see whether the problem is actually the acpitimer(4).

>[...]
> Okay, here is the third patch.  Revert the earlier one and boot this.

Here's the dmesg output running with this new patch:

OpenBSD 7.2-current (GENERIC.MP) #1: Tue Oct 25 20:09:51 MST 2022
lipp...@chaos.int.discord.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 6424494080 (6126MB)
avail mem = 6212374528 (5924MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xe0010 (242 entries)
bios0: vendor Phoenix Technologies LTD version "6.00" date 11/12/2020
bios0: VMware, Inc. VMware Virtual Platform
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP BOOT APIC MCFG SRAT HPET WAET
acpi0: wakeup devices PCI0(S3) USB_(S1) P2P0(S3) S1F0(S3) S2F0(S3) S8F0(S3) 
S16F(S3) S18F(S3) S22F(S3) S23F(S3) S24F(S3) S25F(S3) PE40(S3) S1F0(S3) 
PE50(S3) S1F0(S3) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.77 MHz, 06-56-03
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 
8-way L2 cache, 9MB 64b/line 12-way L3 cache
measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: count 
12840801 13198720 tsc 8350048970 8540029885 usecs 0: 189149 Hz
measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: cycles 
357969 tsc 62919804: 629172553 Hz
measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: count 
13562994 13686416 tsc 8608912502 8798895525 usecs 34479: (failed)
measure_tsc_freq: indirect calibration with acpitimer0(1000), 3579545 Hz: count 
13692684 14050605 tsc 880961 8992204988 usecs 0: 1900010271 Hz
measure_tsc_freq: direct calibration with acpitimer0(1000), 3579545 Hz: cycles 
357969 tsc 64754894: 647522710 Hz
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 65MHz
delay_init: changing delay implementation: 0 -> 3000
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.68 MHz, 06-56-03
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 
8-way L2 cache, 9MB 64b/line 12-way L3 cache
cpu1: smt 0, core 0, package 2
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf000, bus 0-127
acpihpet0 at acpi0: 14318179 Hz
measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: count 
7984 1439544 tsc 11218877272 11408843078 usecs 99981: 1900019063 Hz
measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: cycles 
1431817 tsc 18744: 188634 Hz
measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: count 
2894172 4325743 tsc 11601869571 11791837035 usecs 99982: 1900016642 Hz
measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: cycles 
1431826 tsc 19912: 188371 Hz
measure_tsc_freq: indirect calibration with acpihpet0(1000), 14318179 Hz: count 
5780812 7212468 tsc 11984921805 12174900576 usecs 99988: 1900015711 Hz
measure_tsc_freq: direct calibration with acpihpet0(1000), 14318179 Hz: cycles 
1431877 tsc 190007695: 188525 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
acpicmos0 at acpi0
"PNP0A05" at acpi0 not configured
acpiac0 at acpi0: AC unit online
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
cpu0: using VERW MDS workaround
pvbus0 at 

Re: 7.2: tsc timecounter running too fast on ESXi 7.5

2022-10-25 Thread Scott Cheloha
On Tue, Oct 25, 2022 at 02:24:24PM -0700, James J. Lippard wrote:
> I'm one of several people experiencing this issue with OpenBSD 7.2 on
> VMware ESXi 7.5. Scott C. has given me help in trying to track the issue
> down; a patched -current kernel to remove the acpi_delay code added in
> 7.2 makes the issue go away.

Thanks for your report.

I have one more patch for you to try.  Attached at the end.  Hopefully
it will confirm the root problem.  Send the resulting dmesg and we'll
see whether the problem is actually the acpitimer(4).

> Below is output from sysctl machdep, sysctl hw, and dmesg:
> 
> [...]
> machdep.tscfreq=1900013052
> machdep.invarianttsc=1

This is probably the true TSC frequency...

> sysctl hw:
> hw.machine=amd64
> hw.model=Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz

... it would match the nominal CPU frequency listed in the CPU string.
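
That comparison can be made mechanical.  The Python sketch below
extracts the nominal frequency from the hw.model string and reports
how far a measured value is from it.  The helper names are
hypothetical, and the "@ N.NNGHz" suffix is an assumption about the
brand string that holds for this CPU but not for every model.

```python
import re


def nominal_hz_from_model(model):
    """Parse the '@ 1.90GHz' suffix of a CPU brand string into Hz.

    Returns None when the string carries no such suffix.
    """
    match = re.search(r"@\s*([0-9.]+)\s*GHz", model)
    if match is None:
        return None
    return round(float(match.group(1)) * 1_000_000_000)


def relative_error(measured_hz, nominal_hz):
    """Fractional distance between a measured and a nominal frequency."""
    return abs(measured_hz - nominal_hz) / nominal_hz


model = "Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz"
nominal = nominal_hz_from_model(model)

# machdep.tscfreq from the sysctl output quoted above.
print(nominal, relative_error(1_900_013_052, nominal))
```

The reported machdep.tscfreq is within a small fraction of a percent of
nominal, whereas the 586.43 MHz printed by the 7.2 kernel in the dmesg
below is off by well over half.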

I'm going to snip some of these dmesgs for reference, too.

> dmesg (which includes 7.1, post-upgrade, and with patched -current
> kernel):
> 
> OpenBSD 7.1 (GENERIC.MP) #3: Sun May 15 10:27:01 MDT 2022
> 
> r...@syspatch-71-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 6424494080 (6126MB)
> avail mem = 6212485120 (5924MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xe0010 (242 entries)
> bios0: vendor Phoenix Technologies LTD version "6.00" date 11/12/2020
> bios0: VMware, Inc. VMware Virtual Platform
> acpi0 at bios0: ACPI 4.0
> acpi0: sleep states S0 S1 S4 S5
> acpi0: tables DSDT FACP BOOT APIC MCFG SRAT HPET WAET
> acpi0: wakeup devices PCI0(S3) USB_(S1) P2P0(S3) S1F0(S3) S2F0(S3) S8F0(S3) 
> S16F(S3) S18F(S3) S22F(S3) S23F(S3) S24F(S3) S25F(S3) PE40(S3) S1F0(S3) 
> PE50(S3) S1F0(S3) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.75 MHz, 06-56-03
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 65MHz
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.62 MHz, 06-56-03
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: disabling user TSC (skew=-2507)
> cpu1: smt 0, core 0, package 2
> ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
> acpimcfg0 at acpi0
> acpimcfg0: addr 0xf000, bus 0-127
> acpihpet0 at acpi0: 14318179 Hz

7.1-release.  This dmesg worked right.

> OpenBSD 7.2 (GENERIC.MP) #758: Tue Sep 27 11:57:54 MDT 2022
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 6424494080 (6126MB)
> avail mem = 6212378624 (5924MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xe0010 (242 entries)
> bios0: vendor Phoenix Technologies LTD version "6.00" date 11/12/2020
> bios0: VMware, Inc. VMware Virtual Platform
> acpi0 at bios0: ACPI 4.0
> acpi0: sleep states S0 S1 S4 S5
> acpi0: tables DSDT FACP BOOT APIC MCFG SRAT HPET WAET
> acpi0: wakeup devices PCI0(S3) USB_(S1) P2P0(S3) S1F0(S3) S2F0(S3) S8F0(S3) 
> S16F(S3) S18F(S3) S22F(S3) S23F(S3) S24F(S3) S25F(S3) PE40(S3) S1F0(S3) 
> PE50(S3) S1F0(S3) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 586.43 MHz, 06-56-03
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 
> 64b/line 8-way L2 cache, 9MB 64b/line 12-way L3 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 

7.2: tsc timecounter running too fast on ESXi 7.5

2022-10-25 Thread James J. Lippard
I'm one of several people experiencing this issue with OpenBSD 7.2 on
VMware ESXi 7.5. Scott C. has given me help in trying to track the issue
down; a patched -current kernel to remove the acpi_delay code added in
7.2 makes the issue go away.

Below is output from sysctl machdep, sysctl hw, and dmesg:

sysctl machdep:
machdep.console_device=ttyC0
machdep.bios.diskinfo.128=bootdev = 0xa204, cylinders = 1024, heads = 255, 
sectors = 63
machdep.bios.diskinfo.129=bootdev = 0xa0010204, cylinders = 1024, heads = 255, 
sectors = 63
machdep.bios.diskinfo.130=bootdev = 0xa0020204, cylinders = 1024, heads = 255, 
sectors = 63
machdep.bios.cksumlen=2
machdep.allowaperture=0
machdep.cpuvendor=GenuineIntel
machdep.cpuid=0x50663
machdep.cpufeature=0xf9bfbff
machdep.kbdreset=0
machdep.xcrypt=0
machdep.lidaction=1
machdep.forceukbd=0
machdep.tscfreq=1900013052
machdep.invarianttsc=1
machdep.pwraction=1

sysctl hw:
hw.machine=amd64
hw.model=Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz
hw.ncpu=2
hw.byteorder=1234
hw.pagesize=4096
hw.disknames=cd0:,sd0:e0a47e78ea955d63,sd1:0c3277666ef919fc,sd2:bdec30edfe97d02b
hw.diskcount=4
hw.sensors.acpiac0.indicator0=On (power supply)
hw.sensors.vmt0.timedelta0=0.000371 secs, OK, Tue Oct 25 14:18:36.273
hw.cpuspeed=1899
hw.vendor=VMware, Inc.
hw.product=VMware Virtual Platform
hw.version=None
hw.serialno=VMware-56 4d 2b c8 07 85 8b b7-28 1b a0 5d d5 cf 8d fd
hw.uuid=564d2bc8-0785-8bb7-281b-a05dd5cf8dfd
hw.physmem=6424494080
hw.usermem=6424477696
hw.ncpufound=2
hw.allowpowerdown=1
hw.smt=0
hw.ncpuonline=2
hw.power=1

dmesg (which includes 7.1, post-upgrade, and with patched -current
kernel):

OpenBSD 7.1 (GENERIC.MP) #3: Sun May 15 10:27:01 MDT 2022

r...@syspatch-71-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 6424494080 (6126MB)
avail mem = 6212485120 (5924MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xe0010 (242 entries)
bios0: vendor Phoenix Technologies LTD version "6.00" date 11/12/2020
bios0: VMware, Inc. VMware Virtual Platform
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP BOOT APIC MCFG SRAT HPET WAET
acpi0: wakeup devices PCI0(S3) USB_(S1) P2P0(S3) S1F0(S3) S2F0(S3) S8F0(S3) 
S16F(S3) S18F(S3) S22F(S3) S23F(S3) S24F(S3) S25F(S3) PE40(S3) S1F0(S3) 
PE50(S3) S1F0(S3) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.75 MHz, 06-56-03
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 65MHz
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz, 1899.62 MHz, 06-56-03
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: disabling user TSC (skew=-2507)
cpu1: smt 0, core 0, package 2
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf000, bus 0-127
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
acpicmos0 at acpi0
"PNP0A05" at acpi0 not configured
acpiac0 at acpi0: AC unit online
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
cpu0: using VERW MDS workaround
pvbus0 at mainbus0: VMware
vmt0 at pvbus0
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel 82443BX AGP" rev 0x01
ppb0 at pci0 dev 1 function 0 "Intel 82443BX AGP" rev 0x01
pci1 at ppb0 bus 1
pcib0 at pci0 dev 7 function 0 "Intel 82371AB PIIX4 ISA" rev 0x08
pciide0 at pci0 dev 7 function 1 "Intel 82371AB IDE" rev 0x01: DMA, channel 0 
configured to compatibility, channel 1 configured to compatibility
pciide0: channel 0 disabled (no drives)
pciide0: channel 1 disabled (no drives)
piixpm0 at pci0 dev 7 function 3 "Intel 82371AB Power" rev 0x08: SMBus disabled
"VMware VMCI" rev 0x10 at pci0 dev 7 function 7 not configured
vga1 at pci0 dev 15 function 0 "VMware SVGA II" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
ppb1 at