On Sat, Mar 27, 2021 at 02:20:21AM +0000, Stefmorino wrote:
> > Feel free to share your raw data.
> 
> Also includes some standard sendbug dumps: https://0x0.st/-qng.tgz

Thanks!

TL;DR:

Two things:

1. Could you check whether Linux will use the TSC as a clocksource on
   this machine?  The dmesg output on any given distribution should
   contain lines about the TSC.

   Feel free to use my easy five-step method.  You don't even need to
   install Linux; we just need to boot the installation media and look
   at the dmesg:

   A. Grab the latest Alpine Linux ISO:

        $ ftp https://dl-cdn.alpinelinux.org/alpine/v3.13/releases/x86_64/alpine-standard-3.13.4-x86_64.iso

   B. Write the ISO to a USB key to create your bootable installation media.

      Achtung!  Danger!  Don't wipe out the wrong disk!  Change /dev/sdNc
      to match the special device for your USB stick!

        # dd if=alpine-standard-3.13.4-x86_64.iso of=/dev/sdNc bs=1m

   C. Reboot.

        # shutdown -r now

   D. Boot from the USB stick.  How you do this varies by device.  Log in as
      root.  The Alpine installation ramdisk has no root password.

   E. Examine the Linux dmesg for lines about the TSC and clocksources:

        # dmesg | egrep -i 'tsc|clocksource'

2. Is there a more recent BIOS revision for this machine?
   Perhaps (assuming this is in fact a BIOS problem) Lenovo is aware
   of it and has fixed it.  This is unlikely but worth a look.

Long version:

I think two points form a pattern.  Bear with me.

Both you and Josh Rickmar (CC'd) have Lenovo laptops with the same CPU
(AMD Ryzen 5 2500U) and the same BIOS (LENOVO version
"R0UET78W (1.58 )" date 11/17/2020).

Looking at the data here, both machines exhibit the same problem with
the TSC: the APs are all nearly synchronized outside of small
measurement errors while the BSP is way off.

This makes me wonder whether this is a firmware/BIOS bug.  Perhaps the
BIOS is fussing with the TSC on CPU0 before we boot.  Is there a new
BIOS revision available from Lenovo?

I'd also be interested to know if a recent Linux kernel would even use
the TSC on this laptop as a clocksource or if the kernel complains and
falls back to using the HPET.
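
Besides the dmesg, a running Linux kernel also reports its choice under
sysfs.  Here is a rough sketch of how one might read it from the Alpine
shell.  The function name and the CS_DIR override are my own inventions
for illustration; the two sysfs files are the standard clocksource
ones, but I have not run this on your hardware.

```shell
# Sketch only: ask a running Linux kernel which clocksource it selected.
# CS_DIR is a hypothetical override so the function can be pointed at
# other files; by default it reads the real sysfs directory.
report_clocksource() {
    dir="${CS_DIR:-/sys/devices/system/clocksource/clocksource0}"
    cur=$(cat "$dir/current_clocksource" 2>/dev/null)
    avail=$(cat "$dir/available_clocksource" 2>/dev/null)
    echo "current: ${cur:-unknown}"
    echo "available: ${avail:-unknown}"
    case "$cur" in
        tsc) echo "verdict: kernel is using the TSC" ;;
        "")  echo "verdict: could not read sysfs" ;;
        *)   echo "verdict: kernel is using $cur, not the TSC" ;;
    esac
}
```

Paste the function into the root shell and run report_clocksource.  If
"tsc" is missing from the available list, the kernel most likely
rejected it, and the dmesg lines from step E should say why.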

There is a little chit-chat here and there about adding support to
OpenBSD for fixing the TSC skew during synchronization at boot/resume.
One way to do this is with the TSC_ADJUST MSR...

... but that won't work here in your case.  The CPU on this particular
laptop does not have TSC_ADJUST support, so if we wanted to correct
the TSC skew we'd have to use WRMSR to modify the TSC directly.
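
If you want to cross-check the TSC_ADJUST claim while you're booted
into Alpine, something like the following should work.  It is only a
sketch: it assumes Linux lists the feature as "tsc_adjust" in the flags
line of /proc/cpuinfo, and both the function name and the CPUINFO
override are made up for illustration.

```shell
# Sketch only: succeed if the first "flags" line of cpuinfo lists
# tsc_adjust as a whole word.  CPUINFO is a hypothetical override;
# by default the function reads /proc/cpuinfo.
has_tsc_adjust() {
    grep -m1 '^flags' "${CPUINFO:-/proc/cpuinfo}" | grep -qw tsc_adjust
}
```

Usage would be along the lines of "has_tsc_adjust && echo yes || echo
no".  I'd expect "no" on this CPU, but it costs nothing to confirm.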

I'm uncertain whether using WRMSR to reset the TSC on a given logical
processor is supported on all amd64 machines or if it's a special
feature a la the TSC_ADJUST MSR.

-Scott
