On Thu, Apr 01, 2021 at 03:22:00PM -0400, Josh Rickmar wrote:
> On Thu, Apr 01, 2021 at 02:15:48PM -0500, Scott Cheloha wrote:
> > On Sat, Mar 27, 2021 at 02:20:21AM +0000, Stefmorino wrote:
> > > > Feel free to share your raw data.
> > > 
> > > Also includes some standard sendbug dumps: https://0x0.st/-qng.tgz
> > 
> > Thanks!
> > 
> > TL;DR:
> > 
> > Two things:
> > 
> > 1. Could you check whether Linux will use the TSC as a clocksource on
> >    this machine?  The dmesg output on any given distribution should
> >    contain lines about the TSC.
> > 
> >    Feel free to use my easy five-step method.  You don't even need to
> >    install Linux; we just need to boot installation media and look at
> >    the dmesg:
> > 
> >    A. Grab the latest Alpine Linux ISO:
> > 
> >     $ ftp https://dl-cdn.alpinelinux.org/alpine/v3.13/releases/x86_64/alpine-standard-3.13.4-x86_64.iso
> > 
> >    B. Write the ISO to a USB key to create your bootable installation media.
> > 
> >       Achtung!  Danger!  Don't wipe out the wrong disk!  Change /dev/sdNc
> >       to match the special device for your USB stick!
> > 
> >     # dd if=alpine-standard-3.13.4-x86_64.iso of=/dev/sdNc bs=1m
> > 
> >    C. Reboot.
> > 
> >     # shutdown -r now
> > 
> >    D. Boot from the USB stick.  How you do this varies by device.  Log in as
> >       root.  The Alpine installation ramdisk has no root password.
> > 
> >    E. Examine the Linux dmesg for lines about the TSC and clocksources:
> > 
> >     # dmesg | egrep -i 'tsc|clocksource'
> > 
> > 2. Is there a more recent BIOS revision for this machine?
> >    Perhaps (assuming this is in fact a BIOS problem) Lenovo is aware
> >    of it and has fixed it.  This is unlikely but worth a look.
> > 
> > Long version:
> > 
> > I think two points form a pattern.  Bear with me.
> > 
> > Both you and Josh Rickmar (CC'd) have Lenovo laptops with the same CPU
> > (AMD Ryzen 5 2500U) and the same BIOS (LENOVO version
> > "R0UET78W (1.58 )" date 11/17/2020).
> > 
> > Looking at the data here, both machines exhibit the same problem with
> > the TSC: the APs are all nearly synchronized outside of small
> > measurement errors while the BSP is way off.
> > 
> > This makes me wonder whether this is a firmware/BIOS bug.  Perhaps the
> > BIOS is fussing with the TSC on CPU0 before we boot.  Is there a new
> > BIOS revision available from Lenovo?
> > 
> > I'd also be interested to know if a recent Linux kernel would even use
> > the TSC on this laptop as a clocksource or if the kernel complains and
> > falls back to using the HPET.
> > 
> > There is a little chit-chat here and there about adding support to
> > OpenBSD for fixing the TSC skew during synchronization at boot/resume.
> > One way to do this is with the TSC_ADJUST MSR...
> > 
> > ... but that won't work here in your case.  The CPU on this particular
> > laptop does not have TSC_ADJUST support, so if we wanted to correct
> > the TSC skew we'd have to use WRMSR to modify the TSC directly.
> > 
> > I'm uncertain about whether using WRMSR to reset the TSC on a given
> > logical processor is universally supported on all amd64 machines or
> > if it's a special feature a la the TSC_ADJUST MSR.
> > 
> > -Scott
> 
> Hey, thanks for the reminder to try this out with Linux.  Will give it
> a shot shortly.
> 
> As for the BIOS, 1.58 is the current version (found here):
> 
> https://support.lenovo.com/us/en/downloads/ds503790
> 
> This same issue was happening with all older BIOS versions that I have
> used as well.
> 

Seems Linux doesn't like it either:

localhost:~# dmesg | egrep -i 'tsc|clocksource'
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 1996.173 MHz processor
[    0.043227] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 6370452778343963 ns
[    0.114728] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484873504 ns
[    0.131435] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398c1ebcd00, max_idle_ns: 881590807727 ns
[    0.244772] TSC synchronization [CPU#0 -> CPU#1]:
[    0.244772] Measured 7296391160 warp between CPUs, turning off TSC clock.
[    0.244772] tsc: Marking TSC unstable due to check_tsc_sync_source_failed
[    0.252185] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 6370867519511994 ns
[    0.316884] clocksource: Switched to clocksource hpet
[    0.335046] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
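
For scale, the warp Linux reports can be converted to time using the TSC
frequency from the same log.  A rough sketch in Python follows.  The helper
names summarize() and warp_seconds() are made up for illustration, the
patterns only match the message formats seen in this particular dmesg, and
it assumes the warp is counted in TSC cycles at the detected frequency:

```python
import re

# Sample lines copied from the dmesg output above.
DMESG = """\
[    0.000000] tsc: Detected 1996.173 MHz processor
[    0.244772] Measured 7296391160 warp between CPUs, turning off TSC clock.
[    0.244772] tsc: Marking TSC unstable due to check_tsc_sync_source_failed
[    0.316884] clocksource: Switched to clocksource hpet
"""


def summarize(dmesg):
    """Return (tsc_mhz, warp_cycles, tsc_unstable, final_clocksource).

    Fields that cannot be found in the text are returned as None
    (or False for the tsc_unstable flag).
    """
    mhz = re.search(r"tsc: Detected ([\d.]+) MHz processor", dmesg)
    warp = re.search(r"Measured (\d+) warp between CPUs", dmesg)
    unstable = "Marking TSC unstable" in dmesg
    # The last "Switched to clocksource" line names the clocksource in use.
    switched = re.findall(r"clocksource: Switched to clocksource (\S+)", dmesg)
    return (
        float(mhz.group(1)) if mhz else None,
        int(warp.group(1)) if warp else None,
        unstable,
        switched[-1] if switched else None,
    )


def warp_seconds(warp_cycles, tsc_mhz):
    """Convert a warp measured in TSC cycles to seconds."""
    return warp_cycles / (tsc_mhz * 1e6)


tsc_mhz, warp_cycles, tsc_unstable, clocksource = summarize(DMESG)
print(f"TSC unstable: {tsc_unstable}, clocksource in use: {clocksource}")
print(f"warp: {warp_cycles} cycles = {warp_seconds(warp_cycles, tsc_mhz):.2f} s")
```

Under those assumptions the measured warp between CPU0 and CPU1 comes to
roughly 3.66 seconds, which fits the picture of the BSP being way off from
the APs.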
