On Thu, Apr 01, 2021 at 03:22:00PM -0400, Josh Rickmar wrote:
> On Thu, Apr 01, 2021 at 02:15:48PM -0500, Scott Cheloha wrote:
> > On Sat, Mar 27, 2021 at 02:20:21AM +0000, Stefmorino wrote:
> > > > Feel free to share your raw data.
> > >
> > > Also includes some standard sendbug dumps: https://0x0.st/-qng.tgz
> >
> > Thanks!
> >
> > TL;DR:
> >
> > Two things:
> >
> > 1. Could you check whether Linux will use the TSC as a clocksource on
> >    this machine? The dmesg output on any given distribution should
> >    contain lines about the TSC.
> >
> >    Feel free to use my easy five-step method. You don't even need to
> >    install Linux, we just need to boot installation media and look at
> >    the dmesg:
> >
> >    A. Grab the latest Alpine Linux ISO:
> >
> >       $ ftp https://dl-cdn.alpinelinux.org/alpine/v3.13/releases/x86_64/alpine-standard-3.13.4-x86_64.iso
> >
> >    B. Write the ISO to a USB key to create your bootable installation media.
> >
> >       Achtung! Danger! Don't wipe out the wrong disk! Change /dev/sdNc
> >       to match the special device for your USB stick!
> >
> >       # dd if=alpine-standard-3.13.4-x86_64.iso of=/dev/sdNc bs=1m
> >
> >    C. Reboot.
> >
> >       # shutdown -r now
> >
> >    D. Boot from the USB stick. How you do this varies by device. Log in as
> >       root. The Alpine installation ramdisk has no root password.
> >
> >    E. Examine the Linux dmesg for lines about the TSC, clocksources:
> >
> >       # dmesg | egrep -i 'tsc|clocksource'
> >
> > 2. Second, is there a more recent BIOS revision for this machine?
> >    Perhaps (assuming this is in fact a BIOS problem) Lenovo is aware
> >    of it and has fixed it. This is unlikely but worth a look.
> >
> > Long version:
> >
> > I think two points form a pattern. Bear with me.
> >
> > Both you and Josh Rickmar (CC'd) have Lenovo laptops with the same CPU
> > (AMD Ryzen 5 2500U) and the same BIOS (LENOVO version "R0UET78W (1.58
> > )" date 11/17/2020).
> >
> > Looking at the data here, both machines exhibit the same problem with
> > the TSC: the APs are all nearly synchronized outside of small
> > measurement errors while the BSP is way off.
> >
> > This makes me wonder whether this is a firmware/BIOS bug. Perhaps the
> > BIOS is fussing with the TSC on CPU0 before we boot. Is there a new
> > BIOS revision available from Lenovo?
> >
> > I'd also be interested to know if a recent Linux kernel would even use
> > the TSC on this laptop as a clocksource or if the kernel complains and
> > falls back to using the HPET.
> >
> > There is a little chit-chat here and there about adding support to
> > OpenBSD for fixing the TSC skew during synchronization at boot/resume.
> > One way to do this is with the TSC_ADJUST MSR...
> >
> > ... but that won't work here in your case. The CPU on this particular
> > laptop does not have TSC_ADJUST support, so if we wanted to correct
> > the TSC skew we'd have to use WRMSR to modify the TSC directly.
> >
> > I'm uncertain about whether using WRMSR to reset the TSC on a given
> > logical processor is universally supported on all amd64 machines or
> > if it's a special feature a la the TSC_ADJUST MSR.
> >
> > -Scott
>
> Hey, thanks for the reminder to try this out with Linux. Will give it
> a shot shortly.
>
> As for the BIOS, 1.58 is the current version (found here):
>
> https://support.lenovo.com/us/en/downloads/ds503790
>
> This same issue was happening with all older BIOS versions that I have
> used as well.
>

Seems Linux doesn't like it either:

localhost:~# dmesg | egrep -i 'tsc|clocksource'
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 1996.173 MHz processor
[    0.043227] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 6370452778343963 ns
[    0.114728] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484873504 ns
[    0.131435] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398c1ebcd00, max_idle_ns: 881590807727 ns
[    0.244772] TSC synchronization [CPU#0 -> CPU#1]:
[    0.244772] Measured 7296391160 warp between CPUs, turning off TSC clock.
[    0.244772] tsc: Marking TSC unstable due to check_tsc_sync_source_failed
[    0.252185] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 6370867519511994 ns
[    0.316884] clocksource: Switched to clocksource hpet
[    0.335046] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
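
For a sense of scale, the warp figure can be converted from cycles to
seconds using the frequency the kernel detected. Below is a minimal
Python sketch of that arithmetic. It assumes the warp value is a count
of TSC cycles and that the detected frequency is close enough to the
real TSC rate. The helper name summarize_tsc and the SAMPLE constant
are made up for illustration, and the regexes only cover the message
shapes seen in this dmesg.

```python
import re

# Three lines copied from the dmesg output in this mail; the rest is illustrative.
SAMPLE = """\
[    0.000000] tsc: Detected 1996.173 MHz processor
[    0.244772] Measured 7296391160 warp between CPUs, turning off TSC clock.
[    0.316884] clocksource: Switched to clocksource hpet
"""


def summarize_tsc(dmesg_text):
    """Return TSC frequency, measured warp, and the last clocksource selected.

    Hypothetical helper: fields that cannot be found are returned as None.
    """
    mhz = re.search(r"tsc: Detected ([\d.]+) MHz", dmesg_text)
    warp = re.search(r"Measured (\d+) .*?warp between CPUs", dmesg_text)
    switched = re.findall(r"Switched to clocksource (\S+)", dmesg_text)

    tsc_hz = float(mhz.group(1)) * 1e6 if mhz else None
    warp_cycles = int(warp.group(1)) if warp else None

    # Assumption: the warp is expressed in TSC cycles, so cycles / Hz = seconds.
    warp_seconds = None
    if tsc_hz and warp_cycles is not None:
        warp_seconds = warp_cycles / tsc_hz

    return {
        "tsc_hz": tsc_hz,
        "warp_cycles": warp_cycles,
        "warp_seconds": warp_seconds,
        "clocksource": switched[-1] if switched else None,
    }


if __name__ == "__main__":
    for key, value in summarize_tsc(SAMPLE).items():
        print(f"{key}: {value}")
```

With the numbers above this comes out to a few seconds of skew between
CPU#0 and CPU#1, far beyond anything that could be called measurement
error.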