Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider: - [High] Bypassing `kvm_sched_clock_init()` when `CONSTANT_TSC` is present ignores KVM's `PVCLOCK_TSC_STABLE_BIT`, leading to unfiltered backward time jumps in the core scheduler. --
commit 57cf68835ef70733c037ca9ffa04fd39e1d70027 Author: Sean Christopherson <[email protected]> x86/kvmclock: Use TSC for sched_clock if it's constant and non-stop This commit updates kvmclock_init to prefer the TSC over kvmclock for sched_clock when the TSC is constant, non-stop, and not marked unstable. It does this by bypassing kvm_sched_clock_init and instead modifying the TSC properties so that TSC becomes the preferred clocksource. Link: https://lore.kernel.org/all/[email protected] > diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c > --- a/arch/x86/kernel/kvmclock.c > +++ b/arch/x86/kernel/kvmclock.c [ ... ] > @@ -416,22 +416,22 @@ void __init kvmclock_init(void) [ ... ] > if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) && > boot_cpu_has(X86_FEATURE_NONSTOP_TSC) && > !check_tsc_unstable()) { > kvm_clock.rating = 299; > tsc_properties = TSC_FREQ_KNOWN_AND_RELIABLE; > + } else { > + kvm_sched_clock_init(stable); > } > > - kvm_sched_clock_init(stable); Does bypassing kvm_sched_clock_init() when CONSTANT_TSC is present accidentally ignore KVM's PVCLOCK_TSC_STABLE_BIT? CONSTANT_TSC indicates the hardware TSC rate doesn't change with power states, but it doesn't guarantee cross-CPU synchronization, such as across multi-socket systems or differing vCPU offsets from hotplug and migration. KVM explicitly signals this cross-CPU stability via PVCLOCK_TSC_STABLE_BIT, which is mapped to the stable variable. Prior to this patch, kvm_sched_clock_init() was always called, invoking __paravirt_set_sched_clock() with the stable flag. This allowed the system to call clear_sched_clock_stable() if the hypervisor could not guarantee synchronization, enabling crucial per-CPU monotonicity filters in sched_clock_cpu(). By skipping kvm_sched_clock_init() entirely and registering TSC_FREQ_KNOWN_AND_RELIABLE, the kernel's native cross-CPU TSC synchronization checks and the clocksource watchdog appear to be bypassed. If KVM clears PVCLOCK_TSC_STABLE_BIT, could this result in the guest ignoring the condition and using raw, unfiltered rdtsc() values? When tasks migrate between vCPUs with differing TSC offsets, this regression might cause backward time jumps in sched_clock(), which breaks rq_clock monotonicity and corrupts CFS vruntime and min_vruntime calculations. -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=37
