Thank you for your contribution! Sashiko AI review found 1 potential issue(s) 
to consider:
- [High] Bypassing `kvm_sched_clock_init()` when `CONSTANT_TSC` is present 
ignores KVM's `PVCLOCK_TSC_STABLE_BIT`, leading to unfiltered backward time 
jumps in the core scheduler.
--

commit 57cf68835ef70733c037ca9ffa04fd39e1d70027
Author: Sean Christopherson <[email protected]>

x86/kvmclock: Use TSC for sched_clock if it's constant and non-stop

This commit updates kvmclock_init to prefer the TSC over kvmclock for
sched_clock when the TSC is constant, non-stop, and not marked unstable.
It does this by bypassing kvm_sched_clock_init and instead modifying the
TSC properties so that TSC becomes the preferred clocksource.

Link: https://lore.kernel.org/all/[email protected]

> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
[ ... ]
> @@ -416,22 +416,22 @@ void __init kvmclock_init(void)
[ ... ]
>       if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
>           boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
>           !check_tsc_unstable()) {
>               kvm_clock.rating = 299;
>               tsc_properties = TSC_FREQ_KNOWN_AND_RELIABLE;
> +     } else {
> +             kvm_sched_clock_init(stable);
>       }
>  
> -     kvm_sched_clock_init(stable);

Does bypassing kvm_sched_clock_init() when CONSTANT_TSC is present
accidentally ignore KVM's PVCLOCK_TSC_STABLE_BIT?

CONSTANT_TSC indicates the hardware TSC rate doesn't change with power
states, but it doesn't guarantee cross-CPU synchronization, such as across
multi-socket systems or differing vCPU offsets from hotplug and migration.
KVM explicitly signals this cross-CPU stability via PVCLOCK_TSC_STABLE_BIT,
which is mapped to the stable variable.

Prior to this patch, kvm_sched_clock_init() was always called, invoking
__paravirt_set_sched_clock() with the stable flag. This allowed the system
to call clear_sched_clock_stable() if the hypervisor could not guarantee
synchronization, enabling crucial per-CPU monotonicity filters in
sched_clock_cpu().

By skipping kvm_sched_clock_init() entirely and registering
TSC_FREQ_KNOWN_AND_RELIABLE, the kernel's native cross-CPU TSC synchronization
checks and the clocksource watchdog appear to be bypassed.

If KVM clears PVCLOCK_TSC_STABLE_BIT, could this result in the guest
ignoring the condition and using raw, unfiltered rdtsc() values? When tasks
migrate between vCPUs with differing TSC offsets, this regression might cause
backward time jumps in sched_clock(), which breaks rq_clock monotonicity
and corrupts CFS vruntime and min_vruntime calculations.

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=37

Reply via email to