Thank you for your contribution! Sashiko AI review found 1 potential issue(s) 
to consider:
- [High] The logic to defer AP kvmclock registration to the CPUHP_AP_ONLINE_DYN 
state when it is not used as sched_clock is architecturally unsafe and will 
break timekeeping.
--

commit 88625772445f01c4397e3dfd4ee43222efd758cf
Author: Sean Christopherson <[email protected]>

x86/kvmclock: Enable kvmclock on APs during onlining if kvmclock isn't 
sched_clock

This commit prepares for making early_percpu_clock_init() a dedicated
sched_clock hook. It adds logic to enable kvmclock on APs during CPU online
if kvmclock is not used as sched_clock, and includes WARNs in the suspend
and resume paths.

> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index e7250d21c6721..d3bb281c0805d 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
[ ... ]
> @@ -141,25 +142,51 @@ static void kvm_restore_sched_clock_state(void)
[ ... ]
>  void kvmclock_cpu_action(enum kvm_guest_cpu_action action)
>  {
> -     /*
> -      * Don't disable kvmclock on the BSP during suspend.  If kvmclock is
> -      * being used for sched_clock, then it needs to be kept alive until the
> -      * last minute, and restored as quickly as possible after resume.
> -      */
> -     if (action != KVM_GUEST_BSP_SUSPEND)
> +     switch (action) {
[ ... ]
> +     case KVM_GUEST_AP_ONLINE:
> +             /*
> +              * Secondary CPUs use dedicated sched_clock hooks to enable
> +              * kvmclock early during bringup, there's nothing to be done
> +              * when during CPU online.
> +              */
> +             if (kvmclock_is_sched_clock)
> +                     break;
> +             kvm_register_clock("secondary cpu, online");
> +             break;

Will deferring AP kvmclock registration to the CPU online phase break
timekeeping if kvmclock is still being used as the system clocksource?

If the user boots with clocksource=kvm-clock or if kvmclock is selected as a
fallback, it will be used as the clocksource despite not being sched_clock.
The kvmclock clocksource read function accesses per-CPU memory that the
hypervisor only populates after the AP registers its clock.

If early registration in start_secondary() is skipped, as anticipated here,
an AP will boot, enable local interrupts, and process timer ticks before
reaching the late CPUHP_AP_ONLINE_DYN hotplug state.

Could a call to ktime_get() during this window read uninitialized per-CPU
memory and return 0? It seems this might cause the timekeeper to compute
massive forward time jumps, leading to immediate timer expirations and
watchdogs firing.

> +     case KVM_GUEST_AP_OFFLINE:
> +     case KVM_GUEST_SHUTDOWN:
>               kvmclock_disable();
> +             break;
> +     default:
> +             WARN_ON_ONCE(1);
> +             break;
> +     }
>  }

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=27

Reply via email to