From: Dexuan Cui <de...@microsoft.com> Sent: Saturday, January 16, 2021 2:32 PM
> 
> With commit 4df4cb9e99f8, the Hyper-V direct-mode STIMER is actually
> initialized before LAPIC is initialized: see
> 
>   apic_intr_mode_init()
> 
>     x86_platform.apic_post_init()
>       hyperv_init()
>         hv_stimer_alloc()
> 
>     apic_bsp_setup()
>       setup_local_APIC()
> 
> setup_local_APIC() temporarily disables LAPIC, initializes it and
> re-enables it.  The direct-mode STIMER depends on LAPIC, and when it's
> registered, it can be programmed immediately and the timer can fire
> very soon:
> 
>   hv_stimer_init
>     clockevents_config_and_register
>       clockevents_register_device
>         tick_check_new_device
>           tick_setup_device
>             tick_setup_periodic(), tick_setup_oneshot()
>               clockevents_program_event
> 
> When the timer fires in the hypervisor, if the LAPIC is in the
> disabled state, new versions of Hyper-V ignore the event and don't inject
> the timer interrupt into the VM, and hence the VM hangs when it boots.
> 
> Note: when the VM starts/reboots, the LAPIC is pre-enabled by the
> firmware, so the window of LAPIC being temporarily disabled is pretty
> small, and the issue can only happen once out of 100~200 reboots for
> a 40-vCPU VM on one dev host, and on another host the issue doesn't
> reproduce after 2000 reboots.
> 
> The issue is more noticeable for kdump/kexec, because the LAPIC is
> disabled by the first kernel, and stays disabled until the kdump/kexec
> kernel enables it. This is especially an issue for a Generation-2 VM
> (for which Hyper-V doesn't emulate the PIT timer) when CONFIG_HZ=1000
> (rather than CONFIG_HZ=250) is used.
> 
> Fix the issue by moving hv_stimer_alloc() to a later place where the
> LAPIC timer is initialized.
> 
> Fixes: 4df4cb9e99f8 ("x86/hyperv: Initialize clockevents earlier in CPU onlining")
> Signed-off-by: Dexuan Cui <de...@microsoft.com>
> ---
>  arch/x86/hyperv/hv_init.c | 29 ++++++++++++++++++++++++++---
>  1 file changed, 26 insertions(+), 3 deletions(-)
> 

Reviewed-by:  Michael Kelley <mikel...@microsoft.com>
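
For readers following along, the ordering problem in the commit message
can be illustrated with a toy user-space model. This is only a sketch,
not kernel code: every name in it (toy_vm, toy_boot(), and so on) is
hypothetical, and it assumes the worst case where the one-shot timer
expires exactly inside the window in which setup_local_APIC() has the
LAPIC disabled.

```c
/* Toy user-space model of the init-ordering race; NOT kernel code.
 * All types and functions here are hypothetical illustrations. */
#include <stdbool.h>

struct toy_vm {
	bool lapic_enabled;	/* models the LAPIC enable state */
	bool stimer_armed;	/* models a registered, programmed stimer */
	int delivered;		/* timer interrupts injected into the guest */
	int dropped;		/* timer events ignored by the hypervisor */
};

/* Hypervisor side: on expiry, inject only if the LAPIC is enabled. */
static void toy_timer_expire(struct toy_vm *vm)
{
	if (!vm->stimer_armed)
		return;
	if (vm->lapic_enabled)
		vm->delivered++;
	else
		vm->dropped++;	/* event lost; nothing re-arms the timer */
	vm->stimer_armed = false;	/* one-shot */
}

/* Stands in for hv_stimer_alloc() plus clockevent registration. */
static void toy_stimer_register(struct toy_vm *vm)
{
	vm->stimer_armed = true;
}

/* Stands in for setup_local_APIC(): disable, initialize, re-enable. */
static void toy_setup_lapic(struct toy_vm *vm)
{
	vm->lapic_enabled = false;
	toy_timer_expire(vm);	/* assumed worst case: expiry in the window */
	vm->lapic_enabled = true;
}

/* Run one simulated boot with either ordering and return the result. */
struct toy_vm toy_boot(bool stimer_first)
{
	struct toy_vm vm = { .lapic_enabled = true };	/* firmware pre-enables */

	if (stimer_first) {
		toy_stimer_register(&vm);	/* ordering before the fix */
		toy_setup_lapic(&vm);
	} else {
		toy_setup_lapic(&vm);		/* ordering after the fix */
		toy_stimer_register(&vm);
	}
	toy_timer_expire(&vm);	/* a later expiry, if the timer is still armed */
	return vm;
}
```

In this model, toy_boot(true) loses the only timer event while the LAPIC
is disabled, which corresponds to the boot hang, and toy_boot(false)
delivers it. The real window is far narrower, which matches the low
reproduction rate reported above.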
