Re: [RFC PATCH] x86, hyperv: fix kernel panic when kexec on HyperV VM

2019-03-05 Thread Kairui Song
On Tue, Mar 5, 2019 at 8:28 PM Peter Zijlstra  wrote:
>
> On Wed, Feb 27, 2019 at 10:55:46PM +0800, Kairui Song wrote:
> > On Wed, Feb 27, 2019 at 8:02 PM Peter Zijlstra  wrote:
> > >
> > > On Tue, Feb 26, 2019 at 11:56:15PM +0800, Kairui Song wrote:
> > > >  arch/x86/hyperv/hv_init.c | 4 
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > > > index 7abb09e2eeb8..92291c18d716 100644
> > > > --- a/arch/x86/hyperv/hv_init.c
> > > > +++ b/arch/x86/hyperv/hv_init.c
> > > > @@ -406,6 +406,10 @@ void hyperv_cleanup(void)
> > > >   /* Reset our OS id */
> > > >   wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
> > > >
> > > > + /* Cleanup page reference before reset the page */
> > > > + hv_hypercall_pg = NULL;
> > > > + wmb();
> > >
> > > What do we need that SFENCE for? Any why does it lack a comment?
> >
> > Hi, that's for ensuring the hv_hypercall_pg is reset to NULL before
> > the following wrmsr call. The wrmsr call will make the pointer address
> > invalid.
>
> WRMSR is a serializing instruction (except for TSC_DEADLINE and the
> X2APIC).
>

Many thanks for the info, I'm not aware of the exception condition, V2
is sent, will drop the barrier in V3 then.

-- 
Best Regards,
Kairui Song
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH] x86, hyperv: fix kernel panic when kexec on HyperV VM

2019-03-05 Thread Peter Zijlstra
On Wed, Feb 27, 2019 at 10:55:46PM +0800, Kairui Song wrote:
> On Wed, Feb 27, 2019 at 8:02 PM Peter Zijlstra  wrote:
> >
> > On Tue, Feb 26, 2019 at 11:56:15PM +0800, Kairui Song wrote:
> > >  arch/x86/hyperv/hv_init.c | 4 
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > > index 7abb09e2eeb8..92291c18d716 100644
> > > --- a/arch/x86/hyperv/hv_init.c
> > > +++ b/arch/x86/hyperv/hv_init.c
> > > @@ -406,6 +406,10 @@ void hyperv_cleanup(void)
> > >   /* Reset our OS id */
> > >   wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
> > >
> > > + /* Cleanup page reference before reset the page */
> > > + hv_hypercall_pg = NULL;
> > > + wmb();
> >
> > What do we need that SFENCE for? Any why does it lack a comment?
> 
> Hi, that's for ensuring the hv_hypercall_pg is reset to NULL before
> the following wrmsr call. The wrmsr call will make the pointer address
> invalid.

WRMSR is a serializing instruction (except for TSC_DEADLINE and the
X2APIC).

> I can add some comment in V2 if this is OK.

Barriers must always have a comment.

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH] x86, hyperv: fix kernel panic when kexec on HyperV VM

2019-02-27 Thread Kairui Song
On Wed, Feb 27, 2019 at 8:02 PM Peter Zijlstra  wrote:
>
> On Tue, Feb 26, 2019 at 11:56:15PM +0800, Kairui Song wrote:
> >  arch/x86/hyperv/hv_init.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index 7abb09e2eeb8..92291c18d716 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -406,6 +406,10 @@ void hyperv_cleanup(void)
> >   /* Reset our OS id */
> >   wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
> >
> > + /* Cleanup page reference before reset the page */
> > + hv_hypercall_pg = NULL;
> > + wmb();
>
> What do we need that SFENCE for? Any why does it lack a comment?

Hi, that's for ensuring the hv_hypercall_pg is reset to NULL before
the following wrmsr call. The wrmsr call will make the pointer address
invalid.
I can add some comment in V2 if this is OK.


--
Best Regards,
Kairui Song
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH] x86, hyperv: fix kernel panic when kexec on HyperV VM

2019-02-27 Thread Peter Zijlstra
On Tue, Feb 26, 2019 at 11:56:15PM +0800, Kairui Song wrote:
>  arch/x86/hyperv/hv_init.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 7abb09e2eeb8..92291c18d716 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -406,6 +406,10 @@ void hyperv_cleanup(void)
>   /* Reset our OS id */
>   wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
>  
> + /* Cleanup page reference before reset the page */
> + hv_hypercall_pg = NULL;
> + wmb();

What do we need that SFENCE for? Any why does it lack a comment?

> +
>   /* Reset the hypercall page */
>   hypercall_msr.as_uint64 = 0;
>   wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> -- 
> 2.20.1
> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [RFC PATCH] x86, hyperv: fix kernel panic when kexec on HyperV VM

2019-02-26 Thread Vitaly Kuznetsov
Kairui Song  writes:

> When hypercalls is used for sending IPIs, kexec will fail with a kernel
> panic like this:
>
> kexec_core: Starting new kernel
> BUG: unable to handle kernel NULL pointer dereference at 
> PGD 800057995067 P4D 800057995067 PUD 57990067 PMD 0
> Oops: 0002 [#1] SMP PTI
> CPU: 0 PID: 1016 Comm: kexec Not tainted 4.18.16-300.fc29.x86_64 #1
> Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 
> Hyper-V UEFI Release v3.0 03/02/2018
> RIP: 0010:0xc901d000
> Code: Bad RIP value.
> RSP: 0018:c9000495bcf0 EFLAGS: 00010046
> RAX:  RBX: c901d000 RCX: 00020015
> RDX: 7f553000 RSI:  RDI: c9000495bd28
> RBP: 0002 R08:  R09: 8238aaf8
> R10: 8238aae0 R11:  R12: 88007f553008
> R13: 0001 R14: 8800ff553000 R15: 
> FS:  7ff5c0e67b80() GS:880078e0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: c901cfd6 CR3: 4f678006 CR4: 003606f0
> Call Trace:
>  ? __send_ipi_mask+0x1c6/0x2d0
>  ? hv_send_ipi_mask_allbutself+0x6d/0xb0
>  ? mp_save_irq+0x70/0x70
>  ? __ioapic_read_entry+0x32/0x50
>  ? ioapic_read_entry+0x39/0x50
>  ? clear_IO_APIC_pin+0xb8/0x110
>  ? native_stop_other_cpus+0x6e/0x170
>  ? native_machine_shutdown+0x22/0x40
>  ? kernel_kexec+0x136/0x156
>  ? __do_sys_reboot+0x1be/0x210
>  ? kmem_cache_free+0x1b1/0x1e0
>  ? __dentry_kill+0x10b/0x160
>  ? _cond_resched+0x15/0x30
>  ? dentry_kill+0x47/0x170
>  ? dput.part.34+0xc6/0x100
>  ? __fput+0x147/0x220
>  ? _cond_resched+0x15/0x30
>  ? task_work_run+0x38/0xa0
>  ? do_syscall_64+0x5b/0x160
>  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack 
> ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 
> ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 
> nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security 
> nf_conntrack ip_set nfnetlink ebtable_filter ebtables ip6table_filter 
> ip6_tables sunrpc vfat fat crct10dif_pclmul crc32_pclmul ghash_clmulni_intel 
> intel_rapl_perf hv_balloon joydev xfs libcrc32c hv_storvsc serio_raw 
> scsi_transport_fc hv_netvsc hyperv_keyboard hyperv_fb hid_hyperv crc32c_intel 
> hv_vmbus
>
> That's because HyperV's machine_ops.shutdown allow registering a hook to
> be called upon shutdown, hv_vmbus will invalidate the hypercall page
> using this hook. But hv_hypercall_pg is still pointing to this invalid
> page, any hypercall based operation will panic the kernel. And kexec
> progress will send IPIs for stopping CPUs.
>
> This fix this by simply reset hv_hypercall_pg to NULL when the page is
> revoked to avoid any misuse. IPI sending will fallback to use non
> hypercall based method. This only happens on kexec / kdump so setting to
> NULL should be good enough.
>
> Fixes: 68bb7bfb7985 ("X86/Hyper-V: Enable IPI enlightenments")
> Signed-off-by: Kairui Song 
>
> ---
>
> I'm not sure about the details of what happened after the
>
> wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
>
> But this fix should be valid, please let me know if I get anything
> wrong, thanks.
>
>  arch/x86/hyperv/hv_init.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 7abb09e2eeb8..92291c18d716 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -406,6 +406,10 @@ void hyperv_cleanup(void)
>   /* Reset our OS id */
>   wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
>  
> + /* Cleanup page reference before reset the page */
> + hv_hypercall_pg = NULL;
> + wmb();
> +
>   /* Reset the hypercall page */
>   hypercall_msr.as_uint64 = 0;
>   wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);

This all looks correct to me. We need to reset HV_X64_MSR_HYPERCALL as
the hypercall page will remain 'special' otherwise so dumping it may
have undesired consequences (though, I think that last time I checked it
it was possible to read from this page without issues - and this should
be enough for kdump. But I'd rather keep things as they are as one
additional wrmsr doesn't hurt).

-- 
Vitaly
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel