On Tue, Sep 15, 2020 at 01:02:08PM +0200, Vitaly Kuznetsov wrote:
> Wei Liu <[email protected]> writes:
> 
> > On Tue, Sep 15, 2020 at 12:32:29PM +0200, Vitaly Kuznetsov wrote:
> >> Wei Liu <[email protected]> writes:
> >> 
> >> > When Linux is running as the root partition, the hypercall page will
> >> > have already been set up by Hyper-V. Copy the content over to the
> >> > allocated page.
> >> 
> >> And we can't set up a new hypercall page by writing something different
> >> to HV_X64_MSR_HYPERCALL, right?
> >> 
> >
> > My understanding is that we can't, but Sunil can maybe correct me.
> >
> >> >
> >> > The suspend, resume and cleanup paths remain untouched because they are
> >> > not supported in this setup yet.
> >> >
> >> > Signed-off-by: Lillian Grassin-Drake <[email protected]>
> >> > Signed-off-by: Sunil Muthuswamy <[email protected]>
> >> > Signed-off-by: Nuno Das Neves <[email protected]>
> >> > Co-Developed-by: Lillian Grassin-Drake <[email protected]>
> >> > Co-Developed-by: Sunil Muthuswamy <[email protected]>
> >> > Co-Developed-by: Nuno Das Neves <[email protected]>
> >> > Signed-off-by: Wei Liu <[email protected]>
> >> > ---
> >> >  arch/x86/hyperv/hv_init.c | 26 ++++++++++++++++++++++++--
> >> >  1 file changed, 24 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> >> > index 0eec1ed32023..26233aebc86c 100644
> >> > --- a/arch/x86/hyperv/hv_init.c
> >> > +++ b/arch/x86/hyperv/hv_init.c
> >> > @@ -25,6 +25,7 @@
> >> >  #include <linux/cpuhotplug.h>
> >> >  #include <linux/syscore_ops.h>
> >> >  #include <clocksource/hyperv_timer.h>
> >> > +#include <linux/highmem.h>
> >> >  
> >> >  /* Is Linux running as the root partition? */
> >> >  bool hv_root_partition;
> >> > @@ -448,8 +449,29 @@ void __init hyperv_init(void)
> >> >  
> >> >          rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> >> >          hypercall_msr.enable = 1;
> >> > -        hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
> >> > -        wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> >> > +
> >> > +        if (hv_root_partition) {
> >> > +                struct page *pg;
> >> > +                void *src, *dst;
> >> > +
> >> > +                /*
> >> > +                 * Order is important here. We must enable the hypercall page
> >> > +                 * so it is populated with code, then copy the code to an
> >> > +                 * executable page.
> >> > +                 */
> >> > +                wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> >> > +
> >> > +                pg = vmalloc_to_page(hv_hypercall_pg);
> >> > +                dst = kmap(pg);
> >> > +                src = memremap(hypercall_msr.guest_physical_address << PAGE_SHIFT, PAGE_SIZE,
> >> > +                                MEMREMAP_WB);
> >> 
> >> memremap() can fail...
> >
> > And we don't care here: if it fails, we would rather it panic or oops.
> >
> > I was relying on the fact that copying from / to a NULL pointer will
> > cause the kernel to crash. But of course it wouldn't hurt to explicitly
> > panic here.
> >
> >> 
> >> > +                memcpy(dst, src, PAGE_SIZE);
> >> > +                memunmap(src);
> >> > +                kunmap(pg);
> >> > +        } else {
> >> > +                hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
> >> > +                wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> >> > +        }
> >> 
> >> Why can't we do wrmsrl() for both cases here?
> >> 
> >
> > Because the hypercall page has already been set up when Linux is the
> > root.
> 
> But you already do wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64)
> in 'if (hv_root_partition)' case above, that's why I asked.
> 

You mean extracting wrmsrl to this point?  The ordering matters. See the
comment in the root branch -- we have to enable the page before copying
the content.

What can be done is:

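   if (!hv_root_partition) {
       /* guest: point the MSR at our own page */
       hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
   }

   wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);

   if (hv_root_partition) {
       /* root: copy the populated hypercall page into hv_hypercall_pg */
   }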

This is not looking any better than the existing code.
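
To make the ordering constraint concrete, here is a userspace mock, for
illustration only. It is not kernel code: every mock_* name and
MOCK_PAGE_SIZE is made up for this sketch, and it assumes (as the comment
in the root branch says) that the page only contains code once the MSR
has been written with the enable bit set.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096

/* Hypothetical stand-ins for hypervisor state; none of this is kernel API. */
uint8_t mock_hv_page[MOCK_PAGE_SIZE];	/* page owned by the "hypervisor" */
bool mock_hv_enabled;

/* Put the mock hypervisor back into its "not yet enabled" state. */
void mock_reset(void)
{
	memset(mock_hv_page, 0, MOCK_PAGE_SIZE);
	mock_hv_enabled = false;
}

/* Mock of the MSR write: enabling is what populates the page (assumption). */
void mock_write_hypercall_msr(bool enable)
{
	if (enable && !mock_hv_enabled) {
		/* 0xC3 is the x86 'ret' opcode, used here only as filler. */
		memset(mock_hv_page, 0xC3, MOCK_PAGE_SIZE);
		mock_hv_enabled = true;
	}
}

/*
 * Mock of the root-partition branch. With enable_first == true the MSR
 * write happens before the copy; with false the two steps are swapped.
 * Returns true if dst ends up holding the populated contents.
 */
bool mock_setup_root(uint8_t *dst, bool enable_first)
{
	if (enable_first)
		mock_write_hypercall_msr(true);

	memcpy(dst, mock_hv_page, MOCK_PAGE_SIZE);

	if (!enable_first)
		mock_write_hypercall_msr(true);

	return dst[0] == 0xC3;
}
```

In this mock, copying before the write leaves dst full of zeroes, which
is why the wrmsrl cannot simply be hoisted below the copy.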

Wei.

> >
> > I could've tried writing to the MSR again, but since the behaviour
> > here is undocumented and subject to change, I didn't bother trying.
> >
> > Wei.
> >
> >> >  
> >> >          /*
> >> >           * Ignore any errors in setting up stimer clockevents
> >> 
> >> -- 
> >> Vitaly
> >> 
> >
> 
> -- 
> Vitaly
> 
