On Wed, Jun 14, 2017 at 01:12:12PM +0200, Paolo Bonzini wrote:
> 
> 
> On 06/06/2017 20:19, Roman Kagan wrote:
> > There is a design flaw in the Hyper-V SynIC implementation in KVM: when
> > message page or event flags page is enabled by setting the corresponding
> > msr, KVM zeroes it out.  This violates the spec in general (per spec,
> > the pages have to be overlay ones and only zeroed at cpu reset), but
> > it's non-fatal in normal operation because the user exit happens after
> > the page is zeroed, so it's the underlying guest page which is zeroed
> > out, and sane guests don't depend on its contents to be preserved while
> > it's overlaid.
> > 
> > However, in the case of vmstate load the overlay pages are set up before
> > msrs are set so the contents of those pages get lost.
> > 
> > To work it around, avoid setting up overlay pages in .post_load.
> > Instead, postpone it until after the msrs are pushed to KVM.  As a
> > result, KVM just zeroes out the underlying guest pages similar to how it
> > happens during guest-initiated msr writes, which is tolerable.
> 
> Why not disable the zeroing for host-initiated MSR writes?  This is
> pretty clearly a KVM bug, we can push it to stable kernels too.

The only problem with this is that QEMU will have no reliable way to
know whether the KVM it runs on has this bug fixed.  Machines
without vmbus work and even migrate fine with the current KVM despite
this bug (the only current user of those pages is the synic timers,
which re-arm themselves and post messages regardless of the zeroing).
However, updating QEMU to a vmbus-enabled version without updating the
kernel will make migrations cause guest hangs.
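
To make the ordering issue concrete, here is a toy model of the two
load orders and of the proposed kernel fix.  It is only a sketch: none
of these names exist in QEMU or KVM, and the real page mapping is
reduced to a byte buffer that the "MSR write" either zeroes or leaves
alone.

```python
# Toy model only: ToyKvm, set_msr, load_overlay_first and load_msrs_first
# are hypothetical names made up for illustration, not QEMU or KVM APIs.
PAGE_SIZE = 4096


class ToyKvm:
    """Models a KVM that zeroes the page visible at the SynIC address
    whenever the corresponding MSR is written."""

    def __init__(self, skip_zeroing_for_host_writes=False):
        # True models a kernel with zeroing disabled for host-initiated
        # MSR writes; False models the current behaviour.
        self.skip_zeroing_for_host_writes = skip_zeroing_for_host_writes

    def set_msr(self, visible_page, host_initiated):
        if host_initiated and self.skip_zeroing_for_host_writes:
            return
        # Zero whatever page is currently visible at that address.
        visible_page[:] = bytes(len(visible_page))


def load_overlay_first(kvm, saved_contents):
    """Overlay set up in .post_load, MSRs pushed afterwards."""
    overlay = bytearray(saved_contents)
    kvm.set_msr(overlay, host_initiated=True)  # hits the overlay page
    return bytes(overlay)


def load_msrs_first(kvm, saved_contents):
    """MSRs pushed first, overlay set up afterwards (this patch)."""
    guest_page = bytearray(b"\xaa" * PAGE_SIZE)
    kvm.set_msr(guest_page, host_initiated=True)  # hits the guest page only
    overlay = bytearray(saved_contents)
    return bytes(overlay)


if __name__ == "__main__":
    saved = bytes([1]) * PAGE_SIZE
    unfixed = ToyKvm()
    fixed = ToyKvm(skip_zeroing_for_host_writes=True)
    print("overlay first, unfixed kernel:",
          load_overlay_first(unfixed, saved) == saved)
    print("overlay first, fixed kernel:  ",
          load_overlay_first(fixed, saved) == saved)
    print("msrs first, unfixed kernel:   ",
          load_msrs_first(unfixed, saved) == saved)
```

In this model the migrated contents survive either with the reordering
or with the kernel fix, but the overlay-first order gives no indication
of which kind of kernel it is running on.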

If that is tolerable I can happily drop this patch, as it complicates
the code a little.  Distros probably won't be affected as they can make sure
their kernels have this bug fixed before they roll out a vmbus-capable
QEMU.

What do you think?

Thanks,
Roman.
