Re: [Qemu-devel] [PATCH v8 4/8] ACPI: Add Virtual Machine Generation ID support

Michael S. Tsirkin Tue, 21 Feb 2017 06:18:08 -0800

On Tue, Feb 21, 2017 at 10:58:05AM +0100, Laszlo Ersek wrote:
> On 02/21/17 02:43, Michael S. Tsirkin wrote:
> > On Mon, Feb 20, 2017 at 09:55:40PM +0100, Laszlo Ersek wrote:
> >> On 02/20/17 21:45, Eric Blake wrote:
> >>> On 02/20/2017 02:19 PM, Dr. David Alan Gilbert wrote:
> >>>> * Eric Blake (ebl...@redhat.com) wrote:
> >>>>> On 02/20/2017 04:23 AM, Dr. David Alan Gilbert wrote:
> >>>>>> * Laszlo Ersek (ler...@redhat.com) wrote:
> >>>>>>> CC Dave
> >>>>>>
> >>>>>> This isn't an area I really understand; but if I'm
> >>>>>> reading this right then 
> >>>>>>    vmgenid is stored in fw_cfg?
> >>>>>>    fw_cfg isn't migrated
> >>>>>>
> >>>>>> So why should any changes to it get migrated, except if it's already
> >>>>>> been read by the guest (and if the guest reads it again aftwards what's
> >>>>>> it expected to read?)
> >>>>>
> >>>>> Why are we expecting it to change on migration?  You want a new value
> >>>>
> >>>> I'm not; I was asking why a change made prior to migration would be
> >>>> preserved across migration.
> >>>
> >>> Okay, so you're asking what happens if the source requests the vmgenid
> >>> device, and sets an id, but the destination of the migration does not
> >>> request anything
> >>
> >> This should never happen, as it means different QEMU command lines on
> >> source vs. target hosts. (Different as in "incorrectly different".)
> >>
> >> Dave writes, "a change made prior to migration". Change made to what?
> >>
> >> - the GUID cannot be changed via the monitor once QEMU has been started.
> >> We dropped the monitor command for that, due to lack of a good use case,
> >> and due to lifecycle complexities. We have figured out a way to make it
> >> safe, but until there's a really convincing use case, we shouldn't add
> >> that complexity.
> > 
> > True but we might in the future, and it seems prudent to make
> > migration stream future-proof for that.
> 
> It is already.
> 
> The monitor command, if we add it, can be implemented incrementally. I
> described it as "approach (iii)" elsewhere in the thread. This is a more
> detailed recap:
> 
> - introduce a new device property (internal only), such as
>   "x-enable-set-vmgenid". Make it reflect whether a given machine type
>   supports the monitor command.


This is the part we can avoid at no real cost just
by making sure the guid is migrated.


> - change the /etc/vmgenid_guid fw_cfg blob from callback-less to one
>   with a selection callback
> 
> - add a new boolean latch to the vmgenid device, called
>   "guid_blob_selected" or something similar
> 
> - the reset handler sets the latch to FALSE
>   (NB: the reset handler already sets /etc/vmgenid_addr to zero)
> 
> - the select callback for /etc/vmgenid_guid sets the latch to TRUE
> 
> - the latch is added to the migration stream as a subsection *if*
>   x-enable-set-vmgenid is TRUE
> 
> - the set-vmgenid monitor command checks all three of:
>   x-enable-set-vmgenid, the latch, and the contents of
>   /etc/vmgenid_addr:
> 
>   - if x-enable-set-vmgenid is FALSE, the monitor command returns
>     QERR_UNSUPPORTED (this is a generic error class, with an
>     "unsupported" error message). Otherwise,
> 
>   - if the latch is TRUE *and* /etc/vmgenid_addr is zero, then the
>     guest firmware has executed (or started executing) ALLOCATE for
>     /etc/vmgenid_guid, but it has not executed WRITE_POINTER yet.
>     In this case updating the VMGENID from the monitor is unsafe
>     (we cannot guarantee informing the guest successfully), so in this
>     case the monitor command fails with ERROR_CLASS_DEVICE_NOT_ACTIVE.
>     The caller should simply try a bit later. (By which time the
>     firmware will likely have programmed /etc/vmgenid_addr.)

This makes no sense to me. Just update it in qemu memory
and write when guest asks for it.


>     Libvirt can recognize this error specifically, because it is not the
>     generic error class. ERROR_CLASS_DEVICE_NOT_ACTIVE stands for
>     "EAGAIN", practically, in this case.
> 
>   - Otherwise -- meaning latch is FALSE *or* /etc/vmgenid_addr is
>     nonzero, that is, the guest has either not run ALLOCATE since
>     reset, *or* it has, but it has also run WRITE_POINTER):
> 
>     - refresh the GUID within the fw_cfg blob for /etc/vmgenid_guid
>       in-place -- the guest will see this whenever it runs ALLOCATE for
>       /etc/vmgenid_guid, *AND*
> 
>     - if /etc/vmgenid_addr is not zero, then update the guest (that is,
>       RAM write + SCI)
> 
> Thanks
> Laszlo

Seems way more painful than it has to be. Just migrate the guid
and then management can write it at any time.


> > 
> >> - the address of the GUID is changed (the firmware programs it from
> >> "zero" to an actual address, in a writeable fw_cfg file), and that piece
> >> of info is explicitly migrated, as part of the vmgenid device's vmsd.
> >>
> >> Thanks
> >> Laszlo
> >>
> >>
> >>> - how does the guest on the destination see the same id
> >>> as was in place on the source at the time migration started.
> >>>
> >>>>
> >>>>
> >>>>> when you load state from disk (you don't know how many times the same
> >>>>> state has been loaded previously, so each load is effectively forking
> >>>>> the VM and you want a different value), but for a single live migration,
> >>>>> you aren't forking the VM and don't need a new generation ID.
> >>>>>
> >>>>> I guess it all boils down to what command line you're using: if libvirt
> >>>>> is driving a live migration, it will request the same UUID in the
> >>>>> command line of the destination as what is on the source; while if
> >>>>> libvirt is loading from a [managed]save to restore state from a file, it
> >>>>> will either request a new UUID directly or request auto to let qemu
> >>>>> generate the new id.
> >>>>
> >>>> Hmm now I've lost it a bit; I thought we would preserve the value
> >>>> transmitted from the source, not the value on the command line of the 
> >>>> destination.
> >>>
> >>> I guess I'm trying to figure out whether libvirt MUST read the current
> >>> id and explicitly tell the destination of migration to reuse that id, or
> >>> if libvirt can omit the id on migration and everything just works
> >>> because the id was migrated from the source.
> >>>

Re: [Qemu-devel] [PATCH v8 4/8] ACPI: Add Virtual Machine Generation ID support

Reply via email to