On Wed, 14 Nov 2012, Andres Lagar-Cavilla wrote:
> Stefano, and Xen-qemu team, I have a question.
>
> The standard Xen-qemu workflow has Xen manage the physmap for a VM, and
> allocate all the backing memory for valid pfns, regardless of whether they
> are MMIO, RAM, etc. On save/migrate, when using upstream qemu, a special
> monitor command is used to save the device model state, while the
> save/restore code blocks in libxc takes care of the memory.
>
> Qemu has a chain of ram blocks with offsets, each of which is further
> subdivided into memory regions that map to specific chunks of the physmap.
>
> AFAICT, the restore code in libxc has no knowledge of qemu's ram blocks and
> offsets. My question is, how is a mismatch avoided?
>
> How does the workflow ensure that all the sub regions in each ram block map
> to the same physmap chunks on restore? Is this an implicit guarantee from
> qemu when building the VM (with the same command line) on the restore side?
>
> Are the regions and physmap offsets contained in the device state that is
> saved?
>
> If, for example, I were to save/restore a VM with four e1000 emulated
> devices, how does the workflow guarantee that each physmap region backing
> each e1000 ROM gets reconstructed with exactly the same ram block, offset,
> and physmap chunk coordinates?
>
> Code inspection seems to suggest qemu will lay out things deterministically
> given the command line. I want to make sure I am not missing anything.

Yes, it does. Moreover, QEMU saves everything it needs to restore the state of the devices exactly as it was, including the addresses and sizes of MMIO regions.

The only issue is the videoram: even though it is an MMIO region, it is saved by Xen because it looks like normal RAM to the hypervisor. To solve the problem, QEMU writes the location and size of the videoram to xenstore and keeps those records up to date. The toolstack reads the records and adds them to the savefile. At restore time the toolstack writes the records back to xenstore, and QEMU uses them at boot time to populate a list of physmap regions; see xen_read_physmap.
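
To make the videoram flow concrete, here is a rough sketch in Python. It is not the actual QEMU or toolstack code: a plain dict stands in for xenstore, and the key layout, the function names and the PhysmapRegion type are all made up for illustration. Only the order of the steps follows the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysmapRegion:
    """Hypothetical record: where a region sits and how big it is."""
    phys_offset: int
    start_addr: int
    size: int


def physmap_prefix(domid):
    # Assumed key layout, for illustration only; not the real xenstore path.
    return f"/hypothetical/device-model/{domid}/physmap/"


def qemu_publish_region(store, domid, region):
    """QEMU side: record the region's location and size in the store."""
    base = f"{physmap_prefix(domid)}{region.phys_offset:x}"
    store[f"{base}/start_addr"] = f"{region.start_addr:x}"
    store[f"{base}/size"] = f"{region.size:x}"


def toolstack_save_records(store, domid):
    """Toolstack side, save: copy the records out so they can go in the savefile."""
    prefix = physmap_prefix(domid)
    return {k[len(prefix):]: v for k, v in store.items() if k.startswith(prefix)}


def toolstack_restore_records(store, domid, records):
    """Toolstack side, restore: write the records back for the new domain."""
    prefix = physmap_prefix(domid)
    for key, value in records.items():
        store[prefix + key] = value


def qemu_read_physmap(store, domid):
    """QEMU side, at startup: rebuild the list of physmap regions from the store."""
    prefix = physmap_prefix(domid)
    offsets = {k[len(prefix):].split("/")[0] for k in store if k.startswith(prefix)}
    regions = []
    for off in sorted(offsets, key=lambda s: int(s, 16)):
        base = prefix + off
        regions.append(PhysmapRegion(
            phys_offset=int(off, 16),
            start_addr=int(store[f"{base}/start_addr"], 16),
            size=int(store[f"{base}/size"], 16),
        ))
    return regions


# Example round trip: the domain ID differs between the save and restore side.
source_store, target_store = {}, {}
vram = PhysmapRegion(phys_offset=0x3F000000, start_addr=0xF0000000, size=0x800000)
qemu_publish_region(source_store, 5, vram)
saved = toolstack_save_records(source_store, 5)
toolstack_restore_records(target_store, 9, saved)
restored = qemu_read_physmap(target_store, 9)
```

The records are stored relative to the per-domain prefix so that they can be written back under a different domain ID on the restore side. The addresses in the example are arbitrary values, not what a real guest would use.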