On 19.11.20 18:38, Mike Rapoport wrote:

On Thu, Nov 19, 2020 at 01:51:18PM +0100, Alexander Graf wrote:


On 19.11.20 13:02, Christian Borntraeger wrote:

On 16.11.20 16:34, Catangiu, Adrian Costin wrote:
- Background

The VM Generation ID is a feature defined by Microsoft (paper:
http://go.microsoft.com/fwlink/?LinkId=260709) and supported by
multiple hypervisor vendors.

The feature is required in virtualized environments by apps that work
with local copies/caches of world-unique data such as random values,
uuids, monotonically increasing counters, etc.
Such apps can be negatively affected by VM snapshotting when the VM
is either cloned or returned to an earlier point in time.

The VM Generation ID is a simple concept meant to alleviate the issue
by providing a unique ID that changes each time the VM is restored
from a snapshot. The hw provided UUID value can be used to
differentiate between VMs or different generations of the same VM.
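
As a rough illustration of the concept (not code from the patch), an application holding a cached world-unique value could compare the generation ID it last saw against the current one and regenerate the value on a mismatch. The class name, the `read_generation` callable, and the simulated ID below are all assumptions made for this sketch:

```python
import uuid
from typing import Callable


class GenerationGuardedValue:
    """Caches a world-unique value; regenerates it when the generation ID changes."""

    def __init__(self, read_generation: Callable[[], bytes],
                 make_value: Callable[[], str]) -> None:
        self._read_generation = read_generation  # hypothetical source of the ID
        self._make_value = make_value
        self._seen_generation = read_generation()
        self._value = make_value()

    def get(self) -> str:
        current = self._read_generation()
        if current != self._seen_generation:
            # The VM was cloned or restored: the cached value may now be duplicated.
            self._seen_generation = current
            self._value = self._make_value()
        return self._value


# Simulate a hypervisor-provided ID with a mutable holder (assumption).
generation = {"id": uuid.uuid4().bytes}
cached = GenerationGuardedValue(lambda: generation["id"], lambda: uuid.uuid4().hex)

before = cached.get()
same = cached.get()                      # no generation change: value is reused
generation["id"] = uuid.uuid4().bytes    # pretend the VM was restored from a snapshot
after = cached.get()                     # generation changed: value is regenerated
```

A real consumer would also have to cope with the generation changing between the check and the use of the value, which this sketch does not attempt.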

- Problem

The VM Generation ID is exposed through an ACPI device by multiple
hypervisor vendors, but neither the vendors nor upstream Linux have a
default driver for it, leaving users to fend for themselves.

I see that the qemu implementation is still under discussion. What is

Uh, the ACPI Vmgenid device emulation is in QEMU since 2.9.0 :).

the status of the other existing implementations? Do they already exist?
In other words is ACPI a given?
I think the majority of this driver could be used with just a different
backend for platforms without ACPI so in any case we could factor out
the backend (acpi, virtio, whatever) but if we are open we could maybe
start with something else.

I agree 100%. I don't think we really need a new framework in the kernel for
that. We can just have for example an s390x specific driver that also
provides the same notification mechanism through a device node that is also
named "/dev/vmgenid", no?

Or alternatively we can split out the generic part of this driver as soon as a
second one comes along and then have both drivers include that generic logic.
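
A minimal sketch of what such a split could look like, in user-space Python rather than kernel code: a backend-independent core that detects changes and notifies subscribers, and a small backend interface that each platform would implement. All names here are hypothetical, and treating the generation as an integer is an assumption of the sketch:

```python
from typing import Callable, List, Protocol


class GenerationBackend(Protocol):
    """Platform-specific piece: how the generation value is obtained."""

    def read_generation(self) -> int: ...


class GenerationCore:
    """Shared piece: detects changes and notifies subscribers."""

    def __init__(self, backend: GenerationBackend) -> None:
        self._backend = backend
        self._last = backend.read_generation()
        self._subscribers: List[Callable[[int], None]] = []

    def subscribe(self, callback: Callable[[int], None]) -> None:
        self._subscribers.append(callback)

    def poll(self) -> bool:
        """Re-read the backend; notify subscribers if the generation changed."""
        current = self._backend.read_generation()
        if current == self._last:
            return False
        self._last = current
        for callback in self._subscribers:
            callback(current)
        return True


class FakeBackend:
    """Stand-in for an ACPI, virtio, or s390x-specific source (illustrative only)."""

    def __init__(self, generation: int = 0) -> None:
        self.generation = generation

    def read_generation(self) -> int:
        return self.generation


backend = FakeBackend()
core = GenerationCore(backend)
seen: List[int] = []
core.subscribe(seen.append)

unchanged = core.poll()      # nothing has happened yet
backend.generation = 1       # pretend the platform reported a new generation
changed = core.poll()        # core notices and notifies the subscriber
```

Only `FakeBackend` would differ per platform here; the change detection and notification logic is shared.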

The only piece where I'm unsure is how this will interact with CRIU.

To C/R applications that use /dev/vmgenid, CRIU needs to be aware of it.
Checkpointing and restoring within the same "VM generation" shouldn't be
a problem, but IMHO, making restore work after genid bump could be
challenging.

Alex, what scenario involving CRIU did you have in mind?

You can in theory run into the same situation with containers that this patch is solving for virtual machines. You could for example do a snapshot of a prewarmed Java runtime with CRIU to get full JIT speeds starting from the first request.

That however means you run into the problem of predictable randomness again.


Can containers emulate ioctls and device nodes?

Containers do not emulate ioctls but they can have /dev/vmgenid inside
the container, so applications can use it the same way as outside the
container.

Hm. I suppose we could add a CAP_SYS_ADMIN ioctl interface to /dev/vmgenid (when container people get to the point of needing it) that sets the generation to "at least X". That way on restore, you could just call that with "generation at snapshot"+1.

That also means we need to have this interface available without virtual machines then though, right?
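
To make the "at least X" semantics concrete, here is a small user-space model of it. This is not an existing interface; the class, the method name, and the boolean `privileged` flag (standing in for the capability check) are assumptions made for the sketch:

```python
class GenerationCounter:
    """Monotonic generation counter with an 'at least X' setter."""

    def __init__(self, initial: int = 0) -> None:
        self._generation = initial

    @property
    def generation(self) -> int:
        return self._generation

    def set_at_least(self, minimum: int, privileged: bool) -> int:
        """Raise the generation to `minimum` if it is lower; never lower it.

        `privileged` stands in for the capability check on the caller.
        """
        if not privileged:
            raise PermissionError("setting the generation requires privilege")
        if minimum > self._generation:
            self._generation = minimum
        return self._generation


counter = GenerationCounter(initial=5)
snapshot_generation = counter.generation   # recorded at checkpoint time

# On restore, bump to "generation at snapshot" + 1.
after_restore = counter.set_at_least(snapshot_generation + 1, privileged=True)

# A stale, lower request leaves the counter untouched.
after_stale = counter.set_at_least(3, privileged=True)
```

Because the setter can only move the value forward, a restore tool cannot accidentally rewind the generation seen by other users of the device.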


Alex


