On 5/2/2024 8:23 AM, Markus Armbruster wrote:
Steve Sistare <steven.sist...@oracle.com> writes:

Add the cpr-exec migration mode.  Usage:
   qemu-system-$arch -machine memfd-alloc=on ...
   migrate_set_parameter mode cpr-exec
   migrate_set_parameter cpr-exec-args \
     <arg1> <arg2> ... -incoming <uri>
   migrate -d <uri>
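
For management software that drives QEMU over QMP instead of the monitor, the usage above might be sketched as follows. This is only an illustration: migrate-set-parameters and migrate are existing QMP commands, but passing cpr-exec-args as a list of strings is an assumption about how this patch defines the parameter, and the binary path and file URI in the usage note are hypothetical placeholders.

```python
import json


def cpr_exec_qmp_commands(exec_args, uri):
    """Build the QMP commands mirroring the monitor usage above.

    Assumption: 'cpr-exec-args' accepts a list of strings.  The outgoing
    URI is expected to match the one given after -incoming in exec_args.
    """
    return [
        {
            "execute": "migrate-set-parameters",
            "arguments": {
                "mode": "cpr-exec",
                "cpr-exec-args": list(exec_args),
            },
        },
        {"execute": "migrate", "arguments": {"uri": uri}},
    ]


def to_wire(commands):
    """Serialize each command as one JSON line, as sent on a QMP socket."""
    return "\n".join(json.dumps(cmd) for cmd in commands)
```

For example, cpr_exec_qmp_commands(["/usr/bin/qemu-system-x86_64", "-incoming", "file:/tmp/vm.state"], "file:/tmp/vm.state") yields the two commands to send in order.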

The migrate command stops the VM, saves state to the URI,
directly exec's a new version of QEMU on the same host,
replacing the original process while retaining its PID, and
loads state from the URI.  Guest RAM is preserved in place,
albeit with new virtual addresses.

Arguments for the new QEMU process are taken from the
@cpr-exec-args parameter.  The first argument should be the
path of a new QEMU binary, or a prefix command that exec's the
new QEMU binary.

Because old QEMU terminates when new QEMU starts, one cannot
stream data between the two, so the URI must be a type, such as
a file, that reads all data before old QEMU exits.

Memory backend objects must have the share=on attribute, and
must be mmap'able in the new QEMU process.  For example,
memory-backend-file is acceptable, but memory-backend-ram is
not.

The VM must be started with the '-machine memfd-alloc=on'
option.  This causes implicit ram blocks (those not explicitly
described by a memory-backend object) to be allocated by
mmap'ing a memfd.  Examples include VGA, ROM, and even guest
RAM when it is specified without a memory-backend object.

The implementation saves precreate vmstate at the end of normal
migration in migrate_fd_cleanup, and tells the main loop to call
cpr_exec.  Incoming qemu loads precreate state early, before objects
are created.  The memfds are kept open across exec by clearing the
close-on-exec flag, their values are saved in precreate vmstate,
and they are mmap'd in new qemu.

Note that the memfd-alloc option is not related to memory-backend-memfd.
Later patches add support for memory-backend-memfd, and for additional
devices, including vfio, chardev, and more.

Signed-off-by: Steve Sistare <steven.sist...@oracle.com>

[...]

diff --git a/qapi/migration.json b/qapi/migration.json
index 49710e7..7c5f45f 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -665,9 +665,37 @@
  #     or COLO.
  #
  #     (since 8.2)
+#
+# @cpr-exec: The migrate command stops the VM, saves state to the URI,
+#     directly exec's a new version of QEMU on the same host,
+#     replacing the original process while retaining its PID, and
+#     loads state from the URI.  Guest RAM is preserved in place,
+#     albeit with new virtual addresses.

Do you mean the virtual addresses of guest RAM may differ between old and
new QEMU process?

The VA at which a guest RAM segment is mapped in the QEMU process
changes.  The end user would not notice or care, so I'll drop that
detail here.

+#
+#     Arguments for the new QEMU process are taken from the
+#     @cpr-exec-args parameter.  The first argument should be the
+#     path of a new QEMU binary, or a prefix command that exec's the
+#     new QEMU binary.

What's a "prefix command"?  A wrapper script, perhaps?

A prefix command is any command of the form:
  command1 command1-args command2 command2-args
where command1 performs some set up before exec'ing command2.
However, I will drop the word "prefix", it adds no meaning here.
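
As an illustration of that form, here is a minimal sketch of a command1 written in Python.  The wrapper, its helper names, and the environment variable it sets are all hypothetical; the only point is that it performs some set up and then exec's the remainder of its argument list, so command2 replaces it in the same process.

```python
import os
import sys


def split_prefix_command(argv):
    """Return (program, args) for the command that follows the wrapper name."""
    if len(argv) < 2:
        raise ValueError("usage: wrapper command2 [command2-args...]")
    # args[0] is conventionally the program name itself
    return argv[1], argv[1:]


def main(argv):
    # Hypothetical set up step performed before handing over to command2.
    os.environ["CPR_WRAPPER_RAN"] = "1"
    program, args = split_prefix_command(argv)
    # Replaces the current process image; does not return on success.
    os.execvp(program, args)
```

Invoked as "wrapper qemu-system-x86_64 -incoming <uri>", main(sys.argv) would exec qemu-system-x86_64 with the remaining arguments and the PID unchanged.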

+#
+#     Because old QEMU terminates when new QEMU starts, one cannot
+#     stream data between the two, so the URI must be a type, such as
+#     a file, that reads all data before old QEMU exits.

What happens when you specify a URI that doesn't?

Old QEMU will quietly block indefinitely writing to the URI.

+#
+#     Memory backend objects must have the share=on attribute, and
+#     must be mmap'able in the new QEMU process.  For example,
+#     memory-backend-file is acceptable, but memory-backend-ram is
+#     not.
+#
+#     The VM must be started with the '-machine memfd-alloc=on'

What happens when you don't?

If '-only-migratable-modes cpr-exec' is specified, then QEMU will fail
to start, and print a clear error message.

Otherwise, a blocker is registered and any attempt to cpr-exec will fail
with a clear error message.

- Steve

+#     option.  This causes implicit ram blocks -- those not explicitly
+#     described by a memory-backend object -- to be allocated by
+#     mmap'ing a memfd.  Examples include VGA, ROM, and even guest
+#     RAM when it is specified without a memory-backend object.
+#
+#     (since 9.1)
  ##
  { 'enum': 'MigMode',
-  'data': [ 'normal', 'cpr-reboot' ] }
+  'data': [ 'normal', 'cpr-reboot', 'cpr-exec' ] }
##
  # @ZeroPageDetection:

[...]

