Steve Sistare <steven.sist...@oracle.com> writes: > This patch series adds the live migration cpr-exec mode. In this mode, QEMU > stops the VM, writes VM state to the migration URI, and directly exec's a > new version of QEMU on the same host, replacing the original process while > retaining its PID. Guest RAM is preserved in place, albeit with new virtual > addresses. The user completes the migration by specifying the -incoming > option, and by issuing the migrate-incoming command if necessary. This > saves and restores VM state, with minimal guest pause time, so that QEMU may > be updated to a new version in between. > > The new interfaces are: > * cpr-exec (MigMode migration parameter) > * cpr-exec-args (migration parameter) > * memfd-alloc=on (command-line option for -machine) > * only-migratable-modes (command-line argument) > > The caller sets the mode parameter before invoking the migrate command. > > Arguments for the new QEMU process are taken from the cpr-exec-args parameter. > The first argument should be the path of a new QEMU binary, or a prefix > command that exec's the new QEMU binary, and the arguments should include > the -incoming option. > > Memory backend objects must have the share=on attribute, and must be mmap'able > in the new QEMU process. For example, memory-backend-file is acceptable, > but memory-backend-ram is not. > > QEMU must be started with the '-machine memfd-alloc=on' option. This causes > implicit RAM blocks (those not explicitly described by a memory-backend > object) to be allocated by mmap'ing a memfd. Examples include VGA, ROM, > and even guest RAM when it is specified without without reference to a > memory-backend object. The memfds are kept open across exec, their values > are saved in vmstate which is retrieved after exec, and they are re-mmap'd. > > The '-only-migratable-modes cpr-exec' option guarantees that the > configuration supports cpr-exec. QEMU will exit at start time if not. > > Example: > > In this example, we simply restart the same version of QEMU, but in > a real scenario one would set a new QEMU binary path in cpr-exec-args. > > # qemu-kvm -monitor stdio -object > memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on > -m 4G -machine memfd-alloc=on ... > > QEMU 9.1.50 monitor - type 'help' for more information > (qemu) info status > VM status: running > (qemu) migrate_set_parameter mode cpr-exec > (qemu) migrate_set_parameter cpr-exec-args qemu-kvm ... -incoming > file:vm.state > (qemu) migrate -d file:vm.state > (qemu) QEMU 9.1.50 monitor - type 'help' for more information > (qemu) info status > VM status: running > > cpr-exec mode preserves attributes of outgoing devices that must be known > before the device is created on the incoming side, such as the memfd > descriptor > number, but currently the migration stream is read after all devices are > created. To solve this problem, I add two VMStateDescription options: > precreate and factory. precreate objects are saved to their own migration > stream, distinct from the main stream, and are read early by incoming QEMU, > before devices are created. Factory objects are allocated on demand, without > relying on a pre-registered object's opaque address, which is necessary > because the devices to which the state will apply have not been created yet > and hence have not registered an opaque address to receive the state. > > This patch series implements a minimal version of cpr-exec. Future series > will add support for: > * vfio > * chardev's without loss of connectivity > * vhost > * fine-grained seccomp controls > * hostmem-memfd > * cpr-exec migration test > > > Steve Sistare (26): > oslib: qemu_clear_cloexec > vl: helper to request re-exec > migration: SAVEVM_FOREACH > migration: delete unused parameter mis > migration: precreate vmstate > migration: precreate vmstate for exec > migration: VMStateId > migration: vmstate_info_void_ptr > migration: vmstate_register_named > migration: vmstate_unregister_named > migration: vmstate_register at init time > migration: vmstate factory object > physmem: ram_block_create > physmem: hoist guest_memfd creation > physmem: hoist host memory allocation > physmem: set ram block idstr earlier > machine: memfd-alloc option > migration: cpr-exec-args parameter > physmem: preserve ram blocks for cpr > migration: cpr-exec mode > migration: migrate_add_blocker_mode > migration: ram block cpr-exec blockers > migration: misc cpr-exec blockers > seccomp: cpr-exec blocker > migration: fix mismatched GPAs during cpr-exec > migration: only-migratable-modes > > accel/xen/xen-all.c | 5 + > backends/hostmem-epc.c | 12 +- > hmp-commands.hx | 2 +- > hw/core/machine.c | 22 +++ > hw/core/qdev.c | 1 + > hw/intc/apic_common.c | 2 +- > hw/vfio/migration.c | 3 +- > include/exec/cpu-common.h | 3 +- > include/exec/memory.h | 15 ++ > include/exec/ramblock.h | 10 +- > include/hw/boards.h | 1 + > include/migration/blocker.h | 7 + > include/migration/cpr.h | 14 ++ > include/migration/misc.h | 11 ++ > include/migration/vmstate.h | 133 +++++++++++++++- > include/qemu/osdep.h | 9 ++ > include/sysemu/runstate.h | 3 + > include/sysemu/seccomp.h | 1 + > include/sysemu/sysemu.h | 1 - > migration/cpr.c | 131 ++++++++++++++++ > migration/meson.build | 3 + > migration/migration-hmp-cmds.c | 50 +++++- > migration/migration.c | 48 +++++- > migration/migration.h | 5 +- > migration/options.c | 13 ++ > migration/precreate.c | 139 +++++++++++++++++ > migration/ram.c | 16 +- > migration/savevm.c | 306 +++++++++++++++++++++++++++++------- > migration/savevm.h | 3 + > migration/trace-events | 7 + > migration/vmstate-factory.c | 78 ++++++++++ > migration/vmstate-types.c | 24 +++ > migration/vmstate.c | 3 +- > qapi/migration.json | 48 +++++- > qemu-options.hx | 22 ++- > replay/replay.c | 6 + > stubs/migr-blocker.c | 5 + > stubs/vmstate.c | 13 ++ > system/globals.c | 1 - > system/memory.c | 19 ++- > system/physmem.c | 346 > +++++++++++++++++++++++++++-------------- > system/qemu-seccomp.c | 10 +- > system/runstate.c | 29 ++++ > system/trace-events | 4 + > system/vl.c | 26 +++- > target/s390x/cpu_models.c | 4 +- > util/oslib-posix.c | 9 ++ > util/oslib-win32.c | 4 + > 48 files changed, 1417 insertions(+), 210 deletions(-) > create mode 100644 include/migration/cpr.h > create mode 100644 migration/cpr.c > create mode 100644 migration/precreate.c > create mode 100644 migration/vmstate-factory.c
Hi Steve, make check is failing. I applied the series on top of master @ 70581940ca (Merge tag 'pull-tcg-20240523' of https://gitlab.com/rth7680/qemu into staging, 2024-05-23). $ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/ivshmem-test ... qemu-system-x86_64: ../system/physmem.c:1634: qemu_ram_verify_idstr: Assertion `!strcmp(new_block->idstr, idstr)' failed. $ QTEST_QEMU_BINARY=./qemu-system-x86_64 \ ./tests/qtest/test-x86-cpuid-compat -p \ /x86_64/x86/cpuid/auto-level/pc-2.7 ... qemu-system-x86_64: ../system/physmem.c:1634: qemu_ram_verify_idstr: Assertion `!strcmp(new_block->idstr, idstr)' failed. $ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/qmp-cmd-test -p \ /x86_64/qmp/object-add-failure-modes ... savevm_state_handler_insert: Detected duplicate SaveStateEntry: id=ram1/RAMBlock, instance_id=0x0