A followup to v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html
When QMP was first introduced some 10+ years ago, the snapshot related commands (savevm/loadvm/delvm) were not converted. This was primarily because their implementation blocks the thread running the monitor commands. This was (and still is) considered undesirable behaviour both in HMP and QMP.

In theory someone was supposed to fix this flaw at some point in the past 10 years and bring these commands into the QMP world. Sadly, thus far it hasn't happened, as people always had more important things to work on. Enterprise apps were much more interested in external snapshots than internal snapshots, as the former have many more features.

Meanwhile users still want to use internal snapshots, as there is a certain simplicity in having everything self-contained in one image, even though they have limitations. Thus apps end up executing savevm/loadvm/delvm via the "human-monitor-command" QMP command. IOW, the problematic blocking behaviour that was one of the reasons for not having savevm/loadvm/delvm in QMP is experienced by applications regardless. By not porting the commands to QMP due to one design flaw, we've forced apps and users to suffer from the other design flaws of HMP (bad error reporting, no strong type checking of args, no introspection) for an additional 10 years. This feels rather sub-optimal :-(

In practice users don't appear to care strongly about the fact that these commands block the VM while they run. I might have seen one bug report about it, but it certainly isn't something that comes up as a frequent topic, except among us QEMU maintainers. Users do care about having access to the snapshot feature.

Where I am seeing frequent complaints is wrt the use of OVMF combined with snapshots, which has some serious pain points. This is getting worse as the push to ditch legacy BIOS in favour of UEFI gains momentum, both across OS vendors and mgmt apps. Solving it requires new parameters to the commands, but doing this in HMP is super unappealing.
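To illustrate the status quo, an app wanting an internal snapshot over QMP currently has to tunnel the HMP command through "human-monitor-command" along these lines (the snapshot tag "snap1" is just an illustrative name):

```json
{ "execute": "human-monitor-command",
  "arguments": { "command-line": "savevm snap1" } }
```

Any failure comes back as a free-form text string in the reply rather than a structured QMP error, which is exactly the kind of machine-unfriendliness described above.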
After 10 years, I think it is time for us to be a little pragmatic about our handling of the snapshot commands. My desire is that libvirt should never use "human-monitor-command" under any circumstances, because of the inherent flaws in HMP as a protocol for machine consumption.

Thus in this series I'm proposing a fairly direct mapping of the existing HMP commands for savevm/loadvm/delvm into QMP as a first step. This does not solve the blocking thread problem, but it does put in place a design using the jobs framework which can facilitate solving it later. It also solves the error reporting, type checking and introspection problems inherent to HMP. So we're winning on 3 out of the 4 problems, and pushing apps to a QMP design that will let us solve the last remaining problem later.

With a QMP variant, we can reasonably deal with the problems related to OVMF:

 - The logic to pick which disk to store the vmstate in is not
   satisfactory. The first block driver state cannot be assumed to be
   the root disk image; it might be the OVMF varstore, and we don't
   want to store vmstate in there.

 - The logic to decide which disks must be snapshotted is hardwired to
   all disks which are writable. Again, with OVMF there might be a
   writable varstore, but this can be raw rather than qcow2 format, and
   thus unable to be snapshotted. While users might wish to snapshot
   their varstore, in some/many/most cases it is entirely unnecessary.
   Users are nonetheless blocked from snapshotting their VM due to this
   varstore.

These problems are solved by adding two parameters to the commands. The first is a block device node name that identifies the image to store the vmstate in, and the second is a list of node names to include in the snapshot. If the list of nodes isn't given, it falls back to the historical behaviour of using all disks matching some undocumented criteria.

In the block code I've only dealt with node names for block devices, as IIUC this is all that libvirt should need in the -blockdev world it now lives in.
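To make the shape of the new interface concrete, here is a sketch of how a client might invoke the proposed snapshot-save command. The argument names ("job-id", "tag", "vmstate", "devices") are my reading of this series and may differ from the final QAPI schema, and the node names are hypothetical:

```json
{ "execute": "snapshot-save",
  "arguments": {
      "job-id": "snapsave0",
      "tag": "my-snap",
      "vmstate": "disk0",
      "devices": ["disk0", "disk1"]
  }
}
```

Since the command is built on the jobs framework, the client would then watch for JOB_STATUS_CHANGE events (or poll query-jobs) and dismiss the job once it reaches a terminal state, rather than relying on a synchronous reply as with the HMP savevm.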
IOW, I've made no attempt to cope with people wanting to use these QMP commands in combination with -drive args.

I've done some minimal work in libvirt to start to make use of the new commands to validate their functionality, but this isn't finished yet. My ultimate goal is to make the GNOME Boxes maintainer happy again by having internal snapshots work with OVMF:

  https://gitlab.gnome.org/GNOME/gnome-boxes/-/commit/c486da262f6566326fbcb5ef45c5f64048f16a6e

HELP NEEDED: this series starts to implement the approach that Kevin suggested wrt use of generic jobs. When I try to actually run the code, though, it crashes:

ERROR:/home/berrange/src/virt/qemu/softmmu/cpus.c:1788:qemu_mutex_unlock_iothread: assertion failed: (qemu_mutex_iothread_locked())
Bail out! ERROR:/home/berrange/src/virt/qemu/softmmu/cpus.c:1788:qemu_mutex_unlock_iothread: assertion failed: (qemu_mutex_iothread_locked())

Obviously I've missed something related to locking, but I've no idea what, so I'm sending this v2 simply as a way to solicit suggestions for what I've messed up.
You can reproduce with the I/O tests using "check -qcow2 310", which gave this stack:

Thread 5 (Thread 0x7fffe6e4c700 (LWP 3399011)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x5555566a9fd8) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x555556227160 <qemu_global_mutex>, cond=0x5555566a9fb0) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x5555566a9fb0, mutex=mutex@entry=0x555556227160 <qemu_global_mutex>) at pthread_cond_wait.c:638
#3  0x0000555555ceb6cb in qemu_cond_wait_impl (cond=0x5555566a9fb0, mutex=0x555556227160 <qemu_global_mutex>, file=0x555555d44198 "/home/berrange/src/virt/qemu/softmmu/cpus.c", line=1145) at /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:174
#4  0x0000555555931974 in qemu_wait_io_event (cpu=cpu@entry=0x555556685050) at /home/berrange/src/virt/qemu/softmmu/cpus.c:1145
#5  0x0000555555933a89 in qemu_dummy_cpu_thread_fn (arg=arg@entry=0x555556685050) at /home/berrange/src/virt/qemu/softmmu/cpus.c:1241
#6  0x0000555555ceb049 in qemu_thread_start (args=0x7fffe6e476f0) at /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:521
#7  0x00007ffff4fdc432 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8  0x00007ffff4f0a9d3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7fffe764d700 (LWP 3399010)):
#0  0x00007ffff4effb6f in __GI___poll (fds=0x7fffdc006ec0, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007ffff7c1aace in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
#2  0x00007ffff7c1ae53 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#3  0x00005555559a9d81 in iothread_run (opaque=opaque@entry=0x55555632f200) at /home/berrange/src/virt/qemu/iothread.c:82
#4  0x0000555555ceb049 in qemu_thread_start (args=0x7fffe76486f0) at /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:521
#5  0x00007ffff4fdc432 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff4f0a9d3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7fffe7e4e700 (LWP 3399009)):
#0  0x00007ffff4fe5c58 in futex_abstimed_wait_cancelable (private=0, abstime=0x7fffe7e49650, clockid=0, expected=0, futex_word=0x5555562bf888) at ../sysdeps/nptl/futex-internal.h:320
#1  do_futex_wait (sem=sem@entry=0x5555562bf888, abstime=abstime@entry=0x7fffe7e49650, clockid=0) at sem_waitcommon.c:112
#2  0x00007ffff4fe5d83 in __new_sem_wait_slow (sem=sem@entry=0x5555562bf888, abstime=abstime@entry=0x7fffe7e49650, clockid=0) at sem_waitcommon.c:184
#3  0x00007ffff4fe5e12 in sem_timedwait (sem=sem@entry=0x5555562bf888, abstime=abstime@entry=0x7fffe7e49650) at sem_timedwait.c:40
#4  0x0000555555cebbdf in qemu_sem_timedwait (sem=sem@entry=0x5555562bf888, ms=ms@entry=10000) at /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:307
#5  0x0000555555d03fa4 in worker_thread (opaque=opaque@entry=0x5555562bf810) at /home/berrange/src/virt/qemu/util/thread-pool.c:91
#6  0x0000555555ceb049 in qemu_thread_start (args=0x7fffe7e496f0) at /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:521
#7  0x00007ffff4fdc432 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8  0x00007ffff4f0a9d3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7fffe8750700 (LWP 3399008)):
#0  0x00007ffff4ed1871 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=0x7fffe874b670, rem=0x7fffe874b680) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x00007ffff4ed71c7 in __GI___nanosleep (requested_time=<optimized out>, remaining=<optimized out>) at nanosleep.c:27
#2  0x00007ffff7c462f7 in g_usleep () at /lib64/libglib-2.0.so.0
#3  0x0000555555cf3fd0 in call_rcu_thread (opaque=opaque@entry=0x0) at /home/berrange/src/virt/qemu/util/rcu.c:250
#4  0x0000555555ceb049 in qemu_thread_start (args=0x7fffe874b6f0) at /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:521
#5  0x00007ffff4fdc432 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff4f0a9d3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7fffe88abec0 (LWP 3398996)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff4e2e895 in __GI_abort () at abort.c:79
#2  0x00007ffff7be5b8c in g_assertion_message_expr.cold () at /lib64/libglib-2.0.so.0
#3  0x00007ffff7c43a1f in g_assertion_message_expr () at /lib64/libglib-2.0.so.0
#4  0x0000555555932da0 in qemu_mutex_unlock_iothread () at /home/berrange/src/virt/qemu/softmmu/cpus.c:1788
#5  qemu_mutex_unlock_iothread () at /home/berrange/src/virt/qemu/softmmu/cpus.c:1786
#6  0x0000555555cfeceb in os_host_main_loop_wait (timeout=26359275747000) at /home/berrange/src/virt/qemu/util/main-loop.c:232
#7  main_loop_wait (nonblocking=nonblocking@entry=0) at /home/berrange/src/virt/qemu/util/main-loop.c:516
#8  0x0000555555941f27 in qemu_main_loop () at /home/berrange/src/virt/qemu/softmmu/vl.c:1676
#9  0x000055555581a18e in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/berrange/src/virt/qemu/softmmu/main.c:49
(gdb)

Changed in v2:

 - Use new command names "snapshot-{load,save,delete}" to make it clear
   that these are different from "savevm|loadvm|delvm", as they use the
   Job framework

 - Use an include list for block devs, not an exclude list

Daniel P. Berrangé (6):
  migration: improve error reporting of block driver state name
  block: push error reporting into bdrv_all_*_snapshot functions
  migration: stop returning errno from load_snapshot()
  block: add ability to specify list of blockdevs during snapshot
  block: allow specifying name of block device for vmstate storage
  migration: introduce snapshot-{save,load,delete} QMP commands

 block/monitor/block-hmp-cmds.c |   7 +-
 block/snapshot.c               | 167 +++++++++++++++++---------
 include/block/snapshot.h       |  19 +--
 include/migration/snapshot.h   |  10 +-
 migration/savevm.c             | 210 +++++++++++++++++++++++++++------
 monitor/hmp-cmds.c             |  11 +-
 qapi/job.json                  |   9 +-
 qapi/migration.json            | 112 ++++++++++++++++++
 replay/replay-snapshot.c       |   4 +-
 softmmu/vl.c                   |   2 +-
 tests/qemu-iotests/267.out     |  14 +--
 tests/qemu-iotests/310         | 125 ++++++++++++++++++++
 tests/qemu-iotests/310.out     |   0
 tests/qemu-iotests/group       |   1 +
 14 files changed, 562 insertions(+), 129 deletions(-)
 create mode 100755 tests/qemu-iotests/310
 create mode 100644 tests/qemu-iotests/310.out

-- 
2.26.2