This series fixes Windows-dump (win-dmp) availability reporting and closes
a related hole found while doing so: several commands read guest RAM and
must not run while a migration destination is still receiving it.

dump-guest-memory, memsave and pmemsave all read guest RAM on the main
thread with the BQL held. On a migration destination that RAM is either
still being loaded (precopy) or pulled from the source on demand
(postcopy, where the guest already runs). During precopy the read returns
incomplete data. During postcopy it deadlocks: the read faults on a
not-yet-received page, and the postcopy incoming path cannot install that
page because it needs the BQL the reader holds. dump-guest-memory only
guarded the precopy phase and memsave/pmemsave had no guard at all, so a
new migration_guest_ram_loading() predicate captures both phases and all
three commands now use it.
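
For readers who want the shape of the guard without opening the patches,
here is a minimal standalone sketch of the idea. It is not the code from
the series: the IncomingPhase enum and both function names are
hypothetical stand-ins, and the real helper works on QEMU's own
incoming-migration state rather than on a value passed in by the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the destination's state; not QEMU's enum. */
typedef enum {
    INCOMING_NONE,            /* not a migration destination */
    INCOMING_PRECOPY_LOADING, /* RAM still streaming in, guest paused */
    INCOMING_POSTCOPY_ACTIVE, /* guest runs, pages pulled on demand */
    INCOMING_COMPLETED,       /* all guest RAM is local */
} IncomingPhase;

/* True while a guest-RAM read would be incomplete or could deadlock. */
bool sketch_guest_ram_loading(IncomingPhase phase)
{
    return phase == INCOMING_PRECOPY_LOADING ||
           phase == INCOMING_POSTCOPY_ACTIVE;
}

/*
 * Shared guard for every command that reads guest RAM.
 * Returns 0 if the read may proceed, -1 with *errmsg set if refused.
 * The check happens before any guest page is touched.
 */
int sketch_guest_ram_read_allowed(IncomingPhase phase, const char **errmsg)
{
    if (sketch_guest_ram_loading(phase)) {
        *errmsg = "guest RAM is still being received by migration";
        return -1;
    }
    *errmsg = NULL;
    return 0;
}
```

The point of a single predicate is that the three commands cannot drift
apart again: each one calls the same check first and fails early.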

query-dump-guest-memory-capability also advertised win-dmp for every x86
VM, and dump-guest-memory accepted it unconditionally; win-dmp only works
when the guest has published a Windows dump header through vmcoreinfo.
win_dump_available() now reads that note back from the vmcoreinfo device
and validates it, keeping its original signature, so the dump.c call sites
are unchanged.
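
As a rough illustration of what "validates it" can mean, the sketch below
checks that a note payload looks like a 64-bit Windows dump header. It is
an assumption-laden stand-in, not the code in dump/win_dump-x86.c: the
function name and the size constant are hypothetical, and it only tests
the length and the leading "PAGE"/"DU64" signature pair (the same
PAGEDU64 marker mentioned under "Tested" below). How the payload is
obtained from the vmcoreinfo device is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed size of a 64-bit Windows dump header; illustrative only. */
#define SKETCH_WIN_DUMP_HDR64_SIZE 8192

/*
 * Returns true if desc/len look like a 64-bit Windows dump header:
 * the expected length, "PAGE" in bytes 0..3 and "DU64" in bytes 4..7.
 */
bool sketch_looks_like_win_dump_header(const unsigned char *desc, size_t len)
{
    if (desc == NULL || len != SKETCH_WIN_DUMP_HDR64_SIZE) {
        return false;
    }
    return memcmp(desc, "PAGE", 4) == 0 &&
           memcmp(desc + 4, "DU64", 4) == 0;
}
```

A Linux guest's vmcoreinfo note fails this kind of check, which is why
win-dmp is no longer advertised for it.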

The series also adds the first qtest coverage for dump-guest-memory:
capability list, ELF and kdump output, protocol errors, and win-dmp
reporting in both directions (the positive case forges a vmcoreinfo note,
so no Windows guest is needed).

Tested:
 - Linux guest: win-dmp not advertised; an explicit win-dmp request is
   rejected without stopping the guest; elf/kdump dumps work.
 - Windows Server 2022 guest: win-dmp advertised once vmcoreinfo is
   populated; dump-guest-memory format=win-dmp produces a valid (PAGEDU64)
   crashdump.
 - Live postcopy migration: on the destination, dump-guest-memory, memsave
   and pmemsave are all refused immediately, without faulting a page.

This supersedes the earlier "dump: enhance win_dump_available to report
properly" series, which obtained the note by reusing the dump setup path
(splitting dump_init() and stopping the vCPUs). Reading vmcoreinfo
directly is a much smaller change.

Changes since v4:
 - Refuse guest-RAM reads (dump-guest-memory, memsave, pmemsave) while a
   migration destination is still receiving RAM, via a shared
   migration_guest_ram_loading() helper.
 - Decide win-dmp availability by reading vmcoreinfo directly instead of
   splitting dump_init()/dump_preinit() and stopping the vCPUs.
 - Add dump-guest-memory qtests.

Denis V. Lunev (7):
  migration: add migration_guest_ram_loading() helper
  dump: refuse dump-guest-memory while guest RAM is being migrated
  system/cpus: refuse memsave/pmemsave while guest RAM is being migrated
  dump: make win_dump_available() check vmcoreinfo for a Windows dump
    header
  tests/qtest: add dump-guest-memory test
  tests/qtest/dump: reject win-dmp without vmcoreinfo
  tests/qtest/dump: cover win-dmp availability via vmcoreinfo

 MAINTAINERS              |   1 +
 dump/dump.c              |   5 +-
 dump/win_dump-x86.c      |  42 +++++
 include/migration/misc.h |   3 +
 migration/migration.c    |   6 +
 system/cpus.c            |  11 ++
 tests/qtest/dump-test.c  | 331 +++++++++++++++++++++++++++++++++++++++
 tests/qtest/meson.build  |   1 +
 8 files changed, 398 insertions(+), 2 deletions(-)
 create mode 100644 tests/qtest/dump-test.c


base-commit: c7cf7c810153d6f5f31aa2d5c0dee9087f6b4dff
-- 
2.53.0

