Hi David,

I have tried your v6 postcopy patches and found that they don't work. When I tried to start postcopy during a live migration, some errors were printed. Here is what I did:
On the destination side, I started qemu like this:

/root/vt-sync/post_copy_v6_qemu.git/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -smp 2 -m 1024 -net none /mnt/jinshi_ia32e_rhel6u5.qcow2 -monitor stdio -incoming tcp:0:4444

On the source side, I started qemu like this:

/root/vt-sync/post_copy_v6_qemu.git/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -smp 2 -m 1024 -net none /mnt/jinshi_ia32e_rhel6u5.qcow2 -monitor stdio

and then

(qemu) migrate_set_capability x-postcopy-ram on

When I started the postcopy with

(qemu) migrate -d tcp:localhost:4444

I got this error message on the source side:

(qemu) qemu-system-x86_64: socket_writev_buffer: Got err=104 for (131552/-1)
qemu-system-x86_64: RP: Received invalid message 0x0000 length 0x0000

and the following error on the destination side:

(qemu) qemu-system-x86_64: postcopy_ram_supported_by_host: No OS support
qemu-system-x86_64: load of migration failed: Operation not permitted

dmesg printed:

[ 233.456545] kvm: zapping shadow pages for mmio generation wraparound
[ 239.785916] kvm [11926]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xabcd

The v5 patches have no such errors. Do you have any suggestions?

Liang

> -----Original Message-----
> From: qemu-devel-bounces+liang.z.li=intel....@nongnu.org [mailto:qemu-devel-bounces+liang.z.li=intel....@nongnu.org] On Behalf Of Dr. David Alan Gilbert (git)
> Sent: Wednesday, April 15, 2015 1:03 AM
> To: qemu-devel@nongnu.org
> Cc: aarca...@redhat.com; yamah...@private.email.ne.jp; quint...@redhat.com; amit.s...@redhat.com; pbonz...@redhat.com; da...@gibson.dropbear.id.au; yayan...@cn.fujitsu.com
> Subject: [Qemu-devel] [PATCH v6 00/47] Postcopy implementation
>
> From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
>
> This is the 6th cut of my version of postcopy; it is designed for use with the
> Linux kernel additions posted by Andrea Arcangeli here:
>
>   git clone --reference linux -b userfault18 git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
>
> (Note this is a different API from the last version)
>
> This qemu series can be found at:
>
>   https://github.com/orbitfp7/qemu.git
>
> on the wp3-postcopy-v6 tag.
>
> It addresses some but not yet all of the previous review comments; however
> there are a couple of large simplifications, so it seems worth posting to meet
> the new kernel API and to stop people reviewing deadcode.
>
> Note: That the userfaultfd.h header is no longer included in this tree:
>   - if you're building with the appropriate kernel headers it should find it
>   - if you're building on a host that doesn't have the kernel headers
>     installed in the right place then:
>       configure with: --extra-cflags="-D__NR_userfaultfd=323"
>       cp include/uapi/linux/userfaultfd.h into somewhere in the include
>       path, e.g. /usr/local/include/linux
>
> v6
>   Removed the PMI bitmaps
>     - Andrea updated the kernel API so that userspace doesn't
>       need to do wakeups, and thus QEMU doesn't need to keep
>       track of which pages it's received; there is a price - which
>       is we end up sending more dupes to the source, but it simplifies
>       stuff a lot and makes the normal paths a lot quicker.
>       (10s of line change in kernel, 10%-ish simplification in this code!)
>   Changed discard message format to a simpler start/end address scheme
>     and rework discard and chunking code to work in long's to match bitmap
>   'qemu_get_buffer_less_copy' for postcopy pages
>     - avoids a userspace copy since the kernel now does it
>     - the new qemufile interface might also be useful for other places that
>       don't need a copy (maybe xbzrle?)
>   Changed the blockingness of the incoming fd
>     it was incorrectly blocking during the precopy phase after a postcopy was
>     enabled, causing the HMP to be unavailable. It's now blocking only once
>     the postcopy thread starts up, since it's not a coroutine it can't deal
>     with the yields in qemu_file.
>   An error on the return-path now marks the migration as failed
>
>   Fixups from Dave Gibson's comments
>     Removed can_postcopy, renamed save_complete to save_complete_precopy
>     added save_complete_postcopy
>     Simplified loadvm loop exits
>     discard message format changes above
>     and many more smaller changes.
>
>   small fixups for RCU
>
>
> This work has been partially funded by the EU Orbit project:
>   see http://www.orbitproject.eu/about/
>
> TODO:
>   The major work is to rework the page send/receive loops so that supporting
>   larger host pages doesn't make it quite as messy.
>
> Dr. David Alan Gilbert (47):
>   Start documenting how postcopy works.
>   Split header writing out of qemu_savevm_state_begin
>   qemu_ram_foreach_block: pass up error value, and down the ramblock name
>   Add qemu_get_counted_string to read a string prefixed by a count byte
>   Create MigrationIncomingState
>   Provide runtime Target page information
>   Move copy out of qemu_peek_buffer
>   Add qemu_get_buffer_less_copy to avoid copies some of the time
>   Add wrapper for setting blocking status on a QEMUFile
>   Rename save_live_complete to save_live_complete_precopy
>   Return path: Open a return path on QEMUFile for sockets
>   Return path: socket_writev_buffer: Block even on non-blocking fd's
>   Migration commands
>   Return path: Control commands
>   Return path: Send responses from destination to source
>   Return path: Source handling of return path
>   ram_debug_dump_bitmap: Dump a migration bitmap as text
>   Move loadvm_handlers into MigrationIncomingState
>   Rework loadvm path for subloops
>   Add migration-capability boolean for postcopy-ram.
>   Add wrappers and handlers for sending/receiving the postcopy-ram migration messages.
>   MIG_CMD_PACKAGED: Send a packaged chunk of migration stream
>   migrate_init: Call from savevm
>   Modify save_live_pending for postcopy
>   postcopy: OS support test
>   migrate_start_postcopy: Command to trigger transition to postcopy
>   MIGRATION_STATUS_POSTCOPY_ACTIVE: Add new migration state
>   Add qemu_savevm_state_complete_postcopy
>   Postcopy: Maintain sentmap and calculate discard
>   postcopy: Incoming initialisation
>   postcopy: ram_enable_notify to switch on userfault
>   Postcopy: Postcopy startup in migration thread
>   Postcopy end in migration_thread
>   Page request: Add MIG_RP_MSG_REQ_PAGES reverse command
>   Page request: Process incoming page request
>   Page request: Consume pages off the post-copy queue
>   postcopy_ram.c: place_page and helpers
>   Postcopy: Use helpers to map pages during migration
>   qemu_ram_block_from_host
>   Don't sync dirty bitmaps in postcopy
>   Host page!=target page: Cleanup bitmaps
>   Postcopy; Handle userfault requests
>   Start up a postcopy/listener thread ready for incoming page data
>   postcopy: Wire up loadvm_postcopy_handle_ commands
>   End of migration for postcopy
>   Disable mlock around incoming postcopy
>   Inhibit ballooning during postcopy
>
>  arch_init.c                      | 868 ++++++++++++++++++++++++++++++++++++---
>  balloon.c                        |  11 +
>  docs/migration.txt               | 167 ++++++++
>  exec.c                           |  74 +++-
>  hmp-commands.hx                  |  15 +
>  hmp.c                            |   7 +
>  hmp.h                            |   1 +
>  hw/ppc/spapr.c                   |   2 +-
>  hw/virtio/virtio-balloon.c       |   4 +-
>  include/exec/cpu-all.h           |   2 -
>  include/exec/cpu-common.h        |   7 +-
>  include/migration/migration.h    | 126 +++++-
>  include/migration/postcopy-ram.h |  88 ++++
>  include/migration/qemu-file.h    |  15 +-
>  include/migration/vmstate.h      |  10 +-
>  include/qemu/typedefs.h          |   5 +
>  include/sysemu/balloon.h         |   2 +
>  include/sysemu/sysemu.h          |  45 +-
>  migration/Makefile.objs          |   2 +-
>  migration/block.c                |   9 +-
>  migration/migration.c            | 743 +++++++++++++++++++++++++++++++--
>  migration/postcopy-ram.c         | 715 ++++++++++++++++++++++++++++++++
>  migration/qemu-file-unix.c       | 106 ++++-
>  migration/qemu-file.c            | 100 ++++-
>  migration/rdma.c                 |   4 +-
>  migration/vmstate.c              |   5 +-
>  qapi-schema.json                 |  19 +-
>  qmp-commands.hx                  |  19 +
>  savevm.c                         | 809 ++++++++++++++++++++++++++++++++----
>  trace-events                     |  77 +++-
>  30 files changed, 3832 insertions(+), 225 deletions(-)
>  create mode 100644 include/migration/postcopy-ram.h
>  create mode 100644 migration/postcopy-ram.c
>
> --
> 2.1.0
>
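
A note for anyone debugging the destination-side "postcopy_ram_supported_by_host: No OS support" message reported at the top of this mail: the cover letter says the series depends on the userfault18 kernel branch, so one thing worth ruling out is whether the kernel running on the destination host provides the userfaultfd syscall at all. The following is a minimal Python sketch of such a probe, not code from the patch series. It assumes an x86_64 Linux host and reuses the syscall number 323 from the --extra-cflags hint in the cover letter; the names classify and probe are hypothetical helpers made up for this illustration.

```python
import ctypes
import errno
import os

# Assumption: 323 is the x86_64 syscall number quoted in the cover letter
# (-D__NR_userfaultfd=323). Other architectures use different numbers.
NR_USERFAULTFD = 323


def classify(ret, err):
    """Map a raw syscall return value and errno to a short verdict."""
    if ret >= 0:
        return "userfaultfd available"
    if err == errno.ENOSYS:
        return "no kernel support (ENOSYS)"
    if err == errno.EPERM:
        return "syscall present, but not permitted (EPERM)"
    return "failed: " + os.strerror(err)


def probe():
    """Try to open a userfaultfd file descriptor and report the outcome."""
    libc = ctypes.CDLL(None, use_errno=True)
    ret = libc.syscall(NR_USERFAULTFD, os.O_CLOEXEC)
    err = ctypes.get_errno()
    if ret >= 0:
        os.close(ret)  # only probing, so release the descriptor again
    return classify(ret, err)


if __name__ == "__main__":
    print(probe())
```

Running the script on the destination host prints a single verdict line. An ENOSYS result would suggest that the host is not booted into a kernel built from the userfault branch, which would be one plausible explanation for the error above.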