On 06/04/2012 04:38 PM, Isaku Yamahata wrote:
On Mon, Jun 04, 2012 at 08:37:04PM +0800, Anthony Liguori wrote:
On 06/04/2012 05:57 PM, Isaku Yamahata wrote:
After a long time, we have v2. This is the qemu part.
The linux kernel part is sent separately.
Changes v1 -> v2:
- split up patches for review
- buffered file refactored
- many bug fixes
Especially, PV drivers can now work with postcopy
- optimization/heuristic
Patches
1 - 30: refactoring existing code and preparation
31 - 37: implement postcopy itself (essential part)
38 - 41: some optimization/heuristic for postcopy
Intro
=====
This patch series implements postcopy live migration.[1]
As discussed at KVM Forum 2011, a dedicated character device is used for
distributed shared memory between the migration source and destination.
Now we can discuss/benchmark/compare it with precopy. I believe there is
still much room for improvement.
[1] http://wiki.qemu.org/Features/PostCopyLiveMigration
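Very roughly, the destination side works like the sketch below: guest RAM is
backed by the character device, the device reports which page the guest
faulted on, and the page is fetched from the source and handed back to the
kernel so the faulting vCPU can continue. This is only an illustration; the
actual umem protocol is whatever the separately posted kernel patches define,
and /dev/umem, the read/pwrite usage and the sizes are placeholders.

  /* Illustrative sketch only; not the real umem API. */
  #include <fcntl.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/mman.h>
  #include <unistd.h>

  #define PAGE_SIZE 4096

  int main(void)
  {
      /* Back guest RAM with the character device instead of anonymous memory. */
      int umem_fd = open("/dev/umem", O_RDWR);        /* placeholder path */
      size_t ram_size = 512UL * 1024 * 1024;
      void *guest_ram = mmap(NULL, ram_size, PROT_READ | PROT_WRITE,
                             MAP_SHARED, umem_fd, 0);
      if (umem_fd < 0 || guest_ram == MAP_FAILED) {
          perror("umem setup");
          return 1;
      }

      for (;;) {
          uint64_t pgoff;
          /* 1. learn which page the guest faulted on */
          if (read(umem_fd, &pgoff, sizeof(pgoff)) != sizeof(pgoff)) {
              break;
          }
          /* 2. request that page from the migration source (socket I/O elided) */
          char page[PAGE_SIZE];
          memset(page, 0, sizeof(page));   /* stand-in for the received data */
          /* 3. install the page; the kernel then wakes the faulting vCPU */
          pwrite(umem_fd, page, sizeof(page), (off_t)pgoff * PAGE_SIZE);
      }
      close(umem_fd);
      return 0;
  }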
Usage
=====
You need to load the umem character device on the host before starting migration.
Postcopy can be used with both the tcg and kvm accelerators. The implementation
depends only on the linux umem character device, but the driver-dependent code
is split into its own file.
I tested only the host page size == guest page size case, but the implementation
also allows the host page size != guest page size case.
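For illustration, the host != guest page size case mostly means that an
on-demand request has to be rounded to the whole host page that contains the
faulting guest page; the helper below is just a sketch (the names are mine,
and it assumes the host page size is a multiple of the guest page size):

  #include <stdint.h>

  /* Return the first guest page index covered by the host page that contains
   * 'guest_page', and report how many guest pages that host page spans. */
  static uint64_t host_page_span(uint64_t guest_page, uint64_t guest_page_size,
                                 uint64_t host_page_size,
                                 uint64_t *nr_guest_pages)
  {
      uint64_t per_host = host_page_size / guest_page_size;
      *nr_guest_pages = per_host;
      return guest_page / per_host * per_host;
  }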
The following options are added with this patch series.
- incoming part
command line options
-postcopy [-postcopy-flags <flags>]
where <flags> changes the behavior for benchmarking/debugging
Currently the following flags are available
0: default
1: enable touching page request
example:
qemu -postcopy -incoming tcp:0:4444 -monitor stdio -machine accel=kvm
- outgoing part
options for the migrate command
migrate [-p [-n] [-m]] URI [<prefault forward> [<prefault backward>]]
-p: indicate postcopy migration
-n: disable background transfer of pages; this is for benchmarking/debugging
-m: move background transfer of postcopy mode
<prefault forward>: the number of pages ahead of the faulting page that are
sent along with the on-demand page
<prefault backward>: the number of pages behind the faulting page that are
sent along with the on-demand page (see the sketch after the examples below)
example:
migrate -p -n tcp:<dest ip address>:4444
migrate -p -n -m tcp:<dest ip address>:4444 32 0
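The prefault options are easiest to think of as a window around the faulting
page. A minimal sketch of what the source would send for one on-demand request
(the clamping and the names are illustrative, not the patch code):

  #include <stdint.h>

  /* For a fault on page 'fault', send pages [*start, *start + *count):
   * the faulting page, up to 'backward' pages before it and 'forward'
   * pages after it, clamped to the number of RAM pages. */
  static void prefault_range(uint64_t fault, uint64_t forward,
                             uint64_t backward, uint64_t nr_pages,
                             uint64_t *start, uint64_t *count)
  {
      uint64_t begin = fault > backward ? fault - backward : 0;
      uint64_t end = fault + forward + 1;            /* exclusive */
      if (end > nr_pages) {
          end = nr_pages;
      }
      *start = begin;
      *count = end - begin;
  }

So the second example above would send the faulting page plus up to 32 pages
after it and none before it.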
TODO
====
- benchmark/evaluation. Especially how async page faults affect the result.
I don't mean to beat on a dead horse, but I really don't understand the
point of postcopy migration other than the fact that it's possible. It's
a lot of code and a new ABI in an area where we already have too much
difficulty maintaining our ABI.
Without a compelling real world case with supporting benchmarks for why
we need postcopy and cannot improve precopy, I'm against merging this.
Some new results are available at
https://events.linuxfoundation.org/images/stories/pdf/lcjp2012_yamahata_postcopy.pdf
It does show a dramatic improvement over precopy. As stated in the docs,
async page faults may help many kinds of workloads and turn postcopy into
a viable alternative to today's code.
In addition, this sort of 'demand paging' approach on the destination can
help us with other use cases. For example, we can use this implementation
to live-snapshot VMs with RAM (post-live-migrate into a file while leaving
the source active) and to live-resume VMs from a file without reading the
entire RAM from disk.
I didn't go over the API for the live migration part, but IIUC the only
change needed to the live migration 'protocol' is w.r.t. guest pages, and
we need to do that regardless when we merge the page-ordering optimization.
Cheers,
Dor
Precopy assumes that the network bandwidth is wide enough and that the
number of dirty pages converges. But that doesn't always hold true.
- planned migration
predictability of total migration time is important
- dynamic consolidation
In cloud use cases, the resources of a physical machine are usually
overcommitted.
When a physical machine becomes overloaded, some VMs are moved to another
physical host to balance the load.
Precopy can't move VMs promptly, and compression makes things worse.
- inter data center migration
With L2-over-L3 technology, it has become common to create a virtual
data center that actually spans multiple physical data centers.
It is useful to migrate VMs across physical data centers for disaster recovery.
The network bandwidth between DCs is narrower than in the LAN case, so the
precopy assumption wouldn't hold.
- In cases where network bandwidth is limited by QoS, the precopy
assumption doesn't hold.
thanks,