Ping again ...
On 2015/9/2 16:22, zhanghailiang wrote:
This is the 9th version of COLO.
Please note that this version is very different from the previous versions.
Since we have decided to implement the proxy in qemu, based on qemu's slirp,
we have dropped all of the original colo proxy related parts.
It will be a long time before the proxy is ready for merging, so here we extract
the basic periodic checkpoint part, which does not depend on the proxy, into
this series. The 'periodic' mode is also something we want to support in COLO.
It is based on Yang Hongyang's netfilter series, and it is very similar to
MicroCheckpointing and Remus.
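Roughly, the periodic mode behaves like the toy Python model below: output
produced by the PVM during an epoch is held back and only released once the
checkpoint for that epoch has been taken. This is a sketch under assumptions,
not code from this series. The class and method names (PeriodicCheckpointer,
run_guest, checkpoint) are made up for illustration, and the integers merely
stand in for VM state; the real buffering is done by the netfilter buffer and
the real state transfer by the migration code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PeriodicCheckpointer:
    """Toy model: buffer PVM output, release it only after a checkpoint."""
    primary_state: int = 0      # stands in for PVM RAM + device state
    secondary_state: int = 0    # stands in for SVM state
    buffered: List[str] = field(default_factory=list)   # held output packets
    released: List[str] = field(default_factory=list)   # packets clients saw

    def run_guest(self, delta: int, packet: str) -> None:
        # The PVM keeps running between checkpoints; its output is held back.
        self.primary_state += delta
        self.buffered.append(packet)

    def checkpoint(self) -> None:
        # 1. Sync the PVM state to the SVM.
        self.secondary_state = self.primary_state
        # 2. Only now release the output produced during this epoch.
        self.released.extend(self.buffered)
        self.buffered.clear()


cp = PeriodicCheckpointer()
cp.run_guest(1, "pkt-a")
cp.run_guest(2, "pkt-b")
assert cp.released == []                      # nothing leaves before the checkpoint
cp.checkpoint()
assert cp.secondary_state == cp.primary_state == 3
assert cp.released == ["pkt-a", "pkt-b"]
```

The point of the ordering in checkpoint() is that a client never sees output
from a PVM state that the SVM could not reproduce after a failover.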
You can find the discussion about why and how to implement the colo proxy in
qemu at the following link:
http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04069.html
As usual, this series contains only the COLO framework part; you can get the
complete code from github:
https://github.com/coloft/qemu/commits/colo-v2.0-periodic-mode
Compared with previous versions, this version is easier to test.
Test procedure:
1. Start up qemu
Primary side:
# x86_64-softmmu/qemu-system-x86_64 -enable-kvm -netdev tap,id=bn0 -netfilter
buffer,id=f0,netdev=bn0,chain=in -device virtio-net-pci,id=net-pci0,netdev=bn0
-boot c -drive
if=virtio,id=disk1,driver=quorum,read-pattern=fifo,cache=none,aio=native,children.0.file.filename=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,children.0.driver=raw
-vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor
stdio -S
Secondary side:
# x86_64-softmmu/qemu-system-x86_64 -enable-kvm -netdev tap,id=bn0 -device
virtio-net-pci,id=net-pci0,netdev=bn0 -drive
if=none,driver=raw,file=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,id=colo1,cache=none,aio=native
-drive
if=virtio,driver=replication,mode=secondary,throttling.bps-total=70000000,file.file.filename=/mnt/ramfs/active_disk.img,file.driver=qcow2,file.backing.file.filename=/mnt/ramfs/hidden_disk.img,file.backing.driver=qcow2,file.backing.backing.backing_reference=colo1,file.backing.allow-write-backing-file=on
-vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio
-incoming tcp:0:8888
2. On the Secondary VM's QEMU monitor, issue the commands:
(qemu) nbd_server_start 192.168.2.88:8889
(qemu) nbd_server_add -w colo1
3. On the Primary VM's QEMU monitor, issue the commands:
(qemu) child_add disk1
child.driver=replication,child.mode=primary,child.file.host=192.168.2.88,child.file.port=8889,child.file.export=colo1,child.file.driver=nbd,child.ignore-errors=on
(qemu) migrate_set_capability colo on
(qemu) migrate tcp:192.168.2.88:8888
4. After the above steps, whenever you make changes to the PVM, the SVM will be
synced.
You can change the checkpoint period by issuing the command
"migrate_set_parameter checkpoint-delay 2000".
5. Failover test
You can kill the PVM and run 'colo_lost_heartbeat' in the SVM's monitor at the
same time. The SVM will then fail over, and the client will not notice the
change.
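For scripted testing, the monitor commands of steps 3 and 4 can presumably also
be sent over QMP. The Python sketch below only builds the JSON messages and does
not talk to a socket. It assumes that the QMP commands migrate-set-capabilities,
migrate-set-parameters and migrate accept the same 'colo' capability and
'checkpoint-delay' parameter names as the HMP commands above; the helper name
colo_qmp_messages is made up for this example.

```python
import json
from typing import List


def colo_qmp_messages(secondary_uri: str, checkpoint_delay: int) -> List[str]:
    """Build QMP messages mirroring the HMP commands of steps 3 and 4.

    Assumption: the QMP argument names match the HMP ones used above.
    """
    commands = [
        # (qemu) migrate_set_capability colo on
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [{"capability": "colo", "state": True}]}},
        # (qemu) migrate_set_parameter checkpoint-delay <value>
        {"execute": "migrate-set-parameters",
         "arguments": {"checkpoint-delay": checkpoint_delay}},
        # (qemu) migrate tcp:<host>:<port>
        {"execute": "migrate", "arguments": {"uri": secondary_uri}},
    ]
    return [json.dumps(cmd) for cmd in commands]


for line in colo_qmp_messages("tcp:192.168.2.88:8888", 2000):
    print(line)
```

The sketch sets the checkpoint delay before starting the migration; as in
step 4, the same migrate-set-parameters message can also be sent later to
change the period.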
COLO is a totally new feature which is still at an early stage;
your comments and feedback are warmly welcome.
TODO:
1. checkpoint based on proxy in qemu
2. The capability of continuous FT
v9:
- Drop colo proxy related part (colo-nic.c file)
- Convert COLO protocol name definition to QAPI
- Squash failover related patches (patch 19/20/23)
- Fix colo exit event according to Eric's comments.
- Fix some typos from Eric's comments
- Fix bug "invalid runstate transition: 'colo' -> 'prelaunch'" reported
by Dave (patch 27)
- Use migrate_set_parameter instead of colo-set-checkpoint-period to set
checkpoint delay time (patch 25)
- Add new patches (patch 29/30) to separate the process of saving/loading
ram and device state during checkpoint, which reduces the size of the data
to be sent and also the qsb size used in checkpoint.
Wen Congyang (1):
COLO: Add block replication into colo process
zhanghailiang (31):
configure: Add parameter for configure to enable/disable COLO support
migration: Introduce capability 'colo' to migration
COLO: migrate colo related info to slave
migration: Add state records for migration incoming
migration: Integrate COLO checkpoint process into migration
migration: Integrate COLO checkpoint process into loadvm
migration: Rename the 'file' member of MigrationState and
MigrationIncomingState
COLO/migration: establish a new communication path from destination to
source
COLO: Implement colo checkpoint protocol
COLO: Add a new RunState RUN_STATE_COLO
QEMUSizedBuffer: Introduce two help functions for qsb
COLO: Save PVM state to secondary side when do checkpoint
COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
COLO: Load VMState into qsb before restore it
COLO: Flush PVM's cached RAM into SVM's memory
COLO: synchronize PVM's state to SVM periodically
COLO failover: Introduce a new command to trigger a failover
COLO failover: Introduce state to record failover process
COLO: Implement failover work for Primary VM
COLO: Implement failover work for Secondary VM
COLO: implement default failover treatment
qmp event: Add event notification for COLO error
COLO failover: Shutdown related socket fd when do failover
COLO failover: Don't do failover during loading VM's state
COLO: Control the checkpoint delay time by migrate-set-parameters
command
COLO: Implement shutdown checkpoint
COLO: Update the global runstate after going into colo state
savevm: Split load vm state function qemu_loadvm_state
COLO: Separate the process of saving/loading ram and device state
COLO: Split qemu_savevm_state_begin out of checkpoint process
COLO: Add net packets treatment into COLO
configure | 11 +
docs/qmp/qmp-events.txt | 17 +
hmp-commands.hx | 15 +
hmp.c | 16 +
hmp.h | 1 +
include/exec/cpu-all.h | 1 +
include/migration/colo.h | 44 +++
include/migration/failover.h | 33 ++
include/migration/migration.h | 16 +-
include/migration/qemu-file.h | 3 +-
include/sysemu/sysemu.h | 8 +
migration/Makefile.objs | 2 +
migration/colo-comm.c | 75 ++++
migration/colo-failover.c | 83 +++++
migration/colo.c | 782 ++++++++++++++++++++++++++++++++++++++++++
migration/exec.c | 4 +-
migration/fd.c | 4 +-
migration/migration.c | 184 +++++++---
migration/qemu-file-buf.c | 58 ++++
migration/ram.c | 185 +++++++++-
migration/savevm.c | 309 +++++++++++++----
migration/tcp.c | 4 +-
migration/unix.c | 4 +-
qapi-schema.json | 101 +++++-
qapi/event.json | 17 +
qmp-commands.hx | 20 ++
stubs/Makefile.objs | 1 +
stubs/migration-colo.c | 45 +++
trace-events | 8 +
vl.c | 37 +-
30 files changed, 1930 insertions(+), 158 deletions(-)
create mode 100644 include/migration/colo.h
create mode 100644 include/migration/failover.h
create mode 100644 migration/colo-comm.c
create mode 100644 migration/colo-failover.c
create mode 100644 migration/colo.c
create mode 100644 stubs/migration-colo.c