[Qemu-devel] [PATCH COLO-Frame v13 15/39] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily

2015-12-28 Thread zhanghailiang
We should not load PVM's state directly into SVM, because errors may occur while SVM is receiving data, which would break SVM. We need to ensure that all data has been received before loading the state into SVM. We use extra memory to cache this data (PVM's RAM). The ram cache in secondary side is

[Qemu-devel] [PATCH COLO-Frame v13 34/39] net/filter-buffer: Add default filter-buffer for each netdev

2015-12-28 Thread zhanghailiang
We add a default filter-buffer to each netdev (except vhost-net), which will be used by COLO or Micro-checkpoint to buffer the VM's packets. The name of the default filter-buffer is 'nop'. By default, the default filter-buffer does not buffer any packets, so it has no side effect on the netdev.

[Qemu-devel] [PATCH COLO-Frame v13 06/39] migration: Integrate COLO checkpoint process into migration

2015-12-28 Thread zhanghailiang
Add a migration state, MIGRATION_STATUS_COLO, and enter it after the first live migration has finished successfully. We reuse the migration thread, so if colo is enabled by the user, the migration thread will go into the colo process. Signed-off-by: zhanghailiang

[Qemu-devel] [PATCH COLO-Frame v13 23/39] COLO: Implement failover work for Primary VM

2015-12-28 Thread zhanghailiang
For PVM, if there is a failover request from users, the colo thread will exit the loop while the failover BH does the cleanup work and resumes the VM. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert

[Qemu-devel] [PATCH COLO-Frame v13 37/39] filter-buffer: Introduce a helper function to release packets

2015-12-28 Thread zhanghailiang
We need to release all the packets from the VM in COLO or Micro-checkpoint. Here we add a new helper function to release the packets that are buffered by the default buffer-filter. Signed-off-by: zhanghailiang Cc: Jason Wang Cc: Yang Hongyang

[Qemu-devel] [PATCH COLO-Frame v13 29/39] COLO: Update the global runstate after going into colo state

2015-12-28 Thread zhanghailiang
If we start qemu with -S, the runstate will change from 'prelaunch' to 'running' after going into colo state. So it is necessary to update the global runstate after going into colo state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian

[Qemu-devel] [PATCH COLO-Frame v13 38/39] colo: Use default buffer-filter to buffer and release packets

2015-12-28 Thread zhanghailiang
Enable default filter to buffer packets and release the packets after a checkpoint. Signed-off-by: zhanghailiang Cc: Jason Wang Cc: Yang Hongyang --- v12: - Add a helper function to check if all netdev supports

[Qemu-devel] [PATCH COLO-Frame v13 36/39] filter-buffer: Introduce a helper function to enable/disable default filter

2015-12-28 Thread zhanghailiang
The default buffer filter doesn't buffer packets by default, but we need to buffer packets for COLO or Micro-checkpoint. Here we add a helper function to enable/disable the filter's buffer capability. Signed-off-by: zhanghailiang Cc: Jason Wang

Re: [Qemu-devel] [PATCH] trace: Fix format specifiers for existing arguments

2015-12-28 Thread vrakush
valentin writes: Please disregard this patch. The patch from Mark Cave-Ayland, dated December 20, 2015 already fixed this issue. > From: Valentin Rakush > > This patch fixes compilation errors when --enable-trace-backend=stderr option > is

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2015-12-28 Thread Jason Wang
On 12/29/2015 02:31 PM, Zhang Chen wrote: > Hi~ > Just a small ping... > No news for a week. > Colo proxy is a part of COLO project, we need review and comments. > > > Thanks > zhangchen Hi, will find sometime to review this this week. Thanks > > > On 12/22/2015 06:42 PM, Zhang Chen wrote: >>

[Qemu-devel] [PATCH COLO-Frame v13 05/39] migration: Add state records for migration incoming

2015-12-28 Thread zhanghailiang
For the migration destination, we also need to know its state, which we will use in COLO. Here we add a new member 'state' to MigrationIncomingState, and also use migrate_set_state() to modify its value. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert

[Qemu-devel] [PATCH COLO-Frame v13 10/39] COLO: Implement colo checkpoint protocol

2015-12-28 Thread zhanghailiang
We need a user-defined communication protocol to control the checkpoint process. A new checkpoint request is started by the Primary VM, and the interaction proceeds as below: Checkpoint synchronizing points: Primary Secondary

[Qemu-devel] [PATCH COLO-Frame v13 12/39] QEMUSizedBuffer: Introduce two help functions for qsb

2015-12-28 Thread zhanghailiang
Introduce two new QEMUSizedBuffer APIs which will be used by COLO to buffer VM state. One is qsb_put_buffer(), which puts the content of a given QEMUSizedBuffer into a QEMUFile; this is used to send buffered VM state to the secondary. The other is qsb_fill_buffer(), which reads 'size' bytes of data from the file

[Qemu-devel] [PATCH COLO-Frame v13 16/39] ram/COLO: Record the dirty pages that SVM received

2015-12-28 Thread zhanghailiang
We record the addresses of the dirty pages that are received, which helps when flushing the cached pages into SVM. We record them by re-using the migration dirty bitmap. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- v12: - Add
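
A minimal sketch of the idea described in patches 15/39 and 16/39, not the QEMU implementation. The page size, the array names (`svm_ram`, `ram_cache`, `dirty`) and the helpers `cache_load_page()` and `cache_flush()` are all assumptions made for illustration. Incoming pages land in a cache and are marked in a bitmap; the marked pages are copied into the SVM's RAM only when the caller decides the whole checkpoint has arrived.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy sizes chosen for illustration only. */
#define PAGE_SIZE 16
#define NUM_PAGES 8

static uint8_t svm_ram[NUM_PAGES][PAGE_SIZE];   /* memory the SVM runs on */
static uint8_t ram_cache[NUM_PAGES][PAGE_SIZE]; /* staging copy of PVM's RAM */
static bool dirty[NUM_PAGES];                   /* pages received since last flush */

/* Store one received page in the cache and remember that it is dirty. */
static bool cache_load_page(size_t page, const uint8_t *data)
{
    if (page >= NUM_PAGES) {
        return false;
    }
    memcpy(ram_cache[page], data, PAGE_SIZE);
    dirty[page] = true;
    return true;
}

/* Copy every dirty page from the cache into SVM RAM; return how many moved. */
static size_t cache_flush(void)
{
    size_t flushed = 0;

    for (size_t i = 0; i < NUM_PAGES; i++) {
        if (dirty[i]) {
            memcpy(svm_ram[i], ram_cache[i], PAGE_SIZE);
            dirty[i] = false;
            flushed++;
        }
    }
    return flushed;
}
```

In this sketch, a receive error before `cache_flush()` leaves `svm_ram` untouched, which is the property the patch description asks for.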

[Qemu-devel] [PATCH COLO-Frame v13 17/39] COLO: Load VMState into qsb before restore it

2015-12-28 Thread zhanghailiang
We should not destroy the state of SVM (Secondary VM) until we receive the whole state from the PVM (Primary VM), in case the primary fails in the middle of sending the state. So here we cache the device state in the Secondary before restoring it. Besides, we should call qemu_system_reset() before

[Qemu-devel] [PATCH COLO-Frame v13 22/39] COLO failover: Introduce state to record failover process

2015-12-28 Thread zhanghailiang
When handling failover, we do different things according to the different stages of the failover process. Here we introduce a global atomic variable to record the status of failover. We add four failover statuses to indicate the different stages of the failover process. You should use the helpers to get and
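
A hedged sketch of a global atomic status with get/set helpers, written with C11 `<stdatomic.h>` rather than QEMU's own atomic macros. The snippet above does not list the four status names, so the enum values and both helper names here are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical status names; the patch text above does not list them. */
typedef enum {
    FAILOVER_NONE,
    FAILOVER_REQUESTED,
    FAILOVER_ACTIVE,
    FAILOVER_COMPLETED,
} FailoverStatus;

static _Atomic int failover_state = FAILOVER_NONE;

/* Read the current status atomically. */
static FailoverStatus failover_get_state(void)
{
    return (FailoverStatus)atomic_load(&failover_state);
}

/*
 * Move from old_state to new_state only if the status is still old_state.
 * Returns the status observed before the attempt, so the caller can tell
 * whether the transition happened (return value == old_state).
 */
static FailoverStatus failover_set_state(FailoverStatus old_state,
                                         FailoverStatus new_state)
{
    int expected = old_state;

    atomic_compare_exchange_strong(&failover_state, &expected, new_state);
    return (FailoverStatus)expected;
}
```

Using compare-and-exchange for the setter means two threads racing to start failover cannot both win: only the one that still sees the expected old status performs the transition.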

[Qemu-devel] [PATCH COLO-Frame v13 24/39] COLO: Implement failover work for Secondary VM

2015-12-28 Thread zhanghailiang
If users require SVM to take over the work, the colo incoming thread should exit from its loop while the failover BH helps return to the migration incoming coroutine. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert

[Qemu-devel] [PATCH COLO-Frame v13 31/39] savevm: Introduce two helper functions for save/find loadvm_handlers entry

2015-12-28 Thread zhanghailiang
For COLO's checkpoint process, we will do savevm/loadvm repeatedly. So every time we call qemu_loadvm_section_start_full(), we will add all section information into the loadvm_handlers list once more. There will be many instances in loadvm_handlers for one section, and this will lead to memory

[Qemu-devel] [PATCH COLO-Frame v13 30/39] savevm: Split load vm state function qemu_loadvm_state

2015-12-28 Thread zhanghailiang
qemu_loadvm_state is too long, and we can simplify it by splitting it into three helper functions. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert v13: - Add Reviewed-by tag --- migration/savevm.c | 156

[Qemu-devel] [PATCH COLO-Frame v13 02/39] migration: Introduce capability 'x-colo' to migration

2015-12-28 Thread zhanghailiang
We add a helper function colo_supported() to indicate whether colo is supported, which we use to control whether the 'x-colo' string is shown to users. They can use the qmp command 'query-migrate-capabilities' or the hmp command 'info migrate_capabilities' to learn whether colo is supported. Cc:

[Qemu-devel] [PATCH COLO-Frame v13 39/39] COLO: Add block replication into colo process

2015-12-28 Thread zhanghailiang
Make sure the master starts block replication after the slave's block replication has started. Signed-off-by: zhanghailiang Signed-off-by: Wen Congyang Signed-off-by: Li Zhijian --- migration/colo.c | 52

[Qemu-devel] [PATCH COLO-Frame v13 20/39] COLO: synchronize PVM's state to SVM periodically

2015-12-28 Thread zhanghailiang
Do a checkpoint periodically; the default interval is 200ms. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- v12: - Add Reviewed-by tag v11: - Fix wrong sleep time for
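
As an illustration of periodic checkpointing, the sketch below computes how long to wait before the next checkpoint. Only the 200ms default comes from the snippet above; the macro and function names are hypothetical, and the real patch may compute the delay differently.

```c
#include <stdint.h>

/* Default interval from the patch description: 200 ms. */
#define CHECKPOINT_INTERVAL_MS 200

/*
 * Given the time of the last checkpoint and the current time (both in ms),
 * return how long to wait before starting the next checkpoint.
 */
static int64_t checkpoint_wait_ms(int64_t last_ms, int64_t now_ms)
{
    int64_t elapsed = now_ms - last_ms;

    if (elapsed >= CHECKPOINT_INTERVAL_MS) {
        return 0; /* already overdue: checkpoint immediately */
    }
    return CHECKPOINT_INTERVAL_MS - elapsed;
}
```

Subtracting the elapsed time, instead of always sleeping a full interval, keeps the checkpoint period close to the configured value even when a checkpoint itself takes a while.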

Re: [Qemu-devel] [PATCH COLO-Frame v13 00/39] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)

2015-12-28 Thread Hailiang Zhang
Cc: Markus Armbruster On 2015/12/29 15:08, zhanghailiang wrote: This is the 13th version of COLO (still only supports periodic checkpoints). This is only the COLO frame part; you can get the whole code from github: https://github.com/coloft/qemu/commits/colo-v2.4-periodic-mode

[Qemu-devel] [PATCH COLO-Frame v13 33/39] COLO: Split qemu_savevm_state_begin out of checkpoint process

2015-12-28 Thread zhanghailiang
It is unnecessary to call qemu_savevm_state_begin() in every checkpoint process. It mainly sets up devices and does the first device state pass. This data will not change during the later checkpoint process. So we split it out of colo_do_checkpoint_transaction(); in this way, we can reduce these

[Qemu-devel] [PATCH COLO-Frame v13 32/39] COLO: Separate the process of saving/loading ram and device state

2015-12-28 Thread zhanghailiang
We separate the process of saving/loading ram and device state when doing a checkpoint, and we add new helpers for saving/loading ram/device state. With this change, we can directly transfer ram from master to slave without using a QEMUSizedBuffer as an intermediary, which also reduces the size of the extra memory being used

[Qemu-devel] [PATCH COLO-Frame v13 28/39] COLO: Process shutdown command for VM in COLO state

2015-12-28 Thread zhanghailiang
If VM is in COLO FT state, we should do some extra work before normal shutdown process. SVM will ignore the shutdown command if this command is issued directly to it, PVM will send the shutdown command to SVM if it gets this command. Cc: Paolo Bonzini Signed-off-by:

[Qemu-devel] [PATCH COLO-Frame v13 11/39] COLO: Add a new RunState RUN_STATE_COLO

2015-12-28 Thread zhanghailiang
Guest will enter this state when paused to save/restore VM state under colo checkpoint. Cc: Eric Blake Cc: Markus Armbruster Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by:

[Qemu-devel] [PATCH COLO-Frame v13 27/39] COLO failover: Don't do failover during loading VM's state

2015-12-28 Thread zhanghailiang
We should not do failover work while the main thread is loading the VM's state, otherwise it will destroy the consistency of the VM's memory and device state. Here we add a new failover status 'RELAUNCH', which means we should relaunch the failover process. Signed-off-by: zhanghailiang

[Qemu-devel] [PATCH COLO-Frame v13 14/39] ram: Split host_from_stream_offset() into two helper functions

2015-12-28 Thread zhanghailiang
Split host_from_stream_offset() into two parts: one gets the ram block, whose idstr may be read from the migration stream; the other gets the hva (host) address from the block and the offset. Besides, we do the check in a new helper, offset_in_ramblock(). Signed-off-by:
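
A simplified sketch of the second half of that split: a bounds check followed by the block-plus-offset translation. `ToyRamBlock` is a stand-in invented for this example, not QEMU's RAMBlock, and `host_from_ram_block_offset()` is an assumed name; only `offset_in_ramblock()` is named in the snippet above, and its real signature may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a RAM block; not QEMU's RAMBlock definition. */
typedef struct {
    uint8_t *host;       /* start of the block in host memory */
    size_t used_length;  /* number of valid bytes in the block */
} ToyRamBlock;

/* Bounds check: is the offset inside the block's used range? */
static bool offset_in_ramblock(const ToyRamBlock *b, size_t offset)
{
    return b && b->host && offset < b->used_length;
}

/* Translate (block, offset) to a host address, or NULL if out of range. */
static void *host_from_ram_block_offset(const ToyRamBlock *b, size_t offset)
{
    if (!offset_in_ramblock(b, offset)) {
        return NULL;
    }
    return b->host + offset;
}
```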

[Qemu-devel] [PATCH COLO-Frame v13 25/39] qmp event: Add COLO_EXIT event to notify users while exited from COLO

2015-12-28 Thread zhanghailiang
If some errors happen during VM's COLO FT stage, it's important to notify the users of this event. Together with 'x_colo_lost_heartbeat', users can intervene in COLO's failover work immediately. If users don't want to get involved in COLO's failover verdict, it is still necessary to notify users

[Qemu-devel] [PATCH COLO-Frame v13 35/39] filter-buffer: Accept zero interval

2015-12-28 Thread zhanghailiang
For default buffer filter, its 'interval' value is zero, so here we should accept zero interval. Signed-off-by: zhanghailiang Reviewed-by: Yang Hongyang Cc: Jason Wang --- v12: - Add Reviewed-by tag v11: - Add

[Qemu-devel] [PATCH COLO-Frame v13 26/39] COLO failover: Shutdown related socket fd when do failover

2015-12-28 Thread zhanghailiang
If the network connection between COLO's two sides is broken while the colo/colo incoming thread is blocked in 'read'/'write' on a socket fd, it will not detect this error until the connection times out, which can take a long time. Here we shut down all the related socket file descriptors to wake up the blocking
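
The socket behaviour this relies on can be sketched on its own, outside QEMU. The example below is single-threaded for simplicity and uses a hypothetical helper `read_after_shutdown()`: once `shutdown()` has been called on a stream socket, `read()` on it returns end-of-stream instead of waiting for data. The patch applies the same call to wake a thread that is already blocked; that part is not shown here.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a connected socket pair, shut down one end, then read from it.
 * Returns read()'s result, or -1 if setup fails. No data is ever written,
 * so without the shutdown() call the read() would block.
 */
static ssize_t read_after_shutdown(void)
{
    int fds[2];
    char buf[8];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
        return -1;
    }
    if (shutdown(fds[0], SHUT_RDWR) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n = read(fds[0], buf, sizeof(buf));
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Unlike `close()`, `shutdown()` leaves the descriptor valid, so another thread still using the fd sees an orderly end-of-stream or error rather than operating on a recycled descriptor number.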
