[Qemu-devel] [PATCH COLO-Frame v11 32/39] COLO: Separate the process of saving/loading ram and device state

2015-11-24 Thread zhanghailiang
during checkpoint. Besides, we move the colo_flush_ram_cache to the proper position after the above change. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- v11: - Remove load configuration section in qemu_loadvm

[Qemu-devel] [PATCH COLO-Frame v11 20/39] COLO: synchronize PVM's state to SVM periodically

2015-11-24 Thread zhanghailiang
Do checkpoint periodically, the default interval is 200ms. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- v11: - Fix wrong sleep time for checkpoint period. (Dave's review comment) --- migration/colo.c | 12 +++

[Qemu-devel] [PATCH COLO-Frame v11 10/39] COLO: Implement colo checkpoint protocol

2015-11-24 Thread zhanghailiang
w, we only support 'periodic' checkpoint, for which the Secondary VM is not running; later we will support 'hybrid' mode. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gong..

[Qemu-devel] [PATCH COLO-Frame v11 25/39] COLO: implement default failover treatment

2015-11-24 Thread zhanghailiang
If we detect an error in COLO, we will wait for some time, hoping users also detect it. If users don't issue a failover command, we will go into the default failover procedure, in which the PVM takes over the work while the SVM exits by default. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.

[Qemu-devel] [PATCH COLO-Frame v11 08/39] migration: Rename the'file' member of MigrationState

2015-11-24 Thread zhanghailiang
Rename the 'file' member of MigrationState to 'to_dst_file'. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Dr. David Alan Gilbert <dgilb...@redhat.com> --- v11: - Only rename 'file' member of MigrationState --- include/migration/migration.h | 2 +- migr

[Qemu-devel] [PATCH COLO-Frame v11 35/39] filter-buffer: Accept zero interval

2015-11-24 Thread zhanghailiang
For default buffer filter, its 'interval' value is zero, so here we should accept zero interval. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> Cc: Yang Hongyang <hongyang.y...@easystack.cn> --- v11: - Add comment v10: -

[Qemu-devel] [PATCH COLO-Frame v11 04/39] migration: Export migrate_set_state()

2015-11-24 Thread zhanghailiang
Fix the first parameter of migrate_set_state(), and export it. We will use it later. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> --- v11: - New patch which is split from patch 'migration: Add state records for migration incoming' (Juan's suggestion) --- include/mig

[Qemu-devel] [PATCH COLO-Frame v11 05/39] migration: Add state records for migration incoming

2015-11-24 Thread zhanghailiang
For the migration destination, we also need to know its state; we will use it in COLO. Here we add a new member 'state' for MigrationIncomingState, and also use migrate_set_state() to modify its value. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Reviewed-by: Dr. David Alan G

[Qemu-devel] [PATCH COLO-Frame v11 30/39] COLO: Update the global runstate after going into colo state

2015-11-24 Thread zhanghailiang
If we start qemu with -S, the runstate will change from 'prelaunch' to 'running' after going into colo state. So it is necessary to update the global runstate after going into colo state. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <

[Qemu-devel] [PATCH COLO-Frame v11 28/39] COLO failover: Don't do failover during loading VM's state

2015-11-24 Thread zhanghailiang
We should not do failover work while the main thread is loading the VM's state; otherwise it will destroy the consistency of the VM's memory and device state. Here we add a new failover status 'RELAUNCH', which means we should relaunch the failover process. Signed-off-by: zhanghailiang

[Qemu-devel] [PATCH COLO-Frame v11 38/39] colo: Use default buffer-filter to buffer and release packets

2015-11-24 Thread zhanghailiang
Enable default filter to buffer packets and release the packets after a checkpoint. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> Cc: Yang Hongyang <hongyang.y...@easystack.cn> --- v11: - Use new helper functions to buffer a

[Qemu-devel] [PATCH COLO-Frame v11 00/39] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)

2015-11-24 Thread zhanghailiang
port buffer/release packets for COLO (patch 32 ~ patch 36) zhanghailiang (39): configure: Add parameter for configure to enable/disable COLO support migration: Introduce capability 'x-colo' to migration COLO: migrate colo related info to secondary node migration: Export migrate_set_s

[Qemu-devel] [PATCH COLO-Frame v11 18/39] COLO: Flush PVM's cached RAM into SVM's memory

2015-11-24 Thread zhanghailiang
cache into SVM's MEMORY, we do this in a more efficient way: only flush pages that were dirtied by the PVM since the last checkpoint. In this way, we can ensure the SVM's memory is the same as the PVM's. Besides, we must ensure the RAM cache is flushed before loading device state. Signed-off-by: zhanghailiang <zhang.zhangha

[Qemu-devel] [PATCH COLO-Frame v11 31/39] savevm: Split load vm state function qemu_loadvm_state

2015-11-24 Thread zhanghailiang
qemu_loadvm_state is too long, and we can simplify it by splitting it up into three helper functions. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> --- migration/savevm.c | 161 - 1 file changed, 97 insertions(+), 64 del

[Qemu-devel] [PATCH COLO-Frame v11 33/39] COLO: Split qemu_savevm_state_begin out of checkpoint process

2015-11-24 Thread zhanghailiang
these data transferring in the later checkpoint. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- migration/colo.c | 51 +-- 1 file changed, 37 insertions(+), 14 deleti

[Qemu-devel] [PATCH COLO-Frame v11 02/39] migration: Introduce capability 'x-colo' to migration

2015-11-24 Thread zhanghailiang
: Juan Quintela <quint...@redhat.com> Cc: Amit Shah <amit.s...@redhat.com> Cc: Eric Blake <ebl...@redhat.com> Cc: Markus Armbruster <arm...@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com>

[Qemu-devel] [PATCH COLO-Frame v11 19/39] COLO: Add checkpoint-delay parameter for migrate-set-parameters

2015-11-24 Thread zhanghailiang
Add checkpoint-delay parameter for migrate-set-parameters, so that we can control the checkpoint frequency when COLO is in periodic mode. Cc: Luiz Capitulino <lcapitul...@redhat.com> Cc: Eric Blake <ebl...@redhat.com> Cc: Markus Armbruster <arm...@redhat.com> Signed-off

[Qemu-devel] [PATCH COLO-Frame v11 09/39] COLO/migration: Create a new communication path from destination to source

2015-11-24 Thread zhanghailiang
This new communication path will be used for returning messages from destination to source. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Cc: Dr. David Alan Gilbert <dgilb...@redhat.com> --- v11: - Re

[Qemu-devel] [PATCH COLO-Frame v11 14/39] ram: Split host_from_stream_offset() into two helper functions

2015-11-24 Thread zhanghailiang
Split host_from_stream_offset() into two parts: one gets the RAM block, whose block idstr may be obtained from the migration stream; the other gets the hva (host) address from the block and the offset. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> --- v11: - New patch --- mig

[Qemu-devel] [PATCH COLO-Frame v11 23/39] COLO: Implement failover work for Primary VM

2015-11-24 Thread zhanghailiang
For the PVM, if there is a failover request from users, the colo thread will exit the loop while the failover BH does the cleanup work and resumes the VM. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- v11: - Don't call m

[Qemu-devel] [PATCH COLO-Frame v11 01/39] configure: Add parameter for configure to enable/disable COLO support

2015-11-24 Thread zhanghailiang
configure --enable-colo/--disable-colo to switch COLO support on/off. COLO support is On by default. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gong...@huawei.com> Reviewed-

[Qemu-devel] [PATCH COLO-Frame v11 03/39] COLO: migrate colo related info to secondary node

2015-11-24 Thread zhanghailiang
/destination; Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gong...@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> --- v11: - Add Reviewed-by tag v10: -

[Qemu-devel] [PATCH COLO-Frame v11 13/39] COLO: Save PVM state to secondary side when do checkpoint

2015-11-24 Thread zhanghailiang
state, so in master, we use qsb to store VM state temporarily, get the data size by calling qsb_get_length() and then migrate the data to the qsb in the secondary side. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Gonglei <arei.gong...@huawei.com> Signe

[Qemu-devel] [PATCH COLO-Frame v11 07/39] migration: Integrate COLO checkpoint process into loadvm

2015-11-24 Thread zhanghailiang
the original migration incoming coroutine. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- v11: - We moved the place of bdrv_invalidate_cache_all(), but did the deletion in another patch. Fix it. - Add documentat

[Qemu-devel] [PATCH COLO-Frame v11 15/39] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily

2015-11-24 Thread zhanghailiang
-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gong...@huawei.com> --- v11: - Rename 'host_cache' to 'colo_cache' (Dave's suggestion) v10: - Split the process of dirty pages recording into a new patch

[Qemu-devel] [PATCH COLO-Frame v11 22/39] COLO failover: Introduce state to record failover process

2015-11-24 Thread zhanghailiang
and set the value. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> --- v11: - fix several typos found by Dave - Add Reviewed-by tag --- include/migration/failover.h | 10 ++ migration/colo-failov

[Qemu-devel] [PATCH COLO-Frame v11 24/39] COLO: Implement failover work for Secondary VM

2015-11-24 Thread zhanghailiang
If users require the SVM to take over the work, the colo incoming thread should exit from the loop while the failover BH helps return to the migration incoming coroutine. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- migration

[Qemu-devel] [PATCH COLO-Frame v11 21/39] COLO failover: Introduce a new command to trigger a failover

2015-11-24 Thread zhanghailiang
m...@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- v11: - Add more comments for x-colo-lost-heartbeat command (Eric's suggestion) - Return 'enum' instead of 'int' for get_colo_mode() (Eric's sug

[Qemu-devel] [PATCH COLO-Frame v11 11/39] COLO: Add a new RunState RUN_STATE_COLO

2015-11-24 Thread zhanghailiang
Guest will enter this state when paused to save/restore VM state under colo checkpoint. Cc: Eric Blake <ebl...@redhat.com> Cc: Markus Armbruster <arm...@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fu

[Qemu-devel] [PATCH COLO-Frame v11 34/39] net/filter-buffer: Add default filter-buffer for each netdev

2015-11-24 Thread zhanghailiang
-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> Cc: Yang Hongyang <hongyang.y...@easystack.cn> --- v11: - New patch --- include/net/filter.h | 3 +++ net/filter-buffer.c | 74 net/net.c

[Qemu-devel] [PATCH COLO-Frame v11 12/39] QEMUSizedBuffer: Introduce two help functions for qsb

2015-11-24 Thread zhanghailiang
into qsb, this is used to get VM state from socket into a buffer. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> --- v11: - size_t'ify these two help

[Qemu-devel] [PATCH COLO-Frame v11 26/39] qmp event: Add event notification for COLO error

2015-11-24 Thread zhanghailiang
that we exited COLO mode. Cc: Markus Armbruster <arm...@redhat.com> Cc: Michael Roth <mdr...@linux.vnet.ibm.com> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- v11: - Fix several typos found by Eric ---

[Qemu-devel] [PATCH COLO-Frame v11 17/39] COLO: Load VMState into qsb before restore it

2015-11-24 Thread zhanghailiang
load VM state, which can ensure the data is intact. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gong...@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> --

[Qemu-devel] [PATCH COLO-Frame v11 16/39] ram/COLO: Record the dirty pages that SVM received

2015-11-24 Thread zhanghailiang
We record the addresses of the dirty pages received; this helps when flushing the cached pages into the SVM. We record them by re-using the migration dirty bitmap. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> --- v11: - Split a new helper function from original host_from_stream_

[Qemu-devel] [PATCH COLO-Frame v11 06/39] migration: Integrate COLO checkpoint process into migration

2015-11-24 Thread zhanghailiang
Add a migrate state: MIGRATION_STATUS_COLO, enter this migration state after the first live migration successfully finished. We reuse migration thread, so if colo is enabled by user, migration thread will go into the process of colo. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.

[Qemu-devel] [PATCH COLO-Frame v11 39/39] COLO: Add block replication into colo process

2015-11-24 Thread zhanghailiang
Make sure the master starts block replication after the slave's block replication has started. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Wen Congyang <we...@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- mig

[Qemu-devel] [PATCH COLO-Frame v11 29/39] COLO: Process shutdown command for VM in COLO state

2015-11-24 Thread zhanghailiang
ed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- include/sysemu/sysemu.h | 3 +++ migration/colo.c| 25 +++-- qapi-schema.json| 4 +++- vl.c| 26 -- 4

[Qemu-devel] [PATCH COLO-Frame v11 36/39] filter-buffer: Introduce a helper function to enable/disable default filter

2015-11-24 Thread zhanghailiang
The default buffer filter doesn't buffer packets by default, but we need to buffer packets for COLO or Micro-checkpoint. Here we add a helper function to enable/disable the filter's buffer capability. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@r

[Qemu-devel] [PATCH COLO-Frame v11 27/39] COLO failover: Shutdown related socket fd when do failover

2015-11-24 Thread zhanghailiang
operation in failover BH. Besides, we should close the corresponding file descriptors after the failover BH shuts them down, or there will be an error. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- v11: - Only shutdown

[Qemu-devel] [PATCH COLO-Frame v11 37/39] filter-buffer: Introduce a helper function to release packets

2015-11-24 Thread zhanghailiang
We need to release all the packets from the VM in COLO or Micro-checkpoint. Here we add a new helper function to release the packets buffered by the default buffer-filter. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> Cc: Yang Hongyang

Re: [Qemu-devel] [PATCH] migration: Add state records for migration incoming

2015-11-23 Thread zhanghailiang
On 2015/11/18 18:51, Juan Quintela wrote: zhanghailiang <zhang.zhanghaili...@huawei.com> wrote: For migration destination, sometimes we need to know its state, and it is also useful for tracing migration incoming process. Here we add a new member 'state' for MigrationIncomingState, an

Re: [Qemu-devel] [PATCH COLO-Frame v10 19/38] COLO failover: Introduce state to record failover process

2015-11-22 Thread zhanghailiang
On 2015/11/20 23:51, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: When handling failover, we do different things according to the stage of the failover process. Here we introduce a global atomic variable to record the status of failover. We add

Re: [Qemu-devel] [PATCH COLO-Frame v10 23/38] qmp event: Add event notification for COLO error

2015-11-22 Thread zhanghailiang
On 2015/11/21 5:50, Eric Blake wrote: On 11/03/2015 04:56 AM, zhanghailiang wrote: If some errors happen during VM's COLO FT stage, it's important to notify the users of this event. Together with 'colo_lost_heartbeat', users can intervene in COLO's failover work immediately. If users don't

Re: [Qemu-devel] [PATCH COLO-Frame v10 18/38] COLO failover: Introduce a new command to trigger a failover

2015-11-17 Thread zhanghailiang
On 2015/11/14 0:59, Eric Blake wrote: On 11/03/2015 04:56 AM, zhanghailiang wrote: We leave users to choose whatever heartbeat solution they want. If the heartbeat is lost, or they detect other errors, they can use the experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover, COLO

Re: [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically

2015-11-17 Thread zhanghailiang
On 2015/11/17 18:08, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: On 2015/11/14 2:34, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: Do checkpoint periodically, the default interval is 200ms. Signed-off

Re: [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically

2015-11-17 Thread zhanghailiang
On 2015/11/14 2:34, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: Do checkpoint periodically, the default interval is 200ms. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fu

Re: [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint

2015-11-17 Thread zhanghailiang
On 2015/11/14 2:53, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: On 2015/11/7 2:59, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: The main process of checkpoint is to synchronize SVM with PVM. VM's state includes

Re: [Qemu-devel] [PATCH COLO-Frame v10 16/38] COLO: Flush PVM's cached RAM into SVM's memory

2015-11-16 Thread zhanghailiang
On 2015/11/14 0:38, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: While the VM is running, the PVM may dirty some pages. We will transfer the PVM's dirty pages to the SVM and store them into the SVM's RAM cache at the next checkpoint time. So, the content of SVM's

Re: [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration

2015-11-16 Thread zhanghailiang
On 2015/11/14 0:42, Eric Blake wrote: On 11/03/2015 04:56 AM, zhanghailiang wrote: Add a migrate state: MIGRATION_STATUS_COLO, enter this migration state after the first live migration successfully finished. We reuse migration thread, so if colo is enabled by user, migration thread will go

Re: [Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily

2015-11-16 Thread zhanghailiang
On 2015/11/13 23:39, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: We should not load the PVM's state directly into the SVM, because errors may happen while the SVM is receiving data, which would break the SVM. We need to ensure receiving all data before load

Re: [Qemu-devel] [PATCH COLO-Frame v10 02/38] migration: Introduce capability 'x-colo' to migration

2015-11-16 Thread zhanghailiang
On 2015/11/14 0:01, Eric Blake wrote: On 11/03/2015 04:56 AM, zhanghailiang wrote: We add a helper function colo_supported() to indicate whether colo is supported or not, which we use to control whether or not to show the 'x-colo' string to users; they can use qmp command 'query-migrate

Re: [Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it

2015-11-16 Thread zhanghailiang
On 2015/11/14 0:02, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: We should not destroy the state of SVM (Secondary VM) until we receive the whole state from the PVM (Primary VM), in case the primary fails in the middle of sending the state, so, here we

Re: [Qemu-devel] [PATCH COLO-Frame v10 15/38] ram/COLO: Record pages received from PVM by re-using migration dirty bitmap

2015-11-16 Thread zhanghailiang
On 2015/11/14 0:19, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: We need to record the addresses of the dirty pages received from the PVM; this helps when flushing the cached pages into the SVM. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.

Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol

2015-11-16 Thread zhanghailiang
On 2015/11/14 0:46, Eric Blake wrote: On 11/03/2015 04:56 AM, zhanghailiang wrote: We need a user-defined communication protocol to control the checkpoint process. A new checkpoint request is started by the Primary VM, and the interaction proceeds as below: Checkpoint synchronizing points

Re: [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO

2015-11-16 Thread zhanghailiang
On 2015/11/14 0:47, Eric Blake wrote: On 11/03/2015 04:56 AM, zhanghailiang wrote: Guest will enter this state when paused to save/restore VM state under colo checkpoint. Cc: Eric Blake <ebl...@redhat.com> Cc: Markus Armbruster <arm...@redhat.com> Signed-off-by: zhanghailiang <z

Re: [Qemu-devel] [POC]colo-proxy in qemu

2015-11-10 Thread zhanghailiang
e a good idea, maybe we can add a checkpoint request command in COLO to support more packets comparing scheme ... Thanks, zhanghailiang - Haven't read the code of packet comparing, but if it needs to keep track the state of each connection, it could be easily DOS from guest. Thanks .

Re: [Qemu-devel] [PATCH COLO-Frame v10 11/38] QEMUSizedBuffer: Introduce two help functions for qsb

2015-11-09 Thread zhanghailiang
On 2015/11/7 2:30, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: Introduce two new QEMUSizedBuffer APIs which will be used by COLO to buffer VM state: One is qsb_put_buffer(), which puts the content of a given QEMUSizedBuffer into a QEMUFile; this is used

Re: [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint

2015-11-09 Thread zhanghailiang
On 2015/11/7 2:59, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: The main process of checkpoint is to synchronize SVM with PVM. VM's state includes ram and device state. So we will migrate the PVM's state to the SVM when doing a checkpoint, just like migration does

Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol

2015-11-08 Thread zhanghailiang
On 2015/11/9 14:51, zhanghailiang wrote: On 2015/11/7 2:26, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: We need a user-defined communication protocol to control the checkpoint process. A new checkpoint request is started by the Primary VM

Re: [Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm

2015-11-08 Thread zhanghailiang
On 2015/11/7 1:29, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: Switch from normal migration loadvm process into COLO checkpoint process if COLO mode is enabled. We add three new members to struct MigrationIncomingState, 'have_colo_incoming_thread

Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol

2015-11-08 Thread zhanghailiang
On 2015/11/7 2:26, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: We need a user-defined communication protocol to control the checkpoint process. A new checkpoint request is started by the Primary VM, and the interaction proceeds as below: Checkpoint

Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev

2015-11-05 Thread zhanghailiang
On 2015/11/5 17:19, Jason Wang wrote: On 11/05/2015 03:43 PM, zhanghailiang wrote: Hi Jason, On 2015/11/4 10:56, Jason Wang wrote: On 11/03/2015 07:56 PM, zhanghailiang wrote: Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com>

Re: [Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support

2015-11-05 Thread zhanghailiang
Hi Eric, On 2015/11/5 22:52, Eric Blake wrote: On 11/03/2015 04:56 AM, zhanghailiang wrote: configure --enable-colo/--disable-colo to switch COLO support on/off. COLO support is off by default. Off by default risks bit-rot for people not building it; it's generally best to default to off

Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev

2015-11-04 Thread zhanghailiang
Hi Jason, On 2015/11/4 10:56, Jason Wang wrote: On 11/03/2015 07:56 PM, zhanghailiang wrote: Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> Commit log please. --- v10: new patch --- include/net/filter.h | 1 + inclu

Re: [Qemu-devel] [vhost-user BUG ?] QEMU process segfault when shutdown or reboot with vhost-user

2015-11-04 Thread zhanghailiang
On 2015/11/4 11:19, Jason Wang wrote: On 11/04/2015 10:24 AM, zhanghailiang wrote: On 2015/11/3 22:54, Marc-André Lureau wrote: Hi On Tue, Nov 3, 2015 at 2:01 PM, zhanghailiang <zhang.zhanghaili...@huawei.com> wrote: The corresponding codes where gdb reports error are: (We have adde

[Qemu-devel] [PATCH COLO-Frame v10 20/38] COLO: Implement failover work for Primary VM

2015-11-03 Thread zhanghailiang
For the PVM, if there is a failover request from users, the colo thread will exit the loop while the failover BH does the cleanup work and resumes the VM. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- v10: Call m

[Qemu-devel] [PATCH COLO-Frame v10 36/38] netfilter: Introduce an API to delete all the automatically added netfilters

2015-11-03 Thread zhanghailiang
We add a new property 'auto' for netfilter to distinguish if netfilter is added by user or automatically added. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> --- v10: new patch --- include/net/filter.h | 2 ++ net/filter-bu

[Qemu-devel] [PATCH COLO-Frame v10 24/38] COLO failover: Shutdown related socket fd when do failover

2015-11-03 Thread zhanghailiang
operation in failover BH. Besides, we should close the corresponding file descriptors after the failover BH shuts them down, or there will be an error. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- migration

[Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters

2015-11-03 Thread zhanghailiang
Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> --- v10: new patch --- include/net/filter.h | 1 + net/filter-buffer.c | 17 + 2 files changed, 18 insertions(+) diff --git a/include/net/filter.h b/include/net/fi

Re: [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters

2015-11-03 Thread zhanghailiang
. Or there will be two places to release packets when we enable colo ft, one in timer callback, the other one in COLO when we do checkpoint. Thanks, zhanghailiang On 2015/11/03 19:56, zhanghailiang wrote: Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang

[Qemu-devel] [PATCH COLO-Frame v10 38/38] COLO: Add block replication into colo process

2015-11-03 Thread zhanghailiang
Make sure the master starts block replication after the slave's block replication has started. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Wen Congyang <we...@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- migratio

[Qemu-devel] [PATCH COLO-Frame v10 29/38] savevm: Split load vm state function qemu_loadvm_state

2015-11-03 Thread zhanghailiang
qemu_loadvm_state is too long, and we can simplify it by splitting it up into three helper functions. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> --- migration/savevm.c | 165 +++-- 1 file changed, 96 insertions(+), 69 del

[Qemu-devel] [vhost-user BUG ?] QEMU process segfault when shutdown or reboot with vhost-user

2015-11-03 Thread zhanghailiang
uite familiar with vhost-user, but for vhost-user, these two callback functions seem to be always NULL. Why can we get here? Is it an error to add a VM state change handler for vhost-user? Thanks, zhanghailiang

Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev

2015-11-03 Thread zhanghailiang
On 2015/11/3 20:57, Yang Hongyang wrote: On 2015/11/03 19:56, zhanghailiang wrote: Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> --- v10: new patch --- include/net/filter.h | 1 + include/net/net.h| 3 ++ net/fil

Re: [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets

2015-11-03 Thread zhanghailiang
On 2015/11/3 20:39, Yang Hongyang wrote: On 2015/11/03 19:56, zhanghailiang wrote: For COLO or MC FT, we need a function to actively release all the buffered packets. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> --- v1

[Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily

2015-11-03 Thread zhanghailiang
-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gong...@huawei.com> --- v10: Split the process of dirty pages recording into a new patch --- include/exec/ram_addr.h | 1 + include/migration/colo.h |

[Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval

2015-11-03 Thread zhanghailiang
Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> --- v10: new patch --- net/filter-buffer.c | 10 -- 1 file changed, 10 deletions(-) diff --git a/net/filter-buffer.c b/net/filter-buffer.c index 5f0ea70..05313de 100644 ---

[Qemu-devel] [PATCH COLO-Frame v10 21/38] COLO: Implement failover work for Secondary VM

2015-11-03 Thread zhanghailiang
If users require the SVM to take over the work, the colo incoming thread should exit from the loop while the failover BH helps return to the migration incoming coroutine. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- migration

[Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev

2015-11-03 Thread zhanghailiang
Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> --- v10: new patch --- include/net/filter.h | 1 + include/net/net.h| 3 ++ net/filter-buffer.c | 84 net/net.c

[Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it

2015-11-03 Thread zhanghailiang
load VM state, which can ensure the data is intact. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gong...@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> --

[Qemu-devel] [PATCH COLO-Frame v10 37/38] colo: Use the netfilter to buffer and release packets

2015-11-03 Thread zhanghailiang
Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> --- v10: Use the new API --- migration/colo.c | 29 + 1 file changed, 29 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 36f737a..25335db 100644 --- a/migration/colo.c +++ b/mig

[Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets

2015-11-03 Thread zhanghailiang
For COLO or MC FT, we need a function to release all the buffered packets actively. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Jason Wang <jasow...@redhat.com> --- v10: new patch --- include/net/filter.h | 1 + include/net/net.h| 4 net/filter-bu

[Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO

2015-11-03 Thread zhanghailiang
Guest will enter this state when paused to save/restore VM state under colo checkpoint. Cc: Eric Blake <ebl...@redhat.com> Cc: Markus Armbruster <arm...@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fu

[Qemu-devel] [PATCH COLO-Frame v10 28/38] COLO: Update the global runstate after going into colo state

2015-11-03 Thread zhanghailiang
If we start qemu with -S, the runstate will change from 'prelaunch' to 'running' after going into colo state. So it is necessary to update the global runstate after going into colo state. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <

[Qemu-devel] [PATCH COLO-Frame v10 19/38] COLO failover: Introduce state to record failover process

2015-11-03 Thread zhanghailiang
and set the value. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> --- include/migration/failover.h | 10 ++ migration/colo-failover.c| 37 + migration/colo.c | 4 trace-events | 1 + 4 files chang

[Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration

2015-11-03 Thread zhanghailiang
Add a migrate state: MIGRATION_STATUS_COLO, enter this migration state after the first live migration successfully finished. We reuse migration thread, so if colo is enabled by user, migration thread will go into the process of colo. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.

[Qemu-devel] [PATCH COLO-Frame v10 23/38] qmp event: Add event notification for COLO error

2015-11-03 Thread zhanghailiang
that we exit COLO mode. Cc: Markus Armbruster <arm...@redhat.com> Cc: Michael Roth <mdr...@linux.vnet.ibm.com> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- docs/qmp-events.txt | 17 +

[Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm

2015-11-03 Thread zhanghailiang
the original migration incoming coroutine. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- v10: fix a bug about fd leak which is found by Dave. --- include/migration/colo.h | 7 +++ include/migration/migr

[Qemu-devel] [PATCH COLO-Frame v10 02/38] migration: Introduce capability 'x-colo' to migration

2015-11-03 Thread zhanghailiang
: Juan Quintela <quint...@redhat.com> Cc: Amit Shah <amit.s...@redhat.com> Cc: Eric Blake <ebl...@redhat.com> Cc: Markus Armbruster <arm...@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com>

[Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol

2015-11-03 Thread zhanghailiang
should be added. 3) Since sync-points are single direction, the remote side may go forward a lot when this side just receives the sync-point. 4) For now, we only support 'periodic' checkpoint, for which the Secondary VM is not running, later we will support 'hybrid' mode. Signed-off-by

[Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support

2015-11-03 Thread zhanghailiang
configure --enable-colo/--disable-colo to switch COLO support on/off. COLO support is off by default. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gong...@huawei.com> Reviewed-

[Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically

2015-11-03 Thread zhanghailiang
Perform checkpoints periodically; the default interval is 200 ms. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- migration/colo.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/migration/colo.c

[Qemu-devel] [PATCH COLO-Frame v10 07/38] migration: Rename the'file' member of MigrationState and MigrationIncomingState

2015-11-03 Thread zhanghailiang
communication, so here we rename the file member to indicate this path. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Cc: Dr. David Alan Gilbert <dgilb...@redhat.com> --- Will be dropped if post-copy is merged. --- include/migration/migration.h | 4 ++-- migration/exec.c

[Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)

2015-11-03 Thread zhanghailiang
~ patch 36) zhanghailiang (38): configure: Add parameter for configure to enable/disable COLO support migration: Introduce capability 'x-colo' to migration COLO: migrate colo related info to secondary node migration: Add state records for migration incoming migration: Integrate COLO

[Qemu-devel] [PATCH COLO-Frame v10 04/38] migration: Add state records for migration incoming

2015-11-03 Thread zhanghailiang
-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> --- include/migration/migration.h | 3 +++ migration/migration.c | 43 +++ 2 files changed, 30 insertions(+), 16 deletions(-) diff --git a/inclu

[Qemu-devel] [PATCH COLO-Frame v10 25/38] COLO failover: Don't do failover during loading VM's state

2015-11-03 Thread zhanghailiang
We should not do failover work while the main thread is loading VM's state, otherwise it will destroy the consistency of VM's memory and device state. Here we add a new failover status 'RELAUNCH' which means we should relaunch the process of failover. Signed-off-by: zhanghailiang

[Qemu-devel] [PATCH COLO-Frame v10 08/38] COLO/migration: establish a new communication path from destination to source

2015-11-03 Thread zhanghailiang
Add a new member 'to_src_file' to MigrationIncomingState and a new member 'from_dst_file' to MigrationState. They will be used for returning messages from destination to source. It will also be used by post-copy migration. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Sign

[Qemu-devel] [PATCH COLO-Frame v10 31/38] COLO: Split qemu_savevm_state_begin out of checkpoint process

2015-11-03 Thread zhanghailiang
these data transferring in the later checkpoint. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> --- migration/colo.c | 51 +-- 1 file changed, 37 insertions(+), 14 deleti

[Qemu-devel] [PATCH COLO-Frame v10 03/38] COLO: migrate colo related info to secondary node

2015-11-03 Thread zhanghailiang
/destination; Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gong...@huawei.com> --- v10: - Use VMSTATE_BOOL instead of VMSTATE_UINT32 for 'colo_requested' (Dave's suggestion). ---

[Qemu-devel] [PATCH COLO-Frame v10 11/38] QEMUSizedBuffer: Introduce two help functions for qsb

2015-11-03 Thread zhanghailiang
into qsb, this is used to get VM state from socket into a buffer. Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> --- include/migration/qemu-file.h | 3 +
