[Qemu-block] [PATCH RFC v2 5/6] replication: Implement block replication for shared disk case

2016-12-05 Thread zhanghailiang
As in the non-shared disk case, block replication for a shared disk
is built from basic building blocks that already exist in QEMU.
The architecture is:

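         virtio-blk                     ||                              .----------
             /                          ||                              | Secondary
            /                           ||                              '----------
           /                            ||                                virtio-blk
          /                             ||                                    |
          |                             ||                              replication(5)
          |                  NBD  -------->  NBD   (2)                        |
          |                 client      ||  server ---> hidden disk <-- active disk(4)
          |                    ^        ||                   |
          |             replication(1)  ||                   |
          |                    |        ||                   |
          |   +----------------'-----+  ||                   |
         (3)  |drive-backup sync=none|  ||                   |
--------. |   +----------------------+  ||                   |
Primary | |              |              ||          backing  |
--------' |              |              ||                   |
          V              |                                   |
      +--------------------------------------+               |
      |             shared disk              | <-------------+
      +--------------------------------------+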

1) On each primary write, the original data is read and forwarded to the
   Secondary QEMU.
2) The hidden disk is created automatically. It buffers the original content
   of sectors modified by the primary VM. It must start out empty, and its
   driver must support bdrv_make_empty() and backing files.
3) Primary write requests are written to the shared disk.
4) Secondary write requests are buffered in the active disk, overwriting
   any existing content for the same sectors in that buffer.
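
The data paths in 1) to 4) can be illustrated with a toy in-memory model.
This is only a sketch and not part of the patch: every name below
(ToyReplication, toy_primary_write() and so on) is hypothetical, sectors are
single bytes, and the checkpoint step that empties both overlays is an
assumption suggested by the bdrv_make_empty() requirement in 2), not
something this patch spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_SECTORS 8   /* hypothetical, tiny disk for illustration */

/* An overlay holds data only for the sectors marked present. */
typedef struct {
    uint8_t data[TOY_SECTORS];
    bool present[TOY_SECTORS];
} ToyOverlay;

typedef struct {
    uint8_t shared[TOY_SECTORS]; /* shared disk, written by the primary */
    ToyOverlay hidden;           /* original content saved before primary writes */
    ToyOverlay active;           /* the secondary VM's own writes */
} ToyReplication;

void toy_init(ToyReplication *r)
{
    memset(r, 0, sizeof(*r));
}

/* Steps 1) and 3): keep the original content in the hidden disk the first
 * time a sector is touched, then let the write reach the shared disk. */
void toy_primary_write(ToyReplication *r, int sector, uint8_t value)
{
    assert(sector >= 0 && sector < TOY_SECTORS);
    if (!r->hidden.present[sector]) {
        r->hidden.data[sector] = r->shared[sector];
        r->hidden.present[sector] = true;
    }
    r->shared[sector] = value;
}

/* Step 4): secondary writes land in the active disk only. */
void toy_secondary_write(ToyReplication *r, int sector, uint8_t value)
{
    assert(sector >= 0 && sector < TOY_SECTORS);
    r->active.data[sector] = value;
    r->active.present[sector] = true;
}

/* Secondary reads follow the backing chain:
 * active disk, then hidden disk, then shared disk. */
uint8_t toy_secondary_read(const ToyReplication *r, int sector)
{
    assert(sector >= 0 && sector < TOY_SECTORS);
    if (r->active.present[sector]) {
        return r->active.data[sector];
    }
    if (r->hidden.present[sector]) {
        return r->hidden.data[sector];
    }
    return r->shared[sector];
}

/* Assumed checkpoint behaviour: both overlays are emptied, so the
 * secondary sees the current shared disk content again. */
void toy_checkpoint(ToyReplication *r)
{
    memset(&r->hidden, 0, sizeof(r->hidden));
    memset(&r->active, 0, sizeof(r->active));
}
```

In this model, if sector 2 of the shared disk holds 7 and the primary then
writes 9 to it, the shared disk holds 9 while toy_secondary_read() for
sector 2 still returns 7, because the hidden disk kept the original content.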

Signed-off-by: zhanghailiang 
Signed-off-by: Wen Congyang 
Signed-off-by: Zhang Chen 
---
 block/replication.c | 48 ++--
 1 file changed, 42 insertions(+), 6 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 6574cc2..f416ca5 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -233,7 +233,7 @@ static coroutine_fn int replication_co_readv(BlockDriverState *bs,
                                              QEMUIOVector *qiov)
 {
     BDRVReplicationState *s = bs->opaque;
-    BdrvChild *child = s->secondary_disk;
+    BdrvChild *child = s->is_shared_disk ? s->primary_disk : s->secondary_disk;
     BlockJob *job = NULL;
     CowRequest req;
     int ret;
@@ -415,7 +415,12 @@ static void backup_job_completed(void *opaque, int ret)
         s->error = -EIO;
     }
 
-    backup_job_cleanup(bs);
+    if (s->mode == REPLICATION_MODE_PRIMARY) {
+        s->replication_state = BLOCK_REPLICATION_DONE;
+        s->error = 0;
+    } else {
+        backup_job_cleanup(bs);
+    }
 }
 
 static bool check_top_bs(BlockDriverState *top_bs, BlockDriverState *bs)
@@ -467,6 +472,19 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
 
     switch (s->mode) {
     case REPLICATION_MODE_PRIMARY:
+        if (s->is_shared_disk) {
+            job = backup_job_create(NULL, s->primary_disk->bs, bs, 0,
+                MIRROR_SYNC_MODE_NONE, NULL, false, BLOCKDEV_ON_ERROR_REPORT,
+                BLOCKDEV_ON_ERROR_REPORT, BLOCK_JOB_INTERNAL,
+                backup_job_completed, bs, NULL, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                backup_job_cleanup(bs);
+                aio_context_release(aio_context);
+                return;
+            }
+            block_job_start(job);
+        }
         break;
     case REPLICATION_MODE_SECONDARY:
         s->active_disk = bs->file;
@@ -485,7 +503,8 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
         }
 
         s->secondary_disk = s->hidden_disk->bs->backing;
-        if (!s->secondary_disk->bs || !bdrv_has_blk(s->secondary_disk->bs)) {
+        if (!s->secondary_disk->bs ||
+            (!s->is_shared_disk && !bdrv_has_blk(s->secondary_disk->bs))) {
             error_setg(errp, "The secondary disk doesn't have block backend");
             aio_context_release(aio_context);
             return;
@@ -580,11 +599,24 @@ static void replication_do_checkpoint(ReplicationState *rs, Error **errp)
 
     switch (s->mode) {
     case REPLICATION_MODE_PRIMARY:
+        if (s->is_shared_disk) {
+            if (!s->primary_disk->bs->job) {
+                error_setg(errp, "Primary backup job was cancelled"
+                           " unexpectedly");
+                break

Re: [Qemu-block] [PATCH RFC v2 5/6] replication: Implement block replication for shared disk case

2017-01-17 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 04:35:03PM +0800, zhanghailiang wrote:
> @@ -663,8 +695,12 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
>  
>      switch (s->mode) {
>      case REPLICATION_MODE_PRIMARY:
> -        s->replication_state = BLOCK_REPLICATION_DONE;
> -        s->error = 0;
> +        if (s->is_shared_disk && s->primary_disk->bs->job) {
> +            block_job_cancel(s->primary_disk->bs->job);

Should this be block_job_cancel_sync()?

> +        } else {
> +            s->replication_state = BLOCK_REPLICATION_DONE;
> +            s->error = 0;
> +        }
>          break;
>      case REPLICATION_MODE_SECONDARY:
>          /*
> -- 
> 1.8.3.1
> 
> 




Re: [Qemu-block] [PATCH RFC v2 5/6] replication: Implement block replication for shared disk case

2017-01-17 Thread Hailiang Zhang

Hi Stefan,

On 2017/1/17 21:19, Stefan Hajnoczi wrote:
> On Mon, Dec 05, 2016 at 04:35:03PM +0800, zhanghailiang wrote:
>> @@ -663,8 +695,12 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
>>
>>      switch (s->mode) {
>>      case REPLICATION_MODE_PRIMARY:
>> -        s->replication_state = BLOCK_REPLICATION_DONE;
>> -        s->error = 0;
>> +        if (s->is_shared_disk && s->primary_disk->bs->job) {
>> +            block_job_cancel(s->primary_disk->bs->job);
>
> Should this be block_job_cancel_sync()?

No. This differs from the secondary side, which must wait until the
backup job has been cancelled before it resumes running (otherwise there
will be an error, see https://patchwork.kernel.org/patch/9128841/).

As the design scenario in patch 1 shows, the primary VM accesses the
shared disk directly. The backup job, whose source is that same shared
disk, does not affect the running primary VM, so IMHO it is safe to call
block_job_cancel() here.

Thanks,
Hailiang



>> +        } else {
>> +            s->replication_state = BLOCK_REPLICATION_DONE;
>> +            s->error = 0;
>> +        }
>>          break;
>>      case REPLICATION_MODE_SECONDARY:
>>          /*
>> --
>> 1.8.3.1
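
The distinction discussed in this thread, an asynchronous cancel that
returns at once versus a synchronous one that waits for the job to finish,
can be sketched with a toy model. Nothing here is QEMU API: ToyJob,
toy_job_cancel(), toy_job_cancel_sync() and toy_event_loop_iterate() are
hypothetical stand-ins, and the single-step event loop is a simplifying
assumption. The completion callback mirrors what the patch does in primary
mode, where backup_job_completed() sets the state to BLOCK_REPLICATION_DONE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TOY_RUNNING, TOY_DONE } ToyReplState;

typedef void ToyCompletionFn(void *opaque, int ret);

typedef struct {
    bool cancel_requested;
    bool completed;
    ToyCompletionFn *cb;   /* runs when the job actually finishes */
    void *opaque;
} ToyJob;

/* Asynchronous cancel: only flags the job and returns immediately. */
void toy_job_cancel(ToyJob *job)
{
    job->cancel_requested = true;
}

/* One iteration of a toy event loop: a job flagged for cancellation
 * finishes here and its completion callback runs.
 * Returns true if the job completed during this iteration. */
bool toy_event_loop_iterate(ToyJob *job)
{
    if (job->cancel_requested && !job->completed) {
        job->completed = true;
        job->cb(job->opaque, -1);   /* -1 is an arbitrary toy return code */
        return true;
    }
    return false;
}

/* Synchronous cancel: flag the job, then drive the loop until it is done. */
void toy_job_cancel_sync(ToyJob *job)
{
    toy_job_cancel(job);
    while (!job->completed) {
        toy_event_loop_iterate(job);
    }
}

/* Primary-mode completion callback: mark replication as done. */
void toy_primary_job_completed(void *opaque, int ret)
{
    (void)ret;
    *(ToyReplState *)opaque = TOY_DONE;
}
```

With the asynchronous variant the state is still TOY_RUNNING when
toy_job_cancel() returns and only becomes TOY_DONE once the loop runs the
callback. The argument in the reply above is that the primary VM can
tolerate that window, whereas the secondary side cannot.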