Re: [Qemu-devel] [PATCH V6 1/3] Implement sync modes for drive-backup.

2013-07-25 Thread Kevin Wolf
Am 25.07.2013 um 00:01 hat Ian Main geschrieben:
 On Wed, Jul 24, 2013 at 02:32:53PM -0600, Eric Blake wrote:
  On 07/24/2013 04:55 AM, Kevin Wolf wrote:
  
   Unconditionally overriding format for NEW_IMAGE_MODE_EXISTING is
   definitely wrong. It's the user's choice which COW format to use for the
   backup image. There's no reason why it has to be the same format as the
   image that is being backed up.
   
   Before, bs-drv-format_name was a default for the case where a new
   image had to be created and no format was given; and the format of
   existing images could be probed. This is still what makes most sense to
   me. What's even the goal with this change?
 
 Actually I think that code is wrong.  If we are using
 NEW_IMAGE_MODE_EXISTING then format doesn't get used.  We just end up
 using bdrv_open() below to open the existing image.  Format should not
 be specified for an existing image.

That bdrv_open() gets drv passed, which is the block driver specified by
format. If you pass NULL instead, bdrv_open() tries to probe the format,
which is unreliable (raw is indistinguishable from other formats) and
as Eric notes has led to security problems in the past.

  Furthermore, I'm proposing that for 1.6, we should make the format
  argument mandatory for drive-backup.  We made it optional for
  drive-mirror, to allow for probing, but there have been CVEs in the past
  due to probing of a raw file gone wrong.  We can always relax a
  mandatory argument into an optional one in 1.7, if we decide that
  probing can be done safely, but we can never turn an optional argument
  into a mandatory one once the initial release bakes in the option.  It
  would make the code a lot simpler to just have a mandatory format
  argument, instead of having to bake in and document hueristics on which
  format is picked when the caller doesn't provide one.
 
 So I made format mandatory in the last patch but only for
 NEW_IMAGE_MODE_ABSOLUTE_PATHS.  It actually doesn't make sense to
 specify the format of an existing image so I left it optional as an
 argument, but it will throw an error if it's not specified for the case
 where we create a new image.
 
 That make sense?

No, see above, it's important as which format an existing image is
opened.

Kevin



Re: [Qemu-devel] [PATCH V6 1/3] Implement sync modes for drive-backup.

2013-07-25 Thread Paolo Bonzini
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Il 24/07/2013 22:32, Eric Blake ha scritto:
 On 07/24/2013 04:55 AM, Kevin Wolf wrote:
 
 Unconditionally overriding format for NEW_IMAGE_MODE_EXISTING
 is definitely wrong. It's the user's choice which COW format to
 use for the backup image. There's no reason why it has to be
 the same format as the image that is being backed up.
 
 Before, bs-drv-format_name was a default for the case where a
 new image had to be created and no format was given; and the
 format of existing images could be probed. This is still what
 makes most sense to me. What's even the goal with this change?
 Furthermore, I'm proposing that for 1.6, we should make the format 
 argument mandatory for drive-backup.  We made it optional for 
 drive-mirror, to allow for probing, but there have been CVEs in the
 past due to probing of a raw file gone wrong.  We can always relax
 a mandatory argument into an optional one in 1.7, if we decide
 that probing can be done safely, but we can never turn an optional
 argument into a mandatory one once the initial release bakes in the
 option.  It would make the code a lot simpler to just have a
 mandatory format argument, instead of having to bake in and
 document hueristics on which format is picked when the caller
 doesn't provide one.

Probing is unsafe by definition, on the other hand we should allow it
for consistency with the rest of the API.

Making the format mandatory for mode != 'existing' is fine, though.
We can relax it later.

For mode = 'existing' we should allow both probing, and using an
explicit format.

Paolo
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJR8TgEAAoJEBvWZb6bTYbyYcoP/A5whcr0fvd2SBfx71b2xQpZ
/4Ug8c0jjraSSt+IoLMvnwD6EYkHRuXt3qDg1q2R9sIsrjQPaQjZ3PFv2ZFrnj5+
Yrxjzpe6mKQboOWeVEeUBZc1/hvzPrUnsgFf59QnhVjz5piE9xW/c5mRd4/6ij2J
3ngnv1VTB/yoNDbLoUpbMKr4EOMZKTsR0HR96QzEYRMZafL52hihnygyNXtOV84B
sfBrGC68Q9S+3Fj9Yb0qhRWZnEP8xQP8KhA5uO6jlgowUqal2JjIUDFh7/dQn0Mt
b1YlDT0a9KRfZnKc7Q4V37vALkd750kcMgoR3IxcHuuHomKf08UG5tLmYEDoMd/G
+FQol5QL/fnBjOD2uf+oN43DeYXlv3yL4iVChPhlyxBVI9aDZseli90tlJ0sSRtS
4+rEdXhtzhEXavpJMu9Vijfkzi3wYwNWdRViEoBo6ifuh9ZsIsmIMmHaWabn2J2i
rEmpB/VAFhK3XNR6OXegyG0PrYYMqWU92W7INJx0fe91lslX8xzShElL4MHRaRmU
5OiyiVKFjC5XUCTQAwwHaKTxWJ/U8rCQj8Axia5dh6SHunZMmaDLAgpKBK229A1i
X+tuv4PI8vyeWJM3uhIDVSAswqUS41sVmnDYimCzvOj8tI/gudktygS0r1nkTgBW
ePDuM7nPNHm3ifFuaPn9
=fY6P
-END PGP SIGNATURE-



Re: [Qemu-devel] [PATCH V6 1/3] Implement sync modes for drive-backup.

2013-07-25 Thread Eric Blake
On 07/25/2013 08:36 AM, Paolo Bonzini wrote:

 Furthermore, I'm proposing that for 1.6, we should make the format 
 argument mandatory for drive-backup.  We made it optional for 
 drive-mirror, to allow for probing, but there have been CVEs in the
 past due to probing of a raw file gone wrong.  We can always relax
 a mandatory argument into an optional one in 1.7, if we decide
 that probing can be done safely, but we can never turn an optional
 argument into a mandatory one once the initial release bakes in the
 option.  It would make the code a lot simpler to just have a
 mandatory format argument, instead of having to bake in and
 document hueristics on which format is picked when the caller
 doesn't provide one.
 
 Probing is unsafe by definition, on the other hand we should allow it
 for consistency with the rest of the API.

Shouldn't security trump consistency?  My argument is that if probing
can be abused, it is better to have 1.6 be conservative and mandate a
format argument always; if we can later argue that some forms of probing
are safe, then for 1.7 we can relax the argument to optional in the
qapi-schema.json file and add code checks that still mandate the
argument in the instances where it is needed, but allow probing in the
remaining cases.  But why are we so concerned about allowing probing in
the first place?  Libvirt will ALWAYS be passing a format, because that
is the only way to avoid a security bug; it is easier to design the
format to be mandatory than it is to reason about when probing might be
safe, and there's no need to be consistent with the security holes that
are permanently baked into older commands.

 
 Making the format mandatory for mode != 'existing' is fine, though.
 We can relax it later.
 
 For mode = 'existing' we should allow both probing, and using an
 explicit format.

But if you insist on allowing probing for mode='existing', then
qapi-schema.json must leave it as '*format':'str', and we are stuck
implementing the code for when format is mandatory in the C code.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH V6 1/3] Implement sync modes for drive-backup.

2013-07-25 Thread Paolo Bonzini
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Il 25/07/2013 18:57, Eric Blake ha scritto:
 On 07/25/2013 08:36 AM, Paolo Bonzini wrote:
 
 Furthermore, I'm proposing that for 1.6, we should make the
 format argument mandatory for drive-backup.  We made it
 optional for drive-mirror, to allow for probing, but there have
 been CVEs in the past due to probing of a raw file gone wrong.
 We can always relax a mandatory argument into an optional one
 in 1.7, if we decide that probing can be done safely, but we
 can never turn an optional argument into a mandatory one once
 the initial release bakes in the option.  It would make the
 code a lot simpler to just have a mandatory format argument,
 instead of having to bake in and document hueristics on which
 format is picked when the caller doesn't provide one.
 
 Probing is unsafe by definition, on the other hand we should
 allow it for consistency with the rest of the API.
 
 Shouldn't security trump consistency?

If the image can be trusted, disallowing probing makes things more
complex for the user for no reason.

Also it's only on raw images that probing is unsafe.  If all
user-writable images are non-raw, probing is actually safe.

Paolo

  My argument is that if probing
 can be abused, it is better to have 1.6 be conservative and mandate
 a format argument always; if we can later argue that some forms of
 probing are safe, then for 1.7 we can relax the argument to
 optional in the qapi-schema.json file and add code checks that
 still mandate the argument in the instances where it is needed, but
 allow probing in the remaining cases.  But why are we so concerned
 about allowing probing in the first place?  Libvirt will ALWAYS be
 passing a format, because that is the only way to avoid a security
 bug; it is easier to design the format to be mandatory than it is
 to reason about when probing might be safe, and there's no need to
 be consistent with the security holes that are permanently baked
 into older commands.
 
 
 Making the format mandatory for mode != 'existing' is fine,
 though. We can relax it later.
 
 For mode = 'existing' we should allow both probing, and using an 
 explicit format.
 
 But if you insist on allowing probing for mode='existing', then 
 qapi-schema.json must leave it as '*format':'str', and we are
 stuck implementing the code for when format is mandatory in the C
 code.
 

-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJR8VlaAAoJEBvWZb6bTYbyTcYP/jo6xYvm2kXQ8IdDlTGu8d0I
2h+ySv2TUg7b7KSRPlDc/pAAItbfzkxRl9lrk8m50iMtX+IwFkkFYLPv0hqRBKu2
5QA6DBmXmNkCl+2pABbTaVSwZgyKqxXExuHEMyLczX7Zutx5H9RQeo7hIU3TvhcU
5yuLyO6wgjsiWspvNlax2PZiN0ZMjil5BdQsTG4G2dxP3z9bDDjH9ZFUBStcrUTO
qPuwmWWk1FvQptbK19U+pMybPX1MsDjJRVLd6QwbOTWlSn89zl5MDCXAdqJagAYj
yN4cxuN5mlV4FyKTrnhb1eIfu5Wm+EwCo9YKcSc6EbvmKPCXXNh73xZKbk/1CngT
WOhaHn+QLM0s0rQOIWPZtn+zZU2TqXW9kvRXL0nKF1cXqw/+Aoz6sjP/uJILcEh5
O9vkyeSYL7eM8EmozIRqL//R4puBjaqdKBCxD1yQQYC2dtizIBiLRVU4+w4mbnIk
d561F2MR3UUEAfkGMPMMejL+rNyywffKvle4isT4raEHCppbb5OptK6RtllJ9jPr
mLLJ+758JhMD5H5u1MiKTzE/eQtHaP0MTBrAaQJf4omKvWOfhv/jDD9lsLP0k3Ms
WmivfZdQI+bGwi9vysEhbfAuAYdC6SSeTtZCaY6m8cKsJI/7CF5RhPMXVZElC4Q5
K2Fz6vhiU+HacN6pP2F6
=TyCJ
-END PGP SIGNATURE-



Re: [Qemu-devel] [PATCH V6 1/3] Implement sync modes for drive-backup.

2013-07-24 Thread Kevin Wolf
Am 23.07.2013 um 00:09 hat Ian Main geschrieben:
 This patch adds sync-modes to the drive-backup interface and
 implements the FULL, NONE and TOP modes of synchronization.
 
 FULL performs as before copying the entire contents of the drive
 while preserving the point-in-time using CoW.
 NONE only copies new writes to the target drive.
 TOP copies changes to the topmost drive image and preserves the
 point-in-time using CoW.
 
 For sync mode TOP are creating a new target image using the same backing
 file as the original disk image.  Then any new data that has been laid
 on top of it since creation is copied in the main backup_run() loop.
 There is an extra check in the 'TOP' case so that we don't bother to copy
 all the data of the backing file as it already exists in the target.
 This is where the bdrv_co_is_allocated() is used to determine if the
 data exists in the topmost layer or below.
 
 Also any new data being written is intercepted via the write_notifier
 hook which ends up calling backup_do_cow() to copy old data out before
 it gets overwritten.
 
 For mode 'NONE' we create the new target image and only copy in the
 original data from the disk image starting from the time the call was
 made.  This preserves the point in time data by only copying the parts
 that are *going to change* to the target image.  This way we can
 reconstruct the final image by checking to see if the given block exists
 in the new target image first, and if it does not, you can get it from
 the original image.  This is basically an optimization allowing you to
 do point-in-time snapshots with low overhead vs the 'FULL' version.
 
 Since there is no old data to copy out the loop in backup_run() for the
 NONE case just calls qemu_coroutine_yield() which only wakes up after
 an event (usually cancel in this case).  The rest is handled by the
 before_write notifier which again calls backup_do_cow() to write out
 the old data so it can be preserved.
 
 Signed-off-by: Ian Main im...@redhat.com
 ---
  block/backup.c| 91 
 +++
  blockdev.c| 36 ---
  include/block/block_int.h |  4 ++-
  qapi-schema.json  |  4 +--
  qmp-commands.hx   |  2 ++
  5 files changed, 92 insertions(+), 45 deletions(-)
 
 diff --git a/block/backup.c b/block/backup.c
 index 16105d4..68abd23 100644
 --- a/block/backup.c
 +++ b/block/backup.c
 @@ -37,6 +37,7 @@ typedef struct CowRequest {
  typedef struct BackupBlockJob {
  BlockJob common;
  BlockDriverState *target;
 +MirrorSyncMode sync_mode;
  RateLimit limit;
  BlockdevOnError on_source_error;
  BlockdevOnError on_target_error;
 @@ -247,40 +248,69 @@ static void coroutine_fn backup_run(void *opaque)
  
  bdrv_add_before_write_notifier(bs, before_write);
  
 -for (; start  end; start++) {
 -bool error_is_read;
 -
 -if (block_job_is_cancelled(job-common)) {
 -break;
 +if (job-sync_mode == MIRROR_SYNC_MODE_NONE) {
 +while (!block_job_is_cancelled(job-common)) {
 +/* Yield until the job is cancelled.  We just let our 
 before_write
 + * notify callback service CoW requests. */
 +job-common.busy = false;
 +qemu_coroutine_yield();
 +job-common.busy = true;
  }
 +} else {
 +/* Both FULL and TOP SYNC_MODE's require copying.. */
 +for (; start  end; start++) {
 +bool error_is_read;
  
 -/* we need to yield so that qemu_aio_flush() returns.
 - * (without, VM does not reboot)
 - */
 -if (job-common.speed) {
 -uint64_t delay_ns = ratelimit_calculate_delay(
 -job-limit, job-sectors_read);
 -job-sectors_read = 0;
 -block_job_sleep_ns(job-common, rt_clock, delay_ns);
 -} else {
 -block_job_sleep_ns(job-common, rt_clock, 0);
 -}
 +if (block_job_is_cancelled(job-common)) {
 +break;
 +}
  
 -if (block_job_is_cancelled(job-common)) {
 -break;
 -}
 +/* we need to yield so that qemu_aio_flush() returns.
 + * (without, VM does not reboot)
 + */
 +if (job-common.speed) {
 +uint64_t delay_ns = ratelimit_calculate_delay(
 +job-limit, job-sectors_read);
 +job-sectors_read = 0;
 +block_job_sleep_ns(job-common, rt_clock, delay_ns);
 +} else {
 +block_job_sleep_ns(job-common, rt_clock, 0);
 +}
  
 -ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER,
 -BACKUP_SECTORS_PER_CLUSTER, error_is_read);
 -if (ret  0) {
 -/* Depending on error action, fail now or retry cluster */
 -BlockErrorAction action =
 -backup_error_action(job, error_is_read, 

Re: [Qemu-devel] [PATCH V6 1/3] Implement sync modes for drive-backup.

2013-07-24 Thread Ian Main
On Wed, Jul 24, 2013 at 12:55:43PM +0200, Kevin Wolf wrote:
 Am 23.07.2013 um 00:09 hat Ian Main geschrieben:
  This patch adds sync-modes to the drive-backup interface and
  implements the FULL, NONE and TOP modes of synchronization.
  
  FULL performs as before copying the entire contents of the drive
  while preserving the point-in-time using CoW.
  NONE only copies new writes to the target drive.
  TOP copies changes to the topmost drive image and preserves the
  point-in-time using CoW.
  
  For sync mode TOP are creating a new target image using the same backing
  file as the original disk image.  Then any new data that has been laid
  on top of it since creation is copied in the main backup_run() loop.
  There is an extra check in the 'TOP' case so that we don't bother to copy
  all the data of the backing file as it already exists in the target.
  This is where the bdrv_co_is_allocated() is used to determine if the
  data exists in the topmost layer or below.
  
  Also any new data being written is intercepted via the write_notifier
  hook which ends up calling backup_do_cow() to copy old data out before
  it gets overwritten.
  
  For mode 'NONE' we create the new target image and only copy in the
  original data from the disk image starting from the time the call was
  made.  This preserves the point in time data by only copying the parts
  that are *going to change* to the target image.  This way we can
  reconstruct the final image by checking to see if the given block exists
  in the new target image first, and if it does not, you can get it from
  the original image.  This is basically an optimization allowing you to
  do point-in-time snapshots with low overhead vs the 'FULL' version.
  
  Since there is no old data to copy out the loop in backup_run() for the
  NONE case just calls qemu_coroutine_yield() which only wakes up after
  an event (usually cancel in this case).  The rest is handled by the
  before_write notifier which again calls backup_do_cow() to write out
  the old data so it can be preserved.
  
  Signed-off-by: Ian Main im...@redhat.com
  ---
   block/backup.c| 91 
  +++
   blockdev.c| 36 ---
   include/block/block_int.h |  4 ++-
   qapi-schema.json  |  4 +--
   qmp-commands.hx   |  2 ++
   5 files changed, 92 insertions(+), 45 deletions(-)
  
  diff --git a/block/backup.c b/block/backup.c
  index 16105d4..68abd23 100644
  --- a/block/backup.c
  +++ b/block/backup.c
  @@ -37,6 +37,7 @@ typedef struct CowRequest {
   typedef struct BackupBlockJob {
   BlockJob common;
   BlockDriverState *target;
  +MirrorSyncMode sync_mode;
   RateLimit limit;
   BlockdevOnError on_source_error;
   BlockdevOnError on_target_error;
  @@ -247,40 +248,69 @@ static void coroutine_fn backup_run(void *opaque)
   
   bdrv_add_before_write_notifier(bs, before_write);
   
  -for (; start  end; start++) {
  -bool error_is_read;
  -
  -if (block_job_is_cancelled(job-common)) {
  -break;
  +if (job-sync_mode == MIRROR_SYNC_MODE_NONE) {
  +while (!block_job_is_cancelled(job-common)) {
  +/* Yield until the job is cancelled.  We just let our 
  before_write
  + * notify callback service CoW requests. */
  +job-common.busy = false;
  +qemu_coroutine_yield();
  +job-common.busy = true;
   }
  +} else {
  +/* Both FULL and TOP SYNC_MODE's require copying.. */
  +for (; start  end; start++) {
  +bool error_is_read;
   
  -/* we need to yield so that qemu_aio_flush() returns.
  - * (without, VM does not reboot)
  - */
  -if (job-common.speed) {
  -uint64_t delay_ns = ratelimit_calculate_delay(
  -job-limit, job-sectors_read);
  -job-sectors_read = 0;
  -block_job_sleep_ns(job-common, rt_clock, delay_ns);
  -} else {
  -block_job_sleep_ns(job-common, rt_clock, 0);
  -}
  +if (block_job_is_cancelled(job-common)) {
  +break;
  +}
   
  -if (block_job_is_cancelled(job-common)) {
  -break;
  -}
  +/* we need to yield so that qemu_aio_flush() returns.
  + * (without, VM does not reboot)
  + */
  +if (job-common.speed) {
  +uint64_t delay_ns = ratelimit_calculate_delay(
  +job-limit, job-sectors_read);
  +job-sectors_read = 0;
  +block_job_sleep_ns(job-common, rt_clock, delay_ns);
  +} else {
  +block_job_sleep_ns(job-common, rt_clock, 0);
  +}
   
  -ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER,
  -BACKUP_SECTORS_PER_CLUSTER, error_is_read);
  -if (ret  

Re: [Qemu-devel] [PATCH V6 1/3] Implement sync modes for drive-backup.

2013-07-24 Thread Eric Blake
On 07/24/2013 04:55 AM, Kevin Wolf wrote:

 Unconditionally overriding format for NEW_IMAGE_MODE_EXISTING is
 definitely wrong. It's the user's choice which COW format to use for the
 backup image. There's no reason why it has to be the same format as the
 image that is being backed up.
 
 Before, bs-drv-format_name was a default for the case where a new
 image had to be created and no format was given; and the format of
 existing images could be probed. This is still what makes most sense to
 me. What's even the goal with this change?

Furthermore, I'm proposing that for 1.6, we should make the format
argument mandatory for drive-backup.  We made it optional for
drive-mirror, to allow for probing, but there have been CVEs in the past
due to probing of a raw file gone wrong.  We can always relax a
mandatory argument into an optional one in 1.7, if we decide that
probing can be done safely, but we can never turn an optional argument
into a mandatory one once the initial release bakes in the option.  It
would make the code a lot simpler to just have a mandatory format
argument, instead of having to bake in and document hueristics on which
format is picked when the caller doesn't provide one.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH V6 1/3] Implement sync modes for drive-backup.

2013-07-24 Thread Ian Main
On Wed, Jul 24, 2013 at 02:32:53PM -0600, Eric Blake wrote:
 On 07/24/2013 04:55 AM, Kevin Wolf wrote:
 
  Unconditionally overriding format for NEW_IMAGE_MODE_EXISTING is
  definitely wrong. It's the user's choice which COW format to use for the
  backup image. There's no reason why it has to be the same format as the
  image that is being backed up.
  
  Before, bs-drv-format_name was a default for the case where a new
  image had to be created and no format was given; and the format of
  existing images could be probed. This is still what makes most sense to
  me. What's even the goal with this change?

Actually I think that code is wrong.  If we are using
NEW_IMAGE_MODE_EXISTING then format doesn't get used.  We just end up
using bdrv_open() below to open the existing image.  Format should not
be specified for an existing image.
 
 Furthermore, I'm proposing that for 1.6, we should make the format
 argument mandatory for drive-backup.  We made it optional for
 drive-mirror, to allow for probing, but there have been CVEs in the past
 due to probing of a raw file gone wrong.  We can always relax a
 mandatory argument into an optional one in 1.7, if we decide that
 probing can be done safely, but we can never turn an optional argument
 into a mandatory one once the initial release bakes in the option.  It
 would make the code a lot simpler to just have a mandatory format
 argument, instead of having to bake in and document hueristics on which
 format is picked when the caller doesn't provide one.

So I made format mandatory in the last patch but only for
NEW_IMAGE_MODE_ABSOLUTE_PATHS.  It actually doesn't make sense to
specify the format of an existing image so I left it optional as an
argument, but it will throw an error if it's not specified for the case
where we create a new image.

That make sense?

Ian
 



[Qemu-devel] [PATCH V6 1/3] Implement sync modes for drive-backup.

2013-07-22 Thread Ian Main
This patch adds sync-modes to the drive-backup interface and
implements the FULL, NONE and TOP modes of synchronization.

FULL performs as before copying the entire contents of the drive
while preserving the point-in-time using CoW.
NONE only copies new writes to the target drive.
TOP copies changes to the topmost drive image and preserves the
point-in-time using CoW.

For sync mode TOP are creating a new target image using the same backing
file as the original disk image.  Then any new data that has been laid
on top of it since creation is copied in the main backup_run() loop.
There is an extra check in the 'TOP' case so that we don't bother to copy
all the data of the backing file as it already exists in the target.
This is where the bdrv_co_is_allocated() is used to determine if the
data exists in the topmost layer or below.

Also any new data being written is intercepted via the write_notifier
hook which ends up calling backup_do_cow() to copy old data out before
it gets overwritten.

For mode 'NONE' we create the new target image and only copy in the
original data from the disk image starting from the time the call was
made.  This preserves the point in time data by only copying the parts
that are *going to change* to the target image.  This way we can
reconstruct the final image by checking to see if the given block exists
in the new target image first, and if it does not, you can get it from
the original image.  This is basically an optimization allowing you to
do point-in-time snapshots with low overhead vs the 'FULL' version.

Since there is no old data to copy out the loop in backup_run() for the
NONE case just calls qemu_coroutine_yield() which only wakes up after
an event (usually cancel in this case).  The rest is handled by the
before_write notifier which again calls backup_do_cow() to write out
the old data so it can be preserved.

Signed-off-by: Ian Main im...@redhat.com
---
 block/backup.c| 91 +++
 blockdev.c| 36 ---
 include/block/block_int.h |  4 ++-
 qapi-schema.json  |  4 +--
 qmp-commands.hx   |  2 ++
 5 files changed, 92 insertions(+), 45 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 16105d4..68abd23 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -37,6 +37,7 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
 BlockJob common;
 BlockDriverState *target;
+MirrorSyncMode sync_mode;
 RateLimit limit;
 BlockdevOnError on_source_error;
 BlockdevOnError on_target_error;
@@ -247,40 +248,69 @@ static void coroutine_fn backup_run(void *opaque)
 
 bdrv_add_before_write_notifier(bs, before_write);
 
-for (; start  end; start++) {
-bool error_is_read;
-
-if (block_job_is_cancelled(job-common)) {
-break;
+if (job-sync_mode == MIRROR_SYNC_MODE_NONE) {
+while (!block_job_is_cancelled(job-common)) {
+/* Yield until the job is cancelled.  We just let our before_write
+ * notify callback service CoW requests. */
+job-common.busy = false;
+qemu_coroutine_yield();
+job-common.busy = true;
 }
+} else {
+/* Both FULL and TOP SYNC_MODE's require copying.. */
+for (; start  end; start++) {
+bool error_is_read;
 
-/* we need to yield so that qemu_aio_flush() returns.
- * (without, VM does not reboot)
- */
-if (job-common.speed) {
-uint64_t delay_ns = ratelimit_calculate_delay(
-job-limit, job-sectors_read);
-job-sectors_read = 0;
-block_job_sleep_ns(job-common, rt_clock, delay_ns);
-} else {
-block_job_sleep_ns(job-common, rt_clock, 0);
-}
+if (block_job_is_cancelled(job-common)) {
+break;
+}
 
-if (block_job_is_cancelled(job-common)) {
-break;
-}
+/* we need to yield so that qemu_aio_flush() returns.
+ * (without, VM does not reboot)
+ */
+if (job-common.speed) {
+uint64_t delay_ns = ratelimit_calculate_delay(
+job-limit, job-sectors_read);
+job-sectors_read = 0;
+block_job_sleep_ns(job-common, rt_clock, delay_ns);
+} else {
+block_job_sleep_ns(job-common, rt_clock, 0);
+}
 
-ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER,
-BACKUP_SECTORS_PER_CLUSTER, error_is_read);
-if (ret  0) {
-/* Depending on error action, fail now or retry cluster */
-BlockErrorAction action =
-backup_error_action(job, error_is_read, -ret);
-if (action == BDRV_ACTION_REPORT) {
+if (block_job_is_cancelled(job-common)) {
 break;
-} else {
-