On Thu, 02/25 15:58, John Snow wrote:
> During incremental backups, if the target has a cluster size that is
> larger than the backup cluster size and we are backing up to a target
> that cannot (for whichever reason) pull clusters up from a backing image,
> we may inadvertently create unusable incremental backup images.
> 
> For example:
> 
> If the bitmap tracks changes at a 64KB granularity and we transmit 64KB
> of data at a time but the target uses a 128KB cluster size, it is
> possible that only half of a target cluster will be recognized as dirty
> by the backup block job. When the cluster is allocated on the target
> image but only half populated with data, we lose the ability to
> distinguish between zero padding and uninitialized data.
> 
> This does not happen if the target image has a backing file that points
> to the last known good backup.
> 
> Even if we have a backing file, though, it's likely going to be faster
> to just buffer the redundant data ourselves from the live image than
> fetching it from the backing file, so let's just always round up to the
> target granularity.
> 
> The same logic applies to backup modes top, none, and full. Copying
> fractional clusters without the guarantee of COW is dangerous, but even
> if we can rely on COW, it's likely better to just re-copy the data.
> 
> Reported-by: Fam Zheng <f...@redhat.com>
> Signed-off-by: John Snow <js...@redhat.com>
> ---
>  block/backup.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 76addef..0f1b1bc 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -501,6 +501,8 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>                    BlockJobTxn *txn, Error **errp)
>  {
>      int64_t len;
> +    BlockDriverInfo bdi;
> +    int ret;
>  
>      assert(bs);
>      assert(target);
> @@ -570,15 +572,32 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>          goto error;
>      }
>  
> -    bdrv_op_block_all(target, job->common.blocker);
> -
>      job->on_source_error = on_source_error;
>      job->on_target_error = on_target_error;
>      job->target = target;
>      job->sync_mode = sync_mode;
>      job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
>                         sync_bitmap : NULL;
> -    job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
> +
> +    /* If there is no backing file on the target, we cannot rely on COW if our
> +     * backup cluster size is smaller than the target cluster size. Even for
> +     * targets with a backing file, try to avoid COW if possible. */
> +    ret = bdrv_get_info(job->target, &bdi);
> +    if (ret < 0 && !target->backing) {
> +        error_setg_errno(errp, -ret,
> +            "Couldn't determine the cluster size of the target image, "
> +            "which has no backing file");
> +        error_append_hint(errp,
> +            "Aborting, since this may create an unusable destination image\n");
> +        goto error;
> +    } else if (ret < 0 && target->backing) {
> +        /* Not fatal; just trudge on ahead. */
> +        job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
> +    } else {
> +        job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
> +    }
> +
> +    bdrv_op_block_all(target, job->common.blocker);
>      job->common.len = len;
>      job->common.co = qemu_coroutine_create(backup_run);
>      block_job_txn_add_job(txn, &job->common);
> -- 
> 2.4.3
> 

Reviewed-by: Fam Zheng <f...@redhat.com>
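
For illustration only, the cluster size selection in the second hunk and
its effect on the 64KB/128KB example from the commit message can be
sketched in isolation. This is a standalone sketch, not QEMU code: the
sketch_* names are hypothetical, SKETCH_DEFAULT_CLUSTER stands in for
BACKUP_CLUSTER_SIZE_DEFAULT and is assumed to be 64KB, and the two
booleans replace the bdrv_get_info() return value and target->backing.
The range rounding helper is not part of the patch; it only shows what
copying in units of the larger cluster size amounts to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 64KB; hypothetical stand-in for BACKUP_CLUSTER_SIZE_DEFAULT. */
#define SKETCH_DEFAULT_CLUSTER (64 * 1024)

/*
 * Pick the copy granularity the way the hunk does.
 * info_ok:        whether the target's cluster size could be queried
 * target_cluster: the target's cluster size (only meaningful if info_ok)
 * has_backing:    whether the target has a backing file
 * Returns the granularity in bytes, or -1 when the job should be refused.
 */
static int64_t sketch_pick_cluster_size(bool info_ok, int64_t target_cluster,
                                        bool has_backing)
{
    if (!info_ok) {
        /* Without a backing file there is no COW to fall back on. */
        return has_backing ? SKETCH_DEFAULT_CLUSTER : -1;
    }
    return target_cluster > SKETCH_DEFAULT_CLUSTER ? target_cluster
                                                   : SKETCH_DEFAULT_CLUSTER;
}

/* Widen the byte range [*start, *end) outward to whole clusters. */
static void sketch_align_range(int64_t cluster, int64_t *start, int64_t *end)
{
    assert(cluster > 0);
    *start = *start / cluster * cluster;
    *end = (*end + cluster - 1) / cluster * cluster;
}
```

With a 128KB target cluster, a dirty range covering only the second 64KB
of a cluster widens to the whole cluster, so the target never ends up
with a half-populated allocation.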