21.06.2019 15:57, Vladimir Sementsov-Ogievskiy wrote: > 20.06.2019 4:03, John Snow wrote: >> This adds an "always" policy for bitmap synchronization. Regardless of if >> the job succeeds or fails, the bitmap is *always* synchronized. This means >> that for backups that fail part-way through, the bitmap retains a record of >> which sectors need to be copied out to accomplish a new backup using the >> old, partial result. >> >> In effect, this allows us to "resume" a failed backup; however the new backup >> will be from the new point in time, so it isn't a "resume" as much as it is >> an "incremental retry." This can be useful in the case of extremely large >> backups that fail considerably through the operation and we'd like to not >> waste >> the work that was already performed. >> >> Signed-off-by: John Snow <js...@redhat.com> >> --- >> qapi/block-core.json | 5 ++++- >> block/backup.c | 10 ++++++---- >> 2 files changed, 10 insertions(+), 5 deletions(-) >> >> diff --git a/qapi/block-core.json b/qapi/block-core.json >> index 0332dcaabc..58d267f1f5 100644 >> --- a/qapi/block-core.json >> +++ b/qapi/block-core.json >> @@ -1143,6 +1143,9 @@ >> # An enumeration of possible behaviors for the synchronization of a bitmap >> # when used for data copy operations. >> # >> +# @always: The bitmap is always synchronized with remaining blocks to copy, >> +# whether or not the operation has completed successfully or not. > > Hmm, now I think that 'always' sounds a bit like 'really always' i.e. during > backup > too, which is confusing.. But I don't have better suggestion. > >> +# >> # @conditional: The bitmap is only synchronized when the operation is >> successul. >> # This is useful for Incremental semantics. >> # >> @@ -1153,7 +1156,7 @@ >> # Since: 4.1 >> ## >> { 'enum': 'BitmapSyncMode', >> - 'data': ['conditional', 'never'] } >> + 'data': ['always', 'conditional', 'never'] } >> ## >> # @MirrorCopyMode: >> diff --git a/block/backup.c b/block/backup.c >> index 627f724b68..beb2078696 100644 >> --- a/block/backup.c >> +++ b/block/backup.c >> @@ -266,15 +266,17 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob >> *job, int ret) >> BlockDriverState *bs = blk_bs(job->common.blk); >> if (ret < 0 || job->bitmap_mode == BITMAP_SYNC_MODE_NEVER) { >> - /* Failure, or we don't want to synchronize the bitmap. >> - * Merge the successor back into the parent, delete nothing. */ >> + /* Failure, or we don't want to synchronize the bitmap. */ >> + if (job->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS) { >> + bdrv_dirty_bitmap_claim(job->sync_bitmap, &job->copy_bitmap); >> + } >> + /* Merge the successor back into the parent. */ >> bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL); > > Hmm good, it should work. It's a lot more tricky, than just > "synchronized with remaining blocks to copy", but I'm not sure the we need > more details in > spec. > > What we have in backup? So, from one hand we have an incremental backup, and > a bitmap, counting from it. > On the other hand it's not normal incremental backup, as it don't correspond > to any valid state of vm disk, > and it may be used only as a backing in a chain of further successful > incremental backup, yes? > > And then I think: with this mode we can not stop on first error, but ignore > it, just leaving dirty bit for > resulting bitmap.. We have BLOCKDEV_ON_ERROR_IGNORE, which may be used to > achieve it, but seems it don't > work as expected, as in backup_loop() we retry operation if ret < 0 and > action != BLOCK_ERROR_ACTION_REPORT. > > And another thought: can user take a decision of discarding (like > CONDITIONAL) or saving in backing chain (like > ALWAYS) failed backup result _after_ backup job complete? For example, for > small resulting backup it may be > better to discard it and for large - to save. > Will it work if we start job with ALWAYS mode and autocomplete = false, then > on fail we can look at job progress, > and if it is small we cancel job, otherwise call complete? Or stop, > block-job-complete will not work with failure > scenarios? Then we have to set BLOCKDEV_ON_ERROR_IGNORE, and on first error > event decide, cancel or not? But we > can only cancel or continue.. > > Hmm. Cancel. So on cancel and abort you synchronize bitmap too? Seems in bad > relation with what cancel should do, > and in transactions in general...
I mean grouped transaction mode, how should it work with this? > > >> - assert(bm); >> } else { >> /* Everything is fine, delete this bitmap and install the backup. >> */ >> bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL); >> - assert(bm); >> } >> + assert(bm); >> } >> static void backup_commit(Job *job) >> > > -- Best regards, Vladimir