28.08.2019 22:50, Max Reitz wrote: > On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote: >> Drop write notifiers and use filter node instead. >> >> = Changes = >> >> 1. add filter-node-name argument for backup qmp api. We have to do it >> in this commit, as 257 needs to be fixed. > > I feel a bit bad about it not being an implicit node. But I know you > don’t like them, they’re probably just broken altogether and because > libvirt doesn’t use backup (yet), probably nobody cares. > >> 2. there no move write notifiers here, so is_write_notifier parameter > > s/there/there are/, I suppose? > >> is dropped from block-copy paths. >> >> 3. Intersecting requests handling changed, now synchronization between >> backup-top, backup and guest writes are all done in block/block-copy.c >> and works as follows: >> >> On copy operation, we work only with dirty areas. If bits are dirty it >> means that there are no requests intersecting with this area. We clear >> dirty bits and take bdrv range lock (bdrv_co_try_lock) on this area to >> prevent further operations from interaction with guest (only with >> guest, as neither backup nor backup-top will touch non-dirty area). If >> copy-operation failed we set dirty bits back together with releasing >> the lock. >> >> The actual difference with old scheme is that on guest writes we >> don't lock the whole region but only dirty-parts, and to be more >> precise: only dirty-part we are currently operate on. In old scheme >> guest write to non-dirty area (which may be safely ignored by backup) >> may wait for intersecting request, touching some other area which is >> dirty. >> >> 4. To sync with in-flight requests at job finish we now have drained >> removing of the filter, we don't need rw-lock. >> >> = Notes = >> >> Note the consequence of three objects appearing: backup-top, backup job >> and block-copy-state: >> >> 1. We want to insert backup-top before job creation, to behave similar >> with mirror and commit, where job is started upon filter. >> >> 2. We also have to create block-copy-state after filter injection, as >> we don't want it's source child be replaced by fitler. Instead we want > > s/it's/its/, s/filter/filter/ (although “fitler” does have an amusing > ring to it) > >> to keep BCS.source to be real source node, as we want to use >> bdrv_co_try_lock in CBW operations and it can't be used on filter, as >> on filter we already have in-flight (write) request from upper layer. > > Reasonable, even more so as I suppose BCS.source should eventually be a > BdrvChild of backup-top. > > What looks wrong is that the sync_bitmap is created on the source (“bs” > in backup_job_create()), but backup_cleanup_sync_bitmap() still assumes > it is on blk_bs(job->common.blk) (which is no longer true). > >> So, we firstly create inject backup-top, then create job and BCS. BCS >> is the latest just to not create extra variable for it. Finally we set >> bcs for backup-top filter. >> >> = Iotest changes = >> >> 56: op-blocker doesn't shot now, as we set it on source, but then check > > s/shot/show/? > >> on filter, when trying to start second backup, so error caught in >> test_dismiss_collision is changed. It's OK anyway, as this test-case >> seems to test that after some collision we can dismiss first job and >> successfully start the second one. So, the change is that collision is >> changed from op-blocker to file-posix locks. > > Well, but the op blocker belongs to the source, which should be covered > by internal permissions. The fact that it now shows up as a file-posix > error thus shows that the conflict also moves from the source to the > target. It’s OK because we actually don’t have a conflict on the source. > > But I wonder whether it would be better for test_dismiss_collision() to > do a blockdev-backup instead where we can see the collision on the target. > > Hm. On second thought, why do we even get a conflict on the target? > block-copy does share the WRITE permission for it...
Not sure, but assume that this is because in file-posix.c in raw_co_create we do want RESIZE perm. I can instead move this test to use specified job-id, to move the collision to "job-id already exists" error. Is it better? I'm afraid that posix locks will not work if disable them in config. > >> However, it's obvious now that we'd better drop this op-blocker at all >> and add a test-case for two backups from one node (to different >> destinations) actually works. But not in these series. >> >> 257: The test wants to emulate guest write during backup. They should >> go to filter node, not to original source node, of course. Therefore we >> need to specify filter node name and use it. >> >> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> >> --- >> qapi/block-core.json | 8 +- >> include/block/block-copy.h | 2 +- >> include/block/block_int.h | 1 + >> block/backup-top.c | 14 +- >> block/backup.c | 113 +++----------- >> block/block-copy.c | 55 ++++--- >> block/replication.c | 2 +- >> blockdev.c | 1 + >> tests/qemu-iotests/056 | 2 +- >> tests/qemu-iotests/257 | 7 +- >> tests/qemu-iotests/257.out | 306 ++++++++++++++++++------------------- >> 11 files changed, 230 insertions(+), 281 deletions(-) > > Nice. > > I don’t have much to object (for some reason, I feel a bit uneasy about > dropping all the request waiting code, but I suppose that is indeed > taken care of with the range locks now). > > [...] > >> diff --git a/block/backup.c b/block/backup.c >> index d927c63e5a..259a165405 100644 >> --- a/block/backup.c >> +++ b/block/backup.c > > [...] > >> @@ -552,6 +480,9 @@ BlockJob *backup_job_create(const char *job_id, >> BlockDriverState *bs, >> backup_clean(&job->common.job); >> job_early_fail(&job->common.job); >> } actually, here should be "else if" >> + if (backup_top) { >> + bdrv_backup_top_drop(backup_top); >> + } > > This order is weird because backup-top still has a pointer to the BCS, > but we have already freed it at this point. > >> >> return NULL; >> } >> diff --git a/block/block-copy.c b/block/block-copy.c >> index 6828c46ba0..f3102ec3ff 100644 >> --- a/block/block-copy.c >> +++ b/block/block-copy.c >> @@ -98,27 +98,32 @@ fail: >> * error occurred, return a negative error number >> */ >> static int coroutine_fn block_copy_with_bounce_buffer( >> - BlockCopyState *s, int64_t start, int64_t end, bool >> is_write_notifier, >> + BlockCopyState *s, int64_t start, int64_t end, >> bool *error_is_read, void **bounce_buffer) >> { >> int ret; >> - int nbytes; >> - int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0; >> + int nbytes = MIN(s->cluster_size, s->len - start); >> + void *lock = bdrv_co_try_lock(blk_bs(s->source), start, nbytes); >> + >> + /* >> + * Function must be called only on full-dirty region, so nobody >> touching or >> + * touched these bytes. Therefore, we must successfully get lock. > > Which is OK because backup-top does not share the WRITE permission. But > it is organized a bit weirdly, because it’s actually block-copy here > that relies on the WRITE permission not to be shared; which it itself > cannot do because backup-top needs it. > > This of course results from the fact that backup-top and block-copy > should really use the same object to access the source, namely the > backing BdrvChild of backup-top. > > Maybe there should be a comment somewhere that points this out? But I wanted block-copy not to be directly connected to backup-top. I think actually we rely on the fact that user of block-copy() guarantees that nobody is touching dirty areas (except block_copy). And with backup, this is done by backup-top filter. I'll add a comment on this. > >> + */ >> + assert(lock); > > [...] > >> @@ -128,13 +133,16 @@ static int coroutine_fn block_copy_with_bounce_buffer( >> if (error_is_read) { >> *error_is_read = false; >> } >> - goto fail; >> + goto out; >> } >> >> - return nbytes; >> -fail: >> - bdrv_set_dirty_bitmap(s->copy_bitmap, start, s->cluster_size); >> - return ret; >> +out: >> + if (ret < 0) { >> + bdrv_set_dirty_bitmap(s->copy_bitmap, start, s->cluster_size); >> + } >> + bdrv_co_unlock(lock); >> + >> + return ret < 0 ? ret : nbytes; >> > > I feel like it would still be simpler to keep the separate fail path and > just duplicate the bdrv_co_unlock(lock) call. > > [...] > >> diff --git a/blockdev.c b/blockdev.c >> index fbef6845c8..f89e48fc79 100644 >> --- a/blockdev.c >> +++ b/blockdev.c >> @@ -3601,6 +3601,7 @@ static BlockJob *do_backup_common(BackupCommon *backup, >> job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, >> backup->sync, bmap, backup->bitmap_mode, >> backup->compress, >> + backup->filter_node_name, > > Is this guaranteed to be NULL when not specified by the user? > As far as I know - yes, it is. -- Best regards, Vladimir