13.06.2019 21:02, Max Reitz wrote:
> On 29.05.19 17:46, Vladimir Sementsov-Ogievskiy wrote:
>> Drop write notifiers and use a filter node instead. Changes:
>>
>> 1. copy-before-write is now handled by the filter node, so drop all
>> is_write_notifier arguments.
>>
>> 2. We don't have intersecting requests, so their handling is dropped.
>> Instead, synchronization works as follows: when backup or backup-top
>> starts copying some area, it first clears the corresponding
>> copy-bitmap bits, and nobody touches areas not marked dirty in the
>> copy-bitmap, so there is no intersection. Also, backup job copy
>> operations are surrounded by a bdrv region lock, which actually
>> serializes requests, so that we don't interfere with guest writes and
>> don't read changed data from the source (before reading we clear the
>> corresponding bit in the copy-bitmap, so this area is no longer
>> handled by backup-top).
>>
>> 3. To sync with in-flight requests at job finish we now have drained
>> removal of the filter; we don't need the rw-lock.
>>
>> == RFC part ==
>>
>> iotests changed:
>>
>> 56: the op-blocker doesn't fire now, as we set it on the source but
>> then check it on the filter when trying to start a second backup...
>> Should I work around it somehow?
>
> Hm. Where does that error message even come from? The fact that the
> target image is in use already (due to file locks)?
>
> It appears that way indeed.
>
> It seems reasonable to me that you can now run a backup on top of
> another backup. Well, I mean, it is a stupid thing to do, but I don't
> see why the block layer would forbid doing so.
>
> So the test seems superfluous to me. If we want to keep it (why not),
> it should test the opposite, namely that a backup to a different image
> (with a different job ID) works. (It seems simple enough to modify the
> job that way, so why not.)
>
>> 129: Hmm, now it is not busy at this moment... But it's illegal to
>> check busy, as the job has pause points and sets busy to false at
>> these points. Why do we assert it in this test?
>
> Nobody knows, it's probably wrong. All I know is that 129 is just
> broken anyway.
>
>> 141: Obvious, as drv0 is not a root node now, but the backing of the
>> filter, when we try to remove it.
>
> I get a failed assertion in 256. That is probably because the
> bdrv_set_aio_context() calls weren't as unnecessary as I deemed them
> to be.
hmm, will check.

>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
>> ---
>>  block/backup.c             | 171 ++++++++++++++-----------------------
>>  tests/qemu-iotests/056     |   2 +-
>>  tests/qemu-iotests/129     |   1 -
>>  tests/qemu-iotests/141.out |   2 +-
>>  4 files changed, 68 insertions(+), 108 deletions(-)
>
> For some reason, my gcc starts to complain that backup_loop() may not
> initialize error_is_read after this patch. I don't know why that is.
> Perhaps it inlines backup_do_cow() now? (So before, it just saw that a
> pointer to error_is_read was passed to backup_do_cow() and took it as
> an opaque function, so it surely would set this value somewhere. Now
> it inlines it and can't find whether that will definitely happen, so
> it complains.)
>
> I don't think it is strictly necessary to initialize error_is_read,
> but, well, it won't hurt.
>
>> diff --git a/block/backup.c b/block/backup.c
>> index 00f4f8af53..a5b8e04c9c 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>
> [...]
>
>> @@ -60,56 +53,17 @@ typedef struct BackupBlockJob {
>>  
>>  static const BlockJobDriver backup_job_driver;
>>  
>> -/* See if in-flight requests overlap and wait for them to complete */
>> -static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
>> -                                                       int64_t start,
>> -                                                       int64_t end)
>> -{
>> -    CowRequest *req;
>> -    bool retry;
>> -
>> -    do {
>> -        retry = false;
>> -        QLIST_FOREACH(req, &job->inflight_reqs, list) {
>> -            if (end > req->start_byte && start < req->end_byte) {
>> -                qemu_co_queue_wait(&req->wait_queue, NULL);
>> -                retry = true;
>> -                break;
>> -            }
>> -        }
>> -    } while (retry);
>> -}
>> -
>> -/* Keep track of an in-flight request */
>> -static void cow_request_begin(CowRequest *req, BackupBlockJob *job,
>> -                              int64_t start, int64_t end)
>> -{
>> -    req->start_byte = start;
>> -    req->end_byte = end;
>> -    qemu_co_queue_init(&req->wait_queue);
>> -    QLIST_INSERT_HEAD(&job->inflight_reqs, req, list);
>> -}
>> -
>> -/* Forget about a completed request */
>> -static void cow_request_end(CowRequest *req)
>> -{
>> -    QLIST_REMOVE(req, list);
>> -    qemu_co_queue_restart_all(&req->wait_queue);
>> -}
>> -
>>  /* Copy range to target with a bounce buffer and return the bytes copied. If
>>   * error occurred, return a negative error number */
>>  static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>                                                        int64_t start,
>>                                                        int64_t end,
>> -                                                      bool is_write_notifier,
>>                                                        bool *error_is_read,
>>                                                        void **bounce_buffer)
>
> Future feature: Somehow get this functionality done with backup-top, I
> suppose. (This is effectively just backup_top_cbw() with some bells
> and whistles, isn't it?)

Or maybe separate it out as bdrv_co_pcopy or something like this.

>
>>  {
>>      int ret;
>>      BlockBackend *blk = job->common.blk;
>>      int nbytes;
>> -    int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>      int write_flags = job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0;
>>  
>>      assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>
> [...]
>
>> @@ -154,15 +108,12 @@ fail:
>>  /* Copy range to target and return the bytes copied. If error occurred, return a
>>   * negative error number. */
>>  static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
>> -                                                int64_t start,
>> -                                                int64_t end,
>> -                                                bool is_write_notifier)
>> +                                                int64_t start, int64_t end)
>
> And I suppose this is something backup-top maybe should support, too.
>
>>  {
>>      int ret;
>>      int nr_clusters;
>>      BlockBackend *blk = job->common.blk;
>>      int nbytes;
>> -    int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>      int write_flags = job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0;
>>  
>>      assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
>
> [...]
>
>> @@ -391,28 +333,41 @@ static int coroutine_fn backup_loop(BackupBlockJob *job)
>>      int64_t offset;
>>      HBitmapIter hbi;
>>      BlockDriverState *bs = blk_bs(job->common.blk);
>> +    void *lock;
>>  
>>      hbitmap_iter_init(&hbi, job->copy_bitmap, 0);
>>      while ((offset = hbitmap_iter_next(&hbi)) != -1) {
>> +        lock = bdrv_co_try_lock(backing_bs(blk_bs(job->common.blk)), offset,
>> +                                job->cluster_size);
>> +        /*
>> +         * Dirty bit is set, which means that there are no in-flight
>> +         * write requests on this area. We must succeed.
>> +         */
>> +        assert(lock);
>> +
>
> Hm. It makes me uneasy, but I suppose you're right.
>
>>          if (job->sync_mode == MIRROR_SYNC_MODE_TOP &&
>>              bdrv_is_unallocated_range(bs, offset, job->cluster_size))
>
> This can yield, right? If it does, the bitmap is still set. backup-top
> will see this, unset the bitmap and try to start its CBW operation.
> That is halted by the lock just taken, but the progress will still be
> published after completion, so the job can go beyond 100 %, I think.
>
> Even if it doesn't, copying the data twice is weird. It may even get
> weirder if one of the two requests fails.
>
> Can we lock the backup-top node instead? I don't know whether locking
> would always succeed there, though...
Hmm, I'll look closely at the code, but it seems we'd better reset the
bit before yielding.

>
>>          {
>>              hbitmap_reset(job->copy_bitmap, offset, job->cluster_size);
>> +            bdrv_co_unlock(lock);
>>              continue;
>>          }
>>  
>>          do {
>>              if (yield_and_check(job)) {
>> +                bdrv_co_unlock(lock);
>>                  return 0;
>>              }
>> -            ret = backup_do_cow(job, offset,
>> -                                job->cluster_size, &error_is_read, false);
>> +            ret = backup_do_cow(job, offset, job->cluster_size,
>> +                                &error_is_read);
>>              if (ret < 0 && backup_error_action(job, error_is_read, -ret) ==
>>                  BLOCK_ERROR_ACTION_REPORT)
>>              {
>> +                bdrv_co_unlock(lock);
>>                  return ret;
>>              }
>>          } while (ret < 0);
>> +
>> +        bdrv_co_unlock(lock);
>>      }
>>  
>>      return 0;

-- 
Best regards,
Vladimir