On 12.08.19 17:47, Vladimir Sementsov-Ogievskiy wrote:
> 12.08.2019 18:10, Max Reitz wrote:
>> On 10.08.19 21:31, Vladimir Sementsov-Ogievskiy wrote:
>>> backup_cow_with_offload can transfer more than one cluster. Let
>>> backup_cow_with_bounce_buffer behave similarly. It reduces the number
>>> of IO requests, since there is no need to copy cluster by cluster.
>>>
>>> Logic around bounce_buffer allocation changed: we can't just allocate
>>> one-cluster-sized buffer to share for all iterations. We can't also
>>> allocate buffer of full-request length it may be too large, so
>>> BACKUP_MAX_BOUNCE_BUFFER is introduced. And finally, allocation logic
>>> is to allocate a buffer sufficient to handle all remaining iterations
>>> at the point where we need the buffer for the first time.
>>>
>>> Bonus: get rid of pointer-to-pointer.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
>>> ---
>>>  block/backup.c | 65 +++++++++++++++++++++++++++++++-------------------
>>>  1 file changed, 41 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/block/backup.c b/block/backup.c
>>> index d482d93458..65f7212c85 100644
>>> --- a/block/backup.c
>>> +++ b/block/backup.c
>>> @@ -27,6 +27,7 @@
>>>  #include "qemu/error-report.h"
>>>
>>>  #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
>>> +#define BACKUP_MAX_BOUNCE_BUFFER (64 * 1024 * 1024)
>>>
>>>  typedef struct CowRequest {
>>>      int64_t start_byte;
>>> @@ -98,44 +99,55 @@ static void cow_request_end(CowRequest *req)
>>>      qemu_co_queue_restart_all(&req->wait_queue);
>>>  }
>>>
>>> -/* Copy range to target with a bounce buffer and return the bytes copied. If
>>> - * error occurred, return a negative error number */
>>> +/*
>>> + * Copy range to target with a bounce buffer and return the bytes copied. If
>>> + * error occurred, return a negative error number
>>> + *
>>> + * @bounce_buffer is assumed to enough to store
>>
>> s/to/to be/
>>
>>> + * MIN(BACKUP_MAX_BOUNCE_BUFFER, @end - @start) bytes
>>> + */
>>>  static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>                                                        int64_t start,
>>>                                                        int64_t end,
>>>                                                        bool is_write_notifier,
>>>                                                        bool *error_is_read,
>>> -                                                      void **bounce_buffer)
>>> +                                                      void *bounce_buffer)
>>>  {
>>>      int ret;
>>>      BlockBackend *blk = job->common.blk;
>>> -    int nbytes;
>>> +    int nbytes, remaining_bytes;
>>>      int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>
>>>      assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>> -    bdrv_reset_dirty_bitmap(job->copy_bitmap, start, job->cluster_size);
>>> -    nbytes = MIN(job->cluster_size, job->len - start);
>>> -    if (!*bounce_buffer) {
>>> -        *bounce_buffer = blk_blockalign(blk, job->cluster_size);
>>> -    }
>>> +    bdrv_reset_dirty_bitmap(job->copy_bitmap, start, end - start);
>>> +    nbytes = MIN(end - start, job->len - start);
>>>
>>> -    ret = blk_co_pread(blk, start, nbytes, *bounce_buffer, read_flags);
>>> -    if (ret < 0) {
>>> -        trace_backup_do_cow_read_fail(job, start, ret);
>>> -        if (error_is_read) {
>>> -            *error_is_read = true;
>>> +
>>> +    remaining_bytes = nbytes;
>>> +    while (remaining_bytes) {
>>> +        int chunk = MIN(BACKUP_MAX_BOUNCE_BUFFER, remaining_bytes);
>>> +
>>> +        ret = blk_co_pread(blk, start, chunk, bounce_buffer, read_flags);
>>> +        if (ret < 0) {
>>> +            trace_backup_do_cow_read_fail(job, start, ret);
>>> +            if (error_is_read) {
>>> +                *error_is_read = true;
>>> +            }
>>> +            goto fail;
>>>          }
>>> -        goto fail;
>>> -    }
>>>
>>> -    ret = blk_co_pwrite(job->target, start, nbytes, *bounce_buffer,
>>> -                        job->write_flags);
>>> -    if (ret < 0) {
>>> -        trace_backup_do_cow_write_fail(job, start, ret);
>>> -        if (error_is_read) {
>>> -            *error_is_read = false;
>>> +        ret = blk_co_pwrite(job->target, start, chunk, bounce_buffer,
>>> +                            job->write_flags);
>>> +        if (ret < 0) {
>>> +            trace_backup_do_cow_write_fail(job, start, ret);
>>> +            if (error_is_read) {
>>> +                *error_is_read = false;
>>> +            }
>>> +            goto fail;
>>>          }
>>> -        goto fail;
>>> +
>>> +        start += chunk;
>>> +        remaining_bytes -= chunk;
>>>      }
>>>
>>>      return nbytes;
>>> @@ -301,9 +313,14 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
>>>              }
>>>          }
>>>          if (!job->use_copy_range) {
>>> +            if (!bounce_buffer) {
>>> +                size_t len = MIN(BACKUP_MAX_BOUNCE_BUFFER,
>>> +                                 MAX(dirty_end - start, end - dirty_end));
>>> +                bounce_buffer = blk_try_blockalign(job->common.blk, len);
>>> +            }
>>
>> If you use _try_, you should probably also check whether it succeeded.
>
> Oops, you are right, of course.
>
>>
>> Anyway, I wonder whether it’d be better to just allocate this buffer
>> once per job (the first time we get here, probably) to be of size
>> BACKUP_MAX_BOUNCE_BUFFER and put it into BackupBlockJob. (And maybe add
>> a buf-size parameter similar to what the mirror jobs have.)
>>
>
> Once per job will not work, as we may have several guest writes in parallel
> and therefore several parallel copy-before-write operations.

Hm. I’m not quite happy with that because if the guest just issues many
large discards in parallel, this means that qemu will allocate a large
amount of memory.

It would be nice if there was a simple way to keep track of the total
memory usage and let requests yield if they would exceed it.

> Or if you mean writing an allocator based on once-allocated buffer like
> in mirror, I really dislike this idea, as we already have allocator:
> memalign/malloc/free and it works well, no reason to invent a new one and
> hardcode it into block layer (look at my answer to Eric on v2 of this
> patch for more info).

Well, at least it’d be something we can delay until blockdev-copy
arrives(TM).

Max

> Or, if you mean only backup_loop generated copy-requests, yes we may keep
> only one buffer for them, but:
> 1. it is not how it works now, so my patch is not a degradation in this case
> 2. I'm going to parallelize backup loop too, like my series "qcow2: async
>    handling of fragmented io", which will need several allocated buffers
>    anyway.
>
>>
>>>              ret = backup_cow_with_bounce_buffer(job, start, dirty_end,
>>>                                                  is_write_notifier,
>>> -                                                error_is_read, &bounce_buffer);
>>> +                                                error_is_read, bounce_buffer);
>>>          }
>>>          if (ret < 0) {
>>>              break;
>>>
>>
>>
>
>
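
For illustration only, the idea raised in this message (keep track of the total bounce-buffer memory and let a request yield when its allocation would exceed a limit) might be sketched as plain byte accounting. Everything named here (MemBudget, mem_budget_init, mem_budget_try_reserve, mem_budget_release) is hypothetical and not an existing QEMU API. The sketch also leaves out the yielding itself: on a failed reservation a coroutine would presumably wait on some queue and retry once another request releases its share.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical accounting of bounce-buffer bytes in flight (not QEMU API). */
typedef struct MemBudget {
    size_t limit;   /* maximum number of bytes that may be reserved at once */
    size_t in_use;  /* bytes currently reserved; always <= limit */
} MemBudget;

static void mem_budget_init(MemBudget *b, size_t limit)
{
    b->limit = limit;
    b->in_use = 0;
}

/*
 * Try to reserve @len bytes. Returns true and accounts for them on success.
 * Returns false if the reservation would exceed the limit; the caller is
 * then expected to yield and retry later (assumption: the waiting mechanism
 * lives outside this sketch).
 */
static bool mem_budget_try_reserve(MemBudget *b, size_t len)
{
    /* in_use <= limit holds, so the subtraction cannot wrap around. */
    if (len > b->limit - b->in_use) {
        return false;
    }
    b->in_use += len;
    return true;
}

/* Give back @len bytes that were reserved earlier. */
static void mem_budget_release(MemBudget *b, size_t len)
{
    assert(len <= b->in_use);
    b->in_use -= len;
}
```

A request asking for more than the limit could never succeed here, so the caller would have to clamp its length to the limit first, much as the patch clamps each chunk to BACKUP_MAX_BOUNCE_BUFFER.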