On 06/07/2017 09:08 AM, Alberto Garcia wrote:
> This patch splits do_perform_cow() into three separate functions to
> read, encrypt and write the COW regions.
>
> perform_cow() can now read both regions first, then encrypt them and
> finally write them to disk. The memory allocation is also done in
> this function now, using one single buffer large enough to hold both
> regions.
>
> Signed-off-by: Alberto Garcia <be...@igalia.com>
> ---
>  block/qcow2-cluster.c | 114 +++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 84 insertions(+), 30 deletions(-)
>
Let's suppose we have a guest issuing 512-byte aligned requests and a
host that requires 4k alignment, and the guest performs an operation
that needs COW of one sector at each of the front and end of the
cluster.

> @@ -760,22 +776,59 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
>      BDRVQcow2State *s = bs->opaque;
>      Qcow2COWRegion *start = &m->cow_start;
>      Qcow2COWRegion *end = &m->cow_end;
> +    unsigned buffer_size = start->nb_bytes + end->nb_bytes;

This sets buffer_size to 1024 initially.

> +    uint8_t *start_buffer, *end_buffer;
>      int ret;
>
> +    assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
> +
>      if (start->nb_bytes == 0 && end->nb_bytes == 0) {
>          return 0;
>      }
>
> +    /* Reserve a buffer large enough to store the data from both the
> +     * start and end COW regions */
> +    start_buffer = qemu_try_blockalign(bs, buffer_size);

This is going to allocate a bigger buffer, namely one that is at least
4k in size (at least, that's my understanding - the block device is
able to track its preferred I/O size/alignment).

> +    if (start_buffer == NULL) {
> +        return -ENOMEM;
> +    }
> +    /* The part of the buffer where the end region is located */
> +    end_buffer = start_buffer + start->nb_bytes;

But now end_buffer does not have optimal alignment. In the old code we
called qemu_try_blockalign() twice, so that both read()s started on a
4k boundary; now the read() for the end region is not aligned to a 4k
boundary. Of course, since we're only reading 512 bytes instead of 4k
it MIGHT not matter, but I don't know whether we will cause a bounce
buffer to come into play that we could otherwise avoid if we were
smarter with our alignments. Is that something we need to analyze
further, or even intentionally over-allocate our buffer to ensure
optimal read alignments?
> +    /* And now we can write everything */
> +    ret = do_perform_cow_write(bs, m->alloc_offset, start->offset,
> +                               start_buffer, start->nb_bytes);
> +    if (ret < 0) {
> +        goto fail;
> +    }
>
> +    ret = do_perform_cow_write(bs, m->alloc_offset, end->offset,
> +                               end_buffer, end->nb_bytes);

At any rate, other than the potential pessimization due to poor
alignments, your split looks sane, and it makes it more obvious that
if we set up a write iov, a later patch could call a single
do_perform_cow_write() using pwritev() over both chunks in one
syscall.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org