Am 16.01.2013 18:31, schrieb Paolo Bonzini: > With AIO support in place, we can start copying more than one chunk > in parallel. This patch introduces the required infrastructure for > this: the buffer is split into multiple granularity-sized chunks, > and there is a free list to access them. > > Because of copy-on-write, a single operation may already require > multiple chunks to be available on the free list. > > In addition, two different iterations on the HBitmap may want to > copy the same cluster. We avoid this by keeping a bitmap of in-flight > I/O operations, and blocking until the previous iteration completes. > This should be a pretty rare occurrence, though; as long as there is > no overlap the next iteration can start before the previous one finishes. > > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
I'm wondering if a whole bitmap is really appropriate when you have at most 16 parallel requests in flight. Other places in qemu (like copy-on-read or qcow2 cluster allocation) use lists of in-flight requests instead. I'm not requesting a change here, just wondering what the reasons are and whether this, or the other places, or none of both should be changed long term. > --- > v1->v2: the in_flight_bitmap is now properly set and cleared [Stefan] > > block/mirror.c | 111 > ++++++++++++++++++++++++++++++++++++++++++++++++++------ > trace-events | 4 ++- > 2 files changed, 102 insertions(+), 13 deletions(-) > > diff --git a/block/mirror.c b/block/mirror.c > index 77bb184..686d2b7 100644 > --- a/block/mirror.c > +++ b/block/mirror.c > @@ -17,7 +17,15 @@ > #include "qemu/ratelimit.h" > #include "qemu/bitmap.h" > > -#define SLICE_TIME 100000000ULL /* ns */ > +#define SLICE_TIME 100000000ULL /* ns */ > +#define MAX_IN_FLIGHT 16 > + > +/* The mirroring buffer is a list of granularity-sized chunks. > + * Free chunks are organized in a list. > + */ > +typedef struct MirrorBuffer { > + QSIMPLEQ_ENTRY(MirrorBuffer) next; > +} MirrorBuffer; > > typedef struct MirrorBlockJob { > BlockJob common; > @@ -33,7 +41,10 @@ typedef struct MirrorBlockJob { > unsigned long *cow_bitmap; > HBitmapIter hbi; > uint8_t *buf; > + QSIMPLEQ_HEAD(, MirrorBuffer) buf_free; > + int buf_free_count; > > + unsigned long *in_flight_bitmap; > int in_flight; > int ret; > } MirrorBlockJob; > @@ -41,7 +52,6 @@ typedef struct MirrorBlockJob { > typedef struct MirrorOp { > MirrorBlockJob *s; > QEMUIOVector qiov; > - struct iovec iov; > int64_t sector_num; > int nb_sectors; > } MirrorOp; > @@ -62,8 +72,23 @@ static BlockErrorAction mirror_error_action(MirrorBlockJob > *s, bool read, > static void mirror_iteration_done(MirrorOp *op) > { > MirrorBlockJob *s = op->s; > + struct iovec *iov; > + int64_t cluster_num; > + int i, nb_chunks, nb_sectors_chunk; > > s->in_flight--; > + iov = op->qiov.iov; > + for (i = 0; i < op->qiov.niov; i++) { > + MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base; > + QSIMPLEQ_INSERT_TAIL(&s->buf_free, buf, next); > + s->buf_free_count++; > + } > + > + nb_sectors_chunk = s->granularity >> BDRV_SECTOR_BITS; > + cluster_num = op->sector_num / nb_sectors_chunk; > + nb_chunks = op->nb_sectors / nb_sectors_chunk; > + bitmap_clear(s->in_flight_bitmap, cluster_num, nb_chunks); > + > trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors); > g_slice_free(MirrorOp, op); > qemu_coroutine_enter(s->common.co, NULL); > @@ -110,8 +135,8 @@ static void mirror_read_complete(void *opaque, int ret) > static void coroutine_fn mirror_iteration(MirrorBlockJob *s) > { > BlockDriverState *source = s->common.bs; > - int nb_sectors, nb_sectors_chunk; > - int64_t end, sector_num, cluster_num; > + int nb_sectors, nb_sectors_chunk, nb_chunks; > + int64_t end, sector_num, cluster_num, next_sector, hbitmap_next_sector; > MirrorOp *op; > > s->sector_num = hbitmap_iter_next(&s->hbi); > @@ -122,6 +147,8 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob > *s) > assert(s->sector_num >= 0); > } > > + hbitmap_next_sector = s->sector_num; Is there even a reason why s->sector_num exists in the first place? If I'm not mistaken, it's only used locally and could live on the stack as hbitmap_next_sector from the beginning. Kevin