On Wed, 05/13 19:34, Alexander Yarygin wrote:
> Paolo Bonzini <pbonz...@redhat.com> writes:
>
> > On 13/05/2015 17:18, Alexander Yarygin wrote:
> >> After the commit 9b536adc ("block: acquire AioContext in
> >> bdrv_drain_all()") the aio_poll() function got called for every
> >> BlockDriverState, in assumption that every device may have its own
> >> AioContext. The bdrv_drain_all() function is called in each
> >> virtio_reset() call,
> >
> > ... which should actually call bdrv_drain(). Can you fix that?
> >
>
> I thought about it, but couldn't come to conclusion that it's safe. The
> comment above bdrv_drain_all() states "... it is not possible to have a
> function to drain a single device's I/O queue.",

I think that comment is stale - it predates the introduction of per-BDS request
tracking and bdrv_drain.

> besides that what if we
> have several virtual disks that share host file?

I'm not sure what you mean: bdrv_drain works on a BDS, and each virtual disk
has one.

> Or I'm wrong and it's ok to do?
>
> >> which in turn is called for every virtio-blk
> >> device on initialization, so we got aio_poll() called
> >> 'length(device_list)^2' times.
> >>
> >> If we have thousands of disks attached, there are a lot of
> >> BlockDriverStates but only a few AioContexts, leading to tons of
> >> unnecessary aio_poll() calls. For example, startup times with 1000 disks
> >> takes over 13 minutes.
> >>
> >> This patch changes the bdrv_drain_all() function allowing it find shared
> >> AioContexts and to call aio_poll() only for unique ones. This results in
> >> much better startup times, e.g. 1000 disks do come up within 5 seconds.
> >
> > I'm not sure this patch is correct. You may have to call aio_poll
> > multiple times before a BlockDriverState is drained.
> >
> > Paolo
> >
>
>
> Ah, right. We need second loop, something like this:
>
> @@ -2030,20 +2033,33 @@ void bdrv_drain(BlockDriverState *bs)
>  void bdrv_drain_all(void)
>  {
>      /* Always run first iteration so any pending completion BHs run */
> -    bool busy = true;
> +    bool busy = true, pending = false;
>      BlockDriverState *bs;
> +    GList *aio_ctxs = NULL, *ctx;
> +    AioContext *aio_context;
>
>      while (busy) {
>          busy = false;
>
>          QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> -            AioContext *aio_context = bdrv_get_aio_context(bs);
> +            aio_context = bdrv_get_aio_context(bs);
>
>              aio_context_acquire(aio_context);
>              busy |= bdrv_drain_one(bs);
>              aio_context_release(aio_context);
> +            if (!aio_ctxs || !g_list_find(aio_ctxs, aio_context))
> +                aio_ctxs = g_list_append(aio_ctxs, aio_context);

Braces are required even for a single-line if. Moreover, I don't understand
this - aio_ctxs is a duplicate of bdrv_states.

Fam

> +        }
> +        pending = busy;
> +
> +        for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
> +            aio_context = ctx->data;
> +            aio_context_acquire(aio_context);
> +            busy |= aio_poll(aio_context, pending);
> +            aio_context_release(aio_context);
>          }
>      }
> +    g_list_free(aio_ctxs);
>  }
>
> That looks quite ugly for me and breaks consistence of bdrv_drain_one()
> since it doesn't call aio_poll() anymore...
>
>
> >> Cc: Christian Borntraeger <borntrae...@de.ibm.com>
> >> Cc: Cornelia Huck <cornelia.h...@de.ibm.com>
> >> Cc: Kevin Wolf <kw...@redhat.com>
> >> Cc: Paolo Bonzini <pbonz...@redhat.com>
> >> Cc: Stefan Hajnoczi <stefa...@redhat.com>
> >> Signed-off-by: Alexander Yarygin <yary...@linux.vnet.ibm.com>
> >> ---
> >>  block.c | 13 +++++++++++--
> >>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/block.c b/block.c
> >> index f2f8ae7..7414815 100644
> >> --- a/block.c
> >> +++ b/block.c
> >> @@ -1994,7 +1994,6 @@ static bool bdrv_drain_one(BlockDriverState *bs)
> >>      bdrv_flush_io_queue(bs);
> >>      bdrv_start_throttled_reqs(bs);
> >>      bs_busy = bdrv_requests_pending(bs);
> >> -    bs_busy |= aio_poll(bdrv_get_aio_context(bs), bs_busy);
> >>      return bs_busy;
> >>  }
> >>
> >> @@ -2010,8 +2009,12 @@ static bool bdrv_drain_one(BlockDriverState *bs)
> >>   */
> >>  void bdrv_drain(BlockDriverState *bs)
> >>  {
> >> -    while (bdrv_drain_one(bs)) {
> >> +    bool busy = true;
> >> +
> >> +    while (busy) {
> >>          /* Keep iterating */
> >> +        busy = bdrv_drain_one(bs);
> >> +        busy |= aio_poll(bdrv_get_aio_context(bs), busy);
> >>      }
> >>  }
> >>
> >> @@ -2032,6 +2035,7 @@ void bdrv_drain_all(void)
> >>      /* Always run first iteration so any pending completion BHs run */
> >>      bool busy = true;
> >>      BlockDriverState *bs;
> >> +    GList *aio_ctxs = NULL;
> >>
> >>      while (busy) {
> >>          busy = false;
> >> @@ -2041,9 +2045,14 @@ void bdrv_drain_all(void)
> >>
> >>              aio_context_acquire(aio_context);
> >>              busy |= bdrv_drain_one(bs);
> >> +            if (!aio_ctxs || !g_list_find(aio_ctxs, aio_context)) {
> >> +                busy |= aio_poll(aio_context, busy);
> >> +                aio_ctxs = g_list_append(aio_ctxs, aio_context);
> >> +            }
> >>              aio_context_release(aio_context);
> >>          }
> >>      }
> >> +    g_list_free(aio_ctxs);
> >>  }
> >>
> >>  /* make a BlockDriverState anonymous by removing from bdrv_state and
> >>
> >
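
For illustration only, the shape of the loop being discussed (check every
device, then poll each distinct context once per pass, and repeat until a
whole pass finds nothing busy) can be modelled outside QEMU. This is a toy
sketch under stated assumptions, not block.c code: ToyContext, ToyDevice,
toy_poll() and toy_drain_all() are hypothetical stand-ins, one poll is assumed
to retire at most one request per attached device, and the
aio_context_acquire()/aio_context_release() locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CTX 16  /* arbitrary bound for this sketch */

/* Hypothetical stand-in for an AioContext: only counts how often it is polled. */
typedef struct ToyContext {
    int polls;
} ToyContext;

/* Hypothetical stand-in for a BlockDriverState bound to one context. */
typedef struct ToyDevice {
    ToyContext *ctx;
    int in_flight;  /* requests still pending on this device */
} ToyDevice;

/*
 * Assumption of the model: one poll retires at most one request on each
 * device attached to ctx. Returns true if any progress was made.
 */
static bool toy_poll(ToyContext *ctx, ToyDevice *devs, size_t ndevs)
{
    bool progress = false;

    ctx->polls++;
    for (size_t i = 0; i < ndevs; i++) {
        if (devs[i].ctx == ctx && devs[i].in_flight > 0) {
            devs[i].in_flight--;
            progress = true;
        }
    }
    return progress;
}

/* Drain all devices, polling each distinct context once per pass. */
static void toy_drain_all(ToyDevice *devs, size_t ndevs)
{
    bool busy = true;  /* always run at least one pass */

    while (busy) {
        ToyContext *seen[TOY_MAX_CTX];
        size_t nseen = 0;

        busy = false;

        /* Phase 1: note busy devices and collect the unique contexts. */
        for (size_t i = 0; i < ndevs; i++) {
            bool dup = false;

            busy |= devs[i].in_flight > 0;
            for (size_t j = 0; j < nseen; j++) {
                if (seen[j] == devs[i].ctx) {
                    dup = true;
                    break;
                }
            }
            if (!dup) {
                assert(nseen < TOY_MAX_CTX);
                seen[nseen++] = devs[i].ctx;
            }
        }

        /* Phase 2: poll each unique context once, not once per device. */
        for (size_t j = 0; j < nseen; j++) {
            busy |= toy_poll(seen[j], devs, ndevs);
        }
    }
}
```

In this model, four devices spread over two contexts with 2, 0, 1 and 3
requests in flight need four passes, so each context is polled four times
instead of once per device per pass. The outer loop is what covers Paolo's
point that a single aio_poll() may not be enough to drain a device.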