Drivers can have internal request sources that generate I/O, like the
need_check_timer in QED. Since we want quiesced periods that contain nested
event loops in the block layer, we need a way to disable such event sources.

Block drivers must implement the "bdrv_drain" callback if they have any
internal sources that can generate I/O activity, like a timer or a worker
thread (even in a library) that can schedule QEMUBH in an asynchronous
callback.

Update the comments of bdrv_drain and bdrv_drained_begin accordingly.

Signed-off-by: Fam Zheng <f...@redhat.com>
---
 block/io.c                | 6 +++++-
 include/block/block.h     | 9 +++++++--
 include/block/block_int.h | 6 ++++++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index a331a19..ef8f9cc 100644
--- a/block/io.c
+++ b/block/io.c
@@ -234,7 +234,8 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
 }
 
 /*
- * Wait for pending requests to complete on a single BlockDriverState subtree
+ * Wait for pending requests to complete on a single BlockDriverState subtree,
+ * and suspend block driver's internal I/O until next request arrives.
  *
  * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState
  * AioContext.
@@ -247,6 +248,9 @@ void bdrv_drain(BlockDriverState *bs)
 {
     bool busy = true;
 
+    if (bs->drv && bs->drv->bdrv_drain) {
+        bs->drv->bdrv_drain(bs);
+    }
     while (busy) {
         /* Keep iterating */
         bdrv_flush_io_queue(bs);
diff --git a/include/block/block.h b/include/block/block.h
index c4f6eef..ff29133 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -624,8 +624,13 @@ BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
  *
  * Begin a quiesced section for exclusive access to the BDS, by disabling
  * external request sources including NBD server and device model. Note that
- * this doesn't block timers or coroutines from submitting more requests, which
- * means block_job_pause is still necessary.
+ * this doesn't prevent timers or coroutines from submitting more requests,
+ * which means block_job_pause is still necessary.
+ *
+ * If new I/O requests are submitted after bdrv_drained_begin is called before
+ * bdrv_drained_end, more internal I/O might be going on after the request has
+ * been completed. If you don't want this, you have to issue another bdrv_drain
+ * or use a nested bdrv_drained_begin/end section.
  *
  * This function can be recursive.
  */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7c58221..99359b2 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -288,6 +288,12 @@ struct BlockDriver {
      */
     int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
+    /**
+     * Drain and stop any internal sources of requests in the driver, and
+     * remain so until next I/O callback (e.g. bdrv_co_writev) is called.
+     */
+    void (*bdrv_drain)(BlockDriverState *bs);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
2.4.3
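
As a rough illustration of the callback contract described in the commit
message, here is a standalone sketch that does not use any QEMU code. All
names in it (MockState, MockDriver, mock_poll, mock_drain, mock_timer_drain,
mock_drain_demo) are hypothetical stand-ins invented for this example, and the
"timer" is only a flag that submits one extra request when polled. It shows
the ordering the patch adds to bdrv_drain(): call the driver's optional drain
hook first, then loop until no requests are pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BlockDriverState / BlockDriver (not QEMU types). */
typedef struct MockState MockState;

typedef struct MockDriver {
    /* Optional hook: stop internal request sources such as timers. */
    void (*drain)(MockState *bs);
} MockDriver;

struct MockState {
    const MockDriver *drv;
    bool timer_armed;  /* models an internal timer like QED's need_check_timer */
    int in_flight;     /* models pending requests */
    int timer_io;      /* counts requests submitted by the timer */
};

/* Driver-side hook: cancel the timer so it cannot submit new I/O. */
static void mock_timer_drain(MockState *bs)
{
    bs->timer_armed = false;
}

/*
 * One event-loop iteration: an armed timer fires once and queues a request,
 * then one pending request completes. Returns true while work remains.
 */
static bool mock_poll(MockState *bs)
{
    if (bs->timer_armed) {
        bs->timer_armed = false;
        bs->in_flight++;
        bs->timer_io++;
    }
    if (bs->in_flight > 0) {
        bs->in_flight--;
    }
    return bs->in_flight > 0;
}

/* Same shape as the patched bdrv_drain(): hook first, then wait. */
static void mock_drain(MockState *bs)
{
    bool busy = true;

    if (bs->drv && bs->drv->drain) {
        bs->drv->drain(bs);
    }
    while (busy) {
        busy = mock_poll(bs);
    }
}

/* Returns how many requests the timer submitted while draining. */
int mock_drain_demo(bool with_callback)
{
    static const MockDriver with_cb = { .drain = mock_timer_drain };
    static const MockDriver without_cb = { .drain = NULL };
    MockState bs = {
        .drv = with_callback ? &with_cb : &without_cb,
        .timer_armed = true,
        .in_flight = 2,
        .timer_io = 0,
    };

    mock_drain(&bs);
    assert(bs.in_flight == 0);
    return bs.timer_io;
}
```

In this toy model, a driver that provides the hook sees no timer-generated
requests during the drain, while a driver without it still has its timer fire
inside the drain loop. That is the kind of internal I/O source the callback is
meant to silence.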