Drivers can have internal request sources that generate I/O, like the
need_check_timer in QED. Since we want quiesced periods that contain
nested event loops in the block layer, we need a way to disable such
event sources.

Block drivers must implement the "bdrv_drain" callback if they have any
internal sources that can generate I/O activity, like a timer or a
worker thread (even in a library) that can schedule a QEMUBH in an
asynchronous callback.

Update the comments of bdrv_drain and bdrv_drained_begin accordingly.

Signed-off-by: Fam Zheng <f...@redhat.com>
---
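Note (illustration only, not part of the patch): the sketch below models
why the drain loop needs the driver callback. Everything in it is a
hypothetical stand-in (MockBDS, MockDriver, mock_poll, mock_drain and
mock_timer_drain are not QEMU names), and the "timer" is just a flag that
submits one request per poll, which is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BlockDriverState / BlockDriver; not QEMU code. */
typedef struct MockBDS MockBDS;

typedef struct MockDriver {
    /* Optional callback: stop internal request sources (timers, threads). */
    void (*drain)(MockBDS *bs);
} MockDriver;

struct MockBDS {
    const MockDriver *drv;
    bool timer_armed;   /* models an internal source such as a periodic timer */
    int in_flight;      /* models requests that have not completed yet */
};

/* Driver side: cancel the timer so it cannot submit further requests. */
static void mock_timer_drain(MockBDS *bs)
{
    bs->timer_armed = false;
}

/*
 * One event loop iteration. An armed timer submits a new request, then at
 * most one pending request completes. Returns true if progress was made.
 */
static bool mock_poll(MockBDS *bs)
{
    if (bs->timer_armed) {
        bs->in_flight++;
    }
    if (bs->in_flight > 0) {
        bs->in_flight--;
        return true;
    }
    return false;
}

/*
 * Mirrors the shape of the patched drain loop: call the driver callback
 * first (if any), then poll until nothing is pending. Returns the number of
 * iterations that made progress, or -1 if the state did not quiesce within
 * max_iters (the cap exists only so the sketch cannot loop forever).
 */
static int mock_drain(MockBDS *bs, int max_iters)
{
    assert(bs != NULL);
    if (bs->drv && bs->drv->drain) {
        bs->drv->drain(bs);
    }
    for (int i = 0; i < max_iters; i++) {
        if (!mock_poll(bs)) {
            return i;
        }
    }
    return -1;
}
```

With a driver that sets .drain = mock_timer_drain, the timer is disarmed
before polling and the loop ends once the pending requests complete. With
.drain left NULL and the timer armed, every poll sees a fresh request, so
the loop only stops at the iteration cap.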
 block/io.c                | 6 +++++-
 include/block/block.h     | 9 +++++++--
 include/block/block_int.h | 6 ++++++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index a331a19..ef8f9cc 100644
--- a/block/io.c
+++ b/block/io.c
@@ -234,7 +234,8 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
 }
 
 /*
- * Wait for pending requests to complete on a single BlockDriverState subtree
+ * Wait for pending requests to complete on a single BlockDriverState subtree,
+ * and suspend the block driver's internal I/O until the next request arrives.
  *
  * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState
  * AioContext.
@@ -247,6 +248,9 @@ void bdrv_drain(BlockDriverState *bs)
 {
     bool busy = true;
 
+    if (bs->drv && bs->drv->bdrv_drain) {
+        bs->drv->bdrv_drain(bs);
+    }
     while (busy) {
         /* Keep iterating */
          bdrv_flush_io_queue(bs);
diff --git a/include/block/block.h b/include/block/block.h
index c4f6eef..ff29133 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -624,8 +624,13 @@ BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
  *
  * Begin a quiesced section for exclusive access to the BDS, by disabling
  * external request sources including NBD server and device model. Note that
- * this doesn't block timers or coroutines from submitting more requests, which
- * means block_job_pause is still necessary.
+ * this doesn't prevent timers or coroutines from submitting more requests,
+ * which means block_job_pause is still necessary.
+ *
+ * If new I/O requests are submitted after bdrv_drained_begin is called and
+ * before bdrv_drained_end, more internal I/O might be going on after those
+ * requests have completed. If you don't want this, you have to issue another
+ * bdrv_drain or use a nested bdrv_drained_begin/end section.
  *
  * This function can be recursive.
  */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7c58221..99359b2 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -288,6 +288,12 @@ struct BlockDriver {
      */
     int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
+    /**
+     * Drain and stop any internal sources of requests in the driver, and
+     * remain so until the next I/O callback (e.g. bdrv_co_writev) is called.
+     */
+    void (*bdrv_drain)(BlockDriverState *bs);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
2.4.3

