Re: [PATCH 20/20] block: Mark bdrv_refresh_limits() and callers GRAPH_RDLOCK

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:58PM +0200, Kevin Wolf wrote:
> This adds GRAPH_RDLOCK annotations to declare that callers of
> bdrv_refresh_limits() need to hold a reader lock for the graph because
> it accesses the children list of a node.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/block-global-state.h | 5 -
>  include/block/block_int-common.h   | 3 ++-
>  block.c| 9 +
>  block/io.c | 1 -
>  4 files changed, 15 insertions(+), 3 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 19/20] block: Mark bdrv_recurse_can_replace() and callers GRAPH_RDLOCK

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:57PM +0200, Kevin Wolf wrote:
> This adds GRAPH_RDLOCK annotations to declare that callers of
> bdrv_recurse_can_replace() need to hold a reader lock for the graph
> because it accesses the children list of a node.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/block-global-state.h | 5 +++--
>  include/block/block_int-common.h   | 4 ++--
>  include/block/block_int-global-state.h | 4 ++--
>  block/blkverify.c  | 5 +++--
>  block/mirror.c | 4 
>  block/quorum.c | 4 ++--
>  blockdev.c | 3 +++
>  7 files changed, 19 insertions(+), 10 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 18/20] block: Mark bdrv_query_block_graph_info() and callers GRAPH_RDLOCK

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:56PM +0200, Kevin Wolf wrote:
> This adds GRAPH_RDLOCK annotations to declare that callers of
> bdrv_query_block_graph_info() need to hold a reader lock for the graph
> because it accesses the children list of a node.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/qapi.h | 7 ---
>  qemu-img.c   | 2 ++
>  2 files changed, 6 insertions(+), 3 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 17/20] block: Mark bdrv_query_bds_stats() and callers GRAPH_RDLOCK

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:55PM +0200, Kevin Wolf wrote:
> This adds GRAPH_RDLOCK annotations to declare that callers of
> bdrv_query_bds_stats() need to hold a reader lock for the graph because
> it accesses the children list of a node.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/qapi.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 16/20] block: Mark BlockDriver callbacks for amend job GRAPH_RDLOCK

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:54PM +0200, Kevin Wolf wrote:
> From: Emanuele Giuseppe Esposito 
> 
> This adds GRAPH_RDLOCK annotations to declare that callers of amend
> callbacks in BlockDriver need to hold a reader lock for the graph.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/block_int-common.h | 12 ++--
>  block/amend.c|  8 +++-
>  2 files changed, 13 insertions(+), 7 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 15/20] block: Mark bdrv_co_debug_event() GRAPH_RDLOCK

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:53PM +0200, Kevin Wolf wrote:
> From: Emanuele Giuseppe Esposito 
> 
> This adds GRAPH_RDLOCK annotations to declare that callers of
> bdrv_co_debug_event() need to hold a reader lock for the graph.
> 
> Unfortunately we cannot use a co_wrapper_bdrv_rdlock, because the
> function is called by mixed functions that run both in coroutine and
> non-coroutine context (for example blkdebug_open).
> 
> Signed-off-by: Emanuele Giuseppe Esposito 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/block-io.h | 9 +
>  include/block/block_int-common.h | 4 ++--
>  block.c  | 2 ++
>  3 files changed, 9 insertions(+), 6 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 14/20] block: Mark bdrv_co_get_info() and callers GRAPH_RDLOCK

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:52PM +0200, Kevin Wolf wrote:
> From: Emanuele Giuseppe Esposito 
> 
> This adds GRAPH_RDLOCK annotations to declare that callers of
> bdrv_co_get_info() need to hold a reader lock for the graph.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/block-io.h |  7 +--
>  include/block/block_int-common.h |  4 ++--
>  block.c  |  2 ++
>  block/crypto.c   |  2 +-
>  block/io.c   | 11 +--
>  block/mirror.c   |  8 ++--
>  block/raw-format.c   |  2 +-
>  7 files changed, 22 insertions(+), 14 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 13/20] block: Mark bdrv_co_get_allocated_file_size() and callers GRAPH_RDLOCK

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:51PM +0200, Kevin Wolf wrote:
> @@ -5778,6 +5779,7 @@ int64_t coroutine_fn 
> bdrv_co_get_allocated_file_size(BlockDriverState *bs)
>  {
>  BlockDriver *drv = bs->drv;
>  IO_CODE();
> +assert_bdrv_graph_readable();

Is there a need for runtime assertions in functions already checked by
TSA?

I guess not. Otherwise runtime assertions should have been added in many
of the other functions marked GRAPH_RDLOCK in this series.


signature.asc
Description: PGP signature


Re: [PATCH 12/20] mirror: Take graph lock for accessing a node's parent list

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:50PM +0200, Kevin Wolf wrote:
> This adds GRAPH_RDLOCK annotations to declare that functions accessing
> the parent list of a node need to hold a reader lock for the graph. As
> it happens, they already do.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/mirror.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

The commit message is misleading. This commit does not take the graph
lock, it declares that the caller must already hold the graph lock.

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 11/20] vhdx: Take graph lock for accessing a node's parent list

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:49PM +0200, Kevin Wolf wrote:
> This adds GRAPH_RDLOCK annotations to declare that functions accessing
> the parent list of a node need to hold a reader lock for the graph. As
> it happens, they already do.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/vhdx.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)

The commit message is misleading. This commit does not take the graph
lock, it declares that the caller must already hold the graph lock.

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 10/20] nbd: Mark nbd_co_do_establish_connection() and callers GRAPH_RDLOCK

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:48PM +0200, Kevin Wolf wrote:
> From: Emanuele Giuseppe Esposito 
> 
> This adds GRAPH_RDLOCK annotations to declare that callers of
> nbd_co_do_establish_connection() need to hold a reader lock for the
> graph.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 
> Signed-off-by: Kevin Wolf 
> ---
>  block/coroutines.h |  5 +++--
>  block/nbd.c| 39 +--
>  2 files changed, 24 insertions(+), 20 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


[PATCH] block: compile out assert_bdrv_graph_readable() by default

2023-05-01 Thread Stefan Hajnoczi
reader_count() is a performance bottleneck because the global
aio_context_list_lock mutex causes thread contention. Put this debugging
assertion behind a new ./configure --enable-debug-graph-lock option and
disable it by default.

The --enable-debug-graph-lock option is also enabled by the more general
--enable-debug option.

Signed-off-by: Stefan Hajnoczi 
---
 meson_options.txt | 2 ++
 configure | 1 +
 meson.build   | 2 ++
 block/graph-lock.c| 3 +++
 scripts/meson-buildoptions.sh | 4 
 5 files changed, 12 insertions(+)

diff --git a/meson_options.txt b/meson_options.txt
index 2471dd02da..0b2dd2d30d 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -311,6 +311,8 @@ option('rng_none', type: 'boolean', value: false,
description: 'dummy RNG, avoid using /dev/(u)random and getrandom()')
 option('coroutine_pool', type: 'boolean', value: true,
description: 'coroutine freelist (better performance)')
+option('debug_graph_lock', type: 'boolean', value: false,
+   description: 'graph lock debugging support')
 option('debug_mutex', type: 'boolean', value: false,
description: 'mutex debugging support')
 option('debug_stack_usage', type: 'boolean', value: false,
diff --git a/configure b/configure
index 77c03315f8..243e2e0a0d 100755
--- a/configure
+++ b/configure
@@ -816,6 +816,7 @@ for opt do
   --enable-debug)
   # Enable debugging options that aren't excessively noisy
   debug_tcg="yes"
+  meson_option_parse --enable-debug-graph-lock ""
   meson_option_parse --enable-debug-mutex ""
   meson_option_add -Doptimization=0
   fortify_source="no"
diff --git a/meson.build b/meson.build
index c44d05a13f..d964e741e7 100644
--- a/meson.build
+++ b/meson.build
@@ -1956,6 +1956,7 @@ if get_option('debug_stack_usage') and have_coroutine_pool
   have_coroutine_pool = false
 endif
 config_host_data.set10('CONFIG_COROUTINE_POOL', have_coroutine_pool)
+config_host_data.set('CONFIG_DEBUG_GRAPH_LOCK', get_option('debug_graph_lock'))
 config_host_data.set('CONFIG_DEBUG_MUTEX', get_option('debug_mutex'))
 config_host_data.set('CONFIG_DEBUG_STACK_USAGE', 
get_option('debug_stack_usage'))
 config_host_data.set('CONFIG_GPROF', get_option('gprof'))
@@ -3833,6 +3834,7 @@ summary_info += {'PIE':   get_option('b_pie')}
 summary_info += {'static build':  config_host.has_key('CONFIG_STATIC')}
 summary_info += {'malloc trim support': has_malloc_trim}
 summary_info += {'membarrier':have_membarrier}
+summary_info += {'debug graph lock':  get_option('debug_graph_lock')}
 summary_info += {'debug stack usage': get_option('debug_stack_usage')}
 summary_info += {'mutex debugging':   get_option('debug_mutex')}
 summary_info += {'memory allocator':  get_option('malloc')}
diff --git a/block/graph-lock.c b/block/graph-lock.c
index 639526608f..377884c3a9 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -265,7 +265,10 @@ void bdrv_graph_rdunlock_main_loop(void)
 
 void assert_bdrv_graph_readable(void)
 {
+/* reader_count() is slow due to aio_context_list_lock lock contention */
+#ifdef CONFIG_DEBUG_GRAPH_LOCK
 assert(qemu_in_main_thread() || reader_count());
+#endif
 }
 
 void assert_bdrv_graph_writable(void)
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index d4369a3ad8..d760ceb1ad 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -22,6 +22,8 @@ meson_options_help() {
   printf "%s\n" '   QEMU'
   printf "%s\n" '  --enable-cfi Control-Flow Integrity (CFI)'
   printf "%s\n" '  --enable-cfi-debug   Verbose errors in case of CFI 
violation'
+  printf "%s\n" '  --enable-debug-graph-lock'
+  printf "%s\n" '   graph lock debugging support'
   printf "%s\n" '  --enable-debug-mutex mutex debugging support'
   printf "%s\n" '  --enable-debug-stack-usage'
   printf "%s\n" '   measure coroutine stack usage'
@@ -249,6 +251,8 @@ _meson_option_parse() {
 --datadir=*) quote_sh "-Ddatadir=$2" ;;
 --enable-dbus-display) printf "%s" -Ddbus_display=enabled ;;
 --disable-dbus-display) printf "%s" -Ddbus_display=disabled ;;
+--enable-debug-graph-lock) printf "%s" -Ddebug_graph_lock=true ;;
+--disable-debug-graph-lock) printf "%s" -Ddebug_graph_lock=false ;;
 --enable-debug-mutex) printf "%s" -Ddebug_mutex=true ;;
 --disable-debug-mutex) printf "%s" -Ddebug_mutex=false ;;
 --enable-debug-stack-usage) printf "%s" -Ddebug_stack_usage=true ;;
-- 
2.40.1




Re: [PATCH 09/20] nbd: Remove nbd_co_flush() wrapper function

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:47PM +0200, Kevin Wolf wrote:
> The only thing nbd_co_flush() does is calling nbd_client_co_flush().
> Just use that function directly in the BlockDriver definitions and
> remove the wrapper.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/nbd.c | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 08/20] block: .bdrv_open is non-coroutine and unlocked

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:46PM +0200, Kevin Wolf wrote:
> Drivers were a bit confused about whether .bdrv_open can run in a
> coroutine and whether or not it holds a graph lock.
> 
> It cannot keep a graph lock from the caller across the whole function
> because it both changes the graph (requires a writer lock) and does I/O
> (requires a reader lock). Therefore, it should take these locks
> internally as needed.
> 
> The functions used to be called in coroutine context during image
> creation. This was buggy for other reasons, and as of commit 32192301,
> all block drivers go through no_co_wrappers. So it is not called in
> coroutine context any more.
> 
> Fix qcow2 and qed to work with the correct assumptions: The graph lock
> needs to be taken internally instead of just assuming it's already
> there, and the coroutine path is dead code that can be removed.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/block_int-common.h |  8 
>  block.c  |  6 +++---
>  block/qcow2.c| 15 ++-
>  block/qed.c  | 18 --
>  4 files changed, 21 insertions(+), 26 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 07/20] graph-lock: Fix GRAPH_RDLOCK_GUARD*() to be reader lock

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:45PM +0200, Kevin Wolf wrote:
> GRAPH_RDLOCK_GUARD() and GRAPH_RDLOCK_GUARD_MAINLOOP() only take a
> reader lock for the graph, so the correct annotation for them to use is
> TSA_ASSERT_SHARED rather than TSA_ASSERT.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/graph-lock.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
> index 7ef391fab3..adaa3ed089 100644
> --- a/include/block/graph-lock.h
> +++ b/include/block/graph-lock.h
> @@ -210,7 +210,7 @@ typedef struct GraphLockable { } GraphLockable;
>   * unlocked. TSA_ASSERT() makes sure that the following calls know that we

Does this comment need to be updated to TSA_ASSERT_SHARED()?

>   * hold the lock while unlocking is left unchecked.
>   */
> -static inline GraphLockable * TSA_ASSERT(graph_lock) TSA_NO_TSA
> +static inline GraphLockable * TSA_ASSERT_SHARED(graph_lock) TSA_NO_TSA
>  graph_lockable_auto_lock(GraphLockable *x)
>  {
>  bdrv_graph_co_rdlock();
> @@ -254,7 +254,7 @@ typedef struct GraphLockableMainloop { } 
> GraphLockableMainloop;
>   * unlocked. TSA_ASSERT() makes sure that the following calls know that we

Same.


signature.asc
Description: PGP signature


Re: [PATCH 06/20] graph-lock: Add GRAPH_UNLOCKED(_PTR)

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:44PM +0200, Kevin Wolf wrote:
> For some function, parts of their interface is that are called without
> holding the graph lock. Add a new macro to document this.
> 
> The macro expands to TSA_EXCLUDES(), which is a relatively weak check
> because it passes in cases where the compiler just doesn't know if the
> lock is held. Function pointers can't be checked at all. Therefore, its
> primary purpose is documentation.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/graph-lock.h | 2 ++
>  1 file changed, 2 insertions(+)

Modulo Eric's comment about the commit description:

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 05/20] test-bdrv-drain: Don't modify the graph in coroutines

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:43PM +0200, Kevin Wolf wrote:
> test-bdrv-drain contains a few test cases that are run both in coroutine
> and non-coroutine context. Running the entire code including the setup
> and shutdown in coroutines is incorrect because graph modifications can
> generally not happen in coroutines.
> 
> Change the test so that creating and destroying the test nodes and
> BlockBackends always happens outside of coroutine context.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/unit/test-bdrv-drain.c | 112 +++
>  1 file changed, 75 insertions(+), 37 deletions(-)
> 
> diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
> index d9d3807062..765ae382da 100644
> --- a/tests/unit/test-bdrv-drain.c
> +++ b/tests/unit/test-bdrv-drain.c
> @@ -188,6 +188,25 @@ static void do_drain_begin_unlocked(enum drain_type 
> drain_type, BlockDriverState
>  }
>  }
>  
> +static BlockBackend * no_coroutine_fn test_setup(void)
> +{
> +BlockBackend *blk;
> +BlockDriverState *bs, *backing;
> +
> +blk = blk_new(qemu_get_aio_context(), BLK_PERM_ALL, BLK_PERM_ALL);
> +bs = bdrv_new_open_driver(_test, "test-node", BDRV_O_RDWR,
> +  _abort);
> +blk_insert_bs(blk, bs, _abort);
> +
> +backing = bdrv_new_open_driver(_test, "backing", 0, _abort);
> +bdrv_set_backing_hd(bs, backing, _abort);
> +
> +bdrv_unref(backing);
> +bdrv_unref(bs);
> +
> +return blk;
> +}
> +
>  static void do_drain_end_unlocked(enum drain_type drain_type, 
> BlockDriverState *bs)
>  {
>  if (drain_type != BDRV_DRAIN_ALL) {
> @@ -199,25 +218,19 @@ static void do_drain_end_unlocked(enum drain_type 
> drain_type, BlockDriverState *
>  }
>  }
>  
> -static void test_drv_cb_common(enum drain_type drain_type, bool recursive)
> +static void test_drv_cb_common(BlockBackend *blk, enum drain_type drain_type,
> +   bool recursive)
>  {
> -BlockBackend *blk;
> -BlockDriverState *bs, *backing;
> +BlockDriverState *bs = blk_bs(blk);
> +BlockDriverState *backing = bs->backing->bs;
>  BDRVTestState *s, *backing_s;
>  BlockAIOCB *acb;
>  int aio_ret;
>  
>  QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, NULL, 0);
>  
> -blk = blk_new(qemu_get_aio_context(), BLK_PERM_ALL, BLK_PERM_ALL);
> -bs = bdrv_new_open_driver(_test, "test-node", BDRV_O_RDWR,
> -  _abort);
>  s = bs->opaque;
> -blk_insert_bs(blk, bs, _abort);
> -
> -backing = bdrv_new_open_driver(_test, "backing", 0, _abort);
>  backing_s = backing->opaque;
> -bdrv_set_backing_hd(bs, backing, _abort);
>  
>  /* Simple bdrv_drain_all_begin/end pair, check that CBs are called */
>  g_assert_cmpint(s->drain_count, ==, 0);
> @@ -252,44 +265,53 @@ static void test_drv_cb_common(enum drain_type 
> drain_type, bool recursive)
>  
>  g_assert_cmpint(s->drain_count, ==, 0);
>  g_assert_cmpint(backing_s->drain_count, ==, 0);
> -
> -bdrv_unref(backing);
> -bdrv_unref(bs);
> -blk_unref(blk);
>  }
>  
>  static void test_drv_cb_drain_all(void)
>  {
> -test_drv_cb_common(BDRV_DRAIN_ALL, true);
> +BlockBackend *blk = test_setup();
> +test_drv_cb_common(blk, BDRV_DRAIN_ALL, true);
> +blk_unref(blk);
>  }
>  
>  static void test_drv_cb_drain(void)
>  {
> -test_drv_cb_common(BDRV_DRAIN, false);
> +BlockBackend *blk = test_setup();
> +test_drv_cb_common(blk, BDRV_DRAIN, false);
> +blk_unref(blk);
> +}
> +
> +static void test_drv_cb_co_drain_all_entry(void)

Missing coroutine_fn.

> +{
> +BlockBackend *blk = blk_all_next(NULL);
> +test_drv_cb_common(blk, BDRV_DRAIN_ALL, true);
>  }
>  
>  static void test_drv_cb_co_drain_all(void)
>  {
> -call_in_coroutine(test_drv_cb_drain_all);
> +BlockBackend *blk = test_setup();
> +call_in_coroutine(test_drv_cb_co_drain_all_entry);
> +blk_unref(blk);
>  }
>  
> -static void test_drv_cb_co_drain(void)
> +static void test_drv_cb_co_drain_entry(void)

Missing coroutine_fn.

>  {
> -call_in_coroutine(test_drv_cb_drain);
> +BlockBackend *blk = blk_all_next(NULL);
> +test_drv_cb_common(blk, BDRV_DRAIN, false);
>  }
>  
> -static void test_quiesce_common(enum drain_type drain_type, bool recursive)
> +static void test_drv_cb_co_drain(void)
>  {
> -BlockBackend *blk;
> -BlockDriverState *bs, *backing;
> -
> -blk = blk_new(qemu_get_aio_context(), BLK_PERM_ALL, BLK_PERM_ALL);
> -bs = bdrv_new_open_driver(_test, "test-node", BDRV_O_RDWR,
> -  _abort);
> -blk_insert_bs(blk, bs, _abort);
> +BlockBackend *blk = test_setup();
> +call_in_coroutine(test_drv_cb_co_drain_entry);
> +blk_unref(blk);
> +}
>  
> -backing = bdrv_new_open_driver(_test, "backing", 0, _abort);
> -bdrv_set_backing_hd(bs, backing, _abort);
> +static void test_quiesce_common(BlockBackend *blk, enum drain_type 
> drain_type,
> +

Re: [PATCH 04/20] block: Don't call no_coroutine_fns in qmp_block_resize()

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:42PM +0200, Kevin Wolf wrote:
> This QMP handler runs in a coroutine, so it must use the corresponding
> no_co_wrappers instead.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  blockdev.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 03/20] block: bdrv/blk_co_unref() for calls in coroutine context

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:41PM +0200, Kevin Wolf wrote:
> These functions must not be called in coroutine context, because they
> need write access to the graph.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/block-global-state.h  |  3 ++-
>  include/sysemu/block-backend-global-state.h |  5 -
>  block.c |  2 +-
>  block/crypto.c  |  6 +++---
>  block/parallels.c   |  6 +++---
>  block/qcow.c|  6 +++---
>  block/qcow2.c   | 14 +++---
>  block/qed.c |  6 +++---
>  block/vdi.c |  6 +++---
>  block/vhdx.c|  6 +++---
>  block/vmdk.c| 18 +-
>  block/vpc.c |  6 +++---
>  12 files changed, 44 insertions(+), 40 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 02/20] block: Consistently call bdrv_activate() outside coroutine

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:40PM +0200, Kevin Wolf wrote:
> Migration code can call bdrv_activate() in coroutine context, whereas
> other callers call it outside of coroutines. As it calls other code that
> is not supposed to run in coroutines, standardise on running outside of
> coroutines.
> 
> This adds a no_co_wrapper to switch to the main loop before calling
> bdrv_activate().
> 
> Signed-off-by: Kevin Wolf 
> ---
>  include/block/block-global-state.h |  6 +-
>  block/block-backend.c  | 10 +-
>  2 files changed, 14 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH 01/20] qcow2: Don't call bdrv_getlength() in coroutine_fns

2023-05-01 Thread Stefan Hajnoczi
On Tue, Apr 25, 2023 at 07:31:39PM +0200, Kevin Wolf wrote:
> There is a bdrv_co_getlength() now, which should be used in coroutine
> context.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/qcow2.h  |  4 +++-
>  block/qcow2-refcount.c |  2 +-
>  block/qcow2.c  | 19 +--
>  3 files changed, 13 insertions(+), 12 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH v4 04/20] virtio-scsi: stop using aio_disable_external() during unplug

2023-05-01 Thread Stefan Hajnoczi
On Fri, Apr 28, 2023 at 04:22:55PM +0200, Kevin Wolf wrote:
> Am 25.04.2023 um 19:27 hat Stefan Hajnoczi geschrieben:
> > This patch is part of an effort to remove the aio_disable_external()
> > API because it does not fit in a multi-queue block layer world where
> > many AioContexts may be submitting requests to the same disk.
> > 
> > The SCSI emulation code is already in good shape to stop using
> > aio_disable_external(). It was only used by commit 9c5aad84da1c
> > ("virtio-scsi: fixed virtio_scsi_ctx_check failed when detaching scsi
> > disk") to ensure that virtio_scsi_hotunplug() works while the guest
> > driver is submitting I/O.
> > 
> > Ensure virtio_scsi_hotunplug() is safe as follows:
> > 
> > 1. qdev_simple_device_unplug_cb() -> qdev_unrealize() ->
> >device_set_realized() calls qatomic_set(>realized, false) so
> >that future scsi_device_get() calls return NULL because they exclude
> >SCSIDevices with realized=false.
> > 
> >That means virtio-scsi will reject new I/O requests to this
> >SCSIDevice with VIRTIO_SCSI_S_BAD_TARGET even while
> >virtio_scsi_hotunplug() is still executing. We are protected against
> >new requests!
> > 
> > 2. Add a call to scsi_device_purge_requests() from scsi_unrealize() so
> >that in-flight requests are cancelled synchronously. This ensures
> >that no in-flight requests remain once qdev_simple_device_unplug_cb()
> >returns.
> > 
> > Thanks to these two conditions we don't need aio_disable_external()
> > anymore.
> > 
> > Cc: Zhengui Li 
> > Reviewed-by: Paolo Bonzini 
> > Reviewed-by: Daniil Tatianin 
> > Signed-off-by: Stefan Hajnoczi 
> 
> qemu-iotests 040 starts failing for me after this patch, with what looks
> like a use-after-free error of some kind.
> 
> (gdb) bt
> #0  0x55b6e3e1f31c in job_type (job=0xe3e3e3e3e3e3e3e3) at ../job.c:238
> #1  0x55b6e3e1cee5 in is_block_job (job=0xe3e3e3e3e3e3e3e3) at 
> ../blockjob.c:41
> #2  0x55b6e3e1ce7d in block_job_next_locked (bjob=0x55b6e72b7570) at 
> ../blockjob.c:54
> #3  0x55b6e3df6370 in blockdev_mark_auto_del (blk=0x55b6e74af0a0) at 
> ../blockdev.c:157
> #4  0x55b6e393e23b in scsi_qdev_unrealize (qdev=0x55b6e7c04d40) at 
> ../hw/scsi/scsi-bus.c:303
> #5  0x55b6e3db0d0e in device_set_realized (obj=0x55b6e7c04d40, 
> value=false, errp=0x55b6e497c918 ) at ../hw/core/qdev.c:599
> #6  0x55b6e3dba36e in property_set_bool (obj=0x55b6e7c04d40, 
> v=0x55b6e7d7f290, name=0x55b6e41bd6d8 "realized", opaque=0x55b6e7246d20, 
> errp=0x55b6e497c918 )
> at ../qom/object.c:2285
> #7  0x55b6e3db7e65 in object_property_set (obj=0x55b6e7c04d40, 
> name=0x55b6e41bd6d8 "realized", v=0x55b6e7d7f290, errp=0x55b6e497c918 
> ) at ../qom/object.c:1420
> #8  0x55b6e3dbd84a in object_property_set_qobject (obj=0x55b6e7c04d40, 
> name=0x55b6e41bd6d8 "realized", value=0x55b6e74c1890, errp=0x55b6e497c918 
> )
> at ../qom/qom-qobject.c:28
> #9  0x55b6e3db8570 in object_property_set_bool (obj=0x55b6e7c04d40, 
> name=0x55b6e41bd6d8 "realized", value=false, errp=0x55b6e497c918 
> ) at ../qom/object.c:1489
> #10 0x55b6e3daf2b5 in qdev_unrealize (dev=0x55b6e7c04d40) at 
> ../hw/core/qdev.c:306
> #11 0x55b6e3db509d in qdev_simple_device_unplug_cb 
> (hotplug_dev=0x55b6e81c3630, dev=0x55b6e7c04d40, errp=0x7ffec5519200) at 
> ../hw/core/qdev-hotplug.c:72
> #12 0x55b6e3c520f9 in virtio_scsi_hotunplug (hotplug_dev=0x55b6e81c3630, 
> dev=0x55b6e7c04d40, errp=0x7ffec5519200) at ../hw/scsi/virtio-scsi.c:1065
> #13 0x55b6e3db4dec in hotplug_handler_unplug 
> (plug_handler=0x55b6e81c3630, plugged_dev=0x55b6e7c04d40, 
> errp=0x7ffec5519200) at ../hw/core/hotplug.c:56
> #14 0x55b6e3a28f84 in qdev_unplug (dev=0x55b6e7c04d40, 
> errp=0x7ffec55192e0) at ../softmmu/qdev-monitor.c:935
> #15 0x55b6e3a290fa in qmp_device_del (id=0x55b6e74c1760 "scsi0", 
> errp=0x7ffec55192e0) at ../softmmu/qdev-monitor.c:955
> #16 0x55b6e3fb0a5f in qmp_marshal_device_del (args=0x7f61cc005eb0, 
> ret=0x7f61d5a8ae38, errp=0x7f61d5a8ae40) at qapi/qapi-commands-qdev.c:114
> #17 0x55b6e3fd52e1 in do_qmp_dispatch_bh (opaque=0x7f61d5a8ae08) at 
> ../qapi/qmp-dispatch.c:128
> #18 0x55b6e4007b9e in aio_bh_call (bh=0x55b6e7dea730) at 
> ../util/async.c:155
> #19 0x55b6e4007d2e in aio_bh_poll (ctx=0x55b6e72447c0) at 
> ../util/async.c:184
> #20 0x55b6e3fe3b45 in aio_dispatch (ctx=0x55b6e72447c0) at 
> ../util/aio-posix.c:421
> #21 0x55b6e4009544 in aio_ctx_dispatch (source=0x55b6e72447c0, 
> callback=0x0, user_data=0x0) at ../util/async.c:326
> #22 0x7f61ddc14c7f in g_main_dispatch (context=0x55b6e7244b20) at 
> ../glib/gmain.c:3454
> #23 g_main_context_dispatch (context=0x55b6e7244b20) at ../glib/gmain.c:4172
> #24 0x55b6e400a7e8 in glib_pollfds_poll () at ../util/main-loop.c:290
> #25 0x55b6e400a0c2 in os_host_main_loop_wait (timeout=0) at 
> ../util/main-loop.c:313
> #26 0x55b6e4009fa2 in main_loop_wait (nonblocking=0) at 
> 

[PATCH] async: avoid use-after-free on re-entrancy guard

2023-05-01 Thread Alexander Bulekov
A BH callback can free the BH, causing a use-after-free in aio_bh_call.
Fix that by keeping a local copy of the re-entrancy guard pointer.

Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58513
Fixes: 9c86c97f12 ("async: Add an optional reentrancy guard to the BH API")
Signed-off-by: Alexander Bulekov 
---
 util/async.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/util/async.c b/util/async.c
index 9df7674b4e..055070ffbd 100644
--- a/util/async.c
+++ b/util/async.c
@@ -156,18 +156,20 @@ void aio_bh_call(QEMUBH *bh)
 {
 bool last_engaged_in_io = false;
 
-if (bh->reentrancy_guard) {
-last_engaged_in_io = bh->reentrancy_guard->engaged_in_io;
-if (bh->reentrancy_guard->engaged_in_io) {
+/* Make a copy of the guard-pointer as cb may free the bh */
+MemReentrancyGuard *reentrancy_guard = bh->reentrancy_guard;
+if (reentrancy_guard) {
+last_engaged_in_io = reentrancy_guard->engaged_in_io;
+if (reentrancy_guard->engaged_in_io) {
 trace_reentrant_aio(bh->ctx, bh->name);
 }
-bh->reentrancy_guard->engaged_in_io = true;
+reentrancy_guard->engaged_in_io = true;
 }
 
 bh->cb(bh->opaque);
 
-if (bh->reentrancy_guard) {
-bh->reentrancy_guard->engaged_in_io = last_engaged_in_io;
+if (reentrancy_guard) {
+reentrancy_guard->engaged_in_io = last_engaged_in_io;
 }
 }
 
-- 
2.39.0