On Sun, Apr 28, 2024 at 03:39:09PM +0300, Michael Tokarev wrote: > 17.04.2024 05:56, Li Zhijian via wrote: > > bdrv_activate_all() should not be called from the coroutine context, move > > it to the QEMU thread colo_process_incoming_thread() with the bql_lock > > protected. > > > > The backtrace is as follows: > > #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at > > ../block/graph-lock.c:260 > > #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop > > (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259 > > #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at > > ../block.c:6906 > > #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935 > > #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at > > ../migration/migration.c:793 > > #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) > > at ../util/coroutine-ucontext.c:175 > > #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6 > > > > CC: Fabiano Rosas <faro...@suse.de> > > Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277 > > Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() > > GRAPH_RDLOCK") > > Signed-off-by: Li Zhijian <lizhij...@fujitsu.com> > > Commit 2b3912f135 is in 8.2 (v8.1.0-1575-g2b3912f135). Is this fix supposed > to go > to stable-8.2 series?
Yes we probably should. I've copied qemu-stable when queuing this patch for that: https://lore.kernel.org/r/20240423223813.3237060-27-pet...@redhat.com > The prob here is that in 8.2, there's no bql_lock/unlock. > I guess it should use qemu_mutex_lock_iothread() instead, for before > 195801d700c008 "system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()". Right, some further care needs to be taken on this one indeed. > 17.04.2024 05:56, Li Zhijian via wrote: > > --- a/migration/colo.c > > +++ b/migration/colo.c > > @@ -835,6 +835,16 @@ static void *colo_process_incoming_thread(void *opaque) > > return NULL; > > } > > + /* Make sure all file formats throw away their mutable metadata */ > > + bql_lock(); > > + bdrv_activate_all(&local_err); > > + if (local_err) { > > + bql_unlock(); > > + error_report_err(local_err); > > + return NULL; > > + } > > + bql_unlock(); > > FWIW, this can be simplified as follows: > > bql_lock(); > bdrv_activate_all(&local_err); > bql_unlock(); > if (local_err) { > error_report_err(local_err); > return NULL; > } > > (I know it is already too late) Agree. :) Thanks, -- Peter Xu