On 20.12.2021 at 13:20, Emanuele Giuseppe Esposito wrote:
> On 17/12/2021 17:38, Emanuele Giuseppe Esposito wrote:
> > On 17/12/2021 12:04, Hanna Reitz wrote:
> > > On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
> > > > bdrv_co_invalidate_cache is special: it is an I/O function,
> > >
> > > I still don’t believe it is, but well.
> > >
> > > (Yes, it is called by a test in an iothread, but I believe we’ve
> > > seen that the tests simply sometimes test things that shouldn’t be
> > > allowed.)
> > >
> > > > but uses the block layer permission API, which is GS.
> > > >
> > > > Because of this, we can assert that either the function is
> > > > being called with BQL held, and thus can use the permission API,
> > > > or make sure that the permission API is not used, by ensuring that
> > > > bs (and parents) .open_flags does not contain BDRV_O_INACTIVE.
> > > >
> > > > Signed-off-by: Emanuele Giuseppe Esposito <eespo...@redhat.com>
> > > > ---
> > > >  block.c | 26 ++++++++++++++++++++++++++
> > > >  1 file changed, 26 insertions(+)
> > > >
> > > > diff --git a/block.c b/block.c
> > > > index a0309f827d..805974676b 100644
> > > > --- a/block.c
> > > > +++ b/block.c
> > > > @@ -6574,6 +6574,26 @@ void bdrv_init_with_whitelist(void)
> > > >      bdrv_init();
> > > >  }
> > > > +static bool bdrv_is_active(BlockDriverState *bs)
> > > > +{
> > > > +    BdrvChild *parent;
> > > > +
> > > > +    if (bs->open_flags & BDRV_O_INACTIVE) {
> > > > +        return false;
> > > > +    }
> > > > +
> > > > +    QLIST_FOREACH(parent, &bs->parents, next_parent) {
> > > > +        if (parent->klass->parent_is_bds) {
> > > > +            BlockDriverState *parent_bs = parent->opaque;
> > >
> > > This looks like a really bad hack to me.  We purposefully have made
> > > the parent link opaque so that a BDS cannot easily reach its
> > > parents.  All accesses should go through BdrvChildClass methods.
> > >
> > > I also don’t understand why we need to query parents at all.
> > > The only fact that determines whether the current BDS will have its
> > > permissions changed is whether the BDS itself is active or
> > > inactive.  Sure, we’ll invoke bdrv_co_invalidate_cache() on the
> > > parents, too, but then we could simply let the assertion fail there.
> > >
> > > > +            if (!bdrv_is_active(parent_bs)) {
> > > > +                return false;
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > +    return true;
> > > > +}
> > > > +
> > > >  int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
> > > >  {
> > > >      BdrvChild *child, *parent;
> > > > @@ -6585,6 +6605,12 @@ int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
> > > >          return -ENOMEDIUM;
> > > >      }
> > > > +    /*
> > > > +     * No need to muck with permissions if bs is active.
> > > > +     * TODO: should activation be a separate function?
> > > > +     */
> > > > +    assert(qemu_in_main_thread() || bdrv_is_active(bs));
> > > > +
> > >
> > > I don’t understand this, really.  It looks to me like “if you don’t
> > > call this in the main thread, this better be a no-op”, i.e., you
> > > must never call this function in an I/O thread if you really want to
> > > use it.  I.e. what I’d classify as a GS function.
> > >
> > > It sounds like this is just a special case for said test, and
> > > special-casing code for tests sounds like a bad idea.
> >
> > Ok, but trying to leave just the qemu_in_main_thread() assertion makes
> > test 307 (./check 307) fail.
> > I am actually not sure why it fails, but I am sure it is because of
> > the assertion, since without it the test passes.
> >
> > I tried with gdb (./check -gdb 307 in one terminal and
> > gdb -iex "target remote localhost:12345"
> > in another), but it points me to the backtrace below, which I think is
> > the NBD server getting its socket closed (because the other side
> > crashed), and not the actual error.
> >
> > Thread 1 "qemu-system-x86" received signal SIGPIPE, Broken pipe.
> > 0x00007ffff68af54d in sendmsg () from target:/lib64/libc.so.6
> > (gdb) bt
> > #0  0x00007ffff68af54d in sendmsg () from target:/lib64/libc.so.6
> > #1  0x0000555555c13cc9 in qio_channel_socket_writev (ioc=<optimized out>, iov=0x5555569a4870, niov=1, fds=0x0, nfds=<optimized out>, errp=0x0) at ../io/channel-socket.c:561
> > #2  0x0000555555c19b18 in qio_channel_writev_full_all (ioc=0x55555763b800, iov=iov@entry=0x7fffe8dffd80, niov=niov@entry=1, fds=fds@entry=0x0, nfds=nfds@entry=0, errp=errp@entry=0x0) at ../io/channel.c:240
> > #3  0x0000555555c19bd2 in qio_channel_writev_all (errp=0x0, niov=1, iov=0x7fffe8dffd80, ioc=<optimized out>) at ../io/channel.c:220
> > #4  qio_channel_write_all (ioc=<optimized out>, buf=buf@entry=0x7fffe8dffdd0 "", buflen=buflen@entry=20, errp=errp@entry=0x0) at ../io/channel.c:330
> > #5  0x0000555555c27e75 in nbd_write (errp=0x0, size=20, buffer=0x7fffe8dffdd0, ioc=<optimized out>) at ../nbd/nbd-internal.h:71
> > #6  nbd_negotiate_send_rep_len (client=client@entry=0x555556f60930, type=type@entry=1, len=len@entry=0, errp=errp@entry=0x0) at ../nbd/server.c:203
> > #7  0x0000555555c29db1 in nbd_negotiate_send_rep (errp=0x0, type=1, client=0x555556f60930) at ../nbd/server.c:211
> > #8  nbd_negotiate_options (errp=0x7fffe8dffe88, client=<optimized out>) at ../nbd/server.c:1224
> > #9  nbd_negotiate (errp=0x7fffe8dffe88, client=<optimized out>) at ../nbd/server.c:1340
> > #10 nbd_co_client_start (opaque=<optimized out>) at ../nbd/server.c:2715
> > #11 0x0000555555d70423 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:173
> > #12 0x00007ffff67f3820 in ?? () from target:/lib64/libc.so.6
> > #13 0x00007fffffffca80 in ?? ()
>
> Ok, the reason for this is line 89 of tests/qemu-iotests/307:
>
> # Should ignore the iothread conflict, but then fail because of the
> # permission conflict (and not crash)
> vm.qmp_log('block-export-add', id='export1', type='nbd', node_name='null',
>            iothread='iothread1', fixed_iothread=False, writable=True)
>
> This calls qmp_block_export_add() and then blk_exp_add(), which calls
> bdrv_invalidate_cache().
> Both functions run in the main thread, but bdrv_invalidate_cache() then
> invokes bdrv_co_invalidate_cache() as a coroutine in the AioContext of
> the given bs, so not in the main loop.
>
> So Hanna, what should we do here? This seems very similar to the
> discussion on patch 22, i.e. running blockdev-create (in this case
> block-export-add, which seems similar?) involving nodes in I/O threads.
My gut feeling is that the bug is that we're entering coroutine context in
the node's iothread too early. I think the only place where we really need
it is the bs->drv->bdrv_co_invalidate_cache() call.

In fact, not only the permission code, but also parent->klass->activate()
must run in the main thread. The only implementation of that callback is
blk_root_activate(), and it calls some permission functions, too, but also
qemu_add_vm_change_state_handler(), which doesn't look thread safe. This is
unlikely to ever cause problems, and if it does, it won't be reproducible,
but it is still a bug.

So if we go back to a bdrv_invalidate_cache() that does all the graph
manipulations (and asserts that we're in the main loop) and then have a
much smaller bdrv_co_invalidate_cache() that basically just calls into the
driver, would that solve the problem?

Kevin