On 17/11/2021 13:51, Hanna Reitz wrote:
On 17.11.21 12:33, Emanuele Giuseppe Esposito wrote:
On 15/11/2021 13:48, Hanna Reitz wrote:
On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:
Signed-off-by: Emanuele Giuseppe Esposito <eespo...@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefa...@redhat.com>
---
block.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/block.c b/block.c
index 94bff5c757..40c4729b8d 100644
--- a/block.c
+++ b/block.c
[...]
@@ -2148,6 +2152,7 @@ static void bdrv_child_perm(BlockDriverState
*bs, BlockDriverState *child_bs,
uint64_t *nperm, uint64_t *nshared)
{
assert(bs->drv && bs->drv->bdrv_child_perm);
+ assert(qemu_in_main_thread());
bs->drv->bdrv_child_perm(bs, c, role, reopen_queue,
parent_perm, parent_shared,
nperm, nshared);
(Should’ve noticed earlier, but only did now...)
First, this function is indirectly called by bdrv_refresh_perms(). I
understand that all perm-related functions are classified as GS.
However, bdrv_co_invalidate_cache() invokes bdrv_refresh_perms. Being
declared in block/coroutine.h, it’s an I/O function, so it mustn’t
call such a GS function. BlockDriver.bdrv_co_invalidate_cache(),
bdrv_invalidate_cache(), and blk_invalidate_cache() are also
classified as I/O functions. Perhaps all of these functions should be
classified as GS functions? I believe their callers and their
purpose would allow for this.
I think that the *_invalidate_cache functions are I/O.
First of all, test-block-iothread.c calls bdrv_invalidate_cache in
test_sync_op_invalidate_cache, which is purposefully called in an
iothread. So that hints that we want it as I/O.
Hm, OK, but bdrv_co_invalidate_cache() calls bdrv_refresh_perms(), which
is a GS function, so that shouldn’t work, right?
Ok let's take a step back for one moment: can you tell me why the perm
functions should be GS?
On one side I see they are also used by I/O, as we can see above. On the
other side, I kinda see that permission should only be modified under
BQL. But I don't have any valid point to sustain that.
So I wonder if you have any specific and more valid reason to put them
as GS.
Maybe clarifying this will help finding a clean solution to this problem.
(Small mistake I just noticed: blk_invalidate_cache has the BQL
assertion even though it is rightly put in block-backend-io.h
Second, it’s called by bdrv_child_refresh_perms(), which is called by
block_crypto_amend_options_generic_luks(). This function is called
by block_crypto_co_amend_luks(), which is a BlockDriver.bdrv_co_amend
implementation, which is classified as an I/O function.
Honestly, I don’t know how to fix that mess. The best would be if we
could make the perm functions thread-safe and classify them as I/O,
but it seems to me like that’s impossible (I sure hope I’m wrong). On
the other hand, .bdrv_co_amend very much strikes me like a GS
function, but it isn’t. I’m afraid it must work on nodes that are
not in the main context, and it launches a job, so AFAIU we
absolutely cannot run it under the BQL.
It almost seems to me like we’d need a thread-safe variant of the
perm functions that’s allowed to fail when it cannot guarantee thread
safety or something. Or perhaps I’m wrong and the perm functions can
actually be classified as thread-safe and I/O, that’d be great…
I think that since we are currently only splitting and not taking care
of the actual I/O thread safety, we can move the _perm functions in
I/O, and add a nice TODO to double check their thread safety.
:/
I would really, really like to avoid that unless it’s clear that we can
make them thread-safe, or that there’s a way to take the BQL in I/O
functions to call GS functions. But the latter still wouldn’t make the
perm functions I/O functions. At most, I’d sort them under common
functions.
I mean, if they are not thread-safe after the split it means they are
not thread safe also right now.
Yes, sorry I wasn’t clear, I think there’s a pre-existing problem that
your series only unveils. I don’t know whether it has implications in
practice yet.
Hanna