Hi Kevin,

Am 26.03.24 um 13:44 schrieb Kevin Wolf:
> Am 22.03.2024 um 10:50 hat Fiona Ebner geschrieben:
>> The old_bs variable in bdrv_next() is currently determined by looking
>> at the old block backend. However, if the block graph changes before
>> the next bdrv_next() call, it might be that the associated BDS is not
>> the same that was referenced previously. In that case, the wrong BDS
>> is unreferenced, leading to an assertion failure later:
>>
>>> bdrv_unref: Assertion `bs->refcnt > 0' failed.
> 
> Your change makes sense, but in theory it shouldn't make a difference.
> The real bug is in the caller, you can't allow graph modifications while
> iterating the list of nodes. Even if it doesn't crash (like after your
> patch), you don't have any guarantee that you will have seen every node
> that exists that the end - and maybe not even that you don't get the
> same node twice.
> 
>> In particular, this can happen in the context of bdrv_flush_all(),
>> when polling for bdrv_co_flush() in the generated co-wrapper leads to
>> a graph change (for example with a stream block job [0]).
> 
> The whole locking around this case is a bit tricky and would deserve
> some cleanup.
> 
> The basic rule for bdrv_next() callers is that they need to hold the
> graph reader lock as long as they are iterating the graph, otherwise
> it's not safe. This implies that the ref/unref pairs in it should never
> make a difference either - which is important, because at least
> releasing the last reference is forbidden while holding the graph lock.
> I intended to remove the ref/unref for bdrv_next(), but I didn't because
> I realised that the callers need to be audited first that they really
> obey the rules. You found one that would be problematic.
> 
> The thing that bdrv_flush_all() gets wrong is that it promises to follow
> the graph lock rules with GRAPH_RDLOCK_GUARD_MAINLOOP(), but then calls
> something that polls. The compiler can't catch this because bdrv_flush()
> is a co_wrapper_mixed_bdrv_rdlock. The behaviour for these functions is:
> 
> - If called outside of coroutine context, they are GRAPH_UNLOCKED
> - If called in coroutine context, they are GRAPH_RDLOCK
> 
> We should probably try harder to get rid of the mixed functions, because
> a synchronous co_wrapper_bdrv_rdlock could actually be marked
> GRAPH_UNLOCKED in the code and then the compiler could catch this case.
> 
> The fix for bdrv_flush_all() is probably to make it bdrv_co_flush_all()
> with a coroutine wrapper so that the graph lock is held for the whole
> function. Then calling bdrv_co_flush() while iterating the list is safe
> and doesn't allow concurrent graph modifications.

I attempted this now, but ran into two issues:

The first is that I had to add support for a function without arguments
to the block-coroutine-wrapper.py script. I can send this as an RFC in
any case if desired.

The second is that iotest 255 ran into an assertion failure upon QMP 'quit':

> ../block/graph-lock.c:113: bdrv_graph_wrlock: Assertion 
> `!qemu_in_coroutine()' failed.

Looking at the backtrace:

> #5  0x0000762a90cc3eb2 in __GI___assert_fail
>     (assertion=0x5afb07991e7d "!qemu_in_coroutine()", file=0x5afb07991e00 
> "../block/graph-lock.c", line=113, function=0x5afb07991f20 
> <__PRETTY_FUNCTION__.4> "bdrv_graph_wrlock")
>     at ./assert/assert.c:101
> #6  0x00005afb07585311 in bdrv_graph_wrlock () at ../block/graph-lock.c:113
> #7  0x00005afb07573a36 in blk_remove_bs (blk=0x5afb0af99420) at 
> ../block/block-backend.c:901
> #8  0x00005afb075729a7 in blk_delete (blk=0x5afb0af99420) at 
> ../block/block-backend.c:487
> #9  0x00005afb07572d88 in blk_unref (blk=0x5afb0af99420) at 
> ../block/block-backend.c:547
> #10 0x00005afb07572fe8 in bdrv_next (it=0x762a852fef00) at 
> ../block/block-backend.c:618
> #11 0x00005afb0758cd65 in bdrv_co_flush_all () at ../block/io.c:2347
> #12 0x00005afb0753ba37 in bdrv_co_flush_all_entry (opaque=0x7ffff12c6050) at 
> block/block-gen.c:1391
> #13 0x00005afb0773bf41 in coroutine_trampoline (i0=168365184, i1=23291)

So I guess calling bdrv_next() is not safe from a coroutine, because the
function doing the iteration could end up being the last thing to have a
reference for the BB.

Best Regards,
Fiona


Reply via email to