Am 25.02.2021 um 12:52 hat Vladimir Sementsov-Ogievskiy geschrieben: > Hi all! It occurs that nothing prevents discarding and reallocating host > cluster during data writing. This way data writing will pollute another > newly allocated cluster of data or metadata. > > OK, v2 is a try to solve the problem with CoRwlock.. And it is marked > RFC, because of a lot of iotest failures.. Some of problems with v2: > > 1. It's a more complicated to make a test, as everything is blocking > and I can't just break write and do discard.. I have to implement > aio_discard in qemu-io and rewrite test into several portions of io > commands splitted by "sleep 1".. OK, it's not a big problem, and I've > solved it.
Right, this just demonstrates that it's doing what it's supposed to. > 2. iotest 7 fails with several leaked clusters. Seems, that it depend on > the fact that discard may be done in parallel with writes. Iotest 7 does > snapshots, so I think l1 table is updated to the moment when discard is > finally unlocked.. But I didn't dig into it, it's all my assumptions. This one looks a bit odd, but it may be related to the bug in your code that you forgot that qcow2_cluster_discard() is not a coroutine_fn. Later tests fail during the unlock: qemu-img: ../util/qemu-coroutine-lock.c:358: qemu_co_rwlock_unlock: Assertion `qemu_in_coroutine()' failed. #0 0x00007ff33f7d89d5 in raise () from /lib64/libc.so.6 #1 0x00007ff33f7c18a4 in abort () from /lib64/libc.so.6 #2 0x00007ff33f7c1789 in __assert_fail_base.cold () from /lib64/libc.so.6 #3 0x00007ff33f7d1026 in __assert_fail () from /lib64/libc.so.6 #4 0x0000556f4ffd3c94 in qemu_co_rwlock_unlock (lock=0x556f51f63ca0) at ../util/qemu-coroutine-lock.c:358 #5 0x0000556f4fef5e09 in qcow2_cluster_discard (bs=0x556f51f56a80, offset=37748736, bytes=0, type=QCOW2_DISCARD_NEVER, full_discard=false) at ../block/qcow2-cluster.c:1970 #6 0x0000556f4ff46c23 in qcow2_snapshot_create (bs=0x556f51f56a80, sn_info=0x7fff89fb9a30) at ../block/qcow2-snapshot.c:736 #7 0x0000556f4ff0d7b6 in bdrv_snapshot_create (bs=0x556f51f56a80, sn_info=0x7fff89fb9a30) at ../block/snapshot.c:227 #8 0x0000556f4fe85526 in img_snapshot (argc=4, argv=0x7fff89fb9d30) at ../qemu-img.c:3337 #9 0x0000556f4fe8a227 in main (argc=4, argv=0x7fff89fb9d30) at ../qemu-img.c:5375 > 3. iotest 13 (and I think a lot more iotests) crashes on > assert(!to->locks_held); .. So with this assertion we can't keep rwlock > locked during data writing... > > #3 in __assert_fail () from /lib64/libc.so.6 > #4 in qemu_aio_coroutine_enter (ctx=0x55762120b700, co=0x55762121d700) > at ../util/qemu-coroutine.c:158 > #5 in aio_co_enter (ctx=0x55762120b700, co=0x55762121d700) at > ../util/async.c:628 > #6 in aio_co_wake (co=0x55762121d700) at ../util/async.c:612 > #7 in thread_pool_co_cb (opaque=0x7f17950daab0, ret=0) at > ../util/thread-pool.c:279 > #8 in thread_pool_completion_bh (opaque=0x5576211e5070) at > ../util/thread-pool.c:188 > #9 in aio_bh_call (bh=0x557621205df0) at ../util/async.c:136 > #10 in aio_bh_poll (ctx=0x55762120b700) at ../util/async.c:164 > #11 in aio_poll (ctx=0x55762120b700, blocking=true) at > ../util/aio-posix.c:659 > #12 in blk_prw (blk=0x557621205790, offset=4303351808, > buf=0x55762123e000 '\364' <repeats 199 times>, <incomplete sequence > \364>..., bytes=12288, > co_entry=0x557620d9dc97 <blk_write_entry>, flags=0) at > ../block/block-backend.c:1335 > #13 in blk_pwrite (blk=0x557621205790, offset=4303351808, > buf=0x55762123e000, > count=12288, flags=0) at ../block/block-backend.c:1501 This is another bug in your code: A taken lock belongs to its coroutine. You can't lock in one coroutine and unlock in another. The changes you made to the write code seem unnecessarily complicated anyway: Why not just qemu_co_rwlock_rdlock() right at the start of qcow2_co_pwritev_part() and unlock at its end, without taking and dropping the lock repeatedly? It makes both the locking more obviously correct and also fixes the bug (013 passes with this change). > So now I think that v1 is simpler.. It's more complicated (but not too > much) in code. But it keeps discards and data writes non-blocking each > other and avoids yields in critical sections. I think an approach with additional data structures is almost certainly more complex and harder to maintain (and as the review from Max showed, also to understand). I wouldn't give up yet on the CoRwlock based approach, it's almost trivial code in comparison. True, making qcow2_cluster_discard() a coroutine_fn requires a preparational patch that is less trivial, but at least this can be seen as something that we would want to do sooner or later anyway. Kevin