Hi all!

v2: add the fix :)

We've faced the following mirror bug: just run mirror on a qcow2 image
larger than 1G, and qemu deadlocks.

The deadlock is described in patch 01. In short, we have an extra
aio_context_acquire and aio_context_release around blk_aio_pwritev in
mirror_read_complete. So the write may yield to the main loop while the
aio context is still acquired. The main loop then hangs trying to lock
the BQL, which is held by the cpu thread, and the cpu thread hangs
trying to acquire the aio context.

Hm, now the thing looks fixed, but I still have some questions:

Is it a common thing that we can't yield inside
aio_context_acquire/release?

Was commit b9e413dd3756 "block: explicitly acquire aiocontext in aio
callbacks that need it" wrong? Why did it add these acquire/release
calls, when multiple-iothreads.txt says: "Side note: the best way to
schedule a function call across threads is to call
aio_bh_schedule_oneshot(). No acquire/release or locking is needed."

Can someone briefly describe what the BQL and the aio context lock mean,
what they protect, and how they should cooperate?

Vladimir Sementsov-Ogievskiy (2):
  mirror: fix dead-lock
  iotests: simple mirror test with kvm on 1G image

 block/mirror.c             | 13 ++++-----
 tests/qemu-iotests/235     | 59 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/235.out |  1 +
 tests/qemu-iotests/group   |  1 +
 4 files changed, 66 insertions(+), 8 deletions(-)
 create mode 100755 tests/qemu-iotests/235
 create mode 100644 tests/qemu-iotests/235.out

-- 
2.18.0