Hi all,

As we move forward with new features in the block layer, the chances of
hitting tricky bugs have been increasing alongside: block jobs,
coroutines, throttling, AioContext, op blockers and image locking
combine into a large and complex picture that is hard to fully
understand and work with. Some of the bugs we've encountered are
already quite challenging. Examples are:

- segfault in parallel blockjobs (iotest 30)
  https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01144.html

- Intermittent hang of iotest 194 (bdrv_drain_all after non-shared
  storage migration)
  https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01626.html

- Drainage in bdrv_replace_child_noperm()
  https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg00868.html

- Regression from 2.8: stuck in bdrv_drain()
  https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg02193.html

So in principle, what should we do to make the block layer easy to
understand, develop with and debug? I think we have opportunities in
these aspects:

- Documentation

  There is no central developer doc about the block layer, especially
  about how all the pieces fit together. Having one will make it a lot
  easier for new contributors to get up to speed. Of course, we're
  facing the old problem: the code keeps moving, and maintaining an
  up-to-date document takes effort.

  Idea: add ./docs/devel/block.txt?

- Tests

  Writing tests is a great way not only to exercise code, verify that
  new features work as expected and catch regressions, but also to show
  how a feature can be used. The QEMU user interface is gradually
  moving from high level commands and args to small and flexible
  building blocks, so demonstrating the usage in iotests is meaningful.

  Idea: Add tests to simulate how libvirt uses the block layer, or how
  we expect it to. This would be a long term investment. We could
  reuse iotests, or create a new test framework specifically, if that's
  easier (for example, use docker/vm tests that just use libvirt).

  Idea: Patchew already tests the quick group of iotests for a few
  formats/protocols, but we should really add it to "make check".

- Simplified code, or more orthogonal/modularized architecture.

  Each aspect of the block layer is complex enough on its own, so
  isolating them as much as possible is a reasonable approach to
  control the complexity.
  Block jobs and throttling becoming block filters is a good example;
  we should identify more.

  Idea: rethink event loops. Create coroutines ubiquitously (for
  example for each fd handler, BH and timer), so that many nested
  aio_poll() calls can be removed.

  Crazy idea: move the whole block layer to a vhost process, and
  implement existing features differently, especially in terms of
  multi-threading (hint: rust?).

- Debuggability.

  Working with backtraces when coroutines are used is pretty hard. It
  would be nice if ./scripts/qemugdb/coroutine.py could work with core
  files (i.e. without a process to debug), and trace back to
  co->caller automatically.

  It's always useful to dump the block graph. Maybe we should add a
  helper function in the block layer that dumps all node graphs in
  graphviz DOT format, and even make it available in QMP as
  x-dump-block-graph?

  Of course gdb scripts to dump various lists are also nice little
  things to have.

  Idea: write more ./scripts/qemugdb/<scriptlet>.py.

More thoughts?

Fam
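
To make the graph dump idea a bit more concrete, here is a rough Python
sketch of what the DOT output could look like. The function name
block_graph_to_dot, the dict/tuple representation of nodes and edges,
and the example node names are all made up for illustration. Nothing
here is an existing QEMU interface, and a real helper would walk the
block layer's own graph rather than take hand-written input.

```python
def block_graph_to_dot(nodes, edges):
    """Render a block graph as Graphviz DOT text.

    Hypothetical input format (not a QEMU API):
      nodes: dict mapping node name -> driver name, e.g. {"top": "qcow2"}
      edges: list of (parent, child, role) tuples, e.g. role "backing"
    Names are assumed not to contain double quotes.
    """
    lines = ["digraph block_graph {"]
    for name, driver in sorted(nodes.items()):
        # "\\n" emits a literal backslash-n, which DOT renders as a line break
        lines.append('    "%s" [label="%s\\n(%s)"];' % (name, name, driver))
    for parent, child, role in edges:
        lines.append('    "%s" -> "%s" [label="%s"];' % (parent, child, role))
    lines.append("}")
    return "\n".join(lines)


# Made-up example: a qcow2 overlay on top of a raw backing image
example_nodes = {"top": "qcow2", "top-file": "file", "base": "raw"}
example_edges = [
    ("top", "top-file", "file"),
    ("top", "base", "backing"),
]

if __name__ == "__main__":
    print(block_graph_to_dot(example_nodes, example_edges))
```

Piping the output through "dot -Tpng" would give a picture of the
graph. Whether the text is produced inside QEMU or by a gdb scriptlet
is an open question.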