Akihiko Odaki <[email protected]> writes:

> On 2026/05/19 4:35, Alex Bennée wrote:
>> Akihiko Odaki <[email protected]> writes:
>>
>>> This fixes a deadlock I previously observed with the test in [1].
>>>
>>> However, I can no longer reproduce the issue reliably with that test, so
>>> I used Codex, a coding agent, to write a more reliable local test case,
>>> shown below. I applied to Codex for Open Source to get access. The test
>>> case is not intended for merge: current policy prohibits that, and it is
>>> probably not worth carrying anyway because race-condition tests are
>>> inherently fragile.
>> What sort of hit rate were you getting with the race? So far they have
>> both been rock solid without the additional patches for me.
>
> I hit the deadlock in 8 out of 10 trials.

It takes a lot longer to hit on my system (~1 in 100), but with these
patches I'm still seeing a hang; it just takes a lot longer to get there.

>
>>
>>> The remaining patches were written by me.
>>>
>>> [1]
>>> https://lore.kernel.org/qemu-devel/[email protected]/
>>>
>>> To: [email protected]
>>> Cc: Alex Bennée <[email protected]>
>>> Cc: Dmitry Osipenko <[email protected]>
>>> Cc: Michael S. Tsirkin <[email protected]>
>>> Signed-off-by: Akihiko Odaki <[email protected]>
>>>
>>> Below is the Codex-written test case:
>>>
>>> diff --git a/tests/functional/aarch64/test_gpu_blob.py b/tests/functional/aarch64/test_gpu_blob.py
>>> index a913d3b29c84..52627b4541f9 100755
>>> --- a/tests/functional/aarch64/test_gpu_blob.py
>>> +++ b/tests/functional/aarch64/test_gpu_blob.py
>>> @@ -13,7 +13,9 @@
>>>  #
>>>  # SPDX-License-Identifier: GPL-2.0-or-later
>>> -from qemu.machine.machine import VMLaunchFailure
>>> +import subprocess
>>> +
>>> +from qemu.machine.machine import AbnormalShutdown, VMLaunchFailure
>>>  from qemu_test import Asset
>>>  from qemu_test import wait_for_console_pattern
>>> @@ -25,8 +27,7 @@ class Aarch64VirtBlobTest(LinuxKernelTest):
>>>          'download?path=%2Fblob-test&files=qemu-880.bin',
>>>
>>>          '2f6ab85d0b156c94fcedd2c4c821c5cbd52925a2de107f8e2d569ea2e34e42eb')
>>> -    def test_virtio_gpu_blob(self):
>>> -
>>> +    def launch_blob_test(self):
>>>          self.set_machine('virt')
>>>          self.require_accelerator("tcg")
>>> @@ -65,9 +66,27 @@ def test_virtio_gpu_blob(self):
>>>                  self.log.info("unhandled launch failure: %s", excp.output)
>>>                  raise excp
>>> +    def test_virtio_gpu_blob(self):
>>> +        self.launch_blob_test()
>>> +
>>>          self.wait_for_console_pattern('[INFO] virtio-gpu test finished')
>>>          # the test should cleanly exit
>>> +    def test_virtio_gpu_blob_shutdown_race(self):
>>> +        self.launch_blob_test()
>>> +
>>> +        self.wait_for_console_pattern('[INFO] unmapping blob object resource')
>>> +
>>> +        try:
>>> +            self.vm.shutdown(timeout=10)
>>> +        except AbnormalShutdown as excp:
>>> +            if isinstance(excp.__cause__, subprocess.TimeoutExpired):
>>> +                raise AssertionError(
>>> +                    "QEMU failed to exit while virtio-gpu reset was racing "
>>> +                    "with shutdown") from excp
>>> +            self.log.info("QEMU exited before the shutdown request completed: %s",
>>> +                          excp)
>>> +
>>>  if __name__ == '__main__':
>>>      LinuxKernelTest.main()
>>>
>>> ---
>>> Changes in v2:
>>> - Added the patch "virtio-gpu: Run reset cleanup in the same BH".
>>> - My assumption about the ordering was incorrect, so I changed the patch
>>>   to follow the approach used by virtio-gpu-gl.
>>> - Link to v1:
>>>   https://lore.kernel.org/qemu-devel/[email protected]
>>>
>>> ---
>>> Akihiko Odaki (2):
>>>       virtio-gpu: Run reset cleanup in the same BH
>>>       virtio-gpu: Do not wait for the main thread during reset
>>>
>>>  include/hw/virtio/virtio-gpu.h |  4 +--
>>>  hw/display/virtio-gpu.c        | 60 ++++++++++++++++++++----------------------
>>>  2 files changed, 30 insertions(+), 34 deletions(-)
>>> ---
>>> base-commit: 14f38a63b9adc02c0ebe3b5ada1f1208abaf21ea
>>> change-id: 20251029-gpu-c3f45747f7ba
>>>
>>> Best regards,
>>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
