On 23/07/2015 19:20, Paolo Bonzini wrote:
>
>
> On 23/07/2015 16:14, Cornelia Huck wrote:
>> (gdb) bt
>> #0  0x000003fffc5871b4 in pthread_cond_wait@@GLIBC_2.3.2 ()
>>     from /lib64/libpthread.so.0
>> #1  0x000000008024cfca in qemu_cond_wait (cond=cond@entry=0x9717d950,
>>     mutex=mutex@entry=0x9717d920)
>>     at /data/git/yyy/qemu/util/qemu-thread-posix.c:132
>> #2  0x000000008025e83a in rfifolock_lock (r=0x9717d920)
>>     at /data/git/yyy/qemu/util/rfifolock.c:59
>> #3  0x00000000801b78fa in aio_context_acquire (ctx=<optimized out>)
>>     at /data/git/yyy/qemu/async.c:331
>> #4  0x000000008007ceb4 in virtio_blk_data_plane_start (s=0x9717d710)
>>     at /data/git/yyy/qemu/hw/block/dataplane/virtio-blk.c:285
>> #5  0x000000008007c64a in virtio_blk_handle_output (vdev=<optimized out>,
>>     vq=<optimized out>) at /data/git/yyy/qemu/hw/block/virtio-blk.c:599
>> #6  0x00000000801c56dc in qemu_iohandler_poll (pollfds=0x97142800,
>>     ret=ret@entry=1) at /data/git/yyy/qemu/iohandler.c:126
>> #7  0x00000000801c5178 in main_loop_wait (nonblocking=<optimized out>)
>>     at /data/git/yyy/qemu/main-loop.c:494
>> #8  0x0000000080013ee2 in main_loop () at /data/git/yyy/qemu/vl.c:1902
>> #9  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
>>     at /data/git/yyy/qemu/vl.c:4653
>>
>> I've stripped down the setup to the following commandline:
>>
>> /data/git/yyy/qemu/build/s390x-softmmu/qemu-system-s390x -machine
>> s390-ccw-virtio-2.4,accel=kvm,usb=off -m 1024 -smp
>> 4,sockets=4,cores=1,threads=1 -nographic -drive
>> file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none,aio=native
>> -device
>> virtio-blk-ccw,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,x-data-plane=on
>
> What's the backtrace like for the other threads?  This is almost
> definitely a latent bug somewhere else.

BTW, I can reproduce this---I'm asking because I cannot even attach gdb
to the hung process.

The simplest workaround is to reintroduce commit a0710f7995 (iothread:
release iothread around aio_poll, 2015-02-20), though it also comes with
some risk.  It avoids the bug because it limits the contention on the
RFifoLock.

Paolo