On 03/17/2016 01:22 PM, tu bo wrote:
>
> On 03/16/2016 09:38 PM, Christian Borntraeger wrote:
>> On 03/16/2016 01:55 PM, Paolo Bonzini wrote:
>>>
>>>
>>> On 16/03/2016 12:24, Christian Borntraeger wrote:
>>>> On 03/16/2016 12:09 PM, Paolo Bonzini wrote:
>>>>> On 16/03/2016 11:49, Christian Borntraeger wrote:
>>>>>> #3 0x00000000800b713e in virtio_blk_data_plane_start (s=0xba232d80) at
>>>>>> /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:224
>>>>>> #4 0x00000000800b4ea0 in virtio_blk_handle_output (vdev=0xb9eee7e8,
>>>>>> vq=0xba305270) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:590
>>>>>> #5 0x00000000800ef3dc in virtio_queue_notify_vq (vq=0xba305270) at
>>>>>> /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1095
>>>>>> #6 0x00000000800f1c9c in virtio_queue_host_notifier_read (n=0xba3052c8)
>>>>>> at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1785
>>>
>>> If you just remove the calls to virtio_queue_host_notifier_read, here
>>> and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
>>> (keeping patches 2-4 in)?
>>
>> With these changes and patch 2-4 it does no longer locks up.
>> I keep it running some hour to check if a crash happens.
>>
>> Tu Bo, your setup is currently better suited for reproducing. Can you also
>> check?
>
> remove the calls to virtio_queue_host_notifier_read, and keeping patches 2-4
> in,
>
> I got same crash as before,
> (gdb) bt
> #0 bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> #1 0x000002aa0c65d786 in coroutine_trampoline (i0=<optimized out>,
> i1=-2013204784) at util/coroutine-ucontext.c:79
> #2 0x000003ff99ad150a in __makecontext_ret () from /lib64/libc.so.6
>

As an interesting side note, I updated my system from F20 to F23 some days ago
(after the initial report), while Tu Bo is still on an F20 system. I was not
able to reproduce the original crash on F23, but going back to F20 made the
problem reappear.

Stack trace of thread 26429:
#0 0x00000000802008aa tracked_request_begin (qemu-system-s390x)
#1 0x0000000080203f3c bdrv_co_do_preadv (qemu-system-s390x)
#2 0x000000008020567c bdrv_co_do_readv (qemu-system-s390x)
#3 0x000000008025d0f4 coroutine_trampoline (qemu-system-s390x)
#4 0x000003ff943d150a __makecontext_ret (libc.so.6)

This is with patches 2-4 plus the removal of virtio_queue_host_notifier_read.

Without removing virtio_queue_host_notifier_read, I get the same mutex lockup
(as expected).

Maybe we have two independent issues here, and this is some old bug in glibc
or whatever?

>
>>
>>>
>>> Paolo
>>>
>>>>>> #7 0x00000000800f1e14 in virtio_queue_set_host_notifier_fd_handler
>>>>>> (vq=0xba305270, assign=false, set_handler=false) at
>>>>>> /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1817
>>>>>> #8 0x0000000080109c50 in virtio_ccw_set_guest2host_notifier
>>>>>> (dev=0xb9eed6a0, n=0, assign=false, set_handler=false) at
>>>>>> /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:97
>>>>>> #9 0x0000000080109ef2 in virtio_ccw_stop_ioeventfd (dev=0xb9eed6a0) at
>>>>>> /home/cborntra/REPOS/qemu/hw/s390x/virtio-ccw.c:154
>>>
>>