On Tue, May 12, 2020 at 11:26:11AM +0800, Jason Wang wrote:
> 
> On 2020/5/11 下午5:11, Dima Stepanov wrote:
> >On Mon, May 11, 2020 at 11:05:58AM +0800, Jason Wang wrote:
> >>On 2020/4/30 下午9:36, Dima Stepanov wrote:
> >>>Since disconnect can happen at any time during initialization not all
> >>>vring buffers (for instance used vring) can be intialized successfully.
> >>>If the buffer was not initialized then vhost_memory_unmap call will lead
> >>>to SIGSEGV. Add checks for the vring address value before calling unmap.
> >>>Also add assert() in the vhost_memory_unmap() routine.
> >>>
> >>>Signed-off-by: Dima Stepanov <dimas...@yandex-team.ru>
> >>>---
> >>> hw/virtio/vhost.c | 27 +++++++++++++++++++++------
> >>> 1 file changed, 21 insertions(+), 6 deletions(-)
> >>>
> >>>diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >>>index ddbdc53..3ee50c4 100644
> >>>--- a/hw/virtio/vhost.c
> >>>+++ b/hw/virtio/vhost.c
> >>>@@ -314,6 +314,8 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
> >>>                               hwaddr len, int is_write,
> >>>                               hwaddr access_len)
> >>> {
> >>>+    assert(buffer);
> >>>+
> >>>     if (!vhost_dev_has_iommu(dev)) {
> >>>         cpu_physical_memory_unmap(buffer, len, is_write, access_len);
> >>>     }
> >>>@@ -1132,12 +1134,25 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
> >>>                                                 vhost_vq_index);
> >>>     }
> >>>-    vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
> >>>-                       1, virtio_queue_get_used_size(vdev, idx));
> >>>-    vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
> >>>-                       0, virtio_queue_get_avail_size(vdev, idx));
> >>>-    vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
> >>>-                       0, virtio_queue_get_desc_size(vdev, idx));
> >>>+    /*
> >>>+     * Since the vhost-user disconnect can happen during initialization
> >>>+     * check if vring was initialized, before making unmap.
> >>>+     */
> >>>+    if (vq->used) {
> >>>+        vhost_memory_unmap(dev, vq->used,
> >>>+                           virtio_queue_get_used_size(vdev, idx),
> >>>+                           1, virtio_queue_get_used_size(vdev, idx));
> >>>+    }
> >>>+    if (vq->avail) {
> >>>+        vhost_memory_unmap(dev, vq->avail,
> >>>+                           virtio_queue_get_avail_size(vdev, idx),
> >>>+                           0, virtio_queue_get_avail_size(vdev, idx));
> >>>+    }
> >>>+    if (vq->desc) {
> >>>+        vhost_memory_unmap(dev, vq->desc,
> >>>+                           virtio_queue_get_desc_size(vdev, idx),
> >>>+                           0, virtio_queue_get_desc_size(vdev, idx));
> >>>+    }
> >>
> >>Any reason not checking hdev->started instead? vhost_dev_start() will set it
> >>to true if virtqueues were correctly mapped.
> >>
> >>Thanks
> >Well i see it a little bit different:
> > - vhost_dev_start() sets hdev->started to true before starting
> >   virtqueues
> > - vhost_virtqueue_start() maps all the memory
> >If we hit the vhost disconnect at the start of the
> >vhost_virtqueue_start(), for instance for this call:
> >  r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
> >Then we will call vhost_user_blk_disconnect:
> >  vhost_user_blk_disconnect()->
> >    vhost_user_blk_stop()->
> >      vhost_dev_stop()->
> >        vhost_virtqueue_stop()
> >As a result we will come in this routine with the hdev->started still
> >set to true, but if used/avail/desc fields still uninitialized and set
> >to 0.
> 
> 
> I may miss something, but consider both vhost_dev_start() and
> vhost_user_blk_disconnect() were serialized in main loop. Can this really
> happen?
Yes, consider the case when we start the vhost-user-blk device:
  vhost_dev_start->
    vhost_virtqueue_start
and we get a disconnect in the middle of the vhost_virtqueue_start()
routine, for instance here:

1000     vq->num = state.num = virtio_queue_get_num(vdev, idx);
1001     r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
1002     if (r) {
1003         VHOST_OPS_DEBUG("vhost_set_vring_num failed");
1004         return -errno;
1005     }
         --> Here we got a disconnect <--
1006
1007     state.num = virtio_queue_get_last_avail_idx(vdev, idx);
1008     r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
1009     if (r) {
1010         VHOST_OPS_DEBUG("vhost_set_vring_base failed");
1011         return -errno;
1012     }

As a result, the subsequent vhost_set_vring_base() call ends up invoking
the disconnect routine.
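The ordering matters here. Below is a simplified sketch of vhost_dev_start()
(paraphrased from hw/virtio/vhost.c; error unwinding and the features/memory
table setup are elided, so don't read it as the exact code):

    int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
    {
        int i, r;

        hdev->started = true;        /* set before any vring is mapped */
        hdev->vdev = vdev;

        /* ... features / mem table setup elided ... */

        for (i = 0; i < hdev->nvqs; ++i) {
            /*
             * If the backend disconnects inside this call, the
             * CHR_EVENT_CLOSED handler re-enters vhost_dev_stop()
             * while hdev->started is already true and vq->desc,
             * vq->avail and vq->used are still NULL.
             */
            r = vhost_virtqueue_start(hdev, vdev,
                                      hdev->vqs + i, hdev->vq_index + i);
            if (r < 0) {
                return r;
            }
        }

        return 0;
    }

So by the time the disconnect fires, hdev->started no longer tells us
whether the vrings were actually mapped.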
The backtrace log for the SIGSEGV is as follows:

Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff2ea9700 (LWP 183150)]
0x00007ffff4d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff4d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x000055555590fd90 in flatview_write_continue (fv=0x7fffec4a2600,
    addr=0, attrs=..., ptr=0x0, len=1028, addr1=0, l=1028,
    mr=0x555556b1b310) at ./exec.c:3142
#2  0x000055555590fe98 in flatview_write (fv=0x7fffec4a2600, addr=0,
    attrs=..., buf=0x0, len=1028) at ./exec.c:3177
#3  0x00005555559101ed in address_space_write (as=0x555556893940 <address_space_memory>,
    addr=0, attrs=..., buf=0x0, len=1028) at ./exec.c:3268
#4  0x0000555555910caf in address_space_unmap (as=0x555556893940 <address_space_memory>,
    buffer=0x0, len=1028, is_write=true, access_len=1028) at ./exec.c:3592
#5  0x0000555555910d82 in cpu_physical_memory_unmap (buffer=0x0, len=1028,
    is_write=true, access_len=1028) at ./exec.c:3613
#6  0x0000555555a16fa1 in vhost_memory_unmap (dev=0x7ffff22723e8, buffer=0x0,
    len=1028, is_write=1, access_len=1028) at ./hw/virtio/vhost.c:318
#7  0x0000555555a192a2 in vhost_virtqueue_stop (dev=0x7ffff22723e8,
    vdev=0x7ffff22721a0, vq=0x55555770abf0, idx=0) at ./hw/virtio/vhost.c:1136
#8  0x0000555555a1acc0 in vhost_dev_stop (hdev=0x7ffff22723e8,
    vdev=0x7ffff22721a0) at ./hw/virtio/vhost.c:1702
#9  0x00005555559b6532 in vhost_user_blk_stop (vdev=0x7ffff22721a0)
    at ./hw/block/vhost-user-blk.c:196
#10 0x00005555559b6b73 in vhost_user_blk_disconnect (dev=0x7ffff22721a0)
    at ./hw/block/vhost-user-blk.c:365
#11 0x00005555559b6c4f in vhost_user_blk_event (opaque=0x7ffff22721a0,
    event=CHR_EVENT_CLOSED) at ./hw/block/vhost-user-blk.c:384
#12 0x0000555555e65f7e in chr_be_event (s=0x555556b182e0,
    event=CHR_EVENT_CLOSED) at chardev/char.c:60
#13 0x0000555555e6601a in qemu_chr_be_event (s=0x555556b182e0,
    event=CHR_EVENT_CLOSED) at chardev/char.c:80
#14 0x0000555555e6eef3 in tcp_chr_disconnect_locked (chr=0x555556b182e0)
    at chardev/char-socket.c:488
#15 0x0000555555e6e23f in tcp_chr_write (chr=0x555556b182e0,
    buf=0x7ffff2ea8220 "\n", len=20) at chardev/char-socket.c:178
#16 0x0000555555e6616c in qemu_chr_write_buffer (s=0x555556b182e0,
    buf=0x7ffff2ea8220 "\n", len=20, offset=0x7ffff2ea8150, write_all=true)
    at chardev/char.c:120
#17 0x0000555555e662d9 in qemu_chr_write (s=0x555556b182e0,
    buf=0x7ffff2ea8220 "\n", len=20, write_all=true) at chardev/char.c:155
#18 0x0000555555e693cc in qemu_chr_fe_write_all (be=0x7ffff2272360,
    buf=0x7ffff2ea8220 "\n", len=20) at chardev/char-fe.c:53
#19 0x0000555555a1c489 in vhost_user_write (dev=0x7ffff22723e8,
    msg=0x7ffff2ea8220, fds=0x0, fd_num=0) at ./hw/virtio/vhost-user.c:350
#20 0x0000555555a1d325 in vhost_set_vring (dev=0x7ffff22723e8, request=10,
    ring=0x7ffff2ea8520) at ./hw/virtio/vhost-user.c:660
#21 0x0000555555a1d4c6 in vhost_user_set_vring_base (dev=0x7ffff22723e8,
    ring=0x7ffff2ea8520) at ./hw/virtio/vhost-user.c:704
#22 0x0000555555a18c1b in vhost_virtqueue_start (dev=0x7ffff22723e8,
    vdev=0x7ffff22721a0, vq=0x55555770abf0, idx=0) at ./hw/virtio/vhost.c:1009
#23 0x0000555555a1a9f5 in vhost_dev_start (hdev=0x7ffff22723e8,
    vdev=0x7ffff22721a0) at ./hw/virtio/vhost.c:1639
#24 0x00005555559b6367 in vhost_user_blk_start (vdev=0x7ffff22721a0)
    at ./hw/block/vhost-user-blk.c:150
#25 0x00005555559b6653 in vhost_user_blk_set_status (vdev=0x7ffff22721a0,
    status=15 '\017') at ./hw/block/vhost-user-blk.c:233
#26 0x0000555555a1072d in virtio_set_status (vdev=0x7ffff22721a0,
    val=15 '\017') at ./hw/virtio/virtio.c:1956
---Type <return> to continue, or q <return> to quit---
So while we are inside vhost_user_blk_start() (frame #24) we end up calling
vhost_user_blk_disconnect() (frame #10), and for this nested call the
dev->started field is already set to true:

(gdb) frame 8
#8  0x0000555555a1acc0 in vhost_dev_stop (hdev=0x7ffff22723e8,
    vdev=0x7ffff22721a0) at ./hw/virtio/vhost.c:1702
1702            vhost_virtqueue_stop(hdev,
(gdb) p hdev->started
$1 = true

It isn't an easy race to reproduce: the disconnect has to hit exactly this
window. We hit it once during long testing, on one of the reconnect
iterations. After that we were able to reproduce it 100% of the time by
adding a sleep() call inside qemu to make the race window bigger.

> 
> Thanks
> 
> 
> >
> >>
> >>> }
> >>> static void vhost_eventfd_add(MemoryListener *listener,
> 
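For reference, the test-only change we used to widen the window looked
roughly like this (a sketch only: the exact placement and the delay value
are illustrative assumptions, shown inside vhost_virtqueue_start() right
after the vhost_set_vring_num() call from the excerpt above, where the
disconnect was observed):

    r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
    if (r) {
        VHOST_OPS_DEBUG("vhost_set_vring_num failed");
        return -errno;
    }

    /*
     * Test-only delay, not part of the fix: give the vhost-user backend
     * time to drop the connection here, so the disconnect reliably lands
     * between vhost_set_vring_num() and vhost_set_vring_base().
     */
    sleep(5);

    state.num = virtio_queue_get_last_avail_idx(vdev, idx);
    r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
    if (r) {
        VHOST_OPS_DEBUG("vhost_set_vring_base failed");
        return -errno;
    }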