Re: virtio-blk issue with vIOMMU
On Wed, Feb 23, 2022 at 12:37:03PM +0800, Jason Wang wrote: > Hi Stefan: > > Recently I found intel vIOMMU gives the following warning when using > virtio-blk: > > qemu-system-x86_64: vtd_iova_to_slpte: detected slpte permission error > (iova=0x7ffde000, level=0x3, slpte=0x0, write=0) > qemu-system-x86_64: vtd_iommu_translate: detected translation failure > (dev=01:00:00, iova=0x7ffde000) > qemu-system-x86_64: New fault is not recorded due to compression of faults > qemu-system-x86_64: virtio: zero sized buffers are not allowed > > It happens on the boot (device start), and virtio-blk works well after this. > A quick stack trace is: > > Thread 1 "qemu-system-x86" hit Breakpoint 1, vtd_iova_to_slpte > (s=0x57a9f710, ce=0x7fffd6e0, iova=2147344384, is_write=false, > slptep=0x7fffd6b8, > slpte_level=0x7fffd6b0, reads=0x7fffd6aa, writes=0x7fffd6ab, > aw_bits=39 '\'') at ../hw/i386/intel_iommu.c:1055 > 1055 error_report_once("%s: detected slpte permission error " > (gdb) bt > #0 vtd_iova_to_slpte > (s=0x57a9f710, ce=0x7fffd6e0, iova=2147344384, is_write=false, > slptep=0x7fffd6b8, slpte_level=0x7fffd6b0, reads=0x7fffd6aa, > writes=0x7fffd6ab, aw_bits=39 '\'') at ../hw/i386/intel_iommu.c:1055 > #1 0x55b45734 in vtd_do_iommu_translate (vtd_as=0x574cd000, > bus=0x5766e700, devfn=0 '\000', addr=2147344384, is_write=false, > entry=0x7fffd780) > at ../hw/i386/intel_iommu.c:1785 > #2 0x55b48543 in vtd_iommu_translate (iommu=0x574cd070, > addr=2147344384, flag=IOMMU_RO, iommu_idx=0) at > ../hw/i386/intel_iommu.c:2996 > #3 0x55bd3f4d in address_space_translate_iommu > (iommu_mr=0x574cd070, xlat=0x7fffd9f0, plen_out=0x7fffd9e8, > page_mask_out=0x0, is_write=false, is_mmio=true, target_as=0x7fffd938, > attrs=...) > at ../softmmu/physmem.c:433 > #4 0x55bdbdd1 in address_space_translate_cached > (cache=0x7fffed3d02e0, addr=0, xlat=0x7fffd9f0, plen=0x7fffd9e8, > is_write=false, attrs=...) > at ../softmmu/physmem.c:3388 > #5 0x55bdc519 in address_space_lduw_internal_cached_slow > (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0, > endian=DEVICE_LITTLE_ENDIAN) > at /home/devel/git/qemu/memory_ldst.c.inc:209 > #6 0x55bdc6ac in address_space_lduw_le_cached_slow > (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0) at > /home/devel/git/qemu/memory_ldst.c.inc:253 > #7 0x55c71719 in address_space_lduw_le_cached > (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0) > at /home/devel/git/qemu/include/exec/memory_ldst_cached.h.inc:35 > #8 0x55c7196a in lduw_le_phys_cached (cache=0x7fffed3d02e0, addr=0) > at /home/devel/git/qemu/include/exec/memory_ldst_phys.h.inc:67 > #9 0x55c728fd in virtio_lduw_phys_cached (vdev=0x57743720, > cache=0x7fffed3d02e0, pa=0) at > /home/devel/git/qemu/include/hw/virtio/virtio-access.h:166 > #10 0x55c73485 in vring_used_flags_set_bit (vq=0x74ee5010, > mask=1) at ../hw/virtio/virtio.c:383 > #11 0x55c736a8 in virtio_queue_split_set_notification > (vq=0x74ee5010, enable=0) at ../hw/virtio/virtio.c:433 > #12 0x55c73896 in virtio_queue_set_notification (vq=0x74ee5010, > enable=0) at ../hw/virtio/virtio.c:490 > #13 0x55c19064 in virtio_blk_handle_vq (s=0x57743720, > vq=0x74ee5010) at ../hw/block/virtio-blk.c:782 > #14 0x55c191f5 in virtio_blk_handle_output (vdev=0x57743720, > vq=0x74ee5010) at ../hw/block/virtio-blk.c:819 > #15 0x55c78453 in virtio_queue_notify_vq (vq=0x74ee5010) at > ../hw/virtio/virtio.c:2315 > #16 0x55c7b523 in virtio_queue_host_notifier_aio_poll_ready > (n=0x74ee5084) at ../hw/virtio/virtio.c:3516 > #17 0x55eff158 in aio_dispatch_handler (ctx=0x5680fac0, > node=0x7fffeca5bbe0) at ../util/aio-posix.c:350 > #18 0x55eff390 in aio_dispatch_handlers (ctx=0x5680fac0) at > ../util/aio-posix.c:406 > #19 0x55eff3ea in aio_dispatch (ctx=0x5680fac0) at > ../util/aio-posix.c:416 > #20 0x55f184eb in aio_ctx_dispatch (source=0x5680fac0, > callback=0x0, user_data=0x0) at ../util/async.c:311 > #21 0x77b6b17d in g_main_context_dispatch () at > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 > #22 0x55f299ed in glib_pollfds_poll () at ../util/main-loop.c:232 > #23 0x55f29a6b in os_host_main_loop_wait (timeout=0) at > ../util/main-loop.c:255 > #24 0x55f29b7c in main_loop_wait (nonblocking=0) at > ../util/main-loop.c:531 > #25 0x55be097c in qemu_main_loop () at ../softmmu/runstate.c:727 > #26 0x558367fa in main (argc=26, argv=0x7fffe058, > envp=0x7fffe130) at ../softmmu/main.c:50 > > The slpte is 0x0 and level is 3 which probably means the device is kicked > before it was attached to any IOMMU domain. > > Bisecting points to the first bad commit: > > commit
virtio-blk issue with vIOMMU
Hi Stefan: Recently I found intel vIOMMU gives the following warning when using virtio-blk: qemu-system-x86_64: vtd_iova_to_slpte: detected slpte permission error (iova=0x7ffde000, level=0x3, slpte=0x0, write=0) qemu-system-x86_64: vtd_iommu_translate: detected translation failure (dev=01:00:00, iova=0x7ffde000) qemu-system-x86_64: New fault is not recorded due to compression of faults qemu-system-x86_64: virtio: zero sized buffers are not allowed It happens on the boot (device start), and virtio-blk works well after this. A quick stack trace is: Thread 1 "qemu-system-x86" hit Breakpoint 1, vtd_iova_to_slpte (s=0x57a9f710, ce=0x7fffd6e0, iova=2147344384, is_write=false, slptep=0x7fffd6b8, slpte_level=0x7fffd6b0, reads=0x7fffd6aa, writes=0x7fffd6ab, aw_bits=39 '\'') at ../hw/i386/intel_iommu.c:1055 1055 error_report_once("%s: detected slpte permission error " (gdb) bt #0 vtd_iova_to_slpte (s=0x57a9f710, ce=0x7fffd6e0, iova=2147344384, is_write=false, slptep=0x7fffd6b8, slpte_level=0x7fffd6b0, reads=0x7fffd6aa, writes=0x7fffd6ab, aw_bits=39 '\'') at ../hw/i386/intel_iommu.c:1055 #1 0x55b45734 in vtd_do_iommu_translate (vtd_as=0x574cd000, bus=0x5766e700, devfn=0 '\000', addr=2147344384, is_write=false, entry=0x7fffd780) at ../hw/i386/intel_iommu.c:1785 #2 0x55b48543 in vtd_iommu_translate (iommu=0x574cd070, addr=2147344384, flag=IOMMU_RO, iommu_idx=0) at ../hw/i386/intel_iommu.c:2996 #3 0x55bd3f4d in address_space_translate_iommu (iommu_mr=0x574cd070, xlat=0x7fffd9f0, plen_out=0x7fffd9e8, page_mask_out=0x0, is_write=false, is_mmio=true, target_as=0x7fffd938, attrs=...) at ../softmmu/physmem.c:433 #4 0x55bdbdd1 in address_space_translate_cached (cache=0x7fffed3d02e0, addr=0, xlat=0x7fffd9f0, plen=0x7fffd9e8, is_write=false, attrs=...) at ../softmmu/physmem.c:3388 #5 0x55bdc519 in address_space_lduw_internal_cached_slow (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0, endian=DEVICE_LITTLE_ENDIAN) at /home/devel/git/qemu/memory_ldst.c.inc:209 #6 0x55bdc6ac in address_space_lduw_le_cached_slow (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0) at /home/devel/git/qemu/memory_ldst.c.inc:253 #7 0x55c71719 in address_space_lduw_le_cached (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0) at /home/devel/git/qemu/include/exec/memory_ldst_cached.h.inc:35 #8 0x55c7196a in lduw_le_phys_cached (cache=0x7fffed3d02e0, addr=0) at /home/devel/git/qemu/include/exec/memory_ldst_phys.h.inc:67 #9 0x55c728fd in virtio_lduw_phys_cached (vdev=0x57743720, cache=0x7fffed3d02e0, pa=0) at /home/devel/git/qemu/include/hw/virtio/virtio-access.h:166 #10 0x55c73485 in vring_used_flags_set_bit (vq=0x74ee5010, mask=1) at ../hw/virtio/virtio.c:383 #11 0x55c736a8 in virtio_queue_split_set_notification (vq=0x74ee5010, enable=0) at ../hw/virtio/virtio.c:433 #12 0x55c73896 in virtio_queue_set_notification (vq=0x74ee5010, enable=0) at ../hw/virtio/virtio.c:490 #13 0x55c19064 in virtio_blk_handle_vq (s=0x57743720, vq=0x74ee5010) at ../hw/block/virtio-blk.c:782 #14 0x55c191f5 in virtio_blk_handle_output (vdev=0x57743720, vq=0x74ee5010) at ../hw/block/virtio-blk.c:819 #15 0x55c78453 in virtio_queue_notify_vq (vq=0x74ee5010) at ../hw/virtio/virtio.c:2315 #16 0x55c7b523 in virtio_queue_host_notifier_aio_poll_ready (n=0x74ee5084) at ../hw/virtio/virtio.c:3516 #17 0x55eff158 in aio_dispatch_handler (ctx=0x5680fac0, node=0x7fffeca5bbe0) at ../util/aio-posix.c:350 #18 0x55eff390 in aio_dispatch_handlers (ctx=0x5680fac0) at ../util/aio-posix.c:406 #19 0x55eff3ea in aio_dispatch (ctx=0x5680fac0) at ../util/aio-posix.c:416 #20 0x55f184eb in aio_ctx_dispatch (source=0x5680fac0, callback=0x0, user_data=0x0) at ../util/async.c:311 #21 0x77b6b17d in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 #22 0x55f299ed in glib_pollfds_poll () at ../util/main-loop.c:232 #23 0x55f29a6b in os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:255 #24 0x55f29b7c in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531 #25 0x55be097c in qemu_main_loop () at ../softmmu/runstate.c:727 #26 0x558367fa in main (argc=26, argv=0x7fffe058, envp=0x7fffe130) at ../softmmu/main.c:50 The slpte is 0x0 and level is 3 which probably means the device is kicked before it was attached to any IOMMU domain. Bisecting points to the first bad commit: commit 826cc32423db2a99d184dbf4f507c737d7e7a4ae Author: Stefan Hajnoczi Date: Tue Dec 7 13:23:31 2021 + aio-posix: split poll check from ready handler A wild guess is that this lead some false kick to the device, any