On Wed, Feb 23, 2022 at 12:37:03PM +0800, Jason Wang wrote: > Hi Stefan: > > Recently I found intel vIOMMU gives the following warning when using > virtio-blk: > > qemu-system-x86_64: vtd_iova_to_slpte: detected slpte permission error > (iova=0x7ffde000, level=0x3, slpte=0x0, write=0) > qemu-system-x86_64: vtd_iommu_translate: detected translation failure > (dev=01:00:00, iova=0x7ffde000) > qemu-system-x86_64: New fault is not recorded due to compression of faults > qemu-system-x86_64: virtio: zero sized buffers are not allowed > > It happens on the boot (device start), and virtio-blk works well after this. > A quick stack trace is: > > Thread 1 "qemu-system-x86" hit Breakpoint 1, vtd_iova_to_slpte > (s=0x555557a9f710, ce=0x7fffffffd6e0, iova=2147344384, is_write=false, > slptep=0x7fffffffd6b8, > slpte_level=0x7fffffffd6b0, reads=0x7fffffffd6aa, writes=0x7fffffffd6ab, > aw_bits=39 '\'') at ../hw/i386/intel_iommu.c:1055 > 1055 error_report_once("%s: detected slpte permission error " > (gdb) bt > #0 vtd_iova_to_slpte > (s=0x555557a9f710, ce=0x7fffffffd6e0, iova=2147344384, is_write=false, > slptep=0x7fffffffd6b8, slpte_level=0x7fffffffd6b0, reads=0x7fffffffd6aa, > writes=0x7fffffffd6ab, aw_bits=39 '\'') at ../hw/i386/intel_iommu.c:1055 > #1 0x0000555555b45734 in vtd_do_iommu_translate (vtd_as=0x5555574cd000, > bus=0x55555766e700, devfn=0 '\000', addr=2147344384, is_write=false, > entry=0x7fffffffd780) > at ../hw/i386/intel_iommu.c:1785 > #2 0x0000555555b48543 in vtd_iommu_translate (iommu=0x5555574cd070, > addr=2147344384, flag=IOMMU_RO, iommu_idx=0) at > ../hw/i386/intel_iommu.c:2996 > #3 0x0000555555bd3f4d in address_space_translate_iommu > (iommu_mr=0x5555574cd070, xlat=0x7fffffffd9f0, plen_out=0x7fffffffd9e8, > page_mask_out=0x0, is_write=false, is_mmio=true, target_as=0x7fffffffd938, > attrs=...) > at ../softmmu/physmem.c:433 > #4 0x0000555555bdbdd1 in address_space_translate_cached > (cache=0x7fffed3d02e0, addr=0, xlat=0x7fffffffd9f0, plen=0x7fffffffd9e8, > is_write=false, attrs=...) > at ../softmmu/physmem.c:3388 > #5 0x0000555555bdc519 in address_space_lduw_internal_cached_slow > (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0, > endian=DEVICE_LITTLE_ENDIAN) > at /home/devel/git/qemu/memory_ldst.c.inc:209 > #6 0x0000555555bdc6ac in address_space_lduw_le_cached_slow > (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0) at > /home/devel/git/qemu/memory_ldst.c.inc:253 > #7 0x0000555555c71719 in address_space_lduw_le_cached > (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0) > at /home/devel/git/qemu/include/exec/memory_ldst_cached.h.inc:35 > #8 0x0000555555c7196a in lduw_le_phys_cached (cache=0x7fffed3d02e0, addr=0) > at /home/devel/git/qemu/include/exec/memory_ldst_phys.h.inc:67 > #9 0x0000555555c728fd in virtio_lduw_phys_cached (vdev=0x555557743720, > cache=0x7fffed3d02e0, pa=0) at > /home/devel/git/qemu/include/hw/virtio/virtio-access.h:166 > #10 0x0000555555c73485 in vring_used_flags_set_bit (vq=0x7ffff4ee5010, > mask=1) at ../hw/virtio/virtio.c:383 > #11 0x0000555555c736a8 in virtio_queue_split_set_notification > (vq=0x7ffff4ee5010, enable=0) at ../hw/virtio/virtio.c:433 > #12 0x0000555555c73896 in virtio_queue_set_notification (vq=0x7ffff4ee5010, > enable=0) at ../hw/virtio/virtio.c:490 > #13 0x0000555555c19064 in virtio_blk_handle_vq (s=0x555557743720, > vq=0x7ffff4ee5010) at ../hw/block/virtio-blk.c:782 > #14 0x0000555555c191f5 in virtio_blk_handle_output (vdev=0x555557743720, > vq=0x7ffff4ee5010) at ../hw/block/virtio-blk.c:819 > #15 0x0000555555c78453 in virtio_queue_notify_vq (vq=0x7ffff4ee5010) at > ../hw/virtio/virtio.c:2315 > #16 0x0000555555c7b523 in virtio_queue_host_notifier_aio_poll_ready > (n=0x7ffff4ee5084) at ../hw/virtio/virtio.c:3516 > #17 0x0000555555eff158 in aio_dispatch_handler (ctx=0x55555680fac0, > node=0x7fffeca5bbe0) at ../util/aio-posix.c:350 > #18 0x0000555555eff390 in aio_dispatch_handlers (ctx=0x55555680fac0) at > ../util/aio-posix.c:406 > #19 0x0000555555eff3ea in aio_dispatch (ctx=0x55555680fac0) at > ../util/aio-posix.c:416 > #20 0x0000555555f184eb in aio_ctx_dispatch (source=0x55555680fac0, > callback=0x0, user_data=0x0) at ../util/async.c:311 > #21 0x00007ffff7b6b17d in g_main_context_dispatch () at > /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 > #22 0x0000555555f299ed in glib_pollfds_poll () at ../util/main-loop.c:232 > #23 0x0000555555f29a6b in os_host_main_loop_wait (timeout=0) at > ../util/main-loop.c:255 > #24 0x0000555555f29b7c in main_loop_wait (nonblocking=0) at > ../util/main-loop.c:531 > #25 0x0000555555be097c in qemu_main_loop () at ../softmmu/runstate.c:727 > #26 0x00005555558367fa in main (argc=26, argv=0x7fffffffe058, > envp=0x7fffffffe130) at ../softmmu/main.c:50 > > The slpte is 0x0 and level is 3 which probably means the device is kicked > before it was attached to any IOMMU domain. > > Bisecting points to the first bad commit: > > commit 826cc32423db2a99d184dbf4f507c737d7e7a4ae > Author: Stefan Hajnoczi <stefa...@redhat.com> > Date: Tue Dec 7 13:23:31 2021 +0000 > > aio-posix: split poll check from ready handler > > A wild guess is that this lead some false kick to the device, any thought on > this?
Funny, I hit the same bug but a different symptom today. I'm sending a fix and will CC you. Stefan
signature.asc
Description: PGP signature