Re: virtio-blk issue with vIOMMU

2022-02-23 Thread Stefan Hajnoczi
On Wed, Feb 23, 2022 at 12:37:03PM +0800, Jason Wang wrote:
> Hi Stefan:
> 
> Recently I found intel vIOMMU gives the following warning when using
> virtio-blk:
> 
> qemu-system-x86_64: vtd_iova_to_slpte: detected slpte permission error
> (iova=0x7ffde000, level=0x3, slpte=0x0, write=0)
> qemu-system-x86_64: vtd_iommu_translate: detected translation failure
> (dev=01:00:00, iova=0x7ffde000)
> qemu-system-x86_64: New fault is not recorded due to compression of faults
> qemu-system-x86_64: virtio: zero sized buffers are not allowed
> 
> It happens on the boot (device start), and virtio-blk works well after this.
> A quick stack trace is:
> 
> Thread 1 "qemu-system-x86" hit Breakpoint 1, vtd_iova_to_slpte
> (s=0x57a9f710, ce=0x7fffd6e0, iova=2147344384, is_write=false,
> slptep=0x7fffd6b8,
>     slpte_level=0x7fffd6b0, reads=0x7fffd6aa, writes=0x7fffd6ab,
> aw_bits=39 '\'') at ../hw/i386/intel_iommu.c:1055
> 1055        error_report_once("%s: detected slpte permission error "
> (gdb) bt
> #0  vtd_iova_to_slpte
>     (s=0x57a9f710, ce=0x7fffd6e0, iova=2147344384, is_write=false,
> slptep=0x7fffd6b8, slpte_level=0x7fffd6b0, reads=0x7fffd6aa,
> writes=0x7fffd6ab, aw_bits=39 '\'') at ../hw/i386/intel_iommu.c:1055
> #1  0x55b45734 in vtd_do_iommu_translate (vtd_as=0x574cd000,
> bus=0x5766e700, devfn=0 '\000', addr=2147344384, is_write=false,
> entry=0x7fffd780)
>     at ../hw/i386/intel_iommu.c:1785
> #2  0x55b48543 in vtd_iommu_translate (iommu=0x574cd070,
> addr=2147344384, flag=IOMMU_RO, iommu_idx=0) at
> ../hw/i386/intel_iommu.c:2996
> #3  0x55bd3f4d in address_space_translate_iommu
>     (iommu_mr=0x574cd070, xlat=0x7fffd9f0, plen_out=0x7fffd9e8,
> page_mask_out=0x0, is_write=false, is_mmio=true, target_as=0x7fffd938,
> attrs=...)
>     at ../softmmu/physmem.c:433
> #4  0x55bdbdd1 in address_space_translate_cached
> (cache=0x7fffed3d02e0, addr=0, xlat=0x7fffd9f0, plen=0x7fffd9e8,
> is_write=false, attrs=...)
>     at ../softmmu/physmem.c:3388
> #5  0x55bdc519 in address_space_lduw_internal_cached_slow
> (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0,
> endian=DEVICE_LITTLE_ENDIAN)
>     at /home/devel/git/qemu/memory_ldst.c.inc:209
> #6  0x55bdc6ac in address_space_lduw_le_cached_slow
> (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0) at
> /home/devel/git/qemu/memory_ldst.c.inc:253
> #7  0x55c71719 in address_space_lduw_le_cached
> (cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0)
>     at /home/devel/git/qemu/include/exec/memory_ldst_cached.h.inc:35
> #8  0x55c7196a in lduw_le_phys_cached (cache=0x7fffed3d02e0, addr=0)
> at /home/devel/git/qemu/include/exec/memory_ldst_phys.h.inc:67
> #9  0x55c728fd in virtio_lduw_phys_cached (vdev=0x57743720,
> cache=0x7fffed3d02e0, pa=0) at
> /home/devel/git/qemu/include/hw/virtio/virtio-access.h:166
> #10 0x55c73485 in vring_used_flags_set_bit (vq=0x74ee5010,
> mask=1) at ../hw/virtio/virtio.c:383
> #11 0x55c736a8 in virtio_queue_split_set_notification
> (vq=0x74ee5010, enable=0) at ../hw/virtio/virtio.c:433
> #12 0x55c73896 in virtio_queue_set_notification (vq=0x74ee5010,
> enable=0) at ../hw/virtio/virtio.c:490
> #13 0x55c19064 in virtio_blk_handle_vq (s=0x57743720,
> vq=0x74ee5010) at ../hw/block/virtio-blk.c:782
> #14 0x55c191f5 in virtio_blk_handle_output (vdev=0x57743720,
> vq=0x74ee5010) at ../hw/block/virtio-blk.c:819
> #15 0x55c78453 in virtio_queue_notify_vq (vq=0x74ee5010) at
> ../hw/virtio/virtio.c:2315
> #16 0x55c7b523 in virtio_queue_host_notifier_aio_poll_ready
> (n=0x74ee5084) at ../hw/virtio/virtio.c:3516
> #17 0x55eff158 in aio_dispatch_handler (ctx=0x5680fac0,
> node=0x7fffeca5bbe0) at ../util/aio-posix.c:350
> #18 0x55eff390 in aio_dispatch_handlers (ctx=0x5680fac0) at
> ../util/aio-posix.c:406
> #19 0x55eff3ea in aio_dispatch (ctx=0x5680fac0) at
> ../util/aio-posix.c:416
> #20 0x55f184eb in aio_ctx_dispatch (source=0x5680fac0,
> callback=0x0, user_data=0x0) at ../util/async.c:311
> #21 0x77b6b17d in g_main_context_dispatch () at
> /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
> #22 0x55f299ed in glib_pollfds_poll () at ../util/main-loop.c:232
> #23 0x55f29a6b in os_host_main_loop_wait (timeout=0) at
> ../util/main-loop.c:255
> #24 0x55f29b7c in main_loop_wait (nonblocking=0) at
> ../util/main-loop.c:531
> #25 0x55be097c in qemu_main_loop () at ../softmmu/runstate.c:727
> #26 0x558367fa in main (argc=26, argv=0x7fffe058,
> envp=0x7fffe130) at ../softmmu/main.c:50
> 
> The slpte is 0x0 and level is 3 which probably means the device is kicked
> before it was attached to any IOMMU domain.
> 
> Bisecting points to the first bad commit:
> 
> commit 

virtio-blk issue with vIOMMU

2022-02-22 Thread Jason Wang

Hi Stefan:

Recently I found intel vIOMMU gives the following warning when using 
virtio-blk:


qemu-system-x86_64: vtd_iova_to_slpte: detected slpte permission error 
(iova=0x7ffde000, level=0x3, slpte=0x0, write=0)
qemu-system-x86_64: vtd_iommu_translate: detected translation failure 
(dev=01:00:00, iova=0x7ffde000)

qemu-system-x86_64: New fault is not recorded due to compression of faults
qemu-system-x86_64: virtio: zero sized buffers are not allowed

It happens on the boot (device start), and virtio-blk works well after 
this. A quick stack trace is:


Thread 1 "qemu-system-x86" hit Breakpoint 1, vtd_iova_to_slpte 
(s=0x57a9f710, ce=0x7fffd6e0, iova=2147344384, is_write=false, 
slptep=0x7fffd6b8,
    slpte_level=0x7fffd6b0, reads=0x7fffd6aa, 
writes=0x7fffd6ab, aw_bits=39 '\'') at ../hw/i386/intel_iommu.c:1055

1055        error_report_once("%s: detected slpte permission error "
(gdb) bt
#0  vtd_iova_to_slpte
    (s=0x57a9f710, ce=0x7fffd6e0, iova=2147344384, 
is_write=false, slptep=0x7fffd6b8, slpte_level=0x7fffd6b0, 
reads=0x7fffd6aa, writes=0x7fffd6ab, aw_bits=39 '\'') at 
../hw/i386/intel_iommu.c:1055
#1  0x55b45734 in vtd_do_iommu_translate (vtd_as=0x574cd000, 
bus=0x5766e700, devfn=0 '\000', addr=2147344384, is_write=false, 
entry=0x7fffd780)

    at ../hw/i386/intel_iommu.c:1785
#2  0x55b48543 in vtd_iommu_translate (iommu=0x574cd070, 
addr=2147344384, flag=IOMMU_RO, iommu_idx=0) at 
../hw/i386/intel_iommu.c:2996

#3  0x55bd3f4d in address_space_translate_iommu
    (iommu_mr=0x574cd070, xlat=0x7fffd9f0, 
plen_out=0x7fffd9e8, page_mask_out=0x0, is_write=false, 
is_mmio=true, target_as=0x7fffd938, attrs=...)

    at ../softmmu/physmem.c:433
#4  0x55bdbdd1 in address_space_translate_cached 
(cache=0x7fffed3d02e0, addr=0, xlat=0x7fffd9f0, plen=0x7fffd9e8, 
is_write=false, attrs=...)

    at ../softmmu/physmem.c:3388
#5  0x55bdc519 in address_space_lduw_internal_cached_slow 
(cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0, 
endian=DEVICE_LITTLE_ENDIAN)

    at /home/devel/git/qemu/memory_ldst.c.inc:209
#6  0x55bdc6ac in address_space_lduw_le_cached_slow 
(cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0) at 
/home/devel/git/qemu/memory_ldst.c.inc:253
#7  0x55c71719 in address_space_lduw_le_cached 
(cache=0x7fffed3d02e0, addr=0, attrs=..., result=0x0)

    at /home/devel/git/qemu/include/exec/memory_ldst_cached.h.inc:35
#8  0x55c7196a in lduw_le_phys_cached (cache=0x7fffed3d02e0, 
addr=0) at /home/devel/git/qemu/include/exec/memory_ldst_phys.h.inc:67
#9  0x55c728fd in virtio_lduw_phys_cached (vdev=0x57743720, 
cache=0x7fffed3d02e0, pa=0) at 
/home/devel/git/qemu/include/hw/virtio/virtio-access.h:166
#10 0x55c73485 in vring_used_flags_set_bit (vq=0x74ee5010, 
mask=1) at ../hw/virtio/virtio.c:383
#11 0x55c736a8 in virtio_queue_split_set_notification 
(vq=0x74ee5010, enable=0) at ../hw/virtio/virtio.c:433
#12 0x55c73896 in virtio_queue_set_notification 
(vq=0x74ee5010, enable=0) at ../hw/virtio/virtio.c:490
#13 0x55c19064 in virtio_blk_handle_vq (s=0x57743720, 
vq=0x74ee5010) at ../hw/block/virtio-blk.c:782
#14 0x55c191f5 in virtio_blk_handle_output (vdev=0x57743720, 
vq=0x74ee5010) at ../hw/block/virtio-blk.c:819
#15 0x55c78453 in virtio_queue_notify_vq (vq=0x74ee5010) at 
../hw/virtio/virtio.c:2315
#16 0x55c7b523 in virtio_queue_host_notifier_aio_poll_ready 
(n=0x74ee5084) at ../hw/virtio/virtio.c:3516
#17 0x55eff158 in aio_dispatch_handler (ctx=0x5680fac0, 
node=0x7fffeca5bbe0) at ../util/aio-posix.c:350
#18 0x55eff390 in aio_dispatch_handlers (ctx=0x5680fac0) at 
../util/aio-posix.c:406
#19 0x55eff3ea in aio_dispatch (ctx=0x5680fac0) at 
../util/aio-posix.c:416
#20 0x55f184eb in aio_ctx_dispatch (source=0x5680fac0, 
callback=0x0, user_data=0x0) at ../util/async.c:311
#21 0x77b6b17d in g_main_context_dispatch () at 
/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0

#22 0x55f299ed in glib_pollfds_poll () at ../util/main-loop.c:232
#23 0x55f29a6b in os_host_main_loop_wait (timeout=0) at 
../util/main-loop.c:255
#24 0x55f29b7c in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:531

#25 0x55be097c in qemu_main_loop () at ../softmmu/runstate.c:727
#26 0x558367fa in main (argc=26, argv=0x7fffe058, 
envp=0x7fffe130) at ../softmmu/main.c:50


The slpte is 0x0 and level is 3 which probably means the device is 
kicked before it was attached to any IOMMU domain.


Bisecting points to the first bad commit:

commit 826cc32423db2a99d184dbf4f507c737d7e7a4ae
Author: Stefan Hajnoczi 
Date:   Tue Dec 7 13:23:31 2021 +

    aio-posix: split poll check from ready handler

A wild guess is that this lead some false kick to the device, any