On 6/5/24 17:47, Alex Bennée wrote:
....
> I'm guessing some sort of resource leak, if I run vkcube-wayland in the
> guest it complains about being stuck on a fence with the iterator going
> up. However on the host I see:
> 
>   virtio_gpu_fence_ctrl fence 0x13f1, type 0x207
>   virtio_gpu_fence_ctrl fence 0x13f2, type 0x207
>   virtio_gpu_fence_resp fence 0x13f1
>   virtio_gpu_fence_resp fence 0x13f2
>   virtio_gpu_fence_ctrl fence 0x13f3, type 0x207
>   virtio_gpu_fence_ctrl fence 0x13f4, type 0x207
>   virtio_gpu_fence_resp fence 0x13f3
>   virtio_gpu_fence_resp fence 0x13f4
>   virtio_gpu_fence_ctrl fence 0x13f5, type 0x207
>   virtio_gpu_fence_ctrl fence 0x13f6, type 0x207
>   virtio_gpu_fence_resp fence 0x13f5
>   virtio_gpu_fence_resp fence 0x13f6
>   virtio_gpu_fence_ctrl fence 0x13f7, type 0x207
>   virtio_gpu_fence_ctrl fence 0x13f8, type 0x207
>   virtio_gpu_fence_resp fence 0x13f7
>   virtio_gpu_fence_resp fence 0x13f8
>   virtio_gpu_fence_ctrl fence 0x13f9, type 0x204
>   virtio_gpu_fence_resp fence 0x13f9
> 
> which looks like it's going ok. However when I hit Ctrl-C in the guest it
> kills QEMU:
> 
>   virtio_gpu_fence_ctrl fence 0x13fc, type 0x207
>   virtio_gpu_fence_ctrl fence 0x13fd, type 0x207
>   virtio_gpu_fence_ctrl fence 0x13fe, type 0x204
>   virtio_gpu_fence_ctrl fence 0x13ff, type 0x207
>   virtio_gpu_fence_ctrl fence 0x1400, type 0x207
>   virtio_gpu_fence_resp fence 0x13fc
>   virtio_gpu_fence_resp fence 0x13fd
>   virtio_gpu_fence_resp fence 0x13fe
>   virtio_gpu_fence_resp fence 0x13ff
>   virtio_gpu_fence_resp fence 0x1400
>   qemu-system-aarch64: 
> ../../subprojects/virglrenderer/src/virglrenderer.c:1282: 
> virgl_renderer_resource_unmap: Assertion `!ret' failed.
>   fish: Job 1, './qemu-system-aarch64 \' terminated by signal     -machine 
> type=virt,virtuali… (    -cpu neoverse-n1 \)
>   fish: Job     -smp 4 \, '    -accel tcg \' terminated by signal     -device 
> virtio-net-pci,netd… (    -device virtio-scsi-pci \)
>   fish: Job     -device scsi-hd,drive=hd \, '    -netdev 
> user,id=unet,hostfw…' terminated by signal     -blockdev driver=raw,node-n… ( 
>    -serial mon:stdio \)
>   fish: Job     -blockdev node-name=rom,dri…, '    -blockdev 
> node-name=efivars…' terminated by signal     -m 8192 \ (    -object 
> memory-backend-memf…)
>   fish: Job     -device virtio-gpu-gl-pci,h…, '    -display 
> sdl,gl=on,show-cur…' terminated by signal     -device qemu-xhci -device u… (  
>   -kernel /home/alex/lsrc/lin…)
>   fish: Job     -d guest_errors,unimp,trace…, 'SIGABRT' terminated by signal 
> Abort ()
> 
> The backtrace (and the 18G size of the core file!) indicates a leak:

The debug assertion in virgl_renderer_resource_unmap indicates that the
BO was never mapped because the mapping failed, most likely due to OOM.
You won't hit this abort with a release build of libvirglrenderer. The
leak is likely caused by an unsignalled fence.
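
If you want to check the host side for fences that never got a response,
here is a rough sketch that scans the trace output. It assumes the two
line formats shown in the trace quoted above; the function name is made
up for illustration. It only sees fences that reach virtio-gpu on the
host, so it may well report nothing even when the guest is stuck.

```python
import re
import sys

# Assumed line formats, copied from the quoted trace output:
#   virtio_gpu_fence_ctrl fence 0x13f1, type 0x207
#   virtio_gpu_fence_resp fence 0x13f1
FENCE_RE = re.compile(r"virtio_gpu_fence_(ctrl|resp) fence (0x[0-9a-fA-F]+)")


def unsignalled_fences(lines):
    """Return sorted fence IDs that have a ctrl event but no resp event."""
    pending = set()
    for line in lines:
        m = FENCE_RE.search(line)
        if not m:
            continue  # not a fence trace line
        kind, fence = m.group(1), int(m.group(2), 16)
        if kind == "ctrl":
            pending.add(fence)
        else:
            pending.discard(fence)
    return sorted(pending)


if __name__ == "__main__":
    # Read the trace from stdin and print any fence left without a response.
    for fence in unsignalled_fences(sys.stdin):
        print(hex(fence))
```

Feed it the captured trace on stdin; any ID it prints was submitted but
not signalled by the time the trace ends.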

Please try running vkcube with the fence-feedback feature disabled:

 # VN_PERF_NO_FENCE_FEEDBACK=1 vkcube-wayland

That fixes the hang for me. We have had problems with the combination of
this Venus optimization and the Intel ANV driver for a long time and
hoped they were fixed by now; apparently the issue was only masked.

-- 
Best regards,
Dmitry

