On 6/19/24 20:37, Alex Bennée wrote:
> So I've been experimenting with Aarch64 TCG with an Intel backend like
> this:
> 
> ./qemu-system-aarch64 \
>            -M virt -cpu cortex-a76 \
>            -device virtio-net-pci,netdev=unet \
>            -netdev user,id=unet,hostfwd=tcp::2222-:22 \
>            -m 8192 \
>            -object memory-backend-memfd,id=mem,size=8G,share=on \
>            -serial mon:stdio \
>            -kernel ~/lsrc/linux.git/builds/arm64.initramfs/arch/arm64/boot/Image \
>            -append "console=ttyAMA0" \
>            -device qemu-xhci -device usb-kbd -device usb-tablet \
>            -device virtio-gpu-gl-pci,blob=true,venus=true,hostmem=4G \
>            -display sdl,gl=on \
>            -d plugin,guest_errors,trace:virtio_gpu_cmd_res_create_blob,trace:virtio_gpu_cmd_res_back_\*,trace:virtio_gpu_cmd_res_xfer_toh_3d,trace:virtio_gpu_cmd_res_xfer_fromh_3d,trace:address_space_map
>  
> 
> And I've noticed a couple of things. First trying to launch vkmark to
> run a KMS mode test fails with:
> 
...
>   virgl_render_server[1875931]: vkr: failed to import resource: invalid res_id 5
>   virgl_render_server[1875931]: vkr: vkAllocateMemory resulted in CS error
>   virgl_render_server[1875931]: vkr: ring_submit_cmd: vn_dispatch_command failed
> 
> More interestingly when shutting stuff down we see weirdness like:
> 
>   address_space_map as:0x561b48ec48c0 addr 0x1008ac4b0:18 write:1 attrs:0x1
>   virgl_render_server[1875931]: vkr: destroying context 3 (vkmark) with a valid instance
>   virgl_render_server[1875931]: vkr: destroying device with valid objects
>   vkr_context_remove_object: -7438602987017907480
>   vkr_context_remove_object: 7
>   vkr_context_remove_object: 5
> 
> which indicates something has gone very wrong. I'm not super familiar
> with the memory allocation patterns, but should resources attached via
> virtio_gpu_cmd_res_back_attach() be findable in the list of resources?

This is expected to fail. Vkmark creates a shmem virgl GBM framebuffer BO
on the guest that isn't exportable on the host. AFAICT, more code changes
are needed to support this case.

Note that "destroying device with valid objects" msg is fine, won't hurt
to silence it in Venus to avoid confusion. It will happen every time
guest application is closed without explicitly releasing every VK object.
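
To be clear about what "explicitly releasing every VK object" means here,
a rough guest-side sketch (illustration only, not vkmark code; the handles
are placeholders) that keeps Venus quiet would look like:

  /* Venus prints the "destroying device with valid objects" message when a
   * guest app exits while VkDevice children are still alive.  Tearing
   * things down in the usual order avoids it. */
  #include <vulkan/vulkan.h>

  void teardown(VkInstance instance, VkDevice device,
                VkBuffer buffer, VkDeviceMemory memory)
  {
      vkDeviceWaitIdle(device);               /* wait for in-flight work  */
      vkDestroyBuffer(device, buffer, NULL);  /* child objects first      */
      vkFreeMemory(device, memory, NULL);
      vkDestroyDevice(device, NULL);          /* then the device          */
      vkDestroyInstance(instance, NULL);      /* finally the instance     */
  }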

> I tried running under RR to further debug but weirdly I can't get
> working graphics with that. I did try running under threadsan which
> complained about a potential data race:
> 
>   vkr_context_add_object: 1 -> 0x7b2c00000288
>   vkr_context_add_object: 2 -> 0x7b2c00000270
>   vkr_context_add_object: 3 -> 0x7b3800007f28
>   vkr_context_add_object: 4 -> 0x7b3800007fa0
>   vkr_context_add_object: 5 -> 0x7b48000103f8
>   vkr_context_add_object: 6 -> 0x7b48000104a0
>   vkr_context_add_object: 7 -> 0x7b4800010440
>   virtio_gpu_cmd_res_back_attach res 0x5
>   virtio_gpu_cmd_res_back_attach res 0x6
>   vkr_context_add_object: 8 -> 0x7b48000103e0
>   virgl_render_server[1751430]: vkr: failed to import resource: invalid res_id 5
>   virgl_render_server[1751430]: vkr: vkAllocateMemory resulted in CS error
>   virgl_render_server[1751430]: vkr: ring_submit_cmd: vn_dispatch_command failed
>   ==================
>   WARNING: ThreadSanitizer: data race (pid=1751256)
>     Read of size 8 at 0x7f7fa0ea9138 by main thread (mutexes: write M0):
>       #0 memcpy <null> (qemu-system-aarch64+0x41fede) (BuildId: 0bab171e77cb6782341ee3407e44af7267974025)
..
>   ==================
>   SUMMARY: ThreadSanitizer: data race (/home/alex/lsrc/qemu.git/builds/system.threadsan/qemu-system-aarch64+0x41fede) (BuildId: 0bab171e77cb6782341ee3407e44af7267974025) in __interceptor_memcpy
> 
> This could be a false positive, or it could be a race between the guest
> kernel clearing memory and us still running virtio_gpu_ctrl_response.
> 
> What do you think?

The memcpy warning looks a bit suspicious, but it is likely harmless. I
don't see such a warning with TSAN and an x86 VM.
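
For what it's worth, the pattern TSAN flags can be reproduced with a
trivial standalone program (illustration only, nothing to do with the
actual QEMU code paths): one thread memcpy()s a "response" into memory
that the other thread is clearing, i.e. the kind of guest-clears-while-
we-respond scenario you describe:

  /* build: gcc -fsanitize=thread -g race-demo.c -o race-demo -lpthread */
  #include <pthread.h>
  #include <string.h>
  #include <stdio.h>

  static char shared_page[64];            /* stands in for guest-visible memory */

  static void *guest_thread(void *arg)    /* "guest kernel clearing memory" */
  {
      (void)arg;
      memset(shared_page, 0, sizeof(shared_page));
      return NULL;
  }

  static void *device_thread(void *arg)   /* the response-copy side */
  {
      (void)arg;
      const char resp[] = "response";
      memcpy(shared_page, resp, sizeof(resp));
      return NULL;
  }

  int main(void)
  {
      pthread_t g, d;
      pthread_create(&g, NULL, guest_thread, NULL);
      pthread_create(&d, NULL, device_thread, NULL);
      pthread_join(g, NULL);
      pthread_join(d, NULL);
      printf("%s\n", shared_page);
      return 0;
  }

TSAN reports the memcpy/memset pair as a data race even when the outcome
doesn't matter to either side, which is why I lean towards harmless here.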

-- 
Best regards,
Dmitry

