20.05.2026 14:00, Christian König пишет:
> On 5/20/26 10:12, Dmitry Osipenko wrote:
>> On 5/20/26 10:05, Christian König wrote:
>>> On 5/20/26 08:50, Dmitry Osipenko wrote:
>>>> On 5/19/26 11:27, Christian König wrote:
>>>>> On 5/19/26 10:22, Deepanshu Kartikey wrote:
>>>>>> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock
>>>>>> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and
>>>>>> ignore its return value. The function can fail with -EINTR from
>>>>>> dma_resv_lock_interruptible() (signal during lock wait) or with
>>>>>> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation),
>>>>>> leaving the resv lock not held. The queue path then walks the object
>>>>>> array and calls dma_resv_add_fence(), which requires the lock held;
>>>>>> with lockdep enabled this trips dma_resv_assert_held():
>>>>>>
>>>>>>   WARNING: drivers/dma-buf/dma-resv.c:296 at 
>>>>>> dma_resv_add_fence+0x71e/0x840
>>>>>>   Call Trace:
>>>>>>    virtio_gpu_array_add_fence
>>>>>>    virtio_gpu_queue_ctrl_sgs
>>>>>>    virtio_gpu_queue_fenced_ctrl_buffer
>>>>>>    virtio_gpu_cursor_plane_update
>>>>>>    drm_atomic_helper_commit_planes
>>>>>>    drm_atomic_helper_commit_tail
>>>>>>    commit_tail
>>>>>>    drm_atomic_helper_commit
>>>>>>    drm_atomic_commit
>>>>>>    drm_atomic_helper_update_plane
>>>>>>    __setplane_atomic
>>>>>>    drm_mode_cursor_universal
>>>>>>    drm_mode_cursor_common
>>>>>>    drm_mode_cursor_ioctl
>>>>>>    drm_ioctl
>>>>>>    __x64_sys_ioctl
>>>>>>
>>>>>> Beyond the WARN, mutating the dma_resv fence list without the lock
>>>>>> races with concurrent readers/writers and can corrupt the list.
>>>>>
>>>>> Well why are you trying to add a fence on an atomic mode set in the first 
>>>>> place?
>>>>>
>>>>> That is usually an illegal operation here.
>>>> That is pre-existing in the driver. It performs draw operation and in
>>>> some cases waits for the completion during atomic. Whether all that
>>>> syncing is correct is hard to say immediately as some of it may be
>>>> historical edge cases.
>>>
>>> I'm not not so deeply in the atomic mode setting stuff but it strongly 
>>> sounds like that this is seriously broken.
>>>
>>> The background is that the atomic mode set framework allows an output 
>>> dma_fence which is signaled when the commit is finished.
>>>
>>> So when you allocate a fence slot and add a new fence to finish the atomic 
>>> commit it is trivially possible that this cycles back and waits for the 
>>> atomic commit to finish. In other words you have a deadlock.
>>>
>>> You probably need specially crafted userspace with the right timing to 
>>> trigger that, but such issues are usually a rather big no-no and need to be 
>>> fixed in the long term.
>>>
>>> Try to add dma_fence_begin_signaling() and dma_fence_end_signaling() 
>>> annotation and enable lockdep, the tool should be able to point out if and 
>>> what exactly goes wrong.
>>>
>>> The usual fix is to prepare everything before commit_tail is called (alloc 
>>> memory, create, reserve slot, add dma_fence etc....) and then just send out 
>>> the prepared commands later on.
>>
>> We tried with moving resv alloc to prepare_fb() in a previous patch
>> version, it resulted in a non-trivial deadlocks. The goal of this patch
>> is to fix immediate problem with a minimal code change.
> 
> Yeah, totally fine with me to get that fixed first.
> 
>> What you're saying is correct, but it may require a rather big
>> refactoring of the code. In general, everything works okay today, so not
>> really an urgent problem.
> 
> It's just a potential issue and when the AI bots keep evolving like they 
> already do they will sooner or later start to point that out as well.
Don't mind if bots will produce something useful

-- 
Best regards,
Dmitry

Reply via email to