On 5/12/26 05:07, Deepanshu Kartikey wrote:
> virtio_gpu_cursor_plane_update() allocates a virtio_gpu_object_array,
> locks its dma_resv, and queues a fenced transfer to the host. The
> lock acquisition can fail in two ways:
>
> - dma_resv_lock_interruptible() returns -EINTR when a signal is
>   delivered while waiting for the reservation lock.
> - dma_resv_reserve_fences() returns -ENOMEM if it fails to allocate
>   a fence slot; in this case lock_resv unlocks before returning.
>
> The return value was ignored, so the cursor path could proceed with
> the resv lock not held. The queue path then walks the object array
> and calls dma_resv_add_fence(), which requires the lock; with lockdep
> enabled this trips dma_resv_assert_held():
>
>   WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840
>   Call Trace:
>    virtio_gpu_array_add_fence
>    virtio_gpu_queue_ctrl_sgs
>    virtio_gpu_queue_fenced_ctrl_buffer
>    virtio_gpu_cursor_plane_update
>    drm_atomic_helper_commit_planes
>    drm_atomic_helper_commit_tail
>    commit_tail
>    drm_atomic_helper_commit
>    drm_atomic_commit
>    drm_atomic_helper_update_plane
>    __setplane_atomic
>    drm_mode_cursor_universal
>    drm_mode_cursor_common
>    drm_mode_cursor_ioctl
>    drm_ioctl
>    __x64_sys_ioctl
>
> Beyond the WARN, mutating the dma_resv fence list without the lock
> races with concurrent readers/writers and can corrupt the list.
>
> The DRM atomic helpers do not allow .atomic_update to fail: by the
> time it runs, the commit has been signed off to userspace and there
> is no clean rollback path. Move the fallible work -- objs allocation,
> dma_resv locking, and fence slot reservation -- into
> virtio_gpu_plane_prepare_fb, which is the designated callback for
> resource acquisition and may return errors that the framework
> handles by rolling back the commit. Stash the prepared object array
> on virtio_gpu_plane_state so the update step can consume it.
>
> Make virtio_gpu_plane_cleanup_fb release the objs if the commit was
> rolled back before update ran (i.e., objs not consumed). The queue
> path already unlocks the resv after attaching the fence (vq.c:411)
> and frees the array via put_free_delayed after host completion
> (vq.c:271), so the update step only needs to clear vgplane_st->objs
> to transfer ownership.
>
> Simplify virtio_gpu_cursor_plane_update to a no-fail queue submission
> that hands the prepared, locked objs to the queue path.
>
> The bug was reported by syzbot, triggered via fault injection
> (fail_nth) on the DRM_IOCTL_MODE_CURSOR path, which forces the
> -ENOMEM branch in dma_resv_reserve_fences().
>
> Reported-by: [email protected]
> Closes: https://syzkaller.appspot.com/bug?extid=72bd3dd3a5d5f39a0271
> Fixes: 5cfd31c5b3a3 ("drm/virtio: fix virtio_gpu_cursor_plane_update().")
> Cc: [email protected]
> Link: https://lore.kernel.org/all/[email protected]/T/ [v1]
> Signed-off-by: Deepanshu Kartikey <[email protected]>
> ---
> v2: Move resv lock acquisition from .atomic_update (which must not
>     fail) to .prepare_fb (which may), per maintainer review of v1.
>     The previous approach of silently skipping the cursor update on
>     lock failure violated the atomic-commit contract with userspace.
> ---
>  drivers/gpu/drm/virtio/virtgpu_drv.h   |  1 +
>  drivers/gpu/drm/virtio/virtgpu_plane.c | 38 ++++++++++++++++++++------
>  2 files changed, 30 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
> index f17660a71a3e..e51f959dce46 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> @@ -198,6 +198,7 @@ struct virtio_gpu_framebuffer {
>  struct virtio_gpu_plane_state {
>  	struct drm_plane_state base;
>  	struct virtio_gpu_fence *fence;
> +	struct virtio_gpu_object_array *objs;
>  };
>  #define to_virtio_gpu_plane_state(x) \
>  	container_of(x, struct virtio_gpu_plane_state, base)
> diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
> index a126d1b25f46..b0511ace89e6 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_plane.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
> @@ -381,6 +381,23 @@ static int virtio_gpu_plane_prepare_fb(struct drm_plane *plane,
>  		goto err_fence;
>  	}
>
> +	if (plane->type == DRM_PLANE_TYPE_CURSOR && bo->dumb) {
> +		struct virtio_gpu_object_array *objs;
> +
> +		objs = virtio_gpu_array_alloc(1);
> +		if (!objs) {
> +			ret = -ENOMEM;
> +			goto err_fence;
> +		}
> +		virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
> +		ret = virtio_gpu_array_lock_resv(objs);
> +		if (ret) {
> +			virtio_gpu_array_put_free(objs);
> +			goto err_fence;
> +		}
> +		vgplane_st->objs = objs;
> +	}
> +
>  	return 0;
>
>  err_fence:
> @@ -417,6 +434,12 @@ static void virtio_gpu_plane_cleanup_fb(struct drm_plane *plane,
>  		vgplane_st->fence = NULL;
>  	}
>
> +	if (vgplane_st->objs) {
> +		virtio_gpu_array_unlock_resv(vgplane_st->objs);
> +		virtio_gpu_array_put_free(vgplane_st->objs);
> +		vgplane_st->objs = NULL;
> +	}
> +
>  	obj = state->fb->obj[0];
>  	if (drm_gem_is_imported(obj))
>  		virtio_gpu_cleanup_imported_obj(obj);
> @@ -452,21 +475,18 @@ static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
>  	}
>
>  	if (bo && bo->dumb && (plane->state->fb != old_state->fb)) {
> -		/* new cursor -- update & wait */
> -		struct virtio_gpu_object_array *objs;
> -
> -		objs = virtio_gpu_array_alloc(1);
> -		if (!objs)
> -			return;
> -		virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
> -		virtio_gpu_array_lock_resv(objs);
> +		/* objs and fence were prepared in virtio_gpu_plane_prepare_fb;
> +		 * the resv is already locked. The queue path takes ownership
> +		 * of objs and unlocks the resv after attaching the fence.
> +		 */
>  		virtio_gpu_cmd_transfer_to_host_2d
>  			(vgdev, 0,
>  			 plane->state->crtc_w,
>  			 plane->state->crtc_h,
> -			 0, 0, objs, vgplane_st->fence);
> +			 0, 0, vgplane_st->objs, vgplane_st->fence);
>  		virtio_gpu_notify(vgdev);
>  		dma_fence_wait(&vgplane_st->fence->f, true);
> +		vgplane_st->objs = NULL;
>  	}
>
>  	if (plane->state->fb != old_state->fb) {
I'm getting a lockup with this patch applied, and I now see that
virtio_gpu_resource_flush() also locks the BO. The easiest option might
be to add an uninterruptible variant of virtio_gpu_array_lock_resv().
Could you please try it for v3?

-- 
Best regards,
Dmitry
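For reference, a rough, untested sketch of what such a variant could look like. The function name is a placeholder, and it assumes the existing shape of virtio_gpu_array_lock_resv() (lock each object's resv, then reserve a fence slot); the only change is using dma_resv_lock() with a NULL ww context, which cannot return -EINTR:

```c
/* Hypothetical sketch, not tested. Covers the single-object cursor
 * case; locking several resvs this way without a ww acquire context
 * would risk ABBA deadlocks, so a multi-object caller would still
 * need something like drm_gem_lock_reservations().
 */
int virtio_gpu_array_lock_resv_uninterruptible(struct virtio_gpu_object_array *objs)
{
	unsigned int i;
	int ret;

	for (i = 0; i < objs->nents; i++)
		dma_resv_lock(objs->objs[i]->resv, NULL); /* no -EINTR */

	for (i = 0; i < objs->nents; i++) {
		ret = dma_resv_reserve_fences(objs->objs[i]->resv, 1);
		if (ret) {
			/* -ENOMEM is still possible; unlock before bailing */
			virtio_gpu_array_unlock_resv(objs);
			return ret;
		}
	}
	return 0;
}
```

That would remove the -EINTR failure mode entirely, though the fence-slot reservation can still fail with -ENOMEM, so the caller in .prepare_fb would keep its error path.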

