[Public]

Regards,
      Prike

> -----Original Message-----
> From: Koenig, Christian <[email protected]>
> Sent: Friday, April 10, 2026 7:50 PM
> To: Liang, Prike <[email protected]>; [email protected]
> Cc: Deucher, Alexander <[email protected]>
> Subject: Re: [PATCH] drm/amdgpu: drop userq fence driver refs on fence release
>
> On 4/10/26 10:32, Liang, Prike wrote:
> > [Public]
> >
> > Regards,
> >       Prike
> >
> >> -----Original Message-----
> >> From: Koenig, Christian <[email protected]>
> >> Sent: Wednesday, April 8, 2026 4:27 PM
> >> To: Liang, Prike <[email protected]>; [email protected]
> >> Cc: Deucher, Alexander <[email protected]>
> >> Subject: Re: [PATCH] drm/amdgpu: drop userq fence driver refs on
> >> fence release
> >>
> >> On 4/8/26 09:45, Prike Liang wrote:
> >>> amdgpu_userq_wait_ioctl() takes extra references on waited-on fence
> >>> drivers and stores them in waitq->fence_drv_xa. When a new userq
> >>> fence is created, those references are transferred into
> >>> userq_fence->fence_drv_array so they can be released when the fence
> completes.
> >>>
> >>> However, those inherited references are currently only dropped from
> >>> amdgpu_userq_fence_driver_process(). If a fence never reaches that
> >>> path, such as it is already signaled when created or it is dropped
> >>> through an error/cleanup path, amdgpu_userq_fence_free() frees
> >>> fence_drv_array without putting the referenced fence drivers.
> >>
> >> Clear NAK to that as well.
> >>
> >> An userq fence must be signaled at some point and when that happens
> >> the reference fence drivers can be put.
> >>
> >> What could be is that we have another call to dma_fence_signal()
> >> where we forget to do that, but it should *never* be done in
> amdgpu_userq_fence_free().
> > It looks like we’re missing the userq fence-array put on the signaled-fence 
> > branch
> in amdgpu_userq_fence_create().
>
> Ah yes, I see. Good catch.
>
> > If we ensure the fence-array is properly put/balanced earlier in the
> > flow, then amdgpu_userq_fence_put_fence_drv_array() will  naturally become a
> no-op in amdgpu_userq_fence_free(). Meanwhile, keeping the *_put call in 
> free()
> serves as a final backstop to cover any other overlooked/unbalanced paths.
> >
> > If you still prefer that amdgpu_userq_fence_put_fence_drv_array()
> > should not be called from free(), I can remove it and clean this up
> > accordingly and only put it in the *create()
>
> amdgpu_userq_fence_free() is completely removed by my recent DMA-fence
> independence patches. See them on the mailing list.
>
> I'm currently working on fixing the reset handling and fence allocation. 
> Please write a
> stand alone patch to fix this issue which we can push before my work lands.
Sure, thank you for the heads up.

> Thanks,
> Christian.
>
> >
> >> Regards,
> >> Christian.
> >>
> >>>
> >>> Signed-off-by: Prike Liang <[email protected]>
> >>> ---
> >>>  drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 17
> >>> +++++++++++------
> >>>  1 file changed, 11 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> >>> index 3be80a82788a..bd196599d3d6 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> >>> @@ -145,13 +145,21 @@ amdgpu_userq_fence_driver_free(struct
> >> amdgpu_usermode_queue *userq)
> >>>     amdgpu_userq_fence_driver_put(userq->fence_drv);
> >>>  }
> >>>
> >>> +static void
> >>> +amdgpu_userq_fence_put_fence_drv_array(struct amdgpu_userq_fence
> >>> +*userq_fence) {
> >>> +   unsigned long i;
> >>> +   for (i = 0; i < userq_fence->fence_drv_array_count; i++)
> >>> +           
> >>> amdgpu_userq_fence_driver_put(userq_fence->fence_drv_array[i]);
> >>> +   userq_fence->fence_drv_array_count = 0; }
> >>> +
> >>>  void amdgpu_userq_fence_driver_process(struct
> >>> amdgpu_userq_fence_driver *fence_drv)  {
> >>>     struct amdgpu_userq_fence *userq_fence, *tmp;
> >>>     struct dma_fence *fence;
> >>>     unsigned long flags;
> >>>     u64 rptr;
> >>> -   int i;
> >>>
> >>>     if (!fence_drv)
> >>>             return;
> >>> @@ -166,10 +174,7 @@ void amdgpu_userq_fence_driver_process(struct
> >> amdgpu_userq_fence_driver *fence_d
> >>>                     break;
> >>>
> >>>             dma_fence_signal(fence);
> >>> -
> >>> -           for (i = 0; i < userq_fence->fence_drv_array_count; i++)
> >>> -                   amdgpu_userq_fence_driver_put(userq_fence-
> >>> fence_drv_array[i]);
> >>> -
> >>> +           amdgpu_userq_fence_put_fence_drv_array(userq_fence);
> >>>             list_del(&userq_fence->link);
> >>>             dma_fence_put(fence);
> >>>     }
> >>> @@ -320,9 +325,9 @@ static void amdgpu_userq_fence_free(struct
> >>> rcu_head
> >> *rcu)
> >>>     struct amdgpu_userq_fence *userq_fence = to_amdgpu_userq_fence(fence);
> >>>     struct amdgpu_userq_fence_driver *fence_drv =
> >>> userq_fence->fence_drv;
> >>>
> >>> +   amdgpu_userq_fence_put_fence_drv_array(userq_fence);
> >>>     /* Release the fence driver reference */
> >>>     amdgpu_userq_fence_driver_put(fence_drv);
> >>> -
> >>>     kvfree(userq_fence->fence_drv_array);
> >>>     kmem_cache_free(amdgpu_userq_fence_slab, userq_fence);  }
> >

Reply via email to