On 1/8/26 15:48, Alex Deucher wrote:
> Mark fences with errors before we reset the rings as
> we may end up signalling fences as part of the reset
> sequence.  The error needs to be set before the fence
> is signalled.

Setting the error is a good idea, but signaling the fence before the reset is clearly a NAK.

Fence signaling can only happen after we are sure that the DMA operation has been canceled.

Regards,
Christian.

>
> Signed-off-by: Alex Deucher <[email protected]>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 600e6bb98af7a..5defdebd7091e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -872,6 +872,10 @@ void amdgpu_ring_reset_helper_begin(struct amdgpu_ring *ring,
>  	drm_sched_wqueue_stop(&ring->sched);
>  	/* back up the non-guilty commands */
>  	amdgpu_ring_backup_unprocessed_commands(ring, guilty_fence);
> +	/* signal the guilty fence and set an error on all fences from the context */
> +	if (guilty_fence)
> +		amdgpu_fence_driver_guilty_force_completion(guilty_fence);
> +
>  }
>
>  int amdgpu_ring_reset_helper_end(struct amdgpu_ring *ring,
> @@ -885,9 +889,6 @@ int amdgpu_ring_reset_helper_end(struct amdgpu_ring *ring,
>  	if (r)
>  		return r;
>
> -	/* signal the guilty fence and set an error on all fences from the context */
> -	if (guilty_fence)
> -		amdgpu_fence_driver_guilty_force_completion(guilty_fence);
>  	/* Re-emit the non-guilty commands */
>  	if (ring->ring_backup_entries_to_copy) {
>  		amdgpu_ring_alloc_reemit(ring, ring->ring_backup_entries_to_copy);
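
For illustration only, the ordering argued for in the reply (mark the error early, but signal only once the reset has stopped the DMA) can be modelled as a small user-space sketch. Everything named toy_* below is hypothetical and is not the kernel's dma_fence or amdgpu API; the -ECANCELED error code and the dma_active flag are assumptions chosen just to make the two rules checkable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT struct dma_fence or struct amdgpu_ring. */
struct toy_fence {
	int error;      /* 0, or a negative errno value */
	bool signaled;
};

struct toy_ring {
	bool dma_active; /* true while the hardware may still touch memory */
};

/* Record an error on the fence; only legal before it is signaled. */
static int toy_fence_set_error(struct toy_fence *f, int error)
{
	if (f->signaled)
		return -EINVAL;
	f->error = error;
	return 0;
}

/* Signal the fence; refused while DMA could still be in flight. */
static int toy_fence_signal(struct toy_fence *f, const struct toy_ring *ring)
{
	if (ring->dma_active)
		return -EBUSY;
	f->signaled = true;
	return 0;
}

/* Before the reset: only mark the guilty fence with an error. */
static void toy_reset_begin(struct toy_ring *ring, struct toy_fence *guilty)
{
	(void)ring;
	if (guilty != NULL)
		toy_fence_set_error(guilty, -ECANCELED);
}

/* The reset itself: after this the hardware no longer accesses memory. */
static void toy_ring_reset(struct toy_ring *ring)
{
	ring->dma_active = false;
}

/* After the reset: now it is safe to signal the guilty fence. */
static int toy_reset_end(struct toy_ring *ring, struct toy_fence *guilty)
{
	if (guilty != NULL)
		return toy_fence_signal(guilty, ring);
	return 0;
}
```

In this model the error is already in place if anything signals the fence during the reset sequence, while the signal itself cannot be delivered until toy_ring_reset() has cleared dma_active.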
