On Tue, Jan 13, 2026 at 4:27 PM Alex Deucher <[email protected]> wrote:
>
> On Tue, Jan 13, 2026 at 8:42 AM Christian König
> <[email protected]> wrote:
> >
> > On 1/8/26 15:48, Alex Deucher wrote:
> > > It was leftover from when the driver supported drm sched
> > > resubmit. That was dropped long ago, so drop this as well.
> >
> > We unfortunately still need that to update the guilty flag in the context
> > so that amdgpu_ctx_query2() works correctly.
>
> I don't think it matters? We don't call this for per queue resets and
> the errors seem to make their way up to userspace properly. Maybe it
> would be better to move drm_sched_increase_karma() into
> amdgpu_job_timedout() so it covers both queue resets and adapter
> resets.

Calling drm_sched_increase_karma() appears to not do the right thing.
If I keep it in place, the context always shows up as innocent. If I
move it up to amdgpu_job_timedout(), even per queue reset contexts
show up as innocent. The behavior is better with it removed.

Alex

>
> Alex
>
> >
> > But we could change the code in amdgpu_ctx_query2() to check the individual
> > entities for error codes instead.
> >
> > Regards,
> > Christian.
> >
> > >
> > > Signed-off-by: Alex Deucher <[email protected]>
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ---
> > >  1 file changed, 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > index 868ab5314c0d1..c9954dd8d83c8 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > @@ -5808,9 +5808,6 @@ int amdgpu_device_pre_asic_reset(struct
> > > amdgpu_device *adev,
> > >
> > >	amdgpu_fence_driver_isr_toggle(adev, false);
> > >
> > > -	if (job && job->vm)
> > > -		drm_sched_increase_karma(&job->base);
> > > -
> > >	r = amdgpu_reset_prepare_hwcontext(adev, reset_context);
> > >	/* If reset handler not implemented, continue; otherwise return */
> > >	if (r == -EOPNOTSUPP)
> >
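
For illustration only, the alternative Christian mentions (having the query
check the individual entities for error codes instead of relying on a guilty
flag set via drm_sched_increase_karma()) could be modeled roughly as below.
This is a standalone userspace sketch, not amdgpu code: struct model_entity,
struct model_ctx and model_ctx_is_guilty() are hypothetical names, and
treating -ETIME as the "this entity's job timed out" marker is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-entity state; not a kernel structure. */
struct model_entity {
	int last_error;	/* 0, or a negative errno from the last failed job */
};

/* Hypothetical stand-in for a context that owns several entities. */
struct model_ctx {
	struct model_entity *entities;
	size_t num_entities;
};

/*
 * Report the context as guilty if any of its entities recorded a timeout.
 * Assumption: -ETIME marks the job that hung; any other error (for example
 * -ECANCELED) is treated as collateral damage and does not count.
 */
static bool model_ctx_is_guilty(const struct model_ctx *ctx)
{
	size_t i;

	for (i = 0; i < ctx->num_entities; i++) {
		if (ctx->entities[i].last_error == -ETIME)
			return true;
	}

	return false;
}
```

The point of the sketch is that guilt is derived from per-entity error state
at query time, so it would not depend on where (or whether) the karma
counter is bumped during a queue or adapter reset.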
