On 07.10.25 10:28, Matthew Brost wrote:
> On Tue, Oct 07, 2025 at 01:09:34AM -0700, Matthew Brost wrote:
>> On Tue, Oct 07, 2025 at 09:28:56AM +0200, Christian König wrote:
>>> On 02.10.25 07:16, Matthew Brost wrote:
>>>> Stop open coding pending job list walks in drivers. Add a pending job
>>>> list iterator which safely walks the DRM scheduler's pending list,
>>>> either locklessly (asserting the DRM scheduler is stopped) or by
>>>> taking the pending job list lock.
>>>
>>> Taking the job list lock and walking the pending list while the scheduler
>>> is not stopped is most likely a NO-GO.
> 
> Oops, I misread your statement; it's late here.
> 
> The use case for walking the pending list while holding the job list lock,
> without the scheduler being stopped, is debugfs in Xe, where it prints out
> pending job information. That seems valid. There are a couple of other
> upstream cases where this is done, but it's likely not needed there.

Yeah, I thought it would be something like that.

> I checked and found that AMD acquires job_list_lock and walks the
> pending list in two cases within amdgpu_debugfs.c. PVR also acquires the
> lock in imagination/pvr_queue.c.
> 
> If this is really controversial, we don't strictly need this in Xe and
> can remove it. But of course, AMD GPU and PVR would need to be fixed as
> well.

I think for debugging it is valid, but we should then have two different
iterators: one for debugging, which can only be used when CONFIG_DEBUG_FS
is enabled, and a different one for the functional side, e.g. iterating
while the scheduler is stopped.

Christian.

> 
> Matt
> 
>>> 
>> 
>> I agree. But this patch doesn't do that; it actually does the opposite.
>> 
>> It ensures that if you need to walk the scheduler list locklessly, the
>> scheduler is stopped at both the beginning and end of the iterator, or
>> that it correctly takes the pending list lock.
>> 
>> So, what's the issue? Or is there just some confusion about what this
>> patch is actually doing?
>> 
>>> As far as I understand it, that is exactly what Philip rejected as looking
>>> into scheduler internals during the discussion at XDC.
>>> 
>> 
>> I thought we agreed on having a well-defined iterator for walking the
>> pending list, ensuring correct usage, rather than having drivers walk
>> the pending list themselves. From my understanding, this is exactly what
>> we agreed upon.
>> 
>>> So why is that actually needed? For debugging or something functional?
>>> 
>> 
>> Drivers inevitably need to walk the pending list during recovery flows
>> (such as resets, resubmissions, VF migration, etc.). This ensures that a
>> driver knows what it's doing when it does so, and avoids directly
>> touching scheduler internals. Instead, it should just call
>> drm_sched_for_each_pending_job().
>> 
>> This has actually been helpful in Xe already: when I was working on
>> scheduler backend changes, it helped catch cases where I accidentally
>> flipped the stopped flag while walking the job list. Without this, it
>> could have randomly blown up later if a perfectly timed race condition
>> occurred (e.g., free_job fired).
>> 
>> Matt
>> 
>>> Regards,
>>> Christian.
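
For illustration, a minimal sketch of one possible shape of that split,
reusing the drm_sched_iter_pending_job guard class from the patch quoted
below; the two macro names are made up for this mail and the snippet is
untested:

/* Debug-only walk: always takes job_list_lock, only available with debugfs. */
#if IS_ENABLED(CONFIG_DEBUG_FS)
#define drm_sched_for_each_pending_job_debug(__job, __sched)			\
	scoped_guard(drm_sched_iter_pending_job, __sched, false)		\
		list_for_each_entry(__job, &(__sched)->pending_list, list)
#endif

/* Functional walk: lockless, asserts the scheduler is stopped on entry/exit. */
#define drm_sched_for_each_pending_job_stopped(__job, __sched, __entity)	\
	scoped_guard(drm_sched_iter_pending_job, __sched, true)		\
		list_for_each_entry(__job, &(__sched)->pending_list, list)	\
			for_each_if(!(__entity) || (__job)->entity == (__entity))
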
>>>
>>>>
>>>> Signed-off-by: Matthew Brost <[email protected]>
>>>> ---
>>>>  include/drm/gpu_scheduler.h | 64 +++++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 64 insertions(+)
>>>>
>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>> index fb88301b3c45..a2dcabab8284 100644
>>>> --- a/include/drm/gpu_scheduler.h
>>>> +++ b/include/drm/gpu_scheduler.h
>>>> @@ -698,4 +698,68 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>>>  				    struct drm_gpu_scheduler **sched_list,
>>>>  				    unsigned int num_sched_list);
>>>>
>>>> +/* Inlines */
>>>> +
>>>> +/**
>>>> + * struct drm_sched_iter_pending_job - DRM scheduler pending job iterator state
>>>> + * @sched: DRM scheduler associated with pending job iterator
>>>> + * @stopped: DRM scheduler stopped state associated with pending job iterator
>>>> + */
>>>> +struct drm_sched_iter_pending_job {
>>>> +	struct drm_gpu_scheduler *sched;
>>>> +	bool stopped;
>>>> +};
>>>> +
>>>> +/* Drivers should never call this directly */
>>>> +static inline struct drm_sched_iter_pending_job
>>>> +__drm_sched_iter_pending_job_begin(struct drm_gpu_scheduler *sched, bool stopped)
>>>> +{
>>>> +	struct drm_sched_iter_pending_job iter = {
>>>> +		.sched = sched,
>>>> +		.stopped = stopped,
>>>> +	};
>>>> +
>>>> +	if (stopped)
>>>> +		WARN_ON(!READ_ONCE(sched->pause_submit));
>>>> +	else
>>>> +		spin_lock(&sched->job_list_lock);
>>>> +
>>>> +	return iter;
>>>> +}
>>>> +
>>>> +/* Drivers should never call this directly */
>>>> +static inline void
>>>> +__drm_sched_iter_pending_job_end(const struct drm_sched_iter_pending_job iter)
>>>> +{
>>>> +	if (iter.stopped)
>>>> +		WARN_ON(!READ_ONCE(iter.sched->pause_submit));
>>>> +	else
>>>> +		spin_unlock(&iter.sched->job_list_lock);
>>>> +}
>>>> +
>>>> +DEFINE_CLASS(drm_sched_iter_pending_job, struct drm_sched_iter_pending_job,
>>>> +	     __drm_sched_iter_pending_job_end(_T);,
>>>> +	     __drm_sched_iter_pending_job_begin(__sched, __stopped),
>>>> +	     struct drm_gpu_scheduler * __sched, bool __stopped);
>>>> +static inline void *
>>>> +class_drm_sched_iter_pending_job_lock_ptr(class_drm_sched_iter_pending_job_t *_T)
>>>> +{ return _T; }
>>>> +#define class_drm_sched_iter_pending_job_is_conditional false
>>>> +
>>>> +/**
>>>> + * drm_sched_for_each_pending_job() - Iterator for each pending job in scheduler
>>>> + * @__job: Current pending job being iterated over
>>>> + * @__sched: DRM scheduler to iterate over pending jobs
>>>> + * @__entity: DRM scheduler entity to filter jobs, NULL indicates no filter
>>>> + * @__stopped: DRM scheduler stopped state
>>>> + *
>>>> + * Iterator for each pending job in scheduler, filtering on an entity, and
>>>> + * enforcing locking rules (either scheduler fully stopped or correctly takes
>>>> + * job_list_lock).
>>>> + */
>>>> +#define drm_sched_for_each_pending_job(__job, __sched, __entity, __stopped)	\
>>>> +	scoped_guard(drm_sched_iter_pending_job, __sched, __stopped)		\
>>>> +		list_for_each_entry(__job, &(__sched)->pending_list, list)	\
>>>> +			for_each_if(!__entity || (__job)->entity == (__entity))
>>>> +
>>>>  #endif
>>>
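
For reference, a minimal sketch of how a driver recovery path might use the
proposed iterator; everything named foo_* below is hypothetical, only
drm_sched_for_each_pending_job() comes from the patch, and
drm_sched_wqueue_stop()/drm_sched_wqueue_start() are the existing scheduler
calls that toggle pause_submit:

#include <drm/gpu_scheduler.h>

struct foo_engine {				/* hypothetical driver-side engine */
	struct drm_gpu_scheduler sched;
};

/* Made-up resubmission hook, stands in for driver-specific ring programming. */
static void foo_resubmit_job(struct foo_engine *fe, struct drm_sched_job *job);

static void foo_engine_reset(struct foo_engine *fe)
{
	struct drm_gpu_scheduler *sched = &fe->sched;
	struct drm_sched_job *job;

	/* Park the submit/free workers; this sets sched->pause_submit. */
	drm_sched_wqueue_stop(sched);

	/* The lockless (stopped == true) walk is legal while the scheduler is parked. */
	drm_sched_for_each_pending_job(job, sched, NULL, true)
		foo_resubmit_job(fe, job);

	drm_sched_wqueue_start(sched);
}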
