On 07.10.25 10:28, Matthew Brost wrote:
> On Tue, Oct 07, 2025 at 01:09:34AM -0700, Matthew Brost wrote:
>> On Tue, Oct 07, 2025 at 09:28:56AM +0200, Christian König wrote:
>>> On 02.10.25 07:16, Matthew Brost wrote:
>>>> Stop open coding pending job list in drivers. Add pending job list
>>>> iterator which safely walks DRM scheduler list either locklessly
>>>> asserting DRM scheduler is stopped or takes pending job list lock.
>>>
>>> Taking the job list lock and walking the pending list while the scheduler 
>>> is not stopped is most likely a NO-GO.
> 
> Oops, I misread your statement—it's late here.
> 
> The use case for walking the pending list while holding the job list
> lock, without the scheduler being stopped, is in debugfs for Xe, where
> it prints out pending job information. That seems valid. There are a
> couple of other upstream cases where this is done but likely not needed.

Yeah, I thought it would be something like that.

> I checked and found that AMD acquires job_list_lock and walks the
> pending list in two cases within amdgpu_debugfs.c. PVR also acquires the
> lock in imagination/pvr_queue.c.
> 
> If this is really controversial, we don’t strictly need this in Xe and
> can remove it. But of course, AMD GPU and PVR would need to be fixed as
> well.

I think for debugging it is valid, but we should then have two different 
iterators.

One for debugging which can only be used when CONFIG_DEBUG/DEBUGFS is enabled.

And a different one for the functional side, e.g. iterating while the scheduler 
is stopped.
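
Just to make that concrete, a minimal sketch on top of the macro from this 
patch (the naming is made up, not a proposal for the final interface):

	/* Functional side: caller must have stopped the scheduler first */
	#define drm_sched_for_each_pending_job_stopped(__job, __sched)	\
		drm_sched_for_each_pending_job(__job, __sched, NULL, true)

	#if IS_ENABLED(CONFIG_DEBUG_FS)
	/* Debug only: takes job_list_lock, scheduler may still be running */
	#define drm_sched_for_each_pending_job_debug(__job, __sched)	\
		drm_sched_for_each_pending_job(__job, __sched, NULL, false)
	#endif

That way the stopped requirement is visible in the name on the functional 
side, and the lock-taking variant simply doesn't exist outside of debug 
builds.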

Christian.

> 
> Matt
> 
>>>
>>
>> I agree. But this patch doesn’t do that — it actually does the opposite.
>>
>> It ensures that if you need to walk the scheduler list locklessly, the
>> scheduler is stopped at both the beginning and end of the iterator, or
>> it correctly takes the pending list lock.
>>
>> So, what’s the issue? Or is there just some confusion about what this
>> patch is actually doing?
>>
>>> As far as I understand it that is exactly what Philip rejected as looking 
>>> into scheduler internals during the discussion on XDC.
>>>
>>
>> I thought we agreed on having a well-defined iterator for walking the
>> pending list, ensuring correct usage, rather than having drivers walk
>> the pending list themselves. From my understanding, this is exactly what
>> we agreed upon.
>>
>>> So why is that actually needed? For debugging or something functional?
>>>
>>
>> Drivers inevitably need to walk the pending list during recovery flows
>> (such as resets, resubmissions, VF migration, etc.). This ensures that a
>> driver knows what it’s doing when it does so, and avoids directly
>> touching scheduler internals. Instead, it should just call
>> drm_sched_for_each_pending_job.
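>>
>> As a concrete example, a reset flow could look roughly like this
>> (sketch only, my_driver_resubmit_job is a made-up driver helper):
>>
>> 	struct drm_sched_job *job;
>>
>> 	drm_sched_wqueue_stop(sched);
>> 	drm_sched_for_each_pending_job(job, sched, NULL, true)
>> 		my_driver_resubmit_job(job);
>>
>> The WARN_ON in the iterator then catches anyone doing this without
>> having stopped the scheduler first.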
>>
>> This has actually been helpful in Xe already — when I was working on
>> scheduler backend changes, it helped catch cases where I accidentally
>> flipped the stopped flag while walking the job list. Without this, it
>> could have randomly blown up later if a perfectly timed race condition
>> occurred (e.g., free_job fired).
>>
>> Matt
>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Signed-off-by: Matthew Brost <[email protected]>
>>>> ---
>>>>  include/drm/gpu_scheduler.h | 64 +++++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 64 insertions(+)
>>>>
>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>> index fb88301b3c45..a2dcabab8284 100644
>>>> --- a/include/drm/gpu_scheduler.h
>>>> +++ b/include/drm/gpu_scheduler.h
>>>> @@ -698,4 +698,68 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>>>                               struct drm_gpu_scheduler **sched_list,
>>>>                               unsigned int num_sched_list);
>>>>  
>>>> +/* Inlines */
>>>> +
>>>> +/**
>>>> + * struct drm_sched_iter_pending_job - DRM scheduler pending job iterator state
>>>> + * @sched: DRM scheduler associated with pending job iterator
>>>> + * @stopped: DRM scheduler stopped state associated with pending job iterator
>>>> + */
>>>> +struct drm_sched_iter_pending_job {
>>>> +  struct drm_gpu_scheduler *sched;
>>>> +  bool stopped;
>>>> +};
>>>> +
>>>> +/* Drivers should never call this directly */
>>>> +static inline struct drm_sched_iter_pending_job
>>>> +__drm_sched_iter_pending_job_begin(struct drm_gpu_scheduler *sched, bool stopped)
>>>> +{
>>>> +  struct drm_sched_iter_pending_job iter = {
>>>> +          .sched = sched,
>>>> +          .stopped = stopped,
>>>> +  };
>>>> +
>>>> +  if (stopped)
>>>> +          WARN_ON(!READ_ONCE(sched->pause_submit));
>>>> +  else
>>>> +          spin_lock(&sched->job_list_lock);
>>>> +
>>>> +  return iter;
>>>> +}
>>>> +
>>>> +/* Drivers should never call this directly */
>>>> +static inline void
>>>> +__drm_sched_iter_pending_job_end(const struct drm_sched_iter_pending_job iter)
>>>> +{
>>>> +  if (iter.stopped)
>>>> +          WARN_ON(!READ_ONCE(iter.sched->pause_submit));
>>>> +  else
>>>> +          spin_unlock(&iter.sched->job_list_lock);
>>>> +}
>>>> +
>>>> +DEFINE_CLASS(drm_sched_iter_pending_job, struct drm_sched_iter_pending_job,
>>>> +       __drm_sched_iter_pending_job_end(_T);,
>>>> +       __drm_sched_iter_pending_job_begin(__sched, __stopped),
>>>> +       struct drm_gpu_scheduler * __sched, bool __stopped);
>>>> +static inline void
>>>> +*class_drm_sched_iter_pending_job_lock_ptr(class_drm_sched_iter_pending_job_t *_T)
>>>> +{return _T; }
>>>> +#define class_drm_sched_iter_pending_job_is_conditional false
>>>> +
>>>> +/**
>>>> + * drm_sched_for_each_pending_job() - Iterator for each pending job in scheduler
>>>> + * @__job: Current pending job being iterated over
>>>> + * @__sched: DRM scheduler to iterate over pending jobs
>>>> + * @__entity: DRM scheduler entity to filter jobs, NULL indicates no filter
>>>> + * @__stopped: DRM scheduler stopped state
>>>> + *
>>>> + * Iterator for each pending job in scheduler, filtering on an entity, and
>>>> + * enforcing locking rules (either scheduler fully stopped or correctly takes
>>>> + * job_list_lock).
>>>> + */
>>>> +#define drm_sched_for_each_pending_job(__job, __sched, __entity, __stopped)	\
>>>> +  scoped_guard(drm_sched_iter_pending_job, __sched, __stopped)		\
>>>> +  list_for_each_entry(__job, &(__sched)->pending_list, list)		\
>>>> +  for_each_if(!__entity || (__job)->entity == (__entity))
>>>> +
>>>>  #endif
>>>
