Happy 2026, Tvrtko!

On Wed, 2026-01-07 at 13:44 +0000, Tvrtko Ursulin wrote:
> drm_sched_entity_is_idle() is called when flushing the entity before
> destroying it and currently decides on the idle status based either on
> number of jobs in its queue being zero, or whether the entity is not part
> of any run-queue.
> 
> If entity is not part of a run-queue it is implied it can have no jobs in
> its queue, from which it follows it is redundant to look at the both and
> we can simplify to only look at the queue.
> 
> The list_empty() check was added in
> a875f58e237a ("drm/scheduler: stop setting rq to NULL")
> where it replaced the entity->rq check which was added in
> 741f01e636b7 ("drm/scheduler: Avoid using wait_event_killable for dying 
> process (V4)").
> 
> Since for the submit race involving exiting entities, explicit handling
> via entity->stopped was added soon after in
> 62347a33001c ("drm/scheduler: Add stopped flag to drm_sched_entity")
> it indeed looks we are safe to remove the list_empty() check.
> 
> This mean we can remove the memory barrier as well and fewer memory
> barriers the better.

I am not convinced that this change is worth it.

Not to long ago we discussed that the spsc_queue should be removed and
replaced by some sort of list, with proper locks. Christian has agreed
that this should fly.

The spsc queue has only 1 user in the kernel and it's super hard to
understand how it's supposed to work and when any why barriers and
READ_ONCE's are necessary (that's, of course, also not documented in
the queue).

Until that is done I don't really want to touch any of those memory
barriers..

> 
> While at it, we add READ_ONCE annotation on the entity->stopped check to
> mark the unlocked read.

This would effectively legalize the lockless access. But it is illegal
and undefined behavior. See drm_sched_fini().

P.


> 
> Signed-off-by: Tvrtko Ursulin <[email protected]>
> Cc: Christian König <[email protected]>
> Cc: Danilo Krummrich <[email protected]>
> Cc: Matthew Brost <[email protected]>
> Cc: Philipp Stanner <[email protected]>
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index bb7e5fc47f99..2ed84504cfd6 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -137,14 +137,8 @@ EXPORT_SYMBOL(drm_sched_entity_modify_sched);
>  
>  static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
>  {
> -     rmb(); /* for list_empty to work without lock */
> -
> -     if (list_empty(&entity->list) ||
> -         spsc_queue_count(&entity->job_queue) == 0 ||
> -         entity->stopped)
> -             return true;
> -
> -     return false;
> +     return spsc_queue_count(&entity->job_queue) == 0 ||
> +            READ_ONCE(entity->stopped);
>  }
>  
>  /**

Reply via email to