\
On 07/01/2026 14:11, Philipp Stanner wrote:
Happy 2026, Tvrtko!
On Wed, 2026-01-07 at 13:44 +0000, Tvrtko Ursulin wrote:
drm_sched_entity_is_idle() is called when flushing the entity before
destroying it and currently decides on the idle status based either on
number of jobs in its queue being zero, or whether the entity is not part
of any run-queue.
If entity is not part of a run-queue it is implied it can have no jobs in
its queue, from which it follows it is redundant to look at the both and
we can simplify to only look at the queue.
The list_empty() check was added in
a875f58e237a ("drm/scheduler: stop setting rq to NULL")
where it replaced the entity->rq check which was added in
741f01e636b7 ("drm/scheduler: Avoid using wait_event_killable for dying process
(V4)").
Since for the submit race involving exiting entities, explicit handling
via entity->stopped was added soon after in
62347a33001c ("drm/scheduler: Add stopped flag to drm_sched_entity")
it indeed looks we are safe to remove the list_empty() check.
This mean we can remove the memory barrier as well and fewer memory
barriers the better.
I am not convinced that this change is worth it.
Not to long ago we discussed that the spsc_queue should be removed and
replaced by some sort of list, with proper locks. Christian has agreed
that this should fly.
The spsc queue has only 1 user in the kernel and it's super hard to
understand how it's supposed to work and when any why barriers and
READ_ONCE's are necessary (that's, of course, also not documented in
the queue).
Until that is done I don't really want to touch any of those memory
barriers..
(I had a branch with spsc gone more than a year ago but I abandoned it
for now since it contained some other too ambitious changes. So no
complaints from me. Who is doing it BTW?)
Back to the point - this patch can wait, no problem. To explain the
context though.
I wanted to get rid of looking at the list_empty here because I have a
branch which improves the flow for the 1:1 sched:entity drivers.
Why are the two related? If you remember in the fair scheduler series
all the run-queue stuff is nicely grouped in sched_rq.c and encapsulated
in the rq API, which made it possible to follow up with virtualizing the
rq operations.
The yet another relevant thing is the patch I sent this week which
removes the special case where entity can be initialized with no schedulers.
If we combined all these three pre-requisites, my branch allows the
fully invariant sched:entity and 1:1:1 sched:rq:entity. Run-queue vfuncs
for the 1:1 drivers become mostly no-ops (remove and pop entity are not
needed at all, while add just checks for entity->stopped). So all the
list and tree management needed by M:N drivers simply does not happen.
In that branch 1:1 entities do not take part in the rq->entities_list
so, going back to this patch, the list_empty check will be in the way.
Anyway, just for context, I will park this cleanup for now but you can
mull it over whether the bigger picture sounds interesting to you.
Regards,
Tvrtko
While at it, we add READ_ONCE annotation on the entity->stopped check to
mark the unlocked read.
This would effectively legalize the lockless access. But it is illegal
and undefined behavior. See drm_sched_fini().
P.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
---
drivers/gpu/drm/scheduler/sched_entity.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
b/drivers/gpu/drm/scheduler/sched_entity.c
index bb7e5fc47f99..2ed84504cfd6 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -137,14 +137,8 @@ EXPORT_SYMBOL(drm_sched_entity_modify_sched);
static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
{
- rmb(); /* for list_empty to work without lock */
-
- if (list_empty(&entity->list) ||
- spsc_queue_count(&entity->job_queue) == 0 ||
- entity->stopped)
- return true;
-
- return false;
+ return spsc_queue_count(&entity->job_queue) == 0 ||
+ READ_ONCE(entity->stopped);
}
/**