On Mon, Jul 21, 2025 at 12:14:31PM +0200, Danilo Krummrich wrote:
> On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote:
> > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote:
> >> On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote:
> >> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >> > index bfea608a7106..997a2cc1a635 100644
> >> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> >> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >> > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity,
> >> > 
> >> >  	entity->oldest_job_waiting = ts;
> >> > 
> >> > -	rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root,
> >> > -		      drm_sched_entity_compare_before);
> >> > +	if (!entity->stopped) {
> >> > +		rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root,
> >> > +			      drm_sched_entity_compare_before);
> >> > +	}
> >> 
> >> If this is a race, then this patch here is broken, too, because you're
> >> checking the 'stopped' boolean as the callers of that function do, too
> >> – just later. :O
> >> 
> >> Could still race, just less likely.
> >> 
> >> The proper way to fix it would then be to address the issue where the
> >> locking is supposed to happen. Let's look at, for example,
> >> drm_sched_entity_push_job():
> >> 
> >> 
> >> void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> >> {
> >> 	(Bla bla bla)
> >> 
> >> 	…………
> >> 
> >> 	/* first job wakes up scheduler */
> >> 	if (first) {
> >> 		struct drm_gpu_scheduler *sched;
> >> 		struct drm_sched_rq *rq;
> >> 
> >> 		/* Add the entity to the run queue */
> >> 		spin_lock(&entity->lock);
> >> 		if (entity->stopped) {		<---- Aha!
> >> 			spin_unlock(&entity->lock);
> >> 
> >> 			DRM_ERROR("Trying to push to a killed entity\n");
> >> 			return;
> >> 		}
> >> 
> >> 		rq = entity->rq;
> >> 		sched = rq->sched;
> >> 
> >> 		spin_lock(&rq->lock);
> >> 		drm_sched_rq_add_entity(rq, entity);
> >> 
> >> 		if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
> >> 			drm_sched_rq_update_fifo_locked(entity, rq, submit_ts);	<---- bumm!
> >> 
> >> 		spin_unlock(&rq->lock);
> >> 		spin_unlock(&entity->lock);
> >> 
> >> But the locks are still being held. So that "shouldn't be happening"(tm).
> >> 
> >> Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini()
> >> stop entities. The former holds appropriate locks, but drm_sched_fini()
> >> doesn't. So that looks like a hot candidate to me. Opinions?
> >> 
> >> On the other hand, aren't drivers prohibited from calling
> >> drm_sched_entity_push_job() after calling drm_sched_fini()? If the
> >> fuzzer does that, then it's not the scheduler's fault.
> 
> Exactly, this is the first question to ask.
> 
> And I think it's even more restrictive:
> 
> In drm_sched_fini()
> 
> 	for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
> 		struct drm_sched_rq *rq = sched->sched_rq[i];
> 
> 		spin_lock(&rq->lock);
> 		list_for_each_entry(s_entity, &rq->entities, list)
> 			/*
> 			 * Prevents reinsertion and marks job_queue as idle,
> 			 * it will be removed from the rq in drm_sched_entity_fini()
> 			 * eventually
> 			 */
> 			s_entity->stopped = true;
> 		spin_unlock(&rq->lock);
> 		kfree(sched->sched_rq[i]);
> 	}
> 
> In drm_sched_entity_kill()
> 
> static void drm_sched_entity_kill(struct drm_sched_entity *entity)
> {
> 	struct drm_sched_job *job;
> 	struct dma_fence *prev;
> 
> 	if (!entity->rq)
> 		return;
> 
> 	spin_lock(&entity->lock);
> 	entity->stopped = true;
> 	drm_sched_rq_remove_entity(entity->rq, entity);
> 	spin_unlock(&entity->lock);
> 
> 	[...]
> }
> 
> If this runs concurrently, this is a UAF as well.
> 
> Personally, I have always been working with the assumption that entities have
> to be torn down *before* the scheduler, but those lifetimes are not documented
> properly.
Yes, this is my assumption too. I would even take it further: an entity
shouldn't be torn down until all jobs associated with it are freed as well. I
think this would solve a lot of the issues I've seen on the list related to
UAF, teardown, etc.

> There are two solutions:
> 
> (1) Strictly require all entities to be torn down before drm_sched_fini(),
>     i.e. stick to the natural ownership and lifetime rules here (see below).
> 
> (2) Actually protect *any* changes of the relevant fields of the entity
>     structure with the entity lock.
> 
> While (2) seems rather obvious, we run into lock inversion with this approach,
> as you note below as well. And I think drm_sched_fini() should not mess with
> entities anyways.
> 
> The ownership here seems obvious:
> 
> The scheduler *owns* a resource that is used by entities. Consequently,
> entities are not allowed to out-live the scheduler.
> 
> Surely, the current implementation to just take the resource away from the
> entity under the hood can work as well with appropriate locking, but that's a
> mess.
> 
> If the resource *really* needs to be shared for some reason (which I don't
> see), shared ownership, i.e. reference counting, is much less error prone.

Yes, Xe solves all of this via reference counting (jobs refcount the entity).
It's a bit easier in Xe since the scheduler and entities are the same object
due to their 1:1 relationship. But even in non-1:1 relationships, an entity
could refcount the scheduler. The teardown sequence would then be: all jobs on
the entity complete → tear down the entity → all entities torn down → tear
down the scheduler.

Matt
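
P.S. Purely to illustrate that ownership model, here is a rough sketch of the
"entity pins the scheduler" variant. The my_sched/my_entity structures and the
helpers below are made-up names for the example, not the actual Xe or
drm_sched API; the point is only that the last kref_put() is what finally
allows drm_sched_fini() to run:

	#include <linux/container_of.h>
	#include <linux/kref.h>
	#include <linux/slab.h>
	#include <drm/gpu_scheduler.h>

	struct my_sched {
		struct kref refcount;
		struct drm_gpu_scheduler base;
	};

	struct my_entity {
		struct drm_sched_entity base;
		struct my_sched *sched;
	};

	static void my_sched_release(struct kref *kref)
	{
		struct my_sched *s = container_of(kref, struct my_sched, refcount);

		/* Last reference dropped, no entity can reach this scheduler anymore. */
		drm_sched_fini(&s->base);
		kfree(s);
	}

	static void my_entity_bind(struct my_entity *e, struct my_sched *s)
	{
		/* The entity pins the scheduler for its whole lifetime. */
		kref_get(&s->refcount);
		e->sched = s;
	}

	static void my_entity_destroy(struct my_entity *e)
	{
		/* Only called once all jobs pushed to this entity have been freed. */
		drm_sched_entity_fini(&e->base);
		kref_put(&e->sched->refcount, my_sched_release);
	}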