On Mon, Sep 1, 2025 at 11:20 AM Tvrtko Ursulin <[email protected]> wrote:
>
> + Tomeu and Oded
>
> On 22/08/2025 14:43, Pierre-Eric Pelloux-Prayer wrote:
> > Currently, the scheduler score is incremented when a job is pushed to an
> > entity and when an entity is attached to the scheduler.
> >
> > This leads to some bad scheduling decisions where the score value is
> > largely made of idle entities.
> >
> > For instance, a scenario with 2 schedulers and where 10 entities submit
> > a single job, then do nothing, each scheduler will probably end up with
> > a score of 5.
> > Now, 5 userspace apps exit, so their entities will be dropped. In
> > the worst case, these apps' entities were all attached to the same
> > scheduler and we end up with score=5 (the 5 remaining entities) and
> > score=0, despite the 2 schedulers being idle.
> > When new entities show up, they will all select the second scheduler
> > based on its low score value, instead of alternating between the 2.
> >
> > Some amdgpu rings depended on this feature, but the previous commit
> > implemented the same thing in amdgpu directly so it can be safely
> > removed from drm/sched.
> >
> > Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]>
> > ---
> >   drivers/gpu/drm/scheduler/sched_main.c | 2 --
> >   1 file changed, 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 5a550fd76bf0..e6d232a8ec58 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> >       if (!list_empty(&entity->list))
> >               return;
> >
> > -     atomic_inc(rq->sched->score);
> >       list_add_tail(&entity->list, &rq->entities);
> >   }
> >
> > @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> >
> >       spin_lock(&rq->lock);
> >
> > -     atomic_dec(rq->sched->score);
> >       list_del_init(&entity->list);
> >
> >       if (rq->current_entity == entity)
>
> LGTM.
>
> Reviewed-by: Tvrtko Ursulin <[email protected]>
>
> Only detail is, I did a revisit of the scheduler users and it looks like
> the new rocket driver is the only one other than amdgpu which passes a
> list of more than one scheduler to drm_sched_entity_init. I don't
> *think* it would be affected though. It would still pick the least
> loaded (based on active jobs) scheduler at job submit time. Unless there
> is some hidden behaviour in that driver where it would be important to
> consider number of entities too. Anyway, it would be good for rocket
> driver to double-check and ack.

Hello, thanks for pinging.

I think it should be fine for Rocket.

Acked-by: Tomeu Vizoso <[email protected]>

Regards,

Tomeu
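
For anyone following the thread, the scenario from the commit message can be illustrated with a small toy model. This is only a sketch and not the drm/sched code: `toy_sched`, `toy_score`, `toy_pick` and `toy_newcomers_on_sched1` are hypothetical names. It also assumes that ties go to the lowest-indexed scheduler and that each new entity keeps exactly one job in flight. It starts from the stale state described above (five idle entities left on scheduler 0, none on scheduler 1) and counts how many newcomers land on scheduler 1, with and without attached entities contributing to the score.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a scheduler's load accounting (hypothetical, not drm_gpu_scheduler). */
struct toy_sched {
	int entities;	/* entities currently attached */
	int jobs;	/* jobs currently in flight */
};

/* Score with the old accounting (entities + jobs) or the new one (jobs only). */
static int toy_score(const struct toy_sched *s, bool count_entities)
{
	return s->jobs + (count_entities ? s->entities : 0);
}

/* Pick the scheduler with the lowest score; ties go to the lowest index (assumption). */
static int toy_pick(const struct toy_sched *scheds, int n, bool count_entities)
{
	int best = 0;

	assert(n > 0);
	for (int i = 1; i < n; i++) {
		if (toy_score(&scheds[i], count_entities) <
		    toy_score(&scheds[best], count_entities))
			best = i;
	}
	return best;
}

/*
 * Start from the stale state in the commit message and let new_entities
 * newcomers each attach and push one job that stays in flight.
 * Returns how many of them ended up on scheduler 1.
 */
static int toy_newcomers_on_sched1(int new_entities, bool count_entities)
{
	struct toy_sched scheds[2] = {
		{ .entities = 5, .jobs = 0 },	/* five idle leftovers */
		{ .entities = 0, .jobs = 0 },
	};
	int on_sched1 = 0;

	for (int i = 0; i < new_entities; i++) {
		int s = toy_pick(scheds, 2, count_entities);

		scheds[s].entities++;
		scheds[s].jobs++;
		if (s == 1)
			on_sched1++;
	}
	return on_sched1;
}
```

In this model, with four newcomers, `toy_newcomers_on_sched1(4, true)` returns 3, because the idle leftovers keep pushing new work onto scheduler 1. `toy_newcomers_on_sched1(4, false)` returns 2, which is the alternating behaviour the patch aims for.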
