Re: [PATCH 5/5] drm/sched: Make use of a "done" list (v2)
On 2020-12-04 3:16 a.m., Christian König wrote:
> Am 04.12.20 um 04:17 schrieb Luben Tuikov:
>> The drm_sched_job_done() callback now moves done
>> jobs from the pending list to a "done" list.
>>
>> In drm_sched_job_timedout(), make use of the status
>> returned by a GPU driver's job timeout handler to
>> decide whether to leave the oldest job in the
>> pending list, or to send it off to the done list.
>> If a driver's job timeout callback returns a
>> status that the job is done, it is added to the
>> done list and the done thread is woken up. If the
>> job needs more time, it is left on the pending
>> list and the timeout timer is restarted.
>>
>> The idea is that a GPU driver can check the IP to
>> which the passed-in job belongs and determine
>> whether the IP is alive and well, or whether it needs
>> more time to complete this job and perhaps others
>> also executing on it.
>>
>> In drm_sched_job_timedout(), the main scheduler
>> thread is now parked before calling a driver's
>> timedout_job callback, so as to not compete pushing
>> jobs down to the GPU while the recovery method is
>> taking place.
>>
>> Eliminate the polling mechanism of picking out done
>> jobs from the pending list, i.e. eliminate
>> drm_sched_get_cleanup_job().
>>
>> This also eliminates the oldest job disappearing
>> from the pending list while the driver's timeout
>> handler is called.
>>
>> Various other optimizations to the GPU scheduler
>> and job recovery are possible with this format.
>>
>> Signed-off-by: Luben Tuikov
>>
>> Cc: Alexander Deucher
>> Cc: Andrey Grodzovsky
>> Cc: Christian König
>> Cc: Daniel Vetter
>> Cc: Lucas Stach
>> Cc: Russell King
>> Cc: Christian Gmeiner
>> Cc: Qiang Yu
>> Cc: Rob Herring
>> Cc: Tomeu Vizoso
>> Cc: Steven Price
>> Cc: Alyssa Rosenzweig
>> Cc: Eric Anholt
>>
>> v2: Dispel using a done thread, so as to keep
>> the cache hot on the same processor.
>> ---
>>  drivers/gpu/drm/scheduler/sched_main.c | 247 +
>>  include/drm/gpu_scheduler.h            |   4 +
>>  2 files changed, 134 insertions(+), 117 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index b9876cad94f2..d77180b44998 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -164,7 +164,9 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>   * drm_sched_job_done - complete a job
>>   * @s_job: pointer to the job which is done
>>   *
>> - * Finish the job's fence and wake up the worker thread.
>> + * Move the completed task to the done list,
>> + * signal its fence to mark it finished,
>> + * and wake up the worker thread.
>>   */
>>  static void drm_sched_job_done(struct drm_sched_job *s_job)
>>  {
>> @@ -176,9 +178,14 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>>
>>          trace_drm_sched_process_job(s_fence);
>>
>> +        spin_lock(&sched->job_list_lock);
>> +        list_move(&s_job->list, &sched->done_list);
>> +        spin_unlock(&sched->job_list_lock);
>> +
>
> That is racy, as soon as the spinlock is dropped the job and with it
> the s_fence might have been destroyed.

Yeah, I had it the other way around (the correct way) and changed
it--not sure why. I'll revert it back.

Thanks for catching this.

Regards,
Luben

>
>>          dma_fence_get(&s_fence->finished);
>>          drm_sched_fence_finished(s_fence);
>>          dma_fence_put(&s_fence->finished);
>
> In other words this here needs to come first.
>
> Regards,
> Christian.
>
>> +
>>          wake_up_interruptible(&sched->wake_up_worker);
>>  }
>>
>> @@ -309,6 +316,37 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>>          spin_unlock(&sched->job_list_lock);
>>  }
>>
>> +/** drm_sched_job_timedout -- a timer timeout occurred
>> + * @work: pointer to work_struct
>> + *
>> + * First, park the scheduler thread whose IP timed out,
>> + * so that we don't race with the scheduler thread pushing
>> + * jobs down the IP as we try to investigate what
>> + * happened and give drivers a chance to recover.
>> + *
>> + * Second, take the first job in the pending list
>> + * (oldest), leave it in the pending list and call the
>> + * driver's timer timeout callback to find out what
>> + * happened, passing this job as the suspect one.
>> + *
>> + * The driver may return DRM_TASK_STATUS_COMPLETE,
>> + * which means the task is not in the IP(*) and we move
>> + * it to the done list to free it.
>> + *
>> + * (*) A reason for this would be, say, that the job
>> + * completed in due time, or the driver has aborted
>> + * this job using driver-specific methods in the
>> + * timedout_job callback and has now removed it from
>> + * the hardware.
>> + *
>> + * Or, the driver may return DRM_TASK_STATUS_ALIVE, to
>> + * indicate that it had inquired about this job, and it
>> + * has verified that this job is alive and well, and
>> + * that the DRM layer should give this task more time
>> +
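Coming back to the race pointed out above, here is a minimal sketch of drm_sched_job_done() with the suggested ordering: signal the finished fence first, then move the job to the done list. This is not the posted v2 hunk; it only rearranges the lines visible in the quote, the two local variable declarations are assumed from context, and it relies on the done_list member and job_list_lock introduced by this series (intended to live in drivers/gpu/drm/scheduler/sched_main.c):

static void drm_sched_job_done(struct drm_sched_job *s_job)
{
        struct drm_sched_fence *s_fence = s_job->s_fence;
        struct drm_gpu_scheduler *sched = s_fence->sched;

        trace_drm_sched_process_job(s_fence);

        /* Signal the finished fence while we still safely own the job;
         * once the job sits on the done list and the lock is dropped,
         * the worker thread may free s_job (and with it s_fence) at any
         * time.
         */
        dma_fence_get(&s_fence->finished);
        drm_sched_fence_finished(s_fence);
        dma_fence_put(&s_fence->finished);

        spin_lock(&sched->job_list_lock);
        list_move(&s_job->list, &sched->done_list);
        spin_unlock(&sched->job_list_lock);

        wake_up_interruptible(&sched->wake_up_worker);
}

With this ordering the job is only published on the done list, where it can be reclaimed, after its fence has been signalled, which is the "this here needs to come first" point.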
Re: [PATCH 5/5] drm/sched: Make use of a "done" list (v2)
Am 04.12.20 um 04:17 schrieb Luben Tuikov:
> The drm_sched_job_done() callback now moves done
> jobs from the pending list to a "done" list.
>
> In drm_sched_job_timedout(), make use of the status
> returned by a GPU driver's job timeout handler to
> decide whether to leave the oldest job in the
> pending list, or to send it off to the done list.
> If a driver's job timeout callback returns a
> status that the job is done, it is added to the
> done list and the done thread is woken up. If the
> job needs more time, it is left on the pending
> list and the timeout timer is restarted.
>
> The idea is that a GPU driver can check the IP to
> which the passed-in job belongs and determine
> whether the IP is alive and well, or whether it needs
> more time to complete this job and perhaps others
> also executing on it.
>
> In drm_sched_job_timedout(), the main scheduler
> thread is now parked before calling a driver's
> timedout_job callback, so as to not compete pushing
> jobs down to the GPU while the recovery method is
> taking place.
>
> Eliminate the polling mechanism of picking out done
> jobs from the pending list, i.e. eliminate
> drm_sched_get_cleanup_job().
>
> This also eliminates the oldest job disappearing
> from the pending list while the driver's timeout
> handler is called.
>
> Various other optimizations to the GPU scheduler
> and job recovery are possible with this format.
>
> Signed-off-by: Luben Tuikov
>
> Cc: Alexander Deucher
> Cc: Andrey Grodzovsky
> Cc: Christian König
> Cc: Daniel Vetter
> Cc: Lucas Stach
> Cc: Russell King
> Cc: Christian Gmeiner
> Cc: Qiang Yu
> Cc: Rob Herring
> Cc: Tomeu Vizoso
> Cc: Steven Price
> Cc: Alyssa Rosenzweig
> Cc: Eric Anholt
>
> v2: Dispel using a done thread, so as to keep
> the cache hot on the same processor.
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 247 +
>  include/drm/gpu_scheduler.h            |   4 +
>  2 files changed, 134 insertions(+), 117 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index b9876cad94f2..d77180b44998 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -164,7 +164,9 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>   * drm_sched_job_done - complete a job
>   * @s_job: pointer to the job which is done
>   *
> - * Finish the job's fence and wake up the worker thread.
> + * Move the completed task to the done list,
> + * signal its fence to mark it finished,
> + * and wake up the worker thread.
>   */
>  static void drm_sched_job_done(struct drm_sched_job *s_job)
>  {
> @@ -176,9 +178,14 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>
>          trace_drm_sched_process_job(s_fence);
>
> +        spin_lock(&sched->job_list_lock);
> +        list_move(&s_job->list, &sched->done_list);
> +        spin_unlock(&sched->job_list_lock);
> +

That is racy, as soon as the spinlock is dropped the job and with it the
s_fence might have been destroyed.

>          dma_fence_get(&s_fence->finished);
>          drm_sched_fence_finished(s_fence);
>          dma_fence_put(&s_fence->finished);

In other words this here needs to come first.

Regards,
Christian.

> +
>          wake_up_interruptible(&sched->wake_up_worker);
>  }
>
> @@ -309,6 +316,37 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>          spin_unlock(&sched->job_list_lock);
>  }
>
> +/** drm_sched_job_timedout -- a timer timeout occurred
> + * @work: pointer to work_struct
> + *
> + * First, park the scheduler thread whose IP timed out,
> + * so that we don't race with the scheduler thread pushing
> + * jobs down the IP as we try to investigate what
> + * happened and give drivers a chance to recover.
> + *
> + * Second, take the first job in the pending list
> + * (oldest), leave it in the pending list and call the
> + * driver's timer timeout callback to find out what
> + * happened, passing this job as the suspect one.
> + *
> + * The driver may return DRM_TASK_STATUS_COMPLETE,
> + * which means the task is not in the IP(*) and we move
> + * it to the done list to free it.
> + *
> + * (*) A reason for this would be, say, that the job
> + * completed in due time, or the driver has aborted
> + * this job using driver-specific methods in the
> + * timedout_job callback and has now removed it from
> + * the hardware.
> + *
> + * Or, the driver may return DRM_TASK_STATUS_ALIVE, to
> + * indicate that it had inquired about this job, and it
> + * has verified that this job is alive and well, and
> + * that the DRM layer should give this task more time
> + * to complete. In this case, we restart the timeout timer.
> + *
> + * Lastly, we unpark the scheduler thread.
> + */
>  static void drm_sched_job_timedout(struct work_struct *work)
>  {
>          struct drm_gpu_scheduler *sched;
> @@ -316,37 +354,32 @@ static void drm_sched_job_timedout(struct work_struct *work)
>
>          sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>
> -        /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> +        kthread_park(sched->thread);
> +
>          spin_lock(&sched->job_list_lock);
>          job = lis
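For completeness, the kernel-doc above maps onto a handler of roughly the following shape. This is only a sketch of the described flow, not the body of the real drm_sched_job_timedout() (the quoted hunk is cut off); it assumes the pending_list/done_list naming from earlier patches in this series, the DRM_TASK_STATUS_* values introduced for the timedout_job callback, and the existing drm_sched_start_timeout() helper in sched_main.c:

static void drm_sched_job_timedout(struct work_struct *work)
{
        struct drm_gpu_scheduler *sched =
                container_of(work, struct drm_gpu_scheduler, work_tdr.work);
        struct drm_sched_job *job;

        /* Park the scheduler thread so no new jobs are pushed to the IP
         * while the driver investigates and possibly recovers.
         */
        kthread_park(sched->thread);

        /* Pick the oldest pending job but leave it on the pending list. */
        spin_lock(&sched->job_list_lock);
        job = list_first_entry_or_null(&sched->pending_list,
                                       struct drm_sched_job, list);
        spin_unlock(&sched->job_list_lock);

        if (job) {
                if (sched->ops->timedout_job(job) == DRM_TASK_STATUS_COMPLETE) {
                        /* The job is out of the hardware; hand it to the
                         * done list so the worker thread can free it.
                         */
                        spin_lock(&sched->job_list_lock);
                        list_move(&job->list, &sched->done_list);
                        spin_unlock(&sched->job_list_lock);
                        wake_up_interruptible(&sched->wake_up_worker);
                } else {
                        /* DRM_TASK_STATUS_ALIVE: give the job more time. */
                        drm_sched_start_timeout(sched);
                }
        }

        kthread_unpark(sched->thread);
}

Note how parking the thread up front, rather than temporarily deleting the oldest job, is what keeps the job visible on the pending list for the whole duration of the driver's timeout handling.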