[PATCH v2] drm/sched: Remove optimization that causes hang when killing dependent jobs

2025-07-17 Thread Lin . Cao
When application A submits jobs and application B submits a job with a dependency on A's fence, the normal flow wakes up the scheduler after processing each job. However, the optimization in drm_sched_entity_add_dependency_cb() uses a callback that only clears dependencies without waking up the sch
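
The difference between a dependency callback that only clears the dependency and one that also wakes the scheduler can be sketched with a toy model. This is not kernel code: every name below (`toy_sched`, `toy_entity`, `toy_fence`, and the `toy_*` functions) is hypothetical, and the model ignores locking, work queues, and any other path that might wake the real scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are the kernel's types. */
struct toy_sched {
    int wakeups;              /* how many times the scheduler was woken */
};

struct toy_entity {
    struct toy_sched *sched;  /* scheduler this entity is queued on */
    bool has_dependency;      /* true while a job waits on a fence */
};

typedef void (*toy_fence_cb)(struct toy_entity *entity);

struct toy_fence {
    bool signaled;
    toy_fence_cb cb;          /* at most one waiter in this toy model */
    struct toy_entity *waiter;
};

/* Callback variant 1: clears the dependency, never wakes the scheduler. */
static void toy_clear_dep_only(struct toy_entity *entity)
{
    entity->has_dependency = false;
}

/* Callback variant 2: clears the dependency and wakes the scheduler. */
static void toy_clear_dep_and_wake(struct toy_entity *entity)
{
    entity->has_dependency = false;
    entity->sched->wakeups++;
}

/* Register a waiter on a fence; the entity now has a pending dependency. */
static void toy_fence_add_cb(struct toy_fence *fence,
                             struct toy_entity *entity, toy_fence_cb cb)
{
    assert(fence != NULL && entity != NULL && cb != NULL);
    entity->has_dependency = true;
    fence->waiter = entity;
    fence->cb = cb;
}

/* Signal a fence once; later calls are no-ops. */
static void toy_fence_signal(struct toy_fence *fence)
{
    if (fence->signaled)
        return;
    fence->signaled = true;
    if (fence->cb != NULL)
        fence->cb(fence->waiter);
}
```

With `toy_clear_dep_only`, signaling the fence leaves the entity free of dependencies but `wakeups` unchanged, so in this model nothing prompts the scheduler to look at the now-runnable job unless something else wakes it. With `toy_clear_dep_and_wake`, the same signal also increments `wakeups`.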

[PATCH] drm/sched: Remove optimization that causes hang when killing dependent jobs

2025-07-15 Thread Lin . Cao
When application A submits jobs and application B submits a job with a dependency on A's fence, the normal flow wakes up the scheduler after processing each job. However, the optimization in drm_sched_entity_add_dependency_cb() uses a callback that only clears dependencies without waking up the sch

[PATCH v3] drm/sched: Wake up scheduler when killing jobs to prevent hang

2025-07-15 Thread Lin . Cao
When application A submits jobs (a1, a2, a3) and application B submits job b1 with a dependency on a2's scheduler fence, the normal execution flow is: 1. a1 gets popped from the entity by the scheduler 2. run_job(a1) executes 3. a1's scheduled fence gets signaled 4. drm_sched_run_job_work() calls d

[PATCH v2] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-13 Thread Lin . Cao
When application A submits jobs (a1, a2, a3) and application B submits job b1 with a dependency on a2's scheduler fence, killing application A before run_job(a1) causes drm_sched_entity_kill_jobs_work() to force-signal all jobs sequentially. However, due to missing work_run_job or work_free_job in

[PATCH] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-09 Thread Lin . Cao
When application A submits jobs (a1, a2, a3) and application B submits job b1 with a dependency on a2's scheduler fence, killing application A before run_job(a1) causes drm_sched_entity_kill_jobs_work() to force-signal all jobs sequentially. However, due to missing work_run_job or work_free_job in

[PATCH] drm/scheduler: signal scheduled fence when kill job

2025-05-14 Thread Lin . Cao
Previously we only signaled the finished fence, which could leave some submissions' dependencies uncleared and cause the benchmark to hang. Signaling both the scheduled fence and the finished fence can fix this issue. Signed-off-by: Lin.Cao --- drivers/gpu/drm/scheduler/sched_entity.c | 1 + 1 file changed, 1 in

[PATCH] drm/buddy: fix issue that force_merge cannot free all roots

2024-08-13 Thread Lin . Cao
If the buddy manager has more than one root and each root has sub-blocks that need to be freed, then when drm_buddy_fini is called, the first loop of force_merge will merge and free all of the sub-blocks of the first root, whose offset is 0x0 and whose size is the largest (more than half of the mm size). In subsequent force_merge
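
To see why a buddy manager can have several roots, and why the first one sits at offset 0x0 and is the largest, here is a minimal sketch that splits a managed size into power-of-two roots, largest first. It is a toy model rather than the drm_buddy implementation: `toy_root` and `toy_split_roots` are hypothetical names, and the sketch assumes `chunk` is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy description of one root block; not the kernel's block type. */
struct toy_root {
    uint64_t offset;
    uint64_t size;
};

/*
 * Split `size` bytes into power-of-two roots, largest first.
 * Assumes `chunk` (the minimum block size) is a power of two.
 * Any tail smaller than `chunk` is dropped.
 * Returns the number of roots written to `roots` (at most `max_roots`).
 */
static size_t toy_split_roots(uint64_t size, uint64_t chunk,
                              struct toy_root *roots, size_t max_roots)
{
    uint64_t offset = 0;
    size_t n = 0;

    assert(roots != NULL);
    if (chunk == 0)
        return 0;

    size -= size % chunk;               /* round down to whole chunks */

    while (size >= chunk && n < max_roots) {
        uint64_t root = chunk;

        /* Grow to the largest power-of-two multiple that still fits. */
        while (root <= size / 2)
            root *= 2;

        roots[n].offset = offset;
        roots[n].size = root;
        n++;

        offset += root;
        size -= root;
    }
    return n;
}
```

For example, a size of 12 chunks of 4096 bytes yields two roots in this model: 32768 bytes at offset 0 and 16384 bytes at offset 32768. Whenever the size is not itself a power of two, the first root is more than half of the total, and every further root needs its own merge-and-free pass.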

[PATCH] drm/buddy: fix issue that force_merge cannot free all roots

2024-08-07 Thread Lin . Cao
If the buddy manager has more than one root and each root has sub-blocks that need to be freed, then when drm_buddy_fini is called, the first loop of force_merge will merge and free all of the sub-blocks of the first root, whose offset is 0x0 and whose size is the largest (more than half of the mm size). In subsequent force_merge