Hi,

Since its introduction, the vc4 driver has scheduled its jobs and tracked
the dependencies between them using its own internal job queue
implementation. This internal implementation is based on job lists, wait
queues and hand-rolled seqnos. Although job scheduling worked most of the
time, many GPU hangs were reported in more GPU-intensive scenarios [1][2].

After investigating several GPU hangs, I noticed that job dependencies
weren't being tracked correctly, which could lead to synchronization
issues and GPU resets. Also, the GPU reset path had issues related to job
resubmission.

Considering the many issues related to the internal job queue
implementation, this series proposes switching to the DRM GPU scheduler,
which is a well-established implementation used by multiple DRM drivers.
This has many advantages:

1. Using common code: Instead of relying on a custom implementation, use
   a trusted common framework. This helps with maintainability of the vc4
   driver. It also makes the code more readable.

2. Synchronization issues are gone: With this series, applications can
   work reliably on RPi 3. Many users reported that they weren't able to
   open applications like emulators on the device. Now, it's possible to
   play several retro games without issues.

3. GPU resets are recoverable: Even if a timeout happens, the GPU is able
   to recover successfully with minimal impact to the user.

4. PM actually works: Before this series, the GPU was active during the
   entire runtime. After this series, the GPU is able to autosuspend and
   resume when needed.

In order to improve reviewability of the patches, I introduced the new
infrastructure piece by piece without actually plugging it in. The actual
switchover only happens in the patch "drm/vc4: Switch to DRM GPU
scheduler".

For this second version, I moved all the fix patches to a different
series [3], with the goal of making this series more focused on the
scheduler changes.

This series was mostly based on the design of the v3d driver, as the two
drivers are very similar.

[1] https://github.com/raspberrypi/linux/issues/5780
[2] https://github.com/raspberrypi/linux/issues/3221
[3] https://lore.kernel.org/dri-devel/[email protected]/T/

Best regards,
- Maíra

---
v1 -> v2: https://lore.kernel.org/r/[email protected]

- Moved all miscellaneous fixes and improvements to a separate series:
  https://lore.kernel.org/dri-devel/[email protected]/T/
- [1/7] Add Melissa's R-b (Melissa Wen)
- [2/7] Squash "[PATCH 04/11] drm/vc4: Introduce vc4_job structures for
  DRM scheduler integration" and "[PATCH 05/11] drm/vc4: Add DRM GPU
  scheduler infrastructure" (Tvrtko Ursulin)
- [2/7] Centralize the initialization of queues in vc4_sched_init()
  (Melissa Wen)
- [2/7] Handle error when vc4_fence_create() fails (Melissa Wen)
- [2/7] Protect vc4->render_job when updating the pointer in `run_job()`
  (Melissa Wen)
- [2/7] Handle error when drm_sched_entity_init() fails (Tvrtko Ursulin)
- [2/7] Clarify comment in sched_lock (Tvrtko Ursulin)
- [2/7] Remove fence_lock as dma-fences now support a built-in lock
  (Tvrtko Ursulin)
- [2/7] Use spin_(un)lock_irq in `run_job()` callbacks (Tvrtko Ursulin)
- [3/7] Add a comment explaining why we don't need to unreference BOs in
  case of failure in vc4_lookup_bos()
- [3/7] Several stylistic adjustments to vc4_get_bcl() (Tvrtko Ursulin)
- [3/7] s/kvmalloc_array/kvmalloc in vc4_get_bcl() (Tvrtko Ursulin)
- [3/7] Remove all error messages related to allocation failures
  (Tvrtko Ursulin)
- [3/7] Rename vc4_job_init() to vc4_job_alloc() (Tvrtko Ursulin)
- [3/7] Address cases in which in_sync or out_sync <= 0 (Tvrtko Ursulin)
- [3/7] Replace vc4_attach_fences_and_unlock_reservation() with
  vc4_attach_fences() + drm_exec_fini() (Tvrtko Ursulin)
- [3/7] Don't clean-up the BIN job if it has already been pushed.
- [4/7] NEW PATCH: "drm/vc4: Refcount vc4_file for safe access by jobs"
  (Tvrtko Ursulin)
- [5/7] Use vc4_file refcount to get a fd reference.
- [5/7] Add comment explaining why we use dma_fence_get_rcu() in
  vc4_wait_seqno_ioctl() (Tvrtko Ursulin)
- [7/7] Return "unknown" instead of NULL when the fence type is unknown
  (Tvrtko Ursulin)

---
Maíra Canal (7):
      drm/vc4: Move vc4_wait_bo_ioctl() to vc4_bo.c
      drm/vc4: Add DRM GPU scheduler infrastructure and job structures
      drm/vc4: Add new job submission implementation
      drm/vc4: Refcount vc4_file for safe access by jobs
      drm/vc4: Add per-file descriptor seqno tracking
      drm/vc4: Switch to DRM GPU scheduler
      drm/vc4: Use unique fence timeline names per queue

 drivers/gpu/drm/vc4/Kconfig         |   1 +
 drivers/gpu/drm/vc4/Makefile        |   2 +
 drivers/gpu/drm/vc4/vc4_bo.c        |  33 ++
 drivers/gpu/drm/vc4/vc4_drv.c       |  49 +-
 drivers/gpu/drm/vc4/vc4_drv.h       | 234 ++++-----
 drivers/gpu/drm/vc4/vc4_fence.c     |  34 +-
 drivers/gpu/drm/vc4/vc4_gem.c       | 961 ++---------------------------------
 drivers/gpu/drm/vc4/vc4_irq.c       | 132 +----
 drivers/gpu/drm/vc4/vc4_render_cl.c |  17 +-
 drivers/gpu/drm/vc4/vc4_sched.c     | 337 +++++++++++++
 drivers/gpu/drm/vc4/vc4_submit.c    | 581 ++++++++++++++++++++++
 drivers/gpu/drm/vc4/vc4_v3d.c       |  24 +-
 drivers/gpu/drm/vc4/vc4_validate.c  |  21 +-
 13 files changed, 1243 insertions(+), 1183 deletions(-)
---
base-commit: 4b9c36c83b34f710da9573291404f6a2246251c1
change-id: 20260121-vc4-drm-scheduler-03cd8670b3f6
