When xa_alloc_cyclic() failed in etnaviv_sched_push_job(), the error
path skipped drm_sched_entity_push_job(). This is a violation of the DRM
scheduler contract, as once a job has been armed with drm_sched_job_arm(),
it must be pushed with drm_sched_entity_push_job(). From the DRM
scheduler documentation,

"""
drm_sched_job_arm() is a point of no return since it initializes the
fences and their sequence number etc. Once that function has been called,
you *must* submit it with drm_sched_entity_push_job() and cannot simply
abort it by calling drm_sched_job_cleanup().
"""

Fix this by splitting the fence ID allocation into two phases: first,
alloc an xarray slot before arming the job (which can fail), then fill in
the actual fence with xa_store() after arming. This way, allocation
failures are handled before the job is armed, and once armed, the job is
always pushed to the scheduler.

This also fixes a double call to drm_sched_job_cleanup(), as both
etnaviv_sched_push_job() and its caller would call it on failure.

Fixes: 764be12345c3 ("drm/etnaviv: convert user fence tracking to XArray")
Signed-off-by: Maíra Canal <[email protected]>
---
 drivers/gpu/drm/etnaviv/etnaviv_sched.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index df4232d7e135..3cc50d697c89 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -116,16 +116,18 @@ int etnaviv_sched_push_job(struct etnaviv_gem_submit 
*submit)
         */
        mutex_lock(&gpu->sched_lock);
 
+       ret = xa_alloc_cyclic(&gpu->user_fences, &submit->out_fence_id,
+                             NULL, xa_limit_32b, &gpu->next_user_fence,
+                             GFP_KERNEL);
+       if (ret < 0)
+               goto out_unlock;
+
        drm_sched_job_arm(&submit->sched_job);
 
        submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
-       ret = xa_alloc_cyclic(&gpu->user_fences, &submit->out_fence_id,
-                             submit->out_fence, xa_limit_32b,
-                             &gpu->next_user_fence, GFP_KERNEL);
-       if (ret < 0) {
-               drm_sched_job_cleanup(&submit->sched_job);
-               goto out_unlock;
-       }
+
+       xa_store(&gpu->user_fences, submit->out_fence_id,
+                submit->out_fence, GFP_KERNEL);
 
        /* the scheduler holds on to the job now */
        kref_get(&submit->refcount);
-- 
2.53.0

Reply via email to