Replace rcu_dereference_protected() with rcu_dereference() in
hwsp_offset() since the function is called within an RCU read-side
critical section. Using rcu_dereference() avoids unnecessary
protection checks and aligns with correct RCU usage patterns.

Signed-off-by: Sebastian Brzezinka <[email protected]>
---
I noticed that the current implementation of hwsp_offset() uses 
rcu_dereference_protected()
when accessing rq->timeline. This seems to be a slight misuse of the API.
rcu_dereference_protected() is intended for updater-side code, where we are not 
holding
rcu_read_lock() but instead rely on another lock that guarantees safety. The 
condition argument
in this function acts more like an assertion that the caller holds the required 
lock.
In our case, hwsp_offset() is called inside an RCU read-side critical section, 
which means
the correct primitive is rcu_dereference(). The original intent of the 
condition argument
seems to have been to guard against use-after-free scenarios for timeline(?). 
However,
rcu_dereference_protected() does not enforce that, it simply returns the 
pointer regardless
of i915_request_signaled(), and in rare cases this pattern has led to issues 
such as:
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15181
'''
...
<4> [281.246503] drivers/gpu/drm/i915/gt/gen8_engine_cs.c:427 suspicious 
rcu_dereference_protected() usage!
<4> [281.246506]
other info that might help us debug this:
<4> [281.246507]
rcu_scheduler_active = 2, debug_locks = 1
<4> [281.246509] 5 locks held by gem_exec_whispe/2308:
<4> [281.246511]  #0: ffffc90002ae77c8 
(reservation_ww_class_acquire){+.+.}-{0:0}, at: 
i915_gem_do_execbuffer+0xd2c/0x3710 [i915]
<4> [281.246852]  #1: ffffc90002ae77f0 
(reservation_ww_class_mutex){+.+.}-{3:3}, at: 
i915_gem_do_execbuffer+0xd2c/0x3710 [i915]
<4> [281.247073]  #2: ffff8881e8a4a878 (&timeline->mutex){+.+.}-{3:3}, at: 
i915_request_create+0x61/0x200 [i915]
<4> [281.247403]  #3: ffffffff83595560 (rcu_read_lock){....}-{1:2}, at: 
execlists_submission_tasklet+0x44/0x27b0 [i915]
<4> [281.247592]  #4: ffff88812f0c2020 (&sched_engine->lock){-.-.}-{2:2}, at: 
execlists_submission_tasklet+0x20d/0x27b0 [i915]
<4> [281.247787]
stack backtrace:
<4> [281.247789] CPU: 9 UID: 0 PID: 2308 Comm: gem_exec_whispe Tainted: G     U 
             6.17.0-CI_DRM_17306-gb3f121acbde4+ #1 PREEMPT(voluntary)
<4> [281.247792] Tainted: [U]=USER
<4> [281.247792] Hardware name: Intel Corporation Rocket Lake Client 
Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.6062.A00.2502050210 
02/05/2025
<4> [281.247793] Call Trace:
<4> [281.247794]  <IRQ>
<4> [281.247796]  dump_stack_lvl+0x91/0xf0
<4> [281.247802]  dump_stack+0x10/0x20
<4> [281.247804]  lockdep_rcu_suspicious+0x151/0x1e0
<4> [281.247811]  ? __i915_request_submit+0xb0/0x430 [i915]
<4> [281.248010]  hwsp_offset+0x90/0xa0 [i915]
<4> [281.248199]  gen12_emit_fini_breadcrumb_rcs+0xdf/0x480 [i915]
<4> [281.248388]  ? __i915_request_submit+0xb0/0x430 [i915]
<4> [281.248584]  __i915_request_submit+0x15b/0x430 [i915]
<4> [281.248781]  execlists_submission_tasklet+0xdfa/0x27b0 [i915]
<4> [281.248974]  ? mark_held_locks+0x46/0x90
<4> [281.248982]  tasklet_action_common+0x166/0x410
<4> [281.248988]  tasklet_hi_action+0x29/0x40
<4> [281.248990]  handle_softirqs+0xd7/0x4d0
<4> [281.248994]  ? __i915_request_queue+0x3f/0x80 [i915]
<4> [281.249194]  __do_softirq+0x10/0x18
<4> [281.249197]  do_softirq.part.0+0x47/0xd0
...
'''

This issue reproduces very rarely, and I haven’t been able to reproduce it 
myself, so
I’m not entirely sure why this scenario occurs why we attempt to emit a 
breadcrumb even
when the request’s fence is already signaled. However, the correct approach 
seems to be:
 - Drop the use of rcu_dereference_protected() in this context, since it’s not 
providing
   real safety here.
 - Avoid emitting a breadcrumb at all when the request is already signaled, as
   doing so appears unnecessary.

My concern is that breadcrumbs seem to be emitted in __i915_request_submit, 
which appears
to be well-guarded against processing retried requests. This leaves me puzzled 
about what’s
actually happening here.
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 3 +--
 drivers/gpu/drm/i915/i915_request.c      | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index e9f65f27b53f..b799d423d306 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -424,8 +424,7 @@ static u32 hwsp_offset(const struct i915_request *rq)
        const struct intel_timeline *tl;
 
        /* Before the request is executed, the timeline is fixed */
-       tl = rcu_dereference_protected(rq->timeline,
-                                      !i915_request_signaled(rq));
+       tl = rcu_dereference(rq->timeline);
 
        /* See the comment in i915_request_active_seqno(). */
        return page_mask_bits(tl->hwsp_offset) + offset_in_page(rq->hwsp_seqno);
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index b9a2b2194c8f..25a9e574149e 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -657,7 +657,7 @@ bool __i915_request_submit(struct i915_request *request)
        if (request->sched.semaphores &&
            i915_sw_fence_signaled(&request->semaphore))
                engine->saturated |= request->sched.semaphores;
-
+       /*It seems that breadcrumbs are being emitted here.*/
        engine->emit_fini_breadcrumb(request,
                                     request->ring->vaddr + request->postfix);
 
-- 
2.34.1

Reply via email to