Replace rcu_dereference_protected() with rcu_dereference() in hwsp_offset() since the function is called within an RCU read-side critical section. Using rcu_dereference() avoids unnecessary protection checks and aligns with correct RCU usage patterns.
Signed-off-by: Sebastian Brzezinka <[email protected]> --- I noticed that the current implementation of hwsp_offset() uses rcu_dereference_protected() when accessing rq->timeline. This seems to be a slight misuse of the API. rcu_dereference_protected() is intended for updater-side code, where we are not holding rcu_read_lock() but instead rely on another lock that guarantees safety. The condition argument in this function acts more like an assertion that the caller holds the required lock. In our case, hwsp_offset() is called inside an RCU read-side critical section, which means the correct primitive is rcu_dereference(). The original intent of the condition argument seems to have been to guard against use-after-free scenarios for timeline(?). However, rcu_dereference_protected() does not enforce that, it simply returns the pointer regardless of i915_request_signaled(), and in rare cases this pattern has led to issues such as: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15181 ''' ... <4> [281.246503] drivers/gpu/drm/i915/gt/gen8_engine_cs.c:427 suspicious rcu_dereference_protected() usage! <4> [281.246506] other info that might help us debug this: <4> [281.246507] rcu_scheduler_active = 2, debug_locks = 1 <4> [281.246509] 5 locks held by gem_exec_whispe/2308: <4> [281.246511] #0: ffffc90002ae77c8 (reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_gem_do_execbuffer+0xd2c/0x3710 [i915] <4> [281.246852] #1: ffffc90002ae77f0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: i915_gem_do_execbuffer+0xd2c/0x3710 [i915] <4> [281.247073] #2: ffff8881e8a4a878 (&timeline->mutex){+.+.}-{3:3}, at: i915_request_create+0x61/0x200 [i915] <4> [281.247403] #3: ffffffff83595560 (rcu_read_lock){....}-{1:2}, at: execlists_submission_tasklet+0x44/0x27b0 [i915] <4> [281.247592] #4: ffff88812f0c2020 (&sched_engine->lock){-.-.}-{2:2}, at: execlists_submission_tasklet+0x20d/0x27b0 [i915] <4> [281.247787] stack backtrace: <4> [281.247789] CPU: 9 UID: 0 PID: 2308 Comm: gem_exec_whispe Tainted: G U 6.17.0-CI_DRM_17306-gb3f121acbde4+ #1 PREEMPT(voluntary) <4> [281.247792] Tainted: [U]=USER <4> [281.247792] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.6062.A00.2502050210 02/05/2025 <4> [281.247793] Call Trace: <4> [281.247794] <IRQ> <4> [281.247796] dump_stack_lvl+0x91/0xf0 <4> [281.247802] dump_stack+0x10/0x20 <4> [281.247804] lockdep_rcu_suspicious+0x151/0x1e0 <4> [281.247811] ? __i915_request_submit+0xb0/0x430 [i915] <4> [281.248010] hwsp_offset+0x90/0xa0 [i915] <4> [281.248199] gen12_emit_fini_breadcrumb_rcs+0xdf/0x480 [i915] <4> [281.248388] ? __i915_request_submit+0xb0/0x430 [i915] <4> [281.248584] __i915_request_submit+0x15b/0x430 [i915] <4> [281.248781] execlists_submission_tasklet+0xdfa/0x27b0 [i915] <4> [281.248974] ? mark_held_locks+0x46/0x90 <4> [281.248982] tasklet_action_common+0x166/0x410 <4> [281.248988] tasklet_hi_action+0x29/0x40 <4> [281.248990] handle_softirqs+0xd7/0x4d0 <4> [281.248994] ? __i915_request_queue+0x3f/0x80 [i915] <4> [281.249194] __do_softirq+0x10/0x18 <4> [281.249197] do_softirq.part.0+0x47/0xd0 ... ''' This issue reproduces very rarely, and I haven’t been able to reproduce it myself, so I’m not entirely sure why this scenario occurs why we attempt to emit a breadcrumb even when the request’s fence is already signaled. However, the correct approach seems to be: - Drop the use of rcu_dereference_protected() in this context, since it’s not providing real safety here. - Avoid emitting a breadcrumb at all when the request is already signaled, as doing so appears unnecessary. My concern is that breadcrumbs seem to be emitted in __i915_request_submit, which appears to be well-guarded against processing retried requests. This leaves me puzzled about what’s actually happening here. --- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 3 +-- drivers/gpu/drm/i915/i915_request.c | 2 +- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index e9f65f27b53f..b799d423d306 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -424,8 +424,7 @@ static u32 hwsp_offset(const struct i915_request *rq) const struct intel_timeline *tl; /* Before the request is executed, the timeline is fixed */ - tl = rcu_dereference_protected(rq->timeline, - !i915_request_signaled(rq)); + tl = rcu_dereference(rq->timeline); /* See the comment in i915_request_active_seqno(). */ return page_mask_bits(tl->hwsp_offset) + offset_in_page(rq->hwsp_seqno); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index b9a2b2194c8f..25a9e574149e 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -657,7 +657,7 @@ bool __i915_request_submit(struct i915_request *request) if (request->sched.semaphores && i915_sw_fence_signaled(&request->semaphore)) engine->saturated |= request->sched.semaphores; - + /*It seems that breadcrumbs are being emitted here.*/ engine->emit_fini_breadcrumb(request, request->ring->vaddr + request->postfix); -- 2.34.1
