Quoting Tvrtko Ursulin (2019-07-17 14:40:24) > > On 16/07/2019 13:49, Chris Wilson wrote: > > As we unwind the requests for a preemption event, we return a virtual > > request back to its original virtual engine (so that it is available for > > execution on any of its siblings). In the process, this means that its > > breadcrumb should no longer be associated with the original physical > > engine, and so we are forced to decouple it. Previously, as the request > > could not complete without our awareness, we would move it to the next > > real engine without any danger. However, preempt-to-busy allowed for > > requests to continue on the HW and complete in the background as we > > unwound, which meant that we could end up retiring the request before > > fixing up the breadcrumb link. > > > > [51679.517943] INFO: trying to register non-static key. > > [51679.517956] the code is fine but needs lockdep annotation. > > [51679.517960] turning off the locking correctness validator. > > [51679.517966] CPU: 0 PID: 3270 Comm: kworker/u8:0 Tainted: G U > > 5.2.0+ #717 > > [51679.517971] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS > > BNKBL357.86A.0052.2017.0918.1346 09/18/2017 > > [51679.518012] Workqueue: i915 retire_work_handler [i915] > > [51679.518017] Call Trace: > > [51679.518026] dump_stack+0x67/0x90 > > [51679.518031] register_lock_class+0x52c/0x540 > > [51679.518038] ? find_held_lock+0x2d/0x90 > > [51679.518042] __lock_acquire+0x68/0x1800 > > [51679.518047] ? find_held_lock+0x2d/0x90 > > [51679.518073] ? __i915_sw_fence_complete+0xff/0x1c0 [i915] > > [51679.518079] lock_acquire+0x90/0x170 > > [51679.518105] ? i915_request_cancel_breadcrumb+0x29/0x160 [i915] > > [51679.518112] _raw_spin_lock+0x27/0x40 > > [51679.518138] ? i915_request_cancel_breadcrumb+0x29/0x160 [i915] > > [51679.518165] i915_request_cancel_breadcrumb+0x29/0x160 [i915] > > [51679.518199] i915_request_retire+0x43f/0x530 [i915] > > [51679.518232] retire_requests+0x4d/0x60 [i915] > > [51679.518263] i915_retire_requests+0xdf/0x1f0 [i915] > > [51679.518294] retire_work_handler+0x4c/0x60 [i915] > > [51679.518301] process_one_work+0x22c/0x5c0 > > [51679.518307] worker_thread+0x37/0x390 > > [51679.518311] ? process_one_work+0x5c0/0x5c0 > > [51679.518316] kthread+0x116/0x130 > > [51679.518320] ? kthread_create_on_node+0x40/0x40 > > [51679.518325] ret_from_fork+0x24/0x30 > > [51679.520177] ------------[ cut here ]------------ > > [51679.520189] list_del corruption, ffff88883675e2f0->next is LIST_POISON1 > > (dead000000000100) > > > > Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") > > Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> > > Cc: Tvrtko Ursulin <tvrtko.ursu...@intel.com> > > --- > > drivers/gpu/drm/i915/gt/intel_lrc.c | 13 +++++++++++++ > > 1 file changed, 13 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c > > b/drivers/gpu/drm/i915/gt/intel_lrc.c > > index 7570a9256001..2ae6695f342b 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > > @@ -492,6 +492,19 @@ __unwind_incomplete_requests(struct intel_engine_cs > > *engine) > > list_move(&rq->sched.link, pl); > > active = rq; > > } else { > > + /* > > + * Decouple the virtual breadcrumb before moving it > > + * back to the virtual engine -- we don't want the > > + * request to complete in the background and try > > + * and cancel the breadcrumb on the virtual engine > > + * (instead of the old engine where it is linked)! > > + */ > > + if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, > > + &rq->fence.flags)) { > > + spin_lock(&rq->lock); > > + i915_request_cancel_breadcrumb(rq); > > + spin_unlock(&rq->lock); > > + } > > rq->engine = owner; > > owner->submit_request(rq); > > active = NULL; > > > > Got lost in the maze of DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT & > I915_FENCE_FLAG_SIGNAL flags but eventually it looked okay. Should get > re-added to correct engine's breadcrumb list on submit right?
Yes. It will also survive if the request completes in the background. It's a bit ugly, but will do -- we could probably hide the detail in a local virtual_unwind_request(). Hmm... -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx