Quoting Tvrtko Ursulin (2019-07-17 14:40:24)
> 
> On 16/07/2019 13:49, Chris Wilson wrote:
> > As we unwind the requests for a preemption event, we return a virtual
> > request back to its original virtual engine (so that it is available for
> > execution on any of its siblings). In the process, this means that its
> > breadcrumb should no longer be associated with the original physical
> > engine, and so we are forced to decouple it. Previously, as the request
> > could not complete without our awareness, we would move it to the next
> > real engine without any danger. However, preempt-to-busy allowed for
> > requests to continue on the HW and complete in the background as we
> > unwound, which meant that we could end up retiring the request before
> > fixing up the breadcrumb link.
> > 
> > [51679.517943] INFO: trying to register non-static key.
> > [51679.517956] the code is fine but needs lockdep annotation.
> > [51679.517960] turning off the locking correctness validator.
> > [51679.517966] CPU: 0 PID: 3270 Comm: kworker/u8:0 Tainted: G     U         
> >    5.2.0+ #717
> > [51679.517971] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS 
> > BNKBL357.86A.0052.2017.0918.1346 09/18/2017
> > [51679.518012] Workqueue: i915 retire_work_handler [i915]
> > [51679.518017] Call Trace:
> > [51679.518026]  dump_stack+0x67/0x90
> > [51679.518031]  register_lock_class+0x52c/0x540
> > [51679.518038]  ? find_held_lock+0x2d/0x90
> > [51679.518042]  __lock_acquire+0x68/0x1800
> > [51679.518047]  ? find_held_lock+0x2d/0x90
> > [51679.518073]  ? __i915_sw_fence_complete+0xff/0x1c0 [i915]
> > [51679.518079]  lock_acquire+0x90/0x170
> > [51679.518105]  ? i915_request_cancel_breadcrumb+0x29/0x160 [i915]
> > [51679.518112]  _raw_spin_lock+0x27/0x40
> > [51679.518138]  ? i915_request_cancel_breadcrumb+0x29/0x160 [i915]
> > [51679.518165]  i915_request_cancel_breadcrumb+0x29/0x160 [i915]
> > [51679.518199]  i915_request_retire+0x43f/0x530 [i915]
> > [51679.518232]  retire_requests+0x4d/0x60 [i915]
> > [51679.518263]  i915_retire_requests+0xdf/0x1f0 [i915]
> > [51679.518294]  retire_work_handler+0x4c/0x60 [i915]
> > [51679.518301]  process_one_work+0x22c/0x5c0
> > [51679.518307]  worker_thread+0x37/0x390
> > [51679.518311]  ? process_one_work+0x5c0/0x5c0
> > [51679.518316]  kthread+0x116/0x130
> > [51679.518320]  ? kthread_create_on_node+0x40/0x40
> > [51679.518325]  ret_from_fork+0x24/0x30
> > [51679.520177] ------------[ cut here ]------------
> > [51679.520189] list_del corruption, ffff88883675e2f0->next is LIST_POISON1 
> > (dead000000000100)
> > 
> > Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> > Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursu...@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_lrc.c | 13 +++++++++++++
> >   1 file changed, 13 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
> > b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 7570a9256001..2ae6695f342b 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -492,6 +492,19 @@ __unwind_incomplete_requests(struct intel_engine_cs 
> > *engine)
> >                       list_move(&rq->sched.link, pl);
> >                       active = rq;
> >               } else {
> > +                     /*
> > +                      * Decouple the virtual breadcrumb before moving it
> > +                      * back to the virtual engine -- we don't want the
> > +                      * request to complete in the background and try
> > +                      * and cancel the breadcrumb on the virtual engine
> > +                      * (instead of the old engine where it is linked)!
> > +                      */
> > +                     if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
> > +                                  &rq->fence.flags)) {
> > +                             spin_lock(&rq->lock);
> > +                             i915_request_cancel_breadcrumb(rq);
> > +                             spin_unlock(&rq->lock);
> > +                     }
> >                       rq->engine = owner;
> >                       owner->submit_request(rq);
> >                       active = NULL;
> > 
> 
> Got lost in the maze of DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT & 
> I915_FENCE_FLAG_SIGNAL flags but eventually it looked okay. Should get 
> re-added to correct engine's breadcrumb list on submit right?

Yes. It will also survive if the request completes in the background.

It's a bit ugly, but will do -- we could probably hide the detail in a
local virtual_unwind_request(). Hmm...
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to