Re: [Intel-gfx] [PATCH 4/5] drm/i915/execlists: Cancel breadcrumb on preempting the virtual engine

2019-07-19 Thread Chris Wilson
Quoting Tvrtko Ursulin (2019-07-17 14:40:24)
> 
> On 16/07/2019 13:49, Chris Wilson wrote:
> > [snip: patch description, lockdep splat and diff -- quoted in full in the original posting below]
> 
> Got lost in the maze of DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT &
> I915_FENCE_FLAG_SIGNAL flags, but eventually it looked okay. It should get
> re-added to the correct engine's breadcrumb list on submit, right?

Yes. It will also survive if the request completes in the background.

It's a bit ugly, but will do -- we could probably hide the detail in a
local virtual_unwind_request(). Hmm...
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 4/5] drm/i915/execlists: Cancel breadcrumb on preempting the virtual engine

2019-07-17 Thread Tvrtko Ursulin


On 16/07/2019 13:49, Chris Wilson wrote:

[snip: patch description, lockdep splat and diff -- see the original posting below]


Got lost in the maze of DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT &
I915_FENCE_FLAG_SIGNAL flags, but eventually it looked okay. It should get
re-added to the correct engine's breadcrumb list on submit, right?


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko

[Intel-gfx] [PATCH 4/5] drm/i915/execlists: Cancel breadcrumb on preempting the virtual engine

2019-07-16 Thread Chris Wilson
As we unwind the requests for a preemption event, we return a virtual
request back to its original virtual engine (so that it is available for
execution on any of its siblings). In the process, this means that its
breadcrumb should no longer be associated with the original physical
engine, and so we are forced to decouple it. Previously, as the request
could not complete without our awareness, we would move it to the next
real engine without any danger. However, preempt-to-busy allowed for
requests to continue on the HW and complete in the background as we
unwound, which meant that we could end up retiring the request before
fixing up the breadcrumb link.

[51679.517943] INFO: trying to register non-static key.
[51679.517956] the code is fine but needs lockdep annotation.
[51679.517960] turning off the locking correctness validator.
[51679.517966] CPU: 0 PID: 3270 Comm: kworker/u8:0 Tainted: G U 5.2.0+ #717
[51679.517971] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[51679.518012] Workqueue: i915 retire_work_handler [i915]
[51679.518017] Call Trace:
[51679.518026]  dump_stack+0x67/0x90
[51679.518031]  register_lock_class+0x52c/0x540
[51679.518038]  ? find_held_lock+0x2d/0x90
[51679.518042]  __lock_acquire+0x68/0x1800
[51679.518047]  ? find_held_lock+0x2d/0x90
[51679.518073]  ? __i915_sw_fence_complete+0xff/0x1c0 [i915]
[51679.518079]  lock_acquire+0x90/0x170
[51679.518105]  ? i915_request_cancel_breadcrumb+0x29/0x160 [i915]
[51679.518112]  _raw_spin_lock+0x27/0x40
[51679.518138]  ? i915_request_cancel_breadcrumb+0x29/0x160 [i915]
[51679.518165]  i915_request_cancel_breadcrumb+0x29/0x160 [i915]
[51679.518199]  i915_request_retire+0x43f/0x530 [i915]
[51679.518232]  retire_requests+0x4d/0x60 [i915]
[51679.518263]  i915_retire_requests+0xdf/0x1f0 [i915]
[51679.518294]  retire_work_handler+0x4c/0x60 [i915]
[51679.518301]  process_one_work+0x22c/0x5c0
[51679.518307]  worker_thread+0x37/0x390
[51679.518311]  ? process_one_work+0x5c0/0x5c0
[51679.518316]  kthread+0x116/0x130
[51679.518320]  ? kthread_create_on_node+0x40/0x40
[51679.518325]  ret_from_fork+0x24/0x30
[51679.520177] ------------[ cut here ]------------
[51679.520189] list_del corruption, 3675e2f0->next is LIST_POISON1 (dead000000000100)

Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 7570a9256001..2ae6695f342b 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -492,6 +492,19 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 			list_move(&rq->sched.link, pl);
 			active = rq;
 		} else {
+			/*
+			 * Decouple the virtual breadcrumb before moving it
+			 * back to the virtual engine -- we don't want the
+			 * request to complete in the background and try
+			 * and cancel the breadcrumb on the virtual engine
+			 * (instead of the old engine where it is linked)!
+			 */
+			if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+				     &rq->fence.flags)) {
+				spin_lock(&rq->lock);
+				i915_request_cancel_breadcrumb(rq);
+				spin_unlock(&rq->lock);
+			}
 			rq->engine = owner;
 			owner->submit_request(rq);
 			active = NULL;
-- 
2.22.0
