Re: [Intel-gfx] [PATCH] drm/i915: Update to post-reset execlist queue clean-up

2015-12-14 Thread Mika Kuoppala
Dave Gordon  writes:

> On 01/12/15 11:46, Tvrtko Ursulin wrote:
>>
>> On 23/10/15 18:02, Tomas Elf wrote:
>>> When clearing an execlist queue, instead of traversing it and
>>> unreferencing all
>>> requests while holding the spinlock (which might lead to thread
>>> sleeping with
>>> IRQs are turned off - bad news!), just move all requests to the retire
>>> request
>>> list while holding spinlock and then drop spinlock and invoke the
>>> execlists
>>> request retirement path, which already deals with the intricacies of
>>> purging/dereferencing execlist queue requests.
>>>
>>> This patch can be considered v3 of:
>>>
>>> commit b96db8b81c54ef30485ddb5992d63305d86ea8d3
>>> Author: Tomas Elf 
>>> drm/i915: Grab execlist spinlock to avoid post-reset concurrency
>>> issues
>>>
>>> This patch assumes v2 of the above patch is part of the baseline,
>>> reverts v2
>>> and adds changes on top to turn it into v3.
>>>
>>> Signed-off-by: Tomas Elf 
>>> Cc: Tvrtko Ursulin 
>>> Cc: Chris Wilson 
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem.c | 15 ---
>>>   1 file changed, 4 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>>> b/drivers/gpu/drm/i915/i915_gem.c
>>> index 2c7a0b7..b492603 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -2756,20 +2756,13 @@ static void i915_gem_reset_ring_cleanup(struct
>>> drm_i915_private *dev_priv,
>>>
>>>   if (i915.enable_execlists) {
>>>   spin_lock_irq(>execlist_lock);
>>> -while (!list_empty(>execlist_queue)) {
>>> -struct drm_i915_gem_request *submit_req;
>>>
>>> -submit_req = list_first_entry(>execlist_queue,
>>> -struct drm_i915_gem_request,
>>> -execlist_link);
>>> -list_del(_req->execlist_link);
>>> +/* list_splice_tail_init checks for empty lists */
>>> +list_splice_tail_init(>execlist_queue,
>>> +  >execlist_retired_req_list);
>>>
>>> -if (submit_req->ctx != ring->default_context)
>>> -intel_lr_context_unpin(submit_req);
>>> -
>>> -i915_gem_request_unreference(submit_req);
>>> -}
>>>   spin_unlock_irq(>execlist_lock);
>>> +intel_execlists_retire_requests(ring);
>>>   }
>>>
>>>   /*
>>
>> Fallen through the cracks..
>>
>> This looks to be even more serious, since lockdep notices possible
>> deadlock involving vmap_area_lock:
>>
>>   Possible interrupt unsafe locking scenario:
>>
>> CPU0CPU1
>> 
>>lock(vmap_area_lock);
>> local_irq_disable();
>> lock(&(>execlist_lock)->rlock);
>> lock(vmap_area_lock);
>>
>>  lock(&(>execlist_lock)->rlock);
>>
>>   *** DEADLOCK ***
>>
>> Because it unpins LRC context and ringbuffer which ends up in the VM
>> code under the execlist_lock.
>>
>> intel_execlists_retire_requests is slightly different from the code in
>> the reset handler because it concerns itself with ctx_obj existence
>> which the other one doesn't.
>>
>> Could people more knowledgeable of this code check if it is OK and R-B?
>>
>> Regards,
>>
>> Tvrtko
>
> Hi Tvrtko,
>
> I didn't understand this message at first, I thought you'd found a 
> problem with this ("v3") patch, but now I see what you actually meant is 
> that there is indeed a problem with the (v2) that got merged, not the 
> original question about unreferencing an object while holding a spinlock 
> (because it can't be the last reference), but rather because of the 
> unpin, which can indeed cause a problem with a non-i915-defined kernel lock.
>
> So we should certainly update the current (v2) upstream with this.
> Thomas Daniel already R-B'd this code on 23rd October, when it was:
>
> [PATCH v3 7/8] drm/i915: Grab execlist spinlock to avoid post-reset 
> concurrency issues.
>
> and it hasn't changed in substance since then, so you can carry his R-B 
> over, plus I said on that same day that this was a better solution. So:
>
> Reviewed-by: Thomas Daniel 
> Reviewed-by: Dave Gordon 
>

Bat farm did encounter with this few weeks back,
so it was vaguely registered. But I just failed
with timely review.

Thanks for pushing it forward,
-Mika

> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Update to post-reset execlist queue clean-up

2015-12-11 Thread Dave Gordon

On 01/12/15 11:46, Tvrtko Ursulin wrote:


On 23/10/15 18:02, Tomas Elf wrote:

When clearing an execlist queue, instead of traversing it and
unreferencing all
requests while holding the spinlock (which might lead to thread
sleeping with
IRQs are turned off - bad news!), just move all requests to the retire
request
list while holding spinlock and then drop spinlock and invoke the
execlists
request retirement path, which already deals with the intricacies of
purging/dereferencing execlist queue requests.

This patch can be considered v3 of:

commit b96db8b81c54ef30485ddb5992d63305d86ea8d3
Author: Tomas Elf 
drm/i915: Grab execlist spinlock to avoid post-reset concurrency
issues

This patch assumes v2 of the above patch is part of the baseline,
reverts v2
and adds changes on top to turn it into v3.

Signed-off-by: Tomas Elf 
Cc: Tvrtko Ursulin 
Cc: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_gem.c | 15 ---
  1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c
b/drivers/gpu/drm/i915/i915_gem.c
index 2c7a0b7..b492603 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2756,20 +2756,13 @@ static void i915_gem_reset_ring_cleanup(struct
drm_i915_private *dev_priv,

  if (i915.enable_execlists) {
  spin_lock_irq(>execlist_lock);
-while (!list_empty(>execlist_queue)) {
-struct drm_i915_gem_request *submit_req;

-submit_req = list_first_entry(>execlist_queue,
-struct drm_i915_gem_request,
-execlist_link);
-list_del(_req->execlist_link);
+/* list_splice_tail_init checks for empty lists */
+list_splice_tail_init(>execlist_queue,
+  >execlist_retired_req_list);

-if (submit_req->ctx != ring->default_context)
-intel_lr_context_unpin(submit_req);
-
-i915_gem_request_unreference(submit_req);
-}
  spin_unlock_irq(>execlist_lock);
+intel_execlists_retire_requests(ring);
  }

  /*


Fallen through the cracks..

This looks to be even more serious, since lockdep notices possible
deadlock involving vmap_area_lock:

  Possible interrupt unsafe locking scenario:

CPU0CPU1

   lock(vmap_area_lock);
local_irq_disable();
lock(&(>execlist_lock)->rlock);
lock(vmap_area_lock);
   
 lock(&(>execlist_lock)->rlock);

  *** DEADLOCK ***

Because it unpins LRC context and ringbuffer which ends up in the VM
code under the execlist_lock.

intel_execlists_retire_requests is slightly different from the code in
the reset handler because it concerns itself with ctx_obj existence
which the other one doesn't.

Could people more knowledgeable of this code check if it is OK and R-B?

Regards,

Tvrtko


Hi Tvrtko,

I didn't understand this message at first, I thought you'd found a 
problem with this ("v3") patch, but now I see what you actually meant is 
that there is indeed a problem with the (v2) that got merged, not the 
original question about unreferencing an object while holding a spinlock 
(because it can't be the last reference), but rather because of the 
unpin, which can indeed cause a problem with a non-i915-defined kernel lock.


So we should certainly update the current (v2) upstream with this.
Thomas Daniel already R-B'd this code on 23rd October, when it was:

[PATCH v3 7/8] drm/i915: Grab execlist spinlock to avoid post-reset 
concurrency issues.


and it hasn't changed in substance since then, so you can carry his R-B 
over, plus I said on that same day that this was a better solution. So:


Reviewed-by: Thomas Daniel 
Reviewed-by: Dave Gordon 

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Update to post-reset execlist queue clean-up

2015-12-11 Thread Daniel Vetter
On Fri, Dec 11, 2015 at 02:14:00PM +, Dave Gordon wrote:
> On 01/12/15 11:46, Tvrtko Ursulin wrote:
> >
> >On 23/10/15 18:02, Tomas Elf wrote:
> >>When clearing an execlist queue, instead of traversing it and
> >>unreferencing all
> >>requests while holding the spinlock (which might lead to thread
> >>sleeping with
> >>IRQs are turned off - bad news!), just move all requests to the retire
> >>request
> >>list while holding spinlock and then drop spinlock and invoke the
> >>execlists
> >>request retirement path, which already deals with the intricacies of
> >>purging/dereferencing execlist queue requests.
> >>
> >>This patch can be considered v3 of:
> >>
> >>commit b96db8b81c54ef30485ddb5992d63305d86ea8d3
> >>Author: Tomas Elf 
> >>drm/i915: Grab execlist spinlock to avoid post-reset concurrency
> >>issues
> >>
> >>This patch assumes v2 of the above patch is part of the baseline,
> >>reverts v2
> >>and adds changes on top to turn it into v3.
> >>
> >>Signed-off-by: Tomas Elf 
> >>Cc: Tvrtko Ursulin 
> >>Cc: Chris Wilson 
> >>---
> >>  drivers/gpu/drm/i915/i915_gem.c | 15 ---
> >>  1 file changed, 4 insertions(+), 11 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c
> >>b/drivers/gpu/drm/i915/i915_gem.c
> >>index 2c7a0b7..b492603 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -2756,20 +2756,13 @@ static void i915_gem_reset_ring_cleanup(struct
> >>drm_i915_private *dev_priv,
> >>
> >>  if (i915.enable_execlists) {
> >>  spin_lock_irq(>execlist_lock);
> >>-while (!list_empty(>execlist_queue)) {
> >>-struct drm_i915_gem_request *submit_req;
> >>
> >>-submit_req = list_first_entry(>execlist_queue,
> >>-struct drm_i915_gem_request,
> >>-execlist_link);
> >>-list_del(_req->execlist_link);
> >>+/* list_splice_tail_init checks for empty lists */
> >>+list_splice_tail_init(>execlist_queue,
> >>+  >execlist_retired_req_list);
> >>
> >>-if (submit_req->ctx != ring->default_context)
> >>-intel_lr_context_unpin(submit_req);
> >>-
> >>-i915_gem_request_unreference(submit_req);
> >>-}
> >>  spin_unlock_irq(>execlist_lock);
> >>+intel_execlists_retire_requests(ring);
> >>  }
> >>
> >>  /*
> >
> >Fallen through the cracks..
> >
> >This looks to be even more serious, since lockdep notices possible
> >deadlock involving vmap_area_lock:
> >
> >  Possible interrupt unsafe locking scenario:
> >
> >CPU0CPU1
> >
> >   lock(vmap_area_lock);
> >local_irq_disable();
> >lock(&(>execlist_lock)->rlock);
> >lock(vmap_area_lock);
> >   
> > lock(&(>execlist_lock)->rlock);
> >
> >  *** DEADLOCK ***
> >
> >Because it unpins LRC context and ringbuffer which ends up in the VM
> >code under the execlist_lock.
> >
> >intel_execlists_retire_requests is slightly different from the code in
> >the reset handler because it concerns itself with ctx_obj existence
> >which the other one doesn't.
> >
> >Could people more knowledgeable of this code check if it is OK and R-B?
> >
> >Regards,
> >
> >Tvrtko
> 
> Hi Tvrtko,
> 
> I didn't understand this message at first, I thought you'd found a problem
> with this ("v3") patch, but now I see what you actually meant is that there
> is indeed a problem with the (v2) that got merged, not the original question
> about unreferencing an object while holding a spinlock (because it can't be
> the last reference), but rather because of the unpin, which can indeed cause
> a problem with a non-i915-defined kernel lock.
> 
> So we should certainly update the current (v2) upstream with this.
> Thomas Daniel already R-B'd this code on 23rd October, when it was:
> 
> [PATCH v3 7/8] drm/i915: Grab execlist spinlock to avoid post-reset
> concurrency issues.
> 
> and it hasn't changed in substance since then, so you can carry his R-B
> over, plus I said on that same day that this was a better solution. So:
> 
> Reviewed-by: Thomas Daniel 
> Reviewed-by: Dave Gordon 

Indeed, fell through the cracks more than once :(

Sorry about that, picked up now.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Update to post-reset execlist queue clean-up

2015-12-01 Thread Tvrtko Ursulin


On 23/10/15 18:02, Tomas Elf wrote:

When clearing an execlist queue, instead of traversing it and unreferencing all
requests while holding the spinlock (which might lead to thread sleeping with
IRQs are turned off - bad news!), just move all requests to the retire request
list while holding spinlock and then drop spinlock and invoke the execlists
request retirement path, which already deals with the intricacies of
purging/dereferencing execlist queue requests.

This patch can be considered v3 of:

commit b96db8b81c54ef30485ddb5992d63305d86ea8d3
Author: Tomas Elf 
drm/i915: Grab execlist spinlock to avoid post-reset concurrency issues

This patch assumes v2 of the above patch is part of the baseline, reverts v2
and adds changes on top to turn it into v3.

Signed-off-by: Tomas Elf 
Cc: Tvrtko Ursulin 
Cc: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_gem.c | 15 ---
  1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2c7a0b7..b492603 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2756,20 +2756,13 @@ static void i915_gem_reset_ring_cleanup(struct 
drm_i915_private *dev_priv,

if (i915.enable_execlists) {
spin_lock_irq(>execlist_lock);
-   while (!list_empty(>execlist_queue)) {
-   struct drm_i915_gem_request *submit_req;

-   submit_req = list_first_entry(>execlist_queue,
-   struct drm_i915_gem_request,
-   execlist_link);
-   list_del(_req->execlist_link);
+   /* list_splice_tail_init checks for empty lists */
+   list_splice_tail_init(>execlist_queue,
+ >execlist_retired_req_list);

-   if (submit_req->ctx != ring->default_context)
-   intel_lr_context_unpin(submit_req);
-
-   i915_gem_request_unreference(submit_req);
-   }
spin_unlock_irq(>execlist_lock);
+   intel_execlists_retire_requests(ring);
}

/*



Fallen through the cracks..

This looks to be even more serious, since lockdep notices possible 
deadlock involving vmap_area_lock:


 Possible interrupt unsafe locking scenario:

   CPU0CPU1
   
  lock(vmap_area_lock);
   local_irq_disable();
   lock(&(>execlist_lock)->rlock);
   lock(vmap_area_lock);
  
lock(&(>execlist_lock)->rlock);

 *** DEADLOCK ***

Because it unpins LRC context and ringbuffer which ends up in the VM 
code under the execlist_lock.


intel_execlists_retire_requests is slightly different from the code in 
the reset handler because it concerns itself with ctx_obj existence 
which the other one doesn't.


Could people more knowledgeable of this code check if it is OK and R-B?

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Update to post-reset execlist queue clean-up

2015-10-23 Thread Tomas Elf
When clearing an execlist queue, instead of traversing it and unreferencing all
requests while holding the spinlock (which might lead to thread sleeping with
IRQs are turned off - bad news!), just move all requests to the retire request
list while holding spinlock and then drop spinlock and invoke the execlists
request retirement path, which already deals with the intricacies of
purging/dereferencing execlist queue requests.

This patch can be considered v3 of:

commit b96db8b81c54ef30485ddb5992d63305d86ea8d3
Author: Tomas Elf 
drm/i915: Grab execlist spinlock to avoid post-reset concurrency issues

This patch assumes v2 of the above patch is part of the baseline, reverts v2
and adds changes on top to turn it into v3.

Signed-off-by: Tomas Elf 
Cc: Tvrtko Ursulin 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2c7a0b7..b492603 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2756,20 +2756,13 @@ static void i915_gem_reset_ring_cleanup(struct 
drm_i915_private *dev_priv,
 
if (i915.enable_execlists) {
spin_lock_irq(>execlist_lock);
-   while (!list_empty(>execlist_queue)) {
-   struct drm_i915_gem_request *submit_req;
 
-   submit_req = list_first_entry(>execlist_queue,
-   struct drm_i915_gem_request,
-   execlist_link);
-   list_del(_req->execlist_link);
+   /* list_splice_tail_init checks for empty lists */
+   list_splice_tail_init(>execlist_queue,
+ >execlist_retired_req_list);
 
-   if (submit_req->ctx != ring->default_context)
-   intel_lr_context_unpin(submit_req);
-
-   i915_gem_request_unreference(submit_req);
-   }
spin_unlock_irq(>execlist_lock);
+   intel_execlists_retire_requests(ring);
}
 
/*
-- 
1.9.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx