Re: [Intel-gfx] [PATCH v2] drm/i915/gem: Move obj->lut_list under its own lock

2020-06-30 Thread Ruhl, Michael J
>-----Original Message-----
>From: Chris Wilson 
>Sent: Tuesday, June 30, 2020 3:29 PM
>To: Ruhl, Michael J ; intel-
>g...@lists.freedesktop.org
>Subject: Re: [Intel-gfx] [PATCH v2] drm/i915/gem: Move obj->lut_list under
>its own lock
>
>Quoting Ruhl, Michael J (2020-06-30 20:11:16)
>> >-----Original Message-----
>> >From: Intel-gfx  On Behalf Of
>Chris
>> >Wilson
>> >Sent: Monday, June 29, 2020 7:36 AM
>> >To: intel-gfx@lists.freedesktop.org
>> >Cc: Chris Wilson 
>> >Subject: [Intel-gfx] [PATCH v2] drm/i915/gem: Move obj->lut_list under its
>> >own lock
>> >
>> >The obj->lut_list is traversed when the object is closed as the file
>> >table is destroyed during process termination. As this occurs before we
>> >kill any outstanding context, if, due to some bug or another, the closure
>> >is blocked, then we fail to shoot down any in-flight operations,
>> >potentially leaving the GPU spinning forever. As we only need to guard
>> >the list against concurrent closures and insertions, the hold is short
>> >and merits being treated as a simple spinlock.
>>
>> The comment:
>>
>> /* Break long locks, and carefully continue on from this spot */
>>
>> seems to be contrary to your "the hold is short" comment.  Is calling out
>> this apparent worst-case scenario (i.e., how it could happen) worth
>> documenting?
>
>It's paranoia: list iteration can be slow, and historically slow
>object closure has been reported as an issue. Only a few objects will be
>shared between multiple contexts, and even then you would only expect a
>couple of contexts to share an object. One of the OglDrvCtx would show up
>here, which convinced us to move to the per-object lists to shrink the
>number of elements walked.
>
>Even the close race igts do not cause the list to become long enough to
>schedule, but it would be possible to create an object that was shared
>by 64k contexts. Just not wise in practice. [However, I should make sure
>an igt does hit the bookmark.]

I look forward to that igt. 😊

>
>> >@@ -104,21 +105,29 @@ void i915_gem_close_object(struct
>> >drm_gem_object *gem, struct drm_file *file)
>> > {
>> >   struct drm_i915_gem_object *obj = to_intel_bo(gem);
>> >   struct drm_i915_file_private *fpriv = file->driver_priv;
>> >+  struct i915_lut_handle bookmark = {};
>> >   struct i915_mmap_offset *mmo, *mn;
>> >   struct i915_lut_handle *lut, *ln;
>> >   LIST_HEAD(close);
>> >
>> >-  i915_gem_object_lock(obj);
>> >+  spin_lock(&obj->lut_lock);
>> >   list_for_each_entry_safe(lut, ln, &obj->lut_list, obj_link) {
>> >   struct i915_gem_context *ctx = lut->ctx;
>> >
>> >-  if (ctx->file_priv != fpriv)
>> >-  continue;
>> >+  if (ctx && ctx->file_priv == fpriv) {
>>
>> Null checking ctx was not done before.  I think this changed with the
>possible cond_resched?
>
>The bookmark introduces the lut->ctx == NULL to identify it as being
>special, hence the need to check.

Ok,

a clean lock replacement with paranoid worst-case handling, just in case.

Reviewed-by: Michael J. Ruhl 

M
>-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
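[Editor's note] The bookmark trick discussed in the exchange above (splice a stack-allocated sentinel whose ctx is NULL in front of the next entry, drop the lock so the CPU can reschedule, then resume from the sentinel) can be sketched in userspace C. This is an analogue, not the kernel code: a hand-rolled doubly linked list replaces list.h, pthread_mutex_t stands in for the spinlock, a plain unlock/relock pair stands in for cond_resched_lock(), and run_demo() is an invented harness.

```c
#include <pthread.h>
#include <stddef.h>

/* Userspace analogue of the bookmark walk in i915_gem_close_object().
 * A node with ctx == NULL is spliced in before the next entry, the
 * lock is dropped, and the walk resumes from the bookmark, which is
 * then removed again. */
struct lut {
	struct lut *prev, *next;
	void *ctx;		/* NULL marks a bookmark, as in the patch */
};

static pthread_mutex_t lut_lock = PTHREAD_MUTEX_INITIALIZER;

static void insert_before(struct lut *pos, struct lut *n)
{
	n->prev = pos->prev;
	n->next = pos;
	pos->prev->next = n;
	pos->prev = n;
}

static void del(struct lut *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Unlink every entry, breaking the lock between entries at the bookmark. */
static int close_all(struct lut *head)
{
	struct lut bookmark = { .ctx = NULL };
	struct lut *lut, *ln;
	int closed = 0;

	pthread_mutex_lock(&lut_lock);
	for (lut = head->next; lut != head; lut = ln) {
		ln = lut->next;

		if (lut->ctx) {		/* skip other walkers' bookmarks */
			del(lut);
			closed++;
		}

		if (ln != head) {
			/* Park the bookmark just before ln ... */
			insert_before(ln, &bookmark);
			/* ... so the lock can be dropped safely ... */
			pthread_mutex_unlock(&lut_lock);
			/* (another closer could edit the list here) */
			pthread_mutex_lock(&lut_lock);
			/* ... and the next entry re-read afterwards. */
			ln = bookmark.next;
			del(&bookmark);
		}
	}
	pthread_mutex_unlock(&lut_lock);
	return closed;
}

/* Invented harness: build a three-entry list and drain it. */
static int run_demo(void)
{
	struct lut head = { &head, &head, NULL };
	struct lut nodes[3];
	int i, n;

	for (i = 0; i < 3; i++) {
		nodes[i].ctx = &nodes[i];	/* any non-NULL "context" */
		insert_before(&head, &nodes[i]);	/* append at tail */
	}
	n = close_all(&head);
	return head.next == &head ? n : -1;	/* -1: list not drained */
}
```

The re-read of `ln` from `bookmark.next` after retaking the lock is the crucial step: the old `ln` may have been unlinked and freed by a concurrent closer while the lock was dropped, which is what `list_safe_reset_next()` guards against in the real patch.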



Re: [Intel-gfx] [PATCH v2] drm/i915/gem: Move obj->lut_list under its own lock

2020-06-30 Thread Ruhl, Michael J
>-----Original Message-----
>From: Intel-gfx  On Behalf Of Chris
>Wilson
>Sent: Monday, June 29, 2020 7:36 AM
>To: intel-gfx@lists.freedesktop.org
>Cc: Chris Wilson 
>Subject: [Intel-gfx] [PATCH v2] drm/i915/gem: Move obj->lut_list under its
>own lock
>
>The obj->lut_list is traversed when the object is closed as the file
>table is destroyed during process termination. As this occurs before we
>kill any outstanding context, if, due to some bug or another, the closure
>is blocked, then we fail to shoot down any in-flight operations,
>potentially leaving the GPU spinning forever. As we only need to guard
>the list against concurrent closures and insertions, the hold is short
>and merits being treated as a simple spinlock.

The comment:

/* Break long locks, and carefully continue on from this spot */

seems to be contrary to your "the hold is short" comment.  Is calling out
this apparent worst-case scenario (i.e., how it could happen) worth
documenting?

>Signed-off-by: Chris Wilson 
>---
> drivers/gpu/drm/i915/gem/i915_gem_context.c   |  6 ++
> .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 ++--
> drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 +--
> .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
> 4 files changed, 20 insertions(+), 12 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>index 5c13809dc3c8..6675447a47b9 100644
>--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>@@ -112,8 +112,7 @@ static void lut_close(struct i915_gem_context *ctx)
>   if (!kref_get_unless_zero(&obj->base.refcount))
>   continue;
>
>-  rcu_read_unlock();
>-  i915_gem_object_lock(obj);
>+  spin_lock(&obj->lut_lock);
>   list_for_each_entry(lut, &obj->lut_list, obj_link) {
>   if (lut->ctx != ctx)
>   continue;
>@@ -124,8 +123,7 @@ static void lut_close(struct i915_gem_context *ctx)
>   list_del(&lut->obj_link);
>   break;
>   }
>-  i915_gem_object_unlock(obj);
>-  rcu_read_lock();
>+  spin_unlock(&obj->lut_lock);
>
>   if (&lut->obj_link != &obj->lut_list) {
>   i915_lut_handle_free(lut);
>diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>index c38ab51e82f0..b4862afaaf28 100644
>--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>@@ -789,14 +789,14 @@ static int __eb_add_lut(struct i915_execbuffer *eb,
>   if (err == 0) { /* And nor has this handle */
>   struct drm_i915_gem_object *obj = vma->obj;
>
>-  i915_gem_object_lock(obj);
>+  spin_lock(&obj->lut_lock);
>   if (idr_find(&eb->file->object_idr, handle) == obj) {
>   list_add(&lut->obj_link, &obj->lut_list);
>   } else {
>   radix_tree_delete(&ctx->handles_vma,
>handle);
>   err = -ENOENT;
>   }
>-  i915_gem_object_unlock(obj);
>+  spin_unlock(&obj->lut_lock);
>   }
>   mutex_unlock(&ctx->mutex);
>   }
>diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>index b6ec5b50d93b..6b69191c5543 100644
>--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>@@ -61,6 +61,7 @@ void i915_gem_object_init(struct drm_i915_gem_object
>*obj,
>   INIT_LIST_HEAD(&obj->mm.link);
>
>   INIT_LIST_HEAD(&obj->lut_list);
>+  spin_lock_init(&obj->lut_lock);
>
>   spin_lock_init(&obj->mmo.lock);
>   obj->mmo.offsets = RB_ROOT;
>@@ -104,21 +105,29 @@ void i915_gem_close_object(struct
>drm_gem_object *gem, struct drm_file *file)
> {
>   struct drm_i915_gem_object *obj = to_intel_bo(gem);
>   struct drm_i915_file_private *fpriv = file->driver_priv;
>+  struct i915_lut_handle bookmark = {};
>   struct i915_mmap_offset *mmo, *mn;
>   struct i915_lut_handle *lut, *ln;
>   LIST_HEAD(close);
>
>-  i915_gem_object_lock(obj);
>+  spin_lock(&obj->lut_lock);
>   list_for_each_entry_safe(lut, ln, &obj->lut

[Intel-gfx] [PATCH v2] drm/i915/gem: Move obj->lut_list under its own lock

2020-06-29 Thread Chris Wilson
The obj->lut_list is traversed when the object is closed as the file
table is destroyed during process termination. As this occurs before we
kill any outstanding context, if, due to some bug or another, the closure
is blocked, then we fail to shoot down any in-flight operations,
potentially leaving the GPU spinning forever. As we only need to guard
the list against concurrent closures and insertions, the hold is short
and merits being treated as a simple spinlock.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  6 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 +--
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 4 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5c13809dc3c8..6675447a47b9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -112,8 +112,7 @@ static void lut_close(struct i915_gem_context *ctx)
if (!kref_get_unless_zero(&obj->base.refcount))
continue;
 
-   rcu_read_unlock();
-   i915_gem_object_lock(obj);
+   spin_lock(&obj->lut_lock);
list_for_each_entry(lut, &obj->lut_list, obj_link) {
if (lut->ctx != ctx)
continue;
@@ -124,8 +123,7 @@ static void lut_close(struct i915_gem_context *ctx)
list_del(&lut->obj_link);
break;
}
-   i915_gem_object_unlock(obj);
-   rcu_read_lock();
+   spin_unlock(&obj->lut_lock);
 
if (&lut->obj_link != &obj->lut_list) {
i915_lut_handle_free(lut);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index c38ab51e82f0..b4862afaaf28 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -789,14 +789,14 @@ static int __eb_add_lut(struct i915_execbuffer *eb,
if (err == 0) { /* And nor has this handle */
struct drm_i915_gem_object *obj = vma->obj;
 
-   i915_gem_object_lock(obj);
+   spin_lock(&obj->lut_lock);
if (idr_find(&eb->file->object_idr, handle) == obj) {
list_add(&lut->obj_link, &obj->lut_list);
} else {
radix_tree_delete(&ctx->handles_vma, handle);
err = -ENOENT;
}
-   i915_gem_object_unlock(obj);
+   spin_unlock(&obj->lut_lock);
}
mutex_unlock(&ctx->mutex);
}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index b6ec5b50d93b..6b69191c5543 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -61,6 +61,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
INIT_LIST_HEAD(&obj->mm.link);
 
INIT_LIST_HEAD(&obj->lut_list);
+   spin_lock_init(&obj->lut_lock);
 
spin_lock_init(&obj->mmo.lock);
obj->mmo.offsets = RB_ROOT;
@@ -104,21 +105,29 @@ void i915_gem_close_object(struct drm_gem_object *gem, 
struct drm_file *file)
 {
struct drm_i915_gem_object *obj = to_intel_bo(gem);
struct drm_i915_file_private *fpriv = file->driver_priv;
+   struct i915_lut_handle bookmark = {};
struct i915_mmap_offset *mmo, *mn;
struct i915_lut_handle *lut, *ln;
LIST_HEAD(close);
 
-   i915_gem_object_lock(obj);
+   spin_lock(&obj->lut_lock);
list_for_each_entry_safe(lut, ln, &obj->lut_list, obj_link) {
struct i915_gem_context *ctx = lut->ctx;
 
-   if (ctx->file_priv != fpriv)
-   continue;
+   if (ctx && ctx->file_priv == fpriv) {
+   i915_gem_context_get(ctx);
+   list_move(&lut->obj_link, &close);
+   }
 
-   i915_gem_context_get(ctx);
-   list_move(&lut->obj_link, &close);
+   /* Break long locks, and carefully continue on from this spot */
+   if (&ln->obj_link != &obj->lut_list) {
+   list_add_tail(&bookmark.obj_link, &ln->obj_link);
+   if (cond_resched_lock(&obj->lut_lock))
+   list_safe_reset_next(&bookmark, ln, obj_link);
+   __list_del_entry(&bookmark.obj_link);
+   }
}
-   i915_gem_object_unlock(obj);
+   spin_unlock(&obj->lut_lock);
 
spin_lock(&obj->mmo.lock);
rbtree_post

[Intel-gfx] [PATCH v2] drm/i915/gem: Move obj->lut_list under its own lock

2020-06-29 Thread Chris Wilson
The obj->lut_list is traversed when the object is closed as the file
table is destroyed during process termination. As this occurs before we
kill any outstanding context, if, due to some bug or another, the closure
is blocked, then we fail to shoot down any in-flight operations,
potentially leaving the GPU spinning forever. As we only need to guard
the list against concurrent closures and insertions, the hold is short
and merits being treated as a simple spinlock.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  6 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 +--
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 4 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5c13809dc3c8..6675447a47b9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -112,8 +112,7 @@ static void lut_close(struct i915_gem_context *ctx)
if (!kref_get_unless_zero(&obj->base.refcount))
continue;
 
-   rcu_read_unlock();
-   i915_gem_object_lock(obj);
+   spin_lock(&obj->lut_lock);
list_for_each_entry(lut, &obj->lut_list, obj_link) {
if (lut->ctx != ctx)
continue;
@@ -124,8 +123,7 @@ static void lut_close(struct i915_gem_context *ctx)
list_del(&lut->obj_link);
break;
}
-   i915_gem_object_unlock(obj);
-   rcu_read_lock();
+   spin_unlock(&obj->lut_lock);
 
if (&lut->obj_link != &obj->lut_list) {
i915_lut_handle_free(lut);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index c38ab51e82f0..b4862afaaf28 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -789,14 +789,14 @@ static int __eb_add_lut(struct i915_execbuffer *eb,
if (err == 0) { /* And nor has this handle */
struct drm_i915_gem_object *obj = vma->obj;
 
-   i915_gem_object_lock(obj);
+   spin_lock(&obj->lut_lock);
if (idr_find(&eb->file->object_idr, handle) == obj) {
list_add(&lut->obj_link, &obj->lut_list);
} else {
radix_tree_delete(&ctx->handles_vma, handle);
err = -ENOENT;
}
-   i915_gem_object_unlock(obj);
+   spin_unlock(&obj->lut_lock);
}
mutex_unlock(&ctx->mutex);
}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index b6ec5b50d93b..3f47fa4784ac 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -61,6 +61,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
INIT_LIST_HEAD(&obj->mm.link);
 
INIT_LIST_HEAD(&obj->lut_list);
+   spin_lock_init(&obj->lut_lock);
 
spin_lock_init(&obj->mmo.lock);
obj->mmo.offsets = RB_ROOT;
@@ -104,21 +105,29 @@ void i915_gem_close_object(struct drm_gem_object *gem, 
struct drm_file *file)
 {
struct drm_i915_gem_object *obj = to_intel_bo(gem);
struct drm_i915_file_private *fpriv = file->driver_priv;
+   struct i915_lut_handle bookmark = {};
struct i915_mmap_offset *mmo, *mn;
struct i915_lut_handle *lut, *ln;
LIST_HEAD(close);
 
-   i915_gem_object_lock(obj);
+   spin_lock(&obj->lut_lock);
list_for_each_entry_safe(lut, ln, &obj->lut_list, obj_link) {
struct i915_gem_context *ctx = lut->ctx;
 
-   if (ctx->file_priv != fpriv)
-   continue;
+   if (ctx && ctx->file_priv == fpriv) {
+   i915_gem_context_get(ctx);
+   list_move(&lut->obj_link, &close);
+   }
 
-   i915_gem_context_get(ctx);
-   list_move(&lut->obj_link, &close);
+   /* Break long locks, and carefully continue on from this spot */
+   if (&ln->obj_link != &obj->lut_list) {
+   list_add(&bookmark.obj_link, &ln->obj_link);
+   if (cond_resched_lock(&obj->lut_lock))
+   list_safe_reset_next(&bookmark, ln, obj_link);
+   __list_del_entry(&bookmark.obj_link);
+   }
}
-   i915_gem_object_unlock(obj);
+   spin_unlock(&obj->lut_lock);
 
spin_lock(&obj->mmo.lock);
rbtree_postorder
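
[Editor's note] The NULL check queried in review has a corollary: every walker of obj->lut_list, not just the closer, must now treat ctx == NULL entries as transient bookmarks belonging to a sleeping walker and skip them (the patched lut_close() does this implicitly via its lut->ctx != ctx test). A minimal hand-rolled C illustration of that walker-side rule, with invented names and a plain singly linked list rather than the kernel list API:

```c
#include <stddef.h>

struct entry {
	struct entry *next;
	void *ctx;	/* NULL means "bookmark", as after this patch */
};

/* Count the entries owned by `ctx`; bookmarks must never match. */
static int count_for_ctx(const struct entry *e, const void *ctx)
{
	int n = 0;

	for (; e; e = e->next)
		if (e->ctx && e->ctx == ctx)	/* skip NULL-ctx bookmarks */
			n++;
	return n;
}

/* Invented harness: a list with a foreign bookmark spliced mid-walk. */
static int demo(void)
{
	int mine, theirs;
	struct entry e3 = { NULL, &theirs };
	struct entry bm = { &e3, NULL };	/* someone else's bookmark */
	struct entry e2 = { &bm, &mine };
	struct entry e1 = { &e2, &mine };

	return count_for_ctx(&e1, &mine);	/* only e1 and e2 count */
}
```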