Re: [Intel-gfx] [PATCH 2/3] drm/i915: Allow sharing the idle-barrier from other kernel requests

2019-08-01 Thread Chris Wilson
Quoting Tvrtko Ursulin (2019-08-01 18:24:50)
> 
> On 30/07/2019 12:21, Chris Wilson wrote:
> > @@ -224,8 +283,15 @@ int i915_active_ref(struct i915_active *ref,
> >   goto out;
> >   }
> >   
> > - if (!i915_active_request_isset(active))
> > - atomic_inc(&ref->count);
> > + if (is_barrier(active)) { /* proto-node used by our idle barrier */
> > + __active_del_barrier(ref, node_from_active(active));
> > + RCU_INIT_POINTER(active->request, NULL);
> > + INIT_LIST_HEAD(&active->link);
> 
> So this, when prepare_remote_request calls it on ce->active, will remove 
> the idle barrier from the engine->barriers_list and immediately transfer 
> it to the rq->active_list?

Yes. So long as the context matches the idle-barrier, i.e. we are
operating on this context from the kernel_context.

> Then when this request is retired we know the 
> context is idle?

So long as the context itself hasn't claimed a new active reference for
its own requests.
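
Roughly, and only as an approximation of the i915_active.c internals
(node_retire_sketch() below is a hypothetical stand-in, field and
function names are approximate), the lifecycle the transferred barrier
node takes part in looks like:

	/*
	 * Sketch only: each tracked request holds one reference on the
	 * i915_active; the context goes idle when the last reference is
	 * dropped at retirement.
	 */
	static void node_retire_sketch(struct i915_active_request *base,
				       struct i915_request *rq)
	{
		struct i915_active *ref = node_from_active(base)->ref;

		if (atomic_dec_and_test(&ref->count))
			ref->retire(ref);	/* e.g. unpin the context */
	}

So if the context has since taken a new active reference (the
atomic_inc(&ref->count) in the hunk above), retiring the transferred
idle-barrier does not bring the count to zero and the context correctly
remains busy.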

> But what if it is the same context? Then it is not idle yet.

Correct, but that only happens for the kernel_context, which is the
special case where we force idle on parking and exclude it from the
deferred active-barrier (as it would otherwise continually defer instead
of parking).
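
Condensed from the intel_context_active_acquire() hunk in this patch (a
sketch, not a verbatim copy), that exclusion is simply:

	int intel_context_active_acquire(struct intel_context *ce)
	{
		int err;

		err = i915_active_acquire(&ce->active);
		if (err)
			return err;

		/*
		 * Only non-kernel contexts preallocate the engine barrier
		 * nodes; the kernel_context never arms a deferred
		 * idle-barrier for itself and so can park directly.
		 */
		if (!i915_gem_context_is_kernel(ce->gem_context)) {
			err = i915_active_acquire_preallocate_barrier(&ce->active,
								      ce->engine);
			if (err)
				i915_active_release(&ce->active);
		}

		return err;
	}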

> > +static inline bool is_idle_barrier(struct active_node *node, u64 idx)
> > +{
> > + return node->timeline == idx && !i915_active_request_isset(&node->base);
> > +}
> > +
> > +static struct active_node *idle_barrier(struct i915_active *ref, u64 idx)
> > +{
> > + struct rb_node *prev, *p;
> > +
> > + if (RB_EMPTY_ROOT(&ref->tree))
> > + return NULL;
> > +
> > + mutex_lock(&ref->mutex);
> > + GEM_BUG_ON(i915_active_is_idle(ref));
> > +
> > + /*
> > +  * Try to reuse any existing barrier nodes already allocated for this
> > +  * i915_active, due to overlapping active phases there is likely a
> 
> For this i915_active or for this idx? It is this i915_active by 
> definition, no? And idx also has to match in both loops below.

idx is a timeline, which would normally be unique in the tree; however,
due to the active-barrier we may have more than one instance, because we
may have overlapping i915_active phases of activity.

This function exists to try to reuse any of the older phases that we
could not reap as we never completely idled. (Before, we would
accidentally keep growing the tree with new nodes until we hit a point
where we retired the idle timeline after parking.)

> > +  * node kept alive (as we reuse before parking). We prefer to reuse
> > +  * completely idle barriers (less hassle in manipulating the llists),
> > +  * but otherwise any will do.
> > +  */
> > + if (ref->cache && is_idle_barrier(ref->cache, idx)) {
> > + p = &ref->cache->node;
> > + goto match;
> > + }
> > +
> > + prev = NULL;
> > + p = ref->tree.rb_node;
> > + while (p) {
> > + struct active_node *node =
> > + rb_entry(p, struct active_node, node);
> > +
> > + if (is_idle_barrier(node, idx))
> > + goto match;
> > +
> > + prev = p;
> > + if (node->timeline < idx)
> > + p = p->rb_right;
> > + else
> > + p = p->rb_left;
> > + }
> > +
> > + for (p = prev; p; p = rb_next(p)) {
> > + struct active_node *node =
> > + rb_entry(p, struct active_node, node);
> > +
> > + if (node->timeline > idx)
> > + break;
> > +
> > + if (node->timeline < idx)
> > + continue;
> > +
> > + if (!i915_active_request_isset(&node->base))
> > + goto match;
> 
> Isn't this an idle barrier, so the same as above? How can the search 
> above not find it, and yet we find it here?

We didn't check all nodes, only followed the binary tree down one
branch to find the starting point. I anticipate we may end up with more
than one idle-barrier in the tree.
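
A minimal sketch of the pattern (usable() here is a stand-in for the
is_idle_barrier()/__active_del_barrier() checks, so only an approximation
of the real lookup):

	static struct active_node *lookup_sketch(struct i915_active *ref, u64 idx)
	{
		struct rb_node *prev = NULL, *p = ref->tree.rb_node;

		while (p) {	/* single descent: may get lucky on the way down */
			struct active_node *it =
				rb_entry(p, struct active_node, node);

			if (it->timeline == idx && usable(it))
				return it;

			prev = p;
			p = it->timeline < idx ? p->rb_right : p->rb_left;
		}

		/* otherwise walk in order from where the descent stopped */
		for (p = prev; p; p = rb_next(p)) {
			struct active_node *it =
				rb_entry(p, struct active_node, node);

			if (it->timeline > idx)
				break;
			if (it->timeline == idx && usable(it))
				return it;
		}

		return NULL;
	}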

> > +
> > + if (is_barrier(&node->base) && __active_del_barrier(ref, node))
> > + goto match;
> 
> And this is a yet-unused barrier with the right idx. Under what 
> circumstances can __active_del_barrier() fail then? I think a comment is 
> needed.

__active_del_barrier() here can theoretically race with
i915_request_add_active_barriers(). In the other callsite, we must be
holding the kernel_context timeline mutex and so cannot be concurrently
calling add_active_barriers(). [Here is the weakness of using
__active_del_barrier(): in order to reuse an activated idle-barrier, we
may cause add_active_barriers() to miss the others and so postpone the
retirement of other contexts. Under the current scheme of parking, that
is not an issue. In the future, it means retirement may miss a
heartbeat.]
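
Which is why this caller only claims the node when the unlink succeeds;
restating the hunk above with comments (sketch, same logic):

	if (is_barrier(&node->base) &&		/* proto-barrier node? */
	    __active_del_barrier(ref, node))	/* won the race: it is ours */
		goto match;
	/* else add_active_barriers() got there first; leave it, keep looking */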

> > + if (!i915_active_request_isset(&node->base)) {
> > + 

[Intel-gfx] [PATCH 2/3] drm/i915: Allow sharing the idle-barrier from other kernel requests

2019-07-30 Thread Chris Wilson
By placing our idle-barriers in the i915_active fence tree, we expose
those for reuse by other components that are issuing requests along the
kernel_context. Reusing the proto-barrier active_node is perfectly fine
as the new request implies a context-switch, and so an opportune point
to run the idle-barrier. However, the proto-barrier is not equivalent
to a normal active_node and care must be taken to avoid dereferencing the
ERR_PTR used as its request marker.

Reported-by: Lionel Landwerlin 
Fixes: ce476c80b8bf ("drm/i915: Keep contexts pinned until after the next 
kernel context switch")
Fixes: a9877da2d629 ("drm/i915/oa: Reconfigure contexts on the fly")
Signed-off-by: Chris Wilson 
Cc: Lionel Landwerlin 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_context.c   |  40 ++-
 drivers/gpu/drm/i915/gt/intel_context.h   |  13 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |   2 +-
 drivers/gpu/drm/i915/gt/selftest_context.c| 310 ++
 drivers/gpu/drm/i915/i915_active.c| 246 +++---
 drivers/gpu/drm/i915/i915_active.h|   2 +-
 drivers/gpu/drm/i915/i915_active_types.h  |   2 +-
 .../drm/i915/selftests/i915_live_selftests.h  |   3 +-
 8 files changed, 555 insertions(+), 63 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/selftest_context.c

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index d64b45f7ec6d..211ac6568a5d 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -162,23 +162,41 @@ static int __intel_context_active(struct i915_active 
*active)
if (err)
goto err_ring;
 
+   return 0;
+
+err_ring:
+   intel_ring_unpin(ce->ring);
+err_put:
+   intel_context_put(ce);
+   return err;
+}
+
+int intel_context_active_acquire(struct intel_context *ce)
+{
+   int err;
+
+   err = i915_active_acquire(&ce->active);
+   if (err)
+   return err;
+
/* Preallocate tracking nodes */
if (!i915_gem_context_is_kernel(ce->gem_context)) {
err = i915_active_acquire_preallocate_barrier(&ce->active,
  ce->engine);
-   if (err)
-   goto err_state;
+   if (err) {
+   i915_active_release(&ce->active);
+   return err;
+   }
}
 
return 0;
+}
 
-err_state:
-   __context_unpin_state(ce->state);
-err_ring:
-   intel_ring_unpin(ce->ring);
-err_put:
-   intel_context_put(ce);
-   return err;
+void intel_context_active_release(struct intel_context *ce)
+{
+   /* Nodes preallocated in intel_context_active() */
+   i915_active_acquire_barrier(&ce->active);
+   i915_active_release(&ce->active);
 }
 
 void
@@ -297,3 +315,7 @@ struct i915_request *intel_context_create_request(struct 
intel_context *ce)
 
return rq;
 }
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftest_context.c"
+#endif
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index 23c7e4c0ce7c..07f9924de48f 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -104,17 +104,8 @@ static inline void intel_context_exit(struct intel_context 
*ce)
ce->ops->exit(ce);
 }
 
-static inline int intel_context_active_acquire(struct intel_context *ce)
-{
-   return i915_active_acquire(&ce->active);
-}
-
-static inline void intel_context_active_release(struct intel_context *ce)
-{
-   /* Nodes preallocated in intel_context_active() */
-   i915_active_acquire_barrier(&ce->active);
-   i915_active_release(&ce->active);
-}
+int intel_context_active_acquire(struct intel_context *ce);
+void intel_context_active_release(struct intel_context *ce);
 
 static inline struct intel_context *intel_context_get(struct intel_context *ce)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index e74fbf04a68d..ce54092475da 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -90,7 +90,7 @@ static bool switch_to_kernel_context(struct intel_engine_cs 
*engine)
/* Check again on the next retirement. */
engine->wakeref_serial = engine->serial + 1;
 
-   i915_request_add_barriers(rq);
+   i915_request_add_active_barriers(rq);
__i915_request_commit(rq);
 
return false;
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
b/drivers/gpu/drm/i915/gt/selftest_context.c
new file mode 100644
index ..d39b5594cb02
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -0,0 +1,310 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_selftest.h"
+#include "intel_gt.h"
+
+#include "gem/selftests/mock_context.h"
+#include