Re: [Intel-gfx] [PATCH 2/3] drm/i915: Allow sharing the idle-barrier from other kernel requests
Quoting Tvrtko Ursulin (2019-08-01 18:24:50)
>
> On 30/07/2019 12:21, Chris Wilson wrote:
> > @@ -224,8 +283,15 @@ int i915_active_ref(struct i915_active *ref,
> >  		goto out;
> >  	}
> >
> > -	if (!i915_active_request_isset(active))
> > -		atomic_inc(&ref->count);
> > +	if (is_barrier(active)) { /* proto-node used by our idle barrier */
> > +		__active_del_barrier(ref, node_from_active(active));
> > +		RCU_INIT_POINTER(active->request, NULL);
> > +		INIT_LIST_HEAD(&active->link);
>
> So this, when prepare_remote_request calls it on ce->active, will remove
> the idle barrier from the engine->barriers_list and immediately transfer
> it to the rq->active_list?

Yes. So long as the context matches the idle-barrier, i.e. we are
operating on this context from the kernel_context.

> Then when this request is retired we know the context is idle?

So long as the context itself hasn't claimed a new active reference for
its own requests.

> But what if it is the same context? Then it is not idle yet.

Correct, but that would only happen for the kernel_context, which is the
special case where we force idle on parking and which we exclude from the
deferred active-barrier (as it would continually defer instead of
parking).

> > +static inline bool is_idle_barrier(struct active_node *node, u64 idx)
> > +{
> > +	return node->timeline == idx &&
> > +	       !i915_active_request_isset(&node->base);
> > +}
> > +
> > +static struct active_node *idle_barrier(struct i915_active *ref, u64 idx)
> > +{
> > +	struct rb_node *prev, *p;
> > +
> > +	if (RB_EMPTY_ROOT(&ref->tree))
> > +		return NULL;
> > +
> > +	mutex_lock(&ref->mutex);
> > +	GEM_BUG_ON(i915_active_is_idle(ref));
> > +
> > +	/*
> > +	 * Try to reuse any existing barrier nodes already allocated for this
> > +	 * i915_active, due to overlapping active phases there is likely a
>
> For this i915_active or for this idx? It is this i915_active by
> definition, no? And idx also has to match in both loops below.
idx is a timeline, which, although it would normally be unique in the
tree, may have more than one instance due to the active-barrier, because
we may have overlapping i915_active phases of activity. This function
exists to try and reuse any of the older phases that we could not reap
as we never completely idled. (Before, we would accidentally keep
growing the tree with new nodes until we hit a point where we retired
the idle timeline after parking.)

> > +	 * node kept alive (as we reuse before parking). We prefer to reuse
> > +	 * completely idle barriers (less hassle in manipulating the llists),
> > +	 * but otherwise any will do.
> > +	 */
> > +	if (ref->cache && is_idle_barrier(ref->cache, idx)) {
> > +		p = &ref->cache->node;
> > +		goto match;
> > +	}
> > +
> > +	prev = NULL;
> > +	p = ref->tree.rb_node;
> > +	while (p) {
> > +		struct active_node *node =
> > +			rb_entry(p, struct active_node, node);
> > +
> > +		if (is_idle_barrier(node, idx))
> > +			goto match;
> > +
> > +		prev = p;
> > +		if (node->timeline < idx)
> > +			p = p->rb_right;
> > +		else
> > +			p = p->rb_left;
> > +	}
> > +
> > +	for (p = prev; p; p = rb_next(p)) {
> > +		struct active_node *node =
> > +			rb_entry(p, struct active_node, node);
> > +
> > +		if (node->timeline > idx)
> > +			break;
> > +
> > +		if (node->timeline < idx)
> > +			continue;
> > +
> > +		if (!i915_active_request_isset(&node->base))
> > +			goto match;
>
> Isn't this idle barrier, so same as above? How can above not find it,
> and find it here?

We didn't check all nodes, only followed the binary tree down one branch
to find the starting point. I anticipate we may end up with more than
one idle-barrier in the tree.

> > +		if (is_barrier(&node->base) && __active_del_barrier(ref, node))
> > +			goto match;
>
> And this is yet an unused barrier with the right idx. Under what
> circumstances can __active_del_barrier fail then? I think a comment is
> needed.

__active_del_barrier() here can theoretically race with
i915_request_add_active_barriers().
In the other callsite, we must be holding the kernel_context timeline
mutex and so cannot be concurrently calling add_active_barriers().

[Here is the weakness of using __active_del_barrier(): in order to reuse
an activated idle-barrier, we may cause add_active_barriers() to miss
the others and so postpone the retirement of other contexts. Under the
current scheme of parking, that is not an issue. In the future, it means
retirement may miss a heartbeat.]

> > +	if (!i915_active_request_isset(&node->base)) {
> > +
Re: [Intel-gfx] [PATCH 2/3] drm/i915: Allow sharing the idle-barrier from other kernel requests
On 30/07/2019 12:21, Chris Wilson wrote:

By placing our idle-barriers in the i915_active fence tree, we expose
those for reuse by other components that are issuing requests along the
kernel_context. Reusing the proto-barrier active_node is perfectly fine
as the new request implies a context-switch, and so an opportune point
to run the idle-barrier. However, the proto-barrier is not equivalent
to a normal active_node and care must be taken to avoid dereferencing
the ERR_PTR used as its request marker.

Reported-by: Lionel Landwerlin
Fixes: ce476c80b8bf ("drm/i915: Keep contexts pinned until after the next kernel context switch")
Fixes: a9877da2d629 ("drm/i915/oa: Reconfigure contexts on the fly")
Signed-off-by: Chris Wilson
Cc: Lionel Landwerlin
Cc: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/gt/intel_context.c       |  40 ++-
 drivers/gpu/drm/i915/gt/intel_context.h       |  13 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   2 +-
 drivers/gpu/drm/i915/gt/selftest_context.c    | 310 ++
 drivers/gpu/drm/i915/i915_active.c            | 246 +++---
 drivers/gpu/drm/i915/i915_active.h            |   2 +-
 drivers/gpu/drm/i915/i915_active_types.h      |   2 +-
 .../drm/i915/selftests/i915_live_selftests.h  |   3 +-
 8 files changed, 555 insertions(+), 63 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/selftest_context.c

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index d64b45f7ec6d..211ac6568a5d 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -162,23 +162,41 @@ static int __intel_context_active(struct i915_active *active)
 	if (err)
 		goto err_ring;
 
+	return 0;
+
+err_ring:
+	intel_ring_unpin(ce->ring);
+err_put:
+	intel_context_put(ce);
+	return err;
+}
+
+int intel_context_active_acquire(struct intel_context *ce)
+{
+	int err;
+
+	err = i915_active_acquire(&ce->active);
+	if (err)
+		return err;
+
 	/* Preallocate tracking nodes */
 	if (!i915_gem_context_is_kernel(ce->gem_context)) {
 		err = i915_active_acquire_preallocate_barrier(&ce->active,
 							      ce->engine);
-		if (err)
-			goto err_state;
+		if (err) {
+			i915_active_release(&ce->active);
+			return err;
+		}
 	}
 
 	return 0;
+}
 
-err_state:
-	__context_unpin_state(ce->state);
-err_ring:
-	intel_ring_unpin(ce->ring);
-err_put:
-	intel_context_put(ce);
-	return err;
+void intel_context_active_release(struct intel_context *ce)
+{
+	/* Nodes preallocated in intel_context_active() */
+	i915_active_acquire_barrier(&ce->active);
+	i915_active_release(&ce->active);
 }
 
 void
@@ -297,3 +315,7 @@ struct i915_request *intel_context_create_request(struct intel_context *ce)
 
 	return rq;
 }
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftest_context.c"
+#endif
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 23c7e4c0ce7c..07f9924de48f 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -104,17 +104,8 @@ static inline void intel_context_exit(struct intel_context *ce)
 		ce->ops->exit(ce);
 }
 
-static inline int intel_context_active_acquire(struct intel_context *ce)
-{
-	return i915_active_acquire(&ce->active);
-}
-
-static inline void intel_context_active_release(struct intel_context *ce)
-{
-	/* Nodes preallocated in intel_context_active() */
-	i915_active_acquire_barrier(&ce->active);
-	i915_active_release(&ce->active);
-}
+int intel_context_active_acquire(struct intel_context *ce);
+void intel_context_active_release(struct intel_context *ce);
 
 static inline struct intel_context *intel_context_get(struct intel_context *ce)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index e74fbf04a68d..ce54092475da 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -90,7 +90,7 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	/* Check again on the next retirement. */
 	engine->wakeref_serial = engine->serial + 1;
 
-	i915_request_add_barriers(rq);
+	i915_request_add_active_barriers(rq);
 	__i915_request_commit(rq);
 
 	return false;
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
new file mode 100644
index ..d39b5594cb02
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -0,0 +1,310 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_selftest.h"
+#include "intel_gt.h"
+