On 23/01/2017 11:16, Chris Wilson wrote:
On Mon, Jan 23, 2017 at 10:51:28AM +0000, Tvrtko Ursulin wrote:
On 21/01/2017 09:25, Chris Wilson wrote:
In order to maximise concurrency between engines, if we queue a request
to a currently idle ring, reorder its dependencies to execute that request
as early as possible and ideally improve occupancy of multiple engines
simultaneously.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem_request.h | 5 +++--
drivers/gpu/drm/i915/intel_lrc.c | 3 +++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
index ba83c507613b..7ba9cc53abe9 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -74,8 +74,9 @@ struct i915_priotree {
};
enum {
- I915_PRIORITY_LOCKED = I915_PRIORITY_MAX,
- I915_PRIORITY_DISPLAY
+ I915_PRIORITY_STALL = I915_PRIORITY_MAX,
+ I915_PRIORITY_LOCKED,
+ I915_PRIORITY_DISPLAY,
};
struct i915_gem_capture_list {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 50bec759989f..b46cb1bb32b8 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -691,6 +691,9 @@ static void execlists_schedule(struct drm_i915_gem_request *request, int prio)
struct i915_dependency stack;
LIST_HEAD(dfs);
+ if (execlists_elsp_ready(request->engine))
+ prio = max(prio, I915_PRIORITY_STALL);
+
It would have to be execlists_elsp_idle for it to match with the
commit message.
But even then the idea worries me sufficiently that I would refrain
from adding it.
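For reference, the distinction I have in mind, paraphrasing the two helpers in intel_lrc.c (simplified from memory, workaround handling dropped, so not verbatim driver code):

        /*
         * Illustration only, not the actual implementation:
         * idle  == nothing in either ELSP port at all,
         * ready == the second port is free so another submission can go in.
         */
        static bool execlists_elsp_idle(struct intel_engine_cs *engine)
        {
                return !engine->execlist_port[0].request;
        }

        static bool execlists_elsp_ready(struct intel_engine_cs *engine)
        {
                return !engine->execlist_port[1].request;
        }

The commit message talks about an idle ring, but the hunk gates the boost on ready, i.e. merely having a free slot.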
I don't like the name of I915_PRIORITY_STALL either, since I think
about stalls as ELSP transitioning to idle and no runnable requests.
It could be I915_PRIORITY_SUBMIT, but I just can't find a good story
for when this might be a good idea.
That's the case we are trying to address. This engine is stalled (no
requests to run), so we are boosting the priority of its deps on the
other engines so that the stall time is minimised. That's why I chose STALL,
because it relates to the currently stalled engine.
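To spell out the mechanism: the boosted prio just rides the existing dependency walk in execlists_schedule(), which conceptually does something like the below (heavily simplified; the real function uses an iterative DFS over the priotree and takes the engine locks, so treat this as a sketch only):

        /*
         * Conceptual sketch of the priority propagation, not the real code:
         * raise every signaler, on whichever engine it lives, to at least
         * the boosted priority so the stalled engine gets fed sooner.
         */
        static void boost_signalers(struct i915_priotree *pt, int prio)
        {
                struct i915_dependency *dep;

                if (pt->priority >= prio)
                        return;

                pt->priority = prio;
                list_for_each_entry(dep, &pt->signalers_list, signal_link)
                        boost_signalers(dep->signaler, prio);
        }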
Here we don't know that there are no other requests to run - perhaps
they have just been queued up before this one and are not in the ELSP yet
(tasklet pending).
Hm, and the ports are also not protected by struct_mutex but by
engine->timeline->lock, so this looks to be open to a race.
Perhaps if we combined it with there being no other runnable requests on
the engine it might be passable?
We don't have that information yet (at least not trivially available). But it
still runs into the problem that the first request added to the engine's
implicit timeline may not be the first request actually ready to submit,
due to external third parties.
Would it not be the engine->execlist_first?
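I.e. something along these lines, just to illustrate what I had in mind (untested sketch, engine here being request->engine, and assuming engine->timeline->lock is enough to sample both the ports and the queue):

        /*
         * Untested sketch: only apply the boost if the engine really has
         * nothing in the ELSP ports and nothing queued, sampled under the
         * lock the submission tasklet also takes.
         */
        bool stalled;
        unsigned long flags;

        spin_lock_irqsave(&engine->timeline->lock, flags);
        stalled = execlists_elsp_idle(engine) && !engine->execlist_first;
        spin_unlock_irqrestore(&engine->timeline->lock, flags);

        if (stalled)
                prio = max(prio, I915_PRIORITY_STALL);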
Hopefully in the near future, we could write:
if (!request->engine->timeline->active_seqno)
which would only apply the magic boost to the very first rq added to
this engine after a period of idleness. Hmm, that's a bit too weak as we
try to avoid retiring.
Not keen on that. I'd like to push for keeping ready() first, with idle()
as a compromise if need be. The idea is that ready() avoids the
stall on the second submission slot - but that may be too preemptive. I
can see plenty of arguments and counter-arguments.
The key problem is that this heuristic is simply too speculative; ideally
we would have the information to choose the request with the shortest dep
path, or to choose the overall schedule that minimises delays.
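If we did track it, the kind of information I mean would be something like a dependency depth per priotree node (purely hypothetical helper, nothing computes or caches this today):

        /*
         * Hypothetical: length of the longest chain of signalers still ahead
         * of this request. Recomputing it per schedule() call would be
         * O(graph), which is part of why this remains a thought experiment.
         */
        static int priotree_depth(struct i915_priotree *pt)
        {
                struct i915_dependency *dep;
                int depth = 0;

                list_for_each_entry(dep, &pt->signalers_list, signal_link)
                        depth = max(depth, priotree_depth(dep->signaler) + 1);

                return depth;
        }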
After all that, I still regard this as an interesting toy that may prove
useful, or be counterproductive. I also expect it to only be interesting
in the short term.
If in doubt leave it out. :)
Regards,
Tvrtko