From: Tvrtko Ursulin <tvrtko.ursu...@intel.com>

Per-engine queue depths are an interesting metric for analyzing system load, and are also useful to users who wish to load balance their submissions based on them.
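
As an illustration of the load-balancing use case, here is a minimal userspace sketch, assuming the caller has already obtained one average queue depth per engine. The struct, the function name and the engine names in the usage note are hypothetical and are not part of this series or of the i915 uAPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-engine sample; not an i915 uAPI structure. */
struct engine_load {
	const char *name;  /* engine name, for example "rcs0" */
	double avg_depth;  /* average queue depth over the last period */
};

/*
 * Return the index of the engine with the smallest average queue depth,
 * or -1 if the array is empty. Ties go to the lowest index.
 */
static int pick_least_loaded(const struct engine_load *engines, size_t count)
{
	size_t i, best = 0;

	/* A NULL array is only acceptable when it is also empty. */
	assert(engines != NULL || count == 0);

	if (count == 0)
		return -1;

	for (i = 1; i < count; i++) {
		if (engines[i].avg_depth < engines[best].avg_depth)
			best = i;
	}

	return (int)best;
}
```

A caller could fill an array with made-up samples such as { "vcs0", 3.5 } and { "vcs1", 0.25 } and submit to the engine at the returned index. How the depths are weighted or combined is a policy decision left to the user.
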
In this version I have split the metrics into three separate counters:

 1. QUEUED - From execbuf time until the request becomes runnable, that is,
    until its dependencies have been resolved and fences signaled.
 2. RUNNABLE - From runnable to running on the GPU.
 3. RUNNING - Running on the GPU.

When inspected with perf stat the output looks roughly like this:

#           time             counts unit events
   201.160490145               0.01      i915/rcs0-queued/
   201.160490145              19.13      i915/rcs0-runnable/
   201.160490145               2.39      i915/rcs0-running/

The reported numbers are average queue depths for the last query period.

v2:
 * Review feedback (see patch changelogs).
 * Renamed the counters and re-ordered some patches.

v3:
 * Review feedback and rebase.

v4:
 * Addition of last patch in the series, which supports a customer requirement
   to expose instantaneous queue values via the i915 query API.

Tvrtko Ursulin (7):
  drm/i915/pmu: Fix enable count array size and bounds checking
  drm/i915: Keep a count of requests waiting for a slot on GPU
  drm/i915: Keep a count of requests submitted from userspace
  drm/i915/pmu: Add queued counter
  drm/i915/pmu: Add runnable counter
  drm/i915/pmu: Add running counter
  drm/i915: Engine queues query

 drivers/gpu/drm/i915/i915_pmu.c         | 81 +++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_query.c       | 43 +++++++++++++++++
 drivers/gpu/drm/i915/i915_request.c     | 10 ++++
 drivers/gpu/drm/i915/intel_engine_cs.c  |  6 ++-
 drivers/gpu/drm/i915/intel_lrc.c        |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h | 21 ++++++++-
 include/uapi/drm/i915_drm.h             | 45 +++++++++++++++++-
 7 files changed, 194 insertions(+), 13 deletions(-)

-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx