This is still a work in progress, but now that I'm getting useful results from some simple igt tools and from integration into Mesa, others may be interested in this proof-of-concept perf pmu driver, which forwards Haswell Observation Architecture counters to userspace.
This has some similarities to the pmu driver Chris wrote back in September last year, except that in this case the driver configures a unit of the gpu to write periodic counter snapshots to a circular buffer, and those snapshots then get forwarded to userspace as perf samples, instead of using hrtimer based sampling. If there is general agreement on leveraging perf pmus for gpu metrics, I think we'll likely also want some separate hrtimer based pmus later.

As well as this RFC, I also recently sent an RFC to the LKML since I wanted to see if the core perf maintainers would be happy with us exposing device metrics via perf pmus (currently perf is very cpu centric). Initial feedback seems positive and receptive to the idea, which is reassuring:

https://lkml.org/lkml/2014/10/22/462

When I started looking at this, it wasn't entirely clear that perf would be a good fit for the tooling and Mesa use cases we have. In particular there was a big question mark over how the permission model would work, considering that GL apps shouldn't need to run as root to use INTEL_performance_query. Having got something working, I now feel much more confident that we can build on the perf infrastructure here.

For the permission issue, the current driver depends on a small change to kernel/events/core.c so it can flag that its pmu relates to a device, as opposed to the cpu, which lets the driver handle its own authentication. In this case a drm file descriptor and context id can be passed as part of the config when opening an event, so it's only possible to profile a specific context that you have a corresponding open drm file descriptor for. You have to be running as root to profile across all contexts.
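To make the permission model above more concrete, here is a rough sketch of how userspace might fill in a perf_event_attr that carries a drm file descriptor and context id in its config. The bit layout, the helper names pack_oa_config() and init_oa_attr(), and the choice of PERF_SAMPLE_RAW are all assumptions made for illustration; the real layout is whatever the patches define in include/uapi/drm/i915_drm.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * HYPOTHETICAL config layout, for illustration only:
 *   bits  0..31  drm file descriptor
 *   bits 32..63  context id
 * The actual driver may encode these differently.
 */
uint64_t pack_oa_config(int drm_fd, uint32_t ctx_id)
{
    return ((uint64_t)ctx_id << 32) | (uint32_t)drm_fd;
}

/*
 * Fill in an attr for a device pmu event tied to one context.
 * pmu_type is the dynamic pmu type id, which perf exposes under
 * /sys/bus/event_source/devices/<pmu name>/type.
 */
void init_oa_attr(struct perf_event_attr *attr, uint32_t pmu_type,
                  int drm_fd, uint32_t ctx_id)
{
    assert(attr != NULL);

    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = pmu_type;
    attr->config = pack_oa_config(drm_fd, ctx_id);

    /* Assumption: counter snapshots arrive as raw sample payloads. */
    attr->sample_type = PERF_SAMPLE_RAW;
}
```

The resulting attr would then be handed to the perf_event_open syscall. The idea is that the driver can look up the drm file descriptor and context id from the config, and refuse the open unless the caller really holds that context or is root.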
For reference the PRM for these counters can be found here:

https://01.org/linuxgraphics/sites/default/files/documentation/observability_performance_counters_haswell.pdf

To test the driver out I have two igt tools: intel_oacounter_top_pmu, which is something like intel_gpu_top but based on our observability counters, and intel_gpu_trace_pmu, which gives me a way to log all of the pmu sample data in detail in a json format that can be visualised using the chrome://tracing ui. These tools can be found here:

https://github.com/rib/intel-gpu-tools/tree/wip/rib/intel-i915-oa-pmu

I've also been experimenting with using the driver from Mesa, and that can be seen here:

https://github.com/rib/mesa/tree/wip/rib/i915_oa_perf

Currently the Mesa driver is limited to only using the i915_oa driver as a way to configure the OA unit and still needs updating to also collect periodic counter snapshots for detecting counter wrapping.

For anyone interested in an introduction to how the lower level perf interface works this may be a useful reference:

http://web.eece.maine.edu/~vweaver/projects/perf_events/perf_event_open.html

For convenience these patches can also be pulled from here:

https://github.com/rib/linux/tree/wip/rib/i915_oa_perf

Regards,
- Robert

Robert Bragg (4):
  perf: export perf_event_overflow
  perf: Add PERF_PMU_CAP_IS_DEVICE flag
  drm/i915: add api to pin/unpin context state
  drm/i915: Expose PMU for Observation Architecture

 drivers/gpu/drm/i915/Makefile           |   1 +
 drivers/gpu/drm/i915/i915_dma.c         |   2 +
 drivers/gpu/drm/i915/i915_drv.h         |  37 ++
 drivers/gpu/drm/i915/i915_gem_context.c |  30 +-
 drivers/gpu/drm/i915/i915_oa_perf.c     | 649 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |  68 ++++
 include/linux/perf_event.h              |   1 +
 include/uapi/drm/i915_drm.h             |  21 ++
 kernel/events/core.c                    |  40 +-
 9 files changed, 834 insertions(+), 15 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_oa_perf.c

-- 
2.1.3