Re: [Intel-gfx] [PATCH v5 01/11] drm/i915: Add i915 perf infrastructure

2016-09-14 Thread Robert Bragg
On Wed, Sep 14, 2016 at 3:42 PM, Emil Velikov 
wrote:

> Hi Robert,
>
> I think I've spotted one interesting, yet trivial bit.
>
> On 14 September 2016 at 15:19, Robert Bragg  wrote:
> > Adds base i915 perf infrastructure for Gen performance metrics.
> >
> > This adds a DRM_IOCTL_I915_PERF_OPEN ioctl that takes an array of uint64
> > properties to configure a stream of metrics and returns a new fd usable
> > with standard VFS system calls including read() to read typed and sized
> > records; ioctl() to enable or disable capture and poll() to wait for
> > data.
> >
> > A stream is opened something like:
> >
> >   uint64_t properties[] = {
> >   /* Single context sampling */
> >   DRM_I915_PERF_PROP_CTX_HANDLE,ctx_handle,
> >
> >   /* Include OA reports in samples */
> >   DRM_I915_PERF_PROP_SAMPLE_OA, true,
> >
> >   /* OA unit configuration */
> >   DRM_I915_PERF_PROP_OA_METRICS_SET,metrics_set_id,
> >   DRM_I915_PERF_PROP_OA_FORMAT, report_format,
> >   DRM_I915_PERF_PROP_OA_EXPONENT,   period_exponent,
> >};
> >struct drm_i915_perf_open_param parm = {
> >   .flags = I915_PERF_FLAG_FD_CLOEXEC |
> >I915_PERF_FLAG_FD_NONBLOCK |
> >I915_PERF_FLAG_DISABLED,
> >   .properties_ptr = (uint64_t)properties,
> >   .num_properties = sizeof(properties) / 16,
> >};
> >int fd = drmIoctl(drm_fd, DRM_IOCTL_I915_PERF_OPEN, ¶m);
> >
> > Records read all start with a common { type, size } header with
> > DRM_I915_PERF_RECORD_SAMPLE being of most interest. Sample records
> > contain an extensible number of fields and it's the
> > DRM_I915_PERF_PROP_SAMPLE_xyz properties given when opening that
> > determine what's included in every sample.
> >
> If I'm understanding the above correctly the ioctl can only read user
> data and does not write to params, correct ?
>
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
>
> > +#define DRM_IOCTL_I915_PERF_OPEN   DRM_IOWR(DRM_COMMAND_BASE +
> DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
>
> If so, we seems to have a one letter too much in DRM_IOWR - should one
> use DRM_IOW/DRM_IOR ? Then again I'm not sure how many ioctls bother,
> so please don't read too much into my suggestion :-)
>

Ah, yep, good catch, I don't write back to the param struct any more.

The first iteration of this interface was even more closely modeled on the
core linux perf interface where the param struct starts with a size member
and in a case where userspace passes a structure that's smaller than
expected the kernel returns an error but also writes back the expected size
to inform userspace.

i915 perf moved to taking an array of u64 properties and no longer writes
back a size member in the param struct like perf.

Thanks,
- Robert



>
> Regards,
> Emil
>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5 01/11] drm/i915: Add i915 perf infrastructure

2016-09-14 Thread Emil Velikov
Hi Robert,

I think I've spotted one interesting, yet trivial bit.

On 14 September 2016 at 15:19, Robert Bragg  wrote:
> Adds base i915 perf infrastructure for Gen performance metrics.
>
> This adds a DRM_IOCTL_I915_PERF_OPEN ioctl that takes an array of uint64
> properties to configure a stream of metrics and returns a new fd usable
> with standard VFS system calls including read() to read typed and sized
> records; ioctl() to enable or disable capture and poll() to wait for
> data.
>
> A stream is opened something like:
>
>   uint64_t properties[] = {
>   /* Single context sampling */
>   DRM_I915_PERF_PROP_CTX_HANDLE,ctx_handle,
>
>   /* Include OA reports in samples */
>   DRM_I915_PERF_PROP_SAMPLE_OA, true,
>
>   /* OA unit configuration */
>   DRM_I915_PERF_PROP_OA_METRICS_SET,metrics_set_id,
>   DRM_I915_PERF_PROP_OA_FORMAT, report_format,
>   DRM_I915_PERF_PROP_OA_EXPONENT,   period_exponent,
>};
>struct drm_i915_perf_open_param parm = {
>   .flags = I915_PERF_FLAG_FD_CLOEXEC |
>I915_PERF_FLAG_FD_NONBLOCK |
>I915_PERF_FLAG_DISABLED,
>   .properties_ptr = (uint64_t)properties,
>   .num_properties = sizeof(properties) / 16,
>};
>int fd = drmIoctl(drm_fd, DRM_IOCTL_I915_PERF_OPEN, ¶m);
>
> Records read all start with a common { type, size } header with
> DRM_I915_PERF_RECORD_SAMPLE being of most interest. Sample records
> contain an extensible number of fields and it's the
> DRM_I915_PERF_PROP_SAMPLE_xyz properties given when opening that
> determine what's included in every sample.
>
If I'm understanding the above correctly the ioctl can only read user
data and does not write to params, correct ?

> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h

> +#define DRM_IOCTL_I915_PERF_OPEN   DRM_IOWR(DRM_COMMAND_BASE + 
> DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)

If so, we seems to have a one letter too much in DRM_IOWR - should one
use DRM_IOW/DRM_IOR ? Then again I'm not sure how many ioctls bother,
so please don't read too much into my suggestion :-)

Regards,
Emil
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v5 01/11] drm/i915: Add i915 perf infrastructure

2016-09-14 Thread Robert Bragg
Adds base i915 perf infrastructure for Gen performance metrics.

This adds a DRM_IOCTL_I915_PERF_OPEN ioctl that takes an array of uint64
properties to configure a stream of metrics and returns a new fd usable
with standard VFS system calls including read() to read typed and sized
records; ioctl() to enable or disable capture and poll() to wait for
data.

A stream is opened something like:

  uint64_t properties[] = {
  /* Single context sampling */
  DRM_I915_PERF_PROP_CTX_HANDLE,ctx_handle,

  /* Include OA reports in samples */
  DRM_I915_PERF_PROP_SAMPLE_OA, true,

  /* OA unit configuration */
  DRM_I915_PERF_PROP_OA_METRICS_SET,metrics_set_id,
  DRM_I915_PERF_PROP_OA_FORMAT, report_format,
  DRM_I915_PERF_PROP_OA_EXPONENT,   period_exponent,
   };
   struct drm_i915_perf_open_param parm = {
  .flags = I915_PERF_FLAG_FD_CLOEXEC |
   I915_PERF_FLAG_FD_NONBLOCK |
   I915_PERF_FLAG_DISABLED,
  .properties_ptr = (uint64_t)properties,
  .num_properties = sizeof(properties) / 16,
   };
   int fd = drmIoctl(drm_fd, DRM_IOCTL_I915_PERF_OPEN, ¶m);

Records read all start with a common { type, size } header with
DRM_I915_PERF_RECORD_SAMPLE being of most interest. Sample records
contain an extensible number of fields and it's the
DRM_I915_PERF_PROP_SAMPLE_xyz properties given when opening that
determine what's included in every sample.

No specific streams are supported yet so any attempt to open a stream
will return an error.

Signed-off-by: Robert Bragg 
---
 drivers/gpu/drm/i915/Makefile|   3 +
 drivers/gpu/drm/i915/i915_drv.c  |   4 +
 drivers/gpu/drm/i915/i915_drv.h  |  91 
 drivers/gpu/drm/i915/i915_perf.c | 448 +++
 include/uapi/drm/i915_drm.h  |  67 ++
 5 files changed, 613 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_perf.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a998c2b..d991781 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -110,6 +110,9 @@ i915-y += dvo_ch7017.o \
 # virtual gpu code
 i915-y += i915_vgpu.o
 
+# perf code
+i915-y += i915_perf.o
+
 ifeq ($(CONFIG_DRM_I915_GVT),y)
 i915-y += intel_gvt.o
 include $(src)/gvt/Makefile
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7f4e8ad..14f22fc 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -838,6 +838,8 @@ static int i915_driver_init_early(struct drm_i915_private 
*dev_priv,
 
intel_device_info_dump(dev_priv);
 
+   i915_perf_init(dev_priv);
+
/* Not all pre-production machines fall into this category, only the
 * very first ones. Almost everything should work, except for maybe
 * suspend/resume. And we don't implement workarounds that affect only
@@ -859,6 +861,7 @@ err_workqueues:
  */
 static void i915_driver_cleanup_early(struct drm_i915_private *dev_priv)
 {
+   i915_perf_fini(dev_priv);
i915_gem_load_cleanup(&dev_priv->drm);
i915_workqueues_cleanup(dev_priv);
 }
@@ -2560,6 +2563,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, 
DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, 
i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, 
i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, 
DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1e2dda8..0f5cd8f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1740,6 +1740,84 @@ struct intel_wm_config {
bool sprites_scaled;
 };
 
+struct i915_perf_stream;
+
+struct i915_perf_stream_ops {
+   /* Enables the collection of HW samples, either in response to
+* I915_PERF_IOCTL_ENABLE or implicitly called when stream is
+* opened without I915_PERF_FLAG_DISABLED.
+*/
+   void (*enable)(struct i915_perf_stream *stream);
+
+   /* Disables the collection of HW samples, either in response to
+* I915_PERF_IOCTL_DISABLE or implicitly called before
+* destroying the stream.
+*/
+   void (*disable)(struct i915_perf_stream *stream);
+
+   /* Return: true if any i915 perf records are ready to read()
+* for this stream.
+*/
+   bool (*can_read)(struct i915_perf_stream *stream);
+
+   /* Call poll_wait, passing a wait queue that will be woken
+* once there is something ready to read() for the stream
+*/
+   void (*poll_wait)(struct i915_perf_stream *stream,
+ struct file *file,
+ poll_table *wait);
+
+   /* For handling a blocking read, wait u