Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Add perf property support for context HW id
On 25/07/17 05:26, Zhenyu Wang wrote: On 2017.07.21 14:01:01 +0100, Lionel Landwerlin wrote: I think Chris' comments show this isn't actually tested. It turned out that's true...so currently Pengyuan just tried to filter by exposed vGPU ctx_hw_id with global mode in gputop. Would that be ok with you? If yes, then we don't need i915 perf change. If that works for you, it's great. I kind of like to have the hw_id in userspace too, it's pretty handy to debug IGT tests :) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Add perf property support for context HW id
On 2017.07.21 14:01:01 +0100, Lionel Landwerlin wrote: > I think Chris' comments show this isn't actually tested. It turned out that's true...so currently Pengyuan just tried to filter by exposed vGPU ctx_hw_id with global mode in gputop. Would that be ok with you? If yes, then we don't need i915 perf change. -- Open Source Technology Center, Intel ltd. $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827 signature.asc Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Add perf property support for context HW id
I think Chris' comments show this isn't actually tested. Can you please write at least one test case for igt ? We have a pretty long list of patches that need to land (still need review on some patches). I would recommend you base you patches on this : https://github.com/djdeath/intel-gpu-tools/tree/wip/djdeath/oa-next Once your tests are passing, I'll put them in that branch. Thanks! On 21/07/17 12:04, Chris Wilson wrote: Quoting Zhenyu Wang (2017-07-19 06:39:48) +static struct i915_gem_context * +lookup_context_hw_id(struct drm_i915_private *dev_priv, unsigned int hw_id) +{ + struct i915_gem_context *ctx; + int ret; + + ret = i915_mutex_lock_interruptible(&dev_priv->drm); + if (ret) + return ERR_PTR(ret); + + list_for_each_entry(ctx, &dev_priv->contexts.list, link) { + if (!i915_gem_context_is_default(ctx)) + continue; This is still a massive what? Why ban normal contexts? Why are you not banning the kernel context? + + if (ctx->hw_id == hw_id) { + ret = 1; + i915_gem_context_get(ctx); + break; Bad news, your reference counting is still broken. You actually need an i915_gem_context_get_rcu() variant for pulling from this list. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Add perf property support for context HW id
Quoting Zhenyu Wang (2017-07-19 06:39:48) > +static struct i915_gem_context * > +lookup_context_hw_id(struct drm_i915_private *dev_priv, unsigned int hw_id) > +{ > + struct i915_gem_context *ctx; > + int ret; > + > + ret = i915_mutex_lock_interruptible(&dev_priv->drm); > + if (ret) > + return ERR_PTR(ret); > + > + list_for_each_entry(ctx, &dev_priv->contexts.list, link) { > + if (!i915_gem_context_is_default(ctx)) > + continue; This is still a massive what? Why ban normal contexts? Why are you not banning the kernel context? > + > + if (ctx->hw_id == hw_id) { > + ret = 1; > + i915_gem_context_get(ctx); > + break; Bad news, your reference counting is still broken. You actually need an i915_gem_context_get_rcu() variant for pulling from this list. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Add perf property support for context HW id
Just a small nit down there. On 19/07/17 06:39, Zhenyu Wang wrote: In order to support profiling for special context e.g vGPU context, we can expose vGPU context hw id and enable i915 perf property to get target context for profiling. This adds new perf property to assign target context with hw id. Jiao Pengyuan has helped to fix context reference bug in original code. v2: - new function for context lookup with hw id - always require priviledged op - fix error return value Cc: Lionel Landwerlin Cc: Chris Wilson Cc: Jiao Pengyuan Cc: Niu Bing Signed-off-by: Zhenyu Wang --- drivers/gpu/drm/i915/i915_perf.c | 83 include/uapi/drm/i915_drm.h | 5 +++ 2 files changed, 72 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 96682fd86f82..a80a3959bc3b 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -334,7 +334,9 @@ static struct i915_oa_format gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = { * struct perf_open_properties - for validated properties given to open a stream * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags * @single_context: Whether a single or all gpu contexts should be monitored + * @single_context_hw: Set for special context with hw id e.g vGPU * @ctx_handle: A gem ctx handle for use with @single_context + * @ctx_hw_id: A ctx hw id for use with @single_context_hw * @metrics_set: An ID for an OA unit metric set advertised via sysfs * @oa_format: An OA unit HW report format * @oa_periodic: Whether to enable periodic OA unit sampling @@ -348,7 +350,9 @@ struct perf_open_properties { u32 sample_flags; u64 single_context:1; + u64 single_context_hw:1; u64 ctx_handle; + u64 ctx_hw_id; /* OA sampling state */ int metrics_set; @@ -2482,6 +2486,29 @@ static const struct file_operations fops = { .unlocked_ioctl = i915_perf_ioctl, }; +static struct i915_gem_context * +lookup_context_hw_id(struct drm_i915_private *dev_priv, unsigned int hw_id) +{ + struct i915_gem_context *ctx; + int ret; + + ret = i915_mutex_lock_interruptible(&dev_priv->drm); + if (ret) + return ERR_PTR(ret); + + list_for_each_entry(ctx, &dev_priv->contexts.list, link) { + if (!i915_gem_context_is_default(ctx)) + continue; + + if (ctx->hw_id == hw_id) { + ret = 1; + i915_gem_context_get(ctx); + break; + } + } + mutex_unlock(&dev_priv->drm.struct_mutex); + return ret ? ctx : NULL; I think you should replace NULL by ERR_PTR(-ENOENT) here, that would simplify the call to lookup_context_hw_id(). +} /** * i915_perf_open_ioctl_locked - DRM ioctl() for userspace to open a stream FD @@ -2531,24 +2558,35 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, ret = -ENOENT; goto err; } + + /* +* On Haswell the OA unit supports clock gating off for a specific +* context and in this mode there's no visibility of metrics for the +* rest of the system, which we consider acceptable for a +* non-privileged client. +* +* For Gen8+ the OA unit no longer supports clock gating off for a +* specific context and the kernel can't securely stop the counters +* from updating as system-wide / global values. Even though we can +* filter reports based on the included context ID we can't block +* clients from seeing the raw / global counter values via +* MI_REPORT_PERF_COUNT commands and so consider it a privileged op to +* enable the OA unit by default. +*/ + if (IS_HASWELL(dev_priv) && specific_ctx) + privileged_op = false; } - /* -* On Haswell the OA unit supports clock gating off for a specific -* context and in this mode there's no visibility of metrics for the -* rest of the system, which we consider acceptable for a -* non-privileged client. -* -* For Gen8+ the OA unit no longer supports clock gating off for a -* specific context and the kernel can't securely stop the counters -* from updating as system-wide / global values. Even though we can -* filter reports based on the included context ID we can't block -* clients from seeing the raw / global counter values via -* MI_REPORT_PERF_COUNT commands and so consider it a privileged op to -* enable the OA unit by default. -*/ - if (IS_HASWELL(dev_priv) && specific_ctx) -
[Intel-gfx] [PATCH v2 1/2] drm/i915: Add perf property support for context HW id
In order to support profiling for special context e.g vGPU context, we can expose vGPU context hw id and enable i915 perf property to get target context for profiling. This adds new perf property to assign target context with hw id. Jiao Pengyuan has helped to fix context reference bug in original code. v2: - new function for context lookup with hw id - always require priviledged op - fix error return value Cc: Lionel Landwerlin Cc: Chris Wilson Cc: Jiao Pengyuan Cc: Niu Bing Signed-off-by: Zhenyu Wang --- drivers/gpu/drm/i915/i915_perf.c | 83 include/uapi/drm/i915_drm.h | 5 +++ 2 files changed, 72 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 96682fd86f82..a80a3959bc3b 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -334,7 +334,9 @@ static struct i915_oa_format gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = { * struct perf_open_properties - for validated properties given to open a stream * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags * @single_context: Whether a single or all gpu contexts should be monitored + * @single_context_hw: Set for special context with hw id e.g vGPU * @ctx_handle: A gem ctx handle for use with @single_context + * @ctx_hw_id: A ctx hw id for use with @single_context_hw * @metrics_set: An ID for an OA unit metric set advertised via sysfs * @oa_format: An OA unit HW report format * @oa_periodic: Whether to enable periodic OA unit sampling @@ -348,7 +350,9 @@ struct perf_open_properties { u32 sample_flags; u64 single_context:1; + u64 single_context_hw:1; u64 ctx_handle; + u64 ctx_hw_id; /* OA sampling state */ int metrics_set; @@ -2482,6 +2486,29 @@ static const struct file_operations fops = { .unlocked_ioctl = i915_perf_ioctl, }; +static struct i915_gem_context * +lookup_context_hw_id(struct drm_i915_private *dev_priv, unsigned int hw_id) +{ + struct i915_gem_context *ctx; + int ret; + + ret = i915_mutex_lock_interruptible(&dev_priv->drm); + if (ret) + return ERR_PTR(ret); + + list_for_each_entry(ctx, &dev_priv->contexts.list, link) { + if (!i915_gem_context_is_default(ctx)) + continue; + + if (ctx->hw_id == hw_id) { + ret = 1; + i915_gem_context_get(ctx); + break; + } + } + mutex_unlock(&dev_priv->drm.struct_mutex); + return ret ? ctx : NULL; +} /** * i915_perf_open_ioctl_locked - DRM ioctl() for userspace to open a stream FD @@ -2531,24 +2558,35 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, ret = -ENOENT; goto err; } + + /* +* On Haswell the OA unit supports clock gating off for a specific +* context and in this mode there's no visibility of metrics for the +* rest of the system, which we consider acceptable for a +* non-privileged client. +* +* For Gen8+ the OA unit no longer supports clock gating off for a +* specific context and the kernel can't securely stop the counters +* from updating as system-wide / global values. Even though we can +* filter reports based on the included context ID we can't block +* clients from seeing the raw / global counter values via +* MI_REPORT_PERF_COUNT commands and so consider it a privileged op to +* enable the OA unit by default. +*/ + if (IS_HASWELL(dev_priv) && specific_ctx) + privileged_op = false; } - /* -* On Haswell the OA unit supports clock gating off for a specific -* context and in this mode there's no visibility of metrics for the -* rest of the system, which we consider acceptable for a -* non-privileged client. -* -* For Gen8+ the OA unit no longer supports clock gating off for a -* specific context and the kernel can't securely stop the counters -* from updating as system-wide / global values. Even though we can -* filter reports based on the included context ID we can't block -* clients from seeing the raw / global counter values via -* MI_REPORT_PERF_COUNT commands and so consider it a privileged op to -* enable the OA unit by default. -*/ - if (IS_HASWELL(dev_priv) && specific_ctx) - privileged_op = false; + if (props->single_context_hw) { + unsigned int hw_id = props->ctx_hw_id; + + specific_ctx = lookup_context_hw_id(dev_priv, hw_id); +