Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-27 Thread Chris Wilson
On Wed, Oct 26, 2016 at 12:05:44AM +0100, Chris Wilson wrote:
> On Tue, Oct 25, 2016 at 12:19:29AM +0100, Robert Bragg wrote:
> > +   /* So that we don't have to worry about updating the context ID
> > +* in OACONTOL on the fly we make sure to pin the context
> > +* upfront for the lifetime of the stream...
> > +*/
> > +   vma = stream->ctx->engine[RCS].state;
> 
> There's a caveat here that suggests I had better wrap up this into its
> own function. (We need to flush dirty cachelines to memory on first
> binding of the context.)

Not that actually affects hsw.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Robert Bragg
On Wed, Oct 26, 2016 at 4:03 PM, Robert Bragg 
wrote:

> On 26 Oct 2016 11:08 a.m., "Matthew Auld" 
> wrote:
> >
> > On 26 October 2016 at 00:51, Robert Bragg  wrote:
> > >
> > >
> > > On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
> > >  wrote:
> > >>
> > >> On 25 October 2016 at 00:19, Robert Bragg 
> wrote:
> > >
> > >
> > >>
> > >>
> > >> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > >> > b/drivers/gpu/drm/i915/i915_drv.h
> > >> > index 3448d05..ea24814 100644
> > >> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > >> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > >> > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> > >>
> > >> >
> > >> >  struct drm_i915_private {
> > >> > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> > >> >
> > >> > struct {
> > >> > bool initialized;
> > >> > +
> > >> > struct mutex lock;
> > >> > struct list_head streams;
> > >> >
> > >> > +   spinlock_t hook_lock;
> > >> > +
> > >> > struct {
> > >> > -   u32 metrics_set;
> > >> > +   struct i915_perf_stream *exclusive_stream;
> > >> > +
> > >> > +   u32 specific_ctx_id;
> > >> Can we just get rid of this, now that the vma remains pinned we can
> > >> simply get the ggtt address at the time of configuring the OA_CONTROL
> > >> register ?
> > >
> > >
> > > I considered that, but would ideally prefer to keep it considering the
> gen8+
> > > patches to come. For gen8+ (with execlists) the context ID isn't a gtt
> > > offset.
> > >
> > >>
> > >>
> > >> > +
> > >> > +   struct hrtimer poll_check_timer;
> > >> > +   wait_queue_head_t poll_wq;
> > >> > +   atomic_t pollin;
> > >> > +
> > >>
> > >
> > >>
> > >> > +/* The maximum exponent the hardware accepts is 63 (essentially it
> > >> > selects one
> > >> > + * of the 64bit timestamp bits to trigger reports from) but there's
> > >> > currently
> > >> > + * no known use case for sampling as infrequently as once per 47
> > >> > thousand years.
> > >> > + *
> > >> > + * Since the timestamps included in OA reports are only 32bits it
> seems
> > >> > + * reasonable to limit the OA exponent where it's still possible to
> > >> > account for
> > >> > + * overflow in OA report timestamps.
> > >> > + */
> > >> > +#define OA_EXPONENT_MAX 31
> > >> > +
> > >> > +#define INVALID_CTX_ID 0x
> > >> We shouldn't need this anymore.
> > >
> > >
> > > yeah I removed it and then added it back, just for the sake of
> explicitly
> > > setting the specific_ctx_id to an invalid ID when closing the exclusive
> > > stream - though resetting the value isn't strictly necessary.
> > Can we not make the specific_ctx_id per-stream, the gem context
> > already is, then we don't need to be concerned with resetting it ?
>
> Hmm, I'm not sure about that, conceptually to me it's global OA unit state.
>
> Currently the driver only supports a single exclusive stream, while Sourab
> later relaxes that to a per-engine stream and that could be relaxed further
> with non-oa metric stream types.
>
> With multiple streams we'll still only be able to programmer a single ctx
> id in oacontol.
>
> Conceptually to me, other stream types could be associated with different
> contexts (if they don't depend on the OA unit) so to me stream->ctx isn't
> necessarily OA unit state.
>
> It probably could be played around with, but right now we don't track OA
> specific state in the stream. For the ID it's just semantics to say it's OA
> state, and we could consider that it's maybe generally useful to track the
> ID, even for future non-oa streams. That might mean potentially redundantly
> pinning state for the sake of tracking the ID for streams that don't end up
> needing it.
>

I started to try out moving the specific_ctx_id and vma pointer (new) to
the stream, and also looked at initializing them together with the
stream->ctx reference, but I'm not really happy with how it's looking.

The specific_ctx_id and pinning are only for the render context, since the
OA unit is only well integrated with the render engine, which makes me more
inclined to consider them OA stream specific, not something we want/need
for all streams (considering that Sourab enables multiple streams in his
series).

Btw, for reference, my patches for gen8+ can also end up making use of the
INVALID_CTX_ID define (when overwriting the undefined ctx_id field in HW
reports when the report's ctx-id is flagged as invalid by the OA unit.) so
we maybe don't want to worry to much about removing the need for it here.

- Robert
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Robert Bragg
On 26 Oct 2016 5:54 p.m., "Ville Syrjälä" 
wrote:
>
> On Wed, Oct 26, 2016 at 05:42:23PM +0100, Robert Bragg wrote:
> > On Wed, Oct 26, 2016 at 4:37 PM, Ville Syrjälä <
> > ville.syrj...@linux.intel.com> wrote:
> >
> > > On Wed, Oct 26, 2016 at 04:17:45PM +0100, Robert Bragg wrote:
> > > > On 26 Oct 2016 9:54 a.m., "Chris Wilson" 
> > > wrote:
> > > > >
> > > > > On Wed, Oct 26, 2016 at 12:51:58AM +0100, Robert Bragg wrote:
> > > > > >On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
> > > > > ><[1]matthew.william.a...@gmail.com> wrote:
> > > > > >
> > > > > >  On 25 October 2016 at 00:19, Robert Bragg <[2]
> > > rob...@sixbynine.org>
> > > > > >  wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > >  > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > >  b/drivers/gpu/drm/i915/i915_drv.h
> > > > > >  > index 3448d05..ea24814 100644
> > > > > >  > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > >  > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > >  > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> > > > > >
> > > > > >  >
> > > > > >  >  struct drm_i915_private {
> > > > > >  > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> > > > > >  >
> > > > > >  > struct {
> > > > > >  > bool initialized;
> > > > > >  > +
> > > > > >  > struct mutex lock;
> > > > > >  > struct list_head streams;
> > > > > >  >
> > > > > >  > +   spinlock_t hook_lock;
> > > > > >  > +
> > > > > >  > struct {
> > > > > >  > -   u32 metrics_set;
> > > > > >  > +   struct i915_perf_stream
> > > > *exclusive_stream;
> > >
> > > OT:
> > > What kind of MUA are you using that mangles quoted mails like this?
I've
> > > not seen it on intel-gfx before. mesa-dev seems rife with it, but as I
> > > rarely read that in any great detail I've managed to ignore it there.
> > > Anyways, it makes it espesially hard to navigate long mails since
mutt's
> > > 'S' (skip quoted text) no longer works correctly.
> > >
> >
> > Not sure I want to say, and get booted out the door :-)
> >
> > I've heard that gmail has an annoying habit of forcibly wrapping plain
text
> > emails like this, and a lot of people have complained that there's no
way
> > to disable that 'feature' :-/
> >
> > I used to use Mutt, but I don't think I could really bare to go back to
it
> > any more. Last time I was using it I found myself spending too much time
> > patching it to try and make it work how I'd like, but can't say I got
much
> > enjoyment from that process.
>
> Isn't gmail just a pile of client side javascript or something? Maybe
> you'd enjoy patching that one more? ;)
>
> >
> > I've tried most MUA options available, and can't say any of them make me
> > very happy - I think these days it's just not something developers are
very
> > interesting in working on.
> >
> > I'm a sell out and just use Gmail... sorry. I can't really see myself
> > changing, though I do wish Google weren't so pedantic about forcing
> > wrapping without any option to change that behaviour. I suspect you
> > wouldn't be happy with me sending html emails, which has been Google's
> > default response to this complaint afik.
> >
> > Maybe it's gmail users causing trouble on the Mesa list too.
> >
> > - Robert
> >
> > P.S please don't think lesser of me due to my misguided MUA choices.
>
> I think I'll just reserve the right to ignore any mail with bad quoting.

Okey, fwiw, at least my patches sent out via git send-email should be fine,
so maybe just ignore my replies to feedback - which I promise not to
exploit to achieve 'consensus' through silence.

- Robert

--
Sent from Gmail on Android, in a spare moment at a VR for Immersive Theatre
meet up.

>
> --
> Ville Syrjälä
> Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Ville Syrjälä
On Wed, Oct 26, 2016 at 05:42:23PM +0100, Robert Bragg wrote:
> On Wed, Oct 26, 2016 at 4:37 PM, Ville Syrjälä <
> ville.syrj...@linux.intel.com> wrote:
> 
> > On Wed, Oct 26, 2016 at 04:17:45PM +0100, Robert Bragg wrote:
> > > On 26 Oct 2016 9:54 a.m., "Chris Wilson" 
> > wrote:
> > > >
> > > > On Wed, Oct 26, 2016 at 12:51:58AM +0100, Robert Bragg wrote:
> > > > >On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
> > > > ><[1]matthew.william.a...@gmail.com> wrote:
> > > > >
> > > > >  On 25 October 2016 at 00:19, Robert Bragg <[2]
> > rob...@sixbynine.org>
> > > > >  wrote:
> > > > >
> > > > >
> > > > >
> > > > >  > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > >  b/drivers/gpu/drm/i915/i915_drv.h
> > > > >  > index 3448d05..ea24814 100644
> > > > >  > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > >  > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > >  > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> > > > >
> > > > >  >
> > > > >  >  struct drm_i915_private {
> > > > >  > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> > > > >  >
> > > > >  > struct {
> > > > >  > bool initialized;
> > > > >  > +
> > > > >  > struct mutex lock;
> > > > >  > struct list_head streams;
> > > > >  >
> > > > >  > +   spinlock_t hook_lock;
> > > > >  > +
> > > > >  > struct {
> > > > >  > -   u32 metrics_set;
> > > > >  > +   struct i915_perf_stream
> > > *exclusive_stream;
> >
> > OT:
> > What kind of MUA are you using that mangles quoted mails like this? I've
> > not seen it on intel-gfx before. mesa-dev seems rife with it, but as I
> > rarely read that in any great detail I've managed to ignore it there.
> > Anyways, it makes it espesially hard to navigate long mails since mutt's
> > 'S' (skip quoted text) no longer works correctly.
> >
> 
> Not sure I want to say, and get booted out the door :-)
> 
> I've heard that gmail has an annoying habit of forcibly wrapping plain text
> emails like this, and a lot of people have complained that there's no way
> to disable that 'feature' :-/
> 
> I used to use Mutt, but I don't think I could really bare to go back to it
> any more. Last time I was using it I found myself spending too much time
> patching it to try and make it work how I'd like, but can't say I got much
> enjoyment from that process.

Isn't gmail just a pile of client side javascript or something? Maybe
you'd enjoy patching that one more? ;)

> 
> I've tried most MUA options available, and can't say any of them make me
> very happy - I think these days it's just not something developers are very
> interesting in working on.
> 
> I'm a sell out and just use Gmail... sorry. I can't really see myself
> changing, though I do wish Google weren't so pedantic about forcing
> wrapping without any option to change that behaviour. I suspect you
> wouldn't be happy with me sending html emails, which has been Google's
> default response to this complaint afik.
> 
> Maybe it's gmail users causing trouble on the Mesa list too.
> 
> - Robert
> 
> P.S please don't think lesser of me due to my misguided MUA choices.

I think I'll just reserve the right to ignore any mail with bad quoting.

-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Daniel Vetter
On Wed, Oct 26, 2016 at 05:42:23PM +0100, Robert Bragg wrote:
> On Wed, Oct 26, 2016 at 4:37 PM, Ville Syrjälä <
> ville.syrj...@linux.intel.com> wrote:
> 
> > On Wed, Oct 26, 2016 at 04:17:45PM +0100, Robert Bragg wrote:
> > > On 26 Oct 2016 9:54 a.m., "Chris Wilson" 
> > wrote:
> > > >
> > > > On Wed, Oct 26, 2016 at 12:51:58AM +0100, Robert Bragg wrote:
> > > > >On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
> > > > ><[1]matthew.william.a...@gmail.com> wrote:
> > > > >
> > > > >  On 25 October 2016 at 00:19, Robert Bragg <[2]
> > rob...@sixbynine.org>
> > > > >  wrote:
> > > > >
> > > > >
> > > > >
> > > > >  > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > >  b/drivers/gpu/drm/i915/i915_drv.h
> > > > >  > index 3448d05..ea24814 100644
> > > > >  > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > >  > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > >  > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> > > > >
> > > > >  >
> > > > >  >  struct drm_i915_private {
> > > > >  > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> > > > >  >
> > > > >  > struct {
> > > > >  > bool initialized;
> > > > >  > +
> > > > >  > struct mutex lock;
> > > > >  > struct list_head streams;
> > > > >  >
> > > > >  > +   spinlock_t hook_lock;
> > > > >  > +
> > > > >  > struct {
> > > > >  > -   u32 metrics_set;
> > > > >  > +   struct i915_perf_stream
> > > *exclusive_stream;
> >
> > OT:
> > What kind of MUA are you using that mangles quoted mails like this? I've
> > not seen it on intel-gfx before. mesa-dev seems rife with it, but as I
> > rarely read that in any great detail I've managed to ignore it there.
> > Anyways, it makes it espesially hard to navigate long mails since mutt's
> > 'S' (skip quoted text) no longer works correctly.
> >
> 
> Not sure I want to say, and get booted out the door :-)
> 
> I've heard that gmail has an annoying habit of forcibly wrapping plain text
> emails like this, and a lot of people have complained that there's no way
> to disable that 'feature' :-/
> 
> I used to use Mutt, but I don't think I could really bare to go back to it
> any more. Last time I was using it I found myself spending too much time
> patching it to try and make it work how I'd like, but can't say I got much
> enjoyment from that process.
> 
> I've tried most MUA options available, and can't say any of them make me
> very happy - I think these days it's just not something developers are very
> interesting in working on.
> 
> I'm a sell out and just use Gmail... sorry. I can't really see myself
> changing, though I do wish Google weren't so pedantic about forcing
> wrapping without any option to change that behaviour. I suspect you
> wouldn't be happy with me sending html emails, which has been Google's
> default response to this complaint afik.
> 
> Maybe it's gmail users causing trouble on the Mesa list too.
> 
> - Robert
> 
> P.S please don't think lesser of me due to my misguided MUA choices.

I use a mix of mutt+gmail web interface, since each has their upsides.
Haven't yet seen badly misquoted stuff, I think it mostly seems to work
for me. And there's lots of kernel folks who use gmail too afaik.
-Daniel

> 
> 
> 
> >
> > > > >  > +
> > > > >  > +   u32 specific_ctx_id;
> > > > >  Can we just get rid of this, now that the vma remains pinned we
> > can
> > > > >  simply get the ggtt address at the time of configuring the
> > > OA_CONTROL
> > > > >  register ?
> > > > >
> > > > >I considered that, but would ideally prefer to keep it considering
> > > the
> > > > >gen8+ patches to come. For gen8+ (with execlists) the context ID
> > > isn't a
> > > > >gtt offset.
> > > >
> > > > In terms of symmetry, keeping the vma you pinned and unpinning the same
> > > > later makes its ownership much clearer. (And I do want the owner of
> > each
> > > > pin to be clear, for when we start enabling debug to catch the VMA
> > > > leaks.)
> > >
> > > Keeping our own pointer to the pinned vma could be a clarification.
> > >
> > > Considering Matt's comments too, I'm thinking I'll put the pinning and
> > > specific_ctx_id initialization together with setting stream->ctx, keeping
> > > the state together under the stream. It's going to potentially mean
> > > redundantly pinning the ctx for the sake of the ID in the future for
> > > streams that don't really need it, but I think it's probably not worth
> > > worrying about that.
> > >
> > > - Robert
> > >
> > > > -Chris
> > > >
> > > > --
> > > > Chris Wilson, Intel Open Source Technology Centre
> >
> > > ___
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> 

Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Robert Bragg
On Wed, Oct 26, 2016 at 4:37 PM, Ville Syrjälä <
ville.syrj...@linux.intel.com> wrote:

> On Wed, Oct 26, 2016 at 04:17:45PM +0100, Robert Bragg wrote:
> > On 26 Oct 2016 9:54 a.m., "Chris Wilson" 
> wrote:
> > >
> > > On Wed, Oct 26, 2016 at 12:51:58AM +0100, Robert Bragg wrote:
> > > >On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
> > > ><[1]matthew.william.a...@gmail.com> wrote:
> > > >
> > > >  On 25 October 2016 at 00:19, Robert Bragg <[2]
> rob...@sixbynine.org>
> > > >  wrote:
> > > >
> > > >
> > > >
> > > >  > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > >  b/drivers/gpu/drm/i915/i915_drv.h
> > > >  > index 3448d05..ea24814 100644
> > > >  > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > >  > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > >  > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> > > >
> > > >  >
> > > >  >  struct drm_i915_private {
> > > >  > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> > > >  >
> > > >  > struct {
> > > >  > bool initialized;
> > > >  > +
> > > >  > struct mutex lock;
> > > >  > struct list_head streams;
> > > >  >
> > > >  > +   spinlock_t hook_lock;
> > > >  > +
> > > >  > struct {
> > > >  > -   u32 metrics_set;
> > > >  > +   struct i915_perf_stream
> > *exclusive_stream;
>
> OT:
> What kind of MUA are you using that mangles quoted mails like this? I've
> not seen it on intel-gfx before. mesa-dev seems rife with it, but as I
> rarely read that in any great detail I've managed to ignore it there.
> Anyways, it makes it espesially hard to navigate long mails since mutt's
> 'S' (skip quoted text) no longer works correctly.
>

Not sure I want to say, and get booted out the door :-)

I've heard that gmail has an annoying habit of forcibly wrapping plain text
emails like this, and a lot of people have complained that there's no way
to disable that 'feature' :-/

I used to use Mutt, but I don't think I could really bare to go back to it
any more. Last time I was using it I found myself spending too much time
patching it to try and make it work how I'd like, but can't say I got much
enjoyment from that process.

I've tried most MUA options available, and can't say any of them make me
very happy - I think these days it's just not something developers are very
interesting in working on.

I'm a sell out and just use Gmail... sorry. I can't really see myself
changing, though I do wish Google weren't so pedantic about forcing
wrapping without any option to change that behaviour. I suspect you
wouldn't be happy with me sending html emails, which has been Google's
default response to this complaint afik.

Maybe it's gmail users causing trouble on the Mesa list too.

- Robert

P.S please don't think lesser of me due to my misguided MUA choices.



>
> > > >  > +
> > > >  > +   u32 specific_ctx_id;
> > > >  Can we just get rid of this, now that the vma remains pinned we
> can
> > > >  simply get the ggtt address at the time of configuring the
> > OA_CONTROL
> > > >  register ?
> > > >
> > > >I considered that, but would ideally prefer to keep it considering
> > the
> > > >gen8+ patches to come. For gen8+ (with execlists) the context ID
> > isn't a
> > > >gtt offset.
> > >
> > > In terms of symmetry, keeping the vma you pinned and unpinning the same
> > > later makes its ownership much clearer. (And I do want the owner of
> each
> > > pin to be clear, for when we start enabling debug to catch the VMA
> > > leaks.)
> >
> > Keeping our own pointer to the pinned vma could be a clarification.
> >
> > Considering Matt's comments too, I'm thinking I'll put the pinning and
> > specific_ctx_id initialization together with setting stream->ctx, keeping
> > the state together under the stream. It's going to potentially mean
> > redundantly pinning the ctx for the sake of the ID in the future for
> > streams that don't really need it, but I think it's probably not worth
> > worrying about that.
> >
> > - Robert
> >
> > > -Chris
> > >
> > > --
> > > Chris Wilson, Intel Open Source Technology Centre
>
> > ___
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>
> --
> Ville Syrjälä
> Intel OTC
>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Ville Syrjälä
On Wed, Oct 26, 2016 at 04:17:45PM +0100, Robert Bragg wrote:
> On 26 Oct 2016 9:54 a.m., "Chris Wilson"  wrote:
> >
> > On Wed, Oct 26, 2016 at 12:51:58AM +0100, Robert Bragg wrote:
> > >On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
> > ><[1]matthew.william.a...@gmail.com> wrote:
> > >
> > >  On 25 October 2016 at 00:19, Robert Bragg <[2]rob...@sixbynine.org>
> > >  wrote:
> > >
> > >
> > >
> > >  > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > >  b/drivers/gpu/drm/i915/i915_drv.h
> > >  > index 3448d05..ea24814 100644
> > >  > --- a/drivers/gpu/drm/i915/i915_drv.h
> > >  > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > >  > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> > >
> > >  >
> > >  >  struct drm_i915_private {
> > >  > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> > >  >
> > >  > struct {
> > >  > bool initialized;
> > >  > +
> > >  > struct mutex lock;
> > >  > struct list_head streams;
> > >  >
> > >  > +   spinlock_t hook_lock;
> > >  > +
> > >  > struct {
> > >  > -   u32 metrics_set;
> > >  > +   struct i915_perf_stream
> *exclusive_stream;

OT:
What kind of MUA are you using that mangles quoted mails like this? I've
not seen it on intel-gfx before. mesa-dev seems rife with it, but as I
rarely read that in any great detail I've managed to ignore it there.
Anyways, it makes it espesially hard to navigate long mails since mutt's
'S' (skip quoted text) no longer works correctly.

> > >  > +
> > >  > +   u32 specific_ctx_id;
> > >  Can we just get rid of this, now that the vma remains pinned we can
> > >  simply get the ggtt address at the time of configuring the
> OA_CONTROL
> > >  register ?
> > >
> > >I considered that, but would ideally prefer to keep it considering
> the
> > >gen8+ patches to come. For gen8+ (with execlists) the context ID
> isn't a
> > >gtt offset.
> >
> > In terms of symmetry, keeping the vma you pinned and unpinning the same
> > later makes its ownership much clearer. (And I do want the owner of each
> > pin to be clear, for when we start enabling debug to catch the VMA
> > leaks.)
> 
> Keeping our own pointer to the pinned vma could be a clarification.
> 
> Considering Matt's comments too, I'm thinking I'll put the pinning and
> specific_ctx_id initialization together with setting stream->ctx, keeping
> the state together under the stream. It's going to potentially mean
> redundantly pinning the ctx for the sake of the ID in the future for
> streams that don't really need it, but I think it's probably not worth
> worrying about that.
> 
> - Robert
> 
> > -Chris
> >
> > --
> > Chris Wilson, Intel Open Source Technology Centre

> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Robert Bragg
On 26 Oct 2016 9:54 a.m., "Chris Wilson"  wrote:
>
> On Wed, Oct 26, 2016 at 12:51:58AM +0100, Robert Bragg wrote:
> >On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
> ><[1]matthew.william.a...@gmail.com> wrote:
> >
> >  On 25 October 2016 at 00:19, Robert Bragg <[2]rob...@sixbynine.org>
> >  wrote:
> >
> >
> >
> >  > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> >  b/drivers/gpu/drm/i915/i915_drv.h
> >  > index 3448d05..ea24814 100644
> >  > --- a/drivers/gpu/drm/i915/i915_drv.h
> >  > +++ b/drivers/gpu/drm/i915/i915_drv.h
> >  > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> >
> >  >
> >  >  struct drm_i915_private {
> >  > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> >  >
> >  > struct {
> >  > bool initialized;
> >  > +
> >  > struct mutex lock;
> >  > struct list_head streams;
> >  >
> >  > +   spinlock_t hook_lock;
> >  > +
> >  > struct {
> >  > -   u32 metrics_set;
> >  > +   struct i915_perf_stream
*exclusive_stream;
> >  > +
> >  > +   u32 specific_ctx_id;
> >  Can we just get rid of this, now that the vma remains pinned we can
> >  simply get the ggtt address at the time of configuring the
OA_CONTROL
> >  register ?
> >
> >I considered that, but would ideally prefer to keep it considering
the
> >gen8+ patches to come. For gen8+ (with execlists) the context ID
isn't a
> >gtt offset.
>
> In terms of symmetry, keeping the vma you pinned and unpinning the same
> later makes its ownership much clearer. (And I do want the owner of each
> pin to be clear, for when we start enabling debug to catch the VMA
> leaks.)

Keeping our own pointer to the pinned vma could be a clarification.

Considering Matt's comments too, I'm thinking I'll put the pinning and
specific_ctx_id initialization together with setting stream->ctx, keeping
the state together under the stream. It's going to potentially mean
redundantly pinning the ctx for the sake of the ID in the future for
streams that don't really need it, but I think it's probably not worth
worrying about that.

- Robert

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Robert Bragg
On 26 Oct 2016 11:08 a.m., "Matthew Auld" 
wrote:
>
> On 26 October 2016 at 00:51, Robert Bragg  wrote:
> >
> >
> > On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
> >  wrote:
> >>
> >> On 25 October 2016 at 00:19, Robert Bragg  wrote:
> >
> >
> >>
> >>
> >> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> >> > b/drivers/gpu/drm/i915/i915_drv.h
> >> > index 3448d05..ea24814 100644
> >> > --- a/drivers/gpu/drm/i915/i915_drv.h
> >> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> >>
> >> >
> >> >  struct drm_i915_private {
> >> > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> >> >
> >> > struct {
> >> > bool initialized;
> >> > +
> >> > struct mutex lock;
> >> > struct list_head streams;
> >> >
> >> > +   spinlock_t hook_lock;
> >> > +
> >> > struct {
> >> > -   u32 metrics_set;
> >> > +   struct i915_perf_stream *exclusive_stream;
> >> > +
> >> > +   u32 specific_ctx_id;
> >> Can we just get rid of this, now that the vma remains pinned we can
> >> simply get the ggtt address at the time of configuring the OA_CONTROL
> >> register ?
> >
> >
> > I considered that, but would ideally prefer to keep it considering the
gen8+
> > patches to come. For gen8+ (with execlists) the context ID isn't a gtt
> > offset.
> >
> >>
> >>
> >> > +
> >> > +   struct hrtimer poll_check_timer;
> >> > +   wait_queue_head_t poll_wq;
> >> > +   atomic_t pollin;
> >> > +
> >>
> >
> >>
> >> > +/* The maximum exponent the hardware accepts is 63 (essentially it
> >> > selects one
> >> > + * of the 64bit timestamp bits to trigger reports from) but there's
> >> > currently
> >> > + * no known use case for sampling as infrequently as once per 47
> >> > thousand years.
> >> > + *
> >> > + * Since the timestamps included in OA reports are only 32bits it
seems
> >> > + * reasonable to limit the OA exponent where it's still possible to
> >> > account for
> >> > + * overflow in OA report timestamps.
> >> > + */
> >> > +#define OA_EXPONENT_MAX 31
> >> > +
> >> > +#define INVALID_CTX_ID 0x
> >> We shouldn't need this anymore.
> >
> >
> > yeah I removed it and then added it back, just for the sake of
explicitly
> > setting the specific_ctx_id to an invalid ID when closing the exclusive
> > stream - though resetting the value isn't strictly necessary.
> Can we not make the specific_ctx_id per-stream, the gem context
> already is, then we don't need to be concerned with resetting it ?

Hmm, I'm not sure about that, conceptually to me it's global OA unit state.

Currently the driver only supports a single exclusive stream, while Sourab
later relaxes that to a per-engine stream and that could be relaxed further
with non-oa metric stream types.

With multiple streams we'll still only be able to programmer a single ctx
id in oacontol.

Conceptually to me, other stream types could be associated with different
contexts (if they don't depend on the OA unit) so to me stream->ctx isn't
necessarily OA unit state.

It probably could be played around with, but right now we don't track OA
specific state in the stream. For the ID it's just semantics to say it's OA
state, and we could consider that it's maybe generally useful to track the
ID, even for future non-oa streams. That might mean potentially redundantly
pinning state for the sake of tracking the ID for streams that don't end up
needing it.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Matthew Auld
On 26 October 2016 at 00:51, Robert Bragg  wrote:
>
>
> On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
>  wrote:
>>
>> On 25 October 2016 at 00:19, Robert Bragg  wrote:
>
>
>>
>>
>> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> > b/drivers/gpu/drm/i915/i915_drv.h
>> > index 3448d05..ea24814 100644
>> > --- a/drivers/gpu/drm/i915/i915_drv.h
>> > +++ b/drivers/gpu/drm/i915/i915_drv.h
>> > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
>>
>> >
>> >  struct drm_i915_private {
>> > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
>> >
>> > struct {
>> > bool initialized;
>> > +
>> > struct mutex lock;
>> > struct list_head streams;
>> >
>> > +   spinlock_t hook_lock;
>> > +
>> > struct {
>> > -   u32 metrics_set;
>> > +   struct i915_perf_stream *exclusive_stream;
>> > +
>> > +   u32 specific_ctx_id;
>> Can we just get rid of this, now that the vma remains pinned we can
>> simply get the ggtt address at the time of configuring the OA_CONTROL
>> register ?
>
>
> I considered that, but would ideally prefer to keep it considering the gen8+
> patches to come. For gen8+ (with execlists) the context ID isn't a gtt
> offset.
>
>>
>>
>> > +
>> > +   struct hrtimer poll_check_timer;
>> > +   wait_queue_head_t poll_wq;
>> > +   atomic_t pollin;
>> > +
>>
>
>>
>> > +/* The maximum exponent the hardware accepts is 63 (essentially it
>> > selects one
>> > + * of the 64bit timestamp bits to trigger reports from) but there's
>> > currently
>> > + * no known use case for sampling as infrequently as once per 47
>> > thousand years.
>> > + *
>> > + * Since the timestamps included in OA reports are only 32bits it seems
>> > + * reasonable to limit the OA exponent where it's still possible to
>> > account for
>> > + * overflow in OA report timestamps.
>> > + */
>> > +#define OA_EXPONENT_MAX 31
>> > +
>> > +#define INVALID_CTX_ID 0x
>> We shouldn't need this anymore.
>
>
> yeah I removed it and then added it back, just for the sake of explicitly
> setting the specific_ctx_id to an invalid ID when closing the exclusive
> stream - though resetting the value isn't strictly necessary.
Can we not make the specific_ctx_id per-stream, the gem context
already is, then we don't need to be concerned with resetting it ?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-26 Thread Chris Wilson
On Wed, Oct 26, 2016 at 12:51:58AM +0100, Robert Bragg wrote:
>On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
><[1]matthew.william.a...@gmail.com> wrote:
> 
>  On 25 October 2016 at 00:19, Robert Bragg <[2]rob...@sixbynine.org>
>  wrote:
> 
> 
> 
>  > diff --git a/drivers/gpu/drm/i915/i915_drv.h
>  b/drivers/gpu/drm/i915/i915_drv.h
>  > index 3448d05..ea24814 100644
>  > --- a/drivers/gpu/drm/i915/i915_drv.h
>  > +++ b/drivers/gpu/drm/i915/i915_drv.h
>  > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> 
>  >
>  >  struct drm_i915_private {
>  > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
>  >
>  >         struct {
>  >                 bool initialized;
>  > +
>  >                 struct mutex lock;
>  >                 struct list_head streams;
>  >
>  > +               spinlock_t hook_lock;
>  > +
>  >                 struct {
>  > -                       u32 metrics_set;
>  > +                       struct i915_perf_stream *exclusive_stream;
>  > +
>  > +                       u32 specific_ctx_id;
>  Can we just get rid of this, now that the vma remains pinned we can
>  simply get the ggtt address at the time of configuring the OA_CONTROL
>  register ?
> 
>I considered that, but would ideally prefer to keep it considering the
>gen8+ patches to come. For gen8+ (with execlists) the context ID isn't a
>gtt offset.

In terms of symmetry, keeping the vma you pinned and unpinning the same
later makes its ownership much clearer. (And I do want the owner of each
pin to be clear, for when we start enabling debug to catch the VMA
leaks.)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-25 Thread Robert Bragg
On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld <
matthew.william.a...@gmail.com> wrote:

> On 25 October 2016 at 00:19, Robert Bragg  wrote:



>
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> > index 3448d05..ea24814 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
>
> >
> >  struct drm_i915_private {
> > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> >
> > struct {
> > bool initialized;
> > +
> > struct mutex lock;
> > struct list_head streams;
> >
> > +   spinlock_t hook_lock;
> > +
> > struct {
> > -   u32 metrics_set;
> > +   struct i915_perf_stream *exclusive_stream;
> > +
> > +   u32 specific_ctx_id;
> Can we just get rid of this, now that the vma remains pinned we can
> simply get the ggtt address at the time of configuring the OA_CONTROL
> register ?
>

I considered that, but would ideally prefer to keep it considering the
gen8+ patches to come. For gen8+ (with execlists) the context ID isn't a
gtt offset.


>
> > +
> > +   struct hrtimer poll_check_timer;
> > +   wait_queue_head_t poll_wq;
> > +   atomic_t pollin;
> > +
>
>

> > +/* The maximum exponent the hardware accepts is 63 (essentially it
> selects one
> > + * of the 64bit timestamp bits to trigger reports from) but there's
> currently
> > + * no known use case for sampling as infrequently as once per 47
> thousand years.
> > + *
> > + * Since the timestamps included in OA reports are only 32bits it seems
> > + * reasonable to limit the OA exponent where it's still possible to
> account for
> > + * overflow in OA report timestamps.
> > + */
> > +#define OA_EXPONENT_MAX 31
> > +
> > +#define INVALID_CTX_ID 0x
> We shouldn't need this anymore.
>

yeah I removed it and then added it back, just for the sake of explicitly
setting the specific_ctx_id to an invalid ID when closing the exclusive
stream - though resetting the value isn't strictly necessary.

also maybe your comment is assuming specific_ctx_id can be removed, while
I'd prefer to keep it.


> > +
> > +static int claim_specific_ctx(struct i915_perf_stream *stream)
> > +{
> pin_oa_specific_ctx, or something? Also would it not make more sense
> to operate on the context, not the stream.
>

Yeah, I avoided a name like that mainly because it's also initializing
specific_ctx_id, which seemed to me like it would become an unexpected side
effect with that more specific name.

The other consideration is that in my gen8+ patches the pinning code is
conditional depending on whether execlists are enabled, while the function
still initializes specific_ctx_id.

Certainly not attached to the names though.

Chris has some feedback with the code, so maybe that will affect this too.


> > +   struct drm_i915_private *dev_priv = stream->dev_priv;
> > +   struct i915_vma *vma;
> > +   int ret;
> > +
> > +   ret = i915_mutex_lock_interruptible(_priv->drm);
> > +   if (ret)
> > +   return ret;
> > +
> > +   /* So that we don't have to worry about updating the context ID
> > +* in OACONTOL on the fly we make sure to pin the context
> > +* upfront for the lifetime of the stream...
> > +*/
> > +   vma = stream->ctx->engine[RCS].state;
> > +   ret = i915_vma_pin(vma, 0, stream->ctx->ggtt_alignment,
> > +  PIN_GLOBAL | PIN_HIGH);
> > +   if (ret)
> > +   return ret;
> > +
> > +   dev_priv->perf.oa.specific_ctx_id = i915_ggtt_offset(vma);
> > +
> > +   mutex_unlock(_priv->drm.struct_mutex);
> > +
> > +   return 0;
> > +}
>


I'll also follow up on the other notes; thanks!

- Robert
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-25 Thread Chris Wilson
On Tue, Oct 25, 2016 at 12:19:29AM +0100, Robert Bragg wrote:
> +static int claim_specific_ctx(struct i915_perf_stream *stream)
> +{
> + struct drm_i915_private *dev_priv = stream->dev_priv;
> + struct i915_vma *vma;
> + int ret;
> +
> + ret = i915_mutex_lock_interruptible(_priv->drm);

Looking forward to the day these don't need struct_mutex.

> + if (ret)
> + return ret;
> +
> + /* So that we don't have to worry about updating the context ID
> +  * in OACONTOL on the fly we make sure to pin the context
> +  * upfront for the lifetime of the stream...
> +  */
> + vma = stream->ctx->engine[RCS].state;

There's a caveat here that suggests I had better wrap up this into its
own function. (We need to flush dirty cachelines to memory on first
binding of the context.)

> + ret = i915_vma_pin(vma, 0, stream->ctx->ggtt_alignment,
> +PIN_GLOBAL | PIN_HIGH);
> + if (ret)
> + return ret;

Oops.

> +
> + dev_priv->perf.oa.specific_ctx_id = i915_ggtt_offset(vma);
> +
> + mutex_unlock(_priv->drm.struct_mutex);
> +
> + return 0;
> +}


> +static int alloc_oa_buffer(struct drm_i915_private *dev_priv)
> +{
> + struct drm_i915_gem_object *bo;
> + struct i915_vma *vma;
> + int ret;
> +
> + BUG_ON(dev_priv->perf.oa.oa_buffer.vma);
> +
> + ret = i915_mutex_lock_interruptible(_priv->drm);
> + if (ret)
> + return ret;
> +
> + BUILD_BUG_ON_NOT_POWER_OF_2(OA_BUFFER_SIZE);
> + BUILD_BUG_ON(OA_BUFFER_SIZE < SZ_128K || OA_BUFFER_SIZE > SZ_16M);
> +
> + bo = i915_gem_object_create(_priv->drm, OA_BUFFER_SIZE);
> + if (IS_ERR(bo)) {
> + DRM_ERROR("Failed to allocate OA buffer\n");
> + ret = PTR_ERR(bo);
> + goto unlock;
> + }
> +
> + ret = i915_gem_object_set_cache_level(bo, I915_CACHE_LLC);
> + if (ret)
> + goto err_unref;
> +
> + /* PreHSW required 512K alignment, HSW requires 16M */
> + vma = i915_gem_object_ggtt_pin(bo, NULL, 0, SZ_16M, PIN_MAPPABLE);

Does this need mappable aperture space for OA? You aren't accessing it
via the aperture, but is the hw limited to it?

> + if (IS_ERR(vma)) {
> + ret = PTR_ERR(vma);
> + goto err_unref;
> + }
> + dev_priv->perf.oa.oa_buffer.vma = vma;
> +
> + dev_priv->perf.oa.oa_buffer.vaddr =
> + i915_gem_object_pin_map(bo, I915_MAP_WB);
> + if (IS_ERR(dev_priv->perf.oa.oa_buffer.vaddr)) {
> + ret = PTR_ERR(dev_priv->perf.oa.oa_buffer.vaddr);
> + goto err_unpin;
> + }
> +
> + dev_priv->perf.oa.ops.init_oa_buffer(dev_priv);
> +
> + DRM_DEBUG_DRIVER("OA Buffer initialized, gtt offset = 0x%x, vaddr = %p",
> +  i915_ggtt_offset(dev_priv->perf.oa.oa_buffer.vma),
> +  dev_priv->perf.oa.oa_buffer.vaddr);
> +
> + goto unlock;
> +
> +err_unpin:
> + __i915_vma_unpin(vma);
> +
> +err_unref:
> + i915_gem_object_put(bo);
> +
> + dev_priv->perf.oa.oa_buffer.vaddr = NULL;
> + dev_priv->perf.oa.oa_buffer.vma = NULL;
> +
> +unlock:
> + mutex_unlock(_priv->drm.struct_mutex);
> + return ret;
> +}


> + if (ret >= 0) {
> + /* Maybe make ->pollin per-stream state if we support multiple
> +  * concurrent streams in the future. */
> + atomic_set(_priv->perf.oa.pollin, false);
> + }
> +
>   return ret;
>  }
>  
> -static unsigned int i915_perf_poll_locked(struct i915_perf_stream *stream,
> +static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer)
> +{
> + struct drm_i915_private *dev_priv =
> + container_of(hrtimer, typeof(*dev_priv),
> +  perf.oa.poll_check_timer);
> +
> + if (!dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv)) {
> + atomic_set(_priv->perf.oa.pollin, true);
> + wake_up(_priv->perf.oa.poll_wq);
> + }
> +
> + hrtimer_forward_now(hrtimer, ns_to_ktime(POLL_PERIOD));
> +
> + return HRTIMER_RESTART;
> +}
> +
> +static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv,
> +   struct i915_perf_stream *stream,
> struct file *file,
> poll_table *wait)
>  {
> - unsigned int streams = 0;
> + unsigned int events = 0;
>  
>   stream->ops->poll_wait(stream, file, wait);
>  
> - if (stream->ops->can_read(stream))
> - streams |= POLLIN;
> + /* Note: we don't explicitly check whether there's something to read
> +  * here since this path may be very hot depending on what else
> +  * userspace is polling, or on the timeout in use. We rely solely on
> +  * the hrtimer/oa_poll_check_timer_cb to notify us when there are
> +  * samples to read.
> +  */
> + if (atomic_read(_priv->perf.oa.pollin))
> + 

Re: [Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-25 Thread Matthew Auld
On 25 October 2016 at 00:19, Robert Bragg  wrote:
> Gen graphics hardware can be set up to periodically write snapshots of
> performance counters into a circular buffer via its Observation
> Architecture and this patch exposes that capability to userspace via the
> i915 perf interface.
>
> v2:
>Make sure to initialize ->specific_ctx_id when opening, without
>relying on _pin_notify hook, in case ctx already pinned.
> v3:
>Revert back to pinning ctx upfront when opening stream, removing
>need to hook in to pinning and to update OACONTROL on the fly.
>
> Cc: Chris Wilson 
> Signed-off-by: Robert Bragg 
> Signed-off-by: Zhenyu Wang 
>
> fix enable hsw
Random bit of cruft ?

> ---
>  drivers/gpu/drm/i915/i915_drv.h  |   65 ++-
>  drivers/gpu/drm/i915/i915_perf.c | 1000 
> +-
>  drivers/gpu/drm/i915/i915_reg.h  |  338 +
>  include/uapi/drm/i915_drm.h  |   70 ++-
>  4 files changed, 1444 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 3448d05..ea24814 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> bool sprites_scaled;
>  };
>
> +struct i915_oa_format {
> +   u32 format;
> +   int size;
> +};
> +
>  struct i915_oa_reg {
> i915_reg_t addr;
> u32 value;
> @@ -1784,11 +1789,6 @@ struct i915_perf_stream_ops {
>  */
> void (*disable)(struct i915_perf_stream *stream);
>
> -   /* Return: true if any i915 perf records are ready to read()
> -* for this stream.
> -*/
> -   bool (*can_read)(struct i915_perf_stream *stream);
> -
> /* Call poll_wait, passing a wait queue that will be woken
>  * once there is something ready to read() for the stream
>  */
> @@ -1798,9 +1798,7 @@ struct i915_perf_stream_ops {
>
> /* For handling a blocking read, wait until there is something
>  * to ready to read() for the stream. E.g. wait on the same
> -* wait queue that would be passed to poll_wait() until
> -* ->can_read() returns true (if its safe to call ->can_read()
> -* without the i915 perf lock held).
> +* wait queue that would be passed to poll_wait().
>  */
> int (*wait_unlocked)(struct i915_perf_stream *stream);
>
> @@ -1840,11 +1838,28 @@ struct i915_perf_stream {
> struct list_head link;
>
> u32 sample_flags;
> +   int sample_size;
>
> struct i915_gem_context *ctx;
> bool enabled;
>
> -   struct i915_perf_stream_ops *ops;
> +   const struct i915_perf_stream_ops *ops;
> +};
> +
> +struct i915_oa_ops {
> +   void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
> +   int (*enable_metric_set)(struct drm_i915_private *dev_priv);
> +   void (*disable_metric_set)(struct drm_i915_private *dev_priv);
> +   void (*oa_enable)(struct drm_i915_private *dev_priv);
> +   void (*oa_disable)(struct drm_i915_private *dev_priv);
> +   void (*update_oacontrol)(struct drm_i915_private *dev_priv);
> +   void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
> +   u32 ctx_id);
> +   int (*read)(struct i915_perf_stream *stream,
> +   char __user *buf,
> +   size_t count,
> +   size_t *offset);
> +   bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
>  };
>
>  struct drm_i915_private {
> @@ -2149,16 +2164,46 @@ struct drm_i915_private {
>
> struct {
> bool initialized;
> +
> struct mutex lock;
> struct list_head streams;
>
> +   spinlock_t hook_lock;
> +
> struct {
> -   u32 metrics_set;
> +   struct i915_perf_stream *exclusive_stream;
> +
> +   u32 specific_ctx_id;
Can we just get rid of this, now that the vma remains pinned we can
simply get the ggtt address at the time of configuring the OA_CONTROL
register ?

> +
> +   struct hrtimer poll_check_timer;
> +   wait_queue_head_t poll_wq;
> +   atomic_t pollin;
> +
> +   bool periodic;
> +   int period_exponent;
> +   int timestamp_frequency;
> +
> +   int tail_margin;
> +
> +   int metrics_set;
>
> const struct i915_oa_reg *mux_regs;
> int mux_regs_len;
> const struct i915_oa_reg *b_counter_regs;
> int b_counter_regs_len;
> +
> +   struct {
> +   struct i915_vma *vma;
> +

[Intel-gfx] [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-10-24 Thread Robert Bragg
Gen graphics hardware can be set up to periodically write snapshots of
performance counters into a circular buffer via its Observation
Architecture and this patch exposes that capability to userspace via the
i915 perf interface.

v2:
   Make sure to initialize ->specific_ctx_id when opening, without
   relying on _pin_notify hook, in case ctx already pinned.
v3:
   Revert back to pinning ctx upfront when opening stream, removing
   need to hook in to pinning and to update OACONTROL on the fly.

Cc: Chris Wilson 
Signed-off-by: Robert Bragg 
Signed-off-by: Zhenyu Wang 

fix enable hsw
---
 drivers/gpu/drm/i915/i915_drv.h  |   65 ++-
 drivers/gpu/drm/i915/i915_perf.c | 1000 +-
 drivers/gpu/drm/i915/i915_reg.h  |  338 +
 include/uapi/drm/i915_drm.h  |   70 ++-
 4 files changed, 1444 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3448d05..ea24814 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1764,6 +1764,11 @@ struct intel_wm_config {
bool sprites_scaled;
 };
 
+struct i915_oa_format {
+   u32 format;
+   int size;
+};
+
 struct i915_oa_reg {
i915_reg_t addr;
u32 value;
@@ -1784,11 +1789,6 @@ struct i915_perf_stream_ops {
 */
void (*disable)(struct i915_perf_stream *stream);
 
-   /* Return: true if any i915 perf records are ready to read()
-* for this stream.
-*/
-   bool (*can_read)(struct i915_perf_stream *stream);
-
/* Call poll_wait, passing a wait queue that will be woken
 * once there is something ready to read() for the stream
 */
@@ -1798,9 +1798,7 @@ struct i915_perf_stream_ops {
 
/* For handling a blocking read, wait until there is something
 * to ready to read() for the stream. E.g. wait on the same
-* wait queue that would be passed to poll_wait() until
-* ->can_read() returns true (if its safe to call ->can_read()
-* without the i915 perf lock held).
+* wait queue that would be passed to poll_wait().
 */
int (*wait_unlocked)(struct i915_perf_stream *stream);
 
@@ -1840,11 +1838,28 @@ struct i915_perf_stream {
struct list_head link;
 
u32 sample_flags;
+   int sample_size;
 
struct i915_gem_context *ctx;
bool enabled;
 
-   struct i915_perf_stream_ops *ops;
+   const struct i915_perf_stream_ops *ops;
+};
+
+struct i915_oa_ops {
+   void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
+   int (*enable_metric_set)(struct drm_i915_private *dev_priv);
+   void (*disable_metric_set)(struct drm_i915_private *dev_priv);
+   void (*oa_enable)(struct drm_i915_private *dev_priv);
+   void (*oa_disable)(struct drm_i915_private *dev_priv);
+   void (*update_oacontrol)(struct drm_i915_private *dev_priv);
+   void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
+   u32 ctx_id);
+   int (*read)(struct i915_perf_stream *stream,
+   char __user *buf,
+   size_t count,
+   size_t *offset);
+   bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
 };
 
 struct drm_i915_private {
@@ -2149,16 +2164,46 @@ struct drm_i915_private {
 
struct {
bool initialized;
+
struct mutex lock;
struct list_head streams;
 
+   spinlock_t hook_lock;
+
struct {
-   u32 metrics_set;
+   struct i915_perf_stream *exclusive_stream;
+
+   u32 specific_ctx_id;
+
+   struct hrtimer poll_check_timer;
+   wait_queue_head_t poll_wq;
+   atomic_t pollin;
+
+   bool periodic;
+   int period_exponent;
+   int timestamp_frequency;
+
+   int tail_margin;
+
+   int metrics_set;
 
const struct i915_oa_reg *mux_regs;
int mux_regs_len;
const struct i915_oa_reg *b_counter_regs;
int b_counter_regs_len;
+
+   struct {
+   struct i915_vma *vma;
+   u8 *vaddr;
+   int format;
+   int format_size;
+   } oa_buffer;
+
+   u32 gen7_latched_oastatus1;
+
+   struct i915_oa_ops ops;
+   const struct i915_oa_format *oa_formats;
+   int n_builtin_sets;
} oa;
} perf;
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c