Re: [Intel-gfx] [PATCH 2/2] drm/i915: restrict unclaimed register checking

2015-09-02 Thread ch...@chris-wilson.co.uk
On Wed, Sep 02, 2015 at 08:20:52PM +, Zanoni, Paulo R wrote:
> Em Qua, 2015-08-26 às 08:44 +0100, Chris Wilson escreveu:
> > On Tue, Aug 25, 2015 at 07:03:42PM -0300, Paulo Zanoni wrote:
> > > The unclaimed register bit is only triggered when someone touches 
> > > the
> > > specified register range.
> > > 
> > > For the normal use case (with i915.mmio_debug=0), this commit will
> > > avoid the extra __raw_i915_read32() call for every register outside
> > > the specified range, at the expense of a few additional "if"
> > > statements.
> > > 
> > > Cc: Chris Wilson 
> > > Signed-off-by: Paulo Zanoni 
> > 
> > 
> > > @@ -612,7 +614,7 @@ hsw_unclaimed_reg_debug(struct drm_i915_private 
> > > *dev_priv, u32 reg, bool read,
> > >   const char *op = read ? "reading" : "writing to";
> > >   const char *when = before ? "before" : "after";
> > >  
> > > - if (!i915.mmio_debug)
> > > + if (!i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg))
> > >   return;
> > 
> > Place the register check on the preceding line so that it is not
> > confused with checking the debug-enabled status. (Also you can argue
> > that reg will be register/cache-hot and so a cheaper test to do first
> > than loading a module parameter.)
> 
> That makes sense, I'll do it.
> 
> > 
> > >   if (__raw_i915_read32(dev_priv, FPGA_DBG) & 
> > > FPGA_DBG_RM_NOCLAIM) {
> > > @@ -624,11 +626,11 @@ hsw_unclaimed_reg_debug(struct 
> > > drm_i915_private *dev_priv, u32 reg, bool read,
> > >  }
> > >  
> > >  static void
> > > -hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv)
> > > +hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv, u32 
> > > reg)
> > >  {
> > >   static bool mmio_debug_once = true;
> > >  
> > > - if (i915.mmio_debug || !mmio_debug_once)
> > > + if (i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg) || 
> > > !mmio_debug_once)
> > >   return;
> > 
> > The use is wrong here though because you never reviewed my patch to
> > enable the single-shot debugging from the interrupt handler to catch
> > invalid usage from elsewhere.
> 
> If you're talking about intel_uncore_check_errors(), I think we can
> just kill it now. I'll submit a patch with the arguments, so we can
> continue this topic there.
> 
> Moving back to the main subject:
> 
> In the last time you sent the patch to do the unclaimed checks only on
> forcewake code, I started the conversations with the HW guys about the
> range of registers relevant by FPGA_DBG (and CCd you on these
> conversations). My plan was to write this patch at that time, but
> priorities happened and it got postponed :(. I also hoped that maybe
> you would eventually end up writing it and I would just have to review
> it :)
> 
> My main argument was that by doing unclaimed register checking only at
> forcewake time we would be running a check that is only relevant to
> display code during non-display code. So my idea is that if we restrict
> the unclaimed check to the actual range of registers that can be caught
> we basically remove unclaimed register checking for most (all?) of the
> performance-sensitive code.
> 
> Since we have multiple solutions, I decided to do some experiments.
> First of all, since I was not really seeing hsw_unclaimed_reg_detect()
> on perf, I decided to patch it so it would do the read/write check
> around FPGA_DBG 50 times per call instead of just one. With this, by
> running "perf record glxgears" for a few seconds I could see
> hsw_unclaimed_reg_detect() taking about 4.6% of the total time.

Something is very suspect with your system if you are not observing much
hsw_unclaimed_reg_check during trivial workloads.

> Then I applied just this patch, and the time went down to 0.13%. I also
> applied your patch on top of it all, and it went up to 0.52% (I guess
> the extra checks at forcewake time cost a little bit). Also notice that
> since I add a "reg" argument to hsw_unclaimed_reg_detect(), I changed
> the forcewake call to use a register in the display range.
> I also tested your patch without my patch, and the measured result was
> about 0.56%.
> 
> Now this isn't a super relevant experiment: just glxgears with a
> modified hsw_unclaimed_reg_detect(), but I thought it would be useful
> information, and maybe you could provide me suggestions for better
> workloads.
> 
> So I'd like to know if you're ok with proceeding with just this patch
> (considering I implement your suggestions), or if you think we should
> go with just your patch or both or none.

If you really wanted to, you could combine approaches a check in the
forcewake handler as demonstrated is an order of magnitude less frequent
than the register accesses. The key point is that we *need* the checking
against other users, not just our known register access. Checking our own
access basically amounts to detecting invalid registers, which your
approach more or less erradicates (since it is a flag we add to known good
registers, we may 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: restrict unclaimed register checking

2015-09-02 Thread Zanoni, Paulo R
Em Qua, 2015-08-26 às 08:44 +0100, Chris Wilson escreveu:
> On Tue, Aug 25, 2015 at 07:03:42PM -0300, Paulo Zanoni wrote:
> > The unclaimed register bit is only triggered when someone touches 
> > the
> > specified register range.
> > 
> > For the normal use case (with i915.mmio_debug=0), this commit will
> > avoid the extra __raw_i915_read32() call for every register outside
> > the specified range, at the expense of a few additional "if"
> > statements.
> > 
> > Cc: Chris Wilson 
> > Signed-off-by: Paulo Zanoni 
> 
> 
> > @@ -612,7 +614,7 @@ hsw_unclaimed_reg_debug(struct drm_i915_private 
> > *dev_priv, u32 reg, bool read,
> > const char *op = read ? "reading" : "writing to";
> > const char *when = before ? "before" : "after";
> >  
> > -   if (!i915.mmio_debug)
> > +   if (!i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg))
> > return;
> 
> Place the register check on the preceding line so that it is not
> confused with checking the debug-enabled status. (Also you can argue
> that reg will be register/cache-hot and so a cheaper test to do first
> than loading a module parameter.)

That makes sense, I'll do it.

> 
> > if (__raw_i915_read32(dev_priv, FPGA_DBG) & 
> > FPGA_DBG_RM_NOCLAIM) {
> > @@ -624,11 +626,11 @@ hsw_unclaimed_reg_debug(struct 
> > drm_i915_private *dev_priv, u32 reg, bool read,
> >  }
> >  
> >  static void
> > -hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv)
> > +hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv, u32 
> > reg)
> >  {
> > static bool mmio_debug_once = true;
> >  
> > -   if (i915.mmio_debug || !mmio_debug_once)
> > +   if (i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg) || 
> > !mmio_debug_once)
> > return;
> 
> The use is wrong here though because you never reviewed my patch to
> enable the single-shot debugging from the interrupt handler to catch
> invalid usage from elsewhere.

If you're talking about intel_uncore_check_errors(), I think we can
just kill it now. I'll submit a patch with the arguments, so we can
continue this topic there.

Moving back to the main subject:

In the last time you sent the patch to do the unclaimed checks only on
forcewake code, I started the conversations with the HW guys about the
range of registers relevant by FPGA_DBG (and CCd you on these
conversations). My plan was to write this patch at that time, but
priorities happened and it got postponed :(. I also hoped that maybe
you would eventually end up writing it and I would just have to review
it :)

My main argument was that by doing unclaimed register checking only at
forcewake time we would be running a check that is only relevant to
display code during non-display code. So my idea is that if we restrict
the unclaimed check to the actual range of registers that can be caught
we basically remove unclaimed register checking for most (all?) of the
performance-sensitive code.

Since we have multiple solutions, I decided to do some experiments.
First of all, since I was not really seeing hsw_unclaimed_reg_detect()
on perf, I decided to patch it so it would do the read/write check
around FPGA_DBG 50 times per call instead of just one. With this, by
running "perf record glxgears" for a few seconds I could see
hsw_unclaimed_reg_detect() taking about 4.6% of the total time.

Then I applied just this patch, and the time went down to 0.13%. I also
applied your patch on top of it all, and it went up to 0.52% (I guess
the extra checks at forcewake time cost a little bit). Also notice that
since I add a "reg" argument to hsw_unclaimed_reg_detect(), I changed
the forcewake call to use a register in the display range.
I also tested your patch without my patch, and the measured result was
about 0.56%.

Now this isn't a super relevant experiment: just glxgears with a
modified hsw_unclaimed_reg_detect(), but I thought it would be useful
information, and maybe you could provide me suggestions for better
workloads.

So I'd like to know if you're ok with proceeding with just this patch
(considering I implement your suggestions), or if you think we should
go with just your patch or both or none.

Thanks,
Paulo

> -Chris
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: restrict unclaimed register checking

2015-09-02 Thread Paulo Zanoni
2015-09-02 17:53 GMT-03:00 ch...@chris-wilson.co.uk :
> On Wed, Sep 02, 2015 at 08:20:52PM +, Zanoni, Paulo R wrote:
>> Em Qua, 2015-08-26 às 08:44 +0100, Chris Wilson escreveu:
>> > On Tue, Aug 25, 2015 at 07:03:42PM -0300, Paulo Zanoni wrote:
>> > > The unclaimed register bit is only triggered when someone touches
>> > > the
>> > > specified register range.
>> > >
>> > > For the normal use case (with i915.mmio_debug=0), this commit will
>> > > avoid the extra __raw_i915_read32() call for every register outside
>> > > the specified range, at the expense of a few additional "if"
>> > > statements.
>> > >
>> > > Cc: Chris Wilson 
>> > > Signed-off-by: Paulo Zanoni 
>> >
>> >
>> > > @@ -612,7 +614,7 @@ hsw_unclaimed_reg_debug(struct drm_i915_private
>> > > *dev_priv, u32 reg, bool read,
>> > >   const char *op = read ? "reading" : "writing to";
>> > >   const char *when = before ? "before" : "after";
>> > >
>> > > - if (!i915.mmio_debug)
>> > > + if (!i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg))
>> > >   return;
>> >
>> > Place the register check on the preceding line so that it is not
>> > confused with checking the debug-enabled status. (Also you can argue
>> > that reg will be register/cache-hot and so a cheaper test to do first
>> > than loading a module parameter.)
>>
>> That makes sense, I'll do it.
>>
>> >
>> > >   if (__raw_i915_read32(dev_priv, FPGA_DBG) &
>> > > FPGA_DBG_RM_NOCLAIM) {
>> > > @@ -624,11 +626,11 @@ hsw_unclaimed_reg_debug(struct
>> > > drm_i915_private *dev_priv, u32 reg, bool read,
>> > >  }
>> > >
>> > >  static void
>> > > -hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv)
>> > > +hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv, u32
>> > > reg)
>> > >  {
>> > >   static bool mmio_debug_once = true;
>> > >
>> > > - if (i915.mmio_debug || !mmio_debug_once)
>> > > + if (i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg) ||
>> > > !mmio_debug_once)
>> > >   return;
>> >
>> > The use is wrong here though because you never reviewed my patch to
>> > enable the single-shot debugging from the interrupt handler to catch
>> > invalid usage from elsewhere.
>>
>> If you're talking about intel_uncore_check_errors(), I think we can
>> just kill it now. I'll submit a patch with the arguments, so we can
>> continue this topic there.
>>
>> Moving back to the main subject:
>>
>> In the last time you sent the patch to do the unclaimed checks only on
>> forcewake code, I started the conversations with the HW guys about the
>> range of registers relevant by FPGA_DBG (and CCd you on these
>> conversations). My plan was to write this patch at that time, but
>> priorities happened and it got postponed :(. I also hoped that maybe
>> you would eventually end up writing it and I would just have to review
>> it :)
>>
>> My main argument was that by doing unclaimed register checking only at
>> forcewake time we would be running a check that is only relevant to
>> display code during non-display code. So my idea is that if we restrict
>> the unclaimed check to the actual range of registers that can be caught
>> we basically remove unclaimed register checking for most (all?) of the
>> performance-sensitive code.
>>
>> Since we have multiple solutions, I decided to do some experiments.
>> First of all, since I was not really seeing hsw_unclaimed_reg_detect()
>> on perf, I decided to patch it so it would do the read/write check
>> around FPGA_DBG 50 times per call instead of just one. With this, by
>> running "perf record glxgears" for a few seconds I could see
>> hsw_unclaimed_reg_detect() taking about 4.6% of the total time.
>
> Something is very suspect with your system if you are not observing much
> hsw_unclaimed_reg_check during trivial workloads.
>
>> Then I applied just this patch, and the time went down to 0.13%. I also
>> applied your patch on top of it all, and it went up to 0.52% (I guess
>> the extra checks at forcewake time cost a little bit). Also notice that
>> since I add a "reg" argument to hsw_unclaimed_reg_detect(), I changed
>> the forcewake call to use a register in the display range.
>> I also tested your patch without my patch, and the measured result was
>> about 0.56%.
>>
>> Now this isn't a super relevant experiment: just glxgears with a
>> modified hsw_unclaimed_reg_detect(), but I thought it would be useful
>> information, and maybe you could provide me suggestions for better
>> workloads.
>>
>> So I'd like to know if you're ok with proceeding with just this patch
>> (considering I implement your suggestions), or if you think we should
>> go with just your patch or both or none.
>
> If you really wanted to, you could combine approaches a check in the
> forcewake handler as demonstrated is an order of magnitude less frequent
> than the register accesses. The key point is that we *need* the checking
> against other users, not just our known register access.

I'm not sure 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: restrict unclaimed register checking

2015-09-02 Thread Paulo Zanoni
2015-09-02 17:53 GMT-03:00 ch...@chris-wilson.co.uk :
> On Wed, Sep 02, 2015 at 08:20:52PM +, Zanoni, Paulo R wrote:
>> Em Qua, 2015-08-26 às 08:44 +0100, Chris Wilson escreveu:
>> > On Tue, Aug 25, 2015 at 07:03:42PM -0300, Paulo Zanoni wrote:
>> > > The unclaimed register bit is only triggered when someone touches
>> > > the
>> > > specified register range.
>> > >
>> > > For the normal use case (with i915.mmio_debug=0), this commit will
>> > > avoid the extra __raw_i915_read32() call for every register outside
>> > > the specified range, at the expense of a few additional "if"
>> > > statements.
>> > >
>> > > Cc: Chris Wilson 
>> > > Signed-off-by: Paulo Zanoni 
>> >
>> >
>> > > @@ -612,7 +614,7 @@ hsw_unclaimed_reg_debug(struct drm_i915_private
>> > > *dev_priv, u32 reg, bool read,
>> > >   const char *op = read ? "reading" : "writing to";
>> > >   const char *when = before ? "before" : "after";
>> > >
>> > > - if (!i915.mmio_debug)
>> > > + if (!i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg))
>> > >   return;
>> >
>> > Place the register check on the preceding line so that it is not
>> > confused with checking the debug-enabled status. (Also you can argue
>> > that reg will be register/cache-hot and so a cheaper test to do first
>> > than loading a module parameter.)
>>
>> That makes sense, I'll do it.
>>
>> >
>> > >   if (__raw_i915_read32(dev_priv, FPGA_DBG) &
>> > > FPGA_DBG_RM_NOCLAIM) {
>> > > @@ -624,11 +626,11 @@ hsw_unclaimed_reg_debug(struct
>> > > drm_i915_private *dev_priv, u32 reg, bool read,
>> > >  }
>> > >
>> > >  static void
>> > > -hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv)
>> > > +hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv, u32
>> > > reg)
>> > >  {
>> > >   static bool mmio_debug_once = true;
>> > >
>> > > - if (i915.mmio_debug || !mmio_debug_once)
>> > > + if (i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg) ||
>> > > !mmio_debug_once)
>> > >   return;
>> >
>> > The use is wrong here though because you never reviewed my patch to
>> > enable the single-shot debugging from the interrupt handler to catch
>> > invalid usage from elsewhere.
>>
>> If you're talking about intel_uncore_check_errors(), I think we can
>> just kill it now. I'll submit a patch with the arguments, so we can
>> continue this topic there.
>>
>> Moving back to the main subject:
>>
>> In the last time you sent the patch to do the unclaimed checks only on
>> forcewake code, I started the conversations with the HW guys about the
>> range of registers relevant by FPGA_DBG (and CCd you on these
>> conversations). My plan was to write this patch at that time, but
>> priorities happened and it got postponed :(. I also hoped that maybe
>> you would eventually end up writing it and I would just have to review
>> it :)
>>
>> My main argument was that by doing unclaimed register checking only at
>> forcewake time we would be running a check that is only relevant to
>> display code during non-display code. So my idea is that if we restrict
>> the unclaimed check to the actual range of registers that can be caught
>> we basically remove unclaimed register checking for most (all?) of the
>> performance-sensitive code.
>>
>> Since we have multiple solutions, I decided to do some experiments.
>> First of all, since I was not really seeing hsw_unclaimed_reg_detect()
>> on perf, I decided to patch it so it would do the read/write check
>> around FPGA_DBG 50 times per call instead of just one. With this, by
>> running "perf record glxgears" for a few seconds I could see
>> hsw_unclaimed_reg_detect() taking about 4.6% of the total time.
>
> Something is very suspect with your system if you are not observing much
> hsw_unclaimed_reg_check during trivial workloads.

Ok, I should have said "almost not seeing" instead of "not really
seeing": my bad. I see about 0.24% on perf with plain
drm-intel-nightly. The problem is that any of the next patches reduces
this down to 0.01% or 0.02%, so it's hard to compare the optimized
patches. By doing 50 reg reads on detect() I can make the numbers of
the optimized patches a little bigger, so easier to compare.

>
>> Then I applied just this patch, and the time went down to 0.13%. I also
>> applied your patch on top of it all, and it went up to 0.52% (I guess
>> the extra checks at forcewake time cost a little bit). Also notice that
>> since I add a "reg" argument to hsw_unclaimed_reg_detect(), I changed
>> the forcewake call to use a register in the display range.
>> I also tested your patch without my patch, and the measured result was
>> about 0.56%.
>>
>> Now this isn't a super relevant experiment: just glxgears with a
>> modified hsw_unclaimed_reg_detect(), but I thought it would be useful
>> information, and maybe you could provide me suggestions for better
>> workloads.
>>
>> So I'd like to know if you're ok with proceeding with just this patch
>> (considering I 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: restrict unclaimed register checking

2015-08-29 Thread shuang . he
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: 
shuang...@intel.com)
Task id: 7256
-Summary-
Platform  Delta  drm-intel-nightly  Series Applied
ILK -1  302/302  301/302
SNB  315/315  315/315
IVB  336/336  336/336
BYT  283/283  283/283
HSW  378/378  378/378
-Detailed-
Platform  Testdrm-intel-nightly  Series 
Applied
*ILK  igt@kms_flip@flip-vs-dpms-interruptible  PASS(1)  DMESG_WARN(1)
Note: You need to pay more attention to line start with '*'
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: restrict unclaimed register checking

2015-08-26 Thread Chris Wilson
On Tue, Aug 25, 2015 at 07:03:42PM -0300, Paulo Zanoni wrote:
 The unclaimed register bit is only triggered when someone touches the
 specified register range.
 
 For the normal use case (with i915.mmio_debug=0), this commit will
 avoid the extra __raw_i915_read32() call for every register outside
 the specified range, at the expense of a few additional if
 statements.
 
 Cc: Chris Wilson ch...@chris-wilson.co.uk
 Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com


 @@ -612,7 +614,7 @@ hsw_unclaimed_reg_debug(struct drm_i915_private 
 *dev_priv, u32 reg, bool read,
   const char *op = read ? reading : writing to;
   const char *when = before ? before : after;
  
 - if (!i915.mmio_debug)
 + if (!i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg))
   return;

Place the register check on the preceding line so that it is not
confused with checking the debug-enabled status. (Also you can argue
that reg will be register/cache-hot and so a cheaper test to do first
than loading a module parameter.)

   if (__raw_i915_read32(dev_priv, FPGA_DBG)  FPGA_DBG_RM_NOCLAIM) {
 @@ -624,11 +626,11 @@ hsw_unclaimed_reg_debug(struct drm_i915_private 
 *dev_priv, u32 reg, bool read,
  }
  
  static void
 -hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv)
 +hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv, u32 reg)
  {
   static bool mmio_debug_once = true;
  
 - if (i915.mmio_debug || !mmio_debug_once)
 + if (i915.mmio_debug || !UNCLAIMED_CHECK_RANGE(reg) || !mmio_debug_once)
   return;

The use is wrong here though because you never reviewed my patch to
enable the single-shot debugging from the interrupt handler to catch
invalid usage from elsewhere.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx