[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-08-01 Thread Maarten Lankhorst
Op 29-07-16 om 22:33 schreef Matt Roper:
> On Fri, Jul 29, 2016 at 12:39:05PM +0300, Ville Syrjälä wrote:
>> On Thu, Jul 28, 2016 at 05:03:52PM -0700, Matt Roper wrote:
>>> This is completely untested (and probably horribly broken/buggy), but
>>> here's a quick mockup of the general approach I was thinking for
>>> ensuring DDB & WM's can be updated together while ensuring the
>>> three-step pipe flushing process is honored:
>>>
>>> https://github.com/mattrope/kernel/commits/experimental/lyude_ddb
>>>
>>> Basically the idea is to take note of what's happening to the pipe's DDB
>>> allocation (shrinking, growing, unchanged, etc.) during the atomic check
>>> phase;
>> Didn't look too closely, but I think you can't actually do that unless
>> you lock all the crtcs whenever the number of active pipes is goind to
>> change. Meaning we'd essentially be back to the one-big-modeset-lock
>> apporach, which will cause missed flips and whanot on the other pipes.
>>
>> The alternative I think would consist of:
>> - make sure level 0 watermark never exceeds total_ddb_size/max_pipes,
>>   so that a modeset doesn't have to care about the wms for the other
>>   pipes not fitting in
> Unfortunately this part is the problem.  You might get away with doing
> this on SKL/KBL which only have three planes max per pipe and a large
> (896 block) DDB, but on BXT you have up to four planes (we don't
> actually enable the topmost plane in a full-featured manner just yet,
> but need to soon), yet the total DDB is only 512 blocks.  Sadly, the
> platform with more planes was given a smaller DDB...   :-(
> We're already in trouble because users that are running setups like 3x
> 4K with most/all planes in use at large sizes can't find level 0
> watermarks that satisfy their needs and we have to reject the whole
> configuration.  If we further limit each pipe's usage to total/maxpipes
> (i.e., 170 blocks per pipe on BXT), then we're going to hit similar
> issues when only driving one or two displays with with all of the planes
> in use, even though we should have had more DDB space to work with.
>
> I guess serious plane usage isn't too common in desktop setups today,
> but it's a very critical feature in the embedded world; we can't really
> afford to cripple plane usage further.  Unfortunately, as you point out
> above, this means that we have to follow the bspec's DDB allocation
> method, which in turn means we need to grab _all_ CRTC locks any time
> _any_ CRTC is being turned on or turned off which is a BKL-style way of
> doing things.
Meh, I'm running into a similar issue on vlv/chv. I don't see a way around it. 
:(
Only thing we could do is split up the state to make the non-modeset
crtc's complete early.


[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-08-01 Thread Ville Syrjälä
On Mon, Aug 01, 2016 at 10:48:37AM +0200, Maarten Lankhorst wrote:
> Op 29-07-16 om 22:33 schreef Matt Roper:
> > On Fri, Jul 29, 2016 at 12:39:05PM +0300, Ville Syrjälä wrote:
> >> On Thu, Jul 28, 2016 at 05:03:52PM -0700, Matt Roper wrote:
> >>> This is completely untested (and probably horribly broken/buggy), but
> >>> here's a quick mockup of the general approach I was thinking for
> >>> ensuring DDB & WM's can be updated together while ensuring the
> >>> three-step pipe flushing process is honored:
> >>>
> >>> https://github.com/mattrope/kernel/commits/experimental/lyude_ddb
> >>>
> >>> Basically the idea is to take note of what's happening to the pipe's DDB
> >>> allocation (shrinking, growing, unchanged, etc.) during the atomic check
> >>> phase;
> >> Didn't look too closely, but I think you can't actually do that unless
> >> you lock all the crtcs whenever the number of active pipes is goind to
> >> change. Meaning we'd essentially be back to the one-big-modeset-lock
> >> apporach, which will cause missed flips and whanot on the other pipes.
> >>
> >> The alternative I think would consist of:
> >> - make sure level 0 watermark never exceeds total_ddb_size/max_pipes,
> >>   so that a modeset doesn't have to care about the wms for the other
> >>   pipes not fitting in
> > Unfortunately this part is the problem.  You might get away with doing
> > this on SKL/KBL which only have three planes max per pipe and a large
> > (896 block) DDB, but on BXT you have up to four planes (we don't
> > actually enable the topmost plane in a full-featured manner just yet,
> > but need to soon), yet the total DDB is only 512 blocks.  Sadly, the
> > platform with more planes was given a smaller DDB...   :-(
> > We're already in trouble because users that are running setups like 3x
> > 4K with most/all planes in use at large sizes can't find level 0
> > watermarks that satisfy their needs and we have to reject the whole
> > configuration.  If we further limit each pipe's usage to total/maxpipes
> > (i.e., 170 blocks per pipe on BXT), then we're going to hit similar
> > issues when only driving one or two displays with with all of the planes
> > in use, even though we should have had more DDB space to work with.
> >
> > I guess serious plane usage isn't too common in desktop setups today,
> > but it's a very critical feature in the embedded world; we can't really
> > afford to cripple plane usage further.  Unfortunately, as you point out
> > above, this means that we have to follow the bspec's DDB allocation
> > method, which in turn means we need to grab _all_ CRTC locks any time
> > _any_ CRTC is being turned on or turned off which is a BKL-style way of
> > doing things.
> Meh, I'm running into a similar issue on vlv/chv. I don't see a way around 
> it. :(

Now are you hitting it w/ vlv/chv? They don't even have a shared DDB.

> Only thing we could do is split up the state to make the non-modeset
> crtc's complete early.

-- 
Ville Syrjälä
Intel OTC


[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-08-01 Thread Ville Syrjälä
On Fri, Jul 29, 2016 at 01:41:26PM -0700, Matt Roper wrote:
> On Fri, Jul 29, 2016 at 10:26:20PM +0300, Ville Syrjälä wrote:
> > On Fri, Jul 29, 2016 at 02:48:09PM -0400, Lyude wrote:
> > > So I've been working on trying to fix this entirely again (e.g. writing
> > > the ddb properly), since from bug reports it still doesn't sound like
> > > we've got enough workarounds to make this tolerable. I've shown this to
> > > matt roper, but I should probably post what I've been trying to do for
> > > you as well.
> > > 
> > > So the approach I came up with is here
> > > 
> > > https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2
> > > 
> > > My approach differs a little bit from what the bspec recommends, but I
> > > think it's going to be a bit easier to implement. At the end of all the
> > > changes I'm attempting it should look like this:
> > > 
> > >  * We no longer have a global watermark update for skl
> > >  * A new hook called "update_ddbs" is added to i915_display_funcs. This
> > >gets called in intel_atomic_commit_tail() after we've disabled any
> > >CRTCs that needed disabling, and before we begin enabling/updating
> > >CRTCs
> > >  * Because pipe ddb allocations (not the inner-pipe ddb allocations
> > >that apply to each pipe's planes) only change while
> > >enabling/disabling crtcs:
> > > * Pass 1: Find which pipe's new ddb allocation won't overlap with
> > >   another pipe's previous allocation, and update that pipe first
> > > * Pass 2: Update the allocation of the remaining pipe
> > > 
> > > Here's an illustration of what this looks like. Parts of the ddb not
> > > being used by any CRTCs are marked out with 'x's:
> > > 
> > > With pipe A enabled, we enable pipe B:
> > > Initial DDB:    |           A           |
> > > update_ddbs:    |     A     |xxx| Pass 1
> > > Enable pipe B:  |     A     |     B     |
> > > 
> > > With pipes B and A active, we enable pipe C:
> > > 
> > > Initial DDB:    |     B     |     A     |
> > > update_ddbs:    |   B   |xxx|     A     | Pass 1
> > > update_ddbs:    |   B   |   A   |xxx| Pass 2
> > > Enable pipe C:  |   B   |   A   |   C   |
> > > 
> > > With pipes A, B, and C active, we disable B:
> > > Initial DDB:    |   A   |   B   |   C   |
> > > Disable pipe B: |   A   |xxx|   C   |
> > > update_ddbs:    |     A     |     C     | Pass 1
> > > Since neither pipe's new allocation overlapped, we skip pass 2
> > > 
> > > Another allocation with A, B, and C active, disabling A:
> > > Initial DDB:    |   A   |   B   |   C   |
> > > Disable pipe A: |xxx|   B   |   C   |
> > > update_ddbs:    |     B     |xxx|   C   | Pass 1
> > > update_ddbs:    |     B     |     C     | Pass 2
> > > 
> > > This should ensure we can always move around the allocations of pipes
> > > without them ever overlapping and exploding.
> > 
> > That's what the current flush thing does, or at least that what it used
> > to do at least. Not sure it's really doing it anymore, but I'm pretty
> > sure the order in which it did things was sound at some point.
> > 
> > > 
> > > This branch doesn't entirely fix underrun issues, but I'm mostly sure
> > > that's the fault of still not having removed the global wm update hook
> > > yet (which is leading to additional pipe flushes in places they
> > > shouldn't be):
> > 
> > Well it should basically boil down to s/update_wm/update_ddb/
> > Only DDB reallocation really warrants a global hook. Everything else
> > should be handled via per-crtc hooks, on all platforms.
> 
> I don't think we want even update_ddb.  We want *calculate_ddb* to be a
> global hook during the atomic check phase (and we do have that today),
> but when we get around to the commit phase and start writing stuff to
> hardware, we want to program the per-pipe DDB entries at the same time
> we program the WM's and regular plane registers (since they should be
> double buffered and flushed out the same way).  We just need to make
> sure that the order we process pipes in follows the special flushing
> rules noted above, which is what my pseudo-patches were trying to
> describe.

My comments are w.r.t. the scheme where we don't lock down all the crtcs
for every DDB update.

> 
> > 
> > > 
> > > https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2
> > > 
> > > As for updating inner-pipe ddb allocations for each plane on a pipe, we
> > > should be able to just do that at the same time we update each pipe's
> > > watermarks
> > 
> > Yes.
> > 
> > None of that changes what I said before though. Either you need to
> > lock down everything when the DDB needs to be repartitioned, or you
> > do what I outlined in the previous mail. Otherwise a plane update
> > etc. happening in parallel will still blow up on account of the DDB
> > state changing underneath the plane update somewhere between compute
> > and commit. I can't really think 

[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-08-02 Thread Maarten Lankhorst
Op 01-08-16 om 13:48 schreef Ville Syrjälä:
> On Mon, Aug 01, 2016 at 10:48:37AM +0200, Maarten Lankhorst wrote:
>> Op 29-07-16 om 22:33 schreef Matt Roper:
>>> On Fri, Jul 29, 2016 at 12:39:05PM +0300, Ville Syrjälä wrote:
 On Thu, Jul 28, 2016 at 05:03:52PM -0700, Matt Roper wrote:
> This is completely untested (and probably horribly broken/buggy), but
> here's a quick mockup of the general approach I was thinking for
> ensuring DDB & WM's can be updated together while ensuring the
> three-step pipe flushing process is honored:
>
> https://github.com/mattrope/kernel/commits/experimental/lyude_ddb
>
> Basically the idea is to take note of what's happening to the pipe's DDB
> allocation (shrinking, growing, unchanged, etc.) during the atomic check
> phase;
 Didn't look too closely, but I think you can't actually do that unless
 you lock all the crtcs whenever the number of active pipes is goind to
 change. Meaning we'd essentially be back to the one-big-modeset-lock
 apporach, which will cause missed flips and whanot on the other pipes.

 The alternative I think would consist of:
 - make sure level 0 watermark never exceeds total_ddb_size/max_pipes,
   so that a modeset doesn't have to care about the wms for the other
   pipes not fitting in
>>> Unfortunately this part is the problem.  You might get away with doing
>>> this on SKL/KBL which only have three planes max per pipe and a large
>>> (896 block) DDB, but on BXT you have up to four planes (we don't
>>> actually enable the topmost plane in a full-featured manner just yet,
>>> but need to soon), yet the total DDB is only 512 blocks.  Sadly, the
>>> platform with more planes was given a smaller DDB...   :-(
>>> We're already in trouble because users that are running setups like 3x
>>> 4K with most/all planes in use at large sizes can't find level 0
>>> watermarks that satisfy their needs and we have to reject the whole
>>> configuration.  If we further limit each pipe's usage to total/maxpipes
>>> (i.e., 170 blocks per pipe on BXT), then we're going to hit similar
>>> issues when only driving one or two displays with with all of the planes
>>> in use, even though we should have had more DDB space to work with.
>>>
>>> I guess serious plane usage isn't too common in desktop setups today,
>>> but it's a very critical feature in the embedded world; we can't really
>>> afford to cripple plane usage further.  Unfortunately, as you point out
>>> above, this means that we have to follow the bspec's DDB allocation
>>> method, which in turn means we need to grab _all_ CRTC locks any time
>>> _any_ CRTC is being turned on or turned off which is a BKL-style way of
>>> doing things.
>> Meh, I'm running into a similar issue on vlv/chv. I don't see a way around 
>> it. :(
> Now are you hitting it w/ vlv/chv? They don't even have a shared DDB.
Not really the same, but determining what power saving levels + values to set 
with possibly parallel updates.

~Maarten


[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-08-02 Thread Ville Syrjälä
On Tue, Aug 02, 2016 at 05:41:50PM +0200, Maarten Lankhorst wrote:
> Op 01-08-16 om 13:48 schreef Ville Syrjälä:
> > On Mon, Aug 01, 2016 at 10:48:37AM +0200, Maarten Lankhorst wrote:
> >> Op 29-07-16 om 22:33 schreef Matt Roper:
> >>> On Fri, Jul 29, 2016 at 12:39:05PM +0300, Ville Syrjälä wrote:
>  On Thu, Jul 28, 2016 at 05:03:52PM -0700, Matt Roper wrote:
> > This is completely untested (and probably horribly broken/buggy), but
> > here's a quick mockup of the general approach I was thinking for
> > ensuring DDB & WM's can be updated together while ensuring the
> > three-step pipe flushing process is honored:
> >
> > 
> > https://github.com/mattrope/kernel/commits/experimental/lyude_ddb
> >
> > Basically the idea is to take note of what's happening to the pipe's DDB
> > allocation (shrinking, growing, unchanged, etc.) during the atomic check
> > phase;
>  Didn't look too closely, but I think you can't actually do that unless
>  you lock all the crtcs whenever the number of active pipes is goind to
>  change. Meaning we'd essentially be back to the one-big-modeset-lock
>  apporach, which will cause missed flips and whanot on the other pipes.
> 
>  The alternative I think would consist of:
>  - make sure level 0 watermark never exceeds total_ddb_size/max_pipes,
>    so that a modeset doesn't have to care about the wms for the other
>    pipes not fitting in
> >>> Unfortunately this part is the problem.  You might get away with doing
> >>> this on SKL/KBL which only have three planes max per pipe and a large
> >>> (896 block) DDB, but on BXT you have up to four planes (we don't
> >>> actually enable the topmost plane in a full-featured manner just yet,
> >>> but need to soon), yet the total DDB is only 512 blocks.  Sadly, the
> >>> platform with more planes was given a smaller DDB...   :-(
> >>> We're already in trouble because users that are running setups like 3x
> >>> 4K with most/all planes in use at large sizes can't find level 0
> >>> watermarks that satisfy their needs and we have to reject the whole
> >>> configuration.  If we further limit each pipe's usage to total/maxpipes
> >>> (i.e., 170 blocks per pipe on BXT), then we're going to hit similar
> >>> issues when only driving one or two displays with with all of the planes
> >>> in use, even though we should have had more DDB space to work with.
> >>>
> >>> I guess serious plane usage isn't too common in desktop setups today,
> >>> but it's a very critical feature in the embedded world; we can't really
> >>> afford to cripple plane usage further.  Unfortunately, as you point out
> >>> above, this means that we have to follow the bspec's DDB allocation
> >>> method, which in turn means we need to grab _all_ CRTC locks any time
> >>> _any_ CRTC is being turned on or turned off which is a BKL-style way of
> >>> doing things.
> >> Meh, I'm running into a similar issue on vlv/chv. I don't see a way around 
> >> it. :(
> > Now are you hitting it w/ vlv/chv? They don't even have a shared DDB.
> Not really the same, but determining what power saving levels + values to set 
> with possibly parallel updates.

Just like ILK we just need to track reasonably closely how many pipes
are active, and redo the wm level selection when that changes. Just
make sure the code never thinks there are less pipes active than in
reality, and it should all be safe.

In theory we wouldn't even need to do that since the PM5/DDR DVFS don't
actually depend on the number of active pipes. It's just that they never
validated that combination, so it's not something that's recommended to
be used. And since the whole DDR DVFS extra latency vs. DDL deadlines
is such a mess, it probably would end up in more underruns.

-- 
Ville Syrjälä
Intel OTC


[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-07-28 Thread Matt Roper
This is completely untested (and probably horribly broken/buggy), but
here's a quick mockup of the general approach I was thinking for
ensuring DDB & WM's can be updated together while ensuring the
three-step pipe flushing process is honored:

https://github.com/mattrope/kernel/commits/experimental/lyude_ddb

Basically the idea is to take note of what's happening to the pipe's DDB
allocation (shrinking, growing, unchanged, etc.) during the atomic check
phase; then during the commit phase, we loop over the CRTC's three times
instead of just once, but only operate on a subset of the CRTC's in each
loop.  While operating on each CRTC, the plane, WM, and DDB all get
programmed together and have a single flush for all three.



Matt

On Tue, Jul 26, 2016 at 01:34:36PM -0400, Lyude wrote:
> Latest version of https://lkml.org/lkml/2016/7/26/290 . Resending the whole
> thing to keep it in one place.
> 
> Lyude (5):
>   drm/i915/skl: Add support for the SAGV, fix underrun hangs
>   drm/i915/skl: Only flush pipes when we change the ddb allocation
>   drm/i915/skl: Fix extra whitespace in skl_flush_wm_values()
>   drm/i915/skl: Update plane watermarks atomically during plane updates
>   drm/i915/skl: Always wait for pipes to update after a flush
> 
> Matt Roper (1):
>   drm/i915/gen9: Only copy WM results for changed pipes to skl_hw
> 
>  drivers/gpu/drm/i915/i915_drv.h  |   3 +
>  drivers/gpu/drm/i915/i915_reg.h  |   5 +
>  drivers/gpu/drm/i915/intel_display.c |  24 
>  drivers/gpu/drm/i915/intel_drv.h |   4 +
>  drivers/gpu/drm/i915/intel_pm.c  | 240 
> +++
>  drivers/gpu/drm/i915/intel_sprite.c  |   2 +
>  6 files changed, 255 insertions(+), 23 deletions(-)
> 
> -- 
> 2.7.4
> 
> ___
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795


[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-07-29 Thread Ville Syrjälä
On Thu, Jul 28, 2016 at 05:03:52PM -0700, Matt Roper wrote:
> This is completely untested (and probably horribly broken/buggy), but
> here's a quick mockup of the general approach I was thinking for
> ensuring DDB & WM's can be updated together while ensuring the
> three-step pipe flushing process is honored:
> 
> https://github.com/mattrope/kernel/commits/experimental/lyude_ddb
> 
> Basically the idea is to take note of what's happening to the pipe's DDB
> allocation (shrinking, growing, unchanged, etc.) during the atomic check
> phase;

Didn't look too closely, but I think you can't actually do that unless
you lock all the crtcs whenever the number of active pipes is goind to
change. Meaning we'd essentially be back to the one-big-modeset-lock
apporach, which will cause missed flips and whanot on the other pipes.

The alternative I think would consist of:
- make sure level 0 watermark never exceeds total_ddb_size/max_pipes,
  so that a modeset doesn't have to care about the wms for the other
  pipes not fitting in
- level 1+ watermarks would be checked against total_ddb_size
- protect the plane/pipe commit with the wm mutex whenever the wms
  need to be reprogrammed
- keep the flush_wm thing around for the case when ddb size does get
  changed, protect it with the wm lock
- when programming wms, we will first filter out any level that
  doesn't fit in with the current ddb size, and then program the
  rest in
- potentially introduce per-pipe wm locks if the one big lock looks
  like an issue, which it might if the flush_wm holds it all the way
  through

> then during the commit phase, we loop over the CRTC's three times
> instead of just once, but only operate on a subset of the CRTC's in each
> loop.  While operating on each CRTC, the plane, WM, and DDB all get
> programmed together and have a single flush for all three.
>
> 
> 
> 
> Matt
> 
> On Tue, Jul 26, 2016 at 01:34:36PM -0400, Lyude wrote:
> > Latest version of https://lkml.org/lkml/2016/7/26/290 . Resending the whole
> > thing to keep it in one place.
> > 
> > Lyude (5):
> >   drm/i915/skl: Add support for the SAGV, fix underrun hangs
> >   drm/i915/skl: Only flush pipes when we change the ddb allocation
> >   drm/i915/skl: Fix extra whitespace in skl_flush_wm_values()
> >   drm/i915/skl: Update plane watermarks atomically during plane updates
> >   drm/i915/skl: Always wait for pipes to update after a flush
> > 
> > Matt Roper (1):
> >   drm/i915/gen9: Only copy WM results for changed pipes to skl_hw
> > 
> >  drivers/gpu/drm/i915/i915_drv.h  |   3 +
> >  drivers/gpu/drm/i915/i915_reg.h  |   5 +
> >  drivers/gpu/drm/i915/intel_display.c |  24 
> >  drivers/gpu/drm/i915/intel_drv.h |   4 +
> >  drivers/gpu/drm/i915/intel_pm.c  | 240 
> > +++
> >  drivers/gpu/drm/i915/intel_sprite.c  |   2 +
> >  6 files changed, 255 insertions(+), 23 deletions(-)
> > 
> > -- 
> > 2.7.4
> > 
> > ___
> > Intel-gfx mailing list
> > Intel-gfx at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Matt Roper
> Graphics Software Engineer
> IoTG Platform Enabling & Development
> Intel Corporation
> (916) 356-2795

-- 
Ville Syrjälä
Intel OTC


[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-07-29 Thread Lyude
So I've been working on trying to fix this entirely again (e.g. writing
the ddb properly), since from bug reports it still doesn't sound like
we've got enough workarounds to make this tolerable. I've shown this to
matt roper, but I should probably post what I've been trying to do for
you as well.

So the approach I came up with is here

https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2

My approach differs a little bit from what the bspec recommends, but I
think it's going to be a bit easier to implement. At the end of all the
changes I'm attempting it should look like this:

 * We no longer have a global watermark update for skl
 * A new hook called "update_ddbs" is added to i915_display_funcs. This
   gets called in intel_atomic_commit_tail() after we've disabled any
   CRTCs that needed disabling, and before we begin enabling/updating
   CRTCs
 * Because pipe ddb allocations (not the inner-pipe ddb allocations
   that apply to each pipe's planes) only change while
   enabling/disabling crtcs:
* Pass 1: Find which pipe's new ddb allocation won't overlap with
  another pipe's previous allocation, and update that pipe first
* Pass 2: Update the allocation of the remaining pipe

Here's an illustration of what this looks like. Parts of the ddb not
being used by any CRTCs are marked out with 'x's:

With pipe A enabled, we enable pipe B:
Initial DDB:    |           A           |
update_ddbs:    |     A     |xxx| Pass 1
Enable pipe B:  |     A     |     B     |

With pipes B and A active, we enable pipe C:

Initial DDB:    |     B     |     A     |
update_ddbs:    |   B   |xxx|     A     | Pass 1
update_ddbs:    |   B   |   A   |xxx| Pass 2
Enable pipe C:  |   B   |   A   |   C   |

With pipes A, B, and C active, we disable B:
Initial DDB:    |   A   |   B   |   C   |
Disable pipe B: |   A   |xxx|   C   |
update_ddbs:    |     A     |     C     | Pass 1
Since neither pipe's new allocation overlapped, we skip pass 2

Another allocation with A, B, and C active, disabling A:
Initial DDB:    |   A   |   B   |   C   |
Disable pipe A: |xxx|   B   |   C   |
update_ddbs:    |     B     |xxx|   C   | Pass 1
update_ddbs:    |     B     |     C     | Pass 2

This should ensure we can always move around the allocations of pipes
without them ever overlapping and exploding.

This branch doesn't entirely fix underrun issues, but I'm mostly sure
that's the fault of still not having removed the global wm update hook
yet (which is leading to additional pipe flushes in places they
shouldn't be):

https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2

As for updating inner-pipe ddb allocations for each plane on a pipe, we
should be able to just do that at the same time we update each pipe's
watermarks

Let me know what you think
Lyude

On Fri, 2016-07-29 at 12:39 +0300, Ville Syrjälä wrote:
> On Thu, Jul 28, 2016 at 05:03:52PM -0700, Matt Roper wrote:
> > 
> > This is completely untested (and probably horribly broken/buggy),
> > but
> > here's a quick mockup of the general approach I was thinking for
> > ensuring DDB & WM's can be updated together while ensuring the
> > three-step pipe flushing process is honored:
> > 
> >         https://github.com/mattrope/kernel/commits/experimental/lyu
> > de_ddb
> > 
> > Basically the idea is to take note of what's happening to the
> > pipe's DDB
> > allocation (shrinking, growing, unchanged, etc.) during the atomic
> > check
> > phase;
> 
> Didn't look too closely, but I think you can't actually do that
> unless
> you lock all the crtcs whenever the number of active pipes is goind
> to
> change. Meaning we'd essentially be back to the one-big-modeset-lock
> apporach, which will cause missed flips and whanot on the other
> pipes.
> 
> The alternative I think would consist of:
> - make sure level 0 watermark never exceeds total_ddb_size/max_pipes,
>   so that a modeset doesn't have to care about the wms for the other
>   pipes not fitting in
> - level 1+ watermarks would be checked against total_ddb_size
> - protect the plane/pipe commit with the wm mutex whenever the wms
>   need to be reprogrammed
> - keep the flush_wm thing around for the case when ddb size does get
>   changed, protect it with the wm lock
> - when programming wms, we will first filter out any level that
>   doesn't fit in with the current ddb size, and then program the
>   rest in
> - potentially introduce per-pipe wm locks if the one big lock looks
>   like an issue, which it might if the flush_wm holds it all the way
>   through
> 
> > 
> > then during the commit phase, we loop over the CRTC's three times
> > instead of just once, but only operate on a subset of the CRTC's in
> > each
> > loop.  While operating on each CRTC, the plane, WM, and DDB all get
> > programmed together and have a single flush for all three.
> > 
> > 
> > 
> > 
> > Mat

[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-07-29 Thread Ville Syrjälä
On Fri, Jul 29, 2016 at 02:48:09PM -0400, Lyude wrote:
> So I've been working on trying to fix this entirely again (e.g. writing
> the ddb properly), since from bug reports it still doesn't sound like
> we've got enough workarounds to make this tolerable. I've shown this to
> matt roper, but I should probably post what I've been trying to do for
> you as well.
> 
> So the approach I came up with is here
> 
> https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2
> 
> My approach differs a little bit from what the bspec recommends, but I
> think it's going to be a bit easier to implement. At the end of all the
> changes I'm attempting it should look like this:
> 
>  * We no longer have a global watermark update for skl
>  * A new hook called "update_ddbs" is added to i915_display_funcs. This
>gets called in intel_atomic_commit_tail() after we've disabled any
>CRTCs that needed disabling, and before we begin enabling/updating
>CRTCs
>  * Because pipe ddb allocations (not the inner-pipe ddb allocations
>that apply to each pipe's planes) only change while
>enabling/disabling crtcs:
> * Pass 1: Find which pipe's new ddb allocation won't overlap with
>   another pipe's previous allocation, and update that pipe first
> * Pass 2: Update the allocation of the remaining pipe
> 
> Here's an illustration of what this looks like. Parts of the ddb not
> being used by any CRTCs are marked out with 'x's:
> 
> With pipe A enabled, we enable pipe B:
> Initial DDB:    |           A           |
> update_ddbs:    |     A     |xxx| Pass 1
> Enable pipe B:  |     A     |     B     |
> 
> With pipes B and A active, we enable pipe C:
> 
> Initial DDB:    |     B     |     A     |
> update_ddbs:    |   B   |xxx|     A     | Pass 1
> update_ddbs:    |   B   |   A   |xxx| Pass 2
> Enable pipe C:  |   B   |   A   |   C   |
> 
> With pipes A, B, and C active, we disable B:
> Initial DDB:    |   A   |   B   |   C   |
> Disable pipe B: |   A   |xxx|   C   |
> update_ddbs:    |     A     |     C     | Pass 1
> Since neither pipe's new allocation overlapped, we skip pass 2
> 
> Another allocation with A, B, and C active, disabling A:
> Initial DDB:    |   A   |   B   |   C   |
> Disable pipe A: |xxx|   B   |   C   |
> update_ddbs:    |     B     |xxx|   C   | Pass 1
> update_ddbs:    |     B     |     C     | Pass 2
> 
> This should ensure we can always move around the allocations of pipes
> without them ever overlapping and exploding.

That's what the current flush thing does, or at least that what it used
to do at least. Not sure it's really doing it anymore, but I'm pretty
sure the order in which it did things was sound at some point.

> 
> This branch doesn't entirely fix underrun issues, but I'm mostly sure
> that's the fault of still not having removed the global wm update hook
> yet (which is leading to additional pipe flushes in places they
> shouldn't be):

Well it should basically boil down to s/update_wm/update_ddb/
Only DDB reallocation really warrants a global hook. Everything else
should be handled via per-crtc hooks, on all platforms.

> 
> https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2
> 
> As for updating inner-pipe ddb allocations for each plane on a pipe, we
> should be able to just do that at the same time we update each pipe's
> watermarks

Yes.

None of that changes what I said before though. Either you need to
lock down everything when the DDB needs to be repartitioned, or you
do what I outlined in the previous mail. Otherwise a plane update
etc. happening in parallel will still blow up on account of the DDB
state changing underneath the plane update somewhere between compute
and commit. I can't really think of third option that would work.

> 
> Let me know what you think
>   Lyude
> 
> On Fri, 2016-07-29 at 12:39 +0300, Ville Syrjälä wrote:
> > On Thu, Jul 28, 2016 at 05:03:52PM -0700, Matt Roper wrote:
> > > 
> > > This is completely untested (and probably horribly broken/buggy),
> > > but
> > > here's a quick mockup of the general approach I was thinking for
> > > ensuring DDB & WM's can be updated together while ensuring the
> > > three-step pipe flushing process is honored:
> > > 
> > >         
> > > https://github.com/mattrope/kernel/commits/experimental/lyu
> > > de_ddb
> > > 
> > > Basically the idea is to take note of what's happening to the
> > > pipe's DDB
> > > allocation (shrinking, growing, unchanged, etc.) during the atomic
> > > check
> > > phase;
> > 
> > Didn't look too closely, but I think you can't actually do that
> > unless
> > you lock all the crtcs whenever the number of active pipes is goind
> > to
> > change. Meaning we'd essentially be back to the one-big-modeset-lock
> > apporach, which will cause missed flips and whanot on the other
> > pipes.
> > 
> > The alternative I think would consist of:
> > - ma

[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-07-29 Thread Lyude
On Fri, 2016-07-29 at 22:26 +0300, Ville Syrjälä wrote:
> On Fri, Jul 29, 2016 at 02:48:09PM -0400, Lyude wrote:
> > 
> > So I've been working on trying to fix this entirely again (e.g.
> > writing
> > the ddb properly), since from bug reports it still doesn't sound
> > like
> > we've got enough workarounds to make this tolerable. I've shown
> > this to
> > matt roper, but I should probably post what I've been trying to do
> > for
> > you as well.
> > 
> > So the approach I came up with is here
> > 
> > https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2
> > 
> > My approach differs a little bit from what the bspec recommends,
> > but I
> > think it's going to be a bit easier to implement. At the end of all
> > the
> > changes I'm attempting it should look like this:
> > 
> >  * We no longer have a global watermark update for skl
> >  * A new hook called "update_ddbs" is added to i915_display_funcs.
> > This
> >    gets called in intel_atomic_commit_tail() after we've disabled
> > any
> >    CRTCs that needed disabling, and before we begin
> > enabling/updating
> >    CRTCs
> >  * Because pipe ddb allocations (not the inner-pipe ddb allocations
> >    that apply to each pipe's planes) only change while
> >    enabling/disabling crtcs:
> >     * Pass 1: Find which pipe's new ddb allocation won't overlap
> > with
> >       another pipe's previous allocation, and update that pipe
> > first
> >     * Pass 2: Update the allocation of the remaining pipe
> > 
> > Here's an illustration of what this looks like. Parts of the ddb
> > not
> > being used by any CRTCs are marked out with 'x's:
> > 
> > With pipe A enabled, we enable pipe B:
> > Initial DDB:    |           A           |
> > update_ddbs:    |     A     |xxx| Pass 1
> > Enable pipe B:  |     A     |     B     |
> > 
> > With pipes B and A active, we enable pipe C:
> > 
> > Initial DDB:    |     B     |     A     |
> > update_ddbs:    |   B   |xxx|     A     | Pass 1
> > update_ddbs:    |   B   |   A   |xxx| Pass 2
> > Enable pipe C:  |   B   |   A   |   C   |
> > 
> > With pipes A, B, and C active, we disable B:
> > Initial DDB:    |   A   |   B   |   C   |
> > Disable pipe B: |   A   |xxx|   C   |
> > update_ddbs:    |     A     |     C     | Pass 1
> > Since neither pipe's new allocation overlapped, we skip pass 2
> > 
> > Another allocation with A, B, and C active, disabling A:
> > Initial DDB:    |   A   |   B   |   C   |
> > Disable pipe A: |xxx|   B   |   C   |
> > update_ddbs:    |     B     |xxx|   C   | Pass 1
> > update_ddbs:    |     B     |     C     | Pass 2
> > 
> > This should ensure we can always move around the allocations of
> > pipes
> > without them ever overlapping and exploding.
> 
> That's what the current flush thing does, or at least that what it
> used
> to do at least. Not sure it's really doing it anymore, but I'm pretty
> sure the order in which it did things was sound at some point.
> 
> > 
> > 
> > This branch doesn't entirely fix underrun issues, but I'm mostly
> > sure
> > that's the fault of still not having removed the global wm update
> > hook
> > yet (which is leading to additional pipe flushes in places they
> > shouldn't be):
> 
> Well it should basically boil down to s/update_wm/update_ddb/
> Only DDB reallocation really warrants a global hook. Everything else
> should be handled via per-crtc hooks, on all platforms.
> 
> > 
> > 
> > https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2
> > 
> > As for updating inner-pipe ddb allocations for each plane on a
> > pipe, we
> > should be able to just do that at the same time we update each
> > pipe's
> > watermarks
> 
> Yes.
> 
> None of that changes what I said before though. Either you need to
> lock down everything when the DDB needs to be repartitioned, or you
> do what I outlined in the previous mail. Otherwise a plane update
> etc. happening in parallel will still blow up on account of the DDB
> state changing underneath the plane update somewhere between compute
> and commit. I can't really think of third option that would work.
> 
Bleh! I didn't even realize plane updates could happen in parallel like
that. Suddenly your proposal makes a lot more sense...

Anyway, your method definitely sounds like the right one. Unless Matt
thinks there's something that could go wrong there, I'm going to start
working that into the driver and repost the patchset once I've added
that into the driver.

Cheers,
Lyude 
> > 
> > 
> > Let me know what you think
> > Lyude
> > 
> > On Fri, 2016-07-29 at 12:39 +0300, Ville Syrjälä wrote:
> > > 
> > > On Thu, Jul 28, 2016 at 05:03:52PM -0700, Matt Roper wrote:
> > > > 
> > > > 
> > > > This is completely untested (and probably horribly
> > > > broken/buggy),
> > > > but
> > > > here's a quick mockup of the general approach I was thinking
> > > > for
> > 

[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-07-29 Thread Matt Roper
On Fri, Jul 29, 2016 at 12:39:05PM +0300, Ville Syrjälä wrote:
> On Thu, Jul 28, 2016 at 05:03:52PM -0700, Matt Roper wrote:
> > This is completely untested (and probably horribly broken/buggy), but
> > here's a quick mockup of the general approach I was thinking for
> > ensuring DDB & WM's can be updated together while ensuring the
> > three-step pipe flushing process is honored:
> > 
> > https://github.com/mattrope/kernel/commits/experimental/lyude_ddb
> > 
> > Basically the idea is to take note of what's happening to the pipe's DDB
> > allocation (shrinking, growing, unchanged, etc.) during the atomic check
> > phase;
> 
> Didn't look too closely, but I think you can't actually do that unless
> you lock all the crtcs whenever the number of active pipes is goind to
> change. Meaning we'd essentially be back to the one-big-modeset-lock
> apporach, which will cause missed flips and whanot on the other pipes.
> 
> The alternative I think would consist of:
> - make sure level 0 watermark never exceeds total_ddb_size/max_pipes,
>   so that a modeset doesn't have to care about the wms for the other
>   pipes not fitting in

Unfortunately this part is the problem.  You might get away with doing
this on SKL/KBL which only have three planes max per pipe and a large
(896 block) DDB, but on BXT you have up to four planes (we don't
actually enable the topmost plane in a full-featured manner just yet,
but need to soon), yet the total DDB is only 512 blocks.  Sadly, the
platform with more planes was given a smaller DDB...   :-(

We're already in trouble because users that are running setups like 3x
4K with most/all planes in use at large sizes can't find level 0
watermarks that satisfy their needs and we have to reject the whole
configuration.  If we further limit each pipe's usage to total/maxpipes
(i.e., 170 blocks per pipe on BXT), then we're going to hit similar
issues when only driving one or two displays with with all of the planes
in use, even though we should have had more DDB space to work with.

I guess serious plane usage isn't too common in desktop setups today,
but it's a very critical feature in the embedded world; we can't really
afford to cripple plane usage further.  Unfortunately, as you point out
above, this means that we have to follow the bspec's DDB allocation
method, which in turn means we need to grab _all_ CRTC locks any time
_any_ CRTC is being turned on or turned off which is a BKL-style way of
doing things.


Matt

> - level 1+ watermarks would be checked against total_ddb_size
> - protect the plane/pipe commit with the wm mutex whenever the wms
>   need to be reprogrammed
> - keep the flush_wm thing around for the case when ddb size does get
>   changed, protect it with the wm lock
> - when programming wms, we will first filter out any level that
>   doesn't fit in with the current ddb size, and then program the
>   rest in
> - potentially introduce per-pipe wm locks if the one big lock looks
>   like an issue, which it might if the flush_wm holds it all the way
>   through
> 
> > then during the commit phase, we loop over the CRTC's three times
> > instead of just once, but only operate on a subset of the CRTC's in each
> > loop.  While operating on each CRTC, the plane, WM, and DDB all get
> > programmed together and have a single flush for all three.
> >
> > 
> > 
> > 
> > Matt
> > 
> > On Tue, Jul 26, 2016 at 01:34:36PM -0400, Lyude wrote:
> > > Latest version of https://lkml.org/lkml/2016/7/26/290 . Resending the 
> > > whole
> > > thing to keep it in one place.
> > > 
> > > Lyude (5):
> > >   drm/i915/skl: Add support for the SAGV, fix underrun hangs
> > >   drm/i915/skl: Only flush pipes when we change the ddb allocation
> > >   drm/i915/skl: Fix extra whitespace in skl_flush_wm_values()
> > >   drm/i915/skl: Update plane watermarks atomically during plane updates
> > >   drm/i915/skl: Always wait for pipes to update after a flush
> > > 
> > > Matt Roper (1):
> > >   drm/i915/gen9: Only copy WM results for changed pipes to skl_hw
> > > 
> > >  drivers/gpu/drm/i915/i915_drv.h  |   3 +
> > >  drivers/gpu/drm/i915/i915_reg.h  |   5 +
> > >  drivers/gpu/drm/i915/intel_display.c |  24 
> > >  drivers/gpu/drm/i915/intel_drv.h |   4 +
> > >  drivers/gpu/drm/i915/intel_pm.c  | 240 
> > > +++
> > >  drivers/gpu/drm/i915/intel_sprite.c  |   2 +
> > >  6 files changed, 255 insertions(+), 23 deletions(-)
> > > 
> > > -- 
> > > 2.7.4
> > > 
> > > ___
> > > Intel-gfx mailing list
> > > Intel-gfx at lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Matt Roper
> > Graphics Software Engineer
> > IoTG Platform Enabling & Development
> > Intel Corporation
> > (916) 356-2795
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795


[Intel-gfx] [PATCH v4 0/6] Finally fix watermarks

2016-07-29 Thread Matt Roper
On Fri, Jul 29, 2016 at 10:26:20PM +0300, Ville Syrjälä wrote:
> On Fri, Jul 29, 2016 at 02:48:09PM -0400, Lyude wrote:
> > So I've been working on trying to fix this entirely again (e.g. writing
> > the ddb properly), since from bug reports it still doesn't sound like
> > we've got enough workarounds to make this tolerable. I've shown this to
> > matt roper, but I should probably post what I've been trying to do for
> > you as well.
> > 
> > So the approach I came up with is here
> > 
> > https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2
> > 
> > My approach differs a little bit from what the bspec recommends, but I
> > think it's going to be a bit easier to implement. At the end of all the
> > changes I'm attempting it should look like this:
> > 
> >  * We no longer have a global watermark update for skl
> >  * A new hook called "update_ddbs" is added to i915_display_funcs. This
> >gets called in intel_atomic_commit_tail() after we've disabled any
> >CRTCs that needed disabling, and before we begin enabling/updating
> >CRTCs
> >  * Because pipe ddb allocations (not the inner-pipe ddb allocations
> >that apply to each pipe's planes) only change while
> >enabling/disabling crtcs:
> > * Pass 1: Find which pipe's new ddb allocation won't overlap with
> >   another pipe's previous allocation, and update that pipe first
> > * Pass 2: Update the allocation of the remaining pipe
> > 
> > Here's an illustration of what this looks like. Parts of the ddb not
> > being used by any CRTCs are marked out with 'x's:
> > 
> > With pipe A enabled, we enable pipe B:
> > Initial DDB:    |           A           |
> > update_ddbs:    |     A     |xxx| Pass 1
> > Enable pipe B:  |     A     |     B     |
> > 
> > With pipes B and A active, we enable pipe C:
> > 
> > Initial DDB:    |     B     |     A     |
> > update_ddbs:    |   B   |xxx|     A     | Pass 1
> > update_ddbs:    |   B   |   A   |xxx| Pass 2
> > Enable pipe C:  |   B   |   A   |   C   |
> > 
> > With pipes A, B, and C active, we disable B:
> > Initial DDB:    |   A   |   B   |   C   |
> > Disable pipe B: |   A   |xxx|   C   |
> > update_ddbs:    |     A     |     C     | Pass 1
> > Since neither pipe's new allocation overlapped, we skip pass 2
> > 
> > Another allocation with A, B, and C active, disabling A:
> > Initial DDB:    |   A   |   B   |   C   |
> > Disable pipe A: |xxx|   B   |   C   |
> > update_ddbs:    |     B     |xxx|   C   | Pass 1
> > update_ddbs:    |     B     |     C     | Pass 2
> > 
> > This should ensure we can always move around the allocations of pipes
> > without them ever overlapping and exploding.
> 
> That's what the current flush thing does, or at least that what it used
> to do at least. Not sure it's really doing it anymore, but I'm pretty
> sure the order in which it did things was sound at some point.
> 
> > 
> > This branch doesn't entirely fix underrun issues, but I'm mostly sure
> > that's the fault of still not having removed the global wm update hook
> > yet (which is leading to additional pipe flushes in places they
> > shouldn't be):
> 
> Well it should basically boil down to s/update_wm/update_ddb/
> Only DDB reallocation really warrants a global hook. Everything else
> should be handled via per-crtc hooks, on all platforms.

I don't think we want even update_ddb.  We want *calculate_ddb* to be a
global hook during the atomic check phase (and we do have that today),
but when we get around to the commit phase and start writing stuff to
hardware, we want to program the per-pipe DDB entries at the same time
we program the WM's and regular plane registers (since they should be
double buffered and flushed out the same way).  We just need to make
sure that the order we process pipes in follows the special flushing
rules noted above, which is what my pseudo-patches were trying to
describe.

> 
> > 
> > https://github.com/lyude/linux/tree/wip/skl-fix-wms-v5r2
> > 
> > As for updating inner-pipe ddb allocations for each plane on a pipe, we
> > should be able to just do that at the same time we update each pipe's
> > watermarks
> 
> Yes.
> 
> None of that changes what I said before though. Either you need to
> lock down everything when the DDB needs to be repartitioned, or you
> do what I outlined in the previous mail. Otherwise a plane update
> etc. happening in parallel will still blow up on account of the DDB
> state changing underneath the plane update somewhere between compute
> and commit. I can't really think of third option that would work.

Yep, I agree with all of this as well (and we do lock everything down
today for exactly this reason).  It's unfortunate that we need a
BKL-style mechanism for the DDB, but as I noted in the other message,
using fixed pre-partitioning for level 0 breaks too many additional use
cases.  :-(


Matt

> 
> > 
> > Let me kno