Re: [Intel-gfx] [PATCH 0/3] LPSS PWM support for devices that support it

2016-01-12 Thread C. B.
On 12 January 2016 at 08:59, Shobhit Kumar  wrote:

> Not sending yet to pwm mailing list as this is all untested. C.B. please
> test the patches and see if they work at all for you. For testing Please
> enable -
>
> CONFIG_PWM_LPSS=y
> CONFIG_PWM_LPSS_PLATFORM=y

I applied these patches to 4.4-rc8. So far no progress:

[   14.251512] [drm:pwm_setup_backlight [i915]] *ERROR* Failed to own
the pwm chip: pm_backlight

I dug in a little and found that dev_priv->vbt.dsi.config->pwm_blc ==
0,  I wonder if some of these Dell Venue's have different hardware?
This is the model I have:

[   16.370481] Hardware name: Dell Inc. Venue 8 Pro 5830/09RP78, BIOS
A13 11/27/2015

Let me know what I can provide to help

CB
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 00/38] GPU scheduler for i915 driver

2016-01-12 Thread Tian, Kevin
> From: Gordon, David S
> Sent: Tuesday, January 12, 2016 9:49 PM
> 
> On 12/01/2016 11:43, John Harrison wrote:
> > On 12/01/2016 04:37, Tian, Kevin wrote:
> >>> From: john.c.harri...@intel.com
> >>> Sent: Tuesday, January 12, 2016 2:42 AM
> >>>
> >>> From: John Harrison 
> >>>
> >>> Implemented a batch buffer submission scheduler for the i915 DRM
> >>> driver.
> >>>
> >>> The general theory of operation is that when batch buffers are
> >>> submitted to the driver, the execbuffer() code assigns a unique seqno
> >>> value and then packages up all the information required to execute the
> >>> batch buffer at a later time. This package is given over to the
> >>> scheduler which adds it to an internal node list. The scheduler also
> >>> scans the list of objects associated with the batch buffer and
> >>> compares them against the objects already in use by other buffers in
> >>> the node list. If matches are found then the new batch buffer node is
> >>> marked as being dependent upon the matching node. The same is done for
> >>> the context object. The scheduler also bumps up the priority of such
> >>> matching nodes on the grounds that the more dependencies a given batch
> >>> buffer has the more important it is likely to be.
> >>>
> >> A curious question. Is this new GPU scheduler still useful when GuC
> >> is enabled? Sorry if this Q. has been answered before.
> > Yes. The scheduler works with any back end submission mechanism -
> > legacy ring buffer, execlist or Guc. Indeed, the pre-emption support
> > (next patch series in the set) currently requires the GuC. Execlist
> > support is possible but just not currently planned due to time
> > constraints. Legacy ring buffer pre-emption is very different and a
> > lot more work for very little benefit - pre-execlist hardware does not
> > support very much in the way of pre-emption facilities.
> We have previously implemented preemption on gen7 ringbuffer, but just
> as a proof of concept and we're not going to push that upstream.
> Ringbuffer mode can in any case only support co-operative preemption,
> whereas execlist and GuC modes don't require (much) cooperation from
> preemptible tasks.
> >
> >
> > The GuC itself does not really do much in the way of scheduling. It
> > does know about the
> In the line above, John meant "does NOT know"!
> 
> .Dave.
> > dependencies between batch buffers, for example, so cannot re-order
> > work according to priority. Adding such support without still having
> > large chunks of kernel driver support is a currently unscoped and
> > unplanning task.
> >

Thank you both for clarification. It makes sense.

Thanks
Kevin
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ warning: Fi.CI.BAT

2016-01-12 Thread Patchwork
== Summary ==

Built on 06d0112e293dfdea7f796d4085f755898850947b drm-intel-nightly: 
2016y-01m-12d-21h-16m-40s UTC integration manifest

Test gem_storedw_loop:
Subgroup basic-render:
pass   -> DMESG-WARN (bdw-nuci7)
dmesg-warn -> PASS   (bdw-ultra)
Test kms_flip:
Subgroup basic-flip-vs-dpms:
dmesg-warn -> PASS   (skl-i7k-2)
dmesg-warn -> PASS   (ilk-hp8440p)
Test kms_pipe_crc_basic:
Subgroup read-crc-pipe-a-frame-sequence:
fail   -> PASS   (snb-x220t)

bdw-nuci7total:138  pass:128  dwarn:1   dfail:0   fail:0   skip:9  
bdw-ultratotal:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc  total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox  total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2  total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
hsw-xps12total:138  pass:133  dwarn:1   dfail:0   fail:0   skip:4  
ilk-hp8440p  total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430stotal:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
skl-i7k-2total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps  total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220ttotal:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1155/

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 0/3] LPSS PWM support for devices that support it

2016-01-12 Thread Kumar, Shobhit

On 01/13/2016 11:00 AM, C. B. wrote:

On 12 January 2016 at 08:59, Shobhit Kumar  wrote:


Not sending yet to pwm mailing list as this is all untested. C.B. please
test the patches and see if they work at all for you. For testing Please
enable -

CONFIG_PWM_LPSS=y
CONFIG_PWM_LPSS_PLATFORM=y


I applied these patches to 4.4-rc8. So far no progress:

[   14.251512] [drm:pwm_setup_backlight [i915]] *ERROR* Failed to own
the pwm chip: pm_backlight


Are you sure this is pm_backlight, it should have printed pwm_backlight 
as that is the name for the PMIC bases PWM. I checked the code and I 
have not made a typo there.




I dug in a little and found that dev_priv->vbt.dsi.config->pwm_blc ==
0,  I wonder if some of these Dell Venue's have different hardware?
This is the model I have:



This means that this device is using PMIC base PWM and not LPSS. I have 
heard from Brain, that my CRC PWM driver is also broken in latest kernel 
which should be probably used in this device. I am looking into that 
separately.


Regards
Shobhit


[   16.370481] Hardware name: Dell Inc. Venue 8 Pro 5830/09RP78, BIOS
A13 11/27/2015

Let me know what I can provide to help

CB
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ failure: Fi.CI.BAT

2016-01-12 Thread Patchwork
== Summary ==

HEAD is now at a907968 drm-intel-nightly: 2016y-01m-11d-17h-22m-54s UTC 
integration manifest
Applying: drm: kerneldoc for drm_fops.c
Repository lacks necessary blobs to fall back on 3-way merge.
Cannot fall back to three-way merge.
Patch failed at 0001 drm: kerneldoc for drm_fops.c

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✓ success: Fi.CI.BAT

2016-01-12 Thread Patchwork
== Summary ==

Built on a90796840c30dac6d9907439bf98d1d08046c49d drm-intel-nightly: 
2016y-01m-11d-17h-22m-54s UTC integration manifest

Test gem_storedw_loop:
Subgroup basic-render:
pass   -> DMESG-WARN (skl-i5k-2) UNSTABLE
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
dmesg-warn -> PASS   (ilk-hp8440p)

bdw-nuci7total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultratotal:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc  total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox  total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2  total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p  total:141  pass:102  dwarn:2   dfail:0   fail:0   skip:37 
ivb-t430stotal:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps  total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220ttotal:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1138/

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 07/10] drm/i915: Support for pread/pwrite from/to non shmem backed objects

2016-01-12 Thread Ankitprasad Sharma
On Mon, 2016-01-11 at 21:29 +, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 05:15:54PM +, Tvrtko Ursulin wrote:
> > > Is that not what was written? I take it my telepathy isn't working
> > > again.
> > 
> > Sorry not a new loop, new case in a old loop. This is the hunk I think
> > is not helping readability:
> > 
> > @@ -869,11 +967,29 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private 
> > *i915,
> > /* If we get a fault while copying data, then (presumably) our
> >  * source page isn't available.  Return the error and we'll
> >  * retry in the slow path.
> > +* If the object is non-shmem backed, we retry again with the
> > +* path that handles page fault.
> >  */
> > -   if (fast_user_write(i915->gtt.mappable, page_base,
> > -   page_offset, user_data, page_length)) {
> > -   ret = -EFAULT;
> > -   goto out_flush;
> > +   if (faulted || fast_user_write(i915->gtt.mappable,
> > +   page_base, page_offset,
> > +   user_data, page_length)) {
> > +   if (!obj->base.filp) {
> 
> This is just wrong, we neither need the faulted nor the difference in
> behaviour based on storage.
> -Chris
> 
Yes, this will not be required, I see. As we are planning to provide a
partial fallback for all objs (shmem backed as well as non-shmem
backed). 
So to conclude, a partial fallback (slow_user_access()) for all objs if
fast_user_write() fails.
And a full fallback (shmem_pwrite()) for only shmem backed objects if
slow_user_access() fails as well.
...

hit_by_slowpath = 1;
mutex_unlock(>struct_mutex);
if (slow_user_access(i915->gtt.mappable,
 page_base,
 page_offset, user_data,
 page_length, true)) {
ret = -EFAULT;
mutex_lock(>struct_mutex);
goto out_flush;
}

mutex_lock(>struct_mutex);

...

Thanks,
-Ankit

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 5/5] drm: Enable markdown^Wasciidoc for gpu.tmpl

2016-01-12 Thread Graham Whaley
On Tue, 2016-01-12 at 09:34 +0100, Daniel Vetter wrote:
> On Mon, Jan 11, 2016 at 06:12:12PM -0700, Jonathan Corbet wrote:
> > On Sat, 12 Dec 2015 12:13:45 +0100
> > Daniel Vetter  wrote:
> > 
> > > I just figured there's no way this could get it, and I'd
> > > much rather improve the docs themselves than trying to convince
> > > core
> > > kernel folks that this might be useful.
> > 
> > So I'm not quite sure why you figured that; I never said it,
> > certainly.
> 
> To clarify this wasn't really my impression of your stance, but of
> the
> overall room opinion when we had the discussion at KS. And then my
> main
> goal here is to write great docs for drm (we have about 3k lines more
> docs
> in 4.5 already), so that's why I dropped the ball on upstreaming. It
> seemed unlikely to succeed, at least without some really seriuos
> effort at
> convincing everyone, all while the drm docs for atomic haven't been
> in
> good shape yet. Since then we had a few contributors of new atomic
> drivers
> note on irc already that "oh cool, this is documented now". Overall
> really
> just boils down to what I see as the most important things for drm ;
> -)
> 
> > I've been messing with it a bit, seems to work.  I do still wish we
> > could
> > consider alternatives, especially those that might simplify the
> > toolchain
> > rather than complicating it.  But it's clear that I'm not
> > succeeding in
> > finding time to actually explore that idea; the contents of
> > $EXCUSES are
> > good, but the end result is the same.  And the patch fairy just
> > isn't
> > coming through for me on this one.
> > 
> > In my mind, there's clearly no good that can come from (further)
> > delaying
> > something that works in favor of an "it would be nice" that may
> > never
> > even exist.  So I'm currently thinking that I'll pull this into the
> > docs
> > tree once the merge window is done, with the plan to push it for
> > 4.6.
> > Then we can see if anybody screams.
> > 
> > That gives a couple of weeks for an updated patch set, should you
> > have
> > one.
> > 
> > The build-time increase is painful in the extreme - about a factor
> > of
> > three for a -j1 build, and that's with only one file using the
> > feature.
> > It feels wrong, somehow, for the docs build to take longer than
> > building
> > the kernel itself.  Can we do something about that?
> > 
> >  - How many of the comments actually use asciidoc features?  Might
> > there
> >be some possibility of detecting those in kernel-doc and
> > skipping the
> >callout to asciidoc when it's not needed?
> 
> I think that amounts to writing a partial parser (we use asciidoc for
> tables, lists, links, formatting, code snippets by now already,
> someone
> even thought of using the asciiart->png feature it has but it's not
> yet
> wired up). I don't think it's feasible.
> 
> >  - Pandoc seems to do asciidoc.  I still don't like the idea of
> > depending
> >on it for this to work, but having the *option* to use it is
> > fine.  If
> >it's really that much faster (yes, Python startup is painful)
> > then
> >maybe providing the option is worth it.
> 
> Hm, Dave asked me to convert to use python-based asciidoc insted of
> haskell-based pandoc.
> 
> >  - All over the kernel we've seen that batching improves
> > performance.  It
> >would take a bit of work, but I bet kernel-doc could put
> > together all
> >the snippets from one file, pass them through a single asciidoc
> >invocation, then split the results back apart.  That would
> > probably
> >eliminate the performance hit entirely.
> > 
> > None of that is a condition for pulling this stuff in, but can it
> > be
> > looked into?
> 
> Besides what Jani mention that asciidoctor should be a drop-in
> replacement
> if installed it also seems possible to parallelize the call-out to
> kernel-doc from docproc.c without too much effort. I hoped Jani would
> get
> around to implement the asciidoctor support, and I'm hoping I can
> snipe
> away some free sometimes the next few months to look at docproc.c
> more
> seriously. This would kinda be a cool intern project, but atm we
> throw
> them all at improving testing infrastructure ...
> 
> Anyway I'm of course still open to get this upstream, and I think a
> few
> things should be polished (like the speed-up). But right now
> bandwidth on
> my side isn't too plentiful. Maybe we should aim to have a few better
> ideas (perhaps even for all of the docs stuff) for next KS and respin
> that
> discussion?

I was just about to reply to the thread looking at the
linux.conf.au schedule it would seem that you are both attending and
presenting, and there appears to be some sort of Documentation mini
-summit on the Monday as well (not sure if that is the place for a
discussion though). I will be at LCA for the Wed-Fri. You may not have
to wait until the next KS?

 Graham
> 
> Thanks, Daniel
___
Intel-gfx 

Re: [Intel-gfx] [PATCH 07/22] drm/armada: Remove NULL open/pre/postclose hooks

2016-01-12 Thread Daniel Vetter
On Tue, Jan 12, 2016 at 11:51:58AM +, Russell King - ARM Linux wrote:
> On Mon, Jan 11, 2016 at 10:41:01PM +0100, Daniel Vetter wrote:
> > The compiler will do this, but the void hits when grepping all the
> > hooks for a subsystem wide audit are slightly annoying. So remove them
> > for next time around.
> 
> I'll try to remember to queue this after -rc1, though a reminder
> after -rc1 would be useful.

I've planed to take the entire series in through drm-misc after -rc1,
since at least conceptually it's all the same topic. Would that be ok with
you too?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/skl: Use proper plane dimensions for DDB and WM calculations

2016-01-12 Thread Ville Syrjälä
On Mon, Jan 11, 2016 at 02:13:06PM -0800, Matt Roper wrote:
> On Mon, Jan 11, 2016 at 09:31:03PM +0200, Ville Syrjälä wrote:
> > On Mon, Dec 21, 2015 at 07:31:17AM -0800, Matt Roper wrote:
> > > In commit
> > > 
> > > commit 024c9045221fe45482863c47c4b4c47d37f97cbf
> > > Author: Matt Roper 
> > > Date:   Thu Sep 24 15:53:11 2015 -0700
> > > 
> > > drm/i915/skl: Eliminate usage of pipe_wm_parameters from 
> > > SKL-style WM (v4)
> > > 
> > > I fumbled while converting the dimensions stored in the plane_parameters
> > > structure to the values stored in plane state and accidentally replaced
> > > the plane dimensions with the pipe dimensions in both the DDB allocation
> > > function and the WM calculation function.  On the DDB side this is
> > > harmless since we effectively treat all of our non-cursor planes as
> > > full-screen which may not be optimal, but generally won't cause any
> > > problems either (and in 99% of the cases where there's no sprite plane
> > > usage or primary plane windowing, there's no effect at all).  On the WM
> > > calculation side there's more potential for this fumble to cause actual
> > > problems since cursors also get miscalculated.
> > > 
> > > Cc: Ville Syrjälä 
> > > Cc: "Kondapally, Kalyan" 
> > > Cc: Radhakrishna Sripada 
> > > Signed-off-by: Matt Roper 
> > > ---
> > >  drivers/gpu/drm/i915/intel_pm.c | 24 +---
> > >  1 file changed, 13 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_pm.c 
> > > b/drivers/gpu/drm/i915/intel_pm.c
> > > index 8d0d6f5..f4d4cc7 100644
> > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > @@ -2845,25 +2845,22 @@ skl_plane_relative_data_rate(const struct 
> > > intel_crtc_state *cstate,
> > >const struct drm_plane_state *pstate,
> > >int y)
> > >  {
> > > - struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
> > > + struct intel_plane_state *intel_pstate = to_intel_plane_state(pstate);
> > >   struct drm_framebuffer *fb = pstate->fb;
> > > + unsigned w = drm_rect_width(_pstate->dst);
> > > + unsigned h = drm_rect_height(_pstate->dst);
> > 
> > I think you're supposed to use the src dimensions in most places.
> 
> Hmm, just went back to double check the bpsec and if I'm interpreting it
> correctly, it looks like we actually need to use the larger of the two:
> "Down scaling effectively increases the pixel rate. Up scaling does not
> reduce the pixel rate."

The pixel rate is "downscale factor * pixel clock"

> 
> Thanks for pointing that out; I'll send an updated patch.
> 
> 
> 
> Matt
> 
> > 
> > >  
> > >   /* for planar format */
> > >   if (fb->pixel_format == DRM_FORMAT_NV12) {
> > >   if (y)  /* y-plane data rate */
> > > - return intel_crtc->config->pipe_src_w *
> > > - intel_crtc->config->pipe_src_h *
> > > - drm_format_plane_cpp(fb->pixel_format, 0);
> > > + return w * h * drm_format_plane_cpp(fb->pixel_format, 
> > > 0);
> > >   else/* uv-plane data rate */
> > > - return (intel_crtc->config->pipe_src_w/2) *
> > > - (intel_crtc->config->pipe_src_h/2) *
> > > + return (w/2) * (h/2) *
> > >   drm_format_plane_cpp(fb->pixel_format, 1);
> > >   }
> > >  
> > >   /* for packed formats */
> > > - return intel_crtc->config->pipe_src_w *
> > > - intel_crtc->config->pipe_src_h *
> > > - drm_format_plane_cpp(fb->pixel_format, 0);
> > > + return w * h * drm_format_plane_cpp(fb->pixel_format, 0);
> > >  }
> > >  
> > >  /*
> > > @@ -2960,6 +2957,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state 
> > > *cstate,
> > >* FIXME: we may not allocate every single block here.
> > >*/
> > >   total_data_rate = skl_get_total_relative_data_rate(cstate);
> > > + if (!total_data_rate)
> > > + return;
> > >  
> > >   start = alloc->start;
> > >   for_each_intel_plane_on_crtc(dev, intel_crtc, intel_plane) {
> > > @@ -3093,12 +3092,15 @@ static bool skl_compute_plane_wm(const struct 
> > > drm_i915_private *dev_priv,
> > >  {
> > >   struct drm_plane *plane = _plane->base;
> > >   struct drm_framebuffer *fb = plane->state->fb;
> > > + struct intel_plane_state *intel_pstate =
> > > + to_intel_plane_state(plane->state);
> > >   uint32_t latency = dev_priv->wm.skl_latency[level];
> > >   uint32_t method1, method2;
> > >   uint32_t plane_bytes_per_line, plane_blocks_per_line;
> > >   uint32_t res_blocks, res_lines;
> > >   uint32_t selected_result;
> > >   uint8_t bytes_per_pixel;
> > > + unsigned w = drm_rect_width(_pstate->dst);
> > >  
> > >   if (latency == 0 || !cstate->base.active || !fb)
> > >   return 

Re: [Intel-gfx] [PATCH v4 38/38] drm/i915: Allow scheduler to manage inter-ring object synchronisation

2016-01-12 Thread John Harrison

On 11/01/2016 22:07, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 06:43:07PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

The scheduler has always tracked batch buffer dependencies based on
DRM object usage. This means that it will not submit a batch on one
ring that has outstanding dependencies still executing on other rings.
This is exactly the same synchronisation performed by
i915_gem_object_sync() using hardware semaphores where available and
CPU stalls where not (e.g. in execlist mode and/or on Gen8 hardware).

Unfortunately, when a batch buffer is submitted to the driver the
_object_sync() call happens first. Thus in case where hardware
semaphores are disabled, the driver has already stalled until the
dependency has been resolved.

But this should just add the dependency to the request in the scheduler
callback for i915_gem_object_sync_to, or better renamed as
i915_gem_request_submit_after. Without a scheduler we can do the
optimisation of doing that work inline, with a scheduler we can just
track the dependency.
That's the whole point. The scheduler is already tracking the 
dependencies between batch buffers and will ensure that everything 
happens in the correct order. The problem is that the object sync code 
is manually forcing that order before the batch buffers even get to the 
scheduler, either through hardware semaphores (which have power and 
performance penalties) or CPU stalling (which is just a performance 
issue). Hence this patch is saying that if the dependency between the 
objects is something the scheduler knows about, i.e. it is batch buffer 
based, then don't bother doing the synchronisation up front. Just assume 
the scheduler will do it internally later.




-Chris



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 2/7] drm/i915: Do not call API requiring struct_mutex where it is not available

2016-01-12 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

LRC code was calling GEM API like i915_gem_obj_ggtt_offset from
places where the struct_mutex cannot be grabbed (irq handlers).

To avoid that this patch caches some interesting bits and values
in the engine and context structures.

Some usages are also removed where they are not needed like a
few asserts which are either impossible or have been checked
already during engine initialization.

Side benefit is also that interrupt handlers and command
submission stop evaluating invariant conditionals, like what
Gen we are running on, on every interrupt and every command
submitted.

This patch deals with logical ring context id and descriptors
while subsequent patches will deal with the remaining issues.

v2:
 * Cache the VMA instead of the address. (Chris Wilson)
 * Incorporate Dave Gordon's good comments and function name.

Signed-off-by: Tvrtko Ursulin 
Cc: Daniel Vetter 
Cc: Dave Gordon 
---
 drivers/gpu/drm/i915/i915_debugfs.c |  15 ++--
 drivers/gpu/drm/i915/i915_drv.h |   2 +
 drivers/gpu/drm/i915/i915_gem_gtt.h |   1 -
 drivers/gpu/drm/i915/intel_lrc.c| 126 +++-
 drivers/gpu/drm/i915/intel_lrc.h|   4 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |   2 +
 6 files changed, 90 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index e3377abc0d4d..0b3550f05026 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1994,12 +1994,13 @@ static int i915_context_status(struct seq_file *m, void 
*unused)
 }
 
 static void i915_dump_lrc_obj(struct seq_file *m,
- struct intel_engine_cs *ring,
- struct drm_i915_gem_object *ctx_obj)
+ struct intel_context *ctx,
+ struct intel_engine_cs *ring)
 {
struct page *page;
uint32_t *reg_state;
int j;
+   struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
unsigned long ggtt_offset = 0;
 
if (ctx_obj == NULL) {
@@ -2009,7 +2010,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
}
 
seq_printf(m, "CONTEXT: %s %u\n", ring->name,
-  intel_execlists_ctx_id(ctx_obj));
+  intel_execlists_ctx_id(ctx, ring));
 
if (!i915_gem_obj_ggtt_bound(ctx_obj))
seq_puts(m, "\tNot bound in GGTT\n");
@@ -2058,8 +2059,7 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
list_for_each_entry(ctx, _priv->context_list, link) {
for_each_ring(ring, dev_priv, i) {
if (ring->default_context != ctx)
-   i915_dump_lrc_obj(m, ring,
- ctx->engine[i].state);
+   i915_dump_lrc_obj(m, ctx, ring);
}
}
 
@@ -2133,11 +2133,8 @@ static int i915_execlists(struct seq_file *m, void *data)
 
seq_printf(m, "\t%d requests in queue\n", count);
if (head_req) {
-   struct drm_i915_gem_object *ctx_obj;
-
-   ctx_obj = head_req->ctx->engine[ring_id].state;
seq_printf(m, "\tHead request id: %u\n",
-  intel_execlists_ctx_id(ctx_obj));
+  intel_execlists_ctx_id(head_req->ctx, ring));
seq_printf(m, "\tHead request tail: %u\n",
   head_req->tail);
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 747d2d84a18c..79bb3671a15e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -883,6 +883,8 @@ struct intel_context {
struct drm_i915_gem_object *state;
struct intel_ringbuffer *ringbuf;
int pin_count;
+   struct i915_vma *lrc_vma;
+   u64 lrc_desc;
} engine[I915_NUM_RINGS];
 
struct list_head link;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b448ad832dcf..e5737963ab79 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -44,7 +44,6 @@ typedef uint64_t gen8_ppgtt_pml4e_t;
 
 #define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
 
-
 /* gen6-hsw has bit 11-4 for physical addr bit 39-32 */
 #define GEN6_GTT_ADDR_ENCODE(addr) ((addr) | (((addr) >> 28) & 0xff0))
 #define GEN6_PTE_ADDR_ENCODE(addr) GEN6_GTT_ADDR_ENCODE(addr)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ab344e0b878c..504030ce7f93 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -265,7 +265,8 @@ 

Re: [Intel-gfx] [PATCH v3 3/7] drm/i915: Cache ringbuffer GTT VMA

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 11:42:39AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
> interrupt context without the big lock held.
> 
> v2: Renamed gtt_start to gtt_offset. (Daniel Vetter)
> v3: Cache the VMA instead of address. (Chris Wilson)
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Daniel Vetter 
> Cc: Chris Wilson 

I feel like I have done this before...

Reviewed-by: Chris Wilson 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 3/7] drm/i915: Add per context timelines to fence object

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 11:03:08AM +, John Harrison wrote:
> On 11/01/2016 22:58, Chris Wilson wrote:
> >On Mon, Jan 11, 2016 at 02:47:33PM -0800, Jesse Barnes wrote:
> >>On 01/11/2016 11:03 AM, John Harrison wrote:
> >>>On 08/01/2016 22:05, Chris Wilson wrote:
> On Fri, Jan 08, 2016 at 06:47:24PM +, john.c.harri...@intel.com wrote:
> >From: John Harrison 
> >
> >The fence object used inside the request structure requires a sequence
> >number. Although this is not used by the i915 driver itself, it could
> >potentially be used by non-i915 code if the fence is passed outside of
> >the driver. This is the intention as it allows external kernel drivers
> >and user applications to wait on batch buffer completion
> >asynchronously via the dma-buff fence API.
> That doesn't make any sense as they are not limited by a single
> timeline.
> >>>I don't understand what you mean. Who is not limited by a single timeline? 
> >>> The point is that the current seqno values cannot be used as there is no 
> >>>guarantee that they will increment globally once things like a scheduler 
> >>>and pre-emption arrive. Whereas, the fence internal implementation makes 
> >>>various assumptions about the linearity of the timeline. External users do 
> >>>not want to care about timelines or seqnos at all, they just want the 
> >>>fence API to work as documented.
> >>>
> >To ensure that such external users are not confused by strange things
> >happening with the seqno, this patch adds in a per context timeline
> >that can provide a guaranteed in-order seqno value for the fence. This
> >is safe because the scheduler will not re-order batch buffers within a
> >context - they are considered to be mutually dependent.
> You haven't added per-context breadcrumbs. What we need for being able
> to execute requests from parallel timelines, but with requests within a
> timeline being ordered, is a per-context page where we can emit the
> per-context issued breadcrumb. Then instead of looking up the current
> HW seqno in a global page, the request just looks at the current context
> HW seqno in the context seq, just
> i915_seqno_passed(*req->p_context_seqno, req->seqno).
> >>>This patch is not attempting to implement per context seqno values. That 
> >>>can be done as future work. This patch is doing the simplest, least 
> >>>invasive implementation in order to make external fences work.
> >>Right.  I think we want to move to per-context seqnos, but we don't have to 
> >>do it before this work lands.  It should be easier to do it after the rest 
> >>of these bits land in fact, since seqno handling will be well encapsulated 
> >>aiui.
> >This patch is irrelevent then. I think it is actually worse because it
> >is encapsulating a design detail that is fundamentally wrong.
> >-Chris
> >
> 
> Some kind of per-context timeline is required for the external use
> of i915 fences. Seqnos cannot be used without a lot of rework
> because they dance around with scheduler re-ordering and pre-emption
> - a low priority request could go through many different seqnos if
> it keeps getting pre-empted. We need to be able to use fences
> externally on Android at least, and with SVM it becomes vital for
> linux too. Therefore we need some solution. And this is much simpler
> and than re-writing the whole of the driver's seqno management.

Actually no. per-context seqno are trivial to implement, and allow for
request reordering between timelines with the seqno known a priori, that
includes priority handling and pre-emption, and struct fence of course
(since each context is a separate timeline).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 4/7] drm/i915: Cache LRC state page in the context

2016-01-12 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

LRC lifetime is well defined so we can cache the page pointing
to the object backing store in the context in order to avoid
walking over the object SG page list from the interrupt context
without the big lock held.

v2: Also cache the mapping. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  |  2 ++
 drivers/gpu/drm/i915/intel_lrc.c | 30 --
 2 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 79bb3671a15e..0c6a274a2150 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -885,6 +885,8 @@ struct intel_context {
int pin_count;
struct i915_vma *lrc_vma;
u64 lrc_desc;
+   struct page *lrc_state_page;
+   uint32_t *lrc_reg_state;
} engine[I915_NUM_RINGS];
 
struct list_head link;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 94314b344f38..9bd20422cfbf 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -339,14 +339,7 @@ static int execlists_update_context(struct 
drm_i915_gem_request *rq)
 {
struct intel_engine_cs *ring = rq->ring;
struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
-   struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-   struct page *page;
-   uint32_t *reg_state;
-
-   BUG_ON(!ctx_obj);
-
-   page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN);
-   reg_state = kmap_atomic(page);
+   uint32_t *reg_state = rq->ctx->engine[ring->id].lrc_reg_state;
 
reg_state[CTX_RING_TAIL+1] = rq->tail;
reg_state[CTX_RING_BUFFER_START+1] = rq->ringbuf->vma->node.start;
@@ -363,8 +356,6 @@ static int execlists_update_context(struct 
drm_i915_gem_request *rq)
ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
}
 
-   kunmap_atomic(reg_state);
-
return 0;
 }
 
@@ -1050,6 +1041,8 @@ static int intel_lr_context_do_pin(struct intel_engine_cs 
*ring,
struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
+   struct page *lrc_state_page;
+   uint32_t *lrc_reg_state;
int ret;
 
WARN_ON(!mutex_is_locked(>dev->struct_mutex));
@@ -1059,12 +1052,26 @@ static int intel_lr_context_do_pin(struct 
intel_engine_cs *ring,
if (ret)
return ret;
 
+   lrc_state_page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN);
+   if (WARN_ON(!lrc_state_page)) {
+   ret = -ENODEV;
+   goto unpin_ctx_obj;
+   }
+
+   lrc_reg_state = kmap(lrc_state_page);
+   if (!lrc_reg_state) {
+   ret = -ENOMEM;
+   goto unpin_ctx_obj;
+   }
+
ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
if (ret)
goto unpin_ctx_obj;
 
ctx->engine[ring->id].lrc_vma = i915_gem_obj_to_ggtt(ctx_obj);
intel_lr_context_descriptor_update(ctx, ring);
+   ctx->engine[ring->id].lrc_state_page = lrc_state_page;
+   ctx->engine[ring->id].lrc_reg_state = lrc_reg_state;
ctx_obj->dirty = true;
 
/* Invalidate GuC TLB. */
@@ -1105,10 +1112,13 @@ void intel_lr_context_unpin(struct drm_i915_gem_request 
*rq)
if (ctx_obj) {
WARN_ON(!mutex_is_locked(>dev->struct_mutex));
if (--rq->ctx->engine[ring->id].pin_count == 0) {
+   kunmap(rq->ctx->engine[ring->id].lrc_state_page);
intel_unpin_ringbuffer_obj(ringbuf);
i915_gem_object_ggtt_unpin(ctx_obj);
rq->ctx->engine[ring->id].lrc_vma = NULL;
rq->ctx->engine[ring->id].lrc_desc = 0;
+   rq->ctx->engine[ring->id].lrc_state_page = NULL;
+   rq->ctx->engine[ring->id].lrc_reg_state = NULL;
}
}
 }
-- 
1.9.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 02/38] drm/i915: Explicit power enable during deferred context initialisation

2016-01-12 Thread John Harrison

On 12/01/2016 11:28, Chris Wilson wrote:

On Tue, Jan 12, 2016 at 11:11:20AM +, John Harrison wrote:

On 12/01/2016 00:20, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 06:42:31PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

A later patch in this series re-organises the batch buffer submission
code. Part of that is to reduce the scope of a pm_get/put pair.
Specifically, they previously wrapped the entire submission path from
the very start to the very end, now they only wrap the actual hardware
submission part in the back half.

However, as you haven't fixed the ordering issue that requires rpm_get
before struct_mutex, this is broken.

Why does 'intel_runtime_pm_get' require the struct mutex to be held?
It has certainly not complained at me about trying to do stuff
without it.

Because it depends upon the struct_mutex and rpm doesn't have sufficient
lockdep integration to be able to warn about using rpm from the
incorrect contexts.
Where? What does the 'pm_runtime_get_sync' call turn into? There are 
already other places in the driver which call intel_runtime_pm_get() 
immediately after grabbing the mutex lock. Also, the description comment 
for _pm_get() does not mention anything about mutexes at all.



-Chris



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 04/10] drm/i915: Support for creating Stolen memory backed objects

2016-01-12 Thread Chris Wilson
On Tue, Dec 22, 2015 at 11:50:47AM +0530, ankitprasad.r.sha...@intel.com wrote:
> @@ -456,6 +457,21 @@ struct drm_i915_gem_create {
>*/
>   __u32 handle;
>   __u32 pad;
> + /**
> +  * Requested flags (currently used for placement
> +  * (which memory domain))
> +  *
> +  * You can request that the object be created from special memory
> +  * rather than regular system pages using this parameter. Such
> +  * irregular objects may have certain restrictions (such as CPU
> +  * access to a stolen object is verboten).
> +  *
> +  * This can be used in the future for other purposes too
> +  * e.g. specifying tiling/caching/madvise
> +  */
> + __u64 flags;

We've just been discussing future flags, and it seems like we want to
reserve the first 8 bits as a placement enum.

#define I915_CREATE_PLACEMENT_NORMAL0 /* standard swappable bo */
#define I915_CREATE_PLACEMENT_STOLEN1 /* Cannot use CPU mmaps */
/*
#define I915_CREATE_PLACEMENT_PRIVATE_ACTIVE2 /* From the file private 
freed pool */
#define I915_CREATE_PLACEMENT_PRIVATE_INACTIVE  3 /* From the file private 
freed pool */
*/
#define I915_CREATE_PLACEMENT_MASK 0xff
#define __I915_CREATE_UNKNOWN_FLAGS ~I915_CREATE_PLACEMENT_MASK

So thinking ahead, rearranging this

>   /* Allocate the new object */
> - obj = i915_gem_alloc_object(dev, size);
> + if (flags & I915_CREATE_PLACEMENT_STOLEN) {
> + mutex_lock(>struct_mutex);
> + obj = i915_gem_object_create_stolen(dev, size);
> + if (!obj) {
> + mutex_unlock(>struct_mutex);
> + return -ENOMEM;
> + }
> +
> + /* Always clear fresh buffers before handing to userspace */
> + ret = i915_gem_object_clear(obj);
> + if (ret) {
> + drm_gem_object_unreference(>base);
> + mutex_unlock(>struct_mutex);
> + return ret;
> + }
> +
> + mutex_unlock(>struct_mutex);
> + } else {
> + obj = i915_gem_alloc_object(dev, size);
> + }

as something like:

u32 placement = flags & I915_CREATE_PLACEMENT_MASK;

switch (placement) {
/*
case I915_CREATE_PLACEMENT_PRIVATE_ACTIVE:
case I915_CREATE_PLACEMENT_PRIVATE_INACTIVE:
use_private_pool = true;
obj = alloc_from_pool(file, size, placement == ACTIVE);
if (obj != NULL)
break;
/* fallthrough */
*/
case I915_CREATE_PLACEMENT_NORMAL:
obj = i915_gem_alloc_object(dev, size);
break;
case I915_CREATE_PLACEMENT_STOLEN:
obj = alloc_from_stolen(dev, size);
break;
}

would ease my future plans and look a bit neater :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 4/7] drm/i915: Cache LRC state page in the context

2016-01-12 Thread Tvrtko Ursulin


On 12/01/16 12:12, Chris Wilson wrote:

On Tue, Jan 12, 2016 at 11:56:11AM +, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

LRC lifetime is well defined so we can cache the page pointing
to the object backing store in the context in order to avoid
walking over the object SG page list from the interrupt context
without the big lock held.

v2: Also cache the mapping. (Chris Wilson)
v3: Unmap on the error path.


Then we only use the lrc_state_page in the unmapping, hardly worth the
cost of saving it.


Ok.

Do you also know if this would now require any flushing or something if 
previously kunmap_atomic might have done something under the covers?



The reg_state is better associated with the ring (since it basically
contains the analog of the RING_HEAD and friends).


Hm, not sure. The page belongs to the object from that anonymous struct 
in intel_context so I think it is best to keep them together.


Regards,

Tvrtko


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/6] drm: Create Color Management DRM properties

2016-01-12 Thread Ville Syrjälä
On Mon, Jan 11, 2016 at 08:37:09PM +, Daniel Stone wrote:
> Hi,
> 
> On 5 January 2016 at 10:23, Daniel Vetter  wrote:
> > On Wed, Dec 23, 2015 at 09:47:00AM +, Daniel Stone wrote:
> >> It's not even a legacy vs. atomic thing, this can happen in
> >> pure-atomic as well. Same as the render-compression plane property
> >> that I just replied to here as well.
> >>
> >> - Weston starts and sets up palette/CTM properties
> >> - VT switch to Mutter, which isn't aware of new properties
> >> - Mutter atomically sets new plane state, containing framebuffer which
> >> is already colour-corrected for the target
> >> - result is now double-corrected as Mutter didn't know to unset the
> >> old properties
> >>
> >> IOW, in the current model, any user of CM has to explicitly unset it
> >> before handover (not always actually possible), or any generic
> >> userspace must unset every property it sees and doesn't know about,
> >> hoping that setting to 0 will do the right thing.
> >>
> >> Or we could add another client cap, which would prevent the CM
> >> properties from ever being exposed to userspace which doesn't know how
> >> to work with it. Passing the client caps through to
> >> plane_duplicate_state also means that we can unset the CM properties
> >> at this early point for unaware clients. This would mean that we
> >> wouldn't have to have code to explicitly unset it in, e.g., every
> >> legacy hook.
> >
> > We don't have any support for unsetting anything really. Same argument you
> > make for CM here applies to rotation too. The only generic solution (if
> > you really care about this) would be for logind to once sample all atomic
> > state on boot-up, and force-restore that state when switching. Restoring
> > atomic state doesn't require userspace to understanding the semantics
> > really.
> 
> Sure, but on the other hand, rotation has been around ~forever, and is
> implemented in multiple places. Colour management is a lot less
> obvious.
> 
> > This kind of force-restoring is probably something we should implement in
> > the fbdev emulation, which would cover about 99% of all cases where
> > developers/users want to run different compositors. Or fbdev should simply
> > reset CM state, like it does for rotation already (that part is easy to
> > add, but indeed seems to be missing).
> >
> > Trying to create an ad-hoc solution (using opt-in flags) to this problem
> > for every single feature seems pointless imo.
> 
> Fair enough, I guess it could be more difficult, so how about a new
> flag to the atomic ioctl which requests state be _reset_ to scratch
> rather than duplicated? That way, clients could be really sure they
> weren't going to get screwed by rotation / colour management / render
> compression / whatever.

One question is what exactly is scratch? Eg. I have BYT tablet here
which has the display mounted upside down so in theory the reset state
should be 180 degree rotated. I have a patch hiding in some branch
to make the fbdev helper take over the rotation from the BIOS. I suppose
I should also dig into the VBT to see if there's some rotation indicators
there for the cases when you boot with HDMI plugged in. But I digress.

-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 00/38] GPU scheduler for i915 driver

2016-01-12 Thread John Harrison

On 12/01/2016 04:37, Tian, Kevin wrote:

From: john.c.harri...@intel.com
Sent: Tuesday, January 12, 2016 2:42 AM

From: John Harrison 

Implemented a batch buffer submission scheduler for the i915 DRM driver.

The general theory of operation is that when batch buffers are
submitted to the driver, the execbuffer() code assigns a unique seqno
value and then packages up all the information required to execute the
batch buffer at a later time. This package is given over to the
scheduler which adds it to an internal node list. The scheduler also
scans the list of objects associated with the batch buffer and
compares them against the objects already in use by other buffers in
the node list. If matches are found then the new batch buffer node is
marked as being dependent upon the matching node. The same is done for
the context object. The scheduler also bumps up the priority of such
matching nodes on the grounds that the more dependencies a given batch
buffer has the more important it is likely to be.


A curious question. Is this new GPU scheduler still useful when GuC
is enabled? Sorry if this Q. has been answered before.
Yes. The scheduler works with any back end submission mechanism - legacy 
ring buffer, execlist or Guc. Indeed, the pre-emption support (next 
patch series in the set) currently requires the GuC. Execlist support is 
possible but just not currently planned due to time constraints. Legacy 
ring buffer pre-emption is very different and a lot more work for very 
little benefit - pre-execlist hardware does not support very much in the 
way of pre-emption facilities.


The GuC itself does not really do much in the way of scheduling. It does 
know about the dependencies between batch buffers, for example, so 
cannot re-order work according to priority. Adding such support without 
still having large chunks of kernel driver support is a currently 
unscoped and unplanning task.





Thanks
Kevin


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 18/38] drm/i915: Added scheduler support to __wait_request() calls

2016-01-12 Thread John Harrison

On 11/01/2016 23:14, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 06:42:47PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

The scheduler can cause batch buffers, and hence requests, to be
submitted to the ring out of order and asynchronously to their
submission to the driver. Thus at the point of waiting for the
completion of a given request, it is not even guaranteed that the
request has actually been sent to the hardware yet. Even it is has
been sent, it is possible that it could be pre-empted and thus
'unsent'.

This means that it is necessary to be able to submit requests to the
hardware during the wait call itself. Unfortunately, while some
callers of __wait_request() release the mutex lock first, others do
not (and apparently can not). Hence there is the ability to deadlock
as the wait stalls for submission but the asynchronous submission is
stalled for the mutex lock.

That is a nonsequitor. Do you mean to say that unless we take action
inside GEM, the request will never be submitted to hardware by the
scheduler?


Potentially. The scheduler holds on to batches inside its internal queue 
for later submission. It can also preempt batches that have already been 
sent to the hardware. Thus the wait call might be waiting on a batch 
with has or has not been submitted but even it is currently executing, 
it might get kicked out and need re-submitting. That submission requires 
calling the back end submission part of the driver - legacy ring buffer, 
execlist or GuC. Those back ends all require grabbing the mutex lock.


  

This change hooks the scheduler in to the __wait_request() code to
ensure correct behaviour. That is, flush the target batch buffer
through to the hardware and do not deadlock waiting for something that
cannot currently be submitted.

The dependencies are known during request construction, how could we
generate a cyclic graph?
It is not so much a cyclic graph as a bouncing one. As noted above, even 
a batch buffer which is currently executing could get preempted and need 
to be resubmitted again while someone is waiting one it.



  The scheduler itself does not need the
struct_mutex (other than the buggy code),
The scheduler internally does not need the mutex lock at all - it has 
its own private spinlock for critical data. However, the back end 
submission paths do currently require the mutex lock.



  so GEM holding the
struct_mutex will not prevent the scheduler from eventually submitting
the request we are waiting for. So as far as I can see, you are papering
over your own bugs.
-Chris



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 1/3] drm/i915: simplify allocation of driver-internal requests

2016-01-12 Thread Dave Gordon

On 07/01/16 16:53, Chris Wilson wrote:

On Thu, Jan 07, 2016 at 08:49:38AM -0800, Jesse Barnes wrote:

On 01/07/2016 03:58 AM, Chris Wilson wrote:

On Thu, Jan 07, 2016 at 10:20:50AM +, Dave Gordon wrote:

There are a number of places where the driver needs a request, but isn't
working on behalf of any specific user or in a specific context. At
present, we associate them with the per-engine default context. A future
patch will abolish those per-engine context pointers; but we can already
eliminate a lot of the references to them, just by making the allocator
allow NULL as a shorthand for "an appropriate context for this ring",
which will mean that the callers don't need to know anything about how
the "appropriate context" is found (e.g. per-ring vs per-device, etc).

So this patch renames the existing i915_gem_request_alloc(), and makes
it local (static inline), and replaces it with a wrapper that provides
a default if the context is NULL, and also has a nicer calling
convention (doesn't require a pointer to an output parameter). Then we
change all callers to use the new convention:
OLD:
err = i915_gem_request_alloc(ring, user_ctx, );
if (err) ...
NEW:
req = i915_gem_request_alloc(ring, user_ctx);
if (IS_ERR(req)) ...
OLD:
err = i915_gem_request_alloc(ring, ring->default_context, );
if (err) ...
NEW:
req = i915_gem_request_alloc(ring, NULL);
if (IS_ERR(req)) ...


Nak. You haven't fixed i915_gem_request_alloc() at all.

http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=breadcrumbs=82c72e1a2b4385f0ab07dccee45acef38303e96f
is the patch I have been carrying ever since.


Can we stop with the "nak"?  This patch wraps the request alloc
differently than yours, but you haven't given details as to why you
think it's incorrect (see Dave's reply).


I am annoyed that people do not review my patches and are duplicating
work I did last year and the year before, without attempting to fix
real bugs.
-Chris


Chris, this patchset is totally directed towards fixing a specific bug, 
one which, moreover, arose a consequence of a patch YOU wrote:

b0366a5 drm/i915: intel_ring_initialized() must be simple and inline
(mea culpa too, obviously, since I was the one who rebased & pushed it).
Nick has a fix for the original bug, which involves reversing the 
teardown order, but can't now use it since b0366a5, so the bug remains. 
Nick's fix can be made to work if we replace the per-engine default 
contexts with the global one, which you've already agreed is a good idea 
(I think it was your idea in the first place!).


We can't take your patch because it doesn't apply to nightly. If you 
provide a standalone version that's not entangled with 100 other patches 
I'll happily review it. Or I might cherry-pick your existing one out of 
the 190-element patchset and try to rebase it onto nightly, which is how 
b0366a5 got in in the first place. I suspect it would look very much 
like mine then ...


Maybe we should just revert b0366a5 instead? Even though it was quite 
nice in itself ...


.Dave.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 073/190] drm/i915: Introduce i915_gem_active for request tracking

2016-01-12 Thread Tvrtko Ursulin



On 12/01/16 11:01, Chris Wilson wrote:

On Tue, Jan 12, 2016 at 10:04:20AM +, Tvrtko Ursulin wrote:

Perhaps then leave the structure name as is and just rename the
function to i915_gem_request_assign_active? I think that describes
better what is actually happening.


i915_gem_request_update_active()?

request_assign_active() says to me that it is the request we are acting
on and it can have only one active entity. "update" could go either way.

i915_gem_active_add_to_request() is the full version I guess, or just
i915_gem_active_set().


i915_gem_active_add_to_request sounds good, but need to reorder the 
parameters then I think.


Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 076/190] drm/i915: Rename vma->*_list to *_link for consistency

2016-01-12 Thread Tvrtko Ursulin



On 11/01/16 09:17, Chris Wilson wrote:

Elsewhere we have adopted the convention of using '_link' to denote
elements in the list (and '_list' for the actual list_head itself), and
that the name should indicate which list the link belongs to (and
preferrably not just where the link is being stored).

s/vma_link/obj_link/ (we iterate over obj->vma_list)
s/mm_list/vm_link/ (we iterate over vm->[in]active_list)

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_debugfs.c  | 17 +--
  drivers/gpu/drm/i915/i915_gem.c  | 50 
  drivers/gpu/drm/i915/i915_gem_context.c  |  2 +-
  drivers/gpu/drm/i915/i915_gem_evict.c|  6 ++--
  drivers/gpu/drm/i915/i915_gem_gtt.c  | 10 +++
  drivers/gpu/drm/i915/i915_gem_gtt.h  |  4 +--
  drivers/gpu/drm/i915/i915_gem_shrinker.c |  4 +--
  drivers/gpu/drm/i915/i915_gem_stolen.c   |  2 +-
  drivers/gpu/drm/i915/i915_gem_userptr.c  |  2 +-
  drivers/gpu/drm/i915/i915_gpu_error.c|  8 ++---
  10 files changed, 52 insertions(+), 53 deletions(-)


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko


diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index efa9572fc217..f311df758195 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -117,9 +117,8 @@ static u64 i915_gem_obj_total_ggtt_size(struct 
drm_i915_gem_object *obj)
u64 size = 0;
struct i915_vma *vma;

-   list_for_each_entry(vma, >vma_list, vma_link) {
-   if (i915_is_ggtt(vma->vm) &&
-   drm_mm_node_allocated(>node))
+   list_for_each_entry(vma, >vma_list, obj_link) {
+   if (i915_is_ggtt(vma->vm) && drm_mm_node_allocated(>node))
size += vma->node.size;
}

@@ -155,7 +154,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object 
*obj)
   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
if (obj->base.name)
seq_printf(m, " (name: %d)", obj->base.name);
-   list_for_each_entry(vma, >vma_list, vma_link) {
+   list_for_each_entry(vma, >vma_list, obj_link) {
if (vma->pin_count > 0)
pin_count++;
}
@@ -164,7 +163,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object 
*obj)
seq_printf(m, " (display)");
if (obj->fence_reg != I915_FENCE_REG_NONE)
seq_printf(m, " (fence: %d)", obj->fence_reg);
-   list_for_each_entry(vma, >vma_list, vma_link) {
+   list_for_each_entry(vma, >vma_list, obj_link) {
seq_printf(m, " (%sgtt offset: %08llx, size: %08llx",
   i915_is_ggtt(vma->vm) ? "g" : "pp",
   vma->node.start, vma->node.size);
@@ -229,7 +228,7 @@ static int i915_gem_object_list_info(struct seq_file *m, 
void *data)
}

total_obj_size = total_gtt_size = count = 0;
-   list_for_each_entry(vma, head, mm_list) {
+   list_for_each_entry(vma, head, vm_link) {
seq_printf(m, "   ");
describe_obj(m, vma->obj);
seq_printf(m, "\n");
@@ -341,7 +340,7 @@ static int per_file_stats(int id, void *ptr, void *data)
stats->shared += obj->base.size;

if (USES_FULL_PPGTT(obj->base.dev)) {
-   list_for_each_entry(vma, >vma_list, vma_link) {
+   list_for_each_entry(vma, >vma_list, obj_link) {
struct i915_hw_ppgtt *ppgtt;

if (!drm_mm_node_allocated(>node))
@@ -453,12 +452,12 @@ static int i915_gem_object_info(struct seq_file *m, void* 
data)
   count, mappable_count, size, mappable_size);

size = count = mappable_size = mappable_count = 0;
-   count_vmas(>active_list, mm_list);
+   count_vmas(>active_list, vm_link);
seq_printf(m, "  %u [%u] active objects, %llu [%llu] bytes\n",
   count, mappable_count, size, mappable_size);

size = count = mappable_size = mappable_count = 0;
-   count_vmas(>inactive_list, mm_list);
+   count_vmas(>inactive_list, vm_link);
seq_printf(m, "  %u [%u] inactive objects, %llu [%llu] bytes\n",
   count, mappable_count, size, mappable_size);

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4eef13ebdaf3..e4d7c7f5aca2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -128,10 +128,10 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void 
*data,

pinned = 0;
mutex_lock(>struct_mutex);
-   list_for_each_entry(vma, >base.active_list, mm_list)
+   list_for_each_entry(vma, >base.active_list, vm_link)
if (vma->pin_count)
pinned += vma->node.size;
-   list_for_each_entry(vma, >base.inactive_list, mm_list)
+   

Re: [Intel-gfx] [PATCH v3 1/3] drm/i915: simplify allocation of driver-internal requests

2016-01-12 Thread Daniel Vetter
On Thu, Jan 07, 2016 at 10:20:50AM +, Dave Gordon wrote:
> There are a number of places where the driver needs a request, but isn't
> working on behalf of any specific user or in a specific context. At
> present, we associate them with the per-engine default context. A future
> patch will abolish those per-engine context pointers; but we can already
> eliminate a lot of the references to them, just by making the allocator
> allow NULL as a shorthand for "an appropriate context for this ring",
> which will mean that the callers don't need to know anything about how
> the "appropriate context" is found (e.g. per-ring vs per-device, etc).
> 
> So this patch renames the existing i915_gem_request_alloc(), and makes
> it local (static inline), and replaces it with a wrapper that provides
> a default if the context is NULL, and also has a nicer calling
> convention (doesn't require a pointer to an output parameter). Then we
> change all callers to use the new convention:
> OLD:
>   err = i915_gem_request_alloc(ring, user_ctx, );
>   if (err) ...
> NEW:
>   req = i915_gem_request_alloc(ring, user_ctx);
>   if (IS_ERR(req)) ...
> OLD:
>   err = i915_gem_request_alloc(ring, ring->default_context, );
>   if (err) ...
> NEW:
>   req = i915_gem_request_alloc(ring, NULL);
>   if (IS_ERR(req)) ...
> 
> Signed-off-by: Dave Gordon 
> ---
>  drivers/gpu/drm/i915/i915_drv.h|  6 ++--
>  drivers/gpu/drm/i915/i915_gem.c| 55 
> +++---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 14 +---
>  drivers/gpu/drm/i915/intel_display.c   |  6 ++--
>  drivers/gpu/drm/i915/intel_lrc.c   |  9 +++--
>  drivers/gpu/drm/i915/intel_overlay.c   | 24 ++---
>  6 files changed, 74 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c6dd4db..c2b000a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2260,9 +2260,9 @@ struct drm_i915_gem_request {
>  
>  };
>  
> -int i915_gem_request_alloc(struct intel_engine_cs *ring,
> -struct intel_context *ctx,
> -struct drm_i915_gem_request **req_out);
> +struct drm_i915_gem_request * __must_check
> +i915_gem_request_alloc(struct intel_engine_cs *engine,
> +struct intel_context *ctx);
>  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>  void i915_gem_request_free(struct kref *req_ref);
>  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6c60e04..c908ed1 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2688,9 +2688,10 @@ void i915_gem_request_free(struct kref *req_ref)
>   kmem_cache_free(req->i915->requests, req);
>  }
>  
> -int i915_gem_request_alloc(struct intel_engine_cs *ring,
> -struct intel_context *ctx,
> -struct drm_i915_gem_request **req_out)
> +static inline int
> +__i915_gem_request_alloc(struct intel_engine_cs *ring,
> +  struct intel_context *ctx,
> +  struct drm_i915_gem_request **req_out)
>  {
>   struct drm_i915_private *dev_priv = to_i915(ring->dev);
>   struct drm_i915_gem_request *req;
> @@ -2753,6 +2754,31 @@ err:
>   return ret;
>  }
>  
> +/**
> + * i915_gem_request_alloc - allocate a request structure
> + *
> + * @engine: engine that we wish to issue the request on.
> + * @ctx: context that the request will be associated with.
> + *   This can be NULL if the request is not directly related to
> + *   any specific user context, in which case this function will
> + *   choose an appropriate context to use.
> + *
> + * Returns a pointer to the allocated request if successful,
> + * or an error code if not.
> + */
> +struct drm_i915_gem_request *
> +i915_gem_request_alloc(struct intel_engine_cs *engine,
> +struct intel_context *ctx)
> +{
> + struct drm_i915_gem_request *req;
> + int err;
> +
> + if (ctx == NULL)
> + ctx = engine->default_context;

I think NULL vs. explicit dev_priv->kernel_context (or whatever it is) is
semantically equivalent enough that it doesn't matter. And we can easily
sed this (or an entire patch series for easy rebasing) if we change our
minds.

Acked-by: Daniel Vetter 

> + err = __i915_gem_request_alloc(engine, ctx, );
> + return err ? ERR_PTR(err) : req;
> +}
> +
>  void i915_gem_request_cancel(struct drm_i915_gem_request *req)
>  {
>   intel_ring_reserved_space_cancel(req->ringbuf);
> @@ -3170,9 +3196,13 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>   return 0;
>  
>   if (*to_req == NULL) {
> - ret = 

Re: [Intel-gfx] [PATCH v3 2/3] drm/i915: abolish separate per-ring default_context pointers

2016-01-12 Thread Daniel Vetter
On Thu, Jan 07, 2016 at 10:20:51AM +, Dave Gordon wrote:
> Now that we've eliminated a lot of uses of ring->default_context,
> we can eliminate the pointer itself.
> 
> All the engines share the same default intel_context, so we can just
> keep a single reference to it in the dev_priv structure rather than one
> in each of the engine[] elements. This make refcounting more sensible
> too, as we now have a refcount of one for the one pointer, rather than
> a refcount of one but multiple pointers.
> 
> From an idea by Chris Wilson.
> 
> Signed-off-by: Dave Gordon 

Acked-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c|  4 ++--
>  drivers/gpu/drm/i915/i915_drv.h|  2 ++
>  drivers/gpu/drm/i915/i915_gem.c|  6 +++---
>  drivers/gpu/drm/i915/i915_gem_context.c| 22 --
>  drivers/gpu/drm/i915/i915_gpu_error.c  |  2 +-
>  drivers/gpu/drm/i915/i915_guc_submission.c |  6 +++---
>  drivers/gpu/drm/i915/intel_lrc.c   | 24 +---
>  drivers/gpu/drm/i915/intel_ringbuffer.h|  1 -
>  8 files changed, 32 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 0fc38bb..2613708 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1943,7 +1943,7 @@ static int i915_context_status(struct seq_file *m, void 
> *unused)
>   seq_puts(m, "HW context ");
>   describe_ctx(m, ctx);
>   for_each_ring(ring, dev_priv, i) {
> - if (ring->default_context == ctx)
> + if (dev_priv->kernel_context == ctx)
>   seq_printf(m, "(default context %s) ",
>  ring->name);
>   }
> @@ -2039,7 +2039,7 @@ static int i915_dump_lrc(struct seq_file *m, void 
> *unused)
>  
>   list_for_each_entry(ctx, _priv->context_list, link) {
>   for_each_ring(ring, dev_priv, i) {
> - if (ring->default_context != ctx)
> + if (dev_priv->kernel_context != ctx)
>   i915_dump_lrc_obj(m, ring,
> ctx->engine[i].state);
>   }
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c2b000a..aef86a8 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1940,6 +1940,8 @@ struct drm_i915_private {
>   void (*stop_ring)(struct intel_engine_cs *ring);
>   } gt;
>  
> + struct intel_context *kernel_context;
> +
>   bool edp_low_vswing;
>  
>   /* perform PHY state sanity checks? */
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c908ed1..8f101121 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2678,7 +2678,7 @@ void i915_gem_request_free(struct kref *req_ref)
>  
>   if (ctx) {
>   if (i915.enable_execlists) {
> - if (ctx != req->ring->default_context)
> + if (ctx != req->i915->kernel_context)
>   intel_lr_context_unpin(req);
>   }
>  
> @@ -2774,7 +2774,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
>   int err;
>  
>   if (ctx == NULL)
> - ctx = engine->default_context;
> + ctx = to_i915(engine->dev)->kernel_context;
>   err = __i915_gem_request_alloc(engine, ctx, );
>   return err ? ERR_PTR(err) : req;
>  }
> @@ -4862,7 +4862,7 @@ i915_gem_init_hw(struct drm_device *dev)
>*/
>   init_unused_rings(dev);
>  
> - BUG_ON(!dev_priv->ring[RCS].default_context);
> + BUG_ON(!dev_priv->kernel_context);
>  
>   ret = i915_ppgtt_init_hw(dev);
>   if (ret) {
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
> b/drivers/gpu/drm/i915/i915_gem_context.c
> index 900ffd0..e1d767e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -354,11 +354,10 @@ int i915_gem_context_init(struct drm_device *dev)
>  {
>   struct drm_i915_private *dev_priv = dev->dev_private;
>   struct intel_context *ctx;
> - int i;
>  
>   /* Init should only be called once per module load. Eventually the
>* restriction on the context_disabled check can be loosened. */
> - if (WARN_ON(dev_priv->ring[RCS].default_context))
> + if (WARN_ON(dev_priv->kernel_context))
>   return 0;
>  
>   if (intel_vgpu_active(dev) && HAS_LOGICAL_RING_CONTEXTS(dev)) {
> @@ -388,12 +387,7 @@ int i915_gem_context_init(struct drm_device *dev)
>   return PTR_ERR(ctx);
>   }
>  
> - for (i = 0; i < I915_NUM_RINGS; i++) {
> - struct intel_engine_cs *ring = _priv->ring[i];
> -
> -

Re: [Intel-gfx] [PATCH v3 1/3] drm/i915: simplify allocation of driver-internal requests

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 02:50:28PM +0100, Daniel Vetter wrote:
> On Thu, Jan 07, 2016 at 10:20:50AM +, Dave Gordon wrote:
> > There are a number of places where the driver needs a request, but isn't
> > working on behalf of any specific user or in a specific context. At
> > present, we associate them with the per-engine default context. A future
> > patch will abolish those per-engine context pointers; but we can already
> > eliminate a lot of the references to them, just by making the allocator
> > allow NULL as a shorthand for "an appropriate context for this ring",
> > which will mean that the callers don't need to know anything about how
> > the "appropriate context" is found (e.g. per-ring vs per-device, etc).
> > 
> > So this patch renames the existing i915_gem_request_alloc(), and makes
> > it local (static inline), and replaces it with a wrapper that provides
> > a default if the context is NULL, and also has a nicer calling
> > convention (doesn't require a pointer to an output parameter). Then we
> > change all callers to use the new convention:
> > OLD:
> > err = i915_gem_request_alloc(ring, user_ctx, );
> > if (err) ...
> > NEW:
> > req = i915_gem_request_alloc(ring, user_ctx);
> > if (IS_ERR(req)) ...
> > OLD:
> > err = i915_gem_request_alloc(ring, ring->default_context, );
> > if (err) ...
> > NEW:
> > req = i915_gem_request_alloc(ring, NULL);
> > if (IS_ERR(req)) ...
> > 
> > Signed-off-by: Dave Gordon 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h|  6 ++--
> >  drivers/gpu/drm/i915/i915_gem.c| 55 
> > +++---
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 14 +---
> >  drivers/gpu/drm/i915/intel_display.c   |  6 ++--
> >  drivers/gpu/drm/i915/intel_lrc.c   |  9 +++--
> >  drivers/gpu/drm/i915/intel_overlay.c   | 24 ++---
> >  6 files changed, 74 insertions(+), 40 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index c6dd4db..c2b000a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2260,9 +2260,9 @@ struct drm_i915_gem_request {
> >  
> >  };
> >  
> > -int i915_gem_request_alloc(struct intel_engine_cs *ring,
> > -  struct intel_context *ctx,
> > -  struct drm_i915_gem_request **req_out);
> > +struct drm_i915_gem_request * __must_check
> > +i915_gem_request_alloc(struct intel_engine_cs *engine,
> > +  struct intel_context *ctx);
> >  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
> >  void i915_gem_request_free(struct kref *req_ref);
> >  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> > b/drivers/gpu/drm/i915/i915_gem.c
> > index 6c60e04..c908ed1 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2688,9 +2688,10 @@ void i915_gem_request_free(struct kref *req_ref)
> > kmem_cache_free(req->i915->requests, req);
> >  }
> >  
> > -int i915_gem_request_alloc(struct intel_engine_cs *ring,
> > -  struct intel_context *ctx,
> > -  struct drm_i915_gem_request **req_out)
> > +static inline int
> > +__i915_gem_request_alloc(struct intel_engine_cs *ring,
> > +struct intel_context *ctx,
> > +struct drm_i915_gem_request **req_out)
> >  {
> > struct drm_i915_private *dev_priv = to_i915(ring->dev);
> > struct drm_i915_gem_request *req;
> > @@ -2753,6 +2754,31 @@ err:
> > return ret;
> >  }
> >  
> > +/**
> > + * i915_gem_request_alloc - allocate a request structure
> > + *
> > + * @engine: engine that we wish to issue the request on.
> > + * @ctx: context that the request will be associated with.
> > + *   This can be NULL if the request is not directly related to
> > + *   any specific user context, in which case this function will
> > + *   choose an appropriate context to use.
> > + *
> > + * Returns a pointer to the allocated request if successful,
> > + * or an error code if not.
> > + */
> > +struct drm_i915_gem_request *
> > +i915_gem_request_alloc(struct intel_engine_cs *engine,
> > +  struct intel_context *ctx)
> > +{
> > +   struct drm_i915_gem_request *req;
> > +   int err;
> > +
> > +   if (ctx == NULL)
> > +   ctx = engine->default_context;
> 
> I think NULL vs. explicit dev_priv->kernel_context (or whatever it is) is
> semantically equivalent enough that it doesn't matter. And we can easily
> sed this (or an entire patch series for easy rebasing) if we change our
> minds.

But we were removing the engine->default_context as it complicated the
rest of the code. I strongly prefer keeping the contexts explicit as
context separation should be first and foremost in the driver.
-Chris

-- 
Chris Wilson, Intel Open Source Technology 

Re: [Intel-gfx] [PATCH 073/190] drm/i915: Introduce i915_gem_active for request tracking

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 01:44:13PM +, Tvrtko Ursulin wrote:
> 
> On 12/01/16 11:01, Chris Wilson wrote:
> >On Tue, Jan 12, 2016 at 10:04:20AM +, Tvrtko Ursulin wrote:
> >>Perhaps then leave the structure name as is and just rename the
> >>function to i915_gem_request_assign_active? I think that describes
> >>better what is actually happening.
> >
> >i915_gem_request_update_active()?
> >
> >request_assign_active() says to me that it is the request we are acting
> >on and it can have only one active entity. "update" could go either way.
> >
> >i915_gem_active_add_to_request() is the full version I guess, or just
> >i915_gem_active_set().
> >
> >i915_gem_request_mark_active() -> i915_gem_active_set()
> 
> Sorry, or the short version might be good enough and better in the
> code since shorter. Just I think parameters need to be re-ordered.

Yes, i915_gem_active_set() would operate on the i915_gem_active and take
i915_gem_request as its parameter.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 024/190] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor

2016-01-12 Thread Mika Kuoppala
Chris Wilson  writes:

> When reading from the HWS page, we use barrier() to prevent the compiler
> optimising away the read from the volatile (may be updated by the GPU)
> memory address. This is more suited to READ_ONCE(); make it so.
>
> Signed-off-by: Chris Wilson 
> Cc: Daniel Vetter 

After reading Documentation/memory-barriers.txt I feel that deodorant
failed. I confirmed my confusion about hws page cacheability from Chris,
and it is snooped.

The barrier here is superset of what we need. We need
to instruct compiler to throw out compiler cached value of this
particular address, not everything it had. So it makes sense to me.

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
> b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 6cc8e9c5f8d6..8f305ce253ae 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -418,8 +418,7 @@ intel_read_status_page(struct intel_engine_cs *ring,
>  int reg)
>  {
>   /* Ensure that the compiler doesn't optimize away the load. */
> - barrier();
> - return ring->status_page.page_addr[reg];
> + return READ_ONCE(ring->status_page.page_addr[reg]);
>  }
>  
>  static inline void
> -- 
> 2.7.0.rc3
>
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 02/38] drm/i915: Explicit power enable during deferred context initialisation

2016-01-12 Thread John Harrison

On 12/01/2016 14:04, Daniel Vetter wrote:

On Tue, Jan 12, 2016 at 11:50:34AM +, John Harrison wrote:

On 12/01/2016 11:28, Chris Wilson wrote:

On Tue, Jan 12, 2016 at 11:11:20AM +, John Harrison wrote:

On 12/01/2016 00:20, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 06:42:31PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

A later patch in this series re-organises the batch buffer submission
code. Part of that is to reduce the scope of a pm_get/put pair.
Specifically, they previously wrapped the entire submission path from
the very start to the very end, now they only wrap the actual hardware
submission part in the back half.

However, as you haven't fixed the ordering issue that requires rpm_get
before struct_mutex, this is broken.

Why does 'intel_runtime_pm_get' require the struct mutex to be held?
It has certainly not complained at me about trying to do stuff
without it.

Because it depends upon the struct_mutex and rpm doesn't have sufficient
lockdep integration to be able to warn about using rpm from the
incorrect contexts.

Where? What does the 'pm_runtime_get_sync' call turn into? There are already
other places in the driver which call intel_runtime_pm_get() immediately
after grabbing the mutex lock. Also, the description comment for _pm_get()
does not mention anything about mutexes at all.

If you nest rpm_get within dev->struct_mutex that's a bug and could
deadlock. Where does this happen? And for any such place we need a new
subtest in pm_rpm.


The first two hits when grepping the driver are in 
'i915_gem_seqno_info()' and 'i915_interrupt_info()' in i915_debugfs.c. 
Both say:

ret = mutex_lock_interruptible(>struct_mutex);
if (ret)
return ret;
intel_runtime_pm_get(dev_priv);



-Daniel


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: reboot notifier delay for eDP panels

2016-01-12 Thread Ville Syrjälä
On Mon, Jan 11, 2016 at 01:52:17PM -0800, clinton.a.tay...@intel.com wrote:
> From: Clint Taylor 
> 
> Add reboot notifier for all platforms. This guarantees T12 delay
> compliance during reboot cycles when pre-os enables the panel within
> 500ms.
> 
> Signed-off-by: Clint Taylor 
> ---
>  drivers/gpu/drm/i915/intel_dp.c |   11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 796e3d3..dbbd27a 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -126,6 +126,7 @@ static struct intel_dp *intel_attached_dp(struct 
> drm_connector *connector)
>  static void intel_dp_link_down(struct intel_dp *intel_dp);
>  static bool edp_panel_vdd_on(struct intel_dp *intel_dp);
>  static void edp_panel_vdd_off(struct intel_dp *intel_dp, bool sync);
> +static void edp_panel_off(struct intel_dp *intel_dp);
>  static void vlv_init_panel_power_sequencer(struct intel_dp *intel_dp);
>  static void vlv_steal_power_sequencer(struct drm_device *dev,
> enum pipe pipe);
> @@ -596,6 +597,10 @@ static int edp_notify_handler(struct notifier_block 
> *this, unsigned long code,
>   I915_WRITE(pp_ctrl_reg, PANEL_UNLOCK_REGS | PANEL_POWER_OFF);
>   msleep(intel_dp->panel_power_cycle_delay);
>   }
> + else
> + {
> + edp_panel_off(intel_dp);
> + }

I don't think that actually waits for the power cycle delay. Also
you can't call it without having vdd already enabled unless you
specifically want to see a WARN.

I suppose you could just do something along these lines:

if (have_panel_power || have_panel_vdd) {
edp_panel_vdd_on
edp_panel_off
}
wait_panel_power_cycle

Also should change VLV/CHV to do it the same way.

>  
>   pps_unlock(intel_dp);
>  
> @@ -5796,10 +5801,10 @@ static bool intel_edp_init_connector(struct intel_dp 
> *intel_dp,
>   }
>   mutex_unlock(>mode_config.mutex);
>  
> - if (IS_VALLEYVIEW(dev) || IS_CHERRYVIEW(dev)) {
> - intel_dp->edp_notifier.notifier_call = edp_notify_handler;
> - register_reboot_notifier(_dp->edp_notifier);
> + intel_dp->edp_notifier.notifier_call = edp_notify_handler;
> + register_reboot_notifier(_dp->edp_notifier);
>  
> + if (IS_VALLEYVIEW(dev) || IS_CHERRYVIEW(dev)) {
>   /*
>* Figure out the current pipe for the initial backlight setup.
>* If the current pipe isn't valid, try the PPS pipe, and if 
> that
> -- 
> 1.7.9.5
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 073/190] drm/i915: Introduce i915_gem_active for request tracking

2016-01-12 Thread Tvrtko Ursulin


On 12/01/16 11:01, Chris Wilson wrote:

On Tue, Jan 12, 2016 at 10:04:20AM +, Tvrtko Ursulin wrote:

Perhaps then leave the structure name as is and just rename the
function to i915_gem_request_assign_active? I think that describes
better what is actually happening.


i915_gem_request_update_active()?

request_assign_active() says to me that it is the request we are acting
on and it can have only one active entity. "update" could go either way.

i915_gem_active_add_to_request() is the full version I guess, or just
i915_gem_active_set().

i915_gem_request_mark_active() -> i915_gem_active_set()


Sorry, or the short version might be good enough and better in the code 
since shorter. Just I think parameters need to be re-ordered.


Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 00/38] GPU scheduler for i915 driver

2016-01-12 Thread Dave Gordon

On 12/01/2016 11:43, John Harrison wrote:

On 12/01/2016 04:37, Tian, Kevin wrote:

From: john.c.harri...@intel.com
Sent: Tuesday, January 12, 2016 2:42 AM

From: John Harrison 

Implemented a batch buffer submission scheduler for the i915 DRM 
driver.


The general theory of operation is that when batch buffers are
submitted to the driver, the execbuffer() code assigns a unique seqno
value and then packages up all the information required to execute the
batch buffer at a later time. This package is given over to the
scheduler which adds it to an internal node list. The scheduler also
scans the list of objects associated with the batch buffer and
compares them against the objects already in use by other buffers in
the node list. If matches are found then the new batch buffer node is
marked as being dependent upon the matching node. The same is done for
the context object. The scheduler also bumps up the priority of such
matching nodes on the grounds that the more dependencies a given batch
buffer has the more important it is likely to be.


A curious question. Is this new GPU scheduler still useful when GuC
is enabled? Sorry if this Q. has been answered before.
Yes. The scheduler works with any back end submission mechanism - 
legacy ring buffer, execlist or Guc. Indeed, the pre-emption support 
(next patch series in the set) currently requires the GuC. Execlist 
support is possible but just not currently planned due to time 
constraints. Legacy ring buffer pre-emption is very different and a 
lot more work for very little benefit - pre-execlist hardware does not 
support very much in the way of pre-emption facilities.
We have previously implemented preemption on gen7 ringbuffer, but just 
as a proof of concept and we're not going to push that upstream. 
Ringbuffer mode can in any case only support co-operative preemption, 
whereas execlist and GuC modes don't require (much) cooperation from 
preemptible tasks.



The GuC itself does not really do much in the way of scheduling. It 
does know about the 

In the line above, John meant "does NOT know"!

.Dave.
dependencies between batch buffers, for example, so cannot re-order 
work according to priority. Adding such support without still having 
large chunks of kernel driver support is a currently unscoped and 
unplanning task.





Thanks
Kevin


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: edp resume/On time optimization.

2016-01-12 Thread Ville Syrjälä
On Mon, Jan 11, 2016 at 02:55:37PM -0800, abhay.ku...@intel.com wrote:
> From: Abhay Kumar 
> 
> Make resume/on codepath not to wait for panel_power_cycle_delay(t11_t12)
> if this time is already spent in suspend/poweron time.
> 
> v2: Use CLOCK_BOOTTIME and remove jiffies for panel power cycle
> delay calculation(Ville).
> 
> v3: Addressing Ville review comment.

That's not a very good changelog.

> 
> Cc: Ville Syrjälä 
> Signed-off-by: Abhay Kumar 
> ---
>  drivers/gpu/drm/i915/intel_dp.c  | 19 ++-
>  drivers/gpu/drm/i915/intel_drv.h |  2 +-
>  2 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 796e3d3..d0885bc 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -1812,12 +1812,21 @@ static void wait_panel_off(struct intel_dp *intel_dp)
>  
>  static void wait_panel_power_cycle(struct intel_dp *intel_dp)
>  {
> + static ktime_t panel_power_on_time;
> + s64 panel_power_off_duration;
> +
>   DRM_DEBUG_KMS("Wait for panel power cycle\n");
>  
> + /* take the difference of currrent time and panel power off time
> +  * and then make panel wait for t11_t12 if needed. */
> + panel_power_on_time = ktime_get_boottime();
> + panel_power_off_duration = ktime_ms_delta(panel_power_on_time, 
> intel_dp->panel_power_off_time);
> +
>   /* When we disable the VDD override bit last we have to do the manual
>* wait. */
> - wait_remaining_ms_from_jiffies(intel_dp->last_power_cycle,
> -intel_dp->panel_power_cycle_delay);
> + if (panel_power_off_duration < ((s64) 
> intel_dp->panel_power_cycle_delay))

Some usless parens here.

> + wait_remaining_ms_from_jiffies(jiffies,
> +(intel_dp->panel_power_cycle_delay - 
> panel_power_off_duration));

ditto

Otherwise it looks like it should do what we want, so with the minor
bikesheds sorted this is
Reviewed-by: Ville Syrjälä 

>  
>   wait_panel_status(intel_dp, IDLE_CYCLE_MASK, IDLE_CYCLE_VALUE);
>  }
> @@ -1969,7 +1978,7 @@ static void edp_panel_vdd_off_sync(struct intel_dp 
> *intel_dp)
>   I915_READ(pp_stat_reg), I915_READ(pp_ctrl_reg));
>  
>   if ((pp & POWER_TARGET_ON) == 0)
> - intel_dp->last_power_cycle = jiffies;
> + intel_dp->panel_power_off_time = ktime_get_boottime();
>  
>   power_domain = intel_display_port_aux_power_domain(intel_encoder);
>   intel_display_power_put(dev_priv, power_domain);
> @@ -2118,7 +2127,7 @@ static void edp_panel_off(struct intel_dp *intel_dp)
>   I915_WRITE(pp_ctrl_reg, pp);
>   POSTING_READ(pp_ctrl_reg);
>  
> - intel_dp->last_power_cycle = jiffies;
> + intel_dp->panel_power_off_time = ktime_get_boottime();
>   wait_panel_off(intel_dp);
>  
>   /* We got a reference when we enabled the VDD. */
> @@ -5122,7 +5131,7 @@ intel_dp_add_properties(struct intel_dp *intel_dp, 
> struct drm_connector *connect
>  
>  static void intel_dp_init_panel_power_timestamps(struct intel_dp *intel_dp)
>  {
> - intel_dp->last_power_cycle = jiffies;
> + intel_dp->panel_power_off_time = ktime_get_boottime();
>   intel_dp->last_power_on = jiffies;
>   intel_dp->last_backlight_off = jiffies;
>  }
> diff --git a/drivers/gpu/drm/i915/intel_drv.h 
> b/drivers/gpu/drm/i915/intel_drv.h
> index bdfe403..06b37b8 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -793,9 +793,9 @@ struct intel_dp {
>   int backlight_off_delay;
>   struct delayed_work panel_vdd_work;
>   bool want_panel_vdd;
> - unsigned long last_power_cycle;
>   unsigned long last_power_on;
>   unsigned long last_backlight_off;
> + ktime_t panel_power_off_time;
>  
>   struct notifier_block edp_notifier;
>  
> -- 
> 1.9.1

-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 10/38] drm/i915: Force MMIO flips when scheduler enabled

2016-01-12 Thread Daniel Vetter
On Tue, Jan 12, 2016 at 11:19:26AM +, John Harrison wrote:
> On 11/01/2016 22:16, Chris Wilson wrote:
> >On Mon, Jan 11, 2016 at 06:42:39PM +, john.c.harri...@intel.com wrote:
> >>From: John Harrison 
> >>
> >>MMIO flips are the preferred mechanism now but more importantly,
> >Says who?
> 
> I asked this exact question at the linux architecture forum quite some time
> ago - does the scheduler need to worry about managing non-batch buffer work
> such as page flips. The answer from everyone present was no, MMIO flips are
> the way to go so don't over complicate the scheduler trying to support ring
> flips. Indeed, execlist mode already forces MMIO flips anyway.

Atomic will kill CS flips. We can mourn them and scream about the loss,
but imo best is to just skip that all and move on to acceptance. So mmio
flips (or well, atomic flips) is still the way to go for everything.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 1/3] drm/i915: simplify allocation of driver-internal requests

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 01:56:48PM +, Chris Wilson wrote:
> But we were removing the engine->default_context as it complicated the
> rest of the code. I strongly prefer keeping the contexts explicit as
> context separation should be first and foremost in the driver.

$ git grep kernel_context -- drivers/gpu/drm/i915/ | grep request_alloc
drivers/gpu/drm/i915/i915_gem_evict.c:  req = 
i915_gem_request_alloc(ring, dev_priv->kernel_context);
drivers/gpu/drm/i915/intel_overlay.c:   return i915_gem_request_alloc(ring, 
dev_priv->kernel_context);

Changing those *two* callsites to pass NULL seems on the odd side, and
at least for the eviction case discards important information.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 021/190] drm/i915: Use HWS for seqno tracking everywhere

2016-01-12 Thread Mika Kuoppala
Chris Wilson  writes:

> On Tue, Jan 12, 2016 at 12:05:06PM +0200, Mika Kuoppala wrote:
>> Chris Wilson  writes:
>> > -  intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE |
>> > -  PIPE_CONTROL_WRITE_FLUSH |
>> > -  PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE);
>> > -  intel_ring_emit(ring, ring->scratch.gtt_offset | 
>> > PIPE_CONTROL_GLOBAL_GTT);
>> > -  intel_ring_emit(ring, i915_gem_request_get_seqno(req));
>> > +  intel_ring_emit(ring,
>> > +  GFX_OP_PIPE_CONTROL(4) |
>> > +  PIPE_CONTROL_QW_WRITE |
>> > +  PIPE_CONTROL_WRITE_FLUSH);
>> 
>> Why no more PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE?
>
> I opened vim to add it back in and I coulnd't bring myself to commit
> that attrocity.

I just noticed the asymmetry. Ilk doesn't need it?

-Mika

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Assign crtc correctly in load detection.

2016-01-12 Thread Maarten Lankhorst
Op 12-01-16 om 13:34 schreef Daniel Vetter:
> On Tue, Jan 12, 2016 at 12:35:59PM +0100, Maarten Lankhorst wrote:
>> drm_atomic_set_crtc_for_connector should be used,
>> and crtc->primary->crtc is assigned by atomic_commit.
>>
>> Rely on the helpers for setting this correctly, so
>> connector_mask gets updated too.
>>
>> Signed-off-by: Maarten Lankhorst 
> Reviewed-by: Daniel Vetter 

After examining the code some more I think this fix is incomplete.

It also needs to do the same on release and if you set i915.nuclear_pageflip 
you'll get a WARN since mode_blob's not set.
Fixing this will break release_load_detect which doesn't unset it.
Would the code work?

Cc'ing Ville since he may be able to test it.

--- >8 ---

drm/i915: Use atomic state to obtain load detection crtc.

Signed-off-by: Maarten Lankhorst 
---
diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index bc2ec444925e..9eb1f4e263c6 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10409,6 +10409,7 @@ mode_fits_in_fbdev(struct drm_device *dev,
if (obj->base.size < mode->vdisplay * fb->pitches[0])
return NULL;
 
+   drm_framebuffer_reference(fb);
return fb;
 #else
return NULL;
@@ -10474,6 +10475,9 @@ bool intel_get_load_detect_pipe(struct drm_connector 
*connector,
  encoder->base.id, encoder->name);
 
 retry:
+   old->old_pipe_config = NULL;
+   old->old_plane_state = NULL;
+
ret = drm_modeset_lock(>connection_mutex, ctx);
if (ret)
goto fail;
@@ -10489,24 +10493,15 @@ retry:
 */
 
/* See if we already have a CRTC for this connector */
-   if (encoder->crtc) {
-   crtc = encoder->crtc;
+   if (connector->state->crtc) {
+   crtc = connector->state->crtc;
 
ret = drm_modeset_lock(>mutex, ctx);
if (ret)
goto fail;
-   ret = drm_modeset_lock(>primary->mutex, ctx);
-   if (ret)
-   goto fail;
-
-   old->dpms_mode = connector->dpms;
-   old->load_detect_temp = false;
 
/* Make sure the crtc and connector are running */
-   if (connector->dpms != DRM_MODE_DPMS_ON)
-   connector->funcs->dpms(connector, DRM_MODE_DPMS_ON);
-
-   return true;
+   goto found;
}
 
/* Find an unused one (if possible) */
@@ -10514,8 +10509,15 @@ retry:
i++;
if (!(encoder->possible_crtcs & (1 << i)))
continue;
-   if (possible_crtc->state->enable)
+
+   ret = drm_modeset_lock(>mutex, ctx);
+   if (ret)
+   goto fail;
+
+   if (possible_crtc->state->enable) {
+   drm_modeset_unlock(>mutex);
continue;
+   }
 
crtc = possible_crtc;
break;
@@ -10529,17 +10531,19 @@ retry:
goto fail;
}
 
-   ret = drm_modeset_lock(>mutex, ctx);
-   if (ret)
-   goto fail;
+found:
+   intel_crtc = to_intel_crtc(crtc);
+
ret = drm_modeset_lock(>primary->mutex, ctx);
if (ret)
goto fail;
 
-   intel_crtc = to_intel_crtc(crtc);
-   old->dpms_mode = connector->dpms;
-   old->load_detect_temp = true;
-   old->release_fb = NULL;
+   old->old_pipe_config = intel_crtc_duplicate_state(crtc);
+   old->old_plane_state = intel_plane_duplicate_state(crtc->primary);
+   if (!old->old_pipe_config || !old->old_plane_state) {
+   ret = -ENOMEM;
+   goto fail;
+   }
 
state = drm_atomic_state_alloc(dev);
if (!state)
@@ -10553,7 +10557,9 @@ retry:
goto fail;
}
 
-   connector_state->crtc = crtc;
+   ret = drm_atomic_set_crtc_for_connector(connector_state, crtc);
+   if (ret)
+   goto fail;
 
crtc_state = intel_atomic_get_crtc_state(state, intel_crtc);
if (IS_ERR(crtc_state)) {
@@ -10577,7 +10583,6 @@ retry:
if (fb == NULL) {
DRM_DEBUG_KMS("creating tmp fb for load-detection\n");
fb = intel_framebuffer_create_for_mode(dev, mode, 24, 32);
-   old->release_fb = fb;
} else
DRM_DEBUG_KMS("reusing fbdev for load-detection framebuffer\n");
if (IS_ERR(fb)) {
@@ -10589,15 +10594,16 @@ retry:
if (ret)
goto fail;
 
-   drm_mode_copy(_state->base.mode, mode);
+   drm_framebuffer_unreference(fb);
+
+   ret = drm_atomic_set_mode_for_crtc(_state->base, mode);
+   if (ret)
+   goto fail;
 
if (drm_atomic_commit(state)) {

Re: [Intel-gfx] [PATCH 145/190] drm/i915: Stop discarding GTT cache-domain on unbind vma

2016-01-12 Thread Joonas Lahtinen
On ma, 2016-01-11 at 11:00 +, Chris Wilson wrote:
> Since
> 
> commit 43566dedde54f9729113f5f9fde77d53e75e61e9
> Author: Chris Wilson 
> Date:   Fri Jan 2 16:29:29 2015 +0530
> 
> drm/i915: Broaden application of set-domain(GTT)
> 
> we allowed objects to be in the GTT domain, but unbound. Therefore
> removing the GTT cache domain when removing the GGTT vma is no longer
> semantically correct.
> 
> An unfortunate side-effect is we lose the wondrously named
> i915_gem_object_finish_gtt(), not to be confused with
> i915_gem_gtt_finish_object()!
> 
> Signed-off-by: Chris Wilson 
> Cc: Akash Goel 
> Cc: Joonas Lahtinen 

I'm fairly sure I did this already in the past, but here goes again...

Reviewed-by: Joonas Lahtinen 

> Cc: Tvrtko Ursulin 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 26 +++---
>  1 file changed, 3 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c
> b/drivers/gpu/drm/i915/i915_gem.c
> index 6ceed074f738..08287d8857c9 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2618,27 +2618,6 @@ i915_gem_object_sync(struct
> drm_i915_gem_object *obj,
>   return 0;
>  }
>  
> -static void i915_gem_object_finish_gtt(struct drm_i915_gem_object
> *obj)
> -{
> - u32 old_write_domain, old_read_domains;
> -
> - /* Force a pagefault for domain tracking on next user access
> */
> - i915_gem_release_mmap(obj);
> -
> - if ((obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0)
> - return;
> -
> - old_read_domains = obj->base.read_domains;
> - old_write_domain = obj->base.write_domain;
> -
> - obj->base.read_domains &= ~I915_GEM_DOMAIN_GTT;
> - obj->base.write_domain &= ~I915_GEM_DOMAIN_GTT;
> -
> - trace_i915_gem_object_change_domain(obj,
> - old_read_domains,
> - old_write_domain);
> -}
> -
>  static void i915_vma_destroy(struct i915_vma *vma)
>  {
>   GEM_BUG_ON(vma->node.allocated);
> @@ -2691,13 +2670,14 @@ int i915_vma_unbind(struct i915_vma *vma)
>   GEM_BUG_ON(obj->pages == NULL);
>  
>   if (vma->map_and_fenceable) {
> - i915_gem_object_finish_gtt(obj);
> -
>   /* release the fence reg _after_ flushing */
>   ret = i915_vma_put_fence(vma);
>   if (ret)
>   return ret;
>  
> + /* Force a pagefault for domain tracking on next
> user access */
> + i915_gem_release_mmap(obj);
> +
>   if (vma->iomap) {
>   iounmap(vma->iomap);
>   vma->iomap = NULL;
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Intel-gfx Digest, Vol 96, Issue 26

2016-01-12 Thread Shubhangi Shrivastava

Hi,

Can someone review the patches in the below mail?
PFB the link to the same:
   https://patchwork.freedesktop.org/series/369/#rev5

Thanks and Regards,
Shubhangi Shrivastava.


On Tuesday 05 January 2016 06:28 PM, 
intel-gfx-requ...@lists.freedesktop.org wrote:

Send Intel-gfx mailing list submissions to
intel-gfx@lists.freedesktop.org

To subscribe or unsubscribe via the World Wide Web, visit
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
or, via email, send a message with subject or body 'help' to
intel-gfx-requ...@lists.freedesktop.org

You can reach the person managing the list at
intel-gfx-ow...@lists.freedesktop.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Intel-gfx digest..."


Today's Topics:

1. [PATCH 2/6] drm/i915: Cleaning up intel_dp_hpd_pulse
   (Shubhangi Shrivastava)
2. [PATCH 0/6] Fixing sink count related detection over
   (Shubhangi Shrivastava)
3. [PATCH 4/6] drm/i915: Save sink_count for tracking   changes to
   it (Shubhangi Shrivastava)
4. [PATCH 3/6] drm/i915: Splitting  intel_dp_check_link_status
   (Shubhangi Shrivastava)
5. [PATCH 5/6] drm/i915: read sink_count dpcd always
   (Shubhangi Shrivastava)
6. [PATCH 6/6] drm/i915: force full detect on sink countchange
   (Shubhangi Shrivastava)
7. Re: [PATCH 0/2] DPCD Backlight Control (Adebisi, YetundeX)


--

Message: 1
Date: Tue,  5 Jan 2016 18:20:22 +0530
From: Shubhangi Shrivastava 
To: intel-gfx@lists.freedesktop.org
Cc: Shubhangi Shrivastava 
Subject: [Intel-gfx] [PATCH 2/6] drm/i915: Cleaning up
intel_dp_hpd_pulse
Message-ID:
<1451998226-21433-3-git-send-email-shubhangi.shrivast...@intel.com>

Current DP detection has DPCD operations split across
intel_dp_hpd_pulse and intel_dp_detect which contains
duplicates as well. Also intel_dp_detect is called
during modes enumeration as well which will result
in multiple dpcd operations. So this patch tries
to solve both these by bringing all DPCD operations
in one single function and make intel_dp_detect
use existing values instead of repeating same steps.

v2: Pulled in a hunk from last patch of the series to
 this patch. (Ander)
v3: Added MST hotplug handling. (Ander)

Tested-by: Nathan D Ciobanu 
Signed-off-by: Sivakumar Thulasimani 
Signed-off-by: Shubhangi Shrivastava 
---
  drivers/gpu/drm/i915/intel_dp.c | 71 +
  1 file changed, 44 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index e3b4208..137757b 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -4645,8 +4645,19 @@ intel_dp_long_pulse(struct intel_connector 
*intel_connector)
intel_dp_probe_oui(intel_dp);
  
  	ret = intel_dp_probe_mst(intel_dp);

-   if (ret)
+   if (ret) {
+   goto out;
+   } else if (connector->status == connector_status_connected) {
+   /*
+* If display was connected already and is still connected
+* check links status, there has been known issues of
+* link loss triggerring long pulse
+*/
+   drm_modeset_lock(>mode_config.connection_mutex, NULL);
+   intel_dp_check_link_status(intel_dp);
+   drm_modeset_unlock(>mode_config.connection_mutex);
goto out;
+   }
  
  	/*

 * Clearing NACK and defer counts to get their exact values
@@ -4677,8 +4688,21 @@ intel_dp_long_pulse(struct intel_connector 
*intel_connector)
}
  
  out:

-   if (status != connector_status_connected)
+   if (status != connector_status_connected) {
intel_dp_unset_edid(intel_dp);
+   /*
+* If we were in MST mode, and device is not there,
+* get out of MST mode
+*/
+   if (intel_dp->is_mst) {
+   DRM_DEBUG_KMS("MST device may have disappeared %d vs 
%d\n",
+   intel_dp->is_mst, intel_dp->mst_mgr.mst_state);
+   intel_dp->is_mst = false;
+   drm_dp_mst_topology_mgr_set_mst(_dp->mst_mgr,
+   intel_dp->is_mst);
+   }
+   }
+
intel_display_power_put(to_i915(dev), power_domain);
return;
  }
@@ -4701,7 +4725,8 @@ intel_dp_detect(struct drm_connector *connector, bool 
force)
return connector_status_disconnected;
}
  
-	intel_dp_long_pulse(intel_dp->attached_connector);

+   if (force)
+   

Re: [Intel-gfx] [PATCH 074/190] drm/i915: Rename request->list to link for consistency

2016-01-12 Thread Tvrtko Ursulin


On 11/01/16 09:17, Chris Wilson wrote:

We use "list" to denote the list and "link" to denote an element on that
list. Rename request->list to match this idiom.

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_debugfs.c |  4 ++--
  drivers/gpu/drm/i915/i915_gem.c | 12 ++--
  drivers/gpu/drm/i915/i915_gem_request.c | 10 +-
  drivers/gpu/drm/i915/i915_gem_request.h |  4 ++--
  drivers/gpu/drm/i915/i915_gpu_error.c   |  4 ++--
  drivers/gpu/drm/i915/intel_ringbuffer.c |  6 +++---
  6 files changed, 20 insertions(+), 20 deletions(-)


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko



diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 65cb1d6a5d64..efa9572fc217 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -695,13 +695,13 @@ static int i915_gem_request_info(struct seq_file *m, void 
*data)
int count;

count = 0;
-   list_for_each_entry(req, >request_list, list)
+   list_for_each_entry(req, >request_list, link)
count++;
if (count == 0)
continue;

seq_printf(m, "%s requests: %d\n", ring->name, count);
-   list_for_each_entry(req, >request_list, list) {
+   list_for_each_entry(req, >request_list, link) {
struct task_struct *task;

rcu_read_lock();
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 77c253ddf060..f314b3ea2726 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2183,7 +2183,7 @@ i915_gem_find_active_request(struct intel_engine_cs *ring)
 * extra delay for a recent interrupt is pointless. Hence, we do
 * not need an engine->irq_seqno_barrier() before the seqno reads.
 */
-   list_for_each_entry(request, >request_list, list) {
+   list_for_each_entry(request, >request_list, link) {
if (i915_gem_request_completed(request))
continue;

@@ -2208,7 +2208,7 @@ static void i915_gem_reset_ring_status(struct 
intel_engine_cs *ring)

i915_set_reset_status(dev_priv, request->ctx, ring_hung);

-   list_for_each_entry_continue(request, >request_list, list)
+   list_for_each_entry_continue(request, >request_list, link)
i915_set_reset_status(dev_priv, request->ctx, false);
  }

@@ -2255,7 +2255,7 @@ static void i915_gem_reset_ring_cleanup(struct 
intel_engine_cs *engine)

request = list_last_entry(>request_list,
  struct drm_i915_gem_request,
- list);
+ link);

i915_gem_request_retire_upto(request);
}
@@ -2317,7 +2317,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs 
*ring)

request = list_first_entry(>request_list,
   struct drm_i915_gem_request,
-  list);
+  link);

if (!i915_gem_request_completed(request))
break;
@@ -2336,7 +2336,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs 
*ring)
  struct drm_i915_gem_object,
  ring_list[ring->id]);

-   if (!list_empty(>last_read[ring->id].request->list))
+   if (!list_empty(>last_read[ring->id].request->link))
break;

i915_gem_object_retire__read(obj, ring->id);
@@ -2449,7 +2449,7 @@ i915_gem_object_flush_active(struct drm_i915_gem_object 
*obj)
if (req == NULL)
continue;

-   if (list_empty(>list))
+   if (list_empty(>link))
goto retire;

if (i915_gem_request_completed(req)) {
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index 01443d8d9224..7f38d8972721 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -333,7 +333,7 @@ void i915_gem_request_cancel(struct drm_i915_gem_request 
*req)
  static void i915_gem_request_retire(struct drm_i915_gem_request *request)
  {
trace_i915_gem_request_retire(request);
-   list_del_init(>list);
+   list_del_init(>link);

/* We know the GPU must have read the request to have
 * sent us the seqno + interrupt, so use the position
@@ -355,12 +355,12 @@ i915_gem_request_retire_upto(struct drm_i915_gem_request 
*req)

lockdep_assert_held(>dev->struct_mutex);

-   if (list_empty(>list))
+   if (list_empty(>link))
return;

do {
   

Re: [Intel-gfx] [PATCH v3 3/3] drm/i915: tidy up a few leftovers

2016-01-12 Thread Daniel Vetter
On Thu, Jan 07, 2016 at 10:20:52AM +, Dave Gordon wrote:
> There are a few bits of code which the transformations implemented by
> the previous patch reveal to be suboptimal, once the notion of a per-
> ring default context has gone away. So this tidies up the leftovers.
> 
> It could have been squashed into the previous patch, but that would have
> made that patch less clearly a simple transformation. In particular, any
> change which alters the code block structure or indentation has been
> deferred into this separate patch, because such things tend to make diffs
> more difficult to read.
> 
> Signed-off-by: Dave Gordon 

Yeah I think this was good to be split up, since I'm not convinced it's
a net improvement (from a quick look at least). Still plenty of default
context checks that seem superfluous (e.g. not pinning the default context
when going through execlist submission is imo a needless special case -
besides that we really should use active tracking).

So I'll refrain from ack or nack on this one.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 15 +--
>  drivers/gpu/drm/i915/i915_gem.c |  6 ++
>  drivers/gpu/drm/i915/intel_lrc.c| 38 
> +
>  3 files changed, 24 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 2613708..bbb23da 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1942,11 +1942,8 @@ static int i915_context_status(struct seq_file *m, 
> void *unused)
>  
>   seq_puts(m, "HW context ");
>   describe_ctx(m, ctx);
> - for_each_ring(ring, dev_priv, i) {
> - if (dev_priv->kernel_context == ctx)
> - seq_printf(m, "(default context %s) ",
> -ring->name);
> - }
> + if (ctx == dev_priv->kernel_context)
> + seq_printf(m, "(kernel context) ");
>  
>   if (i915.enable_execlists) {
>   seq_putc(m, '\n');
> @@ -2037,13 +2034,11 @@ static int i915_dump_lrc(struct seq_file *m, void 
> *unused)
>   if (ret)
>   return ret;
>  
> - list_for_each_entry(ctx, _priv->context_list, link) {
> - for_each_ring(ring, dev_priv, i) {
> - if (dev_priv->kernel_context != ctx)
> + list_for_each_entry(ctx, _priv->context_list, link)
> + if (ctx != dev_priv->kernel_context)
> + for_each_ring(ring, dev_priv, i)
>   i915_dump_lrc_obj(m, ring,
> ctx->engine[i].state);
> - }
> - }
>  
>   mutex_unlock(>struct_mutex);
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 8f101121..4f45eb2 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2677,10 +2677,8 @@ void i915_gem_request_free(struct kref *req_ref)
>   i915_gem_request_remove_from_client(req);
>  
>   if (ctx) {
> - if (i915.enable_execlists) {
> - if (ctx != req->i915->kernel_context)
> - intel_lr_context_unpin(req);
> - }
> + if (i915.enable_execlists && ctx != req->i915->kernel_context)
> + intel_lr_context_unpin(req);
>  
>   i915_gem_context_unreference(ctx);
>   }
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> b/drivers/gpu/drm/i915/intel_lrc.c
> index 5a3..8c4c9b9 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -660,16 +660,10 @@ static int execlists_move_to_gpu(struct 
> drm_i915_gem_request *req,
>  
>  int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request 
> *request)
>  {
> - int ret;
> + int ret = 0;
>  
>   request->ringbuf = request->ctx->engine[request->ring->id].ringbuf;
>  
> - if (request->ctx != request->i915->kernel_context) {
> - ret = intel_lr_context_pin(request);
> - if (ret)
> - return ret;
> - }
> -
>   if (i915.enable_guc_submission) {
>   /*
>* Check that the GuC has space for the request before
> @@ -683,7 +677,10 @@ int intel_logical_ring_alloc_request_extras(struct 
> drm_i915_gem_request *request
>   return ret;
>   }
>  
> - return 0;
> + if (request->ctx != request->i915->kernel_context)
> + ret = intel_lr_context_pin(request);
> +
> + return ret;
>  }
>  
>  static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
> @@ -2382,22 +2379,21 @@ void intel_lr_context_free(struct intel_context *ctx)
>  {
>   int i;
>  
> - for (i = 0; i < I915_NUM_RINGS; i++) {
> + for (i = 

Re: [Intel-gfx] [PATCH v4 02/38] drm/i915: Explicit power enable during deferred context initialisation

2016-01-12 Thread Daniel Vetter
On Tue, Jan 12, 2016 at 11:50:34AM +, John Harrison wrote:
> On 12/01/2016 11:28, Chris Wilson wrote:
> >On Tue, Jan 12, 2016 at 11:11:20AM +, John Harrison wrote:
> >>On 12/01/2016 00:20, Chris Wilson wrote:
> >>>On Mon, Jan 11, 2016 at 06:42:31PM +, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> A later patch in this series re-organises the batch buffer submission
> code. Part of that is to reduce the scope of a pm_get/put pair.
> Specifically, they previously wrapped the entire submission path from
> the very start to the very end, now they only wrap the actual hardware
> submission part in the back half.
> >>>However, as you haven't fixed the ordering issue that requires rpm_get
> >>>before struct_mutex, this is broken.
> >>Why does 'intel_runtime_pm_get' require the struct mutex to be held?
> >>It has certainly not complained at me about trying to do stuff
> >>without it.
> >Because it depends upon the struct_mutex and rpm doesn't have sufficient
> >lockdep integration to be able to warn about using rpm from the
> >incorrect contexts.
> Where? What does the 'pm_runtime_get_sync' call turn into? There are already
> other places in the driver which call intel_runtime_pm_get() immediately
> after grabbing the mutex lock. Also, the description comment for _pm_get()
> does not mention anything about mutexes at all.

If you nest rpm_get within dev->struct_mutex that's a bug and could
deadlock. Where does this happen? And for any such place we need a new
subtest in pm_rpm.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 16/22] drm/omap: Nuke close hooks

2016-01-12 Thread Tomi Valkeinen
On 11/01/16 23:41, Daniel Vetter wrote:
> Again since the core takes care of this we can remove them. While at
> it also remove the postclose hook, it's empty.
> 
> v2: Laurent pointed me at even more code to delete.
> 
> Cc: Laurent Pinchart 
> Cc: Tomi Valkeinen 
> Acked-by: Daniel Stone 
> Reviewed-by: Alex Deucher 
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/omapdrm/omap_crtc.c | 13 +---
>  drivers/gpu/drm/omapdrm/omap_drv.c  | 41 
> -
>  drivers/gpu/drm/omapdrm/omap_drv.h  |  1 -
>  3 files changed, 1 insertion(+), 54 deletions(-)

Acked-by: Tomi Valkeinen 

 Tomi



signature.asc
Description: OpenPGP digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 20/22] drm/tilcdc: Nuke preclose hook

2016-01-12 Thread Tomi Valkeinen

On 11/01/16 23:41, Daniel Vetter wrote:
> Again since the drm core takes care of event unlinking/disarming this
> is now just needless code.
> 
> v2: Fixup misplaced hunks.
> 
> Cc: Rob Clark 
> Acked-by: Daniel Stone 
> Reviewed-by: Alex Deucher  (v1)
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/tilcdc/tilcdc_crtc.c | 20 
>  drivers/gpu/drm/tilcdc/tilcdc_drv.c  |  8 
>  drivers/gpu/drm/tilcdc/tilcdc_drv.h  |  1 -
>  3 files changed, 29 deletions(-)
> 
> diff --git a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c 
> b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
> index 7d07733bdc86..4802da8e6d6f 100644
> --- a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
> +++ b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
> @@ -662,26 +662,6 @@ irqreturn_t tilcdc_crtc_irq(struct drm_crtc *crtc)
>   return IRQ_HANDLED;
>  }
>  
> -void tilcdc_crtc_cancel_page_flip(struct drm_crtc *crtc, struct drm_file 
> *file)
> -{
> - struct tilcdc_crtc *tilcdc_crtc = to_tilcdc_crtc(crtc);
> - struct drm_pending_vblank_event *event;
> - struct drm_device *dev = crtc->dev;
> - unsigned long flags;
> -
> - /* Destroy the pending vertical blanking event associated with the
> -  * pending page flip, if any, and disable vertical blanking interrupts.
> -  */
> - spin_lock_irqsave(>event_lock, flags);
> - event = tilcdc_crtc->event;
> - if (event && event->base.file_priv == file) {
> - tilcdc_crtc->event = NULL;
> - event->base.destroy(>base);
> - drm_vblank_put(dev, 0);
> - }
> - spin_unlock_irqrestore(>event_lock, flags);
> -}
> -

Hmm, looks fine, but when I was comparing the omapdrm change and this
one, I see tilcdc doing drm_vblank_put() in the removed code but omapdrm
doesn't.

The other patches that nuke preclose hooks also contain vblank_put. Will
there be a vblank_put call missing here, or will there be an extra
vblank_put call happening somewhere on omapdrm?

 Tomi



signature.asc
Description: OpenPGP digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 006/190] drm/i915: Add GEM debugging Kconfig option

2016-01-12 Thread Dave Gordon

On 11/01/16 09:16, Chris Wilson wrote:

Currently there is a #define to enable extra BUG_ON for debugging
requests and associated activities. I want to expand its use to cover
all of GEM internals (so that we can saturate the code with asserts).
We can add a Kconfig option to make it easier to enable - with the usual
caveats of not enabling unless explicitly requested.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/Kconfig.debug |  8 
  drivers/gpu/drm/i915/i915_drv.h|  6 ++
  drivers/gpu/drm/i915/i915_gem.c| 12 +---
  3 files changed, 19 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 1f10ee228eda..7fa6b97635e5 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -10,3 +10,11 @@ config DRM_I915_WERROR
---help---
  Add -Werror to the build flags for (and only for) i915.ko.
  Do not enable this unless you are writing code for the i915.ko module.
+
+config DRM_I915_DEBUG_GEM
+   bool "Insert extra checks into the GEM internals"
+   default n
+   depends on DRM_I915_WERROR


This comes up as an option only if DRM_I915_WERROR is already selected? 
Surely it should be orthogonal to compile-time checks, with each as 
independent options but both restricted to EXPERT mode. So the line 
above should be "depends on EXPERT" not "depends on DRM_I915_WERROR"?



+   ---help---
+ Enable extra sanity checks (including BUGs) that may slow the
+  system down and if hit hang the machine.


"hang the machine if hit". Unless you want commas round "if hit"?

Otherwise looks OK.

.Dave.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 1/7] drm/i915: Pass modifier instead of tiling_mode to gen4_compute_page_offset()

2016-01-12 Thread ville . syrjala
From: Ville Syrjälä 

In preparation for handling more than X tiling, pass the fb modifier to
gen4_compute_page_offset() instead of the obj->tiling_mode.

Signed-off-by: Ville Syrjälä 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/intel_display.c | 12 ++--
 drivers/gpu/drm/i915/intel_drv.h |  4 ++--
 drivers/gpu/drm/i915/intel_sprite.c  | 21 ++---
 3 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index abfb5ba054db..0f8174051d5c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2449,11 +2449,11 @@ static void intel_unpin_fb_obj(struct drm_framebuffer 
*fb,
  * is assumed to be a power-of-two. */
 unsigned long intel_gen4_compute_page_offset(struct drm_i915_private *dev_priv,
 int *x, int *y,
-unsigned int tiling_mode,
+uint64_t fb_modifier,
 unsigned int cpp,
 unsigned int pitch)
 {
-   if (tiling_mode != I915_TILING_NONE) {
+   if (fb_modifier != DRM_FORMAT_MOD_NONE) {
unsigned int tile_rows, tiles;
 
tile_rows = *y / 8;
@@ -2769,8 +2769,8 @@ static void i9xx_update_primary_plane(struct drm_plane 
*primary,
 
if (INTEL_INFO(dev)->gen >= 4) {
intel_crtc->dspaddr_offset =
-   intel_gen4_compute_page_offset(dev_priv,
-  , , obj->tiling_mode,
+   intel_gen4_compute_page_offset(dev_priv, , ,
+  fb->modifier[0],
   pixel_size,
   fb->pitches[0]);
linear_offset -= intel_crtc->dspaddr_offset;
@@ -2877,8 +2877,8 @@ static void ironlake_update_primary_plane(struct 
drm_plane *primary,
 
linear_offset = y * fb->pitches[0] + x * pixel_size;
intel_crtc->dspaddr_offset =
-   intel_gen4_compute_page_offset(dev_priv,
-  , , obj->tiling_mode,
+   intel_gen4_compute_page_offset(dev_priv, , ,
+  fb->modifier[0],
   pixel_size,
   fb->pitches[0]);
linear_offset -= intel_crtc->dspaddr_offset;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index e27954d2edad..015538287171 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1198,8 +1198,8 @@ void assert_pipe(struct drm_i915_private *dev_priv, enum 
pipe pipe, bool state);
 #define assert_pipe_disabled(d, p) assert_pipe(d, p, false)
 unsigned long intel_gen4_compute_page_offset(struct drm_i915_private *dev_priv,
 int *x, int *y,
-unsigned int tiling_mode,
-unsigned int bpp,
+uint64_t fb_modifier,
+unsigned int cpp,
 unsigned int pitch);
 void intel_prepare_reset(struct drm_device *dev);
 void intel_finish_reset(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_sprite.c 
b/drivers/gpu/drm/i915/intel_sprite.c
index 4d448b990c50..fc5789e65a93 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -422,9 +422,8 @@ vlv_update_plane(struct drm_plane *dplane,
crtc_h--;
 
linear_offset = y * fb->pitches[0] + x * pixel_size;
-   sprsurf_offset = intel_gen4_compute_page_offset(dev_priv,
-   , ,
-   obj->tiling_mode,
+   sprsurf_offset = intel_gen4_compute_page_offset(dev_priv, , ,
+   fb->modifier[0],
pixel_size,
fb->pitches[0]);
linear_offset -= sprsurf_offset;
@@ -557,10 +556,10 @@ ivb_update_plane(struct drm_plane *plane,
sprscale = SPRITE_SCALE_ENABLE | (src_w << 16) | src_h;
 
linear_offset = y * fb->pitches[0] + x * pixel_size;
-   sprsurf_offset =
-   intel_gen4_compute_page_offset(dev_priv,
-  , , obj->tiling_mode,
-  pixel_size, fb->pitches[0]);
+   

[Intel-gfx] [PATCH v3 2/7] drm/i915: Factor out intel_tile_width()

2016-01-12 Thread ville . syrjala
From: Ville Syrjälä 

Pull the tile width calculations from intel_fb_stride_alignment() into a
new function intel_tile_width().

Also take the opportunity to pass aroun dev_priv instead of dev to
intel_fb_stride_alignment().

v2: Reorder argumnents to be more consistent with other functions
Change intel_fb_stride_alignment() to accept dev_priv instead of dev
v3: Deal with Y tilling (Daniel)

Signed-off-by: Ville Syrjälä 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/intel_display.c | 85 +---
 drivers/gpu/drm/i915/intel_drv.h |  4 +-
 drivers/gpu/drm/i915/intel_sprite.c  |  2 +-
 3 files changed, 54 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 0f8174051d5c..3a6387092f63 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2223,6 +2223,43 @@ static bool need_vtd_wa(struct drm_device *dev)
return false;
 }
 
+static unsigned int intel_tile_width(const struct drm_i915_private *dev_priv,
+uint64_t fb_modifier, unsigned int cpp)
+{
+   switch (fb_modifier) {
+   case DRM_FORMAT_MOD_NONE:
+   return cpp;
+   case I915_FORMAT_MOD_X_TILED:
+   if (IS_GEN2(dev_priv))
+   return 128;
+   else
+   return 512;
+   case I915_FORMAT_MOD_Y_TILED:
+   if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
+   return 128;
+   else
+   return 512;
+   case I915_FORMAT_MOD_Yf_TILED:
+   switch (cpp) {
+   case 1:
+   return 64;
+   case 2:
+   case 4:
+   return 128;
+   case 8:
+   case 16:
+   return 256;
+   default:
+   MISSING_CASE(cpp);
+   return cpp;
+   }
+   break;
+   default:
+   MISSING_CASE(fb_modifier);
+   return cpp;
+   }
+}
+
 unsigned int
 intel_tile_height(struct drm_device *dev, uint32_t pixel_format,
  uint64_t fb_format_modifier, unsigned int plane)
@@ -2914,37 +2951,15 @@ static void ironlake_update_primary_plane(struct 
drm_plane *primary,
POSTING_READ(reg);
 }
 
-u32 intel_fb_stride_alignment(struct drm_device *dev, uint64_t fb_modifier,
- uint32_t pixel_format)
+u32 intel_fb_stride_alignment(const struct drm_i915_private *dev_priv,
+ uint64_t fb_modifier, uint32_t pixel_format)
 {
-   u32 bits_per_pixel = drm_format_plane_cpp(pixel_format, 0) * 8;
-
-   /*
-* The stride is either expressed as a multiple of 64 bytes
-* chunks for linear buffers or in number of tiles for tiled
-* buffers.
-*/
-   switch (fb_modifier) {
-   case DRM_FORMAT_MOD_NONE:
-   return 64;
-   case I915_FORMAT_MOD_X_TILED:
-   if (INTEL_INFO(dev)->gen == 2)
-   return 128;
-   return 512;
-   case I915_FORMAT_MOD_Y_TILED:
-   /* No need to check for old gens and Y tiling since this is
-* about the display engine and those will be blocked before
-* we get here.
-*/
-   return 128;
-   case I915_FORMAT_MOD_Yf_TILED:
-   if (bits_per_pixel == 8)
-   return 64;
-   else
-   return 128;
-   default:
-   MISSING_CASE(fb_modifier);
+   if (fb_modifier == DRM_FORMAT_MOD_NONE) {
return 64;
+   } else {
+   int cpp = drm_format_plane_cpp(pixel_format, 0);
+
+   return intel_tile_width(dev_priv, fb_modifier, cpp);
}
 }
 
@@ -3118,7 +3133,7 @@ static void skylake_update_primary_plane(struct drm_plane 
*plane,
plane_ctl |= PLANE_CTL_PLANE_GAMMA_DISABLE;
plane_ctl |= skl_plane_ctl_rotation(rotation);
 
-   stride_div = intel_fb_stride_alignment(dev, fb->modifier[0],
+   stride_div = intel_fb_stride_alignment(dev_priv, fb->modifier[0],
   fb->pixel_format);
surf_addr = intel_plane_obj_offset(to_intel_plane(plane), obj, 0);
 
@@ -9303,7 +9318,7 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
fb->width = ((val >> 0) & 0x1fff) + 1;
 
val = I915_READ(PLANE_STRIDE(pipe, 0));
-   stride_mult = intel_fb_stride_alignment(dev, fb->modifier[0],
+   stride_mult = intel_fb_stride_alignment(dev_priv, fb->modifier[0],
fb->pixel_format);
fb->pitches[0] = (val & 0x3ff) * stride_mult;
 
@@ -11389,8 

[Intel-gfx] [PATCH 0/7] drm/i915: Reviewed fb offsets[] prep patches

2016-01-12 Thread ville . syrjala
From: Ville Syrjälä 

Here's a repost of some already reviewed patches from my larger fb 
offsets[] series [1] from last year, for the sake of the CI system.

[1] http://lists.freedesktop.org/archives/intel-gfx/2015-October/078050.html

Ville Syrjälä (7):
  drm/i915: Pass modifier instead of tiling_mode to
gen4_compute_page_offset()
  drm/i915: Factor out intel_tile_width()
  drm/i915: Redo intel_tile_height() as intel_tile_size() /
intel_tile_width()
  drm/i915: change intel_fill_fb_ggtt_view() to use the real tile size
  drm/i915: Use intel_tile_{size,width,height}() in
intel_gen4_compute_page_offset()
  drm/i915: s/intel_gen4_compute_page_offset/intel_compute_tile_offset/
  drm/i915: Refactor intel_surf_alignment()

 drivers/gpu/drm/i915/intel_display.c | 248 +--
 drivers/gpu/drm/i915/intel_drv.h |  19 ++-
 drivers/gpu/drm/i915/intel_sprite.c  |  32 ++---
 3 files changed, 145 insertions(+), 154 deletions(-)

-- 
2.4.10

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 5/7] drm/i915: Use intel_tile_{size, width, height}() in intel_gen4_compute_page_offset()

2016-01-12 Thread ville . syrjala
From: Ville Syrjälä 

Make intel_gen4_compute_page_offset() ready for other tiling formats
besied X-tile by getting the tile dimensions through
intel_tile_{size,width,height}().

Signed-off-by: Ville Syrjälä 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/intel_display.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index e75f61a56174..2ce166d71d3a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2464,15 +2464,20 @@ unsigned long intel_gen4_compute_page_offset(struct 
drm_i915_private *dev_priv,
 unsigned int pitch)
 {
if (fb_modifier != DRM_FORMAT_MOD_NONE) {
+   unsigned int tile_size, tile_width, tile_height;
unsigned int tile_rows, tiles;
 
-   tile_rows = *y / 8;
-   *y %= 8;
+   tile_size = intel_tile_size(dev_priv);
+   tile_width = intel_tile_width(dev_priv, fb_modifier, cpp);
+   tile_height = tile_size / tile_width;
+
+   tile_rows = *y / tile_height;
+   *y %= tile_height;
 
-   tiles = *x / (512/cpp);
-   *x %= 512/cpp;
+   tiles = *x / (tile_width/cpp);
+   *x %= tile_width/cpp;
 
-   return tile_rows * pitch * 8 + tiles * 4096;
+   return tile_rows * pitch * tile_height + tiles * tile_size;
} else {
unsigned int alignment = intel_linear_alignment(dev_priv) - 1;
unsigned int offset;
-- 
2.4.10

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 7/7] drm/i915: Refactor intel_surf_alignment()

2016-01-12 Thread ville . syrjala
From: Ville Syrjälä 

Pull the code to determine the surface alignment for both linear and
tiled surfaces into a separate function intel_surf_alignment(). This
will be used not only for the vma alignment but actually aligning
the plane SURF once SKL+ starts using intel_compute_page_offset()
(since SKL+ needs >4K alignment for tiled surfaces too).

Signed-off-by: Ville Syrjälä 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/intel_display.c | 45 +---
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 2e5b9f34fecd..0c3a398473ef 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2330,7 +2330,7 @@ intel_fill_fb_ggtt_view(struct i915_ggtt_view *view, 
struct drm_framebuffer *fb,
}
 }
 
-static unsigned int intel_linear_alignment(struct drm_i915_private *dev_priv)
+static unsigned int intel_linear_alignment(const struct drm_i915_private 
*dev_priv)
 {
if (INTEL_INFO(dev_priv)->gen >= 9)
return 256 * 1024;
@@ -2343,6 +2343,25 @@ static unsigned int intel_linear_alignment(struct 
drm_i915_private *dev_priv)
return 0;
 }
 
+static unsigned int intel_surf_alignment(const struct drm_i915_private 
*dev_priv,
+uint64_t fb_modifier)
+{
+   switch (fb_modifier) {
+   case DRM_FORMAT_MOD_NONE:
+   return intel_linear_alignment(dev_priv);
+   case I915_FORMAT_MOD_X_TILED:
+   if (INTEL_INFO(dev_priv)->gen >= 9)
+   return 256 * 1024;
+   return 0;
+   case I915_FORMAT_MOD_Y_TILED:
+   case I915_FORMAT_MOD_Yf_TILED:
+   return 1 * 1024 * 1024;
+   default:
+   MISSING_CASE(fb_modifier);
+   return 0;
+   }
+}
+
 int
 intel_pin_and_fence_fb_obj(struct drm_plane *plane,
   struct drm_framebuffer *fb,
@@ -2357,29 +2376,7 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
 
WARN_ON(!mutex_is_locked(>struct_mutex));
 
-   switch (fb->modifier[0]) {
-   case DRM_FORMAT_MOD_NONE:
-   alignment = intel_linear_alignment(dev_priv);
-   break;
-   case I915_FORMAT_MOD_X_TILED:
-   if (INTEL_INFO(dev)->gen >= 9)
-   alignment = 256 * 1024;
-   else {
-   /* pin() will align the object as required by fence */
-   alignment = 0;
-   }
-   break;
-   case I915_FORMAT_MOD_Y_TILED:
-   case I915_FORMAT_MOD_Yf_TILED:
-   if (WARN_ONCE(INTEL_INFO(dev)->gen < 9,
- "Y tiling bo slipped through, driver bug!\n"))
-   return -EINVAL;
-   alignment = 1 * 1024 * 1024;
-   break;
-   default:
-   MISSING_CASE(fb->modifier[0]);
-   return -EINVAL;
-   }
+   alignment = intel_surf_alignment(dev_priv, fb->modifier[0]);
 
intel_fill_fb_ggtt_view(, fb, plane_state);
 
-- 
2.4.10

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 4/7] drm/i915: change intel_fill_fb_ggtt_view() to use the real tile size

2016-01-12 Thread ville . syrjala
From: Ville Syrjälä 

Use the actual tile size as to compute stuff in
intel_fill_fb_ggtt_view() instead of assuming it's PAGE_SIZE. I suppose
it doesn't matter since we don't use the results on gen2 platforms
where the tile size is 2k.

v2: Update due to CbCr plane

Signed-off-by: Ville Syrjälä 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/intel_display.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index e2c7ae90b39d..e75f61a56174 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2291,8 +2291,7 @@ intel_fill_fb_ggtt_view(struct i915_ggtt_view *view, 
struct drm_framebuffer *fb,
 {
struct drm_i915_private *dev_priv = to_i915(fb->dev);
struct intel_rotation_info *info = >params.rotation_info;
-   unsigned int tile_height, tile_pitch;
-   unsigned int cpp = drm_format_plane_cpp(fb->pixel_format, 0);
+   unsigned int tile_size, tile_width, tile_height, cpp;
 
*view = i915_ggtt_view_normal;
 
@@ -2310,19 +2309,24 @@ intel_fill_fb_ggtt_view(struct i915_ggtt_view *view, 
struct drm_framebuffer *fb,
info->uv_offset = fb->offsets[1];
info->fb_modifier = fb->modifier[0];
 
-   tile_height = intel_tile_height(dev_priv, fb->modifier[0], cpp);
-   tile_pitch = PAGE_SIZE / tile_height;
-   info->width_pages = DIV_ROUND_UP(fb->pitches[0], tile_pitch);
+   tile_size = intel_tile_size(dev_priv);
+
+   cpp = drm_format_plane_cpp(fb->pixel_format, 0);
+   tile_width = intel_tile_width(dev_priv, cpp, fb->modifier[0]);
+   tile_height = tile_size / tile_width;
+
+   info->width_pages = DIV_ROUND_UP(fb->pitches[0], tile_width);
info->height_pages = DIV_ROUND_UP(fb->height, tile_height);
-   info->size = info->width_pages * info->height_pages * PAGE_SIZE;
+   info->size = info->width_pages * info->height_pages * tile_size;
 
if (info->pixel_format == DRM_FORMAT_NV12) {
cpp = drm_format_plane_cpp(fb->pixel_format, 1);
-   tile_height = intel_tile_height(dev_priv, fb->modifier[1], cpp);
-   tile_pitch = PAGE_SIZE / tile_height;
-   info->width_pages_uv = DIV_ROUND_UP(fb->pitches[1], tile_pitch);
+   tile_width = intel_tile_width(dev_priv, fb->modifier[1], cpp);
+   tile_height = tile_size / tile_width;
+
+   info->width_pages_uv = DIV_ROUND_UP(fb->pitches[1], tile_width);
info->height_pages_uv = DIV_ROUND_UP(fb->height / 2, 
tile_height);
-   info->size_uv = info->width_pages_uv * info->height_pages_uv * 
PAGE_SIZE;
+   info->size_uv = info->width_pages_uv * info->height_pages_uv * 
tile_size;
}
 }
 
-- 
2.4.10

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 6/7] drm/i915: s/intel_gen4_compute_page_offset/intel_compute_tile_offset/

2016-01-12 Thread ville . syrjala
From: Ville Syrjälä 

Since intel_gen4_compute_page_offset() can now handle tiling formats
all the way down to gen2, rename it to intel_compute_tile_offset().
Not that we actually use it on gen2/3 since there's no DSPSURF etc.
registers which would take a page aligned address.

v2: s/page/tile/ (Daniel)

Signed-off-by: Ville Syrjälä 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/intel_display.c | 26 +-
 drivers/gpu/drm/i915/intel_drv.h | 10 +-
 drivers/gpu/drm/i915/intel_sprite.c  | 24 
 3 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 2ce166d71d3a..2e5b9f34fecd 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2457,11 +2457,11 @@ static void intel_unpin_fb_obj(struct drm_framebuffer 
*fb,
 
 /* Computes the linear offset to the base tile and adjusts x, y. bytes per 
pixel
  * is assumed to be a power-of-two. */
-unsigned long intel_gen4_compute_page_offset(struct drm_i915_private *dev_priv,
-int *x, int *y,
-uint64_t fb_modifier,
-unsigned int cpp,
-unsigned int pitch)
+unsigned long intel_compute_tile_offset(struct drm_i915_private *dev_priv,
+   int *x, int *y,
+   uint64_t fb_modifier,
+   unsigned int cpp,
+   unsigned int pitch)
 {
if (fb_modifier != DRM_FORMAT_MOD_NONE) {
unsigned int tile_size, tile_width, tile_height;
@@ -2784,10 +2784,10 @@ static void i9xx_update_primary_plane(struct drm_plane 
*primary,
 
if (INTEL_INFO(dev)->gen >= 4) {
intel_crtc->dspaddr_offset =
-   intel_gen4_compute_page_offset(dev_priv, , ,
-  fb->modifier[0],
-  pixel_size,
-  fb->pitches[0]);
+   intel_compute_tile_offset(dev_priv, , ,
+ fb->modifier[0],
+ pixel_size,
+ fb->pitches[0]);
linear_offset -= intel_crtc->dspaddr_offset;
} else {
intel_crtc->dspaddr_offset = linear_offset;
@@ -2892,10 +2892,10 @@ static void ironlake_update_primary_plane(struct 
drm_plane *primary,
 
linear_offset = y * fb->pitches[0] + x * pixel_size;
intel_crtc->dspaddr_offset =
-   intel_gen4_compute_page_offset(dev_priv, , ,
-  fb->modifier[0],
-  pixel_size,
-  fb->pitches[0]);
+   intel_compute_tile_offset(dev_priv, , ,
+ fb->modifier[0],
+ pixel_size,
+ fb->pitches[0]);
linear_offset -= intel_crtc->dspaddr_offset;
if (plane_state->base.rotation == BIT(DRM_ROTATE_180)) {
dspcntr |= DISPPLANE_ROTATE_180;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 68d3b81a..059b46e22c31 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1195,11 +1195,11 @@ void assert_fdi_rx_pll(struct drm_i915_private 
*dev_priv,
 void assert_pipe(struct drm_i915_private *dev_priv, enum pipe pipe, bool 
state);
 #define assert_pipe_enabled(d, p) assert_pipe(d, p, true)
 #define assert_pipe_disabled(d, p) assert_pipe(d, p, false)
-unsigned long intel_gen4_compute_page_offset(struct drm_i915_private *dev_priv,
-int *x, int *y,
-uint64_t fb_modifier,
-unsigned int cpp,
-unsigned int pitch);
+unsigned long intel_compute_tile_offset(struct drm_i915_private *dev_priv,
+   int *x, int *y,
+   uint64_t fb_modifier,
+   unsigned int cpp,
+   unsigned int pitch);
 void intel_prepare_reset(struct drm_device *dev);
 void intel_finish_reset(struct drm_device *dev);
 void hsw_enable_pc8(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_sprite.c 
b/drivers/gpu/drm/i915/intel_sprite.c
index 39a7be02c181..0875c8e0ec0a 

Re: [Intel-gfx] [PATCH 004/190] drm/i915: Fix some invalid requests cancellations

2016-01-12 Thread Dave Gordon

On 11/01/16 09:16, Chris Wilson wrote:

As we add the VMA to the request early, it may be cancelled during
execbuf reservation. This will leave the context object pointing to a
dangling request; i915_wait_request() simply skips the wait and so we
may unbind the object whilst it is still active.


I don't understand this; context objects don't point to requests, it's 
vice versa. The request has a pointer to the engine, and to the context, 
and adds +1 to the refcount on the latter.



However, if at any point we make a change to the hardware (and equally
importantly our bookkeeping in the driver), we cannot cancel the request
as what has already been written must be submitted. Submitting a partial
request is far easier than trying to unwind the incomplete change.


What change could be made to the hardware? Engine reset, perhaps, but 
even that doesn't necessarily invalidate a request in preparation for 
sending to the hardware.


Submitting a partial change seems likely to leave the h/w in an 
undefined state. Cancelling a request is (or ought to be) trivial; just 
reset the ringbuffer's tail pointer to where it was when the request was 
allocated. The engine doesn't read anything past the tail of the 
previous request until the new request is submitted.



Unfortunately this patch undoes the excess breadcrumb usage that olr
prevented, e.g. if we interrupt batchbuffer submission then we submit
the requests along with the memory writes and interrupt (even though we
do no real work). Disassociating requests from breadcrumbs (and
semaphores) is a topic for a past/future series, but now much more
important.


Another incomprehensible comment? OLR went away a long time ago now. And 
AFAIK batchbuffer submission cannot be interrupted by anything (or more 
accurately, anything that interrupts submission won't be able to write 
to the ringbuffer or submit new work).



Signed-off-by: Chris Wilson 
Cc: Daniel Vetter 
Cc: sta...@vger.kernel.org
---
  drivers/gpu/drm/i915/i915_drv.h|  1 -
  drivers/gpu/drm/i915/i915_gem.c|  7 ++-
  drivers/gpu/drm/i915/i915_gem_context.c| 21 +
  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 16 +---
  drivers/gpu/drm/i915/intel_display.c   |  2 +-
  drivers/gpu/drm/i915/intel_lrc.c   |  1 -
  6 files changed, 17 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 747d2d84a18c..ec20814adb0c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2813,7 +2813,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void 
*data,
 struct drm_file *file_priv);
  void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
struct drm_i915_gem_request *req);
-void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params 
*params);
  int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
   struct drm_i915_gem_execbuffer2 *args,
   struct list_head *vmas);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3ab529669448..fd24877eb0a0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3384,12 +3384,9 @@ int i915_gpu_idle(struct drm_device *dev)
return ret;

ret = i915_switch_context(req);
-   if (ret) {
-   i915_gem_request_cancel(req);
-   return ret;
-   }
-
i915_add_request_no_flush(req);
+   if (ret)
+   return ret;


This seems like a bad idea. Looking at how we could get here (i.e. how 
could switch_context() return an error), we see things such as "failed 
to pin" or i915_gem_object_get_pages() failed.


With no real idea of what the GPU and/or CPU address spaces contain at 
this point, it seems unwise to charge ahead regardless.


.Dave.


}

ret = intel_ring_idle(ring);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index c25083c78ba7..e5e9a8918f19 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -661,7 +661,6 @@ static int do_switch(struct drm_i915_gem_request *req)
struct drm_i915_private *dev_priv = ring->dev->dev_private;
struct intel_context *from = ring->last_context;
u32 hw_flags = 0;
-   bool uninitialized = false;
int ret, i;

if (from != NULL && ring == _priv->ring[RCS]) {
@@ -768,6 +767,15 @@ static int do_switch(struct drm_i915_gem_request *req)
to->remap_slice &= ~(1<

[Intel-gfx] [PATCH v2 3/7] drm/i915: Redo intel_tile_height() as intel_tile_size() / intel_tile_width()

2016-01-12 Thread ville . syrjala
From: Ville Syrjälä 

I find more usual to think about tile widths than heights, so changing
the intel_tile_height() to calculate the tile height as
tile_size/tile_width is easier than the opposite to the poor brain.

v2: Reorder arguments for consistency
Constify dev_priv arguments

Signed-off-by: Ville Syrjälä 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/intel_display.c | 89 
 drivers/gpu/drm/i915/intel_drv.h |  5 +-
 drivers/gpu/drm/i915/intel_sprite.c  |  5 +-
 3 files changed, 34 insertions(+), 65 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 3a6387092f63..e2c7ae90b39d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2223,6 +2223,11 @@ static bool need_vtd_wa(struct drm_device *dev)
return false;
 }
 
+static unsigned int intel_tile_size(const struct drm_i915_private *dev_priv)
+{
+   return IS_GEN2(dev_priv) ? 2048 : 4096;
+}
+
 static unsigned int intel_tile_width(const struct drm_i915_private *dev_priv,
 uint64_t fb_modifier, unsigned int cpp)
 {
@@ -2260,67 +2265,34 @@ static unsigned int intel_tile_width(const struct 
drm_i915_private *dev_priv,
}
 }
 
-unsigned int
-intel_tile_height(struct drm_device *dev, uint32_t pixel_format,
- uint64_t fb_format_modifier, unsigned int plane)
+unsigned int intel_tile_height(const struct drm_i915_private *dev_priv,
+  uint64_t fb_modifier, unsigned int cpp)
 {
-   unsigned int tile_height;
-   uint32_t pixel_bytes;
-
-   switch (fb_format_modifier) {
-   case DRM_FORMAT_MOD_NONE:
-   tile_height = 1;
-   break;
-   case I915_FORMAT_MOD_X_TILED:
-   tile_height = IS_GEN2(dev) ? 16 : 8;
-   break;
-   case I915_FORMAT_MOD_Y_TILED:
-   tile_height = 32;
-   break;
-   case I915_FORMAT_MOD_Yf_TILED:
-   pixel_bytes = drm_format_plane_cpp(pixel_format, plane);
-   switch (pixel_bytes) {
-   default:
-   case 1:
-   tile_height = 64;
-   break;
-   case 2:
-   case 4:
-   tile_height = 32;
-   break;
-   case 8:
-   tile_height = 16;
-   break;
-   case 16:
-   WARN_ONCE(1,
- "128-bit pixels are not supported for 
display!");
-   tile_height = 16;
-   break;
-   }
-   break;
-   default:
-   MISSING_CASE(fb_format_modifier);
-   tile_height = 1;
-   break;
-   }
-
-   return tile_height;
+   if (fb_modifier == DRM_FORMAT_MOD_NONE)
+   return 1;
+   else
+   return intel_tile_size(dev_priv) /
+   intel_tile_width(dev_priv, fb_modifier, cpp);
 }
 
 unsigned int
 intel_fb_align_height(struct drm_device *dev, unsigned int height,
- uint32_t pixel_format, uint64_t fb_format_modifier)
+ uint32_t pixel_format, uint64_t fb_modifier)
 {
-   return ALIGN(height, intel_tile_height(dev, pixel_format,
-  fb_format_modifier, 0));
+   unsigned int cpp = drm_format_plane_cpp(pixel_format, 0);
+   unsigned int tile_height = intel_tile_height(to_i915(dev), fb_modifier, 
cpp);
+
+   return ALIGN(height, tile_height);
 }
 
 static void
 intel_fill_fb_ggtt_view(struct i915_ggtt_view *view, struct drm_framebuffer 
*fb,
const struct drm_plane_state *plane_state)
 {
+   struct drm_i915_private *dev_priv = to_i915(fb->dev);
struct intel_rotation_info *info = >params.rotation_info;
unsigned int tile_height, tile_pitch;
+   unsigned int cpp = drm_format_plane_cpp(fb->pixel_format, 0);
 
*view = i915_ggtt_view_normal;
 
@@ -2338,22 +2310,19 @@ intel_fill_fb_ggtt_view(struct i915_ggtt_view *view, 
struct drm_framebuffer *fb,
info->uv_offset = fb->offsets[1];
info->fb_modifier = fb->modifier[0];
 
-   tile_height = intel_tile_height(fb->dev, fb->pixel_format,
-   fb->modifier[0], 0);
+   tile_height = intel_tile_height(dev_priv, fb->modifier[0], cpp);
tile_pitch = PAGE_SIZE / tile_height;
info->width_pages = DIV_ROUND_UP(fb->pitches[0], tile_pitch);
info->height_pages = DIV_ROUND_UP(fb->height, tile_height);
info->size = info->width_pages * info->height_pages * PAGE_SIZE;
 
if (info->pixel_format == DRM_FORMAT_NV12) {
-   tile_height = intel_tile_height(fb->dev, 

Re: [Intel-gfx] [PATCH 053/190] drm/i915: Convert i915_semaphores_is_enabled over to early sanitize

2016-01-12 Thread Dave Gordon

On 11/01/16 09:17, Chris Wilson wrote:

Rather than recomputing whether semaphores are enabled, we can do that
computation once during early initialisation as the i915.semaphores
module parameter is now read-only.

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_debugfs.c |  2 +-
  drivers/gpu/drm/i915/i915_dma.c |  2 +-
  drivers/gpu/drm/i915/i915_drv.c | 25 ---
  drivers/gpu/drm/i915/i915_drv.h |  1 -
  drivers/gpu/drm/i915/i915_gem.c | 35 ++---
  drivers/gpu/drm/i915/i915_gem_context.c |  2 +-
  drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
  drivers/gpu/drm/i915/intel_ringbuffer.c | 20 +--
  8 files changed, 46 insertions(+), 43 deletions(-)


LGTM.

Reviewed-by: Dave Gordon 

[aside]
The conditions below seem to exclude a lot of systems. It looks like 
semaphores can only be used on GEN 6 (if no IOMMU) and GEN 7. Nothing 
before or after that range, as GEN9+ supports execlist only.


So is it even worth continuing to support semaphores at all? Especially 
as we have customers who say that the scheduler gives more performance 
gain than semaphores ...

[/aside]

.Dave.


+static bool i915_gem_sanitize_semaphore(struct drm_i915_private *dev_priv,
+   int param_value)
+{
+   if (INTEL_INFO(dev_priv)->gen < 6)
+   return false;
+
+   if (param_value >= 0)
+   return param_value;
+
+   /* TODO: make semaphores and Execlists play nicely together */
+   if (i915.enable_execlists)
+   return false;
+
+   /* Until we get further testing... */
+   if (IS_GEN8(dev_priv))
+   return false;
+
+#ifdef CONFIG_INTEL_IOMMU
+   /* Enable semaphores on SNB when IO remapping is off */
+   if (INTEL_INFO(dev_priv)->gen == 6 && intel_iommu_gfx_mapped)
+   return false;
+#endif
+
+   return true;
+}


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 07/11] drm/i915: Store rawclk_freq in dev_priv

2016-01-12 Thread Ville Syrjälä
On Tue, Dec 01, 2015 at 05:43:33PM +0200, Jani Nikula wrote:
> On Tue, 01 Dec 2015, Ville Syrjälä  wrote:
> > On Tue, Dec 01, 2015 at 02:47:41PM +0200, Jani Nikula wrote:
> >> On Mon, 30 Nov 2015, ville.syrj...@linux.intel.com wrote:
> >> > From: Ville Syrjälä 
> >> >
> >> > Generalize rawclk handling by storing it in dev_priv.
> >> >
> >> > Presumably our hrawclk readout works at least for CTG and ELK
> >> > since we've been using it for DP AUX on those platforms. There
> >> > are no real docs anymore after configdb vanished, so the only
> >> > reference is the public CTG GMCH spec. What bits are listed in
> >> > that doc match our code. The ELK GMCH spec have no relevant
> >> > details unfortunately.
> >> >
> >> > The PNV situation is less clear. Starting from
> >> > commit aa17cdb4f836 ("drm/i915: initialize backlight max from VBT")
> >> > we assume that the CTG/ELK hrawclk readout works for PNV as well.
> >> > At least the results *seem* reasonable for one PNV machine (Lenovo
> >> > Ideapad S10-3t). Sadly the PNV GMCH spec doesn't have the goods on
> >> > the relevant register either.
> >> >
> >> > So let's keep assuming it works for PNV,ELK,CTG and read it out on
> >> > those platforms. G33 also has hrawclk according to some notes
> >> > in BSpec, but we don't actually need it for anything, so let's not
> >> > even try to read it out there.
> >> 
> >> Looks sensible but moderately scary, did not review properly yet,
> >> however a few nitpicks below.
> >> 
> >> >
> >> > Signed-off-by: Ville Syrjälä 
> >> > ---
> >> >  drivers/gpu/drm/i915/i915_drv.h  |  1 +
> >> >  drivers/gpu/drm/i915/intel_display.c | 55 
> >> > ++--
> >> >  drivers/gpu/drm/i915/intel_dp.c  | 16 +--
> >> >  drivers/gpu/drm/i915/intel_drv.h |  2 --
> >> >  drivers/gpu/drm/i915/intel_panel.c   | 24 
> >> >  5 files changed, 53 insertions(+), 45 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> >> > b/drivers/gpu/drm/i915/i915_drv.h
> >> > index 9ab3e25ddf38..64facd410037 100644
> >> > --- a/drivers/gpu/drm/i915/i915_drv.h
> >> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> > @@ -1781,6 +1781,7 @@ struct drm_i915_private {
> >> >  unsigned int skl_boot_cdclk;
> >> >  unsigned int cdclk_freq, max_cdclk_freq;
> >> >  unsigned int max_dotclk_freq;
> >> > +unsigned int rawclk_freq;
> >> >  unsigned int hpll_freq;
> >> >  unsigned int czclk_freq;
> >> >  
> >> > diff --git a/drivers/gpu/drm/i915/intel_display.c 
> >> > b/drivers/gpu/drm/i915/intel_display.c
> >> > index 84395c8c9dce..84a1359ecba5 100644
> >> > --- a/drivers/gpu/drm/i915/intel_display.c
> >> > +++ b/drivers/gpu/drm/i915/intel_display.c
> >> > @@ -168,49 +168,61 @@ static int vlv_get_cck_clock_hpll(struct 
> >> > drm_i915_private *dev_priv,
> >> >  return DIV_ROUND_CLOSEST(dev_priv->hpll_freq << 1, divider + 1);
> >> >  }
> >> >  
> >> > -int
> >> > -intel_pch_rawclk(struct drm_device *dev)
> >> > +static int
> >> > +intel_pch_rawclk(struct drm_i915_private *dev_priv)
> >> 
> >> I guess intel_ prefix would not be needed anymore. *shrug*.
> >
> > Yeah, lazyness won and I didn't start renaming stuff too much.
> 
> No worries.
> 
> >
> >> 
> >> >  {
> >> > -struct drm_i915_private *dev_priv = dev->dev_private;
> >> > -
> >> > -WARN_ON(!HAS_PCH_SPLIT(dev));
> >> > +return (I915_READ(PCH_RAWCLK_FREQ) & RAWCLK_FREQ_MASK) * 1000;
> >> > +}
> >> >  
> >> > -return I915_READ(PCH_RAWCLK_FREQ) & RAWCLK_FREQ_MASK;
> >> > +static int
> >> > +intel_vlv_hrawclk(struct drm_i915_private *dev_priv)
> >> > +{
> >> > +return 20;
> >> >  }
> >> >  
> >> > -/* hrawclock is 1/4 the FSB frequency */
> >> > -int intel_hrawclk(struct drm_device *dev)
> >> > +static int
> >> > +intel_g4x_hrawclk(struct drm_i915_private *dev_priv)
> >> >  {
> >> > -struct drm_i915_private *dev_priv = dev->dev_private;
> >> >  uint32_t clkcfg;
> >> >  
> >> > -/* There is no CLKCFG reg in Valleyview. VLV hrawclk is 200 MHz 
> >> > */
> >> > -if (IS_VALLEYVIEW(dev))
> >> > -return 200;
> >> > -
> >> > +/* hrawclock is 1/4 the FSB frequency */
> >> >  clkcfg = I915_READ(CLKCFG);
> >> >  switch (clkcfg & CLKCFG_FSB_MASK) {
> >> >  case CLKCFG_FSB_400:
> >> > -return 100;
> >> > +return 10;
> >> >  case CLKCFG_FSB_533:
> >> > -return 133;
> >> > +return 13;
> >> >  case CLKCFG_FSB_667:
> >> > -return 166;
> >> > +return 17;
> >> >  case CLKCFG_FSB_800:
> >> > -return 200;
> >> > +return 20;
> >> >  case CLKCFG_FSB_1067:
> >> > -return 266;
> >> > +return 27;

[Intel-gfx] ✗ warning: Fi.CI.BAT

2016-01-12 Thread Patchwork
== Summary ==

Built on 9a47f23e3744929b9b222cb750994723fff0e5ee drm-intel-nightly: 
2016y-01m-12d-16h-55m-40s UTC integration manifest

Test gem_storedw_loop:
Subgroup basic-render:
pass   -> DMESG-WARN (skl-i5k-2) UNSTABLE
pass   -> DMESG-WARN (bdw-nuci7)
pass   -> DMESG-WARN (bdw-ultra)
Test kms_flip:
Subgroup basic-flip-vs-modeset:
dmesg-warn -> PASS   (skl-i5k-2)
pass   -> DMESG-WARN (bdw-ultra)
Subgroup basic-plain-flip:
pass   -> DMESG-WARN (bdw-ultra)
Test kms_pipe_crc_basic:
Subgroup nonblocking-crc-pipe-a-frame-sequence:
pass   -> DMESG-WARN (bdw-ultra)
Subgroup nonblocking-crc-pipe-b-frame-sequence:
pass   -> DMESG-WARN (bdw-ultra)
Subgroup read-crc-pipe-a:
pass   -> DMESG-WARN (bdw-ultra)
Subgroup read-crc-pipe-a-frame-sequence:
pass   -> DMESG-WARN (bdw-ultra)
Subgroup read-crc-pipe-b:
pass   -> DMESG-WARN (bdw-ultra)
Subgroup read-crc-pipe-b-frame-sequence:
pass   -> DMESG-WARN (bdw-ultra)
Subgroup read-crc-pipe-c:
pass   -> DMESG-WARN (bdw-ultra)
Test pm_rpm:
Subgroup basic-pci-d3-state:
pass   -> DMESG-WARN (bdw-ultra)
Subgroup basic-rte:
pass   -> DMESG-WARN (bdw-ultra)

bdw-nuci7total:138  pass:128  dwarn:1   dfail:0   fail:0   skip:9  
bdw-ultratotal:138  pass:120  dwarn:12  dfail:0   fail:0   skip:6  
bsw-nuc-2total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc  total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox  total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2  total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
hsw-xps12total:138  pass:133  dwarn:1   dfail:0   fail:0   skip:4  
ilk-hp8440p  total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430stotal:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
skl-i7k-2total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps  total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220ttotal:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1154/

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/3] drm/i915: Extract vfunc setup from logical ring initializers

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 05:32:34PM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> Majority of them was duplicated code and only render ring
> currently overrides some of them. We can save some lines of
> code and also take away the confusion on why bsd2 did not
> do the seqno coherency workaround. (VCS2 ring does not exist
> on platforms where workaround is needed but that was not
> documented in the code.)
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Chris Wilson 

All 3,
Reviewed-by: Chris Wilson 

We can even combine several functions by inheriting the vfuncs from
intel_ringbuffer.c, and ~only overloading the init/add_request vfuncs.
i.e. call intel_blah_engine_init_execlists after setuping the legacy vfuncs
and common init from intel_blah_engine_init. At the same time, we can
apply some of the same improvements (such as irq_enable_mask) back to the
older gen.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 10/38] drm/i915: Force MMIO flips when scheduler enabled

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 03:07:03PM +0100, Daniel Vetter wrote:
> On Tue, Jan 12, 2016 at 11:19:26AM +, John Harrison wrote:
> > On 11/01/2016 22:16, Chris Wilson wrote:
> > >On Mon, Jan 11, 2016 at 06:42:39PM +, john.c.harri...@intel.com wrote:
> > >>From: John Harrison 
> > >>
> > >>MMIO flips are the preferred mechanism now but more importantly,
> > >Says who?
> > 
> > I asked this exact question at the linux architecture forum quite some time
> > ago - does the scheduler need to worry about managing non-batch buffer work
> > such as page flips. The answer from everyone present was no, MMIO flips are
> > the way to go so don't over complicate the scheduler trying to support ring
> > flips. Indeed, execlist mode already forces MMIO flips anyway.

Two wrongs do not make a right, as they say. CS flips work very nicely
with execlists.

> Atomic will kill CS flips. We can mourn them and scream about the loss,
> but imo best is to just skip that all and move on to acceptance. So mmio
> flips (or well, atomic flips) is still the way to go for everything.

The real issue I think here is that not trying to feed a request into the
scheduler for the flip has lead to a poor interface into the scheduler.
For a CS flip request, we know the ordering, it's contents, we have to
choose the context though, but we have a good idea of the deadline which
gives a good challenge to a scheduler. That was my take.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 073/190] drm/i915: Introduce i915_gem_active for request tracking

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 10:04:20AM +, Tvrtko Ursulin wrote:
> Perhaps then leave the structure name as is and just rename the
> function to i915_gem_request_assign_active? I think that describes
> better what is actually happening.

i915_gem_request_update_active()?

request_assign_active() says to me that it is the request we are acting
on and it can have only one active entity. "update" could go either way.

i915_gem_active_add_to_request() is the full version I guess, or just
i915_gem_active_set().

i915_gem_request_mark_active() -> i915_gem_active_set()
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ warning: Fi.CI.BAT

2016-01-12 Thread Patchwork
== Summary ==

Built on a90796840c30dac6d9907439bf98d1d08046c49d drm-intel-nightly: 
2016y-01m-11d-17h-22m-54s UTC integration manifest

Test gem_storedw_loop:
Subgroup basic-render:
pass   -> DMESG-WARN (skl-i7k-2) UNSTABLE
Test kms_pipe_crc_basic:
Subgroup read-crc-pipe-c:
pass   -> DMESG-WARN (bdw-ultra)

bdw-nuci7total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultratotal:138  pass:131  dwarn:1   dfail:0   fail:0   skip:6  
bsw-nuc-2total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc  total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox  total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2  total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p  total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430stotal:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
skl-i7k-2total:141  pass:131  dwarn:2   dfail:0   fail:0   skip:8  
snb-dellxps  total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220ttotal:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1140/

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 21/38] drm/i915: Added a module parameter for allowing scheduler overrides

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 11:34:45AM +, John Harrison wrote:
> On 11/01/2016 22:24, Chris Wilson wrote:
> >On Mon, Jan 11, 2016 at 06:42:50PM +, john.c.harri...@intel.com wrote:
> >>From: John Harrison 
> >>
> >>It can be useful to be able to disable certain features (e.g. the
> >>entire scheduler) via a module parameter for debugging purposes. A
> >>parameter has the advantage of not being a compile time switch but
> >>without implying that it can be changed dynamically at runtime.
> >>+module_param_named(scheduler_override, i915.scheduler_override, int, 0600);
> >>+MODULE_PARM_DESC(scheduler_override, "Scheduler override mask (0 = none, 1 
> >>= direct submission [default])");
> >Is this consistent with the other *enable* booleans?
> Initially there were a whole bunch of override flags for
> disabling/tweaking specific bits of the scheduler's operation. Since
> these extras now only exist in an internal debugging patch that is
> not going to be upstreamed, I guess it probably should be simplified
> to a bool rather than a flags word.

Yes, later on I saw the unsigned flags. Those should be restricted to a
debugfs interface, not a user parameter. module options are an
invitation for the world to fiddle, and they will.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 021/190] drm/i915: Use HWS for seqno tracking everywhere

2016-01-12 Thread Mika Kuoppala
Chris Wilson  writes:

> By using the same address for storing the HWS on every platform, we can
> remove the platform specific vfuncs and reduce the get-seqno routine to
> a single read of a cached memory location.
>
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c  | 10 ++--
>  drivers/gpu/drm/i915/i915_drv.h  |  4 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c|  2 +-
>  drivers/gpu/drm/i915/i915_irq.c  |  4 +-
>  drivers/gpu/drm/i915/i915_trace.h|  2 +-
>  drivers/gpu/drm/i915/intel_breadcrumbs.c |  4 +-
>  drivers/gpu/drm/i915/intel_lrc.c | 46 ++---
>  drivers/gpu/drm/i915/intel_ringbuffer.c  | 86 
> 
>  drivers/gpu/drm/i915/intel_ringbuffer.h  |  7 +--
>  9 files changed, 43 insertions(+), 122 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index d09e48455dcb..5a706c700684 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -600,7 +600,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, 
> void *data)
>  ring->name,
>  
> i915_gem_request_get_seqno(work->flip_queued_req),
>  dev_priv->next_seqno,
> -ring->get_seqno(ring),
> +intel_ring_get_seqno(ring),
>  
> i915_gem_request_completed(work->flip_queued_req));
>   } else
>   seq_printf(m, "Flip not associated with any 
> ring\n");
> @@ -732,10 +732,8 @@ static void i915_ring_seqno_info(struct seq_file *m,
>  {
>   struct rb_node *rb;
>  
> - if (ring->get_seqno) {
> - seq_printf(m, "Current sequence (%s): %x\n",
> -ring->name, ring->get_seqno(ring));
> - }
> + seq_printf(m, "Current sequence (%s): %x\n",
> +ring->name, intel_ring_get_seqno(ring));
>  
>   spin_lock(>breadcrumbs.lock);
>   for (rb = rb_first(>breadcrumbs.waiters);
> @@ -1355,7 +1353,7 @@ static int i915_hangcheck_info(struct seq_file *m, void 
> *unused)
>  
>   for_each_ring(ring, dev_priv, i) {
>   acthd[i] = intel_ring_get_active_head(ring);
> - seqno[i] = ring->get_seqno(ring);
> + seqno[i] = intel_ring_get_seqno(ring);
>   }
>  
>   i915_get_extra_instdone(dev, instdone);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 44d46018ee13..fcedcbc50834 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2971,13 +2971,13 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
>  
>  static inline bool i915_gem_request_started(struct drm_i915_gem_request *req)
>  {
> - return i915_seqno_passed(req->ring->get_seqno(req->ring),
> + return i915_seqno_passed(intel_ring_get_seqno(req->ring),
>req->previous_seqno);
>  }
>  
>  static inline bool i915_gem_request_completed(struct drm_i915_gem_request 
> *req)
>  {
> - return i915_seqno_passed(req->ring->get_seqno(req->ring),
> + return i915_seqno_passed(intel_ring_get_seqno(req->ring),
>req->seqno);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
> b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 01d0206ca4dd..3e137fc701cf 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -903,7 +903,7 @@ static void i915_record_ring_state(struct drm_device *dev,
>   ering->waiting = intel_engine_has_waiter(ring);
>   ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
>   ering->acthd = intel_ring_get_active_head(ring);
> - ering->seqno = ring->get_seqno(ring);
> + ering->seqno = intel_ring_get_seqno(ring);
>   ering->start = I915_READ_START(ring);
>   ering->head = I915_READ_HEAD(ring);
>   ering->tail = I915_READ_TAIL(ring);
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index d73669783045..627c7fb6aa9b 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2903,7 +2903,7 @@ static int semaphore_passed(struct intel_engine_cs 
> *ring)
>   if (signaller->hangcheck.deadlock >= I915_NUM_RINGS)
>   return -1;
>  
> - if (i915_seqno_passed(signaller->get_seqno(signaller), seqno))
> + if (i915_seqno_passed(intel_ring_get_seqno(signaller), seqno))
>   return 1;
>  
>   /* cursory check for an unkickable deadlock */
> @@ -3068,7 +3068,7 @@ static void i915_hangcheck_elapsed(struct work_struct 
> *work)
>   semaphore_clear_deadlocks(dev_priv);
>  
>   acthd = intel_ring_get_active_head(ring);
> -   

Re: [Intel-gfx] [PATCH 5/7] drm/i915: Don't need a timer to wake us up

2016-01-12 Thread Tvrtko Ursulin


On 11/01/16 14:33, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 02:08:39PM +, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

We can avoid open-coding the schedule wake-up since

commit 9cff8adeaa34b5d2802f03f89803da57856b3b72
Author: NeilBrown 
Date:   Fri Feb 13 15:49:17 2015 +1100

sched: Prevent recursion in io_schedule()

exported the io_schedule_timeout function which we can now use
to simplify the code in __i915_wait_request.

v2: New commit message. (Daniel Vetter)

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Reviewed-by: Daniel Vetter 


I don't like this because the timer concept still turns out to be useful
when we separate the timer from the bottom-half. And so we have a patch
to remove it and then we add it back again because it serves a purpose.


-EWHATEVER

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 12/38] drm/i915: Added scheduler hook into i915_gem_request_notify()

2016-01-12 Thread John Harrison

On 11/01/2016 22:14, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 06:42:41PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

The scheduler needs to know when requests have completed so that it
can keep its own internal state up to date and can submit new requests
to the hardware from its queue.

Why would you reuse the user interrupt rather than introduce a
context-switch interrupt using the pipe_notify/dword_notify (yes, it can
be done by fixing up the current code). In the case of execlists you
wouldn't even need to add another interrupt vector as you could just
overload the execlists submission routine. For legacy, this would at
least let you reduce the interrupt rate from per batch to per context
switch, and keep the logic separate for user request tracking.
-Chris



One of the scheduler's design goals is to throttle the number of batches 
outstanding for any given context in order to improve responsiveness. In 
order to track this, it needs to know about each individual batch 
completion not the merged context completion (which could be for an 
arbitrarily large number of batches due to coalescing in the 
execlist/GuC layer).


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/guc: Fix a memory leak where guc->execbuf_client is not freed

2016-01-12 Thread Dave Gordon

On 06/01/16 20:53, yu@intel.com wrote:

From: Alex Dai 

During driver unloading, the guc_client created for command submission
needs to be released to avoid memory leak.

Signed-off-by: Alex Dai 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 9c24424..8ce4f32 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -995,6 +995,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc *guc = _priv->guc;

+   if (i915.enable_guc_submission)
+   i915_guc_submission_disable(dev);
+
gem_release_guc_obj(dev_priv->guc.ads_obj);
guc->ads_obj = NULL;


This looks like the right thing to do, but the wrong place to do it.

i915_guc_submission_{init,enable,disable,fini} are the top-level 
functions exported from this source file and called (only) from 
intel_guc_loader.c


Therefore, the code in intel_guc_ucode_fini() should call 
submission_disable() before submission_fini(), like this:


/**
 * intel_guc_ucode_fini() - clean up all allocated resources
 * @dev:drm device
 */
void intel_guc_ucode_fini(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc_fw *guc_fw = _priv->guc.guc_fw;

direct_interrupts_to_host(dev_priv);
+   i915_guc_submission_disable(dev);
i915_guc_submission_fini(dev);

mutex_lock(>struct_mutex);
if (guc_fw->guc_fw_obj)
drm_gem_object_unreference(_fw->guc_fw_obj->base);
guc_fw->guc_fw_obj = NULL;
mutex_unlock(>struct_mutex);

guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
}

There's no need for it to be conditional, as disable (and fini) are 
idempotent; if a thing hasn't been allocated, or has already been 
deallocated, then these functions will just do nothing.


HOWEVER,

while reviewing this I've noticed that the locking is all screwed up; 
basically "bf248ca drm/i915: Fix locking around GuC firmware load"
removed locking round the calls into i915_guc_loader.c and added it back 
in a few places, but not enough.


It would probably have been better to have left the locking in the 
caller, and hence round the entirety of the calls to _init, _load, 
_fini, and then explicitly DROP the mutex only for the duration of the 
request_firmware call.


It would have been better still not to insist on synchronous firmware 
load in the first place; the original generic (and asynchronous) loader 
didn't require struct_mutex or any other locking around the 
request_firmware() call, so we wouldn't now have to fix it (again).


At present, in intel_guc_loader.c, intel_guc_ucode_load() is called with 
the struct_mutex already held by the caller, but _init() and _fini() are 
called with it NOT held.


All exported functions in i915_guc_submission.c expect it to be held 
when they're called.


On that basis, what we need now is:

guc_fw_fetch() needs to take & release the mutex round the unreference 
in the fail: path (like the code in _fini above).


intel_guc_ucode_fini() needs to extend the scope of the lock to enclose 
all calls to _submission_ functions. So the above becomes:


/**
* intel_guc_ucode_fini() - clean up all allocated resources
* @dev: drm device
*/
void intel_guc_ucode_fini(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc_fw *guc_fw = _priv->guc.guc_fw;

mutex_lock(>struct_mutex);
direct_interrupts_to_host(dev_priv);
i915_guc_submission_disable(dev);
i915_guc_submission_fini(dev);

if (guc_fw->guc_fw_obj)
drm_gem_object_unreference(_fw->guc_fw_obj->base);
guc_fw->guc_fw_obj = NULL;
mutex_unlock(>struct_mutex);

guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
}

FINALLY,

intel_guc_ucode_load() should probably call i915_guc_submission_fini() 
in the failure path (after submission_disable()) as it called 
submission_init() earlier. Not critical, as it will get called from 
ucode_fini() anyway, but it improves symmetry.


.Dave.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 10/38] drm/i915: Force MMIO flips when scheduler enabled

2016-01-12 Thread John Harrison

On 11/01/2016 22:16, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 06:42:39PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

MMIO flips are the preferred mechanism now but more importantly,

Says who?


I asked this exact question at the linux architecture forum quite some 
time ago - does the scheduler need to worry about managing non-batch 
buffer work such as page flips. The answer from everyone present was no, 
MMIO flips are the way to go so don't over complicate the scheduler 
trying to support ring flips. Indeed, execlist mode already forces MMIO 
flips anyway.



pipe
based flips cause issues for the scheduler. Specifically, submitting
work to the rings around the side of the scheduler could cause that
work to be lost if the scheduler generates a pre-emption event on that
ring.

That just says that you haven't designed for the ability to schedule a
flip into the scheduler, including handling the priority bump that might
required to hit the deadline.
-Chris



Initially I was doing that. I was told to drop that work.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Assign crtc correctly in load detection.

2016-01-12 Thread Daniel Vetter
On Tue, Jan 12, 2016 at 12:35:59PM +0100, Maarten Lankhorst wrote:
> drm_atomic_set_crtc_for_connector should be used,
> and crtc->primary->crtc is assigned by atomic_commit.
> 
> Rely on the helpers for setting this correctly, so
> connector_mask gets updated too.
> 
> Signed-off-by: Maarten Lankhorst 

Reviewed-by: Daniel Vetter 
> ---
> Should this be applied to topic/drm-misc since atomic connector_masks are 
> added there?
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index bc2ec444925e..6b25a90d1e0a 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -10553,7 +10553,9 @@ retry:
>   goto fail;
>   }
>  
> - connector_state->crtc = crtc;
> + ret = drm_atomic_set_crtc_for_connector(connector_state, crtc);
> + if (ret)
> + goto fail;
>  
>   crtc_state = intel_atomic_get_crtc_state(state, intel_crtc);
>   if (IS_ERR(crtc_state)) {
> @@ -10597,7 +10599,6 @@ retry:
>   old->release_fb->funcs->destroy(old->release_fb);
>   goto fail;
>   }
> - crtc->primary->crtc = crtc;
>  
>   /* let the connector get through one full cycle before testing */
>   intel_wait_for_vblank(dev, intel_crtc->pipe);
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 4/7] drm/i915: Cache LRC state page in the context

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 11:56:11AM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> LRC lifetime is well defined so we can cache the page pointing
> to the object backing store in the context in order to avoid
> walking over the object SG page list from the interrupt context
> without the big lock held.
> 
> v2: Also cache the mapping. (Chris Wilson)
> v3: Unmap on the error path.

Then we only use the lrc_state_page in the unmapping, hardly worth the
cost of saving it.

The reg_state is better associated with the ring (since it basically
contains the analog of the RING_HEAD and friends).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 21/38] drm/i915: Added a module parameter for allowing scheduler overrides

2016-01-12 Thread John Harrison

On 11/01/2016 22:24, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 06:42:50PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

It can be useful to be able to disable certain features (e.g. the
entire scheduler) via a module parameter for debugging purposes. A
parameter has the advantage of not being a compile time switch but
without implying that it can be changed dynamically at runtime.
+module_param_named(scheduler_override, i915.scheduler_override, int, 0600);
+MODULE_PARM_DESC(scheduler_override, "Scheduler override mask (0 = none, 1 = direct 
submission [default])");

Is this consistent with the other *enable* booleans?
Initially there were a whole bunch of override flags for 
disabling/tweaking specific bits of the scheduler's operation. Since 
these extras now only exist in an internal debugging patch that is not 
going to be upstreamed, I guess it probably should be simplified to a 
bool rather than a flags word.




-Chris



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 020/190] drm/i915: Remove the lazy_coherency parameter from request-completed?

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 12:27:05PM +0200, Mika Kuoppala wrote:
> > for_each_ring(ring, dev_priv, i) {
> > -   seqno[i] = ring->get_seqno(ring);
> > acthd[i] = intel_ring_get_active_head(ring);
> > +   seqno[i] = ring->get_seqno(ring);
> 
> Oh! Perhaps I am overly optimistic but did you just show how to solve
> the 'hangcheck elapsed  ring idle..' coherency issue
> in hangcheck?

No. There are two causes, one is that we genuinely inspect the seqno
before it is written in the interrupt bottom-half and the other is
indeed the race you keep mentioning between the hangcheck looking at
waiters and the waiter setting itself up.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Assign crtc correctly in load detection.

2016-01-12 Thread Maarten Lankhorst
drm_atomic_set_crtc_for_connector should be used,
and crtc->primary->crtc is assigned by atomic_commit.

Rely on the helpers for setting this correctly, so
connector_mask gets updated too.

Signed-off-by: Maarten Lankhorst 
---
Should this be applied to topic/drm-misc since atomic connector_masks are added 
there?

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index bc2ec444925e..6b25a90d1e0a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10553,7 +10553,9 @@ retry:
goto fail;
}
 
-   connector_state->crtc = crtc;
+   ret = drm_atomic_set_crtc_for_connector(connector_state, crtc);
+   if (ret)
+   goto fail;
 
crtc_state = intel_atomic_get_crtc_state(state, intel_crtc);
if (IS_ERR(crtc_state)) {
@@ -10597,7 +10599,6 @@ retry:
old->release_fb->funcs->destroy(old->release_fb);
goto fail;
}
-   crtc->primary->crtc = crtc;
 
/* let the connector get through one full cycle before testing */
intel_wait_for_vblank(dev, intel_crtc->pipe);

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 06/38] drm/i915: Re-instate request->uniq because it is extremely useful

2016-01-12 Thread John Harrison

On 11/01/2016 22:04, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 06:42:35PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

The seqno value cannot always be used when debugging issues via trace
points. This is because it can be reset back to start, especially
during TDR type tests. Also, when the scheduler arrives the seqno is
only valid while a given request is executing on the hardware. While
the request is simply queued waiting for submission, it's seqno value
will be zero (meaning invalid).

Even with per-context seqno that can be assigned before execution as we
know that requests within a context cannot be reordered?
-Chris

Firstly, we do not have per-context seqno values at the moment. However, 
even if we did that would make a per request unique value even more 
useful as the only way to identify a given request then would be through 
a context pointer and seqno pair which would make scanning through debug 
traces even worse.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2 02/10] drm/i915: Cleanup phys status page too

2016-01-12 Thread Daniel Vetter
On Mon, Jan 11, 2016 at 08:48:32PM +0200, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> Restore the lost phys status page cleanup.
> 
> Fixes the following splat with DMA_API_DEBUG=y:

Oh, we should enable this in our CI. Can you please shoot a mail to Tomi?

> 
> WARNING: CPU: 0 PID: 21615 at ../lib/dma-debug.c:974 
> dma_debug_device_change+0x190/0x1f0()
> pci :00:02.0: DMA-API: device driver has pending DMA allocations while 
> released from device [count=1]
>One of leaked entries details: [device 
> address=0x23163000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] 
> [mapped as coherent]
> Modules linked in: i915(-) i2c_algo_bit drm_kms_helper syscopyarea 
> sysfillrect sysimgblt fb_sys_fops drm sha256_generic hmac drbg ctr ccm 
> sch_fq_codel binfmt_misc joydev mousedev arc4 ath5k iTCO_wdt mac80211 
> smsc_ircc2 ath snd_intel8x0m snd_intel8x0 snd_ac97_codec ac97_bus psmouse 
> snd_pcm input_leds i2c_i801 pcspkr snd_timer cfg80211 snd soundcore i2c_core 
> ehci_pci firewire_ohci ehci_hcd firewire_core lpc_ich 8139too rfkill 
> crc_itu_t mfd_core mii usbcore rng_core intel_agp intel_gtt usb_common 
> agpgart irda crc_ccitt fujitsu_laptop led_class parport_pc video parport 
> evdev backlight
> CPU: 0 PID: 21615 Comm: rmmod Tainted: G U  4.4.0-rc4-mgm-ovl+ #4
> Hardware name: FUJITSU SIEMENS LIFEBOOK S6120/FJNB16C, BIOS Version 1.26  
> 05/10/2004
>  e31a3de0 e31a3de0 e31a3d9c c128d4bd e31a3dd0 c1045a0c c15e00c4 e31a3dfc
>  546f c15dfad2 03ce c12b3740 03ce c12b3740  0001
>  f61fb8a0 e31a3de8 c1045a83 0009 e31a3de0 c15e00c4 e31a3dfc e31a3e4c
> Call Trace:
>  [] dump_stack+0x16/0x19
>  [] warn_slowpath_common+0x8c/0xd0
>  [] ? dma_debug_device_change+0x190/0x1f0
>  [] ? dma_debug_device_change+0x190/0x1f0
>  [] warn_slowpath_fmt+0x33/0x40
>  [] dma_debug_device_change+0x190/0x1f0
>  [] notifier_call_chain+0x59/0x70
>  [] __blocking_notifier_call_chain+0x3f/0x80
>  [] blocking_notifier_call_chain+0x1f/0x30
>  [] __device_release_driver+0xc3/0xf0
>  [] driver_detach+0x97/0xa0
>  [] bus_remove_driver+0x40/0x90
>  [] driver_unregister+0x28/0x60
>  [] ? trace_hardirqs_on_caller+0x12c/0x1d0
>  [] pci_unregister_driver+0x18/0x80
>  [] drm_pci_exit+0x87/0xb0 [drm]
>  [] i915_exit+0x1b/0x1ee [i915]
>  [] SyS_delete_module+0x14c/0x210
>  [] ? trace_hardirqs_on_caller+0x12c/0x1d0
>  [] ? fput+0xd/0x10
>  [] do_fast_syscall_32+0xa4/0x450
>  [] sysenter_past_esp+0x3b/0x5d
> ---[ end trace c2ecbc77760f10a0 ]---
> Mapped at:
>  [] debug_dma_alloc_coherent+0x33/0x90
>  [] drm_pci_alloc+0x18c/0x1e0 [drm]
>  [] intel_init_ring_buffer+0x2af/0x490 [i915]
>  [] intel_init_render_ring_buffer+0x130/0x750 [i915]
>  [] i915_gem_init_rings+0x1e/0x110 [i915]
> 
> v2: s/BUG_ON/WARN_ON/ since dim doens't like the former anymore
> 
> Cc: Chris Wilson 
> Fixes: 5c6c600 ("drm/i915: Remove DRI1 ring accessors and API")
> Signed-off-by: Ville Syrjälä 
> Reviewed-by: Chris Wilson  (v1)

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 24 
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 339701d7a9a5..d9e0b400294d 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1899,6 +1899,17 @@ i915_dispatch_execbuffer(struct drm_i915_gem_request 
> *req,
>   return 0;
>  }
>  
> +static void cleanup_phys_status_page(struct intel_engine_cs *ring)
> +{
> + struct drm_i915_private *dev_priv = to_i915(ring->dev);
> +
> + if (!dev_priv->status_page_dmah)
> + return;
> +
> + drm_pci_free(ring->dev, dev_priv->status_page_dmah);
> + ring->status_page.page_addr = NULL;
> +}
> +
>  static void cleanup_status_page(struct intel_engine_cs *ring)
>  {
>   struct drm_i915_gem_object *obj;
> @@ -1915,9 +1926,9 @@ static void cleanup_status_page(struct intel_engine_cs 
> *ring)
>  
>  static int init_status_page(struct intel_engine_cs *ring)
>  {
> - struct drm_i915_gem_object *obj;
> + struct drm_i915_gem_object *obj = ring->status_page.obj;
>  
> - if ((obj = ring->status_page.obj) == NULL) {
> + if (obj == NULL) {
>   unsigned flags;
>   int ret;
>  
> @@ -2162,7 +2173,7 @@ static int intel_init_ring_buffer(struct drm_device 
> *dev,
>   if (ret)
>   goto error;
>   } else {
> - BUG_ON(ring->id != RCS);
> + WARN_ON(ring->id != RCS);
>   ret = init_phys_status_page(ring);
>   if (ret)
>   goto error;
> @@ -2208,7 +2219,12 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs 
> *ring)
>   if (ring->cleanup)
>   

Re: [Intel-gfx] [PATCH 05/13] drm/i915: Cache LRCA in the context

2016-01-12 Thread Tvrtko Ursulin


On 11/01/16 19:41, Dave Gordon wrote:

On 11/01/16 08:38, Daniel Vetter wrote:

On Fri, Jan 08, 2016 at 11:29:44AM +, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

We are not allowed to call i915_gem_obj_ggtt_offset from irq
context without the big kernel lock held.

LRCA lifetime is well defined so cache it so it can be looked up
cheaply from the interrupt context and at command submission
time.

v2: Added irq context reasoning to the commit message. (Daniel Vetter)

Signed-off-by: Tvrtko Ursulin 


A i915_obj_for_each_vma macro with a
WARN_ON(!mutex_is_locked(dev->struct_mutex)) would be awesome to validate
this. Especially since this is by far not the only time I've seen this
bug. Needs to be a follow-up though to avoid stalling this fix.


---
  drivers/gpu/drm/i915/i915_debugfs.c | 15 ++
  drivers/gpu/drm/i915/i915_drv.h |  1 +
  drivers/gpu/drm/i915/intel_lrc.c| 40
-
  drivers/gpu/drm/i915/intel_lrc.h|  3 ++-
  4 files changed, 26 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 3b05bd189eab..714a45cf8a51 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1976,12 +1976,13 @@ static int i915_context_status(struct
seq_file *m, void *unused)
  }

  static void i915_dump_lrc_obj(struct seq_file *m,
-  struct intel_engine_cs *ring,
-  struct drm_i915_gem_object *ctx_obj)
+  struct intel_context *ctx,
+  struct intel_engine_cs *ring)
  {
  struct page *page;
  uint32_t *reg_state;
  int j;
+struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
  unsigned long ggtt_offset = 0;

  if (ctx_obj == NULL) {
@@ -1991,7 +1992,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
  }

  seq_printf(m, "CONTEXT: %s %u\n", ring->name,
-   intel_execlists_ctx_id(ctx_obj));
+   intel_execlists_ctx_id(ctx, ring));

  if (!i915_gem_obj_ggtt_bound(ctx_obj))
  seq_puts(m, "\tNot bound in GGTT\n");
@@ -2040,8 +2041,7 @@ static int i915_dump_lrc(struct seq_file *m,
void *unused)
  list_for_each_entry(ctx, _priv->context_list, link) {
  for_each_ring(ring, dev_priv, i) {
  if (ring->default_context != ctx)
-i915_dump_lrc_obj(m, ring,
-  ctx->engine[i].state);
+i915_dump_lrc_obj(m, ctx, ring);
  }
  }

@@ -2115,11 +2115,8 @@ static int i915_execlists(struct seq_file *m,
void *data)

  seq_printf(m, "\t%d requests in queue\n", count);
  if (head_req) {
-struct drm_i915_gem_object *ctx_obj;
-
-ctx_obj = head_req->ctx->engine[ring_id].state;
  seq_printf(m, "\tHead request id: %u\n",
-   intel_execlists_ctx_id(ctx_obj));
+   intel_execlists_ctx_id(head_req->ctx, ring));
  seq_printf(m, "\tHead request tail: %u\n",
 head_req->tail);
  }
diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index 8cf655c6fc03..b77a5d84eac2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -881,6 +881,7 @@ struct intel_context {
  struct drm_i915_gem_object *state;
  struct intel_ringbuffer *ringbuf;
  int pin_count;
+u32 lrca;


lrc_offset imo. Consistent with our other usage in the driver, and
actually readable. Please apply liberally everywhere else (I know that
bsepc calls it lrca, but we don't need to follow bad naming styles
blindly).

With that Reviewed-by: Daniel Vetter 


  } engine[I915_NUM_RINGS];

  struct list_head link;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c
b/drivers/gpu/drm/i915/intel_lrc.c
index ff146a15d395..ffe004de22b0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -265,7 +265,8 @@ int intel_sanitize_enable_execlists(struct
drm_device *dev, int enable_execlists

  /**
   * intel_execlists_ctx_id() - get the Execlists Context ID
- * @ctx_obj: Logical Ring Context backing object.
+ * @ctx: Context to get the ID for
+ * @ring: Engine to get the ID for
   *
   * Do not confuse with ctx->id! Unfortunately we have a name overload
   * here: the old context ID we pass to userspace as a handler so that
@@ -275,14 +276,12 @@ int intel_sanitize_enable_execlists(struct
drm_device *dev, int enable_execlists
   *
   * Return: 20-bits globally unique context ID.
   */
-u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
+u32 intel_execlists_ctx_id(struct intel_context *ctx,
+   struct intel_engine_cs *ring)
  {
-u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
-LRC_PPHWSP_PN * PAGE_SIZE;
-
  /* LRCA is required to be 4K aligned so the more significant 

Re: [Intel-gfx] [PATCH 021/190] drm/i915: Use HWS for seqno tracking everywhere

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 12:05:06PM +0200, Mika Kuoppala wrote:
> Chris Wilson  writes:
> > -   intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE |
> > -   PIPE_CONTROL_WRITE_FLUSH |
> > -   PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE);
> > -   intel_ring_emit(ring, ring->scratch.gtt_offset | 
> > PIPE_CONTROL_GLOBAL_GTT);
> > -   intel_ring_emit(ring, i915_gem_request_get_seqno(req));
> > +   intel_ring_emit(ring,
> > +   GFX_OP_PIPE_CONTROL(4) |
> > +   PIPE_CONTROL_QW_WRITE |
> > +   PIPE_CONTROL_WRITE_FLUSH);
> 
> Why no more PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE?

I opened vim to add it back in and I coulnd't bring myself to commit
that attrocity.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 07/22] drm/armada: Remove NULL open/pre/postclose hooks

2016-01-12 Thread Russell King - ARM Linux
On Mon, Jan 11, 2016 at 10:41:01PM +0100, Daniel Vetter wrote:
> The compiler will do this, but the void hits when grepping all the
> hooks for a subsystem wide audit are slightly annoying. So remove them
> for next time around.

I'll try to remember to queue this after -rc1, though a reminder
after -rc1 would be useful.

Thanks.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 4/7] drm/i915: Cache LRC state page in the context

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 12:54:25PM +, Tvrtko Ursulin wrote:
> 
> On 12/01/16 12:12, Chris Wilson wrote:
> >On Tue, Jan 12, 2016 at 11:56:11AM +, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin 
> >>
> >>LRC lifetime is well defined so we can cache the page pointing
> >>to the object backing store in the context in order to avoid
> >>walking over the object SG page list from the interrupt context
> >>without the big lock held.
> >>
> >>v2: Also cache the mapping. (Chris Wilson)
> >>v3: Unmap on the error path.
> >
> >Then we only use the lrc_state_page in the unmapping, hardly worth the
> >cost of saving it.
> 
> Ok.
> 
> Do you also know if this would now require any flushing or something
> if previously kunmap_atomic might have done something under the
> covers?

kmap_atomic only changes the PTE and the unmap would flush the TLB. In
terms of our pointer access, using kmap/kmap_atomic is equivalent. (Just
kmap_atomic is meant to be cheaper to set up and a limited resource
which can only be used without preemption.)

> >The reg_state is better associated with the ring (since it basically
> >contains the analog of the RING_HEAD and friends).
> 
> Hm, not sure. The page belongs to the object from that anonymous
> struct in intel_context so I think it is best to keep them together.

The page does sure, but the state does not.

The result is that we get:

ring->registers[CTX_RING_BUFFER_STATE+1] = ring->vma->node.state;
ASSIGN_CTX_PDP(ppgtt, ring->registers, 3);
ASSIGN_CTX_PDP(ppgtt, ring->registers, 2);
ASSIGN_CTX_PDP(ppgtt, ring->registers, 1);
ASSIGN_CTX_PDP(ppgtt, ring->registers, 0);
ring->registers[CTX_RING_TAIL+1] = req->tail;

which makes a lot more sense, to me, when viewed against the underlying
architecture of the hardware and comparing against the legacy ringbuffer.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ warning: Fi.CI.BAT

2016-01-12 Thread Patchwork
== Summary ==

Built on a90796840c30dac6d9907439bf98d1d08046c49d drm-intel-nightly: 
2016y-01m-11d-17h-22m-54s UTC integration manifest

Test kms_pipe_crc_basic:
Subgroup read-crc-pipe-b-frame-sequence:
pass   -> DMESG-WARN (byt-nuc)
Test pm_rpm:
Subgroup basic-rte:
dmesg-warn -> PASS   (byt-nuc) UNSTABLE

bdw-nuci7total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultratotal:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc  total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox  total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2  total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p  total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430stotal:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
skl-i7k-2total:141  pass:132  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps  total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220ttotal:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1141/

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 08/22] drm/gma500: Remove empty preclose hook

2016-01-12 Thread Patrik Jakobsson
On Mon, Jan 11, 2016 at 10:41 PM, Daniel Vetter  wrote:
> I'm auditing them all, empty ones just confuse ...
>
> Cc: Patrik Jakobsson 
> Acked-by: Daniel Stone 
> Reviewed-by: Alex Deucher 
> Signed-off-by: Daniel Vetter 

Acked-by: Patrik Jakobsson 

> ---
>  drivers/gpu/drm/gma500/psb_drv.c | 9 -
>  1 file changed, 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/gma500/psb_drv.c 
> b/drivers/gpu/drm/gma500/psb_drv.c
> index 92e7e5795398..4e1c6850520e 100644
> --- a/drivers/gpu/drm/gma500/psb_drv.c
> +++ b/drivers/gpu/drm/gma500/psb_drv.c
> @@ -442,14 +442,6 @@ static long psb_unlocked_ioctl(struct file *filp, 
> unsigned int cmd,
> /* FIXME: do we need to wrap the other side of this */
>  }
>
> -/*
> - * When a client dies:
> - *- Check for and clean up flipped page state
> - */
> -static void psb_driver_preclose(struct drm_device *dev, struct drm_file 
> *priv)
> -{
> -}
> -
>  static int psb_pci_probe(struct pci_dev *pdev, const struct pci_device_id 
> *ent)
>  {
> return drm_get_pci_dev(pdev, ent, );
> @@ -495,7 +487,6 @@ static struct drm_driver driver = {
> .load = psb_driver_load,
> .unload = psb_driver_unload,
> .lastclose = psb_driver_lastclose,
> -   .preclose = psb_driver_preclose,
> .set_busid = drm_pci_set_busid,
>
> .num_ioctls = ARRAY_SIZE(psb_ioctls),
> --
> 2.6.4
>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 5/5] drm: Enable markdown^Wasciidoc for gpu.tmpl

2016-01-12 Thread Daniel Vetter
On Mon, Jan 11, 2016 at 06:12:12PM -0700, Jonathan Corbet wrote:
> On Sat, 12 Dec 2015 12:13:45 +0100
> Daniel Vetter  wrote:
> 
> > I just figured there's no way this could get it, and I'd
> > much rather improve the docs themselves than trying to convince core
> > kernel folks that this might be useful.
> 
> So I'm not quite sure why you figured that; I never said it, certainly.

To clarify this wasn't really my impression of your stance, but of the
overall room opinion when we had the discussion at KS. And then my main
goal here is to write great docs for drm (we have about 3k lines more docs
in 4.5 already), so that's why I dropped the ball on upstreaming. It
seemed unlikely to succeed, at least without some really seriuos effort at
convincing everyone, all while the drm docs for atomic haven't been in
good shape yet. Since then we had a few contributors of new atomic drivers
note on irc already that "oh cool, this is documented now". Overall really
just boils down to what I see as the most important things for drm ;-)

> I've been messing with it a bit, seems to work.  I do still wish we could
> consider alternatives, especially those that might simplify the toolchain
> rather than complicating it.  But it's clear that I'm not succeeding in
> finding time to actually explore that idea; the contents of $EXCUSES are
> good, but the end result is the same.  And the patch fairy just isn't
> coming through for me on this one.
> 
> In my mind, there's clearly no good that can come from (further) delaying
> something that works in favor of an "it would be nice" that may never
> even exist.  So I'm currently thinking that I'll pull this into the docs
> tree once the merge window is done, with the plan to push it for 4.6.
> Then we can see if anybody screams.
> 
> That gives a couple of weeks for an updated patch set, should you have
> one.
> 
> The build-time increase is painful in the extreme - about a factor of
> three for a -j1 build, and that's with only one file using the feature.
> It feels wrong, somehow, for the docs build to take longer than building
> the kernel itself.  Can we do something about that?
> 
>  - How many of the comments actually use asciidoc features?  Might there
>be some possibility of detecting those in kernel-doc and skipping the
>callout to asciidoc when it's not needed?

I think that amounts to writing a partial parser (we use asciidoc for
tables, lists, links, formatting, code snippets by now already, someone
even thought of using the asciiart->png feature it has but it's not yet
wired up). I don't think it's feasible.

>  - Pandoc seems to do asciidoc.  I still don't like the idea of depending
>on it for this to work, but having the *option* to use it is fine.  If
>it's really that much faster (yes, Python startup is painful) then
>maybe providing the option is worth it.

Hm, Dave asked me to convert to use python-based asciidoc insted of
haskell-based pandoc.

>  - All over the kernel we've seen that batching improves performance.  It
>would take a bit of work, but I bet kernel-doc could put together all
>the snippets from one file, pass them through a single asciidoc
>invocation, then split the results back apart.  That would probably
>eliminate the performance hit entirely.
> 
> None of that is a condition for pulling this stuff in, but can it be
> looked into?

Besides what Jani mention that asciidoctor should be a drop-in replacement
if installed it also seems possible to parallelize the call-out to
kernel-doc from docproc.c without too much effort. I hoped Jani would get
around to implement the asciidoctor support, and I'm hoping I can snipe
away some free sometimes the next few months to look at docproc.c more
seriously. This would kinda be a cool intern project, but atm we throw
them all at improving testing infrastructure ...

Anyway I'm of course still open to get this upstream, and I think a few
things should be polished (like the speed-up). But right now bandwidth on
my side isn't too plentiful. Maybe we should aim to have a few better
ideas (perhaps even for all of the docs stuff) for next KS and respin that
discussion?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2 1/6] drm/i915: move VBT based TV presence check to intel_bios.c

2016-01-12 Thread Daniel Vetter
On Mon, Jan 11, 2016 at 09:54:37PM +0200, Jani Nikula wrote:
> Hide knowledge about VBT child devices in intel_bios.c.
> 
> Signed-off-by: Jani Nikula 

Assuming you copypasted correctly ;-)

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/i915/i915_drv.h   |  1 +
>  drivers/gpu/drm/i915/intel_bios.c | 38 ++
>  drivers/gpu/drm/i915/intel_tv.c   | 43 
> +--
>  3 files changed, 40 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 104bd1809936..3822c465d3dc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3347,6 +3347,7 @@ extern void intel_i2c_reset(struct drm_device *dev);
>  /* intel_bios.c */
>  int intel_bios_init(struct drm_i915_private *dev_priv);
>  bool intel_bios_is_valid_vbt(const void *buf, size_t size);
> +bool intel_bios_is_tv_present(struct drm_i915_private *dev_priv);
>  
>  /* intel_opregion.c */
>  #ifdef CONFIG_ACPI
> diff --git a/drivers/gpu/drm/i915/intel_bios.c 
> b/drivers/gpu/drm/i915/intel_bios.c
> index 15ba52bd2538..bb9e8b086b63 100644
> --- a/drivers/gpu/drm/i915/intel_bios.c
> +++ b/drivers/gpu/drm/i915/intel_bios.c
> @@ -1415,3 +1415,41 @@ intel_bios_init(struct drm_i915_private *dev_priv)
>  
>   return 0;
>  }
> +
> +/**
> + * intel_bios_is_tv_present - is integrated TV present in VBT
> + * @dev_priv:i915 device instance
> + *
> + * Return true if TV is present. If no child devices were parsed from VBT,
> + * assume TV is present.
> + */
> +bool intel_bios_is_tv_present(struct drm_i915_private *dev_priv)
> +{
> + union child_device_config *p_child;
> + int i;
> +
> + if (!dev_priv->vbt.child_dev_num)
> + return true;
> +
> + for (i = 0; i < dev_priv->vbt.child_dev_num; i++) {
> + p_child = dev_priv->vbt.child_dev + i;
> + /*
> +  * If the device type is not TV, continue.
> +  */
> + switch (p_child->old.device_type) {
> + case DEVICE_TYPE_INT_TV:
> + case DEVICE_TYPE_TV:
> + case DEVICE_TYPE_TV_SVIDEO_COMPOSITE:
> + break;
> + default:
> + continue;
> + }
> + /* Only when the addin_offset is non-zero, it is regarded
> +  * as present.
> +  */
> + if (p_child->old.addin_offset)
> + return true;
> + }
> +
> + return false;
> +}
> diff --git a/drivers/gpu/drm/i915/intel_tv.c b/drivers/gpu/drm/i915/intel_tv.c
> index 948cbff6c62e..29e68859b9b7 100644
> --- a/drivers/gpu/drm/i915/intel_tv.c
> +++ b/drivers/gpu/drm/i915/intel_tv.c
> @@ -1529,47 +1529,6 @@ static const struct drm_encoder_funcs 
> intel_tv_enc_funcs = {
>   .destroy = intel_encoder_destroy,
>  };
>  
> -/*
> - * Enumerate the child dev array parsed from VBT to check whether
> - * the integrated TV is present.
> - * If it is present, return 1.
> - * If it is not present, return false.
> - * If no child dev is parsed from VBT, it assumes that the TV is present.
> - */
> -static int tv_is_present_in_vbt(struct drm_device *dev)
> -{
> - struct drm_i915_private *dev_priv = dev->dev_private;
> - union child_device_config *p_child;
> - int i, ret;
> -
> - if (!dev_priv->vbt.child_dev_num)
> - return 1;
> -
> - ret = 0;
> - for (i = 0; i < dev_priv->vbt.child_dev_num; i++) {
> - p_child = dev_priv->vbt.child_dev + i;
> - /*
> -  * If the device type is not TV, continue.
> -  */
> - switch (p_child->old.device_type) {
> - case DEVICE_TYPE_INT_TV:
> - case DEVICE_TYPE_TV:
> - case DEVICE_TYPE_TV_SVIDEO_COMPOSITE:
> - break;
> - default:
> - continue;
> - }
> - /* Only when the addin_offset is non-zero, it is regarded
> -  * as present.
> -  */
> - if (p_child->old.addin_offset) {
> - ret = 1;
> - break;
> - }
> - }
> - return ret;
> -}
> -
>  void
>  intel_tv_init(struct drm_device *dev)
>  {
> @@ -1585,7 +1544,7 @@ intel_tv_init(struct drm_device *dev)
>   if ((I915_READ(TV_CTL) & TV_FUSE_STATE_MASK) == TV_FUSE_STATE_DISABLED)
>   return;
>  
> - if (!tv_is_present_in_vbt(dev)) {
> + if (!intel_bios_is_tv_present(dev_priv)) {
>   DRM_DEBUG_KMS("Integrated TV is not present.\n");
>   return;
>   }
> -- 
> 2.1.4
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 073/190] drm/i915: Introduce i915_gem_active for request tracking

2016-01-12 Thread Tvrtko Ursulin


On 11/01/16 22:49, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 05:32:27PM +, Tvrtko Ursulin wrote:

+struct i915_gem_active {
+   struct drm_i915_gem_request *request;
+};
+
+static inline void
+i915_gem_request_mark_active(struct drm_i915_gem_request *request,
+struct i915_gem_active *active)
+{
+   i915_gem_request_assign(>request, request);
+}


This function name bothers me since I think the name is misleading
and unintuitive. It is not marking a request as active but
associating it with the second data structure.

Maybe i915_gem_request_move_to_active to keep the mental association
with the well established vma_move_to_active ?


That's backwards imo, since it is the i915_gem_active that gets added to
the request. (Or at least will be.)


Maybe struct i915_gem_active could also be better called
i915_gem_active_list ?


It's not a list but a node. I started with drm_i915_gem_request_node,
but that's too unwieldy and I felt even more confusing.


It is not ideal because of the future little reversal of who is in
who's list, so maybe there is something even better. But I think an
intuitive name is really important for code clarity and
maintainability.


In userspace, I have the request (which is actually a userspace fence
itself) containing a list of fences (that are identical to i915_gem_active,
they track which request contains the reference and a callback for
signalling) and those fences have a direct correspondence to,
ARB_sync_objects, for example. But we already have plenty of conflict
regarding the term fence, so that's no go.

i915_gem_active, for me, made the association with the active-reference
tracking that is ingrained into the objects and beyond. I quite like the
connection with GPU activity

i915_gem_retire_notifier? Hmm, I still like how
i915_gem_active.request != NULL is quite self-descriptive.


Yes the last bit is neat.

Perhaps then leave the structure name as is and just rename the function 
to i915_gem_request_assign_active? I think that describes better what is 
actually happening.


Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 020/190] drm/i915: Remove the lazy_coherency parameter from request-completed?

2016-01-12 Thread Mika Kuoppala
Chris Wilson  writes:

> Now that we have split out the seqno-barrier from the
> engine->get_seqno() callback itself, we can move the users of the
> seqno-barrier to the required callsites simplifying the common code and
> making the required workaround handling much more explicit.
>
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c  |  4 ++--
>  drivers/gpu/drm/i915/i915_drv.h  | 17 -
>  drivers/gpu/drm/i915/i915_gem.c  | 24 
>  drivers/gpu/drm/i915/intel_display.c |  2 +-
>  drivers/gpu/drm/i915/intel_pm.c  |  4 ++--
>  5 files changed, 29 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 1499e2337e5d..d09e48455dcb 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -601,7 +601,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, 
> void *data)
>  
> i915_gem_request_get_seqno(work->flip_queued_req),
>  dev_priv->next_seqno,
>  ring->get_seqno(ring),
> -
> i915_gem_request_completed(work->flip_queued_req, true));
> +
> i915_gem_request_completed(work->flip_queued_req));
>   } else
>   seq_printf(m, "Flip not associated with any 
> ring\n");
>   seq_printf(m, "Flip queued on frame %d, (was ready on 
> frame %d), now %d\n",
> @@ -1354,8 +1354,8 @@ static int i915_hangcheck_info(struct seq_file *m, void 
> *unused)
>   intel_runtime_pm_get(dev_priv);
>  
>   for_each_ring(ring, dev_priv, i) {
> - seqno[i] = ring->get_seqno(ring);
>   acthd[i] = intel_ring_get_active_head(ring);
> + seqno[i] = ring->get_seqno(ring);

Oh! Perhaps I am overly optimistic but did you just show how to solve
the 'hangcheck elapsed  ring idle..' coherency issue
in hangcheck?

I would like to have a separate patch that does this ordering
in here and also in i915_hangcheck_elapsed to be in the safe
side, regardless.

Or at minimum, copy the acthd read, seqno read ordering
into the i915_hangcheck_elapsed also.

-Mika


>   }
>  
>   i915_get_extra_instdone(dev, instdone);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 9762aa76bb0a..44d46018ee13 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2969,20 +2969,14 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
>   return (int32_t)(seq1 - seq2) >= 0;
>  }
>  
> -static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
> -bool lazy_coherency)
> +static inline bool i915_gem_request_started(struct drm_i915_gem_request *req)
>  {
> - if (!lazy_coherency && req->ring->irq_seqno_barrier)
> - req->ring->irq_seqno_barrier(req->ring);
>   return i915_seqno_passed(req->ring->get_seqno(req->ring),
>req->previous_seqno);
>  }
>  
> -static inline bool i915_gem_request_completed(struct drm_i915_gem_request 
> *req,
> -   bool lazy_coherency)
> +static inline bool i915_gem_request_completed(struct drm_i915_gem_request 
> *req)
>  {
> - if (!lazy_coherency && req->ring->irq_seqno_barrier)
> - req->ring->irq_seqno_barrier(req->ring);
>   return i915_seqno_passed(req->ring->get_seqno(req->ring),
>req->seqno);
>  }
> @@ -3636,6 +3630,8 @@ static inline void i915_trace_irq_get(struct 
> intel_engine_cs *ring,
>  
>  static inline bool __i915_request_irq_complete(struct drm_i915_gem_request 
> *req)
>  {
> + struct intel_engine_cs *engine = req->ring;
> +
>   /* Ensure our read of the seqno is coherent so that we
>* do not "miss an interrupt" (i.e. if this is the last
>* request and the seqno write from the GPU is not visible
> @@ -3647,7 +3643,10 @@ static inline bool __i915_request_irq_complete(struct 
> drm_i915_gem_request *req)
>* but it is easier and safer to do it every time the waiter
>* is woken.
>*/
> - if (i915_gem_request_completed(req, false))
> + if (engine->irq_seqno_barrier)
> + engine->irq_seqno_barrier(engine);
> +
> + if (i915_gem_request_completed(req))
>   return true;
>  
>   /* We need to check whether any gpu reset happened in between
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 4b26529f1f44..d125820c6309 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1171,12 +1171,12 @@ static bool __i915_spin_request(struct 
> drm_i915_gem_request *req,
>   

Re: [Intel-gfx] [PATCH 3/7] drm/i915: Add per context timelines to fence object

2016-01-12 Thread John Harrison

On 11/01/2016 22:58, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 02:47:33PM -0800, Jesse Barnes wrote:

On 01/11/2016 11:03 AM, John Harrison wrote:

On 08/01/2016 22:05, Chris Wilson wrote:

On Fri, Jan 08, 2016 at 06:47:24PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

The fence object used inside the request structure requires a sequence
number. Although this is not used by the i915 driver itself, it could
potentially be used by non-i915 code if the fence is passed outside of
the driver. This is the intention as it allows external kernel drivers
and user applications to wait on batch buffer completion
asynchronously via the dma-buff fence API.

That doesn't make any sense as they are not limited by a single
timeline.

I don't understand what you mean. Who is not limited by a single timeline?  The 
point is that the current seqno values cannot be used as there is no guarantee 
that they will increment globally once things like a scheduler and pre-emption 
arrive. Whereas, the fence internal implementation makes various assumptions 
about the linearity of the timeline. External users do not want to care about 
timelines or seqnos at all, they just want the fence API to work as documented.


To ensure that such external users are not confused by strange things
happening with the seqno, this patch adds in a per context timeline
that can provide a guaranteed in-order seqno value for the fence. This
is safe because the scheduler will not re-order batch buffers within a
context - they are considered to be mutually dependent.

You haven't added per-context breadcrumbs. What we need for being able
to execute requests from parallel timelines, but with requests within a
timeline being ordered, is a per-context page where we can emit the
per-context issued breadcrumb. Then instead of looking up the current
HW seqno in a global page, the request just looks at the current context
HW seqno in the context seq, just
i915_seqno_passed(*req->p_context_seqno, req->seqno).

This patch is not attempting to implement per context seqno values. That can be 
done as future work. This patch is doing the simplest, least invasive 
implementation in order to make external fences work.

Right.  I think we want to move to per-context seqnos, but we don't have to do 
it before this work lands.  It should be easier to do it after the rest of 
these bits land in fact, since seqno handling will be well encapsulated aiui.

This patch is irrelevent then. I think it is actually worse because it
is encapsulating a design detail that is fundamentally wrong.
-Chris



Some kind of per-context timeline is required for the external use of 
i915 fences. Seqnos cannot be used without a lot of rework because they 
dance around with scheduler re-ordering and pre-emption - a low priority 
request could go through many different seqnos if it keeps getting 
pre-empted. We need to be able to use fences externally on Android at 
least, and with SVM it becomes vital for linux too. Therefore we need 
some solution. And this is much simpler and than re-writing the whole of 
the driver's seqno management.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v3 3/7] drm/i915: Cache ringbuffer GTT VMA

2016-01-12 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

Purpose is to avoid calling i915_gem_obj_ggtt_offset from the
interrupt context without the big lock held.

v2: Renamed gtt_start to gtt_offset. (Daniel Vetter)
v3: Cache the VMA instead of address. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin 
Cc: Daniel Vetter 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/intel_lrc.c| 3 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 504030ce7f93..94314b344f38 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -340,7 +340,6 @@ static int execlists_update_context(struct 
drm_i915_gem_request *rq)
struct intel_engine_cs *ring = rq->ring;
struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
-   struct drm_i915_gem_object *rb_obj = rq->ringbuf->obj;
struct page *page;
uint32_t *reg_state;
 
@@ -350,7 +349,7 @@ static int execlists_update_context(struct 
drm_i915_gem_request *rq)
reg_state = kmap_atomic(page);
 
reg_state[CTX_RING_TAIL+1] = rq->tail;
-   reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
+   reg_state[CTX_RING_BUFFER_START+1] = rq->ringbuf->vma->node.start;
 
if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
/* True 32b PPGTT with dynamic page allocation: update PDP
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 339701d7a9a5..eccc5323d59b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1988,6 +1988,7 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer 
*ringbuf)
else
iounmap(ringbuf->virtual_start);
ringbuf->virtual_start = NULL;
+   ringbuf->vma = NULL;
i915_gem_object_ggtt_unpin(ringbuf->obj);
 }
 
@@ -2054,6 +2055,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device 
*dev,
}
}
 
+   ringbuf->vma = i915_gem_obj_to_ggtt(obj);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 85ce2272f92c..ede57954080e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -99,6 +99,7 @@ struct intel_ring_hangcheck {
 struct intel_ringbuffer {
struct drm_i915_gem_object *obj;
void __iomem *virtual_start;
+   struct i915_vma *vma;
 
struct intel_engine_cs *ring;
struct list_head link;
-- 
1.9.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4 02/38] drm/i915: Explicit power enable during deferred context initialisation

2016-01-12 Thread John Harrison

On 12/01/2016 00:20, Chris Wilson wrote:

On Mon, Jan 11, 2016 at 06:42:31PM +, john.c.harri...@intel.com wrote:

From: John Harrison 

A later patch in this series re-organises the batch buffer submission
code. Part of that is to reduce the scope of a pm_get/put pair.
Specifically, they previously wrapped the entire submission path from
the very start to the very end, now they only wrap the actual hardware
submission part in the back half.

However, as you haven't fixed the ordering issue that requires rpm_get
before struct_mutex, this is broken.
Why does 'intel_runtime_pm_get' require the struct mutex to be held? It 
has certainly not complained at me about trying to do stuff without it.




  When we have rpm fixed, you don't
need this patch as we can take the wakeref around the GSM access itself.
We still want the prolonged rpm wakeref for the GPU activity, ofc.
-Chris



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 7/7] drm/i915/dp: Enable Upfront link training for typeC DP support on BXT

2016-01-12 Thread Ander Conselvan De Oliveira
On Mon, 2016-01-11 at 17:53 +, R, Durgadoss wrote:
> > -Original Message-
> > From: Ander Conselvan De Oliveira [mailto:conselv...@gmail.com]
> > Sent: Monday, January 11, 2016 8:46 PM
> > To: R, Durgadoss; intel-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH 7/7] drm/i915/dp: Enable Upfront link training for typeC
> > DP support on BXT
> > 
> > On Fri, 2015-12-11 at 15:09 +0530, Durgadoss R wrote:
> > > To support USB type C alternate DP mode, the display driver needs to
> > > know the number of lanes required by the DP panel as well as number
> > > of lanes that can be supported by the type-C cable. Sometimes, the
> > > type-C cable may limit the bandwidth even if Panel can support
> > > more lanes. To address these scenarios, the display driver will
> > > start link training with max lanes, and if that fails, the driver
> > > falls back to x2 lanes; and repeats this procedure for all
> > > bandwidth/lane configurations.
> > > 
> > > * Since link training is done before modeset only the port
> > >   (and not pipe/planes) and its associated PLLs are enabled.
> > > * On DP hotplug: Directly start link training on the crtc
> > >   associated with the DP encoder.
> > > * On Connected boot scenarios: When booted with an LFP and a DP,
> > >   typically, BIOS brings up DP. In these cases, we disable the
> > >   crtc, do upfront link training and then re-enable it back.
> > > * All local changes made for upfront link training are reset
> > >   to their previous values once it is done; so that the
> > >   subsequent modeset is not aware of these changes.
> > > 
> > > Signed-off-by: Durgadoss R 
> > > ---
> > >  drivers/gpu/drm/i915/intel_ddi.c | 81
> > > 
> > >  drivers/gpu/drm/i915/intel_dp.c  | 77
> > > +-
> > >  drivers/gpu/drm/i915/intel_drv.h |  2 +
> > >  3 files changed, 159 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_ddi.c
> > > b/drivers/gpu/drm/i915/intel_ddi.c
> > > index 632044a..43efc26 100644
> > > --- a/drivers/gpu/drm/i915/intel_ddi.c
> > > +++ b/drivers/gpu/drm/i915/intel_ddi.c
> > > @@ -3274,6 +3274,87 @@ intel_ddi_init_hdmi_connector(struct
> > > intel_digital_port
> > > *intel_dig_port)
> > >   return connector;
> > >  }
> > > 
> > > +bool intel_ddi_upfront_link_train(struct intel_dp *intel_dp,
> > > + struct intel_crtc *crtc)
> > > +{
> > > + struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
> > > + struct intel_connector *connector = intel_dp->attached_connector;
> > > + struct intel_encoder *encoder = connector->encoder;
> > > + struct drm_device *dev = encoder->base.dev;
> > > + struct drm_i915_private *dev_priv = dev->dev_private;
> > > + struct intel_shared_dpll *pll;
> > > + struct drm_crtc *drm_crtc = NULL;
> > > + struct intel_crtc_state *tmp_crtc_config;
> > > + struct intel_dpll_hw_state tmp_dpll_hw_state;
> > > + uint8_t link_bw, lane_count;
> > > +
> > > + if (!crtc) {
> > > + drm_crtc = intel_get_unused_crtc(>base);
> > > + if (!drm_crtc) {
> > > + DRM_ERROR("No crtc for upfront link training\n");
> > > + return false;
> > > + }
> > > + encoder->base.crtc = drm_crtc;
> > > + crtc = to_intel_crtc(drm_crtc);
> > > + }
> > > +
> > > + /* Initialize with Max Link rate & lane count supported by panel
> > > */
> > > + link_bw =  intel_dp->dpcd[DP_MAX_LINK_RATE];
> > > + lane_count = drm_dp_max_lane_count(intel_dp->dpcd);
> > > +
> > > + /* Save the crtc->config */
> > > + tmp_crtc_config = crtc->config;
> > > + tmp_dpll_hw_state = crtc->config->dpll_hw_state;
> > > +
> > > + /* Select the shared DPLL to use for this port */
> > > + intel_get_ddi_pll(dev_priv, dig_port->port, crtc->config);
> > > + pll = intel_crtc_to_shared_dpll(crtc);
> > > + if (!pll) {
> > > + DRM_ERROR("Could not get shared DPLL\n");
> > > + goto exit;
> > > + }
> > > + DRM_DEBUG_KMS("Using %s for pipe %c\n", pll->name, pipe_name(crtc
> > > ->pipe));
> > > +
> > > + do {
> > > + crtc->config->port_clock =
> > > drm_dp_bw_code_to_link_rate(link_bw);
> > > + crtc->config->lane_count = lane_count;
> > > + if (!intel_ddi_pll_select(crtc, crtc->config, encoder,
> > > false))
> > > + goto exit;
> > > +
> > > + pll->config.crtc_mask |= (1 << crtc->pipe);
> > > + pll->config.hw_state = crtc->config->dpll_hw_state;
> > > +
> > > + /* Enable PLL followed by port */
> > > + intel_enable_shared_dpll(crtc);
> > > + encoder->pre_enable(encoder);
> > > +
> > > + /* Check if link training passed; if so update DPCD */
> > > + if (intel_dp->train_set_valid)
> > > + intel_dp_update_dpcd_params(intel_dp);
> > > +
> > > + /* Disable port followed by PLL for next retry/clean up
> > > */
> > > + encoder->post_disable(encoder);
> > > + 

Re: [Intel-gfx] [PATCH v4 02/38] drm/i915: Explicit power enable during deferred context initialisation

2016-01-12 Thread Chris Wilson
On Tue, Jan 12, 2016 at 11:11:20AM +, John Harrison wrote:
> On 12/01/2016 00:20, Chris Wilson wrote:
> >On Mon, Jan 11, 2016 at 06:42:31PM +, john.c.harri...@intel.com wrote:
> >>From: John Harrison 
> >>
> >>A later patch in this series re-organises the batch buffer submission
> >>code. Part of that is to reduce the scope of a pm_get/put pair.
> >>Specifically, they previously wrapped the entire submission path from
> >>the very start to the very end, now they only wrap the actual hardware
> >>submission part in the back half.
> >However, as you haven't fixed the ordering issue that requires rpm_get
> >before struct_mutex, this is broken.
> Why does 'intel_runtime_pm_get' require the struct mutex to be held?
> It has certainly not complained at me about trying to do stuff
> without it.

Because it depends upon the struct_mutex and rpm doesn't have sufficient
lockdep integration to be able to warn about using rpm from the
incorrect contexts.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/10] drm/i915: Fixes from my attempt at running igt on gen2

2016-01-12 Thread Ville Syrjälä
On Mon, Dec 14, 2015 at 06:23:39PM +0200, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> It's been a while since I last ran igt on gen2, so I figured I'd
> give it a shot. 855 had some failures, 830 no longer worked at
> all. So I went ahead and fixed them, and here's the result.
> 
> The first three patches are not even gen2 specific bugs I caught
> with this effort. The rest is for gen2.
> 
> I have some fixes for igt as well, which I'll post separately.
> 
> The good news is that with these patches (and the igt fixes) my
> 855 completes a full kms_flip run without failures, and the BAT
> set has only one failure (gem_render_tiled_blits). 830 is fairly
> good too, but it does have a lot of underruns and pipe_assert()
> dmesg warnings. Lot of those are due to the pipe enable quirks
> since we implement those quite haphazardly.
> 
> The series is available here:
> git://github.com/vsyrjala/linux.git gen2_igt_fixes
> 
> Ville Syrjälä (10):
>   drm/i915: Cleanup phys status page too
>   drm/i915: Wait for pipe to start before sampling vblank timestamps on
> gen2
>   drm/i915: Allow 27 bytes child_dev for VBT <109
>   drm/i915: Expect child dev size of 22 bytes for VBT < 106
>   drm/i915: Use MI_BATCH_BUFFER_START on 830/845

Merged these. Thanks for the reviews.

>   drm/i915: Release mmaps on partial ggtt vma unbind
>   drm/i915: Write out crc frame counts in hex
>   drm/i915: Use drm_vblank_count() on gen2 for crc frame count
>   drm/i915: Enable vblank_disable_immediate on gen2
>   drm/i915: Reject < 8 byte batches on 830/845

And at some point I'll need to figure out what to do with the
rest. Some I'll just drop for sure.

> 
>  drivers/gpu/drm/i915/i915_debugfs.c| 13 -
>  drivers/gpu/drm/i915/i915_gem.c|  3 +++
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  3 +++
>  drivers/gpu/drm/i915/i915_irq.c| 14 +-
>  drivers/gpu/drm/i915/intel_bios.c  | 21 
>  drivers/gpu/drm/i915/intel_display.c   | 11 +++
>  drivers/gpu/drm/i915/intel_ringbuffer.c| 31 
> +++---
>  7 files changed, 71 insertions(+), 25 deletions(-)
> 
> -- 
> 2.4.10

-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] RPM wakelock ref not held during HW access

2016-01-12 Thread Sergey Senozhatsky
Hello,

-mmots 4.4.0-mm1-dbg-00602-g776bd09


[ 5331.509087] WARNING: CPU: 0 PID: 359 at 
drivers/gpu/drm/i915/intel_drv.h:1446 gen6_read32+0x7b/0x253 [i915]()
[ 5331.509091] RPM wakelock ref not held during HW access
[ 5331.509093] Modules linked in:
[ 5331.509182] CPU: 0 PID: 359 Comm: Xorg Not tainted 
4.4.0-mm1-dbg-00602-g776bd09-dirty #34
[ 5331.509186]   88041bddfac0 811eb860 
88041bddfb08
[ 5331.509194]  88041bddfaf8 81040fc2 a05442a9 
88041aab
[ 5331.509200]  00064000 88041afcb001 0001 
88041bddfb60
[ 5331.509207] Call Trace:
[ 5331.509218]  [] dump_stack+0x4b/0x63
[ 5331.509227]  [] warn_slowpath_common+0x99/0xb2
[ 5331.509277]  [] ? gen6_read32+0x7b/0x253 [i915]
[ 5331.509283]  [] warn_slowpath_fmt+0x48/0x50
[ 5331.509331]  [] gen6_read32+0x7b/0x253 [i915]
[ 5331.509338]  [] ? mutex_unlock+0xe/0x10
[ 5331.509391]  [] intel_ddi_get_hw_state+0x5e/0x159 [i915]
[ 5331.509443]  [] intel_ddi_connector_get_hw_state+0x5c/0xdf 
[i915]
[ 5331.509494]  [] intel_atomic_commit+0x8e0/0x1250 [i915]
[ 5331.509528]  [] ? drm_atomic_check_only+0x293/0x564 [drm]
[ 5331.509558]  [] drm_atomic_commit+0x4d/0x52 [drm]
[ 5331.509572]  [] 
drm_atomic_helper_connector_dpms+0x116/0x17e [drm_kms_helper]
[ 5331.509599]  [] drm_mode_obj_set_property_ioctl+0xef/0x17a 
[drm]
[ 5331.509626]  [] 
drm_mode_connector_property_set_ioctl+0x30/0x32 [drm]
[ 5331.509642]  [] drm_ioctl+0x26d/0x3a8 [drm]
[ 5331.509668]  [] ? 
drm_mode_obj_set_property_ioctl+0x17a/0x17a [drm]
[ 5331.509675]  [] ? lock_acquire+0x101/0x188
[ 5331.509681]  [] ? __fget+0x5/0x19d
[ 5331.509685]  [] ? __lock_is_held+0x3c/0x57
[ 5331.509691]  [] vfs_ioctl+0x18/0x34
[ 5331.509695]  [] do_vfs_ioctl+0x572/0x5f1
[ 5331.509699]  [] ? __fget_light+0x62/0x71
[ 5331.509704]  [] SyS_ioctl+0x43/0x61
[ 5331.509711]  [] entry_SYSCALL_64_fastpath+0x12/0x6f
[ 5331.509715] ---[ end trace 70d4fd86a0395d92 ]---

-ss
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Handle error paths during watermark sanitization properly (v3)

2016-01-12 Thread Matt Roper
sanitize_watermarks() does not properly handle errors returned by
drm_atomic_helper_duplicate_state().  Make failures drop locks before
returning.  We also change the lock of connection_mutex to a
drm_modeset_lock_all_ctx() to make sure any EDEADLK's are handled
earlier.

v2: Change call to lock connetion_mutex with a call to
drm_modeset_lock_all_ctx().  This ensures that any lock contention
is handled earlier and drm_atomic_helper_duplicate_state() won't
return EDEADLK. (Maarten)

v3: Drop locks properly in more error paths. (Maarten)

Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Signed-off-by: Matt Roper 
Reviewed-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/intel_display.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index abfb5ba..07ca19b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15353,17 +15353,17 @@ static void sanitize_watermarks(struct drm_device 
*dev)
 */
drm_modeset_acquire_init(, 0);
 retry:
-   ret = drm_modeset_lock(>mode_config.connection_mutex, );
+   ret = drm_modeset_lock_all_ctx(dev, );
if (ret == -EDEADLK) {
drm_modeset_backoff();
goto retry;
} else if (WARN_ON(ret)) {
-   return;
+   goto fail;
}
 
state = drm_atomic_helper_duplicate_state(dev, );
if (WARN_ON(IS_ERR(state)))
-   return;
+   goto fail;
 
/*
 * Hardware readout is the only time we don't want to calculate
@@ -15386,7 +15386,7 @@ retry:
 * BIOS-programmed watermarks untouched and hope for the best.
 */
WARN(true, "Could not determine valid watermarks for inherited 
state\n");
-   return;
+   goto fail;
}
 
/* Write calculated watermark values back */
@@ -15399,6 +15399,7 @@ retry:
}
 
drm_atomic_state_free(state);
+fail:
drm_modeset_drop_locks();
drm_modeset_acquire_fini();
 }
-- 
2.1.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/15] drm/i915: Cure DDI from multiple personality disorder

2016-01-12 Thread Ville Syrjälä
On Tue, Dec 08, 2015 at 07:59:35PM +0200, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> While debugging problems on DDI platforms I got tired of the crap
> caused by the the dual personality DDI encoders, so I went ahead
> and split them into separate HDMI and DP encoders.
> 
> As usual, the hole got a bit deeper after I noticed that I had to
> kill intel_prepare_ddi() as well.
> 
> This also needs the check_digital_port_conflicts() patch [1] because
> otherwise kms_setmode results in a dead machine.
> 
> Series avaialbe there (with [1] included):
> git://github.com/vsyrjala/linux.git ddi_encoder_split_3
> 
> [1] http://lists.freedesktop.org/archives/intel-gfx/2015-December/082284.html
> 
> Ville Syrjälä (15):
>   drm/i915: Pass the correct encoder to intel_ddi_clk_select() with MST
>   drm/i915: Check max number of lanes when registering DDI ports
>   drm/i915: Store max lane count in intel_digital_port
>   drm/i915: Remove pointless 'ddi_translations' local variable
>   drm/i915: Eliminate duplicated skl_get_buf_trans_dp()
>   drm/i915: Pass around dev_priv for ddi buffer programming
>   drm/i915: Add BUILD_BUG_ON()s for DDI translation table sizes
>   drm/i915: Reject >9 ddi translation entried if port != A/E on SKL
>   drm/i915: Kill intel_prepare_ddi()

Merged up to here. Skipped the BUILD_BUG_ON patch since it didn't get
any r-b love. Thanks for reviews.

>   drm/i915: Split the problematic dual-role DDI encoder into two
> encoders
>   drm/i915: Explicitly use ddi bug trans entry 9 for hdmi
>   drm/i915: Split DP/eDP/FDI and HDMI/DVI DDI buffer programming apart
>   drm/i915: Add a sanity check for 'hdmi_default_entry'
>   drm/i915: Get the iboost setting based on the port type
>   drm/i915: Simplify intel_ddi_get_encoder_port()

I'll repost the rest separately once I find a bit of time to address
the review comments.

> 
>  drivers/gpu/drm/i915/i915_drv.c |   1 -
>  drivers/gpu/drm/i915/intel_ddi.c| 542 
> +++-
>  drivers/gpu/drm/i915/intel_display.c|  21 --
>  drivers/gpu/drm/i915/intel_dp.c |  36 +--
>  drivers/gpu/drm/i915/intel_dp_mst.c |   4 +-
>  drivers/gpu/drm/i915/intel_drv.h|   6 +-
>  drivers/gpu/drm/i915/intel_hdmi.c   |   9 +
>  drivers/gpu/drm/i915/intel_opregion.c   |   1 -
>  drivers/gpu/drm/i915/intel_runtime_pm.c |  12 -
>  9 files changed, 344 insertions(+), 288 deletions(-)
> 
> -- 
> 2.4.10

-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 1/3] drm/i915: Encapsulate the pwm_device in a pwm_info structure

2016-01-12 Thread Shobhit Kumar
pwm_info helps in encapsulating the PWM period_ns values and will form
basis of adding new pwm devices which can then be genrically used by
initializing proper pwm_info structure in the backlight setup call.

Cc: cbroo...@gmail.com
Cc: jani.nik...@linux.intel.com
Signed-off-by: Shobhit Kumar 
---
 drivers/gpu/drm/i915/intel_drv.h   |  8 ++-
 drivers/gpu/drm/i915/intel_panel.c | 45 --
 2 files changed, 35 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index bdfe403..6f96769 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -161,6 +161,12 @@ struct intel_encoder {
enum hpd_pin hpd_pin;
 };
 
+struct intel_pwm_info {
+   struct pwm_device *dev;
+   unsigned int period_ns;
+   char *name;
+};
+
 struct intel_panel {
struct drm_display_mode *fixed_mode;
struct drm_display_mode *downclock_mode;
@@ -179,7 +185,7 @@ struct intel_panel {
/* PWM chip */
bool util_pin_active_low;   /* bxt+ */
u8 controller;  /* bxt+ only */
-   struct pwm_device *pwm;
+   struct intel_pwm_info *pwm;
 
struct backlight_device *device;
 
diff --git a/drivers/gpu/drm/i915/intel_panel.c 
b/drivers/gpu/drm/i915/intel_panel.c
index 21ee647..9e24c59 100644
--- a/drivers/gpu/drm/i915/intel_panel.c
+++ b/drivers/gpu/drm/i915/intel_panel.c
@@ -37,6 +37,13 @@
 
 #define CRC_PMIC_PWM_PERIOD_NS 21333
 
+/* CRC PMIC based PWM Information */
+struct intel_pwm_info crc_pwm_info = {
+   .period_ns = CRC_PMIC_PWM_PERIOD_NS,
+   .name = "pwm_backlight",
+   .dev = NULL,
+};
+
 void
 intel_fixed_panel_mode(const struct drm_display_mode *fixed_mode,
   struct drm_display_mode *adjusted_mode)
@@ -538,10 +545,11 @@ static u32 bxt_get_backlight(struct intel_connector 
*connector)
 static u32 pwm_get_backlight(struct intel_connector *connector)
 {
struct intel_panel *panel = >panel;
+   struct intel_pwm_info *pwm = panel->backlight.pwm;
int duty_ns;
 
-   duty_ns = pwm_get_duty_cycle(panel->backlight.pwm);
-   return DIV_ROUND_UP(duty_ns * 100, CRC_PMIC_PWM_PERIOD_NS);
+   duty_ns = pwm_get_duty_cycle(pwm->dev);
+   return DIV_ROUND_UP(duty_ns * 100, pwm->period_ns);
 }
 
 static u32 intel_panel_get_backlight(struct intel_connector *connector)
@@ -630,9 +638,10 @@ static void bxt_set_backlight(struct intel_connector 
*connector, u32 level)
 static void pwm_set_backlight(struct intel_connector *connector, u32 level)
 {
struct intel_panel *panel = >panel;
-   int duty_ns = DIV_ROUND_UP(level * CRC_PMIC_PWM_PERIOD_NS, 100);
+   struct intel_pwm_info *pwm = panel->backlight.pwm;
+   int duty_ns = DIV_ROUND_UP(level * pwm->period_ns, 100);
 
-   pwm_config(panel->backlight.pwm, duty_ns, CRC_PMIC_PWM_PERIOD_NS);
+   pwm_config(pwm->dev, duty_ns, pwm->period_ns);
 }
 
 static void
@@ -801,11 +810,12 @@ static void bxt_disable_backlight(struct intel_connector 
*connector)
 static void pwm_disable_backlight(struct intel_connector *connector)
 {
struct intel_panel *panel = >panel;
+   struct intel_pwm_info *pwm = panel->backlight.pwm;
 
/* Disable the backlight */
-   pwm_config(panel->backlight.pwm, 0, CRC_PMIC_PWM_PERIOD_NS);
+   pwm_config(pwm->dev, 0, pwm->period_ns);
usleep_range(2000, 3000);
-   pwm_disable(panel->backlight.pwm);
+   pwm_disable(pwm->dev);
 }
 
 void intel_panel_disable_backlight(struct intel_connector *connector)
@@ -1069,7 +1079,7 @@ static void pwm_enable_backlight(struct intel_connector 
*connector)
 {
struct intel_panel *panel = >panel;
 
-   pwm_enable(panel->backlight.pwm);
+   pwm_enable(panel->backlight.pwm->dev);
intel_panel_actually_set_backlight(connector, panel->backlight.level);
 }
 
@@ -1630,21 +1640,21 @@ static int pwm_setup_backlight(struct intel_connector 
*connector,
 {
struct drm_device *dev = connector->base.dev;
struct intel_panel *panel = >panel;
+   struct intel_pwm_info *pwm = _pwm_info;
int retval;
 
/* Get the PWM chip for backlight control */
-   panel->backlight.pwm = pwm_get(dev->dev, "pwm_backlight");
-   if (IS_ERR(panel->backlight.pwm)) {
-   DRM_ERROR("Failed to own the pwm chip\n");
+   pwm->dev = pwm_get(dev->dev, pwm->name);
+   if (IS_ERR(pwm->dev)) {
+   DRM_ERROR("Failed to own the pwm chip: %s\n", pwm->name);
panel->backlight.pwm = NULL;
return -ENODEV;
}
 
-   retval = pwm_config(panel->backlight.pwm, CRC_PMIC_PWM_PERIOD_NS,
-   CRC_PMIC_PWM_PERIOD_NS);
+   retval = pwm_config(pwm->dev, pwm->period_ns, pwm->period_ns);
if (retval < 0) {
-   DRM_ERROR("Failed to configure the 

[Intel-gfx] [PATCH 0/3] LPSS PWM support for devices that support it

2016-01-12 Thread Shobhit Kumar
Hi,
This is an untested attempt to enable LPSS PWM in the driver. As part
of this did some restructuring for encapsulating the pwm_info inside the
panel->backlight itself. This makes enabling LPSS PWM clean and simple.

Not sending yet to pwm mailing list as this is all untested. C.B. please
test the patches and see if they work at all for you. For testing Please
enable -

CONFIG_PWM_LPSS=y
CONFIG_PWM_LPSS_PLATFORM=y

Regards
Shobhit

Shobhit Kumar (3):
  drm/i915: Encapsulate the pwm_device in a pwm_info structure
  pwm: lpss: Add intel-gfx as consumer device in lookup table
  drm/i915: Add support for LPSS PWM on devices that support it

 drivers/gpu/drm/i915/intel_drv.h   |  8 +-
 drivers/gpu/drm/i915/intel_panel.c | 59 +++---
 drivers/pwm/pwm-lpss-platform.c|  8 ++
 3 files changed, 57 insertions(+), 18 deletions(-)

-- 
2.4.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/3] pwm: lpss: Add intel-gfx as consumer device in lookup table

2016-01-12 Thread Shobhit Kumar
Cc: cbroo...@gmail.com
Cc: jani.nik...@linux.intel.com
Signed-off-by: Shobhit Kumar 
---
 drivers/pwm/pwm-lpss-platform.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/pwm/pwm-lpss-platform.c b/drivers/pwm/pwm-lpss-platform.c
index 54433fc..910bc14 100644
--- a/drivers/pwm/pwm-lpss-platform.c
+++ b/drivers/pwm/pwm-lpss-platform.c
@@ -18,6 +18,11 @@
 
 #include "pwm-lpss.h"
 
+/* PWM consumed by the Intel GFX */
+static struct pwm_lookup lpss_pwm_lookup[] = {
+   PWM_LOOKUP("pwm-lpss", 0, ":00:02.0", "pwm_lpss", 0, 
PWM_POLARITY_NORMAL),
+};
+
 static int pwm_lpss_probe_platform(struct platform_device *pdev)
 {
const struct pwm_lpss_boardinfo *info;
@@ -38,6 +43,9 @@ static int pwm_lpss_probe_platform(struct platform_device 
*pdev)
 
platform_set_drvdata(pdev, lpwm);
 
+   /* Register intel-gfx device as allowed consumer */
+   pwm_add_table(lpss_pwm_lookup, ARRAY_SIZE(lpss_pwm_lookup));
+
pm_runtime_set_active(>dev);
pm_runtime_enable(>dev);
 
-- 
2.4.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Only complain about n_edp_entries with eDP ports

2016-01-12 Thread ville . syrjala
From: Ville Syrjälä 

commit 10afa0b65fe2 ("drm/i915: Reject >9 ddi translation entried if port != 
A/E on SKL")
added sanity checks to make sure we don't end up with too many ddi translation
values for eDP ports, but it actually failed to check if the port is eDP.
We still look up the edp translations for non-eDP ports, but don't use
them, so we shouldn't be complaining about them either.

Fixes: 10afa0b65fe2 ("drm/i915: Reject >9 ddi translation entried if port != 
A/E on SKL")
Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/intel_ddi.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index d348506f847c..1f9a3687b540 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -436,8 +436,9 @@ void intel_prepare_ddi_buffer(struct intel_encoder *encoder)
dev_priv->vbt.ddi_port_info[port].dp_boost_level)
iboost_bit = 1<<31;
 
-   if (WARN_ON(port != PORT_A &&
-   port != PORT_E && n_edp_entries > 9))
+   if (WARN_ON(encoder->type == INTEL_OUTPUT_EDP &&
+   port != PORT_A && port != PORT_E &&
+   n_edp_entries > 9))
n_edp_entries = 9;
} else if (IS_BROADWELL(dev_priv)) {
ddi_translations_fdi = bdw_ddi_translations_fdi;
-- 
2.4.10

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Handle PipeC fused off on HSW

2016-01-12 Thread Gabriel Feceoru



On 11.01.2016 19:56, Ville Syrjälä wrote:

On Mon, Dec 21, 2015 at 01:57:22PM +0200, Gabriel Feceoru wrote:

On some HSW boards all pipeC tests fail with various dmesg errors.
This seems to be caused by Pipe C beeing disabled in FUSE_STRAP and
thus reading back the PIPECONF register is always zero.

Fixed by adjusting pipe_count to 2 and thus the pipeC igt tests will
be skipped.

Signed-off-by: Gabriel Feceoru 
---
  drivers/gpu/drm/i915/i915_dma.c | 3 +++
  drivers/gpu/drm/i915/i915_reg.h | 1 +
  2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 988a380..130a496 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -813,6 +813,9 @@ static void intel_device_info_runtime_init(struct 
drm_device *dev)
 !(sfuse_strap & SFUSE_STRAP_FUSE_LOCK))) {
DRM_INFO("Display fused off, disabling\n");
info->num_pipes = 0;
+   } else if (I915_READ(FUSE_STRAP) & HSW_PIPE_C_DISABLE) {
+   DRM_INFO("PipeC fused off\n");
+   info->num_pipes = 2;
}
}

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 007ae83..0432a5f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5940,6 +5940,7 @@ enum skl_disp_power_wells {
  #define  ILK_INTERNAL_GRAPHICS_DISABLE(1 << 31)
  #define  ILK_INTERNAL_DISPLAY_DISABLE (1 << 30)
  #define  ILK_DISPLAY_DEBUG_DISABLE(1 << 29)
+#define  HSW_PIPE_C_DISABLE(1 << 28)


According to Bspec the bit is already present on IVB.
IVB and HSW are both Gen7. Are you suggesting it should be named 
IVB_PIPE_C_DISABLE instead?



  #define  ILK_HDCP_DISABLE (1 << 25)
  #define  ILK_eDP_A_DISABLE(1 << 24)
  #define  HSW_CDCLK_LIMIT  (1 << 24)
--
1.9.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✓ success: Fi.CI.BAT

2016-01-12 Thread Patchwork
== Summary ==

Built on 37f6c2ae666fbba9eff4355115252b8b0fd43050 drm-intel-nightly: 
2016y-01m-12d-14h-25m-44s UTC integration manifest

Test gem_storedw_loop:
Subgroup basic-render:
dmesg-warn -> PASS   (bdw-nuci7)
Test kms_pipe_crc_basic:
Subgroup read-crc-pipe-a-frame-sequence:
dmesg-warn -> PASS   (byt-nuc)
Subgroup read-crc-pipe-c:
dmesg-warn -> PASS   (bdw-ultra)

bdw-nuci7total:138  pass:129  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultratotal:138  pass:132  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2total:141  pass:115  dwarn:2   dfail:0   fail:0   skip:24 
byt-nuc  total:141  pass:123  dwarn:3   dfail:0   fail:0   skip:15 
hsw-brixbox  total:141  pass:134  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2  total:141  pass:137  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p  total:141  pass:101  dwarn:3   dfail:0   fail:0   skip:37 
ivb-t430stotal:135  pass:122  dwarn:3   dfail:4   fail:0   skip:6  
skl-i5k-2total:141  pass:108  dwarn:25  dfail:0   fail:0   skip:8  
skl-i7k-2total:141  pass:107  dwarn:26  dfail:0   fail:0   skip:8  
snb-dellxps  total:141  pass:122  dwarn:5   dfail:0   fail:0   skip:14 
snb-x220ttotal:141  pass:122  dwarn:5   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1150/

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


  1   2   >