from:"Jesse Barnes"

Re: [Intel-gfx] Computation of return value being discarded in get_cpu_power() in drivers/platform/x86/intel_ips.c

2021-06-10 Thread Jesse Barnes

Arg html email sorry.  Resending plain text:

It may be ok to drop this driver entirely now too; I doubt anyone is
relying on GPU turbo in Ironlake for anything critical anymore.  That
would allow for some simplifications in i915 too if it's still
supported.


On Thu, Jun 10, 2021 at 4:56 AM Joonas Lahtinen wrote:
>
> (Address for Hans was corrupt in previous message, which confused my mail
> client. Sorry for duplicate message, the other is without From: field).
>
> + Jesse
>
> Quoting Colin Ian King (2021-06-09 14:50:07)
> > Hi,
> >
> > I was reviewing some old unassigned variable warnings from static
> > analysis by Coverity and found an issue introduced with the following
> > commit:
> >
> > commit aa7ffc01d254c91a36bf854d57a14049c6134c72
> > Author: Jesse Barnes 
> > Date:   Fri May 14 15:41:14 2010 -0700
> >
> > x86 platform driver: intelligent power sharing driver
> >
> > The analysis is as follows:
> >
> > drivers/platform/x86/intel_ips.c
> >
> >  871 static u32 get_cpu_power(struct ips_driver *ips, u32 *last, int period)
> >  872 {
> >  873         u32 val;
> >  874         u32 ret;
> >  875
> >  876         /*
> >  877          * CEC is in joules/65535.  Take difference over time to
> >  878          * get watts.
> >  879          */
> >  880         val = thm_readl(THM_CEC);
> >  881
> >  882         /* period is in ms and we want mW */
> >  883         ret = (((val - *last) * 1000) / period);
> >
> > Unused value (UNUSED_VALUE)
> > assigned_value:  Assigning value from ret * 1000U / 65535U to ret here,
> > but that stored value is not used.
> >
> >  884         ret = (ret * 1000) / 65535;
> >  885         *last = val;
> >  886
> >  887         return 0;
> >  888 }
> >
> > I'm really not sure why ret is being calculated on lines 883,884 and not
> > being used. Should that be *last = ret on line 885? Looks suspect anyhow.
>
> According to git blame code seems to have been disabled intentionally by the
> following commit:
>
> commit 96f3823f537088c13735cfdfbf284436c802352a
> Author: Jesse Barnes 
> Date:   Tue Oct 5 14:50:59 2010 -0400
>
> [PATCH 2/2] IPS driver: disable CPU turbo
>
> The undocumented interface we're using for reading CPU power seems to be
> overreporting power.  Until we figure out how to correct it, disable CPU
> turbo and power reporting to be safe.  This will keep the CPU within default
> limits and still allow us to increase GPU frequency as needed.
>
> Maybe wrap the code after thm_readl() in #if 0 in case somebody ends up
> wanting to fix it? Or eliminate completely.
>
> In theory the thm_readl() may affect the system behavior so would not
> remove that for extra paranoia.
>
> Regards, Joonas
>
> > Colin
> >
> >
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
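
As an illustration of the arithmetic discussed in this thread, the conversion
that the unused code in get_cpu_power() appears to attempt can be written as a
standalone C function.  This is only a sketch under assumptions: the helper
name cec_delta_to_mw() and its parameters are hypothetical and not part of
intel_ips.c, the counter unit (joules/65535) is taken from the quoted driver
comment, and the 64-bit intermediate is an addition made here so that large
deltas do not overflow the way a 32-bit (val - *last) * 1000 could.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of intel_ips.c: convert the change in an
 * energy counter (one count = 1/65535 joule, per the quoted driver comment)
 * observed over period_ms milliseconds into milliwatts.
 */
static uint64_t cec_delta_to_mw(uint32_t last, uint32_t now, uint32_t period_ms)
{
	uint64_t delta;

	assert(period_ms > 0);

	/* Unsigned 32-bit subtraction tolerates one counter wraparound. */
	delta = (uint32_t)(now - last);

	/*
	 * counts / 65535        -> joules
	 * joules / (ms / 1000)  -> watts
	 * watts * 1000          -> milliwatts
	 */
	return (delta * 1000u * 1000u) / ((uint64_t)period_ms * 65535u);
}
```

With these assumptions, a delta of 65535 counts over 1000 ms is one joule per
second, so the function yields 1000 mW.  Whether the hardware counter really
reports in these units is exactly what the 2010 commit message calls into
question.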

Re: [Intel-gfx] [PATCH v4 1/2] drm/i915/vbt: Parse panel options separately from timing data

2019-11-14 Thread Jesse Barnes

LGTM.

Reviewed-by: Jesse Barnes 

On Thu, Nov 14, 2019 at 9:07 AM Matt Roper  wrote:
>
> Newer VBT versions will add an alternate way to read panel DTD
> information, so let's split parsing of the general panel information
> from the timing data in preparation.
>
> Cc: Jani Nikula 
> Signed-off-by: Matt Roper 
> ---
>  drivers/gpu/drm/i915/display/intel_bios.c | 27 +++
>  1 file changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
> index 6d7b1a83cb07..d13ce0b7db8b 100644
> --- a/drivers/gpu/drm/i915/display/intel_bios.c
> +++ b/drivers/gpu/drm/i915/display/intel_bios.c
> @@ -208,17 +208,12 @@ get_lvds_fp_timing(const struct bdb_header *bdb,
> return (const struct lvds_fp_timing *)((const u8 *)bdb + ofs);
>  }
>
> -/* Try to find integrated panel data */
> +/* Parse general panel options */
>  static void
> -parse_lfp_panel_data(struct drm_i915_private *dev_priv,
> -const struct bdb_header *bdb)
> +parse_panel_options(struct drm_i915_private *dev_priv,
> +   const struct bdb_header *bdb)
>  {
> const struct bdb_lvds_options *lvds_options;
> -   const struct bdb_lvds_lfp_data *lvds_lfp_data;
> -   const struct bdb_lvds_lfp_data_ptrs *lvds_lfp_data_ptrs;
> -   const struct lvds_dvo_timing *panel_dvo_timing;
> -   const struct lvds_fp_timing *fp_timing;
> -   struct drm_display_mode *panel_fixed_mode;
> int panel_type;
> int drrs_mode;
> int ret;
> @@ -267,6 +262,19 @@ parse_lfp_panel_data(struct drm_i915_private *dev_priv,
> DRM_DEBUG_KMS("DRRS not supported (VBT input)\n");
> break;
> }
> +}
> +
> +/* Try to find integrated panel timing data */
> +static void
> +parse_lfp_panel_dtd(struct drm_i915_private *dev_priv,
> +   const struct bdb_header *bdb)
> +{
> +   const struct bdb_lvds_lfp_data *lvds_lfp_data;
> +   const struct bdb_lvds_lfp_data_ptrs *lvds_lfp_data_ptrs;
> +   const struct lvds_dvo_timing *panel_dvo_timing;
> +   const struct lvds_fp_timing *fp_timing;
> +   struct drm_display_mode *panel_fixed_mode;
> +   int panel_type = dev_priv->vbt.panel_type;
>
> lvds_lfp_data = find_section(bdb, BDB_LVDS_LFP_DATA);
> if (!lvds_lfp_data)
> @@ -1868,7 +1876,8 @@ void intel_bios_init(struct drm_i915_private *dev_priv)
> /* Grab useful general definitions */
> parse_general_features(dev_priv, bdb);
> parse_general_definitions(dev_priv, bdb);
> -   parse_lfp_panel_data(dev_priv, bdb);
> +   parse_panel_options(dev_priv, bdb);
> +   parse_lfp_panel_dtd(dev_priv, bdb);
> parse_lfp_backlight(dev_priv, bdb);
> parse_sdvo_panel_data(dev_priv, bdb);
> parse_driver_features(dev_priv, bdb);
> --
> 2.21.0
>

Re: [Intel-gfx] [RFC PATCH 2/3] drm/i915: IOMMU based SVM implementation v16

2017-01-12 Thread Jesse Barnes

On Jan 12, 2017 8:04 AM, "Chris Wilson"  wrote:

> On Thu, Jan 12, 2017 at 05:48:49PM +0200, Mika Kuoppala wrote:
> > Chris Wilson  writes:
> >
> > > On Mon, Jan 09, 2017 at 06:52:53PM +0200, Mika Kuoppala wrote:
> > >> +static int i915_gem_context_enable_svm(struct i915_gem_context *ctx)
> > >> +{
> > >> +  int ret;
> > >> +
> > >> +  if (!HAS_SVM(ctx->i915))
> > >> +  return -ENODEV;
> > >
> > > How does legacy execbuf work with an svm context? It will write the
> > > ppgtt, but those are no longer read by the GPU. So it will generate
> > > faults at random addresses. Am I right in thinking we need to EINVAL if
> > > using execbuf + context_is_svm?
> >
> > Yes without further experiments, it is best to block the legacy path
> > with -EINVAL. I will add this.
> >
> > I guess with some tweaking the legacy interface could be made to work,
> > but it would need is_svm_context() checks in rather many places
> > in the execbuffer path to avoid relocations/pins.
>
> Hmm. right. Basically we have to ignore all objects if svm. Basically we
> strip off everything (having to EINVAL if passed in relocs etc) and more
> or less call exec_svm. The advantage is that it keeps the request tracking
> of the objects correct, but it can only work with softpinning of the
> objects at their cpu addresses. (I don't propose we map the object in
> the cpu table at the ppgtt offset!)
>
> Anyway the decision has to be made upfront whether we want to support
> the frankenapi.

FWIW I think the Beignet team wanted this functionality to make their
implementation easier. It's a bit more invasive but might be worth it for
userspace to make their transition easier.

Jesse

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] [PATCH RFC 3/4] drm/i915: add SVM execbuf ioctl v10

2016-08-17 Thread Jesse Barnes

On Wed, 2016-08-17 at 12:37 +0300, Joonas Lahtinen wrote:
> On ma, 2016-08-15 at 09:26 -0700, Jesse Barnes wrote:
> > 
> > On Mon, 2016-08-15 at 15:34 +0300, Mika Kuoppala wrote:
> > > 
> > > 
> > > No idea yet why we would need to limit for rcs only.
> > > 
> > I went back and forth; I think I did test on the BLT ring and maybe one
> > of the video rings and things worked on at least one platform.  But I'm
> > still worried about bugs...
> > 
> 
> Any other reason for worrying other than lack of testing?
> 
> I'm pretty sure programs using OpenCL (Beignet) will want this on other
> rings too for zero-copy video processing as an example.
> 

No, not that I remember; and KBL is probably totally fine here.  Just
run some tests and turn it on, then hope for the best. :)

Jesse

Re: [Intel-gfx] [PATCH RFC 3/4] drm/i915: add SVM execbuf ioctl v10

2016-08-16 Thread Jesse Barnes

On Mon, 2016-08-15 at 15:34 +0300, Mika Kuoppala wrote:
> Chris Wilson  writes:
> 
> > 
> > On Mon, Aug 15, 2016 at 02:48:06PM +0300, Mika Kuoppala wrote:
> > > 
> > > From: Jesse Barnes 
> > > 
> > > We just need to pass in an address to execute and some flags, since we
> > > don't have to worry about buffer relocation or any of the other usual
> > > stuff.  Returns a fence to be used for synchronization.
> > > 
> > > v2: add a request after batch submission (Jesse)
> > > v3: add a flag for fence creation (Chris)
> > > v4: add CLOEXEC flag (Kristian)
> > > add non-RCS ring support (Jesse)
> > > v5: update for request alloc change (Jesse)
> > > v6: new sync file interface, error paths, request breadcrumbs
> > > v7: always CLOEXEC for sync_file_install
> > > v8: rebase on new sync file api
> > > v9: rework on top of fence requests and sync_file
> > > v10: take fence ref for sync_file (Chris)
> > >  use correct flush (Chris)
> > >  limit exec on rcs
> > 
> > This is incomplete, so just proof of principle?
> 
> At some point of rebasing I noticed that Jesse did limit
> everything on rcs. So I just put it back.
> 
> No idea yet why we would need to limit for rcs only.
> 

I went back and forth; I think I did test on the BLT ring and maybe one
of the video rings and things worked on at least one platform.  But I'm
still worried about bugs...

Jesse

Re: [Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl

2016-08-16 Thread Jesse Barnes

On Mon, 2016-08-15 at 13:56 +0100, Chris Wilson wrote:
> On Mon, Aug 15, 2016 at 03:25:43PM +0300, Mika Kuoppala wrote:
> > 
> > Chris Wilson  writes:
> > 
> > > 
> > > On Mon, Aug 15, 2016 at 02:48:04PM +0300, Mika Kuoppala wrote:
> > > > 
> > > > From: Jesse Barnes 
> > > > 
> > > > Add i915_gem_context_create2_ioctl for passing flags
> > > > (e.g. SVM) when creating a context.
> > > > 
> > > > v2: check the pad on create_context
> > > > v3: rebase
> > > > v4: i915_dma is no more. create_gvt needs flags
> > > > 
> > > > Cc: Daniel Vetter 
> > > > Cc: Chris Wilson 
> > > > Cc: Joonas Lahtinen 
> > > > Signed-off-by: Jesse Barnes  (v1)
> > > > Signed-off-by: Mika Kuoppala 
> > > 
> > > Considering we can use deferred ppgtt creation and have setparam do we
> > > need a new create ioctl just to set a flag?
> > 
> > So like this:
> > 
> > - create ctx with the default create ioctl
> > - set ctx param to make it svm capable.
> > - first submit deferred creates
> > 
> > And we use the setparam point for returning
> > error if svm context are not there.
> 
> (and a call to set svm on a context after first use is illegal)
> 
> That's the outline I had in my head. I am not sure if the result is
> cleaner - I just hope it is ;)
> 


I opted against that initially because creating the tables and setup
after the fact for the process actually seemed messier.  Plus I thought
we'd want more flags at context create later anyway...

Not that my opinion matters anymore. :)

Jesse
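
The create-then-setparam outline quoted above can be sketched as a small state
check in standalone C.  This is an illustration only, not i915 code: struct
ctx, ctx_set_svm() and ctx_first_submit() are hypothetical names, -ENODEV for
missing hardware support mirrors the HAS_SVM() check in the related SVM patch,
and -EINVAL for a late request is an assumption about how "illegal" might be
reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical context state; the real context structure is far larger. */
struct ctx {
	bool svm;   /* SVM was requested through the setparam-style call */
	bool used;  /* first submission happened, address space now exists */
};

/* Setparam-style request: only legal before the context is first used. */
static int ctx_set_svm(struct ctx *c, bool hw_has_svm)
{
	if (!hw_has_svm)
		return -ENODEV;  /* the point where "no SVM here" is reported */
	if (c->used)
		return -EINVAL;  /* too late: tables were already created */
	c->svm = true;
	return 0;
}

/* First submit: deferred creation would pick SVM or ppgtt setup here. */
static void ctx_first_submit(struct ctx *c)
{
	c->used = true;
}
```

The trade-off raised in the reply still applies: with this ordering the
address-space setup is postponed until first use, whereas a flag passed at
creation time lets it happen up front.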

Re: [Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915: add another virtual PCH bridge for passthrough support

2016-03-19 Thread Jesse Barnes

On 03/17/2016 05:31 AM, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: add another virtual PCH bridge for passthrough support
> URL   : https://patchwork.freedesktop.org/series/4539/
> State : warning
> 
> == Summary ==
> 
> Series 4539v1 drm/i915: add another virtual PCH bridge for passthrough support
> http://patchwork.freedesktop.org/api/1.0/series/4539/revisions/1/mbox/
> 
> Test drv_module_reload_basic:
>                 dmesg-warn -> PASS       (hsw-gt2)
> Test gem_ringfill:
>         Subgroup basic-default-s3:
>                 dmesg-warn -> PASS       (bsw-nuc-2)
> Test kms_flip:
>         Subgroup basic-flip-vs-dpms:
>                 dmesg-warn -> PASS       (bdw-ultra)
>         Subgroup basic-flip-vs-wf_vblank:
>                 fail       -> PASS       (snb-x220t)
>         Subgroup basic-plain-flip:
>                 pass       -> DMESG-WARN (hsw-gt2)
>                 dmesg-warn -> PASS       (hsw-brixbox)
> Test kms_pipe_crc_basic:
>         Subgroup read-crc-pipe-b-frame-sequence:
>                 dmesg-warn -> PASS       (snb-x220t)
>         Subgroup read-crc-pipe-c-frame-sequence:
>                 dmesg-warn -> PASS       (hsw-gt2)
> Test pm_rpm:
>         Subgroup basic-pci-d3-state:
>                 fail       -> DMESG-FAIL (snb-x220t)
>                 pass       -> DMESG-WARN (hsw-brixbox)
>         Subgroup basic-rte:
>                 dmesg-warn -> PASS       (hsw-brixbox)
> 
> bdw-nuci7    total:194  pass:182  dwarn:0   dfail:0   fail:0   skip:12
> bdw-ultra    total:194  pass:173  dwarn:0   dfail:0   fail:0   skip:21
> bsw-nuc-2    total:194  pass:157  dwarn:0   dfail:0   fail:0   skip:37
> byt-nuc      total:194  pass:155  dwarn:4   dfail:0   fail:0   skip:35
> hsw-brixbox  total:194  pass:171  dwarn:1   dfail:0   fail:0   skip:22
> hsw-gt2      total:194  pass:176  dwarn:1   dfail:0   fail:0   skip:17
> ivb-t430s    total:194  pass:169  dwarn:0   dfail:0   fail:0   skip:25
> skl-i5k-2    total:194  pass:171  dwarn:0   dfail:0   fail:0   skip:23
> skl-i7k-2    total:194  pass:171  dwarn:0   dfail:0   fail:0   skip:23
> skl-nuci5    total:194  pass:183  dwarn:0   dfail:0   fail:0   skip:11
> snb-x220t    total:194  pass:160  dwarn:0   dfail:1   fail:0   skip:33

Well, somehow this trivial patch, which only affects passthrough environments,
fixed a bunch of issues but broke D3. :)

I think all of these are still intermittent issues, so this patch is
still ok for merge.

Jesse


[Intel-gfx] [PATCH] drm/i915: add another virtual PCH bridge for passthrough support

2016-03-19 Thread Jesse Barnes

Some configs use the P2X type but some use a P3X type PCH, so add that
to the detect_pch function so things work correctly.

Signed-off-by: Jesse Barnes 
---
 drivers/gpu/drm/i915/i915_drv.c | 1 +
 drivers/gpu/drm/i915/i915_drv.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 20e8200..6ad4390 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -504,6 +504,7 @@ void intel_detect_pch(struct drm_device *dev)
WARN_ON(!IS_SKYLAKE(dev) &&
!IS_KABYLAKE(dev));
} else if ((id == INTEL_PCH_P2X_DEVICE_ID_TYPE) ||
+  (id == INTEL_PCH_P3X_DEVICE_ID_TYPE) ||
   ((id == INTEL_PCH_QEMU_DEVICE_ID_TYPE) &&
pch->subsystem_vendor == 0x1af4 &&
pch->subsystem_device == 0x1100)) {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9e76bfc..e53cd42 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2695,6 +2695,7 @@ struct drm_i915_cmd_table {
 #define INTEL_PCH_SPT_DEVICE_ID_TYPE   0xA100
 #define INTEL_PCH_SPT_LP_DEVICE_ID_TYPE 0x9D00
 #define INTEL_PCH_P2X_DEVICE_ID_TYPE   0x7100
+#define INTEL_PCH_P3X_DEVICE_ID_TYPE   0x7000
 #define INTEL_PCH_QEMU_DEVICE_ID_TYPE  0x2900 /* qemu q35 has 2918 */
 
 #define INTEL_PCH_TYPE(dev) (__I915__(dev)->pch_type)
-- 
1.9.1
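
For readers following along without the tree at hand, the check this patch
extends can be sketched as a standalone predicate.  This is an illustration
under assumptions rather than the driver code: is_virtual_pch() and the
shortened macro names are hypothetical, and the 0xff00 mask is assumed here
because the QEMU comment ("q35 has 2918") implies that only the high byte of
the device ID selects the type.  The ID values and the subsystem
vendor/device pair are the ones shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCH_DEVICE_ID_MASK  0xff00  /* assumption: high byte selects the type */
#define PCH_P2X_ID_TYPE     0x7100
#define PCH_P3X_ID_TYPE     0x7000  /* the type added by this patch */
#define PCH_QEMU_ID_TYPE    0x2900  /* qemu q35 has 2918 */

/*
 * Hypothetical predicate: does this PCI bridge look like one of the
 * virtual PCH variants seen in passthrough setups?
 */
static bool is_virtual_pch(uint16_t device, uint16_t sub_vendor,
			   uint16_t sub_device)
{
	uint16_t id = device & PCH_DEVICE_ID_MASK;

	return id == PCH_P2X_ID_TYPE ||
	       id == PCH_P3X_ID_TYPE ||
	       (id == PCH_QEMU_ID_TYPE &&
		sub_vendor == 0x1af4 && sub_device == 0x1100);
}
```

Under the masking assumption, any device ID in the 0x70xx range now matches,
and the QEMU bridge is still accepted only when the subsystem IDs also match.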


Re: [Intel-gfx] [PATCH v5 26/35] drm/i915: Added debugfs interface to scheduler tuning parameters

2016-03-11 Thread Jesse Barnes

On 03/11/2016 08:28 AM, John Harrison wrote:
> On 23/02/2016 21:06, Jesse Barnes wrote:
>> On 02/18/2016 06:27 AM, john.c.harri...@intel.com wrote:
>>> From: John Harrison 
>>>   struct drm_info_node *node = m->private;
>>> @@ -5424,6 +5587,12 @@ static const struct i915_debugfs_files {
>>>   {"i915_gem_drop_caches", &i915_drop_caches_fops},
>>>   {"i915_error_state", &i915_error_state_fops},
>>>   {"i915_next_seqno", &i915_next_seqno_fops},
>>> +{"i915_scheduler_priority_min", &i915_scheduler_priority_min_fops},
>>> +{"i915_scheduler_priority_max", &i915_scheduler_priority_max_fops},
>>> +{"i915_scheduler_priority_bump", &i915_scheduler_priority_bump_fops},
>>> +{"i915_scheduler_priority_preempt", &i915_scheduler_priority_preempt_fops},
>>> +{"i915_scheduler_min_flying", &i915_scheduler_min_flying_fops},
>>> +{"i915_scheduler_file_queue_max", &i915_scheduler_file_queue_max_fops},
>>>   {"i915_display_crc_ctl", &i915_display_crc_ctl_fops},
>>>   {"i915_pri_wm_latency", &i915_pri_wm_latency_fops},
>>>   {"i915_spr_wm_latency", &i915_spr_wm_latency_fops},
>>>
>> Do these need to be serialized at all?  I guess some raciness doesn't
>> hurt too much for these guys, unless somehow an inconsistent set of
>> values would cause a livelock in the scheduler or something.
> Serialised with what? Each other or the scheduler operation? Neither
> should be necessary. The scheduler will read the current values whenever
> it tests against one of these limits. If multiple are being changed
> while the system is busy, it doesn't really matter. They are just tuning
> values and best guesses type of numbers not array indices or other
> things that would cause kernel panics if you got them wrong. E.g. if you
> set the max file queue depth to smaller than the current queue contents
> that just means you won't be able to submit more stuff until the queue
> has drained - which is presumably the intended result of lowering the
> max queue value anyway. The queue won't leak the extra entries or get
> into an inconsistent state.

Yeah I meant both serialized against scheduler accesses/usage and
atomicity with respect to one another.  Sounds like it doesn't matter
too much though, so it's fine with me.

Jesse
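
The argument in the quoted reply, that a limit sampled at test time tolerates
being changed underneath a busy queue, can be sketched in standalone C.  This
is an illustration only: struct file_queue, struct tunables and the three
helpers are hypothetical stand-ins, not the scheduler's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-file state: how many entries this file has queued. */
struct file_queue {
	uint32_t queue_length;
};

/* Hypothetical tunable that a debugfs write may change at any time. */
struct tunables {
	uint32_t file_queue_max;
};

/* The limit is read at the moment of the test; nothing is cached. */
static bool file_queue_is_full(const struct file_queue *fq,
			       const struct tunables *t)
{
	return fq->queue_length >= t->file_queue_max;
}

/* Queue one more entry, refusing while at or over the current limit. */
static bool file_queue_submit(struct file_queue *fq, const struct tunables *t)
{
	if (file_queue_is_full(fq, t))
		return false;
	fq->queue_length++;
	return true;
}

/* Retire one entry; this works even if the limit was lowered meanwhile. */
static void file_queue_retire(struct file_queue *fq)
{
	assert(fq->queue_length > 0);
	fq->queue_length--;
}
```

Lowering file_queue_max below the current queue_length in this model only
makes submissions fail until enough entries retire; no entry is lost and the
count stays consistent, which matches the behaviour described above.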


Re: [Intel-gfx] [PATCH v5 24/35] drm/i915: Added trace points to scheduler

2016-02-26 Thread Jesse Barnes

On 02/26/2016 07:55 AM, John Harrison wrote:
> On 23/02/2016 20:42, Jesse Barnes wrote:
>> On 02/18/2016 06:27 AM, john.c.harri...@intel.com wrote:
>>> From: John Harrison 
>>>
>>> Added trace points to the scheduler to track all the various events,
>>> node state transitions and other interesting things that occur.
>>>
>>> v2: Updated for new request completion tracking implementation.
>>>
>>> v3: Updated for changes to node kill code.
>>>
>>> v4: Wrapped some long lines to keep the style checker happy.
>>>
>>> For: VIZ-1587
>>> Signed-off-by: John Harrison 
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +
>>>   drivers/gpu/drm/i915/i915_scheduler.c  |  26 
>>>   drivers/gpu/drm/i915/i915_trace.h  | 196 
>>> +
>>>   drivers/gpu/drm/i915/intel_lrc.c   |   2 +
>>>   4 files changed, 226 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
>>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> index b9ad0fd..d4de8c7 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> @@ -1272,6 +1272,8 @@ i915_gem_ringbuffer_submission(struct 
>>> i915_execbuffer_params *params,
>>> i915_gem_execbuffer_move_to_active(vmas, params->request);
>>>   +trace_i915_gem_ring_queue(ring, params);
>>> +
>>>   qe = container_of(params, typeof(*qe), params);
>>>   ret = i915_scheduler_queue_execbuffer(qe);
>>>   if (ret)
>>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
>>> b/drivers/gpu/drm/i915/i915_scheduler.c
>>> index 47d7de4..e56ce08 100644
>>> --- a/drivers/gpu/drm/i915/i915_scheduler.c
>>> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
>>> @@ -88,6 +88,8 @@ static void i915_scheduler_node_requeue(struct 
>>> i915_scheduler_queue_entry *node)
>>>   /* Seqno will be reassigned on relaunch */
>>>   node->params.request->seqno = 0;
>>>   node->status = i915_sqs_queued;
>>> +trace_i915_scheduler_unfly(node->params.ring, node);
>>> +trace_i915_scheduler_node_state_change(node->params.ring, node);
>>>   }
>>> /*
>>> @@ -99,7 +101,11 @@ static void i915_scheduler_node_kill(struct 
>>> i915_scheduler_queue_entry *node)
>>>   WARN_ON(!node);
>>>   WARN_ON(I915_SQS_IS_COMPLETE(node));
>>>   +if (I915_SQS_IS_FLYING(node))
>>> +trace_i915_scheduler_unfly(node->params.ring, node);
>>> +
>>>   node->status = i915_sqs_dead;
>>> +trace_i915_scheduler_node_state_change(node->params.ring, node);
>>>   }
>>> /* Mark a node as in flight on the hardware. */
>>> @@ -124,6 +130,9 @@ static int i915_scheduler_node_fly(struct 
>>> i915_scheduler_queue_entry *node)
>>> node->status = i915_sqs_flying;
>>>   +trace_i915_scheduler_fly(ring, node);
>>> +trace_i915_scheduler_node_state_change(ring, node);
>>> +
>>>   if (!(scheduler->flags[ring->id] & i915_sf_interrupts_enabled)) {
>>>   bool success = true;
>>>   @@ -280,6 +289,8 @@ static int 
>>> i915_scheduler_pop_from_queue_locked(struct intel_engine_cs *ring,
>>>   INIT_LIST_HEAD(&best->link);
>>>   best->status  = i915_sqs_popped;
>>>   +trace_i915_scheduler_node_state_change(ring, best);
>>> +
>>>   ret = 0;
>>>   } else {
>>>   /* Can only get here if:
>>> @@ -297,6 +308,8 @@ static int i915_scheduler_pop_from_queue_locked(struct 
>>> intel_engine_cs *ring,
>>>   }
>>>   }
>>>   +trace_i915_scheduler_pop_from_queue(ring, best);
>>> +
>>>   *pop_node = best;
>>>   return ret;
>>>   }
>>> @@ -506,6 +519,8 @@ static int 
>>> i915_scheduler_queue_execbuffer_bypass(struct i915_scheduler_queue_en
>>>   struct i915_scheduler *scheduler = dev_priv->scheduler;
>>>   int ret;
>>>   +trace_i915_scheduler_queue(qe->params.ring, qe);
>>> +
>>>   intel_ring_reserved_space_cancel(qe->params.request->ringbuf);
>>> scheduler->flags[qe->params.ring->id] |= i915_sf_submitting;
>>> @@ -628,6 +643,9 @@ int i915_scheduler_queue_execbuffer(struct 
>>>

Re: [Intel-gfx] [PATCH v5 26/35] drm/i915: Added debugfs interface to scheduler tuning parameters

2016-02-23 Thread Jesse Barnes

 +
> + scheduler->priority_level_preempt = (u32) val;
> + return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(i915_scheduler_priority_preempt_fops,
> + i915_scheduler_priority_preempt_get,
> + i915_scheduler_priority_preempt_set,
> + "%lld\n");
> +
> +static int
> +i915_scheduler_min_flying_get(void *data, u64 *val)
> +{
> + struct drm_device   *dev   = data;
> + struct drm_i915_private *dev_priv  = dev->dev_private;
> + struct i915_scheduler   *scheduler = dev_priv->scheduler;
> +
> + *val = (u64) scheduler->min_flying;
> + return 0;
> +}
> +
> +static int
> +i915_scheduler_min_flying_set(void *data, u64 val)
> +{
> + struct drm_device   *dev   = data;
> + struct drm_i915_private *dev_priv  = dev->dev_private;
> + struct i915_scheduler   *scheduler = dev_priv->scheduler;
> +
> + scheduler->min_flying = (u32) val;
> + return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(i915_scheduler_min_flying_fops,
> + i915_scheduler_min_flying_get,
> + i915_scheduler_min_flying_set,
> + "%llu\n");
> +
> +static int
> +i915_scheduler_file_queue_max_get(void *data, u64 *val)
> +{
> + struct drm_device   *dev   = data;
> + struct drm_i915_private *dev_priv  = dev->dev_private;
> + struct i915_scheduler   *scheduler = dev_priv->scheduler;
> +
> + *val = (u64) scheduler->file_queue_max;
> + return 0;
> +}
> +
> +static int
> +i915_scheduler_file_queue_max_set(void *data, u64 val)
> +{
> + struct drm_device   *dev   = data;
> + struct drm_i915_private *dev_priv  = dev->dev_private;
> + struct i915_scheduler   *scheduler = dev_priv->scheduler;
> +
> + scheduler->file_queue_max = (u32) val;
> + return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(i915_scheduler_file_queue_max_fops,
> + i915_scheduler_file_queue_max_get,
> + i915_scheduler_file_queue_max_set,
> + "%llu\n");
> +
>  static int i915_frequency_info(struct seq_file *m, void *unused)
>  {
>   struct drm_info_node *node = m->private;
> @@ -5424,6 +5587,12 @@ static const struct i915_debugfs_files {
>   {"i915_gem_drop_caches", &i915_drop_caches_fops},
>   {"i915_error_state", &i915_error_state_fops},
>   {"i915_next_seqno", &i915_next_seqno_fops},
> + {"i915_scheduler_priority_min", &i915_scheduler_priority_min_fops},
> + {"i915_scheduler_priority_max", &i915_scheduler_priority_max_fops},
> + {"i915_scheduler_priority_bump", &i915_scheduler_priority_bump_fops},
> + {"i915_scheduler_priority_preempt", &i915_scheduler_priority_preempt_fops},
> + {"i915_scheduler_min_flying", &i915_scheduler_min_flying_fops},
> + {"i915_scheduler_file_queue_max", &i915_scheduler_file_queue_max_fops},
>   {"i915_display_crc_ctl", &i915_display_crc_ctl_fops},
>   {"i915_pri_wm_latency", &i915_pri_wm_latency_fops},
>   {"i915_spr_wm_latency", &i915_spr_wm_latency_fops},
> 

Do these need to be serialized at all?  I guess some raciness doesn't hurt too 
much for these guys, unless somehow an inconsistent set of values would cause a 
livelock in the scheduler or something.

If not,
Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v5 25/35] drm/i915: Added scheduler queue throttling by DRM file handle

2016-02-23 Thread Jesse Barnes

quest queue count.
> + * @file: File object to process.
> + */
> +static void i915_scheduler_file_queue_dec(struct drm_file *file)
> +{
> + struct drm_i915_file_private *file_priv = file->driver_priv;
> +
> + file_priv->scheduler_queue_length--;
> +}
> +
>  static void i915_generate_dependencies(struct i915_scheduler *scheduler,
>  struct i915_scheduler_queue_entry *node,
>  uint32_t ring)
> @@ -640,6 +679,8 @@ int i915_scheduler_queue_execbuffer(struct 
> i915_scheduler_queue_entry *qe)
>  
>   list_add_tail(&node->link, &scheduler->node_queue[ring->id]);
>  
> + i915_scheduler_file_queue_inc(node->params.file);
> +
>   not_flying = i915_scheduler_count_flying(scheduler, ring) <
>scheduler->min_flying;
>  
> @@ -883,6 +924,12 @@ static bool i915_scheduler_remove(struct i915_scheduler 
> *scheduler,
>   /* Strip the dependency info while the mutex is still locked */
>   i915_scheduler_remove_dependent(scheduler, node);
>  
> + /* Likewise clean up the file pointer. */
> + if (node->params.file) {
> + i915_scheduler_file_queue_dec(node->params.file);
> + node->params.file = NULL;
> + }
> +
>   continue;
>   }
>  
> @@ -1205,6 +1252,7 @@ int i915_scheduler_closefile(struct drm_device *dev, 
> struct drm_file *file)
>node->status,
>ring->name);
>  
> + i915_scheduler_file_queue_dec(node->params.file);
>   node->params.file = NULL;
>   }
>   }
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h 
> b/drivers/gpu/drm/i915/i915_scheduler.h
> index 075befb..b78de12 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -77,6 +77,7 @@ struct i915_scheduler {
>   int32_t priority_level_bump;
>   int32_t priority_level_preempt;
>   uint32_t min_flying;
> + uint32_t file_queue_max;
>  };
>  
>  /* Flag bits for i915_scheduler::flags */
> @@ -100,5 +101,6 @@ int i915_scheduler_flush_stamp(struct intel_engine_cs 
> *ring,
>  unsigned long stamp, bool is_locked);
>  bool i915_scheduler_is_request_tracked(struct drm_i915_gem_request *req,
>  bool *completed, bool *busy);
> +bool i915_scheduler_file_queue_is_full(struct drm_file *file);
>  
>  #endif  /* _I915_SCHEDULER_H_ */
> 

Just to clarify and make sure I understood the previous stuff: a queued execbuf
that has not yet been dispatched does not reserve and pin pages, right?  That
occurs at actual dispatch time?  If so, I guess clients will hit this 64 queued
item limit pretty regularly...  How much metadata overhead does that involve?
Has the limit been derived from some performance work with a bunch of workloads?
It's fine if not; I can imagine that different mixes of workloads would be
affected by lower or higher queue depths (e.g. small batch tests).

If this is tunable, I guess it should be clamped like a nice or rlimit value, 
with values outside that range requiring CAP_SYS_ADMIN.

Reviewed-by: Jesse Barnes 

Jesse
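
One way to read the rlimit-style suggestion above is sketched below in
standalone C.  It is an assumption-laden illustration, not a proposed patch:
set_file_queue_max() and the bounds are hypothetical (64 is taken from the
limit mentioned in the review), the privileged flag stands in for a
CAP_SYS_ADMIN check, and rejecting out-of-range values with -EPERM rather
than silently clamping them is just one possible interpretation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range that an unprivileged writer may choose from. */
#define QUEUE_MAX_UNPRIV_LO   1u
#define QUEUE_MAX_UNPRIV_HI  64u

/*
 * Hypothetical setter for the per-file queue limit.  "privileged" stands
 * in for whatever capability check the real interface would perform.
 */
static int set_file_queue_max(uint32_t *file_queue_max, uint64_t val,
			      bool privileged)
{
	/* The stored field is 32 bits wide; refuse values that do not fit. */
	if (val > UINT32_MAX)
		return -EINVAL;

	/* Outside the ordinary range, privilege is required. */
	if (!privileged &&
	    (val < QUEUE_MAX_UNPRIV_LO || val > QUEUE_MAX_UNPRIV_HI))
		return -EPERM;

	*file_queue_max = (uint32_t)val;
	return 0;
}
```

The range check also covers the silent truncation that a plain (u32) cast of
a 64-bit debugfs value would otherwise allow.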

Re: [Intel-gfx] [PATCH v5 24/35] drm/i915: Added trace points to scheduler

2016-02-23 Thread Jesse Barnes

On 02/18/2016 06:27 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> Added trace points to the scheduler to track all the various events,
> node state transitions and other interesting things that occur.
> 
> v2: Updated for new request completion tracking implementation.
> 
> v3: Updated for changes to node kill code.
> 
> v4: Wrapped some long lines to keep the style checker happy.
> 
> For: VIZ-1587
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +
>  drivers/gpu/drm/i915/i915_scheduler.c  |  26 
>  drivers/gpu/drm/i915/i915_trace.h  | 196 
> +
>  drivers/gpu/drm/i915/intel_lrc.c   |   2 +
>  4 files changed, 226 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index b9ad0fd..d4de8c7 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1272,6 +1272,8 @@ i915_gem_ringbuffer_submission(struct 
> i915_execbuffer_params *params,
>  
>   i915_gem_execbuffer_move_to_active(vmas, params->request);
>  
> + trace_i915_gem_ring_queue(ring, params);
> +
>   qe = container_of(params, typeof(*qe), params);
>   ret = i915_scheduler_queue_execbuffer(qe);
>   if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
> b/drivers/gpu/drm/i915/i915_scheduler.c
> index 47d7de4..e56ce08 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -88,6 +88,8 @@ static void i915_scheduler_node_requeue(struct 
> i915_scheduler_queue_entry *node)
>   /* Seqno will be reassigned on relaunch */
>   node->params.request->seqno = 0;
>   node->status = i915_sqs_queued;
> + trace_i915_scheduler_unfly(node->params.ring, node);
> + trace_i915_scheduler_node_state_change(node->params.ring, node);
>  }
>  
>  /*
> @@ -99,7 +101,11 @@ static void i915_scheduler_node_kill(struct 
> i915_scheduler_queue_entry *node)
>   WARN_ON(!node);
>   WARN_ON(I915_SQS_IS_COMPLETE(node));
>  
> + if (I915_SQS_IS_FLYING(node))
> + trace_i915_scheduler_unfly(node->params.ring, node);
> +
>   node->status = i915_sqs_dead;
> + trace_i915_scheduler_node_state_change(node->params.ring, node);
>  }
>  
>  /* Mark a node as in flight on the hardware. */
> @@ -124,6 +130,9 @@ static int i915_scheduler_node_fly(struct 
> i915_scheduler_queue_entry *node)
>  
>   node->status = i915_sqs_flying;
>  
> + trace_i915_scheduler_fly(ring, node);
> + trace_i915_scheduler_node_state_change(ring, node);
> +
>   if (!(scheduler->flags[ring->id] & i915_sf_interrupts_enabled)) {
>   bool success = true;
>  
> @@ -280,6 +289,8 @@ static int i915_scheduler_pop_from_queue_locked(struct 
> intel_engine_cs *ring,
>   INIT_LIST_HEAD(&best->link);
>   best->status  = i915_sqs_popped;
>  
> + trace_i915_scheduler_node_state_change(ring, best);
> +
>   ret = 0;
>   } else {
>   /* Can only get here if:
> @@ -297,6 +308,8 @@ static int i915_scheduler_pop_from_queue_locked(struct 
> intel_engine_cs *ring,
>   }
>   }
>  
> + trace_i915_scheduler_pop_from_queue(ring, best);
> +
>   *pop_node = best;
>   return ret;
>  }
> @@ -506,6 +519,8 @@ static int i915_scheduler_queue_execbuffer_bypass(struct 
> i915_scheduler_queue_en
>   struct i915_scheduler *scheduler = dev_priv->scheduler;
>   int ret;
>  
> + trace_i915_scheduler_queue(qe->params.ring, qe);
> +
>   intel_ring_reserved_space_cancel(qe->params.request->ringbuf);
>  
>   scheduler->flags[qe->params.ring->id] |= i915_sf_submitting;
> @@ -628,6 +643,9 @@ int i915_scheduler_queue_execbuffer(struct 
> i915_scheduler_queue_entry *qe)
>   not_flying = i915_scheduler_count_flying(scheduler, ring) <
>scheduler->min_flying;
>  
> + trace_i915_scheduler_queue(ring, node);
> + trace_i915_scheduler_node_state_change(ring, node);
> +
>   spin_unlock_irq(&scheduler->lock);
>  
>   if (not_flying)
> @@ -657,6 +675,8 @@ bool i915_scheduler_notify_request(struct 
> drm_i915_gem_request *req)
>   struct i915_scheduler_queue_entry *node = req->scheduler_qe;
>   unsigned long flags;
>  
> + trace_i915_scheduler_landing(req);
> +
>   if (!node)
>   return false;
>  
> @@ -670,6 +690,8 @@ bool i915_scheduler_notify_request(struct 
> drm_i915_gem_request *req)
>   else
>   node->status = i915_sqs_complete;
>  
> + trace_i915_scheduler_node_state_change(req->ring, node);
> +
>   spin_unlock_irqrestore(&scheduler->lock, flags);
>  
>   return true;
> @@ -877,6 +899,8 @@ static bool i915_scheduler_remove(struct i915_scheduler 
> *scheduler,
>   /* Launch more packets now? */
>   do_submit = (queued > 0) && (flying < scheduler-

Re: [Intel-gfx] [PATCH v5 24/35] drm/i915: Added trace points to scheduler

2016-02-23 Thread Jesse Barnes

_assign(
> +__entry->ring   = req->ring->id;
> +__entry->uniq   = req->uniq;
> +__entry->seqno  = req->seqno;
> +__entry->status = req->scheduler_qe ?
> + req->scheduler_qe->status : ~0U;
> +),
> +
> + TP_printk("ring=%d, uniq=%d, seqno=%d, status=%d",
> +   __entry->ring, __entry->uniq, __entry->seqno,
> +   __entry->status)
> +);
> +
> +TRACE_EVENT(i915_scheduler_remove,
> + TP_PROTO(struct intel_engine_cs *ring,
> +  u32 min_seqno, bool do_submit),
> + TP_ARGS(ring, min_seqno, do_submit),
> +
> + TP_STRUCT__entry(
> +  __field(u32, ring)
> +  __field(u32, min_seqno)
> +  __field(bool, do_submit)
> +  ),
> +
> + TP_fast_assign(
> +__entry->ring  = ring->id;
> +__entry->min_seqno = min_seqno;
> +__entry->do_submit = do_submit;
> +),
> +
> + TP_printk("ring=%d, min_seqno = %d, do_submit=%d",
> +   __entry->ring, __entry->min_seqno, __entry->do_submit)
> +);
> +
> +TRACE_EVENT(i915_scheduler_destroy,
> + TP_PROTO(struct intel_engine_cs *ring,
> +  struct i915_scheduler_queue_entry *node),
> + TP_ARGS(ring, node),
> +
> + TP_STRUCT__entry(
> +  __field(u32, ring)
> +  __field(u32, uniq)
> +  __field(u32, seqno)
> +  ),
> +
> + TP_fast_assign(
> +__entry->ring  = ring->id;
> +__entry->uniq  = node ? node->params.request->uniq  
> : 0;
> +__entry->seqno = node ? node->params.request->seqno 
> : 0;
> +),
> +
> + TP_printk("ring=%d, uniq=%d, seqno=%d",
> +   __entry->ring, __entry->uniq, __entry->seqno)
> +);
> +
> +TRACE_EVENT(i915_scheduler_pop_from_queue,
> + TP_PROTO(struct intel_engine_cs *ring,
> +  struct i915_scheduler_queue_entry *node),
> + TP_ARGS(ring, node),
> +
> + TP_STRUCT__entry(
> +  __field(u32, ring)
> +  __field(u32, uniq)
> +  __field(u32, seqno)
> +  ),
> +
> + TP_fast_assign(
> +__entry->ring  = ring->id;
> +__entry->uniq  = node ? node->params.request->uniq  
> : 0;
> +__entry->seqno = node ? node->params.request->seqno 
> : 0;
> +),
> +
> + TP_printk("ring=%d, uniq=%d, seqno=%d",
> +   __entry->ring, __entry->uniq, __entry->seqno)
> +);
> +
> +TRACE_EVENT(i915_scheduler_node_state_change,
> + TP_PROTO(struct intel_engine_cs *ring,
> +  struct i915_scheduler_queue_entry *node),
> + TP_ARGS(ring, node),
> +
> + TP_STRUCT__entry(
> +  __field(u32, ring)
> +  __field(u32, uniq)
> +  __field(u32, seqno)
> +  __field(u32, status)
> +  ),
> +
> + TP_fast_assign(
> +__entry->ring   = ring->id;
> +__entry->uniq   = node ? node->params.request->uniq  
> : 0;
> +__entry->seqno  = node->params.request->seqno;
> +__entry->status = node->status;
> +),
> +
> + TP_printk("ring=%d, uniq=%d, seqno=%d, status=%d",
> +   __entry->ring, __entry->uniq, __entry->seqno,
> +   __entry->status)
> +);
> +
> +TRACE_EVENT(i915_gem_ring_queue,
> + TP_PROTO(struct intel_engine_cs *ring,
> +  struct i915_execbuffer_params *params),
> + TP_ARGS(ring, params),
> +
> + TP_STRUCT__entry(
> +  __field(u32, ring)
> +  __field(u32, uniq)
> +  __field(u32, seqno)
> +  ),
> +
> + TP_fast_assign(
> +__entry->ring  = ring->id;
> +__entry->uniq  = params->request->uniq;
> +__entry->seqno = params->request->seqno;
> +),
> +
> + TP_printk("ring=%d, uniq=%d, seqno=%d", __entry->ring,
> +   __entry->uniq, __entry->seqno)
> +);
> +
>  #endif /* _I915_TRACE_H_ */
>  
>  /* This part must be outside protection */
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> b/drivers/gpu/drm/i915/intel_lrc.c
> index 9c7a79a..2b9f49c 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -954,6 +954,8 @@ int intel_execlists_submission(struct 
> i915_execbuffer_params *params,
>  
>   i915_gem_execbuffer_move_to_active(vmas, params->request);
>  
> + trace_i915_gem_ring_queue(ring, params);
> +
>   qe = container_of(params, typeof(*qe), params);
>   ret = i915_scheduler_queue_execbuffer(qe);
>   if (ret)
> 

Looks fine.

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v5 21/35] drm/i915: Added a module parameter to allow the scheduler to be disabled

2016-02-23 Thread Jesse Barnes

On 02/18/2016 06:27 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> It can be useful to be able to disable the GPU scheduler via a module
> parameter for debugging purposes.
> 
> v5: Converted from a multi-feature 'overrides' mask to a single
> 'enable' boolean. Further features (e.g. pre-emption) will now be
> separate 'enable' booleans added later. [Chris Wilson]
> 
> For: VIZ-1587
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/i915_params.c| 4 
>  drivers/gpu/drm/i915/i915_params.h| 1 +
>  drivers/gpu/drm/i915/i915_scheduler.c | 5 -
>  3 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_params.c 
> b/drivers/gpu/drm/i915/i915_params.c
> index d0eba58..0ef3159 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -57,6 +57,7 @@ struct i915_params i915 __read_mostly = {
>   .edp_vswing = 0,
>   .enable_guc_submission = true,
>   .guc_log_level = -1,
> + .enable_scheduler = 0,
>  };
>  
>  module_param_named(modeset, i915.modeset, int, 0400);
> @@ -203,3 +204,6 @@ MODULE_PARM_DESC(enable_guc_submission, "Enable GuC 
> submission (default:false)")
>  module_param_named(guc_log_level, i915.guc_log_level, int, 0400);
>  MODULE_PARM_DESC(guc_log_level,
>   "GuC firmware logging level (-1:disabled (default), 0-3:enabled)");
> +
> +module_param_named_unsafe(enable_scheduler, i915.enable_scheduler, int, 
> 0600);
> +MODULE_PARM_DESC(enable_scheduler, "Enable scheduler (0 = disable [default], 
> 1 = enable)");
> diff --git a/drivers/gpu/drm/i915/i915_params.h 
> b/drivers/gpu/drm/i915/i915_params.h
> index 5299290..f855c86 100644
> --- a/drivers/gpu/drm/i915/i915_params.h
> +++ b/drivers/gpu/drm/i915/i915_params.h
> @@ -60,6 +60,7 @@ struct i915_params {
>   bool enable_guc_submission;
>   bool verbose_state_checks;
>   bool nuclear_pageflip;
> + int enable_scheduler;
>  };
>  
>  extern struct i915_params i915 __read_mostly;
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
> b/drivers/gpu/drm/i915/i915_scheduler.c
> index 4f25bf2..47d7de4 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -34,6 +34,9 @@ bool i915_scheduler_is_enabled(struct drm_device *dev)
>  {
>   struct drm_i915_private *dev_priv = dev->dev_private;
>  
> + if (!i915.enable_scheduler)
> + return false;
> +
>   return dev_priv->scheduler != NULL;
>  }
>  
> @@ -548,7 +551,7 @@ int i915_scheduler_queue_execbuffer(struct 
> i915_scheduler_queue_entry *qe)
>  
>   WARN_ON(!scheduler);
>  
> - if (1/*!i915.enable_scheduler*/)
> + if (!i915.enable_scheduler)
>   return i915_scheduler_queue_execbuffer_bypass(qe);
>  
>   node = kmalloc(sizeof(*node), GFP_KERNEL);
> 

I did a double take here; maybe a comment along the lines of "if the scheduler 
is disabled, queue the buffer immediately" would help, and something similar 
for where the if (1) is added temporarily.

Doesn't matter too much though.
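
Something like this is what I have in mind for the comment.  It is only a 
standalone sketch: the enum and the function are made up for illustration, and 
the static int stands in for the i915.enable_scheduler module parameter.

```c
/* Stand-in for the i915.enable_scheduler module parameter. */
static int enable_scheduler;

/* Illustrative result type; not part of the patch. */
enum submit_path { SUBMIT_BYPASS, SUBMIT_QUEUED };

static enum submit_path queue_execbuffer(void)
{
	/*
	 * If the scheduler is disabled, queue the buffer immediately:
	 * skip node allocation and submit it straight away.
	 */
	if (!enable_scheduler)
		return SUBMIT_BYPASS;

	/* Otherwise the request is tracked and dispatched later. */
	return SUBMIT_QUEUED;
}
```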

Reviewed-by: Jesse Barnes 

Thanks,
Jesse

Re: [Intel-gfx] [PATCH v5 22/35] drm/i915: Support for 'unflushed' ring idle

2016-02-23 Thread Jesse Barnes

buffer.h
> index ada93a9..cca476f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -478,7 +478,7 @@ void intel_ring_update_space(struct intel_ringbuffer 
> *ringbuf);
>  int intel_ring_space(struct intel_ringbuffer *ringbuf);
>  bool intel_ring_stopped(struct intel_engine_cs *ring);
>  
> -int __must_check intel_ring_idle(struct intel_engine_cs *ring);
> +int __must_check intel_ring_idle(struct intel_engine_cs *ring, bool flush);
>  void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno);
>  int intel_ring_flush_all_caches(struct drm_i915_gem_request *req);
>  int intel_ring_invalidate_all_caches(struct drm_i915_gem_request *req);
> 

Maybe Chris remembers the history here; I think the wraparound idle goes all 
the way back to Eric's original work with wrapping (something we had a lot of 
trouble with early on iirc).

My only suggestion here is to add wrappers for a new __intel_ring_idle(ring, 
bool), one for the existing usage of intel_ring_idle(ring) (which would pass 
false) and a new intel_ring_flush(ring) that passes true, along with some kdoc 
explaining the difference.  Otherwise everyone will have to look up the param 
all the time. :)
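
Roughly the following shape, as a standalone sketch.  The struct and the body 
of __intel_ring_idle() are stand-ins I made up so the example compiles on its 
own (in particular the pending/inflight counters are an assumption about what 
"flush" means); only the wrapper pattern is the point.

```c
#include <stdbool.h>

/* Stand-in for the driver's engine structure; fields are illustrative. */
struct intel_engine_cs {
	int pending;	/* work queued but not yet pushed to hardware */
	int inflight;	/* work already on the hardware */
};

/* Common implementation; the bool stays private to this file. */
static int __intel_ring_idle(struct intel_engine_cs *ring, bool flush)
{
	if (flush) {
		/* Push out anything still queued before waiting. */
		ring->inflight += ring->pending;
		ring->pending = 0;
	}

	/* Model "wait for in-flight work to finish". */
	ring->inflight = 0;
	return 0;
}

/* Wait for already-submitted work only (the existing behaviour). */
static inline int intel_ring_idle(struct intel_engine_cs *ring)
{
	return __intel_ring_idle(ring, false);
}

/* Flush queued work first, then wait for the ring to go idle. */
static inline int intel_ring_flush(struct intel_engine_cs *ring)
{
	return __intel_ring_idle(ring, true);
}
```

Callers then read as intel_ring_idle(ring) or intel_ring_flush(ring) and 
nobody has to remember which way round the bool goes.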

With that change (because I'm a bool param hater, at least in exposed APIs):
Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v5 20/35] drm/i915: Add scheduler hook to GPU reset

2016-02-23 Thread Jesse Barnes

On 02/18/2016 06:27 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> When the watchdog resets the GPU, all interrupts get disabled despite
> the reference count remaining. As the scheduler probably had
> interrupts enabled during the reset (it would have been waiting for
> the bad batch to complete), it must be poked to tell it that the
> interrupt has been disabled.
> 
> v5: New patch in series.
> 
> For: VIZ-1587
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/i915_gem.c   |  2 ++
>  drivers/gpu/drm/i915/i915_scheduler.c | 11 +++
>  drivers/gpu/drm/i915/i915_scheduler.h |  1 +
>  3 files changed, 14 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d946f53..d7f7f7a 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3248,6 +3248,8 @@ static void i915_gem_reset_ring_cleanup(struct 
> drm_i915_private *dev_priv,
>   buffer->last_retired_head = buffer->tail;
>   intel_ring_update_space(buffer);
>   }
> +
> + i915_scheduler_reset_cleanup(ring);
>  }
>  
>  void i915_gem_reset(struct drm_device *dev)
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
> b/drivers/gpu/drm/i915/i915_scheduler.c
> index 8130a9c..4f25bf2 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -778,6 +778,17 @@ void i915_scheduler_clean_node(struct 
> i915_scheduler_queue_entry *node)
>   }
>  }
>  
> +void i915_scheduler_reset_cleanup(struct intel_engine_cs *ring)
> +{
> + struct drm_i915_private *dev_priv = ring->dev->dev_private;
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> +
> + if (scheduler->flags[ring->id] & i915_sf_interrupts_enabled) {
> + ring->irq_put(ring);
> + scheduler->flags[ring->id] &= ~i915_sf_interrupts_enabled;
> + }
> +}
> +

So I guess these flags are also protected by the struct_mutex?  If so, I guess 
it looks ok.
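
The reason I ask: the test-and-clear is a read-modify-write paired with a 
reference drop, so it only stays balanced if every path touching the flag is 
serialised by the same lock.  A small userspace model of the invariant 
(all names here are illustrative, not the driver's):

```c
#include <stdint.h>

#define SF_INTERRUPTS_ENABLED (1u << 0)	/* models i915_sf_interrupts_enabled */

struct engine {
	int irq_refcount;	/* references held on the user interrupt */
	uint32_t sched_flags;	/* per-ring scheduler flags */
};

/* Scheduler takes one interrupt reference and records that it did so. */
static void sched_irq_enable(struct engine *e)
{
	if (!(e->sched_flags & SF_INTERRUPTS_ENABLED)) {
		e->irq_refcount++;
		e->sched_flags |= SF_INTERRUPTS_ENABLED;
	}
}

/* After a reset: drop the reference only if the flag says one is held. */
static void sched_reset_cleanup(struct engine *e)
{
	if (e->sched_flags & SF_INTERRUPTS_ENABLED) {
		e->irq_refcount--;
		e->sched_flags &= ~SF_INTERRUPTS_ENABLED;
	}
}
```

Called sequentially, repeated enables or cleanups never unbalance the count.  
Without a common lock, two racing callers could both see the flag set and 
drop the reference twice.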

Reviewed-by: Jesse Barnes 


Re: [Intel-gfx] [PATCH v5 09/35] drm/i915: Force MMIO flips when scheduler enabled

2016-02-22 Thread Jesse Barnes

On 02/20/2016 01:22 AM, Chris Wilson wrote:
> On Fri, Feb 19, 2016 at 11:28:05AM -0800, Jesse Barnes wrote:
>> On 02/18/2016 06:26 AM, john.c.harri...@intel.com wrote:
>>> From: John Harrison 
>>>
>>> MMIO flips are the preferred mechanism now
> 
> Because introducing variable latency in waking up a big core is a good
> idea?

Is the latency variability really that bad?  I'm not too concerned about waking 
up the core either, I think it's going to be hit up quite a bit in these 
situations anyway, and vblanks should be driving the rendering as well.

>> but more importantly, pipe
>>> based flips cause issues for the scheduler. Specifically, submitting
>>> work to the rings around the side of the scheduler could cause that
>>> work to be lost if the scheduler generates a pre-emption event on that
>>> ring.
> 
> No. That is just incredibly bad design.
> 
>> Why can't we use mmio flips unconditionally?  Maarten or Ville?
> 
> Why would we want to? CS flips work just fine in execlists and no reason
> was ever given as to why they were not enabled, just laziness.

I'm not saying it can't be done, but I thought with atomic we decided to go 
with mmio because it makes things a lot simpler (iirc Ville's design did that 
way back?).

I'm fine with either, but it seems like we should settle on one rather than 
trying to maintain two different ways of flipping.  We'll have work to do 
either way.

Jesse


Re: [Intel-gfx] [PATCH v5 09/35] drm/i915: Force MMIO flips when scheduler enabled

2016-02-19 Thread Jesse Barnes

On 02/19/2016 11:53 AM, Ville Syrjälä wrote:
> On Fri, Feb 19, 2016 at 11:28:05AM -0800, Jesse Barnes wrote:
>> On 02/18/2016 06:26 AM, john.c.harri...@intel.com wrote:
>>> From: John Harrison 
>>>
>>> MMIO flips are the preferred mechanism now but more importantly, pipe
>>> based flips cause issues for the scheduler. Specifically, submitting
>>> work to the rings around the side of the scheduler could cause that
>>> work to be lost if the scheduler generates a pre-emption event on that
>>> ring.
>>>
>>> For: VIZ-1587
>>> Signed-off-by: John Harrison 
>>> ---
>>>  drivers/gpu/drm/i915/intel_display.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_display.c 
>>> b/drivers/gpu/drm/i915/intel_display.c
>>> index 6e12ed7..731d20a 100644
>>> --- a/drivers/gpu/drm/i915/intel_display.c
>>> +++ b/drivers/gpu/drm/i915/intel_display.c
>>> @@ -46,6 +46,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include "i915_scheduler.h"
>>>  
>>>  /* Primary plane formats for gen <= 3 */
>>>  static const uint32_t i8xx_primary_formats[] = {
>>> @@ -11330,6 +11331,8 @@ static bool use_mmio_flip(struct intel_engine_cs 
>>> *ring,
>>> return true;
>>> else if (i915.enable_execlists)
>>> return true;
>>> +   else if (i915_scheduler_is_enabled(ring->dev))
>>> +   return true;
>>> else if (obj->base.dma_buf &&
>>>  !reservation_object_test_signaled_rcu(obj->base.dma_buf->resv,
>>>false))
>>>
>>
>> Why can't we use mmio flips unconditionally?  Maarten or Ville?
> 
> We do when execlists are used, which is always on gen9+. So I guess I'm
> missing the point of this patch. For gen5+ we could also do it trivially.

I didn't check whether the scheduler is also enabled for gen8 (it would be nice 
if so; that would cover BDW and BSW).

> 
> For older platforms it'd require a bit of work since we'd need to
> complete the flips from the vblank interrupt. Well, we actually do
> that already even with CS flips on those platforms, but we do look
> at the flip pending interrupt to figure out if CS already issued
> the flip or not. So that part would need changing.
> 
> I also think we should switch to using the vblank interrupt for this
> stuff on all platforms, mainly since the flip done interrupt is somewhat
> broken on at least BDW (no idea if it got fixed in SKL or later), and
> doing things in more than one way certainly won't decrease our bug count.

Yeah that's probably the way to go; I haven't checked the behavior on SKL 
either.


Re: [Intel-gfx] [PATCH v5 18/35] drm/i915: Added scheduler support to page fault handler

2016-02-19 Thread Jesse Barnes

On 02/18/2016 06:27 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> GPU page faults can now require scheduler operation in order to
> complete. For example, in order to free up sufficient memory to handle
> the fault the handler must wait for a batch buffer to complete that
> has not even been sent to the hardware yet. Thus EAGAIN no longer
> means a GPU hang, it can occur under normal operation.
> 
> For: VIZ-1587
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 17b44b3..a47a495 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2003,10 +2003,15 @@ out:
>   }
>   case -EAGAIN:
>   /*
> -  * EAGAIN means the gpu is hung and we'll wait for the error
> -  * handler to reset everything when re-faulting in
> +  * EAGAIN can mean the gpu is hung and we'll have to wait for
> +  * the error handler to reset everything when re-faulting in
>* i915_mutex_lock_interruptible.
> +  *
> +  * It can also indicate various other nonfatal errors for which
> +  * the best response is to give other threads a chance to run,
> +  * and then retry the failing operation in its entirety.
>*/
> +         /*FALLTHRU*/
>   case 0:
>   case -ERESTARTSYS:
>   case -EINTR:
> 

Reviewed-by: Jesse Barnes 
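
For reference, the grouping in that switch can be modelled in userspace as 
below.  This is a sketch under assumptions: the helper name is made up, only 
the four cases visible in the hunk are covered, and "benign" here means the 
handler returns without signalling so that the access simply faults again.

```c
#include <errno.h>
#include <stdbool.h>

/* ERESTARTSYS is kernel-internal and absent from userspace <errno.h>. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Return true when the fault handler should return quietly and let the
 * access re-fault, false when the error must be reported to the
 * faulting process.
 */
static bool fault_error_is_benign(int ret)
{
	switch (ret) {
	case -EAGAIN:	/* hang being reset, or work not yet dispatched */
	case 0:
	case -ERESTARTSYS:
	case -EINTR:
		return true;
	default:
		return false;
	}
}
```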

Re: [Intel-gfx] [PATCH v5 16/35] drm/i915: Hook scheduler node clean up into retire requests

2016-02-19 Thread Jesse Barnes

On 02/18/2016 06:27 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> The scheduler keeps its own lock on various DRM objects in order to
> guarantee safe access long after the original execbuff IOCTL has

execbuf is getting bigger, but I'm not sure if it qualifies as "buff" yet.  
Intentional misspelling? :)

> completed. This is especially important when pre-emption is enabled as
> the batch buffer might need to be submitted to the hardware multiple
> times. This patch hooks the clean up of these locks into the request
> retire function. The request can only be retired after it has
> completed on the hardware and thus is no longer eligible for
> re-submission. Thus there is no point holding on to the locks beyond
> that time.
> 
> v3: Updated to not WARN when cleaning a node that is being cancelled.
> The clean will happen later so skipping it at the point of
> cancellation is fine.
> 
> v5: Squashed the i915_scheduler.c portions down into the 'start of
> scheduler' patch. [Joonas Lahtinen]
> 
> For: VIZ-1587
> Signed-off-by: John Harrison 
> Cc: Joonas Lahtinen 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1ab7256..2dd9b55 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1489,6 +1489,9 @@ static void i915_gem_request_retire(struct 
> drm_i915_gem_request *request)
>   fence_signal_locked(&request->fence);
>   }
>  
> + if (request->scheduler_qe)
> +     i915_scheduler_clean_node(request->scheduler_qe);
> +
>   i915_gem_request_unreference(request);
>  }
>  
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v5 15/35] drm/i915: Added tracking/locking of batch buffer objects

2016-02-19 Thread Jesse Barnes

38,19 @@ void i915_scheduler_clean_node(struct 
> i915_scheduler_queue_entry *node)
>   node->params.batch_obj = NULL;
>   }
>  
> + /* Release the locked buffers: */
> + for (i = 0; i < node->num_objs; i++)
> +     drm_gem_object_unreference(&node->saved_objects[i].obj->base);
> + kfree(node->saved_objects);
> + node->saved_objects = NULL;
> + node->num_objs = 0;
> +
> + /* Context too: */
> + if (node->params.ctx) {
> + i915_gem_context_unreference(node->params.ctx);
> + node->params.ctx = NULL;
> + }
> +
>   /* And anything else owned by the node: */
>   if (node->params.cliprects) {
>   kfree(node->params.cliprects);
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v5 13/35] drm/i915: Redirect execbuffer_final() via scheduler

2016-02-19 Thread Jesse Barnes

On 02/18/2016 06:27 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> Updated the execbuffer() code to pass the packaged up batch buffer
> information to the scheduler rather than calling execbuffer_final()
> directly. The scheduler queue() code is currently a stub which simply
> chains on to _final() immediately.
> 
> For: VIZ-1587
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 18 +++---
>  drivers/gpu/drm/i915/intel_lrc.c   | 12 
>  2 files changed, 11 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 7978dae..09c5ce9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -33,6 +33,7 @@
>  #include "intel_drv.h"
>  #include 
>  #include 
> +#include "i915_scheduler.h"
>  
>  #define  __EXEC_OBJECT_HAS_PIN (1<<31)
>  #define  __EXEC_OBJECT_HAS_FENCE (1<<30)
> @@ -1226,6 +1227,7 @@ i915_gem_ringbuffer_submission(struct 
> i915_execbuffer_params *params,
>  struct drm_i915_gem_execbuffer2 *args,
>  struct list_head *vmas)
>  {
> + struct i915_scheduler_queue_entry *qe;
>   struct drm_device *dev = params->dev;
>   struct intel_engine_cs *ring = params->ring;
>   struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1270,17 +1272,11 @@ i915_gem_ringbuffer_submission(struct 
> i915_execbuffer_params *params,
>  
>   i915_gem_execbuffer_move_to_active(vmas, params->request);
>  
> - ret = dev_priv->gt.execbuf_final(params);
> + qe = container_of(params, typeof(*qe), params);
> + ret = i915_scheduler_queue_execbuffer(qe);
>   if (ret)
>   return ret;
>  
> - /*
> -  * Free everything that was stored in the QE structure (until the
> -  * scheduler arrives and does it instead):
> -  */
> - if (params->dispatch_flags & I915_DISPATCH_SECURE)
> - i915_gem_execbuff_release_batch_obj(params->batch_obj);
> -
>   return 0;
>  }
>  
> @@ -1420,8 +1416,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
> *data,
>   struct intel_engine_cs *ring;
>   struct intel_context *ctx;
>   struct i915_address_space *vm;
> - struct i915_execbuffer_params params_master; /* XXX: will be removed 
> later */
> - struct i915_execbuffer_params *params = &params_master;
> + struct i915_scheduler_queue_entry qe;
> + struct i915_execbuffer_params *params = &qe.params;
>   const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
>   u32 dispatch_flags;
>   int ret;
> @@ -1529,7 +1525,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
> *data,
>   else
>   vm = &dev_priv->gtt.base;
>  
> - memset(&params_master, 0x00, sizeof(params_master));
> + memset(&qe, 0x00, sizeof(qe));
>  
>   eb = eb_create(args);
>   if (eb == NULL) {
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> b/drivers/gpu/drm/i915/intel_lrc.c
> index 12e8949..ff4565f 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -136,6 +136,7 @@
>  #include 
>  #include "i915_drv.h"
>  #include "intel_mocs.h"
> +#include "i915_scheduler.h"
>  
>  #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
>  #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
> @@ -910,6 +911,7 @@ int intel_execlists_submission(struct 
> i915_execbuffer_params *params,
>  struct drm_i915_gem_execbuffer2 *args,
>  struct list_head *vmas)
>  {
> + struct i915_scheduler_queue_entry *qe;
>   struct drm_device   *dev = params->dev;
>   struct intel_engine_cs  *ring = params->ring;
>   struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -952,17 +954,11 @@ int intel_execlists_submission(struct 
> i915_execbuffer_params *params,
>  
>   i915_gem_execbuffer_move_to_active(vmas, params->request);
>  
> - ret = dev_priv->gt.execbuf_final(params);
> + qe = container_of(params, typeof(*qe), params);
> + ret = i915_scheduler_queue_execbuffer(qe);
>   if (ret)
>   return ret;
>  
> - /*
> -  * Free everything that was stored in the QE structure (until the
> -  * scheduler arrives and does it instead):
> -  */
> - if (params->dispatch_flags & I915_DISPATCH_SECURE)
> - i915_gem_execbuff_release_batch_obj(params->batch_obj);
> -
>   return 0;
>  }
>  
> 

I think this is ok, but might need changes if some of the earlier patches see 
changes due to Joonas's reviews.
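
One thing worth keeping in mind is that the container_of() in the submission 
functions is only valid because params is now always embedded in a queue 
entry.  A standalone sketch of the pattern, with cut-down stand-ins for the 
two structures (the fields and the params_to_qe() helper are illustrative, 
and the macro is a plain userspace version without the kernel's type check):

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(), no type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-ins for the driver structures; fields are illustrative. */
struct i915_execbuffer_params {
	int dispatch_flags;
};

struct i915_scheduler_queue_entry {
	int status;
	struct i915_execbuffer_params params;	/* embedded, not a pointer */
};

/* Recover the enclosing queue entry from a pointer to its params member. */
static struct i915_scheduler_queue_entry *
params_to_qe(struct i915_execbuffer_params *params)
{
	return container_of(params, struct i915_scheduler_queue_entry, params);
}
```

If anyone ever passes a params that lives on its own, the pointer arithmetic 
still "works" and hands back garbage.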

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v5 14/35] drm/i915: Keep the reserved space mechanism happy

2016-02-19 Thread Jesse Barnes

;stamp  = jiffies;
>   i915_gem_request_reference(node->params.request);
>  
> + intel_ring_reserved_space_cancel(node->params.request->ringbuf);
> +
>   WARN_ON(node->params.request->scheduler_qe);
>   node->params.request->scheduler_qe = node;
>  
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> b/drivers/gpu/drm/i915/intel_lrc.c
> index ff4565f..f4bab82 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -978,13 +978,17 @@ int intel_execlists_submission_final(struct 
> i915_execbuffer_params *params)
>   /* The mutex must be acquired before calling this function */
>   WARN_ON(!mutex_is_locked(&params->dev->struct_mutex));
>  
> + ret = intel_logical_ring_reserve_space(req);
> + if (ret)
> + goto err;
> +
>   /*
>* Unconditionally invalidate gpu caches and ensure that we do flush
>* any residual writes from the previous batch.
>*/
>   ret = logical_ring_invalidate_all_caches(req);
>   if (ret)
> - return ret;
> + goto err;
>  
>   if (ring == &dev_priv->ring[RCS] &&
>   params->instp_mode != dev_priv->relative_constants_mode) {
> @@ -1006,13 +1010,18 @@ int intel_execlists_submission_final(struct 
> i915_execbuffer_params *params)
>  
>   ret = ring->emit_bb_start(req, exec_start, params->dispatch_flags);
>   if (ret)
> - return ret;
> + goto err;
>  
>   trace_i915_gem_ring_dispatch(req, params->dispatch_flags);
>  
>   i915_gem_execbuffer_retire_commands(params);
>  
>   return 0;
> +
> +err:
> + intel_ring_reserved_space_cancel(params->request->ringbuf);
> +
> + return ret;
>  }
>  
>  void intel_execlists_retire_requests(struct intel_engine_cs *ring)
> 

I had to double check that it was ok to cancel space if the reserve failed (I 
guess it is), so seems ok.
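
The property I was checking can be shown with a small userspace model.  All 
names and the ring model are made up for illustration, and it assumes that 
cancel is harmless when nothing is reserved, which is what the single err 
label relies on:

```c
#include <errno.h>
#include <stdbool.h>

/* Minimal model of a ring buffer's reservation state (illustrative). */
struct ringbuf {
	int space;		/* free space available */
	int reserved_size;	/* space set aside for the request tail */
};

/* Reserve space; on failure nothing is left reserved. */
static int reserve_space(struct ringbuf *rb, int size)
{
	if (size > rb->space)
		return -ENOSPC;
	rb->reserved_size = size;
	return 0;
}

/* Drop any reservation; safe to call when nothing is reserved. */
static void reserved_space_cancel(struct ringbuf *rb)
{
	rb->reserved_size = 0;
}

/* Same shape as the patched function: one exit label for all failures. */
static int submission_final(struct ringbuf *rb, int size, bool emit_fails)
{
	int ret;

	ret = reserve_space(rb, size);
	if (ret)
		goto err;

	if (emit_fails) {
		ret = -EIO;
		goto err;
	}

	rb->space -= size;	/* commands emitted, reservation consumed */
	rb->reserved_size = 0;
	return 0;

err:
	reserved_space_cancel(rb);
	return ret;
}
```

Both failure paths leave the ring with no outstanding reservation, whether or 
not the reserve itself succeeded.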

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v5 09/35] drm/i915: Force MMIO flips when scheduler enabled

2016-02-19 Thread Jesse Barnes

On 02/18/2016 06:26 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> MMIO flips are the preferred mechanism now but more importantly, pipe
> based flips cause issues for the scheduler. Specifically, submitting
> work to the rings around the side of the scheduler could cause that
> work to be lost if the scheduler generates a pre-emption event on that
> ring.
> 
> For: VIZ-1587
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index 6e12ed7..731d20a 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -46,6 +46,7 @@
>  #include 
>  #include 
>  #include 
> +#include "i915_scheduler.h"
>  
>  /* Primary plane formats for gen <= 3 */
>  static const uint32_t i8xx_primary_formats[] = {
> @@ -11330,6 +11331,8 @@ static bool use_mmio_flip(struct intel_engine_cs 
> *ring,
>   return true;
>   else if (i915.enable_execlists)
>   return true;
> + else if (i915_scheduler_is_enabled(ring->dev))
> + return true;
>   else if (obj->base.dma_buf &&
>!reservation_object_test_signaled_rcu(obj->base.dma_buf->resv,
>      false))
> 

Why can't we use mmio flips unconditionally?  Maarten or Ville?

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v5 08/35] drm/i915: Disable hardware semaphores when GPU scheduler is enabled

2016-02-19 Thread Jesse Barnes

On 02/18/2016 06:26 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> Hardware semaphores require seqno values to be continuously
> incrementing. However, the scheduler's reordering of batch buffers
> means that the seqno values going through the hardware could be out of
> order. Thus semaphores can not be used.
> 
> On the other hand, the scheduler supersedes the need for hardware
> semaphores anyway. Having one ring stall waiting for something to
> complete on another ring is inefficient if that ring could be working
> on some other, independent task. This is what the scheduler is meant
> to do - keep the hardware as busy as possible by reordering batch
> buffers to avoid dependency stalls.
> 
> v4: Downgraded a BUG_ON to WARN_ON as the latter is preferred.
> 
> v5: Squashed the i915_scheduler.c portions down into the 'start of
> scheduler' patch. [Joonas Lahtinen]
> 
> For: VIZ-1587
> Signed-off-by: John Harrison 
> Cc: Joonas Lahtinen 
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 9 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 4 
>  2 files changed, 13 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 975af35..5760a17 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -34,6 +34,7 @@
>  #include "i915_drv.h"
>  #include "i915_trace.h"
>  #include "intel_drv.h"
> +#include "i915_scheduler.h"
>  
>  #include 
>  #include 
> @@ -517,6 +518,14 @@ void intel_detect_pch(struct drm_device *dev)
>  
>  bool i915_semaphore_is_enabled(struct drm_device *dev)
>  {
> + /* Hardware semaphores are not compatible with the scheduler due to the
> +  * seqno values being potentially out of order. However, semaphores are
> +  * also not required as the scheduler will handle inter-ring dependencies
> +  * and try to do so in a way that does not cause dead time on the hardware.
> +  */
> + if (i915_scheduler_is_enabled(dev))
> + return false;
> +
>   if (INTEL_INFO(dev)->gen < 6)
>   return false;
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 9d4f19d..ca7b8af 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -33,6 +33,7 @@
>  #include 
>  #include "i915_trace.h"
>  #include "intel_drv.h"
> +#include "i915_scheduler.h"
>  
>  int __intel_ring_space(int head, int tail, int size)
>  {
> @@ -1400,6 +1401,9 @@ gen6_ring_sync(struct drm_i915_gem_request *waiter_req,
>   u32 wait_mbox = signaller->semaphore.mbox.wait[waiter->id];
>   int ret;
>  
> + /* Arithmetic on sequence numbers is unreliable with a scheduler. */
> + WARN_ON(i915_scheduler_is_enabled(signaller->dev));
> +
>   /* Throughout all of the GEM code, seqno passed implies our current
>* seqno is >= the last seqno executed. However for hardware the
>* comparison is strictly greater than.
> 

I'd rather get rid of this altogether, but I guess we'll need it for the older 
gens.  Another option would be to make the sync_to hook NULL in the scheduler 
case, though I guess the failure mode is less desirable there.
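
To spell out why reordering breaks the semaphore path: the wait is effectively
a "has the signaller's seqno passed N" test. The sketch below uses a wrap-safe
signed-difference comparison in the spirit of i915_seqno_passed();
released_early() and the execution-order array are a made-up model for
illustration, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wrap-safe "seq1 has reached seq2" test using a signed difference. */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/*
 * Hypothetical model: exec_order lists request seqnos in the order the
 * hardware actually runs them. Returns true if a waiter on wait_seqno
 * would be released before the request it depends on has executed.
 */
static bool released_early(const uint32_t *exec_order, size_t count,
			   uint32_t wait_seqno)
{
	assert(exec_order || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (exec_order[i] == wait_seqno)
			return false;	/* the awaited request ran first */
		if (seqno_passed(exec_order[i], wait_seqno))
			return true;	/* a later seqno landed first */
	}

	return false;
}
```

With in-order execution (9, 10, 11) a waiter on 10 is released at the right
time; with the scheduler running 9, 11, 10 it is released when 11 lands, which
is too early.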

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v5 07/35] drm/i915: Prepare retire_requests to handle out-of-order seqnos

2016-02-19 Thread Jesse Barnes

On 02/18/2016 06:26 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> A major point of the GPU scheduler is that it re-orders batch buffers
> after they have been submitted to the driver. This leads to requests
> completing out of order. In turn, this means that the retire
> processing can no longer assume that all completed entries are at the
> front of the list. Rather than attempting to re-order the request list
> on a regular basis, it is better to simply scan the entire list.
> 
> v2: Removed deferred free code as no longer necessary due to request
> handling updates.
> 
> For: VIZ-1587
> Signed-off-by: John Harrison 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 31 +--
>  1 file changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 7d9aa24..0003cfc 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3233,6 +3233,7 @@ void i915_gem_reset(struct drm_device *dev)
>  void
>  i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
>  {
> + struct drm_i915_gem_object *obj, *obj_next;
>   struct drm_i915_gem_request *req, *req_next;
>   LIST_HEAD(list_head);
>  
> @@ -3245,37 +3246,31 @@ i915_gem_retire_requests_ring(struct intel_engine_cs 
> *ring)
>*/
>   i915_gem_request_notify(ring, false);
>  
> + /*
> +  * Note that request entries might be out of order due to rescheduling
> +  * and pre-emption. Thus both lists must be processed in their entirety
> +  * rather than stopping at the first non-complete entry.
> +  */
> +
>   /* Retire requests first as we use it above for the early return.
>* If we retire requests last, we may use a later seqno and so clear
>* the requests lists without clearing the active list, leading to
>* confusion.
>*/
> - while (!list_empty(&ring->request_list)) {
> - struct drm_i915_gem_request *request;
> -
> - request = list_first_entry(&ring->request_list,
> -struct drm_i915_gem_request,
> -list);
> -
> - if (!i915_gem_request_completed(request))
> - break;
> + list_for_each_entry_safe(req, req_next, &ring->request_list, list) {
> + if (!i915_gem_request_completed(req))
> + continue;
>  
> - i915_gem_request_retire(request);
> + i915_gem_request_retire(req);
>   }
>  
>   /* Move any buffers on the active list that are no longer referenced
>* by the ringbuffer to the flushing/inactive lists as appropriate,
>* before we free the context associated with the requests.
>*/
> - while (!list_empty(&ring->active_list)) {
> - struct drm_i915_gem_object *obj;
> -
> - obj = list_first_entry(&ring->active_list,
> -   struct drm_i915_gem_object,
> -   ring_list[ring->id]);
> -
> + list_for_each_entry_safe(obj, obj_next, &ring->active_list, 
> ring_list[ring->id]) {
>   if (!list_empty(&obj->last_read_req[ring->id]->list))
> - break;
> + continue;
>  
>   i915_gem_object_retire__read(obj, ring->id);
>   }
> 

Reviewed-by: Jesse Barnes 
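
A rough sketch of the behavioural change, for anyone skimming the diff. It is
a simplified model, not the real code: sketch_request and both functions are
made up, and an array with a "retired" flag stands in for the kernel list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request on the ring's request list. */
struct sketch_request {
	bool completed;
	bool retired;
};

/* Old behaviour: stop at the first request that has not completed. */
static size_t retire_in_order(struct sketch_request *reqs, size_t count)
{
	size_t retired = 0;

	assert(reqs || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (reqs[i].retired)
			continue;	/* already off the list */
		if (!reqs[i].completed)
			break;
		reqs[i].retired = true;
		retired++;
	}

	return retired;
}

/* New behaviour: skip incomplete requests and scan the whole list. */
static size_t retire_full_scan(struct sketch_request *reqs, size_t count)
{
	size_t retired = 0;

	assert(reqs || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (reqs[i].retired || !reqs[i].completed)
			continue;
		reqs[i].retired = true;
		retired++;
	}

	return retired;
}
```

With completions of (done, pending, done), the in-order version retires only
the first entry, while the full scan also picks up the third.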

Re: [Intel-gfx] [PATCH v4 08/38] drm/i915: Prepare retire_requests to handle out-of-order seqnos

2016-02-04 Thread Jesse Barnes

On 01/11/2016 02:10 PM, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 06:42:37PM +, john.c.harri...@intel.com wrote:
>> From: John Harrison 
>>
>> A major point of the GPU scheduler is that it re-orders batch buffers
>> after they have been submitted to the driver. This leads to requests
>> completing out of order. In turn, this means that the retire
>> processing can no longer assume that all completed entries are at the
>> front of the list. Rather than attempting to re-order the request list
>> on a regular basis, it is better to simply scan the entire list.
> 
> This is a major misstep. Just think in terms of per-context timelines,
> and retirement order within those timelines being consistent.

I think you're just re-iterating the desire for per-context timelines.  We all 
want this, but after already implementing several big reworks, I don't think 
it's fair to make John do this.  When we move that direction after this lands, 
we can simplify/drop code where possible (sure maybe it's more total churn, but 
my guess is we'll probably find other things to refactor as well, so it doesn't 
matter too much).

Jesse


Re: [Intel-gfx] [PATCH v4 05/38] drm/i915: Cache request pointer in *_submission_final()

2016-02-04 Thread Jesse Barnes

->relative_constants_mode) {
> - ret = intel_logical_ring_begin(params->request, 4);
> + ret = intel_logical_ring_begin(req, 4);
>   if (ret)
>   return ret;
>  
> @@ -963,11 +964,11 @@ int intel_execlists_submission_final(struct 
> i915_execbuffer_params *params)
>   exec_start = params->batch_obj_vm_offset +
>        params->args_batch_start_offset;
>  
> - ret = ring->emit_bb_start(params->request, exec_start, 
> params->dispatch_flags);
> + ret = ring->emit_bb_start(req, exec_start, params->dispatch_flags);
>   if (ret)
>   return ret;
>  
> - trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
> + trace_i915_gem_ring_dispatch(req, params->dispatch_flags);
>  
>   i915_gem_execbuffer_retire_commands(params);
>  
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v4 04/38] drm/i915: Split i915_dem_do_execbuffer() in half

2016-02-04 Thread Jesse Barnes

instp_mode;
> - u32 instp_mask;
>   int ret;
>  
> - instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
> - instp_mask = I915_EXEC_CONSTANTS_MASK;
> - switch (instp_mode) {
> + params->instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
> + params->instp_mask = I915_EXEC_CONSTANTS_MASK;
> + switch (params->instp_mode) {
>   case I915_EXEC_CONSTANTS_REL_GENERAL:
>   case I915_EXEC_CONSTANTS_ABSOLUTE:
>   case I915_EXEC_CONSTANTS_REL_SURFACE:
> - if (instp_mode != 0 && ring != &dev_priv->ring[RCS]) {
> + if (params->instp_mode != 0 && ring != &dev_priv->ring[RCS]) {
>   DRM_DEBUG("non-0 rel constants mode on non-RCS\n");
>   return -EINVAL;
>   }
>  
> - if (instp_mode != dev_priv->relative_constants_mode) {
> - if (instp_mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
> + if (params->instp_mode != dev_priv->relative_constants_mode) {
> + if (params->instp_mode == 
> I915_EXEC_CONSTANTS_REL_SURFACE) {
>   DRM_DEBUG("rel surface constants mode invalid 
> on gen5+\n");
>   return -EINVAL;
>   }
>  
>   /* The HW changed the meaning on this bit on gen6 */
> - instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
> + params->instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
>   }
>   break;
>   default:
> - DRM_DEBUG("execbuf with unknown constants: %d\n", instp_mode);
> + DRM_DEBUG("execbuf with unknown constants: %d\n", 
> params->instp_mode);
>   return -EINVAL;
>   }
>  
> @@ -912,7 +908,34 @@ int intel_execlists_submission(struct 
> i915_execbuffer_params *params,
>  
>   i915_gem_execbuffer_move_to_active(vmas, params->request);
>  
> - /* To be split into two functions here... */
> + ret = dev_priv->gt.execbuf_final(params);
> + if (ret)
> + return ret;
> +
> + /*
> +  * Free everything that was stored in the QE structure (until the
> +  * scheduler arrives and does it instead):
> +  */
> + if (params->dispatch_flags & I915_DISPATCH_SECURE)
> + i915_gem_execbuff_release_batch_obj(params->batch_obj);
> +
> + return 0;
> +}
> +
> +/*
> + * This is the main function for adding a batch to the ring.
> + * It is called from the scheduler, with the struct_mutex already held.
> + */
> +int intel_execlists_submission_final(struct i915_execbuffer_params *params)
> +{
> + struct drm_i915_private *dev_priv = params->dev->dev_private;
> + struct intel_ringbuffer *ringbuf = params->request->ringbuf;
> + struct intel_engine_cs *ring = params->ring;
> + u64 exec_start;
> + int ret;
> +
> + /* The mutex must be acquired before calling this function */
> + WARN_ON(!mutex_is_locked(&params->dev->struct_mutex));
>  
>   /*
>* Unconditionally invalidate gpu caches and ensure that we do flush
> @@ -923,7 +946,7 @@ int intel_execlists_submission(struct 
> i915_execbuffer_params *params,
>   return ret;
>  
>   if (ring == &dev_priv->ring[RCS] &&
> - instp_mode != dev_priv->relative_constants_mode) {
> + params->instp_mode != dev_priv->relative_constants_mode) {
>   ret = intel_logical_ring_begin(params->request, 4);
>   if (ret)
>   return ret;
> @@ -931,14 +954,14 @@ int intel_execlists_submission(struct 
> i915_execbuffer_params *params,
>   intel_logical_ring_emit(ringbuf, MI_NOOP);
>   intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
>   intel_logical_ring_emit(ringbuf, INSTPM);
> - intel_logical_ring_emit(ringbuf, instp_mask << 16 | instp_mode);
> + intel_logical_ring_emit(ringbuf, params->instp_mask << 16 | 
> params->instp_mode);
>   intel_logical_ring_advance(ringbuf);
>  
> - dev_priv->relative_constants_mode = instp_mode;
> + dev_priv->relative_constants_mode = params->instp_mode;
>   }
>  
>   exec_start = params->batch_obj_vm_offset +
> -  args->batch_start_offset;
> +  params->args_batch_start_offset;
>  
>   ret = ring->emit_bb_start(params->request, exec_start, 
> params->dispatch_flags);
>   if (ret)
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h 
> b/drivers/gpu/drm/i915/intel_lrc.h
> index 4e60d54..8d9bad7 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -93,6 +93,7 @@ struct i915_execbuffer_params;
>  int intel_execlists_submission(struct i915_execbuffer_params *params,
>  struct drm_i915_gem_execbuffer2 *args,
>  struct list_head *vmas);
> +int intel_execlists_submission_final(struct i915_execbuffer_params *params);
>  u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
>  
>  void intel_lrc_irq_handler(struct intel_engine_cs *ring);

Just a nitpick on naming; I think prepare/commit might be better than 
submit/submit_final.  But you don't have to respin just for that.

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v4 03/38] drm/i915: Prelude to splitting i915_gem_do_execbuffer in two

2016-02-04 Thread Jesse Barnes

(struct drm_device *dev, void 
> *data,
>   dispatch_flags |= I915_DISPATCH_RS;
>   }
>  
> - intel_runtime_pm_get(dev_priv);
> -
>   ret = i915_mutex_lock_interruptible(dev);
>   if (ret)
>   goto pre_mutex_err;
> @@ -1599,9 +1615,6 @@ err:
>   mutex_unlock(&dev->struct_mutex);
>  
>  pre_mutex_err:
> - /* intel_gpu_busy should also get a ref, so it will free when the device
> -  * is really idle. */
> - intel_runtime_pm_put(dev_priv);
>   return ret;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> b/drivers/gpu/drm/i915/intel_lrc.c
> index e510730..4bf0ee6 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -647,10 +647,7 @@ static int execlists_move_to_gpu(struct 
> drm_i915_gem_request *req,
>   if (flush_domains & I915_GEM_DOMAIN_GTT)
>   wmb();
>  
> - /* Unconditionally invalidate gpu caches and ensure that we do flush
> -  * any residual writes from the previous batch.
> -  */
> - return logical_ring_invalidate_all_caches(req);
> + return 0;
>  }
>  
>  int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request 
> *request)
> @@ -913,6 +910,18 @@ int intel_execlists_submission(struct 
> i915_execbuffer_params *params,
>   if (ret)
>   return ret;
>  
> + i915_gem_execbuffer_move_to_active(vmas, params->request);
> +
> + /* To be split into two functions here... */
> +
> + /*
> +  * Unconditionally invalidate gpu caches and ensure that we do flush
> +  * any residual writes from the previous batch.
> +  */
> + ret = logical_ring_invalidate_all_caches(params->request);
> + if (ret)
> + return ret;
> +
>   if (ring == &dev_priv->ring[RCS] &&
>   instp_mode != dev_priv->relative_constants_mode) {
>   ret = intel_logical_ring_begin(params->request, 4);
> @@ -937,7 +946,6 @@ int intel_execlists_submission(struct 
> i915_execbuffer_params *params,
>  
>   trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>  
> - i915_gem_execbuffer_move_to_active(vmas, params->request);
>   i915_gem_execbuffer_retire_commands(params);
>  
>   return 0;
> 

Do we need to do anything if the cache invalidation fails like move the buffers 
back off the active list?  The order changed here, so I'm wondering.

If that's not a problem, then:
Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [RFC 00/22] Add support for GuC-based SLPC

2016-01-26 Thread Jesse Barnes

On 01/26/2016 09:00 AM, Daniel Vetter wrote:
> On Tue, Jan 26, 2016 at 07:45:42AM -0800, Jesse Barnes wrote:
>> On 01/22/2016 09:00 AM, Daniel Vetter wrote:
>>> On Wed, Jan 20, 2016 at 06:26:02PM -0800, tom.orou...@intel.com wrote:
>>>> From: Tom O'Rourke 
>>>>
>>>> SLPC (Single Loop Power Controller) is a replacement for
>>>> some host-based power management features.  The SLPC
>>>> implementation runs in firmware on GuC.
>>>>
>>>> This series is a first request for comments.  This series
>>>> is not expected to be merged.  After changes based on
>>>> comments, a later patch series will be sent for merging.
>>>>  
>>>> This series has been tested with SKL guc firmware
>>>> versions 4.3 and 4.7.  The graphics power management
>>>> features in SLPC in those versions are DFPS (Dynamic FPS),
>>>> Turbo, and DCC (Duty Cycle Control).  DFPS adjusts
>>>> requested graphics frequency to maintain target framerate.
>>>> Turbo adjusts requested graphics frequency to maintain
>>>> target GT busyness.  DCC adjusts requested graphics
>>>> frequency and stalls guc-scheduler to maintain actual
>>>> graphics frequency in efficient range.
>>>
>>> Either it's been forever long ago or I missed that meeting, so I'll drop
>>> my big arch concerns here. We probably need to discuss this internally, at
>>> least the benchmark data. Two big items:
>>>
>>> - How does GuC measure fps rendered to the screen? More specifically, how
>>>   does it figure out that we missed a frame and kick the throttle up?
>>
>> Yeah, this has been covered before, both in the design review and with
>> the GuC team; I don't think the DFPS feature is ready for Linux usage
>> yet, or at least not generally, since afaik it doesn't have a way to
>> monitor offscreen rendering at all, so may end up keeping the GPU freq
>> lower than it needs to be when several clients are rendering to
>> offscreen buffers and passing them to the compositor (depending on the
>> compositor behavior at least).
> 
> There's also all kinds of issues with the current design, like:
> - kernel knows when exactly we missed the vblank to display the next
>   frame, guc can only control for average fps.
> 
> - all the fun you mention about multiple clients.
> 
> - what if we have more than 1 display running at different fps?

Yep; I think a userspace solution with a kernel interface would do a
better job here (I outlined one a few years ago, but the lazyweb hasn't
implemented it for me yet).

> I'd say we need to keep the boost-deboost stuff alive, e.g. by manually
> telling guc that we want different limits, then resetting those limits
> again after the boost is done. Same for fast idling - kernel simply has a
> better idea if anyone is about to submit more work (we have execbuf hints
> for specific workloads like libva).
> 
> Of course this assumes that guc slpc actually obeys our new limit requests
> fast enough, so we'd still need to benchmark to make sure it's not slower
> than what we have.

I definitely want to see benchmarking here too.  Maybe the GuC does
boosting differently, but it may be just as good as what we have for all
practical purposes.

Jesse

Re: [Intel-gfx] [RFC 00/22] Add support for GuC-based SLPC

2016-01-26 Thread Jesse Barnes

On 01/22/2016 09:00 AM, Daniel Vetter wrote:
> On Wed, Jan 20, 2016 at 06:26:02PM -0800, tom.orou...@intel.com wrote:
>> From: Tom O'Rourke 
>>
>> SLPC (Single Loop Power Controller) is a replacement for
>> some host-based power management features.  The SLPC
>> implementation runs in firmware on GuC.
>>
>> This series is a first request for comments.  This series
>> is not expected to be merged.  After changes based on
>> comments, a later patch series will be sent for merging.
>>  
>> This series has been tested with SKL guc firmware
>> versions 4.3 and 4.7.  The graphics power management
>> features in SLPC in those versions are DFPS (Dynamic FPS),
>> Turbo, and DCC (Duty Cycle Control).  DFPS adjusts
>> requested graphics frequency to maintain target framerate.
>> Turbo adjusts requested graphics frequency to maintain
>> target GT busyness.  DCC adjusts requested graphics
>> frequency and stalls guc-scheduler to maintain actual
>> graphics frequency in efficient range.
> 
> Either it's been forever long ago or I missed that meeting, so I'll drop
> my big arch concerns here. We probably need to discuss this internally, at
> least the benchmark data. Two big items:
> 
> - How does GuC measure fps rendered to the screen? More specifically, how
>   does it figure out that we missed a frame and kick the throttle up?

Yeah, this has been covered before, both in the design review and with the GuC 
team; I don't think the DFPS feature is ready for Linux usage yet, or at least 
not generally, since afaik it doesn't have a way to monitor offscreen rendering 
at all, so may end up keeping the GPU freq lower than it needs to be when 
several clients are rendering to offscreen buffers and passing them to the 
compositor (depending on the compositor behavior at least).

> - This patch series seems to remove the limiting abilities, and also
>   completely no-ops out our boost/deboost features. Can we recover these
>   features?

This is a good question; if the GuC handles this it should probably be 
mentioned somewhere.

Jesse

Re: [Intel-gfx] [PATCH igt] igt/gem_softpin: Remove false dependencies on esoteric features

2016-01-25 Thread Jesse Barnes

On 01/21/2016 12:08 AM, Daniel Vetter wrote:
> On Wed, Jan 20, 2016 at 06:49:49PM +, Belgaumkar, Vinay wrote:
>> Hi Chris,
>> These tests were developed for testing buffered SVM (using userptr
>> and soft pinning API). I think Dan wanted me to rename the tests to
>> gem_softpin, since they were being checked in as tests which
>> validated the softpin kernel patches. Can we rename the existing
>> tests to gem_buffered_svm or something similar, and then push any
>> targeted softpin only tests as gem_softpin? We were hoping to use
>> these userptr+softpin tests as GFT tests for SVM(Android) as well,
>> since buffered SVM is POR for BXT Android. 
> 
> I agree with Chris, there's no need to unnecessarily mix together features.
> When the api is designed in an orthogonal way, so should be the testing.
> i915.ko is already a mindboggling complex beast, no need to make our lives
> harder by making the tests use features that aren't strictly needed.
> 
> In the end applications and UMDs will of course use all these features
> together, but that's why we do integration testing on top of just running
> igt.
> 
> Can you please review Chris' patch?

So what's the actual request here?  Chris rewrote Vinay's test, but does it 
cover all the same stuff Vinay did?  If not, it would be nice to include those, 
maybe in a separate file, since Vinay did work with lots of people to make sure 
the coverage was complete for the SVM use cases...  I definitely like the sound 
of the new stuff Chris added though; no-reloc in particular is an important use 
case for upcoming APIs.

Jesse


Re: [Intel-gfx] [RFC] drm/i915: Render decompression support for Gen9 and above

2016-01-25 Thread Jesse Barnes

On 01/19/2016 02:28 AM, Daniel Stone wrote:
>> > We aren't just talking about a few fbs here, we already see more than
>> > 100 fbs active during complex situations. Potentially doubling this
>> > number is surely a significant increase in memory usage, both from the
>> > management side in userspace and the kernel side.
>> >
>> > 8kb kernel memory for the additional 2 copies of drm_framebuffer structs
>> > for 100 buffers. That's about as much as the minimal overhead for just 1
>> > underlying gem object (counting the sg table, vma, gtt pte tracking, gem
>> > object and shmem backing node and pagecache entries). 2 integers in 
>> > userspace.
>> >
>> > Do you have some data to show that overhead?
> I agree with this view as well, and it does seem to be the way chosen
> for generic userspace on other drivers.
> 
> For context, the way ChromeOS and Wayland compositors (Weston, Mutter,
> Enlightenment) work is that a userspace library called GBM is
> distributed as part of EGL, which is the native EGL platform/winsys
> for rendering on KMS. The major difference with GBM, however, is that
> it does _not_ do presentation: presentation is explicitly controlled
> by the compositor itself.
> 
> In order to use this new property, we would have to add API to EGL/GBM
> to extract a list of property names to set, which wouldn't really make
> for great API. It'd be much cleaner for these users to stick with FB
> modifiers, especially as they destroy and recreate the FB objects
> (something we've not seen have any performance impact) for every flip
> anyway. From my side, I'd be much happier using generically-applicable
> FB modifiers, than continuing along the property explosion.
> 
> The other sticking point is that if I go from flipping GPU buffers
> with render compression enabled to software buffers, from userspace
> that means I then need to explicitly go unset the render decompression
> flag before I can display software buffers, else the flips just get
> rejected; something which isn't the case with FB modifiers. One more
> thing to go wrong ...

Just for background, we ended up with a property for this attribute due to a 
request from the only userland folks we had at the time (our hwcomposer team).  
They felt it would be simpler to use a property in this specific case, though 
they already do have a number of fb objects to deal with.  Really I can make an 
argument either way for how well each matches hardware behavior, so figured 
we'd just go with a property due to someone expressing a preference.

This has probably already been changed in an updated patch (still catching up 
on mail), but I thought I'd at least chime in on the thinking on this from way 
back (around a year ago now I think).

Cc'ing Gary in case he has further comment.

Jesse

[Intel-gfx] [PATCH i-g-t] lib/igt_kms, tests/testdisplay: allow probing of new connector modes

2016-01-14 Thread Jesse Barnes

Fixup some fallout from the connector probing changes so testdisplay -m
will pick up newly hotplugged displays correctly.

Signed-off-by: Jesse Barnes
[...]
mode_valid = 0;
return;
}
@@ -456,7 +463,7 @@ set_stereo_mode(struct connector *c)
  * Each connector has a corresponding encoder, except in the SDVO case
  * where an encoder may have multiple connectors.
  */
-int update_display(void)
+int update_display(bool probe)
 {
struct connector *connectors;
int c;
@@ -488,7 +495,7 @@ int update_display(void)
connector_find_preferred_mode(connector->id,
  crtc_idx_mask,
  specified_mode_num,
- connector);
+ connector, probe);
if (!connector->mode_valid)
continue;
 
@@ -513,7 +520,7 @@ int update_display(void)
connector_find_preferred_mode(connector->id,
  -1UL,
  specified_mode_num,
- connector);
+ connector, probe);
if (!connector->mode_valid)
continue;
 
@@ -765,7 +772,7 @@ int main(int argc, char **argv)
 
ret = 0;
 
-   if (!update_display()) {
+   if (!update_display(false)) {
ret = 1;
goto out_stdio;
}
diff --git a/tests/testdisplay.h b/tests/testdisplay.h
index 962e621..27f8209 100644
--- a/tests/testdisplay.h
+++ b/tests/testdisplay.h
@@ -32,4 +32,4 @@ gboolean testdisplay_setup_hotplug(void);
 void testdisplay_cleanup_hotplug(void);
 
 /* called by the hotplug code */
-int update_display(void);
+int update_display(bool probe);
diff --git a/tests/testdisplay_hotplug.c b/tests/testdisplay_hotplug.c
index 9d11399..3b900ca 100644
--- a/tests/testdisplay_hotplug.c
+++ b/tests/testdisplay_hotplug.c
@@ -59,7 +59,7 @@ static gboolean hotplug_event(GIOChannel *source, 
GIOCondition condition,
 
if (memcmp(&s.st_rdev, &udev_devnum, sizeof(dev_t)) == 0 &&
hotplug && atoi(hotplug) == 1)
-   update_display();
+   update_display(true);
 
udev_device_unref(dev);
 out:
-- 
1.9.1


Re: [Intel-gfx] [PATCH 5/7] drm/i915: Interrupt driven fences

2016-01-11 Thread Jesse Barnes

On 01/11/2016 11:10 AM, John Harrison wrote:
> On 08/01/2016 22:46, Chris Wilson wrote:
>> On Fri, Jan 08, 2016 at 06:47:26PM +, john.c.harri...@intel.com wrote:
>>> +void i915_gem_request_notify(struct intel_engine_cs *ring, bool 
>>> fence_locked)
>>> +{
>>> +struct drm_i915_gem_request *req, *req_next;
>>> +unsigned long flags;
>>>   u32 seqno;
>>>   -seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
>>> +if (list_empty(&ring->fence_signal_list))
>>> +return;
>>> +
>>> +if (!fence_locked)
>>> +spin_lock_irqsave(&ring->fence_lock, flags);
>>>   -return i915_seqno_passed(seqno, req->seqno);
>>> +seqno = ring->get_seqno(ring, false);
>> We really don't want to be doing the forcewake dance from inside the
>> interrupt handler. We made that mistake years ago.
>> -Chris
>>
> What forcewake dance? Nothing in the above code mentions force wake.

get_seqno() w/o lazy_coherency set will do a POSTING_READ of the ring active 
head, which goes through our crazy read function and does forcewake.  So we may 
need something smarter here.

Jesse


Re: [Intel-gfx] [PATCH 3/7] drm/i915: Add per context timelines to fence object

2016-01-11 Thread Jesse Barnes

On 01/11/2016 11:03 AM, John Harrison wrote:
> On 08/01/2016 22:05, Chris Wilson wrote:
>> On Fri, Jan 08, 2016 at 06:47:24PM +, john.c.harri...@intel.com wrote:
>>> From: John Harrison 
>>>
>>> The fence object used inside the request structure requires a sequence
>>> number. Although this is not used by the i915 driver itself, it could
>>> potentially be used by non-i915 code if the fence is passed outside of
>>> the driver. This is the intention as it allows external kernel drivers
>>> and user applications to wait on batch buffer completion
>>> asynchronously via the dma-buf fence API.
>> That doesn't make any sense as they are not limited by a single
>> timeline.
> I don't understand what you mean. Who is not limited by a single timeline?  
> The point is that the current seqno values cannot be used as there is no 
> guarantee that they will increment globally once things like a scheduler and 
> pre-emption arrive. Whereas, the fence internal implementation makes various 
> assumptions about the linearity of the timeline. External users do not want 
> to care about timelines or seqnos at all, they just want the fence API to 
> work as documented.
> 
>>
>>> To ensure that such external users are not confused by strange things
>>> happening with the seqno, this patch adds in a per context timeline
>>> that can provide a guaranteed in-order seqno value for the fence. This
>>> is safe because the scheduler will not re-order batch buffers within a
>>> context - they are considered to be mutually dependent.
>> You haven't added per-context breadcrumbs. What we need for being able
>> to execute requests from parallel timelines, but with requests within a
>> timeline being ordered, is a per-context page where we can emit the
>> per-context issued breadcrumb. Then instead of looking up the current
>> HW seqno in a global page, the request just looks at the current context
>> HW seqno in the context seq, just
>> i915_seqno_passed(*req->p_context_seqno, req->seqno).
> This patch is not attempting to implement per context seqno values. That can 
> be done as future work. This patch is doing the simplest, least invasive 
> implementation in order to make external fences work.

Right.  I think we want to move to per-context seqnos, but we don't have to do 
it before this work lands.  It should be easier to do it after the rest of 
these bits land in fact, since seqno handling will be well encapsulated aiui.
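
For reference, a per-context check along the lines Chris describes could look roughly like the sketch below.  This is a standalone userspace model, not a patch: ctx_timeline, fake_request, timeline_emit and request_completed are hypothetical names, and the only thing borrowed from the driver is the wraparound-safe signed-difference comparison that i915_seqno_passed() uses.

```c
/* Userspace sketch; struct and function names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "has seq1 reached seq2?" test. */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* One of these per context (per engine) instead of one global page. */
struct ctx_timeline {
	uint32_t last_issued; /* last seqno handed out on this timeline */
	uint32_t hw_seqno;    /* last breadcrumb "written by the hardware" */
};

struct fake_request {
	const uint32_t *p_context_seqno; /* points at the timeline's breadcrumb */
	uint32_t seqno;
};

/* Requests within a context are ordered, so a simple increment is enough. */
static struct fake_request timeline_emit(struct ctx_timeline *tl)
{
	struct fake_request req;

	assert(tl);
	req.p_context_seqno = &tl->hw_seqno;
	req.seqno = ++tl->last_issued;
	return req;
}

/* Completion only looks at the request's own timeline. */
static bool request_completed(const struct fake_request *req)
{
	return seqno_passed(*req->p_context_seqno, req->seqno);
}
```

Two contexts can then complete in any order relative to each other while each one still sees its own seqnos increase monotonically, including across the 32-bit wrap.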

Jesse


Re: [Intel-gfx] [PATCH 2/7] drm/i915: Removed now redudant parameter to i915_gem_request_completed()

2016-01-11 Thread Jesse Barnes

request_retire(request);
> @@ -2924,7 +2924,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs 
> *ring)
>   }
>  
>   if (unlikely(ring->trace_irq_req &&
> -  i915_gem_request_completed(ring->trace_irq_req, true))) {
> +  i915_gem_request_completed(ring->trace_irq_req))) {
>   ring->irq_put(ring);
>   i915_gem_request_assign(&ring->trace_irq_req, NULL);
>   }
> @@ -3030,7 +3030,7 @@ i915_gem_object_flush_active(struct drm_i915_gem_object 
> *obj)
>   if (list_empty(&req->list))
>   goto retire;
>  
> - if (i915_gem_request_completed(req, true)) {
> + if (i915_gem_request_completed(req)) {
>   __i915_gem_request_retire__upto(req);
>  retire:
>   i915_gem_object_retire__read(obj, i);
> @@ -3142,7 +3142,7 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>   if (to == from)
>   return 0;
>  
> - if (i915_gem_request_completed(from_req, true))
> + if (i915_gem_request_completed(from_req))
>   return 0;
>  
>   if (!i915_semaphore_is_enabled(obj->base.dev)) {
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index a5dd528..510365e 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11313,7 +11313,7 @@ static bool __intel_pageflip_stall_check(struct 
> drm_device *dev,
>  
>   if (work->flip_ready_vblank == 0) {
>   if (work->flip_queued_req &&
> - !i915_gem_request_completed(work->flip_queued_req, true))
> + !i915_gem_request_completed(work->flip_queued_req))
>   return false;
>  
>   work->flip_ready_vblank = drm_crtc_vblank_count(crtc);
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index ebd6735..c207a3a 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -7170,7 +7170,7 @@ static void __intel_rps_boost_work(struct work_struct 
> *work)
>   struct request_boost *boost = container_of(work, struct request_boost, 
> work);
>   struct drm_i915_gem_request *req = boost->req;
>  
> - if (!i915_gem_request_completed(req, true))
> + if (!i915_gem_request_completed(req))
>   gen6_rps_boost(to_i915(req->ring->dev), NULL,
>      req->emitted_jiffies);
>  
> @@ -7186,7 +7186,7 @@ void intel_queue_rps_boost_for_request(struct 
> drm_device *dev,
>   if (req == NULL || INTEL_INFO(dev)->gen < 6)
>   return;
>  
> - if (i915_gem_request_completed(req, true))
> + if (i915_gem_request_completed(req))
>   return;
>  
>   boost = kmalloc(sizeof(*boost), GFP_ATOMIC);
> 

I'm sure we'll have optimizations on top once this whole thing lands, so this 
seems fine as an intermediate step (we'll want to do lots of benchmarking and 
analysis after the interrupt driven stuff lands anyway).

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 1/7] drm/i915: Convert requests to use struct fence

2016-01-11 Thread Jesse Barnes

On 01/11/2016 11:03 AM, John Harrison wrote:
> On 08/01/2016 21:59, Chris Wilson wrote:
>> On Fri, Jan 08, 2016 at 06:47:22PM +, john.c.harri...@intel.com wrote:
>>> From: John Harrison 
>>>
>>> There is a construct in the Linux kernel called 'struct fence' that is
>>> intended to keep track of work that is executed on hardware. I.e. it
>>> solves the basic problem that the driver's 'struct
>>> drm_i915_gem_request' is trying to address. The request structure does
>>> quite a lot more than simply track the execution progress so is very
>>> definitely still required. However, the basic completion status side
>>> could be updated to use the ready made fence implementation and gain
>>> all the advantages that provides.
>>>
>>> This patch makes the first step of integrating a struct fence into the
>>> request. It replaces the explicit reference count with that of the
>>> fence. It also replaces the 'is completed' test with the fence's
>>> equivalent. Currently, that simply chains on to the original request
>>> implementation. A future patch will improve this.
>> But this forces everyone to do the heavyweight polling until the request
>> is completed?
> Not sure what you mean by heavy weight polling. And as described, this is 
> only an intermediate step.

Just the lazy_coherency removal maybe?  Chris?

Jesse

Re: [Intel-gfx] [PATCH 036/190] drm/i915: Restore waitboost credit to the synchronous waiter

2016-01-11 Thread Jesse Barnes

On 01/11/2016 01:16 AM, Chris Wilson wrote:
> Ideally, we want to automagically have the GPU respond to the
> instantaneous load by reclocking itself. However, reclocking occurs
> relatively slowly, and to the client waiting for a result from the GPU,
> too late. To compensate and reduce the client latency, we allow the
> first wait from a client to boost the GPU clocks to maximum. This
> overcomes the lag in autoreclocking, at the expense of forcing the GPU
> clocks too high. So to offset the excessive power usage, we currently
> allow a client to only boost the clocks once before we detect the GPU
> is idle again. This works reasonably for say the first frame in a
> benchmark, but for many more synchronous workloads (like OpenCL) we find
> the GPU clocks remain too low. By noting a wait which would idle the GPU
> (i.e. we just waited upon the last known request), we can give that
> client the idle boost credit (for their next wait) without the 100ms
> delay required for us to detect the GPU idle state. The intention is to
> boost clients that are stalling in the process of feeding the GPU more
> work (and who in doing so let the GPU idle), without granting boost
> credits to clients that are throttling themselves (such as compositors).
> 
> Signed-off-by: Chris Wilson 
> Cc: "Zou, Nanhai" 
> Cc: Jesse Barnes 
> Reviewed-by: Jesse Barnes 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index e9f5ca7ea835..3fea582768e9 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1314,6 +1314,22 @@ complete:
>   *timeout = 0;
>   }
>  
> + if (ret == 0 && rps && req->seqno == req->ring->last_submitted_seqno) {
> + /* The GPU is now idle and this client has stalled.
> +  * Since no other client has submitted a request in the
> +  * meantime, assume that this client is the only one
> +  * supplying work to the GPU but is unable to keep that
> +  * work supplied because it is waiting. Since the GPU is
> +  * then never kept fully busy, RPS autoclocking will
> +  * keep the clocks relatively low, causing further delays.
> +  * Compensate by giving the synchronous client credit for
> +  * a waitboost next time.
> +  */
> + spin_lock(&req->i915->rps.client_lock);
> + list_del_init(&rps->link);
> + spin_unlock(&req->i915->rps.client_lock);
> + }
> +
>   return ret;
>  }
>  
> 

Assuming this works for the OCL guys, it seems ok.  Doing the
list_del_init(&rps->link) is a bit of an obfuscated way of doing it, but
I guess the comment makes it pretty clear.
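
Spelled out, the trick is that list membership doubles as the "credit already spent" flag.  The sketch below is a userspace model under that assumption, with a hand-rolled list instead of <linux/list.h>; fake_client, try_waitboost and restore_boost_credit are invented names, and the client_lock is left out entirely.

```c
/* Userspace model; helper and struct names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_node(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and leave the node pointing at itself, i.e. "empty" again. */
static void list_del_init_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct fake_client {
	struct list_node link; /* on the boosted list == credit spent */
	unsigned int boosts;
};

/* A client may boost only while it is not on the boosted list. */
static bool try_waitboost(struct fake_client *c, struct list_node *boosted)
{
	assert(c && boosted);
	if (!list_is_empty(&c->link))
		return false;
	list_add_node(&c->link, boosted);
	c->boosts++;
	return true;
}

/* Dropping the client off the list hands its credit back. */
static void restore_boost_credit(struct fake_client *c)
{
	list_del_init_node(&c->link);
}
```

A named helper like restore_boost_credit() around the list_del_init() would make the intent visible at the call site without relying on the comment.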

Jesse

Re: [Intel-gfx] [PATCH 05/13] drm/i915: Convert requests to use struct fence

2016-01-08 Thread Jesse Barnes

On 01/08/2016 01:47 PM, Chris Wilson wrote:
> On Mon, Jan 04, 2016 at 01:16:54PM -0800, Jesse Barnes wrote:
>> On 01/04/2016 12:57 PM, Chris Wilson wrote:
>>> On Mon, Jan 04, 2016 at 09:20:44AM -0800, Jesse Barnes wrote:
>>>> So this one has my ack.
>>>
>>> This series makes a number of fundamental mistakes in seqno-interrupt
>>> handling, so no.
>>
>> Well unless you can enumerate the issues in enough detail for us to address 
>> them, we don't have much choice but to go ahead.  I know you've replied to a 
>> few of these threads in the past, but I don't see a current list of 
>> outstanding bugs aside from the one about modifying input params on the 
>> execbuf error path (though the code comment seems to indicate some care is 
>> being taken there at least, so should be a small fix).
> 
> Other than the series addressing the reported bugs which this is direct
> conflict with?

Which patchset came first?  And yes, clearly enumerating the issues is
helpful regardless.  It doesn't really matter which came first though,
we've agreed to move forward with John's version since the scheduler has
been outstanding for so long, so your bug fixes will have to be rebased
on top of this work.  I hope that's acceptable, since I think we all
have the same ultimate goal here...

Jesse

Re: [Intel-gfx] drm/i915: Agressive downclocking on Baytrail

2016-01-06 Thread Jesse Barnes

On 01/06/2016 11:15 AM, Janne Heikkinen wrote:
> I've got Bay Trail based Asus X553MA and I've been experiencing daily hangs
> with kernels beginning from 4.2-rc1. I haven't had any problems with 4.1.x
> kernels and using 4.1.13 I've gotten constant 5+ day uptimes since November
> (I had to at least suspend it once per week for traveling but during Christmas
> longest uptime was 11 days).
> 
> Now I did bisection beginning with marking 4.2-rc1 as bad and 4.1.0 as good
> and found out that it was this commit:
> 
> [8fb55197e64d5988ec57b54e973daeea72c3f2ff]
> drm/i915: Agressive downclocking on Baytrail
> 
> causing the hangs.
> 
> 4.4-rc8 hung in less than an hour. After I reverted the patch from it and
> commented out lines containing related fields from i915_debugfs.c
> I now seem to have stable 4.4-rc8.

Cc'ing Deepak and Chris.

Jesse


Re: [Intel-gfx] [PATCH v2, 2/4] drm/i915: simplify testing for the global default context

2016-01-04 Thread Jesse Barnes

On 01/04/2016 11:39 AM, Chris Wilson wrote:
> On Mon, Jan 04, 2016 at 05:43:10PM +, Dave Gordon wrote:
>> On 23/12/15 21:02, Chris Wilson wrote:
>>> On Wed, Dec 23, 2015 at 07:33:53PM +, Dave Gordon wrote:
>>>> There are quite a number of places where the driver tests whether
>>>> a given context is or is not the global default context, usually by
>>>> checking whether an engine's default_pointer points to the context.
>>>> Now that we have a 'is_global_default' flag in the context itself,
>>>> all these tests can be rewritten to use it. This makes the
>>>> logic more obvious, and usually saves at least one memory reference.
>>>> In addition, with these uses eliminated, a future patch will be able
>>>> to get rid of engine::default_context entirely.
>>>
>>> All the execlists use of ctx != ring->default_context stems from a
>>> misstep in execlists - if you stop treating that default_context as
>>> special during request processing and just take the pin/unpin at
>>> init/fini of the ring, they all disappear.
>>
>> We do already pin/unpin the default context at creation/deletion;
>> AFAICS the extra tests are probably an attempt not to do an extra
>> pin/unpin on an object which is by definition already pinned. And
>> I'd be quite happy to get rid of those tests, and just issue a pin
>> for *every* request issues on a context -- indeed, I think Nick may
>> have just such a patch. But his changes are blocked on getting the
>> elimination of ring->default_context (patch 4 of THIS set) merged
>> first, since having those backpointers dictates the order of
>> creation and destruction.
> 
> This series is NAKed.

Why?  Because you want things in a different order?  Or do you object to 
something in Dave's reply?

Jesse


Re: [Intel-gfx] [PATCH 21/32] drm/i915: Broadwell execlists needs exactly the same seqno w/a as legacy

2016-01-04 Thread Jesse Barnes

On 12/11/2015 03:33 AM, Chris Wilson wrote:
> +  * Note that this effectively stalls the read by the time
> +  * it takes to do a memory transaction, which more or less ensures
> +  * that the write from the GPU has sufficient time to invalidate
> +  * the CPU cacheline. Alternatively we could delay the interrupt from
> +  * the CS ring to give the write time to land, but that would incur
> +  * a delay after every batch i.e. much more frequent than a delay
> +  * when waiting for the interrupt (with the same net latency).
>*/
> + struct drm_i915_private *dev_priv = ring->i915;
> + POSTING_READ_FW(RING_ACTHD(ring->mmio_base));
> +
>   intel_flush_status_page(ring, I915_GEM_HWS_INDEX);

Funnily enough, the interrupt ought to provide the same behavior as the MMIO 
read, i.e. flush outstanding system memory writes ahead of it.  The fact that 
we need it *plus* a CPU cache flush definitely means we're still missing 
something...

But hey, whatever works is good for now...

Jesse

Re: [Intel-gfx] [PATCH 05/13] drm/i915: Convert requests to use struct fence

2016-01-04 Thread Jesse Barnes

On 01/04/2016 12:57 PM, Chris Wilson wrote:
> On Mon, Jan 04, 2016 at 09:20:44AM -0800, Jesse Barnes wrote:
>> So this one has my ack.
> 
> This series makes a number of fundamental mistakes in seqno-interrupt
> handling, so no.

Well unless you can enumerate the issues in enough detail for us to address 
them, we don't have much choice but to go ahead.  I know you've replied to a 
few of these threads in the past, but I don't see a current list of outstanding 
bugs aside from the one about modifying input params on the execbuf error path 
(though the code comment seems to indicate some care is being taken there at 
least, so should be a small fix).

Jesse

Re: [Intel-gfx] [PATCH 05/13] drm/i915: Convert requests to use struct fence

2016-01-04 Thread Jesse Barnes

On 12/17/2015 09:43 AM, Jesse Barnes wrote:
> On 12/11/2015 05:11 AM, john.c.harri...@intel.com wrote:
>> From: John Harrison 
>>
>> There is a construct in the Linux kernel called 'struct fence' that is
>> intended to keep track of work that is executed on hardware. I.e. it
>> solves the basic problem that the driver's 'struct
>> drm_i915_gem_request' is trying to address. The request structure does
>> quite a lot more than simply track the execution progress so is very
>> definitely still required. However, the basic completion status side
>> could be updated to use the ready made fence implementation and gain
>> all the advantages that provides.
>>
>> This patch makes the first step of integrating a struct fence into the
>> request. It replaces the explicit reference count with that of the
>> fence. It also replaces the 'is completed' test with the fence's
>> equivalent. Currently, that simply chains on to the original request
>> implementation. A future patch will improve this.
>>
>> v3: Updated after review comments by Tvrtko Ursulin. Added fence
>> context/seqno pair to the debugfs request info. Renamed fence 'driver
>> name' to just 'i915'. Removed BUG_ONs.
>>
>> For: VIZ-5190
>> Signed-off-by: John Harrison 
>> Cc: Tvrtko Ursulin 
>> ---
>>  drivers/gpu/drm/i915/i915_debugfs.c |  5 +--
>>  drivers/gpu/drm/i915/i915_drv.h | 45 +-
>>  drivers/gpu/drm/i915/i915_gem.c | 56 
>> ++---
>>  drivers/gpu/drm/i915/intel_lrc.c|  1 +
>>  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
>>  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
>>  6 files changed, 81 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 7415606..5b31186 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -709,11 +709,12 @@ static int i915_gem_request_info(struct seq_file *m, 
>> void *data)
>>  task = NULL;
>>  if (req->pid)
>>  task = pid_task(req->pid, PIDTYPE_PID);
>> -seq_printf(m, "%x @ %d: %s [%d]\n",
>> +seq_printf(m, "%x @ %d: %s [%d], fence = %u.%u\n",
>> req->seqno,
>> (int) (jiffies - req->emitted_jiffies),
>> task ? task->comm : "",
>> -   task ? task->pid : -1);
>> +   task ? task->pid : -1,
>> +   req->fence.context, req->fence.seqno);
>>  rcu_read_unlock();
>>  }
>>  
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 436149e..aa5cba7 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -51,6 +51,7 @@
>>  #include 
>>  #include 
>>  #include "intel_guc.h"
>> +#include 
>>  
>>  /* General customization:
>>   */
>> @@ -2174,7 +2175,17 @@ void i915_gem_track_fb(struct drm_i915_gem_object 
>> *old,
>>   * initial reference taken using kref_init
>>   */
>>  struct drm_i915_gem_request {
>> -struct kref ref;
>> +/**
>> + * Underlying object for implementing the signal/wait stuff.
>> + * NB: Never call fence_later() or return this fence object to user
>> + * land! Due to lazy allocation, scheduler re-ordering, pre-emption,
>> + * etc., there is no guarantee at all about the validity or
>> + * sequentiality of the fence's seqno! It is also unsafe to let
>> + * anything outside of the i915 driver get hold of the fence object
>> + * as the clean up when decrementing the reference count requires
>> + * holding the driver mutex lock.
>> + */
>> +struct fence fence;
>>  
>>  /** On Which ring this request was generated */
>>  struct drm_i915_private *i915;
>> @@ -2251,7 +2262,13 @@ int i915_gem_request_alloc(struct intel_engine_cs 
>> *ring,
>> struct intel_context *ctx,
>> struct drm_i915_gem_request **req_out);
>>  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>> -void i915_gem_request_fre

Re: [Intel-gfx] [PATCH 05/13] drm/i915: Convert requests to use struct fence

2015-12-17 Thread Jesse Barnes

On 12/11/2015 05:11 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> There is a construct in the Linux kernel called 'struct fence' that is
> intended to keep track of work that is executed on hardware. I.e. it
> solves the basic problem that the driver's 'struct
> drm_i915_gem_request' is trying to address. The request structure does
> quite a lot more than simply track the execution progress so is very
> definitely still required. However, the basic completion status side
> could be updated to use the ready made fence implementation and gain
> all the advantages that provides.
> 
> This patch makes the first step of integrating a struct fence into the
> request. It replaces the explicit reference count with that of the
> fence. It also replaces the 'is completed' test with the fence's
> equivalent. Currently, that simply chains on to the original request
> implementation. A future patch will improve this.
> 
> v3: Updated after review comments by Tvrtko Ursulin. Added fence
> context/seqno pair to the debugfs request info. Renamed fence 'driver
> name' to just 'i915'. Removed BUG_ONs.
> 
> For: VIZ-5190
> Signed-off-by: John Harrison 
> Cc: Tvrtko Ursulin 
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c |  5 +--
>  drivers/gpu/drm/i915/i915_drv.h | 45 +-
>  drivers/gpu/drm/i915/i915_gem.c | 56 
> ++---
>  drivers/gpu/drm/i915/intel_lrc.c|  1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
>  6 files changed, 81 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 7415606..5b31186 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -709,11 +709,12 @@ static int i915_gem_request_info(struct seq_file *m, 
> void *data)
>   task = NULL;
>   if (req->pid)
>   task = pid_task(req->pid, PIDTYPE_PID);
> - seq_printf(m, "%x @ %d: %s [%d]\n",
> + seq_printf(m, "%x @ %d: %s [%d], fence = %u.%u\n",
>  req->seqno,
>  (int) (jiffies - req->emitted_jiffies),
>  task ? task->comm : "",
> -task ? task->pid : -1);
> +task ? task->pid : -1,
> +req->fence.context, req->fence.seqno);
>   rcu_read_unlock();
>   }
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 436149e..aa5cba7 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -51,6 +51,7 @@
>  #include 
>  #include 
>  #include "intel_guc.h"
> +#include 
>  
>  /* General customization:
>   */
> @@ -2174,7 +2175,17 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
>   * initial reference taken using kref_init
>   */
>  struct drm_i915_gem_request {
> - struct kref ref;
> + /**
> +  * Underlying object for implementing the signal/wait stuff.
> +  * NB: Never call fence_later() or return this fence object to user
> +  * land! Due to lazy allocation, scheduler re-ordering, pre-emption,
> +  * etc., there is no guarantee at all about the validity or
> +  * sequentiality of the fence's seqno! It is also unsafe to let
> +  * anything outside of the i915 driver get hold of the fence object
> +  * as the clean up when decrementing the reference count requires
> +  * holding the driver mutex lock.
> +  */
> + struct fence fence;
>  
>   /** On Which ring this request was generated */
>   struct drm_i915_private *i915;
> @@ -2251,7 +2262,13 @@ int i915_gem_request_alloc(struct intel_engine_cs 
> *ring,
>  struct intel_context *ctx,
>  struct drm_i915_gem_request **req_out);
>  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
> -void i915_gem_request_free(struct kref *req_ref);
> +
> +static inline bool i915_gem_request_completed(struct drm_i915_gem_request 
> *req,
> +   bool lazy_coherency)
> +{
> + return fence_is_signaled(&req->fence);
> +}
> +
>  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
>  struct drm_file *file);
>  
> @@ -2271,7 +2288,7 @@ static inline struct drm_i915_gem_request *
>  i915_gem_request_reference(struct drm_i915_gem_request *req)
>  {
>   if (req)
> - kref_get(&req->ref);
> + fence_get(&req->fence);
>   return req;
>  }
>  
> @@ -2279,7 +2296,7 @@ static inline void
>  i915_gem_request_unreference(struct drm_i915_gem_request *req)
>  {
>   WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
> - kref_put

Re: [Intel-gfx] [PATCH 07/13] drm/i915: Add per context timelines to fence object

2015-12-17 Thread Jesse Barnes

On 12/11/2015 05:11 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> The fence object used inside the request structure requires a sequence
> number. Although this is not used by the i915 driver itself, it could
> potentially be used by non-i915 code if the fence is passed outside of
> the driver. This is the intention as it allows external kernel drivers
> and user applications to wait on batch buffer completion
> asynchronously via the dma-buf fence API.
> 
> To ensure that such external users are not confused by strange things
> happening with the seqno, this patch adds in a per context timeline
> that can provide a guaranteed in-order seqno value for the fence. This
> is safe because the scheduler will not re-order batch buffers within a
> context - they are considered to be mutually dependent.
> 
> v2: New patch in series.
> 
> v3: Renamed/retyped timeline structure fields after review comments by
> Tvrtko Ursulin.
> 
> Added context information to the timeline's name string for better
> identification in debugfs output.
> 
> For: VIZ-5190
> Signed-off-by: John Harrison 
> Cc: Tvrtko Ursulin 
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 25 ---
>  drivers/gpu/drm/i915/i915_gem.c | 80 
> +
>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++-
>  drivers/gpu/drm/i915/intel_lrc.c|  8 
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 -
>  5 files changed, 111 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index caf7897..7d6a7c0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -841,6 +841,15 @@ struct i915_ctx_hang_stats {
>   bool banned;
>  };
>  
> +struct i915_fence_timeline {
> + char name[32];
> + unsigned fence_context;
> + unsigned next;
> +
> + struct intel_context *ctx;
> + struct intel_engine_cs *ring;
> +};
> +
>  /* This must match up with the value previously used for execbuf2.rsvd1. */
>  #define DEFAULT_CONTEXT_HANDLE 0
>  
> @@ -885,6 +894,7 @@ struct intel_context {
>   struct drm_i915_gem_object *state;
>   struct intel_ringbuffer *ringbuf;
>   int pin_count;
> + struct i915_fence_timeline fence_timeline;
>   } engine[I915_NUM_RINGS];
>  
>   struct list_head link;
> @@ -2177,13 +2187,10 @@ void i915_gem_track_fb(struct drm_i915_gem_object 
> *old,
>  struct drm_i915_gem_request {
>   /**
>* Underlying object for implementing the signal/wait stuff.
> -  * NB: Never call fence_later() or return this fence object to user
> -  * land! Due to lazy allocation, scheduler re-ordering, pre-emption,
> -  * etc., there is no guarantee at all about the validity or
> -  * sequentiality of the fence's seqno! It is also unsafe to let
> -  * anything outside of the i915 driver get hold of the fence object
> -  * as the clean up when decrementing the reference count requires
> -  * holding the driver mutex lock.
> +  * NB: Never return this fence object to user land! It is unsafe to
> +  * let anything outside of the i915 driver get hold of the fence
> +  * object as the clean up when decrementing the reference count
> +  * requires holding the driver mutex lock.
>*/
>   struct fence fence;
>  
> @@ -2263,6 +2270,10 @@ int i915_gem_request_alloc(struct intel_engine_cs 
> *ring,
>  struct drm_i915_gem_request **req_out);
>  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>  
> +int i915_create_fence_timeline(struct drm_device *dev,
> +struct intel_context *ctx,
> +struct intel_engine_cs *ring);
> +
>  static inline bool i915_gem_request_completed(struct drm_i915_gem_request 
> *req)
>  {
>   return fence_is_signaled(&req->fence);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0801738..7a37fb7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2665,9 +2665,32 @@ static const char 
> *i915_gem_request_get_driver_name(struct fence *req_fence)
>  
>  static const char *i915_gem_request_get_timeline_name(struct fence 
> *req_fence)
>  {
> - struct drm_i915_gem_request *req = container_of(req_fence,
> -  typeof(*req), fence);
> - return req->ring->name;
> + struct drm_i915_gem_request *req;
> + struct i915_fence_timeline *timeline;
> +
> + req = container_of(req_fence, typeof(*req), fence);
> + timeline = &req->ctx->engine[req->ring->id].fence_timeline;
> +
> + return timeline->name;
> +}
> +
> +static void i915_gem_request_timeline_value_str(struct fence *req_fence, 
> char *str, int size)
> +{
> + struct drm_i915_gem_request *req;
> +
> + req = container_of(req_fence, typeof(*req), fence);
> +
> +

Re: [Intel-gfx] [PATCH 04/13] android/sync: Improved debug dump to dmesg

2015-12-17 Thread Jesse Barnes

 (i = 0; i < s.count; i += DUMP_CHUNK) {
> - if ((s.count - i) > DUMP_CHUNK) {
> - char c = s.buf[i + DUMP_CHUNK];
> + sync_dump_dfs(&s, targetPtr);
> +}
>  
> - s.buf[i + DUMP_CHUNK] = 0;
> - pr_cont("%s", s.buf + i);
> - s.buf[i + DUMP_CHUNK] = c;
> - } else {
> -     s.buf[s.count] = 0;
> - pr_cont("%s", s.buf + i);
> - }
> - }
> +void sync_dump_timeline(struct sync_timeline *timeline)
> +{
> + struct seq_file s = {
> + .buf = sync_dump_buf,
> + .size = sizeof(sync_dump_buf) - 1,
> + };
> +
> + pr_info("timeline: %p\n", timeline);
> + sync_print_obj(&s, timeline);
> +
> + sync_dump_dfs(&s, NULL);
>  }
>  
>  #endif
> 

I guess the Android guys might have feedback here, but it seems fine to me.

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 03/13] staging/android/sync: Move sync framework out of staging

2015-12-17 Thread Jesse Barnes

On 12/11/2015 05:11 AM, john.c.harri...@intel.com wrote:
> From: John Harrison 
> 
> The sync framework is now used by the i915 driver. Therefore it can be
> moved out of staging and into the regular tree. Also, the public
> interfaces can actually be made public and exported.
> 
> v3: New patch for series.
> 
> Signed-off-by: John Harrison 
> Signed-off-by: Geoff Miller 
> ---
>  drivers/android/Kconfig|  28 ++
>  drivers/android/Makefile   |   2 +
>  drivers/android/sw_sync.c  | 260 
>  drivers/android/sw_sync.h  |  59 +++
>  drivers/android/sync.c | 734 
> +
>  drivers/android/sync.h | 366 
>  drivers/android/sync_debug.c   | 256 
>  drivers/android/trace/sync.h   |  82 
>  drivers/staging/android/Kconfig|  28 --
>  drivers/staging/android/Makefile   |   2 -
>  drivers/staging/android/sw_sync.c  | 260 
>  drivers/staging/android/sw_sync.h  |  59 ---
>  drivers/staging/android/sync.c | 734 
> -
>  drivers/staging/android/sync.h | 366 
>  drivers/staging/android/sync_debug.c   | 256 
>  drivers/staging/android/trace/sync.h   |  82 
>  drivers/staging/android/uapi/sw_sync.h |  32 --
>  drivers/staging/android/uapi/sync.h|  97 -
>  include/uapi/Kbuild|   1 +
>  include/uapi/sync/Kbuild   |   3 +
>  include/uapi/sync/sw_sync.h|  32 ++
>  include/uapi/sync/sync.h   |  97 +
>  22 files changed, 1920 insertions(+), 1916 deletions(-)
>  create mode 100644 drivers/android/sw_sync.c
>  create mode 100644 drivers/android/sw_sync.h
>  create mode 100644 drivers/android/sync.c
>  create mode 100644 drivers/android/sync.h
>  create mode 100644 drivers/android/sync_debug.c
>  create mode 100644 drivers/android/trace/sync.h
>  delete mode 100644 drivers/staging/android/sw_sync.c
>  delete mode 100644 drivers/staging/android/sw_sync.h
>  delete mode 100644 drivers/staging/android/sync.c
>  delete mode 100644 drivers/staging/android/sync.h
>  delete mode 100644 drivers/staging/android/sync_debug.c
>  delete mode 100644 drivers/staging/android/trace/sync.h
>  delete mode 100644 drivers/staging/android/uapi/sw_sync.h
>  delete mode 100644 drivers/staging/android/uapi/sync.h
>  create mode 100644 include/uapi/sync/Kbuild
>  create mode 100644 include/uapi/sync/sw_sync.h
>  create mode 100644 include/uapi/sync/sync.h
> 
> diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
> index bdfc6c6..9edcd8f 100644
> --- a/drivers/android/Kconfig
> +++ b/drivers/android/Kconfig
> @@ -32,6 +32,34 @@ config ANDROID_BINDER_IPC_32BIT
>  
> Note that enabling this will break newer Android user-space.
>  
> +config SYNC
> + bool "Synchronization framework"
> + default n
> + select ANON_INODES
> + select DMA_SHARED_BUFFER
> + ---help---
> +   This option enables the framework for synchronization between multiple
> +   drivers.  Sync implementations can take advantage of hardware
> +   synchronization built into devices like GPUs.
> +
> +config SW_SYNC
> + bool "Software synchronization objects"
> + default n
> + depends on SYNC
> + ---help---
> +   A sync object driver that uses a 32bit counter to coordinate
> +   synchronization.  Useful when there is no hardware primitive backing
> +   the synchronization.
> +
> +config SW_SYNC_USER
> + bool "Userspace API for SW_SYNC"
> + default n
> + depends on SW_SYNC
> + ---help---
> +   Provides a user space API to the sw sync object.
> +   *WARNING* improper use of this can result in deadlocking kernel
> +   drivers from userspace.
> +
>  endif # if ANDROID

IIRC we wanted to drop the user ABI altogether?  I think we can de-stage this 
even before we push the new ABI on the i915 side to expose the sync points 
(since we'll need an open source userspace for that), and any changes/cleanups 
can happen outside of staging.

Thanks,
Jesse

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 01/13] staging/android/sync: Support sync points created from dma-fences

2015-12-17 Thread Jesse Barnes

On 12/11/2015 05:11 AM, john.c.harri...@intel.com wrote:
> From: Maarten Lankhorst 
> 
> Debug output assumes all sync points are built on top of Android sync points
> and when we start creating them from dma-fences will NULL ptr deref unless
> taught about this.
> 
> v4: Corrected patch ownership.
> 
> Signed-off-by: Maarten Lankhorst 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Maarten Lankhorst 
> Cc: de...@driverdev.osuosl.org
> Cc: Riley Andrews 
> Cc: Greg Kroah-Hartman 
> Cc: Arve Hjønnevåg 
> ---
>  drivers/staging/android/sync_debug.c | 42 +++-
>  1 file changed, 22 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/staging/android/sync_debug.c 
> b/drivers/staging/android/sync_debug.c
> index 91ed2c4..f45d13c 100644
> --- a/drivers/staging/android/sync_debug.c
> +++ b/drivers/staging/android/sync_debug.c
> @@ -82,36 +82,42 @@ static const char *sync_status_str(int status)
>   return "error";
>  }
>  
> -static void sync_print_pt(struct seq_file *s, struct sync_pt *pt, bool fence)
> +static void sync_print_pt(struct seq_file *s, struct fence *pt, bool fence)
>  {
>   int status = 1;
> - struct sync_timeline *parent = sync_pt_parent(pt);
>  
> - if (fence_is_signaled_locked(&pt->base))
> - status = pt->base.status;
> + if (fence_is_signaled_locked(pt))
> + status = pt->status;
>  
>   seq_printf(s, "  %s%spt %s",
> -fence ? parent->name : "",
> +fence && pt->ops->get_timeline_name ?
> +pt->ops->get_timeline_name(pt) : "",
>  fence ? "_" : "",
>  sync_status_str(status));
>  
>   if (status <= 0) {
>   struct timespec64 ts64 =
> - ktime_to_timespec64(pt->base.timestamp);
> + ktime_to_timespec64(pt->timestamp);
>  
>   seq_printf(s, "@%lld.%09ld", (s64)ts64.tv_sec, ts64.tv_nsec);
>   }
>  
> - if (parent->ops->timeline_value_str &&
> - parent->ops->pt_value_str) {
> + if ((!fence || pt->ops->timeline_value_str) &&
> + pt->ops->fence_value_str) {
>   char value[64];
> + bool success;
>  
> - parent->ops->pt_value_str(pt, value, sizeof(value));
> - seq_printf(s, ": %s", value);
> - if (fence) {
> - parent->ops->timeline_value_str(parent, value,
> - sizeof(value));
> - seq_printf(s, " / %s", value);
> + pt->ops->fence_value_str(pt, value, sizeof(value));
> + success = strlen(value);
> +
> + if (success)
> + seq_printf(s, ": %s", value);
> +
> + if (success && fence) {
> + pt->ops->timeline_value_str(pt, value, sizeof(value));
> +
> + if (strlen(value))
> + seq_printf(s, " / %s", value);
>   }
>   }
>  
> @@ -138,7 +144,7 @@ static void sync_print_obj(struct seq_file *s, struct 
> sync_timeline *obj)
>   list_for_each(pos, &obj->child_list_head) {
>   struct sync_pt *pt =
>   container_of(pos, struct sync_pt, child_list);
> - sync_print_pt(s, pt, false);
> + sync_print_pt(s, &pt->base, false);
>   }
>   spin_unlock_irqrestore(&obj->child_list_lock, flags);
>  }
> @@ -153,11 +159,7 @@ static void sync_print_fence(struct seq_file *s, struct 
> sync_fence *fence)
>  sync_status_str(atomic_read(&fence->status)));
>  
>   for (i = 0; i < fence->num_fences; ++i) {
> - struct sync_pt *pt =
> - container_of(fence->cbs[i].sync_pt,
> -  struct sync_pt, base);
> -
> - sync_print_pt(s, pt, true);
> + sync_print_pt(s, fence->cbs[i].sync_pt, true);
>   }
>  
>   spin_lock_irqsave(&fence->wq.lock, flags);
> 

Reviewed-by: Jesse Barnes 
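
For anyone reading the archive later, the defensive pattern the patch depends on can be sketched in plain C: each optional callback in an ops table is checked before it is called, and an empty value string means there is nothing to print. This is only a user-space sketch under stated assumptions; `demo_fence`, `demo_fence_ops` and `demo_format_pt` are hypothetical names, not the kernel's types or functions.

```c
#include <stdio.h>
#include <string.h>

struct demo_fence;

/* Hypothetical ops table; any callback may be NULL. */
struct demo_fence_ops {
	const char *(*get_timeline_name)(struct demo_fence *f);
	void (*fence_value_str)(struct demo_fence *f, char *buf, int size);
};

struct demo_fence {
	const struct demo_fence_ops *ops;
	unsigned int value;
};

/* Format "<timeline>_pt[: value]" into out, skipping absent callbacks. */
static int demo_format_pt(struct demo_fence *f, char *out, size_t size)
{
	char value[64] = "";
	const char *name = "";

	if (f->ops->get_timeline_name)
		name = f->ops->get_timeline_name(f);
	if (f->ops->fence_value_str)
		f->ops->fence_value_str(f, value, sizeof(value));

	/* An empty string means the fence had no value to report. */
	if (strlen(value))
		return snprintf(out, size, "%s_pt: %s", name, value);
	return snprintf(out, size, "%s_pt", name);
}

/* Hypothetical callbacks for a fence that implements everything. */
static const char *demo_name(struct demo_fence *f)
{
	(void)f;
	return "gpu";
}

static void demo_value(struct demo_fence *f, char *buf, int size)
{
	snprintf(buf, size, "%u", f->value);
}

static const struct demo_fence_ops demo_full_ops = { demo_name, demo_value };
static const struct demo_fence_ops demo_bare_ops = { NULL, NULL };
```

A fence using `demo_bare_ops` still formats without dereferencing a NULL callback, which is the failure the debug code hit when it assumed every point was an Android sync point.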

Re: [Intel-gfx] [PATCH 02/13] staging/android/sync: add sync_fence_create_dma

2015-12-17 Thread Jesse Barnes

On 12/11/2015 05:11 AM, john.c.harri...@intel.com wrote:
> From: Maarten Lankhorst 
> 
> This allows users of dma fences to create an android fence.
> 
> v2: Added kerneldoc. (Tvrtko Ursulin).
> 
> v4: Updated comments from review feedback by Maarten.
> 
> Signed-off-by: Maarten Lankhorst 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Maarten Lankhorst 
> Cc: Daniel Vetter 
> Cc: Jesse Barnes 
> Cc: de...@driverdev.osuosl.org
> Cc: Riley Andrews 
> Cc: Greg Kroah-Hartman 
> Cc: Arve Hjønnevåg 
> ---
>  drivers/staging/android/sync.c | 13 +
>  drivers/staging/android/sync.h | 10 ++
>  2 files changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
> index f83e00c..7f0e919 100644
> --- a/drivers/staging/android/sync.c
> +++ b/drivers/staging/android/sync.c
> @@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f, struct 
> fence_cb *cb)
>  }
>  
>  /* TODO: implement a create which takes more that one sync_pt */
> -struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
> +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt)
>  {
>   struct sync_fence *fence;
>  
> @@ -199,16 +199,21 @@ struct sync_fence *sync_fence_create(const char *name, 
> struct sync_pt *pt)
>   fence->num_fences = 1;
>   atomic_set(&fence->status, 1);
>  
> - fence->cbs[0].sync_pt = &pt->base;
> + fence->cbs[0].sync_pt = pt;
>   fence->cbs[0].fence = fence;
> - if (fence_add_callback(&pt->base, &fence->cbs[0].cb,
> -fence_check_cb_func))
> + if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func))
>   atomic_dec(&fence->status);
>  
>   sync_fence_debug_add(fence);
>  
>   return fence;
>  }
> +EXPORT_SYMBOL(sync_fence_create_dma);
> +
> +struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
> +{
> + return sync_fence_create_dma(name, &pt->base);
> +}
>  EXPORT_SYMBOL(sync_fence_create);
>  
>  struct sync_fence *sync_fence_fdget(int fd)
> diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h
> index 61f8a3a..afa0752 100644
> --- a/drivers/staging/android/sync.h
> +++ b/drivers/staging/android/sync.h
> @@ -254,6 +254,16 @@ void sync_pt_free(struct sync_pt *pt);
>   */
>  struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt);
>  
> +/**
> + * sync_fence_create_dma() - creates a sync fence from dma-fence
> + * @name:name of fence to create
> + * @pt:  dma-fence to add to the fence
> + *
> + * Creates a fence containg @pt.  Once this is called, the fence takes
> + * ownership of @pt.
> + */
> +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt);
> +
>  /*
>   * API for sync_fence consumers
>   */
> 

I've been using this one for a while, so:
Reviewed-by: Jesse Barnes 
Tested-by: Jesse Barnes 
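
The shape of this change, a general constructor that takes the base object plus a thin legacy wrapper that passes `&pt->base`, can be sketched outside the kernel as below. All names here (`demo_fence`, `demo_sync_pt`, `demo_create_dma`, and so on) are hypothetical stand-ins chosen for illustration, and the structs are far simpler than the real ones.

```c
#include <stddef.h>

/* Hypothetical generic fence, standing in for the base object. */
struct demo_fence {
	int signaled;
};

/* Hypothetical driver-specific point that embeds the generic fence. */
struct demo_sync_pt {
	struct demo_fence base;
	int timeline_value;
};

struct demo_sync_fence {
	struct demo_fence *pt;	/* stores only the generic pointer */
};

/* General constructor: accepts any generic fence. */
static struct demo_sync_fence demo_create_dma(struct demo_fence *pt)
{
	struct demo_sync_fence f = { pt };
	return f;
}

/* Legacy constructor kept as a thin wrapper over the general one. */
static struct demo_sync_fence demo_create(struct demo_sync_pt *pt)
{
	return demo_create_dma(&pt->base);
}

/* Recover the outer object from the embedded member (container_of style). */
static struct demo_sync_pt *demo_to_sync_pt(struct demo_fence *f)
{
	return (struct demo_sync_pt *)((char *)f -
				       offsetof(struct demo_sync_pt, base));
}
```

Existing callers keep the old entry point, while new users that only have the base object call the general one directly.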

Re: [Intel-gfx] [PATCH] drm/i915/vlv: Take forcewake on media engine writes

2015-12-17 Thread Jesse Barnes

On 12/17/2015 07:14 AM, Mika Kuoppala wrote:
> Since commit 940aece471bd ("drm/i915/vlv: Valleyview support
> for forcewake Individual power wells.") we have only taken
> media engine forcewake correctly on reads, but only taken render
> engine forcewake on media engine writes and omitted the media
> domain.
> 
> This asymmetry might have caused unstable behaviour on
> media ring access.
> 
> Fix is to take media engine forcewake symmetrically to writes.
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=88012
> Cc: Deepak S 
> Cc: Jesse Barnes 
> Cc: Chris Wilson 
> Signed-off-by: Mika Kuoppala 
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
> b/drivers/gpu/drm/i915/intel_uncore.c
> index 277e60ae0e47..a2e204088aa5 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -902,6 +902,23 @@ hsw_write##x(struct drm_i915_private *dev_priv, 
> i915_reg_t reg, u##x val, bool t
>   GEN6_WRITE_FOOTER; \
>  }
>  
> +#define __vlv_write(x) \
> +static void \
> +vlv_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, 
> bool trace) { \
> + enum forcewake_domains fw_engine = 0; \
> + GEN6_WRITE_HEADER; \
> + if (!NEEDS_FORCE_WAKE(offset)) \
> + fw_engine = 0; \
> + else if (FORCEWAKE_VLV_RENDER_RANGE_OFFSET(offset)) \
> + fw_engine = FORCEWAKE_RENDER; \
> + else if (FORCEWAKE_VLV_MEDIA_RANGE_OFFSET(offset)) \
> + fw_engine = FORCEWAKE_MEDIA; \
> + if (fw_engine) \
> + __force_wake_get(dev_priv, fw_engine); \
> + __raw_i915_write##x(dev_priv, reg, val); \
> + GEN6_WRITE_FOOTER; \
> +}
> +
>  static const i915_reg_t gen8_shadowed_regs[] = {
>   FORCEWAKE_MT,
>   GEN6_RPNSWREQ,
> @@ -1019,6 +1036,10 @@ __gen8_write(8)
>  __gen8_write(16)
>  __gen8_write(32)
>  __gen8_write(64)
> +__vlv_write(8)
> +__vlv_write(16)
> +__vlv_write(32)
> +__vlv_write(64)
>  __hsw_write(8)
>  __hsw_write(16)
>  __hsw_write(32)
> @@ -1031,6 +1052,7 @@ __gen6_write(64)
>  #undef __gen9_write
>  #undef __chv_write
>  #undef __gen8_write
> +#undef __vlv_write
>  #undef __hsw_write
>  #undef __gen6_write
>  #undef GEN6_WRITE_FOOTER
> @@ -1243,6 +1265,8 @@ void intel_uncore_init(struct drm_device *dev)
>   case 6:
>   if (IS_HASWELL(dev)) {
>   ASSIGN_WRITE_MMIO_VFUNCS(hsw);
> + } else if (IS_VALLEYVIEW(dev)) {
> + ASSIGN_WRITE_MMIO_VFUNCS(vlv);
>   } else {
>   ASSIGN_WRITE_MMIO_VFUNCS(gen6);
>   }
> 

Looks good.  Looks like we also have it on chv, so I guess it was just
an oversight.

Reviewed-by: Jesse Barnes 
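
To make the symmetry argument concrete, here is a small sketch of choosing a forcewake domain from a register offset with one lookup that both the read and write paths could share. The ranges are made up for illustration and are not the real Valleyview register map; `demo_fw_domain` and the helpers are hypothetical names.

```c
#include <stdint.h>

enum demo_fw_domain {
	DEMO_FW_NONE   = 0,
	DEMO_FW_RENDER = 1 << 0,
	DEMO_FW_MEDIA  = 1 << 1,
};

/* Illustrative ranges only; not the real Valleyview register map. */
static int demo_in_render_range(uint32_t offset)
{
	return offset >= 0x2000 && offset < 0x4000;
}

static int demo_in_media_range(uint32_t offset)
{
	return offset >= 0x12000 && offset < 0x14000;
}

/* One lookup shared by reads and writes keeps the two paths symmetric. */
static enum demo_fw_domain demo_fw_domain_for(uint32_t offset)
{
	if (demo_in_render_range(offset))
		return DEMO_FW_RENDER;
	if (demo_in_media_range(offset))
		return DEMO_FW_MEDIA;
	return DEMO_FW_NONE;
}
```

With a single lookup like this, a write to a media-range register cannot silently fall back to the render domain the way the old write path did.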

Re: [Intel-gfx] [PATCH] intel: merge latest i915_drm.h

2015-12-12 Thread Jesse Barnes

On 12/12/2015 07:16 AM, Emil Velikov wrote:
> On 11 December 2015 at 21:55, Jesse Barnes  wrote:
>> Pick up context flags, softpin, etc.
>>
>> Signed-off-by: Jesse Barnes 
>> ---
>>  include/drm/i915_drm.h | 57 ++
>>  1 file changed, 48 insertions(+), 9 deletions(-)
>>
> Any objections if we do this (and pretty much every other outdated
> header) in a single go, as the header cleanups hit Linus' tree ?
> Dave already has them in drm-next :-)

No objection here.  Feel free to push it with my ack!

Thanks,
Jesse


[Intel-gfx] [PATCH] intel: merge latest i915_drm.h

2015-12-11 Thread Jesse Barnes

Pick up context flags, softpin, etc.

Signed-off-by: Jesse Barnes 
---
 include/drm/i915_drm.h | 57 ++
 1 file changed, 48 insertions(+), 9 deletions(-)

diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index ded43b1..4ce1fe9 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -171,8 +171,12 @@ typedef struct _drm_i915_sarea {
 #define I915_BOX_TEXTURE_LOAD  0x8
 #define I915_BOX_LOST_CONTEXT  0x10
 
-/* I915 specific ioctls
- * The device specific ioctl range is 0x40 to 0x79.
+/*
+ * i915 specific ioctls.
+ *
+ * The device specific ioctl range is [DRM_COMMAND_BASE, DRM_COMMAND_END) ie
+ * [0x40, 0xa0) (a0 is excluded). The numbers below are defined as offset
+ * against DRM_COMMAND_BASE and should be between [0x0, 0x60).
  */
 #define DRM_I915_INIT  0x00
 #define DRM_I915_FLUSH 0x01
@@ -270,7 +274,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_OVERLAY_PUT_IMAGE   DRM_IOW(DRM_COMMAND_BASE + DRM_I915_OVERLAY_PUT_IMAGE, struct drm_intel_overlay_put_image)
 #define DRM_IOCTL_I915_OVERLAY_ATTRS   DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_OVERLAY_ATTRS, struct drm_intel_overlay_attrs)
 #define DRM_IOCTL_I915_SET_SPRITE_COLORKEY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_SET_SPRITE_COLORKEY, struct drm_intel_sprite_colorkey)
-#define DRM_IOCTL_I915_GET_SPRITE_COLORKEY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_SET_SPRITE_COLORKEY, struct drm_intel_sprite_colorkey)
+#define DRM_IOCTL_I915_GET_SPRITE_COLORKEY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GET_SPRITE_COLORKEY, struct drm_intel_sprite_colorkey)
 #define DRM_IOCTL_I915_GEM_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_WAIT, struct drm_i915_gem_wait)
 #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE  DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
 #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
@@ -350,9 +354,16 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_REVISION  32
 #define I915_PARAM_SUBSLICE_TOTAL   33
 #define I915_PARAM_EU_TOTAL 34
+#define I915_PARAM_HAS_GPU_RESET35
+#define I915_PARAM_HAS_RESOURCE_STREAMER 36
+#define I915_PARAM_HAS_EXEC_SOFTPIN 37
 
 typedef struct drm_i915_getparam {
-   int param;
+   __s32 param;
+   /*
+* WARNING: Using pointers instead of fixed-size u64 means we need to write
+* compat32 code. Don't repeat this mistake.
+*/
int *value;
 } drm_i915_getparam_t;
 
@@ -672,15 +683,21 @@ struct drm_i915_gem_exec_object2 {
__u64 alignment;
 
/**
-* Returned value of the updated offset of the object, for future
-* presumed_offset writes.
+* When the EXEC_OBJECT_PINNED flag is specified this is populated by
+* the user with the GTT offset at which this object will be pinned.
+* When the I915_EXEC_NO_RELOC flag is specified this must contain the
+* presumed_offset of the object.
+* During execbuffer2 the kernel populates it with the value of the
+* current GTT offset of the object, for future presumed_offset writes.
 */
__u64 offset;
 
 #define EXEC_OBJECT_NEEDS_FENCE (1<<0)
 #define EXEC_OBJECT_NEEDS_GTT  (1<<1)
 #define EXEC_OBJECT_WRITE  (1<<2)
-#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_WRITE<<1)
+#define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3)
+#define EXEC_OBJECT_PINNED (1<<4)
+#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_PINNED<<1)
__u64 flags;
 
__u64 rsvd1;
@@ -760,7 +777,12 @@ struct drm_i915_gem_execbuffer2 {
 #define I915_EXEC_BSD_RING1(1<<13)
 #define I915_EXEC_BSD_RING2(2<<13)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(1<<15)
+/** Tell the kernel that the batchbuffer is processed by
+ *  the resource streamer.
+ */
+#define I915_EXEC_RESOURCE_STREAMER (1<<15)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_RESOURCE_STREAMER<<1)
 
 #define I915_EXEC_CONTEXT_ID_MASK  (0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -996,6 +1018,7 @@ struct drm_intel_overlay_put_image {
 /* flags */
 #define I915_OVERLAY_UPDATE_ATTRS  (1<<0)
 #define I915_OVERLAY_UPDATE_GAMMA  (1<<1)
+#define I915_OVERLAY_DISABLE_DEST_COLORKEY (1<<2)
 struct drm_intel_overlay_attrs {
__u32 flags;
__u32 color_key;
@@ -1062,9 +1085,23 @@ struct drm_i915_gem_context_destroy {
 };
 
 struct drm_i915_reg_read {
+   /*
+* Register offset.
+* For 64bit wide registers where the upper 32bits don't immediately
+* follow the lower 32bits, the offset of the lower 32bits must
+* be specified
+*/
__u64 offset;
__u64 val; /* Return value */
 };
+/* Known reg
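
The two `__*_UNKNOWN_FLAGS` definitions in this header use the same trick: negating the highest known flag shifted left by one gives a mask of every bit above it, so adding a flag only means updating which flag is named last. A minimal sketch, assuming hypothetical `DEMO_*` flag names rather than the real uAPI ones:

```c
#include <stdint.h>

#define DEMO_FLAG_A       (1u << 0)
#define DEMO_FLAG_B       (1u << 1)
#define DEMO_FLAG_LAST    (1u << 2)	/* highest flag currently defined */

/* Negating (last << 1) sets every bit above the last known flag. */
#define DEMO_UNKNOWN_FLAGS ((uint64_t)-(int64_t)(DEMO_FLAG_LAST << 1))

/* Reject any request that sets a bit we do not know about. */
static int demo_flags_valid(uint64_t flags)
{
	return (flags & DEMO_UNKNOWN_FLAGS) == 0;
}
```

The explicit casts are a choice made for this sketch so the mask covers all 64 bits regardless of the flag type; the header itself relies on integer promotion instead.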

Re: [Intel-gfx] [PATCH] drm/i915: Flush the RPS bottom-half when the GPU idles

2015-12-09 Thread Jesse Barnes

On 12/09/2015 09:10 AM, Chris Wilson wrote:
> Make sure that the RPS bottom-half is flushed before we set the idle
> frequency when we decide the GPU is idle. This should prevent any races
> with the bottom-half and setting the idle frequency, and ensures that
> the bottom-half is bounded by the GPU's rpm reference taken for when it
> is active (i.e. between gen6_rps_busy() and gen6_rps_idle()).
> 
> Signed-off-by: Chris Wilson 
> Cc: Imre Deak 
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index e655321385e2..bb796d4e9a3a 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4401,11 +4401,15 @@ void gen6_rps_busy(struct drm_i915_private *dev_priv)
>  
>  void gen6_rps_idle(struct drm_i915_private *dev_priv)
>  {
> - struct drm_device *dev = dev_priv->dev;
> + /* Flush our bottom-half so that it does not race with us
> +  * setting the idle frequency and so that it is bounded by
> +  * our rpm wakeref.
> +  */
> + flush_work(&dev_priv->rps.work);
>  
>   mutex_lock(&dev_priv->rps.hw_lock);
>   if (dev_priv->rps.enabled) {
> - if (IS_VALLEYVIEW(dev))
> + if (IS_VALLEYVIEW(dev_priv))
>   vlv_set_rps_idle(dev_priv);
>   else
>   gen6_set_rps(dev_priv->dev, dev_priv->rps.idle_freq);
> 

Hah and a consistency fix snuck in there... nice.

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH] drm/i915: Restore waitboost credit to the synchronous waiter

2015-12-07 Thread Jesse Barnes

On 12/01/2015 02:48 PM, Chris Wilson wrote:
> Ideally, we want to automagically have the GPU respond to the
> instantaneous load by reclocking itself. However, reclocking occurs
> relatively slowly, and to the client waiting for a result from the GPU,
> too late. To compensate and reduce the client latency, we allow the
> first wait from a client to boost the GPU clocks to maximum. This
> overcomes the lag in autoreclocking, at the expense of forcing the GPU
> clocks too high. So to offset the excessive power usage, we currently
> allow a client to only boost the clocks once before we detect the GPU
> is idle again. This works reasonably for say the first frame in a
> benchmark, but for many more synchronous workloads (like OpenCL) we find
> the GPU clocks remain too low. By noting a wait which would idle the GPU
> (i.e. we just waited upon the last known request), we can give that
> client the idle boost credit (for their next wait) without the 100ms
> delay required for us to detect the GPU idle state. The intention is to
> boost clients that are stalling in the process of feeding the GPU more
> work (and who in doing so let the GPU idle), without granting boost
> credits to clients that are throttling themselves (such as compositors).
> 
> Signed-off-by: Chris Wilson 
> Cc: "Zou, Nanhai" 
> Cc: Jesse Barnes 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 92598601a232..f5aef48b93db 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1312,6 +1312,22 @@ out:
>   *timeout = 0;
>   }
>  
> + if (ret == 0 && rps && req->seqno == req->ring->last_submitted_seqno) {
> + /* The GPU is now idle and this client has stalled.
> +  * Since no other client has submitted a request in the
> +  * meantime, assume that this client is the only one
> +  * supplying work to the GPU but is unable to keep that
> +  * work supplied because it is waiting. Since the GPU is
> +  * then never kept fully busy, RPS autoclocking will
> +  * keep the clocks relatively low, causing further delays.
> +  * Compensate by giving the synchronous client credit for
> +  * a waitboost next time.
> +  */
> + spin_lock(&req->i915->rps.client_lock);
> + list_del_init(&rps->link);
> + spin_unlock(&req->i915->rps.client_lock);
> + }
> +
>   return ret;
>  }
>  
> 

Still wishing we had a good way to benchmark these types of changes
across a range of workloads.  Eero, have you guys looked at turbo stuff
at all yet?

Also, is the boost logic only documented in misc commit messages?  Or do
we have a nice block of text somewhere describing the intent (which may
not match our implementation!) and how we try to achieve it?

Those are both new requests though, so no need to block this patch:
Reviewed-by: Jesse Barnes 

Thanks,
Jesse
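
Since the boost logic lives mostly in commit messages, here is a toy model of the credit scheme as the commit message describes it: a client may boost once, and it gets the credit back when the request it waited on was the last one submitted. This is a sketch under those assumptions only; the real driver tracks clients on a locked list, and `demo_gpu`, `demo_client` and the functions below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_client {
	bool boosted;	/* true once the client has spent its boost credit */
};

struct demo_gpu {
	uint32_t last_submitted_seqno;
	unsigned int boosts;	/* number of boosts granted so far */
};

/* Grant a boost only if the client still holds its credit. */
static void demo_boost(struct demo_gpu *gpu, struct demo_client *c)
{
	if (c->boosted)
		return;
	c->boosted = true;
	gpu->boosts++;
}

/*
 * Called after a successful wait on seqno. If that was the last request
 * submitted, the GPU went idle while this client was stalled, so the
 * client gets its credit back for the next wait.
 */
static void demo_wait_done(struct demo_gpu *gpu, struct demo_client *c,
			   uint32_t seqno)
{
	if (seqno == gpu->last_submitted_seqno)
		c->boosted = false;
}
```

A client that waits on an older request while more work is queued (a throttling compositor, say) never matches the last submitted seqno and so does not regain the credit.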

Re: [Intel-gfx] [PATCH v4] drm/i915: Pin the ifbdev for the info->system_base GGTT mmapping

2015-11-20 Thread Jesse Barnes

On 11/20/2015 08:29 AM, Chris Wilson wrote:
> A long time ago (before 3.14) we relied on a permanent pinning of the
> ifbdev to lock the fb in place inside the GGTT. However, the
> introduction of stealing the BIOS framebuffer and reusing its address in
> the GGTT for the fbdev has muddied waters and we use an inherited fb.
> However, the inherited fb is only pinned whilst it is active and we no
> longer have an explicit pin for the info->system_base mmapping used by
> the fbdev. The result is that after some aperture pressure the fbdev may
> be evicted, but we continue to write the fbcon into the same GGTT
> address - overwriting anything else that may be put into that offset.
> The effect is most pronounced across suspend/resume as
> intel_fbdev_set_suspend() does a full clear over the whole scanout.
> 
> v2: Only unpin the intel_fb if we allocate it. If we inherit the fb from
> the BIOS, we do not own the pinned vma (except for the reference we add
> in this patch for our access via info->screen_base).
> 
> v3: Finish balancing the vma pinning for the normal !preallocated case.
> 
> v4: Try to simplify the pinning even further.
> 
> Signed-off-by: Chris Wilson 
> Cc: "Goel, Akash" 
> Cc: Daniel Vetter 
> Cc: Jesse Barnes 
> Cc: sta...@vger.kernel.org
> ---
>  drivers/gpu/drm/i915/intel_fbdev.c | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
> b/drivers/gpu/drm/i915/intel_fbdev.c
> index 7ccde58f8c98..79f02e72da8a 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -163,13 +163,6 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
>   goto out;
>   }
>  
> - /* Flush everything out, we'll be doing GTT only from now on */
> - ret = intel_pin_and_fence_fb_obj(NULL, fb, NULL);
> - if (ret) {
> - DRM_ERROR("failed to pin obj: %d\n", ret);
> - goto out;
> - }
> -
>   mutex_unlock(&dev->struct_mutex);
>  
>   ifbdev->fb = to_intel_framebuffer(fb);
> @@ -225,6 +218,14 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  
>   mutex_lock(&dev->struct_mutex);
>  
> + /* Pin the GGTT vma for our access via info->screen_base.
> +  * This also validates that any existing fb inherited from the
> +  * BIOS is suitable for own access.
> +  */
> + ret = intel_pin_and_fence_fb_obj(NULL, ifbdev->fb->base, NULL);
> + if (ret)
> + goto out_unlock;
> +
>   info = drm_fb_helper_alloc_fbi(helper);
>   if (IS_ERR(info)) {
>   DRM_ERROR("Failed to allocate fb_info\n");
> @@ -287,6 +288,7 @@ out_destroy_fbi:
>   drm_fb_helper_release_fbi(helper);
>  out_unpin:
>   i915_gem_object_ggtt_unpin(obj);
> +out_unlock:
>   mutex_unlock(&dev->struct_mutex);
>   return ret;
>  }
> @@ -524,6 +526,8 @@ static const struct drm_fb_helper_funcs 
> intel_fb_helper_funcs = {
>  static void intel_fbdev_destroy(struct drm_device *dev,
>   struct intel_fbdev *ifbdev)
>  {
> + /* Release the pinning for the info->screen_base mmaping. */
> + i915_gem_object_ggtt_unpin(ifbdev->fb->obj);
>  
>   drm_fb_helper_unregister_fbi(&ifbdev->helper);
>   drm_fb_helper_release_fbi(&ifbdev->helper);
> 

Ah even better.

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v3] drm/i915: Pin the ifbdev for the info->system_base GGTT mmapping

2015-11-20 Thread Jesse Barnes

On 11/20/2015 06:34 AM, Chris Wilson wrote:
> A long time ago (before 3.14) we relied on a permanent pinning of the
> ifbdev to lock the fb in place inside the GGTT. However, the
> introduction of stealing the BIOS framebuffer and reusing its address in
> the GGTT for the fbdev has muddied waters and we use an inherited fb.
> However, the inherited fb is only pinned whilst it is active and we no
> longer have an explicit pin for the info->system_base mmapping used by
> the fbdev. The result is that after some aperture pressure the fbdev may
> be evicted, but we continue to write the fbcon into the same GGTT
> address - overwriting anything else that may be put into that offset.
> The effect is most pronounced across suspend/resume as
> intel_fbdev_set_suspend() does a full clear over the whole scanout.
> 
> v2: Only unpin the intel_fb if we allocate it. If we inherit the fb from
> the BIOS, we do not own the pinned vma (except for the reference we add
> in this patch for our access via info->screen_base).
> 
> v3: Finish balancing the vma pinning for the normal !preallocated case.
> 
> Signed-off-by: Chris Wilson 
> Cc: "Goel, Akash" 
> Cc: Daniel Vetter 
> Cc: Jesse Barnes 
> Cc: sta...@vger.kernel.org
> ---
>  drivers/gpu/drm/i915/intel_fbdev.c | 23 +++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
> b/drivers/gpu/drm/i915/intel_fbdev.c
> index 7ccde58f8c98..7a415fe31299 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -225,6 +225,16 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  
>   mutex_lock(&dev->struct_mutex);
>  
> + /* The fb constructor will have already pinned us (or inherited a
> +  * GGTT region from the BIOS) suitable for a scanout, so
> +  * this should just be a no-op and increment the pin count for the
> +  * fbdev mmapping. It does have a useful side-effect of validating
> +  * the pin for fbdev's use via a GGTT mmapping.
> +  */
> + ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE);
> + if (ret)
> + goto out_unlock;
> +
>   info = drm_fb_helper_alloc_fbi(helper);
>   if (IS_ERR(info)) {
>   DRM_ERROR("Failed to allocate fb_info\n");
> @@ -279,6 +289,12 @@ static int intelfb_create(struct drm_fb_helper *helper,
> fb->width, fb->height,
> i915_gem_obj_ggtt_offset(obj), obj);
>  
> + /* We pin the vma for our access through info->screen_base, so
> +  * we can drop the pin we took if we created the intel_fb.
> +  */
> + if (!prealloc)
> + i915_gem_object_ggtt_unpin(obj);
> +
>   mutex_unlock(&dev->struct_mutex);
>   vga_switcheroo_client_fb_set(dev->pdev, info);
>   return 0;
> @@ -286,7 +302,12 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  out_destroy_fbi:
>   drm_fb_helper_release_fbi(helper);
>  out_unpin:
> + /* Once for info->screen_base mmaping... */
>   i915_gem_object_ggtt_unpin(obj);
> +out_unlock:
> + if (!prealloc)
> + /* ...and once for the intel_fb */
> + i915_gem_object_ggtt_unpin(obj);
>   mutex_unlock(&dev->struct_mutex);
>   return ret;
>  }
> @@ -524,6 +545,8 @@ static const struct drm_fb_helper_funcs 
> intel_fb_helper_funcs = {
>  static void intel_fbdev_destroy(struct drm_device *dev,
>   struct intel_fbdev *ifbdev)
>  {
> + /* Release the pinning for the info->screen_base mmaping. */
> + i915_gem_object_ggtt_unpin(ifbdev->fb->obj);
>  
>   drm_fb_helper_unregister_fbi(&ifbdev->helper);
>   drm_fb_helper_release_fbi(&ifbdev->helper);
> 

Now you're making me look at the pin/unpin handling...  Could probably
make the prealloc vs non-prealloc cases a bit clearer, but it looks
correct.  In the prealloc case we need the additional pin, since
create_stolen_for_preallocated just pins the pages and doesn't up the
pin count, right?  But in the non-prealloc case we'll have done a
regular fb alloc, which does a pin & fence, so we can drop the extra pin
count.  And I think the page unpin is already taken care of?  ISTR bugs
there when we first landed the initial plane allocation stuff.

Reviewed-by: Jesse Barnes 
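
The pin accounting described above can be checked with a toy counter: in both the preallocated and the regular case the object should end up holding exactly one pin for the fbdev mapping, and destroy should drop it. This models only the reading given in this reply, not the actual GEM pin machinery; `demo_obj` and its helpers are hypothetical.

```c
#include <stdbool.h>

struct demo_obj {
	int pin_count;
};

static void demo_pin(struct demo_obj *o)   { o->pin_count++; }
static void demo_unpin(struct demo_obj *o) { o->pin_count--; }

/*
 * Model of the v3 flow. In the non-preallocated case the regular fb
 * allocation is assumed to have taken a pin already; in the preallocated
 * case it is assumed not to have.
 */
static int demo_create(struct demo_obj *o, bool prealloc)
{
	if (!prealloc)
		demo_pin(o);	/* pin taken by the regular fb allocation */

	demo_pin(o);		/* pin for the screen_base mapping */

	if (!prealloc)
		demo_unpin(o);	/* allocation pin is no longer needed */

	return o->pin_count;
}

/* Teardown releases the one pin held for the mapping. */
static int demo_destroy(struct demo_obj *o)
{
	demo_unpin(o);
	return o->pin_count;
}
```

Both paths leave the count at one after create and zero after destroy, which is the balance the v3 changelog entry refers to.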

Re: [Intel-gfx] [PATCH v2 1/2] drm/i915: Set the map-and-fenceable flag for preallocated objects

2015-11-20 Thread Jesse Barnes

On 11/20/2015 06:16 AM, Chris Wilson wrote:
> As we mark the preallocated objects as bound, we should also flag them
> correctly as being map-and-fenceable (if appropriate!) so that later
> users do not get confused and try and rebind the pinned vma in order to
> get a map-and-fenceable binding.
> 
> Signed-off-by: Chris Wilson 
> Cc: "Goel, Akash" 
> Cc: Daniel Vetter 
> Cc: Jesse Barnes 
> Cc: sta...@vger.kernel.org
> ---
>  drivers/gpu/drm/i915/i915_drv.h|  1 +
>  drivers/gpu/drm/i915/i915_gem.c| 43 
> +++---
>  drivers/gpu/drm/i915/i915_gem_gtt.c|  1 +
>  drivers/gpu/drm/i915/i915_gem_stolen.c |  1 +
>  4 files changed, 27 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f2b65433ed7d..24143e5273b6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2844,6 +2844,7 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object 
> *obj,
>  
>  int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> u32 flags);
> +void __i915_vma_set_map_and_fenceable(struct i915_vma *vma);
>  int __must_check i915_vma_unbind(struct i915_vma *vma);
>  /*
>   * BEWARE: Do not use the function below unless you can _absolutely_
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3ad198a41c4a..e6a8a52c8a6b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4092,6 +4092,29 @@ i915_vma_misplaced(struct i915_vma *vma, uint32_t 
> alignment, uint64_t flags)
>   return false;
>  }
>  
> +void __i915_vma_set_map_and_fenceable(struct i915_vma *vma)
> +{
> + struct drm_i915_gem_object *obj = vma->obj;
> + bool mappable, fenceable;
> + u32 fence_size, fence_alignment;
> +
> + fence_size = i915_gem_get_gtt_size(obj->base.dev,
> +obj->base.size,
> +obj->tiling_mode);
> + fence_alignment = i915_gem_get_gtt_alignment(obj->base.dev,
> +  obj->base.size,
> +  obj->tiling_mode,
> +  true);
> +
> + fenceable = (vma->node.size == fence_size &&
> +  (vma->node.start & (fence_alignment - 1)) == 0);
> +
> + mappable = (vma->node.start + fence_size <=
> + to_i915(obj->base.dev)->gtt.mappable_end);
> +
> + obj->map_and_fenceable = mappable && fenceable;
> +}
> +
>  static int
>  i915_gem_object_do_pin(struct drm_i915_gem_object *obj,
>  struct i915_address_space *vm,
> @@ -4159,25 +4182,7 @@ i915_gem_object_do_pin(struct drm_i915_gem_object *obj,
>  
>   if (ggtt_view && ggtt_view->type == I915_GGTT_VIEW_NORMAL &&
>   (bound ^ vma->bound) & GLOBAL_BIND) {
> - bool mappable, fenceable;
> - u32 fence_size, fence_alignment;
> -
> - fence_size = i915_gem_get_gtt_size(obj->base.dev,
> -obj->base.size,
> -obj->tiling_mode);
> - fence_alignment = i915_gem_get_gtt_alignment(obj->base.dev,
> -  obj->base.size,
> -  obj->tiling_mode,
> -  true);
> -
> - fenceable = (vma->node.size == fence_size &&
> -  (vma->node.start & (fence_alignment - 1)) == 0);
> -
> - mappable = (vma->node.start + fence_size <=
> - dev_priv->gtt.mappable_end);
> -
> - obj->map_and_fenceable = mappable && fenceable;
> -
> + __i915_vma_set_map_and_fenceable(vma);
>   WARN_ON(flags & PIN_MAPPABLE && !obj->map_and_fenceable);
>   }
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
> b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index a09f8f0510d5..74b26b2d0889 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2704,6 +2704,7 @@ static int i915_gem_setup_global_gtt(struct drm_device 
> *dev,
>   return ret;
>   }
>   vma->bound |= GLOBAL_BIND;
> + __i915_vma_set_map_and_fenceable(

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Serialise updates to GGTT with access through GGTT on Braswell

2015-11-19 Thread Jesse Barnes

On 11/19/2015 01:35 AM, Chris Wilson wrote:
> On Thu, Nov 19, 2015 at 10:14:08AM +0100, Daniel Vetter wrote:
>> On Wed, Nov 18, 2015 at 03:08:47PM -0800, Jesse Barnes wrote:
>>> On 11/17/2015 08:37 AM, Daniel Vetter wrote:
>>>> On Fri, Oct 30, 2015 at 04:58:41PM +, Chris Wilson wrote:
>>>>> On Fri, Oct 30, 2015 at 05:14:21PM +0100, Daniel Vetter wrote:
>>>>>> On Fri, Oct 23, 2015 at 06:43:32PM +0100, Chris Wilson wrote:
>>>>>>> When accessing through the GTT from one CPU whilst concurrently updating
>>>>>>> the GGTT PTEs in another thread, the hardware likes to return random
>>>>>>> data. As we have strong serialisation preventing us from modifying the PTE
>>>>>>> of an active GTT mmapping, we have to conclude that it is whilst modifying
>>>>>>> other PTEs that the error occurs. (I have not looked for any pattern such
>>>>>>> as modifying PTE within the same page or cacheline as active PTE -
>>>>>>> though checking whether revoking neighbouring objects should be enough
>>>>>>> to test that theory.) The corruption also seems restricted to Braswell
>>>>>>> and disappears with maxcpus=0. This patch stops all access through the
>>>>>>> GTT by other CPUs when we update any PTE by stopping the machine around
>>>>>>> the GGTT update.
>>>>>>>
>>>>>>> Testcase: igt/gem_concurrent_blit
>>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89079
>>>>>>> Signed-off-by: Chris Wilson 
>>>>>>
>>>>>> Wild guess, since it wouldn't be the first time hw engineers screwed this
>>>>>> up.
>>>>>>
>>>>>> Cheers, Daniel
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
>>>>>> b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>> index d1c5cf89fe77..de983c8e6e54 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>> @@ -2337,12 +2337,8 @@ int i915_gem_gtt_prepare_object(struct 
>>>>>> drm_i915_gem_object *obj)
>>>>>>  
>>>>>>  static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
>>>>>>  {
>>>>>> -#ifdef writeq
>>>>>> -writeq(pte, addr);
>>>>>> -#else
>>>>>>  iowrite32((u32)pte, addr);
>>>>>>  iowrite32(pte >> 32, addr + 4);
>>>>>> -#endif
>>>>>
>>>>> Tried:
>>>>>  static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
>>>>>   {
>>>>>   -#ifdef writeq
>>>>>   -   writeq(pte, addr);
>>>>>   -#else
>>>>>   -   iowrite32((u32)pte, addr);
>>>>>   -   iowrite32(pte >> 32, addr + 4);
>>>>>   -#endif
>>>>>   +   iowrite32(0, addr);
>>>>>   +   wmb();
>>>>>   +   iowrite32(upper_32_bits(pte), addr + 4);
>>>>>   +   iowrite32(lower_32_bits(pte), addr);
>>>>>   +   wmb();
>>>>>}
>>>>> 
>>>>> and just the plain iowrite(lower), iowrite(upper), neither helps.
>>>>
>>>> Added a note about this and applied to dinq. Yay for awesome hw.
>>>
>>> I thought Ville explained how this wasn't necessary?
>>
>> Ville can't repro, Chris claims it fixes something, I don't have a
>> system. We probably should dig into this more, but since I didn't see
>> anything going on I figured I can just pull it in for now.
> 
> Both myself, old QA (when they finally got around to running some of the
> coherency tests), new QA and VPG have reported coherency issues with
> access through the GGTT.

I can believe it; it would be good to find the root cause of the hw issue
though.  Obviously we're not understanding something fully...

Jesse
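
For reference, the split write tried in the quoted experiment can be
sketched in plain C. This is only an illustration under stated
assumptions: fake_pte_slot, set_pte_split() and read_pte() are
hypothetical names, and ordinary memory stands in for the GGTT, so
neither iowrite32() nor wmb() semantics are modelled here.

```c
#include <stdint.h>

/* Hypothetical stand-in for one 64-bit PTE slot, viewed as two 32-bit words. */
struct fake_pte_slot {
	uint32_t lo;
	uint32_t hi;
};

static uint32_t lower_32(uint64_t v) { return (uint32_t)v; }
static uint32_t upper_32(uint64_t v) { return (uint32_t)(v >> 32); }

/*
 * Mirrors the ordering of the quoted experiment: clear the low word first,
 * then write the upper half, then the lower half.
 */
static void set_pte_split(volatile struct fake_pte_slot *slot, uint64_t pte)
{
	slot->lo = 0;
	slot->hi = upper_32(pte);
	slot->lo = lower_32(pte);
}

/* Reassemble the 64-bit value from the two halves. */
static uint64_t read_pte(const volatile struct fake_pte_slot *slot)
{
	return ((uint64_t)slot->hi << 32) | slot->lo;
}
```

The sketch only shows that the two halves recombine into the original
value; it says nothing about why the hardware misbehaves.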

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Serialise updates to GGTT with access through GGTT on Braswell

2015-11-18 Thread Jesse Barnes

On 11/17/2015 08:37 AM, Daniel Vetter wrote:
> On Fri, Oct 30, 2015 at 04:58:41PM +, Chris Wilson wrote:
>> On Fri, Oct 30, 2015 at 05:14:21PM +0100, Daniel Vetter wrote:
>>> On Fri, Oct 23, 2015 at 06:43:32PM +0100, Chris Wilson wrote:
>>>> When accessing through the GTT from one CPU whilst concurrently updating
>>>> the GGTT PTEs in another thread, the hardware likes to return random
>>>> data. As we have strong serialisation preventing us from modifying the PTE
>>>> of an active GTT mmapping, we have to conclude that it is whilst modifying
>>>> other PTEs that the error occurs. (I have not looked for any pattern such
>>>> as modifying PTE within the same page or cacheline as active PTE -
>>>> though checking whether revoking neighbouring objects should be enough
>>>> to test that theory.) The corruption also seems restricted to Braswell
>>>> and disappears with maxcpus=0. This patch stops all access through the
>>>> GTT by other CPUs when we update any PTE by stopping the machine around
>>>> the GGTT update.
>>>>
>>>> Testcase: igt/gem_concurrent_blit
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89079
>>>> Signed-off-by: Chris Wilson 
>>>
>>> Wild guess, since it wouldn't be the first time hw engineers screwed this
>>> up.
>>>
>>> Cheers, Daniel
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
>>> b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>> index d1c5cf89fe77..de983c8e6e54 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>> @@ -2337,12 +2337,8 @@ int i915_gem_gtt_prepare_object(struct 
>>> drm_i915_gem_object *obj)
>>>  
>>>  static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
>>>  {
>>> -#ifdef writeq
>>> -   writeq(pte, addr);
>>> -#else
>>> iowrite32((u32)pte, addr);
>>> iowrite32(pte >> 32, addr + 4);
>>> -#endif
>>
>> Tried:
>>  static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
>>   {
>>   -#ifdef writeq
>>   -   writeq(pte, addr);
>>   -#else
>>   -   iowrite32((u32)pte, addr);
>>   -   iowrite32(pte >> 32, addr + 4);
>>   -#endif
>>   +   iowrite32(0, addr);
>>   +   wmb();
>>   +   iowrite32(upper_32_bits(pte), addr + 4);
>>   +   iowrite32(lower_32_bits(pte), addr);
>>   +   wmb();
>>}
>> 
>> and just the plain iowrite(lower), iowrite(upper), neither helps.
> 
> Added a note about this and applied to dinq. Yay for awesome hw.

I thought Ville explained how this wasn't necessary?

Jesse

Re: [Intel-gfx] [PATCH] drm/i915: Try to fix MST for SKL

2015-11-10 Thread Jesse Barnes

>   intel_mst->port = found->port;
>  
>   if (intel_dp->active_mst_links == 0) {
> - enum port port = intel_ddi_get_encoder_port(encoder);
> + intel_ddi_clk_select(encoder, intel_crtc->config);
>  
>   intel_dp_set_link_params(intel_dp, intel_crtc->config);
>  
> - /* FIXME: add support for SKL */
> - if (INTEL_INFO(dev)->gen < 9)
> - I915_WRITE(PORT_CLK_SEL(port),
> -intel_crtc->config->ddi_pll_sel);
> -
>   intel_ddi_init_dp_buf_reg(&intel_dig_port->base);
>  
>   intel_dp_sink_dpms(intel_dp, DRM_MODE_DPMS_ON);
>  
> -
>   intel_dp_start_link_train(intel_dp);
>   intel_dp_complete_link_train(intel_dp);
>   intel_dp_stop_link_train(intel_dp);
> diff --git a/drivers/gpu/drm/i915/intel_drv.h 
> b/drivers/gpu/drm/i915/intel_drv.h
> index 71a2e18..a97908a 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -938,6 +938,8 @@ void intel_crt_init(struct drm_device *dev);
>  
>  
>  /* intel_ddi.c */
> +void intel_ddi_clk_select(struct intel_encoder *encoder,
> +   const struct intel_crtc_state *pipe_config);
>  void intel_prepare_ddi(struct drm_device *dev);
>  void hsw_fdi_link_train(struct drm_crtc *crtc);
>  void intel_ddi_init(struct drm_device *dev, enum port port);
> 

Looks like an improvement over current code and fixes the hard hang in
https://bugs.freedesktop.org/show_bug.cgi?id=91791, so I think we should
push it.

Sounds like we need more than just this to fix MST on SKL though.

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 1/3] drm/i915: Kill intel_runtime_pm_disable()

2015-11-10 Thread Jesse Barnes

On 11/06/2015 05:08 AM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> intel_runtime_pm_disable() takes an extra rpm reference which combined
> with the one we leak from intel_display_set_init_power() leaves the
> usage count at +1 after the driver has been unloaded.
> The original ref is dropped explicitly in intel_runtime_pm_enable().
> So the next time we load the driver we can no longer do runtime PM ever.
> 
> This used to work, but
> commit 292b990e86ab ("drm/i915: Update power domains on readout.")
> broke things by not dropping the init power domain during fbdev
> teardown. Based on the comment in intel_power_domains_fini(), the
> way it used to work wasn't intentional. As in we weren't supposed
> to drop the init power during driver unload. And since we no longer
> do, we now leak an extra rpm reference.
> 
> So fix things by throwing intel_runtime_pm_disable() to the bin, so
> that the only leaked reference comes from the init power domain.
> 
> Cc: Maarten Lankhorst 
> Cc: Daniel Stone 
> Cc: Jesse Barnes 
> Fixes: 292b990e86ab ("drm/i915: Update power domains on readout.")
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_runtime_pm.c | 17 -
>  1 file changed, 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c 
> b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 1017555..bdc9ed4 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -1847,21 +1847,6 @@ int intel_power_domains_init(struct drm_i915_private 
> *dev_priv)
>   return 0;
>  }
>  
> -static void intel_runtime_pm_disable(struct drm_i915_private *dev_priv)
> -{
> - struct drm_device *dev = dev_priv->dev;
> - struct device *device = &dev->pdev->dev;
> -
> - if (!HAS_RUNTIME_PM(dev))
> - return;
> -
> - if (!intel_enable_rc6(dev))
> - return;
> -
> - /* Make sure we're not suspended first. */
> - pm_runtime_get_sync(device);
> -}
> -
>  /**
>   * intel_power_domains_fini - finalizes the power domain structures
>   * @dev_priv: i915 device instance
> @@ -1872,8 +1857,6 @@ static void intel_runtime_pm_disable(struct 
> drm_i915_private *dev_priv)
>   */
>  void intel_power_domains_fini(struct drm_i915_private *dev_priv)
>  {
> - intel_runtime_pm_disable(dev_priv);
> -
>   /* The i915.ko module is still not prepared to be loaded when
>    * the power well is not enabled, so just enable it in case
>* we're going to unload/reload. */
> 

Yeah I guess this is fine.  Will we still disable RPM on unload?  What's
the expected behavior here?  Cc'ing Rafael.

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 3/3] drm/i915: Move the fbdev async_schedule() into intel_fbdev.c

2015-11-06 Thread Jesse Barnes

On 11/06/2015 05:08 AM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> Reading the driver load/unload code leaves one confused as there's
> an async_schedule() in the load, but not async_synchronize_full()
> in sight. In fact it's hidden inside intel_fbdev.c. So let's move the
> async_schedule() into intel_fbdev.c as well so that it's next to the
> async_synchronize_full(), which should make the relationship easier
> to see.
> 
> Plus this way we won't schedule a nop function call when fbdev is
> disabled. And we were passing a pointer to a static inline
> function to async_schedule(), which seems rather dubious to me.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/i915_dma.c| 3 +--
>  drivers/gpu/drm/i915/intel_drv.h   | 4 ++--
>  drivers/gpu/drm/i915/intel_fbdev.c | 7 ++-
>  3 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index c58048f..cae3d78 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -28,7 +28,6 @@
>  
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>  
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -437,7 +436,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
>* scanning against hotplug events. Hence do this first and ignore the
>* tiny window where we will loose hotplug notifactions.
>*/
> - async_schedule(intel_fbdev_initial_config, dev_priv);
> + intel_fbdev_initial_config_async(dev);
>  
>   drm_kms_helper_poll_init(dev);
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h 
> b/drivers/gpu/drm/i915/intel_drv.h
> index 00d9882..50c78b6 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1298,7 +1298,7 @@ void intel_dvo_init(struct drm_device *dev);
>  /* legacy fbdev emulation in intel_fbdev.c */
>  #ifdef CONFIG_DRM_FBDEV_EMULATION
>  extern int intel_fbdev_init(struct drm_device *dev);
> -extern void intel_fbdev_initial_config(void *data, async_cookie_t cookie);
> +extern void intel_fbdev_initial_config_async(struct drm_device *dev);
>  extern void intel_fbdev_fini(struct drm_device *dev);
>  extern void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool 
> synchronous);
>  extern void intel_fbdev_output_poll_changed(struct drm_device *dev);
> @@ -1309,7 +1309,7 @@ static inline int intel_fbdev_init(struct drm_device 
> *dev)
>   return 0;
>  }
>  
> -static inline void intel_fbdev_initial_config(void *data, async_cookie_t 
> cookie)
> +static inline void intel_fbdev_initial_config_async(struct drm_device *dev)
>  {
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
> b/drivers/gpu/drm/i915/intel_fbdev.c
> index 840d6bf..fe1fdb6 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -702,7 +702,7 @@ int intel_fbdev_init(struct drm_device *dev)
>   return 0;
>  }
>  
> -void intel_fbdev_initial_config(void *data, async_cookie_t cookie)
> +static void intel_fbdev_initial_config(void *data, async_cookie_t cookie)
>  {
>   struct drm_i915_private *dev_priv = data;
>   struct intel_fbdev *ifbdev = dev_priv->fbdev;
> @@ -711,6 +711,11 @@ void intel_fbdev_initial_config(void *data, 
> async_cookie_t cookie)
>   drm_fb_helper_initial_config(&ifbdev->helper, ifbdev->preferred_bpp);
>  }
>  
> +void intel_fbdev_initial_config_async(struct drm_device *dev)
> +{
> + async_schedule(intel_fbdev_initial_config, to_i915(dev));
> +}
> +
>  void intel_fbdev_fini(struct drm_device *dev)
>  {
>   struct drm_i915_private *dev_priv = dev->dev_private;
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 3/3] drm/i915: Add soft-pinning API for execbuffer

2015-11-06 Thread Jesse Barnes

On 11/06/2015 05:38 AM, Chris Wilson wrote:
> On Thu, Nov 05, 2015 at 10:17:56AM -0800, Jesse Barnes wrote:
>> On 11/05/2015 09:51 AM, Kristian Høgsberg wrote:
>>> On Tue, Oct 6, 2015 at 3:53 AM, Chris Wilson  
>>> wrote:
>>>> Userspace can pass in an offset that it presumes the object is located
>>>> at. The kernel will then do its utmost to fit the object into that
>>>> location. The assumption is that userspace is handling its own object
>>>> locations (for example along with full-ppgtt) and that the kernel will
>>>> rarely have to make space for the user's requests.
>>>
>>> I know the commit message isn't documentation, but the phrase "do its
>>> utmost" makes me uncomfortable. I'd like to be explicit about what
>>> might make it fail (should only be pinned fbs in case of aliased ppgtt
>>> or userspace errors such as overlapping placements), or conversely,
>>> spell out when the flag can be expected to work (full ppgtt).
>>
>> Ooh yeah that would be good to add to the execbuf man page with the
>> softpin additions.  Oh wait, we don't have a man page for execbuf?
>> Someone should write one!
> 
> How about:
> 
> This extends the DRM_I915_GEM_EXECBUFFER2 ioctl to do the following:
> * if the user supplies a virtual address via the execobject->offset
>   *and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then
>   that object is placed at that offset in the address space selected
>   by the context specifier in execbuffer.
> * the location must be aligned to the GTT page size, 4096 bytes
> * as the object is placed exactly as specified, it may be used in this
>   batch without relocations pointing to it
> 
> It may fail to do so if:
> * EINVAL is returned if the object does not have a 4096 byte aligned
>   address
> * the object conflicts with another pinned object (either pinned by
>   hardware in that address space, e.g. scanouts in the aliasing ppgtt)
>   or within the same batch.
>   EBUSY is returned if the location is pinned by hardware
>   EINVAL is returned if the location is already in use by the batch
> * EINVAL is returned if the object conflicts with its own alignment (as
>   meets the hardware requirements) or if the placement of the object does
>   not fit within the address space
> 
> All other execbuffer errors apply.

Looks great, now we just need an existing man page in which to integrate
this additional text.

Jesse
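
As a rough illustration of the placement rules proposed above, here is a
minimal C sketch. It assumes hypothetical names throughout (pinned_obj,
check_softpin(), GTT_PAGE_SIZE); it is not the i915 implementation, and
it does not model the EBUSY case where hardware has the range pinned.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define GTT_PAGE_SIZE 4096ULL	/* alignment named in the proposed text */

/* Hypothetical description of one soft-pinned object in a batch. */
struct pinned_obj {
	uint64_t offset;	/* address requested by userspace */
	uint64_t size;		/* object size in bytes */
};

/*
 * Check objs[idx] against the rules listed above:
 *  - the offset must be 4096-byte aligned,
 *  - the object must fit inside an address space of vm_total bytes,
 *  - it must not overlap any earlier object in the same batch.
 * Returns 0 if the placement is acceptable, -EINVAL otherwise.
 */
static int check_softpin(const struct pinned_obj *objs, size_t idx,
			 uint64_t vm_total)
{
	const struct pinned_obj *o = &objs[idx];
	size_t i;

	if (o->offset & (GTT_PAGE_SIZE - 1))
		return -EINVAL;

	/* Written this way to avoid overflow in offset + size. */
	if (o->size > vm_total || o->offset > vm_total - o->size)
		return -EINVAL;

	for (i = 0; i < idx; i++) {
		const struct pinned_obj *p = &objs[i];

		if (o->offset < p->offset + p->size &&
		    p->offset < o->offset + o->size)
			return -EINVAL;
	}

	return 0;
}
```

Checking each object only against the earlier entries keeps the sketch
short; a real implementation would also have to deal with objects that
are already bound in the address space.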


Re: [Intel-gfx] [PATCH 3/3] drm/i915: Add soft-pinning API for execbuffer

2015-11-05 Thread Jesse Barnes

On 11/05/2015 09:51 AM, Kristian Høgsberg wrote:
> On Tue, Oct 6, 2015 at 3:53 AM, Chris Wilson  wrote:
>> Userspace can pass in an offset that it presumes the object is located
>> at. The kernel will then do its utmost to fit the object into that
>> location. The assumption is that userspace is handling its own object
>> locations (for example along with full-ppgtt) and that the kernel will
>> rarely have to make space for the user's requests.
> 
> I know the commit message isn't documentation, but the phrase "do its
> utmost" makes me uncomfortable. I'd like to be explicit about what
> might make it fail (should only be pinned fbs in case of aliased ppgtt
> or userspace errors such as overlapping placements), or conversely,
> spell out when the flag can be expected to work (full ppgtt).

Ooh yeah that would be good to add to the execbuf man page with the
softpin additions.  Oh wait, we don't have a man page for execbuf?
Someone should write one!

Jesse


Re: [Intel-gfx] [PATCH v2 02/14] drm/i915: Extend DSL readout fix to BDW and SKL.

2015-11-05 Thread Jesse Barnes

On 11/03/2015 04:44 AM, Maarten Lankhorst wrote:
> Hey,
> 
> On 03-11-15 at 12:32, Jani Nikula wrote:
>> On Tue, 03 Nov 2015, Ville Syrjälä  wrote:
>>> On Tue, Nov 03, 2015 at 08:31:41AM +0100, Maarten Lankhorst wrote:
>>>> Those platforms have the same bug as haswell, and the same fix applies to
>>>> them.
>> How about Broxton? IS_DDI matches that.
>>
>> Jani.
>>
> Judging from irc it's very likely it suffers from the same problem, but it 
> would be nice if we had someone who could confirm. :)

It won't hurt (much) if we apply this workaround and it doesn't affect
BXT, so I think we may as well apply given what we know of BXT's lineage.

Jesse


Re: [Intel-gfx] [PATCH 12/13] drm/i915: remove in_dbg_master check from intel_fbc.c

2015-11-04 Thread Jesse Barnes

On 11/04/2015 12:26 PM, Zanoni, Paulo R wrote:
> Em Qua, 2015-11-04 às 14:19 -0600, Jason Wessel escreveu:
>> On 11/04/2015 02:13 PM, Jesse Barnes wrote:
>>> On 11/04/2015 11:10 AM, Paulo Zanoni wrote:
>>>>  From our maintainer Daniel Vetter a few days ago:
>>>>"Oh dear this is dead code. kdbg uses the fbcon, which always
>>>> uses
>>>>untiled, which means fbc will never be enabled. Also we have 0
>>>> users
>>>>and 0 test coverage for kdbg on top of i915 (Jesse implemented
>>>> it
>>>>for fun years back). Imo just remove all this code."
>>>>
>>>> Adding to what Daniel said: for kgdboc's KMS support,
>>>> intel_pipe_set_base_atomic() already manually disables FBC, so we
>>>> won't do the in_dbg_master() check there. This is essentially a
>>>> revert
>>>> of:
>>>>
>>>> commit c924b934d0cd14a4559611da91f28f59acebe32a
>>>> Author: Jason Wessel 
>>>> Date:   Thu Aug 5 09:22:32 2010 -0500
>>>>  i915: when kgdb is active display compression should be off
>>>>
>>>> Besides, it is not clear what is the exact problem caused by FBC,
>>>> and
>>>> why other features such as PSR, DRRS, IPS and RPM are not also
>>>> checking for in_dbg_master(). IMHO we should either remove the
>>>> code as
>>>> suggested by Daniel or we add some nice comments explaining why
>>>> is FBC
>>>> so special.
>>>>
>>>> v2: Rebase due to new patch order.
>>>>
>>>> Cc: Jason Wessel 
>>>> Cc: Jesse Barnes 
>>>> Cc: Daniel Vetter 
>>>> Signed-off-by: Paulo Zanoni 
>>>> ---
>>>>   drivers/gpu/drm/i915/intel_fbc.c | 6 --
>>>>   1 file changed, 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_fbc.c
>>>> b/drivers/gpu/drm/i915/intel_fbc.c
>>>> index 8e806be..e496cb0 100644
>>>> --- a/drivers/gpu/drm/i915/intel_fbc.c
>>>> +++ b/drivers/gpu/drm/i915/intel_fbc.c
>>>> @@ -890,12 +890,6 @@ static void __intel_fbc_update(struct
>>>> drm_i915_private *dev_priv)
>>>>    goto out_disable;
>>>>}
>>>>   
>>>> -  /* If the kernel debugger is active, always disable
>>>> compression */
>>>> -  if (in_dbg_master()) {
>>>> -  set_no_fbc_reason(dev_priv, "Kernel debugger is
>>>> active");
>>>> -  goto out_disable;
>>>> -  }
>>>> -
>>>>/* WaFbcExceedCdClockThreshold:hsw,bdw */
>>>>if ((IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) &&
>>>>ilk_pipe_pixel_rate(crtc->config) >=
>>>>
>>> Yeah looks fine.  I haven't had any bug reports from the kgdboc
>>> work, so
>>> I guess that means no one is using it. :)
>>>
>>> Reviewed-by: Jesse Barnes 
>>
>>
>> It was previously the case that the code here only got hit when you
>> had a oops or a panic while running with the graphics console up. We
>> would end up with fuzzy lines instead of a readable text console when
>> we activated the atomic mode set.
> 
> But on this case we'll call pipe_set_base_atomic(), which will disable
> FBC before doing the modeset. Maybe this was not the case in the
> past..?

Or some subsequent call re-enabled it somehow.  You could replace it
with a WARN and keep my r-b if you want, or just push as-is.

But my guess is that kgdboc hasn't survived all the other modeset work
we've done the past couple of years...  I guess we need some igt tests
and a bunch of fixups to keep it going.  We could address this problem
if/when that happens.

Jesse

Re: [Intel-gfx] [PATCH 12/13] drm/i915: remove in_dbg_master check from intel_fbc.c

2015-11-04 Thread Jesse Barnes

On 11/04/2015 11:10 AM, Paulo Zanoni wrote:
> From our maintainer Daniel Vetter a few days ago:
>   "Oh dear this is dead code. kdbg uses the fbcon, which always uses
>   untiled, which means fbc will never be enabled. Also we have 0 users
>   and 0 test coverage for kdbg on top of i915 (Jesse implemented it
>   for fun years back). Imo just remove all this code."
> 
> Adding to what Daniel said: for kgdboc's KMS support,
> intel_pipe_set_base_atomic() already manually disables FBC, so we
> won't do the in_dbg_master() check there. This is essentially a revert
> of:
> 
> commit c924b934d0cd14a4559611da91f28f59acebe32a
> Author: Jason Wessel 
> Date:   Thu Aug 5 09:22:32 2010 -0500
> i915: when kgdb is active display compression should be off
> 
> Besides, it is not clear what is the exact problem caused by FBC, and
> why other features such as PSR, DRRS, IPS and RPM are not also
> checking for in_dbg_master(). IMHO we should either remove the code as
> suggested by Daniel or we add some nice comments explaining why is FBC
> so special.
> 
> v2: Rebase due to new patch order.
> 
> Cc: Jason Wessel 
> Cc: Jesse Barnes 
> Cc: Daniel Vetter 
> Signed-off-by: Paulo Zanoni 
> ---
>  drivers/gpu/drm/i915/intel_fbc.c | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_fbc.c 
> b/drivers/gpu/drm/i915/intel_fbc.c
> index 8e806be..e496cb0 100644
> --- a/drivers/gpu/drm/i915/intel_fbc.c
> +++ b/drivers/gpu/drm/i915/intel_fbc.c
> @@ -890,12 +890,6 @@ static void __intel_fbc_update(struct drm_i915_private 
> *dev_priv)
>   goto out_disable;
>   }
>  
> - /* If the kernel debugger is active, always disable compression */
> - if (in_dbg_master()) {
> - set_no_fbc_reason(dev_priv, "Kernel debugger is active");
> - goto out_disable;
> - }
> -
>   /* WaFbcExceedCdClockThreshold:hsw,bdw */
>   if ((IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) &&
>   ilk_pipe_pixel_rate(crtc->config) >=
> 

Yeah looks fine.  I haven't had any bug reports from the kgdboc work, so
I guess that means no one is using it. :)

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] skylake + drm-next - warn city

2015-11-03 Thread Jesse Barnes

On 11/03/2015 12:07 PM, Dave Airlie wrote:
> We have a major process failure in place here, and shoving more code
> in the backend and hoping it somehow magically fixes itself between
> drm-intel-next and merging to Linus's tree is clearly not working for
> the past 6 months at least. I'm really unhappy about how shoddy 4.2
> is, and 4.3 is clearly not shaping up to have been a winner, 4.4 is
> looking even less fun.
> 
> So maybe you guys can brainstorm a bit, also when Daniel gets back.
> But at the moment I think until QA is fully reestablished, I think not
> merging anything to drm-intel-next for a few weeks and taking a break
> on new features until some of the features that were merged broken
> actually get fixed.
> 
> I'm also going to start looking at reverting skylake firmware loading,
> it's clearly never been tested with lockdep enabled by anyone who
> cared, which to my mind says it should never have been merged in the
> first place.

I think this is the right way to go.  We have several known failures in
even our basic tests, and when we created that list the intention was to
"drop everything" if one of them failed on any of the past few platforms
(BYT+ on Atom and HSW+ on Core).  So far we haven't done that but imo we
should.

Jesse


Re: [Intel-gfx] [PATCH 11/14] drm/i915: Remove ILK-A eDP PLL workaround notes

2015-10-29 Thread Jesse Barnes

On 10/29/2015 12:26 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> We don't care about ILK-A and the old w/a notes may just confuse
> people, so get rid of them.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 55d5246..763b0ef 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -1556,10 +1556,6 @@ static void ironlake_set_pll_cpu_edp(struct intel_dp 
> *intel_dp)
>   dpa_ctl &= ~DP_PLL_FREQ_MASK;
>  
>   if (crtc->config->port_clock == 162000) {
> - /* For a long time we've carried around a ILK-DevA w/a for the
> -  * 162MHz clock. If we're really unlucky, it's still required.
> -  */
> - DRM_DEBUG_KMS("162MHz cpu eDP clock, might need ilk devA 
> w/a\n");
>   dpa_ctl |= DP_PLL_FREQ_162MHZ;
>   intel_dp->DP |= DP_PLL_FREQ_162MHZ;
>   } else {
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 09/14] drm/i915: Hide underruns from eDP PLL and port enable on ILK

2015-10-29 Thread Jesse Barnes

On 10/29/2015 12:25 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> We get underruns on the other pipe when enabling the CPU eDP PLL and
> port on ILK.
> 
> Bspec knows about the PLL issue, and recommends doing a vblank wait just
> prior to enabling the PLL. That does seem to help, but unfortunately we
> get another underrun when actually enabling the CPU eDP port. Bspec
> doesn't mention that at all, and the same vblank wait trick doesn't
> appear to be effective there.
> 
> Since I have no better clue how to deal with this, just hide the errors.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 34 +++---
>  1 file changed, 31 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 4a0fb63..0b9b440 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -2575,6 +2575,8 @@ static void intel_enable_dp(struct intel_encoder 
> *encoder)
>   struct drm_i915_private *dev_priv = dev->dev_private;
>   struct intel_crtc *crtc = to_intel_crtc(encoder->base.crtc);
>   uint32_t dp_reg = I915_READ(intel_dp->output_reg);
> + enum port port = dp_to_dig_port(intel_dp)->port;
> + enum pipe pipe = crtc->pipe;
>  
>   if (WARN_ON(dp_reg & DP_PORT_EN))
>   return;
> @@ -2586,6 +2588,17 @@ static void intel_enable_dp(struct intel_encoder 
> *encoder)
>  
>   intel_dp_enable_port(intel_dp);
>  
> + if (port == PORT_A && IS_GEN5(dev_priv)) {
> + /*
> +  * Underrun reporting for the other pipe was disabled in
> +  * g4x_pre_enable_dp(). The eDP PLL and port have now been
> +  * enabled, so it's now safe to re-enable underrun reporting.
> +  */
> + intel_wait_for_vblank_if_active(dev_priv->dev, !pipe);
> + intel_set_cpu_fifo_underrun_reporting(dev_priv, !pipe, true);
> + intel_set_pch_fifo_underrun_reporting(dev_priv, !pipe, true);
> + }
> +
>   edp_panel_vdd_on(intel_dp);
>   edp_panel_on(intel_dp);
>   edp_panel_vdd_off(intel_dp, true);
> @@ -2608,7 +2621,7 @@ static void intel_enable_dp(struct intel_encoder 
> *encoder)
>  
>   if (crtc->config->has_audio) {
>   DRM_DEBUG_DRIVER("Enabling DP audio on pipe %c\n",
> -  pipe_name(crtc->pipe));
> +  pipe_name(pipe));
>   intel_audio_codec_enable(encoder);
>   }
>  }
> @@ -2631,13 +2644,28 @@ static void vlv_enable_dp(struct intel_encoder 
> *encoder)
>  
>  static void g4x_pre_enable_dp(struct intel_encoder *encoder)
>  {
> + struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
>   struct intel_dp *intel_dp = enc_to_intel_dp(&encoder->base);
> - struct intel_digital_port *dport = dp_to_dig_port(intel_dp);
> + enum port port = dp_to_dig_port(intel_dp)->port;
> + enum pipe pipe = to_intel_crtc(encoder->base.crtc)->pipe;
>  
>   intel_dp_prepare(encoder);
>  
> + if (port == PORT_A && IS_GEN5(dev_priv)) {
> + /*
> +  * We get FIFO underruns on the other pipe when
> +  * enabling the CPU eDP PLL, and when enabling CPU
> +  * eDP port. We could potentially avoid the PLL
> +  * underrun with a vblank wait just prior to enabling
> +  * the PLL, but that doesn't appear to help the port
> +  * enable case. Just sweep it all under the rug.
> +  */
> + intel_set_cpu_fifo_underrun_reporting(dev_priv, !pipe, false);
> + intel_set_pch_fifo_underrun_reporting(dev_priv, !pipe, false);
> + }
> +
>   /* Only ilk+ has port A */
> - if (dport->port == PORT_A) {
> + if (port == PORT_A) {
>   ironlake_set_pll_cpu_edp(intel_dp);
>   ironlake_edp_pll_on(intel_dp);
>   }
> 

Wish we had a nice hook to hide the gen5 bits somewhere better, but it's
fine as is with the comment.

Reviewed-by: Jesse Barnes 
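
The "hide the underruns" pattern in the patch above can be sketched in
isolation. This is a toy model under stated assumptions: fake_display,
set_underrun_reporting(), signal_underrun() and
enable_port_with_underruns_hidden() are hypothetical names, and a plain
function call stands in for the hardware glitch and the vblank wait.

```c
#include <stdbool.h>

enum pipe { PIPE_A = 0, PIPE_B = 1 };

/* Hypothetical per-pipe state: is reporting on, and how many were reported. */
struct fake_display {
	bool underrun_reporting[2];
	int reported_underruns[2];
};

static void set_underrun_reporting(struct fake_display *d, enum pipe p,
				   bool enable)
{
	d->underrun_reporting[p] = enable;
}

/* An underrun is only counted while reporting is enabled for that pipe. */
static void signal_underrun(struct fake_display *d, enum pipe p)
{
	if (d->underrun_reporting[p])
		d->reported_underruns[p]++;
}

/*
 * Same shape as the patch: mask reporting on the *other* pipe (selected
 * with !pipe, which works because there are only two pipes here), do the
 * step that is known to glitch, then unmask again.
 */
static void enable_port_with_underruns_hidden(struct fake_display *d,
					      enum pipe pipe)
{
	enum pipe other = (enum pipe)!pipe;

	set_underrun_reporting(d, other, false);
	signal_underrun(d, other);	/* stands in for the PLL/port enable glitch */
	set_underrun_reporting(d, other, true);
}
```

Underruns that happen after the unmask are still reported, which is the
point of re-enabling rather than leaving reporting off.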

Re: [Intel-gfx] [PATCH 08/14] drm/i915: Disable FIFO underrun reporting around IBX transcoder B workaround

2015-10-29 Thread Jesse Barnes

> --- a/drivers/gpu/drm/i915/intel_sdvo.c
> +++ b/drivers/gpu/drm/i915/intel_sdvo.c
> @@ -1464,12 +1464,23 @@ static void intel_disable_sdvo(struct intel_encoder 
> *encoder)
>* matching DP port to be enabled on transcoder A.
>*/
>   if (HAS_PCH_IBX(dev_priv) && crtc->pipe == PIPE_B) {
> + /*
> +  * We get CPU/PCH FIFO underruns on the other pipe when
> +  * doing the workaround. Sweep them under the rug.
> +  */
> + intel_set_cpu_fifo_underrun_reporting(dev_priv, PIPE_A, false);
> + intel_set_pch_fifo_underrun_reporting(dev_priv, PIPE_A, false);
> +
>   temp &= ~SDVO_PIPE_B_SELECT;
>   temp |= SDVO_ENABLE;
>   intel_sdvo_write_sdvox(intel_sdvo, temp);
>  
>   temp &= ~SDVO_ENABLE;
>   intel_sdvo_write_sdvox(intel_sdvo, temp);
> +
> + intel_wait_for_vblank_if_active(dev_priv->dev, PIPE_A);
> + intel_set_cpu_fifo_underrun_reporting(dev_priv, PIPE_A, true);
> + intel_set_pch_fifo_underrun_reporting(dev_priv, PIPE_A, true);
>   }
>  }
>  
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 07/14] drm/i915: Check for CPT and not !IBX in ironlake_disable_pch_transcoder()

2015-10-29 Thread Jesse Barnes

On 10/29/2015 12:25 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> ironlake_enable_pch_transcoder() checks for CPT to see if it should
> enable the timing override chicken bit, but
> ironlake_disable_pch_transcoder() checks for !IBX to see if it should
> clear the same bit. Change ironlake_disable_pch_transcoder() to check
> for CPT as well to keep the two sides consistent.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index e820147..0d87a4e 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2068,7 +2068,7 @@ static void ironlake_disable_pch_transcoder(struct 
> drm_i915_private *dev_priv,
>   if (wait_for((I915_READ(reg) & TRANS_STATE_ENABLE) == 0, 50))
>   DRM_ERROR("failed to disable transcoder %c\n", pipe_name(pipe));
>  
> - if (!HAS_PCH_IBX(dev)) {
> + if (HAS_PCH_CPT(dev)) {
>   /* Workaround: Clear the timing override chicken bit again. */
>       reg = TRANS_CHICKEN2(pipe);
>   val = I915_READ(reg);
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 05/14] drm/i915: Re-enable PCH FIFO underrun reporting after pipe has been disabled

2015-10-29 Thread Jesse Barnes

On 10/29/2015 12:25 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> Some hardware (IVB/HSW and CPT/PPT) have a shared error interrupt for
> all the relevant underrun bits, so in order to keep the error interrupt
> enabled, we need to have underrun reporting enabled on all PCH
> transcoders. Currently we leave the underrun reporting disabled when
> the pipe is off, which means we won't get any underrun interrupts
> when only a subset of the pipes are active.
> 
> Fix the problem by re-enabling the underrun reporting after the pipe has
> been disabled. And to avoid the spurious underruns during pipe enable,
> disable the underrun reporting before embarking on the pipe enable
> sequence. So this way we have the error reporting disabled while
> running through the modeset sequence.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index 4fc3d24..c7cd9f7 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -4857,6 +4857,9 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc)
>   return;
>  
>   if (intel_crtc->config->has_pch_encoder)
> + intel_set_pch_fifo_underrun_reporting(dev_priv, pipe, false);
> +
> + if (intel_crtc->config->has_pch_encoder)
>   intel_prepare_shared_dpll(intel_crtc);

I guess these could be combined under the conditional, but no biggie.

>  
>   if (intel_crtc->config->has_dp_encoder)
> @@ -4939,6 +4942,10 @@ static void haswell_crtc_enable(struct drm_crtc *crtc)
>   if (WARN_ON(intel_crtc->active))
>   return;
>  
> + if (intel_crtc->config->has_pch_encoder)
> + intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
> +   false);
> +
>   if (intel_crtc_to_shared_dpll(intel_crtc))
>   intel_enable_shared_dpll(intel_crtc);
>  
> @@ -5086,6 +5093,9 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
>  
>   ironlake_fdi_pll_disable(intel_crtc);
>   }
> +
> + if (intel_crtc->config->has_pch_encoder)
> + intel_set_pch_fifo_underrun_reporting(dev_priv, pipe, true);
>  }
>  
>  static void haswell_crtc_disable(struct drm_crtc *crtc)
> @@ -5133,6 +5143,10 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>   for_each_encoder_on_crtc(dev, crtc, encoder)
>   if (encoder->post_disable)
>   encoder->post_disable(encoder);
> +
> + if (intel_crtc->config->has_pch_encoder)
> + intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
> +   true);
>  }
>  
>  static void i9xx_pfit_enable(struct intel_crtc *crtc)
> 

Reviewed-by: Jesse Barnes 
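
The remark above about combining the two has_pch_encoder checks in
ironlake_crtc_enable() could look roughly like the sketch below. The struct
and helper names are simplified stand-ins invented for this example (they are
not the real i915 definitions), so treat it as an illustration of the shape
only.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the i915 structures. */
struct fake_crtc_config { bool has_pch_encoder; };
struct fake_crtc { struct fake_crtc_config config; int pipe; };

/* Fake state so the effect of the helpers can be observed. */
static int underrun_reporting_enabled = 1;
static int dpll_prepared;

static void set_pch_fifo_underrun_reporting(int pipe, bool enable)
{
	(void)pipe;
	underrun_reporting_enabled = enable;
}

static void prepare_shared_dpll(struct fake_crtc *crtc)
{
	(void)crtc;
	dpll_prepared = 1;
}

/* Both PCH-only steps share a single has_pch_encoder check. */
static void crtc_enable_pch_prologue(struct fake_crtc *crtc)
{
	if (crtc->config.has_pch_encoder) {
		set_pch_fifo_underrun_reporting(crtc->pipe, false);
		prepare_shared_dpll(crtc);
	}
}
```

Behaviour is the same as with two separate checks; the only gain is one less
conditional to read.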

Re: [Intel-gfx] [PATCH 01/14] drm/i915: Don't use intel_pipe_to_cpu_transcoder() when there's a pipe config around

2015-10-29 Thread Jesse Barnes

On 10/29/2015 12:25 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> No point in doing the crtc->pipe->crtc->config->cpu_transcoder dance
> when we can just do crtc->config->cpu_transcoder.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index bc1907e..d3cd177 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2106,8 +2106,7 @@ static void intel_enable_pipe(struct intel_crtc *crtc)
>   struct drm_device *dev = crtc->base.dev;
>   struct drm_i915_private *dev_priv = dev->dev_private;
>   enum pipe pipe = crtc->pipe;
> - enum transcoder cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv,
> -   pipe);
> + enum transcoder cpu_transcoder = crtc->config->cpu_transcoder;
>   enum pipe pch_transcoder;
>   int reg;
>   u32 val;
> @@ -5208,13 +5207,11 @@ static unsigned long get_crtc_power_domains(struct 
> drm_crtc *crtc)
>   struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>   enum pipe pipe = intel_crtc->pipe;
>   unsigned long mask;
> - enum transcoder transcoder;
> + enum transcoder transcoder = intel_crtc->config->cpu_transcoder;
>  
>   if (!crtc->state->active)
>   return 0;
>  
> - transcoder = intel_pipe_to_cpu_transcoder(dev->dev_private, pipe);
> -
>   mask = BIT(POWER_DOMAIN_PIPE(pipe));
>   mask |= BIT(POWER_DOMAIN_TRANSCODER(transcoder));
>   if (intel_crtc->config->pch_pfit.enabled ||
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 04/14] drm/i915: Enable PCH FIFO underruns later on HSW+

2015-10-29 Thread Jesse Barnes

On 10/29/2015 12:25 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> As we did for ILK/SNB/IVB, move the PCH FIFO underrun enable to happen
> after the encoder enable on HSW+. And again, for symmetry, move the
> disable to happen before encoder disable.
> 
> I've left out the vblank wait before the enable here because I don't
> know if it's needed or not. Actually I don't know if this entire
> change is needed as I don't have a HSW/BDW with VGA output.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 16 +---
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index d5cb899..4fc3d24 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -4971,11 +4971,8 @@ static void haswell_crtc_enable(struct drm_crtc *crtc)
>   encoder->pre_enable(encoder);
>   }
>  
> - if (intel_crtc->config->has_pch_encoder) {
> - intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
> -   true);
> + if (intel_crtc->config->has_pch_encoder)
>   dev_priv->display.fdi_link_train(crtc);
> - }
>  
>   if (!is_dsi)
>   intel_ddi_enable_pipe_clock(intel_crtc);
> @@ -5012,6 +5009,10 @@ static void haswell_crtc_enable(struct drm_crtc *crtc)
>   intel_opregion_notify_encoder(encoder, true);
>   }
>  
> + if (intel_crtc->config->has_pch_encoder)
> + intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
> +   true);
> +
>   /* If we change the relative order between pipe/planes enabling, we need
>* to change the workaround. */
>   hsw_workaround_pipe = pipe_config->hsw_workaround_pipe;
> @@ -5096,6 +5097,10 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>   enum transcoder cpu_transcoder = intel_crtc->config->cpu_transcoder;
>   bool is_dsi = intel_pipe_has_type(intel_crtc, INTEL_OUTPUT_DSI);
>  
> + if (intel_crtc->config->has_pch_encoder)
> + intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
> +   false);
> +
>   for_each_encoder_on_crtc(dev, crtc, encoder) {
>   intel_opregion_notify_encoder(encoder, false);
>   encoder->disable(encoder);
> @@ -5104,9 +5109,6 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>   drm_crtc_vblank_off(crtc);
>   assert_vblank_disabled(crtc);
>  
> - if (intel_crtc->config->has_pch_encoder)
> - intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
> -   false);
>   intel_disable_pipe(intel_crtc);
>  
>   if (intel_crtc->config->dp_encoder_is_mst)
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 03/14] drm/i915: Enable PCH FIFO underruns later on ILK/SNB/IVB

2015-10-29 Thread Jesse Barnes

On 10/29/2015 12:25 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> We get spurious PCH FIFO underruns if we enable the reporting too soon
> after enabling the crtc. Move it to be the last step, after the encoder
> enable. Additionally we need an extra vblank wait, otherwise we still
> get the underruns. Presumably the pipe/fdi isn't yet fully up and running
> otherwise.
> 
> For symmetry, disable the PCH underrun reporting as the first thing,
> just before encoder disable, when shutting down the crtc.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index 99fb33f..d5cb899 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -4874,7 +4874,6 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc)
>   intel_crtc->active = true;
>  
>   intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, true);
> - intel_set_pch_fifo_underrun_reporting(dev_priv, pipe, true);
>  
>   for_each_encoder_on_crtc(dev, crtc, encoder)
>   if (encoder->pre_enable)
> @@ -4912,6 +4911,12 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc)
>  
>   if (HAS_PCH_CPT(dev))
>   cpt_verify_modeset(dev, intel_crtc->pipe);
> +
> + if (intel_crtc->config->has_pch_encoder) {
> + /* Must wait for vblank to avoid spurious PCH FIFO underruns */
> + intel_wait_for_vblank(dev, pipe);
> + intel_set_pch_fifo_underrun_reporting(dev_priv, pipe, true);
> + }
>  }
>  
>  /* IPS only exists on ULT machines and is tied to pipe A. */
> @@ -5040,15 +5045,15 @@ static void ironlake_crtc_disable(struct drm_crtc 
> *crtc)
>   int pipe = intel_crtc->pipe;
>   u32 reg, temp;
>  
> + if (intel_crtc->config->has_pch_encoder)
> + intel_set_pch_fifo_underrun_reporting(dev_priv, pipe, false);
> +
>   for_each_encoder_on_crtc(dev, crtc, encoder)
>   encoder->disable(encoder);
>  
>   drm_crtc_vblank_off(crtc);
>   assert_vblank_disabled(crtc);
>  
> - if (intel_crtc->config->has_pch_encoder)
> - intel_set_pch_fifo_underrun_reporting(dev_priv, pipe, false);
> -
>   intel_disable_pipe(intel_crtc);
>  
>   ironlake_pfit_disable(intel_crtc, false);
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 02/14] drm/i915: Set sync polarity from adjusted mode for TRANS_DP_CTL

2015-10-29 Thread Jesse Barnes

On 10/29/2015 12:25 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> Rather than looking at crtc->mode (which is the user mode) dig up the
> sync polarity settings from the adjusted_mode when programming
> TRANS_DP_CTL on CPT/PPT.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index d3cd177..99fb33f 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -4170,6 +4170,8 @@ static void ironlake_pch_enable(struct drm_crtc *crtc)
>  
>   /* For PCH DP, enable TRANS_DP_CTL */
>   if (HAS_PCH_CPT(dev) && intel_crtc->config->has_dp_encoder) {
> + const struct drm_display_mode *adjusted_mode =
> + &intel_crtc->config->base.adjusted_mode;
>   u32 bpc = (I915_READ(PIPECONF(pipe)) & PIPECONF_BPC_MASK) >> 5;
>   reg = TRANS_DP_CTL(pipe);
>   temp = I915_READ(reg);
> @@ -4179,9 +4181,9 @@ static void ironlake_pch_enable(struct drm_crtc *crtc)
>   temp |= TRANS_DP_OUTPUT_ENABLE;
>   temp |= bpc << 9; /* same format but at 11:9 */
>  
> - if (crtc->mode.flags & DRM_MODE_FLAG_PHSYNC)
> + if (adjusted_mode->flags & DRM_MODE_FLAG_PHSYNC)
>   temp |= TRANS_DP_HSYNC_ACTIVE_HIGH;
> - if (crtc->mode.flags & DRM_MODE_FLAG_PVSYNC)
> + if (adjusted_mode->flags & DRM_MODE_FLAG_PVSYNC)
>   temp |= TRANS_DP_VSYNC_ACTIVE_HIGH;
>  
>       switch (intel_trans_dp_port_sel(crtc)) {
> 

God I wish we'd rename these structs a bit... "adjusted" and
"crtc->mode" don't really communicate much.

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 1/1] drm/i915: Add Backlight Control using DPCD for eDP connectors

2015-10-28 Thread Jesse Barnes

tate *pipe_config);
> +bool intel_dp_aux_display_control_capable(struct intel_connector *connector);
> +void intel_dp_aux_init_backlight_funcs(struct intel_connector 
> *intel_connector);
>  
>  /* intel_dp_mst.c */
>  int intel_dp_mst_encoder_init(struct intel_digital_port *intel_dig_port, int 
> conn_id);
> diff --git a/drivers/gpu/drm/i915/intel_panel.c 
> b/drivers/gpu/drm/i915/intel_panel.c
> index 9adb62b..04eff34 100644
> --- a/drivers/gpu/drm/i915/intel_panel.c
> +++ b/drivers/gpu/drm/i915/intel_panel.c
> @@ -1685,7 +1685,10 @@ intel_panel_init_backlight_funcs(struct intel_panel 
> *panel)
>   struct drm_device *dev = intel_connector->base.dev;
>   struct drm_i915_private *dev_priv = dev->dev_private;
>  
> - if (IS_BROXTON(dev)) {
> + if (intel_connector->base.connector_type == DRM_MODE_CONNECTOR_eDP &&
> + intel_dp_aux_display_control_capable(intel_connector)) {
> + intel_dp_aux_init_backlight_funcs(intel_connector);
> + } else if (IS_BROXTON(dev)) {
>   panel->backlight.setup = bxt_setup_backlight;
>   panel->backlight.enable = bxt_enable_backlight;
>   panel->backlight.disable = bxt_disable_backlight;
> diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
> index 9ec4716..7367e1a 100644
> --- a/include/drm/drm_dp_helper.h
> +++ b/include/drm/drm_dp_helper.h
> @@ -455,16 +455,22 @@
>  # define DP_EDP_14   0x03
>  
>  #define DP_EDP_GENERAL_CAP_1 0x701
> +#define DP_EDP_TCON_BACKLIGHT_ADJUSTMENT_CAPABLE  (1 << 0)
> +#define DP_EDP_BACKLIGHT_AUX_ENABLE_CAPABLE   (1 << 2)
>  
>  #define DP_EDP_BACKLIGHT_ADJUSTMENT_CAP 0x702
> +#define DP_EDP_BACKLIGHT_BRIGHTNESS_AUX_SET_CAPABLE   (1 << 1)
> +#define DP_EDP_BACKLIGHT_BRIGHTNESS_BYTE_COUNT(1 << 2)
>  
>  #define DP_EDP_GENERAL_CAP_2 0x703
>  
>  #define DP_EDP_GENERAL_CAP_3 0x704/* eDP 1.4 */
>  
>  #define DP_EDP_DISPLAY_CONTROL_REGISTER 0x720
> +#define DP_EDP_BACKLIGHT_ENABLE   (1 << 0)
>  
>  #define DP_EDP_BACKLIGHT_MODE_SET_REGISTER  0x721
> +#define DP_EDP_BACKLIGHT_BRIGHTNESS_CTL_MODE_DPCD_MASK 0x2
>  
>  #define DP_EDP_BACKLIGHT_BRIGHTNESS_MSB 0x722
>  #define DP_EDP_BACKLIGHT_BRIGHTNESS_LSB 0x723

I don't have the spec for this but assume you've tested it.  The code looks ok, 
my only worry is that some eDP panels might return a DPCD backlight capability 
but then just ignore the writes.  But I guess we'll find that out soon enough 
if we land this.

So:
Acked-by: Jesse Barnes 
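
For reference, a capability check built from the DPCD bits added in the patch
could be sketched as follows. The body of
intel_dp_aux_display_control_capable() is not shown in the quoted hunk, so the
exact combination of bits tested here is an assumption, and
aux_backlight_capable() and split_brightness() are hypothetical names. The
register offsets and bit definitions are copied from the drm_dp_helper.h hunk
above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions as added by the quoted drm_dp_helper.h hunk. */
#define DP_EDP_TCON_BACKLIGHT_ADJUSTMENT_CAPABLE     (1 << 0) /* in 0x701 */
#define DP_EDP_BACKLIGHT_AUX_ENABLE_CAPABLE          (1 << 2) /* in 0x701 */
#define DP_EDP_BACKLIGHT_BRIGHTNESS_AUX_SET_CAPABLE  (1 << 1) /* in 0x702 */

/*
 * Assumed policy: the panel must advertise TCON backlight adjustment,
 * AUX backlight enable, and AUX brightness set. The caller passes the
 * bytes read from DPCD 0x701 and 0x702.
 */
static bool aux_backlight_capable(uint8_t general_cap_1, uint8_t backlight_adj_cap)
{
	return (general_cap_1 & DP_EDP_TCON_BACKLIGHT_ADJUSTMENT_CAPABLE) &&
	       (general_cap_1 & DP_EDP_BACKLIGHT_AUX_ENABLE_CAPABLE) &&
	       (backlight_adj_cap & DP_EDP_BACKLIGHT_BRIGHTNESS_AUX_SET_CAPABLE);
}

/* Split a 16-bit level into the MSB (0x722) and LSB (0x723) register bytes. */
static void split_brightness(uint16_t level, uint8_t *msb, uint8_t *lsb)
{
	*msb = (uint8_t)(level >> 8);
	*lsb = (uint8_t)(level & 0xff);
}
```

A check like this only covers what the panel advertises. It does nothing for
panels that claim the capability and then ignore the writes, which is the
worry raised above.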


Re: [Intel-gfx] [PATCH i-g-t 2/4] lib: Skip suspend/hibernate tests if the system doesn't support them

2015-10-27 Thread Jesse Barnes

On 10/26/2015 11:58 PM, David Weinehall wrote:
> On Fri, Oct 23, 2015 at 12:39:31PM -0700, Jesse Barnes wrote:
>> On 10/22/2015 01:35 PM, ville.syrj...@linux.intel.com wrote:
>>> From: Ville Syrjälä 
>>>
>>> Do a dry run with rtcwake first to determine if the system even supports
>>> the intended suspend state. If not, skip the test.
>>>
>>> Fixes a bunch of stuff on my BYT FFRD8 that doesn't support S3.
>>>
>>> Signed-off-by: Ville Syrjälä 
>>> ---
>>>  lib/igt_aux.c | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/lib/igt_aux.c b/lib/igt_aux.c
>>> index 04ca25b..f3c76ae 100644
>>> --- a/lib/igt_aux.c
>>> +++ b/lib/igt_aux.c
>>> @@ -357,6 +357,9 @@ void igt_system_suspend_autoresume(void)
>>>  * seems to fare better. We need to investigate what's going on. */
>>> igt_skip_on_simulation();
>>>  
>>> +   /* skip if system doesn't support suspend-to-mem */
>>> +   igt_skip_on(system("rtcwake -n -s 30 -m mem") != 0);
>>> +
>>> ret = system("rtcwake -s 30 -m mem");
>>> igt_assert_f(ret == 0,
>>>  "This failure means that something is wrong with the "
>>> @@ -384,6 +387,9 @@ void igt_system_hibernate_autoresume(void)
>>>  * seems to fare better. We need to investigate what's going on. */
>>> igt_skip_on_simulation();
>>>  
>>> +   /* skip if system doesn't support suspend-to-disk */
>>> +   igt_skip_on(system("rtcwake -n -s 90 -m disk") != 0);
>>> +
>>> /* The timeout might need to be adjusted if hibernation takes too long
>>>  * or if we have to wait excessively long before resume
>>>  */
>>>
>>
>> Are there reliable alternatives to the rtcwake alarm?
>> Maybe some AMT/MEI wakeup event or some ACPI clock thing (handwaving
>> pretty hard here)?
> 
> Depending on what the hardware supports, for hibernate to disk there's
> ipmi power-on.
> 
> ipmi-power -h $hostname --stat will show the status of the machine,
> ipmi-power -h $hostname --on will power it on.
> 
> Maybe wake-on-lan could be an option too?

If there's some way to automate these into the tests, that would be
ideal, otherwise including them in the Jenkins setup for platforms that
don't have RTC wake would be good so we can get full test coverage.

Jesse
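
The dry-run probe from the quoted patch can be sketched on its own like this.
It relies on rtcwake's -n (dry run), -s (seconds) and -m (mode) options as
used in the patch; the helper names build_rtcwake_cmd() and
suspend_mode_supported() are made up for this example and are not part of
igt.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Build "rtcwake [-n] -s <seconds> -m <mode>" into buf.
 * Returns 0 on success, -1 if the command did not fit.
 */
static int build_rtcwake_cmd(char *buf, size_t len, const char *mode,
			     int seconds, bool dry_run)
{
	int n = snprintf(buf, len, "rtcwake %s-s %d -m %s",
			 dry_run ? "-n " : "", seconds, mode);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* True if a dry run for the given mode ("mem", "disk", ...) exits with 0. */
static bool suspend_mode_supported(const char *mode, int seconds)
{
	char cmd[64];

	if (build_rtcwake_cmd(cmd, sizeof(cmd), mode, seconds, true) != 0)
		return false;

	return system(cmd) == 0;
}
```

A test would call suspend_mode_supported("mem", 30) and skip on false. An
IPMI or wake-on-LAN based wakeup would need a different probe, since rtcwake
only tells us about the RTC alarm path.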


Re: [Intel-gfx] [PATCH i-g-t 2/4] lib: Skip suspend/hibernate tests if the system doesn't support them

2015-10-23 Thread Jesse Barnes

On 10/22/2015 01:35 PM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> Do a dry run with rtcwake first to determine if the system even supports
> the intended suspend state. If not, skip the test.
> 
> Fixes a bunch of stuff on my BYT FFRD8 that doesn't support S3.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  lib/igt_aux.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/igt_aux.c b/lib/igt_aux.c
> index 04ca25b..f3c76ae 100644
> --- a/lib/igt_aux.c
> +++ b/lib/igt_aux.c
> @@ -357,6 +357,9 @@ void igt_system_suspend_autoresume(void)
>* seems to fare better. We need to investigate what's going on. */
>   igt_skip_on_simulation();
>  
> + /* skip if system doesn't support suspend-to-mem */
> + igt_skip_on(system("rtcwake -n -s 30 -m mem") != 0);
> +
>   ret = system("rtcwake -s 30 -m mem");
>   igt_assert_f(ret == 0,
>"This failure means that something is wrong with the "
> @@ -384,6 +387,9 @@ void igt_system_hibernate_autoresume(void)
>* seems to fare better. We need to investigate what's going on. */
>   igt_skip_on_simulation();
>  
> + /* skip if system doesn't support suspend-to-disk */
> + igt_skip_on(system("rtcwake -n -s 90 -m disk") != 0);
> +
>   /* The timeout might need to be adjusted if hibernation takes too long
>* or if we have to wait excessively long before resume
>*/
> 

Are there reliable alternatives to the rtcwake alarm?  Maybe some AMT/MEI 
wakeup event or some ACPI clock thing (handwaving pretty hard here)?

Jesse

Re: [Intel-gfx] [PATCH] drm/i915: respect previous reg values on primary plane disable

2015-10-13 Thread Jesse Barnes

On 10/13/2015 02:24 PM, Kevin Strasser wrote:
> On HSW the crc differs between black and disabled primary planes, causing an
> assert to fail in the kms_universal_plane test. It seems that things like
> gamma correction are causing the black primary plane case to result in a
> brighter color than the disabled primary plane case.
> 
> Only toggle the enable bit instead of clearing the control register, making
> the disable path more similar to that of the sprite plane.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89331
> Testcase: igt/kms_universal_plane
> Signed-off-by: Kevin Strasser 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index cddb0c6..b6164d8e 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2829,7 +2829,7 @@ static void ironlake_update_primary_plane(struct 
> drm_crtc *crtc,
>   int pixel_size;
>  
>   if (!visible || !fb) {
> - I915_WRITE(reg, 0);
> + I915_WRITE(reg, I915_READ(reg) & ~DISPLAY_PLANE_ENABLE);
>   I915_WRITE(DSPSURF(plane), 0);
>   POSTING_READ(reg);
>   return;

For some reason this rings a bell.  Paulo did you work on something
similar awhile back?

Anyway, hooray for fixing bugs!

Reviewed-by: Jesse Barnes 

Thanks,
Jesse
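
The difference between the old and new disable paths can be shown with a tiny
model of the control register. The register here is just a variable, and
FAKE_OTHER_SETTING is a hypothetical bit standing in for things like gamma
enable; the value of DISPLAY_PLANE_ENABLE (bit 31) is what I believe
i915_reg.h uses, but it is an assumption in this sketch.

```c
#include <stdint.h>

#define DISPLAY_PLANE_ENABLE (1u << 31) /* assumed to match i915_reg.h */
#define FAKE_OTHER_SETTING   (1u << 30) /* hypothetical "other" control bit */

/* Fake plane control register and accessors standing in for I915_READ/WRITE. */
static uint32_t fake_dspcntr;

static uint32_t reg_read(void)
{
	return fake_dspcntr;
}

static void reg_write(uint32_t val)
{
	fake_dspcntr = val;
}

/* Old behaviour: wipe the whole control register. */
static void plane_disable_clear_all(void)
{
	reg_write(0);
}

/* New behaviour: read-modify-write, clearing only the enable bit. */
static void plane_disable_keep_bits(void)
{
	reg_write(reg_read() & ~DISPLAY_PLANE_ENABLE);
}
```

With the read-modify-write version the remaining control bits survive the
disable, which is what keeps the pipe output consistent between the "black"
and "disabled" cases.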


Re: [Intel-gfx] [PATCH 33/43] drm/i915: Remove dev_priv argument from NEEDS_FORCE_WAKE

2015-10-12 Thread Jesse Barnes

On 09/18/2015 10:03 AM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
> b/drivers/gpu/drm/i915/intel_uncore.c
> index 3294f63..197ca397 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -525,7 +525,7 @@ void assert_forcewakes_inactive(struct drm_i915_private 
> *dev_priv)
>  }
>  
>  /* We give fast paths for the really cool registers */
> -#define NEEDS_FORCE_WAKE(dev_priv, reg) \
> +#define NEEDS_FORCE_WAKE(reg) \
>((reg) < 0x4 && (reg) != FORCEWAKE)
>  
>  #define REG_RANGE(reg, start, end) ((reg) >= (start) && (reg) < (end))
> @@ -727,7 +727,7 @@ static u##x \
>  gen6_read##x(struct drm_i915_private *dev_priv, off_t reg, bool trace) { \
>   GEN6_READ_HEADER(x); \
>   hsw_unclaimed_reg_debug(dev_priv, reg, true, true); \
> - if (NEEDS_FORCE_WAKE((dev_priv), (reg))) \
> + if (NEEDS_FORCE_WAKE(reg)) \
>   __force_wake_get(dev_priv, FORCEWAKE_RENDER); \
>   val = __raw_i915_read##x(dev_priv, reg); \
>   hsw_unclaimed_reg_debug(dev_priv, reg, true, false); \
> @@ -761,7 +761,7 @@ chv_read##x(struct drm_i915_private *dev_priv, off_t reg, 
> bool trace) { \
>   GEN6_READ_FOOTER; \
>  }
>  
> -#define SKL_NEEDS_FORCE_WAKE(dev_priv, reg)  \
> +#define SKL_NEEDS_FORCE_WAKE(reg) \
>((reg) < 0x4 && !FORCEWAKE_GEN9_UNCORE_RANGE_OFFSET(reg))
>  
>  #define __gen9_read(x) \
> @@ -770,9 +770,9 @@ gen9_read##x(struct drm_i915_private *dev_priv, off_t 
> reg, bool trace) { \
>   enum forcewake_domains fw_engine; \
>   GEN6_READ_HEADER(x); \
>   hsw_unclaimed_reg_debug(dev_priv, reg, true, true); \
> - if (!SKL_NEEDS_FORCE_WAKE((dev_priv), (reg)))   \
> + if (!SKL_NEEDS_FORCE_WAKE(reg)) \
>   fw_engine = 0; \
> - else if (FORCEWAKE_GEN9_RENDER_RANGE_OFFSET(reg))   \
> + else if (FORCEWAKE_GEN9_RENDER_RANGE_OFFSET(reg)) \
>   fw_engine = FORCEWAKE_RENDER; \
>   else if (FORCEWAKE_GEN9_MEDIA_RANGE_OFFSET(reg)) \
>   fw_engine = FORCEWAKE_MEDIA; \
> @@ -868,7 +868,7 @@ static void \
>  gen6_write##x(struct drm_i915_private *dev_priv, off_t reg, u##x val, bool 
> trace) { \
>   u32 __fifo_ret = 0; \
>   GEN6_WRITE_HEADER; \
> - if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
> + if (NEEDS_FORCE_WAKE(reg)) { \
>   __fifo_ret = __gen6_gt_wait_for_fifo(dev_priv); \
>   } \
>   __raw_i915_write##x(dev_priv, reg, val); \
> @@ -883,7 +883,7 @@ static void \
>  hsw_write##x(struct drm_i915_private *dev_priv, off_t reg, u##x val, bool 
> trace) { \
>   u32 __fifo_ret = 0; \
>   GEN6_WRITE_HEADER; \
> - if (NEEDS_FORCE_WAKE((dev_priv), (reg))) { \
> + if (NEEDS_FORCE_WAKE(reg)) { \
>   __fifo_ret = __gen6_gt_wait_for_fifo(dev_priv); \
>   } \
>   hsw_unclaimed_reg_debug(dev_priv, reg, false, true); \
> @@ -985,7 +985,7 @@ gen9_write##x(struct drm_i915_private *dev_priv, off_t 
> reg, u##x val, \
>   enum forcewake_domains fw_engine; \
>   GEN6_WRITE_HEADER; \
>   hsw_unclaimed_reg_debug(dev_priv, reg, false, true); \
> - if (!SKL_NEEDS_FORCE_WAKE((dev_priv), (reg)) || \
> + if (!SKL_NEEDS_FORCE_WAKE(reg) || \
>   is_gen9_shadowed(dev_priv, reg)) \
>   fw_engine = 0; \
>   else if (FORCEWAKE_GEN9_RENDER_RANGE_OFFSET(reg)) \
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 32/43] drm/i915: Clean up LVDS register handling

2015-10-12 Thread Jesse Barnes

On 09/18/2015 10:03 AM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> Keep single 'lvds_reg' and 'lvds' variable around in
> intel_lvds_init(), and read it just once at the start.
> 
> Also intel_lvds_get_config() doesn't need to figure out which reg to use
> since it can just consult lvds_encoder->reg.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_lvds.c | 30 ++
>  1 file changed, 14 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lvds.c 
> b/drivers/gpu/drm/i915/intel_lvds.c
> index 2c2d1f0..35bad71 100644
> --- a/drivers/gpu/drm/i915/intel_lvds.c
> +++ b/drivers/gpu/drm/i915/intel_lvds.c
> @@ -98,15 +98,11 @@ static void intel_lvds_get_config(struct intel_encoder 
> *encoder,
>  {
>   struct drm_device *dev = encoder->base.dev;
>   struct drm_i915_private *dev_priv = dev->dev_private;
> - u32 lvds_reg, tmp, flags = 0;
> + struct intel_lvds_encoder *lvds_encoder = 
> to_lvds_encoder(&encoder->base);
> + u32 tmp, flags = 0;
>   int dotclock;
>  
> - if (HAS_PCH_SPLIT(dev))
> - lvds_reg = PCH_LVDS;
> - else
> - lvds_reg = LVDS;
> -
> - tmp = I915_READ(lvds_reg);
> + tmp = I915_READ(lvds_encoder->reg);
>   if (tmp & LVDS_HSYNC_POLARITY)
>   flags |= DRM_MODE_FLAG_NHSYNC;
>   else
> @@ -944,6 +940,7 @@ void intel_lvds_init(struct drm_device *dev)
>   struct drm_display_mode *downclock_mode = NULL;
>   struct edid *edid;
>   struct drm_crtc *crtc;
> + u32 lvds_reg;
>   u32 lvds;
>   int pipe;
>   u8 pin;
> @@ -966,8 +963,15 @@ void intel_lvds_init(struct drm_device *dev)
>   if (dmi_check_system(intel_no_lvds))
>   return;
>  
> + if (HAS_PCH_SPLIT(dev))
> + lvds_reg = PCH_LVDS;
> + else
> + lvds_reg = LVDS;
> +
> + lvds = I915_READ(lvds_reg);
> +
>   if (HAS_PCH_SPLIT(dev)) {
> - if ((I915_READ(PCH_LVDS) & LVDS_DETECTED) == 0)
> + if ((lvds & LVDS_DETECTED) == 0)
>   return;
>   if (dev_priv->vbt.edp_support) {
>   DRM_DEBUG_KMS("disable LVDS for eDP support\n");
> @@ -977,8 +981,7 @@ void intel_lvds_init(struct drm_device *dev)
>  
>   pin = GMBUS_PIN_PANEL;
>   if (!lvds_is_present_in_vbt(dev, &pin)) {
> - u32 reg = HAS_PCH_SPLIT(dev) ? PCH_LVDS : LVDS;
> - if ((I915_READ(reg) & LVDS_PORT_EN) == 0) {
> + if ((lvds & LVDS_PORT_EN) == 0) {
>   DRM_DEBUG_KMS("LVDS is not present in VBT\n");
>   return;
>   }
> @@ -1055,11 +1058,7 @@ void intel_lvds_init(struct drm_device *dev)
>   connector->interlace_allowed = false;
>   connector->doublescan_allowed = false;
>  
> - if (HAS_PCH_SPLIT(dev)) {
> - lvds_encoder->reg = PCH_LVDS;
> - } else {
> - lvds_encoder->reg = LVDS;
> - }
> + lvds_encoder->reg = lvds_reg;
>  
>   /* create the scaling mode property */
>   drm_mode_create_scaling_mode_property(dev);
> @@ -1140,7 +1139,6 @@ void intel_lvds_init(struct drm_device *dev)
>   if (HAS_PCH_SPLIT(dev))
>   goto failed;
>  
> - lvds = I915_READ(LVDS);
>   pipe = (lvds & LVDS_PIPEB_SELECT) ? 1 : 0;
>   crtc = intel_get_crtc_for_pipe(dev, pipe);
>  
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH v2 31/43] drm/i915: Throw out some useless variables

2015-10-12 Thread Jesse Barnes

   val = I915_READ(reg);
> + u32 val = I915_READ(DVSCNTR(pipe));
>   I915_STATE_WARN(val & DVS_ENABLE,
>"sprite %c assertion failure, should be off on pipe %c but 
> is still active\n",
>plane_name(pipe), pipe_name(pipe));
> @@ -1441,12 +1416,10 @@ static void ibx_assert_pch_refclk_enabled(struct 
> drm_i915_private *dev_priv)
>  static void assert_pch_transcoder_disabled(struct drm_i915_private *dev_priv,
>  enum pipe pipe)
>  {
> - int reg;
>   u32 val;
>   bool enabled;
>  
> - reg = PCH_TRANSCONF(pipe);
> - val = I915_READ(reg);
> + val = I915_READ(PCH_TRANSCONF(pipe));
>   enabled = !!(val & TRANS_ENABLE);
>   I915_STATE_WARN(enabled,
>"transcoder assertion failed, should be off on pipe %c but is 
> still active\n",
> @@ -1553,21 +1526,18 @@ static void assert_pch_hdmi_disabled(struct 
> drm_i915_private *dev_priv,
>  static void assert_pch_ports_disabled(struct drm_i915_private *dev_priv,
> enum pipe pipe)
>  {
> - int reg;
>   u32 val;
>  
>   assert_pch_dp_disabled(dev_priv, pipe, PCH_DP_B, TRANS_DP_PORT_SEL_B);
>   assert_pch_dp_disabled(dev_priv, pipe, PCH_DP_C, TRANS_DP_PORT_SEL_C);
>   assert_pch_dp_disabled(dev_priv, pipe, PCH_DP_D, TRANS_DP_PORT_SEL_D);
>  
> - reg = PCH_ADPA;
> - val = I915_READ(reg);
> + val = I915_READ(PCH_ADPA);
>   I915_STATE_WARN(adpa_pipe_enabled(dev_priv, pipe, val),
>"PCH VGA enabled on transcoder %c, should be disabled\n",
>pipe_name(pipe));
>  
> - reg = PCH_LVDS;
> - val = I915_READ(reg);
> + val = I915_READ(PCH_LVDS);
>   I915_STATE_WARN(lvds_pipe_enabled(dev_priv, pipe, val),
>"PCH LVDS enabled on transcoder %c, should be disabled\n",
>pipe_name(pipe));
> @@ -14864,13 +14834,12 @@ intel_check_plane_mapping(struct intel_crtc *crtc)
>  {
>   struct drm_device *dev = crtc->base.dev;
>   struct drm_i915_private *dev_priv = dev->dev_private;
> - u32 reg, val;
> + u32 val;
>  
>   if (INTEL_INFO(dev)->num_pipes == 1)
>   return true;
>  
> - reg = DSPCNTR(!crtc->plane);
> - val = I915_READ(reg);
> + val = I915_READ(DSPCNTR(!crtc->plane));
>  
>   if ((val & DISPLAY_PLANE_ENABLE) &&
>   (!!(val & DISPPLANE_SEL_PIPE_MASK) == crtc->pipe))
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 7e64555..0b9f973 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -574,8 +574,6 @@ static int edp_notify_handler(struct notifier_block 
> *this, unsigned long code,
>edp_notifier);
>   struct drm_device *dev = intel_dp_to_dev(intel_dp);
>   struct drm_i915_private *dev_priv = dev->dev_private;
> - u32 pp_div;
> - u32 pp_ctrl_reg, pp_div_reg;
>  
>   if (!is_edp(intel_dp) || code != SYS_RESTART)
>   return 0;
> @@ -584,6 +582,8 @@ static int edp_notify_handler(struct notifier_block 
> *this, unsigned long code,
>  
>   if (IS_VALLEYVIEW(dev)) {
>   enum pipe pipe = vlv_power_sequencer_pipe(intel_dp);
> + u32 pp_ctrl_reg, pp_div_reg;
> + u32 pp_div;
>  
>   pp_ctrl_reg = VLV_PIPE_PP_CONTROL(pipe);
>   pp_div_reg  = VLV_PIPE_PP_DIVISOR(pipe);
> @@ -5526,7 +5526,6 @@ static void intel_dp_set_drrs_state(struct drm_device 
> *dev, int refresh_rate)
>   struct intel_dp *intel_dp = dev_priv->drrs.dp;
>   struct intel_crtc_state *config = NULL;
>   struct intel_crtc *intel_crtc = NULL;
> - u32 reg, val;
>   enum drrs_refresh_rate_type index = DRRS_HIGH_RR;
>  
>   if (refresh_rate <= 0) {
> @@ -5588,9 +5587,10 @@ static void intel_dp_set_drrs_state(struct drm_device 
> *dev, int refresh_rate)
>   DRM_ERROR("Unsupported refreshrate type\n");
>   }
>   } else if (INTEL_INFO(dev)->gen > 6) {
> - reg = PIPECONF(intel_crtc->config->cpu_transcoder);
> - val = I915_READ(reg);
> + u32 reg = PIPECONF(intel_crtc->config->cpu_transcoder);
> + u32 val;
>  
> + val = I915_READ(reg);
>   if (index > DRRS_HIGH_RR) {
>   if (IS_VALLEYVIEW(dev))
>   val |= PIPECONF_EDP_RR_MODE_SWITCH_VLV;
> 

Reviewed-by: Jesse Barnes 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 30/43] drm/i915: Parametrize and fix SWF registers

2015-10-12 Thread Jesse Barnes

}
> + for (i = 0; i < 3; i++)
> + dev_priv->regfile.saveSWF3[i] = I915_READ(SWF3(i));
>   }
> - for (i = 0; i < 3; i++)
> - dev_priv->regfile.saveSWF2[i] = I915_READ(SWF30 + (i << 2));
>  
>   mutex_unlock(&dev->struct_mutex);
>  
> @@ -156,12 +168,25 @@ int i915_restore_state(struct drm_device *dev)
>   /* Memory arbitration state */
>   I915_WRITE(MI_ARB_STATE, dev_priv->regfile.saveMI_ARB_STATE | 
> 0x);
>  
> - for (i = 0; i < 16; i++) {
> - I915_WRITE(SWF00 + (i << 2), dev_priv->regfile.saveSWF0[i]);
> - I915_WRITE(SWF10 + (i << 2), dev_priv->regfile.saveSWF1[i]);
> + /* Scratch space */
> + if (IS_GEN2(dev_priv) && IS_MOBILE(dev_priv)) {
> + for (i = 0; i < 7; i++) {
> + I915_WRITE(SWF0(i), dev_priv->regfile.saveSWF0[i]);
> + I915_WRITE(SWF1(i), dev_priv->regfile.saveSWF1[i]);
> + }
> + for (i = 0; i < 3; i++)
> + I915_WRITE(SWF3(i), dev_priv->regfile.saveSWF3[i]);
> + } else if (IS_GEN2(dev_priv)) {
> + for (i = 0; i < 7; i++)
> + I915_WRITE(SWF1(i), dev_priv->regfile.saveSWF1[i]);
> + } else if (HAS_GMCH_DISPLAY(dev_priv)) {
> + for (i = 0; i < 16; i++) {
> + I915_WRITE(SWF0(i), dev_priv->regfile.saveSWF0[i]);
> + I915_WRITE(SWF1(i), dev_priv->regfile.saveSWF1[i]);
> + }
> + for (i = 0; i < 3; i++)
> + I915_WRITE(SWF3(i), dev_priv->regfile.saveSWF3[i]);
>   }
> - for (i = 0; i < 3; i++)
> - I915_WRITE(SWF30 + (i << 2), dev_priv->regfile.saveSWF2[i]);
>  
>   mutex_unlock(&dev->struct_mutex);
>  
> 

I think these were added speculatively in the first place.  Maybe we'd
see a bug on 8xx without these saved & restored, but I wonder if we'd
see anything else?

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 29/43] drm/i915: s/PIPE_FRMCOUNT_GM45/PIPE_FRMCOUNT_G4X/ etc.

2015-10-12 Thread Jesse Barnes

On 09/18/2015 10:03 AM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> The PIPE_FRMCOUNT_GM45 and PIPE_FLIPCOUNT_GM45 names have bothered me
> for a long time. They work equally well for ELK and onwards, so let's
> s/GM45/G4X/.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/i915_irq.c  |  6 +++---
>  drivers/gpu/drm/i915/i915_reg.h  | 12 ++--
>  drivers/gpu/drm/i915/intel_display.c |  4 ++--
>  3 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 24f68de..4b61a42 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -671,10 +671,10 @@ static u32 i915_get_vblank_counter(struct drm_device 
> *dev, int pipe)
>   return (((high1 << 8) | low) + (pixel >= vbl_start)) & 0xff;
>  }
>  
> -static u32 gm45_get_vblank_counter(struct drm_device *dev, int pipe)
> +static u32 g4x_get_vblank_counter(struct drm_device *dev, int pipe)
>  {
>   struct drm_i915_private *dev_priv = dev->dev_private;
> - int reg = PIPE_FRMCOUNT_GM45(pipe);
> + int reg = PIPE_FRMCOUNT_G4X(pipe);
>  
>   return I915_READ(reg);
>  }
> @@ -4311,7 +4311,7 @@ void intel_irq_init(struct drm_i915_private *dev_priv)
>   dev->driver->get_vblank_counter = i8xx_get_vblank_counter;
>   } else if (IS_G4X(dev_priv) || INTEL_INFO(dev_priv)->gen >= 5) {
>   dev->max_vblank_count = 0x; /* full 32 bit counter */
> - dev->driver->get_vblank_counter = gm45_get_vblank_counter;
> + dev->driver->get_vblank_counter = g4x_get_vblank_counter;
>   } else {
>   dev->driver->get_vblank_counter = i915_get_vblank_counter;
>   dev->max_vblank_count = 0xff; /* only 24 bits of frame 
> count */
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 02f0935..0cc41e4b 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -4796,10 +4796,10 @@ enum skl_disp_power_wells {
>  #define   PIPE_PIXEL_MASK 0x00ff
>  #define   PIPE_PIXEL_SHIFT0
>  /* GM45+ just has to be different */
> -#define _PIPEA_FRMCOUNT_GM45 0x70040
> -#define _PIPEA_FLIPCOUNT_GM450x70044
> -#define PIPE_FRMCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FRMCOUNT_GM45)
> -#define PIPE_FLIPCOUNT_GM45(pipe) _PIPE2(pipe, _PIPEA_FLIPCOUNT_GM45)
> +#define _PIPEA_FRMCOUNT_G4X  0x70040
> +#define _PIPEA_FLIPCOUNT_G4X 0x70044
> +#define PIPE_FRMCOUNT_G4X(pipe) _PIPE2(pipe, _PIPEA_FRMCOUNT_G4X)
> +#define PIPE_FLIPCOUNT_G4X(pipe) _PIPE2(pipe, _PIPEA_FLIPCOUNT_G4X)
>  
>  /* Cursor A & B regs */
>  #define _CURACNTR0x70080
> @@ -4962,8 +4962,8 @@ enum skl_disp_power_wells {
>  #define _PIPEBSTAT   (dev_priv->info.display_mmio_offset + 0x71024)
>  #define _PIPEBFRAMEHIGH  0x71040
>  #define _PIPEBFRAMEPIXEL 0x71044
> -#define _PIPEB_FRMCOUNT_GM45 (dev_priv->info.display_mmio_offset + 0x71040)
> -#define _PIPEB_FLIPCOUNT_GM45(dev_priv->info.display_mmio_offset + 
> 0x71044)
> +#define _PIPEB_FRMCOUNT_G4X  (dev_priv->info.display_mmio_offset + 0x71040)
> +#define _PIPEB_FLIPCOUNT_G4X (dev_priv->info.display_mmio_offset + 0x71044)
>  
>  
>  /* Display B control */
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index 92e624b..0074781 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -10769,7 +10769,7 @@ static bool page_flip_finished(struct intel_crtc 
> *crtc)
>*/
>   return (I915_READ(DSPSURFLIVE(crtc->plane)) & ~0xfff) ==
>   crtc->unpin_work->gtt_offset &&
> - 
> g4x_flip_count_after_eq(I915_READ(PIPE_FLIPCOUNT_GM45(crtc->pipe)),
> + 
> g4x_flip_count_after_eq(I915_READ(PIPE_FLIPCOUNT_G4X(crtc->pipe)),
>   crtc->unpin_work->flip_count);
>  }
>  
> @@ -11374,7 +11374,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
>   intel_crtc->reset_counter = 
> atomic_read(&dev_priv->gpu_error.reset_counter);
>  
>   if (INTEL_INFO(dev)->gen >= 5 || IS_G4X(dev))
> - work->flip_count = I915_READ(PIPE_FLIPCOUNT_GM45(pipe)) + 1;
> + work->flip_count = I915_READ(PIPE_FLIPCOUNT_G4X(pipe)) + 1;
>  
>   if (IS_VALLEYVIEW(dev)) {
>   ring = &dev_priv->ring[BCS];
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 28/43] drm/i915: Turn GEN5_ASSERT_IIR_IS_ZERO() into a function

2015-10-12 Thread Jesse Barnes

On 09/18/2015 10:03 AM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 31 +--
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 16948b2..24f68de 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -139,27 +139,30 @@ static const u32 hpd_bxt[HPD_NUM_PINS] = {
>  /*
>   * We should clear IMR at preinstall/uninstall, and just check at 
> postinstall.
>   */
> -#define GEN5_ASSERT_IIR_IS_ZERO(reg) do { \
> - u32 val = I915_READ(reg); \
> - if (val) { \
> - WARN(1, "Interrupt register 0x%x is not zero: 0x%08x\n", \
> -  (reg), val); \
> - I915_WRITE((reg), 0x); \
> - POSTING_READ(reg); \
> - I915_WRITE((reg), 0x); \
> - POSTING_READ(reg); \
> - } \
> -} while (0)
> +static void gen5_assert_iir_is_zero(struct drm_i915_private *dev_priv, u32 
> reg)
> +{
> + u32 val = I915_READ(reg);
> +
> + if (val == 0)
> + return;
> +
> + WARN(1, "Interrupt register 0x%x is not zero: 0x%08x\n",
> +  reg, val);
> + I915_WRITE(reg, 0x);
> + POSTING_READ(reg);
> + I915_WRITE(reg, 0x);
> + POSTING_READ(reg);
> +}
>  
>  #define GEN8_IRQ_INIT_NDX(type, which, imr_val, ier_val) do { \
> - GEN5_ASSERT_IIR_IS_ZERO(GEN8_##type##_IIR(which)); \
> + gen5_assert_iir_is_zero(dev_priv, GEN8_##type##_IIR(which)); \
>   I915_WRITE(GEN8_##type##_IER(which), (ier_val)); \
>   I915_WRITE(GEN8_##type##_IMR(which), (imr_val)); \
>   POSTING_READ(GEN8_##type##_IMR(which)); \
>  } while (0)
>  
>  #define GEN5_IRQ_INIT(type, imr_val, ier_val) do { \
> - GEN5_ASSERT_IIR_IS_ZERO(type##IIR); \
> + gen5_assert_iir_is_zero(dev_priv, type##IIR); \
>   I915_WRITE(type##IER, (ier_val)); \
>   I915_WRITE(type##IMR, (imr_val)); \
>   POSTING_READ(type##IMR); \
> @@ -3276,7 +3279,7 @@ static void ibx_irq_postinstall(struct drm_device *dev)
>   else
>   mask = SDE_GMBUS_CPT | SDE_AUX_MASK_CPT;
>  
> - GEN5_ASSERT_IIR_IS_ZERO(SDEIIR);
> + gen5_assert_iir_is_zero(dev_priv, SDEIIR);
>   I915_WRITE(SDEIMR, ~mask);
>  }
>  
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 27/43] drm/i915: Fix a few bad hex numbers in register defines

2015-10-12 Thread Jesse Barnes

On 09/18/2015 10:03 AM, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> A few register mask defines were missing the '0x' from hex numbers. Or
> at least I assume those were meant to be hex numbers. Put the '0x' in
> place.
> 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/i915_reg.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 21d49e7..02f0935 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -4234,7 +4234,7 @@ enum skl_disp_power_wells {
>  #define   DP_AUX_CH_CTL_PSR_DATA_AUX_REG_SKL (1 << 14)
>  #define   DP_AUX_CH_CTL_FS_DATA_AUX_REG_SKL  (1 << 13)
>  #define   DP_AUX_CH_CTL_GTC_DATA_AUX_REG_SKL (1 << 12)
> -#define   DP_AUX_CH_CTL_FW_SYNC_PULSE_SKL_MASK (1f << 5)
> +#define   DP_AUX_CH_CTL_FW_SYNC_PULSE_SKL_MASK (0x1f << 5)
>  #define   DP_AUX_CH_CTL_FW_SYNC_PULSE_SKL(c) (((c) - 1) << 5)
>  #define   DP_AUX_CH_CTL_SYNC_PULSE_SKL(c)   ((c) - 1)
>  
> @@ -7819,7 +7819,7 @@ enum skl_disp_power_wells {
>  #define  VIRTUAL_CHANNEL_SHIFT   6
>  #define  VIRTUAL_CHANNEL_MASK(3 << 6)
>  #define  DATA_TYPE_SHIFT 0
> -#define  DATA_TYPE_MASK  (3f << 0)
> +#define  DATA_TYPE_MASK  (0x3f << 0)
>  /* data type values, see include/video/mipi_display.h */
>  
>  #define _MIPIA_GEN_FIFO_STAT     (dev_priv->mipi_mmio_base + 0xb074)
> 

Hah!  Maybe they're supposed to be floats though!  We should use more
floats in masks in general I believe.

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 26/43] drm/i915: Protect register macro arguments

2015-10-12 Thread Jesse Barnes

 (1<<16)
> -#define  GEN8_DE_PIPE_IRQ(pipe)  (1<<(16+pipe))
> +#define  GEN8_DE_PIPE_IRQ(pipe)  (1<<(16+(pipe)))
>  #define  GEN8_GT_VECS_IRQ(1<<6)
>  #define  GEN8_GT_PM_IRQ  (1<<4)
>  #define  GEN8_GT_VCS2_IRQ(1<<3)
> @@ -5763,7 +5763,7 @@ enum skl_disp_power_wells {
>  #define  GEN9_PIPE_PLANE3_FLIP_DONE  (1 << 5)
>  #define  GEN9_PIPE_PLANE2_FLIP_DONE  (1 << 4)
>  #define  GEN9_PIPE_PLANE1_FLIP_DONE  (1 << 3)
> -#define  GEN9_PIPE_PLANE_FLIP_DONE(p)(1 << (3 + p))
> +#define  GEN9_PIPE_PLANE_FLIP_DONE(p)(1 << (3 + (p)))
>  #define GEN8_DE_PIPE_IRQ_FAULT_ERRORS \
>   (GEN8_PIPE_CURSOR_FAULT | \
>GEN8_PIPE_SPRITE_FAULT | \
> @@ -6022,7 +6022,7 @@ enum skl_disp_power_wells {
>  #define  SERR_INT_TRANS_C_FIFO_UNDERRUN  (1<<6)
>  #define  SERR_INT_TRANS_B_FIFO_UNDERRUN  (1<<3)
>  #define  SERR_INT_TRANS_A_FIFO_UNDERRUN  (1<<0)
> -#define  SERR_INT_TRANS_FIFO_UNDERRUN(pipe)  (1<<(pipe*3))
> +#define  SERR_INT_TRANS_FIFO_UNDERRUN(pipe)  (1<<((pipe)*3))
>  
>  /* digital port hotplug */
>  #define PCH_PORT_HOTPLUG 0xc4030 /* SHOTPLUG_CTL */
> @@ -6133,9 +6133,9 @@ enum skl_disp_power_wells {
>  #define PCH_SSC4_AUX_PARMS  0xc6214
>  
>  #define PCH_DPLL_SEL 0xc7000
> -#define   TRANS_DPLLB_SEL(pipe)  (1 << (pipe * 4))
> +#define   TRANS_DPLLB_SEL(pipe)  (1 << ((pipe) * 4))
>  #define   TRANS_DPLLA_SEL(pipe)  0
> -#define  TRANS_DPLL_ENABLE(pipe) (1 << (pipe * 4 + 3))
> +#define  TRANS_DPLL_ENABLE(pipe) (1 << ((pipe) * 4 + 3))
>  
>  /* transcoder */
>  
> @@ -7295,7 +7295,7 @@ enum skl_disp_power_wells {
>  #define TRANS_CLK_SEL(tran) _TRANSCODER(tran, TRANS_CLK_SEL_A, 
> TRANS_CLK_SEL_B)
>  /* For each transcoder, we need to select the corresponding port clock */
>  #define  TRANS_CLK_SEL_DISABLED  (0x0<<29)
> -#define  TRANS_CLK_SEL_PORT(x)   ((x+1)<<29)
> +#define  TRANS_CLK_SEL_PORT(x)   (((x)+1)<<29)
>  
>  #define TRANSA_MSA_MISC  0x60410
>  #define TRANSB_MSA_MISC  0x61410
> @@ -7368,10 +7368,10 @@ enum skl_disp_power_wells {
>  
>  /* DPLL control2 */
>  #define DPLL_CTRL2   0x6C05C
> -#define  DPLL_CTRL2_DDI_CLK_OFF(port)(1<<(port+15))
> +#define  DPLL_CTRL2_DDI_CLK_OFF(port)(1<<((port)+15))
>  #define  DPLL_CTRL2_DDI_CLK_SEL_MASK(port)   (3<<((port)*3+1))
>  #define  DPLL_CTRL2_DDI_CLK_SEL_SHIFT(port)((port)*3+1)
> -#define  DPLL_CTRL2_DDI_CLK_SEL(clk, port)   (clk<<((port)*3+1))
> +#define  DPLL_CTRL2_DDI_CLK_SEL(clk, port)   ((clk)<<((port)*3+1))
>  #define  DPLL_CTRL2_DDI_SEL_OVERRIDE(port) (1<<((port)*3))
>  
>  /* DPLL Status */
> @@ -7384,23 +7384,23 @@ enum skl_disp_power_wells {
>  #define DPLL3_CFGCR1 0x6C050
>  #define  DPLL_CFGCR1_FREQ_ENABLE (1<<31)
>  #define  DPLL_CFGCR1_DCO_FRACTION_MASK   (0x7fff<<9)
> -#define  DPLL_CFGCR1_DCO_FRACTION(x) (x<<9)
> +#define  DPLL_CFGCR1_DCO_FRACTION(x) ((x)<<9)
>  #define  DPLL_CFGCR1_DCO_INTEGER_MASK(0x1ff)
>  
>  #define DPLL1_CFGCR2 0x6C044
>  #define DPLL2_CFGCR2 0x6C04C
>  #define DPLL3_CFGCR2 0x6C054
>  #define  DPLL_CFGCR2_QDIV_RATIO_MASK (0xff<<8)
> -#define  DPLL_CFGCR2_QDIV_RATIO(x)   (x<<8)
> -#define  DPLL_CFGCR2_QDIV_MODE(x)(x<<7)
> +#define  DPLL_CFGCR2_QDIV_RATIO(x)   ((x)<<8)
> +#define  DPLL_CFGCR2_QDIV_MODE(x)((x)<<7)
>  #define  DPLL_CFGCR2_KDIV_MASK   (3<<5)
> -#define  DPLL_CFGCR2_KDIV(x) (x<<5)
> +#define  DPLL_CFGCR2_KDIV(x) ((x)<<5)
>  #define  DPLL_CFGCR2_KDIV_5 (0<<5)
>  #define  DPLL_CFGCR2_KDIV_2 (1<<5)
>  #define  DPLL_CFGCR2_KDIV_3 (2<<5)
>  #define  DPLL_CFGCR2_KDIV_1 (3<<5)
>  #define  DPLL_CFGCR2_PDIV_MASK   (7<<2)
> -#define  DPLL_CFGCR2_PDIV(x) (x<<2)
> +#define  DPLL_CFGCR2_PDIV(x) ((x)<<2)
>  #define  DPLL_CFGCR2_PDIV_1 (0<<2)
>  #define  DPLL_CFGCR2_PDIV_2 (1<<2)
>  #define  DPLL_CFGCR2_PDIV_3 (2<<2)
> 

Reviewed-by: Jesse Barnes 
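
For anyone skimming the diff above, here is a minimal standalone sketch of why the extra parentheses matter. The macro and function names are hypothetical stand-ins modelled on the TRANS_DPLLB_SEL() hunk, not the real i915 defines; only the expansion behaviour is the point.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the pattern in the diff; not the real i915 names. */
#define SEL_UNPROTECTED(pipe)	(1u << (pipe * 4))	/* argument pasted in raw */
#define SEL_PROTECTED(pipe)	(1u << ((pipe) * 4))	/* argument parenthesized */

/* Both helpers pass the expression "pipe + 1" as the macro argument. */
static inline uint32_t sel_unprotected_next(unsigned int pipe)
{
	/* Expands to 1u << (pipe + 1 * 4): a shift by pipe + 4. */
	return SEL_UNPROTECTED(pipe + 1);
}

static inline uint32_t sel_protected_next(unsigned int pipe)
{
	/* Expands to 1u << ((pipe + 1) * 4): the intended shift. */
	return SEL_PROTECTED(pipe + 1);
}
```

With pipe == 1 the unprotected variant shifts by 5 and the protected one by 8. With pipe == 0 both happen to agree, which is how this kind of bug can hide as long as callers only pass plain identifiers or constants.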

Re: [Intel-gfx] [PATCH 25/43] drm/i915: Include gpio_mmio_base in GMBUS reg defines

2015-10-12 Thread Jesse Barnes

> - I915_WRITE(GMBUS0 + reg_offset, bus->reg0);
> + I915_WRITE(GMBUS0, bus->reg0);
>  
>   for (; i < num; i += inc) {
>   inc = 1;
> @@ -530,7 +522,7 @@ retry:
>* a STOP on the very first cycle. To simplify the code we
>* unconditionally generate the STOP condition with an additional gmbus
>* cycle. */
> - I915_WRITE(GMBUS1 + reg_offset, GMBUS_CYCLE_STOP | GMBUS_SW_RDY);
> + I915_WRITE(GMBUS1, GMBUS_CYCLE_STOP | GMBUS_SW_RDY);
>  
>   /* Mark the GMBUS interface as disabled after waiting for idle.
>* We will re-enable it at the start of the next xfer,
> @@ -541,7 +533,7 @@ retry:
>adapter->name);
>   ret = -ETIMEDOUT;
>   }
> - I915_WRITE(GMBUS0 + reg_offset, 0);
> + I915_WRITE(GMBUS0, 0);
>   ret = ret ?: i;
>   goto out;
>  
> @@ -570,9 +562,9 @@ clear_err:
>* of resetting the GMBUS controller and so clearing the
>* BUS_ERROR raised by the slave's NAK.
>*/
> - I915_WRITE(GMBUS1 + reg_offset, GMBUS_SW_CLR_INT);
> - I915_WRITE(GMBUS1 + reg_offset, 0);
> - I915_WRITE(GMBUS0 + reg_offset, 0);
> + I915_WRITE(GMBUS1, GMBUS_SW_CLR_INT);
> + I915_WRITE(GMBUS1, 0);
> + I915_WRITE(GMBUS0, 0);
>  
>   DRM_DEBUG_KMS("GMBUS [%s] NAK for addr: %04x %c(%d)\n",
>adapter->name, msgs[i].addr,
> @@ -595,7 +587,7 @@ clear_err:
>  timeout:
>   DRM_INFO("GMBUS [%s] timed out, falling back to bit banging on pin 
> %d\n",
>bus->adapter.name, bus->reg0 & 0xff);
> - I915_WRITE(GMBUS0 + reg_offset, 0);
> + I915_WRITE(GMBUS0, 0);
>  
>   /* Hardware may not support GMBUS over these pins? Try GPIO bitbanging 
> instead. */
>   bus->force_bit = 1;
> 

Reviewed-by: Jesse Barnes 

Re: [Intel-gfx] [PATCH 24/43] drm/i915: Parametrize HSW video DIP data registers

2015-10-12 Thread Jesse Barnes

   type, i >> 2), 0);
>   mmiowb();
>  
>   val |= hsw_infoframe_enable(type);
> diff --git a/drivers/gpu/drm/i915/intel_psr.c 
> b/drivers/gpu/drm/i915/intel_psr.c
> index a04b4dc..213581c 100644
> --- a/drivers/gpu/drm/i915/intel_psr.c
> +++ b/drivers/gpu/drm/i915/intel_psr.c
> @@ -73,14 +73,14 @@ static bool vlv_is_psr_active_on_pipe(struct drm_device 
> *dev, int pipe)
>  }
>  
>  static void intel_psr_write_vsc(struct intel_dp *intel_dp,
> - struct edp_vsc_psr *vsc_psr)
> + const struct edp_vsc_psr *vsc_psr)
>  {
>   struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
>   struct drm_device *dev = dig_port->base.base.dev;
>   struct drm_i915_private *dev_priv = dev->dev_private;
>   struct intel_crtc *crtc = to_intel_crtc(dig_port->base.base.crtc);
> - u32 ctl_reg = HSW_TVIDEO_DIP_CTL(crtc->config->cpu_transcoder);
> - u32 data_reg = HSW_TVIDEO_DIP_VSC_DATA(crtc->config->cpu_transcoder);
> + enum transcoder cpu_transcoder = crtc->config->cpu_transcoder;
> + u32 ctl_reg = HSW_TVIDEO_DIP_CTL(cpu_transcoder);
>   uint32_t *data = (uint32_t *) vsc_psr;
>   unsigned int i;
>  
> @@ -90,12 +90,14 @@ static void intel_psr_write_vsc(struct intel_dp *intel_dp,
>   I915_WRITE(ctl_reg, 0);
>   POSTING_READ(ctl_reg);
>  
> - for (i = 0; i < VIDEO_DIP_VSC_DATA_SIZE; i += 4) {
> - if (i < sizeof(struct edp_vsc_psr))
> - I915_WRITE(data_reg + i, *data++);
> - else
> - I915_WRITE(data_reg + i, 0);
> + for (i = 0; i < sizeof(*vsc_psr); i += 4) {
> + I915_WRITE(HSW_TVIDEO_DIP_VSC_DATA(cpu_transcoder,
> +i >> 2), *data);
> + data++;
>   }
> + for (; i < VIDEO_DIP_VSC_DATA_SIZE; i += 4)
> + I915_WRITE(HSW_TVIDEO_DIP_VSC_DATA(cpu_transcoder,
> +i >> 2), 0);
>  
>   I915_WRITE(ctl_reg, VIDEO_DIP_ENABLE_VSC_HSW);
>   POSTING_READ(ctl_reg);
> 

Since you fixed the macro to use a *4 for the reg index, it might be
clearer to fix up the loop to just use i++ instead?  I guess you'd then
have to divide the condition, so meh (or maybe we need a DWORDS(bytes)
macro!).  Either way:

Reviewed-by: Jesse Barnes 
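
A rough sketch of the i++ variant suggested above, as a userspace model only. DWORDS(), VSC_DATA_SIZE and write_vsc_dwords() are hypothetical names made up for illustration; the regs[] array stands in for the HSW_TVIDEO_DIP_VSC_DATA(cpu_transcoder, i) registers, and the 36-byte size is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: byte count -> number of dwords, rounded up. */
#define DWORDS(bytes)	(((bytes) + 3) / 4)

/* Assumed size of the DIP data area in bytes, for illustration only. */
#define VSC_DATA_SIZE	36

/*
 * Model of the write loop using a dword index. 'regs' must hold
 * DWORDS(VSC_DATA_SIZE) entries. The payload is written first and the
 * remaining dwords are zeroed, matching the two loops in the patch.
 */
static void write_vsc_dwords(uint32_t *regs, const void *payload,
			     size_t payload_bytes)
{
	const uint8_t *src = payload;
	size_t i;

	/* Never write past the data area. */
	if (payload_bytes > VSC_DATA_SIZE)
		payload_bytes = VSC_DATA_SIZE;

	for (i = 0; i < DWORDS(payload_bytes); i++) {
		uint32_t val = 0;
		size_t left = payload_bytes - i * 4;

		/* Copy at most one dword; a short tail is zero padded. */
		memcpy(&val, src + i * 4, left < 4 ? left : 4);
		regs[i] = val;
	}
	for (; i < DWORDS(VSC_DATA_SIZE); i++)
		regs[i] = 0;
}
```

The index now maps directly onto the register macro's second argument, at the cost of dividing the loop bounds, which is the trade-off mentioned in the review.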

Re: [Intel-gfx] [PATCH 6/9] drm/i915: driver based PASID handling

2015-10-09 Thread Jesse Barnes

On 10/09/2015 02:07 AM, David Woodhouse wrote:
> On Fri, 2015-10-09 at 10:47 +0200, Daniel Vetter wrote:
>> On Fri, Oct 09, 2015 at 08:56:24AM +0100, David Woodhouse wrote:
>>> On Fri, 2015-10-09 at 09:28 +0200, Daniel Vetter wrote:

>>>> Hm if this still works the same way as on older platforms then pagefaults
>>>> just read all 0 and writes go nowhere from the gpu. That generally also
>>>> explains ever-increasing numbers of the CS execution pointer since it's
>>>> busy churning through 48b worth of address space filled with MI_NOP. I'd
>>>> have hoped our hw would do better than that with svm ...
>>>
>>> I'm looking at simple cases like Jesse's 'gem_svm_fault' test. If the
>>> access to process address space (a single dword write) does nothing,
>>> I'm not sure why it would then churn through MI_NOOPs; why would the
>>> batch still not complete?
>>
>> Yeah that testcase doesn't fit, the one I had in mind is where the batch
>> itself faults and the CS just reads MI_NOP forever. No idea why the gpu
>> just keeps walking through the address space here. Puzzling.
> 
> Does it just keep walking through the address space?
> 
> When I hacked my page request handler to *not* service the fault and
> just say it failed, the batch did seem to complete as normal. Just
> without doing the write, as you described.

My understanding is that this behavior will depend on how we submit the
work.  We have two faulting modes: halt and stream.  In either case, the
context that faults will be switched out, and the hardware will either
wait for a resubmit (the halt case) to restart the context, or switch to
the next context in the execlist queue.

If the fault is then serviced by the IOMMU layer, potentially as an
error, I'd expect the faulting context to simply fault again.  I don't
think we'd see a GPU hang in the same way we do today, where we get an
indication in the GPU private fault regs and such; they go through the
IOMMU in advanced context mode.

So I think we'll need a callback in the fatal case; we can just kick off
a private i915 worker for that, just like we do for the recoverable case
that's now hidden in the IOMMU layer.

Jesse


Re: [Intel-gfx] [PATCH 7/9] drm/i915: add fences to the request struct

2015-10-09 Thread Jesse Barnes

On 10/09/2015 06:29 AM, David Woodhouse wrote:
> On Fri, 2015-09-04 at 09:59 -0700, Jesse Barnes wrote:
>>
>> @@ -2286,6 +2287,10 @@ struct drm_i915_gem_request {
>> /** Execlists no. of times this request has been sent to the ELSP */
>> int elsp_submitted;
>>  
>> +   /* core fence obj for this request, may be exported */
>> +   struct fence fence;
> 
> As discussed, this doesn't work as-is. The final fence_put() will
> attempt to free(&req->fence). Unless you have a .release method in your
> fence ops, which you don't.
> 
> I suppose we could tie up a .release method with the existing release
> method for the drm_i915_gem_request.
> 
> As things stand, though, bad things are happening. This makes it go
> away and at least lets me get on with testing.
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8ef19e2..2d0c93c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2297,7 +2298,7 @@ struct drm_i915_gem_request {
>   int elsp_submitted;
>  
>   /* core fence obj for this request, may be exported */
> - struct fence fence;
> + struct fence *fence;
>  
>   wait_queue_t wait;
>  };
> diff --git a/drivers/gpu/drm/i915/i915_sync.c 
> b/drivers/gpu/drm/i915/i915_sync.c
> index 085f1f9..6ffe273 100644
> --- a/drivers/gpu/drm/i915/i915_sync.c
> +++ b/drivers/gpu/drm/i915/i915_sync.c
> @@ -58,7 +58,12 @@ struct i915_sync_timeline {
>   *   allow non-RCS fences (need ring/context association)
>   */
>  
> -#define to_i915_request(x) container_of(x, struct drm_i915_gem_request, 
> fence)
> +struct foo {
> + struct fence fence;
> + struct drm_i915_gem_request *req;
> +};
> +
> +#define to_i915_request(x) (((struct foo *)(x))->req)
>  
>  static const char *i915_fence_get_driver_name(struct fence *fence)
>  {
> @@ -81,10 +86,10 @@ static int i915_fence_ring_check(wait_queue_t *wait, 
> unsigned mode, int flags,
>   if (!i915_gem_request_completed(req, false))
>   return 0;
>  
> - fence_signal_locked(&req->fence);
> + fence_signal_locked(req->fence);
>  
>   __remove_wait_queue(&ring->irq_queue, wait);
> - fence_put(&req->fence);
> + fence_put(req->fence);
>   ring->irq_put(ring);
>  
>   return 0;
> @@ -200,6 +205,15 @@ struct fence *i915_fence_create_ring(struct 
> intel_engine_cs *ring,
>   if (ret)
>   return ERR_PTR(ret);
>  
> + request->fence = kmalloc(sizeof(struct foo), GFP_KERNEL);
> + if (!request->fence) {
> + ret = -ENOMEM;
> + goto err_cancel;
> + }
> + /* I have no clue how this is *supposed* to work and no real interest
> +in finding out. Just stop hurting me please. */
> + ((struct foo *)request->fence)->req = request;
> +
>   if (i915.enable_execlists) {
>   ringbuf = ctx->engine[ring->id].ringbuf;
>   } else
> @@ -270,10 +284,10 @@ struct fence *i915_fence_create_ring(struct 
> intel_engine_cs *ring,
>  round_jiffies_up_relative(HZ));
>   intel_mark_busy(dev_priv->dev);
>  
> - fence_init(&request->fence, &i915_fence_ring_ops, &fence_lock,
> + fence_init(request->fence, &i915_fence_ring_ops, &fence_lock,
>  ctx->user_handle, request->seqno);
>  
> - return &request->fence;
> + return request->fence;
>  
>  err_cancel:
>   i915_gem_request_cancel(request);
> @@ -306,10 +320,10 @@ static struct fence *i915_fence_create_display(struct 
> intel_context *ctx)
>  
>   req = ring->outstanding_lazy_request;
>  
> - fence_init(&req->fence, &i915_fence_ops, &fence_lock,
> + fence_init(req->fence, &i915_fence_ops, &fence_lock,
>  ctx->user_handle, req->seqno);
>  
> - return &req->fence;
> + return req->fence;
>  }
>  #endif

Yeah this is definitely better than what I had (untested code and all
that).  But the actual signaling and such still needs work.  I had a
question for Maarten on that actually; today it doesn't look like the
fence would enable signaling at the right point, so I had to add
something.  But I'll look and see what the latest is here from John H; I
know his Android code worked, so it would probably be best to just use that.

Jesse
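
For reference, the .release idea David mentions could look roughly like the userspace model below. Everything here is hypothetical: the model_* names, the plain (non-atomic) refcount, and malloc/free stand in for the kernel fence API and the request's own lifetime handling, which this sketch does not try to reproduce. It only shows how a release hook lets the final put free the containing object rather than the embedded member.

```c
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_fence;

struct model_fence_ops {
	/* Optional; when NULL the final put frees the fence pointer itself. */
	void (*release)(struct model_fence *fence);
};

struct model_fence {
	const struct model_fence_ops *ops;
	unsigned int refcount;	/* not atomic: single-threaded model */
};

/* Request with the fence embedded, as in the original patch. */
struct model_request {
	int seqno;
	struct model_fence fence;	/* deliberately not at offset 0 */
};

static int requests_freed;	/* bookkeeping for the example only */

static void model_request_release(struct model_fence *fence)
{
	/* Free the containing request, never the embedded member. */
	struct model_request *req =
		container_of(fence, struct model_request, fence);

	free(req);
	requests_freed++;
}

static const struct model_fence_ops model_request_fence_ops = {
	.release = model_request_release,
};

static void model_fence_put(struct model_fence *fence)
{
	if (--fence->refcount)
		return;
	if (fence->ops->release)
		fence->ops->release(fence);
	else
		free(fence);	/* only valid for a standalone allocation */
}

static struct model_fence *model_request_create(int seqno)
{
	struct model_request *req = malloc(sizeof(*req));

	if (!req)
		return NULL;
	req->seqno = seqno;
	req->fence.ops = &model_request_fence_ops;
	req->fence.refcount = 1;
	return &req->fence;
}
```

Without the release hook, the last model_fence_put() would pass a pointer into the middle of the request to free(), which is the failure described in the quoted mail. In the driver the hook would presumably have to cooperate with the request's existing reference counting rather than free it outright.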


Re: [Intel-gfx] [PATCH 3/7] drm/i915: don't allocate fbcon from stolen memory if it's too big

2015-10-08 Thread Jesse Barnes

On 09/23/2015 08:52 AM, Paulo Zanoni wrote:
> Technology has evolved and now we have eDP panels with 3200x1800
> resolution. In the meantime, the BIOS guys didn't change the default
> 32mb for stolen memory. On top of that, we can't assume our users will
> be able to increase the default stolen memory size to more than 32mb -
> I'm not even sure all BIOSes allow that.
> 
> So just the fbcon buffer alone eats 22mb of my stolen memory, and due
> to the BDW/SKL restriction of not using the last 8mb of stolen memory,
> all that's left for FBC is 2mb! Since fbcon is not the coolest feature
> ever, I think it's better to save our precious stolen resource to FBC
> and the other guys.
> 
> On the other hand, we really want to use as much stolen memory as
> possible, since on some older systems the stolen memory may be a
> considerable percentage of the total available memory.
> 
> This patch tries to achieve a little balance using a simple heuristic:
> if the fbcon wants more than half of the available stolen memory,
> don't use stolen memory in order to leave some for FBC and the other
> features.
> 
> The long term plan should be to implement a way to set priorities for
> stolen memory allocation and then evict low priority users when the
> high priority ones need the memory. While we still don't have that,
> let's try to make FBC usable with the simple solution.
> 
> Cc: Chris Wilson 
> Signed-off-by: Paulo Zanoni 
> ---
>  drivers/gpu/drm/i915/intel_display.c |  7 +++
>  drivers/gpu/drm/i915/intel_fbdev.c   | 10 --
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index 2a1fab3..24b8a72 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2486,6 +2486,7 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
> struct intel_initial_plane_config *plane_config)
>  {
>   struct drm_device *dev = crtc->base.dev;
> + struct drm_i915_private *dev_priv = to_i915(dev);
>   struct drm_i915_gem_object *obj = NULL;
>   struct drm_mode_fb_cmd2 mode_cmd = { 0 };
>   struct drm_framebuffer *fb = &plane_config->fb->base;
> @@ -2498,6 +2499,12 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
>   if (plane_config->size == 0)
>   return false;
>  
> + /* If the FB is too big, just don't use it since fbdev is not very
> +  * important and we should probably use that space with FBC or other
> +  * features. */
> + if (size_aligned * 2 > dev_priv->gtt.stolen_usable_size)
> + return false;
> +
>   obj = i915_gem_object_create_stolen_for_preallocated(dev,
>base_aligned,
>base_aligned,
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
> b/drivers/gpu/drm/i915/intel_fbdev.c
> index 6532912..4fd5fdf 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -121,8 +121,9 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
>   container_of(helper, struct intel_fbdev, helper);
>   struct drm_framebuffer *fb;
>   struct drm_device *dev = helper->dev;
> + struct drm_i915_private *dev_priv = to_i915(dev);
>   struct drm_mode_fb_cmd2 mode_cmd = {};
> - struct drm_i915_gem_object *obj;
> + struct drm_i915_gem_object *obj = NULL;
>   int size, ret;
>  
>   /* we don't do packed 24bpp */
> @@ -139,7 +140,12 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
>  
>   size = mode_cmd.pitches[0] * mode_cmd.height;
>   size = PAGE_ALIGN(size);
> - obj = i915_gem_object_create_stolen(dev, size);
> +
> + /* If the FB is too big, just don't use it since fbdev is not very
> +  * important and we should probably use that space with FBC or other
> +  * features. */
> + if (size * 2 < dev_priv->gtt.stolen_usable_size)
> + obj = i915_gem_object_create_stolen(dev, size);
>   if (obj == NULL)
>   obj = i915_gem_alloc_object(dev, size);
>   if (!obj) {
> 

I agree with Chris that we should make this smarter too, but I don't
think this hurts in the meantime, so:

Reviewed-by: Jesse Barnes 

Might be nice to macro-ize the size comparison too, both for readability
and to change our threshold in one place if we ever need to.

Thanks,
Jesse
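
Something along these lines is what I mean by macro-izing the comparison; treat it as a sketch. STOLEN_FB_MAX_DIVISOR and fb_fits_in_stolen() are hypothetical names, not existing i915 symbols, and the sizes are passed in directly so the snippet builds on its own.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical knob: the fb may use at most 1/STOLEN_FB_MAX_DIVISOR of
 * the usable stolen memory. The patch hard-codes a factor of 2.
 */
#define STOLEN_FB_MAX_DIVISOR	2

/* True when an fb of fb_size bytes is small enough to live in stolen. */
static inline bool fb_fits_in_stolen(size_t fb_size, size_t stolen_usable_size)
{
	/* Divide instead of multiplying so a huge fb_size cannot overflow. */
	return fb_size <= stolen_usable_size / STOLEN_FB_MAX_DIVISOR;
}
```

Note that the two hunks currently disagree on the boundary case: intel_display.c rejects only when size * 2 > stolen, while intel_fbdev.c accepts only when size * 2 < stolen. A shared helper would force one answer for the case where the fb is exactly half of stolen; the sketch follows the intel_display.c hunk and accepts it.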

Re: [Intel-gfx] [PATCH] drm/i915: Pin the ifbdev for the info->system_base GGTT mmapping

2015-10-08 Thread Jesse Barnes

On 10/08/2015 02:07 AM, Chris Wilson wrote:
> On Wed, Oct 07, 2015 at 11:34:17AM -0700, Wayne Boyer wrote:
>> From: Chris Wilson 
>>
>> A long time ago (before 3.14) we relied on a permanent pinning of the
>> ifbdev to lock the fb in place inside the GGTT. However, the
>> introduction of stealing the BIOS framebuffer and reusing its address in
>> the GGTT for the fbdev has muddied waters and we use an inherited fb.
>> However, the inherited fb is only pinned whilst it is active and we no
>> longer have an explicit pin for the info->system_base mmapping used by
>> the fbdev. The result is that after some aperture pressure the fbdev may
>> be evicted, but we continue to write the fbcon into the same GGTT
>> address - overwriting anything else that may be put into that offset.
>> The effect is most pronounced across suspend/resume as
>> intel_fbdev_set_suspend() does a full clear over the whole scanout.
>>
>> v2: rebased on latest nightly (Wayne)
>>
>> Signed-off-by: Chris Wilson 
>> Cc: "Goel, Akash" 
>> Cc: Daniel Vetter 
>> Cc: Jesse Barnes 
>> Cc: sta...@vger.kernel.org
>> Reviewed-by: Deepak S 
>> Signed-off-by: Wayne Boyer 
>> ---
>>  drivers/gpu/drm/i915/intel_fbdev.c | 15 +++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
>> b/drivers/gpu/drm/i915/intel_fbdev.c
>> index 6532912..c6aa4f9 100644
>> --- a/drivers/gpu/drm/i915/intel_fbdev.c
>> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
>> @@ -215,6 +215,16 @@ static int intelfb_create(struct drm_fb_helper *helper,
>>  obj = intel_fb->obj;
>>  size = obj->base.size;
>>  
>> +/* The fb constructor will have already pinned us (or inherited a
>> + * GGTT region from the BIOS) suitable for a scanout, so
>> + * this should just be a no-op and increment the pin count for the
>> + * fbdev mmapping. It does have a useful side-effect of validating
>> + * the pin for fbdev's use via a GGTT mmapping.
>> + */
>> +ret = i915_gem_object_ggtt_pin(obj, NULL, 0, PIN_MAPPABLE);
> 
> This should be i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE);
> At which point I just rage quit.

LOL GEM naming strikes again!

Jesse

