[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915: intel_dp_link_is_valid() should only return status of link (rev3)

2016-08-12 Thread Patchwork
== Series Details ==

Series: drm/i915: intel_dp_link_is_valid() should only return status of link (rev3)
URL   : https://patchwork.freedesktop.org/series/9737/
State : failure

== Summary ==

Series 9737v3 drm/i915: intel_dp_link_is_valid() should only return status of link
http://patchwork.freedesktop.org/api/1.0/series/9737/revisions/3/mbox

Test kms_cursor_legacy:
Subgroup basic-flip-vs-cursor-legacy:
fail   -> PASS   (ro-byt-n2820)
pass   -> FAIL   (ro-bdw-i5-5250u)
Subgroup basic-flip-vs-cursor-varying-size:
pass   -> FAIL   (ro-bdw-i5-5250u)
pass   -> DMESG-FAIL (fi-skl-i7-6700k)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
dmesg-warn -> PASS   (ro-bdw-i7-5600u)
dmesg-warn -> SKIP   (ro-bdw-i5-5250u)
Subgroup suspend-read-crc-pipe-b:
pass   -> INCOMPLETE (fi-hsw-i7-4770k)
skip   -> DMESG-WARN (ro-bdw-i5-5250u)
Subgroup suspend-read-crc-pipe-c:
skip   -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:207  pass:186  dwarn:0   dfail:0   fail:0   skip:20 
fi-kbl-qkkr  total:244  pass:185  dwarn:29  dfail:0   fail:3   skip:27 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:218  dwarn:3   dfail:0   fail:2   skip:17 
ro-bdw-i7-5600u  total:240  pass:207  dwarn:0   dfail:0   fail:1   skip:32 
ro-bsw-n3050 total:240  pass:195  dwarn:0   dfail:0   fail:3   skip:42 
ro-byt-n2820 total:240  pass:198  dwarn:0   dfail:0   fail:2   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:173  dwarn:0   dfail:0   fail:2   skip:60 
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:222  dwarn:0   dfail:0   fail:4   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1859/

3612906 drm-intel-nightly: 2016y-08m-12d-15h-08m-02s UTC integration manifest
b41a36b drm/i915: intel_dp_link_is_valid() should only return status of link

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [v2,1/2] drm/i915/mst: Validate modes against available link bandwidth

2016-08-12 Thread Patchwork
== Series Details ==

Series: series starting with [v2,1/2] drm/i915/mst: Validate modes against available link bandwidth
URL   : https://patchwork.freedesktop.org/series/11039/
State : failure

== Summary ==

Series 11039v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/11039/revisions/1/mbox

Test kms_cursor_legacy:
Subgroup basic-flip-vs-cursor-varying-size:
fail   -> PASS   (ro-byt-n2820)
pass   -> FAIL   (ro-bdw-i5-5250u)
pass   -> DMESG-FAIL (fi-skl-i7-6700k)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
dmesg-warn -> PASS   (ro-bdw-i7-5600u)
Subgroup suspend-read-crc-pipe-b:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
skip   -> DMESG-WARN (ro-bdw-i5-5250u)
Subgroup suspend-read-crc-pipe-c:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-qkkr  total:244  pass:186  dwarn:29  dfail:0   fail:3   skip:26 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:219  dwarn:3   dfail:0   fail:1   skip:17 
ro-bdw-i7-5600u  total:240  pass:205  dwarn:2   dfail:0   fail:1   skip:32 
ro-bsw-n3050 total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42 
ro-byt-n2820 total:240  pass:198  dwarn:0   dfail:0   fail:2   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:173  dwarn:0   dfail:0   fail:2   skip:60 
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:222  dwarn:0   dfail:0   fail:4   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1858/

3612906 drm-intel-nightly: 2016y-08m-12d-15h-08m-02s UTC integration manifest
200cbdb drm/mst: A Helper function that returns available link bandwidth
813da48 drm/i915/mst: Validate modes against available link bandwidth



[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915: Embrace the race in busy-ioctl

2016-08-12 Thread Patchwork
== Series Details ==

Series: drm/i915: Embrace the race in busy-ioctl
URL   : https://patchwork.freedesktop.org/series/11034/
State : failure

== Summary ==

Series 11034v1 drm/i915: Embrace the race in busy-ioctl
http://patchwork.freedesktop.org/api/1.0/series/11034/revisions/1/mbox

Test kms_cursor_legacy:
Subgroup basic-cursor-vs-flip-legacy:
pass   -> FAIL   (ro-byt-n2820)
Subgroup basic-cursor-vs-flip-varying-size:
fail   -> PASS   (ro-ilk1-i5-650)
Subgroup basic-flip-vs-cursor-legacy:
fail   -> PASS   (ro-skl3-i5-6260u)
Subgroup basic-flip-vs-cursor-varying-size:
pass   -> FAIL   (ro-bdw-i5-5250u)
pass   -> DMESG-FAIL (fi-skl-i7-6700k)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
dmesg-warn -> PASS   (ro-bdw-i7-5600u)
dmesg-warn -> SKIP   (ro-bdw-i5-5250u)
Subgroup suspend-read-crc-pipe-c:
skip   -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-qkkr  total:244  pass:186  dwarn:28  dfail:0   fail:3   skip:27 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:219  dwarn:2   dfail:0   fail:1   skip:18 
ro-bdw-i7-5600u  total:240  pass:207  dwarn:0   dfail:0   fail:1   skip:32 
ro-bsw-n3050 total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42 
ro-byt-n2820 total:240  pass:196  dwarn:0   dfail:0   fail:4   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:174  dwarn:0   dfail:0   fail:1   skip:60 
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:223  dwarn:0   dfail:0   fail:3   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1857/

3612906 drm-intel-nightly: 2016y-08m-12d-15h-08m-02s UTC integration manifest
eb6c27a drm/i915: Embrace the race in busy-ioctl



[Intel-gfx] [PATCH v3] drm/i915: intel_dp_link_is_valid() should only return status of link

2016-08-12 Thread Manasi Navare
intel_dp_link_is_valid() reads the link status registers and returns a
boolean indicating whether the link is valid.
If the link has lost lock and is no longer valid, link training is
performed outside the function; otherwise the previously trained link
is retained.
This gives us the flexibility to check whether the link is valid and to
train it independently.

v3:
* Removed some unnecessary DEBUG prints
* Optimized the conditional checking (Dhinakaran Pandiyan)
v2:
* Changed the function name from intel_dp_check_link_status()
to intel_dp_link_is_valid()  (Lukas Wunner)
* Checks for CRTC and active CRTC are moved outside the
intel_dp_link_is_valid() function (Rodrigo Vivi)

Signed-off-by: Manasi Navare 
---
 drivers/gpu/drm/i915/intel_dp.c | 53 ++---
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 364db90..d234042 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -3881,36 +3881,32 @@ go_again:
return -EINVAL;
 }
 
-static void
-intel_dp_check_link_status(struct intel_dp *intel_dp)
+static bool
+intel_dp_link_is_valid(struct intel_dp *intel_dp)
 {
-   struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
struct drm_device *dev = intel_dp_to_dev(intel_dp);
u8 link_status[DP_LINK_STATUS_SIZE];
 
WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));
 
if (!intel_dp_get_link_status(intel_dp, link_status)) {
-   DRM_ERROR("Failed to get link status\n");
-   return;
+   DRM_DEBUG_KMS("Failed to get link status\n");
+   return false;
}
 
-   if (!intel_encoder->base.crtc)
-   return;
+   /* Check if the link is valid by reading the bits of Link status
+* registers
+*/
+   if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
+   DRM_DEBUG_KMS("Channel EQ or CR not ok, need to retrain\n");
+   return false;
+   }
 
-   if (!to_intel_crtc(intel_encoder->base.crtc)->active)
-   return;
+   return true;
 
-   /* if link training is requested we should perform it always */
-   if ((intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) ||
-   (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count))) {
-   DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n",
- intel_encoder->base.name);
-   intel_dp_start_link_train(intel_dp);
-   intel_dp_stop_link_train(intel_dp);
-   }
 }
 
+
 /*
  * According to DP spec
  * 5.1.2:
@@ -3928,6 +3924,8 @@ static bool
 intel_dp_short_pulse(struct intel_dp *intel_dp)
 {
struct drm_device *dev = intel_dp_to_dev(intel_dp);
+   struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
+   struct intel_encoder *intel_encoder = &intel_dig_port->base;
u8 sink_irq_vector = 0;
u8 old_sink_count = intel_dp->sink_count;
bool ret;
@@ -3968,8 +3966,17 @@ intel_dp_short_pulse(struct intel_dp *intel_dp)
DRM_DEBUG_DRIVER("CP or sink specific irq unhandled\n");
}
 
+   /* Do not train the link if there is no crtc */
+   if (!intel_encoder->base.crtc ||
+   !to_intel_crtc(intel_encoder->base.crtc)->active)
+   return true;
+
drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
-   intel_dp_check_link_status(intel_dp);
+   if (!intel_dp_link_is_valid(intel_dp) ||
+   intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) {
+   intel_dp_start_link_train(intel_dp);
+   intel_dp_stop_link_train(intel_dp);
+   }
drm_modeset_unlock(&dev->mode_config.connection_mutex);
 
return true;
@@ -4298,8 +4305,16 @@ intel_dp_long_pulse(struct intel_connector *intel_connector)
 * check links status, there has been known issues of
 * link loss triggerring long pulse
 */
+   /* Do not train the link if there is no crtc */
+   if (!intel_encoder->base.crtc ||
+   !to_intel_crtc(intel_encoder->base.crtc)->active)
+   goto out;
+
drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
-   intel_dp_check_link_status(intel_dp);
+   if (!intel_dp_link_is_valid(intel_dp)) {
+   intel_dp_start_link_train(intel_dp);
+   intel_dp_stop_link_train(intel_dp);
+   }
drm_modeset_unlock(&dev->mode_config.connection_mutex);
goto out;
}
-- 
1.9.1
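
The split described in the commit message above (a query that only reports
link state, with the retraining decision left to the caller) can be
sketched in userspace as follows. This is a minimal model, not driver code:
struct sketch_dp and every sketch_* function are hypothetical stand-ins, and
the link status is a plain flag rather than a DPCD read.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; not the real struct intel_dp. */
struct sketch_dp {
	bool crtc_active;        /* an active CRTC is attached */
	bool link_ok;            /* what the link status registers would report */
	bool test_link_training; /* compliance test requested link training */
	int train_count;         /* how many times link training ran */
};

/* Pure query: reports link state and never retrains. */
static bool sketch_link_is_valid(const struct sketch_dp *dp)
{
	return dp->link_ok;
}

/* Stand-in for the start/stop link train pair. */
static void sketch_train_link(struct sketch_dp *dp)
{
	dp->train_count++;
	dp->link_ok = true;
}

/* Caller modelled on the short-pulse path: it owns the retrain decision. */
static void sketch_short_pulse(struct sketch_dp *dp)
{
	assert(dp);

	/* Do not train the link if there is no active CRTC. */
	if (!dp->crtc_active)
		return;

	if (!sketch_link_is_valid(dp) || dp->test_link_training)
		sketch_train_link(dp);
}
```

With this shape a caller such as the long-pulse path can reuse the same
query while applying its own retrain condition.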



Re: [Intel-gfx] [PATCH v3] drm/i915/dp: DP audio API changes for MST

2016-08-12 Thread Pandiyan, Dhinakaran
On Sat, 2016-08-13 at 00:16 +, Pandiyan, Dhinakaran wrote:
> On Fri, 2016-08-12 at 08:18 +0300, Ville Syrjälä wrote:
> > On Fri, Aug 12, 2016 at 04:28:09AM +, Pandiyan, Dhinakaran wrote:
> > > On Thu, 2016-08-11 at 10:39 +0300, Ville Syrjälä wrote:
> > > > On Thu, Aug 11, 2016 at 07:10:39AM +, Pandiyan, Dhinakaran wrote:
> > > > > On Thu, 2016-08-11 at 09:26 +0300, Ville Syrjälä wrote:
> > > > > > On Wed, Aug 10, 2016 at 12:41:57PM -0700, Dhinakaran Pandiyan wrote:
> > > > > > > DP MST provides the capability to send multiple video and audio 
> > > > > > > streams
> > > > > > > through a single port. This requires the API's between i915 and 
> > > > > > > audio
> > > > > > > drivers to distinguish between multiple audio capable displays 
> > > > > > > that can be
> > > > > > > connected to a port. Currently only the port identity is shared 
> > > > > > > in the
> > > > > > > APIs. This patch adds support for MST with an additional parameter
> > > > > > > 'int pipe'.  The existing parameter 'port' does not change it's 
> > > > > > > meaning.
> > > > > > > 
> > > > > > > pipe =
> > > > > > >   MST : display pipe that the stream originates from
> > > > > > >   Non-MST : -1
> > > > > > > 
> > > > > > > Affected APIs:
> > > > > > > struct i915_audio_component_ops
> > > > > > > -   int (*sync_audio_rate)(struct device *, int port, int 
> > > > > > > rate);
> > > > > > > + int (*sync_audio_rate)(struct device *, int port, int pipe,
> > > > > > > +  int rate);
> > > > > > > 
> > > > > > > -   int (*get_eld)(struct device *, int port, bool *enabled,
> > > > > > > -   unsigned char *buf, int max_bytes);
> > > > > > > +   int (*get_eld)(struct device *, int port, int pipe,
> > > > > > > +bool *enabled, unsigned char *buf, int 
> > > > > > > max_bytes);
> > > > > > > 
> > > > > > > struct i915_audio_component_audio_ops
> > > > > > > -   void (*pin_eld_notify)(void *audio_ptr, int port);
> > > > > > > +   void (*pin_eld_notify)(void *audio_ptr, int port, int 
> > > > > > > pipe);
> > > > > > > 
> > > > > > > This patch makes dummy changes in the audio drivers (Libin) for 
> > > > > > > build to
> > > > > > > succeed. The audio side drivers will send the right 'pipe' values 
> > > > > > > in
> > > > > > > patches that will follow.
> > > > > > > 
> > > > > > > v2:
> > > > > > > Renamed the new API parameter from 'dev_id' to 'pipe'. (Jim, 
> > > > > > > Ville)
> > > > > > > Included Asoc driver API compatibility changes from Jeeja.
> > > > > > > Added WARN_ON() for invalid pipe in get_saved_encoder(). (Takashi)
> > > > > > > Added comment for av_enc_map[] definition. (Takashi)
> > > > > > > 
> > > > > > > v3:
> > > > > > > Fixed logic error introduced while renaming 'dev_id' as 'pipe' 
> > > > > > > (Ville)
> > > > > > > Renamed get_saved_encoder() to get_saved_enc() to reduce line 
> > > > > > > length
> > > > > > > 
> > > > > > > Signed-off-by: Dhinakaran Pandiyan 
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/i915/i915_drv.h|  3 +-
> > > > > > >  drivers/gpu/drm/i915/intel_audio.c | 93 
> > > > > > > ++
> > > > > > >  include/drm/i915_component.h   |  6 +--
> > > > > > >  include/sound/hda_i915.h   | 11 +++--
> > > > > > >  sound/hda/hdac_i915.c  |  9 ++--
> > > > > > >  sound/pci/hda/patch_hdmi.c |  7 +--
> > > > > > >  sound/soc/codecs/hdac_hdmi.c   |  2 +-
> > > > > > >  7 files changed, 86 insertions(+), 45 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> > > > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > > index c36d176..8e4a88f 100644
> > > > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > > @@ -2036,7 +2036,8 @@ struct drm_i915_private {
> > > > > > >   /* perform PHY state sanity checks? */
> > > > > > >   bool chv_phy_assert[2];
> > > > > > >  
> > > > > > > - struct intel_encoder *dig_port_map[I915_MAX_PORTS];
> > > > > > > + /* Used to save the pipe-to-encoder mapping for audio */
> > > > > > > + struct intel_encoder *av_enc_map[I915_MAX_PIPES];
> > > > > > >  
> > > > > > >   /*
> > > > > > >* NOTE: This is the dri1/ums dungeon, don't add stuff here. 
> > > > > > > Your patch
> > > > > > > diff --git a/drivers/gpu/drm/i915/intel_audio.c 
> > > > > > > b/drivers/gpu/drm/i915/intel_audio.c
> > > > > > > index ef20875..a7467ea 100644
> > > > > > > --- a/drivers/gpu/drm/i915/intel_audio.c
> > > > > > > +++ b/drivers/gpu/drm/i915/intel_audio.c
> > > > > > > @@ -500,6 +500,7 @@ void intel_audio_codec_enable(struct 
> > > > > > > intel_encoder *intel_encoder)
> > > > > > >   struct i915_audio_component *acomp = dev_priv->audio_component;
> > > > > > >   struct intel_digital_port *intel_dig_port = 
> > > > > > > enc_to_dig_port(encoder);
> > > > > > >   enum port port = intel_dig_port->port;
> > > > > > > + enum pipe pipe = crtc->pipe;
> > > > > > >  
>

Re: [Intel-gfx] [PATCH v2] drm/i915: intel_dp_link_is_valid() should only return status of link

2016-08-12 Thread Manasi Navare
On Fri, Aug 12, 2016 at 02:50:58PM -0700, Pandiyan, Dhinakaran wrote:
> On Fri, 2016-08-12 at 10:56 -0700, Manasi Navare wrote:
> > On Thu, Aug 11, 2016 at 08:18:54PM -0700, Pandiyan, Dhinakaran wrote:
> > > On Thu, 2016-08-11 at 15:23 -0700, Manasi Navare wrote:
> > > > Intel_dp_link_is_valid() function reads the Link status registers
> > > > and returns a boolean to indicate link is valid or not.
> > > > If the link has lost lock and is not valid any more, link
> > > > training is performed outside the function else previously trained link
> > > > is retained.
> > > > This gives us flexibility of checking whether link is valid and training
> > > > it independently.
> > > > 
> > > > v2:
> > > > * Changed the function name from intel_dp_check_link_status()
> > > > to intel_dp_link_is_valid()  (Lukas Wunner)
> > > > * Checks for CRTC and active CRTC are moved outside the
> > > > intel_dp_link_is_valid() function (Rodrigo Vivi)
> > > > 
> > > > Signed-off-by: Manasi Navare 
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_dp.c | 56 
> > > > +++--
> > > >  1 file changed, 37 insertions(+), 19 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_dp.c 
> > > > b/drivers/gpu/drm/i915/intel_dp.c
> > > > index 364db90..891147d 100644
> > > > --- a/drivers/gpu/drm/i915/intel_dp.c
> > > > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > > > @@ -3881,36 +3881,33 @@ go_again:
> > > > return -EINVAL;
> > > >  }
> > > >  
> > > > -static void
> > > > -intel_dp_check_link_status(struct intel_dp *intel_dp)
> > > > +static bool
> > > > +intel_dp_link_is_valid(struct intel_dp *intel_dp)
> > > >  {
> > > > -   struct intel_encoder *intel_encoder = 
> > > > &dp_to_dig_port(intel_dp)->base;
> > > > struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > > > u8 link_status[DP_LINK_STATUS_SIZE];
> > > >  
> > > > 
> > > > WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));
> > > >  
> > > > if (!intel_dp_get_link_status(intel_dp, link_status)) {
> > > > -   DRM_ERROR("Failed to get link status\n");
> > > > -   return;
> > > > +   DRM_DEBUG_KMS("Failed to get link status\n");
> > > > +   return false;
> > > > }
> > > >  
> > > > -   if (!intel_encoder->base.crtc)
> > > > -   return;
> > > > +   /* Check if the link is valid by reading the bits of Link status
> > > > +* registers
> > > > +*/
> > > > +   if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
> > > > +   DRM_DEBUG_KMS("Channel EQ or CR not ok, need to 
> > > > retrain\n");
> > > drm_dp_channel_eq_ok() does not check for CR. Should we just say
> > > "Channel EQ not ok" to preempt ambiguity while debugging ?
> > 
> > Actually this macro checks for DP_CHANNEL_EQ_BITS which is defined as:
> > #define DP_CHANNEL_EQ_BITS (DP_LANE_CR_DONE |   \
> > DP_LANE_CHANNEL_EQ_DONE |   \
> > DP_LANE_SYMBOL_LOCKED)
> > So it includes checking for Channel EQ and Clock Recovery CR bits
> > 
> > 
> 
> Thank you, I should have looked harder. I will leave this to you.
> 
> > > 
> > > > +   return false;
> > > > +   }
> > > >  
> > > > -   if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > > > -   return;
> > > > +   DRM_DEBUG_KMS("Link is good, no need to retrain\n");
> > > The caller does not expect us to link train anymore, I don't think we
> > > have to explicitly state "no need to retrain". Also, do we need debug
> > > messages if the link is good?
> > 
> > I agree, maybe this is not needed. I will remove this.
> > 
> > > 
> > > > +   return true;
> > > >  
> > > > -   /* if link training is requested we should perform it always */
> > > > -   if ((intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) ||
> > > > -   (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count))) 
> > > > {
> > > > -   DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n",
> > > > - intel_encoder->base.name);
> > > > -   intel_dp_start_link_train(intel_dp);
> > > > -   intel_dp_stop_link_train(intel_dp);
> > > > -   }
> > > >  }
> > > >  
> > > > +
> > > >  /*
> > > >   * According to DP spec
> > > >   * 5.1.2:
> > > > @@ -3928,6 +3925,8 @@ static bool
> > > >  intel_dp_short_pulse(struct intel_dp *intel_dp)
> > > >  {
> > > > struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > > > +   struct intel_digital_port *intel_dig_port = 
> > > > dp_to_dig_port(intel_dp);
> > > > +   struct intel_encoder *intel_encoder = &intel_dig_port->base;
> > > > u8 sink_irq_vector = 0;
> > > > u8 old_sink_count = intel_dp->sink_count;
> > > > bool ret;
> > > > @@ -3968,8 +3967,18 @@ intel_dp_short_pulse(struct intel_dp *intel_dp)
> > > > DRM_DEBUG_D

Re: [Intel-gfx] [PATCH v3] drm/i915/dp: DP audio API changes for MST

2016-08-12 Thread Pandiyan, Dhinakaran
On Fri, 2016-08-12 at 08:18 +0300, Ville Syrjälä wrote:
> On Fri, Aug 12, 2016 at 04:28:09AM +, Pandiyan, Dhinakaran wrote:
> > On Thu, 2016-08-11 at 10:39 +0300, Ville Syrjälä wrote:
> > > On Thu, Aug 11, 2016 at 07:10:39AM +, Pandiyan, Dhinakaran wrote:
> > > > On Thu, 2016-08-11 at 09:26 +0300, Ville Syrjälä wrote:
> > > > > On Wed, Aug 10, 2016 at 12:41:57PM -0700, Dhinakaran Pandiyan wrote:
> > > > > > DP MST provides the capability to send multiple video and audio 
> > > > > > streams
> > > > > > through a single port. This requires the API's between i915 and 
> > > > > > audio
> > > > > > drivers to distinguish between multiple audio capable displays that 
> > > > > > can be
> > > > > > connected to a port. Currently only the port identity is shared in 
> > > > > > the
> > > > > > APIs. This patch adds support for MST with an additional parameter
> > > > > > 'int pipe'.  The existing parameter 'port' does not change it's 
> > > > > > meaning.
> > > > > > 
> > > > > > pipe =
> > > > > > MST : display pipe that the stream originates from
> > > > > > Non-MST : -1
> > > > > > 
> > > > > > Affected APIs:
> > > > > > struct i915_audio_component_ops
> > > > > > -   int (*sync_audio_rate)(struct device *, int port, int rate);
> > > > > > +   int (*sync_audio_rate)(struct device *, int port, int pipe,
> > > > > > +int rate);
> > > > > > 
> > > > > > -   int (*get_eld)(struct device *, int port, bool *enabled,
> > > > > > -   unsigned char *buf, int max_bytes);
> > > > > > +   int (*get_eld)(struct device *, int port, int pipe,
> > > > > > +  bool *enabled, unsigned char *buf, int 
> > > > > > max_bytes);
> > > > > > 
> > > > > > struct i915_audio_component_audio_ops
> > > > > > -   void (*pin_eld_notify)(void *audio_ptr, int port);
> > > > > > +   void (*pin_eld_notify)(void *audio_ptr, int port, int pipe);
> > > > > > 
> > > > > > This patch makes dummy changes in the audio drivers (Libin) for 
> > > > > > build to
> > > > > > succeed. The audio side drivers will send the right 'pipe' values in
> > > > > > patches that will follow.
> > > > > > 
> > > > > > v2:
> > > > > > Renamed the new API parameter from 'dev_id' to 'pipe'. (Jim, Ville)
> > > > > > Included Asoc driver API compatibility changes from Jeeja.
> > > > > > Added WARN_ON() for invalid pipe in get_saved_encoder(). (Takashi)
> > > > > > Added comment for av_enc_map[] definition. (Takashi)
> > > > > > 
> > > > > > v3:
> > > > > > Fixed logic error introduced while renaming 'dev_id' as 'pipe' 
> > > > > > (Ville)
> > > > > > Renamed get_saved_encoder() to get_saved_enc() to reduce line length
> > > > > > 
> > > > > > Signed-off-by: Dhinakaran Pandiyan 
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/i915_drv.h|  3 +-
> > > > > >  drivers/gpu/drm/i915/intel_audio.c | 93 
> > > > > > ++
> > > > > >  include/drm/i915_component.h   |  6 +--
> > > > > >  include/sound/hda_i915.h   | 11 +++--
> > > > > >  sound/hda/hdac_i915.c  |  9 ++--
> > > > > >  sound/pci/hda/patch_hdmi.c |  7 +--
> > > > > >  sound/soc/codecs/hdac_hdmi.c   |  2 +-
> > > > > >  7 files changed, 86 insertions(+), 45 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> > > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > index c36d176..8e4a88f 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > @@ -2036,7 +2036,8 @@ struct drm_i915_private {
> > > > > > /* perform PHY state sanity checks? */
> > > > > > bool chv_phy_assert[2];
> > > > > >  
> > > > > > -   struct intel_encoder *dig_port_map[I915_MAX_PORTS];
> > > > > > +   /* Used to save the pipe-to-encoder mapping for audio */
> > > > > > +   struct intel_encoder *av_enc_map[I915_MAX_PIPES];
> > > > > >  
> > > > > > /*
> > > > > >  * NOTE: This is the dri1/ums dungeon, don't add stuff here. 
> > > > > > Your patch
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_audio.c 
> > > > > > b/drivers/gpu/drm/i915/intel_audio.c
> > > > > > index ef20875..a7467ea 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_audio.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_audio.c
> > > > > > @@ -500,6 +500,7 @@ void intel_audio_codec_enable(struct 
> > > > > > intel_encoder *intel_encoder)
> > > > > > struct i915_audio_component *acomp = dev_priv->audio_component;
> > > > > > struct intel_digital_port *intel_dig_port = 
> > > > > > enc_to_dig_port(encoder);
> > > > > > enum port port = intel_dig_port->port;
> > > > > > +   enum pipe pipe = crtc->pipe;
> > > > > >  
> > > > > > connector = drm_select_eld(encoder);
> > > > > > if (!connector)
> > > > > > @@ -524,12 +525,18 @@ void intel_audio_codec_enable(struct 
> > > > > > intel_encoder *intel_encoder)
> > > > > >  
> > > > > > mutex_lock(&dev_priv->av_mutex);
> > > > > > intel_encoder->audi

Re: [Intel-gfx] drm/i915/fbc: disable FBC on FIFO underruns

2016-08-12 Thread Pandiyan, Dhinakaran
On Fri, 2016-06-10 at 22:18 -0300, Paulo Zanoni wrote:
> Ever since I started working on FBC I was already aware that FBC can
> really amplify the FIFO underrun symptoms. On systems where FIFO
> underruns were harmless error messages, enabling FBC would cause the
> underruns to give black screens.
> 

Do we know why we get black screens in this scenario?

> We recently tried to enable FBC on Haswell and got reports of a system
> that would hang after some hours of uptime, and the first bad commit
> was the one that enabled FBC. We also observed that this system had
> FIFO underrun error messages on its dmesg. Although we don't have any
> evidence that fixing the underruns would solve the bug and make FBC
> work properly on this machine, IMHO it's better if we minimize the
> amount of possible problems by just giving up FBC whenever we detect
> an underrun.
> 
> v2: new version, different implementation and commit message.
> 
> Cc: Stefan Richter 
> Cc: Lyude 
> Cc: Steven Honeyman 
> Signed-off-by: Paulo Zanoni 
> ---
>  drivers/gpu/drm/i915/i915_drv.h|  3 ++
>  drivers/gpu/drm/i915/intel_drv.h   |  1 +
>  drivers/gpu/drm/i915/intel_fbc.c   | 53 
> ++
>  drivers/gpu/drm/i915/intel_fifo_underrun.c |  2 ++
>  4 files changed, 59 insertions(+)
> 
> 
> Since my test machines don't produce FIFO underrun errors, I tested this by
> creating a debugfs file that just calls intel_fbc_handle_fifo_underrun(). I'd
> appreciate some Tested-by tags, if possible.
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 20a676d..18b4257 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -908,6 +908,9 @@ struct intel_fbc {
>   bool enabled;
>   bool active;
>  
> + bool underrun_detected;
> + struct work_struct underrun_work;
> +
>   struct intel_fbc_state_cache {
>   struct {
>   unsigned int mode_flags;
> diff --git a/drivers/gpu/drm/i915/intel_drv.h 
> b/drivers/gpu/drm/i915/intel_drv.h
> index ebe7b34..7bf97b1 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1436,6 +1436,7 @@ void intel_fbc_invalidate(struct drm_i915_private 
> *dev_priv,
>  void intel_fbc_flush(struct drm_i915_private *dev_priv,
>unsigned int frontbuffer_bits, enum fb_op_origin origin);
>  void intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv);
> +void intel_fbc_handle_fifo_underrun(struct drm_i915_private *dev_priv);
>  
>  /* intel_hdmi.c */
>  void intel_hdmi_init(struct drm_device *dev, i915_reg_t hdmi_reg, enum port 
> port);
> diff --git a/drivers/gpu/drm/i915/intel_fbc.c 
> b/drivers/gpu/drm/i915/intel_fbc.c
> index d268f76..2363bff 100644
> --- a/drivers/gpu/drm/i915/intel_fbc.c
> +++ b/drivers/gpu/drm/i915/intel_fbc.c
> @@ -755,6 +755,13 @@ static bool intel_fbc_can_activate(struct intel_crtc 
> *crtc)
>   struct intel_fbc *fbc = &dev_priv->fbc;
>   struct intel_fbc_state_cache *cache = &fbc->state_cache;
>  
> + /* We don't need to use a state cache here since this information is
> +  * global for every CRTC. */
> + if (fbc->underrun_detected) {
> + fbc->no_fbc_reason = "underrun detected";
> + return false;
> + }
> +
>   if (!cache->plane.visible) {
>   fbc->no_fbc_reason = "primary plane not visible";
>   return false;
> @@ -1195,6 +1202,51 @@ void intel_fbc_global_disable(struct drm_i915_private 
> *dev_priv)
>   cancel_work_sync(&fbc->work.work);
>  }
>  
> +static void intel_fbc_underrun_work_fn(struct work_struct *work)
> +{
> + struct drm_i915_private *dev_priv =
> + container_of(work, struct drm_i915_private, fbc.underrun_work);
> + struct intel_fbc *fbc = &dev_priv->fbc;
> +
> + mutex_lock(&fbc->lock);
> +
> + /* Maybe we were scheduled twice. */
> + if (fbc->underrun_detected)
> + goto out;
> +
> + DRM_DEBUG_KMS("Disabling FBC due to FIFO underrun.\n");
> + fbc->underrun_detected = true;
> +
> + intel_fbc_deactivate(dev_priv);
> +out:
> + mutex_unlock(&fbc->lock);
> +}
> +
> +/**
> + * intel_fbc_handle_fifo_underrun - disable FBC when we get a FIFO underrun
> + * @dev_priv: i915 device instance
> + *
> + * Without FBC, most underruns are harmless and don't really cause too many
> + * problems, except for an annoying message on dmesg. With FBC, underruns can
> + * become black screens or even worse, especially when paired with bad
> + * watermarks. So in order for us to be on the safe side, completely disable 
> FBC
> + * in case we ever detect a FIFO underrun on any pipe. An underrun on any 
> pipe
> + * already suggests that watermarks may be bad, so try to be as safe as
> + * possible.
> + */
> +void intel_fbc_handle_fifo_underrun(struct drm_i915_private *dev_priv)
> +{
> + struct intel_fbc *fbc = &dev_priv->fbc;
> +
> + if (!fbc_sup

Re: [Intel-gfx] [PATCH v2] drm/i915: intel_dp_link_is_valid() should only return status of link

2016-08-12 Thread Pandiyan, Dhinakaran
On Fri, 2016-08-12 at 10:56 -0700, Manasi Navare wrote:
> On Thu, Aug 11, 2016 at 08:18:54PM -0700, Pandiyan, Dhinakaran wrote:
> > On Thu, 2016-08-11 at 15:23 -0700, Manasi Navare wrote:
> > > Intel_dp_link_is_valid() function reads the Link status registers
> > > and returns a boolean to indicate link is valid or not.
> > > If the link has lost lock and is not valid any more, link
> > > training is performed outside the function else previously trained link
> > > is retained.
> > > This gives us flexibility of checking whether link is valid and training
> > > it independently.
> > > 
> > > v2:
> > > * Changed the function name from intel_dp_check_link_status()
> > > to intel_dp_link_is_valid()  (Lukas Wunner)
> > > * Checks for CRTC and active CRTC are moved outside the
> > > intel_dp_link_is_valid() function (Rodrigo Vivi)
> > > 
> > > Signed-off-by: Manasi Navare 
> > > ---
> > >  drivers/gpu/drm/i915/intel_dp.c | 56 
> > > +++--
> > >  1 file changed, 37 insertions(+), 19 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_dp.c 
> > > b/drivers/gpu/drm/i915/intel_dp.c
> > > index 364db90..891147d 100644
> > > --- a/drivers/gpu/drm/i915/intel_dp.c
> > > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > > @@ -3881,36 +3881,33 @@ go_again:
> > >   return -EINVAL;
> > >  }
> > >  
> > > -static void
> > > -intel_dp_check_link_status(struct intel_dp *intel_dp)
> > > +static bool
> > > +intel_dp_link_is_valid(struct intel_dp *intel_dp)
> > >  {
> > > - struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
> > >   struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > >   u8 link_status[DP_LINK_STATUS_SIZE];
> > >  
> > >   WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));
> > >  
> > >   if (!intel_dp_get_link_status(intel_dp, link_status)) {
> > > - DRM_ERROR("Failed to get link status\n");
> > > - return;
> > > + DRM_DEBUG_KMS("Failed to get link status\n");
> > > + return false;
> > >   }
> > >  
> > > - if (!intel_encoder->base.crtc)
> > > - return;
> > > + /* Check if the link is valid by reading the bits of Link status
> > > +  * registers
> > > +  */
> > > + if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
> > > + DRM_DEBUG_KMS("Channel EQ or CR not ok, need to retrain\n");
> > drm_dp_channel_eq_ok() does not check for CR. Should we just say
> > "Channel EQ not ok" to preempt ambiguity while debugging ?
> 
> Actually this macro checks for DP_CHANNEL_EQ_BITS which is defined as:
> #define DP_CHANNEL_EQ_BITS (DP_LANE_CR_DONE |   \
> DP_LANE_CHANNEL_EQ_DONE |   \
> DP_LANE_SYMBOL_LOCKED)
> So it includes checking for Channel EQ and Clock Recovery CR bits
> 
> 

Thank you, I should have looked harder. I will leave this to you.

> > 
> > > + return false;
> > > + }
> > >  
> > > - if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > > - return;
> > > + DRM_DEBUG_KMS("Link is good, no need to retrain\n");
> > The caller does not expect us to link train anymore, I don't think we
> > have to explicitly state "no need to retrain". Also, do we need debug
> > messages if the link is good?
> 
> I agree, maybe this is not needed. I will remove this.
> 
> > 
> > > + return true;
> > >  
> > > - /* if link training is requested we should perform it always */
> > > - if ((intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) ||
> > > - (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count))) {
> > > - DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n",
> > > -   intel_encoder->base.name);
> > > - intel_dp_start_link_train(intel_dp);
> > > - intel_dp_stop_link_train(intel_dp);
> > > - }
> > >  }
> > >  
> > > +
> > >  /*
> > >   * According to DP spec
> > >   * 5.1.2:
> > > @@ -3928,6 +3925,8 @@ static bool
> > >  intel_dp_short_pulse(struct intel_dp *intel_dp)
> > >  {
> > >   struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > > + struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
> > > + struct intel_encoder *intel_encoder = &intel_dig_port->base;
> > >   u8 sink_irq_vector = 0;
> > >   u8 old_sink_count = intel_dp->sink_count;
> > >   bool ret;
> > > @@ -3968,8 +3967,18 @@ intel_dp_short_pulse(struct intel_dp *intel_dp)
> > >   DRM_DEBUG_DRIVER("CP or sink specific irq unhandled\n");
> > >   }
> > >  
> > > + /* Do not train the link if there is no crtc */
> > > + if (!intel_encoder->base.crtc)
> > > + return true;
> > > + if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > > + return true;
> > > +
> > I might be completely off base here. Shouldn't we keep the link valid
> > irrespective of whether there is an active crtc? I thought that is what
> > the refactoring is supposed to enable. Does intel_dp_short_pulse() get
> > called when there is a link loss during upfront link tr

Re: [Intel-gfx] [PATCH v2 1/2] drm/i915/mst: Validate modes against available link bandwidth

2016-08-12 Thread kbuild test robot
Hi Anusha,

[auto build test ERROR on drm/drm-next]
[also build test ERROR on v4.8-rc1 next-20160812]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Anusha-Srivatsa/drm-i915-mst-Validate-modes-against-available-link-bandwidth/20160813-050818
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
config: i386-randconfig-x014-201632 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

Note: the 
linux-review/Anusha-Srivatsa/drm-i915-mst-Validate-modes-against-available-link-bandwidth/20160813-050818
 HEAD 413b1bb45dc2c58540a17c5ca642b6aee2c97405 builds fine.
  It only hurts bisectability.

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/intel_dp_mst.c: In function 'intel_dp_mst_mode_valid':
>> drivers/gpu/drm/i915/intel_dp_mst.c:363:14: error: implicit declaration of function 'drm_dp_mst_get_avail_pbn' [-Werror=implicit-function-declaration]
 avail_pbn = drm_dp_mst_get_avail_pbn(mgr, port);
 ^~~~
   cc1: some warnings being treated as errors

vim +/drm_dp_mst_get_avail_pbn +363 drivers/gpu/drm/i915/intel_dp_mst.c

   357  struct intel_connector *intel_connector = to_intel_connector(connector);
   358  struct intel_dp *intel_dp = intel_connector->mst_port;
   359  struct drm_dp_mst_topology_mgr *mgr = &intel_dp->mst_mgr;
   360  struct drm_dp_mst_port *port = (struct drm_dp_mst_port *) (intel_connector->port);
   361  int max_dotclk = to_i915(connector->dev)->max_dotclk_freq;
   362  
 > 363  avail_pbn = drm_dp_mst_get_avail_pbn(mgr, port);
   364  req_pbn = drm_dp_calc_pbn_mode(mode->clock, 24);
   365  if (req_pbn > avail_pbn)
   366  return MODE_H_ILLEGAL;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2 2/2] drm/mst: A Helper function that returns available link bandwidth

2016-08-12 Thread Anusha Srivatsa
Add a function that returns the available link bandwidth for
an MST port so that we can accurately determine whether a new
mode is valid for the link or not.

v2: Put the Signed-off to the end of commit message

Cc: dri-de...@lists.freedesktop.org
Cc: dhinakaran.pandi...@intel.com

Signed-off-by: Anusha Srivatsa 
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 12 
 include/drm/drm_dp_mst_helper.h   |  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c 
b/drivers/gpu/drm/drm_dp_mst_topology.c
index 04e4571..7a239f6 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -43,6 +43,8 @@ static bool dump_dp_payload_table(struct 
drm_dp_mst_topology_mgr *mgr,
  char *buf);
 static int test_calc_pbn_mode(void);
 
+int drm_dp_mst_get_avail_pbn(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port);
+
 static void drm_dp_put_port(struct drm_dp_mst_port *port);
 
 static int drm_dp_dpcd_write_payload(struct drm_dp_mst_topology_mgr *mgr,
@@ -2730,6 +2732,16 @@ static int test_calc_pbn_mode(void)
return 0;
 }
 
+int drm_dp_mst_get_avail_pbn(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port)
+{
+port = drm_dp_get_validated_port_ref(mgr,port);
+if (port)
+return port->available_pbn;
+
+return -EINVAL;
+}
+EXPORT_SYMBOL(drm_dp_mst_get_avail_pbn);
+
 /* we want to kick the TX after we've ack the up/down IRQs. */
 static void drm_dp_mst_kick_tx(struct drm_dp_mst_topology_mgr *mgr)
 {
diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
index 0032076..74dc4ab 100644
--- a/include/drm/drm_dp_mst_helper.h
+++ b/include/drm/drm_dp_mst_helper.h
@@ -576,6 +576,7 @@ struct edid *drm_dp_mst_get_edid(struct drm_connector 
*connector, struct drm_dp_
 
 int drm_dp_calc_pbn_mode(int clock, int bpp);
 
+int drm_dp_mst_get_avail_pbn(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port);
 
 bool drm_dp_mst_allocate_vcpi(struct drm_dp_mst_topology_mgr *mgr, struct 
drm_dp_mst_port *port, int pbn, int *slots);
 
-- 
2.7.4



[Intel-gfx] [PATCH v2 1/2] drm/i915/mst: Validate modes against available link bandwidth

2016-08-12 Thread Anusha Srivatsa
Validate the modes against available link bandwidth rather than
maximum link bandwidth so that we have a better idea as to whether
a proposed mode can truly run alongside existing streams.

v2: Put the Signed-off to the end of the commit message

Cc: dhinakaran.pandi...@intel.com

Signed-off-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/intel_dp_mst.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c 
b/drivers/gpu/drm/i915/intel_dp_mst.c
index 629337d..e7e87d7 100644
--- a/drivers/gpu/drm/i915/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/intel_dp_mst.c
@@ -352,13 +352,23 @@ static enum drm_mode_status
 intel_dp_mst_mode_valid(struct drm_connector *connector,
struct drm_display_mode *mode)
 {
+   int req_pbn = 0;
+   int avail_pbn = 0;
+   struct intel_connector *intel_connector = to_intel_connector(connector);
+   struct intel_dp *intel_dp = intel_connector->mst_port;
+   struct drm_dp_mst_topology_mgr *mgr = &intel_dp->mst_mgr;
+   struct drm_dp_mst_port *port = (struct drm_dp_mst_port *) 
(intel_connector->port);
int max_dotclk = to_i915(connector->dev)->max_dotclk_freq;
 
-   /* TODO - validate mode against available PBN for link */
+   avail_pbn = drm_dp_mst_get_avail_pbn(mgr, port);
+   req_pbn = drm_dp_calc_pbn_mode(mode->clock, 24);
+   if (req_pbn > avail_pbn)
+   return MODE_H_ILLEGAL;
+
if (mode->clock < 1)
return MODE_CLOCK_LOW;
 
-   if (mode->flags & DRM_MODE_FLAG_DBLCLK)
+if (mode->flags & DRM_MODE_FLAG_DBLCLK)
return MODE_H_ILLEGAL;
 
if (mode->clock > max_dotclk)
-- 
2.7.4
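
For readers following the check added above, here is a minimal standalone sketch of the comparison, not the patch itself. The `sketch_` names are hypothetical. The PBN arithmetic is an approximation of what drm_dp_calc_pbn_mode() is understood to compute (pixel clock in kHz times bits per pixel, a 0.6% margin, expressed in units of 54/64 MBytes/sec, rounded up); the in-kernel helper uses fixed-point math and may round differently in edge cases. The sketch also treats a negative available PBN as "mode does not fit", since the proposed drm_dp_mst_get_avail_pbn() returns -EINVAL when the port cannot be validated.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Approximate PBN for a mode (assumption, not the kernel implementation):
 *   PBN = ceil(clock_khz * bpp * 1.006 * 64 / (54 * 8 * 1000))
 * written with integers as (kbps * 64 * 1006) / (54 * 8 * 1000 * 1000).
 */
static int sketch_calc_pbn(int clock_khz, int bpp)
{
	const uint64_t numerator = 64 * 1006;
	const uint64_t denominator = UINT64_C(54) * 8 * 1000 * 1000;
	uint64_t kbps = (uint64_t)clock_khz * (uint64_t)bpp;

	/* Integer ceiling division. */
	return (int)((kbps * numerator + denominator - 1) / denominator);
}

/* Hypothetical helper: does the mode fit into the bandwidth still available? */
static bool sketch_mode_fits(int clock_khz, int bpp, int avail_pbn)
{
	/* A negative value stands for an error code such as -EINVAL. */
	if (avail_pbn < 0)
		return false;

	return sketch_calc_pbn(clock_khz, bpp) <= avail_pbn;
}
```

Handling the error case separately keeps "lookup failed" distinguishable from "not enough bandwidth", which the plain `req_pbn > avail_pbn` comparison folds into one result.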



Re: [Intel-gfx] [PATCH 12/20] drm/doc: Update drm_framebuffer docs

2016-08-12 Thread Daniel Vetter
On Wed, Aug 10, 2016 at 06:15:56PM +0300, Ville Syrjälä wrote:
> On Tue, Aug 09, 2016 at 03:41:23PM +0200, Daniel Vetter wrote:
> > - Move the intro section into a DOC comment, and update it slightly.
> > - kernel-doc for struct drm_framebuffer!
> > 
> > Signed-off-by: Daniel Vetter 
> > ---
> >  Documentation/gpu/drm-kms.rst | 26 +--
> >  drivers/gpu/drm/drm_framebuffer.c | 35 +++
> >  include/drm/drm_framebuffer.h | 94 
> > +--
> >  3 files changed, 118 insertions(+), 37 deletions(-)
> > 
> > diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
> > index 8264a88a8695..d244e03658cc 100644
> > --- a/Documentation/gpu/drm-kms.rst
> > +++ b/Documentation/gpu/drm-kms.rst
> > @@ -39,30 +39,8 @@ Atomic Mode Setting Function Reference
> >  Frame Buffer Abstraction
> >  
> >  
> > -Frame buffers are abstract memory objects that provide a source of
> > -pixels to scanout to a CRTC. Applications explicitly request the
> > -creation of frame buffers through the DRM_IOCTL_MODE_ADDFB(2) ioctls
> > -and receive an opaque handle that can be passed to the KMS CRTC control,
> > -plane configuration and page flip functions.
> > -
> > -Frame buffers rely on the underneath memory manager for low-level memory
> > -operations. When creating a frame buffer applications pass a memory
> > -handle (or a list of memory handles for multi-planar formats) through
> > -the ``drm_mode_fb_cmd2`` argument. For drivers using GEM as their
> > -userspace buffer management interface this would be a GEM handle.
> > -Drivers are however free to use their own backing storage object
> > -handles, e.g. vmwgfx directly exposes special TTM handles to userspace
> > -and so expects TTM handles in the create ioctl and not GEM handles.
> > -
> > -The lifetime of a drm framebuffer is controlled with a reference count,
> > -drivers can grab additional references with
> > -:c:func:`drm_framebuffer_reference()`and drop them again with
> > -:c:func:`drm_framebuffer_unreference()`. For driver-private
> > -framebuffers for which the last reference is never dropped (e.g. for the
> > -fbdev framebuffer when the struct :c:type:`struct drm_framebuffer
> > -` is embedded into the fbdev helper struct)
> > -drivers can manually clean up a framebuffer at module unload time with
> > -:c:func:`drm_framebuffer_unregister_private()`.
> > +.. kernel-doc:: drivers/gpu/drm/drm_framebuffer.c
> > +   :doc: overview
> >  
> >  Frame Buffer Functions Reference
> >  
> > diff --git a/drivers/gpu/drm/drm_framebuffer.c 
> > b/drivers/gpu/drm/drm_framebuffer.c
> > index c7a8a623b336..f2f4928c7262 100644
> > --- a/drivers/gpu/drm/drm_framebuffer.c
> > +++ b/drivers/gpu/drm/drm_framebuffer.c
> > @@ -28,6 +28,41 @@
> >  #include "drm_crtc_internal.h"
> >  
> >  /**
> > + * DOC: overview
> > + *
> > + * Frame buffers are abstract memory objects that provide a source of pixels to
> > + * scanout to a CRTC. Applications explicitly request the creation of frame
> > + * buffers through the DRM_IOCTL_MODE_ADDFB(2) ioctls and receive an opaque
> > + * handle that can be passed to the KMS CRTC control, plane configuration and
> > + * page flip functions.
> > + *
> > + * Frame buffers rely on the underlying memory manager for allocating backing
> > + * storage. When creating a frame buffer applications pass a memory handle
> > + * (or a list of memory handles for multi-planar formats) through the
> > + * struct &drm_mode_fb_cmd2 argument. For drivers using GEM as their userspace
> > + * buffer management interface this would be a GEM handle.  Drivers are however
> > + * free to use their own backing storage object handles, e.g. vmwgfx directly
> > + * exposes special TTM handles to userspace and so expects TTM handles in the
> > + * create ioctl and not GEM handles.
> > + *
> > + * Framebuffers are tracked with struct &drm_framebuffer. They are published
> > + * using drm_framebuffer_init() - after calling that function userspace can use
> > + * and access the framebuffer object. The helper function
> > + * drm_helper_mode_fill_fb_struct() can be used to pre-fill the required
> > + * metadata fields.
> > + *
> > + * The lifetime of a drm framebuffer is controlled with a reference count,
> > + * drivers can grab additional references with drm_framebuffer_reference() and
> > + * drop them again with drm_framebuffer_unreference(). For driver-private
> > + * framebuffers for which the last reference is never dropped (e.g. for the
> > + * fbdev framebuffer when the struct struct &drm_framebuffer is embedded into
> > + * the fbdev helper struct) drivers can manually clean up a framebuffer at
> > + * module unload time with drm_framebuffer_unregister_private(). But doing this
> > + * is not recommended, and it's better to have a normal free-standing struct
> > + * &drm_framebuffer.
> > + */
> > +
>

Re: [Intel-gfx] [PATCH 11/20] drm: Extract drm_framebuffer.[hc]

2016-08-12 Thread Daniel Vetter
On Wed, Aug 10, 2016 at 10:48:20AM -0400, Sean Paul wrote:
> On Tue, Aug 9, 2016 at 9:41 AM, Daniel Vetter  wrote:
> >
> > -/**
> > - * drm_crtc_force_disable_all - Forcibly turn off all enabled CRTCs
> > - * @dev: DRM device whose CRTCs to turn off
> > - *
> > - * Drivers may want to call this on unload to ensure that all displays are
> > - * unlit and the GPU is in a consistent, low power state. Takes modeset 
> > locks.
> > - *
> > - * Returns:
> > - * Zero on success, error code on failure.
> > - */
> > -int drm_crtc_force_disable_all(struct drm_device *dev)
> > -{
> > -   struct drm_crtc *crtc;
> > -   int ret = 0;
> > -
> > -   drm_modeset_lock_all(dev);
> > -   drm_for_each_crtc(crtc, dev)
> > -   if (crtc->enabled) {
> > -   ret = drm_crtc_force_disable(crtc);
> > -   if (ret)
> > -   goto out;
> > -   }
> > -out:
> > -   drm_modeset_unlock_all(dev);
> > -   return ret;
> > -}
> > -EXPORT_SYMBOL(drm_crtc_force_disable_all);
> 
> 
> I'm not so sure about moving this one. If it's going to be declared in
> drm_crtc.h, it should stay here (with force_disable). Alternatively,
> assuming no one else is using this (didn't check), move it to
> drm_framebuffer and make it a static helper function there (removing
> the declaration from drm_crtc.h).

This shouldn't be moved, accidentally overselected. Will fix.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 5/9] drm/i915/cmdparser: Improve hash function

2016-08-12 Thread Matthew Auld
On 12 August 2016 at 18:58, Chris Wilson  wrote:
> On Fri, Aug 12, 2016 at 06:42:30PM +0100, Matthew Auld wrote:
>> > -#define STD_MI_OPCODE_MASK  0xFF80
>> > -#define STD_3D_OPCODE_MASK  0x
>> > -#define STD_2D_OPCODE_MASK  0xFFC0
>> > -#define STD_MFX_OPCODE_MASK 0x
>> > +#define STD_MI_OPCODE_SHIFT  (32 - 9)
>> > +#define STD_3D_OPCODE_SHIFT  (32 - 16)
>> > +#define STD_2D_OPCODE_SHIFT  (32 - 10)
>> > +#define STD_MFX_OPCODE_SHIFT (32 - 16)
>> Why don't we make use of this one in cmd_header_key? What client is it
>> supposed to map to?
>
> It doesn't map to its own CLIENT, it reuses the RC_CLIENT for its
> commands. (iirc)
hmm, okay.

Reviewed-by: Matthew Auld 


Re: [Intel-gfx] [PATCH 1/2] Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference"

2016-08-12 Thread Paul E. McKenney
On Fri, Aug 12, 2016 at 08:25:43PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 11, 2016 at 11:26:47AM -0700, Paul E. McKenney wrote:
> > If my upcoming testing of the two changes together pans out, I will
> > give you a Tested-by -- I am guessing that you don't want to wait
> > until the next merge window for these changes.
> 
> I was planning to stuff them in tip/locking/urgent, so they'd end up in
> this release.

They seem to work fine for me, so for both:

Tested-by: Paul E. McKenney 

Thanx, Paul



Re: [Intel-gfx] [PATCH 1/2] Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference"

2016-08-12 Thread Peter Zijlstra
On Thu, Aug 11, 2016 at 11:26:47AM -0700, Paul E. McKenney wrote:
> If my upcoming testing of the two changes together pans out, I will
> give you a Tested-by -- I am guessing that you don't want to wait
> until the next merge window for these changes.

I was planning to stuff them in tip/locking/urgent, so they'd end up in
this release.


Re: [Intel-gfx] [PATCH 5/9] drm/i915/cmdparser: Improve hash function

2016-08-12 Thread Chris Wilson
On Fri, Aug 12, 2016 at 06:42:30PM +0100, Matthew Auld wrote:
> > -#define STD_MI_OPCODE_MASK  0xFF80
> > -#define STD_3D_OPCODE_MASK  0x
> > -#define STD_2D_OPCODE_MASK  0xFFC0
> > -#define STD_MFX_OPCODE_MASK 0x
> > +#define STD_MI_OPCODE_SHIFT  (32 - 9)
> > +#define STD_3D_OPCODE_SHIFT  (32 - 16)
> > +#define STD_2D_OPCODE_SHIFT  (32 - 10)
> > +#define STD_MFX_OPCODE_SHIFT (32 - 16)
> Why don't we make use of this one in cmd_header_key? What client is it
> supposed to map to?

It doesn't map to its own CLIENT, it reuses the RC_CLIENT for its
commands. (iirc)
-Chris
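
As background to the MASK-to-SHIFT change quoted above, a small sketch of why the two forms select the same bits: a mask with the top N bits set keeps the same field that a right shift by (32 - N) extracts, but the shift yields a compact value that can be used directly as a key. The `sketch_` helpers are hypothetical and do not reproduce the driver's cmd_header_key(); only the two SHIFT values are taken from the quoted patch.

```c
#include <stdint.h>

#define STD_MI_OPCODE_SHIFT (32 - 9)
#define STD_2D_OPCODE_SHIFT (32 - 10)

/* Hypothetical: extract the opcode field of a command header as a small key. */
static uint32_t sketch_opcode_key(uint32_t cmd_header, unsigned int shift)
{
	return cmd_header >> shift;
}

/* Hypothetical: the mask that covers the same top bits as the shift. */
static uint32_t sketch_top_bits_mask(unsigned int shift)
{
	return ~UINT32_C(0) << shift;
}
```

For any header, `(header & sketch_top_bits_mask(shift)) >> shift` equals `sketch_opcode_key(header, shift)`, so no information is lost by storing the shift instead of the mask.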

-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] [PATCH] drm/i915: Embrace the race in busy-ioctl

2016-08-12 Thread Chris Wilson
Daniel Vetter proposed a new challenge to the serialisation inside the
busy-ioctl that exposed a flaw that could result in us reporting the
wrong engine as being busy. If the request is reallocated as we test
its busyness and then reassigned to this object by another thread, we
would not notice that the test itself was incorrect.

We are faced with a choice of using __i915_gem_active_get_request_rcu()
to first acquire a reference to the request preventing the race, or to
acknowledge the race and accept the limitations upon the accuracy of the
busy flags. Note that we guarantee that we never falsely report the
object as idle (providing userspace itself doesn't race), and so the
most important use of the busy-ioctl and its guarantees are fulfilled.
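
The two options described above can be pictured with a userspace sketch. This is an illustration only, written with C11 atomics as stand-ins for the kernel primitives: the acquire load approximates rcu_dereference(), the acquire fence approximates smp_rmb(), and all `sketch_` types and functions are hypothetical. It mirrors the control flow of the old re-check loop and of the simplified read, not the driver code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request that may be recycled at any time. */
struct sketch_request {
	atomic_uint engine_id;
	atomic_bool completed;
};

struct sketch_active {
	_Atomic(struct sketch_request *) request;
};

/* Strict variant: loop until the pointer is unchanged after reading through it. */
static unsigned int sketch_busy_strict(struct sketch_active *active)
{
	for (;;) {
		struct sketch_request *rq =
			atomic_load_explicit(&active->request, memory_order_acquire);
		unsigned int id;

		if (!rq || atomic_load_explicit(&rq->completed, memory_order_relaxed))
			return 0;

		id = atomic_load_explicit(&rq->engine_id, memory_order_relaxed);

		/* Order the reads above before the pointer re-check below. */
		atomic_thread_fence(memory_order_acquire);
		if (rq == atomic_load_explicit(&active->request, memory_order_relaxed))
			return 1u << id;
	}
}

/* Relaxed variant: never reports a busy object as idle, but the engine may be stale. */
static unsigned int sketch_busy_relaxed(struct sketch_active *active)
{
	struct sketch_request *rq =
		atomic_load_explicit(&active->request, memory_order_acquire);

	if (!rq || atomic_load_explicit(&rq->completed, memory_order_relaxed))
		return 0;

	return 1u << atomic_load_explicit(&rq->engine_id, memory_order_relaxed);
}
```

In the relaxed variant the only inaccuracy is which engine bit is set when the request was recycled between the two loads; the idle result is unaffected, which matches the guarantee stated above.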

Signed-off-by: Chris Wilson 
Cc: Daniel Vetter 
Cc: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem.c | 87 ++---
 include/uapi/drm/i915_drm.h | 15 ++-
 2 files changed, 60 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5566916870eb..c77915378768 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3791,49 +3791,54 @@ static __always_inline unsigned int
 __busy_set_if_active(const struct i915_gem_active *active,
 unsigned int (*flag)(unsigned int id))
 {
-   /* For more discussion about the barriers and locking concerns,
-* see __i915_gem_active_get_rcu().
-*/
-   do {
-   struct drm_i915_gem_request *request;
-   unsigned int id;
-
-   request = rcu_dereference(active->request);
-   if (!request || i915_gem_request_completed(request))
-   return 0;
+   struct drm_i915_gem_request *request;
 
-   id = request->engine->exec_id;
+   request = rcu_dereference(active->request);
+   if (!request || i915_gem_request_completed(request))
+   return 0;
 
-   /* Check that the pointer wasn't reassigned and overwritten.
-*
-* In __i915_gem_active_get_rcu(), we enforce ordering between
-* the first rcu pointer dereference (imposing a
-* read-dependency only on access through the pointer) and
-* the second lockless access through the memory barrier
-* following a successful atomic_inc_not_zero(). Here there
-* is no such barrier, and so we must manually insert an
-* explicit read barrier to ensure that the following
-* access occurs after all the loads through the first
-* pointer.
-*
-* It is worth comparing this sequence with
-* raw_write_seqcount_latch() which operates very similarly.
-* The challenge here is the visibility of the other CPU
-* writes to the reallocated request vs the local CPU ordering.
-* Before the other CPU can overwrite the request, it will
-* have updated our active->request and gone through a wmb.
-* During the read here, we want to make sure that the values
-* we see have not been overwritten as we do so - and we do
-* that by serialising the second pointer check with the writes
-* on other other CPUs.
-*
-* The corresponding write barrier is part of
-* rcu_assign_pointer().
-*/
-   smp_rmb();
-   if (request == rcu_access_pointer(active->request))
-   return flag(id);
-   } while (1);
+   /* This is racy. See __i915_gem_active_get_rcu() for an in-depth
+* discussion of how to handle the race correctly, but for reporting
+* the busy state we err on the side of potentially reporting the
+* wrong engine as being busy (but we guarantee that the result
+* is at least self-consistent).
+*
+* As we use SLAB_DESTROY_BY_RCU, the request may be reallocated
+* whilst we are inspecting it, even under the RCU read lock as we are.
+* This means that there is a small window for the engine and/or the
+* seqno to have been overwritten. The seqno will always be in the
+* future compared to the intended, and so we know that if that
+* seqno is idle (on whatever engine) our request is idle and the
+* return 0 above is correct.
+*
+* The issue is that if the engine is switched, it is just as likely
+* to report that it is busy (but since the switch happened, we know
+* the request should be idle). So there is a small chance that a busy
+* result is actually the wrong engine.
+*
+* So why don't we care?
+*
+* For starters, the busy ioctl is a heuristic that

Re: [Intel-gfx] [PATCH 7/9] drm/i915/cmdparser: Check for SKIP descriptors first

2016-08-12 Thread Matthew Auld
Reviewed-by: Matthew Auld 


Re: [Intel-gfx] [PATCH v2] drm/i915: intel_dp_link_is_valid() should only return status of link

2016-08-12 Thread Manasi Navare
On Thu, Aug 11, 2016 at 08:18:54PM -0700, Pandiyan, Dhinakaran wrote:
> On Thu, 2016-08-11 at 15:23 -0700, Manasi Navare wrote:
> > Intel_dp_link_is_valid() function reads the Link status registers
> > and returns a boolean to indicate link is valid or not.
> > If the link has lost lock and is not valid any more, link
> > training is performed outside the function else previously trained link
> > is retained.
> > This gives us flexibility of checking whether link is valid and training
> > it independently.
> > 
> > v2:
> > * Changed the function name from intel_dp_check_link_status()
> > to intel_dp_link_is_valid()  (Lukas Wunner)
> > * Checks for CRTC and active CRTC are moved outside the
> > intel_dp_link_is_valid() function (Rodrigo Vivi)
> > 
> > Signed-off-by: Manasi Navare 
> > ---
> >  drivers/gpu/drm/i915/intel_dp.c | 56 
> > +++--
> >  1 file changed, 37 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_dp.c 
> > b/drivers/gpu/drm/i915/intel_dp.c
> > index 364db90..891147d 100644
> > --- a/drivers/gpu/drm/i915/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > @@ -3881,36 +3881,33 @@ go_again:
> > return -EINVAL;
> >  }
> >  
> > -static void
> > -intel_dp_check_link_status(struct intel_dp *intel_dp)
> > +static bool
> > +intel_dp_link_is_valid(struct intel_dp *intel_dp)
> >  {
> > -   struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
> > struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > u8 link_status[DP_LINK_STATUS_SIZE];
> >  
> > WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));
> >  
> > if (!intel_dp_get_link_status(intel_dp, link_status)) {
> > -   DRM_ERROR("Failed to get link status\n");
> > -   return;
> > +   DRM_DEBUG_KMS("Failed to get link status\n");
> > +   return false;
> > }
> >  
> > -   if (!intel_encoder->base.crtc)
> > -   return;
> > +   /* Check if the link is valid by reading the bits of Link status
> > +* registers
> > +*/
> > +   if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
> > +   DRM_DEBUG_KMS("Channel EQ or CR not ok, need to retrain\n");
> drm_dp_channel_eq_ok() does not check for CR. Should we just say
> "Channel EQ not ok" to preempt ambiguity while debugging ?

Actually this macro checks for DP_CHANNEL_EQ_BITS which is defined as:
#define DP_CHANNEL_EQ_BITS (DP_LANE_CR_DONE |   \
DP_LANE_CHANNEL_EQ_DONE |   \
DP_LANE_SYMBOL_LOCKED)
So it includes checking for Channel EQ and Clock Recovery CR bits
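
To make the point concrete, a standalone sketch of the per-lane test follows. The bit positions are the DPCD lane status bits as I understand them (CR done, channel EQ done, symbol locked in the low three bits of each lane's field). The `sketch_` function is hypothetical and simplified: it takes one status value per lane rather than the packed two-lanes-per-byte DPCD layout, and it leaves out the inter-lane alignment check that, as far as I recall, the in-kernel helper also performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-lane status bits (assumed DPCD layout, low three bits of each lane field). */
#define DP_LANE_CR_DONE         (1 << 0)
#define DP_LANE_CHANNEL_EQ_DONE (1 << 1)
#define DP_LANE_SYMBOL_LOCKED   (1 << 2)

#define DP_CHANNEL_EQ_BITS (DP_LANE_CR_DONE |        \
			    DP_LANE_CHANNEL_EQ_DONE | \
			    DP_LANE_SYMBOL_LOCKED)

/*
 * Hypothetical helper: lane_status[] holds one unpacked status value per lane.
 * A lane passes only if CR, channel EQ and symbol lock are all set.
 */
static bool sketch_all_lanes_eq_ok(const uint8_t *lane_status, int lane_count)
{
	for (int lane = 0; lane < lane_count; lane++) {
		if ((lane_status[lane] & DP_CHANNEL_EQ_BITS) != DP_CHANNEL_EQ_BITS)
			return false;
	}
	return true;
}
```

Because all three bits must be set, a lane that lost clock recovery fails this test even when its channel EQ bit is still set, which is why the debug message mentions both.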


> 
> > +   return false;
> > +   }
> >  
> > -   if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > -   return;
> > +   DRM_DEBUG_KMS("Link is good, no need to retrain\n");
> The caller does not expect us to link train anymore, I don't think we
> have to explicitly state "no need to retrain". Also, do we need debug
> messages if the link is good?

I agree, maybe this is not needed. I will remove this.

> 
> > +   return true;
> >  
> > -   /* if link training is requested we should perform it always */
> > -   if ((intel_dp->compliance_test_type == DP_TEST_LINK_TRAINING) ||
> > -   (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count))) {
> > -   DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n",
> > - intel_encoder->base.name);
> > -   intel_dp_start_link_train(intel_dp);
> > -   intel_dp_stop_link_train(intel_dp);
> > -   }
> >  }
> >  
> > +
> >  /*
> >   * According to DP spec
> >   * 5.1.2:
> > @@ -3928,6 +3925,8 @@ static bool
> >  intel_dp_short_pulse(struct intel_dp *intel_dp)
> >  {
> > struct drm_device *dev = intel_dp_to_dev(intel_dp);
> > +   struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
> > +   struct intel_encoder *intel_encoder = &intel_dig_port->base;
> > u8 sink_irq_vector = 0;
> > u8 old_sink_count = intel_dp->sink_count;
> > bool ret;
> > @@ -3968,8 +3967,18 @@ intel_dp_short_pulse(struct intel_dp *intel_dp)
> > DRM_DEBUG_DRIVER("CP or sink specific irq unhandled\n");
> > }
> >  
> > +   /* Do not train the link if there is no crtc */
> > +   if (!intel_encoder->base.crtc)
> > +   return true;
> > +   if (!to_intel_crtc(intel_encoder->base.crtc)->active)
> > +   return true;
> > +
> I might be completely off base here. Shouldn't we keep the link valid
> irrespective of whether there is an active crtc? I thought that is what
> the refactoring is supposed to enable. Does intel_dp_short_pulse() get
> called when there is a link loss during upfront link training? And in
> that case, shouldn't we retrain even without a crtc? 

We cannot ever retrain without a CRTC. This check is more for making sure
that the clocks are set up before we try to retrain; otherwise we will see
AUX channel failures.
If I track this back in the kerne

Re: [Intel-gfx] [PATCH 5/9] drm/i915/cmdparser: Improve hash function

2016-08-12 Thread Matthew Auld
> -#define STD_MI_OPCODE_MASK  0xFF80
> -#define STD_3D_OPCODE_MASK  0x
> -#define STD_2D_OPCODE_MASK  0xFFC0
> -#define STD_MFX_OPCODE_MASK 0x
> +#define STD_MI_OPCODE_SHIFT  (32 - 9)
> +#define STD_3D_OPCODE_SHIFT  (32 - 16)
> +#define STD_2D_OPCODE_SHIFT  (32 - 10)
> +#define STD_MFX_OPCODE_SHIFT (32 - 16)
Why don't we make use of this one in cmd_header_key? What client is it
supposed to map to?


[Intel-gfx] [PULL] topic/drm-misc

2016-08-12 Thread Daniel Vetter
Hi Dave,

- more fence destaging and cleanup (Gustavo&Sumit)
- DRIVER_LEGACY to untangle from DRIVER_MODESET
- drm_mm refactor (Chris)
- fbdev-less compile fixes
- clipped plane src/dst rects (Ville)
- + a few mediatek patches that build on top of that (Bibby+Daniel)
- small stuff all over really

Cheers, Daniel


The following changes since commit 29b4817d4018df78086157ea3a55c1d9424a7cfc:

  Linux 4.8-rc1 (2016-08-07 18:18:00 -0700)

are available in the git repository at:

  git://anongit.freedesktop.org/drm-intel tags/topic/drm-misc-2016-08-12

for you to fetch changes up to 3590d50e2313644cd192ff55e83df76dea232319:

  dma-buf/fence: kerneldoc: remove spurious section header (2016-08-12 20:32:14 +0530)


Bibby Hsieh (2):
  drm/mediatek: Use drm_atomic destroy_state helpers
  drm/mediatek: Fix mtk_atomic_complete for runtime_pm

Chris Wilson (4):
  drm: Track drm_mm nodes with an interval tree
  drm: Convert drm_vma_manager to embedded interval-tree in drm_mm
  drm: Skip initialising the drm_mm_node->hole_stack
  drm: Declare that create drm_mm nodes with size 0 is illegal

Daniel Kurtz (5):
  drm/mediatek: Remove mtk_drm_crtc_check_flush
  drm/mediatek: plane: Remove plane zpos/index
  drm/mediatek: Remove mtk_drm_plane
  drm/mediatek: plane: Merge mtk_plane_enable into mtk_plane_atomic_update
  drm/mediatek: plane: Use FB's format's cpp to compute x offset

Daniel Vetter (8):
  drm: Mark up legacy/dri1 drivers with DRM_LEGACY
  drm: Used DRM_LEGACY for all legacy functions
  drm: Make sure drm_vblank_no_hw_counter isn't abused
  drm/fb-helper: Add a dummy remove_conflicting_framebuffers
  drm: Remove superflous linux/fb.h includes
  drm/vmwgfx: select CONFIG_FB
  drm/radeon|amgpu: Make fbdev emulation optional
  drm: Protect fb_defio in drivers with CONFIG_KMS_FBDEV_EMULATION

David Herrmann (1):
  drm: rename DRM_MINOR_LEGACY to DRM_MINOR_PRIMARY

Gustavo Padovan (5):
  dma-buf/fence-array: add fence_is_array()
  dma-buf/sync_file: refactor fence storage in struct sync_file
  dma-buf/sync_file: add sync_file_get_fence()
  Documentation: add doc for sync_file_get_fence()
  dma-buf/sync_file: only enable fence signalling on poll()

Joonas Lahtinen (1):
  drm: BIT(DRM_ROTATE_?) -> DRM_ROTATE_?

Keith Packard (1):
  drm: Don't prepare or cleanup unchanging frame buffers [v3]

Lyude (3):
  drm: Add ratelimited versions of the DRM_DEBUG* macros
  drm/dp_helper: Print first error received on failure in drm_dp_dpcd_access()
  drm/dp_helper: Rate limit timeout errors from drm_dp_i2c_do_msg()

Peter Chen (1):
  Revert "gpu: drm: omapdrm: dss-of: add missing of_node_put after calling of_parse_phandle"

Rodrigo Vivi (1):
  drm: Avoid printing negative values for unsigned variables.

Sumit Semwal (2):
  dma-buf/fence: kerneldoc: remove unused struct members
  dma-buf/fence: kerneldoc: remove spurious section header

Ville Syrjälä (9):
  drm: Warn about negative sizes when calculating scale factor
  drm: Store clipped src/dst coordinatee in drm_plane_state
  drm/plane-helper: Add drm_plane_helper_check_state()
  drm/i915: Use drm_plane_state.{src,dst,visible}
  drm/i915: Use drm_plane_helper_check_state()
  drm/rockchip: Use drm_plane_state.{src, dst}
  drm/rockchip: Use drm_plane_helper_check_state()
  drm/mediatek: Use drm_plane_helper_check_state()
  drm/simple_kms_helper: Use drm_plane_helper_check_state()

 Documentation/gpu/drm-internals.rst|   9 +-
 Documentation/sync_file.txt|  14 ++
 drivers/dma-buf/fence-array.c  |   1 +
 drivers/dma-buf/sync_file.c| 204 ++---
 drivers/gpu/drm/Kconfig|   8 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c |   1 -
 drivers/gpu/drm/amd/powerplay/hwmgr/fiji_hwmgr.c   |   1 -
 .../gpu/drm/amd/powerplay/hwmgr/polaris10_hwmgr.c  |   1 -
 drivers/gpu/drm/amd/powerplay/hwmgr/ppatomctrl.c   |   1 -
 drivers/gpu/drm/amd/powerplay/hwmgr/tonga_hwmgr.c  |   1 -
 .../amd/powerplay/hwmgr/tonga_processpptables.c|   1 -
 drivers/gpu/drm/arm/malidp_drv.h   |   2 +-
 drivers/gpu/drm/arm/malidp_planes.c|  20 +-
 drivers/gpu/drm/armada/armada_fbdev.c  |   1 -
 drivers/gpu/drm/armada/armada_overlay.c|   2 +-
 drivers/gpu/drm/ast/ast_fb.c   |   1 -
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c|  22 +--
 drivers/gpu/drm/bochs/bochs.h  |   1 -
 drivers/gpu/drm/bochs/bochs_drv.c  |   3 +-
 drivers/gpu/drm/bridge/parade-ps8622.c |   1 -
 drivers/gpu/drm/cirrus/cirrus_drv.c|   2 +-
 drivers/gpu/drm/cirrus/cirrus_fbdev.c 

[Intel-gfx] [PULL] drm-intel-next

2016-08-12 Thread Daniel Vetter
Hi Dave,

drm-intel-next-2016-08-08:
- refactor ddi buffer programming a bit (Ville)
- large-scale renaming to untangle naming in the gem code (Chris)
- rework vma/active tracking for accurately reaping idle mappings of shared
  objects (Chris)
- misc dp sst/mst probing corner case fixes (Ville)
- tons of cleanup&tunings all around in gem
- lockless (rcu-protected) request lookup, plus use it everywhere for
  non(b)locking waits (Chris)
- pipe crc debugfs fixes (Rodrigo)
- random fixes all over
drm-intel-next-2016-07-25:
- more engine code unification (Tvrtko)
- reorganize rps&rc6 setup (Chris Wilson)
- hotplug polling when in deep rpm states, especially fixes vls (Lyude)
- mocs fix for bxt (Imre)
- convert i915 request to use dma fences (Chris)
- prep work for lockless i915 requests/fences (needed for full sync integration)
  from Chris Wilson
- wait for external rendering/fences attached to dma_bufs (Chris)
- tons of small bugfixes all over

Note that this also contains a backmerge (git got confused), but once you've
pulled in all pending pulls (there are a few now) I want to do another
backmerge to get at the latest fences stuff from Gustavo.

Cheers, Daniel


The following changes since commit 1cf915d305b6e1d57db6c35c208016f9747ba3c6:

  Merge tag 'imx-drm-fixes-2016-07-27' of git://git.pengutronix.de/git/pza/linux into drm-next (2016-07-30 05:45:30 +1000)

are available in the git repository at:

  git://anongit.freedesktop.org/drm-intel tags/drm-intel-next-2016-08-08

for you to fetch changes up to c5b7e97b27db4f8a8ffe1072506620679043f006:

  drm/i915: Update DRIVER_DATE to 20160808 (2016-08-08 09:37:31 +0200)


- refactor ddi buffer programming a bit (Ville)
- large-scale renaming to untangle naming in the gem code (Chris)
- rework vma/active tracking for accurately reaping idle mappings of shared
  objects (Chris)
- misc dp sst/mst probing corner case fixes (Ville)
- tons of cleanup&tunings all around in gem
- lockless (rcu-protected) request lookup, plus use it everywhere for
  non(b)locking waits (Chris)
- pipe crc debugfs fixes (Rodrigo)
- random fixes all over


Akash Goel (1):
  drm/i915/gen9: Update i915_drpc_info debugfs for coarse pg & forcewake info

Bob Paauwe (1):
  drm/i915: Set legacy properties when using legacy gamma set IOCTL. (v2)

Chris Wilson (152):
  drm/i915/breadcrumbs: Queue hangcheck before sleeping
  drm/i915: Kick hangcheck from retire worker
  drm/i915: Remove temporary RPM wakeref assert disables
  drm/i915: Update ifdeffery for mutex->owner
  drm/i915: Provide argument names for static stubs
  drm/i915: Flush GT idle status upon reset
  drm/i915: Preserve current RPS frequency across init
  drm/i915: Perform static RPS frequency setup before userspace
  drm/i915: Move overclocking detection to alongside RPS frequency detection
  drm/i915: Define a separate variable and control for RPS waitboost frequency
  drm/i915: Remove superfluous powersave work flushing
  drm/i915: Defer enabling rc6 til after we submit the first batch/context
  drm/i915: Hide gen6_update_ring_freq()
  drm/i915/fbdev: Drain the suspend worker on retiring
  drm/i915/fbdev: Check for the framebuffer before use
  drm/i915/evict: Always switch away from the current context
  drm/i915: Flush logical context image out to memory upon suspend
  drm/i915: Handle ENOSPC after failing to insert a mappable node
  drm/i915: Move GEM request routines to i915_gem_request.c
  drm/i915: Retire oldest completed request before allocating next
  drm/i915: Mark all current requests as complete before resetting them
  drm/i915: Derive GEM requests from dma-fence
  drm/i915: Disable waitboosting for fence_wait()
  drm/i915: Disable waitboosting for mmioflips/semaphores
  drm/i915: Mark imported dma-buf objects as being coherent
  drm/i915: Wait on external rendering for GEM objects
  drm/i915: Rename request reference/unreference to get/put
  drm/i915: Rename i915_gem_context_reference/unreference()
  drm/i915: Wrap drm_gem_object_lookup in i915_gem_object_lookup
  drm/i915: Wrap drm_gem_object_reference in i915_gem_object_get
  drm/i915: Rename drm_gem_object_unreference in preparation for lockless free
  drm/i915: Rename drm_gem_object_unreference_unlocked in preparation for lockless free
  drm/i915: Treat ringbuffer writes as write to normal memory
  drm/i915: Rename ring->virtual_start as ring->vaddr
  drm/i915: Convert i915_semaphores_is_enabled over to early sanitize
  drm/i915: Enable RC6 immediately
  Revert "drm/i915: Enable RC6 immediately"
  drm/i915: Drop racy markup of missed-irqs from idle-worker
  drm/i915: Update the breadcrumb interrupt counter before enabling
  drm/i915: Reduce breadcrumb lock coverage for 
intel_engine

Re: [Intel-gfx] [PATCH 15/20] drm/i915: Debugfs support for GuC logging control

2016-08-12 Thread Goel, Akash



On 8/12/2016 9:27 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

This patch provides debugfs interface i915_guc_output_control for
on the fly enabling/disabling of logging in GuC firmware and controlling
the verbosity level of logs.
The value written to the file, should have bit 0 set to enable logging
and bits 4-7 should contain the verbosity info.
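
To make the encoding concrete, here is a minimal userspace sketch of how such
a value could be packed and unpacked. The helper and macro names are
hypothetical and not part of the patch; only the bit layout (bit 0 for enable,
bits 4-7 for verbosity) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bit layout comes from the patch description. */
#define LOG_CTRL_ENABLE_BIT      0x1u  /* bit 0: logging on/off */
#define LOG_CTRL_VERBOSITY_SHIFT 4     /* bits 4-7: verbosity level */
#define LOG_CTRL_VERBOSITY_MASK  0xFu

/* Build the value a user would write to the debugfs file. */
static uint64_t log_ctrl_pack(bool enable, unsigned int verbosity)
{
	/* Verbosity has to fit in the 4-bit field. */
	assert(verbosity <= LOG_CTRL_VERBOSITY_MASK);
	return (enable ? LOG_CTRL_ENABLE_BIT : 0u) |
	       ((uint64_t)verbosity << LOG_CTRL_VERBOSITY_SHIFT);
}

/* Extract the enable flag from bit 0. */
static bool log_ctrl_enabled(uint64_t val)
{
	return (val & LOG_CTRL_ENABLE_BIT) != 0;
}

/* Extract the verbosity level from bits 4-7. */
static unsigned int log_ctrl_verbosity(uint64_t val)
{
	return (unsigned int)((val >> LOG_CTRL_VERBOSITY_SHIFT) &
			      LOG_CTRL_VERBOSITY_MASK);
}
```

Under this layout, a value of 0x21 would request logging enabled at
verbosity 2, and 0x30 would carry verbosity 3 with logging disabled.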

v2: Add a forceful flush, to collect left over logs, on disabling logging.
    Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
 set the guc_log_level to -1 when logging is disabled. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_debugfs.c| 44 -
  drivers/gpu/drm/i915/i915_guc_submission.c | 63 ++
  drivers/gpu/drm/i915/intel_guc.h   |  1 +
  3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 14e0dcf..f472fbcd3 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2674,6 +2674,47 @@ static int i915_guc_log_dump(struct seq_file
*m, void *data)
  return 0;
  }

+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+struct drm_device *dev = data;
+struct drm_i915_private *dev_priv = to_i915(dev);
+
+if (!dev_priv->guc.log.obj)
+return -EINVAL;
+
+*val = i915.guc_log_level;
+
+return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+struct drm_device *dev = data;
+struct drm_i915_private *dev_priv = to_i915(dev);
+int ret;
+
+ret = mutex_lock_interruptible(&dev->struct_mutex);
+if (ret)
+return ret;
+
+if (!dev_priv->guc.log.obj) {
+ret = -EINVAL;
+goto end;
+}
+
+intel_runtime_pm_get(dev_priv);
+ret = i915_guc_log_control(dev_priv, val);
+intel_runtime_pm_put(dev_priv);
+
+end:
+mutex_unlock(&dev->struct_mutex);
+return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+i915_guc_log_control_get, i915_guc_log_control_set,
+"%lld\n");
+
  static int i915_edp_psr_status(struct seq_file *m, void *data)
  {
  struct drm_info_node *node = m->private;
@@ -5477,7 +5518,8 @@ static const struct i915_debugfs_files {
  {"i915_fbc_false_color", &i915_fbc_fc_fops},
  {"i915_dp_test_data", &i915_displayport_test_data_fops},
  {"i915_dp_test_type", &i915_displayport_test_type_fops},
-{"i915_dp_test_active", &i915_displayport_test_active_fops}
+{"i915_dp_test_active", &i915_displayport_test_active_fops},
+{"i915_guc_log_control", &i915_guc_log_control_fops}
  };

  void intel_display_crc_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 4a75c16..041cf68 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct
intel_guc *guc)
  return host2guc_action(guc, data, 2);
  }

+static int host2guc_logging_control(struct intel_guc *guc, u32
control_val)
+{
+u32 data[2];
+
+data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+data[1] = control_val;
+
+return host2guc_action(guc, data, 2);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -1538,3 +1548,56 @@ void i915_guc_register(struct drm_i915_private
*dev_priv)
  guc_log_late_setup(&dev_priv->guc);
  mutex_unlock(&dev_priv->drm.struct_mutex);
  }
+
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64
control_val)
+{
+union guc_log_control log_param;
+int ret;
+
+log_param.logging_enabled = control_val & 0x1;
+log_param.verbosity = (control_val >> 4) & 0xF;


Maybe "log_param.value = control_val" would also work since
guc_log_control is conveniently defined as an union. Doesn't matter though.


+
+if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+return -EINVAL;
+
+/* This combination doesn't make sense & won't have any effect */
+if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+return 0;


I wonder if it would work and maybe look nicer to generalize as:

int guc_log_level;

guc_log_level = log_param.logging_enabled ? log_param.verbosity : -1;
if (i915.guc_log_level == guc_log_level)
return 0;


Fine, will try to refactor the code as per your suggestions.
Thanks for the suggestions.


+
+ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+if (ret < 0) {
+DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
+return ret;
+}
+
+i915.guc_log_level = log_param.verbosity;


This would then become i915.guc_log_level = guc_log_level.
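
Putting the two suggestions together, the decision logic might look roughly
like the standalone sketch below. The global, the counter and the
send_to_guc() stub are stand-ins assumed for illustration only; in the driver
the state lives in i915.guc_log_level and the request goes through
host2guc_logging_control().

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for i915.guc_log_level; -1 means logging is disabled. */
static int current_log_level = -1;
/* Counts how many requests would have been sent to the firmware. */
static int requests_sent;

/* Stub for the host-to-GuC call; always succeeds in this sketch. */
static int send_to_guc(bool enable, int verbosity)
{
	(void)enable;
	(void)verbosity;
	requests_sent++;
	return 0;
}

/* Returns 0 on success or when nothing needs to change. */
static int log_control(bool enable, int verbosity)
{
	int new_level = enable ? verbosity : -1;
	int ret;

	/* Verbosity is a 4-bit field in the patch. */
	assert(verbosity >= 0 && verbosity <= 0xF);

	/* Covers both "already disabled" and "same verbosity again". */
	if (new_level == current_log_level)
		return 0;

	ret = send_to_guc(enable, verbosity);
	if (ret < 0)
		return ret;

	current_log_level = new_level;
	return 0;
}
```

The single comparison replaces the special case for "disable while already
disabled" and also skips a redundant request when the same verbosity is
written twice.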


+
+/* 

Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps

2016-08-12 Thread Goel, Akash



On 8/12/2016 8:46 PM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 08:43:58PM +0530, Goel, Akash wrote:

On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:

Unrelated and unmentioned change to no guard page. Best to remove IMHO.
Can keep the RB in that case.


Though it's not called out (sorry for that), isn't it better to
avoid using the guard page, which will save 4KB of vmalloc virtual
space (which is scarce) for every mapping created by the Driver?

Would updating the commit message to mention this be fine?


Too late, already applied without the new flag.


Oh, the patch is already queued for merge?


Yes, that's why I dropped the guard page when I found out it was being
added. Send a patch to add the flag and we can discuss whether we think
our code is adequate to not require the protection.


Fine, will prepare a separate patch to avoid using the guard page.

Best regards
Akash


-Chris


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs

2016-08-12 Thread Goel, Akash



On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

As per the current i915 Driver load sequence, debugfs registration is done
at the end and so the relay channel debugfs file is also created after that
but the GuC firmware is loaded much earlier in the sequence.
As a result Driver could miss capturing the boot-time logs of GuC firmware
if there are flush interrupts from the GuC side.
Relay has a provision to support early logging where initially only relay
channel can be created, to have buffers for storing logs, and later on
channel can be associated with a debugfs file at appropriate time.
Have availed that, which allows Driver to capture boot time logs also,
which can be collected once Userspace comes up.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 61 +-
  1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index af48f62..1c287d7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct
intel_guc *guc)
  relay_close(guc->log.relay_chan);
  }

-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
  {
  struct drm_i915_private *dev_priv = guc_to_i915(guc);
  struct rchan *guc_log_relay_chan;
-struct dentry *log_dir;
  size_t n_subbufs, subbuf_size;

-/* For now create the log file in /sys/kernel/debug/dri/0 dir */
-log_dir = dev_priv->drm.primary->debugfs_root;
-
-/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
- * not mounted and so can't create the relay file.
- * The relay API seems to fit well with debugfs only.


It only needs a dentry, I don't see that it has to be a debugfs one.

Besides a dentry, there are other requirements for using relay, which can
be met only for a debugfs file.
debugfs wasn't the preferred choice for placing the log file, but there was
no other option, as the relay API is compatible with debugfs only.

Also, retrieving the dentry of a file is not as straightforward as it might
seem (I spent considerable time on this initially).




- */
-if (!log_dir) {
-DRM_DEBUG_DRIVER("Parent debugfs directory not available
yet\n");
-return -ENODEV;
-}
-
  /* Keep the size of sub buffers same as shared log buffer */
  subbuf_size = guc->log.obj->base.size;
  /* Store up to 8 snaphosts, which is large enough to buffer
sufficient
@@ -1127,7 +1114,7 @@ static int guc_create_log_relay_file(struct
intel_guc *guc)
   */
  n_subbufs = 8;

-guc_log_relay_chan = relay_open("guc_log", log_dir,
+guc_log_relay_chan = relay_open(NULL, NULL,
  subbuf_size, n_subbufs, &relay_callbacks, dev_priv);

  if (!guc_log_relay_chan) {
@@ -1140,6 +1127,33 @@ static int guc_create_log_relay_file(struct
intel_guc *guc)
  return 0;
  }

+static int guc_create_log_relay_file(struct intel_guc *guc)
+{
+struct drm_i915_private *dev_priv = guc_to_i915(guc);
+struct dentry *log_dir;
+int ret;
+
+/* For now create the log file in /sys/kernel/debug/dri/0 dir */
+log_dir = dev_priv->drm.primary->debugfs_root;
+
+/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
+ * not mounted and so can't create the relay file.
+ * The relay API seems to fit well with debugfs only.
+ */
+if (!log_dir) {
+DRM_DEBUG_DRIVER("Parent debugfs directory not available
yet\n");
+return -ENODEV;
+}
+
+ret = relay_late_setup_files(guc->log.relay_chan, "guc_log",
log_dir);
+if (ret) {
+DRM_DEBUG_DRIVER("Couldn't associate the channel with file
%d\n", ret);
+return ret;
+}
+
+return 0;
+}
+
  static void guc_log_cleanup(struct intel_guc *guc)
  {
  struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -1167,7 +1181,7 @@ static int guc_create_log_extras(struct
intel_guc *guc)
  {
  struct drm_i915_private *dev_priv = guc_to_i915(guc);
  void *vaddr;
-int ret;
+int ret = 0;

  lockdep_assert_held(&dev_priv->drm.struct_mutex);

@@ -1190,7 +1204,15 @@ static int guc_create_log_extras(struct
intel_guc *guc)
  guc->log.buf_addr = vaddr;
  }

-return 0;
+if (!guc->log.relay_chan) {
+/* Create a relay channel, so that we have buffers for storing
+ * the GuC firmware logs, the channel will be linked with a file
+ * later on when debugfs is registered.
+ */
+ret = guc_create_relay_channel(guc);
+}
+
+return ret;
  }

  static void guc_create_log(struct intel_guc *guc)
@@ -1231,6 +1253,7 @@ static void guc_create_log(struct intel_guc *guc)
  guc->log.obj = obj;

  if (guc_create_log_extras(guc

Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

As per the current i915 Driver load sequence, debugfs registration is done
at the end and so the relay channel debugfs file is also created after that
but the GuC firmware is loaded much earlier in the sequence.
As a result Driver could miss capturing the boot-time logs of GuC firmware
if there are flush interrupts from the GuC side.
Relay has a provision to support early logging where initially only relay
channel can be created, to have buffers for storing logs, and later on
channel can be associated with a debugfs file at appropriate time.
Have availed that, which allows Driver to capture boot time logs also,
which can be collected once Userspace comes up.
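
The two-step idea can be illustrated with a small userspace toy model. This is
only a sketch under stated assumptions: struct toy_chan and its helpers are
invented for illustration and are not the kernel's struct rchan or relay API;
they merely mirror the sequence of creating buffers first and attaching a file
name later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define N_SUBBUFS   8   /* the patch keeps 8 sub buffers */
#define SUBBUF_SIZE 16  /* tiny size, for illustration only */

/* Toy model of a relay channel; not the kernel's struct rchan. */
struct toy_chan {
	char subbuf[N_SUBBUFS][SUBBUF_SIZE];
	size_t produced;        /* sub buffers filled so far */
	const char *file_name;  /* NULL until a file is attached */
};

/* Step 1: create only the buffers, no file yet. */
static void toy_chan_open(struct toy_chan *c)
{
	assert(c != NULL);
	memset(c, 0, sizeof(*c));
}

/* Store one snapshot; returns -1 when all sub buffers are full. */
static int toy_chan_write(struct toy_chan *c, const char *snapshot)
{
	if (c->produced == N_SUBBUFS)
		return -1;
	snprintf(c->subbuf[c->produced], SUBBUF_SIZE, "%s", snapshot);
	c->produced++;
	return 0;
}

/* Step 2: attach a file name later; data stored earlier is kept. */
static int toy_chan_late_setup(struct toy_chan *c, const char *name)
{
	if (c->file_name)
		return -1;  /* already attached */
	c->file_name = name;
	return 0;
}
```

In this model, a snapshot written before toy_chan_late_setup() is still
present afterwards, which is the property the patch relies on for boot time
logs.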

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 61 +-
  1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index af48f62..1c287d7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct intel_guc 
*guc)
relay_close(guc->log.relay_chan);
  }

-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
  {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct rchan *guc_log_relay_chan;
-   struct dentry *log_dir;
size_t n_subbufs, subbuf_size;

-   /* For now create the log file in /sys/kernel/debug/dri/0 dir */
-   log_dir = dev_priv->drm.primary->debugfs_root;
-
-   /* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
-* not mounted and so can't create the relay file.
-* The relay API seems to fit well with debugfs only.


It only needs a dentry, I don't see that it has to be a debugfs one.


-*/
-   if (!log_dir) {
-   DRM_DEBUG_DRIVER("Parent debugfs directory not available 
yet\n");
-   return -ENODEV;
-   }
-
/* Keep the size of sub buffers same as shared log buffer */
subbuf_size = guc->log.obj->base.size;
/* Store up to 8 snaphosts, which is large enough to buffer sufficient
@@ -1127,7 +1114,7 @@ static int guc_create_log_relay_file(struct intel_guc 
*guc)
   */
n_subbufs = 8;

-   guc_log_relay_chan = relay_open("guc_log", log_dir,
+   guc_log_relay_chan = relay_open(NULL, NULL,
subbuf_size, n_subbufs, &relay_callbacks, dev_priv);

if (!guc_log_relay_chan) {
@@ -1140,6 +1127,33 @@ static int guc_create_log_relay_file(struct intel_guc 
*guc)
return 0;
  }

+static int guc_create_log_relay_file(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   struct dentry *log_dir;
+   int ret;
+
+   /* For now create the log file in /sys/kernel/debug/dri/0 dir */
+   log_dir = dev_priv->drm.primary->debugfs_root;
+
+   /* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is
+* not mounted and so can't create the relay file.
+* The relay API seems to fit well with debugfs only.
+*/
+   if (!log_dir) {
+   DRM_DEBUG_DRIVER("Parent debugfs directory not available 
yet\n");
+   return -ENODEV;
+   }
+
+   ret = relay_late_setup_files(guc->log.relay_chan, "guc_log", log_dir);
+   if (ret) {
+   DRM_DEBUG_DRIVER("Couldn't associate the channel with file 
%d\n", ret);
+   return ret;
+   }
+
+   return 0;
+}
+
  static void guc_log_cleanup(struct intel_guc *guc)
  {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -1167,7 +1181,7 @@ static int guc_create_log_extras(struct intel_guc *guc)
  {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
void *vaddr;
-   int ret;
+   int ret = 0;

lockdep_assert_held(&dev_priv->drm.struct_mutex);

@@ -1190,7 +1204,15 @@ static int guc_create_log_extras(struct intel_guc *guc)
guc->log.buf_addr = vaddr;
}

-   return 0;
+   if (!guc->log.relay_chan) {
+   /* Create a relay channel, so that we have buffers for storing
+* the GuC firmware logs, the channel will be linked with a file
+* later on when debugfs is registered.
+*/
+   ret = guc_create_relay_channel(guc);
+   }
+
+   return ret;
  }

  static void guc_create_log(struct intel_guc *guc)
@@ -1231,6 +1253,7 @@ static void guc_create_log(struct intel_guc *guc)
guc->log.obj = obj;

if (guc_create_log_extras(guc)) {
+   guc_log_cleanup(guc);
gem_release_guc_obj(guc->log.obj);
guc->log.obj = NULL;
   

Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:37 PM, Tvrtko Ursulin wrote:


On 12/08/16 14:45, Goel, Akash wrote:



On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.
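
As a rough illustration of that handshake, the sketch below models one flush
event in plain C. The structure and field names are hypothetical and much
simpler than the real state structure defined by the GuC firmware interface,
and the wrap-around handling is an assumption of this model; the point is only
that the read pointer is advanced whether or not the data could be copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified state; not the real firmware layout. */
struct toy_log_state {
	uint32_t read_ptr;   /* updated by the host */
	uint32_t write_ptr;  /* updated by the firmware */
	uint32_t size;       /* bytes in the log region */
};

/*
 * Handle one flush event. Copies the unread bytes to dst when dst is
 * non-NULL (dst must hold at least st->size bytes), and advances
 * read_ptr in either case so the firmware side sees the data as
 * consumed. Returns the number of bytes copied.
 */
static size_t toy_handle_flush(struct toy_log_state *st,
			       const uint8_t *logbuf, uint8_t *dst)
{
	uint32_t rd = st->read_ptr, wr = st->write_ptr;
	size_t copied = 0;

	assert(rd < st->size && wr < st->size);

	if (dst) {
		if (wr >= rd) {
			copied = wr - rd;
			memcpy(dst, logbuf + rd, copied);
		} else {
			/* Writer wrapped: copy the tail first, then the head. */
			size_t tail = st->size - rd;

			memcpy(dst, logbuf + rd, tail);
			memcpy(dst + tail, logbuf, wr);
			copied = tail + wr;
		}
	}

	st->read_ptr = wr;  /* acknowledge even when nothing was copied */
	return copied;
}
```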

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
  crash buffer area for regular cases and copying only the state
  structure data in first page.

v3:
- Create a vmalloc mapping of log buffer. (Chris)
- Cover the flush acknowledgment under rpm get & put. (Chris)
- Revert the change of skipping the copy of crash dump area, as
  not really needed, will be covered by subsequent patch.

v4:
- Destroy the wq under the same condition in which it was created,
  pass dev_priv pointer instead of dev to newly added GuC function,
  add more comments & rename variable for clarity. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.c|  14 +++
  drivers/gpu/drm/i915/i915_guc_submission.c | 150 +
  drivers/gpu/drm/i915/i915_irq.c|   5 +-
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 0fcd1c0..fc2da32 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
drm_i915_private *dev_priv)
  if (dev_priv->hotplug.dp_wq == NULL)
  goto out_free_wq;

+if (HAS_GUC_SCHED(dev_priv)) {


This just reminded me that a previous patch had:

+if (HAS_GUC_UCODE(dev))
+dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;

In the interrupt setup. I don't think there is a bug right now, but
there is a disagreement between the two which would be good to resolve.

This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
for correctness. I think.


Sorry for the inconsistency, will use HAS_GUC_SCHED in the previous patch.

As per Chris's comments, will move the wq init/destroy to the GuC logging
setup/teardown routines (guc_create_log_extras, guc_log_cleanup).
Are you fine with that?


Yes, that's OK I think.




+/* Need a dedicated wq to process log buffer flush interrupts
+ * from GuC without much delay so as to avoid any loss of
logs.
+ */
+dev_priv->guc.log.wq =
+alloc_ordered_workqueue("i915-guc_log", 0);
+if (dev_priv->guc.log.wq == NULL)
+goto out_free_hotplug_dp_wq;
+}
+
  return 0;

+out_free_hotplug_dp_wq:
+destroy_workqueue(dev_priv->hotplug.dp_wq);
  out_free_wq:
  destroy_workqueue(dev_priv->wq);
  out_err:
@@ -782,6 +794,8 @@ out_err:

  static void i915_workqueues_cleanup(struct drm_i915_private
*dev_priv)
  {
+if (HAS_GUC_SCHED(dev_priv))
+destroy_workqueue(dev_priv->guc.log.wq);
  destroy_workqueue(dev_priv->hotplug.dp_wq);
  destroy_workqueue(dev_priv->wq);
  }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c7c679f..2635b67 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
intel_guc *guc,
  return host2guc_action(guc, data, ARRAY_SIZE(data));
  }

+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+u32 data[1];
+
+data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+return host2guc_action(guc, data, 1);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -840,6 +849,127 @@ err:
  return NULL;
  }

+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+return;
+}
+
+static void* guc_get_write_buffer(struct intel_guc *guc)
+{
+return NULL;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+struct guc_log_buffer_state *log_buffer_state,
*log_buffer_snapshot_state;
+struct guc_log_buffer_state log_buffer_state_local;
+void *src_data_ptr, *dst_data_ptr;
+u32 i, buffer_size;


unsigned int i if you can be bothered.


Fine will do that for both i & buffer_size.


buffer_size can match the type of log_buffer_state_local.size or use
something else if more appropriate.


But I remember that earlier, in one of the patches, you suggested using u32
as the type for some variables.
Could you please share the guideline?
Should u32, u64 be used when we are exactly sure of the range of the
variable, like for 

Re: [Intel-gfx] [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:23 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

Added a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. Availed relay framework to implement
the interface, where Driver will have to just use a relay API to store
snapshots of the GuC log buffer in the buffer managed by relay.
The snapshot will be taken when GuC firmware sends a log buffer flush
interrupt and up to four snaphots could be stored in the relay buffer.


snapshots


The relay buffer will be operated in a mode where it will overwrite the
data not yet collected by User.
Besides mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the
'poll'
call on log file, User can come to know whenever a new snapshot of the
log buffer is taken by Driver, so can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of
data.
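
On the User side, the 'poll' based collection described above could look
something like the following sketch. It only uses standard POSIX calls; the
helper name is made up, and opening the log file (wherever debugfs exposes it
on a given system) is left to the caller, so nothing here is specific to the
relay file beyond being a readable descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for data on fd, then read at most len bytes.
 * Returns bytes read, 0 on timeout, hang-up or end of file, -1 on error.
 */
static ssize_t wait_and_read(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	assert(buf != NULL);

	/* Retry if a signal interrupts the wait. */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -1;
	if (ret == 0 || !(pfd.revents & POLLIN))
		return 0;

	return read(fd, buf, len);
}
```

A collector would call this in a loop and append whatever it reads to its own
file, so that it keeps pace with the snapshots taken by the Driver.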

v2: Defer the creation of relay channel & associated debugfs file, as
 debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture
   the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
   sufficient buffering for boot time logs

Suggested-by: Chris Wilson 
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/Kconfig   |   1 +
  drivers/gpu/drm/i915/i915_drv.c|   2 +
  drivers/gpu/drm/i915/i915_guc_submission.c | 206 -
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 209 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e46..fc900d2 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -11,6 +11,7 @@ config DRM_I915
  select DRM_KMS_HELPER
  select DRM_PANEL
  select DRM_MIPI_DSI
+select RELAY
  # i915 depends on ACPI_VIDEO when ACPI is enabled
  # but for select to work, need to select ACPI_VIDEO's
dependencies, ick
  select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index fc2da32..cb8c943 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1145,6 +1145,7 @@ static void i915_driver_register(struct
drm_i915_private *dev_priv)
  /* Reveal our presence to userspace */
  if (drm_dev_register(dev, 0) == 0) {
  i915_debugfs_register(dev_priv);
+i915_guc_register(dev_priv);
  i915_setup_sysfs(dev);
  } else
  DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1183,6 +1184,7 @@ static void i915_driver_unregister(struct
drm_i915_private *dev_priv)
  intel_opregion_unregister(dev_priv);

  i915_teardown_sysfs(&dev_priv->drm);
+i915_guc_unregister(dev_priv);
  i915_debugfs_unregister(dev_priv);
  drm_dev_unregister(&dev_priv->drm);

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2635b67..1a2d648 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
   */
  #include 
  #include 
+#include 
+#include 
  #include "i915_drv.h"
  #include "intel_guc.h"

@@ -851,12 +853,33 @@ err:

  static void guc_move_to_next_buf(struct intel_guc *guc)
  {
-return;
+/* Make sure the updates made in the sub buffer are visible when
+ * Consumer sees the following update to offset inside the sub
buffer.
+ */
+smp_wmb();
+
+/* All data has been written, so now move the offset of sub
buffer. */
+relay_reserve(guc->log.relay_chan, guc->log.obj->base.size);
+
+/* Switch to the next sub buffer */
+relay_flush(guc->log.relay_chan);
  }

  static void* guc_get_write_buffer(struct intel_guc *guc)
  {
-return NULL;
+/* FIXME: Cover the check under a lock ? */


Need to resolve before r-b in any case.


After the last patch in this series, where the relay channel will be created
before enabling the GuC interrupts, the lock will no longer be needed,
so will remove these comments in that patch.





+if (!guc->log.relay_chan)
+return NULL;
+
+/* Just get the base address of a new sub buffer and copy data
into it
+ * ourselves. NULL will be returned in no-overwrite mode, if all sub
+ * buffers are full. Could have used the relay_write() to indirectly
+ * copy the data, but that would have been bit convoluted, as we
need to
+ * write to only certain locations inside a sub buffer which
cannot be
+ * done without using relay_reserve() along with relay_write().
So its
+ * better to use relay_

Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-12 Thread Chris Wilson
On Fri, Aug 12, 2016 at 09:34:23PM +0530, Goel, Akash wrote:
> 
> 
> On 8/12/2016 9:22 PM, Chris Wilson wrote:
> >On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:
> >>On 8/12/2016 9:02 PM, Chris Wilson wrote:
> >>>There's (or will be) a function to dump the error object in a uniform
> >>>manner. This patch is obsolete.
> >>
> >>There is a print_error_obj() function, but that prints one dword per line.
> >
> >It used to. It will shortly be a compressed stream.
> 
> >Pretty printing is left to userspace.
> But invariably, we only will be interpreting the error state or Guc
> log buffer dump, and it will be really convenient if we can have 4
> dwords per line matching the log sample size.

That's fine. Do it in userspace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 19/20] drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

In order to have fast reads from the GuC log buffer, use the SSE4.1 movntdqa
based memcpy function i915_memcpy_from_wc.
The GuC log buffer has a WC type vmalloc mapping, and copying from WC type
memory using movntdqa is almost as fast as reading from WB memory.
This further reduces the log buffer sampling time, which is badly needed
to deal with the flush interrupt storm when GuC is generating logs at a very
high rate.
Ideally SSE4.1 should be present on all chipsets supporting GuC based
submissions, but if it is not, logging will not be enabled.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 17 ++---
  1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1818343..af48f62 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -987,15 +987,16 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
/* Just copy the newly written data */
if (read_offset <= write_offset) {
bytes_to_copy = write_offset - read_offset;
-   memcpy(dst_data_ptr + read_offset,
+   i915_memcpy_from_wc(dst_data_ptr + read_offset,
 src_data_ptr + read_offset, bytes_to_copy);
} else {
bytes_to_copy = buffer_size - read_offset;
-   memcpy(dst_data_ptr + read_offset,
+   i915_memcpy_from_wc(dst_data_ptr + read_offset,
 src_data_ptr + read_offset, bytes_to_copy);

bytes_to_copy = write_offset;
-   memcpy(dst_data_ptr, src_data_ptr, 
bytes_to_copy);
+   i915_memcpy_from_wc(dst_data_ptr, src_data_ptr,
+bytes_to_copy);
}

src_data_ptr += buffer_size;
@@ -1210,6 +1211,16 @@ static void guc_create_log(struct intel_guc *guc)

obj = guc->log.obj;
if (!obj) {
+   /* We require SSE 4.1 for fast reads from the GuC log buffer and
+* it should be present on the chipsets supporting GuC based
+* submissions.
+*/
+   if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) {
+   /* logging will not be enabled */
+   i915.guc_log_level = -1;
+   return;
+   }
+
obj = gem_allocate_guc_obj(dev_priv, size);
if (!obj) {
/* logging will be off */



Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
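
For readers following the copy logic in the hunk above: the log buffer is circular, so the newly written data is either one contiguous span or a tail plus a head. A minimal standalone sketch of that case split follows. It is an illustration only: copy_new_log_data is a hypothetical name, and plain memcpy() stands in for i915_memcpy_from_wc().

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy only the newly written bytes of a circular log buffer.
 * Hypothetical helper for illustration; plain memcpy() stands in for
 * i915_memcpy_from_wc(). Returns the number of bytes copied.
 */
static size_t copy_new_log_data(unsigned char *dst, const unsigned char *src,
                                size_t buffer_size, size_t read_offset,
                                size_t write_offset)
{
    size_t copied;

    if (read_offset <= write_offset) {
        /* No wrap: the new data is one contiguous span. */
        copied = write_offset - read_offset;
        memcpy(dst + read_offset, src + read_offset, copied);
    } else {
        /* Wrapped: copy the tail of the buffer first, then the head. */
        size_t tail = buffer_size - read_offset;

        memcpy(dst + read_offset, src + read_offset, tail);
        memcpy(dst, src, write_offset);
        copied = tail + write_offset;
    }

    return copied;
}
```

Data is copied to the same offset in the destination as in the source, as in the patch, so the destination ends up as a mirror of the source buffer.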


Re: [Intel-gfx] [PATCH 17/20] drm/i915: Use uncached(WC) mapping for acessing the GuC log buffer

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

The host needs to sample the GuC log buffer on every flush interrupt from GuC.
To ensure that we always get up-to-date data from the log buffer, it's
better to access the buffer through an uncached CPU mapping. Also, given the
way the buffer is accessed from the GuC and host side, manually flushing the
cache may not always be effective if a cached CPU mapping is used.
Uncached reads may have some performance implication, but the reliability
of the data will be ensured.

v2: Rebase.

v3: Rebase.

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1d58d36..1818343 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1002,8 +1002,6 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
dst_data_ptr += buffer_size;
}

-   /* FIXME: invalidate/flush for log buffer needed */
-
/* Update the read pointer in the shared log buffer */
log_buffer_state->read_ptr = write_offset;

@@ -1177,8 +1175,11 @@ static int guc_create_log_extras(struct intel_guc *guc)
return 0;

if (!guc->log.buf_addr) {
-   /* Create a vmalloc mapping of log buffer pages */
-   vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WB);
+   /* Create a WC (Uncached for read) vmalloc mapping of log
+* buffer pages, so that we can directly get the data
+* (up-to-date) from memory.
+*/
+   vaddr = i915_gem_object_pin_map(guc->log.obj, I915_MAP_WC);
if (IS_ERR(vaddr)) {
ret = PTR_ERR(vaddr);
DRM_ERROR("Couldn't map log buffer pages %d\n", ret);



Reviewed-by: Tvrtko Ursulin 

Hopefully no one applies this without 19/20. :)

Regards,

Tvrtko


Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-12 Thread Goel, Akash



On 8/12/2016 9:22 PM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:

On 8/12/2016 9:02 PM, Chris Wilson wrote:

There's (or will be) a function to dump the error object in a uniform
manner. This patch is obsolete.


There is a print_error_obj() function, but that prints one dword per line.


It used to. It will shortly be a compressed stream.



Pretty printing is left to userspace.
But invariably, we only will be interpreting the error state or Guc log 
buffer dump, and it will be really convenient if we can have 4 dwords 
per line matching the log sample size.



Best regards
Akash

-Chris




Re: [Intel-gfx] [PATCH 8/9] drm/i915/cmdparser: Use binary search for faster register lookup

2016-08-12 Thread Matthew Auld
On 12 August 2016 at 16:07, Chris Wilson  wrote:
> A signifcant proportion of the cmdparsing time for some batches is the
> cost to find the register in the mmiotable. We ensure that those tables
> are in ascending order such that we could do a binary search if it was
> ever merited. It is.
>
> Signed-off-by: Chris Wilson 
Cool.
s/signifcant/significant/

Reviewed-by: Matthew Auld 
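
As a rough illustration of the lookup the commit message describes, here is a minimal binary search over a table sorted by ascending register address. The struct and function names are hypothetical stand-ins, not the cmdparser's actual types.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a register descriptor; not the i915 struct. */
struct reg_entry {
    uint32_t addr;
};

/*
 * Binary search over a table sorted by ascending addr.
 * Returns the matching entry, or NULL if addr is not in the table.
 */
static const struct reg_entry *find_reg(const struct reg_entry *table,
                                        size_t count, uint32_t addr)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (table[mid].addr == addr)
            return &table[mid];
        if (table[mid].addr < addr)
            lo = mid + 1;
        else
            hi = mid;
    }

    return NULL;
}
```

This only works because the tables are kept in ascending order; an unsorted table would silently miss entries.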


Re: [Intel-gfx] [PATCH 07/15] drm/omap: Use per-plane rotation property

2016-08-12 Thread Ville Syrjälä
On Thu, Aug 11, 2016 at 04:33:32PM +0300, Ville Syrjälä wrote:
> On Thu, Aug 11, 2016 at 02:32:44PM +0300, Tomi Valkeinen wrote:
> > Hi,
> > 
> > On 22/07/16 16:43, ville.syrj...@linux.intel.com wrote:
> > > From: Ville Syrjälä 
> > > 
> > > The global mode_config.rotation_property is going away, switch over to
> > > per-plane rotation_property.
> > > 
> > > Not sure I got the annoying crtc rotation_property handling right.
> > > Might work, or might not.
> > 
> > I think something is funny with this patch or the series. I fetched your
> > branch, and with your series, it looks like the primary planes lose all
> > their props. modetest says:
> > 
> > could not get plane 26 properties: Invalid argument
> > could not get plane 30 properties: Invalid argument
> 
> Hmm. Weird. Is it really the get props ioctl that fails?
> 
> The first EINVAL I can spot there is
> if (!obj->properties) {
>   ret = -EINVAL;
>   goto out_unref;
>   }
> which definitely makes no sense since this is assigned
> as plane->base.properties = &plane->properties. So can't be that unless
> we manage to clear the pointer somehow after the init.
> 
> The only other direct EINVAL I see there is if
>  drm_object_property_get_value(obj->properties->properties[i])
> fails to find the passed prop in the properties array. Which clearly
> can't happen since we got it from the array in the first place. Also,
> clearly that code is rather inefficient, perhaps someone should rewrite
> it a bit.
> 
> Can't quite see how this could fail for the plane in other ways. But I
> might be blind.

I tried to think on this a bit more, and the only thing I came up with was
that we end up doing the drm_plane_create_rotation_property() twice for the
primary planes. I tried that on i915 but it didn't result in anything bad
AFAICS. Would leak a bit, but so what :P

Dunno, I guess you could try something like:

--- a/drivers/gpu/drm/omapdrm/omap_plane.c
+++ b/drivers/gpu/drm/omapdrm/omap_plane.c
@@ -211,11 +211,12 @@ void omap_plane_install_properties(struct drm_plane *plane,
struct omap_drm_private *priv = dev->dev_private;
 
if (priv->has_dmm) {
-   drm_plane_create_rotation_property(plane,
-  BIT(DRM_ROTATE_0),
-  BIT(DRM_ROTATE_0) | BIT(DRM_ROTATE_90) |
-  BIT(DRM_ROTATE_180) | BIT(DRM_ROTATE_270) |
-  BIT(DRM_REFLECT_X) | BIT(DRM_REFLECT_Y));
+   if (!plane->rotation_property)
+   drm_plane_create_rotation_property(plane,
+  BIT(DRM_ROTATE_0),
+  BIT(DRM_ROTATE_0) | BIT(DRM_ROTATE_90) |
+  BIT(DRM_ROTATE_180) | BIT(DRM_ROTATE_270) |
+  BIT(DRM_REFLECT_X) | BIT(DRM_REFLECT_Y));


> 
> > 
> > and
> > 
> > Planes:
> > id      crtc    fb      CRTC x,y        x,y     gamma size      possible crtcs
> > 26      28      55      0,0             0,0     0               0x0001
> >   formats: RG16 RX12 XR12 RA12 AR12 XR15 AR15 RG24 RX24 XR24 RA24 AR24
> >   no properties found
> > 30      0       0       0,0             0,0     0               0x0002
> >   formats: RG16 RX12 XR12 RA12 AR12 XR15 AR15 RG24 RX24 XR24 RA24 AR24 NV12 YUYV UYVY
> >   no properties found
> > 
> > I didn't look closer yet.
> > 
> >  Tomi
> > 
> 
> 
> 
> 
> -- 
> Ville Syrjälä
> Intel OTC
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC


Re: [Intel-gfx] [PATCH 15/20] drm/i915: Debugfs support for GuC logging control

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

This patch provides a debugfs interface, i915_guc_log_control, for
enabling/disabling logging in the GuC firmware on the fly and for controlling
the verbosity level of the logs.
The value written to the file should have bit 0 set to enable logging, and
bits 4-7 should contain the verbosity level.

v2: Add a forceful flush, to collect left over logs, on disabling logging.
 Useful for Validation.

v3: Besides minor cleanup, implement read method for the debugfs file and
 set the guc_log_level to -1 when logging is disabled. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_debugfs.c| 44 -
  drivers/gpu/drm/i915/i915_guc_submission.c | 63 ++
  drivers/gpu/drm/i915/intel_guc.h   |  1 +
  3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 14e0dcf..f472fbcd3 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2674,6 +2674,47 @@ static int i915_guc_log_dump(struct seq_file *m, void 
*data)
return 0;
  }

+static int i915_guc_log_control_get(void *data, u64 *val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   if (!dev_priv->guc.log.obj)
+   return -EINVAL;
+
+   *val = i915.guc_log_level;
+
+   return 0;
+}
+
+static int i915_guc_log_control_set(void *data, u64 val)
+{
+   struct drm_device *dev = data;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   int ret;
+
+   ret = mutex_lock_interruptible(&dev->struct_mutex);
+   if (ret)
+   return ret;
+
+   if (!dev_priv->guc.log.obj) {
+   ret = -EINVAL;
+   goto end;
+   }
+
+   intel_runtime_pm_get(dev_priv);
+   ret = i915_guc_log_control(dev_priv, val);
+   intel_runtime_pm_put(dev_priv);
+
+end:
+   mutex_unlock(&dev->struct_mutex);
+   return ret;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
+   i915_guc_log_control_get, i915_guc_log_control_set,
+   "%lld\n");
+
  static int i915_edp_psr_status(struct seq_file *m, void *data)
  {
struct drm_info_node *node = m->private;
@@ -5477,7 +5518,8 @@ static const struct i915_debugfs_files {
{"i915_fbc_false_color", &i915_fbc_fc_fops},
{"i915_dp_test_data", &i915_displayport_test_data_fops},
{"i915_dp_test_type", &i915_displayport_test_type_fops},
-   {"i915_dp_test_active", &i915_displayport_test_active_fops}
+   {"i915_dp_test_active", &i915_displayport_test_active_fops},
+   {"i915_guc_log_control", &i915_guc_log_control_fops}
  };

  void intel_display_crc_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 4a75c16..041cf68 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -195,6 +195,16 @@ static int host2guc_force_logbuffer_flush(struct intel_guc 
*guc)
return host2guc_action(guc, data, 2);
  }

+static int host2guc_logging_control(struct intel_guc *guc, u32 control_val)
+{
+   u32 data[2];
+
+   data[0] = HOST2GUC_ACTION_UK_LOG_ENABLE_LOGGING;
+   data[1] = control_val;
+
+   return host2guc_action(guc, data, 2);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -1538,3 +1548,56 @@ void i915_guc_register(struct drm_i915_private *dev_priv)
guc_log_late_setup(&dev_priv->guc);
mutex_unlock(&dev_priv->drm.struct_mutex);
  }
+
+int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
+{
+   union guc_log_control log_param;
+   int ret;
+
+   log_param.logging_enabled = control_val & 0x1;
+   log_param.verbosity = (control_val >> 4) & 0xF;


Maybe "log_param.value = control_val" would also work since
guc_log_control is conveniently defined as a union. Doesn't matter though.



+
+   if (log_param.verbosity < GUC_LOG_VERBOSITY_MIN ||
+   log_param.verbosity > GUC_LOG_VERBOSITY_MAX)
+   return -EINVAL;
+
+   /* This combination doesn't make sense & won't have any effect */
+   if (!log_param.logging_enabled && (i915.guc_log_level < 0))
+   return 0;


I wonder if it would work and maybe look nicer to generalize as:

int guc_log_level;

guc_log_level = log_param.logging_enabled ? log_param.verbosity : -1;
if (i915.guc_log_level == guc_log_level)
return 0;
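
To make the encoding concrete, a small standalone sketch of how a control value maps to a log level under the layout given in the commit message (bit 0 enables logging, bits 4-7 carry the verbosity). The helper name is hypothetical and is not part of the patch; -1 for "disabled" follows the v3 changelog.

```c
#include <stdint.h>

/*
 * Decode a control value: bit 0 enables logging, bits 4-7 carry the
 * verbosity. Returns the resulting log level, or -1 when logging is
 * disabled. Hypothetical helper for illustration, not from the patch.
 */
static int guc_log_level_from_control(uint64_t control_val)
{
    unsigned int enabled = control_val & 0x1;
    unsigned int verbosity = (control_val >> 4) & 0xF;

    return enabled ? (int)verbosity : -1;
}
```

With a helper like this, the "nothing to do" check reduces to comparing the decoded level against the current i915.guc_log_level, as suggested above. For example, writing 0x21 would request verbosity 2 with logging enabled.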

+
+   ret = host2guc_logging_control(&dev_priv->guc, log_param.value);
+   if (ret < 0) {
+   DRM_DEBUG_DRIVER("host2guc action failed %d\n", ret);
+   return ret;
+   }
+
+   i915.guc_log_level = log_p

Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-12 Thread Chris Wilson
On Fri, Aug 12, 2016 at 09:16:03PM +0530, Goel, Akash wrote:
> On 8/12/2016 9:02 PM, Chris Wilson wrote:
> >There's (or will be) a function to dump the error object in a uniform
> >manner. This patch is obsolete.
> 
> There is a print_error_obj() function, but that prints one dword per line.

It used to. It will shortly be a compressed stream. Pretty printing is
left to userspace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-12 Thread Goel, Akash



On 8/12/2016 9:02 PM, Chris Wilson wrote:

On Fri, Aug 12, 2016 at 04:20:03PM +0100, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

Added the dump of GuC log buffer to i915 error state, as the contents of
the GuC log buffer would also be useful to determine why the GPU reset
was triggered.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++
 2 files changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 28ffac5..4bd3790 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -509,6 +509,7 @@ struct drm_i915_error_state {
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
struct drm_i915_error_object *semaphore_obj;
+   struct drm_i915_error_object *guc_log_obj;

struct drm_i915_error_engine {
int engine_id;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index eecb870..561b523 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -546,6 +546,21 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
}
}

+   if ((obj = error->guc_log_obj)) {
+   err_printf(m, "GuC log buffer = 0x%08x\n",
+  lower_32_bits(obj->gtt_offset));
+   for (i = 0; i < obj->page_count; i++) {
+   for (elt = 0; elt < PAGE_SIZE/4; elt += 4) {


Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like
it is counting in u32 * 4 chunks so it might be. Or I might be
confused..

It will be PAGE_SIZE / 4 only. It took me some iterations to get it right.
PAGE_SIZE/4 is number of dwords and
elt+=4  is covering 4 dwords in every iteration
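
The loop bound can be checked with a small userspace sketch: the bound is the number of dwords in the page, and each iteration consumes four of them, so a 4096-byte page yields 1024 / 4 = 256 lines. The function below is a hypothetical illustration written for userspace (it prints via fprintf rather than err_printf) and is not the driver code.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DWORDS_PER_LINE 4

/*
 * Print a buffer four dwords per line, each line prefixed with its byte
 * offset. The loop bound is the dword count (bytes / 4) and the index
 * advances by four dwords per iteration. Returns the number of lines.
 */
static size_t dump_dwords(FILE *out, const uint32_t *buf, size_t bytes,
                          size_t base)
{
    size_t ndwords = bytes / sizeof(uint32_t);
    size_t elt, lines = 0;

    for (elt = 0; elt + DWORDS_PER_LINE <= ndwords; elt += DWORDS_PER_LINE) {
        fprintf(out, "[%08zx] %08" PRIx32 " %08" PRIx32 " %08" PRIx32
                " %08" PRIx32 "\n",
                base + elt * sizeof(uint32_t),
                buf[elt], buf[elt + 1], buf[elt + 2], buf[elt + 3]);
        lines++;
    }

    return lines;
}
```

A formatter along these lines is also the kind of thing that could live in userspace if the kernel only exports the raw buffer.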




There's (or will be) a function to dump the error object in a uniform
manner. This patch is obsolete.


There is a print_error_obj() function, but that prints one dword per line.
For the GuC log buffer it's better (for ease of interpretation) to print 4
dwords per line, as each sample is 4 dwords and the headers are 8 dwords.
The other benefit is that it reduces the line count of the error state file
(compared to other captured buffers like the ring buffer, batch buffers and
status page, the log buffer is larger, at 76 KB).


Best regards
Akash




-Chris




[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [1/9] drm/i915/cmdparser: Make initialisation failure non-fatal

2016-08-12 Thread Patchwork
== Series Details ==

Series: series starting with [1/9] drm/i915/cmdparser: Make initialisation 
failure non-fatal
URL   : https://patchwork.freedesktop.org/series/11031/
State : failure

== Summary ==

Applying: drm/i915/cmdparser: Make initialisation failure non-fatal
fatal: sha1 information is lacking or useless (drivers/gpu/drm/i915/i915_drv.h).
error: could not build fake ancestor
Patch failed at 0001 drm/i915/cmdparser: Make initialisation failure non-fatal
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts

2016-08-12 Thread Goel, Akash



On 8/12/2016 8:35 PM, Tvrtko Ursulin wrote:


On 12/08/16 15:31, Goel, Akash wrote:

On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:

+static void gen9_guc2host_events_work(struct work_struct *work)
+{
+struct drm_i915_private *dev_priv =
+container_of(work, struct drm_i915_private,
guc.events_work);
+
+spin_lock_irq(&dev_priv->irq_lock);
+/* Speed up work cancellation during disabling guc
interrupts. */
+if (!dev_priv->guc.interrupts_enabled) {
+spin_unlock_irq(&dev_priv->irq_lock);
+return;


I suppose locking for early exit is something about ensuring the worker
sees the update to dev_priv->guc.interrupts_enabled done on another CPU?


Yes locking (providing implicit barrier) will ensure that update made
from another CPU is immediately visible to the worker.


What if the disable happens after the unlock above? It would wait in
disable until the irq handler exits.

Most probably it will not have to wait, as irq handler would have
completed if work item began the execution.
Irq handler just queues the work item, which gets scheduled later on.

Using the lock is beneficial for the case where the execution of work
item and interrupt disabling is done around the same time.


Ok maybe I am missing something.

When can the interrupt disabling happen? Will it be controlled by the
debugfs file or is it driver load/unload and suspend/resume?


yes disabling will happen for all the above 3 scenarios.


+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 gt_iir)
+{
+bool interrupts_enabled;
+
+if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+spin_lock(&dev_priv->irq_lock);
+interrupts_enabled = dev_priv->guc.interrupts_enabled;
+spin_unlock(&dev_priv->irq_lock);


Not sure that taking a lock around only this read is needed.


Again same reason as above, to make sure an update made on another CPU
is immediately visible to the irq handler.


I don't get it, see above. :)


Here also, if interrupt disabling and ISR execution happen around the same
time, then the ISR might miss the reset of the 'interrupts_enabled' flag and
queue the new work.


What if reset of interrupts_enabled happens just as the ISR releases the
lock?


Then ISR will proceed ahead and queue the work item.

Lock is useful if reset of interrupts_enabled flag just happens before 
the ISR inspects the value of that flag.
Also lock will help when interrupts_enabled flag is set again, next ISR 
will definitely see it as set.



And same applies to the case when interrupt is re-enabled, ISR might
still see the 'interrupts_enabled' flag as false.
It will eventually see the update though.




+if (interrupts_enabled) {
+/* Sample the log buffer flush related bits & clear them
+ * out now itself from the message identity register to
+ * minimize the probability of losing a flush interrupt,
+ * when there are back to back flush interrupts.
+ * There can be a new flush interrupt, for different log
+ * buffer type (like for ISR), whilst Host is handling
+ * one (for DPC). Since same bit is used in message
+ * register for ISR & DPC, it could happen that GuC
+ * sets the bit for 2nd interrupt but Host clears out
+ * the bit on handling the 1st interrupt.
+ */
+u32 msg = I915_READ(SOFT_SCRATCH(15)) &
+(GUC2HOST_MSG_CRASH_DUMP_POSTED |
+ GUC2HOST_MSG_FLUSH_LOG_BUFFER);
+if (msg) {
+/* Clear the message bits that are handled */
+I915_WRITE(SOFT_SCRATCH(15),
+I915_READ(SOFT_SCRATCH(15)) & ~msg);


Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
twice?


I thought reading it again (just before the update) is a bit safer compared
to reading it once, as there is a potential race here.
GuC could also write to the SOFT_SCRATCH(15) register to set new event
bits while the host clears the bits of handled events.


Don't get it. If there is a race between read and write there still is,
don't see how a second read makes it safer.


Yes can't avoid the race completely by double reads, but can reduce the
race window size.
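
For reference, the single-read variant being discussed could look roughly like the sketch below. It models the register as a plain memory location and uses made-up bit positions for the two message flags, so both the macro values and the helper name are assumptions. The read-modify-write is still not atomic with respect to a writer on the other side, with either one read or two.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define MSG_CRASH_DUMP_POSTED (1u << 1)
#define MSG_FLUSH_LOG_BUFFER  (1u << 3)

/*
 * Read the scratch register once, clear only the event bits that are
 * about to be handled, and return those bits. Unrelated bits are left
 * untouched. A bit set by the other side between the read and the write
 * can still be lost; this sketch does not close that window.
 */
static uint32_t ack_handled_events(volatile uint32_t *scratch)
{
    uint32_t val = *scratch;
    uint32_t msg = val & (MSG_CRASH_DUMP_POSTED | MSG_FLUSH_LOG_BUFFER);

    if (msg)
        *scratch = val & ~msg;

    return msg;
}
```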


There was only one thing between the two reads, and that was "if (msg)":

 +u32 msg = I915_READ(SOFT_SCRATCH(15)) &
 +(GUC2HOST_MSG_CRASH_DUMP_POSTED |
 + GUC2HOST_MSG_FLUSH_LOG_BUFFER);

 +if (msg) {

 +/* Clear the message bits that are handled */
 +I915_WRITE(SOFT_SCRATCH(15),
 +I915_READ(SOFT_SCRATCH(15)) & ~msg);



Also I felt code looked better in current form, as macros
GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used
only once.

Will change as per the initial implementation.

 u32 msg = I915_READ(SOFT_SCRATCH(15));
 if (msg & (GUC2HOST_MSG_CRASH_DUMP_

Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-12 Thread Chris Wilson
On Fri, Aug 12, 2016 at 04:20:03PM +0100, Tvrtko Ursulin wrote:
> 
> On 12/08/16 07:25, akash.g...@intel.com wrote:
> >From: Akash Goel 
> >
> >Added the dump of GuC log buffer to i915 error state, as the contents of
> >the GuC log buffer would also be useful to determine why the GPU reset
> >was triggered.
> >
> >Suggested-by: Chris Wilson 
> >Signed-off-by: Akash Goel 
> >---
> >  drivers/gpu/drm/i915/i915_drv.h   |  1 +
> >  drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++
> >  2 files changed, 28 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> >b/drivers/gpu/drm/i915/i915_drv.h
> >index 28ffac5..4bd3790 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -509,6 +509,7 @@ struct drm_i915_error_state {
> > struct intel_overlay_error_state *overlay;
> > struct intel_display_error_state *display;
> > struct drm_i915_error_object *semaphore_obj;
> >+struct drm_i915_error_object *guc_log_obj;
> >
> > struct drm_i915_error_engine {
> > int engine_id;
> >diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
> >b/drivers/gpu/drm/i915/i915_gpu_error.c
> >index eecb870..561b523 100644
> >--- a/drivers/gpu/drm/i915/i915_gpu_error.c
> >+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> >@@ -546,6 +546,21 @@ int i915_error_state_to_str(struct 
> >drm_i915_error_state_buf *m,
> > }
> > }
> >
> >+if ((obj = error->guc_log_obj)) {
> >+err_printf(m, "GuC log buffer = 0x%08x\n",
> >+   lower_32_bits(obj->gtt_offset));
> >+for (i = 0; i < obj->page_count; i++) {
> >+for (elt = 0; elt < PAGE_SIZE/4; elt += 4) {
> 
> Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like
> it is counting in u32 * 4 chunks so it might be. Or I might be
> confused..

There's (or will be) a function to dump the error object in a uniform
manner. This patch is obsolete.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH 13/20] drm/i915: Augment i915 error state to include the dump of GuC log buffer

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

Added the dump of GuC log buffer to i915 error state, as the contents of
the GuC log buffer would also be useful to determine why the GPU reset
was triggered.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.h   |  1 +
  drivers/gpu/drm/i915/i915_gpu_error.c | 27 +++
  2 files changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 28ffac5..4bd3790 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -509,6 +509,7 @@ struct drm_i915_error_state {
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
struct drm_i915_error_object *semaphore_obj;
+   struct drm_i915_error_object *guc_log_obj;

struct drm_i915_error_engine {
int engine_id;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index eecb870..561b523 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -546,6 +546,21 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
}
}

+   if ((obj = error->guc_log_obj)) {
+   err_printf(m, "GuC log buffer = 0x%08x\n",
+  lower_32_bits(obj->gtt_offset));
+   for (i = 0; i < obj->page_count; i++) {
+   for (elt = 0; elt < PAGE_SIZE/4; elt += 4) {


Should the condition be PAGE_SIZE / 16 ? I am not sure, looks like it is 
counting in u32 * 4 chunks so it might be. Or I might be confused..



+   err_printf(m, "[%08x] %08x %08x %08x %08x\n",
+  (u32)(i*PAGE_SIZE) + elt*4,
+  obj->pages[i][elt],
+  obj->pages[i][elt+1],
+  obj->pages[i][elt+2],
+  obj->pages[i][elt+3]);
+   }
+   }
+   }
+
if (error->overlay)
intel_overlay_print_error_state(m, error->overlay);

@@ -625,6 +640,7 @@ static void i915_error_state_free(struct kref *error_ref)
}

i915_error_object_free(error->semaphore_obj);
+   i915_error_object_free(error->guc_log_obj);

for (i = 0; i < error->vm_count; i++)
kfree(error->active_bo[i]);
@@ -1210,6 +1226,16 @@ static void i915_gem_record_rings(struct 
drm_i915_private *dev_priv,
}
  }

+static void i915_gem_capture_guc_log_buffer(struct drm_i915_private *dev_priv,
+struct drm_i915_error_state *error)


Alignment.


+{
+   if (!dev_priv->guc.log.obj)
+   return;
+
+   error->guc_log_obj = i915_error_ggtt_object_create(dev_priv,
+   dev_priv->guc.log.obj);
+}
+
  /* FIXME: Since pin count/bound list is global, we duplicate what we capture 
per
   * VM.
   */
@@ -1439,6 +1465,7 @@ void i915_capture_error_state(struct drm_i915_private 
*dev_priv,
i915_gem_capture_buffers(dev_priv, error);
i915_gem_record_fences(dev_priv, error);
i915_gem_record_rings(dev_priv, error);
+   i915_gem_capture_guc_log_buffer(dev_priv, error);

do_gettimeofday(&error->time);




Regards,

Tvrtko


Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps

2016-08-12 Thread Chris Wilson
On Fri, Aug 12, 2016 at 08:43:58PM +0530, Goel, Akash wrote:
> On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:
> >Unrelated and unmentioned change to no guard page. Best to remove IMHO.
> >Can keep the RB in that case.
> 
> Though it's not called out, sorry for that, but isn't it better to
> avoid using the guard page, which will save 4KB of vmalloc virtual
> space (which is scarce) for every mapping created by the driver?
> 
> Would updating the commit message to mention this be fine?

Too late, already applied without the new flag.

Yes, that's why I dropped the guard page when I found out it was being
added. Send a patch to add the flag and we can discuss whether we think
our code is adequate to not require the protection.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH 9/9] drm/i915/cmdparser: Accelerate copies from WC memory

2016-08-12 Thread Chris Wilson
On Fri, Aug 12, 2016 at 04:07:30PM +0100, Chris Wilson wrote:
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 2fe88d930ca7..8dcdc27afe80 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -715,18 +715,13 @@ static int i915_gem_seqno_info(struct seq_file *m, void 
> *data)
>   struct drm_device *dev = node->minor->dev;
>   struct drm_i915_private *dev_priv = to_i915(dev);
>   struct intel_engine_cs *engine;
> - int ret;
>  
> - ret = mutex_lock_interruptible(&dev->struct_mutex);
> - if (ret)
> - return ret;
>   intel_runtime_pm_get(dev_priv);
>  
>   for_each_engine(engine, dev_priv)
>   i915_ring_seqno_info(m, engine);
>  
>   intel_runtime_pm_put(dev_priv);
> - mutex_unlock(&dev->struct_mutex);

Oh noes, rebase damage. /o\
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH 16/20] drm/i915: Support to create write combined type vmaps

2016-08-12 Thread Goel, Akash



On 8/12/2016 4:19 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Chris Wilson 

vmaps has a provision for controlling the page protection bits, which we
can use to control the mapping type, e.g. WB, WC, UC or even WT.
To allow the caller to choose their mapping type, we add a parameter to
i915_gem_object_pin_map - but we still only allow one vmap to be cached
per object. If the object is currently not pinned, then we recreate the
previous vmap with the new access type, but if it was pinned we report an
error. This effectively limits the access via i915_gem_object_pin_map to a
single mapping type for the lifetime of the object. Not usually a problem,
but something to be aware of when setting up the object's vmap.

We will want to vary the access type to enable WC mappings of ringbuffer
and context objects on !llc platforms, as well as other objects where we
need coherent access to the GPU's pages without going through the GTT

v2: Remove the redundant braces around pin count check and fix the marker
 in documentation (Chris)

v3:
- Add a new enum for the vmalloc mapping type & pass that as an
  argument to i915_object_pin_map. (Tvrtko)
- Use PAGE_MASK to extract or filter the mapping type info and remove a
  superfluous BUG_ON. (Tvrtko)

v4:
- Rename the enums and clean up the pin_map function. (Chris)

Signed-off-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.h|  9 -
  drivers/gpu/drm/i915/i915_gem.c| 58
+++---
  drivers/gpu/drm/i915/i915_gem_dmabuf.c |  2 +-
  drivers/gpu/drm/i915/i915_guc_submission.c |  2 +-
  drivers/gpu/drm/i915/intel_lrc.c   |  8 ++---
  drivers/gpu/drm/i915/intel_ringbuffer.c|  2 +-
  6 files changed, 60 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index 4bd3790..6603812 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -834,6 +834,11 @@ enum i915_cache_level {
  I915_CACHE_WT, /* hsw:gt3e WriteThrough for scanouts */
  };

+enum i915_map_type {
+I915_MAP_WB = 0,
+I915_MAP_WC,
+};
+
  struct i915_ctx_hang_stats {
  /* This context had batch pending when hang was declared */
  unsigned batch_pending;
@@ -3150,6 +3155,7 @@ static inline void
i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
  /**
   * i915_gem_object_pin_map - return a contiguous mapping of the
entire object
   * @obj - the object to map into kernel address space
+ * @map_type - whether the vmalloc mapping should be using WC or WB
pgprot_t
   *
   * Calls i915_gem_object_pin_pages() to prevent reaping of the object's
   * pages and then returns a contiguous mapping of the backing
storage into
@@ -3161,7 +3167,8 @@ static inline void
i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
   * Returns the pointer through which to access the mapped object, or an
   * ERR_PTR() on error.
   */
-void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object
*obj);
+void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object
*obj,
+enum i915_map_type map_type);

  /**
   * i915_gem_object_unpin_map - releases an earlier mapping
diff --git a/drivers/gpu/drm/i915/i915_gem.c
b/drivers/gpu/drm/i915/i915_gem.c
index 03548db..7dabbc3f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2077,10 +2077,11 @@ i915_gem_object_put_pages(struct
drm_i915_gem_object *obj)
  list_del(&obj->global_list);

  if (obj->mapping) {
-if (is_vmalloc_addr(obj->mapping))
-vunmap(obj->mapping);
+void *ptr = (void *)((uintptr_t)obj->mapping & PAGE_MASK);
+if (is_vmalloc_addr(ptr))
+vunmap(ptr);
  else
-kunmap(kmap_to_page(obj->mapping));
+kunmap(kmap_to_page(ptr));
  obj->mapping = NULL;
  }

@@ -2253,7 +2254,8 @@ i915_gem_object_get_pages(struct
drm_i915_gem_object *obj)
  }

  /* The 'mapping' part of i915_gem_object_pin_map() below */
-static void *i915_gem_object_map(const struct drm_i915_gem_object *obj)
+static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
+ enum i915_map_type type)
  {
  unsigned long n_pages = obj->base.size >> PAGE_SHIFT;
  struct sg_table *sgt = obj->pages;
@@ -2263,9 +2265,10 @@ static void *i915_gem_object_map(const struct
drm_i915_gem_object *obj)
  struct page **pages = stack_pages;
  unsigned long i = 0;
  void *addr;
+bool use_wc = (type == I915_MAP_WC);

  /* A single page can always be kmapped */
-if (n_pages == 1)
+if (n_pages == 1 && !use_wc)
  return kmap(sg_page(sgt->sgl));

  if (n_pages > ARRAY_SIZE(stack_pages)) {
@@ -2281,7 +2284,8 @@ static void *i915_gem_object_map(const struct
drm_i915_gem_object *obj)
  /* Check that we have the expected number of pages */
  GEM_B

[Intel-gfx] [PATCH 7/9] drm/i915/cmdparser: Check for SKIP descriptors first

2016-08-12 Thread Chris Wilson
If the command descriptor says to skip it, skip checking for any
other conflict.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 3b1100a0e0cb..b88607bb971a 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1022,6 +1022,9 @@ static bool check_cmd(const struct intel_engine_cs 
*engine,
  const bool is_master,
  bool *oacontrol_set)
 {
+   if (desc->flags & CMD_DESC_SKIP)
+   return true;
+
if (desc->flags & CMD_DESC_REJECT) {
DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd);
return false;
-- 
2.8.1



[Intel-gfx] [PATCH 9/9] drm/i915/cmdparser: Accelerate copies from WC memory

2016-08-12 Thread Chris Wilson
If we need to use clflush to prepare our batch for reads from memory, we
can bypass the cache instead by using non-temporal copies.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 58 ++
 drivers/gpu/drm/i915/i915_debugfs.c| 24 --
 drivers/gpu/drm/i915/i915_drv.c| 19 ---
 drivers/gpu/drm/i915/i915_gem.c| 48 
 drivers/gpu/drm/i915/i915_gem_gtt.c| 17 +++---
 drivers/gpu/drm/i915/i915_gem_tiling.c |  4 ---
 drivers/gpu/drm/i915/i915_irq.c|  2 --
 drivers/gpu/drm/i915/intel_uncore.c|  6 ++--
 8 files changed, 81 insertions(+), 97 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index cea3ef7299cc..3244ef1401ad 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -969,8 +969,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
 {
unsigned int src_needs_clflush;
unsigned int dst_needs_clflush;
-   void *dst, *ptr;
-   int offset, n;
+   void *dst;
int ret;
 
ret = i915_gem_obj_prepare_shmem_read(src_obj, &src_needs_clflush);
@@ -987,24 +986,43 @@ static u32 *copy_batch(struct drm_i915_gem_object 
*dst_obj,
if (IS_ERR(dst))
goto unpin_dst;
 
-   ptr = dst;
-   offset = offset_in_page(batch_start_offset);
-   if (dst_needs_clflush & CLFLUSH_BEFORE)
-   batch_len = roundup(batch_len, boot_cpu_data.x86_clflush_size);
-
-   for (n = batch_start_offset >> PAGE_SHIFT; batch_len; n++) {
-   int len = min_t(int, batch_len, PAGE_SIZE - offset);
-   void *vaddr;
-
-   vaddr = kmap_atomic(i915_gem_object_get_page(src_obj, n));
-   if (src_needs_clflush)
-   drm_clflush_virt_range(vaddr + offset, len);
-   memcpy(ptr, vaddr + offset, len);
-   kunmap_atomic(vaddr);
-
-   ptr += len;
-   batch_len -= len;
-   offset = 0;
+   if (src_needs_clflush &&
+   i915_memcpy_from_wc((void *)(uintptr_t)batch_start_offset, 0, 0)) {
+   void *src;
+
+   src = i915_gem_object_pin_map(src_obj, I915_MAP_WC);
+   if (IS_ERR(src))
+   goto shmem_copy;
+
+   i915_memcpy_from_wc(dst,
+   src + batch_start_offset,
+   ALIGN(batch_len, 16));
+   i915_gem_object_unpin_map(src_obj);
+   } else {
+   void *ptr;
+   int offset, n;
+
+shmem_copy:
+   offset = offset_in_page(batch_start_offset);
+   if (dst_needs_clflush & CLFLUSH_BEFORE)
+   batch_len = roundup(batch_len,
+   boot_cpu_data.x86_clflush_size);
+
+   ptr = dst;
+   for (n = batch_start_offset >> PAGE_SHIFT; batch_len; n++) {
+   int len = min_t(int, batch_len, PAGE_SIZE - offset);
+   void *vaddr;
+
+   vaddr = kmap_atomic(i915_gem_object_get_page(src_obj, 
n));
+   if (src_needs_clflush)
+   drm_clflush_virt_range(vaddr + offset, len);
+   memcpy(ptr, vaddr + offset, len);
+   kunmap_atomic(vaddr);
+
+   ptr += len;
+   batch_len -= len;
+   offset = 0;
+   }
}
 
/* dst_obj is returned with vmap pinned */
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 2fe88d930ca7..8dcdc27afe80 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -715,18 +715,13 @@ static int i915_gem_seqno_info(struct seq_file *m, void 
*data)
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = to_i915(dev);
struct intel_engine_cs *engine;
-   int ret;
 
-   ret = mutex_lock_interruptible(&dev->struct_mutex);
-   if (ret)
-   return ret;
intel_runtime_pm_get(dev_priv);
 
for_each_engine(engine, dev_priv)
i915_ring_seqno_info(m, engine);
 
intel_runtime_pm_put(dev_priv);
-   mutex_unlock(&dev->struct_mutex);
 
return 0;
 }
@@ -1379,11 +1374,7 @@ static int ironlake_drpc_info(struct seq_file *m)
struct drm_i915_private *dev_priv = to_i915(dev);
u32 rgvmodectl, rstdbyctl;
u16 crstandvid;
-   int ret;
 
-   ret = mutex_lock_interruptible(&dev->struct_mutex);
-   if (ret)
-   return ret;
intel_runtime_pm_get(dev_priv);
 
rgvmodectl = I915_READ(MEMMODECTL);
@@ -1391,7 +1382,6 @@ static int ironlake_drpc_info(struct seq_file *m)
crstandvid = I915_R

[Intel-gfx] [PATCH 8/9] drm/i915/cmdparser: Use binary search for faster register lookup

2016-08-12 Thread Chris Wilson
A significant proportion of the cmdparsing time for some batches is the
cost to find the register in the mmio table. We ensure that those tables
are in ascending order such that we could do a binary search if it was
ever merited. It is.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 42 --
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index b88607bb971a..cea3ef7299cc 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -925,36 +925,37 @@ find_cmd(struct intel_engine_cs *engine,
 }
 
 static const struct drm_i915_reg_descriptor *
-find_reg(const struct drm_i915_reg_descriptor *table,
-int count, u32 addr)
+__find_reg(const struct drm_i915_reg_descriptor *table, int count, u32 addr)
 {
-   int i;
-
-   for (i = 0; i < count; i++) {
-   if (i915_mmio_reg_offset(table[i].addr) == addr)
-   return &table[i];
+   int start = 0, end = count;
+   while (start < end) {
+   int mid = start + (end - start) / 2;
+   int ret = addr - i915_mmio_reg_offset(table[mid].addr);
+   if (ret < 0)
+   end = mid;
+   else if (ret > 0)
+   start = mid + 1;
+   else
+   return &table[mid];
}
-
return NULL;
 }
 
 static const struct drm_i915_reg_descriptor *
-find_reg_in_tables(const struct drm_i915_reg_table *tables,
-  int count, bool is_master, u32 addr)
+find_reg(const struct intel_engine_cs *engine, bool is_master, u32 addr)
 {
-   int i;
-   const struct drm_i915_reg_table *table;
-   const struct drm_i915_reg_descriptor *reg;
+   const struct drm_i915_reg_table *table = engine->reg_tables;
+   int count = engine->reg_table_count;
 
-   for (i = 0; i < count; i++) {
-   table = &tables[i];
+   do {
if (!table->master || is_master) {
-   reg = find_reg(table->regs, table->num_regs,
-  addr);
+   const struct drm_i915_reg_descriptor *reg;
+
+   reg = __find_reg(table->regs, table->num_regs, addr);
if (reg != NULL)
return reg;
}
-   }
+   } while (table++, --count);
 
return NULL;
 }
@@ -1049,10 +1050,7 @@ static bool check_cmd(const struct intel_engine_cs 
*engine,
 offset += step) {
const u32 reg_addr = cmd[offset] & desc->reg.mask;
const struct drm_i915_reg_descriptor *reg =
-   find_reg_in_tables(engine->reg_tables,
-  engine->reg_table_count,
-  is_master,
-  reg_addr);
+   find_reg(engine, is_master, reg_addr);
 
if (!reg) {
DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X 
in command: 0x%08X (exec_id=%d)\n",
-- 
2.8.1
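
For readers following along, the lookup above can be exercised outside
the kernel. What follows is only a minimal userspace sketch, assuming a
table sorted in ascending order of register offset; struct reg_desc and
find_reg_sketch() are hypothetical stand-ins for the driver's
descriptor type and __find_reg(), not the driver's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct drm_i915_reg_descriptor. */
struct reg_desc {
	uint32_t addr;	/* register offset; the table is sorted by this */
};

/*
 * Binary search over a table sorted in ascending order of addr.
 * Returns the matching entry, or NULL if addr is not in the table.
 */
static const struct reg_desc *
find_reg_sketch(const struct reg_desc *table, size_t count, uint32_t addr)
{
	size_t start = 0, end = count;	/* half-open interval [start, end) */

	while (start < end) {
		size_t mid = start + (end - start) / 2;

		if (addr < table[mid].addr)
			end = mid;		/* continue in the lower half */
		else if (addr > table[mid].addr)
			start = mid + 1;	/* continue in the upper half */
		else
			return &table[mid];
	}

	return NULL;
}
```

The sketch compares the two offsets directly rather than taking the
sign of their difference; that is a choice made here for the example,
not something the patch does.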



[Intel-gfx] [PATCH 5/9] drm/i915/cmdparser: Improve hash function

2016-08-12 Thread Chris Wilson
The existing code's hash function is very suboptimal (most 3D commands
use the same bucket, degrading the hash to a long list). The code even
acknowledges that the issue was known and the fix simple:

/*
 * If we attempt to generate a perfect hash, we should be able to look at bits
 * 31:29 of a command from a batch buffer and use the full mask for that
 * client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this.
 */

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 51 +-
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 4c903081604c..274f2136a846 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -86,24 +86,24 @@
  * general bitmasking mechanism.
  */
 
-#define STD_MI_OPCODE_MASK  0xFF800000
-#define STD_3D_OPCODE_MASK  0xFFFF0000
-#define STD_2D_OPCODE_MASK  0xFFC00000
-#define STD_MFX_OPCODE_MASK 0xFFFF0000
+#define STD_MI_OPCODE_SHIFT  (32 - 9)
+#define STD_3D_OPCODE_SHIFT  (32 - 16)
+#define STD_2D_OPCODE_SHIFT  (32 - 10)
+#define STD_MFX_OPCODE_SHIFT (32 - 16)
 
 #define CMD(op, opm, f, lm, fl, ...)   \
{   \
.flags = (fl) | ((f) ? CMD_DESC_FIXED : 0), \
-   .cmd = { (op), (opm) }, \
+   .cmd = { (op), ~0u << (opm) },  \
.length = { (lm) }, \
__VA_ARGS__ \
}
 
 /* Convenience macros to compress the tables */
-#define SMI STD_MI_OPCODE_MASK
-#define S3D STD_3D_OPCODE_MASK
-#define S2D STD_2D_OPCODE_MASK
-#define SMFX STD_MFX_OPCODE_MASK
+#define SMI STD_MI_OPCODE_SHIFT
+#define S3D STD_3D_OPCODE_SHIFT
+#define S2D STD_2D_OPCODE_SHIFT
+#define SMFX STD_MFX_OPCODE_SHIFT
 #define F true
 #define S CMD_DESC_SKIP
 #define R CMD_DESC_REJECT
@@ -696,12 +696,26 @@ struct cmd_node {
  * non-opcode bits being set. But if we don't include those bits, some 3D
  * commands may hash to the same bucket due to not including opcode bits that
  * make the command unique. For now, we will risk hashing to the same bucket.
- *
- * If we attempt to generate a perfect hash, we should be able to look at bits
- * 31:29 of a command from a batch buffer and use the full mask for that
- * client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this.
  */
-#define CMD_HASH_MASK STD_MI_OPCODE_MASK
+static inline u32 cmd_header_key(u32 x)
+{
+   u32 shift;
+
+   switch (x >> INSTR_CLIENT_SHIFT) {
+   default:
+   case INSTR_MI_CLIENT:
+   shift = STD_MI_OPCODE_SHIFT;
+   break;
+   case INSTR_RC_CLIENT:
+   shift = STD_3D_OPCODE_SHIFT;
+   break;
+   case INSTR_BC_CLIENT:
+   shift = STD_2D_OPCODE_SHIFT;
+   break;
+   }
+
+   return x >> shift;
+}
 
 static int init_hash_table(struct intel_engine_cs *engine,
   const struct drm_i915_cmd_table *cmd_tables,
@@ -725,7 +739,7 @@ static int init_hash_table(struct intel_engine_cs *engine,
 
desc_node->desc = desc;
hash_add(engine->cmd_hash, &desc_node->node,
-desc->cmd.value & CMD_HASH_MASK);
+cmd_header_key(desc->cmd.value));
}
}
 
@@ -864,12 +878,9 @@ find_cmd_in_table(struct intel_engine_cs *engine,
struct cmd_node *desc_node;
 
hash_for_each_possible(engine->cmd_hash, desc_node, node,
-  cmd_header & CMD_HASH_MASK) {
+  cmd_header_key(cmd_header)) {
const struct drm_i915_cmd_descriptor *desc = desc_node->desc;
-   u32 masked_cmd = desc->cmd.mask & cmd_header;
-   u32 masked_value = desc->cmd.value & desc->cmd.mask;
-
-   if (masked_cmd == masked_value)
+   if (((cmd_header ^ desc->cmd.value) & desc->cmd.mask) == 0)
return desc;
}
 
-- 
2.8.1
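
To see why a per-client shift spreads 3D commands across buckets, here
is a small userspace sketch of the key function. The INSTR_* values
below are assumptions made for the example (client in bits 31:29, MI=0,
BC=2, RC=3), and the *_sketch names are hypothetical; only the shift
values are taken from the patch.

```c
#include <stdint.h>

/* Assumed client encoding: bits 31:29 of the command header. */
#define INSTR_CLIENT_SHIFT	29
#define INSTR_MI_CLIENT		0x0
#define INSTR_BC_CLIENT		0x2
#define INSTR_RC_CLIENT		0x3

/* Shift values as introduced by the patch. */
#define STD_MI_OPCODE_SHIFT	(32 - 9)
#define STD_3D_OPCODE_SHIFT	(32 - 16)
#define STD_2D_OPCODE_SHIFT	(32 - 10)

/* Keep only the opcode bits that are meaningful for the client. */
static inline uint32_t cmd_header_key_sketch(uint32_t x)
{
	uint32_t shift;

	switch (x >> INSTR_CLIENT_SHIFT) {
	default:
	case INSTR_MI_CLIENT:
		shift = STD_MI_OPCODE_SHIFT;
		break;
	case INSTR_RC_CLIENT:
		shift = STD_3D_OPCODE_SHIFT;
		break;
	case INSTR_BC_CLIENT:
		shift = STD_2D_OPCODE_SHIFT;
		break;
	}

	return x >> shift;
}

/* The opcode mask that corresponds to a shift, as built in CMD(). */
static inline uint32_t opcode_mask_sketch(unsigned int shift)
{
	return ~0u << shift;
}
```

With the old fixed MI mask, two 3D headers that differ only in bits
22:16 collapse onto one key; with the per-client shift those bits stay
in the key, so the two headers land on different keys.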



[Intel-gfx] [PATCH 6/9] drm/i915/cmdparser: Compare against the previous command descriptor

2016-08-12 Thread Chris Wilson
On the blitter (and in test code), we see long sequences of repeated
commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these,
we can skip the hashtable lookup by remembering the previous command
descriptor and doing a straightforward compare of the command header.
The corollary is that we need to do one extra comparison before looking
up new commands.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 274f2136a846..3b1100a0e0cb 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -350,6 +350,9 @@ static const struct drm_i915_cmd_descriptor hsw_blt_cmds[] 
= {
CMD(  MI_LOAD_SCAN_LINES_EXCL,  SMI,   !F,  0x3F,   R  ),
 };
 
+static const struct drm_i915_cmd_descriptor noop_desc =
+   CMD(MI_NOOP, SMI, F, 1, S);
+
 #undef CMD
 #undef SMI
 #undef S3D
@@ -898,11 +901,14 @@ find_cmd_in_table(struct intel_engine_cs *engine,
 static const struct drm_i915_cmd_descriptor*
 find_cmd(struct intel_engine_cs *engine,
 u32 cmd_header,
+const struct drm_i915_cmd_descriptor *desc,
 struct drm_i915_cmd_descriptor *default_desc)
 {
-   const struct drm_i915_cmd_descriptor *desc;
u32 mask;
 
+   if (((cmd_header ^ desc->cmd.value) & desc->cmd.mask) == 0)
+   return desc;
+
desc = find_cmd_in_table(engine, cmd_header);
if (desc)
return desc;
@@ -911,10 +917,10 @@ find_cmd(struct intel_engine_cs *engine,
if (!mask)
return NULL;
 
-   BUG_ON(!default_desc);
-   default_desc->flags = CMD_DESC_SKIP;
+   default_desc->cmd.value = cmd_header;
+   default_desc->cmd.mask = 0x;
default_desc->length.mask = mask;
-
+   default_desc->flags = CMD_DESC_SKIP;
return default_desc;
 }
 
@@ -1165,7 +1171,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
*engine,
bool is_master)
 {
u32 *cmd, *batch_end;
-   struct drm_i915_cmd_descriptor default_desc = { 0 };
+   struct drm_i915_cmd_descriptor default_desc = noop_desc;
+   const struct drm_i915_cmd_descriptor *desc = &default_desc;
bool oacontrol_set = false; /* OACONTROL tracking. See check_cmd() */
bool needs_clflush_after = false;
int ret = 0;
@@ -1185,13 +1192,12 @@ int intel_engine_cmd_parser(struct intel_engine_cs 
*engine,
 */
batch_end = cmd + (batch_len / sizeof(*batch_end));
while (cmd < batch_end) {
-   const struct drm_i915_cmd_descriptor *desc;
u32 length;
 
if (*cmd == MI_BATCH_BUFFER_END)
break;
 
-   desc = find_cmd(engine, *cmd, &default_desc);
+   desc = find_cmd(engine, *cmd, desc, &default_desc);
if (!desc) {
DRM_DEBUG_DRIVER("CMD: Unrecognized command: 0x%08X\n",
 *cmd);
-- 
2.8.1
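
The effect of remembering the previous descriptor can be illustrated
with a small userspace sketch. Everything here is hypothetical apart
from the XOR-and-mask compare itself: struct cmd_desc is a reduced
descriptor, and a linear scan stands in for the hash lookup so that the
number of slow lookups can be counted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced descriptor: opcode value plus opcode mask. */
struct cmd_desc {
	uint32_t value;
	uint32_t mask;
};

/* Counts how often the slow path is taken. */
struct lookup_stats {
	unsigned long slow_lookups;
};

/* A header matches if it agrees with value on every bit set in mask. */
static inline int desc_matches(const struct cmd_desc *desc, uint32_t header)
{
	return ((header ^ desc->value) & desc->mask) == 0;
}

/* Stand-in for the hash table lookup: a counted linear scan. */
static const struct cmd_desc *
slow_lookup(const struct cmd_desc *table, size_t count, uint32_t header,
	    struct lookup_stats *stats)
{
	size_t i;

	stats->slow_lookups++;
	for (i = 0; i < count; i++)
		if (desc_matches(&table[i], header))
			return &table[i];

	return NULL;
}

/* Try the previous descriptor first, then fall back to the lookup. */
static const struct cmd_desc *
find_cmd_sketch(const struct cmd_desc *table, size_t count, uint32_t header,
		const struct cmd_desc *prev, struct lookup_stats *stats)
{
	if (prev && desc_matches(prev, header))
		return prev;

	return slow_lookup(table, count, header, stats);
}
```

For a run of identical headers only the first one reaches the slow
path; every other command pays one extra compare, which is the
trade-off described in the commit message.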



[Intel-gfx] [PATCH 1/9] drm/i915/cmdparser: Make initialisation failure non-fatal

2016-08-12 Thread Chris Wilson
If the developer adds a register in the wrong order, we BUG during boot.
That makes development and testing very difficult. Let's be a bit more
friendly and disable the command parser with a big warning if the tables
are invalid.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 30 ++
 drivers/gpu/drm/i915/i915_drv.h|  2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c |  6 --
 3 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index a1f4683f5c35..1882dc28c750 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -746,17 +746,15 @@ static void fini_hash_table(struct intel_engine_cs 
*engine)
  * Optionally initializes fields related to batch buffer command parsing in the
  * struct intel_engine_cs based on whether the platform requires software
  * command parsing.
- *
- * Return: non-zero if initialization fails
  */
-int intel_engine_init_cmd_parser(struct intel_engine_cs *engine)
+void intel_engine_init_cmd_parser(struct intel_engine_cs *engine)
 {
const struct drm_i915_cmd_table *cmd_tables;
int cmd_table_count;
int ret;
 
if (!IS_GEN7(engine->i915))
-   return 0;
+   return;
 
switch (engine->id) {
case RCS:
@@ -811,24 +809,32 @@ int intel_engine_init_cmd_parser(struct intel_engine_cs 
*engine)
break;
default:
MISSING_CASE(engine->id);
-   BUG();
+   return;
}
 
-   BUG_ON(!validate_cmds_sorted(engine, cmd_tables, cmd_table_count));
-   BUG_ON(!validate_regs_sorted(engine));
+   if (!hash_empty(engine->cmd_hash)) {
+   DRM_DEBUG_DRIVER("%s: no commands?\n", engine->name);
+   return;
+   }
 
-   WARN_ON(!hash_empty(engine->cmd_hash));
+   if (!validate_cmds_sorted(engine, cmd_tables, cmd_table_count)) {
+   DRM_ERROR("%s: command descriptions are not sorted\n",
+ engine->name);
+   return;
+   }
+   if (!validate_regs_sorted(engine)) {
+   DRM_ERROR("%s: registers are not sorted\n", engine->name);
+   return;
+   }
 
ret = init_hash_table(engine, cmd_tables, cmd_table_count);
if (ret) {
-   DRM_ERROR("CMD: cmd_parser_init failed!\n");
+   DRM_ERROR("%s: initialised failed!\n", engine->name);
fini_hash_table(engine);
-   return ret;
+   return;
}
 
engine->needs_cmd_parser = true;
-
-   return 0;
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 52207b086286..f5b187662059 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3608,7 +3608,7 @@ const char *i915_cache_level_str(struct drm_i915_private 
*i915, int type);
 
 /* i915_cmd_parser.c */
 int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv);
-int intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
+void intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
 void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine);
 int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct drm_i915_gem_object *batch_obj,
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index 63440c6a6349..0eb19388eba4 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -239,6 +239,8 @@ void intel_engine_setup_common(struct intel_engine_cs 
*engine)
intel_engine_init_requests(engine);
intel_engine_init_hangcheck(engine);
i915_gem_batch_pool_init(engine, &engine->batch_pool);
+
+   intel_engine_init_cmd_parser(engine);
 }
 
 int intel_engine_create_scratch(struct intel_engine_cs *engine, int size)
@@ -305,7 +307,7 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
if (ret)
return ret;
 
-   return intel_engine_init_cmd_parser(engine);
+   return 0;
 }
 
 /**
@@ -319,8 +321,8 @@ void intel_engine_cleanup_common(struct intel_engine_cs 
*engine)
 {
intel_engine_cleanup_scratch(engine);
 
-   intel_engine_cleanup_cmd_parser(engine);
i915_gem_render_state_fini(engine);
intel_engine_fini_breadcrumbs(engine);
+   intel_engine_cleanup_cmd_parser(engine);
i915_gem_batch_pool_fini(&engine->batch_pool);
 }
-- 
2.8.1
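
The shape of the change (validate, complain, leave the feature off)
can be shown in isolation. This is only a userspace sketch under stated
assumptions: struct parser_sketch and both functions are hypothetical,
and treating duplicate offsets as unsorted is an assumption of the
example rather than something taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parser state: a register table and an enable flag. */
struct parser_sketch {
	const uint32_t *regs;
	size_t num_regs;
	bool enabled;
};

/* True if offsets are strictly ascending (duplicates are rejected). */
static bool regs_sorted_sketch(const uint32_t *regs, size_t count)
{
	size_t i;

	for (i = 1; i < count; i++)
		if (regs[i] <= regs[i - 1])
			return false;

	return true;
}

/*
 * Initialise the parser. An invalid table is reported and the parser
 * is left disabled instead of aborting the whole program.
 */
static void parser_init_sketch(struct parser_sketch *p,
			       const uint32_t *regs, size_t count)
{
	p->regs = regs;
	p->num_regs = count;
	p->enabled = false;

	if (!regs_sorted_sketch(regs, count)) {
		fprintf(stderr, "parser: registers are not sorted\n");
		return;
	}

	p->enabled = true;
}
```

Callers then only need to test the enabled flag, much as the driver
tests engine->needs_cmd_parser, and a misordered table shows up as a
loud message rather than a failed boot.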



[Intel-gfx] [PATCH 4/9] drm/i915/cmdparser: Only cache the dst vmap

2016-08-12 Thread Chris Wilson
For simplicity, we want to continue using a contiguous mapping of the
command buffer, but we can reduce the number of vmappings we hold by
switching over to a page-by-page copy from the user batch buffer to the
shadow. The cost for saving one linear mapping is about 5% in trivial
workloads - which is more or less the overhead in calling kmap_atomic().

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 34 +++---
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 545c333663c0..4c903081604c 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -951,7 +951,8 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
 {
unsigned int src_needs_clflush;
unsigned int dst_needs_clflush;
-   void *src, *dst;
+   void *dst, *ptr;
+   int offset, n;
int ret;
 
ret = i915_gem_obj_prepare_shmem_read(src_obj, &src_needs_clflush);
@@ -964,30 +965,33 @@ static u32 *copy_batch(struct drm_i915_gem_object 
*dst_obj,
goto unpin_src;
}
 
-   src = i915_gem_object_pin_map(src_obj, I915_MAP_WB);
-   if (IS_ERR(src)) {
-   dst = src;
-   goto unpin_dst;
-   }
-
dst = i915_gem_object_pin_map(dst_obj, I915_MAP_WB);
if (IS_ERR(dst))
-   goto unmap_src;
-
-   src += batch_start_offset;
-   if (src_needs_clflush)
-   drm_clflush_virt_range(src, batch_len);
+   goto unpin_dst;
 
+   ptr = dst;
+   offset = offset_in_page(batch_start_offset);
if (dst_needs_clflush & CLFLUSH_BEFORE)
batch_len = roundup(batch_len, boot_cpu_data.x86_clflush_size);
 
-   memcpy(dst, src, batch_len);
+   for (n = batch_start_offset >> PAGE_SHIFT; batch_len; n++) {
+   int len = min_t(int, batch_len, PAGE_SIZE - offset);
+   void *vaddr;
+
+   vaddr = kmap_atomic(i915_gem_object_get_page(src_obj, n));
+   if (src_needs_clflush)
+   drm_clflush_virt_range(vaddr + offset, len);
+   memcpy(ptr, vaddr + offset, len);
+   kunmap_atomic(vaddr);
+
+   ptr += len;
+   batch_len -= len;
+   offset = 0;
+   }
 
/* dst_obj is returned with vmap pinned */
*needs_clflush_after = dst_needs_clflush & CLFLUSH_AFTER;
 
-unmap_src:
-   i915_gem_object_unpin_map(src_obj);
 unpin_dst:
i915_gem_object_unpin_pages(dst_obj);
 unpin_src:
-- 
2.8.1
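
The loop structure of the page-by-page copy is easy to get wrong around
the first, partial page, so here is a minimal userspace sketch of it.
The pages[] array of separately allocated buffers stands in for
kmap_atomic() on each source page; SKETCH_PAGE_SIZE and
copy_by_page_sketch() are hypothetical names, and no cache flushing is
modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/*
 * Copy len bytes, starting start_offset bytes into a source that is
 * only reachable one page at a time, into the contiguous buffer dst.
 */
static void copy_by_page_sketch(void *dst, uint8_t **pages,
				size_t start_offset, size_t len)
{
	uint8_t *ptr = dst;
	size_t offset = start_offset % SKETCH_PAGE_SIZE; /* first page only */
	size_t n = start_offset / SKETCH_PAGE_SIZE;	 /* first page index */

	for (; len; n++) {
		/* Never read past the end of the current source page. */
		size_t chunk = SKETCH_PAGE_SIZE - offset;

		if (chunk > len)
			chunk = len;

		memcpy(ptr, pages[n] + offset, chunk);

		ptr += chunk;
		len -= chunk;
		offset = 0;	/* later pages are read from their start */
	}
}
```

Only the destination needs a contiguous mapping here; the source is
touched one page at a time, which is the point of the patch.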



[Intel-gfx] cmdparser perf improvement

2016-08-12 Thread Chris Wilson
From the moment the cmdparser was enabled (4.0) we got reports of
performance regressions, most notably on Baytrail:

http://www.spinics.net/lists/dri-devel/msg80933.html
msg->id:1428627643.3417.22.ca...@collabora.com

Whilst this doesn't make the cmdparser free, it does significantly
reduce the overhead. (The cached vmappings and better hash were tested
at the time and demonstrated to reduce the impact on the user's workload
to the point where the new kernel was an improvement over the last known
good). This builds upon the regression fixes to stop the cmdparser
falling over in the first place.
-Chris



[Intel-gfx] [PATCH 2/9] drm/i915/cmdparser: Add the TIMESTAMP register for the other engines

2016-08-12 Thread Chris Wilson
Since I have been using the BCS_TIMESTAMP to measure latency of
execution upon the blitter ring, allow regular userspace to also read
from that register. They are already allowed RCS_TIMESTAMP!

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 1882dc28c750..5fbd049f8095 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -458,6 +458,7 @@ static const struct drm_i915_reg_descriptor 
gen7_render_regs[] = {
REG32(GEN7_GPGPU_DISPATCHDIMX),
REG32(GEN7_GPGPU_DISPATCHDIMY),
REG32(GEN7_GPGPU_DISPATCHDIMZ),
+   REG64_IDX(RING_TIMESTAMP, BSD_RING_BASE),
REG64_IDX(GEN7_SO_NUM_PRIMS_WRITTEN, 0),
REG64_IDX(GEN7_SO_NUM_PRIMS_WRITTEN, 1),
REG64_IDX(GEN7_SO_NUM_PRIMS_WRITTEN, 2),
@@ -473,6 +474,7 @@ static const struct drm_i915_reg_descriptor 
gen7_render_regs[] = {
REG32(GEN7_L3SQCREG1),
REG32(GEN7_L3CNTLREG2),
REG32(GEN7_L3CNTLREG3),
+   REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE),
 };
 
 static const struct drm_i915_reg_descriptor hsw_render_regs[] = {
@@ -502,7 +504,10 @@ static const struct drm_i915_reg_descriptor 
hsw_render_regs[] = {
 };
 
 static const struct drm_i915_reg_descriptor gen7_blt_regs[] = {
+   REG64_IDX(RING_TIMESTAMP, RENDER_RING_BASE),
+   REG64_IDX(RING_TIMESTAMP, BSD_RING_BASE),
REG32(BCS_SWCTRL),
+   REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE),
 };
 
 static const struct drm_i915_reg_descriptor ivb_master_regs[] = {
-- 
2.8.1



[Intel-gfx] [PATCH 3/9] drm/i915/cmdparser: Use cached vmappings

2016-08-12 Thread Chris Wilson
The single largest factor in the overhead of parsing the commands is the
setup of the virtual mapping to provide a continuous block for the batch
buffer. If we keep those vmappings around (against the better judgement
of mm/vmalloc.c, which we offset by handwaving and looking suggestively
at the shrinker) we can dramatically improve the performance of the
parser for small batches (such as media workloads). Furthermore, we can
use the prepare shmem read/write functions to determine how best we
need to clflush the range (rather than every page of the object).

The impact of caching both src/dst vmaps is +80% on ivb and +140% on byt
for the throughput on small batches. (Caching just the dst vmap and
iterating over the src, doing a page by page copy is roughly 5% slower
on both platforms. That may be an acceptable trade-off to eliminate one
cached vmapping, and we may be able to reduce the per-page copying overhead
further.) For *this* simple test case, the cmdparser is now within a
factor of 2 of ideal performance.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 121 ++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   6 ++
 2 files changed, 47 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 5fbd049f8095..545c333663c0 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -942,98 +942,57 @@ find_reg_in_tables(const struct drm_i915_reg_table 
*tables,
return NULL;
 }
 
-static u32 *vmap_batch(struct drm_i915_gem_object *obj,
-  unsigned start, unsigned len)
-{
-   int i;
-   void *addr = NULL;
-   struct sg_page_iter sg_iter;
-   int first_page = start >> PAGE_SHIFT;
-   int last_page = (len + start + 4095) >> PAGE_SHIFT;
-   int npages = last_page - first_page;
-   struct page **pages;
-
-   pages = drm_malloc_ab(npages, sizeof(*pages));
-   if (pages == NULL) {
-   DRM_DEBUG_DRIVER("Failed to get space for pages\n");
-   goto finish;
-   }
-
-   i = 0;
-   for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 
first_page) {
-   pages[i++] = sg_page_iter_page(&sg_iter);
-   if (i == npages)
-   break;
-   }
-
-   addr = vmap(pages, i, 0, PAGE_KERNEL);
-   if (addr == NULL) {
-   DRM_DEBUG_DRIVER("Failed to vmap pages\n");
-   goto finish;
-   }
-
-finish:
-   if (pages)
-   drm_free_large(pages);
-   return (u32*)addr;
-}
-
-/* Returns a vmap'd pointer to dest_obj, which the caller must unmap */
-static u32 *copy_batch(struct drm_i915_gem_object *dest_obj,
+/* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
+static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
   struct drm_i915_gem_object *src_obj,
   u32 batch_start_offset,
-  u32 batch_len)
+  u32 batch_len,
+  bool *needs_clflush_after)
 {
-   unsigned int needs_clflush;
-   void *src_base, *src;
-   void *dst = NULL;
+   unsigned int src_needs_clflush;
+   unsigned int dst_needs_clflush;
+   void *src, *dst;
int ret;
 
-   if (batch_len > dest_obj->base.size ||
-   batch_len + batch_start_offset > src_obj->base.size)
-   return ERR_PTR(-E2BIG);
-
-   if (WARN_ON(dest_obj->pages_pin_count == 0))
-   return ERR_PTR(-ENODEV);
-
-   ret = i915_gem_obj_prepare_shmem_read(src_obj, &needs_clflush);
-   if (ret) {
-   DRM_DEBUG_DRIVER("CMD: failed to prepare shadow batch\n");
+   ret = i915_gem_obj_prepare_shmem_read(src_obj, &src_needs_clflush);
+   if (ret)
return ERR_PTR(ret);
-   }
 
-   src_base = vmap_batch(src_obj, batch_start_offset, batch_len);
-   if (!src_base) {
-   DRM_DEBUG_DRIVER("CMD: Failed to vmap batch\n");
-   ret = -ENOMEM;
+   ret = i915_gem_obj_prepare_shmem_write(dst_obj, &dst_needs_clflush);
+   if (ret) {
+   dst = ERR_PTR(ret);
goto unpin_src;
}
 
-   ret = i915_gem_object_set_to_cpu_domain(dest_obj, true);
-   if (ret) {
-   DRM_DEBUG_DRIVER("CMD: Failed to set shadow batch to CPU\n");
-   goto unmap_src;
+   src = i915_gem_object_pin_map(src_obj, I915_MAP_WB);
+   if (IS_ERR(src)) {
+   dst = src;
+   goto unpin_dst;
}
 
-   dst = vmap_batch(dest_obj, 0, batch_len);
-   if (!dst) {
-   DRM_DEBUG_DRIVER("CMD: Failed to vmap shadow batch\n");
-   ret = -ENOMEM;
+   dst = i915_gem_object_pin_map(dst_obj, I915_MAP_WB);
+   if (IS_ERR(dst))
goto unmap_src;
-   }
 
-   src = src_base + offset_in_pa

Re: [Intel-gfx] [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 15:48, Goel, Akash wrote:

On 8/12/2016 8:12 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can just copy only the newly written data by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would comparatively have much less unread data in them.
In case of overflow reported by GuC, Driver does need to copy the entire
buffer as the whole buffer would contain the unread data.

v2: Rebase.

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 40 +-
  1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1ca1866..8e0f360 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  struct guc_log_buffer_state *log_buffer_state,
*log_buffer_snapshot_state;
  struct guc_log_buffer_state log_buffer_state_local;
  void *src_data_ptr, *dst_data_ptr;
-u32 i, buffer_size;
+bool new_overflow;
+u32 i, buffer_size, read_offset, write_offset, bytes_to_copy;

  if (!guc->log.buf_addr)
  return;
@@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  memcpy(&log_buffer_state_local, log_buffer_state,
  sizeof(struct guc_log_buffer_state));
  buffer_size = log_buffer_state_local.size;
+read_offset = log_buffer_state_local.read_ptr;
+write_offset = log_buffer_state_local.sampled_write_ptr;

  guc->log.flush_count[i] +=
log_buffer_state_local.flush_to_file;
  if (log_buffer_state_local.buffer_full_cnt !=
  guc->log.prev_overflow_count[i]) {


Wrong alignment. You can try checkpatch.pl for all of those.


Sorry for all the alignment & indentation issues.

Should the above condition be written like this ?

 if (log_buffer_state_local.buffer_full_cnt !=
 guc->log.prev_overflow_count[i]) {


Yes, but checkpatch.pl is your friend. :)


+new_overflow = 1;


true/false since it is a bool

fine will do that.



  guc->log.total_overflow_count[i] +=
  (log_buffer_state_local.buffer_full_cnt -
   guc->log.prev_overflow_count[i]);
@@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  guc->log.prev_overflow_count[i] =
  log_buffer_state_local.buffer_full_cnt;
  DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
-}
+} else
+new_overflow = 0;

  if (log_buffer_snapshot_state) {
  /* First copy the state structure in local buffer */
@@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
   * for consistency set the write pointer value to same
   * value of sampled_write_ptr in the snapshot buffer.
   */
-log_buffer_snapshot_state->write_ptr =
-log_buffer_snapshot_state->sampled_write_ptr;
+log_buffer_snapshot_state->write_ptr = write_offset;

  log_buffer_snapshot_state++;

  /* Now copy the actual logs */
  memcpy(dst_data_ptr, src_data_ptr, buffer_size);


The confusing bit - the memcpy above still copies the whole buffer, no?


Really very sorry for this blooper.


No worries, it happens to everyone!

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 15:31, Goel, Akash wrote:

On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:

+static void gen9_guc2host_events_work(struct work_struct *work)
+{
+struct drm_i915_private *dev_priv =
+container_of(work, struct drm_i915_private, guc.events_work);
+
+spin_lock_irq(&dev_priv->irq_lock);
+/* Speed up work cancellation during disabling guc interrupts. */
+if (!dev_priv->guc.interrupts_enabled) {
+spin_unlock_irq(&dev_priv->irq_lock);
+return;


I suppose locking for early exit is something about ensuring the worker
sees the update to dev_priv->guc.interrupts_enabled done on another
CPU?


Yes locking (providing implicit barrier) will ensure that update made
from another CPU is immediately visible to the worker.


What if the disable happens after the unlock above? It would wait in
disable until the irq handler exits.

Most probably it will not have to wait, as irq handler would have
completed if work item began the execution.
Irq handler just queues the work item, which gets scheduled later on.

Using the lock is beneficial for the case where the execution of work
item and interrupt disabling is done around the same time.


Ok maybe I am missing something.

When can the interrupt disabling happen? Will it be controlled by the 
debugfs file or is it driver load/unload and suspend/resume?



+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 gt_iir)
+{
+bool interrupts_enabled;
+
+if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
+spin_lock(&dev_priv->irq_lock);
+interrupts_enabled = dev_priv->guc.interrupts_enabled;
+spin_unlock(&dev_priv->irq_lock);


Not sure that taking a lock around only this read is needed.


Again same reason as above, to make sure an update made on another CPU
is immediately visible to the irq handler.


I don't get it, see above. :)


Here also, if interrupt disabling & ISR execution happen around the same
time then the ISR might miss the reset of the 'interrupts_enabled' flag and
queue the new work.


What if reset of interrupts_enabled happens just as the ISR releases the 
lock?



And same applies to the case when interrupt is re-enabled, ISR might
still see the 'interrupts_enabled' flag as false.
It will eventually see the update though.




+if (interrupts_enabled) {
+/* Sample the log buffer flush related bits & clear them
+ * out now itself from the message identity register to
+ * minimize the probability of losing a flush interrupt,
+ * when there are back to back flush interrupts.
+ * There can be a new flush interrupt, for different log
+ * buffer type (like for ISR), whilst Host is handling
+ * one (for DPC). Since same bit is used in message
+ * register for ISR & DPC, it could happen that GuC
+ * sets the bit for 2nd interrupt but Host clears out
+ * the bit on handling the 1st interrupt.
+ */
+u32 msg = I915_READ(SOFT_SCRATCH(15)) &
+(GUC2HOST_MSG_CRASH_DUMP_POSTED |
+ GUC2HOST_MSG_FLUSH_LOG_BUFFER);
+if (msg) {
+/* Clear the message bits that are handled */
+I915_WRITE(SOFT_SCRATCH(15),
+I915_READ(SOFT_SCRATCH(15)) & ~msg);


Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
twice?


Thought reading it again (just before the update) is bit safer compared
to reading it once, as there is a potential race problem here.
GuC could also write to the SOFT_SCRATCH(15) register, set new events
bit, while Host clears off the bit of handled events.


Don't get it. If there is a race between read and write there still is,
don't see how a second read makes it safer.


Yes can't avoid the race completely by double reads, but can reduce the
race window size.


There was only one thing between the two reads, and that was "if (msg)":

 +u32 msg = I915_READ(SOFT_SCRATCH(15)) &
 +(GUC2HOST_MSG_CRASH_DUMP_POSTED |
 + GUC2HOST_MSG_FLUSH_LOG_BUFFER);

 +if (msg) {

 +/* Clear the message bits that are handled */
 +I915_WRITE(SOFT_SCRATCH(15),
 +I915_READ(SOFT_SCRATCH(15)) & ~msg);
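
To make the point about the second read concrete, here is a minimal user-space sketch of the read, re-read, write-back sequence. It is only an illustration under stated assumptions: `demo_reg` stands in for the scratch register, the `DEMO_MSG_*` bit values are made up for the demo, and the two hook parameters are a hypothetical way of letting the "firmware" write inside each window. Nothing here is i915 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this demo. */
#define DEMO_MSG_CRASH_DUMP_POSTED	(1u << 1)
#define DEMO_MSG_FLUSH_LOG_BUFFER	(1u << 3)
#define DEMO_MSG_MASK \
	(DEMO_MSG_CRASH_DUMP_POSTED | DEMO_MSG_FLUSH_LOG_BUFFER)

/* Stand-in for the scratch register that both sides write. */
static uint32_t demo_reg;

/* Simulates the firmware posting a new event at an arbitrary moment. */
static void demo_fw_posts_crash_dump(void)
{
	demo_reg |= DEMO_MSG_CRASH_DUMP_POSTED;
}

/*
 * Read, mask, re-read, write back. 'before_reread' and 'before_write' are
 * optional hooks that let the simulated firmware run inside each window.
 * Returns the events the host believes it has to handle.
 */
static uint32_t demo_ack_events(void (*before_reread)(void),
				void (*before_write)(void))
{
	uint32_t msg = demo_reg & DEMO_MSG_MASK;	/* first read */
	uint32_t val;

	if (!msg)
		return 0;

	if (before_reread)
		before_reread();
	val = demo_reg;				/* second read */
	if (before_write)
		before_write();
	demo_reg = val & ~msg;			/* non-atomic write-back */

	return msg;
}
```

If the new event lands before the second read it survives the write-back; if it lands between the second read and the write it is wiped out. The second read therefore narrows the window but cannot close it, which is the objection raised above.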



Also I felt code looked better in current form, as macros
GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used
only once.

Will change as per the initial implementation.

 u32 msg = I915_READ(SOFT_SCRATCH(15));
 if (msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
            GUC2HOST_MSG_FLUSH_LOG_BUFFER)) {
     msg &= ~(GUC2HOST_MSG_CRASH_DUMP_POSTED |
              GUC2HOST_MSG_FLUSH_LOG_BUFFER);
     I915_WRITE(SOFT_SCRATCH(15), msg);
 }


Or:
u32 msg, flush;

msg = I915_READ(SOFT_SCRATCH(15));
	flush = msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED | 
GUC2HOST_MSG_FLUSH_LOG_BUFFER);

if (flush

Re: [Intel-gfx] [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:25 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

With the addition of new Host2GuC actions related to GuC logging, there
is a need of a lock to serialize them, as they can execute concurrently
with each other and also with other existing actions.

v2: Use mutex in place of spinlock to serialize, as sleep can happen
 while waiting for the action's response from GuC. (Tvrtko)

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
  drivers/gpu/drm/i915/intel_guc.h   | 3 +++
  2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1a2d648..cb9672b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc,
u32 *data, u32 len)
  return -EINVAL;

  intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+mutex_lock(&guc->action_lock);


I would probably take the mutex before grabbing forcewake as a general
rule. Not that I think it matters in this case since we don't expect any
contention on this one.

Yes, I did not expect contention on this mutex, hence thought to use it
just around the code where it is actually needed.
Will move it before the forcewake, as you suggested, to conform to the
rule.


Best regards
Akash


  dev_priv->guc.action_count += 1;
  dev_priv->guc.action_cmd = data[0];
@@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc,
u32 *data, u32 len)
  }
  dev_priv->guc.action_status = status;

+mutex_unlock(&guc->action_lock);
  intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);

  return ret;
@@ -1312,6 +1314,7 @@ int i915_guc_submission_init(struct
drm_i915_private *dev_priv)
  return -ENOMEM;

  ida_init(&guc->ctx_ids);
+mutex_init(&guc->action_lock);
  guc_create_log(guc);
  guc_create_ads(guc);

diff --git a/drivers/gpu/drm/i915/intel_guc.h
b/drivers/gpu/drm/i915/intel_guc.h
index 96ef7dc..e4ec8d8 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -156,6 +156,9 @@ struct intel_guc {

  uint64_t submissions[I915_NUM_ENGINES];
  uint32_t last_seqno[I915_NUM_ENGINES];
+
+/* To serialize the Host2GuC actions */
+struct mutex action_lock;
  };

  /* intel_guc_loader.c */



With or without the mutex vs forcewake ordering change:

Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko



Re: [Intel-gfx] [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:56 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

v2:
- Update the logic to detect multiple overflows between the 2
   flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
   the GuC log buffer. (Tvrtko)

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_debugfs.c| 28

  drivers/gpu/drm/i915/i915_guc_submission.c | 19 +++
  drivers/gpu/drm/i915/i915_irq.c|  2 ++
  drivers/gpu/drm/i915/intel_guc.h   |  7 +++
  4 files changed, 56 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
b/drivers/gpu/drm/i915/i915_debugfs.c
index 51b59d5..14e0dcf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2539,6 +2539,32 @@ static int i915_guc_load_status_info(struct
seq_file *m, void *data)
  return 0;
  }

+static void i915_guc_log_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv)
+{
+struct intel_guc *guc = &dev_priv->guc;
+
+seq_printf(m, "\nGuC logging stats:\n");
+
+seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
+guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);


Why is the width for overflow only 8 chars and not 10 like for flush
since both are u32?


Looks to be a discrepancy. I will check.
Both should be 10 as per the max value of u32, which takes 10 digits in 
decimal form.
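
The width argument is easy to check outside the kernel. The sketch below uses plain `snprintf` rather than `seq_printf`, and `demo_field_len` is a hypothetical helper written only for this illustration; it reports how many characters a counter occupies for a given minimum field width.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one counter with a minimum field width and
 * return how many characters the field actually occupies.
 */
static int demo_field_len(unsigned int width, uint32_t value)
{
	char buf[32];
	int n = snprintf(buf, sizeof(buf), "%*u", (int)width,
			 (unsigned int)value);

	/* snprintf reports the untruncated length; it must fit in buf. */
	assert(n > 0 && (size_t)n < sizeof(buf));
	return n;
}
```

With a width of 8, the largest u32 value (4294967295) still prints as 10 characters, so the column would shift for large counts; a width of 10 keeps every u32 value aligned.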





+
+seq_printf(m, "\tTotal flush interrupt count: %u\n",
+   guc->log.flush_interrupt_count);
+
+seq_printf(m, "\tCapture miss count: %u\n",
+   guc->log.capture_miss_count);
+}
+
  static void i915_guc_client_info(struct seq_file *m,
   struct drm_i915_private *dev_priv,
   struct i915_guc_client *client)
@@ -2613,6 +2639,8 @@ static int i915_guc_info(struct seq_file *m,
void *data)
  seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
  i915_guc_client_info(m, dev_priv, &client);

+i915_guc_log_info(m, dev_priv);
+
  /* Add more as required ... */

  return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index cb9672b..1ca1866 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -913,6 +913,24 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  sizeof(struct guc_log_buffer_state));
  buffer_size = log_buffer_state_local.size;

+guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
+if (log_buffer_state_local.buffer_full_cnt !=
+guc->log.prev_overflow_count[i]) {
+guc->log.total_overflow_count[i] +=
+(log_buffer_state_local.buffer_full_cnt -
+ guc->log.prev_overflow_count[i]);
+
+if (log_buffer_state_local.buffer_full_cnt <
+guc->log.prev_overflow_count[i]) {
+/* buffer_full_cnt is a 4 bit counter */
+guc->log.total_overflow_count[i] += 16;
+}
+
+guc->log.prev_overflow_count[i] =
+log_buffer_state_local.buffer_full_cnt;
+DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+}
+
  if (log_buffer_snapshot_state) {
  /* First copy the state structure in local buffer */
  memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
@@ -953,6 +971,7 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
   * getting consumed by User at a slow rate.
   */
  DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n");
+guc->log.capture_miss_count++;
  }
  }

diff --git a/drivers/gpu/drm/i915/i915_irq.c
b/drivers/gpu/drm/i915/i915_irq.c
index d4d6f0a..b08d1d2 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1705,6 +1705,8 @@ static void gen9_guc_irq_handler(struct
drm_i915_private *dev_priv, u32 gt_iir)
  

Re: [Intel-gfx] [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-08-12 Thread Goel, Akash



On 8/12/2016 8:12 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can just copy only the newly written data by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would comparatively have much less unread data in them.
In case of overflow reported by GuC, Driver does need to copy the entire
buffer as the whole buffer would contain the unread data.

v2: Rebase.

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 40 +-
  1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1ca1866..8e0f360 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  struct guc_log_buffer_state *log_buffer_state,
*log_buffer_snapshot_state;
  struct guc_log_buffer_state log_buffer_state_local;
  void *src_data_ptr, *dst_data_ptr;
-u32 i, buffer_size;
+bool new_overflow;
+u32 i, buffer_size, read_offset, write_offset, bytes_to_copy;

  if (!guc->log.buf_addr)
  return;
@@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  memcpy(&log_buffer_state_local, log_buffer_state,
  sizeof(struct guc_log_buffer_state));
  buffer_size = log_buffer_state_local.size;
+read_offset = log_buffer_state_local.read_ptr;
+write_offset = log_buffer_state_local.sampled_write_ptr;

  guc->log.flush_count[i] +=
log_buffer_state_local.flush_to_file;
  if (log_buffer_state_local.buffer_full_cnt !=
  guc->log.prev_overflow_count[i]) {


Wrong alignment. You can try checkpatch.pl for all of those.


Sorry for all the alignment & indentation issues.

Should the above condition be written like this ?

if (log_buffer_state_local.buffer_full_cnt !=
guc->log.prev_overflow_count[i]) {



+new_overflow = 1;


true/false since it is a bool

fine will do that.



  guc->log.total_overflow_count[i] +=
  (log_buffer_state_local.buffer_full_cnt -
   guc->log.prev_overflow_count[i]);
@@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
  guc->log.prev_overflow_count[i] =
  log_buffer_state_local.buffer_full_cnt;
  DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
-}
+} else
+new_overflow = 0;

  if (log_buffer_snapshot_state) {
  /* First copy the state structure in local buffer */
@@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct
intel_guc *guc)
   * for consistency set the write pointer value to same
   * value of sampled_write_ptr in the snapshot buffer.
   */
-log_buffer_snapshot_state->write_ptr =
-log_buffer_snapshot_state->sampled_write_ptr;
+log_buffer_snapshot_state->write_ptr = write_offset;

  log_buffer_snapshot_state++;

  /* Now copy the actual logs */
  memcpy(dst_data_ptr, src_data_ptr, buffer_size);


The confusing bit - the memcpy above still copies the whole buffer, no?


Really very sorry for this blooper.

Best regards
Akash


+if (unlikely(new_overflow)) {
+/* copy the whole buffer in case of overflow */
+read_offset = 0;
+write_offset = buffer_size;
+} else if (unlikely((read_offset > buffer_size) ||
+(write_offset > buffer_size))) {
+DRM_ERROR("invalid log buffer state\n");
+/* copy whole buffer as offsets are unreliable */
+read_offset = 0;
+write_offset = buffer_size;
+}
+
+/* Just copy the newly written data */
+if (read_offset <= write_offset) {
+bytes_to_copy = write_offset - read_offset;
+memcpy(dst_data_ptr + read_offset,
+ src_data_ptr + read_offset, bytes_to_copy);
+} else {
+bytes_to_copy = buffer_size - read_offset;
+memcpy(dst_data_ptr + read_offset,
+ src_data_ptr + read_offset, bytes_to_copy);
+
+bytes_to_copy = write_offset;
+memcpy(dst_data_ptr, src_data_ptr, bytes_to_copy);
+}

  src_data_ptr += buffer_size;
  dst_data_ptr += buffer_size;
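
For readers following the thread, the copy logic in the hunk above (copy only the region between the read and write offsets, wrap at the end of the buffer, and fall back to the whole buffer on overflow or implausible offsets) can be sketched stand-alone in plain C. This is a minimal illustration, not the driver code: `copy_new_log_data` and its parameters are hypothetical names, and it operates on ordinary memory rather than the GuC log object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of i915: copy only the unread bytes of a
 * circular log buffer from src to the same offsets in dst.
 * Returns the number of bytes copied.
 */
static size_t copy_new_log_data(uint8_t *dst, const uint8_t *src,
				uint32_t buffer_size, uint32_t read_offset,
				uint32_t write_offset, bool new_overflow)
{
	size_t copied;

	assert(dst && src);

	/* On overflow, or if the offsets look bogus, take the whole buffer. */
	if (new_overflow ||
	    read_offset > buffer_size || write_offset > buffer_size) {
		read_offset = 0;
		write_offset = buffer_size;
	}

	if (read_offset <= write_offset) {
		/* Unread data is one contiguous chunk. */
		copied = write_offset - read_offset;
		memcpy(dst + read_offset, src + read_offset, copied);
	} else {
		/* Unread data wraps: tail of the buffer, then its head. */
		size_t tail = buffer_size - read_offset;

		memcpy(dst + read_offset, src + read_offset, tail);
		memcpy(dst, src, write_offset);
		copied = tail + write_offset;
	}

	return copied;
}
```

Copying into the same offsets keeps the snapshot layout identical to the source buffer. As the review points out, the saving only materializes if the unconditional whole-buffer memcpy is dropped.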

Re: [Intel-gfx] [PATCH 11/20] drm/i915: Optimization to reduce the sampling time of GuC log buffer

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it becomes
half full, so Driver doesn't really need to sample the complete buffer
and can just copy only the newly written data by GuC into the local
buffer, i.e. as per the read & write pointer values.
Moreover the flush interrupt would generally come for one type of log
buffer, when it becomes half full, so at that time the other 2 types of
log buffer would comparatively have much less unread data in them.
In case of overflow reported by GuC, Driver does need to copy the entire
buffer as the whole buffer would contain the unread data.

v2: Rebase.

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 40 +-
  1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1ca1866..8e0f360 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -889,7 +889,8 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
struct guc_log_buffer_state *log_buffer_state, 
*log_buffer_snapshot_state;
struct guc_log_buffer_state log_buffer_state_local;
void *src_data_ptr, *dst_data_ptr;
-   u32 i, buffer_size;
+   bool new_overflow;
+   u32 i, buffer_size, read_offset, write_offset, bytes_to_copy;

if (!guc->log.buf_addr)
return;
@@ -912,10 +913,13 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
memcpy(&log_buffer_state_local, log_buffer_state,
sizeof(struct guc_log_buffer_state));
buffer_size = log_buffer_state_local.size;
+   read_offset = log_buffer_state_local.read_ptr;
+   write_offset = log_buffer_state_local.sampled_write_ptr;

guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
if (log_buffer_state_local.buffer_full_cnt !=
guc->log.prev_overflow_count[i]) {


Wrong alignment. You can try checkpatch.pl for all of those.


+   new_overflow = 1;


true/false since it is a bool


guc->log.total_overflow_count[i] +=
(log_buffer_state_local.buffer_full_cnt -
 guc->log.prev_overflow_count[i]);
@@ -929,7 +933,8 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
guc->log.prev_overflow_count[i] =
log_buffer_state_local.buffer_full_cnt;
DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
-   }
+   } else
+   new_overflow = 0;

if (log_buffer_snapshot_state) {
/* First copy the state structure in local buffer */
@@ -941,13 +946,37 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
 * for consistency set the write pointer value to same
 * value of sampled_write_ptr in the snapshot buffer.
 */
-   log_buffer_snapshot_state->write_ptr =
-   log_buffer_snapshot_state->sampled_write_ptr;
+   log_buffer_snapshot_state->write_ptr = write_offset;

log_buffer_snapshot_state++;

/* Now copy the actual logs */
memcpy(dst_data_ptr, src_data_ptr, buffer_size);


The confusing bit - the memcpy above still copies the whole buffer, no?


+   if (unlikely(new_overflow)) {
+   /* copy the whole buffer in case of overflow */
+   read_offset = 0;
+   write_offset = buffer_size;
+   } else if (unlikely((read_offset > buffer_size) ||
+   (write_offset > buffer_size))) {
+   DRM_ERROR("invalid log buffer state\n");
+   /* copy whole buffer as offsets are unreliable */
+   read_offset = 0;
+   write_offset = buffer_size;
+   }
+
+   /* Just copy the newly written data */
+   if (read_offset <= write_offset) {
+   bytes_to_copy = write_offset - read_offset;
+   memcpy(dst_data_ptr + read_offset,
+src_data_ptr + read_offset, bytes_to_copy);
+   } else {
+   bytes_to_copy = buffer_size - read_offset;
+   memcpy(dst_data_ptr + read_offset,
+ 

Re: [Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts

2016-08-12 Thread Goel, Akash



On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:


On 12/08/16 14:10, Goel, Akash wrote:

On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, like for
example retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
   start of bottom half, it's not really needed as there is only a
   single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
   register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
   bits, outside the irq spinlock. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.h|   1 +
  drivers/gpu/drm/i915/i915_guc_submission.c |   5 ++
  drivers/gpu/drm/i915/i915_irq.c| 100 +++--
  drivers/gpu/drm/i915/i915_reg.h|  11 
  drivers/gpu/drm/i915/intel_drv.h   |   3 +
  drivers/gpu/drm/i915/intel_guc.h   |   4 ++
  drivers/gpu/drm/i915/intel_guc_loader.c|   4 ++
  7 files changed, 124 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index a608a5c..28ffac5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1779,6 +1779,7 @@ struct drm_i915_private {
  u32 pm_imr;
  u32 pm_ier;
  u32 pm_rps_events;
+u32 pm_guc_events;
  u32 pipestat_irq_mask[I915_MAX_PIPES];

  struct i915_hotplug hotplug;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index ad3b55f..c7c679f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev)
  if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
  return 0;

+gen9_disable_guc_interrupts(dev_priv);
+
  ctx = dev_priv->kernel_context;

  data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
@@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev)
  if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
  return 0;

+if (i915.guc_log_level >= 0)
+gen9_enable_guc_interrupts(dev_priv);
+
  ctx = dev_priv->kernel_context;

  data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
diff --git a/drivers/gpu/drm/i915/i915_irq.c
b/drivers/gpu/drm/i915/i915_irq.c
index 5f93309..5f1974f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct
drm_i915_private *dev_priv,
  } while (0)

  static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv,
u32 pm_iir);
+static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
u32 pm_iir);

  /* For display hotplug interrupt */
  static inline void
@@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct
drm_i915_private *dev_priv)
  gen6_reset_rps_interrupts(dev_priv);
  }

+void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
+spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+if (!dev_priv->guc.interrupts_enabled) {
+WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
+dev_priv->pm_guc_events);
+dev_priv->guc.interrupts_enabled = true;
+gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+}
+spin_unlock_irq(&dev_priv->irq_lock);
+}
+
+void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
+{
+spin_lock_irq(&dev_priv->irq_lock);
+dev_priv->guc.interrupts_enabled = false;
+
+gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events);
+
+spin_unlock_irq(&dev_priv->irq_lock);
+synchronize_irq(dev_priv->drm.irq);
+
+gen9_reset_guc_interrupts(dev_priv);
+}
+
  /**
   * bdw_update_port_irq - update DE port interrupt
   * @dev_priv: driver private
@@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct
*work)
  mutex_unlock(&dev_priv->rps.hw_

Re: [Intel-gfx] [PATCH 10/20] drm/i915: Add stats for GuC log buffer flush interrupts

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

v2:
- Update the logic to detect multiple overflows between the 2
   flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
   the GuC log buffer. (Tvrtko)

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_debugfs.c| 28 
  drivers/gpu/drm/i915/i915_guc_submission.c | 19 +++
  drivers/gpu/drm/i915/i915_irq.c|  2 ++
  drivers/gpu/drm/i915/intel_guc.h   |  7 +++
  4 files changed, 56 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 51b59d5..14e0dcf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2539,6 +2539,32 @@ static int i915_guc_load_status_info(struct seq_file *m, 
void *data)
return 0;
  }

+static void i915_guc_log_info(struct seq_file *m,
+struct drm_i915_private *dev_priv)
+{
+   struct intel_guc *guc = &dev_priv->guc;
+
+   seq_printf(m, "\nGuC logging stats:\n");
+
+   seq_printf(m, "\tISR:   flush count %10u, overflow count %8u\n",
+   guc->log.flush_count[GUC_ISR_LOG_BUFFER],
+   guc->log.total_overflow_count[GUC_ISR_LOG_BUFFER]);
+
+   seq_printf(m, "\tDPC:   flush count %10u, overflow count %8u\n",
+   guc->log.flush_count[GUC_DPC_LOG_BUFFER],
+   guc->log.total_overflow_count[GUC_DPC_LOG_BUFFER]);
+
+   seq_printf(m, "\tCRASH: flush count %10u, overflow count %8u\n",
+   guc->log.flush_count[GUC_CRASH_DUMP_LOG_BUFFER],
+   guc->log.total_overflow_count[GUC_CRASH_DUMP_LOG_BUFFER]);


Why is the width for overflow only 8 chars and not 10 like for flush 
since both are u32?



+
+   seq_printf(m, "\tTotal flush interrupt count: %u\n",
+  guc->log.flush_interrupt_count);
+
+   seq_printf(m, "\tCapture miss count: %u\n",
+  guc->log.capture_miss_count);
+}
+
  static void i915_guc_client_info(struct seq_file *m,
 struct drm_i915_private *dev_priv,
 struct i915_guc_client *client)
@@ -2613,6 +2639,8 @@ static int i915_guc_info(struct seq_file *m, void *data)
seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
i915_guc_client_info(m, dev_priv, &client);

+   i915_guc_log_info(m, dev_priv);
+
/* Add more as required ... */

return 0;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index cb9672b..1ca1866 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -913,6 +913,24 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
sizeof(struct guc_log_buffer_state));
buffer_size = log_buffer_state_local.size;

+   guc->log.flush_count[i] += log_buffer_state_local.flush_to_file;
+   if (log_buffer_state_local.buffer_full_cnt !=
+   guc->log.prev_overflow_count[i]) {
+   guc->log.total_overflow_count[i] +=
+   (log_buffer_state_local.buffer_full_cnt -
+guc->log.prev_overflow_count[i]);
+
+   if (log_buffer_state_local.buffer_full_cnt <
+   guc->log.prev_overflow_count[i]) {
+   /* buffer_full_cnt is a 4 bit counter */
+   guc->log.total_overflow_count[i] += 16;
+   }
+
+   guc->log.prev_overflow_count[i] =
+   log_buffer_state_local.buffer_full_cnt;
+   DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+   }
+
if (log_buffer_snapshot_state) {
/* First copy the state structure in local buffer */
memcpy(log_buffer_snapshot_state, 
&log_buffer_state_local,
@@ -953,6 +971,7 @@ static void guc_read_update_log_buffer(struct intel_guc 
*guc)
 * getting consumed by User at a slow rate.
 */
DRM_ERROR_RATELIMITED("no sub-buffer to capture log buffer\n");
+   guc->log.capture_miss_count++;
}
  }
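
The overflow accounting in the hunk above depends on unsigned arithmetic: when the 4-bit buffer_full_cnt wraps, the subtraction wraps modulo 2^32 and adding 16 brings the delta back to the true value. Here is a user-space sketch of the same logic, assuming 32-bit unsigned counters; the struct and function names are hypothetical stand-ins, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

#define FULL_CNT_MODULUS 16u /* buffer_full_cnt is a 4-bit counter */

/* Hypothetical stand-in for the per-buffer bookkeeping kept in guc->log. */
struct overflow_stats {
	uint32_t prev;  /* last value read from the 4-bit counter */
	uint32_t total; /* accumulated overflow count */
};

/* Fold a new reading of the 4-bit counter into the running total.
 * Returns the number of overflows seen since the previous reading. */
static uint32_t overflow_update(struct overflow_stats *s, uint32_t full_cnt)
{
	uint32_t delta = 0;

	assert(full_cnt < FULL_CNT_MODULUS);

	if (full_cnt != s->prev) {
		/* May wrap modulo 2^32 when the 4-bit counter rolled over. */
		delta = full_cnt - s->prev;
		if (full_cnt < s->prev)
			delta += FULL_CNT_MODULUS; /* undo the 4-bit wrap */
	}

	s->total += delta;
	s->prev = full_cnt;
	return delta;
}
```

For example, going from 15 to 2 yields a delta of 3. Note that a 4-bit counter cannot distinguish deltas that differ by a multiple of 16, so this scheme undercounts if 16 or more overflows happen between two reads.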

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c

[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [CI,01/31] drm/i915: Record the position of the start of the request

2016-08-12 Thread Patchwork
== Series Details ==

Series: series starting with [CI,01/31] drm/i915: Record the position of the 
start of the request
URL   : https://patchwork.freedesktop.org/series/11029/
State : failure

== Summary ==

Series 11029v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/11029/revisions/1/mbox

Test drv_module_reload_basic:
pass   -> SKIP   (ro-hsw-i3-4010u)
Test gem_exec_suspend:
Subgroup basic-s3:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
Test kms_cursor_legacy:
Subgroup basic-cursor-vs-flip-varying-size:
pass   -> FAIL   (ro-ilk1-i5-650)
Subgroup basic-flip-vs-cursor-varying-size:
fail   -> PASS   (ro-skl3-i5-6260u)
pass   -> FAIL   (ro-bdw-i5-5250u)
dmesg-fail -> PASS   (fi-skl-i7-6700k)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
dmesg-warn -> PASS   (ro-bdw-i7-5600u)
skip   -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22
fi-kbl-qkkr      total:244  pass:186  dwarn:29  dfail:0   fail:3   skip:26
fi-skl-i7-6700k  total:244  pass:209  dwarn:4   dfail:1   fail:2   skip:28
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42
ro-bdw-i5-5250u  total:240  pass:218  dwarn:2   dfail:0   fail:2   skip:18
ro-bdw-i7-5600u  total:240  pass:206  dwarn:1   dfail:0   fail:1   skip:32
ro-bsw-n3050     total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42
ro-byt-n2820     total:240  pass:197  dwarn:0   dfail:0   fail:3   skip:40
ro-hsw-i3-4010u  total:240  pass:213  dwarn:0   dfail:0   fail:0   skip:27
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55
ro-ilk1-i5-650   total:235  pass:173  dwarn:0   dfail:0   fail:2   skip:60
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31
ro-skl3-i5-6260u total:240  pass:223  dwarn:0   dfail:0   fail:3   skip:14

Results at /archive/results/CI_IGT_test/RO_Patchwork_1855/

9a79c0b drm-intel-nightly: 2016y-08m-12d-12h-12m-18s UTC integration manifest
af56144 drm/i915: Record the RING_MODE register for post-mortem debugging
90ea1ef drm/i915: Only record active and pending requests upon a GPU hang
5a32c90 drm/i915: Print the batchbuffer offset next to BBADDR in error state
dad1d48 drm/i915: Introduce i915_ggtt_offset()
1707d23 drm/i915: Track pinned VMA
cf7576d drm/i915: Consolidate i915_vma_unpin_and_release()
aeac2e6 drm/i915: Use VMA for wa_ctx tracking
c938b0f drm/i915: Use VMA for render state page tracking
70a014e drm/i915: Use VMA as the primary tracker for semaphore page
c10834c drm/i915/overlay: Use VMA as the primary tracker for images
ccf275f drm/i915: Move common seqno reset to intel_engine_cs.c
ec45caa drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c
2a36ad5 drm/i915: Use VMA for scratch page tracking
1624cc0 drm/i915: Use VMA for ringbuffer tracking
4ce773a drm/i915: Move assertion for iomap access to i915_vma_pin_iomap
fb0c322f drm/i915: Only change the context object's domain when binding
7c2a6f6 drm/i915: Use VMA as the primary object for context state
160017d drm/i915: Use VMA directly for checking tiling parameters
ce4571b drm/i915: Convert fence computations to use vma directly
59e16d6 drm/i915: Track pinned vma inside guc
094e926 drm/i915: Add convenience wrappers for vma's object get/put
3015b87 drm/i915: Add fetch_and_zero() macro
15fbe90 drm/i915: Create a VMA for an object
6131737 drm/i915: Always set the vma->pages
9f0f991 drm/i915: Remove redundant WARN_ON from __i915_add_request()
1c06408 drm/i915: Reduce i915_gem_objects to only show object information
fe2ce95 drm/i915: Focus debugfs/i915_gem_pinned to show only display pins
c258579 drm/i915: Remove inactive/active list from debugfs
be86468 drm/i915: Store the active context object on all engines upon error
dc43be9 drm/i915: Reduce amount of duplicate buffer information captured on 
error
8f4ea2e drm/i915: Record the position of the start of the request

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 14:45, Goel, Akash wrote:



On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
   crash buffer area for regular cases and copying only the state
   structure data in first page.

v3:
  - Create a vmalloc mapping of log buffer. (Chris)
  - Cover the flush acknowledgment under rpm get & put.(Chris)
  - Revert the change of skipping the copy of crash dump area, as
not really needed, will be covered by subsequent patch.

v4:
  - Destroy the wq under the same condition in which it was created,
pass dev_priv pointer instead of dev to newly added GuC function,
add more comments & rename variable for clarity. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.c|  14 +++
  drivers/gpu/drm/i915/i915_guc_submission.c | 150
+
  drivers/gpu/drm/i915/i915_irq.c|   5 +-
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 0fcd1c0..fc2da32 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
drm_i915_private *dev_priv)
  if (dev_priv->hotplug.dp_wq == NULL)
  goto out_free_wq;

+if (HAS_GUC_SCHED(dev_priv)) {


This just reminded me that a previous patch had:

+if (HAS_GUC_UCODE(dev))
+dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;

In the interrupt setup. I don't think there is a bug right now, but
there is a disagreement between the two which would be good to resolve.

This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
for correctness. I think.


Sorry for the inconsistency; I will use HAS_GUC_SCHED in the previous patch.

As per Chris's comments, I will move the wq init/destroy to the GuC logging
setup/teardown routines (guc_create_log_extras, guc_log_cleanup).
Are you fine with that?


Yes, that's OK I think.




+/* Need a dedicated wq to process log buffer flush interrupts
+ * from GuC without much delay so as to avoid any loss of logs.
+ */
+dev_priv->guc.log.wq =
+alloc_ordered_workqueue("i915-guc_log", 0);
+if (dev_priv->guc.log.wq == NULL)
+goto out_free_hotplug_dp_wq;
+}
+
  return 0;

+out_free_hotplug_dp_wq:
+destroy_workqueue(dev_priv->hotplug.dp_wq);
  out_free_wq:
  destroy_workqueue(dev_priv->wq);
  out_err:
@@ -782,6 +794,8 @@ out_err:

  static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
  {
+if (HAS_GUC_SCHED(dev_priv))
+destroy_workqueue(dev_priv->guc.log.wq);
  destroy_workqueue(dev_priv->hotplug.dp_wq);
  destroy_workqueue(dev_priv->wq);
  }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c7c679f..2635b67 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
intel_guc *guc,
  return host2guc_action(guc, data, ARRAY_SIZE(data));
  }

+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+u32 data[1];
+
+data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+return host2guc_action(guc, data, 1);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -840,6 +849,127 @@ err:
  return NULL;
  }

+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+return;
+}
+
+static void* guc_get_write_buffer(struct intel_guc *guc)
+{
+return NULL;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+struct guc_log_buffer_state *log_buffer_state,
*log_buffer_snapshot_state;
+struct guc_log_buffer_state log_buffer_state_local;
+void *src_data_ptr, *dst_data_ptr;
+u32 i, buffer_size;


unsigned int i if you can be bothered.


Fine will do that for both i & buffer_size.


buffer_size can match the type of log_buffer_state_local.size or use 
something else if more appropriate.



But I remember that earlier, in one of the patches, you suggested using u32
as the type for some variables.
Could you please share the guideline?
Should u32 and u64 be used when we are exactly sure of the range of the
variable, such as for variables containing register values?



Re: [Intel-gfx] [PATCH 09/20] drm/i915: New lock to serialize the Host2GuC actions

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

With the addition of new Host2GuC actions related to GuC logging, a lock
is needed to serialize them, as they can execute concurrently
with each other and also with other existing actions.

v2: Use mutex in place of spinlock to serialize, as sleep can happen
 while waiting for the action's response from GuC. (Tvrtko)

Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 3 +++
  drivers/gpu/drm/i915/intel_guc.h   | 3 +++
  2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 1a2d648..cb9672b 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -88,6 +88,7 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, 
u32 len)
return -EINVAL;

intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+   mutex_lock(&guc->action_lock);


I would probably take the mutex before grabbing forcewake as a general 
rule. Not that I think it matters in this case since we don't expect any 
contention on this one.
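
To make the two orderings concrete, here is a single-threaded user-space sketch. Everything in it is a hypothetical stand-in (a flag for the action mutex, a counter for the forcewake reference), not kernel code. The review does not spell out the reason for the general rule; one plausible reading is that with the mutex outermost, a caller sleeping on the mutex is not yet holding a forcewake reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not kernel code: a flag models the action mutex
 * and a counter models the forcewake reference. */
struct fake_guc {
	bool action_locked;
	int forcewake_refs;
	int refs_taken_unlocked; /* forcewake gets issued without the mutex */
};

static void fake_mutex_lock(struct fake_guc *g)
{
	assert(!g->action_locked);
	g->action_locked = true;
}

static void fake_mutex_unlock(struct fake_guc *g)
{
	assert(g->action_locked);
	g->action_locked = false;
}

static void fake_forcewake_get(struct fake_guc *g)
{
	if (!g->action_locked)
		g->refs_taken_unlocked++;
	g->forcewake_refs++;
}

static void fake_forcewake_put(struct fake_guc *g)
{
	assert(g->forcewake_refs > 0);
	g->forcewake_refs--;
}

/* Ordering suggested in the review: mutex outermost. */
static void send_action_mutex_first(struct fake_guc *g)
{
	fake_mutex_lock(g);
	fake_forcewake_get(g);
	/* ... send the action and wait for the GuC response ... */
	fake_forcewake_put(g);
	fake_mutex_unlock(g);
}

/* Ordering in the patch as posted: forcewake outermost. */
static void send_action_forcewake_first(struct fake_guc *g)
{
	fake_forcewake_get(g);
	fake_mutex_lock(g);
	/* ... send the action and wait for the GuC response ... */
	fake_mutex_unlock(g);
	fake_forcewake_put(g);
}
```

Both variants release in the reverse order of acquisition and leave the state balanced; they differ only in whether the forcewake reference is ever held outside the lock.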




dev_priv->guc.action_count += 1;
dev_priv->guc.action_cmd = data[0];
@@ -126,6 +127,7 @@ static int host2guc_action(struct intel_guc *guc, u32 
*data, u32 len)
}
dev_priv->guc.action_status = status;

+   mutex_unlock(&guc->action_lock);
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);

return ret;
@@ -1312,6 +1314,7 @@ int i915_guc_submission_init(struct drm_i915_private 
*dev_priv)
return -ENOMEM;

ida_init(&guc->ctx_ids);
+   mutex_init(&guc->action_lock);
guc_create_log(guc);
guc_create_ads(guc);

diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 96ef7dc..e4ec8d8 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -156,6 +156,9 @@ struct intel_guc {

uint64_t submissions[I915_NUM_ENGINES];
uint32_t last_seqno[I915_NUM_ENGINES];
+
+   /* To serialize the Host2GuC actions */
+   struct mutex action_lock;
  };

  /* intel_guc_loader.c */



With or without the mutex vs forcewake ordering change:

Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko


Re: [Intel-gfx] [PATCH 08/20] drm/i915: Add a relay backed debugfs interface for capturing GuC logs

2016-08-12 Thread Tvrtko Ursulin


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

Add a new debugfs interface '/sys/kernel/debug/dri/guc_log' for the
User to capture GuC firmware logs. The relay framework is used to implement
the interface, so the Driver just has to use a relay API to store
snapshots of the GuC log buffer in the buffer managed by relay.
The snapshot will be taken when GuC firmware sends a log buffer flush
interrupt and up to four snaphots could be stored in the relay buffer.


snapshots


The relay buffer will be operated in a mode where it will overwrite the
data not yet collected by User.
Besides mmap method, through which User can directly access the relay
buffer contents, relay also supports the 'poll' method. Through the 'poll'
call on log file, User can come to know whenever a new snapshot of the
log buffer is taken by Driver, so can run in tandem with the Driver and
capture the logs in a sustained/streaming manner, without any loss of data.
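
As an illustration of the 'poll' method described above, a user-space consumer could block in poll() until the log file becomes readable and then read the new data. The sketch below works on any file descriptor; the function names are made up for illustration, and whether a real consumer would read() or mmap the relay buffer is left open here.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait until fd becomes readable or timeout_ms expires.
 * Returns 1 if data is ready, 0 on timeout, -1 on error. */
static int wait_for_snapshot(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	assert(fd >= 0);

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;
	return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Read whatever is currently available into buf; returns bytes read. */
static ssize_t read_snapshot(int fd, void *buf, size_t len)
{
	return read(fd, buf, len);
}
```

A capture tool would open the debugfs file named in the commit message, then loop over wait_for_snapshot() and read_snapshot(), appending each chunk to its output file.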

v2: Defer the creation of relay channel & associated debugfs file, as
 debugfs setup is now done at the end of i915 Driver load. (Chris)

v3:
- Switch to no-overwrite mode for relay.
- Fix the relay sub buffer switching sequence.

v4:
- Update i915 Kconfig to select RELAY config. (Tvrtko)
- Log a message when there is no sub buffer available to capture
   the GuC log buffer. (Tvrtko)
- Increase the number of relay sub buffers to 8 from 4, to have
   sufficient buffering for boot time logs

Suggested-by: Chris Wilson 
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/Kconfig   |   1 +
  drivers/gpu/drm/i915/i915_drv.c|   2 +
  drivers/gpu/drm/i915/i915_guc_submission.c | 206 -
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 209 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e46..fc900d2 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -11,6 +11,7 @@ config DRM_I915
select DRM_KMS_HELPER
select DRM_PANEL
select DRM_MIPI_DSI
+   select RELAY
# i915 depends on ACPI_VIDEO when ACPI is enabled
# but for select to work, need to select ACPI_VIDEO's dependencies, ick
select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index fc2da32..cb8c943 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1145,6 +1145,7 @@ static void i915_driver_register(struct drm_i915_private 
*dev_priv)
/* Reveal our presence to userspace */
if (drm_dev_register(dev, 0) == 0) {
i915_debugfs_register(dev_priv);
+   i915_guc_register(dev_priv);
i915_setup_sysfs(dev);
} else
DRM_ERROR("Failed to register driver for userspace access!\n");
@@ -1183,6 +1184,7 @@ static void i915_driver_unregister(struct 
drm_i915_private *dev_priv)
intel_opregion_unregister(dev_priv);

i915_teardown_sysfs(&dev_priv->drm);
+   i915_guc_unregister(dev_priv);
i915_debugfs_unregister(dev_priv);
drm_dev_unregister(&dev_priv->drm);

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 2635b67..1a2d648 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -23,6 +23,8 @@
   */
  #include 
  #include 
+#include 
+#include 
  #include "i915_drv.h"
  #include "intel_guc.h"

@@ -851,12 +853,33 @@ err:

  static void guc_move_to_next_buf(struct intel_guc *guc)
  {
-   return;
+   /* Make sure the updates made in the sub buffer are visible when
+* Consumer sees the following update to offset inside the sub buffer.
+*/
+   smp_wmb();
+
+   /* All data has been written, so now move the offset of sub buffer. */
+   relay_reserve(guc->log.relay_chan, guc->log.obj->base.size);
+
+   /* Switch to the next sub buffer */
+   relay_flush(guc->log.relay_chan);
  }

  static void* guc_get_write_buffer(struct intel_guc *guc)
  {
-   return NULL;
+   /* FIXME: Cover the check under a lock ? */


Need to resolve before r-b in any case.


+   if (!guc->log.relay_chan)
+   return NULL;
+
+   /* Just get the base address of a new sub buffer and copy data into it
+* ourselves. NULL will be returned in no-overwrite mode, if all sub
+* buffers are full. Could have used the relay_write() to indirectly
+* copy the data, but that would have been bit convoluted, as we need to
+* write to only certain locations inside a sub buffer which cannot be
+* done without using relay_reserve() along with relay_write(). So its
+* better to use relay_reserve() alone.
+*/
+   return relay_reserve(guc->log.relay_chan, 0);
  }

  static void guc_read_update

[Intel-gfx] [CI 20/31] drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c

2016-08-12 Thread Chris Wilson
Since the scratch allocation and cleanup is shared by all engine
submission backends, move it out of the legacy intel_ringbuffer.c and
into the new home for common routines, intel_engine_cs.c

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 50 +
 drivers/gpu/drm/i915/intel_lrc.c|  1 -
 drivers/gpu/drm/i915/intel_ringbuffer.c | 50 -
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +--
 4 files changed, 51 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index 186c12d07f99..7104dec5e893 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -195,6 +195,54 @@ void intel_engine_setup_common(struct intel_engine_cs 
*engine)
i915_gem_batch_pool_init(engine, &engine->batch_pool);
 }
 
+int intel_engine_create_scratch(struct intel_engine_cs *engine, int size)
+{
+   struct drm_i915_gem_object *obj;
+   struct i915_vma *vma;
+   int ret;
+
+   WARN_ON(engine->scratch);
+
+   obj = i915_gem_object_create_stolen(&engine->i915->drm, size);
+   if (!obj)
+   obj = i915_gem_object_create(&engine->i915->drm, size);
+   if (IS_ERR(obj)) {
+   DRM_ERROR("Failed to allocate scratch page\n");
+   return PTR_ERR(obj);
+   }
+
+   vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL);
+   if (IS_ERR(vma)) {
+   ret = PTR_ERR(vma);
+   goto err_unref;
+   }
+
+   ret = i915_vma_pin(vma, 0, 4096, PIN_GLOBAL | PIN_HIGH);
+   if (ret)
+   goto err_unref;
+
+   engine->scratch = vma;
+   DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08llx\n",
+engine->name, vma->node.start);
+   return 0;
+
+err_unref:
+   i915_gem_object_put(obj);
+   return ret;
+}
+
+static void intel_engine_cleanup_scratch(struct intel_engine_cs *engine)
+{
+   struct i915_vma *vma;
+
+   vma = fetch_and_zero(&engine->scratch);
+   if (!vma)
+   return;
+
+   i915_vma_unpin(vma);
+   i915_vma_put(vma);
+}
+
 /**
  * intel_engines_init_common - initialize cengine state which might require hw 
access
  * @engine: Engine to initialize.
@@ -226,6 +274,8 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
  */
 void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 {
+   intel_engine_cleanup_scratch(engine);
+
intel_engine_cleanup_cmd_parser(engine);
intel_engine_fini_breadcrumbs(engine);
i915_gem_batch_pool_fini(&engine->batch_pool);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 42999ba02152..56c904e2dc98 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1844,7 +1844,6 @@ int logical_render_ring_init(struct intel_engine_cs 
*engine)
else
engine->init_hw = gen8_init_render_ring;
engine->init_context = gen8_init_rcs_context;
-   engine->cleanup = intel_engine_cleanup_scratch;
engine->emit_flush = gen8_emit_flush_render;
engine->emit_request = gen8_emit_request_render;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 7ce912f8d96c..c89aea55bc10 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -613,54 +613,6 @@ out:
return ret;
 }
 
-void intel_engine_cleanup_scratch(struct intel_engine_cs *engine)
-{
-   struct i915_vma *vma;
-
-   vma = fetch_and_zero(&engine->scratch);
-   if (!vma)
-   return;
-
-   i915_vma_unpin(vma);
-   i915_vma_put(vma);
-}
-
-int intel_engine_create_scratch(struct intel_engine_cs *engine, int size)
-{
-   struct drm_i915_gem_object *obj;
-   struct i915_vma *vma;
-   int ret;
-
-   WARN_ON(engine->scratch);
-
-   obj = i915_gem_object_create_stolen(&engine->i915->drm, size);
-   if (!obj)
-   obj = i915_gem_object_create(&engine->i915->drm, size);
-   if (IS_ERR(obj)) {
-   DRM_ERROR("Failed to allocate scratch page\n");
-   return PTR_ERR(obj);
-   }
-
-   vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL);
-   if (IS_ERR(vma)) {
-   ret = PTR_ERR(vma);
-   goto err_unref;
-   }
-
-   ret = i915_vma_pin(vma, 0, 4096, PIN_GLOBAL | PIN_HIGH);
-   if (ret)
-   goto err_unref;
-
-   engine->scratch = vma;
-   DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08llx\n",
-engine->name, vma->node.start);
-   return 0;
-
-err_unref:
-   i915_gem_object_put(obj);
-   return ret;
-}
-
 static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
 {
struct intel_r

[Intel-gfx] [CI 18/31] drm/i915: Use VMA for ringbuffer tracking

2016-08-12 Thread Chris Wilson
Use the GGTT VMA as the primary cookie for handling ring objects as
the most common action upon the ring is mapping and unmapping which act
upon the VMA itself. By restructuring the code to work with the ring
VMA, we can shrink the code and remove a few cycles from context pinning.

v2: Move the flush of the object back to before the first pin. We use
the am-I-bound? query to only have to check the flush on the first
bind and so avoid stalling on active rings.
Lots of little renames and small hoops.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c|   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c  |   4 +-
 drivers/gpu/drm/i915/i915_guc_submission.c |  16 +-
 drivers/gpu/drm/i915/intel_lrc.c   |  17 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c| 243 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.h|  14 +-
 6 files changed, 139 insertions(+), 157 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index fcda4e7da127..2da37c196ef0 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -356,7 +356,7 @@ static int per_file_ctx_stats(int id, void *ptr, void *data)
if (ctx->engine[n].state)
per_file_stats(0, ctx->engine[n].state->obj, data);
if (ctx->engine[n].ring)
-   per_file_stats(0, ctx->engine[n].ring->obj, data);
+   per_file_stats(0, ctx->engine[n].ring->vma->obj, data);
}
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 35394d393edc..4a19494a4f6f 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1128,12 +1128,12 @@ static void i915_gem_record_rings(struct 
drm_i915_private *dev_priv,
ee->cpu_ring_tail = ring->tail;
ee->ringbuffer =
i915_error_ggtt_object_create(dev_priv,
- ring->obj);
+ ring->vma->obj);
}
 
ee->hws_page =
i915_error_ggtt_object_create(dev_priv,
- engine->status_page.obj);
+ 
engine->status_page.vma->obj);
 
ee->wa_ctx = i915_error_ggtt_object_create(dev_priv,
   engine->wa_ctx.obj);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 4f0f173f9754..c40b92e212fa 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -343,7 +343,6 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
struct intel_context *ce = &ctx->engine[engine->id];
uint32_t guc_engine_id = engine->guc_id;
struct guc_execlist_context *lrc = &desc.lrc[guc_engine_id];
-   struct drm_i915_gem_object *obj;
 
/* TODO: We have a design issue to be solved here. Only when we
 * receive the first batch, we know which engine is used by the
@@ -358,17 +357,14 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
lrc->context_desc = lower_32_bits(ce->lrc_desc);
 
/* The state page is after PPHWSP */
-   gfx_addr = ce->state->node.start;
-   lrc->ring_lcra = gfx_addr + LRC_STATE_PN * PAGE_SIZE;
+   lrc->ring_lcra =
+   ce->state->node.start + LRC_STATE_PN * PAGE_SIZE;
lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
 
-   obj = ce->ring->obj;
-   gfx_addr = i915_gem_obj_ggtt_offset(obj);
-
-   lrc->ring_begin = gfx_addr;
-   lrc->ring_end = gfx_addr + obj->base.size - 1;
-   lrc->ring_next_free_location = gfx_addr;
+   lrc->ring_begin = ce->ring->vma->node.start;
+   lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
+   lrc->ring_next_free_location = lrc->ring_begin;
lrc->ring_current_tail_pointer_value = 0;
 
desc.engines_used |= (1 << guc_engine_id);
@@ -943,7 +939,7 @@ static void guc_create_ads(struct intel_guc *guc)
 * to find it.
 */
engine = &dev_priv->engine[RCS];
-   ads->golden_context_lrca = engine->status_page.gfx_addr;
+   ads->golden_context_lrca = engine->status_page.ggtt_offset;
 
for_each_engine(engine, dev_priv)
ads->eng_state_size[engine->guc_id] = 
intel_lr_context_size(engine);
diff --git a/drivers/gpu/drm/i915/int

[Intel-gfx] [CI 26/31] drm/i915: Consolidate i915_vma_unpin_and_release()

2016-08-12 Thread Chris Wilson
In a few places, we repeat a call to clear a pointer to a vma whilst
unpinning and releasing a reference to its owner. Refactor those into a
common function.
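
The pattern being consolidated can be sketched in plain user-space C as follows. The struct and helpers are hypothetical stand-ins that only count pins and references; take_vma() is a plain-function version of the idea behind fetch_and_zero(), which reads a value and clears its storage in one step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct i915_vma: only counts pins and refs. */
struct fake_vma {
	int pin_count;
	int refcount;
};

/* Return the pointer stored at *p and replace it with NULL. */
static struct fake_vma *take_vma(struct fake_vma **p)
{
	struct fake_vma *vma = *p;

	*p = NULL;
	return vma;
}

/* Clear the caller's pointer, then unpin and drop the reference. */
static void fake_vma_unpin_and_release(struct fake_vma **p_vma)
{
	struct fake_vma *vma;

	assert(p_vma != NULL);

	vma = take_vma(p_vma);
	if (!vma)
		return; /* already released, or never allocated */

	vma->pin_count--;
	vma->refcount--;
}
```

Because the caller's pointer is cleared before anything is released, calling the helper a second time on the same slot is a harmless no-op.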

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c| 12 
 drivers/gpu/drm/i915/i915_gem_gtt.h|  1 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 21 -
 drivers/gpu/drm/i915/intel_engine_cs.c |  9 +
 drivers/gpu/drm/i915/intel_lrc.c   |  9 +
 drivers/gpu/drm/i915/intel_ringbuffer.c|  8 +---
 6 files changed, 20 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 738a474c5afa..d15eb1d71341 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3674,3 +3674,15 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
__i915_vma_pin(vma);
return ptr;
 }
+
+void i915_vma_unpin_and_release(struct i915_vma **p_vma)
+{
+   struct i915_vma *vma;
+
+   vma = fetch_and_zero(p_vma);
+   if (!vma)
+   return;
+
+   i915_vma_unpin(vma);
+   i915_vma_put(vma);
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index a2691943a404..ec538fcc9c20 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -232,6 +232,7 @@ struct i915_vma *
 i915_vma_create(struct drm_i915_gem_object *obj,
struct i915_address_space *vm,
const struct i915_ggtt_view *view);
+void i915_vma_unpin_and_release(struct i915_vma **p_vma);
 
 static inline bool i915_vma_is_ggtt(const struct i915_vma *vma)
 {
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c40b92e212fa..e7dbc64ec1da 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -653,19 +653,6 @@ err:
return vma;
 }
 
-/**
- * guc_release_vma() - Release gem object allocated for GuC usage
- * @vma:   gem obj to be released
- */
-static void guc_release_vma(struct i915_vma *vma)
-{
-   if (!vma)
-   return;
-
-   i915_vma_unpin(vma);
-   i915_vma_put(vma);
-}
-
 static void
 guc_client_free(struct drm_i915_private *dev_priv,
struct i915_guc_client *client)
@@ -690,7 +677,7 @@ guc_client_free(struct drm_i915_private *dev_priv,
kunmap(kmap_to_page(client->client_base));
}
 
-   guc_release_vma(client->vma);
+   i915_vma_unpin_and_release(&client->vma);
 
if (client->ctx_index != GUC_INVALID_CTX_ID) {
guc_fini_ctx_desc(guc, client);
@@ -1048,12 +1035,12 @@ void i915_guc_submission_fini(struct drm_i915_private 
*dev_priv)
 {
struct intel_guc *guc = &dev_priv->guc;
 
-   guc_release_vma(fetch_and_zero(&guc->ads_vma));
-   guc_release_vma(fetch_and_zero(&guc->log_vma));
+   i915_vma_unpin_and_release(&guc->ads_vma);
+   i915_vma_unpin_and_release(&guc->log_vma);
 
if (guc->ctx_pool_vma)
ida_destroy(&guc->ctx_ids);
-   guc_release_vma(fetch_and_zero(&guc->ctx_pool_vma));
+   i915_vma_unpin_and_release(&guc->ctx_pool_vma);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index 573f642a74f8..f02d66bbec4b 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -279,14 +279,7 @@ err_unref:
 
 static void intel_engine_cleanup_scratch(struct intel_engine_cs *engine)
 {
-   struct i915_vma *vma;
-
-   vma = fetch_and_zero(&engine->scratch);
-   if (!vma)
-   return;
-
-   i915_vma_unpin(vma);
-   i915_vma_put(vma);
+   i915_vma_unpin_and_release(&engine->scratch);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 64cb04e63512..2673fb4f817b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1193,14 +1193,7 @@ err:
 
 static void lrc_destroy_wa_ctx_obj(struct intel_engine_cs *engine)
 {
-   struct i915_vma *vma;
-
-   vma = fetch_and_zero(&engine->wa_ctx.vma);
-   if (!vma)
-   return;
-
-   i915_vma_unpin(vma);
-   i915_vma_put(vma);
+   i915_vma_unpin_and_release(&engine->wa_ctx.vma);
 }
 
 static int intel_init_workaround_bb(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 30b066140b0c..65ef172e8761 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1257,14 +1257,8 @@ static int init_render_ring(struct intel_engine_cs 
*engine)
 static void render_ring_cleanup(struct intel_engine_cs *engine)
 {
struct drm_i915_private *dev_priv = engine->i915;
-   struct i915_vma *vma;
-
-   vma = fetch_a

[Intel-gfx] [CI 22/31] drm/i915/overlay: Use VMA as the primary tracker for images

2016-08-12 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/intel_overlay.c | 39 
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c 
b/drivers/gpu/drm/i915/intel_overlay.c
index 90f3ab424e01..d930e3a4a9cd 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -171,8 +171,8 @@ struct overlay_registers {
 struct intel_overlay {
struct drm_i915_private *i915;
struct intel_crtc *crtc;
-   struct drm_i915_gem_object *vid_bo;
-   struct drm_i915_gem_object *old_vid_bo;
+   struct i915_vma *vma;
+   struct i915_vma *old_vma;
bool active;
bool pfit_active;
u32 pfit_vscale_ratio; /* shifted-point number, (1<<12) == 1.0 */
@@ -317,15 +317,17 @@ static void intel_overlay_release_old_vid_tail(struct i915_gem_active *active,
 {
struct intel_overlay *overlay =
container_of(active, typeof(*overlay), last_flip);
-   struct drm_i915_gem_object *obj = overlay->old_vid_bo;
+   struct i915_vma *vma;
 
-   i915_gem_track_fb(obj, NULL,
- INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe));
+   vma = fetch_and_zero(&overlay->old_vma);
+   if (WARN_ON(!vma))
+   return;
 
-   i915_gem_object_ggtt_unpin(obj);
-   i915_gem_object_put(obj);
+   i915_gem_track_fb(vma->obj, NULL,
+ INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe));
 
-   overlay->old_vid_bo = NULL;
+   i915_gem_object_unpin_from_display_plane(vma->obj, &i915_ggtt_view_normal);
+   i915_vma_put(vma);
 }
 
 static void intel_overlay_off_tail(struct i915_gem_active *active,
@@ -333,15 +335,15 @@ static void intel_overlay_off_tail(struct i915_gem_active *active,
 {
struct intel_overlay *overlay =
container_of(active, typeof(*overlay), last_flip);
-   struct drm_i915_gem_object *obj = overlay->vid_bo;
+   struct i915_vma *vma;
 
/* never have the overlay hw on without showing a frame */
-   if (WARN_ON(!obj))
+   vma = fetch_and_zero(&overlay->vma);
+   if (WARN_ON(!vma))
return;
 
-   i915_gem_object_ggtt_unpin(obj);
-   i915_gem_object_put(obj);
-   overlay->vid_bo = NULL;
+   i915_gem_object_unpin_from_display_plane(vma->obj, &i915_ggtt_view_normal);
+   i915_vma_put(vma);
 
overlay->crtc->overlay = NULL;
overlay->crtc = NULL;
@@ -421,7 +423,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
/* Only wait if there is actually an old frame to release to
 * guarantee forward progress.
 */
-   if (!overlay->old_vid_bo)
+   if (!overlay->old_vma)
return 0;
 
if (I915_READ(ISR) & I915_OVERLAY_PLANE_FLIP_PENDING_INTERRUPT) {
@@ -744,6 +746,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
struct drm_i915_private *dev_priv = overlay->i915;
u32 swidth, swidthsw, sheight, ostride;
enum pipe pipe = overlay->crtc->pipe;
+   struct i915_vma *vma;
 
lockdep_assert_held(&dev_priv->drm.struct_mutex);

WARN_ON(!drm_modeset_is_locked(&dev_priv->drm.mode_config.connection_mutex));
@@ -757,6 +760,8 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
if (ret != 0)
return ret;
 
+   vma = i915_gem_obj_to_ggtt_view(new_bo, &i915_ggtt_view_normal);
+
ret = i915_gem_object_put_fence(new_bo);
if (ret)
goto out_unpin;
@@ -834,11 +839,11 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
if (ret)
goto out_unpin;
 
-   i915_gem_track_fb(overlay->vid_bo, new_bo,
+   i915_gem_track_fb(overlay->vma->obj, new_bo,
  INTEL_FRONTBUFFER_OVERLAY(pipe));
 
-   overlay->old_vid_bo = overlay->vid_bo;
-   overlay->vid_bo = new_bo;
+   overlay->old_vma = overlay->vma;
+   overlay->vma = vma;
 
intel_frontbuffer_flip(dev_priv, INTEL_FRONTBUFFER_OVERLAY(pipe));
 
-- 
2.8.1



[Intel-gfx] [CI 23/31] drm/i915: Use VMA as the primary tracker for semaphore page

2016-08-12 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.h |  4 +--
 drivers/gpu/drm/i915/i915_gpu_error.c   | 16 -
 drivers/gpu/drm/i915/intel_engine_cs.c  | 12 ---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 60 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +--
 6 files changed, 55 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2da37c196ef0..fb483df1afd6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3145,7 +3145,7 @@ static int i915_semaphore_status(struct seq_file *m, void *unused)
struct page *page;
uint64_t *seqno;
 
-   page = i915_gem_object_get_page(dev_priv->semaphore_obj, 0);
+   page = i915_gem_object_get_page(dev_priv->semaphore->obj, 0);
 
seqno = (uint64_t *)kmap_atomic(page);
for_each_engine_id(engine, dev_priv, id) {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 259425d99e17..50dc3613c61c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -733,7 +733,7 @@ struct drm_i915_error_state {
u64 fence[I915_MAX_NUM_FENCES];
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
-   struct drm_i915_error_object *semaphore_obj;
+   struct drm_i915_error_object *semaphore;
 
struct drm_i915_error_engine {
int engine_id;
@@ -1750,7 +1750,7 @@ struct drm_i915_private {
struct pci_dev *bridge_dev;
struct i915_gem_context *kernel_context;
struct intel_engine_cs engine[I915_NUM_ENGINES];
-   struct drm_i915_gem_object *semaphore_obj;
+   struct i915_vma *semaphore;
u32 next_seqno;
 
struct drm_dma_handle *status_page_dmah;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index b80d2a6f56b3..da8aa86ad0c9 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -549,7 +549,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
}
}
 
-   if ((obj = error->semaphore_obj)) {
+   if ((obj = error->semaphore)) {
err_printf(m, "Semaphore page = 0x%08x\n",
   lower_32_bits(obj->gtt_offset));
for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
@@ -640,7 +640,7 @@ static void i915_error_state_free(struct kref *error_ref)
kfree(ee->waiters);
}
 
-   i915_error_object_free(error->semaphore_obj);
+   i915_error_object_free(error->semaphore);
 
for (i = 0; i < ARRAY_SIZE(error->active_bo); i++)
kfree(error->active_bo[i]);
@@ -876,7 +876,7 @@ static void gen8_record_semaphore_state(struct drm_i915_error_state *error,
struct intel_engine_cs *to;
enum intel_engine_id id;
 
-   if (!error->semaphore_obj)
+   if (!error->semaphore)
return;
 
for_each_engine_id(to, dev_priv, id) {
@@ -889,7 +889,7 @@ static void gen8_record_semaphore_state(struct drm_i915_error_state *error,
 
signal_offset =
(GEN8_SIGNAL_OFFSET(engine, id) & (PAGE_SIZE - 1)) / 4;
-   tmp = error->semaphore_obj->pages[0];
+   tmp = error->semaphore->pages[0];
idx = intel_engine_sync_index(engine, to);
 
ee->semaphore_mboxes[idx] = tmp[signal_offset];
@@ -1061,11 +1061,9 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
struct drm_i915_gem_request *request;
int i, count;
 
-   if (dev_priv->semaphore_obj) {
-   error->semaphore_obj =
-   i915_error_ggtt_object_create(dev_priv,
- dev_priv->semaphore_obj);
-   }
+   error->semaphore =
+   i915_error_ggtt_object_create(dev_priv,
+ dev_priv->semaphore->obj);
 
for (i = 0; i < I915_NUM_ENGINES; i++) {
struct intel_engine_cs *engine = &dev_priv->engine[i];
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 829624571ca4..573f642a74f8 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -179,12 +179,16 @@ void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno)
if (HAS_VEBOX(dev_priv))
I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
}
-   if (dev_priv->semaphore_obj) {
-   struct drm_i915_gem_object *obj = dev_priv->semaphore_obj;
-   struct page *page = i915_gem_object_get_dirty_page(obj, 0);
- 

[Intel-gfx] [CI 28/31] drm/i915: Introduce i915_ggtt_offset()

2016-08-12 Thread Chris Wilson
This little helper exists only to safely discard the unused upper 32 bits
of the general 64-bit VMA address. We know that all Global GTTs are
currently less than 4GiB in size, so the upper bits must be zero. In
many places we use a u32 for the global GTT offset, and we want to
document where we are discarding the full VMA offset.
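
As a rough illustration of the idea, here is a userspace sketch, not the
kernel helper itself. The names struct vma_sketch and ggtt_offset_sketch()
are hypothetical and exist only for this example; the point is that the
upper bits are checked before they are discarded:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the VMA's address-space node. */
struct vma_sketch {
	uint64_t start;	/* 64-bit offset of the binding */
	uint64_t size;	/* size of the binding in bytes, assumed > 0 */
};

/* Return the offset as 32 bits, asserting that no information is lost. */
static inline uint32_t ggtt_offset_sketch(const struct vma_sketch *vma)
{
	/* Both the first and the last byte must lie below 4GiB. */
	assert((vma->start >> 32) == 0);
	assert(((vma->start + vma->size - 1) >> 32) == 0);
	return (uint32_t)vma->start;
}
```

Funnelling every truncation through one such function makes each place
that narrows the offset easy to find later.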

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c|  2 +-
 drivers/gpu/drm/i915/i915_drv.h|  2 +-
 drivers/gpu/drm/i915/i915_gem.c| 11 +--
 drivers/gpu/drm/i915/i915_gem_context.c|  6 --
 drivers/gpu/drm/i915/i915_gem_gtt.h|  9 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 15 ---
 drivers/gpu/drm/i915/intel_display.c   | 10 +++---
 drivers/gpu/drm/i915/intel_engine_cs.c |  4 ++--
 drivers/gpu/drm/i915/intel_fbdev.c |  6 +++---
 drivers/gpu/drm/i915/intel_guc_loader.c|  6 +++---
 drivers/gpu/drm/i915/intel_lrc.c   | 20 +++-
 drivers/gpu/drm/i915/intel_overlay.c   | 10 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c| 28 ++--
 13 files changed, 70 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 21961304284e..82652ad28cd4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2008,7 +2008,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
 
if (vma->flags & I915_VMA_GLOBAL_BIND)
seq_printf(m, "\tBound in GGTT at 0x%08x\n",
-  lower_32_bits(vma->node.start));
+  i915_ggtt_offset(vma));
 
if (i915_gem_object_get_pages(vma->obj)) {
seq_puts(m, "\tFailed to get pages for context object\n\n");
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bbee45acedeb..bd58878de77b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3330,7 +3330,7 @@ static inline unsigned long
 i915_gem_object_ggtt_offset(struct drm_i915_gem_object *o,
const struct i915_ggtt_view *view)
 {
-   return i915_gem_object_to_ggtt(o, view)->node.start;
+   return i915_ggtt_offset(i915_gem_object_to_ggtt(o, view));
 }
 
 /* i915_gem_fence.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 07f7d3da5457..8bd2fa7644d5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -758,7 +758,7 @@ i915_gem_gtt_pread(struct drm_device *dev,
 
i915_gem_object_pin_pages(obj);
} else {
-   node.start = vma->node.start;
+   node.start = i915_ggtt_offset(vma);
node.allocated = false;
ret = i915_gem_object_put_fence(obj);
if (ret)
@@ -1062,7 +1062,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
 
i915_gem_object_pin_pages(obj);
} else {
-   node.start = vma->node.start;
+   node.start = i915_ggtt_offset(vma);
node.allocated = false;
ret = i915_gem_object_put_fence(obj);
if (ret)
@@ -1703,7 +1703,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
goto err_unpin;
 
/* Finally, remap it using the new GTT offset */
-   pfn = ggtt->mappable_base + vma->node.start;
+   pfn = ggtt->mappable_base + i915_ggtt_offset(vma);
pfn >>= PAGE_SHIFT;
 
if (unlikely(view.type == I915_GGTT_VIEW_PARTIAL)) {
@@ -3750,10 +3750,9 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 
WARN(i915_vma_is_pinned(vma),
 "bo is already pinned in ggtt with incorrect alignment:"
-" offset=%08x %08x, req.alignment=%llx, req.map_and_fenceable=%d,"
+" offset=%08x, req.alignment=%llx, req.map_and_fenceable=%d,"
 " obj->map_and_fenceable=%d\n",
-upper_32_bits(vma->node.start),
-lower_32_bits(vma->node.start),
+i915_ggtt_offset(vma),
 alignment,
 !!(flags & PIN_MAPPABLE),
 obj->map_and_fenceable);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e566167d9441..98d2956f91f4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -631,7 +631,8 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 
intel_ring_emit(ring, MI_NOOP);
intel_ring_emit(ring, MI_SET_CONTEXT);
-   intel_ring_emit(ring, req->ctx->engine[RCS].state->node.start | flags);
+   intel_ring_emit(ring,
+   i915_ggtt_offset(req->ctx->engine[RCS].state) | flags);
   

[Intel-gfx] [CI 29/31] drm/i915: Print the batchbuffer offset next to BBADDR in error state

2016-08-12 Thread Chris Wilson
When looking at captured error states, it is useful to check the recorded
BBADDR register (the address of the last batchbuffer instruction loaded)
against the expected offset of the batch buffer. This gives a quick check
that (a) the capture is accurate and (b) HEAD hasn't wandered off into
the badlands.
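
The comparison described above can be sketched as a simple range check.
This is a hypothetical userspace helper, not code from the patch:
bbaddr_in_batch() is a name made up for illustration, and its gtt_offset
and gtt_size parameters simply mirror the fields of the same name in
struct drm_i915_error_object in the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: does the recorded BBADDR fall inside the captured
 * batch buffer, i.e. within [gtt_offset, gtt_offset + gtt_size)?
 */
static bool bbaddr_in_batch(uint64_t bbaddr, uint64_t gtt_offset,
			    uint64_t gtt_size)
{
	/* Subtract only after the lower-bound test to avoid wrap-around. */
	return bbaddr >= gtt_offset && bbaddr - gtt_offset < gtt_size;
}
```

A BBADDR outside that range suggests that either the wrong buffer was
captured or execution left the batch.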

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 25 -
 drivers/gpu/drm/i915/i915_drv.h |  3 +++
 drivers/gpu/drm/i915/i915_gem_context.c |  4 
 drivers/gpu/drm/i915/i915_gem_request.c |  6 --
 drivers/gpu/drm/i915/i915_gem_request.h |  3 ---
 drivers/gpu/drm/i915/i915_gpu_error.c   | 28 +++-
 6 files changed, 46 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 82652ad28cd4..61e12a0f08d4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -460,6 +460,8 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
print_context_stats(m, dev_priv);
list_for_each_entry_reverse(file, &dev->filelist, lhead) {
struct file_stats stats;
+   struct drm_i915_file_private *file_priv = file->driver_priv;
+   struct drm_i915_gem_request *request;
struct task_struct *task;
 
memset(&stats, 0, sizeof(stats));
@@ -473,10 +475,17 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 * still alive (e.g. get_pid(current) => fork() => exit()).
 * Therefore, we need to protect this ->comm access using RCU.
 */
+   mutex_lock(&dev->struct_mutex);
+   request = list_first_entry_or_null(&file_priv->mm.request_list,
+  struct drm_i915_gem_request,
+  client_list);
rcu_read_lock();
-   task = pid_task(file->pid, PIDTYPE_PID);
+   task = pid_task(request && request->ctx->pid ?
+   request->ctx->pid : file->pid,
+   PIDTYPE_PID);
print_file_stats(m, task ? task->comm : "", stats);
rcu_read_unlock();
+   mutex_unlock(&dev->struct_mutex);
}
mutex_unlock(&dev->filelist_mutex);
 
@@ -658,12 +667,11 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 
seq_printf(m, "%s requests: %d\n", engine->name, count);
list_for_each_entry(req, &engine->request_list, link) {
+   struct pid *pid = req->ctx->pid;
struct task_struct *task;
 
rcu_read_lock();
-   task = NULL;
-   if (req->pid)
-   task = pid_task(req->pid, PIDTYPE_PID);
+   task = pid ? pid_task(pid, PIDTYPE_PID) : NULL;
seq_printf(m, "%x @ %d: %s [%d]\n",
   req->fence.seqno,
   (int) (jiffies - req->emitted_jiffies),
@@ -1952,18 +1960,17 @@ static int i915_context_status(struct seq_file *m, void *unused)
 
list_for_each_entry(ctx, &dev_priv->context_list, link) {
seq_printf(m, "HW context %u ", ctx->hw_id);
-   if (IS_ERR(ctx->file_priv)) {
-   seq_puts(m, "(deleted) ");
-   } else if (ctx->file_priv) {
-   struct pid *pid = ctx->file_priv->file->pid;
+   if (ctx->pid) {
struct task_struct *task;
 
-   task = get_pid_task(pid, PIDTYPE_PID);
+   task = get_pid_task(ctx->pid, PIDTYPE_PID);
if (task) {
seq_printf(m, "(%s [%d]) ",
   task->comm, task->pid);
put_task_struct(task);
}
+   } else if (IS_ERR(ctx->file_priv)) {
+   seq_puts(m, "(deleted) ");
} else {
seq_puts(m, "(kernel) ");
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bd58878de77b..bb7d8130dbfd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -775,6 +775,7 @@ struct drm_i915_error_state {
struct drm_i915_error_object {
int page_count;
u64 gtt_offset;
+   u64 gtt_size;
u32 *pages[0];
} *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
 
@@ -782,6 +783,7 @@ struct drm_i915_error_state {
 
struct drm_i915_error_request {
long jiffies;
+   

[Intel-gfx] [CI 24/31] drm/i915: Use VMA for render state page tracking

2016-08-12 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_render_state.c | 40 +++-
 drivers/gpu/drm/i915/i915_gem_render_state.h |  2 +-
 2 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 57fd767a2d79..95b7e9afd5f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -30,8 +30,7 @@
 
 struct render_state {
const struct intel_renderstate_rodata *rodata;
-   struct drm_i915_gem_object *obj;
-   u64 ggtt_offset;
+   struct i915_vma *vma;
u32 aux_batch_size;
u32 aux_batch_offset;
 };
@@ -73,7 +72,7 @@ render_state_get_rodata(const struct drm_i915_gem_request *req)
 
 static int render_state_setup(struct render_state *so)
 {
-   struct drm_device *dev = so->obj->base.dev;
+   struct drm_device *dev = so->vma->vm->dev;
const struct intel_renderstate_rodata *rodata = so->rodata;
const bool has_64bit_reloc = INTEL_GEN(dev) >= 8;
unsigned int i = 0, reloc_index = 0;
@@ -81,18 +80,18 @@ static int render_state_setup(struct render_state *so)
u32 *d;
int ret;
 
-   ret = i915_gem_object_set_to_cpu_domain(so->obj, true);
+   ret = i915_gem_object_set_to_cpu_domain(so->vma->obj, true);
if (ret)
return ret;
 
-   page = i915_gem_object_get_dirty_page(so->obj, 0);
+   page = i915_gem_object_get_dirty_page(so->vma->obj, 0);
d = kmap(page);
 
while (i < rodata->batch_items) {
u32 s = rodata->batch[i];
 
if (i * 4  == rodata->reloc[reloc_index]) {
-   u64 r = s + so->ggtt_offset;
+   u64 r = s + so->vma->node.start;
s = lower_32_bits(r);
if (has_64bit_reloc) {
if (i + 1 >= rodata->batch_items ||
@@ -154,7 +153,7 @@ static int render_state_setup(struct render_state *so)
 
kunmap(page);
 
-   ret = i915_gem_object_set_to_gtt_domain(so->obj, false);
+   ret = i915_gem_object_set_to_gtt_domain(so->vma->obj, false);
if (ret)
return ret;
 
@@ -175,6 +174,7 @@ err_out:
 int i915_gem_render_state_init(struct drm_i915_gem_request *req)
 {
struct render_state so;
+   struct drm_i915_gem_object *obj;
int ret;
 
if (WARN_ON(req->engine->id != RCS))
@@ -187,21 +187,25 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req)
if (so.rodata->batch_items * 4 > 4096)
return -EINVAL;
 
-   so.obj = i915_gem_object_create(&req->i915->drm, 4096);
-   if (IS_ERR(so.obj))
-   return PTR_ERR(so.obj);
+   obj = i915_gem_object_create(&req->i915->drm, 4096);
+   if (IS_ERR(obj))
+   return PTR_ERR(obj);
 
-   ret = i915_gem_object_ggtt_pin(so.obj, NULL, 0, 0, 0);
-   if (ret)
+   so.vma = i915_vma_create(obj, &req->i915->ggtt.base, NULL);
+   if (IS_ERR(so.vma)) {
+   ret = PTR_ERR(so.vma);
goto err_obj;
+   }
 
-   so.ggtt_offset = i915_gem_obj_ggtt_offset(so.obj);
+   ret = i915_vma_pin(so.vma, 0, 0, PIN_GLOBAL);
+   if (ret)
+   goto err_obj;
 
ret = render_state_setup(&so);
if (ret)
goto err_unpin;
 
-   ret = req->engine->emit_bb_start(req, so.ggtt_offset,
+   ret = req->engine->emit_bb_start(req, so.vma->node.start,
 so.rodata->batch_items * 4,
 I915_DISPATCH_SECURE);
if (ret)
@@ -209,7 +213,7 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req)
 
if (so.aux_batch_size > 8) {
ret = req->engine->emit_bb_start(req,
-(so.ggtt_offset +
+(so.vma->node.start +
  so.aux_batch_offset),
 so.aux_batch_size,
 I915_DISPATCH_SECURE);
@@ -217,10 +221,10 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req)
goto err_unpin;
}
 
-   i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req, 0);
+   i915_vma_move_to_active(so.vma, req, 0);
 err_unpin:
-   i915_gem_object_ggtt_unpin(so.obj);
+   i915_vma_unpin(so.vma);
 err_obj:
-   i915_gem_object_put(so.obj);
+   i915_gem_object_put(obj);
return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.h b/drivers/gpu/drm/i915/i915_gem_render_state.h
index c44fca8599bb..18cce3f06e9c 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.h
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.h
@@ -24,7 

[Intel-gfx] [CI 25/31] drm/i915: Use VMA for wa_ctx tracking

2016-08-12 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c| 58 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +--
 3 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index da8aa86ad0c9..09219809488d 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1134,7 +1134,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
  
engine->status_page.vma->obj);
 
ee->wa_ctx = i915_error_ggtt_object_create(dev_priv,
-  engine->wa_ctx.obj);
+  engine->wa_ctx.vma->obj);
 
count = 0;
list_for_each_entry(request, &engine->request_list, link)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 56c904e2dc98..64cb04e63512 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1165,45 +1165,51 @@ static int gen9_init_perctx_bb(struct intel_engine_cs *engine,
 
 static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *engine, u32 size)
 {
-   int ret;
+   struct drm_i915_gem_object *obj;
+   struct i915_vma *vma;
+   int err;
 
-   engine->wa_ctx.obj = i915_gem_object_create(&engine->i915->drm,
-   PAGE_ALIGN(size));
-   if (IS_ERR(engine->wa_ctx.obj)) {
-   DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n");
-   ret = PTR_ERR(engine->wa_ctx.obj);
-   engine->wa_ctx.obj = NULL;
-   return ret;
-   }
+   obj = i915_gem_object_create(&engine->i915->drm, PAGE_ALIGN(size));
+   if (IS_ERR(obj))
+   return PTR_ERR(obj);
 
-   ret = i915_gem_object_ggtt_pin(engine->wa_ctx.obj, NULL,
-  0, PAGE_SIZE, PIN_HIGH);
-   if (ret) {
-   DRM_DEBUG_DRIVER("pin LRC WA ctx backing obj failed: %d\n",
-ret);
-   i915_gem_object_put(engine->wa_ctx.obj);
-   return ret;
+   vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL);
+   if (IS_ERR(vma)) {
+   err = PTR_ERR(vma);
+   goto err;
}
 
+   err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH);
+   if (err)
+   goto err;
+
+   engine->wa_ctx.vma = vma;
return 0;
+
+err:
+   i915_gem_object_put(obj);
+   return err;
 }
 
 static void lrc_destroy_wa_ctx_obj(struct intel_engine_cs *engine)
 {
-   if (engine->wa_ctx.obj) {
-   i915_gem_object_ggtt_unpin(engine->wa_ctx.obj);
-   i915_gem_object_put(engine->wa_ctx.obj);
-   engine->wa_ctx.obj = NULL;
-   }
+   struct i915_vma *vma;
+
+   vma = fetch_and_zero(&engine->wa_ctx.vma);
+   if (!vma)
+   return;
+
+   i915_vma_unpin(vma);
+   i915_vma_put(vma);
 }
 
 static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 {
-   int ret;
+   struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
uint32_t *batch;
uint32_t offset;
struct page *page;
-   struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
+   int ret;
 
WARN_ON(engine->id != RCS);
 
@@ -1226,7 +1232,7 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
return ret;
}
 
-   page = i915_gem_object_get_dirty_page(wa_ctx->obj, 0);
+   page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0);
batch = kmap_atomic(page);
offset = 0;
 
@@ -2019,9 +2025,9 @@ populate_lr_context(struct i915_gem_context *ctx,
   RING_INDIRECT_CTX(engine->mmio_base), 0);
ASSIGN_CTX_REG(reg_state, CTX_RCS_INDIRECT_CTX_OFFSET,
   RING_INDIRECT_CTX_OFFSET(engine->mmio_base), 0);
-   if (engine->wa_ctx.obj) {
+   if (engine->wa_ctx.vma) {
struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
-   uint32_t ggtt_offset = i915_gem_obj_ggtt_offset(wa_ctx->obj);
+   u32 ggtt_offset = wa_ctx->vma->node.start;
 
reg_state[CTX_RCS_INDIRECT_CTX+1] =
(ggtt_offset + wa_ctx->indirect_ctx.offset * sizeof(uint32_t)) |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index cb40785e7677..e3777572c70e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -123,12 +123,12 @@ struct drm_i915_reg_table;
  *an option for fu

[Intel-gfx] [CI 15/31] drm/i915: Use VMA as the primary object for context state

2016-08-12 Thread Chris Wilson
When working with contexts, we most frequently want the GGTT VMA for the
context state, first and foremost. Since the object is available via the
VMA, we then need only store the VMA.

v2: Formatting tweaks to debugfs output, restored some comments removed
in the next patch

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 34 
 drivers/gpu/drm/i915/i915_drv.h|  3 +-
 drivers/gpu/drm/i915/i915_gem_context.c| 51 +---
 drivers/gpu/drm/i915/i915_gpu_error.c  |  7 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c |  6 +--
 drivers/gpu/drm/i915/intel_lrc.c   | 64 +++---
 drivers/gpu/drm/i915/intel_ringbuffer.c|  6 +--
 7 files changed, 86 insertions(+), 85 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 32d26b6c4bca..fcda4e7da127 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -354,7 +354,7 @@ static int per_file_ctx_stats(int id, void *ptr, void *data)
 
for (n = 0; n < ARRAY_SIZE(ctx->engine); n++) {
if (ctx->engine[n].state)
-   per_file_stats(0, ctx->engine[n].state, data);
+   per_file_stats(0, ctx->engine[n].state->obj, data);
if (ctx->engine[n].ring)
per_file_stats(0, ctx->engine[n].ring->obj, data);
}
@@ -1977,7 +1977,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
seq_printf(m, "%s: ", engine->name);
seq_putc(m, ce->initialised ? 'I' : 'i');
if (ce->state)
-   describe_obj(m, ce->state);
+   describe_obj(m, ce->state->obj);
if (ce->ring)
describe_ctx_ring(m, ce->ring);
seq_putc(m, '\n');
@@ -1995,36 +1995,34 @@ static void i915_dump_lrc_obj(struct seq_file *m,
  struct i915_gem_context *ctx,
  struct intel_engine_cs *engine)
 {
-   struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state;
+   struct i915_vma *vma = ctx->engine[engine->id].state;
struct page *page;
-   uint32_t *reg_state;
int j;
-   unsigned long ggtt_offset = 0;
 
seq_printf(m, "CONTEXT: %s %u\n", engine->name, ctx->hw_id);
 
-   if (ctx_obj == NULL) {
-   seq_puts(m, "\tNot allocated\n");
+   if (!vma) {
+   seq_puts(m, "\tFake context\n");
return;
}
 
-   if (!i915_gem_obj_ggtt_bound(ctx_obj))
-   seq_puts(m, "\tNot bound in GGTT\n");
-   else
-   ggtt_offset = i915_gem_obj_ggtt_offset(ctx_obj);
+   if (vma->flags & I915_VMA_GLOBAL_BIND)
+   seq_printf(m, "\tBound in GGTT at 0x%08x\n",
+  lower_32_bits(vma->node.start));
 
-   if (i915_gem_object_get_pages(ctx_obj)) {
-   seq_puts(m, "\tFailed to get pages for context object\n");
+   if (i915_gem_object_get_pages(vma->obj)) {
+   seq_puts(m, "\tFailed to get pages for context object\n\n");
return;
}
 
-   page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
-   if (!WARN_ON(page == NULL)) {
-   reg_state = kmap_atomic(page);
+   page = i915_gem_object_get_page(vma->obj, LRC_STATE_PN);
+   if (page) {
+   u32 *reg_state = kmap_atomic(page);
 
for (j = 0; j < 0x600 / sizeof(u32) / 4; j += 4) {
-   seq_printf(m, "\t[0x%08lx] 0x%08x 0x%08x 0x%08x 0x%08x\n",
-  ggtt_offset + 4096 + (j * 4),
+   seq_printf(m,
+  "\t[0x%04x] 0x%08x 0x%08x 0x%08x 0x%08x\n",
+  j * 4,
   reg_state[j], reg_state[j + 1],
   reg_state[j + 2], reg_state[j + 3]);
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3285c8e2c87a..259425d99e17 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -893,9 +893,8 @@ struct i915_gem_context {
u32 ggtt_alignment;
 
struct intel_context {
-   struct drm_i915_gem_object *state;
+   struct i915_vma *state;
struct intel_ring *ring;
-   struct i915_vma *lrc_vma;
uint32_t *lrc_reg_state;
u64 lrc_desc;
int pin_count;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 547caf26a6b9..3857ce097c84 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -155,7 +155

[Intel-gfx] [CI 27/31] drm/i915: Track pinned VMA

2016-08-12 Thread Chris Wilson
Treat the VMA as the primary struct responsible for tracking bindings
into the GPU's VM. That is, we want to treat the VMA returned after we
pin an object into the VM as the cookie we hold and eventually release
when unpinning. Doing so eliminates the ambiguity in pinning the object
and then searching for the relevant pin later.
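
The cookie pattern can be sketched in a few lines of userspace C. This is
a heavily simplified model, not the driver's API: struct binding_sketch,
struct object_sketch, pin_sketch() and unpin_sketch() are hypothetical
names invented for this example. The pin operation returns the binding
itself, and the caller hands that same pointer back to unpin:

```c
#include <stddef.h>

/* Hypothetical, heavily simplified model of an object and its bindings. */
struct binding_sketch {
	int view;		/* which view of the object is bound */
	unsigned int pin_count;	/* outstanding pins on this binding */
};

struct object_sketch {
	struct binding_sketch bindings[2];
};

/* Pin one view and hand the binding itself back as the cookie. */
static struct binding_sketch *pin_sketch(struct object_sketch *obj, int view)
{
	struct binding_sketch *b;

	if (view < 0 || view >= 2)
		return NULL;

	b = &obj->bindings[view];
	b->view = view;
	b->pin_count++;
	return b;
}

/* Unpin via the cookie: no lookup by (object, view) is needed. */
static void unpin_sketch(struct binding_sketch *b)
{
	if (b && b->pin_count)
		b->pin_count--;
}
```

Because the caller keeps the returned pointer, there is no later search
that could pick a different binding of the same object.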

v2: Joonas' stylistic nitpicks, a fun rebase.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c|   2 +-
 drivers/gpu/drm/i915/i915_drv.h|  60 ++--
 drivers/gpu/drm/i915/i915_gem.c| 233 -
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  65 
 drivers/gpu/drm/i915/i915_gem_fence.c  |  14 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c|  74 +
 drivers/gpu/drm/i915/i915_gem_gtt.h|  14 --
 drivers/gpu/drm/i915/i915_gem_request.c|   2 +-
 drivers/gpu/drm/i915/i915_gem_request.h|   2 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c |   2 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c  |  58 +++
 drivers/gpu/drm/i915/intel_display.c   |  57 ---
 drivers/gpu/drm/i915/intel_drv.h   |   5 +-
 drivers/gpu/drm/i915/intel_fbc.c   |   2 +-
 drivers/gpu/drm/i915/intel_fbdev.c |  19 +--
 drivers/gpu/drm/i915/intel_guc_loader.c|  21 +--
 drivers/gpu/drm/i915/intel_overlay.c   |  32 ++--
 18 files changed, 267 insertions(+), 397 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index fb483df1afd6..21961304284e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -105,7 +105,7 @@ static char get_tiling_flag(struct drm_i915_gem_object *obj)
 
 static char get_global_flag(struct drm_i915_gem_object *obj)
 {
-   return i915_gem_obj_to_ggtt(obj) ? 'g' : ' ';
+   return i915_gem_object_to_ggtt(obj, NULL) ?  'g' : ' ';
 }
 
 static char get_pin_mapped_flag(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 50dc3613c61c..bbee45acedeb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3075,7 +3075,7 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data(
 void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file);
 void i915_gem_free_object(struct drm_gem_object *obj);
 
-int __must_check
+struct i915_vma * __must_check
 i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 const struct i915_ggtt_view *view,
 u64 size,
@@ -3279,12 +3279,11 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj,
  bool write);
 int __must_check
 i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
-int __must_check
+struct i915_vma * __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 u32 alignment,
 const struct i915_ggtt_view *view);
-void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj,
- const struct i915_ggtt_view *view);
+void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma);
 int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj,
int align);
 int i915_gem_open(struct drm_device *dev, struct drm_file *file);
@@ -3304,63 +3303,34 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
struct drm_gem_object *gem_obj, int flags);
 
-u64 i915_gem_obj_ggtt_offset_view(struct drm_i915_gem_object *o,
- const struct i915_ggtt_view *view);
-u64 i915_gem_obj_offset(struct drm_i915_gem_object *o,
-   struct i915_address_space *vm);
-static inline u64
-i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
-{
-   return i915_gem_obj_ggtt_offset_view(o, &i915_ggtt_view_normal);
-}
-
-bool i915_gem_obj_ggtt_bound_view(struct drm_i915_gem_object *o,
- const struct i915_ggtt_view *view);
-bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
-   struct i915_address_space *vm);
-
 struct i915_vma *
 i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
-   struct i915_address_space *vm);
-struct i915_vma *
-i915_gem_obj_to_ggtt_view(struct drm_i915_gem_object *obj,
- const struct i915_ggtt_view *view);
+struct i915_address_space *vm,
+const struct i915_ggtt_view *view);
 
 struct i915_vma *
 i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
- struct i915_address_space *vm);
-st

[Intel-gfx] [CI 31/31] drm/i915: Record the RING_MODE register for post-mortem debugging

2016-08-12 Thread Chris Wilson
Just another useful register to inspect following a GPU hang.

v2: Remove partial decoding of RING_MODE to userspace, be consistent and
use GEN > 2 guards around RING_MODE everywhere.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h | 1 +
 drivers/gpu/drm/i915/i915_gpu_error.c   | 3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 7 ---
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bb7d8130dbfd..35caa9b2f36a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -757,6 +757,7 @@ struct drm_i915_error_state {
u32 tail;
u32 head;
u32 ctl;
+   u32 mode;
u32 hws;
u32 ipeir;
u32 ipehr;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 6215c1bf79c8..cdf5464a0c39 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -236,6 +236,7 @@ static void error_print_engine(struct drm_i915_error_state_buf *m,
err_printf(m, "  HEAD:  0x%08x\n", ee->head);
err_printf(m, "  TAIL:  0x%08x\n", ee->tail);
err_printf(m, "  CTL:   0x%08x\n", ee->ctl);
+   err_printf(m, "  MODE:  0x%08x\n", ee->mode);
err_printf(m, "  HWS:   0x%08x\n", ee->hws);
err_printf(m, "  ACTHD: 0x%08x %08x\n",
   (u32)(ee->acthd>>32), (u32)ee->acthd);
@@ -1005,6 +1006,8 @@ static void error_record_engine_registers(struct drm_i915_error_state *error,
ee->head = I915_READ_HEAD(engine);
ee->tail = I915_READ_TAIL(engine);
ee->ctl = I915_READ_CTL(engine);
+   if (INTEL_GEN(dev_priv) > 2)
+   ee->mode = I915_READ_MODE(engine);
 
if (I915_NEED_GFX_HWS(dev_priv)) {
i915_reg_t mmio;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e3327a2ac6e1..fa22bd87bab0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -498,7 +498,7 @@ static bool stop_ring(struct intel_engine_cs *engine)
 {
struct drm_i915_private *dev_priv = engine->i915;
 
-   if (!IS_GEN2(dev_priv)) {
+   if (INTEL_GEN(dev_priv) > 2) {
I915_WRITE_MODE(engine, _MASKED_BIT_ENABLE(STOP_RING));
if (intel_wait_for_register(dev_priv,
RING_MI_MODE(engine->mmio_base),
@@ -520,7 +520,7 @@ static bool stop_ring(struct intel_engine_cs *engine)
I915_WRITE_HEAD(engine, 0);
I915_WRITE_TAIL(engine, 0);
 
-   if (!IS_GEN2(dev_priv)) {
+   if (INTEL_GEN(dev_priv) > 2) {
(void)I915_READ_CTL(engine);
I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
}
@@ -2142,7 +2142,8 @@ void intel_engine_cleanup(struct intel_engine_cs *engine)
dev_priv = engine->i915;
 
if (engine->buffer) {
-   WARN_ON(!IS_GEN2(dev_priv) && (I915_READ_MODE(engine) & MODE_IDLE) == 0);
+   WARN_ON(INTEL_GEN(dev_priv) > 2 &&
+   (I915_READ_MODE(engine) & MODE_IDLE) == 0);
 
intel_ring_unpin(engine->buffer);
intel_ring_free(engine->buffer);
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
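
When reading the raw MODE value that the patch above adds to the error state, it helps to remember how the masked-bit writes in stop_ring() behave. The following is only a sketch of that convention as I understand it: the upper 16 bits of a write select which of the lower 16 bits are updated. All sketch_* names are hypothetical, and the bit positions chosen for STOP_RING and MODE_IDLE are assumptions, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SKETCH_STOP_RING (1u << 8)
#define SKETCH_MODE_IDLE (1u << 9)

/* Build a write that sets @bit: mask in the high half, value in the low half. */
static inline uint32_t sketch_masked_bit_enable(uint32_t bit)
{
	return (bit << 16) | bit;
}

/* Build a write that clears @bit: mask in the high half, zero value. */
static inline uint32_t sketch_masked_bit_disable(uint32_t bit)
{
	return bit << 16;
}

/* Model how a masked register applies a write to its current contents. */
static inline uint32_t sketch_masked_write(uint32_t reg, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffffu;

	/* Only bits selected by the mask change; all others are preserved. */
	return (reg & ~mask) | (value & mask);
}

/* Decode the idle flag from a captured MODE value. */
static inline bool sketch_ring_is_idle(uint32_t mode)
{
	return (mode & SKETCH_MODE_IDLE) != 0;
}
```

With this model, a stop request leaves the idle flag untouched and only flips the stop bit, which is why a captured MODE value can show both bits at once.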


[Intel-gfx] [CI 30/31] drm/i915: Only record active and pending requests upon a GPU hang

2016-08-12 Thread Chris Wilson
There is no other state pertaining to the completed requests in the
hang, other than what can be gleaned through the ringbuffer, so including
the expired requests in the list of outstanding requests simply adds noise.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 110 +++---
 1 file changed, 62 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 4c00d93396e6..6215c1bf79c8 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1060,12 +1060,69 @@ static void error_record_engine_registers(struct drm_i915_error_state *error,
}
 }
 
+static void engine_record_requests(struct intel_engine_cs *engine,
+  struct drm_i915_gem_request *first,
+  struct drm_i915_error_engine *ee)
+{
+   struct drm_i915_gem_request *request;
+   int count;
+
+   count = 0;
+   request = first;
+   list_for_each_entry_from(request, &engine->request_list, link)
+   count += !!request->batch;
+   if (!count)
+   return;
+
+   ee->requests = kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC);
+   if (!ee->requests)
+   return;
+
+   count = 0;
+   request = first;
+   list_for_each_entry_from(request, &engine->request_list, link) {
+   struct drm_i915_error_request *erq;
+
+   if (!request->batch)
+   continue;
+
+   if (count >= ee->num_requests) {
+   /*
+* If the ring request list was changed in
+* between the point where the error request
+* list was created and dimensioned and this
+* point then just exit early to avoid crashes.
+*
+* We don't need to communicate that the
+* request list changed state during error
+* state capture and that the error state is
+* slightly incorrect as a consequence since we
+* are typically only interested in the request
+* list state at the point of error state
+* capture, not in any changes happening during
+* the capture.
+*/
+   break;
+   }
+
+   erq = &ee->requests[count++];
+   erq->seqno = request->fence.seqno;
+   erq->jiffies = request->emitted_jiffies;
+   erq->head = request->head;
+   erq->tail = request->tail;
+
+   rcu_read_lock();
+   erq->pid = request->ctx->pid ? pid_nr(request->ctx->pid) : 0;
+   rcu_read_unlock();
+   }
+   ee->num_requests = count;
+}
+
 static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
  struct drm_i915_error_state *error)
 {
struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   struct drm_i915_gem_request *request;
-   int i, count;
+   int i;
 
error->semaphore =
i915_error_object_create(dev_priv, dev_priv->semaphore);
@@ -1073,6 +1130,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
for (i = 0; i < I915_NUM_ENGINES; i++) {
struct intel_engine_cs *engine = &dev_priv->engine[i];
struct drm_i915_error_engine *ee = &error->engine[i];
+   struct drm_i915_gem_request *request;
 
ee->pid = -1;
ee->engine_id = -1;
@@ -1131,6 +1189,8 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
ee->cpu_ring_tail = ring->tail;
ee->ringbuffer =
i915_error_object_create(dev_priv, ring->vma);
+
+   engine_record_requests(engine, request, ee);
}
 
ee->hws_page =
@@ -1139,52 +1199,6 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
 
ee->wa_ctx =
i915_error_object_create(dev_priv, engine->wa_ctx.vma);
-
-   count = 0;
-   list_for_each_entry(request, &engine->request_list, link)
-   count++;
-
-   ee->num_requests = count;
-   ee->requests =
-   kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC);
-   if (!ee->requests) {
-   ee->num_requests = 0;
-   continue;
-   }
-
-   count = 0;
-   list_for_each_entry(request, &engine->request_list, link) {
-   struct drm_i915_error_request *erq
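
The count-then-fill pattern used by engine_record_requests() above can be sketched in plain C. This is a simplified illustration, not the driver code: it uses a singly linked list instead of the kernel list helpers, calloc() instead of kcalloc(), and every sketch_* name is hypothetical. It starts from the first request of interest, skips entries without a batch, and never writes past the size computed in the first pass.

```c
#include <stddef.h>
#include <stdlib.h>

struct sketch_request {
	unsigned int seqno;
	int has_batch;                 /* stands in for request->batch != NULL */
	struct sketch_request *next;
};

struct sketch_record {
	unsigned int seqno;
};

/*
 * Record every request from @first onwards that carries a batch.
 * Returns the number of records written; *out is owned by the caller.
 */
static size_t sketch_record_requests(const struct sketch_request *first,
				     struct sketch_record **out)
{
	const struct sketch_request *rq;
	size_t capacity = 0, count = 0;

	*out = NULL;

	/* Pass 1: size the array. */
	for (rq = first; rq; rq = rq->next)
		capacity += !!rq->has_batch;
	if (!capacity)
		return 0;

	*out = calloc(capacity, sizeof(**out));
	if (!*out)
		return 0;

	/* Pass 2: fill it, stopping early if the list grew in between. */
	for (rq = first; rq; rq = rq->next) {
		if (!rq->has_batch)
			continue;
		if (count >= capacity)
			break;
		(*out)[count++].seqno = rq->seqno;
	}

	return count;
}
```

Returning the number of entries actually written, rather than the number counted, keeps the result consistent even if the second pass stops early.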

[Intel-gfx] [CI 21/31] drm/i915: Move common seqno reset to intel_engine_cs.c

2016-08-12 Thread Chris Wilson
Since intel_engine_init_seqno() is shared by all engine submission
backends, move it out of the legacy intel_ringbuffer.c and
into the new home for common routines, intel_engine_cs.c.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 42 +
 drivers/gpu/drm/i915/intel_ringbuffer.c | 42 -
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 7104dec5e893..829624571ca4 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -161,6 +161,48 @@ cleanup:
return ret;
 }
 
+void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno)
+{
+   struct drm_i915_private *dev_priv = engine->i915;
+
+   /* Our semaphore implementation is strictly monotonic (i.e. we proceed
+* so long as the semaphore value in the register/page is greater
+* than the sync value), so whenever we reset the seqno,
+* so long as we reset the tracking semaphore value to 0, it will
+* always be before the next request's seqno. If we don't reset
+* the semaphore value, then when the seqno moves backwards all
+* future waits will complete instantly (causing rendering corruption).
+*/
+   if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) {
+   I915_WRITE(RING_SYNC_0(engine->mmio_base), 0);
+   I915_WRITE(RING_SYNC_1(engine->mmio_base), 0);
+   if (HAS_VEBOX(dev_priv))
+   I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
+   }
+   if (dev_priv->semaphore_obj) {
+   struct drm_i915_gem_object *obj = dev_priv->semaphore_obj;
+   struct page *page = i915_gem_object_get_dirty_page(obj, 0);
+   void *semaphores = kmap(page);
+   memset(semaphores + GEN8_SEMAPHORE_OFFSET(engine->id, 0),
+  0, I915_NUM_ENGINES * gen8_semaphore_seqno_size);
+   kunmap(page);
+   }
+   memset(engine->semaphore.sync_seqno, 0,
+  sizeof(engine->semaphore.sync_seqno));
+
+   intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno);
+   if (engine->irq_seqno_barrier)
+   engine->irq_seqno_barrier(engine);
+   engine->last_submitted_seqno = seqno;
+
+   engine->hangcheck.seqno = seqno;
+
+   /* After manually advancing the seqno, fake the interrupt in case
+* there are any waiters for that seqno.
+*/
+   intel_engine_wakeup(engine);
+}
+
 void intel_engine_init_hangcheck(struct intel_engine_cs *engine)
 {
memset(&engine->hangcheck, 0, sizeof(engine->hangcheck));
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c89aea55bc10..6008d54b9152 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2314,48 +2314,6 @@ int intel_ring_cacheline_align(struct drm_i915_gem_request *req)
return 0;
 }
 
-void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno)
-{
-   struct drm_i915_private *dev_priv = engine->i915;
-
-   /* Our semaphore implementation is strictly monotonic (i.e. we proceed
-* so long as the semaphore value in the register/page is greater
-* than the sync value), so whenever we reset the seqno,
-* so long as we reset the tracking semaphore value to 0, it will
-* always be before the next request's seqno. If we don't reset
-* the semaphore value, then when the seqno moves backwards all
-* future waits will complete instantly (causing rendering corruption).
-*/
-   if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) {
-   I915_WRITE(RING_SYNC_0(engine->mmio_base), 0);
-   I915_WRITE(RING_SYNC_1(engine->mmio_base), 0);
-   if (HAS_VEBOX(dev_priv))
-   I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
-   }
-   if (dev_priv->semaphore_obj) {
-   struct drm_i915_gem_object *obj = dev_priv->semaphore_obj;
-   struct page *page = i915_gem_object_get_dirty_page(obj, 0);
-   void *semaphores = kmap(page);
-   memset(semaphores + GEN8_SEMAPHORE_OFFSET(engine->id, 0),
-  0, I915_NUM_ENGINES * gen8_semaphore_seqno_size);
-   kunmap(page);
-   }
-   memset(engine->semaphore.sync_seqno, 0,
-  sizeof(engine->semaphore.sync_seqno));
-
-   intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno);
-   if (engine->irq_seqno_barrier)
-   engine->irq_seqno_barrier(engine);
-   engine->last_submitted_seqno = seqno;
-
-   engine->hangcheck.seqno = seqno;
-
-   /* After manually advancing the seqno, fake the interrupt in case
-* there 
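
The reasoning in the comment on intel_engine_init_seqno() above can be checked with a toy model. This is a sketch under the assumption stated in that comment (a wait proceeds once the semaphore value is greater than the sync value); the sketch_* types and functions are hypothetical and model a single semaphore slot only.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_engine {
	uint32_t semaphore;   /* last value signalled to waiters */
	uint32_t seqno;       /* next seqno will be allocated above this */
};

/* A wait proceeds once the semaphore has moved past the sync value. */
static bool sketch_wait_passes(const struct sketch_engine *e, uint32_t sync_value)
{
	return e->semaphore > sync_value;
}

/* Reset the seqno, optionally also resetting the tracking semaphore. */
static void sketch_init_seqno(struct sketch_engine *e, uint32_t seqno,
			      bool reset_semaphore)
{
	if (reset_semaphore)
		e->semaphore = 0;
	e->seqno = seqno;
}
```

If the seqno is moved back from 1000 to 0 while the semaphore still holds 1000, a wait on the next request (seqno 1) passes immediately even though nothing has executed; resetting the semaphore to 0 makes the same wait block as intended.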

[Intel-gfx] [CI 17/31] drm/i915: Move assertion for iomap access to i915_vma_pin_iomap

2016-08-12 Thread Chris Wilson
Access through the GTT requires the device to be awake. Ideally
i915_vma_pin_iomap() is short-lived and the pinning demarcates the
access through the iomap. This is not entirely true: we have a mixture
of long-lived pins that exceed the wakelock (such as legacy ringbuffers)
and short-lived pins that do live within the wakelock (such as execlist
ringbuffers).

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1bec50bd651b..738a474c5afa 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3650,6 +3650,9 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 {
void __iomem *ptr;
 
+   /* Access through the GTT requires the device to be awake. */
+   assert_rpm_wakelock_held(to_i915(vma->vm->dev));
+
lockdep_assert_held(&vma->vm->dev->struct_mutex);
if (WARN_ON(!vma->obj->map_and_fenceable))
return IO_ERR_PTR(-ENODEV);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 81dc69d1ff05..4a614e567353 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1966,9 +1966,6 @@ int intel_ring_pin(struct intel_ring *ring)
if (ret)
goto err_unpin;
 
-   /* Access through the GTT requires the device to be awake. */
-   assert_rpm_wakelock_held(dev_priv);
-
addr = (void __force *)
i915_vma_pin_iomap(i915_gem_obj_to_ggtt(obj));
if (IS_ERR(addr)) {
-- 
2.8.1
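
The effect of moving the wakelock assertion from one caller into i915_vma_pin_iomap() itself can be sketched as follows. This is an illustration only: the sketch_* structures are hypothetical, and a counter plus a message on stderr stands in for the kernel's warning machinery.

```c
#include <stdio.h>

struct sketch_device {
	int wakeref_count;   /* > 0 while the device is held awake */
	int warnings;        /* number of assertion failures seen */
};

struct sketch_vma {
	struct sketch_device *dev;
	void *iomap;
	int pin_count;
};

/* Complain, but carry on, when the device is not held awake. */
static void sketch_assert_wakelock_held(struct sketch_device *dev)
{
	if (dev->wakeref_count <= 0) {
		dev->warnings++;
		fprintf(stderr, "sketch: iomap pinned without a wakeref\n");
	}
}

/* Every caller now gets the check, not just the ringbuffer path. */
static void *sketch_vma_pin_iomap(struct sketch_vma *vma)
{
	sketch_assert_wakelock_held(vma->dev);
	vma->pin_count++;
	return vma->iomap;
}
```

Placing the check in the callee means a new user of the iomap cannot forget it, at the cost of also checking the long-lived pins at the moment they are created.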



[Intel-gfx] [CI 12/31] drm/i915: Track pinned vma inside guc

2016-08-12 Thread Chris Wilson
Since the guc allocates and pins an object into the GGTT for its usage,
it is more natural to use that pinned VMA as our resource cookie.

v2: Embrace naming tautology
v3: Rewrite comments for guc_allocate_vma()

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c|  10 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h|   6 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 144 ++---
 drivers/gpu/drm/i915/intel_guc.h   |   9 +-
 drivers/gpu/drm/i915/intel_guc_loader.c|   7 +-
 5 files changed, 90 insertions(+), 86 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index fd028953453d..32d26b6c4bca 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2526,15 +2526,15 @@ static int i915_guc_log_dump(struct seq_file *m, void *data)
struct drm_info_node *node = m->private;
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = to_i915(dev);
-   struct drm_i915_gem_object *log_obj = dev_priv->guc.log_obj;
-   u32 *log;
+   struct drm_i915_gem_object *obj;
int i = 0, pg;
 
-   if (!log_obj)
+   if (!dev_priv->guc.log_vma)
return 0;
 
-   for (pg = 0; pg < log_obj->base.size / PAGE_SIZE; pg++) {
-   log = kmap_atomic(i915_gem_object_get_page(log_obj, pg));
+   obj = dev_priv->guc.log_vma->obj;
+   for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) {
+   u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg));
 
for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4)
seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index f2769e01cc8c..a2691943a404 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -716,4 +716,10 @@ static inline void i915_vma_unpin_iomap(struct i915_vma *vma)
i915_vma_unpin(vma);
 }
 
+static inline struct page *i915_vma_first_page(struct i915_vma *vma)
+{
+   GEM_BUG_ON(!vma->pages);
+   return sg_page(vma->pages->sgl);
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6831321a9c8c..29de8cec1b58 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -183,7 +183,7 @@ static int guc_update_doorbell_id(struct intel_guc *guc,
  struct i915_guc_client *client,
  u16 new_id)
 {
-   struct sg_table *sg = guc->ctx_pool_obj->pages;
+   struct sg_table *sg = guc->ctx_pool_vma->pages;
void *doorbell_bitmap = guc->doorbell_bitmap;
struct guc_doorbell_info *doorbell;
struct guc_context_desc desc;
@@ -325,7 +325,6 @@ static void guc_init_proc_desc(struct intel_guc *guc,
 static void guc_init_ctx_desc(struct intel_guc *guc,
  struct i915_guc_client *client)
 {
-   struct drm_i915_gem_object *client_obj = client->client_obj;
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct intel_engine_cs *engine;
struct i915_gem_context *ctx = client->owner;
@@ -383,8 +382,8 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
 * The doorbell, process descriptor, and workqueue are all parts
 * of the client object, which the GuC will reference via the GGTT
 */
-   gfx_addr = i915_gem_obj_ggtt_offset(client_obj);
-   desc.db_trigger_phy = sg_dma_address(client_obj->pages->sgl) +
+   gfx_addr = client->vma->node.start;
+   desc.db_trigger_phy = sg_dma_address(client->vma->pages->sgl) +
client->doorbell_offset;
desc.db_trigger_cpu = (uintptr_t)client->client_base +
client->doorbell_offset;
@@ -400,7 +399,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
desc.desc_private = (uintptr_t)client;
 
/* Pool context is pinned already */
-   sg = guc->ctx_pool_obj->pages;
+   sg = guc->ctx_pool_vma->pages;
sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
 sizeof(desc) * client->ctx_index);
 }
@@ -413,7 +412,7 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 
memset(&desc, 0, sizeof(desc));
 
-   sg = guc->ctx_pool_obj->pages;
+   sg = guc->ctx_pool_vma->pages;
sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
 sizeof(desc) * client->ctx_index);
 }
@@ -496,7 +495,7 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc,
/* WQ starts from the page after doorbell / process_desc */
wq_page = (wq_off + GUC_DB_SIZE) >> PAGE_SHIFT;
wq_off &= PAGE_SIZE - 1;
-   base = kmap_atomic(i

[Intel-gfx] [CI 19/31] drm/i915: Use VMA for scratch page tracking

2016-08-12 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_context.c |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/intel_display.c|  2 +-
 drivers/gpu/drm/i915/intel_lrc.c| 18 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 55 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h | 10 ++
 6 files changed, 46 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 824dfe14bcd0..e566167d9441 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -660,7 +660,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
MI_STORE_REGISTER_MEM |
MI_SRM_LRM_GLOBAL_GTT);
intel_ring_emit_reg(ring, last_reg);
-   intel_ring_emit(ring, engine->scratch.gtt_offset);
+   intel_ring_emit(ring, engine->scratch->node.start);
intel_ring_emit(ring, MI_NOOP);
}
intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 4a19494a4f6f..b80d2a6f56b3 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1101,7 +1101,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
if (HAS_BROKEN_CS_TLB(dev_priv))
ee->wa_batchbuffer =
i915_error_ggtt_object_create(dev_priv,
- engine->scratch.obj);
+ engine->scratch->obj);
 
if (request->ctx->engine[i].state) {
ee->ctx = i915_error_ggtt_object_create(dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index c5c0c35d4f6e..2e7d03c5bf5c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11795,7 +11795,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
intel_ring_emit(ring, MI_STORE_REGISTER_MEM |
  MI_SRM_LRM_GLOBAL_GTT);
intel_ring_emit_reg(ring, DERRMR);
-   intel_ring_emit(ring, req->engine->scratch.gtt_offset + 256);
+   intel_ring_emit(ring, req->engine->scratch->node.start + 256);
if (IS_GEN8(dev)) {
intel_ring_emit(ring, 0);
intel_ring_emit(ring, MI_NOOP);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 73dd2f9e0547..42999ba02152 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -914,7 +914,7 @@ static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine,
wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 |
   MI_SRM_LRM_GLOBAL_GTT));
wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4);
-   wa_ctx_emit(batch, index, engine->scratch.gtt_offset + 256);
+   wa_ctx_emit(batch, index, engine->scratch->node.start + 256);
wa_ctx_emit(batch, index, 0);
 
wa_ctx_emit(batch, index, MI_LOAD_REGISTER_IMM(1));
@@ -932,7 +932,7 @@ static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine,
wa_ctx_emit(batch, index, (MI_LOAD_REGISTER_MEM_GEN8 |
   MI_SRM_LRM_GLOBAL_GTT));
wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4);
-   wa_ctx_emit(batch, index, engine->scratch.gtt_offset + 256);
+   wa_ctx_emit(batch, index, engine->scratch->node.start + 256);
wa_ctx_emit(batch, index, 0);
 
return index;
@@ -993,7 +993,7 @@ static int gen8_init_indirectctx_bb(struct intel_engine_cs *engine,
 
/* WaClearSlmSpaceAtContextSwitch:bdw,chv */
/* Actual scratch location is at 128 bytes offset */
-   scratch_addr = engine->scratch.gtt_offset + 2*CACHELINE_BYTES;
+   scratch_addr = engine->scratch->node.start + 2 * CACHELINE_BYTES;
 
wa_ctx_emit(batch, index, GFX_OP_PIPE_CONTROL(6));
wa_ctx_emit(batch, index, (PIPE_CONTROL_FLUSH_L3 |
@@ -1072,8 +1072,8 @@ static int gen9_init_indirectctx_bb(struct intel_engine_cs *engine,
/* WaClearSlmSpaceAtContextSwitch:kbl */
/* Actual scratch location is at 128 bytes offset */
if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_A0)) {
-   uint32_t scratch_addr
-   = engine->scratch.gtt_offset + 2*CACHELINE_BYTES;
+   u32 scratch_addr =
+   engine->scratch->node.start + 2 

[Intel-gfx] [CI 13/31] drm/i915: Convert fence computations to use vma directly

2016-08-12 Thread Chris Wilson
Look up the GGTT vma once for the object assigned to the fence, and then
derive everything from that vma.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_fence.c | 55 +--
 1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c b/drivers/gpu/drm/i915/i915_gem_fence.c
index 9e8173fe2a09..1d0f975c61f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence.c
@@ -85,22 +85,19 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
POSTING_READ(fence_reg_lo);
 
if (obj) {
-   u32 size = i915_gem_obj_ggtt_size(obj);
+   struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
unsigned int tiling = i915_gem_object_get_tiling(obj);
unsigned int stride = i915_gem_object_get_stride(obj);
-   uint64_t val;
+   u64 size = vma->node.size;
+   u32 row_size = stride * (tiling == I915_TILING_Y ? 32 : 8);
+   u64 val;
 
/* Adjust fence size to match tiled area */
-   if (tiling != I915_TILING_NONE) {
-   uint32_t row_size = stride *
-   (tiling == I915_TILING_Y ? 32 : 8);
-   size = (size / row_size) * row_size;
-   }
+   size = rounddown(size, row_size);
 
-   val = (uint64_t)((i915_gem_obj_ggtt_offset(obj) + size - 4096) &
-0xf000) << 32;
-   val |= i915_gem_obj_ggtt_offset(obj) & 0xf000;
-   val |= (uint64_t)((stride / 128) - 1) << fence_pitch_shift;
+   val = ((vma->node.start + size - 4096) & 0xf000) << 32;
+   val |= vma->node.start & 0xf000;
+   val |= (u64)((stride / 128) - 1) << fence_pitch_shift;
if (tiling == I915_TILING_Y)
val |= 1 << I965_FENCE_TILING_Y_SHIFT;
val |= I965_FENCE_REG_VALID;
@@ -123,17 +120,17 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
u32 val;
 
if (obj) {
-   u32 size = i915_gem_obj_ggtt_size(obj);
+   struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
unsigned int tiling = i915_gem_object_get_tiling(obj);
unsigned int stride = i915_gem_object_get_stride(obj);
int pitch_val;
int tile_width;
 
-   WARN((i915_gem_obj_ggtt_offset(obj) & ~I915_FENCE_START_MASK) ||
-(size & -size) != size ||
-(i915_gem_obj_ggtt_offset(obj) & (size - 1)),
-"object 0x%08llx [fenceable? %d] not 1M or pot-size (0x%08x) aligned\n",
-i915_gem_obj_ggtt_offset(obj), obj->map_and_fenceable, size);
+   WARN((vma->node.start & ~I915_FENCE_START_MASK) ||
+!is_power_of_2(vma->node.size) ||
+(vma->node.start & (vma->node.size - 1)),
+"object 0x%08llx [fenceable? %d] not 1M or pot-size (0x%08llx) aligned\n",
+vma->node.start, obj->map_and_fenceable, vma->node.size);
 
if (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
tile_width = 128;
@@ -144,10 +141,10 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
pitch_val = stride / tile_width;
pitch_val = ffs(pitch_val) - 1;
 
-   val = i915_gem_obj_ggtt_offset(obj);
+   val = vma->node.start;
if (tiling == I915_TILING_Y)
val |= 1 << I830_FENCE_TILING_Y_SHIFT;
-   val |= I915_FENCE_SIZE_BITS(size);
+   val |= I915_FENCE_SIZE_BITS(vma->node.size);
val |= pitch_val << I830_FENCE_PITCH_SHIFT;
val |= I830_FENCE_REG_VALID;
} else
@@ -161,27 +158,27 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
struct drm_i915_gem_object *obj)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
-   uint32_t val;
+   u32 val;
 
if (obj) {
-   u32 size = i915_gem_obj_ggtt_size(obj);
+   struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
unsigned int tiling = i915_gem_object_get_tiling(obj);
unsigned int stride = i915_gem_object_get_stride(obj);
-   uint32_t pitch_val;
+   u32 pitch_val;
 
-   WARN((i915_gem_obj_ggtt_offset(obj) & ~I830_FENCE_START_MASK) ||
-(size & -size) != size ||
-(i915_gem_obj_ggtt_offset(obj) & (size - 1)),
-"object 0x%08llx not 512K or pot-size 0x%08x aligned\n",
-i915_gem_obj_ggtt_offset(obj), size);
+   WARN((vma->node.s
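
The arithmetic the fence patch above derives from the vma (rounding the fenced size down to whole tile rows, and checking power-of-two alignment) can be tried in isolation. This is a sketch with hypothetical sketch_* names that mirrors the expressions in the diff using standard C types; it assumes a non-zero stride, as a tiled object would have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_tiling { SKETCH_TILING_X, SKETCH_TILING_Y };

/* Round the node size down to a whole number of tile rows. */
static uint64_t sketch_fence_size(uint64_t node_size, uint32_t stride,
				  enum sketch_tiling tiling)
{
	/* A row is 32 lines high for Y tiling and 8 lines otherwise. */
	uint32_t row_size = stride * (tiling == SKETCH_TILING_Y ? 32 : 8);

	assert(row_size != 0);   /* a zero stride would divide by zero */
	return node_size - node_size % row_size;
}

static bool sketch_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* True if @size is a power of two and @start is aligned to it. */
static bool sketch_pot_aligned(uint64_t start, uint64_t size)
{
	return sketch_is_power_of_2(size) && (start & (size - 1)) == 0;
}
```

For example, a 10000-byte node with a 512-byte stride and X tiling has 4096-byte rows, so only the first 8192 bytes are covered by the fence.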

[Intel-gfx] [CI 16/31] drm/i915: Only change the context object's domain when binding

2016-08-12 Thread Chris Wilson
We know that the only access to the context object is via the GPU, and
the only time when it can be out of the GPU domain is when it is swapped
out and unbound. Therefore we only need to clflush the object when
binding, thus avoiding any potential stall on touching the domain on an
active context.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_context.c | 19 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  4 
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 3857ce097c84..824dfe14bcd0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -772,6 +772,13 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
if (skip_rcs_switch(ppgtt, engine, to))
return 0;
 
+   /* Clear this page out of any CPU caches for coherent swap-in/out. */
+   if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
+   ret = i915_gem_object_set_to_gtt_domain(vma->obj, false);
+   if (ret)
+   return ret;
+   }
+
/* Trying to pin first makes error handling easier. */
ret = i915_vma_pin(vma, 0, to->ggtt_alignment, PIN_GLOBAL);
if (ret)
@@ -786,18 +793,6 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 */
from = engine->last_context;
 
-   /*
-* Clear this page out of any CPU caches for coherent swap-in/out. Note
-* that thanks to write = false in this call and us not setting any gpu
-* write domains when putting a context object onto the active list
-* (when switching away from it), this won't block.
-*
-* XXX: We need a real interface to do this instead of trickery.
-*/
-   ret = i915_gem_object_set_to_gtt_domain(vma->obj, false);
-   if (ret)
-   goto err;
-
if (needs_pd_load_pre(ppgtt, engine, to)) {
/* Older GENs and non render rings still want the load first,
 * "PP_DCLV followed by PP_DIR_BASE register through Load
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 2318a27341c8..81dc69d1ff05 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2092,6 +2092,10 @@ static int intel_ring_context_pin(struct i915_gem_context *ctx,
return 0;
 
if (ce->state) {
+   ret = i915_gem_object_set_to_gtt_domain(ce->state->obj, false);
+   if (ret)
+   goto error;
+
ret = i915_vma_pin(ce->state, 0, ctx->ggtt_alignment,
   PIN_GLOBAL | PIN_HIGH);
if (ret)
-- 
2.8.1
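
The idea behind the do_rcs_switch() hunk above (flush the context object only when it is about to be bound, never while it is already bound and possibly active) can be modelled with a pair of counters. This is a simplified sketch with hypothetical sketch_* names; a counter stands in for the clflush performed by the domain change, and error handling is omitted.

```c
#include <stdbool.h>

struct sketch_ctx_vma {
	bool globally_bound;   /* stands in for the global-bind flag on the vma */
	int flush_count;       /* how many times the CPU caches were flushed */
	int pin_count;
};

/* Pin the context image, flushing only on the unbound -> bound transition. */
static int sketch_pin_context(struct sketch_ctx_vma *vma)
{
	if (!vma->globally_bound) {
		/* The object may have been swapped out; make it coherent. */
		vma->flush_count++;
	}

	vma->globally_bound = true;
	vma->pin_count++;
	return 0;
}
```

Repeated switches to an already bound context then cost no domain change at all, which is the stall the commit message wants to avoid.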



[Intel-gfx] [CI 11/31] drm/i915: Add convenience wrappers for vma's object get/put

2016-08-12 Thread Chris Wilson
VMAs are not reference counted; they belong to the object and live until
they are closed. However, if we want to use the VMA as a cookie and use it
to keep the object alive, we want to hold onto a reference to the object
for the lifetime of the VMA cookie. To facilitate this, add a couple of
simple wrappers for managing the reference count on the object owning the
VMA.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h| 12 
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 ++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 855833a6306a..3285c8e2c87a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2373,6 +2373,18 @@ i915_gem_object_get_stride(struct drm_i915_gem_object *obj)
return obj->tiling_and_stride & STRIDE_MASK;
 }
 
+static inline struct i915_vma *i915_vma_get(struct i915_vma *vma)
+{
+   i915_gem_object_get(vma->obj);
+   return vma;
+}
+
+static inline void i915_vma_put(struct i915_vma *vma)
+{
+   lockdep_assert_held(&vma->vm->dev->struct_mutex);
+   i915_gem_object_put(vma->obj);
+}
+
 /*
  * Optimised SGL iterator for GEM objects
  */
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c8d13fea4b25..ced05878b405 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -271,7 +271,7 @@ static void eb_destroy(struct eb_vmas *eb)
   exec_list);
list_del_init(&vma->exec_list);
i915_gem_execbuffer_unreserve_vma(vma);
-   i915_gem_object_put(vma->obj);
+   i915_vma_put(vma);
}
kfree(eb);
 }
@@ -900,7 +900,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
vma = list_first_entry(&eb->vmas, struct i915_vma, exec_list);
list_del_init(&vma->exec_list);
i915_gem_execbuffer_unreserve_vma(vma);
-   i915_gem_object_put(vma->obj);
+   i915_vma_put(vma);
}
 
mutex_unlock(&dev->struct_mutex);
-- 
2.8.1
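
The wrappers added by the patch above forward the reference count to the object that owns the VMA. A minimal standalone sketch of that ownership model, with hypothetical sketch_* types and a plain integer in place of the kernel's reference counting, looks like this:

```c
struct sketch_object {
	int refcount;
};

struct sketch_vma {
	struct sketch_object *obj;   /* the VMA itself holds no count */
};

/* Take a reference on the owning object and hand back the VMA cookie. */
static struct sketch_vma *sketch_vma_get(struct sketch_vma *vma)
{
	vma->obj->refcount++;
	return vma;
}

/* Drop the reference taken by sketch_vma_get(). */
static void sketch_vma_put(struct sketch_vma *vma)
{
	vma->obj->refcount--;
}
```

Returning the VMA from the get helper allows it to be used directly in an assignment, so the cookie and the reference are acquired in one expression.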



[Intel-gfx] [CI 14/31] drm/i915: Use VMA directly for checking tiling parameters

2016-08-12 Thread Chris Wilson
v2: Rename functions to suit their more active role

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_tiling.c | 51 --
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index f4b984de83b5..b2b0cb7199ac 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -116,35 +116,46 @@ i915_tiling_ok(struct drm_device *dev, int stride, int size, int tiling_mode)
return true;
 }
 
-/* Is the current GTT allocation valid for the change in tiling? */
-static bool
-i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode)
+/* Make the current GTT allocation valid for the change in tiling. */
+static int
+i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj, int tiling_mode)
 {
struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+   struct i915_vma *vma;
u32 size;
 
if (tiling_mode == I915_TILING_NONE)
-   return true;
+   return 0;
 
if (INTEL_GEN(dev_priv) >= 4)
-   return true;
+   return 0;
+
+   vma = i915_gem_obj_to_ggtt(obj);
+   if (!vma)
+   return 0;
+
+   if (!obj->map_and_fenceable)
+   return 0;
 
if (IS_GEN3(dev_priv)) {
-   if (i915_gem_obj_ggtt_offset(obj) & ~I915_FENCE_START_MASK)
-   return false;
+   if (vma->node.start & ~I915_FENCE_START_MASK)
+   goto bad;
} else {
-   if (i915_gem_obj_ggtt_offset(obj) & ~I830_FENCE_START_MASK)
-   return false;
+   if (vma->node.start & ~I830_FENCE_START_MASK)
+   goto bad;
}
 
size = i915_gem_get_ggtt_size(dev_priv, obj->base.size, tiling_mode);
-   if (i915_gem_obj_ggtt_size(obj) != size)
-   return false;
+   if (vma->node.size < size)
+   goto bad;
 
-   if (i915_gem_obj_ggtt_offset(obj) & (size - 1))
-   return false;
+   if (vma->node.start & (size - 1))
+   goto bad;
 
-   return true;
+   return 0;
+
+bad:
+   return i915_vma_unbind(vma);
 }
 
 /**
@@ -168,7 +179,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
struct drm_i915_gem_set_tiling *args = data;
struct drm_i915_private *dev_priv = to_i915(dev);
struct drm_i915_gem_object *obj;
-   int ret = 0;
+   int err = 0;
 
/* Make sure we don't cross-contaminate obj->tiling_and_stride */
BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK);
@@ -187,7 +198,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 
mutex_lock(&dev->struct_mutex);
if (obj->pin_display || obj->framebuffer_references) {
-   ret = -EBUSY;
+   err = -EBUSY;
goto err;
}
 
@@ -234,11 +245,9 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 * has to also include the unfenced register the GPU uses
 * whilst executing a fenced command for an untiled object.
 */
-   if (obj->map_and_fenceable &&
-   !i915_gem_object_fence_ok(obj, args->tiling_mode))
-   ret = i915_vma_unbind(i915_gem_obj_to_ggtt(obj));
 
-   if (ret == 0) {
+   err = i915_gem_object_fence_prepare(obj, args->tiling_mode);
+   if (!err) {
if (obj->pages &&
obj->madv == I915_MADV_WILLNEED &&
dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
@@ -281,7 +290,7 @@ err:
 
intel_runtime_pm_put(dev_priv);
 
-   return ret;
+   return err;
 }
 
 /**
-- 
2.8.1



[Intel-gfx] [CI 01/31] drm/i915: Record the position of the start of the request

2016-08-12 Thread Chris Wilson
Not only does it make for good documentation and a debugging aid, but it
is also vital for when we want to unwind requests - such as when throwing
away an incomplete request.
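
As a rough illustration of why the start position matters for unwinding,
here is a minimal userspace sketch. The sketch_* structures and functions
are hypothetical, heavily simplified stand-ins and are not the driver's
own code; it assumes only that a request remembers the ring's tail offset
at the moment it was started.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, much simplified stand-ins for the driver's ring and
 * request structures; only the offsets matter here. */
struct sketch_ring {
	uint32_t size;	/* bytes, must be a power of two */
	uint32_t tail;	/* next write offset */
};

struct sketch_request {
	struct sketch_ring *ring;
	uint32_t head;	/* ring->tail when the request was started */
	uint32_t tail;	/* ring->tail after the last emission */
};

/* Start a request: remember where its commands begin in the ring. */
static void sketch_request_begin(struct sketch_request *rq,
				 struct sketch_ring *ring)
{
	rq->ring = ring;
	rq->head = ring->tail;
	rq->tail = ring->tail;
}

/* Emit some bytes of commands, advancing the tail with wrap-around. */
static void sketch_request_emit(struct sketch_request *rq, uint32_t bytes)
{
	struct sketch_ring *ring = rq->ring;

	assert((ring->size & (ring->size - 1)) == 0);
	ring->tail = (ring->tail + bytes) & (ring->size - 1);
	rq->tail = ring->tail;
}

/* Throw away an incomplete request by rewinding to its start. */
static void sketch_request_unwind(struct sketch_request *rq)
{
	rq->ring->tail = rq->head;
	rq->tail = rq->head;
}
```

Because the head is captured before anything is emitted, the unwind needs
no knowledge of how much was written in between.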

Signed-off-by: Chris Wilson 
Link: 
http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-2-git-send-email-arun.siluv...@linux.intel.com
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_gem_request.c | 13 +
 drivers/gpu/drm/i915/i915_gpu_error.c   |  6 --
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bf193ba1574e..b1017950087b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -557,6 +557,7 @@ struct drm_i915_error_state {
struct drm_i915_error_request {
long jiffies;
u32 seqno;
+   u32 head;
u32 tail;
} *requests;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index b764c1d440c8..8a9e9bfeea09 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -426,6 +426,13 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
if (ret)
goto err_ctx;
 
+   /* Record the position of the start of the request so that
+* should we detect the updated seqno part-way through the
+* GPU processing the request, we never over-estimate the
+* position of the head.
+*/
+   req->head = req->ring->tail;
+
return req;
 
 err_ctx:
@@ -500,8 +507,6 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
 
trace_i915_gem_request_add(request);
 
-   request->head = request_start;
-
/* Seal the request and mark it as pending execution. Note that
 * we may inspect this state, without holding any locks, during
 * hangcheck. Hence we apply the barrier to ensure that we do not
@@ -514,10 +519,10 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
list_add_tail(&request->link, &engine->request_list);
list_add_tail(&request->ring_link, &ring->request_list);
 
-   /* Record the position of the start of the request so that
+   /* Record the position of the start of the breadcrumb so that
 * should we detect the updated seqno part-way through the
 * GPU processing the request, we never over-estimate the
-* position of the head.
+* position of the ring's HEAD.
 */
request->postfix = ring->tail;
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index eecb87063c88..d54848f5f246 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -455,9 +455,10 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
   dev_priv->engine[i].name,
   ee->num_requests);
for (j = 0; j < ee->num_requests; j++) {
-   err_printf(m, "  seqno 0x%08x, emitted %ld, 
tail 0x%08x\n",
+   err_printf(m, "  seqno 0x%08x, emitted %ld, 
head 0x%08x, tail 0x%08x\n",
   ee->requests[j].seqno,
   ee->requests[j].jiffies,
+  ee->requests[j].head,
   ee->requests[j].tail);
}
}
@@ -1205,7 +1206,8 @@ static void i915_gem_record_rings(struct drm_i915_private 
*dev_priv,
erq = &ee->requests[count++];
erq->seqno = request->fence.seqno;
erq->jiffies = request->emitted_jiffies;
-   erq->tail = request->postfix;
+   erq->head = request->head;
+   erq->tail = request->tail;
}
}
 }
-- 
2.8.1



[Intel-gfx] [CI 07/31] drm/i915: Remove redundant WARN_ON from __i915_add_request()

2016-08-12 Thread Chris Wilson
It's an outright programming error, so explode if it is ever hit.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_request.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index 8a9e9bfeea09..4c5b7e104f2f 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -470,18 +470,12 @@ static void i915_gem_mark_busy(const struct 
intel_engine_cs *engine)
  */
 void __i915_add_request(struct drm_i915_gem_request *request, bool 
flush_caches)
 {
-   struct intel_engine_cs *engine;
-   struct intel_ring *ring;
+   struct intel_engine_cs *engine = request->engine;
+   struct intel_ring *ring = request->ring;
u32 request_start;
u32 reserved_tail;
int ret;
 
-   if (WARN_ON(!request))
-   return;
-
-   engine = request->engine;
-   ring = request->ring;
-
/*
 * To ensure that this call will not fail, space for its emissions
 * should already have been reserved in the ring buffer. Let the ring
-- 
2.8.1



[Intel-gfx] [CI 06/31] drm/i915: Reduce i915_gem_objects to only show object information

2016-08-12 Thread Chris Wilson
Knowing how much of the GTT is in use (both mappable aperture and
beyond) is no longer relevant, and the output clutters the real
information: how many objects are allocated and bound (and by whom), so
that we can quickly grasp if there is a leak.

v2: Relent, and rename pinned to indicate display only. Since the
display objects are semi-static and are of variable size, they are the
interesting objects to watch over time for aperture leaking. The other
pins are either static (such as the scratch page) or very short lived
(such as execbuf) and not part of the precious GGTT.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_debugfs.c   | 100 --
 drivers/gpu/drm/i915/i915_drv.h   | 249 +-
 drivers/gpu/drm/i915/i915_gpu_error.c |  15 ++
 3 files changed, 168 insertions(+), 196 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index c535c4c2f7af..fd028953453d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -269,17 +269,6 @@ static int i915_gem_stolen_list_info(struct seq_file *m, 
void *data)
return 0;
 }
 
-#define count_objects(list, member) do { \
-   list_for_each_entry(obj, list, member) { \
-   size += i915_gem_obj_total_ggtt_size(obj); \
-   ++count; \
-   if (obj->map_and_fenceable) { \
-   mappable_size += i915_gem_obj_ggtt_size(obj); \
-   ++mappable_count; \
-   } \
-   } \
-} while (0)
-
 struct file_stats {
struct drm_i915_file_private *file_priv;
unsigned long count;
@@ -394,30 +383,16 @@ static void print_context_stats(struct seq_file *m,
print_file_stats(m, "[k]contexts", stats);
 }
 
-#define count_vmas(list, member) do { \
-   list_for_each_entry(vma, list, member) { \
-   size += i915_gem_obj_total_ggtt_size(vma->obj); \
-   ++count; \
-   if (vma->obj->map_and_fenceable) { \
-   mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
-   ++mappable_count; \
-   } \
-   } \
-} while (0)
-
 static int i915_gem_object_info(struct seq_file *m, void* data)
 {
struct drm_info_node *node = m->private;
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = to_i915(dev);
struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   u32 count, mappable_count, purgeable_count;
-   u64 size, mappable_size, purgeable_size;
-   unsigned long pin_mapped_count = 0, pin_mapped_purgeable_count = 0;
-   u64 pin_mapped_size = 0, pin_mapped_purgeable_size = 0;
+   u32 count, mapped_count, purgeable_count, dpy_count;
+   u64 size, mapped_size, purgeable_size, dpy_size;
struct drm_i915_gem_object *obj;
struct drm_file *file;
-   struct i915_vma *vma;
int ret;
 
ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -428,70 +403,51 @@ static int i915_gem_object_info(struct seq_file *m, void* 
data)
   dev_priv->mm.object_count,
   dev_priv->mm.object_memory);
 
-   size = count = mappable_size = mappable_count = 0;
-   count_objects(&dev_priv->mm.bound_list, global_list);
-   seq_printf(m, "%u [%u] objects, %llu [%llu] bytes in gtt\n",
-  count, mappable_count, size, mappable_size);
-
-   size = count = mappable_size = mappable_count = 0;
-   count_vmas(&ggtt->base.active_list, vm_link);
-   seq_printf(m, "  %u [%u] active objects, %llu [%llu] bytes\n",
-  count, mappable_count, size, mappable_size);
-
-   size = count = mappable_size = mappable_count = 0;
-   count_vmas(&ggtt->base.inactive_list, vm_link);
-   seq_printf(m, "  %u [%u] inactive objects, %llu [%llu] bytes\n",
-  count, mappable_count, size, mappable_size);
-
size = count = purgeable_size = purgeable_count = 0;
list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) {
-   size += obj->base.size, ++count;
-   if (obj->madv == I915_MADV_DONTNEED)
-   purgeable_size += obj->base.size, ++purgeable_count;
+   size += obj->base.size;
+   ++count;
+
+   if (obj->madv == I915_MADV_DONTNEED) {
+   purgeable_size += obj->base.size;
+   ++purgeable_count;
+   }
+
if (obj->mapping) {
-   pin_mapped_count++;
-   pin_mapped_size += obj->base.size;
-   if (obj->pages_pin_count == 0) {
-   pin_mapped_purgeable_count++;
-   pin_mapped_purgeable_size += obj->base.size;
-   }
+   mapped_count++;
+   mapped_size += obj->base.size;

[Intel-gfx] [CI 08/31] drm/i915: Always set the vma->pages

2016-08-12 Thread Chris Wilson
Previously, we would only set the vma->pages pointer for GGTT entries.
However, if we always set it, we can use it to prettify some code that
may want to access the backing store associated with the VMA (as
assigned to the VMA).

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem.c |  8 
 drivers/gpu/drm/i915/i915_gem_gtt.c | 30 ++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  3 +--
 3 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5566916870eb..8b1a74dbb870 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2859,12 +2859,12 @@ int i915_vma_unbind(struct i915_vma *vma)
if (i915_vma_is_ggtt(vma)) {
if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
obj->map_and_fenceable = false;
-   } else if (vma->ggtt_view.pages) {
-   sg_free_table(vma->ggtt_view.pages);
-   kfree(vma->ggtt_view.pages);
+   } else if (vma->pages) {
+   sg_free_table(vma->pages);
+   kfree(vma->pages);
}
-   vma->ggtt_view.pages = NULL;
}
+   vma->pages = NULL;
 
/* Since the unbound list is global, only move to that list if
 * no more VMAs exist. */
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d876501694c6..9c178b0c40b5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -170,11 +170,13 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
 {
u32 pte_flags = 0;
 
+   vma->pages = vma->obj->pages;
+
/* Currently applicable only to VLV */
if (vma->obj->gt_ro)
pte_flags |= PTE_READ_ONLY;
 
-   vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start,
+   vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
cache_level, pte_flags);
 
return 0;
@@ -2618,8 +2620,7 @@ static int ggtt_bind_vma(struct i915_vma *vma,
if (obj->gt_ro)
pte_flags |= PTE_READ_ONLY;
 
-   vma->vm->insert_entries(vma->vm, vma->ggtt_view.pages,
-   vma->node.start,
+   vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
cache_level, pte_flags);
 
/*
@@ -2651,8 +2652,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 
if (flags & I915_VMA_GLOBAL_BIND) {
vma->vm->insert_entries(vma->vm,
-   vma->ggtt_view.pages,
-   vma->node.start,
+   vma->pages, vma->node.start,
cache_level, pte_flags);
}
 
@@ -2660,8 +2660,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
struct i915_hw_ppgtt *appgtt =
to_i915(vma->vm->dev)->mm.aliasing_ppgtt;
appgtt->base.insert_entries(&appgtt->base,
-   vma->ggtt_view.pages,
-   vma->node.start,
+   vma->pages, vma->node.start,
cache_level, pte_flags);
}
 
@@ -3557,28 +3556,27 @@ i915_get_ggtt_vma_pages(struct i915_vma *vma)
 {
int ret = 0;
 
-   if (vma->ggtt_view.pages)
+   if (vma->pages)
return 0;
 
if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
-   vma->ggtt_view.pages = vma->obj->pages;
+   vma->pages = vma->obj->pages;
else if (vma->ggtt_view.type == I915_GGTT_VIEW_ROTATED)
-   vma->ggtt_view.pages =
+   vma->pages =

intel_rotate_fb_obj_pages(&vma->ggtt_view.params.rotated, vma->obj);
else if (vma->ggtt_view.type == I915_GGTT_VIEW_PARTIAL)
-   vma->ggtt_view.pages =
-   intel_partial_pages(&vma->ggtt_view, vma->obj);
+   vma->pages = intel_partial_pages(&vma->ggtt_view, vma->obj);
else
WARN_ONCE(1, "GGTT view %u not implemented!\n",
  vma->ggtt_view.type);
 
-   if (!vma->ggtt_view.pages) {
+   if (!vma->pages) {
DRM_ERROR("Failed to get pages for GGTT view type %u!\n",
  vma->ggtt_view.type);
ret = -EINVAL;
-   } else if (IS_ERR(vma->ggtt_view.pages)) {
-   ret = PTR_ERR(vma->ggtt_view.pages);
-   vma->ggtt_view.pages = NULL;
+   } else if (IS_ERR(vma->pages)) {
+   ret = PTR_ERR(vma->pages);
+   vma->pages = NULL;
DRM_ERROR("Failed to get pages for VMA view type %

[Intel-gfx] [CI 03/31] drm/i915: Store the active context object on all engines upon error

2016-08-12 Thread Chris Wilson
With execlists, we have context objects everywhere, not just RCS. So
store them for post-mortem debugging. This also has a secondary effect
of removing one more unsafe list iteration by using the state preserved
from the hanging request. And now we can cross-reference the request's
context state with that loaded by the GPU.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 28 
 1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index a51c5422c1bd..f34e63eda178 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1043,28 +1043,6 @@ static void error_record_engine_registers(struct 
drm_i915_error_state *error,
}
 }
 
-static void i915_gem_record_active_context(struct intel_engine_cs *engine,
-  struct drm_i915_error_state *error,
-  struct drm_i915_error_engine *ee)
-{
-   struct drm_i915_private *dev_priv = engine->i915;
-   struct drm_i915_gem_object *obj;
-
-   /* Currently render ring is the only HW context user */
-   if (engine->id != RCS || !error->ccid)
-   return;
-
-   list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-   if (!i915_gem_obj_ggtt_bound(obj))
-   continue;
-
-   if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) 
{
-   ee->ctx = i915_error_ggtt_object_create(dev_priv, obj);
-   break;
-   }
-   }
-}
-
 static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
  struct drm_i915_error_state *error)
 {
@@ -1114,6 +1092,10 @@ static void i915_gem_record_rings(struct 
drm_i915_private *dev_priv,
i915_error_ggtt_object_create(dev_priv,
  
engine->scratch.obj);
 
+   ee->ctx =
+   i915_error_ggtt_object_create(dev_priv,
+ 
request->ctx->engine[i].state);
+
if (request->pid) {
struct task_struct *task;
 
@@ -1144,8 +1126,6 @@ static void i915_gem_record_rings(struct drm_i915_private 
*dev_priv,
ee->wa_ctx = i915_error_ggtt_object_create(dev_priv,
   engine->wa_ctx.obj);
 
-   i915_gem_record_active_context(engine, error, ee);
-
count = 0;
list_for_each_entry(request, &engine->request_list, link)
count++;
-- 
2.8.1



[Intel-gfx] [CI 02/31] drm/i915: Reduce amount of duplicate buffer information captured on error

2016-08-12 Thread Chris Wilson
When capturing the error state, we do not need to know about every
address space - just those that are related to the error. We know which
context is active at the time, therefore we know which VMs are implicated
in the error. We can then restrict the VMs which we report to the
relevant subset.

v2: s/i/count_active/ (and similar)
Rewrite label generation for "Buffers"
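
The label generation for the "Active" buffers can be sketched in
userspace C roughly as below. This is an illustration under assumptions,
not the patch itself: sketch_scnprintf() and build_label() are
hypothetical names, sketch_scnprintf() stands in for the kernel's
scnprintf() (which returns the number of characters actually stored),
and the engine names are passed in as a plain array. Each append is
handed only the space that remains in the buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's scnprintf(): returns the number
 * of characters actually stored, excluding the trailing NUL. */
static size_t sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0) {
		buf[0] = '\0';
		return 0;
	}
	return (size_t)n < size ? (size_t)n : size - 1;
}

/* Build "Active[<index>] (<name>, <name>, ...)" into buf, truncating
 * safely if the buffer is too small. Returns the stored length. */
static size_t build_label(char *buf, size_t size, int index,
			  const char *const *names, size_t count)
{
	size_t len, i;

	assert(buf != NULL);

	len = sketch_scnprintf(buf, size, "Active[%d] (", index);
	for (i = 0; i < count; i++)
		len += sketch_scnprintf(buf + len, size - len, "%s%s",
					i ? ", " : "", names[i]);
	len += sketch_scnprintf(buf + len, size - len, ")");

	return len;
}
```

Since the stored length never exceeds size - 1, the remaining-space
argument size - len stays positive for any non-empty buffer.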

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h   |   9 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 224 +++---
 2 files changed, 105 insertions(+), 128 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b1017950087b..7eb911e47904 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -517,6 +517,7 @@ struct drm_i915_error_state {
int num_waiters;
int hangcheck_score;
enum intel_engine_hangcheck_action hangcheck_action;
+   struct i915_address_space *vm;
int num_requests;
 
/* our own tracking of ring head and tail */
@@ -587,17 +588,15 @@ struct drm_i915_error_state {
u32 read_domains;
u32 write_domain;
s32 fence_reg:I915_MAX_NUM_FENCE_BITS;
-   s32 pinned:2;
u32 tiling:2;
u32 dirty:1;
u32 purgeable:1;
u32 userptr:1;
s32 engine:4;
u32 cache_level:3;
-   } **active_bo, **pinned_bo;
-
-   u32 *active_bo_count, *pinned_bo_count;
-   u32 vm_count;
+   } *active_bo[I915_NUM_ENGINES], *pinned_bo;
+   u32 active_bo_count[I915_NUM_ENGINES], pinned_bo_count;
+   struct i915_address_space *active_vm[I915_NUM_ENGINES];
 };
 
 struct intel_connector;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index d54848f5f246..a51c5422c1bd 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -42,16 +42,6 @@ static const char *engine_str(int engine)
}
 }
 
-static const char *pin_flag(int pinned)
-{
-   if (pinned > 0)
-   return " P";
-   else if (pinned < 0)
-   return " p";
-   else
-   return "";
-}
-
 static const char *tiling_flag(int tiling)
 {
switch (tiling) {
@@ -189,7 +179,7 @@ static void print_error_buffers(struct 
drm_i915_error_state_buf *m,
 {
int i;
 
-   err_printf(m, "  %s [%d]:\n", name, count);
+   err_printf(m, "%s [%d]:\n", name, count);
 
while (count--) {
err_printf(m, "%08x_%08x %8u %02x %02x [ ",
@@ -202,7 +192,6 @@ static void print_error_buffers(struct 
drm_i915_error_state_buf *m,
err_printf(m, "%02x ", err->rseqno[i]);
 
err_printf(m, "] %02x", err->wseqno);
-   err_puts(m, pin_flag(err->pinned));
err_puts(m, tiling_flag(err->tiling));
err_puts(m, dirty_flag(err->dirty));
err_puts(m, purgeable_flag(err->purgeable));
@@ -414,18 +403,33 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
error_print_engine(m, &error->engine[i]);
}
 
-   for (i = 0; i < error->vm_count; i++) {
-   err_printf(m, "vm[%d]\n", i);
+   for (i = 0; i < ARRAY_SIZE(error->active_vm); i++) {
+   char buf[128];
+   int len, first = 1;
 
-   print_error_buffers(m, "Active",
+   if (!error->active_vm[i])
+   break;
+
+   len = scnprintf(buf, sizeof(buf), "Active[%d] (", i);
+   for (j = 0; j < ARRAY_SIZE(error->engine); j++) {
+   if (error->engine[j].vm != error->active_vm[i])
+   continue;
+
+   len += scnprintf(buf + len, sizeof(buf), "%s%s",
+first ? "" : ", ",
+dev_priv->engine[j].name);
+   first = 0;
+   }
+   scnprintf(buf + len, sizeof(buf), ")");
+   print_error_buffers(m, buf,
error->active_bo[i],
error->active_bo_count[i]);
-
-   print_error_buffers(m, "Pinned",
-   error->pinned_bo[i],
-   error->pinned_bo_count[i]);
}
 
+   print_error_buffers(m, "Pinned (global)",
+   error->pinned_bo,
+   error->pinned_bo_count);
+
for (i = 0; i < ARRAY_SIZE(error->engine); i++) {
struct drm_i915_error_engine *ee = &error->engine[i];
 
@@ -627,13 +631,10 @@ static void i915_error_state_free(struct kref *error_ref)
 
i915_error_object_free(error->semaphore_o

[Intel-gfx] [CI 04/31] drm/i915: Remove inactive/active list from debugfs

2016-08-12 Thread Chris Wilson
These two files (i915_gem_active, i915_gem_inactive) no longer give
pertinent information since active/inactive tracking is per-vm and so we
need the information per-vm. They are obsolete so remove them.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 49 -
 1 file changed, 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index c461072da142..4c08e2d23002 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -210,53 +210,6 @@ describe_obj(struct seq_file *m, struct 
drm_i915_gem_object *obj)
seq_printf(m, " (frontbuffer: 0x%03x)", frontbuffer_bits);
 }
 
-static int i915_gem_object_list_info(struct seq_file *m, void *data)
-{
-   struct drm_info_node *node = m->private;
-   uintptr_t list = (uintptr_t) node->info_ent->data;
-   struct list_head *head;
-   struct drm_device *dev = node->minor->dev;
-   struct drm_i915_private *dev_priv = to_i915(dev);
-   struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   struct i915_vma *vma;
-   u64 total_obj_size, total_gtt_size;
-   int count, ret;
-
-   ret = mutex_lock_interruptible(&dev->struct_mutex);
-   if (ret)
-   return ret;
-
-   /* FIXME: the user of this interface might want more than just GGTT */
-   switch (list) {
-   case ACTIVE_LIST:
-   seq_puts(m, "Active:\n");
-   head = &ggtt->base.active_list;
-   break;
-   case INACTIVE_LIST:
-   seq_puts(m, "Inactive:\n");
-   head = &ggtt->base.inactive_list;
-   break;
-   default:
-   mutex_unlock(&dev->struct_mutex);
-   return -EINVAL;
-   }
-
-   total_obj_size = total_gtt_size = count = 0;
-   list_for_each_entry(vma, head, vm_link) {
-   seq_printf(m, "   ");
-   describe_obj(m, vma->obj);
-   seq_printf(m, "\n");
-   total_obj_size += vma->obj->base.size;
-   total_gtt_size += vma->node.size;
-   count++;
-   }
-   mutex_unlock(&dev->struct_mutex);
-
-   seq_printf(m, "Total %d objects, %llu bytes, %llu GTT size\n",
-  count, total_obj_size, total_gtt_size);
-   return 0;
-}
-
 static int obj_rank_by_stolen(void *priv,
  struct list_head *A, struct list_head *B)
 {
@@ -5376,8 +5329,6 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_gem_objects", i915_gem_object_info, 0},
{"i915_gem_gtt", i915_gem_gtt_info, 0},
{"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST},
-   {"i915_gem_active", i915_gem_object_list_info, 0, (void *) ACTIVE_LIST},
-   {"i915_gem_inactive", i915_gem_object_list_info, 0, (void *) 
INACTIVE_LIST},
{"i915_gem_stolen", i915_gem_stolen_list_info },
{"i915_gem_pageflip", i915_gem_pageflip_info, 0},
{"i915_gem_request", i915_gem_request_info, 0},
-- 
2.8.1



[Intel-gfx] [CI 05/31] drm/i915: Focus debugfs/i915_gem_pinned to show only display pins

2016-08-12 Thread Chris Wilson
Only those objects pinned to the display have semi-permanent pins of a
global nature (other pins are transient within their local vm). Simplify
i915_gem_pinned to only show the pertinent information about the pinned
objects within the GGTT.

v2: i915_gem_gtt_info is still shared with debugfs/i915_gem_gtt,
rename i915_gem_pinned to i915_gem_pin_display to better reflect its
contents

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 4c08e2d23002..c535c4c2f7af 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -40,12 +40,6 @@
 #include 
 #include "i915_drv.h"
 
-enum {
-   ACTIVE_LIST,
-   INACTIVE_LIST,
-   PINNED_LIST,
-};
-
 /* As the drm_debugfs_init() routines are called before dev->dev_private is
  * allocated we need to hook into the minor for release. */
 static int
@@ -537,8 +531,8 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data)
 {
struct drm_info_node *node = m->private;
struct drm_device *dev = node->minor->dev;
-   uintptr_t list = (uintptr_t) node->info_ent->data;
struct drm_i915_private *dev_priv = to_i915(dev);
+   bool show_pin_display_only = !!data;
struct drm_i915_gem_object *obj;
u64 total_obj_size, total_gtt_size;
int count, ret;
@@ -549,7 +543,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data)
 
total_obj_size = total_gtt_size = count = 0;
list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-   if (list == PINNED_LIST && !i915_gem_obj_is_pinned(obj))
+   if (show_pin_display_only && !obj->pin_display)
continue;
 
seq_puts(m, "   ");
@@ -5328,7 +5322,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_capabilities", i915_capabilities, 0},
{"i915_gem_objects", i915_gem_object_info, 0},
{"i915_gem_gtt", i915_gem_gtt_info, 0},
-   {"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST},
+   {"i915_gem_pin_display", i915_gem_gtt_info, 0, (void *)1},
{"i915_gem_stolen", i915_gem_stolen_list_info },
{"i915_gem_pageflip", i915_gem_pageflip_info, 0},
{"i915_gem_request", i915_gem_request_info, 0},
-- 
2.8.1



[Intel-gfx] [CI 10/31] drm/i915: Add fetch_and_zero() macro

2016-08-12 Thread Chris Wilson
A simple little macro to clear a pointer and return the old value. This
is useful for writing

value = *ptr;
if (!value)
return;

*ptr = 0;
...
free(value);

in a slightly more concise form:

value = fetch_and_zero(ptr);
if (!value)
return;

...
free(value);

with the idea that this establishes a pattern that may be extended for
atomic use (using xchg or cmpxchg) i.e. atomic_fetch_and_zero() and
similar to llist.
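
A small userspace sketch of the pattern, for illustration only. The
macro has the same shape as the one in this patch but is spelled with
__typeof__ so that it builds outside the kernel; it relies on the GNU C
statement-expression extension (gcc, clang). struct holder and
holder_release() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Same shape as the macro in the patch: read the old value, clear the
 * location, and yield the old value. Not atomic. */
#define fetch_and_zero(ptr) ({			\
	__typeof__(*(ptr)) __T = *(ptr);	\
	*(ptr) = (__typeof__(*(ptr)))0;		\
	__T;					\
})

/* Hypothetical holder with a lazily allocated scratch buffer. */
struct holder {
	int *scratch;
};

/* Release the scratch buffer at most once; returns 1 if something was
 * freed and 0 if the slot was already empty. */
static int holder_release(struct holder *h)
{
	int *buf;

	assert(h != NULL);

	buf = fetch_and_zero(&h->scratch);
	if (!buf)
		return 0;

	free(buf);
	return 1;
}
```

Calling holder_release() twice is harmless here, since the first call
leaves the slot cleared before the buffer is freed.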

Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 25b1e6c010d5..855833a6306a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3920,4 +3920,10 @@ bool i915_memcpy_from_wc(void *dst, const void *src, 
unsigned long len);
 #define ptr_pack_bits(ptr, bits)   \
((typeof(ptr))((unsigned long)(ptr) | (bits)))
 
+#define fetch_and_zero(ptr) ({ \
+   typeof(*ptr) __T = *(ptr);  \
+   *(ptr) = (typeof(*ptr))0;   \
+   __T;\
+})
+
 #endif
-- 
2.8.1



[Intel-gfx] [CI 09/31] drm/i915: Create a VMA for an object

2016-08-12 Thread Chris Wilson
In many places, we wish to store the VMA in preference to the object
itself and so being able to create the persistent VMA is useful.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 11 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  5 +
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9c178b0c40b5..1bec50bd651b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3387,6 +3387,17 @@ __i915_gem_vma_create(struct drm_i915_gem_object *obj,
 }
 
 struct i915_vma *
+i915_vma_create(struct drm_i915_gem_object *obj,
+   struct i915_address_space *vm,
+   const struct i915_ggtt_view *view)
+{
+   GEM_BUG_ON(view && !i915_is_ggtt(vm));
+   GEM_BUG_ON(view ? i915_gem_obj_to_ggtt_view(obj, view) : 
i915_gem_obj_to_vma(obj, vm));
+
+   return __i915_gem_vma_create(obj, vm, view ?: &i915_ggtt_view_normal);
+}
+
+struct i915_vma *
 i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
  struct i915_address_space *vm)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b580e8a013ce..f2769e01cc8c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -228,6 +228,11 @@ struct i915_vma {
struct drm_i915_gem_exec_object2 *exec_entry;
 };
 
+struct i915_vma *
+i915_vma_create(struct drm_i915_gem_object *obj,
+   struct i915_address_space *vm,
+   const struct i915_ggtt_view *view);
+
 static inline bool i915_vma_is_ggtt(const struct i915_vma *vma)
 {
return vma->flags & I915_VMA_GGTT;
-- 
2.8.1



Re: [Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC

2016-08-12 Thread Goel, Akash



On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:


On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Sagar Arun Kamble 

GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
   crash buffer area for regular cases and copying only the state
   structure data in first page.

v3:
  - Create a vmalloc mapping of log buffer. (Chris)
  - Cover the flush acknowledgment under rpm get & put.(Chris)
  - Revert the change of skipping the copy of crash dump area, as
not really needed, will be covered by subsequent patch.

v4:
  - Destroy the wq under the same condition in which it was created,
    pass dev_priv pointer instead of dev to newly added GuC function,
add more comments & rename variable for clarity. (Tvrtko)

Signed-off-by: Sagar Arun Kamble 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_drv.c|  14 +++
  drivers/gpu/drm/i915/i915_guc_submission.c | 150
+
  drivers/gpu/drm/i915/i915_irq.c|   5 +-
  drivers/gpu/drm/i915/intel_guc.h   |   3 +
  4 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 0fcd1c0..fc2da32 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
drm_i915_private *dev_priv)
  if (dev_priv->hotplug.dp_wq == NULL)
  goto out_free_wq;

+if (HAS_GUC_SCHED(dev_priv)) {


This just reminded me that a previous patch had:

+if (HAS_GUC_UCODE(dev))
+dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;

In the interrupt setup. I don't think there is a bug right now, but
there is a disagreement between the two which would be good to resolve.

This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
for correctness. I think.


Sorry for the inconsistency, will use HAS_GUC_SCHED in the previous patch.

As per Chris's comments, I will move the wq init/destroy to the GuC logging
setup/teardown routines (guc_create_log_extras, guc_log_cleanup).

Are you fine with that?
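
A rough userspace sketch of what pairing the create and destroy in the logging setup/teardown could look like. All names here (`struct log_ctx`, `log_setup`, `log_cleanup`, the `has_sched` flag) are hypothetical, and `malloc`/`free` merely stand in for the workqueue calls. Keying the teardown off the pointer instead of re-evaluating the capability check is an assumption of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical logging context; wq stands in for the workqueue handle. */
struct log_ctx {
	void *wq;
};

/* Create the resource only when the capability is present. */
static int log_setup(struct log_ctx *log, bool has_sched)
{
	assert(log != NULL);
	log->wq = NULL;
	if (!has_sched)
		return 0;

	log->wq = malloc(1); /* stand-in for alloc_ordered_workqueue() */
	return log->wq ? 0 : -1;
}

/* Destroy exactly what setup created; a no-op if nothing was created. */
static void log_cleanup(struct log_ctx *log)
{
	assert(log != NULL);
	if (!log->wq)
		return;

	free(log->wq); /* stand-in for destroy_workqueue() */
	log->wq = NULL;
}
```

With this shape the teardown cannot disagree with the setup about whether the resource exists, whichever predicate guarded its creation.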




+/* Need a dedicated wq to process log buffer flush interrupts
+ * from GuC without much delay so as to avoid any loss of logs.
+ */
+dev_priv->guc.log.wq =
+alloc_ordered_workqueue("i915-guc_log", 0);
+if (dev_priv->guc.log.wq == NULL)
+goto out_free_hotplug_dp_wq;
+}
+
  return 0;

+out_free_hotplug_dp_wq:
+destroy_workqueue(dev_priv->hotplug.dp_wq);
  out_free_wq:
  destroy_workqueue(dev_priv->wq);
  out_err:
@@ -782,6 +794,8 @@ out_err:

  static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
  {
+if (HAS_GUC_SCHED(dev_priv))
+destroy_workqueue(dev_priv->guc.log.wq);
  destroy_workqueue(dev_priv->hotplug.dp_wq);
  destroy_workqueue(dev_priv->wq);
  }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c7c679f..2635b67 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
intel_guc *guc,
  return host2guc_action(guc, data, ARRAY_SIZE(data));
  }

+static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
+{
+u32 data[1];
+
+data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
+
+return host2guc_action(guc, data, 1);
+}
+
  /*
   * Initialise, update, or clear doorbell data shared with the GuC
   *
@@ -840,6 +849,127 @@ err:
  return NULL;
  }

+static void guc_move_to_next_buf(struct intel_guc *guc)
+{
+return;
+}
+
+static void* guc_get_write_buffer(struct intel_guc *guc)
+{
+return NULL;
+}
+
+static void guc_read_update_log_buffer(struct intel_guc *guc)
+{
+struct guc_log_buffer_state *log_buffer_state,
*log_buffer_snapshot_state;
+struct guc_log_buffer_state log_buffer_state_local;
+void *src_data_ptr, *dst_data_ptr;
+u32 i, buffer_size;


unsigned int i if you can be bothered.


Fine, will do that for both i & buffer_size.

But I remember that earlier, in one of the patches, you suggested using u32 as
the type for some variables.

Could you please share the guideline?
Should u32 and u64 be used when we are exactly sure of the range of the
variable, as for variables containing register values?
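
For illustration only, one common convention (an assumption here, not the reviewer's stated answer) is a fixed-width type where the width is defined by hardware, and plain unsigned int for loop counters and similar bookkeeping. A minimal userspace sketch, with `uint32_t` standing in for the kernel's u32 and a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * reg_val models a 32-bit register value, so it gets a fixed-width
 * type. The loop index and the count are plain bookkeeping, so
 * unsigned int is enough for them.
 */
static unsigned int count_set_bits(uint32_t reg_val)
{
	unsigned int i, n = 0;

	for (i = 0; i < 32; i++) {
		if (reg_val & (UINT32_C(1) << i))
			n++;
	}

	assert(n <= 32);
	return n;
}
```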






+
+if (!guc->log.buf_addr)
+return;


Can it hit this? If yes, I think better disable GuC logging when pin map
on the object fails rather than let it gen
