[Intel-gfx] [PATCH] drm/i915: Avoid reading through unaccessible PTEs during an error event

2011-11-29 Thread Chris Wilson
We need to sanity check that the buffer is actually bound into the
mappable range of the GTT prior to reading it back through the GTT with
the CPU. Fortuitously, the only buffers we have been interested in so
far are constrained to be in the mappable region in order to handle
potential relocations. However, this can be relaxed in future and given
that the purpose is to read back following an error we should be extra
careful and not assume everything is safe.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/i915_irq.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 26ac3bf..9b38226 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -708,7 +708,7 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
int page, page_count;
u32 reloc_offset;
 
-   if (src == NULL || src-pages == NULL)
+   if (src == NULL || src-pages == NULL || !src-map_and_fenceable)
return NULL;
 
page_count = src-base.size / PAGE_SIZE;
-- 
1.7.7.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/11] aliasing ppgtt support v2

2011-11-29 Thread Chris Wilson
On Mon, 28 Nov 2011 21:35:27 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
wrote:
 
 Hi all,
 
 Changes since the last submission:
 - fixed issues pointed out by Chris Wilson on irc.
 - fixed an oops on pre-snb, shame on me for that one.
 - added two new patches to only bind objects to the global gtt when required.
 - added a new patch so that userspace can find out whether ppgtt is on. This 
 is
   required to use MI_STORE/LOAD commands correctly from userspace 
 batchbuffers.
   Luckily no currently released userspace code depends on this, it's just prep
   work.
 
 Comments, test reports, reviews and flames highly welcome.

The lazy-gtt is just missing a guard to ensure the buffer is bound in
the global gtt before reading through those PTE (impacts other code to
avoid allocating mappable GTT space). The beauty is that it forced me to
grok the rest of the lazy-gtt, it's deceptively simple.

Aside from the lazy-gtt sufficiently speeding up command submission and
reopening an old wound that prevents me from usefully analysing
performance, the series is tested-by myself. I've so far reviewed
everything but the actual mechanics of the PDE, which is not saying much
;-)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Avoid reading through unaccessible PTEs during an error event

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 09:35:10AM +, Chris Wilson wrote:
 We need to sanity check that the buffer is actually bound into the
 mappable range of the GTT prior to reading it back through the GTT with
 the CPU. Fortuitously, the only buffers we have been interested in so
 far are constrained to be in the mappable region in order to handle
 potential relocations. However, this can be relaxed in future and given
 that the purpose is to read back following an error we should be extra
 careful and not assume everything is safe.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk

Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch

For the lazy-gtt binding we need to also check whether the ptes are
correct (they should because we pin buffers with relocations as mappable).
I'll add that additional paranoia to my series.
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: capture error_state also for stuck rings

2011-11-29 Thread Daniel Vetter
Since quite a while we also the basic output configuration in the
error_state, so it should contain enough information to diagnose
these MI_WAIT hangs.

Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_irq.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index f870958..8926bbe 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1708,6 +1708,7 @@ void i915_hangcheck_elapsed(unsigned long data)
dev_priv-last_instdone1 == instdone1) {
if (dev_priv-hangcheck_count++  1) {
DRM_ERROR(Hangcheck timer elapsed... GPU hung\n);
+   i915_handle_error(dev, true);
 
if (!IS_GEN2(dev)) {
/* Is the chip hanging on a WAIT_FOR_EVENT?
@@ -1715,7 +1716,6 @@ void i915_hangcheck_elapsed(unsigned long data)
 * and break the hang. This should work on
 * all but the second generation chipsets.
 */
-
if (kick_ring(dev_priv-ring[RCS]))
goto repeat;
 
@@ -1728,7 +1728,6 @@ void i915_hangcheck_elapsed(unsigned long data)
goto repeat;
}
 
-   i915_handle_error(dev, true);
return;
}
} else {
-- 
1.7.7.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: bind objects to the global gtt only when needed

2011-11-29 Thread Daniel Vetter
And track the existence of such a binding similar to the aliasing
ppgtt case. Speeds up binding/unbinding in the common case where we
only need a ppgtt binding (which is accessed in a cpu coherent fashion
by the gpu) and no gloabl gtt binding (which needs uc writes for the
ptes).

This patch just puts the required tracking in place.

v2: Check that global gtt mappings exist in the error_state capture
code (they should because we pin all relevant things as mappable).
Suggested by Chris Wilson.

Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_drv.h |1 +
 drivers/gpu/drm/i915/i915_gem.c |   12 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c |4 
 drivers/gpu/drm/i915/i915_irq.c |3 +++
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e026299..8b5c016 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -857,6 +857,7 @@ struct drm_i915_gem_object {
unsigned int cache_level:2;
 
unsigned int has_aliasing_ppgtt_mapping:1;
+   unsigned int has_global_gtt_mapping:1;
 
struct page **pages;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 234eae8..88e5c1b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1224,6 +1224,9 @@ int i915_gem_fault(struct vm_area_struct *vma, struct 
vm_fault *vmf)
goto unlock;
}
 
+   if (!obj-has_global_gtt_mapping)
+   i915_gem_gtt_bind_object(obj, obj-cache_level);
+
if (obj-tiling_mode == I915_TILING_NONE)
ret = i915_gem_object_put_fence(obj);
else
@@ -2133,7 +2136,8 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 
trace_i915_gem_object_unbind(obj);
 
-   i915_gem_gtt_unbind_object(obj);
+   if (obj-has_global_gtt_mapping)
+   i915_gem_gtt_unbind_object(obj);
if (obj-has_aliasing_ppgtt_mapping) {
i915_ppgtt_unbind_object(dev_priv-mm.aliasing_ppgtt, obj);
obj-has_aliasing_ppgtt_mapping = 0;
@@ -2986,7 +2990,8 @@ int i915_gem_object_set_cache_level(struct 
drm_i915_gem_object *obj,
return ret;
}
 
-   i915_gem_gtt_bind_object(obj, cache_level);
+   if (obj-has_global_gtt_mapping)
+   i915_gem_gtt_bind_object(obj, cache_level);
if (obj-has_aliasing_ppgtt_mapping)
i915_ppgtt_bind_object(dev_priv-mm.aliasing_ppgtt,
   obj, cache_level);
@@ -3369,6 +3374,9 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
return ret;
}
 
+   if (!obj-has_global_gtt_mapping  map_and_fenceable)
+   i915_gem_gtt_bind_object(obj, obj-cache_level);
+
if (obj-pin_count++ == 0) {
if (!obj-active)
list_move_tail(obj-mm_list,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index bfa500d..f437d4b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -388,12 +388,16 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object 
*obj,
   obj-base.size  PAGE_SHIFT,
   obj-pages,
   agp_type);
+
+   obj-has_global_gtt_mapping = 1;
 }
 
 void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 {
intel_gtt_clear_range(obj-gtt_space-start  PAGE_SHIFT,
  obj-base.size  PAGE_SHIFT);
+
+   obj-has_global_gtt_mapping = 0;
 }
 
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 65dc543..20f6e5a 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -711,6 +711,9 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
if (src == NULL || src-pages == NULL || !src-map_and_fenceable)
return NULL;
 
+   if (!src-has_global_gtt_mapping)
+   return NULL;
+
page_count = src-base.size / PAGE_SIZE;
 
dst = kmalloc(sizeof(*dst) + page_count * sizeof(u32 *), GFP_ATOMIC);
-- 
1.7.7.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: capture error_state also for stuck rings

2011-11-29 Thread Chris Wilson
On Tue, 29 Nov 2011 12:16:34 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
wrote:
 Since quite a while we also the basic output configuration in the
 error_state, so it should contain enough information to diagnose
 these MI_WAIT hangs.
 
 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
Reviewed-and-tested-by: Chris Wilson ch...@chris-wilson.co.uk

I've often used this for debugging, so about time it went upstream.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 1/4 v2] drm/i915: fix ELD writing for SandyBridge

2011-11-29 Thread Wu Fengguang
SandyBridge should be using the same register addresses as IvyBridge.

Signed-off-by: Wu Fengguang fengguang...@intel.com
---
 drivers/gpu/drm/i915/intel_display.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

Sorry, this moves some necessary changes from patch 2/4 to here.

--- linux.orig/drivers/gpu/drm/i915/intel_display.c 2011-11-28 
15:33:04.0 +0800
+++ linux/drivers/gpu/drm/i915/intel_display.c  2011-11-29 19:51:28.0 
+0800
@@ -5857,14 +5857,14 @@ static void ironlake_write_eld(struct dr
int aud_cntl_st;
int aud_cntrl_st2;
 
-   if (IS_IVYBRIDGE(connector-dev)) {
-   hdmiw_hdmiedid = GEN7_HDMIW_HDMIEDID_A;
-   aud_cntl_st = GEN7_AUD_CNTRL_ST_A;
-   aud_cntrl_st2 = GEN7_AUD_CNTRL_ST2;
-   } else {
+   if (HAS_PCH_IBX(connector-dev)) {
hdmiw_hdmiedid = GEN5_HDMIW_HDMIEDID_A;
aud_cntl_st = GEN5_AUD_CNTL_ST_A;
aud_cntrl_st2 = GEN5_AUD_CNTL_ST2;
+   } else {
+   hdmiw_hdmiedid = GEN7_HDMIW_HDMIEDID_A;
+   aud_cntl_st = GEN7_AUD_CNTRL_ST_A;
+   aud_cntrl_st2 = GEN7_AUD_CNTRL_ST2;
}
 
i = to_intel_crtc(crtc)-pipe;
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/4 v2] drm/i915: rename audio ELD registers

2011-11-29 Thread Wu Fengguang
Change the definitions from GEN5 to IBX as they aren't in the CPU and
some SNB systems actually shipped with IBX chipsets (or, at least that's
a supported configuration).

The GEN7_* register addresses actually take effect since GEN6 and should
be prefixed by CPT, the PCH code name.

Suggested-by: Keith Packard kei...@keithp.com
Signed-off-by: Wu Fengguang fengguang...@intel.com
---
 drivers/gpu/drm/i915/i915_reg.h  |   22 +++---
 drivers/gpu/drm/i915/intel_display.c |   22 +++---
 2 files changed, 22 insertions(+), 22 deletions(-)

--- linux.orig/drivers/gpu/drm/i915/i915_reg.h  2011-11-29 19:50:27.0 
+0800
+++ linux/drivers/gpu/drm/i915/i915_reg.h   2011-11-29 19:51:38.0 
+0800
@@ -3534,17 +3534,17 @@
 #define G4X_ELD_ACK(1  4)
 #define G4X_HDMIW_HDMIEDID 0x6210C
 
-#define GEN5_HDMIW_HDMIEDID_A  0xE2050
-#define GEN5_AUD_CNTL_ST_A 0xE20B4
-#define GEN5_ELD_BUFFER_SIZE   (0x1f  10)
-#define GEN5_ELD_ADDRESS   (0x1f  5)
-#define GEN5_ELD_ACK   (1  4)
-#define GEN5_AUD_CNTL_ST2  0xE20C0
-#define GEN5_ELD_VALIDB(1  0)
-#define GEN5_CP_READYB (1  1)
+#define IBX_HDMIW_HDMIEDID_A   0xE2050
+#define IBX_AUD_CNTL_ST_A  0xE20B4
+#define IBX_ELD_BUFFER_SIZE(0x1f  10)
+#define IBX_ELD_ADDRESS(0x1f  5)
+#define IBX_ELD_ACK(1  4)
+#define IBX_AUD_CNTL_ST2   0xE20C0
+#define IBX_ELD_VALIDB (1  0)
+#define IBX_CP_READYB  (1  1)
 
-#define GEN7_HDMIW_HDMIEDID_A  0xE5050
-#define GEN7_AUD_CNTRL_ST_A0xE50B4
-#define GEN7_AUD_CNTRL_ST2 0xE50C0
+#define CPT_HDMIW_HDMIEDID_A   0xE5050
+#define CPT_AUD_CNTL_ST_A  0xE50B4
+#define CPT_AUD_CNTRL_ST2  0xE50C0
 
 #endif /* _I915_REG_H_ */
--- linux.orig/drivers/gpu/drm/i915/intel_display.c 2011-11-29 
19:51:28.0 +0800
+++ linux/drivers/gpu/drm/i915/intel_display.c  2011-11-29 19:52:01.0 
+0800
@@ -5858,13 +5858,13 @@ static void ironlake_write_eld(struct dr
int aud_cntrl_st2;
 
if (HAS_PCH_IBX(connector-dev)) {
-   hdmiw_hdmiedid = GEN5_HDMIW_HDMIEDID_A;
-   aud_cntl_st = GEN5_AUD_CNTL_ST_A;
-   aud_cntrl_st2 = GEN5_AUD_CNTL_ST2;
+   hdmiw_hdmiedid = IBX_HDMIW_HDMIEDID_A;
+   aud_cntl_st = IBX_AUD_CNTL_ST_A;
+   aud_cntrl_st2 = IBX_AUD_CNTL_ST2;
} else {
-   hdmiw_hdmiedid = GEN7_HDMIW_HDMIEDID_A;
-   aud_cntl_st = GEN7_AUD_CNTRL_ST_A;
-   aud_cntrl_st2 = GEN7_AUD_CNTRL_ST2;
+   hdmiw_hdmiedid = CPT_HDMIW_HDMIEDID_A;
+   aud_cntl_st = CPT_AUD_CNTL_ST_A;
+   aud_cntrl_st2 = CPT_AUD_CNTRL_ST2;
}
 
i = to_intel_crtc(crtc)-pipe;
@@ -5878,12 +5878,12 @@ static void ironlake_write_eld(struct dr
if (!i) {
DRM_DEBUG_DRIVER(Audio directed to unknown port\n);
/* operate blindly on all ports */
-   eldv = GEN5_ELD_VALIDB;
-   eldv |= GEN5_ELD_VALIDB  4;
-   eldv |= GEN5_ELD_VALIDB  8;
+   eldv = IBX_ELD_VALIDB;
+   eldv |= IBX_ELD_VALIDB  4;
+   eldv |= IBX_ELD_VALIDB  8;
} else {
DRM_DEBUG_DRIVER(ELD on port %c\n, 'A' + i);
-   eldv = GEN5_ELD_VALIDB  ((i - 1) * 4);
+   eldv = IBX_ELD_VALIDB  ((i - 1) * 4);
}
 
i = I915_READ(aud_cntrl_st2);
@@ -5899,7 +5899,7 @@ static void ironlake_write_eld(struct dr
}
 
i = I915_READ(aud_cntl_st);
-   i = ~GEN5_ELD_ADDRESS;
+   i = ~IBX_ELD_ADDRESS;
I915_WRITE(aud_cntl_st, i);
 
len = min_t(uint8_t, eld[2], 21);   /* 84 bytes of hw ELD buffer */
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 1/2] drm/i915: enable deepest RC6 state

2011-11-29 Thread Eugeni Dodonov
This should allow even more energy saving when GPU is not in use.
According to the testing, this state results in around 0.1 - 0.4 W better
power usage.

No issues or regressions observed so far, but additional testing is
certainly welcome.

Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com
---
 drivers/gpu/drm/i915/intel_display.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 981b1f1..d1e5726 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7924,6 +7924,7 @@ void gen6_enable_rps(struct drm_i915_private *dev_priv)
 
if (i915_enable_rc6)
rc6_mask = GEN6_RC_CTL_RC6p_ENABLE |
+   GEN6_RC_CTL_RC6pp_ENABLE |
GEN6_RC_CTL_RC6_ENABLE;
 
I915_WRITE(GEN6_RC_CONTROL,
-- 
1.7.7.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/2] drm/i915: add debugfs interface to control rc6 and rps

2011-11-29 Thread Eugeni Dodonov
This allows to enable/disable rps and rc6 from userspace. This is
necessary for to have predictable results from hardware counters, and also
to provide a finer granularity over power control from userspace.

As an additional trick, we also change the value of i915_enable_rc6 by
using this value, to allow the module to know the latest status of rc6
requested by user.

Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com
---
 drivers/gpu/drm/i915/i915_debugfs.c  |  101 ++
 drivers/gpu/drm/i915/intel_display.c |2 +-
 drivers/gpu/drm/i915/intel_drv.h |3 +
 3 files changed, 105 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 4f40f1c..dffc998 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1340,6 +1340,79 @@ static const struct file_operations i915_wedged_fops = {
 };
 
 static int
+i915_enable_rc6_open(struct inode *inode,
+  struct file *filp)
+{
+   filp-private_data = inode-i_private;
+   return 0;
+}
+
+static ssize_t
+i915_enable_rc6_read(struct file *filp,
+  char __user *ubuf,
+  size_t max,
+  loff_t *ppos)
+{
+   char buf[80];
+   int len;
+
+   len = snprintf(buf, sizeof(buf),
+  %d\n, i915_enable_rc6);
+
+   if (len  sizeof(buf))
+   len = sizeof(buf);
+
+   return simple_read_from_buffer(ubuf, max, ppos, buf, len);
+}
+
+static ssize_t
+i915_enable_rc6_write(struct file *filp,
+ const char __user *ubuf,
+ size_t cnt,
+ loff_t *ppos)
+{
+   struct drm_device *dev = filp-private_data;
+   struct drm_i915_private *dev_priv = dev-dev_private;
+   char buf[20];
+   int val = -1;
+
+   if (cnt  0) {
+   if (cnt  sizeof(buf) - 1)
+   return -EINVAL;
+
+   if (copy_from_user(buf, ubuf, cnt))
+   return -EFAULT;
+   buf[cnt] = 0;
+
+   val = simple_strtol(buf, NULL, 0);
+   }
+
+   if (INTEL_INFO(dev)-gen  5)
+   return cnt;
+
+   DRM_DEBUG_DRIVER(Manually setting rps and rc6 status to %d\n, val);
+   i915_enable_rc6 = val;
+
+   if (val == 0) {
+   if (IS_IRONLAKE_M(dev))
+   ironlake_disable_rc6(dev);
+   else {
+   gen6_disable_rps(dev);
+   gen6_update_ring_freq(dev_priv);
+   }
+   } else {
+   if (IS_IRONLAKE_M(dev))
+   ironlake_disable_rc6(dev);
+   else {
+   gen6_enable_rps(dev_priv);
+   gen6_update_ring_freq(dev_priv);
+   }
+   }
+
+   return cnt;
+}
+
+static int
 i915_max_freq_open(struct inode *inode,
   struct file *filp)
 {
@@ -1401,6 +1474,14 @@ i915_max_freq_write(struct file *filp,
return cnt;
 }
 
+static const struct file_operations i915_enable_rc6_fops = {
+   .owner = THIS_MODULE,
+   .open = i915_enable_rc6_open,
+   .read = i915_enable_rc6_read,
+   .write = i915_enable_rc6_write,
+   .llseek = default_llseek,
+};
+
 static const struct file_operations i915_max_freq_fops = {
.owner = THIS_MODULE,
.open = i915_max_freq_open,
@@ -1605,6 +1686,21 @@ static int i915_max_freq_create(struct dentry *root, 
struct drm_minor *minor)
return drm_add_fake_info_node(minor, ent, i915_max_freq_fops);
 }
 
+static int i915_enable_rc6_create(struct dentry *root, struct drm_minor *minor)
+{
+   struct drm_device *dev = minor-dev;
+   struct dentry *ent;
+
+   ent = debugfs_create_file(i915_enable_rc6,
+ S_IRUGO | S_IWUSR,
+ root, dev,
+ i915_enable_rc6_fops);
+   if (IS_ERR(ent))
+   return PTR_ERR(ent);
+
+   return drm_add_fake_info_node(minor, ent, i915_enable_rc6_fops);
+}
+
 static int i915_cache_sharing_create(struct dentry *root, struct drm_minor 
*minor)
 {
struct drm_device *dev = minor-dev;
@@ -1676,6 +1772,9 @@ int i915_debugfs_init(struct drm_minor *minor)
ret = i915_max_freq_create(minor-debugfs_root, minor);
if (ret)
return ret;
+   ret = i915_enable_rc6_create(minor-debugfs_root, minor);
+   if (ret)
+   return ret;
ret = i915_cache_sharing_create(minor-debugfs_root, minor);
if (ret)
return ret;
@@ -1695,6 +1794,8 @@ void i915_debugfs_cleanup(struct drm_minor *minor)
 1, minor);
drm_debugfs_remove_files((struct drm_info_list *) i915_max_freq_fops,
 1, minor);
+   drm_debugfs_remove_files((struct drm_info_list *) i915_enable_rc6_fops,
+

Re: [Intel-gfx] [PATCH 1/2] drm/i915: enable deepest RC6 state

2011-11-29 Thread Chris Wilson
On Tue, 29 Nov 2011 10:55:04 -0200, Eugeni Dodonov eugeni.dodo...@intel.com 
wrote:
 This should allow even more energy saving when GPU is not in use.
 According to the testing, this state results in around 0.1 - 0.4 W better
 power usage.
 
 No issues or regressions observed so far, but additional testing is
 certainly welcome.

The docs I saw said not implemented; do not use. Do we have it on good
authority that this is safe and useful to enable? And doesn't it
require programming of more transition thresholds?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: add debugfs interface to control rc6 and rps

2011-11-29 Thread Chris Wilson
On Tue, 29 Nov 2011 10:55:05 -0200, Eugeni Dodonov eugeni.dodo...@intel.com 
wrote:
 This allows to enable/disable rps and rc6 from userspace. This is
 necessary for to have predictable results from hardware counters, and also
 to provide a finer granularity over power control from userspace.

Reading registers and counters is handled with i915_forcewake_user. So
the question is what benefit does htis give over module reload. It looks
far riskier...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: add debugfs interface to control rc6 and rps

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 01:05:03PM +, Chris Wilson wrote:
 On Tue, 29 Nov 2011 10:55:05 -0200, Eugeni Dodonov eugeni.dodo...@intel.com 
 wrote:
  This allows to enable/disable rps and rc6 from userspace. This is
  necessary for to have predictable results from hardware counters, and also
  to provide a finer granularity over power control from userspace.
 
 Reading registers and counters is handled with i915_forcewake_user. So
 the question is what benefit does htis give over module reload. It looks
 far riskier...

The observation architecture bspec says that we need to disable rc6,
otherwise we'll get garbage in the gpu perf counters. But imo I think this
should be handled in the kernel (we neeed to write tons of funny registers
to set things up anyway), at least for the time-based perf counter
sampling. For prototyping things with intel_reg_write disabling rc6 at
module load time seems sufficient to me.
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: add debugfs interface to control rc6 and rps

2011-11-29 Thread Eugeni Dodonov
On Tue, Nov 29, 2011 at 12:12, Daniel Vetter dan...@ffwll.ch wrote:

 On Tue, Nov 29, 2011 at 01:05:03PM +, Chris Wilson wrote:
  On Tue, 29 Nov 2011 10:55:05 -0200, Eugeni Dodonov 
 eugeni.dodo...@intel.com wrote:
   This allows to enable/disable rps and rc6 from userspace. This is
   necessary for to have predictable results from hardware counters, and
 also
   to provide a finer granularity over power control from userspace.
 
  Reading registers and counters is handled with i915_forcewake_user. So
  the question is what benefit does htis give over module reload. It looks
  far riskier...

 The observation architecture bspec says that we need to disable rc6,
 otherwise we'll get garbage in the gpu perf counters. But imo I think this
 should be handled in the kernel (we neeed to write tons of funny registers
 to set things up anyway), at least for the time-based perf counter
 sampling. For prototyping things with intel_reg_write disabling rc6 at
 module load time seems sufficient to me.


The major idea behind this was to provide additional facilities for
controlling power and performance-related details of the driver from
userspace, without module reloading or reboot.

Currently, both the GPU turbo and rc6/fbc operations are mostly autonomous,
there is not much we can influence on their behavior without a reboot or
module reload. So my idea was to have a way to toggle - and in some future,
ideally, fine-tune their behavior from userspace during the execution. For
GPU turbo, we can control it to some point from userspace via
i915_max_freq. And with this patch, we could also control the rps/rc6
behavior to some point as well.

For the perf counters, I thought on the following flow of execution with
this patch:
1. prev_val=$(cat /sys/kernel/debug/dri/0/i915_enable_rc6)
2. echo 0  /sys/kernel/debug/dri/0/i915_enable_rc6
3. do the perf counters aquisition and testing
4. echo $(prev_val)  /sys/kernel/debug/dri/0/i915_enable_rc6

Comments?

-- 
Eugeni Dodonov
 http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/15] [RFC] Forced throttling/scheduling

2011-11-29 Thread Eugeni Dodonov
On Sat, Nov 19, 2011 at 00:24, Ben Widawsky b...@bwidawsk.net wrote:

 This code is currently living on my personal git repo:
 git://people.freedesktop.org/~bwidawsk/drm-intelhttp://people.freedesktop.org/%7Ebwidawsk/drm-intelforced_throttling

 Since these are RFC, I haven't spent too much time cleaning things up.
 Feel free to comment on typos, comments you feel are missing, etc. I
 also haven't spent any time running the standard kernel tests, kmemleak
 and such - I intend to do so while these are reviewed here.

 There are two main scheduler types implemented in this patch. What I
 refer to as a fairness scheduler, and a batch scheduler. The fairness
 scheduler is currently implemented on batch granularity but that is not
 guaranteed going forward. The batch scheduler is a way to set batch
 limits per pid. Refer to the relevant patch for more details.

 It is my opinion that the fairness scheduler isn't terribly interesting
 except to prevent badly written, or malicious apps. For example, a 3d
 app which queues up a ton of work but never calls glxSwapBuffer.

 The batch scheduler is also somewhat uninteresting as the values it uses
 require proper tuning and will vary from system to system, and even then
 depending on what's currently running. But like the fairness one, this
 too has its applications.

 Most comments and feedback are welcome.


I am not that expert about how GPU scheduling works, but all the patches
make sense to me and I haven't found anything apparently wrong.

I have one feature request for patch 11, which is to list available
schedulers via a debugfs entry. For example:
# cat /sys/kernel/debug/dri/0/i915_available_schedulers
none *fair batch

or list them in the output of i915_scheduler, like:
# cat /sys/kernel/debug/dri/0/i915_scheduler
...
Scheduler: batch
Available schedulers: none fair batch

To see what schedulers are available for usage.

But besides this, for the series:
Tested-by: Eugeni Dodonov eugeni.dodo...@intel.com
Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: add debugfs interface to control rc6 and rps

2011-11-29 Thread Chris Wilson
On Tue, 29 Nov 2011 12:26:23 -0200, Eugeni Dodonov eug...@dodonov.net wrote:
 For the perf counters, I thought on the following flow of execution with
 this patch:
 1. prev_val=$(cat /sys/kernel/debug/dri/0/i915_enable_rc6)
 2. echo 0  /sys/kernel/debug/dri/0/i915_enable_rc6
 3. do the perf counters aquisition and testing
 4. echo $(prev_val)  /sys/kernel/debug/dri/0/i915_enable_rc6
 
 Comments?

debugfs should never be ABI. Development and debug testing is one thing,
but if we are serious about this, we need to start building an acceptable
user interface elsewhere in /sys.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/2] drm/i915: properly clflush pwrites to phys objects

2011-11-29 Thread Daniel Vetter
Usually results in (rare) cursor corruptions on platforms
requiring physically addressed cursors.

Note to the stable team: This requires the drm core patch
drm: add helper to clflush a virtual address range which
creates the helper used here.

Tested-and-reported-by: Bruno Prémont bonb...@linux-vserver.org
Cc: sta...@kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35460
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=21442
Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 39459d2..e395a7d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4123,6 +4123,7 @@ i915_gem_phys_pwrite(struct drm_device *dev,
return -EFAULT;
}
 
+   drm_clflush_virt_range(vaddr, args-size);
intel_gtt_chipset_flush();
return 0;
 }
-- 
1.7.7.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Only clear the GPU domains upon a successful finish

2011-11-29 Thread Chris Wilson
By clearing the GPU read domains before waiting upon the buffer, we run
the risk of the wait being interrupted and the domains prematurely
cleared. The next time we attempt to wait upon the buffer (after
userspace handles the signal), we believe that the buffer is idle and so
skip the wait.

There are a number of bugs across all generations which show signs of an
overly haste reuse of active buffers.

Such as:

  https://bugs.freedesktop.org/show_bug.cgi?id=29046
  https://bugs.freedesktop.org/show_bug.cgi?id=35863
  https://bugs.freedesktop.org/show_bug.cgi?id=38952
  https://bugs.freedesktop.org/show_bug.cgi?id=40282
  https://bugs.freedesktop.org/show_bug.cgi?id=41098
  https://bugs.freedesktop.org/show_bug.cgi?id=41102
  https://bugs.freedesktop.org/show_bug.cgi?id=41284
  https://bugs.freedesktop.org/show_bug.cgi?id=42141

A couple of those pre-date i915_gem_object_finish_gpu(), so may be
unrelated (such as a wild write from a userspace command buffer), but
this does look like a convincing cause for most of those bugs.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: sta...@kernel.org
---
 drivers/gpu/drm/i915/i915_gem.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8118942..593a64b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3121,10 +3121,13 @@ i915_gem_object_finish_gpu(struct drm_i915_gem_object 
*obj)
return ret;
}
 
+   ret = i915_gem_object_wait_rendering(obj);
+   if (ret)
+   return ret;
+
/* Ensure that we invalidate the GPU's caches and TLBs. */
obj-base.read_domains = ~I915_GEM_GPU_DOMAINS;
-
-   return i915_gem_object_wait_rendering(obj);
+   return 0;
 }
 
 /**
-- 
1.7.7.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Avoid using mappable space for relocation processing through the CPU

2011-11-29 Thread Chris Wilson
We try to avoid writing the relocations through the uncached GTT, if the
buffer is currently in the CPU write domain and so will be flushed out to
main memory afterwards anyway. Also on SandyBridge we can safely write
to the pages in cacheable memory, so long as the buffer is LLC mapped.
In either of these caches, we therefore do not need to force the
reallocation of the buffer into the mappable region of the GTT, reducing
the aperture pressure.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   24 +---
 1 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f71f26e..9b8d543 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -265,6 +265,12 @@ eb_destroy(struct eb_objects *eb)
kfree(eb);
 }
 
+static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
+{
+   return (obj-base.write_domain == I915_GEM_DOMAIN_CPU ||
+   obj-cache_level != I915_CACHE_NONE);
+}
+
 static int
 i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
   struct eb_objects *eb,
@@ -351,7 +357,7 @@ i915_gem_execbuffer_relocate_entry(struct 
drm_i915_gem_object *obj,
}
 
reloc-delta += target_offset;
-   if (obj-base.write_domain == I915_GEM_DOMAIN_CPU) {
+   if (use_cpu_reloc(obj)) {
uint32_t page_offset = reloc-offset  ~PAGE_MASK;
char *vaddr;
 
@@ -463,6 +469,13 @@ i915_gem_execbuffer_relocate(struct drm_device *dev,
 #define  __EXEC_OBJECT_HAS_FENCE (131)
 
 static int
+need_reloc_mappable(struct drm_i915_gem_object *obj)
+{
+   struct drm_i915_gem_exec_object2 *entry = obj-exec_entry;
+   return entry-relocation_count  !use_cpu_reloc(obj);
+}
+
+static int
 pin_and_fence_object(struct drm_i915_gem_object *obj,
 struct intel_ring_buffer *ring)
 {
@@ -475,8 +488,7 @@ pin_and_fence_object(struct drm_i915_gem_object *obj,
has_fenced_gpu_access 
entry-flags  EXEC_OBJECT_NEEDS_FENCE 
obj-tiling_mode != I915_TILING_NONE;
-   need_mappable =
-   entry-relocation_count ? true : need_fence;
+   need_mappable = need_fence || need_reloc_mappable(obj);
 
ret = i915_gem_object_pin(obj, entry-alignment, need_mappable);
if (ret)
@@ -532,8 +544,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
has_fenced_gpu_access 
entry-flags  EXEC_OBJECT_NEEDS_FENCE 
obj-tiling_mode != I915_TILING_NONE;
-   need_mappable =
-   entry-relocation_count ? true : need_fence;
+   need_mappable = need_fence || need_reloc_mappable(obj);
 
if (need_mappable)
list_move(obj-exec_list, ordered_objects);
@@ -573,8 +584,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
has_fenced_gpu_access 
entry-flags  EXEC_OBJECT_NEEDS_FENCE 
obj-tiling_mode != I915_TILING_NONE;
-   need_mappable =
-   entry-relocation_count ? true : need_fence;
+   need_mappable = need_fence || need_reloc_mappable(obj);
 
if ((entry-alignment  obj-gtt_offset  
(entry-alignment - 1)) ||
(need_mappable  !obj-map_and_fenceable))
-- 
1.7.7.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Avoid using mappable space for relocation processing through the CPU

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 03:12:40PM +, Chris Wilson wrote:
 We try to avoid writing the relocations through the uncached GTT, if the
 buffer is currently in the CPU write domain and so will be flushed out to
 main memory afterwards anyway. Also on SandyBridge we can safely write
 to the pages in cacheable memory, so long as the buffer is LLC mapped.
 In either of these caches, we therefore do not need to force the
 reallocation of the buffer into the mappable region of the GTT, reducing
 the aperture pressure.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk

The error_state capture currently relies on us pinning buffers as mappable
when they contain relocations (and userspace always submitting a
batchbuffers containing relocations). You break that guarantee without
fixing up the error capture code. Otherwise I like this.
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: properly clflush pwrites to phys objects

2011-11-29 Thread Chris Wilson
On Tue, 29 Nov 2011 16:09:29 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
wrote:
 Usually results in (rare) cursor corruptions on platforms
 requiring physically addressed cursors.

So the phys cursor pages are set to WC upon creation, are we just
missing the mb()? Or more likely the CPUs don't have PAT and we are
being lazy in not detecting the error.

Anyway as this is reported to fix the issue on 8xx at least, overkill
is fine. :)

Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Only clear the GPU domains upon a successful finish

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 03:12:16PM +, Chris Wilson wrote:
 By clearing the GPU read domains before waiting upon the buffer, we run
 the risk of the wait being interrupted and the domains prematurely
 cleared. The next time we attempt to wait upon the buffer (after
 userspace handles the signal), we believe that the buffer is idle and so
 skip the wait.
 
 There are a number of bugs across all generations which show signs of an
 overly haste reuse of active buffers.
 
 Such as:
 
   https://bugs.freedesktop.org/show_bug.cgi?id=29046
   https://bugs.freedesktop.org/show_bug.cgi?id=35863
   https://bugs.freedesktop.org/show_bug.cgi?id=38952
   https://bugs.freedesktop.org/show_bug.cgi?id=40282
   https://bugs.freedesktop.org/show_bug.cgi?id=41098
   https://bugs.freedesktop.org/show_bug.cgi?id=41102
   https://bugs.freedesktop.org/show_bug.cgi?id=41284
   https://bugs.freedesktop.org/show_bug.cgi?id=42141
 
 A couple of those pre-date i915_gem_object_finish_gpu(), so may be
 unrelated (such as a wild write from a userspace command buffer), but
 this does look like a convincing cause for most of those bugs.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: sta...@kernel.org

Really nice catch!
Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Only clear the GPU domains upon a successful finish

2011-11-29 Thread Eugeni Dodonov
On Tue, Nov 29, 2011 at 13:12, Chris Wilson ch...@chris-wilson.co.ukwrote:

 By clearing the GPU read domains before waiting upon the buffer, we run
 the risk of the wait being interrupted and the domains prematurely
 cleared. The next time we attempt to wait upon the buffer (after
 userspace handles the signal), we believe that the buffer is idle and so
 skip the wait.

 There are a number of bugs across all generations which show signs of an
 overly haste reuse of active buffers.

 Such as:

  https://bugs.freedesktop.org/show_bug.cgi?id=29046
  https://bugs.freedesktop.org/show_bug.cgi?id=35863
  https://bugs.freedesktop.org/show_bug.cgi?id=38952
  https://bugs.freedesktop.org/show_bug.cgi?id=40282
  https://bugs.freedesktop.org/show_bug.cgi?id=41098
  https://bugs.freedesktop.org/show_bug.cgi?id=41102
  https://bugs.freedesktop.org/show_bug.cgi?id=41284
  https://bugs.freedesktop.org/show_bug.cgi?id=42141

 A couple of those pre-date i915_gem_object_finish_gpu(), so may be
 unrelated (such as a wild write from a userspace command buffer), but
 this does look like a convincing cause for most of those bugs.

 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: sta...@kernel.org


Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: multithreaded forcewake is an ivb+ feature

2011-11-29 Thread Eugeni Dodonov
On Mon, Nov 28, 2011 at 19:12, Daniel Vetter daniel.vet...@ffwll.ch wrote:

 Name the function accordingly. Suggested by Chris Wilson.

 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch


Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com

Such patches are so hard to review. The only ones harder than those are the
white space-related ones :).

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: properly clflush pwrites to phys objects

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 03:35:54PM +, Chris Wilson wrote:
 On Tue, 29 Nov 2011 16:09:29 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
 wrote:
  Usually results in (rare) cursor corruptions on platforms
  requiring physically addressed cursors.
 
 So the phys cursor pages are set to WC upon creation, are we just
 missing the mb()? Or more likely the CPUs don't have PAT and we are
 being lazy in not detecting the error.

Yes, on reconsidering the tested-by is from a pentium m, which has working
pat, and we do a wbinvd in the i8xx chipset flush, so I don't know anymore
how this patch actually works.

But it seems to indeed fix the issue for at least one reporter and cursor
update is about as far away from a perf critical path as possible, so who
cares about such minor quibbles, it works ;-)
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 3/4] drm/i915: drop register special-casing in forcewake

2011-11-29 Thread Eugeni Dodonov
On Mon, Nov 28, 2011 at 18:17, Daniel Vetter daniel.vet...@ffwll.ch wrote:

 We currently have 3 register for which we must not grab forcewake for:
 FORCEWAKE, FROCEWAKE_MT and ECOBUS.
 - FROCEWAKE is excluded in the NEEDS_FORCE_WAKE macro and accessed
  with _NOTRACE.
 - FROCEWAKE_MT is just accessed with _NOTRACE.


s/FROCEWAKE/FORCEWAKE/g :)

Besides this,

Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com

for the series.

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: capture error_state also for stuck rings

2011-11-29 Thread Eugeni Dodonov
On Tue, Nov 29, 2011 at 09:42, Chris Wilson ch...@chris-wilson.co.ukwrote:

 On Tue, 29 Nov 2011 12:16:34 +0100, Daniel Vetter daniel.vet...@ffwll.ch
 wrote:
  Since quite a while we also the basic output configuration in the
  error_state, so it should contain enough information to diagnose
  these MI_WAIT hangs.
 
  Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
 Reviewed-and-tested-by: Chris Wilson ch...@chris-wilson.co.uk


Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/2] drm/i915: re-enable semaphores by default

2011-11-29 Thread Daniel Vetter
On Tue, Nov 22, 2011 at 06:41, Andrew Lutomirski l...@mit.edu wrote:
 On Thu, Nov 17, 2011 at 1:22 AM, Daniel Vetter daniel.vet...@ffwll.ch wrote:
 Oops, sorry for wasting your time, wrong branch. Can you try the
 for-poland one? And also please try what happens when enabling the
 iommu.

 for-poland with semaphores, rc6, and iommu on seems to work.  I'm
 sending this email from that kernel right now :)

Just to double-check: Can you please retest my latest ppgtt branch available at:

http://cgit.freedesktop.org/~danvet/drm/log/?h=ppgtt

If that works for you with DMAR and semaphores enabled, a tested-by is
highly appreciated.

Thanks, Daniel
-- 
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Avoid using mappable space for relocation processing through the CPU

2011-11-29 Thread Chris Wilson
On Tue, 29 Nov 2011 16:34:41 +0100, Daniel Vetter dan...@ffwll.ch wrote:
 On Tue, Nov 29, 2011 at 03:12:40PM +, Chris Wilson wrote:
  We try to avoid writing the relocations through the uncached GTT, if the
  buffer is currently in the CPU write domain and so will be flushed out to
  main memory afterwards anyway. Also on SandyBridge we can safely write
  to the pages in cacheable memory, so long as the buffer is LLC mapped.
  In either of these caches, we therefore do not need to force the
  reallocation of the buffer into the mappable region of the GTT, reducing
  the aperture pressure.
  
  Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 
 The error_state capture currently relies on us pinning buffers as mappable
 when they contain relocations (and userspace always submitting a
 batchbuffers containing relocations). You break that guarantee without
 fixing up the error capture code. Otherwise I like this.

I may have sent that patch a little earlier. ;-)
That particular patch stands by itself since we already do use the full
GTT and so need defense against reading through unmappable PTEs because
having some of our errors, you can't be paranoid enough.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Avoid using mappable space for relocation processing through the CPU

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 04:48:15PM +, Chris Wilson wrote:
 On Tue, 29 Nov 2011 16:34:41 +0100, Daniel Vetter dan...@ffwll.ch wrote:
  On Tue, Nov 29, 2011 at 03:12:40PM +, Chris Wilson wrote:
   We try to avoid writing the relocations through the uncached GTT, if the
   buffer is currently in the CPU write domain and so will be flushed out to
   main memory afterwards anyway. Also on SandyBridge we can safely write
   to the pages in cacheable memory, so long as the buffer is LLC mapped.
   In either of these caches, we therefore do not need to force the
   reallocation of the buffer into the mappable region of the GTT, reducing
   the aperture pressure.
   
   Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
  
  The error_state capture currently relies on us pinning buffers as mappable
  when they contain relocations (and userspace always submitting a
  batchbuffers containing relocations). You break that guarantee without
  fixing up the error capture code. Otherwise I like this.
 
 I may have sent that patch a little earlier. ;-)

Yes, I know. My gripe is that this will reduce our chances of successfully
capturing the error_state, because now we expect to hit that case in the
error capture code whereas up to now it would have been a bug somewhere.
So either
- fixup the error_capture to fall back to cpu reads (needs the usual
  clflush dance if the object is not llc cached)
- or drop the pin mappable change in this patch.
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Avoid using mappable space for relocation processing through the CPU

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 06:03:53PM +0100, Daniel Vetter wrote:
 On Tue, Nov 29, 2011 at 04:48:15PM +, Chris Wilson wrote:
  On Tue, 29 Nov 2011 16:34:41 +0100, Daniel Vetter dan...@ffwll.ch wrote:
   On Tue, Nov 29, 2011 at 03:12:40PM +, Chris Wilson wrote:
We try to avoid writing the relocations through the uncached GTT, if the
buffer is currently in the CPU write domain and so will be flushed out 
to
main memory afterwards anyway. Also on SandyBridge we can safely write
to the pages in cacheable memory, so long as the buffer is LLC mapped.
In either of these caches, we therefore do not need to force the
reallocation of the buffer into the mappable region of the GTT, reducing
the aperture pressure.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
   
   The error_state capture currently relies on us pinning buffers as mappable
   when they contain relocations (and userspace always submitting a
   batchbuffers containing relocations). You break that guarantee without
   fixing up the error capture code. Otherwise I like this.
  
  I may have sent that patch a little earlier. ;-)
 
 Yes, I know. My gripe is that this will reduce our chances of successfully
 capturing the error_state, because now we expect to hit that case in the
 error capture code whereas up to now it would have been a bug somewhere.
 So either
 - fixup the error_capture to fall back to cpu reads (needs the usual
   clflush dance if the object is not llc cached)
 - or drop the pin mappable change in this patch.

After some irc discussion with Dave Airlie I think a simple wmb() to flush
the wc buffer might make more sense. I'll try to run that past testers and
gather results. Will take at least a week to get anything conclusive.
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Avoid using mappable space for relocation processing through the CPU

2011-11-29 Thread Chris Wilson
On Tue, 29 Nov 2011 18:03:53 +0100, Daniel Vetter dan...@ffwll.ch wrote:
 On Tue, Nov 29, 2011 at 04:48:15PM +, Chris Wilson wrote:
  On Tue, 29 Nov 2011 16:34:41 +0100, Daniel Vetter dan...@ffwll.ch wrote:
   On Tue, Nov 29, 2011 at 03:12:40PM +, Chris Wilson wrote:
We try to avoid writing the relocations through the uncached GTT, if the
buffer is currently in the CPU write domain and so will be flushed out 
to
main memory afterwards anyway. Also on SandyBridge we can safely write
to the pages in cacheable memory, so long as the buffer is LLC mapped.
In either of these caches, we therefore do not need to force the
reallocation of the buffer into the mappable region of the GTT, reducing
the aperture pressure.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
   
   The error_state capture currently relies on us pinning buffers as mappable
   when they contain relocations (and userspace always submitting a
   batchbuffers containing relocations). You break that guarantee without
   fixing up the error capture code. Otherwise I like this.
  
  I may have sent that patch a little earlier. ;-)
 
 Yes, I know. My gripe is that this will reduce our chances of successfully
 capturing the error_state, because now we expect to hit that case in the
 error capture code whereas up to now it would have been a bug somewhere.
 So either
 - fixup the error_capture to fall back to cpu reads (needs the usual
   clflush dance if the object is not llc cached)
 - or drop the pin mappable change in this patch.

Ah you forget, I volunteered you to do the error-state capture from a
workqueue so that we could add further complexity... 

We would then be able to allocate enough memory to capture auxiliary
buffers as well, etc.

In the meantime, the paths that hit this code are during warmup (before
any batches have been retired into the userspace bo cache), slow steady
state behaviour (when the caches are being reaped and repopulated), and
when thrashing the hardware. It also requires that the whole mappable
range had already been allocated.

Whilst not negligible, the risk imo is small and all will be solved with
the next generation i915_capture_error().
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: flush overlay regfile writes

2011-11-29 Thread Daniel Vetter
Better be paranoid. The wmb should flush the wc writes, and
the chipset_flush hopefully flushes any mch buffers. There've been a
few overlay hangs I've never really diagnosed, unfortunately all the
reporters disappeared.

Maybe-related: https://bugs.freedesktop.org/show_bug.cgi?id=33309
Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/intel_overlay.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c 
b/drivers/gpu/drm/i915/intel_overlay.c
index cdf17d4..f75e892 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -209,6 +209,9 @@ static void intel_overlay_unmap_regs(struct intel_overlay 
*overlay,
 {
if (!OVERLAY_NEEDS_PHYSICAL(overlay-dev))
io_mapping_unmap(regs);
+
+   wmb();
+   intel_gtt_chipset_flush();
 }
 
 static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
-- 
1.7.7.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: flush overlay regfile writes

2011-11-29 Thread Chris Wilson
On Tue, 29 Nov 2011 18:32:18 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
wrote:
 Better be paranoid. The wmb should flush the wc writes, and
 the chipset_flush hopefully flushes any mch buffers. There've been a
 few overlay hangs I've never really diagnosed, unfortunately all the
 reporters disappeared.

One of the worries I've had in the past has been whether we need a wmb()
inside intel_ring_begin() so that all the register writes (assuming we
get WC registers one day) and the WC gtt writes are coherent with
initiating work on the GPU. I attacked
i915_gem_object_flush_gtt_write_domain() instead in the belief that
would be sufficient. I think this demonstrates I was wrong. However, you
could argue that the overlay code is circumventing the cache domain
tracking of gem objects established for this very purpose. Though
calling i915_gem_object_set_to_gtt_domain(obj,true) in map and
i915_gem_object_flush_gtt_write_domain() in unmap might be overkill...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 03/11] drm/i915: initialization/teardown for the aliasing ppgtt

2011-11-29 Thread Ben Widawsky
On Mon, 28 Nov 2011 21:35:30 +0100
Daniel Vetter daniel.vet...@ffwll.ch wrote:

 This just adds the setup and teardown code for the ppgtt PDE and the
 last-level pagetables, which are fixed for the entire lifetime, at
 least for the moment.
 
 v2: Kill the stray debug printk noted by and improve the pte
 definitions as suggested by Chris Wilson.
 
 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
 ---
  drivers/gpu/drm/i915/i915_dma.c |   14 +++-
  drivers/gpu/drm/i915/i915_drv.h |   18 +
  drivers/gpu/drm/i915/i915_gem_gtt.c |  133 
 +++
  drivers/gpu/drm/i915/i915_reg.h |   18 +
  4 files changed, 181 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
 index fd617fb..9d9a92c 100644
 --- a/drivers/gpu/drm/i915/i915_dma.c
 +++ b/drivers/gpu/drm/i915/i915_dma.c
 @@ -1199,13 +1199,21 @@ static int i915_load_gem_init(struct drm_device *dev)
* at the last page of the aperture.  One page should be enough to
* keep any prefetching inside of the aperture.
*/
 - i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
 + i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE - 
 512*PAGE_SIZE);
 +
 + if (HAS_ALIASING_PPGTT(dev)) {
 + ret = i915_gem_init_aliasing_ppgtt(dev);
 + if (ret)
 + return ret;
 + }

Not sure if you fixed this based on our IRC conversation. The size
belongs in the HAS_ALIASING block. Also, I'd prefer NUM_PDEs or
something similar. Finally, if you put the PDEs at the top, couldn't we
get rid of the scratch page?

Ben
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Handle unmappable buffers during error state capture

2011-11-29 Thread Chris Wilson
As the buffer is not necessarily accessible through the GTT at the time
of a GPU hang, and capturing some of its contents is far more valuable
than skipping it, provide a clflushed fallback read path. We still
prefer to read through the GTT as that is more consistent with the GPU
access of the same buffer. So example it will demonstrate any errorneous
tiling or swizzling of the command buffer as seen by the GPU.

This becomes necessary with use of CPU relocations and lazy GTT binding,
but could potentially happen anyway as a result of a pathological error.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Daniel Vetter daniel.vet...@ffwll.ch
---

We continued discussing how best to handle this in the light of the
desirability of unmappable command buffers and decided that a CPU
fallback path was necessary to maintain the utility of the
i915_error_state and a prerequisite for LLC relocations.
-Chris

---
 drivers/gpu/drm/i915/i915_irq.c |   28 +++-
 1 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b40004b..08877a7 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -720,7 +720,6 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
reloc_offset = src-gtt_offset;
for (page = 0; page  page_count; page++) {
unsigned long flags;
-   void __iomem *s;
void *d;
 
d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
@@ -728,10 +727,29 @@ i915_error_object_create(struct drm_i915_private 
*dev_priv,
goto unwind;
 
local_irq_save(flags);
-   s = io_mapping_map_atomic_wc(dev_priv-mm.gtt_mapping,
-reloc_offset);
-   memcpy_fromio(d, s, PAGE_SIZE);
-   io_mapping_unmap_atomic(s);
+   if (reloc_offset  dev_priv-mm.gtt_mappable_end) {
+   void __iomem *s;
+
+   /* Simply ignore tiling or any overlapping fence.
+* It's part of the error state, and this hopefully
+* captures what the GPU read.
+*/
+
+   s = io_mapping_map_atomic_wc(dev_priv-mm.gtt_mapping,
+reloc_offset);
+   memcpy_fromio(d, s, PAGE_SIZE);
+   io_mapping_unmap_atomic(s);
+   } else {
+   void *s;
+
+   drm_clflush_pages(src-pages[page], 1);
+
+   s = kmap_atomic(src-pages[page]);
+   memcpy(d, s, PAGE_SIZE);
+   kunmap_atomic(s);
+
+   drm_clflush_pages(src-pages[page], 1);
+   }
local_irq_restore(flags);
 
dst-pages[page] = d;
-- 
1.7.7.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 03/11] drm/i915: initialization/teardown for the aliasing ppgtt

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 10:50:56AM -0800, Ben Widawsky wrote:
 On Mon, 28 Nov 2011 21:35:30 +0100
 Daniel Vetter daniel.vet...@ffwll.ch wrote:
 
  This just adds the setup and teardown code for the ppgtt PDE and the
  last-level pagetables, which are fixed for the entire lifetime, at
  least for the moment.
  
  v2: Kill the stray debug printk noted by and improve the pte
  definitions as suggested by Chris Wilson.
  
  Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
  ---
   drivers/gpu/drm/i915/i915_dma.c |   14 +++-
   drivers/gpu/drm/i915/i915_drv.h |   18 +
   drivers/gpu/drm/i915/i915_gem_gtt.c |  133 
  +++
   drivers/gpu/drm/i915/i915_reg.h |   18 +
   4 files changed, 181 insertions(+), 2 deletions(-)
  
  diff --git a/drivers/gpu/drm/i915/i915_dma.c 
  b/drivers/gpu/drm/i915/i915_dma.c
  index fd617fb..9d9a92c 100644
  --- a/drivers/gpu/drm/i915/i915_dma.c
  +++ b/drivers/gpu/drm/i915/i915_dma.c
  @@ -1199,13 +1199,21 @@ static int i915_load_gem_init(struct drm_device 
  *dev)
   * at the last page of the aperture.  One page should be enough to
   * keep any prefetching inside of the aperture.
   */
  -   i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
  +   i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE - 
  512*PAGE_SIZE);
  +
  +   if (HAS_ALIASING_PPGTT(dev)) {
  +   ret = i915_gem_init_aliasing_ppgtt(dev);
  +   if (ret)
  +   return ret;
  +   }
 
 Not sure if you fixed this based on our IRC conversation. The size
 belongs in the HAS_ALIASING block. Also, I'd prefer NUM_PDEs or
 something similar. Finally, if you put the PDEs at the top, couldn't we
 get rid of the scratch page?

Well, yet another reason this is about the wost patch series I've
submitted in the last few months. You're totally right, I'll move this in.
I'll keep the guard page though even for the ppgtt case, I feel safer with
the more paranoid choice and it's just 4kb of almost 2GB of gtt.
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: initialization/teardown for the aliasing ppgtt

2011-11-29 Thread Daniel Vetter
This just adds the setup and teardown code for the ppgtt PDE and the
last-level pagetables, which are fixed for the entire lifetime, at
least for the moment.

v2: Kill the stray debug printk noted by and improve the pte
definitions as suggested by Chris Wilson.

v3: Clean up the aperture stealing code as noted by Ben Widawsky.

Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_dma.c |   40 ---
 drivers/gpu/drm/i915/i915_drv.h |   18 +
 drivers/gpu/drm/i915/i915_gem_gtt.c |  133 +++
 drivers/gpu/drm/i915/i915_reg.h |   18 +
 4 files changed, 198 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index fd617fb..9c56284 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1190,22 +1190,38 @@ static int i915_load_gem_init(struct drm_device *dev)
/* Basic memrange allocator for stolen space */
drm_mm_init(dev_priv-mm.stolen, 0, prealloc_size);
 
-   /* Let GEM Manage all of the aperture.
-*
-* However, leave one page at the end still bound to the scratch page.
-* There are a number of places where the hardware apparently
-* prefetches past the end of the object, and we've seen multiple
-* hangs with the GPU head pointer stuck in a batchbuffer bound
-* at the last page of the aperture.  One page should be enough to
-* keep any prefetching inside of the aperture.
-*/
-   i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
+   if (HAS_ALIASING_PPGTT(dev)) {
+   /* PPGTT pdes are stolen from global gtt ptes, so shrink the
+* aperture accordingly when using aliasing ppgtt. For paranoia
+* keep the guard page in between. */
+   i915_gem_do_init(dev, 0, mappable_size,
+gtt_size - PAGE_SIZE
+- I915_PPGTT_PD_ENTRIES*PAGE_SIZE);
+
+   ret = i915_gem_init_aliasing_ppgtt(dev);
+   if (ret)
+   return ret;
+   } else {
+   /* Let GEM Manage all of the aperture.
+*
+* However, leave one page at the end still bound to the scratch
+* page.  There are a number of places where the hardware
+* apparently prefetches past the end of the object, and we've
+* seen multiple hangs with the GPU head pointer stuck in a
+* batchbuffer bound at the last page of the aperture.  One page
+* should be enough to keep any prefetching inside of the
+* aperture.
+*/
+   i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
+   }
 
mutex_lock(dev-struct_mutex);
ret = i915_gem_init_hw(dev);
mutex_unlock(dev-struct_mutex);
-   if (ret)
+   if (ret) {
+   i915_gem_cleanup_aliasing_ppgtt(dev);
return ret;
+   }
 
/* Try to set up FBC with a reasonable compressed buffer size */
if (I915_HAS_FBC(dev)  i915_powersave) {
@@ -1292,6 +1308,7 @@ cleanup_gem:
mutex_lock(dev-struct_mutex);
i915_gem_cleanup_ringbuffer(dev);
mutex_unlock(dev-struct_mutex);
+   i915_gem_cleanup_aliasing_ppgtt(dev);
 cleanup_vga_switcheroo:
vga_switcheroo_unregister_client(dev-pdev);
 cleanup_vga_client:
@@ -2179,6 +2196,7 @@ int i915_driver_unload(struct drm_device *dev)
i915_gem_free_all_phys_object(dev);
i915_gem_cleanup_ringbuffer(dev);
mutex_unlock(dev-struct_mutex);
+   i915_gem_cleanup_aliasing_ppgtt(dev);
if (I915_HAS_FBC(dev)  i915_powersave)
i915_cleanup_compression(dev);
drm_mm_takedown(dev_priv-mm.stolen);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bdbd6d8..8b6f1b7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -254,6 +254,16 @@ struct intel_device_info {
u8 has_blt_ring:1;
 };
 
+#define I915_PPGTT_PD_ENTRIES 512
+#define I915_PPGTT_PT_ENTRIES 1024
+struct i915_hw_ppgtt {
+   unsigned num_pd_entries;
+   struct page **pt_pages;
+   uint32_t pd_offset;
+   dma_addr_t *pt_dma_addr;
+   dma_addr_t scratch_page_dma_addr;
+};
+
 enum no_fbc_reason {
FBC_NO_OUTPUT, /* no outputs enabled to compress */
FBC_STOLEN_TOO_SMALL, /* not enough space to hold compressed buffers */
@@ -584,6 +594,9 @@ typedef struct drm_i915_private {
struct io_mapping *gtt_mapping;
int gtt_mtrr;
 
+   /** PPGTT used for aliasing the PPGTT with the GTT */
+   struct i915_hw_ppgtt *aliasing_ppgtt;
+
struct shrinker inactive_shrinker;
 
   

Re: [Intel-gfx] [PATCH] drm/i915: Handle unmappable buffers during error state capture

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 07:02:06PM +, Chris Wilson wrote:
 As the buffer is not necessarily accessible through the GTT at the time
 of a GPU hang, and capturing some of its contents is far more valuable
 than skipping it, provide a clflushed fallback read path. We still
 prefer to read through the GTT as that is more consistent with the GPU
 access of the same buffer. So example it will demonstrate any errorneous
 tiling or swizzling of the command buffer as seen by the GPU.
 
 This becomes necessary with use of CPU relocations and lazy GTT binding,
 but could potentially happen anyway as a result of a pathological error.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: Daniel Vetter daniel.vet...@ffwll.ch
Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Avoid using mappable space for relocation processing through the CPU

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 03:12:40PM +, Chris Wilson wrote:
 We try to avoid writing the relocations through the uncached GTT, if the
 buffer is currently in the CPU write domain and so will be flushed out to
 main memory afterwards anyway. Also on SandyBridge we can safely write
 to the pages in cacheable memory, so long as the buffer is LLC mapped.
 In either of these caches, we therefore do not need to force the
 reallocation of the buffer into the mappable region of the GTT, reducing
 the aperture pressure.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk

With the improved error_state capture code this is
Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Handle unmappable buffers during error state capture

2011-11-29 Thread Eugeni Dodonov
On Tue, Nov 29, 2011 at 17:02, Chris Wilson ch...@chris-wilson.co.ukwrote:

 As the buffer is not necessarily accessible through the GTT at the time
 of a GPU hang, and capturing some of its contents is far more valuable
 than skipping it, provide a clflushed fallback read path. We still
 prefer to read through the GTT as that is more consistent with the GPU
 access of the same buffer. So example it will demonstrate any errorneous
 tiling or swizzling of the command buffer as seen by the GPU.

 This becomes necessary with use of CPU relocations and lazy GTT binding,
 but could potentially happen anyway as a result of a pathological error.

 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: Daniel Vetter daniel.vet...@ffwll.ch


Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: initialization/teardown for the aliasing ppgtt

2011-11-29 Thread Daniel Vetter
This just adds the setup and teardown code for the ppgtt PDE and the
last-level pagetables, which are fixed for the entire lifetime, at
least for the moment.

v2: Kill the stray debug printk noted by and improve the pte
definitions as suggested by Chris Wilson.

v3: Clean up the aperture stealing code as noted by Ben Widawsky.

v4: Paint the init code in a more pleasing colour as suggest by Chris
Wilson.

Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_dma.c |   41 ---
 drivers/gpu/drm/i915/i915_drv.h |   18 +
 drivers/gpu/drm/i915/i915_gem_gtt.c |  133 +++
 drivers/gpu/drm/i915/i915_reg.h |   18 +
 4 files changed, 199 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index fd617fb..2cc0ba4 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1190,22 +1190,39 @@ static int i915_load_gem_init(struct drm_device *dev)
/* Basic memrange allocator for stolen space */
drm_mm_init(dev_priv-mm.stolen, 0, prealloc_size);
 
-   /* Let GEM Manage all of the aperture.
-*
-* However, leave one page at the end still bound to the scratch page.
-* There are a number of places where the hardware apparently
-* prefetches past the end of the object, and we've seen multiple
-* hangs with the GPU head pointer stuck in a batchbuffer bound
-* at the last page of the aperture.  One page should be enough to
-* keep any prefetching inside of the aperture.
-*/
-   i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
+   if (HAS_ALIASING_PPGTT(dev)) {
+   /* PPGTT pdes are stolen from global gtt ptes, so shrink the
+* aperture accordingly when using aliasing ppgtt. */
+   gtt_size -= I915_PPGTT_PD_ENTRIES*PAGE_SIZE;
+   /* For paranoia keep the guard page in between. */
+   gtt_size -= PAGE_SIZE;
+
+   i915_gem_do_init(dev, 0, mappable_size, gtt_size);
+
+   ret = i915_gem_init_aliasing_ppgtt(dev);
+   if (ret)
+   return ret;
+   } else {
+   /* Let GEM Manage all of the aperture.
+*
+* However, leave one page at the end still bound to the scratch
+* page.  There are a number of places where the hardware
+* apparently prefetches past the end of the object, and we've
+* seen multiple hangs with the GPU head pointer stuck in a
+* batchbuffer bound at the last page of the aperture.  One page
+* should be enough to keep any prefetching inside of the
+* aperture.
+*/
+   i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
+   }
 
mutex_lock(dev-struct_mutex);
ret = i915_gem_init_hw(dev);
mutex_unlock(dev-struct_mutex);
-   if (ret)
+   if (ret) {
+   i915_gem_cleanup_aliasing_ppgtt(dev);
return ret;
+   }
 
/* Try to set up FBC with a reasonable compressed buffer size */
if (I915_HAS_FBC(dev)  i915_powersave) {
@@ -1292,6 +1309,7 @@ cleanup_gem:
mutex_lock(dev-struct_mutex);
i915_gem_cleanup_ringbuffer(dev);
mutex_unlock(dev-struct_mutex);
+   i915_gem_cleanup_aliasing_ppgtt(dev);
 cleanup_vga_switcheroo:
vga_switcheroo_unregister_client(dev-pdev);
 cleanup_vga_client:
@@ -2179,6 +2197,7 @@ int i915_driver_unload(struct drm_device *dev)
i915_gem_free_all_phys_object(dev);
i915_gem_cleanup_ringbuffer(dev);
mutex_unlock(dev-struct_mutex);
+   i915_gem_cleanup_aliasing_ppgtt(dev);
if (I915_HAS_FBC(dev)  i915_powersave)
i915_cleanup_compression(dev);
drm_mm_takedown(dev_priv-mm.stolen);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bdbd6d8..8b6f1b7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -254,6 +254,16 @@ struct intel_device_info {
u8 has_blt_ring:1;
 };
 
+#define I915_PPGTT_PD_ENTRIES 512
+#define I915_PPGTT_PT_ENTRIES 1024
+struct i915_hw_ppgtt {
+   unsigned num_pd_entries;
+   struct page **pt_pages;
+   uint32_t pd_offset;
+   dma_addr_t *pt_dma_addr;
+   dma_addr_t scratch_page_dma_addr;
+};
+
 enum no_fbc_reason {
FBC_NO_OUTPUT, /* no outputs enabled to compress */
FBC_STOLEN_TOO_SMALL, /* not enough space to hold compressed buffers */
@@ -584,6 +594,9 @@ typedef struct drm_i915_private {
struct io_mapping *gtt_mapping;
int gtt_mtrr;
 
+   /** PPGTT used for aliasing the PPGTT with the GTT */
+   struct i915_hw_ppgtt *aliasing_ppgtt;
+

[Intel-gfx] [PATCH] drm/i915: initialization/teardown for the aliasing ppgtt

2011-11-29 Thread Daniel Vetter
This just adds the setup and teardown code for the ppgtt PDE and the
last-level pagetables, which are fixed for the entire lifetime, at
least for the moment.

v2: Kill the stray debug printk noted by and improve the pte
definitions as suggested by Chris Wilson.

v3: Clean up the aperture stealing code as noted by Ben Widawsky.

v4: Paint the init code in a more pleasing colour as suggest by Chris
Wilson.

v5: Explain the magic numbers noticed by Ben Widawsky.

Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_dma.c |   41 ---
 drivers/gpu/drm/i915/i915_drv.h |   18 +
 drivers/gpu/drm/i915/i915_gem_gtt.c |  139 +++
 drivers/gpu/drm/i915/i915_reg.h |   18 +
 4 files changed, 205 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index fd617fb..2cc0ba4 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1190,22 +1190,39 @@ static int i915_load_gem_init(struct drm_device *dev)
/* Basic memrange allocator for stolen space */
drm_mm_init(dev_priv-mm.stolen, 0, prealloc_size);
 
-   /* Let GEM Manage all of the aperture.
-*
-* However, leave one page at the end still bound to the scratch page.
-* There are a number of places where the hardware apparently
-* prefetches past the end of the object, and we've seen multiple
-* hangs with the GPU head pointer stuck in a batchbuffer bound
-* at the last page of the aperture.  One page should be enough to
-* keep any prefetching inside of the aperture.
-*/
-   i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
+   if (HAS_ALIASING_PPGTT(dev)) {
+   /* PPGTT pdes are stolen from global gtt ptes, so shrink the
+* aperture accordingly when using aliasing ppgtt. */
+   gtt_size -= I915_PPGTT_PD_ENTRIES*PAGE_SIZE;
+   /* For paranoia keep the guard page in between. */
+   gtt_size -= PAGE_SIZE;
+
+   i915_gem_do_init(dev, 0, mappable_size, gtt_size);
+
+   ret = i915_gem_init_aliasing_ppgtt(dev);
+   if (ret)
+   return ret;
+   } else {
+   /* Let GEM Manage all of the aperture.
+*
+* However, leave one page at the end still bound to the scratch
+* page.  There are a number of places where the hardware
+* apparently prefetches past the end of the object, and we've
+* seen multiple hangs with the GPU head pointer stuck in a
+* batchbuffer bound at the last page of the aperture.  One page
+* should be enough to keep any prefetching inside of the
+* aperture.
+*/
+   i915_gem_do_init(dev, 0, mappable_size, gtt_size - PAGE_SIZE);
+   }
 
mutex_lock(dev-struct_mutex);
ret = i915_gem_init_hw(dev);
mutex_unlock(dev-struct_mutex);
-   if (ret)
+   if (ret) {
+   i915_gem_cleanup_aliasing_ppgtt(dev);
return ret;
+   }
 
/* Try to set up FBC with a reasonable compressed buffer size */
if (I915_HAS_FBC(dev)  i915_powersave) {
@@ -1292,6 +1309,7 @@ cleanup_gem:
mutex_lock(dev-struct_mutex);
i915_gem_cleanup_ringbuffer(dev);
mutex_unlock(dev-struct_mutex);
+   i915_gem_cleanup_aliasing_ppgtt(dev);
 cleanup_vga_switcheroo:
vga_switcheroo_unregister_client(dev-pdev);
 cleanup_vga_client:
@@ -2179,6 +2197,7 @@ int i915_driver_unload(struct drm_device *dev)
i915_gem_free_all_phys_object(dev);
i915_gem_cleanup_ringbuffer(dev);
mutex_unlock(dev-struct_mutex);
+   i915_gem_cleanup_aliasing_ppgtt(dev);
if (I915_HAS_FBC(dev)  i915_powersave)
i915_cleanup_compression(dev);
drm_mm_takedown(dev_priv-mm.stolen);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bdbd6d8..8b6f1b7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -254,6 +254,16 @@ struct intel_device_info {
u8 has_blt_ring:1;
 };
 
+#define I915_PPGTT_PD_ENTRIES 512
+#define I915_PPGTT_PT_ENTRIES 1024
+struct i915_hw_ppgtt {
+   unsigned num_pd_entries;
+   struct page **pt_pages;
+   uint32_t pd_offset;
+   dma_addr_t *pt_dma_addr;
+   dma_addr_t scratch_page_dma_addr;
+};
+
 enum no_fbc_reason {
FBC_NO_OUTPUT, /* no outputs enabled to compress */
FBC_STOLEN_TOO_SMALL, /* not enough space to hold compressed buffers */
@@ -584,6 +594,9 @@ typedef struct drm_i915_private {
struct io_mapping *gtt_mapping;
int gtt_mtrr;
 
+   /** PPGTT used for aliasing the PPGTT with the GTT */

Re: [Intel-gfx] [PATCH] drm/i915: Avoid using mappable space for relocation processing through the CPU

2011-11-29 Thread Ben Widawsky
On Tue, Nov 29, 2011 at 03:12:40PM +, Chris Wilson wrote:
 We try to avoid writing the relocations through the uncached GTT, if the
 buffer is currently in the CPU write domain and so will be flushed out to
 main memory afterwards anyway. Also on SandyBridge we can safely write
 to the pages in cacheable memory, so long as the buffer is LLC mapped.
 In either of these caches, we therefore do not need to force the
 reallocation of the buffer into the mappable region of the GTT, reducing
 the aperture pressure.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Reviewed-by: Ben Widawsky b...@bwidawsk.net
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/2] drm/i915: enable deepest RC6 state

2011-11-29 Thread Ben Widawsky
On Tue, Nov 29, 2011 at 01:01:31PM +, Chris Wilson wrote:
 On Tue, 29 Nov 2011 10:55:04 -0200, Eugeni Dodonov eugeni.dodo...@intel.com 
 wrote:
  This should allow even more energy saving when GPU is not in use.
  According to the testing, this state results in around 0.1 - 0.4 W better
  power usage.
  
  No issues or regressions observed so far, but additional testing is
  certainly welcome.
 
 The docs I saw said not implemented; do not use. Do we have it on good
 authority that this is safe and useful to enable? And doesn't it
 require programming of more transition thresholds?
 -Chris

Yes, I think we do have to program more stuff to make this work. Perhaps
the BIOS puts in decent values for these registers though? We would have
to restore those on reset and resume I'd guess. If BIOS doesn't use
anything there, you probably aren't even entering these states.

Also, for posterity, there are 3 rc6 states, rc6, deep rc6, and deepest
rc6. I think deepest rc6 was recommended to avoid (though I don't recall
a specific root caused issue, just some data from Jesse, from the
windows team that it didn't seem stable). And I think deepest rc6 is
also referred to as rc7 sometimes.

IIRC, Jesse had patches for deep rc6, but it was very minimal
performance gain. Someone should check if the windows driver uses it.

Ben
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: remove the i915_batchbuffer_info debugfs file

2011-11-29 Thread Daniel Vetter
With the error_state facility in place, this has outlived it's
usefulness. It also oopses with the lates llc-reloc patches because
it directly access objects through the gtt without any checks.

Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_debugfs.c |   40 ---
 1 files changed, 0 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 919f072..84c7295 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -562,45 +562,6 @@ static int i915_hws_info(struct seq_file *m, void *data)
return 0;
 }
 
-static void i915_dump_object(struct seq_file *m,
-struct io_mapping *mapping,
-struct drm_i915_gem_object *obj)
-{
-   int page, page_count, i;
-
-   page_count = obj-base.size / PAGE_SIZE;
-   for (page = 0; page  page_count; page++) {
-   u32 *mem = io_mapping_map_wc(mapping,
-obj-gtt_offset + page * 
PAGE_SIZE);
-   for (i = 0; i  PAGE_SIZE; i += 4)
-   seq_printf(m, %08x :  %08x\n, i, mem[i / 4]);
-   io_mapping_unmap(mem);
-   }
-}
-
-static int i915_batchbuffer_info(struct seq_file *m, void *data)
-{
-   struct drm_info_node *node = (struct drm_info_node *) m-private;
-   struct drm_device *dev = node-minor-dev;
-   drm_i915_private_t *dev_priv = dev-dev_private;
-   struct drm_i915_gem_object *obj;
-   int ret;
-
-   ret = mutex_lock_interruptible(dev-struct_mutex);
-   if (ret)
-   return ret;
-
-   list_for_each_entry(obj, dev_priv-mm.active_list, mm_list) {
-   if (obj-base.read_domains  I915_GEM_DOMAIN_COMMAND) {
-   seq_printf(m, --- gtt_offset = 0x%08x\n, obj-gtt_offset);
-   i915_dump_object(m, dev_priv-mm.gtt_mapping, obj);
-   }
-   }
-
-   mutex_unlock(dev-struct_mutex);
-   return 0;
-}
-
 static int i915_ringbuffer_data(struct seq_file *m, void *data)
 {
struct drm_info_node *node = (struct drm_info_node *) m-private;
@@ -1823,7 +1784,6 @@ static struct drm_info_list i915_debugfs_list[] = {
{i915_bsd_ringbuffer_info, i915_ringbuffer_info, 0, (void *)VCS},
{i915_blt_ringbuffer_data, i915_ringbuffer_data, 0, (void *)BCS},
{i915_blt_ringbuffer_info, i915_ringbuffer_info, 0, (void *)BCS},
-   {i915_batchbuffers, i915_batchbuffer_info, 0},
{i915_error_state, i915_error_state, 0},
{i915_rstdby_delays, i915_rstdby_delays, 0},
{i915_cur_delayinfo, i915_cur_delayinfo, 0},
-- 
1.7.6.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] Is MI_FLUSH_ENABLE bit 12?

2011-11-29 Thread Keith Packard

Just reading through vol1c.4 of the bspec  this evening and found something odd.

Bit 11 of MI_MODE is Invalidate UHPTR enable.
Bit 12 of MI_MODE is MI_FLUSH Enable

And, yet, in i915_reg.h:

#define MI_MODE 0x0209c
# define VS_TIMER_DISPATCH  (1  6)
# define MI_FLUSH_ENABLE(1  11)

Are we off-by-one on MI_FLUSH_ENABLE? Seems like this would cause
serious problems...

-- 
keith.pack...@intel.com


pgp6gLFj3Dxwh.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 03/11] drm/i915: initialization/teardown for the aliasing ppgtt

2011-11-29 Thread Ben Widawsky
On Mon, Nov 28, 2011 at 09:35:30PM +0100, Daniel Vetter wrote:
 This just adds the setup and teardown code for the ppgtt PDE and the
 last-level pagetables, which are fixed for the entire lifetime, at
 least for the moment.
 
 v2: Kill the stray debug printk noted by and improve the pte
 definitions as suggested by Chris Wilson.
 
 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
Crap... one more point

 +/* PPGTT support for Sandybdrige/Gen6 and later */
 +static void i915_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
 +unsigned first_entry,
 +unsigned num_entries)
 +{
 + int i, j;
 + uint32_t *pt_vaddr;
 + uint32_t scratch_pte;
 +
 + scratch_pte = GEN6_PTE_ADDR_ENCODE(ppgtt-scratch_page_dma_addr);
 + scratch_pte |= GEN6_PTE_VALID | GEN6_PTE_CACHE_LLC;
 +
 + for (i = 0; i  ppgtt-num_pd_entries; i++) {
 + pt_vaddr = kmap_atomic(ppgtt-pt_pages[i]);
 +
 + for (j = 0; j  I915_PPGTT_PT_ENTRIES; j++)
 + pt_vaddr[j] = scratch_pte;
 +
 + kunmap_atomic(pt_vaddr);
 + }
 +
 +}

Don't you want to clflush here (unless I missed it somewhere else).
Ideally I think you'd also flush the TLB with a PIPE_CONTROL, but I
guess so long as we have that bit set that always flushes we're okay on
that one... Still feel we need a clflush though.

Ben
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 04/11] drm/i915: ppgtt binding/unbinding support

2011-11-29 Thread Ben Widawsky
On Mon, Nov 28, 2011 at 09:35:31PM +0100, Daniel Vetter wrote:
 This adds support to bind/unbind objects and wires it up. Objects are
 only put into the ppgtt when necessary, i.e. at execbuf time.
 
 Objects are still unconditionally put into the global gtt.
 
 v2: Kill the quick hack and explicitly pass cache_level to ppgtt_bind
 like for the global gtt function. Noticed by Chris Wilson.
 
 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
 ---

 diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
 b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 index c918124..9c81cda 100644
 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 @@ -513,6 +513,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer 
 *ring,
   struct drm_file *file,
   struct list_head *objects)
  {
 + drm_i915_private_t *dev_priv = ring-dev-dev_private;
   struct drm_i915_gem_object *obj;
   int ret, retry;
   bool has_fenced_gpu_access = INTEL_INFO(ring-dev)-gen  4;
 @@ -621,6 +622,14 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer 
 *ring,
   }
  
   i915_gem_object_unpin(obj);
 +
 + /* ... and ensure ppgtt mapping exist if needed. */
 + if (dev_priv-mm.aliasing_ppgtt  
 !obj-has_aliasing_ppgtt_mapping) {
 + 
 i915_ppgtt_bind_object(dev_priv-mm.aliasing_ppgtt,
 +obj, obj-cache_level);
 +
 + obj-has_aliasing_ppgtt_mapping = 1;
 + }
   }
  
   if (ret != -ENOSPC || retry  1)
 diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
 b/drivers/gpu/drm/i915/i915_gem_gtt.c
 index bd9b520..061ae12 100644
 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
 +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
 @@ -34,22 +34,31 @@ static void i915_ppgtt_clear_range(struct i915_hw_ppgtt 
 *ppgtt,
  unsigned first_entry,
  unsigned num_entries)
  {
 - int i, j;
   uint32_t *pt_vaddr;
   uint32_t scratch_pte;
 + unsigned act_pd = first_entry / I915_PPGTT_PT_ENTRIES;
 + unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
 + unsigned last_pte, i;
  
   scratch_pte = GEN6_PTE_ADDR_ENCODE(ppgtt-scratch_page_dma_addr);
   scratch_pte |= GEN6_PTE_VALID | GEN6_PTE_CACHE_LLC;
  
 - for (i = 0; i  ppgtt-num_pd_entries; i++) {
 - pt_vaddr = kmap_atomic(ppgtt-pt_pages[i]);
 + while (num_entries) {
 + last_pte = first_pte + num_entries;
 + if (last_pte  I915_PPGTT_PT_ENTRIES)
 + last_pte = I915_PPGTT_PT_ENTRIES;
 +
 + pt_vaddr = kmap_atomic(ppgtt-pt_pages[act_pd]);
  
 - for (j = 0; j  I915_PPGTT_PT_ENTRIES; j++)
 - pt_vaddr[j] = scratch_pte;
 + for (i = first_pte; i  last_pte; i++)
 + pt_vaddr[i] = scratch_pte;
  
   kunmap_atomic(pt_vaddr);
 - }
  
 + num_entries -= last_pte - first_pte;
 + first_pte = 0;
 + act_pd++;
 + }
  }
  
  int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
 @@ -162,6 +171,131 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device 
 *dev)
   kfree(ppgtt);
  }
  
 +static void i915_ppgtt_insert_sg_entries(struct i915_hw_ppgtt *ppgtt,
 +  struct scatterlist *sg_list,
 +  unsigned sg_len,
 +  unsigned first_entry,
 +  uint32_t pte_flags)
 +{
 + uint32_t *pt_vaddr, pte;
 + unsigned act_pd = first_entry / I915_PPGTT_PT_ENTRIES;
 + unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
 + unsigned i, j, m, segment_len;
 + dma_addr_t page_addr;
 + struct scatterlist *sg;
 +
 + /* init sg walking */
 + sg = sg_list;
 + i = 0;
 + segment_len = sg_dma_len(sg)  PAGE_SHIFT;
 + m = 0;
 +
 + while (i  sg_len) {
 + pt_vaddr = kmap_atomic(ppgtt-pt_pages[act_pd]);
 +
 + for (j = first_pte; j  I915_PPGTT_PT_ENTRIES; j++) {
 + page_addr = sg_dma_address(sg) + (m  PAGE_SHIFT);
 + pte = GEN6_PTE_ADDR_ENCODE(page_addr);
 + pt_vaddr[j] = pte | pte_flags;
 +
 + /* grab the next page */
 + m++;
 + if (m == segment_len) {
 + sg = sg_next(sg);
 + i++;
 + if (i == sg_len)
 + break;
 +
 + segment_len = sg_dma_len(sg)  PAGE_SHIFT;
 + m = 0;
 + }
 + }
 +
 + kunmap_atomic(pt_vaddr);
 +

Re: [Intel-gfx] [PATCH 1/2] drm/i915: enable deepest RC6 state

2011-11-29 Thread Eugeni Dodonov
On Tue, Nov 29, 2011 at 19:42, Ben Widawsky b...@bwidawsk.net wrote:

 On Tue, Nov 29, 2011 at 01:01:31PM +, Chris Wilson wrote:
  On Tue, 29 Nov 2011 10:55:04 -0200, Eugeni Dodonov 
 eugeni.dodo...@intel.com wrote:
   This should allow even more energy saving when GPU is not in use.
   According to the testing, this state results in around 0.1 - 0.4 W
 better
   power usage.
  
   No issues or regressions observed so far, but additional testing is
   certainly welcome.
 
  The docs I saw said not implemented; do not use. Do we have it on good
  authority that this is safe and useful to enable? And doesn't it
  require programming of more transition thresholds?
  -Chris

 Yes, I think we do have to program more stuff to make this work. Perhaps
 the BIOS puts in decent values for these registers though? We would have
 to restore those on reset and resume I'd guess. If BIOS doesn't use
 anything there, you probably aren't even entering these states.

 Also, for posterity, there are 3 rc6 states, rc6, deep rc6, and deepest
 rc6. I think deepest rc6 was recommended to avoid (though I don't recall
 a specific root caused issue, just some data from Jesse, from the
 windows team that it didn't seem stable). And I think deepest rc6 is
 also referred to as rc7 sometimes.


We already setup the variables for both deep and deepest rc6 in our driver
(GEN6_RC6p_* and GEN6_RC6pp_*), but we weren't using this additional state
previously - if I understood the documentation and the code correctly, we
do enable plain rc6 and deep rc6 currently. I haven't found any indications
which would tell to avoid it in the latest docs, and I also haven't seen
any regressions or issues with it being enabled on any of the machines, so
I thought it would be worth trying that additional state as well.

From the testing which QA did for this patch, looks like we save between
0.1 and 0.4W when compared to what we had with i915_enable_rc6=1.

But in any case, this is all highly experimental and I'll do more testing
with it :).

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 03/11] drm/i915: initialization/teardown for the aliasing ppgtt

2011-11-29 Thread Ben Widawsky
On Mon, Nov 28, 2011 at 09:35:30PM +0100, Daniel Vetter wrote:
 This just adds the setup and teardown code for the ppgtt PDE and the
 last-level pagetables, which are fixed for the entire lifetime, at
 least for the moment.
 
 v2: Kill the stray debug printk noted by and improve the pte
 definitions as suggested by Chris Wilson.
 
 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
{...}
  
 +#define I915_PPGTT_PD_ENTRIES 512
 +#define I915_PPGTT_PT_ENTRIES 1024
Not to fond that we have this redefined as GEN6_PTES_PER_PD (which makes
more sense as far as naming goes IMO)

{...}

 +#define GEN6_PTES_PER_PD 1024

Ben
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/11] agp/intel-gtt: export the scratch page dma address

2011-11-29 Thread Ben Widawsky
On Mon, Nov 28, 2011 at 09:35:28PM +0100, Daniel Vetter wrote:
 To implement a PPGTT for drm/i915 that fully aliases the GTT, we also
 need to properly alias the scratch page.
 
 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
Reviewed-by: Ben Widawsky b...@bwidawsk.net
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 02/11] agp/intel-gtt: export the gtt pagetable iomapping

2011-11-29 Thread Ben Widawsky
On Mon, Nov 28, 2011 at 09:35:29PM +0100, Daniel Vetter wrote:
 We need this because ppgtt page directory entries need to be in the
 global gtt pagetable.
 
 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
I think it makes sense to export the size of the GTT as well, but I'm
good either way.
Reviewed-by: Ben Widawsky b...@bwidawsk.net
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 05/11] drm/i915: ppgtt register definitions

2011-11-29 Thread Ben Widawsky
On Mon, Nov 28, 2011 at 09:35:32PM +0100, Daniel Vetter wrote:
 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
 ---
  drivers/gpu/drm/i915/i915_reg.h |   18 ++
  1 files changed, 18 insertions(+), 0 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
 index e9f5dc8..7227446 100644
 --- a/drivers/gpu/drm/i915/i915_reg.h
 +++ b/drivers/gpu/drm/i915/i915_reg.h
 @@ -86,6 +86,13 @@
  #define   GEN6_MBC_SNPCR_LOW (221)
  #define   GEN6_MBC_SNPCR_MIN (321) /* only 1/16th of the cache is shared */
  
 +#define GEN6_MBCTL   0x0907c
 +#define   GEN6_MBCTL_ENABLE_BOOT_FETCH   (1  4)
 +#define   GEN6_MBCTL_CTX_FETCH_NEEDED(1  3)
 +#define   GEN6_MBCTL_BME_UPDATE_ENABLE   (1  2)
 +#define   GEN6_MBCTL_MAE_UPDATE_ENABLE   (1  1)
 +#define   GEN6_MBCTL_BOOT_FETCH_MECH (1  0)
 +

Maybe grep fail, but I don't see these registers used later in the
series.

  #define GEN6_GDRST   0x941c
  #define  GEN6_GRDOM_FULL (1  0)
  #define  GEN6_GRDOM_RENDER   (1  1)
 @@ -110,6 +117,16 @@
  
  #define GEN6_PTES_PER_PD 1024
  
 +#define RING_PP_DIR_BASE(ring)   ((ring)-mmio_base+0x228)
 +#define RING_PP_DIR_BASE_READ(ring)  ((ring)-mmio_base+0x518)
 +#define RING_PP_DIR_DCLV(ring)   ((ring)-mmio_base+0x220)
 +#define   PP_DIR_DCLV_2G 0x
 +
 +#define GAM_ECOCHK   0x4090
 +#define   ECOCHK_SNB_BIT (110)
 +#define   ECOCHK_PPGTT_CACHE64B  (0x33)
 +#define   ECOCHK_PPGTT_CACHE4B   (0x03)
 +
  /* VGA stuff */
  
  #define VGA_ST01_MDA 0x3ba
 @@ -422,6 +439,7 @@
  
  #define GFX_MODE 0x02520
  #define GFX_MODE_GEN70x0229c
 +#define RING_MODE_GEN7(ring) ((ring)-mmio_base+0x29c)
  #define   GFX_RUN_LIST_ENABLE(115)
  #define   GFX_TLB_INVALIDATE_ALWAYS  (113)
  #define   GFX_SURFACE_FAULT_ENABLE   (112)

Ben
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: remove the i915_batchbuffer_info debugfs file

2011-11-29 Thread Chris Wilson
On Wed, 30 Nov 2011 00:17:45 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
wrote:
 With the error_state facility in place, this has outlived it's
 usefulness. It also oopses with the lates llc-reloc patches because
 it directly access objects through the gtt without any checks.

This code has been useless for some time.
Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/11] drm/i915: ppgtt debugfs info

2011-11-29 Thread Ben Widawsky
On Mon, Nov 28, 2011 at 09:35:33PM +0100, Daniel Vetter wrote:
 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
I am fine with the patch, but I don't yet understand how useful this is.

Reviewed-by: Ben Widawsky b...@bwidawsk.net
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 08/11] drm/i915: enable ppgtt

2011-11-29 Thread Ben Widawsky
On Mon, Nov 28, 2011 at 10:24:52PM +0100, Daniel Vetter wrote:
 v2: Don't try to enable ppgtt on pre-snb.
 
 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
 ---
  drivers/gpu/drm/i915/i915_drv.c |2 ++
  drivers/gpu/drm/i915/i915_drv.h |1 +
  drivers/gpu/drm/i915/i915_gem.c |   39 
 +++
  3 files changed, 42 insertions(+), 0 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
 index f12b43e..c7cf881 100644
 --- a/drivers/gpu/drm/i915/i915_drv.c
 +++ b/drivers/gpu/drm/i915/i915_drv.c
 @@ -698,6 +698,8 @@ int i915_reset(struct drm_device *dev, u8 flags)
   if (HAS_BLT(dev))
   dev_priv-ring[BCS].init(dev_priv-ring[BCS]);
  
 + i915_gem_init_ppgtt(dev);
 +
   mutex_unlock(dev-struct_mutex);
   drm_irq_uninstall(dev);
   drm_mode_config_reset(dev);
 diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
 index 5db3b84..0d228d2 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -1228,6 +1228,7 @@ int __must_check i915_gem_object_set_domain(struct 
 drm_i915_gem_object *obj,
  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
  int __must_check i915_gem_init_hw(struct drm_device *dev);
  void i915_gem_init_swizzling(struct drm_device *dev);
 +void i915_gem_init_ppgtt(struct drm_device *dev);
  void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
  void i915_gem_do_init(struct drm_device *dev,
 unsigned long start,
 diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
 index 593aa60..e1d7852 100644
 --- a/drivers/gpu/drm/i915/i915_gem.c
 +++ b/drivers/gpu/drm/i915/i915_gem.c
 @@ -3772,6 +3772,43 @@ void i915_gem_init_swizzling(struct drm_device *dev)
DISP_TILE_SURFACE_SWIZZLING);
  
  }
 +
 +void i915_gem_init_ppgtt(struct drm_device *dev)
 +{
 + drm_i915_private_t *dev_priv = dev-dev_private;
 + uint32_t pd_offset;
 + struct intel_ring_buffer *ring;
 + int i;
 +
 + if (!HAS_ALIASING_PPGTT(dev))
 + return;
 +
 + pd_offset = dev_priv-mm.aliasing_ppgtt-pd_offset;
 + pd_offset /= 64; /* in cachelines, */
 + pd_offset = 16;
 +
 + if (INTEL_INFO(dev)-gen == 6) {
 + uint32_t ecochk = I915_READ(GAM_ECOCHK);
 + I915_WRITE(GAM_ECOCHK, ecochk | ECOCHK_SNB_BIT |
 +ECOCHK_PPGTT_CACHE64B);
 + I915_WRITE(GFX_MODE, GFX_MODE_ENABLE(GFX_PPGTT_ENABLE));
 + } else if (INTEL_INFO(dev)-gen = 7) {
 + I915_WRITE(GAM_ECOCHK, ECOCHK_PPGTT_CACHE64B);
 + /* GFX_MODE is per-ring on gen7+ */
 + }
 +
 + for (i = 0; i  I915_NUM_RINGS; i++) {
 + ring = dev_priv-ring[i];
 +
 + if (INTEL_INFO(dev)-gen = 7)
 + I915_WRITE(RING_MODE_GEN7(ring),
 +GFX_MODE_ENABLE(GFX_PPGTT_ENABLE));
 +
 + I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
 + I915_WRITE(RING_PP_DIR_BASE(ring), pd_offset);
 + }
 +}
 +
  int
  i915_gem_init_hw(struct drm_device *dev)
  {
 @@ -3798,6 +3835,8 @@ i915_gem_init_hw(struct drm_device *dev)
  
   dev_priv-next_seqno = 1;
  
 + i915_gem_init_ppgtt(dev);
 +
   return 0;
  
  cleanup_bsd_ring:
Maybe through some DRM_INFOs in here for later debugging to see if we
failed in setting up the page tables?

Also, does this work across resume and reset?

Ben
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 00/11] aliasing ppgtt support v2

2011-11-29 Thread Eugeni Dodonov
On Mon, Nov 28, 2011 at 18:35, Daniel Vetter daniel.vet...@ffwll.ch wrote:

 Hi all,

 Changes since the last submission:
 - fixed issues pointed out by Chris Wilson on irc.
 - fixed an oops on pre-snb, shame on me for that one.
 - added two new patches to only bind objects to the global gtt when
 required.
 - added a new patch so that userspace can find out whether ppgtt is on.
 This is
  required to use MI_STORE/LOAD commands correctly from userspace
 batchbuffers.
  Luckily no currently released userspace code depends on this, it's just
 prep
  work.

 Comments, test reports, reviews and flames highly welcome.


For the series (with the latest patches on your fd.o repo)

Tested-by: Eugeni Dodonov eugeni.dodo...@intel.com

I left it running on 2 SNBs and 1 IVB for couple of hours under different
set of workloads, and haven't seen any issues so far.

Also, after the latest round of fixes with Chris and Ben's comments, I
guess I can also give a
Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com

for the series, with one small bike shed comment for Patch 7 which I'll
reply there.

-- 
Eugeni Dodonov
 http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 07/11] drm/i915: per-ring fault reg

2011-11-29 Thread Eugeni Dodonov
On Mon, Nov 28, 2011 at 18:35, Daniel Vetter daniel.vet...@ffwll.ch wrote:

 +#define ERROR  0x40a0


You do not seem to use this, and it is the same value as ERROR_GEN6. Do we
need it? Or I misread something?

-- 
Eugeni Dodonov
 http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: remove the i915_batchbuffer_info debugfs file

2011-11-29 Thread Eugeni Dodonov
On Tue, Nov 29, 2011 at 21:17, Daniel Vetter daniel.vet...@ffwll.ch wrote:

 With the error_state facility in place, this has outlived it's
 usefulness. It also oopses with the lates llc-reloc patches because
 it directly access objects through the gtt without any checks.

 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch


Yep, I think that error_state does the job just fine now.

Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Is MI_FLUSH_ENABLE bit 12?

2011-11-29 Thread Eric Anholt
On Mon, 28 Nov 2011 18:48:04 -0800, Keith Packard kei...@keithp.com wrote:
Non-text part: multipart/mixed
Non-text part: multipart/signed
 
 Just reading through vol1c.4 of the bspec  this evening and found something 
 odd.
 
 Bit 11 of MI_MODE is Invalidate UHPTR enable.
 Bit 12 of MI_MODE is MI_FLUSH Enable
 
 And, yet, in i915_reg.h:
 
 #define MI_MODE   0x0209c
 # define VS_TIMER_DISPATCH(1  6)
 # define MI_FLUSH_ENABLE  (1  11)
 
 Are we off-by-one on MI_FLUSH_ENABLE? Seems like this would cause
 serious problems...

I think we are.  On the other hand, based on actual behavior plus
reading of simulator, I believe that the bit does nothing, regardless.


pgp8HsuP4gkZG.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Is MI_FLUSH_ENABLE bit 12?

2011-11-29 Thread Ben Widawsky
On Tue, Nov 29, 2011 at 04:47:57PM -0800, Eric Anholt wrote:
 On Mon, 28 Nov 2011 18:48:04 -0800, Keith Packard kei...@keithp.com wrote:
 Non-text part: multipart/mixed
 Non-text part: multipart/signed
  
  Just reading through vol1c.4 of the bspec  this evening and found something 
  odd.
  
  Bit 11 of MI_MODE is Invalidate UHPTR enable.
  Bit 12 of MI_MODE is MI_FLUSH Enable
  
  And, yet, in i915_reg.h:
  
  #define MI_MODE 0x0209c
  # define VS_TIMER_DISPATCH  (1  6)
  # define MI_FLUSH_ENABLE(1  11)
  
  Are we off-by-one on MI_FLUSH_ENABLE? Seems like this would cause
  serious problems...
 
 I think we are.  On the other hand, based on actual behavior plus
 reading of simulator, I believe that the bit does nothing, regardless.

I do not think so. We've (Chris, I, and perhaps Jesse?) been through
this excercise at least twice before, and both times resulted in hangs
when we switched to bit 12 on Ironlake, not sure about other platforms.

Ben


pgpyMIhTaEJkB.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: remove the i915_batchbuffer_info debugfs file

2011-11-29 Thread Ben Widawsky
On Wed, Nov 30, 2011 at 12:17:45AM +0100, Daniel Vetter wrote:
 With the error_state facility in place, this has outlived it's
 usefulness. It also oopses with the lates llc-reloc patches because
 it directly access objects through the gtt without any checks.
 
 Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
 ---
  drivers/gpu/drm/i915/i915_debugfs.c |   40 
 ---
  1 files changed, 0 insertions(+), 40 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
 b/drivers/gpu/drm/i915/i915_debugfs.c
 index 919f072..84c7295 100644
 --- a/drivers/gpu/drm/i915/i915_debugfs.c
 +++ b/drivers/gpu/drm/i915/i915_debugfs.c
 @@ -562,45 +562,6 @@ static int i915_hws_info(struct seq_file *m, void *data)
   return 0;
  }
  
 -static void i915_dump_object(struct seq_file *m,
 -  struct io_mapping *mapping,
 -  struct drm_i915_gem_object *obj)
 -{
 - int page, page_count, i;
 -
 - page_count = obj-base.size / PAGE_SIZE;
 - for (page = 0; page  page_count; page++) {
 - u32 *mem = io_mapping_map_wc(mapping,
 -  obj-gtt_offset + page * 
 PAGE_SIZE);
 - for (i = 0; i  PAGE_SIZE; i += 4)
 - seq_printf(m, %08x :  %08x\n, i, mem[i / 4]);
 - io_mapping_unmap(mem);
 - }
 -}

For some time I've wanted to turn this into a generic bo dumping debugfs entry.
I'd be sorry to see it go, but we can always resurrect it later.

The rest I don't care about.

 -
 -static int i915_batchbuffer_info(struct seq_file *m, void *data)
 -{
 - struct drm_info_node *node = (struct drm_info_node *) m-private;
 - struct drm_device *dev = node-minor-dev;
 - drm_i915_private_t *dev_priv = dev-dev_private;
 - struct drm_i915_gem_object *obj;
 - int ret;
 -
 - ret = mutex_lock_interruptible(dev-struct_mutex);
 - if (ret)
 - return ret;
 -
 - list_for_each_entry(obj, dev_priv-mm.active_list, mm_list) {
 - if (obj-base.read_domains  I915_GEM_DOMAIN_COMMAND) {
 - seq_printf(m, --- gtt_offset = 0x%08x\n, obj-gtt_offset);
 - i915_dump_object(m, dev_priv-mm.gtt_mapping, obj);
 - }
 - }
 -
 - mutex_unlock(dev-struct_mutex);
 - return 0;
 -}
 -
  static int i915_ringbuffer_data(struct seq_file *m, void *data)
  {
   struct drm_info_node *node = (struct drm_info_node *) m-private;
 @@ -1823,7 +1784,6 @@ static struct drm_info_list i915_debugfs_list[] = {
   {i915_bsd_ringbuffer_info, i915_ringbuffer_info, 0, (void *)VCS},
   {i915_blt_ringbuffer_data, i915_ringbuffer_data, 0, (void *)BCS},
   {i915_blt_ringbuffer_info, i915_ringbuffer_info, 0, (void *)BCS},
 - {i915_batchbuffers, i915_batchbuffer_info, 0},
   {i915_error_state, i915_error_state, 0},
   {i915_rstdby_delays, i915_rstdby_delays, 0},
   {i915_cur_delayinfo, i915_cur_delayinfo, 0},
 -- 
 1.7.6.3
 
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 05/11] drm/i915: ppgtt register definitions

2011-11-29 Thread Daniel Vetter
On Tue, Nov 29, 2011 at 03:57:41PM -0800, Ben Widawsky wrote:
 On Mon, Nov 28, 2011 at 09:35:32PM +0100, Daniel Vetter wrote:
  Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
  ---
   drivers/gpu/drm/i915/i915_reg.h |   18 ++
   1 files changed, 18 insertions(+), 0 deletions(-)
  
  diff --git a/drivers/gpu/drm/i915/i915_reg.h 
  b/drivers/gpu/drm/i915/i915_reg.h
  index e9f5dc8..7227446 100644
  --- a/drivers/gpu/drm/i915/i915_reg.h
  +++ b/drivers/gpu/drm/i915/i915_reg.h
  @@ -86,6 +86,13 @@
   #define   GEN6_MBC_SNPCR_LOW   (221)
   #define   GEN6_MBC_SNPCR_MIN   (321) /* only 1/16th of the cache is 
  shared */
   
  +#define GEN6_MBCTL 0x0907c
  +#define   GEN6_MBCTL_ENABLE_BOOT_FETCH (1  4)
  +#define   GEN6_MBCTL_CTX_FETCH_NEEDED  (1  3)
  +#define   GEN6_MBCTL_BME_UPDATE_ENABLE (1  2)
  +#define   GEN6_MBCTL_MAE_UPDATE_ENABLE (1  1)
  +#define   GEN6_MBCTL_BOOT_FETCH_MECH   (1  0)
  +
 
 Maybe grep fail, but I don't see these registers used later in the
 series.

They're not used. But both public snb docs and Bspec contain notices that
you need to frob the boot mode enable bit in here, which afaics doesn't
exist under that exact name. Also, frobbing any of these seems to just
reduce the expected lifetime of my gpus.

But in case anyone wants to try things out, it's imo good to include the
definitions (especially since public Docs don't explain anything about
this reg than that warning about ppgtt).
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx