[Intel-gfx] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Chad Versace
Chad Versace (2):

  xf86-video-intel
  dri: Do not tile stencil buffer

  src/intel_dri.c |   16 

  mesa
  intel: Fix stencil buffer to be W tiled

  src/mesa/drivers/dri/intel/intel_clear.c   |6 ++
  src/mesa/drivers/dri/intel/intel_context.c |9 ++-
  src/mesa/drivers/dri/intel/intel_fbo.c |   13 +++--
  src/mesa/drivers/dri/intel/intel_screen.h  |9 ++-
  src/mesa/drivers/dri/intel/intel_span.c|   85 

  5 files changed, 89 insertions(+), 33 deletions(-)

-- 
1.7.6

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Chad Versace
Until now, the stencil buffer was allocated as a Y tiled buffer, because
in several locations the PRM states that it is. However, it is actually
W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
4.5.2.1 W-Major Format:
W-Major Tile Format is used for separate stencil.

The GTT is incapable of W fencing, so we allocate the stencil buffer with
I915_TILING_NONE and decode the tile's layout in software.

This fix touches the following portions of code:
- In intel_allocate_renderbuffer_storage(), allocate the stencil
  buffer with I915_TILING_NONE.
- In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
  not tiled.
- In the stencil buffer's span functions, the tile's layout must be
  decoded in software.

This commit mutually depends on the xf86-video-intel commit
dri: Do not tile stencil buffer
Author: Chad Versace 
Date:   Mon Jul 18 00:38:00 2011 -0700

On Gen6 with separate stencil enabled, fixes the following Piglit tests:
bugs/fdo23670-drawpix_stencil
general/stencil-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
spec/EXT_packed_depth_stencil/readpixels-24_8

Note: This is a candidate for the 7.11 branch.

CC: Eric Anholt 
CC: Kenneth Graunke 
Signed-off-by: Chad Versace 
---
 src/mesa/drivers/dri/intel/intel_clear.c   |6 ++
 src/mesa/drivers/dri/intel/intel_context.c |9 ++-
 src/mesa/drivers/dri/intel/intel_fbo.c |   13 +++--
 src/mesa/drivers/dri/intel/intel_screen.h  |9 ++-
 src/mesa/drivers/dri/intel/intel_span.c|   85 
 5 files changed, 89 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_clear.c 
b/src/mesa/drivers/dri/intel/intel_clear.c
index dfca03c..5ab9873 100644
--- a/src/mesa/drivers/dri/intel/intel_clear.c
+++ b/src/mesa/drivers/dri/intel/intel_clear.c
@@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask)
 */
 tri_mask |= BUFFER_BIT_STENCIL;
  }
+else if (intel->has_separate_stencil &&
+  stencilRegion->tiling == I915_TILING_NONE) {
+   /* The stencil buffer is actually W tiled, which the hardware
+* cannot blit to. */
+   tri_mask |= BUFFER_BIT_STENCIL;
+}
  else {
 /* clearing all stencil bits, use blitting */
 blit_mask |= BUFFER_BIT_STENCIL;
diff --git a/src/mesa/drivers/dri/intel/intel_context.c 
b/src/mesa/drivers/dri/intel/intel_context.c
index 2ba1363..fe8be08 100644
--- a/src/mesa/drivers/dri/intel/intel_context.c
+++ b/src/mesa/drivers/dri/intel/intel_context.c
@@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context *intel,
   assert(stencil_rb->Base.Format == MESA_FORMAT_S8);
   assert(depth_rb && depth_rb->Base.Format == MESA_FORMAT_X8_Z24);
 
-  if (stencil_rb->region->tiling == I915_TILING_Y) {
+  if (stencil_rb->region->tiling == I915_TILING_NONE) {
+/*
+ * The stencil buffer is actually W tiled. The region's tiling is
+ * I915_TILING_NONE, however, because the GTT is incapable of W
+ * fencing.
+ */
 intel->intelScreen->dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE;
 return;
   } else {
@@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context *intel,
* Presently, however, no verification or clean up is necessary, and
* execution should not reach here. If the framebuffer still has a hiz
* region, then we have already set dri2_has_hiz to true after
-   * confirming above that the stencil buffer is Y tiled.
+   * confirming above that the stencil buffer is W tiled.
*/
   assert(0);
}
diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
b/src/mesa/drivers/dri/intel/intel_fbo.c
index 1669af2..507cc33 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -173,6 +173,9 @@ intel_alloc_renderbuffer_storage(struct gl_context * ctx

[Intel-gfx] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Chad Versace
Until now, the stencil buffer was allocated as a Y tiled buffer, because
in several locations the PRM states that it is. However, it is actually
W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
4.5.2.1 W-Major Format:
W-Major Tile Format is used for separate stencil.

The GTT is incapable of W fencing, so we allocate the stencil buffer with
I915_TILING_NONE and decode the tile's layout in software.

This fix touches the following portions of code:
- In intel_allocate_renderbuffer_storage(), allocate the stencil
  buffer with I915_TILING_NONE.
- In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
  not tiled.
- In the stencil buffer's span functions, the tile's layout must be
  decoded in software.

This commit mutually depends on the xf86-video-intel commit
dri: Do not tile stencil buffer
Author: Chad Versace 
Date:   Mon Jul 18 00:38:00 2011 -0700

On Gen6 with separate stencil enabled, fixes the following Piglit tests:
bugs/fdo23670-drawpix_stencil
general/stencil-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
spec/EXT_packed_depth_stencil/readpixels-24_8

Note: This is a candidate for the 7.11 branch.

CC: Eric Anholt 
CC: Kenneth Graunke 
CC: Ian Romancik 
Signed-off-by: Chad Versace 
---
 src/mesa/drivers/dri/intel/intel_clear.c   |6 ++
 src/mesa/drivers/dri/intel/intel_context.c |9 ++-
 src/mesa/drivers/dri/intel/intel_fbo.c |   12 ++--
 src/mesa/drivers/dri/intel/intel_screen.h  |9 ++-
 src/mesa/drivers/dri/intel/intel_span.c|   88 +---
 5 files changed, 93 insertions(+), 31 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_clear.c 
b/src/mesa/drivers/dri/intel/intel_clear.c
index dfca03c..5ab9873 100644
--- a/src/mesa/drivers/dri/intel/intel_clear.c
+++ b/src/mesa/drivers/dri/intel/intel_clear.c
@@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask)
 */
 tri_mask |= BUFFER_BIT_STENCIL;
  }
+else if (intel->has_separate_stencil &&
+  stencilRegion->tiling == I915_TILING_NONE) {
+   /* The stencil buffer is actually W tiled, which the hardware
+* cannot blit to. */
+   tri_mask |= BUFFER_BIT_STENCIL;
+}
  else {
 /* clearing all stencil bits, use blitting */
 blit_mask |= BUFFER_BIT_STENCIL;
diff --git a/src/mesa/drivers/dri/intel/intel_context.c 
b/src/mesa/drivers/dri/intel/intel_context.c
index 2ba1363..fe8be08 100644
--- a/src/mesa/drivers/dri/intel/intel_context.c
+++ b/src/mesa/drivers/dri/intel/intel_context.c
@@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context *intel,
   assert(stencil_rb->Base.Format == MESA_FORMAT_S8);
   assert(depth_rb && depth_rb->Base.Format == MESA_FORMAT_X8_Z24);
 
-  if (stencil_rb->region->tiling == I915_TILING_Y) {
+  if (stencil_rb->region->tiling == I915_TILING_NONE) {
+/*
+ * The stencil buffer is actually W tiled. The region's tiling is
+ * I915_TILING_NONE, however, because the GTT is incapable of W
+ * fencing.
+ */
 intel->intelScreen->dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE;
 return;
   } else {
@@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context *intel,
* Presently, however, no verification or clean up is necessary, and
* execution should not reach here. If the framebuffer still has a hiz
* region, then we have already set dri2_has_hiz to true after
-   * confirming above that the stencil buffer is Y tiled.
+   * confirming above that the stencil buffer is W tiled.
*/
   assert(0);
}
diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
b/src/mesa/drivers/dri/intel/intel_fbo.c
index 1669af2..2206391 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -173,6 +173,9 @@ intel_alloc_renderbuffer_storage(struct

Re: [Intel-gfx] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Paul Menzel
Am Montag, den 18.07.2011, 00:55 -0700 schrieb Chad Versace:

[…]

> diff --git a/src/mesa/drivers/dri/intel/intel_span.c 
> b/src/mesa/drivers/dri/intel/intel_span.c
> index 153803f..d306432 100644
> --- a/src/mesa/drivers/dri/intel/intel_span.c
> +++ b/src/mesa/drivers/dri/intel/intel_span.c
> @@ -131,38 +131,77 @@ intel_set_span_functions(struct intel_context *intel,
> int miny = 0; \
> int maxx = rb->Width; \
> int maxy = rb->Height;\
> -   int stride = rb->RowStride;   
> \
> -   uint8_t *buf = rb->Data;  \
> + \
> +   /*
> \
> +* Here we ignore rb->Data and rb->RowStride as set by\
> +* intelSpanRenderStart. Since intel_offset_S8 decodes the W tile \
> +* manually, the region's *real* base address and stride is   
> \
> +* required.  
> \
> +*/   
> \
> +   struct intel_renderbuffer *irb = intel_renderbuffer(rb);  \
> +   uint8_t *buf = irb->region->buffer->virtual;  
> \

There are alignment/white space issues above.

> +   unsigned stride = irb->region->pitch; \
> +   unsigned height = 2 * irb->region->height;
> \
> +   bool flip = rb->Name == 0;
> \

[…]


Thanks,

Paul


signature.asc
Description: This is a digitally signed message part
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Eric Anholt
On Mon, 18 Jul 2011 00:55:03 -0700, Chad Versace  wrote:
> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
> b/src/mesa/drivers/dri/intel/intel_fbo.c
> index 1669af2..507cc33 100644
> --- a/src/mesa/drivers/dri/intel/intel_fbo.c
> +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
> @@ -173,6 +173,9 @@ intel_alloc_renderbuffer_storage(struct gl_context * ctx, 
> struct gl_renderbuffer
>  
> if (irb->Base.Format == MESA_FORMAT_S8) {
>/*
> +   * The stencil buffer is W tiled. However, we request from the kernel a
> +   * non-tiled buffer because the GTT is incapable of W fencing.
> +   *
> * The stencil buffer has quirky pitch requirements.  From Vol 2a,
> * 11.5.6.2.1 3DSTATE_STENCIL_BUFFER, field "Surface Pitch":
> *The pitch must be set to 2x the value computed based on width, as
> @@ -180,14 +183,14 @@ intel_alloc_renderbuffer_storage(struct gl_context * 
> ctx, struct gl_renderbuffer
> * To accomplish this, we resort to the nasty hack of doubling the drm
> * region's cpp and halving its height.
> *
> -   * If we neglect to double the pitch, then drm_intel_gem_bo_map_gtt()
> -   * maps the memory incorrectly.
> +   * If we neglect to double the pitch, then render corruption occurs.
> */
>irb->region = intel_region_alloc(intel->intelScreen,
> -I915_TILING_Y,
> +I915_TILING_NONE,
>  cpp * 2,
> -width,
> -height / 2,
> +ALIGN(width, 64),
> +/* XXX: Maybe align to 128? */
> +ALIGN(height / 2, 64),
>  GL_TRUE);
>if (!irb->region)
>   return false;

This looks like it would fail on a buffer with height = 129.  Doesn't
seem like 128 pitch requirement would be needed -- has it been tested?

> diff --git a/src/mesa/drivers/dri/intel/intel_span.c 
> b/src/mesa/drivers/dri/intel/intel_span.c
> index 153803f..d306432 100644
> --- a/src/mesa/drivers/dri/intel/intel_span.c
> +++ b/src/mesa/drivers/dri/intel/intel_span.c
> @@ -131,38 +131,77 @@ intel_set_span_functions(struct intel_context *intel,
> int miny = 0; \
> int maxx = rb->Width; \
> int maxy = rb->Height;\
> -   int stride = rb->RowStride;   
> \
> -   uint8_t *buf = rb->Data;  \
> + \
> +   /*
> \
> +* Here we ignore rb->Data and rb->RowStride as set by\
> +* intelSpanRenderStart. Since intel_offset_S8 decodes the W tile \
> +* manually, the region's *real* base address and stride is   
> \
> +* required.  
> \
> +*/   
> \
> +   struct intel_renderbuffer *irb = intel_renderbuffer(rb);  \
> +   uint8_t *buf = irb->region->buffer->virtual;  
> \
> +   unsigned stride = irb->region->pitch; \
> +   unsigned height = 2 * irb->region->height;
> \
> +   bool flip = rb->Name == 0;
> \
>  
> -/* Don't flip y. */
>  #undef Y_FLIP
> -#define Y_FLIP(y) y
> +#define Y_FLIP(y) ((1 - flip) * y + flip * (height - 1 - y))

The flip is usually handled by a scale and bias variable, so that Y_FLIP
is ((y) * scale + bias).  I think it'll produce less code, since Y_FLIP
is used a lot.



pgpvujxxgyynD.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx