Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-19 Thread Chad Versace
On 07/18/2011 01:20 AM, Paul Menzel wrote:
> Am Montag, den 18.07.2011, 00:55 -0700 schrieb Chad Versace:
> There are alignment/white space issues above.
> 
>> +   unsigned stride = irb->region->pitch;\
>> +   unsigned height = 2 * irb->region->height;   
>> \
>> +   bool flip = rb->Name == 0;   
>> \
> 
> […]
> 
> 
> Thanks,
> 
> Paul
> 

Thanks. Fix whitespace in both patches.

-- 
Chad Versace
c...@chad-versace.us



signature.asc
Description: OpenPGP digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-19 Thread Eric Anholt
On Mon, 18 Jul 2011 17:00:54 -0700, Chad Versace  wrote:
> On 07/18/2011 08:57 AM, Eric Anholt wrote:
> > On Mon, 18 Jul 2011 00:55:03 -0700, Chad Versace  
> > wrote:
> >> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
> >> b/src/mesa/drivers/dri/intel/intel_fbo.c
> >> index 1669af2..507cc33 100644
> >> --- a/src/mesa/drivers/dri/intel/intel_fbo.c
> >> +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
> >> @@ -173,6 +173,9 @@ intel_alloc_renderbuffer_storage(struct gl_context * 
> >> ctx, struct gl_renderbuffer
> >>  
> >> if (irb->Base.Format == MESA_FORMAT_S8) {
> >>/*
> >> +   * The stencil buffer is W tiled. However, we request from the 
> >> kernel a
> >> +   * non-tiled buffer because the GTT is incapable of W fencing.
> >> +   *
> >> * The stencil buffer has quirky pitch requirements.  From Vol 2a,
> >> * 11.5.6.2.1 3DSTATE_STENCIL_BUFFER, field "Surface Pitch":
> >> *The pitch must be set to 2x the value computed based on 
> >> width, as
> >> @@ -180,14 +183,14 @@ intel_alloc_renderbuffer_storage(struct gl_context * 
> >> ctx, struct gl_renderbuffer
> >> * To accomplish this, we resort to the nasty hack of doubling the 
> >> drm
> >> * region's cpp and halving its height.
> >> *
> >> -   * If we neglect to double the pitch, then 
> >> drm_intel_gem_bo_map_gtt()
> >> -   * maps the memory incorrectly.
> >> +   * If we neglect to double the pitch, then render corruption occurs.
> >> */
> >>irb->region = intel_region_alloc(intel->intelScreen,
> >> - I915_TILING_Y,
> >> + I915_TILING_NONE,
> >>   cpp * 2,
> >> - width,
> >> - height / 2,
> >> + ALIGN(width, 64),
> >> + /* XXX: Maybe align to 128? */
> >> + ALIGN(height / 2, 64),
> >>   GL_TRUE);
> >>if (!irb->region)
> >>return false;
> > 
> > This looks like it would fail on a buffer with height = 129.
> 
> Oops. It does fail for height = 129. Accordingly, I've changed this hunk to:
> 
>irb->region = intel_region_alloc(intel->intelScreen,
> -I915_TILING_Y,
> +I915_TILING_NONE,
>  cpp * 2,
> -width,
> -height / 2,
> +ALIGN(width, 64),
> +ALIGN((height + 1)/ 2, 64),
>  GL_TRUE);
> 
> And changed the same line in xf86-video-intel too.
> 
> > Doesn't
> > seem like 128 pitch requirement would be needed -- has it been tested?
> 
> The XXX was left in the patch accidentally. 128-alignment has been tested, 
> and was
> found unnecessary.

Looks good to me, gets the Reviewed-by: Eric Anholt 


pgpZ8dm2JCZZt.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Chad Versace
On 07/18/2011 08:57 AM, Eric Anholt wrote:
> On Mon, 18 Jul 2011 00:55:03 -0700, Chad Versace  wrote:
>> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
>> b/src/mesa/drivers/dri/intel/intel_fbo.c
>> index 1669af2..507cc33 100644
>> --- a/src/mesa/drivers/dri/intel/intel_fbo.c
>> +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
>> @@ -173,6 +173,9 @@ intel_alloc_renderbuffer_storage(struct gl_context * 
>> ctx, struct gl_renderbuffer
>>  
>> if (irb->Base.Format == MESA_FORMAT_S8) {
>>/*
>> +   * The stencil buffer is W tiled. However, we request from the kernel 
>> a
>> +   * non-tiled buffer because the GTT is incapable of W fencing.
>> +   *
>> * The stencil buffer has quirky pitch requirements.  From Vol 2a,
>> * 11.5.6.2.1 3DSTATE_STENCIL_BUFFER, field "Surface Pitch":
>> *The pitch must be set to 2x the value computed based on width, 
>> as
>> @@ -180,14 +183,14 @@ intel_alloc_renderbuffer_storage(struct gl_context * 
>> ctx, struct gl_renderbuffer
>> * To accomplish this, we resort to the nasty hack of doubling the drm
>> * region's cpp and halving its height.
>> *
>> -   * If we neglect to double the pitch, then drm_intel_gem_bo_map_gtt()
>> -   * maps the memory incorrectly.
>> +   * If we neglect to double the pitch, then render corruption occurs.
>> */
>>irb->region = intel_region_alloc(intel->intelScreen,
>> -   I915_TILING_Y,
>> +   I915_TILING_NONE,
>> cpp * 2,
>> -   width,
>> -   height / 2,
>> +   ALIGN(width, 64),
>> +   /* XXX: Maybe align to 128? */
>> +   ALIGN(height / 2, 64),
>> GL_TRUE);
>>if (!irb->region)
>>  return false;
> 
> This looks like it would fail on a buffer with height = 129.

Oops. It does fail for height = 129. Accordingly, I've changed this hunk to:

   irb->region = intel_region_alloc(intel->intelScreen,
-  I915_TILING_Y,
+  I915_TILING_NONE,
   cpp * 2,
-  width,
-  height / 2,
+  ALIGN(width, 64),
+  ALIGN((height + 1)/ 2, 64),
   GL_TRUE);

And changed the same line in xf86-video-intel too.

> Doesn't
> seem like 128 pitch requirement would be needed -- has it been tested?

The XXX was left in the patch accidentally. 128-alignment has been tested, and 
was
found unnecessary.

-- 
Chad Versace
c...@chad-versace.us



signature.asc
Description: OpenPGP digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Ian Romanick
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/18/2011 02:24 PM, Chad Versace wrote:
> On 07/18/2011 02:02 PM, Ian Romanick wrote:
>> On 07/18/2011 01:54 PM, Chad Versace wrote:
>>> On 07/18/2011 11:49 AM, Ian Romanick wrote:
 On 07/18/2011 12:55 AM, Chad Versace wrote:
> Until now, the stencil buffer was allocated as a Y tiled buffer, because
> in several locations the PRM states that it is. However, it is actually
> W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
> 4.5.2.1 W-Major Format:
> W-Major Tile Format is used for separate stencil.

> The GTT is incapable of W fencing, so we allocate the stencil buffer with
> I915_TILING_NONE and decode the tile's layout in software.

> This fix touches the following portions of code:
> - In intel_allocate_renderbuffer_storage(), allocate the stencil
>   buffer with I915_TILING_NONE.
> - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
>   not tiled.
> - In the stencil buffer's span functions, the tile's layout must be
>   decoded in software.

> This commit mutually depends on the xf86-video-intel commit
> dri: Do not tile stencil buffer
> Author: Chad Versace 
> Date:   Mon Jul 18 00:38:00 2011 -0700

> On Gen6 with separate stencil enabled, fixes the following Piglit tests:
> bugs/fdo23670-drawpix_stencil
> general/stencil-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
> 
> spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
> 
> spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
> spec/EXT_packed_depth_stencil/readpixels-24_8

> Note: This is a candidate for the 7.11 branch.

> CC: Eric Anholt 
> CC: Kenneth Graunke 
> Signed-off-by: Chad Versace 
> ---
>  src/mesa/drivers/dri/intel/intel_clear.c   |6 ++
>  src/mesa/drivers/dri/intel/intel_context.c |9 ++-
>  src/mesa/drivers/dri/intel/intel_fbo.c |   13 +++--
>  src/mesa/drivers/dri/intel/intel_screen.h  |9 ++-
>  src/mesa/drivers/dri/intel/intel_span.c|   85 
> 
>  5 files changed, 89 insertions(+), 33 deletions(-)

> diff --git a/src/mesa/drivers/dri/intel/intel_clear.c 
> b/src/mesa/drivers/dri/intel/intel_clear.c
> index dfca03c..5ab9873 100644
> --- a/src/mesa/drivers/dri/intel/intel_clear.c
> +++ b/src/mesa/drivers/dri/intel/intel_clear.c
> @@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask)
>*/
>  tri_mask |= BUFFER_BIT_STENCIL;
>   }
> +  else if (intel->has_separate_stencil &&
> +stencilRegion->tiling == I915_TILING_NONE) {
> + /* The stencil buffer is actually W tiled, which the hardware
> +  * cannot blit to. */
> + tri_mask |= BUFFER_BIT_STENCIL;
> +  }
>   else {
>  /* clearing all stencil bits, use blitting */
>  blit_mask |= BUFFER_BIT_STENCIL;
> diff --git a/src/mesa/drivers/dri/intel/intel_context.c 
> b/src/mesa/drivers/dri/intel/intel_context.c
> index 2ba1363..fe8be08 100644
> --- a/src/mesa/drivers/dri/intel/intel_context.c
> +++ b/src/mesa/drivers/dri/intel/intel_context.c
> @@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context 
> *intel,
>assert(stencil_rb->Base.Format == MESA_FORMAT_S8);
>assert(depth_rb && depth_rb->Base.Format == MESA_FORMAT_X8_Z24);

> -  if (stencil_rb->region->tiling == I915_TILING_Y) {
> +  if (stencil_rb->region->tiling == I915_TILING_NONE) {
> +  /*
> +   * The stencil buffer is actually W tiled. The region's tiling is
> +   * I915_TILING_NONE, however, because the GTT is incapable of W
> +   * fencing.
> +   */
>intel->intelScreen->dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE;
>return;
>} else 

Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Chad Versace
On 07/18/2011 02:02 PM, Ian Romanick wrote:
> On 07/18/2011 01:54 PM, Chad Versace wrote:
>> On 07/18/2011 11:49 AM, Ian Romanick wrote:
>>> On 07/18/2011 12:55 AM, Chad Versace wrote:
 Until now, the stencil buffer was allocated as a Y tiled buffer, because
 in several locations the PRM states that it is. However, it is actually
 W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
 4.5.2.1 W-Major Format:
 W-Major Tile Format is used for separate stencil.
>>>
 The GTT is incapable of W fencing, so we allocate the stencil buffer with
 I915_TILING_NONE and decode the tile's layout in software.
>>>
 This fix touches the following portions of code:
 - In intel_allocate_renderbuffer_storage(), allocate the stencil
   buffer with I915_TILING_NONE.
 - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
   not tiled.
 - In the stencil buffer's span functions, the tile's layout must be
   decoded in software.
>>>
 This commit mutually depends on the xf86-video-intel commit
 dri: Do not tile stencil buffer
 Author: Chad Versace 
 Date:   Mon Jul 18 00:38:00 2011 -0700
>>>
 On Gen6 with separate stencil enabled, fixes the following Piglit tests:
 bugs/fdo23670-drawpix_stencil
 general/stencil-drawpixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
 spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
 
 spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
 
 spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
 spec/EXT_packed_depth_stencil/readpixels-24_8
>>>
 Note: This is a candidate for the 7.11 branch.
>>>
 CC: Eric Anholt 
 CC: Kenneth Graunke 
 Signed-off-by: Chad Versace 
 ---
  src/mesa/drivers/dri/intel/intel_clear.c   |6 ++
  src/mesa/drivers/dri/intel/intel_context.c |9 ++-
  src/mesa/drivers/dri/intel/intel_fbo.c |   13 +++--
  src/mesa/drivers/dri/intel/intel_screen.h  |9 ++-
  src/mesa/drivers/dri/intel/intel_span.c|   85 
 
  5 files changed, 89 insertions(+), 33 deletions(-)
>>>
 diff --git a/src/mesa/drivers/dri/intel/intel_clear.c 
 b/src/mesa/drivers/dri/intel/intel_clear.c
 index dfca03c..5ab9873 100644
 --- a/src/mesa/drivers/dri/intel/intel_clear.c
 +++ b/src/mesa/drivers/dri/intel/intel_clear.c
 @@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask)
 */
  tri_mask |= BUFFER_BIT_STENCIL;
   }
 +   else if (intel->has_separate_stencil &&
 + stencilRegion->tiling == I915_TILING_NONE) {
 +  /* The stencil buffer is actually W tiled, which the hardware
 +   * cannot blit to. */
 +  tri_mask |= BUFFER_BIT_STENCIL;
 +   }
   else {
  /* clearing all stencil bits, use blitting */
  blit_mask |= BUFFER_BIT_STENCIL;
 diff --git a/src/mesa/drivers/dri/intel/intel_context.c 
 b/src/mesa/drivers/dri/intel/intel_context.c
 index 2ba1363..fe8be08 100644
 --- a/src/mesa/drivers/dri/intel/intel_context.c
 +++ b/src/mesa/drivers/dri/intel/intel_context.c
 @@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context 
 *intel,
assert(stencil_rb->Base.Format == MESA_FORMAT_S8);
assert(depth_rb && depth_rb->Base.Format == MESA_FORMAT_X8_Z24);
>>>
 -  if (stencil_rb->region->tiling == I915_TILING_Y) {
 +  if (stencil_rb->region->tiling == I915_TILING_NONE) {
 +   /*
 +* The stencil buffer is actually W tiled. The region's tiling is
 +* I915_TILING_NONE, however, because the GTT is incapable of W
 +* fencing.
 +*/
 intel->intelScreen->dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE;
 return;
} else {
 @@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context 
 *intel,
 * Presently, however, no verification or clean up is necessary, and
  

Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Chad Versace
On 07/18/2011 08:57 AM, Eric Anholt wrote:
> On Mon, 18 Jul 2011 00:55:03 -0700, Chad Versace  wrote:
>> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
>> b/src/mesa/drivers/dri/intel/intel_fbo.c
>> index 1669af2..507cc33 100644
>> --- a/src/mesa/drivers/dri/intel/intel_fbo.c
>> +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
>> @@ -173,6 +173,9 @@ intel_alloc_renderbuffer_storage(struct gl_context * 
>> ctx, struct gl_renderbuffer
>>  
>> if (irb->Base.Format == MESA_FORMAT_S8) {
>>/*
>> +   * The stencil buffer is W tiled. However, we request from the kernel 
>> a
>> +   * non-tiled buffer because the GTT is incapable of W fencing.
>> +   *
>> * The stencil buffer has quirky pitch requirements.  From Vol 2a,
>> * 11.5.6.2.1 3DSTATE_STENCIL_BUFFER, field "Surface Pitch":
>> *The pitch must be set to 2x the value computed based on width, 
>> as
>> @@ -180,14 +183,14 @@ intel_alloc_renderbuffer_storage(struct gl_context * 
>> ctx, struct gl_renderbuffer
>> * To accomplish this, we resort to the nasty hack of doubling the drm
>> * region's cpp and halving its height.
>> *
>> -   * If we neglect to double the pitch, then drm_intel_gem_bo_map_gtt()
>> -   * maps the memory incorrectly.
>> +   * If we neglect to double the pitch, then render corruption occurs.
>> */
>>irb->region = intel_region_alloc(intel->intelScreen,
>> -   I915_TILING_Y,
>> +   I915_TILING_NONE,
>> cpp * 2,
>> -   width,
>> -   height / 2,
>> +   ALIGN(width, 64),
>> +   /* XXX: Maybe align to 128? */
>> +   ALIGN(height / 2, 64),
>> GL_TRUE);
>>if (!irb->region)
>>  return false;
> 
> This looks like it would fail on a buffer with height = 129.  Doesn't
> seem like 128 pitch requirement would be needed -- has it been tested?
> 
>> diff --git a/src/mesa/drivers/dri/intel/intel_span.c 
>> b/src/mesa/drivers/dri/intel/intel_span.c
>> index 153803f..d306432 100644
>> --- a/src/mesa/drivers/dri/intel/intel_span.c
>> +++ b/src/mesa/drivers/dri/intel/intel_span.c
>> @@ -131,38 +131,77 @@ intel_set_span_functions(struct intel_context *intel,
>> int miny = 0;\
>> int maxx = rb->Width;\
>> int maxy = rb->Height;   \
>> -   int stride = rb->RowStride;  
>> \
>> -   uint8_t *buf = rb->Data; \
>> +\
>> +   /*   
>> \
>> +* Here we ignore rb->Data and rb->RowStride as set by   \
>> +* intelSpanRenderStart. Since intel_offset_S8 decodes the W tile
>> \
>> +* manually, the region's *real* base address and stride is  
>> \
>> +* required. 
>> \
>> +*/  
>> \
>> +   struct intel_renderbuffer *irb = intel_renderbuffer(rb); \
>> +   uint8_t *buf = irb->region->buffer->virtual; 
>> \
>> +   unsigned stride = irb->region->pitch;\
>> +   unsigned height = 2 * irb->region->height;   
>> \
>> +   bool flip = rb->Name == 0;   
>> \
>>  
>> -/* Don't flip y. */
>>  #undef Y_FLIP
>> -#define Y_FLIP(y) y
>> +#define Y_FLIP(y) ((1 - flip) * y + flip * (height - 1 - y))
> 
> The flip is usually handled by a scale and bias variable, so that Y_FLIP
> is ((y) * scale + bias).  I think it'll produce less code, since Y_FLIP
> is used a lot.

Good call. Does changing the hunk like this look good to you?

+   struct intel_renderbuffer *irb = intel_renderbuffer(rb);\
+   uint8_t *buf = irb->region->buffer->virtual;\
+   unsigned stride = irb->region->pitch;   \
+   unsigned height = 2 * irb->region->height;  \
+   bool flip = rb->Name == 0;  \
+   int y_scale = flip ? -1 : 1;\
+   int y_bias = flip ? (height - 1) : 0;   \

-/* Don't flip y. */
 #undef Y_FLIP
-#define Y_FLIP(y) y
+#define Y_FLIP(y) (y_scale * (y) + y_bias)


-- 
Chad Versace
c...@chad-versace.us



signature.asc
Description: OpenPGP digital signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedeskto

Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Ian Romanick
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/18/2011 01:54 PM, Chad Versace wrote:
> On 07/18/2011 11:49 AM, Ian Romanick wrote:
>> On 07/18/2011 12:55 AM, Chad Versace wrote:
>>> Until now, the stencil buffer was allocated as a Y tiled buffer, because
>>> in several locations the PRM states that it is. However, it is actually
>>> W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
>>> 4.5.2.1 W-Major Format:
>>> W-Major Tile Format is used for separate stencil.
>>
>>> The GTT is incapable of W fencing, so we allocate the stencil buffer with
>>> I915_TILING_NONE and decode the tile's layout in software.
>>
>>> This fix touches the following portions of code:
>>> - In intel_allocate_renderbuffer_storage(), allocate the stencil
>>>   buffer with I915_TILING_NONE.
>>> - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
>>>   not tiled.
>>> - In the stencil buffer's span functions, the tile's layout must be
>>>   decoded in software.
>>
>>> This commit mutually depends on the xf86-video-intel commit
>>> dri: Do not tile stencil buffer
>>> Author: Chad Versace 
>>> Date:   Mon Jul 18 00:38:00 2011 -0700
>>
>>> On Gen6 with separate stencil enabled, fixes the following Piglit tests:
>>> bugs/fdo23670-drawpix_stencil
>>> general/stencil-drawpixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
>>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
>>> spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
>>> spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
>>> spec/EXT_packed_depth_stencil/readpixels-24_8
>>
>>> Note: This is a candidate for the 7.11 branch.
>>
>>> CC: Eric Anholt 
>>> CC: Kenneth Graunke 
>>> Signed-off-by: Chad Versace 
>>> ---
>>>  src/mesa/drivers/dri/intel/intel_clear.c   |6 ++
>>>  src/mesa/drivers/dri/intel/intel_context.c |9 ++-
>>>  src/mesa/drivers/dri/intel/intel_fbo.c |   13 +++--
>>>  src/mesa/drivers/dri/intel/intel_screen.h  |9 ++-
>>>  src/mesa/drivers/dri/intel/intel_span.c|   85 
>>> 
>>>  5 files changed, 89 insertions(+), 33 deletions(-)
>>
>>> diff --git a/src/mesa/drivers/dri/intel/intel_clear.c 
>>> b/src/mesa/drivers/dri/intel/intel_clear.c
>>> index dfca03c..5ab9873 100644
>>> --- a/src/mesa/drivers/dri/intel/intel_clear.c
>>> +++ b/src/mesa/drivers/dri/intel/intel_clear.c
>>> @@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask)
>>>  */
>>>  tri_mask |= BUFFER_BIT_STENCIL;
>>>   }
>>> +else if (intel->has_separate_stencil &&
>>> +  stencilRegion->tiling == I915_TILING_NONE) {
>>> +   /* The stencil buffer is actually W tiled, which the hardware
>>> +* cannot blit to. */
>>> +   tri_mask |= BUFFER_BIT_STENCIL;
>>> +}
>>>   else {
>>>  /* clearing all stencil bits, use blitting */
>>>  blit_mask |= BUFFER_BIT_STENCIL;
>>> diff --git a/src/mesa/drivers/dri/intel/intel_context.c 
>>> b/src/mesa/drivers/dri/intel/intel_context.c
>>> index 2ba1363..fe8be08 100644
>>> --- a/src/mesa/drivers/dri/intel/intel_context.c
>>> +++ b/src/mesa/drivers/dri/intel/intel_context.c
>>> @@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context 
>>> *intel,
>>>assert(stencil_rb->Base.Format == MESA_FORMAT_S8);
>>>assert(depth_rb && depth_rb->Base.Format == MESA_FORMAT_X8_Z24);
>>
>>> -  if (stencil_rb->region->tiling == I915_TILING_Y) {
>>> +  if (stencil_rb->region->tiling == I915_TILING_NONE) {
>>> +/*
>>> + * The stencil buffer is actually W tiled. The region's tiling is
>>> + * I915_TILING_NONE, however, because the GTT is incapable of W
>>> + * fencing.
>>> + */
>>>  intel->intelScreen->dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE;
>>>  return;
>>>} else {
>>> @@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context *intel,
>>> * Presently, however, no verification or clean up is necessary, and
>>> * execution should not reach here. If the framebuffer still has a 
>>> hiz
>>> * regio

Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Chad Versace
On 07/18/2011 11:49 AM, Ian Romanick wrote:
> On 07/18/2011 12:55 AM, Chad Versace wrote:
>> Until now, the stencil buffer was allocated as a Y tiled buffer, because
>> in several locations the PRM states that it is. However, it is actually
>> W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
>> 4.5.2.1 W-Major Format:
>> W-Major Tile Format is used for separate stencil.
> 
>> The GTT is incapable of W fencing, so we allocate the stencil buffer with
>> I915_TILING_NONE and decode the tile's layout in software.
> 
>> This fix touches the following portions of code:
>> - In intel_allocate_renderbuffer_storage(), allocate the stencil
>>   buffer with I915_TILING_NONE.
>> - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
>>   not tiled.
>> - In the stencil buffer's span functions, the tile's layout must be
>>   decoded in software.
> 
>> This commit mutually depends on the xf86-video-intel commit
>> dri: Do not tile stencil buffer
>> Author: Chad Versace 
>> Date:   Mon Jul 18 00:38:00 2011 -0700
> 
>> On Gen6 with separate stencil enabled, fixes the following Piglit tests:
>> bugs/fdo23670-drawpix_stencil
>> general/stencil-drawpixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
>> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
>> spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
>> spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
>> spec/EXT_packed_depth_stencil/readpixels-24_8
> 
>> Note: This is a candidate for the 7.11 branch.
> 
>> CC: Eric Anholt 
>> CC: Kenneth Graunke 
>> Signed-off-by: Chad Versace 
>> ---
>>  src/mesa/drivers/dri/intel/intel_clear.c   |6 ++
>>  src/mesa/drivers/dri/intel/intel_context.c |9 ++-
>>  src/mesa/drivers/dri/intel/intel_fbo.c |   13 +++--
>>  src/mesa/drivers/dri/intel/intel_screen.h  |9 ++-
>>  src/mesa/drivers/dri/intel/intel_span.c|   85 
>> 
>>  5 files changed, 89 insertions(+), 33 deletions(-)
> 
>> diff --git a/src/mesa/drivers/dri/intel/intel_clear.c 
>> b/src/mesa/drivers/dri/intel/intel_clear.c
>> index dfca03c..5ab9873 100644
>> --- a/src/mesa/drivers/dri/intel/intel_clear.c
>> +++ b/src/mesa/drivers/dri/intel/intel_clear.c
>> @@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask)
>>   */
>>  tri_mask |= BUFFER_BIT_STENCIL;
>>   }
>> + else if (intel->has_separate_stencil &&
>> +   stencilRegion->tiling == I915_TILING_NONE) {
>> +/* The stencil buffer is actually W tiled, which the hardware
>> + * cannot blit to. */
>> +tri_mask |= BUFFER_BIT_STENCIL;
>> + }
>>   else {
>>  /* clearing all stencil bits, use blitting */
>>  blit_mask |= BUFFER_BIT_STENCIL;
>> diff --git a/src/mesa/drivers/dri/intel/intel_context.c 
>> b/src/mesa/drivers/dri/intel/intel_context.c
>> index 2ba1363..fe8be08 100644
>> --- a/src/mesa/drivers/dri/intel/intel_context.c
>> +++ b/src/mesa/drivers/dri/intel/intel_context.c
>> @@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context *intel,
>>assert(stencil_rb->Base.Format == MESA_FORMAT_S8);
>>assert(depth_rb && depth_rb->Base.Format == MESA_FORMAT_X8_Z24);
> 
>> -  if (stencil_rb->region->tiling == I915_TILING_Y) {
>> +  if (stencil_rb->region->tiling == I915_TILING_NONE) {
>> + /*
>> +  * The stencil buffer is actually W tiled. The region's tiling is
>> +  * I915_TILING_NONE, however, because the GTT is incapable of W
>> +  * fencing.
>> +  */
>>   intel->intelScreen->dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE;
>>   return;
>>} else {
>> @@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context *intel,
>> * Presently, however, no verification or clean up is necessary, and
>> * execution should not reach here. If the framebuffer still has a hiz
>> * region, then we have already set dri2_has_hiz to true after
>> -   * confirming above that the stencil buffer is Y tiled.
>> +   * confirming above that the stencil buffer is 

Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled

2011-07-18 Thread Ian Romanick
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/18/2011 12:55 AM, Chad Versace wrote:
> Until now, the stencil buffer was allocated as a Y tiled buffer, because
> in several locations the PRM states that it is. However, it is actually
> W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
> 4.5.2.1 W-Major Format:
> W-Major Tile Format is used for separate stencil.
> 
> The GTT is incapable of W fencing, so we allocate the stencil buffer with
> I915_TILING_NONE and decode the tile's layout in software.
> 
> This fix touches the following portions of code:
> - In intel_allocate_renderbuffer_storage(), allocate the stencil
>   buffer with I915_TILING_NONE.
> - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
>   not tiled.
> - In the stencil buffer's span functions, the tile's layout must be
>   decoded in software.
> 
> This commit mutually depends on the xf86-video-intel commit
> dri: Do not tile stencil buffer
> Author: Chad Versace 
> Date:   Mon Jul 18 00:38:00 2011 -0700
> 
> On Gen6 with separate stencil enabled, fixes the following Piglit tests:
> bugs/fdo23670-drawpix_stencil
> general/stencil-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
> spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
> spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
> spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
> spec/EXT_packed_depth_stencil/readpixels-24_8
> 
> Note: This is a candidate for the 7.11 branch.
> 
> CC: Eric Anholt 
> CC: Kenneth Graunke 
> Signed-off-by: Chad Versace 
> ---
>  src/mesa/drivers/dri/intel/intel_clear.c   |6 ++
>  src/mesa/drivers/dri/intel/intel_context.c |9 ++-
>  src/mesa/drivers/dri/intel/intel_fbo.c |   13 +++--
>  src/mesa/drivers/dri/intel/intel_screen.h  |9 ++-
>  src/mesa/drivers/dri/intel/intel_span.c|   85 
> 
>  5 files changed, 89 insertions(+), 33 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/intel/intel_clear.c 
> b/src/mesa/drivers/dri/intel/intel_clear.c
> index dfca03c..5ab9873 100644
> --- a/src/mesa/drivers/dri/intel/intel_clear.c
> +++ b/src/mesa/drivers/dri/intel/intel_clear.c
> @@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask)
>*/
>  tri_mask |= BUFFER_BIT_STENCIL;
>   }
> +  else if (intel->has_separate_stencil &&
> +stencilRegion->tiling == I915_TILING_NONE) {
> + /* The stencil buffer is actually W tiled, which the hardware
> +  * cannot blit to. */
> + tri_mask |= BUFFER_BIT_STENCIL;
> +  }
>   else {
>  /* clearing all stencil bits, use blitting */
>  blit_mask |= BUFFER_BIT_STENCIL;
> diff --git a/src/mesa/drivers/dri/intel/intel_context.c 
> b/src/mesa/drivers/dri/intel/intel_context.c
> index 2ba1363..fe8be08 100644
> --- a/src/mesa/drivers/dri/intel/intel_context.c
> +++ b/src/mesa/drivers/dri/intel/intel_context.c
> @@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context *intel,
>assert(stencil_rb->Base.Format == MESA_FORMAT_S8);
>assert(depth_rb && depth_rb->Base.Format == MESA_FORMAT_X8_Z24);
>  
> -  if (stencil_rb->region->tiling == I915_TILING_Y) {
> +  if (stencil_rb->region->tiling == I915_TILING_NONE) {
> +  /*
> +   * The stencil buffer is actually W tiled. The region's tiling is
> +   * I915_TILING_NONE, however, because the GTT is incapable of W
> +   * fencing.
> +   */
>intel->intelScreen->dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE;
>return;
>} else {
> @@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context *intel,
> * Presently, however, no verification or clean up is necessary, and
> * execution should not reach here. If the framebuffer still has a hiz
> * region, then we have already set dri2_has_hiz to true after
> -   * confirming above that the stencil buffer is Y tiled.
> +   * confirming above that the stencil buffer is W tiled.
> */
>assert(0);
> }
> diff --git a/src/mesa/