Re: [Mesa-dev] [PATCH 7/7] i965: Rework the extra flushes surrounding occlusion queries.

2012-08-08 Thread Eric Anholt
Kenneth Graunke  writes:

> Separate out the depth stall from the depth count write.  Workarounds
> say that a depth stall needs to be preceeded with a non-zero post-sync
> op (in this case, the depth count write).  Also, before the non-zero
> post-sync op, we need a CS stall, which needs a stall at scoreboard.
>
> Signed-off-by: Daniel Vetter 
> Signed-off-by: Kenneth Graunke 

This series is:

Reviewed-by: Eric Anholt 


pgp1TMliqxVMP.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/7] i965: Rework the extra flushes surrounding occlusion queries.

2012-08-08 Thread Daniel Vetter
On Wed, Aug 08, 2012 at 09:41:44AM +0200, Daniel Vetter wrote:
> On Tue, Aug 07, 2012 at 04:05:33PM -0700, Kenneth Graunke wrote:
> > Separate out the depth stall from the depth count write.  Workarounds
> > say that a depth stall needs to be preceeded with a non-zero post-sync
> > op (in this case, the depth count write).  Also, before the non-zero
> > post-sync op, we need a CS stall, which needs a stall at scoreboard.
> > 
> > Signed-off-by: Daniel Vetter 
> > Signed-off-by: Kenneth Graunke 
> 
> In my understanding of Bspec (haven't done any experiments on hw) we need
> to set the depth stall bit on the pipe_control with the depth_count write
> (bspec for both snb&ivb even says that depth stall should be disable if
> post sync op != write_depth). So I think we need to keep these two
> together and simply emit the entire nonzero postsync op workaround on
> gen6, like we already do for render cache flushes.
> 
> In my reading of bspec, no such workaround is required on gen7+

I've forgotten to add: The other patches look good, for all of them safe
this on:
Reviewed-by: Daniel Vetter 
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/7] i965: Rework the extra flushes surrounding occlusion queries.

2012-08-08 Thread Daniel Vetter
On Tue, Aug 07, 2012 at 04:05:33PM -0700, Kenneth Graunke wrote:
> Separate out the depth stall from the depth count write.  Workarounds
> say that a depth stall needs to be preceeded with a non-zero post-sync
> op (in this case, the depth count write).  Also, before the non-zero
> post-sync op, we need a CS stall, which needs a stall at scoreboard.
> 
> Signed-off-by: Daniel Vetter 
> Signed-off-by: Kenneth Graunke 

In my understanding of Bspec (haven't done any experiments on hw) we need
to set the depth stall bit on the pipe_control with the depth_count write
(bspec for both snb&ivb even says that depth stall should be disable if
post sync op != write_depth). So I think we need to keep these two
together and simply emit the entire nonzero postsync op workaround on
gen6, like we already do for render cache flushes.

In my reading of bspec, no such workaround is required on gen7+
-Daniel

> ---
>  src/mesa/drivers/dri/i965/brw_queryobj.c | 36 
> 
>  1 file changed, 27 insertions(+), 9 deletions(-)
> 
> This does remove the CS stall on Ivybridge.
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c 
> b/src/mesa/drivers/dri/i965/brw_queryobj.c
> index 1e03d08..4c561ad 100644
> --- a/src/mesa/drivers/dri/i965/brw_queryobj.c
> +++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
> @@ -91,17 +91,24 @@ static void
>  write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int 
> idx)
>  {
> if (intel->gen >= 6) {
> -  BEGIN_BATCH(9);
> -
> -  /* workaround: CS stall required before depth stall. */
> -  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
> -  OUT_BATCH(PIPE_CONTROL_CS_STALL);
> -  OUT_BATCH(0); /* write address */
> -  OUT_BATCH(0); /* write data */
> +  /* Emit Sandybridge workaround flush: */
> +  if (intel->gen == 6) {
> + /* The timestamp write below is a non-zero post-sync op, which on
> +  * Gen6 necessitates a CS stall.  CS stalls need stall at scoreboard
> +  * set.  See the comments for intel_emit_post_sync_nonzero_flush().
> +  */
> + BEGIN_BATCH(4);
> + OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
> + OUT_BATCH(PIPE_CONTROL_CS_STALL | PIPE_CONTROL_STALL_AT_SCOREBOARD);
> + OUT_BATCH(0);
> + OUT_BATCH(0);
> + ADVANCE_BATCH();
> +  }
>  
> +  /* Emit the actual depth count write: */
> +  BEGIN_BATCH(5);
>OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
> -  OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
> -PIPE_CONTROL_WRITE_DEPTH_COUNT);
> +  OUT_BATCH(PIPE_CONTROL_WRITE_DEPTH_COUNT);
>OUT_RELOC(query_bo,
>  I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
>  PIPE_CONTROL_GLOBAL_GTT_WRITE |
> @@ -109,6 +116,17 @@ write_depth_count(struct intel_context *intel, 
> drm_intel_bo *query_bo, int idx)
>OUT_BATCH(0);
>OUT_BATCH(0);
>ADVANCE_BATCH();
> +
> +  /* We need to emit a depth stall to get the right value for the depth
> +   * count.  As a workaround this needs a preceeding PIPE_CONTROL with a
> +   * non-zero post-sync op.  The depth count write above does that for 
> us.
> +   */
> +  BEGIN_BATCH(4);
> +  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
> +  OUT_BATCH(PIPE_CONTROL_DEPTH_STALL);
> +  OUT_BATCH(0);
> +  OUT_BATCH(0);
> +  ADVANCE_BATCH();
> } else {
>BEGIN_BATCH(4);
>OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
> -- 
> 1.7.11.4
> 

-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/7] i965: Rework the extra flushes surrounding occlusion queries.

2012-08-07 Thread Kenneth Graunke
Separate out the depth stall from the depth count write.  Workarounds
say that a depth stall needs to be preceeded with a non-zero post-sync
op (in this case, the depth count write).  Also, before the non-zero
post-sync op, we need a CS stall, which needs a stall at scoreboard.

Signed-off-by: Daniel Vetter 
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_queryobj.c | 36 
 1 file changed, 27 insertions(+), 9 deletions(-)

This does remove the CS stall on Ivybridge.

diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c 
b/src/mesa/drivers/dri/i965/brw_queryobj.c
index 1e03d08..4c561ad 100644
--- a/src/mesa/drivers/dri/i965/brw_queryobj.c
+++ b/src/mesa/drivers/dri/i965/brw_queryobj.c
@@ -91,17 +91,24 @@ static void
 write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int idx)
 {
if (intel->gen >= 6) {
-  BEGIN_BATCH(9);
-
-  /* workaround: CS stall required before depth stall. */
-  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
-  OUT_BATCH(PIPE_CONTROL_CS_STALL);
-  OUT_BATCH(0); /* write address */
-  OUT_BATCH(0); /* write data */
+  /* Emit Sandybridge workaround flush: */
+  if (intel->gen == 6) {
+ /* The timestamp write below is a non-zero post-sync op, which on
+  * Gen6 necessitates a CS stall.  CS stalls need stall at scoreboard
+  * set.  See the comments for intel_emit_post_sync_nonzero_flush().
+  */
+ BEGIN_BATCH(4);
+ OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+ OUT_BATCH(PIPE_CONTROL_CS_STALL | PIPE_CONTROL_STALL_AT_SCOREBOARD);
+ OUT_BATCH(0);
+ OUT_BATCH(0);
+ ADVANCE_BATCH();
+  }
 
+  /* Emit the actual depth count write: */
+  BEGIN_BATCH(5);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2));
-  OUT_BATCH(PIPE_CONTROL_DEPTH_STALL |
-PIPE_CONTROL_WRITE_DEPTH_COUNT);
+  OUT_BATCH(PIPE_CONTROL_WRITE_DEPTH_COUNT);
   OUT_RELOC(query_bo,
 I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
 PIPE_CONTROL_GLOBAL_GTT_WRITE |
@@ -109,6 +116,17 @@ write_depth_count(struct intel_context *intel, 
drm_intel_bo *query_bo, int idx)
   OUT_BATCH(0);
   OUT_BATCH(0);
   ADVANCE_BATCH();
+
+  /* We need to emit a depth stall to get the right value for the depth
+   * count.  As a workaround this needs a preceeding PIPE_CONTROL with a
+   * non-zero post-sync op.  The depth count write above does that for us.
+   */
+  BEGIN_BATCH(4);
+  OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2));
+  OUT_BATCH(PIPE_CONTROL_DEPTH_STALL);
+  OUT_BATCH(0);
+  OUT_BATCH(0);
+  ADVANCE_BATCH();
} else {
   BEGIN_BATCH(4);
   OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) |
-- 
1.7.11.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev