Re: [Mesa-dev] [PATCH 7/7] i965: Rework the extra flushes surrounding occlusion queries.
Kenneth Graunke writes: > Separate out the depth stall from the depth count write. Workarounds > say that a depth stall needs to be preceeded with a non-zero post-sync > op (in this case, the depth count write). Also, before the non-zero > post-sync op, we need a CS stall, which needs a stall at scoreboard. > > Signed-off-by: Daniel Vetter > Signed-off-by: Kenneth Graunke This series is: Reviewed-by: Eric Anholt pgp1TMliqxVMP.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/7] i965: Rework the extra flushes surrounding occlusion queries.
On Wed, Aug 08, 2012 at 09:41:44AM +0200, Daniel Vetter wrote: > On Tue, Aug 07, 2012 at 04:05:33PM -0700, Kenneth Graunke wrote: > > Separate out the depth stall from the depth count write. Workarounds > > say that a depth stall needs to be preceeded with a non-zero post-sync > > op (in this case, the depth count write). Also, before the non-zero > > post-sync op, we need a CS stall, which needs a stall at scoreboard. > > > > Signed-off-by: Daniel Vetter > > Signed-off-by: Kenneth Graunke > > In my understanding of Bspec (haven't done any experiments on hw) we need > to set the depth stall bit on the pipe_control with the depth_count write > (bspec for both snb&ivb even says that depth stall should be disable if > post sync op != write_depth). So I think we need to keep these two > together and simply emit the entire nonzero postsync op workaround on > gen6, like we already do for render cache flushes. > > In my reading of bspec, no such workaround is required on gen7+ I've forgotten to add: The other patches look good, for all of them safe this on: Reviewed-by: Daniel Vetter -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/7] i965: Rework the extra flushes surrounding occlusion queries.
On Tue, Aug 07, 2012 at 04:05:33PM -0700, Kenneth Graunke wrote: > Separate out the depth stall from the depth count write. Workarounds > say that a depth stall needs to be preceeded with a non-zero post-sync > op (in this case, the depth count write). Also, before the non-zero > post-sync op, we need a CS stall, which needs a stall at scoreboard. > > Signed-off-by: Daniel Vetter > Signed-off-by: Kenneth Graunke In my understanding of Bspec (haven't done any experiments on hw) we need to set the depth stall bit on the pipe_control with the depth_count write (bspec for both snb&ivb even says that depth stall should be disable if post sync op != write_depth). So I think we need to keep these two together and simply emit the entire nonzero postsync op workaround on gen6, like we already do for render cache flushes. In my reading of bspec, no such workaround is required on gen7+ -Daniel > --- > src/mesa/drivers/dri/i965/brw_queryobj.c | 36 > > 1 file changed, 27 insertions(+), 9 deletions(-) > > This does remove the CS stall on Ivybridge. > > diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c > b/src/mesa/drivers/dri/i965/brw_queryobj.c > index 1e03d08..4c561ad 100644 > --- a/src/mesa/drivers/dri/i965/brw_queryobj.c > +++ b/src/mesa/drivers/dri/i965/brw_queryobj.c > @@ -91,17 +91,24 @@ static void > write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int > idx) > { > if (intel->gen >= 6) { > - BEGIN_BATCH(9); > - > - /* workaround: CS stall required before depth stall. */ > - OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2)); > - OUT_BATCH(PIPE_CONTROL_CS_STALL); > - OUT_BATCH(0); /* write address */ > - OUT_BATCH(0); /* write data */ > + /* Emit Sandybridge workaround flush: */ > + if (intel->gen == 6) { > + /* The timestamp write below is a non-zero post-sync op, which on > + * Gen6 necessitates a CS stall. CS stalls need stall at scoreboard > + * set. See the comments for intel_emit_post_sync_nonzero_flush(). > + */ > + BEGIN_BATCH(4); > + OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2)); > + OUT_BATCH(PIPE_CONTROL_CS_STALL | PIPE_CONTROL_STALL_AT_SCOREBOARD); > + OUT_BATCH(0); > + OUT_BATCH(0); > + ADVANCE_BATCH(); > + } > > + /* Emit the actual depth count write: */ > + BEGIN_BATCH(5); >OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2)); > - OUT_BATCH(PIPE_CONTROL_DEPTH_STALL | > -PIPE_CONTROL_WRITE_DEPTH_COUNT); > + OUT_BATCH(PIPE_CONTROL_WRITE_DEPTH_COUNT); >OUT_RELOC(query_bo, > I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION, > PIPE_CONTROL_GLOBAL_GTT_WRITE | > @@ -109,6 +116,17 @@ write_depth_count(struct intel_context *intel, > drm_intel_bo *query_bo, int idx) >OUT_BATCH(0); >OUT_BATCH(0); >ADVANCE_BATCH(); > + > + /* We need to emit a depth stall to get the right value for the depth > + * count. As a workaround this needs a preceeding PIPE_CONTROL with a > + * non-zero post-sync op. The depth count write above does that for > us. > + */ > + BEGIN_BATCH(4); > + OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2)); > + OUT_BATCH(PIPE_CONTROL_DEPTH_STALL); > + OUT_BATCH(0); > + OUT_BATCH(0); > + ADVANCE_BATCH(); > } else { >BEGIN_BATCH(4); >OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) | > -- > 1.7.11.4 > -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 7/7] i965: Rework the extra flushes surrounding occlusion queries.
Separate out the depth stall from the depth count write. Workarounds say that a depth stall needs to be preceeded with a non-zero post-sync op (in this case, the depth count write). Also, before the non-zero post-sync op, we need a CS stall, which needs a stall at scoreboard. Signed-off-by: Daniel Vetter Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_queryobj.c | 36 1 file changed, 27 insertions(+), 9 deletions(-) This does remove the CS stall on Ivybridge. diff --git a/src/mesa/drivers/dri/i965/brw_queryobj.c b/src/mesa/drivers/dri/i965/brw_queryobj.c index 1e03d08..4c561ad 100644 --- a/src/mesa/drivers/dri/i965/brw_queryobj.c +++ b/src/mesa/drivers/dri/i965/brw_queryobj.c @@ -91,17 +91,24 @@ static void write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int idx) { if (intel->gen >= 6) { - BEGIN_BATCH(9); - - /* workaround: CS stall required before depth stall. */ - OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2)); - OUT_BATCH(PIPE_CONTROL_CS_STALL); - OUT_BATCH(0); /* write address */ - OUT_BATCH(0); /* write data */ + /* Emit Sandybridge workaround flush: */ + if (intel->gen == 6) { + /* The timestamp write below is a non-zero post-sync op, which on + * Gen6 necessitates a CS stall. CS stalls need stall at scoreboard + * set. See the comments for intel_emit_post_sync_nonzero_flush(). + */ + BEGIN_BATCH(4); + OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2)); + OUT_BATCH(PIPE_CONTROL_CS_STALL | PIPE_CONTROL_STALL_AT_SCOREBOARD); + OUT_BATCH(0); + OUT_BATCH(0); + ADVANCE_BATCH(); + } + /* Emit the actual depth count write: */ + BEGIN_BATCH(5); OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2)); - OUT_BATCH(PIPE_CONTROL_DEPTH_STALL | -PIPE_CONTROL_WRITE_DEPTH_COUNT); + OUT_BATCH(PIPE_CONTROL_WRITE_DEPTH_COUNT); OUT_RELOC(query_bo, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION, PIPE_CONTROL_GLOBAL_GTT_WRITE | @@ -109,6 +116,17 @@ write_depth_count(struct intel_context *intel, drm_intel_bo *query_bo, int idx) OUT_BATCH(0); OUT_BATCH(0); ADVANCE_BATCH(); + + /* We need to emit a depth stall to get the right value for the depth + * count. As a workaround this needs a preceeding PIPE_CONTROL with a + * non-zero post-sync op. The depth count write above does that for us. + */ + BEGIN_BATCH(4); + OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2)); + OUT_BATCH(PIPE_CONTROL_DEPTH_STALL); + OUT_BATCH(0); + OUT_BATCH(0); + ADVANCE_BATCH(); } else { BEGIN_BATCH(4); OUT_BATCH(_3DSTATE_PIPE_CONTROL | (4 - 2) | -- 1.7.11.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev