Re: [Mesa-dev] [PATCH 1/2] r600g: rework flusing and synchronization pattern v4
On Sat, Dec 8, 2012 at 7:27 PM, Marek Olšák wrote: > Hi Jerome, > > I'm okay with the simplification of r600_flush_emit, I'm not so okay > with some other things. There's also some cruft unrelated to flushing. > > 1) R600_CONTEXT_FLUSH could have a better name, because it's not clear > what it does. (it looks like it only flushed read-only bindings) GPU_FLUSH ? > 2) Don't use magic numbers when setting cp_coher_cntl unless you want > to hide something from us / obfuscating the code. :) > > 3) The definition of R600_MAX_FLUSH_CS_DWORDS should be updated. Yes i haven't recomputed worst case > 4) SURFACE_BASE_UPDATE is emitted twice in emit_framebuffer_state. I > don't think splitting one packet into two packets doing the same thing > is needed. It's need couple r6xx/r7xx gpu will lockup after couple hour of stressing, wasn't seeing lockup with it. > 5) RS780 and RS880 don't need SURFACE_BASE_UPDATE for streamout. Their > streamout hardware was actually copied from R700. Doing "< CHIP_RS780" > instead of "< CHIP_RV770" was correct. The same for r600_flush_emit. fglrx mostly do the same on r7xx and r6xx for streamout as i am not sure i have any stressing test for that i side on fglrx side. > 6) In r600_context_flush, don't remove the comment about flushing > framebuffer caches, because it's still done there. > > 7) Masking out R600_CONTEXT_FLUSH in r600_context_emit_fence is not > correct. We should still flush the caches later if they're dirty and > even if the fence was emitted. You can't see this regression in > piglit, because we don't have a test for that. True > 8) There's some inconsistent flushing between graphics and compute > colorbuffer bindings. For graphics, you use (WAIT_IDLE | > FLUSH_AND_INV), which makes sense. For compute, you use > R600_CONTEXT_FLUSH (which is used for vertex buffers and the like > elsewhere, but not colorbuffers). I haven't paid much attention to compute side, i should probably look at it. > And one question: > > Why do you use set both FLUSH_AND_INV and STREAMOUT_FLUSH on > Evergreen, while r600 only gets FLUSH_AND_INV? Did you overlook this? No, just matching fglrx pattern, i don't think i tested without that change, but it definitly match fglrx. Cheers, Jerome > Marek > > On Thu, Dec 6, 2012 at 8:51 PM, wrote: >> From: Jerome Glisse >> >> This bring r600g allmost inline with closed source driver when >> it comes to flushing and synchronization pattern. >> >> Signed-off-by: Jerome Glisse >> --- >> src/gallium/drivers/r600/evergreen_compute.c | 8 +- >> .../drivers/r600/evergreen_compute_internal.c | 4 +- >> src/gallium/drivers/r600/evergreen_state.c | 4 +- >> src/gallium/drivers/r600/r600.h| 16 +-- >> src/gallium/drivers/r600/r600_hw_context.c | 154 >> - >> src/gallium/drivers/r600/r600_state.c | 18 ++- >> src/gallium/drivers/r600/r600_state_common.c | 19 ++- >> 7 files changed, 61 insertions(+), 162 deletions(-) >> >> diff --git a/src/gallium/drivers/r600/evergreen_compute.c >> b/src/gallium/drivers/r600/evergreen_compute.c >> index 44831a7..33a5910 100644 >> --- a/src/gallium/drivers/r600/evergreen_compute.c >> +++ b/src/gallium/drivers/r600/evergreen_compute.c >> @@ -98,7 +98,7 @@ static void evergreen_cs_set_vertex_buffer( >> >> /* The vertex instructions in the compute shaders use the texture >> cache, >> * so we need to invalidate it. */ >> - rctx->flags |= R600_CONTEXT_TEX_FLUSH; >> + rctx->flags |= R600_CONTEXT_FLUSH; >> state->enabled_mask |= 1 << vb_index; >> state->dirty_mask |= 1 << vb_index; >> state->atom.dirty = true; >> @@ -329,7 +329,7 @@ static void compute_emit_cs(struct r600_context *ctx, >> const uint *block_layout, >> */ >> r600_emit_command_buffer(ctx->cs, &ctx->start_compute_cs_cmd); >> >> - ctx->flags |= R600_CONTEXT_CB_FLUSH; >> + ctx->flags |= R600_CONTEXT_FLUSH; >> r600_flush_emit(ctx); >> >> /* Emit colorbuffers. */ >> @@ -409,7 +409,7 @@ static void compute_emit_cs(struct r600_context *ctx, >> const uint *block_layout, >> >> /* XXX evergreen_flush_emit() hardcodes the CP_COHER_SIZE to >> 0x >> */ >> - ctx->flags |= R600_CONTEXT_CB_FLUSH; >> + ctx->flags |= R600_CONTEXT_FLUSH; >> r600_flush_emit(ctx); >> >> #if 0 >> @@ -468,7 +468,7 @@ void evergreen_emit_cs_shader( >> r600_write_value(cs, r600_context_bo_reloc(rctx, kernel->code_bo, >> RADEON_USAGE_READ)); >> >> - rctx->flags |= R600_CONTEXT_SHADERCONST_FLUSH; >> + rctx->flags |= R600_CONTEXT_FLUSH; >> } >> >> static void evergreen_launch_grid( >> diff --git a/src/gallium/drivers/r600/evergreen_compute_internal.c >> b/src/gallium/drivers/r600/evergreen_compute_internal.c >> index 7bc7fb4..187bcf1 100644 >> --- a/src/gallium/drivers/r600/evergreen_com
Re: [Mesa-dev] [PATCH 1/2] r600g: rework flusing and synchronization pattern v4
Hi Jerome, I'm okay with the simplification of r600_flush_emit, I'm not so okay with some other things. There's also some cruft unrelated to flushing. 1) R600_CONTEXT_FLUSH could have a better name, because it's not clear what it does. (it looks like it only flushed read-only bindings) 2) Don't use magic numbers when setting cp_coher_cntl unless you want to hide something from us / obfuscating the code. :) 3) The definition of R600_MAX_FLUSH_CS_DWORDS should be updated. 4) SURFACE_BASE_UPDATE is emitted twice in emit_framebuffer_state. I don't think splitting one packet into two packets doing the same thing is needed. 5) RS780 and RS880 don't need SURFACE_BASE_UPDATE for streamout. Their streamout hardware was actually copied from R700. Doing "< CHIP_RS780" instead of "< CHIP_RV770" was correct. The same for r600_flush_emit. 6) In r600_context_flush, don't remove the comment about flushing framebuffer caches, because it's still done there. 7) Masking out R600_CONTEXT_FLUSH in r600_context_emit_fence is not correct. We should still flush the caches later if they're dirty and even if the fence was emitted. You can't see this regression in piglit, because we don't have a test for that. 8) There's some inconsistent flushing between graphics and compute colorbuffer bindings. For graphics, you use (WAIT_IDLE | FLUSH_AND_INV), which makes sense. For compute, you use R600_CONTEXT_FLUSH (which is used for vertex buffers and the like elsewhere, but not colorbuffers). And one question: Why do you use set both FLUSH_AND_INV and STREAMOUT_FLUSH on Evergreen, while r600 only gets FLUSH_AND_INV? Did you overlook this? Marek On Thu, Dec 6, 2012 at 8:51 PM, wrote: > From: Jerome Glisse > > This bring r600g allmost inline with closed source driver when > it comes to flushing and synchronization pattern. > > Signed-off-by: Jerome Glisse > --- > src/gallium/drivers/r600/evergreen_compute.c | 8 +- > .../drivers/r600/evergreen_compute_internal.c | 4 +- > src/gallium/drivers/r600/evergreen_state.c | 4 +- > src/gallium/drivers/r600/r600.h| 16 +-- > src/gallium/drivers/r600/r600_hw_context.c | 154 > - > src/gallium/drivers/r600/r600_state.c | 18 ++- > src/gallium/drivers/r600/r600_state_common.c | 19 ++- > 7 files changed, 61 insertions(+), 162 deletions(-) > > diff --git a/src/gallium/drivers/r600/evergreen_compute.c > b/src/gallium/drivers/r600/evergreen_compute.c > index 44831a7..33a5910 100644 > --- a/src/gallium/drivers/r600/evergreen_compute.c > +++ b/src/gallium/drivers/r600/evergreen_compute.c > @@ -98,7 +98,7 @@ static void evergreen_cs_set_vertex_buffer( > > /* The vertex instructions in the compute shaders use the texture > cache, > * so we need to invalidate it. */ > - rctx->flags |= R600_CONTEXT_TEX_FLUSH; > + rctx->flags |= R600_CONTEXT_FLUSH; > state->enabled_mask |= 1 << vb_index; > state->dirty_mask |= 1 << vb_index; > state->atom.dirty = true; > @@ -329,7 +329,7 @@ static void compute_emit_cs(struct r600_context *ctx, > const uint *block_layout, > */ > r600_emit_command_buffer(ctx->cs, &ctx->start_compute_cs_cmd); > > - ctx->flags |= R600_CONTEXT_CB_FLUSH; > + ctx->flags |= R600_CONTEXT_FLUSH; > r600_flush_emit(ctx); > > /* Emit colorbuffers. */ > @@ -409,7 +409,7 @@ static void compute_emit_cs(struct r600_context *ctx, > const uint *block_layout, > > /* XXX evergreen_flush_emit() hardcodes the CP_COHER_SIZE to > 0x > */ > - ctx->flags |= R600_CONTEXT_CB_FLUSH; > + ctx->flags |= R600_CONTEXT_FLUSH; > r600_flush_emit(ctx); > > #if 0 > @@ -468,7 +468,7 @@ void evergreen_emit_cs_shader( > r600_write_value(cs, r600_context_bo_reloc(rctx, kernel->code_bo, > RADEON_USAGE_READ)); > > - rctx->flags |= R600_CONTEXT_SHADERCONST_FLUSH; > + rctx->flags |= R600_CONTEXT_FLUSH; > } > > static void evergreen_launch_grid( > diff --git a/src/gallium/drivers/r600/evergreen_compute_internal.c > b/src/gallium/drivers/r600/evergreen_compute_internal.c > index 7bc7fb4..187bcf1 100644 > --- a/src/gallium/drivers/r600/evergreen_compute_internal.c > +++ b/src/gallium/drivers/r600/evergreen_compute_internal.c > @@ -538,7 +538,7 @@ void evergreen_set_tex_resource( > > util_format_get_blockwidth(tmp->resource.b.b.format) * > view->base.texture->width0*height*depth; > > - pipe->ctx->flags |= R600_CONTEXT_TEX_FLUSH; > + pipe->ctx->flags |= R600_CONTEXT_FLUSH; > > evergreen_emit_force_reloc(res); > evergreen_emit_force_reloc(res); > @@ -597,7 +597,7 @@ void evergreen_set_const_cache( > res->usage = RADEON_USAGE_READ; > res->coher_bo_size = size; > > - pipe->ctx->flags |= R600_CONTEXT_SHADERCONST
[Mesa-dev] [PATCH 1/2] r600g: rework flusing and synchronization pattern v4
From: Jerome Glisse This bring r600g allmost inline with closed source driver when it comes to flushing and synchronization pattern. Signed-off-by: Jerome Glisse --- src/gallium/drivers/r600/evergreen_compute.c | 8 +- .../drivers/r600/evergreen_compute_internal.c | 4 +- src/gallium/drivers/r600/evergreen_state.c | 4 +- src/gallium/drivers/r600/r600.h| 16 +-- src/gallium/drivers/r600/r600_hw_context.c | 154 - src/gallium/drivers/r600/r600_state.c | 18 ++- src/gallium/drivers/r600/r600_state_common.c | 19 ++- 7 files changed, 61 insertions(+), 162 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c index 44831a7..33a5910 100644 --- a/src/gallium/drivers/r600/evergreen_compute.c +++ b/src/gallium/drivers/r600/evergreen_compute.c @@ -98,7 +98,7 @@ static void evergreen_cs_set_vertex_buffer( /* The vertex instructions in the compute shaders use the texture cache, * so we need to invalidate it. */ - rctx->flags |= R600_CONTEXT_TEX_FLUSH; + rctx->flags |= R600_CONTEXT_FLUSH; state->enabled_mask |= 1 << vb_index; state->dirty_mask |= 1 << vb_index; state->atom.dirty = true; @@ -329,7 +329,7 @@ static void compute_emit_cs(struct r600_context *ctx, const uint *block_layout, */ r600_emit_command_buffer(ctx->cs, &ctx->start_compute_cs_cmd); - ctx->flags |= R600_CONTEXT_CB_FLUSH; + ctx->flags |= R600_CONTEXT_FLUSH; r600_flush_emit(ctx); /* Emit colorbuffers. */ @@ -409,7 +409,7 @@ static void compute_emit_cs(struct r600_context *ctx, const uint *block_layout, /* XXX evergreen_flush_emit() hardcodes the CP_COHER_SIZE to 0x */ - ctx->flags |= R600_CONTEXT_CB_FLUSH; + ctx->flags |= R600_CONTEXT_FLUSH; r600_flush_emit(ctx); #if 0 @@ -468,7 +468,7 @@ void evergreen_emit_cs_shader( r600_write_value(cs, r600_context_bo_reloc(rctx, kernel->code_bo, RADEON_USAGE_READ)); - rctx->flags |= R600_CONTEXT_SHADERCONST_FLUSH; + rctx->flags |= R600_CONTEXT_FLUSH; } static void evergreen_launch_grid( diff --git a/src/gallium/drivers/r600/evergreen_compute_internal.c b/src/gallium/drivers/r600/evergreen_compute_internal.c index 7bc7fb4..187bcf1 100644 --- a/src/gallium/drivers/r600/evergreen_compute_internal.c +++ b/src/gallium/drivers/r600/evergreen_compute_internal.c @@ -538,7 +538,7 @@ void evergreen_set_tex_resource( util_format_get_blockwidth(tmp->resource.b.b.format) * view->base.texture->width0*height*depth; - pipe->ctx->flags |= R600_CONTEXT_TEX_FLUSH; + pipe->ctx->flags |= R600_CONTEXT_FLUSH; evergreen_emit_force_reloc(res); evergreen_emit_force_reloc(res); @@ -597,7 +597,7 @@ void evergreen_set_const_cache( res->usage = RADEON_USAGE_READ; res->coher_bo_size = size; - pipe->ctx->flags |= R600_CONTEXT_SHADERCONST_FLUSH; + pipe->ctx->flags |= R600_CONTEXT_FLUSH; } struct r600_resource* r600_compute_buffer_alloc_vram( diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 9b898cb..7bc4772 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -1557,14 +1557,14 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx, uint32_t i, log_samples; if (rctx->framebuffer.state.nr_cbufs) { - rctx->flags |= R600_CONTEXT_CB_FLUSH; + rctx->flags |= R600_CONTEXT_WAIT_IDLE | R600_CONTEXT_FLUSH_AND_INV; if (rctx->framebuffer.state.cbufs[0]->texture->nr_samples > 1) { rctx->flags |= R600_CONTEXT_FLUSH_AND_INV_CB_META; } } if (rctx->framebuffer.state.zsbuf) { - rctx->flags |= R600_CONTEXT_DB_FLUSH; + rctx->flags |= R600_CONTEXT_WAIT_IDLE | R600_CONTEXT_FLUSH_AND_INV; } util_copy_framebuffer_state(&rctx->framebuffer.state, state); diff --git a/src/gallium/drivers/r600/r600.h b/src/gallium/drivers/r600/r600.h index 7d43416..4060672 100644 --- a/src/gallium/drivers/r600/r600.h +++ b/src/gallium/drivers/r600/r600.h @@ -180,17 +180,11 @@ struct r600_so_target { unsignedso_index; }; -#define R600_CONTEXT_PS_PARTIAL_FLUSH (1 << 0) -#define R600_CONTEXT_CB_FLUSH (1 << 1) -#define R600_CONTEXT_DB_FLUSH (1 << 2) -#define R600_CONTEXT_SHADERCONST_FLUSH (1 << 3) -#define R600_CONTEXT_TEX_FLUSH (1 << 4) -#define R600_CONTEXT_VTX_FLUSH (1 << 5) -#define R600_CONTEXT_STREAMOUT_FLUSH (1 << 6) -#define R600_CONTEXT_WAIT_IDLE