On Wed, 2015-12-16 at 11:39 -0800, Kenneth Graunke wrote: > On Wednesday, December 16, 2015 10:02:16 AM Iago Toral Quiroga wrote: > > The BDW PRM Vol2a: Command Reference: Instructions, section > > MEDIA_CURBE_LOAD, > > says that 'CURBE Total Data Length' and 'CURBE Data Start Address' are > > 64-byte aligned. This is different from previous gens, that were 32-byte > > aligned. > > > > v2 (Jordan): > > - CURBE Data Start Address is also 64-byte aligned. > > - The call to brw_state_batch should also use 64-byte alignment. > > - Improve PRM reference. > > > > Fixes the following SSBO CTS tests on BDW: > > ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs > > ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs > > ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs > > ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs > > ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs > > ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs > > ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs > > > > And many other CS CTS tests as reported by Marta Lofstedt. > > --- > > src/mesa/drivers/dri/i965/gen7_cs_state.c | 12 ++++++++---- > > 1 file changed, 8 insertions(+), 4 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c > > b/src/mesa/drivers/dri/i965/gen7_cs_state.c > > index 1fde69c..df0f301 100644 > > --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c > > +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c > > @@ -68,7 +68,7 @@ brw_upload_cs_state(struct brw_context *brw) > > > > uint32_t *bind = (uint32_t*) brw_state_batch(brw, > > AUB_TRACE_BINDING_TABLE, > > > > prog_data->binding_table.size_bytes, > > - 32, > > &stage_state->bind_bo_offset); > > + 64, > > &stage_state->bind_bo_offset); > > I don't understand this hunk - binding tables don't have anything to do > with push constants. These are for pull constants and UBOs. At least > in the 3D pipeline, we only align these to 32B, not 64. > > > unsigned local_id_dwords = 0; > > > > @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw) > > > > unsigned push_constant_data_size = > > (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value); > > - unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32); > > + unsigned reg_aligned_constant_size = > > + ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64); > > unsigned push_constant_regs = reg_aligned_constant_size / 32; > > unsigned threads = get_cs_thread_count(cs_prog_data); > > > > @@ -138,11 +139,13 @@ brw_upload_cs_state(struct brw_context *brw) > > ADVANCE_BATCH(); > > > > if (reg_aligned_constant_size > 0) { > > + const unsigned aligned_push_const_offset = > > + ALIGN(stage_state->push_const_offset, brw->gen < 8 ? 32 : 64); > > This is wrong. What you want is to change: > > param = (gl_constant_value*) > brw_state_batch(brw, type, > reg_aligned_constant_size * threads, > 32, &stage_state->push_const_offset); > > to use an alignment of 64 instead of 32 on Gen8+. That way, it'll > actually upload the data to a portion of the buffer that starts on > a 64B aligned boundary. > > As is, you're uploading the data to a 32B aligned section and then > just fudging the pointer to be 64B aligned, possibly skipping over > the first 32B. Probably not what you wanted :) > > Maybe you accidentally changed the wrong brw_state_batch call?
Ouch! yeah, I meant to change the other call, sorry about that. Anyway, I think the patch proposed by Jordan is good so I won't send another version. Thanks for the review Ken! Iago > > BEGIN_BATCH(4); > > OUT_BATCH(MEDIA_CURBE_LOAD << 16 | (4 - 2)); > > OUT_BATCH(0); > > OUT_BATCH(reg_aligned_constant_size * threads); > > - OUT_BATCH(stage_state->push_const_offset); > > + OUT_BATCH(aligned_push_const_offset); > > ADVANCE_BATCH(); > > } > > > > @@ -241,7 +244,8 @@ brw_upload_cs_push_constants(struct brw_context *brw, > > > > const unsigned push_constant_data_size = > > (local_id_dwords + prog_data->nr_params) * > > sizeof(gl_constant_value); > > - const unsigned reg_aligned_constant_size = > > ALIGN(push_constant_data_size, 32); > > + const unsigned reg_aligned_constant_size = > > + ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64); > > const unsigned param_aligned_count = > > reg_aligned_constant_size / sizeof(*param); > > > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev