On Wed, Dec 19, 2012 at 12:33 PM, Tom Stellard <t...@stellard.net> wrote: > On Sun, Dec 16, 2012 at 08:33:23PM +1000, Dave Airlie wrote: >> From: Dave Airlie <airl...@redhat.com> >> >> This adds TBO support to r600g, and with GLSL 1.40 enabled, >> we now get 3.1 core profiles advertised for r600g. >> >> This code is evergreen only so far, but I don't think there is >> much to make it work on r600/700/cayman other than testing. >> >> a) buffer txq is broken like cube map txq, this sucks, fix it the >> exact same way. >> >> b) buffer fetches are done with a vertex clause, >> >> c) vertex swizzling offsets are different than texture swizzles, >> but we still need to use the combiner, so make it configurable. >> >> d) add implementation of UCMP. >> >> TODO: r600/700/cayman testin >> Signed-off-by: Dave Airlie <airl...@redhat.com> >> --- >> src/gallium/drivers/r600/evergreen_state.c | 55 ++++++++++++++++++++ >> src/gallium/drivers/r600/r600_asm.c | 2 +- >> src/gallium/drivers/r600/r600_asm.h | 2 + >> src/gallium/drivers/r600/r600_pipe.c | 4 +- >> src/gallium/drivers/r600/r600_pipe.h | 10 +++- >> src/gallium/drivers/r600/r600_shader.c | 75 >> ++++++++++++++++++++++++++++ >> src/gallium/drivers/r600/r600_shader.h | 1 + >> src/gallium/drivers/r600/r600_state_common.c | 58 +++++++++++++++++---- >> src/gallium/drivers/r600/r600_texture.c | 16 ++++-- >> 9 files changed, 204 insertions(+), 19 deletions(-) >> > > [snip] > >> diff --git a/src/gallium/drivers/r600/r600_shader.c >> b/src/gallium/drivers/r600/r600_shader.c >> index feb7001..60667e7 100644 >> --- a/src/gallium/drivers/r600/r600_shader.c >> +++ b/src/gallium/drivers/r600/r600_shader.c >> @@ -3819,6 +3819,71 @@ static inline unsigned tgsi_tex_get_src_gpr(struct >> r600_shader_ctx *ctx, >> return ctx->file_offset[inst->Src[index].Register.File] + >> inst->Src[index].Register.Index; >> } >> >> +static int do_vtx_fetch_inst(struct r600_shader_ctx *ctx, boolean >> src_requires_loading) >> +{ >> + struct r600_bytecode_vtx vtx; >> + struct r600_bytecode_alu alu; >> + struct tgsi_full_instruction *inst = >> &ctx->parse.FullToken.FullInstruction; >> + int src_gpr, r, i; >> + >> + src_gpr = tgsi_tex_get_src_gpr(ctx, 0); >> + if (src_requires_loading) { >> + for (i = 0; i < 4; i++) { >> + memset(&alu, 0, sizeof(struct r600_bytecode_alu)); >> + alu.inst = >> CTX_INST(V_SQ_ALU_WORD1_OP2_SQ_OP2_INST_MOV); >> + r600_bytecode_src(&alu.src[0], &ctx->src[0], i); >> + alu.dst.sel = ctx->temp_reg; >> + alu.dst.chan = i; >> + if (i == 3) >> + alu.last = 1; >> + alu.dst.write = 1; >> + r = r600_bytecode_add_alu(ctx->bc, &alu); >> + if (r) >> + return r; >> + } >> + src_gpr = ctx->temp_reg; >> + } >> + >> + memset(&vtx, 0, sizeof(vtx)); >> + vtx.inst = 0; >> + vtx.buffer_id = tgsi_tex_get_src_gpr(ctx, 1) + R600_MAX_CONST_BUFFERS;; >> + vtx.fetch_type = 2; /* VTX_FETCH_NO_INDEX_OFFSET */ >> + vtx.src_gpr = src_gpr; >> + vtx.mega_fetch_count = 16; >> + vtx.dst_gpr = ctx->file_offset[inst->Dst[0].Register.File] + >> inst->Dst[0].Register.Index; >> + vtx.dst_sel_x = (inst->Dst[0].Register.WriteMask & 1) ? 0 : 7; >> /* SEL_X */ >> + vtx.dst_sel_y = (inst->Dst[0].Register.WriteMask & 2) ? 1 : 7; >> /* SEL_Y */ >> + vtx.dst_sel_z = (inst->Dst[0].Register.WriteMask & 4) ? 2 : 7; >> /* SEL_Z */ >> + vtx.dst_sel_w = (inst->Dst[0].Register.WriteMask & 8) ? 3 : 7; >> /* SEL_W */ >> + vtx.use_const_fields = 1; >> + vtx.srf_mode_all = 1; /* SRF_MODE_NO_ZERO */ >> + > > According to the docs, srf_mode_all will be ignored if use_const_fields > is set. However, based on my tests while running compute shaders, other > fields like data_format, which are supposed to be ignored weren't being > ignored unless the were set to zero. So, I think it would be safer > here to set srf_mode_all to zero and make sure that bit gets set on > the resource. > > >> + if ((r = r600_bytecode_add_vtx(ctx->bc, &vtx))) >> + return r; >> + return 0; >> +} >> + > > Otherwise, this code for vtx fetch looks good to me. One problem I ran into > with vtx fetch instructions while working on compute shaders was that > the GPU will hang if you write to vtx.src_gpr in the > instruction group following the vtx fetch. Here is a simple example: > > %T2_X<def> = MOV %ZERO > %T3_X<def> = VTX_READ_eg %T2_X<kill>, 24 > %T2_X<def> = MOV %ZERO > > I'm not sure if this happens on all GPU variants, but I was able to > consistently reproduce this on my SUMO. You may want to keep an eye > out for this in case you run into any unexplainable hangs. >
The vtx fetch group had the barrier flag set ? Cheers, Jerome _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev