"Pohjolainen, Topi" <topi.pohjolai...@intel.com> writes: > On Tue, Mar 10, 2015 at 11:07:26PM +0200, Francisco Jerez wrote: >> "Pohjolainen, Topi" <topi.pohjolai...@intel.com> writes: >> >> > On Mon, Mar 09, 2015 at 12:43:08PM +0200, Francisco Jerez wrote: >> >> "Pohjolainen, Topi" <topi.pohjolai...@intel.com> writes: >> >> >> >> > On Sat, Mar 07, 2015 at 04:15:08PM +0200, Francisco Jerez wrote: >> >> >> Topi Pohjolainen <topi.pohjolai...@intel.com> writes: >> >> >> >> >> >> > The original patch from Curro was based on something that is not >> >> >> > present in the master yet. This patch tries to mimick the logic on >> >> >> > top master. >> >> >> > I wanted to see if could separate the building of the descriptor >> >> >> > instruction from building of the send instruction. This logic now >> >> >> > allows the caller to construct any kind of sequence of instructions >> >> >> > filling in the descriptor before giving it to the send instruction >> >> >> > builder. >> >> >> > >> >> >> > This is only compile tested. Curro, how would you feel about this >> >> >> > sort of approach? I apologise for muddying the waters again but I >> >> >> > wasn't entirely comfortable with the logic and wanted to try to >> >> >> > something else. >> >> >> > >> >> >> > I believe patch number five should go nicely on top of this as >> >> >> > the descriptor instruction could be followed by (or preceeeded by) >> >> >> > any additional instructions modifying the descriptor register >> >> >> > before the actual send instruction. >> >> >> > >> >> >> >> >> >> Topi, the goal I had in mind with PATCH 01 was to refactor a commonly >> >> >> recurring pattern. In terms of the functions defined in this patch my >> >> >> example from yesterday [1] would now look like: >> >> >> >> >> >> | if (index.file == BRW_IMMEDIATE_VALUE) { >> >> >> | >> >> >> | uint32_t surf_index = index.dw1.ud; >> >> >> | >> >> >> | brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND); >> >> >> | brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW)); >> >> >> | brw_set_src0(p, send, offset); >> >> >> | brw_set_sampler_message(p, send, >> >> >> | surf_index, >> >> >> | 0, /* LD message ignores sampler unit */ >> >> >> | GEN5_SAMPLER_MESSAGE_SAMPLE_LD, >> >> >> | rlen, >> >> >> | mlen, >> >> >> | false, /* no header */ >> >> >> | simd_mode, >> >> >> | 0); >> >> >> | >> >> >> | brw_mark_surface_used(prog_data, surf_index); >> >> >> | >> >> >> | } else { >> >> >> | >> >> >> | struct brw_reg addr = vec1(retype(brw_address_reg(0), >> >> >> BRW_REGISTER_TYPE_UD)); >> >> >> | >> >> >> | brw_push_insn_state(p); >> >> >> | brw_set_default_mask_control(p, BRW_MASK_DISABLE); >> >> >> | brw_set_default_access_mode(p, BRW_ALIGN_1); >> >> >> | >> >> >> | /* a0.0 = surf_index & 0xff */ >> >> >> | brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND); >> >> >> | brw_inst_set_exec_size(p->brw, insn_and, BRW_EXECUTE_1); >> >> >> | brw_set_dest(p, insn_and, addr); >> >> >> | brw_set_src0(p, insn_and, vec1(retype(index, >> >> >> BRW_REGISTER_TYPE_UD))); >> >> >> | brw_set_src1(p, insn_and, brw_imm_ud(0x0ff)); >> >> >> | >> >> >> | >> >> >> | /* a0.0 |= <descriptor> */ >> >> >> | brw_inst *descr_inst = brw_build_indirect_message_descr(p, >> >> >> addr, addr); >> >> >> | brw_set_sampler_message(p, descr_inst, >> >> >> | 0 /* surface */, >> >> >> | 0 /* sampler */, >> >> >> | GEN5_SAMPLER_MESSAGE_SAMPLE_LD, >> >> >> | rlen /* rlen */, >> >> >> | mlen /* mlen */, >> >> >> | false /* header */, >> >> >> | simd_mode, >> >> >> | 0); >> >> >> | >> >> >> | /* dst = send(offset, a0.0) */ >> >> >> | brw_send_indirect_message(p, BRW_SFID_SAMPLER, dst, offset, >> >> >> addr); >> >> >> | >> >> >> | brw_pop_insn_state(p); >> >> >> | >> >> >> | /* visitor knows more than we do about the surface limit >> >> >> required, >> >> >> | * so has already done marking. >> >> >> | */ >> >> >> | } >> >> > >> >> > Which I think could also be written as follows. Or am I missing >> >> > something >> >> > again? >> >> > >> >> > static brw_inst * >> >> > brw_build_surface_index_descr(struct brw_compile *p, >> >> > struct brw_reg dst, index) >> >> > { >> >> > brw_set_default_mask_control(p, BRW_MASK_DISABLE); >> >> > brw_set_default_access_mode(p, BRW_ALIGN_1); >> >> > >> >> > /* a0.0 = surf_index & 0xff */ >> >> > brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND); >> >> > brw_inst_set_exec_size(p->brw, insn_and, BRW_EXECUTE_1); >> >> > brw_set_dest(p, insn_and, addr); >> >> > brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD))); >> >> > brw_set_src1(p, insn_and, brw_imm_ud(0x0ff)); >> >> > >> >> > /* a0.0 |= <descriptor> */ >> >> > brw_inst *descr_inst = brw_build_indirect_message_descr(p, addr, >> >> > addr); >> >> > } >> >> > >> >> > ... >> >> > brw_inst *descr_inst; >> >> > if (index.file == BRW_IMMEDIATE_VALUE) { >> >> > descr = brw_next_insn(p, BRW_OPCODE_SEND); >> >> > brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW)); >> >> > brw_set_src0(p, send, offset); >> >> > >> >> > brw_mark_surface_used(prog_data, surf_index); >> >> > } else { >> >> > struct brw_reg addr = vec1(retype(brw_address_reg(0), >> >> > BRW_REGISTER_TYPE_UD)); >> >> > brw_push_insn_state(p); >> >> > >> >> > brw_build_surface_index_descr(p, addr, index); >> >> > /* dst = send(offset, a0.0) */ >> >> > descr_inst = brw_send_indirect_message(p, BRW_SFID_SAMPLER, >> >> > dst, offset, addr); >> >> > brw_pop_insn_state(p); >> >> > } >> >> > >> >> > uint32_t surf_index = index.file == BRW_IMMEDIATE_VALUE ? >> >> > index.dw1.ud : 0; >> >> > brw_set_sampler_message(p, descr_inst, >> >> > surf_index, >> >> > 0, /* LD message ignores sampler unit */ >> >> > GEN5_SAMPLER_MESSAGE_SAMPLE_LD, >> >> > rlen, >> >> > mlen, >> >> > false, /* no header */ >> >> > simd_mode, >> >> > 0); >> >> > ... >> >> >> >> Yeah, something like that. In any case you need logic to handle both >> >> the immediate and indirect cases explicitly by every caller, which seems >> >> like plenty of duplication to me, not sure what you are trying to >> >> achieve with this. >> > >> > Well, to me it seems that cases where address is an immediate and >> > non-immediate already have quite fixed paths. I'm uneasy about changing the >> > final kick-off of the message sending to have two alternatives for these >> > - in each path only one final branch is really possible but this isn't that >> > obvious to the reader anymore as one needs to go and analyze the callers: >> > >> >> No, that's not right, *all* users of brw_send_indirect_message() or >> brw_send_indirect_surface_message() will eventually have to handle >> *both* immediate and indirect descriptor cases, and the necessary logic >> is identical for *all* of them. > > Some rational in the commit message would have helped me to understand the > ultimate goal, same as with generate_tex(). > I'll continue with patch number five again. There are quite few things moving > around making it difficult to double check. > OK, I'll write a short explanation in the commit message.
> Also, this series is not based on master but something else. What is your > plan on landing this? > Right, it was based on my constant cache series, I've just rebased it on top of master. >> >> > Patch 1: >> > - introduces brw_send_indirect_message() considering if address is >> > immediate or not. >> > - called by generate_tex() against address that is never an immediate >> >> This is only temporary, I have a (not yet sent for review) patch that >> gets rid of explicit handling of immediate vs. indirect descriptors in >> the texturing and pull constant code completely. >> >> > - called by generate_uniform_pull_constant_load_gen7() in a branch which >> > explicitly uses an immediate and another which explicitly uses >> > non-immediate >> > >> I don't think that any immediate-only branch of >> generate_uniform_pull_constant_load_gen7() is calling >> brw_send_indirect_message() at this point. >> >> > Patch 5: >> > - introduces brw_send_indirect_surface_message() considering if address >> > is >> > immediate or not. >> > - called by brw_untyped_atomic() and brw_untyped_surface_read() which >> > both are only called against immediates >> > >> > Patch 11: >> > - introduces brw_untyped_surface_write() which is only called against >> > immediate >> > >> > Patch 12: >> > - introduces brw_typed_atomic() which is only called against immediate >> > - introduces brw_typed_surface_read() which is only called against >> > immediate >> > - introduces brw_typed_surface_write() which is only called against >> > immediate >> > >> Typed and untyped surface opcodes are *all* called with both immediate >> and indirect surfaces. >> >> > >> > >> > Then, in general I was hoping to have design pattern of this sort. Caller: >> > >> > create_descriptor(); >> > add_property_1_to_descriptor(); >> > add_property_2_to_descriptor(); >> > ... >> > send() >> > >> >> I guess I find this acceptable usually, but sticking to this model in >> this case prevents abstraction, because the shared logic is only in the >> first and last statements of your example. > > The property additions are shared as well, property_1 being the brw_OR() > in brw_send_indirect_message(). > >> >> > >> > Instead of: >> > >> > create_descriptor_and_send() >> > { >> > create(); >> > send(); >> > } >> > >> > create_descriptor_with_1_and_send() >> > { >> > create_descriptor_and_send(); >> > add_property_1(); >> > } >> > >> >> If create_descriptor_with_1_and_send() is pseudocode for >> brw_send_indirect_surface_message(), the only reason why it's there is >> to emit the AND instruction that masks out invalid bits from the surface >> index, which could otherwise lead to hangs. I'm not planning on adding >> more nesting levels. > > By that I meant brw_send_indirect_message() and > brw_send_indirect_surface_message() being create_descriptor_with_1_and_send().
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev