Francisco Jerez <curroje...@riseup.net> writes: > Matt Turner <matts...@gmail.com> writes: > >> On Wed, Mar 11, 2015 at 2:29 PM, Francisco Jerez <curroje...@riseup.net> >> wrote: >>> Matt Turner <matts...@gmail.com> writes: >>>> commit 4c4934636cb286e7d7836afc26e9d392e2f0f155 >>>> Author: Paul Berry <stereotype...@gmail.com> >>>> Date: Tue Sep 24 15:18:52 2013 -0700 >>>> >>>> i965/blorp: retype destination register for texture SEND instruction >>>> to UW. >>>> >>>> The resource streamer only exists on HSW+, so the UW dest is certainly >>>> needed for things after Gen5. >>> >>> Odd, I haven't seen a mention of that restriction in the hardware specs >>> (at least not on reasonably recent ones -- the Gen4 and 5 specs do >>> mention it and they actually hang if you send a message with compression >>> enabled and anything bigger than a W as destination type). Is this a >>> purely empirical finding? If so, doesn't it deserve a big fat warning >>> comment? Is this only a problem for some interaction with the resource >>> streamer or has it ever been observed to fix something else? >> >> This page [0] says: >> >> """ >> The subregister number, horizontal stride, destination mask and type >> fields of <dest> are always valid and are used in part to generate the >> WrEn. This is true even if <dest> is a null register (this is an >> exception for null as for most cases these fields are ignored by >> hardware). These parameters of <dest> follow the same restriction as >> that of normal destination operand – destination region cannot cross >> the 256-bit register boundary. >> """ >> >> Searching for the exact phrase quoted in Paul's commit finds another >> page that says it applies to "DevSNB+,Pre-DevBDW". >> > > Hm, searching for the same phrase ("destination region cannot cross the > 256-bit register boundary.") gives several matches, some of the pages > are indeed tagged "DevSNB+,Pre-DevBDW", but in all of them that phrase > appears inside a SNB-only block, I don't see any indication of that > restriction applying to Gen7 and up. >
Meh, I've modified PATCH 01 so it drops the send destination type changes in the Gen7 pull constant load code (see attachment). Do we know if the destination type of the SEND instruction has any other subtle effects, like, affecting the computation of dependency scoreboard signals? >> I think this is one of those cases where they technically give you all >> the information, they just don't tell you anything about what you're >> supposed to do with it. Totally bullshit, but par for the course. >> >> I guess the good news in all of this is that we now know we don't need >> to bother with this for BDW+. >> >> [0] 3D-Media-GPGPU Engine > EU Overview > ISA Introduction > >> Instruction Set Reference > EUISA Instructions > Send Message [SNB+]
From 592a35f79ecfac77d0aee33ff134c4f0c73b95f7 Mon Sep 17 00:00:00 2001 From: Francisco Jerez <curroje...@riseup.net> Date: Mon, 9 Mar 2015 19:33:10 +0200 Subject: [PATCH] i965: Factor out logic to build a send message instruction with indirect descriptor. This is going to be useful because the Gen7+ uniform and varying pull constant, texturing, typed and untyped surface read, write, and atomic generation code on the vec4 and fs back-end all require the same logic to handle conditionally indirect surface indices. In pseudocode: | if (surface.file == BRW_IMMEDIATE_VALUE) { | inst = brw_SEND(p, dst, payload); | set_descriptor_control_bits(inst, surface, ...); | } else { | inst = brw_OR(p, addr, surface, 0); | set_descriptor_control_bits(inst, ...); | inst = brw_SEND(p, dst, payload); | set_indirect_send_descriptor(inst, addr); | } This patch abstracts out this frequently recurring pattern so we can now write: | inst = brw_send_indirect_message(p, sfid, dst, payload, surface) | set_descriptor_control_bits(inst, ...); without worrying about handling the immediate and indirect surface index cases explicitly. v2: Rebase. Improve documentatation and commit message. (Topi) Preserve UW destination type cargo-cult. (Topi, Ken, Matt) Reviewed-by: Topi Pohjolainen <topi.pohjolai...@intel.com> Acked-by: Kenneth Graunke <kenn...@whitecape.org> --- src/mesa/drivers/dri/i965/brw_eu.h | 21 ++++++-- src/mesa/drivers/dri/i965/brw_eu_emit.c | 58 ++++++++++++++++------ src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 63 ++++++------------------ src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 37 +++----------- 4 files changed, 83 insertions(+), 96 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h index 736c54b..e2b5b75 100644 --- a/src/mesa/drivers/dri/i965/brw_eu.h +++ b/src/mesa/drivers/dri/i965/brw_eu.h @@ -205,11 +205,6 @@ void brw_set_sampler_message(struct brw_compile *p, unsigned simd_mode, unsigned return_format); -void brw_set_indirect_send_descriptor(struct brw_compile *p, - brw_inst *insn, - unsigned sfid, - struct brw_reg descriptor); - void brw_set_dp_read_message(struct brw_compile *p, brw_inst *insn, unsigned binding_table_index, @@ -242,6 +237,22 @@ void brw_urb_WRITE(struct brw_compile *p, unsigned offset, unsigned swizzle); +/** + * Send message to shared unit \p sfid with a possibly indirect descriptor \p + * desc. If \p desc is not an immediate it will be transparently loaded to an + * address register using an OR instruction. The returned instruction can be + * passed as argument to the usual brw_set_*_message() functions in order to + * specify any additional descriptor bits -- If \p desc is an immediate this + * will be the SEND instruction itself, otherwise it will be the OR + * instruction. + */ +struct brw_inst * +brw_send_indirect_message(struct brw_compile *p, + unsigned sfid, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg desc); + void brw_ff_sync(struct brw_compile *p, struct brw_reg dest, unsigned msg_reg_nr, diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 6f29468..0d2d07f 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -772,21 +772,6 @@ brw_set_sampler_message(struct brw_compile *p, } } -void brw_set_indirect_send_descriptor(struct brw_compile *p, - brw_inst *insn, - unsigned sfid, - struct brw_reg descriptor) -{ - /* Only a0.0 may be used as SEND's descriptor operand. */ - assert(descriptor.file == BRW_ARCHITECTURE_REGISTER_FILE); - assert(descriptor.type == BRW_REGISTER_TYPE_UD); - assert(descriptor.nr == BRW_ARF_ADDRESS); - assert(descriptor.subnr == 0); - - brw_set_message_descriptor(p, insn, sfid, 0, 0, false, false); - brw_set_src1(p, insn, descriptor); -} - static void gen7_set_dp_scratch_message(struct brw_compile *p, brw_inst *inst, @@ -2495,6 +2480,49 @@ void brw_urb_WRITE(struct brw_compile *p, swizzle); } +struct brw_inst * +brw_send_indirect_message(struct brw_compile *p, + unsigned sfid, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg desc) +{ + const struct brw_context *brw = p->brw; + struct brw_inst *send, *setup; + + assert(desc.type == BRW_REGISTER_TYPE_UD); + + if (desc.file == BRW_IMMEDIATE_VALUE) { + setup = send = next_insn(p, BRW_OPCODE_SEND); + brw_set_src1(p, send, desc); + + } else { + struct brw_reg addr = retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD); + + brw_push_insn_state(p); + brw_set_default_access_mode(p, BRW_ALIGN_1); + brw_set_default_mask_control(p, BRW_MASK_DISABLE); + brw_set_default_predicate_control(p, BRW_PREDICATE_NONE); + + /* Load the indirect descriptor to an address register using OR so the + * caller can specify additional descriptor bits with the usual + * brw_set_*_message() helper functions. + */ + setup = brw_OR(p, addr, desc, brw_imm_ud(0)); + + brw_pop_insn_state(p); + + send = next_insn(p, BRW_OPCODE_SEND); + brw_set_src1(p, send, addr); + } + + brw_set_dest(p, send, dst); + brw_set_src0(p, send, retype(payload, BRW_REGISTER_TYPE_UD)); + brw_inst_set_sfid(brw, send, sfid); + + return setup; +} + static int brw_find_next_block_end(struct brw_compile *p, int start_offset) { diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index cbe6191..37b669c 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -754,9 +754,10 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src brw_AND(p, addr, addr, brw_imm_ud(0x0ff)); brw_OR(p, addr, addr, temp); - /* a0.0 |= <descriptor> */ - brw_inst *insn_or = brw_next_insn(p, BRW_OPCODE_OR); - brw_set_sampler_message(p, insn_or, + /* dst = send(offset, a0.0 | <descriptor>) */ + brw_inst *insn = brw_send_indirect_message( + p, BRW_SFID_SAMPLER, dst, src, addr); + brw_set_sampler_message(p, insn, 0 /* surface */, 0 /* sampler */, msg_type, @@ -765,17 +766,6 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src inst->header_present /* header */, simd_mode, return_format); - brw_inst_set_exec_size(p->brw, insn_or, BRW_EXECUTE_1); - brw_inst_set_src1_reg_type(p->brw, insn_or, BRW_REGISTER_TYPE_UD); - brw_set_src0(p, insn_or, addr); - brw_set_dest(p, insn_or, addr); - - - /* dst = send(offset, a0.0) */ - brw_inst *insn_send = brw_next_insn(p, BRW_OPCODE_SEND); - brw_set_dest(p, insn_send, dst); - brw_set_src0(p, insn_send, src); - brw_set_indirect_send_descriptor(p, insn_send, BRW_SFID_SAMPLER, addr); brw_pop_insn_state(p); @@ -1083,29 +1073,18 @@ fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst, brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD))); brw_set_src1(p, insn_and, brw_imm_ud(0x0ff)); - - /* a0.0 |= <descriptor> */ - brw_inst *insn_or = brw_next_insn(p, BRW_OPCODE_OR); - brw_set_sampler_message(p, insn_or, - 0 /* surface */, - 0 /* sampler */, + /* dst = send(payload, a0.0 | <descriptor>) */ + brw_inst *insn = brw_send_indirect_message( + p, BRW_SFID_SAMPLER, dst, src, addr); + brw_set_sampler_message(p, insn, + 0, + 0, /* LD message ignores sampler unit */ GEN5_SAMPLER_MESSAGE_SAMPLE_LD, - 1 /* rlen */, + 1, /* rlen */ mlen, header_present, BRW_SAMPLER_SIMD_MODE_SIMD4X2, 0); - brw_inst_set_exec_size(p->brw, insn_or, BRW_EXECUTE_1); - brw_inst_set_src1_reg_type(p->brw, insn_or, BRW_REGISTER_TYPE_UD); - brw_set_src0(p, insn_or, addr); - brw_set_dest(p, insn_or, addr); - - - /* dst = send(offset, a0.0) */ - brw_inst *insn_send = brw_next_insn(p, BRW_OPCODE_SEND); - brw_set_dest(p, insn_send, dst); - brw_set_src0(p, insn_send, src); - brw_set_indirect_send_descriptor(p, insn_send, BRW_SFID_SAMPLER, addr); brw_pop_insn_state(p); @@ -1242,10 +1221,11 @@ fs_generator::generate_varying_pull_constant_load_gen7(fs_inst *inst, brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD))); brw_set_src1(p, insn_and, brw_imm_ud(0x0ff)); - - /* a0.0 |= <descriptor> */ - brw_inst *insn_or = brw_next_insn(p, BRW_OPCODE_OR); - brw_set_sampler_message(p, insn_or, + /* dst = send(offset, a0.0 | <descriptor>) */ + brw_inst *insn = brw_send_indirect_message( + p, BRW_SFID_SAMPLER, retype(dst, BRW_REGISTER_TYPE_UW), + offset, addr); + brw_set_sampler_message(p, insn, 0 /* surface */, 0 /* sampler */, GEN5_SAMPLER_MESSAGE_SAMPLE_LD, @@ -1254,17 +1234,6 @@ fs_generator::generate_varying_pull_constant_load_gen7(fs_inst *inst, false /* header */, simd_mode, 0); - brw_inst_set_exec_size(p->brw, insn_or, BRW_EXECUTE_1); - brw_inst_set_src1_reg_type(p->brw, insn_or, BRW_REGISTER_TYPE_UD); - brw_set_src0(p, insn_or, addr); - brw_set_dest(p, insn_or, addr); - - - /* dst = send(offset, a0.0) */ - brw_inst *insn_send = brw_next_insn(p, BRW_OPCODE_SEND); - brw_set_dest(p, insn_send, retype(dst, BRW_REGISTER_TYPE_UW)); - brw_set_src0(p, insn_send, offset); - brw_set_indirect_send_descriptor(p, insn_send, BRW_SFID_SAMPLER, addr); brw_pop_insn_state(p); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 010a5c4..3efd5c9 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -419,9 +419,10 @@ vec4_generator::generate_tex(vec4_instruction *inst, brw_AND(p, addr, addr, brw_imm_ud(0x0ff)); brw_OR(p, addr, addr, temp); - /* a0.0 |= <descriptor> */ - brw_inst *insn_or = brw_next_insn(p, BRW_OPCODE_OR); - brw_set_sampler_message(p, insn_or, + /* dst = send(offset, a0.0 | <descriptor>) */ + brw_inst *insn = brw_send_indirect_message( + p, BRW_SFID_SAMPLER, dst, src, addr); + brw_set_sampler_message(p, insn, 0 /* surface */, 0 /* sampler */, msg_type, @@ -430,17 +431,6 @@ vec4_generator::generate_tex(vec4_instruction *inst, inst->header_present /* header */, BRW_SAMPLER_SIMD_MODE_SIMD4X2, return_format); - brw_inst_set_exec_size(p->brw, insn_or, BRW_EXECUTE_1); - brw_inst_set_src1_reg_type(p->brw, insn_or, BRW_REGISTER_TYPE_UD); - brw_set_src0(p, insn_or, addr); - brw_set_dest(p, insn_or, addr); - - - /* dst = send(offset, a0.0) */ - brw_inst *insn_send = brw_next_insn(p, BRW_OPCODE_SEND); - brw_set_dest(p, insn_send, dst); - brw_set_src0(p, insn_send, src); - brw_set_indirect_send_descriptor(p, insn_send, BRW_SFID_SAMPLER, addr); brw_pop_insn_state(p); @@ -1081,10 +1071,10 @@ vec4_generator::generate_pull_constant_load_gen7(vec4_instruction *inst, brw_set_src0(p, insn_and, vec1(retype(surf_index, BRW_REGISTER_TYPE_UD))); brw_set_src1(p, insn_and, brw_imm_ud(0x0ff)); - - /* a0.0 |= <descriptor> */ - brw_inst *insn_or = brw_next_insn(p, BRW_OPCODE_OR); - brw_set_sampler_message(p, insn_or, + /* dst = send(offset, a0.0 | <descriptor>) */ + brw_inst *insn = brw_send_indirect_message( + p, BRW_SFID_SAMPLER, dst, offset, addr); + brw_set_sampler_message(p, insn, 0 /* surface */, 0 /* sampler */, GEN5_SAMPLER_MESSAGE_SAMPLE_LD, @@ -1093,17 +1083,6 @@ vec4_generator::generate_pull_constant_load_gen7(vec4_instruction *inst, false /* header */, BRW_SAMPLER_SIMD_MODE_SIMD4X2, 0); - brw_inst_set_exec_size(p->brw, insn_or, BRW_EXECUTE_1); - brw_inst_set_src1_reg_type(p->brw, insn_or, BRW_REGISTER_TYPE_UD); - brw_set_src0(p, insn_or, addr); - brw_set_dest(p, insn_or, addr); - - - /* dst = send(offset, a0.0) */ - brw_inst *insn_send = brw_next_insn(p, BRW_OPCODE_SEND); - brw_set_dest(p, insn_send, dst); - brw_set_src0(p, insn_send, offset); - brw_set_indirect_send_descriptor(p, insn_send, BRW_SFID_SAMPLER, addr); brw_pop_insn_state(p); -- 2.1.3
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev