Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: > On Thu, 2017-02-09 at 18:28 -0800, Francisco Jerez wrote: >> Francisco Jerez <curroje...@riseup.net> writes: >> >> > --- >> > This replaces "[PATCH v2 09/20] i965/fs: indirect addressing with >> > doubles is not supported in IVB/BYT". >> > >> >> Note that some of the fp64 indirect addressing test-cases still fail >> on >> IVB even with this patch applied, but the reason doesn't seem to have >> anything to do with indirect addressing. Apparently the remaining >> failures are caused by illegal code like: >> >> > sel(8) g30<1>D g9<0,2,1>DF g8.3<0,2,1>DF { >> > align1 1Q }; >> >> The problem is that the SEL instruction doesn't support most datatype >> conversions, so this leads to data corruption on at least IVB. >> According to the hardware docs, the only valid destination type of >> SEL >> is DF if the execution type is DF, and F (or HF on CHV+) if the >> execution type is F. Integer 16 or 32-bit execution types seem to >> allow >> converting to other single-precision integer types. >> > > Some weeks ago I saw this issue while working in a solution for IVB's > MOV INDIRECT. That SEL is added by opt_peephole_sel(): > > https://lists.freedesktop.org/archives/mesa-dev/2017-January/142023.htm > l > > However, I did it very restrictive by just don't allowing any > conversion between different data type sizes. I am going to improve > that patch to include all the allowed cases. >
I wouldn't worry too much about the couple of supported type conversions, it doesn't seem terribly important, my concern with your patch is that it does seem to allow several type conversions which are invalid, like D->F, F->D, DF->Q, etc. Aside from that I'm not particularly confident that your patch will fix all potential sources of invalid conversions. E.g. do we know that no other optimization pass will ever change the destination type of the SEL assuming that the initial type was correct? What about other instructions with double-to-single or single-to-double type conversion restrictions? (A quick look at the PRMs would likely reveal instructions with similar restrictions) -- I don't think we can answer these questions without auditing the rest of the compiler back-end (while doing this in a legalization pass would be pretty much obviously correct). Alternatively you could fix the EU validator to recognize this and other cases of invalid conversions, which would make sure we get an assertion failure in the likely case that we've missed something so we can find the problem quickly instead of spending hours debugging non-deterministic data corruption issues. I'm okay either way, validation or legalization pass, do it as you like. >> I'm not sure why we haven't seen this cause massive breakage on HSW+, >> but I think we need some sort of legalization pass to do the type >> conversion in a separate MOV for any instruction with an invalid >> destination conversion. You may be able to do it within the d2x pass >> but then it would make sense to rename it to something more >> appropriate >> like destination conversion lowering (lower_cvt? :P). >> > > :) > >> In addition there seem to be other issues with your d2x lowering code >> (I'm bringing this up here because you don't seem to have sent the >> d2x >> changes for review to the mailing list yet) -- It removes the >> original >> instruction and creates a new one with the same opcode and first few >> sources, which will miscompile the program if the original >> instruction >> had a larger number of sources or any other instruction controls >> set. I >> suggest you modify the original instruction in-place to point it to >> the >> temporary you've allocated, and then copy the data into the original >> destination by using a strided move. >> > > OK. > > Thanks, > > Sam > >> > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 27 >> > ++++++++++++++++++++++++-- >> > 1 file changed, 25 insertions(+), 2 deletions(-) >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > index ea4a3fe1399..0e2dbe23571 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > @@ -440,7 +440,7 @@ fs_generator::generate_mov_indirect(fs_inst >> > *inst, >> > brw_MOV(p, dst, reg); >> > } else { >> > /* Prior to Broadwell, there are only 8 address registers. >> > */ >> > - assert(inst->exec_size == 8 || devinfo->gen >= 8); >> > + assert(inst->exec_size <= 8 || devinfo->gen >= 8); >> > >> > /* We use VxH indirect addressing, clobbering a0.0 through >> > a0.7. */ >> > struct brw_reg addr = vec8(brw_address_reg(0)); >> > @@ -478,7 +478,30 @@ fs_generator::generate_mov_indirect(fs_inst >> > *inst, >> > * code, using it saves us 0 instructions and would require >> > quite a bit >> > * of case-by-case work. It's just not worth it. >> > */ >> > - brw_ADD(p, addr, indirect_byte_offset, >> > brw_imm_uw(imm_byte_offset)); >> > + if (devinfo->gen >= 8 || devinfo->is_haswell || >> > type_sz(reg.type) < 8) { >> > + brw_ADD(p, addr, indirect_byte_offset, >> > brw_imm_uw(imm_byte_offset)); >> > + } else { >> > + /* IVB reads two address register components per channel >> > for >> > + * indirectly addressed 64-bit sources, so we need to >> > initialize >> > + * adjacent address components to consecutive dwords of >> > the source >> > + * region by emitting two separate ADD >> > instructions. Found >> > + * empirically. >> > + */ >> > + assert(inst->exec_size <= 4); >> > + brw_push_insn_state(p); >> > + brw_set_default_exec_size(p, cvt(inst->exec_size) - 1); >> > + >> > + brw_ADD(p, spread(addr, 2), indirect_byte_offset, >> > + brw_imm_uw(imm_byte_offset)); >> > + brw_inst_set_no_dd_clear(devinfo, brw_last_inst, true); >> > + >> > + brw_ADD(p, spread(suboffset(addr, 1), 2), >> > indirect_byte_offset, >> > + brw_imm_uw(imm_byte_offset + 4)); >> > + brw_inst_set_no_dd_check(devinfo, brw_last_inst, true); >> > + >> > + brw_pop_insn_state(p); >> > + } >> > + >> > struct brw_reg ind_src = brw_VxH_indirect(0, 0); >> > >> > brw_inst *mov = brw_MOV(p, dst, retype(ind_src, dst.type)); >> > -- >> > 2.11.0 >> > >> > _______________________________________________ >> > mesa-dev mailing list >> > mesa-dev@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev