On 1/26/19 12:42 AM, Rob Clark wrote:
> On Fri, Jan 25, 2019 at 10:48 AM Eduardo Lima Mitev <el...@igalia.com> wrote:
>>
>> There are a bunch of instructions emitted on ir3_compiler_nir related to
>> offset computations for IO opcodes (ssbo, image, etc). This small series
>> explores the possibility of moving these instructions to NIR, where we
>> have higher chances of optimizing them.
>>
>> The series introduces a new, freedreno specific NIR pass,
>> 'ir3_nir_lower_sampler_io' (final name not set). The pass is executed
>> early on ir3_optimize_nir(), and the goal is to centralize all these
>> computations there, hoping that later NIR passes will produce better
>> code than what is currently emitted.
>
> I can think of a few other things that would be interesting to lower
> to driver specific nir opcodes (imul and various lowering for tex
> instructions come to mind.. but probably also ubo and ssbo address
> calculation.. maybe it could even make sense for some of the single
> src alu instructions that translate into multiple ir3 instructions,
> not sure)..
>
Yes, the plan is to abstract to NIR whatever brings us a benefit in
instruction stats. There is also the question of just simplifying the
backend compiler, provided we don't hurt produced code.

> Are you thinking about having separate passes for each? I guess at
> least for alu instructions we might be able to use nir_algebraic so
> having things split up might be easier.
>

I haven't thought too much about this yet, but it seems to make sense
having at least 2 passes, one for I/O and one for ALUs.

>> So far, we have just implemented byte-offset computation for image store
>> and atomics. This seemed like a good first target given the amount of
>> instructions being emitted for it by the backend.
>>
>> This is an RFC series because there are a few open questions, but we
>> wanted to gather feedback already now, in case this effort is something
>> not worth it; and also hoping that somebody else will give it a try
>> against other shaders and on other gens (we have just tried this on
>> a5xx).
>>
>> * We have so far been unable to see any improvement in generated code
>>   (not a penalty either). shader-db has not been especially useful. Few
>>   shaders there exercise image store or image atomic ops, and of those
>>   that do, most require higher versions of GLSL than what freedreno
>>   supports, so they get skipped. The few that do actually run don't
>>   show any meaningful difference.
>
> I guess it would be easy enough to construct shaders that would
> benefit from this, but maybe that is cheating..
>
> Possibly UBO and SSBO is a better target, I guess there you might be
> more likely to see patterns of access of successive elements (ie.
> foo[idx], foo[idx+1], etc)?
>

I took a first stab at SSBO's load/store/atomic, where the offset is
divided by 4 in the backend, but was bitten by IR3_STGB requiring both
the original byte-offset and the dword-offset (in src1 and src2
respectively). So trivially emitting a nir_shr on the offset didn't buy
us anything.
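
(A small Python sketch of the issue, not Mesa code: the helper name is
made up for illustration, and the src1/src2 split is taken from the
description above. It assumes a dword-aligned access.) Dividing by 4 is
a single shift, but since the store wants both values, the byte offset
has to stay live next to the dword offset, so moving only the shift to
NIR saves nothing on its own.

```python
from typing import Tuple


def ssbo_offsets(byte_offset: int) -> Tuple[int, int]:
    """Return (byte_offset, dword_offset) for a dword-aligned access.

    Hypothetical helper, for illustration only. It models the pair of
    values that the text says IR3_STGB takes in src1 and src2.
    """
    if byte_offset < 0 or byte_offset % 4 != 0:
        raise ValueError("expected a non-negative, dword-aligned byte offset")
    dword_offset = byte_offset >> 2  # same result as dividing by 4
    return byte_offset, dword_offset
```

Both elements of the returned pair are needed by the consumer, which is
the same shape as a 2-component value carrying the two offsets together.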
I have in the backlog to revisit this, turning the offset into a
2-component reg so we can hold the original byte-offset and the offset
divided by 4.

> Anyways, since we don't try to do (and I'd rather not do) any sort of
> CSE post nir->ir3 I think starting to introduce more ir3 specific
> nir->nir lowering seems like a thing we need, so I'm pretty happy that
> someone is looking at this :-)
>

Thanks, that's encouraging. Let's see how far we can get :).

Eduardo

> BR,
> -R
>
>>   Then other shaders picked from test suites are simple enough not to
>>   produce any difference in code either.
>>
>>   There is still on-going work looking for cases where the pass helps
>>   instruction stats, whether writing custom shaders or porting complex
>>   shaders from shader-db to run on GLES 310.
>>
>>   There is though an open question here as to whether moving backend
>>   code to NIR is a benefit in and of itself.
>>
>> * The series adds a nir_op_imad opcode that didn't exist before, and
>>   is perhaps not generally useful even for freedreno outside this pass,
>>   because it maps to IR3_MAD_S24 which is probably not suitable for
>>   generic integer multiply-add.
>>
>> * The pass currently has 2 alternative code-paths to emit the
>>   multiplication by the bytes-per-pixel of an image format. In one
>>   case, since this value can be obtained at compile time, it is
>>   emitted as an immediate by nir_imul_imm. The other alternative is
>>   emitting an nir_imul with an SSA value that will map to
>>   image_dims[0] at shader runtime.
>>
>>   The former case is uglier but produces better code (a single SHL
>>   instruction), whereas the latter involves a generic imul, for which
>>   the backend emits a lot of code to cover for overflow.
>>
>>   The doubt here is whether we should introduce a (lower precision)
>>   version of imul that maps directly to IR3_IMUL_S.
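
(To illustrate the two code paths in the last bullet above, a minimal
Python sketch, not Mesa code. All function names are hypothetical, and
it assumes the bytes-per-pixel value is a power of two, which is what
lets the constant case collapse into a single shift.)

```python
def bytes_per_pixel_shift(bpp: int) -> int:
    """Return log2(bpp); bpp must be a positive power of two (assumption)."""
    if bpp <= 0 or bpp & (bpp - 1):
        raise ValueError("bpp must be a positive power of two")
    return bpp.bit_length() - 1


def byte_offset_const_bpp(texel_index: int, bpp: int) -> int:
    """Compile-time path: bpp is a known constant, so a shift is enough."""
    return texel_index << bytes_per_pixel_shift(bpp)


def byte_offset_runtime_bpp(texel_index: int, bpp: int) -> int:
    """Runtime path: bpp is only available as a value, so multiply."""
    return texel_index * bpp
```

Both paths give the same byte offset; the difference discussed in the
bullet is only in how many backend instructions each one costs.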
>>
>>
>> A live (WIP) tree of the series can be found at:
>> <https://gitlab.freedesktop.org/elima/mesa/commits/wip/fd-compiler-io>
>>
>> We plan to continue moving computations to the pass if we see
>> good opportunities.
>>
>> Feedback very welcome,
>>
>> cheers,
>> Eduardo
>>
>> Eduardo Lima Mitev (4):
>>   nir: Add a new intrinsic 'load_image_stride'
>>   nir: Add a new ALU nir_op_imad
>>   ir3/nir: Add a new pass 'ir3_nir_lower_sampler_io'
>>   ir3: Use ir3_nir_lower_sampler_io pass
>>
>>  src/compiler/nir/nir_intrinsics.py           |   2 +
>>  src/compiler/nir/nir_opcodes.py              |   1 +
>>  src/freedreno/Makefile.sources               |   1 +
>>  src/freedreno/ir3/ir3_compiler_nir.c         |  61 ++--
>>  src/freedreno/ir3/ir3_nir.c                  |   1 +
>>  src/freedreno/ir3/ir3_nir.h                  |   1 +
>>  src/freedreno/ir3/ir3_nir_lower_sampler_io.c | 349 +++++++++++++++++++
>>  7 files changed, 383 insertions(+), 33 deletions(-)
>>  create mode 100644 src/freedreno/ir3/ir3_nir_lower_sampler_io.c
>>
>> --
>> 2.20.1
>>
>> _______________________________________________
>> Freedreno mailing list
>> Freedreno@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/freedreno