Connor Abbott <cwabbo...@gmail.com> writes: > On Thu, Oct 22, 2015 at 7:21 AM, Iago Toral Quiroga <ito...@igalia.com> wrote: >> I implemented this first as a separate optimization pass in GLSL IR [1], but >> Curro pointed out that this being pretty much a restricted form of a CSE pass >> it would probably make more sense to do it inside CSE (and we no longer have >> a CSE pass in GLSL IR). >> >> Unlike other things we CSE in NIR, in the case of SSBO loads we need to make >> sure that we invalidate previous entries in the set in the presence of >> conflicting instructions (i.e. SSBO writes to the same block and offset) or >> in the presence of memory barriers. >> >> If this is accepted I intend to extend this to also cover image reads, which >> follow similar behavior. >> >> No regressions observed in piglit or dEQP's SSBO functional tests. >> >> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-October/097718.html >> >> Iago Toral Quiroga (2): >> nir/cse: invalidate SSBO loads in presence of ssbo writes or memory >> barriers >> nir/instr_set: allow rewrite of SSBO loads >> >> src/glsl/nir/nir_instr_set.c | 24 ++++++-- >> src/glsl/nir/nir_opt_cse.c | 142 >> +++++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 162 insertions(+), 4 deletions(-) >> >> -- >> 1.9.1 >> >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > NAK, this isn't going to work. NIR CSE is designed for operations > which can be moved around freely as long they're still dominated by > the SSA values they use. It makes heavy advantage of this to avoid > looking at the entire CFG and instead only at the current block and > its parents in the dominance tree. For example, imagine you have > something like: > > A = load_ssbo 0 > if (cond) { > store_ssbo 0 > } > B = load_ssbo 0 > > Then A and B can't be combined, but CSE will combine them anyways when > it reaches B because it keeps a hash table of values dominating B and > finds A as a match. It doesn't look at the if conditional at all > because it doesn't dominate the load to B. This is great when you want > to CSE pure things that don't depend on other side effects -- after > all, this is the sort of efficiency that SSA is supposed to give us -- > but it means that as-is, it can't be used for e.g. SSBO's and images > without completely changing how the pass works and making it less > efficient. > > Now, that being said, I still think that we should definitely be doing > this sort of thing in NIR now that we've finally added support for > SSBO's and images. We've been trying to avoid adding new optimizations > to GLSL, since we've been trying to move away from it. In addition, > with SPIR-V on the way, anything added to GLSL IR now is something > that we won't be able to use with SPIR-V shaders. Only doing it in FS > doesn't sound so great either; we should be doing as much as possible > at the midlevel, and combining SSBO loads is something that isn't > FS-specific at all. > > There are two ways I can see support for this being added to NIR: > > 1. Add an extra fake source/destination to intrinsics with side > effects, and add a pass to do essentially a conversion to SSA that > wires up these "token" sources/destinations, or perhaps extend the > existing to-SSA pass. > This is in fact the approach taken by other SSA-form IRs like LLVM's -- Side-effectful instructions take a "chain" input as argument and spit out a new chain token which can then be passed as argument to subsequent non-pure instructions for which a control-flow dependency exists.
> 2. Add a special "load-combining" pass that does some dataflow > analysis or similar (or, for now, only looks at things within a single > block). > > The advantage of #1 is that we get to use existing NIR passes, like > CSE, DCE, and GCM "for free" on SSBO loads and stores, without having > to do the equivalent thing using dataflow analysis. Also, doing store > forwarding (i.e. replacing the result of an SSBO load with the value > corresponding to a store, if we can figure out which store affects it) > is going to much easier. However, #1 is going to be much more of a > research project. I've thought about how we could do it, but I'm still > not sure how it could be done feasibly and still be correct. > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev