On 09/10/2015 08:45 PM, Jason Ekstrand wrote: > On Wed, Sep 9, 2015 at 11:51 PM, Eduardo Lima Mitev <el...@igalia.com> wrote: >> On 09/09/2015 07:10 PM, Jason Ekstrand wrote: >>> >>> On Sep 8, 2015 23:27, "Eduardo Lima Mitev" <el...@igalia.com >>> <mailto:el...@igalia.com>> wrote: >>>> >>>> This pass will propagate the destination components of a vecN >>> instructions, >>>> as destination of the instructions that define its sources; if certain >>>> conditions are met. >>>> >>>> If all the components of the destination register in the vecN instruction >>>> can be propagated, the instruction is removed. Otherwise, a new, reduced >>>> vecN instruction is emitted with the channels that remained. >>>> >>>> This effectively coalesces registers and reduces indirection. >>>> >>>> By now, this pass will only propagate to ALU instructions, but it could >>>> be extended to include other instructions like load_input intrinsic. >>>> >>>> It also propagates to instructions within the same block as the vecN >>>> instruction. But it could be made to work cross-block in the future, >>>> though there are non-trivial issues with this like considering >>>> registers that are written in different branches of a conditional. >>>> More analysis is needed to correctly cover these cases. >>>> >>>> This pass works on a NIR shader in final form (after SSA), and is >>>> expected to run before nir_lower_vec_to_movs(). >>>> --- >>>> [...] >>>> + >>>> + /* We only override dest registers that are only used >>> once, and in >>>> + * this vecX instruction. >>>> + * @TODO: In the future we might consider registers used >>> more than >>>> + * once as sources of the same vecX instruction. >>>> + */ >>> >>> Yeah, we definitely want to handle that case and we want to handle it by >>> emitting a single instruction that does multiple channels. >>> >> >> I started implementing that as part of of this patch but quickly >> realized it was not trivial and basically postponed it until I got some >> feedback for the basics. >> >>>> + if (list_length(&parent_dest_reg->uses) != 1) >>>> + continue; >>>> + >>>> + nir_alu_instr *new_alu_instr = >>>> + clone_alu_instr_and_override_dest(parent_alu_instr, >>> &vec->dest, >>>> + i, mem_ctx); >>> >>> This mess would be a lot easier if we were still in SSA form. I've been >>> thinking for a while that we should make lower_vec_to_movs work in SSA >>> form (it would replace SSA destinations with register destinations). >>> Then use partial SSA lowering like we do in the FS backend. That would >>> make this pass much easier together correct. >>> >> >> I didn't know about the partial SSA pass in FS backend, I could have got >> some inspiration there. But did tried hard to implement this while in >> SSA form at the beginning, without much success because the combination >> of nir_from_ssa and lower_vec_to_movs would screw things after any SSA pass. > > Yeah, partial SSA was something that's actually fairly recent and > happened in parallel with the vec4 stuff. > >>> In general, I'm kind of thinking that this might be better done as an >>> extension to the lower_vec_to_moves pass or as a new version of it. You >>> would loop through the sources of a vecN and, if you can rewrite a >>> destination, you do and if you can't, you insert moves. We may also >>> want to it handle SSA while we're at it. Note: it will get run after >>> nir_from_ssa so it will have to handle partial SSA where some >>> destinations are SSA and some are registers. >>> >> >> Actually, the first version of this patch, I wrote it against >> lower_vec_to_movs, but then Matt suggested that likely it would be >> easier for the opt_register_coalsece pass to avoid lower_vec_to_movs and >> deal with vecN instructions in the backend. Then I rewrote this pass as >> a separate thing that would re-emit reduced vecN instructions, to >> essentially make it compatible with Matt's suggestion, thus future proof :). > > Hrm... I'm not sure why matt suggested it be it's own pass but I can > kind of see it. At least in my implementation, it seemed to work > fairly naturally to just put it in lower_vec_to_movs. Is there some > place where I can see your original pass that did it there? If you > don't have it handy, don't do a lot of work to dig it up, I'm mostly > curious. >
Well, what Matt suggested was that the nir_lower_vec_to_movs pass itself (without my changes) could be avoided, because dealing with vecN instructions on backend side would simplify things for opt_register_coalesce. In any case, this is a version of the patch I had for lower_vec_to_movs, but it was incomplete then and had regressions: https://github.com/Igalia/mesa/commit/e82a732cac896b4ec7f6e58d95ff89cf684207df >>> Does this seem reasonable? >>> >> >> Absolutely. The first thing I noticed is that lowering stuff in NIR out >> of SSA form is not very elegant and also not very well supported (lack >> of utility APIs). >> >> I see you sent a series to the ML implementing some of the ideas in this >> patch. The shader-db results are impressive! >> I will use that as training material :) and follow the review process >> closely. This also means I stop working on this pass. > > I'm sorry if I stole your thunder; I didn't mean to. It was just one > of those things where it takes fewer lines of code to implement it > than it takes lines of prose to explain it to someone. > Not at all :), I'm happy you took over. Your fix is way more complete. I would have never produced a similar patch any time soon. The important thing is that we have regressions under control now. > Most of the > work (once I saw what you were doing) was just in fixing core NIR bugs > to make it all work. > --Jason > >> Thanks a lot for the review! >> >> Eduardo >> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev