Hi Segher, on 2024/6/19 04:31, Segher Boessenkool wrote: > On Fri, Feb 10, 2023 at 10:59:52AM +0800, Xionghu Luo via Gcc-patches wrote: > So, nothing here is obvious at all still. Could you please split it up > a bit more, so that every step is either small or simple?
I just chatted with Xionghu off-list, he is being busy on some other tasks and preferred me to follow up this. > > So maybe first just split patterns to BE and LE versions, and nothing > else? > > And one patch per insn, if at all possible. OK, I'll try to separate them as element type word, half-word and byte. > > This matters so that a regression search will immediately show the > culprit pattern, if anything went wrong. > > Most patches will not change anything consequential, but some will, and > it should be very clear which do! > > And change (or add) comments in the patch so that I don't have to ask > the same questions as before again! :-) > > Most of this seems clean and good, but there is just too much > independent stuff going on at the same time. If your patch series is > split up correctly writing a changelog for it is very easy (this is a > good canary to use!), and if we get regressions from this it should be > trivial to fond the problem, too. Good point. > >> @@ -3699,13 +3799,13 @@ (define_expand "vec_widen_umult_hi_v16qi" >> { >> emit_insn (gen_altivec_vmuleub (ve, operands[1], operands[2])); >> emit_insn (gen_altivec_vmuloub (vo, operands[1], operands[2])); >> - emit_insn (gen_altivec_vmrghh_direct (operands[0], ve, vo)); >> + emit_insn (gen_altivec_vmrghh_direct_be (operands[0], ve, vo)); >> } >> else >> { >> emit_insn (gen_altivec_vmuloub (ve, operands[1], operands[2])); >> emit_insn (gen_altivec_vmuleub (vo, operands[1], operands[2])); >> - emit_insn (gen_altivec_vmrghh_direct (operands[0], vo, ve)); >> + emit_insn (gen_altivec_vmrghh_direct_le (operands[0], vo, ve)); >> } >> DONE; >> }) > > Please don't. Call the generic gen_vmrg* patterns from the widen > things, don't try to do the compilers job of specialising stuff, it > only makes things much less readable, and causes more mistakes. Just do > like what was there before, essentially. Before r12-4496 (the culprit commit), this part looks like: @@ -3795,182 +3708,182 @@ (define_expand "vec_widen_smult_hi_v16qi" emit_insn (gen_altivec_vmrghh_direct (operands[0], ve, vo)); } else { emit_insn (gen_altivec_vmulosb (ve, operands[1], operands[2])); emit_insn (gen_altivec_vmulesb (vo, operands[1], operands[2])); emit_insn (gen_altivec_vmrghh_direct (operands[0], vo, ve)); } DONE; }) , its associated gen_altivec_vmrghh_direct looks like: -(define_insn "altivec_vmrghh_direct" - [(set (match_operand:V8HI 0 "register_operand" "=v") - (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v") - (match_operand:V8HI 2 "register_operand" "v")] - UNSPEC_VMRGH_DIRECT))] - "TARGET_ALTIVEC" "vmrghh %0,%1,%2" [(set_attr "type" "vecperm")]) , the intention is to emit exactly the insn "vmrghh". It's doable to call gen_vmrg* here instead, but I'm not sure if it's more readable, as this vec_widen_smult_hi_v16qi expander already has the different arms for BE and LE, for calling with the generic gen_vmrg*, it would be gen_altivec_vmrghb for BE and gen_altivec_vmrglb for LE, for LE readers need to be more careful that we actually generate vmrghh. From this perspective, gen_altivec_vmrghh_direct_{be,le} seems more straight. BR, Kewen