On Mon, May 16, 2022 at 05:31:31PM -0500, Peter Bergner wrote:
> On 5/10/22 5:35 PM, Segher Boessenkool wrote:
> > Out of interest, did you try using v,?wa (so just two alternatives, not
> > four)?  Or did you think it would result in measurably worse code?  Or
> > did you decide it is not such bad backend code size explosion after
> > all :-)
>
> So I tried using just "v,?wa" instead of the 4 alternative "v,v,?d,?d"
> version and that fixes the performance issue too and is simpler too.
> The other option is "better", in that it can allow one operand to get
> a "v" reg when the other gets a "d" reg, but I think that's just a
> micro-optimization and not worth the extra complexity in the pattern.
> Thanks for the suggestion!

The difference is that "v,?wa" makes no distinction between one or more
lower VSRs being used.  But whenever you would see that, there is already
so much register allocation pressure that it does not change anything
materially.

> gcc/
>     PR target/105556
>     * config/rs6000/mma.md (mma_<vv>, mma_<avv>, mma_<pv>, mma_<apv>,
>     mma_<vvi4i4i8>, mma_<avvi4i4i8>, mma_<vvi4i4i2>, mma_<avvi4i4i2>,
>     mma_<vvi4i4>, mma_<avvi4i4>, mma_<pvi4i2>, mma_<apvi4i2>,
>     mma_<vvi4i4i4>, mma_<avvi4i4i4>): Replace "wa" constraints with "v,?wa".
>     Update other operands accordingly.

>  (define_insn "mma_<vv>"
> -  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> -        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> -                    (match_operand:V16QI 2 "vsx_register_operand" "wa")]
> +  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
> +        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
> +                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]

You now have two "?" on alternative 1, instead of just one.  This is the
same as if you had had

  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,??wa")
                    (match_operand:V16QI 2 "vsx_register_operand" "v,wa")]

The "?" are per alternative, not really per operand.  It won't change
much here of course, just penalise more than you perhaps expected.

With or without that changed: okay for trunk and for 12 (after the usual
cooldown).  Thanks!  Also okay for 11 and 10, should you want that later
anyway.


Segher
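
To illustrate the remark that "?" counts per alternative rather than per
operand, here is a minimal Python sketch.  It is a toy model only, not how
GCC's register allocator is written: the function name
penalties_per_alternative is made up for this example, and it merely
tallies the "?" characters in each alternative across all operands,
without modelling the actual cost each "?" carries.

```python
def penalties_per_alternative(operand_constraints):
    """Count '?' modifiers in each alternative, summed over all operands.

    Toy model (an assumption for illustration): each operand's constraint
    string is split on ',' into alternatives, and alternative i of the insn
    collects the '?' characters from alternative i of every operand.
    """
    per_operand = [c.split(",") for c in operand_constraints]
    n_alts = len(per_operand[0])
    # Every operand must list the same number of alternatives.
    assert all(len(alts) == n_alts for alts in per_operand)
    return [sum(alts[i].count("?") for alts in per_operand)
            for i in range(n_alts)]


# Constraint strings for operands 0, 1, 2, taken from the patterns above.
posted = ["=&d,&d", "v,?wa", "v,?wa"]        # as in the patch
equivalent = ["=&d,&d", "v,??wa", "v,wa"]    # the equivalent spelling
single = ["=&d,&d", "v,?wa", "v,wa"]         # just one "?" on alternative 1

print(penalties_per_alternative(posted))      # [0, 2]
print(penalties_per_alternative(equivalent))  # [0, 2]
print(penalties_per_alternative(single))      # [0, 1]
```

In this model the posted pattern and the "v,??wa" / "v,wa" spelling give the
same tally for alternative 1, while putting "?" on only one operand gives
the single penalty.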