On Mon, Jun 24, 2024 at 1:34 PM Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Richard Biener <richard.guent...@gmail.com> writes:
> > On Mon, Jun 24, 2024 at 10:03 AM Richard Sandiford
> > <richard.sandif...@arm.com> wrote:
> >>
> >> Richard Biener <richard.guent...@gmail.com> writes:
> >> > On Sat, Jun 22, 2024 at 6:50 PM Richard Sandiford
> >> >> The traditional (and IMO correct) way to handle this is to make the
> >> >> pattern reserve the temporary registers that it needs, using
> >> >> match_scratches.
> >> >> rs6000 has many examples of this. E.g.:
> >> >>
> >> >> (define_insn_and_split "@ieee_128bit_vsx_neg<mode>2"
> >> >>   [(set (match_operand:IEEE128 0 "register_operand" "=wa")
> >> >>         (neg:IEEE128 (match_operand:IEEE128 1 "register_operand" "wa")))
> >> >>    (clobber (match_scratch:V16QI 2 "=v"))]
> >> >>   "TARGET_FLOAT128_TYPE && !TARGET_FLOAT128_HW"
> >> >>   "#"
> >> >>   "&& 1"
> >> >>   [(parallel [(set (match_dup 0)
> >> >>                    (neg:IEEE128 (match_dup 1)))
> >> >>               (use (match_dup 2))])]
> >> >> {
> >> >>   if (GET_CODE (operands[2]) == SCRATCH)
> >> >>     operands[2] = gen_reg_rtx (V16QImode);
> >> >>
> >> >>   emit_insn (gen_ieee_128bit_negative_zero (operands[2]));
> >> >> }
> >> >>   [(set_attr "length" "8")
> >> >>    (set_attr "type" "vecsimple")])
> >> >>
> >> >> Before RA, this is just:
> >> >>
> >> >>   (set ...)
> >> >>   (clobber (scratch:V16QI))
> >> >>
> >> >> and the split creates a new register. After RA, operand 2 provides
> >> >> the required temporary register:
> >> >>
> >> >>   (set ...)
> >> >>   (clobber (reg:V16QI TMP))
> >> >>
> >> >> Another approach is to add can_create_pseudo_p () to the define_insn
> >> >> condition (rather than the split condition). But IMO that's an ICE
> >> >> trap, since insns that have already been matched & accepted shouldn't
> >> >> suddenly become invalid if recog is reattempted later.
> >> >
> >> > What about splitting immediately in late-combine?
> >> > Wouldn't that possibly allow more combinations to immediately happen?
> >>
> >> It would be difficult to guarantee termination. Often the split
> >> instructions can be immediately recombined back to the original
> >> instruction. Even if we guard against that happening directly,
> >> it'd be difficult to prove that it can't happen indirectly.
> >>
> >> We might also run into issues like PR101523.
> >>
> >> Combine uses define_splits (without define_insns) for 3->2 combinations,
> >> but the current late-combine optimisation is kind-of 1/N+1->1 x N.
> >>
> >> Personally, I think we should allow targets to use the .md file to
> >> define match.pd-style simplification rules involving unspecs, but there
> >> were objections to that when I last suggested it.
> >
> > Isn't that what basically "combine-helper" patterns do to some extent?
>
> Partly, but:
>
> (1) It's a big hammer. It means we add all the overhead of a define_insn
>     for something that is only meant to survive between one pass and the next.
>
> (2) Unlike match.pd, it isn't designed to be applied iteratively.
>     There is no attempt even in theory to ensure that match helper
>     -> split -> match helper -> split -> ... would terminate.
>
> (3) It operates at the level of complete instructions, including e.g.
>     destinations of sets. The kind of rule I had in mind would be aimed
>     at arithmetic simplification, and would operate at the simplify-rtx.cc
>     level.
>
>     That is, if simplify_foo failed to apply a target-independent rule,
>     it could fall back on an automatically generated target-specific rule,
>     with the requirement/understanding that these rules really should be
>     target-specific. One easy way of enforcing that is to say that
>     at least one side of a production rule must involve an unspec.

OK, that makes sense. I did think of having something like match.pd
generate simplify-rtx.cc. It probably has different constraints so that
simply translating tree codes to rtx codes and re-using match.pd patterns
isn't going to work well.

Richard.

> Richard
> >
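
For illustration only, here is a toy Python model of the fallback scheme from point (3) of the quoted text. It is not GCC code: Rtx, Rule, simplify_generic, simplify and UNSPEC_REV are all hypothetical names, and rules match by exact structural equality rather than through pattern operands. It sketches two things under those assumptions: the target-independent rule is tried first and target-specific rules only afterwards, and a target rule with no unspec on either side is rejected when it is created.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Rtx:
    """Toy expression node; leaves are plain strings or ints."""
    code: str                         # e.g. "plus", "neg", "unspec"
    ops: Tuple["Operand", ...] = ()


Operand = Union[Rtx, str, int]


def mentions_unspec(x: Operand) -> bool:
    """Return True if an unspec node occurs anywhere in the expression."""
    if not isinstance(x, Rtx):
        return False
    return x.code == "unspec" or any(mentions_unspec(op) for op in x.ops)


@dataclass(frozen=True)
class Rule:
    """Hypothetical target-specific production rule: lhs -> rhs."""
    lhs: Rtx
    rhs: Operand

    def __post_init__(self):
        # Assumed enforcement: at least one side must involve an unspec,
        # so generic arithmetic rules cannot be added as target rules.
        if not (mentions_unspec(self.lhs) or mentions_unspec(self.rhs)):
            raise ValueError("target rule must involve an unspec")


def simplify_generic(x: Rtx) -> Optional[Operand]:
    """Stand-in for one target-independent rule: (neg (neg X)) -> X."""
    if x.code == "neg" and len(x.ops) == 1:
        inner = x.ops[0]
        if isinstance(inner, Rtx) and inner.code == "neg" and len(inner.ops) == 1:
            return inner.ops[0]
    return None


def simplify(x: Rtx, target_rules) -> Optional[Operand]:
    """Try the generic rule first, then fall back on target rules."""
    result = simplify_generic(x)
    if result is not None:
        return result
    for rule in target_rules:
        if rule.lhs == x:             # exact structural match only
            return rule.rhs
    return None                       # no simplification found


def unspec_rev(op: Operand) -> Rtx:
    """Build a made-up target operation (unspec [op] UNSPEC_REV)."""
    return Rtx("unspec", (op, "UNSPEC_REV"))


# One target rule: reversing twice gives back the original operand.
RULES = [Rule(lhs=unspec_rev(unspec_rev("a")), rhs="a")]
```

With these definitions, simplify(unspec_rev(unspec_rev("a")), RULES) falls through the generic rule and is rewritten by the target rule, whereas constructing Rule(Rtx("plus", ("a", 0)), "a") raises ValueError because neither side mentions an unspec. The sketch says nothing about termination, which is the harder problem raised in point (2).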