Paolo Bonzini wrote:
> On 10/28/2010 03:10 PM, Georg Lay wrote:
>> Georg Lay wrote:
>>
>>> This code is not nice.
>>>
>>> ;; d8 = d4 * d6
>>> ;; d8 = d2
>>> ;; d2 = d8
>>> ;; return d2
>>
>> this should be
>>
>> ;; d2 = d4 * d6
>> ;; d8 = d2
>> ;; d2 = d8
>> ;; return d2
>
> It seems to me that some of your peepholes should instead be implemented
> using constraints and multiple alternatives (for example the xor one),
> so that reload and register allocation can do a better job.  However I
> can't tell without looking at the code.

As far as I understand the internals, peephole2 matches on the predicates
and the insn condition; it does not look at constraints (except for an
optional match_scratch).

The xor reads as

    ;; "*andsi3" "*iorsi3" "*xorsi3"
    (define_insn "*<code>si3"
      [(set (match_operand:SI 0 "register_operand"                  "=d,d,d,!d,d,d")
            (tric_bitop:SI (match_operand:SI 1 "register_operand"   "%0,d,d,d,d,d")
                           (match_operand:SI 2 "reg_or_s10_operand" "d,0,d,d,Ku9,Kc9")))]
      ""
      "@ ..."
      [(set_attr "opt" "*,*,speed,size,*,*")
       (set_attr "pipe" "ip")])

The machine has two types of registers: "a" registers, which are used to
address memory and can only do very basic arithmetic, and "d" registers,
which can do general arithmetic. The xor above allows three distinct
operands.

The xor in my original example gets generated here:

    ;; $2 is a mask of the form 00..111..000 that
    ;; cannot be handled by "*andsi3"
    (define_insn_and_split "*and3_zeroes-2.insert.ic"
      [(set (match_operand:SI 0 "register_operand"          "=&d")
            (and:SI (match_operand:SI 1 "register_operand"  "%d")
                    (match_operand:SI 2 "const_int_operand" "n")))]
      "TARGET_COMBINE_INSNS
       && ..."
      "#"
      "&& reload_completed"
      [(set (match_dup 0)
            (and:SI (match_dup 1)
                    (match_dup 3)))
       (set (match_dup 0)
            (xor:SI (match_dup 0)
                    (match_dup 1)))]
      {
        rtx lo, hi;

        if (REGNO(operands[0]) == REGNO(operands[1]))
          FAIL;

        if (...)
          {
            /* We can load the constant in one instruction.
               This is better than insert */
            ...
            DONE;
          }

        operands[3] = GEN_INT (~OPVAL(2));
      })

The condition ensures that the mask (op2) is of the form indicated in the
comment.

Perhaps it is better to write this with a clobber operand instead of an
early-clobber. Maybe it is better to write it as a split1 that works prior
to reload instead of a split2 that works after reload. I don't know.
match_scratch has the disadvantage that the input pattern becomes a
PARALLEL and is no longer a single_set.
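
To make the arithmetic behind the split above easier to follow, here is a small C sketch of what the two split insns compute. It is only an illustration and not code from the backend: the names and_via_xor and check_and_via_xor and the sample values are made up for this example. The point is that (x & ~mask) ^ x equals x & mask, so the AND with an awkward mask can be replaced by an AND with the inverted mask followed by an XOR with the original input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the backend): computes x & mask
   the way the two insns of the split pattern do it.  */
static uint32_t and_via_xor(uint32_t x, uint32_t mask)
{
    /* First split insn: op0 = op1 & op3, where op3 = ~mask.  */
    uint32_t t = x & ~mask;

    /* Second split insn: op0 = op0 ^ op1.  This still needs the
       original x, so t must not live in the same register as x.  */
    t ^= x;

    return t;
}

/* Checks the identity (x & ~mask) ^ x == x & mask on a few
   arbitrary sample values with a mask of the form 00..111..000.  */
static void check_and_via_xor(void)
{
    const uint32_t mask = 0x00FFFF00u;
    const uint32_t samples[] = { 0u, 1u, 0xDEADBEEFu, 0x12345678u, 0xFFFFFFFFu };
    const unsigned n = sizeof samples / sizeof samples[0];

    for (unsigned i = 0; i < n; i++)
        assert(and_via_xor(samples[i], mask) == (samples[i] & mask));
}
```

The second step reads the unmodified input again, which is why output and input must be different registers; if they were the same, the XOR would combine the intermediate result with itself and yield 0. That matches the FAIL on equal REGNOs and the early-clobber in the pattern.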

Let me explain why the pattern is there at all:

There are many cases where insn combine produces patterns that are close
to things the machine can do in one or maybe two instructions. In these
situations you want to say "hey combine, you are producing good stuff, but
please go further in this direction". That is the point where you want to
do arithmetic on RTL level and transform one RTL construct into another or
into a small sequence.

Note that I am a backend guy, and the opportunities to introduce
target-specific instructions are quite few: expand for the basics; insn
combine, split1 (before reload) and split2 (after reload) for fancier
kinds of instructions; peephole (which I don't intend to use); and
peephole2 to clean up the mess.

So the define_insn_and_split helps the combiner to combine more complex
instructions. In many situations the split won't be reached at all,
because the combiner stuffs the AND into some complex insn that it
wouldn't have found without the define_insn_and_split (or without a
define_insn).

Georg