Dominik Vogt wrote:

> v4:
>
>   * Remove CCZZ1 iterator.
>   * Remove duplicates of CS patterns.
>   * Move the skip_cs_label so that output is moved to vtarget even
>     if the CS instruction was not used.
>   * Removed leftover from "sne" (from an earlier version of the
>     patch).

Thanks, this looks quite good to me now.  I do still have two questions:

> +; Peephole to combine a load-and-test from volatile memory which combine does
> +; not do.
> +(define_peephole2
> +  [(set (match_operand:GPR 0 "register_operand")
> +        (match_operand:GPR 2 "memory_operand"))
> +   (set (reg CC_REGNUM)
> +        (compare (match_dup 0) (match_operand:GPR 1 "const0_operand")))]
> +  "s390_match_ccmode(insn, CCSmode) && TARGET_EXTIMM
> +   && GENERAL_REG_P (operands[0])
> +   && satisfies_constraint_T (operands[2])"
> +  [(parallel
> +    [(set (reg:CCS CC_REGNUM)
> +          (compare:CCS (match_dup 2) (match_dup 1)))
> +     (set (match_dup 0) (match_dup 2))])])

Still wondering why this is necessary.  On the other hand, I guess it
cannot hurt to have the peephole either ...

> @@ -6518,13 +6533,30 @@
>    [(parallel
>      [(set (match_operand:SI 0 "register_operand" "")
>            (match_operator:SI 1 "s390_eqne_operator"
> -           [(match_operand:CCZ1 2 "register_operand")
> +           [(match_operand 2 "cc_reg_operand")
>              (match_operand 3 "const0_operand")]))
>       (clobber (reg:CC CC_REGNUM))])]
>    ""
> -  "emit_insn (gen_sne (operands[0], operands[2]));
> -   if (GET_CODE (operands[1]) == EQ)
> -     emit_insn (gen_xorsi3 (operands[0], operands[0], const1_rtx));
> +  "machine_mode mode = GET_MODE (operands[2]);
> +   if (TARGET_Z196)
> +     {
> +       rtx cond, ite;
> +
> +       if (GET_CODE (operands[1]) == NE)
> +         cond = gen_rtx_NE (VOIDmode, operands[2], const0_rtx);
> +       else
> +         cond = gen_rtx_EQ (VOIDmode, operands[2], const0_rtx);
> +       ite = gen_rtx_IF_THEN_ELSE (SImode, cond, const1_rtx, const0_rtx);
> +       emit_insn (gen_rtx_SET (operands[0], ite));
> +     }
> +   else
> +     {
> +       if (mode != CCZ1mode)
> +         FAIL;
> +       emit_insn (gen_sne (operands[0], operands[2]));
> +       if (GET_CODE (operands[1]) == EQ)
> +         emit_insn (gen_xorsi3 (operands[0], operands[0], const1_rtx));
> +     }
>     DONE;")

From what I can see in the rest of the patch, none of the CS changes now
actually *rely* on this change to cstorecc4 ...
s390_expand_cs_tdsi only calls cstorecc4 on !TARGET_Z196, where the above
change is a no-op, and in the TARGET_Z196 case it deliberately does *not*
use cstorecc4.

Now, in general this improvement to cstorecc4 is of course valuable in
itself.  But I think at this point it might be better to separate this out
into an independent patch (and measure its effect separately).

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com