Dominik Vogt wrote:

> v4:
> 
>   * Remove CCZZ1 iterator.
>   * Remove duplicates of CS patterns. 
>   * Move the skip_cs_label so that output is moved to vtarget even 
>     if the CS instruction was not used. 
>   * Removed leftover from "sne" (from an earlier version of the
>     patch).

Thanks, this looks quite good to me now.  I do still have two questions:

> +; Peephole to combine a load-and-test from volatile memory which combine does
> +; not do.
> +(define_peephole2
> +  [(set (match_operand:GPR 0 "register_operand")
> +     (match_operand:GPR 2 "memory_operand"))
> +   (set (reg CC_REGNUM)
> +     (compare (match_dup 0) (match_operand:GPR 1 "const0_operand")))]
> +  "s390_match_ccmode(insn, CCSmode) && TARGET_EXTIMM
> +   && GENERAL_REG_P (operands[0])
> +   && satisfies_constraint_T (operands[2])"
> +  [(parallel
> +    [(set (reg:CCS CC_REGNUM)
> +       (compare:CCS (match_dup 2) (match_dup 1)))
> +     (set (match_dup 0) (match_dup 2))])])

Still wondering why this is necessary.  On the other hand, I guess it
cannot hurt to have the peephole either ...

> @@ -6518,13 +6533,30 @@
>    [(parallel
>      [(set (match_operand:SI 0 "register_operand" "")
>         (match_operator:SI 1 "s390_eqne_operator"
> -           [(match_operand:CCZ1 2 "register_operand")
> +           [(match_operand 2 "cc_reg_operand")
>           (match_operand 3 "const0_operand")]))
>       (clobber (reg:CC CC_REGNUM))])]
>    ""
> -  "emit_insn (gen_sne (operands[0], operands[2]));
> -   if (GET_CODE (operands[1]) == EQ)
> -     emit_insn (gen_xorsi3 (operands[0], operands[0], const1_rtx));
> +  "machine_mode mode = GET_MODE (operands[2]);
> +   if (TARGET_Z196)
> +     {
> +       rtx cond, ite;
> +
> +       if (GET_CODE (operands[1]) == NE)
> +      cond = gen_rtx_NE (VOIDmode, operands[2], const0_rtx);
> +       else
> +      cond = gen_rtx_EQ (VOIDmode, operands[2], const0_rtx);
> +       ite = gen_rtx_IF_THEN_ELSE (SImode, cond, const1_rtx, const0_rtx);
> +       emit_insn (gen_rtx_SET (operands[0], ite));
> +     }
> +   else
> +     {
> +       if (mode != CCZ1mode)
> +      FAIL;
> +       emit_insn (gen_sne (operands[0], operands[2]));
> +       if (GET_CODE (operands[1]) == EQ)
> +      emit_insn (gen_xorsi3 (operands[0], operands[0], const1_rtx));
> +     }
>     DONE;")

From what I can see in the rest of the patch, none of the CS changes now
actually *rely* on this change to cstorecc4 ... s390_expand_cs_tdsi only
calls cstorecc4 on !TARGET_Z196, where the above change is a no-op, and
in the TARGET_Z196 case it deliberately does *not* use cstorecc4.
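
To illustrate the "no-op on !TARGET_Z196" point, here is a rough C model of
the old and new expander bodies.  This is only a sketch, not GCC code: the
enums, EXPAND_FAIL, and the function names are hypothetical stand-ins, and
the condition code is simplified to a plain int that "sne" tests against
zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch only; not GCC's real types. */
enum cc_mode { MODE_CCZ, MODE_CCZ1 };
enum cmp_code { CODE_EQ, CODE_NE };

#define EXPAND_FAIL (-1)   /* models the expander not producing code */

/* Old expander: the operand predicate only accepted CCZ1mode. */
static int old_cstorecc4(enum cc_mode mode, enum cmp_code code, int cc)
{
    if (mode != MODE_CCZ1)
        return EXPAND_FAIL;       /* pattern would not have matched */
    int r = (cc != 0);            /* models gen_sne */
    if (code == CODE_EQ)
        r ^= 1;                   /* models gen_xorsi3 with const1_rtx */
    return r;
}

/* New expander: if_then_else on z196, old sequence otherwise. */
static int new_cstorecc4(bool target_z196, enum cc_mode mode,
                         enum cmp_code code, int cc)
{
    if (target_z196)              /* models the IF_THEN_ELSE 1/0 set */
        return (code == CODE_NE) ? (cc != 0) : (cc == 0);
    if (mode != MODE_CCZ1)
        return EXPAND_FAIL;       /* models FAIL */
    int r = (cc != 0);
    if (code == CODE_EQ)
        r ^= 1;
    return r;
}

/* Checks that old and new agree on every input when z196 is off. */
static void check_non_z196_equivalence(void)
{
    for (int mode = MODE_CCZ; mode <= MODE_CCZ1; mode++)
        for (int code = CODE_EQ; code <= CODE_NE; code++)
            for (int cc = 0; cc <= 3; cc++)
                assert(old_cstorecc4((enum cc_mode) mode,
                                     (enum cmp_code) code, cc)
                       == new_cstorecc4(false, (enum cc_mode) mode,
                                        (enum cmp_code) code, cc));
}
```

Calling check_non_z196_equivalence () from a test driver passes in this
model, which is all "no-op" is meant to say here: only the TARGET_Z196
branch behaves differently from before.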

Now, in general this improvement to cstorecc4 is of course valuable
in itself.  But I think at this point it might be better to separate
this out into an independent patch (and measure its effect separately).

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com
