On Fri, Dec 2, 2016 at 3:21 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> Hi!
>
> While the https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01087.html
> added *andndi3_doubleword, I don't understand how it can actually work.
> The problem is that this pattern is for combine, and needs combining
> of DImode NOT setter with DImode AND user.  But there is no DImode
> pattern for one_cmpldi2, so it is lowered early into two one_cmplsi2
> patterns and I can't see how combine would be able to figure stuff out from
> that.
>
> This patch:
> 1) adds one_cmpldi2 pattern for stv purposes (which splits into two
>    one_cmplsi2 after reload)
> 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones)
> 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it
>    is for -mbmi only, and adds another *andndi3_doubleword pattern that is
>    meant to live just from combine till the stv pass, or worse case till
>    following split1 pass when it is split back into not followed by and;
>    this change makes it possible to use pandn in stv pass, even without
>    -mbmi

Please use attached (lightly tested) patch to implement point 3)
above. The patch splits insn after reload, as is the case with all STV
patterns.

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-12-02  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/70322
>         * config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle
>         NOT.
>         (dimode_scalar_chain::compute_convert_gain): Likewise.
>         (dimode_scalar_chain::convert_insn): Likewise.
>         * config/i386/i386.md (*andndi3_doubleword): New 
> define_insn_and_split.
>         Rename the old define_insn_and_split to...
>         (*andndi3_doubleword_bmi): ... this.
>         (one_cmpl<mode>2): Use SWIM1248x iterator instead of SWIM.
>         (*one_cmpldi2_doubleword): New define_insn_and_split.
>
>         * gcc.target/i386/pr70322-1.c: New test.
>         * gcc.target/i386/pr70322-2.c: New test.
>         * gcc.target/i386/pr70322-3.c: New test.

Apart to andn patterns, the patch is OK with the small nit below.

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2016-12-02 11:17:40.702995111 +0100
> +++ gcc/config/i386/i386.c      2016-12-02 12:01:31.656469089 +0100
> @@ -2826,6 +2826,9 @@ dimode_scalar_to_vector_candidate_p (rtx
>         return false;
>        break;
>
> +    case NOT:
> +      break;
> +
>      case REG:
>        return true;
>
> @@ -2848,7 +2851,8 @@ dimode_scalar_to_vector_candidate_p (rtx
>
>    if ((GET_MODE (XEXP (src, 0)) != DImode
>         && !CONST_INT_P (XEXP (src, 0)))
> -      || (GET_MODE (XEXP (src, 1)) != DImode
> +      || (GET_CODE (src) != NOT
> +         && GET_MODE (XEXP (src, 1)) != DImode
>           && !CONST_INT_P (XEXP (src, 1))))
>      return false;
>
> @@ -3415,6 +3419,8 @@ dimode_scalar_chain::compute_convert_gai
>           if (CONST_INT_P (XEXP (src, 1)))
>             gain -= vector_const_cost (XEXP (src, 1));
>         }
> +      else if (GET_CODE (src) == NOT)
> +       gain += ix86_cost->add - COSTS_N_INSNS (1);
>        else if (GET_CODE (src) == COMPARE)
>         {
>           /* Assume comparison cost is the same.  */
> @@ -3770,6 +3776,14 @@ dimode_scalar_chain::convert_insn (rtx_i
>        PUT_MODE (src, V2DImode);
>        break;
>
> +    case NOT:
> +      src = XEXP (src, 0);
> +      convert_op (&src, insn);
> +      subreg = gen_reg_rtx (V2DImode);
> +      emit_insn_before (gen_move_insn (subreg, CONSTM1_RTX (V2DImode)), 
> insn);
> +      src = gen_rtx_XOR (V2DImode, src, subreg);
> +      break;
> +
>      case MEM:
>        if (!REG_P (dst))
>         convert_op (&src, insn);
> --- gcc/config/i386/i386.md.jj  2016-12-01 23:24:51.663157486 +0100
> +++ gcc/config/i386/i386.md     2016-12-02 12:50:27.616829191 +0100
> @@ -8534,7 +8534,7 @@ (define_split
>    operands[2] = gen_lowpart (QImode, operands[2]);
>  })
>
> -(define_insn_and_split "*andndi3_doubleword"
> +(define_insn_and_split "*andndi3_doubleword_bmi"
>    [(set (match_operand:DI 0 "register_operand" "=r")
>         (and:DI
>           (not:DI (match_operand:DI 1 "register_operand" "r"))
> @@ -8551,6 +8551,27 @@ (define_insn_and_split "*andndi3_doublew
>               (clobber (reg:CC FLAGS_REG))])]
>    "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
>
> +;; This insn is there only so that the stv pass can use pandn even when
> +;; BMI is not available.  It should be matched during combine and split
> +;; again in split1, unless the stv pass converts it.
> +(define_insn_and_split "*andndi3_doubleword"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +       (and:DI
> +         (not:DI (match_operand:DI 1 "register_operand" "r"))
> +         (match_operand:DI 2 "nonimmediate_operand" "rm")))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "!TARGET_BMI && !TARGET_64BIT && TARGET_STV && TARGET_SSE2
> +   && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 3)
> +       (not:DI (match_dup 1)))
> +   (parallel [(set (match_dup 0)
> +                  (and:DI (match_dup 3)
> +                          (match_dup 2)))
> +             (clobber (reg:CC FLAGS_REG))])]
> +  "operands[3] = gen_reg_rtx (DImode);")
> +
>  (define_insn "*andn<mode>_1"
>    [(set (match_operand:SWI48 0 "register_operand" "=r,r")
>         (and:SWI48
> @@ -9313,8 +9334,8 @@ (define_split
>  ;; One complement instructions
>
>  (define_expand "one_cmpl<mode>2"
> -  [(set (match_operand:SWIM 0 "nonimmediate_operand")
> -       (not:SWIM (match_operand:SWIM 1 "nonimmediate_operand")))]
> +  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
> +       (not:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")))]
>    ""
>    "ix86_expand_unary_operator (NOT, <MODE>mode, operands); DONE;")
>
> @@ -9403,6 +9424,19 @@ (define_split
>                                     (const_int 0)]))
>               (set (match_dup 1)
>                    (zero_extend:DI (xor:SI (match_dup 3) (const_int -1))))])])
> +
> +(define_insn_and_split "*one_cmpldi2_doubleword"
> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=rm")
> +       (not:DI (match_operand:DI 1 "nonimmediate_operand" "0")))]
> +  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
> +   && ix86_unary_operator_ok (NOT, DImode, operands)"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 0)
> +       (not:SI (match_dup 1)))
> +   (set (match_dup 2)
> +       (not:SI (match_dup 3)))]
> +  "split_double_mode (DImode, &operands[0], 2, &operands[0], &operands[2]);")

Please put the above pattern just after one_cmpl<mode>2 expander.

>  ;; Shift instructions
>
> --- gcc/testsuite/gcc.target/i386/pr70322-1.c.jj        2016-12-02 
> 12:52:47.193051745 +0100
> +++ gcc/testsuite/gcc.target/i386/pr70322-1.c   2016-12-02 12:52:24.708338078 
> +0100
> @@ -0,0 +1,12 @@
> +/* PR target/70322 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -msse2 -mstv -mbmi" } */
> +/* { dg-final { scan-assembler "pandn" } } */
> +
> +extern long long z;
> +
> +void
> +foo (long long x, long long y)
> +{
> +  z = ~x & y;
> +}
> --- gcc/testsuite/gcc.target/i386/pr70322-2.c.jj        2016-12-02 
> 12:52:50.165013898 +0100
> +++ gcc/testsuite/gcc.target/i386/pr70322-2.c   2016-12-02 12:52:39.302152232 
> +0100
> @@ -0,0 +1,12 @@
> +/* PR target/70322 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -msse2 -mstv -mno-bmi" } */
> +/* { dg-final { scan-assembler "pandn" } } */
> +
> +extern long long z;
> +
> +void
> +foo (long long x, long long y)
> +{
> +  z = ~x & y;
> +}
> --- gcc/testsuite/gcc.target/i386/pr70322-3.c.jj        2016-12-02 
> 13:07:27.658796578 +0100
> +++ gcc/testsuite/gcc.target/i386/pr70322-3.c   2016-12-02 13:08:11.899229225 
> +0100
> @@ -0,0 +1,13 @@
> +/* PR target/70322 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -msse2 -mstv" } */
> +/* { dg-final { scan-assembler "pxor" } } */
> +/* { dg-final { scan-assembler "por" } } */
> +
> +extern long long z;
> +
> +void
> +foo (long long x, long long y)
> +{
> +  z = ~x | y;
> +}
>
>         Jakub

Reply via email to