On Fri, Sep 20, 2019 at 3:53 PM Yuliang Wang <yuliang.w...@arm.com> wrote: > > Hi Richard, > > Thanks for your comments and tips. fold_binary_op_with_conditional_arg > performs the reverse transformation to this patch in certain situations: > > /* Transform `a + (b ? x : y)' into `b ? (a + x) : (a + y)'.
Yes, this is essentially the reverse transform. It's bad to have both, we should decide on one here. That means we have to align conditions appropriately, for example by splitting out a predicate which we can test (negated) in both places. I agree with Marc that the condition in fold_binary_op_with_conditional_arg is sub-optimal. Richard. > ... */ > > static tree > fold_binary_op_with_conditional_arg (location_t loc, > ... > > /* This transformation is only worthwhile if we don't have to wrap ARG > in a SAVE_EXPR and the operation can be simplified without recursing > on at least one of the branches once its pushed inside the COND_EXPR. */ > if (!TREE_CONSTANT (arg) > && (TREE_SIDE_EFFECTS (arg) ...) > return NULL_TREE; > ... > > For instance, this causes infinite recursion in > gcc.dg/vect/fast-math-vect-call-2 because ARG is a float literal. > Regards, > Yuliang > > -----Original Message----- > From: Richard Biener <richard.guent...@gmail.com> > Sent: 20 September 2019 13:02 > To: Yuliang Wang <yuliang.w...@arm.com> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Sandiford > <richard.sandif...@arm.com> > Subject: Re: [PATCH] Reduction of conditional operations for vectorization > > On Fri, Sep 20, 2019 at 10:09 AM Yuliang Wang <yuliang.w...@arm.com> wrote: > > > > Hi, > > > > ifcvt transforms the following conditional operation reduction pattern: > > > > if ( condition ) > > a = a OP b; > > else > > a = a OP c; > > > > Into: > > > > a_1 = a OP b; > > a_2 = a OP c; > > a = condition ? a_1 : a_2; > > > > Where OP is one of { plus minus mult min max and ior eor }. > > > > This patch further optimizes the above to: > > > > a_0 = condition ? b : c; > > a = a OP a_0; > > > > Which enables vectorization on AArch64. > > Also supported are permutations of the above operand ordering subject to > > commutativity of OP. > > > > Added new tests. Built and regression tested on aarch64-none-elf and > > aarch64-linux-gnu. > > @@ -3206,7 +3206,41 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > /* !A ? B : C -> A ? C : B. */ > (simplify > (cnd (logical_inverted_value truth_valued_p@0) @1 @2) > - (cnd @0 @2 @1))) > + (cnd @0 @2 @1)) > + > + /* !A ? B : C -> A ? C : B. */ > + (simplify > + (cnd (logical_inverted_value truth_valued_p@0) @1 @2) (cnd @0 @2 > + @1)) > + > > looks like you duplicate the above pattern. Should have raised a warning in > the genmatch run. > > The patch header shows you are not working against trunk? > > + (for op (plus minus mult > + min max > + bit_and bit_ior bit_xor) > + (simplify > + (cnd @0 (op @1 @2) (op @1 @3)) > + (op @1 (cnd @0 @2 @3))) > + (simplify > + (cnd @0 (op @1 @2) (op @3 @2)) > + (op (cnd @0 @1 @3) @2)) > + (if (op != MINUS_EXPR) > + (simplify > + (cnd @0 (op @1 @2) (op @3 @1)) > + (op @1 (cnd @0 @2 @3))) > + (simplify > + (cnd @0 (op @2 @1) (op @1 @3)) > + (op @1 (cnd @0 @2 @3))))) > > if you would have dropped minus handling this simplifies to > > (for op (...) > (simpify > (cnd @0 (op:c @1 @2) (op:c @1 @3)) > (op @1 (cnd @0 @2 @3))) > > you can then add minus special-cases if they are important > > (simplify > (cnd @0 (minus @1 @2) (minus @1 @3)) > ... > (simplify > (cnd @0 (minus @2 @1) (minus @3 @1)) > > I think that's clearer. > > + /* Hack: generic-match causes infinite recursion > + by reverting this transformation when > + i) -fno-trapping-math is enabled, and > + ii) the common operand does not need to be wrapped in a SAVE_EXPR. > + */ > > What's the specific transform that causes this? Yes, there are some left in > fold-const.c. > > Thanks, > Richard. > > > Best Regards, > > Yuliang Wang > > > > > > gcc/ChangeLog: > > > > 2019-09-19 Yuliang Wang <yuliang.w...@arm.com> > > > > * match.pd (for cnd (cond vec_cond)): New match statements for the > > above patterns. > > * doc/sourcebuild.texi (vect_condred_si): Document new target > > selector. > > > > gcc/testsuite/ChangeLog: > > > > 2019-09-19 Yuliang Wang <yuliang.w...@arm.com> > > > > * gcc.target/aarch64/sve2/condred_1.c: New test. > > * gcc.dg/vect/vect-condred-1.c: As above. > > * gcc.dg/vect/vect-condred-2.c: As above. > > * gcc.dg/vect/vect-condred-3.c: As above. > > * gcc.dg/vect/vect-condred-4.c: As above. > > * gcc.dg/vect/vect-condred-5.c: As above. > > * gcc.dg/vect/vect-condred-6.c: As above. > > * gcc.dg/vect/vect-condred-7.c: As above. > > * gcc.dg/vect/vect-condred-8.c: As above. > > * lib/target-supports.exp (check_effective_target_vect_condred_si): > > Return true for AArch64 without SVE.