Hi Richard,
On 7 January 2018 at 21:35, James Greenhalgh <james.greenha...@arm.com> wrote:
> On Wed, Dec 13, 2017 at 04:34:34PM +0000, Jeff Law wrote:
>> On 11/17/2017 07:59 AM, Richard Sandiford wrote:
>> > This patch removes the restriction that fully-masked loops cannot
>> > have reductions. The key thing here is to make sure that the
>> > reduction accumulator doesn't include any values associated with
>> > inactive lanes; the patch adds a bunch of conditional binary
>> > operations for doing that.
>> >
>> > Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
>> > and powerpc64le-linux-gnu.
>> >
>> > Richard
>> >
>> >
>> > 2017-11-17 Richard Sandiford <richard.sandif...@linaro.org>
>> >            Alan Hayward <alan.hayw...@arm.com>
>> >            David Sherwood <david.sherw...@arm.com>
>> >
>> > gcc/
>> >     * doc/md.texi (cond_add@var{mode}, cond_sub@var{mode})
>> >     (cond_and@var{mode}, cond_ior@var{mode}, cond_xor@var{mode})
>> >     (cond_smin@var{mode}, cond_smax@var{mode}, cond_umin@var{mode})
>> >     (cond_umax@var{mode}): Document.
>> >     * optabs.def (cond_add_optab, cond_sub_optab, cond_and_optab)
>> >     (cond_ior_optab, cond_xor_optab, cond_smin_optab, cond_smax_optab)
>> >     (cond_umin_optab, cond_umax_optab): New optabs.
>> >     * internal-fn.def (COND_ADD, COND_SUB, COND_SMIN, COND_SMAX)
>> >     (COND_UMIN, COND_UMAX, COND_AND, COND_IOR, COND_XOR): New internal
>> >     functions.
>> >     * internal-fn.h (get_conditional_internal_fn): Declare.
>> >     * internal-fn.c (cond_binary_direct): New macro.
>> >     (expand_cond_binary_optab_fn): Likewise.
>> >     (direct_cond_binary_optab_supported_p): Likewise.
>> >     (get_conditional_internal_fn): New function.
>> >     * tree-vect-loop.c (vectorizable_reduction): Handle fully-masked loops.
>> >     Cope with reduction statements that are vectorized as calls rather
>> >     than assignments.
>> >     * config/aarch64/aarch64-sve.md (cond_<optab><mode>): New insns.
>> >     * config/aarch64/iterators.md (UNSPEC_COND_ADD, UNSPEC_COND_SUB)
>> >     (UNSPEC_COND_SMAX, UNSPEC_COND_UMAX, UNSPEC_COND_SMIN)
>> >     (UNSPEC_COND_UMIN, UNSPEC_COND_AND, UNSPEC_COND_ORR)
>> >     (UNSPEC_COND_EOR): New unspecs.
>> >     (optab): Add mappings for them.
>> >     (SVE_COND_INT_OP, SVE_COND_FP_OP): New int iterators.
>> >     (sve_int_op, sve_fp_op): New int attributes.
>> >
>> > gcc/testsuite/
>> >     * gcc.dg/vect/pr60482.c: Remove XFAIL for variable-length vectors.
>> >     * gcc.target/aarch64/sve_reduc_1.c: Expect the loop operations
>> >     to be predicated.
>> >     * gcc.target/aarch64/sve_slp_5.c: Check for a fully-masked loop.
>> >     * gcc.target/aarch64/sve_slp_7.c: Likewise.
>> >     * gcc.target/aarch64/sve_reduc_5.c: New test.
>> >     * gcc.target/aarch64/sve_slp_13.c: Likewise.
>> >     * gcc.target/aarch64/sve_slp_13_run.c: Likewise.
>> I didn't walk through the aarch64 specific bits here. The generic bits
>> are OK.
>
> As are the AArch64 bits.
>

As of r256626, I've noticed that a new test says XPASS on
aarch64-none-elf with -mabi=ilp32:
XPASS: gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve
scan-assembler-times \\tsub\\t 8

Not sure if I should file a PR for this?

Christophe

> OK.
>
> James
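
For anyone reading the thread without the patch to hand, the point in the quoted description about keeping inactive lanes out of the reduction accumulator can be modelled in plain scalar C. This is only an illustrative sketch, not code from the patch: `VL`, `cond_add_lane` and `masked_sum` are hypothetical names, the fixed lane count stands in for SVE's runtime vector length, and the value -12345 stands in for whatever an inactive lane happens to contain.

```c
#include <stddef.h>

/* Hypothetical lane count; a real SVE vector length is only known at run time. */
#define VL 4

/* Scalar model of a conditional add: an active lane yields a + b,
   an inactive lane passes the accumulator value a through unchanged. */
static int cond_add_lane(int active, int a, int b)
{
    return active ? a + b : a;
}

/* Sum x[0..n-1] in the style of a fully-masked loop: every iteration
   processes VL lanes, and a per-lane predicate disables lanes >= n. */
int masked_sum(const int *x, size_t n)
{
    int acc[VL] = {0};

    for (size_t base = 0; base < n; base += VL) {
        for (size_t lane = 0; lane < VL; lane++) {
            int active = (base + lane) < n;
            /* Inactive lanes never touch memory; -12345 is a stand-in
               for the unspecified value such a lane might hold. */
            int elt = active ? x[base + lane] : -12345;
            acc[lane] = cond_add_lane(active, acc[lane], elt);
        }
    }

    /* Final cross-lane reduction of the accumulator vector. */
    int total = 0;
    for (size_t lane = 0; lane < VL; lane++)
        total += acc[lane];
    return total;
}
```

With seven input elements the second iteration has one inactive lane; because the add is conditional, the stand-in value in that lane never reaches the accumulator. An unconditional add would have folded it into the result.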