This patch prevents pattern-matching of fold-left SLP reduction chains, which the previous patch for 83965 didn't handle properly. It only stops the last statement in the group from being matched, but that's enough to cause the group to be dissolved later.
A better fix would be to put all the information about the reduction on the the first statement in the reduction chain, so that every statement in the group can tell what the group is doing. That doesn't seem like stage 4 material though. As it stands, things seem to be a bit of a mess. In vect_force_simple_reduction we attach the reduction type and phi pointer to the last statement in a reduction chain: reduc_def_info = vinfo_for_stmt (def); STMT_VINFO_REDUC_TYPE (reduc_def_info) = v_reduc_type; STMT_VINFO_REDUC_DEF (reduc_def_info) = phi; and mark it as vect_reduction_type in vect_analyze_scalar_cycles_1: STMT_VINFO_DEF_TYPE (vinfo_for_stmt (reduc_stmt)) = vect_reduction_def; This code in vectorizable_reduction gave the impression that everything really is keyed off the last statement: /* In case of reduction chain we switch to the first stmt in the chain, but we don't update STMT_INFO, since only the last stmt is marked as reduction and has reduction properties. */ if (GROUP_FIRST_ELEMENT (stmt_info) && GROUP_FIRST_ELEMENT (stmt_info) != stmt) { stmt = GROUP_FIRST_ELEMENT (stmt_info); first_p = false; } But this code is dead these days. GROUP_FIRST_ELEMENT is only nonnull for SLP reduction chains, since we dissolve the group if SLP fails. And SLP only analyses the first statement in the group, not the last: stmt = SLP_TREE_SCALAR_STMTS (node)[0]; stmt_vec_info stmt_info = vinfo_for_stmt (stmt); [...] bool res = vect_analyze_stmt (stmt, &dummy, node, node_instance); So from that point of view the DEF_TYPE, REDUC_TYPE and REDUC_DEF are being attached to the wrong statement, since we only analyse the first one. But it turns out that REDUC_TYPE and REDUC_DEF don't matter for the first statement in the group, since that takes the phi as an input, and when the phi is a direct input, we use *its* REDUC_TYPE and REDUC_DEF instead of the statement's own info. The DEF_TYPE problem is handled by: /* Mark the first element of the reduction chain as reduction to properly transform the node. In the reduction analysis phase only the last element of the chain is marked as reduction. */ if (!STMT_VINFO_GROUPED_ACCESS (vinfo_for_stmt (stmt))) STMT_VINFO_DEF_TYPE (vinfo_for_stmt (stmt)) = vect_reduction_def; in vect_analyze_slp_instance (cancelled by: STMT_VINFO_DEF_TYPE (vinfo_for_stmt (first_element)) = vect_internal_def; in vect_analyze_slp on failure), with the operation being repeated in vect_schedule_slp_instance (redundantly AFAICT): /* Mark the first element of the reduction chain as reduction to properly transform the node. In the analysis phase only the last element of the chain is marked as reduction. */ if (GROUP_FIRST_ELEMENT (stmt_info) && !STMT_VINFO_GROUPED_ACCESS (stmt_info) && GROUP_FIRST_ELEMENT (stmt_info) == stmt) { STMT_VINFO_DEF_TYPE (stmt_info) = vect_reduction_def; STMT_VINFO_TYPE (stmt_info) = reduc_vec_info_type; } Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu. OK to install? Richard 2018-02-20 Richard Sandiford <richard.sandif...@linaro.org> gcc/ PR tree-optimization/83965 * tree-vect-patterns.c (vect_reassociating_reduction_p): Assume that grouped statements are part of a reduction chain. Return true if the statement is not marked as a reduction itself but is part of a group. (vect_recog_dot_prod_pattern): Don't check whether the statement is part of a group here. (vect_recog_sad_pattern): Likewise. (vect_recog_widen_sum_pattern): Likewise. gcc/testsuite/ PR tree-optimization/83965 * gcc.dg/vect/pr83965-2.c: New test. Index: gcc/tree-vect-patterns.c =================================================================== --- gcc/tree-vect-patterns.c 2018-02-20 09:40:41.843451227 +0000 +++ gcc/tree-vect-patterns.c 2018-02-20 16:28:55.636762056 +0000 @@ -222,13 +222,16 @@ vect_recog_temp_ssa_var (tree type, gimp } /* Return true if STMT_VINFO describes a reduction for which reassociation - is allowed. */ + is allowed. If STMT_INFO is part of a group, assume that it's part of + a reduction chain and optimistically assume that all statements + except the last allow reassociation. */ static bool vect_reassociating_reduction_p (stmt_vec_info stmt_vinfo) { return (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_reduction_def - && STMT_VINFO_REDUC_TYPE (stmt_vinfo) != FOLD_LEFT_REDUCTION); + ? STMT_VINFO_REDUC_TYPE (stmt_vinfo) != FOLD_LEFT_REDUCTION + : GROUP_FIRST_ELEMENT (stmt_vinfo) != NULL); } /* Function vect_recog_dot_prod_pattern @@ -350,8 +353,7 @@ vect_recog_dot_prod_pattern (vec<gimple { gimple *def_stmt; - if (!vect_reassociating_reduction_p (stmt_vinfo) - && ! STMT_VINFO_GROUP_FIRST_ELEMENT (stmt_vinfo)) + if (!vect_reassociating_reduction_p (stmt_vinfo)) return NULL; oprnd0 = gimple_assign_rhs1 (last_stmt); oprnd1 = gimple_assign_rhs2 (last_stmt); @@ -571,8 +573,7 @@ vect_recog_sad_pattern (vec<gimple *> *s { gimple *def_stmt; - if (!vect_reassociating_reduction_p (stmt_vinfo) - && ! STMT_VINFO_GROUP_FIRST_ELEMENT (stmt_vinfo)) + if (!vect_reassociating_reduction_p (stmt_vinfo)) return NULL; plus_oprnd0 = gimple_assign_rhs1 (last_stmt); plus_oprnd1 = gimple_assign_rhs2 (last_stmt); @@ -1256,8 +1257,7 @@ vect_recog_widen_sum_pattern (vec<gimple if (gimple_assign_rhs_code (last_stmt) != PLUS_EXPR) return NULL; - if (!vect_reassociating_reduction_p (stmt_vinfo) - && ! STMT_VINFO_GROUP_FIRST_ELEMENT (stmt_vinfo)) + if (!vect_reassociating_reduction_p (stmt_vinfo)) return NULL; oprnd0 = gimple_assign_rhs1 (last_stmt); Index: gcc/testsuite/gcc.dg/vect/pr83965-2.c =================================================================== --- /dev/null 2018-02-19 19:34:42.906488063 +0000 +++ gcc/testsuite/gcc.dg/vect/pr83965-2.c 2018-02-20 16:28:55.635762095 +0000 @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-Ofast -ftrapv" } */ + +int c; +unsigned char d; +int e (unsigned char *f) +{ + int g; + for (int a; a; a++) + { + for (int b = 0; b < 6; b++) + g += __builtin_abs (f[b] - d); + f += c; + } + return g; +}