> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Tuesday, December 12, 2023 10:10 AM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; j...@ventanamicro.com;
> Richard Sandiford <richard.sandif...@arm.com>
> Subject: RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for
> codegen of exit code
>
> On Mon, 11 Dec 2023, Tamar Christina wrote:
>
> > > > +      vectype = truth_type_for (comp_type);
> > >
> > > so this leaves the producer of the mask in the GIMPLE_COND and we
> > > vectorize the GIMPLE_COND as
> > >
> > >   mask_1 = ...;
> > >   if (mask_1 != {-1,-1...})
> > >     ..
> > >
> > > ?  In principle only the mask producer needs a vector type and that
> > > adjusted by bool handling, the branch itself doesn't need any
> > > STMT_VINFO_VECTYPE.
> > >
> > > As said I believe if you recognize a GIMPLE_COND pattern for conds
> > > that aren't bool != 0 producing the mask stmt this should be picked
> > > up by bool handling correctly already.
> > >
> > > Also as said piggy-backing on the COND_EXPR handling in this function
> > > which has the condition split out into a separate stmt(!) might not
> > > completely handle things correctly and you are likely missing
> > > the tcc_comparison handling of the embedded compare.
> > >
> >
> > Ok, I've stopped piggy-backing on the COND_EXPR handling and created
> > vect_recog_gcond_pattern.  As you said in the previous email I've also
> > stopped setting the vectype for the gcond and instead use the type of the
> > operand.
> >
> > Note that because the pattern doesn't apply if you were already an NE_EXPR
> > I do need the extra truth_type_for for that case.  Because in the case of
> > e.g.
> >
> > a = b > 4;
> > If (a != 0)
> >
> > The producer of the mask is already outside of the cond but will not trigger
> > Boolean recognition.
>
> It should trigger because we have a mask use of 'a', I always forget
> where we do that - it might be where we compute mask precision stuff
> or it might be bool pattern recognition itself ...
>
> That said, a GIMPLE_COND (be it pattern or not) should be recognized
> as mask use.
>
> > That means that while the integral type is correct it
> > won't be a Boolean one and vectorizable_comparison expects a Boolean
> > vector.  Alternatively, we can remove that assert?  But that seems worse.
> >
> > Additionally in the previous email you mention "adjusted Boolean statement".
> >
> > I'm guessing you were referring to generating a COND_EXPR from the gcond.
> > So vect_recog_bool_pattern detects it?  The problem with that is that this
> > gets folded to x & 1 and doesn't trigger.  It also then blocks
> > vectorization.  So instead I've not forced it.
>
> Not sure what you are referring to, but no - we shouldn't generate a
> COND_EXPR from the gcond.  Pattern recog generates COND_EXPRs for
> _data_ uses of masks (if we need a 'bool' data type for storing).
> We then get mask != 0 ? true : false;
>

Thought so.. but there happens to be a function called adjust_bool_stmts which
I thought you wanted me to call.  This is where the confusion came from: I
couldn't tell whether "adjusted Boolean statement" meant just the new modified
statement or one produced by adjust_bool_stmts.  The latter didn't make much
sense, hence the question above.

> > > > +  /* Determine if we need to reduce the final value.  */
> > > > +  if (stmts.length () > 1)
> > > > +    {
> > > > +      /* We build the reductions in a way to maintain as much parallelism as
> > > > +         possible.  */
> > > > +      auto_vec<tree> workset (stmts.length ());
> > > > +
> > > > +      /* Mask the statements as we queue them up.  */
> > > > +      if (masked_loop_p)
> > > > +        for (auto stmt : stmts)
> > > > +          workset.quick_push (prepare_vec_mask (loop_vinfo, TREE_TYPE (mask),
> > > > +                                                mask, stmt, &cond_gsi));
> > > > +      else
> > > > +        workset.splice (stmts);
> > > > +
> > > > +      while (workset.length () > 1)
> > > > +        {
> > > > +          new_temp = make_temp_ssa_name (vectype, NULL, "vexit_reduc");
> > > > +          tree arg0 = workset.pop ();
> > > > +          tree arg1 = workset.pop ();
> > > > +          new_stmt = gimple_build_assign (new_temp, BIT_IOR_EXPR, arg0, arg1);
> > > > +          vect_finish_stmt_generation (loop_vinfo, stmt_info, new_stmt,
> > > > +                                       &cond_gsi);
> > > > +          workset.quick_insert (0, new_temp);
> > > > +        }
> > > > +    }
> > > > +  else
> > > > +    new_temp = stmts[0];
> > > > +
> > > > +  gcc_assert (new_temp);
> > > > +
> > > > +  tree cond = new_temp;
> > > > +  /* If we have multiple statements after reduction we should check all the
> > > > +     lanes and treat it as a full vector.  */
> > > > +  if (masked_loop_p)
> > > > +    cond = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask, cond,
> > > > +                             &cond_gsi);
> > >
> > > You didn't fix any of the code above it seems, it's still wrong.
> > >
> >
> > Apologies, I hadn't realized that the last argument to get_loop_mask was the
> > index.
> >
> > Should be fixed now.
> > Is this closer to what you wanted?
> > The individual ops are now masked with separate masks.  (See testcase when
> > N=865).
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >         * tree-vect-patterns.cc (vect_init_pattern_stmt): Support gconds.
> >         (vect_recog_gcond_pattern): New.
> >         (vect_vect_recog_func_ptrs): Use it.
> >         * tree-vect-stmts.cc (vectorizable_comparison_1): Support stmts without
> >         lhs.
> >         (vectorizable_early_exit): New.
> >         (vect_analyze_stmt, vect_transform_stmt): Use it.
> >         (vect_is_simple_use, vect_get_vector_types_for_stmt): Support gcond.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/vect/vect-early-break_88.c: New test.
> >
> > --- inline copy of patch ---
> >
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..b64becd588973f58601196bfcb15afbe4bab60f2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_88.c
> > @@ -0,0 +1,36 @@
> > +/* { dg-require-effective-target vect_early_break } */
> > +/* { dg-require-effective-target vect_int } */
> > +
> > +/* { dg-additional-options "-Ofast --param vect-partial-vector-usage=2" } */
> > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > +
> > +#ifndef N
> > +#define N 5
> > +#endif
> > +float vect_a[N] = { 5.1f, 4.2f, 8.0f, 4.25f, 6.5f };
> > +unsigned vect_b[N] = { 0 };
> > +
> > +__attribute__ ((noinline, noipa))
> > +unsigned test4(double x)
> > +{
> > +  unsigned ret = 0;
> > +  for (int i = 0; i < N; i++)
> > +    {
> > +      if (vect_a[i] > x)
> > +        break;
> > +      vect_a[i] = x;
> > +
> > +    }
> > +  return ret;
> > +}
> > +
> > +extern void abort ();
> > +
> > +int main ()
> > +{
> > +  if (test4 (7.0) != 0)
> > +    abort ();
> > +
> > +  if (vect_b[2] != 0 && vect_b[1] == 0)
> > +    abort ();
> > +}
> > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> > index 7debe7f0731673cd1bf25cd39d55e23990a73d0e..359d30b5991a50717c269df577c08adffa44e71b 100644
> > --- a/gcc/tree-vect-patterns.cc
> > +++ b/gcc/tree-vect-patterns.cc
> > @@ -132,6 +132,7 @@ vect_init_pattern_stmt (vec_info *vinfo, gimple *pattern_stmt,
> >    if (!STMT_VINFO_VECTYPE (pattern_stmt_info))
> >      {
> >        gcc_assert (!vectype
> > +                  || is_a <gcond *> (pattern_stmt)
> >                    || (VECTOR_BOOLEAN_TYPE_P (vectype)
> >                        == vect_use_mask_type_p (orig_stmt_info)));
> >        STMT_VINFO_VECTYPE (pattern_stmt_info) = vectype;
> > @@ -5553,6 +5554,83 @@ integer_type_for_mask (tree var, vec_info *vinfo)
> >    return build_nonstandard_integer_type (def_stmt_info->mask_precision, 1);
> >  }
> >
> > +/* Function vect_recog_gcond_pattern
> > +
> > +   Try to find pattern like following:
> > +
> > +     if (a op b)
> > +
> > +   where operator 'op' is not != and convert it to an adjusted boolean pattern
> > +
> > +     mask = a op b
> > +     if (mask != 0)
> > +
> > +   and set the mask type on MASK.
> > +
> > +   Input:
> > +
> > +   * STMT_VINFO: The stmt at the end from which the pattern
> > +                 search begins, i.e. cast of a bool to
> > +                 an integer type.
> > +
> > +   Output:
> > +
> > +   * TYPE_OUT: The type of the output of this pattern.
> > +
> > +   * Return value: A new stmt that will be used to replace the pattern.  */
> > +
> > +static gimple *
> > +vect_recog_gcond_pattern (vec_info *vinfo,
> > +                          stmt_vec_info stmt_vinfo, tree *type_out)
> > +{
> > +  gimple *last_stmt = STMT_VINFO_STMT (stmt_vinfo);
> > +  gcond* cond = NULL;
> > +  if (!(cond = dyn_cast <gcond *> (last_stmt)))
> > +    return NULL;
> > +
> > +  auto lhs = gimple_cond_lhs (cond);
> > +  auto rhs = gimple_cond_rhs (cond);
> > +  auto code = gimple_cond_code (cond);
> > +
> > +  tree scalar_type = TREE_TYPE (lhs);
> > +  if (VECTOR_TYPE_P (scalar_type))
> > +    return NULL;
> > +
> > +  if (code == NE_EXPR && zerop (rhs))
>
> I think you need && VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type) here,
> an integer != 0 would not be an appropriate mask.
> I guess two relevant testcases would have an early exit like
>
>    if (here[i] != 0)
>      break;
>
> once with a 'bool here[]' and once with a 'int here[]'.
>
> > +    return NULL;
> > +
> > +  tree vecitype = get_vectype_for_scalar_type (vinfo, scalar_type);
> > +  if (vecitype == NULL_TREE)
> > +    return NULL;
> > +
> > +  /* Build a scalar type for the boolean result that when vectorized matches the
> > +     vector type of the result in size and number of elements.  */
> > +  unsigned prec
> > +    = vector_element_size (tree_to_poly_uint64 (TYPE_SIZE (vecitype)),
> > +                           TYPE_VECTOR_SUBPARTS (vecitype));
> > +
> > +  scalar_type
> > +    = build_nonstandard_integer_type (prec, TYPE_UNSIGNED (scalar_type));
> > +
> > +  vecitype = get_vectype_for_scalar_type (vinfo, scalar_type);
> > +  if (vecitype == NULL_TREE)
> > +    return NULL;
> > +
> > +  tree vectype = truth_type_for (vecitype);
>
> That looks awfully complicated.  I guess one complication is that
> we compute mask_precision & friends before this pattern gets
> recognized.  See vect_determine_mask_precision and its handling
> of tcc_comparison, see also integer_type_for_mask.  For comparisons
> properly handled during pattern recog the vector type is determined
> in vect_get_vector_types_for_stmt via
>
>   else if (vect_use_mask_type_p (stmt_info))
>     {
>       unsigned int precision = stmt_info->mask_precision;
>       scalar_type = build_nonstandard_integer_type (precision, 1);
>       vectype = get_mask_type_for_scalar_type (vinfo, scalar_type,
>                                                group_size);
>       if (!vectype)
>         return opt_result::failure_at (stmt, "not vectorized: unsupported"
>                                        " data-type %T\n", scalar_type);
>
> Richard, do you have any advice here?  I suppose vect_determine_precisions
> needs to handle the gcond case with bool != 0 somehow and for the
> extra mask producer we add here we have to emulate what it would have
> done, right?
>

There seem to be an awful lot of places that determine types and precision 😊
It's quite hard to figure out which part is used where, and Boolean handling
seems to be especially complicated.

> > +  tree new_lhs = vect_recog_temp_ssa_var (boolean_type_node, NULL);
> > +  gimple *new_stmt = gimple_build_assign (new_lhs, code, lhs, rhs);
> > +  append_pattern_def_seq (vinfo, stmt_vinfo, new_stmt, vectype, scalar_type);
> > +
> > +  gimple *pattern_stmt
> > +    = gimple_build_cond (NE_EXPR, new_lhs,
> > +                         build_int_cst (TREE_TYPE (new_lhs), 0),
> > +                         NULL_TREE, NULL_TREE);
> > +  *type_out = vectype;
> > +  vect_pattern_detected ("vect_recog_gcond_pattern", last_stmt);
> > +  return pattern_stmt;
> > +}
> > +
> >  /* Function vect_recog_bool_pattern
> >
> >     Try to find pattern like following:
> > @@ -6860,6 +6938,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
> >    { vect_recog_divmod_pattern, "divmod" },
> >    { vect_recog_mult_pattern, "mult" },
> >    { vect_recog_mixed_size_cond_pattern, "mixed_size_cond" },
> > +  { vect_recog_gcond_pattern, "gcond" },
> >    { vect_recog_bool_pattern, "bool" },
> >    /* This must come before mask conversion, and includes the parts
> >       of mask conversion that are needed for gather and scatter
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index 582c5e678fad802d6e76300fe3c939b9f2978f17..7c50ee37f2ade24eccf7a7d1ea2e00b4450023f9 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -12489,7 +12489,7 @@ vectorizable_comparison_1 (vec_info *vinfo, tree vectype,
> >    vec<tree> vec_oprnds0 = vNULL;
> >    vec<tree> vec_oprnds1 = vNULL;
> >    tree mask_type;
> > -  tree mask;
> > +  tree mask = NULL_TREE;
> >
> >    if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> >      return false;
> > @@ -12629,8 +12629,9 @@ vectorizable_comparison_1 (vec_info *vinfo, tree vectype,
> >    /* Transform.  */
> >
> >    /* Handle def.  */
> > -  lhs = gimple_assign_lhs (STMT_VINFO_STMT (stmt_info));
> > -  mask = vect_create_destination_var (lhs, mask_type);
> > +  lhs = gimple_get_lhs (STMT_VINFO_STMT (stmt_info));
> > +  if (lhs)
> > +    mask = vect_create_destination_var (lhs, mask_type);
> >
> >    vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
> >                       rhs1, &vec_oprnds0, vectype,
> > @@ -12644,7 +12645,10 @@ vectorizable_comparison_1 (vec_info *vinfo, tree vectype,
> >        gimple *new_stmt;
> >        vec_rhs2 = vec_oprnds1[i];
> >
> > -      new_temp = make_ssa_name (mask);
> > +      if (lhs)
> > +        new_temp = make_ssa_name (mask);
> > +      else
> > +        new_temp = make_temp_ssa_name (mask_type, NULL, "cmp");
> >        if (bitop1 == NOP_EXPR)
> >          {
> >            new_stmt = gimple_build_assign (new_temp, code,
> > @@ -12723,6 +12727,211 @@ vectorizable_comparison (vec_info *vinfo,
> >    return true;
> >  }
> >
> > +/* Check to see if the current early break given in STMT_INFO is valid for
> > +   vectorization.  */
> > +
> > +static bool
> > +vectorizable_early_exit (vec_info *vinfo, stmt_vec_info stmt_info,
> > +                         gimple_stmt_iterator *gsi, gimple **vec_stmt,
> > +                         slp_tree slp_node, stmt_vector_for_cost *cost_vec)
> > +{
> > +  loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
> > +  if (!loop_vinfo
> > +      || !is_a <gcond *> (STMT_VINFO_STMT (stmt_info)))
> > +    return false;
> > +
> > +  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_condition_def)
> > +    return false;
> > +
> > +  if (!STMT_VINFO_RELEVANT_P (stmt_info))
> > +    return false;
> > +
> > +  DUMP_VECT_SCOPE ("vectorizable_early_exit");
> > +
> > +  auto code = gimple_cond_code (STMT_VINFO_STMT (stmt_info));
> > +
> > +  tree vectype_op0 = NULL_TREE;
> > +  slp_tree slp_op0;
> > +  tree op0;
> > +  enum vect_def_type dt0;
> > +  if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 0, &op0, &slp_op0, &dt0,
> > +                           &vectype_op0))
> > +    {
> > +      if (dump_enabled_p ())
> > +        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +                         "use not simple.\n");
> > +      return false;
> > +    }
> > +
> > +  stmt_vec_info op0_info = vinfo->lookup_def (op0);
> > +  tree vectype = truth_type_for (STMT_VINFO_VECTYPE (op0_info));
> > +  gcc_assert (vectype);
> > +
> > +  machine_mode mode = TYPE_MODE (vectype);
> > +  int ncopies;
> > +
> > +  if (slp_node)
> > +    ncopies = 1;
> > +  else
> > +    ncopies = vect_get_num_copies (loop_vinfo, vectype);
> > +
> > +  vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> > +  bool masked_loop_p = LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
> > +
> > +  /* Analyze only.  */
> > +  if (!vec_stmt)
> > +    {
> > +      if (direct_optab_handler (cbranch_optab, mode) == CODE_FOR_nothing)
> > +        {
> > +          if (dump_enabled_p ())
> > +            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +                             "can't vectorize early exit because the "
> > +                             "target doesn't support flag setting vector "
> > +                             "comparisons.\n");
> > +          return false;
> > +        }
> > +
> > +      if (ncopies > 1
> > +          && direct_optab_handler (ior_optab, mode) == CODE_FOR_nothing)
> > +        {
> > +          if (dump_enabled_p ())
> > +            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +                             "can't vectorize early exit because the "
> > +                             "target does not support boolean vector OR for "
> > +                             "type %T.\n", vectype);
> > +          return false;
> > +        }
> > +
> > +      if (!vectorizable_comparison_1 (vinfo, vectype, stmt_info, code, gsi,
> > +                                      vec_stmt, slp_node, cost_vec))
> > +        return false;
> > +
> > +      if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
> > +        {
> > +          if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
> > +                                              OPTIMIZE_FOR_SPEED))
> > +            return false;
> > +          else
> > +            vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype, NULL);
> > +        }
> > +
> > +
> > +      return true;
> > +    }
> > +
> > +  /* Tranform.  */
> > +
> > +  tree new_temp = NULL_TREE;
> > +  gimple *new_stmt = NULL;
> > +
> > +  if (dump_enabled_p ())
> > +    dump_printf_loc (MSG_NOTE, vect_location, "transform early-exit.\n");
> > +
> > +  if (!vectorizable_comparison_1 (vinfo, vectype, stmt_info, code, gsi,
> > +                                  vec_stmt, slp_node, cost_vec))
> > +    gcc_unreachable ();
> > +
> > +  gimple *stmt = STMT_VINFO_STMT (stmt_info);
> > +  basic_block cond_bb = gimple_bb (stmt);
> > +  gimple_stmt_iterator cond_gsi = gsi_last_bb (cond_bb);
> > +
> > +  auto_vec<tree> stmts;
> > +
> > +  tree mask = NULL_TREE;
> > +  if (masked_loop_p)
> > +    mask = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies, vectype, 0);
> > +
> > +  if (slp_node)
> > +    stmts.safe_splice (SLP_TREE_VEC_DEFS (slp_node));
> > +  else
> > +    {
> > +      auto vec_stmts = STMT_VINFO_VEC_STMTS (stmt_info);
> > +      stmts.reserve_exact (vec_stmts.length ());
> > +      for (auto stmt : vec_stmts)
> > +        stmts.quick_push (gimple_assign_lhs (stmt));
> > +    }
> > +
> > +  /* Determine if we need to reduce the final value.  */
> > +  if (stmts.length () > 1)
> > +    {
> > +      /* We build the reductions in a way to maintain as much parallelism as
> > +         possible.  */
> > +      auto_vec<tree> workset (stmts.length ());
> > +
> > +      /* Mask the statements as we queue them up.  Normally we loop over
> > +         vec_num, but since we inspect the exact results of vectorization
> > +         we don't need to and instead can just use the stmts themselves.  */
> > +      if (masked_loop_p)
> > +        for (unsigned i = 0; i < stmts.length (); i++)
> > +          {
> > +            tree stmt_mask
> > +              = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies, vectype,
> > +                                    i);
> > +            stmt_mask
> > +              = prepare_vec_mask (loop_vinfo, TREE_TYPE (stmt_mask), stmt_mask,
> > +                                  stmts[i], &cond_gsi);
> > +            workset.quick_push (stmt_mask);
> > +          }
> > +      else
> > +        workset.splice (stmts);
> > +
> > +      while (workset.length () > 1)
> > +        {
> > +          new_temp = make_temp_ssa_name (vectype, NULL, "vexit_reduc");
> > +          tree arg0 = workset.pop ();
> > +          tree arg1 = workset.pop ();
> > +          new_stmt = gimple_build_assign (new_temp, BIT_IOR_EXPR, arg0, arg1);
> > +          vect_finish_stmt_generation (loop_vinfo, stmt_info, new_stmt,
> > +                                       &cond_gsi);
> > +          workset.quick_insert (0, new_temp);
> > +        }
> > +    }
> > +  else
> > +    new_temp = stmts[0];
> > +
> > +  gcc_assert (new_temp);
> > +
> > +  tree cond = new_temp;
> > +  /* If we have multiple statements after reduction we should check all the
> > +     lanes and treat it as a full vector.  */
> > +  if (masked_loop_p)
> > +    cond = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask, cond,
> > +                             &cond_gsi);
>
> This is still wrong, you are applying mask[0] on the IOR reduced result.
> As suggested do that in the else { new_temp = stmts[0] } clause instead
> (or simply elide the optimization of a single vector)

PEBKAC.. I had looked at it and thought it didn't seem right, since why would
mask[0] be used for both the elements and the final result, but I left it ☹

I'll wait for Richard's thoughts on the precision before re-spinning.

Thanks,
Tamar

>
> > +  /* Now build the new conditional.  Pattern gimple_conds get dropped during
> > +     codegen so we must replace the original insn.  */
> > +  stmt = STMT_VINFO_STMT (vect_orig_stmt (stmt_info));
> > +  gcond *cond_stmt = as_a <gcond *>(stmt);
> > +  /* When vectorizing we assume that if the branch edge is taken that we're
> > +     exiting the loop.  This is not however always the case as the compiler will
> > +     rewrite conditions to always be a comparison against 0.  To do this it
> > +     sometimes flips the edges.  This is fine for scalar, but for vector we
> > +     then have to flip the test, as we're still assuming that if you take the
> > +     branch edge that we found the exit condition.  */
> > +  auto new_code = NE_EXPR;
> > +  tree cst = build_zero_cst (vectype);
> > +  if (flow_bb_inside_loop_p (LOOP_VINFO_LOOP (loop_vinfo),
> > +                             BRANCH_EDGE (gimple_bb (cond_stmt))->dest))
> > +    {
> > +      new_code = EQ_EXPR;
> > +      cst = build_minus_one_cst (vectype);
> > +    }
> > +
> > +  gimple_cond_set_condition (cond_stmt, new_code, cond, cst);
> > +  update_stmt (stmt);
> > +
> > +  if (slp_node)
> > +    SLP_TREE_VEC_DEFS (slp_node).truncate (0);
> > +  else
> > +    STMT_VINFO_VEC_STMTS (stmt_info).truncate (0);
> > +
> > +
> > +  if (!slp_node)
> > +    *vec_stmt = stmt;
> > +
> > +  return true;
> > +}
> > +
> >  /* If SLP_NODE is nonnull, return true if vectorizable_live_operation
> >     can handle all live statements in the node.  Otherwise return true
> >     if STMT_INFO is not live or if vectorizable_live_operation can handle it.
> > @@ -12949,7 +13158,9 @@ vect_analyze_stmt (vec_info *vinfo,
> >            || vectorizable_lc_phi (as_a <loop_vec_info> (vinfo),
> >                                    stmt_info, NULL, node)
> >            || vectorizable_recurr (as_a <loop_vec_info> (vinfo),
> > -                                  stmt_info, NULL, node, cost_vec));
> > +                                  stmt_info, NULL, node, cost_vec)
> > +          || vectorizable_early_exit (vinfo, stmt_info, NULL, NULL, node,
> > +                                      cost_vec));
> >    else
> >      {
> >        if (bb_vinfo)
> > @@ -12972,7 +13183,10 @@ vect_analyze_stmt (vec_info *vinfo,
> >                                           NULL, NULL, node, cost_vec)
> >                || vectorizable_comparison (vinfo, stmt_info, NULL, NULL, node,
> >                                            cost_vec)
> > -              || vectorizable_phi (vinfo, stmt_info, NULL, node, cost_vec));
> > +              || vectorizable_phi (vinfo, stmt_info, NULL, node, cost_vec)
> > +              || vectorizable_early_exit (vinfo, stmt_info, NULL, NULL, node,
> > +                                          cost_vec));
> > +
> >      }
> >
> >    if (node)
> > @@ -13131,6 +13345,12 @@ vect_transform_stmt (vec_info *vinfo,
> >        gcc_assert (done);
> >        break;
> >
> > +    case loop_exit_ctrl_vec_info_type:
> > +      done = vectorizable_early_exit (vinfo, stmt_info, gsi, &vec_stmt,
> > +                                      slp_node, NULL);
> > +      gcc_assert (done);
> > +      break;
> > +
> >      default:
> >        if (!STMT_VINFO_LIVE_P (stmt_info))
> >          {
> > @@ -14321,10 +14541,19 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info,
> >      }
> >    else
> >      {
> > +      gcond *cond = NULL;
> >        if (data_reference *dr = STMT_VINFO_DATA_REF (stmt_info))
> >          scalar_type = TREE_TYPE (DR_REF (dr));
> >        else if (gimple_call_internal_p (stmt, IFN_MASK_STORE))
> >          scalar_type = TREE_TYPE (gimple_call_arg (stmt, 3));
> > +      else if ((cond = dyn_cast <gcond *> (stmt)))
> > +        {
> > +          /* We can't convert the scalar type to boolean yet, since booleans have a
> > +             single bit precision and we need the vector boolean to be a
> > +             representation of the integer mask.  So set the correct integer type and
> > +             convert to boolean vector once we have a vectype.  */
> > +          scalar_type = TREE_TYPE (gimple_cond_lhs (cond));
>
> You should get into the vect_use_mask_type_p (stmt_info) path for
> early exit conditions (see above with regard to mask_precision).
>
> > +        }
> >        else
> >          scalar_type = TREE_TYPE (gimple_get_lhs (stmt));
> >
> > @@ -14339,12 +14568,18 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info,
> >                             "get vectype for scalar type: %T\n", scalar_type);
> >          }
> >        vectype = get_vectype_for_scalar_type (vinfo, scalar_type, group_size);
> > +
> >        if (!vectype)
> >          return opt_result::failure_at (stmt,
> >                                         "not vectorized:"
> >                                         " unsupported data-type %T\n",
> >                                         scalar_type);
> >
> > +      /* If we were a gcond, convert the resulting type to a vector boolean type now
> > +         that we have the correct integer mask type.  */
> > +      if (cond)
> > +        vectype = truth_type_for (vectype);
> > +
>
> which makes this moot.
>
> Richard.
>
> >        if (dump_enabled_p ())
> >          dump_printf_loc (MSG_NOTE, vect_location, "vectype: %T\n", vectype);
> >      }
> >
>
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
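
For anyone following the masking discussion in this thread, here is a minimal standalone sketch of the reduction shape under review. It is a toy model and not GCC code: each vector is modelled as a 64-bit word with one bit per lane, and the names lane_mask, ior_reduce, any_active_lane_exits and any_lane_exits_single_mask are all hypothetical. It only assumes what the quoted hunks show, namely that the comparison results are queued, OR-reduced pairwise via pop/pop/insert-at-front, and that in a fully masked loop each result should be ANDed with the loop mask of its own index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Toy model, not GCC code: one 64-bit word per vector, one bit per lane.
using lane_mask = std::uint64_t;

// OR-reduce a work set the way the quoted loop does: pop two entries from
// the back and queue their OR at the front until a single entry is left.
static lane_mask
ior_reduce (std::deque<lane_mask> workset)
{
  assert (!workset.empty ());
  while (workset.size () > 1)
    {
      lane_mask arg0 = workset.back ();
      workset.pop_back ();
      lane_mask arg1 = workset.back ();
      workset.pop_back ();
      workset.push_front (arg0 | arg1);
    }
  return workset.front ();
}

// Each comparison result is ANDed with the loop mask of the same index
// before the reduction, and nothing is masked again afterwards.  A single
// vector takes the same path, so it is still masked with loop_masks[0].
bool
any_active_lane_exits (const std::vector<lane_mask> &cmp,
                       const std::vector<lane_mask> &loop_masks)
{
  assert (cmp.size () == loop_masks.size ());
  std::deque<lane_mask> workset;
  for (std::size_t i = 0; i < cmp.size (); i++)
    workset.push_back (cmp[i] & loop_masks[i]);
  return ior_reduce (workset) != 0;
}

// Model of the earlier revision: loop_masks[0] is reused for every vector,
// so inactive lanes of later vectors can leak into the result.
bool
any_lane_exits_single_mask (const std::vector<lane_mask> &cmp,
                            const std::vector<lane_mask> &loop_masks)
{
  assert (!loop_masks.empty ());
  std::deque<lane_mask> workset;
  for (lane_mask c : cmp)
    workset.push_back (c & loop_masks[0]);
  return ior_reduce (workset) != 0;
}
```

With cmp = {0x0, 0x8} and loop_masks = {0xF, 0x3}, the only set comparison bit sits in an inactive tail lane of the second vector. The per-index variant reports no exit, while the single-mask variant reports a spurious one. In this model the reduced value already covers active lanes only, so it is not masked again.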