On Wed, 23 Oct 2024, Tamar Christina wrote:
> Hi All,
>
> I have been taking a look at boolean handing once more in the vectorizer.
>
> There are two situation to consider:
>
> 1. when the boolean being created are created from comparing data inputs
> then
> for the resulting vector boolean we need to know the vector type and the
> precision. In this case, when we have an operation such as NOT on the
> data
> element, this has to be lowered to XOR because the truncation to the
> vector
> precision needs to be explicit.
> 2. when the boolean being created comes from another boolean operation, then
> we don't need to lower NOT, as the precision doesn't change. We don't do
> any lowering for these (as denoted in check_bool_pattern) and instead the
> precision is copied from the element feeding the boolean statement during
> VF analysis.
>
> For early break gcond lowering in order to correctly handle the second
> scenario
> above we punted the lowering of VECT_SCALAR_BOOLEAN_TYPE_P comparisons that
> were
> already in the right shape. e.g. e != 0 where e is a boolean does not need
> any
> lowering.
>
> The issue however is that the statement feeding e may need to be lowered in
> the
> case where it's a data expression.
>
> This patch changes a bit how we do the lowering. We now always emit an
> additional compare. e.g. if the input is;
>
> if (e != 0)
>
> where is a boolean we would punt on thi before, but now we generate
>
> f = e != 0
> if (f != 0)
>
> We then use the same infrastructre as recog_bool to ask it to lower f, and in
> doing so handle and boolean conversions that need to be lowered.
>
> Because we now guarantee that f is an internal def we can also simplify the
> SLP building code.
>
> When e is a boolean, the precision we build for f needs to reflect the
> precision
> of the operation feeding e. To get this value we use integer_type_for_mask
> the
> same way recog_bool does, and if it's defined (e.g. we have a data conversions
> somewhere) we pass that precision on instead. This gets us the correct VF
> on the newly lowered boolean expressions.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
OK.
Thanks,
Richard.
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> PR tree-optimization/117176
> * tree-vect-patterns.cc (vect_recog_gcond_pattern): Lower all gconds.
> * tree-vect-slp.cc (vect_analyze_slp): No longer check for in vect def.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/117176
> * gcc.dg/vect/vect-early-break_130-pr117176.c: New test.
>
> ---
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..841dcce284dd7cff0c4f0648e6dc57082c8756c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_130-pr117176.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +struct ColorSpace {
> + int componentCt;
> +};
> +
> +struct Psnr {
> + double psnr[3];
> +};
> +
> +int f(struct Psnr psnr, struct ColorSpace colorSpace) {
> + int i, hitsTarget = 1;
> +
> + for (i = 1; i < colorSpace.componentCt && hitsTarget; ++i)
> + hitsTarget = !(psnr.psnr[i] < 1);
> +
> + return hitsTarget;
> +}
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index
> 746f100a0842090953571b535ff05375f46033c0..dc188f7412770bcb2344808d31bfb7e6c981d777
> 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -5944,21 +5944,31 @@ vect_recog_gcond_pattern (vec_info *vinfo,
> if (VECTOR_TYPE_P (scalar_type))
> return NULL;
>
> - if (code == NE_EXPR
> - && zerop (rhs)
> - && VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
> - return NULL;
> + /* If the input is a boolean then try to figure out the precision that the
> + vector type should use. We cannot use the scalar precision as this
> would
> + later mismatch. This is similar to what recog_bool does. */
> + if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
> + {
> + if (tree stype = integer_type_for_mask (lhs, vinfo))
> + scalar_type = stype;
> + }
>
> - tree vecitype = get_vectype_for_scalar_type (vinfo, scalar_type);
> - if (vecitype == NULL_TREE)
> + tree vectype = get_mask_type_for_scalar_type (vinfo, scalar_type);
> + if (vectype == NULL_TREE)
> return NULL;
>
> - tree vectype = truth_type_for (vecitype);
> -
> tree new_lhs = vect_recog_temp_ssa_var (boolean_type_node, NULL);
> gimple *new_stmt = gimple_build_assign (new_lhs, code, lhs, rhs);
> append_pattern_def_seq (vinfo, stmt_vinfo, new_stmt, vectype, scalar_type);
>
> + hash_set<gimple *> bool_stmts;
> + /* This will only be entered if we have a data comparison in the gcond. */
> + if (check_bool_pattern (new_lhs, vinfo, bool_stmts))
> + {
> + scalar_type = TREE_TYPE (new_lhs);
> + new_lhs = adjust_bool_stmts (vinfo, bool_stmts, scalar_type,
> stmt_vinfo);
> + }
> +
> gimple *pattern_stmt
> = gimple_build_cond (NE_EXPR, new_lhs,
> build_int_cst (TREE_TYPE (new_lhs), 0),
> @@ -6208,7 +6218,6 @@ vect_recog_bool_pattern (vec_info *vinfo,
> return NULL;
> }
>
> -
> /* A helper for vect_recog_mask_conversion_pattern. Build
> conversion of MASK to a type suitable for masking VECTYPE.
> Built statement gets required vectype and is appended to
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index
> d35c2ea02dcef1c295bf97adc6717a7294d3f61e..ebd3ae16cf5cb5c1e85c9b2dced23fa1a1d76645
> 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -4858,37 +4858,16 @@ vect_analyze_slp (vec_info *vinfo, unsigned
> max_tree_size,
> from them. It's highly likely that the resulting SLP tree here if
> both
> arguments have a def will be incompatible, but we rely on it being
> split
> later on. */
> - if (auto varg = loop_vinfo->lookup_def (args0))
> - {
> - vec<stmt_vec_info> stmts;
> - vec<tree> remain = vNULL;
> - stmts.create (1);
> - stmts.quick_push (vect_stmt_to_vectorize (varg));
> + auto varg = loop_vinfo->lookup_def (args0);
> + vec<stmt_vec_info> stmts;
> + vec<tree> remain = vNULL;
> + stmts.create (1);
> + stmts.quick_push (vect_stmt_to_vectorize (varg));
>
> - vect_build_slp_instance (vinfo, slp_inst_kind_gcond,
> - stmts, roots, remain,
> - max_tree_size, &limit,
> - bst_map, NULL, force_single_lane);
> - }
> - else
> - {
> - /* Create a new SLP instance. */
> - slp_instance new_instance = XNEW (class _slp_instance);
> - vec<tree> ops;
> - ops.create (1);
> - ops.quick_push (args0);
> - slp_tree invnode = vect_create_new_slp_node (ops);
> - SLP_TREE_DEF_TYPE (invnode) = vect_external_def;
> - SLP_INSTANCE_TREE (new_instance) = invnode;
> - SLP_INSTANCE_LOADS (new_instance) = vNULL;
> - SLP_INSTANCE_ROOT_STMTS (new_instance) = roots;
> - SLP_INSTANCE_REMAIN_DEFS (new_instance) = vNULL;
> - SLP_INSTANCE_KIND (new_instance) = slp_inst_kind_gcond;
> - new_instance->reduc_phis = NULL;
> - new_instance->cost_vec = vNULL;
> - new_instance->subgraph_entries = vNULL;
> - vinfo->slp_instances.safe_push (new_instance);
> - }
> + vect_build_slp_instance (vinfo, slp_inst_kind_gcond,
> + stmts, roots, remain,
> + max_tree_size, &limit,
> + bst_map, NULL, force_single_lane);
> }
>
> /* Find and create slp instances for inductions that have been forced
>
>
>
>
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)