https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123301

--- Comment #3 from Robin Dapp <rdapp at gcc dot gnu.org> ---
Hmm, this is definitely latent and not originally due to the bisection commit.

ifcvt creates a COND_IOR scalar reduction (versioned for loop_vectorized, as it
should be).

What we create is
  _ifc__54 = .COND_IOR (_10, BS_VAR_0_41, _25, BS_VAR_0_41);

where
  vector(4) long unsigned int _ifc__54;
  vector(4) long unsigned int _25;
  _Bool _10;

These types come directly from the source, i.e. GNU vector extension types.

We consider this a scalar reduction resulting in the COND_IOR above.
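
For illustration only, a minimal sketch of a source loop with this shape. It
is not the testcase from the PR: the names v4ul, cond_ior_scalar,
cond_ior_loop and cond_ior_ifcvt are made up, and an LP64 target is assumed so
that the typedef corresponds to vector(4) long unsigned int. The helper spells
out the scalar reading of .COND_IOR (cond, a, b, else), i.e. "a | b" if cond
holds and "else" otherwise, with one _Bool governing the whole vector value.

```c
/* Hypothetical names throughout; this is not the PR's testcase.  */
typedef unsigned long v4ul
  __attribute__ ((vector_size (4 * sizeof (unsigned long))));

/* Scalar reading of .COND_IOR (cond, a, b, els): one _Bool selects
   between the whole value a | b and the whole value els.  */
static v4ul
cond_ior_scalar (_Bool cond, v4ul a, v4ul b, v4ul els)
{
  if (cond)
    return a | b;
  return els;
}

/* A conditional OR reduction whose accumulator is a GNU vector type.  */
v4ul
cond_ior_loop (const v4ul *vals, const _Bool *flags, int n)
{
  v4ul acc = { 0, 0, 0, 0 };
  for (int i = 0; i < n; i++)
    if (flags[i])
      acc |= vals[i];
  return acc;
}

/* The same loop with the branch replaced by the conditional operation,
   mirroring _ifc = .COND_IOR (cond, acc, val, acc).  */
v4ul
cond_ior_ifcvt (const v4ul *vals, const _Bool *flags, int n)
{
  v4ul acc = { 0, 0, 0, 0 };
  for (int i = 0; i < n; i++)
    acc = cond_ior_scalar (flags[i], acc, vals[i], acc);
  return acc;
}
```

Both loops compute the same result; the point is only that the "scalar"
reduction variable already has vector type while the condition stays a single
_Bool.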

The whole loop does not get vectorized but one BB of the loop does.

When vectorizing we set:

          if (!require_loop_vectorize)
            {
              tree arg = gimple_call_arg (loop_vectorized_call, 1);
              class loop *scalar_loop = get_loop (fun, tree_to_shwi (arg));
              if (vect_slp_if_converted_bb (bb, scalar_loop))
                  fold_loop_internal_call (loop_vectorized_call,
                                           boolean_true_node);

"activating" the if-converted reduction.

This is because we only check
                      || (direct_internal_fn_p (ifn)
                          && !direct_internal_fn_supported_p
                          (call, OPTIMIZE_FOR_SPEED)))
                    {
                      require_loop_vectorize = true;

where direct_internal_fn_supported_p looks only at the operation type, not at
every argument (and thus does not see the scalar mask).

I'd say we should forbid vector types when converting a _scalar_ reduction. 
But I obviously didn't think of that case when writing the code.

So a simple
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index c8f7b8453d8..51fbcc128c6 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -1993,6 +1993,7 @@ convert_scalar_cond_reduction (gimple *reduc, gimple_stmt_iterator *gsi,
   ifn = get_conditional_internal_fn (reduction_op);
   if (loop_versioned && ifn != IFN_LAST
       && vectorized_internal_fn_supported_p (ifn, TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
       && !swap)
     {
       gcall *cond_call = gimple_build_call_internal (ifn, 4,

could be enough.  I'll test that.
