https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111697

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> We have quite some code doing vector CTOR stuff in tree-ssa-forwprop.cc and
> this should be optimized to
> 
>  v_2 = { x_6(D), x_6(D), x_6(D), x_6(D) };
> 
> note SLP vectorization can do this but it fails because it doesn't handle
> a default def insert - it handles a group of BIT_INSERT_EXPRs as
> vector CTOR and SLP discovery doesn't know how to start from external defs
> (it needs actual definition stmts).
> 
> A more general approach would be to try to track vector construction through
> symbolic execution like we form bswap in the bswap pass.

You could "steal" the code in vect_slp_check_for_roots,

      else if (code == BIT_INSERT_EXPR
               && VECTOR_TYPE_P (TREE_TYPE (rhs))
               && TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs)).is_constant ()
               && TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs)).to_constant () > 1
               && integer_zerop (gimple_assign_rhs3 (assign))
               && useless_type_conversion_p
                    (TREE_TYPE (TREE_TYPE (rhs)),
                     TREE_TYPE (gimple_assign_rhs2 (assign)))
               && bb_vinfo->lookup_def (gimple_assign_rhs2 (assign)))
        {
          /* We start to match on insert to lane zero but since the
             inserts need not be ordered we'd have to search both
             the def and the use chains.  */
...

and put it into tree-ssa-forwprop.cc, explicitly creating the vector CTOR.
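
To illustrate the idea outside of GCC, here is a minimal standalone sketch (C++17) of the matching step: walk the def chain of a group of inserts back to the default def, record which lane each insert writes, and only form a CTOR when every lane is covered. It does not use GCC internals. BitInsert, collect_ctor and the integer "src" links are hypothetical stand-ins for the BIT_INSERT_EXPR statements and their SSA def chain, and a real forwprop pattern would additionally have to check types and uses, as the excerpt above does.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy model of  dest = BIT_INSERT_EXPR <src, value, bitpos>;
// Hypothetical type, not a GCC data structure.
struct BitInsert {
  int src;            // index of the insert defining the source vector,
                      // or -1 when the source is a default def
  std::string value;  // name of the scalar being inserted
  std::size_t lane;   // bit position divided by the element size
};

// Walk the def chain backwards from stmts[last].  Return the CTOR
// elements in lane order if every lane is written by some insert,
// otherwise nothing (a lane would still come from the undefined
// source vector).
std::optional<std::vector<std::string>>
collect_ctor (const std::vector<BitInsert> &stmts, int last,
              std::size_t nlanes)
{
  std::vector<std::optional<std::string>> lanes (nlanes);
  std::size_t filled = 0;
  for (int i = last; i != -1; i = stmts[i].src)
    {
      const BitInsert &s = stmts[i];
      // Definitions precede uses, so the walk always terminates.
      assert (s.src < i);
      if (s.lane >= nlanes)
        return std::nullopt;
      // Inserts need not be ordered by lane.  Since we walk from the
      // last insert, the first value seen for a lane is the live one.
      if (!lanes[s.lane])
        {
          lanes[s.lane] = s.value;
          ++filled;
        }
    }
  if (filled != nlanes)
    return std::nullopt;

  std::vector<std::string> ctor;
  for (const auto &l : lanes)
    ctor.push_back (*l);
  return ctor;
}
```

For a chain of four inserts of x_6 into lanes 0..3 that starts at a default def, this returns { x_6, x_6, x_6, x_6 }. If any lane is never written, it returns nothing and the inserts are left alone.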
