https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111697
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> We have quite some code doing vector CTOR stuff in tree-ssa-forwprop.cc and
> this should be optimized to
>
>   v_2 = { x_6(D), x_6(D), x_6(D), x_6(D) };
>
> note SLP vectorization can do this but it fails because it doesn't handle
> a default def insert - it handles a group of BIT_INSERT_EXPRs as
> vector CTOR and SLP discovery doesn't know how to start from external defs
> (it needs actual definition stmts).
>
> A more general approach would be to try to track vector construction through
> symbolic execution like we form bswap in the bswap pass.

You could "steal" the code in vect_slp_check_for_roots,

      else if (code == BIT_INSERT_EXPR
               && VECTOR_TYPE_P (TREE_TYPE (rhs))
               && TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs)).is_constant ()
               && TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs)).to_constant () > 1
               && integer_zerop (gimple_assign_rhs3 (assign))
               && useless_type_conversion_p (TREE_TYPE (TREE_TYPE (rhs)),
                                             TREE_TYPE (gimple_assign_rhs2 (assign)))
               && bb_vinfo->lookup_def (gimple_assign_rhs2 (assign)))
        {
          /* We start to match on insert to lane zero but since the
             inserts need not be ordered we'd have to search both
             the def and the use chains.  */
...

and put it into tree-ssa-forwprop.cc, explicitly creating the vector CTOR.
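
To illustrate the idea of turning a chain of BIT_INSERT_EXPRs into a vector CTOR, here is a minimal standalone sketch. It does not use GCC's IR or any GCC API: BitInsert, collect_ctor and example_splat are hypothetical names invented for this example, and SSA names are modelled as plain strings. As a simplification it walks only the def chain backwards from the last insert, whereas the excerpt above starts from the insert to lane zero and would have to search both the def and the use chains.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for a statement of the form
//   dst = BIT_INSERT_EXPR <src, value, lane>;
// Hypothetical model only; this is not GCC's gimple representation.
struct BitInsert {
    std::string dst;    // vector name defined by this statement
    std::string src;    // vector being inserted into
    std::string value;  // scalar being inserted
    std::size_t lane;   // element index that is overwritten
};

// Walk the def chain backwards from `root` and record which scalar ends up
// in each lane.  Returns the CTOR elements if every lane is overwritten
// before the chain runs out, and std::nullopt otherwise.
std::optional<std::vector<std::string>>
collect_ctor(const std::vector<BitInsert>& stmts,
             const std::string& root, std::size_t nlanes)
{
    std::vector<std::optional<std::string>> lanes(nlanes);
    std::size_t filled = 0;
    std::string cur = root;

    // Bounding the walk by the statement count guards against malformed
    // (cyclic) input; a real SSA chain cannot cycle.
    for (std::size_t step = 0; step < stmts.size() && filled < nlanes; ++step) {
        const BitInsert* def = nullptr;
        for (const BitInsert& s : stmts)
            if (s.dst == cur) { def = &s; break; }
        if (def == nullptr || def->lane >= nlanes)
            return std::nullopt;  // reached a name with no insert defining it

        // Walking backwards, the first insert seen for a lane is the last
        // one executed, so earlier writes to the same lane are dead.
        if (!lanes[def->lane]) {
            lanes[def->lane] = def->value;
            ++filled;
        }
        cur = def->src;
    }
    if (filled < nlanes)
        return std::nullopt;

    std::vector<std::string> ctor;
    for (const auto& lane : lanes)
        ctor.push_back(*lane);
    return ctor;
}

// Four unordered inserts of x_6(D) starting from a default def v_1(D).
// The intermediate names v_7, v_8 and v_9 are made up for the example.
std::optional<std::vector<std::string>> example_splat()
{
    const std::vector<BitInsert> stmts = {
        {"v_7", "v_1(D)", "x_6(D)", 2},
        {"v_8", "v_7",    "x_6(D)", 0},
        {"v_9", "v_8",    "x_6(D)", 3},
        {"v_2", "v_9",    "x_6(D)", 1},
    };
    return collect_ctor(stmts, "v_2", 4);
}
```

In this model the default def v_1(D) never needs a defining statement: all four lanes are overwritten before the walk reaches it, so the result corresponds to v_2 = { x_6(D), x_6(D), x_6(D), x_6(D) }. If any lane is left unwritten the sketch simply gives up rather than building a partial CTOR.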