On Fri, 1 Nov 2019, Joel Hutton wrote: > On 30/10/2019 13:49, Richard Biener wrote: > >> > >> * expr.c (store_constructor): Add case for constructor of > vectors. > > Why do you need this? The vectorizer already creates such CTORs. Any > > testcase that you can show? > > > > typedef long v2di __attribute__((vector_size(16))); > > v2di v; > > void > > foo() > > { > > v = (v2di){v[1], v[0]}; > > } > There is a special case for single element vectors, where the vector > mode and > the element mode are the same. > I've changed this slightly, as your solution of using VECTOR_MODE_P didn't > work for my case where: > emode = E_DImode > eltmode = E_DImode > On aarch64. As E_DImode is not a vector mode. > Based on some checks I found in verify_gimple, I set the type of the > replacement > constructor to the same as the original constructor, as verify_gimple > seems to > expect flattened types. i.e. a vector of vectors of ints should have > type vector > of ints.
Huh. On aarch64 for the above testcase I see mode V2DImode and emode = eltmode = DImode. That's just a regular CTOR with non-vector elements so not sure why you need any adjustment at all here? It looks like your vectorization of the CTOR introduces a V2DImode CTOR of vector(1) long elements which incidentially have DImode. That's because somehow V2DI vectorization isn't profitable but V1DI one is. In the end it's a noop transform but the testcase shows that for V2DI vectorization we fail to cost the CTOR in the scalar cost computation (you have to include the pattern root there I guess). Technically we feed valid GIMPLE to the expander so we indeed shouldn't ICE. So I'd like to have the earlier keying on vec_vec_init_p match the later assert we run into, thus sth like Index: gcc/expr.c =================================================================== --- gcc/expr.c (revision 277765) +++ gcc/expr.c (working copy) @@ -6809,6 +6809,7 @@ store_constructor (tree exp, rtx target, && n_elts.is_constant (&const_n_elts)) { machine_mode emode = eltmode; + bool vector_typed_elts_p = false; if (CONSTRUCTOR_NELTS (exp) && (TREE_CODE (TREE_TYPE (CONSTRUCTOR_ELT (exp, 0)->value)) @@ -6819,13 +6820,14 @@ store_constructor (tree exp, rtx target, * TYPE_VECTOR_SUBPARTS (etype), n_elts)); emode = TYPE_MODE (etype); + vector_typed_elts_p = true; } icode = convert_optab_handler (vec_init_optab, mode, emode); if (icode != CODE_FOR_nothing) { unsigned int n = const_n_elts; - if (emode != eltmode) + if (vector_typed_elts_p) { n = CONSTRUCTOR_NELTS (exp); vec_vec_init_p = true; > > > What could be done is "compact" the > > operands to vectorize to only contain SSA names, try to SLP > > match and vectorize them and then combine the vectorized result > > with the constant elements. > > > > Not worth doing I guess unless it's sth regular like > > > > { a, b, c, d, 0, 0, 0, 0 } > > > > or so. But this can be handled as followup if necessary. > > > I agree, it should be possible to support this in future patches. > > > > + SLP_INSTANCE_ROOT_STMT (new_instance) = constructor ? > > stmt_info->stmt\ > > + : NULL; > > > > SLP_INSTANCE_ROOT_STMT should be a stmt_vec_info I guess... > > > > @@ -2801,6 +2830,10 @@ vect_bb_slp_scalar_cost (basic_block bb, > > stmt_vec_info use_stmt_info = vinfo->lookup_stmt > > (use_stmt); > > if (!use_stmt_info || !PURE_SLP_STMT (use_stmt_info)) > > { > > + /* Check this is not a constructor that will be > > vectorized > > + away. */ > > + if (BB_VINFO_GROUPED_STORES (vinfo).contains > > (use_stmt_info)) > > + continue; > > (*life)[i] = true; > > > > ... which you then simply mark as PURE_SLP_STMT where we call > > vect_mark_slp_stmts in vect_slp_analyze_bb_1. > Done. > > > > I see you have the TYPE_VECTOR_SUBPARTS check still at transform > > stage in vectorize_slp_instance_root_stmt, please simply move > > it to vect_slp_check_for_constructors or to vect_analyze_slp_instance > > where you have the other rejects (non-SSA names in the ctor). > > > Done. > > > > > > +bool > > +vectorize_slp_instance_root_stmt (slp_tree node, slp_instance instance) > > +{ > > + > > > > extra newline > Fixed. > > > > + /* For vector constructors, the constructor stmt that the SLP tree is > > built > > + from, NULL otherwise. */ > > + gimple *root_stmt; > > > > as said, make this a stmt_vec_info > Done. > > > > + if (SLP_TREE_NUMBER_OF_VEC_STMTS (node) == 1) > > > > the stmt replacing can be commonized between == 1 and > 1 cases. > Done. > > > > + > > + /* Vectorize the instance root. */ > > + if (instance->root == node && SLP_INSTANCE_ROOT_STMT (instance) > > + && SLP_TREE_VEC_STMTS (node).exists ()) > > + if (!vectorize_slp_instance_root_stmt (node, instance)) > > + { > > > > instance->root == node is always true. Likewise > > SLP_TREE_VEC_STMTS (node).exists (). > Done. > > > > @@ -4061,15 +4201,42 @@ vect_schedule_slp (vec_info *vinfo) > > if (is_a <loop_vec_info> (vinfo)) > > > > the ChangeLog doesn't mention this. I guess this isn't necessary > > unless you fail vectorizing late which you shouldn't. > > > It's necessary to avoid: > > removing stmts twice for constructors of the form {_1,_1,_1,_1} > removing stmts that define ssa names used elsewhere (which > previously wasn't a consideration because the scalar_stmts were stores > to memory, not assignments to ssa names) OK, but the code you are patching is just supposed to remove stores from the final scalar stmt seeds - it doesn't expect any loads there which is probably what happens. I'd instead add a /* Do not CSE the stmts feeding a CTOR, they might have uses outside of the vectorized stmts. */ if (SLP_INSTANCE_ROOT_STMT (instance)) continue; before the loop over SLP_TREE_SCALAR_STMTS. > Updated changelog (and patch) + if (!subparts.is_constant () || !(subparts.to_constant () + == CONSTRUCTOR_NELTS (rhs))) + continue; can be better written as if (maybe_ne (subparts, CONSTRUCTOR_NELTS (rhs))) +/* Vectorize the instance root. Return success or failure. */ + +void since it's now void the comment has to be adjusted. + vect_schedule_slp_instance (node, instance, bst_map); this now fits in one line. OK with those changes. Thanks, Richard. > 2019-10-31 Joel Hutton <joel.hut...@arm.com> > > * expr.c (store_constructor): Modify to handle single element > vectors. > * tree-vect-slp.c (vect_analyze_slp_instance): Add case for > vector constructors. > (vect_slp_check_for_constructors): New function. > (vect_slp_analyze_bb_1): Call new function to check for vector > constructors. > (vectorize_slp_instance_root_stmt): New function. > (vect_schedule_slp): Call new function to vectorize root stmt > of vector constructors. > * tree-vectorizer.h (SLP_INSTANCE_ROOT_STMT): New field. > > gcc/testsuite/ChangeLog: > > 2019-10-31 Joel Hutton <joel.hut...@arm.com> > > * gcc.dg/vect/bb-slp-40.c: New test. > * gcc.dg/vect/bb-slp-41.c: New test. > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)