On 23 February 2016 at 17:31, Richard Biener <[email protected]> wrote: > On Tue, 23 Feb 2016, Prathamesh Kulkarni wrote: > >> On 22 February 2016 at 17:36, Richard Biener <[email protected]> wrote: >> > On Mon, 22 Feb 2016, Prathamesh Kulkarni wrote: >> > >> >> Hi Richard, >> >> As discussed in private mail, this version of patch attempts to >> >> increase alignment >> >> of global struct decl if it contains an an array field(s) and array's >> >> offset is a multiple of the alignment of vector type corresponding to >> >> it's scalar type and recursively checks for nested structs. >> >> eg: >> >> static struct >> >> { >> >> int a, b, c, d; >> >> int k[4]; >> >> float f[10]; >> >> }; >> >> k is a candidate array since it's offset is 16 and alignment of >> >> "vector (4) int" is 8. >> >> Similarly for f. >> >> >> >> I haven't been able to create a test-case where there are >> >> multiple candidate arrays and vector alignment of arrays are different. >> >> I suppose in this case we will have to increase alignment >> >> of the struct by the max alignment ? >> >> eg: >> >> static struct >> >> { >> >> <fields> >> >> T1 k[S1] >> >> <fields> >> >> T2 f[S2] >> >> <fields> >> >> }; >> >> >> >> if V1 is vector type corresponding to T1, and V2 corresponding vector >> >> type to T2, >> >> offset (k) % align(V1) == 0 and offset (f) % align(V2) == 0 >> >> and align (V1) > align(V2) then we will increase alignment of struct >> >> by align(V1). >> >> >> >> Testing showed FAIL for g++.dg/torture/pr31863.C due to program timeout. >> >> Initially it appeared to me, it went in infinite loop. However >> >> on second thoughts I think it's probably not an infinite loop, rather >> >> taking (extraordinarily) large amount of time >> >> to compile the test-case with the patch. >> >> The test-case builds quickly for only 2 instantiations of ClassSpec >> >> (ClassSpec<Key, A001, 1>, >> >> ClassSpec<Key, A002, 2>) >> >> Building with 22 instantiations (upto ClassSpec<Key, A0023, 22>) takes up >> >> to ~1m to compile. >> >> with: >> >> 23 instantiations: ~2m >> >> 24 instantiations: ~5m >> >> For 30 instantiations I terminated cc1plus after 13m (by SIGKILL). >> >> >> >> I guess it shouldn't go in an infinite loop because: >> >> a) structs cannot have circular references. >> >> b) works for lower number of instantiations >> >> However I have no sound evidence that it cannot be in infinite loop. >> >> I don't understand why a decl node is getting visited more than once >> >> for that test-case. >> >> >> >> Using a hash_map to store alignments of decl's so that decl node gets >> >> visited >> >> only once prevents the issue. >> > >> > Maybe aliases. Try not walking vnode->alias == true vars. >> Hi, >> I have a hypothesis why decl node gets visited multiple times. >> >> Consider the test-case: >> template <typename T, unsigned N> >> struct X >> { >> T a; >> virtual int foo() { return N; } >> }; >> >> typedef struct X<int, 1> x_1; >> typedef struct X<int ,2> x_2; >> >> static x_1 obj1 __attribute__((used)); >> static x_2 obj2 __attribute__((used)); >> >> Two additional structs are created by C++FE, c++filt shows: >> _ZTI1XIiLj1EE -> typeinfo for X<int, 1u> >> _ZTI1XIiLj2EE -> typeinfo for X<int, 2u> >> >> Both of these structs have only one field D.2991 and it appears it's >> *shared* between them: >> struct D.2991; >> const void * D.2980; >> const char * D.2981; >> >> Hence the decl node D.2991 and it's fields (D.2890, D.2981) get visited >> twice: >> once when walking _ZTI1XIiLj1EE and 2nd time when walking _ZTI1XIiLj2EE >> >> Dump of walking over the global structs for above test-case: >> http://pastebin.com/R5SABW0c >> >> So it appears to me to me a DAG (interior node == struct decl, leaf == >> non-struct field, >> edge from node1 -> node2 if node2 is field of node1) is getting >> created when struct decl >> is a type-info object. >> >> I am not really clear on how we should proceed: >> a) Keep using hash_map to store alignments so that every decl gets >> visited only once. >> b) Skip walking artificial record decls. >> I am not sure if skipping all artificial struct decls would be a good >> idea, but I don't >> think it's possible to identify if a struct-decl is typeinfo struct at >> middle-end ? > > You shouldn't end up walking those when walking the type of > global decls. That is, don't walk typeinfo decls - yes, practically > that means just not walking DECL_ARTIFICIAL things. Hi, I have done the changes in the patch (attached) and cross-tested on arm*-*-* and aarch64*-*-* without failures. Is it OK for stage-1 ?
Thanks, Prathamesh > > Richard.
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 2b25b45..a6af535 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -794,6 +794,75 @@ make_pass_slp_vectorize (gcc::context *ctxt)
This should involve global alignment analysis and in the future also
array padding. */
+static unsigned get_vec_alignment_for_decl (tree);
+
+static unsigned
+get_vec_alignment_for_array_decl (tree array_decl)
+{
+ tree type = TREE_TYPE (array_decl);
+ gcc_assert (TREE_CODE (type) == ARRAY_TYPE);
+
+ tree vectype = get_vectype_for_scalar_type (strip_array_types (type));
+ return (vectype) ? TYPE_ALIGN (vectype) : 0;
+}
+
+static unsigned
+get_vec_alignment_for_record_decl (tree record_decl)
+{
+ tree type = TREE_TYPE (record_decl);
+ gcc_assert (TREE_CODE (type) == RECORD_TYPE);
+ unsigned max_align = 0, alignment;
+ HOST_WIDE_INT offset;
+
+ if (DECL_ARTIFICIAL (record_decl) || TYPE_PACKED (type))
+ return 0;
+
+ for (tree field = first_field (type);
+ field != NULL_TREE;
+ field = DECL_CHAIN (field))
+ {
+ /* C++FE puts node "._0" of code TYPE_DECL. skip that. */
+ if (TREE_CODE (field) != FIELD_DECL)
+ continue;
+
+ offset = int_byte_position (field);
+ alignment = get_vec_alignment_for_decl (field);
+ if (alignment
+ && (offset % (alignment / BITS_PER_UNIT) == 0)
+ && (alignment > max_align))
+ max_align = alignment;
+ }
+
+ return max_align;
+}
+
+static unsigned
+get_vec_alignment_for_decl (tree decl)
+{
+ if (decl == NULL_TREE)
+ return 0;
+
+ gcc_assert (DECL_P (decl));
+
+ static unsigned alignment = 0;
+ tree type = TREE_TYPE (decl);
+
+ switch (TREE_CODE (type))
+ {
+ case ARRAY_TYPE:
+ alignment = get_vec_alignment_for_array_decl (decl);
+ break;
+ case RECORD_TYPE:
+ alignment = get_vec_alignment_for_record_decl (decl);
+ break;
+ default:
+ alignment = 0;
+ break;
+ }
+
+ return (alignment > DECL_ALIGN (decl)) ? alignment : 0;
+}
+
static unsigned int
increase_alignment (void)
{
@@ -804,23 +873,14 @@ increase_alignment (void)
/* Increase the alignment of all global arrays for vectorization. */
FOR_EACH_DEFINED_VARIABLE (vnode)
{
- tree vectype, decl = vnode->decl;
- tree t;
+ tree decl = vnode->decl;
unsigned int alignment;
- t = TREE_TYPE (decl);
- if (TREE_CODE (t) != ARRAY_TYPE)
- continue;
- vectype = get_vectype_for_scalar_type (strip_array_types (t));
- if (!vectype)
- continue;
- alignment = TYPE_ALIGN (vectype);
- if (DECL_ALIGN (decl) >= alignment)
- continue;
+ alignment = get_vec_alignment_for_decl (decl);
- if (vect_can_force_dr_alignment_p (decl, alignment))
+ if (alignment && vect_can_force_dr_alignment_p (decl, alignment))
{
- vnode->increase_alignment (TYPE_ALIGN (vectype));
+ vnode->increase_alignment (alignment);
dump_printf (MSG_NOTE, "Increasing alignment of decl: ");
dump_generic_expr (MSG_NOTE, TDF_SLIM, decl);
dump_printf (MSG_NOTE, "\n");
ChangeLog
Description: Binary data
