On Sun, Jul 8, 2018 at 4:56 AM Martin Sebor <mse...@gmail.com> wrote: > > On 07/06/2018 09:52 AM, Richard Biener wrote: > > On Fri, Jul 6, 2018 at 1:54 AM Martin Sebor <mse...@gmail.com> wrote: > >> > >> GCC folds accesses to members of constant aggregates except > >> for character arrays/strings. For example, the strlen() call > >> below is not folded: > >> > >> const char a[][4] = { "1", "12" }; > >> > >> int f (void) { retturn strlen (a[1]); } > >> > >> The attached change set enhances the string_constant() function > >> to make it possible to extract string constants from aggregate > >> initializers (CONSTRUCTORS). > >> > >> The initial solution was much simpler but as is often the case, > >> MEM_REF made it fail to fold things like: > >> > >> int f (void) { retturn strlen (a[1] + 1); } > >> > >> Handling those made the project a bit more interesting and > >> the final solution somewhat more involved. > >> > >> To handle offsets into aggregate string members the patch also > >> extends the fold_ctor_reference() function to extract entire > >> string array initializers even if the offset points past > >> the beginning of the string and even though the size and > >> exact type of the reference are not known (there isn't enough > >> information in a MEM_REF to determine that). > >> > >> Tested along with the patch for PR 86415 on x86_64-linux. > > > > + if (TREE_CODE (init) == CONSTRUCTOR) > > + { > > + tree type; > > + if (TREE_CODE (arg) == ARRAY_REF > > + || TREE_CODE (arg) == MEM_REF) > > + type = TREE_TYPE (arg); > > + else if (TREE_CODE (arg) == COMPONENT_REF) > > + { > > + tree field = TREE_OPERAND (arg, 1); > > + type = TREE_TYPE (field); > > + } > > + else > > + return NULL_TREE; > > > > what's wrong with just > > > > type = TREE_TYPE (field); > > In response to your comment below abut size I simplified things > further so determining the type a priori is no longer necessary. > > > ? > > > > + base_off *= BITS_PER_UNIT; > > > > poly_uint64 isn't enough for "bits", with wide-int you'd use offset_int, > > for poly you'd then use poly_offset? > > Okay, I tried to avoid the overflow. (Converting between all > these flavors of wide int types is a monumental PITA.) > > > > > You extend fold_ctor_reference to treat size == 0 specially but then > > bother to compute a size here - that looks unneeded? > > Yes, well spotted, thanks! I simplified the code so this isn't > necessary, and neither is the type. > > > > > While the offset of the reference determines the first field in the > > CONSTRUCTOR, how do you know the access doesn't touch > > adjacent ones? STRING_CSTs do not have to be '\0' terminated, > > so consider > > > > char x[2][4] = { "abcd", "abcd" }; > > > > and MEM[&x] with a char[8] type? memcpy "inlining" will create > > such MEMs for example. > > The code is only used to find string constants in initializer > expressions where I don't think the size of the access comes > into play. If a memcpy() call results in a MEM_REF[char[8], > &x, 8] that's fine. It's a valid reference and we can still > get the underlying character sequence (which is represented > as two STRING_CSTs with the two string literals). I might > be missing the point of your question.
Maybe irrelevant for strlen folding depending on what you do for missing '\0' termination. > > > > @@ -6554,8 +6577,16 @@ fold_nonarray_ctor_reference (tree type, tree ctor, > > tree byte_offset = DECL_FIELD_OFFSET (cfield); > > tree field_offset = DECL_FIELD_BIT_OFFSET (cfield); > > tree field_size = DECL_SIZE (cfield); > > - offset_int bitoffset; > > - offset_int bitoffset_end, access_end; > > + > > + if (!field_size && TREE_CODE (cval) == STRING_CST) > > + { > > + /* Determine the size of the flexible array member from > > + the size of the string initializer provided for it. */ > > + unsigned HOST_WIDE_INT len = TREE_STRING_LENGTH (cval); > > + tree eltype = TREE_TYPE (TREE_TYPE (cval)); > > + len *= tree_to_uhwi (TYPE_SIZE (eltype)); > > + field_size = build_int_cst (size_type_node, len); > > + } > > > > Why does this only apply to STRING_CST initializers and not CONSTRUCTORS, > > say, for > > > > struct S { int i; int a[]; } s = { 1, { 2, 3, 4, 5, 6 } }; > > I can't think of a use for it. Do you have something in mind? Well, you basically implemented a get-CONSTRUCTOR-elt-at-offset which is useful in other parts of the compiler. So I don't see why it shouldn't work for general flex-arrays. > > > > ? And why not use simply > > > > field_size = TYPE_SIZE (TREE_TYPE (cval)); > > > > like you do in c_strlen? > > Yes, that's simpler, thanks. > > > > > Otherwise looks reasonable. > > Attached is an updated patch. I also enhanced the handling > of non-constant indices. They were already handled before > to a smaller extent. (There may be other opportunities > here.) Please don't do functional changes to a patch in review, without exactly pointing out the change. It makes review inefficent for me. Looks like it might be the NULL type argument handling? + + if (!field_size && TREE_CODE (cval) == STRING_CST) + { + /* Determine the size of the flexible array member from + the size of the string initializer provided for it. */ + /* unsigned HOST_WIDE_INT len = TREE_STRING_LENGTH (cval); */ + /* tree eltype = TREE_TYPE (TREE_TYPE (cval)); */ + /* len *= tree_to_uhwi (TYPE_SIZE (eltype)); */ + /* field_size = build_int_cst (size_type_node, len); */ + field_size = TYPE_SIZE (TREE_TYPE (cval)); why all the commented code? OK with the comments removed and TREE_CODE (cval) == CONSTRUCTOR handled at that point. Thanks, Richard. > > Martin