On Wed, Apr 26, 2017 at 12:27 AM, Steve Ellcey <sell...@cavium.com> wrote: > This patch changes get_ref_base_and_extent to treat zero sized arrays > like C99 flexible arrays and assume that references to zero sized > arrays can also be made beyond the 'end' of the array. C99 flexible > arrays are recognized by not having an INTEGER_CST limit/size and the > routine then sets maxsize to -1 for special handling. This patch > checks for a value of 0 when we do have an INTEGER_CST limit/size and > handles it like a flexible array. > > Tested with a bootstrap on aarch64 and a GCC test run with no > regressions. > > I did not create a testcase because the test I have (attached) is not > runnable. If you compile this test case for aarch64 with > '-UFLEX -O2 -fno-strict-aliasing' you will get two loads, then two > stores in the main loop. In the original large program that this came > from, that generated bad code. If compiled with -DFLEX instead of > -UFLEX, you get load, store, load, store and that was working. With > this patch, both versions (-UFLEX and -DFLEX) generate the load, store, > load, store sequence. > > OK for checkin?
Huh. This doesn't look like the correct fix for anything. Ah, so the issue is that we prune MEM_EXPR of MEMs off ARRAY_REFs and end up with a MEM_EXPR of a.0_1->o which has zero size. Then we run into /* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR is conservative, so trust it. */ if (!MEM_OFFSET_KNOWN_P (mem) || !MEM_SIZE_KNOWN_P (mem)) return true; but MEM_SIZE_KNOWN_P is true so we miss to do /* Refine size and offset we got from analyzing MEM_EXPR by using MEM_SIZE and MEM_OFFSET. */ ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT; ref->size = MEM_SIZE (mem) * BITS_PER_UNIT; thus a nearly minimal fix would be (looks like some other sanity checks might better be moved before the early "trust it" out): Index: gcc/alias.c =================================================================== --- gcc/alias.c (revision 247273) +++ gcc/alias.c (working copy) @@ -322,6 +322,13 @@ ao_ref_from_mem (ao_ref *ref, const_rtx ref->ref_alias_set = MEM_ALIAS_SET (mem); + /* Refine size and offset we got from analyzing MEM_EXPR by using + MEM_SIZE and MEM_OFFSET. */ + if (MEM_OFFSET_KNOWN_P (mem)) + ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT; + if (MEM_SIZE_KNOWN_P (mem)) + ref->size = MEM_SIZE (mem) * BITS_PER_UNIT; + /* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR is conservative, so trust it. */ if (!MEM_OFFSET_KNOWN_P (mem) @@ -336,12 +343,6 @@ ao_ref_from_mem (ao_ref *ref, const_rtx > ref->max_size))) ref->ref = NULL_TREE; - /* Refine size and offset we got from analyzing MEM_EXPR by using - MEM_SIZE and MEM_OFFSET. */ - - ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT; - ref->size = MEM_SIZE (mem) * BITS_PER_UNIT; - /* The MEM may extend into adjacent fields, so adjust max_size if necessary. */ if (ref->max_size != -1 note that in the end we might want to stop pruning MEM_EXPR so much... (set_mem_attributes_minus_bitpos). At least dropping component/array refs in this case isn't conservative as the size zero access doesn't cover the array ref stripped. So arguably the bug is in set_mem_attributes_minus_bitpos itself. Thus Index: gcc/emit-rtl.c =================================================================== --- gcc/emit-rtl.c (revision 247273) +++ gcc/emit-rtl.c (working copy) @@ -1954,7 +1954,9 @@ set_mem_attributes_minus_bitpos (rtx ref while (TREE_CODE (t2) == ARRAY_REF); if (DECL_P (t2) - || TREE_CODE (t2) == COMPONENT_REF) + || (TREE_CODE (t2) == COMPONENT_REF + && (! DECL_SIZE (TREE_OPERAND (t2, 1)) + || ! integer_zerop (DECL_SIZE (TREE_OPERAND (t2, 1)))))) { attrs.expr = t2; attrs.offset_known_p = false; is an alternative / additional fix (OTOH for all trailing arrays this isn't really a conservative MEM_EXPR, not just for size zero ones). Thus Index: gcc/emit-rtl.c =================================================================== --- gcc/emit-rtl.c (revision 247273) +++ gcc/emit-rtl.c (working copy) @@ -1954,7 +1954,10 @@ set_mem_attributes_minus_bitpos (rtx ref while (TREE_CODE (t2) == ARRAY_REF); if (DECL_P (t2) - || TREE_CODE (t2) == COMPONENT_REF) + || (TREE_CODE (t2) == COMPONENT_REF + /* For trailing arrays t2 doesn't have a size that + covers all valid accesses. */ + && ! array_at_struct_end_p (t, false))) { attrs.expr = t2; attrs.offset_known_p = false; is probably better. Richard. > Steve Ellcey > sell...@cavium.com > > > 2017-04-25 Steve Ellcey <sell...@cavium.com> > > * tree-dfa.c (get_ref_base_and_extent): Treat zero size array like > a C99 flexible array. >