Re: [PATCH] Alias analysis of zero sized arrays

2017-04-26 Thread Steve Ellcey
On Wed, 2017-04-26 at 20:52 +0200, Richard Biener wrote:
> 
> I'll throw it on x86 testing and will commit it.  What branches are
> affected?  Can you open a bugreport?
> 
> I'll try to add a runtime testcase as well if you don't beat me to
> that.  On x86 it likely requires -fschedule-insns to trigger.
> 
> Richard.

I created bug 80533.  The problem was found on the gcc-5 branch and is
also on the gcc-6 and gcc-7 branches.

Steve Ellcey


Re: [PATCH] Alias analysis of zero sized arrays

2017-04-26 Thread Richard Biener
On April 26, 2017 8:03:44 PM GMT+02:00, Steve Ellcey  wrote:
>On Wed, 2017-04-26 at 11:33 +0200, Richard Biener wrote:
>> > 
>> > Index: gcc/emit-rtl.c
>> > ===
>> > --- gcc/emit-rtl.c  (revision 247273)
>> > +++ gcc/emit-rtl.c  (working copy)
>> > @@ -1954,7 +1954,10 @@ set_mem_attributes_minus_bitpos (rtx ref
>> >   while (TREE_CODE (t2) == ARRAY_REF);
>> > 
>> >   if (DECL_P (t2)
>> > - || TREE_CODE (t2) == COMPONENT_REF)
>> > + || (TREE_CODE (t2) == COMPONENT_REF
>> > + /* For trailing arrays t2 doesn't have a size
>that
>> > +covers all valid accesses.  */
>> > + && ! array_at_struct_end_p (t, false)))
>> > {
>> >   attrs.expr = t2;
>> >   attrs.offset_known_p = false;
>> > 
>> > is probably better.
>> > 
>> > Richard.
>
>I tested this patch and it fixed my problem.  It does seem better than
>my patch because now the [0] and [] versions of my test generate
>exactly the same code.  With my patch they still generated different
>code though both versions worked OK.
>
>I did a full bootstrap and gcc testsuite run on aarch64 with no
>regressions.  Will you check this in or do you want me to do a
>submission and checkin?

I'll throw it on x86 testing and will commit it.  What branches are affected?  
Can you open a bugreport?

I'll try to add a runtime testcase as well if you don't beat me to that.  On 
x86 it likely requires -fschedule-insns to trigger.

Richard.

>Steve Ellcey
>sell...@cavium.com



Re: [PATCH] Alias analysis of zero sized arrays

2017-04-26 Thread Steve Ellcey
On Wed, 2017-04-26 at 11:33 +0200, Richard Biener wrote:
> > 
> > Index: gcc/emit-rtl.c
> > ===
> > --- gcc/emit-rtl.c  (revision 247273)
> > +++ gcc/emit-rtl.c  (working copy)
> > @@ -1954,7 +1954,10 @@ set_mem_attributes_minus_bitpos (rtx ref
> >   while (TREE_CODE (t2) == ARRAY_REF);
> > 
> >   if (DECL_P (t2)
> > - || TREE_CODE (t2) == COMPONENT_REF)
> > + || (TREE_CODE (t2) == COMPONENT_REF
> > + /* For trailing arrays t2 doesn't have a size that
> > +covers all valid accesses.  */
> > + && ! array_at_struct_end_p (t, false)))
> > {
> >   attrs.expr = t2;
> >   attrs.offset_known_p = false;
> > 
> > is probably better.
> > 
> > Richard.

I tested this patch and it fixed my problem.  It does seem better than
my patch because now the [0] and [] versions of my test generate
exactly the same code.  With my patch they still generated different
code though both versions worked OK.

I did a full bootstrap and gcc testsuite run on aarch64 with no
regressions.  Will you check this in or do you want me to do a
submission and checkin?

Steve Ellcey
sell...@cavium.com


Re: [PATCH] Alias analysis of zero sized arrays

2017-04-26 Thread Richard Biener
On Wed, Apr 26, 2017 at 11:27 AM, Richard Biener
 wrote:
> On Wed, Apr 26, 2017 at 12:27 AM, Steve Ellcey  wrote:
>> This patch changes get_ref_base_and_extent to treat zero sized arrays
>> like C99 flexible arrays and assume that references to zero sized
>> arrays can also be made beyond the 'end' of the array.  C99 flexible
>> arrays are recognized by not having an INTEGER_CST limit/size and the
>> routine then sets maxsize to -1 for special handling.  This patch
>> checks for a value of 0 when we do have an INTEGER_CST limit/size and
>> handles it like a flexible array.
>>
>> Tested with a bootstrap on aarch64 and a GCC test run with no
>> regressions.
>>
>> I did not create a testcase because the test I have (attached) is not
>> runnable.  If you compile this test case for aarch64 with
>> '-UFLEX -O2 -fno-strict-aliasing' you will get two loads, then two
>> stores in the main loop.  In the original large program that this came
>> from, that generated bad code.  If compiled with -DFLEX instead of
>> -UFLEX, you get load, store, load, store and that was working.  With
>> this patch, both versions (-UFLEX and -DFLEX) generate the load, store,
>> load, store sequence.
>>
>> OK for checkin?
>
> Huh.  This doesn't look like the correct fix for anything.
>
> Ah, so the issue is that we prune MEM_EXPR of MEMs off ARRAY_REFs
> and end up with a MEM_EXPR of a.0_1->o which has zero size.  Then
> we run into
>
>   /* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR
>  is conservative, so trust it.  */
>   if (!MEM_OFFSET_KNOWN_P (mem)
>   || !MEM_SIZE_KNOWN_P (mem))
> return true;
>
> but MEM_SIZE_KNOWN_P is true so we miss to do
>
>   /* Refine size and offset we got from analyzing MEM_EXPR by using
>  MEM_SIZE and MEM_OFFSET.  */
>
>   ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
>   ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
>
> thus a nearly minimal fix would be (looks like some other sanity checks
> might better be moved before the early "trust it" out):
>
> Index: gcc/alias.c
> ===
> --- gcc/alias.c (revision 247273)
> +++ gcc/alias.c (working copy)
> @@ -322,6 +322,13 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
>
>ref->ref_alias_set = MEM_ALIAS_SET (mem);
>
> +  /* Refine size and offset we got from analyzing MEM_EXPR by using
> + MEM_SIZE and MEM_OFFSET.  */
> +  if (MEM_OFFSET_KNOWN_P (mem))
> +ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
> +  if (MEM_SIZE_KNOWN_P (mem))
> +ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
> +
>/* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR
>   is conservative, so trust it.  */
>if (!MEM_OFFSET_KNOWN_P (mem)
> @@ -336,12 +343,6 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
>   > ref->max_size)))
>  ref->ref = NULL_TREE;
>
> -  /* Refine size and offset we got from analyzing MEM_EXPR by using
> - MEM_SIZE and MEM_OFFSET.  */
> -
> -  ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
> -  ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
> -
>/* The MEM may extend into adjacent fields, so adjust max_size if
>   necessary.  */
>if (ref->max_size != -1

Actually this change is wrong!  Leaves the one below as the fix.

> note that in the end we might want to stop pruning MEM_EXPR so much...
> (set_mem_attributes_minus_bitpos).  At least dropping component/array
> refs in this case isn't conservative as the size zero access doesn't cover
> the array ref stripped.  So arguably the bug is in
> set_mem_attributes_minus_bitpos
> itself.  Thus
>
> Index: gcc/emit-rtl.c
> ===
> --- gcc/emit-rtl.c  (revision 247273)
> +++ gcc/emit-rtl.c  (working copy)
> @@ -1954,7 +1954,9 @@ set_mem_attributes_minus_bitpos (rtx ref
>   while (TREE_CODE (t2) == ARRAY_REF);
>
>   if (DECL_P (t2)
> - || TREE_CODE (t2) == COMPONENT_REF)
> + || (TREE_CODE (t2) == COMPONENT_REF
> + && (! DECL_SIZE (TREE_OPERAND (t2, 1))
> + || ! integer_zerop (DECL_SIZE (TREE_OPERAND (t2, 1))
> {
>   attrs.expr = t2;
>   attrs.offset_known_p = false;
>
> is an alternative / additional fix (OTOH for all trailing arrays this
> isn't really a
> conservative MEM_EXPR, not just for size zero ones).  Thus
>
> Index: gcc/emit-rtl.c
> ===
> --- gcc/emit-rtl.c  (revision 247273)
> +++ gcc/emit-rtl.c  (working copy)
> @@ -1954,7 +1954,10 @@ set_mem_attributes_minus_bitpos (rtx ref
>   while (TREE_CODE (t2) == ARRAY_REF);
>
>   if (DECL_P (t2)
> - || TREE_CODE (t2) == COMPONENT_REF)
> + || (TREE_CODE (t2) == COMPONENT_REF
> + /* For trailing arrays t2 doesn't have a size that
> +covers all 

Re: [PATCH] Alias analysis of zero sized arrays

2017-04-26 Thread Richard Biener
On Wed, Apr 26, 2017 at 12:27 AM, Steve Ellcey  wrote:
> This patch changes get_ref_base_and_extent to treat zero sized arrays
> like C99 flexible arrays and assume that references to zero sized
> arrays can also be made beyond the 'end' of the array.  C99 flexible
> arrays are recognized by not having an INTEGER_CST limit/size and the
> routine then sets maxsize to -1 for special handling.  This patch
> checks for a value of 0 when we do have an INTEGER_CST limit/size and
> handles it like a flexible array.
>
> Tested with a bootstrap on aarch64 and a GCC test run with no
> regressions.
>
> I did not create a testcase because the test I have (attached) is not
> runnable.  If you compile this test case for aarch64 with
> '-UFLEX -O2 -fno-strict-aliasing' you will get two loads, then two
> stores in the main loop.  In the original large program that this came
> from, that generated bad code.  If compiled with -DFLEX instead of
> -UFLEX, you get load, store, load, store and that was working.  With
> this patch, both versions (-UFLEX and -DFLEX) generate the load, store,
> load, store sequence.
>
> OK for checkin?

Huh.  This doesn't look like the correct fix for anything.

Ah, so the issue is that we prune MEM_EXPR of MEMs off ARRAY_REFs
and end up with a MEM_EXPR of a.0_1->o which has zero size.  Then
we run into

  /* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR
 is conservative, so trust it.  */
  if (!MEM_OFFSET_KNOWN_P (mem)
  || !MEM_SIZE_KNOWN_P (mem))
return true;

but MEM_SIZE_KNOWN_P is true so we miss to do

  /* Refine size and offset we got from analyzing MEM_EXPR by using
 MEM_SIZE and MEM_OFFSET.  */

  ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
  ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;

thus a nearly minimal fix would be (looks like some other sanity checks
might better be moved before the early "trust it" out):

Index: gcc/alias.c
===
--- gcc/alias.c (revision 247273)
+++ gcc/alias.c (working copy)
@@ -322,6 +322,13 @@ ao_ref_from_mem (ao_ref *ref, const_rtx

   ref->ref_alias_set = MEM_ALIAS_SET (mem);

+  /* Refine size and offset we got from analyzing MEM_EXPR by using
+ MEM_SIZE and MEM_OFFSET.  */
+  if (MEM_OFFSET_KNOWN_P (mem))
+ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
+  if (MEM_SIZE_KNOWN_P (mem))
+ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
+
   /* If MEM_OFFSET or MEM_SIZE are unknown what we got from MEM_EXPR
  is conservative, so trust it.  */
   if (!MEM_OFFSET_KNOWN_P (mem)
@@ -336,12 +343,6 @@ ao_ref_from_mem (ao_ref *ref, const_rtx
  > ref->max_size)))
 ref->ref = NULL_TREE;

-  /* Refine size and offset we got from analyzing MEM_EXPR by using
- MEM_SIZE and MEM_OFFSET.  */
-
-  ref->offset += MEM_OFFSET (mem) * BITS_PER_UNIT;
-  ref->size = MEM_SIZE (mem) * BITS_PER_UNIT;
-
   /* The MEM may extend into adjacent fields, so adjust max_size if
  necessary.  */
   if (ref->max_size != -1

note that in the end we might want to stop pruning MEM_EXPR so much...
(set_mem_attributes_minus_bitpos).  At least dropping component/array
refs in this case isn't conservative as the size zero access doesn't cover
the array ref stripped.  So arguably the bug is in
set_mem_attributes_minus_bitpos
itself.  Thus

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c  (revision 247273)
+++ gcc/emit-rtl.c  (working copy)
@@ -1954,7 +1954,9 @@ set_mem_attributes_minus_bitpos (rtx ref
  while (TREE_CODE (t2) == ARRAY_REF);

  if (DECL_P (t2)
- || TREE_CODE (t2) == COMPONENT_REF)
+ || (TREE_CODE (t2) == COMPONENT_REF
+ && (! DECL_SIZE (TREE_OPERAND (t2, 1))
+ || ! integer_zerop (DECL_SIZE (TREE_OPERAND (t2, 1))
{
  attrs.expr = t2;
  attrs.offset_known_p = false;

is an alternative / additional fix (OTOH for all trailing arrays this
isn't really a
conservative MEM_EXPR, not just for size zero ones).  Thus

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c  (revision 247273)
+++ gcc/emit-rtl.c  (working copy)
@@ -1954,7 +1954,10 @@ set_mem_attributes_minus_bitpos (rtx ref
  while (TREE_CODE (t2) == ARRAY_REF);

  if (DECL_P (t2)
- || TREE_CODE (t2) == COMPONENT_REF)
+ || (TREE_CODE (t2) == COMPONENT_REF
+ /* For trailing arrays t2 doesn't have a size that
+covers all valid accesses.  */
+ && ! array_at_struct_end_p (t, false)))
{
  attrs.expr = t2;
  attrs.offset_known_p = false;

is probably better.

Richard.

> Steve Ellcey
> sell...@cavium.com
>
>
> 2017-04-25  Steve Ellcey  
>
> * tree-dfa.c (get_ref_base_and_extent): Treat 

[PATCH] Alias analysis of zero sized arrays

2017-04-25 Thread Steve Ellcey
This patch changes get_ref_base_and_extent to treat zero sized arrays
like C99 flexible arrays and assume that references to zero sized
arrays can also be made beyond the 'end' of the array.  C99 flexible
arrays are recognized by not having an INTEGER_CST limit/size and the
routine then sets maxsize to -1 for special handling.  This patch
checks for a value of 0 when we do have an INTEGER_CST limit/size and
handles it like a flexible array.

Tested with a bootstrap on aarch64 and a GCC test run with no
regressions.

I did not create a testcase because the test I have (attached) is not
runnable.  If you compile this test case for aarch64 with
'-UFLEX -O2 -fno-strict-aliasing' you will get two loads, then two
stores in the main loop.  In the original large program that this came
from, that generated bad code.  If compiled with -DFLEX instead of
-UFLEX, you get load, store, load, store and that was working.  With
this patch, both versions (-UFLEX and -DFLEX) generate the load, store,
load, store sequence.

OK for checkin?

Steve Ellcey
sell...@cavium.com


2017-04-25  Steve Ellcey  

* tree-dfa.c (get_ref_base_and_extent): Treat zero size array like
a C99 flexible array.

diff --git a/gcc/tree-dfa.c b/gcc/tree-dfa.c
index 8ee46dc..79c3489 100644
--- a/gcc/tree-dfa.c
+++ b/gcc/tree-dfa.c
@@ -438,6 +438,10 @@ get_ref_base_and_extent (tree exp, HOST_WIDE_INT *poffset,
   && TREE_CODE (size_tree) == INTEGER_CST)
 bitsize = wi::to_offset (size_tree);
 
+  /* If zero sized array, treat it like a flexible array.  */
+  if (bitsize == 0)
+bitsize = -1;
+
   /* Initially, maxsize is the same as the accessed element size.
  In the following it will only grow (or become -1).  */
   maxsize = bitsize;
@@ -504,7 +508,13 @@ get_ref_base_and_extent (tree exp, HOST_WIDE_INT *poffset,
 		if (maxsize != -1
 		&& csize
 		&& TREE_CODE (csize) == INTEGER_CST)
-		  maxsize = wi::to_offset (csize) - bit_offset;
+		  {
+		maxsize = wi::to_offset (csize) - bit_offset;
+
+		/* If zero sized array, treat it like a flexible array.  */
+		if (maxsize == 0)
+		  maxsize = -1;
+		  }
 		else
 		  maxsize = -1;
 	  }
struct q {
	int b;
};
struct r {
   int n;
   struct q slot[0];
};
struct s {
   int n;
#ifdef FLEX
 long int o[];
#else
 long int o[0];
#endif
};
extern int x, y, m;
extern struct s *a;
extern struct r *b;
extern void bar();
int foo() {
   int i,j;
   for (i = 0; i < m; i++) {
   	a->o[i] = sizeof(*a);
   	b = ((struct r *)(((char *)a) + a->o[a->n]));
	for (j = 0; j < 10; j++) {
		b->slot[j].b = 0;
   	}
bar();
  }
}