[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-08 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #35 from rguenther at suse dot de  
2011-04-08 08:30:26 UTC ---
On Thu, 7 Apr 2011, jakub at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377
> 
> --- Comment #34 from Jakub Jelinek  2011-04-07 
> 16:36:04 UTC ---
> While for 4.7 we can certainly do bigger changes, for 4.6 I think a safe fix
> would be:
> 
> --- gcc/tree-vect-data-refs.c2011-03-31 08:51:03.0 +0200
> +++ gcc/tree-vect-data-refs.c2011-04-07 18:32:37.379420938 +0200
> @@ -1110,6 +1110,9 @@ vector_alignment_reachable_p (struct dat
>if (ba)
>  is_packed = contains_packed_reference (ba);
> 
> +  if (compare_tree_int (TYPE_SIZE (type), TYPE_ALIGN (type)) > 0)
> +is_packed = true;
> +
>if (vect_print_dump_info (REPORT_DETAILS))
>  fprintf (vect_dump, "Unknown misalignment, is_packed = %d",is_packed);
>if (targetm.vectorize.vector_alignment_reachable (type, is_packed))
> 
> For scalars GCC ignores packed attribute, so aligned(1) is the only way to
> express unaligned accesses.

Ok if it passes testing.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #34 from Jakub Jelinek  2011-04-07 
16:36:04 UTC ---
While for 4.7 we can certainly do bigger changes, for 4.6 I think a safe fix
would be:

--- gcc/tree-vect-data-refs.c2011-03-31 08:51:03.0 +0200
+++ gcc/tree-vect-data-refs.c2011-04-07 18:32:37.379420938 +0200
@@ -1110,6 +1110,9 @@ vector_alignment_reachable_p (struct dat
   if (ba)
 is_packed = contains_packed_reference (ba);

+  if (compare_tree_int (TYPE_SIZE (type), TYPE_ALIGN (type)) > 0)
+is_packed = true;
+
   if (vect_print_dump_info (REPORT_DETAILS))
 fprintf (vect_dump, "Unknown misalignment, is_packed = %d",is_packed);
   if (targetm.vectorize.vector_alignment_reachable (type, is_packed))

For scalars GCC ignores packed attribute, so aligned(1) is the only way to
express unaligned accesses.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #33 from rguenther at suse dot de  
2011-04-07 13:57:09 UTC ---
On Thu, 7 Apr 2011, jakub at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377
> 
> --- Comment #32 from Jakub Jelinek  2011-04-07 
> 13:43:47 UTC ---
> Given that without vectorization that code surely works well on all the weird
> ABIs, such ADJUST_FIELD_ALIGN is used only on targets that either aren't 
> strict
> alignment targets, or aren't strict alignment targets for the particular 
> types.
> You know, for most CPUs strict alignment isn't a binary thing, many targets
> including i?86/x86_64 are only partially strict alignment targets.
> Even for targets where say int needs to be accessed aligned often e.g. double
> can be just 4 byte aligned and insns that read/write doubles handle it.
> 
> We really shouldn't change __alignof__ of types/INDIRECT_REFs/etc., those are
> all ABI changes.  The problem with vectorization is that often the vectors 
> have
> strict alignment, while non-vector ops are not.

Sure, I agree with all of the above.  Still, if we need target hooks for
this kind of stuff then something is really rotten in the middle-end.

Richard.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #32 from Jakub Jelinek  2011-04-07 
13:43:47 UTC ---
Given that without vectorization that code surely works well on all the weird
ABIs, such ADJUST_FIELD_ALIGN is used only on targets that either aren't strict
alignment targets, or aren't strict alignment targets for the particular types.
You know, for most CPUs strict alignment isn't a binary thing, many targets
including i?86/x86_64 are only partially strict alignment targets.
Even for targets where say int needs to be accessed aligned often e.g. double
can be just 4 byte aligned and insns that read/write doubles handle it.

We really shouldn't change __alignof__ of types/INDIRECT_REFs/etc., those are
all ABI changes.  The problem with vectorization is that often the vectors have
strict alignment, while non-vector ops are not.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #31 from Richard Guenther  2011-04-07 
13:35:04 UTC ---
(In reply to comment #30)
> This is certainly a valid testcase:
> 
> struct S { long long x; int a; double b[1]; int c; } s;
> double *p = &s.b[0];
> int
> foo (double *q)
> {
>   int i, j = 0;
>   for (i = 0; i < 1; i++)
> j += *q++;
>   return j;
> }
> int
> main (void)
> {
>   return foo (p);
> }
> 
> and without the target hook, vectorizer would think it can peel the loop and
> get it naturally aligned (assuming it isn't inlined at least etc. to see that 
> q
> points to &s.b[0] initially).
> 
> I think the bug is in the ABIs that do that kind of thing, but it is too late
> to fix the ABIs up.

The question is what we should do in the middle-end.  If the above would
happen for strict-alignment targets then *p would simply trap.  The
only consistent way for the middle-end would be to simply assume all
pointer-to-doubles are only aligned to 4 bytes, thus use the minimal
alignment that can happen (without using GCC extensions).


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #30 from Jakub Jelinek  2011-04-07 
13:29:15 UTC ---
This is certainly a valid testcase:

struct S { long long x; int a; double b[1]; int c; } s;
double *p = &s.b[0];
int
foo (double *q)
{
  int i, j = 0;
  for (i = 0; i < 1; i++)
j += *q++;
  return j;
}
int
main (void)
{
  return foo (p);
}

and without the target hook, vectorizer would think it can peel the loop and
get it naturally aligned (assuming it isn't inlined at least etc. to see that q
points to &s.b[0] initially).

I think the bug is in the ABIs that do that kind of thing, but it is too late
to fix the ABIs up.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #29 from rguenther at suse dot de  
2011-04-07 13:21:20 UTC ---
On Thu, 7 Apr 2011, jakub at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377
> 
> --- Comment #28 from Jakub Jelinek  2011-04-07 
> 13:16:13 UTC ---
> I mean a and g are 8 with -m64 -malign-power, but f is 4.  If you do:
> extern void abort (void);
> struct S { long long x; int a; double b[1]; int c; } s;
> double *p = &s.b[0];

Here's a known type-inconsistency as well.  Or it's even an
invalid testcase ...

> int
> main ()
> {
>   if (((long) p) & (__alignof__ (*p) - 1))
> abort ();
>   return 0;
> }
> 
> I believe it will fail with -m64 -malign-power, and here it doesn't help at 
> all
> that you check get_object_alignment, because __alignof__ (*p) is still 8,
> but
> .comms,80016,8
> ...
> p:
> .quads+12
> This isn't the default ABI for powerpc64-linux fortunately, but anyway.
> 
> But, the above fails on i?86 too (without non-default -malign-double).
> So, we need some target hook that will tell us if ADJUST_FIELD_ALIGN could 
> ever
> lower alignment of the type when present in structs...

Well.  It's also wrong with

struct S { long long x; int a; double b[1]; int c; } s 
__attribute__((aligned(4)));
double *p = &s.b[0];

on x86_64.  And honestly I'm not exactly sure if that situation is
different than the i?86 or ppc one or if it is just a "bug" in the
C standard that it allows this kind of alignment behavior (and
thus, a bug in the GCC extension that allows adjusting it).

Richard.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #28 from Jakub Jelinek  2011-04-07 
13:16:13 UTC ---
I mean a and g are 8 with -m64 -malign-power, but f is 4.  If you do:
extern void abort (void);
struct S { long long x; int a; double b[1]; int c; } s;
double *p = &s.b[0];
int
main ()
{
  if (((long) p) & (__alignof__ (*p) - 1))
abort ();
  return 0;
}

I believe it will fail with -m64 -malign-power, and here it doesn't help at all
that you check get_object_alignment, because __alignof__ (*p) is still 8,
but
.comms,80016,8
...
p:
.quads+12
This isn't the default ABI for powerpc64-linux fortunately, but anyway.

But, the above fails on i?86 too (without non-default -malign-double).
So, we need some target hook that will tell us if ADJUST_FIELD_ALIGN could ever
lower alignment of the type when present in structs...


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #27 from Richard Guenther  2011-04-07 
12:56:34 UTC ---
(In reply to comment #26)
> int a = __alignof__ (double);
> struct A { int b; double c; int d; } e;
> int f = __alignof__ (e.c);
> double *p;
> int g = __alignof__ (*p);
> int a2 = __alignof__ (long long);
> struct A2 { int b; long long c; int d; } e2;
> int f2 = __alignof__ (e2.c);
> long long *p2;
> int g2 = __alignof__ (*p2);
> 
> on powerpc64-linux unfortunately shows that the target hook is needed, at 
> least
> with -m64 -malign-power.

You mean a != f in the above?  That's the basic issue of stor-layout not
using properly aligned types for FIELD_DECL types (same issue for packed
structs).  So you can't really use type alignment for memory references
but you have to use get_object_alignment & friends.  Same happens on
any target for non-packed but just custom (mis-)aligned struct types.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #26 from Jakub Jelinek  2011-04-07 
12:48:12 UTC ---
int a = __alignof__ (double);
struct A { int b; double c; int d; } e;
int f = __alignof__ (e.c);
double *p;
int g = __alignof__ (*p);
int a2 = __alignof__ (long long);
struct A2 { int b; long long c; int d; } e2;
int f2 = __alignof__ (e2.c);
long long *p2;
int g2 = __alignof__ (*p2);

on powerpc64-linux unfortunately shows that the target hook is needed, at least
with -m64 -malign-power.  Though, what the target hook does is definitely
weird:
  if (TARGET_32BIT)
{
  if (rs6000_alignment_flags == MASK_ALIGN_NATURAL)
return true;
  if (rs6000_alignment_flags ==  MASK_ALIGN_POWER)
return true;
  return false;
}
  else
{
  if (TARGET_MACHO)
return false;
  /* Assuming that all other types are naturally aligned. CHECKME!  */
  return true;
}

I believe rs6000_alignment_flags can be either MASK_ALIGN_NATURAL, or
MASK_ALIGN_POWER, nothing else, so wonder why it just doesn't return true for
TARGET_32BIT.  And, for TARGET_64BIT && !TARGET_MACHO, clearly DFmode types
aren't naturally aligned with rs6000_alignment_flags == MASK_ALIGN_POWER.
Therefore I'd say it should return false for
TYPE_MODE (strip_array_types (type)) == DFmode && !TARGET_ALIGN_NATURAL, at
least for Linux (rs6000 has unfortunately way too many ABIs and even more
target switches).  For TARGET_MACHO, ADJUST_FIELD_ALIGN is
darwin.h:#define ADJUST_FIELD_ALIGN(FIELD, COMPUTED)\
darwin.h-  (TARGET_ALIGN_NATURAL ? (COMPUTED)\
darwin.h-   : (COMPUTED) == 128 ? 128\
darwin.h-   : MIN ((COMPUTED), 32))
thus I think it could use ADJUST_FIELD_ALIGN (NULL, TYPE_ALIGN (type)) ==
TYPE_ALIGN (type).  And AIX is the same as Linux TARGET_64BIT, probably even
for 32-bit AIX though.

So, I think you want the compare_tree_int check in generic code (not sure if
is_packed isn't redundant with that check), plus a hook which will sometimes
tell you peeling won't necessarily align either.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #25 from Richard Guenther  2011-04-07 
12:44:35 UTC ---
(In reply to comment #24)
> (In reply to comment #23)
> > Well, it's hard to accept such an easy solution when the original patch has
> > hundreds of rows ;)
> > 
> > So, you think that 
> > 
> > Index: tree-vect-data-refs.c
> > ===
> > --- tree-vect-data-refs.c   (revision 172019)
> > +++ tree-vect-data-refs.c   (working copy)
> > @@ -1110,12 +1110,9 @@ vector_alignment_reachable_p (struct dat
> >if (ba)
> > is_packed = contains_packed_reference (ba);
> > 
> > -  if (vect_print_dump_info (REPORT_DETAILS))
> > -   fprintf (vect_dump, "Unknown misalignment, is_packed = 
> > %d",is_packed);
> > -  if (targetm.vectorize.vector_alignment_reachable (type, is_packed))
> > -   return true;
> > -  else
> > -   return false;
> > +  if (is_packed
> > +  || compare_tree_int (TYPE_SIZE (type), TYPE_ALIGN (type)) > 0)
> > +return false;
> >  }
> > 
> >return true;
> > 
> > is enough, and we can just get rid of vector_alignment_reachable?
> 
> Yes, I think so.

Even is_packed shouldn't be necessary, TYPE_ALIGN should better be correct
(but IIRC it isn't for the FIELD_DECL types of packed struckts).


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #24 from Richard Guenther  2011-04-07 
12:42:48 UTC ---
(In reply to comment #23)
> Well, it's hard to accept such an easy solution when the original patch has
> hundreds of rows ;)
> 
> So, you think that 
> 
> Index: tree-vect-data-refs.c
> ===
> --- tree-vect-data-refs.c   (revision 172019)
> +++ tree-vect-data-refs.c   (working copy)
> @@ -1110,12 +1110,9 @@ vector_alignment_reachable_p (struct dat
>if (ba)
> is_packed = contains_packed_reference (ba);
> 
> -  if (vect_print_dump_info (REPORT_DETAILS))
> -   fprintf (vect_dump, "Unknown misalignment, is_packed = %d",is_packed);
> -  if (targetm.vectorize.vector_alignment_reachable (type, is_packed))
> -   return true;
> -  else
> -   return false;
> +  if (is_packed
> +  || compare_tree_int (TYPE_SIZE (type), TYPE_ALIGN (type)) > 0)
> +return false;
>  }
> 
>return true;
> 
> is enough, and we can just get rid of vector_alignment_reachable?

Yes, I think so.

Richard.

> Thanks,
> Ira


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread irar at il dot ibm.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #23 from Ira Rosen  2011-04-07 12:02:46 UTC 
---
Well, it's hard to accept such an easy solution when the original patch has
hundreds of rows ;)

So, you think that 

Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c   (revision 172019)
+++ tree-vect-data-refs.c   (working copy)
@@ -1110,12 +1110,9 @@ vector_alignment_reachable_p (struct dat
   if (ba)
is_packed = contains_packed_reference (ba);

-  if (vect_print_dump_info (REPORT_DETAILS))
-   fprintf (vect_dump, "Unknown misalignment, is_packed = %d",is_packed);
-  if (targetm.vectorize.vector_alignment_reachable (type, is_packed))
-   return true;
-  else
-   return false;
+  if (is_packed
+  || compare_tree_int (TYPE_SIZE (type), TYPE_ALIGN (type)) > 0)
+return false;
 }

   return true;

is enough, and we can just get rid of vector_alignment_reachable?

Thanks,
Ira


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #22 from Jakub Jelinek  2011-04-07 
11:45:57 UTC ---
When would
compare_tree_int (TYPE_SIZE (type), TYPE_ALIGN (type)) <= 0
give not what you are looking for?


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread irar at il dot ibm.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #21 from Ira Rosen  2011-04-07 11:37:29 UTC 
---
(In reply to comment #20)

> I saw them, but I can't see what the difference is between
> "aligned" and "aligned" ;)  Either the targets have aligned loads
> or they don't.  We can target independently check whether we
> can for example reach 16-byte alignment - in which cases is it then
> we can't "reach" that alignment anyway due to target issues?

What kind of target independent check do you have in mind? I guess this hook
was added because such check is hard to implement. Here is the discussion about
adding it http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00082.html.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #20 from rguenther at suse dot de  
2011-04-07 10:24:45 UTC ---
On Thu, 7 Apr 2011, irar at il dot ibm.com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377
> 
> --- Comment #19 from Ira Rosen  2011-04-07 09:55:44 
> UTC ---
> (In reply to comment #18)
> > I think rather tree-vect-data-refs.c:vector_alignment_reachable_p should
> > be adjusted.  These "packed" checks in the target hooks don't make any
> > sense to me either.  
> 
> Yes, I think this can be moved to vector_alignment_reachable_p.
> 
> > In fact, I fail to see the point of a target hook
> > completely (if it isn't maybe just for cost issues).
> 
> There are some target specific checks in rs6000.c and arm.c.

I saw them, but I can't see what the difference is between
"aligned" and "aligned" ;)  Either the targets have aligned loads
or they don't.  We can target independently check whether we
can for example reach 16-byte alignment - in which cases is it then
we can't "reach" that alignment anyway due to target issues?


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread irar at il dot ibm.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #19 from Ira Rosen  2011-04-07 09:55:44 UTC 
---
(In reply to comment #18)
> I think rather tree-vect-data-refs.c:vector_alignment_reachable_p should
> be adjusted.  These "packed" checks in the target hooks don't make any
> sense to me either.  

Yes, I think this can be moved to vector_alignment_reachable_p.

> In fact, I fail to see the point of a target hook
> completely (if it isn't maybe just for cost issues).

There are some target specific checks in rs6000.c and arm.c.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #18 from Richard Guenther  2011-04-07 
09:39:13 UTC ---
(In reply to comment #16)
> So perhaps:
> 
>  bool
>  default_builtin_vector_alignment_reachable (const_tree type, bool is_packed)
>  {
> -  if (is_packed)
> +  if (is_packed || (TYPE_USER_ALIGN (type) && compare_tree_int (TYPE_SIZE
> (type), TYPE_ALIGN (type)) == 1)
>  return false;
> 
> and similarly in rs6000 and other hooks?
> BTW, tree_int_cst_compare (TYPE_SIZE (type), bitsize_int (POINTER_SIZE))
> in the default hook probably should be compare_tree_int too.

I think rather tree-vect-data-refs.c:vector_alignment_reachable_p should
be adjusted.  These "packed" checks in the target hooks don't make any
sense to me either.  In fact, I fail to see the point of a target hook
completely (if it isn't maybe just for cost issues).


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

Richard Guenther  changed:

   What|Removed |Added

   Keywords||wrong-code
 Status|REOPENED|NEW
 CC||rguenth at gcc dot gnu.org

--- Comment #17 from Richard Guenther  2011-04-07 
09:35:08 UTC ---
Confirmed.  Peeling for alignment is pointless if the access isn't at least
aligned to the natural alignment of the vector element.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #16 from Jakub Jelinek  2011-04-07 
09:34:36 UTC ---
So perhaps:

 bool
 default_builtin_vector_alignment_reachable (const_tree type, bool is_packed)
 {
-  if (is_packed)
+  if (is_packed || (TYPE_USER_ALIGN (type) && compare_tree_int (TYPE_SIZE
(type), TYPE_ALIGN (type)) == 1)
 return false;

and similarly in rs6000 and other hooks?
BTW, tree_int_cst_compare (TYPE_SIZE (type), bitsize_int (POINTER_SIZE))
in the default hook probably should be compare_tree_int too.


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread irar at il dot ibm.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #15 from Ira Rosen  2011-04-07 09:00:43 UTC 
---
(In reply to comment #14)
> Sure, but for types with user alignment smaller than their size, loop peeling
> may often never be able to align it.
> 
> This regressed with http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161797

This patch only allowed peeling for loads. So the problem existed for stores
before this revision.

> 
> What I mean is that for say unsigned int type with 32-bit alignment loop
> peeling for valid code should always end up with correct alignment, but for
> unsigned int with 8-bit alignment it can't.  And e.g. i?86 long long is of the
> same category, it has 64-bit size, but 32-bit alignment, so if you have a
> pointer
> to long long array, the whole array might be just 32-bit aligned, but not
> 64-bit,
> then any kind of peeling will still result in misalignment.

The vectorizer calls builtin_vector_alignment_reachable in case of unknown
alignment to verify that the access is naturally aligned. (i?86 doesn't
implement this builtin and uses the default implementation). 
There is PR 42652 for a similar problem.

Thanks,
Ira


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

--- Comment #14 from Jakub Jelinek  2011-04-07 
08:20:30 UTC ---
Sure, but for types with user alignment smaller than their size, loop peeling
may often never be able to align it.

This regressed with http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161797

What I mean is that for say unsigned int type with 32-bit alignment loop
peeling for valid code should always end up with correct alignment, but for
unsigned int with 8-bit alignment it can't.  And e.g. i?86 long long is of the
same category, it has 64-bit size, but 32-bit alignment, so if you have a
pointer
to long long array, the whole array might be just 32-bit aligned, but not
64-bit,
then any kind of peeling will still result in misalignment.

__attribute__((noinline, noclone)) unsigned long long
foo (const char *s, int len)
{
  const unsigned long long *p = (const unsigned long long *) s;
  unsigned long long f = len / sizeof (unsigned long long), hash = len, i;

  for (i = 0; i < f; ++i)
hash += *p++;
  return hash;
}

struct S
{
  int i;
  long long buf[64];
} s;

int
main (void)
{
  return foo ((const char *) s.buf, 60);
}

works fine though (with -m32 -O3 -mavx or -m32 -O3 -msse4).


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread irar at il dot ibm.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

Ira Rosen  changed:

   What|Removed |Added

 CC||irar at il dot ibm.com

--- Comment #13 from Ira Rosen  2011-04-07 08:00:23 UTC 
---
The vectorizer uses loop peeling to align the loads from *p (by checking the
address at the run time).


[Bug middle-end/48377] [4.6/4.7 regression] miscompilation at -O3

2011-04-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48377

Jakub Jelinek  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 CC||irar at gcc dot gnu.org
 Resolution|INVALID |
Summary|[4.6 regression]|[4.6/4.7 regression]
   |miscompilation at -O3   |miscompilation at -O3

--- Comment #12 from Jakub Jelinek  2011-04-07 
07:44:18 UTC ---
I'd say that is a vectorization bug then.  Either it shouldn't vectorize it at
all, or needs to ensure unaligned vector loads (resp. stores) are used
everywhere.

Short testcase:

typedef unsigned int U __attribute__((__aligned__ (1), __may_alias__));

__attribute__((noinline, noclone)) unsigned int
foo (const char *s, int len)
{
  const U *p = (const U *) s;
  unsigned int f = len / sizeof (unsigned int), hash = len, i;

  for (i = 0; i < f; ++i)
hash += *p++;
  return hash;
}

char buf[64] __attribute__((aligned (32)));

int
main (void)
{
  return foo (buf + 1, 26);
}