[Bug tree-optimization/42652] vectorizer created unaligned vector insns
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652 Richard Guenther changed: What|Removed |Added Blocks||53947 Status|UNCONFIRMED |RESOLVED Known to work||4.4.7, 4.5.4 Resolution||FIXED Target Milestone|--- |4.5.4 --- Comment #19 from Richard Guenther 2012-07-13 09:01:19 UTC --- Link to vectorizer missed-optimization meta-bug. --- Comment #20 from Richard Guenther 2012-07-19 11:37:30 UTC --- The testcase is invalid. To quote: typedef struct __attribute__ ((__packed__)) Palette { unsigned short pad; unsigned int ents[0]; } Palette; int main () { unsigned char *buf = new unsigned char[sizeof (Palette) + 256 * sizeof (unsigned int) + 16]; Palette *palette = (Palette *) (buf); unsigned int *now = palette->ents; unsigned int *end = now + 256; for (; now < end; now++) *now = 69; } you are dereferencing a pointer of type unsigned int * which is not properly aligned. Fixed at least since 4.5.4 if you properly do typedef struct __attribute__ ((__packed__)) Palette { unsigned short pad; unsigned int ents[0]; } Palette; typedef unsigned int uuint __attribute__((aligned(1))); int main () { unsigned char *buf = new unsigned char[sizeof (Palette) + 256 * sizeof (unsigned int) + 16]; Palette *palette = (Palette *) (buf); uuint *now = palette->ents; uuint *end = now + 256; for (; now < end; now++) *now = 69; } -Wcast-align does not warn about the original unsigned int *now = palette->ents; stmt though.
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #18 from rguenth at gcc dot gnu dot org 2010-02-22 10:27 --- (In reply to comment #17) > Is there a way to pass alignment information similar to PR 39954? Well, that wasn't really a true fix but simply restored the (broken) behavior of earlier compilers so the 4.5 regression is gone ... > Otherwise, a proper fix would be some inter-procedural analysis... Meantime, > we > can do intra-procedural analysis and fail when we reach function argument, > i.e, > use runtime checks. We already have several types of versioning, so adding > another one will complicate the things even more, and will not always be > possible (because of code size constrains). Yes, we need to propagate (mis-)alignment information properly. My plan for 4.6 is to add (mis-)alignemnt information to SSA_NAME_PTR_INFO and have a simple propagator computing it. Richard. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #17 from irar at il dot ibm dot com 2010-02-22 09:01 --- Is there a way to pass alignment information similar to PR 39954? Otherwise, a proper fix would be some inter-procedural analysis... Meantime, we can do intra-procedural analysis and fail when we reach function argument, i.e, use runtime checks. We already have several types of versioning, so adding another one will complicate the things even more, and will not always be possible (because of code size constrains). Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #16 from law at redhat dot com 2010-02-09 23:49 --- Subject: Re: vectorizer created unaligned vector insns On 02/09/10 16:11, rguenth at gcc dot gnu dot org wrote: > --- Comment #15 from rguenth at gcc dot gnu dot org 2010-02-09 23:11 > --- > (In reply to comment #14) > >> Subject: Re: vectorizer created unaligned vector >> insns >> >> On 01/18/10 05:17, irar at il dot ibm dot com wrote: >> >>> --- Comment #13 from irar at il dot ibm dot com 2010-01-18 12:17 >>> --- >>> Does something like this make sense? (With this patch we will never use >>> peeling >>> for function parameters, unless the builtin returns OK to peel for packed >>> types). >>> >>> Index: tree-vect-data-refs.c >>> === >>> --- tree-vect-data-refs.c (revision 155880) >>> +++ tree-vect-data-refs.c (working copy) >>> @@ -1010,10 +1010,29 @@ vector_alignment_reachable_p (struct dat >>> tree type = (TREE_TYPE (DR_REF (dr))); >>> tree ba = DR_BASE_OBJECT (dr); >>> bool is_packed = false; >>> + tree tmp = TREE_TYPE (DR_BASE_ADDRESS (dr)); >>> >>> if (ba) >>> is_packed = contains_packed_reference (ba); >>> >>> + is_packed = is_packed || contains_packed_reference (DR_BASE_ADDRESS >>> (dr)); >>> + >>> + if (!is_packed) >>> +{ >>> + while (tmp) >>> +{ >>> + is_packed = TYPE_PACKED (tmp); >>> + if (is_packed) >>> +break; >>> + >>> + tmp = TREE_TYPE (tmp); >>> +} >>> +} >>> + >>> + if (TREE_CODE (DR_BASE_ADDRESS (dr)) == SSA_NAME >>> +&& TREE_CODE (SSA_NAME_VAR (DR_BASE_ADDRESS (dr))) == PARM_DECL) >>> +is_packed = true; >>> + >>> if (vect_print_dump_info (REPORT_DETAILS)) >>> fprintf (vect_dump, "Unknown misalignment, is_packed = >>> %d",is_packed); >>> if (targetm.vectorize.vector_alignment_reachable (type, is_packed)) >>> >>> >>> >> I still don't see how this can be conservatively correct. The >> fundamental problem is the code assumes that if it doesn't find a packed >> attribute or the DR_BASE_ADDRESS isn't directly derived from a PARM_DECL >> then it's safe to vectorize.Instead the code really needs to operate >> by proving suitable alignment and if suitable alignment can't be proven, >> then vectorization is not possible without runtime alignment guards. >> >> As an example, consider the case where an unaligned packed address is >> passed as a parameter, but the target function does somethign like >> >> fubar (int *p, bool condition ) // where P is potentially unaligned >> > A pointer of type int * which is not aligned properly invokes undefined > behavior. > Unaligned in the sense that the alignment is not suitable for vectorization, but still has suitable alignment for natural integer loads & stores on the target processor. jeff -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #15 from rguenth at gcc dot gnu dot org 2010-02-09 23:11 --- (In reply to comment #14) > Subject: Re: vectorizer created unaligned vector > insns > > On 01/18/10 05:17, irar at il dot ibm dot com wrote: > > --- Comment #13 from irar at il dot ibm dot com 2010-01-18 12:17 > > --- > > Does something like this make sense? (With this patch we will never use > > peeling > > for function parameters, unless the builtin returns OK to peel for packed > > types). > > > > Index: tree-vect-data-refs.c > > === > > --- tree-vect-data-refs.c (revision 155880) > > +++ tree-vect-data-refs.c (working copy) > > @@ -1010,10 +1010,29 @@ vector_alignment_reachable_p (struct dat > > tree type = (TREE_TYPE (DR_REF (dr))); > > tree ba = DR_BASE_OBJECT (dr); > > bool is_packed = false; > > + tree tmp = TREE_TYPE (DR_BASE_ADDRESS (dr)); > > > > if (ba) > > is_packed = contains_packed_reference (ba); > > > > + is_packed = is_packed || contains_packed_reference (DR_BASE_ADDRESS > > (dr)); > > + > > + if (!is_packed) > > +{ > > + while (tmp) > > +{ > > + is_packed = TYPE_PACKED (tmp); > > + if (is_packed) > > +break; > > + > > + tmp = TREE_TYPE (tmp); > > +} > > +} > > + > > + if (TREE_CODE (DR_BASE_ADDRESS (dr)) == SSA_NAME > > +&& TREE_CODE (SSA_NAME_VAR (DR_BASE_ADDRESS (dr))) == PARM_DECL) > > +is_packed = true; > > + > > if (vect_print_dump_info (REPORT_DETAILS)) > > fprintf (vect_dump, "Unknown misalignment, is_packed = > > %d",is_packed); > > if (targetm.vectorize.vector_alignment_reachable (type, is_packed)) > > > > > I still don't see how this can be conservatively correct. The > fundamental problem is the code assumes that if it doesn't find a packed > attribute or the DR_BASE_ADDRESS isn't directly derived from a PARM_DECL > then it's safe to vectorize.Instead the code really needs to operate > by proving suitable alignment and if suitable alignment can't be proven, > then vectorization is not possible without runtime alignment guards. > > As an example, consider the case where an unaligned packed address is > passed as a parameter, but the target function does somethign like > > fubar (int *p, bool condition ) // where P is potentially unaligned A pointer of type int * which is not aligned properly invokes undefined behavior. > { > int *addr; >if (condition) > temp = malloc (whatever) >addr = (condition ? temp : p); > >// addr used as base address in vectorizable loop here > } > > Jeff > -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #14 from law at redhat dot com 2010-02-09 23:04 --- Subject: Re: vectorizer created unaligned vector insns On 01/18/10 05:17, irar at il dot ibm dot com wrote: > --- Comment #13 from irar at il dot ibm dot com 2010-01-18 12:17 --- > Does something like this make sense? (With this patch we will never use > peeling > for function parameters, unless the builtin returns OK to peel for packed > types). > > Index: tree-vect-data-refs.c > === > --- tree-vect-data-refs.c (revision 155880) > +++ tree-vect-data-refs.c (working copy) > @@ -1010,10 +1010,29 @@ vector_alignment_reachable_p (struct dat > tree type = (TREE_TYPE (DR_REF (dr))); > tree ba = DR_BASE_OBJECT (dr); > bool is_packed = false; > + tree tmp = TREE_TYPE (DR_BASE_ADDRESS (dr)); > > if (ba) > is_packed = contains_packed_reference (ba); > > + is_packed = is_packed || contains_packed_reference (DR_BASE_ADDRESS > (dr)); > + > + if (!is_packed) > +{ > + while (tmp) > +{ > + is_packed = TYPE_PACKED (tmp); > + if (is_packed) > +break; > + > + tmp = TREE_TYPE (tmp); > +} > +} > + > + if (TREE_CODE (DR_BASE_ADDRESS (dr)) == SSA_NAME > +&& TREE_CODE (SSA_NAME_VAR (DR_BASE_ADDRESS (dr))) == PARM_DECL) > +is_packed = true; > + > if (vect_print_dump_info (REPORT_DETAILS)) > fprintf (vect_dump, "Unknown misalignment, is_packed = > %d",is_packed); > if (targetm.vectorize.vector_alignment_reachable (type, is_packed)) > > I still don't see how this can be conservatively correct. The fundamental problem is the code assumes that if it doesn't find a packed attribute or the DR_BASE_ADDRESS isn't directly derived from a PARM_DECL then it's safe to vectorize.Instead the code really needs to operate by proving suitable alignment and if suitable alignment can't be proven, then vectorization is not possible without runtime alignment guards. As an example, consider the case where an unaligned packed address is passed as a parameter, but the target function does somethign like fubar (int *p, bool condition ) // where P is potentially unaligned { int *addr; if (condition) temp = malloc (whatever) addr = (condition ? temp : p); // addr used as base address in vectorizable loop here } Jeff -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #13 from irar at il dot ibm dot com 2010-01-18 12:17 --- Does something like this make sense? (With this patch we will never use peeling for function parameters, unless the builtin returns OK to peel for packed types). Index: tree-vect-data-refs.c === --- tree-vect-data-refs.c (revision 155880) +++ tree-vect-data-refs.c (working copy) @@ -1010,10 +1010,29 @@ vector_alignment_reachable_p (struct dat tree type = (TREE_TYPE (DR_REF (dr))); tree ba = DR_BASE_OBJECT (dr); bool is_packed = false; + tree tmp = TREE_TYPE (DR_BASE_ADDRESS (dr)); if (ba) is_packed = contains_packed_reference (ba); + is_packed = is_packed || contains_packed_reference (DR_BASE_ADDRESS (dr)); + + if (!is_packed) +{ + while (tmp) +{ + is_packed = TYPE_PACKED (tmp); + if (is_packed) +break; + + tmp = TREE_TYPE (tmp); +} +} + + if (TREE_CODE (DR_BASE_ADDRESS (dr)) == SSA_NAME + && TREE_CODE (SSA_NAME_VAR (DR_BASE_ADDRESS (dr))) == PARM_DECL) +is_packed = true; + if (vect_print_dump_info (REPORT_DETAILS)) fprintf (vect_dump, "Unknown misalignment, is_packed = %d",is_packed); if (targetm.vectorize.vector_alignment_reachable (type, is_packed)) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #12 from hjl dot tools at gmail dot com 2010-01-14 17:34 --- Intel AVX architecture won't fault on unaligned load-op instructions and unaligned VMOV* instructions are as fast as aligned VMOV* instructions on aligned memory. For AVX, we can always use unaligned VMOV* instructions, even on aligned memory, to work around this issue without any performance penalty. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #11 from law at redhat dot com 2010-01-14 17:29 --- Subject: Re: vectorizer created unaligned vector insns On 01/13/10 02:35, irar at il dot ibm dot com wrote: > --- Comment #10 from irar at il dot ibm dot com 2010-01-13 09:35 --- > Yes, I understand that we can't assume that an access is aligned if we can't > prove it's aligned. I don't understand how we can prove that a COMPONENT_REF > is > aligned, i.e., if there is a way to check if a struct is packed, or we'd > better > decide that we always use versioning for COMPONENT_REFs? > Given a raw COMPONENT_REF I don't think you can prove proper alignment. You have to look at the entire effective address computation. Unfortunately, I think this has rather significant implications on how often we can vectorize without needing to perform the runtime check for alignment. Jeff -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #10 from irar at il dot ibm dot com 2010-01-13 09:35 --- Yes, I understand that we can't assume that an access is aligned if we can't prove it's aligned. I don't understand how we can prove that a COMPONENT_REF is aligned, i.e., if there is a way to check if a struct is packed, or we'd better decide that we always use versioning for COMPONENT_REFs? Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #9 from law at redhat dot com 2010-01-12 15:18 --- Subject: Re: vectorizer created unaligned vector insns On 01/12/10 01:08, irar at il dot ibm dot com wrote: > --- Comment #8 from irar at il dot ibm dot com 2010-01-12 08:08 --- > So, to be on the safe side, we should assume that COMPONENT_REFs are not > naturally aligned and never use peeling for them? > If you can not *prove* at compile time that you're going to get the proper alignment, then you can't vectorize without a runtime check to select between a vectorized and unvectorized loop. Proven alignement -- vectorize Proven unaligned -- do not vectorize Likely aligned, but not proven -- emit two loops, one vectorized, one not vectorized and runtime selection of the proper loop Otherwise you run the risk of making conforming code segfault which is obviously unacceptable. jeff -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #8 from irar at il dot ibm dot com 2010-01-12 08:08 --- So, to be on the safe side, we should assume that COMPONENT_REFs are not naturally aligned and never use peeling for them? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #7 from law at redhat dot com 2010-01-11 17:16 --- Subject: Re: vectorizer created unaligned vector insns On 01/08/10 10:25, rguenth at gcc dot gnu dot org wrote: > --- Comment #4 from rguenth at gcc dot gnu dot org 2010-01-08 17:25 > --- > Well, indeed we have a certain weakness in how we represent pointers to > possibly (un-)aligned stuff. See PR39954 for another case. Maybe this > bug is really a duplicate of the underlying problems. > > PR39954 is slightly different, though it would apply if we started looking at passing the address of palette->ents as a parameter. Jeff -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #6 from law at redhat dot com 2010-01-11 17:14 --- Subject: Re: vectorizer created unaligned vector insns On 01/10/10 01:22, irar at il dot ibm dot com wrote: > --- Comment #5 from irar at il dot ibm dot com 2010-01-10 08:22 --- > In vector_alignment_reachable_p() we check if an access is packed using > contains_packed_reference(). For packed accesses we return false, meaning > alignment is unreachable and peeling cannot be used. > > In the attached testcase contains_packed_reference() returns false for > palette_5. > Right. If you look more closely at the way the code works, there's no way contains_packed_reference is ever going to return true. It simply isn't passed the right information. It'd need to be looking at the base address's type (not the base object). Furthermore, contains_packed_reference simply isn't designed to dive into types. It primarily works on exprs and only references the underlying type when it encounters a COMPONENT_REF. As Richard apparently pointed out in the past, this code is not conservatively correct -- if it can't prove the item is properly aligned, then a runtime check is required. So while I think we might be able to fix this specific instance by improving the code around the call to contains_packed_reference, I think we have a bigger problem as it's fairly trivial to change the test so that the addresses of interest are parameters and alignment info is effectively unknown. Jeff -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #5 from irar at il dot ibm dot com 2010-01-10 08:22 --- In vector_alignment_reachable_p() we check if an access is packed using contains_packed_reference(). For packed accesses we return false, meaning alignment is unreachable and peeling cannot be used. In the attached testcase contains_packed_reference() returns false for palette_5. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #4 from rguenth at gcc dot gnu dot org 2010-01-08 17:25 --- Well, indeed we have a certain weakness in how we represent pointers to possibly (un-)aligned stuff. See PR39954 for another case. Maybe this bug is really a duplicate of the underlying problems. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #3 from law at redhat dot com 2010-01-08 16:45 --- Subject: Re: vectorizer created unaligned vector insns On 01/08/10 04:58, rguenth at gcc dot gnu dot org wrote: > --- Comment #2 from rguenth at gcc dot gnu dot org 2010-01-08 11:58 > --- > There is some hacks with contains_packed_reference but IIRC I was complaining > about these at some point - they are not conservatively correct. > > > I was thinking about this some more and ISTM that if we can't prove suitable alignment, then we have to fall back to a runtime test. So improving the code to detect packed structures would be useful, but is not sufficient to resolve this class of problems. ie, it probably wouldn't be too hard to transform that testcase into one which passed in a pointer to an unaligned palette->elms at which point we're forced to use the runtime test. Thoughts? jeff -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #2 from rguenth at gcc dot gnu dot org 2010-01-08 11:58 --- There is some hacks with contains_packed_reference but IIRC I was complaining about these at some point - they are not conservatively correct. -- rguenth at gcc dot gnu dot org changed: What|Removed |Added CC||irar at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
[Bug tree-optimization/42652] vectorizer created unaligned vector insns
--- Comment #1 from law at redhat dot com 2010-01-07 17:33 --- Created an attachment (id=19501) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19501&action=view) testcase -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652