[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2017-07-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #23 from Jakub Jelinek  ---
The bug is fixed; you must be running into a different issue, either in the
source you're compiling or in the compiler.  Please open a new bug report
instead of commenting on a different one, and supply all the needed information
(see http://gcc.gnu.org/bugs/ for details on what we need).

[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2017-07-13 Thread gcc at thomasgereke dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

Thomas Gereke  changed:

   What|Removed |Added

 CC||gcc at thomasgereke dot de

--- Comment #22 from Thomas Gereke  ---
The bug still seems to exist in GCC 6.3.0 20170516 (Debian 6.3.0-18). I get a
general protection fault (GP) on

  > 0x5574d8c8 <...[abi:cxx11]() const+264>  movdqa 0x68(%rsp),%xmm0
    0x5574d8ce <...[abi:cxx11]() const+270>  lea    0x80(%rsp),%r13
    0x5574d8d6 <...[abi:cxx11]() const+278>  movq   $0x0,0x50(%rsp)
    0x5574d8df <...[abi:cxx11]() const+287>  movl   $0x0,0x10(%rsp)
    0x5574d8e7 <...[abi:cxx11]() const+295>  movaps %xmm0,(%rsp)
    0x5574d8eb <...[abi:cxx11]() const+299>  movq   $0x0,0x6(%rsp)
    0x5574d8f4 <...[abi:cxx11]() const+308>  movw   $0x0,0xe(%rsp)
    0x5574d8fb <...[abi:cxx11]() const+315>  movdqa (%rsp),%xmm1
    0x5574d900 <...[abi:cxx11]() const+320>  movaps %xmm1,0x40(%rsp)

The asm code is obviously wrong: movdqa requires a 16-byte-aligned memory
operand, and 0x68 is not a multiple of 16, so movdqa 0x68(%rsp),%xmm0 followed
by movdqa (%rsp),%xmm1 without any change to %rsp has to fail on one of them.
%rsp was 0x7fffecd477d0.
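
The arithmetic behind that observation can be checked with a small C sketch.
The helper names are hypothetical; the only inputs taken from the report are
the %rsp value and the two displacements.

```c
#include <stdbool.h>
#include <stdint.h>

/* movdqa raises a general protection fault unless its memory operand
   is 16-byte aligned, i.e. the low four address bits are zero. */
bool is_aligned16(uint64_t addr)
{
    return (addr & 15u) == 0;
}

/* Misalignment (0..15) of the effective address rsp + disp. */
unsigned misalignment16(uint64_t rsp, uint64_t disp)
{
    return (unsigned)((rsp + disp) & 15u);
}
```

With %rsp = 0x7fffecd477d0, misalignment16(rsp, 0) is 0 but
misalignment16(rsp, 0x68) is 8, so the first movdqa reads from 0x7fffecd47838
and faults. Because 0x68 is 8 modulo 16, no value of %rsp can make both
operands aligned.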

The code is C++, compiled with -O3 for x86_64. The underlying data structure
is boost::asio::ip::address, which consists of an enum (4 bytes), an address_v4
(4 bytes) and an address_v6 (16 bytes). The GP occurs when accessing the IPv6
address.
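
A simplified C model of the layout described above shows why a 16-byte-aligned
access to the IPv6 member is not guaranteed. The types below are hypothetical
stand-ins built only from the sizes given in this comment; they are not the
real Boost.Asio definitions, and the numbers assume a typical x86_64 ABI with
4-byte enums.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three members (not Boost.Asio code). */
enum addr_kind { KIND_V4, KIND_V6 };             /* 4 bytes on x86_64 */
struct model_v4 { uint32_t addr; };              /* 4 bytes */
struct model_v6 { unsigned char bytes[16]; };    /* 16 bytes */

struct model_address {
    enum addr_kind kind;
    struct model_v4 v4;
    struct model_v6 v6;
};

/* Offset of the 16-byte member inside the aggregate. */
size_t model_v6_offset(void)
{
    return offsetof(struct model_address, v6);
}

/* Alignment the type itself guarantees for any object of this type. */
size_t model_alignment(void)
{
    return _Alignof(struct model_address);
}
```

In this model the 16-byte member sits at offset 8 of a type whose own
alignment is below 16, so an aligned 16-byte load of that member is only safe
if something else guarantees the placement of the enclosing object.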

[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2015-12-11 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #21 from H.J. Lu  ---
This bug isn't fixed in GCC 4.9.  -O3 increases alignment from
64 bits to 128 bits on the original testcase:

Hardware watchpoint 6: *(unsigned int *) 0x7fffee9b4468

Old value = 64
New value = 128
ensure_base_align (stmt_info=0x1c8f990, dr=0x1db5b20)
at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:4907
4907  DECL_USER_ALIGN (base_decl) = 1;
(gdb) bt
#0  ensure_base_align (stmt_info=0x1c8f990, dr=0x1db5b20)
at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:4907
#1  0x00d33471 in vectorizable_store (stmt=0x7fffed95a280, 
gsi=0x7fffd830, vec_stmt=0x7fffd790, slp_node=0x1d9e7a0)
at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:5131
#2  0x00d38f80 in vect_transform_stmt (stmt=0x7fffed95a280, 
gsi=0x7fffd830, grouped_store=0x7fffd84a, slp_node=0x1d9e7a0, 
slp_node_instance=0x1cb3e10)
at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:7211
#3  0x00d5a980 in vect_schedule_slp_instance (node=0x1d9e7a0, 
instance=0x1cb3e10, vectorization_factor=1)
at /export/gnu/import/git/gcc-release/gcc/tree-vect-slp.c:3084
#4  0x00d5abd0 in vect_schedule_slp (loop_vinfo=0x0, 
bb_vinfo=0x1ddf410)
at /export/gnu/import/git/gcc-release/gcc/tree-vect-slp.c:3154
#5  0x00d5aea7 in vect_slp_transform_bb (bb=0x7fffece8ec30)
at /export/gnu/import/git/gcc-release/gcc/tree-vect-slp.c:3230
#6  0x00d5e41b in execute_vect_slp ()
at /export/gnu/import/git/gcc-release/gcc/tree-vectorizer.c:605
#7  0x00d5e4c9 in (anonymous namespace)::pass_slp_vectorize::execute (
this=0x1b97010)
at /export/gnu/import/git/gcc-release/gcc/tree-vectorizer.c:649
#8  0x00a7da14 in execute_one_pass (pass=0x1b97010)
---Type <return> to continue, or q <return> to quit---q
 at /export/gnu/imporQuit
(gdb) f 1
#1  0x00d33471 in vectorizable_store (stmt=0x7fffed95a280, 
gsi=0x7fffd830, vec_stmt=0x7fffd790, slp_node=0x1d9e7a0)
at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:5131
5131  ensure_base_align (stmt_info, dr);
(gdb) f 2
#2  0x00d38f80 in vect_transform_stmt (stmt=0x7fffed95a280, 
gsi=0x7fffd830, grouped_store=0x7fffd84a, slp_node=0x1d9e7a0, 
slp_node_instance=0x1cb3e10)
at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:7211
7211  done = vectorizable_store (stmt, gsi, &vec_stmt, slp_node);
(gdb) 

This bug may be really fixed by r221268:

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index aa9d43f..41ff802 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4956,8 +4956,13 @@ ensure_base_align (stmt_vec_info stmt_info, struct data_reference *dr)
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
   tree base_decl = ((dataref_aux *)dr->aux)->base_decl;

-  DECL_ALIGN (base_decl) = TYPE_ALIGN (vectype);
-  DECL_USER_ALIGN (base_decl) = 1;
+  if (decl_in_symtab_p (base_decl))
+    symtab_node::get (base_decl)->increase_alignment (TYPE_ALIGN (vectype));
+  else
+    {
+      DECL_ALIGN (base_decl) = TYPE_ALIGN (vectype);
+      DECL_USER_ALIGN (base_decl) = 1;
+    }
   ((dataref_aux *)dr->aux)->base_misaligned = false;
 }
 }

in GCC 5.

[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-06-11 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #20 from Jakub Jelinek  ---
Author: jakub
Date: Wed Jun 12 06:43:05 2013
New Revision: 199984

URL: http://gcc.gnu.org/viewcvs?rev=199984&root=gcc&view=rev
Log:
PR target/56564
* varasm.c (decl_binds_to_current_def_p): Call binds_local_p
target hook even for !TREE_PUBLIC decls.  If no resolution info
is available, return false for common and external decls.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/varasm.c

Author: jakub
Date: Wed Jun 12 06:46:53 2013
New Revision: 199985

URL: http://gcc.gnu.org/viewcvs?rev=199985&root=gcc&view=rev
Log:
PR target/56564
* gcc.target/i386/pr56564-1.c: Skip on darwin, mingw and cygwin.
* gcc.target/i386/pr56564-3.c: Likewise.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/i386/pr56564-1.c
trunk/gcc/testsuite/gcc.target/i386/pr56564-3.c


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-06-11 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #19 from Jakub Jelinek  ---
PE is the mingw/cygwin object format.  The testcases assume that the symbols
have decl_binds_to_current_def_p false; if that isn't the case (because
darwin/mingw apparently don't allow symbol interposition), then the testcases
can't work on those targets.


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-06-11 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #18 from Dominique d'Humieres  ---
(In reply to comment #17)
> Yeah, MachO is broken by design, guess the tests need to be restricted 
> to non-darwin non-PE.

Questions:
(1) What is PE?
(2) Is the second "return 0;" wrong code or valid optimization? If the former,
why?
(3) Is the decoration "__emutls_v." the same for all the emutls platforms? If
not, where can I find the variants?


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-06-11 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #17 from Jakub Jelinek  ---
(In reply to Dominique d'Humieres from comment #16)
> On x86_64-apple-darwin10.8 at revision 199935, I get the following failures
> for the tests added at revision 199898:
> 
> FAIL: gcc.target/i386/pr56564-1.c scan-tree-dump-times optimized "&s" 1
> FAIL: gcc.target/i386/pr56564-1.c scan-tree-dump-times optimized "return 0" 1
> FAIL: gcc.target/i386/pr56564-3.c scan-tree-dump-times optimized "&s" 1
> FAIL: gcc.target/i386/pr56564-3.c scan-tree-dump-times optimized "&t" 1

Yeah, MachO is broken by design, guess the tests need to be restricted to
non-darwin non-PE.


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-06-11 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #16 from Dominique d'Humieres  ---
On x86_64-apple-darwin10.8 at revision 199935, I get the following failures for
the tests added at revision 199898:

FAIL: gcc.target/i386/pr56564-1.c scan-tree-dump-times optimized "&s" 1
FAIL: gcc.target/i386/pr56564-1.c scan-tree-dump-times optimized "return 0" 1
FAIL: gcc.target/i386/pr56564-3.c scan-tree-dump-times optimized "&s" 1
FAIL: gcc.target/i386/pr56564-3.c scan-tree-dump-times optimized "&t" 1

The optimized dumps are (blank lines removed):

[macbook] f90/bug% cat pr56564-1.c.165t.optimized
;; Function foo (foo, funcdef_no=0, decl_uid=1741, symbol_order=2)
foo ()
{
  <bb 2>:
  return 0;
}
;; Function bar (bar, funcdef_no=1, decl_uid=1744, symbol_order=3)
bar ()
{
  <bb 2>:
  return 0;
}

[macbook] f90/bug% cat pr56564-3.c.165t.optimized
;; Function foo (foo, funcdef_no=0, decl_uid=1741, symbol_order=2)
foo ()
{
  struct S * D.1770;
  long int s.0;
  int _2;
  int _3;
  <bb 2>:
  _5 = __builtin___emutls_get_address (&__emutls_v.s);
  s.0_1 = (long int) _5;
  _2 = (int) s.0_1;
  _3 = _2 & 15;
  return _3;
}
;; Function bar (bar, funcdef_no=1, decl_uid=1744, symbol_order=3)
bar ()
{
  char * D.1769;
  char[16] * D.1768;
  long int _1;
  int _2;
  int _3;
  <bb 2>:
  _5 = __builtin___emutls_get_address (&__emutls_v.t);
  _6 = &*_5[0];
  _1 = (long int) _6;
  _2 = (int) _1;
  _3 = _2 & 15;
  return _3;
}


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-06-10 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #15 from Jakub Jelinek  ---
Author: jakub
Date: Tue Jun 11 06:03:46 2013
New Revision: 199934

URL: http://gcc.gnu.org/viewcvs?rev=199934&root=gcc&view=rev
Log:
PR target/56564
* varasm.c (get_variable_align): Move #endif to the right place.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/varasm.c


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-06-10 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|hubicka at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #14 from Jakub Jelinek  ---
Author: jakub
Date: Mon Jun 10 15:41:52 2013
New Revision: 199898

URL: http://gcc.gnu.org/viewcvs?rev=199898&root=gcc&view=rev
Log:
PR target/56564
* varasm.c (align_variable): Don't use DATA_ALIGNMENT or
CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl).
Use DATA_ABI_ALIGNMENT for that case instead if defined.
(get_variable_align): New function.
(get_variable_section, emit_bss, emit_common,
assemble_variable_contents, place_block_symbol): Use
get_variable_align instead of DECL_ALIGN.
(assemble_noswitch_variable): Add align argument, use it
instead of DECL_ALIGN.
(assemble_variable): Adjust caller.  Use get_variable_align
instead of DECL_ALIGN.
* config/i386/i386.h (DATA_ALIGNMENT): Adjust x86_data_alignment
caller.
(DATA_ABI_ALIGNMENT): Define.
* config/i386/i386-protos.h (x86_data_alignment): Adjust prototype.
* config/i386/i386.c (x86_data_alignment): Add opt argument.  If
opt is false, only return the psABI mandated alignment increase.
* config/c6x/c6x.h (DATA_ALIGNMENT): Renamed to...
(DATA_ABI_ALIGNMENT): ... this.
* config/mmix/mmix.h (DATA_ALIGNMENT): Renamed to...
(DATA_ABI_ALIGNMENT): ... this.
* config/mmix/mmix.c (mmix_data_alignment): Adjust function comment.
* config/s390/s390.h (DATA_ALIGNMENT): Renamed to...
(DATA_ABI_ALIGNMENT): ... this.
* doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document.
* doc/tm.texi: Regenerated.

* gcc.target/i386/pr56564-1.c: New test.
* gcc.target/i386/pr56564-2.c: New test.
* gcc.target/i386/pr56564-3.c: New test.
* gcc.target/i386/pr56564-4.c: New test.
* gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common.
* gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
* gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
* gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
* gcc.target/i386/vect-sizes-1.c: Likewise.
* gcc.target/i386/memcpy-1.c: Likewise.
* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize.
* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr56564-1.c
trunk/gcc/testsuite/gcc.target/i386/pr56564-2.c
trunk/gcc/testsuite/gcc.target/i386/pr56564-3.c
trunk/gcc/testsuite/gcc.target/i386/pr56564-4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/c6x/c6x.h
trunk/gcc/config/i386/i386-protos.h
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.h
trunk/gcc/config/mmix/mmix.c
trunk/gcc/config/mmix/mmix.h
trunk/gcc/config/s390/s390.h
trunk/gcc/doc/tm.texi
trunk/gcc/doc/tm.texi.in
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c
trunk/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c
trunk/gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c
trunk/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c
trunk/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c
trunk/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c
trunk/gcc/testsuite/gcc.target/i386/memcpy-1.c
trunk/gcc/testsuite/gcc.target/i386/vect-sizes-1.c
trunk/gcc/varasm.c


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-06-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #13 from Jakub Jelinek  ---
Created attachment 30275
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30275&action=edit
gcc49-pr56564.patch

Untested fix.  Honza, is the array type >= 16 bytes alignment increase the only
ABI mandated one and all the rest is just optimization?


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-05-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #12 from Jakub Jelinek  ---
That may have been the original purpose of DATA_ALIGNMENT, but right now it
certainly serves both purposes, which is wrong.  We need one for the
ABI-mandated alignment and one for optimization alignment beyond that: the
optimization alignment can be used if it can be proved that we'll bind to the
optimized decl, but the ABI alignment has to be used otherwise.

E.g. the x86_64 ABI says that certain arrays are aligned in a particular way,
which is certainly something beyond what TYPE_ALIGN provides (changing
TYPE_ALIGN of the arrays would affect the layout of structures, which would be
wrong).
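
The distinction between type alignment and ABI-mandated variable alignment can
be illustrated in plain C. This is a sketch, assuming the x86-64 psABI rule
that an array variable of at least 16 bytes gets at least 16-byte alignment;
the struct and function names are made up for the example.

```c
#include <stddef.h>

/* A struct that embeds a 32-byte char array right after a 1-byte member. */
struct with_array {
    char tag;
    char buf[32];
};

/* The array *type* keeps the alignment of its element type ... */
_Static_assert(_Alignof(char[32]) == 1, "type alignment of char[32] is 1");

/* ... so inside a struct there is no padding in front of it.  Raising the
   type's alignment to 16 would move buf to offset 16 and change the layout. */
_Static_assert(offsetof(struct with_array, buf) == 1, "buf follows tag");

size_t array_type_align(void)
{
    return _Alignof(char[32]);
}

size_t buf_offset(void)
{
    return offsetof(struct with_array, buf);
}
```

A standalone variable such as `char g[32];` may still be placed on a 16-byte
boundary by the ABI rule, which is exactly the extra guarantee that TYPE_ALIGN
alone cannot express.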


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-05-25 Thread sandra at codesourcery dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

Sandra Loosemore  changed:

   What|Removed |Added

 CC||sandra at codesourcery dot com

--- Comment #11 from Sandra Loosemore  ---
This affects at least PowerPC, too, which implements DATA_ALIGNMENT to add
additional alignment beyond that specified by the ABI.

Isn't TYPE_ALIGN already supposed to return the ABI-mandated alignment for
objects of a given type?  The documentation for DATA_ALIGNMENT already suggests
that its purpose is to add additional alignment for optimization purposes and I
suspect other targets may be using it that way, too.  Perhaps what's needed
here is more careful monitoring of the places where DATA_ALIGNMENT is being
used, rather than splitting it into two macros or adding an argument to control
the two uses.  Or at least, we'd have to clarify how the requirements for the
ABI-conforming use of DATA_ALIGNMENT differ from what TYPE_ALIGN is supposed to
do.

It seems to me that DATA_ALIGNMENT's original purpose was to add additional
alignment on variable definitions, and IIUC the problem now is either that it
is being used in other contexts or that its intended use is not taking into
account common, weak, and/or comdat definitions where the linker may substitute
a less-aligned definition from another compilation unit.  

Also, somebody should check whether vect_can_force_dr_alignment_p in
tree-vect-data-refs.c is catching all the cases it needs to for ABI
conformance.


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-04-08 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

Jan Hubicka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 AssignedTo|unassigned at gcc dot gnu.org |hubicka at gcc dot gnu.org

--- Comment #10 from Jan Hubicka  2013-04-08 15:22:21 UTC ---
Mine.


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-03-08 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #9 from Jakub Jelinek  2013-03-08 12:38:20 UTC ---
Smaller testcase (-O2 -fpic):

struct S { long a, b; } s;
int foo (void)
{
  return ((long) &s) & 15;
}

Since http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162943 this is optimized
into return 0, even though the psABI (probably) doesn't guarantee that.  But
e.g. for __builtin_memset (&s, 0, sizeof (s)); one can already see in 4.0 RTL
dumps with -O2 -fpic that MEM_ALIGN of s is assumed to be 128-bit.
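
For contrast, here is a sketch of how source code that really depends on
16-byte alignment can state that requirement itself instead of relying on the
compiler's optimization alignment. It is not part of the testcase or of any
fix; the names s_aligned, low_bits and foo_aligned are hypothetical.

```c
#include <stdint.h>

struct S { long a, b; };

/* C11 _Alignas makes the 16-byte alignment part of the declaration, so
   it no longer depends on what the compiler chooses for optimization. */
_Alignas(16) struct S s_aligned;

/* Low four bits of an address; zero means 16-byte aligned. */
int low_bits(const void *p)
{
    return (int)((uintptr_t)p & 15);
}

int foo_aligned(void)
{
    return low_bits(&s_aligned);
}
```

Here folding foo_aligned to return 0 is justified by the declaration, whereas
for the plain `s` in the testcase the type only guarantees the 8-byte
alignment of long.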


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-03-08 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

--- Comment #8 from Jakub Jelinek  2013-03-08 11:35:36 UTC ---
Guess we'd need to split DATA_ALIGNMENT into two different macros (or one with
an extra argument), so that align_variable would know what alignment is part of
ABI and what is just an optimization above that, then align_variable could call
targetm.binds_local_p to see if DECL_ALIGN can be increased to the optimization
level or needs to stay at the ABI guaranteed level.  And then when assembling
vars, we'd increase the emitted alignment to the optimization level.
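
The proposal can be summarized as a toy decision function. This is only a
model of the idea as described in this comment, not GCC code: the struct, the
function and all parameter names are hypothetical, and alignments are given in
bits.

```c
#include <stdbool.h>

/* Toy model of the two alignments involved (values in bits). */
struct align_choice {
    unsigned assumed;  /* what accesses may rely on (the DECL_ALIGN role) */
    unsigned emitted;  /* what is used when the variable is assembled */
};

static unsigned max_u(unsigned a, unsigned b)
{
    return a > b ? a : b;
}

/* binds_local stands for "this definition is the one that will be used",
   i.e. the answer the comment expects from targetm.binds_local_p. */
struct align_choice choose_align(unsigned type_align, unsigned abi_align,
                                 unsigned opt_align, bool binds_local)
{
    unsigned abi = max_u(type_align, abi_align);  /* guaranteed level */
    unsigned opt = max_u(abi, opt_align);         /* optimization level */
    struct align_choice c;

    c.assumed = binds_local ? opt : abi;  /* never assume more than is safe */
    c.emitted = opt;                      /* always emit the larger value */
    return c;
}
```

For the struct from the smaller testcase (two longs, 64-bit ABI alignment,
128-bit optimization alignment), a definition that may be interposed gives
assumed = 64 and emitted = 128, while a locally bound one gives 128 for both.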


[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3

2013-03-08 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564

Richard Biener  changed:

   What|Removed |Added

   Keywords||ABI, wrong-code
 Target||x86_64-*-*
 Status|WAITING |NEW

--- Comment #7 from Richard Biener  2013-03-08 11:26:19 UTC ---
Confirmed.