[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-02-22 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #8 from Wilco  ---
(In reply to Jakub Jelinek from comment #7)
> cfun->has_nonlocal_label instead of cfun->calls_setjmp would cover
> __builtin_setjmp.
> 
> aarch64_frame_pointer_required would force frame_pointer_needed and thus be
> true in that case too.  But sure, if it works, we can change:
>   /* Force a frame chain for EH returns so the return address is at FP+8.  */
>   cfun->machine->frame.emit_frame_chain
> -    = frame_pointer_needed || crtl->calls_eh_return;
> +    = frame_pointer_needed || crtl->calls_eh_return || cfun->has_nonlocal_label;

Note I'm not convinced this is sufficient. I tried compiling
testsuite/gcc.c-torture/execute/pr60003.c and it appears to mess up the frame
pointer so it no longer points to a frame chain:

baz:
adrp x1, .LANCHOR0
add x0, x1, :lo12:.LANCHOR0
stp x29, x30, [sp, -16]!
mov x29, sp
ldr x2, [x0, 8]
ldr x29, [x1, #:lo12:.LANCHOR0]  // load of bad frame pointer
ldr x0, [x0, 16]
mov sp, x0
br  x2

foo:
stp x29, x30, [sp, -176]!
adrp x2, .LANCHOR0
add x1, x2, :lo12:.LANCHOR0
mov x29, sp
add x3, sp, 176  // store of bad frame pointer
str x3, [x2, #:lo12:.LANCHOR0]
stp x19, x20, [sp, 16]
stp x21, x22, [sp, 32]
stp x23, x24, [sp, 48]
stp x25, x26, [sp, 64]
stp x27, x28, [sp, 80]
stp d8, d9, [sp, 96]
stp d10, d11, [sp, 112]
stp d12, d13, [sp, 128]
stp d14, d15, [sp, 144]
str w0, [sp, 172]
adrp x0, .L7
add x0, x0, :lo12:.L7
str x0, [x1, 8]
mov x0, sp
str x0, [x1, 16]
bl  baz
.p2align 2
.L7:
ldr w0, [sp, 172]
ldp x19, x20, [sp, 16]
ldp x21, x22, [sp, 32]
ldp x23, x24, [sp, 48]
ldp x25, x26, [sp, 64]
ldp x27, x28, [sp, 80]
ldp d8, d9, [sp, 96]
ldp d10, d11, [sp, 112]
ldp d12, d13, [sp, 128]
ldp d14, d15, [sp, 144]
ldp x29, x30, [sp], 176
ret

What should happen is that it stores the actual sp/fp just before calling baz,
and baz then restores those before jumping to L7.
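
For reference, the usage pattern under discussion can be sketched in C roughly
as follows. This is only a minimal illustration, not the pr60003.c testcase
itself: the names jump_buf, jump_back and roundtrip are hypothetical stand-ins
for the buffer, baz and foo above. The local is volatile so that it is reloaded
from foo's frame after the jump, which only gives the right value if sp/fp were
restored correctly.

```c
/* __builtin_setjmp uses a buffer of five pointer-sized words.  */
static void *jump_buf[5];

/* Hypothetical stand-in for baz: jumps back into the caller.
   noinline keeps it out of the function that calls __builtin_setjmp,
   because the two builtins must not end up in the same function.  */
__attribute__((noinline)) static void
jump_back (void)
{
  __builtin_longjmp (jump_buf, 1);  /* the second argument must be 1 */
}

/* Hypothetical stand-in for foo: returns its argument after the
   non-local jump has come back to it.  */
__attribute__((noinline)) int
roundtrip (int value)
{
  volatile int saved = value;   /* lives in this function's frame */

  if (__builtin_setjmp (jump_buf) == 0)
    jump_back ();               /* does not return normally */

  /* Reached via __builtin_longjmp; reads the frame again.  */
  return saved;
}
```

Calling roundtrip (42) should return 42 when the frame is restored properly.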

[Bug c/84522] New: GCC does not generate cmpxchg16b when mcx16 is used

2018-02-22 Thread nruslan_devel at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84522

Bug ID: 84522
   Summary: GCC does not generate cmpxchg16b when mcx16 is used
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nruslan_devel at yahoo dot com
  Target Milestone: ---

I looked up similar bugs, but I could not quite understand why GCC redirects a
128-bit compare-and-exchange to libatomic on x86-64 even when the '-mcx16' flag
is specified, especially since the similar cmpxchg8b on 32-bit x86 is still
used without redirecting to libatomic.

PR 80878 mentioned something about read-only memory, but that should only apply
to atomic_load, not atomic_compare_exchange. Right?

It is especially annoying because libatomic does not guarantee lock-freedom, so
these functions become useless in many cases.
This compiler behavior is inconsistent with clang.

For instance, for the following code:

#include <stdatomic.h>

__uint128_t cmpxhg_weak(_Atomic(__uint128_t) * obj, __uint128_t * expected,
__uint128_t desired)
{
return atomic_compare_exchange_weak(obj, expected, desired);
}

GCC generates:

(gcc -std=c11 -mcx16 -Wall -O2 -S test.c)

cmpxhg_weak:
subq    $8, %rsp
movl    $5, %r9d
movl    $5, %r8d
call    __atomic_compare_exchange_16@PLT
xorl    %edx, %edx
movzbl  %al, %eax
addq    $8, %rsp
ret

While clang/llvm generates the code which is obviously lock-free:
cmpxhg_weak:                            # @cmpxhg_weak
pushq   %rbx
movq    %rdx, %r8
movq    (%rsi), %rax
movq    8(%rsi), %rdx
xorl    %r9d, %r9d
movq    %r8, %rbx
lock cmpxchg16b  (%rdi)
sete    %cl
je      .LBB0_2
movq    %rax, (%rsi)
movq    %rdx, 8(%rsi)
.LBB0_2:
movb    %cl, %r9b
xorl    %edx, %edx
movq    %r9, %rax
popq    %rbx
retq

However, for 32-bit GCC still generates cmpxchg8b:

#include <stdatomic.h>
#include <stdint.h>

uint64_t cmpxhg_weak(_Atomic(uint64_t) * obj, uint64_t * expected, uint64_t
desired)
{
return atomic_compare_exchange_weak(obj, expected, desired);
}

gcc -std=c11 -m32 -Wall -O2 -S test.c


cmpxhg_weak:
pushl   %edi
pushl   %esi
pushl   %ebx
movl    20(%esp), %esi
movl    24(%esp), %ebx
movl    28(%esp), %ecx
movl    16(%esp), %edi
movl    (%esi), %eax
movl    4(%esi), %edx
lock cmpxchg8b  (%edi)
movl    %edx, %ecx
movl    %eax, %edx
sete    %al
je      .L2
movl    %edx, (%esi)
movl    %ecx, 4(%esi)
.L2:
popl    %ebx
movzbl  %al, %eax
xorl    %edx, %edx
popl    %esi
popl    %edi
ret

[Bug target/84522] GCC does not generate cmpxchg16b when mcx16 is used

2018-02-22 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84522

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |target

--- Comment #1 from Andrew Pinski  ---
IIRC this was done because there is no atomic load/stores or a way to do
backwards compatible.

[Bug target/84522] GCC does not generate cmpxchg16b when mcx16 is used

2018-02-22 Thread nruslan_devel at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84522

--- Comment #2 from Ruslan Nikolaev  ---
Yes, but not having atomic_load is far less an issue. Oftentimes, algorithms
that use 128-bit can simply use compare_and_exchange only (at least for
x86-64).

[Bug target/84522] GCC does not generate cmpxchg16b when mcx16 is used

2018-02-22 Thread nruslan_devel at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84522

--- Comment #3 from Ruslan Nikolaev  ---
(In reply to Ruslan Nikolaev from comment #2)
> Yes, but not having atomic_load is far less an issue. Oftentimes, algorithms
> that use 128-bit can simply use compare_and_exchange only (at least for
> x86-64).

In other words, can atomic_load be redirected to libatomic while
compare_exchange still be generated directly (if -mcx16 is specified)?
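
To illustrate what I mean by using compare_and_exchange only, here is a minimal
sketch of a load built purely from a compare-exchange. The helper name
load_via_cas is hypothetical, and the sketch uses uint64_t instead of
__uint128_t only so that it builds without libatomic; the idea for the 128-bit
case is the same. Note that a compare-exchange may write to the location, which
is why this pattern cannot be used on read-only memory.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: read *obj using only compare-and-exchange.
   If *obj is 0, the exchange succeeds and stores 0 again (no visible
   change).  Otherwise it fails and copies the current value into
   'expected'.  Either way 'expected' ends up holding the value.  */
uint64_t
load_via_cas (_Atomic uint64_t *obj)
{
  uint64_t expected = 0;
  atomic_compare_exchange_strong (obj, &expected, 0);
  return expected;
}
```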

[Bug fortran/83148] [8 regression] ICE: crash_signal from toplev.c:325

2018-02-22 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83148

--- Comment #6 from Thomas Koenig  ---
The problem seems to be that gfc_conv_initializer does not look
through

(gdb) p *expr
$1 = {expr_type = EXPR_STRUCTURE, ts = {type = BT_DERIVED, kind = 0,

to

(gdb) p *(expr->ts.u.derived->components->ts->u.derived)


$22 = {name = 0x77487080 "c_ptr", module = 0x77483410
"__iso_c_binding", declared_at = {nextc = 0x25672b8, lb = 0x2567280}, ts = {
type = BT_INTEGER, kind = 8, u = {derived = 0x256fc00, cl = 0x256fc00, pad
= 39255040}, interface = 0x0, is_c_interop = 1, is_iso_c = 1

It is, of course, open if it should need to... without adding the
vtab, gfc_conv_initializer does not even appear to be called.

[Bug c++/84453] [8 Regression] ICE in build_type_attribute_qual_variant, at attribs.c:1166

2018-02-22 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84453

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
           Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

--- Comment #6 from Jason Merrill  ---
Yes.

[Bug target/84522] GCC does not generate cmpxchg16b when mcx16 is used

2018-02-22 Thread nruslan_devel at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84522

--- Comment #4 from Ruslan Nikolaev  ---
I guess in this case you would have to fall back to a lock-based implementation
for everything. But does C11 even require that atomic_load work on read-only
memory?

[Bug target/81572] [7/8 Regression] gcc-7 regression: unnecessary vector regmove on compare

2018-02-22 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81572

--- Comment #4 from Vladimir Makarov  ---
Author: vmakarov
Date: Thu Feb 22 21:17:51 2018
New Revision: 257915

URL: https://gcc.gnu.org/viewcvs?rev=257915&root=gcc&view=rev
Log:
2018-02-22  Vladimir Makarov  

PR target/81572
* lra-int.h (LRA_UNKNOWN_ALT, LRA_NON_CLOBBERED_ALT): New macros.
* lra.c (lra_set_insn_recog_data, lra_update_insn_recog_data): Use
LRA_UNKNOWN_ALT.
* lra-constraints.c (curr_insn_transform): Set up
LRA_NON_CLOBBERED_ALT for moves processed on the fast path.  Use
LRA_UNKNOWN_ALT.
(remove_inheritance_pseudos): Use LRA_UNKNOWN_ALT.
* lra-eliminations.c (spill_pseudos): Ditto.
(process_insn_for_elimination): Ditto.
* lra-lives.c (reg_early_clobber_p): Use the new macros.
* lra-spills.c (spill_pseudos): Use LRA_UNKNOWN_ALT and
LRA_NON_CLOBBERED_ALT.

2018-02-22  Vladimir Makarov  

PR target/81572
* gcc.target/powerpc/pr81572.c: New.


Added:
trunk/gcc/testsuite/gcc.target/powerpc/pr81572.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/lra-constraints.c
trunk/gcc/lra-eliminations.c
trunk/gcc/lra-int.h
trunk/gcc/lra-lives.c
trunk/gcc/lra-spills.c
trunk/gcc/lra.c
trunk/gcc/testsuite/ChangeLog

[Bug c++/84518] [8 Regression] ICE with lambda capturing broken variable

2018-02-22 Thread dmalcolm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84518

David Malcolm  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-02-22
                 CC|                            |dmalcolm at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from David Malcolm  ---
Thanks for filing this report.

Confirmed.  Both ICEs started with r253265.

The first ICE (testcase in comment #0) happens at line 446 of lambda.c in
build_capture_proxy here:

441   if (DECL_NORMAL_CAPTURE_P (member))
442 {
443   if (DECL_VLA_CAPTURE_P (member))
444 {
445   init = CONSTRUCTOR_ELT (init, 0)->value;
446   init = TREE_OPERAND (init, 0); // Strip ADDR_EXPR.
447   init = TREE_OPERAND (init, 0); // Strip ARRAY_REF.
448 }

where "init" is error_mark.


The second ICE (testcase in comment #1):

Happens at line 288 of lambda.c in is_normal_capture_proxy here:
288   gcc_assert (TREE_CODE (val) == COMPONENT_REF);

where val is a NOP_EXPR around a COMPONENT_REF (casting from T* to reference to
T[]).

[Bug c++/84516] bitfield temporaries > 32-bits have wrong type

2018-02-22 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84516

--- Comment #2 from joseph at codesourcery dot com  ---
See also bug 70733, another bug with these types being user-exposed for 
bit-fields for C++.  For C++ (unlike C), the existence of these types 
internally in the compiler should never be user-visible, because bit-field 
width is explicitly not part of the type for C++.

[Bug fortran/84523] New: [8 Regression] Runtime crash deallocating allocatable array within derived type

2018-02-22 Thread anlauf at gmx dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84523

Bug ID: 84523
   Summary: [8 Regression] Runtime crash deallocating allocatable
array within derived type
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: anlauf at gmx dot de
  Target Milestone: ---

Created attachment 43490
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43490&action=edit
Reproducer

The attached code crashes when checking allocatable components within
a derived type:

 ### destruct: size(rc% spots)=  80
 ### destruct: allocated (vm) = F

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0xe3ff in ???
#1  0x8048a5e in destruct
at /work/DWD/git/dace_code/gfcbug148.f90:33
#2  0x8048be8 in gfcbug148
at /work/DWD/git/dace_code/gfcbug148.f90:12
#3  0x8049088 in main
at /work/DWD/git/dace_code/gfcbug148.f90:13


The program runs without problems with any version 4.8 through 7.2:

 ### destruct: size(rc% spots)=  80
 ### destruct: allocated (vm) = F
 OK

[Bug target/82851] [8 regression] g++.dg/vect/slp-pr56812.cc, i386/avx2-vpaddq-3.c, i386/avx2-vpsubq-3.c fails

2018-02-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82851

--- Comment #7 from Jakub Jelinek  ---
Author: jakub
Date: Thu Feb 22 21:27:44 2018
New Revision: 257916

URL: https://gcc.gnu.org/viewcvs?rev=257916&root=gcc&view=rev
Log:
PR target/82851
* gcc.target/i386/avx2-vpaddq-3.c: Add -mtune=generic to dg-options.
* gcc.target/i386/avx2-vpsubq-3.c: Likewise.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/i386/avx2-vpaddq-3.c
trunk/gcc/testsuite/gcc.target/i386/avx2-vpsubq-3.c

[Bug target/82851] [8 regression] g++.dg/vect/slp-pr56812.cc, i386/avx2-vpaddq-3.c, i386/avx2-vpsubq-3.c fails

2018-02-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82851

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Jakub Jelinek  ---
Fixed.

[Bug target/83964] [8 Regression] ICE in extract_insn, at recog.c:2304

2018-02-22 Thread carll at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83964

--- Comment #10 from Carl Love  ---
These builtins were per a request from Steve Monroe.  Not sure why he wanted
them or if he actually ever used them.

[Bug fortran/59781] [6 Regression] [F03] Incorrect initialisation of derived type

2018-02-22 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59781

--- Comment #18 from Thomas Koenig  ---
Author: tkoenig
Date: Thu Feb 22 22:01:53 2018
New Revision: 257917

URL: https://gcc.gnu.org/viewcvs?rev=257917&root=gcc&view=rev
Log:
2018-02-22  Thomas Koenig  

PR fortran/59781
* gfortran.dg/derived_init_5.f90: New test.


Added:
trunk/gcc/testsuite/gfortran.dg/derived_init_5.f90
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug fortran/59781] [6 Regression] [F03] Incorrect initialisation of derived type

2018-02-22 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59781

Thomas Koenig  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED

--- Comment #19 from Thomas Koenig  ---
Test case committed, closing.

[Bug fortran/84346] Statement functions should not accept keywords

2018-02-22 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84346

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---
I have a patch.

[Bug c++/84424] [8 Regression] ICE on C++ code: tree check: expected record_type or union_type or qual_union_type, have vector_type in reduced_constant_expression_p, at cp/constexpr.c:1766

2018-02-22 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84424

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-02-22 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #9 from Wilco  ---
(In reply to Jakub Jelinek from comment #7)
> cfun->has_nonlocal_label instead of cfun->calls_setjmp would cover
> __builtin_setjmp.

Do non-local labels do the same odd thing? It seems to me if the mid-end
automatically inserts explicit writes to the frame pointer, it should also set
frame_pointer_needed. This may be a bug on other targets too.

Also a much better implementation would use a small landing pad in the function
that does the __builtin_setjmp (rather than inlining it in a different
function), so you avoid the frame pointer corruption. E.g.

baz:
...
ldr x1, [x0, 8]
br  x1

L7_nonlocal: (landing pad in foo)
ldr x29, [x0, 16]
ldr sp,  [x0]
b   .L7

Or maybe we should get rid of these horrible hacks altogether?

[Bug c++/84424] [8 Regression] ICE on C++ code: tree check: expected record_type or union_type or qual_union_type, have vector_type in reduced_constant_expression_p, at cp/constexpr.c:1766

2018-02-22 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84424

--- Comment #5 from Jason Merrill  ---
Author: jason
Date: Thu Feb 22 22:50:37 2018
New Revision: 257924

URL: https://gcc.gnu.org/viewcvs?rev=257924&root=gcc&view=rev
Log:
PR c++/84424 - ICE with constexpr and __builtin_shuffle.

* constexpr.c (reduced_constant_expression_p): Handle CONSTRUCTOR of
VECTOR_TYPE.

Added:
trunk/gcc/testsuite/g++.dg/ext/vector34.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/constexpr.c

[Bug c++/84424] [8 Regression] ICE on C++ code: tree check: expected record_type or union_type or qual_union_type, have vector_type in reduced_constant_expression_p, at cp/constexpr.c:1766

2018-02-22 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84424

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Jason Merrill  ---
Fixed.

[Bug c++/70468] [6/7/8 Regression] ICE on invalid code on x86_64-linux-gnu in emit_mem_initializers, at cp/init.c:1109

2018-02-22 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70468

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

[Bug c/84524] New: -O3 causes behavior change

2018-02-22 Thread caleb.fujimori at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84524

Bug ID: 84524
   Summary: -O3 causes behavior change
   Product: gcc
   Version: 5.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: caleb.fujimori at gmail dot com
  Target Milestone: ---

Created attachment 43491
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43491&action=edit
Preprocessed source

Compiling with -march=native -O3 causes an array initialization loop to fill
all elements with 61215. Disabling AVX512 with -mno-avx512f or -mno-avx512bw
creates the expected behavior.

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
5.4.0-6ubuntu1~16.04.9' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-5 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib
--disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-Wextra' '-std=c99'
'-march=native' '-O3' '-o' 'avx512bug'
 /usr/lib/gcc/x86_64-linux-gnu/5/cc1 -E -quiet -v -imultiarch x86_64-linux-gnu
avx512bug.c -march=knl -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a
-mcx16 -msahf -mmovbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mno-lwp -mfma
-mno-fma4 -mno-xop -mbmi -mbmi2 -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt
-mrtm -mhle -mrdrnd -mf16c -mfsgsbase -mrdseed -mprfchw -madx -mfxsr -mxsave
-mxsaveopt -mavx512f -mno-avx512er -mavx512cd -mno-avx512pf -mno-prefetchwt1
-mclflushopt -mxsavec -mxsaves -mavx512dq -mavx512bw -mno-avx512vl
-mno-avx512ifma -mno-avx512vbmi -mclwb -mno-pcommit -mno-mwaitx --param
l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=14080
-mtune=generic -std=c99 -Wall -Wextra -O3 -fpch-preprocess
-fstack-protector-strong -Wformat-security -o avx512bug.i
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-linux-gnu/5/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/5/include
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-Wextra' '-std=c99'
'-march=native' '-O3' '-o' 'avx512bug'
 /usr/lib/gcc/x86_64-linux-gnu/5/cc1 -fpreprocessed avx512bug.i -march=knl
-mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe
-maes -mno-sha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi
-mbmi2 -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mrtm -mhle -mrdrnd
-mf16c -mfsgsbase -mrdseed -mprfchw -madx -mfxsr -mxsave -mxsaveopt -mavx512f
-mno-avx512er -mavx512cd -mno-avx512pf -mno-prefetchwt1 -mclflushopt -mxsavec
-mxsaves -mavx512dq -mavx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi
-mclwb -mno-pcommit -mno-mwaitx --param l1-cache-size=32 --param
l1-cache-line-size=64 --param l2-cache-size=14080 -mtune=generic -quiet
-dumpbase avx512bug.c -auxbase avx512bug -O3 -Wall -Wextra -std=c99 -version
-fstack-protector-strong -Wformat-security -o avx512bug.s
GNU C99 (Ubuntu 5.4.0-6ubuntu1~16.04.9) version 5.4.0 20160609
(x86_64-linux-gnu)
compiled by GNU C version 5.4.0 20160609, GMP version 6.1.0, MPFR
version 3.1.4, MPC version 1.0.3
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C99 (Ubuntu 5.4.0-6ubuntu1~16.04.9) version 5.4.0 20160609
(x86_64-linux-gnu)
compiled by GNU C version 5.4.0 20160609, GMP version 6.1.0, MPFR
version 3.1.4, MPC version 1.0.3
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: d079eab342c322d6be59e8628e10ae67
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-We

[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-02-22 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #10 from Ramana Radhakrishnan  ---
(In reply to Jakub Jelinek from comment #4)
> Is the requirement just for functions that contain setjmp?  If so, the
> backend could just force frame pointers in cfun->calls_setjmp functions.

I think we should flip back fno-omit-frame-pointer on for gcc-8 as that breaks
the guarantee that we've had in the port for quite a while. I'm testing a patch
currently that I will get out first thing tomorrow to turn this back on.

If we want to turn it off that should be a conscious decision.


> 
> If not, even if the default is tweaked again to be -fno-omit-frame-pointer
> on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer,
> even before that change.

I think we should treat that as a separate but related issue.


Ramana

[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-02-22 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #11 from Wilco  ---
(In reply to Ramana Radhakrishnan from comment #10)
> (In reply to Jakub Jelinek from comment #4)
> > Is the requirement just for functions that contain setjmp?  If so, the
> > backend could just force frame pointers in cfun->calls_setjmp functions.
> 
> I think we should flip back fno-omit-frame-pointer on for gcc-8 as that
> breaks the guarantee that we've had in the port for quite a while. I'm
> testing a patch currently that I will get out first thing tomorrow to turn
> this back on.
> 
> If we want to turn it off that should be a conscious decision.
> 
> 
> > 
> > If not, even if the default is tweaked again to be -fno-omit-frame-pointer
> > on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer,
> > even before that change.
> 
> I think we should treat that as a separate but related issue.
> 
> 
> Ramana

The code is clearly incorrect even when the frame pointer is enabled, so this
has absolutely nothing to do with the frame pointer default. Like the eh_return
builtin, the implementation of these builtins is incorrect with or without a
frame pointer (and apparently always has been).

[Bug target/84522] GCC does not generate cmpxchg16b when mcx16 is used

2018-02-22 Thread nruslan_devel at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84522

--- Comment #5 from Ruslan Nikolaev  ---
(In reply to Andrew Pinski from comment #1)
> IIRC this was done because there is no atomic load/stores or a way to do
> backwards compatible.

After more thinking about it... Shouldn't this be controlled by some flag
(similar to -mcx16, which enables cmpxchg16b)? This flag would basically say
that atomic_load on 128-bit types will not work on read-only memory. I think
that is better than unconditionally disabling the lock-free implementation for
128-bit types in C11 (which is useful in a number of cases) just to accommodate
some rare cases where the memory being accessed is read-only. That would also
be more portable and compatible with other compilers such as clang.

[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-02-22 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #12 from Wilco  ---
Note PR64242 is related (also frame pointer corruption by __builtin_longjmp).

[Bug c++/84525] New: GCC7: generate movaps instruction when assign to unaligned __int128*

2018-02-22 Thread buaa.zhaoc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84525

Bug ID: 84525
   Summary: GCC7: generate movaps instruction when assign to
unaligned __int128*
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: buaa.zhaoc at gmail dot com
  Target Milestone: ---

Compile the following code; running the resulting binary causes a segmentation
fault. The GCC version is 7.3, the target is x86_64-pc-linux-gnu, and GCC was
configured with:
../gcc-7.3.0/configure CFLAGS=-O2 LDFLAGS=-static --enable-gold=yes
--enable-languages=c,c++ --enable-c99 --enable-threads=posix
--enable-__cxa_atexit --disable-multilib --disable-bootstrap

g++ -O3 a.cc b.cc

a.cc 
#include <iostream>

extern void set_to_max(char* abc);

int main() {
char* abc = new char[100];
set_to_max(abc + 1);
std::cout << (int64_t)(((*(__int128*)(abc+1))) >> 64) << std::endl;
delete[] abc;
return 0;
}

b.cc
void set_to_max(char* abc) {
*reinterpret_cast<__int128*>(abc) = ~((__int128)1 << 127);
}

When I execute the following command:
g++ -O3 -S b.cc

the following asm code is generated:

.file   "b.cc"
.text
.p2align 4,,15
.globl  _Z10set_to_maxPc
.type   _Z10set_to_maxPc, @function
_Z10set_to_maxPc:
.LFB0:
.cfi_startproc
movdqa  .LC0(%rip), %xmm0
movaps  %xmm0, (%rdi)
ret
.cfi_endproc
.LFE0:
.size   _Z10set_to_maxPc, .-_Z10set_to_maxPc
.section        .rodata.cst16,"aM",@progbits,16
.align 16
.LC0:
.quad   -1
.quad   9223372036854775807
.ident  "GCC: (GNU) 7.3.0"
.section        .note.GNU-stack,"",@progbits

The movaps instruction causes the segmentation fault.

[Bug c++/84525] GCC7: generate movaps instruction when assign to unaligned __int128*

2018-02-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84525

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |jakub at gcc dot gnu.org
         Resolution|---                         |INVALID

--- Comment #1 from Jakub Jelinek  ---
That testcase is invalid C, -fsanitize=undefined would even tell you.
__int128 requires 16-byte alignment and you're violating that.
You can use __int128 temp; memcpy (&temp, abc, sizeof (temp)); ... use temp ...
or __int128 in a __attribute__((packed)) structure etc. to read unaligned
objects.
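
The memcpy approach can be sketched as below. This is a minimal illustration
only: the function names store_int128_unaligned and load_int128_unaligned are
hypothetical, and it assumes a target where GCC provides __int128. Because the
copy goes through memcpy, the compiler may not assume that the char pointer is
16-byte aligned.

```c
#include <string.h>

/* Hypothetical helper: store a 128-bit value at an arbitrarily
   aligned address by copying its bytes.  */
void
store_int128_unaligned (char *dst, __int128 value)
{
  memcpy (dst, &value, sizeof value);
}

/* Hypothetical helper: load a 128-bit value from an arbitrarily
   aligned address into a properly aligned temporary.  */
__int128
load_int128_unaligned (const char *src)
{
  __int128 temp;
  memcpy (&temp, src, sizeof temp);
  return temp;
}
```

With these, set_to_max could call store_int128_unaligned (abc, value) instead
of casting abc to __int128 *, and main could read the value back with
load_int128_unaligned (abc + 1).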
