[Bug rtl-optimization/65783] after reload, the memrefs_conflict_p is unreliable?

2015-04-16 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65783

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org ---
*** Bug 65784 has been marked as a duplicate of this bug. ***


[Bug c/65784] after reload, the memrefs_conflict_p is unreliable?

2015-04-16 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65784

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org ---
.

*** This bug has been marked as a duplicate of bug 65783 ***


[Bug c++/62182] New warning wished: operator== and equality comparison result unused [-Wunused-comparison]/-Wunsed-value

2015-04-16 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62182

--- Comment #5 from Marek Polacek mpolacek at gcc dot gnu.org ---
(In reply to Arnaud Bienner from comment #3)
 One thing that doesn't work is turning on this warning using
 -Wunused-comparison parameter. But surprisingly, turning it off with
 -Wno-unused-comparison (when -Wunused or -Wall is used) works. Not sure what
 I'm missing here.

Because the emit_side_effect_warnings calls are guarded by warn_unused_value.
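
A minimal sketch of why the option behaves asymmetrically, assuming a simplified model in which the new warning has its own flag (called warn_unused_comparison here, a hypothetical name) that is only consulted inside a region guarded by warn_unused_value. The struct and function are illustrative and are not GCC source.

```cpp
// Simplified model of the guard described above; not actual GCC source.
// warn_unused_comparison is a hypothetical flag name for the new option.
struct WarningFlags
{
  bool warn_unused_value;       // set by -Wunused-value, -Wunused or -Wall
  bool warn_unused_comparison;  // set by -Wunused-comparison (hypothetical)
};

// The inner check is only reached when the outer guard is true, so the
// new flag alone cannot enable the warning, but clearing it can disable it.
bool comparison_warning_fires (const WarningFlags &flags)
{
  if (!flags.warn_unused_value)
    return false;
  return flags.warn_unused_comparison;
}
```

Under this model, -Wunused-comparison on its own never reaches the inner check, while -Wall -Wno-unused-comparison reaches it and is suppressed there.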


[Bug c/65784] New: after reload, the memrefs_conflict_p is unreliable?

2015-04-16 Thread wangjiefeng at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65784

Bug ID: 65784
   Summary: after reload, the memrefs_conflict_p is unreliable?
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wangjiefeng at huawei dot com

int f = -1;
int foo(int * pa)
{
  int a = 1;
  *(pa) = a;
  pa = pa + f;
  a = *(pa + 1);
  return a;
}

With -O2, the ARM's assembler is as follows:
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        movw    r3, #:lower16:.LANCHOR0 @ 20    *arm_movsi_insn/4       [length = 4]
        mov     r2, #1                  @ 6     *arm_movsi_insn/2       [length = 4]
        movt    r3, #:upper16:.LANCHOR0 @ 21    *arm_movt               [length = 4]
        str     r2, [r0]                @ 7     *arm_movsi_insn/6       [length = 4]
        ldr     r3, [r3]                @ 9     *arm_movsi_insn/5       [length = 4]
        add     r0, r0, r3, asl #2      @ 11    *arith_shiftsi/1        [length = 4]
        ldr     r0, [r0, #4]            @ 17    *arm_movsi_insn/5       [length = 4]
        bx      lr                      @ 26    *arm_return             [length = 12]
        .size   foo, .-foo
        .global f
        .data
        .align  2

In sched1, insn 7 and insn 17 have a true dependence, but in sched2 the true
dependence between insn 7 and insn 17 is omitted.
It seems that after reload, in function true_dependence_1, memrefs_conflict_p is
unreliable?


[Bug c/65783] New: after reload, the memrefs_conflict_p is unreliable?

2015-04-16 Thread wangjiefeng at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65783

Bug ID: 65783
   Summary: after reload, the memrefs_conflict_p is unreliable?
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wangjiefeng at huawei dot com

int f = -1;
int foo(int * pa)
{
  int a = 1;
  *(pa) = a;
  pa = pa + f;
  a = *(pa + 1);
  return a;
}

With -O2, the ARM's assembler is as follows:
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        movw    r3, #:lower16:.LANCHOR0 @ 20    *arm_movsi_insn/4       [length = 4]
        mov     r2, #1                  @ 6     *arm_movsi_insn/2       [length = 4]
        movt    r3, #:upper16:.LANCHOR0 @ 21    *arm_movt               [length = 4]
        str     r2, [r0]                @ 7     *arm_movsi_insn/6       [length = 4]
        ldr     r3, [r3]                @ 9     *arm_movsi_insn/5       [length = 4]
        add     r0, r0, r3, asl #2      @ 11    *arith_shiftsi/1        [length = 4]
        ldr     r0, [r0, #4]            @ 17    *arm_movsi_insn/5       [length = 4]
        bx      lr                      @ 26    *arm_return             [length = 12]
        .size   foo, .-foo
        .global f
        .data
        .align  2

In sched1, insn 7 and insn 17 have a true dependence, but in sched2 the true
dependence between insn 7 and insn 17 is omitted.
It seems that after reload, in function true_dependence_1, memrefs_conflict_p is
unreliable?
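
To make the expected behaviour concrete, here is a small sketch built around the reported test case. With f == -1 the load from pa + f + 1 reads the very location that was just stored through pa, so the load must observe the stored 1 and the scheduler has to keep the store before the load. The run_foo driver is hypothetical and not part of the report; it passes a pointer into the middle of an array so the intermediate pointer stays inside the object.

```cpp
// Test case from the report, unchanged apart from formatting.
int f = -1;

int foo (int *pa)
{
  int a = 1;
  *(pa) = a;        // store through the incoming pointer (insn 7)
  pa = pa + f;      // f is only known at run time
  a = *(pa + 1);    // load (insn 17); same address as the store when f == -1
  return a;
}

// Hypothetical driver, not part of the report.  It passes a pointer into
// the middle of an array so that pa + f stays inside the object.
int run_foo ()
{
  int cells[3] = { 0, 0, 0 };
  return foo (&cells[1]);
}
```

A correctly compiled run_foo returns 1; any other value would indicate that the load was moved ahead of the store.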


[Bug target/65780] [5 Regression] Uninitialized common handling in executables

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #10 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to H.J. Lu from comment #9)
 Created attachment 35327 [details]
 A different patch
 
 On x86, this issue only shows up with PIE. Here is a different
 patch to treat common symbol defined locally only if the backend
 passes true common_maybe_local.  For x86-64, it is true only if
 HAVE_LD_PIE_COPYRELOC is 1.  For i386, it is always false.  If
 we aren't building PIE, common_maybe_local is true or false
 doesn't make a difference for x86 since the common symbol is
 always referenced normally with copy reloc.  For PIE on x86-64,
 common symbol is local only if HAVE_LD_PIE_COPYRELOC is 1.

+
+  /* For common symbol, it is defined locally only if common_maybe_local
+     is true.  */
+  bool defined_locally = (!DECL_EXTERNAL (exp)
+                          && (!DECL_COMMON (exp) || common_maybe_local));

I think better would be:
  bool uninited_common = (DECL_COMMON (exp)
                          && (DECL_INITIAL (exp) == NULL
                              || (!in_lto_p
                                  && DECL_INITIAL (exp) == error_mark_node)));
  /* For common symbol, it is defined locally only if common_maybe_local
     is true.  */
  bool defined_locally = (!DECL_EXTERNAL (exp)
                          && (!uninited_common || common_maybe_local));
...
and then use
  /* Uninitialized COMMON variable may be unified with symbols
     resolved from other modules.  */
  if (uninited_common && !resolved_locally)
    return false;
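
The suggested predicates can be sketched as a plain boolean model, assuming the operators missing from the archived text are logical ANDs. DeclModel and its fields are hypothetical stand-ins for the tree macros named in the comments; none of this is GCC source.

```cpp
// Boolean model of the suggested predicates; not GCC source.  Each field
// stands in for the tree macro named in its comment.
struct DeclModel
{
  bool external;            // DECL_EXTERNAL (exp)
  bool common;              // DECL_COMMON (exp)
  bool initial_null;        // DECL_INITIAL (exp) == NULL
  bool initial_error_mark;  // DECL_INITIAL (exp) == error_mark_node
};

// A common symbol counts as uninitialized when it has no initializer, or
// when the initializer is error_mark_node outside of LTO.
bool uninited_common (const DeclModel &d, bool in_lto_p)
{
  return d.common
         && (d.initial_null || (!in_lto_p && d.initial_error_mark));
}

// Only uninitialized commons depend on common_maybe_local.
bool defined_locally (const DeclModel &d, bool in_lto_p,
                      bool common_maybe_local)
{
  return !d.external
         && (!uninited_common (d, in_lto_p) || common_maybe_local);
}
```

The difference from the quoted patch is that an initialized common symbol is treated as locally defined regardless of common_maybe_local.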


[Bug c++/65786] Wrong code when using decltype to specify the return type

2015-04-16 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65786

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

   Severity|critical|normal

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org ---
I think this is undefined code: std::max returns a reference, so you are
returning a reference to a temporary variable which went out of scope.
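
The original test case is not included in this thread, so the following is only a hypothetical reconstruction of the pattern being described. It shows that decltype applied to a std::max call yields a reference type, and one possible way (std::decay) to return by value instead. The name safe_max is an assumption made for this illustration.

```cpp
#include <algorithm>
#include <type_traits>

// std::max takes and returns const T&, so decltype of the call is a
// reference type.  A function with by-value parameters that uses it as its
// return type would hand back a reference to a dead parameter.
static_assert (std::is_same<decltype (std::max (1, 2)), const int &>::value,
               "decltype of a std::max call is a reference");

// Hypothetical fix: strip the reference so the result is returned by value.
template <class T>
auto safe_max (T a, T b)
  -> typename std::decay<decltype (std::max (a, b))>::type
{
  return std::max (a, b);
}

static_assert (std::is_same<decltype (safe_max (1, 2)), int>::value,
               "safe_max returns by value");
```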


[Bug target/63633] [avr] internal compiler error: unrecognizable insn with mult insns

2015-04-16 Thread gjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63633

Georg-Johann Lay gjl at gcc dot gnu.org changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #5 from Georg-Johann Lay gjl at gcc dot gnu.org ---
Reopened on behalf of PR65657.


[Bug tree-optimization/65443] Don't peel last iteration from loop in transform_to_exit_first_loop

2015-04-16 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65443

--- Comment #16 from vries at gcc dot gnu.org ---
ping:
- https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00763.html


[Bug middle-end/65777] SPECOMP component 362.fma3d fails with error SIGSEGV, segmentation fault occurred

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65777

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Target||x86_64-*-*
 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2015-04-16
Version|unknown |4.8.3
   Target Milestone|4.8.3   |---
 Ever confirmed|0   |1

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
Waiting for a testcase and more details (compiler options?).  If you use -Ofast
(or -fstack-arrays) indeed you need to raise the stack limit size (and with OMP
that might not be enough as thread stack size might be too small again).


[Bug rtl-optimization/65783] after reload, the memrefs_conflict_p is unreliable?

2015-04-16 Thread wangjiefeng at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65783

--- Comment #3 from Jason wangjiefeng at huawei dot com ---
when sched1 the RTL is as follows:
(note 4 1 3 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 3 4 10 2 NOTE_INSN_FUNCTION_BEG)
(note 10 3 12 2 NOTE_INSN_DELETED)
(note 12 10 20 2 NOTE_INSN_DELETED)
(insn 20 12 2 2 (set (reg/f:SI 119)
(high:SI (symbol_ref:SI (*.LANCHOR0) [flags 0x182]))) tmp.c:7 196
{*arm_movsi_insn}
 (nil))
(insn 2 20 21 2 (set (reg/v/f:SI 116 [ pa ])
(reg:SI 0 r0 [ pa ])) tmp.c:4 196 {*arm_movsi_insn}
 (expr_list:REG_DEAD (reg:SI 0 r0 [ pa ])
(nil)))
(insn 21 2 6 2 (set (reg/f:SI 119)
(lo_sum:SI (reg/f:SI 119)
(symbol_ref:SI (*.LANCHOR0) [flags 0x182]))) tmp.c:7 195
{*arm_movt}
 (expr_list:REG_EQUAL (symbol_ref:SI (*.LANCHOR0) [flags 0x182])
(nil)))
(insn 6 21 7 2 (set (reg:SI 117)
(const_int 1 [0x1])) tmp.c:6 196 {*arm_movsi_insn}
 (nil))
(insn 7 6 9 2 (set (mem:SI (reg/v/f:SI 116 [ pa ]) [1 *pa_2(D)+0 S4 A32])
(reg:SI 117)) tmp.c:6 196 {*arm_movsi_insn}
 (expr_list:REG_DEAD (reg:SI 117)
(nil)))
(insn 9 7 11 2 (set (reg:SI 120 [ f ])
(mem/c:SI (reg/f:SI 119) [1 f+0 S4 A32])) tmp.c:7 196 {*arm_movsi_insn}
 (expr_list:REG_DEAD (reg/f:SI 119)
(nil)))
(insn 11 9 17 2 (set (reg/f:SI 121)
(plus:SI (mult:SI (reg:SI 120 [ f ])
(const_int 4 [0x4]))
(reg/v/f:SI 116 [ pa ]))) tmp.c:8 283 {*arith_shiftsi}
 (expr_list:REG_DEAD (reg:SI 120 [ f ])
(expr_list:REG_DEAD (reg/v/f:SI 116 [ pa ])
(nil))))
(insn 17 11 18 2 (set (reg/i:SI 0 r0)
(mem:SI (plus:SI (reg/f:SI 121)
(const_int 4 [0x4])) [1 MEM[(int *)pa_7 + 4B]+0 S4 A32]))
tmp.c:10 196 {*arm_movsi_insn}
 (expr_list:REG_DEAD (reg/f:SI 121)
(nil)))
(insn 18 17 22 2 (use (reg/i:SI 0 r0)) tmp.c:10 -1
 (nil))

when sched2 the RTL is as follows:
(note 4 1 24 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 24 4 3 2 NOTE_INSN_PROLOGUE_END)
(note 3 24 10 2 NOTE_INSN_FUNCTION_BEG)
(note 10 3 12 2 NOTE_INSN_DELETED)
(note 12 10 20 2 NOTE_INSN_DELETED)
(insn:TI 20 12 6 2 (set (reg/f:SI 3 r3 [119])
(high:SI (symbol_ref:SI (*.LANCHOR0) [flags 0x182]))) tmp.c:7 196
{*arm_movsi_insn}
 (expr_list:REG_EQUAL (high:SI (symbol_ref:SI (*.LANCHOR0) [flags
0x182]))
(nil)))
(insn 6 20 21 2 (set (reg:SI 2 r2 [117])
(const_int 1 [0x1])) tmp.c:6 196 {*arm_movsi_insn}
 (expr_list:REG_EQUIV (const_int 1 [0x1])
(nil)))
(insn:TI 21 6 7 2 (set (reg/f:SI 3 r3 [119])
(lo_sum:SI (reg/f:SI 3 r3 [119])
(symbol_ref:SI (*.LANCHOR0) [flags 0x182]))) tmp.c:7 195
{*arm_movt}
 (expr_list:REG_EQUAL (symbol_ref:SI (*.LANCHOR0) [flags 0x182])
(nil)))
(insn 7 21 9 2 (set (mem:SI (reg/v/f:SI 0 r0 [orig:116 pa ] [116]) [1
*pa_2(D)+0 S4 A32])
(reg:SI 2 r2 [117])) tmp.c:6 196 {*arm_movsi_insn}
 (expr_list:REG_DEAD (reg:SI 2 r2 [117])
(nil)))
(insn:TI 9 7 11 2 (set (reg:SI 3 r3 [orig:120 f ] [120])
(mem/c:SI (reg/f:SI 3 r3 [119]) [1 f+0 S4 A32])) tmp.c:7 196
{*arm_movsi_insn}
 (nil))
(insn:TI 11 9 17 2 (set (reg/f:SI 0 r0 [121])
(plus:SI (mult:SI (reg:SI 3 r3 [orig:120 f ] [120])
(const_int 4 [0x4]))
(reg/v/f:SI 0 r0 [orig:116 pa ] [116]))) tmp.c:8 283
{*arith_shiftsi}
 (expr_list:REG_DEAD (reg:SI 3 r3 [orig:120 f ] [120])
(nil)))
(insn:TI 17 11 18 2 (set (reg/i:SI 0 r0)
(mem:SI (plus:SI (reg/f:SI 0 r0 [121])
(const_int 4 [0x4])) [1 MEM[(int *)pa_7 + 4B]+0 S4 A32]))
tmp.c:10 196 {*arm_movsi_insn}
 (nil))
(insn 18 17 26 2 (use (reg/i:SI 0 r0)) tmp.c:10 -1
 (nil))
(jump_insn:TI 26 18 25 2 (return) tmp.c:10 268 {*arm_return}
 (nil)
 -> return)
(In reply to Richard Biener from comment #2)
 How does the RTL look like after reload?


[Bug tree-optimization/65773] [5 Regression] GCC 5.1 miscompiles LLVM function AArch64InstrInfo::loadRegFromStackSlot()

2015-04-16 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65773

--- Comment #9 from Jan Hubicka hubicka at gcc dot gnu.org ---
Bill,
you can track inlining with -fdump-tree-einline (early inliner) and
-fdump-ipa-inline (the greedy inliner)


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread mwahab at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #27 from mwahab at gcc dot gnu.org ---
(In reply to Andrew Macleod from comment #25)
 My opinion:
 
 1) is undesirable... even though it could possibly accelerate the conversion
 of legacy sync to atomic calls... I fear it would instead just cause
 frustration, annoyance and worse.  I don't think we need to actually
 penalize sync for no reason better than a 1 in a somethillion shot in the
 dark.
 
 2) Similar reasoning, I don't think we need to penalize SEQ_CST everywhere
 for a legacy sync call that probably doesn't need stronger code either.
 
 which leaves option 3)
 
 There are 2 primary options I think
 
 a) introduce an additional memory model... MEMMODEL_SYNC or something which
 is even stronger than SEQ_CST.  This would involve fixing any places that
 assume SEQ_CST is the highest.  And then we use this from the places that
 expand sync calls as the memory model.  

 or
 b) When we are expanding sync calls, we first look for target __sync
 patterns instead of atomic patterns.  If they exist, we use those. Otherwise
 we simply expand like we do today.  
 
 
 (b) may be easier to implement, but puts more onus on the target.. probably
 involves more complex patterns since you need both atomic and sync patterns.
 My guess is some duplication of code will occur here.  But the impact is
 only to specific targets.
 
 (a) Is probably cleaner... just touches a lot more code.  Since we're in
 stage 1, maybe (a) is a better long term solution...?   
 

Adding a new barrier specifier to enum memmodel seems the simplest approach. It
would mean that all barriers used in gcc are represented in the same way and
that would make them more visible and easier to understand than splitting them
between the enum memmodel and the atomic/sync patterns.

I'm testing a patch-set for option (a). It touches several backends but all the
changes are very simple since the new barrier is the same as MEMMODEL_SEQ_CST
for most targets.
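
As a rough sketch of what option (a) could look like, assuming the enumerators follow the C++11 memory orders and that the new value is called MEMMODEL_SYNC as proposed earlier in the thread. The helper effective_model and its parameter are hypothetical; they only illustrate the remark that the new barrier is the same as MEMMODEL_SEQ_CST for most targets.

```cpp
// Sketch only.  The first six enumerators follow the C++11 memory orders;
// MEMMODEL_SYNC is the hypothetical addition discussed in this thread.
enum memmodel
{
  MEMMODEL_RELAXED,
  MEMMODEL_CONSUME,
  MEMMODEL_ACQUIRE,
  MEMMODEL_RELEASE,
  MEMMODEL_ACQ_REL,
  MEMMODEL_SEQ_CST,
  MEMMODEL_SYNC
};

// Hypothetical helper: a target whose SEQ_CST sequences are already strong
// enough for the __sync builtins folds MEMMODEL_SYNC back to SEQ_CST, while
// a target that needs the stronger barrier keeps the distinction.
memmodel effective_model (memmodel model, bool target_needs_stronger_sync)
{
  if (model == MEMMODEL_SYNC && !target_needs_stronger_sync)
    return MEMMODEL_SEQ_CST;
  return model;
}
```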

One thing though is that the aarch64 code for __sync_lock_test_and_set may
have a similar problem with a too-weak barrier. It's not confirmed
that there is a problem but, since it may mean more work in this area, I
thought I'd mention it.


[Bug c++/65786] Wrong code when using decltype to specify the return type

2015-04-16 Thread josopait at goopax dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65786

--- Comment #3 from Ingo Josopait josopait at goopax dot com ---
Yes, you are right. Thanks.


[Bug debug/65549] [4.9/5/6 Regression] crash in htab_hash_string with -flto -g

2015-04-16 Thread ferdinandw+gcc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65549

--- Comment #22 from Ferdinand ferdinandw+gcc at gmail dot com ---
Now that I understand the bug, of course I notice (too late) that my way of
setting -g0 wasn't taking effect in the beta tree, the way it was before. So
that explains that, but still this crash is happening with just -flto=4 (and
default -flto-partition=balanced) for me. That means, at the moment, it's not
latent for building firefox.


[Bug tree-optimization/65773] [5 Regression] GCC 5.1 miscompiles LLVM function AArch64InstrInfo::loadRegFromStackSlot()

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65773

--- Comment #10 from Jakub Jelinek jakub at gcc dot gnu.org ---
For better analysis, I'll note that the:
@@ -5846,13 +5848,13 @@ virtual void llvm::AArch64InstrInfo::loa
   D.391854.SubReg_TargetFlags = 0;
   D.391854.ParentMI = 0B;
   D.391854.Contents.ImmVal = 0;
-  llvm::MachineInstr::addOperand (MI_301, _440, D.391854);
+  llvm::MachineInstr::addOperand (SR.1446_371(D), SR.1445_78(D), D.391854);

   bb 82:
   D.391854 ={v} {CLOBBER};

   bb 83:
-  llvm::MachineInstr::addMemOperand (MI_301, _440, MMO_21);
+  llvm::MachineInstr::addMemOperand (SR.1446_371(D), SR.1445_78(D), MMO_21);

   bb 84:
   _79 = MEM[(struct TrackingMDRef *)DL].MD;
change in the loadRegFromStackSlot function in *.optimized dump started with
r211725.


[Bug c/65781] gcc-5.1.0-RC-20150412 thinks it is 5.0.1

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65781

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org ---
We'll need to do something with the *.gcda version numbers for the new
versioning scheme (because otherwise we'll run out of versions RSN), most
likely to start somehow encoding the major number in the former major and minor
characters and leave the patchlevel character for say the (minor << 1) |
patchlevel into the former patchlevel character, but then it would be better to
always keep patchlevel at 0 or 1.  So not sure if we want to use 5.0.2 for
RC2...
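
A small sketch of the packing idea, assuming the expression above is a left shift combined with a bitwise OR (the operator is damaged in the archived text). The function name is hypothetical and this is not the actual gcov version encoding.

```cpp
// Hypothetical packing of minor and patchlevel into one value, reading the
// expression in the comment above as (minor << 1) | patchlevel.
unsigned pack_minor_patchlevel (unsigned minor, unsigned patchlevel)
{
  // Only bit 0 is available for patchlevel; larger values spill into the
  // bits used for minor.
  return (minor << 1) | patchlevel;
}
```

With this packing, minor 0 with patchlevel 2 produces the same value as minor 1 with patchlevel 0, which is one way to see why the patchlevel would have to stay at 0 or 1.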


[Bug fortran/54714] ICE on invalid expression involving DT with allocatable components and constructor in I/O

2015-04-16 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54714

vehre at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||vehre at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #2 from vehre at gcc dot gnu.org ---
I stumbled over this bug, while looking for something different with
allocatable components. 

The ICE is fixed in gcc 6.0; instead, the error message 


test_pr54714.f90:5:27:

write(*,*) na_var([2,2])

Error: Data transfer element at (1) cannot have ALLOCATABLE components unless
it is processed by a defined input/output procedure


is printed. I deem the pr therefore as fixed.


[Bug tree-optimization/64950] postpone expanding va_arg till pass_stdarg

2015-04-16 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64950

--- Comment #4 from vries at gcc dot gnu.org ---
ping^2:
- https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00796.html
- https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00797.html


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

   Priority|P3  |P1


[Bug testsuite/65785] New: libgo TestIPv4MulticastListener test fails on machine with no network connection

2015-04-16 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65785

Bug ID: 65785
   Summary: libgo TestIPv4MulticastListener test fails on machine
with no network connection
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: trivial
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vries at gcc dot gnu.org

I observed the following failure:
...
--- FAIL: TestIPv4MulticastListener (0.00s)
testing.go:278: First ListenMulticastUDP on nil failed: listen udp
224.0.0.254: setsockopt: no such device
FAIL
FAIL: net
src/libgo/testsuite/gotest: line 514: gotest-timeout: No such file or directory
...

Once in a while my test machine loses the wlan0 interface, which is its only
connection to the internet. My guess is that this makes this libgo test fail.
This seems to be confirmed by this timeline:
1. my first test run (using a non-bootstrap compiler) completed libgo.log
   without failures (well, apart from the usual TestMemoryProfiler fail, see
   PR64683)
   (16:10)
2. kernel log shows: IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
   (16:50)
3. my second test run (using a bootstrapped compiler) completed libgo.log
   with above failure twice, once with and once without -m32.
   (20:10)
4. I ran ifconfig, and found that wlan0 is not connected

I saw a related PR48017: Network tests should fail gracefully without network
connectivity. A variable GCCGO_RUN_ALL_TESTS was introduced to manage this
problem, but subsequently removed (as noted here:
https://gcc.gnu.org/ml/gcc-patches/2011-09/msg01568.html ), though the variable
is still listed in the README.gcc.

It was removed because
(https://gcc.gnu.org/ml/gcc-patches/2011-09/msg01652.html ):
...
Running the net test now is intentional--it used to depend on an
Internet connection, but it no longer does.
...


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread mwahab at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #30 from mwahab at gcc dot gnu.org ---
(In reply to James Greenhalgh from comment #28)

 Which leaves 3). From Andrew's two proposed solutions:
 
  a) introduce an additional memory model... MEMMODEL_SYNC or something
  which is even stronger than SEQ_CST.  This would involve fixing any places
  that assume SEQ_CST is the highest.  And then we use this from the places
  that expand sync calls as the memory model.
 
 This seems a sensible approach and leads to a nicer design, but I worry that
 it might be overkill for primitives which we ultimately want to reduce
 support for and eventually deprecate.

I don't understand why it would be overkill. It's already expected that not all
barriers will be meaningful for all targets and target code to handle redundant
barriers usually comes down to a few clauses in a conditional statement. 

  (b) may be easier to implement, but puts more onus on the target..
  probably involves more complex patterns since you need both atomic and
  sync patterns. My guess is some duplication of code will occur here.  But
  the impact is only to specific targets.
 
 When I looked at this problem internally a few weeks back, this is exactly
 how I expected things to work (we can add the documentation for the pattern
 names to the list of things which need fixing, as it took me a while to
 figure out why my new patterns were never expanded!).
 
 I don't think this is particularly onerous for a target. The tough part in
 all of this is figuring out the minimal cost instructions at the ISA level
 to use for the various __atomic primitives. Extending support to a stronger
 model, should be straightforward given explicit documentation of the
 stronger ordering requirements.


My objection to using sync patterns is that it does take more work, both for
the initial implementation and for continuing maintenance. It also means adding
sync patterns to targets that would not otherwise need them and preserving a
part of the gcc infrastructure that is only needed for a legacy feature and
could otherwise be targeted for removal.

 Certainly, a target would have to do the same legwork for a) regardless, and
 would have to pollute their atomic patterns with checks and code paths for
 MEMMODEL_SYNC.

Actually, the changes needed for (a) are very simple. The checks and code-paths
for handling barriers are already there, reusing them for MEMMODEL_SYNC is
trivial. The resulting code is much easier to understand, and safely fix,
because it is all in the same place, than if it was spread out across patterns
and functions. 

 This also gives us an easier route to fixing any issues with the
 acquire/release __sync primitives (__sync_lock_test_and_set and
 __sync_lock_release) if we decide that these also need to be stronger than
 their C++11 equivalents.

This seems like a chunky workaround to avoid having to add a barrier
representation to gcc.  It's a, not necessarily straightforward, reworking of
the middle-end to use patterns that would need to be added to architectures
that don't currently have them and at least checked in the architectures that
do.

This seems to come down to what enum memmodel is supposed to be for. If it is
intended to represent the barriers provided by C11/C++11 atomics and nothing
else then a workaround seems unavoidable. If it is meant to represent barriers
needed by gcc to compile user code then, it seems to me, it would be
better to just add the barrier to the enum and update code where necessary. 

Since memmodel is relatively recent, was merged from what looks like the C++
memory model branch (cxx-mem-model), and doesn't seem to have changed since
then, it's maybe not surprising that it doesn't already include everything
needed by gcc. I don't see that adding to it should be prohibited, provided the
additions can be shown to be strictly required.


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #11 from Jakub Jelinek jakub at gcc dot gnu.org ---
And, shouldn't common_maybe_local for i?86/x86_64 be
  !flag_pic || (TARGET_64BIT && HAVE_LD_PIE_COPYRELOC != 0)
?  What about other targets that are known to generate COPY relocations in this
case for non-PIE executables?  Should they pass !flag_pic?
Perhaps there should be a generic default_binds_local_p* entry point that
passes
!flag_pic as common_maybe_local, that those targets could (after maintainers
test it properly with various vintage linkers?) use as their
TARGET_BINDS_LOCAL_P ?
Clearly rs6000 should pass always false though, perhaps many others too.


[Bug tree-optimization/65774] [6.0 regression] FAIL: gcc.dg/builtin-arith-overflow-1.c (internal compiler error)

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65774

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
Created attachment 35329
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35329&action=edit
patch that might fix the issue

Hmm, can't reproduce with a cross - but eventually this accesses uninitialized
memory if

  mask = -1;

doesn't make

  if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
{

true.  Of course TYPE_PRECISION doesn't make sense on complex types.


[Bug target/63633] [avr] internal compiler error: unrecognizable insn with mult insns

2015-04-16 Thread gjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63633

Georg-Johann Lay gjl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jonathan.creekmore@synapse-
   ||wireless.com

--- Comment #4 from Georg-Johann Lay gjl at gcc dot gnu.org ---
*** Bug 65657 has been marked as a duplicate of this bug. ***


[Bug target/65657] [avr] read from __memx address space tramples argument to following function

2015-04-16 Thread gjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65657

Georg-Johann Lay gjl at gcc dot gnu.org changed:

   What|Removed |Added

   Keywords||wrong-code
 Target||avr
   Priority|P3  |P4
 Status|UNCONFIRMED |RESOLVED
 CC||gjl at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #6 from Georg-Johann Lay gjl at gcc dot gnu.org ---
(In reply to Senthil Kumar Selvaraj from comment #5)
 This tentative patch (pending regression tests) makes the problem go away
 [...]
 @@ -9959,7 +9959,11 @@ avr_rtx_costs_1 (rtx x, int codearg, int outer_code
 ATTRIBUTE_UNUSED,
return true;
  
  case MEM:
 -  *total = COSTS_N_INSNS (GET_MODE_SIZE (mode));
 +  /* MEM rtx with non-default address space is more
 + expensive. Not expressing that results in reg
 + clobber during expand (PR 65657). */
 +  *total = COSTS_N_INSNS (GET_MODE_SIZE (mode)
 +   + (MEM_ADDR_SPACE(x) == ADDR_SPACE_RAM ? 0 : 5));

This might lead to better code, but costs should never be a proper fix for
wrong code or ICE.

*** This bug has been marked as a duplicate of bug 63633 ***


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread jgreenhalgh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #28 from James Greenhalgh jgreenhalgh at gcc dot gnu.org ---
(In reply to torvald from comment #24)
 I think we need to at least clarify the documentation of __atomic, probably
 also of __sync; we might also have to change the implementation of __sync
 builtins on some archs.
 
 First, I think the __atomic builtins should strictly follow the C11 model --
 one major reason being that it's not worthwhile for us to support a
 GCC-specific memory model, which is complex.  Therefore, I think we should
 clarify in the __atomic docs that C++11 is the reference semantics, and that
 any additional text we provide (e.g., to describe what MEMMODEL_SEQ_CST
 does) is just some hand-wavy attempt of makings things look simpler.  I'd
 choose C++11 over C11 because, although the models are supposed to be equal,
 C++11 has more detailed docs in the area of atomics and synchronization in
 general, IMO.  There are also more people involved in C++ standardization
 than C standardization.

Agreed.

 The only exception that I see is that we might want to give more guarantees
 wrt data-race-freedom (DRF) as far as compiler reordering is concerned.  I'm
 mentioning just the compiler because I'm not aware of GCC supporting
 C11/C++11 on any archs whose ISA distinguishes between atomic and nonatomic
 accesses (ie, where DRF is visible at the ISA level); however, I've heard of
 upcoming archs that may do that.  Trying to promise some support for non-DRF
 programs that synchronize might make transition of legacy code to __atomic
 builtins easier, but it's a slippery slope; I don't think it's worthwhile
 for us to try to give really precise and hard guarantees, just because of
 the complexity involved.  I'm not quite sure what's best here.

I use enough whiteboard space trying to work through the well researched,
formalized, and discussed C++11 semantics; I would not like to see a subtly
different GNU++11 semantics!! Not least because I worry for the programmer who
thinks they've understood the guarantees of the specification and reverse
engineers GCC's output for confirmation, only to find GCC is giving stronger
guarantees than is strictly necessary.

 snip

 Thus, we need to at least improve the __sync docs.  If we want to really
 make them C11-like, we need to update the docs, and there might be legacy
 code assuming different semantics that breaks.  If we don't, we need to
 update the implementation of __sync RMWs.  I don't know what would be the
 easiest way to do that:

 1) We could put __sync_synchronize around them on all archs, and just don't
 care about the performance overhead (thinking that we want to deprecate them
 anyway).  But unless the backends optimize unnecessary sync ops (eg, on x86,
 remove mfence adjacent to a lock'd RMW), there could be performance
 degradation for __sync-using code.

I'm with Andrew, I don't see the need to penalise everyone to ensure
AArch64/ARM memory ordering (Though I agree it would be nice as a stick to
encourage people to move forwards).
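
For reference, option 1 can be pictured at user level roughly as follows. This is only a sketch: the proposal concerns how the compiler expands the __sync builtins internally, and the wrapper names here are hypothetical. The builtins themselves (__sync_synchronize, __sync_fetch_and_add, __atomic_fetch_add) are existing GCC builtins.

```cpp
// Option 1 as a user-level sketch: a __sync read-modify-write bracketed by
// full barriers.  The wrapper name is hypothetical.
int fetch_and_add_with_full_barriers (int *ptr, int value)
{
  __sync_synchronize ();
  int old = __sync_fetch_and_add (ptr, value);
  __sync_synchronize ();
  return old;
}

// The __atomic counterpart with sequentially consistent ordering, shown
// for comparison.
int fetch_and_add_seq_cst (int *ptr, int value)
{
  return __atomic_fetch_add (ptr, value, __ATOMIC_SEQ_CST);
}
```

Both functions return the previous value; the first pays for two extra barriers on every call, which is the cost being objected to above.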

 2) We could make __atomic RMWs with MEMMODEL_SEQ_CST stronger than required
 by C11.  This could result in other C11-conforming compilers to produce more
 efficient synchronization code, and is thus not a good option IMO.

I would not like to see this implemented.

 3) We could do something just on ARM (and scan other archs for similar
 issues).  That's perhaps the cleanest option.

Which leaves 3). From Andrew's two proposed solutions:

 a) introduce an additional memory model... MEMMODEL_SYNC or something
 which is even stronger than SEQ_CST.  This would involve fixing any places
 that assume SEQ_CST is the highest.  And then we use this from the places
 that expand sync calls as the memory model.

This seems a sensible approach and leads to a nicer design, but I worry that it
might be overkill for primitives which we ultimately want to reduce support for
and eventually deprecate.

 b) When we are expanding sync calls, we first look for target __sync
 patterns instead of atomic patterns.  If they exist, we use those.
 Otherwise we simply expand like we do today.  

 (b) may be easier to implement, but puts more onus on the target..
 probably involves more complex patterns since you need both atomic and
 sync patterns. My guess is some duplication of code will occur here.  But
 the impact is only to specific targets.

When I looked at this problem internally a few weeks back, this is exactly how
I expected things to work (we can add the documentation for the pattern names
to the list of things which need fixing, as it took me a while to figure out
why my new patterns were never expanded!).

I don't think this is particularly onerous for a target. The tough part in all
of this is figuring out the minimal cost instructions at the ISA level to use
for the various __atomic primitives. Extending support to a stronger model
should be straightforward given explicit documentation of the stronger
ordering requirements.

[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread mwahab at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #29 from mwahab at gcc dot gnu.org ---
(In reply to mwahab from comment #27)
 (In reply to Andrew Macleod from comment #25)
  My opinion:
  
  1) is undesirable... even though it could possibly accelerate the conversion
  of legacy sync to atomic calls... I fear it would instead just cause
  frustration, annoyance and worse.  I don't think we need to actually
  penalize sync for no reason better than a 1 in a somethillion shot in the
  dark.
  
  2) Similar reasoning, I don't think we need to penalize SEQ_CST everywhere
  for a legacy sync call that probably doesn't need stronger code either.
  
  which leaves option 3)
  
  There are 2 primary options I think


There may be a third option, which is to set up a target hook to allow the sync
expansions in the middle end to be overridden. This limits the changes to the
backends that need the different semantics without having to continue with the
sync_ patterns (which aren't in aarch64 for example). There's space in the
memmodel values to support target specific barriers so, for aarch64, this would
allow the atomics code to be reused. I don't know much about the target hook
infrastructure though so I don't know how disruptive a new hook would be.


[Bug fortran/54714] ICE on invalid expression involving DT with allocatable components and constructor in I/O

2015-04-16 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54714

--- Comment #3 from Dominique d'Humieres dominiq at lps dot ens.fr ---
 ...
 The ICE is fixed in gcc 6.0 instead the error message 
 ...
 is printed. I deem the pr therefore as fixed.

For the record, it has been fixed between revisions r220436 (2015-02-05, ICE)
and r220481+one patch (2015-02-06, error).


[Bug c++/65786] Wrong code when using decltype to specify the return type

2015-04-16 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65786

Jonathan Wakely redi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Jonathan Wakely redi at gcc dot gnu.org ---
Undefined behaviour, for the reason Andrew gave.

(People seem to like blaming new C++11 or C++14 features for their mistakes
when using std::max incorrectly, Bug 61769 is another example.)
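
For readers who want to see the difference between the two return type spellings, here is a minimal sketch. The helper names max_by_value and max_decayed are made up for this illustration and do not come from the bug report; the only library facts relied on are that std::max returns a const reference to one of its arguments and that decltype of such a call is therefore a reference type.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>

// C++14 return type deduction: plain `auto` drops the reference that
// std::max returns, so the result is handed back by value.
template <typename A, typename B>
auto max_by_value(const A& a, const B& b)
{
  return std::max(static_cast<int>(a), static_cast<int>(b));
}

// C++11 style trailing return type that strips the reference explicitly.
template <typename A, typename B>
auto max_decayed(const A& a, const B& b)
    -> typename std::decay<decltype(
           std::max(static_cast<int>(a), static_cast<int>(b)))>::type
{
  return std::max(static_cast<int>(a), static_cast<int>(b));
}

// decltype of the call expression itself is a reference type, which is
// why using it unmodified as a return type hands back a reference to a
// temporary that is already gone when the caller reads it.
static_assert(std::is_same<decltype(std::max(1, 2)), const int&>::value,
              "std::max yields a const reference");
static_assert(std::is_same<decltype(max_by_value(1, 2)), int>::value,
              "auto deduction returns by value");
static_assert(std::is_same<decltype(max_decayed(1, 2)), int>::value,
              "decayed trailing return type returns by value");
```

Either helper can be called as max_by_value(d, 2) with any arguments convertible to int; both return a plain int.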


[Bug c/65781] gcc-5.1.0-RC-20150412 thinks it is 5.0.1

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65781

--- Comment #4 from Richard Biener rguenth at gcc dot gnu.org ---
Sure.


[Bug rtl-optimization/65783] after reload, the memrefs_conflict_p is unreliable?

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65783

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
What does the RTL look like after reload?


[Bug tree-optimization/65773] [5 Regression] GCC 5.1 miscompiles LLVM function AArch64InstrInfo::loadRegFromStackSlot()

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65773

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |5.0

--- Comment #8 from Richard Biener rguenth at gcc dot gnu.org ---
comment #5 sounds like this is an issue in LLVM.  SRA (and into-SSA) now take
advantage of clobbers.  Try -fno-lifetime-dse (and -fstack-reuse=none)?



[Bug tree-optimization/65774] [6.0 regression] FAIL: gcc.dg/builtin-arith-overflow-1.c (internal compiler error)

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65774

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2015-04-16
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Target Milestone|--- |6.0
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener rguenth at gcc dot gnu.org ---
Mine.


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread mwahab at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #26 from mwahab at gcc dot gnu.org ---
(In reply to torvald from comment #21)
 (In reply to Andrew Haley from comment #20)
  (In reply to mwahab from comment #19)
   (In reply to Andrew Haley from comment #18)
   
   It looks inconsistent with C11 S7.17.7.4-2 (C++11 S29.6.4-21): "Further, if
   the comparison is true, memory is affected according to the value of
   success, and if the comparison is false, memory is affected according to
   the value of failure." (where success and failure are the memory model
   arguments.) In this case, the write to *exp should be
   memory_order_seq_cst.
  
  But no store actually takes place, so the only effect is that of the read.
  You can't have a sequentially consistent store without a store.
 
 I agree. If you continue reading in the C++11 paragraph that you cited,
 you'll see that if just one MO is provided and the CAS fails, an acq_rel MO
 is downgraded to acquire and a release MO to relaxed.  This is consistent
 with no update of the atomic variable (note that expected is not atomic, so
 applying an MO to accesses to it is not meaningful in any case).  However,
 if the provided MO is seq_cst, I think a failed CAS still needs to be
 equivalent to a seq_cst load.

Yes, the last two sentences in the C++11 paragraph make it clear: "If the
operation returns true, these operations are atomic read-modify-write
operations (1.10). Otherwise, these operations are atomic load operations."  In
that case, the AArch64 code looks ok.
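
A small sketch of the behaviour discussed above, using the standard std::atomic interface. The function name cas_observe is invented for this example. On success the call is a read-modify-write; on failure it only loads, and the non-atomic expected object is overwritten with the value that was read.

```cpp
#include <atomic>
#include <cassert>

// Tries to replace `expected_value` with `desired` in `target` and
// returns the value the operation observed in `target`.
inline int cas_observe(std::atomic<int>& target, int expected_value,
                       int desired)
{
  int expected = expected_value;  // plain local object, not atomic
  target.compare_exchange_strong(expected, desired,
                                 std::memory_order_seq_cst,   // on success
                                 std::memory_order_seq_cst);  // on failure
  // Success: `expected` is unchanged and equals the old value.
  // Failure: `expected` now holds the value loaded from `target`.
  return expected;
}
```

With a variable holding 5, cas_observe(v, 5, 7) stores 7 and reports 5, while a second cas_observe(v, 5, 9) stores nothing and reports 7.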


[Bug c++/65786] New: Wrong code when using decltype to specify the return type

2015-04-16 Thread josopait at goopax dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65786

Bug ID: 65786
   Summary: Wrong code when using decltype to specify the return type
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: critical
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: josopait at goopax dot com

In the program below, the assignment of the return value is messed up. While
the C++-14 style of fully automatic type deduction works fine, the mymax11
function call produces some random numbers.

The output is:

1
2
-2100190336

or similar. The last line is different for every program execution.
I am using gcc 4.9.2 on x86_64 Linux. I don't get this bug if I use -m32.




#include <iostream>
using namespace std;


struct testclass
{
  int data;
  inline operator const int() const
  {
    return data;
  }

  testclass operator = (const int in)
  {
    data = in;
    return *this;
  }
};


template <typename A, typename B> auto mymax14(const A a, const B b)
{
  return std::max((int)a, (int)b);
}

template <typename A, typename B> auto mymax11(const A a, const B b) ->
decltype(std::max((int)a, (int)b))
{
  return std::max((int)a, (int)b);
}



int main()
{

  testclass d;

  d = 1;
  cout << d.data << endl;   // ok, d.data==1

  d = mymax14(d, 2);
  cout << d.data << endl;   // ok, d.data==2

  d = mymax11(d, 2);
  cout << d.data << endl;   // bad: d.data == some random number.

  return 0;
}


[Bug c++/65786] Wrong code when using decltype to specify the return type

2015-04-16 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65786

--- Comment #4 from Marc Glisse glisse at gcc dot gnu.org ---
Compiling with -Wall -O prints:

w.cc: In function ‘int main()’:
w.cc:45:13: warning: ‘<anonymous>’ is used uninitialized in this function
[-Wuninitialized]
   cout << d.data << endl;   // bad: d.data == some random number.
 ^

[Bug lto/65778] v8 build fails with assembly error with LTO enabled on arm-linux-gnueabihf

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65778

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
Thus technically INVALID?


[Bug target/65779] [5 Regression] undefined local symbol on powerpc [regression]

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65779

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |5.0
Summary|undefined local symbol on   |[5 Regression] undefined
   |powerpc [regression]|local symbol on powerpc
   ||[regression]


[Bug debug/65549] [4.9/5/6 Regression] crash in htab_hash_string with -flto -g

2015-04-16 Thread ferdinandw+gcc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65549

Ferdinand ferdinandw+gcc at gmail dot com changed:

   What|Removed |Added

 CC||ferdinandw+gcc at gmail dot com

--- Comment #21 from Ferdinand ferdinandw+gcc at gmail dot com ---
I'm running into this when building firefox beta tree with lto. The release
tree is working alright, but on beta it crashes with this bug's stacktrace. Is
it only with -g? I'm using -g0 for the most part.


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #13 from H.J. Lu hjl.tools at gmail dot com ---
(In reply to Jakub Jelinek from comment #11)
 And, shouldn't common_maybe_local for i?86/x86_64 be
   !flag_pic || (TARGET_64BIT && HAVE_LD_PIE_COPYRELOC != 0)
 ?  What about other targets that are known to generate COPY relocations in
 this case for non-PIE executables?  Should they pass !flag_pic?
 Perhaps there should be a generic default_binds_local_p* entry point that
 passes
 !flag_pic as common_maybe_local, that those targets could (after maintainers
 test it properly with various vintage linkers?) use as their
 TARGET_BINDS_LOCAL_P ?

Check flag_pic isn't necessary.  For non-PIC, the same code sequence
and relocation are used to access defined and undefined symbols, common
or not.


[Bug tree-optimization/65773] [5 Regression] GCC 5.1 miscompiles LLVM function AArch64InstrInfo::loadRegFromStackSlot()

2015-04-16 Thread james.molloy at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65773

--- Comment #14 from James Molloy james.molloy at arm dot com ---
Hi,

For completeness, I just fixed this in LLVM r235088
(http://reviews.llvm.org/rL235088).

Cheers,

James


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #20 from H.J. Lu hjl.tools at gmail dot com ---
(In reply to Jakub Jelinek from comment #16)
 (In reply to H.J. Lu from comment #13)
  Check flag_pic isn't necessary.  For non-PIC, the same code sequence
  and relocation are used to access defined and undefined symbols, common
  or not.
 
 What do you mean by is not necessary?  Without that, you'll return false
 from targetm.binds_local_p for DECL_COMMON in the testcase say on
 i686-linux, or if you have old linker.

I guess it won't hurt.


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread amacleod at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #34 from Andrew Macleod amacleod at redhat dot com ---
 However, I guess some people relying on data races in their programs could
 (mis?)understand the __sync_lock_release semantics to mean that it is a
 means to get the equivalent of a C11 release *fence* -- which it is not
 because the fence would apply to the (erroneously non-atomic) store after
 the barrier, which could lead one to believe that if one observes the store
 after the barrier, the fence must also be in effect.  Thoughts?

Before we get too carried away, maybe we should return to what we *think* the
__sync builtins are supposed to do. They represent a specific definition by
Intel. This is from the original documentation for __sync back in the day, and
all legacy uses of sync should expect this behaviour:


The following builtins are intended to be compatible with those described
in the Intel Itanium Processor-specific Application Binary Interface,
section 7.4.  As such, they depart from the normal GCC practice of using
the ``__builtin_'' prefix, and further that they are overloaded such that
they work on multiple types.

The definition of barrier from that documentation is :

acquire barrier: Disallows the movement of memory references to visible data
from after the intrinsic (in program order) to before the intrinsic (this
behavior is desirable at lock-acquire operations, hence the name).

release barrier: Disallows the movement of memory references to visible data
from before the intrinsic (in program order) to after the intrinsic (this
behavior is desirable at lock-release operations, hence the name).

full barrier: disallows the movement of memory references to visible data past
the intrinsic (in either direction), and is thus both an acquire and a release
barrier. A barrier only restricts the movement of memory references to visible
data across the intrinsic operation: between synchronization operations (or in
their absence), memory references to visible data may be freely reordered
subject to the usual data-dependence constraints.

Caution: Conditional execution of a synchronization intrinsic (such as within
an if or a while statement) does not prevent the movement of memory references
to visible data past the overall if or while construct.
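
As a rough illustration of where these barriers sit in practice, here is a minimal spinlock sketch built on the legacy builtins. SyncSpinLock and guarded_increment are names made up for this example. __sync_lock_test_and_set and __sync_lock_release are the GCC builtins that the GCC manual documents as an acquire barrier and a release barrier respectively; the sketch assumes a GCC-compatible compiler.

```cpp
#include <cassert>

// Test-and-set spinlock using the legacy __sync builtins.
struct SyncSpinLock {
  int flag = 0;  // 0 = unlocked, 1 = locked

  void lock()
  {
    // Acquire barrier: accesses after this point may not move before it.
    while (__sync_lock_test_and_set(&flag, 1)) {
      // spin until the previous value was 0
    }
  }

  void unlock()
  {
    // Release barrier: accesses before this point may not move after it.
    __sync_lock_release(&flag);  // writes 0 to flag
  }
};

// Increments `counter` inside the critical section and returns the value
// seen while the lock was still held.
inline int guarded_increment(SyncSpinLock& lock, int& counter)
{
  lock.lock();
  ++counter;
  const int seen = counter;
  lock.unlock();
  return seen;
}
```

The critical section sits between an acquire operation and a release operation, which is the pairing the barrier definitions above are named after.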


[Bug sanitizer/65749] sanitizer stack trace pc off by 1

2015-04-16 Thread y.gribov at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65749

--- Comment #3 from Yury Gribov y.gribov at samsung dot com ---
@Kostya: I suggest mentioning this in the ASan FAQ.


[Bug sanitizer/65749] sanitizer stack trace pc off by 1

2015-04-16 Thread y.gribov at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65749

Yury Gribov y.gribov at samsung dot com changed:

   What|Removed |Added

 CC||y.gribov at samsung dot com

--- Comment #2 from Yury Gribov y.gribov at samsung dot com ---
This is not a bug but rather a design choice: it is hard to compute the exact
size of the preceding instruction on CISC platforms. ASan just subtracts 1
because this is enough for tools like addr2line or gdb to symbolize addresses.
Replacing it with trace[i] would indeed cause invalid symbolization, as you
already noticed.


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #15 from H.J. Lu hjl.tools at gmail dot com ---
(In reply to Jakub Jelinek from comment #10)
 (In reply to H.J. Lu from comment #9)
  Created attachment 35327 [details]
  A different patch
  
  On x86, this issue only shows up with PIE. Here is a different
  patch to treat common symbol defined locally only if the backend
  passes true common_maybe_local.  For x86-64, it is true only if
  HAVE_LD_PIE_COPYRELOC is 1.  For i386, it is always false.  If
  we aren't building PIE, common_maybe_local is true or false
  doesn't make a difference for x86 since the common symbol is
  always referenced normally with copy reloc.  For PIE on x86-64,
  common symbol is local only if HAVE_LD_PIE_COPYRELOC is 1.
 
 +
 +  /* For common symbol, it is defined locally only if common_maybe_local
 + is true.  */
 +  bool defined_locally = (!DECL_EXTERNAL (exp)
 +                          && (!DECL_COMMON (exp) || common_maybe_local));
 
 I think better would be:
   bool uninited_common = (DECL_COMMON (exp)
                           && (DECL_INITIAL (exp) == NULL
                               || (!in_lto_p
                                   && DECL_INITIAL (exp) == error_mark_node)));
   /* For common symbol, it is defined locally only if common_maybe_local
      is true.  */
   bool defined_locally = (!DECL_EXTERNAL (exp)
                           && (!uninited_common || common_maybe_local));
 ...
 and then use
   /* Uninitialized COMMON variable may be unified with symbols
      resolved from other modules.  */
   if (uninited_common && !resolved_locally)
     return false;

Can we find a testcase with initialized COMMON variable and compile
it as PIE?

[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #16 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to H.J. Lu from comment #13)
 Check flag_pic isn't necessary.  For non-PIC, the same code sequence
 and relocation are used to access defined and undefined symbols, common
 or not.

What do you mean by is not necessary?  Without that, you'll return false from
targetm.binds_local_p for DECL_COMMON in the testcase say on i686-linux, or if
you have old linker.


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #18 from H.J. Lu hjl.tools at gmail dot com ---
(In reply to Jakub Jelinek from comment #10)
 (In reply to H.J. Lu from comment #9)
  Created attachment 35327 [details]
  A different patch
  
  On x86, this issue only shows up with PIE. Here is a different
  patch to treat common symbol defined locally only if the backend
  passes true common_maybe_local.  For x86-64, it is true only if
  HAVE_LD_PIE_COPYRELOC is 1.  For i386, it is always false.  If
  we aren't building PIE, common_maybe_local is true or false
  doesn't make a difference for x86 since the common symbol is
  always referenced normally with copy reloc.  For PIE on x86-64,
  common symbol is local only if HAVE_LD_PIE_COPYRELOC is 1.
 
 +
 +  /* For common symbol, it is defined locally only if common_maybe_local
 + is true.  */
 +  bool defined_locally = (!DECL_EXTERNAL (exp)
 +                          && (!DECL_COMMON (exp) || common_maybe_local));
 
 I think better would be:
   bool uninited_common = (DECL_COMMON (exp)
                           && (DECL_INITIAL (exp) == NULL
                               || (!in_lto_p
                                   && DECL_INITIAL (exp) == error_mark_node)));
   /* For common symbol, it is defined locally only if common_maybe_local
      is true.  */
   bool defined_locally = (!DECL_EXTERNAL (exp)
                           && (!uninited_common || common_maybe_local));
 ...
 and then use
   /* Uninitialized COMMON variable may be unified with symbols
      resolved from other modules.  */
   if (uninited_common && !resolved_locally)
     return false;

Here is a testcase:

---
int optopt = 5;
int optopt;

int
main ()
{
  optopt = 4;
  return 0;
}

~

[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #12 from Jakub Jelinek jakub at gcc dot gnu.org ---
I've tried the #c0 testcase with gcc 5.1 rc and binutils 2.25 on various linux
architectures.  On armv7hl, x86_64, s390 and s390x no errors are reported for
both normal executable and PIE, for i686 PIE link fails, normal works, for
powerpc64{,le} expectedly both normal and PIE link fail, aarch64 I'm still
waiting for results.


[Bug tree-optimization/65773] [5 Regression] GCC 5.1 miscompiles LLVM function AArch64InstrInfo::loadRegFromStackSlot()

2015-04-16 Thread james.molloy at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65773

James Molloy james.molloy at arm dot com changed:

   What|Removed |Added

 CC||james.molloy at arm dot com

--- Comment #13 from James Molloy james.molloy at arm dot com ---
Hi,

This has just been pinged at me. Thanks for debugging this and sorry about the
broken code.

Just a note that the actual problem/fix is even simpler. The problem is
we're asking for a reference instead of a copy (which is the pattern used
elsewhere), so simply removing the '&' will enforce correct behaviour.

Cheers,

James


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #17 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to H.J. Lu from comment #15)
 Can we find a testcase with initialized COMMON variable and compile
 it as PIE?

I don't know where initialized DECL_COMMON could come from, but this spot
certainly isn't the only one that is counting on that, e.g. when deciding
what section to use comm_section is used only if it is bss_initializer_p, etc.
At least for GCC 5, I'd strongly prefer small provably correct changes at this
point.


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread torvald at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #32 from torvald at gcc dot gnu.org ---
(In reply to James Greenhalgh from comment #28)
 (In reply to torvald from comment #24)
  3) We could do something just on ARM (and scan other archs for similar
  issues).  That's perhaps the cleanest option.
 
 Which leaves 3). From Andrew's two proposed solutions:

3) Also seems best to me.  2) is worst, 1) is too much of a stick.

 This also gives us an easier route to fixing any issues with the
 acquire/release __sync primitives (__sync_lock_test_and_set and
 __sync_lock_release) if we decide that these also need to be stronger than
 their C++11 equivalents.

I don't think we have another case of different __sync vs. __atomics semantics
in case of __sync_lock_test_and_set.  The current specification makes it clear
that this is an acquire barrier, and how it describes the semantics (ie, loads
and stores that are program-order before the acquire op can move to after it) ,
this seems to be consistent with the effects C11 specifies for acquire MO (with
perhaps the distinction that C11 is clear that acquire needs to be paired with
some release op to create an ordering constraint).

I'd say this basically also applies to __sync_lock_release, with the exception
that the current documentation does not mention that stores can be speculated
to before the barrier.  That seems to be an artefact of a TSO model.
However, I don't think this matters much because what the release barrier
allows one to do is reasoning that if one sees the barrier to have taken place
(eg, observe that the lock has been released), then also all ops before the
barrier will be visible.
It does not guarantee that if one observes an effect that is after the barrier
in program order, that the barrier itself will necessarily have taken effect. 
To be able to make this observation, one would have to ensure using __sync ops
that the other effect after the barrier is indeed after the barrier, which
would mean using a release op for the other effect -- which would take care of
things.

If everyone agrees with this reasoning, we probably should add documentation
explaining this.

However, I guess some people relying on data races in their programs could
(mis?)understand the __sync_lock_release semantics to mean that it is a means
to get the equivalent of a C11 release *fence* -- which it is not because the
fence would apply to the (erroneously non-atomic) store after the barrier,
which could lead one to believe that if one observes the store after the
barrier, the fence must also be in effect.  Thoughts?


[Bug tree-optimization/59124] [4.8/4.9/5/6 Regression] Wrong warnings array subscript is above array bounds

2015-04-16 Thread georgmueller at gmx dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124

Georg Müller georgmueller at gmx dot net changed:

   What|Removed |Added

 CC||georgmueller at gmx dot net

--- Comment #12 from Georg Müller georgmueller at gmx dot net ---
gcc --version
gcc (GCC) 5.0.1 20150413 (Red Hat 5.0.1-0.1)


When compiling the first example with -fopt-info, I see the following
difference between -O2 -funroll-loops and -O3:

gcc -Wall -Wextra -fopt-info -O2 -c 1.c -funroll-loops
1.c:11:5: note: loop turned into non-loop; it never loops.
1.c:11:5: note: loop with 6 iterations completely unrolled
1.c:10:3: note: loop turned into non-loop; it never loops.
1.c:10:3: note: loop with 5 iterations completely unrolled


gcc -Wall -Wextra -fopt-info -O3 -c 1.c -funroll-loops
1.c:11:5: note: loop turned into non-loop; it never loops.
1.c:11:5: note: loop with 7 iterations completely unrolled
1.c: In function 'foo':
1.c:12:23: warning: array subscript is above array bounds [-Warray-bounds]
   bar[j - 1] = baz[j - 1];
   ^
1.c:12:23: warning: array subscript is above array bounds [-Warray-bounds]
1.c:10:3: note: loop turned into non-loop; it never loops.
1.c:10:3: note: loop with 5 iterations completely unrolled

So, -O2 unrolls 6 and 5 iterations, while -O3 unrolls 7 and 5.

[Bug tree-optimization/65773] [5 Regression] GCC 5.1 miscompiles LLVM function AArch64InstrInfo::loadRegFromStackSlot()

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65773

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |INVALID

--- Comment #11 from Jakub Jelinek jakub at gcc dot gnu.org ---
So, the source in question is:
  const MachineInstrBuilder &MI = BuildMI(MBB, MBBI, DL, get(Opc))
                                      .addReg(DestReg, getDefRegState(true))
                                      .addFrameIndex(FI);
  if (Offset)
    MI.addImm(0);
  MI.addMemOperand(MMO);
which in *.gimple dump looks like:
  try
{
  D.382398 = this->D.207751.D.207513.D.205713;
  D.382399 = llvm::MCInstrInfo::get (D.382398, Opc);
  D.325474 = llvm::BuildMI (MBB, MBBI, D.325473, D.382399);
[return slot optimization]
  try
{
  D.382400 = llvm::getDefRegState (1);
  D.382401 = llvm::MachineInstrBuilder::addReg (D.325474,
DestReg, D.382400, 0);
  MI = llvm::MachineInstrBuilder::addFrameIndex (D.382401, FI);
}
  finally
{
  D.325474 = {CLOBBER};
}
}
  finally
{
  llvm::DebugLoc::~DebugLoc (D.325473);
  D.325473 = {CLOBBER};
}
  if (Offset != 0) goto D.382402; else goto D.382403;
  D.382402:
  llvm::MachineInstrBuilder::addImm (MI, 0);
  goto D.382404;
  D.382403:
  D.382404:
  llvm::MachineInstrBuilder::addMemOperand (MI, MMO);
which suggests that the temporary that BuildMI returns goes out of scope at the
end of the
  const MachineInstrBuilder &MI = BuildMI(MBB, MBBI, DL, get(Opc))
                                      .addReg(DestReg, getDefRegState(true))
                                      .addFrameIndex(FI);
and sets the reference to the out-of-scope temporary.
  const MachineInstrBuilder &addReg(unsigned RegNo, unsigned flags = 0,
                                    unsigned SubReg = 0) const {
ends with
    return *this;
and
  const MachineInstrBuilder &addFrameIndex(int Idx) const {
    MI->addOperand(*MF, MachineOperand::CreateFI(Idx));
    return *this;
  }
too.


[Bug tree-optimization/65773] [5 Regression] GCC 5.1 miscompiles LLVM function AArch64InstrInfo::loadRegFromStackSlot()

2015-04-16 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65773

--- Comment #12 from Andrew Pinski pinskia at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #11)
 So, the source in question is:
   const MachineInstrBuilder &MI = BuildMI(MBB, MBBI, DL, get(Opc))
                                       .addReg(DestReg, getDefRegState(true))
                                       .addFrameIndex(FI);
  

The way to fix LLVM's sources is to do:

const MachineInstrBuilder &MI = BuildMI(MBB, MBBI, DL, get(Opc));
MI.addReg(DestReg, getDefRegState(true)).addFrameIndex(FI);

This allows the return value of BuildMI to extend its lifetime past the
end of the statement.
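
The lifetime rules involved can be shown with a self-contained sketch. Builder and the three helper functions are hypothetical stand-ins written for this illustration, not LLVM code. The sketch counts live objects so that no dangling reference is ever read: it only shows when the temporary is destroyed in each pattern.

```cpp
#include <cassert>

// Stand-in for a builder whose chained methods return *this by reference.
struct Builder {
  int* live;   // number of Builder objects currently alive
  int* total;  // running sum of the values passed to add()

  Builder(int* live_count, int* sum) : live(live_count), total(sum) { ++*live; }
  Builder(const Builder& other) : live(other.live), total(other.total) { ++*live; }
  ~Builder() { --*live; }

  const Builder& add(int x) const { *total += x; return *this; }
};

// Copying the result of the chain: the temporary dies at the end of the
// statement, but `b` is an independent object and stays valid.
inline int live_after_copy(int* sum)
{
  int live = 0;
  const Builder b = Builder{&live, sum}.add(1).add(2);
  return live;  // only the copy is alive
}

// Binding a reference directly to the temporary extends its lifetime to
// that of the reference, so chaining in a later statement is safe.
inline int live_after_direct_bind(int* sum)
{
  int live = 0;
  const Builder& b = Builder{&live, sum};
  b.add(1).add(2);
  return live;  // the lifetime-extended temporary is alive
}

// A chain evaluated on its own: the temporary is destroyed at the end of
// the full expression, so a reference bound to the chain's result would
// refer to a dead object from the next statement on.
inline int live_after_chain_only(int* sum)
{
  int live = 0;
  Builder{&live, sum}.add(1);
  return live;  // nothing is alive any more
}
```

The first two helpers report one live object and the third reports none, which matches the reasoning in the comments above: lifetime extension applies only when the reference binds to the temporary itself, not to a reference returned by a member function called on it.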


[Bug tree-optimization/65774] [6.0 regression] FAIL: gcc.dg/builtin-arith-overflow-1.c (internal compiler error)

2015-04-16 Thread sch...@linux-m68k.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65774

--- Comment #3 from Andreas Schwab sch...@linux-m68k.org ---
wi::sext has undefined behaviour with offset == 0.
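
To make the general hazard concrete, here is a generic sketch of a sign extension helper that rejects a zero width instead of computing a shift count from it. checked_sext is a hypothetical function written for this illustration; it is not GCC's wi::sext and makes no claim about how that function is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extends the low `bits` bits of `value` into `*out`.
// Returns false for widths that have no meaningful result (0 or > 64),
// because a shift count derived from such a width would be out of range.
inline bool checked_sext(std::int64_t value, unsigned bits, std::int64_t* out)
{
  if (bits == 0 || bits > 64)
    return false;
  if (bits == 64) {
    *out = value;  // nothing to extend
    return true;
  }
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  const std::uint64_t low =
      static_cast<std::uint64_t>(value) & ((sign << 1) - 1);
  // Flipping the sign bit and subtracting it again propagates it upwards.
  *out = static_cast<std::int64_t>(low ^ sign) -
         static_cast<std::int64_t>(sign);
  return true;
}
```

A caller that may legitimately see a width of zero has to decide what that case means before asking for the extension, which is the kind of guard the helper forces.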


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread amacleod at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #31 from Andrew Macleod amacleod at redhat dot com ---
(In reply to mwahab from comment #30)
 (In reply to James Greenhalgh from comment #28)
 
  I don't think this is particularly onerous for a target. The tough part in
  all of this is figuring out the minimal cost instructions at the ISA level
  to use for the various __atomic primitives. Extending support to a stronger
  model, should be straightforward given explicit documentation of the
  stronger ordering requirements.
 
 
 My objection to using sync patterns is that it does take more work, both for
 the initial implementation and for continuing maintenance. It also means
 adding sync patterns to targets that would not otherwise need them and
 preserving a part of the gcc infrastructure that is only needed for a legacy
 feature and could otherwise be targeted for removal.
 

Targets that don't need special sync patterns (ie most of them) simply don't
provide them.   The expanders see no sync pattern and use SEQ_CST, exactly like
they do today.

sync patterns would only be provided by targets which do need to do something
different.   This means there is no potential bug introduction on unaffected
targets.. 

  This also gives us an easier route to fixing any issues with the
  acquire/release __sync primitives (__sync_lock_test_and_set and
  __sync_lock_release) if we decide that these also need to be stronger than
  their C++11 equivalents.
 
 This seems like a chunky workaround to avoid having to add a barrier
 representation to gcc.  It's a, not necessarily straightforward, reworking
 of the middle-end to use patterns that would need to be added to
 architectures that don't currently have them and at least checked in the
 architectures that do.

Well, it actually returns to the situation before they were merged. We used
to look for sync patterns too... until I thought they were redundant.

 
 This seems to come down to what enum memmodel is supposed to be for. If it
 is intended to represent the barriers provided by C11/C++11 atomics and
 nothing else then a workaround seems unavoidable. If it is meant to
 represent barriers needed by gcc to compile user code then, it seems to me,
 that it would be better to just add the barrier to the enum and update code
 where necessary. 
 
 Since memmodel is relatively recent, was merged from what looks like the C++
 memory model branch (cxx-mem-model), and doesn't seem to have changed since
 then, it's maybe not surprising that it doesn't already include everything
 needed by gcc. I don't see that adding to it should be prohibited, provided
 the additions can be shown to be strictly required.

The intention was to deprecate __sync and support just the c++ memory model. 
SEQ_CST == SYNC was the original intent and understanding, thus the code merge.
 If we decide we want/need to provide a stronger form of SEQ_CST and call it
SYNC, then we can do that as required... but I'm not envisioning a lot of
future additional memory models. Although never say never I suppose.


[Bug tree-optimization/64277] [4.9 Regression] Incorrect warning array subscript is above array bounds

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64277

--- Comment #23 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Thu Apr 16 12:03:11 2015
New Revision: 222146

URL: https://gcc.gnu.org/viewcvs?rev=222146&root=gcc&view=rev
Log:
2015-04-16  Richard Biener  rguent...@suse.de

PR tree-optimization/64277
* tree-vrp.c (check_array_ref): Fix anti-range handling,
simplify upper bound handling.
(search_for_addr_array): Simplify.
(check_array_bounds): Handle ADDR_EXPRs here.
(check_all_array_refs): Simplify.

* gcc.dg/Warray-bounds-14.c: New testcase.
* gcc.dg/Warray-bounds-15.c: Likewise.
* c-c++-common/ubsan/bounds-4.c: Disable -Warray-bounds.
* c-c++-common/ubsan/bounds-6.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.dg/Warray-bounds-14.c
trunk/gcc/testsuite/gcc.dg/Warray-bounds-15.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/c-c++-common/ubsan/bounds-4.c
trunk/gcc/testsuite/c-c++-common/ubsan/bounds-6.c
trunk/gcc/tree-vrp.c


[Bug tree-optimization/65774] [6.0 regression] FAIL: gcc.dg/builtin-arith-overflow-1.c (internal compiler error)

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65774

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.


[Bug tree-optimization/65774] [6.0 regression] FAIL: gcc.dg/builtin-arith-overflow-1.c (internal compiler error)

2015-04-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65774

--- Comment #5 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Thu Apr 16 12:10:34 2015
New Revision: 222147

URL: https://gcc.gnu.org/viewcvs?rev=222147&root=gcc&view=rev
Log:
2015-04-16  Richard Biener  rguent...@suse.de

PR tree-optimization/65774
* tree-ssa-ccp.c (evaluate_stmt): Constrain types we invoke
bit-value tracking on.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-ccp.c


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #14 from H.J. Lu hjl.tools at gmail dot com ---
(In reply to Jakub Jelinek from comment #10)
 (In reply to H.J. Lu from comment #9)
  Created attachment 35327 [details]
  A different patch
  
  On x86, this issue only shows up with PIE. Here is a different
  patch to treat common symbol defined locally only if the backend
  passes true common_maybe_local.  For x86-64, it is true only if
  HAVE_LD_PIE_COPYRELOC is 1.  For i386, it is always false.  If
  we aren't building PIE, common_maybe_local is true or false
  doesn't make a difference for x86 since the common symbol is
  always referenced normally with copy reloc.  For PIE on x86-64,
  common symbol is local only if HAVE_LD_PIE_COPYRELOC is 1.
 
 +
 +  /* For common symbol, it is defined locally only if common_maybe_local
 + is true.  */
 +  bool defined_locally = (!DECL_EXTERNAL (exp)
 +                          && (!DECL_COMMON (exp) || common_maybe_local));
 
 I think better would be:
   bool uninited_common = (DECL_COMMON (exp)
                           && (DECL_INITIAL (exp) == NULL
                               || (!in_lto_p
                                   && DECL_INITIAL (exp) == error_mark_node)));
   /* For common symbol, it is defined locally only if common_maybe_local
      is true.  */
   bool defined_locally = (!DECL_EXTERNAL (exp)
                           && (!uninited_common || common_maybe_local));
 ...
 and then use
   /* Uninitialized COMMON variable may be unified with symbols
  resolved from other modules.  */
   if (uninited_common && !resolved_locally)
 return false;

What does initialized COMMON look like to linker?  If it is marked
as COMMON symbol to linker, it will be treated the same as
uninitialized COMMON symbol.
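
The condition suggested in the quoted reply can be modelled outside the compiler to see which combinations count as locally defined. This is only a sketch: struct decl_model and its fields are hypothetical stand-ins for the DECL_EXTERNAL, DECL_COMMON and DECL_INITIAL queries on a GCC tree node, and the in_lto_p/error_mark_node case is folded into a single has_initial flag.

```c
#include <stdbool.h>

/* Hypothetical model of the decl properties read by the quoted snippet.
   The real code queries a GCC tree node through macros.  */
struct decl_model
{
  bool external;     /* models DECL_EXTERNAL (exp) */
  bool common;       /* models DECL_COMMON (exp) */
  bool has_initial;  /* models "DECL_INITIAL (exp) is a real initializer" */
};

/* Models the suggested uninited_common computation.  */
static bool
uninited_common_p (const struct decl_model *d)
{
  return d->common && !d->has_initial;
}

/* Models the suggested defined_locally computation.  */
static bool
defined_locally_p (const struct decl_model *d, bool common_maybe_local)
{
  return !d->external && (!uninited_common_p (d) || common_maybe_local);
}
```

Under this model an uninitialized common symbol is treated as locally defined only when common_maybe_local is true, while an initialized common symbol is locally defined either way. Whether the linker really treats the two kinds differently is exactly the question raised above.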

[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

  Attachment #35327|0   |1
is obsolete||

--- Comment #19 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 35330
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35330&action=edit
A new patch


[Bug c++/65764] internal compiler error: in retrieve_specialization, at cp/pt.c:1048

2015-04-16 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65764

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #2 from Marek Polacek mpolacek at gcc dot gnu.org ---
Got fixed in r218955.


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #21 from Jakub Jelinek jakub at gcc dot gnu.org ---
I've repeated my test on the various architectures, this time with additional
readelf -Ws test | grep optopt
if the link succeeds.  And indeed, x86_64 with recent linker is the only one
where optopt is defined, rather than SHN_UNDEF in the PIE.  armv7hl, s390,
s390x, i686 and x86_64 with old linker all have optopt defined in the binary
for normal executable (!flag_pic) and SHN_UNDEF for PIE.

Thus, based on this I'd say that i386 backend should pass
!flag_pic || (TARGET_64BIT && HAVE_LD_PIE_COPYRELOC != 0)
to the new param (in ix86_binds_local_p).
Then, perhaps you should make default_binds_local_p_3 also non-static and
declared in output.h, ix86_binds_local_p should perhaps use it directly, and
default_binds_local_p_2 should have just a single argument, so that arm and
s390 backends (dunno, maybe aarch64 and a few others too) could use it directly
as their TARGET_BINDS_LOCAL_P definition.  default_binds_local_p_2 would then
call
default_binds_local_p_3 with
exp, flag_shlib != 0, true, false, !flag_pic
arguments.  And obviously both of them (default_binds_local_p{,_2}) should have
better documentation on how they differ.
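
As a rough illustration, the value proposed above for the new parameter can be written as a standalone predicate. The function name and its plain bool/int parameters are hypothetical stand-ins for flag_pic, TARGET_64BIT and HAVE_LD_PIE_COPYRELOC, which in GCC are a global flag, a target macro and a configure-time constant.

```c
#include <stdbool.h>

/* Sketch of the common_maybe_local value suggested for the i386 backend.
   Parameters are stand-ins for flag_pic, TARGET_64BIT and
   HAVE_LD_PIE_COPYRELOC.  */
static bool
x86_common_maybe_local (bool flag_pic, bool target_64bit,
                        int have_ld_pie_copyreloc)
{
  return !flag_pic || (target_64bit && have_ld_pie_copyreloc != 0);
}
```

This matches the readelf observations in the comment: for a normal executable (!flag_pic) the result is always true, and for PIE it is true only on x86_64 with a linker that supports copy relocations in PIE.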


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread mwahab at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #33 from mwahab at gcc dot gnu.org ---
(In reply to torvald from comment #32)
 (In reply to James Greenhalgh from comment #28)
  (In reply to torvald from comment #24)
   3) We could do something just on ARM (and scan other archs for similar
   issues).  That's perhaps the cleanest option.
  
  Which leaves 3). From Andrew's two proposed solutions:
 
 3) Also seems best to me.  2) is worst, 1) is too much of a stick.
 
  This also gives us an easier route to fixing any issues with the
  acquire/release __sync primitives (__sync_lock_test_and_set and
  __sync_lock_release) if we decide that these also need to be stronger than
  their C++11 equivalents.
 
 I don't think we have another case of different __sync vs. __atomics
 semantics in case of __sync_lock_test_and_set.  The current specification
 makes it clear that this is an acquire barrier, and how it describes the
 semantics (ie, loads and stores that are program-order before the acquire op
 can move to after it) , this seems to be consistent with the effects C11
 specifies for acquire MO (with perhaps the distinction that C11 is clear
 that acquire needs to be paired with some release op to create an ordering
 constraint).

Thanks, I suspect that the acquire barrier may not be as much of an issue
as I had remembered. (The issue came up while I was trying to understand the
C11 semantics.)

The test case (aarch64) I have is:

int foo = 0;
int bar = 0;
int T5(void)
{
  int x = __sync_lock_test_and_set(&foo, 1);
  return bar;
}

.L11:
    ldaxr w2, [x0]      ; load-acquire
    stxr  w3, w1, [x0]  ; store
    cbnz  w3, .L11
    ldr   w0, [x0, 4]   ; load
    ret

My concern was that the load could be speculated ahead of the store. Since the
store marks the end of the barrier, that could make it appear as if the load
had completed before the acquire-barrier.

In retrospect, I don't think that there will be a problem because any load that
could be moved would have to end up with the same value as if it had not moved.


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

  Attachment #35330|0   |1
is obsolete||

--- Comment #22 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 35331
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35331&action=edit
A patch

I am testing this.


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #24 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 35332
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35332&action=edit
A patch


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #23 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to H.J. Lu from comment #22)
 Created attachment 35331 [details]
 A patch
 
 I am testing this.

+static bool
+ix86_binds_local_p (const_tree exp)
+{
+  return default_binds_local_p_3 (exp, flag_shlib != 0, true, false,
+  (!flag_pic
+   || (TARGET_64BIT
+       && HAVE_LD_PIE_COPYRELOC != 0)));
+}

shouldn't the 4th argument be true?  At least before this patch, i386 backend
was the only one that passed true as extern_protected_data, but after this
patch you never pass true to that parameter, making it dead again.

Also, a typo:

lcoal -> local


[Bug c++/57472] internal compiler error: in finish_member_declaration, at cp/semantics.c

2015-04-16 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57472

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #4 from Marek Polacek mpolacek at gcc dot gnu.org ---
This could be a dup of PR50800.


[Bug sanitizer/65749] sanitizer stack trace pc off by 1

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65749

--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org ---
For the purpose of looking up the address in line table etc. IMNSHO the
subtraction of 1 is needed (that is what gcc unwinder does too, except for
signal frames where the pc must be on the faulting or asynchronously
interrupted insn).  But if the addresses are printed, supposedly it should
match what the debugger does, and at least gdb prints the address after the
call instruction, not address after the call instruction - 1.
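
The distinction drawn above between the address used for lookup and the address that is printed can be sketched as two small helpers. The function names and the signal_frame flag are illustrative only and are not taken from the sanitizer runtime or the gcc unwinder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address to use when searching the line table: one less than the
   return address, so that it falls inside the call instruction, except
   for signal frames, where the pc already points at the faulting or
   interrupted insn.  */
static uintptr_t
lookup_pc (uintptr_t frame_pc, bool signal_frame)
{
  return signal_frame ? frame_pc : frame_pc - 1;
}

/* Address to print for the user: the unmodified pc, which is what the
   comment says gdb shows.  */
static uintptr_t
display_pc (uintptr_t frame_pc)
{
  return frame_pc;
}
```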


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #25 from Jakub Jelinek jakub at gcc dot gnu.org ---
Comment on attachment 35332
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35332
A patch

An non-external
shouldn't this be
A non-external
?
Other than that LGTM, but I'd prefer another pair of eyes on this.


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread mwahab at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #36 from mwahab at gcc dot gnu.org ---
(In reply to Andrew Macleod from comment #31)
 
 
 Targets that don't need special sync patterns (ie most of them) simply don't
 provide them.   The expanders see no sync pattern and use SEQ_CST, exactly
 like they do today.

For current targets that may be true but assumes that no new target will need
the special sync patterns.

 sync patterns would only be provided by targets which do need to do
 something different.   This means there is no potential bug introduction on
 unaffected targets.. 

I was thinking of existing sync patterns in current backends which may have
been made redundant by the atomics builtins but are still there. There's a
danger that they'll get used even if they're not preferred for that target.
There's also the question of whether they're getting tested, assuming that they
were only ever generated for the __sync builtins.

 Well, it actually returns back to the situation before they were merged. We
 used to look for sync patterns too... until I thought they were redundant.

I believe that they are redundant (for __sync builtins anyway) since it looks
like everything could be done through the atomics + new barrier.

 The intention was to deprecate __sync and support just the c++ memory model.
 SEQ_CST == SYNC was the original intent and understanding, thus the code
 merge.  If we decide we want/need to provide a stronger form of SEQ_CST and
 call it SYNC, then we can do that as required... but I'm not envisioning a
 lot of future additional memory models. Although never say never I suppose.

I don't expect that many models will be needed, the set should certainly be
kept as small as possible. I think that extending it in this case is justified
because of the gap between what is needed and what is now available. 

The choice seems to be between:
- continuing the move away from the syncs to the atomics, which makes the
__sync and the __atomic builtins rely on one infrastructure;
- keeping both the atomics and the syncs infrastructures, with individual
targets choosing between them.

The first option seems better, not least because it reduces the number of
things that need to be understood when dealing with synchronization primitives.


[Bug target/64363] Unresolved labels with -fcheck-pointer-bounds and -mmpx

2015-04-16 Thread ienkovich at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64363

Ilya Enkovich ienkovich at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ienkovich at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #4 from Ilya Enkovich ienkovich at gcc dot gnu.org ---
Fixed


[Bug debug/65771] [5 Regression] ICE (in loc_list_from_tree, at dwarf2out.c:14964) on arm-linux-gnueabihf

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65771

--- Comment #9 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 35334
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35334&action=edit
gcc5-pr65771.patch

Untested fix.  Though, of course this is too risky for the 5 branch at this
point, and giving up on DEBUG_EXPR_DECL there is desirable too.  So I'll test
that first, and this patch only afterwards for the trunk only.


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #27 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to H.J. Lu from comment #26)
 Created attachment 35333 [details]
 A patch with updated comments

Found a couple of issues; here is an incremental diff.  It is mostly
formatting improvements, plus one fix in default_binds_local_p_2 (right now
unused, hopefully used incrementally by arm, s390 and perhaps other backends
later): it passed true to common_maybe_local unconditionally, when that works
fine only in non-PIE binaries (thus !flag_pic).

--- gcc/varasm.c
+++ gcc/varasm.c
@@ -6811,8 +6811,7 @@

 bool
 default_binds_local_p_3 (const_tree exp, bool shlib, bool weak_dominate,
- bool extern_protected_data,
- bool common_maybe_local)
+ bool extern_protected_data, bool common_maybe_local)
 {
   /* A non-decl is an entry in the constant pool.  */
   if (!DECL_P (exp))
@@ -6902,8 +6901,7 @@
 bool
 default_binds_local_p (const_tree exp)
 {
-  return default_binds_local_p_3 (exp, flag_shlib != 0, true, false,
-  false);
+  return default_binds_local_p_3 (exp, flag_shlib != 0, true, false, false);
 }

 /* Similar to default_binds_local_p, but common symbol may be local.  */
@@ -6912,14 +6910,13 @@
 default_binds_local_p_2 (const_tree exp)
 {
   return default_binds_local_p_3 (exp, flag_shlib != 0, true, false,
-  true);
+  !flag_pic);
 }

 bool
 default_binds_local_p_1 (const_tree exp, int shlib)
 {
-  return default_binds_local_p_3 (exp, shlib != 0, false, false,
-  false);
+  return default_binds_local_p_3 (exp, shlib != 0, false, false, false);
 }

 /* Return true when references to DECL must bind to current definition in


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

  Attachment #35333|0   |1
is obsolete||

--- Comment #28 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 35335
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35335&action=edit
The final patch


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

  Attachment #35332|0   |1
is obsolete||

--- Comment #26 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 35333
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35333&action=edit
A patch with updated comments


[Bug middle-end/64805] Specific use of __attribute ((always_inline)) breaks MPX functionality with -fcheck-pointer-bounds -mmpx

2015-04-16 Thread ienkovich at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64805

Ilya Enkovich ienkovich at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Ilya Enkovich ienkovich at gcc dot gnu.org ---
Fixed


[Bug target/65676] ICE: in extract_insn, at recog.c:2343 (unrecognizable insn) with -mavx512f -funsigned-char and __builtin_ia32_pmovsxwq512_mask()

2015-04-16 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65676

--- Comment #7 from Kirill Yukhin kyukhin at gcc dot gnu.org ---
Author: kyukhin
Date: Thu Apr 16 14:24:51 2015
New Revision: 222149

URL: https://gcc.gnu.org/viewcvs?rev=222149&root=gcc&view=rev
Log:

gcc/
PR target/65676
* config/i386/i386.c (fixup_modeless_constant): New.
(ix86_expand_args_builtin): Fixup modeless constant operand.
(ix86_expand_round_builtin): Ditto.
(ix86_expand_special_args_builtin): Ditto.
(ix86_expand_builtin): Ditto.
gcc/testsuite/
PR target/65676
* gcc.target/i386/sse-25.c: New.


Added:
branches/gcc-4_9-branch/gcc/testsuite/gcc.target/i386/sse-25.c
Modified:
branches/gcc-4_9-branch/gcc/ChangeLog
branches/gcc-4_9-branch/gcc/config/i386/i386.c
branches/gcc-4_9-branch/gcc/testsuite/ChangeLog


[Bug debug/65771] [5 Regression] ICE (in loc_list_from_tree, at dwarf2out.c:14964) on arm-linux-gnueabihf

2015-04-16 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65771

ktkachov at gcc dot gnu.org changed:

   What|Removed |Added

  Component|target  |debug

--- Comment #8 from ktkachov at gcc dot gnu.org ---
Changing component then


[Bug target/65779] [5/6 Regression] undefined local symbol on powerpc [regression]

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65779

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2015-04-16
 CC||aoliva at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org ---
There seem to be multiple issues here.

One is a bug in the TARGET_RELOCATABLE handling in the powerpc backend: it
emits the LCL and LCF symbols in the IL, but then emits them into the actual
assembly only if some conditions are satisfied.  If they make it through into
debug insns for some reason, that obviously results in the undefined
references.
Perhaps the target address legitimization should handle those, or if the
backend determines that it won't emit the symbols it should scan the IL for any
possible debug uses and reset them.  I'm afraid there is nothing dwarf2out can
do for this.

And another thing is why valtrack during DSE2 emits a debug instruction
containing this.
At *.split2 we have:
(insn 36 35 37 2 (set (reg/v:SI 0 0 [orig:239 s2 ] [239])
(lshiftrt:SI (reg/v:SI 3 3 [orig:265 adler ] [265])
(const_int 16 [0x10])))
../../../../../../../../../rtems/c/src/lib/libbsp/powerpc/motorola_powerpc/bootloader/../../../powerpc/shared/b
 (nil))
(debug_insn 37 36 39 2 (var_location:SI s2 (reg/v:SI 0 0 [orig:239 s2 ] [239]))
../../../../../../../../../rtems/c/src/lib/libbsp/powerpc/motorola_
 (nil))
which LGTM.
But then during pro_and_epilogue pass, the insn 36 is removed (perhaps some
kind of fast DCE?), and instead a code in the prologue that uses register 0 for
something completely different is added, which results in the wrong-debug
because the debug insns weren't reset.

As a workaround, -fno-var-tracking-assignments should cure this.


[Bug jit/63854] Fix memory leaks seen in JIT

2015-04-16 Thread ienkovich at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63854
Bug 63854 depends on bug 64003, which changed state.

Bug 64003 Summary: valgrind complains about get_attr_length_nobnd in 
insn-attrtab.c from i386.md
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED


[Bug target/64003] valgrind complains about get_attr_length_nobnd in insn-attrtab.c from i386.md

2015-04-16 Thread ienkovich at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003

Ilya Enkovich ienkovich at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ienkovich at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #28 from Ilya Enkovich ienkovich at gcc dot gnu.org ---
Fixed


[Bug target/65771] [5 Regression] ICE (in loc_list_from_tree, at dwarf2out.c:14964) on arm-linux-gnueabihf

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65771

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org ---
Better testcase (no UB, warning free):

struct S { int s; int t; };
__thread struct S a[10];
int b;

void
foo ()
{
  int c = a[b].t;
  (void) c;
}


[Bug target/65771] [5 Regression] ICE (in loc_list_from_tree, at dwarf2out.c:14964) on arm-linux-gnueabihf

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65771

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Target|arm-linux-gnueabihf |

--- Comment #7 from Jakub Jelinek jakub at gcc dot gnu.org ---
Happens on x86_64-linux too, started with r214899.

So, at *.optimized we have:
  # DEBUG D#2 = b
  # DEBUG D#1 = a[D#2].t
  # DEBUG c = D#1
which is expanded as:
(debug_insn 5 2 6 2 (var_location:SI D#2 (mem/c:SI (symbol_ref:SI ("b") [flags
0x82]  <var_decl 0x7f8953872990 b>) [0 b+0 S4 A32])) pr65771.c:8 -1
 (nil))
(debug_insn 6 5 7 2 (var_location:SI D#1 (mem/j:SI (plus:SI (ashift:SI
(debug_expr:SI D#2)
(const_int 3 [0x3]))
(const:SI (plus:SI (symbol_ref:SI ("a") [flags 0xaa]  <var_decl
0x7f8953872900 a>)
(const_int 4 [0x4])))) [0 a[D#2].t+0 S4 A32])) pr65771.c:8
-1
 (nil))
(debug_insn 7 6 0 2 (var_location:SI c (debug_expr:SI D#1)) pr65771.c:8 -1
 (nil))
and vartracking makes:
(note 15 2 14 2 (var_location c (mem/j:SI (plus:SI (ashift:SI (mem/c:SI
(symbol_ref:SI ("b") [flags 0x82]  <var_decl 0x7f8953872990 b>) [0 b+0 S4
A32])
(const_int 3 [0x3]))
(const:SI (plus:SI (symbol_ref:SI ("a") [flags 0xaa]  <var_decl
0x7f8953872900 a>)
(const_int 4 [0x4])))) [0 a[D#2].t+0 S4 A32]))
NOTE_INSN_VAR_LOCATION)
out of this.  Var-tracking for obvious reasons can only replace the
DEBUG_EXPR_DECLs when they appear in debug_expr RTL, because we track
DEBUG_EXPR_DECL values at RTL as RTL expressions, rather than tree.  The
problem is I believe that we try to use the MEM_EXPR as a fallback and if it
contains DEBUG_EXPR_DECL, dwarf2out.c is upset.


[Bug target/65697] __atomic memory barriers not strong enough for __sync builtins

2015-04-16 Thread jgreenhalgh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #35 from James Greenhalgh jgreenhalgh at gcc dot gnu.org ---
(In reply to torvald from comment #32)
 (In reply to James Greenhalgh from comment #28)
  This also gives us an easier route to fixing any issues with the
  acquire/release __sync primitives (__sync_lock_test_and_set and
  __sync_lock_release) if we decide that these also need to be stronger than
  their C++11 equivalents.
 
 I don't think we have another case of different __sync vs. __atomics
 semantics in case of __sync_lock_test_and_set.  The current specification
 makes it clear that this is an acquire barrier, and how it describes the
 semantics (ie, loads and stores that are program-order before the acquire op
 can move to after it) , this seems to be consistent with the effects C11
 specifies for acquire MO (with perhaps the distinction that C11 is clear
 that acquire needs to be paired with some release op to create an ordering
 constraint).

I think that the question is which parts of a RMW operation with
MEMMODEL_ACQUIRE semantics are ordered. My understanding is that in C++11
MEMMODEL_ACQUIRE only applies to the load half of the operation. So an
observer to:

  atomic_flag_test_and_set_explicit (&foo, memory_order_acquire)
  atomic_store_explicit (&bar, 1, memory_order_relaxed)

Is permitted to observe a write to bar before a write to foo (but not before
the read from foo).

My reading of the Itanium ABI is that the acquire barrier applies to the entire
operation (Andrew, I think you copied these over exactly backwards in comment
34 ;) ):

  Disallows the movement of memory references to visible data from
   after the intrinsic (in program order) to before the intrinsic (this
   behavior is desirable at lock-acquire operations, hence the name).

The definition of __sync_lock_test_and_set is:

  Behavior:
   • Atomically store the supplied value in *ptr and return the old value
 of *ptr. (i.e.)
   { tmp = *ptr; *ptr = value; return tmp; }
   • Acquire barrier.

So by the strict letter of the specification, no memory references to visible
data should be allowed to move from after the entire body of the intrinsic to
before it. That is to say in:

  __sync_lock_test_and_set (&foo, 1)
  bar = 1

an observer should not be able to observe the write to bar before the write to
foo. This is a difference from the C++11 semantics.

I'm not worried about __sync_lock_release, I think the documentation is strong
enough and unambiguous.
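
For reference, the two forms being compared can be written side by side as a minimal try-lock. This is only a sketch: the variable and function names are made up, and single-threaded code like this shows the matching return-value behaviour only, not the ordering difference under discussion, which would need a concurrent observer on a weakly ordered target.

```c
#include <stdatomic.h>

static int lock_word;          /* used with the __sync builtins */
static atomic_int lock_atomic; /* used with the C11 atomics */

/* __sync form: stores 1 and returns the old value; documented as an
   acquire barrier.  */
static int
try_lock_sync (void)
{
  return __sync_lock_test_and_set (&lock_word, 1);
}

/* __sync form: stores 0; documented as a release barrier.  */
static void
unlock_sync (void)
{
  __sync_lock_release (&lock_word);
}

/* C11 form: an exchange with acquire memory order.  */
static int
try_lock_c11 (void)
{
  return atomic_exchange_explicit (&lock_atomic, 1, memory_order_acquire);
}

/* C11 form: a store with release memory order.  */
static void
unlock_c11 (void)
{
  atomic_store_explicit (&lock_atomic, 0, memory_order_release);
}
```

Both try-lock functions return 0 when the lock was free and 1 when it was already held. The open question in this thread is only whether a later store may become visible before the store half of the exchange.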

[Bug bootstrap/63995] Bootstrap error with -mmpx -fcheck-pointer-bounds

2015-04-16 Thread ienkovich at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63995

Ilya Enkovich ienkovich at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||ienkovich at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #13 from Ilya Enkovich ienkovich at gcc dot gnu.org ---
Fixed


[Bug fortran/61831] [4.9/ 5 Regression] runtime error: pointer being freed was not allocated

2015-04-16 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61831

--- Comment #45 from Dominique d'Humieres dominiq at lps dot ens.fr ---
 Created attachment 34942 [details]
 Better patch

Sorry for the delay, but I noticed this new patch only yesterday!-(

 I'm not working on this, so I'm attaching the current patch in my work tree,
 before it's lost.

With this patch there is no memory leak
gfortran.dg/alloc_comp_constructor_1.f90 (19 builtin_free) and
gfortran.dg/class_array_15.f03 (12 builtin_free).

 If I remember correctly, the patch passes the testsuite without regressing.

With this patch I see a regression for gfortran.dg/alloc_comp_assign_10.f90.

 The test in comment #41 exhibits some leaks with it though.

With the patch I see 33 builtin_free instead of 34 without the patch.

 And I haven't looked yet at Dominique's feedback in comment #43.

The test in comment #41 fails at run time when compiled with
-fsanitize=address. If I take the complement of the reduced test posted in
comment #43, everything works fine, but for one builtin_free less.

I did not investigate what is wrong with the test in comment #43 (will do).

 Oh, and I have to double check that the gfc_trans_subarray_assign hunk
 is really necessary.

If you are speaking of the hunk

@@ -6263,7 +6283,7 @@ gfc_trans_subarray_assign (tree dest, gfc_componen

   gfc_conv_expr (&rse, expr);

-  tmp = gfc_trans_scalar_assign (&lse, &rse, cm->ts, true, false, true);
+  tmp = gfc_trans_scalar_assign (&lse, &rse, cm->ts, true, true, true);
   gfc_add_expr_to_block (&body, tmp);

   gcc_assert (rse.ss == gfc_ss_terminator);

I think it is needed, otherwise the test in comment #41 fails at run time with

a.out(89696,0x7fff74744300) malloc: *** mach_vm_map(size=18446603338973675520)
failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

However the hunk

@@ -4990,7 +5010,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol *

   tmp = gfc_deallocate_alloc_comp (e->ts.u.derived, tmp, parm_rank);

-  gfc_add_expr_to_block (&se->post, tmp);
+  gfc_prepend_expr_to_block (&se->post, tmp);
 }

   /* Add argument checking of passing an unallocated/NULL actual to

does not seem necessary, but is the cause of the regression, i.e., no
regression without it.

The patch (without the above hunk) fixes a lot of problems and makes the code
simpler. IMO it should be submitted and the few left issues can be deferred
(with new PRs).


[Bug c++/50800] Internal compiler error in finish_member_declarations, possibly related to may_alias attribute

2015-04-16 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50800

Jason Merrill jason at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 Depends on||18174
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

--- Comment #12 from Jason Merrill jason at gcc dot gnu.org ---
This is related to the fix for bug 18174, which made int a different type from
A::TA.  But this difference is not reflected in the mangling, so B<int> and
B<A::TA> have the same mangling, which we can't allow.

The compiler tries to handle this by stripping attributes from template
arguments, so that B<A::TA> is treated as B<int>.  This was failing in this
case because the middle end was giving the two types different TYPE_CANONICAL. 
I guess we need to work harder at stripping the attributes.


[Bug middle-end/65777] SPECOMP component 362.fma3d fails with error SIGSEGV, segmentation fault occurred

2015-04-16 Thread rajendray_14 at yahoo dot co.in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65777

--- Comment #3 from raj rajendray_14 at yahoo dot co.in ---
I'm using Intel Compiler version 15.0 and update 2. Both have a similar
issue, and it could be because of the libraries from the installed GCC
version. I tried setting the stack size to unlimited and to values from
190 MB to 16 GB; none worked.


[Bug c++/65789] New: cannot convert calling convention

2015-04-16 Thread puetzk at puetzk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65789

Bug ID: 65789
   Summary: cannot convert calling convention
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: puetzk at puetzk dot org

Created attachment 35339
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35339&action=edit
example using a member function pointer - poor error message

When the calling convention of the source member function pointer does not
match that of the destination member function pointer, the compiler reports
the error but the message is unhelpful; it gives exactly the same description
for both the input and output types, failing to mention the difference in
calling convention. Conversions of a non-member (or static member) function to
an ordinary function pointer behave as expected, and mention the attribute.

This is demonstrated by the two attached test-case files:
* test-static.cpp (free function pointer) fails using the phrase "invalid
conversion", which correctly identifies the difference
* test-member.cpp (member function pointer) fails using the phrase "cannot
convert", but gives identical descriptions of the input and output types.

g++ -fsyntax-only test-static.cpp
test-static.cpp: In function 'int main()':
test-static.cpp:6:22: error: invalid conversion from 'void
(__attribute__((__stdcall__)) *)()' to 'void (*)()' [-fpermissive]
  void (*f)() = &foo::bar;
  ^

g++ -fsyntax-only test-member.cpp
test-member.cpp: In function 'int main()':
test-member.cpp:6:27: error: cannot convert 'void (foo::*)()' to 'void
(foo::*)()' in initialization
  void (foo::*f)() = foo::bar;
   ^
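
The attachments are not reproduced in this archive. As a rough illustration
only, the sketch below shows the kind of code involved and two ways of keeping
the calling convention consistent. The names DEMO_STDCALL, baz, and run_demo
are hypothetical and do not come from the report, and the macro expands to
nothing outside 32-bit x86 so that the sketch still builds on other targets.

```cpp
#include <cassert>

// Assumption: stdcall is only meaningful on 32-bit x86, so the macro expands
// to nothing elsewhere and the sketch still builds on other targets.
#if defined(__i386__)
#  define DEMO_STDCALL __attribute__((__stdcall__))
#else
#  define DEMO_STDCALL
#endif

// Hypothetical stand-in for the class used in the attached test cases.
struct foo {
    static int calls;
    static void DEMO_STDCALL bar() { ++calls; }  // static member function
    void DEMO_STDCALL baz() { ++calls; }         // non-static member function
};
int foo::calls = 0;

int run_demo() {
    foo::calls = 0;

    // Repeating the attribute in the pointer type makes the types match.
    // Per the report, the plain form `void (*f)() = foo::bar;` is rejected
    // on i686 with "invalid conversion".
    void (DEMO_STDCALL *f)() = foo::bar;

    // Letting the compiler deduce the pointer-to-member type avoids having
    // to spell the calling convention by hand.
    auto g = &foo::baz;

    foo obj;
    f();         // call through the ordinary function pointer
    (obj.*g)();  // call through the pointer to member
    assert(foo::calls == 2);
    return foo::calls;
}
```

This does not address the diagnostic itself, which is the subject of the
report; it only shows where the calling-convention attribute enters the types
being compared.
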
gcc -v
Using built-in specs.
COLLECT_GCC=C:\projects\mingw32-4.9.2\bin\g++.exe
COLLECT_LTO_WRAPPER=C:/projects/mingw32-4.9.2/bin/../libexec/gcc/i686-w64-mingw32/4.9.2/lto-wrapper.exe
Target: i686-w64-mingw32
Configured with: ../../../src/gcc-4.9.2/configure --host=i686-w64-mingw32
--build=i686-w64-mingw32 --target=i686-w64-mingw32 --prefix=/mingw32
--with-sysroot=/c/mingw492/i686-492-win32-dwarf-rt_v3-rev1/mingw32
--with-gxx-include-dir=/mingw32/i686-w64-mingw32/include/c++ --enable-shared
--enable-static --disable-multilib
--enable-languages=ada,c,c++,fortran,objc,obj-c++,lto
--enable-libstdcxx-time=yes --enable-threads=win32 --enable-libgomp
--enable-libatomic --enable-lto --enable-graphite --enable-checking=release
--enable-fully-dynamic-string --enable-version-specific-runtime-libs
--disable-sjlj-exceptions --with-dwarf2 --disable-isl-version-check
--disable-cloog-version-check --disable-libstdcxx-pch --disable-libstdcxx-debug
--enable-bootstrap --disable-rpath --disable-win32-registry --disable-nls
--disable-werror --disable-symvers --with-gnu-as --with-gnu-ld --with-arch=i686
--with-tune=generic --with-libiconv --with-system-zlib
--with-gmp=/c/mingw492/prerequisites/i686-w64-mingw32-static
--with-mpfr=/c/mingw492/prerequisites/i686-w64-mingw32-static
--with-mpc=/c/mingw492/prerequisites/i686-w64-mingw32-static
--with-isl=/c/mingw492/prerequisites/i686-w64-mingw32-static
--with-cloog=/c/mingw492/prerequisites/i686-w64-mingw32-static
--enable-cloog-backend=isl --with-pkgversion='i686-win32-dwarf-rev1, Built by
MinGW-W64 project' --with-bugurl=http://sourceforge.net/projects/mingw-w64
CFLAGS='-O2 -pipe
-I/c/mingw492/i686-492-win32-dwarf-rt_v3-rev1/mingw32/opt/include
-I/c/mingw492/prerequisites/i686-zlib-static/include
-I/c/mingw492/prerequisites/i686-w64-mingw32-static/include' CXXFLAGS='-O2
-pipe -I/c/mingw492/i686-492-win32-dwarf-rt_v3-rev1/mingw32/opt/include
-I/c/mingw492/prerequisites/i686-zlib-static/include
-I/c/mingw492/prerequisites/i686-w64-mingw32-static/include' CPPFLAGS=
LDFLAGS='-pipe -L/c/mingw492/i686-492-win32-dwarf-rt_v3-rev1/mingw32/opt/lib
-L/c/mingw492/prerequisites/i686-zlib-static/lib
-L/c/mingw492/prerequisites/i686-w64-mingw32-static/lib
-Wl,--large-address-aware'
Thread model: win32
gcc version 4.9.2 (i686-win32-dwarf-rev1, Built by MinGW-W64 project)


[Bug c++/65789] cannot convert calling convention

2015-04-16 Thread puetzk at puetzk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65789

--- Comment #1 from Kevin Puetz puetzk at puetzk dot org ---
Created attachment 35340
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35340&action=edit
example using a free function pointer - works as expected


[Bug target/65787] [5.1 regression] Miscompile due to bad vector swap optimization for little endian

2015-04-16 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65787

Bill Schmidt wschmidt at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2015-04-16
     Ever confirmed|0                           |1


[Bug target/65535] powerpc64-freebsd build failure

2015-04-16 Thread andreast at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65535

--- Comment #5 from Andreas Tobler andreast at gcc dot gnu.org ---
Here is my patch:

https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00839.html

I do not like to hardcode a version number; it would have to be updated
whenever needed.

The important thing here is that, if you build a cross compiler, you must make
sure that the binutils and the gcc targets match, including the major version
number.

E.g., the binutils --target=amd64-unknown-freebsd10.1 must be equal to the gcc
--target=amd64-unknown-freebsd10.1.

It doesn't work if you give a target without a version number to binutils and
one with a version number to the gcc configury.

The mentioned patch was tested on a CentOS to amd64-unknown-freebsd11.0 cross
and also on native FreeBSD (armv6/hf, amd64).


[Bug target/65787] [5 Regression] Miscompile due to bad vector swap optimization for little endian

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65787

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org ---
Though, len is used in just one place, so perhaps it is even better to just
remove the {}s and use
  if (XVECLEN (op, 0) != 2)
    return 0;
and drop the len variable altogether; it will be more readable that way.


[Bug target/65787] [5 Regression] Miscompile due to bad vector swap optimization for little endian

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65787

--- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org ---
The formatting looks weird, look how the case UNSPEC: is formatted -
{ goes below case PARALLEL:, two columns to the right, then another two columns
to the right the body of the scope, then } below the {.


[Bug target/65780] [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780

--- Comment #35 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 35341
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35341&action=edit
The final patch with variable name change and updated comments


[Bug middle-end/65788] New: [6 Regression] 416.gamess in SPEC CPU 2006 failed to build

2015-04-16 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65788

Bug ID: 65788
   Summary: [6 Regression] 416.gamess in SPEC CPU 2006 failed to
build
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com

On Linux/x86-64, r222049 caused:

gfortran  -O3 -funroll-loops -ffast-math -fwhole-program -flto=jobserver
-fuse-linker-plugin  -DSPEC_CPU_LP64 aldeci.fppized.o algnci.fppized.o
basecp.fppized.o basext.o bashuz.o bashz2.o basn21.fppized.o basn31.o
bassto.fppized.o blas.fppized.o ccaux.o ccsdt.fppized.o chgpen.fppized.o
cisgrd.fppized.o cosmo.fppized.o cphf.fppized.o cpmchf.o cprohf.fppized.o
ddi.fppized.o delocl.fppized.o dft.fppized.o dftaux.fppized.o dftexc.fppized.o
dftfun.o dftgrd.fppized.o dftint.fppized.o dgeev.o dmulti.fppized.o
drc.fppized.o dummygetenv.fppized.o ecp.fppized.o ecpder.fppized.o
ecplib.fppized.o ecppot.o efdrvr.fppized.o efelec.o efgrd2.fppized.o
efgrda.fppized.o efgrdb.fppized.o efgrdc.fppized.o efinp.fppized.o
efinta.fppized.o efintb.fppized.o efpaul.fppized.o efpcm.fppized.o
efpcov.fppized.o eigen.fppized.o eomcc.fppized.o ffield.fppized.o
frfmt.fppized.o fsodci.fppized.o gamess.fppized.o globop.fppized.o
gradex.fppized.o grd1.fppized.o grd2a.fppized.o grd2b.o grd2c.fppized.o
guess.fppized.o gugdga.fppized.o gugdgb.fppized.o gugdm.fppized.o
gugdm2.fppized.o gugdrt.fppized.o gugem.fppized.o gugsrt.fppized.o
gvb.fppized.o hess.fppized.o hss1a.fppized.o hss1b.fppized.o hss2a.fppized.o
hss2b.fppized.o inputa.fppized.o inputb.fppized.o inputc.fppized.o
int1.fppized.o int2a.fppized.o int2b.o iolib.fppized.o lagran.fppized.o
local.fppized.o loccd.fppized.o locpol.fppized.o mccas.fppized.o mcjac.o
mcpinp.fppized.o mcpint.fppized.o mcplib.o mcqdpt.fppized.o mcqdwt.o
mcqud.fppized.o mcscf.fppized.o mctwo.fppized.o morokm.fppized.o mp2.fppized.o
mp2ddi.fppized.o mp2grd.fppized.o mpcdat.o mpcgrd.fppized.o mpcint.fppized.o
mpcmol.fppized.o mpcmsc.fppized.o mthlib.fppized.o nameio.fppized.o
nmr.fppized.o olix.o ordint.fppized.o ormas1.fppized.o parley.fppized.o
pcm.fppized.o pcmcav.o pcmcv2.fppized.o pcmder.fppized.o pcmdis.fppized.o
pcmief.fppized.o pcmpol.fppized.o pcmvch.fppized.o prpel.fppized.o
prplib.fppized.o prppop.fppized.o qeigen.fppized.o qfmm.fppized.o
qmfm.fppized.o qmmm.fppized.o qrel.fppized.o raman.fppized.o rhfuhf.fppized.o
rxncrd.fppized.o ryspol.o scflib.fppized.o scfmi.fppized.o scrf.fppized.o
sobrt.fppized.o soffac.fppized.o solib.fppized.o sozeff.fppized.o
statpt.fppized.o surf.fppized.o symorb.fppized.o symslc.fppized.o
tdhf.fppized.o trans.fppized.o trfdm2.fppized.o trnstn.fppized.o
trudge.fppized.o umpddi.fppized.o unport.fppized.o vibanl.fppized.o
vscf.fppized.o zheev.fppized.o zmatrx.fppized.o -lm -o
gamess
cisgrd.fppized.f: In function 'cisao':
cisgrd.fppized.f:319:0: internal compiler error: in set_lattice_value, at
tree-ssa-ccp.c:524
   SUBROUTINE CISAO
 ^
0xb343c7 set_lattice_value
../../src-trunk/gcc/tree-ssa-ccp.c:524
0xb3957a visit_assignment
../../src-trunk/gcc/tree-ssa-ccp.c:2271
0xb396dc ccp_visit_stmt
../../src-trunk/gcc/tree-ssa-ccp.c:2345
0xbba7c4 simulate_stmt
../../src-trunk/gcc/tree-ssa-propagate.c:348
0xbba977 process_ssa_edge_worklist
../../src-trunk/gcc/tree-ssa-propagate.c:422
0xbbc178 ssa_propagate(ssa_prop_result (*)(gimple_statement_base*, edge_def**,
tree_node**), ssa_prop_result (*)(gphi*))
../../src-trunk/gcc/tree-ssa-propagate.c:896
0xb32554 do_ssa_ccp
../../src-trunk/gcc/tree-ssa-ccp.c:2386
0xb32554 execute
../../src-trunk/gcc/tree-ssa-ccp.c:2419
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
make[4]: *** [/tmp/cckiKD9F.ltrans29.ltrans.o] Error 1
lto-wrapper: fatal error: make returned 2 exit status
compilation terminated.
/usr/local/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
specmake[3]: *** [gamess] Error 1


[Bug target/65787] [5 Regression] Miscompile due to bad vector swap optimization for little endian

2015-04-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65787

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |5.0
            Summary|[5.1 regression] Miscompile |[5 Regression] Miscompile
                   |due to bad vector swap      |due to bad vector swap
                   |optimization for little     |optimization for little
                   |endian                      |endian

