[Bug target/81621] New: ICE in delete_insn, at cfgrtl.c:167 with s390x cross compiler

2017-07-30 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81621

Bug ID: 81621
   Summary: ICE in delete_insn, at cfgrtl.c:167 with s390x cross
compiler
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: marxin at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-linux-gnu
Target: s390x-linux-gnu

Running cross compiler ICEs:

$ s390x-linux-gnu-gcc
/home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/graphite/scop-10.c -Og
-fno-split-wide-types -freorder-blocks-and-partition

0xdeadbeef delete_insn(rtx_insn*)
.././../gcc/cfgrtl.c:167
0xdeadbeef move_unallocated_pseudos
.././../gcc/ira.c:5041
0xdeadbeef ira
.././../gcc/ira.c:5399
0xdeadbeef execute
.././../gcc/ira.c:5581

[Bug c/79586] missing -Wdeprecated depending on position of attribute

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79586

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-31
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Eric Gallager  ---
Confirmed.

[Bug tree-optimization/81620] [8 Regression] ICE in is_inv_store_elimination_chain, at tree-predcom.c:1651 with -O3

2017-07-30 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81620

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-31
 CC||amker at gcc dot gnu.org,
   ||marxin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Confirmed, started with r250670.

[Bug lto/81612] lto1: internal compiler error: Segmentation fault

2017-07-30 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81612

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2017-07-31
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Can you please attach a pre-processed source code that triggers that?

[Bug boehm-gc/64042] FAIL: boehm-gc.c/gctest.c -O2 execution test

2017-07-30 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64042

--- Comment #15 from Tom de Vries  ---
Subject: Re: [Gc] boehm-gc.c/gctest.c spurious failure
From: bo...@acm.org
To: tom_devr...@mentor.com
CC: bd...@lists.opendylan.org
Date: 01/21/2015 09:11 PM

I haven't had a chance to look at this carefully.  But the typed allocation
test looks a bit fishy.  It seems to allocate a 2000 byte object described by a
320 bit entry bit map, each of which describes a pointer-sized word, IIRC. 
That worked fine once upon a time, when this code was written and we only had
32-bit machines ...

Hans
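
The arithmetic behind that suspicion can be sketched as follows. This is only an illustration of the sizes quoted above (a 2000-byte object and a 320-bit map, both recalled "IIRC"); the helper names are hypothetical and are not part of gctest.c or the collector's API.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes described by a bitmap that has one bit per pointer-sized word. */
size_t bitmap_covered_bytes(size_t bitmap_bits, size_t word_size)
{
    return bitmap_bits * word_size;
}

/* Nonzero when the bitmap describes no more memory than the object holds. */
int bitmap_fits_object(size_t bitmap_bits, size_t word_size,
                       size_t object_bytes)
{
    assert(word_size > 0);
    return bitmap_covered_bytes(bitmap_bits, word_size) <= object_bytes;
}
```

With 4-byte words the 320-bit map covers 1280 bytes, which fits in 2000; with 8-byte words it covers 2560 bytes, which does not.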

[Bug boehm-gc/64042] FAIL: boehm-gc.c/gctest.c -O2 execution test

2017-07-30 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64042

--- Comment #14 from Tom de Vries  ---
Subject: boehm-gc.c/gctest.c spurious failure
From: tom_devr...@mentor.com
To: bd...@lists.opendylan.org
Date: 01/19/2015 10:09 AM

Hi,

FYI, with gcc trunk on x86_64 Linux, I ran into PR64042: 'FAIL:
boehm-gc.c/gctest.c -O2 execution test'. (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64042 )

The testcase gctest fails spuriously, a couple of times per 1000 runs.

The failure has been reproduced by others, also on Darwin.

Any information on this is appreciated.

Thanks,
- Tom

[Bug boehm-gc/64042] FAIL: boehm-gc.c/gctest.c -O2 execution test

2017-07-30 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64042

--- Comment #13 from Tom de Vries  ---
(In reply to Eric Gallager from comment #12)
> (In reply to Tom de Vries from comment #11)
> > Reported upstream here:
> > https://lists.opendylan.org/pipermail/bdwgc/2015-January/006071.html
> 
> This link doesn't work for me; is there a better upstream bug link URL?

The archive seems to be down, and also gmane doesn't seem to work. I'll post
the thread here.

[Bug c/61342] Segfault when using default clause and VLA in OpenMP task

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61342

Eric Gallager  changed:

   What|Removed |Added

   Keywords||openmp
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-31
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Eric Gallager  ---
Confirmed, the stray quote mark printed preceding the ICE message looks
suspicious, too.

gcc-bugs@gcc.gnu.org

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30552

Eric Gallager  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2008-12-29 14:22:53         |2017-7-30
                 CC|                            |egallager at gcc dot gnu.org
            Summary|gcc crashed when compiling  |gcc crashes when compiling
                   |an example                  |examples with GNU statement
                   |                            |expressions in VLAs (also
                   |                            |involved: nested functions
                   |                            |declared K&R-style)
      Known to fail|                            |8.0

--- Comment #3 from Eric Gallager  ---
Confirmed that gcc still ICEs, although I'm not sure if the code is valid or
not... I'll leave the "ice-on-valid-code" keyword for now; someone else more
knowledgeable than me can change it if necessary.

[Bug c/79320] sqrt of negative number do not return NaN with i686-w64-mingw32-gcc on pentiumI7/Windows10

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79320

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
URL||https://sourceforge.net/p/m
   ||ingw-w64/bugs/567/
 CC||egallager at gcc dot gnu.org
 Resolution|--- |MOVED

--- Comment #3 from Eric Gallager  ---
(In reply to Daniel WEIL from comment #2)
> OK. I log the issue on mingw bugs :
> https://sourceforge.net/p/mingw/bugs/2337/

Linked bug shows that that one was closed in favor of:
https://sourceforge.net/p/mingw-w64/bugs/567/
So closing to mark as MOVED to the mingw-w64 one.

[Bug c/70257] #line incorrectly handled in error messages

2017-07-30 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70257

Manuel López-Ibáñez  changed:

   What|Removed |Added

 CC||manu at gcc dot gnu.org

--- Comment #2 from Manuel López-Ibáñez  ---
I think this is a dup of bug 79106.

The caret line is printed by reopening the file and counting 3 lines because
the line directive is believed by GCC to point to the actual source code.
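
A minimal sketch of why a line directive can point away from the physical file. The function names and the file name "virtual.c" are made up for illustration; only the standard #line, __LINE__ and __FILE__ behaviour is relied upon.

```c
#include <assert.h>
#include <string.h>

/* Reports the line number the compiler believes it is on. */
int believed_line(void)
{
#line 100 "virtual.c"
    return __LINE__;   /* the line after the directive is numbered 100 */
}

/* Reports the file name the compiler believes it is reading. */
const char *believed_file(void)
{
    return __FILE__;
}

/* Both values come from the directive, not from the file on disk. */
void check_believed_position(void)
{
    assert(believed_line() == 100);
    assert(strcmp(believed_file(), "virtual.c") == 0);
}
```

A diagnostic that reopens the named file and counts to the believed line will therefore quote unrelated text whenever the directive does not describe a real file.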

[Bug c/79010] -Wlarger-than ineffective for VLAs, alloca, malloc

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79010

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Eric Gallager  ---
Confirmed.

[Bug sanitizer/81340] ICE in compute_bb_dataflow, at var-tracking.c:6877

2017-07-30 Thread daniel.black at au dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81340

--- Comment #5 from Daniel Black  ---
Thank you, Martin.

[Bug c/78155] missing warning on invalid isalpha et al.

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78155

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Eric Gallager  ---
When I run the program, it prints 0 rather than crashing. Confirming that a
warning would be nice though, for portability to platforms where it would cause
a crash.
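
For reference, a sketch of the usual portable pattern, assuming the invalid call in question passes a value that is neither EOF nor representable as unsigned char (the C standard leaves that case undefined). The wrapper names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>   /* EOF */

/* Classify a plain char without ever passing a negative value to isalpha(). */
int is_alpha_char(char c)
{
    return isalpha((unsigned char)c) != 0;
}

/* Classify a getc()-style value, where EOF is also a permitted argument. */
int is_alpha_int(int ch)
{
    assert(ch == EOF || (ch >= 0 && ch <= UCHAR_MAX));
    return isalpha(ch) != 0;
}
```

The cast in is_alpha_char() is what a warning would presumably nudge users towards.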

[Bug c/63710] Incorrect column number for -Wconversion

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63710

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Eric Gallager  ---
Confirmed. The location for the first one is still the same, but the location
for the second one has changed:

$ /usr/local/bin/gcc -c -Wconversion 63710.c
63710.c: In function ‘f1’:
63710.c:2:24: warning: conversion to ‘long unsigned int’ from ‘char’ may change
the sign of the result [-Wsign-conversion]
  unsigned long r1 = ul + l;
^
63710.c:3:23: warning: conversion to ‘long unsigned int’ from ‘char’ may change
the sign of the result [-Wsign-conversion]
  unsigned long r2 = l + ul;
   ^
63710.c: In function ‘f2’:
63710.c:8:15: warning: conversion to ‘unsigned int’ from ‘char’ may change the
sign of the result [-Wsign-conversion]
  return l ? l : c;
 ~~^~~
63710.c:8:15: warning: conversion to ‘unsigned int’ from ‘long int’ may change
the sign of the result [-Wsign-conversion]
$

I agree that both could still be better.

[Bug target/81602] Unnecessary zero-extension after 16 bit popcnt

2017-07-30 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81602

--- Comment #1 from Uroš Bizjak  ---
(In reply to Christoph Diegelmann from comment #0)
> GCC misses an optimization on this:
> 
>  #include <cstdint>
>  #include "immintrin.h"
> 
>  void test(std::uint16_t* mask, std::uint16_t* data) {
>  for (int i = 0; i < 1024; ++i) {
>  *data = 0;
>  unsigned tmp = *mask++;
>  unsigned step = _mm_popcnt_u32(tmp);
>  data += step;
>  }
>  }
> 
> g++ -O3 -Wall -std=c++14 -march=skylake generates:
> 
>  test(unsigned short*, unsigned short*):
>  leaq 2048(%rdi), %rdx
>  .L2:
>  xorl %eax, %eax
>  addq $2, %rdi
>  movw %ax, (%rsi)
>  popcntw -2(%rdi), %ax
>  movzwl %ax, %eax
>  leaq (%rsi,%rax,2), %rsi
>  cmpq %rdx, %rdi
>  jne .L2
>  ret
> 
> The rax register is known to be zero at the time of `popcntw -2(%rdi), %ax`.
> Anyway gcc still clears the upper bits using `movzwl %ax, %eax` afterwards.

The "xorl %eax, %eax; movw %ax, (%rsi)" pair is just an optimized way to
implement "movw $0, (%rsi)". It just happens that the peephole pass found the
unused %eax as a free temporary reg when splitting the direct move of an
immediate to memory.

> While clang uses 32 bit popcnt and `movzwl (%rdi,%rax,2), %ecx` it correctly
> recognises that there's no need to clear the upper bits.
> 
> clang -O3 -Wall -std=c++14 -march=skylake -fno-unroll-loops generates:
> 
>  test(unsigned short*, unsigned short*): 
>  xorl %eax, %eax
>  .LBB0_1: 
>  movw $0, (%rsi)
>  movzwl (%rdi,%rax,2), %ecx
>  popcntl %ecx, %ecx
>  leaq (%rsi,%rcx,2), %rsi
>  addq $1, %rax
>  cmpl $1024, %eax # imm = 0x400
>  jne .LBB0_1
>  retq

popcntl has a false dependency on its output in certain situations, whereas
popcntw doesn't have this limitation. So gcc chose this approach for a reason.
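
As a side note, the two code shapes compute the same value. The sketch below (plain C rather than the C++ of the testcase, with hypothetical function names, and relying on the GCC/Clang builtin __builtin_popcount) counts bits once on the 16-bit value and once on its zero-extended 32-bit form.

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits while staying within a 16-bit view of the value. */
unsigned popcount16_narrow(uint16_t v)
{
    unsigned n = 0;
    for (; v; v &= (uint16_t)(v - 1))   /* clear the lowest set bit */
        n++;
    return n;
}

/* Zero-extend to 32 bits first, then count with the 32-bit builtin. */
unsigned popcount16_wide(uint16_t v)
{
    unsigned n = (unsigned)__builtin_popcount((unsigned)v);
    assert(n <= 16);   /* the count always fits in the low bits */
    return n;
}
```

Which form is faster is a separate question that this sketch does not address.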

[Bug target/25967] Add attribute naked for x86

2017-07-30 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25967

Uroš Bizjak  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2017-07-30
   Assignee|unassigned at gcc dot gnu.org  |ubizjak at gmail dot com
   Target Milestone|--- |8.0
 Ever confirmed|0   |1

--- Comment #16 from Uroš Bizjak  ---
Patch at [1].

[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01968.html

[Bug target/79964] Cortex A53 codegen still not optimal

2017-07-30 Thread tulipawn at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79964

--- Comment #7 from PeteVine  ---
Thanks for pointing that out! I was using my bash history to change the CFLAGS
and when I was flipping the crc switch I didn't notice I'd picked a version
without -frename-registers, hence this wrong conclusion :)

Definitely then, -frename-registers it is!

http://openbenchmarking.org/result/1707307-RI-CORTEXA5313

[Bug web/43887] stable anchors needed

2017-07-30 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43887

Manuel López-Ibáñez  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Manuel López-Ibáñez  ---
This seems to have been fixed recently and anchor names for options are not
numbered anymore.

[Bug tree-optimization/81620] [8 Regression] ICE in is_inv_store_elimination_chain, at tree-predcom.c:1651 with -O3

2017-07-30 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81620

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ice-on-valid-code
             Target|                            |x86_64-pc-linux-gnu
          Component|c                           |tree-optimization
            Version|unknown                     |8.0
   Target Milestone|---                         |8.0
            Summary|ICE on valid code at -O3 in |[8 Regression] ICE in
                   |both 32-bit and 64-bit      |is_inv_store_elimination_ch
                   |modes on x86_64-linux-gnu   |ain, at tree-predcom.c:1651
                   |(internal compiler error:   |with -O3
                   |in                          |
                   |is_inv_store_elimination_ch |
                   |ain, at                     |
                   |tree-predcom.c:1651)        |

[Bug sanitizer/81619] pairs of mmap/munmap do not reset asan's user-poisoning flags, leading to invalid error reports

2017-07-30 Thread dvilleneuve at kronos dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81619

Daniel Villeneuve  changed:

   What|Removed |Added

  Attachment #41863|0   |1
is obsolete||

--- Comment #4 from Daniel Villeneuve  ---
Created attachment 41866
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41866&action=edit
small C program showing the problem on Linux

[Bug c/81620] New: ICE on valid code at -O3 in both 32-bit and 64-bit modes on x86_64-linux-gnu (small.c:3:5: internal compiler error: in is_inv_store_elimination_chain, at tree-predcom.c:1651)

2017-07-30 Thread chengniansun at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81620

Bug ID: 81620
   Summary: ICE on valid code at -O3 in both 32-bit and 64-bit
modes on x86_64-linux-gnu (small.c:3:5: internal
compiler error: in is_inv_store_elimination_chain, at
tree-predcom.c:1651)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: chengniansun at gmail dot com
  Target Milestone: ---

$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/8.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 8.0.0 20170730 (experimental) [trunk revision 250721] (GCC) 
$ gcc-trunk -O3 small.c
during GIMPLE pass: pcom
small.c: In function ‘main’:
small.c:3:5: internal compiler error: in is_inv_store_elimination_chain, at
tree-predcom.c:1651
 int main() {
 ^~~~
0xd25e10 is_inv_store_elimination_chain
../../gcc-source-trunk/gcc/tree-predcom.c:1651
0xd25e10 prepare_initializers_chain_store_elim
../../gcc-source-trunk/gcc/tree-predcom.c:2786
0xd25e10 prepare_initializers_chain
../../gcc-source-trunk/gcc/tree-predcom.c:2846
0xd25e10 prepare_initializers
../../gcc-source-trunk/gcc/tree-predcom.c:2901
0xd25e10 tree_predictive_commoning_loop
../../gcc-source-trunk/gcc/tree-predcom.c:3092
0xd25e10 tree_predictive_commoning()
../../gcc-source-trunk/gcc/tree-predcom.c:3170
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
$ cat small.c
int a[7];
char b;
int main() {
  b = 4;
  for (; b; b--) {
a[b] = b;
a[b + 2] = 1;
  }
  return 0;
}
$

[Bug sanitizer/81619] pairs of mmap/munmap do not reset asan's user-poisoning flags, leading to invalid error reports

2017-07-30 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81619

--- Comment #3 from Andrew Pinski  ---
This might be a bug in the upstream sources too.

[Bug sanitizer/81619] pairs of mmap/munmap do not reset asan's user-poisoning flags, leading to invalid error reports

2017-07-30 Thread dvilleneuve at kronos dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81619

--- Comment #2 from Daniel Villeneuve  ---
Created attachment 41865
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41865&action=edit
shell script to invoke program in different configurations

[Bug sanitizer/81619] pairs of mmap/munmap do not reset asan's user-poisoning flags, leading to invalid error reports

2017-07-30 Thread dvilleneuve at kronos dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81619

--- Comment #1 from Daniel Villeneuve  ---
Created attachment 41864
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41864&action=edit
Makefile to build program

[Bug sanitizer/81619] New: pairs of mmap/munmap do not reset asan's user-poisoning flags, leading to invalid error reports

2017-07-30 Thread dvilleneuve at kronos dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81619

Bug ID: 81619
   Summary: pairs of mmap/munmap do not reset asan's
user-poisoning flags, leading to invalid error reports
   Product: gcc
   Version: 6.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dvilleneuve at kronos dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

Created attachment 41863
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41863&action=edit
small C program showing the problem on Linux

When using mmap/munmap from an application, memory returned by mmap is not seen
by the address sanitizer in a newly-initialized state: it might still be marked
with user-poisoning flags.

This is unlike using malloc/free pairs, where memory obtained from malloc,
although possibly reused after being freed, is correctly initialized.

By looking at the code for the sanitizer (gcc 6.3.0), I could figure out that
malloc/free do some reinitialization of memory flags.  I could not find such
code for mmap/munmap.

A workaround in the application is to explicitly call
ASAN_UNPOISON_MEMORY_REGION prior to invoking munmap.

[Bug c/61939] warn when __attribute__((aligned(x))) is ignored

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61939

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Eric Gallager  ---
I had to modify the original testcase a bit to get it to compile:

$ cat 61939.c
struct some_struct { int foo; };
void copy_something(void *p, const void *s) {
struct some_struct __attribute__((aligned(8))) *_d = p;
struct some_struct __attribute__((aligned(8))) *_s = s;
*_d = *_s;
}
$ /usr/local/bin/gcc -c -Wall -Wextra -pedantic -Wcast-align -Wattributes
61939.c
61939.c: In function ‘copy_something’:
61939.c:4:58: warning: initialization discards ‘const’ qualifier from pointer
target type [-Wdiscarded-qualifiers]
 struct some_struct __attribute__((aligned(8))) *_s = s;
  ^
$

But beyond that, yeah, confirmed. I think there's probably a duplicate around
here somewhere but I've forgotten the number already...

[Bug target/81614] Should -mtune-ctrl=partial_reg_stall be turned by default?

2017-07-30 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81614

--- Comment #5 from Uroš Bizjak  ---
(In reply to Cody Gray from comment #3)

> > Also, it is hard to confirm tuning PRs without hard benchmark data.
> 
> No, it really isn't. I know that's a canned response, likely brought about
> by hard-won experience with a lot of dubious "tuning" feature requests, but
> it's just a cop-out in this case, if not outright dismissive. Partial
> register stalls are a well-documented phenomenon, confirmed by multiple
> sources, and have been a significant source of performance degradation since
> the Pentium Pro was released circa 1995.

Well, then please find some representative benchmark suite and test the effect
of -mtune-ctrl=partial_reg_stall on your target. There are plenty of benchmarks
listed at [1].

It is a one-line change in the compiler source to set the new default then.

[1] https://gcc.opensuse.org/specs/cxx_groups

[Bug target/81614] Should -mtune-ctrl=partial_reg_stall be turned by default?

2017-07-30 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81614

H.J. Lu  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
   Last reconfirmed||2017-07-30
 Resolution|DUPLICATE   |---
 Ever confirmed|0   |1

--- Comment #4 from H.J. Lu  ---
-mtune-ctrl=partial_reg_stall is turned on only for -mtune=i686.  We
should examine it for Nehalem and above processors.

[Bug target/79964] Cortex A53 codegen still not optimal

2017-07-30 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79964

--- Comment #6 from Andrew Pinski  ---
(In reply to PeteVine from comment #5)
> Turns out the GCC 8 regression is caused by the +crc switch in
> -march=armv8-a+crc. Interesting, eh?

+crc should not cause any code generation difference ...

[Bug target/25967] Add attribute naked for x86

2017-07-30 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25967

--- Comment #15 from Uroš Bizjak  ---
Please also note this description from the gcc docs:

'naked'
 This attribute allows the compiler to construct the requisite
 function declaration, while allowing the body of the function to be
 assembly code.  The specified function will not have
 prologue/epilogue sequences generated by the compiler.  Only basic
 'asm' statements can safely be included in naked functions (*note
 Basic Asm::).  While using extended 'asm' or a mixture of basic
 'asm' and C code may appear to work, they cannot be depended upon
 to work reliably and are not supported.
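
A small sketch of what that description allows, assuming an x86-64 target and a compiler that accepts the attribute there (GCC 8 or later with the patch in this PR, or Clang). The function name is made up, and the #else branch is only a fallback so the snippet builds elsewhere.

```c
#include <assert.h>

#if defined(__x86_64__) && \
    (defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 8))
/* No prologue or epilogue is generated, so the basic asm must return itself. */
__attribute__((naked)) int forty_two(void)
{
    __asm__ ("movl $42, %eax\n\t"
             "ret");
}
#else
/* Fallback for targets or compilers without 'naked' on x86. */
int forty_two(void)
{
    return 42;
}
#endif

/* Ordinary C code calls the naked function like any other function. */
void check_forty_two(void)
{
    assert(forty_two() == 42);
}
```

Note that the body contains only a basic asm statement, as the documentation requires.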

[Bug sanitizer/81601] [7/8 Regression] incorrect Warray-bounds warning with -fsanitize

2017-07-30 Thread ppalka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81601

--- Comment #6 from Patrick Palka  ---
(In reply to Patrick Palka from comment #5)
> So what's the right way to fix this?  To move optimize_bit_field_compare()
> from fold_binary to match.pd so that the conditions on

... so that conditions on tp->chrono_type get consistently transformed into
BIT_FIELD_REFs, or to remove optimize_bit_field_compare() altogether?  It seems
like a rather low-level optimization to be done in GENERIC/GIMPLE.

[Bug sanitizer/81601] [7/8 Regression] incorrect Warray-bounds warning with -fsanitize

2017-07-30 Thread ppalka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81601

--- Comment #5 from Patrick Palka  ---
So what's the right way to fix this?  To move optimize_bit_field_compare() from
fold_binary to match.pd so that the conditions on

[Bug tree-optimization/81354] [5/6 Regression] Segmentation fault in SSA Strength Reduction using -O3

2017-07-30 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81354

--- Comment #9 from Bill Schmidt  ---
OK, I've now confirmed this is the problem.  I have a rough patch for trunk,
and backporting it to GCC 5 r236439 verifies that this fixes it.  Still
verifying bootstrap/regression on trunk, and need to do some cleanup before
submitting.

[Bug c++/81587] GCC doesn't warn about calling functions that don't exist

2017-07-30 Thread jg at jguk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81587

--- Comment #5 from Jonny Grant  ---
Thank you Martin, I raised Bug #81618

[Bug target/25967] Add attribute naked for x86

2017-07-30 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25967

--- Comment #14 from Uroš Bizjak  ---
I'm testing the above patch. Using the patched compiler, the testcase that is
mentioned by Daniel in Comment #12 can be changed to:

Index: testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
===================================================================
--- testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c   (revision 250720)
+++ testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c   (working copy)
@@ -169,15 +169,9 @@

 #define TEST_DATA_OFFSET(f)((int)__builtin_offsetof(struct test_data, f))

-void __attribute__((used))
-do_test_body0 (void)
-{
-  __asm__ ("\n"
-   "   .globl " ASMNAME(do_test_body) "\n"
-#ifdef __ELF__
-   "   .type " ASMNAME(do_test_body) ",@function\n"
-#endif
-   ASMNAME(do_test_body) ":\n"
+void __attribute__((naked))
+do_test_body (void)
+{__asm__ (
"   # rax, r10 and r11 are usable here.\n"
"\n"
"   # Save registers.\n"
@@ -212,9 +206,6 @@
"   call" ASMNAME(mem_to_regs) "\n"
"\n"
"   retq\n"
-#ifdef __ELF__
-   "   .size " ASMNAME(do_test_body) ",.-" ASMNAME(do_test_body) "\n"
-#endif
::
"i"(TEST_DATA_OFFSET(regdata[REG_SET_SAVE])),
"i"(TEST_DATA_OFFSET(regdata[REG_SET_INPUT])),

[Bug c++/81618] New: Warn for unused functions declared in local scope

2017-07-30 Thread jg at jguk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81618

Bug ID: 81618
   Summary: Warn for unused functions declared in local scope
   Product: gcc
   Version: 5.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jg at jguk dot org
  Target Milestone: ---

Hello

Could GCC warn for unused functions declared in local scope please? See
function below g()

$ cat t.C && gcc -S -Wall t.C
void f (void)
{
  typedef int I;
  int i;
  void g ();
}
t.C: In function ‘void f()’:
t.C:4:7: warning: unused variable ‘i’ [-Wunused-variable]
   int i;
   ^
t.C:3:15: warning: typedef ‘I’ locally defined but not used
[-Wunused-local-typedefs]
   typedef int I;
   ^

[Bug target/25967] Add attribute naked for x86

2017-07-30 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25967

--- Comment #13 from Uroš Bizjak  ---
Created attachment 41862
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41862&action=edit
Patch that implements naked attribute

[Bug target/81614] Should -mtune-ctrl=partial_reg_stall be turned by default?

2017-07-30 Thread cody at codygray dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81614

--- Comment #3 from Cody Gray  ---
(In reply to Uroš Bizjak from comment #1)
> Partial register stalls were discussed many times in the past, but
> apparently the compiler still produces fastest code when partial register
> stalls are enabled on latest target processors (e.g. -mtune=intel).

I don't understand what that means. -mtune=intel does *not* fix the partial
register stall problem. It should. All Intel CPUs prior to Haswell would
absolutely experience partial register stalls on this code, resulting in a
performance degradation.

-mtune-ctrl=partial_reg_stall does get the correct code, but I wasn't aware of
this option and I believe I shouldn't have to be. If a developer is getting
sub-optimal code even when he is asking the compiler to tune for his specific
microarchitecture, then the optimizer has a bug.

This is not an issue where there are arguments on either side. There is
absolutely no benefit to generating the code that the compiler currently does.
It is the same number of bytes to OR the BYTE-sized registers as it is to OR
the DWORD-sized registers, while the former will run faster on the vast
majority of CPUs and won't be any slower on the others.

> Also, it is hard to confirm tuning PRs without hard benchmark data.

No, it really isn't. I know that's a canned response, likely brought about by
hard-won experience with a lot of dubious "tuning" feature requests, but it's
just a cop-out in this case, if not outright dismissive. Partial register
stalls are a well-documented phenomenon, confirmed by multiple sources, and
have been a significant source of performance degradation since the Pentium Pro
was released circa 1995.

Agner Fog's manuals, as cited above, are really the authoritative reference
when it comes to performance tuning on x86, and they provide confirmation of
this in spades. In fact, I would argue that an accurate conceptual
understanding of the microarchitecture is often a better guide than one-off
microbenchmarks, since the latter are so difficult to craft and therefore so
often misleading. For example, the effects of the stall might be masked by the
overhead of the function call, but when the code is inlined or *certainly* when
it is executed within an inner loop, there will be a significant performance
degradation.

Again, if this were an issue where I was proposing bloating the size of the
code for a small payoff in speed, I could see how you might be skeptical. But
there is literally no downside to making this change.

You could possibly argue that -mtune-ctrl=partial_reg_stall should not be
turned on when tuning for Haswell and later microarchitectures, as Haswell was
the first to alleviate the visible performance penalties associated with
reading from a full 32-bit register after writing to a partial 8-bit "view" of
that same register. However, this applies *only* to the low-byte register
(e.g., AL, CL, DL, etc.). With the high-byte registers (e.g., AH, CH, DH,
etc.), there is still a loss in performance because an extra µop has to be
inserted between the write to the 8-bit register and the read from the 32-bit
register. This increases the latency by one clock cycle, and so unless the xH
partial registers are treated differently from the xL partial registers,
applying the optimizations described would still result in a performance win,
especially since there is no drawback.

[Bug go/81617] New: mksigtab.sh fails to resolve NSIG with glibc 2.26

2017-07-30 Thread sch...@linux-m68k.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81617

Bug ID: 81617
   Summary: mksigtab.sh fails to resolve NSIG with glibc 2.26
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: sch...@linux-m68k.org
CC: cmang at google dot com
  Target Milestone: ---

In glibc 2.26 the value of _NSIG is now defined as an expression of __SIGRTMAX
instead of a simple number.

$ grep 'NSIG =' gen-sysinfo.go 
const _NSIG = __NSIG
const __NSIG = (___SIGRTMAX + 1)

[Bug tree-optimization/81354] [5/6 Regression] Segmentation fault in SSA Strength Reduction using -O3

2017-07-30 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81354

--- Comment #8 from Bill Schmidt  ---
This is likely the same as another problem that recently came up (not yet filed
as the source is sensitive).  SLSR is sensitive to addresses of PHI
instructions remaining the same throughout the pass, but gimple_split_edge does
not maintain this.  I'm working on a patch to ensure that it does.  I still
need to verify this is the same issue.
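
The general hazard can be sketched outside of GCC. The code below is not SLSR or gimple_split_edge; it is a hypothetical growable array showing why a saved element address can go stale when the storage is reallocated, while an index stays valid.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of nodes; not GCC's data structure. */
struct node_vec {
    int *data;
    size_t len, cap;
};

/* Append a value and return its index, or (size_t)-1 on allocation failure.
   The storage may move here, so earlier element addresses can go stale. */
size_t node_vec_push(struct node_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (!p)
            return (size_t)-1;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len] = value;
    return v->len++;
}

/* Look an element up by index, which remains valid across reallocation. */
int node_vec_get(const struct node_vec *v, size_t idx)
{
    assert(idx < v->len);
    return v->data[idx];
}
```

A pass that keys its bookkeeping on addresses needs either storage that never moves or a way to refresh its keys after each move.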

[Bug c/64619] No -Wsign-conversion warning

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64619

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Eric Gallager  ---
(In reply to Mikhail Maltsev from comment #1)
> Indeed, confirmed on recent revision, r219801.

Changing status to NEW then.

[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

2017-07-30 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616

H.J. Lu  changed:

   What|Removed |Added

 CC||cody at codygray dot com

--- Comment #1 from H.J. Lu  ---
*** Bug 81614 has been marked as a duplicate of this bug. ***

[Bug target/81614] Should -mtune-ctrl=partial_reg_stall be turned by default?

2017-07-30 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81614

H.J. Lu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE
            Summary|x86 optimizer combines      |Should
                   |results of comparisons in a |-mtune-ctrl=partial_reg_sta
                   |way that risks partial      |ll be turned by default?
                   |register stalls             |

--- Comment #2 from H.J. Lu  ---
With -mtune-ctrl=partial_reg_stall, I got

[hjl@gnu-tools-1 pr81614]$ cat x.i
_Bool
foo(int a, int b, int c)
{
  return (a == c || b == c);
}

int
bar (int a, int b, int c)
{
  return (a == c || b == c);
}
[hjl@gnu-tools-1 pr81614]$ make
/export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2
-mtune-ctrl=partial_reg_stall -S x.i
[hjl@gnu-tools-1 pr81614]$ cat x.s
        .file   "x.i"
        .text
        .p2align 4,,15
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        cmpl    %edx, %edi
        sete    %al
        cmpl    %esi, %edx
        sete    %dl
        orb     %dl, %al
        ret
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .p2align 4,,15
        .globl  bar
        .type   bar, @function
bar:
.LFB1:
        .cfi_startproc
        cmpl    %edx, %edi
        sete    %al
        cmpl    %esi, %edx
        sete    %dl
        orb     %dl, %al
        movzbl  %al, %eax
        ret
        .cfi_endproc
.LFE1:
        .size   bar, .-bar
        .ident  "GCC: (GNU) 8.0.0 20170730 (experimental)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-tools-1 pr81614]$ 

I opened PR 81616 to update default tuning options.

*** This bug has been marked as a duplicate of bug 81616 ***

[Bug c/69389] bit field incompatible with OpenMP atomic update

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69389

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Eric Gallager  ---
Confirmed.

[Bug target/81616] New: Update -mtune=generic for the current Intel and AMD processors

2017-07-30 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616

Bug ID: 81616
   Summary: Update -mtune=generic for the current Intel and AMD
processors
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: pavel.v.chupin at gmail dot com
Blocks: 80820
  Target Milestone: ---
Target: x86

-mtune=generic should be updated for the current Intel and AMD
processors.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80820
[Bug 80820] _mm_set_epi64x shouldn't store/reload for -mtune=haswell, Zen
should avoid store/reload, and generic should think about it.

[Bug target/79964] Cortex A53 codegen still not optimal

2017-07-30 Thread tulipawn at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79964

--- Comment #5 from PeteVine  ---
Turns out the GCC 8 regression is caused by the +crc switch in
-march=armv8-a+crc. Interesting, eh?

[Bug fortran/81615] New: save-temps and gfortran produces *.f90 files instead of *.i or *i90 files

2017-07-30 Thread barrowes at alum dot mit.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81615

Bug ID: 81615
   Summary: save-temps and gfortran produces *.f90 files instead
of *.i or *i90 files
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: barrowes at alum dot mit.edu
  Target Milestone: ---

If I have a file test0.f:

  program test2
  real x
  x=3.0
C#ifdef DDD
C  x=2.0
C#endif
  print *,'x=',x
  end

and I compile it with:

gfortran -save-temps -o test0 test0.f

I get two temporary files, test0.o and test0.s.

If I uncomment the directive:

  program test2
  real x
  x=3.0
#ifdef MPI
  x=2.0
#endif
  print *,'x=',x
  end

and compile with:

gfortran -cpp -save-temps -o test0 test0.f

In addition to the two temporary files above, a test0.f90 is produced that
looks like:

# 1 "test0.f"
# 1 ""
# 1 ""
# 1 "test0.f"
  program test2
  real x
  x=3.0
  print *,'x=',x
  end


I was under the impression that I would get a test0.i file, since the only
documentation of -save-temps I can find is in the gcc docs:
https://gcc.gnu.org/onlinedocs/gcc/Developer-Options.html#index-save-temps
which mention *.i files in their example.


And if I have a file test1.f90:

program test1
real x
x=3.0
#ifdef MPI
x=2.0
#endif
print *,'x=',x
end

and I compile with:

gfortran -cpp -save-temps -o test1 test1.f90

The test1.o and test1.s files are produced, but no preprocessed fortran source
file is produced (I suppose because the source file already has the f90
extension).

How can I get a preprocessed source file in this case? Where is this behavior
of -save-temps producing *.f90 files documented? Can I change the f90 extension
of the preprocessed temporary files to i or i90 instead of f90?

[Bug target/81614] x86 optimizer combines results of comparisons in a way that risks partial register stalls

2017-07-30 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81614

Uroš Bizjak  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #1 from Uroš Bizjak  ---
This transformation is handled by -mtune-ctrl=partial_reg_stall tune flag (and
more specifically, -mtune-ctrl=^promote_qimode flag).

Partial register stalls were discussed many times in the past, but apparently
the compiler still produces the fastest code when partial register stalls are
enabled on the latest target processors (e.g. -mtune=intel).

BTW, there are quite a few flags in x86-tune.def under:

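/*****************************************************************************/
/* Historical relics: tuning flags that helps a specific old CPU designs     */
/*****************************************************************************/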

where nobody bothered to change defaults for new processors.

Also, it is hard to confirm tuning PRs without hard benchmark data.

Adding CC.

[Bug c/77328] incorrect caret location in -Wformat calling printf via a macro

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77328

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Eric Gallager  ---
Confirmed.

[Bug target/79793] Incorrect stack alignment for interrupt handler in 64-bit

2017-07-30 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79793

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #20 from H.J. Lu  ---
Fixed for GCC 8.

[Bug target/79793] Incorrect stack alignment for interrupt handler in 64-bit

2017-07-30 Thread hjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79793

--- Comment #19 from hjl at gcc dot gnu.org  ---
Author: hjl
Date: Sun Jul 30 14:10:32 2017
New Revision: 250721

URL: https://gcc.gnu.org/viewcvs?rev=250721&root=gcc&view=rev
Log:
i386: Update INCOMING_FRAME_SP_OFFSET for exception handler

Since there is an extra error code passed to the exception handler,
INCOMING_FRAME_SP_OFFSET is return address plus error code for the
exception handler.  This patch updates INCOMING_FRAME_SP_OFFSET to
the correct value for the exception handler.

This patch exposed a bug in DWARF stack frame CFI generation, which
assumes that INCOMING_FRAME_SP_OFFSET is the same for all functions:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81570

It sets and caches the incoming stack frame offset with the same
INCOMING_FRAME_SP_OFFSET for all functions.  When there are both
exception handler and normal function in the same input, the wrong
incoming stack frame offset is used for exception handler or normal
function, which leads to

FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 error == 0x12345670
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->ip == 0x12345671
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->cs == 0x12345672
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->flags == 0x12345673
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->sp == 0x12345674
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->ss == 0x12345675

With the patch for PR 81570:

https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01851.html

applied, there are no regressions on i686 and x86-64.

gcc/

PR target/79793
* config/i386/i386.c (ix86_function_arg): Update arguments for
exception handler.
(ix86_compute_frame_layout): Set the initial stack offset to
INCOMING_FRAME_SP_OFFSET.  Update red-zone offset with
INCOMING_FRAME_SP_OFFSET.
(ix86_expand_epilogue): Don't pop the 'ERROR_CODE' off the
stack before exception handler returns.
* config/i386/i386.h (INCOMING_FRAME_SP_OFFSET): Add the
'ERROR_CODE' for exception handler.

gcc/testsuite/

PR target/79793
* gcc.dg/guality/pr68037-1.c: Update gdb breakpoints.
* gcc.target/i386/interrupt-5.c (interrupt_frame): New struct.
(foo): Check the builtin return address against the return address
in interrupt frame.
* gcc.target/i386/pr79793-1.c: New test.
* gcc.target/i386/pr79793-2.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr79793-1.c
trunk/gcc/testsuite/gcc.target/i386/pr79793-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.h
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/guality/pr68037-1.c
trunk/gcc/testsuite/gcc.target/i386/interrupt-5.c
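
The offset arithmetic in the log above can be modelled in a few lines. This is a rough sketch rather than the code in i386.h: `func_kind`, `incoming_frame_sp_offset`, and `offset_cache` are hypothetical names. The only facts taken from the log are that an exception handler's incoming offset covers the return address plus the error code, and that the CFI code cached one value for all functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function kinds; GCC's own enum is not reproduced here. */
enum func_kind { FUNC_NORMAL, FUNC_EXCEPTION_HANDLER };

/* Bytes occupied on entry: one word for the return address, plus one more
   word when an error code was passed to an exception handler. */
static size_t incoming_frame_sp_offset(enum func_kind kind, size_t word_size)
{
    assert(word_size == 4 || word_size == 8);
    size_t words = 1;
    if (kind == FUNC_EXCEPTION_HANDLER)
        words += 1;
    return words * word_size;
}

/* Models the problem reported as PR 81570: the offset is computed for the
   first function seen and then reused for every later one. */
struct offset_cache { int valid; size_t value; };

static size_t cached_offset(struct offset_cache *cache, enum func_kind kind,
                            size_t word_size)
{
    if (!cache->valid) {
        cache->value = incoming_frame_sp_offset(kind, word_size);
        cache->valid = 1;
    }
    return cache->value;   /* ignores KIND once the cache is filled */
}
```

With 8-byte words, a normal function gets 8 and an exception handler gets 16. If the normal function is seen first, the cached variant keeps returning 8 for the handler as well, which is the kind of mismatch the guality failures point at.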

[Bug debug/81570] create_pseudo_cfg assumes that INCOMING_FRAME_SP_OFFSET is a constant

2017-07-30 Thread hjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81570

--- Comment #3 from hjl at gcc dot gnu.org  ---
Author: hjl
Date: Sun Jul 30 14:10:32 2017
New Revision: 250721

URL: https://gcc.gnu.org/viewcvs?rev=250721&root=gcc&view=rev
Log:
i386: Update INCOMING_FRAME_SP_OFFSET for exception handler

Since there is an extra error code passed to the exception handler,
INCOMING_FRAME_SP_OFFSET is return address plus error code for the
exception handler.  This patch updates INCOMING_FRAME_SP_OFFSET to
the correct value for the exception handler.

This patch exposed a bug in DWARF stack frame CFI generation, which
assumes that INCOMING_FRAME_SP_OFFSET is the same for all functions:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81570

It sets and caches the incoming stack frame offset with the same
INCOMING_FRAME_SP_OFFSET for all functions.  When there are both
exception handler and normal function in the same input, the wrong
incoming stack frame offset is used for exception handler or normal
function, which leads to

FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 error == 0x12345670
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->ip == 0x12345671
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->cs == 0x12345672
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->flags == 0x12345673
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->sp == 0x12345674
FAIL: gcc.dg/guality/pr68037-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 33 frame->ss == 0x12345675

With the patch for PR 81570:

https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01851.html

applied, there are no regressions on i686 and x86-64.

gcc/

PR target/79793
* config/i386/i386.c (ix86_function_arg): Update arguments for
exception handler.
(ix86_compute_frame_layout): Set the initial stack offset to
INCOMING_FRAME_SP_OFFSET.  Update red-zone offset with
INCOMING_FRAME_SP_OFFSET.
(ix86_expand_epilogue): Don't pop the 'ERROR_CODE' off the
stack before exception handler returns.
* config/i386/i386.h (INCOMING_FRAME_SP_OFFSET): Add the
'ERROR_CODE' for exception handler.

gcc/testsuite/

PR target/79793
* gcc.dg/guality/pr68037-1.c: Update gdb breakpoints.
* gcc.target/i386/interrupt-5.c (interrupt_frame): New struct.
(foo): Check the builtin return address against the return address
in interrupt frame.
* gcc.target/i386/pr79793-1.c: New test.
* gcc.target/i386/pr79793-2.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr79793-1.c
trunk/gcc/testsuite/gcc.target/i386/pr79793-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.h
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/guality/pr68037-1.c
trunk/gcc/testsuite/gcc.target/i386/interrupt-5.c

[Bug c/70502] inconsistent behavior of -Werror=

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70502

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||egallager at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #1 from Eric Gallager  ---
Looks like a dup of bug 55976 to me

*** This bug has been marked as a duplicate of bug 55976 ***

[Bug c/55976] -Werror=return-type should error on returning a value from a void function

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55976

Eric Gallager  changed:

   What|Removed |Added

 CC||manu at gcc dot gnu.org

--- Comment #3 from Eric Gallager  ---
*** Bug 70502 has been marked as a duplicate of this bug. ***

[Bug c/71996] -fdump-translation-unit fails to dump string literals of type char16_t/char32_t/wchar_t

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71996

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #1 from Eric Gallager  ---
My version of gcc trunk doesn't recognize the flag
‘-fdump-translation-unit=stdout’; I think I remember reading on the mailing
lists that it was going to be removed for gcc8...

[Bug libfortran/78449] compile time ieee_support_halting is not correct on arm and aarch64 ( FAIL: gfortran.dg/ieee/ieee_8.f90 -Os execution test )

2017-07-30 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78449

Richard Earnshaw  changed:

   What|Removed |Added

   Target Milestone|--- |7.0

[Bug middle-end/80929] [6/7/8 Regression] Division with constant no more optimized to mult highpart

2017-07-30 Thread gjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80929

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
Summary|[7/8 Regression] Division   |[6/7/8 Regression] Division
   |with constant no more   |with constant no more
   |optimized to mult highpart  |optimized to mult highpart
 Ever confirmed|0   |1

--- Comment #7 from Georg-Johann Lay  ---
v4.7 generates the best code: the 2 div+mod 60 are implemented as 2 mul-highpart.

v6 tries to be overly smart by fusing the two divisions by 60 into one division
by 3600, leaving 1 slow divmod call *and* 2 mul-highpart for the 2 modulo
60.

v8 also fuses to a slow division by 3600, but also fails to use mul-highpart
for the 2nd mod 60.
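
For reference, the fusion is valid arithmetic; the complaint is only about its cost on this target. A small sketch of the two equivalent forms follows, assuming an unsigned 16-bit seconds count. The attached time-i.c is not reproduced here, so `struct hms` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct hms { uint16_t hours; uint8_t minutes; uint8_t seconds; };

/* Two successive div+mod by 60, each of which can become a mul-highpart. */
static struct hms split_two_steps(uint16_t total_seconds)
{
    struct hms r;
    uint16_t total_minutes = total_seconds / 60;
    r.seconds = (uint8_t)(total_seconds % 60);
    r.minutes = (uint8_t)(total_minutes % 60);
    r.hours = total_minutes / 60;
    return r;
}

/* The fused form: (x / 60) / 60 rewritten as a single x / 3600. */
static struct hms split_fused(uint16_t total_seconds)
{
    struct hms r;
    r.seconds = (uint8_t)(total_seconds % 60);
    r.minutes = (uint8_t)((total_seconds / 60) % 60);
    r.hours = total_seconds / 3600;
    return r;
}

/* Return 1 when both forms agree for every 16-bit input. */
static int forms_agree(void)
{
    for (uint32_t x = 0; x <= UINT16_MAX; x++) {
        struct hms a = split_two_steps((uint16_t)x);
        struct hms b = split_fused((uint16_t)x);
        if (a.hours != b.hours || a.minutes != b.minutes
            || a.seconds != b.seconds)
            return 0;
    }
    return 1;
}
```

Both forms give the same result, so the choice between them is purely a question of which is cheaper to expand.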

[Bug rtl-optimization/81611] gcc un-learned loop / post-increment optimization

2017-07-30 Thread gjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81611

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
 Ever confirmed|0   |1

[Bug c/71870] wrong location of "%n$" directive in -Wformat

2017-07-30 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71870

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-07-30
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Eric Gallager  ---
Confirmed. Note that the original testcase now prints an additional
-Wformat-overflow warning:

$ /usr/local/bin/gcc -c -Wall -Wextra -Wpedantic -S 71870.c
71870.c: In function ‘f’:
71870.c:5:26: warning: unknown conversion type character ‘r’ in format
[-Wformat=]
  __builtin_sprintf (d, "%r");
  ^
71870.c:7:2: warning: ISO C does not support %n$ operand number formats
[-Wformat=]
  __builtin_sprintf (d, "%2$i%1$i", 1, 234);
  ^
71870.c:7:33: warning: ‘__builtin_sprintf’ writing a terminating nul past the
end of the destination [-Wformat-overflow=]
  __builtin_sprintf (d, "%2$i%1$i", 1, 234);
 ^
71870.c:7:2: note: ‘__builtin_sprintf’ output 5 bytes into a destination of
size 4
  __builtin_sprintf (d, "%2$i%1$i", 1, 234);
  ^
$
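
The "5 bytes into a destination of size 4" note can be reproduced by asking snprintf for the required length. This sketch assumes a libc that supports POSIX `%n$` positional arguments. The destination sizes are taken from the diagnostic, not from the test case, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Bytes needed to hold the formatted text plus the terminating NUL. */
static int bytes_needed(int first, int second)
{
    /* With a NULL buffer and size 0, snprintf only reports the length. */
    int chars = snprintf(NULL, 0, "%2$i%1$i", first, second);
    assert(chars >= 0);
    return chars + 1;
}

/* Format into DST without overflowing; return 1 if the whole result fit. */
static int format_checked(char *dst, size_t size, int first, int second)
{
    int chars = snprintf(dst, size, "%2$i%1$i", first, second);
    return chars >= 0 && (size_t)chars < size;
}
```

With arguments 1 and 234 the text is "2341", which is four characters plus the NUL, so a 4-byte buffer is one byte short.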

[Bug middle-end/80929] [7/8 Regression] Division with constant no more optimized to mult highpart

2017-07-30 Thread gjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80929

--- Comment #6 from Georg-Johann Lay  ---
Created attachment 41861
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41861&action=edit
time-i.c: C test case

(In reply to Richard Biener from comment #4)
> Fixed?

No.  The attached test case

$ avr-gcc-8 time-i.c -mmcu=atmega168 -O2 -S -dp

still uses slow __[u]divmodhi when optimizing for speed.  The code has 2
divisions and modulo operations by 60.  The first mod is expanded as mul highpart (insn
25) but the second is expanded as __divmodhi4 call (insn 67):

timeid_add:
        ...
        ldi r26,lo8(-119)    ;  24  *movhi/5  [length = 2]
        ldi r27,lo8(-120)
        call __umulhisi3     ;  25  *umulhi3_highpart_call  [length = 2]
        ...
        ldi r22,lo8(16)      ;  61  *movhi/5  [length = 2]
        ldi r23,lo8(14)
        call __udivmodhi4    ;  62  *udivmodhi4_call  [length = 2]
        std Z+2,r22          ;  34  movqi_insn/3  [length = 1]
        movw r24,r18         ;  65  *movhi/1  [length = 1]
        ldi r22,lo8(60)      ;  66  *movhi/5  [length = 2]
        ldi r23,0
        call __divmodhi4     ;  67  *divmodhi4_call  [length = 2]
        std Z+1,r24          ;  50  movqi_insn/3  [length = 1]
/* epilogue start */
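
For readers unfamiliar with the mul-highpart expansion, here is a sketch of unsigned 16-bit division by 60. The multiplier 0x8889 is what the two bytes loaded in insn 24 (lo8(-119) = 0x89, lo8(-120) = 0x88) spell out. The final shift by 5 is an assumption, since that part of the listing is elided, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* x / 60 via multiply-high: 0x8889 is ceil(2^21 / 60), so the quotient is
   the high 16 bits of the 32-bit product, shifted right by 5 more bits. */
static uint16_t div60_mulhi(uint16_t x)
{
    uint32_t product = (uint32_t)x * 0x8889u;
    uint16_t high = (uint16_t)(product >> 16);
    return (uint16_t)(high >> 5);
}

/* x % 60 recovered from the quotient with one multiply and one subtract. */
static uint16_t mod60_mulhi(uint16_t x)
{
    return (uint16_t)(x - 60u * div60_mulhi(x));
}

/* Return 1 when both helpers match / and % for every 16-bit input. */
static int mulhi_matches_division(void)
{
    for (uint32_t x = 0; x <= UINT16_MAX; x++) {
        if (div60_mulhi((uint16_t)x) != x / 60
            || mod60_mulhi((uint16_t)x) != x % 60)
            return 0;
    }
    return 1;
}
```

The exhaustive check covers all 65536 inputs, so the constant and shift are at least consistent with plain division in the unsigned case. The signed second modulo in the listing would need its own correction steps.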

[Bug target/81614] New: x86 optimizer combines results of comparisons in a way that risks partial register stalls

2017-07-30 Thread cody at codygray dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81614

Bug ID: 81614
   Summary: x86 optimizer combines results of comparisons in a way
that risks partial register stalls
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cody at codygray dot com
  Target Milestone: ---
Target: i?86-*-*

Consider the following code:

bool foo(int a, int b, int c)
{
    // It doesn't matter if this short-circuits  ('||' vs. '|')
    // because the optimizer treats them as equivalent.
    return (a == c || b == c);
}

All versions of GCC (going back to at least 4.4.7 and forward to the current
8.0 preview) translate this to the following optimized assembly on x86 targets:

foo(int, int, int):
        movl    12(%esp), %edx
        cmpl    %edx, 4(%esp)
        sete    %al
        cmpl    8(%esp), %edx
        sete    %dl
        orl     %edx, %eax
        ret

The problem here is the second-to-last instruction. It ORs together two full
32-bit registers, even though the preceding SETE instructions only set the low
8 bits of each register. This results in a speed-zapping phenomenon on
virtually all x86 processors called a *partial register stall*.

(See http://www.agner.org/optimize/microarchitecture.pdf for details on exactly
how this is a performance problem on various implementations of x86. Although
there are differences in exactly *why* it is a speed penalty, it virtually
always is and *certainly* should be considered one when the output is tuned for
a generic x86 target.)

You get the same results at all optimization levels, including -Os (at least,
the relevant portion of the code is the same). You also see this for x86-64
targets:

foo(int, int, int):
        cmpl    %edx, %edi
        sete    %al
        cmpl    %esi, %edx
        sete    %dl
        orl     %edx, %eax
        ret

One of two things should be done instead: either (A) perform the bitwise
operation *only* on the low bytes, or (B) pre-zero the entire 32-bit register
*before* setting its low byte to break dependencies.

Proposed Resolution A (use only low bytes):

foo(int, int, int):
        movl    12(%esp), %edx
        cmpl    %edx, 4(%esp)
        sete    %al
        cmpl    8(%esp), %edx
        sete    %dl
        orb     %dl, %al
        ret

Proposed Resolution B (pre-zero to break dependencies):

foo(int, int, int):
        movl    12(%esp), %edx
        xorl    %eax, %eax
        cmpl    %edx, 4(%esp)
        sete    %al
        xorl    %ecx, %ecx
        cmpl    8(%esp), %edx
        sete    %cl
        orl     %ecx, %eax
        ret

Approach A is the one used by Clang and MSVC. It solves the problem of partial
register stalls while avoiding the need for a third register as in Approach B.

The disadvantage of Approach A is that it creates only a byte-sized (8-bit)
result. This is perfectly fine if the function returns a bool, but doesn't work
if the function returns an integer type. There are two ways to solve that. What
GCC currently does if you change foo() to return int is add a MOVZBL
instruction between the OR and RET:

foo(int, int, int):
        movl    12(%esp), %edx
        cmpl    %edx, 4(%esp)
        sete    %al
        cmpl    8(%esp), %edx
        sete    %dl
        orl     %edx, %eax
        movzbl  %al, %eax
        ret

This zero-extends the result in AL into EAX. (Notice that the partial register
stall hazard is still there.) This existing behavior could simply be
maintained. However, it would be more optimal to pre-zero as shown in Approach
B. (For details on why this would be more optimal on all x86
microarchitectures, see here: https://stackoverflow.com/a/33668295).
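
As a small source-level companion to the report, the sketch below spells out the variants it discusses: the `||` and `|` forms returning bool, and the int-returning version that needs the zero-extension. The function names are made up for illustration. It only checks that the variants compute the same value, which is why an optimizer is free to treat them alike; it says nothing about which instruction sequence is faster.

```c
#include <assert.h>
#include <stdbool.h>

/* Short-circuit form, as in the report. */
static bool foo_logical(int a, int b, int c)
{
    return a == c || b == c;
}

/* Non-short-circuit form; both comparisons have no side effects. */
static bool foo_bitwise(int a, int b, int c)
{
    return (a == c) | (b == c);
}

/* Same test returning int, so the 8-bit result must be widened. */
static int bar(int a, int b, int c)
{
    return a == c || b == c;
}

/* Return 1 when all three variants agree for every triple in [lo, hi]. */
static int variants_agree(int lo, int hi)
{
    for (int a = lo; a <= hi; a++)
        for (int b = lo; b <= hi; b++)
            for (int c = lo; c <= hi; c++) {
                bool l = foo_logical(a, b, c);
                if (l != foo_bitwise(a, b, c) || (int)l != bar(a, b, c))
                    return 0;
            }
    return 1;
}
```

Compiling this with and without -mtune-ctrl=partial_reg_stall and comparing the generated assembly is one way to observe the difference discussed in this thread.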

[Bug c++/81514] g++.dg/lookup/missing-std-include-2.C FAILs on Solaris

2017-07-30 Thread ro at CeBiTec dot Uni-Bielefeld.DE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81514

--- Comment #3 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #2 from David Malcolm  ---
> Candidate patch: https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01858.html

I've included the patch in this weekend's Solaris bootstraps and the
failures are gone indeed.

Thanks.
Rainer

[Bug c/51515] Unable to forward declare nested functions

2017-07-30 Thread SztfG at yandex dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51515

SztfG at yandex dot ru changed:

   What|Removed |Added

 CC||SztfG at yandex dot ru

--- Comment #2 from SztfG at yandex dot ru ---
But why doesn't this compile?

void f()
{
  typedef auto void (*func)();
  func g(void);

  g();

  func g() {}
}