[Bug debug/103975] DWARF .debug_frame incorrect for ISRs on AVR; pushing SREG creates off-by-one error

2022-01-10 Thread kimballa at apache dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103975

--- Comment #1 from Aaron Kimball  ---
Created attachment 52160
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52160&action=edit
isr.i that generates incorrect .debug_frame info

Attaching isr.i test case file

[Bug debug/103975] New: DWARF .debug_frame incorrect for ISRs on AVR; pushing SREG creates off-by-one error

2022-01-10 Thread kimballa at apache dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103975

Bug ID: 103975
   Summary: DWARF .debug_frame incorrect for ISRs on AVR; pushing
SREG creates off-by-one error
   Product: gcc
   Version: 7.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kimballa at apache dot org
  Target Milestone: ---

On the AVR platform, when generating ISRs/interrupts/signal handlers (without
using the `naked` attribute), the ISR prologue correctly emits standard code to
read the value of `SREG` from its register port 0x3f and save it to the stack.
However, this is ignored when generating the FDE for stack frame unwind
information in .debug_frame; the result is that for $PC values after that
point, the calculations for the CFA as well as the locations of all
subsequently-pushed registers are off-by-one on the stack. 

The test case (attached) is a minimal ISR:

```
volatile unsigned char x = 0;

ISR(TIMER1_COMPA_vect) {
  x++; // may set carry/overflow flags and affect SREG, forcing it to be pushed.
}
```

which generates the following (correct) AVR assembly:

```
ISR(TIMER1_COMPA_vect) {
  ce:   1f 92   push r1
  d0:   0f 92   push r0
  d2:   0f b6   in  r0, 0x3f ; 63 # read SREG to r0
  d4:   0f 92   push r0   # push SREG on stack
  d6:   11 24   eor r1, r1
  d8:   8f 93   push r24
  x++; // may set carry/overflow flags and affect SREG, forcing it to be pushed.
  da:   80 91 00 01 lds r24, 0x0100 ; 0x800100 <_edata>
  de:   8f 5f   subi r24, 0xFF   ; 255
# (ISR continues; remainder elided)
```

A separate bug in binutils/objdump makes `objdump -Wframe` fail to decode the
unwind instructions, but pyelftools can parse the info correctly and shows as
follows:

```
* FDE for __vector_17() starting at $PC=0x00ce:

PC: 00ce {'cfa': CFARule(reg=32, offset=2, expr=None), 36: RegisterRule(OFFSET,
-1)}
PC: 00d0 {'cfa': CFARule(reg=32, offset=3, expr=None), 36: RegisterRule(OFFSET,
-1), 1: RegisterRule(OFFSET, -2)}
PC: 00d2 {'cfa': CFARule(reg=32, offset=4, expr=None), 36: RegisterRule(OFFSET,
-1), 1: RegisterRule(OFFSET, -2), 0: RegisterRule(OFFSET, -3)}

<-- we *should* see a RegisterRule for SREG here valid after $PC=00d4h.

PC: 00da {'cfa': CFARule(reg=32, offset=5, expr=None), 36: RegisterRule(OFFSET,
-1), 1: RegisterRule(OFFSET, -2), 0: RegisterRule(OFFSET, -3), 24:
RegisterRule(OFFSET, -4)}

^-- Even that notwithstanding, the 'push' at 0xd4 means the CFARule offset at
0xda is now 1 too few, and the subsequent RegisterRule offset for r24 makes it
look snug against r0, ignoring the intervening push; CFARule offset should = 6
and r24's rule should have OFFSET=-5.
```

(n.b., gcc refers to $SP as register 32 in the stack frame info, and the return
addr / virtual link register as register 36.)

I am by no means a gcc internals expert but I tried to satisfy myself that it
was a legitimate bug (and not my own goof) by reading the gcc source code. 

I believe the bug is in `gcc/config/avr/avr.c` at lines 1946--47:
```
1946   /* ??? There's no dwarf2 column reserved for SREG.  */
1947   emit_push_sfr (sreg_rtx, false, false /* clr */, AVR_TMP_REGNO);
```

The author's confused comment about lacking an unwind-tracking register for
SREG shows that something might be going wrong here.

I think the 2nd argument to emit_push_sfr() may need to be `true` to set
`frame_related_p=true` (which marks the instruction as
`RTX_FRAME_RELATED_P(insn)=1`). But I don't know what other knock-on effects
frame_related_p=true might have, so it's hard for me to say it's definitely
just a one-flag fix. 

Whether or not unwind info for SREG can be tracked, at minimum the remaining
unwind info needs to account for the extra push to the stack mid-prologue.
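
As a rough cross-check of the expected values, here is a minimal Python sketch that replays the prologue above and tracks the CFA offset by hand. It is only a model, not GCC's CFI generation: the `PROLOGUE` table and the `expected_unwind` helper are hypothetical names introduced for illustration, and the model assumes that every AVR `push` stores exactly one byte and that the CFA offset starts at 2 on entry, as in the dump at 0x00ce.

```python
# Hypothetical model of the ISR prologue from this report; not GCC code.
# Assumption: each AVR `push` stores one byte, so the CFA offset grows by 1.
PROLOGUE = [
    (0xCE, "push", "r1"),
    (0xD0, "push", "r0"),
    (0xD2, "in", None),       # in r0, 0x3f: reads SREG, no stack change
    (0xD4, "push", "SREG"),   # push r0, which now holds SREG
    (0xD6, "eor", None),      # eor r1, r1: no stack change
    (0xD8, "push", "r24"),
]


def expected_unwind(prologue, initial_cfa_offset=2):
    """Return (cfa_offset, saved) after the whole prologue has executed."""
    cfa_offset = initial_cfa_offset
    saved = {"ret": -1}  # return-address rule, as shown at PC 0x00ce
    for _pc, op, what in prologue:
        if op == "push":
            # The pushed byte lands at CFA - cfa_offset, then SP moves down.
            saved[what] = -cfa_offset
            cfa_offset += 1
    return cfa_offset, saved


cfa_offset, saved = expected_unwind(PROLOGUE)
print(cfa_offset, saved)
```

Under those assumptions the model ends with a CFA offset of 6 and r24 at offset -5, which matches the values argued for above, with the SREG byte occupying offset -4.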

I have replicated this with avr-gcc 7.3.0-atmel3.6.1-arduino7 (Arduino's
official package) as well as gcc version 5.4.0 with target=avr, which is what
Ubuntu 20.04.3 installs as `apt install gcc-avr`. Although I couldn't bootstrap
a functioning cross-compiler with gcc 11.2 myself, the source code snippet
above is from the main branch of gcc.git, which makes me believe it's likely
still an issue.

The attached file was created with the 7.3.0 edition of gcc referenced above:
```
$ ./avr-g++ -v
Using built-in specs.
Reading specs from
/home/aaron/.arduino15/packages/arduino/tools/avr-gcc/7.3.0-atmel3.6.1-arduino7/bin/../lib/gcc/avr/7.3.0/device-specs/specs-avr2
COLLECT_GCC=./avr-g++
COLLECT_LTO_WRAPPER=/home/aaron/.arduino15/packages/arduino/tools/avr-gcc/7.3.0-atmel3.6.1-arduino7/bin/../libexec/gcc/avr/7.3.0/lto-wrapper
Target: avr
Configured with: ../gcc/configure --enable-fixed-point --enable-languages=c,c++
--prefix=/home/jenkins/workspace/avr-gcc-staging/label/debian7-x86_64/objdir
--disable-nl
```

[Bug bootstrap/103974] [12 Regression] ICE in ira_flattening building libstdc++ with r12-6415-g01f3e6a40e72

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103974

--- Comment #5 from Andrew Pinski  ---
These are the steps I needed to reproduce this bug:
mkdir objdir-cris
cd objdir-cris
../configure --target=cris-elf
make -j24 all-gcc
cd gcc
wget https://gcc.gnu.org/bugzilla/attachment.cgi?id=52159
mv attach* t.c
./cc1 t.c -O2

(At this point you can also run:
make clean
make -j24 cc1
to get a cc1 built at -O0 too, if needed.)

[Bug bootstrap/103974] [12 Regression] ICE in ira_flattening building libstdc++ with r12-6415-g01f3e6a40e72

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103974

--- Comment #4 from Andrew Pinski  ---
Created attachment 52159
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52159&action=edit
reduced testcase as much as I could get it

It just needs -O2. The testcase is valid as both C and C++.
I can't really reduce it any further.

[Bug target/103973] x86: 4-way comparison of floats/doubles with spaceship operator possibly suboptimal

2022-01-10 Thread nekotekina at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103973

--- Comment #4 from Ivan  ---
So there is nothing to improve here? That's good to know, I suppose it can be
closed then.

[Bug target/103973] x86: 4-way comparison of floats/doubles with spaceship operator possibly suboptimal

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103973

--- Comment #3 from Andrew Pinski  ---
With -O2 -fno-trapping-math, we get:
ucomisd %xmm1, %xmm0
jp  .L9
movl $0, %eax
jne .L9
ret
.p2align 4,,10
.p2align 3
.L9:
ucomisd %xmm0, %xmm1
ja  .L12
ucomisd %xmm1, %xmm0
setbe   %al
addl $1, %eax
ret
.p2align 4,,10
.p2align 3
.L12:
movl $-1, %eax
ret


With -O2 -fno-trapping-math -ffinite-math-only we get:
comisd  %xmm1, %xmm0
je  .L12
jb  .L13
movl $1, %edx
movl $2, %eax
cmova   %edx, %eax
ret
.p2align 4,,10
.p2align 3
.L12:
xorl %eax, %eax
ret
.p2align 4,,10
.p2align 3
.L13:
movl $-1, %eax
ret

The main reason is NaNs are interesting and trapping comparisons are also
interesting. Note LLVM does not implement trapping math at all which is why
their code gen is different too.

[Bug target/103973] x86: 4-way comparison of floats/doubles with spaceship operator possibly suboptimal

2022-01-10 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103973

--- Comment #2 from Hongtao.liu  ---
W/ -ffast-math only one COMISD is used.

[Bug target/103973] x86: 4-way comparison of floats/doubles with spaceship operator possibly suboptimal

2022-01-10 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103973

Hongtao.liu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #1 from Hongtao.liu  ---
(In reply to Ivan from comment #0)
> Hello, I may be missing something here but the generated code seems strange
> and suboptimal. It looks like all 4 possible paths can use flags from a
> single UCOMISD instruction, not calling it 3 times in worst case.
There is also COMISD, which differs from UCOMISD as follows:

--- cut from Intel SDM ---
The UCOMISD instruction differs from the COMISD instruction in that it signals
a SIMD floating-point invalid operation exception (#I) only when a source
operand is an SNaN. The COMISD instruction signals an invalid operation
exception only if a source operand is either an SNaN or a QNaN.
--- cut end ---

They are also represented differently in GCC's RTL, which is why CSE doesn't
help here.
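
To make the quoted difference concrete, here is a small Python sketch of the signalling rule as the SDM excerpt states it. The function name `signals_invalid` and the operand labels `"num"`, `"qnan"` and `"snan"` are made up for illustration; the sketch models only the invalid-operation condition, not the flag results or anything about GCC's RTL.

```python
def signals_invalid(insn, operand_kinds):
    """Model of the quoted SDM rule.

    operand_kinds is an iterable of "num", "qnan" or "snan" labels
    (hypothetical labels, one per source operand).
    """
    kinds = set(operand_kinds)
    if insn == "ucomisd":
        # Unordered compare: only a signalling NaN raises invalid.
        return "snan" in kinds
    if insn == "comisd":
        # Ordered compare: any NaN, quiet or signalling, raises invalid.
        return bool(kinds & {"snan", "qnan"})
    raise ValueError(f"unknown instruction: {insn}")


for insn in ("ucomisd", "comisd"):
    print(insn, signals_invalid(insn, ["num", "qnan"]))
```

The two instructions agree for ordinary numbers and for SNaN operands, and differ only when a QNaN is involved, so replacing one with the other is not behaviour-preserving when trapping math is enabled.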

[Bug bootstrap/103974] [12 Regression] ICE in ira_flattening building libstdc++ with r12-6415-g01f3e6a40e72

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103974

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-01-11

--- Comment #3 from Andrew Pinski  ---
Confirmed, reducing.

[Bug bootstrap/103974] [12 Regression] ICE in ira_flattening building libstdc++ with r12-6415-g01f3e6a40e72

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103974

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 CC||pinskia at gcc dot gnu.org

[Bug libstdc++/103726] --disable-hosted-libstdcxx (freestanding C++) does not provide as what standard requires

2022-01-10 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103726

--- Comment #10 from cqwrteur  ---
(In reply to Jonathan Wakely from comment #9)
> Ah it was left out of libstdc++'s freestanding builds intentionally:
> 
> > When the final  header is added it will need to be in
> > libsupc++ so that it's included for freestanding builds (and at that
> > point it won't be able to use , but that will be
> > OK as the final header will be C++20-only and can rely on 
> > unconditionally, which is also freestanding).
> 
> The current experimental implementation supports C++14 and C++17, but to do
> that it needs to include some non-freestanding stuff.

then we need to report to wg21 as defects. Removing exception, typeinfo and
coroutines from freestanding for example.

[Bug bootstrap/103974] [12 Regression] ICE in ira_flattening building libstdc++ with r12-6415-g01f3e6a40e72

2022-01-10 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103974

--- Comment #2 from Hans-Peter Nilsson  ---
Created attachment 52158
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52158&action=edit
brief gdb inspection session

I had to "rm *ira*.o" and "make all-gcc CXXFLAGS=-g3" and re-start the gdb
session to get a sane inspection context: that's what the "r;
`/tmp/X/gccobj/gcc/cc1plus' has changed; re-reading symbols." is about.  IOW,
business as usual.
BTW, host environment is Debian 11; "gcc --version | head -1" yields
gcc (Debian 10.2.1-6) 10.2.1 20210110

[Bug bootstrap/103974] [12 Regression] ICE in ira_flattening building libstdc++ with r12-6415-g01f3e6a40e72

2022-01-10 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103974

--- Comment #1 from Hans-Peter Nilsson  ---
Created attachment 52157
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52157&action=edit
gzipped preprocessed (>1 MB) floating_to_chars.ii

Reproduce using the ungzipped file for example like so:

/tmp/X/gccobj/./gcc/cc1plus -fpreprocessed floating_to_chars.ii -march=v10
-quiet -dumpbase floating_to_chars.cc -dumpbase-ext .cc -O2 -std=gnu++17
-frandom-seed=floating_to_chars.lo

[Bug middle-end/70425] decl_expr does not print the *_decl which is it is associated with it for tree-dump.c

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70425

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2022-01-11

--- Comment #4 from Andrew Pinski  ---

/* Used to represent a local declaration. The operand is DECL_EXPR_DECL.  */
DEFTREECODE (DECL_EXPR, "decl_expr", tcc_statement, 1)


case tcc_expression:
case tcc_reference:
case tcc_statement:
case tcc_vl_exp:
  /* These nodes are handled explicitly below.  */


But DECL_EXPR is not handled below.

Confirmed.

[Bug bootstrap/103974] New: [12 Regression] ICE in ira_flattening building libstdc++ with r12-6415-g01f3e6a40e72

2022-01-10 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103974

Bug ID: 103974
   Summary: [12 Regression] ICE in ira_flattening building
libstdc++ with r12-6415-g01f3e6a40e72
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: build, ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hp at gcc dot gnu.org
CC: rsandifo at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-linux
Target: cris-elf

Commit r12-6415-g01f3e6a40e72, "[PATCH 27/33] ira: Consider modelling
caller-save allocations as loop spills" caused build to break for cris-elf,
when building the v10 multilib of libstdc++-v3, specifically
v10/libstdc++-v3/src/c++17 floating_to_chars.cc, like so:

libtool: compile:  /tmp/X/gccobj/./gcc/xgcc -shared-libgcc
-B/tmp/X/gccobj/./gcc -nostdinc++ -L/tmp/X/gccobj/cris-elf/v10/libstdc++-v3/src
-L/tmp/X/gccobj/cris-elf/v10/libstdc++-v3/src/.libs
-L/tmp/X/gccobj/cris-elf/v10/libstdc++-v3/libsupc++/.libs -nostdinc
-B/tmp/X/gccobj/cris-elf/v10/newlib/ -isystem
/tmp/X/gccobj/cris-elf/v10/newlib/targ-include -isystem
/tmp/X/gcc/newlib/libc/include -B/tmp/X/gccobj/cris-elf/v10/libgloss/cris
-L/tmp/X/gccobj/cris-elf/v10/libgloss/libnosys -L/tmp/X/gcc/libgloss/cris
-B/tmp/X/pre/cris-elf/bin/ -B/tmp/X/pre/cris-elf/lib/ -isystem
/tmp/X/pre/cris-elf/include -isystem /tmp/X/pre/cris-elf/sys-include -march=v10
-mbest-lib-options -I/tmp/X/gcc/libstdc++-v3/../libgcc
-I/tmp/X/gccobj/cris-elf/v10/libstdc++-v3/include/cris-elf
-I/tmp/X/gccobj/cris-elf/v10/libstdc++-v3/include
-I/tmp/X/gcc/libstdc++-v3/libsupc++ -std=gnu++17 -nostdinc++
-fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual -Wabi=2
-fdiagnostics-show-location=once -ffunction-sections -fdata-sections
-frandom-seed=floating_to_chars.lo -fimplicit-templates -g -O2 -march=v10
-mbest-lib-options -c /tmp/X/gcc/libstdc++-v3/src/c++17/floating_to_chars.cc -o
floating_to_chars.o
/tmp/X/gcc/libstdc++-v3/src/c++17/floating_to_chars.cc:561:3: warning:
'std::to_chars_result {anonymous}::to_chars(char*, char*, uint128_t)' defined
but not used [-Wunused-function]
  561 |   to_chars(char* first, char* const last, uint128_t x)
  |   ^~~~
/tmp/X/gcc/libstdc++-v3/src/c++17/floating_to_chars.cc:554:3: warning: 'int
{anonymous}::get_mantissa_length(ryu::generic128::floating_decimal_128)'
defined but not used [-Wunused-function]
  554 |   get_mantissa_length(const ryu::floating_decimal_128 fd)
  |   ^~~
/tmp/X/gcc/libstdc++-v3/src/c++17/floating_to_chars.cc:127:5: warning: 'int
{anonymous}::ryu::to_chars(generic128::floating_decimal_128, char*)' defined
but not used [-Wunused-function]
  127 | to_chars(const floating_decimal_128 v, char* const result)
  | ^~~~
during RTL pass: ira
/tmp/X/gcc/libstdc++-v3/src/c++17/floating_to_chars.cc: In function
'std::to_chars_result std::__floating_to_chars_shortest(char*, char*, T,
chars_format) [with T = double]':
/tmp/X/gcc/libstdc++-v3/src/c++17/floating_to_chars.cc:1106:3: internal
compiler error: in ira_flattening, at ira-build.c:3213
 1106 |   }
  |   ^
0x708412 ira_flattening(int, int)
/tmp/X/gcc/gcc/ira-build.c:3213
0xe7c9b8 ira
/tmp/X/gcc/gcc/ira.c:5807
0xe7c9b8 execute
/tmp/X/gcc/gcc/ira.c:6075
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

I'll attach a brief gdb inspection session.

[Bug target/103973] New: x86: 4-way comparison of floats/doubles with spaceship operator possibly suboptimal

2022-01-10 Thread nekotekina at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103973

Bug ID: 103973
   Summary: x86: 4-way comparison of floats/doubles with spaceship
operator possibly suboptimal
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nekotekina at gmail dot com
  Target Milestone: ---

Hello, I may be missing something here but the generated code seems strange and
suboptimal. It looks like all 4 possible paths can use flags from a single
UCOMISD instruction, not calling it 3 times in worst case.

cmp4way(double, double):
ucomisd xmm0, xmm1
jp  .L8
mov eax, 0
jne .L8
.L2:
ret
.L8:
comisd  xmm1, xmm0
mov eax, -1
ja  .L2
ucomisd xmm0, xmm1
setbe   al
add eax, 1
ret

https://godbolt.org/z/j1j7G1MYP

#include <compare>

auto cmp4way(double a, double b)
{
return a <=> b;
}
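
The claim that one compare carries enough information can be illustrated with a short Python model of the flag results. This is a sketch under stated assumptions: `ucomisd_flags` and `classify` are hypothetical helper names, the (ZF, PF, CF) encoding follows the documented UCOMISD results (unordered 1,1,1; greater 0,0,0; less 0,0,1; equal 1,0,0), and exception behaviour for NaN operands is deliberately not modelled.

```python
import math


def ucomisd_flags(a, b):
    """Return (ZF, PF, CF) as set by comparing a with b (flags only)."""
    if math.isnan(a) or math.isnan(b):
        return (1, 1, 1)  # unordered
    if a > b:
        return (0, 0, 0)
    if a < b:
        return (0, 0, 1)
    return (1, 0, 0)      # equal


def classify(flags):
    """Decode all four outcomes from a single set of flags."""
    zf, pf, cf = flags
    if pf:
        return "unordered"
    if zf:
        return "equal"
    return "less" if cf else "greater"


for a, b in [(1.0, 2.0), (2.0, 1.0), (1.0, 1.0), (float("nan"), 1.0)]:
    print(a, b, classify(ucomisd_flags(a, b)))
```

In this model the parity flag separates the unordered case first, after which ZF and CF distinguish the remaining three, so the decoding needs only one comparison as far as flags are concerned.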

[Bug target/78855] -mtune=generic should keep cmp/jcc together. AMD and Intel both macro-fuse

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78855

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |8.0
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=81616

--- Comment #2 from Andrew Pinski  ---
Fixed in GCC 8 by r8-5077.

[Bug tree-optimization/103964] [9/10/11/12 Regression] OVS miscompilation since r0-92313-g5006671f1aaa63cd

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103964

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |9.5

[Bug target/103967] x86-64: bitfields make inefficient indexing for array with 16 byte+ objects

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103967

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-01-11
 Ever confirmed|0   |1
   Severity|normal  |enhancement

--- Comment #2 from Andrew Pinski  ---
I think there are already bugs covering both issues.

[Bug target/103967] x86-64: bitfields make inefficient indexing for array with 16 byte+ objects

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103967

--- Comment #1 from Andrew Pinski  ---
There are two issues: first, array accesses are not always lowered at the
gimple level; second, bitfield accesses are not lowered either.

[Bug c++/88313] generic lambda in default template argument

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88313

Andrew Pinski  changed:

   What|Removed |Added

 CC||johelegp at gmail dot com

--- Comment #2 from Andrew Pinski  ---
*** Bug 101884 has been marked as a duplicate of this bug. ***

[Bug c++/101884] Generic lambda with auto in template parameter list rejected

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101884

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Dup of bug 88313.

*** This bug has been marked as a duplicate of bug 88313 ***

[Bug c++/88313] generic lambda in default template argument

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88313

Andrew Pinski  changed:

   What|Removed |Added

 CC||egor.pugin at gmail dot com

--- Comment #1 from Andrew Pinski  ---
*** Bug 103969 has been marked as a duplicate of this bug. ***

[Bug c++/103969] generic lambda is rejected as a template argument default

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103969

--- Comment #3 from Andrew Pinski  ---
actually dup of bug 88313.

*** This bug has been marked as a duplicate of bug 88313 ***

[Bug c++/54367] [meta-bug] lambda expressions

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54367
Bug 54367 depends on bug 103969, which changed state.

Bug 103969 Summary: generic lambda is rejected as a template argument default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103969

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

[Bug c++/101884] Generic lambda with auto in template parameter list rejected

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101884

Andrew Pinski  changed:

   What|Removed |Added

 CC||egor.pugin at gmail dot com

--- Comment #1 from Andrew Pinski  ---
*** Bug 103969 has been marked as a duplicate of this bug. ***

[Bug c++/103969] generic lambda is rejected as a template argument default

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103969

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Andrew Pinski  ---
Dup of bug 101884.

*** This bug has been marked as a duplicate of bug 101884 ***

[Bug c++/103969] generic lambda is rejected as a template argument default

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103969

Andrew Pinski  changed:

   What|Removed |Added

   Keywords|rejects-valid   |
Summary|'auto' parameter not|generic lambda is rejected
   |permitted in this context   |as a template argument
   ||default

--- Comment #1 from Andrew Pinski  ---
Are you sure this is valid? Clang rejects:
template 
struct s {
s(){t(1);}
};
s<> t;

[Bug rtl-optimization/53652] *andn* does not always get used with simd and loading from memory

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |12.0

--- Comment #10 from Andrew Pinski  ---
Fixed.

[Bug rtl-optimization/53652] *andn* does not always get used with simd and loading from memory

2022-01-10 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

--- Comment #9 from Hongtao.liu  ---
Fixed in GCC12.

[Bug rtl-optimization/53652] *andn* does not always get used with simd and loading from memory

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

--- Comment #8 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:4bb79e27c02c5cd57d5781bef20e70982d898c40

commit r12-6428-g4bb79e27c02c5cd57d5781bef20e70982d898c40
Author: Haochen Jiang 
Date:   Thu Dec 30 15:47:58 2021 +0800

Extend predicate of operands[1] from register_operand to vector_operand for
andnot insn.

This can do optimization like

-   pcmpeqd %xmm0, %xmm0
-   pxor    g(%rip), %xmm0
-   pand    %xmm1, %xmm0
+   movdqa  g(%rip), %xmm0
+   pandn   %xmm1, %xmm0

gcc/ChangeLog:

PR target/53652
* config/i386/sse.md (*andnot3): Extend predicate of
operands[1] from register_operand to vector_operand.

gcc/testsuite/ChangeLog:

PR target/53652
* gcc.target/i386/pr53652-1.c: New test.

[Bug tree-optimization/103961] [12 Regression] gcc-12 apparently miscompiles libcap's cap_to_text() function

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103961

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |blocker

[Bug tree-optimization/103961] [12 Regression] gcc-12 apparently miscompiles libcap's cap_to_text() function

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103961

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic

--- Comment #10 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #9)
> Reduced testcase:

Note this is -O1 -Wall.

This is definitely a change in objsz which is calculating the size wrong.

We get:
: In function 'cap_to_text':
:18:29: warning: '+' directive writing 1 byte into a region of size 0
[-Wformat-overflow=]
   18 | sprintf(p, "+");
  | ^

[Bug tree-optimization/103961] [12 Regression] gcc-12 apparently miscompiles libcap's cap_to_text() function

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103961

Andrew Pinski  changed:

   What|Removed |Added

   Keywords|needs-reduction |

--- Comment #9 from Andrew Pinski  ---
Reduced testcase:
extern __inline 
__attribute__ ((__gnu_inline__))
 int
 sprintf (char *__restrict __s, const char *__restrict __fmt, ...) {
return __builtin___sprintf_chk (__s, 2 - 1, __builtin_object_size (__s, 2 > 1),
__fmt, __builtin_va_arg_pack ());
}
void cap_to_text(int cmb)
{
char buf[(23 * ((2) * 32))+100];
char *p;
int n, t;
p = 20 + buf;
for (t = 8; t--; )
{
for (n = 0; n < cmb; n++)
p += sprintf(p, "a,");
p--;
sprintf(p, "+");
}
}

When p-- happens, it will always be buf+somesmallvalue.
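
For illustration, here is a minimal Python sketch that replays the pointer arithmetic of the reduced testcase and reports how many bytes remain in `buf` at each `sprintf(p, "+")`. It is a hand-written model, not what the objsz pass computes: the helper name `remaining_before_plus` is hypothetical, and it assumes each `sprintf(p, "a,")` returns 2.

```python
BUF_SIZE = 23 * (2 * 32) + 100  # size of buf in the reduced testcase


def remaining_before_plus(cmb):
    """Bytes left from p to the end of buf at each sprintf(p, "+") call."""
    p = 20                      # p = 20 + buf
    remaining = []
    for _ in range(8):          # for (t = 8; t--; )
        for _ in range(cmb):    # for (n = 0; n < cmb; n++)
            p += 2              # p += sprintf(p, "a,")
        p -= 1                  # p--
        remaining.append(BUF_SIZE - p)
    return remaining


print(BUF_SIZE, remaining_before_plus(1))
```

For small values of `cmb` the remaining space stays well above the two bytes that `"+"` plus its terminator need, which is consistent with the point above that `p` is `buf` plus a small value whenever `p--` happens.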

[Bug middle-end/100536] ICE: in expand_call with large union (1GB) argument

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100536

--- Comment #6 from Andrew Pinski  ---
(In reply to Karine EM from comment #5)
> With GCC-11 and GCC-10, the compiler does not crash but returns: "confused
> by earlier errors, bailing out" and ends gracefully.

That is actually still a crash :) Just hiding the internal compiler error as
there was already an error. This happens with release checking turned on. It is
by design even.

[Bug tree-optimization/103971] [12 regression] build fails after r12-6420, ICE at libgfortran/generated/matmul_i1.c:2450

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103971

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
  Component|bootstrap   |tree-optimization
   Keywords||build, ice-on-valid-code
   Severity|normal  |blocker

[Bug go/103972] Building Go Frontend Failure on FreeBSD Powerpc64

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103972

--- Comment #1 from Andrew Pinski  ---
go/runtime/os_freebsd.go should have been used

[Bug c/81453] relational expression involving null pointer not diagnosed with -Wall

2022-01-10 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81453

Martin Sebor  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #4 from Martin Sebor  ---
I'm not sure pr103945 is a duplicate of this feature request: this one is about
relational expressions involving a null pointer, while pr103945 is about
relational expressions involving function pointers.  Both sets of expressions
are invalid, but both are also quite different.

That said, as I mentioned in bug 103945 comment 4: the patch I submitted in
October 2020 detects invalid relational comparisons of unrelated pointers of
all kinds (https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558775.html;
the patch review was never finished).  I'd expect it to also diagnose
relational expressions involving null pointers (because they're treated as
distinct by the pointer-query infrastructure) but I don't see any tests for it,

[Bug go/103972] New: Building Go Frontend Failure on FreeBSD Powerpc64

2022-01-10 Thread clhamilto at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103972

Bug ID: 103972
   Summary: Building Go Frontend Failure on FreeBSD Powerpc64
   Product: gcc
   Version: og11 (devel/omp/gcc-11)
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: clhamilto at gmail dot com
CC: cmang at google dot com
  Target Milestone: ---

Attempting to build the Go frontend produced the error below.  The function is
defined in "runtime/os_freebsd.go", but it seems powerpc64 is not supported for
FreeBSD.

/usr/ports/lang/gcc11/work/gcc-11.2.0/libgo/go/internal/cpu/cpu_ppc64x.go:18:9:
error: reference to undefined name 'osinit'
   18 | osinit()
  | ^
gmake[6]: *** [Makefile:3001: internal/cpu.lo] Error 1

[Bug c++/87403] [Meta-bug] Issues that suggest a new warning

2022-01-10 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403
Bug 87403 depends on bug 103945, which changed state.

Bug 103945 Summary: No warning for ordered comparison of function pointers ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103945

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |DUPLICATE

[Bug c/81453] relational expression involving null pointer not diagnosed with -Wall

2022-01-10 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81453

Eric Gallager  changed:

   What|Removed |Added

 CC||dcb314 at hotmail dot com

--- Comment #3 from Eric Gallager  ---
*** Bug 103945 has been marked as a duplicate of this bug. ***

[Bug c++/103945] No warning for ordered comparison of function pointers ?

2022-01-10 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103945

Eric Gallager  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 CC||egallager at gcc dot gnu.org
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Eric Gallager  ---
Dup of bug 81453

*** This bug has been marked as a duplicate of bug 81453 ***

[Bug middle-end/100536] ICE: in expand_call with large union (1GB) argument

2022-01-10 Thread k.even-mendoza at imperial dot ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100536

Karine EM  changed:

   What|Removed |Added

 CC||k.even-mendoza at imperial dot ac.uk

--- Comment #5 from Karine EM  ---
With GCC-11 and GCC-10, the compiler does not crash but returns: "confused by
earlier errors, bailing out" and ends gracefully. But with GCC-12, I got a
similar crash, with a flat array, a bit over 1GB:

 1  struct a {
 2    char arr[11];
 3  } b[1];
 4  void c(struct a e) {
 5    if (__builtin_memcmp(e.arr, b, 6))
 6      __builtin_abort();
 7  }
 8  int main() {
 9    struct a d;
10    d.arr;
11    c(d);
12    return 0;
13  }

However, the compiler does recognize the oversized argument and reports "sorry,
unimplemented: passing too large argument on stack", but it still crashes. If
an error has already been printed, what prevents the compilation from
terminating gracefully, as GCC-11 and GCC-10 used to do?



==
With GCC-11 and GCC-10 (at least for this case):
gcc-10 -O2 2c8efdb591d9739d4434f1c216106706c62bd78f_v2.c
2c8efdb591d9739d4434f1c216106706c62bd78f_v2.c: In function ‘main’:
2c8efdb591d9739d4434f1c216106706c62bd78f_v2.c:11:3: sorry, unimplemented:
passing too large argument on stack
   11 |   c(d);
  |   ^~~~
2c8efdb591d9739d4434f1c216106706c62bd78f_v2.c:11: confused by earlier errors,
bailing out

==
With GCC-12 ((GCC) 12.0.0 20211216 (experimental)), this is the trace:
2c8efdb591d9739d4434f1c216106706c62bd78f.c:11:3: sorry, unimplemented: passing
too large argument on stack
   11 |   c(d);
  |   ^~~~
during RTL pass: expand
2c8efdb591d9739d4434f1c216106706c62bd78f.c:11:3: internal compiler error: in
expand_call, at calls.c:3905
0x6cced7 expand_call(tree_node*, rtx_def*, int)
.././../gcc-source/gcc/calls.c:3905
0xb43a3f expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
.././../gcc-source/gcc/expr.c:11536
0xa14b41 expand_expr
.././../gcc-source/gcc/expr.h:301
0xa14b41 expand_call_stmt
.././../gcc-source/gcc/cfgexpand.c:2831
0xa14b41 expand_gimple_stmt_1
.././../gcc-source/gcc/cfgexpand.c:3864
0xa14b41 expand_gimple_stmt
.././../gcc-source/gcc/cfgexpand.c:4028
0xa1a67e expand_gimple_basic_block
.././../gcc-source/gcc/cfgexpand.c:6069
0xa1c527 execute
.././../gcc-source/gcc/cfgexpand.c:6795
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.

[Bug c++/67491] [meta-bug] concepts issues

2022-01-10 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67491
Bug 67491 depends on bug 103783, which changed state.

Bug 103783 Summary: Ambiguous overload between constrained static member and 
unconstrained non-static member
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103783

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug c++/103783] Ambiguous overload between constrained static member and unconstrained non-static member

2022-01-10 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103783

Patrick Palka  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |11.3
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Patrick Palka  ---
Fixed for GCC 11.3/12

[Bug c++/103783] Ambiguous overload between constrained static member and unconstrained non-static member

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103783

--- Comment #6 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:1e4a9f22ac21da012fc93198a22e30b70e8fdf84

commit r11-9450-g1e4a9f22ac21da012fc93198a22e30b70e8fdf84
Author: Patrick Palka 
Date:   Mon Jan 10 14:57:51 2022 -0500

c++: "more constrained" vs staticness of memfn [PR103783]

Here we're rejecting the calls to g1 and g2 as ambiguous even though one
overload is more constrained than the other (and they're otherwise tied),
because the implicit 'this' parameter of the non-static overload causes
cand_parms_match to think the function parameter lists aren't equivalent.

This patch fixes this by making cand_parms_match skip over 'this'
appropriately.  Note that this bug only affects partial ordering of
non-template member functions because for member function templates
more_specialized_fn seems to already skip over 'this' appropriately.

PR c++/103783

gcc/cp/ChangeLog:

* call.c (cand_parms_match): Skip over 'this' when given one
static and one non-static member function.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-memfun2.C: New test.

(cherry picked from commit 3e95a974c39e922d19bf7ac1246730c516ae01f2)

[Bug libstdc++/103726] --disable-hosted-libstdcxx (freestanding C++) does not provide as what standard requires

2022-01-10 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103726

--- Comment #9 from Jonathan Wakely  ---
Ah it was left out of libstdc++'s freestanding builds intentionally:

> When the final  header is added it will need to be in
> libsupc++ so that it's included for freestanding builds (and at that
> point it won't be able to use , but that will be
> OK as the final header will be C++20-only and can rely on 
> unconditionally, which is also freestanding).

The current experimental implementation supports C++14 and C++17, but to do
that it needs to include some non-freestanding stuff.

[Bug tree-optimization/103821] [12 Regression] huge compile time (jump threading) at -O3 for simple code since r12-4790-g4b3a325f07acebf47e82de227ce1d5ba62f5bcae

2022-01-10 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103821

--- Comment #3 from Andrew Macleod  ---
Interesting.  This isn't caused by jump threading, just exposed by it.

We end up unrolling this loop, and the resulting code contains a set of
cascading multiplies that we can evaluate precisely using subranges.

For instance, we calculate:

_38 = int [8192, 8192][24576, 24576][40960, 40960][57344, 57344]

so _38 has 4 subranges, and then we calculate:

_39 = _38 * _38;

we do 16 multiplications and end up with:  int [67108864, 67108864][201326592,
201326592][335544320, 335544320][469762048, 469762048][603979776,
603979776][1006632960, 1006632960][1409286144, 1409286144][1677721600,
1677721600][+INF, +INF]

This feeds other multiplies and progresses rapidly to blow up the number of
subranges, which are then propagated via PHIs and other operations.

Folding of subranges is an O(n*m) process. We perform the operation on each
pair of subranges and union them.   Values like _38 * _38 that continue feeding
each other quickly become exponential.

Then combining that with union (an inherently linear operation over the number
of subranges) at each step of the way adds an additional quadratic operation on
top of the exponential factor. 

I will adjust the wi_fold routine to recognize when the calculation is moving
in an exponential direction and simply produce a summary result instead of a
precise one.  Longer term, we could consider merging some of the subranges to
prevent the exponential growth, but still retain some precision.
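
For illustration only, here is a minimal sketch of the pairwise fold described
in this comment.  It is not GCC's irange or wi_fold code: the names Subrange,
Range, fold_multiply and fold_multiply_capped are hypothetical, values are
assumed to be nonnegative 32-bit ints, and overflow is modelled by clamping to
+INF.  With the four subranges of _38 it yields the nine subranges listed
above, and the capped variant shows one possible shape of a "summary" result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical stand-ins for a multi-subrange integer range (not GCC's irange).
using Subrange = std::pair<std::int64_t, std::int64_t>;  // [lo, hi], inclusive
using Range = std::vector<Subrange>;                     // sorted, disjoint

constexpr std::int64_t kIntMax = std::numeric_limits<std::int32_t>::max();

// Model signed 32-bit overflow by clamping to +INF.
static std::int64_t saturate(std::int64_t v) { return std::min(v, kIntMax); }

// Multiply every subrange of a with every subrange of b (n*m pairs), then
// union the results.  Assumes all bounds are nonnegative.
Range fold_multiply(const Range &a, const Range &b) {
  Range out;
  for (const Subrange &x : a)
    for (const Subrange &y : b) {
      assert(x.first >= 0 && y.first >= 0);
      out.push_back({saturate(x.first * y.first),
                     saturate(x.second * y.second)});
    }
  // Union step: sort, then merge overlapping or adjacent subranges.
  std::sort(out.begin(), out.end());
  Range merged;
  for (const Subrange &s : out) {
    if (!merged.empty() && s.first <= merged.back().second + 1)
      merged.back().second = std::max(merged.back().second, s.second);
    else
      merged.push_back(s);
  }
  return merged;
}

// Hypothetical cap: when the number of pairs exceeds max_pairs, give up on
// precision and return a single subrange covering every possible product.
Range fold_multiply_capped(const Range &a, const Range &b,
                           std::size_t max_pairs) {
  if (a.empty() || b.empty())
    return Range();
  if (a.size() * b.size() <= max_pairs)
    return fold_multiply(a, b);
  Range summary;
  summary.push_back({saturate(a.front().first * b.front().first),
                     saturate(a.back().second * b.back().second)});
  return summary;
}
```

Calling fold_multiply on the result again multiplies 9 x 9 pairs, which is the
growth the comment is worried about; the capped variant trades that precision
for a bounded amount of work.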

[Bug bootstrap/103971] New: [12 regression] build fails after r12-6420, ICE at libgfortran/generated/matmul_i1.c:2450

2022-01-10 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103971

Bug ID: 103971
   Summary: [12 regression] build fails after r12-6420, ICE at
libgfortran/generated/matmul_i1.c:2450
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:d3ff7420e941931d32ce2e332e7968fe67ba20af, r12-6420 

libtool: compile:  /home/seurer/gcc/git/build/gcc-trunk/./gcc/xgcc
-B/home/seurer/gcc/git/build/gcc-trunk/./gcc/
-B/home/seurer/gcc/git/install/gcc-trunk/powerpc64le-unknown-linux-gnu/bin/
-B/home/seurer/gcc/git/install/gcc-trunk/powerpc64le-unknown-linux-gnu/lib/
-isystem
/home/seurer/gcc/git/install/gcc-trunk/powerpc64le-unknown-linux-gnu/include
-isystem
/home/seurer/gcc/git/install/gcc-trunk/powerpc64le-unknown-linux-gnu/sys-include
-DHAVE_CONFIG_H -I. -I/home/seurer/gcc/git/gcc-trunk/libgfortran
-iquote/home/seurer/gcc/git/gcc-trunk/libgfortran/io
-I/home/seurer/gcc/git/gcc-trunk/libgfortran/../gcc
-I/home/seurer/gcc/git/gcc-trunk/libgfortran/../gcc/config
-I/home/seurer/gcc/git/gcc-trunk/libgfortran/../libquadmath -I../.././gcc
-I/home/seurer/gcc/git/gcc-trunk/libgfortran/../libgcc -I../libgcc
-I/home/seurer/gcc/git/gcc-trunk/libgfortran/../libbacktrace -I../libbacktrace
-I../libbacktrace -std=gnu11 -Wall -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition -Wextra -Wwrite-strings
-Werror=implicit-function-declaration -Werror=vla -fcx-fortran-rules
-ffunction-sections -fdata-sections -ffast-math -ftree-vectorize -funroll-loops
--param max-unroll-times=4 -g -O2 -MT matmul_i1.lo -MD -MP -MF
.deps/matmul_i1.Tpo -c
/home/seurer/gcc/git/gcc-trunk/libgfortran/generated/matmul_i1.c  -fPIC -DPIC
-o .libs/matmul_i1.o
during GIMPLE pass: vect
/home/seurer/gcc/git/gcc-trunk/libgfortran/generated/matmul_i1.c: In function
'matmul_i1':
/home/seurer/gcc/git/gcc-trunk/libgfortran/generated/matmul_i1.c:2450:1:
internal compiler error: in operator[], at vec.h:889
 2450 | matmul_i1 (gfc_array_i1 * const restrict retarray,
  | ^
0x1022e357 vec::operator[](unsigned int)
/home/seurer/gcc/git/gcc-trunk/gcc/vec.h:889
0x1022e357 vec::operator[](unsigned int)
/home/seurer/gcc/git/gcc-trunk/gcc/vec.h:1495
0x10fdf8d7 vec::operator[](unsigned int) const
/home/seurer/gcc/git/gcc-trunk/gcc/tree-vect-loop.c:2882
0x10fdf8d7 vect_analyze_loop_1
/home/seurer/gcc/git/gcc-trunk/gcc/tree-vect-loop.c:2826
0x10fe008b vect_analyze_loop(loop*, vec_info_shared*)
/home/seurer/gcc/git/gcc-trunk/gcc/tree-vect-loop.c:3057
0x11023efb try_vectorize_loop_1
/home/seurer/gcc/git/gcc-trunk/gcc/tree-vectorizer.c:1047
0x11023efb try_vectorize_loop
/home/seurer/gcc/git/gcc-trunk/gcc/tree-vectorizer.c:1162
0x11024b1b execute
/home/seurer/gcc/git/gcc-trunk/gcc/tree-vectorizer.c:1278

commit d3ff7420e941931d32ce2e332e7968fe67ba20af (HEAD)
Author: Andre Vieira 
Date:   Thu Dec 2 14:34:15 2021 +

[vect] Re-analyze all modes for epilogues

[Bug target/47867] lto language is not supported on 32-bit HP-UX

2022-01-10 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47867

John David Anglin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from John David Anglin  ---
Fixed some time ago.

[Bug target/80817] [missed optimization][x86] relaxed atomics

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80817

Andrew Pinski  changed:

   What|Removed |Added

 CC||witold.baryluk+gcc at gmail dot com

--- Comment #5 from Andrew Pinski  ---
*** Bug 103966 has been marked as a duplicate of this bug. ***

[Bug c++/103966] std::atomic relaxed load, inc, store sub-optimal codegen

2022-01-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103966

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #4 from Andrew Pinski  ---
Dup of bug 80817

*** This bug has been marked as a duplicate of bug 80817 ***

[Bug target/99674] gcc/config/i386/i386-features.c: 2143: 2 * member variable not inited in ctor ?

2022-01-10 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99674

--- Comment #2 from David Binderman  ---
(In reply to David Binderman from comment #1)
> Interestingly, there are about 125 cases of this in the gcc source code,
> so this warning would be of immediate use for gcc itself.

Now up to 161 cases.

[Bug fortran/103970] New: Multi-image co_broadcast of derived type with allocatable components fails

2022-01-10 Thread damian at sourceryinstitute dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103970

Bug ID: 103970
   Summary: Multi-image co_broadcast of derived type with
allocatable components fails
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: damian at sourceryinstitute dot org
  Target Milestone: ---

Using gfortran 11.2.0 installed by Homebrew on macOS 12.0.1 to compile the code
below and link against OpenCoarrays 2.9.2 built with MPICH 3.2.0 results in
printing "Test failed." when executed in multiple images.

  implicit none

  type foo_t
    integer i
    integer, allocatable :: j
  end type

  type(foo_t) foo
  integer, parameter :: source_image = 1

  if (this_image() == source_image)  then
    foo = foo_t(2,3)
  else
    allocate(foo%j)
  end if
  call co_broadcast(foo, source_image)

  if ((foo%i /= 2) .or. (foo%j /= 3))  error stop "Test failed."
  sync all
  print *, "Test passed."

end

This bug is also summarized in OpenCoarrays issue 727 at
https://github.com/sourceryinstitute/OpenCoarrays/issues/727, where Andre
Vehreschild confirms that the problem is a compiler bug.

[Bug c++/103879] error: accessing value of variant::_Copy_ctor_base through a 'const variant' glvalue in a constant expression

2022-01-10 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103879

Patrick Palka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |12.0

--- Comment #12 from Patrick Palka  ---
Fixed for GCC 12

[Bug target/103861] [i386] vectorize v2qi vectors

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861

--- Comment #10 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:04a745556021b7a1c6e81a41d0a12b60a4d9475d

commit r12-6426-g04a745556021b7a1c6e81a41d0a12b60a4d9475d
Author: Uros Bizjak 
Date:   Mon Jan 10 20:59:02 2022 +0100

i386: Introduce V2QImode vector compares [PR103861]

Add V2QImode vector compares with SSE registers.

2022-01-10  Uroš Bizjak  

gcc/ChangeLog:

PR target/103861
* config/i386/i386-expand.c (ix86_expand_int_sse_cmp):
Handle V2QImode.
* config/i386/mmx.md (3):
Use VI1_16_32 mode iterator.
(*eq3): Ditto.
(*gt3): Ditto.
(*xop_maskcmp3): Ditto.
(*xop_maskcmp_uns3): Ditto.
(vec_cmp): Ditto.
(vec_cmpu): Ditto.

gcc/testsuite/ChangeLog:

PR target/103861
* gcc.target/i386/pr103861-2.c: New test.

[Bug c++/103879] error: accessing value of variant::_Copy_ctor_base through a 'const variant' glvalue in a constant expression

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103879

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:ab36b554bd90e8db279d13b133369118814f13fb

commit r12-6425-gab36b554bd90e8db279d13b133369118814f13fb
Author: Patrick Palka 
Date:   Mon Jan 10 14:57:54 2022 -0500

c++: constexpr base-to-derived conversion with offset 0 [PR103879]

r12-136 made us canonicalize an object/offset pair with negative offset
into one with a nonnegative offset, by iteratively absorbing the
innermost component into the offset and stopping as soon as the offset
becomes nonnegative.

This patch strengthens this transformation by making it keep on absorbing
even if the offset is already 0 as long as the innermost component is at
position 0 (and thus absorbing doesn't change the offset).  This lets us
accept the two constexpr testcases below, which we'd previously reject
essentially because cxx_fold_indirect_ref would be unable to resolve
*(B*)&b.D123 (where D123 is the base A subobject at position 0) to just b.

PR c++/103879

gcc/cp/ChangeLog:

* constexpr.c (cxx_fold_indirect_ref): Split out object/offset
canonicalization step into a local lambda.  Strengthen it to
absorb more components at position 0.  Use it before both calls
to cxx_fold_indirect_ref_1.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-base2.C: New test.
* g++.dg/cpp1y/constexpr-base2a.C: New test.

[Bug c++/103783] Ambiguous overload between constrained static member and unconstrained non-static member

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103783

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:3e95a974c39e922d19bf7ac1246730c516ae01f2

commit r12-6424-g3e95a974c39e922d19bf7ac1246730c516ae01f2
Author: Patrick Palka 
Date:   Mon Jan 10 14:57:51 2022 -0500

c++: "more constrained" vs staticness of memfn [PR103783]

Here we're rejecting the calls to g1 and g2 as ambiguous even though one
overload is more constrained than the other (and they're otherwise tied),
because the implicit 'this' parameter of the non-static overload causes
cand_parms_match to think the function parameter lists aren't equivalent.

This patch fixes this by making cand_parms_match skip over 'this'
appropriately.  Note that this bug only affects partial ordering of
non-template member functions because for member function templates
more_specialized_fn seems to already skip over 'this' appropriately.

PR c++/103783

gcc/cp/ChangeLog:

* call.c (cand_parms_match): Skip over 'this' when given one
static and one non-static member function.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-memfun2.C: New test.

[Bug c++/103969] New: 'auto' parameter not permitted in this context

2022-01-10 Thread egor.pugin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103969

Bug ID: 103969
   Summary: 'auto' parameter not permitted in this context
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: egor.pugin at gmail dot com
  Target Milestone: ---

template  decltype(auto) { return v; }>
struct s {};

gives

Output of x86-64 gcc (trunk) (Compiler #2)
:1:21: error: 'auto' parameter not permitted in this context
1 | template  decltype(auto) { return v; }>
  | ^~~~
: In lambda function:
:1:58: error: 'v' was not declared in this scope
1 | template  decltype(auto) { return v; }>
  |  ^


clang, msvc work fine.
https://godbolt.org/z/4Mf5efeWY

[Bug c++/103912] ICE in a consteval function which returns a lambda which takes a "non-POD" argument and the consteval has other code

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103912

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:54fa7daefe35cacf4a933947d1802318da193c01

commit r12-6423-g54fa7daefe35cacf4a933947d1802318da193c01
Author: Jakub Jelinek 
Date:   Mon Jan 10 20:49:11 2022 +0100

c++: Ensure some more that immediate functions aren't gimplified [PR103912]

Immediate functions should never be emitted into assembly, the FE doesn't
genericize them and does various things to ensure they aren't gimplified.
But the following testcase ICEs anyway due to that, because the consteval
function returns a lambda, and operator() of the lambda has
decl_function_context of the consteval function.  cgraphunit.c then
does:
  /* Preserve a functions function context node.  It will
     later be needed to output debug info.  */
  if (tree fn = decl_function_context (decl))
    {
      cgraph_node *origin_node = cgraph_node::get_create (fn);
      enqueue_node (origin_node);
    }
which enqueues the immediate function and then tries to gimplify it,
which results in ICE because it hasn't been genericized.

When I try similar testcase with constexpr instead of consteval and
static constinit auto instead of auto in main, what happens is that
the functions are gimplified, later ipa.c discovers they aren't reachable
and sets body_removed to true for them (and clears other flags) and we end
up with a debug info which has the foo and bar functions without
DW_AT_low_pc and other code specific attributes, just stuff from its BLOCK
structure and in there the lambda with DW_AT_low_pc etc.

The following patch attempts to emulate that behavior early, so that cgraph
doesn't try to gimplify those and pretends they were already gimplified
and found unused and optimized away.

2022-01-10  Jakub Jelinek  

PR c++/103912
* semantics.c (expand_or_defer_fn): For immediate functions, set
node->body_removed to true and clear analyzed, definition and
force_output.
* decl2.c (c_parse_final_cleanups): Ignore immediate functions for
expand_or_defer_fn.

* g++.dg/cpp2a/consteval26.C: New test.

[Bug middle-end/88670] [meta-bug] generic vector extension issues

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670
Bug 88670 depends on bug 103948, which changed state.

Bug 103948 Summary: Vectorizer does not use vec_cmpMN without vcondMN pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 103948, which changed state.

Bug 103948 Summary: Vectorizer does not use vec_cmpMN without vcondMN pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948

Uroš Bizjak  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Uroš Bizjak  ---
Fixed.

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:de0faa56a10406b50fba159957e3a3fd2f95c64b

commit r12-6422-gde0faa56a10406b50fba159957e3a3fd2f95c64b
Author: Uros Bizjak 
Date:   Mon Jan 10 20:39:35 2022 +0100

tree-optimization/103948 - detect vector vec_cmp in expand_vector_condition

Currently, expand_vector_condition detects only vcondMN and vconduMN
named RTX patterns.  Teach it to also consider vec_cmpMN and vec_cmpuMN
RTX patterns when all ones vector is returned for true and all zeros vector
is returned for false.

2022-01-10  Richard Biener  

gcc/ChangeLog:

PR tree-optimization/103948
* tree-vect-generic.c (expand_vector_condition): Return true if
all ones vector is returned for true, all zeros vector for false
and the target defines corresponding vec_cmp{,u}MN named RTX
pattern.
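
As a rough source-level illustration of the "all ones for true, all zeros for
false" shape mentioned above, the sketch below uses the GNU vector extension.
The names v4si, less_mask, check_canonical_mask and select_min are made up for
this example, and whether a given target expands the comparison through a
vec_cmpMN pattern is not something this snippet can show.

```cpp
#include <cassert>

// GNU vector extension (accepted by GCC and Clang): four 32-bit ints.
typedef int v4si __attribute__((vector_size(16)));

// Lane-wise a < b.  Each result lane is -1 (all bits set) where the
// comparison holds and 0 where it does not.
v4si less_mask(v4si a, v4si b) { return a < b; }

// Check that every lane of a mask is either all zeros or all ones.
void check_canonical_mask(v4si m) {
  for (int i = 0; i < 4; ++i)
    assert(m[i] == 0 || m[i] == -1);
}

// Use the mask as a lane selector: pick a where a < b, otherwise b.
v4si select_min(v4si a, v4si b) {
  v4si m = a < b;
  return (a & m) | (b & ~m);
}
```

With a = {1, 5, 3, 7} and b = {2, 4, 3, 8}, less_mask gives {-1, 0, 0, -1} and
select_min gives {1, 4, 3, 7}.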

[Bug c++/103968] [11/12 Regression] ICE and segfault when instantiating template with lvalue ref argument and nested template type

2022-01-10 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103968

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org
   Keywords||ice-on-valid-code
   Last reconfirmed||2022-01-10
 Status|UNCONFIRMED |NEW
Summary|ICE and segfault when   |[11/12 Regression] ICE and
   |instantiating template with |segfault when instantiating
   |lvalue ref argument and |template with lvalue ref
   |nested template type|argument and nested
   ||template type
 Ever confirmed|0   |1
   Target Milestone|--- |11.3

--- Comment #1 from Marek Polacek  ---
Started with r11-2748.

[Bug c++/103968] New: ICE and segfault when instantiating template with lvalue ref argument and nested template type

2022-01-10 Thread aliaume.morel at baracoda dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103968

Bug ID: 103968
   Summary: ICE and segfault when instantiating template with
lvalue ref argument and nested template type
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: aliaume.morel at baracoda dot com
  Target Milestone: ---

The following compiles with clang, GCC 10.3, but gives an ICE on GCC 11.1 and
12.0.0 (20220110) - tested on Godbolt (https://godbolt.org/z/db5EjWGWa):

```
template 
struct trait
{
template 
struct NonInstantiated{};
};

struct Options {};

template 
struct Widget
{
static constexpr auto c_options = Options{};
using Trait = trait;
};

Widget::Trait b{}; // Crashes GCC > 10.3
```


Stack trace:
: In instantiation of 'struct trait::c_options)>':
:17:20:   required from here
:5:12: internal compiler error: Segmentation fault
5 | struct NonInstantiated{};
  |^~~
0x1780bf9 internal_error(char const*, ...)
???:0
0x7f628d instantiate_class_template(tree_node*)
???:0
0x829553 complete_type(tree_node*)
???:0
0x6e6d2b start_decl_1(tree_node*, bool)
???:0
0x6f6587 start_decl(cp_declarator const*, cp_decl_specifier_seq*, int,
tree_node*, tree_node*, tree_node**)
???:0
0x7c0cab c_parse_file()
???:0
0x892aa2 c_common_parse_file()
???:0


- This is reproducible in C++17 and C++20 mode.
- In C++20 mode, removing the lvalue ref from the second template argument in
`struct trait` doesn't trigger the ICE and builds successfully.
- If `Widget` is a non-template type, it doesn't trigger the ICE and builds
successfully.

[Bug libstdc++/100017] [11 regression] error: 'fenv_t' has not been declared in '::' -- canadian compilation fails

2022-01-10 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100017

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #81 from Jonathan Wakely  ---
Fixed for gcc 11.3

[Bug libstdc++/100017] [11 regression] error: 'fenv_t' has not been declared in '::' -- canadian compilation fails

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100017

--- Comment #80 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:01a70ccd723eb9a479186fe37c972b0d0f8676cf

commit r11-9448-g01a70ccd723eb9a479186fe37c972b0d0f8676cf
Author: Jonathan Wakely 
Date:   Fri Jan 7 15:21:03 2022 +

libstdc++: Add -nostdinc++ for c++17 sources [PR100017]

When building a build!=host compiler, the just-built gcc can't be used
to build the target libstdc++ (because it is built for the host triplet,
not the build triplet). The top-level configure.ac sets up the build
flags for libstdc++ (and other "raw_cxx" libs) like this:

GCC_TARGET_TOOL(c++ for libstdc++, RAW_CXX_FOR_TARGET, CXX,
[gcc/xgcc -shared-libgcc -B$$r/$(HOST_SUBDIR)/gcc
-nostdinc++ -L$$r/$(TARGET_SUBDIR)/libstdc++-v3/src
-L$$r/$(TARGET_SUBDIR)/libstdc++-v3/src/.libs
-L$$r/$(TARGET_SUBDIR)/libstdc++-v3/libsupc++/.libs],
c++)

The -nostdinc++ flag is only used for the IN-TREE-TOOL, i.e. when using
the just-built gcc/xgcc compiler. This means that the cross-compiler
used to build libstdc++ will add its own libstdc++ headers to the
include path. That results in the #include  in
src/c++17/floating_to_chars.cc and src/c++17/floating_from_chars.cc
doing #include_next  and finding the libstdc++ fenv.h wrapper
from the host compiler. Because that has the same include guard as the
 in the libstdc++ we're trying to build, we never reach the
underlying  from libc. That results in several errors of the
form:

error: 'fenv_t' has not been declared in '::'

The most correct fix would be to add -nostdinc++ to the
RAW_CXX_FOR_TARGET variable in configure.ac, or the
RAW_CXX_TARGET_EXPORTS variable in Makefile.tpl.

Another solution would be to make the libstdc++  wrapper use
_GLIBCXX_INCLUDE_NEXT_C_HEADERS like our  and other C header
wrappers.

For now though, the simplest and safest solution is to just add
-nostdinc++ to the CXXFLAGS used for src/c++17/*.cc, which is what this
does.

libstdc++-v3/ChangeLog:

PR libstdc++/100017
* src/c++17/Makefile.am (AM_CXXFLAGS): Add -nostdinc++.
* src/c++17/Makefile.in: Regenerate.

(cherry picked from commit 4fde88e5dd152fe866a97b12e0f8229970d15cb3)

[Bug libstdc++/103866] AM_PROG_LIBTOOL not compatible with GCC_NO_EXECUTABLES

2022-01-10 Thread pixel--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103866

--- Comment #14 from Nicolas Noble  ---
Thank you, I'll check it out.

On Mon, Jan 10, 2022, 04:24 redi at gcc dot gnu.org <
gcc-bugzi...@gcc.gnu.org> wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103866
>
> Jonathan Wakely  changed:
>
>What|Removed |Added
>
> 
>Target Milestone|--- |12.0
>  Resolution|--- |FIXED
>  Status|NEW |RESOLVED
>
> --- Comment #13 from Jonathan Wakely  ---
> This should be fixed on trunk now. Please comment here if it isn't working
> for
> you.

[Bug d/103944] [12 Regression] Testsuite hang due to libphobos/testsuite/libphobos.gc/forkgc2.d

2022-01-10 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103944

--- Comment #7 from Iain Buclaw  ---
(In reply to Iain Buclaw from comment #6)
> (In reply to Jakub Jelinek from comment #4)
> > Note, it isn't just i686-linux, I was getting the hangs on x86_64-linux,
> > s390x-linux or armv7hl-linux-gnueabi too.
> > Wonder whether it is
> > -fstack-clash-protection -fcf-protection -fcf-protection
> > related or perhaps glibc version dependent (in the builds where it
> > reproduces we are using a very recent glibc snapshot (e.g. one with
> > _dl_find_object in there in case the test uses unwind info in some way)).
> CET is turned on by default now IIRC for X86/64, so I doubt it'd be
> -fcf-protection.
Don't get a hung process on latest rawhide either, which uses this version of
glibc.

https://koji.fedoraproject.org/koji/buildinfo?buildID=1872662

I'll re-run a few times, increasing/decreasing the available memory to see if
anything triggers.

[Bug middle-end/93848] missing -Warray-bounds warning for array subscript 1 is outside array bounds

2022-01-10 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93848

--- Comment #14 from Martin Sebor  ---
Created attachment 52156
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52156&action=edit
Updated patch.

The attached patch is an updated version that fixes a few ICEs.  It's not in
the archives because I sent it to Jeff privately back in December 2020 to help
with Fedora testing.

[Bug c++/103945] No warning for ordered comparison of function pointers ?

2022-01-10 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103945

--- Comment #5 from Martin Sebor  ---
(The patch review was never finished.)

[Bug c++/103945] No warning for ordered comparison of function pointers ?

2022-01-10 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103945

Martin Sebor  changed:

   What|Removed |Added

 CC||msebor at gcc dot gnu.org

--- Comment #4 from Martin Sebor  ---
The patch I submitted in October 2020 detects invalid relational comparisons of
unrelated pointers of all kinds
(https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558775.html).

[Bug middle-end/93848] missing -Warray-bounds warning for array subscript 1 is outside array bounds

2022-01-10 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93848

--- Comment #13 from Martin Sebor  ---
The patch submitted (but not approved) for GCC 11:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558775.html

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948

--- Comment #7 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #6)

> I'll try your proposed patch from Comment #5 later today and report here.
Yes, the patch works for me.

[Bug fortran/103366] [9/10/11/12 Regression] ICE in gfc_conv_gfc_desc_to_cfi_desc, at fortran/trans-expr.c:5647

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103366

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:828474fafd2ed33430172fe227f9da7d6fb98723

commit r12-6419-g828474fafd2ed33430172fe227f9da7d6fb98723
Author: Paul Thomas 
Date:   Mon Jan 10 16:54:53 2022 +

Fortran: Pass unlimited polymorphic argument to assumed type [PR103366].

2022-01-10  Paul Thomas  

gcc/fortran
PR fortran/103366
* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Allow unlimited
polymorphic actual argument passed to assumed type formal.

gcc/testsuite/
PR fortran/103366
* gfortran.dg/pr103366.f90: New test.

[Bug target/102024] [12 Regression] zero width bitfields and ABIs

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102024

--- Comment #17 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:3159da6c46568a7c600f78fb3a3b76e2ea4bf4cc

commit r12-6418-g3159da6c46568a7c600f78fb3a3b76e2ea4bf4cc
Author: Jakub Jelinek 
Date:   Mon Jan 10 17:43:23 2022 +0100

x86_64: Ignore zero width bitfields in ABI and issue -Wpsabi warning about
C zero width bitfield ABI changes [PR102024]

For zero-width bitfields current GCC classify_argument does:
      if (DECL_BIT_FIELD (field))
        {
          for (i = (int_bit_position (field)
                    + (bit_offset % 64)) / 8 / 8;
               i < ((int_bit_position (field) + (bit_offset % 64))
                    + tree_to_shwi (DECL_SIZE (field))
                    + 63) / 8 / 8; i++)
            classes[i]
              = merge_classes (X86_64_INTEGER_CLASS, classes[i]);
        }
which I think means that if the zero-width bitfields are at bit positions
(in the toplevel aggregate) which are multiples of 64 bits, it doesn't do
anything: (int_bit_position (field) + (bit_offset % 64)) / 64 and
(int_bit_position (field) + (bit_offset % 64) + 63) / 64 should be equal.
But for zero-width bitfields at other bit positions it will call
merge_classes once.  Now, the typical case is that the zero-width bitfield
is surrounded by some bitfields and in that case it doesn't change
anything, but it can be sandwiched in between floats too, as the testcases
show.
In C we had this behavior; in C++ the FE was previously removing the
zero-width bitfields and therefore they were ignored.
LLVM and ICC seem to ignore those bitfields both in C and C++ (== passing
struct S in SSE register rather than in GPR).

The x86-64 psABI has been recently clarified by
   
https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/1aa4398d26c250b252a0c4a0f777216c9a6789ec
that zero width bitfield should be always ignored.

This patch implements that and emits a warning for C for cases where the
ABI changed from GCC 11.

2022-01-10  Jakub Jelinek  

PR target/102024
* config/i386/i386.c (classify_argument): Add zero_width_bitfields
argument, when seeing DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD bitfields,
always ignore them, when seeing other zero sized bitfields, either
set zero_width_bitfields to 1 and ignore it or if equal to 2 process
it.  Pass it to recursive calls.  Add wrapper
with old arguments and diagnose ABI differences for C structures
with zero width bitfields.  Formatting fixes.

* gcc.target/i386/pr102024.c: New test.
* g++.target/i386/pr102024.C: New test.
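
The loop-bound arithmetic described in the commit message above can be checked with a small sketch. The function name `slots_touched_by_zero_width` is hypothetical, and `bitpos` stands for `int_bit_position (field) + (bit_offset % 64)`. It only counts how many times the pre-patch loop body would run for a zero-width bitfield and is not GCC code. For a hypothetical `struct S { float a; int : 0; float b; };` the zero-width bitfield sits at bit position 32, which is the "one merge" case.

```cpp
#include <cstdint>

// Number of 8-byte class slots the pre-patch loop would visit for a
// zero-width bitfield at bit position `bitpos` in the top-level aggregate.
// Illustrative only: this mirrors the loop bounds quoted in the commit
// message, with DECL_SIZE (field) fixed at 0.
constexpr std::int64_t slots_touched_by_zero_width(std::int64_t bitpos) {
    const std::int64_t size_in_bits = 0;  // zero-width bitfield
    const std::int64_t first = bitpos / 8 / 8;
    const std::int64_t limit = (bitpos + size_in_bits + 63) / 8 / 8;
    return limit - first;
}

// Positions that are multiples of 64 bits never reach merge_classes.
static_assert(slots_touched_by_zero_width(0) == 0, "aligned: no merge");
static_assert(slots_touched_by_zero_width(64) == 0, "aligned: no merge");
// Any other position triggers exactly one merge with the INTEGER class.
static_assert(slots_touched_by_zero_width(32) == 1, "unaligned: one merge");
```

The single merge in the unaligned case is what could move a struct of floats out of an SSE register under the old C behaviour, which is the ABI difference the new -Wpsabi warning reports.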

[Bug target/103967] New: x86-64: bitfields make inefficient indexing for array with 16 byte+ objects

2022-01-10 Thread nekotekina at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103967

Bug ID: 103967
   Summary: x86-64: bitfields make inefficient indexing for array
with 16 byte+ objects
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nekotekina at gmail dot com
  Target Milestone: ---

Hello, this problem is seemingly not specific to GCC and is probably well
known. Loading or storing a 16-byte (or larger) vector from an array using a
bitfield as an index generates code that could, in theory, be noticeably
smaller.

shr esi, 12 ; shift bitfield
and esi, 31 ; mask bitfield
sal rsi, 4 ; unnecessary, also could drop REX prefix for size
pxor xmm0, XMMWORD PTR [rsi+1024+rdi] ; index + offset addressing

1) Second shift can be fused with bitfield load
2) Bitfield load can then be adjusted for shifted indexing (rsi*8)
3) Optionally, array offset can be precomputed if it's used twice or more,
which can result in smaller and potentially faster code.

shr esi, 12 - 1 ; adjusted shift
and esi, 31 << 1 ; adjusted mask which fits in 8-bit immediate
pxor xmm0, [rdi + rsi * 8] ; precomputed array offset

https://godbolt.org/z/7aa7oaMhn

#include <emmintrin.h>
struct bitfields
{
unsigned dummy : 7;
unsigned a : 5;
unsigned b : 5;
unsigned c : 5;
};
struct context
{
unsigned dummy[256];
__m128i data[32];
};

void xor_data(context& ctx, bitfields op)
{
ctx.data[op.c] = _mm_xor_si128(ctx.data[op.a], ctx.data[op.b]);
}
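
The proposed rewrite of the shift/mask/scale sequence relies on a simple identity, which the following sketch checks in plain integer arithmetic. The function names are hypothetical, and the field position (bits 12..16) is taken from the `shr esi, 12` / `and esi, 31` listing above.

```cpp
#include <cstdint>

// Byte offset of a 16-byte element selected by the 5-bit field at bits
// 12..16, computed as in the first listing: shift, mask, then scale by 16.
constexpr std::uint32_t offset_shift_mask_scale(std::uint32_t word) {
    return ((word >> 12) & 31u) << 4;
}

// Same offset as in the second listing: shift by one bit less, use a mask
// that is pre-shifted by one, and let the addressing mode supply the
// remaining factor of 8 (modelled here as a multiplication).
constexpr std::uint32_t offset_fused(std::uint32_t word) {
    return ((word >> (12 - 1)) & (31u << 1)) * 8u;
}

static_assert(offset_shift_mask_scale(0x1F000u) == 31u * 16u, "max index");
static_assert(offset_fused(0x1F000u) == 31u * 16u, "max index");
```

The pre-shifted mask clears bit 11 of the input, so the neighbouring bitfield cannot leak into the index; that is why the two forms agree for every input word.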

[Bug c++/103966] std::atomic relaxed load, inc, store sub-optimal codegen

2022-01-10 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103966

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
__atomic_*/__sync_* loads and stores are (intentionally) MEM_VOLATILE_P, which
is why combine gives up on them and similarly peephole2.

[Bug target/103465] [12 regression] -freorder-blocks-and-partition broken on 64-bit Windows

2022-01-10 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103465

Eric Botcazou  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
   Target Milestone|12.0                        |9.5
         Resolution|---                         |FIXED

--- Comment #33 from Eric Botcazou  ---
.

[Bug target/103465] [12 regression] -freorder-blocks-and-partition broken on 64-bit Windows

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103465

--- Comment #32 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Eric Botcazou:

https://gcc.gnu.org/g:90e463661d81f85a2365b89bea2b0cc6070ae024

commit r9-9905-g90e463661d81f85a2365b89bea2b0cc6070ae024
Author: Eric Botcazou 
Date:   Mon Jan 10 12:40:10 2022 +0100

Properly enable -freorder-blocks-and-partition on 64-bit Windows

The PR uncovered that -freorder-blocks-and-partition was working by accident
on 64-bit Windows, i.e. the middle-end was supposed to disable it with SEH.
After the change installed on mainline, the middle-end properly disables it,
which is too bad since a significant amount of work went into it for SEH.

gcc/
PR target/103465
* coretypes.h (unwind_info_type): Swap UI_SEH and UI_TARGET.

[Bug target/103465] [12 regression] -freorder-blocks-and-partition broken on 64-bit Windows

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103465

--- Comment #31 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Eric Botcazou:

https://gcc.gnu.org/g:331733c384ed74539b38e5fa933b35818a109f5c

commit r10-10388-g331733c384ed74539b38e5fa933b35818a109f5c
Author: Eric Botcazou 
Date:   Mon Jan 10 12:40:10 2022 +0100

Properly enable -freorder-blocks-and-partition on 64-bit Windows

The PR uncovered that -freorder-blocks-and-partition was working by accident
on 64-bit Windows, i.e. the middle-end was supposed to disable it with SEH.
After the change installed on mainline, the middle-end properly disables it,
which is too bad since a significant amount of work went into it for SEH.

gcc/
PR target/103465
* coretypes.h (unwind_info_type): Swap UI_SEH and UI_TARGET.

[Bug target/103465] [12 regression] -freorder-blocks-and-partition broken on 64-bit Windows

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103465

--- Comment #30 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Eric Botcazou:

https://gcc.gnu.org/g:27e6c84c1f14a8195a3f9fb489c240ecc7a6257d

commit r11-9447-g27e6c84c1f14a8195a3f9fb489c240ecc7a6257d
Author: Eric Botcazou 
Date:   Mon Jan 10 12:40:10 2022 +0100

Properly enable -freorder-blocks-and-partition on 64-bit Windows

The PR uncovered that -freorder-blocks-and-partition was working by accident
on 64-bit Windows, i.e. the middle-end was supposed to disable it with SEH.
After the change installed on mainline, the middle-end properly disables it,
which is too bad since a significant amount of work went into it for SEH.

gcc/
PR target/103465
* coretypes.h (unwind_info_type): Swap UI_SEH and UI_TARGET.

[Bug tree-optimization/103964] [9/10/11/12 Regression] OVS miscompilation since r0-92313-g5006671f1aaa63cd

2022-01-10 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103964

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amorenoz at redhat dot com

--- Comment #2 from Jakub Jelinek  ---
*** Bug 103965 has been marked as a duplicate of this bug. ***

[Bug c++/103966] std::atomic relaxed load, inc, store sub-optimal codegen

2022-01-10 Thread witold.baryluk+gcc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103966

--- Comment #2 from Witold Baryluk  ---
Similarly, dec, add, and sub are affected, as well as mul.

Example:

#include <atomic>
#include <cstdint>

uint64_t x;
void add_a() {
    x += 5;
}

std::atomic<uint64_t> y;

void add_b_non_atomic() {
    y.store(y.load(std::memory_order_relaxed) + 5, std::memory_order_relaxed);
}



Producing:

add_a():
add QWORD PTR x[rip], 5
ret
add_b_non_atomic():
mov rax, QWORD PTR y[rip]
add rax, 5
mov QWORD PTR y[rip], rax
ret
y:
.zero   8
x:
.zero   8

[Bug c++/103966] std::atomic relaxed load, inc, store sub-optimal codegen

2022-01-10 Thread witold.baryluk+gcc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103966

--- Comment #1 from Witold Baryluk  ---
Current codegen on gcc 12 on 64-bit x86:

inc_a():
inc QWORD PTR x[rip]
ret
inc_b_non_atomic():
mov rax, QWORD PTR y[rip]
inc rax
mov QWORD PTR y[rip], rax
ret
y:
.zero   8
x:
.zero   8

[Bug c++/103966] New: std::atomic relaxed load, inc, store sub-optimal codegen

2022-01-10 Thread witold.baryluk+gcc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103966

Bug ID: 103966
   Summary: std::atomic relaxed load, inc, store sub-optimal
codegen
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: witold.baryluk+gcc at gmail dot com
  Target Milestone: ---

Both functions below should compile to the same assembly on x86:

#include <atomic>
#include <cstdint>

uint64_t x;
void inc_a() {
    x++;
}

std::atomic<uint64_t> y;

void inc_b_non_atomic() {
    y.store(y.load(std::memory_order_relaxed) + 1, std::memory_order_relaxed);
}


and they do so in clang.

They do not in gcc 12 (and earlier).

https://godbolt.org/z/GcM67xz8T



This pattern is very popular in approximate statistical counters / metrics,
where the flow of information is unidirectional (i.e. from one thread that does
updates, to another thread that only reads the counters), and its performance
is critical in many codebases.
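
As a minimal sketch of the counter pattern described above, assuming exactly one writer thread and any number of reader threads: the class name `RelaxedCounter` and its member names are hypothetical, and only standard `<atomic>` calls are used.

```cpp
#include <atomic>
#include <cstdint>

// Single-writer statistics counter. The update is a relaxed load followed
// by a relaxed store, not an atomic read-modify-write, so it is only
// correct when a single thread calls bump(); concurrent writers could
// lose updates.
class RelaxedCounter {
public:
    // Writer side: call from the one updating thread only.
    void bump(std::uint64_t delta = 1) {
        value_.store(value_.load(std::memory_order_relaxed) + delta,
                     std::memory_order_relaxed);
    }

    // Reader side: may be called from any thread; the value seen is
    // approximate in the sense that it can lag behind the writer.
    std::uint64_t read() const {
        return value_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<std::uint64_t> value_{0};
};
```

Using `fetch_add` instead would make the update safe for multiple writers, but it requires a locked read-modify-write instruction on x86, which is the cost this pattern is meant to avoid.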

[Bug libstdc++/100017] [11 regression] error: 'fenv_t' has not been declared in '::' -- canadian compilation fails

2022-01-10 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100017

--- Comment #79 from Jonathan Wakely  ---
I'm doing that now.

[Bug c/103965] optimizer (-O2) changes behavior in cast-to-container iteration

2022-01-10 Thread amorenoz at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103965

--- Comment #2 from amorenoz at redhat dot com  ---
Jakub and Florian have kindly assisted in trying to understand the problem and
will likely be able to provide more insightful comments.

[Bug c/103965] optimizer (-O2) changes behavior in cast-to-container iteration

2022-01-10 Thread amorenoz at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103965

amorenoz at redhat dot com  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amorenoz at redhat dot com,
                   |                            |fweimer at redhat dot com,
                   |                            |jakub at redhat dot com

--- Comment #1 from amorenoz at redhat dot com  ---
/cc jakub

[Bug c/103965] New: optimizer (-O2) changes behavior in cast-to-container iteration

2022-01-10 Thread amorenoz at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103965

Bug ID: 103965
   Summary: optimizer (-O2) changes behavior in cast-to-container
iteration
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amorenoz at redhat dot com
  Target Milestone: ---

Created attachment 52155
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52155&action=edit
test case that returns 0 when working fine and 1 when broken

Hi,

The attached test-case has been extracted from openvswitch source code. Macros
have been expanded and code has been simplified as much as possible.

The problematic code is a cast-to-container list iteration that uses a
stack-allocated list element without containing object:

struct member {
    int padding[14];
    int order;
    struct ovs_list elem;
};
[...]
struct ovs_list start;
struct member *member, *members[2];
for (i = 0; i < 2; i++) {
    struct member* pos;

    member = members[i];

    for (((pos) = ((void*)0),
          ((pos) = ((typeof(pos))(void*)((uintptr_t)((&start)->next) -
                                         __builtin_offsetof(struct member,
                                                            elem)))));
         &(pos)->elem != (&start);
         ((pos) = ((typeof(pos))(void*)((uintptr_t)((pos)->elem.next) -
                                        __builtin_offsetof(struct member,
                                                           elem))))) {
        if (member->order > pos->order) {
            break;
        }
    }
    ovs_list_insert(&pos->elem, &member->elem);
}


When "-O2" is used, gcc seems to just optimize out one of the iterations.

Interestingly, replacing the pointer arithmetic with integer arithmetic still
makes the code fail.

Is this a bug in gcc, or is this just invalid C code? If so, why?
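
One way to sidestep the question, sketched here under assumptions rather than taken from openvswitch, is to iterate over the list nodes themselves and convert a node to its containing object only after checking that it is not the bare list head. The helper names `member_from_elem`, `list_init`, `list_insert_before` and `insert_sorted` are hypothetical; the comparison mirrors the `member->order > pos->order` test in the report.

```cpp
#include <cstddef>

struct ovs_list {
    ovs_list* prev;
    ovs_list* next;
};

struct member {
    int padding[14];
    int order;
    ovs_list elem;
};

// Recover the enclosing member from a pointer to its elem field.
// Only meaningful when `node` really is embedded in a member object.
inline member* member_from_elem(ovs_list* node) {
    return reinterpret_cast<member*>(
        reinterpret_cast<char*>(node) - offsetof(member, elem));
}

inline void list_init(ovs_list* head) { head->prev = head->next = head; }

// Link `node` into the list immediately before `before`.
inline void list_insert_before(ovs_list* before, ovs_list* node) {
    node->prev = before->prev;
    node->next = before;
    before->prev->next = node;
    before->prev = node;
}

// Walk the nodes and stop at the first element whose order is smaller
// than m->order, or at the head. The head is never converted to a
// member pointer, so no pointer outside of `head` is materialized.
inline void insert_sorted(ovs_list* head, member* m) {
    ovs_list* node = head->next;
    while (node != head && !(m->order > member_from_elem(node)->order)) {
        node = node->next;
    }
    list_insert_before(node, &m->elem);
}
```

The cost of this shape is that the loop variable is a node pointer rather than a container pointer, so it does not drop into an existing `LIST_FOR_EACH`-style macro unchanged.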

[Bug tree-optimization/103964] [9/10/11/12 Regression] OVS miscompilation since r0-92313-g5006671f1aaa63cd

2022-01-10 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103964

Florian Weimer  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fw at gcc dot gnu.org

--- Comment #1 from Florian Weimer  ---
The kernel version of the macro is called list_for_each_entry. There's a
Stackoverflow question about this issue:

Does Linux kernel list implementation cause UB?
https://stackoverflow.com/questions/64859526/does-linux-kernel-list-implementation-cause-ub

[Bug tree-optimization/103964] New: [9/10/11/12 Regression] OVS miscompilation since r0-92313-g5006671f1aaa63cd

2022-01-10 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103964

Bug ID: 103964
   Summary: [9/10/11/12 Regression] OVS miscompilation since
r0-92313-g5006671f1aaa63cd
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

The following testcase is miscompiled e.g. with -g -Og -fno-strict-aliasing
starting with r0-92313-g5006671f1aaa63cd; this is a self-contained testcase
from https://gcc.gnu.org/pipermail/gcc-help/2021-December/141021.html

With current gcc trunk and -g -Og -fno-strict-aliasing (-Og chosen as
something that does just a few optimizations), I see the fre1 pass
optimizing:
-  _69 = start.prev;
   # DEBUG list_ => NULL
-  last_49 = _69 + 18446744073709551552;
-  # DEBUG last => last_49
+  # DEBUG last => &MEM  [(void *)&start + -64B]
   # DEBUG BEGIN_STMT
-  printf ("first: %p \nlast: %p\n", first_47, last_49);
+  printf ("first: %p \nlast: %p\n", first_47, &MEM  [(void *)&start + -64B]);

We have earlier:
  start.prev = &start;
  start.next = &start;
and .prev stores in between are:
  MEM[(struct ovs_list *)member_59 + 64B].prev = _48;
...
  MEM[(struct ovs_list *)pos_32 + 64B].prev = _15;
I bet the alias oracle assumes that pos_32, being a struct member pointer,
can't overwrite start.prev, where start is much smaller than that (it has
just struct ovs_list type).
That MEM[(struct ovs_list *)pos_32 + 64B].prev = _15; is actually what
overwrites start.prev.
  # pos_32 = PHI 
and
  _6 = start.next;
  _7 = (long unsigned int) _6;
  _8 = _7 + 18446744073709551552;
  pos_61 = (struct member *) _8;
and
  _11 = pos_32->elem.next;
  _12 = (long unsigned int) _11;
  _13 = _12 + 18446744073709551552;
  pos_62 = (struct member *) _13;

If it didn't use uintptr_t in there, I'd say it is clearly UB, doing pointer
arithmetic out of bounds of the start object.  With uintptr_t it just
materializes a pointer known to point outside of the start object.
For -fstrict-aliasing, I think it is just fine to treat it as UB; for
-fno-strict-aliasing I don't know.
I'm afraid the Linux kernel and various other projects that copied such
questionable code from it use it heavily.
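
The constant 18446744073709551552 in the dumps above is simply -64 in 64-bit unsigned arithmetic, and 64 matches the `+ 64B` offset of `elem` shown in the MEM expressions. A short sketch of that arithmetic follows; the function names are hypothetical, and the two addresses are the `start` and `last` values printed by the -O2 run quoted in the testcase comment.

```cpp
#include <cstdint>

// Subtracting an offset in uintptr_t-style arithmetic shows up in the
// GIMPLE dump as adding the two's-complement of that offset.
constexpr std::uint64_t minus_as_unsigned(std::uint64_t offset) {
    return std::uint64_t{0} - offset;  // wraps modulo 2^64
}

// Address of the (possibly nonexistent) container for an element address.
constexpr std::uint64_t container_address(std::uint64_t elem_addr,
                                          std::uint64_t elem_offset) {
    return elem_addr + minus_as_unsigned(elem_offset);
}

static_assert(minus_as_unsigned(64) == 18446744073709551552ULL,
              "the constant in the dump is -64");
// In the -O2 output, `last` is exactly 64 bytes below `start`, i.e. the
// "container" of the bare list head rather than a real member object.
static_assert(container_address(0x7ffc13ba2800ULL, 64) == 0x7ffc13ba27c0ULL,
              "last == &start - 64");
```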

/* How to build:

>> gcc -O2 -g -o example2 example2.c ^C
>> ./example2
start: 0x7ffc13ba2800
first: 0x1ba32f0
last: 0x7ffc13ba27c0
list is broken!
Start: 0x7ffc13ba2800. start->next: 0x1ba3330, start->next->next: 0x1ba3390,
start->prev: 0x1ba3390

>> gcc -g -o example2 example2.c
>> ./example2
start: 0x7ffd84d91660
first: 0x23b52f0
last: 0x23b5350

Same for clang.
*/

#include 
#include 
#include 
#include 
#include 
#include 

// from include/openvswitch/util.h //

#define INIT_CONTAINER(OBJECT, POINTER, MEMBER) \
    ((OBJECT) = NULL, ASSIGN_CONTAINER(OBJECT, POINTER, MEMBER))

#define OVS_TYPEOF(OBJECT) typeof(OBJECT)

#define OBJECT_OFFSETOF(OBJECT, MEMBER) offsetof(typeof(*(OBJECT)), MEMBER)

#define OBJECT_CONTAINING(POINTER, OBJECT, MEMBER) \
    ((OVS_TYPEOF(OBJECT))( \
        void*)((char*)(POINTER)-OBJECT_OFFSETOF(OBJECT, MEMBER)))

#define OBJECT_MEMBER(POINTER, OBJECT, MEMBER) \
    ((OVS_TYPEOF(&POINTER->MEMBER)) \
     ((uintptr_t) POINTER + OBJECT_OFFSETOF(POINTER, MEMBER)))

#define ASSIGN_CONTAINER(OBJECT, POINTER, MEMBER) \
    ((OBJECT) = OBJECT_CONTAINING(POINTER, OBJECT, MEMBER), (void)0)

#define HMAP_FOR_EACH(NODE, MEMBER, HMAP) \
    HMAP_FOR_EACH_INIT(NODE, MEMBER, HMAP, (void)0)
#define HMAP_FOR_EACH_INIT(NODE, MEMBER, HMAP, ...) \
    for (INIT_CONTAINER(NODE, hmap_first(HMAP), MEMBER), __VA_ARGS__; \
         (NODE != OBJECT_CONTAINING(NULL, NODE, MEMBER)) || \
         ((NODE = NULL), 0); \
         ASSIGN_CONTAINER(NODE, hmap_next(HMAP, &(NODE)->MEMBER), MEMBER))

#define LIST_FOR_EACH(ITER, MEMBER, LIST) \
    for (INIT_CONTAINER(ITER, (LIST)->next, MEMBER); \
         &(ITER)->MEMBER != (LIST); \
         ASSIGN_CONTAINER(ITER, (ITER)->MEMBER.next, MEMBER))

// from lib/list.c / include/openvswitch/list.h /

struct ovs_list {
struct ovs_list* prev; /* Previous list element. */
struct ovs_list* next; /* Next list element. */
};

static inline void ovs_list_init(struct ovs_list* list)
{
list->next = list->prev = list;
}

static inline void ovs_list_insert(struct ovs_list* before,
   struct

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2022-01-10 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 98782, which changed state.

Bug 98782 Summary: [11 Regression] Bad interaction between IPA frequences and 
IRA resulting in spills due to changes in BB frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

[Bug rtl-optimization/98782] [11 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2022-01-10 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782

rsandifo at gcc dot gnu.org  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
            Summary|[11/12 Regression] Bad      |[11 Regression] Bad
                   |interaction between IPA     |interaction between IPA
                   |frequences and IRA          |frequences and IRA
                   |resulting in spills due to  |resulting in spills due to
                   |changes in BB frequencies   |changes in BB frequencies
             Status|ASSIGNED                    |RESOLVED

--- Comment #41 from rsandifo at gcc dot gnu.org ---
Hopefully fixed on trunk, much too invasive to backport.

Thanks Vlad for the reviews.

[Bug d/103944] [12 Regression] Testsuite hang due to libphobos/testsuite/libphobos.gc/forkgc2.d

2022-01-10 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103944

--- Comment #6 from Iain Buclaw  ---
(In reply to Jakub Jelinek from comment #4)
> Note, it isn't just i686-linux, I was getting the hangs on x86_64-linux,
> s390x-linux or armv7hl-linux-gnueabi too.
> Wonder whether it is
> -fstack-clash-protection -fcf-protection -fcf-protection
> related or perhaps glibc version dependent (in the builds where it
> reproduces we are using a very recent glibc snapshot (e.g. one with
> _dl_find_object in there in case the test uses unwind info in some way)).
CET is turned on by default now IIRC for X86/64, so I doubt it'd be
-fcf-protection.

[Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782

--- Comment #40 from CVS Commits  ---
The master branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:037cc0b4a6646cc86549247a3590215ebd5c4c43

commit r12-6416-g037cc0b4a6646cc86549247a3590215ebd5c4c43
Author: Richard Sandiford 
Date:   Mon Jan 10 14:47:09 2022 +

ira: Handle "soft" conflicts between cap and non-cap allocnos

This patch looks for allocno conflicts of the following form:

- One allocno (X) is a cap allocno for some non-cap allocno X2.
- X2 belongs to some loop L2.
- The other allocno (Y) is a non-cap allocno.
- Y is an ancestor of some allocno Y2 in L2.
- Y2 is not referenced in L2 (that is, ALLOCNO_NREFS (Y2) == 0).
- Y can use a different allocation from Y2.

In this case, Y's register is live across L2 but is not used within it,
whereas X's register is used only within L2.  The conflict is therefore
only "soft", in that it can easily be avoided by spilling Y2 inside L2
without affecting any insn references.

In principle we could do this for ALLOCNO_NREFS (Y2) != 0 too, with the
callers then taking Y2's ALLOCNO_MEMORY_COST into account.  There would
then be no "cliff edge" between a Y2 that has no references and a Y2 that
has (say) a single cold reference.

However, doing that isn't necessary for the PR and seems to give
variable results in practice.  (fotonik3d_r improves slightly but
namd_r regresses slightly.)  It therefore seemed better to start
with the higher-value zero-reference case and see how things go.

On top of the previous patches in the series, this fixes the exchange2
regression seen in GCC 11.

gcc/
PR rtl-optimization/98782
* ira-int.h (ira_soft_conflict): Declare.
* ira-color.c (max_soft_conflict_loop_depth): New constant.
(ira_soft_conflict): New function.
(spill_soft_conflicts): Likewise.
(assign_hard_reg): Use them to handle the case described by
the comment above ira_soft_conflict.
(improve_allocation): Likewise.
* ira.c (check_allocation): Allow allocnos with "soft" conflicts
to share the same register.

gcc/testsuite/
* gcc.target/aarch64/reg-alloc-4.c: New test.

[Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782

--- Comment #39 from CVS Commits  ---
The master branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:01f3e6a40e7202310abbeb41c345d325bd69554f

commit r12-6415-g01f3e6a40e7202310abbeb41c345d325bd69554f
Author: Richard Sandiford 
Date:   Mon Jan 10 14:47:08 2022 +

ira: Consider modelling caller-save allocations as loop spills

If an allocno A in an inner loop L spans a call, a parent allocno AP
can choose to handle a call-clobbered/caller-saved hard register R
in one of two ways:

(1) save R before each call in L and restore R after each call
(2) spill R to memory throughout L

(2) can be cheaper than (1) in some cases, particularly if L does
not reference A.

Before the patch we always did (1).  The patch adds support for
picking (2) instead, when it seems cheaper.  It builds on the
earlier support for not propagating conflicts to parent allocnos.

gcc/
PR rtl-optimization/98782
* ira-int.h (ira_caller_save_cost): New function.
(ira_caller_save_loop_spill_p): Likewise.
* ira-build.c (ira_propagate_hard_reg_costs): Test whether it is
cheaper to spill a call-clobbered register throughout a loop rather
than spill it around each individual call.  If so, treat all
call-clobbered registers as conflicts and...
(propagate_allocno_info): ...do not propagate call information
from the child to the parent.
* ira-color.c (move_spill_restore): Update accordingly.
* ira-costs.c (ira_tune_allocno_costs): Use ira_caller_save_cost.

gcc/testsuite/
* gcc.target/aarch64/reg-alloc-3.c: New test.
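
The choice between options (1) and (2) in the commit message above can be pictured with a toy cost comparison. Everything in this sketch is an assumption made for illustration: the struct `LoopCallProfile`, its fields, and the formulas are hypothetical and are not the computations performed by `ira_caller_save_cost` or `ira_caller_save_loop_spill_p`.

```cpp
// Toy model only: fields and formulas are illustrative assumptions,
// not IRA's actual cost functions.
struct LoopCallProfile {
    long call_freq;   // how often calls inside L execute while R is live
    long entry_freq;  // how often L is entered
    long exit_freq;   // how often L is left
    long ref_freq;    // how often L references allocno A
    long store_cost;  // cost of one register-to-memory store
    long load_cost;   // cost of one memory-to-register load
};

// Option (1): save R before each call in L and restore it afterwards.
constexpr long save_around_calls_cost(const LoopCallProfile& p) {
    return p.call_freq * (p.store_cost + p.load_cost);
}

// Option (2): keep the value in memory throughout L. Pay a store on
// entry, a load on exit, and (in this toy model) a load per reference.
constexpr long spill_throughout_loop_cost(const LoopCallProfile& p) {
    return p.entry_freq * p.store_cost + p.exit_freq * p.load_cost
           + p.ref_freq * p.load_cost;
}

constexpr bool prefer_loop_spill(const LoopCallProfile& p) {
    return spill_throughout_loop_cost(p) < save_around_calls_cost(p);
}
```

In this model a loop with many calls and no references to A favours option (2), which matches the commit message's remark that (2) can be cheaper "particularly if L does not reference A".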

[Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782

--- Comment #38 from CVS Commits  ---
The master branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:d54565d87ff79b882208dfb29af50232033c233d

commit r12-6413-gd54565d87ff79b882208dfb29af50232033c233d
Author: Richard Sandiford 
Date:   Mon Jan 10 14:47:07 2022 +

ira: Add ira_subloop_allocnos_can_differ_p

color_pass has two instances of the same code for propagating non-cap
assignments from parent loops to subloops.  This patch adds a helper
function for testing when such propagations are required for correctness
and uses it to remove the duplicated code.

A later patch will use this in ira-build.c too, which is why the
function is exported to ira-int.h.

No functional change intended.

gcc/
PR rtl-optimization/98782
* ira-int.h (ira_subloop_allocnos_can_differ_p): New function,
extracted from...
* ira-color.c (color_pass): ...here.

[Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2022-01-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782

--- Comment #37 from CVS Commits  ---
The master branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:909a4b4764c4f270f09ccb2a950c91b21ed7b33a

commit r12-6412-g909a4b4764c4f270f09ccb2a950c91b21ed7b33a
Author: Richard Sandiford 
Date:   Mon Jan 10 14:47:07 2022 +

ira: Add comments and fix move_spill_restore calculation

This patch adds comments to describe each use of ira_loop_border_costs.
I think this highlights that move_spill_restore was using the wrong cost
in one case, which came from transposing [0] and [1] in the original
(pre-ira_loop_border_costs) ira_memory_move_cost expressions.  The
difference would only be noticeable on targets that distinguish between
load and store costs.

gcc/
PR rtl-optimization/98782
* ira-color.c (color_pass): Add comments to describe the spill
costs.
(move_spill_restore): Likewise.  Fix reversed calculation.
