[Bug target/111107] i686-w64-mingw32 does not realign stack when __attribute__((aligned)) or __attribute__((vector_size)) are used

2023-11-24 Thread alexhenrie24 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111107

Alex Henrie  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexhenrie24 at gmail dot com

--- Comment #8 from Alex Henrie  ---
Created attachment 56685
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56685&action=edit
File that crashed my patched MinGW

I tried to compile Wine 8.21 with GCC 12 plus the patch from comment #5 and
-march=native on a Ryzen 7800X3D, but MinGW crashed with:

i686-w64-mingw32-gcc -c -o libs/tiff/i386-windows/libtiff/tif_aux.o
libs/tiff/libtiff/tif_aux.c -Ilibs/tiff -Iinclude -Iinclude/msvcrt \
  -Ilibs/tiff/libtiff -Ilibs/jpeg -Ilibs/zlib -D_UCRT -DFAR= -DZ_SOLO
-D__WINE_PE_BUILD \
  -fno-strict-aliasing -Wno-packed-not-aligned -fno-omit-frame-pointer
-march=native -freport-bug
during RTL pass: split1
libs/tiff/libtiff/tif_aux.c: In function ‘_TIFFUInt64ToFloat’:
libs/tiff/libtiff/tif_aux.c:415:1: internal compiler error: in
assign_stack_local_1, at function.cc:429
  415 | }
  | ^
0x19408f7 internal_error(char const*, ...)
???:0
0x6674e6 fancy_abort(char const*, int, char const*)
???:0
0xee3860 assign_386_stack_local(machine_mode, ix86_stack_slot)
???:0
0x130bfa7 gen_split_56(rtx_insn*, rtx_def**)
???:0
0x17066c2 split_insns(rtx_def*, rtx_insn*)
???:0
0x873ece try_split(rtx_def*, rtx_insn*, int)
???:0
0xb60f02 split_all_insns()
???:0
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See  for instructions.
Preprocessed source stored into /tmp/cc824MzT.out file, please attach this to
your bugreport.
make: *** [Makefile:179744: libs/tiff/i386-windows/libtiff/tif_aux.o] Error 1

Omitting -march=native or using unpatched GCC avoids the compiler crash.

I used GCC 12 because unfortunately, Arch Linux does not yet have packaging for
GCC 13, and compiling MinGW without the help of a PKGBUILD file looked pretty
daunting. If you want to try it yourself, just clone
https://gitlab.winehq.org/wine/wine.git and run `./configure
CROSSCFLAGS='-march=native' && make -j16`.
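
For readers unfamiliar with the attribute named in the bug title, here is a
minimal sketch of a local variable that requests 16-byte alignment. It is not
the reproducer from this report; the helper names (is_aligned16, check_local)
are hypothetical. On a target that honours the request, the check returns 1.
The report concerns i686-w64-mingw32, where the incoming stack is not
guaranteed to be 16-byte aligned, so the compiler has to realign it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: is this address a multiple of 16? */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* A local buffer with an explicit 16-byte alignment request.
   The compiler must place (or realign) the stack slot accordingly. */
static int check_local(void)
{
    __attribute__((aligned(16))) unsigned char buf[32];
    buf[0] = 0;                 /* touch the buffer so it is really used */
    return is_aligned16(buf);
}

/* Small self-check that can be called from a test driver. */
static void demo_alignment(void)
{
    assert(check_local() == 1);
}
```

Building something like this with the cross compiler and running it under Wine
would be one way to observe the alignment, but that workflow is an assumption
and was not tried here.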

[Bug sanitizer/112708] "gcc -fsanitize=address" produces wrong debug info for variables in function prologue

2023-11-24 Thread bruno at clisp dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112708

--- Comment #5 from Bruno Haible  ---
(In reply to Andrew Pinski from comment #3)
> Also did you add -fvar-tracking-assignments ?

No, I haven't. I have specified CFLAGS=-ggdb, indicating that
  - I don't care about the optimization level,
  - but I want the ability to debug with gdb. And that includes not being
disturbed and alarmed by wrong values of variables. (I wouldn't mind if
single-stepping would not stop at the function entry directly, only at the
first statement of the function. Then I would not have the opportunity to do
'print context' at the wrong moment.)

Which passes and internal machinery GCC needs in order to fulfil these goals,
should be GCC internal. In other words, I specify '-ggdb' and expect GCC to do
the rest.

Additionally, Jakub Jelinek writes in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102523#c2 :
! sometimes -O0 -g is debuggable much better than -Og -g, sometimes the other
way around.
Which is not really a recommendation to use this option on a general basis.

[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16

2023-11-24 Thread aros at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811

--- Comment #17 from Artem S. Tashkinov  ---
Terrific results, thanks a ton!

Maybe this bug report could be closed now?

[Bug sanitizer/112708] "gcc -fsanitize=address" produces wrong debug info for variables in function prologue

2023-11-24 Thread bruno at clisp dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112708

--- Comment #4 from Bruno Haible  ---
(In reply to Andrew Pinski from comment #1)
> Is this with or without optimization?

Since in step 5, I specified CFLAGS=-ggdb, it is without optimization.
(configure sets CFLAGS="-O2 -g" only if CFLAGS is not preset.)

[Bug tree-optimization/112709] New: ICE verify_flow_info failed during GIMPLE pass: asan0

2023-11-24 Thread iamanonymous.cs at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112709

Bug ID: 112709
   Summary: ICE verify_flow_info failed during GIMPLE pass: asan0
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iamanonymous.cs at gmail dot com
  Target Milestone: ---

***
OS and Platform:
$ uname -a:
Linux ubuntu 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023
x86_64 x86_64 x86_64 GNU/Linux
***
gcc version:
$ gcc -v
Using built-in specs.
COLLECT_GCC=/root/gcc_set/202311021000/bin/gcc
COLLECT_LTO_WRAPPER=/root/gcc_set/202311021000/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/root/gcc_set/202311021000
--with-gmp=/root/build_essential --with-mpfr=/root/build_essential
--with-mpc=/root/build_essential --enable-languages=c,c++ --disable-multilib
--with-sanitizer=address,undefined,thread,leak
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20231102 (experimental) (GCC)

git version: d508a94167c186b2baacc679896e2809554c0b99
***
Program:
$ cat mutant.c
struct S { char c[1024]; };
void func(void);
struct S s(void) __attribute__((returns_twice));
struct S *p;

void func1(void)
{
  func();
  *p = s();
}

***
Command Lines:
$ gcc -fsanitize=address -c mutant.c
mutant.c: In function ‘func1’:
mutant.c:6:6: error: returns_twice call is not first in basic block 4
6 | void func1(void)
  |  ^
*p.0_1(ab) = s ();
during GIMPLE pass: asan0
mutant.c:6:6: internal compiler error: verify_flow_info failed
0xad0a0e verify_flow_info()
../../gcc/gcc/cfghooks.cc:287
0xed36d7 execute_function_todo
../../gcc/gcc/passes.cc:2100
0xed3c0e execute_todo
../../gcc/gcc/passes.cc:2142
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug sanitizer/112708] "gcc -fsanitize=address" produces wrong debug info for variables in function prologue

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112708

--- Comment #3 from Andrew Pinski  ---
Also did you add -fvar-tracking-assignments ? (there are other reports asking
to turn on -fvar-tracking-assignments for -O0 except it is a big compile time
increase in many cases).

[Bug sanitizer/112708] "gcc -fsanitize=address" produces wrong debug info for variables in function prologue

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112708

--- Comment #2 from Andrew Pinski  ---
>Which means that the culprit is gcc.

Not always ...

[Bug sanitizer/112708] "gcc -fsanitize=address" produces wrong debug info for variables in function prologue

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112708

--- Comment #1 from Andrew Pinski  ---
Is this with or without optimization?

[Bug sanitizer/112708] New: "gcc -fsanitize=address" produces wrong debug info for variables in function prologue

2023-11-24 Thread bruno at clisp dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112708

Bug ID: 112708
   Summary: "gcc -fsanitize=address" produces wrong debug info for
variables in function prologue
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bruno at clisp dot org
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

As "gcc -fsanitize=address" finds several categories of memory related bugs,
I'm trying to use CC="gcc -fsanitize=address" everywhere. Unfortunately,
in the following case, a variable's value during a function prologue is
wrong when displayed by gdb. The value is displayed correctly when I don't
use the option -fsanitize=address. Which means that the culprit is gcc.

How to reproduce:
1. $ wget https://ftp.gnu.org/gnu/gettext/gettext-0.22.tar.xz
2. $ tar xf gettext-0.22.tar.xz
3. $ cd gettext-0.22
4. $ GCC13DIR=/some/directory/with/gcc-13.2.0
   $ PATH=$GCC13DIR/bin:$PATH
   Verify it:
   $ gcc --version
5. $ CC="gcc -fsanitize=address" CXX="g++ -fsanitize=address
-Wl,-rpath,$GCC13DIR/lib64" CFLAGS=-ggdb ./configure --disable-shared
6. $ make
7. $ cd gettext-tools/src
8. $ cat > foo.vala <<\EOF
primary_text.set_markup(
"%s".printf(_("Welcome
to Shotwell!")));
EOF
9.
$ gdb xgettext
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from xgettext...
(gdb) break xg-message.c:383
Breakpoint 1 at 0x41cad1: file xg-message.c, line 383.
(gdb) run -o - foo.vala
Starting program: /tmp/gettext-0.22/gettext-tools/src/xgettext -o - foo.vala
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, remember_a_message (mlp=0x60e00040, msgctxt=0x0,
msgid=0x60300a30 "Welcome to Shotwell!", is_utf8=true, pluralp=false,
context=..., pos=0x610004c0, extracted_comment=0x0, comment=0x0,
comment_is_utf8=false) at xg-message.c:383
383   set_format_flags_from_context (is_format, context, mp->msgid, pos,
"msgid");
(gdb) print context
$1 = {is_format1 = 3, pass_format1 = 0, is_format2 = 0, pass_format2 = 0,
is_format3 = 0, pass_format3 = 0, is_format4 = 0, pass_format4 = 0}
(gdb) step
set_format_flags_from_context (is_format=0x7fffc620, context=...,
string=0x60300a30 "Welcome to Shotwell!", pos=0x610004c0,
pretty_msgstr=0x6f0d40 "msgid") at xg-message.c:50
50 flag_context_ty context, const char
*string,
(gdb) print context
$2 = {is_format1 = 0, pass_format1 = 0, is_format2 = 2, pass_format2 = 0,
is_format3 = 5, pass_format3 = 0, is_format4 = 7, pass_format4 = 0}
(gdb) next
55        if (context.is_format1 != undecided
(gdb) print context
$3 = {is_format1 = 3, pass_format1 = 0, is_format2 = 0, pass_format2 = 0,
is_format3 = 0, pass_format3 = 0, is_format4 = 0, pass_format4 = 0}

The variable 'context' is passed from xg-message.c:383 to
set_format_flags_from_context.
The value printed as $1 and $3 is correct.
The value printed as $2 is nonsense.

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0

--- Comment #3 from Andrew Pinski  ---
I am 99% sure it was caused by
r14-4485-gc1e474785859c9630fcae19c8d2d606f5642c636 .

I suspect the check on lrintdi2 should have been changed to
`TARGET_HARD_FLOAT && TARGET_POWERPC64` instead of just `TARGET_HARD_FLOAT`.

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-11-24 Thread csfore at posteo dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707

--- Comment #2 from Christopher Fore  ---
Created attachment 56684
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56684&action=edit
assembly output of sharedbook.i

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-11-24 Thread csfore at posteo dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707

--- Comment #1 from Christopher Fore  ---
Created attachment 56683
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56683&action=edit
trimmed file with cvise

[Bug c/112707] New: gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-11-24 Thread csfore at posteo dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707

Bug ID: 112707
   Summary: gcc 14 outputs invalid assembly on ppc: Error:
unrecognized opcode: `fctid'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: csfore at posteo dot net
  Target Milestone: ---

Created attachment 56682
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56682&action=edit
original preprocessed file

Error is first hit when building libvorbis-1.3.7-r1. Needs `gcc -O3 -ffast-math
foo.i` to reproduce. The build succeeds with GCC 13.

`
/tmp/ccv3vNGH.s: Assembler messages:
/tmp/ccv3vNGH.s:34: Error: unrecognized opcode: `fctid'
`

I believe this is a GCC bug given that Apple release notes[0] specifically
state this instruction is 64-bit only (it's a bit down, but a search for the
instruction should find it)


[0]:
https://opensource.apple.com/source/cctools/cctools-446.1/as/notes.auto.html

gcc -v:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/powerpc-unknown-linux-gnu/14/lto-wrapper
Target: powerpc-unknown-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-14.0.0_pre20231119/work/gcc-14-20231119/configure
--host=powerpc-unknown-linux-gnu --build=powerpc-unknown-linux-gnu
--prefix=/usr --bindir=/usr/powerpc-unknown-linux-gnu/gcc-bin/14
--includedir=/usr/lib/gcc/powerpc-unknown-linux-gnu/14/include
--datadir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/14
--mandir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/14/man
--infodir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/14/info
--with-gxx-include-dir=/usr/lib/gcc/powerpc-unknown-linux-gnu/14/include/g++-v14
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/powerpc-unknown-linux-gnu/14/python
--enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-included-gettext
--disable-libunwind-exceptions --enable-checking=yes,extra
--with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo
14.0.0_pre20231119 p9' --with-gcc-major-version-only --enable-libstdcxx-time
--enable-lto --disable-libstdcxx-pch --enable-shared --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --disable-multilib
--disable-fixed-point --enable-targets=all --enable-libgomp --disable-libssp
--disable-libada --disable-cet --disable-systemtap
--disable-valgrind-annotations --disable-vtable-verify --disable-libvtv
--without-zstd --without-isl --enable-default-pie --enable-host-pie
--disable-host-bind-now --enable-default-ssp
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20231119 (experimental) (Gentoo 14.0.0_pre20231119 p9)

[Bug c++/102341] [modules] "error: conflicting exporting declaration" for anything previously declared

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102341

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Nathaniel Shead :

https://gcc.gnu.org/g:9dd8be6fc2debc4fbd0950386d4e98878af27a45

commit r14-5838-g9dd8be6fc2debc4fbd0950386d4e98878af27a45
Author: Nathaniel Shead 
Date:   Mon Nov 13 16:48:36 2023 +1100

c++: Allow exporting a typedef redeclaration [PR102341]

A typedef doesn't create a new entity, and thus should be allowed to be
exported even if it has been previously declared un-exported. See the
example in [module.interface] p6:

  export module M;
  struct S { int n; };
  typedef S S;
  export typedef S S; // OK, does not redeclare an entity

PR c++/102341

gcc/cp/ChangeLog:

* decl.cc (duplicate_decls): Allow exporting a redeclaration of
a typedef.

gcc/testsuite/ChangeLog:

* g++.dg/modules/export-1.C: Adjust test.
* g++.dg/modules/export-2_a.C: New test.
* g++.dg/modules/export-2_b.C: New test.

Signed-off-by: Nathaniel Shead 

[Bug target/109812] GraphicsMagick resize is a lot slower in GCC 13.1 vs Clang 16 on Intel Raptor Lake

2023-11-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812

--- Comment #20 from Jan Hubicka  ---
On zen4 hardware I now get

GCC13 with -O3 -flto -march=native -fopenmp
2163
2161
2153

Average: 2159 Iterations Per Minute

clang 17 with -O3 -flto -march=native -fopenmp
2004
1988
1991

Average: 1994 Iterations Per Minute

trunk -O3 -flto -march=native -fopenmp
Operation: Resizing:
2126
2135
2123

Average: 2128 Iterations Per Minute

So no big changes here...

[Bug middle-end/112653] PTA should handle correctly escape information of values returned by a function

2023-11-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653

--- Comment #8 from Jan Hubicka  ---
On ARM32 and other targets, methods return the this pointer.  Together with
making the return value escape, this probably completely disables any chance of
IPA tracking of C++ data types...

[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16

2023-11-24 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811

--- Comment #16 from Sam James  ---
Thank you all for the effort.

[Bug middle-end/110015] openjpeg is slower when built with gcc13 compared to clang16

2023-11-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110015

--- Comment #10 from Jan Hubicka  ---
Runtimes on zen4 hardware:

trunk -O3 -flto -march=native
42171
42964
42106
clang -O3 -flto -march=native
37393
37423
37508
gcc 13 -O3 -flto -march=native
42380
42314
43285

So it seems the performance did not change.

[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16

2023-11-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811

--- Comment #15 from Jan Hubicka  ---
With SRA improvements r:aae723d360ca26cd9fd0b039fb0a616bd0eae363 we finally get
good performance at -O2. Improvements to the push_back implementation also help
a bit.

Mainline with default flags (-O2):
Input: JPEG - Quality: 90:
19.76
19.75
19.68
Mainline with -O2 -march=native:
Input: JPEG - Quality: 90:
20.01
20
19.98
Mainline with -O2 -march=native -flto
Input: JPEG - Quality: 90:
19.95
19.98
19.81
Mainline with -O2 -march=native -flto --param max-inline-insns-auto=80 (this
makes push_back inlined)
Input: JPEG - Quality: 90:
19.98
20.05
20.03
Mainline with -O2 -flto  -march=native -I/usr/include/c++/v1 -nostdinc++ -lc++
(so clang's libc++)
21.38
21.37
21.32
Mainline with -O2 -flto  -march=native run manually since build machinery patch
is needed
23.03
22.85
23.04
Clang 17 with -O2 -march=native -flto and also -fno-tree-vectorize
-fno-tree-slp-vectorize added by cmake. This is with system libstdc++ from
GCC13 so before push_back improvements.
21.16
20.95
21.06
Clang 17 with -O2 -march=native -flto and also -fno-tree-vectorize
-fno-tree-slp-vectorize added by cmake. This is with trunk libstdc++ with
push_back improvements.
21.2
20.93
20.98
Clang 17 with -O2 -march=native -flto -stdlib=libc++ and also
-fno-tree-vectorize -fno-tree-slp-vectorize added by cmake. This is with clang's
libc++
Input: JPEG - Quality: 90:
22.08
21.88
21.78
Clang 17 with -O3 -march=native -flto
23.08
22.90
22.84


libc++ declares push_back always_inline and splits out the slow copying path. I
think the inlined part is still a bit too large for inlining at -O2.

We could still try to get the remaining approx. 10% without increasing code size
at -O2. However, the major part of the problem is solved.

[Bug middle-end/112660] missed-optimization: combine shifts when shifted out bits are known 0

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112660

--- Comment #2 from Andrew Pinski  ---
(In reply to gooncreeper from comment #1)
> This could be further extended for signed integers as we can assume for left
> shifts that shifted out bits are always 0 else UB, and always combine x << a
> >> b.

Not for C90 ... So for gimple and RTL level we can't assume that.

[Bug middle-end/112660] missed-optimization: combine shifts when shifted out bits are known 0

2023-11-24 Thread goon.pri.low at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112660

--- Comment #1 from gooncreeper  ---
This could be further extended for signed integers as we can assume for left
shifts that shifted out bits are always 0 else UB, and always combine x << a >>
b.
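
The unsigned case from the bug title can be illustrated with a small sketch.
It assumes a 32-bit value whose upper 16 bits are known to be zero because of
a mask; the function names are hypothetical and the shift amounts (8 and 4)
are chosen only for illustration. Under that assumption the left shift loses
no bits, so the shift pair computes the same value as one left shift by the
difference.

```c
#include <assert.h>
#include <stdint.h>

/* Two shifts as they might appear in source code. The mask guarantees
   that bits 16..31 of x are zero before shifting. */
static uint32_t two_shifts(uint32_t x)
{
    x &= 0xFFFFu;
    return (x << 8) >> 4;   /* the left shift discards only zero bits */
}

/* The single shift the pair could be combined into under that guarantee. */
static uint32_t one_shift(uint32_t x)
{
    x &= 0xFFFFu;
    return x << 4;
}

/* Exhaustively compare both forms over the masked input range. */
static int shifts_agree(void)
{
    for (uint32_t x = 0; x <= 0xFFFFu; ++x)
        if (two_shifts(x) != one_shift(x))
            return 0;
    return 1;
}
```

This only demonstrates that the two forms are equivalent for unsigned operands
with known-zero high bits. The signed extension is exactly what comment #2
disputes.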

[Bug tree-optimization/112706] missed simplification in FRE

2023-11-24 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112706

--- Comment #3 from Jan Hubicka  ---
Thanks, the new pattern looks like a noticeable improvement :)
Base+offset is effective for alias analysis and I suppose it happens
often enough for compares as well.
>   _76 = _71 + 4;
>   # .MEM_154 = VDEF <.MEM_153>
>   x_3(D)->D.25942._M_implD.25172.D.25249._M_finishD.25175 = _76;
>   # .MEM_7 = VDEF <.MEM_154>
>   D.26033 = 0;
>   # .MEM_157 = VDEF <.MEM_7>
>   *_76 = 0;
>   # PT = nonlocal escaped 
>   _82 = _71 + 8;
>   # .MEM_158 = VDEF <.MEM_157>
>   x_3(D)->D.25942._M_implD.25172.D.25249._M_finishD.25175 = _82;
>   # .MEM_8 = VDEF <.MEM_158>
>   D.26033 ={v} {CLOBBER(eol)};
>   # .MEM_9 = VDEF <.MEM_8>
>   D.26034 = 0;
>   if (_66 != _82)
> ```
> After pre (note the first comparison is gone but not the second one and maybe 
> a
> 3rd). So this patch helps but it looks like a PRE/VN improvement is still
> needed to fix the others.
I think it is missing predication in VN. At each execution of CCP or VN
we work out one conditional to be true, but we still account for both paths
in the value number of the pointer used in the next compare.

If vector used a base+size pair instead of base+endptr, VRP would help
here, but we can't VRP the finish-start range...
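
The two representations contrasted above can be sketched in C as follows.
This is a simplified illustration, not libstdc++ code: the struct and function
names are hypothetical, and the fixed-capacity push functions stand in for the
fast path of push_back only.

```c
#include <assert.h>
#include <stddef.h>

/* base+endptr: three pointers, in the spirit of the _M_finish field
   visible in the dump above. */
struct vec_ptrs {
    int *start, *finish, *end_of_storage;
};

/* base+size: one pointer plus integer size and capacity (hypothetical). */
struct vec_sizes {
    int *start;
    size_t size, capacity;
};

/* Fast path with a pointer comparison as the capacity check. */
static int push_ptrs(struct vec_ptrs *v, int x)
{
    if (v->finish == v->end_of_storage)
        return 0;                /* full: the slow path would reallocate */
    *v->finish++ = x;
    return 1;
}

/* Fast path with an integer comparison as the capacity check. */
static int push_sizes(struct vec_sizes *v, int x)
{
    if (v->size == v->capacity)
        return 0;                /* full: the slow path would reallocate */
    v->start[v->size++] = x;
    return 1;
}

/* Push three elements into capacity-2 vectors of each layout and
   count the pushes that succeed. */
static int demo_push(void)
{
    int a[2], b[2];
    struct vec_ptrs p = { a, a, a + 2 };
    struct vec_sizes s = { b, 0, 2 };
    int ok = push_ptrs(&p, 1) + push_ptrs(&p, 2) + push_ptrs(&p, 3);
    ok += push_sizes(&s, 1) + push_sizes(&s, 2) + push_sizes(&s, 3);
    return ok;
}
```

In the second layout the capacity check compares integers, which is the kind
of value that range propagation tracks. Whether GCC would actually optimize
repeated pushes better in that form is not shown by this sketch.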

[Bug tree-optimization/112706] missed simplification in FRE

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112706

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> That is:
> ```
> (for op  (eq ne)
>  (simplify
>   (op (pointer_plus @0 @1) (pointer_plus @0 @2))
>   (op @1 @2))
> 
> ```

Note I am missing an extra `)`. But then I noticed that in the full testcase
that was originally reported (located at
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638130.html) we still
end up with:
```
  _76 = _71 + 4;
  # .MEM_154 = VDEF <.MEM_153>
  x_3(D)->D.25942._M_implD.25172.D.25249._M_finishD.25175 = _76;
  # .MEM_7 = VDEF <.MEM_154>
  D.26033 = 0;
  # .MEM_157 = VDEF <.MEM_7>
  *_76 = 0;
  # PT = nonlocal escaped 
  _82 = _71 + 8;
  # .MEM_158 = VDEF <.MEM_157>
  x_3(D)->D.25942._M_implD.25172.D.25249._M_finishD.25175 = _82;
  # .MEM_8 = VDEF <.MEM_158>
  D.26033 ={v} {CLOBBER(eol)};
  # .MEM_9 = VDEF <.MEM_8>
  D.26034 = 0;
  if (_66 != _82)
```
After pre (note the first comparison is gone but not the second one and maybe a
3rd). So this patch helps but it looks like a PRE/VN improvement is still
needed to fix the others.

[Bug pch/112319] [14 Regression] segfault with pch and #pragma GCC diagnostic

2023-11-24 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112319

Lewis Hyatt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Lewis Hyatt  ---
Fixed now.

[Bug pch/112319] [14 Regression] segfault with pch and #pragma GCC diagnostic

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112319

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Lewis Hyatt :

https://gcc.gnu.org/g:5d4abd9219dfa53b52b341255e99139bb6cad302

commit r14-5836-g5d4abd9219dfa53b52b341255e99139bb6cad302
Author: Lewis Hyatt 
Date:   Wed Nov 1 13:01:12 2023 -0400

preprocessor: Reinitialize frontend parser after loading a PCH [PR112319]

Since r14-2893, the frontend parser object needs to exist when running in
preprocess-only mode, because pragma_lex() is now called in that mode and
needs to make use of it. This is handled by calling c_init_preprocess() at
startup. If -fpch-preprocess is in effect (commonly, because of
-save-temps), a PCH file may be loaded during preprocessing, in which
case the parser will be destroyed, causing the issue noted in the
PR. Resolve it by reinitializing the frontend parser after loading the PCH.

gcc/c-family/ChangeLog:

PR pch/112319
* c-ppoutput.cc (cb_read_pch): Reinitialize the frontend parser
after loading a PCH.

gcc/testsuite/ChangeLog:

PR pch/112319
* g++.dg/pch/pr112319.C: New test.
* g++.dg/pch/pr112319.Hs: New test.
* gcc.dg/pch/pr112319.c: New test.
* gcc.dg/pch/pr112319.hs: New test.

[Bug bootstrap/111601] [14 Regression] bootstrap fails in stagestrain in libcody on x86_64-linux-gnu and powerpc64le-linux-gnu

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111601

--- Comment #9 from Jakub Jelinek  ---
Reproduced on powerpc64le-linux with just
../configure --enable-languages=c,c++ --enable-checking=yes,rtl,extra
--disable-libsanitizer --with-long-double-128; make -j160 profiledbootstrap
so fortunately doesn't need LTO bootstrap.

[Bug tree-optimization/112706] missed simplification in FRE

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112706

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-11-24
          Component|middle-end                  |tree-optimization
           Keywords|                            |missed-optimization
             Blocks|110287                      |
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #1 from Andrew Pinski  ---
This should have been even folded during CCP1.

```
/* For equality and subtraction, this is also true with wrapping overflow.  */
(for op (eq ne minus)
 (simplify
  (op (plus:c @0 @2) (plus:c @1 @2))
  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
       && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
           || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))))
   (op @0 @1))))
```

Should be extended for PointerPlus or done for it.

That is:
```
(for op  (eq ne)
 (simplify
  (op (pointer_plus @0 @1) (pointer_plus @0 @2))
  (op @1 @2))

```

That should fix it during CCP1 (and fre).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287
[Bug 110287] _M_check_len is expensive
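
At the source level, the identity behind the proposed pointer_plus pattern can
be sketched as below. This assumes both pointers are formed from the same base
and stay within one array, as in the testcase from the original report; the
function names are hypothetical. The match.pd pattern works on GIMPLE byte
offsets, whereas this sketch uses element indices.

```c
#include <assert.h>
#include <stddef.h>

/* Compare two addresses derived from the same base pointer. */
static int same_element(const int *base, size_t i, size_t j)
{
    return (base + i) == (base + j);
}

/* What the comparison reduces to once the common base is dropped. */
static int same_index(size_t i, size_t j)
{
    return i == j;
}

/* Check that both forms agree for every in-bounds index pair. */
static int forms_agree(void)
{
    int arr[32] = { 0 };
    for (size_t i = 0; i < 32; ++i)
        for (size_t j = 0; j < 32; ++j)
            if (same_element(arr, i, j) != same_index(i, j))
                return 0;
    return 1;
}
```

With offsets 10 and 20, as in the reported testcase, the comparison is a
compile-time constant once the base is cancelled, which is why the link_error
call can be removed.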

[Bug middle-end/112706] New: missed simplification in FRE

2023-11-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112706

Bug ID: 112706
   Summary: missed simplification in FRE
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

Compiling the following testcase (simplified from repeated
std::vector::push_back expansion):

int *ptr;
void link_error ();
void
test ()
{

int *ptr1 = ptr + 10;
int *ptr2 = ptr + 20;
if (ptr1 == ptr2)
link_error ();
}

with gcc -O2 t.C -fdump-tree-all-details
one can check that link_error is optimized away really late:

jh@ryzen4:/tmp> grep link_error a-t.C*

a-t.C.106t.cunrolli:  link_error ();
a-t.C.107t.backprop:  link_error ();
a-t.C.108t.phiprop:  link_error ();
a-t.C.109t.forwprop2:link_error ();

this is too late for some optimization to catch up (in the case of std::vector
we end up missing DSE since the transform is delayed to forwprop3)

I think this is something value numbering should catch.

[Bug fortran/112700] Segmentation fault with list of characters and types

2023-11-24 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112700

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-11-24
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |ice-on-valid-code

--- Comment #1 from anlauf at gcc dot gnu.org ---
The testcase has some similarity with reduced versions of pr93678
and also the same traceback.  Thus likely related.

Note: replacing the rank-1 result variable

character(len=1) :: list(1)

by a scalar

character(len=1) :: list !(1)

avoids the ICE.

[Bug c/112702] C23, C++23: Extended characters not valid in an identifier with -pedantic

2023-11-24 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112702

--- Comment #3 from Jonathan Wakely  ---
It's also documented in the release notes:

https://gcc.gnu.org/gcc-13/changes.html#c (see N2836, Identifier Syntax using
Unicode Standard Annex 31).

[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy

2023-11-24 Thread post+gcc at ralfj dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #48 from post+gcc at ralfj dot de ---
> Note, clang makes the same assumption apparently (while MSVC emits rep movs 
> inline and ICC either that, or calls _intel_fast_memcpy).

MSVC does the same thing as clang and GCC, if godbolt is to be trusted:
https://rust.godbolt.org/z/o7TevfvcY
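
The situation in the bug title can be sketched as a struct assignment whose
source and destination may be the same object. The C standard permits such an
assignment when the overlap is exact, while memcpy formally requires
non-overlapping buffers. The struct size and function names below are
hypothetical. Whether a given compiler expands the assignment as a memcpy call
depends on target and options, so treat this as an illustration of the pattern
rather than a reproducer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Large enough that a compiler may prefer a library call over inline moves. */
struct big {
    char bytes[1024];
};

/* Plain struct assignment; dst and src are allowed to point to the
   same object. */
static void copy_big(struct big *dst, const struct big *src)
{
    *dst = *src;
}

/* Self-assignment must leave the object unchanged. */
static int self_copy_preserves(void)
{
    struct big s;
    memset(s.bytes, 0x5A, sizeof s.bytes);
    copy_big(&s, &s);
    for (size_t i = 0; i < sizeof s.bytes; ++i)
        if (s.bytes[i] != 0x5A)
            return 0;
    return 1;
}
```

The assumption discussed in this bug is that a memcpy call with identical
source and destination pointers behaves as a no-op in practice.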

[Bug analyzer/112705] New: FAIL: gcc.dg/analyzer/pr94688.c (test for excess errors)

2023-11-24 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112705

Bug ID: 112705
   Summary: FAIL: gcc.dg/analyzer/pr94688.c (test for excess
errors)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: danglin at gcc dot gnu.org
  Target Milestone: ---
  Host: hppa64-hp-hpux11.11
Target: hppa64-hp-hpux11.11
 Build: hppa64-hp-hpux11.11

spawn -ignore SIGHUP /home/dave/gnu/gcc/objdir64/gcc/xgcc
-B/home/dave/gnu/gcc/objdir64/gcc/
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/pr94688.c
-fdiagnostics-plain-output -fanalyzer -Wanalyzer-too-complex
-fanalyzer-call-summaries -S -o pr94688.s
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/pr94688.c: In function 'c':
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/pr94688.c:5:5: warning: allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size]
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/pr94688.c:1:5: note: (1) allocated here
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/pr94688.c:5:5: note: (2) assigned to 'void (*)()' here; 'sizeof (void())' is '8'
FAIL: gcc.dg/analyzer/pr94688.c (test for excess errors)
Excess errors:
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/pr94688.c:5:5: warning:
allocated buffer size is not a multiple of the pointee's size [CWE-131]
[-Wanalyzer-allocation-size]

[Bug analyzer/112704] New: FAIL: gcc.dg/analyzer/data-model-20.c (test for warnings, line 17)

2023-11-24 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112704

Bug ID: 112704
   Summary: FAIL: gcc.dg/analyzer/data-model-20.c  (test for
warnings, line 17)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: danglin at gcc dot gnu.org
  Target Milestone: ---
  Host: hppa64-hp-hpux11.11
Target: hppa64-hp-hpux11.11
 Build: hppa64-hp-hpux11.11

Executing on host: /home/dave/gnu/gcc/objdir64/gcc/xgcc
-B/home/dave/gnu/gcc/objdir64/gcc/
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c
-fdiagnostics-plain-output -fanalyzer -Wanalyzer-too-complex
-fanalyzer-call-summaries -Wno-analyzer-too-complex -S -o data-model-20.s
(timeout = 300)
spawn -ignore SIGHUP /home/dave/gnu/gcc/objdir64/gcc/xgcc
-B/home/dave/gnu/gcc/objdir64/gcc/
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c
-fdiagnostics-plain-output -fanalyzer -Wanalyzer-too-complex
-fanalyzer-call-summaries -Wno-analyzer-too-complex -S -o data-model-20.s
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c: In
function 'test':
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:24:7:
warning: leak of '' [CWE-401] [-Wanalyzer-malloc-leak]
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:12:6:
note: (1) following 'false' branch (when 'arr' is non-NULL)...
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:15:10:
note: (2) ...to here
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:15:17:
note: (3) following 'true' branch (when 'i < n')...
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:16:13:
note: (4) ...to here
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:16:33:
note: (5) allocated here
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:16:8:
note: (6) assuming '' is non-NULL
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:16:8:
note: (7) following 'false' branch...
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:15:23:
note: (8) ...to here
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:15:17:
note: (9) following 'true' branch (when 'i < n')...
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:16:13:
note: (10) ...to here
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:16:8:
note: (11) following 'true' branch...
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:17:7:
note: (12) ...to here
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:17:16:
note: (13) following 'true' branch (when 'i >= 0')...
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:22:17:
note: (14) ...to here
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:17:16:
note: (15) following 'true' branch (when 'i >= 0')...
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:22:17:
note: (16) ...to here
/home/dave/gnu/gcc/gcc/gcc/testsuite/gcc.dg/analyzer/data-model-20.c:24:7:
note: (17) '' leaks here; was allocated at (5)
FAIL: gcc.dg/analyzer/data-model-20.c  (test for warnings, line 17)
PASS: gcc.dg/analyzer/data-model-20.c  (test for bogus messages, line 22)
PASS: gcc.dg/analyzer/data-model-20.c  (test for warnings, line 24)
PASS: gcc.dg/analyzer/data-model-20.c (test for excess errors)

[Bug tree-optimization/110794] FAIL: g++.dg/pr99966.C -std=gnu++17 scan-tree-dump-not vrp1 "throw"

2023-11-24 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110794

John David Anglin  changed:

   What|Removed |Added

   Last reconfirmed|2023-11-17 00:00:00 |2023-11-24

--- Comment #1 from John David Anglin  ---
   [count: 0]:
  # i_19 = PHI 
  # _24 = PHI <_31(7)>
  std::__throw_out_of_range_fmt ("vector::_M_range_check: __n (which is %zu) >=
this->size() (which is %zu)", i_19, _24);

[Bug target/112606] [14 Regression] powerpc64le-linux-gnu: 'FAIL: gcc.target/powerpc/p8vector-fp.c scan-assembler xsnabsdp'

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112606

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
So
2023-11-24  Jakub Jelinek  

PR target/112606
* config/rs6000/rs6000.md (copysign3): Change predicate
of the last argument from gpc_reg_operand to any_operand.  If
operands[2] is CONST_DOUBLE, emit abs or neg abs depending on
its sign, otherwise if it doesn't satisfy gpc_reg_operand,
force it to REG using copy_to_mode_reg.

--- gcc/config/rs6000/rs6000.md.jj  2023-10-13 19:34:43.927834877 +0200
+++ gcc/config/rs6000/rs6000.md 2023-11-24 18:54:13.587876170 +0100
@@ -5358,7 +5358,7 @@ (define_expand "copysign3"
(set (match_dup 4)
(neg:SFDF (abs:SFDF (match_dup 1
(set (match_operand:SFDF 0 "gpc_reg_operand")
-(if_then_else:SFDF (ge (match_operand:SFDF 2 "gpc_reg_operand")
+   (if_then_else:SFDF (ge (match_operand:SFDF 2 "any_operand")
   (match_dup 5))
 (match_dup 3)
 (match_dup 4)))]
@@ -5369,6 +5369,24 @@ (define_expand "copysign3"
|| TARGET_CMPB
|| VECTOR_UNIT_VSX_P (mode))"
 {
+  /* Middle-end canonicalizes -fabs (x) to copysign (x, -1),
+ but PowerPC prefers -fabs (x).  */
+  if (CONST_DOUBLE_AS_FLOAT_P (operands[2]))
+{
+  if (real_isneg (CONST_DOUBLE_REAL_VALUE (operands[2])))
+   {
+ operands[3] = gen_reg_rtx (mode);
+ emit_insn (gen_abs2 (operands[3], operands[1]));
+ emit_insn (gen_neg2 (operands[0], operands[3]));
+   }
+  else
+   emit_insn (gen_abs2 (operands[0], operands[1]));
+  DONE;
+}
+
+  if (!gpc_reg_operand (operands[2], mode))
+operands[2] = copy_to_mode_reg (mode, operands[2]);
+
   if (TARGET_CMPB || VECTOR_UNIT_VSX_P (mode))
 {
   emit_insn (gen_copysign3_fcpsgn (operands[0], operands[1],

then?
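
For readers not familiar with the canonicalization mentioned in the patch comment, here is a minimal C++ sketch of the source-level identity it relies on: copysign (x, -1) and -fabs (x) give the same value and sign bit for non-NaN inputs. The helper names are hypothetical and this says nothing about the RTL the expander actually emits.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper names, used only for this illustration.
double via_copysign(double x) { return std::copysign(x, -1.0); }
double via_neg_fabs(double x) { return -std::fabs(x); }

// True when both forms agree in value and in sign bit (so -0.0 is covered).
// NaN inputs are deliberately left out: NaN never compares equal to itself.
bool same_result(double x) {
  double a = via_copysign(x);
  double b = via_neg_fabs(x);
  return a == b && std::signbit(a) == std::signbit(b);
}

// Checks the identity on a handful of non-NaN samples.
void check_identity() {
  const double samples[] = {0.0, -0.0, 1.5, -2.25, 1e300, -1e-300};
  for (double x : samples)
    assert(same_result(x));
}
```

Calling check_identity() should pass silently; the constant second operand is the case the new CONST_DOUBLE branch in the patch handles.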

[Bug debug/112703] New: [13/14 Regression] -fcompare-debug failure at -O1 and above

2023-11-24 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112703

Bug ID: 112703
   Summary: [13/14 Regression] -fcompare-debug failure at -O1 and
above
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: compare-debug-failure
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
CC: aoliva at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 56681
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56681=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O -fcompare-debug testcase.C -save-temps
x86_64-pc-linux-gnu-gcc: error: testcase.C: '-fcompare-debug' failure

$ diff -u *gkd
--- a-testcase.C.gkd 2023-11-24 18:57:06.420042167 +0100
+++ a-testcase.gk.C.gkd 2023-11-24 18:57:06.450042167 +0100
@@ -67,7 +67,7 @@
 (nil))
 (nil))
 (insn # 0 0 4 (set (reg:SI 0 ax [orig:98  ] [98])
-(const_int 1 [0x1])) "testcase.C":7:32 discrim 3# {*movsi_internal}
+(const_int 1 [0x1])) "testcase.C":7:32 discrim 4# {*movsi_internal}
  (nil))
 (jump_insn # 0 0 4 (set (pc)
 (label_ref #)) "testcase.C":7:32# {jump}

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-5822-20231124121307-g3eb9cae6d37-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-5822-20231124121307-g3eb9cae6d37-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231124 (experimental) (GCC)

[Bug middle-end/112697] [14 Regression] 30-40% exec time regression of 433.milc on zen2 since r14-4972-g8aa47713701b1f

2023-11-24 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112697

Sam James  changed:

   What|Removed |Added

Summary|[14 Regression] 30-40% exec |[14 Regression] 30-40% exec
   |time regression of 433.milc |time regression of 433.milc
   |on zen2 |on zen2 since
   ||r14-4972-g8aa47713701b1f

--- Comment #4 from Sam James  ---
I can probably find a znver2 machine for someone to work on if it's needed, but
that's obviously not going to be the hardest part here...

[Bug middle-end/112697] [14 Regression] 30-40% exec time regression of 433.milc on zen2

2023-11-24 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112697

Martin Jambor  changed:

   What|Removed |Added

 CC||jamborm at gcc dot gnu.org,
   ||vmakarov at gcc dot gnu.org

--- Comment #3 from Martin Jambor  ---
I can reliably bisect this to r14-4972-g8aa47713701b1f (Vladimir's [RA]: Add
cost calculation for reg equivalence invariants) on a similar zen2 machine. 
But it seems zen2 specific, I did not see any performance difference (this is
generic march/tuning) on znver4, for example.  So it may be quite hard to
analyze and fix, even though the regression is big :-/

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-11-24 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849

--- Comment #25 from Martin Jambor  ---
(In reply to Richard Biener from comment #7)
> There is nothing to sink really, loop header copying introduces a PHI and
> there's not partial redundancies but only partial-partial and those are not
> obvious to CSE because of the introduced PHI.
> 

SRA now decomposes stack.

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849

--- Comment #24 from CVS Commits  ---
The master branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:c2dcfb6ba6e9a84a16e63ae73a822ae2a843170c

commit r14-5832-gc2dcfb6ba6e9a84a16e63ae73a822ae2a843170c
Author: Jan Hubicka 
Date:   Fri Nov 24 17:59:44 2023 +0100

Use memcpy instead of memmove in __relocate_a_1

__relocate_a_1 is used to copy data after vector resizing.  This can be done
by memcpy rather than memmove.

libstdc++-v3/ChangeLog:

PR middle-end/109849
* include/bits/stl_uninitialized.h (__relocate_a_1): Use memcpy
instead of memmove.
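
A rough sketch of why memcpy is enough here, assuming (as after a vector reallocation) that the old and new storage are distinct allocations and therefore cannot overlap. relocate_trivial is a hypothetical stand-in, not the libstdc++ implementation, and it only covers trivially copyable element types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the relocation step; not libstdc++'s actual code.
template <typename T>
T* relocate_trivial(T* first, T* last, T* result) {
  static_assert(std::is_trivially_copyable<T>::value,
                "this sketch handles trivially copyable types only");
  const std::ptrdiff_t count = last - first;
  if (count > 0) {
    // Assumption: [first, last) and the destination are separate allocations,
    // so the ranges do not overlap and memcpy (not memmove) is valid.
    std::memcpy(result, first, static_cast<std::size_t>(count) * sizeof(T));
  }
  return result + count;
}

// Small self-check: move three ints from an "old" buffer to a larger "new" one.
void check_relocate() {
  int old_storage[3] = {1, 2, 3};
  int new_storage[6] = {};
  int* end = relocate_trivial(old_storage, old_storage + 3, new_storage);
  assert(end == new_storage + 3);
  assert(new_storage[0] == 1 && new_storage[2] == 3);
}
```

If the ranges could overlap, memmove would still be required; the sketch depends entirely on the non-overlap assumption stated above.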

[Bug c++/111703] [11/12/13 Regression] [C++20]Compiler fails when using generic lambda in specific situation since r11-550-gf65a3299a521a4

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111703

--- Comment #9 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:cc4cbf38e842cf023e2bdc63a51ef836d7726d8e

commit r13-8098-gcc4cbf38e842cf023e2bdc63a51ef836d7726d8e
Author: Patrick Palka 
Date:   Thu Nov 16 09:32:07 2023 -0500

c++: constantness of call to function pointer [PR111703]

potential_constant_expression for CALL_EXPR tests FUNCTION_POINTER_TYPE_P
on the callee rather than on the type of the callee, which means we
always pass want_rval=any when recursing and so may fail to identify a
non-constant function pointer callee as such.  Fixing this turns out to
further work around PR111703.

PR c++/111703
PR c++/107939

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) :
Fix FUNCTION_POINTER_TYPE_P test.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-fn8.C: Extend test.
* g++.dg/diagnostic/constexpr4.C: New test.

(cherry picked from commit 0077c0fb19981c108a01cd15af9b2d6d478c183b)
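
As a language-level illustration only (it does not exercise potential_constant_expression directly, and it is not the testcase from the commit), the difference between a constant and a non-constant function pointer callee can be sketched like this. All names are hypothetical.

```cpp
#include <cassert>

constexpr int one() { return 1; }

// The pointer itself is a constant expression, so a call through it can be.
constexpr int (*const_callee)() = one;

// Ordinary variable: its value is not usable in constant expressions.
int (*runtime_callee)() = one;

static_assert(const_callee() == 1, "call through a constant function pointer");
// static_assert(runtime_callee() == 1, "");  // ill-formed: callee is not constant

// The non-constant callee is still perfectly fine at run time.
int call_runtime() { return runtime_callee(); }

void check_call_runtime() { assert(call_runtime() == 1); }
```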

[Bug c++/107939] [11 Regression] Rejects use of `extern const` variable in a template since r11-557

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107939

--- Comment #11 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:cc4cbf38e842cf023e2bdc63a51ef836d7726d8e

commit r13-8098-gcc4cbf38e842cf023e2bdc63a51ef836d7726d8e
Author: Patrick Palka 
Date:   Thu Nov 16 09:32:07 2023 -0500

c++: constantness of call to function pointer [PR111703]

potential_constant_expression for CALL_EXPR tests FUNCTION_POINTER_TYPE_P
on the callee rather than on the type of the callee, which means we
always pass want_rval=any when recursing and so may fail to identify a
non-constant function pointer callee as such.  Fixing this turns out to
further work around PR111703.

PR c++/111703
PR c++/107939

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) :
Fix FUNCTION_POINTER_TYPE_P test.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-fn8.C: Extend test.
* g++.dg/diagnostic/constexpr4.C: New test.

(cherry picked from commit 0077c0fb19981c108a01cd15af9b2d6d478c183b)

[Bug c++/112269] [14 Regression] x86_64 GNU/Linux '-m32' multilib 'libstdc++-v3/include/complex:1493: internal compiler error: in tsubst_expr, at cp/pt.cc:21534' since r14-4796-g3e3d73ed5e85e7

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112269

--- Comment #14 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:dd57446bd90c3225ce45e8818c5b00f2e86a9607

commit r13-8097-gdd57446bd90c3225ce45e8818c5b00f2e86a9607
Author: Patrick Palka 
Date:   Wed Nov 15 12:03:16 2023 -0500

c++: constantness of local var in constexpr fn [PR111703, PR112269]

potential_constant_expression was incorrectly treating most local
variables from a constexpr function as constant because it wasn't
considering the 'now' parameter.  This patch fixes this by relaxing
its var_in_maybe_constexpr_fn checks accordingly, which turns out to
partially fix two recently reported regressions:

PR111703 is a regression caused by r11-550-gf65a3299a521a4 for restricting
constexpr evaluation during warning-dependent folding.  The mechanism is
intended to restrict only constant evaluation of the instantiated
non-dependent expression, but it also ends up restricting constant
evaluation occurring during instantiation of the expression, in particular
when instantiating the converted argument 'x' (a VIEW_CONVERT_EXPR) into
a copy constructor call.  This seems like a flaw in the mechanism, though
I don't know if we want to fix the mechanism or get rid of it completely
since the original testcases which motivated the mechanism are fixed more
simply by r13-1225-gb00b95198e6720.  In any case, this patch partially
fixes this by making us correctly treat 'x' as non-constant which prevents
the problematic warning-dependent folding from occurring at all.

PR112269 is caused by r14-4796-g3e3d73ed5e85e7 for merging tsubst_copy
into tsubst_copy_and_build.  tsubst_copy used to exit early when 'args'
was empty, behavior which that commit deliberately didn't preserve.
This early exit masked the fact that COMPLEX_EXPR wasn't handled by
tsubst at all, and is a tree code that apparently we could see during
warning-dependent folding on some targets.  A complete fix is to add
handling for this tree code in tsubst_expr, but this patch should fix
the reported testsuite failures since the COMPLEX_EXPRs that crop up
in  are considered non-constant expressions after this patch.

PR c++/111703
PR c++/112269

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) :
Only consider var_in_maybe_constexpr_fn if 'now' is false.
: Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-fn8.C: New test.

(cherry picked from commit 6665a8572c8f24bd55c6081c91f461442c94dcfb)
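
To make the "local variable in a constexpr function" point concrete, here is a small sketch at the language level, assuming C++14 or later. It only shows that such a local is not itself a constant expression inside the function body, even though a call to the function can be constant-evaluated; it is not the added testcase and the names are hypothetical.

```cpp
#include <cassert>

constexpr int next(int n) {
  int local = n + 1;  // value depends on the argument
  // static_assert(local > 0, "");  // ill-formed: 'local' is not constant here
  return local;
}

// The call as a whole can still be evaluated at compile time.
static_assert(next(3) == 4, "constant evaluation of the whole call");

// And the same function works with run-time arguments.
int next_at_runtime(int n) { return next(n); }

void check_next() { assert(next_at_runtime(41) == 42); }
```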

[Bug c++/111703] [11/12/13 Regression] [C++20]Compiler fails when using generic lambda in specific situation since r11-550-gf65a3299a521a4

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111703

--- Comment #8 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:dd57446bd90c3225ce45e8818c5b00f2e86a9607

commit r13-8097-gdd57446bd90c3225ce45e8818c5b00f2e86a9607
Author: Patrick Palka 
Date:   Wed Nov 15 12:03:16 2023 -0500

c++: constantness of local var in constexpr fn [PR111703, PR112269]

potential_constant_expression was incorrectly treating most local
variables from a constexpr function as constant because it wasn't
considering the 'now' parameter.  This patch fixes this by relaxing
its var_in_maybe_constexpr_fn checks accordingly, which turns out to
partially fix two recently reported regressions:

PR111703 is a regression caused by r11-550-gf65a3299a521a4 for restricting
constexpr evaluation during warning-dependent folding.  The mechanism is
intended to restrict only constant evaluation of the instantiated
non-dependent expression, but it also ends up restricting constant
evaluation occurring during instantiation of the expression, in particular
when instantiating the converted argument 'x' (a VIEW_CONVERT_EXPR) into
a copy constructor call.  This seems like a flaw in the mechanism, though
I don't know if we want to fix the mechanism or get rid of it completely
since the original testcases which motivated the mechanism are fixed more
simply by r13-1225-gb00b95198e6720.  In any case, this patch partially
fixes this by making us correctly treat 'x' as non-constant which prevents
the problematic warning-dependent folding from occurring at all.

PR112269 is caused by r14-4796-g3e3d73ed5e85e7 for merging tsubst_copy
into tsubst_copy_and_build.  tsubst_copy used to exit early when 'args'
was empty, behavior which that commit deliberately didn't preserve.
This early exit masked the fact that COMPLEX_EXPR wasn't handled by
tsubst at all, and is a tree code that apparently we could see during
warning-dependent folding on some targets.  A complete fix is to add
handling for this tree code in tsubst_expr, but this patch should fix
the reported testsuite failures since the COMPLEX_EXPRs that crop up
in  are considered non-constant expressions after this patch.

PR c++/111703
PR c++/112269

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) :
Only consider var_in_maybe_constexpr_fn if 'now' is false.
: Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-fn8.C: New test.

(cherry picked from commit 6665a8572c8f24bd55c6081c91f461442c94dcfb)

[Bug target/112643] [14 regression] including x86intrin.h is broken for -march=native (which adds -mno-avx10.1-256 )

2023-11-24 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112643

Sam James  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
   Assignee|unassigned at gcc dot gnu.org  |haochen.jiang at intel dot com

--- Comment #28 from Sam James  ---
All done, I think. Thanks!

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849

--- Comment #23 from CVS Commits  ---
The master branch has been updated by Martin Jambor:

https://gcc.gnu.org/g:aae723d360ca26cd9fd0b039fb0a616bd0eae363

commit r14-5831-gaae723d360ca26cd9fd0b039fb0a616bd0eae363
Author: Martin Jambor 
Date:   Fri Nov 24 17:32:35 2023 +0100

sra: SRA of non-escaped aggregates passed by reference to calls

PR109849 shows that a loop that heavily pushes and pops from a stack
implemented by a C++ std::vector results in slow code, mainly because the
vector structure is not split by SRA and so we end up in many loads
and stores into it.  This is because it is passed by reference
to (re)allocation methods and so needs to live in memory, even though
it does not escape from them and so we could SRA it if we
re-constructed it before the call and then separated it to distinct
replacements afterwards.
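
The kind of loop described above might look roughly like the following sketch. It is a hypothetical shape, not the pr109849 testcase: a local std::vector used as a stack whose address is passed to push_back and pop_back but never escapes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical micro-benchmark shape, not the testcase added by the patch.
long walk(int n) {
  std::vector<int> stack;  // local aggregate passed by reference to its methods
  long sum = 0;
  stack.push_back(n);
  while (!stack.empty()) {
    int v = stack.back();
    stack.pop_back();
    sum += v;
    if (v > 1) {           // push two smaller items, simulating a tree walk
      stack.push_back(v / 2);
      stack.push_back(v - v / 2);
    }
  }
  return sum;
}

void check_walk() {
  assert(walk(1) == 1);
  assert(walk(2) == 4);
}
```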

This patch does exactly that, first relaxing the selection of
candidates to also include those which are addressable but do not
escape and then adding code to deal with the calls.  The
micro-benchmark that is also the (scan-dump) testcase in this patch
runs twice as fast with it than with current trunk.  Honza measured
its effect on the libjxl benchmark and it almost closes the
performance gap between Clang and GCC while not requiring excessive
inlining and thus code growth.
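
A hand-written picture of the transformation, under the assumption that the callee takes the aggregate by reference and does not let its address escape. Counter, grow, fill_plain and fill_scalarized are all hypothetical; the second function only imitates in source form what SRA would do on GIMPLE.

```cpp
#include <cassert>

struct Counter { int used; int capacity; };  // hypothetical aggregate

// Takes the aggregate by reference but does not let the address escape.
void grow(Counter* c) { c->capacity = c->capacity * 2 + 1; }

// Shape of the source: every access to 'c' goes through memory.
int fill_plain(int n) {
  Counter c{0, 0};
  for (int i = 0; i < n; ++i) {
    if (c.used == c.capacity) grow(&c);
    ++c.used;
  }
  return c.capacity;
}

// What SRA aims for, written by hand: scalars inside the loop, with the
// aggregate rebuilt before the call and split into scalars again afterwards.
int fill_scalarized(int n) {
  int used = 0, capacity = 0;
  for (int i = 0; i < n; ++i) {
    if (used == capacity) {
      Counter tmp{used, capacity};  // re-construct before the call
      grow(&tmp);
      used = tmp.used;              // load replacements after the call
      capacity = tmp.capacity;
    }
    ++used;
  }
  return capacity;
}

void check_fill() {
  for (int n = 0; n < 50; ++n)
    assert(fill_plain(n) == fill_scalarized(n));
}
```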

The patch disallows creation of replacements for such aggregates which
are also accessed with a precision smaller than their size because I
have observed that this led to excessive zero-extending of data
leading to slow-downs of perlbench (on some CPUs).  Apart from this
case I have not noticed any regressions, at least not so far.

Gimple call argument flags can tell if an argument is unused (and then
we do not need to generate any statements for it) or if it is not
written to and then we do not need to generate statements loading
replacements from the original aggregate after the call statement.
Unfortunately, we cannot symmetrically use flags saying that an aggregate
is not read in order to avoid re-constructing the aggregate before the
call, because the flags don't tell which parts of the aggregate were not
written to, so we load all replacements, and so all need to have the
correct value before the call.

This version of the patch also takes care to avoid attempts to modify
abnormal edges, something which was missing in the previous version.

gcc/ChangeLog:

2023-11-23  Martin Jambor  

PR middle-end/109849
* tree-sra.cc (passed_by_ref_in_call): New.
(sra_initialize): Allocate passed_by_ref_in_call.
(sra_deinitialize): Free passed_by_ref_in_call.
(create_access): Add decl pool candidates only if they are not
already candidates.
(build_access_from_expr_1): Bail out on ADDR_EXPRs.
(build_access_from_call_arg): New function.
(asm_visit_addr): Rename to scan_visit_addr, change the
disqualification dump message.
(scan_function): Check taken addresses for all non-call statements,
including phi nodes.  Process all call arguments, including the
static chain, build_access_from_call_arg.
(maybe_add_sra_candidate): Relax need_to_live_in_memory check to
allow non-escaped local variables.
(sort_and_splice_var_accesses): Disallow smaller-than-precision
replacements for aggregates passed by reference to functions.
(sra_modify_expr): Use a separate stmt iterator for adding
statements before the processed statement and after it.
(enum out_edge_check): New type.
(abnormal_edge_after_stmt_p): New function.
(sra_modify_call_arg): New function.
(sra_modify_assign): Adjust calls to sra_modify_expr.
(sra_modify_function_body): Likewise, use sra_modify_call_arg to
process call arguments, including the static chain.

gcc/testsuite/ChangeLog:

2023-11-23  Martin Jambor  

PR middle-end/109849
* g++.dg/tree-ssa/pr109849.C: New test.
* g++.dg/tree-ssa/sra-eh-1.C: Likewise.
* gcc.dg/tree-ssa/pr109849.c: Likewise.
* gcc.dg/tree-ssa/sra-longjmp-1.c: Likewise.
* gfortran.dg/pr43984.f90: Added -fno-tree-sra to dg-options.

[Bug target/112686] [14 Regression] ICE: in gen_reg_rtx, at emit-rtl.cc:1176 with -fsplit-stack -mcmodel=large

2023-11-24 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112686

Uroš Bizjak  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED
 CC|uros at gcc dot gnu.org|

--- Comment #5 from Uroš Bizjak  ---
Fixed.

[Bug target/112686] [14 Regression] ICE: in gen_reg_rtx, at emit-rtl.cc:1176 with -fsplit-stack -mcmodel=large

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112686

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Uros Bizjak:

https://gcc.gnu.org/g:404ea4c1381398aee162415a88e5cb81c44f8c69

commit r14-5830-g404ea4c1381398aee162415a88e5cb81c44f8c69
Author: Uros Bizjak 
Date:   Fri Nov 24 16:11:27 2023 +0100

i386: Fix ICE with -fsplit-stack -mcmodel=large [PR112686]

For -mcmodel=large, we have to load function address to a register.

PR target/112686

gcc/ChangeLog:

* config/i386/i386.cc (ix86_expand_split_stack_prologue): Load
function address to a register for ix86_cmodel == CM_LARGE.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112686.c: New test.

[Bug middle-end/112634] [14 Regression][OpenMP][-fprofile-generate] ICE in verify_gimple for gcc.dg/gomp/pr27573.c:

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112634

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #7 from Jakub Jelinek  ---
This has been fixed.

[Bug target/112672] [14 Regression] wrong code with __builtin_parityl() at -O and above on x86_64

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112672

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #10 from Jakub Jelinek  ---
.

[Bug target/112675] [14 Regression] r14-5385-g0a140730c97087 caused regression on testcases for i386

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112675

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Those should have been fixed by r14-5442, haven't they?

[Bug sanitizer/112562] [14 regression] asan_interceptors_memintrinsics.cpp doesn't assemble with Solaris/x86 as

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112562

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Jakub Jelinek  ---
Should be fixed now I believe.

[Bug tree-optimization/111293] [14 Regression] Missed Dead Code Elimination since r14-3414-g0cfc9c953d0

2023-11-24 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111293

--- Comment #2 from Andrew Macleod  ---
It seems like we set 'e' to 3 immediately at the start:
   [local count: 1073741824]:
  e = 3;
  goto ; [100.00%]

and it is never changed again. However, when we load from 'e' later in the IL
   [local count: 9485263241]:
  e.1_6 = e;

we simply get varying. Is some pass supposed to propagate this?  This reminds me
of a few other regression PRs where we no longer propagate known values from
loads from memory into ssa-names.

If we knew that e.1_6 was '3', then the call to foo would be folded away as
never executable.
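
A hypothetical source-level shape of the situation described, not the PR's testcase: 'e' is stored once, loaded later, and the guarded call is dead if the load is known to yield 3. foo here just counts its calls so the behaviour is observable.

```cpp
#include <cassert>

static int e;          // hypothetical global, named after the variable in the dump
static int foo_calls;  // counts calls so the sketch is observable

static void foo() { ++foo_calls; }

int run(int iterations) {
  e = 3;                     // the only store to 'e'
  for (int i = 0; i < iterations; ++i) {
    int loaded = e;          // corresponds to "e.1_6 = e" in the dump
    if (loaded != 3) foo();  // never executed if the load is known to be 3
  }
  return foo_calls;
}

void check_run() { assert(run(10) == 0); }
```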

[Bug target/112300] [14 Regression] Cross compiling to mipsisa64r2-sde-elf fails because "HEAP_TRAMPOLINES_INIT was not declared in this scope" since r14-4821-g28d8c680aaea46

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112300

Jakub Jelinek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-11-24
 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
That seems like a config.gcc bug for the particular target, which is the only one
which doesn't append its tm_defines to the earlier ones, but overwrites them:
grep 'tm_defines="[^$]' config.gcc 
tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS32R6
MIPS_ABI_DEFAULT=ABI_32"
tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS32R2
MIPS_ABI_DEFAULT=ABI_32"
tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS32
MIPS_ABI_DEFAULT=ABI_32"
tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS64R6
MIPS_ABI_DEFAULT=ABI_N32"
tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS64R2
MIPS_ABI_DEFAULT=ABI_N32"
tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS64
MIPS_ABI_DEFAULT=ABI_N32"
tm_defines="TARGET_ENDIAN_DEFAULT=0
$tm_defines"
So, I guess
2023-11-24  Jakub Jelinek  

PR target/112300
* config.gcc (mips*-sde-elf*): Append to tm_defines rather than
overwriting them.

--- gcc/config.gcc.jj   2023-11-18 09:35:20.625089143 +0100
+++ gcc/config.gcc  2023-11-24 16:41:39.194495079 +0100
@@ -2682,22 +2682,22 @@ mips*-sde-elf*)
esac
case ${target} in
  mipsisa32r6*)
-   tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS32R6
MIPS_ABI_DEFAULT=ABI_32"
+   tm_defines="${tm_defines} MIPS_ISA_DEFAULT=MIPS_ISA_MIPS32R6
MIPS_ABI_DEFAULT=ABI_32"
;;
  mipsisa32r2*)
-   tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS32R2
MIPS_ABI_DEFAULT=ABI_32"
+   tm_defines="${tm_defines} MIPS_ISA_DEFAULT=MIPS_ISA_MIPS32R2
MIPS_ABI_DEFAULT=ABI_32"
;;
  mipsisa32*)
-   tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS32
MIPS_ABI_DEFAULT=ABI_32"
+   tm_defines="${tm_defines} MIPS_ISA_DEFAULT=MIPS_ISA_MIPS32
MIPS_ABI_DEFAULT=ABI_32"
;;
  mipsisa64r6*)
-   tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS64R6
MIPS_ABI_DEFAULT=ABI_N32"
+   tm_defines="${tm_defines} MIPS_ISA_DEFAULT=MIPS_ISA_MIPS64R6
MIPS_ABI_DEFAULT=ABI_N32"
;;
  mipsisa64r2*)
-   tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS64R2
MIPS_ABI_DEFAULT=ABI_N32"
+   tm_defines="${tm_defines} MIPS_ISA_DEFAULT=MIPS_ISA_MIPS64R2
MIPS_ABI_DEFAULT=ABI_N32"
;;
  mipsisa64*)
-   tm_defines="MIPS_ISA_DEFAULT=MIPS_ISA_MIPS64
MIPS_ABI_DEFAULT=ABI_N32"
+   tm_defines="${tm_defines} MIPS_ISA_DEFAULT=MIPS_ISA_MIPS64
MIPS_ABI_DEFAULT=ABI_N32"
;;
esac
;;
should be the right fix, but I've never heard of mips*-sde*-elf before... ;)
From looking at config.gcc, I think the change will add
" LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3 LIBC_MUSL=4 HEAP_TRAMPOLINES_INIT=0
"
to the start of tm_defines.

[Bug tree-optimization/108351] [13/14 Regression] Dead Code Elimination Regression at -O3 since r13-4240-gfeeb0d68f1c708

2023-11-24 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108351

Martin Jambor  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #10 from Martin Jambor  ---
(In reply to Yann Girsberger from comment #9)
> > Is this reduced from a real-world problem?
> 
> No, it is reduced from a modified csmith/random program.

In that case let me close this, please reopen if you disagree with my assessment
in comment #5.  Thanks.

[Bug preprocessor/112701] wrong type inference for ternary operator with `0/0u` in preprocessing context

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112701

--- Comment #1 from Andrew Pinski  ---
Interestingly, MSVC emits both; clang emits none.

[Bug c/112702] C23, C++23: Extended characters not valid in an identifier with -pedantic

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112702

--- Comment #2 from Andrew Pinski  ---
The answer to "is this expected?" is YES.

[Bug c++/109936] error: extended character ≠ is not valid in an identifier

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109936

Andrew Pinski  changed:

   What|Removed |Added

 CC||stammark at gcc dot gnu.org

--- Comment #27 from Andrew Pinski  ---
*** Bug 112702 has been marked as a duplicate of this bug. ***

[Bug c/112702] C23, C++23: Extended characters not valid in an identifier with -pedantic

2023-11-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112702

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
Dup.

*** This bug has been marked as a duplicate of bug 109936 ***

[Bug tree-optimization/111137] [11/12 Regression] Wrong code at -O2/3

2023-11-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111137

Richard Biener  changed:

   What|Removed |Added

  Known to fail||13.2.0
  Known to work||13.2.1
   Priority|P3  |P2
Summary|[11/12/13 Regression] Wrong |[11/12 Regression] Wrong
   |code at -O2/3   |code at -O2/3

[Bug tree-optimization/111465] [14 regression] stage 3 ICE kills bootstrap from r14-4089-gd45ddc2c04e471d0dcee01

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111465

--- Comment #13 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:152400decc8383aeff9a9ad8262b9e7e2fff61e0

commit r13-8096-g152400decc8383aeff9a9ad8262b9e7e2fff61e0
Author: Richard Biener 
Date:   Tue Sep 19 12:36:04 2023 +0200

tree-optimization/111465 - bogus jump threading with no-copy src block

The following avoids forward threading a path with an EDGE_NO_COPY_SRC_BLOCK
block that became non-empty due to folding.

PR tree-optimization/111465
* tree-ssa-threadupdate.cc (fwd_jt_path_registry::thread_block_1):
Cancel the path when a EDGE_NO_COPY_SRC_BLOCK became non-empty.

* g++.dg/torture/pr111465.C: New testcase.

(cherry picked from commit 564ecb7d5afb0bb4eb39285ce65c631490e37dce)

[Bug tree-optimization/111137] [11/12/13 Regression] Wrong code at -O2/3

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111137

--- Comment #5 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:4649121c8e3bae1315e265ad2e205990e39573c5

commit r13-8095-g4649121c8e3bae1315e265ad2e205990e39573c5
Author: Richard Biener 
Date:   Fri Aug 25 13:37:30 2023 +0200

tree-optimization/111137 - dependence checking for SLP

The following fixes a mistake with SLP dependence checking.  When
checking whether we can hoist loads to the first load place we
special-case stores of the same instance considering them sunk
to the last store place.  But we fail to consider that stores from
other SLP instances are sunk in a similar way.  This leads us to
miss the dependence between (A) and (B) in

  b[0][1] = 0; (A)
...
  _6 = b[_5 /* 0 */][0];   (B')
  _7 = _6 ^ 1;
  b[_5 /* 0 */][0] = _7;
  b[0][2] = 0; (A')
  _10 = b[_5 /* 0 */][1];  (B)
  _11 = _10 ^ 1;
  b[_5 /* 0 */][1] = _11;

where the zeroing stores are sunk to (A') and the loads hoisted
to (B').  The following fixes this, treating grouped stores from
other instances similar to stores from our own instance.  The
difference is - and this is more conservative than necessary - that
we don't know which stores of a group are in which SLP instance
(though I believe either all of the grouped stores will be in
a single SLP instance or in none at the moment), so we don't
know which stores are sunk where.  We simply assume they are
all sunk to the last store we run into.  Likewise we do not take
into account that an SLP instance might be cancelled (or a grouped
store not actually belong to any instance).

PR tree-optimization/111137
* tree-vect-data-refs.cc (vect_slp_analyze_load_dependences):
Properly handle grouped stores from other SLP instances.

* gcc.dg/torture/pr111137.c: New testcase.

(cherry picked from commit 845ee9c7107956845e487cb123fa581d9c70ea1b)
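The missed dependence between (A) and (B) described in the commit message above can be replayed by hand in scalar C. The sketch below is only an illustration, not the testcase from the commit: the function names, the array shape, and the initial values are assumptions. `run_in_order` follows the statement order shown in the message, while `run_reordered` models the invalid schedule in which the loads are hoisted to (B') and the zeroing stores are sunk to (A').

```c
/* Statement order as written: (A), (B'), (A'), (B). */
int run_in_order(int b[1][3], int i)
{
    b[0][1] = 0;            /* (A)  */
    int t6 = b[i][0];       /* (B') */
    b[i][0] = t6 ^ 1;
    b[0][2] = 0;            /* (A') */
    int t10 = b[i][1];      /* (B): must observe the zero stored at (A) */
    b[i][1] = t10 ^ 1;
    return b[0][1];
}

/* Hypothetical invalid schedule: both loads hoisted to (B'), the zeroing
   stores sunk to (A'), and the xor stores sunk to the last store place. */
int run_reordered(int b[1][3], int i)
{
    int t6 = b[i][0];       /* (B') */
    int t10 = b[i][1];      /* hoisted (B): reads the stale value */
    b[0][1] = 0;            /* sunk (A) */
    b[0][2] = 0;            /* (A') */
    b[i][0] = t6 ^ 1;
    b[i][1] = t10 ^ 1;
    return b[0][1];
}
```

With `i == 0` and `b[0][1]` initially 5, the in-order version leaves `b[0][1] == 1`, while the reordered version leaves `b[0][1] == 4`, which is the kind of wrong code the dependence check has to rule out.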

[Bug c/112702] New: C23, C++23: Extended characters not valid in an identifier with -pedantic

2023-11-24 Thread stammark at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112702

Bug ID: 112702
   Summary: C23, C++23: Extended characters not valid in an
identifier with -pedantic
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: stammark at gcc dot gnu.org
  Target Milestone: ---

Hi all,

This is likely a symptom of the WIP-ness of C23 and C++23 support in the
frontends, but see here:

https://godbolt.org/z/78KeK1fnG

The use of extended characters in identifiers with -pedantic stopped working

* For C, in GCC13
* For C++ in GCC12

Removing -pedantic makes the compilation succeed.

Is this expected behaviour with -pedantic or a bug?

Thanks,

[Bug target/112672] [14 Regression] wrong code with __builtin_parityl() at -O and above on x86_64

2023-11-24 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112672

Uroš Bizjak  changed:

   What|Removed |Added

   Target Milestone|14.0|11.5

--- Comment #9 from Uroš Bizjak  ---
Fixed everywhere.

[Bug target/112672] [14 Regression] wrong code with __builtin_parityl() at -O and above on x86_64

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112672

--- Comment #8 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Uros Bizjak:

https://gcc.gnu.org/g:422e30e4d5ca2f26f77e7c90e09658408c07a23c

commit r11-1-g422e30e4d5ca2f26f77e7c90e09658408c07a23c
Author: Uros Bizjak 
Date:   Thu Nov 23 16:17:57 2023 +0100

i386: Wrong code with __builtin_parityl [PR112672]

gen_parityhi2_cmp instruction clobbers its input operand, so use
a temporary register in the call to gen_parityhi2_cmp.

PR target/112672

gcc/ChangeLog:

* config/i386/i386.md (parityhi2):
Use temporary register in the call to gen_parityhi2_cmp.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112672.c: New test.

(cherry picked from commit b2d17bdd45b582b93e89c00b04763a45f97d7a34)
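A quick way to sanity-check a compiler for this class of bug is to compare `__builtin_parityl` against a bit-by-bit reference and to keep using the input afterwards, since the commit message says the input operand was being clobbered. The sketch below is an assumption-laden illustration and not the contents of pr112672.c; `parity_ref` and `parity_matches` are hypothetical helper names.

```c
/* Reference parity: 1 if an odd number of bits are set, else 0. */
int parity_ref(unsigned long x)
{
    int p = 0;
    while (x) {
        p ^= 1;
        x &= x - 1;   /* clear the lowest set bit */
    }
    return p;
}

/* Returns 1 when the builtin agrees with the reference and the input value
   is still intact after the builtin call (hypothetical check). */
int parity_matches(unsigned long x)
{
    unsigned long saved = x;
    int p = __builtin_parityl(x);
    return p == parity_ref(saved) && x == saved;
}
```

`__builtin_parityl` is the GCC builtin named in the bug title; whether a given build exercises the affected `parityhi2` expander depends on target flags and optimization level, so a passing check does not prove the fix is present.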

[Bug target/112672] [14 Regression] wrong code with __builtin_parityl() at -O and above on x86_64

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112672

--- Comment #7 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Uros Bizjak:

https://gcc.gnu.org/g:f0445f4401c941d0aa3cc413ca4548f313cc1257

commit r12-10001-gf0445f4401c941d0aa3cc413ca4548f313cc1257
Author: Uros Bizjak 
Date:   Thu Nov 23 16:17:57 2023 +0100

i386: Wrong code with __builtin_parityl [PR112672]

gen_parityhi2_cmp instruction clobbers its input operand, so use
a temporary register in the call to gen_parityhi2_cmp.

PR target/112672

gcc/ChangeLog:

* config/i386/i386.md (parityhi2):
Use temporary register in the call to gen_parityhi2_cmp.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112672.c: New test.

(cherry picked from commit b2d17bdd45b582b93e89c00b04763a45f97d7a34)

[Bug preprocessor/112701] New: wrong type inference for ternary operator in preprocessing context

2023-11-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112701

Bug ID: 112701
   Summary: wrong type inference for ternary operator in
preprocessing context
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

In the following snippet, the result of the ternary operator is (-1, cast to an
unsigned type), so the comparison yields false, and both conditional inclusions
must come out empty:

#if (0 ? 0u : -1) < 0
int foo = (0 ? 0u : -1) < 0;
#endif

#if (0 ? 0/0u : -1) < 0
int bar = (0 ? 0/0u : -1) < 0;
#endif

However, GCC emits:

bar:
.zero   4

So clearly the evaluation of the second expression is inconsistent between
preprocessing context (where it incorrectly yields 1) vs. initializer context
(where it is zero as it should be, as seen from the resulting asm).
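The expected result in ordinary expression context follows from the usual arithmetic conversions: the conditional operator's result type is unsigned, so -1 becomes the maximum unsigned value and the comparison with 0 is false. A minimal sketch of that reasoning, with hypothetical function names and covering only the expression context (not the `#if` context the report is about):

```c
#include <limits.h>

/* The conditional expression has type unsigned int, so "< 0" is never true. */
int ternary_is_negative(void)
{
    return (0 ? 0u : -1) < 0;
}

/* -1 converted to unsigned int wraps to UINT_MAX. */
int ternary_is_uint_max(void)
{
    return (0 ? 0u : -1) == UINT_MAX;
}
```

In preprocessing context the operands are promoted to `intmax_t`/`uintmax_t` instead of `int`/`unsigned int`, but the same signed-to-unsigned conversion is expected to apply, which is why both conditional inclusions in the report should come out empty.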

[Bug tree-optimization/108351] [13/14 Regression] Dead Code Elimination Regression at -O3 since r13-4240-gfeeb0d68f1c708

2023-11-24 Thread yann at ywg dot ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108351

--- Comment #9 from Yann Girsberger  ---
> Is this reduced from a real-world problem?

No, it is reduced from a modified csmith/random program.

[Bug target/111408] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-2866-ge68a31549d9

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111408

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #4 from Jakub Jelinek  ---
Created attachment 56680
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56680=edit
gcc14-pr111408.patch

Untested fix.  That said, I think at least latently this bug exists all the way
to r0-87899-g88b9490b3361c8e7a901134936cd5013abc85158 - unfortunately the md
file readers don't complain if a binary match_operator doesn't have two
operands in the syntax.  I've quickly skimmed all of i386.md match_operator
uses and didn't spot other bugs (extract_operator uses 3 arguments in all
cases, all others after this patch use 2 arguments).

[Bug fortran/112700] New: Segmentation fault with list of characters and types

2023-11-24 Thread alexandre.poux at coria dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112700

Bug ID: 112700
   Summary: Segmentation fault with list of characters and types
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: alexandre.poux at coria dot fr
  Target Milestone: ---

Created attachment 56679
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56679=edit
a module triggering a segmentation fault

The attached source code for a fortran module triggers a segmentation fault
when compiling

```
bug_mod.f90:24:39:

   24 | list_char = obj%get_char_list()
  |   1
internal compiler error: Segmentation fault
0xd432a1 internal_error(char const*, ...)
???:0
0xe409f5 fold_convert_loc(unsigned int, tree_node*, tree_node*)
???:0
0x15e8f14 gfc_trans_create_temp_array(stmtblock_t*, stmtblock_t*, gfc_ss*,
tree_node*, tree_node*, bool, bool, bool, locus*)
???:0
0x1631d79 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*,
gfc_expr*, vec*)
???:0
0x15f5151 gfc_conv_loop_setup(gfc_loopinfo*, locus*)
???:0
0x161d4f8 gfc_generate_function_code(gfc_namespace*)
???:0
0x15e7f29 gfc_generate_module_code(gfc_namespace*)
???:0
0x157973b gfc_parse_file()
???:0
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
```

I've been able to reproduce it with
- latest version of gfortran from the archlinux repository: core/gcc-fortran
13.2.1-3
- older version from ubuntu 22.04 : Ubuntu 11.4.0-1ubuntu1~22.04

[Bug c++/99232] Exported variable in module gives error: 'lambda' was not declared in this scope

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99232

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Nathaniel Shead:

https://gcc.gnu.org/g:726723c476800285cfbdfce612cedde4a9a7ad58

commit r14-5826-g726723c476800285cfbdfce612cedde4a9a7ad58
Author: Nathaniel Shead 
Date:   Wed Nov 15 20:50:53 2023 +1100

c++: Allow exporting const-qualified namespace-scope variables [PR99232]

By [basic.link] p3.2.1, a non-template non-volatile const-qualified
variable is not necessarily internal linkage in a module declaration,
and rather may have module linkage (or external linkage if it is
exported, see p4.8).

PR c++/99232

gcc/cp/ChangeLog:

* decl.cc (grokvardecl): Don't mark variables attached to
modules as internal.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr99232_a.C: New test.
* g++.dg/modules/pr99232_b.C: New test.

Signed-off-by: Nathaniel Shead 

[Bug c/112699] Should limits.h in freestanding environment be self-contained?

2023-11-24 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112699

--- Comment #3 from Xi Ruoyao  ---
(In reply to Alexander Monakov from comment #1)
> Can you clarify which file you mean? gcc/ginclude does not have a limits.h.
> 
> I assume you are not talking about the fixinclude'd limits.h?

No, I mean the limits.h in $(dirname $(gcc -print-libgcc-file-name)).  It
currently contains these lines to include libc limits.h:

#ifndef _LIBC_LIMITS_H_
/* Use "..." so that we find syslimits.h only in this same directory.  */
#include "syslimits.h"
#endif

(if libc limits.h is fixinclude'd, it will be saved to syslimits.h; otherwise
syslimits.h will just contain "#include_next <limits.h>".)

And I'm asking if

#if !defined _LIBC_LIMITS_H_ && __STDC_HOSTED__
/* Use "..." so that we find syslimits.h only in this same directory.  */
#include "syslimits.h"
#endif

would be better?
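Since `#ifndef` accepts only a single identifier, a combined condition has to be spelled with `#if` and `defined`. The sketch below shows one way such a guard could be written so that the libc header is pulled in only in a hosted environment; this is one reading of the proposal rather than the actual GCC change. `WOULD_INCLUDE_SYSLIMITS` and the two functions are hypothetical names, and the real `#include "syslimits.h"` is replaced by a macro so the snippet compiles on its own.

```c
/* Stand-in for the guarded include: record whether the libc limits.h
   would be pulled in under the combined condition. */
#if !defined(_LIBC_LIMITS_H_) && __STDC_HOSTED__
#define WOULD_INCLUDE_SYSLIMITS 1
#else
#define WOULD_INCLUDE_SYSLIMITS 0
#endif

int would_include_syslimits(void)
{
    return WOULD_INCLUDE_SYSLIMITS;
}

int is_hosted(void)
{
    return __STDC_HOSTED__;
}
```

Compiled with -ffreestanding, `__STDC_HOSTED__` is 0 and the guarded branch is skipped, which is the self-contained behaviour the question is about.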

[Bug tree-optimization/112694] RISC-V: zve64d testing expose many ICE on C/C++ testing

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112694

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Pan Li:

https://gcc.gnu.org/g:aea337cf740ec33022f3cabfa7dd4333d5ba78ee

commit r14-5825-gaea337cf740ec33022f3cabfa7dd4333d5ba78ee
Author: Juzhe-Zhong 
Date:   Fri Nov 24 16:34:28 2023 +0800

RISC-V: Fix inconsistency among all vectorization hooks

This patch fixes 200+ ICEs exposed by testing with rv64gc_zve64d.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112694

The root cause is that we disallow poly (1,1) size vectorization in
preferred_simd_mode with the following code:
-  if (TARGET_MIN_VLEN < 128 && TARGET_MAX_LMUL < RVV_M2)
-   return word_mode;

However, we allow poly (1,1) size in the hooks:
TARGET_VECTORIZE_RELATED_MODE
TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES

and also enable it in all vectorization patterns.

I added this check to preferred_simd_mode because a poly (1,1) size mode
causes an ICE in can_duplicate_and_interleave_p.

So the alternative approach would be to block poly (1,1) size in both the
TARGET_VECTORIZE_RELATED_MODE and TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES
hooks and in all vectorization patterns, which is an ugly approach and
requires too many code changes.

Now, after investigation, I find that the loop vectorizer can automatically
block poly (1,1) size vectors in interleave vectorization with this commit:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=730909fa858bd691095bc23655077aa13b7941a9

So we don't need to worry about ICEs in interleave vectorization and can
allow poly (1,1) size vectors in vectorization, which fixes 200+ ICEs with
the zve64d march.

PR target/112694

gcc/ChangeLog:

* config/riscv/riscv-v.cc (preferred_simd_mode): Allow poly_int
(1,1) vectors.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr112694-1.c: New test.

[Bug target/112686] [14 Regression] ICE: in gen_reg_rtx, at emit-rtl.cc:1176 with -fsplit-stack -mcmodel=large

2023-11-24 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112686

Uroš Bizjak  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ubizjak at gmail dot com
 Status|NEW |ASSIGNED

--- Comment #3 from Uroš Bizjak  ---
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 7b922857d80..50e8826dbe5 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -10503,7 +10503,7 @@ ix86_expand_split_stack_prologue (void)
  fn = copy_to_suggested_reg (x, reg11, Pmode);
}
  else
-   fn = split_stack_fn_large;
+   fn = copy_to_suggested_reg (split_stack_fn_large, reg11, Pmode);

  /* When using the large model we need to load the address
 into a register, and we've run out of registers.  So we

[Bug tree-optimization/108351] [13/14 Regression] Dead Code Elimination Regression at -O3 since r13-4240-gfeeb0d68f1c708

2023-11-24 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108351

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org

--- Comment #8 from Sam James  ---
(In reply to Martin Jambor from comment #5)
> If you rename main to something else, like bar, and so the calls to f
> outside of the loop are not considered cold, you get the GCC 12
> behavior.  Is this reduced from a real-world problem?
> 

Yann?

[Bug target/109977] [14 Regression] ICE: output_operand: incompatible floating point / vector register operand for '%d' at -Og since r14-215-g85279b0bddc1c5

2023-11-24 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109977

--- Comment #5 from Sam James  ---
If needed, you can email me an SSH key for a Neoverse-N1 (fp asimd evtstrm aes
pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
ssbs) which should be fast.

[Bug target/109977] [14 Regression] ICE: output_operand: incompatible floating point / vector register operand for '%d' at -Og

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109977

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
I agree with the analysis and
2023-11-24  Andrew Pinski  
Jakub Jelinek  

* config/aarch64/aarch64-simd.md (aarch64_simd_stp): Use 
rather than % for alternative with r constraint on input operand.

* gcc.dg/pr109977.c: New test.

--- gcc/config/aarch64/aarch64-simd.md.jj   2023-11-22 22:55:20.577075762 +0100
+++ gcc/config/aarch64/aarch64-simd.md  2023-11-24 12:51:22.855215700 +0100
@@ -269,7 +269,7 @@ (define_insn "aarch64_simd_stp"
   "TARGET_SIMD"
   {@ [ cons: =0 , 1 ; attrs: type]
  [ Umn  , w ; neon_stp   ] stp\t%1, %1,
%y0
- [ Umn  , r ; store_  ] stp\t%1, %1, %y0
+ [ Umn  , r ; store_  ] stp\t%1, %1,
%y0
   }
 )

--- gcc/testsuite/gcc.dg/pr109977.c.jj  2023-11-24 12:51:04.551473591 +0100
+++ gcc/testsuite/gcc.dg/pr109977.c 2023-11-24 12:50:44.158760916 +0100
@@ -0,0 +1,16 @@
+/* PR target/109977 */
+/* { dg-do compile } */
+/* { dg-options "-Og" } */
+
+typedef double __attribute__((__vector_size__ (8))) V;
+typedef double __attribute__((__vector_size__ (16))) W;
+V v;
+int i;
+extern void bar (void *);
+
+void
+foo (void)
+{
+  W w = __builtin_shufflevector (v, (W) { }, 0, 0);
+  bar (&w);
+}
fixes it (though it will take me a while to find where to bootstrap/regtest
this).

[Bug c/112699] Should limits.h in freestanding environment be self-contained?

2023-11-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112699

--- Comment #2 from Alexander Monakov  ---
Sorry, even though GCC's limits.h is installed under include-fixed, it is
generated separately, not by the generic fixincludes mechanism. I was confused.

[Bug target/112686] [14 Regression] ICE: in gen_reg_rtx, at emit-rtl.cc:1176 with -fsplit-stack -mcmodel=large

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112686

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||uros at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Indeed, it is r14-5784-g2f3f8952ff1736.

[Bug c/112699] Should limits.h in freestanding environment be self-contained?

2023-11-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112699

Alexander Monakov  changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu.org

--- Comment #1 from Alexander Monakov  ---
Can you clarify which file you mean? gcc/ginclude does not have a limits.h.

I assume you are not talking about the fixinclude'd limits.h?

[Bug target/112681] [14 Regression] ICE: in extract_insn, at recog.cc:2804 (unrecognizable insn) with -O2 -mfma -mno-sse4.2 and memcmp() since r14-5747

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112681

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Jakub Jelinek  ---
Fixed.

[Bug target/112681] [14 Regression] ICE: in extract_insn, at recog.cc:2804 (unrecognizable insn) with -O2 -mfma -mno-sse4.2 and memcmp() since r14-5747

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112681

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:3eb9cae6d375d222787498b15ac87f383b3834fe

commit r14-5822-g3eb9cae6d375d222787498b15ac87f383b3834fe
Author: Jakub Jelinek 
Date:   Fri Nov 24 12:12:20 2023 +0100

i386: Fix ICE during cbranchv16qi4 expansion [PR112681]

The following testcase ICEs, because cbranchv16qi4 expansion calls
ix86_expand_branch with op1 being a pre-AVX unaligned memory and
ix86_expand_branch emits a xorv16qi3 instruction without making sure
the operand predicates are satisfied.
I could manually check whether the argument (or both?) fails to match the
vector_operand predicate (apparently this one or bcst_vector_operand
is used in all integral 16+ bytes *xorv*3 instructions) and force it into a
register, but as all gen_xorv*3 expanders call
ix86_expand_vector_logical_operator, it seems easier to just call that
function, which ensures the right thing happens.  Calling the individual
gen_xorv*3 functions would mean an ugly switch on the modes, and using
expand_simple_binop here seems too high level to me.

2023-11-24  Jakub Jelinek  

PR target/112681
* config/i386/i386-expand.cc (ix86_expand_branch): Use
ix86_expand_vector_logical_operator to expand vector XOR rather than
gen_rtx_SET on gen_rtx_XOR.

* gcc.target/i386/sse4-pr112681.c: New test.

[Bug middle-end/112679] ICE in expand_float, at optabs.cc:5724 with bitint

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112679

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Jakub Jelinek  ---
Should be fixed now.

[Bug tree-optimization/112673] [14 Regression] ICE verify_gimple failed since r14-5557-g6dd4c703be17fa

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112673

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Jakub Jelinek  ---
Should be fixed now.

[Bug tree-optimization/112673] [14 Regression] ICE verify_gimple failed since r14-5557-g6dd4c703be17fa

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112673

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:eebcad0ac22010fc59de9d856bb02017fccab282

commit r14-5819-geebcad0ac22010fc59de9d856bb02017fccab282
Author: Jakub Jelinek 
Date:   Fri Nov 24 11:32:28 2023 +0100

match.pd: Avoid simplification into invalid BIT_FIELD_REFs [PR112673]

The following testcase is lowered by the bitint lowering pass, then
vectorizer vectorizes one of the loops in it, so we have
  vect__18.6_34 = VIEW_CONVERT_EXPR(x_35(D));
  _8 = BIT_FIELD_REF ;
...
  _18 = BIT_FIELD_REF ;
etc. where x_35(D) is _BitInt(256) argument.  That is valid BIT_FIELD_REF,
the first argument is a vector and it extracts the vector elements from it.
Then comes forwprop4 and simplifies that using match.pd into
  _8 = (unsigned long) x_35(D);
...
  _18 = BIT_FIELD_REF ;
and tree-cfg verification ICEs on the latter (though, even the first cast
is kind of undesirable after bitint lowering, we want large/huge bitints
lowered).  The ICE is because if BIT_FIELD_REFs first argument has
INTEGRAL_TYPE_P, we require type_has_mode_precision_p, but that is not the
case of _BitInt(256), it has BLKmode.

The following patch fixes it by doing the BIT_FIELD_REF with VCE to
BIT_FIELD_REF simplification only if the result is valid.

2023-11-24  Jakub Jelinek  

PR tree-optimization/112673
* match.pd (bit_field_ref (vce @0) -> bit_field_ref @0): Only simplify
if either @0 doesn't have scalar integral type or if it has mode
precision.

* gcc.dg/pr112673.c: New test.

[Bug c/112699] New: Should limits.h in freestanding environment be self-contained?

2023-11-24 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112699

Bug ID: 112699
   Summary: Should limits.h in freestanding environment be
self-contained?
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xry111 at gcc dot gnu.org
  Target Milestone: ---

Currently limits.h always includes the libc limits.h if it's available at build
time.  Should it be self-contained instead like stdint.h with -ffreestanding?

[Bug middle-end/112679] ICE in expand_float, at optabs.cc:5724 with bitint

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112679

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:31669ec1d01c93fb0a63a7053ad314c17fa5a416

commit r14-5818-g31669ec1d01c93fb0a63a7053ad314c17fa5a416
Author: Jakub Jelinek 
Date:   Fri Nov 24 11:30:30 2023 +0100

lower-bitint: Lower FLOAT_EXPR from BITINT_TYPE INTEGER_CST [PR112679]

The bitint lowering pass only does something if it sees BITINT_TYPE
(medium, large, huge) SSA_NAMEs.  In the past I've already run into one
special case where the above doesn't work well: if there is a store of a
medium/large/huge BITINT_TYPE INTEGER_CST into memory, there might not be
any BITINT_TYPE SSA_NAMEs in the function, yet we need to lower.  This has
been solved by also checking for SSA_NAME_IS_VIRTUAL_OPERAND if at the vdef
there isn't such a store (the whole intent is to make the pass as cheap as
possible in the currently very likely case that the IL doesn't have any
BITINT_TYPEs at all).
And the following testcase shows a similar problem.  With -frounding-math
we don't fold some of FLOAT_EXPRs with INTEGER_CST operands, and if those
INTEGER_CSTs are medium/large/huge BITINT_TYPEs, we need to either cast
the INTEGER_CST to corresponding INTEGER_TYPE (for medium) or lower to
internal fn call which is later turned into libgcc call (for large/huge).
The following patch does that, but of course admittedly this discovery
of stores and FLOAT_EXPRs means we already look through quite a few
SSA_NAME_DEF_STMTs even when BITINT_TYPEs never appear.

2023-11-23  Jakub Jelinek  

PR middle-end/112679
* gimple-lower-bitint.cc (gimple_lower_bitint): Also stop first loop on
floating point SSA_NAME set in FLOAT_EXPR assignment from BITINT_TYPE
INTEGER_CST.  Set has_large_huge for those if that BITINT_TYPE is large
or huge.  Set kind to such FLOAT_EXPR assignment rhs1 BITINT_TYPE's kind.

* gcc.dg/bitint-42.c: New test.

[Bug tree-optimization/112677] [14 Regression] ASAN reports stack-buffer-overflow in tree-vect-loop.cc vect_is_simple_use when compiling with -mavx512

2023-11-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112677

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Richard Biener  ---
Fixed.

[Bug other/86656] [meta-bug] Issues found with -fsanitize=address

2023-11-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86656
Bug 86656 depends on bug 112677, which changed state.

Bug 112677 Summary: [14 Regression] ASAN reports stack-buffer-overflow in 
tree-vect-loop.cc vect_is_simple_use when compiling with -mavx512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112677

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/112677] [14 Regression] ASAN reports stack-buffer-overflow in tree-vect-loop.cc vect_is_simple_use when compiling with -mavx512

2023-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112677

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:9f63a8898154473f7b773c3e2ed71e4959719b71

commit r14-5817-g9f63a8898154473f7b773c3e2ed71e4959719b71
Author: Richard Biener 
Date:   Fri Nov 24 10:04:15 2023 +0100

tree-optimization/112677 - stack corruption with .COND_* reduction

The following makes sure to allocate enough space for vectype_op
in vectorizable_reduction.

PR tree-optimization/112677
* tree-vect-loop.cc (vectorizable_reduction): Use alloca
to allocate vectype_op.

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #14 from Richard Biener  ---
(In reply to Richard Sandiford from comment #13)
> In vect_create_constant_vectors, I think the uniform_elt needs
> to come first, and needs to be used irrespective of whether the
> number of elements is constant.  The general tree_vector_builder
> has a more general pattern than 1 duplicated element.

can you take it from here since I have limited means to test?

[Bug target/112698] gcc-14-5617-gb8592186611 introduces regressions in bfloat16_vector_typecheck_1.c for cortex-m0 and cortex-m3

2023-11-24 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112698

--- Comment #1 from Christophe Lyon  ---
For gcc.target/arm/bfloat16_vector_typecheck* tests, the log says:
FAIL: gcc.target/arm/bfloat16_vector_typecheck_1.c (test for excess errors)
Excess errors:
bfloat16_vector_typecheck_1.c:122:17: error: incompatible types when
initializing type 'long int' using type 'bfloat16x4_t'
bfloat16_vector_typecheck_1.c:124:17: error: incompatible types when
initializing type 'long int' using type 'bfloat16x4_t'

FAIL: gcc.target/arm/bfloat16_vector_typecheck_2.c (test for excess errors)
Excess errors:
bfloat16_vector_typecheck_2.c:114:17: error: incompatible types when
initializing type 'long int' using type 'bfloat16x8_t'


For experimental/simd/pr109261_constexpr_simd.cc, the log says:
FAIL: experimental/simd/pr109261_constexpr_simd.cc -mfpu=neon
-mfloat-abi=softfp -march=armv7-a -ffast-math -O2 -Wno-psabi (test for excess
errors)
Excess errors:
simd_neon.h:332: error: cannot convert '__vector(2) int' to 'int32x2_t'
simd_neon.h:332: error: cannot convert '__vector(2) int' to 'int32x2_t'
simd_neon.h:497: error: cannot convert '__vector(2) int' to 'int32x2_t' in
initialization
simd_neon.h:497: error: cannot convert '__vector(2) int' to 'int32x2_t' in
initialization
simd_neon.h:497: error: cannot convert '__vector(2) int' to 'int32x2_t' in
initialization
simd_neon.h:497: error: cannot convert '__vector(2) int' to 'int32x2_t' in
initialization
simd_neon.h:497: error: cannot convert '__vector(2) int' to 'int32x2_t' in
initialization
simd_neon.h:497: error: cannot convert '__vector(2) int' to 'int32x2_t' in
initialization
simd_neon.h:497: error: cannot convert '__vector(2) int' to 'int32x2_t' in
initialization

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-24 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #13 from Richard Sandiford  ---
In vect_create_constant_vectors, I think the uniform_elt needs
to come first, and needs to be used irrespective of whether the
number of elements is constant.  The general tree_vector_builder
has a more general pattern than 1 duplicated element.

[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #47 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #46)
> So yes, the most clean solution would be to have __forgiving_memcpy
> possibly also allowing NULL pointers when n == 0 besides allowing
> the exact overlap.  Its prototype wouldn't have restrict then or
> nonnull.

That is basically what I've argued above (whatever it is called), and I
completely agree on the NULL pointers when n == 0 case for it as well; I think
that one came up again recently in the Honza vs. Jonathan libstdc++ discussions.
Ideally it could be implemented in libc as an alias to memcpy if e.g. the
assembly-written memcpy satisfies all the requirements, while valgrind
could implement it separately as if (n == 0 || dst == src) return
dst; return memcpy (dst, src, n);
Of course, it would take some time, because it needs to be in libc first and gcc
needs to key on the versions which have it.  But in that case gcc could possibly
also fold
if (n != 0)
  memcpy (dst, src, n);
into
  __forgiving_memcpy (dst, src, n);
or __builtin_* variants thereof.
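The fallback described above (return early for n == 0 or exact overlap, otherwise defer to memcpy) can be sketched as a plain C wrapper. The name `forgiving_memcpy_sketch` is hypothetical; no such libc function exists today, and the prototype deliberately omits `restrict` and nonnull, as suggested in the quoted comment.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the proposed semantics: tolerate NULL pointers when n == 0 and
   tolerate dst == src (exact overlap); everything else is a normal memcpy.
   Partial overlap remains undefined, exactly as for memcpy. */
void *forgiving_memcpy_sketch(void *dst, const void *src, size_t n)
{
    if (n == 0 || dst == src)
        return dst;
    return memcpy(dst, src, n);
}
```

A struct self-assignment such as `*p = *q` with `p == q` is the exact-overlap case from the bug title; with a helper like this the early return makes that case well defined without reaching memcpy at all.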

[Bug target/112698] New: gcc-14-5617-gb8592186611 introduces regressions in bfloat16_vector_typecheck_1.c for cortex-m0 and cortex-m3

2023-11-24 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112698

            Bug ID: 112698
           Summary: gcc-14-5617-gb8592186611 introduces regressions in
                    bfloat16_vector_typecheck_1.c for cortex-m0 and
                    cortex-m3
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: clyon at gcc dot gnu.org
          Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---
            Target: arm-eabi

Commit g:b8592186611b671d6dc47332ecaf4a4b9c3802fb introduced regressions on
arm-eabi as follows:

* for cortex-m0:
Running gcc:gcc.target/arm/arm.exp ...
FAIL: gcc.target/arm/bfloat16_vector_typecheck_1.c (test for errors, line 122)
FAIL: gcc.target/arm/bfloat16_vector_typecheck_1.c (test for errors, line 124)
FAIL: gcc.target/arm/bfloat16_vector_typecheck_1.c (test for excess errors)
FAIL: gcc.target/arm/bfloat16_vector_typecheck_2.c (test for errors, line 114)
FAIL: gcc.target/arm/bfloat16_vector_typecheck_2.c (test for excess errors)

* for cortex-m3:
Running gcc:gcc.target/arm/arm.exp ...
FAIL: gcc.target/arm/bfloat16_vector_typecheck_1.c (test for errors, line 122)
FAIL: gcc.target/arm/bfloat16_vector_typecheck_1.c (test for errors, line 124)
FAIL: gcc.target/arm/bfloat16_vector_typecheck_1.c (test for excess errors)
FAIL: gcc.target/arm/bfloat16_vector_typecheck_2.c (test for errors, line 114)
FAIL: gcc.target/arm/bfloat16_vector_typecheck_2.c (test for excess errors)

Running libstdc++:libstdc++-dg/conformance.exp ...
FAIL: experimental/simd/pr109261_constexpr_simd.cc -mfpu=neon -mfloat-abi=softfp -march=armv7-a -ffast-math -O2 -Wno-psabi (test for excess errors)

[Bug middle-end/111655] [11/12/13/14 Regression] wrong code generated for __builtin_signbit and 0./0. on x86-64 -O2

2023-11-24 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111655

--- Comment #14 from rguenther at suse dot de  ---
On Fri, 24 Nov 2023, amonakov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111655
> 
> --- Comment #13 from Alexander Monakov  ---
> > Then there is the MULT_EXPR x * x case
> 
> This is PR 111701.
> 
> It would be nice to clarify what "nonnegative" means in the contracts of this
> family of functions, because it's ambiguous for NaNs and negative zeros (x < 0
> is false while signbit is set, and x >= 0 is also false for positive NaNs).

Agreed, I think we're using it in both ways which is the problem in
the end.  Maybe having _compares_nonnegative and _sign_positive
would clarify this.
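
To make the two readings concrete, here is a small C sketch on plain doubles.
The function names borrow the suffixes suggested above but are hypothetical
helpers, not GCC internals (the real predicates operate on trees, not values).

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* "Nonnegative" in the comparison sense: true for -0.0,
   false for every NaN regardless of its sign bit.  */
bool compares_nonnegative(double x)
{
    return x >= 0.0;
}

/* "Nonnegative" in the sign-bit sense: false for -0.0,
   true for a NaN whose sign bit is clear.  */
bool sign_positive(double x)
{
    return !signbit(x);
}

/* Build a NaN with the sign bit clear without relying on how the
   NAN macro happens to be encoded on this platform.  */
double positive_nan(void)
{
    double x = NAN;
    return signbit(x) ? -x : x;
}

/* The two readings disagree on exactly the cases named above.  */
void check_nonnegative_disagreement(void)
{
    assert(compares_nonnegative(-0.0) && !sign_positive(-0.0));
    assert(!compares_nonnegative(positive_nan())
           && sign_positive(positive_nan()));
}
```

For ordinary positive values such as 1.0 both helpers return true, so the
ambiguity only shows up for negative zero and NaNs.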

[Bug middle-end/112697] [14 Regression] 30-40% exec time regression of 433.milc on zen2

2023-11-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112697

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0

--- Comment #2 from Richard Biener  ---
This big jump has been seen in the past, I wonder if it's one of those
micro-arch hazards regarding alignment.  The only "generic" change with
possibly ripple-down effects is r14-4965-ga5e69e94591ae2.

[Bug middle-end/111655] [11/12/13/14 Regression] wrong code generated for __builtin_signbit and 0./0. on x86-64 -O2

2023-11-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111655

--- Comment #13 from Alexander Monakov  ---
> Then there is the MULT_EXPR x * x case

This is PR 111701.

It would be nice to clarify what "nonnegative" means in the contracts of this
family of functions, because it's ambiguous for NaNs and negative zeros (x < 0
is false while signbit is set, and x >= 0 is also false for positive NaNs).

[Bug middle-end/111655] [11/12/13/14 Regression] wrong code generated for __builtin_signbit and 0./0. on x86-64 -O2

2023-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111655

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #12 from Jakub Jelinek  ---
Started with r6-3811-g68e57f040c6330eb853551622d458a67d6f9e572 on this testcase,
but as pointed out tree_binary_nonnegative_warnv_p needs to be more careful.
Given that tree_single_nonnegative_warnv_p may return true on REAL_CST NaN
literal with sign bit unset, one question is if e.g. such NaN +
nonnegative_value or nonnegative_value + such NaN could result in NaN result
with negative sign, in that case even PLUS_EXPR
  if (FLOAT_TYPE_P (type))
    return RECURSE (op0) && RECURSE (op1);
would be incorrect and we'd need to guard it with
  && !tree_expr_maybe_nan_p (op0) && !tree_expr_maybe_nan_p (op1).
Then there is the MULT_EXPR x * x case, that might suffer from similar problem
to PLUS_EXPR if NaN with positive sign * NaN with positive sign can result in
NaN with negative sign.  Then there is the MULT_EXPR RECURSE (op0) && RECURSE
(op1) case, that one can certainly result in possibly NaN if one operand is 0
and another +inf, so we'd need to guard it for FLOAT_TYPE_P with
  (tree_expr_nonzero_warnv_p (op0, strict_overflow_p)
   || !tree_expr_maybe_infinite_p (op1))
  && (tree_expr_nonzero_warnv_p (op1, strict_overflow_p)
      || !tree_expr_maybe_infinite_p (op0))
And then RDIV_EXPR, again corresponding checks.
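
The 0 * +inf concern and the shape of the proposed MULT_EXPR guard can be
sketched in standalone C. The struct and function names below are
hypothetical stand-ins for what the compiler knows about each operand; this
models the boolean structure of the guard only and is not GCC code.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical summary of the facts known about one operand.  */
struct operand_facts {
    bool nonnegative;     /* stands in for RECURSE (op)                  */
    bool nonzero;         /* stands in for tree_expr_nonzero_warnv_p     */
    bool maybe_infinite;  /* stands in for tree_expr_maybe_infinite_p    */
};

/* Same boolean structure as the guard quoted above: a zero operand
   must not be able to meet an infinite one.  */
bool mult_known_nonnegative(struct operand_facts op0,
                            struct operand_facts op1)
{
    return op0.nonnegative && op1.nonnegative
           && (op0.nonzero || !op1.maybe_infinite)
           && (op1.nonzero || !op0.maybe_infinite);
}

/* Both operands are nonnegative, yet the product is a NaN.
   volatile keeps the multiplication from being folded away.  */
void check_zero_times_inf(void)
{
    volatile double zero = 0.0;
    volatile double inf = INFINITY;
    assert(isnan(zero * inf));
}
```

A possibly-zero operand paired with a possibly-infinite one is rejected,
while a known-nonzero operand times +inf, or zero times a finite value, still
passes the guard.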

[Bug sanitizer/112353] asan-enabled, aarch64-gcc cross-compiled elf executables fail to run in qemu-user on x86

2023-11-24 Thread robert at bedrocksystems dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112353

--- Comment #3 from Robert  ---
Thanks for the information!
