[Bug target/108910] [12 Regression] Further ICE in aarch64_layout_arg

2023-04-14 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108910

--- Comment #16 from Andreas Schwab  ---
This breaks Ada:

/opt/gcc/gcc-20230414/Build/./gcc/xgcc -B/opt/gcc/gcc-20230414/Build/./gcc/
-B/usr/aarch64-suse-linux/bin/ -B/usr/aarch64-suse-linux/lib/ -isystem
/usr/aarch64-suse-linux/include -isystem /usr/aarch64-suse-linux/sys-include   
-c -g -O2  -fPIC -fno-lto  -W -Wall -gnatpg -nostdinc   s-bitfie.ads -o
s-bitfie.o
+===GNAT BUG DETECTED==+
| 13.0.1 20230414 (experimental) [master 48d0244b605] (aarch64-suse-linux) GCC error:|
| in aarch64_function_arg_alignment, at config/aarch64/aarch64.cc:7498 |
| Error detected at s-bituti.adb:195:7 [s-bitfie.ads:55:4] |
| Compiling s-bitfie.ads   |

[Bug ipa/109509] New: Huge compile time with forced inlining

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109509

Bug ID: 109509
   Summary: Huge compile time with forced inlining
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

There are several reports about TUs which heavily use always_inline to guide
inlining experiencing huge compile-time spent in IPA / integration.

[Bug tree-optimization/109491] [11/12 Regression] Segfault in tree-ssa-sccvn.cc:expressions_equal_p()

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491

--- Comment #13 from Richard Biener  ---
(In reply to Chip Kerchner from comment #12)
> > having always_inline across a deep call stack can exponentially increase 
> > compile-time
> 
> Do you think it would be worth requesting a feature to reduce the
> compilation times in situations like this?  Ideally exponentially is not a
> good thing.

Well, suppose you have

static __attribute__((always_inline)) inline void large_leaf () { /* large */ }

static __attribute__((always_inline)) inline void inter1 () { large_leaf (); }

static __attribute__((always_inline)) inline void inter2 () { inter1 (); inter1 (); }

static __attribute__((always_inline)) inline void inter3 () { inter2 (); inter2 (); }

void final () { inter3 (); inter3 (); }

then of course you end up with 8 copies of large_leaf in 'final' (you asked
for it).  Now, implementation wise it gets worse because we also fully
materialize the intermediate inter1, inter2 and inter3 with one and two
and four copies.  That's "only" double of the work but if it's single
call chains the overhead is larger.
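The copy counts above can be reproduced with a small model.  This is only
a sketch under stated assumptions, not GCC code: the function names and the
depth/fanout parameters are hypothetical, the lowest wrapper is assumed to
call the leaf once, and every level above it (including the final caller)
is assumed to call the level below it the same number of times.

```c
#include <assert.h>

/* Hypothetical model of a chain of always_inline wrappers.
   depth  = number of always_inline wrappers (inter1..interN)
   fanout = how many times each level calls the level below it;
            the lowest wrapper always calls the leaf exactly once.  */

/* Copies of the leaf body that end up in the final caller.  */
unsigned long copies_in_final(unsigned depth, unsigned fanout)
{
    assert(fanout >= 1);
    unsigned long n = 1;
    for (unsigned i = 0; i < depth; i++)
        n *= fanout;
    return n;
}

/* Copies of the leaf body materialized in the intermediate wrappers:
   1 + fanout + ... + fanout^(depth-1).  */
unsigned long copies_in_intermediates(unsigned depth, unsigned fanout)
{
    assert(fanout >= 1);
    unsigned long total = 0, n = 1;
    for (unsigned i = 0; i < depth; i++) {
        total += n;
        n *= fanout;
    }
    return total;
}
```

For the example above (three wrappers, two calls each) this gives 8 copies in
'final' plus 1 + 2 + 4 = 7 in the intermediates, so roughly double the work.
With a fanout of 1, the final caller needs one copy but the intermediates
hold three, which is the larger relative overhead for single call chains.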

There are specific cases where we could do better and IIRC some intermediate
updating of the costs blows up here as well (we build a "fat" callgraph
with inlined edges and inlined node clones).

In the end it requires somebody to sit down and see where to improve things
algorithmically - eventually eschewing the simple topological processing
for all inline candidates in favor of first resolving always-inlines in
the most optimal way, taking advantage of the fact that in principle
we do not need their bodies anymore.

I wasn't able to find a bug tracking this very specific issue, so I have
opened PR109509 for it.

[Bug tree-optimization/109502] [12/13 Regression] wrong code with -O -ftree-vectorize -fvect-cost-model=unlimited on aarch64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109502

--- Comment #3 from Richard Biener  ---
(In reply to Andrew Pinski from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > SLP transforms:
> > 
> >   g.0_1 = g;
> >   _2 = g.0_1 == 0;
> >   a_7 = (unsigned int) _2;
> >   _3 = a_7 % 6;
> >   _4 = _3 == 0;
> >   _5 = (unsigned int) _4;
> >   a_8 = _5 + a_7;
> > 
> > To:
> > 
> >   g.0_1 = g;
> >   _2 = g.0_1 == 0;
> >   a_7 = (unsigned int) _2;
> >   _3 = a_7 % 6;
> >   _15 = {_3, g.0_1};
> >   mask__4.4_16 = { 0, 0 } == _15;
> >   vect__5.5_19 = VIEW_CONVERT_EXPR(mask__4.4_16);
> >   _17 = BIT_FIELD_REF ;
> >   _18 = (bool) _17;
> >   _4 = _3 == 0;
> >   _5 = (unsigned int) _18;
> >   _20 = .REDUC_PLUS (vect__5.5_19);
> >   a_8 = _20;
> > 
> 
> If anything, there is a missing negation after the
> reduc_plus (or before) when it translates the bool comparisons into vector
> comparisons.

Indeed.  Usually this is the failure of bool pattern detection.

unsigned foo (unsigned *p)
{
  unsigned tem1 = p[0] == 0;
  unsigned tem2 = p[1] == 0;
  unsigned tem3 = p[2] == 0;
  unsigned tem4 = p[3] == 0;
  return tem1 + tem2 + tem3 + tem4;
}

doesn't reproduce it - we have put in defences "after the fact" to work
around this for some cases:

t.c:7:29: note:   ==> examining statement: tem4_16 = (unsigned int) _8;
t.c:7:29: note:   vect_is_simple_use: operand _7 == 0, type of def: internal
t.c:7:29: missed:   type conversion to/from bit-precision unsupported.
t.c:7:29: note:   vect_is_simple_use: operand _7 == 0, type of def: internal
t.c:7:29: missed:   mixed mask and nonmask vector types

[Bug target/108947] [13 Regression] wrong code with -O2 -fno-forward-propagate and vector compare on riscv64

2023-04-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108947

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:9d1a6119590ef828f9782a7083d03e535bc2f2cf

commit r13-7178-g9d1a6119590ef828f9782a7083d03e535bc2f2cf
Author: Jakub Jelinek 
Date:   Fri Apr 14 09:20:49 2023 +0200

combine: Fix AND handling for WORD_REGISTER_OPERATIONS targets [PR109040]

The following testcase is miscompiled on riscv since the addition
of *mvconst_internal define_insn_and_split.
We have:
(insn 36 35 39 2 (set (mem/c:SI (plus:SI (reg/f:SI 65 frame)
(const_int -64 [0xffc0])) [2  S4 A128])
(reg:SI 166)) "pr109040.c":9:11 178 {*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 166)
(nil)))
(insn 39 36 40 2 (set (reg:SI 171)
(zero_extend:SI (mem/c:HI (plus:SI (reg/f:SI 65 frame)
(const_int -64 [0xffc0])) [0  S2 A128])))
"pr109040.c":9:11 111 {*zero_extendhisi2}
 (nil))
and RTL DSE's replace_read since r0-86337-g18b526e806ab6455 handles
even different modes like in the above case, and so it optimizes it into:
(insn 47 35 39 2 (set (reg:HI 175)
(subreg:HI (reg:SI 166) 0)) "pr109040.c":9:11 179 {*movhi_internal}
 (expr_list:REG_DEAD (reg:SI 166)
(nil)))
(insn 39 47 40 2 (set (reg:SI 171)
(zero_extend:SI (reg:HI 175))) "pr109040.c":9:11 111
{*zero_extendhisi2}
 (expr_list:REG_DEAD (reg:HI 175)
(nil)))
Pseudo 166 is result of AND with 0x8084c constant (forced into a register).
Combine attempts to combine the AND with the insn 47 above created by DSE,
and turns it because of WORD_REGISTER_OPERATIONS and its assumption that all
the subword operations are actually done on word mode into:
(set (subreg:SI (reg:HI 175) 0)
(and:SI (reg:SI 167 [ m ])
(reg:SI 168)))
and later on the ZERO_EXTEND is thrown away.

We then see
(and:SI (subreg:SI (reg:HI 175) 0) (const_int 0x84c))
and optimize that into
(subreg:SI (and:HI (reg:HI 175) (const_int 0x84c)) 0)
which is still fine, in WORD_REGISTER_OPERATIONS the AND in HImode
will set all upper bits up to BITS_PER_WORD to zeros.

But later on simplify_binary_operation_1 or simplify_and_const_int_1
sees that because nonzero_bits ((reg:HI 175), HImode) == 0x84c, we can
optimize the AND into (reg:HI 175).  That isn't correct, because while
the low 16 bits of that REG are known to have all bits but 0x84c cleared,
we don't know that all the upper 16 bits are all clear as well.
So, for WORD_REGISTER_OPERATIONS for integral modes smaller than word mode,
we need to check all bits from word_mode in nonzero_bits for the
optimizations.

2023-04-14  Jeff Law  
Jakub Jelinek  

PR target/108947
PR target/109040
* combine.cc (simplify_and_const_int_1): Compute nonzero_bits in
word_mode rather than mode if WORD_REGISTER_OPERATIONS and mode is
smaller than word_mode.
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1)
: Likewise.

* gcc.dg/pr108947.c: New test.
* gcc.c-torture/execute/pr109040.c: New test.
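The unsound step described in the commit message above can be modelled on
concrete values.  This is a sketch under assumptions, not the combine.cc
logic: the helper names are hypothetical, a 32-bit word stands in for
word_mode, its low 16 bits stand in for the HImode pseudo, and a single
concrete register value is used in place of the conservative nonzero_bits
analysis.

```c
#include <assert.h>
#include <stdint.h>

/* Bits that may be nonzero when only the HImode part is inspected.  */
uint32_t nonzero_bits_himode(uint32_t word)
{
    return word & 0xffffu;
}

/* (and:SI (subreg:SI (reg:HI) 0) (const_int mask)) evaluated on the
   whole word, as a WORD_REGISTER_OPERATIONS target would do.  */
uint32_t and_in_word_mode(uint32_t word, uint32_t mask)
{
    return word & mask;
}

/* Unsound check: decides the AND is a no-op from HImode bits alone.  */
int and_is_redundant_himode(uint32_t word, uint32_t mask)
{
    assert(mask <= 0xffffu);
    return (nonzero_bits_himode(word) & ~mask) == 0;
}

/* Sound check for this model: looks at every bit of the word.  */
int and_is_redundant_word_mode(uint32_t word, uint32_t mask)
{
    return (word & ~mask) == 0;
}
```

With the word holding 0x8084c and a mask of 0x84c, the HImode-only check
reports the AND as redundant, yet dropping it would leave 0x8084c in the
register instead of 0x84c.  The word-mode check rejects the simplification.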

[Bug target/109504] Compilation fails with pragma GCC target sse4.1 and immintrin.h

2023-04-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109504

--- Comment #3 from Hongtao.liu  ---
From pr108883, maybe we should not restrict _Float16 under TARGET_SSE2.

Jakub Jelinek 2023-02-22 12:21:24 UTC
Created attachment 54506 [details]
gcc13-pr108883.patch

Untested fix on the compiler side of emit_support_tinfos.

That said, these fundamental types whose presence/absence depends on ISA flags
are quite problematic IMHO, as they are incompatible with the target
attribute/pragmas.  Whether they are available or not available depends on
whether in this case SSE2 is enabled during compiler initialization (aka after
parsing command line options) and then they are available or unavailable to
everything else based on that.

[Bug target/109040] [13 Regression] wrong code with v16hi compare & mask on riscv64 at -O2 since r13-4907-g2e886eef7f2b5a

2023-04-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109040

--- Comment #9 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:9d1a6119590ef828f9782a7083d03e535bc2f2cf

commit r13-7178-g9d1a6119590ef828f9782a7083d03e535bc2f2cf
Author: Jakub Jelinek 
Date:   Fri Apr 14 09:20:49 2023 +0200

combine: Fix AND handling for WORD_REGISTER_OPERATIONS targets [PR109040]

The following testcase is miscompiled on riscv since the addition
of *mvconst_internal define_insn_and_split.
We have:
(insn 36 35 39 2 (set (mem/c:SI (plus:SI (reg/f:SI 65 frame)
(const_int -64 [0xffc0])) [2  S4 A128])
(reg:SI 166)) "pr109040.c":9:11 178 {*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 166)
(nil)))
(insn 39 36 40 2 (set (reg:SI 171)
(zero_extend:SI (mem/c:HI (plus:SI (reg/f:SI 65 frame)
(const_int -64 [0xffc0])) [0  S2 A128])))
"pr109040.c":9:11 111 {*zero_extendhisi2}
 (nil))
and RTL DSE's replace_read since r0-86337-g18b526e806ab6455 handles
even different modes like in the above case, and so it optimizes it into:
(insn 47 35 39 2 (set (reg:HI 175)
(subreg:HI (reg:SI 166) 0)) "pr109040.c":9:11 179 {*movhi_internal}
 (expr_list:REG_DEAD (reg:SI 166)
(nil)))
(insn 39 47 40 2 (set (reg:SI 171)
(zero_extend:SI (reg:HI 175))) "pr109040.c":9:11 111
{*zero_extendhisi2}
 (expr_list:REG_DEAD (reg:HI 175)
(nil)))
Pseudo 166 is result of AND with 0x8084c constant (forced into a register).
Combine attempts to combine the AND with the insn 47 above created by DSE,
and turns it because of WORD_REGISTER_OPERATIONS and its assumption that all
the subword operations are actually done on word mode into:
(set (subreg:SI (reg:HI 175) 0)
(and:SI (reg:SI 167 [ m ])
(reg:SI 168)))
and later on the ZERO_EXTEND is thrown away.

We then see
(and:SI (subreg:SI (reg:HI 175) 0) (const_int 0x84c))
and optimize that into
(subreg:SI (and:HI (reg:HI 175) (const_int 0x84c)) 0)
which is still fine, in WORD_REGISTER_OPERATIONS the AND in HImode
will set all upper bits up to BITS_PER_WORD to zeros.

But later on simplify_binary_operation_1 or simplify_and_const_int_1
sees that because nonzero_bits ((reg:HI 175), HImode) == 0x84c, we can
optimize the AND into (reg:HI 175).  That isn't correct, because while
the low 16 bits of that REG are known to have all bits but 0x84c cleared,
we don't know that all the upper 16 bits are all clear as well.
So, for WORD_REGISTER_OPERATIONS for integral modes smaller than word mode,
we need to check all bits from word_mode in nonzero_bits for the
optimizations.

2023-04-14  Jeff Law  
Jakub Jelinek  

PR target/108947
PR target/109040
* combine.cc (simplify_and_const_int_1): Compute nonzero_bits in
word_mode rather than mode if WORD_REGISTER_OPERATIONS and mode is
smaller than word_mode.
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1)
: Likewise.

* gcc.dg/pr108947.c: New test.
* gcc.c-torture/execute/pr109040.c: New test.

[Bug target/108947] [13 Regression] wrong code with -O2 -fno-forward-propagate and vector compare on riscv64

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108947

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
 CC||jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek  ---
Fixed.

[Bug target/109504] [12/13 Regression] Compilation fails with pragma GCC target sse4.1 and immintrin.h

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109504

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |12.3
Summary|Compilation fails with  |[12/13 Regression]
   |pragma GCC target sse4.1|Compilation fails with
   |and immintrin.h |pragma GCC target sse4.1
   ||and immintrin.h

[Bug target/109508] [13 Regression] ICE: in extract_insn, at recog.cc:2791 with -mcpu=sifive-s76 on riscv64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109508

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.0

[Bug target/109508] [13 Regression] ICE: in extract_insn, at recog.cc:2791 with -mcpu=sifive-s76 on riscv64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109508

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression

2023-04-14 Thread klaus.doldinger64 at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476

--- Comment #15 from Wilhelm M  ---
Just checked current gcc 13.0.1 without the patch: no ICE occurs then.

[Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression

2023-04-14 Thread klaus.doldinger64 at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476

--- Comment #16 from Wilhelm M  ---
(In reply to Roger Sayle from comment #14)
> My apologies for the delay/issues.  My bootstrap and regression testing of
> this patch (on x86_64-pc-linux-gnu) revealed an issue or two (including the
> reported ICE).  My plan was to fix/resolve all these before posting a
> concrete submission to gcc-patches.  

We all appreciate your great effort on this! Please don't hesitate to post
patches here for testing. I'll be happy to test them!

[Bug tree-optimization/109502] [12/13 Regression] wrong code with -O -ftree-vectorize -fvect-cost-model=unlimited on aarch64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109502

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #4 from Richard Biener  ---
Simple omission.

[Bug target/109504] [12/13 Regression] Compilation fails with pragma GCC target sse4.1 and immintrin.h

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109504

--- Comment #4 from Jakub Jelinek  ---
Yeah.  Enable all the time and have say the
targetm.invalid_conversion, targetm.invalid_unary_op, targetm.invalid_binary_op
and something in argument/return value passing reject _Float16/__bf16 in
functions without SSE2.
That will not be enough though, we'll need to arrange e.g. for the spot where
we #undef/#define target macros based on currently active ISA in pragmas to also
do that for __STDCPP_FLOAT16_T__ and __STDCPP_BFLOAT16_T__ for C++, and change
libstdc++ such that for x86 it adds, similarly to x86 intrin headers, something like
#ifndef __SSE2__
#pragma GCC push_options
#pragma GCC target("sse2")
#define __DISABLE_SSE2__
#endif /* __SSE2__ */
...
#ifdef __DISABLE_SSE2__
#undef __DISABLE_SSE2__
#pragma GCC pop_options
#endif /* __DISABLE_SSE2__ */
around std::float16_t/std::bfloat16_t stuff.
Definitely not stage4 material I'm afraid.

[Bug middle-end/109495] Stack is used (unexpectedly) for copying on-heap objects (no problem in clang)

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109495

--- Comment #7 from Richard Biener  ---
Created attachment 54857
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54857&action=edit
patch

I have tested the attached successfully.  Can you think of a case where
we'd have a BLKmode target but a same size component ref source that
has not?  We could also check the components mode (but emulating what
get_inner_reference does is tricky - but it might be useful to split
the head of it out).

In any case if the concern is just padding it's odd that we have to massage
the _source_.  And if we have to, for larger sizes we could at least
block-copy the large head and just deal with the tail covering the padding
with a temporary.

[Bug target/109504] [12/13 Regression] Compilation fails with pragma GCC target sse4.1 and immintrin.h

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109504

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
Note, until then we could perhaps do something like:
--- gcc/config/i386/avx512fp16intrin.h.jj   2023-01-16 11:52:15.946736083 +0100
+++ gcc/config/i386/avx512fp16intrin.h  2023-04-14 10:13:30.054297659 +0200
@@ -28,6 +28,8 @@
 #ifndef __AVX512FP16INTRIN_H_INCLUDED
 #define __AVX512FP16INTRIN_H_INCLUDED

+#ifdef __FLT16_DIG__
+
 #ifndef __AVX512FP16__
 #pragma GCC push_options
 #pragma GCC target("avx512fp16")
@@ -7216,4 +7218,6 @@ _mm512_set1_pch (_Float16 _Complex __A)
 #pragma GCC pop_options
 #endif /* __DISABLE_AVX512FP16__ */

+#endif
+
 #endif /* __AVX512FP16INTRIN_H_INCLUDED */
--- gcc/config/i386/avx512fp16vlintrin.h.jj 2023-01-16 11:52:15.946736083 +0100
+++ gcc/config/i386/avx512fp16vlintrin.h    2023-04-14 10:13:54.152946427 +0200
@@ -28,6 +28,8 @@
 #ifndef __AVX512FP16VLINTRIN_H_INCLUDED
 #define __AVX512FP16VLINTRIN_H_INCLUDED

+#ifdef __FLT16_DIG__
+
 #if !defined(__AVX512VL__) || !defined(__AVX512FP16__)
 #pragma GCC push_options
 #pragma GCC target("avx512fp16,avx512vl")
@@ -3359,4 +3361,6 @@ _mm_set1_pch (_Float16 _Complex __A)
 #pragma GCC pop_options
 #endif /* __DISABLE_AVX512FP16VL__ */

+#endif
+
 #endif /* __AVX512FP16VLINTRIN_H_INCLUDED */
--- gcc/config/i386/avxneconvertintrin.h.jj 2023-01-16 11:52:15.949736039 +0100
+++ gcc/config/i386/avxneconvertintrin.h    2023-04-14 10:14:51.280113813 +0200
@@ -48,6 +48,7 @@ _mm256_bcstnebf16_ps (const void *__P)
   return (__m256) __builtin_ia32_vbcstnebf162ps256 ((const __bf16 *) __P);
 }

+#ifdef __FLT16_DIG__
 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_bcstnesh_ps (const void *__P)
@@ -61,6 +62,7 @@ _mm256_bcstnesh_ps (const void *__P)
 {
   return (__m256) __builtin_ia32_vbcstnesh2ps256 ((const _Float16 *) __P);
 }
+#endif

 extern __inline __m128
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))

i.e. if _Float16 type isn't usable, just hide all the _Float16 related stuff
from the headers.
But don't we need the same thing for __bf16 too?  There, I'm afraid, there
aren't macros for that.
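The same guard can be sketched in user code.  This is a minimal example
under assumptions, not part of the proposed patch: half_t, HAVE_FLOAT16,
widen_half and have_float16 are hypothetical names, the float fallback is
only for illustration, and __FLT16_DIG__ is assumed to be predefined exactly
when _Float16 is usable, as GCC does.

```c
#include <assert.h>

#ifdef __FLT16_DIG__
/* _Float16 is usable in this compilation.  */
#define HAVE_FLOAT16 1
typedef _Float16 half_t;
#else
/* _Float16 is unavailable: hide it behind a hypothetical float fallback.  */
#define HAVE_FLOAT16 0
typedef float half_t;
#endif

/* Widen a half_t value to float; this works with either definition.  */
float widen_half(half_t h)
{
    return (float) h;
}

/* Report which branch of the guard was taken.  */
int have_float16(void)
{
    return HAVE_FLOAT16;
}
```

Code written against half_t compiles either way, which is the effect the
header change aims for: nothing mentions _Float16 unless the type exists.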

[Bug target/96373] [10/11 Regression] SVE miscompilation on vectorized division loop, leading to FP exception

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96373

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
   Target Milestone|--- |11.4

[Bug c++/109505] Compiler loops forever to OOM while compiling evaluate_prg_hwy.cc in Chromium

2023-04-14 Thread jdapena at igalia dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109505

--- Comment #2 from Jose Dapena Paz  ---
Information collected:

### g++ -v

aarch64-poky-linux-g++ -v
Using built-in specs.
COLLECT_GCC=./home/dape/Development/yocto/meta-chromium/build/tmp/work/x86_64-linux/gcc-cross-aarch64/12.2.0-r0/recipe-sysroot-native/usr/bin/aarch64-poky-linux/aarch64-poky-linux-g++
COLLECT_LTO_WRAPPER=/home/dape/Development/yocto/meta-chromium/build/tmp/work/x86_64-linux/gcc-cross-aarch64/12.2.0-r0/image/home/dape/Development/yocto/meta-chromium/build/tmp/work/x86_64-linux/gcc-cross-aarch64/12.2.0-r0/recipe-sysroot-native/usr/bin/aarch64-poky-linux/../../libexec/aarch64-poky-linux/gcc/aarch64-poky-linux/12.2.0/lto-wrapper
Target: aarch64-poky-linux
Configured with:
../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/configure
--build=x86_64-linux --host=x86_64-linux --target=aarch64-poky-linux
--prefix=/host-native/usr --exec_prefix=/host-native/usr
--bindir=/host-native/usr/bin/aarch64-poky-linux
--sbindir=/host-native/usr/bin/aarch64-poky-linux
--libexecdir=/host-native/usr/libexec/aarch64-poky-linux
--datadir=/host-native/usr/share --sysconfdir=/host-native/etc
--sharedstatedir=/host-native/com --localstatedir=/host-native/var
--libdir=/host-native/usr/lib/aarch64-poky-linux
--includedir=/host-native/usr/include --oldincludedir=/host-native/usr/include
--infodir=/host-native/usr/share/info --mandir=/host-native/usr/share/man
--disable-silent-rules --disable-dependency-tracking
--with-libtool-sysroot=/host-native --enable-clocale=generic --with-gnu-ld
--enable-shared --enable-languages=c,c++ --enable-threads=posix
--disable-multilib --enable-default-pie --enable-c99 --enable-long-long
--enable-symvers=gnu --enable-libstdcxx-pch
--program-prefix=aarch64-poky-linux- --without-local-prefix
--disable-install-libiberty --disable-libssp --enable-libitm --enable-lto
--disable-bootstrap --with-system-zlib --with-linker-hash-style=sysv
--enable-linker-build-id --with-ppl=no --with-cloog=no
--enable-checking=release --enable-cheaders=c_global --without-isl
--with-gxx-include-dir=/not/exist/usr/include/c++/12.2.0
--with-sysroot=/not/exist --with-build-sysroot=/host
--enable-standard-branch-protection --enable-poison-system-directories=error
--with-system-zlib --disable-static --disable-nls --with-glibc-version=2.28
--enable-initfini-array --enable-__cxa_atexit
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (GCC) 

### Extracted information

 GCC version

12.2.0

 System type

Toolchain built by Yocto Langdale with the mentioned steps, on Ubuntu 22.10

 GCC configure options

Configured with:
../../../../../../work-shared/gcc-12.2.0-r0/gcc-12.2.0/configure
--build=x86_64-linux --host=x86_64-linux --target=aarch64-poky-linux
--prefix=/host-native/usr --exec_prefix=/host-native/usr
--bindir=/host-native/usr/bin/aarch64-poky-linux
--sbindir=/host-native/usr/bin/aarch64-poky-linux
--libexecdir=/host-native/usr/libexec/aarch64-poky-linux
--datadir=/host-native/usr/share --sysconfdir=/host-native/etc
--sharedstatedir=/host-native/com --localstatedir=/host-native/var
--libdir=/host-native/usr/lib/aarch64-poky-linux
--includedir=/host-native/usr/include --oldincludedir=/host-native/usr/include
--infodir=/host-native/usr/share/info --mandir=/host-native/usr/share/man
--disable-silent-rules --disable-dependency-tracking
--with-libtool-sysroot=/host-native --enable-clocale=generic --with-gnu-ld
--enable-shared --enable-languages=c,c++ --enable-threads=posix
--disable-multilib --enable-default-pie --enable-c99 --enable-long-long
--enable-symvers=gnu --enable-libstdcxx-pch
--program-prefix=aarch64-poky-linux- --without-local-prefix
--disable-install-libiberty --disable-libssp --enable-libitm --enable-lto
--disable-bootstrap --with-system-zlib --with-linker-hash-style=sysv
--enable-linker-build-id --with-ppl=no --with-cloog=no
--enable-checking=release --enable-cheaders=c_global --without-isl
--with-gxx-include-dir=/not/exist/usr/include/c++/12.2.0
--with-sysroot=/not/exist --with-build-sysroot=/host
--enable-standard-branch-protection --enable-poison-system-directories=error
--with-system-zlib --disable-static --disable-nls --with-glibc-version=2.28
--enable-initfini-array --enable-__cxa_atexit

 Complete command line that triggers the bug

aarch64-poky-linux-g++  -mcpu=cortex-a72 -march=armv8-a+crc
-fstack-protector-strong   -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security
-Werror=format-security
--sysroot=/home/dape/Development/yocto/meta-chromium/build/tmp/work/cortexa72-poky-linux/chromium-dev/114.0.5696.0-r0/recipe-sysroot
-MMD -MF
obj/third_party/distributed_point_functions/distributed_point_functions/evaluate_prg_hwy.o.d
-DUSE_UDEV -DUSE_AURA=1 -DUSE_GLIB=1 -DUSE_OZONE=1 -DOFFICIAL_BUILD
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE
-DNO_UNWIND_TABLES -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0
-DGLIB_VERSION_MAX_ALLOWED=GLIB_VERSION_2_56
-DGLIB_VERSION_MIN_REQ

[Bug c++/109505] Compiler loops forever to OOM while compiling evaluate_prg_hwy.cc in Chromium

2023-04-14 Thread jdapena at igalia dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109505

--- Comment #3 from Jose Dapena Paz  ---
Created attachment 54858
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54858&action=edit
evaluate_prg_hwy.ii (compressed with gzip)

[Bug bootstrap/109510] New: [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

Bug ID: 109510
   Summary: [13 Regression] bootstrap with Ada broken on aarch64
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

The fix for PR108910 reportedly breaks bootstrap with Ada enabled

/opt/gcc/gcc-20230414/Build/./gcc/xgcc -B/opt/gcc/gcc-20230414/Build/./gcc/
-B/usr/aarch64-suse-linux/bin/ -B/usr/aarch64-suse-linux/lib/ -isystem
/usr/aarch64-suse-linux/include -isystem /usr/aarch64-suse-linux/sys-include   
-c -g -O2  -fPIC -fno-lto  -W -Wall -gnatpg -nostdinc   s-bitfie.ads -o
s-bitfie.o
+===GNAT BUG DETECTED==+
| 13.0.1 20230414 (experimental) [master 48d0244b605] (aarch64-suse-linux) GCC error:|
| in aarch64_function_arg_alignment, at config/aarch64/aarch64.cc:7498 |
| Error detected at s-bituti.adb:195:7 [s-bitfie.ads:55:4] |
| Compiling s-bitfie.ads   |

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.0
   Keywords||build
   Priority|P3  |P1
 Target||aarch64

[Bug target/108910] [12 Regression] Further ICE in aarch64_layout_arg

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108910

--- Comment #17 from Richard Biener  ---
(In reply to Andreas Schwab from comment #16)
> This breaks Ada:
> 
> /opt/gcc/gcc-20230414/Build/./gcc/xgcc -B/opt/gcc/gcc-20230414/Build/./gcc/
> -B/usr/aarch64-suse-linux/bin/ -B/usr/aarch64-suse-linux/lib/ -isystem
> /usr/aarch64-suse-linux/include -isystem /usr/aarch64-suse-linux/sys-include
> -c -g -O2  -fPIC -fno-lto  -W -Wall -gnatpg -nostdinc   s-bitfie.ads -o
> s-bitfie.o
> +===GNAT BUG DETECTED======+
> | 13.0.1 20230414 (experimental) [master 48d0244b605] (aarch64-suse-linux) GCC error:|
> | in aarch64_function_arg_alignment, at config/aarch64/aarch64.cc:7498 |
> | Error detected at s-bituti.adb:195:7 [s-bitfie.ads:55:4] |
> | Compiling s-bitfie.ads   |

tracking this in PR109510 now.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

Richard Biener  changed:

   What|Removed |Added

 CC||ebotcazou at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
Maybe Eric can clarify which type kinds in Ada can have TYPE_USER_ALIGN and
_not_ a TYPE_MAIN_VARIANT without.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

Eric Botcazou  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-04-14
 Status|UNCONFIRMED |NEW

--- Comment #2 from Eric Botcazou  ---
> Maybe Eric can clarify which type kinds in Ada can have TYPE_USER_ALIGN and
> _not_ a TYPE_MAIN_VARIANT without.

All of them, TYPE_USER_ALIGN is supposed to be orthogonal to type variants.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #3 from Richard Biener  ---
(In reply to Eric Botcazou from comment #2)
> > Maybe Eric can clarify which type kinds in Ada can have TYPE_USER_ALIGN and
> > _not_ a TYPE_MAIN_VARIANT without.
> 
> All of them, TYPE_USER_ALIGN is supposed to be orthogonal to type variants.

How do you get at the alignment the type would have when the user didn't
specify it?  That's what the call ABI is supposed to look at.

/* 1 if the alignment for this type was requested by "aligned" attribute,
   0 if it is the default for this type.  */
#define TYPE_USER_ALIGN(NODE) (TYPE_CHECK (NODE)->base.u.bits.user_align)

so what's the "default" alignment?

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #4 from Richard Biener  ---
(In reply to Richard Biener from comment #3)
> (In reply to Eric Botcazou from comment #2)
> > > Maybe Eric can clarify which type kinds in Ada can have TYPE_USER_ALIGN and
> > > _not_ a TYPE_MAIN_VARIANT without.
> > 
> > All of them, TYPE_USER_ALIGN is supposed to be orthogonal to type variants.
> 
> How do you get at the alignment the type would have when the user didn't
> specify it?  That's what the call ABI is supposed to look at.
> 
> /* 1 if the alignment for this type was requested by "aligned" attribute,
>0 if it is the default for this type.  */
> #define TYPE_USER_ALIGN(NODE) (TYPE_CHECK (NODE)->base.u.bits.user_align)
> 
> so what's the "default" alignment?

Btw, we're talking about non-aggregate types here.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #5 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #3)
> (In reply to Eric Botcazou from comment #2)
> > > Maybe Eric can clarify which type kinds in Ada can have TYPE_USER_ALIGN and
> > > _not_ a TYPE_MAIN_VARIANT without.
> > 
> > All of them, TYPE_USER_ALIGN is supposed to be orthogonal to type variants.
> 
> How do you get at the alignment the type would have when the user didn't
> specify it?  That's what the call ABI is supposed to look at.
> 
> /* 1 if the alignment for this type was requested by "aligned" attribute,
>0 if it is the default for this type.  */
> #define TYPE_USER_ALIGN(NODE) (TYPE_CHECK (NODE)->base.u.bits.user_align)
> 
> so what's the "default" alignment?

What stor-layout.cc sets it to.
For scalar types on targets which don't define ROUND_TYPE_ALIGN, I think it is
usually (see stor-layout.cc (finalize_type_size))
GET_MODE_ALIGNMENT (TYPE_MODE (type)), for vector types
targetm.vector_alignment (type).
AGGREGATE_TYPE_Ps should have their specified user alignment honored.

The reason why we shouldn't take user alignment into account for ABI argument
and return value passing is e.g. because useless_type_conversion_p doesn't
really take that user alignment on the scalar types into account.
E.g. integer type conversions are considered equal if the types have the same
mode, the same signedness and the same precision, scalar floats if they have
the same mode, pointers/references if they have the same mode and the same
address space; vectors have some extra rules as well, but none of that takes
the alignment into account.  So the exact type you get in the call and the
exact type the callee expects might have different alignment, and it is
(solely) the default alignment that should matter there.

[Bug tree-optimization/109502] [12/13 Regression] wrong code with -O -ftree-vectorize -fvect-cost-model=unlimited on aarch64

2023-04-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109502

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:bf24f2db2841b97bc5e86bf9294d61eef32f83b3

commit r13-7180-gbf24f2db2841b97bc5e86bf9294d61eef32f83b3
Author: Richard Biener 
Date:   Fri Apr 14 09:55:27 2023 +0200

tree-optimization/109502 - vector conversion between mask and non-mask

The following fixes a check that should have rejected vectorizing
a conversion between a mask and non-mask type.  Those should be
done via pattern statements.

PR tree-optimization/109502
* tree-vect-stmts.cc (vectorizable_assignment): Fix
check for conversion between mask and non-mask types.

* gcc.dg/vect/pr109502.c: New testcase.

[Bug tree-optimization/109502] [12 Regression] wrong code with -O -ftree-vectorize -fvect-cost-model=unlimited on aarch64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109502

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[12/13 Regression] wrong    |[12 Regression] wrong code
                   |code with -O                |with -O -ftree-vectorize
                   |-ftree-vectorize            |-fvect-cost-model=unlimited
                   |-fvect-cost-model=unlimited |on aarch64
                   |on aarch64                  |
           Priority|P3                          |P2
      Known to fail|13.0                        |
      Known to work|                            |13.0

--- Comment #6 from Richard Biener  ---
Should be fixed on trunk.  Is probably latent on older branches as well.

[Bug c++/109359] [12/13 Regression] Compile-time rounding of double literal to float is incorrect with -frounding-math

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109359

--- Comment #4 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #3)
> I think the fold-const.cc change is right though.
> I wonder if for constant evaluation (constexpr, constinit) we shouldn't
> arrange for those to be evaluated with temporarily -fno-rounding-math, I
> think C uses
> fold_init and its START_FOLD_INIT ... END_FOLD_INIT for this purpose..
> And otherwise perhaps we want dynamic initialization and do the conversion
> at runtime?
> Or disable the -frounding-math for all initializer folding?
> What we emit is definitely wrong,
> Variable which claims to have 8 bytes in size but actually has 16 under the
> hood, with constants in different mode.

We should have ICEd emitting the constant.  And yes, I think -frounding-math
should be disabled for constinit initializer folding (and possibly whether
it is const or not should not depend on -frounding-math).

[Bug fortran/109511] New: incorrectly rejects set_exponent with constant negative exponent

2023-04-14 Thread sebastien at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109511

            Bug ID: 109511
           Summary: incorrectly rejects set_exponent with constant
                    negative exponent
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sebastien at debian dot org
  Target Milestone: ---

The following program fails to compile:

program exponent
  print *, set_exponent(1., -1)
end program exponent

With the following error message:

exponent.f08:2:24:

2 |   print *, set_exponent(1., -1)
  |1
Error: Result of SET_EXPONENT overflows its kind at (1)

But as I understand it, this is a valid program.

Note that the problem does not occur if the exponent is positive. It also does
not occur if the exponent is variable (even with a negative value). So it looks
like the static analyzer is overly restrictive.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #6 from Eric Botcazou  ---
> Btw, we're talking about non-aggregate types here.

Right, I agree that this is unexpected for them, let me investigate.

[Bug c++/109359] [12/13 Regression] Compile-time rounding of double literal to float is incorrect with -frounding-math

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109359

--- Comment #5 from Jakub Jelinek  ---
Ah, cp/constexpr.cc already uses fold_binary_initializer_loc if
-fconstexpr-fp-except.
That will turn the -frounding-math temporarily off for binary operations.
For this PR guess we need to use fold_init or fold_build1_initializer_loc
instead of fold or fold_build1 in the NOP_EXPR handling under the same
conditions.
Plus of course we need to figure out how to fix the issue if we didn't turn
those off (we shouldn't consider the initializer constant then but evaluate
dynamically).

[Bug c++/109359] [12/13 Regression] Compile-time rounding of double literal to float is incorrect with -frounding-math

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109359

--- Comment #6 from Richard Biener  ---
output_constant gets called with

{(float) 1.91390279997419406754488591104745864868164062e-3, (float)
6.3053899606217215841752476990222930908203125e-1}

it then eventually does

  /* Eliminate any conversions since we'll be outputting the underlying
     constant.  */
  while (CONVERT_EXPR_P (exp)
         || TREE_CODE (exp) == NON_LVALUE_EXPR
         || TREE_CODE (exp) == VIEW_CONVERT_EXPR)
    {
      HOST_WIDE_INT type_size = int_size_in_bytes (TREE_TYPE (exp));
      HOST_WIDE_INT op_size
        = int_size_in_bytes (TREE_TYPE (TREE_OPERAND (exp, 0)));

      /* Make sure eliminating the conversion is really a no-op, except with
         VIEW_CONVERT_EXPRs to allow for wild Ada unchecked conversions and
         union types to allow for Ada unchecked unions.  */
      if (type_size > op_size
          && TREE_CODE (exp) != VIEW_CONVERT_EXPR
          && TREE_CODE (TREE_TYPE (exp)) != UNION_TYPE)
        /* Keep the conversion. */
        break;
      else
        exp = TREE_OPERAND (exp, 0);
    }

where we strip conversions with type_size < op_size (aka float from double).
For float conversions not sure if just keying on type size is good enough
though (ibm double double vs long double 128 for example).

Fixing that "improves" the behavior to

t.ii:1:34: error: initializer for floating value is not a floating constant
1 | float xs[] = {0.001914, 0.630539};
  |  ^
t.ii:1:34: error: initializer for floating value is not a floating constant

aka from wrong-code to rejects-valid.

diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index cd0cd88321c..e6ab581dc5f 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -5202,7 +5202,7 @@ output_constant (tree exp, unsigned HOST_WIDE_INT size,
unsigned int align,
   /* Make sure eliminating the conversion is really a no-op, except with
 VIEW_CONVERT_EXPRs to allow for wild Ada unchecked conversions and
 union types to allow for Ada unchecked unions.  */
-  if (type_size > op_size
+  if (type_size != op_size
  && TREE_CODE (exp) != VIEW_CONVERT_EXPR
  && TREE_CODE (TREE_TYPE (exp)) != UNION_TYPE)
/* Keep the conversion. */

note that for integral and pointer types we do

  cst = expand_expr (exp, NULL_RTX, VOIDmode, EXPAND_INITIALIZER);
  if (reverse)
cst = flip_storage_order (TYPE_MODE (TREE_TYPE (exp)), cst);
  if (!assemble_integer (cst, MIN (size, thissize), align, 0))
error ("initializer for integer/fixed-point value is too complicated");

so we handle "narrowing" in a weird way.  So in case FEs leave around
nop-casts the following should be safer

diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index cd0cd88321c..81f7288449c 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -5196,13 +5196,17 @@ output_constant (tree exp, unsigned HOST_WIDE_INT size,
unsigned int align,
 || TREE_CODE (exp) == NON_LVALUE_EXPR
 || TREE_CODE (exp) == VIEW_CONVERT_EXPR)
 {
-  HOST_WIDE_INT type_size = int_size_in_bytes (TREE_TYPE (exp));
-  HOST_WIDE_INT op_size = int_size_in_bytes (TREE_TYPE (TREE_OPERAND (exp,
0)));
+  tree type = TREE_TYPE (exp);
+  tree op_type = TREE_TYPE (TREE_OPERAND (exp, 0));
+  HOST_WIDE_INT type_size = int_size_in_bytes (type);
+  HOST_WIDE_INT op_size = int_size_in_bytes (op_type);

   /* Make sure eliminating the conversion is really a no-op, except with
 VIEW_CONVERT_EXPRs to allow for wild Ada unchecked conversions and
 union types to allow for Ada unchecked unions.  */
-  if (type_size > op_size
+  if ((type_size > op_size
+  || (TYPE_MAIN_VARIANT (type) != TYPE_MAIN_VARIANT (op_type)
+  && FLOAT_TYPE_P (type)))
  && TREE_CODE (exp) != VIEW_CONVERT_EXPR
  && TREE_CODE (TREE_TYPE (exp)) != UNION_TYPE)
/* Keep the conversion. */

The real fix is of course in the frontend, the above is just a safety net.

[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88

2023-04-14 Thread chenglulu at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #16 from chenglulu  ---
(In reply to rguent...@suse.de from comment #15)
> On Thu, 13 Apr 2023, xry111 at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
> > 
> > --- Comment #14 from Xi Ruoyao  ---
> > (In reply to rguent...@suse.de from comment #13)
> > > On Thu, 13 Apr 2023, chenglulu at loongson dot cn wrote:
> > > 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
> > > > 
> > > > --- Comment #10 from chenglulu  ---
> > > > (In reply to Xi Ruoyao from comment #5)
> > > > > The test fails on loongarch64-linux-gnu.  foo is kept in 
> > > > > 114t.threadfull1,
> > > > > but removed in 135t.forwprop3.
> > > > > 
> > > > > Does this mean something is wrong for LoongArch, or we should simply 
> > > > > check
> > > > > the tree dump in a later pass (for e.g. 254t.optimized)?
> > > > 
> > > > If the definition of the macro DEFAULT_SIGNED_CHAR is changed to 0, the 
> > > > test
> > > > case can pass the test. I guess it is because the definition of
> > > > DEFAULT_SIGNED_CHAR affects the optimization of the ccp pass, resulting 
> > > > in some
> > > > blocks that cannot be removed, resulting in the failure of this test 
> > > > case.
> > > 
> > > Can you check if making b unsigned fixes the test for you?  If so
> > > that's what we should do.
> > 
> > It works?
> > 
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> > index 44c457b7a97..79cf371ef28 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> > @@ -1,7 +1,7 @@
> >  /* { dg-do compile } */
> >  /* { dg-options "-O2 -fdump-tree-threadfull1" } */
> > 
> > -static char b;
> > +static unsigned char b;
> >  static unsigned c;
> >  void foo();
> >  short(a)(short d, short e) { return d * e; }
> > 
> > But I'm still wondering why this is not an issue for x86_64.
> 
> Yes, that's interesting to see.  It does change how b is extended
> in b ^ 9854 (but for the value zero it doesn't matter).

I think the problem is here:
In adjust_alignment, the intermediate result output of loongarch and x86 is as
follows:

LoongArch:
  ...
  b.2_1 = bD.2176;
  # RANGE [irange] short int [-128, 127]
  _2 = (short intD.12) b.2_1;
  # RANGE [irange] short int [-16384, -1][1, 16383]
  _3 = _2 ^ 9854;
  # RANGE [irange] unsigned short [1, 16383][49152, +INF]
  e.1_6 = (unsigned short) _3;
  _7 = e.1_6 * 5;
  _8 = (short intD.12) _7;
  # .MEM_15 = VDEF <.MEM_4(D)>
  bD.2176 = 0;
  if (_8 != 0)
goto ; [67.00%]
  else
goto ; [33.00%]
  ...
c.4_9 = 0;
  _10 = c.4_9 == 0;
  # RANGE [irange] int [0, 1] NONZERO 0x1
  _11 = (intD.1) _10;
  # RANGE [irange] int [-32768, -1][1, 32767]
  _12 = (intD.1) _8;
 ...

X86:
  ...
  b.2_1 = bD.2738;
  # RANGE [irange] short int [-128, 127]
  _2 = (short intD.17) b.2_1;
  # RANGE [irange] short int [-16384, -1][1, 16383]
  _3 = _2 ^ 9854;
  # RANGE [irange] unsigned short [1, 16383][49152, +INF]
  e.1_7 = (unsigned short) _3;
  _8 = e.1_7 * 5;
  _9 = (short intD.17) _8;
  # RANGE [irange] int [-32768, 32767]
  _4 = (intD.6) _9;
  d_10 = (short intD.17) _4;
  # .MEM_17 = VDEF <.MEM_5(D)>
  bD.2738 = 0;
  if (d_10 != 0)
goto ; [67.00%]
  else
goto ; [33.00%]
  ...


x86 has an additional intermediate variable (_9) that LoongArch does not have.
LoongArch uses _8 directly instead, and _8 is used twice, so
  if (_8 != 0)
goto ; [67.00%]
  else
goto ; [33.00%]
is not deleted by the ccp2 pass.
That's why the test case fails. I think if LoongArch could generate an
intermediate variable like x86 does, the test would pass.

[Bug c++/109359] [12/13 Regression] Compile-time rounding of double literal to float is incorrect with -frounding-math

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109359

--- Comment #7 from Jakub Jelinek  ---
For the safety net I'd compare TYPE_MODE of the SCALAR_FLOAT_TYPE_Ps, that is
what matters for those whether it is a noop conversion or needs actually some
runtime adjustment.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #7 from Jakub Jelinek  ---
In patch form what I wrote above (completely untested):
--- gcc/config/aarch64/aarch64.cc.jj2023-04-14 09:15:08.470312336 +0200
+++ gcc/config/aarch64/aarch64.cc   2023-04-14 12:08:59.785137542 +0200
@@ -7459,6 +7459,8 @@ aarch64_vfp_is_call_candidate (cumulativ
  nregs, NULL, pcum->silent_p);
 }

+static HOST_WIDE_INT aarch64_simd_vector_alignment (const_tree);
+
 /* Given MODE and TYPE of a function argument, return the alignment in
bits.  The idea is to suppress any stronger alignment requested by
the user and opt for the natural alignment (specified in AAPCS64 \S
@@ -7487,16 +7489,16 @@ aarch64_function_arg_alignment (machine_
 {
   /* The ABI alignment is the natural alignment of the type, without
 any attributes applied.  Normally this is the alignment of the
-TYPE_MAIN_VARIANT, but not always; see PR108910 for a counterexample.
-For now we just handle the known exceptions explicitly.  */
+TYPE_MAIN_VARIANT, but not always; see PR108910 for a
+counterexample.  */
+  unsigned int ret;
   type = TYPE_MAIN_VARIANT (type);
-  if (POINTER_TYPE_P (type))
-   {
- gcc_assert (known_eq (POINTER_SIZE, GET_MODE_BITSIZE (mode)));
- return POINTER_SIZE;
-   }
-  gcc_assert (!TYPE_USER_ALIGN (type));
-  return TYPE_ALIGN (type);
+  if (VECTOR_TYPE_P (type))
+   ret = aarch64_simd_vector_alignment (type);
+  else
+   ret = GET_MODE_ALIGNMENT (TYPE_MODE (type));
+  gcc_assert (TYPE_USER_ALIGN (type) || ret == TYPE_ALIGN (type));
+  return ret;
 }

   if (TREE_CODE (type) == ARRAY_TYPE)

[Bug fortran/104272] finalizer gets called during allocate

2023-04-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104272

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:b0e85485fbf042abccee5c0a9eb499da386c8db3

commit r13-7181-gb0e85485fbf042abccee5c0a9eb499da386c8db3
Author: Paul Thomas 
Date:   Fri Apr 14 11:14:00 2023 +0100

Fortran: Fix an excess finalization during allocation [PR104272]

2023-04-14  Paul Thomas  

gcc/fortran
PR fortran/104272
* gfortran.h : Add expr3_not_explicit bit field to gfc_code.
* resolve.cc (resolve_allocate_expr): Set bit field when the
default initializer is applied to expr3.
* trans-stmt.cc (gfc_trans_allocate): If expr3_not_explicit is
set, do not deallocate expr3.

gcc/testsuite/
PR fortran/104272
* gfortran.dg/class_result_8.f90 : Number of builtin_frees down
from 6 to 5 without memory leaks.
* gfortran.dg/finalize_52.f90: New test

[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #17 from Richard Biener  ---
Isn't this the same issue as seen in another bug, most targets defining
TARGET_PROMOTE_PROTOTYPES to hook_bool_const_tree_true but loongarch not?
That will cause those conversions to be missed.

[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88

2023-04-14 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #18 from Xi Ruoyao  ---
(In reply to Richard Biener from comment #17)
> Isn't this the same issue as seen in another bug, most targets defining
> TARGET_PROMOTE_PROTOTYPES to hook_bool_const_tree_true but loongarch not?
> That will cause those conversions to be missed.

Looks like we should define it, as our psABI says:

In most cases, the unsigned integer data types are zero-extended when stored in
general-purpose register, and the signed integer data types are sign-extended.
However, in the LP64D ABI, unsigned 32-bit types, such as unsigned int, are
stored in general-purpose registers as proper sign extensions of their 32-bit
values.

IIUC it matches the semantics of TARGET_PROMOTE_PROTOTYPES.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #8 from rsandifo at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #7)
> In patch form what I wrote above (completely untested):
Sorry in advance for the overly verbose comment, but the timeline here was
that:

- PR108411 fixed an ABI problem exposed by struct-layout-1 trying something
new.

- The fix for PR108411 also included some extra asserts to try to prevent
similar such problems being silent failures in future.

- The breakage in PR108910 was caused by the extra asserts rather than the
change in behaviour (which meant that the asserts were doing their job :)).  I
think it would be possible to construct a wrong-code bug for GCC 11 based on
this.

- The fix for PR108910 handled that case, but in a way that is intended to be
backwards-compatible, since the bug was really an internal inconsistency rather
than a specification vs. implementation discrepancy.

- The fix for PR108910 also included extra asserts, because I can't help
myself.

- This PR was again caused by the extra asserts, rather than by the code
change.

My laboured point being, it's only really the first bullet point that is (or
was intended to be) a change in the implemented ABI.  Otherwise we've tried to
keep the implementation (and implementation method) the same as far as
possible.  The extra asserts are just to make sure that we know what's going
on.

Because of that, I'm wary of changing the general way that we calculate the
alignment.  These asserts have been firing because of unexpected corner cases. 
At least as things stand, we'll err on the side of doing what earlier compilers
did in the face of those corner cases, including those where TYPE_USER_ALIGN is
set.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #9 from Eric Botcazou  ---
> How do you get at the alignment the type would have when the user didn't
> specify it?  That's what the call ABI is supposed to look at.
> 
> /* 1 if the alignment for this type was requested by "aligned" attribute,
>0 if it is the default for this type.  */
> #define TYPE_USER_ALIGN(NODE) (TYPE_CHECK (NODE)->base.u.bits.user_align)
> 
> so what's the "default" alignment?

Note that the wording is ambiguous if the "aligned" attribute requests the same
alignment as the default for the type, which is the case here:

   type Val is mod 2**Val_Bits with Alignment => Val_Bytes;

The alignment clause is confirming in Ada parlance, which means that it does
not change anything.  We set TYPE_USER_ALIGN in this case, but I agree that we
might as well not set it.  Tentative patch to be attached.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #10 from Eric Botcazou  ---
Created attachment 54859
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54859&action=edit
Tentative fix

To be tested.

[Bug c/109507] Optimizer creates incorrect program

2023-04-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109507

--- Comment #3 from Jonathan Wakely  ---
When you created this bug report there was a red banner at the top of the page
that begins by asking you to read https://gcc.gnu.org/bugs/ and that tells you
to try -fsanitize=undefined.

[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88

2023-04-14 Thread chenglulu at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #19 from chenglulu  ---
(In reply to Xi Ruoyao from comment #18)
> (In reply to Richard Biener from comment #17)
> > Isn't this the same issue as seen in another bug, most targets defining
> > TARGET_PROMOTE_PROTOTYPES to hook_bool_const_tree_true but loongarch not?
> > That will cause those conversions to be missed.
> 
> Looks like we should define it, as our psABI says:
> 
> In most cases, the unsigned integer data types are zero-extended when stored
> in general-purpose register, and the signed integer data types are
> sign-extended. However, in the LP64D ABI, unsigned 32-bit types, such as
> unsigned int, are stored in general-purpose registers as proper sign
> extensions of their 32-bit values.
> 
> IIUC it matches the semantics of TARGET_PROMOTE_PROTOTYPE

I also think this should be considered.

[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88

2023-04-14 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #20 from rguenther at suse dot de  ---
On Fri, 14 Apr 2023, xry111 at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
> 
> --- Comment #18 from Xi Ruoyao  ---
> (In reply to Richard Biener from comment #17)
> > Isn't this the same issue as seen in another bug, most targets defining
> > TARGET_PROMOTE_PROTOTYPES to hook_bool_const_tree_true but loongarch not?
> > That will cause those conversions to be missed.
> 
> Looks like we should define it, as our psABI says:
> 
> In most cases, the unsigned integer data types are zero-extended when stored 
> in
> general-purpose register, and the signed integer data types are sign-extended.
> However, in the LP64D ABI, unsigned 32-bit types, such as unsigned int, are
> stored in general-purpose registers as proper sign extensions of their 32-bit
> values.
> 
> IIUC it matches the semantics of TARGET_PROMOTE_PROTOTYPES.

TARGET_PROMOTE_PROTOTYPES is about foo (signed char) or foo (unsigned
short), thus argument types narrower than int.  With
TARGET_PROMOTE_PROTOTYPES defined to true they will get promoted to
int, so you'll see foo ((int)x) when 'x' is of type signed char
or unsigned short for the above cases.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #11 from Richard Biener  ---
It might be possible to re-write the aarch64 assert to be instead

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index f4ef22ce02f..9da46a5e45d 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -7489,13 +7489,15 @@ aarch64_function_arg_alignment (machine_mode mode,
const_tree type,
 any attributes applied.  Normally this is the alignment of the
 TYPE_MAIN_VARIANT, but not always; see PR108910 for a counterexample.
 For now we just handle the known exceptions explicitly.  */
+  tree orig_type = type;
   type = TYPE_MAIN_VARIANT (type);
   if (POINTER_TYPE_P (type))
{
  gcc_assert (known_eq (POINTER_SIZE, GET_MODE_BITSIZE (mode)));
  return POINTER_SIZE;
}
-  gcc_assert (!TYPE_USER_ALIGN (type));
+  gcc_assert (TYPE_ALIGN (type) == TYPE_ALIGN (orig_type)
+ || TYPE_USER_ALIGN (orig_type));
   return TYPE_ALIGN (type);
 }

I'm not sure if that captures the original intent of the assert though.
Maybe instead || !TYPE_USER_ALIGN (type)?

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #12 from rsandifo at gcc dot gnu.org ---
(In reply to Eric Botcazou from comment #10)
> Created attachment 54859 [details]
> Tentative fix
> 
> To be tested.
Thanks!  This works for me locally and gives clean Ada test results.

[Bug fortran/109511] incorrectly rejects set_exponent with constant negative exponent

2023-04-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109511

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |wrong-code
                 CC|                            |anlauf at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-04-14

--- Comment #1 from anlauf at gcc dot gnu.org ---
Confirmed.

The simplification seems to use the old mpfr_mul_2exp which takes an
unsigned exponent.  There is the newer mpfr_mul_2si which accepts a signed
exponent.

[Bug tree-optimization/109491] [11/12 Regression] Segfault in tree-ssa-sccvn.cc:expressions_equal_p()

2023-04-14 Thread chip.kerchner at ibm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491

--- Comment #14 from Chip Kerchner  ---
Just one more question and then I'll switch to the new bug.

Would it help any if the functions that are "always_inline" be changed from
non-static to static?

Eigen's approach (where this code originally came from - yes, it could
definitely be better) is to use non-static inlined functions.

[Bug ipa/109509] Huge compile time with forced inlining

2023-04-14 Thread chip.kerchner at ibm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109509

--- Comment #1 from Chip Kerchner  ---
Just a note:  The same code that makes heavy use of always_inline compiles
about 3X faster in LLVM and uses about half the memory to compile.

[Bug tree-optimization/109491] [11/12 Regression] Segfault in tree-ssa-sccvn.cc:expressions_equal_p()

2023-04-14 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491

--- Comment #15 from rguenther at suse dot de  ---
On Fri, 14 Apr 2023, chip.kerchner at ibm dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491
> 
> --- Comment #14 from Chip Kerchner  ---
> Just one more question and then I'll switch to the new bug.
> 
> Would it help any if the functions that are "always_inline" be changed from
> non-static to static?
> 
> Eigen's approach (where this code originally came from - yes, it could be
> definite be better) is to use non-static inlined function.

I don't think so, you'd have to try.  The compile-time issues usually
show up when you have a multi-level call chain all being always-inline
as we then repeatedly early-optimize the same code over and over.
Usually we do that so functions become smaller for the inline heuristics
but for always-inlines that wouldn't matter (but of course making the
body smaller if there's more than one caller can still pay off).

It really depends on the actual callgraph and code which is also
why it is hard to improve.

Another issue, probably more common with C++ code, would be that we
inline into not yet optimized callers, which means calls that are
almost trivially unreachable have not been eliminated yet but
get inlined.  Usual heuristics would only inline small functions
at this level and defer inlining of larger functions to IPA time
at which point the calls might be eliminated already.  Likewise
calls attributed const or pure might be subject to CSE but
always-inline forces them to be inlined before CSE.

So what you could try is instead of always_inline use
__attribute__((flatten)) on the functions containing the
loop kernels for example and then disable early inlining
(otherwise flatten is applied there as well) with -fno-early-inlining.

[Bug preprocessor/109183] [regression?] since GCC 11.1, -MM -MMD generates "a-" prefixed dependency files

2023-04-14 Thread allan.w.macdonald at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109183

--- Comment #13 from Allan W. Macdonald  ---
Ahhh, so, "to get back the old behaviour" (as @ Richard Biener put it), this
seems to work (at least with my project):

%.d: %.c
gcc -MM -MD -dumpbase '' $<

Not obvious in the gcc 11.3.0 manual.

[Bug ipa/109509] Huge compile time with forced inlining

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109509

--- Comment #2 from Richard Biener  ---
Possible issues specific to GCC that LLVM maybe avoids are:

One issue, probably more common with C++ code, would be that we
inline into not yet optimized callers, which means calls that are
almost trivially unreachable have not been eliminated yet but
get inlined.  Usual heuristics would only inline small functions
at this level and defer inlining of larger functions to IPA time
at which point the calls might be eliminated already.  Likewise
calls attributed const or pure might be subject to CSE but
always-inline forces them to be inlined before CSE.

[Bug ipa/109509] Huge compile time with forced inlining

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109509

Jan Hubicka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #3 from Jan Hubicka  ---
Chip,
can you specify which code it is?

LLVM only does the always_inlines that we handle during early optimization.
Many of the cases where people run into inline bombs are with LTO, where the
inlining happens cross-module.  One option would be to ignore them at WPA.

I did some work on chasing out non-linearities two years ago while enabling
auto-inlining at -O2.  I plan to return to it this stage1.

Just checking the situation with basic inline bomb:

#include <stdio.h>
static inline __attribute__ ((always_inline)) void
test1()
{
  printf ("ok\n");
}
#define D(i,j) static  inline __attribute__ ((always_inline)) void test##j() { test##i(); test##i(); }
D(1,2)
D(2,3)
D(3,4)
D(4,5)
D(5,6)
D(6,7)
D(7,8)
D(8,9)
D(9,10)
D(10,11)
D(11,12)
D(12,13)
D(13,14)
D(14,15)
D(15,16)
D(16,17)
D(17,18)
D(18,19)
int
main()
{
test19();
return 0;
}

GCC 13.0 compiles it in 126s with about 1GB of RAM. clang needs 12s.
Time report claims half of it is ipa passes:
 callgraph ipa passes   :  56.68 ( 45%)   1.08 ( 50%)  57.77 ( 45%)  1026M ( 81%)
But that is because early optimization is an IPA pass, and in fact we spend
most of the time in PTA:

 tree PTA   :  97.49 ( 77%)   0.57 ( 26%)  98.10 ( 76%)    80M (  6%)

which is also run as part of early passes.  Integration is 2.2s.

[Bug fortran/109512] New: accepts implicit dummy procedure even with "implicit none (external)"

2023-04-14 Thread sebastien at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109512

            Bug ID: 109512
           Summary: accepts implicit dummy procedure even with "implicit
                    none (external)"
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sebastien at debian dot org
  Target Milestone: ---

gfortran compiles the following source without error:

subroutine foo(bar)
  implicit none (external)
  call bar(1)
end subroutine foo

However my understanding is that it should reject it, because "bar" is an
implicit dummy procedure, which is forbidden by "implicit none (external)".

Note that this is not the same issue as Bug 100972, which was not about dummy
procedures.

[Bug tree-optimization/109513] New: Missed Dead Code Elimination when using __builtin_unreachable

2023-04-14 Thread theodort at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109513

Bug ID: 109513
   Summary: Missed Dead Code Elimination when using
__builtin_unreachable
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: theodort at inf dot ethz.ch
  Target Milestone: ---

cat input.c

void foo(void);
void bar(void);

static char a, h;
static int b, c, d, k;
static int **e;
static int ***f = &e;
static short g;
static int *j, *l = &c;

static char m(int *i) {
  if (*i)
    return 0;
  for (; c;)
    ;
  if (i == &b)
    bar();
  return 1;
}
int main() {
  int *n = &b;
  for (; k; ++k)
    m(n);
  h = a + 10;
  if (h) {
    int *o = &d;
    for (; g; ++g) {
      *e = l;
      j = **f;
      o = j;
    }
    if (!(o == &c || o == &d))
      foo();
  }
}

In the above piece of code the calls to foo and bar are both dead.

gcc-trunk at -O3 generates the following code:
main:
        subq    $8, %rsp
        movl    k(%rip), %ecx
        testl   %ecx, %ecx
        je      .L7
.L2:
        movl    c(%rip), %edx
        testl   %edx, %edx
        je      .L5
.L6:
        jmp     .L6
.L5:
        call    bar
        addl    $1, k(%rip)
        jne     .L2
.L7:
        cmpw    $0, g(%rip)
        je      .L4
        movq    e(%rip), %rax
        movq    $c, (%rax)
        xorl    %eax, %eax
        movw    %ax, g(%rip)
.L4:
        xorl    %eax, %eax
        addq    $8, %rsp
        ret

It eliminates the call to foo but not the call to bar. If I try to "help" the
compiler by replacing bar() with __builtin_unreachable(), it generates worse
code and also fails to eliminate the dead call to foo:

void foo(void);

static char a, h;
static int b, c, d, k;
static int **e;
static int ***f = &e;
static short g;
static int *j, *l = &c;

static char m(int *i) {
  if (*i)
    return 0;
  for (; c;)
    ;
  if (i == &b)
    __builtin_unreachable(); // <- the call to bar was here
  return 1;
}

int main() {
  int *n = &b;
  for (; k; ++k)
    m(n);
  h = a + 10;
  if (h) {
    int *o = &d;
    for (; g; ++g) {
      *e = l;
      j = **f;
      o = j;
    }
    if (!(o == &c || o == &d))
      foo();
  }
}

gcc-trunk -O3 output:

main:
        movl    k(%rip), %ecx
        testl   %ecx, %ecx
        je      .L22
.L2:
        jmp     .L2
.L22:
        movq    e(%rip), %rsi
        movzwl  g(%rip), %eax
        xorl    %ecx, %ecx
        movl    $d, %edx
        jmp     .L3
.L4:
        movq    $c, (%rsi)
        addl    $1, %eax
        movl    $1, %ecx
        movl    $c, %edx
.L3:
        testw   %ax, %ax
        jne     .L4
        testb   %cl, %cl
        je      .L5
        movw    $0, g(%rip)
.L5:
        cmpq    $d, %rdx
        je      .L18
        cmpq    $c, %rdx
        je      .L18
        pushq   %rax
        call    foo
        xorl    %eax, %eax
        popq    %rdx
        ret
.L18:
        xorl    %eax, %eax
        ret

This is also an old regression. gcc-7.4 at -O3 generates for the version with
__builtin_unreachable:

main:
        movl    k(%rip), %eax
        testl   %eax, %eax
        je      .L7
.L2:
        jmp     .L2
.L7:
        cmpw    $0, g(%rip)
        je      .L4
        movq    e(%rip), %rax
        movw    $0, g(%rip)
        movq    $c, (%rax)
.L4:
        xorl    %eax, %eax
        ret

[Bug tree-optimization/109491] [11/12 Regression] Segfault in tree-ssa-sccvn.cc:expressions_equal_p()

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491

--- Comment #16 from Richard Biener  ---
Just to make the point, for the testcase when compiling with -O -g I see

> grep 'INLINE_ENTRY' t.ii.031t.einline | wc -l 
16976
> grep 'INLINE_ENTRY' t.ii.034t.ccp1 | wc -l
15530
> grep 'INLINE_ENTRY' t.ii.043t.cddce1 | wc -l
14180

so some (not the majority) of the inlined bodies are removed by the first
constant propagation pass, and more at the end of the early optimization
phase.  The count is a bit unreliable since not all inlines result in an
INLINE_ENTRY debug stmt.  It also shows this likely isn't the biggest issue.
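
As a rough reading of the three counts above, the share of INLINE_ENTRY markers that disappear can be computed directly. This is only an illustrative sketch; the helper percent_removed is a hypothetical name, and the result is rounded down.

```cpp
// Percentage (rounded down) of INLINE_ENTRY markers that are gone,
// relative to the count seen right after early inlining (einline).
constexpr int percent_removed(int after_einline, int remaining)
{
    return (after_einline - remaining) * 100 / after_einline;
}

// Counts taken from the grep output above.
static_assert(percent_removed(16976, 15530) == 8,  "gone after ccp1");
static_assert(percent_removed(16976, 14180) == 16, "gone after cddce1");
```

So roughly one in six markers has been cleaned up by the end of the early optimization phase, which is consistent with "some, but not the majority".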

A factor of two compared to clang can be easily attributed to bigger
IL representation as well.

For me the testcase with -O2 builds in 17s with release checking enabled
so that's not too bad (on a fast x86 machine).  The most expensive pass
is RTL PRE, accounting for 1/4 of the compile-time.  With -O3 it's even
faster.

With -Dalways_inline=user the compile time reduces to 3s though.

[Bug rtl-optimization/108086] [11 Regression] internal compiler error: in set_accesses, at rtl-ssa/internals.inl:449

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108086

Jan Hubicka  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #19 from Jan Hubicka  ---
With trunk I now get:

 phase opt and generate :  71.71 (100%)   4.60 ( 95%)  76.32 ( 99%)  2848M ( 98%)
 callgraph functions expansion  :  27.25 ( 38%)   1.65 ( 34%)  28.90 ( 38%)   510M ( 17%)
 callgraph ipa passes   :  44.43 ( 62%)   2.95 ( 61%)  47.38 ( 62%)  2333M ( 80%)
 integration:   6.99 ( 10%)   1.05 ( 22%)   8.01 ( 10%)  1998M ( 68%)
 tree PTA   :  36.28 ( 50%)   2.52 ( 52%)  38.80 ( 50%)    45k (  0%)
 combiner   :   3.96 (  5%)   0.00 (  0%)   3.96 (  5%)    41M (  1%)

with over 5% runtime. Perf claims it is mostly bitmap operations:

  16.91%  cc1plus   cc1plus[.] bitmap_ior_into
  12.22%  cc1plus   cc1plus[.] bitmap_set_bit
   3.58%  cc1plus   cc1plus[.] 0x011cad58
   1.97%  cc1plus   cc1plus[.] bitmap_bit_p
   1.80%  cc1plus   cc1plus[.] bitmap_copy
   1.24%  cc1plus   cc1plus[.] htab_find_slot_with_hash
   1.13%  cc1plus   cc1plus[.] 0x011cad70
   1.11%  cc1plus   cc1plus[.] refers_to_regno_p

[Bug ipa/109509] Huge compile time with forced inlining

2023-04-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109509

--- Comment #4 from Richard Biener  ---
When working on another testcase I noticed that our inlining itself creates a
lot of garbage: copies can pile up, esp. when not optimizing.  The PR79416
testcase is similar to yours but uses asm("nop") and thus does not run into
PTA scalability.

[Bug tree-optimization/109491] [11/12 Regression] Segfault in tree-ssa-sccvn.cc:expressions_equal_p()

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491

Jan Hubicka  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #17 from Jan Hubicka  ---
With a cross on znver3 machine I get with ./xgcc -B ./ -mcpu=power8 -std=c++14
-O2 tt.cc --verbose -Q :

 phase opt and generate :  13.19 ( 99%)   0.35 ( 95%)  13.53 ( 99%)   408M ( 94%)
 callgraph functions expansion  :  10.19 ( 77%)   0.15 ( 41%)  10.33 ( 76%)   193M ( 45%)
 callgraph ipa passes   :   2.97 ( 22%)   0.20 ( 54%)   3.17 ( 23%)   203M ( 47%)
 df live regs   :   0.65 (  5%)   0.00 (  0%)   0.62 (  5%)     0  (  0%)
 alias stmt walking :   1.19 (  9%)   0.03 (  8%)   1.25 (  9%)  6915k (  2%)
 tree Early VRP :   0.63 (  5%)   0.01 (  3%)   0.58 (  4%)    16M (  4%)
 PRE:   2.81 ( 21%)   0.01 (  3%)   2.83 ( 21%)   349k (  0%)
 scheduling :   1.00 (  8%)   0.00 (  0%)   0.99 (  7%)  1997k (  0%)

inliner is:
 ipa inlining heuristics:   0.49 (  4%)   0.00 (  0%)   0.47 (  3%)  4429k (  1%)
 integration:   0.19 (  1%)   0.04 ( 11%)   0.28 (  2%)   117M ( 27%)

So inlining itself seems to be in control.

With RTL SSA we may be able to replace RTL PRE with GVN-PRE :)

[Bug c++/109160] [Valid code] Constraint on deduced NTTP from method call causes ICE/Segfault.

2023-04-14 Thread vincent_saulue at hotmail dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109160

--- Comment #5 from Vincent Saulue-Laborde ---
(In reply to Patrick Palka from comment #4)
> (In reply to Patrick Palka from comment #3)
> > Fixed for GCC 12 so far.
> 
> GCC 13*

I confirm that gcc trunk works fine from my side.

Thanks for the fix !

[Bug ipa/109509] Huge compile time with forced inlining

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109509

--- Comment #5 from Jan Hubicka  ---
For a summary:
  - PR109491 does not seem to be about integration time. Most time is RTL PRE.
  - PR108086 has 10% spent in integration and seems to be an operand scan issue.
  - PR99785 is hard to judge given that it is a large inline bomb. Clang
compiles it faster but it compiles different code, because the Firefox gfx
library now has two vector implementations. Clang is not able to compile the
gcc path and gcc is not able to compile the clang path.
  - PR79416 has a large inline stack, so we eventually run out of memory.
Looking at the profile, 15% is spent in ggc_internal_alloc, so it seems that
tree-inline is actually doing its job, it just has too much of it.

[Bug fortran/104272] finalizer gets called during allocate

2023-04-14 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104272

Paul Thomas  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #7 from Paul Thomas  ---
Fixed on mainline. I am closing it but will incorporate this in a composite
finalization patch for 12-branch.

Thanks for the report

Paul

[Bug preprocessor/109183] [regression?] since GCC 11.1, -MM -MMD generates "a-" prefixed dependency files

2023-04-14 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109183

--- Comment #14 from Andreas Schwab  ---
It doesn't make sense to use both -MM and -MD.  Either you want to generate
only dependencies, then use -M or -MM (and -MF to redirect to a file).  Or you
want to generate dependencies as side effect of compilation, then use -MD or
-MMD.

[Bug c++/79416] Internal compiler error for recursive template expansion

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79416

Jan Hubicka  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #4 from Jan Hubicka  ---
With trunk I get:

 phase opt and generate :  82.04 (100%)  25.20 (100%) 107.23 (100%) 15103M (100%)
 callgraph ipa passes   :  78.81 ( 96%)  24.72 ( 98%) 103.53 ( 96%) 15021M ( 99%)
 integration:  32.50 ( 40%)   8.29 ( 33%)  40.44 ( 38%) 10311M ( 68%)
 tree Early VRP :   4.28 (  5%)   0.04 (  0%)   4.13 (  4%)  1154k (  0%)
 tree operand scan  :  13.42 ( 16%)   4.64 ( 18%)  18.23 ( 17%)  1409M (  9%)
 remove unused locals   :   6.16 (  8%)   0.07 (  0%)   6.12 (  6%)     0  (  0%)

13% of overall runtime is ggc_internal_alloc.
I wonder why early VRP became expensive.  Also remove unused locals could
quite likely be tamed down: we run it many times and perhaps we can keep track
of changes.

[Bug fortran/65347] [F03] Final subroutine not called for function result

2023-04-14 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65347

Paul Thomas  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Paul Thomas  ---
I am closing this as fixed.

The original testcase output now agrees with nagfor. The testcase of comment #3
produces the same output as nagfor and ifort. ifort finalizes the function call
Ctfin(10) in the original testcase, which is certainly not required by the
F2018 standard and was explicitly deleted in a revision to the F2008 standard.

Finalization of structure and array constructors is all over the place with the
different compilers. I therefore am taking the view that complying with F2018
is the best that can be hoped for.

Sorry this has taken so long to address.

Paul

[Bug fortran/37336] [F03] Finish derived-type finalization

2023-04-14 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37336
Bug 37336 depends on bug 65347, which changed state.

Bug 65347 Summary: [F03] Final subroutine not called for function result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65347

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug fortran/84472] Missing finalization and memory leak

2023-04-14 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84472

Paul Thomas  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Paul Thomas  ---
I seem to have put myself off closing this PR with the comment in
finalize_47.f90. This pertained to an intermediate stage of the patch although
I suspect that nullification of pointer components of function results IS
needed before finalization.

Look out for a composite finalization patch for 12-branch in the coming weeks.

Thanks for the report.

Paul

[Bug fortran/37336] [F03] Finish derived-type finalization

2023-04-14 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37336
Bug 37336 depends on bug 84472, which changed state.

Bug 84472 Summary: Missing finalization and memory leak
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84472

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug fortran/86754] [meta-bug] Memory leaks at run time

2023-04-14 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86754
Bug 86754 depends on bug 84472, which changed state.

Bug 84472 Summary: Missing finalization and memory leak
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84472

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug fortran/91316] Derived type finalization failing

2023-04-14 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91316

Paul Thomas  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #6 from Paul Thomas  ---
Fixed on trunk.

Thanks for the report.

Paul

[Bug fortran/37336] [F03] Finish derived-type finalization

2023-04-14 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37336
Bug 37336 depends on bug 91316, which changed state.

Bug 91316 Summary: Derived type finalization failing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91316

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/109494] inline const variables interfere with source_location

2023-04-14 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109494

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org

--- Comment #4 from Patrick Palka  ---
I can't reproduce the linker warning or bad output with GCC 12.2 or trunk on
x86_64-pc-linux-gnu:

$ g++ -std=c++20 -fext-numeric-literals Main.ii Test.ii
$ ./a.out
/Users/Claire/Dropbox/SourceLoc/Test/Main.cpp
/Users/Claire/Dropbox/SourceLoc/Test/Test.cpp

[Bug target/108807] [11/12/13 regression] gcc.target/powerpc/vsx-builtin-10d.c fails after r11-6857-gb29225597584b6 on power 9 BE

2023-04-14 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108807

Jeffrey A. Law  changed:

   What|Removed |Added

 CC||law at gcc dot gnu.org

--- Comment #5 from Jeffrey A. Law  ---
Kewen, is this BZ fixed on the trunk?  If so we should update the title by
dropping the "/13" so that's not flagged as a gcc-13 regression.

[Bug c++/109514] New: [13 regression] -Werror=dangling-pointer false positive nn fheroes-1.0.3 (lambdas)

2023-04-14 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109514

Bug ID: 109514
   Summary: [13 regression] -Werror=dangling-pointer false
positive nn fheroes-1.0.3 (lambdas)
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slyfox at gcc dot gnu.org
  Target Milestone: ---

Initially noticed the false positive on fheroes2-1.0.3 project which uses
-Werror by default. Here is the extracted example:

// $ cat screen.cpp
#include <set>
#include <vector>

struct R {
    int w;
};

typedef std::vector<R> vr_t;
typedef std::set<R> sr_t;

static sr_t FilterRs (const sr_t & rs)
{
    return rs;
}

vr_t getRs ()
{
    const vr_t filteredRs = []() {
        sr_t rs;

        rs = FilterRs (rs);

        return vr_t{rs.begin(), rs.end()};
    }();

    return filteredRs;
}

$ g++ -c -O2 screen.cpp -Werror=dangling-pointer
In file included from /<>/gcc-13.0.0/include/c++/13.0.1/set:62,
 from screen.cpp:1:
In member function 'void
std::_Rb_tree_header::_M_move_data(std::_Rb_tree_header&)',
inlined from 'void std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare,
_Alloc>::_M_move_data(std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare,
_Alloc>&, std::true_type) [with _Key = R; _Val = R; _KeyOfValue =
std::_Identity; _Compare = std::less; _Alloc = std::allocator]' at
/<>/gcc-13.0.0/include/c++/13.0.1/bits/stl_tree.h:1421:29,
inlined from 'void std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare,
_Alloc>::_M_move_assign(std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare,
_Alloc>&, std::true_type) [with _Key = R; _Val = R; _KeyOfValue =
std::_Identity; _Compare = std::less; _Alloc = std::allocator]' at
/<>/gcc-13.0.0/include/c++/13.0.1/bits/stl_tree.h:1710:14,
inlined from 'std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>&
std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare,
_Alloc>::operator=(std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>&&)
[with _Key = R; _Val = R; _KeyOfValue = std::_Identity; _Compare =
std::less; _Alloc = std::allocator]' at
/<>/gcc-13.0.0/include/c++/13.0.1/bits/stl_tree.h:1744:21,
inlined from 'std::set<_Key, _Compare, _Alloc>& std::set<_Key, _Compare,
_Alloc>::operator=(std::set<_Key, _Compare, _Alloc>&&) [with _Key = R; _Compare
= std::less; _Alloc = std::allocator]' at
/<>/gcc-13.0.0/include/c++/13.0.1/bits/stl_set.h:302:7,
inlined from 'getRs()::' at screen.cpp:21:26,
inlined from 'vr_t getRs()' at screen.cpp:24:7:
/<>/gcc-13.0.0/include/c++/13.0.1/bits/stl_tree.h:199:38: error: storing
the address of local variable 'rs' in '*MEM[(struct _Rb_tree_node_base *
&)&D.35566 + 16].std::_Rb_tree_node_base::_M_parent'
[-Werror=dangling-pointer=]
  199 |   _M_header._M_parent->_M_parent = &_M_header;
  |   ~~~^~~~
screen.cpp: In function 'vr_t getRs()':
screen.cpp:19:14: note: 'rs' declared here
   19 | sr_t rs;
  |  ^~
screen.cpp:19:14: note: '.std::set, std::allocator
>::_M_t.std::_Rb_tree, std::less, std::allocator
>::_M_impl.std::_Rb_tree, std::less,
std::allocator >::_Rb_tree_impl,
true>::.std::_Rb_tree_header::_M_header.std::_Rb_tree_node_base::_M_parent'
declared here
cc1plus: some warnings being treated as errors

I think it's a false positive (or maybe it exposes some NRVO bug?).

$ g++ -v
Using built-in specs.
COLLECT_GCC=/<>/gcc-13.0.0/bin/g++
COLLECT_LTO_WRAPPER=/<>/gcc-13.0.0/libexec/gcc/x86_64-unknown-linux-gnu/13.0.1/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with:
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.0.1 20230409 (experimental) (GCC)

[Bug target/109494] inline const variables interfere with source_location

2023-04-14 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109494

--- Comment #5 from Patrick Palka  ---
(In reply to Patrick Palka from comment #4)
> I can't reproduce the linker warning or bad output with GCC 12.2 or trunk on
> x86_64-pc-linux-gnu:
> 
> $ g++ -std=c++20 -fext-numeric-literals Main.ii Test.ii
> $ ./a.out
> /Users/Claire/Dropbox/SourceLoc/Test/Main.cpp
> /Users/Claire/Dropbox/SourceLoc/Test/Test.cpp
Oops, it was already established this is a darwin-specific issue, sorry for the
noise.

[Bug fortran/109512] accepts implicit dummy procedure even with "implicit none (external)"

2023-04-14 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109512

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||kargl at gcc dot gnu.org
   Last reconfirmed||2023-04-14

--- Comment #1 from kargl at gcc dot gnu.org ---
Confirmed.

8.7 IMPLICIT statement
...
An IMPLICIT NONE statement can indicate ..., or that external and
dummy procedures need to be explicitly given the EXTERNAL attribute.

'bar' is clearly a dummy procedure, so 'external bar' is required.

[Bug c/109507] Optimizer creates incorrect program

2023-04-14 Thread aran at 100acres dot us via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109507

Aran Clauson  changed:

   What|Removed |Added

 Status|RESOLVED|CLOSED

--- Comment #4 from Aran Clauson  ---
Disregard.

[Bug c++/109514] [13 regression] -Werror=dangling-pointer false positive nn fheroes-1.0.3 (lambdas)

2023-04-14 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109514

Jason Merrill  changed:

   What|Removed |Added

   Last reconfirmed||2023-04-14
 CC||jason at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #1 from Jason Merrill  ---
Confirmed.  -Wdangling-pointer shouldn't warn about a store to an unknown
object.  And it should be able to figure out that this store is to rs itself,
and not warn for that reason as well.

[Bug fortran/109511] incorrectly rejects set_exponent with constant negative exponent

2023-04-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109511

--- Comment #2 from anlauf at gcc dot gnu.org ---
It's even worse, as simplification for arguments X below one gives wrong
results:

set_exponent(0.75, 3)

gives 3., while the runtime version correctly prints 6.00

All gfortran versions seem to be affected.
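
For reference, SET_EXPONENT(X, I) keeps the fractional part of X (normalised into [0.5, 1) for a binary model) and replaces its exponent with I. The following is a minimal C++ sketch of that definition, not gfortran's simplification code; the function name set_exponent_sketch is hypothetical, and overflow and denormal handling are deliberately ignored.

```cpp
// Sketch of SET_EXPONENT for a radix-2 model: fraction(x) * 2^i.
constexpr double set_exponent_sketch(double x, int i)
{
    // Zero, infinity and NaN are passed through unchanged
    // (x - x is 0 only for finite x).
    if (x == 0.0 || x - x != 0.0)
        return x;

    double frac = x;
    // Normalise |frac| into [0.5, 1).
    while (frac >= 1.0 || frac <= -1.0)
        frac /= 2.0;
    while (frac < 0.5 && frac > -0.5)
        frac *= 2.0;

    // Apply the requested exponent, which may be negative.
    for (int k = 0; k < i; ++k)
        frac *= 2.0;
    for (int k = 0; k > i; --k)
        frac /= 2.0;
    return frac;
}

// 0.75 already has exponent 0, so the result is 0.75 * 2^3.
static_assert(set_exponent_sketch(0.75, 3) == 6.0, "matches the runtime result");
static_assert(set_exponent_sketch(0.75, -1) == 0.375, "negative exponent is valid");
```

Under this reading the value 6.0 printed by the runtime version is the expected one, and a negative I is a perfectly valid argument.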

[Bug target/109137] [12 regression] Compiling ffmpeg with -m32 on x86_64-pc-linux-gnu hangs on libavcodec/h264_cabac.c since r12-9086-g489c81db7d4f75

2023-04-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109137

--- Comment #25 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:9075b0f19eece7d5ddf948204507b5dae9d292c4

commit r12-9400-g9075b0f19eece7d5ddf948204507b5dae9d292c4
Author: Jan Hubicka 
Date:   Fri Apr 14 19:18:24 2023 +0200

Disable X86_TUNE_AVX256_MOVE_BY_PIECES and STORE_BY_PIECES for znver1-3

I have enabled SSE moves for znver1-3 since they are a performance win on
this machine too (we avoid using loops or string operations which are more
costly).  However, as discussed in the PR log, this triggers a bug in IRA
and it was decided it is better to not backport the fix.

gcc/ChangeLog:

2023-04-14  Jan Hubicka  

PR target/109137
* config/i386/x86-tune.def (X86_TUNE_AVX256_MOVE_BY_PIECES):
Remove znver1-3.
(X86_TUNE_AVX256_STORE_BY_PIECES): Remove znver1-3.

[Bug fortran/103931] Type name "c_ptr" is ambiguous when iso_c_binding is imported both directly and indirectly

2023-04-14 Thread aldot at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103931

--- Comment #16 from Bernhard Reutner-Fischer  ---
> Under the assumption that we have a generic "c_ptr" in a module, we know (?) 
> that "c_ptr" is not ambiguous.
> 
> Is that right?

When we look at gmodule (when compiled when DModule has a commented-out "use
CModule"), we have:

GFORTRAN module version '15' created from pr103931.F90
(() () () () () () () () () () () () () () () () () () () () () () () ()
() () ()) // oper interfaces

() // user oper

(('c_funptr' '__iso_c_binding' 2) ('c_ptr' '__iso_c_binding' 3)) // generic
interfaces
() // blank commons, common
() // equiv
() // omp_udr
// now first symbols w/ generic interfaces, flavour PROCEDURE but not sub nor
function
(2 'C_funptr' '__iso_c_binding' '' 1 ((DERIVED UNKNOWN-INTENT
UNKNOWN-PROC UNKNOWN UNKNOWN 0 0 IS_BIND_C IS_C_INTEROP PRIVATE_COMP) (
(4 'c_address' (INTEGER 8 0 1 0 INTEGER ()) () () () (UNKNOWN-FL
UNKNOWN-INTENT UNKNOWN-PROC UNKNOWN UNKNOWN 0 0) PRIVATE ())) PRIVATE (
DERIVED 2 0 1 1 VOID ()) 0 0 () () 0 () () () 2 44 0)
3 'C_ptr' '__iso_c_binding' '' 1 ((DERIVED UNKNOWN-INTENT UNKNOWN-PROC
UNKNOWN UNKNOWN 0 0 IS_BIND_C IS_C_INTEROP PRIVATE_COMP) ((5 'c_address'
(INTEGER 8 0 1 0 INTEGER ()) () () () (UNKNOWN-FL UNKNOWN-INTENT
UNKNOWN-PROC UNKNOWN UNKNOWN 0 0) PRIVATE ())) PRIVATE (DERIVED 3 0 1 1
VOID ()) 0 0 () () 0 () () () 2 42 0)
[snip]
// second, symbols rest

So, the ambiguity error comes from seeing C_ptr with id 3 and loading that.
Then we check if it _has_ the GENERIC attr set, but it has not, since (I think)
this very symbol is *used* to describe the type of the generic interface. I do
not quite see why we call check_for_ambiguous for these generic interface
types in the first place; I think that they are just not ambiguous, at least
as far as I can see in the aforementioned gmodule.mod.

I'm testing a more elaborate patch which keeps check_for_ambiguous in the same
spot as before but, in that check, looks whether the symtree (C_ptr in this
case, number 3) is listed in the generic interfaces, after the existing check
that looks whether the symbol itself is GENERIC.

I did not really look at the standard, so I wonder if that makes sense to
somebody more familiar with that area?

[Bug target/109137] [12 regression] Compiling ffmpeg with -m32 on x86_64-pc-linux-gnu hangs on libavcodec/h264_cabac.c since r12-9086-g489c81db7d4f75

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109137

--- Comment #26 from Jan Hubicka  ---
Reverted the znver1-3 change on the gcc-12 branch.  We still may want to fix
IRA to avoid the problem on core_avx512 targets.

[Bug fortran/109511] incorrectly rejects set_exponent with constant negative exponent

2023-04-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109511

--- Comment #3 from anlauf at gcc dot gnu.org ---
Created attachment 54860
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54860&action=edit
Patch

Patch I am testing.

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-04-14 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009

--- Comment #5 from Surya Kumari Jangala  ---
I was analysing and comparing the following test cases:

Test1 (shrink wrapped)

long
foo (long i, long cond)
{
  i = i + 1;
  if (cond)
    bar ();
  return i;
}


Test2 (not shrink wrapped)

long
foo (long i, long cond)
{
  if (cond)
    bar ();
  return i+1;
}

There is a difference in register allocation by IRA in the two cases.

Input RTL to IRA (Test1: passing case)
BB2:
  set r123, r4
  set r122, r3
  set r120, compare(r123, 0)
  set r117, r122 + 1
  if r120 jump BB4 else jump BB3
BB3:
  call bar()
BB4:
  set r3, r117
  return r3


Input RTL to IRA (Test2: failing case)

BB2:
  set r123, r4
  set r122, r3
  set r120, compare(r123, 0)
  set r118, r122
  if r120 jump BB4 else jump BB3
BB3:
  call bar()
BB4:
  set r3, r118+1
  return r3


There is a difference in registers allocated for r117 (passing case) and r118
(failing case) by IRA.
r117 is allocated r3 while r118 is allocated r31.
Since r117 is allocated r3, r3 is spilled across the call to bar() by LRA. And
so only BB3 requires a prolog and shrink wrap is successful.
In the failing case, since r31 is assigned to r118, BB2 requires a prolog and
shrink wrap fails.

In the IRA pass, after graph coloring, both r117 and r118 get assigned to r3.
The routine improve_allocation() is called after graph coloring. In this
routine, IRA checks for each allocno if spilling any conflicting allocnos can
improve the allocation of this allocno.

Going into more detail, improve_allocation() does the following:
1. We first compute the cost improvement for usage of each profitable hard
register for a given allocno A. The cost improvement is computed as follows:

costs[regno] = A->hard_reg_costs[regno];  // ‘hard_reg_costs’ is an array of
                                          // usage costs for each hard register
costs[regno] -= allocno_copy_cost_saving (A, regno);
costs[regno] -= base_cost;  // Say, ‘reg’ is assigned to A. Then ‘base_cost’
                            // is the usage cost of ‘reg’ for A.

2. Then we process each conflicting allocno of A and update the cost
improvement for the profitable hard registers of A. Basically, we compute the
spill costs of the conflicting allocnos and update the cost (for A) of the
register that was assigned to the conflicting allocno. 
3. We then find the best register among the profitable registers, spill the
conflicting allocno that uses this best register and assign the best register
to A.


However, the initial hard register costs for some of the profitable hard
registers are different in the passing and failing cases. More specifically,
the costs in the hard_reg_costs[] array are 0 for regs 14-31 in the failing
case. A zero cost seems incorrect. If using a reg in the set [14..31] has zero
cost, then why wasn’t such a reg chosen for r118?
In the passing case, the costs in hard_reg_costs[] for regs 14-31 are 2000.
At the end of step 1, costs[r31] is -390 for the failing case (for allocno
r118) and 1610 for the passing case (for allocno r117).
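
The step 1 computation can be replayed with the numbers quoted above. This is a minimal sketch, not the IRA code: cost_improvement is a hypothetical helper, and since only the totals are reported, the split of the remaining 390 between the copy cost saving and base_cost (0 and 390 here) is an assumption made purely for illustration.

```cpp
// Step 1 of improve_allocation() as described above, for a single
// hard register: start from its usage cost, then subtract the copy
// cost saving and the cost of the currently assigned register.
constexpr int cost_improvement(int hard_reg_cost, int copy_cost_saving,
                               int base_cost)
{
    int cost = hard_reg_cost;
    cost -= copy_cost_saving;
    cost -= base_cost;
    return cost;
}

// Assumed split: copy_cost_saving = 0, base_cost = 390.
static_assert(cost_improvement(0, 0, 390) == -390, "failing case, r118 and r31");
static_assert(cost_improvement(2000, 0, 390) == 1610, "passing case, r117 and r31");
```

With the same amount subtracted in both cases, the whole 2000 difference in costs[r31] comes from the hard_reg_costs[] entry, which is why the zero entries for r14..r31 look like the root of the problem.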

Another issue(?) is that in step 2, the only conflicting allocno for r118 is
the allocno for r120, which is used to hold the value of the condition check.
The pseudo r120 has been assigned to r100 by the graph coloring step. But r100
is not in the set of profitable hard registers for r118. (The profitable hard
regs are: [0, 3-12, 14-31]). So the allocno for r120 is not considered for
spilling.
And finally in step 3, r31 is assigned to r118, though r31 has not been
assigned to any conflicting allocno. Perhaps improve_allocation() should only
consider registers that have been assigned to conflicting allocnos, and not
other registers, since its stated aim is to see if spilling conflicting
allocnos can result in a better allocation.

I am investigating why hard_reg_costs[] has 0 cost for r14..r31.

[Bug c++/109515] New: Diagnostic request: warning on out-of-order structured bindings names

2023-04-14 Thread barry.revzin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109515

Bug ID: 109515
   Summary: Diagnostic request: warning on out-of-order structured
bindings names
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: barry.revzin at gmail dot com
  Target Milestone: ---

Consider this example:

struct div_t {
int quot;
int rem;
};

auto div(int, int) -> div_t;

int main() {
auto [rem, quot] = div(1, 2);
return quot;
}

I'm using structured bindings, but div_t's members are in the order quot then
rem, but I accidentally typed rem then quot. This is a bug! Currently, nobody
warns here though.

At the very least, if I'm:
(a) using a name in a structured binding,
(b) where we fall into the 3rd case of structured bindings (struct, not
array/tuple),
(c) the name I'm using is the name of one of the members of the type, and
(d) the structured binding is in the wrong place

I think that's a situation where a warning would have a low false-positive
rate and could be a useful -Wall (or at least -Wextra) kind of warning.

There's a more expansive potential diagnostic, if I used names that weren't
names of members at all, but I think that sort of warning would have to be
opt-in.
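
Conditions (c) and (d) above boil down to a simple name comparison. The following C++17 sketch shows that check in isolation; has_misplaced_binding is a hypothetical helper, not an existing GCC function, and it assumes the member names and binding names are already available as equally sized lists.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// True if some binding uses the name of a data member that sits at a
// different position in the class, e.g. "auto [rem, quot]" for a
// struct declared as { quot; rem; }.
template <std::size_t N>
constexpr bool has_misplaced_binding(
    const std::array<std::string_view, N>& members,
    const std::array<std::string_view, N>& bindings)
{
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            if (i != j && bindings[i] == members[j])
                return true;
    return false;
}

using Names = std::array<std::string_view, 2>;

// The example from this report would be flagged ...
static_assert(has_misplaced_binding(Names{"quot", "rem"}, Names{"rem", "quot"}),
              "swapped names");
// ... while matching or unrelated names would not.
static_assert(!has_misplaced_binding(Names{"quot", "rem"}, Names{"quot", "rem"}),
              "names in order");
static_assert(!has_misplaced_binding(Names{"quot", "rem"}, Names{"q", "r"}),
              "unrelated names");
```

Names that do not match any member at all are deliberately ignored here, which corresponds to leaving the more expansive diagnostic as opt-in.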

[Bug c++/109515] Diagnostic request: warning on out-of-order structured bindings names

2023-04-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109515

Andrew Pinski  changed:

   What|Removed |Added

 Blocks||87403
   Keywords||diagnostic
   Severity|normal  |enhancement


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403
[Bug 87403] [Meta-bug] Issues that suggest a new warning

[Bug tree-optimization/109154] [13 regression] jump threading de-optimizes nested floating point comparisons

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

--- Comment #48 from Jakub Jelinek  ---
For PHIs with 3+ arguments, unless all the arguments but one are the same, it
seems we emit one more COND_EXPR than we need to, even when not doing anything
smart.
The
  /* Common case.  */
case loop emits args_len COND_EXPRs, whereas selecting one of the args_len
values should need only args_len - 1 COND_EXPRs.
So e.g. for the #c47 first testcase, we emit:
  _7 = a_10 < 0;
  _21 = a_10 >= 0;
  _22 = a_10 < e_11(D);
  _23 = _21 & _22;
  _26 = a_10 >= e_11(D);
  _27 = _21 & _26;
  _ifc__42 = _7 ? 1 : t_13;
  _ifc__43 = _23 ? t_13 : _ifc__42;
  t_6 = _27 ? 0 : _ifc__43;
Even when not trying to be smart about which predicate goes first and which
goes last (currently we only make sure that the argument with the most
duplicates goes last), I don't see why we should emit args_len COND_EXPRs.  If
we check just the last args_len - 1 predicates or the first args_len - 1
predicates, then when all of those predicates are false the result should be
the argument that wasn't otherwise picked.  So the above, without smart
optimizations, IMHO could be replaced with either
  _ifc__42 = _23 ? t_13 : 1;
  t_6 = _27 ? 0 : _ifc__42;
or
  _ifc__42 = _23 ? t_13 : 0;
  t_6 = _7 ? 1 : _ifc__42;
etc.
But we really should also do the smart optimization: look through the
bb_predicates to see which one is BIT_AND_EXPRed with the inversion of some
other arg's predicate, and avoid those BIT_AND_EXPRs and redundant comparisons
by sorting them better.
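
As a sanity check on the argument above, the three GIMPLE sequences can be
modelled as plain C++ functions. This is only a sketch: the function names are
hypothetical, the variable names loosely mirror the SSA names in the dump, and
int stands in for whatever type the PHI result really has.

```cpp
// Model of the #c47 testcase: three PHI arguments (1, t, 0) guarded by three
// mutually exclusive predicates, exactly one of which is true.

// Shape of what is emitted today: args_len (3) selects.
int select_with_three_conds(int a, int e, int t) {
    bool p7 = a < 0;
    bool p21 = a >= 0;
    bool p23 = p21 & (a < e);
    bool p27 = p21 & (a >= e);
    int ifc42 = p7 ? 1 : t;
    int ifc43 = p23 ? t : ifc42;
    return p27 ? 0 : ifc43;
}

// First proposed replacement: skip the first predicate, so the value 1 is
// what remains when both tested predicates are false.
int select_dropping_first(int a, int e, int t) {
    bool p21 = a >= 0;
    bool p23 = p21 & (a < e);
    bool p27 = p21 & (a >= e);
    int ifc42 = p23 ? t : 1;
    return p27 ? 0 : ifc42;
}

// Second proposed replacement: skip the last predicate, so the value 0 is
// what remains when both tested predicates are false.
int select_dropping_last(int a, int e, int t) {
    bool p7 = a < 0;
    bool p23 = (a >= 0) & (a < e);
    int ifc42 = p23 ? t : 0;
    return p7 ? 1 : ifc42;
}

// True when both shorter forms agree with the three-select form.
bool all_variants_agree(int a, int e, int t) {
    int expected = select_with_three_conds(a, e, t);
    return select_dropping_first(a, e, t) == expected
        && select_dropping_last(a, e, t) == expected;
}
```

The shorter forms rely on the predicates being exhaustive and mutually
exclusive, which holds here because they partition the values of a.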

[Bug tree-optimization/109154] [13 regression] jump threading de-optimizes nested floating point comparisons

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

--- Comment #49 from Jakub Jelinek  ---
Plus, for args_len of 4 or more, if we don't find some smart sorting, we should
still consider at least some reassociation between the COND_EXPRs.  For
args_len of 4, instead of emitting 3 COND_EXPRs where the second depends on the
first and the third depends on the second, we could emit two independent
COND_EXPRs and a third that depends on those two.
Of course, it might be harmful for register allocation if we do that
aggressively for a very large number of PHI args.
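
One possible shape of that reassociation, sketched in C++ under stated
assumptions: exactly one of four predicates holds, the parameter which encodes
the true one, and the names select_chain and select_tree are hypothetical.
This is an illustration of the dependency structure, not what tree-if-conv.cc
does.

```cpp
// Predicate p_i is modelled as (which == i); exactly one is true for
// which in 0..3, and it selects the value v_i.

// Serial form: each select depends on the previous one.
int select_chain(int which, int v0, int v1, int v2, int v3) {
    int t1 = (which == 1) ? v1 : v0;
    int t2 = (which == 2) ? v2 : t1;   // depends on t1
    return (which == 3) ? v3 : t2;     // depends on t2
}

// Reassociated form: two independent selects feed a third.
int select_tree(int which, int v0, int v1, int v2, int v3) {
    int lo = (which == 1) ? v1 : v0;           // independent of hi
    int hi = (which == 3) ? v3 : v2;           // independent of lo
    bool upper = (which == 2) | (which == 3);  // combined predicate
    return upper ? hi : lo;                    // depends on lo and hi
}
```

In this sketch the shorter dependency chain is paid for with an extra OR to
form the combined predicate, and lo and hi are live at the same time, which is
where the register-pressure concern comes from.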

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Eric Botcazou :

https://gcc.gnu.org/g:94a21e008c4778e446321b1355f61abc75a076be

commit r13-7187-g94a21e008c4778e446321b1355f61abc75a076be
Author: Eric Botcazou 
Date:   Fri Apr 14 20:14:07 2023 +0200

Fix build failure of Ada runtime for Aarch64 targets

The Aarch64 back-end now asserts that the main variant of scalar types
has TYPE_USER_ALIGN cleared, and that's not the case for scalar types
declared with a confirming alignment clause in Ada.

gcc/ada/
PR bootstrap/109510
* gcc-interface/decl.cc (gnat_to_gnu_entity) : Reset align
to zero if its value is equal to TYPE_ALIGN and the type is scalar.
Set TYPE_USER_ALIGN on the type only if align is positive.

[Bug bootstrap/109510] [13 Regression] bootstrap with Ada broken on aarch64

2023-04-14 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

Eric Botcazou  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #14 from Eric Botcazou  ---
Fix applied.

[Bug target/108910] [12 Regression] Further ICE in aarch64_layout_arg

2023-04-14 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108910
Bug 108910 depends on bug 109510, which changed state.

Bug 109510 Summary: [13 Regression] bootstrap with Ada broken on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109510

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

[Bug tree-optimization/109154] [13 regression] jump threading de-optimizes nested floating point comparisons

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

--- Comment #50 from Jakub Jelinek  ---
Anyway, given that the sorting puts the entry with the maximum number of
occurrences last, I think that, without trying anything smarter, the best
option for now would be to avoid evaluating that last condition.
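
A generic sketch of that idea in C++, under stated assumptions: vals[i] is
chosen when preds[i] is true, exactly one predicate holds, and the last entry
stands for the argument with the most occurrences. The function name and the
selects_emitted counter are hypothetical and only there to make the
args_len - 1 count visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Starts from the last value as the default and never tests preds.back(),
// so only vals.size() - 1 selects are performed.
int select_without_last_predicate(const std::vector<bool>& preds,
                                  const std::vector<int>& vals,
                                  std::size_t* selects_emitted) {
    assert(!vals.empty() && preds.size() == vals.size());
    int result = vals.back();
    std::size_t count = 0;
    for (std::size_t i = vals.size() - 1; i > 0; --i) {
        result = preds[i - 1] ? vals[i - 1] : result;
        ++count;
    }
    if (selects_emitted != nullptr)
        *selects_emitted = count;
    return result;
}
```

With three arguments this performs two selects, matching the count argued for
earlier in the thread.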

[Bug fortran/109511] incorrectly rejects set_exponent with constant negative exponent

2023-04-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109511

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |anlauf at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from anlauf at gcc dot gnu.org ---
Submitted, with an additional fix for negative X:

https://gcc.gnu.org/pipermail/fortran/2023-April/059182.html

[Bug tree-optimization/109154] [13 regression] jump threading de-optimizes nested floating point comparisons

2023-04-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

--- Comment #51 from Jakub Jelinek  ---
Dumb untested patch which saves 2 instructions from each of those testcases:

--- gcc/tree-if-conv.cc.jj  2023-04-12 08:53:58.264496474 +0200
+++ gcc/tree-if-conv.cc 2023-04-14 21:02:42.403826690 +0200
@@ -2071,7 +2071,7 @@ predicate_scalar_phi (gphi *phi, gimple_
 }

   /* Put element with max number of occurences to the end of ARGS.  */
-  if (max_ind != -1 && max_ind +1 != (int) args_len)
+  if (max_ind != -1 && max_ind + 1 != (int) args_len)
 std::swap (args[args_len - 1], args[max_ind]);

   /* Handle one special case when number of arguments with different values
@@ -2116,12 +2116,12 @@ predicate_scalar_phi (gphi *phi, gimple_
   vec<int> *indexes;
   tree type = TREE_TYPE (gimple_phi_result (phi));
   tree lhs;
-  arg1 = args[1];
-  for (i = 0; i < args_len; i++)
+  arg1 = args[args_len - 1];
+  for (i = args_len - 1; i > 0; i--)
{
- arg0 = args[i];
- indexes = phi_arg_map.get (args[i]);
- if (i != args_len - 1)
+ arg0 = args[i - 1];
+ indexes = phi_arg_map.get (args[i - 1]);
+ if (i != 1)
lhs = make_temp_ssa_name (type, NULL, "_ifc_");
  else
lhs = res;

[Bug fortran/105800] Segfault deallocating a class, dimension(:) array

2023-04-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105800

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-04-14

--- Comment #2 from anlauf at gcc dot gnu.org ---
(In reply to martin from comment #1)
> The fixed attachment declares "a" as "type(t), dimension(:), pointer". This
> shows the described error. It is still present in the current 13 development
> branch.

It is the pointer assignment in subroutine init

self%x => a

that fails.

[Bug fortran/109511] incorrectly rejects set_exponent with constant negative exponent

2023-04-14 Thread sebastien at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109511

--- Comment #5 from Sébastien Villemot  ---
Thanks for your work on this issue!

[Bug fortran/109500] SIGABRT when calling a function that returns an unallocated value

2023-04-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109500

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-04-14
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Keywords|                            |accepts-invalid, diagnostic
           Priority|P3                          |P5
           Severity|normal                      |enhancement

--- Comment #8 from anlauf at gcc dot gnu.org ---
(In reply to Steve Kargl from comment #7)
> I suspect it works by accident, but I don't have enough
> time at the moment to go read the gfortran source.  What 
> is likely happening is gfortran checks that the actual
> and dummy argument both have the allocatable attribute.
> For the actual argument, the symbol is probably marked
> an allocatable attribute and an internal attribute 
> that designates this as a function-result, and gfortran
> does not check for the latter.

So marking as a missed diagnostic for invalid code.
