[PATCH v1] RISC-V: Support {U}INT64 to FP16 auto-vectorization

2023-09-27 Thread pan2 . li
From: Pan Li 

This patch adds auto-vectorization support for the conversion from
INT64 to FP16. The conversion takes the two steps below.

* INT64 to FP32.
* FP32 to FP16.
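
The two steps above can be mirrored in scalar C. The sketch below only
illustrates the order of the conversions; it is not the code GCC emits and
it makes no claim about matching a direct INT64 to FP16 conversion for every
input. The helper names float_to_half_bits and int64_to_half_two_step are
hypothetical, and the FP16 result is returned as raw uint16_t bits so the
example does not depend on compiler support for _Float16.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Narrow a float to IEEE binary16 bits, rounding to nearest even.
   Hypothetical helper, for illustration only.  */
static uint16_t float_to_half_bits(float f)
{
    uint32_t x;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&x, &f, sizeof x);

    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t mag = x & 0x7fffffffu;

    if (mag > 0x7f800000u)          /* NaN: return a quiet NaN.  */
        return (uint16_t)(sign | 0x7e00u);
    if (mag >= 0x477ff000u)         /* >= 65520 or Inf: rounds to Inf.  */
        return (uint16_t)(sign | 0x7c00u);
    if (mag < 0x38800000u) {        /* Below 2^-14: zero or FP16 subnormal.  */
        float a;
        uint32_t u;
        memcpy(&a, &mag, sizeof a);
        a += 0.5f;                  /* Result lands in the low mantissa bits.  */
        memcpy(&u, &a, sizeof u);
        return (uint16_t)(sign | (u - 0x3f000000u));
    }
    /* Normal range: round 23 mantissa bits to 10, then rebias 127 -> 15.  */
    uint32_t rounded = mag + 0x0fffu + ((mag >> 13) & 1u);
    return (uint16_t)(sign | ((rounded - 0x38000000u) >> 13));
}

/* Step 1: INT64 => FP32.  Step 2: FP32 => FP16.  */
static uint16_t int64_to_half_two_step(int64_t v)
{
    float single = (float)v;
    return float_to_half_bits(single);
}
```

For instance, int64_to_half_two_step (1) yields 0x3c00, the FP16 encoding of
1.0, and inputs beyond the FP16 range saturate to infinity (0x7c00).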

Given sample code as below:
void
test_func (int64_t * __restrict a, _Float16 *b, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    b[i] = (_Float16) (a[i]);
}

Before this patch:
test.c:6:26: missed: couldn't vectorize loop
test.c:6:26: missed: not vectorized: unsupported data-type
ld      a0,0(s0)
call    __floatdihf
fsh     fa0,0(s1)
addi    s0,s0,8
addi    s1,s1,2
bne     s2,s0,.L3
ld      ra,24(sp)
ld      s0,16(sp)
ld      s1,8(sp)
ld      s2,0(sp)
addi    sp,sp,32

After this patch:
vsetvli a5,a2,e8,mf8,ta,ma
vle64.v v1,0(a0)
vsetvli a4,zero,e32,mf2,ta,ma
vfncvt.f.x.w v1,v1
vsetvli zero,zero,e16,mf4,ta,ma
vfncvt.f.f.w v1,v1
vsetvli zero,a2,e16,mf4,ta,ma
vse16.v v1,0(a1)

Please note VLS mode is also involved in this patch and covered by the
test cases.

PR target/111506

gcc/ChangeLog:

* config/riscv/autovec.md (2):
* config/riscv/vector-iterators.md:

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-rv32gcv.c:
Adjust checker.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-rv64gcv.c:
Ditto.
* gcc.target/riscv/rvv/autovec/unop/cvt-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/cvt-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/cvt-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   | 24 ++
 gcc/config/riscv/vector-iterators.md  | 38 +++
 .../autovec/conversions/vfncvt-itof-rv32gcv.c |  5 +-
 .../autovec/conversions/vfncvt-itof-rv64gcv.c |  5 +-
 .../gcc.target/riscv/rvv/autovec/unop/cvt-0.c | 21 +
 .../gcc.target/riscv/rvv/autovec/unop/cvt-1.c | 22 +
 .../gcc.target/riscv/rvv/autovec/vls/cvt-0.c  | 47 +++
 7 files changed, 158 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/cvt-0.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index cd0cbdd2889..6dd3b96a423 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -974,6 +974,30 @@ (define_insn_and_split "2"
 }
 [(set_attr "type" "vfncvtitof")])
 
+;; This operation can be performed in the loop vectorizer but unfortunately
+;; not applicable for now. We can remove this pattern after loop vectorizer
+;; is able to take care of INT64 to FP16 conversion.
+(define_insn_and_split "2"
+  [(set (match_operand:  0 "register_operand")
+   (any_float:
+ (match_operand:VWWCONVERTI 1 "register_operand")))]
+  "TARGET_VECTOR && TARGET_ZVFH && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+rtx single = gen_reg_rtx (mode); /* Get vector SF mode.  */
+
+/* Step-1, INT64 => FP32.  */
+emit_insn (gen_2 (single, operands[1]));
+/* Step-2, FP32 => FP16.  */
+emit_insn (gen_trunc2 (operands[0], single));
+
+DONE;
+  }
+  [(set_attr "type" "vfncvtitof")]
+)
+
 ;; =
 ;; == Unary arithmetic
 ;; =
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index b6cd872eb42..c9a7344b1bc 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -1247,6 +1247,24 @@ (define_mode_iterator VWCONVERTI [
   (V512DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && 
TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 4096")
 ])
 
+(define_mode_iterator VWWCONVERTI [
+  (RVVM8DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (RVVM4DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (RVVM2DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (RVVM1DI "TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+
+  (V1DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (V2DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (V4DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH")
+  (V8DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 64")
+  (V16DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 128")
+  (V32DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 256")
+  (V64DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 512")
+  (V128DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && 
TARGET_MIN_VLEN >= 1024")
+  (V256DI "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_64 && TARGET_ZVFH && 

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #8 from Sergei Trofimovich  ---
Looks like it's mainly -O0.

Why not try to use at least -O1 for bootstrap? Perhaps it was a safe default to
work around host compiler bugs in C days.

But nowadays gcc uses -std=c++11 with quite a bit of abstractions to remove at
-O0. Maybe having a disableable -O1 (or even default -O2) would be a better
default?

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #7 from Andrew Pinski  ---
I am not sure there is much to be done here, really, since the issue is that
profiledbootstrap will use -O0 for stage1 to make sure we don't run into bugs
in the host compiler (though we still run into issues there).

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #6 from Sergei Trofimovich  ---
And here, for completeness, is default checking with CC='gcc -g -O2' CXX='g++ -g
-O2':

$ ~/dev/git/gcc/configure --disable-multilib --enable-languages=c,c++ 'CC=gcc
-g -O2' 'CXX=g++ -g -O2'

$ /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/ -v
Reading specs from /tmp/gb/./prev-gcc/specs
COLLECT_GCC=/tmp/gb/./prev-gcc/xg++
COLLECT_LTO_WRAPPER=/tmp/gb/./prev-gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/slyfox/dev/git/gcc/configure --disable-multilib
--enable-languages=c,c++ CC='gcc -g -O2' CXX='g++ -g -O2'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230926 (experimental) (GCC)

Result is 1m57s:

$ time /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/
-B/usr/local/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include
-I/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g
-O2 -fno-checking -gtoggle -fprofile-generate -DIN_GCC -fno-exceptions
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -fno-PIE -I. -I.
-I/home/slyfox/dev/git/gcc/gcc -I/home/slyfox/dev/git/gcc/gcc/.
-I/home/slyfox/dev/git/gcc/gcc/../include
-I/home/slyfox/dev/git/gcc/gcc/../libcpp/include
-I/home/slyfox/dev/git/gcc/gcc/../libcody
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libbacktrace -o insn-recog.o -MT insn-recog.o
-MMD -MP -MF ./.deps/insn-recog.TPo insn-recog.cc

real    1m57,549s
user    1m56,617s
sys     0m0,780s

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #5 from Sergei Trofimovich  ---
(In reply to Andrew Pinski from comment #3)
> Note prev-gcc/cc1plus is compiled at -O0 also which definitely makes things
> worse here.

Also tried with: '--enable-checking=release -O2 -g' as:

$ ~/dev/git/gcc/configure --disable-multilib --enable-languages=c,c++
--enable-checking=release 'CC=gcc -g -O2' 'CXX=g++ -g -O2'

$ /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/ -v
Reading specs from /tmp/gb/./prev-gcc/specs
COLLECT_GCC=/tmp/gb/./prev-gcc/xg++
COLLECT_LTO_WRAPPER=/tmp/gb/./prev-gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/slyfox/dev/git/gcc/configure --disable-multilib
--enable-languages=c,c++ --enable-checking=release CC='gcc -g -O2' CXX='g++ -g
-O2'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230926 (experimental) (GCC)

Result is a lot better: 1m55s:

$ time /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/
-B/usr/local/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include
-I/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g
-O2 -fno-checking -gtoggle -fprofile-generate -DIN_GCC -fno-exceptions
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -DHAVE_CONFIG_H -fno-PIE -I. -I.
-I/home/slyfox/dev/git/gcc/gcc -I/home/slyfox/dev/git/gcc/gcc/.
-I/home/slyfox/dev/git/gcc/gcc/../include
-I/home/slyfox/dev/git/gcc/gcc/../libcpp/include
-I/home/slyfox/dev/git/gcc/gcc/../libcody
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libbacktrace -o insn-recog.o -MT insn-recog.o
-MMD -MP -MF ./.deps/insn-recog.TPo insn-recog.cc

real    1m55,334s
user    1m54,146s
sys     0m0,993s

Re: Test with an lto-build of libgfortran.

2023-09-27 Thread Thomas Koenig via Gcc

Hi Toon,

During the GNU Tools Cauldron we discussed (at the BoF: IPA & LTO) the 
possibility (and hazards) of building the run time libraries for various 
compilers with -flto, enabling an -flto -static linking of programs with 
the run time library available during link time optimizations.


This would be a big win for libgfortran, especially the array functions,
knowing that stride==1 can be a _big_ win for optimization.  This is
why LTO is such an excellent idea for Fortran in general, and for the
library in particular.

There is a PR on this with quite some discussion already,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77278 .

I've put you in CC of that bug, maybe we can discuss there more
in detail.

One point about the array functions: In the library, we use sort of a
ripple carry algorithm to step through the arrays.  This saves space
and is general, but the performance (esp. for the most common one- and
two-dimensional arrays) can suffer.
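
The "ripple carry" stepping described above can be sketched as an odometer
over per-dimension counters. This is an illustration only, not the
libgfortran code: the function strided_sum, the MAX_RANK limit, and the
separate extent/stride arrays are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANK 7   /* Assumed upper bound on the rank, for this sketch.  */

/* Sum every element of an array described by rank, extents and strides
   (strides counted in elements).  */
static double strided_sum(const double *base, int rank,
                          const ptrdiff_t *extent, const ptrdiff_t *stride)
{
    ptrdiff_t count[MAX_RANK] = {0};
    ptrdiff_t off = 0;
    double sum = 0.0;

    assert(rank >= 1 && rank <= MAX_RANK);
    for (int d = 0; d < rank; d++)
        if (extent[d] <= 0)
            return 0.0;             /* Empty array.  */

    for (;;) {
        sum += base[off];

        /* Advance the innermost dimension ...  */
        int d = 0;
        count[0]++;
        off += stride[0];

        /* ... and ripple the carry outwards while a dimension overflows.  */
        while (count[d] == extent[d]) {
            off -= stride[d] * extent[d];   /* Rewind this dimension.  */
            count[d] = 0;
            if (++d == rank)
                return sum;                 /* Carried out of the last one.  */
            count[d]++;
            off += stride[d];
        }
    }
}
```

With rank 1 and a stride of 1 this collapses to a plain linear walk, which is
the kind of special case a compiler could exploit if the stride were visible
at compile time.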

[...]

The full question of "lto-ing" run time libraries is more complicated 
than just "whether it works" as those who attended the BoF will recall.


I didn't attend the Cauldron (but that discussion would have been
very interesting).  I think for libgfortran, a first step would be
additional work to get declarations on both sides to agree (which is
worth doing anyway).

Best regards

Thomas




[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #4 from Sergei Trofimovich  ---
(In reply to Andrew Pinski from comment #2)
> Can you also try with --enable-checking=release to double check that it is
> not the extra compile time checks which is causing issues ...

Added --enable-checking=release:

$ /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/ -v
Reading specs from /tmp/gb/./prev-gcc/specs
COLLECT_GCC=/tmp/gb/./prev-gcc/xg++
COLLECT_LTO_WRAPPER=/tmp/gb/./prev-gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/slyfox/dev/git/gcc/configure --disable-multilib
--enable-languages=c,c++ --enable-checking=release
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230926 (experimental) (GCC)

Result did not change much:

$ time /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/
-B/usr/local/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include
-I/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g
-O2 -fno-checking -gtoggle -fprofile-generate -DIN_GCC -fno-exceptions
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -DHAVE_CONFIG_H -fno-PIE -I. -I.
-I/home/slyfox/dev/git/gcc/gcc -I/home/slyfox/dev/git/gcc/gcc/.
-I/home/slyfox/dev/git/gcc/gcc/../include
-I/home/slyfox/dev/git/gcc/gcc/../libcpp/include
-I/home/slyfox/dev/git/gcc/gcc/../libcody
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libbacktrace -o insn-recog.o -MT insn-recog.o
-MMD -MP -MF ./.deps/insn-recog.TPo insn-recog.cc

real    12m18,994s
user    12m17,085s
sys     0m1,001s

Re: [PATCH 00/18] Support -mevex512 for AVX512

2023-09-27 Thread ZiNgA BuRgA

That sounds about right.  The code I had in mind would perhaps look like:


#if defined(__AVX512BW__) && defined(__AVX512VL__)
    #if defined(__EVEX256__) && !defined(__EVEX512__)
    // compiled code is AVX10.1/256 and AVX512 compatible
    #else
    // compiled code is only AVX512 compatible
    #endif

    // some code which only uses 256b instructions
    __m256i...
#endif


The '__EVEX256__' define would avoid needing to check compiler versions.
Hopefully you can align it with whatever Clang does: 
https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661/18


Thanks!

On 28/09/2023 12:26 pm, Hu, Lin1 wrote:

Hi,

Thanks for your reply.

I'd like to verify that our understanding of your requirements is correct, and 
that __EVEX256__ can be considered a default macro to determine whether the 
compiler supports the __EVEX***__ series of switches.

For example:

I have a segment of code like:
#if defined(__EVEX512__)
__mm512.*__;
#else
__mm256.*__;
#endif

But __EVEX512__ being undefined doesn't mean I only need 256-bit; maybe I am
using gcc-13, in which case I can still use 512-bit.

So the code should be:
#if defined(__EVEX512__)
__mm512.*__;
#elif defined(__EVEX256__)
__mm256.*__;
#else
__mm512.*__;
#endif

If we understand correctly, we'll consider the request. But since we're about 
to have a vacation, follow-up replies may be a bit slower.

BRs,
Lin

-----Original Message-----
From: ZiNgA BuRgA 
Sent: Thursday, September 28, 2023 8:32 AM
To: Hu, Lin1 ; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 00/18] Support -mevex512 for AVX512

Thanks for the new patch!

I see that there's a new __EVEX512__ define.  Will there be some __EVEX256__ 
(or maybe some max EVEX width) define, so that code can detect whether the 
compiler supports AVX10.1/256 without resorting to version checks?






RE: [PATCH 00/18] Support -mevex512 for AVX512

2023-09-27 Thread Hu, Lin1
Hi, 

Thanks for your reply.

I'd like to verify that our understanding of your requirements is correct, and 
that __EVEX256__ can be considered a default macro to determine whether the 
compiler supports the __EVEX***__ series of switches. 

For example:

I have a segment of code like:
#if defined(__EVEX512__)
__mm512.*__;
#else
__mm256.*__;
#endif

But __EVEX512__ being undefined doesn't mean I only need 256-bit; maybe I am
using gcc-13, in which case I can still use 512-bit.

So the code should be:
#if defined(__EVEX512__)
__mm512.*__;
#elif defined(__EVEX256__)
__mm256.*__;
#else
__mm512.*__;
#endif
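
The three-way check above could be written as compilable C along the
following lines. This is a sketch under assumptions: __EVEX256__ is only a
macro proposed in this thread, not something the compiler is known to
define, and MAX_EVEX_BITS and max_evex_bits are hypothetical names. The
existing __AVX512F__ macro is used here to stand in for "AVX512 on a
compiler without the __EVEX***__ macros".

```c
#include <assert.h>

#if defined(__EVEX512__)
#  define MAX_EVEX_BITS 512   /* Compiler reports 512-bit EVEX support.  */
#elif defined(__EVEX256__)
#  define MAX_EVEX_BITS 256   /* Proposed macro: 256-bit only.  */
#elif defined(__AVX512F__)
#  define MAX_EVEX_BITS 512   /* Older compiler: no __EVEX***__ macros.  */
#else
#  define MAX_EVEX_BITS 0     /* No AVX512 at all.  */
#endif

/* Widest EVEX vector width, in bits, that the code may assume.  */
static int max_evex_bits(void)
{
    int bits = MAX_EVEX_BITS;
    assert(bits == 0 || bits == 256 || bits == 512);
    return bits;
}
```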

If we understand correctly, we'll consider the request. But since we're about 
to have a vacation, follow-up replies may be a bit slower.

BRs,
Lin

-----Original Message-----
From: ZiNgA BuRgA  
Sent: Thursday, September 28, 2023 8:32 AM
To: Hu, Lin1 ; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 00/18] Support -mevex512 for AVX512

Thanks for the new patch!

I see that there's a new __EVEX512__ define.  Will there be some __EVEX256__ 
(or maybe some max EVEX width) define, so that code can detect whether the 
compiler supports AVX10.1/256 without resorting to version checks?




Re: Re: [PATCH v2] RISC-V: Bugfix for RTL check[PR111533]

2023-09-27 Thread Li Xu
Committed, thanks juzhe.
--
Li Xu
>LGTM. Thanks for fixing it.
>
>
>
>juzhe.zh...@rivai.ai
>
>From: Li Xu
>Date: 2023-09-28 09:33
>To: gcc-patches
>CC: kito.cheng; palmer; juzhe.zhong; xuli
>Subject: [PATCH v2] RISC-V: Bugfix for RTL check[PR111533]
>From: xuli 
>
>Consider the following situation:
>BB5: local_dem(RVV Insn 1, AVL(reg zero))
>RVV Insn 1: vmv.s.x, AVL (const_int 1)
>RVV Insn 2: vredsum.vs, AVL(reg zero)
>
>vmv.s.x has vl operand, the following code will get
>avl (const_int) from RVV Insn 1.
>rtx avl = has_vl_op (insn->rtl ()) ? get_vl (insn->rtl ())
>   : dem.get_avl ();
>
>If use REGNO for const_int, the compiler will crash:
>
>during RTL pass: vsetvl
>res_debug.c: In function '__dn_count_labels':
>res_debug.c:1050:1: internal compiler error: RTL check: expected code 'reg',
>have 'const_int' in rhs_regno, at rtl.h:1934
>1050 | }
>  | ^
>0x8fb169 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, 
>char const*)
>../.././gcc/gcc/rtl.cc:770
>0x1399818 rhs_regno(rtx_def const*)
>../.././gcc/gcc/rtl.h:1934
>0x1399818 anticipatable_occurrence_p
>../.././gcc/gcc/config/riscv/riscv-vsetvl.cc:348
>
>So in this case avl should be obtained from dem.
>
>Another issue is caused by the following code:
>HOST_WIDE_INT diff = INTVAL (builder.elt (i)) - i;
>
>during RTL pass: expand
>../../.././gcc/libgfortran/generated/matmul_c4.c: In function 'matmul_c4':
>../../.././gcc/libgfortran/generated/matmul_c4.c:2906:39: internal compiler 
>error: RTL check:
>expected code 'const_int', have 'const_poly_int' in expand_const_vector,
>at config/riscv/riscv-v.cc:1149
>
>The builder.elt (i) can be either const_int or const_poly_int.
>
>PR target/111533
>
>gcc/ChangeLog:
>
>* config/riscv/riscv-v.cc (expand_const_vector): Fix bug.
>* config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix bug.
>
>gcc/testsuite/ChangeLog:
>
>* gcc.target/riscv/rvv/base/pr111533-1.c: New test.
>* gcc.target/riscv/rvv/base/pr111533-2.c: New test.
>---
>gcc/config/riscv/riscv-v.cc   |  5 ++--
>gcc/config/riscv/riscv-vsetvl.cc  |  3 +-
>.../gcc.target/riscv/rvv/base/pr111533-1.c    | 15 ++
>.../gcc.target/riscv/rvv/base/pr111533-2.c    | 29 +++
>4 files changed, 48 insertions(+), 4 deletions(-)
>create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c
>create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c
>
>diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
>index 359fb2ced8b..26700cfc732 100644
>--- a/gcc/config/riscv/riscv-v.cc
>+++ b/gcc/config/riscv/riscv-v.cc
>@@ -1149,8 +1149,9 @@ expand_const_vector (rtx target, rtx src)
>  for (unsigned int i = 0; i < v.npatterns (); ++i)
>{
>  /* Calculate the diff between the target sequence and
>-  vid sequence.  */
>-   HOST_WIDE_INT diff = INTVAL (builder.elt (i)) - i;
>+  vid sequence.  The elt (i) can be either const_int or
>+  const_poly_int. */
>+   poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i;
>  v.quick_push (gen_int_mode (diff, v.inner_mode ()));
>}
>  /* Step 2: Generate result = VID + diff.  */
>diff --git a/gcc/config/riscv/riscv-vsetvl.cc 
>b/gcc/config/riscv/riscv-vsetvl.cc
>index 7af33e7ea6f..af8c31d873c 100644
>--- a/gcc/config/riscv/riscv-vsetvl.cc
>+++ b/gcc/config/riscv/riscv-vsetvl.cc
>@@ -307,8 +307,7 @@ anticipatable_occurrence_p (const bb_info *bb, const 
>vector_insn_info dem)
>   if (dem.has_avl_reg ())
> {
>   /* rs1 (avl) are not modified in the basic block prior to the VSETVL.  
>*/
>-  rtx avl
>- = has_vl_op (insn->rtl ()) ? get_vl (insn->rtl ()) : dem.get_avl ();
>+  rtx avl = dem.get_avl_or_vl_reg ();
>   if (dem.dirty_p ())
>{
>  gcc_assert (!vsetvl_insn_p (insn->rtl ()));
>diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c 
>b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c
>new file mode 100644
>index 000..aba26dfac89
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c
>@@ -0,0 +1,15 @@
>+/* { dg-do compile } */
>+/* { dg-options "-march=rv64gcv -mabi=lp64d -O2 -ffast-math -ftree-vectorize" 
>} */
>+
>+#include 
>+
>+typedef _Complex float GFC_COMPLEX_4;
>+
>+void
>+test (GFC_COMPLEX_4 *a, GFC_COMPLEX_4 *b, GFC_COMPLEX_4 c, ptrdiff_t i, 
>ptrdiff_t j)
>+{
>+  ptrdiff_t l;
>+  for (l = 0; l <= i; ++l)
>+    c += b[l] * a[j];
>+  b[j] = c;
>+}
>diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c 
>b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c
>new file mode 100644
>index 000..a4d2011b74b
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c
>@@ -0,0 +1,29 @@
>+/* { dg-do compile } */
>+/* { dg-options "-march=rv64gcv -mabi=lp64d -O2" } */
>+
>+#include 
>+
>+/* Return the number of DNS hierarchy levels in the name. */
>+int
>+test (const char *name) {
>+ int i, len, count;
>+
>+ len = strlen(name);
>+ for (i = 0, count = 0; i < len; i++) {

[Bug target/111533] [14 Regression] ICE: RTL check: expected code 'reg', have 'const_int' in rhs_regno, at rtl.h:1934

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111533

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Li Xu :

https://gcc.gnu.org/g:110ffb2d8d3a64b32dd56ac995c2e30e8f64d4dc

commit r14-4301-g110ffb2d8d3a64b32dd56ac995c2e30e8f64d4dc
Author: xuli 
Date:   Thu Sep 28 01:29:12 2023 +

RISC-V: Bugfix for RTL check[PR111533]

Consider the following situation:
BB5: local_dem(RVV Insn 1, AVL(reg zero))
RVV Insn 1: vmv.s.x, AVL (const_int 1)
RVV Insn 2: vredsum.vs, AVL(reg zero)

vmv.s.x has vl operand, the following code will get
avl (const_int) from RVV Insn 1.
rtx avl = has_vl_op (insn->rtl ()) ? get_vl (insn->rtl ())
   : dem.get_avl ();

If use REGNO for const_int, the compiler will crash:

during RTL pass: vsetvl
res_debug.c: In function '__dn_count_labels':
res_debug.c:1050:1: internal compiler error: RTL check: expected code
'reg',
have 'const_int' in rhs_regno, at rtl.h:1934
 1050 | }
  | ^
0x8fb169 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int,
char const*)
../.././gcc/gcc/rtl.cc:770
0x1399818 rhs_regno(rtx_def const*)
../.././gcc/gcc/rtl.h:1934
0x1399818 anticipatable_occurrence_p
../.././gcc/gcc/config/riscv/riscv-vsetvl.cc:348

So in this case avl should be obtained from dem.

Another issue is caused by the following code:
HOST_WIDE_INT diff = INTVAL (builder.elt (i)) - i;

during RTL pass: expand
../../.././gcc/libgfortran/generated/matmul_c4.c: In function 'matmul_c4':
../../.././gcc/libgfortran/generated/matmul_c4.c:2906:39: internal compiler
error: RTL check:
expected code 'const_int', have 'const_poly_int' in expand_const_vector,
at config/riscv/riscv-v.cc:1149

The builder.elt (i) can be either const_int or const_poly_int.

PR target/111533

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Fix bug.
* config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix
bug.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111533-1.c: New test.
* gcc.target/riscv/rvv/base/pr111533-2.c: New test.

Re: [PATCH V4 2/2] rs6000: use mtvsrws to move sf from si p9

2023-09-27 Thread Jiufu Guo


Hi,

"Kewen.Lin"  writes:

> Hi Jeff,
>
> on 2023/8/30 15:43, Jiufu Guo wrote:
>> Hi,
>> 
>> As mentioned in PR108338, on p9, we could use mtvsrws to implement
>> the bitcast from SI to SF (or lowpart DI to SF).
>> 
>> For code:
>>   *(long long*)buff = di;
>>   float f = *(float*)(buff);
>> 
>> "sldi 9,3,32 ; mtvsrd 1,9 ; xscvspdpn 1,1" is generated.
>> A better one would be "mtvsrws 1,3 ; xscvspdpn 1,1".
>> 
>> Compare with previous patch:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623533.html
>> "highpart DI-->SF" is put to a separate patch.
>> 
>> Pass bootstrap and regression on ppc64{,le}.
>> Is this ok for trunk?
>> 
>> BR,
>> Jeff (Jiufu Guo)
>> 
> Nit: Missing a PR marker line.
Ok, this patch would share the PR108338.
>
>> gcc/ChangeLog:
>> 
>>  * config/rs6000/rs6000.md (movsf_from_si): Update to generate mtvsrws
>>  for P9.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.target/powerpc/pr108338.c: Updated to check mtvsrws for p9.
>> 
>> ---
>>  gcc/config/rs6000/rs6000.md | 25 -
>>  gcc/testsuite/gcc.target/powerpc/pr108338.c |  6 +++--
>>  2 files changed, 23 insertions(+), 8 deletions(-)
>> 
>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>> index 
>> 8c92cbf976de915136ad5dba24e69a363d21438d..c03e677bca79e8fb1acb276d07d0acfae009f6d8
>>  100644
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -8280,13 +8280,26 @@ (define_insn_and_split "movsf_from_si"
>>  {
>>rtx op0 = operands[0];
>>rtx op1 = operands[1];
>> -  rtx op2 = operands[2];
>> -  rtx op1_di = gen_rtx_REG (DImode, REGNO (op1));
>> 
>> -  /* Move SF value to upper 32-bits for xscvspdpn.  */
>> -  emit_insn (gen_ashldi3 (op2, op1_di, GEN_INT (32)));
>> -  emit_insn (gen_p8_mtvsrd_sf (op0, op2));
>> -  emit_insn (gen_vsx_xscvspdpn_directmove (op0, op0));
>> +  /* Move lowpart 32-bits from register for SFmode.  */
>> +  if (TARGET_P9_VECTOR)
>> +{
>> +  /* Using mtvsrws;xscvspdpn.  */
>> +  rtx op0_v = gen_rtx_REG (V4SImode, REGNO (op0));
>> +  emit_insn (gen_vsx_splat_v4si (op0_v, op1));
>> +  emit_insn (gen_vsx_xscvspdpn_directmove (op0, op0));
>> +}
>> +  else
>> +{
>> +  rtx op2 = operands[2];
>> +  rtx op1_di = gen_rtx_REG (DImode, REGNO (op1));
>> +
>> +  /* Using ashl;mtvsrd;xscvspdpn.  */
>
> Nit: Use sldi instead of ashl as the others are actual
> mnemonics but ashl isn't.
Oh, yes, thanks for your insightful review!
>
>> +  emit_insn (gen_ashldi3 (op2, op1_di, GEN_INT (32)));
>> +  emit_insn (gen_p8_mtvsrd_sf (op0, op2));
>> +  emit_insn (gen_vsx_xscvspdpn_directmove (op0, op0));
>> +}
>> +
>>DONE;
>>  }
>>[(set_attr "length"
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr108338.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr108338.c
>> index 
>> 6db65595343c2407fc32f68f5f52a1f7196c371d..0565e5254ed0a8cc579cf505a3f865426dcf62ae
>>  100644
>> --- a/gcc/testsuite/gcc.target/powerpc/pr108338.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr108338.c
>> @@ -19,9 +19,11 @@ float  __attribute__ ((noipa)) sf_from_di_off4 (long long 
>> l)
>> 
>>  /* Under lp64, parameter 'l' is in one DI reg, then bitcast sub DI to SF. */
>>  /* { dg-final { scan-assembler-times {\mxscvspdpn\M} 2 { target { lp64 && 
>> has_arch_pwr8 } } } } */
>> -/* { dg-final { scan-assembler-times {\mmtvsrd\M} 2 { target { lp64 && 
>> has_arch_pwr8 } } } } */
>> +/* { dg-final { scan-assembler-times {\mmtvsrd\M} 2 { target { lp64 && { 
>> has_arch_pwr8 && { ! has_arch_pwr9 } } } } } } */
>> +/* { dg-final { scan-assembler-times {\msldi\M} 1 { target { lp64 && { 
>> has_arch_pwr8 && { ! has_arch_pwr9 } } } } } } */
>> +/* { dg-final { scan-assembler-times {\mmtvsrd\M} 1 { target { lp64 && 
>> has_arch_pwr9 } } } } */
>> +/* { dg-final { scan-assembler-times {\mmtvsrws\M} 1 { target { lp64 && 
>> has_arch_pwr9 } } } } */
>>  /* { dg-final { scan-assembler-times {\mrldicr\M} 1 { target { lp64 && 
>> has_arch_pwr8 } } } } */
>> -/* { dg-final { scan-assembler-times {\msldi\M} 1 { target { lp64 && 
>> has_arch_pwr8 } } } } */
>> 
>
> This part might need a refresh, as per the comments on patch 1/2.
Yes, thanks!
>
> The others look good to me, thanks!

BR,
Jeff (Jiufu Guo)
>
> BR,
> Kewen


Re: [PATCH v2] RISC-V: Bugfix for RTL check[PR111533]

2023-09-27 Thread juzhe.zh...@rivai.ai
LGTM. Thanks for fixing it.



juzhe.zh...@rivai.ai
 
From: Li Xu
Date: 2023-09-28 09:33
To: gcc-patches
CC: kito.cheng; palmer; juzhe.zhong; xuli
Subject: [PATCH v2] RISC-V: Bugfix for RTL check[PR111533]
From: xuli 
 
Consider the following situation:
BB5: local_dem(RVV Insn 1, AVL(reg zero))
RVV Insn 1: vmv.s.x, AVL (const_int 1)
RVV Insn 2: vredsum.vs, AVL(reg zero)
 
vmv.s.x has vl operand, the following code will get
avl (const_int) from RVV Insn 1.
rtx avl = has_vl_op (insn->rtl ()) ? get_vl (insn->rtl ())
   : dem.get_avl ();
 
If use REGNO for const_int, the compiler will crash:
 
during RTL pass: vsetvl
res_debug.c: In function '__dn_count_labels':
res_debug.c:1050:1: internal compiler error: RTL check: expected code 'reg',
have 'const_int' in rhs_regno, at rtl.h:1934
1050 | }
  | ^
0x8fb169 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, 
char const*)
../.././gcc/gcc/rtl.cc:770
0x1399818 rhs_regno(rtx_def const*)
../.././gcc/gcc/rtl.h:1934
0x1399818 anticipatable_occurrence_p
../.././gcc/gcc/config/riscv/riscv-vsetvl.cc:348
 
So in this case avl should be obtained from dem.
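
The crash described above comes from reading a register number out of an
rtx that is actually a const_int. The toy model below does not use GCC's
real rtx types; toy_rtx, toy_regno and toy_regno_or are hypothetical names
that only illustrate why the code of the operand has to be known before its
register number is read.

```c
#include <assert.h>

enum toy_code { TOY_REG, TOY_CONST_INT };

struct toy_rtx {
    enum toy_code code;
    union {
        unsigned regno;     /* Valid only when code == TOY_REG.  */
        long long value;    /* Valid only when code == TOY_CONST_INT.  */
    } u;
};

/* Analogue of a checked register access: aborts on a non-register,
   much like the RTL check in the backtrace above.  */
static unsigned toy_regno(const struct toy_rtx *x)
{
    assert(x->code == TOY_REG);
    return x->u.regno;
}

/* Safe variant: return the register number, or FALLBACK when the
   operand is not a register (for example an AVL of const_int 1).  */
static unsigned toy_regno_or(const struct toy_rtx *x, unsigned fallback)
{
    return x->code == TOY_REG ? toy_regno(x) : fallback;
}
```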
 
Another issue is caused by the following code:
HOST_WIDE_INT diff = INTVAL (builder.elt (i)) - i;
 
during RTL pass: expand
../../.././gcc/libgfortran/generated/matmul_c4.c: In function 'matmul_c4':
../../.././gcc/libgfortran/generated/matmul_c4.c:2906:39: internal compiler 
error: RTL check:
expected code 'const_int', have 'const_poly_int' in expand_const_vector,
at config/riscv/riscv-v.cc:1149
 
The builder.elt (i) can be either const_int or const_poly_int.
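
As an illustration of why a plain integer is not enough here, a poly_int-like
value can be modelled as c0 + c1 * x, where x is a quantity only known at
run time. The sketch below is a toy model with hypothetical names
(toy_poly64 and its helpers), not GCC's poly_int64 API.

```c
#include <assert.h>
#include <stdint.h>

/* Value is coeff[0] + coeff[1] * x, with x unknown at compile time.  */
struct toy_poly64 {
    int64_t coeff[2];
};

/* Subtract a compile-time constant: only the constant term changes.  */
static struct toy_poly64 toy_poly_sub_const(struct toy_poly64 a, int64_t c)
{
    a.coeff[0] -= c;
    return a;
}

/* Analogue of reading the value as a plain integer: only valid when
   there is no runtime-dependent term.  */
static int64_t toy_poly_to_constant(struct toy_poly64 a)
{
    assert(a.coeff[1] == 0);
    return a.coeff[0];
}

/* Evaluate the polynomial once x is known.  */
static int64_t toy_poly_eval(struct toy_poly64 a, int64_t x)
{
    return a.coeff[0] + a.coeff[1] * x;
}
```

In this model, computing "element minus index" stays well defined for both
kinds of element, while forcing the result into a plain integer is only valid
for the constant case.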
 
PR target/111533
 
gcc/ChangeLog:
 
* config/riscv/riscv-v.cc (expand_const_vector): Fix bug.
* config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix bug.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/pr111533-1.c: New test.
* gcc.target/riscv/rvv/base/pr111533-2.c: New test.
---
gcc/config/riscv/riscv-v.cc   |  5 ++--
gcc/config/riscv/riscv-vsetvl.cc  |  3 +-
.../gcc.target/riscv/rvv/base/pr111533-1.c| 15 ++
.../gcc.target/riscv/rvv/base/pr111533-2.c| 29 +++
4 files changed, 48 insertions(+), 4 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 359fb2ced8b..26700cfc732 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1149,8 +1149,9 @@ expand_const_vector (rtx target, rtx src)
  for (unsigned int i = 0; i < v.npatterns (); ++i)
{
  /* Calculate the diff between the target sequence and
-  vid sequence.  */
-   HOST_WIDE_INT diff = INTVAL (builder.elt (i)) - i;
+  vid sequence.  The elt (i) can be either const_int or
+  const_poly_int. */
+   poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i;
  v.quick_push (gen_int_mode (diff, v.inner_mode ()));
}
  /* Step 2: Generate result = VID + diff.  */
diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 7af33e7ea6f..af8c31d873c 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -307,8 +307,7 @@ anticipatable_occurrence_p (const bb_info *bb, const 
vector_insn_info dem)
   if (dem.has_avl_reg ())
 {
   /* rs1 (avl) are not modified in the basic block prior to the VSETVL.  */
-  rtx avl
- = has_vl_op (insn->rtl ()) ? get_vl (insn->rtl ()) : dem.get_avl ();
+  rtx avl = dem.get_avl_or_vl_reg ();
   if (dem.dirty_p ())
{
  gcc_assert (!vsetvl_insn_p (insn->rtl ()));
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c
new file mode 100644
index 000..aba26dfac89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O2 -ffast-math -ftree-vectorize" 
} */
+
+#include 
+
+typedef _Complex float GFC_COMPLEX_4;
+
+void
+test (GFC_COMPLEX_4 *a, GFC_COMPLEX_4 *b, GFC_COMPLEX_4 c, ptrdiff_t i, 
ptrdiff_t j)
+{
+  ptrdiff_t l;
+  for (l = 0; l <= i; ++l)
+c += b[l] * a[j];
+  b[j] = c;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c
new file mode 100644
index 000..a4d2011b74b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O2" } */
+
+#include 
+
+/* Return the number of DNS hierarchy levels in the name. */
+int
+test (const char *name) {
+ int i, len, count;
+
+ len = strlen(name);
+ for (i = 0, count = 0; i < len; i++) {
+ /* XXX need to check for \. or use named's nlabels(). */
+ if (name[i] == '.')
+ count++;
+ }
+
+ /* don't count initial wildcard */
+ if (name[0] == '*')
+ if (count)
+ count--;
+

Re: [PATCH V4 1/2] rs6000: optimize moving to sf from highpart di

2023-09-27 Thread Jiufu Guo


Hi,

"Kewen.Lin"  writes:

> Hi Jeff,
>
> on 2023/8/30 15:43, Jiufu Guo wrote:
>> Hi,
>> 
>> Currently, we have the pattern "movsf_from_si2" which was trying
>> to support moving high part DI to SF.
>> 
>> The pattern looks like: XX:SF=bitcast:SF(subreg(YY:DI>>32),0)
>> It only accepts the "ashiftrt" for ">>", but "lshiftrt" is also ok.
>> And the offset of "subreg" is hard code 0, which only works for LE.
>> 
>> "movsf_from_si2" is updated to cover BE for "subreg", and cover
>> the logical shift for ":DI>>32".
>> 
>> Pass bootstrap and regression on ppc64{,le}.
>> Is this ok for trunk?
>> 
>> BR,
>> Jeff (Jiufu Guo)
>> 
>>  PR target/108338
>> 
>> gcc/ChangeLog:
>> 
>>  * config/rs6000/predicates.md (lowpart_subreg_operator): New
>>  define_predicate.
>>  * config/rs6000/rs6000.md (any_rshift): New code_iterator.
>>  (movsf_from_si2): Rename to ...
>>  (movsf_from_si2_): ... this.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.target/powerpc/pr108338.c: New test.
>> 
>> ---
>>  gcc/config/rs6000/predicates.md |  5 +++
>>  gcc/config/rs6000/rs6000.md | 11 +++---
>>  gcc/testsuite/gcc.target/powerpc/pr108338.c | 40 +
>>  3 files changed, 51 insertions(+), 5 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108338.c
>> 
>> diff --git a/gcc/config/rs6000/predicates.md 
>> b/gcc/config/rs6000/predicates.md
>> index 
>> 3552d908e9d149a30993e3e6568466de537336be..e25b3b4864f681d47e9d5c2eb88bcde0aea6d17b
>>  100644
>> --- a/gcc/config/rs6000/predicates.md
>> +++ b/gcc/config/rs6000/predicates.md
>> @@ -2098,3 +2098,8 @@ (define_predicate "macho_pic_address"
>>else
>>  return false;
>>  })
>> +
>> +(define_predicate "lowpart_subreg_operator"
>> +  (and (match_code "subreg")
>> +   (match_test "subreg_lowpart_offset (mode, GET_MODE (SUBREG_REG (op)))
>> +== SUBREG_BYTE (op)")))
>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>> index 
>> 1a9a7b1a47918f39fc91038607f21a8ba9a2e740..8c92cbf976de915136ad5dba24e69a363d21438d
>>  100644
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -8299,18 +8299,19 @@ (define_insn_and_split "movsf_from_si"
>>  "*,  *, p9v,   p8v,   *, *,
>>   p8v,p8v,   p8v,   *")])
>> 
>> +(define_code_iterator any_rshift [ashiftrt lshiftrt])
>
> Nit: Could we name this as any_shiftrt instead and move this close to the
> existing any_* code_iterator?
ok, thanks!
>
>> +
>>  ;; For extracting high part element from DImode register like:
>>  ;; {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
>>  ;; split it before reload with "and mask" to avoid generating shift right
>>  ;; 32 bit then shift left 32 bit.
>> -(define_insn_and_split "movsf_from_si2"
>> +(define_insn_and_split "movsf_from_si2_"
>>[(set (match_operand:SF 0 "gpc_reg_operand" "=wa")
>>  (unspec:SF
>> - [(subreg:SI
>> -   (ashiftrt:DI
>> + [(match_operator:SI 3 "lowpart_subreg_operator"
>> +   [(any_rshift:DI
>>  (match_operand:DI 1 "input_operand" "r")
>> -(const_int 32))
>> -   0)]
>> +(const_int 32))])]
>>   UNSPEC_SF_FROM_SI))
>>(clobber (match_scratch:DI 2 "=r"))]
>>"TARGET_NO_SF_SUBREG"
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr108338.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr108338.c
>> new file mode 100644
>> index 
>> ..6db65595343c2407fc32f68f5f52a1f7196c371d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr108338.c
>> @@ -0,0 +1,40 @@
>> +// { dg-do run }
>> +// { dg-options "-O2 -save-temps" }
>
> I think we need to check effective target hard_float to ensure
> the expected assembly?
Thanks for pointing out this!
>
>> +
>> +float __attribute__ ((noipa)) sf_from_di_off0 (long long l)
>> +{
>> +  char buff[16];
>> +  *(long long*)buff = l;
>> +  float f = *(float*)(buff);
>> +  return f;
>> +}
>> +
>> +float  __attribute__ ((noipa)) sf_from_di_off4 (long long l)
>> +{
>> +  char buff[16];
>> +  *(long long*)buff = l;
>> +  float f = *(float*)(buff + 4);
>> +  return f; 
>> +}
>> +
>
> IIUC, this patch is to deal with high 32-bit, but why you proposed
> two functions is due to endianness difference, then could we use
> endianness macro like __LITTLE_ENDIAN__ to simplify the corresponding
> offset value (0 on BE, 4 on LE)?  so that we have only function and
> IMHO it's more focused.
Yes, this patch is for the high part of DI; the next patch is for the low
part of DI.  Great suggestion, thanks!
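For illustration, the single-function form suggested above could select the offset as follows. This is only a sketch of the offset choice, not the revised test: the names `HIGH_PART_OFFSET` and `sf_from_di_high` are made up here, `__BYTE_ORDER__` is assumed to be available (GCC and Clang predefine it) in place of `__LITTLE_ENDIAN__`, and a 64-bit `long long` and 32-bit `float` are assumed.

```c
#include <string.h>

/* Byte offset of the high 32 bits of a 64-bit integer in memory:
   4 on little-endian, 0 on big-endian.  Assumes the compiler
   predefines __BYTE_ORDER__ (GCC and Clang do).  */
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define HIGH_PART_OFFSET 4
#else
#define HIGH_PART_OFFSET 0
#endif

/* Reinterpret the high 32 bits of L as a float.  memcpy is used here
   only to keep the sketch free of aliasing questions; the test in the
   patch uses pointer casts.  */
float
sf_from_di_high (long long l)
{
  char buff[sizeof l];
  float f;
  memcpy (buff, &l, sizeof l);
  memcpy (&f, buff + HIGH_PART_OFFSET, sizeof f);
  return f;
}
```

With this, a value whose high word is 0x3f800000 yields 1.0f on either endianness.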
>
>> +/* Under lp64, parameter 'l' is in one DI reg, then bitcast sub DI to SF. */
>> +/* { dg-final { scan-assembler-times {\mxscvspdpn\M} 2 { target { lp64 && 
>> has_arch_pwr8 } } } } */
>> +/* { dg-final { scan-assembler-times {\mmtvsrd\M} 2 { target { lp64 && 
>> has_arch_pwr8 } } } } */
>> +/* { dg-final { scan-assembler-times 

[PATCH v2] RISC-V: Bugfix for RTL check[PR111533]

2023-09-27 Thread Li Xu
From: xuli 

Consider the following situation:
BB5: local_dem(RVV Insn 1, AVL(reg zero))
RVV Insn 1: vmv.s.x, AVL (const_int 1)
RVV Insn 2: vredsum.vs, AVL(reg zero)

vmv.s.x has a vl operand, so the following code will get
the avl (const_int) from RVV Insn 1.
rtx avl = has_vl_op (insn->rtl ()) ? get_vl (insn->rtl ())
   : dem.get_avl ();

If REGNO is then used on that const_int, the compiler will crash:

during RTL pass: vsetvl
res_debug.c: In function '__dn_count_labels':
res_debug.c:1050:1: internal compiler error: RTL check: expected code 'reg',
have 'const_int' in rhs_regno, at rtl.h:1934
 1050 | }
  | ^
0x8fb169 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, 
char const*)
../.././gcc/gcc/rtl.cc:770
0x1399818 rhs_regno(rtx_def const*)
../.././gcc/gcc/rtl.h:1934
0x1399818 anticipatable_occurrence_p
../.././gcc/gcc/config/riscv/riscv-vsetvl.cc:348

So in this case avl should be obtained from dem.

Another issue is caused by the following code:
HOST_WIDE_INT diff = INTVAL (builder.elt (i)) - i;

during RTL pass: expand
../../.././gcc/libgfortran/generated/matmul_c4.c: In function 'matmul_c4':
../../.././gcc/libgfortran/generated/matmul_c4.c:2906:39: internal compiler 
error: RTL check:
expected code 'const_int', have 'const_poly_int' in expand_const_vector,
at config/riscv/riscv-v.cc:1149

The builder.elt (i) can be either const_int or const_poly_int.
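As a rough illustration of why INTVAL is not enough here, consider a toy two-coefficient model of such a value. This is an assumption-laden sketch: `toy_poly`, `toy_is_constant` and `toy_diff` are hypothetical names and not GCC's poly_int64 API. A const_int corresponds to the case where the run-time coefficient is zero; anything else cannot be read as a single HOST_WIDE_INT.

```c
#include <stdbool.h>

/* Toy model, not GCC's poly_int64: the value c0 + c1 * x, where x is
   only known at run time (for RVV, derived from the vector length).
   A plain constant is the special case c1 == 0.  */
struct toy_poly
{
  long long c0;
  long long c1;
};

/* True when V can be represented by a single compile-time integer.  */
bool
toy_is_constant (struct toy_poly v)
{
  return v.c1 == 0;
}

/* diff = elt - i, computed without assuming ELT is a plain constant:
   only the constant coefficient changes.  */
struct toy_poly
toy_diff (struct toy_poly elt, unsigned int i)
{
  struct toy_poly d = { elt.c0 - (long long) i, elt.c1 };
  return d;
}
```

In this model the old code corresponds to reading only `c0` and insisting that `c1` is zero, which is what the RTL check rejects.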

PR target/111533

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Fix bug.
* config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix bug.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111533-1.c: New test.
* gcc.target/riscv/rvv/base/pr111533-2.c: New test.
---
 gcc/config/riscv/riscv-v.cc   |  5 ++--
 gcc/config/riscv/riscv-vsetvl.cc  |  3 +-
 .../gcc.target/riscv/rvv/base/pr111533-1.c| 15 ++
 .../gcc.target/riscv/rvv/base/pr111533-2.c| 29 +++
 4 files changed, 48 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 359fb2ced8b..26700cfc732 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1149,8 +1149,9 @@ expand_const_vector (rtx target, rtx src)
  for (unsigned int i = 0; i < v.npatterns (); ++i)
{
  /* Calculate the diff between the target sequence and
-vid sequence.  */
- HOST_WIDE_INT diff = INTVAL (builder.elt (i)) - i;
+vid sequence.  The elt (i) can be either const_int or
+const_poly_int. */
+ poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i;
  v.quick_push (gen_int_mode (diff, v.inner_mode ()));
}
  /* Step 2: Generate result = VID + diff.  */
diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 7af33e7ea6f..af8c31d873c 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -307,8 +307,7 @@ anticipatable_occurrence_p (const bb_info *bb, const 
vector_insn_info dem)
   if (dem.has_avl_reg ())
 {
   /* rs1 (avl) are not modified in the basic block prior to the VSETVL.  */
-  rtx avl
-   = has_vl_op (insn->rtl ()) ? get_vl (insn->rtl ()) : dem.get_avl ();
+  rtx avl = dem.get_avl_or_vl_reg ();
   if (dem.dirty_p ())
{
  gcc_assert (!vsetvl_insn_p (insn->rtl ()));
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c
new file mode 100644
index 000..aba26dfac89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O2 -ffast-math -ftree-vectorize" 
} */
+
+#include 
+
+typedef _Complex float GFC_COMPLEX_4;
+
+void
+test (GFC_COMPLEX_4 *a, GFC_COMPLEX_4 *b, GFC_COMPLEX_4 c, ptrdiff_t i, 
ptrdiff_t j)
+{
+  ptrdiff_t l;
+  for (l = 0; l <= i; ++l)
+c += b[l] * a[j];
+  b[j] = c;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c
new file mode 100644
index 000..a4d2011b74b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr111533-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O2" } */
+
+#include 
+
+/* Return the number of DNS hierarchy levels in the name. */
+int
+test (const char *name) {
+   int i, len, count;
+
+   len = strlen(name);
+   for (i = 0, count = 0; i < len; i++) {
+   /* XXX need to check for \. or use named's nlabels(). */
+   if (name[i] == '.')
+   

RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

2023-09-27 Thread Li, Pan2
I almost forgot about this patch; sorry for the disturbance, and a kind ping again.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Li, Pan2 via Gcc-patches
Sent: Monday, September 11, 2023 4:37 PM
To: Jeff Law ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

Hi Jeff,

Kindly ping for the Patch V2 as below.

https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628508.html

Pan

-Original Message-
From: Li, Pan2  
Sent: Friday, August 25, 2023 8:45 PM
To: Li, Pan2 ; Jeff Law ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

Hi Jeff,

> You might also peek at the RTL gcse/pre code which is also LCM based and 
> has the same class of problems.

I found a similar approach to take care of this in gcse.cc/pre_edge_insert with 
some comments as below.

  /* We can't insert anything on an abnormal and
   critical edge, so we insert the insn at the end of
   the previous block. There are several alternatives
   detailed in Morgans book P277 (sec 10.5) for
   handling this situation.  This one is easiest for
   now.  */

if (eg->flags & EDGE_ABNORMAL)
  insert_insn_end_basic_block (index_map[j], bb);
else
  {
  insn = process_insert_insn (index_map[j]);
  insert_insn_on_edge (insn, eg);
  }

From the comments, it looks like insert_insn_end_basic_block is designed to
handle the ABNORMAL edge by inserting at the end of the previous block.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Li, Pan2 via Gcc-patches
Sent: Thursday, August 24, 2023 12:54 PM
To: Jeff Law ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

Thanks Jeff.

> That implies a save/restore pair around the call (possibly optimized so 
> that we minimize the number of save/restores).  I would have expected 
> x86 to already be doing this.  But maybe there's some ABI thing around 
> mmx vs x86 state that allows it to be avoided

Very similar to save/restore, but optional.
If there is no static rounding mode intrinsic here, it is unnecessary to add a
save/restore pair around the call. I bet mode-switching takes care of this
already.

Pan

-Original Message-
From: Jeff Law  
Sent: Thursday, August 24, 2023 7:27 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook



On 8/23/23 08:54, Li, Pan2 wrote:
> Thanks Jeff for comments.
> 
>> Understood.  So the natural question is why does x86/sh not need this
>> for its mode switching?   Don't all the same issues exist on those
>> targets as well?
> 
> AFAIK, it comes from the different design principle between the risc-v and 
> x86/arm intrinsic API.
> The risc-v rvv FP rounding mode intrinsic API has one abstract level above 
> the insn itself, while
> the x86/arm only indicates the semantics of the insn.
> 
> For example, if one vector instruction VFADD doesn't have static rounding 
> mode (aka encoding rm in insn),
> there is no such a intrinsic API contains rounding mode argument in x86/arm. 
> While the risc-v fp
> vector intrinsic will always have static rounding mode API if the frm is 
> honored.
> 
> In short, the risc-v intrinsic API is closer to the end-user, while the 
> x86/arm instrinsic API is closer to insn itself.
OK, but I'm still struggling to see how the distinction is important 
here.  Ultimately there's a state at a call site.  We need to make sure 
that state from the current function doesn't impact the callee and we 
need to make sure that the callee doesn't impact the state in the caller.

That implies a save/restore pair around the call (possibly optimized so 
that we minimize the number of save/restores).  I would have expected 
x86 to already be doing this.  But maybe there's some ABI thing around 
mmx vs x86 state that allows it to be avoided

> 
> For the rest part, will have a try based on your suggestion soon as I am in 
> the middle of something.
No problem.  Get to it when you can.  I think it affects you more than 
me :-)

jeff


Re: [PATCH 00/18] Support -mevex512 for AVX512

2023-09-27 Thread ZiNgA BuRgA

Thanks for the new patch!

I see that there's a new __EVEX512__ define.  Will there be some 
__EVEX256__ (or maybe some max EVEX width) define, so that code can 
detect whether the compiler supports AVX10.1/256 without resorting to 
version checks?





[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-09-27 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Kewen Lin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #15 from Kewen Lin  ---
(In reply to Richard Biener from comment #14)
> (In reply to Kewen Lin from comment #13)
> > Thanks again for the reduced test case and the information!
> > 
> > I tried to bisect it but encountered some build failures on _Float32 error
> > etc., through grepping the log I switched to start from r13-2887 (good) to
> > r13-7206 (bad).
> > 
> > The bisection shows the culprit commit is r13-3378-gf6c168f8c06047 which was
> > backported to GCC-12, it seems to match the observation new gcc-12 fail
> > while gcc-11 pass.
> 
> Note this change likely triggers a latent issue but it might help analyzing
> the issue.

Thanks for the hint! Yeah, I tried -fdisable-tree-esra and -fdisable-tree-sra,
and the failure is still there; I had supposed that commit only takes effect
when SRA is enabled. I'll continue to investigate it. By the way, I'm just
starting a two-week vacation, so I may respond slowly.

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-09-27 Thread Waffl3x
Not to worry, I'm currently going through that process with the FSF, it
was confirmed that a pseudonym should be just fine. I don't know how
long the process takes but my goal is to get this in for GCC14, and
surely this won't take more than a month. One can only hope anyway.

On 2023-09-27 04:43 p.m., Hans-Peter Nilsson wrote:
>> Date: Tue, 26 Sep 2023 01:56:55 +
>> From: waffl3x 
> 
>> Signed-off-by: waffl3x 
> 
> I think I've read that you have to put your actual name in
> the DCO; using an alias (presumably) as above would be
> wrong.
> 
> Ah, it's on https://gcc.gnu.org/dco.html - the *second* DCO
> link; under "Signed-off-by", on
> https://gcc.gnu.org/contribute.html! "sorry, no pseudonyms
> or anonymous contributions".
> 
> (Also, from Some Source I Don't Remember: using an alias if
> you have FSF papers in place is ok; you can use a pseudonym
> if FSF can match it to papers on file that have your actual
> name or something to that effect.)
> 
> brgds, H-P



Re: committed [RISC-V]: Harden test scan patterns

2023-09-27 Thread Vineet Gupta




On 9/27/23 13:14, Jeff Law wrote:
> > > It would help to describe how these patterns were under specified so
> > > that folks don't continue to make the same mistake as new tests get
> > > added.
> >
> > dg-final scan-assembler, scan-assembler-not, and scan-assembler-times
> > use a tcl regular expression (often referred to abbreviated as RE), as
> > described in https://www.tcl.tk/man/tcl8.4/TclCmd/re_syntax.html .
> >
> > If your RE is not specific enough, it can match LTO information that the
> > compiler places into its assembly output when the relevant options are
> > provided, which is common when running tests where the test harness
> > iterates over a number of optimization option combinations.
> > Note that '.' is an atom that can match any character.  If you want to
> > match a dot specifically, you have to escape it with a backslash: '\.' .
> > When you are matching an instruction mnemonic, an effective way to
> > avoid matching in LTO information is to enforce matching of word start
> > (\m) and/or word end (\M) .
> > Note also that the backslash has to be quoted.  If the RE is enclosed in
> > '"' quotes, extra backslashes are needed.  That is not necessary when it
> > is enclosed in curly braces.
> >
> > For example, "ld.w" will be matched in:
> >
> > .ascii "h\227\022\212ld@w\251jr\254'\320\255vwj\252\026\016\364"
> >
> > If you write {\mld\.w\M} instead, you avoid this problem.
> OK.  So that naturally leads to the question, why aren't others seeing
> this, both in the RISC-V world and more generally.  I'm not aware of
> any case where I've run the testsuite and tripped over this issue, nor
> am I aware of anyone else tripping over it.


Actually I did run into it. See commit ecfa870ff29d979bd2c ("RISC-V: 
optim const DF +0.0 store to mem [PR/110748]") where a false failure was 
triggered due to these random LTO strings and needed adjusting.


-/* { dg-final { scan-assembler-not "sw" } } */
-/* { dg-final { scan-assembler-not "fld" } } */
-/* { dg-final { scan-assembler-not "fsd" } } */
-/* { dg-final { scan-assembler-not "lw" } } */
+/* { dg-final { scan-assembler-not "\tsw\t" } } */
+/* { dg-final { scan-assembler-not "\tfld\t" } } */
+/* { dg-final { scan-assembler-not "\tfsd\t" } } */
+/* { dg-final { scan-assembler-not "\tlw\t" } } */



[Bug target/111466] RISC-V: redundant sign extensions despite ABI guarantees

2023-09-27 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111466

--- Comment #1 from Vineet Gupta  ---
So there are various aspects to tackling this issue.

#1. REE reports failure as "missing definition(s)".

This is because function args don't have an explicit def, they are just there.

Cannot eliminate extension:
(insn 12 6 13 2 (set (reg:DI 16 a6 [orig:138 n.1_15 ] [138])
(sign_extend:DI (reg:SI 11 a1 [orig:141 n ] [141])))  {extendsidi2}
 (nil))
 because of missing definition(s)

#2. At Expand time there's an explicit sign_extend for the incoming function
arg, which is not needed per the RISC-V ABI. Not generating these to begin with
will require fewer fixups in REE and/or CSE.

(insn 3 2 4 2 (set (reg/v:DI 141 [ n ])
(reg:DI 11 a1 [ n ]))

(insn 12 6 13 2 (set (reg:DI 138 [ n.1_15 ])
(sign_extend:DI (subreg/u:SI (reg/v:DI 141 [ n ]) 0)))
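For context, a minimal source-level shape that leads to this kind of widening is sketched below. It is a hypothetical reduced example, not the test case attached to this bug, and `widen` is a made-up name. On RV64 the calling convention passes a 32-bit int argument already sign-extended to 64 bits in its register, so the conversion in the function body should need no extra instruction.

```c
/* Hypothetical example, not from the bug report: N arrives in a
   64-bit argument register.  Under the RV64 calling convention it is
   already sign-extended there, so an explicit sign_extend for this
   conversion is redundant.  */
long long
widen (int n)
{
  return n;
}
```

Whether a given GCC revision actually emits a sign-extension for this particular function is not claimed here; the sketch only shows where the sign_extend in the RTL above comes from.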

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-09-27 Thread Hans-Peter Nilsson
> Date: Tue, 26 Sep 2023 01:56:55 +
> From: waffl3x 

> Signed-off-by: waffl3x 

I think I've read that you have to put your actual name in
the DCO; using an alias (presumably) as above would be
wrong.

Ah, it's on https://gcc.gnu.org/dco.html - the *second* DCO
link; under "Signed-off-by", on
https://gcc.gnu.org/contribute.html! "sorry, no pseudonyms
or anonymous contributions".

(Also, from Some Source I Don't Remember: using an alias if
you have FSF papers in place is ok; you can use a pseudonym
if FSF can match it to papers on file that have your actual
name or something to that effect.)

brgds, H-P


[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #3 from Andrew Pinski  ---
Note prev-gcc/cc1plus is compiled at -O0 also which definitely makes things
worse here.

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #2 from Andrew Pinski  ---
Can you also try with --enable-checking=release to double check that it is not
the extra compile time checks which are causing issues ...

Re: [PATCH 1/2] match.pd: Support combine cond_len_op + vec_cond similar to cond_op

2023-09-27 Thread Jeff Law




On 9/20/23 07:09, Lehua Ding wrote:

This patch adds support for combining cond_len_op and vec_cond into
cond_len_op, similar to what is done for cond_op.

gcc/ChangeLog:

* gimple-match.h (gimple_match_op::gimple_match_op):
Add interfaces for more arguments.
(gimple_match_op::set_op): Add interfaces for more arguments.
* match.pd: Add support of combining cond_len_op + vec_cond

OK
jeff


Re: [PATCH 2/2] RISC-V: Add assert of the number of vmerge in autovec cond testcases

2023-09-27 Thread Jeff Law




On 9/20/23 07:09, Lehua Ding wrote:

This patch makes the checks in the cond autovec testcases stricter.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_arith-1.c:
Assert of the number of vmerge.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-2.c: Ditto.
* 

[Bug target/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #1 from Sergei Trofimovich  ---
-ftime-report breakdown:

time /tmp/gb/./prev-gcc/cc1plus -quiet -nostdinc++ -I
/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I
/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include -I
/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++ -I . -I . -I
/home/slyfox/dev/git/gcc/gcc -I /home/slyfox/dev/git/gcc/gcc/. -I
/home/slyfox/dev/git/gcc/gcc/../include -I
/home/slyfox/dev/git/gcc/gcc/../libcpp/include -I
/home/slyfox/dev/git/gcc/gcc/../libcody -I
/home/slyfox/dev/git/gcc/gcc/../libdecnumber -I
/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I ../libdecnumber -I
/home/slyfox/dev/git/gcc/gcc/../libbacktrace -iprefix
/tmp/gb/prev-gcc/../lib/gcc/x86_64-pc-linux-gnu/14.0.0/ -isystem
/tmp/gb/./prev-gcc/include -isystem /tmp/gb/./prev-gcc/include-fixed -MMD
insn-recog.d -MF ./.deps/insn-recog.TPo -MP -MT insn-recog.o -D_GNU_SOURCE -D
IN_GCC -D HAVE_CONFIG_H insn-recog.cc -quiet -dumpbase insn-recog.cc
-dumpbase-ext .cc -mtune=generic -march=x86-64 -g -gtoggle -O2 -Wextra -Wall
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wsuggest-attribute=format
-Wconditionally-supported -Woverloaded-virtual=2 -Wpedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-checking
-fprofile-generate -fno-exceptions -fno-rtti -fasynchronous-unwind-tables
-fno-common -fno-PIE -o /run/user/1000/ccQK54tL.s -ftime-report

Time variable                       usr           sys          wall           GGC
 phase setup                   :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)  1892k (  0%)
 phase parsing                 :  22.49 (  3%)   1.58 ( 35%)  24.09 (  3%)   903M ( 22%)
 phase lang. deferred          :   0.06 (  0%)   0.01 (  0%)   0.07 (  0%)  2268k (  0%)
 phase opt and generate        : 791.23 ( 97%)   2.90 ( 65%) 794.84 ( 97%)  3111M ( 77%)
 |name lookup                  :   1.20 (  0%)   0.09 (  2%)   1.23 (  0%)  3296k (  0%)
 |overload resolution          :   3.40 (  0%)   0.18 (  4%)   3.69 (  0%)   107M (  3%)
 garbage collection            :   5.82 (  1%)   0.08 (  2%)   5.86 (  1%)      0 (  0%)
 dump files                    :   0.24 (  0%)   0.00 (  0%)   0.15 (  0%)      0 (  0%)
 callgraph construction        :   4.41 (  1%)   0.14 (  3%)   4.74 (  1%)   329M (  8%)
 callgraph optimization        :   1.01 (  0%)   0.03 (  1%)   1.02 (  0%)  2938k (  0%)
 callgraph functions expansion : 734.71 ( 90%)   2.08 ( 46%) 737.44 ( 90%)  2238M ( 56%)
 callgraph ipa passes          :  50.35 (  6%)   0.71 ( 16%)  51.10 (  6%)   437M ( 11%)
 ipa function summary          :   1.89 (  0%)   0.00 (  0%)   1.90 (  0%)  5969k (  0%)
 ipa dead code removal         :   0.22 (  0%)   0.00 (  0%)   0.22 (  0%)      0 (  0%)
 ipa devirtualization          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)      0 (  0%)
 ipa cp                        :   0.55 (  0%)   0.00 (  0%)   0.56 (  0%)  3831k (  0%)
 ipa inlining heuristics       :   0.57 (  0%)   0.03 (  1%)   0.46 (  0%)    20M (  1%)
 ipa comdats                   :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 ipa reference                 :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)      0 (  0%)
 ipa profile                   :   5.98 (  1%)   0.07 (  2%)   6.11 (  1%)   108M (  3%)
 ipa pure const                :   0.57 (  0%)   0.01 (  0%)   0.55 (  0%)   1080 (  0%)
 ipa icf                       :   1.37 (  0%)   0.00 (  0%)   1.37 (  0%)    45k (  0%)
 ipa SRA                       :   4.22 (  1%)   0.01 (  0%)   4.27 (  1%)  6213k (  0%)
 ipa free lang data            :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)      0 (  0%)
 ipa free inline summary       :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 ipa modref                    :   1.33 (  0%)   0.00 (  0%)   1.33 (  0%)  1893k (  0%)
 cfg construction              :   0.19 (  0%)   0.00 (  0%)   0.13 (  0%)    12M (  0%)
 cfg cleanup                   :   3.35 (  0%)   0.00 (  0%)   3.71 (  0%)  9974k (  0%)
 trivially dead code           :   0.90 (  0%)   0.01 (  0%)   0.77 (  0%)      0 (  0%)
 df scan insns                 :   1.45 (  0%)   0.00 (  0%)   1.39 (  0%)    95k (  0%)
 df reaching defs              :   1.79 (  0%)   0.00 (  0%)   1.83 (  0%)      0 (  0%)
 df live regs                  :   6.03 (  1%)   0.01 (  0%)   5.78 (  1%)      0 (  0%)
 df live regs                  :   2.55 (  0%)   0.00 (  0%)   2.49 (  0%)      0 (  0%)
 df must-initialized regs      :   0.19 (  0%)   0.00 (  0%)   0.20 (  0%)      0 (  0%)
 df use-def / def-use chains   :   1.13 (  0%)   0.00 (  0%)   1.05 (  0%)      0 (  0%)
 df reg dead/unused notes      :   2.89 (  0%)   0.01 (  0%)   2.79 (  0%)    34M (  1%)
 register information   :   0.45 (  0%)   0.00 (  0%)   

Re: committed [RISC-V]: Harden test scan patterns

2023-09-27 Thread Andrew Pinski
On Wed, Sep 27, 2023 at 1:14 PM Jeff Law  wrote:
>
>
>
> On 9/27/23 12:22, Joern Rennecke wrote:
> > On Wed, 27 Sept 2023 at 18:22, Jeff Law  wrote:
> >
> >> It would help to describe how these patterns were under specified so
> >> that folks don't continue to make the same mistake as new tests get added.
> >
> > dg-final scan-assembler, scan-assembler-not, and scan-assembler-times
> > use a tcl regular expression (often referred to abbreviated as RE), as
> > described in https://www.tcl.tk/man/tcl8.4/TclCmd/re_syntax.html .
> >
> > If your RE is not specific enough, it can match LTO information that the
> > compiler places into its assembly output when the relevant options are
> > provided, which is common when running tests where the test harness
> > iterates over a number of optimization option combinations.
> > Note that '.' is an atom that can match any character.  If you want to
> > match a dot specifically, you have to escape it with a backslash: '\.' .
> > When you are matching an instruction mnemonic, an effective way to
> > avoid matching in LTO information is to enforce matching of word start
> > (\m) and/or word end (\M) .
> > Note also that the backslash has to be quoted.  If the RE is enclosed in
> > '"' quotes, extra backslashes are needed.  That is not necessary when it
> > is enclosed in curly braces.
> >
> > For example, "ld.w" will be matched in:
> >
> > .ascii  "h\227\022\212ld@w\251jr\254'\320\255vwj\252\026\016\364"
> >
> > If you write {\mld\.w\M} instead, you avoid this problem.
> OK.  So that naturally leads to the question, why aren't others seeing
> this, both in the RISC-V world and more generally.  I'm not aware of any
> case where I've run the testsuite and tripped over this issue, nor am I
> aware of anyone else tripping over it.

I can answer the "more generally" part. Most, if not all, other testcases
that use scan-assembler add spaces to the pattern so that it does not match
things out of place, or even file/directory names.
The documentation at
https://gcc.gnu.org/onlinedocs/gccint/Final-Actions.html (and/or
https://gcc.gnu.org/wiki/HowToPrepareATestcase) definitely could be
expanded to make a mention of these gotchas really.

Thanks,
Andrew

>
> Jeff


Re: [Committed] RISC-V: Fix mem-to-mem VLS move pattern[PR111566]

2023-09-27 Thread Jeff Law




On 9/26/23 17:08, 钟居哲 wrote:

Hi, Jeff.

I removed the mem-to-mem patterns as you suggested, which means we no longer
have the scalar move optimization for small vector modes.

Is it OK for trunk?
Since it is a bug fix patch, I hope we can land it soon. We may find
another way to optimize mem-to-mem moves for small vector modes.

It's OK with me.

jeff


[Bug fortran/111618] ICE in associate construction

2023-09-27 Thread antoine.lemoine--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111618

--- Comment #1 from Antoine Lemoine  ---
Error message using gfortran 13.2 on Compiler Explorer:

f951: internal compiler error: Segmentation fault
0x1bec57e internal_error(char const*, ...)
???:0
0x7d4c85 gfc_expression_rank(gfc_expr*)
???:0
0x7d4e62 gfc_op_rank_conformable(gfc_expr*, gfc_expr*)
???:0
0x79a447 gfc_match_expr(gfc_expr**)
???:0
0x790f48 gfc_match(char const*, ...)
???:0
0x792f01 gfc_match_assignment()
???:0
0x7c7795 gfc_parse_file()
???:0

Looks like a duplicate of pr109948.

Re: [PATCH] RISC-V/testsuite: Fix ILP32 RVV failures from missing

2023-09-27 Thread Jeff Law




On 9/27/23 11:32, Palmer Dabbelt wrote:



IMO this is one of those places where we should just be as normal as 
possible.  So if the other big ports allow system headers then we 
should, otherwise we should move everyone over to testing in some way 
we'll catch these before commit.
Exactly.  I think the dance we've been doing with stdint-gcc.h is a bit 
silly, but I haven't pushed on it at all.


No other port does anything similar.  When they need stdint.h, they 
include it.  It does mean you have to have the appropriate headers 
installed for each multilib configuration, but that's the way every 
other port handles this problem.  There's no good reason I'm aware of 
for RISC-V to be different.


jeff


[Bug target/111619] New: [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

Bug ID: 111619
   Summary: [14 regression] 'make profiledbootstrap' makes 10+
minutes on insn-recog.cc (x86_64-linux)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slyfox at gcc dot gnu.org
  Target Milestone: ---

The reproducer on gcc from r14-4300-g1fab05a885a308:

$ ~/dev/git/gcc/configure --disable-multilib --enable-languages=c,c++
$ make profiledbootstrap

insn-recog.o takes ~13 min to build on `AMD Ryzen 9 5950X` CPU:

$ time /tmp/gb/./prev-gcc/cc1plus -quiet -nostdinc++ -I
/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I
/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include -I
/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++ -I . -I . -I
/home/slyfox/dev/git/gcc/gcc -I /home/slyfox/dev/git/gcc/gcc/. -I
/home/slyfox/dev/git/gcc/gcc/../include -I
/home/slyfox/dev/git/gcc/gcc/../libcpp/include -I
/home/slyfox/dev/git/gcc/gcc/../libcody -I
/home/slyfox/dev/git/gcc/gcc/../libdecnumber -I
/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I ../libdecnumber -I
/home/slyfox/dev/git/gcc/gcc/../libbacktrace -iprefix
/tmp/gb/prev-gcc/../lib/gcc/x86_64-pc-linux-gnu/14.0.0/ -isystem
/tmp/gb/./prev-gcc/include -isystem /tmp/gb/./prev-gcc/include-fixed -MMD
insn-recog.d -MF ./.deps/insn-recog.TPo -MP -MT insn-recog.o -D_GNU_SOURCE -D
IN_GCC -D HAVE_CONFIG_H insn-recog.cc -quiet -dumpbase insn-recog.cc
-dumpbase-ext .cc -mtune=generic -march=x86-64 -g -gtoggle -O2 -Wextra -Wall
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wsuggest-attribute=format
-Wconditionally-supported -Woverloaded-virtual=2 -Wpedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-checking
-fprofile-generate -fno-exceptions -fno-rtti -fasynchronous-unwind-tables
-fno-common -fno-PIE -o /run/user/1000/ccQK54tL.s

real    13m39,864s
user    13m38,263s
sys     0m0,823s

`insn-recog.cc` is 8.3MB.

$ ./prev-gcc/xgcc -Bprev-gcc -v
Reading specs from prev-gcc/specs
COLLECT_GCC=./prev-gcc/xgcc
COLLECT_LTO_WRAPPER=prev-gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/slyfox/dev/git/gcc/configure --disable-multilib
--enable-languages=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230926 (experimental) (GCC)

Re: RFA: [RISC-V] Replace riscv_vector with riscv_v in target selector clauses. (Followup-patch for RISCV test infrastructure for d / v / zfh extensions)

2023-09-27 Thread Jeff Law




On 9/27/23 11:48, Joern Rennecke wrote:

Regression tested for:
 riscv-sim
 
riscv-sim/-march=rv32gcv_zfh/-mabi=ilp32d/-ftree-vectorize/--param=riscv-autovec-preference=scalable
 riscv-sim/-march=rv32imac/-mabi=ilp32
 
riscv-sim/-march=rv64gcv_zfh_zvfh_zba_zbb_zbc_zicond_zicboz_zawrs/-mabi=lp64d/-ftree-vectorize/--param=riscv-autovec-preference=scalable
 riscv-sim/-march=rv64imac/-mabi=lp64


riscv_v-ector-.txt

Replace riscv_vector with riscv_v in target selector clauses.

# after deleting check_effective_target_riscv_vector:
$ grep -rl '[^"<]riscv_vector\>[^.]' > file-list
# edit file-list to remove ChangeLog and *.orig
$ cat edcmds
g/riscv_vector[^.]/s/\([^"<]riscv_v\)ector\>\([^.]\)/\1\2/g
w
q
$ sed 's/.*/ed & < edcmds/' < file-list > tmp
$ source tmp
$ git checkout gcc.target/riscv/predef-19.c
$ git checkout gcc.target/riscv/predef-18.c gcc.target/riscv/predef-20.c
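
For readers who do not use ed, the substitution above can be approximated with Python's `re` module. This is only a sketch under assumptions: `\b` stands in for ed's `\>` word-end anchor, and the function name and sample lines are made up for illustration.

```python
import re

# Rough Python equivalent of the ed substitution (assumption: \b ~ ed's \>).
# Skips occurrences preceded by '"' or '<' or followed by '.', so that
# includes such as "riscv_vector.h" are left alone.
PATTERN = re.compile(r'([^"<]riscv_v)ector\b([^.])')

def rename_selector(line: str) -> str:
    """Replace the riscv_vector selector with riscv_v on one line of text."""
    return PATTERN.sub(r"\1\2", line)

# Hypothetical sample lines, not taken from the patch.
selector_line = "/* { dg-do compile { target riscv_vector } } */"
include_line = '#include "riscv_vector.h"'
```
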

2023-09-27  Joern Rennecke

gcc/testsuite/
 * lib/target-supports.exp (check_effective_target_riscv_vector): 
Delete.  Changed all users to use *riscv_v instead.


OK after wrapping the ChangeLog entry appropriately.

jeff


Re: Test with an lto-build of libgfortran.

2023-09-27 Thread Jeff Law via Gcc




On 9/27/23 12:21, Toon Moene wrote:



The lto-ing of libgfortran did succeed, because I did get a new warning:

gfortran -O3 -flto -flto-partition=none -static  -o xlintstrfz zchkrfp.o 
zdrvrfp.o zdrvrf1.o zdrvrf2.o zdrvrf3.o zdrvrf4.o zerrrfp.o zlatb4.o 
zlaipd.o zlarhs.o zsbmv.o zget04.o zpot01.o zpot03.o zpot02.o chkxer.o 
xerbla.o alaerh.o aladhd.o alahd.o alasvm.o ../../libtmglib.a 
../../liblapack.a ../../librefblas.a

In function 'xtoa_big',
     inlined from 'write_z' at 
/home/toon/compilers/gcc/libgfortran/io/write.c:1296:11,
     inlined from 'formatted_transfer_scalar_write' at 
/home/toon/compilers/gcc/libgfortran/io/transfer.c:2136:4:
/home/toon/compilers/gcc/libgfortran/io/write.c:1222:6: warning: writing 
1 byte into a region of size 0 [-Wstringop-overflow=]

  1222 |   *q = '\0';
   |  ^
/home/toon/compilers/gcc/libgfortran/io/write.c: In function 
'formatted_transfer_scalar_write':
/home/toon/compilers/gcc/libgfortran/io/write.c:1291:8: note: at offset 
[34, 4294967294] into destination object 'itoa_buf' of size 33

  1291 |   char itoa_buf[GFC_XTOA_BUF_SIZE];
   |    ^

which was (of course) not given with a non-lto libgfortran.
Yea.  This certainly can happen with LTO.  These warnings would 
definitely be something worth investigating.


Essentially the inlining enabled by LTO can expose a different set of 
diagnostics.


Jeff


Re: RISC-V: Added support for CRC.

2023-09-27 Thread Jeff Law




On 9/26/23 12:56, Joern Rennecke wrote:




What ultimately pushed us to keep moving forward on this effort was
discovering numerous CRC loop implementations out in the wild, including
4 implementations (IIRC) in the kernel itself.


I have always assumed that such must exist (CRCs are useful for a number
of problems, and besides, they wouldn't have been included in coremark as
a part of the workload if they were not relevant), but it is good to have
confirmation, and even better to have code that can detect and analyse an
entire class of idioms that is in such widespread use.
I was personally surprised at how many we found.   While there were a 
bunch of table generation routines which obviously aren't at all 
interesting, there were enough in the cases we analyzed that it 
convinced me this wasn't just catering to a benchmark.
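
For readers unfamiliar with the idiom under discussion, a bit-at-a-time CRC loop typically looks roughly like the sketch below. It is written in Python for brevity (the loops found in the wild are C), and the reflected CRC-32 polynomial 0xEDB88320 is only a common example, not something taken from this thread.

```python
def crc32_bitwise(data: bytes) -> int:
    """Bit-at-a-time reflected CRC-32 (illustrative polynomial 0xEDB88320)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF
```

The inner shift-and-conditional-XOR step is the part that makes the loop a linear feedback shift register.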




This still leaves room for further improvements, like detecting fully-unrolled
code, table lookup, or slice-by-N, and replacing them with better
target-optimized code where this is indicated by the optimization flags to
save execution time or code/rodata size.  Not something we have to tackle
now, but just because we don't do it now, doesn't mean we couldn't address
these in the future if that appears worthwhile.
Absolutely.  In fact, I have encouraged Mariam to capture some of the 
cases we can't currently handle in the testsuite, essentially building a 
bit of a TODO list.






I can easily see creating a clmul RTL opcode for targets which support
it and hoisting the clmul vs lookup table selection into generic code.
I'm still pondering if we're likely to ever see cases where we want a
vector clmul intrinsic or support in the autovectorizer for clmul.
We've certainly got targets with vector clmul in the ISA, the question
is using it.


If we aim for optimal code, I think it more likely that we want to detect a
block CRC computation, and have a target expander decide to do that
with inline code or a library call that uses vectorized clmul.  At the time
we add such block-CRC expansion code, it also makes sense to add a
builtin for block CRC so that new GNU C programs can directly request
that functionality without having to go through the cargo cult of matching
a supported idiom.
And I think we've got a structure which largely allows further 
improvements, both in the recognition/validation step and in the code 
expansion step.




Now, the library might be written in GNU C, and for that it might be useful
to have a vector clmul intrinsic so that we can express this algorithm more
easily.
Agreed.  It's also worth noting that LLVM has a clmul in their IL and I 
suspect they expose it via a builtin/intrinsic.  I'd expect we'll 
ultimately end up in the same boat.
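
For reference, carry-less multiplication is ordinary shift-and-add multiplication with XOR in place of addition. A minimal Python sketch of the operation follows; the name `clmul` is used here only for illustration and is not a GCC or LLVM API.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two non-negative integers (polynomials over GF(2))."""
    result = 0
    shift = 0
    while b:
        if b & 1:
            # XOR instead of add: no carries propagate between bit positions.
            result ^= a << shift
        b >>= 1
        shift += 1
    return result
```
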





Probably the biggest task in that space right now is to see if we can
avoid the symbolic execution engine by re-using ranger.


I'll be interested to see what you'll come up with, but if reverting to the
symbolic execution engine, the compile time cost isn't much if you only
use it for a proper match.  So whatever heuristics are used before deciding
to use the engine matter.  Can all the cases detected by the engine be
recognized as a loop with a reduction?  We might use different heuristics
for different optimization levels, i.e. allow more false negatives at -O1,
and more false positives at -O2 / -fexpensive-optimizations.
It's mostly a desire not to add (yet another) symbolic execution engine 
to GCC.  We've already got the engine for CCP as well as symbolic 
execution capabilities in Ranger.  I'd like to avoid adding another if 
we can do so.


For a LFSR validation step we need to track 4 potential states for each 
bit in an object.  0, 1, x, !x where "x" is the state of the bit from a 
different object.  If it was just tracking 0, 1, x, !x for an entire 
object, Ranger can probably already do that.  But we need to do it for 
each bit within an object.
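
A rough sketch of the per-bit tracking described above, in Python. The representation is hypothetical and not taken from the patch: each bit is either a constant or bit i of the tracked input, possibly negated, and XOR is evaluated symbolically.

```python
ZERO, ONE = ("const", 0), ("const", 1)

def sym(i, negated=False):
    """Bit i of the tracked input object ("x"), optionally negated ("!x")."""
    return ("sym", i, negated)

def bit_not(b):
    if b[0] == "const":
        return ("const", b[1] ^ 1)
    return ("sym", b[1], not b[2])

def bit_xor(a, b):
    if a[0] == "const":
        return bit_not(b) if a[1] else b
    if b[0] == "const":
        return bit_not(a) if b[1] else a
    if a[1] == b[1]:
        # x ^ x == 0 and x ^ !x == 1
        return ONE if a[2] != b[2] else ZERO
    raise ValueError("result is outside the 0/1/x/!x states")

def shift_right(bits):
    """Logical shift right by one; bits[0] is the least significant bit."""
    return bits[1:] + [ZERO]

def xor_words(a, b):
    return [bit_xor(p, q) for p, q in zip(a, b)]

# One LFSR-like step on a 4-bit value with a made-up constant.
x = [sym(i) for i in range(4)]
poly = [ONE, ZERO, ONE, ONE]
step = xor_words(shift_right(x), poly)
```

After the step, each result bit is still one of the four states, which is the property a validator would want to check on every iteration.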


We haven't seen compile-time cost be a real issue.  But we also haven't 
looked too closely at that problem.




While I concur that we want existing code to be able to take advantage of
this optimization by gcc recognizing whatever method the code uses to
compute CRC (within reason, we are still bound by the laws of
computability and resource constraints for the compilation), I would
like to stress that I don't think the builtin will lose its value over time.
It can be used in tests to make sure we exercise the code generation for the
internal function.  It can be used in new GNU C code to make it easier to
do a CRC computation without worrying about the implementation.  If
an accord with other major compiler suppliers (e.g. the LLVM community)
is reached, it might even see more widespread use.
Which somewhat dovetails with Alex's position -- namely that it's not 
that valuable.  Hence the de-emphasis on this approach.  We'll continue to 
focus on the end-to-end solution.


We may still want a clmul as an RTL opcode and builtins to utilize it.




Which is 

[Bug fortran/111618] New: ICE in associate construction

2023-09-27 Thread antoine.lemoine--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111618

Bug ID: 111618
   Summary: ICE in associate construction
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: antoine.lemo...@bordeaux-inp.fr
  Target Milestone: ---

An ICE occurs with this code:

program prog
   implicit none
   type foo
      double precision, dimension(3) :: long_a
      double precision, dimension(3) :: long_b
   end type
   type(foo) :: the_foo
   double precision :: d
   associate(a => the_foo%long_a, b => the_foo%long_b)
      a = 2d0
      b = 1d0
      d = hypot(b(1), b(2)) ! No ICE without this line.
      b = a - b
   end associate
end program

No ICE when writing 'b = a - b(:)' or commenting out 'd = hypot(b(1), b(2))'.

Re: RISC-V sign extension query

2023-09-27 Thread Jeff Law




On 9/27/23 00:29, Vineet Gupta wrote:

Hi Jeff,




We touched upon this in our airport "rendezvous". I'm curious if you 
have the wip bits lying around - (a) to get a feel for how this could be 
done and (b) to see why REE and/or similar construct in CSE don't work 
as expected.

Not in any usable form.  Just several variants that didn't work ;-)

They don't work with REE because I'd forgotten a key point.  REE doesn't 
look for lexical redundancies like you'd see with CSE/PRE.  ie, given a 
pair of identical sign extensions in the IL, REE doesn't eliminate one.


Instead REE is focused on cases where we can transform an existing 
operation such as a load into an extending load to eliminate a 
subsequent extension.


This leads to a couple thoughts.

First, we ought to be able to use the same concept, but instead of 
putting something like this into the IL to express the extension done by 
the caller


(set (reg:DI a0) (sign_extend:DI (reg:SI a0)))

Instead we probably want to insert this as a dummy into the IL

(set (reg:SI a0) (mem:SI (sp)))

If this is followed by a sign extension, then it'll get turned into

(set (reg:DI a0) (sign_extend:DI (mem:SI (sp))))

And the subsequent extension will get removed.  And since we've tracked 
the dummy, we can just eliminate the dummy as well.  I'm a bit worried 
about how this plays with the copy_needed bits in REE.


This should at least tell us how often there's an extension of an 
incoming argument that can be trivially eliminated.  I'm not sure it's 
the best place to eliminate the extensions though.  Leading to




We should make sure that CSE/PRE are properly identifying and 
eliminating lexical redundancies.   I wouldn't be surprised if the class 
of problems we're chasing are better detected and optimized by CSE/PRE 
since those can work on an extended block or global basis respectively.


For CSE/PRE we'd want to insert something like

(set (reg:DI a0) (sign_extend:DI (reg:SI a0)))

Into the IL which should make any expressions like

(sign_extend:DI (reg:SI a0))

fully redundant in the IL, thus allowing CSE/PRE to eliminate them.

I've got a few things backed up from before the Cauldron, but expect to 
be able to poke at this some this week.


jeff


GCC workshop at university

2023-09-27 Thread Benjamin Priour via Gcc
Hi everyone,

I'm in my final MSc year and figured after this weekend's Q&A
that I could replicate David's workshop on a smaller scale within my
university.

Would that be doable/acceptable ?
Is there any need for special licensing ? What about uploading the
session's recording afterwards ?

To Dave:
If the above is alright could I reuse some of your slides ?
I won't necessarily follow what you did, but some of them would be useful.

Cheers to you all,
Benjamin.


Re: committed [RISC-V]: Harden test scan patterns

2023-09-27 Thread Jeff Law




On 9/27/23 12:22, Joern Rennecke wrote:

On Wed, 27 Sept 2023 at 18:22, Jeff Law  wrote:


It would help to describe how these patterns were under specified so
that folks don't continue to make the same mistake as new tests get added.


dg-final scan-assembler, scan-assembler-not, and scan-assembler-times
use a tcl regular expression (often referred to abbreviated as RE), as
described in https://www.tcl.tk/man/tcl8.4/TclCmd/re_syntax.html .

If your RE is not specific enough, it can match LTO information that the
compiler places into its assembly output when the relevant options are
provided, which is common when running tests where the test harness
iterates over a number of optimization option combinations.
Note that '.' is an atom that can match any character.  If you want to
match a dot specifically, you have to escape it with a backslash: '\.' .
When you are matching an instruction mnemonic, an effective way to
avoid matching in LTO information is to enforce matching of word start
(\m) and/or word end (\M) .
Note also that the backslash has to be quoted.  If the RE is enclosed in
'"' quotes, extra backslashes are needed.  That is not necessary when it
is enclosed in curly braces.

For example, "ld.w" will be matched in:

.ascii  "h\227\022\212ld@w\251jr\254'\320\255vwj\252\026\016\364"

If you write {\mld\.w\M} instead, you avoid this problem.
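
The effect of the unescaped dot and the word anchors can be reproduced outside the testsuite. The sketch below uses Python rather than Tcl; Python's `re` has no `\m`/`\M`, so `\b` is used as an approximate stand-in (an assumption), and `insn_line` is a made-up instruction line.

```python
import re

# LTO payload line from the example above, and a hypothetical real instruction.
lto_line = r'''.ascii  "h\227\022\212ld@w\251jr\254'\320\255vwj\252\026\016\364"'''
insn_line = "\tld.w\t$r4,$r5,0"

loose = re.compile(r"ld.w")        # '.' matches any character, including '@'
strict = re.compile(r"\bld\.w\b")  # literal dot, word boundary on both sides

loose_hits_lto = loose.search(lto_line) is not None
strict_hits_lto = strict.search(lto_line) is not None
strict_hits_insn = strict.search(insn_line) is not None
```

The loose pattern finds `ld@w` inside the LTO bytes, while the stricter one only matches the real mnemonic.
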
OK.  So that naturally leads to the question, why aren't others seeing 
this, both in the RISC-V world and more generally.  I'm not aware of any 
case where I've run the testsuite and tripped over this issue, nor am I 
aware of anyone else tripping over it.


Jeff


[Bug fortran/67740] Wrong association status of allocatable character pointer in derived types

2023-09-27 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67740

--- Comment #7 from anlauf at gcc dot gnu.org ---
The following snippet in gfc_trans_pointer_assignment looks suspicious:

  if (expr1->ts.type == BT_CHARACTER
      && expr1->symtree->n.sym->ts.deferred
      && expr1->symtree->n.sym->ts.u.cl->backend_decl
      && VAR_P (expr1->symtree->n.sym->ts.u.cl->backend_decl))
    {
      tmp = expr1->symtree->n.sym->ts.u.cl->backend_decl;
      if (expr2->expr_type != EXPR_NULL)
        gfc_add_modify (, tmp,
                        fold_convert (TREE_TYPE (tmp), strlen_rhs));
      else
        gfc_add_modify (, tmp, build_zero_cst (TREE_TYPE (tmp)));
    }

I wonder whether it should read:

  if (expr1->ts.type == BT_CHARACTER
  && expr1->ts.deferred
...

Furthermore, expr1->ts.u.cl->backend_decl appears not set properly,
and I fail to see why.

Re: [committed] libstdc++: Add GDB printers for types

2023-09-27 Thread Jonathan Wakely
On Wed, 27 Sept 2023, 18:25 Tom Tromey via Libstdc++, 
wrote:

> >> I have fixes for most of the issues that are worth fixing (I didn't
> >> bother with line lengths -- FWIW in gdb we just run 'black' and don't
> >> worry about these details),
>
> Jonathan> I used autopep8 and committed the result as
> Jonathan> e08559271b2d797f658579ac8610dbf5e58bcfd8 so the line lengths
> Jonathan> should be OK now.
>
> Yeah, my patches are on top of that, but flake8 still complains, and I
> still see lines > 79 characters.  However maybe flake8 isn't the checker
> you want to use, or maybe you have something set up for a different line
> length?
>

I don't think I have anything set up for python formatting at all, I just
committed whatever autopep8 did with its default settings.

If that's suboptimal, we can consider other tools, if they're reliable and
easy to run.


> Jonathan> So the fix is to just change the string to '{} {}' which I've
> pushed
> Jonathan> as 1fab05a885a308c19cf42b72fd36805ddf27fdc8 now (also attached).
>
> Thank you.
>
> Tom
>


[Bug libgcc/109685] [13/14 Regression] Memory leak in `__deregister_frame`

2023-09-27 Thread markus.boeck02 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109685

Markus Böck  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Markus Böck  ---
Fixed

Re: [PATCH v2] ARM: Block predication on atomics [PR111235]

2023-09-27 Thread Wilco Dijkstra
Hi Ramana,

> Hope this helps.

Yes definitely!

>> Passes regress/bootstrap, OK for commit?
>
> Target ? armhf ? --with-arch , -with-fpu , -with-float parameters ?
> Please be specific.

I used --target=arm-none-linux-gnueabihf --host=arm-none-linux-gnueabihf
--build=arm-none-linux-gnueabihf --with-float=hard. However it seems that the
default armhf settings are incorrect. I shouldn't need the --with-float=hard
since that is obviously implied by armhf, and they should also imply armv7-a
with vfpv3 according to documentation. It seems to get confused and skip some
tests. I tried using --with-fpu=auto, but that doesn't work at all, so in the
end I forced it like: --with-arch=armv8-a --with-fpu=neon-fp-armv8. With this
it runs a few more tests.

> Since these patterns touch armv8m.baseline can you find all the
> testcases in the testsuite and ensure no change in code for
> armv8m.baseline as that's unpredicated already and this patch brings
> this in line with the same ? Does the testsuite already cover these
> arch variants and are you satisfied that the tests in the testsuite
> can catch / don't make any additional code changes to the other
> architectures affected by this ?

There are various v8-m(.base/.main) tests and they all pass. The generated
code is generally unchanged if there was no conditional execution. I made
the new UNSPEC_LDR/STR patterns support offsets so there is no difference
in generated code for relaxed loads/stores (since they used to use a plain
load/store which has an immediate offset).

>> * onfig/arm/sync.md (arm_atomic_load): Add new pattern.
>
> Nit: s/onfig/config

Fixed.

>> (atomic_load): Always expand atomic loads explicitly.
>> (atomic_store): Always expand atomic stores explicitly.
>
> Nit: Change message to :
> Switch patterns to define_expand.

Fixed.

> Largely looks ok though I cannot work out tonight if we need more v8-a
> or v8m-baseline specific tests for scan-assembler patterns.
>
> Clearly our testsuite doesn't catch it , so perhaps the OP could help
> validate this patch with their formal models to see if this fixes
> these set of issues and creates no new regressions ? Is that feasible
> to do ?

Disabling conditional execution avoids the issue. It's trivial to verify that
atomics can no longer be conditionally executed (no "%?"). When this is
committed, we can run the random testing again to confirm the issue
is no longer present.

> -(define_insn "atomic_load"
> -  [(set (match_operand:QHSI 0 "register_operand" "=r,r,l")
> +(define_insn "arm_atomic_load"
> +  [(set (match_operand:QHSI 0 "register_operand" "=r,l")
>  (unspec_volatile:QHSI
> -  [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q,Q,Q")
> -   (match_operand:SI 2 "const_int_operand" "n,Pf,n")]  ;; model
> +  [(match_operand:QHSI 1 "memory_operand" "m,m")]
>
> Remind me again why is it safe to go from the Q constraint to the m
> constraint here and everywhere else you've done this ?

That's because the relaxed loads/stores use LDR/STR wrapped in an
UNSPEC. To avoid regressions we have to use 'm' so that an immediate
offset can be merged into the memory access.

>> -  VUNSPEC_LDA  ; Represent a store-register-acquire.
>> +  VUNSPEC_LDR  ; Represent a load-register-relaxed.
>> +  VUNSPEC_LDA  ; Represent a load-register-acquire.
>
> Nit: LDA before LDR ? Though I suspect this list can be alphabetically
> ordered at another point of time.

Swapped.

> There are new tests added for v7-a , what happens with the output for
> v8-a and the changes for ldacq and other such instructions ?

v7-a and v8-a generate the same instructions for relaxed load/store.
The acquire/release versions are identical except they are no longer
predicated. Basically the new patterns are not only significantly simpler,
they are now the same between the many ARM/Thumb-2/v7-a/v8-m/v8-a
combinations, so test coverage is much higher now. This is how these
patterns should have been designed all along.

v2 follows below.

Cheers,
Wilco


[PATCH v2] ARM: Block predication on atomics [PR111235]

The v7 memory ordering model allows reordering of conditional atomic
instructions.  To avoid this, make all atomic patterns unconditional.
Expand atomic loads and stores for all architectures so the memory access
can be wrapped into an UNSPEC.

gcc/ChangeLog:
PR target/111235
* config/arm/constraints.md: Remove Pf constraint.
* config/arm/sync.md (arm_atomic_load): Add new pattern.
(arm_atomic_load_acquire): Likewise.
(arm_atomic_store): Likewise.
(arm_atomic_store_release): Likewise.
(atomic_load): Switch patterns to define_expand.
(atomic_store): Likewise.
(arm_atomic_loaddi2_ldrd): Remove predication.
(arm_load_exclusive): Likewise.
(arm_load_acquire_exclusive): Likewise.
(arm_load_exclusivesi): Likewise.
(arm_load_acquire_exclusivesi): Likewise.

[Bug target/111617] Unnecessary instructions generated when comparing mixed-sign small integers

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111617

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=92821,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=46942
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Andrew Pinski  ---
Also see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92821#c2. clang/LLVM is
still not following the ABI after years of reporting to them that they are wrong.

[Bug c++/111606] [11/12/13/14 Regression] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread markus at oberhumer dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

--- Comment #6 from Markus F.X.J. Oberhumer  ---
@Andrew Pinski Many thanks for cleaning up the bug case!

cvise (https://github.com/marxin/cvise) did correctly reduce the original from
~5 lines to 18 lines, but the result looked extremely strange...

[Bug target/111617] Unnecessary instructions generated when comparing mixed-sign small integers

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111617

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=98425

--- Comment #2 from Andrew Pinski  ---
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98425#c3 also.

[Bug target/111617] Unnecessary instructions generated when comparing mixed-sign small integers

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111617

Andrew Pinski  changed:

   What|Removed |Added

 Target||x86_64-linux-gnu
   Keywords||ABI
  Component|c   |target

--- Comment #1 from Andrew Pinski  ---
There is a disagreement on the ABI between GCC and clang and even what the ABI
says. GCC assumes the upper bits are not zero/sign extended while clang thinks
they are.

[Bug c/111617] New: Unnecessary instructions generated when comparing mixed-sign small integers

2023-09-27 Thread davidfromonline at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111617

Bug ID: 111617
   Summary: Unnecessary instructions generated when comparing
mixed-sign small integers
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: davidfromonline at gmail dot com
  Target Milestone: ---

Compiling with `-std=c2x -O3`

```c
bool a(signed char x, unsigned char y) {
    return x == y;
}

bool b(short x, unsigned short y) {
    return x == y;
}

bool c(int x, unsigned y) {
    return x == y;
}

bool d(long x, unsigned long y) {
    return x == y;
}

bool e(long long x, unsigned long long y) {
    return x == y;
}
```

causes gcc to generate

```asm
a:
        movsx   edi, dil
        movzx   esi, sil
        cmp     edi, esi
        sete    al
        ret
b:
        movsx   edi, di
        movzx   esi, si
        cmp     edi, esi
        sete    al
        ret
c:
        cmp     edi, esi
        sete    al
        ret
d:
        cmp     rdi, rsi
        sete    al
        ret
e:
        cmp     rdi, rsi
        sete    al
        ret
```

The `movsx` and `movzx` seem unnecessary, and are not emitted by clang.

See it live: https://godbolt.org/z/dfc93f7Pv
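
A small model of why the extensions matter when the callee cannot assume the upper register bits are already extended (GCC's reading of the ABI, per the comments on this bug). This Python sketch is illustrative only: `sext8` and `zext8` are hypothetical helpers standing in for `movsx` and `movzx`, and `a_model` mirrors function `a` above.

```python
def sext8(reg: int) -> int:
    """Model of movsx: sign-extend the low 8 bits of a register value."""
    low = reg & 0xFF
    return low - 0x100 if low & 0x80 else low

def zext8(reg: int) -> int:
    """Model of movzx: zero-extend the low 8 bits of a register value."""
    return reg & 0xFF

def a_model(edi: int, esi: int) -> bool:
    """`x == y` for signed char x and unsigned char y: both promote to int."""
    return sext8(edi) == zext8(esi)

# Same low byte, but x is -1 and y is 255 after promotion, so they differ.
same_low_byte = a_model(0xFF, 0xFF)
# Garbage in the upper bits of edi must not affect the result.
garbage_upper = a_model(0x1234567F, 0x7F)
```

A bare 32-bit compare of the second pair would report inequality, which is why the extensions are needed unless the caller is guaranteed to have performed them.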

[Bug tree-optimization/111614] [14 Regression] ICE at -O2: verify_gimple failed since r14-2282-gf703d2fd3f0

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111614

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-09-27

--- Comment #1 from Andrew Pinski  ---
Looks like a latent bug in reassoc:
```

  vector(2) unsigned int _11;
  intD.6 _6;
  intD.6 _8;

  vect__15.28_19 = VIEW_CONVERT_EXPR(_3);
  _4 = BIT_FIELD_REF ;
  _35 = BIT_FIELD_REF ;
  _30 = _35 & _4;
  _28 = _30 & d_lsm.16_31;
...
  _62 = VIEW_CONVERT_EXPR(vect_i_7.24_45);
  _11 = _62 + { 11, 11 };
  _6 = BIT_FIELD_REF <_11, 32, 0>;
  _8 = BIT_FIELD_REF <_11, 32, 32>;
  _21 = _8 & _6;
  _34 = _21 & _28;
```

basically BIT_FIELD_REF has a type of `int` but the inner vector type of _11 is
`vector unsigned int`.

The VCE was removed by the following match pattern:
```
(simplify
 (BIT_FIELD_REF (view_convert @0) @1 @2)
 (BIT_FIELD_REF @0 @1 @2))
```

Which is what you would expect, really.

Confirmed.

[Bug tree-optimization/111614] [14 Regression] ICE at -O2: verify_gimple failed since r14-2282-gf703d2fd3f0

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111614

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ice-checking,
   ||ice-on-valid-code
   Target Milestone|--- |14.0

[Bug c++/111606] [11/12/13/14 Regression] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

--- Comment #5 from Andrew Pinski  ---
Actually here is the full backtrace:
#2  0x03524383 in error_recursion (context=0x4d471a0
) at
/home/apinski/src/upstream-gcc/gcc/gcc/diagnostic.cc:2265
#3  0x035217be in diagnostic_report_diagnostic (context=0x4d471a0
, diagnostic=0x7fff9e20) at
/home/apinski/src/upstream-gcc/gcc/gcc/diagnostic.cc:1543
#4  0x03522102 in diagnostic_impl (richloc=0x7fff9f00,
metadata=0x0, opt=-1, gmsgid=0x39302e0 "explicit instantiation of %qD but no
definition available", ap=0x7fff9ee8, kind=DK_PERMERROR) at
/home/apinski/src/upstream-gcc/gcc/gcc/diagnostic.cc:1770
#5  0x03523434 in permerror (location=2147483649, gmsgid=0x39302e0
"explicit instantiation of %qD but no definition available") at
/home/apinski/src/upstream-gcc/gcc/gcc/diagnostic.cc:2037
#6  0x010eb12c in instantiate_decl (d=,
defer_ok=false, expl_inst_class_mem_p=false) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:27362
#7  0x00ec2acd in maybe_instantiate_decl (decl=) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/decl2.cc:
#8  0x00ebee57 in decl_constant_var_p (decl=) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/decl2.cc:4569
#9  0x00ef4a05 in constant_value_1 (decl=, strict_p=true, return_aggregate_cst_ok_p=true, unshare_p=false) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/init.cc:2526
#10 0x00ef4e59 in decl_really_constant_value (decl=, unshare_p=false) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/init.cc:2617
#11 0x00dced8b in cxx_eval_constant_expression (ctx=0x7fffa7b0,
t=, lval=vc_prvalue,
non_constant_p=0x7fffa730, overflow_p=0x7fffa731, jump_target=0x0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:7216
#12 0x00dd1261 in cxx_eval_constant_expression (ctx=0x7fffa7b0,
t=, lval=vc_prvalue, non_constant_p=0x7fffa730,
overflow_p=0x7fffa731, jump_target=0x0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:7827
#13 0x00dd4f51 in cxx_eval_outermost_constant_expr (t=, allow_non_constant=true, strict=true,
manifestly_const_eval=mce_value::mce_true, constexpr_dtor=false, object=) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8517
#14 0x00dd60be in maybe_constant_value (t=,
decl=, manifestly_const_eval=mce_value::mce_true) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8806
#15 0x00dd6736 in fold_non_dependent_expr (t=,
complain=0, manifestly_const_eval=true, object=) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8945
#16 0x011bf9c9 in check_narrowing (type=, init=, complain=0, const_only=true) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/typeck2.cc:993
#17 0x00d5a284 in convert_like_internal (convs=0x4dea410,
expr=, fn=, argnum=0,
issue_conversion_warnings=true, c_cast_p=false, nested_p=false, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8879
#18 0x00d5a5ca in convert_like (convs=0x4dea410, expr=, fn=, argnum=0, issue_conversion_warnings=true,
c_cast_p=false, nested_p=false, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8944
#19 0x00d5a63b in convert_like (convs=0x4dea410, expr=, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8959
#20 0x00d4a9ee in build_converted_constant_expr_internal
(type=, expr=,
flags=5, complain=0) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:4808
#21 0x00d4aa85 in build_converted_constant_expr (type=, expr=, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:4838
#22 0x01084794 in convert_nontype_argument (type=, expr=, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:7414
#23 0x01089897 in convert_template_argument (parm=, arg=, args=, complain=0, i=0, in_decl=) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:8713
#24 0x0108b976 in coerce_template_parms (parms=, args=, in_decl=, complain=0, require_all_args=true) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:9205
#25 0x0108e57f in lookup_template_class (d1=, arglist=, in_decl=, context=, entering_scope=1, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:9980
#26 0x010a2e5c in tsubst_aggr_type_1 (t=, args=, complain=0, in_decl=,
entering_scope=1) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:14055
#27 0x010a2ca5 in tsubst_aggr_type (t=, args=, complain=16384, in_decl=,
entering_scope=1) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:14019
#28 0x010b3d52 in tsubst (t=,
args=, complain=0, in_decl=) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:16589
#29 0x00ecdda9 in dump_template_bindings (pp=0x4b8f220
, parms=, args=,
typenames=0x77409ca8 = {...}) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/error.cc:492
#30 0x00ed47ac in dump_substitution (pp=0x4b8f220
, t=,
template_parms=, template_args=, flags=132) at

[Bug c++/111606] [11/12/13/14 Regression] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

--- Comment #4 from Andrew Pinski  ---
(In reply to Markus F.X.J. Oberhumer from comment #0)
> Test case has been reduced by cvise.
> 
> Might be related to / duplicate of
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90747
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100557

Looks unrelated to those two.

Full backtrace:
t.c: In instantiation of ‘const bool integral_constant<1>::value’:
t.c:10:3:   required by substitution of ‘template Span::Span(U,
typename enable_if::value>::type) [with U = int]’
t.c:13:11:   required from here
t.c:10:3: error: explicit instantiation of ‘integral_constant<1>::value’ but no
definition available [-fpermissive]
   10 |   Span(U, typename enable_if::value>::type
= 1){}
  |   ^~~~
‘
internal compiler error: error reporting routines re-entered.
0x10eb12b instantiate_decl(tree_node*, bool, bool)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:27362
0xec2acc maybe_instantiate_decl(tree_node*)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/decl2.cc:
0xebee56 decl_constant_var_p(tree_node*)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/decl2.cc:4569
0xef4a04 constant_value_1
/home/apinski/src/upstream-gcc/gcc/gcc/cp/init.cc:2526
0xef4e58 decl_really_constant_value(tree_node*, bool)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/init.cc:2617
0xdced8a cxx_eval_constant_expression
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:7216
0xdd1260 cxx_eval_constant_expression
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:7827
0xdd4f50 cxx_eval_outermost_constant_expr
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8517
0xdd60bd maybe_constant_value(tree_node*, tree_node*, mce_value)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8806
0xdd6735 fold_non_dependent_expr(tree_node*, int, bool, tree_node*)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8945
0x11bf9c8 check_narrowing(tree_node*, tree_node*, int, bool)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/typeck2.cc:993
0xd5a283 convert_like_internal
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8879
0xd5a5c9 convert_like
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8944
0xd5a63a convert_like
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8959
0xd4a9ed build_converted_constant_expr_internal
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:4808
0xd4aa84 build_converted_constant_expr(tree_node*, tree_node*, int)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:4838
0x1084793 convert_nontype_argument
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:7414
0x1089896 convert_template_argument
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:8713
0x108b975 coerce_template_parms(tree_node*, tree_node*, tree_node*, int, bool)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:9205
0x108e57e lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*,
int, int)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:9980
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

Re: committed [RISC-V]: Harden test scan patterns

2023-09-27 Thread Joern Rennecke
On Wed, 27 Sept 2023 at 18:22, Jeff Law  wrote:

> It would help to describe how these patterns were under specified so
> that folks don't continue to make the same mistake as new tests get added.

dg-final scan-assembler, scan-assembler-not, and scan-assembler-times
use a Tcl regular expression (often abbreviated as RE), as
described in https://www.tcl.tk/man/tcl8.4/TclCmd/re_syntax.html .

If your RE is not specific enough, it can match LTO information that the
compiler places into its assembly output when the relevant options are
provided, which is common when running tests where the test harness
iterates over a number of optimization option combinations.
Note that '.' is an atom that can match any character.  If you want to
match a dot specifically, you have to escape it with a backslash: '\.' .
When you are matching an instruction mnemonic, an effective way to
avoid matching in LTO information is to enforce matching of word start
(\m) and/or word end (\M) .
Note also that the backslash has to be quoted.  If the RE is enclosed in
'"' quotes, extra backslashes are needed.  That is not necessary when it
is enclosed in curly braces.

For example, "ld.w" will be matched in:

.ascii  "h\227\022\212ld@w\251jr\254'\320\255vwj\252\026\016\364"

If you write {\mld\.w\M} instead, you avoid this problem.
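
The effect of the unescaped dot can be reproduced outside the testsuite. The sketch below is an illustration only. It uses POSIX extended regular expressions from `<regex.h>` rather than Tcl's RE engine, so the Tcl-specific `\m` and `\M` word constraints are not used, and the helper name `re_matches` is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX extended regular expression RE matches somewhere
   in TEXT, 0 if it does not, and -1 if RE fails to compile.  */
int re_matches (const char *re, const char *text)
{
  regex_t preg;
  if (regcomp (&preg, re, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int found = regexec (&preg, text, 0, NULL, 0) == 0;
  regfree (&preg);
  return found;
}
```

With this helper, `re_matches ("ld.w", "ld@w")` reports a match because `.` accepts the `@`, whereas `re_matches ("ld\\.w", "ld@w")` does not. Note the doubled backslash inside the C string literal, which mirrors the quoting issue described above for REs enclosed in `"` quotes.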

#

Where should this go?  Maybe somewhere in or linked from
https://gcc.gnu.org/codingconventions.html , Testsuite conventions?


Test with an lto-build of libgfortran.

2023-09-27 Thread Toon Moene

Hi all,

During the GNU Tools Cauldron we discussed (at the BoF: IPA & LTO) the 
possibility (and hazards) of building the run time libraries for various 
compilers with -flto, enabling an -flto -static linking of programs with 
the run time library available during link time optimizations.


Today I tried that on my (AMD Ryzen 7 5800U) laptop with

gcc version 14.0.0 20230926 (experimental) [master 
r14-4282-g53daf67fd55] (GCC)


with the following "quick hack":

diff --git a/libgfortran/configure b/libgfortran/configure
index cd176b04a14..69a2b4a8881 100755
--- a/libgfortran/configure
+++ b/libgfortran/configure
@@ -5959,11 +5959,11 @@ fi
 # Add -Wall -fno-repack-arrays -fno-underscoring if we are using GCC.
 have_real_17=no
 if test "x$GCC" = "xyes"; then
-  AM_FCFLAGS="-I . -Wall -Werror -fimplicit-none -fno-repack-arrays 
-fno-underscoring"
+  AM_FCFLAGS="-I . -Wall -Werror -fimplicit-none -fno-repack-arrays 
-fno-underscoring -flto"

   ## We like to use C11 and C99 routines when available.  This makes
   ## sure that
   ## __STDC_VERSION__ is set such that libc includes make them available.
-  AM_CFLAGS="-std=gnu11 -Wall -Wstrict-prototypes -Wmissing-prototypes 
-Wold-style-definition -Wextra -Wwrite-strings 
-Werror=implicit-function-declaration -Werror=vla"
+  AM_CFLAGS="-std=gnu11 -Wall -Wstrict-prototypes -Wmissing-prototypes 
-Wold-style-definition -Wextra -Wwrite-strings 
-Werror=implicit-function-declaration -Werror=vla -flto"

   ## Compile the following tests with the same system header contents
   ## that we'll encounter when compiling our own source files.
   CFLAGS="-std=gnu11 $CFLAGS"

The build of this compiler (languages=fortran) completed without 
problems (no test results - not enough time).


I then proceeded to build LAPACK with the following build options:

CFLAGS = -O3 -flto -flto-partition=none -static
and
FFLAGS = -O3 -flto -flto-partition=none -static

This gave the same test results of the LAPACK test suite as the build 
with the same compiler, but without an lto'd libgfortran.


The lto-ing of libgfortran did succeed, because I did get a new warning:

gfortran -O3 -flto -flto-partition=none -static  -o xlintstrfz zchkrfp.o 
zdrvrfp.o zdrvrf1.o zdrvrf2.o zdrvrf3.o zdrvrf4.o zerrrfp.o zlatb4.o 
zlaipd.o zlarhs.o zsbmv.o zget04.o zpot01.o zpot03.o zpot02.o chkxer.o 
xerbla.o alaerh.o aladhd.o alahd.o alasvm.o ../../libtmglib.a 
../../liblapack.a ../../librefblas.a

In function 'xtoa_big',
inlined from 'write_z' at 
/home/toon/compilers/gcc/libgfortran/io/write.c:1296:11,
inlined from 'formatted_transfer_scalar_write' at 
/home/toon/compilers/gcc/libgfortran/io/transfer.c:2136:4:
/home/toon/compilers/gcc/libgfortran/io/write.c:1222:6: warning: writing 
1 byte into a region of size 0 [-Wstringop-overflow=]

 1222 |   *q = '\0';
  |  ^
/home/toon/compilers/gcc/libgfortran/io/write.c: In function 
'formatted_transfer_scalar_write':
/home/toon/compilers/gcc/libgfortran/io/write.c:1291:8: note: at offset 
[34, 4294967294] into destination object 'itoa_buf' of size 33

 1291 |   char itoa_buf[GFC_XTOA_BUF_SIZE];
  |^

which was (of course) not given with a non-lto libgfortran.

The full question of "lto-ing" run time libraries is more complicated 
than just "whether it works" as those who attended the BoF will recall.


Hope this helps,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands


[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-09-27 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600

--- Comment #8 from Andreas Schwab  ---
Native on HiFive Unleashed.

Re: [PATCH V3] RISC-V: Remove mem-to-mem VLS move pattern[PR111566]

2023-09-27 Thread Toon Moene

On 9/27/23 19:31, Jeff Law wrote:

> On 9/27/23 04:14, juzhe.zh...@rivai.ai wrote:
> > Since after removing mem-to-mem pattern.
> >
> > program main
> >    integer, dimension(:,:), allocatable :: a, b
> >    integer, dimension(:), allocatable :: sh
> >    allocate (a(2,2))
> >    allocate (b(2,2))
> >    allocate (sh(3))
> >    a = 1
> >    b = cshift(a,sh)
> > end program main
> >
> > This case will fail if we don't change the mov pattern.
>
> Can you expand on this?  You didn't indicate the failure mode or any
> analysis behind the failure.
>
> jeff


Note that this Fortran code has no defined behavior, because the sh 
array isn't given any values ...


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands



[Bug c++/111606] [11/12/13/14 Regression] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |11.5
      Known to fail|                            |6.3.0
            Summary|[ICE] internal compiler     |[11/12/13/14 Regression]
                   |error: error reporting      |[ICE] internal compiler
                   |routines re-entered.        |error: error reporting
                   |                            |routines re-entered.
     Ever confirmed|0                           |1
           Severity|normal                      |trivial
   Last reconfirmed|                            |2023-09-27
      Known to work|                            |6.2.0

--- Comment #3 from Andrew Pinski  ---
Confirmed.

[Bug c++/111606] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

--- Comment #2 from Andrew Pinski  ---
Created attachment 56005
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56005&action=edit
Reduced further

Attached is the testcase reduced further, with some code added back to make it
more valid.

Re: [PATCH v4] i386: Allow -mlarge-data-threshold with -mcmodel=large

2023-09-27 Thread Fangrui Song
On Wed, Sep 13, 2023 at 11:19 AM Fangrui Song  wrote:
>
> On Tue, Aug 22, 2023 at 12:19 AM Fangrui Song  wrote:
> >
> > On Tue, Aug 1, 2023 at 12:51 PM Fangrui Song  wrote:
> > >
> > > When using -mcmodel=medium, large data objects larger than the
> > > -mlarge-data-threshold threshold are placed into large data sections
> > > (.lrodata, .ldata, .lbss and some variants).  GNU ld and ld.lld 17 place
> > > .l* sections into separate output sections.  If small and medium code
> > > model object files are mixed, the .l* sections won't exert relocation
> > > overflow pressure on sections in object files built with -mcmodel=small.
> > >
> > > However, when using -mcmodel=large, -mlarge-data-threshold doesn't
> > > apply.  This means that the .rodata/.data/.bss sections may exert
> > > relocation overflow pressure on sections in -mcmodel=small object files.
> > >
> > > This patch allows -mcmodel=large to generate .l* sections and drops an
> > > unneeded documentation restriction that the value must be the same.
> > >
> > > Link: https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU
> > > ("Large data sections for the large code model")
> > >
> > > Signed-off-by: Fangrui Song 
> > >
> > > ---
> > > Changes from v1 
> > > (https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616947.html):
> > > * Clarify commit message. Add link to 
> > > https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU
> > >
> > > Changes from v2
> > > * Drop an uneeded limitation in the documentation.
> > >
> > > Changes from v3
> > > * Change scan-assembler directives to use \. to match literal .
> > > ---
> > >  gcc/config/i386/i386.cc| 15 +--
> > >  gcc/config/i386/i386.opt   |  2 +-
> > >  gcc/doc/invoke.texi|  6 +++---
> > >  gcc/testsuite/gcc.target/i386/large-data.c | 13 +
> > >  4 files changed, 26 insertions(+), 10 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/large-data.c
> > >
> > > [...]
> >
> > Ping:)
>
> Ping:) https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625993.html
>
> (I don't have write access to gcc.)
>
>
> --
> 宋方睿

Ping? :) https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625993.html

(I don't have write access to gcc.)


-- 
宋方睿


[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-09-27 Thread palmer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600

palmer at gcc dot gnu.org changed:

   What|Removed |Added

 CC||palmer at gcc dot gnu.org,
   ||vineetg at gcc dot gnu.org

--- Comment #7 from palmer at gcc dot gnu.org ---
(In reply to Andreas Schwab from comment #3)
> Here are the build times of the stage1 compiler:
> 
> 20230714  21573
> 20230722  19932   -7.6%
> 20230728  21608   +8.4%
> 20230804  21841   +1.0%
> 20230811  25016   +14.5%
> 20230818  25429   +1.7%
> 20230825  25872   +1.7%
> 20230901  25965   +0.4%
> 20230908  28824   +11.0%
> 20230915  30926   +7.3%
> 20230922  40180   +30.0%

Did anything else change?  The latest binutils has better debug support, so I
could imagine us ending up with some longer compiler times as a result -- there
has to be more than just that here, though.

Aside from that we have had a ton of vector codegen go in over the last few
months, but this is a pretty huge increase so I agree it's worrisome.  I'm
adding Vineet to the CC list, as he's been doing some SPEC runs.  I don't think
we've had any major runtime regressions, but looks like dwarf2out.cc times have
crept up a bit which is also worrisome.

Also what exactly are you timing?  Native bootstraps on QEMU?

[Bug target/111609] Zero shift in ARM NEON vshll_n_s8 intrinsic produces an error

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111609

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||12.1.0, 4.5.4
   Last reconfirmed||2023-09-27
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
Confirmed.

Re: [PATCH] RISC-V/testsuite: Fix ILP32 RVV failures from missing

2023-09-27 Thread Palmer Dabbelt

On Wed, 27 Sep 2023 10:28:55 PDT (-0700), jeffreya...@gmail.com wrote:



On 9/25/23 15:17, Maciej W. Rozycki wrote:

On Mon, 25 Sep 2023, Maciej W. Rozycki wrote:


  NB the use of this specific  header, still in place elsewhere,
seems gratuitous to me.  We don't need or indeed want to print anything in
the test cases (unless verifying something specific to the print facility)
and if we want to avoid minor code duplication (i.e. not to have explicit:

   if (...)
 __builtin_abort ();

replicated across test cases), we can easily implement this via a local
header, there's no need to pull in a complex system facility.


  Overall we ought not to require any system headers in compile tests and
then link and run tests need a functional target environment anyway.  So
maybe the use of  in run tests isn't as bad after all if not for
the -DNDEBUG peculiarity.  However I still think the less we depend in
verification on external components the better, that's one variable to
exclude.

Certainly we don't want extraneous #includes.   We can often avoid them
with a few judicious prototypes, like for abort ().

But we also need to get to the point where we can run tests which have
#include directives that reference system headers.  Many tests in the
various GCC testsuites have those directives and we don't want to be
continually trying to eradicate #includes from those tests.

The standard way to deal with this is single tree builds which are
deprecated or to have an install tree with the suitable multilib headers
and libraries.  The latter seems like the only viable solution to me.


IMO this is one of those places where we should just be as normal as 
possible.  So if the other big ports allow system headers then we 
should, otherwise we should move everyone over to testing in some way 
we'll catch these before commit.




jeff


Re: [PATCH V3] RISC-V: Remove mem-to-mem VLS move pattern[PR111566]

2023-09-27 Thread Jeff Law




On 9/27/23 04:14, juzhe.zh...@rivai.ai wrote:
> Since after removing mem-to-mem pattern.
>
> program main
>    integer, dimension(:,:), allocatable :: a, b
>    integer, dimension(:), allocatable :: sh
>    allocate (a(2,2))
>    allocate (b(2,2))
>    allocate (sh(3))
>    a = 1
>    b = cshift(a,sh)
> end program main
>
> This case will fail if we don't change the mov pattern.

Can you expand on this?  You didn't indicate the failure mode or any
analysis behind the failure.


jeff


Re: [PATCH V3] RISC-V: Remove mem-to-mem VLS move pattern[PR111566]

2023-09-27 Thread Jeff Law




On 9/27/23 03:38, juzhe.zh...@rivai.ai wrote:
>  >> Why add `can_create_pseudo_p ()` here? this will split after reload,
>  >> but we forbid that pattern between reload and split2?
>
> I have no idea. Some Fortran tests just need recognition of the
> mem-to-mem pattern before RA.
>
> I don't know the reason.

But isn't that the key to understanding what's going on here?

There is nothing special about Fortran here.  Whatever problem this is 
working around will almost certainly show up again in other, 
non-Fortran, contexts.


There aren't enough details in here to really evaluate what's going on.

jeff


Re: [PATCH] RISC-V/testsuite: Fix ILP32 RVV failures from missing

2023-09-27 Thread Jeff Law




On 9/25/23 15:17, Maciej W. Rozycki wrote:

On Mon, 25 Sep 2023, Maciej W. Rozycki wrote:


  NB the use of this specific  header, still in place elsewhere,
seems gratuitous to me.  We don't need or indeed want to print anything in
the test cases (unless verifying something specific to the print facility)
and if we want to avoid minor code duplication (i.e. not to have explicit:

   if (...)
 __builtin_abort ();

replicated across test cases), we can easily implement this via a local
header, there's no need to pull in a complex system facility.


  Overall we ought not to require any system headers in compile tests and
then link and run tests need a functional target environment anyway.  So
maybe the use of  in run tests isn't as bad after all if not for
the -DNDEBUG peculiarity.  However I still think the less we depend in
verification on external components the better, that's one variable to
exclude.
Certainly we don't want extraneous #includes.   We can often avoid them 
with a few judicious prototypes, like for abort ().
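
A minimal sketch of that approach, assuming a hypothetical local helper macro named `CHECK` (it is not an existing testsuite facility): the prototype for `abort` is written out by hand, so the test needs no system header at all.

```c
/* Prototype written out by hand so that no system header is needed.  */
extern void abort (void);

/* Hypothetical local helper, e.g. kept in a header next to the tests.  */
#define CHECK(cond) do { if (!(cond)) abort (); } while (0)

/* Example test body: returns 0 when every check passes; a failed check
   aborts before the return is reached.  */
int run_checks (void)
{
  int a[4] = { 1, 2, 3, 4 };
  int sum = 0;
  for (int i = 0; i < 4; i++)
    sum += a[i];
  CHECK (sum == 10);
  return 0;
}
```

Using `__builtin_abort ()` in the macro instead would make even the hand-written prototype unnecessary.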


But we also need to get to the point where we can run tests which have 
#include directives that reference system headers.  Many tests in the 
various GCC testsuites have those directives and we don't want to be 
continually trying to eradicate #includes from those tests.


The standard way to deal with this is single tree builds which are 
deprecated or to have an install tree with the suitable multilib headers 
and libraries.  The latter seems like the only viable solution to me.


jeff


Re: [committed] libstdc++: Add GDB printers for types

2023-09-27 Thread Tom Tromey
>> I have fixes for most of the issues that are worth fixing (I didn't
>> bother with line lengths -- FWIW in gdb we just run 'black' and don't
>> worry about these details),

Jonathan> I used autopep8 and committed the result as
Jonathan> e08559271b2d797f658579ac8610dbf5e58bcfd8 so the line lengths
Jonathan> should be OK now.

Yeah, my patches are on top of that, but flake8 still complains, and I
still see lines > 79 characters.  However maybe flake8 isn't the checker
you want to use, or maybe you have something set up for a different line
length?

Jonathan> So the fix is to just change the string to '{} {}' which I've pushed
Jonathan> as 1fab05a885a308c19cf42b72fd36805ddf27fdc8 now (also attached).

Thank you.

Tom


Re: committed [RISC-V]: Harden test scan patterns

2023-09-27 Thread Jeff Law




On 9/27/23 03:26, Joern Rennecke wrote:

I got tired of scan tests failing when they have an underspecified
pattern that matches LTO information, so I did a global replace for
the most common form of such scan patterns in the gcc.target/riscv
testsuite.

regression tested for:
 riscv-sim
 riscv-sim/-march=rv32gcv_zfh/-mabi=ilp32d/-ftree-vectorize/--param=riscv-autovec-preference=scalable
 riscv-sim/-march=rv32imac/-mabi=ilp32
 riscv-sim/-march=rv64gcv_zfh_zvfh_zba_zbb_zbc_zicond_zicboz_zawrs/-mabi=lp64d/-ftree-vectorize/--param=riscv-autovec-preference=scalable
 riscv-sim/-march=rv64imac/-mabi=lp64

Committed as obvious.
It would help to describe how these patterns were underspecified so
that folks don't continue to make the same mistake as new tests get added.


Jeff


Re: seek advice about GCC learning

2023-09-27 Thread David Brown

On 26/09/2023 08:48, weizhe wang via Gcc wrote:

Thanks for your reply. Is there some guide for building rv32 cross compiler gcc 
? I encounter some error in the building progress.




You might find useful information here:

I can recommend google.  It took me perhaps 10 seconds to find these sites.




[Bug libstdc++/111511] Incorrect ADL in std::to_array in GCC 11/12/13

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111511

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |13.3

--- Comment #10 from Jonathan Wakely  ---
std::to_array is fixed on all branches, thanks for the reports.

[Bug libstdc++/111511] Incorrect ADL in std::to_array in GCC 11/12/13

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111511

--- Comment #9 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:97a33ab114187e7c6cd6c6c0f06cd8225e8aeef5

commit r11-11021-g97a33ab114187e7c6cd6c6c0f06cd8225e8aeef5
Author: Jonathan Wakely 
Date:   Thu Sep 21 09:14:57 2023 +0100

libstdc++: Prevent unwanted ADL in std::to_array [PR111512]

Qualify the calls to the __to_array helper to prevent ADL, so we don't
try to complete associated classes.

libstdc++-v3/ChangeLog:

PR libstdc++/111511
PR c++/111512
* include/std/array (to_array): Qualify calls to __to_array.
* testsuite/23_containers/array/creation/111512.cc: New test.

(cherry picked from commit 77cf3773021b0a20d89623e09d620747a05588ec)

[Bug c++/111512] GCC's __builtin_memcpy can trigger ADL

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111512

--- Comment #6 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:97a33ab114187e7c6cd6c6c0f06cd8225e8aeef5

commit r11-11021-g97a33ab114187e7c6cd6c6c0f06cd8225e8aeef5
Author: Jonathan Wakely 
Date:   Thu Sep 21 09:14:57 2023 +0100

libstdc++: Prevent unwanted ADL in std::to_array [PR111512]

Qualify the calls to the __to_array helper to prevent ADL, so we don't
try to complete associated classes.

libstdc++-v3/ChangeLog:

PR libstdc++/111511
PR c++/111512
* include/std/array (to_array): Qualify calls to __to_array.
* testsuite/23_containers/array/creation/111512.cc: New test.

(cherry picked from commit 77cf3773021b0a20d89623e09d620747a05588ec)

[Bug libstdc++/111102] illegal pointer arithmetic invoked by std::format("L{:65536}",1)

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111102

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #5 from Jonathan Wakely  ---
Fixed for 13.3, thanks for the report and patches.

[Bug libstdc++/108046] The dot in the floating-point alternative form has wrong position

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108046

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Jonathan Wakely  ---
Fixed for 13.3, thanks for the report.

[Bug libstdc++/111102] illegal pointer arithmetic invoked by std::format("L{:65536}",1)

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111102

--- Comment #4 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:9853ad876bd3d9d4685126466f74402e567664b3

commit r13-7918-g9853ad876bd3d9d4685126466f74402e567664b3
Author: Paul Dreik 
Date:   Thu Aug 24 11:43:43 2023 +0100

libstdc++: Add test for illegal pointer arithmetic in format [PR111102]

libstdc++-v3/ChangeLog:

PR libstdc++/111102
* testsuite/std/format/string.cc: Check wide character format
strings with out-of-range widths.

(cherry picked from commit 7564fe98657ad5ede34bd08f5279778fa8698865)

[Bug c++/59526] [C++11] Defaulted special member functions don't accept noexcept if a member has a non-trivial noexcept operator in the corresponding special member function

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59526

--- Comment #5 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:0547f663ee09aa5887dcd1bb0ea48eba24a30485

commit r13-7917-g0547f663ee09aa5887dcd1bb0ea48eba24a30485
Author: François Dumont 
Date:   Wed Aug 23 19:15:43 2023 +0200

libstdc++: [_GLIBCXX_INLINE_VERSION] Fix  friend declaration

GCC does not consider the inline namespace in friend function declarations.
This is PR c++/59526; we need to make this namespace explicit.

libstdc++-v3/ChangeLog:

* include/std/format (std::__format::_Arg_store): Explicit version
namespace on make_format_args friend declaration.

(cherry picked from commit 92456291849fe88303bbcab366f41dcd4a885ad5)

[Bug libstdc++/111102] illegal pointer arithmetic invoked by std::format("L{:65536}",1)

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111102

--- Comment #3 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:183eea6029be2f6c9f416d6ffe751c469237ff2d

commit r13-7916-g183eea6029be2f6c9f416d6ffe751c469237ff2d
Author: Paul Dreik 
Date:   Thu Aug 24 11:43:43 2023 +0100

libstdc++: fix illegal pointer arithmetic in format [PR111102]

When parsing a format string, the width is parsed into an unsigned short
but the result is not checked in the case the format string is not a
char string (such as a wide string). In case the parse fails, a null
pointer is returned which is used for pointer arithmetic which is
undefined behaviour.

Signed-off-by: Paul Dreik 

libstdc++-v3/ChangeLog:

PR libstdc++/111102
* include/std/format (__format::__parse_integer): Check for
non-null pointer.

(cherry picked from commit dd4bdb9eea436bf06f175d8dbfc2190377455be4)

[Bug libstdc++/108046] The dot in the floating-point alternative form has wrong position

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108046

--- Comment #5 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:da1ba03245c212ef1ba100e7806588802f3ad46f

commit r13-7914-gda1ba03245c212ef1ba100e7806588802f3ad46f
Author: Jonathan Wakely 
Date:   Thu Jul 27 14:07:09 2023 +0100

libstdc++: Fix std::format alternate form for floating-point [PR108046]

A decimal point was being added to the end of the string for {:#.0}
because the __expc character was not being set, for the _Pres_none
presentation type, so __s.find(__expc) didn't find the 'e' in "1e+01" and so
we created "1e+01." by appending the radix char to the end.

This can be fixed by ensuring that __expc='e' is set for the _Pres_none
case. I realized we can also set __expc='P' and __expc='E' when needed,
to save a call to std::toupper later.

For the {:#.0g} format, __expc='e' was being set and so the 'e' was
found in "1e+10" but then __z = __prec - __sigfigs would wrap around to
SIZE_MAX. That meant we would decide not to add a radix char because the
number of extra characters to insert would be 1+SIZE_MAX i.e. zero.

This can be fixed by using __z == 0 when __prec == 0.
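
As a rough illustration (not part of the commit), the wraparound described
above can be modelled in Python. The 64-bit size_t, the helper names and the
value of sigfigs are assumptions made for this sketch only:

```python
# Model of unsigned size_t arithmetic; a 64-bit size_t is an assumption.
SIZE_BITS = 64
MOD = 1 << SIZE_BITS
SIZE_MAX = MOD - 1


def size_t_sub(a, b):
    """Subtract as an unsigned size_t would, wrapping modulo 2**SIZE_BITS."""
    return (a - b) % MOD


def size_t_add(a, b):
    """Add as an unsigned size_t would, wrapping modulo 2**SIZE_BITS."""
    return (a + b) % MOD


# Illustrative values: zero precision, one significant digit ("1e+10").
prec, sigfigs = 0, 1
z = size_t_sub(prec, sigfigs)   # wraps to SIZE_MAX instead of going negative
extra = size_t_add(1, z)        # radix char plus z zeros: wraps to 0

# The fix described above: use z == 0 when the precision is zero.
z_fixed = 0 if prec == 0 else size_t_sub(prec, sigfigs)
extra_fixed = size_t_add(1, z_fixed)   # 1, so the radix char is inserted
```

With the unfixed arithmetic the count of characters to insert comes out as
zero, which is why no radix character was added.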

libstdc++-v3/ChangeLog:

PR libstdc++/108046
* include/std/format (__formatter_fp::format): Ensure __expc is
always set for all presentation types. Set __z correctly for
zero precision.
* testsuite/std/format/functions/format.cc: Check problem cases.

(cherry picked from commit 50bc490c090cc95175e6068ed7438788d7fd7040)

Re: [committed] libstdc++: Add GDB printers for types

2023-09-27 Thread Jonathan Wakely
On Wed, 27 Sept 2023 at 16:37, Tom Tromey  wrote:
>
> > Jonathan Wakely via Gcc-patches  writes:
>
> Replying to a quite old email...
>
> I ran a Python linter on the libstdc++ pretty-printers.
>
> I have fixes for most of the issues that are worth fixing (I didn't
> bother with line lengths -- FWIW in gdb we just run 'black' and don't
> worry about these details),

I used autopep8 and committed the result as
e08559271b2d797f658579ac8610dbf5e58bcfd8 so the line lengths should be
OK now.

> but the patch I'm replying to had a problem
> that I didn't know how to fix:
>
> > +class StdChronoTimeZoneRulePrinter:
> [...]
> > +if kind == 0: # DayOfMonth
> > +start = '{} {}{}'.format(month, ordinal_day)
>
> flake8 points out that this call to format has three placeholders but
> only two arguments.

Oops, I think it was originally written like this:

'{} {}{}'.format(month, day, suffixes.get(day, 'th'))

but then I refactored it to:

ordinal_day = '{}{}'.format(day, suffixes.get(day, 'th'))
if kind == 0:  # DayOfMonth
start = '{} {}{}'.format(month, ordinal_day)

So the fix is to just change the string to '{} {}' which I've pushed
as 1fab05a885a308c19cf42b72fd36805ddf27fdc8 now (also attached).
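
For anyone following along, a small standalone sketch of the mismatch. The
month, day and suffixes values here are made up for the example and are not
taken from printers.py:

```python
# Hypothetical inputs; the real printer reads these from the debuggee.
month, day = 'Mar', 2
suffixes = {1: 'st', 2: 'nd', 3: 'rd'}
ordinal_day = '{}{}'.format(day, suffixes.get(day, 'th'))

# Three replacement fields but only two arguments: str.format raises
# IndexError when it reaches the third field.
try:
    '{} {}{}'.format(month, ordinal_day)
    raised = False
except IndexError:
    raised = True

# Two fields for two arguments, as in the fix.
start = '{} {}'.format(month, ordinal_day)
```

The broken call only fails at run time, which is why a linter such as flake8
is useful for untested code paths like this one.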

These printers are for implementation details internal to the library,
which are never exposed to users. I added them because they made it
much easier to debug the implementation when stepping through library
functions, but that means there are no tests for them.

Thanks for finding this!
commit 1fab05a885a308c19cf42b72fd36805ddf27fdc8
Author: Jonathan Wakely 
Date:   Wed Sep 27 17:03:51 2023

libstdc++: Fix format string in StdChronoTimeZoneRulePrinter

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdChronoTimeZoneRulePrinter):
Fix incorrect number of replacement fields.

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index c0056de2565..d60c8003a63 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -2215,7 +2215,7 @@ class StdChronoTimeZoneRulePrinter:
 day = on['day_of_month']
 ordinal_day = '{}{}'.format(day, suffixes.get(day, 'th'))
 if kind == 0:  # DayOfMonth
-start = '{} {}{}'.format(month, ordinal_day)
+start = '{} {}'.format(month, ordinal_day)
 else:
 weekday = weekdays[on['day_of_week']]
 if kind == 1:  # LastWeekDay


After Cauldron - online mini BoFs and Fosdem

2023-09-27 Thread Mark Wielaard
Hi all,

Cauldron was really great. Seeing everybody in person again.
One item that came up was about meeting more frequently and/or in
smaller (virtual) groups.

If people want to have online mini BoFs to follow up on some discussion
they had at Cauldron, or for some periodic meetup, then please remember
that The Software Freedom Conservancy is extending the use of their Big
Blue Button instance https://bbb.sfconservancy.org/ to Sourceware
projects that want to host video meetings.

Please create an account at https://bbb.sfconservancy.org/b/signup then
contact admin-reque...@sourceware.org with the name and email you used
and the kind of project/BoF/meeting you want to run to activate the
account.

Note: Anyone is able to join a meeting, accounts are only required to
create new meetings.

Also Doji, Jose and Gwen (on CC) are trying to coordinate a Fosdem
devroom for the various projects. Please contact them if you want to
help out with that.

Cheers,

Mark

https://sfconservancy.org/news/2023/aug/15/exit-zoom/
https://fosdem.org/2024/


[Bug fortran/90608] Inline non-scalar minloc/maxloc calls

2023-09-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

Tamar Christina  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tnfchris at gcc dot gnu.org,
                   |                            |toon at gcc dot gnu.org

--- Comment #6 from Tamar Christina  ---
This is the ticket I meant, Toon.

Do you or Thomas have any ideas how we can inline this?

Re: [committed] libstdc++: Add GDB printers for types

2023-09-27 Thread Tom Tromey
> Jonathan Wakely via Gcc-patches  writes:

Replying to a quite old email...

I ran a Python linter on the libstdc++ pretty-printers.

I have fixes for most of the issues that are worth fixing (I didn't
bother with line lengths -- FWIW in gdb we just run 'black' and don't
worry about these details), but the patch I'm replying to had a problem
that I didn't know how to fix:

> +class StdChronoTimeZoneRulePrinter:
[...]
> +if kind == 0: # DayOfMonth
> +start = '{} {}{}'.format(month, ordinal_day)

flake8 points out that this call to format has three placeholders but
only two arguments.

Tom


[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread gardner.ben at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #6 from Ben Gardner  ---
(In reply to Andrew Pinski from comment #5)
> extern void *memmem (const void *__haystack, size_t __haystacklen,
>const void *__needle, size_t __needlelen)
>  __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__pure__))
> __attribute__ ((__nonnull__ (1, 3)));
> 
> 
> memmem is declared with nonnull for the 1st and 3rd argument. If those
> arguments are null, the behavior is undefined and the values of those
> arguments can be assumed as not null afterwards too.
> 
> If you don't want that behavior you can use -fno-delete-null-pointer-checks .
> 
> Otherwise the behavior you are seeing is correct behavior based on well
> defined code.

Thanks for the info. That makes sense. I didn't check the header file, so I
didn't know that memmem() was declared with nonnull.

Also, thanks for the tip about -fno-delete-null-pointer-checks.

[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #5 from Andrew Pinski  ---
extern void *memmem (const void *__haystack, size_t __haystacklen,
   const void *__needle, size_t __needlelen)
 __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__pure__))
__attribute__ ((__nonnull__ (1, 3)));


memmem is declared with nonnull for the 1st and 3rd argument. If those
arguments are null, the behavior is undefined and the values of those arguments
can be assumed as not null afterwards too.

If you don't want that behavior you can use -fno-delete-null-pointer-checks .

Otherwise the behavior you are seeing is correct behavior based on well defined
code.

[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #4 from Jonathan Wakely  ---
Anything passed to memmem (or memcmp, or memcpy, etc.) is considered to be a
non-null pointer, because that's a requirement of those functions. And so if
it's a non-null pointer, any null checks for it can be removed.

[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread gardner.ben at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #3 from Ben Gardner  ---
The issue isn't with memmem(). It is with the value passed into pr_str() from
the structure. I suspect memmem() is a distraction.
I'll try to further reduce the test case to eliminate memmem(), if possible.

[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #2 from Andrew Pinski  ---
I don't think this is a bug. memmem is defined such that a null pointer
argument is undefined even if the len is 0.

[Bug target/111616] New: On Zen2 7% 519.lbm_r regression between g:1d17d58c284fa8c3 (2023-09-14 02:39) and g:c8e9a75085f9725c (2023-09-18 13:09)

2023-09-27 Thread fkastl at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111616

Bug ID: 111616
   Summary: On Zen2 7% 519.lbm_r regression between
g:1d17d58c284fa8c3 (2023-09-14 02:39) and
g:c8e9a75085f9725c (2023-09-18 13:09)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: needs-bisection
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fkastl at suse dot cz
CC: mjambor at suse dot cz
  Target Milestone: ---
  Host: x86_64-linux
Target: x86_64-linux

On x86_64 AMD Zen2 machine with Ofast LTO PGO march=native mtune=native between
commits g:1d17d58c284fa8c3 (2023-09-14 02:39) and g:c8e9a75085f9725c
(2023-09-18 13:09) there is a 519.lbm_r 7% execution time regression.

Here is a plot of recent measurements:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=286.477.0

I confirmed this on another Zen2 machine. This time I measured 9% slowdown.

[Bug c/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread gardner.ben at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #1 from Ben Gardner  ---
Created attachment 56004
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56004&action=edit
Build script.

[Bug c/111615] New: NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread gardner.ben at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

Bug ID: 111615
   Summary: NULL check incorrectly skipped at O2 and O3
   Product: gcc
   Version: 11.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gardner.ben at gmail dot com
  Target Milestone: ---

Created attachment 56003
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56003&action=edit
Source file that produces the issue.

The attached source code has a function (pr_str()) that prints something
different if the parameter is NULL.
When passed a NULL (const char *) value from a static const structure, the NULL
check is skipped and the first printf() is executed.


static void pr_str(const char *s)
{
   /* BUG: this NULL check is skipped/wrong at O2 and O3 for
    * vec->haystack and vec->needle.
    */
   if (s != NULL)
   {
      printf("'%s' %p %d", s, s, (int)(intptr_t)s);
   }
   else
   {
      printf("(nil)");
   }
}


This occurs at O2 and O3, but not at O0, O1, or Os.
If the program prints "h=(nil)" on the 2nd to last line when executed, then it
worked.
If the program prints "h='(null)' (nil) 0" on the 2nd to last line, then it
failed.


Build script:
#!/bin/sh
build_it() {
OP=$1
gcc -g -Wall -O$OP -c -o memmem_test.O$OP.o memmem_test.c
gcc memmem_test.O$OP.o -o memmem_test.O$OP
}
build_it 0
build_it 1
build_it 2
build_it 3
build_it s


GCC detailed info:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
11.4.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-11
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib
--enable-libphobos-checking=release --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
--with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

Re: [PATCH 01/12] [contrib] validate_failures.py: Avoid testsuite aliasing

2023-09-27 Thread Maxim Kuvyrkov
Hi Bernhard,

Thanks, I meant to fix this, but forgot.

The underlying problem here is that we want to detect which sub-testsuites had 
failures.  Current regex doesn't match go's case because there is no "..." at 
the end: "Running foo" vs "Running foo ..." .
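
A minimal Python sketch of the mismatch. The pattern below is hypothetical,
written from the description above rather than copied from
validate_failures.py, and exp_name is a made-up helper:

```python
import re

# Hypothetical pattern: a "Running" line is only recognised when it ends
# with a literal " ...".
EXP_LINE = re.compile(r'^Running (.*) \.\.\.$')


def exp_name(line):
    """Return the sub-testsuite named on a "Running" line, or None."""
    m = EXP_LINE.match(line)
    return m.group(1) if m else None


dejagnu_style = exp_name('Running gcc/foo.exp ...')   # recognised
gotools_style = exp_name('Running cmd/go')            # not recognised
```

With a pattern of this shape the gotools lines never set the current "exp",
which is consistent with the exp="None" in the reported error.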

My preferred way of fixing this is to make go's testsuite print out "..." .  We 
have a similar patch for glibc [1].

[1] https://sourceware.org/pipermail/libc-alpha/2023-June/148702.html

--
Maxim Kuvyrkov
https://www.linaro.org

> On Sep 26, 2023, at 19:46, Bernhard Reutner-Fischer  
> wrote:
> 
> Hi Maxim!
> 
> On Mon, 5 Jun 2023 18:06:25 +0400
> Maxim Kuvyrkov via Gcc-patches  wrote:
> 
>>> On Jun 3, 2023, at 19:17, Jeff Law  wrote:
>>> 
>>> On 6/2/23 09:20, Maxim Kuvyrkov via Gcc-patches wrote:  
>>>> This patch adds tracking of current testsuite "tool" and "exp"
>>>> to the processing of .sum files.  This avoids aliasing between
>>>> tests from different testsuites with same name+description.
>>>> E.g., this is necessary for testsuite/c-c++-common, which is ran
>>>> for both gcc and g++ "tools".
>>>> This patch changes manifest format from ...
>>>> 
>>>> FAIL: gcc_test
>>>> FAIL: g++_test
>>>> 
>>>> ... to ...
>>>> 
>>>> === gcc tests ===
>>>> Running gcc/foo.exp ...
>>>> FAIL: gcc_test
>>>> === gcc Summary ==
>>>> === g++ tests ===
>>>> Running g++/bar.exp ...
>>>> FAIL: g++_test
>>>> === g++ Summary ==
>>>> .
>>>> The new format uses same formatting as DejaGnu's .sum files
>>>> to specify which "tool" and "exp" the test belongs to.
>>> I think the series is fine.  You're not likely to hear from Diego or Doug I 
>>> suspect, I don't think either are involved in GNU stuff anymore.
>>> 
>> 
>> Thanks, Jeff.  I'll wait for a couple of days and will merge if there are no 
>> new comments.
> 
> Maxim, may i ask you to have a look at the following problem, please?
> 
> ISTM that your exp code does not work as expected for go, maybe you
> forgot to test the changes with go enabled?
> 
> Ever since your changes in summer i see the following:
> 
> gcc-14.mine$ 
> /scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py 
> --clean_build ../gcc-14.orig/
> Getting actual results from build directory .
> ./gcc/testsuite/go/go.sum
> ./gcc/testsuite/gcc/gcc.sum
> ./gcc/testsuite/objc/objc.sum
> ./gcc/testsuite/jit/jit.sum
> ./gcc/testsuite/gdc/gdc.sum
> ./gcc/testsuite/gnat/gnat.sum
> ./gcc/testsuite/ada/acats/acats.sum
> ./gcc/testsuite/g++/g++.sum
> ./gcc/testsuite/obj-c++/obj-c++.sum
> ./gcc/testsuite/rust/rust.sum
> ./gcc/testsuite/gfortran/gfortran.sum
> ./x86_64-pc-linux-gnu/libgomp/testsuite/libgomp.sum
> ./x86_64-pc-linux-gnu/libphobos/testsuite/libphobos.sum
> ./x86_64-pc-linux-gnu/libstdc++-v3/testsuite/libstdc++.sum
> ./x86_64-pc-linux-gnu/libffi/testsuite/libffi.sum
> ./x86_64-pc-linux-gnu/libitm/testsuite/libitm.sum
> ./x86_64-pc-linux-gnu/libgo/libgo.sum
> ./x86_64-pc-linux-gnu/libatomic/testsuite/libatomic.sum
> ./gotools/gotools.sum
> .sum file seems to be broken: tool="gotools", exp="None", summary_line="FAIL: 
> TestScript"
> Traceback (most recent call last):
>  File 
> "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", 
> line 732, in 
>retval = Main(sys.argv)
>  File 
> "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", 
> line 721, in Main
>retval = CompareBuilds()
>  File 
> "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", 
> line 622, in CompareBuilds
>actual = GetResults(sum_files)
>  File 
> "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", 
> line 466, in GetResults
>build_results.update(ParseSummary(sum_fname))
>  File 
> "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", 
> line 405, in ParseSummary
>result = result_set.MakeTestResult(line, ordinal)
>  File 
> "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", 
> line 239, in MakeTestResult
>return TestResult(summary_line, ordinal,
>  File 
> "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", 
> line 151, in __init__
>raise
> RuntimeError: No active exception to reraise
> 
> 
> The problem seems to be that gotools.sum does not mention any ".exp"
> files.
> 
> $ grep "Running " gotools/gotools.sum 
> Running cmd/go
> Running runtime
> Running cgo
> Running carchive
> Running cmd/vet
> Running embed
> $ grep -c "\.exp" gotools/gotools.sum 
> 0
> 
> The .sum files looks like this:
> ---8<---
> Test Run By foo on Tue Sep 26 14:46:48 CEST 2023
> Native configuration is x86_64-foo-linux-gnu
> 
>=== gotools tests ===
> 
> Running cmd/go
> UNTESTED: TestAccidentalGitCheckout
> PASS: TestAlwaysLinkSysoFiles
> ...
> UNTESTED: TestParallelTest
> FAIL: TestScript
> ...
> ---8<---
> 
> May i ask you to have a look, please?
> 
> TIA,




[Bug ipa/111283] [14 Regression] gnat profilebootstrap broken on trunk 20230902 on 32bit targets

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111283

--- Comment #9 from Sergei Trofimovich  ---
Proposed conservative fix as
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631526.html

[Bug gcov-profile/111559] [14 regression] ICE when building Python with PGO

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111559

--- Comment #7 from Sergei Trofimovich  ---
Proposed conservative fix as
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631526.html

[PATCH] ipa-utils: avoid generating uninitialized probabilities on merges.

2023-09-27 Thread Sergei Trofimovich
From: Sergei Trofimovich 

r14-3459-g0c78240fd7d519 "Check that passes do not forget to define profile"
exposed check failures in cases where gcc produces uninitialized profile
probabilities. In the case of PR/111559, an uninitialized probability is
generated for edges executed 0 times during profiling:

__attribute__((noipa)) static void edge(void) {}

int p = 0;

__attribute__((noinline))
static void rule1(void) { if (p) edge(); }

__attribute__((noinline))
static void rule1_same(void) { if (p) edge(); }

__attribute__((noipa)) int main(void) {
rule1();
rule1_same();
}

$ gcc -O2 -fprofile-generate bug.c -o b -fopt-info
$ ./b
$ gcc -O2 -fprofile-use -fprofile-correction bug.c -o b -fopt-info

bug.c: In function 'rule1':
bug.c:6:13: error: probability of edge 3->4 not initialized
6 | static void rule1(void) { if (p) edge(); }
  | ^
during GIMPLE pass: fixup_cfg
bug.c:6:13: internal compiler error: verify_flow_info failed

The change conservatively ignores updates with uninitialized values and
uses initially assigned probabilities (`always` probability in case of
the example).
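
The idea can be sketched outside of GCC as follows. This is a simplified
Python model, not the ipa-utils.cc code: probabilities are plain floats,
None stands in for an uninitialized probability, and both function names are
made up for the illustration:

```python
from typing import Optional


def probability_in(count: int, overall: int) -> Optional[float]:
    """Share of count in overall; None models an uninitialized result."""
    if overall == 0:
        return None          # zero denominator: nothing sensible to report
    return count / overall


def merge(dst_prob: float, dst_count: int,
          src_prob: float, src_count: int) -> float:
    """Weighted merge of two edge probabilities, keeping dst_prob when the
    merged value would be uninitialized."""
    total = dst_count + src_count
    w_dst = probability_in(dst_count, total)
    w_src = probability_in(src_count, total)
    if w_dst is None or w_src is None:
        return dst_prob      # conservative: ignore the update
    return dst_prob * w_dst + src_prob * w_src


never_executed = merge(1.0, 0, 1.0, 0)   # both counts are zero
normal_case = merge(1.0, 3, 0.0, 1)      # weights 3/4 and 1/4
```

In the never-executed case the destination keeps its initial probability
instead of ending up with an uninitialized one.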

gcc/
PR/111283
PR/111559
* ipa-utils.cc (ipa_merge_profiles): Avoid producing
  uninitialized probabilities when merging counters with zero
  denominators.
---
 gcc/ipa-utils.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-utils.cc b/gcc/ipa-utils.cc
index 956c6294fd7..7c53ae9dd45 100644
--- a/gcc/ipa-utils.cc
+++ b/gcc/ipa-utils.cc
@@ -651,13 +651,17 @@ ipa_merge_profiles (struct cgraph_node *dst,
{
  edge srce = EDGE_SUCC (srcbb, i);
  edge dste = EDGE_SUCC (dstbb, i);
- dste->probability = 
+ profile_probability merged =
dste->probability * dstbb->count.ipa ().probability_in
 (dstbb->count.ipa ()
  + srccount.ipa ())
+ srce->probability * srcbb->count.ipa ().probability_in
 (dstbb->count.ipa ()
  + srccount.ipa ());
+ /* We produce uninitialized probabilities when
+denominator is zero: https://gcc.gnu.org/PR111559.  */
+ if (merged.initialized_p ())
+   dste->probability = merged;
}
  dstbb->count = dstbb->count.ipa () + srccount.ipa ();
}
-- 
2.42.0



[PATCH v4] c-family: Implement __has_feature and __has_extension [PR60512]

2023-09-27 Thread Alex Coplan
Hi,

This is a v4 patch to address Jason's feedback here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630911.html

w.r.t. v3 it just removes a comment now that some uncertainty around
cxx_binary_literals has been resolved, and updates the documentation as
suggested to point to the Clang docs.

--

This patch implements clang's __has_feature and __has_extension in GCC.
Currently the patch aims to implement all documented features (and some
undocumented ones) following the documentation at
https://clang.llvm.org/docs/LanguageExtensions.html with the exception
of the legacy features for C++ type traits.  These are omitted, since as
the clang documentation notes, __has_builtin is the correct "modern" way
to query for these (which GCC already implements).

Bootstrapped/regtested on aarch64-linux-gnu, bootstrapped on
x86_64-apple-darwin, darwin regtest in progress.  OK for trunk if
testing passes?

Thanks,
Alex

gcc/c-family/ChangeLog:

PR c++/60512
* c-common.cc (struct hf_feature_info): New.
(c_common_register_feature): New.
(init_has_feature): New.
(has_feature_p): New.
* c-common.h (c_common_has_feature): New.
(c_family_register_lang_features): New.
(c_common_register_feature): New.
(has_feature_p): New.
(c_register_features): New.
(cp_register_features): New.
* c-lex.cc (init_c_lex): Plumb through has_feature callback.
(c_common_has_builtin): Generalize and move common part ...
(c_common_lex_availability_macro): ... here.
(c_common_has_feature): New.
* c-ppoutput.cc (init_pp_output): Plumb through has_feature.

gcc/c/ChangeLog:

PR c++/60512
* c-lang.cc (c_family_register_lang_features): New.
* c-objc-common.cc (struct c_feature_info): New.
(c_register_features): New.

gcc/cp/ChangeLog:

PR c++/60512
* cp-lang.cc (c_family_register_lang_features): New.
* cp-objcp-common.cc (struct cp_feature_selector): New.
(cp_feature_selector::has_feature): New.
(struct cp_feature_info): New.
(cp_register_features): New.

gcc/ChangeLog:

PR c++/60512
* doc/cpp.texi: Document __has_{feature,extension}.

gcc/objc/ChangeLog:

PR c++/60512
* objc-act.cc (struct objc_feature_info): New.
(objc_nonfragile_abi_p): New.
(objc_common_register_features): New.
* objc-act.h (objc_common_register_features): New.
* objc-lang.cc (c_family_register_lang_features): New.

gcc/objcp/ChangeLog:

PR c++/60512
* objcp-lang.cc (c_family_register_lang_features): New.

libcpp/ChangeLog:

PR c++/60512
* include/cpplib.h (struct cpp_callbacks): Add has_feature.
(enum cpp_builtin_type): Add BT_HAS_{FEATURE,EXTENSION}.
* init.cc: Add __has_{feature,extension}.
* macro.cc (_cpp_builtin_macro_text): Handle
BT_HAS_{FEATURE,EXTENSION}.

gcc/testsuite/ChangeLog:

PR c++/60512
* c-c++-common/has-feature-common.c: New test.
* g++.dg/ext/has-feature.C: New test.
* gcc.dg/asan/has-feature-asan.c: New test.
* gcc.dg/has-feature.c: New test.
* gcc.dg/ubsan/has-feature-ubsan.c: New test.
* obj-c++.dg/has-feature.mm: New test.
* objc.dg/has-feature.m: New test.
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index aae57260097..1210953d33a 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -311,6 +311,43 @@ const struct fname_var_t fname_vars[] =
   {NULL, 0, 0},
 };
 
+/* Flags to restrict availability of generic features that
+   are known to __has_{feature,extension}.  */
+
+enum
+{
+  HF_FLAG_EXT = 1, /* Available only as an extension.  */
+  HF_FLAG_SANITIZE = 2, /* Availability depends on sanitizer flags.  */
+};
+
+/* Info for generic features which can be queried through
+   __has_{feature,extension}.  */
+
+struct hf_feature_info
+{
+  const char *ident;
+  unsigned flags;
+  unsigned mask;
+};
+
+/* Table of generic features which can be queried through
+   __has_{feature,extension}.  */
+
+static const hf_feature_info has_feature_table[] =
+{
+  { "address_sanitizer",   HF_FLAG_SANITIZE, SANITIZE_ADDRESS },
+  { "thread_sanitizer",HF_FLAG_SANITIZE, SANITIZE_THREAD },
+  { "leak_sanitizer",  HF_FLAG_SANITIZE, SANITIZE_LEAK },
+  { "hwaddress_sanitizer", HF_FLAG_SANITIZE, SANITIZE_HWADDRESS },
+  { "undefined_behavior_sanitizer", HF_FLAG_SANITIZE, SANITIZE_UNDEFINED },
+  { "attribute_deprecated_with_message",  0, 0 },
+  { "attribute_unavailable_with_message", 0, 0 },
+  { "enumerator_attributes", 0, 0 },
+  { "tls", 0, 0 },
+  { "gnu_asm_goto_with_outputs", HF_FLAG_EXT, 0 },
+  { "gnu_asm_goto_with_outputs_full",HF_FLAG_EXT, 0 }
+};
+
 /* Global visibility options.  */
 

Re: [PATCH] libstdc++: Ensure active union member is correctly set

2023-09-27 Thread Jonathan Wakely
On Sat, 23 Sept 2023 at 08:30, Nathaniel Shead via Libstdc++
 wrote:
>
> On Sat, Sep 23, 2023 at 07:40:48AM +0100, Jonathan Wakely wrote:
> > On Sat, 23 Sept 2023, 01:39 Nathaniel Shead via Libstdc++, <
> > libstd...@gcc.gnu.org> wrote:
> >
> > > Now that bootstrap has finished, I have gotten regressions in the
> > > following libstdc++ tests:
> > >
> > > Running libstdc++:libstdc++-dg/conformance.exp ...
> > > FAIL: 20_util/bitset/access/constexpr.cc -std=gnu++23 (test for excess
> > > errors)
> > > FAIL: 20_util/bitset/access/constexpr.cc -std=gnu++26 (test for excess
> > > errors)
> > > FAIL: 20_util/variant/constexpr.cc -std=gnu++20 (test for excess errors)
> > > FAIL: 20_util/variant/constexpr.cc -std=gnu++26 (test for excess errors)
> > > FAIL: 21_strings/basic_string/cons/char/constexpr.cc -std=gnu++20 (test
> > > for excess errors)
> > > FAIL: 21_strings/basic_string/cons/char/constexpr.cc -std=gnu++26 (test
> > > for excess errors)
> > > FAIL: 21_strings/basic_string/cons/wchar_t/constexpr.cc -std=gnu++20 (test
> > > for excess errors)
> > > FAIL: 21_strings/basic_string/cons/wchar_t/constexpr.cc -std=gnu++26 (test
> > > for excess errors)
> > > FAIL: 21_strings/basic_string/modifiers/swap/constexpr-wchar_t.cc
> > > -std=gnu++20 (test for excess errors)
> > > FAIL: 21_strings/basic_string/modifiers/swap/constexpr-wchar_t.cc
> > > -std=gnu++26 (test for excess errors)
> > > FAIL: 21_strings/basic_string/modifiers/swap/constexpr.cc -std=gnu++20
> > > (test for excess errors)
> > > FAIL: 21_strings/basic_string/modifiers/swap/constexpr.cc -std=gnu++26
> > > (test for excess errors)
> > > FAIL: std/ranges/adaptors/join_with/1.cc -std=gnu++23 (test for excess
> > > errors)
> > > UNRESOLVED: std/ranges/adaptors/join_with/1.cc -std=gnu++23 compilation
> > > failed to produce executable
> > > FAIL: std/ranges/adaptors/join_with/1.cc -std=gnu++26 (test for excess
> > > errors)
> > > UNRESOLVED: std/ranges/adaptors/join_with/1.cc -std=gnu++26 compilation
> > > failed to produce executable
> > >
> > > On investigation though it looks like the issue might be with libstdc++
> > > rather than the patch itself; running the failing tests using clang with
> > > libstdc++ also produces similar errors, and my reading of the code
> > > suggests that this is correct.
> > >
> > > What's the way forward here? Should I look at creating a patch to fix
> > > the libstdc++ issues before resubmitting this patch for the C++
> > > frontend? Or should I submit a version of this patch without the
> > > `std::construct_at` changes and wait till libstdc++ gets fixed for that?
> > >
> >
> > I think we should fix libstdc++. There are probably only a few places that
> > need a fix, which cause all those failures.
> >
> > I can help with those fixes. I'll look into it after the weekend.
> >
>
> Thanks. I did end up getting a chance to look at it earlier today, and
> with the following patch I had no regressions when applying the frontend
> changes. Bootstrapped and regtested on x86_64-pc-linux-gnu.
>
> -- >8 --
>
> This patch ensures that the union members for std::string and
> std::variant are always properly set when a change occurs.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/basic_string.h: (basic_string(basic_string&&)):
> Activate _M_local_buf when needed.
> (basic_string(basic_string&&, const _Alloc&)): Likewise.
> * include/bits/basic_string.tcc: (basic_string::swap): Likewise.
> * include/std/variant: (__detail::__variant::__construct_n): New.
> (__detail::_variant::__emplace): Use __construct_n.
>
> Signed-off-by: Nathaniel Shead 
> ---
>  libstdc++-v3/include/bits/basic_string.h   |  7 +++--
>  libstdc++-v3/include/bits/basic_string.tcc |  8 +++---
>  libstdc++-v3/include/std/variant   | 32 --
>  3 files changed, 38 insertions(+), 9 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/basic_string.h 
> b/libstdc++-v3/include/bits/basic_string.h
> index 09fd62afa66..7c342879827 100644
> --- a/libstdc++-v3/include/bits/basic_string.h
> +++ b/libstdc++-v3/include/bits/basic_string.h
> @@ -678,7 +678,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
>{
> if (__str._M_is_local())
>   {
> -   traits_type::copy(_M_local_buf, __str._M_local_buf,
> +   traits_type::copy(_M_use_local_data(), __str._M_local_buf,
>   __str.length() + 1);
>   }
> else
> @@ -691,7 +691,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
> // basic_stringbuf relies on writing into unallocated capacity so
> // we mess up the contents if we put a '\0' in the string.
> _M_length(__str.length());
> -   __str._M_data(__str._M_local_data());
> +   __str._M_data(__str._M_use_local_data());
> __str._M_set_length(0);
>}
>
> @@ -717,6 +717,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
>{
> if (__str._M_is_local())
>   {
> +   _M_use_local_data();

Lets add a 

[Bug tree-optimization/109088] GCC does not always vectorize conditional reduction

2023-09-27 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088

--- Comment #13 from JuzheZhong  ---
Hi, Richi. This is my draft approach to find more potential
conditional reductions.

diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index a8c915913ae..c25d2038f16 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -1790,8 +1790,72 @@ is_cond_scalar_reduction (gimple *phi, gimple **reduc,
tree arg_0, tree arg_1,
   std::swap (r_op1, r_op2);
   std::swap (r_nop1, r_nop2);
 }
-  else if (r_nop1 != PHI_RESULT (header_phi))
-return false;
+  else if (r_nop1 == PHI_RESULT (header_phi))
+;
+  else
+{
+  /* Analyze the statement chain of STMT so that we could teach generate
+better if-converison code sequence.  We are trying to catch this
+following situation:
+
+  loop-header:
+  reduc_1 = PHI <..., reduc_2>
+  ...
+  if (...)
+  tmp1 = reduc_1 + rhs1;
+  tmp2 = tmp1 + rhs2;
+  tmp3 = tmp2 + rhs3;
+  ...
+  reduc_3 = tmpN-1 + rhsN-1;
+
+  reduc_2 = PHI 
+
+  and convert to
+
+  reduc_2 = PHI <0, reduc_1>
+  tmp1 = rhs1 + rhs2;
+  tmp2 = tmp1 + rhs3;
+  tmp3 = tmp2 + rhs4;
+  ...
+  tmpN-1 = tmpN-2 + rhsN;
+  ifcvt = cond_expr ? tmpN-1 : 0
+  reduc_1 = tmpN-1 +/- ifcvt;  */
+  if (num_imm_uses (PHI_RESULT (header_phi)) != 2)
+   return false;
+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, PHI_RESULT (header_phi))
+   {
+ gimple *use_stmt = USE_STMT (use_p);
+ if (is_gimple_assign (use_stmt))
+   {
+ if (gimple_assign_rhs_code (use_stmt) != reduction_op)
+   return false;
+ if (TREE_CODE (gimple_assign_lhs (use_stmt)) != SSA_NAME)
+   return false;
+
+ bool visited_p = false;
+ while (!visited_p)
+   {
+ use_operand_p use;
+ if (!single_imm_use (gimple_assign_lhs (use_stmt), &use,
+  &use_stmt)
+ || gimple_bb (use_stmt) != gimple_bb (stmt)
+ || !is_gimple_assign (use_stmt)
+ || TREE_CODE (gimple_assign_lhs (use_stmt)) != SSA_NAME
+ || gimple_assign_rhs_code (use_stmt) != reduction_op)
+   return false;
+
+ if (gimple_assign_lhs (use_stmt) == gimple_assign_lhs (stmt))
+   {
+ r_op2 = r_op1;
+ r_op1 = PHI_RESULT (header_phi);
+ visited_p = true;
+   }
+   }
+   }
+ else if (use_stmt != phi)
+   return false;
+   }
+}


My approach performs the check as follows:

   tmp1 = reduc_1 + rhs1;
   tmp2 = tmp1 + rhs2;
   tmp3 = tmp2 + rhs3;
   ...
   reduc_3 = tmpN-1 + rhsN-1;

Start the iteration check from "tmp1 = reduc_1 + rhs1;" and continue until
"reduc_3 = tmpN-1 + rhsN-1;".

Make sure each statement is a PLUS_EXPR for a sum reduction.
Does it look reasonable?

With this change, vectorization succeeds.
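
To make the intended transformation concrete, here is a minimal source-level sketch of the two shapes described in the comment above. It is not output of the patch: the function names and the three-operand chain are assumptions for illustration, and unsigned arithmetic is used so that reassociating the additions gives exactly the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec = std::vector<std::uint32_t>;

// Shape before if-conversion: the whole chain
//   tmp1 = reduc + a[i]; tmp2 = tmp1 + b[i]; reduc = tmp2 + c[i];
// is executed only when the condition holds.
std::uint32_t cond_sum_branchy(const std::vector<int>& cond,
                               const Vec& a, const Vec& b, const Vec& c) {
  assert(cond.size() == a.size() && a.size() == b.size() && b.size() == c.size());
  std::uint32_t reduc = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (cond[i]) {
      std::uint32_t tmp1 = reduc + a[i];
      std::uint32_t tmp2 = tmp1 + b[i];
      reduc = tmp2 + c[i];
    }
  }
  return reduc;
}

// Shape after if-conversion: the addends are summed unconditionally,
// a select picks either that sum or the neutral element 0, and a single
// add folds the selected value into the reduction variable.
std::uint32_t cond_sum_ifcvt(const std::vector<int>& cond,
                             const Vec& a, const Vec& b, const Vec& c) {
  assert(cond.size() == a.size() && a.size() == b.size() && b.size() == c.size());
  std::uint32_t reduc = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    std::uint32_t tmp1 = a[i] + b[i];
    std::uint32_t tmp2 = tmp1 + c[i];
    std::uint32_t ifcvt = cond[i] ? tmp2 : 0u;
    reduc = reduc + ifcvt;
  }
  return reduc;
}
```

The second form has no control flow in the loop body, which is the shape that makes the loop a candidate for vectorization as a plain reduction.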

Re: [PATCH v14 16/40] c, c++: Use 16 bits for all use of enum rid for more keyword space

2023-09-27 Thread Jason Merrill
On Tue, Sep 19, 2023 at 7:05 PM Ken Matsui 
wrote:

> On Tue, Sep 19, 2023 at 9:59 AM Jason Merrill  wrote:
> >
> > On 9/15/23 19:51, Ken Matsui via Gcc-patches wrote:
> > > > Now that RID_MAX has reached 255, we need to update the bit sizes of every
> > > > use of the enum rid from 8 to 16 to support more keywords.
> >
> > Sorry to bring this up so late, but this does raise the question of
> > whether we actually want to use keyword space for all these traits that
> > will probably be used approximately once in a C++ translation unit.  I
> > wonder if it would make sense to instead use e.g. RID_TRAIT for all of
> > them and use gperf to look up the specific trait from the identifier?
> >
>
> Thank you for your review. To use gperf, we might need to duplicate
> the list of all traits defined in cp-trait.def. Modifying the traits
> would require us to edit two files, but would it be acceptable?
>

I think the gperf input could be generated from the .def with a simple
script?

Jason


[Bug ipa/111613] [12/13/14 Regression] Bit field stores can be incorrectly optimized away when -fstore-merging is in effect

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111613

--- Comment #2 from Richard Biener  ---
It's the late IPA modref that mis-analyzes the store-merged sequence I think.

[Bug libstdc++/111589] Use relaxed atomic increment (but not decrement!) in shared_ptr

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111589

--- Comment #2 from Jonathan Wakely  ---
The interesting question is whether all of these can be relaxed or if we need
to stop using __atomic_add_dispatch for shared_ptr copies:

include/bits/cow_string.h: __gnu_cxx::__atomic_add_dispatch(>_M_refcount, 1);
include/bits/cow_string.h: __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 1);
include/bits/ios_base.h: _M_add_reference() { __gnu_cxx::__atomic_add_dispatch(&_M_refcount, 1); }
include/bits/locale_classes.h: { __gnu_cxx::__atomic_add_dispatch(&_M_refcount, 1); }
include/bits/locale_classes.h: { __gnu_cxx::__atomic_add_dispatch(&_M_refcount, 1); }
include/bits/shared_ptr_base.h: { __gnu_cxx::__atomic_add_dispatch(&_M_use_count, 1); }
include/bits/shared_ptr_base.h: { __gnu_cxx::__atomic_add_dispatch(&_M_weak_count, 1); }
include/ext/atomicity.h: // __atomic_add_dispatch
include/ext/atomicity.h: __atomic_add_dispatch(_Atomic_word* __mem, int __val)
include/ext/pool_allocator.h: __atomic_add_dispatch(&_S_force_new, 1);
include/ext/pool_allocator.h: __atomic_add_dispatch(&_S_force_new, -1);
include/ext/rc_string_base.h: __atomic_add_dispatch(&_M_info._M_refcount, 1);
include/tr1/shared_ptr.h: { __gnu_cxx::__atomic_add_dispatch(&_M_use_count, 1); }
include/tr1/shared_ptr.h: { __gnu_cxx::__atomic_add_dispatch(&_M_weak_count, 1); }
libsupc++/eh_atomics.h: __gnu_cxx::__atomic_add_dispatch (__count, 1);
src/c++98/ios_init.cc: __gnu_cxx::__atomic_add_dispatch(&_S_refcount, 1);
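
For reference, the pattern named in the bug title (relaxed increment, but not a relaxed decrement) can be sketched with std::atomic as below. This is only an illustration and not libstdc++'s implementation: the class RefCount and its member names are hypothetical, and the real code goes through __atomic_add_dispatch on an _Atomic_word rather than std::atomic.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-alone reference counter, for illustration only.
class RefCount {
  std::atomic<int> count_{1};

public:
  // Taking another reference publishes no data, so the increment
  // itself needs no ordering beyond atomicity.
  void acquire() noexcept {
    count_.fetch_add(1, std::memory_order_relaxed);
  }

  // Dropping a reference must not be relaxed: the thread that observes
  // the count reaching zero has to see every write made by the other
  // owners before it destroys the object.
  // Returns true when the caller released the last reference.
  bool release() noexcept {
    int old = count_.fetch_sub(1, std::memory_order_acq_rel);
    assert(old > 0);
    return old == 1;
  }

  int use_count() const noexcept {
    return count_.load(std::memory_order_relaxed);
  }
};
```

Whether every call site listed above can use the relaxed form is exactly the open question; the sketch only covers the plain shared-ownership case.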
