date:20230904

Re: [PATCH 1/2] VR-VALUES: Rename op0/op1 to op1/op2 for test_for_singularity

2023-09-04 Thread Jeff Law via Gcc-patches





On 9/1/23 11:30, Andrew Pinski via Gcc-patches wrote:

As requested, and to make this easier to understand alongside the new
ranger code, rename the arguments op0/op1 to op1/op2.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* vr-values.cc (test_for_singularity): Rename
arguments op0/op1 to op1/op2.

OK
jeff

Re: [PATCH v2] RISC-V: zicond: Fix opt2 pattern

2023-09-04 Thread Jeff Law via Gcc-patches





On 9/4/23 20:19, Tsukasa OI wrote:



-FAIL: 30_threads/async/async.cc execution test
+FAIL: gcc.c-torture/execute/pr60003.c   -O1  execution test
+FAIL: gcc.dg/setjmp-3.c execution test
+FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1  execution test
+FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -fpic execution test
FWIW, I've got async.cc marked here as flakey when run under QEMU. 
That's also consistent with what I found at a prior employer when 
working on a private GCC port.




Jeff

Re: [PATCH] RISC-V: Fix Zicond ICE on large constants

2023-09-04 Thread Tsukasa OI via Gcc-patches

On 2023/09/05 14:27, Jeff Law wrote:
> 
> 
> On 9/4/23 00:45, Kito Cheng wrote:
>> Maybe move the check logic a bit forward? My thought is the logic is
>> already specialized into a few categories, (imm, imm), (imm, reg), (reg,
>> reg)... and the logic you put is already in (imm, reg), but it should
>> really move into the (reg, reg) case IMO? If we move that forward, we
>> could avoid adding too much logic to redirect the case.
>>
>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> index 2db9c81ac8b..c84509c393b 100644
>> --- a/gcc/config/riscv/riscv.cc
>> +++ b/gcc/config/riscv/riscv.cc
>> @@ -3892,6 +3892,12 @@ riscv_expand_conditional_move (rtx dest, rtx
>> op, rtx cons, rtx alt)
>>   op1 = XEXP (op, 1);
>>     }
>>
>> +  /* CONS might not fit into a signed 12 bit immediate suitable
>> +    for an addi instruction.  If that's the case, force it into
>> +    a register.  */
>> +  if (CONST_INT_P (cons) && !SMALL_OPERAND (INTVAL (cons)))
>> +   cons = force_reg (mode, cons);
>> +
>>    /* 0, reg or 0, imm */
>>    if (cons == CONST0_RTX (mode)
>>   && (REG_P (alt)
> But for the imm, imm case, if we force things into registers too early,
> we'll lose when alt - cons and cons fit in a 12-bit immediate but alt
> does not.
> 
> I think Tsukasa is on the right path here.  I should have checked
> riscv_emit_binary -- I thought it handled the out-of-range constant case,
> but looking at it now, it clearly does not.
> 
> I think this implies we need a similar blob of code for the imm, imm
> case for cons.
> 
> Jeff
> 

Okay, I will add a check to the "imm, imm" case (although I haven't
figured out how to reproduce this case) and submit v2.

Tsukasa

Re: [PATCH] RISC-V: Fix Zicond ICE on large constants

2023-09-04 Thread Jeff Law via Gcc-patches





On 9/4/23 00:45, Kito Cheng wrote:

Maybe move the check logic a bit forward? My thought is the logic is
already specialized into a few categories, (imm, imm), (imm, reg), (reg,
reg)... and the logic you put is already in (imm, reg), but it should
really move into the (reg, reg) case IMO? If we move that forward, we
could avoid adding too much logic to redirect the case.

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2db9c81ac8b..c84509c393b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3892,6 +3892,12 @@ riscv_expand_conditional_move (rtx dest, rtx
op, rtx cons, rtx alt)
  op1 = XEXP (op, 1);
}

+  /* CONS might not fit into a signed 12 bit immediate suitable
+     for an addi instruction.  If that's the case, force it into
+     a register.  */
+  if (CONST_INT_P (cons) && !SMALL_OPERAND (INTVAL (cons)))
+   cons = force_reg (mode, cons);
+
   /* 0, reg or 0, imm */
   if (cons == CONST0_RTX (mode)
  && (REG_P (alt)
But for the imm, imm case, if we force things into registers too early,
we'll lose when alt - cons and cons fit in a 12-bit immediate but alt
does not.


I think Tsukasa is on the right path here.  I should have checked
riscv_emit_binary -- I thought it handled the out-of-range constant case,
but looking at it now, it clearly does not.


I think this implies we need a similar blob of code for the imm, imm 
case for cons.


Jeff
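
To make the immediate-range concern above concrete, here is a minimal C sketch. It is not GCC code: `fits_simm12` is a hypothetical stand-in for the kind of check `SMALL_OPERAND` performs (a signed 12-bit immediate, -2048 to 2047), and `imm_imm_case_avoids_register` is an illustrative name for the situation Jeff describes, where `alt` is out of range but `cons` and `alt - cons` are not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a SMALL_OPERAND-style test: does VALUE fit
   in the signed 12-bit immediate field of an addi instruction?  */
static bool
fits_simm12 (int64_t value)
{
  return value >= -2048 && value <= 2047;
}

/* The imm, imm situation from the discussion: ALT itself is out of
   range, but both CONS and ALT - CONS are in range.  Forcing constants
   into registers before looking at the difference would miss this.  */
static bool
imm_imm_case_avoids_register (int64_t cons, int64_t alt)
{
  return !fits_simm12 (alt)
         && fits_simm12 (cons)
         && fits_simm12 (alt - cons);
}
```

For example, with cons = 2000 and alt = 3000, alt does not fit in 12 bits, but cons and the difference 1000 both do.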

Re: [PATCH v2 1/2] strlen: fold strstr() even if the length isn't previously known [PR96601]

2023-09-04 Thread Jeff Law via Gcc-patches





On 9/4/23 14:58, Hamza Mahfooz wrote:

Currently, we give up in fold_strstr_to_strncmp() if the length of the
second argument to strstr() isn't known to us by the time we hit
that function. However, we can instead insert a strlen() ourselves
and continue trying to fold strstr() into strlen()+strncmp().

PR tree-optimization/96601

gcc/ChangeLog:

* tree-ssa-strlen.cc (fold_strstr_to_strncmp): If arg1_len == NULL,
insert a strlen() for strstr()'s arg1 and use it as arg1_len.

gcc/testsuite/ChangeLog:

* gcc.dg/strlenopt-30.c: Modify test.
I'm not sure it's necessarily a win to convert to strncmp as 
aggressively as this patch would do.  Essentially when you have large 
needles that are partially matched repeatedly, performance can 
significantly suffer.


If we're going to seriously consider this path, then I'd like to see how 
it performs in general.  The glibc testsuite I think has some 
performance coverage for strstr.


jeff
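
For readers unfamiliar with the transformation under discussion, the following C sketch shows the source-level equivalence as I understand it: a `strstr (s, t) == s` test (does `s` start with `t`?) rewritten as a `strlen()` plus a bounded `strncmp()`. The function names are hypothetical and this is not the pass's implementation, only the two forms side by side.

```c
#include <stdbool.h>
#include <string.h>

/* The form written in the source: does HAYSTACK start with NEEDLE?  */
static bool
starts_with_strstr (const char *haystack, const char *needle)
{
  return strstr (haystack, needle) == haystack;
}

/* The folded form: one strlen() of the needle plus one bounded compare.  */
static bool
starts_with_strncmp (const char *haystack, const char *needle)
{
  return strncmp (haystack, needle, strlen (needle)) == 0;
}
```

Both functions agree on every input, including an empty needle. Whether the second form is actually faster in general is the performance question raised above.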

Re: [PATCH] lra: Avoid unfolded plus-0

2023-09-04 Thread Jeff Law via Gcc-patches





On 8/31/23 09:24, Richard Sandiford via Gcc-patches wrote:

While backporting another patch to an earlier release, I hit a
situation in which lra_eliminate_regs_1 would eliminate an address to:

 (plus (reg:P R) (const_int 0))

This address compared not-equal to plain:

 (reg:P R)

which caused an ICE in a later peephole2.  (The ICE showed up in
gfortran.fortran-torture/compile/pr80464.f90 on the branch but seems
to be latent on trunk.)

These unfolded PLUSes shouldn't occur in the insn stream, and later code
in the same function tried to avoid them.

Tested on aarch64-linux-gnu so far, but I'll test on x86_64-linux-gnu too.
Does this look OK?

There are probably other instances of the same thing elsewhere,
but it seemed safer to stick to the one that caused the issue.

Thanks,
Richard


gcc/
* lra-eliminations.cc (lra_eliminate_regs_1): Use simplify_gen_binary
rather than gen_rtx_PLUS.

OK
jeff
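
The difference between a raw constructor and a simplifying one can be illustrated with a toy expression type in C. Everything here (`toy_rtx`, `toy_gen_plus`, `toy_simplify_gen_plus`, `toy_equal_p`) is hypothetical and only loosely modelled on the roles of `gen_rtx_PLUS`, `simplify_gen_binary` and structural comparison; it is a sketch of the idea, not GCC's RTL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of an RTL expression; not GCC's rtx.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS };

struct toy_rtx
{
  enum toy_code code;
  long value;                  /* register number or constant value */
  const struct toy_rtx *op0;   /* operands, used only by TOY_PLUS */
  const struct toy_rtx *op1;
};

/* Structural equality: same code and same operands or value.  */
static bool
toy_equal_p (const struct toy_rtx *a, const struct toy_rtx *b)
{
  if (a->code != b->code)
    return false;
  if (a->code == TOY_PLUS)
    return toy_equal_p (a->op0, b->op0) && toy_equal_p (a->op1, b->op1);
  return a->value == b->value;
}

/* Raw constructor: always builds a PLUS node in *STORAGE.  */
static const struct toy_rtx *
toy_gen_plus (struct toy_rtx *storage, const struct toy_rtx *x,
              const struct toy_rtx *y)
{
  storage->code = TOY_PLUS;
  storage->value = 0;
  storage->op0 = x;
  storage->op1 = y;
  return storage;
}

/* Simplifying constructor: folds X + 0 to X before building anything.  */
static const struct toy_rtx *
toy_simplify_gen_plus (struct toy_rtx *storage, const struct toy_rtx *x,
                       const struct toy_rtx *y)
{
  if (y->code == TOY_CONST_INT && y->value == 0)
    return x;
  return toy_gen_plus (storage, x, y);
}
```

With the raw constructor, "reg + 0" compares not-equal to plain "reg"; with the simplifying one, the two compare equal, which is the property the later peephole2 relied on.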

Re: [PATCH] RISC-V: Document some -march special cases

2023-09-04 Thread Jeff Law via Gcc-patches





On 8/29/23 23:52, Kito Cheng wrote:

I would prefer NOT to expose those --param options in the user manual,
since they are generally meant for internal use only; we should add a -m
option and enable `--param=riscv-autovec-preference=scalable` by
default once we think it's stable enough.

I tend to agree.

The params controlling this stuff are, in my mind, a stopgap as we work
through those issues.  Ideally we get to a point where they're no longer
needed and we just remove them.  That gets harder if they get exposed in
a release, even more so if they get documented in the release.


Jeff

Re:[pushed] [PATCH v3 0/4] LoongArch: target configuration interface update

2023-09-04 Thread chenglulu


Pushed to r14-3665.

On 2023/8/31 at 8:48 PM, Yang Yujie wrote:

This is an update of
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628772.html

Changes since the last patchset:

1. Fix texinfo format of the install.texi document.
2. Add documentation for --with-strict-align-lib.

v1 -> v2:
1. Add new configure option --with-strict-align-lib to control
whether -mstrict-align should be used when building libraries.
This facilitates building toolchains targeting both LA264
(Loongson 2k1000la) and non-LA264 cores.

2. Define preprocessing macros  __loongarch_sx / __loongarch_asx
/ __loongarch_simd_width that indicate the enabled SIMD
extensions.

3. Keep the current non-symmetric multidir layout, but do not build
duplicate multilib variants with the same ABI option.  Make
--with-abi= obsolete to ensure a consistent directory layout.
(ABI type of the "toplevel" libraries can be inferred from the
 target triplet)

4. Using "-mno-lasx" does not cause a fallback to "-msimd=none" as
long as the -march= architecture or the default --with-simd=
setting has LSX support.

Yang Yujie (4):
   LoongArch: improved target configuration interface
   LoongArch: define preprocessing macros "__loongarch_{arch,tune}"
   LoongArch: add new configure option --with-strict-align-lib
   LoongArch: support loongarch*-elf target

  config-ml.in                                  |  10 +
  gcc/config.gcc                                | 408 ++
  gcc/config/loongarch/elf.h                    |  52 +++
  .../loongarch/genopts/loongarch-strings       |   8 +-
  gcc/config/loongarch/genopts/loongarch.opt.in |  62 ++-
  gcc/config/loongarch/la464.md                 |  32 +-
  gcc/config/loongarch/loongarch-c.cc           |  22 +-
  gcc/config/loongarch/loongarch-cpu.cc         | 263 ++-
  gcc/config/loongarch/loongarch-cpu.h          |   3 +-
  gcc/config/loongarch/loongarch-def.c          |  67 +--
  gcc/config/loongarch/loongarch-def.h          |  57 +--
  gcc/config/loongarch/loongarch-driver.cc      | 208 +
  gcc/config/loongarch/loongarch-driver.h       |  40 +-
  gcc/config/loongarch/loongarch-opts.cc        | 372 +++-
  gcc/config/loongarch/loongarch-opts.h         |  59 +--
  gcc/config/loongarch/loongarch-str.h          |   7 +-
  gcc/config/loongarch/loongarch.cc             |  87 ++--
  gcc/config/loongarch/loongarch.opt            |  60 ++-
  gcc/config/loongarch/t-linux                  |  32 +-
  gcc/doc/install.texi                          |  56 ++-
  gcc/doc/invoke.texi                           |  32 +-
  libgcc/config.host                            |   9 +-
  22 files changed, 1261 insertions(+), 685 deletions(-)
  create mode 100644 gcc/config/loongarch/elf.h

Re:[pushed] [PATCH v2] LoongArch: initial ada support on linux

2023-09-04 Thread chenglulu


Pushed to r14-3669.

On 2023/9/4 at 10:42 AM, Yang Yujie wrote:

gcc/ChangeLog:

* ada/Makefile.rtl: Add LoongArch support.
* ada/libgnarl/s-linux__loongarch.ads: New.
* ada/libgnat/system-linux-loongarch.ads: New.
* config/loongarch/loongarch.h: Mark normalized options
passed from driver to gnat1 as explicit for multilib.
---
  gcc/ada/Makefile.rtl                       |  49 +++
  gcc/ada/libgnarl/s-linux__loongarch.ads    | 134 +++
  gcc/ada/libgnat/system-linux-loongarch.ads | 145 +
  gcc/config/loongarch/loongarch.h           |   4 +-
  4 files changed, 330 insertions(+), 2 deletions(-)
  create mode 100644 gcc/ada/libgnarl/s-linux__loongarch.ads
  create mode 100644 gcc/ada/libgnat/system-linux-loongarch.ads

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 96306f8cc9a..852a3324388 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2111,6 +2111,55 @@ ifeq ($(strip $(filter-out cygwin% mingw32% 
pe,$(target_os))),)
LIBRARY_VERSION := $(LIB_VERSION)
  endif
  
+# LoongArch Linux

+ifeq ($(strip $(filter-out loongarch% linux%,$(target_cpu) $(target_os))),)
+  LIBGNAT_TARGET_PAIRS = \
+  a-exetim.adbhttp://www.gnu.org/licenses/>.  --
+--  --
+--
+
+--  This is the LoongArch version of this package
+
+--  This package encapsulates cpu specific differences between implementations
+--  of GNU/Linux, in order to share s-osinte-linux.ads.
+
+--  PLEASE DO NOT add any with-clauses to this package or remove the pragma
+--  Preelaborate. This package is designed to be a bottom-level (leaf) package
+
+with Interfaces.C;
+with System.Parameters;
+
+package System.Linux is
+   pragma Preelaborate;
+
+   ----------
+   -- Time --
+   ----------
+
+   subtype int         is Interfaces.C.int;
+   subtype long        is Interfaces.C.long;
+   subtype suseconds_t is Interfaces.C.long;
+   type time_t is range -2 ** (System.Parameters.time_t_bits - 1)
+     .. 2 ** (System.Parameters.time_t_bits - 1) - 1;
+   subtype clockid_t   is Interfaces.C.int;
+
+   type timespec is record
+  tv_sec  : time_t;
+  tv_nsec : long;
+   end record;
+   pragma Convention (C, timespec);
+
+   type timeval is record
+  tv_sec  : time_t;
+  tv_usec : suseconds_t;
+   end record;
+   pragma Convention (C, timeval);
+
+   -----------
+   -- Errno --
+   -----------
+
+   EAGAIN: constant := 11;
+   EINTR : constant := 4;
+   EINVAL: constant := 22;
+   ENOMEM: constant := 12;
+   EPERM : constant := 1;
+   ETIMEDOUT : constant := 110;
+
+   -------------
+   -- Signals --
+   -------------
+
+   SIGHUP : constant := 1; --  hangup
+   SIGINT : constant := 2; --  interrupt (rubout)
+   SIGQUIT: constant := 3; --  quit (ASCD FS)
+   SIGILL : constant := 4; --  illegal instruction (not reset)
+   SIGTRAP: constant := 5; --  trace trap (not reset)
+   SIGIOT : constant := 6; --  IOT instruction
+   SIGABRT: constant := 6; --  used by abort, replace SIGIOT in the  future
+   SIGBUS : constant := 7; --  bus error
+   SIGFPE : constant := 8; --  floating point exception
+   SIGKILL: constant := 9; --  kill (cannot be caught or ignored)
+   SIGUSR1: constant := 10; --  user defined signal 1
+   SIGSEGV: constant := 11; --  segmentation violation
+   SIGUSR2: constant := 12; --  user defined signal 2
+   SIGPIPE: constant := 13; --  write on a pipe with no one to read it
+   SIGALRM: constant := 14; --  alarm clock
+   SIGTERM: constant := 15; --  software termination signal from kill
+   SIGSTKFLT  : constant := 16; --  coprocessor stack fault (Linux)
+   SIGCLD : constant := 17; --  alias for SIGCHLD
+   SIGCHLD: constant := 17; --  child status change
+   SIGCONT: constant := 18; --  stopped process has been continued
+   SIGSTOP: constant := 19; --  stop (cannot be caught or ignored)
+   SIGTSTP: constant := 20; --  user stop requested from tty
+   SIGTTIN: constant := 21; --  background tty read attempted
+   SIGTTOU: constant := 22; --  background tty write attempted
+   SIGURG : constant := 23; --  urgent condition on IO channel
+   SIGXCPU: constant := 24; --  CPU time limit exceeded
+   SIGXFSZ: constant := 25; --  filesize limit exceeded
+   SIGVTALRM  : constant := 26; --  virtual timer expired
+   SIGPROF: constant := 27; --  profiling timer expired
+   SIGWINCH   : constant := 28; --  window size change
+   SIGPOLL: constant := 29; --  pollable event occurred
+   SIGIO  : constant := 29; --  I/O now possible (4.2 BSD)
+   SIGPWR : constant := 30; --  power-fail restart
+   SIGSYS : constant := 31; --  bad system call
+   SIG32  : constant := 32; --  glibc internal signal
+   SIG33  :

Re: [PATCH] RISC-V: Fix Dynamic LMUL compile option

2023-09-04 Thread juzhe.zh...@rivai.ai

Simple patch for the dynamic cost model:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629212.html
Committed.



juzhe.zh...@rivai.ai
 
From: Juzhe-Zhong
Date: 2023-09-04 17:08
To: gcc-patches
CC: kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc; Juzhe-Zhong
Subject: [PATCH] RISC-V: Fix Dynamic LMUL compile option
gcc/ChangeLog:
 
* config/riscv/riscv-opts.h (enum riscv_autovec_lmul_enum): Fix Dynamic status.
* config/riscv/riscv-v.cc (preferred_simd_mode): Ditto.
(autovectorize_vector_modes): Ditto.
(vectorize_related_mode): Ditto.
 
---
gcc/config/riscv/riscv-opts.h |  2 +-
gcc/config/riscv/riscv-v.cc   | 15 ---
2 files changed, 9 insertions(+), 8 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 79e0f12e388..b6b5907e111 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -81,7 +81,7 @@ enum riscv_autovec_lmul_enum {
   RVV_M4 = 4,
   RVV_M8 = 8,
   /* For dynamic LMUL, we compare COST start with LMUL8.  */
-  RVV_DYNAMIC = RVV_M8
+  RVV_DYNAMIC = 9
};
enum riscv_multilib_select_kind {
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index c8ad96f44d5..fbbc16a3c26 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1971,16 +1971,16 @@ preferred_simd_mode (scalar_mode mode)
  vectorizer when we enable them in this target hook. Currently, we can
  support auto-vectorization in -march=rv32_zve32x_zvl128b. Wheras,
  -march=rv32_zve32x_zvl32b or -march=rv32_zve32x_zvl64b are disabled.  */
+  int lmul = riscv_autovec_lmul == RVV_DYNAMIC ? RVV_M8 : riscv_autovec_lmul;
   if (autovec_use_vlmax_p ())
 {
-  if (TARGET_MIN_VLEN < 128 && riscv_autovec_lmul < RVV_M2)
+  if (TARGET_MIN_VLEN < 128 && lmul < RVV_M2)
return word_mode;
   /* We use LMUL = 1 as base bytesize which is BYTES_PER_RISCV_VECTOR and
riscv_autovec_lmul as multiply factor to calculate the the NUNITS to
get the auto-vectorization mode.  */
   poly_uint64 nunits;
-  poly_uint64 vector_size
- = BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul);
+  poly_uint64 vector_size = BYTES_PER_RISCV_VECTOR * lmul;
   poly_uint64 scalar_size = GET_MODE_SIZE (mode);
   gcc_assert (multiple_p (vector_size, scalar_size, &nunits));
   machine_mode rvv_mode;
@@ -2154,10 +2154,10 @@ get_cmp_insn_code (rtx_code code, machine_mode mode)
unsigned int
autovectorize_vector_modes (vector_modes *modes, bool)
{
+  int lmul = riscv_autovec_lmul == RVV_DYNAMIC ? RVV_M8 : riscv_autovec_lmul;
   if (autovec_use_vlmax_p ())
 {
-  poly_uint64 full_size
- = BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul);
+  poly_uint64 full_size = BYTES_PER_RISCV_VECTOR * lmul;
   /* Start with a RVVQImode where LMUL is the number of units that
fit a whole vector.
@@ -2187,7 +2187,7 @@ autovectorize_vector_modes (vector_modes *modes, bool)
 {
   /* Push all VLSmodes according to TARGET_MIN_VLEN.  */
   unsigned int i = 0;
-  unsigned int base_size = TARGET_MIN_VLEN * riscv_autovec_lmul / 8;
+  unsigned int base_size = TARGET_MIN_VLEN * lmul / 8;
   unsigned int size = base_size;
   machine_mode mode;
   while (size > 0 && get_vector_mode (QImode, size).exists ())
@@ -2212,8 +2212,9 @@ vectorize_related_mode (machine_mode vector_mode, 
scalar_mode element_mode,
{
   /* TODO: We will support RVV VLS auto-vectorization mode in the future. */
   poly_uint64 min_units;
+  int lmul = riscv_autovec_lmul == RVV_DYNAMIC ? RVV_M8 : riscv_autovec_lmul;
   if (autovec_use_vlmax_p () && riscv_v_ext_vector_mode_p (vector_mode)
-  && multiple_p (BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul),
+  && multiple_p (BYTES_PER_RISCV_VECTOR * lmul,
 GET_MODE_SIZE (element_mode), &min_units))
 {
   machine_mode rvv_mode;
-- 
2.36.1
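
The core of the change above is that the "dynamic" setting gets its own enumerator value and is mapped back to LMUL 8 wherever a concrete multiplier is needed. A minimal C sketch of that pattern follows. The `toy_` names are hypothetical; the M4, M8 and DYNAMIC values mirror the diff, while M1 = 1 and M2 = 2 are assumptions, since those enumerators are not shown in the quoted hunk.

```c
/* Hypothetical mirror of the enum in the patch.  TOY_M1 and TOY_M2 are
   assumed values; only M4, M8 and DYNAMIC appear in the quoted hunk.  */
enum toy_autovec_lmul
{
  TOY_M1 = 1,
  TOY_M2 = 2,
  TOY_M4 = 4,
  TOY_M8 = 8,
  /* Distinct from TOY_M8 so "dynamic" can be told apart from "m8".  */
  TOY_DYNAMIC = 9
};

/* Map the user-visible setting to a concrete multiplier: dynamic LMUL
   starts from LMUL 8, every other setting is used as-is.  */
static int
effective_lmul (enum toy_autovec_lmul setting)
{
  return setting == TOY_DYNAMIC ? TOY_M8 : (int) setting;
}

/* Vector size in bytes for a given base size and setting.  */
static unsigned int
vector_bytes (unsigned int bytes_per_vector, enum toy_autovec_lmul setting)
{
  return bytes_per_vector * (unsigned int) effective_lmul (setting);
}
```

Without the mapping, using the raw enumerator as a multiplier would scale sizes by 9 for the dynamic setting, which is why each use site in the patch introduces a local `lmul`.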

Re: [PATCH v2] RISC-V: zicond: Fix opt2 pattern

2023-09-04 Thread Tsukasa OI via Gcc-patches

Sorry, I wanted to reply directly to Jeff, but I couldn't because I am not
subscribed to gcc-patches and Jeff's recent reply hasn't been archived yet.

Bug confirmed for me.

I ran the full test suite with the following configuration (while testing
this I found another bug [an ICE] and submitted a quick fix for it; the
following patch set must be applied; I will make a PATCH v2 for it, though):


The ICE, the simulator configuration and/or a dirty build tree might be
the reason Jeff couldn't reproduce the bug.

# ZiCond enabled
# Remove "_zicond" to disable ZiCond.
# ${SYSROOT} points to the prebuilt sysroot with
# glibc + libgcc with -march=rv64imafdc -mabi=lp64d
${GCC_SRCDIR}/configure \
    --target=riscv64-unknown-linux-gnu \
    --prefix=${PREFIX} \
    --with-sysroot=${SYSROOT} \
    --with-system-zlib \
    --disable-shared \
    --enable-tls \
    --enable-languages=c,c++ \
    --disable-libmudflap \
    --disable-libssp \
    --disable-libquadmath \
    --disable-libsanitizer \
    --disable-nls \
    --disable-bootstrap \
    --disable-multilib \
    --with-tune=rocket \
    --with-arch=rv64imafdc_zicond \
    --with-abi=lp64d \
    --with-isa-spec=20191213

Then I ran "make; make check RUNTESTFLAGS='--target_board=riscv-sim'".
Note that I configured DejaGnu (riscv-sim.exp) to execute tests with:
"qemu-riscv64 -L ${SYSROOT} -cpu rv64,g=on,x-zicond=on" (QEMU 8.1.0
Linux user emulation).

Warning: abort() on QEMU with Linux user emulation causes QEMU to abort,
too (possibly making many coredumps).

The diff of test failures is as follows.
-: Occurs only when ZiCond is disabled
+: Occurs only when ZiCond is enabled

-FAIL: 30_threads/async/async.cc execution test
+FAIL: gcc.c-torture/execute/pr60003.c   -O1  execution test
+FAIL: gcc.dg/setjmp-3.c execution test
+FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1  execution test
+FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -fpic execution test

I'm not sure why 30_threads/async/async.cc succeeds after enabling the
'Zicond' extension, but I am sure that the setjmp-3.c failures are caused
by this very bug.

A smaller example (not involving setjmp / longjmp) that reproduces this bug
in my environment is as follows (you *don't* have to apply my patch
above; running make all-gcc && make install-gcc to overwrite an existing
RV64 GCC prefix will work):

> #include <stdio.h>
> 
> __attribute__((noinline, noclone))
> void sample(long* a)
> {
>     *a = 1;
> }
> 
> __attribute__((noinline, noclone))
> long foo(long x)
> {
>     long a = 0;
>     sample(&a); // a is overwritten to 1.
>     if (a == 0)
>         return 0;
>     else
>         return x; // should be always taken
> }
> 
> int main(int argc, char** argv)
> {
>     printf("%ld\n", foo(5)); // should print 5
>     return 0;
> }

Note that we have to make sure that variables are not easily inferred by
another optimization pass (that's why I needed two functions).

> riscv64-unknown-linux-gnu-gcc -march=rv64gc_zicond -O1 -static a.c
> qemu-riscv64 -cpu rv64,g=on,x-zicond=on ./a.out

This printed 0, not 5 as I expected.

I support Vineet's patch set (v2).

Thanks,
Tsukasa

[PATCH] analyzer: implement symbolic value support for CPython plugin's refcnt checker [PR107646]

2023-09-04 Thread Eric Feng via Gcc-patches

Hi Dave,

Recently I've been working on symbolic value support for the reference
count checker. I've attached a patch for it below; let me know if it looks
OK for trunk. Thanks!

Best,
Eric

---

This patch enhances the reference count checker in the CPython plugin by
adding support for symbolic values. Whereas previously we were only able
to check the reference count of PyObject* objects created in the scope
of the function, we are now able to emit diagnostics on reference count
mismatch of objects that were, for example, passed in as a function
parameter.

rc6.c:6:10: warning: expected ‘obj’ to have reference count: N + ‘1’ but 
ob_refcnt field is N + ‘2’
6 |   return obj;
  |  ^~~
  ‘create_py_object2’: event 1
|
|6 |   return obj;
|  |  ^~~
|  |  |
|  |  (1) here
|


gcc/testsuite/ChangeLog:
PR analyzer/107646
* gcc.dg/plugin/analyzer_cpython_plugin.c: Support reference count 
checking
of symbolic values.
* gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: New test.
* gcc.dg/plugin/plugin.exp: New test.
* gcc.dg/plugin/cpython-plugin-test-refcnt.c: New test.

Signed-off-by: Eric Feng 

---
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 133 +++---
 .../cpython-plugin-test-PyList_Append.c   |  21 ++-
 .../plugin/cpython-plugin-test-refcnt.c   |  18 +++
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   1 +
 4 files changed, 118 insertions(+), 55 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt.c

diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c 
b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
index bf1982e79c3..d7ecd7fce09 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
@@ -314,17 +314,20 @@ public:
   {
 diagnostic_metadata m;
 bool warned;
-// just assuming constants for now
-auto actual_refcnt
-   = m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant ();
-auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue ()->get_constant ();
-warned = warning_meta (rich_loc, m, get_controlling_option (),
-  "expected %qE to have "
-  "reference count: %qE but ob_refcnt field is: %qE",
-  m_reg_tree, actual_refcnt, ob_refcnt);
-
-// location_t loc = rich_loc->get_loc ();
-// foo (loc);
+
+const auto *actual_refcnt_constant
+   = m_actual_refcnt->dyn_cast_constant_svalue ();
+const auto *ob_refcnt_constant = m_ob_refcnt->dyn_cast_constant_svalue ();
+if (!actual_refcnt_constant || !ob_refcnt_constant)
+  return false;
+
+auto actual_refcnt = actual_refcnt_constant->get_constant ();
+auto ob_refcnt = ob_refcnt_constant->get_constant ();
+warned = warning_meta (
+   rich_loc, m, get_controlling_option (),
+   "expected %qE to have "
+   "reference count: N + %qE but ob_refcnt field is N + %qE",
+   m_reg_tree, actual_refcnt, ob_refcnt);
 return warned;
   }
 
@@ -336,10 +339,6 @@ public:
 
 private:
 
-  void foo(location_t loc) const 
-  {
-inform(loc, "something is up right here");
-  }
   const region *m_base_region;
   const svalue *m_ob_refcnt;
   const svalue *m_actual_refcnt;
@@ -369,6 +368,19 @@ increment_region_refcnt (hash_map 
, const region *key)
   refcnt = existed ? refcnt + 1 : 1;
 }
 
+static const region *
+get_region_from_svalue (const svalue *sval, region_model_manager *mgr)
+{
+  const auto *region_sval = sval->dyn_cast_region_svalue ();
+  if (region_sval)
+return region_sval->get_pointee ();
+
+  const auto *initial_sval = sval->dyn_cast_initial_svalue ();
+  if (initial_sval)
+return mgr->get_symbolic_region (initial_sval);
+
+  return nullptr;
+}
 
 /* Recursively fills in region_to_refcnt with the references owned by
pyobj_ptr_sval.  */
@@ -381,20 +393,9 @@ count_pyobj_references (const region_model *model,
   if (!pyobj_ptr_sval)
 return;
 
-  const auto *pyobj_region_sval = pyobj_ptr_sval->dyn_cast_region_svalue ();
-  const auto *pyobj_initial_sval = pyobj_ptr_sval->dyn_cast_initial_svalue ();
-  if (!pyobj_region_sval && !pyobj_initial_sval)
-return;
-
-  // todo: support initial sval (e.g passed in as parameter)
-  if (pyobj_initial_sval)
-{
-  // increment_region_refcnt (region_to_refcnt,
-  //  pyobj_initial_sval->get_region ());
-  return;
-}
+  region_model_manager *mgr = model->get_manager ();
 
-  const region *pyobj_region = pyobj_region_sval->get_pointee ();
+  const region *pyobj_region = get_region_from_svalue (pyobj_ptr_sval, mgr);
   if (!pyobj_region || seen.contains (pyobj_region))
 return;
 
@@ -409,49 +410,75 @@ count_pyobj_references (const region_model *model,
 return;
 
   const auto _binding_map = retval_cluster->get_map ();
-
   for (const

Re: [PATCH v2] RISC-V: zicond: Fix opt2 pattern

2023-09-04 Thread Jeff Law via Gcc-patches





On 9/1/23 13:53, Vineet Gupta wrote:

This was tripping up gcc.c-torture/execute/pr60003.c at -O1 since, in the
failing case, the pattern's asm czero.nez gets both rs2 and rs1 as nonzero.

We start with the following src code snippet:

   if (a == 0)
return 0;
   else
return x;
 }

which is equivalent to:  "x = (a != 0) ? x : a" where x is NOT 0.
 

and matches define_insn "*czero.nez..opt2"

| (insn 41 20 38 3 (set (reg/v:DI 136 [ x ])
|(if_then_else:DI (ne (reg/v:DI 134 [ a ])
|(const_int 0 [0]))
|(reg/v:DI 136 [ x ])
|(reg/v:DI 134 [ a ]))) {*czero.nez.didi.opt2}

The corresponding asm pattern generates
 czero.nez x, x, a   ; %0, %2, %1

which implies
 "x = (a != 0) ? 0 : a"

clearly not what the pattern wants to do.

Essentially "(a != 0) ? x : a" cannot be expressed with CZERO.nez if X
is not guaranteed to be 0.

However this can be fixed with a small tweak

"x = (a != 0) ? x : a"

is the same as

"x = (a == 0) ? a : x" since middle operand is 0 when a == 0.

which can be expressed with CZERO.eqz

before fix                          after fix
----------                          ---------
li        a5,1                      li        a5,1
ld        a4,8(sp)                  ld        a4,8(sp)      # a4 is runtime non zero
czero.nez a0,a4,a5  # a0=0 NOK      czero.eqz a0,a4,a5      # a0=a4!=0 OK

The issue only happens at -O1 as at higher optimization levels, the
whole conditional move gets optimized away.
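
The argument above can be checked with a small C model of the two Zicond instructions. The function names (`czero_eqz`, `czero_nez`, `wanted`) are hypothetical; the semantics modelled are the ones from the Zicond specification, where `czero.eqz rd, rs1, rs2` yields 0 when rs2 is zero and rs1 otherwise, and `czero.nez` yields 0 when rs2 is nonzero and rs1 otherwise.

```c
#include <stdint.h>

/* czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1.  */
static int64_t
czero_eqz (int64_t rs1, int64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* czero.nez rd, rs1, rs2: rd = (rs2 != 0) ? 0 : rs1.  */
static int64_t
czero_nez (int64_t rs1, int64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* The source-level selection from the snippet: (a != 0) ? x : a.  */
static int64_t
wanted (int64_t x, int64_t a)
{
  return a != 0 ? x : a;
}
```

For x = 5 and a = 1, `wanted` and `czero_eqz (x, a)` both give 5, while `czero_nez (x, a)` gives 0, matching the "before fix" and "after fix" columns above.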

This fixes 4 testsuite failures in a zicond build:

FAIL: gcc.c-torture/execute/pr60003.c   -O1  execution test
FAIL: gcc.dg/setjmp-3.c execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -fpic execution test

gcc/ChangeLog:
* config/riscv/zicond.md: Fix op2 pattern.

Fixes: 1d5bc3285e8a ("[committed][RISC-V] Fix 20010221-1.c with zicond")
Signed-off-by: Vineet Gupta 

OK.

Still not sure why I didn't trip over it in my own testing (execute.exp 
runs pr60003.c), but regardless, glad to have it fixed.


jeff

Re: RFC: Introduce -fhardened to enable security-related flags

2023-09-04 Thread Richard Sandiford via Gcc-patches

Qing Zhao via Gcc-patches  writes:
>> On Aug 29, 2023, at 3:42 PM, Marek Polacek via Gcc-patches 
>>  wrote:
>> 
>> Improving the security of software has been a major trend in the recent
>> years.  Fortunately, GCC offers a wide variety of flags that enable extra
>> hardening.  These flags aren't enabled by default, though.  And since
>> there are a lot of hardening flags, with more to come, it's been difficult
>> to keep on top of them; more so for the users of GCC who ought not to be
>> expected to keep track of all the new options.
>> 
>> To alleviate some of the problems I mentioned, we thought it would
>> be useful to provide a new umbrella option that enables a reasonable set
>> of hardening flags.  What's "reasonable" in this context is not easy to
>> pin down.  Surely, there must be no ABI impact, the option cannot cause
>> severe performance issues, and, I suspect, it should not cause build
>> errors by enabling stricter compile-time errors (such as, -Wimplicit-int,
>> -Wint-conversion).  Including a controversial option in -fhardened
>> would likely cause that users would not use -fhardened at all.  It's
>> roughly akin to -Wall or -O2 -- those also enable a reasonable set of
>> options, and evolve over time, and are not kept in sync with other
>> compilers.
>> 
>> Currently, -fhardened enables:
>> 
>>  -D_FORTIFY_SOURCE=3 (or =2 for older glibcs)
>>  -D_GLIBCXX_ASSERTIONS
>>  -ftrivial-auto-var-init=zero
>>  -fPIE  -pie  -Wl,-z,relro,-z,now
>>  -fstack-protector-strong
>>  -fstack-clash-protection
>>  -fcf-protection=full (x86 GNU/Linux only)
>> 
>> -fsanitize=undefined is specifically not enabled.  -fstrict-flex-arrays is
>> also liable to break a lot of code so I didn't include it.
>> 
>> Appended is a proof-of-concept patch.  It doesn't implement --help=hardened
>> yet.  A fairly crucial point is that -fhardened will not override options
>> that were specified on the command line (before or after -fhardened).  For
>> example,
>> 
>> -D_FORTIFY_SOURCE=1 -fhardened
>> 
>> means that _FORTIFY_SOURCE=1 will be used.  Similarly,
>> 
>>  -fhardened -fstack-protector
>> 
>> will not enable -fstack-protector-strong.
>> 
>> Thoughts?
>
> In general, I think that it is a very good idea to provide umbrella options
>  for software security purpose.  Thanks a lot for this work!
>
> 1. I do agree with Martin, multiple-level control for this purpose might be 
> needed,
> similar as multiple levels for warnings, and multiple levels for 
> optimizations.
>
> Similar as optimization options, can we organize all the security options 
> together 
> In our manual, then the user will have a good central place to get more and 
> complete
> Information of the security features our compiler provides? 
>
> 2. What’s the major criteria to decide which security feature should go into 
> this list?
> Later, when we have new security features, how to decide whether to add them 
> to
> This list or not?
> I am wondering why -fzero-call-used-regs is not included in the list and also

FWIW, I wondered the same thing.  Not a strong conviction that it should
be included -- maybe the code bloat is too much on some targets.  But it
might be acceptable for the -fhardened equivalent of -O3, at least if
restricted to GPRs.
 
> Why chose -ftrivial-auto-var-init=zero instead of 
> -ftrivial-auto-var-init=pattern? 

Yeah, IIRC -ftrivial-auto-var-init=zero was controversial with some
Clang maintainers because it effectively creates a language dialect.
-ftrivial-auto-var-init=pattern wasn't controversial in the same way.

Thanks,
Richard

[WIP] testsuite: Port 'check-function-bodies' to nvptx (was: Add dg test for matching function bodies)

2023-09-04 Thread Thomas Schwinge

Hi!

On 2019-07-16T15:04:49+0100, Richard Sandiford  
wrote:
> There isn't a 1:1 mapping from SVE intrinsics to SVE instructions,
> but the intrinsics are still close enough to the instructions for
> there to be a specific preferred sequence (or sometimes choice of
> preferred sequences) for a given combination of operands.  Sometimes
> these sequences will be one instruction, sometimes they'll be several.
>
> I therefore wanted a convenient way of matching the exact assembly
> implementation of a given function.  It's possible to do that using
> single scan-assembler lines, but:
>
> (a) they become hard to read for multiline matches
> (b) the PASS/FAIL lines tend to be long
> (c) it's useful to have a single place that skips over uninteresting
> lines, such as entry block labels and .cfi_* directives, without
> being overly broad
>
> This patch therefore adds a new check-function-bodies dg-final test
> that looks for specially-formatted comments.  As a demo, the patch
> converts the SVE vec_init tests to use the new harness instead of
> scan-assembler.

Great, thanks, belatedly!

> The regexps in parse_function_bodies are fairly general, but might
> still need to be extended in future for targets like Darwin or AIX.

..., or nvptx.  As an example, I'm attaching the 'abort.s' generated for
'gcc.target/nvptx/abort.c'.

I'm further attaching a crude ;-) (obviously, not intending to push in
this form) "[WIP] testsuite: Port 'check-function-bodies' to nvptx" to
illustrate that (a) it can be made to work for nvptx, but (b) there are a
number of TODO items.

In particular, how to parameterize the regular expressions for the different
syntax used by nvptx: for example, parameterize via global variables,
initialized accordingly (where?)?  Thinking about it, maybe simply
conditionalizing the current local initializations by
'if { [istarget nvptx-*-*] } { [...] } else { [...] }' will do, simple
enough!

Regarding the whitespace prefix, I think I'll go with the current
'append function_regexp "\t" $line "\n"', that is, prefix expected output
lines with '\t' (as done in 'gcc.target/nvptx/abort.c'), and also for
nvptx handle labels as "fluff" (until we solve that issue generally).

(I'll look into all that later, but wanted to post this now, in case
anyone has different ideas.)
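
To make the matching model discussed above a bit more concrete, here is a rough
Python sketch (it is not the Tcl in scanasm.exp).  It assumes, as a
simplification, that "fluff" is any line whose first non-blank character is
'.', and it builds the expected-body regexp by prefixing each expected line
with a tab, mirroring the 'append function_regexp "\t" $line "\n"' quoted
above.  The names strip_fluff, build_function_regexp and body_matches are made
up for illustration.

```python
import re

# Assumption: "fluff" is any line whose first non-blank character is '.'
# (directives such as .cfi_startproc, local labels such as .LFB0:).
FLUFF_RE = re.compile(r"^\s*\.")


def strip_fluff(asm_body: str) -> str:
    """Drop fluff lines from a function body, keeping everything else."""
    kept = [line for line in asm_body.splitlines() if not FLUFF_RE.match(line)]
    return "".join(line + "\n" for line in kept)


def build_function_regexp(expected_lines) -> str:
    """Prefix each expected line with a tab and end it with a newline."""
    return "".join("\t" + line + "\n" for line in expected_lines)


def body_matches(asm_body: str, expected_lines) -> bool:
    """True if the de-fluffed body matches the expected lines exactly."""
    pattern = build_function_regexp(expected_lines)
    return re.fullmatch(pattern, strip_fluff(asm_body)) is not None


asm = "\t.cfi_startproc\n\tnot\tz0.b, p0/m, z0.b\n\tret\n\t.cfi_endproc\n"
expected = [r"not\tz0\.b, p0/m, z0\.b", "ret"]
print(body_matches(asm, expected))
```

Note that a fluff rule this naive would also throw away nvptx's '.reg'
declarations and would not recognize '$L2:' as a label, which is one reason
the regular expressions need to be parameterized per target.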


Grüße
 Thomas


// BEGIN PREAMBLE
.version 6.0
.target sm_30
.address_size 64
// END PREAMBLE


// BEGIN GLOBAL FUNCTION DECL: main
.visible .func (.param.u32 %value_out) main (.param.u32 %in_ar0, .param.u64 %in_ar1);

// BEGIN GLOBAL FUNCTION DEF: main
.visible .func (.param.u32 %value_out) main (.param.u32 %in_ar0, .param.u64 %in_ar1)
{
.reg.u32 %value;
.reg.u32 %ar0;
ld.param.u32 %ar0, [%in_ar0];
.reg.u64 %ar1;
ld.param.u64 %ar1, [%in_ar1];
.reg.u32 %r23;
.reg.pred %r25;
mov.u32 %r23, %ar0;
setp.le.s32 %r25, %r23, 2;
@%r25   bra $L2;
{
call abort;
trap; // (noreturn)
exit; // (noreturn)
}
$L2:
mov.u32 %value, 0;
st.param.u32 [%value_out], %value;
ret;
}

// BEGIN GLOBAL FUNCTION DECL: abort
.extern .func abort;
From 1a15a9dbd8cfc3c2f5df72653614c5c70a0c6018 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 4 Sep 2023 22:28:12 +0200
Subject: [PATCH] [WIP] testsuite: Port 'check-function-bodies' to nvptx

This extends commit 4d706ff86ea86868615558e92407674a4f4b4af9
"Add dg test for matching function bodies" for nvptx.
---
 gcc/doc/sourcebuild.texi   |  2 ++
 gcc/testsuite/gcc.target/nvptx/abort.c | 19 +--
 gcc/testsuite/lib/scanasm.exp  | 21 +
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 60a708e88c0..d83da89f9ba 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -3319,6 +3319,7 @@ function body for unoptimized code.
 
 The first line of the expected output for a function @var{fn} has the form:
 
+@c TODO
 @smallexample
 @var{prefix} @var{fn}:  [@{ target/xfail @var{selector} @}]
 @end smallexample
@@ -3326,6 +3327,7 @@ The first line of the expected output for a function @var{fn} has the form:
 Subsequent lines of the expected output also start with @var{prefix}.
 In both cases, whitespace after @var{prefix} is not significant.
 
+@c TODO
 The test discards assembly directives such as @code{.cfi_startproc}
 and local label definitions such as @code{.LFB0} from the compiler's
 assembly output.  It then matches the result against the expected
diff --git a/gcc/testsuite/gcc.target/nvptx/abort.c

[PATCH v2 2/2] strlen: call handle_builtin_strlen() from fold_strstr_to_strncmp()

2023-09-04 Thread Hamza Mahfooz

Currently, we are not saving the strlen() call we inserted for possible
future common subexpression elimination. Also, it's possible that we can
further fold that strlen() call. So, refactor handle_builtin_strlen()
so that it can be called from fold_strstr_to_strncmp().

gcc/ChangeLog:

* tree-ssa-strlen.cc (strlen_pass::handle_builtin_strlen):
Remove from class and mark as static.
(handle_builtin_strlen): Add parameter
"gimple_stmt_iterator gsi" and replace references of m_gsi
with gsi.
(fold_strstr_to_strncmp): Call handle_builtin_strlen().
(strlen_pass::check_and_optimize_call): Add m_gsi to the
handle_builtin_strlen() call.

gcc/testsuite/ChangeLog:

* gcc.dg/strlenopt-30.c: Add a test.

Signed-off-by: Hamza Mahfooz 
---
v2: bump up the number of strncmp()s from 6 to 7 in strlenopt-30.c
---
 gcc/testsuite/gcc.dg/strlenopt-30.c |  9 -
 gcc/tree-ssa-strlen.cc  | 23 ---
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/strlenopt-30.c b/gcc/testsuite/gcc.dg/strlenopt-30.c
index 1ee814048c1..d89fe83ca98 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-30.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-30.c
@@ -44,6 +44,12 @@ _Bool f6(char *s, char *t)
   return __builtin_strstr (s, t) == s;
 }
 
+__attribute__((no_icf))
+_Bool f6plus(char *s, char *t)
+{
+  return __builtin_strstr (s, t) == s && __builtin_strlen(t) > 10;
+}
+
 /* Do not perform transform in this case, since
t1 doesn't have single use.  */
 
@@ -57,4 +63,5 @@ _Bool f7(char *s)
   return (t1 == s);
 }
 
-/* { dg-final { scan-tree-dump-times "__builtin_strncmp" 6 "strlen1" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_strlen" 2 "strlen1" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_strncmp" 7 "strlen1" } } */
diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc
index b0ebbb0db62..8ec6ddbc7c0 100644
--- a/gcc/tree-ssa-strlen.cc
+++ b/gcc/tree-ssa-strlen.cc
@@ -252,7 +252,6 @@ public:
   bool handle_assign (tree lhs, bool *zero_write);
   bool handle_store (bool *zero_write);
   void handle_pointer_plus ();
-  void handle_builtin_strlen ();
   void handle_builtin_strchr ();
   void handle_builtin_strcpy (built_in_function);
   void handle_integral_assign (bool *cleanup_eh);
@@ -2211,10 +2210,10 @@ strlen_pass::maybe_warn_overflow (gimple *stmt, bool call_lhs,
the strlen call with the known value, otherwise remember that strlen
of the argument is stored in the lhs SSA_NAME.  */
 
-void
-strlen_pass::handle_builtin_strlen ()
+static void
+handle_builtin_strlen (gimple_stmt_iterator gsi)
 {
-  gimple *stmt = gsi_stmt (m_gsi);
+  gimple *stmt = gsi_stmt (gsi);
   tree lhs = gimple_call_lhs (stmt);
 
   if (lhs == NULL_TREE)
@@ -2268,8 +2267,8 @@ strlen_pass::handle_builtin_strlen ()
  if (bound)
rhs = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (rhs), rhs, bound);
 
- gimplify_and_update_call_from_tree (&m_gsi, rhs);
- stmt = gsi_stmt (m_gsi);
+ gimplify_and_update_call_from_tree (&gsi, rhs);
+ stmt = gsi_stmt (gsi);
  update_stmt (stmt);
  if (dump_file && (dump_flags & TDF_DETAILS) != 0)
{
@@ -2367,8 +2366,8 @@ strlen_pass::handle_builtin_strlen ()
  }
if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (ret)))
  ret = fold_convert_loc (loc, TREE_TYPE (lhs), ret);
-   gimplify_and_update_call_from_tree (&m_gsi, ret);
-   stmt = gsi_stmt (m_gsi);
+   gimplify_and_update_call_from_tree (&gsi, ret);
+   stmt = gsi_stmt (gsi);
update_stmt (stmt);
if (dump_file && (dump_flags & TDF_DETAILS) != 0)
  {
@@ -5272,8 +5271,9 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
{
  tree arg1 = gimple_call_arg (call_stmt, 1);
  tree arg1_len = NULL_TREE;
- int idx = get_stridx (arg1, call_stmt);
  gimple_stmt_iterator gsi = gsi_for_stmt (call_stmt);
+again:
+ int idx = get_stridx (arg1, call_stmt);
 
  if (idx)
{
@@ -5296,7 +5296,8 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
  gimple_call_set_lhs (strlen_call, strlen_lhs);
  gimple_set_vuse (strlen_call, gimple_vuse (call_stmt));
   gsi_insert_before (&gsi, strlen_call, GSI_SAME_STMT);
- arg1_len = strlen_lhs;
+ handle_builtin_strlen (gsi_for_stmt (strlen_call));
+ goto again;
}
  else if (!is_gimple_val (arg1_len))
{
@@ -5393,7 +5394,7 @@ strlen_pass::check_and_optimize_call (bool *zero_write)
 {
 case BUILT_IN_STRLEN:
 case BUILT_IN_STRNLEN:
-  handle_builtin_strlen ();
+  handle_builtin_strlen (m_gsi);
   break;
 case BUILT_IN_STRCHR:
   handle_builtin_strchr ();
-- 
2.42.0

[PATCH v2 1/2] strlen: fold strstr() even if the length isn't previously known [PR96601]

2023-09-04 Thread Hamza Mahfooz

Currently, we give up in fold_strstr_to_strncmp() if the length of the
second argument to strstr() isn't known to us by the time we hit
that function. However, we can instead insert a strlen() in ourselves
and continue trying to fold strstr() into strlen()+strncmp().
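
As a rough illustration of why the fold is valid, the following Python sketch
models the two C expressions on ordinary Python strings (standing in for
NUL-terminated C strings) and compares them exhaustively over a small
alphabet.  The helper names are hypothetical and this is only a model of the
semantics, not of the GIMPLE transformation itself.

```python
from itertools import product


def strstr_returns_start(s: str, t: str) -> bool:
    """Model of the C test `strstr (s, t) == s`."""
    return s.find(t) == 0


def strncmp_prefix_equal(s: str, t: str) -> bool:
    """Model of `strncmp (s, t, strlen (t)) == 0`."""
    n = len(t)  # models the strlen (t) inserted when the length is unknown
    return s[:n] == t


def all_strings(alphabet: str, max_len: int):
    """Yield every string over `alphabet` of length 0..max_len."""
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)


# Collect every (s, t) pair on which the two models disagree.
mismatches = [
    (s, t)
    for s in all_strings("ab", 3)
    for t in all_strings("ab", 3)
    if strstr_returns_start(s, t) != strncmp_prefix_equal(s, t)
]
print(mismatches)
```

The empty-needle case is covered too: both models report a match when t is
the empty string.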

PR tree-optimization/96601

gcc/ChangeLog:

* tree-ssa-strlen.cc (fold_strstr_to_strncmp): If arg1_len == NULL,
insert a strlen() for strstr()'s arg1 and use it as arg1_len.

gcc/testsuite/ChangeLog:

* gcc.dg/strlenopt-30.c: Modify test.

Signed-off-by: Hamza Mahfooz 
---
Please push this for me if you think it looks good, since I don't have
write access to the repository.
---
 gcc/testsuite/gcc.dg/strlenopt-30.c |  5 +-
 gcc/tree-ssa-strlen.cc  | 81 -
 2 files changed, 45 insertions(+), 41 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/strlenopt-30.c b/gcc/testsuite/gcc.dg/strlenopt-30.c
index 2a3098ba96f..1ee814048c1 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-30.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-30.c
@@ -38,9 +38,6 @@ void f5(char *s)
 foo_f5();
 }
 
-/* Do not perform transform, since strlen (t)
-   is unknown.  */
-
 __attribute__((no_icf))
 _Bool f6(char *s, char *t)
 {
@@ -60,4 +57,4 @@ _Bool f7(char *s)
   return (t1 == s);
 }
 
-/* { dg-final { scan-tree-dump-times "__builtin_strncmp" 5 "strlen1" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_strncmp" 6 "strlen1" } } */
diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc
index 8b7ef919826..b0ebbb0db62 100644
--- a/gcc/tree-ssa-strlen.cc
+++ b/gcc/tree-ssa-strlen.cc
@@ -5273,6 +5273,7 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
  tree arg1 = gimple_call_arg (call_stmt, 1);
  tree arg1_len = NULL_TREE;
  int idx = get_stridx (arg1, call_stmt);
+ gimple_stmt_iterator gsi = gsi_for_stmt (call_stmt);
 
  if (idx)
{
@@ -5286,51 +5287,57 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
}
}
 
- if (arg1_len != NULL_TREE)
+ if (arg1_len == NULL_TREE)
{
- gimple_stmt_iterator gsi = gsi_for_stmt (call_stmt);
- tree strncmp_decl = builtin_decl_explicit (BUILT_IN_STRNCMP);
+ tree strlen_decl = builtin_decl_explicit (BUILT_IN_STRLEN);
+ gcall *strlen_call = gimple_build_call (strlen_decl, 1, arg1);
+ tree strlen_lhs = make_ssa_name (size_type_node, strlen_call);
+
+ gimple_call_set_lhs (strlen_call, strlen_lhs);
+ gimple_set_vuse (strlen_call, gimple_vuse (call_stmt));
+ gsi_insert_before (&gsi, strlen_call, GSI_SAME_STMT);
+ arg1_len = strlen_lhs;
+   }
+ else if (!is_gimple_val (arg1_len))
+   {
+ tree arg1_len_tmp = make_ssa_name (TREE_TYPE (arg1_len));
+ gassign *arg1_stmt = gimple_build_assign (arg1_len_tmp,
+   arg1_len);
+ gsi_insert_before (&gsi, arg1_stmt, GSI_SAME_STMT);
+ arg1_len = arg1_len_tmp;
+   }
 
- if (!is_gimple_val (arg1_len))
+ tree strncmp_decl = builtin_decl_explicit (BUILT_IN_STRNCMP);
+ gcall *strncmp_call = gimple_build_call (strncmp_decl, 3,
+  arg0, arg1, arg1_len);
+ tree strncmp_lhs = make_ssa_name (integer_type_node);
+ gimple_set_vuse (strncmp_call, gimple_vuse (call_stmt));
+ gimple_call_set_lhs (strncmp_call, strncmp_lhs);
+ gsi_remove (&gsi, true);
+ gsi_insert_before (&gsi, strncmp_call, GSI_SAME_STMT);
+ tree zero = build_zero_cst (TREE_TYPE (strncmp_lhs));
+
+ if (is_gimple_assign (stmt))
+   {
+ if (gimple_assign_rhs_code (stmt) == COND_EXPR)
{
- tree arg1_len_tmp = make_ssa_name (TREE_TYPE (arg1_len));
- gassign *arg1_stmt = gimple_build_assign (arg1_len_tmp,
-   arg1_len);
- gsi_insert_before (&gsi, arg1_stmt, GSI_SAME_STMT);
- arg1_len = arg1_len_tmp;
-   }
-
- gcall *strncmp_call = gimple_build_call (strncmp_decl, 3,
- arg0, arg1, arg1_len);
- tree strncmp_lhs = make_ssa_name (integer_type_node);
- gimple_set_vuse (strncmp_call, gimple_vuse (call_stmt));
- gimple_call_set_lhs (strncmp_call, strncmp_lhs);
- gsi_remove (&gsi, true);
- gsi_insert_before (&gsi, strncmp_call, GSI_SAME_STMT);
- tree zero = build_zero_cst (TREE_TYPE (strncmp_lhs));
-
- if (is_gimple_assign (stmt))
-   {
- if (gimple_assign_rhs_code (stmt) == COND_EXPR)
-   {
- tree cond = gimple_assign_rhs1 (stmt);
-

Re: [PATCH] testsuite: aarch64: Adjust SVE ACLE tests to new generated code

2023-09-04 Thread Thiago Jung Bauermann via Gcc-patches



Hello Richard,

Richard Sandiford  writes:

> Thiago Jung Bauermann via Gcc-patches  writes:
>> Since commit e7a36e4715c7 "[PATCH] RISC-V: Support simplify (-1-x) for
>> vector." these tests fail on aarch64-linux:
>>
>>  === g++ tests ===
>>
>> Running g++:g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ...
>> FAIL: gcc.target/aarch64/sve/acle/asm/subr_s8.c -std=gnu++98 -O2 
>> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_FULL  
>> check-function-bodies subr_m1_s8_m
>> FAIL: gcc.target/aarch64/sve/acle/asm/subr_s8.c -std=gnu++98 -O2 
>> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_OVERLOADS  
>> check-function-bodies subr_m1_s8_m
>> FAIL: gcc.target/aarch64/sve/acle/asm/subr_u8.c -std=gnu++98 -O2 
>> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_FULL  
>> check-function-bodies subr_m1_u8_m
>> FAIL: gcc.target/aarch64/sve/acle/asm/subr_u8.c -std=gnu++98 -O2 
>> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_OVERLOADS  
>> check-function-bodies subr_m1_u8_m
>>
>>  === gcc tests ===
>>
>> Running gcc:gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ...
>> FAIL: gcc.target/aarch64/sve/acle/asm/subr_s8.c -std=gnu90 -O2 
>> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_FULL  
>> check-function-bodies subr_m1_s8_m
>> FAIL: gcc.target/aarch64/sve/acle/asm/subr_s8.c -std=gnu90 -O2 
>> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_OVERLOADS  
>> check-function-bodies subr_m1_s8_m
>> FAIL: gcc.target/aarch64/sve/acle/asm/subr_u8.c -std=gnu90 -O2 
>> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_FULL  
>> check-function-bodies subr_m1_u8_m
>> FAIL: gcc.target/aarch64/sve/acle/asm/subr_u8.c -std=gnu90 -O2 
>> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_OVERLOADS  
>> check-function-bodies subr_m1_u8_m
>>
>> Andrew Pinski's analysis in PR testsuite/111071 is that the new code is
>> better and the testcase should be updated. I also asked Prathamesh Kulkarni
>> in private and he agreed.
>>
>> Here is the update. With this change, all tests in
>> gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp pass on aarch64-linux.
>>
>> gcc/testsuite/
>>  PR testsuite/111071
>>  * gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c: Adjust to 
>> new code.
>>  * gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_u8.c: Likewise.
>
> Thanks, pushed to trunk.  And sorry for the delay.  I somehow
> missed this earlier. :(

No problem. Thank you for pushing it!

-- 
Thiago

[COMMITTED v4] mklog: handle Signed-off-by, minor cleanup

2023-09-04 Thread Marc Poulhiès via Gcc-patches

Consider Signed-off-by lines as part of the ending of the initial
commit to avoid having these in the middle of the log when the
changelog part is injected after.

This is particularly useful with:

 $ git gcc-commit-mklog --amend -s

that can be used to create the changelog and add the Signed-off-by line.

Also applies most of the shellcheck suggestions on the
prepare-commit-msg hook.
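
As a rough sketch of the behaviour described above (not the actual code in
mklog.py), the following Python splits a commit message at the first line of
the "end" and places the generated ChangeLog between the two parts.  The
function name inject_changelog and the sample message are made up for
illustration; only the regex is taken from the patch below.

```python
import re

# Regex as introduced by this patch: it marks the first line of the "end".
FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by:|co-authored-by:|#)')


def inject_changelog(message: str, changelog: str) -> str:
    """Insert `changelog` before the trailing Signed-off-by/Co-authored-by block."""
    lines = message.splitlines()
    # Index of the first "end" line, or len(lines) if there is none.
    end_idx = next(
        (i for i, line in enumerate(lines) if FIRST_LINE_OF_END_RE.match(line)),
        len(lines),
    )
    start, end = lines[:end_idx], lines[end_idx:]
    # Drop blank lines at the end of the "start" so spacing stays uniform.
    while start and not start[-1].strip():
        start.pop()
    parts = start + [''] + changelog.splitlines()
    if end:
        parts += [''] + end
    return '\n'.join(parts) + '\n'


message = (
    "gccrs: Some title\n"
    "\n"
    "This is some text explaining the commit.\n"
    "\n"
    "Signed-off-by: My Name\n"
)
changelog = "gcc/rust/ChangeLog:\n\n\t* some_file (bla):\n"
print(inject_changelog(message, changelog))
```

With this layout the Signed-off-by line stays the last thing in the message,
which is the point of the change.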

contrib/ChangeLog:

* mklog.py: Leave SOB lines after changelog.
* prepare-commit-msg: Apply most shellcheck suggestions.

Signed-off-by: Marc Poulhiès 
---
 contrib/mklog.py   | 34 +-
 contrib/prepare-commit-msg | 20 ++--
 2 files changed, 39 insertions(+), 15 deletions(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 0abefcd9374..effe5aa1ca5 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -41,7 +41,34 @@ from unidiff import PatchSet
 
 LINE_LIMIT = 100
 TAB_WIDTH = 8
-CO_AUTHORED_BY_PREFIX = 'co-authored-by: '
+
+# Initial commit:
+#   +------------------------------------------+
+#   | gccrs: Some title                        |
+#   |                                          | This is the "start"
+#   | This is some text explaining the commit. |
+#   | There can be several lines.              |
+#   |                                          |<--->
+#   | Signed-off-by: My Name                   | This is the "end"
+#   +------------------------------------------+
+#
+# Results in:
+#   +------------------------------------------+
+#   | gccrs: Some title                        |
+#   |                                          |
+#   | This is some text explaining the commit. | This is the "start"
+#   | There can be several lines.              |
+#   |                                          |<--->
+#   | gcc/rust/ChangeLog:                      |
+#   |                                          | This is the generated
+#   | * some_file (bla):                       | ChangeLog part
+#   | (foo):                                   |
+#   |                                          |<--->
+#   | Signed-off-by: My Name                   | This is the "end"
+#   +------------------------------------------+
+
+# this regex matches the first line of the "end" in the initial commit message
+FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by:|co-authored-by:|#)')
 
 pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
 prnum_regex = re.compile(r'PR (?P<comp>[a-z+-]+)/(?P<num>[0-9]+)')
@@ -330,10 +357,7 @@ def update_copyright(data):
 
 
 def skip_line_in_changelog(line):
-    if line.lower().startswith(CO_AUTHORED_BY_PREFIX) or line.startswith('#'):
-        return False
-    return True
-
+    return FIRST_LINE_OF_END_RE.match(line) == None
 
 if __name__ == '__main__':
     extra_args = os.getenv('GCC_MKLOG_ARGS')
diff --git a/contrib/prepare-commit-msg b/contrib/prepare-commit-msg
index 48c9dad3c6f..1e94706ba40 100755
--- a/contrib/prepare-commit-msg
+++ b/contrib/prepare-commit-msg
@@ -32,11 +32,11 @@ if ! [ -f "$COMMIT_MSG_FILE" ]; then exit 0; fi
 # Don't do anything unless requested to.
 if [ -z "$GCC_FORCE_MKLOG" ]; then exit 0; fi
 
-if [ -z "$COMMIT_SOURCE" ] || [ $COMMIT_SOURCE = template ]; then
+if [ -z "$COMMIT_SOURCE" ] || [ "$COMMIT_SOURCE" = template ]; then
 # No source or "template" means new commit.
 cmd="diff --cached"
 
-elif [ $COMMIT_SOURCE = message ]; then
+elif [ "$COMMIT_SOURCE" = message ]; then
 # "message" means -m; assume a new commit if there are any changes staged.
 if ! git diff --cached --quiet; then
cmd="diff --cached"
@@ -44,23 +44,23 @@ elif [ $COMMIT_SOURCE = message ]; then
cmd="diff --cached HEAD^"
 fi
 
-elif [ $COMMIT_SOURCE = commit ]; then
+elif [ "$COMMIT_SOURCE" = commit ]; then
 # The message of an existing commit.  If it's HEAD, assume --amend;
 # otherwise, assume a new commit with -C.
-if [ $SHA1 = HEAD ]; then
+if [ "$SHA1" = HEAD ]; then
cmd="diff --cached HEAD^"
if [ "$(git config gcc-config.mklog-hook-type)" = "smart-amend" ]; then
# Check if the existing message still describes the staged changes.
f=$(mktemp /tmp/git-commit.XX) || exit 1
-   git log -1 --pretty=email HEAD > $f
-   printf '\n---\n\n' >> $f
-   git $cmd >> $f
+   git log -1 --pretty=email HEAD > "$f"
+   printf '\n---\n\n' >> "$f"
+   git $cmd >> "$f"
if contrib/gcc-changelog/git_email.py "$f" >/dev/null 2>&1; then
# Existing commit message is still OK for amended commit.
-   rm $f
+   rm "$f"
exit 0
fi
-   rm $f
+   rm "$f"
fi
 else
cmd="diff --cached"
@@ -72,7

Re: [PATCH v3] mklog: handle Signed-off-by, minor cleanup

2023-09-04 Thread Marc via Gcc-patches



Richard Sandiford  writes:

>> +# this regex matches the first line of the "end" in the initial commit message
>> +FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by:|co-authored-by:|#) ')
>
> Personally I think it would be safer to drop the final space in the regexp.
>
> OK with that change if you agree.

Hello,

You're correct. I'll commit the adjusted change.

Thanks,
Marc
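
For reference, a small Python check of what dropping the final space changes.
The two patterns are the v3 and adjusted forms quoted above; the sample lines
are made up for illustration.

```python
import re

# v3 form, with the trailing space that the review suggests dropping.
WITH_SPACE = re.compile('(?i)^(signed-off-by:|co-authored-by:|#) ')
# Adjusted form, without the trailing space.
WITHOUT_SPACE = re.compile('(?i)^(signed-off-by:|co-authored-by:|#)')

samples = [
    'Signed-off-by: My Name',
    'Signed-off-by:\tMy Name',   # tab instead of space after the colon
    '#comment without a space',
    '# comment with a space',
    'Plain text line',
]

for line in samples:
    print(repr(line),
          WITH_SPACE.match(line) is not None,
          WITHOUT_SPACE.match(line) is not None)
```

The two patterns differ only on lines where the marker is not followed by a
space, such as a git comment line written as '#comment'; the adjusted form
treats those as part of the "end" as well.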

Re: [PATCH] Bug 111071: fix the subr with -1 to not due to the simplify.

2023-09-04 Thread Richard Sandiford via Gcc-patches

Richard Sandiford  writes:
> "yanzhang.wang--- via Gcc-patches"  writes:
>> From: Yanzhang Wang 
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/aarch64/sve/acle/asm/subr_s8.c: Modify subr with -1
>> to not.
>>
>> Signed-off-by: Yanzhang Wang 
>> ---
>>
>> Tested on my local arm environment and passed. Thanks to Andrew Pinski's
>> comment; the code is the same as in that comment.
>>
>>  gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c
>> index b9615de6655..1cf6916a5e0 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c
>> @@ -76,8 +76,7 @@ TEST_UNIFORM_Z (subr_1_s8_m_untied, svint8_t,
>>  
>>  /*
>>  ** subr_m1_s8_m:
>> -**  mov (z[0-9]+\.b), #-1
>> -**  subr z0\.b, p0/m, z0\.b, \1
>> +**  not z0.b, p0/m, z0.b
>>  **  ret
>>  */
>>  TEST_UNIFORM_Z (subr_m1_s8_m, svint8_t,
>
> I think we need this for subr_u8.c too.  OK with that change,
> and thanks for the fix!

Actually, never mind.  I just saw a patch from Thiago Jung Bauermann
for the same issue, which is now in trunk.  Sorry for the confusion,
and thanks again for posting the fix.

Richard
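
For readers wondering why the expected output changes from mov #-1 plus subr
to a single not: the reversed subtract with -1 computes (-1) - x per lane, and
in two's complement that is the bitwise complement of x.  A minimal Python
check over all 8-bit lane values, with helper names made up for illustration:

```python
def subr_minus_one(x: int) -> int:
    """(-1) - x on one 8-bit lane, wrapped to 8 bits."""
    return (-1 - x) & 0xFF


def bitwise_not(x: int) -> int:
    """Bitwise complement of one 8-bit lane."""
    return ~x & 0xFF


# Compare the two forms on every possible 8-bit lane value.
differences = [x for x in range(256) if subr_minus_one(x) != bitwise_not(x)]
print(differences)
```

The same identity holds for the unsigned variant, since only the bit pattern
of the lane matters.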

Re: [PATCH] testsuite: aarch64: Adjust SVE ACLE tests to new generated code

2023-09-04 Thread Richard Sandiford via Gcc-patches

Thiago Jung Bauermann via Gcc-patches  writes:
> Since commit e7a36e4715c7 "[PATCH] RISC-V: Support simplify (-1-x) for
> vector." these tests fail on aarch64-linux:
>
>   === g++ tests ===
>
> Running g++:g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ...
> FAIL: gcc.target/aarch64/sve/acle/asm/subr_s8.c -std=gnu++98 -O2 
> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_FULL  
> check-function-bodies subr_m1_s8_m
> FAIL: gcc.target/aarch64/sve/acle/asm/subr_s8.c -std=gnu++98 -O2 
> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_OVERLOADS  
> check-function-bodies subr_m1_s8_m
> FAIL: gcc.target/aarch64/sve/acle/asm/subr_u8.c -std=gnu++98 -O2 
> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_FULL  
> check-function-bodies subr_m1_u8_m
> FAIL: gcc.target/aarch64/sve/acle/asm/subr_u8.c -std=gnu++98 -O2 
> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_OVERLOADS  
> check-function-bodies subr_m1_u8_m
>
>   === gcc tests ===
>
> Running gcc:gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ...
> FAIL: gcc.target/aarch64/sve/acle/asm/subr_s8.c -std=gnu90 -O2 
> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_FULL  
> check-function-bodies subr_m1_s8_m
> FAIL: gcc.target/aarch64/sve/acle/asm/subr_s8.c -std=gnu90 -O2 
> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_OVERLOADS  
> check-function-bodies subr_m1_s8_m
> FAIL: gcc.target/aarch64/sve/acle/asm/subr_u8.c -std=gnu90 -O2 
> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_FULL  
> check-function-bodies subr_m1_u8_m
> FAIL: gcc.target/aarch64/sve/acle/asm/subr_u8.c -std=gnu90 -O2 
> -fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_OVERLOADS  
> check-function-bodies subr_m1_u8_m
>
> Andrew Pinski's analysis in PR testsuite/111071 is that the new code is
> better and the testcase should be updated. I also asked Prathamesh Kulkarni
> in private and he agreed.
>
> Here is the update. With this change, all tests in
> gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp pass on aarch64-linux.
>
> gcc/testsuite/
>   PR testsuite/111071
>   * gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c: Adjust to 
> new code.
>   * gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_u8.c: Likewise.

Thanks, pushed to trunk.  And sorry for the delay.  I somehow
missed this earlier. :(

Richard

> Suggested-by: Andrew Pinski 
> ---
>  gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c | 3 +--
>  gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_u8.c | 3 +--
>  2 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c
> index b9615de6655f..3e521bc9ae32 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c
> @@ -76,8 +76,7 @@ TEST_UNIFORM_Z (subr_1_s8_m_untied, svint8_t,
>  
>  /*
>  ** subr_m1_s8_m:
> -**   mov (z[0-9]+\.b), #-1
> -**   subr z0\.b, p0/m, z0\.b, \1
> +**   not z0\.b, p0/m, z0\.b
>  **   ret
>  */
>  TEST_UNIFORM_Z (subr_m1_s8_m, svint8_t,
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_u8.c
> index 65606b6dda03..4922bdbacc47 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_u8.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_u8.c
> @@ -76,8 +76,7 @@ TEST_UNIFORM_Z (subr_1_u8_m_untied, svuint8_t,
>  
>  /*
>  ** subr_m1_u8_m:
> -**   mov (z[0-9]+\.b), #-1
> -**   subr z0\.b, p0/m, z0\.b, \1
> +**   not z0\.b, p0/m, z0\.b
>  **   ret
>  */
>  TEST_UNIFORM_Z (subr_m1_u8_m, svuint8_t,

[PATCH 2/2] strlen: call handle_builtin_strlen() from fold_strstr_to_strncmp()

2023-09-04 Thread Hamza Mahfooz

Currently, we are not saving the strlen() call we inserted for possible
future common subexpression elimination. Also, it's possible that we can
further fold that strlen() call. So, refactor handle_builtin_strlen()
so that it can be called from fold_strstr_to_strncmp().

gcc/ChangeLog:

* tree-ssa-strlen.cc (strlen_pass::handle_builtin_strlen):
Remove from class and mark as static.
(handle_builtin_strlen): Add parameter
"gimple_stmt_iterator gsi" and replace references of m_gsi
with gsi.
(fold_strstr_to_strncmp): Call handle_builtin_strlen().
(strlen_pass::check_and_optimize_call): Add m_gsi to the
handle_builtin_strlen() call.

gcc/testsuite/ChangeLog:

* gcc.dg/strlenopt-30.c: Add a test.

Signed-off-by: Hamza Mahfooz 
---
Please push this for me if you think it looks good, since I don't have
write access to the repository.
---
 gcc/testsuite/gcc.dg/strlenopt-30.c |  7 +++
 gcc/tree-ssa-strlen.cc  | 23 ---
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/strlenopt-30.c b/gcc/testsuite/gcc.dg/strlenopt-30.c
index 1ee814048c1..de51a66383b 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-30.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-30.c
@@ -44,6 +44,12 @@ _Bool f6(char *s, char *t)
   return __builtin_strstr (s, t) == s;
 }
 
+__attribute__((no_icf))
+_Bool f6plus(char *s, char *t)
+{
+  return __builtin_strstr (s, t) == s && __builtin_strlen(t) > 10;
+}
+
 /* Do not perform transform in this case, since
t1 doesn't have single use.  */
 
@@ -57,4 +63,5 @@ _Bool f7(char *s)
   return (t1 == s);
 }
 
+/* { dg-final { scan-tree-dump-times "__builtin_strlen" 2 "strlen1" } } */
 /* { dg-final { scan-tree-dump-times "__builtin_strncmp" 6 "strlen1" } } */
diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc
index b0ebbb0db62..8ec6ddbc7c0 100644
--- a/gcc/tree-ssa-strlen.cc
+++ b/gcc/tree-ssa-strlen.cc
@@ -252,7 +252,6 @@ public:
   bool handle_assign (tree lhs, bool *zero_write);
   bool handle_store (bool *zero_write);
   void handle_pointer_plus ();
-  void handle_builtin_strlen ();
   void handle_builtin_strchr ();
   void handle_builtin_strcpy (built_in_function);
   void handle_integral_assign (bool *cleanup_eh);
@@ -2211,10 +2210,10 @@ strlen_pass::maybe_warn_overflow (gimple *stmt, bool call_lhs,
the strlen call with the known value, otherwise remember that strlen
of the argument is stored in the lhs SSA_NAME.  */
 
-void
-strlen_pass::handle_builtin_strlen ()
+static void
+handle_builtin_strlen (gimple_stmt_iterator gsi)
 {
-  gimple *stmt = gsi_stmt (m_gsi);
+  gimple *stmt = gsi_stmt (gsi);
   tree lhs = gimple_call_lhs (stmt);
 
   if (lhs == NULL_TREE)
@@ -2268,8 +2267,8 @@ strlen_pass::handle_builtin_strlen ()
  if (bound)
rhs = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (rhs), rhs, bound);
 
- gimplify_and_update_call_from_tree (&m_gsi, rhs);
- stmt = gsi_stmt (m_gsi);
+ gimplify_and_update_call_from_tree (&gsi, rhs);
+ stmt = gsi_stmt (gsi);
  update_stmt (stmt);
  if (dump_file && (dump_flags & TDF_DETAILS) != 0)
{
@@ -2367,8 +2366,8 @@ strlen_pass::handle_builtin_strlen ()
  }
if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (ret)))
  ret = fold_convert_loc (loc, TREE_TYPE (lhs), ret);
-   gimplify_and_update_call_from_tree (&m_gsi, ret);
-   stmt = gsi_stmt (m_gsi);
+   gimplify_and_update_call_from_tree (&gsi, ret);
+   stmt = gsi_stmt (gsi);
update_stmt (stmt);
if (dump_file && (dump_flags & TDF_DETAILS) != 0)
  {
@@ -5272,8 +5271,9 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
{
  tree arg1 = gimple_call_arg (call_stmt, 1);
  tree arg1_len = NULL_TREE;
- int idx = get_stridx (arg1, call_stmt);
  gimple_stmt_iterator gsi = gsi_for_stmt (call_stmt);
+again:
+ int idx = get_stridx (arg1, call_stmt);
 
  if (idx)
{
@@ -5296,7 +5296,8 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
  gimple_call_set_lhs (strlen_call, strlen_lhs);
  gimple_set_vuse (strlen_call, gimple_vuse (call_stmt));
   gsi_insert_before (&gsi, strlen_call, GSI_SAME_STMT);
- arg1_len = strlen_lhs;
+ handle_builtin_strlen (gsi_for_stmt (strlen_call));
+ goto again;
}
  else if (!is_gimple_val (arg1_len))
{
@@ -5393,7 +5394,7 @@ strlen_pass::check_and_optimize_call (bool *zero_write)
 {
 case BUILT_IN_STRLEN:
 case BUILT_IN_STRNLEN:
-  handle_builtin_strlen ();
+  handle_builtin_strlen (m_gsi);
   break;
 case BUILT_IN_STRCHR:
   handle_builtin_strchr ();
-- 
2.42.0

[PATCH 1/2] strlen: fold strstr() even if the length isn't previously known [PR96601]

2023-09-04 Thread Hamza Mahfooz

Currently, we give up in fold_strstr_to_strncmp() if the length of the
second argument to strstr() isn't known to us by the time we hit
that function. However, we can instead insert a strlen() in ourselves
and continue trying to fold strstr() into strlen()+strncmp().

PR tree-optimization/96601

gcc/ChangeLog:

* tree-ssa-strlen.cc (fold_strstr_to_strncmp): If arg1_len == NULL,
insert a strlen() for strstr()'s arg1 and use it as arg1_len.

gcc/testsuite/ChangeLog:

* gcc.dg/strlenopt-30.c: Modify test.

Signed-off-by: Hamza Mahfooz 
---
Please push this for me if you think it looks good, since I don't have
write access to the repository.
---
 gcc/testsuite/gcc.dg/strlenopt-30.c |  5 +-
 gcc/tree-ssa-strlen.cc  | 81 -
 2 files changed, 45 insertions(+), 41 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/strlenopt-30.c b/gcc/testsuite/gcc.dg/strlenopt-30.c
index 2a3098ba96f..1ee814048c1 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-30.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-30.c
@@ -38,9 +38,6 @@ void f5(char *s)
 foo_f5();
 }
 
-/* Do not perform transform, since strlen (t)
-   is unknown.  */
-
 __attribute__((no_icf))
 _Bool f6(char *s, char *t)
 {
@@ -60,4 +57,4 @@ _Bool f7(char *s)
   return (t1 == s);
 }
 
-/* { dg-final { scan-tree-dump-times "__builtin_strncmp" 5 "strlen1" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_strncmp" 6 "strlen1" } } */
diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc
index 8b7ef919826..b0ebbb0db62 100644
--- a/gcc/tree-ssa-strlen.cc
+++ b/gcc/tree-ssa-strlen.cc
@@ -5273,6 +5273,7 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
  tree arg1 = gimple_call_arg (call_stmt, 1);
  tree arg1_len = NULL_TREE;
  int idx = get_stridx (arg1, call_stmt);
+ gimple_stmt_iterator gsi = gsi_for_stmt (call_stmt);
 
  if (idx)
{
@@ -5286,51 +5287,57 @@ fold_strstr_to_strncmp (tree rhs1, tree rhs2, gimple *stmt)
}
}
 
- if (arg1_len != NULL_TREE)
+ if (arg1_len == NULL_TREE)
{
- gimple_stmt_iterator gsi = gsi_for_stmt (call_stmt);
- tree strncmp_decl = builtin_decl_explicit (BUILT_IN_STRNCMP);
+ tree strlen_decl = builtin_decl_explicit (BUILT_IN_STRLEN);
+ gcall *strlen_call = gimple_build_call (strlen_decl, 1, arg1);
+ tree strlen_lhs = make_ssa_name (size_type_node, strlen_call);
+
+ gimple_call_set_lhs (strlen_call, strlen_lhs);
+ gimple_set_vuse (strlen_call, gimple_vuse (call_stmt));
+ gsi_insert_before (&gsi, strlen_call, GSI_SAME_STMT);
+ arg1_len = strlen_lhs;
+   }
+ else if (!is_gimple_val (arg1_len))
+   {
+ tree arg1_len_tmp = make_ssa_name (TREE_TYPE (arg1_len));
+ gassign *arg1_stmt = gimple_build_assign (arg1_len_tmp,
+   arg1_len);
+ gsi_insert_before (&gsi, arg1_stmt, GSI_SAME_STMT);
+ arg1_len = arg1_len_tmp;
+   }
 
- if (!is_gimple_val (arg1_len))
+ tree strncmp_decl = builtin_decl_explicit (BUILT_IN_STRNCMP);
+ gcall *strncmp_call = gimple_build_call (strncmp_decl, 3,
+  arg0, arg1, arg1_len);
+ tree strncmp_lhs = make_ssa_name (integer_type_node);
+ gimple_set_vuse (strncmp_call, gimple_vuse (call_stmt));
+ gimple_call_set_lhs (strncmp_call, strncmp_lhs);
+ gsi_remove (&gsi, true);
+ gsi_insert_before (&gsi, strncmp_call, GSI_SAME_STMT);
+ tree zero = build_zero_cst (TREE_TYPE (strncmp_lhs));
+
+ if (is_gimple_assign (stmt))
+   {
+ if (gimple_assign_rhs_code (stmt) == COND_EXPR)
{
- tree arg1_len_tmp = make_ssa_name (TREE_TYPE (arg1_len));
- gassign *arg1_stmt = gimple_build_assign (arg1_len_tmp,
-   arg1_len);
- gsi_insert_before (&gsi, arg1_stmt, GSI_SAME_STMT);
- arg1_len = arg1_len_tmp;
-   }
-
- gcall *strncmp_call = gimple_build_call (strncmp_decl, 3,
- arg0, arg1, arg1_len);
- tree strncmp_lhs = make_ssa_name (integer_type_node);
- gimple_set_vuse (strncmp_call, gimple_vuse (call_stmt));
- gimple_call_set_lhs (strncmp_call, strncmp_lhs);
- gsi_remove (&gsi, true);
- gsi_insert_before (&gsi, strncmp_call, GSI_SAME_STMT);
- tree zero = build_zero_cst (TREE_TYPE (strncmp_lhs));
-
- if (is_gimple_assign (stmt))
-   {
- if (gimple_assign_rhs_code (stmt) == COND_EXPR)
-   {
- tree cond = gimple_assign_rhs1 (stmt);
-

Re: [PATCH v3] mklog: handle Signed-off-by, minor cleanup

2023-09-04 Thread Richard Sandiford via Gcc-patches

Marc Poulhiès via Gcc-patches  writes:
> Richard Sandiford via Gcc-patches  writes:
>>> +# this regex matches the first line of the "end" in the initial commit 
>>> message
>>> +FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by|co-authored-by|#): 
>>> ')
>>
>> The current code only requires an initial "#", rather than an initial "#: ".
>> Is that a deliberate change?
>>
>> The patch LGTM apart from that.
>
> Hello Richard,
>
> Thanks for the review and sorry for the delayed answer as I was away the
> past weeks. This issue was caught early this month
> (https://github.com/Rust-GCC/gccrs/pull/2504), but I didn't want to send
> something here before leaving. Here's a fixed patch.
>
> Ok for master?
>
> Thanks,
> Marc
>
> ---
>  contrib/mklog.py   | 34 +-
>  contrib/prepare-commit-msg | 20 ++--
>  2 files changed, 39 insertions(+), 15 deletions(-)
>
> diff --git a/contrib/mklog.py b/contrib/mklog.py
> index 26230b9b4f2..496780883fb 100755
> --- a/contrib/mklog.py
> +++ b/contrib/mklog.py
> @@ -41,7 +41,34 @@ from unidiff import PatchSet
>  
>  LINE_LIMIT = 100
>  TAB_WIDTH = 8
> -CO_AUTHORED_BY_PREFIX = 'co-authored-by: '
> +
> +# Initial commit:
> +#   +------------------------------------------+
> +#   | gccrs: Some title                        |
> +#   |                                          | This is the "start"
> +#   | This is some text explaining the commit. |
> +#   | There can be several lines.              |
> +#   |                                          |<--->
> +#   | Signed-off-by: My Name                   | This is the "end"
> +#   +------------------------------------------+
> +#
> +# Results in:
> +#   +------------------------------------------+
> +#   | gccrs: Some title                        |
> +#   |                                          |
> +#   | This is some text explaining the commit. | This is the "start"
> +#   | There can be several lines.              |
> +#   |                                          |<--->
> +#   | gcc/rust/ChangeLog:                      |
> +#   |                                          | This is the generated
> +#   | * some_file (bla):                       | ChangeLog part
> +#   | (foo):                                   |
> +#   |                                          |<--->
> +#   | Signed-off-by: My Name                   | This is the "end"
> +#   +------------------------------------------+
> +
> +# this regex matches the first line of the "end" in the initial commit 
> message
> +FIRST_LINE_OF_END_RE = re.compile('(?i)^(signed-off-by:|co-authored-by:|#) ')

Personally I think it would be safer to drop the final space in the regexp.

OK with that change if you agree.
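
To make the difference concrete, here is a rough sketch of the two
variants. It is not mklog.py itself: it uses std::regex rather than
Python's re module, std::regex::icase stands in for the (?i) flag, and
the function names are invented for the illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Variant with the trailing space: the prefix must be followed by ' '.
bool matches_with_space (const std::string &line)
{
  static const std::regex re ("^(signed-off-by:|co-authored-by:|#) ",
                              std::regex::ECMAScript | std::regex::icase);
  return std::regex_search (line, re);
}

// Variant without the trailing space: the prefix alone is enough.
bool matches_without_space (const std::string &line)
{
  static const std::regex re ("^(signed-off-by:|co-authored-by:|#)",
                              std::regex::ECMAScript | std::regex::icase);
  return std::regex_search (line, re);
}

// A comment line with no space after '#' is only caught by the
// second variant; ordinary ChangeLog lines are caught by neither.
void demo ()
{
  assert (matches_with_space ("Signed-off-by: My Name"));
  assert (!matches_with_space ("#comment"));
  assert (matches_without_space ("#comment"));
  assert (!matches_without_space ("gcc/ChangeLog:"));
}
```

Without the space, a line such as "#comment" is still treated as the
start of the "end" part, which is what the old startswith('#') check did.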

Thanks,
Richard

>  
>  pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
>  prnum_regex = re.compile(r'PR (?P<comp>[a-z+-]+)/(?P<num>[0-9]+)')
> @@ -330,10 +357,7 @@ def update_copyright(data):
>  
>  
>  def skip_line_in_changelog(line):
> -if line.lower().startswith(CO_AUTHORED_BY_PREFIX) or 
> line.startswith('#'):
> -return False
> -return True
> -
> +return FIRST_LINE_OF_END_RE.match(line) == None
>  
>  if __name__ == '__main__':
>  extra_args = os.getenv('GCC_MKLOG_ARGS')
> diff --git a/contrib/prepare-commit-msg b/contrib/prepare-commit-msg
> index 48c9dad3c6f..1e94706ba40 100755
> --- a/contrib/prepare-commit-msg
> +++ b/contrib/prepare-commit-msg
> @@ -32,11 +32,11 @@ if ! [ -f "$COMMIT_MSG_FILE" ]; then exit 0; fi
>  # Don't do anything unless requested to.
>  if [ -z "$GCC_FORCE_MKLOG" ]; then exit 0; fi
>  
> -if [ -z "$COMMIT_SOURCE" ] || [ $COMMIT_SOURCE = template ]; then
> +if [ -z "$COMMIT_SOURCE" ] || [ "$COMMIT_SOURCE" = template ]; then
>  # No source or "template" means new commit.
>  cmd="diff --cached"
>  
> -elif [ $COMMIT_SOURCE = message ]; then
> +elif [ "$COMMIT_SOURCE" = message ]; then
>  # "message" means -m; assume a new commit if there are any changes 
> staged.
>  if ! git diff --cached --quiet; then
>   cmd="diff --cached"
> @@ -44,23 +44,23 @@ elif [ $COMMIT_SOURCE = message ]; then
>   cmd="diff --cached HEAD^"
>  fi
>  
> -elif [ $COMMIT_SOURCE = commit ]; then
> +elif [ "$COMMIT_SOURCE" = commit ]; then
>  # The message of an existing commit.  If it's HEAD, assume --amend;
>  # otherwise, assume a new commit with -C.
> -if [ $SHA1 = HEAD ]; then
> +if [ "$SHA1" = HEAD ]; then
>   cmd="diff --cached HEAD^"
>   if [ "$(git config gcc-config.mklog-hook-type)" = "smart-amend" ]; then
>   # Check if the existing message still describes the staged changes.
>   f=$(mktemp /tmp/git-commit.XX) || exit 1
> - git log -1 --pretty=email HEAD > $f
> - printf '\n---\n\n' >>

[PATCH] c++: Additional warning for name-hiding [PR12341]

2023-09-04 Thread Benjamin Priour via Gcc-patches

From: benjamin priour 

Hi,

This patch was the first I wrote and had been
at that time returned to me because it was ill-formatted.

Getting busy with other things, I forgot about it.
I've now fixed the formatting.

Successfully regstrapped on x86_64-linux-gnu off trunk
a7d052b3200c7928d903a0242b8cfd75d131e374.
Is it OK for trunk?

Thanks,
Benjamin.

Patch below.
---

Add a new warning for name-hiding. When a class's field
has the same name as an inherited one, a warning should
be issued.
This new warning is controlled by the existing -Wshadow.
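
As an illustration of the situation being described, here is a minimal
sketch of a derived class whose field hides an inherited one. The type
and function names are invented for the example; whether this exact case
triggers the new diagnostic depends on the patch as finally applied.

```cpp
#include <cassert>

struct Base
{
  int value = 1;
};

struct Derived : Base
{
  int value = 2;   // same name as Base::value, so it hides the inherited field
};

// Unqualified access finds Derived::value.
int read_unqualified (const Derived &d) { return d.value; }

// The inherited field is still there, but needs qualification.
int read_base (const Derived &d) { return d.Base::value; }

void demo ()
{
  Derived d;
  assert (read_unqualified (d) == 2);
  assert (read_base (d) == 1);
}
```

Both fields exist in every Derived object; the hiding only affects which
one an unqualified name refers to, which is why it is easy to miss.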

gcc/cp/ChangeLog:

PR c++/12341
* search.cc (lookup_member):
New optional parameter to preempt processing the
inheritance tree deeper than necessary.
(lookup_field): Likewise.
(dfs_walk_all): Likewise.
* cp-tree.h: Update the above declarations.
* class.cc (warn_name_hiding): New function.
(finish_struct_1): Call warn_name_hiding if -Wshadow.

gcc/testsuite/ChangeLog:

PR c++/12341
* g++.dg/pr12341-1.C: New file.
* g++.dg/pr12341-2.C: New file.

Signed-off-by: benjamin priour 
---
 gcc/cp/class.cc  | 75 
 gcc/cp/cp-tree.h |  9 ++--
 gcc/cp/search.cc | 28 
 gcc/testsuite/g++.dg/pr12341-1.C | 65 +++
 gcc/testsuite/g++.dg/pr12341-2.C | 34 +++
 5 files changed, 200 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr12341-1.C
 create mode 100644 gcc/testsuite/g++.dg/pr12341-2.C

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index 778759237dc..b1c59c392a0 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -3080,6 +3080,79 @@ warn_hidden (tree t)
   }
 }
 
+/* Warn about non-static fields name hiding.  */
+
+static void
+warn_name_hiding (tree t)
+{
+  if (is_empty_class (t) || CLASSTYPE_NEARLY_EMPTY_P (t))
+return;
+
+  for (tree field = TYPE_FIELDS (t); field; field = DECL_CHAIN (field))
+{
+  /* Skip if field is not an user-defined non-static data member.  */
+  if (TREE_CODE (field) != FIELD_DECL || DECL_ARTIFICIAL (field))
+   continue;
+
+  unsigned j;
+  tree name = DECL_NAME (field);
+  /* Skip if field is anonymous.  */
+  if (!name || !identifier_p (name))
+   continue;
+
+  auto_vec<tree> base_vardecls;
+  tree binfo;
+  tree base_binfo;
+  /* Iterate through all of the base classes looking for possibly
+shadowed non-static data members.  */
+  for (binfo = TYPE_BINFO (t), j = 0;
+  BINFO_BASE_ITERATE (binfo, j, base_binfo); j++)
+   {
+ tree basetype = BINFO_TYPE (base_binfo);
+ tree candidate = lookup_field (basetype, name,
+/* protect */ 2,
+/* want_type */ 0,
+/* once_suffices */ true);
+ if (candidate)
+   {
+ /* If we went up the hierarchy to a base class with multiple
+inheritance, there could be multiple matches in which case
+a TREE_LIST is returned.  */
+ if (TREE_TYPE (candidate) == error_mark_node)
+   {
+ for (; candidate; candidate = TREE_CHAIN (candidate))
+   {
+ tree candidate_field = TREE_VALUE (candidate);
+ tree candidate_klass = DECL_CONTEXT (candidate_field);
+ if (accessible_base_p (t, candidate_klass, true))
+   base_vardecls.safe_push (candidate_field);
+   }
+   }
+ else if (accessible_base_p (t, DECL_CONTEXT (candidate), true))
+   base_vardecls.safe_push (candidate);
+   }
+   }
+
+  /* Field was not found among the base classes.  */
+  if (base_vardecls.is_empty ())
+   continue;
+
+  /* Emit a warning for each field similarly named found
+in the base class hierarchy.  */
+  for (tree base_vardecl : base_vardecls)
+   {
+ if (base_vardecl)
+   {
+ auto_diagnostic_group d;
+ if (warning_at (location_of (field), OPT_Wshadow,
+ "%qD might shadow %qD", field, base_vardecl))
+   inform (location_of (base_vardecl),
+   "  %qD name already in use here", base_vardecl);
+   }
+   }
+}
+}
+
 /* Recursive helper for finish_struct_anon.  */
 
 static void
@@ -7654,6 +7727,8 @@ finish_struct_1 (tree t)
 
   if (warn_overloaded_virtual)
 warn_hidden (t);
+  if (warn_shadow)
+warn_name_hiding (t);
 
   /* Class layout, assignment of virtual table slots, etc., is now
  complete.  Give the back end a chance to tweak the visibility of
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 3ca011c61c8..890326f0fd8 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7554,11 +7554,13 @@

Re: [PATCH] testsuite: Remove unwanted 'dg-do run' from gcc.dg/vect tests

2023-09-04 Thread Richard Sandiford via Gcc-patches

Christophe Lyon via Gcc-patches  writes:
> Tests under gcc.dg/vect use check_vect_support_and_set_flags to set
> compilation flags as appropriate for the target, but they also set
> dg-do-what-default to 'run' or 'compile', depending on the actual
> target hardware (or simulator) capabilities.
>
> For instance on arm, we use options to enable Neon, but set
> dg-do-what-default to 'run' only if we can actually execute Neon
> instructions.
>
> Therefore, we would always try to link and execute tests containing
> 'dg-do run', although dg-do-what-default says otherwise, leading to
> uninteresting failures.
>
> Therefore, this patch removes all such unconditional 'dg-do run',
> thus avoiding link errors for instance if GCC has been configured with
> multilibs disabled and some --with-{float|cpu|hard} option
> incompatible with what check_vect_support_and_set_flags selects.
>
> For example, GCC configured with:
> --disable-multilib --with-mode=thumb --with-cpu=cortex-m7 --with-float=hard
> and check_vect_support_and_set_flags uses
> -mfpu=neon -mfloat-abi=softfp -march=armv7-a
> (thus incompatible float-abi options)
>
> Tested on native aarch64-linux-gnu (no change) and several arm-eabi
> cases where the FAIL/UNRESOLVED disappear (and we keep only the
> 'compilation' tests).
>
> 2023-09-04  Christophe Lyon  
>
>   gcc/testsuite/
>   * gcc.dg/vect/bb-slp-44.c: Remove 'dg-do run'.
>   * gcc.dg/vect/bb-slp-71.c: Likewise.
>   * gcc.dg/vect/bb-slp-72.c: Likewise.
>   * gcc.dg/vect/bb-slp-73.c: Likewise.
>   * gcc.dg/vect/bb-slp-74.c: Likewise.
>   * gcc.dg/vect/bb-slp-pr101207.c: Likewise.
>   * gcc.dg/vect/bb-slp-pr101615-1.c: Likewise.
>   * gcc.dg/vect/bb-slp-pr101615-2.c: Likewise.
>   * gcc.dg/vect/bb-slp-pr101668.c: Likewise.
>   * gcc.dg/vect/bb-slp-pr54400.c: Likewise.
>   * gcc.dg/vect/bb-slp-pr98516-1.c: Likewise.
>   * gcc.dg/vect/bb-slp-pr98516-2.c: Likewise.
>   * gcc.dg/vect/bb-slp-pr98544.c: Likewise.
>   * gcc.dg/vect/pr101445.c: Likewise.
>   * gcc.dg/vect/pr105219.c: Likewise.
>   * gcc.dg/vect/pr107160.c: Likewise.
>   * gcc.dg/vect/pr107212-1.c: Likewise.
>   * gcc.dg/vect/pr107212-2.c: Likewise.
>   * gcc.dg/vect/pr109502.c: Likewise.
>   * gcc.dg/vect/pr110381.c: Likewise.
>   * gcc.dg/vect/pr110838.c: Likewise.
>   * gcc.dg/vect/pr88497-1.c: Likewise.
>   * gcc.dg/vect/pr88497-7.c: Likewise.
>   * gcc.dg/vect/pr96783-1.c: Likewise.
>   * gcc.dg/vect/pr96783-2.c: Likewise.
>   * gcc.dg/vect/pr97558-2.c: Likewise.
>   * gcc.dg/vect/pr99253.c: Likewise.
>   * gcc.dg/vect/slp-mask-store-1.c: Likewise.
>   * gcc.dg/vect/vect-bic-bitmask-10.c: Likewise.
>   * gcc.dg/vect/vect-bic-bitmask-11.c: Likewise.
>   * gcc.dg/vect/vect-bic-bitmask-2.c: Likewise.
>   * gcc.dg/vect/vect-bic-bitmask-3.c: Likewise.
>   * gcc.dg/vect/vect-bic-bitmask-4.c: Likewise.
>   * gcc.dg/vect/vect-bic-bitmask-5.c: Likewise.
>   * gcc.dg/vect/vect-bic-bitmask-6.c: Likewise.
>   * gcc.dg/vect/vect-bic-bitmask-8.c: Likewise.
>   * gcc.dg/vect/vect-bic-bitmask-9.c: Likewise.
>   * gcc.dg/vect/vect-cond-13.c: Likewise.
>   * gcc.dg/vect/vect-recurr-1.c: Likewise.
>   * gcc.dg/vect/vect-recurr-2.c: Likewise.
>   * gcc.dg/vect/vect-recurr-3.c: Likewise.
>   * gcc.dg/vect/vect-recurr-4.c: Likewise.
>   * gcc.dg/vect/vect-recurr-5.c: Likewise.
>   * gcc.dg/vect/vect-recurr-6.c: Likewise.

OK, thanks.

Richard

> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-44.c   | 2 --
>  gcc/testsuite/gcc.dg/vect/bb-slp-71.c   | 2 --
>  gcc/testsuite/gcc.dg/vect/bb-slp-72.c   | 2 --
>  gcc/testsuite/gcc.dg/vect/bb-slp-73.c   | 2 --
>  gcc/testsuite/gcc.dg/vect/bb-slp-74.c   | 1 -
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr101207.c | 1 -
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-1.c   | 1 -
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-2.c   | 1 -
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr101668.c | 1 -
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c  | 1 -
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-1.c| 2 --
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-2.c| 2 --
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr98544.c  | 2 --
>  gcc/testsuite/gcc.dg/vect/pr101445.c| 2 --
>  gcc/testsuite/gcc.dg/vect/pr105219.c| 1 -
>  gcc/testsuite/gcc.dg/vect/pr107160.c| 2 --
>  gcc/testsuite/gcc.dg/vect/pr107212-1.c  | 2 --
>  gcc/testsuite/gcc.dg/vect/pr107212-2.c  | 2 --
>  gcc/testsuite/gcc.dg/vect/pr109502.c| 1 -
>  gcc/testsuite/gcc.dg/vect/pr110381.c| 1 -
>  gcc/testsuite/gcc.dg/vect/pr110838.c| 2 --
>  gcc/testsuite/gcc.dg/vect/pr88497-1.c   | 1 -
>  gcc/testsuite/gcc.dg/vect/pr88497-7.c   | 1 -
>  gcc/testsuite/gcc.dg/vect/pr96783-1.c   | 2 --
>  gcc/testsuite/gcc.dg/vect/pr96783-2.c   | 2 --
>

Re: [RFC] libstdc++: Make --enable-libstdcxx-backtrace=auto default to yes

2023-09-04 Thread Jonathan Wakely via Gcc-patches

On Mon, 4 Sept 2023 at 17:47, Hans-Peter Nilsson via Libstdc++
 wrote:
>
> > Date: Fri, 1 Sep 2023 12:16:40 +0100
> > Reply-To: Jonathan Wakely 
> >
> > On Wed, 23 Aug 2023 at 17:03, Jonathan Wakely via Libstdc++
> >  wrote:
> > >
> > > Any objections to this? It's a C++23 feature, so should be enabled by
> > > default.
> >
> > I've pushed this to trunk, so let's see what breaks!
> >
> >
> > >
> > > -- >8 --
> > >
> > > This causes libstdc++_libbacktrace.a to be built by default. This might
> > > fail on some targets, in which case we can make the 'auto' choice expand
> > > to either 'yes' or 'no' depending on the target.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > > * acinclude.m4 (GLIBCXX_ENABLE_BACKTRACE): Default to yes.
> > > * configure: Regenerate.
>
> Incidentally, should check_effective_target_stacktrace in
> libstdc++.exp also be adjusted to match; removing the
> _GLIBCXX_HOSTED condition?

No, it should still depend on is_hosted. The acinclude.m4 macro should
check that.

Re: [RFC] libstdc++: Make --enable-libstdcxx-backtrace=auto default to yes

2023-09-04 Thread Hans-Peter Nilsson via Gcc-patches

> Date: Fri, 1 Sep 2023 12:16:40 +0100
> Reply-To: Jonathan Wakely 
> 
> On Wed, 23 Aug 2023 at 17:03, Jonathan Wakely via Libstdc++
>  wrote:
> >
> > Any objections to this? It's a C++23 feature, so should be enabled by
> > default.
> 
> I've pushed this to trunk, so let's see what breaks!
> 
> 
> >
> > -- >8 --
> >
> > This causes libstdc++_libbacktrace.a to be built by default. This might
> > fail on some targets, in which case we can make the 'auto' choice expand
> > to either 'yes' or 'no' depending on the target.
> >
> > libstdc++-v3/ChangeLog:
> >
> > * acinclude.m4 (GLIBCXX_ENABLE_BACKTRACE): Default to yes.
> > * configure: Regenerate.

Incidentally, should check_effective_target_stacktrace in
libstdc++.exp also be adjusted to match; removing the
_GLIBCXX_HOSTED condition?

brgds, H-P

Re: [RFC] libstdc++: Make --enable-libstdcxx-backtrace=auto default to yes

2023-09-04 Thread Hans-Peter Nilsson via Gcc-patches

I was about to enter a PR for the regression, but as you're
already aware, I'll wait 24 hours to see if this magically
goes away. :]

> On Fri, 1 Sept 2023 at 12:16, Jonathan Wakely  wrote:
> >
> > On Wed, 23 Aug 2023 at 17:03, Jonathan Wakely via Libstdc++
> >  wrote:
> > >
> > > Any objections to this? It's a C++23 feature, so should be enabled by
> > > default.
> >
> > I've pushed this to trunk, so let's see what breaks!
> 
> This modules header broke on aarch64, of course:
> FAIL: g++.dg/modules/xtreme-header_b.C -std=c++2b (test for excess errors)

And others, according to testresults@ including
powerpc64le-unknown-linux-gnu, x86_64-pc-linux-gnu,
s390x-ibm-linux-gnu, m68k-unknown-linux-gnu,
pru-unknown-elf, and...cris-elf (notably, both "64-bit" and
"32-bit" configurations).

Not sure how much information you have, but for cris-elf,
g++.log shows:

FAIL: g++.dg/modules/xtreme-header_b.C -std=c++2b (test for excess errors)
Excess errors:
/obj/libstdc++-v3/include/stacktrace:202:24: error: mangling of 'constexpr 
std::stacktrace_entry::_M_get_info(std::string*, std::string*, int*) 
constoperator 
void (*)(void*, std::stacktrace_entry::uintptr_t, const char*, 
std::stacktrace_entry::uintptr_t, std::stacktrace_entry::uintptr_t)() const' as 
'_ZZNKSt16stacktrace_entry11_M_get_infoEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_PiENKUlPvmPKcmmE_cvPFvS8_mSA_mmEEv'
 conflicts with a previous mangle
/obj/libstdc++-v3/include/stacktrace:202:24: error: mangling of 'static 
constexpr void std::stacktrace_entry::_M_get_info(std::string*, std::string*, 
int*) const_FUN(void*, 
std::stacktrace_entry::uintptr_t, const char*, 
std::stacktrace_entry::uintptr_t, std::stacktrace_entry::uintptr_t)' as 
'_ZZNKSt16stacktrace_entry11_M_get_infoEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_PiENUlPvmPKcmmE_4_FUNES8_mSA_mm'
 conflicts with a previous mangle
/obj/libstdc++-v3/include/stacktrace:202:24: error: mangling of 
'std::stacktrace_entry::_M_get_info(std::string*, std::string*, int*) 
const::' as 
'_ZZNKSt16stacktrace_entry11_M_get_infoEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_PiENKUlPvmPKcmmE_clES8_mSA_mm'
 conflicts with a previous mangle

So, I *guess* it's some kind of pre-existing mangling foulup
with C++20 in the backtrace-support that just happens to be
ticked off by the module testsuite.  But you probably
already knew that.

brgds, H-P

[committed] libstdc++: Remove unnecessary dg-options and outdated comment

2023-09-04 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux. Pushed to trunk.

-- >8 --

It's no longer true that 1.0if has type float _Complex when GNU
extensions are enabled, so remove the hardcoded -std option.
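
A quick way to see which meaning a given compiler picks is sketched
below. This is only an illustration, not the contents of types.cc; it
assumes C++14 or later, and the function name is made up for the example.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Reports whether the literal 1.0if names a std::complex<float> here,
// i.e. whether the C++14 literal operator was chosen rather than the
// GNU imaginary-constant extension.
bool literal_is_std_complex ()
{
  using namespace std::complex_literals;
  auto z = 1.0if;
  return std::is_same<decltype (z), std::complex<float>>::value;
}

void demo ()
{
  assert (literal_is_std_complex ());
}
```

On a compiler where the extension still takes precedence in GNU modes,
the function would return false instead, which is the case the removed
dg-options line used to guard against.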

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/complex/literals/types.cc: Remove
dg-options and add target selector instead.
---
 libstdc++-v3/testsuite/26_numerics/complex/literals/types.cc | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/libstdc++-v3/testsuite/26_numerics/complex/literals/types.cc 
b/libstdc++-v3/testsuite/26_numerics/complex/literals/types.cc
index 5cc46d741ef..69c6781d739 100644
--- a/libstdc++-v3/testsuite/26_numerics/complex/literals/types.cc
+++ b/libstdc++-v3/testsuite/26_numerics/complex/literals/types.cc
@@ -1,7 +1,4 @@
-// Use -std=c++14 explicitly, because -std=gnu++14 enables GNU extension for
-// complex literals, so 1.0if is __complex__ float not std::complex.
-// { dg-options "-std=c++14" }
-// { dg-do compile }
+// { dg-do compile { target c++14 } }
 
 // Copyright (C) 2013-2023 Free Software Foundation, Inc.
 //
-- 
2.41.0

[committed] libstdc++: Enable std::auto_ptr tests for C++11 and later

2023-09-04 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux. Pushed to trunk.

-- >8 --

There is no reason to only test std::auto_ptr with -std=c++03, we just
need to handle the deprecated warnings for C++11 and later.

libstdc++-v3/ChangeLog:

* testsuite/20_util/auto_ptr/1.cc: Remove dg-options -std=c++03
and add dg-warning for deprecation warnings.
* testsuite/20_util/auto_ptr/2.cc: Likewise.
* testsuite/20_util/auto_ptr/3.cc: Likewise.
* testsuite/20_util/auto_ptr/3946.cc: Likewise.
* testsuite/20_util/auto_ptr/4.cc: Likewise.
* testsuite/20_util/auto_ptr/5.cc: Likewise.
* testsuite/20_util/auto_ptr/6.cc: Likewise.
* testsuite/20_util/auto_ptr/7.cc: Likewise.
* testsuite/20_util/auto_ptr/assign_neg.cc: Likewise.
* testsuite/20_util/auto_ptr/requirements/explicit_instantiation/1.cc:
Likewise.
* testsuite/tr1/2_general_utilities/shared_ptr/assign/auto_ptr.cc:
Likewise.
* testsuite/tr1/2_general_utilities/shared_ptr/assign/auto_ptr_neg.cc:
Likewise.
* 
testsuite/tr1/2_general_utilities/shared_ptr/assign/auto_ptr_rvalue_neg.cc:
Likewise.
* testsuite/tr1/2_general_utilities/shared_ptr/cons/43820_neg.cc:
Likewise.
* testsuite/tr1/2_general_utilities/shared_ptr/cons/auto_ptr.cc:
Likewise.
* testsuite/tr1/2_general_utilities/shared_ptr/cons/auto_ptr_neg.cc:
Likewise.
---
 libstdc++-v3/testsuite/20_util/auto_ptr/1.cc   | 10 +-
 libstdc++-v3/testsuite/20_util/auto_ptr/2.cc   |  5 +++--
 libstdc++-v3/testsuite/20_util/auto_ptr/3.cc   |  5 +++--
 libstdc++-v3/testsuite/20_util/auto_ptr/3946.cc|  5 +++--
 libstdc++-v3/testsuite/20_util/auto_ptr/4.cc   |  5 +++--
 libstdc++-v3/testsuite/20_util/auto_ptr/5.cc   |  5 +++--
 libstdc++-v3/testsuite/20_util/auto_ptr/6.cc   |  5 +++--
 libstdc++-v3/testsuite/20_util/auto_ptr/7.cc   |  5 +++--
 libstdc++-v3/testsuite/20_util/auto_ptr/assign_neg.cc  |  6 +++---
 .../auto_ptr/requirements/explicit_instantiation/1.cc  |  6 +++---
 .../2_general_utilities/shared_ptr/assign/auto_ptr.cc  |  3 ++-
 .../shared_ptr/assign/auto_ptr_neg.cc  |  3 ++-
 .../shared_ptr/assign/auto_ptr_rvalue_neg.cc   |  3 ++-
 .../2_general_utilities/shared_ptr/cons/43820_neg.cc   |  3 ++-
 .../2_general_utilities/shared_ptr/cons/auto_ptr.cc|  3 ++-
 .../shared_ptr/cons/auto_ptr_neg.cc|  3 ++-
 16 files changed, 44 insertions(+), 31 deletions(-)

diff --git a/libstdc++-v3/testsuite/20_util/auto_ptr/1.cc 
b/libstdc++-v3/testsuite/20_util/auto_ptr/1.cc
index b498711da76..64c0cf97e2f 100644
--- a/libstdc++-v3/testsuite/20_util/auto_ptr/1.cc
+++ b/libstdc++-v3/testsuite/20_util/auto_ptr/1.cc
@@ -15,9 +15,9 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// 20.4.5 Template class auto_ptr [lib.auto.ptr]
+// C++03 20.4.5 Template class auto_ptr [lib.auto.ptr]
 
-// { dg-options "-std=c++98" }
+// { dg-add-options using-deprecated }
 
 #include 
 #include 
@@ -63,21 +63,21 @@ test01()
 {
   reset_count_struct __attribute__((unused)) reset;
 
-  std::auto_ptr A_default;
+  std::auto_ptr A_default; // { dg-warning "deprecated" "" { target c++11 } 
}
   VERIFY( A_default.get() == 0 );
   VERIFY( A::ctor_count == 0 );
   VERIFY( A::dtor_count == 0 );
   VERIFY( B::ctor_count == 0 );
   VERIFY( B::dtor_count == 0 );
 
-  std::auto_ptr A_from_A(new A);
+  std::auto_ptr A_from_A(new A); // { dg-warning "deprecated" "" { target 
c++11 } }
   VERIFY( A_from_A.get() != 0 );
   VERIFY( A::ctor_count == 1 );
   VERIFY( A::dtor_count == 0 );
   VERIFY( B::ctor_count == 0 );
   VERIFY( B::dtor_count == 0 );
 
-  std::auto_ptr A_from_B(new B);
+  std::auto_ptr A_from_B(new B); // { dg-warning "deprecated" "" { target 
c++11 } }
   VERIFY( A_from_B.get() != 0 );
   VERIFY( A::ctor_count == 2 );
   VERIFY( A::dtor_count == 0 );
diff --git a/libstdc++-v3/testsuite/20_util/auto_ptr/2.cc 
b/libstdc++-v3/testsuite/20_util/auto_ptr/2.cc
index 0d5aabe61a4..9cbab139068 100644
--- a/libstdc++-v3/testsuite/20_util/auto_ptr/2.cc
+++ b/libstdc++-v3/testsuite/20_util/auto_ptr/2.cc
@@ -15,9 +15,10 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// 20.4.5 Template class auto_ptr [lib.auto.ptr]
+// C++03 20.4.5 Template class auto_ptr [lib.auto.ptr]
 
-// { dg-options "-std=c++98" }
+// { dg-add-options using-deprecated }
+// { dg-warning "auto_ptr. is deprecated" "" { target c++11 } 0 }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/20_util/auto_ptr/3.cc 
b/libstdc++-v3/testsuite/20_util/auto_ptr/3.cc
index afac4013b59..ce020406cb8 100644
--- a/libstdc++-v3/testsuite/20_util/auto_ptr/3.cc
+++ b/libstdc++-v3/testsuite/20_util/auto_ptr/3.cc
@@ -15,9 +15,10 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-//

[committed] libstdc++: Remove dg-options "-std=c++98" from TR1 tests

2023-09-04 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux. Pushed to trunk.

-- >8 --

These tests need slight adjustments to be valid in C++11 and later, but
there's no reason that can't be done, so that we test them in more
modes.

libstdc++-v3/ChangeLog:

* testsuite/tr1/6_containers/utility/pair.cc: Remove dg-options
and qualify ambiguous calls to get.
* testsuite/tr1/8_c_compatibility/cmath/pow_cmath.cc: Adjust
expected result for std::pow(float, int) as per DR 550.
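
For context on the pow_cmath.cc change: since C++11, std::pow called
with a float and an int converts both arguments to double and returns
double. A small sketch of that rule follows; it assumes a C++11 or later
compiler, and the helper name is invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// In C++11 and later the (float, int) call resolves to a double result.
static_assert (std::is_same<decltype (std::pow (2.0f, 2)), double>::value,
               "pow(float, int) is expected to return double since C++11");

// Squares a float through std::pow; the result keeps double precision.
double square_via_pow (float x)
{
  return std::pow (x, 2);
}

void demo ()
{
  assert (square_via_pow (3.0f) == 9.0);
}
```

This is why a test that checks the return type has to distinguish
between C++98 and later modes once the -std=c++98 option is dropped.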
---
 .../tr1/6_containers/utility/pair.cc  | 19 ---
 .../tr1/8_c_compatibility/cmath/pow_cmath.cc  |  7 +--
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/libstdc++-v3/testsuite/tr1/6_containers/utility/pair.cc 
b/libstdc++-v3/testsuite/tr1/6_containers/utility/pair.cc
index 4d4dcdb7a02..904b38c2b64 100644
--- a/libstdc++-v3/testsuite/tr1/6_containers/utility/pair.cc
+++ b/libstdc++-v3/testsuite/tr1/6_containers/utility/pair.cc
@@ -17,8 +17,6 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-std=c++98" }
-
 // tr1 additions to pair
 
 #include 
@@ -42,15 +40,14 @@ main()
   tuple_element<1, pair >::type
 blank3 __attribute__((unused)) = blank;
   pair test_pair(1, 2);
-  VERIFY(get<0>(test_pair) == 1);
-  VERIFY(get<1>(test_pair) == 2);
-  get<0>(test_pair) = 3;
-  get<1>(test_pair) = 4;
-  VERIFY(get<0>(test_pair) == 3);
-  VERIFY(get<1>(test_pair) == 4);
+  VERIFY(std::tr1::get<0>(test_pair) == 1);
+  VERIFY(std::tr1::get<1>(test_pair) == 2);
+  std::tr1::get<0>(test_pair) = 3;
+  std::tr1::get<1>(test_pair) = 4;
+  VERIFY(std::tr1::get<0>(test_pair) == 3);
+  VERIFY(std::tr1::get<1>(test_pair) == 4);
 
   const pair test_pair2(1,2);
-  VERIFY(get<0>(test_pair2) == 1);
-  VERIFY(get<1>(test_pair2) == 2);
+  VERIFY(std::tr1::get<0>(test_pair2) == 1);
+  VERIFY(std::tr1::get<1>(test_pair2) == 2);
 }
-
diff --git a/libstdc++-v3/testsuite/tr1/8_c_compatibility/cmath/pow_cmath.cc 
b/libstdc++-v3/testsuite/tr1/8_c_compatibility/cmath/pow_cmath.cc
index bc89ab2f6fe..63891bf4ba0 100644
--- a/libstdc++-v3/testsuite/tr1/8_c_compatibility/cmath/pow_cmath.cc
+++ b/libstdc++-v3/testsuite/tr1/8_c_compatibility/cmath/pow_cmath.cc
@@ -17,8 +17,6 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-std=c++98" }
-
 #include 
 using std::pow;
 #include 
@@ -30,6 +28,11 @@ test01()
   using namespace __gnu_test;
 
   float x = 2080703.375F;
+#if __cplusplus < 201103L
   check_ret_type<float>(std::pow(x, 2));
+#else
+  // LWG 550 What should the return type of pow(float,int) be?
+  check_ret_type<double>(std::pow(x, 2));
+#endif
   check_ret_type<double>(std::tr1::pow(x, 2));
 }
-- 
2.41.0

[committed] libstdc++: Add explicit -std=gnu++98 to tests that use { target c++98_only }

2023-09-04 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* 
testsuite/23_containers/deque/requirements/explicit_instantiation/2.cc:
Add dg-options to restrict the test to C++98 mode.
* testsuite/23_containers/list/requirements/explicit_instantiation/2.cc:
Likewise.
---
 .../23_containers/deque/requirements/explicit_instantiation/2.cc | 1 +
 .../23_containers/list/requirements/explicit_instantiation/2.cc  | 1 +
 2 files changed, 2 insertions(+)

diff --git 
a/libstdc++-v3/testsuite/23_containers/deque/requirements/explicit_instantiation/2.cc
 
b/libstdc++-v3/testsuite/23_containers/deque/requirements/explicit_instantiation/2.cc
index 3afa9fb4403..6e6ceabad21 100644
--- 
a/libstdc++-v3/testsuite/23_containers/deque/requirements/explicit_instantiation/2.cc
+++ 
b/libstdc++-v3/testsuite/23_containers/deque/requirements/explicit_instantiation/2.cc
@@ -21,6 +21,7 @@
 #include 
 #include 
 
+// { dg-options "-std=gnu++98" }
 // { dg-do compile { target c++98_only } }
 
 // N.B. Since C++11 we cannot instantiate with T == NonDefaultConstructible
diff --git 
a/libstdc++-v3/testsuite/23_containers/list/requirements/explicit_instantiation/2.cc
 
b/libstdc++-v3/testsuite/23_containers/list/requirements/explicit_instantiation/2.cc
index b8f393983b1..463ec507bd9 100644
--- 
a/libstdc++-v3/testsuite/23_containers/list/requirements/explicit_instantiation/2.cc
+++ 
b/libstdc++-v3/testsuite/23_containers/list/requirements/explicit_instantiation/2.cc
@@ -21,6 +21,7 @@
 #include 
 #include 
 
+// { dg-options "-std=gnu++98" }
 // { dg-do compile { target c++98_only } }
 
 // N.B. Since C++11 we cannot instantiate with T == NonDefaultConstructible
-- 
2.41.0

[committed] libstdc++: Fix filenames and comments in tests [PR26142]

2023-09-04 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux. Pushed to trunk.

-- >8 --

These tests have transposed digits in the filenames and comments.

libstdc++-v3/ChangeLog:

PR libstdc++/26142
* testsuite/23_containers/vector/26412-1.cc: Moved to...
* testsuite/23_containers/vector/26142-1.cc: ...here.
* testsuite/23_containers/vector/26412-2.cc: Moved to...
* testsuite/23_containers/vector/26142-2.cc: ...here.
---
 .../testsuite/23_containers/vector/{26412-1.cc => 26142-1.cc}   | 2 +-
 .../testsuite/23_containers/vector/{26412-2.cc => 26142-2.cc}   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
 rename libstdc++-v3/testsuite/23_containers/vector/{26412-1.cc => 26142-1.cc} 
(92%)
 rename libstdc++-v3/testsuite/23_containers/vector/{26412-2.cc => 26142-2.cc} 
(92%)

diff --git a/libstdc++-v3/testsuite/23_containers/vector/26412-1.cc 
b/libstdc++-v3/testsuite/23_containers/vector/26142-1.cc
similarity index 92%
rename from libstdc++-v3/testsuite/23_containers/vector/26412-1.cc
rename to libstdc++-v3/testsuite/23_containers/vector/26142-1.cc
index 943032c2377..2c078c7a04d 100644
--- a/libstdc++-v3/testsuite/23_containers/vector/26412-1.cc
+++ b/libstdc++-v3/testsuite/23_containers/vector/26142-1.cc
@@ -21,7 +21,7 @@
 
 #include 
 
-// libstdc++26412
+// PR libstdc++/26142 global debug namespace clashes everywhere
 namespace debug
 {
   int i;
diff --git a/libstdc++-v3/testsuite/23_containers/vector/26412-2.cc 
b/libstdc++-v3/testsuite/23_containers/vector/26142-2.cc
similarity index 92%
rename from libstdc++-v3/testsuite/23_containers/vector/26412-2.cc
rename to libstdc++-v3/testsuite/23_containers/vector/26142-2.cc
index 807f6075ab7..d4a5fe42274 100644
--- a/libstdc++-v3/testsuite/23_containers/vector/26412-2.cc
+++ b/libstdc++-v3/testsuite/23_containers/vector/26142-2.cc
@@ -23,4 +23,4 @@
 
 #include 
 
-// libstdc++26412
+// PR libstdc++/26142 global debug namespace clashes everywhere
-- 
2.41.0

[committed] libstdc++: Add missing target selector to std::expected test

2023-09-04 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux. Pushed to trunk.

-- >8 --

This test should use a target selector of c++23 so that the explicit
-std=gnu++23 option can be removed, to allow testing with later
standards too.

libstdc++-v3/ChangeLog:

* testsuite/20_util/expected/bad.cc: Add missing target
selector.
---
 libstdc++-v3/testsuite/20_util/expected/bad.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/20_util/expected/bad.cc 
b/libstdc++-v3/testsuite/20_util/expected/bad.cc
index 17bc6d69e88..e5d7ba4afb0 100644
--- a/libstdc++-v3/testsuite/20_util/expected/bad.cc
+++ b/libstdc++-v3/testsuite/20_util/expected/bad.cc
@@ -1,5 +1,5 @@
 // { dg-options "-std=gnu++23" }
-// { dg-do compile }
+// { dg-do compile { target c++23 } }
 
 #include 
 
-- 
2.41.0

[committed] libstdc++: Add { target c++98_only } to tests

2023-09-04 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux. Pushed to trunk.

-- >8 --

These test behaviour only seen with -std=c++03 so the target selector
should match.

libstdc++-v3/ChangeLog:

* testsuite/20_util/bitset/107037.cc: Add c++98_only selector.
* testsuite/26_numerics/complex/56111.cc: Likewise.
---
 libstdc++-v3/testsuite/20_util/bitset/107037.cc | 2 +-
 libstdc++-v3/testsuite/26_numerics/complex/56111.cc | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/20_util/bitset/107037.cc 
b/libstdc++-v3/testsuite/20_util/bitset/107037.cc
index b4560dd3775..3b2bb80277e 100644
--- a/libstdc++-v3/testsuite/20_util/bitset/107037.cc
+++ b/libstdc++-v3/testsuite/20_util/bitset/107037.cc
@@ -1,5 +1,5 @@
 // { dg-options "-std=c++03" }
-// { dg-do compile }
+// { dg-do compile { target c++98_only } }
 // PR libstdc++/107037 bitset::_M_do_reset fails for strict -std=c++03 mode
 #include 
 template class std::bitset<0>;
diff --git a/libstdc++-v3/testsuite/26_numerics/complex/56111.cc 
b/libstdc++-v3/testsuite/26_numerics/complex/56111.cc
index e583b190ee1..0237ed356e1 100644
--- a/libstdc++-v3/testsuite/26_numerics/complex/56111.cc
+++ b/libstdc++-v3/testsuite/26_numerics/complex/56111.cc
@@ -1,5 +1,5 @@
-// { dg-do compile }
 // { dg-options "-std=c++98" }
+// { dg-do compile { target c++98_only } }
 
 // Copyright (C) 2013-2023 Free Software Foundation, Inc.
 //
-- 
2.41.0

[WIP] nvptx: Also allow immediate input operand to 'bitrev2'

2023-09-04 Thread Thomas Schwinge

Hi!

I'm working towards reviewing some (of Roger's) GCC/nvptx patches, and
therefore learning some more GCC/nvptx, and generally RTL etc., and the
conventions around it.  Please bear with me asking "obvious" questions.

For the PTX bit reverse instruction, GCC/nvptx currently ("forever")
defines:

(define_insn "bitrev2"
  [(set (match_operand:SDIM 0 "nvptx_register_operand" "=R")
   (unspec:SDIM [(match_operand:SDIM 1 "nvptx_register_operand" "R")]
UNSPEC_BITREV))]
  ""
  "%.\\tbrev.b%T0\\t%0, %1;")

..., with:

(define_predicate "nvptx_register_operand"
  (match_code "reg")
{
  return register_operand (op, mode);
})

(define_constraint "R"
  "A pseudo register."
  (match_code "reg"))

That is, only a register input operand is permitted, not an immediate.
However, I don't see such a restriction in the manual.

If I change that 'define_insn':

-   (unspec:SDIM [(match_operand:SDIM 1 "nvptx_register_operand" "R")]
+   (unspec:SDIM [(match_operand:SDIM 1 "nvptx_nonmemory_operand" "Ri")]

..., with (existing):

(define_predicate "nvptx_nonmemory_operand"
  (match_code "reg,const_int,const_double")
{
  return (REG_P (op) ? register_operand (op, mode)
  : immediate_operand (op, mode));
})

..., then a simple code:

return __builtin_nvptx_brev(0xe6a2c480) != 0x01234567;

... for '-O1' subsequently changes:

[...]
 .reg .u32 %r22;
-.reg .u32 %r25;
 .reg .pred %r29;
-mov.u32 %r25,-425540480;
-brev.b32 %r22,%r25;
+brev.b32 %r22,-425540480;
 setp.ne.u32 %r29,%r22,19088743;
[...]

(I understand that, in the end, that's probably equivalent, assuming that
the later PTX -> SASS compiler does the same optimization, but I find it
easier to read: one less '.reg' to keep track of textually/mentally.)
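
To double-check the constants in that PTX diff, here is a minimal host-side
sketch.  It assumes C++14 (for the loop in a constexpr function); 'bitrev32'
is a hypothetical helper written only for this check, not anything in GCC or
the nvptx back end.  It models 'brev.b32' as a plain 32-bit bit reversal and
confirms that the decimal immediates printed in the PTX are the same values
as the hexadecimal ones in the source:

```cpp
#include <cstdint>

// Hypothetical host-side model of PTX 'brev.b32': reverse the order of all
// 32 bits.  Not part of GCC; written only to check the constants above.
constexpr std::uint32_t bitrev32 (std::uint32_t x)
{
  std::uint32_t r = 0;
  for (int i = 0; i < 32; ++i)
    {
      r = (r << 1) | (x & 1u);  // Shift the lowest bit of X into R.
      x >>= 1;
    }
  return r;
}

// The source-level test: reversing 0xe6a2c480 gives 0x01234567.
static_assert (bitrev32 (0xe6a2c480u) == 0x01234567u, "bit reversal");

// The immediates printed in the PTX are the same values in decimal.
static_assert (static_cast<std::uint32_t> (-425540480) == 0xe6a2c480u,
               "input immediate");
static_assert (19088743 == 0x01234567, "expected result");
```

So the 'setp.ne.u32' compares against exactly the value the source expects,
with or without the intermediate '%r25'.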

Does that make sense to you, too?  I'd then extend my attached
"[WIP] nvptx: Also allow immediate input operand to 'bitrev2'" with
a test case.

(Similarly then, a number of other GCC/nvptx 'define_insn's to be
reviewed/revised, later on.)

Relatedly, I see that a lot of GCC/nvptx' two input operands instructions
('add3', etc.) similarly do allow for their second input operand to
be an immediate in addition to a register.  I suppose that only allowing
for the second input operand to be an immediate is sufficient/desirable:
reduces load on the matching; we shouldn't ever end up with 'IMM + IMM',
for example: should've optimized that before.  As I learn from
,
"for commutative [...] operators, a constant is always made the second
operand".  (Confirmed to ICE if swapping that around for 'add3';
so, that's all as expected.)


Grüße
 Thomas


From 4a5138eb61a026ad6bc5470a648ebc596af1b1ed Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 4 Sep 2023 16:48:53 +0200
Subject: [PATCH] [WIP] nvptx: Also allow immediate input operand to
 'bitrev2'

---
 gcc/config/nvptx/nvptx.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 1bb93045403..e1c822f2ea8 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -636,7 +636,7 @@
 
 (define_insn "bitrev2"
   [(set (match_operand:SDIM 0 "nvptx_register_operand" "=R")
-	(unspec:SDIM [(match_operand:SDIM 1 "nvptx_register_operand" "R")]
+	(unspec:SDIM [(match_operand:SDIM 1 "nvptx_nonmemory_operand" "Ri")]
 		 UNSPEC_BITREV))]
   ""
   "%.\\tbrev.b%T0\\t%0, %1;")
-- 
2.34.1

Re: [PATCH] RISC-V: Keep vlmax vector operators in simple form until split1 pass

2023-09-04 Thread Lehua Ding


Hi Kito,


Can those intermediate patterns be used for intrinsic? I would prefer
to keep those stuff *IF* possible used for intrinsics.


I think we don't need those patterns for intrinsics. First, the deleted 
pattern does not directly correspond to an intrinsic. Second, if you want 
to use these patterns to optimize the following intrinsic program (e.g. 
combine to a single vwadd.vv instruction), we'd better fold them 
directly into the internal function of autovec in Gimple IR.


  vint32m1_t foo(vint16mf2_t va1, vint16mf2_t vb1) {
    vint32m1_t va2 = __riscv_vsext_vf2_i32m1(va1, __riscv_vsetvlmax_e32m1());
    vint32m1_t vb2 = __riscv_vsext_vf2_i32m1(vb1, __riscv_vsetvlmax_e32m1());
    vint32m1_t vc = __riscv_vadd_vv_i32m1(va2, vb2, __riscv_vsetvlmax_e32m1());

    return vc;
  }
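
To illustrate why that sequence could in principle be combined, here is a 
rough per-element scalar model. It is only a sketch: the function names 
are hypothetical, it ignores vl, masking and tail policy, and it assumes 
int is at least 32 bits wide. It shows that two sign extensions followed 
by a 32-bit add compute the same value as one widening add of the 16-bit 
inputs:

```cpp
#include <cstdint>

// Hypothetical per-element model of vsext.vf2: sign-extend 16 -> 32 bits.
constexpr std::int32_t sext16(std::int16_t x)
{
  return static_cast<std::int32_t>(x);
}

// Model of the intrinsic sequence: two sign extensions, then a 32-bit add.
constexpr std::int32_t sext_then_add(std::int16_t a, std::int16_t b)
{
  return sext16(a) + sext16(b);
}

// Model of one widening add on the narrow inputs (per element, this is
// what a vwadd.vv would produce).
constexpr std::int32_t widening_add(std::int16_t a, std::int16_t b)
{
  return static_cast<std::int32_t>(a) + static_cast<std::int32_t>(b);
}

// The two agree, including at the edges of the 16-bit range.
static_assert(sext_then_add(INT16_MAX, INT16_MAX) == widening_add(INT16_MAX, INT16_MAX), "max");
static_assert(sext_then_add(INT16_MIN, INT16_MIN) == widening_add(INT16_MIN, INT16_MIN), "min");
static_assert(sext_then_add(-1, 1) == 0, "mixed signs");
```
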
--
Best,
Lehua

Re: [PATCH] Bug 111071: fix the subr with -1 to not due to the simplify.

2023-09-04 Thread Richard Sandiford via Gcc-patches

"yanzhang.wang--- via Gcc-patches"  writes:
> From: Yanzhang Wang 
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/sve/acle/asm/subr_s8.c: Modify subr with -1
> to not.
>
> Signed-off-by: Yanzhang Wang 
> ---
>
> Tested on my local arm environment and passed. Thanks Andrew Pinski's comment
> the code is the same with that.
>
>  gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c
> index b9615de6655..1cf6916a5e0 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/subr_s8.c
> @@ -76,8 +76,7 @@ TEST_UNIFORM_Z (subr_1_s8_m_untied, svint8_t,
>  
>  /*
>  ** subr_m1_s8_m:
> -**   mov (z[0-9]+\.b), #-1
> -**   subrz0\.b, p0/m, z0\.b, \1
> +**   not z0.b, p0/m, z0.b
>  **   ret
>  */
>  TEST_UNIFORM_Z (subr_m1_s8_m, svint8_t,

I think we need this for subr_u8.c too.  OK with that change,
and thanks for the fix!

Richard

[PING][PATCH v2 1/2] c++, libstdc++: implement __is_scalar built-in trait

2023-09-04 Thread Ken Matsui via Gcc-patches

Ping for __is_scalar built-in.

Sincerely,
Ken Matsui

On Fri, Jul 7, 2023 at 9:46 PM Ken Matsui  wrote:
>
> This patch implements built-in trait for std::is_scalar. The existent
> __is_scalar codes were replaced with is_scalar to avoid unintentional
> macro replacement by the new built-in.
>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __is_scalar.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_SCALAR.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_scalar.
> * g++.dg/ext/is_scalar.C: New test.
> * g++.dg/tm/pr46567.C: Use is_scalar instead.
> * g++.dg/torture/pr57107.C: Likewise.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/cpp_type_traits.h (__is_scalar): Rename to ...
> (is_scalar): ... this.
> * include/bits/stl_algobase.h: Use is_scalar instead.
> * include/bits/valarray_array.h: Likewise.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc|  3 ++
>  gcc/cp/cp-trait.def |  1 +
>  gcc/cp/semantics.cc |  4 +++
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
>  gcc/testsuite/g++.dg/ext/is_scalar.C| 31 +
>  gcc/testsuite/g++.dg/tm/pr46567.C   | 10 +++
>  gcc/testsuite/g++.dg/torture/pr57107.C  |  4 +--
>  libstdc++-v3/include/bits/cpp_type_traits.h |  2 +-
>  libstdc++-v3/include/bits/stl_algobase.h|  8 +++---
>  libstdc++-v3/include/bits/valarray_array.h  |  2 +-
>  10 files changed, 55 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_scalar.C
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 8cf0f2d0974..4c27f2a3a62 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_UNION:
>inform (loc, "  %qT is not a union", t1);
>break;
> +case CPTK_IS_SCALAR:
> +  inform (loc, "  %qT is not a scalar type", t1);
> +  break;
>  case CPTK_IS_AGGREGATE:
>inform (loc, "  %qT is not an aggregate", t1);
>break;
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 8b7fece0cc8..59ae087c457 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> "__is_trivially_assignable", 2)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> -1)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
>  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> +DEFTRAIT_EXPR (IS_SCALAR, "__is_scalar", 1)
>  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> "__reference_constructs_from_temporary", 2)
>  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> "__reference_converts_from_temporary", 2)
>  /* FIXME Added space to avoid direct usage in GCC 13.  */
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 8fb47fd179e..3edc7f23212 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> tree type2)
>  case CPTK_IS_UNION:
>return type_code1 == UNION_TYPE;
>
> +case CPTK_IS_SCALAR:
> +  return SCALAR_TYPE_P (type1);
> +
>  case CPTK_IS_ASSIGNABLE:
>return is_xible (MODIFY_EXPR, type1, type2);
>
> @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>  case CPTK_IS_ENUM:
>  case CPTK_IS_UNION:
>  case CPTK_IS_SAME:
> +case CPTK_IS_SCALAR:
>break;
>
>  case CPTK_IS_LAYOUT_COMPATIBLE:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index f343e153e56..75acbdfb9fc 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -146,3 +146,6 @@
>  #if !__has_builtin (__remove_cvref)
>  # error "__has_builtin (__remove_cvref) failed"
>  #endif
> +#if !__has_builtin (__is_scalar)
> +# error "__has_builtin (__is_scalar) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_scalar.C 
> b/gcc/testsuite/g++.dg/ext/is_scalar.C
> new file mode 100644
> index 000..457fddc52fc
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_scalar.C
> @@ -0,0 +1,31 @@
> +// { dg-do compile { target c++11 } }
> +
> +#include   // std::nullptr_t
> +#include 
> +
> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +
> +#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
> +  SA(TRAIT(TYPE) == EXPECT);   \
> +  SA(TRAIT(const TYPE) == EXPECT); \
> +  SA(TRAIT(volatile TYPE) == EXPECT);  \
> +  SA(TRAIT(const volatile TYPE) == EXPECT)
> +
> +// volatile return type would cause a

[PING][PATCH v10 2/5] libstdc++: Use new built-in trait __is_reference for std::is_reference

2023-09-04 Thread Ken Matsui via Gcc-patches

Ping for __is_reference built-in.

Sincerely,
Ken Matsui


On Wed, Jul 12, 2023 at 7:56 PM Ken Matsui  wrote:
>
> Hi,
>
> Here is the benchmark result for is_reference:
>
> https://github.com/ken-matsui/gcc-benches/blob/main/is_reference.md#wed-jul-12-074702-pm-pdt-2023
>
> Time: -8.15593%
> Peak Memory Usage: -4.48408%
> Total Memory Usage: -8.03783%
>
> Sincerely,
> Ken Matsui
>
> On Wed, Jul 12, 2023 at 7:39 PM Ken Matsui  wrote:
> >
> > This patch gets std::is_reference to dispatch to new built-in trait
> > __is_reference.
> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/std/type_traits (is_reference): Use __is_reference 
> > built-in
> > trait.
> > (is_reference_v): Likewise.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  libstdc++-v3/include/std/type_traits | 14 ++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/libstdc++-v3/include/std/type_traits 
> > b/libstdc++-v3/include/std/type_traits
> > index 0e7a9c9c7f3..2a14df7e5f9 100644
> > --- a/libstdc++-v3/include/std/type_traits
> > +++ b/libstdc++-v3/include/std/type_traits
> > @@ -639,6 +639,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >// Composite type categories.
> >
> >/// is_reference
> > +#if __has_builtin(__is_reference)
> > +  template
> > +struct is_reference
> > +: public __bool_constant<__is_reference(_Tp)>
> > +{ };
> > +#else
> >template
> >  struct is_reference
> >  : public false_type
> > @@ -653,6 +659,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >  struct is_reference<_Tp&&>
> >  : public true_type
> >  { };
> > +#endif
> >
> >/// is_arithmetic
> >template
> > @@ -3192,12 +3199,19 @@ template 
> >inline constexpr bool is_class_v = __is_class(_Tp);
> >  template 
> >inline constexpr bool is_function_v = is_function<_Tp>::value;
> > +
> > +#if __has_builtin(__is_reference)
> > +template 
> > +  inline constexpr bool is_reference_v = __is_reference(_Tp);
> > +#else
> >  template 
> >inline constexpr bool is_reference_v = false;
> >  template 
> >inline constexpr bool is_reference_v<_Tp&> = true;
> >  template 
> >inline constexpr bool is_reference_v<_Tp&&> = true;
> > +#endif
> > +
> >  template 
> >inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
> >  template 
> > --
> > 2.41.0
> >
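
The dispatch pattern in the quoted patch can be tried outside libstdc++. 
The following is a minimal standalone sketch, not the library's code: the 
namespace 'sketch' is a hypothetical name, std::integral_constant stands 
in for the internal __bool_constant, and the fallback definition of 
__has_builtin is an assumption made for compilers that lack it. On a 
compiler without the __is_reference built-in, the partial specializations 
are used instead, so the observable result should be the same either way:

```cpp
#include <type_traits>

// Assumption: compilers without __has_builtin simply take the fallback path.
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

namespace sketch  // Hypothetical namespace, to avoid clashing with std.
{
#if __has_builtin(__is_reference)
  // Built-in path: one primary template, no partial specializations.
  template<typename T>
    struct is_reference
    : std::integral_constant<bool, __is_reference(T)>
    { };
#else
  // Fallback path: the classic three-template definition.
  template<typename T>
    struct is_reference : std::false_type { };
  template<typename T>
    struct is_reference<T&> : std::true_type { };
  template<typename T>
    struct is_reference<T&&> : std::true_type { };
#endif
}

static_assert(sketch::is_reference<int&>::value, "lvalue reference");
static_assert(sketch::is_reference<int&&>::value, "rvalue reference");
static_assert(!sketch::is_reference<int>::value, "object type");
static_assert(!sketch::is_reference<int*>::value, "pointer type");
```

The built-in path avoids instantiating partial specializations, which is 
presumably where the compile-time and memory savings reported above come 
from.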

[PING][PATCH v2 1/2] c++: implement __remove_pointer built-in trait

2023-09-04 Thread Ken Matsui via Gcc-patches

Ping for __remove_pointer built-in.
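
For anyone skimming the quoted patch, the behaviour the new test expects 
matches what std::remove_pointer already does. Here is a small sketch 
using only the standard trait, so it compiles without the patch applied; 
'strips_to' is a hypothetical helper, and the cases are a subset of those 
in remove_pointer.C:

```cpp
#include <type_traits>

// Hypothetical helper: true if removing one pointer level from T gives Expected.
template<typename T, typename Expected>
constexpr bool strips_to()
{
  return std::is_same<typename std::remove_pointer<T>::type, Expected>::value;
}

// Exactly one level of pointer is removed.
static_assert(strips_to<int, int>(), "non-pointer is unchanged");
static_assert(strips_to<int*, int>(), "one level removed");
static_assert(strips_to<int**, int*>(), "only one level removed");

// cv-qualifiers on the pointer itself are dropped; those on the pointee stay.
static_assert(strips_to<const int*, const int>(), "pointee qualifiers kept");
static_assert(strips_to<int* const, int>(), "pointer qualifiers dropped");
static_assert(strips_to<int* const volatile, int>(), "pointer qualifiers dropped");

// References and arrays are not pointers, so they are left alone.
static_assert(strips_to<int&, int&>(), "reference is unchanged");
static_assert(strips_to<int[3], int[3]>(), "array is unchanged");
```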

Sincerely,
Ken Matsui

On Fri, Jul 7, 2023 at 10:29 PM Ken Matsui  wrote:
>
> This patch implements built-in trait for std::remove_pointer.
>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __remove_pointer.
> * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_POINTER.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __remove_pointer.
> * g++.dg/ext/remove_pointer.C: New test.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/cp-trait.def   |  1 +
>  gcc/cp/semantics.cc   |  5 +++
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 ++
>  gcc/testsuite/g++.dg/ext/remove_pointer.C | 51 +++
>  4 files changed, 60 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/remove_pointer.C
>
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 8b7fece0cc8..07823e55579 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -90,6 +90,7 @@ DEFTRAIT_EXPR (IS_DEDUCIBLE, "__is_deducible ", 2)
>  DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1)
>  DEFTRAIT_TYPE (REMOVE_REFERENCE, "__remove_reference", 1)
>  DEFTRAIT_TYPE (REMOVE_CVREF, "__remove_cvref", 1)
> +DEFTRAIT_TYPE (REMOVE_POINTER, "__remove_pointer", 1)
>  DEFTRAIT_TYPE (UNDERLYING_TYPE,  "__underlying_type", 1)
>  DEFTRAIT_TYPE (TYPE_PACK_ELEMENT, "__type_pack_element", -1)
>
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 8fb47fd179e..a3b283ce938 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12374,6 +12374,11 @@ finish_trait_type (cp_trait_kind kind, tree type1, 
> tree type2,
> type1 = TREE_TYPE (type1);
>return cv_unqualified (type1);
>
> +case CPTK_REMOVE_POINTER:
> +  if (TYPE_PTR_P (type1))
> +type1 = TREE_TYPE (type1);
> +  return type1;
> +
>  case CPTK_TYPE_PACK_ELEMENT:
>return finish_type_pack_element (type1, type2, complain);
>
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index f343e153e56..e21e0a95509 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -146,3 +146,6 @@
>  #if !__has_builtin (__remove_cvref)
>  # error "__has_builtin (__remove_cvref) failed"
>  #endif
> +#if !__has_builtin (__remove_pointer)
> +# error "__has_builtin (__remove_pointer) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/remove_pointer.C 
> b/gcc/testsuite/g++.dg/ext/remove_pointer.C
> new file mode 100644
> index 000..7b13db93950
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/remove_pointer.C
> @@ -0,0 +1,51 @@
> +// { dg-do compile { target c++11 } }
> +
> +#define SA(X) static_assert((X),#X)
> +
> +SA(__is_same(__remove_pointer(int), int));
> +SA(__is_same(__remove_pointer(int*), int));
> +SA(__is_same(__remove_pointer(int**), int*));
> +
> +SA(__is_same(__remove_pointer(const int*), const int));
> +SA(__is_same(__remove_pointer(const int**), const int*));
> +SA(__is_same(__remove_pointer(int* const), int));
> +SA(__is_same(__remove_pointer(int** const), int*));
> +SA(__is_same(__remove_pointer(int* const* const), int* const));
> +
> +SA(__is_same(__remove_pointer(volatile int*), volatile int));
> +SA(__is_same(__remove_pointer(volatile int**), volatile int*));
> +SA(__is_same(__remove_pointer(int* volatile), int));
> +SA(__is_same(__remove_pointer(int** volatile), int*));
> +SA(__is_same(__remove_pointer(int* volatile* volatile), int* volatile));
> +
> +SA(__is_same(__remove_pointer(const volatile int*), const volatile int));
> +SA(__is_same(__remove_pointer(const volatile int**), const volatile int*));
> +SA(__is_same(__remove_pointer(const int* volatile), const int));
> +SA(__is_same(__remove_pointer(volatile int* const), volatile int));
> +SA(__is_same(__remove_pointer(int* const volatile), int));
> +SA(__is_same(__remove_pointer(const int** volatile), const int*));
> +SA(__is_same(__remove_pointer(volatile int** const), volatile int*));
> +SA(__is_same(__remove_pointer(int** const volatile), int*));
> +SA(__is_same(__remove_pointer(int* const* const volatile), int* const));
> +SA(__is_same(__remove_pointer(int* volatile* const volatile), int* 
> volatile));
> +SA(__is_same(__remove_pointer(int* const volatile* const volatile), int* 
> const volatile));
> +
> +SA(__is_same(__remove_pointer(int&), int&));
> +SA(__is_same(__remove_pointer(const int&), const int&));
> +SA(__is_same(__remove_pointer(volatile int&), volatile int&));
> +SA(__is_same(__remove_pointer(const volatile int&), const volatile int&));
> +
> +SA(__is_same(__remove_pointer(int&&), int&&));
> +SA(__is_same(__remove_pointer(const int&&), const int&&));
> +SA(__is_same(__remove_pointer(volatile int&&), volatile int&&));
> +SA(__is_same(__remove_pointer(const volatile int&&), const volatile int&&));
> +
> +SA(__is_same(__remove_pointer(int[3]), int[3]));
> +SA(__is_same(__remove_pointer(const int[3]),

[PING][PATCH v4 1/2] c++: implement __is_unsigned built-in trait

2023-09-04 Thread Ken Matsui via Gcc-patches

Ping for __is_unsigned built-in.
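
A note on the test in the quoted patch: for types whose signedness is 
implementation-defined (char, wchar_t), it derives the expected value from 
whether T(-1) compares greater than T(0). A minimal sketch of that idiom, 
checked here against the existing std::is_unsigned rather than the new 
built-in ('wraps_to_positive' is a hypothetical name):

```cpp
#include <type_traits>

// Hypothetical helper: converting -1 to an unsigned T wraps to the largest
// value, so it compares greater than zero; for a signed T it stays negative.
template<typename T>
constexpr bool wraps_to_positive()
{
  return T(-1) > T(0);
}

// Fixed-signedness types.
static_assert(wraps_to_positive<unsigned char>(), "unsigned char");
static_assert(!wraps_to_positive<signed char>(), "signed char");

// Implementation-defined signedness: the idiom and the trait must agree,
// whichever way the target goes.
static_assert(wraps_to_positive<char>() == std::is_unsigned<char>::value, "char");
static_assert(wraps_to_positive<wchar_t>() == std::is_unsigned<wchar_t>::value, "wchar_t");

// bool(-1) is true and bool(0) is false, so bool counts as unsigned.
static_assert(wraps_to_positive<bool>() == std::is_unsigned<bool>::value, "bool");
```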

Sincerely,
Ken Matsui

On Sat, Jul 8, 2023 at 4:25 AM Ken Matsui  wrote:
>
> Hi,
>
> Here is the benchmark result for is_unsigned:
>
> https://github.com/ken-matsui/gcc-benches/blob/main/is_unsigned.md#sat-jul--8-041510-am-pdt-2023
>
> Time: -66.908%
> Peak Memory Usage: -42.5139%
> Total Memory Usage: -46.3483%
>
> Sincerely,
> Ken Matsui
>
> On Sat, Jul 8, 2023 at 4:13 AM Ken Matsui  wrote:
> >
> > This patch implements built-in trait for std::is_unsigned.
> >
> > gcc/cp/ChangeLog:
> >
> > * cp-trait.def: Define __is_unsigned.
> > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_UNSIGNED.
> > * semantics.cc (trait_expr_value): Likewise.
> > (finish_trait_expr): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.dg/ext/has-builtin-1.C: Test existence of __is_unsigned.
> > * g++.dg/ext/is_unsigned.C: New test.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  gcc/cp/constraint.cc |  3 ++
> >  gcc/cp/cp-trait.def  |  1 +
> >  gcc/cp/semantics.cc  |  4 ++
> >  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
> >  gcc/testsuite/g++.dg/ext/is_unsigned.C   | 47 
> >  5 files changed, 58 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.dg/ext/is_unsigned.C
> >
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 8cf0f2d0974..ec8de87d1a1 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
> >  case CPTK_IS_UNION:
> >inform (loc, "  %qT is not a union", t1);
> >break;
> > +case CPTK_IS_UNSIGNED:
> > +  inform (loc, "  %qT is not an unsigned type", t1);
> > +  break;
> >  case CPTK_IS_AGGREGATE:
> >inform (loc, "  %qT is not an aggregate", t1);
> >break;
> > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> > index 8b7fece0cc8..1a219243162 100644
> > --- a/gcc/cp/cp-trait.def
> > +++ b/gcc/cp/cp-trait.def
> > @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> > "__is_trivially_assignable", 2)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> > -1)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
> >  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> > +DEFTRAIT_EXPR (IS_UNSIGNED, "__is_unsigned", 1)
> >  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> > "__reference_constructs_from_temporary", 2)
> >  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> > "__reference_converts_from_temporary", 2)
> >  /* FIXME Added space to avoid direct usage in GCC 13.  */
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 8fb47fd179e..2d48894d811 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> > tree type2)
> >  case CPTK_IS_UNION:
> >return type_code1 == UNION_TYPE;
> >
> > +case CPTK_IS_UNSIGNED:
> > +  return TYPE_UNSIGNED (type1);
> > +
> >  case CPTK_IS_ASSIGNABLE:
> >return is_xible (MODIFY_EXPR, type1, type2);
> >
> > @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> > kind, tree type1, tree type2)
> >  case CPTK_IS_ENUM:
> >  case CPTK_IS_UNION:
> >  case CPTK_IS_SAME:
> > +case CPTK_IS_UNSIGNED:
> >break;
> >
> >  case CPTK_IS_LAYOUT_COMPATIBLE:
> > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > index f343e153e56..20bf8e6cad5 100644
> > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > @@ -146,3 +146,6 @@
> >  #if !__has_builtin (__remove_cvref)
> >  # error "__has_builtin (__remove_cvref) failed"
> >  #endif
> > +#if !__has_builtin (__is_unsigned)
> > +# error "__has_builtin (__is_unsigned) failed"
> > +#endif
> > diff --git a/gcc/testsuite/g++.dg/ext/is_unsigned.C 
> > b/gcc/testsuite/g++.dg/ext/is_unsigned.C
> > new file mode 100644
> > index 000..2bb45d209a7
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/ext/is_unsigned.C
> > @@ -0,0 +1,47 @@
> > +// { dg-do compile { target c++11 } }
> > +
> > +#include 
> > +
> > +using namespace __gnu_test;
> > +
> > +#define SA(X) static_assert((X),#X)
> > +#define SA_TEST_CATEGORY(TRAIT, X, expect) \
> > +  SA(TRAIT(X) == expect);  \
> > +  SA(TRAIT(const X) == expect);\
> > +  SA(TRAIT(volatile X) == expect); \
> > +  SA(TRAIT(const volatile X) == expect)
> > +
> > +SA_TEST_CATEGORY(__is_unsigned, void, false);
> > +
> > +SA_TEST_CATEGORY(__is_unsigned, bool, (bool(-1) > bool(0)));
> > +SA_TEST_CATEGORY(__is_unsigned, char, (char(-1) > char(0)));
> > +SA_TEST_CATEGORY(__is_unsigned, signed char, false);
> > +SA_TEST_CATEGORY(__is_unsigned, unsigned char, true);
> > +SA_TEST_CATEGORY(__is_unsigned, wchar_t, (wchar_t(-1) >

[PING][PATCH v10 3/5] c++: Implement __is_function built-in trait

2023-09-04 Thread Ken Matsui via Gcc-patches

Ping for __is_function built-in.

Sincerely,
Ken Matsui


On Tue, Aug 22, 2023 at 12:53 PM Patrick Palka  wrote:
>
> On Wed, 12 Jul 2023, Ken Matsui via Libstdc++ wrote:
>
> > This patch implements built-in trait for std::is_function.
> >
> > gcc/cp/ChangeLog:
> >
> >   * cp-trait.def: Define __is_function.
> >   * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_FUNCTION.
> >   * semantics.cc (trait_expr_value): Likewise.
> >   (finish_trait_expr): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * g++.dg/ext/has-builtin-1.C: Test existence of __is_function.
> >   * g++.dg/ext/is_function.C: New test.
>
> LGTM!
>
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  gcc/cp/constraint.cc |  3 ++
> >  gcc/cp/cp-trait.def  |  1 +
> >  gcc/cp/semantics.cc  |  4 ++
> >  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
> >  gcc/testsuite/g++.dg/ext/is_function.C   | 58 
> >  5 files changed, 69 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.dg/ext/is_function.C
> >
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index f6951ee2670..927605c6cb7 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -3754,6 +3754,9 @@ diagnose_trait_expr (tree expr, tree args)
> >  case CPTK_IS_UNION:
> >inform (loc, "  %qT is not a union", t1);
> >break;
> > +case CPTK_IS_FUNCTION:
> > +  inform (loc, "  %qT is not a function", t1);
> > +  break;
> >  case CPTK_IS_AGGREGATE:
> >inform (loc, "  %qT is not an aggregate", t1);
> >break;
> > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> > index 1e3310cd682..3cd3babc242 100644
> > --- a/gcc/cp/cp-trait.def
> > +++ b/gcc/cp/cp-trait.def
> > @@ -83,6 +83,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> > "__is_trivially_assignable", 2)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> > -1)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
> >  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> > +DEFTRAIT_EXPR (IS_FUNCTION, "__is_function", 1)
> >  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> > "__reference_constructs_from_temporary", 2)
> >  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> > "__reference_converts_from_temporary", 2)
> >  /* FIXME Added space to avoid direct usage in GCC 13.  */
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 2f37bc353a1..b976633645a 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -12072,6 +12072,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> > tree type2)
> >  case CPTK_IS_ENUM:
> >return type_code1 == ENUMERAL_TYPE;
> >
> > +case CPTK_IS_FUNCTION:
> > +  return type_code1 == FUNCTION_TYPE;
> > +
> >  case CPTK_IS_FINAL:
> >return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
> >
> > @@ -12293,6 +12296,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> > kind, tree type1, tree type2)
> >  case CPTK_IS_UNION:
> >  case CPTK_IS_SAME:
> >  case CPTK_IS_REFERENCE:
> > +case CPTK_IS_FUNCTION:
> >break;
> >
> >  case CPTK_IS_LAYOUT_COMPATIBLE:
> > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > index b697673790c..90eb00ebf2d 100644
> > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > @@ -149,3 +149,6 @@
> >  #if !__has_builtin (__is_reference)
> >  # error "__has_builtin (__is_reference) failed"
> >  #endif
> > +#if !__has_builtin (__is_function)
> > +# error "__has_builtin (__is_function) failed"
> > +#endif
> > diff --git a/gcc/testsuite/g++.dg/ext/is_function.C 
> > b/gcc/testsuite/g++.dg/ext/is_function.C
> > new file mode 100644
> > index 000..2e1594b12ad
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/ext/is_function.C
> > @@ -0,0 +1,58 @@
> > +// { dg-do compile { target c++11 } }
> > +
> > +#include 
> > +
> > +using namespace __gnu_test;
> > +
> > +#define SA(X) static_assert((X),#X)
> > +#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)\
> > +  SA(TRAIT(TYPE) == EXPECT); \
> > +  SA(TRAIT(const TYPE) == EXPECT);   \
> > +  SA(TRAIT(volatile TYPE) == EXPECT);\
> > +  SA(TRAIT(const volatile TYPE) == EXPECT)
> > +
> > +struct A
> > +{ void fn(); };
> > +
> > +template
> > +struct AHolder { };
> > +
> > +template
> > +struct AHolder
> > +{ using type = U; };
> > +
> > +// Positive tests.
> > +SA(__is_function(int (int)));
> > +SA(__is_function(ClassType (ClassType)));
> > +SA(__is_function(float (int, float, int[], int&)));
> > +SA(__is_function(int (int, ...)));
> > +SA(__is_function(bool (ClassType) const));
> > +SA(__is_function(AHolder::type));
> > +
> > +void fn();
> > +SA(__is_function(decltype(fn)));
> > +
> > +// Negative tests.
> >

Re: [PATCH v5 1/4] c++, libstdc++: Implement __is_arithmetic built-in trait

2023-09-04 Thread Ken Matsui via Gcc-patches

Ping for __is_arithmetic built-in.

Sincerely,
Ken Matsui

On Fri, Sep 1, 2023 at 4:25 AM Ken Matsui  wrote:
>
> This patch implements built-in trait for std::is_arithmetic.
>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __is_arithmetic.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_ARITHMETIC.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_arithmetic.
> * g++.dg/ext/is_arithmetic.C: New test.
> * g++.dg/tm/pr46567.C (__is_arithmetic): Rename to ...
> (__is_arith): ... this.
> * g++.dg/torture/pr57107.C: Likewise.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/cpp_type_traits.h (__is_arithmetic): Rename to ...
> (__is_arith): ... this.
> * include/c_global/cmath: Use __is_arith instead.
> * include/c_std/cmath: Likewise.
> * include/tr1/cmath: Likewise.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc|  3 ++
>  gcc/cp/cp-trait.def |  1 +
>  gcc/cp/semantics.cc |  4 ++
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
>  gcc/testsuite/g++.dg/ext/is_arithmetic.C| 33 ++
>  gcc/testsuite/g++.dg/tm/pr46567.C   |  6 +--
>  gcc/testsuite/g++.dg/torture/pr57107.C  |  4 +-
>  libstdc++-v3/include/bits/cpp_type_traits.h |  4 +-
>  libstdc++-v3/include/c_global/cmath | 48 ++---
>  libstdc++-v3/include/c_std/cmath| 24 +--
>  libstdc++-v3/include/tr1/cmath  | 24 +--
>  11 files changed, 99 insertions(+), 55 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_arithmetic.C
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 8cf0f2d0974..bd517d08843 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3754,6 +3754,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_AGGREGATE:
>inform (loc, "  %qT is not an aggregate", t1);
>break;
> +case CPTK_IS_ARITHMETIC:
> +  inform (loc, "  %qT is not an arithmetic type", t1);
> +  break;
>  case CPTK_IS_TRIVIALLY_COPYABLE:
>inform (loc, "  %qT is not trivially copyable", t1);
>break;
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 8b7fece0cc8..a95aeeaf778 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> "__is_trivially_assignable", 2)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> -1)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
>  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> +DEFTRAIT_EXPR (IS_ARITHMETIC, "__is_arithmetic", 1)
>  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> "__reference_constructs_from_temporary", 2)
>  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> "__reference_converts_from_temporary", 2)
>  /* FIXME Added space to avoid direct usage in GCC 13.  */
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 8fb47fd179e..4531f047d73 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> tree type2)
>  case CPTK_IS_UNION:
>return type_code1 == UNION_TYPE;
>
> +case CPTK_IS_ARITHMETIC:
> +  return ARITHMETIC_TYPE_P (type1);
> +
>  case CPTK_IS_ASSIGNABLE:
>return is_xible (MODIFY_EXPR, type1, type2);
>
> @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>  case CPTK_IS_ENUM:
>  case CPTK_IS_UNION:
>  case CPTK_IS_SAME:
> +case CPTK_IS_ARITHMETIC:
>break;
>
>  case CPTK_IS_LAYOUT_COMPATIBLE:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index f343e153e56..3d63b0101d1 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -146,3 +146,6 @@
>  #if !__has_builtin (__remove_cvref)
>  # error "__has_builtin (__remove_cvref) failed"
>  #endif
> +#if !__has_builtin (__is_arithmetic)
> +# error "__has_builtin (__is_arithmetic) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_arithmetic.C 
> b/gcc/testsuite/g++.dg/ext/is_arithmetic.C
> new file mode 100644
> index 000..fd35831f646
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_arithmetic.C
> @@ -0,0 +1,33 @@
> +// { dg-do compile { target c++11 } }
> +
> +#include 
> +
> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
> +  SA(TRAIT(TYPE) == EXPECT);   \
> +  SA(TRAIT(const TYPE) == EXPECT); \
> +  SA(TRAIT(volatile TYPE) == EXPECT);  \
> +  SA(TRAIT(const
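
For readers following the thread, here is a minimal sketch of how user code might consume such a built-in, assuming it can be detected through __has_builtin as the has-builtin-1.C hunk checks. The wrapper name is_arith_sketch is hypothetical and is not part of the patch; on compilers without the built-in it falls back to std::is_arithmetic.

```cpp
#include <type_traits>

// Older compilers may not provide __has_builtin at all.
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

// Hypothetical wrapper (not libstdc++ code): prefer the compiler
// built-in when it is available, otherwise use the standard trait.
template <typename T>
struct is_arith_sketch
#if __has_builtin(__is_arithmetic)
  : std::integral_constant<bool, __is_arithmetic(T)>
#else
  : std::is_arithmetic<T>
#endif
{};
```

Either branch should agree with std::is_arithmetic for cv-qualified types, which is what the SA_TEST_CATEGORY macro in the new test appears to exercise.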

[PATCH] RISC-V: Support Dynamic LMUL Cost model

2023-09-04 Thread Juzhe-Zhong

This patch supports dynamic LMUL cost modeling with 
--param=riscv-autovec-lmul=dynamic.

Consider the following case:
void
foo (int32_t *__restrict a, int32_t *__restrict b,int32_t *__restrict c,
  int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2,
  int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3,
  int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4,
  int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5,
  int32_t *__restrict d,
  int32_t *__restrict d2,
  int32_t *__restrict d3,
  int32_t *__restrict d4,
  int32_t *__restrict d5,
  int n)
{
  for (int i = 0; i < n; i++)
{
  a[i] = b[i] + c[i];
  b5[i] = b[i] + c[i];
  a2[i] = b2[i] + c2[i];
  a3[i] = b3[i] + c3[i];
  a4[i] = b4[i] + c4[i];
  a5[i] = a[i] + a4[i];
  d2[i] = a2[i] + c2[i];
  d3[i] = a3[i] + c3[i];
  d4[i] = a4[i] + c4[i];
  d5[i] = a[i] + a4[i];
  a[i] = a5[i] + b5[i] + a[i];

  c2[i] = a[i] + c[i];
  c3[i] = b5[i] * a5[i];
  c4[i] = a2[i] * a3[i];
  c5[i] = b5[i] * a2[i];
  c[i] = a[i] + c3[i];
  c2[i] = a[i] + c4[i];
  a5[i] = a[i] + a4[i];
  a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i]
  * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i]
  * d[i] * d2[i] * d3[i] * d4[i] * d5[i];
}
}

Demo: https://godbolt.org/z/x1acoMxGT

You can see that it produces register spilling if you specify LMUL >= 4.

Now, with --param=riscv-autovec-lmul=dynamic, GCC is able to pick LMUL = 2 to
optimize this case.

This feature is supported by a linear-scan-based local live range analysis,
which computes the maximum number of live V_REGS at specific program points of
the function to determine the VF/LMUL.

Note that this patch handles both SLP and non-SLP loops.

The current approach doesn't consider the later instruction scheduler, which
may improve the register pressure.
In this case, we conservatively apply a smaller VF/LMUL. (Not sure whether
we should support live range shrinking for such corner cases, since we don't
know whether it can improve performance a lot.)
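
The idea described above can be sketched roughly as follows. This is a simplified model, not the code in riscv-vector-costs.cc: the names LiveRange, max_live and pick_lmul are hypothetical, and it assumes every live value occupies exactly LMUL registers out of the 32 RVV vector registers, whereas the patch computes the register count per mode.

```cpp
#include <algorithm>
#include <vector>

// A live range [first, last] in program-point numbering (hypothetical model).
struct LiveRange { int first; int last; };

// Maximum number of ranges that are live at any single program point.
int max_live(const std::vector<LiveRange> &ranges)
{
  int best = 0;
  for (const LiveRange &r : ranges)
    {
      // The maximum overlap is always reached at the start of some range,
      // so it is enough to count the ranges covering each start point.
      int live = 0;
      for (const LiveRange &o : ranges)
        if (o.first <= r.first && r.first <= o.last)
          ++live;
      best = std::max(best, live);
    }
  return best;
}

// Largest LMUL in {8, 4, 2, 1} such that all live values, each taking
// LMUL registers, still fit in the 32 vector registers.
int pick_lmul(int live_values)
{
  const int kVectorRegs = 32;
  for (int lmul = 8; lmul > 1; lmul /= 2)
    if (live_values * lmul <= kVectorRegs)
      return lmul;
  return 1;
}
```

Under this model, a loop with 16 simultaneously live vector values would get LMUL = 2, since 16 * 4 no longer fits in 32 registers.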

gcc/ChangeLog:

* config/riscv/riscv-protos.h (lookup_vector_type_attribute): Export 
global.
* config/riscv/riscv-vector-builtins.cc (sizeless_type_p): Ditto.
* config/riscv/riscv-vector-costs.cc (get_last_live_range): New 
function.
(compute_nregs_for_mode): Ditto.
(live_range_conflict_p): Ditto.
(get_all_predecessors): Ditto.
(get_all_successors): Ditto.
(max_number_of_live_regs): Ditto.
(compute_lmul): Ditto.
(costs::prefer_new_lmul_p): Ditto.
(costs::better_main_loop_than_p): Ditto.
* config/riscv/riscv-vector-costs.h (struct stmt_point): New struct.
(struct var_live_range): Ditto.
(struct autovec_info): Ditto.
* config/riscv/t-riscv: Update makefile for cost model.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul-mixed-1.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-1.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-2.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-3.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-4.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-5.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-6.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-7.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-1.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-2.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-3.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-4.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-1.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-2.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-3.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-4.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-5.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-6.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-7.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-8.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-1.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-10.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-2.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-3.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-4.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-5.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-6.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-7.c: New test.
*

Re: [PATCH] RISC-V: Keep vlmax vector operators in simple form until split1 pass

2023-09-04 Thread Kito Cheng via Gcc-patches

Can those intermediate patterns be used for intrinsics? I would prefer
to keep them *IF* they can possibly be used for intrinsics.

On Mon, Sep 4, 2023 at 7:14 PM Lehua Ding  wrote:
>
> This patch keeps vlmax vector patterns in simple form before the split1
> pass, which allows more optimization (e.g. combine) before the split1 pass.
> This patch changes the vlmax patterns in autovec.md to define_insn_and_split
> as much as possible and cleans up some combine patterns that are no longer
> needed.
> This patch also fixes the PR111232 bug, which was caused by a combine failure.
>
> PR target/111232
>
> gcc/ChangeLog:
>
> * config/riscv/autovec-opt.md 
> (@pred_single_widen_mul):
> Delete.
> (*pred_widen_mulsu): Delete.
> (*pred_single_widen_mul): Delete.
> (*dual_widen_):
> Add new combine patterns.
> (*single_widen_sub): Ditto.
> (*single_widen_add): Ditto.
> (*single_widen_mult): Ditto.
> (*dual_widen_mulsu): Ditto.
> (*dual_widen_mulus): Ditto.
> (*dual_widen_): Ditto.
> (*single_widen_add): Ditto.
> (*single_widen_sub): Ditto.
> (*single_widen_mult): Ditto.
> * config/riscv/autovec.md (3):
> Change define_expand to define_insn_and_split.
> (2): Ditto.
> (abs2): Ditto.
> (smul3_highpart): Ditto.
> (umul3_highpart): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/widen/widen-4.c: Add more testcases.
> * gcc.target/riscv/rvv/autovec/widen/widen-complicate-4.c: Ditto.
> * gcc.target/riscv/rvv/autovec/pr111232.c: New test.
>
> ---
>  gcc/config/riscv/autovec-opt.md   | 294 --
>  gcc/config/riscv/autovec.md   |  82 +++--
>  .../gcc.target/riscv/rvv/autovec/pr111232.c   |  18 ++
>  .../riscv/rvv/autovec/widen/widen-4.c |   7 +-
>  .../rvv/autovec/widen/widen-complicate-4.c|  11 +-
>  5 files changed, 276 insertions(+), 136 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111232.c
>
> diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
> index d9863c76654..3aaee54f02a 100644
> --- a/gcc/config/riscv/autovec-opt.md
> +++ b/gcc/config/riscv/autovec-opt.md
> @@ -18,67 +18,6 @@
>  ;; along with GCC; see the file COPYING3.  If not see
>  ;; .
>
> -;; We don't have vwmul.wv instruction like vwadd.wv in RVV.
> -;; This pattern is an intermediate RTL IR as a pseudo vwmul.wv to enhance
> -;; optimization of instructions combine.
> -(define_insn_and_split "@pred_single_widen_mul"
> -  [(set (match_operand:VWEXTI 0 "register_operand"  
> "=,")
> -   (if_then_else:VWEXTI
> - (unspec:
> -   [(match_operand: 1 "vector_mask_operand"   
> "vmWc1,vmWc1")
> -(match_operand 5 "vector_length_operand"  "   rK,   
> rK")
> -(match_operand 6 "const_int_operand"  "i,
> i")
> -(match_operand 7 "const_int_operand"  "i,
> i")
> -(match_operand 8 "const_int_operand"  "i,
> i")
> -(reg:SI VL_REGNUM)
> -(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> - (mult:VWEXTI
> -   (any_extend:VWEXTI
> - (match_operand: 4 "register_operand" "   vr,   
> vr"))
> -   (match_operand:VWEXTI 3 "register_operand" "   vr,   
> vr"))
> - (match_operand:VWEXTI 2 "vector_merge_operand"   "   vu,
> 0")))]
> -  "TARGET_VECTOR && can_create_pseudo_p ()"
> -  "#"
> -  "&& 1"
> -  [(const_int 0)]
> -  {
> -insn_code icode = code_for_pred_vf2 (, mode);
> -rtx tmp = gen_reg_rtx (mode);
> -rtx ops[] = {tmp, operands[4]};
> -riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP, ops);
> -
> -emit_insn (gen_pred (MULT, mode, operands[0], operands[1], 
> operands[2],
> -operands[3], tmp, operands[5], operands[6],
> -operands[7], operands[8]));
> -DONE;
> -  }
> -  [(set_attr "type" "viwmul")
> -   (set_attr "mode" "")])
> -
> -;; This pattern it to enchance the instruction combine optimizations for 
> complicate
> -;; sign and unsigned widening multiplication operations.
> -(define_insn "*pred_widen_mulsu"
> -  [(set (match_operand:VWEXTI 0 "register_operand"  
> "=,")
> -   (if_then_else:VWEXTI
> - (unspec:
> -   [(match_operand: 1 "vector_mask_operand"   
> "vmWc1,vmWc1")
> -(match_operand 5 "vector_length_operand"  "   rK,   
> rK")
> -(match_operand 6 "const_int_operand"  "i,
> i")
> -(match_operand 7 "const_int_operand"  "i,
> i")
> -(match_operand 8 "const_int_operand"  "i,
> i")
> -(reg:SI VL_REGNUM)
> -

Add 'libgomp.c-c++-common/pr100059-1.c' [PR100059]

2023-09-04 Thread Thomas Schwinge

Hi!

Pushed to master branch commit fe0f9e09413047484441468b05288412760d8a09
"Add 'libgomp.c-c++-common/pr100059-1.c'" (omitting PR100059 tag
unfortunately), see attached.


Grüße
 Thomas


From fe0f9e09413047484441468b05288412760d8a09 Mon Sep 17 00:00:00 2001
From: Tobias Burnus 
Date: Tue, 13 Apr 2021 08:58:51 +
Subject: [PATCH] Add 'libgomp.c-c++-common/pr100059-1.c'

For nvptx offloading, it'll FAIL its execution test until nvptx-tools is updated
to include commit 1b5946d78ef5dcfb640e9f545a7c791b7f623911
"Merge commit '26095fd01232061de9f79decb3e8222ef7b46191' into HEAD [#29]",
.

	libgomp/
	* testsuite/libgomp.c-c++-common/pr100059-1.c: New.

Co-authored-by: Thomas Schwinge 
---
 .../libgomp.c-c++-common/pr100059-1.c | 55 +++
 1 file changed, 55 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/pr100059-1.c

diff --git a/libgomp/testsuite/libgomp.c-c++-common/pr100059-1.c b/libgomp/testsuite/libgomp.c-c++-common/pr100059-1.c
new file mode 100644
index 000..af12295541a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/pr100059-1.c
@@ -0,0 +1,55 @@
+/* Based on sollve_vv's tests/5.0/declare_target/test_nested_declare_target.c.  */
+
+#define N 1024
+int a[N], b[N], c[N];  
+int i = 0;
+
+void
+update ()
+{ 
+  for (i = 0; i < N; i++)
+{
+  a[i] += 1;
+  b[i] += 2;
+  c[i] += 3;
+}
+}
+
+#pragma omp declare target 
+#pragma omp declare target link(a,c,b,i)
+#pragma omp declare target to(update)  
+#pragma omp end declare target
+
+int
+main ()
+{
+  for (i = 0; i < N; i++)
+{
+  a[i] = i;
+  b[i] = i + 1;
+  c[i] = i + 2;
+}
+
+  //__builtin_printf("i=5: A=%d, B=%d, C=%d\n", a[5], b[5], c[5]);
+  
+  #pragma omp target map(to: i) map(tofrom: a, b, c) 
+  {
+update();  /* Device. */
+  }
+
+  //__builtin_printf("i=5: A=%d, B=%d, C=%d\n", a[5], b[5], c[5]);
+
+  for (i = 0; i < N; i++)
+if ( a[i] != i + 1 || b[i] != i + 3 || c[i] != i + 5)
+  __builtin_abort();
+
+  update();  /* Host. */
+
+  //__builtin_printf("i=5: A=%d, B=%d, C=%d\n", a[5], b[5], c[5]);
+
+  for (i = 0; i < N; i++)
+if ( a[i] != i + 2 || b[i] != i + 5 || c[i] != i + 8)
+  __builtin_abort ();
+  
+  return 0;
+}
-- 
2.34.1

[PATCH] testsuite: Remove unwanted 'dg-do run' from gcc.dg/vect tests

2023-09-04 Thread Christophe Lyon via Gcc-patches

Tests under gcc.dg/vect use check_vect_support_and_set_flags to set
compilation flags as appropriate for the target, but they also set
dg-do-what-default to 'run' or 'compile', depending on the actual
target hardware (or simulator) capabilities.

For instance on arm, we use options to enable Neon, but set
dg-do-what-default to 'run' only if we can actually execute Neon
instructions.

Therefore, we would always try to link and execute tests containing
'dg-do run', although dg-do-what-default says otherwise, leading to
uninteresting failures.

Therefore, this patch removes all such unconditional 'dg-do run',
thus avoiding link errors, for instance if GCC has been configured with
multilibs disabled and some --with-{float|cpu|hard} option
incompatible with what check_vect_support_and_set_flags selects.

For example, GCC configured with:
--disable-multilib --with-mode=thumb --with-cpu=cortex-m7 --with-float=hard
and check_vect_support_and_set_flags uses
-mfpu=neon -mfloat-abi=softfp -march=armv7-a
(thus incompatible float-abi options)

Tested on native aarch64-linux-gnu (no change) and several arm-eabi
cases where the FAIL/UNRESOLVED disappear (and we keep only the
'compilation' tests).

2023-09-04  Christophe Lyon  

gcc/testsuite/
* gcc.dg/vect/bb-slp-44.c: Remove 'dg-do run'.
* gcc.dg/vect/bb-slp-71.c: Likewise.
* gcc.dg/vect/bb-slp-72.c: Likewise.
* gcc.dg/vect/bb-slp-73.c: Likewise.
* gcc.dg/vect/bb-slp-74.c: Likewise.
* gcc.dg/vect/bb-slp-pr101207.c: Likewise.
* gcc.dg/vect/bb-slp-pr101615-1.c: Likewise.
* gcc.dg/vect/bb-slp-pr101615-2.c: Likewise.
* gcc.dg/vect/bb-slp-pr101668.c: Likewise.
* gcc.dg/vect/bb-slp-pr54400.c: Likewise.
* gcc.dg/vect/bb-slp-pr98516-1.c: Likewise.
* gcc.dg/vect/bb-slp-pr98516-2.c: Likewise.
* gcc.dg/vect/bb-slp-pr98544.c: Likewise.
* gcc.dg/vect/pr101445.c: Likewise.
* gcc.dg/vect/pr105219.c: Likewise.
* gcc.dg/vect/pr107160.c: Likewise.
* gcc.dg/vect/pr107212-1.c: Likewise.
* gcc.dg/vect/pr107212-2.c: Likewise.
* gcc.dg/vect/pr109502.c: Likewise.
* gcc.dg/vect/pr110381.c: Likewise.
* gcc.dg/vect/pr110838.c: Likewise.
* gcc.dg/vect/pr88497-1.c: Likewise.
* gcc.dg/vect/pr88497-7.c: Likewise.
* gcc.dg/vect/pr96783-1.c: Likewise.
* gcc.dg/vect/pr96783-2.c: Likewise.
* gcc.dg/vect/pr97558-2.c: Likewise.
* gcc.dg/vect/pr99253.c: Likewise.
* gcc.dg/vect/slp-mask-store-1.c: Likewise.
* gcc.dg/vect/vect-bic-bitmask-10.c: Likewise.
* gcc.dg/vect/vect-bic-bitmask-11.c: Likewise.
* gcc.dg/vect/vect-bic-bitmask-2.c: Likewise.
* gcc.dg/vect/vect-bic-bitmask-3.c: Likewise.
* gcc.dg/vect/vect-bic-bitmask-4.c: Likewise.
* gcc.dg/vect/vect-bic-bitmask-5.c: Likewise.
* gcc.dg/vect/vect-bic-bitmask-6.c: Likewise.
* gcc.dg/vect/vect-bic-bitmask-8.c: Likewise.
* gcc.dg/vect/vect-bic-bitmask-9.c: Likewise.
* gcc.dg/vect/vect-cond-13.c: Likewise.
* gcc.dg/vect/vect-recurr-1.c: Likewise.
* gcc.dg/vect/vect-recurr-2.c: Likewise.
* gcc.dg/vect/vect-recurr-3.c: Likewise.
* gcc.dg/vect/vect-recurr-4.c: Likewise.
* gcc.dg/vect/vect-recurr-5.c: Likewise.
* gcc.dg/vect/vect-recurr-6.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-44.c   | 2 --
 gcc/testsuite/gcc.dg/vect/bb-slp-71.c   | 2 --
 gcc/testsuite/gcc.dg/vect/bb-slp-72.c   | 2 --
 gcc/testsuite/gcc.dg/vect/bb-slp-73.c   | 2 --
 gcc/testsuite/gcc.dg/vect/bb-slp-74.c   | 1 -
 gcc/testsuite/gcc.dg/vect/bb-slp-pr101207.c | 1 -
 gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-1.c   | 1 -
 gcc/testsuite/gcc.dg/vect/bb-slp-pr101615-2.c   | 1 -
 gcc/testsuite/gcc.dg/vect/bb-slp-pr101668.c | 1 -
 gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c  | 1 -
 gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-1.c| 2 --
 gcc/testsuite/gcc.dg/vect/bb-slp-pr98516-2.c| 2 --
 gcc/testsuite/gcc.dg/vect/bb-slp-pr98544.c  | 2 --
 gcc/testsuite/gcc.dg/vect/pr101445.c| 2 --
 gcc/testsuite/gcc.dg/vect/pr105219.c| 1 -
 gcc/testsuite/gcc.dg/vect/pr107160.c| 2 --
 gcc/testsuite/gcc.dg/vect/pr107212-1.c  | 2 --
 gcc/testsuite/gcc.dg/vect/pr107212-2.c  | 2 --
 gcc/testsuite/gcc.dg/vect/pr109502.c| 1 -
 gcc/testsuite/gcc.dg/vect/pr110381.c| 1 -
 gcc/testsuite/gcc.dg/vect/pr110838.c| 2 --
 gcc/testsuite/gcc.dg/vect/pr88497-1.c   | 1 -
 gcc/testsuite/gcc.dg/vect/pr88497-7.c   | 1 -
 gcc/testsuite/gcc.dg/vect/pr96783-1.c   | 2 --
 gcc/testsuite/gcc.dg/vect/pr96783-2.c   | 2 --
 gcc/testsuite/gcc.dg/vect/pr97558-2.c   | 1 -
 gcc/testsuite/gcc.dg/vect/pr99253.c | 2 --
 gcc/testsuite/gcc.dg/vect/slp-mask-store-1.c| 1 -

[pushed] Darwin, ppc: Add system stubs for all 32b PPC

2023-09-04 Thread Iain Sandoe via Gcc-patches

Tested on powerpc-darwin9, pushed to trunk, thanks
Iain

--- 8< ---

This is a minor adjustment to make the GCC behaviour better match the
old system tools.

Signed-off-by: Iain Sandoe 

gcc/ChangeLog:

* config/rs6000/darwin.h (LIB_SPEC): Include libSystemStubs for
all 32b Darwin PowerPC cases.
---
 gcc/config/rs6000/darwin.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
index bf9dfaf2f34..88a39478702 100644
--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -98,7 +98,7 @@
Include libmx when targeting Darwin 7.0 and above, but before libSystem,
since the functions are actually in libSystem but for 7.x compatibility
we want them to be looked for in libmx first.
-   Include libSystemStubs when compiling against 10.3 - 10.5 SDKs (we assume
+   Include libSystemStubs when compiling against 10.3 - 10.6 SDKs (we assume
this is the case when targetting these) - but not for 64-bit long double.
Don't do either for m64, the library is either a dummy or non-existent.
 */
@@ -107,8 +107,8 @@
 #define LIB_SPEC \
 "%{!static:\
   %{!m64:%{!mlong-double-64:   \
-%{pg:%:version-compare(>< 10.3 10.5 mmacosx-version-min= 
-lSystemStubs_profile)} \
-%{!pg:%:version-compare(>< 10.3 10.5 mmacosx-version-min= -lSystemStubs)} \
+%{pg:%:version-compare(>< 10.3 10.7 mmacosx-version-min= 
-lSystemStubs_profile)} \
+%{!pg:%:version-compare(>< 10.3 10.7 mmacosx-version-min= -lSystemStubs)} \
  %:version-compare(>< 10.3 10.4 mmacosx-version-min= -lmx)}}   \
   -lSystem \
 }"
-- 
2.39.2 (Apple Git-143)
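
As a reading aid for the hunk above: the `><` operator of the version-compare spec function is true when the switch value is at or above the first version and below the second, so raising the upper bound from 10.5 to 10.7 extends the match through 10.6.x. A minimal sketch of that half-open range test follows; version_key and in_range are hypothetical helpers that only look at "major.minor", and GCC's own comparison is more general.

```cpp
#include <cstdio>

// Hypothetical helper: turn "major.minor[.patch]" into a comparable
// integer. Only major and minor are considered; assumes minor < 1000.
static int version_key(const char *v)
{
  int major = 0, minor = 0;
  std::sscanf(v, "%d.%d", &major, &minor);
  return major * 1000 + minor;
}

// Model of the `><` operator: true when lo <= version < hi.
bool in_range(const char *version, const char *lo, const char *hi)
{
  int v = version_key(version);
  return version_key(lo) <= v && v < version_key(hi);
}
```

For instance, in_range("10.6", "10.3", "10.7") holds under the new bounds, while in_range("10.6", "10.3", "10.5") does not.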

[PATCH] Darwin: Place global inits in the correct section.

2023-09-04 Thread Iain Sandoe via Gcc-patches

Tested on x86_64-darwin21, i686-darwin9, aarch64-darwin21 and
powerpc-darwin9, pushed to trunk, thanks
Iain

--- 8< ---

This handles placement of global initializers into __TEXT,__StaticInit as used
by other platform toolchains.

Signed-off-by: Iain Sandoe 

gcc/ChangeLog:

* config/darwin-sections.def (static_init_section): Add the
__TEXT,__StaticInit section.
* config/darwin.cc (darwin_function_section): Use the static init
section for global initializers, to match other platform toolchains.
---
 gcc/config/darwin-sections.def | 2 ++
 gcc/config/darwin.cc   | 8 
 2 files changed, 10 insertions(+)

diff --git a/gcc/config/darwin-sections.def b/gcc/config/darwin-sections.def
index de2334f4a7a..7e1b4710bd6 100644
--- a/gcc/config/darwin-sections.def
+++ b/gcc/config/darwin-sections.def
@@ -98,6 +98,8 @@ DEF_SECTION (mod_init_section, 0, ".mod_init_func", 0)
 DEF_SECTION (mod_term_section, 0, ".mod_term_func", 0)
 DEF_SECTION (constructor_section, 0, ".constructor", 0)
 DEF_SECTION (destructor_section, 0, ".destructor", 0)
+DEF_SECTION (static_init_section, SECTION_CODE,
+".section\t__TEXT,__StaticInit,regular,pure_instructions", 0)
 
 /* Objective-C ABI=0 (Original version) sections.  */
 DEF_SECTION (objc_class_section, 0, ".objc_class", 1)
diff --git a/gcc/config/darwin.cc b/gcc/config/darwin.cc
index b435bb2b80a..95d6194cf22 100644
--- a/gcc/config/darwin.cc
+++ b/gcc/config/darwin.cc
@@ -3893,6 +3893,14 @@ darwin_function_section (tree decl, enum node_frequency 
freq,
   if (decl && DECL_SECTION_NAME (decl) != NULL)
 return get_named_section (decl, NULL, 0);
 
+  /* Intercept functions in global init; these are placed in separate sections.
+ FIXME: there should be some neater way to do this.  */
+  if (DECL_NAME (decl)
+  && (startswith (IDENTIFIER_POINTER (DECL_NAME (decl)), "_GLOBAL__sub_I")
+ || startswith (IDENTIFIER_POINTER (DECL_NAME (decl)),
+"__static_initialization_and_destruction")))
+return  darwin_sections[static_init_section];
+
   /* We always put unlikely executed stuff in the cold section.  */
   if (freq == NODE_FREQUENCY_UNLIKELY_EXECUTED)
 return (use_coal) ? darwin_sections[text_cold_coal_section]
-- 
2.39.2 (Apple Git-143)
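
The name test added to darwin_function_section can be illustrated in isolation. The sketch below is not the GCC code: starts_with mirrors the contract of GCC's startswith helper, and is_global_init_name is a hypothetical stand-in that takes a plain C string instead of a decl.

```cpp
#include <cstring>

// Same contract as GCC's startswith: does STR begin with PREFIX?
static bool starts_with(const char *str, const char *prefix)
{
  return std::strncmp(str, prefix, std::strlen(prefix)) == 0;
}

// Hypothetical stand-in for the check in the hunk above: would a
// function with this name be routed to the static init section?
bool is_global_init_name(const char *name)
{
  return starts_with(name, "_GLOBAL__sub_I")
         || starts_with(name, "__static_initialization_and_destruction");
}
```

Matching on name prefixes is what the FIXME in the patch refers to; a function named, say, _GLOBAL__sub_D_foo would not match.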

[PATCH] RISC-V: Keep vlmax vector operators in simple form until split1 pass

2023-09-04 Thread Lehua Ding

This patch keeps vlmax vector patterns in simple form before the split1
pass, which allows more optimization (e.g. combine) before the split1 pass.
This patch changes the vlmax patterns in autovec.md to define_insn_and_split
as much as possible and cleans up some combine patterns that are no longer
needed.
This patch also fixes the PR111232 bug, which was caused by a combine failure.

PR target/111232

gcc/ChangeLog:

* config/riscv/autovec-opt.md 
(@pred_single_widen_mul):
Delete.
(*pred_widen_mulsu): Delete.
(*pred_single_widen_mul): Delete.
(*dual_widen_):
Add new combine patterns.
(*single_widen_sub): Ditto.
(*single_widen_add): Ditto.
(*single_widen_mult): Ditto.
(*dual_widen_mulsu): Ditto.
(*dual_widen_mulus): Ditto.
(*dual_widen_): Ditto.
(*single_widen_add): Ditto.
(*single_widen_sub): Ditto.
(*single_widen_mult): Ditto.
* config/riscv/autovec.md (3):
Change define_expand to define_insn_and_split.
(2): Ditto.
(abs2): Ditto.
(smul3_highpart): Ditto.
(umul3_highpart): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/widen/widen-4.c: Add more testcases.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/pr111232.c: New test.

---
 gcc/config/riscv/autovec-opt.md   | 294 --
 gcc/config/riscv/autovec.md   |  82 +++--
 .../gcc.target/riscv/rvv/autovec/pr111232.c   |  18 ++
 .../riscv/rvv/autovec/widen/widen-4.c |   7 +-
 .../rvv/autovec/widen/widen-complicate-4.c|  11 +-
 5 files changed, 276 insertions(+), 136 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111232.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index d9863c76654..3aaee54f02a 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -18,67 +18,6 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .
 
-;; We don't have vwmul.wv instruction like vwadd.wv in RVV.
-;; This pattern is an intermediate RTL IR as a pseudo vwmul.wv to enhance
-;; optimization of instructions combine.
-(define_insn_and_split "@pred_single_widen_mul"
-  [(set (match_operand:VWEXTI 0 "register_operand"  "=,")
-   (if_then_else:VWEXTI
- (unspec:
-   [(match_operand: 1 "vector_mask_operand"   
"vmWc1,vmWc1")
-(match_operand 5 "vector_length_operand"  "   rK,   
rK")
-(match_operand 6 "const_int_operand"  "i,
i")
-(match_operand 7 "const_int_operand"  "i,
i")
-(match_operand 8 "const_int_operand"  "i,
i")
-(reg:SI VL_REGNUM)
-(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (mult:VWEXTI
-   (any_extend:VWEXTI
- (match_operand: 4 "register_operand" "   vr,   
vr"))
-   (match_operand:VWEXTI 3 "register_operand" "   vr,   
vr"))
- (match_operand:VWEXTI 2 "vector_merge_operand"   "   vu,
0")))]
-  "TARGET_VECTOR && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(const_int 0)]
-  {
-insn_code icode = code_for_pred_vf2 (, mode);
-rtx tmp = gen_reg_rtx (mode);
-rtx ops[] = {tmp, operands[4]};
-riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP, ops);
-
-emit_insn (gen_pred (MULT, mode, operands[0], operands[1], 
operands[2],
-operands[3], tmp, operands[5], operands[6],
-operands[7], operands[8]));
-DONE;
-  }
-  [(set_attr "type" "viwmul")
-   (set_attr "mode" "")])
-
-;; This pattern it to enchance the instruction combine optimizations for 
complicate
-;; sign and unsigned widening multiplication operations.
-(define_insn "*pred_widen_mulsu"
-  [(set (match_operand:VWEXTI 0 "register_operand"  "=,")
-   (if_then_else:VWEXTI
- (unspec:
-   [(match_operand: 1 "vector_mask_operand"   
"vmWc1,vmWc1")
-(match_operand 5 "vector_length_operand"  "   rK,   
rK")
-(match_operand 6 "const_int_operand"  "i,
i")
-(match_operand 7 "const_int_operand"  "i,
i")
-(match_operand 8 "const_int_operand"  "i,
i")
-(reg:SI VL_REGNUM)
-(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (mult:VWEXTI
-   (zero_extend:VWEXTI
- (match_operand: 4 "register_operand" "   vr,   
vr"))
-   (sign_extend:VWEXTI
- (match_operand: 3 "register_operand" "   vr,   
vr")))
- (match_operand:VWEXTI 2 "vector_merge_operand"   "   vu,
0")))]
-  "TARGET_VECTOR"
-  "vwmulsu.vv\t%0,%3,%4%p1"
-

[pushed] Darwin: Match system sections and relocs for exception tables.

2023-09-04 Thread Iain Sandoe via Gcc-patches

Tested on x86_64-darwin21 and i686-darwin9, pushed to trunk, thanks,
Iain

--- 8< ---

System tools from Darwin10 onwards have moved the exception tables from
the __DATA segment to the __TEXT one.  They also revised the relocations
used for typeinfo.  While Darwin9 was not changed at the time, in fact the
tools there are equally happy with the revised scheme - and therefore at
present there seems no reason to special-case it.

Signed-off-by: Iain Sandoe 

gcc/ChangeLog:

* config/darwin-sections.def (darwin_exception_section): Move to
the __TEXT segment.
* config/darwin.cc (darwin_emit_except_table_label): Align before
the exception table label.
* config/darwin.h (ASM_PREFERRED_EH_DATA_FORMAT): Use indirect PC-
relative 4byte relocs.
---
 gcc/config/darwin-sections.def | 2 +-
 gcc/config/darwin.cc   | 1 +
 gcc/config/darwin.h| 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/darwin-sections.def b/gcc/config/darwin-sections.def
index 62a51b9761c..de2334f4a7a 100644
--- a/gcc/config/darwin-sections.def
+++ b/gcc/config/darwin-sections.def
@@ -157,7 +157,7 @@ DEF_SECTION (machopic_picsymbol_stub3_section, 
SECTION_NO_ANCHOR,
 
 /* Exception-related.  */
 DEF_SECTION (darwin_exception_section, SECTION_NO_ANCHOR,
-".section __DATA,__gcc_except_tab", 0)
+".section __TEXT,__gcc_except_tab", 0)
 DEF_SECTION (darwin_eh_frame_section, SECTION_NO_ANCHOR,
 ".section " EH_FRAME_SECTION_NAME ",__eh_frame"
 EH_FRAME_SECTION_ATTR, 0)
diff --git a/gcc/config/darwin.cc b/gcc/config/darwin.cc
index 0d53e97ae80..b435bb2b80a 100644
--- a/gcc/config/darwin.cc
+++ b/gcc/config/darwin.cc
@@ -2271,6 +2271,7 @@ darwin_emit_except_table_label (FILE *file)
 {
   char section_start_label[30];
 
+  fputs ("\t.p2align\t2\n", file);
   ASM_GENERATE_INTERNAL_LABEL (section_start_label, "GCC_except_table",
   except_table_label_num++);
   ASM_OUTPUT_LABEL (file, section_start_label);
diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 1b3f1bd984c..b7cfab607db 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -1089,7 +1089,7 @@ enum machopic_addr_class {
 
 #undef ASM_PREFERRED_EH_DATA_FORMAT
 #define ASM_PREFERRED_EH_DATA_FORMAT(CODE,GLOBAL)  \
-  (((CODE) == 2 && (GLOBAL) == 1) \
+  (((CODE) == 2 && (GLOBAL) == 1) || ((CODE) == 0 && (GLOBAL) == 1) \
? (DW_EH_PE_pcrel | DW_EH_PE_indirect | DW_EH_PE_sdata4) : \
  ((CODE) == 1 || (GLOBAL) == 0) ? DW_EH_PE_pcrel : DW_EH_PE_absptr)
 
-- 
2.39.2 (Apple Git-143)
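
To make the effect of the darwin.h hunk easier to see, here is the patched macro written out as a function. This is only a sketch: preferred_eh_data_format is a hypothetical name, and the constants are the usual DWARF pointer-encoding values. As far as I recall from the GCC internals manual, CODE is 0 for data, 1 for code labels and 2 for function pointers.

```cpp
// DWARF pointer-encoding constants used by the macro.
enum : int {
  DW_EH_PE_absptr   = 0x00,
  DW_EH_PE_sdata4   = 0x0b,
  DW_EH_PE_pcrel    = 0x10,
  DW_EH_PE_indirect = 0x80
};

// Function form of the patched ASM_PREFERRED_EH_DATA_FORMAT.
int preferred_eh_data_format(int code, int global)
{
  // Added by the patch: global data (CODE == 0) now gets the same
  // indirect PC-relative 4-byte encoding as global function pointers.
  if ((code == 2 && global == 1) || (code == 0 && global == 1))
    return DW_EH_PE_pcrel | DW_EH_PE_indirect | DW_EH_PE_sdata4;
  if (code == 1 || global == 0)
    return DW_EH_PE_pcrel;
  return DW_EH_PE_absptr;
}
```

Before the change, the (CODE == 0, GLOBAL == 1) case fell through to DW_EH_PE_absptr.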

[pushed] Darwin, machopic: Debug printer for macho symbol flags.

2023-09-04 Thread Iain Sandoe via Gcc-patches

Tested on x86_64-darwin21 and i686-darwin9, pushed to trunk,
thanks.
Iain

--- 8< ---

There are now quite a few symbol flags, so it is sometimes useful to get
them in a text form, rather than decoding the hex number printed by
debug_rtx().

Signed-off-by: Iain Sandoe 

gcc/ChangeLog:

* config/darwin.cc (dump_machopic_symref_flags): New.
(debug_machopic_symref_flags): New.
---
 gcc/config/darwin.cc | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/gcc/config/darwin.cc b/gcc/config/darwin.cc
index efbcb3856ca..0d53e97ae80 100644
--- a/gcc/config/darwin.cc
+++ b/gcc/config/darwin.cc
@@ -258,6 +258,45 @@ name_needs_quotes (const char *name)
   return 0;
 }
 
+DEBUG_FUNCTION void
+dump_machopic_symref_flags (FILE *dump, rtx sym_ref)
+{
+  unsigned long flags = SYMBOL_REF_FLAGS (sym_ref);
+
+  fprintf (dump, "flags: %08lx %c%c%c%c%c%c%c",
+  flags,
+  (MACHO_SYMBOL_STATIC_P (sym_ref) ? 's' : '-'),
+  (MACHO_SYMBOL_INDIRECTION_P (sym_ref) ? 'I' : '-'),
+  (MACHO_SYMBOL_LINKER_VIS_P (sym_ref) ? 'l' : '-'),
+  (MACHO_SYMBOL_HIDDEN_VIS_P (sym_ref) ? 'h' : '-'),
+  (MACHO_SYMBOL_DEFINED_P (sym_ref) ? 'd' : '-'),
+  (MACHO_SYMBOL_MUST_INDIRECT_P (sym_ref) ? 'i' : '-'),
+  (MACHO_SYMBOL_VARIABLE_P (sym_ref) ? 'v' : '-'));
+
+#if (DARWIN_X86)
+  fprintf (dump, "%c%c%c%c",
+(SYMBOL_REF_STUBVAR_P (sym_ref) ? 'S' : '-'),
+(SYMBOL_REF_DLLEXPORT_P (sym_ref) ? 'X' : '-'),
+(SYMBOL_REF_DLLIMPORT_P (sym_ref) ? 'I' : '-'),
+(SYMBOL_REF_FAR_ADDR_P (sym_ref) ? 'F' : '-'));
+#endif
+
+  fprintf (dump, "%c%c%c%03u%c%c%c\n",
+  (SYMBOL_REF_ANCHOR_P (sym_ref) ? 'a' : '-'),
+  (SYMBOL_REF_HAS_BLOCK_INFO_P (sym_ref) ? 'b' : '-'),
+  (SYMBOL_REF_EXTERNAL_P (sym_ref) ? 'e' : '-'),
+  (unsigned)SYMBOL_REF_TLS_MODEL (sym_ref),
+  (SYMBOL_REF_SMALL_P (sym_ref) ? 'm' : '-'),
+  (SYMBOL_REF_LOCAL_P (sym_ref) ? 'l' : '-'),
+  (SYMBOL_REF_FUNCTION_P (sym_ref) ? 'f' : '-'));
+}
+
+DEBUG_FUNCTION void
+debug_machopic_symref_flags (rtx sym_ref)
+{
+  dump_machopic_symref_flags (stderr, sym_ref);
+}
+
 /* Return true if SYM_REF can be used without an indirection.  */
 int
 machopic_symbol_defined_p (rtx sym_ref)
-- 
2.39.2 (Apple Git-143)
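
The printing scheme used by the new debug function (one letter per flag that is set, '-' otherwise) can be shown with a small standalone sketch. The flag bits and the flags_to_text name below are made up for illustration; the real MACHO_SYMBOL_* predicates and their bit assignments live in the Darwin port.

```cpp
#include <string>

// Hypothetical flag bits, for illustration only.
enum : unsigned long {
  FLAG_STATIC   = 1ul << 0,
  FLAG_DEFINED  = 1ul << 1,
  FLAG_VARIABLE = 1ul << 2
};

// Render each flag as its letter when set, or '-' when clear.
std::string flags_to_text(unsigned long flags)
{
  std::string out;
  out += (flags & FLAG_STATIC) ? 's' : '-';
  out += (flags & FLAG_DEFINED) ? 'd' : '-';
  out += (flags & FLAG_VARIABLE) ? 'v' : '-';
  return out;
}
```

A fixed-width rendering like this keeps each flag in the same column, which makes successive dumps easy to compare by eye.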

RE: [PATCH] [tree-optimization/110279] swap operands in reassoc to reduce cross backedge FMA

2023-09-04 Thread Di Zhao OS via Gcc-patches

> -Original Message-
> From: Richard Biener 
> Sent: Thursday, August 31, 2023 8:23 PM
> To: Di Zhao OS 
> Cc: Jeff Law ; Martin Jambor ; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH] [tree-optimization/110279] swap operands in reassoc to
> reduce cross backedge FMA
> 
> On Wed, Aug 30, 2023 at 11:33 AM Di Zhao OS
>  wrote:
> >
> > Hello Richard,
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Tuesday, August 29, 2023 7:11 PM
> > > To: Di Zhao OS 
> > > Cc: Jeff Law ; Martin Jambor ;
> gcc-
> > > patc...@gcc.gnu.org
> > > Subject: Re: [PATCH] [tree-optimization/110279] swap operands in reassoc
> to
> > > reduce cross backedge FMA
> > >
> > > On Tue, Aug 29, 2023 at 10:59 AM Di Zhao OS
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > > -Original Message-
> > > > > From: Richard Biener 
> > > > > Sent: Tuesday, August 29, 2023 4:09 PM
> > > > > To: Di Zhao OS 
> > > > > Cc: Jeff Law ; Martin Jambor ;
> > > gcc-
> > > > > patc...@gcc.gnu.org
> > > > > Subject: Re: [PATCH] [tree-optimization/110279] swap operands in
> reassoc
> > > to
> > > > > reduce cross backedge FMA
> > > > >
> > > > > On Tue, Aug 29, 2023 at 9:49 AM Di Zhao OS
> > > > >  wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: Richard Biener 
> > > > > > > Sent: Tuesday, August 29, 2023 3:41 PM
> > > > > > > To: Jeff Law ; Martin Jambor
> 
> > > > > > > Cc: Di Zhao OS ; gcc-
> > > patc...@gcc.gnu.org
> > > > > > > Subject: Re: [PATCH] [tree-optimization/110279] swap operands in
> > > reassoc
> > > > > to
> > > > > > > reduce cross backedge FMA
> > > > > > >
> > > > > > > On Tue, Aug 29, 2023 at 1:23 AM Jeff Law via Gcc-patches
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On 8/28/23 02:17, Di Zhao OS via Gcc-patches wrote:
> > > > > > > > > This patch tries to fix the 2% regression in 510.parest_r on
> > > > > > > > > ampere1 in the tracker. (Previous discussion is here:
> > > > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-
> July/624893.html)
> > > > > > > > >
> > > > > > > > > 1. Add testcases for the problem. For an op list in the form
> of
> > > > > > > > > "acc = a * b + c * d + acc", currently reassociation doesn't
> > > > > > > > > Swap the operands so that more FMAs can be generated.
> > > > > > > > > After widening_mul the result looks like:
> > > > > > > > >
> > > > > > > > > _1 = .FMA(a, b, acc_0);
> > > > > > > > > acc_1 = .FMA(c, d, _1);
> > > > > > > > >
> > > > > > > > > While previously (before the "Handle FMA friendly..." patch),
> > > > > > > > > widening_mul's result was like:
> > > > > > > > >
> > > > > > > > > _1 = a * b;
> > > > > > > > > _2 = .FMA (c, d, _1);
> > > > > > > > > acc_1 = acc_0 + _2;
> > > > > > >
> > > > > > > How can we execute the multiply and the FMA in parallel?  They
> > > > > > > depend on each other.  Or is it the uarch can handle dependence
> > > > > > > on the add operand but only when it is with a multiplication and
> > > > > > > not a FMA in some better ways?  (I'd doubt so much complexity)
> > > > > > >
> > > > > > > Can you explain in more detail how the uarch executes one vs. the
> > > > > > > other case?
> > > >
> > > > Here's my understanding after consulting our hardware team. For the
> > > > second case, the uarch of some out-of-order processors can calculate
> > > > "_2" of several loops at the same time, since there's no dependency
> > > > among different iterations. While for the first case the next iteration
> > > > has to wait for the current iteration to finish, so "acc_0"'s value is
> > > > known. I assume it is also the case in some i386 processors, since I
> > > > saw the patch "Deferring FMA transformations in tight loops" also
> > > > changed corresponding files.
> > >
> > > That should be true for all kind of operations, no?  Thus it means
> > > reassoc should in general associate cross-iteration accumulation
> > Yes I think both are true.
> >
> > > last?  Historically we associated those first because that's how the
> > > vectorizer liked to see them, but I think that's no longer necessary.
> > >
> > > It should be achievable by properly biasing the operand during
> > > rank computation (don't we already do that?).
> >
> > The issue is related to the following code (handling cases with
> > three operands left):
> >   /* When there are three operands left, we want
> >  to make sure the ones that get the double
> >  binary op are chosen wisely.  */
> >   int len = ops.length ();
> >   if (len >= 3 && !has_fma)
> > swap_ops_for_binary_stmt (ops, len - 3);
> >
> >   new_lhs = rewrite_expr_tree (stmt, rhs_code, 0, ops,
> >powi_result != NULL
> >|| negate_result,
> >len != orig_len);
> >
> > Originally (before the "Handle FMA friendly..." patch), for the
> >
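
The two widening_mul results quoted in this thread can be sketched in
plain C++ to make the dependency chains visible. This is only an
illustration under stated assumptions: the function names are hypothetical,
std::fma stands in for the internal .FMA function, and nothing here is
taken from the reassoc pass itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// First form from the thread: _1 = .FMA(a, b, acc_0); acc_1 = .FMA(c, d, _1).
// Both FMAs sit on the loop-carried chain through acc.
double accumulate_chained(const std::vector<double>& a,
                          const std::vector<double>& b,
                          const std::vector<double>& c,
                          const std::vector<double>& d) {
  double acc = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    double t = std::fma(a[i], b[i], acc);  // needs acc from the previous iteration
    acc = std::fma(c[i], d[i], t);         // needs t, so it is also on the chain
  }
  return acc;
}

// Second form: _1 = a * b; _2 = .FMA(c, d, _1); acc_1 = acc_0 + _2.
// Only the final add depends on the previous iteration.
double accumulate_split(const std::vector<double>& a,
                        const std::vector<double>& b,
                        const std::vector<double>& c,
                        const std::vector<double>& d) {
  double acc = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    double p = a[i] * b[i];              // independent of acc
    double s = std::fma(c[i], d[i], p);  // independent of acc
    acc = acc + s;                       // the only loop-carried operation
  }
  return acc;
}
```

In the second form an out-of-order core can start the multiply and FMA of
later iterations early, which matches the explanation given above. The two
forms round differently in general, so they are only interchangeable where
reassociation is already permitted.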

Re: [PATCH v5 1/4] c++, libstdc++: Implement __is_arithmetic built-in trait

2023-09-04 Thread Jonathan Wakely via Gcc-patches

On Fri, 1 Sept 2023 at 12:25, Ken Matsui via Libstdc++
 wrote:
>
> This patch implements built-in trait for std::is_arithmetic.

The libstdc++-v3 parts are OK for trunk.


>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __is_arithmetic.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_ARITHMETIC.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_arithmetic.
> * g++.dg/ext/is_arithmetic.C: New test.
> * g++.dg/tm/pr46567.C (__is_arithmetic): Rename to ...
> (__is_arith): ... this.
> * g++.dg/torture/pr57107.C: Likewise.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/cpp_type_traits.h (__is_arithmetic): Rename to ...
> (__is_arith): ... this.
> * include/c_global/cmath: Use __is_arith instead.
> * include/c_std/cmath: Likewise.
> * include/tr1/cmath: Likewise.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc|  3 ++
>  gcc/cp/cp-trait.def |  1 +
>  gcc/cp/semantics.cc |  4 ++
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
>  gcc/testsuite/g++.dg/ext/is_arithmetic.C| 33 ++
>  gcc/testsuite/g++.dg/tm/pr46567.C   |  6 +--
>  gcc/testsuite/g++.dg/torture/pr57107.C  |  4 +-
>  libstdc++-v3/include/bits/cpp_type_traits.h |  4 +-
>  libstdc++-v3/include/c_global/cmath | 48 ++---
>  libstdc++-v3/include/c_std/cmath| 24 +--
>  libstdc++-v3/include/tr1/cmath  | 24 +--
>  11 files changed, 99 insertions(+), 55 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_arithmetic.C
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 8cf0f2d0974..bd517d08843 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3754,6 +3754,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_AGGREGATE:
>inform (loc, "  %qT is not an aggregate", t1);
>break;
> +case CPTK_IS_ARITHMETIC:
> +  inform (loc, "  %qT is not an arithmetic type", t1);
> +  break;
>  case CPTK_IS_TRIVIALLY_COPYABLE:
>inform (loc, "  %qT is not trivially copyable", t1);
>break;
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 8b7fece0cc8..a95aeeaf778 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> "__is_trivially_assignable", 2)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> -1)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
>  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> +DEFTRAIT_EXPR (IS_ARITHMETIC, "__is_arithmetic", 1)
>  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> "__reference_constructs_from_temporary", 2)
>  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> "__reference_converts_from_temporary", 2)
>  /* FIXME Added space to avoid direct usage in GCC 13.  */
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 8fb47fd179e..4531f047d73 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> tree type2)
>  case CPTK_IS_UNION:
>return type_code1 == UNION_TYPE;
>
> +case CPTK_IS_ARITHMETIC:
> +  return ARITHMETIC_TYPE_P (type1);
> +
>  case CPTK_IS_ASSIGNABLE:
>return is_xible (MODIFY_EXPR, type1, type2);
>
> @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>  case CPTK_IS_ENUM:
>  case CPTK_IS_UNION:
>  case CPTK_IS_SAME:
> +case CPTK_IS_ARITHMETIC:
>break;
>
>  case CPTK_IS_LAYOUT_COMPATIBLE:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index f343e153e56..3d63b0101d1 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -146,3 +146,6 @@
>  #if !__has_builtin (__remove_cvref)
>  # error "__has_builtin (__remove_cvref) failed"
>  #endif
> +#if !__has_builtin (__is_arithmetic)
> +# error "__has_builtin (__is_arithmetic) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_arithmetic.C 
> b/gcc/testsuite/g++.dg/ext/is_arithmetic.C
> new file mode 100644
> index 000..fd35831f646
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_arithmetic.C
> @@ -0,0 +1,33 @@
> +// { dg-do compile { target c++11 } }
> +
> +#include 
> +
> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
> +  SA(TRAIT(TYPE) == EXPECT);   \
> +  SA(TRAIT(const TYPE) == EXPECT); \
> +  SA(TRAIT(volatile TYPE) == EXPECT);  \
> +  SA(TRAIT(const

Re: [PING][PATCH v2 2/2] libstdc++: Use new built-in trait __is_signed

2023-09-04 Thread Jonathan Wakely via Gcc-patches

On Fri, 1 Sept 2023 at 14:01, Ken Matsui  wrote:
>
> Ping for the use of __is_signed built-in.
>
> Sincerely,
> Ken Matsui
>
>
> On Wed, Jul 12, 2023 at 6:45 PM Ken Matsui  wrote:
> >
> > This patch lets libstdc++ use new built-in trait __is_signed.

OK for trunk after the built-in is approved for the compiler.


> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/std/type_traits (is_signed): Use __is_signed built-in 
> > trait.
> > (is_signed_v): Likewise.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  libstdc++-v3/include/std/type_traits | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/libstdc++-v3/include/std/type_traits 
> > b/libstdc++-v3/include/std/type_traits
> > index 0e7a9c9c7f3..23ab5a4b1e5 100644
> > --- a/libstdc++-v3/include/std/type_traits
> > +++ b/libstdc++-v3/include/std/type_traits
> > @@ -865,6 +865,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >  : public __bool_constant<__is_abstract(_Tp)>
> >  { };
> >
> > +  /// is_signed
> > +#if __has_builtin(__is_signed)
> > +  template
> > +struct is_signed
> > +: public __bool_constant<__is_signed(_Tp)>
> > +{ };
> > +#else
> >/// @cond undocumented
> >template >bool = is_arithmetic<_Tp>::value>
> > @@ -877,11 +884,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >  { };
> >/// @endcond
> >
> > -  /// is_signed
> >template
> >  struct is_signed
> >  : public __is_signed_helper<_Tp>::type
> >  { };
> > +#endif
> >
> >/// is_unsigned
> >template
> > @@ -3240,8 +3247,14 @@ template 
> >  template 
> >inline constexpr bool is_final_v = __is_final(_Tp);
> >
> > +#if __has_builtin(__is_signed)
> > +template 
> > +  inline constexpr bool is_signed_v = __is_signed(_Tp);
> > +#else
> >  template 
> >inline constexpr bool is_signed_v = is_signed<_Tp>::value;
> > +#endif
> > +
> >  template 
> >inline constexpr bool is_unsigned_v = is_unsigned<_Tp>::value;
> >
> > --
> > 2.41.0
> >
>

[PATCH] Generate vmovsh instead of vpblendw for specific vec_merge.

2023-09-04 Thread liuhongt via Gcc-patches

On SPR, vmovsh can be executed on 3 ports, while vpblendw can only be
executed on 2 ports.
On znver4, vpblendw can be executed on 4 ports; if vmovsh is similar
to vmovss, then it can also be executed on 4 ports.
So there is no difference for znver4, but vmovsh is better optimized on
SPR.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.

gcc/ChangeLog:

* config/i386/sse.md: (V8BFH_128): Renamed to ..
(VHFBF_128): .. this.
(V16BFH_256): Renamed to ..
(VHFBF_256): .. this.
(avx512f_mov): Extend to V_128.
(vcvtnee2ps_): Changed to VHFBF_128.
(vcvtneo2ps_): Ditto.
(vcvtnee2ps_): Changed to VHFBF_256.
(vcvtneo2ps_): Ditto.
* config/i386/i386-expand.cc (expand_vec_perm_blend):
Canonicalize vec_merge.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512fp16-vmovsh-1a.c: Remove xfail.
---
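
The canonicalization in expand_vec_perm_blend swaps the two operands and
complements the mask within the number of vector elements. A standalone
sketch of why that preserves the result is below. The blend model and the
function names are assumptions made for illustration (bit i set selects
element i from the first operand); this is not the GCC code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Complement MASK within the low N_ELTS bits. A shift by 64 would be
// undefined, so the full-width case is handled separately.
std::uint64_t complement_blend_mask(std::uint64_t mask, unsigned n_elts) {
  assert(n_elts >= 1 && n_elts <= 64);
  std::uint64_t mask_all = (n_elts == 64)
                               ? ~std::uint64_t{0}
                               : ((std::uint64_t{1} << n_elts) - 1);
  return ~mask & mask_all;
}

// Toy two-operand blend: bit i set picks first[i], otherwise second[i].
std::vector<int> blend(const std::vector<int>& first,
                       const std::vector<int>& second,
                       std::uint64_t mask) {
  std::vector<int> out(first.size());
  for (std::size_t i = 0; i < first.size(); ++i)
    out[i] = ((mask >> i) & 1) ? first[i] : second[i];
  return out;
}
```

Swapping the operands and passing the complemented mask yields the same
vector, which is what allows the blend to be rewritten into one preferred
operand order.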
 gcc/config/i386/i386-expand.cc| 17 +
 gcc/config/i386/sse.md| 25 ---
 .../gcc.target/i386/avx512fp16-vmovsh-1a.c|  2 +-
 3 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index cbd51a0f362..e42ff27c6ef 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -19433,6 +19433,23 @@ expand_vec_perm_blend (struct expand_vec_perm_d *d)
   mmode = VOIDmode;
 }
 
+  /* Canonicalize vec_merge.  */
+  if (swap_commutative_operands_p (op1, op0)
+  /* Two operands have same precedence, then
+first bit of mask select first operand.  */
+  || (!swap_commutative_operands_p (op0, op1)
+ && !(mask & 1)))
+{
+  unsigned n_elts = GET_MODE_NUNITS (vmode);
+  std::swap (op0, op1);
+  unsigned HOST_WIDE_INT mask_all = HOST_WIDE_INT_1U;
+  if (n_elts == HOST_BITS_PER_WIDE_INT)
+   mask_all  = -1;
+  else
+   mask_all = (HOST_WIDE_INT_1U << n_elts) - 1;
+  mask = ~mask & mask_all;
+}
+
   if (mmode != VOIDmode)
 maskop = force_reg (mmode, gen_int_mode (mask, mmode));
   else
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index e282d978a01..6d3ae8dea0c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -459,8 +459,9 @@ (define_mode_iterator VF2_AVX512VL
 (define_mode_iterator VF1_AVX512VL
   [V16SF (V8SF "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL")])
 
-(define_mode_iterator VHFBF
-  [V32HF V16HF V8HF V32BF V16BF V8BF])
+(define_mode_iterator VHFBF [V32HF V16HF V8HF V32BF V16BF V8BF])
+(define_mode_iterator VHFBF_256 [V16HF V16BF])
+(define_mode_iterator VHFBF_128 [V8HF V8BF])
 
 (define_mode_iterator VHF_AVX512VL
   [V32HF (V16HF "TARGET_AVX512VL") (V8HF "TARGET_AVX512VL")])
@@ -11134,13 +11135,11 @@ (define_insn_and_split 
"*vec_setv2di_0_zero_extendhi_1"
   DONE;
 })
 
-(define_mode_iterator V8BFH_128 [V8HF V8BF])
-
 (define_insn "avx512fp16_mov"
-  [(set (match_operand:V8BFH_128 0 "register_operand" "=v")
-   (vec_merge:V8BFH_128
-  (match_operand:V8BFH_128 2 "register_operand" "v")
- (match_operand:V8BFH_128 1 "register_operand" "v")
+  [(set (match_operand:V8_128 0 "register_operand" "=v")
+   (vec_merge:V8_128
+ (match_operand:V8_128 2 "register_operand" "v")
+ (match_operand:V8_128 1 "register_operand" "v")
  (const_int 1)))]
   "TARGET_AVX512FP16"
   "vmovsh\t{%2, %1, %0|%0, %1, %2}"
@@ -30358,8 +30357,6 @@ (define_insn "vbcstnesh2ps_"
   [(set_attr "prefix" "vex")
(set_attr "mode" "")])
 
-(define_mode_iterator V16BFH_256 [V16HF V16BF])
-
 (define_mode_attr bf16_ph
   [(V8HF "ph") (V16HF "ph")
(V8BF "bf16") (V16BF "bf16")])
@@ -30368,7 +30365,7 @@ (define_insn "vcvtnee2ps_"
   [(set (match_operand:V4SF 0 "register_operand" "=x")
(float_extend:V4SF
  (vec_select:
-   (match_operand:V8BFH_128 1 "memory_operand" "m")
+   (match_operand:VHFBF_128 1 "memory_operand" "m")
(parallel [(const_int 0) (const_int 2)
   (const_int 4) (const_int 6)]]
   "TARGET_AVXNECONVERT"
@@ -30380,7 +30377,7 @@ (define_insn "vcvtnee2ps_"
   [(set (match_operand:V8SF 0 "register_operand" "=x")
(float_extend:V8SF
  (vec_select:
-   (match_operand:V16BFH_256 1 "memory_operand" "m")
+   (match_operand:VHFBF_256 1 "memory_operand" "m")
(parallel [(const_int 0) (const_int 2)
   (const_int 4) (const_int 6)
   (const_int 8) (const_int 10)
@@ -30394,7 +30391,7 @@ (define_insn "vcvtneo2ps_"
   [(set (match_operand:V4SF 0 "register_operand" "=x")
(float_extend:V4SF
  (vec_select:
-   (match_operand:V8BFH_128 1 "memory_operand" "m")
+   (match_operand:VHFBF_128 1 "memory_operand" "m")
(parallel [(const_int 1) (const_int 3)
   (const_int 5) (const_int 7)]]
   "TARGET_AVXNECONVERT"
@@ -30406,7 +30403,7 @@ (define_insn

Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-09-04 Thread Hongtao Liu via Gcc-patches

On Mon, Sep 4, 2023 at 4:57 PM Uros Bizjak  wrote:
>
> On Mon, Sep 4, 2023 at 2:28 AM Hongtao Liu  wrote:
>
> > > > > > > > I think there should be some constraint which explicitly has 
> > > > > > > > all the 32
> > > > > > > > GPRs, like there is one for just all 16 GPRs (h), so that 
> > > > > > > > regardless of
> > > > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline 
> > > > > > > > asm wants.
> > > > > > > >
> > > > > > > > Also, what about the "g" constraint?  Shouldn't there be 
> > > > > > > > another for "g"
> > > > > > > > without r16..r31?  What about the various other memory
> > > > > > > > constraints ("<", "o", ...)?
> > > > > > >
> > > > > > > I think we should leave all existing constraints as they are, so 
> > > > > > > "r"
> > > > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > > > > > > introduce "h" to instructions that have the ability to handle 
> > > > > > > EGPR.
> > > > > > > This would be somehow similar to the SSE -> AVX512F transition, 
> > > > > > > where
> > > > > > > we still have "x" for SSE16 and "v" was introduced as a separate
> > > > > > > register class for EVEX SSE registers. This way, asm will be
> > > > > > > compatible, when "r", "m", "o" and "g" are used. The new memory
> > > > > > > constraint "Bt", should allow new registers, and should be added 
> > > > > > > to
> > > > > > > the constraint string as a separate constraint, and conditionally
> > > > > > > enabled by relevant "isa" (AKA "enabled") attribute.
> > > > > >
> > > > > > The extended constraint can work for registers, but for memory it 
> > > > > > is more
> > > > > > complicated.
> > > > >
> > > > > Yes, unfortunately. The compiler assumes that an unchangeable register
> > > > > class is used for BASE/INDEX registers. I have hit this limitation
> > > > > when trying to implement memory support for instructions involving
> > > > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
> > > > > registers, also inside memory operand. (You can see the "hack" in e.g.
> > > > > *extzvqi_mem_rex64" and corresponding peephole2 with the original
> > > > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
> > > > > register class is the major limitation in the compiler, so perhaps the
> > > > > strategy on how to override this limitation should be discussed with
> > > > > the register allocator author first. Perhaps adding an insn attribute
> > > > > to insn RTX pattern to specify different BASE/INDEX register sets can
> > > > > be a better solution than passing insn RTX to the register allocator.
> > > > >
> > > > > The above idea still does not solve the asm problem on how to select
> > > > > correct BASE/INDEX register set for memory operands.
> > > > The current approach disables gpr32 for memory operand in asm_operand
> > > > by default. but can be turned on by options
> > > > ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction
> > > > supports gpr32).
> > > > Only ~ 5% of total instructions don't support gpr32, reversed approach
> > > > only gonna get more complicated.
> > >
> > > I'm not referring to the reversed approach, just want to point out
> > > that the same approach as you proposed w.r.t. to memory operand can be
> > > achieved using some named insn attribute that would affect BASE/INDEX
> > > register class selection. The attribute could default to gpr32 with
> > > APX, unless the insn specific attribute has e.g. nogpr32 value. See
> > > for example how "enabled" and "preferred_for_*" attributes are used.
> > > Perhaps this new attribute can also be applied to separate
> > > alternatives.
> > Yes, for xop/fma4/3dnow instructions, I think we can use isa attr like
> > (define_attr "gpr32" "0, 1"
> >   (cond [(eq_attr "isa" "fma4")
> >(const_string "0")]
> >   (const_string "1")))
>
> Just a nit, can the member be named "map0" and "map1"? The code will
> then look like:
>
> if (get_attr_gpr32 (insn) == GPR32_MAP0) ...
>
> instead of:
>
> if (get_attr_gpr32 (insn) == GPR32_0) ...
>
> > But still, we need to adjust memory constraints in the pattern.
>
> I guess the gpr32 property is the same for all alternatives of the
> insn pattern. In this case,  "m" "g" and "a" constraints could remain
> as they are, the final register class will be adjusted (by some target
> hook?) based on the value of gpr32 attribute.
I'm worried that not all RTL optimizers after post_reload will respect
base/index_reg_class with regard to the insn they belong to.
If they just check whether it's a legitimate memory/address (the current
legitimate_address doesn't have a corresponding insn to pass down),
m/g/a will still generate invalid instructions.
So the defensive approach is to explicitly modify the constraint.
>
> > Ideally, gcc includes encoding information for every instruction,
> > (.i.e. map0/map1), so that we can determine the attribute value of
> > gpr32 directly from this information.
>
> I think the right

[PATCH] RISC-V: Fix Dynamic LMUL compile option

2023-09-04 Thread Juzhe-Zhong

gcc/ChangeLog:

* config/riscv/riscv-opts.h (enum riscv_autovec_lmul_enum): Fix Dynamic 
status.
* config/riscv/riscv-v.cc (preferred_simd_mode): Ditto.
(autovectorize_vector_modes): Ditto.
(vectorize_related_mode): Ditto.

---
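
A small standalone sketch of the idea behind this fix: give the dynamic
setting its own enumerator value so that it can be told apart from LMUL8,
and map it back to LMUL8 wherever a concrete multiplier is needed. The
enum name matches the patch, but RVV_M1 and the value of RVV_M2 are
assumptions (they are not shown in the hunk), and effective_lmul is a
hypothetical helper, not a function in riscv-v.cc.

```cpp
#include <cassert>

enum riscv_autovec_lmul_enum {
  RVV_M1 = 1,      // assumed; not visible in the hunk above
  RVV_M2 = 2,      // name is used by the patch, value assumed
  RVV_M4 = 4,
  RVV_M8 = 8,
  RVV_DYNAMIC = 9  // distinct from RVV_M8 after the fix
};

// Hypothetical helper: the multiplier to use when computing a vector
// size. Dynamic LMUL starts its cost comparison from LMUL8.
int effective_lmul(riscv_autovec_lmul_enum setting) {
  return setting == RVV_DYNAMIC ? RVV_M8 : setting;
}
```

With RVV_DYNAMIC equal to RVV_M8, a comparison against RVV_DYNAMIC would
also match an explicit LMUL8 request; the separate value removes that
ambiguity.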
 gcc/config/riscv/riscv-opts.h |  2 +-
 gcc/config/riscv/riscv-v.cc   | 15 ---
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 79e0f12e388..b6b5907e111 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -81,7 +81,7 @@ enum riscv_autovec_lmul_enum {
   RVV_M4 = 4,
   RVV_M8 = 8,
   /* For dynamic LMUL, we compare COST start with LMUL8.  */
-  RVV_DYNAMIC = RVV_M8
+  RVV_DYNAMIC = 9
 };
 
 enum riscv_multilib_select_kind {
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index c8ad96f44d5..fbbc16a3c26 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1971,16 +1971,16 @@ preferred_simd_mode (scalar_mode mode)
  vectorizer when we enable them in this target hook. Currently, we can
  support auto-vectorization in -march=rv32_zve32x_zvl128b. Wheras,
  -march=rv32_zve32x_zvl32b or -march=rv32_zve32x_zvl64b are disabled.  */
+  int lmul = riscv_autovec_lmul == RVV_DYNAMIC ? RVV_M8 : riscv_autovec_lmul;
   if (autovec_use_vlmax_p ())
 {
-  if (TARGET_MIN_VLEN < 128 && riscv_autovec_lmul < RVV_M2)
+  if (TARGET_MIN_VLEN < 128 && lmul < RVV_M2)
return word_mode;
   /* We use LMUL = 1 as base bytesize which is BYTES_PER_RISCV_VECTOR and
 riscv_autovec_lmul as multiply factor to calculate the the NUNITS to
 get the auto-vectorization mode.  */
   poly_uint64 nunits;
-  poly_uint64 vector_size
-   = BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul);
+  poly_uint64 vector_size = BYTES_PER_RISCV_VECTOR * lmul;
   poly_uint64 scalar_size = GET_MODE_SIZE (mode);
   gcc_assert (multiple_p (vector_size, scalar_size, ));
   machine_mode rvv_mode;
@@ -2154,10 +2154,10 @@ get_cmp_insn_code (rtx_code code, machine_mode mode)
 unsigned int
 autovectorize_vector_modes (vector_modes *modes, bool)
 {
+  int lmul = riscv_autovec_lmul == RVV_DYNAMIC ? RVV_M8 : riscv_autovec_lmul;
   if (autovec_use_vlmax_p ())
 {
-  poly_uint64 full_size
-   = BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul);
+  poly_uint64 full_size = BYTES_PER_RISCV_VECTOR * lmul;
 
   /* Start with a RVVQImode where LMUL is the number of units that
 fit a whole vector.
@@ -2187,7 +2187,7 @@ autovectorize_vector_modes (vector_modes *modes, bool)
 {
   /* Push all VLSmodes according to TARGET_MIN_VLEN.  */
   unsigned int i = 0;
-  unsigned int base_size = TARGET_MIN_VLEN * riscv_autovec_lmul / 8;
+  unsigned int base_size = TARGET_MIN_VLEN * lmul / 8;
   unsigned int size = base_size;
   machine_mode mode;
   while (size > 0 && get_vector_mode (QImode, size).exists ())
@@ -2212,8 +2212,9 @@ vectorize_related_mode (machine_mode vector_mode, 
scalar_mode element_mode,
 {
   /* TODO: We will support RVV VLS auto-vectorization mode in the future. */
   poly_uint64 min_units;
+  int lmul = riscv_autovec_lmul == RVV_DYNAMIC ? RVV_M8 : riscv_autovec_lmul;
   if (autovec_use_vlmax_p () && riscv_v_ext_vector_mode_p (vector_mode)
-  && multiple_p (BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul),
+  && multiple_p (BYTES_PER_RISCV_VECTOR * lmul,
 GET_MODE_SIZE (element_mode), _units))
 {
   machine_mode rvv_mode;
-- 
2.36.1

Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-09-04 Thread Uros Bizjak via Gcc-patches

On Mon, Sep 4, 2023 at 2:28 AM Hongtao Liu  wrote:

> > > > > > > I think there should be some constraint which explicitly has all 
> > > > > > > the 32
> > > > > > > GPRs, like there is one for just all 16 GPRs (h), so that 
> > > > > > > regardless of
> > > > > > > -mapx-inline-asm-use-gpr32 one can be explicit what the inline 
> > > > > > > asm wants.
> > > > > > >
> > > > > > > Also, what about the "g" constraint?  Shouldn't there be another 
> > > > > > > for "g"
> > > > > > > without r16..r31?  What about the various other memory
> > > > > > > constraints ("<", "o", ...)?
> > > > > >
> > > > > > I think we should leave all existing constraints as they are, so "r"
> > > > > > covers only GPR16, "m" and "o" to only use GPR16. We can then
> > > > > > introduce "h" to instructions that have the ability to handle EGPR.
> > > > > > This would be somehow similar to the SSE -> AVX512F transition, 
> > > > > > where
> > > > > > we still have "x" for SSE16 and "v" was introduced as a separate
> > > > > > register class for EVEX SSE registers. This way, asm will be
> > > > > > compatible, when "r", "m", "o" and "g" are used. The new memory
> > > > > > constraint "Bt", should allow new registers, and should be added to
> > > > > > the constraint string as a separate constraint, and conditionally
> > > > > > enabled by relevant "isa" (AKA "enabled") attribute.
> > > > >
> > > > > The extended constraint can work for registers, but for memory it is 
> > > > > more
> > > > > complicated.
> > > >
> > > > Yes, unfortunately. The compiler assumes that an unchangeable register
> > > > class is used for BASE/INDEX registers. I have hit this limitation
> > > > when trying to implement memory support for instructions involving
> > > > 8-bit high registers (%ah, %bh, %ch, %dh), which do not support REX
> > > > registers, also inside memory operand. (You can see the "hack" in e.g.
> > > > *extzvqi_mem_rex64" and corresponding peephole2 with the original
> > > > *extzvqi pattern). I am aware that dynamic insn-dependent BASE/INDEX
> > > > register class is the major limitation in the compiler, so perhaps the
> > > > strategy on how to override this limitation should be discussed with
> > > > the register allocator author first. Perhaps adding an insn attribute
> > > > to insn RTX pattern to specify different BASE/INDEX register sets can
> > > > be a better solution than passing insn RTX to the register allocator.
> > > >
> > > > The above idea still does not solve the asm problem on how to select
> > > > correct BASE/INDEX register set for memory operands.
> > > The current approach disables gpr32 for memory operand in asm_operand
> > > by default. but can be turned on by options
> > > ix86_apx_inline_asm_use_gpr32(users need to guarantee the instruction
> > > supports gpr32).
> > > Only ~ 5% of total instructions don't support gpr32, reversed approach
> > > only gonna get more complicated.
> >
> > I'm not referring to the reversed approach, just want to point out
> > that the same approach as you proposed w.r.t. to memory operand can be
> > achieved using some named insn attribute that would affect BASE/INDEX
> > register class selection. The attribute could default to gpr32 with
> > APX, unless the insn specific attribute has e.g. nogpr32 value. See
> > for example how "enabled" and "preferred_for_*" attributes are used.
> > Perhaps this new attribute can also be applied to separate
> > alternatives.
> Yes, for xop/fma4/3dnow instructions, I think we can use isa attr like
> (define_attr "gpr32" "0, 1"
>   (cond [(eq_attr "isa" "fma4")
>(const_string "0")]
>   (const_string "1")))

Just a nit, can the member be named "map0" and "map1"? The code will
then look like:

if (get_attr_gpr32 (insn) == GPR32_MAP0) ...

instead of:

if (get_attr_gpr32 (insn) == GPR32_0) ...

> But still, we need to adjust memory constraints in the pattern.

I guess the gpr32 property is the same for all alternatives of the
insn pattern. In this case,  "m" "g" and "a" constraints could remain
as they are, the final register class will be adjusted (by some target
hook?) based on the value of gpr32 attribute.

> Ideally, gcc includes encoding information for every instruction,
> (.i.e. map0/map1), so that we can determine the attribute value of
> gpr32 directly from this information.

I think the right tool for this is attribute infrastructure of insn
patterns. We can set the default, set precise value of the insns, or
calculate attribute from some other attribute in a quite flexible way.
Other than that, adjusting BASE/INDEX register class of the RA pass is
the infrastructure change, but perhaps similar to the one you
proposed.

Uros.

Re: [PATCH v2] LoongArch: initial ada support on linux

2023-09-04 Thread Arnaud Charlet via Gcc-patches

OK, thanks.

> gcc/ChangeLog:
> 
>   * ada/Makefile.rtl: Add LoongArch support.
>   * ada/libgnarl/s-linux__loongarch.ads: New.
>   * ada/libgnat/system-linux-loongarch.ads: New.
>   * config/loongarch/loongarch.h: mark normalized options
>   passed from driver to gnat1 as explicit for multilib.

Re: [PATCH] analyzer: Add support of placement new and improved operator new [PR105948,PR94355]

2023-09-04 Thread Christophe Lyon via Gcc-patches

Hi Benjamin,

On Fri, 1 Sept 2023 at 17:45, David Malcolm via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> On Fri, 2023-09-01 at 16:48 +0200, Benjamin Priour wrote:
> > Patch has been updated as per your suggestions and successfully
> > regstrapped
> > on x86_64-linux-gnu.
> >

The new testcase placement-new-size.C fails on aarch64:
placement-new-size.C:10:3: error: 'int8_t' was not declared in this scope;
did you mean 'wint_t'?
placement-new-size.C:11:3: error: 'int64_t' was not declared in this scope
placement-new-size.C:11:12: error: 'lp' was not declared in this scope
placement-new-size.C:11:23: error: 's' was not declared in this scope
placement-new-size.C:11:26: error: 'int64_t' does not name a type
placement-new-size.C:34:3: error: 'int32_t' was not declared in this scope
placement-new-size.C:34:12: error: 'i' was not declared in this scope
placement-new-size.C:34:30: error: 'int32_t' does not name a type

I suspect you should include  (instead of stdlib.h)

Thanks,

Christophe

> > call_details::maybe_get_arg_region is now
> > /* If argument IDX's svalue at the callsite is of pointer type,
> > return the region it points to.
> > Otherwise return NULL.  */
> >
> > const region *
> >  call_details::deref_ptr_arg (unsigned idx) const
> >  {
> >const svalue *ptr_sval = get_arg_svalue (idx);
> >return m_model->deref_rvalue (ptr_sval, get_arg_tree (idx),
> > m_ctxt);
> >  }
> >
> >
> > New test is
> >
> > +
> > +void test_binop ()
> > +{
> > +  char *p = (char *) malloc (4);
> > +  if (!p)
> > +return;
> > +  int32_t *i = ::new (p + 1) int32_t; /* { dg-warning "heap-based
> > buffer
> > overflow" } */
> > +  *i = 42; /* { dg-warning "heap-based buffer overflow" } */
> > +  free (p);
> > +}
> >
> > Is it OK for trunk ?
> > I didn't resend the whole patch as it otherwise was OK.
>
> Yes, thanks.
>
> Dave
>
>
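
For reference, a minimal sketch of a placement-new test that names the
fixed-width types portably. It assumes the standard <cstdint> header for
int32_t; the helper names are hypothetical and this is not the testcase
from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// True if an object of OBJ_SIZE bytes placed at OFFSET fits inside a
// buffer of BUF_SIZE bytes.
bool placement_fits(std::size_t buf_size, std::size_t offset,
                    std::size_t obj_size) {
  return offset <= buf_size && obj_size <= buf_size - offset;
}

// Construct an int32_t in a malloc'ed buffer of exactly the right size.
// malloc returns storage suitably aligned for any fundamental type.
int store_via_placement_new() {
  void *p = std::malloc(sizeof(std::int32_t));
  if (!p)
    return -1;
  std::int32_t *i = ::new (p) std::int32_t(42);
  int result = *i;
  std::free(p);
  return result;
}
```

The quoted test_binop case corresponds to placement_fits (4, 1, 4), which
is false: a 4-byte object at offset 1 of a 4-byte buffer overflows by one
byte.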

[PATCH 3/4] Improve functionality of ree pass.

2023-09-04 Thread Ajit Agarwal via Gcc-patches



Hello Jeff:

This patch eliminates redundant zero and sign extension with ree pass for rs6000
target.

Bootstrapped and regtested for powerpc64-linux-gnu.

Thanks & Regards
Ajit


ree: Improve ree pass

For the rs6000 target we see redundant zero and sign extensions, and the
ree pass is improved to eliminate them. Adds support for
zero_extend/sign_extend/AND.

2023-09-04  Ajit Kumar Agarwal  

gcc/ChangeLog:

* ree.cc (eliminate_across_bbs_p): Add checks to enable extension
elimination across and within basic blocks.
(def_arith_p): New function to check definition has arithmetic
operation.
(combine_set_extension): Modification to incorporate AND
and current zero_extend and sign_extend instruction.
(merge_def_and_ext): Add calls to eliminate_across_bbs_p and
zero_extend sign_extend and AND instruction.
(rtx_is_zext_p): New function.
(feasible_cfg): New function.
* rtl.h (reg_used_set_between_p): Add prototype.
* rtlanal.cc (reg_used_set_between_p): New function.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/zext-elim.C: New testcase.
* g++.target/powerpc/zext-elim-1.C: New testcase.
* g++.target/powerpc/zext-elim-2.C: New testcase.
* g++.target/powerpc/sext-elim.C: New testcase.
---
 gcc/ree.cc| 487 --
 gcc/rtl.h |   1 +
 gcc/rtlanal.cc|  15 +
 gcc/testsuite/g++.target/powerpc/sext-elim.C  |  17 +
 .../g++.target/powerpc/zext-elim-1.C  |  19 +
 .../g++.target/powerpc/zext-elim-2.C  |  11 +
 gcc/testsuite/g++.target/powerpc/zext-elim.C  |  30 ++
 7 files changed, 534 insertions(+), 46 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C

diff --git a/gcc/ree.cc b/gcc/ree.cc
index fc04249fa84..931b9b08821 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -253,6 +253,77 @@ struct ext_cand
 
 static int max_insn_uid;
 
+/* Return TRUE if OP can be considered a zero extension from one or
+   more sub-word modes to larger modes up to a full word.
+
+   For example (and:DI (reg) (const_int X))
+
+   Depending on the value of X could be considered a zero extension
+   from QI, HI and SI to larger modes up to DImode.  */
+
+static bool
+rtx_is_zext_p (rtx insn)
+{
+  if (GET_CODE (insn) == AND)
+{
+  rtx set = XEXP (insn, 0);
+  if (REG_P (set))
+   {
+ rtx src = XEXP (insn, 1);
+ machine_mode m_mode = GET_MODE (set);
+
+ if (CONST_INT_P (src)
+ && (INTVAL (src) == 1
+ || (m_mode == QImode && INTVAL (src) == 0x7)
+ || (m_mode == QImode && INTVAL (src) == 0x007F)
+ || (m_mode == HImode && INTVAL (src) == 0x7FFF)
+ || (m_mode == SImode && INTVAL (src) == 0x007F)))
+   return true;
+
+   }
+  else
+   return false;
+}
+
+  return false;
+}
+/* Return TRUE if OP can be considered a zero extension from one or
+   more sub-word modes to larger modes up to a full word.
+
+   For example (and:DI (reg) (const_int X))
+
+   Depending on the value of X could be considered a zero extension
+   from QI, HI and SI to larger modes up to DImode.  */
+
+static bool
+rtx_is_zext_p (rtx_insn *insn)
+{
+  rtx body = single_set (insn);
+
+  if (GET_CODE (body) == SET && GET_CODE (SET_SRC (body)) == AND)
+   {
+ rtx set = XEXP (SET_SRC (body), 0);
+
+ if (REG_P (set) && GET_MODE (SET_DEST (body)) == GET_MODE (set))
+   {
+ rtx src = XEXP (SET_SRC (body), 1);
+ machine_mode m_mode = GET_MODE (set);
+
+ if (CONST_INT_P (src)
+ && (INTVAL (src) == 1
+ || (m_mode == QImode && INTVAL (src) == 0x7)
+ || (m_mode == QImode && INTVAL (src) == 0x007F)
+ || (m_mode == HImode && INTVAL (src) == 0x7FFF)
+ || (m_mode == SImode && INTVAL (src) == 0x007F)))
+   return true;
+   }
+ else
+  return false;
+   }
+
+   return false;
+}
+
 /* Update or remove REG_EQUAL or REG_EQUIV notes for INSN.  */
 
 static bool
@@ -319,7 +390,7 @@ combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, 
rtx *orig_set)
 {
   rtx orig_src = SET_SRC (*orig_set);
   machine_mode orig_mode = GET_MODE (SET_DEST (*orig_set));
-  rtx new_set;
+  rtx new_set = NULL_RTX;
   rtx cand_pat = single_set (cand->insn);
 
   /* If the extension's source/destination registers are not the same
@@ -359,27 +430,41 @@ combine_set_extension (ext_cand *cand, rtx_insn 
*curr_insn, rtx *orig_set)
   else if (GET_CODE (orig_src) == cand->code)
 {
   /* Here is a sequence of two extensions.

RE: [PATCH v1] RISC-V: Support FP16 for RVV VRGATHEREI16 intrinsic

2023-09-04 Thread Li, Pan2 via Gcc-patches

Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Monday, September 4, 2023 3:29 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; Wang, Yanzhang ; 
juzhe.zh...@rivai.ai
Subject: Re: [PATCH v1] RISC-V: Support FP16 for RVV VRGATHEREI16 intrinsic

LGTM

On Mon, Sep 4, 2023 at 3:18 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch would like to add FP16 support for the VRGATHEREI16
> intrinsic. Aka:
>
> * __riscv_vrgatherei16_vv_f16mf4
> * __riscv_vrgatherei16_vv_f16mf4_m
>
> As well as f16mf2 to f16m8 types.
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-types.def
> (vfloat16mf4_t): Add FP16 intrinsic def.
> (vfloat16mf2_t): Ditto.
> (vfloat16m1_t): Ditto.
> (vfloat16m2_t): Ditto.
> (vfloat16m4_t): Ditto.
> (vfloat16m8_t): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/intrisinc-vrgatherei16.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-types.def |  9 ++
>  .../riscv/rvv/intrisinc-vrgatherei16.c| 28 +++
>  2 files changed, 37 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def 
> b/gcc/config/riscv/riscv-vector-builtins-types.def
> index 1c3cc0eb222..6aa45ae9a7e 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-types.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-types.def
> @@ -689,11 +689,20 @@ DEF_RVV_EI16_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_EI16_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_EI16_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_EI16_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64)
> +
> +DEF_RVV_EI16_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | 
> RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_EI16_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m2_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m4_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m8_t, RVV_REQUIRE_ELEN_FP_16)
> +
>  DEF_RVV_EI16_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | 
> RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_EI16_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_EI16_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_EI16_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_EI16_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
> +
>  DEF_RVV_EI16_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64)
>  DEF_RVV_EI16_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64)
>  DEF_RVV_EI16_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
> new file mode 100644
> index 000..59c6d7c887d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +typedef _Float16 float16_t;
> +
> +vfloat16mf4_t test_vrgatherei16_vv_f16mf4(vfloat16mf4_t op1, vuint16mf4_t 
> op2,
> +  size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vrgatherei16_vv_f16m8(vfloat16m8_t op1, vuint16m8_t op2,
> +  size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vrgatherei16_vv_f16mf4_m(vbool64_t mask, vfloat16mf4_t 
> op1,
> +  vuint16mf4_t op2, size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16mf4_m(mask, op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vrgatherei16_vv_f16m8_m(vbool2_t mask, vfloat16m8_t op1,
> +  vuint16m8_t op2, size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16m8_m(mask, op1, op2, vl);
> +}
> +
> +/* { dg-final { scan-assembler-times 
> {vrgatherei16.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> --
> 2.34.1
>

Re: [PATCH v1] RISC-V: Support FP16 for RVV VRGATHEREI16 intrinsic

2023-09-04 Thread Kito Cheng via Gcc-patches

LGTM

On Mon, Sep 4, 2023 at 3:18 PM Pan Li via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> This patch would like to add FP16 support for the VRGATHEREI16
> intrinsic. Aka:
>
> * __riscv_vrgatherei16_vv_f16mf4
> * __riscv_vrgatherei16_vv_f16mf4_m
>
> As well as f16mf2 to f16m8 types.
>
> Signed-off-by: Pan Li 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-types.def
> (vfloat16mf4_t): Add FP16 intrinsic def.
> (vfloat16mf2_t): Ditto.
> (vfloat16m1_t): Ditto.
> (vfloat16m2_t): Ditto.
> (vfloat16m4_t): Ditto.
> (vfloat16m8_t): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/intrisinc-vrgatherei16.c: New test.
> ---
>  .../riscv/riscv-vector-builtins-types.def |  9 ++
>  .../riscv/rvv/intrisinc-vrgatherei16.c| 28 +++
>  2 files changed, 37 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def 
> b/gcc/config/riscv/riscv-vector-builtins-types.def
> index 1c3cc0eb222..6aa45ae9a7e 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-types.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-types.def
> @@ -689,11 +689,20 @@ DEF_RVV_EI16_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_EI16_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_EI16_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_EI16_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64)
> +
> +DEF_RVV_EI16_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | 
> RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_EI16_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m2_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m4_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_EI16_OPS (vfloat16m8_t, RVV_REQUIRE_ELEN_FP_16)
> +
>  DEF_RVV_EI16_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | 
> RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_EI16_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_EI16_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_EI16_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_EI16_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
> +
>  DEF_RVV_EI16_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64)
>  DEF_RVV_EI16_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64)
>  DEF_RVV_EI16_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
> new file mode 100644
> index 000..59c6d7c887d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +typedef _Float16 float16_t;
> +
> +vfloat16mf4_t test_vrgatherei16_vv_f16mf4(vfloat16mf4_t op1, vuint16mf4_t 
> op2,
> +  size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vrgatherei16_vv_f16m8(vfloat16m8_t op1, vuint16m8_t op2,
> +  size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vrgatherei16_vv_f16mf4_m(vbool64_t mask, vfloat16mf4_t 
> op1,
> +  vuint16mf4_t op2, size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16mf4_m(mask, op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vrgatherei16_vv_f16m8_m(vbool2_t mask, vfloat16m8_t op1,
> +  vuint16m8_t op2, size_t vl) {
> +  return __riscv_vrgatherei16_vv_f16m8_m(mask, op1, op2, vl);
> +}
> +
> +/* { dg-final { scan-assembler-times 
> {vrgatherei16.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> --
> 2.34.1
>

[PATCH v1] RISC-V: Support FP16 for RVV VRGATHEREI16 intrinsic

2023-09-04 Thread Pan Li via Gcc-patches

From: Pan Li 

This patch would like to add FP16 support for the VRGATHEREI16
intrinsic. Aka:

* __riscv_vrgatherei16_vv_f16mf4
* __riscv_vrgatherei16_vv_f16mf4_m

As well as f16mf2 to f16m8 types.

Signed-off-by: Pan Li 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-types.def
(vfloat16mf4_t): Add FP16 intrinsic def.
(vfloat16mf2_t): Ditto.
(vfloat16m1_t): Ditto.
(vfloat16m2_t): Ditto.
(vfloat16m4_t): Ditto.
(vfloat16m8_t): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/intrisinc-vrgatherei16.c: New test.
---
 .../riscv/riscv-vector-builtins-types.def |  9 ++
 .../riscv/rvv/intrisinc-vrgatherei16.c| 28 +++
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def 
b/gcc/config/riscv/riscv-vector-builtins-types.def
index 1c3cc0eb222..6aa45ae9a7e 100644
--- a/gcc/config/riscv/riscv-vector-builtins-types.def
+++ b/gcc/config/riscv/riscv-vector-builtins-types.def
@@ -689,11 +689,20 @@ DEF_RVV_EI16_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64)
 DEF_RVV_EI16_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
 DEF_RVV_EI16_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
 DEF_RVV_EI16_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64)
+
+DEF_RVV_EI16_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | 
RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_EI16_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16)
+DEF_RVV_EI16_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16)
+DEF_RVV_EI16_OPS (vfloat16m2_t, RVV_REQUIRE_ELEN_FP_16)
+DEF_RVV_EI16_OPS (vfloat16m4_t, RVV_REQUIRE_ELEN_FP_16)
+DEF_RVV_EI16_OPS (vfloat16m8_t, RVV_REQUIRE_ELEN_FP_16)
+
 DEF_RVV_EI16_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | 
RVV_REQUIRE_MIN_VLEN_64)
 DEF_RVV_EI16_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)
 DEF_RVV_EI16_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)
 DEF_RVV_EI16_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)
 DEF_RVV_EI16_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
+
 DEF_RVV_EI16_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64)
 DEF_RVV_EI16_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64)
 DEF_RVV_EI16_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
new file mode 100644
index 000..59c6d7c887d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/intrisinc-vrgatherei16.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+typedef _Float16 float16_t;
+
+vfloat16mf4_t test_vrgatherei16_vv_f16mf4(vfloat16mf4_t op1, vuint16mf4_t op2,
+  size_t vl) {
+  return __riscv_vrgatherei16_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vrgatherei16_vv_f16m8(vfloat16m8_t op1, vuint16m8_t op2,
+  size_t vl) {
+  return __riscv_vrgatherei16_vv_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vrgatherei16_vv_f16mf4_m(vbool64_t mask, vfloat16mf4_t op1,
+  vuint16mf4_t op2, size_t vl) {
+  return __riscv_vrgatherei16_vv_f16mf4_m(mask, op1, op2, vl);
+}
+
+vfloat16m8_t test_vrgatherei16_vv_f16m8_m(vbool2_t mask, vfloat16m8_t op1,
+  vuint16m8_t op2, size_t vl) {
+  return __riscv_vrgatherei16_vv_f16m8_m(mask, op1, op2, vl);
+}
+
+/* { dg-final { scan-assembler-times 
{vrgatherei16.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
-- 
2.34.1

Re: [PATCH] RISC-V: Fix Zicond ICE on large constants

2023-09-04 Thread Kito Cheng via Gcc-patches

Maybe move the check logic a bit earlier? My thought is that the logic is
already specialized into a few categories, (imm, imm), (imm, reg), (reg,
reg)..., and the check you added sits in the (imm, reg) case, but IMO it
should really move into the (reg, reg) case. Moving it earlier would
avoid adding too much logic to redirect the case.

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2db9c81ac8b..c84509c393b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3892,6 +3892,12 @@ riscv_expand_conditional_move (rtx dest, rtx
op, rtx cons, rtx alt)
 op1 = XEXP (op, 1);
   }

+  /* CONS might not fit into a signed 12 bit immediate suitable
+for an addi instruction.  If that's the case, force it into
+a register.  */
+  if (CONST_INT_P (cons) && !SMALL_OPERAND (INTVAL (cons)))
+   cons = force_reg (mode, cons);
+
  /* 0, reg or 0, imm */
  if (cons == CONST0_RTX (mode)
 && (REG_P (alt)
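
To illustrate why checking only the negated constant is not enough, here is a minimal standalone model of a signed 12-bit immediate range check. This is a sketch, not the backend's SMALL_OPERAND macro itself; fits_simm12 and both_fit are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// Stand-alone model of "fits into a signed 12 bit immediate suitable
// for an addi instruction": the representable range is [-2048, 2047].
constexpr bool
fits_simm12 (int64_t v)
{
  return v >= -2048 && v <= 2047;
}

// Hypothetical helper: true only if both V and -V fit.  The range is
// asymmetric, so one can fit while the other does not.
bool
both_fit (int64_t v)
{
  assert (v != INT64_MIN);  // -v would overflow for INT64_MIN.
  return fits_simm12 (v) && fits_simm12 (-v);
}
```

With this model, 2048 does not fit while -2048 does, and 0x1000 does not fit in either sign, so the constant and its negation each need their own check before being used as an immediate.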

On Mon, Sep 4, 2023 at 8:21 AM Tsukasa OI via Gcc-patches
 wrote:
>
> From: Tsukasa OI 
>
> Large constant cons and/or alt will trigger ICEs building GCC target
> libraries (libgomp and libatomic) when the 'Zicond' extension is enabled.
>
> For instance, zicond-ice-2.c (new test case in this commit) will cause
> an ICE when SOME_NUMBER is 0x1000 or larger.  While opposite numbers
> corresponding cons/alt (two temp2 variables) are checked, cons/alt
> themselves are not checked and causing 2 ICEs building
> GCC target libraries as of this writing:
>
> 1.  gcc/libatomic/config/posix/lock.c
> 2.  gcc/libgomp/fortran.c
>
> Coercing a large value into a register will fix the issue.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_expand_conditional_move): Force
> large constant cons/alt into a register.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zicond-ice-2.c: New test.  This is based on
> an ICE at libat_lock_n func on gcc/libatomic/config/posix/lock.c
> but heavily minimized.
> ---
>  gcc/config/riscv/riscv.cc | 16 ++--
>  gcc/testsuite/gcc.target/riscv/zicond-ice-2.c | 11 +++
>  2 files changed, 21 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-ice-2.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 8d8f7b4f16ed..cfaa4b6a7720 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -3940,11 +3940,13 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx 
> cons, rtx alt)
>   rtx temp1 = gen_reg_rtx (mode);
>   rtx temp2 = gen_int_mode (-1 * INTVAL (cons), mode);
>
> - /* TEMP2 might not fit into a signed 12 bit immediate suitable
> -for an addi instruction.  If that's the case, force it into
> -a register.  */
> + /* TEMP2 and/or CONS might not fit into a signed 12 bit immediate
> +suitable for an addi instruction.  If that's the case, force it
> +into a register.  */
>   if (!SMALL_OPERAND (INTVAL (temp2)))
> temp2 = force_reg (mode, temp2);
> + if (!SMALL_OPERAND (INTVAL (cons)))
> +   cons = force_reg (mode, cons);
>
>   riscv_emit_binary (PLUS, temp1, alt, temp2);
>   emit_insn (gen_rtx_SET (dest,
> @@ -3986,11 +3988,13 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx 
> cons, rtx alt)
>   rtx temp1 = gen_reg_rtx (mode);
>   rtx temp2 = gen_int_mode (-1 * INTVAL (alt), mode);
>
> - /* TEMP2 might not fit into a signed 12 bit immediate suitable
> -for an addi instruction.  If that's the case, force it into
> -a register.  */
> + /* TEMP2 and/or ALT might not fit into a signed 12 bit immediate
> +suitable for an addi instruction.  If that's the case, force it
> +into a register.  */
>   if (!SMALL_OPERAND (INTVAL (temp2)))
> temp2 = force_reg (mode, temp2);
> + if (!SMALL_OPERAND (INTVAL (alt)))
> +   alt = force_reg (mode, alt);
>
>   riscv_emit_binary (PLUS, temp1, cons, temp2);
>   emit_insn (gen_rtx_SET (dest,
> diff --git a/gcc/testsuite/gcc.target/riscv/zicond-ice-2.c 
> b/gcc/testsuite/gcc.target/riscv/zicond-ice-2.c
> new file mode 100644
> index ..ffd8dcb5814e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zicond-ice-2.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zicond -mabi=lp64d" { target { rv64 } } } */
> +/* { dg-options "-march=rv32gc_zicond -mabi=ilp32d" { target { rv32 } } } */
> +
> +#define SOME_NUMBER 0x1000
> +
> +unsigned long
> +d (unsigned long n)
> +{
> +  return n > SOME_NUMBER ? SOME_NUMBER : n;
> +}
>
> base-commit: 78f636d979530c8a649262dbd44914bdfb6f7290
> --
> 2.42.0
>
