[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2022-08-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linkw at gcc dot gnu.org

--- Comment #1 from Kewen Lin  ---
The related insns require TARGET_POWERPC64; they are still available on 32-bit
if the options -O2 -mpowerpc64 come after -m32.

I think it suffers from the issue described in its own comment:

/* On some versions of dejagnu this test will fail when biarch testing
   with RUNTESTFLAGS="--target_board=unix'{-m64,-m32}'" due to -m32
   being added on the command line after the dg-options -mpowerpc64.
   common/config/rs6000/rs6000-common.c:rs6000_handle_option disables
   -mpowerpc64 for -m32.  */
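
The ordering effect described in that comment can be pictured with a toy model. This is only a sketch: `powerpc64_enabled` is a hypothetical helper, not the real `rs6000_handle_option`, and it assumes nothing beyond what the comment states (options are handled left to right, and -m32 switches -mpowerpc64 off).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model (not GCC code): options are processed left to right.
// Assumption taken from the test comment: -m32 turns -mpowerpc64 off,
// so only a -mpowerpc64 that appears after -m32 survives.
bool powerpc64_enabled(const std::vector<std::string>& args)
{
    bool powerpc64 = false;
    for (const std::string& arg : args) {
        if (arg == "-mpowerpc64")
            powerpc64 = true;
        else if (arg == "-m32")
            powerpc64 = false;
    }
    return powerpc64;
}
```

With dg-options first and the board flag last ({-mpowerpc64, -O2, -m32}) the model reports the option as disabled; with -m32 first it stays enabled.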

Hi Mike,

Could you share which test box you used for testing? Or dejagnu version?

[Bug tree-optimization/105651] [12/13 Regression] bogus "may overlap" memcpy warning with std::string and operator+ at -O3

2022-08-18 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105651

--- Comment #18 from Jason Merrill  ---
(In reply to Jason Merrill from comment #17)
> There's probably a way to help the optimizer out without the
> __builtin_unreachable hammer, as for 98465; suggestions are welcome.

..like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104336#c1

[Bug tree-optimization/105651] [12/13 Regression] bogus "may overlap" memcpy warning with std::string and operator+ at -O3

2022-08-18 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105651

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org

--- Comment #17 from Jason Merrill  ---
Created attachment 53474
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53474&action=edit
patch to work around the issue in the library

This patch tells the optimizer that the copy can't overlap, since it is having
trouble figuring that out on its own.  This fixes the false positive.

It theoretically could deduce this from the previous two conditions: the first
establishes that the end of the source is after the end of the destination; the
second establishes that the beginning of the source is before the end of the
destination.  So the source crosses the end of the destination, and so the
length of the overlap is less than the length of the source.

There's probably a way to help the optimizer out without the
__builtin_unreachable hammer, as for 98465; suggestions are welcome.

Turning off -Wrestrict just around the call to _M_copy also works, but this
patch should also improve optimization.
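
The deduction described above can be written out as a small sketch. Everything here is an illustration under stated assumptions: `overlap_len` is a hypothetical helper, not the libstdc++ code or the attached patch, and the `__builtin_unreachable` line only shows the general shape of such a hint.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not libstdc++ code.  Returns how many bytes of the
// source range [src, src + len) lie before dest_end.
std::size_t overlap_len(const char* src, std::size_t len, const char* dest_end)
{
    // First condition: the end of the source is after the end of the
    // destination.
    assert(src + len > dest_end);
    // Second condition: the beginning of the source is before the end of
    // the destination.
    assert(src < dest_end);

    std::size_t overlap = static_cast<std::size_t>(dest_end - src);
#if defined(__GNUC__)
    // The "hammer": state the implied fact (overlap < len) so that the
    // optimizer can rely on it.
    if (overlap >= len)
        __builtin_unreachable();
#endif
    return overlap;
}
```

Since src + len > dest_end, the difference dest_end - src is strictly smaller than len, which is the property the optimizer fails to derive by itself.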

Re: [PATCH, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-08-18 Thread Kewen.Lin via Gcc-patches
Hi Haochen,

on 2022/8/19 10:35, HAO CHEN GUI wrote:
> Hi,
> 
>   This patch is for internal issue1136. It changes insn condition from
> TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions.
> These instructions all use DI registers and can be invoked with -mpowerpc64
> in a 32-bit environment.
> 
>   This patch also changes prototypes of related built-ins and target selector
> of test cases.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> 
> ChangeLog
> 2022-08-19  Haochen Gui  
> 
> gcc/
>   * config/rs6000/rs6000-builtins.def
>   (__builtin_vsx_scalar_extract_exp): Set return type to const unsigned
>   long long.
>   (__builtin_vsx_scalar_extract_sig): Likewise.
>   * config/rs6000/vsx.md (xsxexpdp): Change insn condition from
>   TARGET_64BIT to TARGET_POWERPC64.
>   (xsxsigdp): Likewise.
>   (xsiexpdp): Likewise.
>   (xsiexpdpf): Likewise.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/bfp/scalar-extract-exp-0.c: Change effective
>   target from lp64 to has_arch_ppc64 and add -mpowerpc64 for 32-bit
>   environment.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-0.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-0.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-3.c: Likewise.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index f76f54793d7..4ebfd4704a1 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -2847,10 +2847,10 @@
>pure vsc __builtin_vsx_lxvl (const void *, signed long);
>  LXVL lxvl {}
> 
> -  const signed long __builtin_vsx_scalar_extract_exp (double);
> +  const unsigned long long __builtin_vsx_scalar_extract_exp (double);
>  VSEEDP xsxexpdp {}
> 
> -  const signed long __builtin_vsx_scalar_extract_sig (double);
> +  const unsigned long long __builtin_vsx_scalar_extract_sig (double);
>  VSESDP xsxsigdp {}
> 
>const double __builtin_vsx_scalar_insert_exp (unsigned long long, \
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index e226a93bbe5..a01711aa2cb 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -5098,7 +5098,7 @@ (define_insn "xsxexpdp"
>[(set (match_operand:DI 0 "register_operand" "=r")
>   (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
>UNSPEC_VSX_SXEXPDP))]
> -  "TARGET_P9_VECTOR && TARGET_64BIT"
> +  "TARGET_P9_VECTOR && TARGET_POWERPC64"
>"xsxexpdp %0,%x1"
>[(set_attr "type" "integer")])
> 
> @@ -5116,7 +5116,7 @@ (define_insn "xsxsigdp"
>[(set (match_operand:DI 0 "register_operand" "=r")
>   (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
>UNSPEC_VSX_SXSIG))]
> -  "TARGET_P9_VECTOR && TARGET_64BIT"
> +  "TARGET_P9_VECTOR && TARGET_POWERPC64"
>"xsxsigdp %0,%x1"
>[(set_attr "type" "integer")])
> 
> @@ -5147,7 +5147,7 @@ (define_insn "xsiexpdp"
>   (unspec:DF [(match_operand:DI 1 "register_operand" "r")
>   (match_operand:DI 2 "register_operand" "r")]
>UNSPEC_VSX_SIEXPDP))]
> -  "TARGET_P9_VECTOR && TARGET_64BIT"
> +  "TARGET_P9_VECTOR && TARGET_POWERPC64"
>"xsiexpdp %x0,%1,%2"
>[(set_attr "type" "fpsimple")])
> 
> @@ -5157,7 +5157,7 @@ (define_insn "xsiexpdpf"
>   (unspec:DF [(match_operand:DF 1 "register_operand" "r")
>   (match_operand:DI 2 "register_operand" "r")]
>UNSPEC_VSX_SIEXPDP))]
> -  "TARGET_P9_VECTOR && TARGET_64BIT"
> +  "TARGET_P9_VECTOR && TARGET_POWERPC64"
>"xsiexpdp %x0,%1,%2"
>[(set_attr "type" "fpsimple")])
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c 
> b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
> index 35bf1b240f3..c9190bc7c6c 100644
> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
> @@ -1,7 +1,8 @@
>  /* { dg-do compile { target { powerpc*-*-* } } } */
> -/* { dg-require-effective-target lp64 } */
> -/* { dg-require-effective-target powerpc_p9vector_ok } */
>  /* { dg-options "-mdejagnu-cpu=power9" } */
> +/* { dg-additional-options "-mpowerpc64" { target { powerpc*-*-linux* && 
> ilp32 } } } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */

Maybe we should add a comment here (and in the other touched cases) or
in the commit log explaining why we reorder the dg-require-effective-target
and dg-options directives, since the reason isn't obvious.  :)

The rest looks good to me.  Thanks!

BR,
Kewen

> 
>  /* This test should succeed only on 64-bit configurations.  */
>  #include 
> diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-0.c 
> b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-0.c
> index 

Re: [PATCH v3] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-18 Thread Kewen.Lin via Gcc-patches
Hi Segher,

Thanks for the review!

on 2022/8/19 01:34, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Aug 18, 2022 at 10:12:48AM +0800, Kewen.Lin wrote:
>> As PR99888 and its related show, the current support for
>> -fpatchable-function-entry on powerpc ELFv2 doesn't work
>> well with global entry existence.
> 
>> +  /* Emit the NOPs after local entry.  */
> 
> Please do not say "NOPs".  It is not an acronym.  I know some of our
> documentation has this bug already, but please do not spread it further.
> 
> The machine instruction is "nop", lowercase.
> 
> Please fix this.

Whoops, I mistakenly thought it was only used in the commit log; will fix!

> 
> So, this patch overloads the meaning of the two parameters here to have
> more meaning than explained in the documentation for the option.  There
> isn't much that can be done about this, so adding some new option would
> only be extra work for everyone.  But, could you add a line or two in
> the documentation?  "For PowerPC with the ELFv2 ABI, there will be M
> nops before the local entry point, and N-M after", something like that?
> 

Since you proposed updating the documentation, I'm wondering whether we can
reconsider Fangrui's proposal in the PR, which Alan seconded: put the preceding
nops before the GEP and the succeeding nops after the LEP.  Previously I was
concerned that the inserted nops would not be relative to the same function
entry, which looks inconsistent with the documentation, and you also noted that
"The nops have to be consecutive".  If we are going to update the documentation
anyway, could we reword it for the PowerPC ELFv2 ABI?

What's your opinion?
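
For reference, the split in Segher's suggested wording ("M nops before the local entry point, and N-M after") can be sketched as below. `NopSplit`, `split_nops` and `patched_example` are hypothetical names used only for illustration; the attribute form is GCC's documented `patchable_function_entry (N, M)` and is guarded so the sketch still builds elsewhere.

```cpp
#include <cassert>

// Hypothetical helper: how the suggested wording splits the N nops.
struct NopSplit {
    unsigned before_entry;  // nops placed before the local entry point
    unsigned after_entry;   // nops placed after it
};

constexpr NopSplit split_nops(unsigned n, unsigned m)
{
    return NopSplit{m, n - m};
}

// With GCC the same N,M pair can also be requested per function.
#if defined(__GNUC__) && !defined(__clang__)
__attribute__((patchable_function_entry(5, 2)))
#endif
int patched_example(int x)
{
    return x + 1;
}
```

With N=5 and M=2 this gives two nops before the entry point and three after it; where exactly "the entry point" is on ELFv2 is the open question in this thread.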

BR,
Kewen


[PATCH, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-08-18 Thread HAO CHEN GUI via Gcc-patches
Hi,

  This patch is for internal issue1136. It changes insn condition from
TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions.
These instructions all use DI registers and can be invoked with -mpowerpc64
in a 32-bit environment.

  This patch also changes prototypes of related built-ins and target selector
of test cases.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.
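
As a rough illustration of why a 64-bit return type matters here, the following portable sketch emulates extracting the exponent and significand of a double. It is not the rs6000 built-ins: `extract_exp` and `extract_sig` are hypothetical names, and the handling of the implicit bit reflects my understanding of the instruction and should be checked against the ISA.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Portable emulation for illustration only; not the rs6000 built-ins.
std::uint64_t extract_exp(double d)
{
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return (bits >> 52) & 0x7ff;                        // 11-bit biased exponent
}

std::uint64_t extract_sig(double d)
{
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    std::uint64_t frac = bits & 0x000fffffffffffffULL;  // 52 fraction bits
    std::uint64_t exp = (bits >> 52) & 0x7ff;
    // Assumption: normal numbers get their implicit leading one made explicit.
    if (exp != 0 && exp != 0x7ff)
        frac |= 1ULL << 52;
    return frac;
}
```

The significand of a normal double needs 53 bits, so it cannot be represented in a 32-bit `long`, whereas `unsigned long long` is 64 bits wide in both environments.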


ChangeLog
2022-08-19  Haochen Gui  

gcc/
* config/rs6000/rs6000-builtins.def
(__builtin_vsx_scalar_extract_exp): Set return type to const unsigned
long long.
(__builtin_vsx_scalar_extract_sig): Likewise.
* config/rs6000/vsx.md (xsxexpdp): Change insn condition from
TARGET_64BIT to TARGET_POWERPC64.
(xsxsigdp): Likewise.
(xsiexpdp): Likewise.
(xsiexpdpf): Likewise.

gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-extract-exp-0.c: Change effective
target from lp64 to has_arch_ppc64 and add -mpowerpc64 for 32-bit
environment.
* gcc.target/powerpc/bfp/scalar-extract-sig-0.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-0.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-3.c: Likewise.


patch.diff
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f76f54793d7..4ebfd4704a1 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2847,10 +2847,10 @@
   pure vsc __builtin_vsx_lxvl (const void *, signed long);
 LXVL lxvl {}

-  const signed long __builtin_vsx_scalar_extract_exp (double);
+  const unsigned long long __builtin_vsx_scalar_extract_exp (double);
 VSEEDP xsxexpdp {}

-  const signed long __builtin_vsx_scalar_extract_sig (double);
+  const unsigned long long __builtin_vsx_scalar_extract_sig (double);
 VSESDP xsxsigdp {}

   const double __builtin_vsx_scalar_insert_exp (unsigned long long, \
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index e226a93bbe5..a01711aa2cb 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5098,7 +5098,7 @@ (define_insn "xsxexpdp"
   [(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
 UNSPEC_VSX_SXEXPDP))]
-  "TARGET_P9_VECTOR && TARGET_64BIT"
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
   "xsxexpdp %0,%x1"
   [(set_attr "type" "integer")])

@@ -5116,7 +5116,7 @@ (define_insn "xsxsigdp"
   [(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
 UNSPEC_VSX_SXSIG))]
-  "TARGET_P9_VECTOR && TARGET_64BIT"
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
   "xsxsigdp %0,%x1"
   [(set_attr "type" "integer")])

@@ -5147,7 +5147,7 @@ (define_insn "xsiexpdp"
(unspec:DF [(match_operand:DI 1 "register_operand" "r")
(match_operand:DI 2 "register_operand" "r")]
 UNSPEC_VSX_SIEXPDP))]
-  "TARGET_P9_VECTOR && TARGET_64BIT"
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
   "xsiexpdp %x0,%1,%2"
   [(set_attr "type" "fpsimple")])

@@ -5157,7 +5157,7 @@ (define_insn "xsiexpdpf"
(unspec:DF [(match_operand:DF 1 "register_operand" "r")
(match_operand:DI 2 "register_operand" "r")]
 UNSPEC_VSX_SIEXPDP))]
-  "TARGET_P9_VECTOR && TARGET_64BIT"
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
   "xsiexpdp %x0,%1,%2"
   [(set_attr "type" "fpsimple")])

diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
index 35bf1b240f3..c9190bc7c6c 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
@@ -1,7 +1,8 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-additional-options "-mpowerpc64" { target { powerpc*-*-linux* && ilp32 } } } */
+/* { dg-require-effective-target has_arch_ppc64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */

 /* This test should succeed only on 64-bit configurations.  */
 #include 
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-0.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-0.c
index 637080652b7..a391ac8cce3 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-0.c
@@ -1,7 +1,8 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-additional-options "-mpowerpc64" { target { powerpc*-*-linux* && ilp32 } } } */
+/* { 

[wwwdocs] [GCC13] Mention Intel __bf16 support.

2022-08-18 Thread Kong, Lingling via Gcc-patches
Hi

This patch mentions Intel __bf16 support in the GCC 13 release notes.
OK for master?

Thanks,
Lingling

htdocs/gcc-13/changes.html | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 57bd8724..7d98329c 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -122,7 +122,12 @@ a work-in-progress.
 
 
 
-
+IA-32/x86-64
+
+  For both C and C++ the __bf16 type is supported on
+  x86 systems with SSE2 and above enabled.
+  
+
 
 
 
--
2.18.2



[PATCH v2] LoongArch: Add support code model extreme.

2022-08-18 Thread Lulu Cheng
v1 -> v2:
- Modify some descriptions.
- Add option -W[no-]extreme-plt, which warns that code model extreme does not
support PLT mode and then disables PLT.

---
Use five instructions to calculate a signed 64-bit offset relative to the pc.
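
As a rough picture of the bit ranges named in the ChangeLog below (bits 12-31, 32-51 and 52-63), a 64-bit offset can be cut into fields and put back together. This is only a sketch: `OffsetParts`, `split_offset` and `join_offset` are hypothetical names, the low field (bits 0-11) is my assumption about the remaining bits, and the real instruction sequence also has to deal with sign extension and the pc-relative base, which this ignores.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration: field split of a 64-bit offset.
struct OffsetParts {
    std::uint64_t lo12;     // bits 0-11 (assumed remaining low part)
    std::uint64_t hi20;     // bits 12-31
    std::uint64_t lo20_h;   // bits 32-51
    std::uint64_t hi12_h;   // bits 52-63
};

OffsetParts split_offset(std::int64_t offset)
{
    std::uint64_t u = static_cast<std::uint64_t>(offset);
    return OffsetParts{u & 0xfff, (u >> 12) & 0xfffff,
                       (u >> 32) & 0xfffff, (u >> 52) & 0xfff};
}

std::int64_t join_offset(const OffsetParts& p)
{
    // Plain OR of the fields; no sign-extension fix-ups are modelled.
    std::uint64_t u = p.lo12 | (p.hi20 << 12) | (p.lo20_h << 32)
                      | (p.hi12_h << 52);
    return static_cast<std::int64_t>(u);
}
```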

gcc/ChangeLog:

* config/loongarch/genopts/loongarch.opt.in: Add new option 
W[no-]extreme-plt.
* config/loongarch/loongarch.opt: Update file.
* config/loongarch/loongarch-opts.cc: Allow cmodel to be extreme.
* config/loongarch/loongarch.cc (loongarch_call_tls_get_addr):
Add extreme support for TLS GD and LD types.
(loongarch_legitimize_tls_address): Add extreme support for TLS LE
and IE.
(loongarch_split_symbol): When compiling with -mcmodel=extreme,
the symbol address will be obtained through five instructions.
(loongarch_print_operand_reloc): Add support.
(loongarch_print_operand): Add support.
(loongarch_print_operand_address): Add support.
(loongarch_option_override_internal): Set '-mcmodel=extreme' option
incompatible with '-mno-explicit-relocs'.
* config/loongarch/loongarch.md (@lui_l_hi20):
Load bits 12-31 of the data into the register.
(lui_h_lo20): Load bits 32-51 of the data and combine them with
bits 0-31 of the source register.
(lui_h_hi12): Load bits 52-63 of the data and combine them with
bits 0-51 of the source register.
* config/loongarch/predicates.md: Symbols need to be decomposed
when the macro TARGET_CMODEL_EXTREME is set.
* doc/invoke.texi: Modify the description of cmodel.
Document -W[no-]extreme-plt.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/func-call-1.c: Add option '-mcmodel=normal'.
* gcc.target/loongarch/func-call-2.c: Likewise.
* gcc.target/loongarch/func-call-3.c: Likewise.
* gcc.target/loongarch/func-call-4.c: Likewise.
* gcc.target/loongarch/func-call-5.c: Likewise.
* gcc.target/loongarch/func-call-6.c: Likewise.
* gcc.target/loongarch/func-call-7.c: Likewise.
* gcc.target/loongarch/func-call-8.c: Likewise.
* gcc.target/loongarch/relocs-symbol-noaddend.c: Likewise.
* gcc.target/loongarch/func-call-extreme-1.c: New test.
* gcc.target/loongarch/func-call-extreme-2.c: New test.
---
 gcc/config/loongarch/genopts/loongarch.opt.in |   3 +
 gcc/config/loongarch/loongarch-opts.cc|   3 +-
 gcc/config/loongarch/loongarch.cc | 189 +++---
 gcc/config/loongarch/loongarch.md |  34 +++-
 gcc/config/loongarch/loongarch.opt|   3 +
 gcc/config/loongarch/predicates.md|   9 +-
 gcc/doc/invoke.texi   |  59 ++
 .../gcc.target/loongarch/func-call-1.c|   2 +-
 .../gcc.target/loongarch/func-call-2.c|   2 +-
 .../gcc.target/loongarch/func-call-3.c|   2 +-
 .../gcc.target/loongarch/func-call-4.c|   2 +-
 .../gcc.target/loongarch/func-call-5.c|   2 +-
 .../gcc.target/loongarch/func-call-6.c|   2 +-
 .../gcc.target/loongarch/func-call-7.c|   2 +-
 .../gcc.target/loongarch/func-call-8.c|   2 +-
 .../loongarch/func-call-extreme-1.c   |  32 +++
 .../loongarch/func-call-extreme-2.c   |  32 +++
 .../loongarch/relocs-symbol-noaddend.c|   2 +-
 18 files changed, 303 insertions(+), 79 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/func-call-extreme-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/func-call-extreme-2.c

diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in
index a571b6b7524..86fd80fc7a2 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -158,6 +158,9 @@ mexplicit-relocs
 Target Var(TARGET_EXPLICIT_RELOCS) Init(HAVE_AS_EXPLICIT_RELOCS)
 Use %reloc() assembly operators.
 
+Wextreme-plt
+Target Var(warn_extreme_plt) Init(1) Warning code model extreme not support plt mode.
+
 ; The code model option names for -mcmodel.
 Enum
 Name(cmodel) Type(int)
diff --git a/gcc/config/loongarch/loongarch-opts.cc b/gcc/config/loongarch/loongarch-opts.cc
index 3f70943ded6..2ae89f23443 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -376,14 +376,13 @@ fallback:
 
   /* 5.  Target code model */
   t.cmodel = constrained.cmodel ? opt_cmodel : CMODEL_NORMAL;
-  if (t.cmodel != CMODEL_NORMAL)
+  if (t.cmodel != CMODEL_NORMAL && t.cmodel != CMODEL_EXTREME)
 {
   warning (0, "%qs is not supported, now cmodel is set to %qs",
   loongarch_cmodel_strings[t.cmodel], "normal");
   t.cmodel = CMODEL_NORMAL;
 }
 
-
   /* Cleanup and return.  */
   obstack_free (_obstack, NULL);
   *target = t;
diff --git a/gcc/config/loongarch/loongarch.cc 

Re: [PING][PATCH] Add instruction level discriminator support.

2022-08-18 Thread Jason Merrill via Gcc-patches

On 8/3/22 17:25, Eugene Rozenfeld wrote:

One more ping for this patch 
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html

CC Jason since this changes discriminators emitted in dwarf.

Thanks,

Eugene

-Original Message-
From: Eugene Rozenfeld
Sent: Monday, June 27, 2022 12:45 PM
To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan Hubicka 

Subject: RE: [PING][PATCH] Add instruction level discriminator support.

Another ping for 
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html .

I got a review from Andi 
(https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596549.html) but I also 
need a review from someone who can approve the changes.

Thanks,

Eugene

-Original Message-
From: Eugene Rozenfeld
Sent: Friday, June 10, 2022 12:03 PM
To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan Hubicka 

Subject: [PING][PATCH] Add instruction level discriminator support.

Hello,

I'd like to ping this patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html

Thanks,

Eugene

-Original Message-
From: Gcc-patches  On 
Behalf Of Eugene Rozenfeld via Gcc-patches
Sent: Thursday, June 02, 2022 12:22 AM
To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan Hubicka 

Subject: [EXTERNAL] [PATCH] Add instruction level discriminator support.

This is the first in a series of patches to enable discriminator support in 
AutoFDO.

This patch switches to tracking discriminators per statement/instruction 
instead of per basic block. Tracking per basic block was problematic since not 
all statements in a basic block needed a discriminator and, also, later 
optimizations could move statements between basic blocks making correlation 
during AutoFDO compilation unreliable. Tracking per statement also allows us to 
assign different discriminators to multiple function calls in the same basic 
block. A subsequent patch will add that support.

The idea of this patch is based on commit 
4c311d95cf6d9519c3c20f641cc77af7df491fdf
by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different 
approach. In Dehao's work special (normally unused) location ids and side 
tables were used to keep track of locations with discriminators. Things have 
changed since then and I don't think we have unused location ids anymore. 
Instead, I made discriminators a part of ad-hoc locations.

The difference from Dehao's work also includes support for discriminator 
reading/writing in lto streaming and in modules.

Tested on x86_64-pc-linux-gnu.
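
To make the per-statement idea concrete, here is a toy model. It is not the GCC data structure or the patch's algorithm: `Stmt` and `assign_per_statement` are hypothetical, and the only point illustrated is that statements sharing a source line can each receive their own discriminator rather than one value per basic block.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy model only; names are hypothetical and this is not GCC's representation.
struct Stmt {
    int line;
    int discriminator;  // 0 means "no discriminator"
};

// Give each statement that shares a line with an earlier statement its own
// discriminator, instead of keeping one value per basic block.
void assign_per_statement(std::vector<Stmt>& stmts)
{
    std::map<int, int> next_for_line;
    for (Stmt& s : stmts) {
        int& next = next_for_line[s.line];
        s.discriminator = next;   // the first statement on a line keeps 0
        ++next;
    }
}
```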



@@ -1190,12 +1217,12 @@ assign_discriminators (void)
  || (last && same_line_p (locus, _e,
   gimple_location (last
{
- if (e->dest->discriminator != 0 && bb->discriminator == 0)
-   bb->discriminator
- = next_discriminator_for_locus (locus_e.line);
+ if (((first && has_discriminator (gimple_location (first)))
+  || (last && has_discriminator (gimple_location (last


I think you want to check has_discriminator only for the one of first or 
last that we find to have the same line as locus above?


Incidentally, I wonder why we ignore column number here, but that's not 
an issue for this patch.


Jason



[Bug testsuite/106516] New test case gcc.dg/pr104992.c fails on power 10

2022-08-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106516

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |meissner at gcc dot gnu.org

--- Comment #3 from Kewen Lin  ---
*** Bug 106681 has been marked as a duplicate of this bug. ***

[Bug testsuite/106681] Powerpc test gcc.dg/pr104992.c fails on power10

2022-08-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106681

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linkw at gcc dot gnu.org
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #1 from Kewen Lin  ---
Dup.

*** This bug has been marked as a duplicate of bug 106516 ***

[Bug testsuite/106345] Some ppc64le tests fail with -mcpu=power9 -mtune=power9

2022-08-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106345

--- Comment #9 from Kewen Lin  ---
(In reply to Michael Meissner from comment #8)
> Note, the gcc.target/powerpc/pr92398.p9-.c test fails when the compiler is
> configured for either --with-cpu=power9 or --with-cpu=power10.  No
> --with-tune= was used in configuring either compiler.

Yeah, as noted in comment #1, it's a different issue from the --with-tune
issue; it's due to an empty TU in the effective-target checks.

The patch was posted at:
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598748.html

Re: [PATCH] libcpp, v2: Implement C++23 P2290R3 - Delimited escape sequences [PR106645]

2022-08-18 Thread Jason Merrill via Gcc-patches

On 8/18/22 01:17, Jakub Jelinek wrote:

On Wed, Aug 17, 2022 at 10:22:03PM -0400, Jason Merrill wrote:

OK, a comment mentioning this should be sufficient.


Here is an updated patch with those changes in.
So far successfully tested with
GXX_TESTSUITE_STDS=98,11,14,17,20,2b make -j32 -k check-gcc check-g++ 
RUNTESTFLAGS="dg.exp='Wbidi* cpp/*' cpp.exp"
ok if it passes full bootstrap/regtest tonight?


OK.
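
For readers unfamiliar with P2290R3, the delimited forms are `\x{...}`, `\o{...}` and `\u{...}`. The sketch below parses one such escape at run time so it can be tried with any compiler. It is not libcpp code: `parse_delimited_escape` is a hypothetical helper, and it deliberately ignores overflow and the range checks a real implementation needs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch, not libcpp code.  Parses a whole string of the form
// "\x{...}", "\o{...}" or "\u{...}" and returns its value, or -1 if the
// input is malformed.  Overflow is not handled.
long parse_delimited_escape(const std::string& s)
{
    if (s.size() < 5 || s[0] != '\\' || s[2] != '{' || s.back() != '}')
        return -1;
    int base;
    if (s[1] == 'x' || s[1] == 'u')
        base = 16;
    else if (s[1] == 'o')
        base = 8;
    else
        return -1;
    long value = 0;
    for (std::size_t i = 3; i + 1 < s.size(); ++i) {
        char c = s[i];
        int digit;
        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            digit = c - 'A' + 10;
        else
            return -1;
        if (digit >= base)
            return -1;          // e.g. '8' inside \o{...}
        value = value * base + digit;
    }
    return value;
}
```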


2022-08-18  Jakub Jelinek  

PR c++/106645
libcpp/
* include/cpplib.h (struct cpp_options): Implement
P2290R3 - Delimited escape sequences.  Add delimited_escape_seqs
member.
* init.cc (struct lang_flags): Likewise.
(lang_defaults): Add delim column.
(cpp_set_lang): Copy over delimited_escape_seqs.
* charset.cc (extend_char_range): New function.
(_cpp_valid_ucn): Use it.  Handle delimited escape sequences.
(convert_hex): Likewise.
(convert_oct): Likewise.
(convert_ucn): Use extend_char_range.
(convert_escape): Call convert_oct even for \o.
(_cpp_interpret_identifier): Handle delimited escape sequences.
* lex.cc (get_bidi_ucn_1): Likewise.  Add end argument, fill it in.
(get_bidi_ucn): Adjust get_bidi_ucn_1 caller.  Use end argument to
compute num_bytes.
gcc/testsuite/
* c-c++-common/cpp/delimited-escape-seq-1.c: New test.
* c-c++-common/cpp/delimited-escape-seq-2.c: New test.
* c-c++-common/cpp/delimited-escape-seq-3.c: New test.
* c-c++-common/Wbidi-chars-24.c: New test.
* gcc.dg/cpp/delimited-escape-seq-1.c: New test.
* gcc.dg/cpp/delimited-escape-seq-2.c: New test.
* g++.dg/cpp/delimited-escape-seq-1.C: New test.
* g++.dg/cpp/delimited-escape-seq-2.C: New test.

--- libcpp/include/cpplib.h.jj  2022-08-10 09:06:53.268209449 +0200
+++ libcpp/include/cpplib.h 2022-08-15 19:32:53.743213474 +0200
@@ -519,6 +519,9 @@ struct cpp_options
/* Nonzero for C++23 size_t literals.  */
unsigned char size_t_literals;
  
+  /* Nonzero for C++23 delimited escape sequences.  */
+  unsigned char delimited_escape_seqs;
+
/* Holds the name of the target (execution) character set.  */
const char *narrow_charset;
  
--- libcpp/init.cc.jj	2022-08-10 09:06:53.268209449 +0200
+++ libcpp/init.cc	2022-08-15 16:09:01.403020485 +0200
@@ -96,34 +96,35 @@ struct lang_flags
char dfp_constants;
char size_t_literals;
char elifdef;
+  char delimited_escape_seqs;
  };
  
  static const struct lang_flags lang_defaults[] =

-{ /*              c99 c++ xnum xid c11 std digr ulit rlit udlit bincst digsep trig u8chlit vaopt scope dfp szlit elifdef */
-  /* GNUC89   */  { 0,  0,  1,  0,  0,  0,  1,   0,   0,   0,    0,     0,     0,   0,      1,   1,     0,   0,   0 },
-  /* GNUC99   */  { 1,  0,  1,  1,  0,  0,  1,   1,   1,   0,    0,     0,     0,   0,      1,   1,     0,   0,   0 },
-  /* GNUC11   */  { 1,  0,  1,  1,  1,  0,  1,   1,   1,   0,    0,     0,     0,   0,      1,   1,     0,   0,   0 },
-  /* GNUC17   */  { 1,  0,  1,  1,  1,  0,  1,   1,   1,   0,    0,     0,     0,   0,      1,   1,     0,   0,   0 },
-  /* GNUC2X   */  { 1,  0,  1,  1,  1,  0,  1,   1,   1,   0,    1,     1,     0,   1,      1,   1,     1,   0,   1 },
-  /* STDC89   */  { 0,  0,  0,  0,  0,  1,  0,   0,   0,   0,    0,     0,     1,   0,      0,   0,     0,   0,   0 },
-  /* STDC94   */  { 0,  0,  0,  0,  0,  1,  1,   0,   0,   0,    0,     0,     1,   0,      0,   0,     0,   0,   0 },
-  /* STDC99   */  { 1,  0,  1,  1,  0,  1,  1,   0,   0,   0,    0,     0,     1,   0,      0,   0,     0,   0,   0 },
-  /* STDC11   */  { 1,  0,  1,  1,  1,  1,  1,   1,   0,   0,    0,     0,     1,   0,      0,   0,     0,   0,   0 },
-  /* STDC17   */  { 1,  0,  1,  1,  1,  1,  1,   1,   0,   0,    0,     0,     1,   0,      0,   0,     0,   0,   0 },
-  /* STDC2X   */  { 1,  0,  1,  1,  1,  1,  1,   1,   0,   0,    1,     1,     1,   1,      0,   1,     1,   0,   1 },
-  /* GNUCXX   */  { 0,  1,  1,  1,  0,  0,  1,   0,   0,   0,    0,     0,     0,   0,      1,   1,     0,   0,   0 },
-  /* CXX98    */  { 0,  1,  0,  1,  0,  1,  1,   0,   0,   0,    0,     0,     1,   0,      0,   1,     0,   0,   0 },
-  /* GNUCXX11 */  { 1,  1,  1,  1,  1,  0,  1,   1,   1,   1,    0,     0,     0,   0,      1,   1,     0,   0,   0 },
-  /* CXX11    */  { 1,  1,  0,  1,  1,  1,  1,   1,   1,   1,    0,     0,     1,   0,      0,   1,     0,   0,   0 },
-  /* GNUCXX14 */  { 1,  1,  1,  1,  1,  0,  1,   1,   1,   1,    1,     1,     0,   0,      1,   1,     0,   0,   0 },
-  /* CXX14    */  { 1,  1,  0,  1,  1,  1,  1,   1,   1,   1,    1,     1,     1,   0,      0,   1,     0,   0,   0 },
-  /* GNUCXX17 */  { 1,  1,  1,  1,  1,  0,  1,   1,   1,   1,    1,     1,     0,   1,      1,   1,     0,   0,   0 },
-  /* CXX17    */  { 1,  1,  1,  1,  1,  1,  1,   1,   1,   1,    1,     1,     0,   1,      0,   1,     0,   0,   0 },
-  

Re: [PATCH v2] c++: Implement -Wself-move warning [PR81159]

2022-08-18 Thread Jason Merrill via Gcc-patches

On 8/18/22 13:19, Marek Polacek wrote:

On Mon, Aug 15, 2022 at 03:54:05PM -0400, Jason Merrill wrote:

On 8/9/22 09:37, Marek Polacek wrote:

+  /* We're looking for *std::move ((T &) ), or
+ *std::move ((T &) (T *) r) if the argument is a reference.  */
+  if (!REFERENCE_REF_P (rhs)
+  || TREE_CODE (TREE_OPERAND (rhs, 0)) != CALL_EXPR)
+return;
+  tree fn = TREE_OPERAND (rhs, 0);
+  if (!is_std_move_p (fn))
+return;
+  tree arg = CALL_EXPR_ARG (fn, 0);
+  if (TREE_CODE (arg) != NOP_EXPR)
+return;
+  /* Strip the (T &).  */
+  arg = TREE_OPERAND (arg, 0);
+  /* Strip the (T *) or &.  */
+  arg = TREE_OPERAND (arg, 0);


Are you sure these are the only two expressions that can make it here? What
if the argument to move is *Tptr?


Not 100% sure but I couldn't find any other form.  For *Tptr we get
*std::move ((int * &) )


That looks like what you'd get when the argument is Tptr, not when it's 
*Tptr.  And indeed that's what I see in the testcase:



+  Tptr = std::move (Tptr); // { dg-warning "moving a variable to itself" }


is missing the *


@@ -6826,6 +6827,26 @@ of a declaration:
   This warning is enabled by @option{-Wall}.
+@item -Wno-self-move @r{(C++ and Objective-C++ only)}
+@opindex Wself-move
+@opindex Wno-self-move
+This warning warns when a value is moved to itself with @code{std::move}.
+Such a @code{std::move} has no effect.


...unless it naively breaks the object, like

T(T&& ot): data(ot.data) { ot.data = nullptr; } // oops


"If you try to move me I'll disappear!"

I've added the weasel word: "typically has no effect."  Or do we want to say
more?
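
A small sketch of the "naively breaks the object" case, for illustration only: `Box` and `survives_self_move` are hypothetical, and the pattern is shown in a move assignment operator (rather than the constructor quoted above) because that is what `x = std::move (x)` actually invokes for a class type.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type for illustration.  Its move assignment marks the source
// as moved-from without checking whether source and target are the same.
struct Box {
    int value = 0;
    bool engaged = false;

    Box& operator=(Box&& other)
    {
        value = other.value;
        engaged = other.engaged;
        other.engaged = false;   // on a self-move this disengages *this too
        return *this;
    }
};

bool survives_self_move()
{
    Box b;
    b.value = 42;
    b.engaged = true;
    Box& alias = b;              // same object under another name
    b = std::move(alias);        // a self-move in effect
    return b.engaged;
}
```

Here the self-move is not a no-op: the object ends up disengaged, which is why "typically has no effect" is the safer wording.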

-- >8 --
About 5 years ago we got a request to implement -Wself-move, which
warns about useless moves like this:

   int x;
   x = std::move (x);

This patch implements that warning.

PR c++/81159

gcc/c-family/ChangeLog:

* c.opt (Wself-move): New option.

gcc/cp/ChangeLog:

* typeck.cc (maybe_warn_self_move): New.
(cp_build_modify_expr): Call maybe_warn_self_move.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wself-move.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wself-move1.C: New test.
---
  gcc/c-family/c.opt  |   4 +
  gcc/cp/typeck.cc|  48 ++-
  gcc/doc/invoke.texi |  23 +-
  gcc/testsuite/g++.dg/warn/Wself-move1.C | 105 
  4 files changed, 178 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wself-move1.C

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index dfdebd596ef..f776efd39d8 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1229,6 +1229,10 @@ Wselector
  ObjC ObjC++ Var(warn_selector) Warning
  Warn if a selector has multiple methods.
  
+Wself-move
+C++ ObjC++ Var(warn_self_move) Warning LangEnabledBy(C++ ObjC++, Wall)
+Warn when a value is moved to itself with std::move.
+
  Wsequence-point
  C ObjC C++ ObjC++ Var(warn_sequence_point) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
  Warn about possible violations of sequence point rules.
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 992ebfd99fb..cbc32a7c8ca 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -8897,7 +8897,51 @@ cp_build_c_cast (location_t loc, tree type, tree expr,
  
return error_mark_node;

  }
-
+
+/* Warn when a value is moved to itself with std::move.  LHS is the target,
+   RHS may be the std::move call, and LOC is the location of the whole
+   assignment.  */
+
+static void
+maybe_warn_self_move (location_t loc, tree lhs, tree rhs)
+{
+  if (!warn_self_move)
+return;
+
+  /* C++98 doesn't know move.  */
+  if (cxx_dialect < cxx11)
+return;
+
+  if (processing_template_decl)
+return;
+
+  /* We're looking for *std::move ((T &) ), or
+ *std::move ((T &) (T *) r) if the argument is a reference.  */
+  if (!REFERENCE_REF_P (rhs)
+  || TREE_CODE (TREE_OPERAND (rhs, 0)) != CALL_EXPR)
+return;
+  tree fn = TREE_OPERAND (rhs, 0);
+  if (!is_std_move_p (fn))
+return;
+  tree arg = CALL_EXPR_ARG (fn, 0);
+  if (TREE_CODE (arg) != NOP_EXPR)
+return;
+  /* Strip the (T &).  */
+  arg = TREE_OPERAND (arg, 0);
+  /* Strip the (T *) or &.  */
+  arg = TREE_OPERAND (arg, 0);
+  arg = convert_from_reference (arg);
+  /* So that we catch (i) = std::move (i);.  */
+  lhs = maybe_undo_parenthesized_ref (lhs);
+  STRIP_ANY_LOCATION_WRAPPER (lhs);
+  if (cp_tree_equal (lhs, arg))
+{
+  auto_diagnostic_group d;
+  if (warning_at (loc, OPT_Wself_move, "moving a variable to itself"))
+   inform (loc, "remove %<std::move%> call");
+}
+}
+
  /* For use from the C common bits.  */
  tree
  build_modify_expr (location_t location,
@@ -9101,6 +9145,8 @@ cp_build_modify_expr (location_t loc, tree lhs, enum 
tree_code modifycode,
  
if (modifycode == NOP_EXPR)

{
+ maybe_warn_self_move (loc, lhs, rhs);
+
  if (c_dialect_objc ())
{
  result = 

[Bug analyzer/106181] [13 Regression] ICE in capacity_compatible_with_type, at analyzer/region-model.cc:2909

2022-08-18 Thread tlange at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106181

Tim Lange  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Tim Lange  ---
(In reply to David Malcolm from comment #4)
> Tim: is this fixed by the above commit?

Yes, marking as fixed.

[Bug analyzer/106181] [13 Regression] ICE in capacity_compatible_with_type, at analyzer/region-model.cc:2909

2022-08-18 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106181

David Malcolm  changed:

   What|Removed |Added

 CC||dmalcolm at gcc dot gnu.org

--- Comment #4 from David Malcolm  ---
Tim: is this fixed by the above commit?

[Bug c++/106646] [C++23] P2437R1 - Support for #warning

2022-08-18 Thread jsm28 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106646

--- Comment #1 from Joseph S. Myers  ---
Enabled for C2x (i.e. stopped -pedantic diagnosing it) with commit
d7c3000147c1d8090f66a2baf4623d2c0dfe8eb6 - C++ will presumably want to adjust
the diagnostics as well as enabling for relevant C++ versions and adding
associated tests.

[committed] preprocessor: Support #warning for standard C2x

2022-08-18 Thread Joseph Myers
ISO C2x standardizes the existing #warning extension.  Arrange
accordingly for it not to be diagnosed with -std=c2x -pedantic, but to
be diagnosed with -Wc11-c2x-compat.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/testsuite/
* gcc.dg/cpp/c11-warning-1.c, gcc.dg/cpp/c11-warning-2.c,
gcc.dg/cpp/c11-warning-3.c, gcc.dg/cpp/c11-warning-4.c,
gcc.dg/cpp/c2x-warning-1.c, gcc.dg/cpp/c2x-warning-2.c,
gcc.dg/cpp/gnu11-warning-1.c, gcc.dg/cpp/gnu11-warning-2.c,
gcc.dg/cpp/gnu11-warning-3.c, gcc.dg/cpp/gnu11-warning-4.c,
gcc.dg/cpp/gnu2x-warning-1.c, gcc.dg/cpp/gnu2x-warning-2.c: New
tests.

libcpp/
* include/cpplib.h (struct cpp_options): Add warning_directive.
* init.cc (struct lang_flags, lang_defaults): Add
warning_directive.
* directives.cc (DIRECTIVE_TABLE): Mark #warning as STDC2X not
EXTENSION.
(directive_diagnostics): Diagnose #warning with -Wc11-c2x-compat,
or with -pedantic for a standard not supporting #warning.

diff --git a/gcc/testsuite/gcc.dg/cpp/c11-warning-1.c 
b/gcc/testsuite/gcc.dg/cpp/c11-warning-1.c
new file mode 100644
index 000..45d1ff874e9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/c11-warning-1.c
@@ -0,0 +1,6 @@
+/* Test #warning not in C11.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+#warning example text /* { dg-warning "example text" } */
+/* { dg-error "#warning before C2X is a GCC extension" "pedantic" { target 
*-*-* } .-1 } */
diff --git a/gcc/testsuite/gcc.dg/cpp/c11-warning-2.c 
b/gcc/testsuite/gcc.dg/cpp/c11-warning-2.c
new file mode 100644
index 000..ba385bf60fb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/c11-warning-2.c
@@ -0,0 +1,6 @@
+/* Test #warning not in C11.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -pedantic" } */
+
+#warning example text /* { dg-warning "example text" } */
+/* { dg-warning "#warning before C2X is a GCC extension" "pedantic" { target 
*-*-* } .-1 } */
diff --git a/gcc/testsuite/gcc.dg/cpp/c11-warning-3.c 
b/gcc/testsuite/gcc.dg/cpp/c11-warning-3.c
new file mode 100644
index 000..8d74fcdaea4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/c11-warning-3.c
@@ -0,0 +1,6 @@
+/* Test #warning not in C11.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -Wc11-c2x-compat" } */
+
+#warning example text /* { dg-warning "example text" } */
+/* { dg-warning "#warning before C2X is a GCC extension" "compat" { target 
*-*-* } .-1 } */
diff --git a/gcc/testsuite/gcc.dg/cpp/c11-warning-4.c 
b/gcc/testsuite/gcc.dg/cpp/c11-warning-4.c
new file mode 100644
index 000..0af93f3f459
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/c11-warning-4.c
@@ -0,0 +1,6 @@
+/* Test #warning not in C11.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11" } */
+
+#warning example text /* { dg-warning "example text" } */
+/* Not diagnosed by default.  */
diff --git a/gcc/testsuite/gcc.dg/cpp/c2x-warning-1.c 
b/gcc/testsuite/gcc.dg/cpp/c2x-warning-1.c
new file mode 100644
index 000..696a0cd7aad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/c2x-warning-1.c
@@ -0,0 +1,5 @@
+/* Test #warning in C2x.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+#warning example text /* { dg-warning "example text" } */
diff --git a/gcc/testsuite/gcc.dg/cpp/c2x-warning-2.c 
b/gcc/testsuite/gcc.dg/cpp/c2x-warning-2.c
new file mode 100644
index 000..3042e7a088c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/c2x-warning-2.c
@@ -0,0 +1,6 @@
+/* Test #warning in C2x: -Wc11-c2x-compat.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c2x -pedantic-errors -Wc11-c2x-compat" } */
+
+#warning example text /* { dg-warning "example text" } */
+/* { dg-warning "#warning before C2X is a GCC extension" "compat" { target 
*-*-* } .-1 } */
diff --git a/gcc/testsuite/gcc.dg/cpp/gnu11-warning-1.c 
b/gcc/testsuite/gcc.dg/cpp/gnu11-warning-1.c
new file mode 100644
index 000..7dda115ab3e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/gnu11-warning-1.c
@@ -0,0 +1,6 @@
+/* Test #warning not in C11.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=gnu11 -pedantic-errors" } */
+
+#warning example text /* { dg-warning "example text" } */
+/* { dg-error "#warning before C2X is a GCC extension" "pedantic" { target 
*-*-* } .-1 } */
diff --git a/gcc/testsuite/gcc.dg/cpp/gnu11-warning-2.c 
b/gcc/testsuite/gcc.dg/cpp/gnu11-warning-2.c
new file mode 100644
index 000..af2cc349702
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/gnu11-warning-2.c
@@ -0,0 +1,6 @@
+/* Test #warning not in C11.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=gnu11 -pedantic" } */
+
+#warning example text /* { dg-warning "example text" } */
+/* { dg-warning "#warning before C2X is a GCC extension" "pedantic" { target 
*-*-* } .-1 } */
diff --git a/gcc/testsuite/gcc.dg/cpp/gnu11-warning-3.c 
b/gcc/testsuite/gcc.dg/cpp/gnu11-warning-3.c
new file mode 100644

[Bug testsuite/106345] Some ppc64le tests fail with -mcpu=power9 -mtune=power9

2022-08-18 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106345

Michael Meissner  changed:

   What|Removed |Added

 CC||meissner at gcc dot gnu.org

--- Comment #8 from Michael Meissner  ---
Note, the gcc.target/powerpc/pr92398.p9-.c test fails when the compiler is
configured for either --with-cpu=power9 or --with-cpu=power10.  No --with-tune=
was used in configuring either compiler.

[Bug target/106682] New: Powerpc test gcc.target/powerpc/pr86731-fwrapv-longlong.c fails on power8, passes on power9/power10

2022-08-18 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106682

Bug ID: 106682
   Summary: Powerpc test
gcc.target/powerpc/pr86731-fwrapv-longlong.c fails on
power8, passes on power9/power10
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I was doing builds on a power10 for patch submission, and I noticed the test
pr86731-fwrapv-longlong.c fails when the target is power8, but it passes when
the target is power9 or power10.

Here is the log file from power8:
Executing on host: /home/meissner/fsf-build-ppc64le/work098-power8/gcc/xgcc
-B/home/meissner/fsf-build-ppc64le/work098-power8/gcc/ 
/home/meissner/fsf-src/work098/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
   -fdiagnostics-plain-output  -maltivec -O3 -fwrapv -mpower8-vector
-ffat-lto-objects -fno-ident -S -o pr86731-fwrapv-longlong.s(timeout = 300)
spawn -ignore SIGHUP /home/meissner/fsf-build-ppc64le/work098-power8/gcc/xgcc
-B/home/meissner/fsf-build-ppc64le/work098-power8/gcc/
/home/meissner/fsf-src/work098/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
-fdiagnostics-plain-output -maltivec -O3 -fwrapv -mpower8-vector
-ffat-lto-objects -fno-ident -S -o pr86731-fwrapv-longlong.s
PASS: gcc.target/powerpc/pr86731-fwrapv-longlong.c (test for excess errors)
PASS: gcc.target/powerpc/pr86731-fwrapv-longlong.c scan-assembler-times
\\mvspltis[bhw]\\M 0
PASS: gcc.target/powerpc/pr86731-fwrapv-longlong.c scan-assembler-times
\\mvsl[bhwd]\\M 0
gcc.target/powerpc/pr86731-fwrapv-longlong.c:
\\mp?lxv\\M|\\mlxv\\M|\\mlxvd2x\\M|\\mxxspltidp\\M found 0 times

gcc-10-20220818 is now available

2022-08-18 Thread GCC Administrator via Gcc
Snapshot gcc-10-20220818 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/10-20220818/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 10 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-10 revision fcafb592988ceb342c134d62f0adcbffe628f862

You'll find:

 gcc-10-20220818.tar.xz   Complete GCC

  SHA256=9ef5d7be51836f364b700ad0551c042542abfedc8687ecd45ab350aef290af7b
  SHA1=0ae0a2814ab3fff4f331a224cbe7565f91f62c93

Diffs from 10-20220811 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-10
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


[Bug testsuite/106681] New: Powerpc test gcc.dg/pr104992.c fails on power10

2022-08-18 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106681

Bug ID: 106681
   Summary: Powerpc test gcc.dg/pr104992.c fails on power10
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I was doing builds on a power10 system for patch submission, and I noticed the
following test fails when the test is compiled for power10, but it does not
fail when the test is compiled for power8 or power9: gcc.dg/pr104992.c:

Executing on host: /home/meissner/fsf-build-ppc64le/work098-if/gcc/xgcc
-B/home/meissner/fsf-build-ppc64le/work098-if/gcc/ 
/home/meissner/fsf-src/work098/gcc/testsuite/gcc.dg/pr104992.c   
-fdiagnostics-plain-output   -O2 -Wno-psabi -fdump-tree-optimized -S -o
pr104992.s(timeout = 300)
spawn -ignore SIGHUP /home/meissner/fsf-build-ppc64le/work098-if/gcc/xgcc
-B/home/meissner/fsf-build-ppc64le/work098-if/gcc/
/home/meissner/fsf-src/work098/gcc/testsuite/gcc.dg/pr104992.c
-fdiagnostics-plain-output -O2 -Wno-psabi -fdump-tree-optimized -S -o
pr104992.s
PASS: gcc.dg/pr104992.c (test for excess errors)
gcc.dg/pr104992.c: pattern found 6 times
FAIL: gcc.dg/pr104992.c scan-tree-dump-times optimized " % " 9

[Bug testsuite/106680] New: Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2022-08-18 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

Bug ID: 106680
   Summary: Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I was doing some builds for submitting patches, and I did runs on BE systems as
well as LE systems.

I noticed the test gcc.target/powerpc/bswap64-4.c fails in 32-bit, because it
does not generate ldbrx or stdbrx instructions.  These instructions are not
supported on 32-bit.  So the test has to be adjusted either to run only on a
64-bit system, or to expect different insns when it is run on a 32-bit
target.

[Bug testsuite/101169] [10 regression] test case gcc.target/powerpc/fold-vec-extract-char.p7.c fails after r10-9880

2022-08-18 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101169

Michael Meissner  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||meissner at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-08-18

--- Comment #3 from Michael Meissner  ---
The fold-vec-extract tests work fine on the development version of GCC 13 for
64-bit, but they are still failing if I run them on a BE system that supports
32-bit code generation.  It looks like the insn count may need to be adjusted
for 32-bit:

FAIL: gcc.target/powerpc/fold-vec-extract-int.p8.c scan-assembler-times
\\maddi\\M 9
FAIL: gcc.target/powerpc/fold-vec-extract-short.p7.c scan-assembler-times
\\maddi\\M|\\madd\\M 12
FAIL: gcc.target/powerpc/fold-vec-extract-short.p8.c scan-assembler-times
\\maddi\\M 9
FAIL: gcc.target/powerpc/fold-vec-extract-char.p7.c scan-assembler-times
\\maddi\\M 9
FAIL: gcc.target/powerpc/fold-vec-extract-double.p7.c scan-assembler-times
\\maddi\\M|\\madd\\M 3
FAIL: gcc.target/powerpc/fold-vec-extract-float.p7.c scan-assembler-times
\\maddi\\M|\\madd\\M 3
FAIL: gcc.target/powerpc/fold-vec-extract-float.p8.c scan-assembler-times
\\maddi\\M 2
FAIL: gcc.target/powerpc/fold-vec-extract-int.p7.c scan-assembler-times
\\maddi\\M|\\madd\\M 12

[Bug c++/103876] Parameter pack not expanded in lambda within static_assert in a fold-expression of a lambda

2022-08-18 Thread herring at lanl dot gov via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103876

S. Davis Herring  changed:

   What|Removed |Added

 CC||herring at lanl dot gov

--- Comment #2 from S. Davis Herring  ---
Probably the same bug from a simpler (and C++17-compatible) case:

template<class ...TT>
void f(const TT &...tt) {
[tt...]() {
([tt] {},...);
}();
}

:4:13: error: parameter packs not expanded with '...':
4 | ([tt] {},...);
  | ^
:4:13: note: 'tt'

[PATCH 05/10] [RISCV] Add %~ to print w if TARGET_64BIT and use it

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

To make things easier and more maintainable, we need to
add support for printing w if TARGET_64BIT, so this patch
adds %~ to do that, similar to how the x86 backend uses %~
to print i/f for TARGET_AVX2. We could have chosen any
punctuation symbol, but ~ looks the closest to w.
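
As a rough model of what the modifier does to an output template, here is a small stand-alone sketch. It is not the GCC implementation: expand_tilde is a hypothetical helper, and it only handles the "%~" escape, copying every other character (including other '%' escapes such as %0 or %i2) unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the '%~' handling in an output template:
// replaces each "%~" with "w" when target_64bit is true, and with nothing
// otherwise.  All other text is copied as-is.
inline std::string
expand_tilde (const std::string &tmpl, bool target_64bit)
{
  std::string out;
  for (std::size_t i = 0; i < tmpl.size (); ++i)
    {
      if (tmpl[i] == '%' && i + 1 < tmpl.size () && tmpl[i + 1] == '~')
        {
          if (target_64bit)
            out += 'w';
          ++i;  // skip the '~'
        }
      else
        out += tmpl[i];
    }
  return out;
}
```

For example, the template "rol%~\t%0,%1,%2" would come out as "rolw\t%0,%1,%2" for TARGET_64BIT and as "rol\t%0,%1,%2" otherwise, which matches the two strings the old C-code alternatives returned.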

OK? Built and tested for riscv64-linux-gnu and riscv32-linux-gnu with no
regressions.

Thanks,
Andrew Pinski

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_print_operand):
Handle '~'.
(riscv_print_operand_punct_valid_p): New function.
(TARGET_PRINT_OPERAND_PUNCT_VALID_P): Define.
* config/riscv/bitmanip.md (si2/clz_ctz_pcnt):
Use %~ instead of making the pattern conditional on TARGET_64BIT.
(rotrsi3): Likewise.
(rotlsi3): Likewise.
* config/riscv/riscv.md: Add ~ to the list of modifiers.
(addsi3): Use %~ instead of making the pattern conditional on TARGET_64BIT.
(subsi3): Likewise.
(negsi2): Likewise.
(mulsi3): Likewise.
(<optab>si3/any_div): Likewise.
(*addhi3): Likewise.
(si3/any_shift): Likewise.
---
 gcc/config/riscv/bitmanip.md |  6 +++---
 gcc/config/riscv/riscv.cc| 19 +++
 gcc/config/riscv/riscv.md| 15 ---
 3 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 3329dd54eb6..ebd6eee1a22 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -143,7 +143,7 @@ (define_insn "si2"
   [(set (match_operand:SI 0 "register_operand" "=r")
 (clz_ctz_pcnt:SI (match_operand:SI 1 "register_operand" "r")))]
   "TARGET_ZBB"
-  { return TARGET_64BIT ? "w\t%0,%1" : 
"\t%0,%1"; }
+  "%~\t%0,%1"
   [(set_attr "type" "bitmanip")
(set_attr "mode" "SI")])
 
@@ -201,7 +201,7 @@ (define_insn "rotrsi3"
(rotatert:SI (match_operand:SI 1 "register_operand" "r")
 (match_operand:QI 2 "arith_operand" "rI")))]
   "TARGET_ZBB"
-  { return TARGET_64BIT ? "ror%i2w\t%0,%1,%2" : "ror%i2\t%0,%1,%2"; }
+  "ror%i2%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
 (define_insn "rotrdi3"
@@ -225,7 +225,7 @@ (define_insn "rotlsi3"
(rotate:SI (match_operand:SI 1 "register_operand" "r")
   (match_operand:QI 2 "register_operand" "r")))]
   "TARGET_ZBB"
-  { return TARGET_64BIT ? "rolw\t%0,%1,%2" : "rol\t%0,%1,%2"; }
+  "rol%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
 (define_insn "rotldi3"
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 189be5e4e6f..22d0f6d604c 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3731,12 +3731,22 @@ riscv_memmodel_needs_release_fence (enum memmodel model)
'i' Print i if the operand is not a register.
'S' Print shift-index of single-bit mask OP.
'T' Print shift-index of inverted single-bit mask OP.
+   '~' Print w if TARGET_64BIT is true; otherwise print nothing.
 
Note please keep this list and the list in riscv.md in sync.  */
 
 static void
 riscv_print_operand (FILE *file, rtx op, int letter)
 {
+  /* `~` does not take an operand, so op will be null.
+ Check for it before accessing op.
+  */
+  if (letter == '~')
+{
+  if (TARGET_64BIT)
+   fputc('w', file);
+  return;
+}
   machine_mode mode = GET_MODE (op);
   enum rtx_code code = GET_CODE (op);
 
@@ -3812,6 +3822,13 @@ riscv_print_operand (FILE *file, rtx op, int letter)
 }
 }
 
+/* Implement TARGET_PRINT_OPERAND_PUNCT_VALID_P */
+static bool
+riscv_print_operand_punct_valid_p (unsigned char code)
+{
+  return (code == '~');
+}
+
 /* Implement TARGET_PRINT_OPERAND_ADDRESS.  */
 
 static void
@@ -5900,6 +5917,8 @@ riscv_init_libfuncs (void)
 #define TARGET_PRINT_OPERAND riscv_print_operand
 #undef TARGET_PRINT_OPERAND_ADDRESS
 #define TARGET_PRINT_OPERAND_ADDRESS riscv_print_operand_address
+#undef TARGET_PRINT_OPERAND_PUNCT_VALID_P
+#define TARGET_PRINT_OPERAND_PUNCT_VALID_P riscv_print_operand_punct_valid_p
 
 #undef TARGET_SETUP_INCOMING_VARARGS
 #define TARGET_SETUP_INCOMING_VARARGS riscv_setup_incoming_varargs
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index aad2836d179..30cd07dc6f5 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -32,6 +32,7 @@
 ;; i -- Print i if the operand is not a register.
 ;; S -- Print shift-index of single-bit mask OP.
 ;; T -- Print shift-index of inverted single-bit mask OP.
+;; ~ -- Print w if TARGET_64BIT is true; otherwise print nothing.
 
 (define_c_enum "unspec" [
   ;; Override return address for exception handling.
@@ -312,7 +313,7 @@ (define_insn "addsi3"
(plus:SI (match_operand:SI 1 "register_operand" " r,r")
 (match_operand:SI 2 "arith_operand"" r,I")))]
   ""
-  { return TARGET_64BIT ? "add%i2w\t%0,%1,%2" : "add%i2\t%0,%1,%2"; }
+  "add%i2%~\t%0,%1,%2"
   [(set_attr "type" "arith")
(set_attr "mode" "SI")])
 
@@ -452,7 +453,7 @@ 

[PATCH 07/10] [RISCV] Use a constraint for bset_mask and bset_1_mask

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

A constraint here just makes it easier to understand what the
operands are.

OK? Built and tested on riscv32-linux-gnu and riscv64-linux-gnu with
--with-arch=rvNimafdc_zba_zbb_zbc_zbs (where N is 32 and 64).

Thanks,
Andrew Pinski

gcc/ChangeLog:

* config/riscv/constraints.md (DsS): New constraint.
(DsD): New constraint.
* config/riscv/iterators.md (shiftm1c): New iterator.
* config/riscv/bitmanip.md (*bset_mask):
Use shiftm1c.
(*bset_1_mask): Likewise.
---
 gcc/config/riscv/bitmanip.md|  4 ++--
 gcc/config/riscv/constraints.md | 12 
 gcc/config/riscv/iterators.md   |  1 +
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 73a36f7751b..d362f526e79 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -273,7 +273,7 @@ (define_insn "*bset_mask"
(ior:X (ashift:X (const_int 1)
 (subreg:QI
  (and:X (match_operand:X 2 "register_operand" "r")
-(match_operand 3 "" "i")) 0))
+(match_operand 3 "" 
"")) 0))
   (match_operand:X 1 "register_operand" "r")))]
   "TARGET_ZBS"
   "bset\t%0,%1,%2"
@@ -292,7 +292,7 @@ (define_insn "*bset_1_mask"
(ashift:X (const_int 1)
  (subreg:QI
   (and:X (match_operand:X 1 "register_operand" "r")
- (match_operand 2 "" "i")) 0)))]
+ (match_operand 2 "" "")) 0)))]
   "TARGET_ZBS"
   "bset\t%0,x0,%1"
   [(set_attr "type" "bitmanip")])
diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 61b84875fd9..444870ad060 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -60,6 +60,18 @@ (define_constraint "Ds3"
   (and (match_code "const_int")
(match_test "IN_RANGE (ival, 1, 3)")))
 
+(define_constraint "DsS"
+  "@internal
+   31 immediate"
+  (and (match_code "const_int")
+   (match_test "ival == 31")))
+
+(define_constraint "DsD"
+  "@internal
+   63 immediate"
+  (and (match_code "const_int")
+   (match_test "ival == 63")))
+
 ;; Floating-point constant +0.0, used for FCVT-based moves when FMV is
 ;; not available in RV32.
 (define_constraint "G"
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 6c8a6d2dd59..be0d5390307 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -115,6 +115,7 @@ (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
 
 ; bitmanip mode attribute
 (define_mode_attr shiftm1 [(SI "const31_operand") (DI "const63_operand")])
+(define_mode_attr shiftm1p [(SI "DsS") (DI "DsD")])
 
 ;; ---
 ;; Code Iterators
-- 
2.27.0



[PATCH 10/10] [RISCV] Fix PR 106632 and PR 106588 a few constraints in bitmanip.md

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

The constraints should be n instead of i. Also, there
needs to be a check for an out-of-bounds zero_extract for
*bexti.
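
The bounds check can be modelled outside of GCC roughly like this. It is a sketch only: bexti_position_ok is a hypothetical name, and the bitsize argument stands in for the bit width of the X mode (32 or 64).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the new insn condition.  POS is the const_int bit
// position as a signed host integer; BITSIZE is the width of the mode.
// Converting to unsigned first, as UINTVAL does, makes negative positions
// compare as very large values, so a single comparison rejects both kinds
// of out-of-range position.
inline bool
bexti_position_ok (std::int64_t pos, unsigned bitsize)
{
  return static_cast<std::uint64_t> (pos) < bitsize;
}
```

Under this model position 32 is rejected for a 32-bit mode but accepted for a 64-bit one, and a negative position is rejected for both.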

gcc/ChangeLog:

PR target/106632
PR target/106588
* config/riscv/bitmanip.md (*shNadduw): Use n constraint
instead of i.
(*slliuw): Likewise.
(*bexti): Likewise. Also add a check for operands[2] to be less
than the mode bitsize.
---
 gcc/config/riscv/bitmanip.md | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 026299d6703..ecf5b51b533 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -44,7 +44,7 @@ (define_insn "*shNadduw"
(plus:DI
  (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
 (match_operand:QI 2 "imm123_operand" "Ds3"))
-(match_operand 3 "immediate_operand" ""))
+(match_operand 3 "immediate_operand" "n"))
  (match_operand:DI 4 "register_operand" "r")))]
   "TARGET_64BIT && TARGET_ZBA
&& (INTVAL (operands[3]) >> INTVAL (operands[2])) == 0x"
@@ -110,7 +110,7 @@ (define_insn "*slliuw"
   [(set (match_operand:DI 0 "register_operand" "=r")
(and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
   (match_operand:QI 2 "immediate_operand" "I"))
-   (match_operand 3 "immediate_operand" "")))]
+   (match_operand 3 "immediate_operand" "n")))]
   "TARGET_64BIT && TARGET_ZBA
&& (INTVAL (operands[3]) >> INTVAL (operands[2])) == 0x"
   "slli.uw\t%0,%1,%2"
@@ -354,6 +354,7 @@ (define_insn "*bexti"
(zero_extract:X (match_operand:X 1 "register_operand" "r")
(const_int 1)
(match_operand 2 "immediate_operand" "i")))]
-  "TARGET_ZBS"
+   (match_operand 2 "immediate_operand" "n")))]
+  "TARGET_ZBS && UINTVAL (operands[2]) < GET_MODE_BITSIZE (mode)"
   "bexti\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
-- 
2.27.0



[PATCH 09/10] [RISCV] Add constraints for not_single_bit_mask_operand/single_bit_mask_operand

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

Like a previous patch, just add constraints for predicates
not_single_bit_mask_operand and single_bit_mask_operand.

OK? Built and tested for riscv32-linux-gnu and riscv64-linux-gnu.

Thanks,
Andrew Pinski

gcc/ChangeLog:

* config/riscv/constraints.md (DbS): New constraint.
(DnS): New constraint.
* config/riscv/bitmanip.md (*bset_1_mask): Use new constraint.
(*bclr): Likewise.
(*binvi): Likewise.
---
 gcc/config/riscv/bitmanip.md|  6 +++---
 gcc/config/riscv/constraints.md | 10 ++
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index d362f526e79..026299d6703 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -300,7 +300,7 @@ (define_insn "*bset_1_mask"
 (define_insn "*bseti"
   [(set (match_operand:X 0 "register_operand" "=r")
(ior:X (match_operand:X 1 "register_operand" "r")
-  (match_operand 2 "single_bit_mask_operand" "i")))]
+  (match_operand:X 2 "single_bit_mask_operand" "DbS")))]
   "TARGET_ZBS"
   "bseti\t%0,%1,%S2"
   [(set_attr "type" "bitmanip")])
@@ -317,7 +317,7 @@ (define_insn "*bclr"
 (define_insn "*bclri"
   [(set (match_operand:X 0 "register_operand" "=r")
(and:X (match_operand:X 1 "register_operand" "r")
-  (match_operand 2 "not_single_bit_mask_operand" "i")))]
+  (match_operand:X 2 "not_single_bit_mask_operand" "DnS")))]
   "TARGET_ZBS"
   "bclri\t%0,%1,%T2"
   [(set_attr "type" "bitmanip")])
@@ -334,7 +334,7 @@ (define_insn "*binv"
 (define_insn "*binvi"
   [(set (match_operand:X 0 "register_operand" "=r")
(xor:X (match_operand:X 1 "register_operand" "r")
-  (match_operand 2 "single_bit_mask_operand" "i")))]
+  (match_operand:X 2 "single_bit_mask_operand" "DbS")))]
   "TARGET_ZBS"
   "binvi\t%0,%1,%S2"
   [(set_attr "type" "bitmanip")])
diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 444870ad060..2873d533cb5 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -72,6 +72,16 @@ (define_constraint "DsD"
   (and (match_code "const_int")
(match_test "ival == 63")))
 
+(define_constraint "DbS"
+  "@internal"
+  (and (match_code "const_int")
+   (match_test "SINGLE_BIT_MASK_OPERAND (ival)")))
+
+(define_constraint "DnS"
+  "@internal"
+  (and (match_code "const_int")
+   (match_test "SINGLE_BIT_MASK_OPERAND (~ival)")))
+
 ;; Floating-point constant +0.0, used for FCVT-based moves when FMV is
 ;; not available in RV32.
 (define_constraint "G"
-- 
2.27.0



[PATCH 08/10] [RISCV] Fix PR 106586: riscv32 vs ZBS

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

The problem here is twofold. With RISCV32, 32-bit
const_ints are always sign-extended to 64 bits in HWI.
That means SINGLE_BIT_MASK_OPERAND should
mask off the upper bits to see whether the value is a single bit
for !TARGET_64BIT.
In addition, a few locations forget to call
trunc_int_for_mode when generating an SImode constant,
so the constants are not sign-extended correctly for HWI.
The predicates single_bit_mask_operand and
not_single_bit_mask_operand need to get the same handling
as SINGLE_BIT_MASK_OPERAND, so just use SINGLE_BIT_MASK_OPERAND.
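
The sign-extension issue can be reproduced with plain host integers. The following is a sketch under stated assumptions, not the riscv.h macro itself: is_pow2, single_bit_mask and simode_bit31 are hypothetical names, and std::int64_t stands in for HOST_WIDE_INT.

```cpp
#include <cassert>
#include <cstdint>

// True if exactly one bit of X is set.
inline bool
is_pow2 (std::uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

// Hypothetical model of the adjusted check: on a 32-bit target only the low
// 32 bits of the host-wide value are significant, so mask the rest off
// before testing for a single bit.
inline bool
single_bit_mask (std::int64_t hwi, bool target_64bit)
{
  std::uint64_t v = static_cast<std::uint64_t> (hwi);
  if (!target_64bit)
    v &= 0xffffffffu;
  return is_pow2 (v);
}

// 1 << 31 as an SImode constant, sign-extended into a 64-bit host integer.
inline std::int64_t
simode_bit31 ()
{
  return static_cast<std::int64_t> (INT32_MIN);
}
```

Without the mask, the sign-extended value has 33 bits set and fails the power-of-two test, even though it is a single bit in SImode.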

OK? Built and tested on riscv32-linux-gnu and riscv64-linux-gnu with
--with-arch=rvNimafdc_zba_zbb_zbc_zbs where N is replaced with 32 or 64.

Thanks,
Andrew Pinski

gcc/ChangeLog:

PR target/106586
* config/riscv/predicates.md (single_bit_mask_operand):
Use SINGLE_BIT_MASK_OPERAND instead of directly calling pow2p_hwi.
(not_single_bit_mask_operand): Likewise.
* config/riscv/riscv.cc (riscv_build_integer_1): Don't special case
1<<31 for 32bits as it is already handled.
Call trunc_int_for_mode on the upper part after the subtraction.
(riscv_move_integer): Call trunc_int_for_mode before generating
the integer to make sure the constant has been sign extended
correctly.
(riscv_emit_int_compare): Call trunc_int_for_mode after doing the
addition for the new rhs.
* config/riscv/riscv.h (SINGLE_BIT_MASK_OPERAND): If !TARGET_64BIT,
then mask off the upper 32 bits of the HWI as it will be sign extended.
---
 gcc/config/riscv/predicates.md |  4 ++--
 gcc/config/riscv/riscv.cc  | 12 +---
 gcc/config/riscv/riscv.h   |  4 +++-
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 2af7f661d6f..862e72b0983 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -226,11 +226,11 @@ (define_special_predicate "gpr_save_operation"
 ;; Predicates for the ZBS extension.
 (define_predicate "single_bit_mask_operand"
   (and (match_code "const_int")
-   (match_test "pow2p_hwi (INTVAL (op))")))
+   (match_test "SINGLE_BIT_MASK_OPERAND (UINTVAL (op))")))
 
 (define_predicate "not_single_bit_mask_operand"
   (and (match_code "const_int")
-   (match_test "pow2p_hwi (~INTVAL (op))")))
+   (match_test "SINGLE_BIT_MASK_OPERAND (~UINTVAL (op))")))
 
 (define_predicate "const31_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 22d0f6d604c..026c69ce40d 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -432,7 +432,7 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
 sign-extended (negative) representation (-1 << 31) for the
 value, if we want to build (1 << 31) in SImode.  This will
 then expand to an LUI instruction.  */
-  if (mode == SImode && value == (HOST_WIDE_INT_1U << 31))
+  if (TARGET_64BIT && mode == SImode && value == (HOST_WIDE_INT_1U << 31))
codes[0].value = (HOST_WIDE_INT_M1U << 31);
 
   return 1;
@@ -445,7 +445,11 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
   && (mode != HImode
  || value - low_part <= ((1 << (GET_MODE_BITSIZE (HImode) - 1)) - 1)))
 {
-  alt_cost = 1 + riscv_build_integer_1 (alt_codes, value - low_part, mode);
+  HOST_WIDE_INT upper_part = value - low_part;
+  if (mode != VOIDmode)
+   upper_part = trunc_int_for_mode (value - low_part, mode);
+
+  alt_cost = 1 + riscv_build_integer_1 (alt_codes, upper_part, mode);
   if (alt_cost < cost)
{
  alt_codes[alt_cost-1].code = PLUS;
@@ -1550,6 +1554,7 @@ riscv_move_integer (rtx temp, rtx dest, HOST_WIDE_INT 
value,
 x = riscv_split_integer (value, mode);
   else
 {
+  codes[0].value = trunc_int_for_mode (codes[0].value, mode);
   /* Apply each binary operation to X. */
   x = GEN_INT (codes[0].value);
 
@@ -1559,7 +1564,7 @@ riscv_move_integer (rtx temp, rtx dest, HOST_WIDE_INT 
value,
x = riscv_emit_set (temp, x);
  else
x = force_reg (mode, x);
-
+ codes[i].value = trunc_int_for_mode (codes[i].value, mode);
  x = gen_rtx_fmt_ee (codes[i].code, mode, x, GEN_INT (codes[i].value));
}
 }
@@ -2651,6 +2656,7 @@ riscv_emit_int_compare (enum rtx_code *code, rtx *op0, 
rtx *op1)
continue;
 
  new_rhs = rhs + (increment ? 1 : -1);
+ new_rhs = trunc_int_for_mode (new_rhs, GET_MODE (*op0));
  if (riscv_integer_cost (new_rhs) < riscv_integer_cost (rhs)
  && (rhs < 0) == (new_rhs < 0))
{
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 4b07c5487c6..5394776eb50 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -535,7 

[PATCH 06/10] [RISCV] Use constraints/predicates instead of checking const_int directly for shNadd patterns

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

This simplifies the code by adding a predicate and a constraint for 1/2/3.
The aarch64 backend has a similar predicate called aarch64_shift_imm_
which they use there.
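
For readers unfamiliar with the Zba shift-and-add instructions, here is a minimal sketch in plain C of what the patterns compute and why the shift amount is restricted to 1, 2 or 3. The function names are made up for illustration; only the IN_RANGE (ival, 1, 3) check is taken from the patch, and 64-bit registers are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the IN_RANGE (ival, 1, 3) test used by the new predicate
   and constraint.  */
int
is_imm123 (int64_t ival)
{
  return ival >= 1 && ival <= 3;
}

/* sh1add/sh2add/sh3add: rd = (rs1 << n) + rs2, with n in 1..3.  */
uint64_t
shnadd (uint64_t rs1, uint64_t rs2, int n)
{
  assert (is_imm123 (n));
  return (rs1 << n) + rs2;
}

/* shNadd.uw: the low 32 bits of rs1 are zero-extended before the shift.  */
uint64_t
shnadd_uw (uint64_t rs1, uint64_t rs2, int n)
{
  assert (is_imm123 (n));
  return ((rs1 & UINT64_C (0xffffffff)) << n) + rs2;
}
```

Moving the range check into a predicate means the insn condition no longer needs to inspect INTVAL (operands[2]) itself.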

OK? Built and tested on riscv32-linux-gnu and riscv64-linux-gnu with no 
regressions.

Thanks,
Andrew Pinski

gcc/ChangeLog:

* config/riscv/constraints.md (Ds3): New constraint.
* config/riscv/predicates.md (imm123_operand): New predicate.
* config/riscv/bitmanip.md (*shNadd): Use Ds3 and imm123_operand.
(*shNadduw): Likewise.
---
 gcc/config/riscv/bitmanip.md| 8 +++-
 gcc/config/riscv/constraints.md | 6 ++
 gcc/config/riscv/predicates.md  | 5 +
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index ebd6eee1a22..73a36f7751b 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -32,10 +32,9 @@ (define_insn "*zero_extendsidi2_bitmanip"
 (define_insn "*shNadd"
   [(set (match_operand:X 0 "register_operand" "=r")
(plus:X (ashift:X (match_operand:X 1 "register_operand" "r")
- (match_operand:QI 2 "immediate_operand" "I"))
+ (match_operand:QI 2 "imm123_operand" "Ds3"))
(match_operand:X 3 "register_operand" "r")))]
-  "TARGET_ZBA
-   && (INTVAL (operands[2]) >= 1) && (INTVAL (operands[2]) <= 3)"
+  "TARGET_ZBA"
   "sh%2add\t%0,%1,%3"
   [(set_attr "type" "bitmanip")
(set_attr "mode" "")])
@@ -44,11 +43,10 @@ (define_insn "*shNadduw"
   [(set (match_operand:DI 0 "register_operand" "=r")
(plus:DI
  (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
-(match_operand:QI 2 "immediate_operand" "I"))
+(match_operand:QI 2 "imm123_operand" "Ds3"))
 (match_operand 3 "immediate_operand" ""))
  (match_operand:DI 4 "register_operand" "r")))]
   "TARGET_64BIT && TARGET_ZBA
-   && (INTVAL (operands[2]) >= 1) && (INTVAL (operands[2]) <= 3)
&& (INTVAL (operands[3]) >> INTVAL (operands[2])) == 0x"
   "sh%2add.uw\t%0,%1,%4"
   [(set_attr "type" "bitmanip")
diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index bafa4188ccb..61b84875fd9 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -54,6 +54,12 @@ (define_constraint "L"
   (and (match_code "const_int")
(match_test "LUI_OPERAND (ival)")))
 
+(define_constraint "Ds3"
+  "@internal
+   1, 2 or 3 immediate"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (ival, 1, 3)")))
+
 ;; Floating-point constant +0.0, used for FCVT-based moves when FMV is
 ;; not available in RV32.
 (define_constraint "G"
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 79e0c1d5589..2af7f661d6f 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -244,6 +244,11 @@ (define_predicate "imm5_operand"
   (and (match_code "const_int")
(match_test "INTVAL (op) < 5")))
 
+;; A const_int for sh1add/sh2add/sh3add
+(define_predicate "imm123_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 1, 3)")))
+
 ;; A CONST_INT operand that consists of a single run of consecutive set bits.
 (define_predicate "consecutive_bits_operand"
   (match_code "const_int")
-- 
2.27.0



[PATCH 01/10] [RISCV] Move iterators from riscv.md to iterators.md

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

This moves the iterators out from riscv.md to iterators.md
like most modern backends.
I have not moved the iterators from the other .md files yet.

OK? Built and tested on riscv64-linux-gnu and riscv32-linux-gnu.

Thanks,
Andrew Pinski

gcc/ChangeLog:

* config/riscv/riscv.md
(GPR): Move to new file.
(P, X, BR): Likewise.
(MOVE32, MOVE64, SHORT): Likewise.
(HISI, SUPERQI, SUBX): Likewise.
(ANYI, ANYF, SOFTF): Likewise.
(size, load, default_load): Likewise.
(softload, store, softstore): Likewise.
(reg, fmt, ifmt, amo): Likewise.
(UNITMODE, HALFMODE): Likewise.
(RINT, rint_pattern, rint_rm): Likewise.
(QUIET_COMPARISON, quiet_pattern, QUIET_PATTERN): Likewise.
(any_extend, any_shiftrt, any_shift): Likewise.
(any_bitwise): Likewise.
(any_div, any_mod): Likewise.
(any_gt, any_ge, any_lt, any_le): Likewise.
(u, su): Likewise.
(optab, insn): Likewise.
* config/riscv/iterators.md: New file.
---
 gcc/config/riscv/iterators.md | 212 ++
 1 file changed, 212 insertions(+)
 create mode 100644 gcc/config/riscv/iterators.md

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
new file mode 100644
index 000..351aa7f3cea
--- /dev/null
+++ b/gcc/config/riscv/iterators.md
@@ -0,0 +1,212 @@
+;; Iterators for the machine description for RISC-V
+;; Copyright (C) 2011-2022 Free Software Foundation, Inc.
+
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+
+;; ---
+;; Mode Iterators
+;; ---
+
+;; This mode iterator allows 32-bit and 64-bit GPR patterns to be generated
+;; from the same template.
+(define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
+
+;; This mode iterator allows :P to be used for patterns that operate on
+;; pointer-sized quantities.  Exactly one of the two alternatives will match.
+(define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
+
+;; Likewise, but for XLEN-sized quantities.
+(define_mode_iterator X [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")])
+
+;; Branches operate on XLEN-sized quantities, but for RV64 we accept
+;; QImode values so we can force zero-extension.
+(define_mode_iterator BR [(QI "TARGET_64BIT") SI (DI "TARGET_64BIT")])
+
+;; 32-bit moves for which we provide move patterns.
+(define_mode_iterator MOVE32 [SI])
+
+;; 64-bit modes for which we provide move patterns.
+(define_mode_iterator MOVE64 [DI DF])
+
+;; Iterator for sub-32-bit integer modes.
+(define_mode_iterator SHORT [QI HI])
+
+;; Iterator for HImode constant generation.
+(define_mode_iterator HISI [HI SI])
+
+;; Iterator for QImode extension patterns.
+(define_mode_iterator SUPERQI [HI SI (DI "TARGET_64BIT")])
+
+;; Iterator for hardware integer modes narrower than XLEN.
+(define_mode_iterator SUBX [QI HI (SI "TARGET_64BIT")])
+
+;; Iterator for hardware-supported integer modes.
+(define_mode_iterator ANYI [QI HI SI (DI "TARGET_64BIT")])
+
+;; Iterator for hardware-supported floating-point modes.
+(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
+   (DF "TARGET_DOUBLE_FLOAT")
+   (HF "TARGET_ZFH")])
+
+;; Iterator for floating-point modes that can be loaded into X registers.
+(define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")])
+
+
+;; ---
+;; Mode attributes
+;; ---
+
+
+;; This attribute gives the length suffix for a sign- or zero-extension
+;; instruction.
+(define_mode_attr size [(QI "b") (HI "h")])
+
+;; Mode attributes for loads.
+(define_mode_attr load [(QI "lb") (HI "lh") (SI "lw") (DI "ld") (SF "flw") (HF 
"flh") (DF "fld")])
+
+;; Instruction names for integer loads that aren't explicitly sign or zero
+;; extended.  See riscv_output_move and LOAD_EXTEND_OP.
+(define_mode_attr default_load [(QI "lbu") (HI "lhu") (SI "lw") (DI "ld")])
+
+;; Mode attribute for FP loads into integer registers.
+(define_mode_attr softload [(HF "lh") (SF "lw") (DF "ld")])
+
+;; Instruction names for stores.
+(define_mode_attr store [(QI "sb") (HI "sh") 

[PATCH 03/10] [RISCV] Move iterators from sync.md to iterators.md

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

Like the previous two patches this moves the iterators
that are in sync.md to iterators.md.

OK? Built and tested for riscv64-linux-gnu.

gcc/ChangeLog:

* config/riscv/sync.md (any_atomic, atomic_optab): Move to ...
* config/riscv/iterators.md: Here.
---
 gcc/config/riscv/iterators.md | 7 +++
 gcc/config/riscv/sync.md  | 4 
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 54590f43193..6c8a6d2dd59 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -151,6 +151,9 @@ (define_code_iterator any_ge [ge geu])
 (define_code_iterator any_lt [lt ltu])
 (define_code_iterator any_le [le leu])
 
+; atomics code iterator
+(define_code_iterator any_atomic [plus ior xor and])
+
 ; bitmanip code iterators
 (define_code_iterator bitmanip_bitwise [and ior])
 
@@ -205,6 +208,10 @@ (define_code_attr insn [(ashift "sll")
(plus "add")
(minus "sub")])
 
+; atomics code attribute
+(define_code_attr atomic_optab
+  [(plus "add") (ior "or") (xor "xor") (and "and")])
+
 ; bitmanip code attributes
 (define_code_attr bitmanip_optab [(smin "smin")
  (smax "smax")
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 86b41e6b00a..7deb290d9dc 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -27,10 +27,6 @@ (define_c_enum "unspec" [
   UNSPEC_MEMORY_BARRIER
 ])
 
-(define_code_iterator any_atomic [plus ior xor and])
-(define_code_attr atomic_optab
-  [(plus "add") (ior "or") (xor "xor") (and "and")])
-
 ;; Memory barriers.
 
 (define_expand "mem_thread_fence"
-- 
2.27.0



[PATCH 02/10] [RISCV] Move iterators from bitmanip.md to iterators.md

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

Just like the previous patch, this moves all of the iterators
from bitmanip.md to iterators.md.  All modern backends put the
iterators in iterators.md for easier access.

OK? Built and tested for riscv32-linux-gnu with 
--with-arch=rv32imafdc_zba_zbb_zbc_zbs.

Thanks,
Andrew Pinski

gcc/ChangeLog:

* config/riscv/bitmanip.md
(bitmanip_bitwise, bitmanip_minmax, clz_ctz_pcnt, bitmanip_optab,
bitmanip_insn, shiftm1): Move to ...
* config/riscv/iterators.md: Here.
---
 gcc/config/riscv/bitmanip.md  | 25 -
 gcc/config/riscv/iterators.md | 27 ++-
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index d1570ce8508..3329dd54eb6 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -17,31 +17,6 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .
 
-(define_code_iterator bitmanip_bitwise [and ior])
-
-(define_code_iterator bitmanip_minmax [smin umin smax umax])
-
-(define_code_iterator clz_ctz_pcnt [clz ctz popcount])
-
-(define_code_attr bitmanip_optab [(smin "smin")
- (smax "smax")
- (umin "umin")
- (umax "umax")
- (clz "clz")
- (ctz "ctz")
- (popcount "popcount")])
-
-
-(define_code_attr bitmanip_insn [(smin "min")
-(smax "max")
-(umin "minu")
-(umax "maxu")
-(clz "clz")
-(ctz "ctz")
-(popcount "cpop")])
-
-(define_mode_attr shiftm1 [(SI "const31_operand") (DI "const63_operand")])
-
 ;; ZBA extension.
 
 (define_insn "*zero_extendsidi2_bitmanip"
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 351aa7f3cea..54590f43193 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -113,6 +113,9 @@ (define_mode_attr UNITMODE [(HF "HF") (SF "SF") (DF "DF")])
 ;; the controlling mode.
 (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
 
+; bitmanip mode attribute
+(define_mode_attr shiftm1 [(SI "const31_operand") (DI "const63_operand")])
+
 ;; ---
 ;; Code Iterators
 ;; ---
@@ -148,11 +151,17 @@ (define_code_iterator any_ge [ge geu])
 (define_code_iterator any_lt [lt ltu])
 (define_code_iterator any_le [le leu])
 
+; bitmanip code iterators
+(define_code_iterator bitmanip_bitwise [and ior])
+
+(define_code_iterator bitmanip_minmax [smin umin smax umax])
+
+(define_code_iterator clz_ctz_pcnt [clz ctz popcount])
+
 ;; ---
 ;; Code Attributes
 ;; ---
 
-
 ;;  expands to an empty string when doing a signed operation and
 ;; "u" when doing an unsigned operation.
 (define_code_attr u [(sign_extend "") (zero_extend "u")
@@ -196,6 +205,22 @@ (define_code_attr insn [(ashift "sll")
(plus "add")
(minus "sub")])
 
+; bitmanip code attributes
+(define_code_attr bitmanip_optab [(smin "smin")
+ (smax "smax")
+ (umin "umin")
+ (umax "umax")
+ (clz "clz")
+ (ctz "ctz")
+ (popcount "popcount")])
+(define_code_attr bitmanip_insn [(smin "min")
+(smax "max")
+(umin "minu")
+(umax "maxu")
+(clz "clz")
+(ctz "ctz")
+(popcount "cpop")])
+
 ;; ---
 ;; Int Iterators.
 ;; ---
-- 
2.27.0



[PATCH 04/10] [RISCV] Add the list of operand modifiers to riscv.md too

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

To make it easier to find operand modifiers while in the md
file, add the list of modifiers to the top of the md file.
This is similar to the i386 target.
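
As an illustration of two of the modifiers in that list, 'S' prints the shift index of a single-bit mask and 'T' does the same for an inverted single-bit mask. The sketch below computes those indices in standalone C; the helper names are hypothetical and this is not the code in riscv_print_operand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for 'S': index of the only set bit in MASK.  */
int
shift_index (uint64_t mask)
{
  /* Exactly one bit must be set.  */
  assert (mask != 0 && (mask & (mask - 1)) == 0);

  int index = 0;
  while ((mask & 1) == 0)
    {
      mask >>= 1;
      index++;
    }
  return index;
}

/* Hypothetical helper for 'T': index of the only clear bit in MASK.  */
int
inverted_shift_index (uint64_t mask)
{
  return shift_index (~mask);
}
```

For example, a mask of 0x10 has shift index 4, and a mask with every bit set except bit 8 has inverted shift index 8.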

OK? Built and tested for riscv32-linux-gnu and riscv64-linux-gnu.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_print_operand): Add a note to
keep the list in riscv.md in sync with this list.
* config/riscv/riscv.md: Add list of modifiers as comments.
---
 gcc/config/riscv/riscv.cc |   4 +-
 gcc/config/riscv/riscv.md | 184 --
 2 files changed, 18 insertions(+), 170 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7c120eaa8e3..189be5e4e6f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3730,7 +3730,9 @@ riscv_memmodel_needs_release_fence (enum memmodel model)
'z' Print x0 if OP is zero, otherwise print OP normally.
'i' Print i if the operand is not a register.
'S' Print shift-index of single-bit mask OP.
-   'T' Print shift-index of inverted single-bit mask OP.  */
+   'T' Print shift-index of inverted single-bit mask OP.
+
+   Note please keep this list and the list in riscv.md in sync.  */
 
 static void
 riscv_print_operand (FILE *file, rtx op, int letter)
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index f4a5ff07fe4..aad2836d179 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -19,6 +19,20 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .
 
+
+;; Keep this list and the one above riscv_print_operand in sync.
+;; The special asm out single letter directives following a '%' are:
+;; h -- Print the high-part relocation associated with OP, after stripping
+;;   any outermost HIGH.
+;; R -- Print the low-part relocation associated with OP.
+;; C -- Print the integer branch condition for comparison OP.
+;; A -- Print the atomic operation suffix for memory model OP.
+;; F -- Print a FENCE if the memory model requires a release.
+;; z -- Print x0 if OP is zero, otherwise print OP normally.
+;; i -- Print i if the operand is not a register.
+;; S -- Print shift-index of single-bit mask OP.
+;; T -- Print shift-index of inverted single-bit mask OP.
+
 (define_c_enum "unspec" [
   ;; Override return address for exception handling.
   UNSPEC_EH_RETURN
@@ -107,6 +121,7 @@ (define_constants
 
 (include "predicates.md")
 (include "constraints.md")
+(include "iterators.md")
 
 ;; 
 ;;
@@ -269,175 +284,6 @@ (define_attr "tune"
 (define_asm_attributes
   [(set_attr "type" "multi")])
 
-;; This mode iterator allows 32-bit and 64-bit GPR patterns to be generated
-;; from the same template.
-(define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
-
-;; This mode iterator allows :P to be used for patterns that operate on
-;; pointer-sized quantities.  Exactly one of the two alternatives will match.
-(define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
-
-;; Likewise, but for XLEN-sized quantities.
-(define_mode_iterator X [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")])
-
-;; Branches operate on XLEN-sized quantities, but for RV64 we accept
-;; QImode values so we can force zero-extension.
-(define_mode_iterator BR [(QI "TARGET_64BIT") SI (DI "TARGET_64BIT")])
-
-;; 32-bit moves for which we provide move patterns.
-(define_mode_iterator MOVE32 [SI])
-
-;; 64-bit modes for which we provide move patterns.
-(define_mode_iterator MOVE64 [DI DF])
-
-;; Iterator for sub-32-bit integer modes.
-(define_mode_iterator SHORT [QI HI])
-
-;; Iterator for HImode constant generation.
-(define_mode_iterator HISI [HI SI])
-
-;; Iterator for QImode extension patterns.
-(define_mode_iterator SUPERQI [HI SI (DI "TARGET_64BIT")])
-
-;; Iterator for hardware integer modes narrower than XLEN.
-(define_mode_iterator SUBX [QI HI (SI "TARGET_64BIT")])
-
-;; Iterator for hardware-supported integer modes.
-(define_mode_iterator ANYI [QI HI SI (DI "TARGET_64BIT")])
-
-;; Iterator for hardware-supported floating-point modes.
-(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
-   (DF "TARGET_DOUBLE_FLOAT")
-   (HF "TARGET_ZFH")])
-
-;; Iterator for floating-point modes that can be loaded into X registers.
-(define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")])
-
-;; This attribute gives the length suffix for a sign- or zero-extension
-;; instruction.
-(define_mode_attr size [(QI "b") (HI "h")])
-
-;; Mode attributes for loads.
-(define_mode_attr load [(QI "lb") (HI "lh") (SI "lw") (DI "ld") (HF "flh") (SF 
"flw") (DF "fld")])
-
-;; Instruction names for integer loads that aren't explicitly sign or zero
-;; extended.  See riscv_output_move and LOAD_EXTEND_OP.
-(define_mode_attr default_load [(QI "lbu") (HI "lhu") (SI "lw") (DI "ld")])
-
-;; Mode attribute for FP loads into integer registers.
-(define_mode_attr softload [(HF "lh") (SF "lw") (DF 

[PATCH 00/10] [RISCV] Fix/improve the RISCV backend

2022-08-18 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

This set of patches fixes a few RISCV issues and does a few
cleanups, including moving all of the iterators to iterators.md like
many newer backends.
It also fixes a few PRs which I filed, including the RISCV32 issue
with ZBS enabled.

Thanks,
Andrew Pinski

Andrew Pinski (10):
  [RISCV] Move iterators from riscv.md to iterators.md
  [RISCV] Move iterators from bitmanip.md to iterators.md
  [RISCV] Move iterators from sync.md to iterators.md
  [RISCV] Add the list of operand modifiers to riscv.md too
  [RISCV] Add %~ to print w if TARGET_64BIT and use it
  [RISCV] Use constraints/predicates instead of checking const_int
directly for shNadd patterns
  [RISCV] Use a constraint for bset_mask and bset_1_mask
  [RISCV] Fix PR 106586: riscv32 vs ZBS
  [RISCV] Add constraints for
not_single_bit_mask_operand/single_bit_mask_operand
  [RISCV] Fix PR 106632 and PR 106588 a few constraints in bitmanip.md

 gcc/config/riscv/bitmanip.md|  56 ++--
 gcc/config/riscv/constraints.md |  28 
 gcc/config/riscv/iterators.md   | 245 
 gcc/config/riscv/predicates.md  |   9 +-
 gcc/config/riscv/riscv.cc   |  35 -
 gcc/config/riscv/riscv.h|   4 +-
 gcc/config/riscv/riscv.md   | 199 +++---
 gcc/config/riscv/sync.md|   4 -
 8 files changed, 352 insertions(+), 228 deletions(-)
 create mode 100644 gcc/config/riscv/iterators.md

-- 
2.27.0



[PATCH] added myself to maintainers

2022-08-18 Thread Ondrej Kubanek via Gcc-patches
---
 ChangeLog   | 4 
 MAINTAINERS | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 2c32b6403f0..a80db84157b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2022-08-18  Ondrej Kubanek  
+
+   * MAINTAINERS: add myself.
+
 2022-07-04  Martin Liska  
 
* MAINTAINERS: fix sorting of names
diff --git a/MAINTAINERS b/MAINTAINERS
index 7d9aab76dd9..1a67bc6ea5c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -486,6 +486,7 @@ Matt Kraai  

 Jan Kratochvil 
 Matthias Kretz 
 Louis Krupp
+Ondrej Kubanek  
 Prathamesh Kulkarni
 Venkataramanan Kumar   
 Doug Kwan  
@@ -717,6 +718,7 @@ Certificate of Origin Version 1.1.  See 
https://gcc.gnu.org/dco.html for more
 information.
 
 
+Ondrej Kubanek  
 Matthias Kretz 
 Tim Lange  
 Jeff Law   
-- 
2.37.2



[PATCH] Improve converting between 128-bit modes that use the same format

2022-08-18 Thread Michael Meissner via Gcc-patches
Improve converting between 128-bit modes that use the same format.

This patch improves the insns used for converting between two modes using
the 128-bit floating point format (i.e. converting between KFmode and TFmode if
-mabi=ieeelongdouble is used, and converting between IFmode and TFmode if
-mabi=ibmlongdouble is used).  The new insns have the correct insn type and
instruction length for the move involved.

Previously, the two different moves were lumped together (i.e. converting
between two IEEE 128-bit modes was matched by the same insns as converting
between two IBM 128-bit modes).

I have tested this patch on the following systems:

1)  LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2)  LE Power10 using --with-cpu=power9  --with-long-double-format=ibm
3)  LE Power10 using --with-cpu=power8  --with-long-double-format=ibm
4)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
5)  LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
6)  BE Power8  using --with-cpu=power8  --with-long-double-format=ibm
7)  BE Power8  using --with-cpu=power5  --with-long-double-format=ibm

There were no regressions in the build or in the tests.  On the power10 with
long double using the IEEE 128-bit format, pr105334.c now runs where it
previously failed.

Can I check this patch into the trunk?

Did we want to backport this to earlier GCC releases?

2022-08-17   Michael Meissner  

gcc/

* config/rs6000/rs6000.md (IFKF): Delete.
(IFKF_reg): Delete.
(extendkfif2): New define_expand.
(trunckfif2): New define_expand.
(extendtf2_internal): Split into extendiftf2_internal and
extendkftf2_internal.  Update the insns to use the correct insn type and
length attributes based on whether the move uses IEEE 128-bit floating
point or IBM 128-bit floating point type.
(extendiftf2_internal): Likewise.
(extendkftf2_internal): Likewise.
(extendtf2_internal): Split into extendtfif2_internal and
extendtfkf2_internal.  Update the insns to use the correct insn type and
length attributes based on whether the move uses IEEE 128-bit floating
point or IBM 128-bit floating point type.
(extendtfif2_internal): Likewise.
(extendtfkf2_internal): Likewise.
---
 gcc/config/rs6000/rs6000.md | 94 +
 1 file changed, 74 insertions(+), 20 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index f942597c3b4..e17252bb8de 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -543,12 +543,6 @@ (define_mode_iterator FMOVE128_GPR [TI
 ; Iterator for 128-bit VSX types for pack/unpack
 (define_mode_iterator FMOVE128_VSX [V1TI KF])
 
-; Iterators for converting to/from TFmode
-(define_mode_iterator IFKF [IF KF])
-
-; Constraints for moving IF/KFmode.
-(define_mode_attr IFKF_reg [(IF "d") (KF "wa")])
-
 ; Whether a floating point move is ok, don't allow SD without hardware FP
 (define_mode_attr fmove_ok [(SF "")
(DF "")
@@ -9075,6 +9069,15 @@ (define_expand "extendifkf2"
   DONE;
 })
 
+(define_expand "extendkfif2"
+  [(set (match_operand:IF 0 "gpc_reg_operand")
+   (float_extend:IF (match_operand:KF 1 "gpc_reg_operand")))]
+  "TARGET_FLOAT128_TYPE"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], false);
+  DONE;
+})
+
 (define_expand "extendtfkf2"
   [(set (match_operand:KF 0 "gpc_reg_operand")
(float_extend:KF (match_operand:TF 1 "gpc_reg_operand")))]
@@ -9111,6 +9114,15 @@ (define_expand "truncifkf2"
   DONE;
 })
 
+(define_expand "trunckfif2"
+  [(set (match_operand:IF 0 "gpc_reg_operand")
+   (float_truncate:IF (match_operand:KF 1 "gpc_reg_operand")))]
+  "TARGET_FLOAT128_TYPE"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], false);
+  DONE;
+})
+
 (define_expand "trunckftf2"
   [(set (match_operand:TF 0 "gpc_reg_operand")
(float_truncate:TF (match_operand:KF 1 "gpc_reg_operand")))]
@@ -9129,31 +9141,73 @@ (define_expand "trunctfif2"
   DONE;
 })
 
-(define_insn_and_split "*extendtf2_internal"
-  [(set (match_operand:TF 0 "gpc_reg_operand" "=")
+;; Convert between KFmode and TFmode when -mabi=ieeelongdouble
+(define_insn_and_split "*extendkftf2_internal"
+  [(set (match_operand:TF 0 "gpc_reg_operand" "=wa,wa")
(float_extend:TF
-(match_operand:IFKF 1 "gpc_reg_operand" "")))]
-   "TARGET_FLOAT128_TYPE
-&& FLOAT128_IBM_P (TFmode) == FLOAT128_IBM_P (mode)"
+(match_operand:KF 1 "gpc_reg_operand" "0,wa")))]
+   "FLOAT128_IEEE_P (TFmode)"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (match_dup 2))]
 {
   operands[2] = gen_rtx_REG (TFmode, REGNO (operands[1]));
-})
+}
+  [(set_attr "type" "vecsimple")])
 
-(define_insn_and_split "*extendtf2_internal"
-  [(set (match_operand:IFKF 0 "gpc_reg_operand" "=")
-   (float_extend:IFKF
-   

[PATCH] Rework 128-bit complex multiply and divide.

2022-08-18 Thread Michael Meissner via Gcc-patches
Rework 128-bit complex multiply and divide.

This patch reworks how the complex multiply and divide built-in functions are
done.  Previously we created built-in declarations for doing long double complex
multiply and divide when long double is IEEE 128-bit.  The old code also did not
support __ibm128 complex multiply and divide if long double is IEEE 128-bit.

In terms of history, I wrote the original code just as I was starting to test
GCC on systems where IEEE 128-bit long double was the default.  At the time, we
had not yet started mangling the built-in function names as a way to bridge
going from a system with 128-bit IBM long double to 128-bit IEEE long double.

The original code depends on there only being two 128-bit types involved.  With
some of the changes that I plan on making, this assumption will no longer be
true in the future.

The problem is we cannot create two separate built-in functions that resolve to
the same name.  This is a requirement of add_builtin_function and the C front
end.  That means for the 3 possible modes (IFmode, KFmode, and TFmode), you can
only use 2 of them.

This code does not create the built-in declaration with the changed name.
Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
before it is written out to the assembler file like it now does for all of the
other long double built-in functions.

We need to disable using this mapping when we are building libgcc, which is
creating the multiply and divide functions.  The flag that is used when libgcc
is built (-fbuilding-libgcc) is only available in the C/C++ front ends.  We need
to remember that we are building libgcc in the rs6000-c.cc support to be able to
use this later to decide whether to mangle the decl assembler name or not.

When I wrote these patches, I discovered that __ibm128 complex multiply and
divide had originally not been supported if long double is IEEE 128-bit as it
would generate calls to __mulic3 and __divic3.  I added tests in the testsuite
to verify that the correct name (i.e. __multc3 and __divtc3) is used in this
case.
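
To make the renaming described above concrete, here is a small standalone sketch of the mapping for the __ibm128 complex functions. The function name and its two flag parameters are hypothetical and exist only for this illustration; the real logic lives in rs6000_mangle_decl_assembler_name and covers more cases than the two names mentioned in this message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the name mapping: when long double is IEEE
   128-bit and we are not building libgcc, calls for __ibm128 complex
   multiply and divide should use the __multc3/__divtc3 names.  */
const char *
map_ibm128_complex_name (const char *name, int long_double_is_ieee128,
                         int building_libgcc)
{
  assert (name != NULL);

  /* libgcc defines these functions itself, so leave names alone there.  */
  if (building_libgcc || !long_double_is_ieee128)
    return name;

  if (strcmp (name, "__mulic3") == 0)
    return "__multc3";
  if (strcmp (name, "__divic3") == 0)
    return "__divtc3";

  return name;
}
```

Doing the rename at assembler-name time, rather than declaring a second built-in with the changed name, avoids having two built-in declarations that resolve to the same name.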

I have tested this patch on the following systems:

1)  LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2)  LE Power10 using --with-cpu=power9  --with-long-double-format=ibm
3)  LE Power10 using --with-cpu=power8  --with-long-double-format=ibm
4)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
5)  LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
6)  BE Power8  using --with-cpu=power8  --with-long-double-format=ibm
7)  BE Power8  using --with-cpu=power5  --with-long-double-format=ibm

There were no regressions in the build or in the tests.

Can I check this patch into the trunk?  Note this patch needs the first patch
in the __ibm128 patches that I posted on Thursday August 18th for the
TARGET_IBM128 declaration.  If those patches are rejected, it would be fairly
simple to change the one use of TARGET_IBM128.

Did we want to backport this to earlier GCC releases?

2022-08-17   Michael Meissner  

gcc/

* config/rs6000/rs6000-c.cc (rs6000_cpu_cpp_builtins): Set
building_libgcc.
* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
(init_float128_ieee): Delete code to switch complex multiply and divide
for long double.
(complex_multiply_builtin_code): New helper function.
(complex_divide_builtin_code): Likewise.
(rs6000_mangle_decl_assembler_name): Add support for mangling the name
of complex 128-bit multiply and divide built-in functions.
* config/rs6000/rs6000.opt (building_libgcc): New target variable.

gcc/testsuite/

* gcc.target/powerpc/divic3-1.c: New test.
* gcc.target/powerpc/divic3-2.c: Likewise.
* gcc.target/powerpc/mulic3-1.c: Likewise.
* gcc.target/powerpc/mulic3-2.c: Likewise.
---
 gcc/config/rs6000/rs6000-c.cc   |   8 ++
 gcc/config/rs6000/rs6000.cc | 110 +++-
 gcc/config/rs6000/rs6000.opt|   4 +
 gcc/testsuite/gcc.target/powerpc/divic3-1.c |  18 
 gcc/testsuite/gcc.target/powerpc/divic3-2.c |  17 +++
 gcc/testsuite/gcc.target/powerpc/mulic3-1.c |  18 
 gcc/testsuite/gcc.target/powerpc/mulic3-2.c |  17 +++
 7 files changed, 145 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-2.c

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 4d051b90658..11de8389fd6 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -780,6 +780,14 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
   || DEFAULT_ABI == ABI_ELFv2
   || (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm))
 builtin_define 

[PATCH 3/3] Add 'w' suffix for __ibm128 constants

2022-08-18 Thread Michael Meissner via Gcc-patches
Add 'w' suffix for __ibm128 constants.

In the documentation, we mention that 'w' or 'W' can be used as a suffix for
__ibm128 constants.  We never implemented this.  This patch fixes that.

In addition, the 'q' and 'Q' suffix were changed to use the mode used for the
__float128 type, instead of knowing whether to use KFmode or TFmode explicitly.
This will be used in a future patch where we change the mode used for __float128
on systems where long double is IEEE 128-bit.

I have tested this patch on the following systems:

1)  LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2)  LE Power10 using --with-cpu=power9  --with-long-double-format=ibm
3)  LE Power10 using --with-cpu=power8  --with-long-double-format=ibm
4)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
5)  LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
6)  BE Power8  using --with-cpu=power8  --with-long-double-format=ibm
7)  BE Power8  using --with-cpu=power5  --with-long-double-format=ibm

There were no regressions in the build or in the tests.

Can I check this patch into the trunk?

Did we want to backport this to earlier GCC releases?

2022-08-17   Michael Meissner  

gcc/

* config/rs6000/rs6000.cc (rs6000_c_mode_for_suffix): Allow 'w' or 'W'
for __ibm128 constants.

gcc/testsuite/

* gcc.target/powerpc/ibm128-suffix.c: New test.
---
 gcc/config/rs6000/rs6000.cc   | 25 ++-
 .../gcc.target/powerpc/ibm128-suffix.c| 13 ++
 2 files changed, 26 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/ibm128-suffix.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index a6ec4c71ac0..046c538c748 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -23845,22 +23845,23 @@ rs6000_floatn_mode (int n, bool extended)
 
 }
 
-/* Target hook for c_mode_for_suffix.  */
+/* Target hook for c_mode_for_suffix.  We use TYPE_MODE to follow the mode used
+   for __float128 and __ibm128.
+
+   Only two suffixes are allowed, 'q' and 'w'.  The 'q' suffix is used for
+   float128 constants on both the x86 and PowerPC processors.  Its use predates
+   the use of 'f128' for _Float128 constants, and existing code still uses it.
+
+   The 'w' suffix was used on the x86 processors for their 80-bit long
+   double.  We use it for __ibm128 constants.  */
 static machine_mode
 rs6000_c_mode_for_suffix (char suffix)
 {
-  if (TARGET_FLOAT128_TYPE)
-{
-  if (suffix == 'q' || suffix == 'Q')
-   return (FLOAT128_IEEE_P (TFmode)) ? TFmode : KFmode;
+  if (TARGET_FLOAT128_TYPE && (suffix == 'q' || suffix == 'Q'))
+return TYPE_MODE (ieee128_float_type_node);
 
-  /* At the moment, we are not defining a suffix for IBM extended double.
-If/when the default for -mabi=ieeelongdouble is changed, and we want
-to support __ibm128 constants in legacy library code, we may need to
-re-evalaute this decision.  Currently, c-lex.cc only supports 'w' and
-'q' as machine dependent suffixes.  The x86_64 port uses 'w' for
-__float80 constants.  */
-}
+  if (TARGET_IBM128 && (suffix == 'w' || suffix == 'W'))
+return TYPE_MODE (ibm128_float_type_node);
 
   return VOIDmode;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/ibm128-suffix.c b/gcc/testsuite/gcc.target/powerpc/ibm128-suffix.c
new file mode 100644
index 000..ff619860409
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/ibm128-suffix.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target longdouble128 } */
+/* { dg-options "-O2" } */
+
+/* See if the 'w' suffix is accepted for __ibm128.  */
+#ifndef NUMBER
+#define NUMBER  123456789012345678901234567890123456789E-10
+#endif
+
+#define GLUE2(X,Y)  X ## Y
+#define GLUE(X,Y)   GLUE2(X,Y)
+
+__ibm128 x = GLUE (NUMBER, w);
-- 
2.37.2


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[PATCH 2/3] Allow __ibm128 with -msoft-float (PR target/105334)

2022-08-18 Thread Michael Meissner via Gcc-patches
Allow __ibm128 with -msoft-float (PR target/105334)

This patch allows __ibm128 to be used on systems with software floating point
enabled.  Previously, we required hardware floating point to be enabled to use
the __ibm128 keyword and the __ibm128 built-in functions.  This patch fixes PR
target/105334.
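
The effect on FLOAT128_IBM_P can be sketched as a pair of predicates.  This is
an illustrative C++ model only: the Mode enum, the Flags struct and both
function names are hypothetical, and the boolean expressions are transcribed
from the macro before and after the change.

```cpp
#include <cassert>

// Hypothetical stand-ins for the machine modes the macro tests.
enum class Mode { TF, TC, IF, IC, KF };

// Hypothetical model of the target flags the macro reads.
struct Flags {
  bool ieeequad;         // models TARGET_IEEEQUAD
  bool long_double_128;  // models TARGET_LONG_DOUBLE_128
  bool hard_float;       // models TARGET_HARD_FLOAT
};

// Old form: IFmode/ICmode count as IBM 128-bit only with hardware FP.
bool ibm_p_old(const Flags &f, Mode m) {
  return (!f.ieeequad && f.long_double_128 && (m == Mode::TF || m == Mode::TC))
         || (f.hard_float && (m == Mode::IF || m == Mode::IC));
}

// New form: IFmode/ICmode are always IBM 128-bit modes.
bool ibm_p_new(const Flags &f, Mode m) {
  return (!f.ieeequad && f.long_double_128 && (m == Mode::TF || m == Mode::TC))
         || (m == Mode::IF || m == Mode::IC);
}
```

With IEEE 128-bit long double and -msoft-float (ieeequad set, hard_float
clear), the old predicate rejects IFmode and the new one accepts it, which is
the case PR target/105334 is about.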

I have tested this patch on the following systems:

1)  LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2)  LE Power10 using --with-cpu=power9  --with-long-double-format=ibm
3)  LE Power10 using --with-cpu=power8  --with-long-double-format=ibm
4)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
5)  LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
6)  BE Power8  using --with-cpu=power8  --with-long-double-format=ibm
7)  BE Power8  using --with-cpu=power5  --with-long-double-format=ibm

There were no regressions in the build or in the tests.  On the power10 with
long double using the IEEE 128-bit format, pr105334.c now runs where it
previously failed.

Can I check this patch into the trunk?

Did we want to backport this to earlier GCC releases?

2022-08-17   Michael Meissner  

gcc/

PR target/105334
* config/rs6000/rs6000.cc (init_float128_ibm): Do not require hardware
floating point for the IBM 128-bit floating point comparison functions.
* config/rs6000/rs6000.h (FLOAT128_IBM_P): Do not require hardware
floating point to enable recognizing IBM 128-bit floating point modes.
---
 gcc/config/rs6000/rs6000.cc | 37 +
 gcc/config/rs6000/rs6000.h  |  2 +-
 2 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 39527ce9bbc..a6ec4c71ac0 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10913,26 +10913,23 @@ init_float128_ibm (machine_mode mode)
   set_optab_libfunc (smul_optab, mode, "__gcc_qmul");
   set_optab_libfunc (sdiv_optab, mode, "__gcc_qdiv");
 
-  if (!TARGET_HARD_FLOAT)
-   {
- set_optab_libfunc (neg_optab, mode, "__gcc_qneg");
- set_optab_libfunc (eq_optab, mode, "__gcc_qeq");
- set_optab_libfunc (ne_optab, mode, "__gcc_qne");
- set_optab_libfunc (gt_optab, mode, "__gcc_qgt");
- set_optab_libfunc (ge_optab, mode, "__gcc_qge");
- set_optab_libfunc (lt_optab, mode, "__gcc_qlt");
- set_optab_libfunc (le_optab, mode, "__gcc_qle");
- set_optab_libfunc (unord_optab, mode, "__gcc_qunord");
-
- set_conv_libfunc (sext_optab, mode, SFmode, "__gcc_stoq");
- set_conv_libfunc (sext_optab, mode, DFmode, "__gcc_dtoq");
- set_conv_libfunc (trunc_optab, SFmode, mode, "__gcc_qtos");
- set_conv_libfunc (trunc_optab, DFmode, mode, "__gcc_qtod");
- set_conv_libfunc (sfix_optab, SImode, mode, "__gcc_qtoi");
- set_conv_libfunc (ufix_optab, SImode, mode, "__gcc_qtou");
- set_conv_libfunc (sfloat_optab, mode, SImode, "__gcc_itoq");
- set_conv_libfunc (ufloat_optab, mode, SImode, "__gcc_utoq");
-   }
+  set_optab_libfunc (neg_optab, mode, "__gcc_qneg");
+  set_optab_libfunc (eq_optab, mode, "__gcc_qeq");
+  set_optab_libfunc (ne_optab, mode, "__gcc_qne");
+  set_optab_libfunc (gt_optab, mode, "__gcc_qgt");
+  set_optab_libfunc (ge_optab, mode, "__gcc_qge");
+  set_optab_libfunc (lt_optab, mode, "__gcc_qlt");
+  set_optab_libfunc (le_optab, mode, "__gcc_qle");
+  set_optab_libfunc (unord_optab, mode, "__gcc_qunord");
+
+  set_conv_libfunc (sext_optab, mode, SFmode, "__gcc_stoq");
+  set_conv_libfunc (sext_optab, mode, DFmode, "__gcc_dtoq");
+  set_conv_libfunc (trunc_optab, SFmode, mode, "__gcc_qtos");
+  set_conv_libfunc (trunc_optab, DFmode, mode, "__gcc_qtod");
+  set_conv_libfunc (sfix_optab, SImode, mode, "__gcc_qtoi");
+  set_conv_libfunc (ufix_optab, SImode, mode, "__gcc_qtou");
+  set_conv_libfunc (sfloat_optab, mode, SImode, "__gcc_itoq");
+  set_conv_libfunc (ufloat_optab, mode, SImode, "__gcc_utoq");
 }
   else
 {
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 813ec696c0d..f58f5f3f355 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -337,7 +337,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 #define FLOAT128_IBM_P(MODE)   \
   ((!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128 \
 && ((MODE) == TFmode || (MODE) == TCmode)) \
-   || (TARGET_HARD_FLOAT && ((MODE) == IFmode || (MODE) == ICmode)))
+   || ((MODE) == IFmode || (MODE) == ICmode))
 
 /* Helper macros to say whether a 128-bit floating point type can go in a
single vector register, or whether it needs paired scalar values.  */
-- 
2.37.2


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: 

[PATCH 1/3] Allow __ibm128 even if IEEE 128-bit floating point is not supported.

2022-08-18 Thread Michael Meissner via Gcc-patches
Allow __ibm128 even if IEEE 128-bit floating point is not supported.

This patch allows the use of the __ibm128 keyword on non-VSX systems.
Originally, the __ibm128 keyword was only enabled when the IEEE 128-bit
floating point is enabled.  Sometime back in the GCC 12 development period,
Segher asked that the __ibm128 keyword be allowed in older systems that don't
support IEEE 128-bit.  But at the time, stage 1 had closed for GCC 12, so I
deferred doing this change until GCC 13.  This patch allows __ibm128 to be used
if either IEEE 128-bit is enabled or long double uses the IBM 128-bit format.
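
A small sketch of the condition change, assuming a simplified C++ model of the
two configuration bits involved.  The Config struct and the function names are
hypothetical; the expressions follow the TARGET_IBM128 macro and the
rs6000_scalar_mode_supported_p hunk in this patch.

```cpp
#include <cassert>

// Hypothetical model of the configuration bits named in the patch.
struct Config {
  bool float128_type;    // models TARGET_FLOAT128_TYPE
  bool long_double_128;  // models TARGET_LONG_DOUBLE_128
};

// New macro: TARGET_IBM128 = TARGET_FLOAT128_TYPE || TARGET_LONG_DOUBLE_128.
bool target_ibm128(const Config &c) {
  return c.float128_type || c.long_double_128;
}

// Before the patch, IFmode was a supported scalar mode only when
// TARGET_FLOAT128_TYPE was set.
bool ifmode_supported_old(const Config &c) { return c.float128_type; }

// After the patch, IFmode support follows TARGET_IBM128.
bool ifmode_supported_new(const Config &c) { return target_ibm128(c); }
```

A system with 128-bit long double but no IEEE 128-bit support (float128_type
clear, long_double_128 set) is the case that changes from unsupported to
supported in this model.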

I have tested these patches on the following systems:

1)  LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2)  LE Power10 using --with-cpu=power9  --with-long-double-format=ibm
3)  LE Power10 using --with-cpu=power8  --with-long-double-format=ibm
4)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
5)  LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
6)  BE Power8  using --with-cpu=power8  --with-long-double-format=ibm
7)  BE Power8  using --with-cpu=power5  --with-long-double-format=ibm

There were no regressions in the build or in the tests.

Can I check this patch into the trunk?

Did we want to backport this to earlier GCC releases?

2022-08-17   Michael Meissner  

gcc/

* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Enable using
the __ibm128 keyword on systems that either use the 128-bit IBM long
double format for long double or support IEEE 128-bit.
* config/rs6000/rs6000.cc (rs6000_init_libfuncs): Create IBM 128-bit
floating point support functions on systems that support the __ibm128
keyword.
(rs6000_scalar_mode_supported_p): Likewise.
* config/rs6000/rs6000.h (TARGET_IBM128): New macro.
* config/rs6000/rs6000.md (@extenddf2_fprs): Allow IFmode to be
converted even if long double is not 128-bits.
(extenddf2_vsx): Likewise.
(extendiftf2): Allow conversion on systems that support the __ibm128
keyword.
(extendtfif2): Likewise.
(trunciftf2): Likewise.
(trunctfif2): Likewise.
---
 gcc/config/rs6000/rs6000-builtin.cc |  2 +-
 gcc/config/rs6000/rs6000.cc | 13 -
 gcc/config/rs6000/rs6000.h  |  6 ++
 gcc/config/rs6000/rs6000.md | 13 ++---
 4 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 12afa86854c..70680890415 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -713,7 +713,7 @@ rs6000_init_builtins (void)
  For IEEE 128-bit floating point, always create the type __ieee128.  If the
  user used -mfloat128, rs6000-c.cc will create a define from __float128 to
  __ieee128.  */
-  if (TARGET_LONG_DOUBLE_128 && (!TARGET_IEEEQUAD || TARGET_FLOAT128_TYPE))
+  if (TARGET_IBM128)
 {
   if (!TARGET_IEEEQUAD)
ibm128_float_type_node = long_double_type_node;
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index df491bee2ea..39527ce9bbc 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -5,10 +5,11 @@ rs6000_init_libfuncs (void)
 {
   /* __float128 support.  */
   if (TARGET_FLOAT128_TYPE)
-{
-  init_float128_ibm (IFmode);
-  init_float128_ieee (KFmode);
-}
+init_float128_ieee (KFmode);
+
+  /* __ibm128 support.  */
+  if (TARGET_IBM128)
+init_float128_ibm (IFmode);
 
   /* AIX/Darwin/64-bit Linux quad floating point routines.  */
   if (TARGET_LONG_DOUBLE_128)
@@ -23752,7 +23753,9 @@ rs6000_scalar_mode_supported_p (scalar_mode mode)
 
   if (DECIMAL_FLOAT_MODE_P (mode))
 return default_decimal_float_supported_p ();
-  else if (TARGET_FLOAT128_TYPE && (mode == KFmode || mode == IFmode))
+  else if (TARGET_FLOAT128_TYPE && mode == KFmode)
+return true;
+  else if (TARGET_IBM128 && mode == IFmode)
 return true;
   else
 return default_scalar_mode_supported_p (mode);
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index ad9bf0f7358..813ec696c0d 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -564,6 +564,12 @@ extern int rs6000_vector_align[];
 && TARGET_P8_VECTOR\
 && TARGET_POWERPC64)
 
+/* Whether the __ibm128 keyword is allowed.  Any system that supports _Float128
+   is assumed to be capable of supporting __ibm128.  Similarly if the long
+   double size is 128 bits, we assume __ibm128 is supported.  We don't want to
+   support it on a system without existing 128-bit long doubles.  */
+#define TARGET_IBM128  (TARGET_FLOAT128_TYPE || TARGET_LONG_DOUBLE_128)
+
 /* Inlining allows targets to define the meanings of bits in target_info
field of ipa_fn_summary by itself, 

[PATCH 0/3] Improvements to __ibm128 on PowerPC

2022-08-18 Thread Michael Meissner via Gcc-patches
The following 3 patches improve __ibm128 on the PowerPC GCC compiler:

The first patch allows the use of the __ibm128 keyword on non-VSX systems.
Originally, the __ibm128 keyword was only enabled when the IEEE 128-bit
floating point is enabled.  Sometime back in the GCC 12 development period,
Segher asked that the __ibm128 keyword be allowed in older systems that don't
support IEEE 128-bit.  This patch allows __ibm128 to be used if either IEEE
128-bit is enabled or long double uses the IBM 128-bit format.

The second patch fixes PR target/105334.  This PR complains that the __ibm128
keyword is not defined on a system that uses IEEE 128-bit long double, but the
user used the -msoft-float option.  This patch removes the checks for hardware
floating point support in IBM 128-bit long double support, and also enables the
__ibm128 keyword.  The existing test gcc.target/powerpc/pr105334.c will now
pass on a system using IEEE 128-bit long double.

The third patch uses the 'w' suffix for __ibm128 constants.  It turns out we
had documented using the 'w' suffix for __ibm128, but we had never implemented
it.

I have tested these patches on the following systems:

1)  LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2)  LE Power10 using --with-cpu=power9  --with-long-double-format=ibm
3)  LE Power10 using --with-cpu=power8  --with-long-double-format=ibm
4)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
5)  LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
6)  BE Power8  using --with-cpu=power8  --with-long-double-format=ibm
7)  BE Power8  using --with-cpu=power5  --with-long-double-format=ibm

There were no regressions in the build or in the tests.  On the power10 with
long double using the IEEE 128-bit format, pr105334.c now runs where it
previously failed.

Can I check these patches into the trunk?

Did we want to backport these changes to older GCC releases?

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[Bug tree-optimization/106679] [13 regression] gcc.dg/tree-prof/cmpsf-1.c fails after r13-2098-g5adfb6540db95d

2022-08-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106679

Andrew Pinski  changed:

   What|Removed |Added

 Target|powerpc64-linux-gnu,|
   |powerpc64le-linux-gnu   |
   Host|powerpc64-linux-gnu,|
   |powerpc64le-linux-gnu   |
   Last reconfirmed||2022-08-18
  Build|powerpc64-linux-gnu,|
   |powerpc64le-linux-gnu   |
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
Looks like it also fails on x86_64-linux-gnu:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599920.html

[Bug tree-optimization/106679] [13 regression] gcc.dg/tree-prof/cmpsf-1.c fails after r13-2098-g5adfb6540db95d

2022-08-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106679

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |13.0

[Bug other/106679] New: [13 regression] gcc.dg/tree-prof/cmpsf-1.c fails after r13-2098-g5adfb6540db95d

2022-08-18 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106679

Bug ID: 106679
   Summary: [13 regression] gcc.dg/tree-prof/cmpsf-1.c fails after
r13-2098-g5adfb6540db95d
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:5adfb6540db95da5faf1f77fbe9ec38b4cf8eb1f, r13-2098-g5adfb6540db95d
make  -k check-gcc RUNTESTFLAGS="tree-prof.exp=gcc.dg/tree-prof/cmpsf-1.c"
FAIL: gcc.dg/tree-prof/cmpsf-1.c scan-tree-dump-not dom2 "Invalid sum"
# of expected passes4
# of unexpected failures1
# of unsupported tests  1

commit 5adfb6540db95da5faf1f77fbe9ec38b4cf8eb1f (HEAD, refs/bisect/bad)
Author: Aldy Hernandez 
Date:   Wed Aug 17 17:47:21 2022 +0200

Reset root oracle from path_oracle::reset_path.

[Bug rtl-optimization/81501] mulitple calls to __tls_get_addr() with -fPIC

2022-08-18 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81501

--- Comment #8 from H.J. Lu  ---
Created attachment 53473
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53473&action=edit
A patch

This patch uses a single UNSPEC_TLS_LD_BASE in the whole function.

Re: [PATCH][DOCS] Document make jobserver related changes for GCC 13.

2022-08-18 Thread Gerald Pfeifer
On Thu, 18 Aug 2022, Martin Liška wrote:
> Ready for master?

Nearly. :)

> +Link-time optimization improvements:
> +
> +LTO supports the newly added GNU make's jobserver that uses named pipes (--jobserver-style=fifo)

I believe that's just "GNU make" (instead of "GNU make's"), or maybe 
"...the newly added jobserver of GNU make...".

Not 100% sure myself.

> +If make's jobserver is active, parallel LTO WPA streaming communicates with it and so the streaming
> +does not lead to the system overcommitting.

Might we be able to simplify this to something like

  "...parallel LTO WPA streaming communicates with it and thus avoids
  system overcommitting" ?


Just two minor points; the patch as such looks fine.

Thank you,
Gerald


[PATCH] rs6000: Delete pr56605.c testcase

2022-08-18 Thread Segher Boessenkool
This testcase has never tested the problem in the PR it is named for
(except perhaps very indirectly), and over the years this argumentation
has become thinner all the time.  Now we are faced with on the one hand
having to accept various forms of code, but on the other hand very
similar code is generated elsewhere in the testcase, and the crux of the
check is that we have to make sure no duplicate happens.

I see no better way forward than to delete this testcase.

Comments?  I'll commit this tomorrow.


Segher


2022-08-18  Segher Boessenkool  

gcc/testsuite/
PR target/102146
* gcc.target/powerpc/pr56605.c: Delete.
---
 gcc/testsuite/gcc.target/powerpc/pr56605.c | 15 ---
 1 file changed, 15 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/pr56605.c

diff --git a/gcc/testsuite/gcc.target/powerpc/pr56605.c b/gcc/testsuite/gcc.target/powerpc/pr56605.c
deleted file mode 100644
index 7695f87db6f6..
--- a/gcc/testsuite/gcc.target/powerpc/pr56605.c
+++ /dev/null
@@ -1,15 +0,0 @@
-/* PR rtl-optimization/56605 */
-/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
-/* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-O3 -mvsx -mdejagnu-cpu=power7 -fno-unroll-loops -fdump-rtl-combine" } */
-
-void foo (short* __restrict sb, int* __restrict ia)
-{
-  int i;
-  for (i = 0; i < 4000; i++)
-ia[i] = (int) sb[i];
-}
-
-/* { dg-final { scan-rtl-dump-times {\(compare:CC \((?:and|zero_extend):(?:[SD]I) \((?:sub)?reg:[SD]I} 1 "combine" } } */
-
-- 
1.8.3.1



[PATCH v2] c++: Implement -Wself-move warning [PR81159]

2022-08-18 Thread Marek Polacek via Gcc-patches
On Mon, Aug 15, 2022 at 03:54:05PM -0400, Jason Merrill wrote:
> On 8/9/22 09:37, Marek Polacek wrote:
> > +  /* We're looking for *std::move ((T &) ), or
> > + *std::move ((T &) (T *) r) if the argument it a reference.  */
> > +  if (!REFERENCE_REF_P (rhs)
> > +  || TREE_CODE (TREE_OPERAND (rhs, 0)) != CALL_EXPR)
> > +return;
> > +  tree fn = TREE_OPERAND (rhs, 0);
> > +  if (!is_std_move_p (fn))
> > +return;
> > +  tree arg = CALL_EXPR_ARG (fn, 0);
> > +  if (TREE_CODE (arg) != NOP_EXPR)
> > +return;
> > +  /* Strip the (T &).  */
> > +  arg = TREE_OPERAND (arg, 0);
> > +  /* Strip the (T *) or &.  */
> > +  arg = TREE_OPERAND (arg, 0);
> 
> Are you sure these are the only two expressions that can make it here? What
> if the argument to move is *Tptr?

Not 100% sure but I couldn't find any other form.  For *Tptr we get
*std::move ((int * &) )
so it works as expected as well.  I've extended the existing test to test this
too.
 
> > @@ -6826,6 +6827,26 @@ of a declaration:
> >   This warning is enabled by @option{-Wall}.
> > +@item -Wno-self-move @r{(C++ and Objective-C++ only)}
> > +@opindex Wself-move
> > +@opindex Wno-self-move
> > +This warning warns when a value is moved to itself with @code{std::move}.
> > +Such a @code{std::move} has no effect.
> 
> ...unless it naively breaks the object, like
> 
> T(T&& ot): data(ot.data) { ot.data = nullptr; } // oops

"If you try to move me I'll disappear!"

I've added the weasel word: "typically has no effect."  Or do we want to say
more?
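
To make the "typically" concrete, here is a minimal sketch of a naive move
assignment that loses its contents on a self-move.  Holder and move_assign are
hypothetical names used only for illustration; the move goes through a helper
taking two references, so the self-move is not spelled out directly at the
call site.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type with a naive move assignment operator.
struct Holder {
  const char *data;
  explicit Holder(const char *d) : data(d) {}

  Holder &operator=(Holder &&other) noexcept {
    data = other.data;
    other.data = nullptr;  // on a self-move this also clears *this
    return *this;
  }
};

// Helper that moves src into dst; dst and src may alias.
void move_assign(Holder &dst, Holder &src) { dst = std::move(src); }
```

Calling move_assign (h, h) leaves h.data null, whereas moving between two
distinct objects transfers the pointer as intended.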

-- >8 --
About 5 years ago we got a request to implement -Wself-move, which
warns about useless moves like this:

  int x;
  x = std::move (x);

This patch implements that warning.

PR c++/81159

gcc/c-family/ChangeLog:

* c.opt (Wself-move): New option.

gcc/cp/ChangeLog:

* typeck.cc (maybe_warn_self_move): New.
(cp_build_modify_expr): Call maybe_warn_self_move.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wself-move.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wself-move1.C: New test.
---
 gcc/c-family/c.opt  |   4 +
 gcc/cp/typeck.cc|  48 ++-
 gcc/doc/invoke.texi |  23 +-
 gcc/testsuite/g++.dg/warn/Wself-move1.C | 105 
 4 files changed, 178 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wself-move1.C

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index dfdebd596ef..f776efd39d8 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1229,6 +1229,10 @@ Wselector
 ObjC ObjC++ Var(warn_selector) Warning
 Warn if a selector has multiple methods.
 
+Wself-move
+C++ ObjC++ Var(warn_self_move) Warning LangEnabledBy(C++ ObjC++, Wall)
+Warn when a value is moved to itself with std::move.
+
 Wsequence-point
 C ObjC C++ ObjC++ Var(warn_sequence_point) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
 Warn about possible violations of sequence point rules.
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 992ebfd99fb..cbc32a7c8ca 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -8897,7 +8897,51 @@ cp_build_c_cast (location_t loc, tree type, tree expr,
 
   return error_mark_node;
 }
-
+
+/* Warn when a value is moved to itself with std::move.  LHS is the target,
+   RHS may be the std::move call, and LOC is the location of the whole
+   assignment.  */
+
+static void
+maybe_warn_self_move (location_t loc, tree lhs, tree rhs)
+{
+  if (!warn_self_move)
+return;
+
+  /* C++98 doesn't know move.  */
+  if (cxx_dialect < cxx11)
+return;
+
+  if (processing_template_decl)
+return;
+
+  /* We're looking for *std::move ((T &) ), or
+ *std::move ((T &) (T *) r) if the argument it a reference.  */
+  if (!REFERENCE_REF_P (rhs)
+  || TREE_CODE (TREE_OPERAND (rhs, 0)) != CALL_EXPR)
+return;
+  tree fn = TREE_OPERAND (rhs, 0);
+  if (!is_std_move_p (fn))
+return;
+  tree arg = CALL_EXPR_ARG (fn, 0);
+  if (TREE_CODE (arg) != NOP_EXPR)
+return;
+  /* Strip the (T &).  */
+  arg = TREE_OPERAND (arg, 0);
+  /* Strip the (T *) or &.  */
+  arg = TREE_OPERAND (arg, 0);
+  arg = convert_from_reference (arg);
+  /* So that we catch (i) = std::move (i);.  */
+  lhs = maybe_undo_parenthesized_ref (lhs);
+  STRIP_ANY_LOCATION_WRAPPER (lhs);
+  if (cp_tree_equal (lhs, arg))
+{
+  auto_diagnostic_group d;
+  if (warning_at (loc, OPT_Wself_move, "moving a variable to itself"))
+   inform (loc, "remove %<std::move%> call");
+}
+}
+
 /* For use from the C common bits.  */
 tree
 build_modify_expr (location_t location,
@@ -9101,6 +9145,8 @@ cp_build_modify_expr (location_t loc, tree lhs, enum tree_code modifycode,
 
   if (modifycode == NOP_EXPR)
{
+ maybe_warn_self_move (loc, lhs, rhs);
+
  if (c_dialect_objc ())
{
  result = objc_maybe_build_modify_expr (lhs, rhs);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f65d351a5fc..5dea3fee124 

[Bug target/106609] [12/13 Regression] sh3eb-elf cross compiler is being miscompiled since r12-1525-g3155d51bfd1de8b6c4645

2022-08-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106609

Andrew Pinski  changed:

   What|Removed |Added

Summary|sh3eb-elf cross compiler is |[12/13 Regression]
   |being miscompiled   |sh3eb-elf cross compiler is
   ||being miscompiled since
   ||r12-1525-g3155d51bfd1de8b6c
   ||4645
   Target Milestone|--- |12.2
  Component|middle-end  |target
   Severity|normal  |blocker
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=46235

[Bug middle-end/106609] sh3eb-elf cross compiler is being miscompiled

2022-08-18 Thread mikpelinux at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106609

--- Comment #9 from Mikael Pettersson  ---
# first bad commit: [3155d51bfd1de8b6c4645dcb2292248a8d7cc3c9] [PATCH] PR
rtl-optimization/46235: Improved use of bt for bit tests on x86_64.

Starting with this commit, the host compiler (on x86_64-linux) miscompiles the
gcc-13 based cross-compiler to sh3eb-elf.

[Bug target/106678] Inefficiency in long integer multiplication

2022-08-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106678

--- Comment #1 from Andrew Pinski  ---
The inner loop for aarch64 on the trunk is:
.L5:
ldr x7, [x20, x5, lsl 3]
ldr x10, [x21, x12, lsl 3]
ldr x6, [x11, x5, lsl 3]
mul x2, x7, x10
umulh   x7, x7, x10
adds x2, x2, x8
cinc x8, x7, cs
adds x2, x2, x6
cset x7, cs
adds x2, x2, x9
add x6, x6, x2
str x6, [x11, x5, lsl 3]
add x5, x5, 1
cinc x9, x7, cs
cmp x19, x5
bne .L5

So I suspect this is still a target issue.

Re: [PATCH] xtensa: Improve indirect sibling call handling

2022-08-18 Thread Max Filippov via Gcc-patches
On Thu, Aug 18, 2022 at 3:06 AM Takayuki 'January June' Suwa
 wrote:
>
> No longer needs the dedicated hard register (A11) for the address of the
> call and the split patterns for fixups, due to the introduction of appropriate
> register class and constraint.
>
> (Note: "ISC_REGS" contains a hard register A8 used as a "static chain"
>  pointer for nested functions, but no problem;  Pointer to nested function
>  actually points to "trampoline", and trampoline itself doesn't receive
>  "static chain" pointer to its parent's stack frame from the caller.)
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.h
> (enum reg_class, REG_CLASS_NAMES, REG_CLASS_CONTENTS):
> Add new register class "ISC_REGS".
> * config/xtensa/constraints.md (c): Add new register constraint.
> * config/xtensa/xtensa.md (define_constants): Remove "A11_REG".
> (sibcall_internal, sibcall_value_internal):
> Change to use the new register constraint, and remove two split
> patterns for fixups that are no longer needed.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/xtensa/sibcalls.c: Add a new test function to ensure
> that registers for arguments (occupy from A2 to A7) and for indirect
> sibcall (should be assigned to A8) neither conflict nor spill out.
> ---
>  gcc/config/xtensa/constraints.md   |  5 
>  gcc/config/xtensa/xtensa.h |  3 +++
>  gcc/config/xtensa/xtensa.md| 29 ++
>  gcc/testsuite/gcc.target/xtensa/sibcalls.c |  5 
>  4 files changed, 15 insertions(+), 27 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

-- 
Thanks.
-- Max


[Bug rtl-optimization/106678] New: Inefficiency in long integer multiplication

2022-08-18 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106678

Bug ID: 106678
   Summary: Inefficiency in long integer multiplication
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

The code from PR 103109

#include <stdint.h>

void Long_multiplication( uint64_t multiplicand[],
  uint64_t multiplier[],
  uint64_t sum[],
  uint64_t ilength, uint64_t jlength )
{
  uint64_t acarry, mcarry, product;

  for( uint64_t i = 0;
   i < (ilength + jlength);
   i++ )
sum[i] = 0;

  acarry = 0;
  for( uint64_t j = 0; j < jlength; j++ )
{
  mcarry = 0;
  for( uint64_t i = 0; i < ilength; i++ )
{
  __uint128_t mcarry_prod;
  __uint128_t acarry_sum;
  mcarry_prod = ((__uint128_t) multiplicand[i]) * ((__uint128_t)
multiplier[j])
+ (__uint128_t) mcarry;
  mcarry = mcarry_prod >> 64;
  product = mcarry_prod;
  acarry_sum = ((__uint128_t) sum[i+j]) + ((__uint128_t) acarry) +
product;
  sum[i+j] += acarry_sum;
  acarry = acarry_sum >> 64;
  //  {mcarry, product} = multiplicand[i]*multiplier[j]
  //+ mcarry;
  //  {acarry,sum[i+j]} = {sum[i+j]+acarry} + product;

}
}
}

still shows some inefficiency after r13-2107.

Compiling the function with gcc 13.0.0 20220818, with

$ gcc  -mcpu=power9 -O3 -c loop.c

and disassembling the output (for easier reading) gives (looking only
at the main part)

  7c:   00 00 80 38 li  r4,0
  80:   00 00 80 3b li  r28,0
  84:   00 00 60 38 li  r3,0
  88:   00 00 00 38 li  r0,0
  8c:   ff ff c0 38 li  r6,-1
  90:   00 00 e0 38 li  r7,0
  94:   20 00 c1 fa std r22,32(r1)
  98:   28 00 e1 fa std r23,40(r1)
  9c:   60 00 c1 fb std r30,96(r1)
  a0:   68 00 e1 fb std r31,104(r1)
  a4:   00 00 00 60 nop
  a8:   00 00 00 60 nop
  ac:   00 00 42 60 ori r2,r2,0
  b0:   a6 03 49 7f mtctr   r26
  b4:   78 c3 0c 7f mr  r12,r24
  b8:   14 22 b9 7c add r5,r25,r4
  bc:   00 00 00 39 li  r8,0
  c0:   09 00 6c e9 ldu r11,8(r12)
  c4:   2a 20 5d 7d ldx r10,r29,r4
  c8:   09 00 25 e9 ldu r9,8(r5)
  cc:   33 52 cb 13 maddld  r30,r11,r10,r8
  d0:   31 52 eb 13 maddhdu r31,r11,r10,r8
  d4:   38 30 d6 7f and r22,r30,r6
  d8:   38 38 f7 7f and r23,r31,r7
  dc:   78 fb e8 7f mr  r8,r31
  e0:   14 48 56 7d addcr10,r22,r9
  e4:   14 01 77 7d adder11,r23,r0
  e8:   14 18 4a 7d addcr10,r10,r3
  ec:   14 52 29 7d add r9,r9,r10
  f0:   94 01 6b 7c addze   r3,r11
  f4:   00 00 25 f9 std r9,0(r5)
  f8:   c8 ff 00 42 bdnzc0 
  fc:   01 00 9c 3b addir28,r28,1
 100:   08 00 84 38 addir4,r4,8
 104:   40 e0 3b 7c cmpld   r27,r28
 108:   a8 ff 82 40 bne b0 

In these two nested loops, r6 is not changed, so it is always -1.

  d4:   38 30 d6 7f and r22,r30,r6

just assigns r30 to r22, so r30 could have been used instead of
r22.

Similarly,

  d8:   38 38 f7 7f and r23,r31,r7

just sets r23 to zero because r7 is always zero.
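
The two observations are the usual AND identities, shown here as a tiny
sketch.  The function names are hypothetical and the code does not model the
PowerPC instructions themselves, only the arithmetic they perform when the
second operand is the constant -1 or 0.

```cpp
#include <cassert>
#include <cstdint>

// AND with an all-ones mask (r6 == -1) returns the input unchanged.
constexpr std::uint64_t and_all_ones(std::uint64_t x) {
  return x & ~std::uint64_t{0};
}

// AND with zero (r7 == 0) always yields zero.
constexpr std::uint64_t and_zero(std::uint64_t x) {
  return x & std::uint64_t{0};
}

static_assert(and_all_ones(0x123456789abcdef0ULL) == 0x123456789abcdef0ULL,
              "x & -1 is x");
static_assert(and_zero(0x123456789abcdef0ULL) == 0, "x & 0 is 0");
```

So in this loop the first `and` is a plain register copy and the second a
constant zero.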

[Bug fortran/106557] nesting intrinsics ibset and transfer gives wrong result

2022-08-18 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106557

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 CC||anlauf at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-08-18

--- Comment #1 from anlauf at gcc dot gnu.org ---
Confirmed.

The underlying issue also affects ibclr, and possibly other intrinsics that
are simplified at compile time.

Reduced testcase:

print *,   ibset (transfer (0, 0), 0)   ! bad
print *,   ibset (transfer (0, 0) + 0, 0)   ! OK
print *,   ibset (transfer (0, 0), 0) + 0   ! OK
print *, transfer (ibset (transfer (0, 0), 0), 0) == 1  ! bad
print *, transfer (ibclr (transfer (1, 0), 0), 0) == 0  ! bad
end

The problem arises from the dual representation of the result of the
simplification of TRANSFER in gfc_simplify_transfer(), once as
result->value.integer, and as result->representation.string.
The latter is used e.g. for ensuring safe round-trip for nested
TRANSFER(TRANSFER(...),...).

It happens in gfc_simplify_ibset that value.integer is updated, but not
representation.string, and under the given circumstances either one or the
other is used later.
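
The failure mode can be sketched with a toy value that carries two
representations.  Everything below is a hypothetical C++ model, not gfortran
code: Constant stands in for the simplified expression, `bytes` for
representation.string (stored here as decimal text for simplicity), and
read_back for a later consumer that prefers the cached representation.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical model of a constant with a dual representation.
struct Constant {
  long value;                        // models result->value.integer
  std::optional<std::string> bytes;  // models result->representation.string
};

// Models TRANSFER simplification: both representations are filled in.
Constant transfer_like(long v) { return Constant{v, std::to_string(v)}; }

// Updates only the integer value and leaves the cached bytes stale.
Constant set_bit_stale(Constant c, int pos) {
  c.value |= 1L << pos;
  return c;
}

// Updates the integer value and drops the cached bytes.
Constant set_bit_fixed(Constant c, int pos) {
  c.value |= 1L << pos;
  c.bytes.reset();
  return c;
}

// A consumer that prefers the cached representation when it exists.
long read_back(const Constant &c) {
  return c.bytes ? std::stol(*c.bytes) : c.value;
}
```

In this model read_back sees 0 after set_bit_stale and 1 after set_bit_fixed
for an input of transfer_like (0).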

Possible solutions:

1) conservative hack:

diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc
index fb725994653..6c5ffcdaf86 100644
--- a/gcc/fortran/simplify.cc
+++ b/gcc/fortran/simplify.cc
@@ -3380,6 +3380,11 @@ gfc_simplify_ibclr (gfc_expr *x, gfc_expr *y)
   k = gfc_validate_kind (x->ts.type, x->ts.kind, false);

   result = gfc_copy_expr (x);
+  if (result->representation.string)
+{
+  free (result->representation.string);
+  result->representation.string = NULL;
+}

   convert_mpz_to_unsigned (result->value.integer,
   gfc_integer_kinds[k].bit_size);

and a similar change to gfc_simplify_ibset.  This fixes the issue just for
those two intrinsics.  This may miss other cases, although I could not find
them so far.

2) more aggressive hack:

diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc
index fb725994653..3e895e12f8c 100644
--- a/gcc/fortran/simplify.cc
+++ b/gcc/fortran/simplify.cc
@@ -8157,6 +8158,16 @@ gfc_simplify_transfer (gfc_expr *source, gfc_expr *mold, gfc_expr *size)
   /* And read the buffer back into the new expression.  */
   gfc_target_interpret_expr (buffer, buffer_size, result, false);

+  /* Integer is capable to hold all bits needed for complete round-trip
+with TRANSFER-in-TRANSFER.  Drop separate memory representation so that
+subsequent simplification of bit manipulation intrinsics of the result
+of gfc_simplify_transfer does not lead to inconsistencies.  */
+  if (result->ts.type == BT_INTEGER)
+{
+  free (result->representation.string);
+  result->representation.string = NULL;
+}
+
   return result;
 }


Both patches regtest cleanly on x86_64-pc-linux-gnu.

[Bug tree-optimization/106677] Abstraction overhead with std::views::join

2022-08-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106677

--- Comment #1 from Andrew Pinski  ---
(In reply to Marc Glisse from comment #0)
> sum_vec gets relatively nice, short code. sum_array gets something uglier.
> 
>   _18 = _5(D)->m_array;
>   _6 = foo_5(D) + 24;
>   if (_6 != _18)

That is related to PR 89317.

[Bug tree-optimization/106677] Abstraction overhead with std::views::join

2022-08-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106677

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug fortran/77652] Invalid rank error in ASSOCIATED when rank is remapped

2022-08-18 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77652

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|WAITING |RESOLVED

--- Comment #9 from anlauf at gcc dot gnu.org ---
As the inquiry to SC22WG5 resulted in a comment that different ranks should
not be allowed, I've reverted the commit as r13-2118:

commit ca170ed9f8a086ca7e1eec841882b6bed9ec1a3a
Author: Harald Anlauf 
Date:   Thu Aug 18 21:24:29 2022 +0200

Revert "Fortran: fix invalid rank error in ASSOCIATED when rank is remapped
[PR77652]"

This reverts commit 0110cfd5449bae3a772f45ea2e4c5dab5b7a8ccd.


See https://gcc.gnu.org/pipermail/fortran/2022-August/058049.html
for details.

I will therefore close this PR as invalid (at least for <= Fortran 2018).

[Bug tree-optimization/106677] New: Abstraction overhead with std::views::join

2022-08-18 Thread glisse at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106677

Bug ID: 106677
   Summary: Abstraction overhead with std::views::join
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---

(from https://stackoverflow.com/q/73407636/1918193 )

#include 
#include 
#include 

struct Foo {
auto join() const { return m_array | std::views::join; }
auto direct() const { return std::views::all(m_array[0]); }
std::array, 1> m_array;
};
__attribute__((noinline)) int sum_array(const Foo& foo)
{
int result = 0;
for (int* val : foo.join())
result += *val;
return result;
}
__attribute__((noinline)) int sum_vec(const Foo& foo)
{
int result = 0;
for (int* val : foo.direct())
result += *val;
return result;
}

I am using a snapshot from 20220719 with -std=gnu++2b -O3 and looking at
.optimized dumps.

sum_vec gets relatively nice, short code. sum_array gets something uglier.

  _18 = _5(D)->m_array;
  _6 = foo_5(D) + 24;
  if (_6 != _18)

Err, x != x+24 should be folded to false? Let's add

  if(foo.m_array.begin()==foo.m_array.end())__builtin_unreachable();

to move forward.

  _16 = MEM[(int * const * const &)foo_4(D)];
  _17 = MEM[(int * const * const &)foo_4(D) + 8];
  if (_16 != _17)
goto ; [5.50%]
  else
goto ; [94.50%]

why are we guessing that the vector is probably empty? Let's look at more code

   [local count: 853673669]:
  _10 = [(const struct array *)foo_4(D)]._M_elems;
  _7 = foo_4(D) + 24;
  _16 = MEM[(int * const * const &)foo_4(D)];
  _17 = MEM[(int * const * const &)foo_4(D) + 8];
  if (_16 != _17)
goto ; [5.50%]
  else
goto ; [94.50%]

   [local count: 806721618]:
  _18 = foo_4(D) + 24;

   [local count: 96636762]:
  # SR.89_28 = PHI <_10(2), _18(3)>
  # SR.90_41 = PHI <_16(2), 0B(3)>
  goto ; [100.00%]

   [local count: 923031551]:
  # result_2 = PHI <0(4), result_12(8)>
  # SR.89_13 = PHI 
  # SR.90_30 = PHI 
  if (_7 == SR.89_13)
goto ; [30.00%]
  else
goto ; [70.00%]

   [local count: 276909463]:
  if (SR.90_30 == 0B)
goto ; [16.34%]
  else
goto ; [83.66%]

   [local count: 96636764]:
  # result_31 = PHI 
  return result_31;

(why not _18 = _7 towards the beginning?)
It would be nice if threading could isolate the case of an empty vector: 2 -> 3
-> 4 -> 9 -> 10 -> 11: just return 0, and the rest of the code may become
easier to optimize.

Let me add

  if(foo.m_array[0].begin()==foo.m_array[0].end())__builtin_unreachable();

to avoid the empty vector case as well. This looks better, at least the inner
loop looks normal, but we are still iterating on the elements of m_array, when
we should be able to tell that it has exactly 1 element.

Re: [PATCH, v2] Fortran: fix invalid rank error in ASSOCIATED when rank is remapped [PR77652]

2022-08-18 Thread Harald Anlauf via Gcc-patches

Hi Mikael, all,

I've just reverted commit 0110cfd5449bae3a772f45ea2e4c5dab5b7a8ccd.
As it seems that commit ca170ed9f8a086ca7e1eec841882b6bed9ec1a3a did
not update bugzilla, I'll add a note to the PR and close it as invalid.

Thanks,
Harald


On 04.08.22 at 14:03, Mikael Morin wrote:

On 30/07/2022 at 12:03, Mikael Morin wrote:

On 28/07/2022 at 22:19, Mikael Morin wrote:

I propose to prepare something tomorrow.



Here you go.


I posted the message the other day.
The mailing list archives are not automatic, so there is no link to the
message (yet?), nor to the thread that follows it.
So I attach below the answer from Malcolm Cohen.
Long story short, he confirms the interpretation from Steve Lionel, and
that the text in the standard needs fixing.
I’m afraid we’ll have to revert.


 Forwarded message 
Subject: [SC22WG5.6416] RE: [ukfortran] Request for interpretation of
compile-time restrictions on ASSOCIATED
Date: Thu, 4 Aug 2022 11:43:16 +0900
From: Malcolm Cohen 
To: 'Mikael Morin' , sc22...@open-std.org
Cc: 'Harald Anlauf' 

Dear Mikael,

Thank you for your query.

I would agree with Steve Lionel that the ranks must be the same (when
POINTER is not assumed-rank), for two reasons.

(1) The result of ASSOCIATED is unambiguously .FALSE. when the shapes of
POINTER and TARGET differ. As the shapes cannot be the same when the ranks
differ seeing as how the number of elements in the shape are not the same,
that means it would always be .FALSE. when the ranks differ. The Fortran
language does not need an extra way to produce the LOGICAL constant .FALSE.

Note that this means that even in the case where POINTER is dimension (2,1)
and TARGET is dimension (1,2), and they both refer to the same elements in
array element order, ASSOCIATED will return .FALSE. because the shapes are
not the same. ASSOCIATED is a much stronger test than mere address
comparison.

(2) This text arises from an attempted, but failed, simplification of what
we had before. Unfortunately, it is completely and utterly broken, as it
forbids the use of ASSOCIATED when POINTER is assumed-rank, has INTENT(IN),
is PROTECTED (outside of its module), or is a pointer function reference.
That is because there are no pointer assignment statements where the pointer
object is permitted to be any of those, and thus the conditions for TARGET
cannot ever be satisfied.
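
Point (1) above can be sketched in Python as follows.  This is a deliberately
simplified model, not the standard's definition: ArrayRef, same_address and
associated are hypothetical names, and strides, zero-size arrays and
disassociated pointers are ignored.  It only shows why comparing shapes makes
the test stronger than an address comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayRef:
    """Hypothetical descriptor: where the data starts and its shape."""
    base_address: int
    shape: tuple


def same_address(pointer, target):
    """Mere address comparison."""
    return pointer.base_address == target.base_address


def associated(pointer, target):
    """Simplified ASSOCIATED(POINTER, TARGET): same storage and same shape."""
    return same_address(pointer, target) and pointer.shape == target.shape


p = ArrayRef(base_address=0x1000, shape=(2, 1))
t = ArrayRef(base_address=0x1000, shape=(1, 2))

print(same_address(p, t))  # both refer to the same elements
print(associated(p, t))    # shapes (2,1) and (1,2) differ
```

Since shapes of different length can never compare equal, a call with
different ranks would always yield .FALSE. in this model.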

However, the processor is not *required* to report an error when the ranks
differ, as this is not a "Constraint" in the standard. I would expect a high
quality implementation to do so, but maybe I just have high expectations...

It could also be a deliberate extension, with different semantics provided
by the processor. In that case, the processor would be required to have the
capability to report the use of the extension (but this need not be the
default).

Finally, I note that we are not accepting interpretation requests on Fortran
2018 at this time, as we are in the process of replacing it with a new
revision (Fortran 2023). However, we will certainly consider whether we can
make any correction to Fortran 2023 before publication (expected next year);
if there is consensus on how to fix the clearly-incorrect requirements on
TARGET, we can do so. Otherwise, we will need to wait until after Fortran
2023 is published before we can restart the Defect Processing process.

I will undertake to write a meeting paper addressing this issue before this
year's October meeting. If no paper has appeared by mid-September, please
feel free to remind me to do that!

Cheers,




[Bug libstdc++/106676] New: [C++20] Automatic iterator_category detection misbehaves when `::reference` is an rvalue reference, refuses to accept a forward iterator

2022-08-18 Thread iamsupermouse at mail dot ru via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106676

Bug ID: 106676
   Summary: [C++20] Automatic iterator_category detection
misbehaves when `::reference` is an rvalue reference,
refuses to accept a forward iterator
   Product: gcc
   Version: 12.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iamsupermouse at mail dot ru
  Target Milestone: ---

Since C++20, `std::iterator_traits` can auto-detect `::iterator_category`, if
the iterator doesn't set it.

Forward iterators require `::reference` (aka the return type of `operator*`) to
be a reference (any reference), but libstdc++ only accepts lvalue references
here.

Example:

#include 
#include 

template 
struct A
{
using value_type = std::remove_cvref_t;
using difference_type = int;
T operator*() const;
A &operator++();
A operator++(int);
bool operator==(const A &) const;
};

// Ok.
static_assert(std::is_same_v>::iterator_category, std::forward_iterator_tag>);
// Should pass but fails, the category is `std::input_iterator_tag`.
static_assert(std::is_same_v>::iterator_category, std::forward_iterator_tag>);

I blame this on cppreference, which used to incorrectly say that only lvalue
references are allowed there. It was fixed since then.

See https://eel.is/c++draft/iterators#forward.iterators-1.3. Also see
https://cplusplus.github.io/LWG/issue1211 (from 2009), which was resolved by
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3066.html (in 2010).

This was discussed on SO: https://stackoverflow.com/q/73353152

libc++ and MSVC's standard library share the exact same bug.

This was tested on GCC 12.1 and on trunk (13.0.0).

[Bug c++/106675] [10/11/12/13 Regression] g++ crashes on funky operators

2022-08-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106675

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86521

--- Comment #2 from Andrew Pinski  ---
I suspect the patch for PR 86521, which aimed to fix the rejection, was
not fully correct and introduced an ICE in some cases ...

[Bug c++/106675] [10/11/12/13 Regression] g++ crashes on funky operators

2022-08-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106675

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2022-08-18
   Summary|g++ crashes on funky operators |[10/11/12/13 Regression] g++ crashes on funky operators
   Known to fail||12.1.0, 8.4.0, 8.5.0, 9.1.0
   Target Milestone|--- |10.5
   Keywords||ice-on-valid-code
   Status|UNCONFIRMED |NEW
   Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
(In reply to Isabella from comment #0)
> g++ from v11 onward crashes on it: https://godbolt.org/z/nYqo1zj31

That is because the language default changed to gnu++17 from gnu++14.

Confirmed. It might be considered a regression even though it was rejected
before GCC 8.4.0.

The ICE is in joust_maybe_elide_copy.

[Bug c++/106675] New: g++ crashes on funky operators

2022-08-18 Thread izaberina at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106675

Bug ID: 106675
   Summary: g++ crashes on funky operators
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: izaberina at gmail dot com
  Target Milestone: ---

Created attachment 53472
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53472=edit
short repro

Attached is a small reproducer of something that originally came from
boost::system::error_code.

g++ from v11 onward crashes on it: https://godbolt.org/z/nYqo1zj31

Re: [PATCH v3] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-18 Thread Segher Boessenkool
Hi!

On Thu, Aug 18, 2022 at 10:12:48AM +0800, Kewen.Lin wrote:
> As PR99888 and its related show, the current support for
> -fpatchable-function-entry on powerpc ELFv2 doesn't work
> well with global entry existence.

> +  /* Emit the NOPs after local entry.  */

Please do not say "NOPs".  It is not an acronym.  I know some of our
documentation has this bug already, but please do not spread it further.

The machine instruction is "nop", lowercase.

Please fix this.

So, this patch overloads the meaning of the two parameters here to have
more meaning than explained in the documentation for the option.  There
isn't much that can be done about this, so adding some new option would
only be extra work for everyone.  But, could you add a line or two in
the documentation?  "For PowerPC with the ELFv2 ABI, there will be M
nops before the local entry point, and N-M after", something like that?
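
The split suggested in the sentence above can be written down as a tiny
Python sketch.  The function split_nops and its result keys are hypothetical
names used only for illustration; N and M are the two arguments of
-fpatchable-function-entry=N,M, and the placement follows the wording
proposed above, not the final documentation.

```python
def split_nops(n, m=0):
    """Return how many nops go before and after the local entry point."""
    if m < 0 or m > n:
        raise ValueError("M must satisfy 0 <= M <= N")
    return {"before_local_entry": m, "after_local_entry": n - m}


layout = split_nops(5, 2)
print(layout)
```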


Segher


[Bug libstdc++/86164] std::regex crashes when matching long lines

2022-08-18 Thread maarten at hekkelman dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Maarten L. Hekkelman  changed:

   What|Removed |Added

 CC||maarten at hekkelman dot com

--- Comment #13 from Maarten L. Hekkelman  ---
Too bad this bug has still not been dealt with. And it is even worse that
simply running out of stack space seems to be acceptable. And no, I'm not using
inputs in the form of 27kB, more like just a few hundred characters at most
with quite complex expressions.

Fortunately, it is now very easy to use the boost::regex as a standalone
library as a replacement. But alas, that's still a dependency.

[Bug c++/106648] [C++23] P2071 - Named universal character escapes

2022-08-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106648

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Created attachment 53471
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53471=edit
makeuname2c.cc

I've so far written a generator of a space optimized radix tree for the Unicode
name to codepoint mapping (this would be libcpp/makeuname2c.cc),
but will need to write a consumer of those arrays to actually implement the
Unicode name to codepoint mapping.

[PATCH] Remove path_range_query constructor that takes an edge.

2022-08-18 Thread Aldy Hernandez via Gcc-patches
The path_range_query constructor that takes an edge is really a
convenience function for the loop-ch pass.  It feels wrong to pollute
the API with such a specialized function that could be done with
a small inline function closer to its user.

As an added benefit, we remove one use of reset_path.  The last
remaining one is the forward threader one.

OK?

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::path_range_query):
Remove constructor that takes edge.
* gimple-range-path.h (class path_range_query): Same.
* tree-ssa-loop-ch.cc (edge_range_query): New.
(entry_loop_condition_is_static): Call edge_range_query.
---
 gcc/gimple-range-path.cc | 15 ---
 gcc/gimple-range-path.h  |  1 -
 gcc/tree-ssa-loop-ch.cc  | 17 +++--
 3 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index ba7c2ed9b47..bc2879c0c57 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -59,21 +59,6 @@ path_range_query::path_range_query (gimple_ranger , bool resolve)
   m_oracle = new path_oracle (m_ranger.oracle ());
 }
 
-path_range_query::path_range_query (gimple_ranger ,
-   edge e,
-   bool resolve)
-  : m_cache (new ssa_global_cache),
-m_has_cache_entry (BITMAP_ALLOC (NULL)),
-m_ranger (ranger),
-m_resolve (resolve)
-{
-  m_oracle = new path_oracle (m_ranger.oracle ());
-  auto_vec bbs (2);
-  bbs.quick_push (e->dest);
-  bbs.quick_push (e->src);
-  reset_path (bbs, NULL);
-}
-
 path_range_query::~path_range_query ()
 {
   delete m_oracle;
diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h
index 483fde0d431..9f2d6d92dab 100644
--- a/gcc/gimple-range-path.h
+++ b/gcc/gimple-range-path.h
@@ -37,7 +37,6 @@ public:
const bitmap_head *dependencies = NULL,
bool resolve = true);
   path_range_query (gimple_ranger , bool resolve = true);
-  path_range_query (gimple_ranger , edge e, bool resolve = true);
   virtual ~path_range_query ();
   void reset_path (const vec &, const bitmap_head *dependencies);
   bool range_of_expr (vrange , tree name, gimple * = NULL) override;
diff --git a/gcc/tree-ssa-loop-ch.cc b/gcc/tree-ssa-loop-ch.cc
index 96816b89287..9c316887d5b 100644
--- a/gcc/tree-ssa-loop-ch.cc
+++ b/gcc/tree-ssa-loop-ch.cc
@@ -45,6 +45,20 @@ along with GCC; see the file COPYING3.  If not see
increases effectiveness of code motion optimizations, and reduces the need
for loop preconditioning.  */
 
+/* Given a path through edge E, whose last statement is COND, return
+   the range of the solved conditional in R.  */
+
+static void
+edge_range_query (irange , edge e, gcond *cond, gimple_ranger )
+{
+  auto_vec path (2);
+  path.safe_push (e->dest);
+  path.safe_push (e->src);
+  path_range_query query (ranger, path);
+  if (!query.range_of_stmt (r, cond))
+r.set_varying (boolean_type_node);
+}
+
 /* Return true if the condition on the first iteration of the loop can
be statically determined.  */
 
@@ -72,8 +86,7 @@ entry_loop_condition_is_static (class loop *l, gimple_ranger *ranger)
 desired_static_value = boolean_true_node;
 
   int_range<2> r;
-  path_range_query query (*ranger, e);
-  query.range_of_stmt (r, last);
+  edge_range_query (r, e, last, *ranger);
   return r == int_range<2> (desired_static_value, desired_static_value);
 }
 
-- 
2.37.1



Re: [PING][PATCH] RISC-V: Standardize formatting of SFB ALU conditional move

2022-08-18 Thread Maciej W. Rozycki
On Thu, 18 Aug 2022, Kito Cheng wrote:

> OK, thanks for tweaking this!

 Committed now, thanks for your review!

 Would you mind sharing your opinion on my previous observation here:
?

 I have since realised we have a `-mbranch-cost=' option letting the user 
set the threshold for choosing branches over alternative code sequences, 
so my concern is valid even for our tree unchanged and without the change 
just committed here applied.  Consequently the test case may fail.

 E.g. with:

RUNTESTFLAGS="--target_board remote-unmatched/-mbranch-cost=1 riscv.exp=pr105314.c"

I get:

PASS: gcc.target/riscv/pr105314.c   -O0  (test for excess errors)
FAIL: gcc.target/riscv/pr105314.c   -O0   scan-assembler-not \tbeq\t
PASS: gcc.target/riscv/pr105314.c   -O1  (test for excess errors)
FAIL: gcc.target/riscv/pr105314.c   -O1   scan-assembler-not \tbeq\t
[...]

=== gcc Summary ===

# of expected passes            9
# of unexpected failures        9

because GCC legitimately chooses to emit branches as less costly in this 
case.

 I think we need to pacify the test case somehow if it does not match the 
criteria for PR105314, either by excluding the case from testing in that 
situation or by forcing it via command-line options to make it match the 
criteria (or indeed by verifying a branch is produced regardless).  Sadly 
Jakub chose not to chime in and it's not clear to me which approach would 
be the most appropriate here.

  Maciej


[Bug tree-optimization/106457] array_at_struct_end_p returns TRUE for a two-dimension array which is not inside any structure

2022-08-18 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106457

qinzhao at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|REOPENED|RESOLVED

--- Comment #10 from qinzhao at gcc dot gnu.org ---
After discussing with Richard, we agreed that the several bugs mentioned in
comments #7, #8 and #9 are actually not bugs.
We just need to add more comments to array_at_struct_end_p and rename it
later.
Closing this bug as fixed.

[PATCH] contrib: Fix a typo in contrib/git-fetch-vendor.sh

2022-08-18 Thread Andrea Corallo via Gcc-patches
Hi all,

I just committed this as obvious to fix a typo.

Best

  Andrea

/contrib/ChangeLog:

* git-fetch-vendor.sh : Fix typo.

diff --git a/contrib/git-fetch-vendor.sh b/contrib/git-fetch-vendor.sh
index 15303629b5c..bbd52fb2055 100755
--- a/contrib/git-fetch-vendor.sh
+++ b/contrib/git-fetch-vendor.sh
@@ -14,7 +14,7 @@ enable_push=no
 upstream=`git config --get "gcc-config.upstream"`
 if [ x"$upstream" = x ]
 then
-echo "Config gcc-config.upstream not set, run contrib/gcc-git-customization"
+echo "Config gcc-config.upstream not set, run contrib/gcc-git-customization.sh"
 exit 1
 fi
 


Re: [PATCH V2] Add warning options -W[no-]compare-distinct-pointer-types

2022-08-18 Thread Joseph Myers
On Thu, 18 Aug 2022, Jose E. Marchesi via Gcc-patches wrote:

> diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
> index de8780a1502..04af02add37 100644
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -12397,7 +12397,8 @@ build_binary_op (location_t location, enum tree_code code,
>   }
> else
>   /* Avoid warning about the volatile ObjC EH puts on decls.  */
> - if (!objc_ok)
> + if (!objc_ok
> +&& warn_compare_distinct_pointer_types)
> pedwarn (location, 0,
>  "comparison of distinct pointer types lacks a cast");
>  
> @@ -12517,8 +12518,9 @@ build_binary_op (location_t location, enum tree_code code,
> int qual = ENCODE_QUAL_ADDR_SPACE (as_common);
> result_type = build_pointer_type
> (build_qualified_type (void_type_node, qual));
> -   pedwarn (location, 0,
> -"comparison of distinct pointer types lacks a cast");
> +  if (warn_compare_distinct_pointer_types)
> +pedwarn (location, 0,
> + "comparison of distinct pointer types lacks a cast");

I think this should use OPT_Wcompare_distinct_pointer_types in place of 0, 
and then you shouldn't need to check warn_compare_distinct_pointer_types 
(as well as the diagnostic then automatically telling the user what option 
controls it).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [GCC][PATCH v2] arm: Add support for Arm Cortex-M85 CPU.

2022-08-18 Thread Richard Earnshaw via Gcc-patches




On 12/08/2022 18:20, Srinath Parvathaneni via Gcc-patches wrote:

Hi,

This patch adds the -mcpu support for the Arm Cortex-M85 CPU which is an
Armv8.1-M Mainline CPU supporting MVE and PACBTI by default.

The -mcpu=cortex-m85 switch by default maps to
-march=armv8.1-m.main+pacbti+mve.fp+fp.dp.

The following options are also provided to disable default features.
+nomve.fp (disables MVE Floating point)
+nomve (disables MVE Integer and MVE Floating point)
+nodsp (disables dsp, MVE Integer and MVE Floating point)
+nopacbti (disables pacbti)
+nofp (disables floating point and MVE floating point)

Regression tested on arm-none-eabi and bootstrapped on arm-none-linux-gnueabihf.

Ok for master?

Regards,
Srinath.

gcc/ChangeLog:

2022-08-12  Srinath Parvathaneni  

 * config/arm/arm-cpus.in (cortex-m85): Define new CPU.
 * config/arm/arm-tables.opt: Regenerate.
 * config/arm/arm-tune.md: Likewise.
 * doc/invoke.texi (Arm Options): Document -mcpu=cortex-m85.
 * (-mfix-cmse-cve-2021-35465): Likewise.

gcc/testsuite/ChangeLog:

2022-08-12  Srinath Parvathaneni  

 * gcc.target/arm/multilib.exp: Add tests for cortex-m85.


OK, but in future, please don't send patches as octet-stream 
attachments; they should be plain text.


R.


[Bug analyzer/106181] [13 Regression] ICE in capacity_compatible_with_type, at analyzer/region-model.cc:2909

2022-08-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106181

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Tim Lange :

https://gcc.gnu.org/g:c83e97317efb87fd5639a9ee9ec55aa1caa5423e

commit r13-2115-gc83e97317efb87fd5639a9ee9ec55aa1caa5423e
Author: Tim Lange 
Date:   Thu Aug 18 11:36:08 2022 +0200

analyzer: warn on the use of floating-points operands in the size argument
[PR106181]

This patch fixes the ICE reported in PR106181 and adds a new warning to
the analyzer complaining about the use of floating-point operands.

Regrtested on Linux x86_64.

2022-08-17  Tim Lange  

gcc/analyzer/ChangeLog:

PR analyzer/106181
* analyzer.opt: Add Wanalyzer-imprecise-floating-point-arithmetic.
* region-model.cc (is_any_cast_p): Formatting.
(region_model::check_region_size): Ensure precondition.
(class imprecise_floating_point_arithmetic): New abstract
diagnostic class for all floating-point related warnings.
(class float_as_size_arg): Concrete diagnostic class to complain
about floating-point operands inside the size argument.
(class contains_floating_point_visitor):
New visitor to find floating-point operands inside svalues.
(region_model::check_dynamic_size_for_floats): New function.
(region_model::set_dynamic_extents):
Call to check_dynamic_size_for_floats.
* region-model.h (class region_model):
Add region_model::check_dynamic_size_for_floats.

gcc/ChangeLog:

PR analyzer/106181
* doc/invoke.texi: Add Wanalyzer-imprecise-fp-arithmetic.

gcc/testsuite/ChangeLog:

PR analyzer/106181
* gcc.dg/analyzer/allocation-size-1.c: New test.
* gcc.dg/analyzer/imprecise-floating-point-1.c: New test.
* gcc.dg/analyzer/pr106181.c: New test.

[Bug tree-optimization/80635] [10 regression] std::optional and bogus -Wmaybe-uninitialized warning

2022-08-18 Thread alec.edgington at quantinuum dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80635

--- Comment #68 from Alec Edgington  ---
This (or at least a very similar) bug still exists in gcc 11.2.0.

Re: [PING][PATCH] RISC-V: Standardize formatting of SFB ALU conditional move

2022-08-18 Thread Kito Cheng via Gcc-patches
OK, thanks for tweaking this!

On Thu, Aug 18, 2022 at 10:40 PM Maciej W. Rozycki  wrote:
>
> On Tue, 26 Jul 2022, Maciej W. Rozycki wrote:
>
> > Standardize the formatting of SFB ALU conditional move operations from:
> >
> >   beq a2,zero,1f; mv a0,zero; 1: # movcc
> >
> > to:
> >
> >   beq a2,zero,1f  # movcc
> >   mv  a0,zero
> > 1:
> >
> > for consistency with other assembly code produced.  No functional change.
>
>  Ping for:
> 
>
>   Maciej


Re: [PATCH] Fix bogus -Wstringop-overflow warning in Ada

2022-08-18 Thread Eric Botcazou via Gcc-patches
> Hmm :/  But that means we _should_ force a sign extension but only
> from ptrofftype_p ()?  That is, your test above should maybe read
> 
>signop sgn = TYPE_SIGN (type);
>if (ptrofftype_p (type))
>  sgn = SIGNED;
> 
> assuming 'type' is the type of lowbnd

Yes, that's essentially equivalent to what get_offset_range does, but I'm not 
sure why having two slightly different ways of doing it would be better than a 
single one here.  Maybe replace the call to get_precision in both places with 
TYPE_PRECISION (type) then?
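
As a side note, the effect of forcing SIGNED for an offset type can be
sketched in Python.  This is only an illustration under assumptions: the
function interpret is a hypothetical helper, not a GCC API, and a 64-bit
precision is assumed for the example.  It shows how the same bit pattern
reads as a huge positive bound when unsigned and as -1 when sign-extended.

```python
def interpret(bits, precision, signed):
    """Read the low PRECISION bits of BITS as a signed or unsigned integer."""
    mask = (1 << precision) - 1
    bits &= mask
    if signed and bits >> (precision - 1):
        return bits - (1 << precision)  # sign bit set: value is negative
    return bits


all_ones = 0xFFFFFFFFFFFFFFFF
as_unsigned = interpret(all_ones, 64, signed=False)
as_signed = interpret(all_ones, 64, signed=True)
print(as_unsigned, as_signed)
```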

-- 
Eric Botcazou





[PING][PATCH] RISC-V: Standardize formatting of SFB ALU conditional move

2022-08-18 Thread Maciej W. Rozycki
On Tue, 26 Jul 2022, Maciej W. Rozycki wrote:

> Standardize the formatting of SFB ALU conditional move operations from:
> 
>   beq a2,zero,1f; mv a0,zero; 1: # movcc
> 
> to:
> 
>   beq a2,zero,1f  # movcc
>   mv  a0,zero
> 1:
> 
> for consistency with other assembly code produced.  No functional change.

 Ping for: 


  Maciej


[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412

2022-08-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791

Segher Boessenkool  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #29 from Segher Boessenkool  ---
Okay, closing then.  Thanks!

[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412

2022-08-18 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791

--- Comment #28 from Arseny Solokha  ---
Yes, I think so.

[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412

2022-08-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791

--- Comment #27 from Segher Boessenkool  ---
So this particular bug is no longer there, and this PR can be closed?

[PATCH] Improve uninit analysis

2022-08-18 Thread Richard Biener via Gcc-patches
The following reduces the number of false positives in uninit analysis
by providing a fallback for situations where the current analysis gives up
and thus warns because it cannot prove initialization.

The first situation is when compute_control_dep_chain gives up walking
because it runs into either param_uninit_control_dep_attempts or
MAX_CHAIN_LEN.  If in the process it did not collect a single path
from function entry to the interesting PHI edge then we'll give up
and diagnose.  The following patch instead provides a sparse path
including only those predicates that always hold when the PHI edge
is reached in that case.  That's cheap to produce but may in some
odd cases prove less precise than what the code tries now (enumerating
all possible paths from function entry to the PHI edge, but only
use the first N of those and only require unreachability of those N).

The second situation is when the set of predicates computed to hold
on the use stmt was formed from multiple paths (there's a similar
enumeration of all paths and their predicates from the PHI def to the
use).  In that case use_preds.use_cannot_happen gives up because
it doesn't know which of the predicates from which path from PHI to
the use it can use to prove unreachability of the PHI edge that has
the uninitialized def.  The patch for this case simply computes
the intersection of the predicates and uses that for further analysis,
but in a crude way since the predicate vectors are not sorted.
Fortunately the total size is limited - we have max MAX_NUM_CHAINS
number of predicates each of length MAX_CHAIN_LEN so the brute
force intersection code should behave quite reasonably in practice.
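
The brute-force intersection described above can be sketched in Python.
This is an illustration only, not the patch itself: intersect_chains is a
hypothetical name, predicates are modelled as plain tuples compared with ==
(standing in for pred_equal_p), and unlike the unordered removal in the patch
this version keeps the order of the first chain.

```python
def intersect_chains(chains):
    """Return the predicates that occur in every chain (unsorted input)."""
    if not chains:
        return []
    common = list(chains[0])
    for chain in chains[1:]:
        # Quadratic scan: keep a predicate only if this chain also has it.
        common = [pred for pred in common if any(pred == other for other in chain)]
    return common


chains = [
    [("a", "!=", 0), ("b", "<", 5)],
    [("b", "<", 5), ("c", "==", 1)],
    [("d", ">", 2), ("b", "<", 5)],
]
common = intersect_chains(chains)
print(common)
```

An empty result corresponds to the case where the analysis still has to
give up, because no predicate is known to hold on every path.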

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

The testcase that made me produce this is tree-ssa-math-opts.cc
compiled with the backward jump threading limit increased by
a factor of two, so I don't have any testcase, not even one
with some --param adjustment (because that also affects DOM).

* gimple-predicate-analysis.cc (predicate::use_cannot_happen):
If the use is guarded with multiple predicate paths compute
the predicates intersection before going forward.  When
compute_control_dep_chain wasn't able to come up with at
least one path from function entry to the PHI edge compute
a conservative sparse path instead.
---
 gcc/gimple-predicate-analysis.cc | 64 
 1 file changed, 56 insertions(+), 8 deletions(-)

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index b8221d3fb8d..820a9bde28a 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -40,6 +40,7 @@
 #include "builtins.h"
 #include "calls.h"
 #include "value-query.h"
+#include "cfganal.h"
 
 #include "gimple-predicate-analysis.h"
 
@@ -1224,13 +1225,36 @@ predicate::use_cannot_happen (gphi *phi, unsigned opnds)
 
   /* PHI_USE_GUARDS are OR'ed together.  If we have more than one
  possible guard, there's no way of knowing which guard was true.
- Since we need to be absolutely sure that the uninitialized
- operands will be invalidated, bail.  */
+ In that case compute the intersection of all use predicates
+ and use that.  */
   const pred_chain_union _use_guards = m_preds;
+  const pred_chain *use_guard = _use_guards[0];
+  pred_chain phi_use_guard_intersection = vNULL;
   if (phi_use_guards.length () != 1)
-return false;
-
-  const pred_chain _guard = phi_use_guards[0];
+{
+  phi_use_guard_intersection = use_guard->copy ();
+  for (unsigned i = 1; i < phi_use_guards.length (); ++i)
+   {
+ for (unsigned j = 0; j < phi_use_guard_intersection.length ();)
+   {
+ unsigned k;
+ for (k = 0; k < phi_use_guards[i].length (); ++k)
+   if (pred_equal_p (phi_use_guards[i][k],
+ phi_use_guard_intersection[j]))
+ break;
+ if (k == phi_use_guards[i].length ())
+   phi_use_guard_intersection.unordered_remove (j);
+ else
+   j++;
+   }
+   }
+  if (phi_use_guard_intersection.is_empty ())
+   {
+ phi_use_guard_intersection.release ();
+ return false;
+   }
+  use_guard = &phi_use_guard_intersection;
+}
 
   /* Look for the control dependencies of all the interesting operands
  and build guard predicates describing them.  */
@@ -1250,7 +1274,27 @@ predicate::use_cannot_happen (gphi *phi, unsigned opnds)
   if (!compute_control_dep_chain (ENTRY_BLOCK_PTR_FOR_FN (cfun),
  e->src, dep_chains, &num_chains,
  cur_chain, &num_calls))
-   return false;
+   {
+ gcc_assert (num_chains == 0);
+ /* If compute_control_dep_chain bailed out due to limits
+build a partial sparse path using dominators.  Collect
+only edges whose 

[Bug tree-optimization/106617] [13 Regression] gcc is very slow at ternary expressions since r13-322-g7f04b0d786e13ff5

2022-08-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #14 from Richard Biener  ---
Should be fixed now.

[Bug middle-end/106642] cc1 compiler hangs when cross-compiling ring_buffer.c (from kernel/events) on Aarch64

2022-08-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106642
Bug 106642 depends on bug 106617, which changed state.

Bug 106617 Summary: [13 Regression] gcc is very slow at ternary expressions 
since r13-322-g7f04b0d786e13ff5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

[Bug tree-optimization/106617] [13 Regression] gcc is very slow at ternary expressions since r13-322-g7f04b0d786e13ff5

2022-08-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:ac68f904fe31baf80fa53218f1d8ee033bd8c79b

commit r13-2113-gac68f904fe31baf80fa53218f1d8ee033bd8c79b
Author: Richard Biener 
Date:   Thu Aug 18 11:10:30 2022 +0200

middle-end/106617 - fix fold_binary_op_with_conditional_arg pattern issue

Now that we have parts of fold_binary_op_with_conditional_arg duplicated
in match.pd and are using ! to take or throw away the result we have to
be careful to not have both implementations play games with each other,
causing quadratic behavior.  In particular the match.pd implementation
requires both arms to simplify while the fold-const.cc is happy with
just one arm simplifying (something we cannot express in match.pd).

The fix is to simply not enable the match.pd pattern for GENERIC.

PR middle-end/106617
* match.pd ((a ? b : c) > d -> a ? (b > d) : (c > d)): Fix
guard, disable on GENERIC to not cause quadratic behavior
with the fold-const.cc implementation and the use of !

* gcc.dg/pr106617.c: New testcase.

[PATCH] middle-end/106617 - fix fold_binary_op_with_conditional_arg pattern issue

2022-08-18 Thread Richard Biener via Gcc-patches
Now that we have parts of fold_binary_op_with_conditional_arg duplicated
in match.pd and are using ! to take or throw away the result we have to
be careful to not have both implementations play games with each other,
causing quadratic behavior.  In particular the match.pd implementation
requires both arms to simplify while the fold-const.cc is happy with
just one arm simplifying (something we cannot express in match.pd).

The fix is to simply not enable the match.pd pattern for GENERIC.
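
For readers unfamiliar with the pattern, the rewrite in question moves a comparison into both arms of a conditional. A minimal source-level sketch follows; the function names are made up for illustration and this is not how match.pd expresses the pattern.

```c
#include <assert.h>

/* Original form: compare the result of a conditional expression.  */
static int
cmp_of_cond (int a, int b, int c, int d)
{
  return (a ? b : c) > d;
}

/* Rewritten form: the comparison is distributed into both arms.  */
static int
cond_of_cmp (int a, int b, int c, int d)
{
  return a ? (b > d) : (c > d);
}

/* With constant arms both comparisons simplify, which is the case
   the rewrite is meant for: (a ? 3 : 7) > 5 is the same as a ? 0 : 1.  */
static int
constant_arms (int a)
{
  return (a ? 3 : 7) > 5;
}

/* Check that both forms agree on a small range of inputs.  */
static void
check_forms_agree (void)
{
  for (int a = 0; a <= 1; ++a)
    for (int b = -2; b <= 2; ++b)
      for (int c = -2; c <= 2; ++c)
        for (int d = -2; d <= 2; ++d)
          assert (cmp_of_cond (a, b, c, d) == cond_of_cmp (a, b, c, d));
}
```

Note that the rewritten form duplicates the comparison, so it only pays off when the arms simplify. In the new testcase the conditionals are deeply nested, which is presumably where repeated attempts by two implementations become expensive.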

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR middle-end/106617
* match.pd ((a ? b : c) > d -> a ? (b > d) : (c > d)): Fix
guard, disable on GENERIC to not cause quadratic behavior
with the fold-const.cc implementation and the use of !

* gcc.dg/pr106617.c: New testcase.
---
 gcc/match.pd|  4 +++-
 gcc/testsuite/gcc.dg/pr106617.c | 36 +
 2 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr106617.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 07d0a61fc3a..1bb936fc401 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5796,6 +5796,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (cmp (bit_and@2 @0 integer_pow2p@1) @1)
   (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))
 
+#if GIMPLE
 /* From fold_binary_op_with_conditional_arg handle the case of
rewriting (a ? b : c) > d to a ? (b > d) : (c > d) when the
compares simplify.  */
@@ -5805,8 +5806,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   /* Do not move possibly trapping operations into the conditional as this
  pessimizes code and causes gimplification issues when applied late.  */
   (if (!FLOAT_TYPE_P (TREE_TYPE (@3))
-   || operation_could_trap_p (cmp, true, false, @3))
+   || !operation_could_trap_p (cmp, true, false, @3))
(cond @0 (cmp! @1 @3) (cmp! @2 @3)
+#endif
 
 (for cmp (ge lt)
 /* x < 0 ? ~y : y into (x >> (prec-1)) ^ y. */
diff --git a/gcc/testsuite/gcc.dg/pr106617.c b/gcc/testsuite/gcc.dg/pr106617.c
new file mode 100644
index 000..4274b55f80d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr106617.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+
+int nr_cpu_ids;
+void fc_setup_exch_mgr() {
+  (((1UL << (((0, 0)
+? ((1)
+   ? (((nr_cpu_ids)) ? 0
+  : ((nr_cpu_ids)) & (21) ? 21
+  : ((nr_cpu_ids)) ? 20
+  : ((nr_cpu_ids)) & (19) ? 19
+  : ((nr_cpu_ids)) ? 18
+  : ((nr_cpu_ids)) & (17) ? 17
+  : ((nr_cpu_ids)) ? 16
+  : ((nr_cpu_ids)) & (15) ? 15
+  : ((nr_cpu_ids)) ? 14
+  : ((nr_cpu_ids)) & (13) ? 13
+  : ((nr_cpu_ids)) ? 12
+  : ((nr_cpu_ids)) & (11) ? 11
+  : ((nr_cpu_ids)) ? 10
+  : ((nr_cpu_ids)) & (9)  ? 9
+  : ((nr_cpu_ids))  ? 8
+  : ((nr_cpu_ids)) & (7)  ? 7
+  : ((nr_cpu_ids))  ? 6
+  : ((nr_cpu_ids)) & (5)  ? 5
+  : ((nr_cpu_ids))  ? 4
+  : ((nr_cpu_ids)) & (3)
+  ? 3
+  : ((nr_cpu_ids)-1) & 1)
+   : 1)
+: 0) +
+   1) &
+ (1UL << 2)
+ ? 2
+ : 1))
+   );
+}
-- 
2.35.3


Re: [PATCH] Support threading of just the exit edge

2022-08-18 Thread Andrew MacLeod via Gcc-patches



On 8/18/22 03:08, Richard Biener wrote:



The caveat is that it is only a partial solution... it will only work for
names on that stmt.  if you have anything more complex, ie

if (a == 0 || b == 0)  we have a sequence feeding the ctrl stmt..

c_1 = a == 0
c_2 = b == 0
c_3 = c_1 && c_2
if (c_3 == 0)

only the evaluation of c_3 will have the ctrl stmt as its context.. the others
will be evaluated on their own statement, and thus neither a nor b would pick
up anything from the block as they are evaluated and cached as they are
seen.  unless of course we are doing a walk :-P

Hmm, but as I traced it when I do range_of_expr the context stmt I provide
will be passed down and even when processing dependences that context
will stick?  But maybe it doesn't because it would mean explosion of the
cache?

But yeah, with the above restriction it would be even more useless.
Same issue as with

   *p = 0;
   if (..)
    /  \
  ..    \
       if (p)

here the local adjustment of 'p' in if (p) would not pick up the
p != 0 guarantee from the immediate dominator.


it certainly should. the earlier BB will have the ~[0, 0] property in 
the on-exit structure, so when the range of 'p' is evaluated on the edge 
to the next block, it will be adjusted. the value for on-entry of P to 
that block will therefore be ~[0, 0].   Ranger does this, and the path 
query code is *supposed* to.. late discussions with Aldy yesterday left 
me unclear if it always does.  it should.  that was the entire point of 
leaving the on-demand filling of the structure via immediate uses.
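
At the source level the situation under discussion looks roughly like the sketch below. The function and parameter names are hypothetical; whether a given pass actually folds the second test depends on the range information available at that point, which is exactly what is being debated here.

```c
#include <assert.h>
#include <stddef.h>

/* The store through P means P cannot be null afterwards (dereferencing
   a null pointer would be undefined), so the later test of P is true
   on every path that reaches it.  */
static int
store_then_test (int *p, int flag, int *side)
{
  assert (side != NULL);
  *p = 0;            /* implies p != 0 on exit from this block */
  if (flag)
    *side += 1;      /* unrelated work on one arm of the branch */
  if (p)             /* p is ~[0, 0] on entry to this block */
    return 1;
  return 0;
}
```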





  In the meantime,
it should be possible to take a ranger that just completed a VRP pass, and use
that as the root ranger for a threading pass immediately after.. I think there
may be some lingering issues with abnormal edges if we "re-visit" blocks which
we claim to have walked due to the way I store inferred ranges in those
blocks.. (the expectation being we never go back up into the block, so the
on-entry cache works like the "current def" vector in the original EVRP).  I'd
have to think about that too.

Of course that would essentially do a VRP pass before each threading
which I think is a bit expensive.  Also looking at how ranger works
with all its abstraction having the "old" EVRP style body walk rather
than abusing the on-demand ranger for such a walk would be a lot more
efficient for this purpose :/


No, I just meant when we do the VRP pass, rather than throw away the 
fully primed ranger and its values, one could invoke the threader using 
it...  But I'm not sure how much extra we'd get anyway.




Meanwhile I'm leaning towards calling this a phase ordering issue
of threading + VRP, but that also means we shouldn't deliberately
try to preserve "threadings" of this kind - in fact we might want
to explicitly reject them?

we are probably going to want to visit some pass ordering.

Sure, though there's never going to be a good enough pass ordering :/

Richard.




Re: [PATCH V2] place `const volatile' objects in read-only sections

2022-08-18 Thread Jose E. Marchesi via Gcc-patches


ping

> [Changes from V1:
> - Added a test.]
>
> It is common for C BPF programs to use variables that are implicitly
> set by the BPF loader and run-time.  It is also necessary for these
> variables to be stored in read-only storage so the BPF verifier
> recognizes them as such.  This leads to declarations using both
> `const' and `volatile' qualifiers, like this:
>
>   const volatile unsigned char is_allow_list = 0;
>
> Where `volatile' is used to prevent the compiler from optimizing out the
> variable, or turning it into a constant, and `const' to make sure it is
> placed in .rodata.
>
> Now, it happens that:
>
> - GCC places `const volatile' objects in the .data section, under the
>   assumption that `volatile' somehow voids the `const'.
>
> - LLVM places `const volatile' objects in .rodata, under the
>   assumption that `volatile' is orthogonal to `const'.
>
> So there is a divergence that has practical consequences: BPF programs
> compiled with GCC do not work properly.
>
> When looking into this, I found this bugzilla:
>
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521
>   "change semantics of const volatile variables"
>
> which was filed back in 2005, long ago.  This report was already
> asking to put `const volatile' objects in .rodata, questioning the
> current behavior.
>
> While discussing this in the #gcc IRC channel I was pointed to the
> following excerpt from the C18 spec:
>
>    6.7.3 Type qualifiers / 5 The properties associated with qualified
>          types are meaningful only for expressions that are
>          lvalues [note 135]
>
>    135) The implementation may place a const object that is not
>         volatile in a read-only region of storage. Moreover, the
>         implementation need not allocate storage for such an object if
>         its address is never used.
>
> This footnote may be interpreted as if const objects that are volatile
> shouldn't be put in read-only storage.  Even if I personally was not
> very convinced of that interpretation (see my earlier comment in BZ
> 25521) I filed the following issue in the LLVM tracker in order to
> discuss the matter:
>
>   https://github.com/llvm/llvm-project/issues/56468
>
> As you can see, Aaron Ballman, one of the LLVM hackers, asked the WG14
> reflectors about this.  He reported that the reflectors don't think
> footnote 135 has any normative value.
>
> So, not having a normative mandate in either direction, there are two
> options:
>
> a) To change GCC to place `const volatile' objects in .rodata instead
>of .data.
>
> b) To change LLVM to place `const volatile' objects in .data instead
>of .rodata.
>
> Considering that:
>
> - One target (bpf-unknown-none) breaks with the current GCC behavior.
>
> - No target/platform relies on the GCC behavior, that we know of.
>
> - Changing the LLVM behavior at this point would be very severely
>   traumatic for the BPF people and their users.
>
> I think the right thing to do at this point is a).
> Therefore this patch.
>
> Regtested in x86_64-linux-gnu and bpf-unknown-none.
> No regressions observed.
>
> gcc/ChangeLog:
>
>   PR middle-end/25521
>   * varasm.cc (categorize_decl_for_section): Place `const volatile'
>   objects in read-only sections.
>   (default_select_section): Likewise.
>
> gcc/testsuite/ChangeLog:
>
>   PR middle-end/25521
>   * lib/target-supports.exp (check_effective_target_elf): Define.
>   * gcc.dg/pr25521.c: New test.
> ---
>  gcc/testsuite/gcc.dg/pr25521.c| 10 ++
>  gcc/testsuite/lib/target-supports.exp | 10 ++
>  gcc/varasm.cc |  3 ---
>  3 files changed, 20 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr25521.c
>
> diff --git a/gcc/testsuite/gcc.dg/pr25521.c b/gcc/testsuite/gcc.dg/pr25521.c
> new file mode 100644
> index 000..74fe2ae6626
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr25521.c
> @@ -0,0 +1,10 @@
> +/* PR middle-end/25521 - place `const volatile' objects in read-only
> +   sections.
> +
> +   { dg-require-effective-target elf }
> +   { dg-do compile } */
> +
> +const volatile int foo = 30;
> +
> +
> +/* { dg-final { scan-assembler "\\.rodata" } } */
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 04a2a8e8659..c663d59264b 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -483,6 +483,16 @@ proc check_effective_target_alias { } {
>  }
>  }
>  
> +# Returns 1 if the target uses the ELF object format, 0 otherwise.
> +
> +proc check_effective_target_elf { } {
> +if { [gcc_target_object_format] == "elf" } {
> + return 1;
> +} else {
> + return 0;
> +}
> +}
> +
>  # Returns 1 if the target toolchain supports ifunc, 0 otherwise.
>  
>  proc check_ifunc_available { } {
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 4db8506b106..7864db11faf 100644
> --- a/gcc/varasm.cc
> +++ 

[PATCH V2] Add warning options -W[no-]compare-distinct-pointer-types

2022-08-18 Thread Jose E. Marchesi via Gcc-patches


Hi Joseph.

> On Fri, 5 Aug 2022, Jose E. Marchesi via Gcc-patches wrote:
>
>> +Wcompare-distinct-pointer-types
>> +C C++ Var(warn_compare_distinct_pointer_types) Warning Init(1)
>> +Warn if pointers of distinct types are compared without a cast.
>
> There's no implementation for C++ in this patch, so the option shouldn't 
> be supported for C++ in c.opt.  However, C options are normally supported 
> for Objective-C; unless you have a specific reason why Objective-C support 
> for this option would be a bad idea, "C ObjC" would be appropriate for the 
> languages.

Thanks for the review!
See a V2 of the patch with the suggested change below.



GCC emits pedwarns unconditionally when comparing pointers of
different types, for example:

  int xdp_context (struct xdp_md *xdp)
  {
    void *data = (void *)(long)xdp->data;
    __u32 *metadata = (void *)(long)xdp->data_meta;
    __u32 ret;

    if (metadata + 1 > data)
      return 0;
    return 1;
  }

  /home/jemarch/foo.c: In function ‘xdp_context’:
  /home/jemarch/foo.c:15:20: warning: comparison of distinct pointer types lacks a cast
     15 |   if (metadata + 1 > data)
        |                    ^

LLVM supports an option -W[no-]compare-distinct-pointer-types that can
be used in order to enable or disable the emission of such warnings.
It is enabled by default.

This patch adds the same options to GCC.

Documentation and testsuite updates included.
Regtested in x86_64-linux-gnu.
No regressions observed.

gcc/ChangeLog:

PR c/106537
* doc/invoke.texi (Option Summary): Mention
-Wcompare-distinct-pointer-types under `Warning Options'.
(Warning Options): Document -Wcompare-distinct-pointer-types.

gcc/c-family/ChangeLog:

PR c/106537
* c.opt (Wcompare-distinct-pointer-types): New option.

gcc/c/ChangeLog:

PR c/106537
* c-typeck.cc (build_binary_op): Warning on comparing distinct
pointer types only when -Wcompare-distinct-pointer-types.

gcc/testsuite/ChangeLog:

PR c/106537
* gcc.c-torture/compile/pr106537-1.c: New test.
* gcc.c-torture/compile/pr106537-2.c: Likewise.
* gcc.c-torture/compile/pr106537-3.c: Likewise.
---
 gcc/c-family/c.opt|  4 
 gcc/c/c-typeck.cc |  8 ---
 gcc/doc/invoke.texi   |  6 +
 .../gcc.c-torture/compile/pr106537-1.c| 23 +++
 .../gcc.c-torture/compile/pr106537-2.c| 21 +
 .../gcc.c-torture/compile/pr106537-3.c| 21 +
 6 files changed, 80 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr106537-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr106537-2.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr106537-3.c

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index dfdebd596ef..c401c06ec0b 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1844,6 +1844,10 @@ Winvalid-imported-macros
 C++ ObjC++ Var(warn_imported_macros) Warning
 Warn about macros that have conflicting header units definitions.
 
+Wcompare-distinct-pointer-types
+C ObjC Var(warn_compare_distinct_pointer_types) Warning Init(1)
+Warn if pointers of distinct types are compared without a cast.
+
 flang-info-include-translate
 C++ Var(note_include_translate_yes)
 Note #include directives translated to import declarations.
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index de8780a1502..04af02add37 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -12397,7 +12397,8 @@ build_binary_op (location_t location, enum tree_code 
code,
}
  else
/* Avoid warning about the volatile ObjC EH puts on decls.  */
-   if (!objc_ok)
+   if (!objc_ok
+&& warn_compare_distinct_pointer_types)
  pedwarn (location, 0,
   "comparison of distinct pointer types lacks a cast");
 
@@ -12517,8 +12518,9 @@ build_binary_op (location_t location, enum tree_code 
code,
  int qual = ENCODE_QUAL_ADDR_SPACE (as_common);
  result_type = build_pointer_type
  (build_qualified_type (void_type_node, qual));
- pedwarn (location, 0,
-  "comparison of distinct pointer types lacks a cast");
+  if (warn_compare_distinct_pointer_types)
+pedwarn (location, 0,
+ "comparison of distinct pointer types lacks a cast");
}
}
   else if (code0 == POINTER_TYPE && null_pointer_constant_p (orig_op1))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1ac81ad0bb4..88b4af14d8c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -341,6 +341,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wcast-align  -Wcast-align=strict  -Wcast-function-type  -Wcast-qual  @gol
 

Re: [PATCH v1] LoongArch: Add support code model extreme.

2022-08-18 Thread Lulu Cheng



On 2022/8/18 at 8:52 PM, Xi Ruoyao wrote:

On Thu, 2022-08-18 at 19:49 +0800, Lulu Cheng wrote:

I think we can ignore the effect of -fplt if code model is extreme,
instead of forcing everyone to explicitly add -fno-plt.  The "large"
code model of x86_64 also does not limit the address range and it always
avoids PLT (even if someone adds "-fplt" explicitly).

 Do you mean that if cmodel=extreme,
   then add -fno-plt by default?

Yes, we should use -fno-plt as the default for cmodel=extreme.

x86_64 silently ignores -fplt for cmodel=large (their "large" is like
our "extreme"), but perhaps it's better for us to just report an error
if someone uses "-mcmodel=extreme -fplt" explicitly (if possible, I'm
not sure if we can determine whether -fplt is explicitly given in the
backend code).



I thought if using -mcmodel=extreme and -fplt at the same time we could 
add a warning here and inform the user that we changed to noplt.




Re: [PATCH v1] LoongArch: Add support code model extreme.

2022-08-18 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-08-18 at 19:49 +0800, Lulu Cheng wrote:
> > I think we can ignore the effect of -fplt if code model is extreme,
> > instead of forcing everyone to explicitly add -fno-plt.  The "large"
> > code model of x86_64 also does not limit the address range and it always
> > avoids PLT (even if someone adds "-fplt" explicitly).

> Do you mean that if cmodel=extreme,
>   then add -fno-plt by default?

Yes, we should use -fno-plt as the default for cmodel=extreme.

x86_64 silently ignores -fplt for cmodel=large (their "large" is like
our "extreme"), but perhaps it's better for us to just report an error
if someone uses "-mcmodel=extreme -fplt" explicitly (if possible, I'm
not sure if we can determine whether -fplt is explicitly given in the
backend code).

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: ICE after folding svld1rq to vec_perm_expr duing forwprop

2022-08-18 Thread Prathamesh Kulkarni via Gcc-patches
On Thu, 18 Aug 2022 at 18:14, Prathamesh Kulkarni
 wrote:
>
> On Wed, 17 Aug 2022 at 17:01, Richard Biener  
> wrote:
> >
> > On Tue, Aug 16, 2022 at 6:30 PM Richard Sandiford
> >  wrote:
> > >
> > > Prathamesh Kulkarni  writes:
> > > > On Tue, 9 Aug 2022 at 18:42, Richard Biener 
> > > >  wrote:
> > > >>
> > > >> On Tue, Aug 9, 2022 at 12:10 PM Prathamesh Kulkarni
> > > >>  wrote:
> > > >> >
> > > >> > On Mon, 8 Aug 2022 at 14:27, Richard Biener 
> > > >> >  wrote:
> > > >> > >
> > > >> > >   /* If result vector has greater length than input vector,
> > > >> > > + then allow permuting two vectors as long as:
> > > >> > > + a) sel.nelts_per_pattern == 1
> > > >> > > + b) sel.npatterns == len of input vector.
> > > >> > > + The intent is to permute input vectors, and
> > > >> > > + dup the elements in resulting vector to target vector 
> > > >> > > length.  */
> > > >> > > +
> > > >> > > +  if (maybe_gt (TYPE_VECTOR_SUBPARTS (type),
> > > >> > > +   TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0
> > > >> > > +{
> > > >> > > +  nelts = sel.encoding ().npatterns ();
> > > >> > > +  if (sel.encoding ().nelts_per_pattern () != 1
> > > >> > > + || (!known_eq (nelts, TYPE_VECTOR_SUBPARTS (TREE_TYPE 
> > > >> > > (arg0)
> > > >> > > +   return NULL_TREE;
> > > >> > > +}
> > > >> > >
> > > >> > > so the only case you add is non-VLA to VLA and there
> > > >> > > explicitly only the case of a period that's same as the
> > > >> > > element count in the input vectors.
> > > >> > >
> > > >> > >
> > > >> > > @@ -2602,6 +2602,9 @@ dump_generic_node (pretty_printer *pp, tree
> > > >> > > node, int spc, dump_flags_t flags,
> > > >> > > pp_space (pp);
> > > >> > >   }
> > > >> > >   }
> > > >> > > +   if (VECTOR_TYPE_P (TREE_TYPE (node))
> > > >> > > +   && !TYPE_VECTOR_SUBPARTS (TREE_TYPE 
> > > >> > > (node)).is_constant ())
> > > >> > > + pp_string (pp, ", ... ");
> > > >> > > pp_right_brace (pp);
> > > >> > >
> > > >> > > btw, I do wonder if VLA CONSTRUCTORs are a "thing"?  Are they?
> > > >> > Well, it got created for the following case after folding:
> > > >> > svint32_t f2(int a, int b, int c, int d)
> > > >> > {
> > > >> >   int32x4_t v = {a, b, c, d};
> > > >> >   return svld1rq_s32 (svptrue_b8 (), &v[0]);
> > > >> > }
> > > >> >
> > > >> > The svld1rq_s32 call gets folded to:
> > > >> > v = {a, b, c, d}
> > > >> > lhs = VEC_PERM_EXPR
> > > >> >
> > > >> > fold_vec_perm then folds the above VEC_PERM_EXPR to
> > > >> > VLA constructor, since elements in v (in_elts) are not constant, and
> > > >> > need_ctor is thus true:
> > > >> > lhs = {a, b, c, d, ...}
> > > >> > I added "..." to make it more explicit that it's a VLA constructor.
> > > >>
> > > >> But I doubt we do anything reasonable with such a beast?  Do we?
> > > >> I suppose it's like a vec_duplicate if you view it as V1TImode
> > > >> but do we actually make sure to do this duplication?
> > > > I am not sure. As mentioned above, the current code-gen for VLA
> > > > constructor looks pretty bad.
> > > > Should we avoid folding VLA constructors for now ?
> > >
> > > VLA constructors aren't really a thing.  At least, the only VLA vector
> > > you could represent with current CONSTRUCTOR nodes is a fixed-length
> > > sequence at the start of an otherwise zero vector.  I'm not sure
> > > we even use that though (perhaps we do and I've forgotten).
> > >
> > > > I guess these are 2 different issues:
> > > > (a) Resolving ICE with VEC_PERM_EXPR for above aarch64 tests.
> > > > (b) Extending fold_vec_perm to handle vectors with differing lengths.
> > > >
> > > > For (a), I think the issue with using:
> > > > res_type = gimple_assign_lhs (stmt)
> > > > in previous patch, was that op2's type will change to match tgt_units,
> > > > if we go thru
> > > > (code == VIEW_CONVERT_EXPR || code2 == VIEW_CONVERT_EXPR) branch,
> > > > and may thus not be same as len(lhs_type) anymore, and hit the assert
> > > > in fold_vec_perm.
> > > >
> > > > IIUC, for lhs = VEC_PERM_EXPR, we now have the
> > > > following semantics:
> > > > (1) Element types for lhs, rhs1 and rhs2 should be the same.
> > > > (2) len(lhs) == len(mask) and len(rhs1) == len(rhs2).
> > >
> > > Yeah.
> > >
> > > > The attached patch changes res_type from TREE_TYPE (arg0) to following:
> > > > res_type = build_vector_type (TREE_TYPE (TREE_TYPE (arg0)),
> > > > TYPE_VECTOR_SUBPARTS 
> > > > (op2))
> > > > so it has same element type as arg0 (and arg1) and len of op2.
> > > > Does that look reasonable ?
> > > >
> > > > If we need a cast from res_type to lhs_type, then both would be fixed
> > > > width vectors
> > > > with len(lhs_type) being a multiple of len(res_type).
> > > > IIUC, we don't support casting from VLA vector to/from fixed width 
> > > > vector,
> > >
> > > Yes, that's not supported as a cast.  If the compiler knows the
> > > length 

Re: ICE after folding svld1rq to vec_perm_expr duing forwprop

2022-08-18 Thread Prathamesh Kulkarni via Gcc-patches
On Wed, 17 Aug 2022 at 17:01, Richard Biener  wrote:
>
> On Tue, Aug 16, 2022 at 6:30 PM Richard Sandiford
>  wrote:
> >
> > Prathamesh Kulkarni  writes:
> > > On Tue, 9 Aug 2022 at 18:42, Richard Biener  
> > > wrote:
> > >>
> > >> On Tue, Aug 9, 2022 at 12:10 PM Prathamesh Kulkarni
> > >>  wrote:
> > >> >
> > >> > On Mon, 8 Aug 2022 at 14:27, Richard Biener 
> > >> >  wrote:
> > >> > >
> > >> > >   /* If result vector has greater length than input vector,
> > >> > > + then allow permuting two vectors as long as:
> > >> > > + a) sel.nelts_per_pattern == 1
> > >> > > + b) sel.npatterns == len of input vector.
> > >> > > + The intent is to permute input vectors, and
> > >> > > + dup the elements in resulting vector to target vector length.  
> > >> > > */
> > >> > > +
> > >> > > +  if (maybe_gt (TYPE_VECTOR_SUBPARTS (type),
> > >> > > +   TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0
> > >> > > +{
> > >> > > +  nelts = sel.encoding ().npatterns ();
> > >> > > +  if (sel.encoding ().nelts_per_pattern () != 1
> > >> > > + || (!known_eq (nelts, TYPE_VECTOR_SUBPARTS (TREE_TYPE 
> > >> > > (arg0)
> > >> > > +   return NULL_TREE;
> > >> > > +}
> > >> > >
> > >> > > so the only case you add is non-VLA to VLA and there
> > >> > > explicitly only the case of a period that's same as the
> > >> > > element count in the input vectors.
> > >> > >
> > >> > >
> > >> > > @@ -2602,6 +2602,9 @@ dump_generic_node (pretty_printer *pp, tree
> > >> > > node, int spc, dump_flags_t flags,
> > >> > > pp_space (pp);
> > >> > >   }
> > >> > >   }
> > >> > > +   if (VECTOR_TYPE_P (TREE_TYPE (node))
> > >> > > +   && !TYPE_VECTOR_SUBPARTS (TREE_TYPE (node)).is_constant 
> > >> > > ())
> > >> > > + pp_string (pp, ", ... ");
> > >> > > pp_right_brace (pp);
> > >> > >
> > >> > > btw, I do wonder if VLA CONSTRUCTORs are a "thing"?  Are they?
> > >> > Well, it got created for the following case after folding:
> > >> > svint32_t f2(int a, int b, int c, int d)
> > >> > {
> > >> >   int32x4_t v = {a, b, c, d};
> > >> >   return svld1rq_s32 (svptrue_b8 (), &v[0]);
> > >> > }
> > >> >
> > >> > The svld1rq_s32 call gets folded to:
> > >> > v = {a, b, c, d}
> > >> > lhs = VEC_PERM_EXPR
> > >> >
> > >> > fold_vec_perm then folds the above VEC_PERM_EXPR to
> > >> > VLA constructor, since elements in v (in_elts) are not constant, and
> > >> > need_ctor is thus true:
> > >> > lhs = {a, b, c, d, ...}
> > >> > I added "..." to make it more explicit that it's a VLA constructor.
> > >>
> > >> But I doubt we do anything reasonable with such a beast?  Do we?
> > >> I suppose it's like a vec_duplicate if you view it as V1TImode
> > >> but do we actually make sure to do this duplication?
> > > I am not sure. As mentioned above, the current code-gen for VLA
> > > constructor looks pretty bad.
> > > Should we avoid folding VLA constructors for now ?
> >
> > VLA constructors aren't really a thing.  At least, the only VLA vector
> > you could represent with current CONSTRUCTOR nodes is a fixed-length
> > sequence at the start of an otherwise zero vector.  I'm not sure
> > we even use that though (perhaps we do and I've forgotten).
> >
> > > I guess these are 2 different issues:
> > > (a) Resolving ICE with VEC_PERM_EXPR for above aarch64 tests.
> > > (b) Extending fold_vec_perm to handle vectors with differing lengths.
> > >
> > > For (a), I think the issue with using:
> > > res_type = gimple_assign_lhs (stmt)
> > > in previous patch, was that op2's type will change to match tgt_units,
> > > if we go thru
> > > (code == VIEW_CONVERT_EXPR || code2 == VIEW_CONVERT_EXPR) branch,
> > > and may thus not be same as len(lhs_type) anymore, and hit the assert
> > > in fold_vec_perm.
> > >
> > > IIUC, for lhs = VEC_PERM_EXPR, we now have the
> > > following semantics:
> > > (1) Element types for lhs, rhs1 and rhs2 should be the same.
> > > (2) len(lhs) == len(mask) and len(rhs1) == len(rhs2).
> >
> > Yeah.
> >
> > > The attached patch changes res_type from TREE_TYPE (arg0) to following:
> > > res_type = build_vector_type (TREE_TYPE (TREE_TYPE (arg0)),
> > > TYPE_VECTOR_SUBPARTS 
> > > (op2))
> > > so it has same element type as arg0 (and arg1) and len of op2.
> > > Does that look reasonable ?
> > >
> > > If we need a cast from res_type to lhs_type, then both would be fixed
> > > width vectors
> > > with len(lhs_type) being a multiple of len(res_type).
> > > IIUC, we don't support casting from VLA vector to/from fixed width vector,
> >
> > Yes, that's not supported as a cast.  If the compiler knows the
> > length of the "VLA" vector then it's not VLA.  If it doesn't
> > know the length of the VLA vector then the sizes could be different
> > (preventing VIEW_CONVERT_EXPR) and the number of elements could be
> > different (preventing pointwise CONVERT_EXPRs).
> >
> > > or from VLA vector of one type to VLA 

[PATCH] jobserver: detect properly O_NONBLOCK

2022-08-18 Thread Martin Liška
This handles systems that don't have O_NONBLOCK; in that case
WPA streaming does not use the jobserver if --jobserver-auth uses 'fifo'.
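
The idea can be sketched as below. This is only an illustration: `open_nonblocking` is a hypothetical helper, and it tests the O_NONBLOCK macro directly, whereas the patch relies on a configure-time check that defines HOST_HAS_O_NONBLOCK.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Open PATH for reading without blocking.  Returns a file descriptor,
   or -1 if the open fails or the host has no O_NONBLOCK, in which case
   the caller should treat the jobserver as not connected.  */
static int
open_nonblocking (const char *path)
{
  assert (path != NULL);
#ifdef O_NONBLOCK
  return open (path, O_RDONLY | O_NONBLOCK);
#else
  (void) path;
  return -1;
#endif
}
```

A caller would record the outcome once (for example in a flag like the is_connected member added by this patch) and skip jobserver use when it is false.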

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
Tested with mingw cross compiler as well.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* configure.ac: Detect O_NONBLOCK flag for open.
* config.in: Regenerate.
* configure: Regenerate.
* opts-common.cc (jobserver_info::connect): Set is_connected
  properly based on O_NONBLOCK.
* opts-jobserver.h (struct jobserver_info): Add is_connected
  member variable.

gcc/lto/ChangeLog:

* lto.cc (wait_for_child): Ask if we are connected to jobserver.
(stream_out_partitions): Likewise.
---
 gcc/config.in|  6 ++
 gcc/configure| 39 +--
 gcc/configure.ac | 11 +++
 gcc/lto/lto.cc   | 12 ++--
 gcc/opts-common.cc   | 11 ++-
 gcc/opts-jobserver.h |  2 ++
 6 files changed, 72 insertions(+), 9 deletions(-)

diff --git a/gcc/config.in b/gcc/config.in
index 413b2bd36cb..abab9bf5024 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -2148,6 +2148,12 @@
 #endif
 
 
+/* Define if O_NONBLOCK supported by fcntl. */
+#ifndef USED_FOR_TARGET
+#undef HOST_HAS_O_NONBLOCK
+#endif
+
+
 /* Define which stat syscall is able to handle 64bit indodes. */
 #ifndef USED_FOR_TARGET
 #undef HOST_STAT_FOR_64BIT_INODES
diff --git a/gcc/configure b/gcc/configure
index da7a45066b5..8b416c1a142 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -12460,6 +12460,41 @@ $as_echo "#define HOST_HAS_O_CLOEXEC 1" >>confdefs.h
 
 fi
 
+# Check if O_NONBLOCK is defined by fcntl
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for O_NONBLOCK" >&5
+$as_echo_n "checking for O_NONBLOCK... " >&6; }
+if ${ac_cv_o_nonblock+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+#include <fcntl.h>
+int
+main ()
+{
+
+return open ("/dev/null", O_RDONLY | O_NONBLOCK);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_compile "$LINENO"; then :
+  ac_cv_o_nonblock=yes
+else
+  ac_cv_o_nonblock=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_o_nonblock" >&5
+$as_echo "$ac_cv_o_nonblock" >&6; }
+if test $ac_cv_o_nonblock = yes; then
+
+$as_echo "#define HOST_HAS_O_NONBLOCK 1" >>confdefs.h
+
+fi
+
 # C++ Modules would like some networking features to provide the mapping
 # server.  You can still use modules without them though.
 # The following network-related checks could probably do with some
@@ -19678,7 +19713,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19681 "configure"
+#line 19716 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19784,7 +19819,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19787 "configure"
+#line 19822 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index f70b6c24fda..4ebdad38b9b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1707,6 +1707,17 @@ if test $ac_cv_o_cloexec = yes; then
   [Define if O_CLOEXEC supported by fcntl.])
 fi
 
+# Check if O_NONBLOCK is defined by fcntl
+AC_CACHE_CHECK(for O_NONBLOCK, ac_cv_o_nonblock, [
+AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
+#include <fcntl.h>]], [[
+return open ("/dev/null", O_RDONLY | O_NONBLOCK);]])],
+[ac_cv_o_nonblock=yes],[ac_cv_o_nonblock=no])])
+if test $ac_cv_o_nonblock = yes; then
+  AC_DEFINE(HOST_HAS_O_NONBLOCK, 1,
+  [Define if O_NONBLOCK supported by fcntl.])
+fi
+
 # C++ Modules would like some networking features to provide the mapping
 # server.  You can still use modules without them though.
 # The following network-related checks could probably do with some
diff --git a/gcc/lto/lto.cc b/gcc/lto/lto.cc
index c82307f4f7e..3a9147b01b5 100644
--- a/gcc/lto/lto.cc
+++ b/gcc/lto/lto.cc
@@ -213,11 +213,11 @@ wait_for_child ()
 }
   while (!WIFEXITED (status) && !WIFSIGNALED (status));
 
---nruns;
+  --nruns;
 
-/* Return token to the jobserver if active.  */
-if (jinfo != NULL && jinfo->is_active)
-  jinfo->return_token ();
+  /* Return token to the jobserver if active.  */
+  if (jinfo != NULL && jinfo->is_connected)
+jinfo->return_token ();
 }
 #endif
 
@@ -254,7 +254,7 @@ stream_out_partitions (char *temp_filename, int blen, int min, int max,
  streaming process.  */
   if (!last)
 {
-  if (jinfo != NULL && jinfo->is_active)
+  if (jinfo != NULL && jinfo->is_connected)
while (true)
  {
if (jinfo->get_token ())
@@ -291,7 +291,7 @@ stream_out_partitions (char *temp_filename, int blen, int min, int max,
   while (nruns > 0)
wait_for_child ();
 
-  if (jinfo != NULL && 

[Bug gcov-profile/106659] [13 Regression] error: no member named 'fancy_abort' in namespace 'std'; did you mean simply 'fancy_abort'

2022-08-18 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106659

Martin Liška  changed:

   What|Removed |Added

 Status|NEW |WAITING

--- Comment #4 from Martin Liška  ---
Fixed.

[Bug gcov-profile/106659] [13 Regression] error: no member named 'fancy_abort' in namespace 'std'; did you mean simply 'fancy_abort'

2022-08-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106659

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Martin Liska :

https://gcc.gnu.org/g:03119249b9cfedb48e910b8df6a832b206cced2b

commit r13-2112-g03119249b9cfedb48e910b8df6a832b206cced2b
Author: Andrew Pinski 
Date:   Thu Aug 18 14:36:28 2022 +0200

gcov-dump: properly use INCLUDE_VECTOR

PR gcov-profile/106659

gcc/ChangeLog:

* gcov-dump.cc (INCLUDE_VECTOR): Include vector.h with
  INCLUDE_VECTOR.

Re: [PATCH v2] analyzer: warn on the use of floating-points operands in the size argument [PR106181]

2022-08-18 Thread David Malcolm via Gcc-patches
On Thu, 2022-08-18 at 11:44 +0200, Tim Lange wrote:
> Hi,
> 
> this is the revised version of my patch. I had trouble understanding
> your point regarding the float_visitor:
> 
> > If the constant is seen first, then the non-constant won't be favored
> > (though perhaps binary ops get canonicalized so that constants are on
> > the RHS?).
> 
> Only the assignment of m_result in visit_constant_svalue is guarded by
> !m_result, while the other two are not. So, there are two possibilities:
> 1. A constant is seen first and then assigned to m_result.
>    1.1. A non-constant float operand is seen later and
>         overwrites m_result.
>    1.2. There's no non-constant float operand, thus the
>         constant is the actual floating-point operand and
>         is kept inside m_result.
> 2. A non-constant is seen first, then m_result might be
>    overwritten with another non-constant later but never
>    with a constant.
> Do I have a flaw in my thinking? (But they do seem to get canonicalized,
> so that shouldn't matter)

I think I was confused here, and that you're right.  Sorry about that.
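
For readers following along, the guard logic described in the quoted text can be modelled with a toy visitor. This is only a sketch: the operand struct, float_visitor_sketch, and pick are made-up names for illustration, and the real visitor operates on analyzer svalues rather than on this simplified type.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified operand: a name plus a "is constant" flag.
struct operand
{
  std::string name;
  bool is_constant;
};

struct float_visitor_sketch
{
  const operand *m_result = nullptr;

  void visit (const operand &op)
  {
    if (op.is_constant)
      {
        // Guarded: a constant never replaces an earlier result.
        if (!m_result)
          m_result = &op;
      }
    else
      // Unguarded: a non-constant always overwrites.
      m_result = &op;
  }
};

// Visit the operands in order and report which one is kept.
inline std::string pick (const std::vector<operand> &ops)
{
  float_visitor_sketch v;
  for (const operand &op : ops)
    v.visit (op);
  return v.m_result ? v.m_result->name : std::string ();
}
```

Under this model a non-constant operand is reported whenever one exists, whichever order the operands are visited in, and a constant is reported only when it is the sole floating-point operand.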

> 
> > How about:
> >  -Wanalyzer-imprecise-float-arithmetic
> >  -Wanalyzer-imprecise-fp-arithmetic
> > instead?  (ideas welcome)
> 
> I've chosen the second. I mostly tried to avoid float because it is also
> a reserved keyword in many languages and I wanted to avoid confusion
> (might be overthinking that).

Fair enough.

> 
> - Tim
> 
> This patch fixes the ICE reported in PR106181 and adds a new warning to
> the analyzer complaining about the use of floating-point operands.
> 
> Regrtested on Linux x86_64.

Thanks; the patch looks good for trunk.

Dave



Re: [PATCH v1] LoongArch: Add support code model extreme.

2022-08-18 Thread Lulu Cheng



On 2022/8/18 at 7:37 PM, Xi Ruoyao wrote:

+   if (opts->x_flag_plt)
+ error ("code model %qs and %qs not support %s mode",
+"tiny-static", "extreme", "plt");

I think we can ignore the effect of -fplt if code model is extreme,
instead of forcing everyone to explicitly add -fno-plt.  The "large"
code model of x86_64 also does not limit the address range and it always
avoids PLT (even if someone adds "-fplt" explicitly).

Do you mean that if cmodel=extreme, then add -fno-plt by default?


[Bug libstdc++/106669] incorrect definition of viewable_range ("more madness with move-only views")

2022-08-18 Thread h2+bugs at fsfe dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106669

--- Comment #1 from Hannes Hauswedell  ---
This affects GCC 10.4 and GCC 11.3 since move-only views were backported.

The following part of the draft standard also needs changing:

https://eel.is/c++draft/range.all#general-2.1

--->

decay-copy(E) if that expression is well-formed and the decayed type of E
models view.

This will make references to move-only views pick the second option (ref_view).
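
A small sketch of why the "well-formed" condition matters: decay-copying an lvalue of a move-only type would need the deleted copy constructor, while decay-copying an rvalue only needs the move constructor. The names move_only and decay_copyable_v are invented for this illustration, and the trait only approximates the decay-copy wording with std::is_constructible; it does not use the ranges library.

```cpp
#include <type_traits>

// Hypothetical move-only type standing in for a move-only view.
struct move_only
{
  move_only () = default;
  move_only (move_only &&) = default;
  move_only (const move_only &) = delete;
};

// Approximation of "decay-copy(E) is well-formed" for an expression of
// type T: can a decayed T be constructed from it?
template <class T>
constexpr bool decay_copyable_v
  = std::is_constructible<std::decay_t<T>, T>::value;

// An rvalue can be moved into the copy ...
static_assert (decay_copyable_v<move_only>, "rvalue is decay-copyable");
// ... but an lvalue would need the deleted copy constructor.
static_assert (!decay_copyable_v<move_only &>, "lvalue is not");
```

With the proposed wording, the lvalue case would fail the first bullet and fall through to the ref_view option, as the comment describes.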
