Re: [PATCH] Fix typo in insn name.

2023-07-25 Thread Kewen.Lin via Gcc-patches
Hi Mike,

on 2023/7/11 03:59, Michael Meissner wrote:
> In doing other work, I noticed that there was an insn:
> 
>   vsx_extract_v4sf_<mode>_load
> 
> Which did not have an iterator.  I removed the useless <mode>.

It actually does have a mode iterator: "P" is used for the clobber.

The whole pattern of this define_insn_and_split is

(define_insn_and_split "*vsx_extract_v4sf_<mode>_load"
  [(set (match_operand:SF 0 "register_operand" "=f,v,v,?r")
(vec_select:SF
 (match_operand:V4SF 1 "memory_operand" "m,Z,m,m")
 (parallel [(match_operand:QI 2 "const_0_to_3_operand" "n,n,n,n")])))
   (clobber (match_scratch:P 3 "=,,,"))] <== *P used here*

Its definition is:

(define_mode_iterator P [(SI "TARGET_32BIT") (DI "TARGET_64BIT")])

I guess we can just leave it there?
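
As a rough illustration of why the name matters, here is a minimal sketch of how a mode iterator feeds into a pattern name. It only models the textual substitution of `<mode>` by the lower-case mode name for each iterator entry; `IteratorEntry`, `substitute_mode` and `expand_names` are hypothetical helpers made up for this example, not part of GCC.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one entry of a mode iterator such as
//   (define_mode_iterator P [(SI "TARGET_32BIT") (DI "TARGET_64BIT")])
struct IteratorEntry {
  std::string mode;   // mode name, e.g. "SI" or "DI"
  bool condition;     // whether the entry's target condition holds
};

// Replace each "<mode>" in a pattern name with the lower-case mode name.
std::string substitute_mode(std::string name, const std::string &mode)
{
  std::string lower;
  for (char c : mode)
    lower += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

  const std::string key = "<mode>";
  std::size_t pos = name.find(key);
  while (pos != std::string::npos)
    {
      name.replace(pos, key.size(), lower);
      pos = name.find(key, pos + lower.size());
    }
  return name;
}

// Names of the instances usable on the current target.  Simplification:
// entries whose condition is false are skipped in this model.
std::vector<std::string>
expand_names(const std::string &name, const std::vector<IteratorEntry> &iterator)
{
  std::vector<std::string> names;
  for (const IteratorEntry &entry : iterator)
    if (entry.condition)
      names.push_back(substitute_mode(name, entry.mode));
  return names;
}
```

Under this model, a 64-bit target with the iterator above would see a single instance named "*vsx_extract_v4sf_di_load", while a name without `<mode>` comes out unchanged for every entry.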

BR,
Kewen

> 
> I have tested this patch on the following systems and there was no degradation.
> Can I check it into the trunk branch?
> 
> * Power10, LE, --with-cpu=power10, IBM 128-bit long double
> * Power9,  LE, --with-cpu=power9,  IBM 128-bit long double
> * Power9,  LE, --with-cpu=power9,  IEEE 128-bit long double
> * Power9,  LE, --with-cpu=power9,  64-bit default long double
> * Power9,  BE, --with-cpu=power9,  IBM 128-bit long double
> * Power8,  BE, --with-cpu=power8,  IBM 128-bit long double
> 
> 2023-07-10  Michael Meissner  
> 
> gcc/
> 
>   * config/rs6000/vsx.md (vsx_extract_v4sf_load): Rename from
>   vsx_extract_v4sf_<mode>_load.
> ---
>  gcc/config/rs6000/vsx.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index d34c3b21abe..aed450e31ec 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -3576,7 +3576,7 @@ (define_insn_and_split "vsx_extract_v4sf"
>[(set_attr "length" "8")
> (set_attr "type" "fp")])
>  
> -(define_insn_and_split "*vsx_extract_v4sf_<mode>_load"
> +(define_insn_and_split "*vsx_extract_v4sf_load"
>[(set (match_operand:SF 0 "register_operand" "=f,v,v,?r")
>   (vec_select:SF
>(match_operand:V4SF 1 "memory_operand" "m,Z,m,m")





[PATCH v4] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-25 Thread Jin Ma via Gcc-patches
The pattern mistakenly assumes that fsflags accepts an immediate operand,
but it does not. Immediate operands must use fsflagsi instead.

For example:
__builtin_riscv_fsflags(4);

The following error occurred.
/tmp/ccoWdWqT.s: Assembler messages:
/tmp/ccoWdWqT.s:14: Error: illegal operands `fsflags 4'

gcc/ChangeLog:

* config/riscv/riscv.md: Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/fsflags.c: New test.
---
 gcc/config/riscv/riscv.md|  4 ++--
 gcc/testsuite/gcc.target/riscv/fsflags.c | 16 
 2 files changed, 18 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/fsflags.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 4615e811947..24515bcf706 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3074,7 +3074,7 @@ (define_insn "riscv_frcsr"
   "frcsr\t%0")
 
 (define_insn "riscv_fscsr"
-  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSCSR)]
+  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] UNSPECV_FSCSR)]
   "TARGET_HARD_FLOAT || TARGET_ZFINX"
   "fscsr\t%0")
 
@@ -3087,7 +3087,7 @@ (define_insn "riscv_frflags"
 (define_insn "riscv_fsflags"
   [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSFLAGS)]
   "TARGET_HARD_FLOAT || TARGET_ZFINX"
-  "fsflags\t%0")
+  "fsflags%i0\t%0")
 
 (define_insn "*riscv_fsnvsnan2"
   [(unspec_volatile [(match_operand:ANYF 0 "register_operand" "f")
diff --git a/gcc/testsuite/gcc.target/riscv/fsflags.c b/gcc/testsuite/gcc.target/riscv/fsflags.c
new file mode 100644
index 000..74a97b8a7c7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/fsflags.c
@@ -0,0 +1,16 @@
+/* Verify that fsflags is using the correct register or immediate.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target hard_float } */
+/* { dg-options "-O" } */
+
+void foo1 (int a)
+{
+   __builtin_riscv_fsflags(a);
+}
+void foo2 ()
+{
+   __builtin_riscv_fsflags(4);
+}
+
+/* { dg-final { scan-assembler-times "fsflags\t" 1 } } */
+/* { dg-final { scan-assembler-times "fsflagsi\t" 1 } } */
-- 
2.17.1
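
To make the intent of the "fsflags%i0\t%0" template easier to follow, here is a minimal sketch of the mnemonic selection it is meant to perform. It assumes that `%i0` prints an "i" when operand 0 is not a register, which is my understanding of the RISC-V backend's operand printing; `Operand` and `emit_fsflags` are hypothetical names used only for this illustration and do not exist in GCC.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for operand 0 of the insn: either a register or a
// small immediate (the "rK" alternatives of the pattern).
struct Operand {
  bool is_register;   // true for the "r" alternative, false for "K"
  std::string text;   // how "%0" would print, e.g. "a0" or "4"
};

// Model of the output template "fsflags%i0\t%0".
std::string emit_fsflags(const Operand &op)
{
  std::string insn = "fsflags";
  if (!op.is_register)
    insn += "i";              // assumed effect of "%i0" for an immediate
  return insn + "\t" + op.text;
}
```

With this model, a register operand yields "fsflags\ta0" and the constant 4 yields "fsflagsi\t4", which is the pair of spellings the new test scans for.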



[Bug c++/110809] ICE: in unify, at cp/pt.cc:25226 with floating-point NTTPs

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110809

--- Comment #6 from Andrew Pinski  ---
(In reply to Ed Catmur from comment #5)
> The original code is valid. A reduced valid case would be:
> ```
> template  struct S {};
> template  struct bucket {};
> template 
> int find_indices_impl(bucket const &);
> struct HashTable : bucket, 1>, bucket, 2> {};
> auto t = find_indices_impl>(HashTable{});
> ```

Oh yes, and the above even looks like the original code.
It is definitely trying to match the bucket, 2> base class with bucket,
i>, though, and it is ICEing there.

Re: Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-25 Thread Jin Ma via Gcc-patches
> So I guess you should change `fscsr` to `fscsr%i0` instead of dropping
> K from the constraint list?
> 
Sorry, you are right. I thought you were talking about fsflags, 
but I didn't notice it was fscsr. I'll correct it right away.
> On Wed, Jul 26, 2023 at 11:42 AM juzhe.zh...@rivai.ai
>  wrote:
> >
> > I don't understand:
> >  (define_insn "riscv_fscsr"
> > -  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] 
> > UNSPECV_FSCSR)]
> > +  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] 
> > UNSPECV_FSCSR)]
> >"TARGET_HARD_FLOAT || TARGET_ZFINX"
> >"fscsr\t%0")
> >
> > This pattern never allows immediate in the constraint. Why still make 
> > predicate allow immediate?
> >
> >
> >
> >
> > juzhe.zh...@rivai.ai
> >
> > From: Jin Ma
> > Date: 2023-07-26 11:33
> > To: gcc-patches; juzhe.zh...@rivai.ai
> > CC: jeffreyalaw; palmer; richard.sandiford; kito.cheng; philipp.tomsich; 
> > christoph.muellner; Robin Dapp; jinma.contrib
> > Subject: Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using 
> > immediate.
> > > -  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] 
> > > UNSPECV_FSCSR)]
> > > +  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] 
> > > UNSPECV_FSCSR)]
> > >
> > > If you don't allow immediate value in range 0 ~ 31, it should be 
> > > "register_operand" instead of "csr_operand".
> > >
> > >
> >
> > I think directives that support the immediate pattern might be better: on
> > the one hand, fsflagsi is supported in the manual; on the other hand,
> > fsflagsi can be slightly faster than fsflags.
> >
> > Regards
> > Jin
> >
> > >
> > > juzhe.zh...@rivai.ai
> > >

[Bug c++/109899] [12/13/14 Regression] ICE in check_noexcept_r, at cp/except.cc:1065

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109899

--- Comment #7 from Andrew Pinski  ---
Reduced further (which shows PR 110810 is the same here):
```
struct class1 {
  class1();
  ~class1();
};
using array = class1[1]; 
template 
void f()
{
  array{};
}
```

[Bug c++/109899] [12/13/14 Regression] ICE in check_noexcept_r, at cp/except.cc:1065

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109899

Andrew Pinski  changed:

   What|Removed |Added

 CC||cuzdav at gmail dot com

--- Comment #6 from Andrew Pinski  ---
*** Bug 110810 has been marked as a duplicate of this bug. ***

[Bug c++/110810] [12/13/14 Regression] ICE in check_noexcept_r, at cp/except.cc:1068

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110810

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Andrew Pinski  ---
It is a dup of bug 109899 in the end (will show why there in a second).

*** This bug has been marked as a duplicate of bug 109899 ***

[Bug c++/110810] [12/13/14 Regression] ICE in check_noexcept_r, at cp/except.cc:1068

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110810

--- Comment #2 from Andrew Pinski  ---
You don't even need a new expression or an explicit ~X call there:
```
struct Foo {
Foo() {}
~Foo() {}
};

struct X {
Foo data[4];
};

template
void f() {
X{};
}
```

[Bug c++/110809] ICE: in unify, at cp/pt.cc:25226 with floating-point NTTPs

2023-07-25 Thread ed at catmur dot uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110809

--- Comment #5 from Ed Catmur  ---
The original code is valid. A reduced valid case would be:
```
template  struct S {};
template  struct bucket {};
template 
int find_indices_impl(bucket const &);
struct HashTable : bucket, 1>, bucket, 2> {};
auto t = find_indices_impl>(HashTable{});
```

[Bug c++/110810] [12/13/14 Regression] ICE in check_noexcept_r, at cp/except.cc:1068

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110810

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
   Last reconfirmed||2023-07-26
 Ever confirmed|0   |1
   Target Milestone|--- |12.4
  Known to fail||12.1.0
 Status|UNCONFIRMED |NEW
  Known to work||11.4.0
Summary|ICE in check_noexcept_r, at |[12/13/14 Regression] ICE
   |cp/except.cc:1068   |in check_noexcept_r, at
   ||cp/except.cc:1068

--- Comment #1 from Andrew Pinski  ---
Confirmed.
You don't even need it to be a placement new:
```

struct Foo {
Foo() {}
~Foo() {}
};

struct X {
Foo data[4];
};

template
void f() {
auto& object = *new X{};
object.~X();
}
```

[Bug target/88160] Error: register save offset not a multiple of 4 only with optimize

2023-07-25 Thread vincent.riviere at freesbee dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88160

Vincent Riviere  changed:

   What|Removed |Added

 CC||vincent.riviere at freesbee dot fr

--- Comment #2 from Vincent Riviere  ---
I can reproduce this bug with GCC 13.1.0 for m68k. It happens when compiling libgcc
with -mcpu=5475 -mshort -O2.

Affected files are:
unwind-dw2.c
unwind-dw2-fde.c
libgcov-driver.c

Workaround: compile with -O1.

[Bug c++/110798] [12 Regression] The reusult of sizeof operator followed with an 'unsigned typedef-ed generic integer' type is alway 4 bytes(except char)

2023-07-25 Thread 13958014620 at 139 dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110798

--- Comment #5 from miles <13958014620 at 139 dot com> ---
(In reply to Jonathan Wakely from comment #4)
> (In reply to Andrew Pinski from comment #1)
> > I am almost positive this was fixed by r14-159-g03cebd304955a6 which was
> > backported to GCC 13 branch r13-7277-ga713aa4f47ac1e (for 13.2.0) .
> 
> Yes, bisection confirms it. So this is a dup.
> 
> Aside: I'm not sure I'd call this a wrong-code bug. The testcase would be
> simpler if it used static_assert instead of assert, which would make it
> accepts-invalid / rejects-valid instead.
> 
> *** This bug has been marked as a duplicate of bug 108099 ***


> Aside: I'm not sure I'd call this a wrong-code bug.
Yep, applying the "unsigned" keyword to a typedef-ed type is ill-formed according
to ISO 14882. It would be acceptable for the compiler to report an error, or at
least a warning.

> The testcase would be simpler if it used static_assert instead of assert,
> which would make it accepts-invalid / rejects-valid instead.
Thanks a lot for your suggestion!
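
For reference, a minimal sketch of what a static_assert-based check could look like when written with the valid spelling. The original test case is not reproduced in this comment, so the typedef names below are hypothetical; the sketch uses std::make_unsigned in place of the ill-formed "unsigned" + typedef-name combination.

```cpp
#include <type_traits>

// Hypothetical typedefs standing in for the typedef-ed integer types.
typedef short     my_short_t;
typedef long long my_wide_t;

// "unsigned my_wide_t" is ill-formed; std::make_unsigned names the
// corresponding unsigned type, which has the same size as T.
template <typename T>
constexpr bool unsigned_keeps_size()
{
  return sizeof(typename std::make_unsigned<T>::type) == sizeof(T);
}

// Checked at compile time, so a wrong size is a hard error, not a runtime
// assertion failure.
static_assert(unsigned_keeps_size<my_short_t>(),
              "unsigned counterpart of my_short_t changed size");
static_assert(unsigned_keeps_size<my_wide_t>(),
              "unsigned counterpart of my_wide_t changed size");
```

Because the checks are evaluated during compilation, a compiler that computed the wrong size would reject the file outright instead of producing a program that aborts.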

Discovering this issue was an interesting experience.
I wrote a macro to show the attributes of fundamental types for my son,
who is currently learning C++.

#define PRINT_TYPE_ATTRIBUTES(T)  \
cout << "typeid(" << O_YELLOW(#T) << ").name(): " <<
O_RED(typeid(T).name()) <

[Bug c++/110809] ICE: in unify, at cp/pt.cc:25226 with floating-point NTTPs

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110809

--- Comment #4 from Andrew Pinski  ---
Note if we do:
```
struct A{}; struct B{};
template  struct S {};
template  struct bucket {};
template 
int find_indices_impl(bucket const &);
using HashTable = bucket, 1>;
auto t = find_indices_impl>(HashTable{});
```
GCC correctly rejects the above, but if we replace B{} with a floating-point
value, we get the ICE (but not if we replace A{}). Maybe that will give a hint
of what is going wrong in the end.

[Bug c++/110809] ICE: in unify, at cp/pt.cc:25226 with floating-point NTTPs

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110809

--- Comment #3 from Andrew Pinski  ---
I reduced it to invalid code (I don't know whether the original was valid or
not):
```
template  struct S {};
template  struct bucket {};
template 
int find_indices_impl(bucket const &);
using HashTable = bucket, 1>;
auto t = find_indices_impl>(HashTable{});
```

If I remove the i template argument, GCC correctly rejects the above. If I
change 1.0 and 2.0 to the same value, GCC accepts the code correctly.


Note clang does not support double as a non-type template argument (yet).

RE: [PATCH] RISC-V: Fix vector tuple intrinsic

2023-07-25 Thread Li, Pan2 via Gcc-patches
Thanks a lot. I just forwarded an email about the write-after-approval steps.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of juzhe.zh...@rivai.ai
Sent: Wednesday, July 26, 2023 12:22 PM
To: Li Xu ; gcc-patches 
Cc: kito.cheng ; palmer ; Li Xu 

Subject: Re: [PATCH] RISC-V: Fix vector tuple intrinsic

Thanks a lot for testing and fixing the RVV API.

Could you add a simple float16 tuple API test?

I know the API is so big that we can't add tests for all of it to the
testsuite, but adding a simple case would be nice.

By the way, do you have write access?




juzhe.zh...@rivai.ai
 
From: Li Xu
Date: 2023-07-26 12:04
To: gcc-patches
CC: kito.cheng; palmer; juzhe.zhong; Li Xu
Subject: [PATCH] RISC-V: Fix vector tuple intrinsic
Consider the following case:
void test_vsoxseg3ei32_v_i32mf2x3(int32_t *base, vuint32mf2_t bindex, 
vint32mf2x3_t v_tuple, size_t vl) {
  return __riscv_vsoxseg3ei32_v_i32mf2x3(base, bindex, v_tuple, vl);
}
 
Compiler failed with:
test.c:19:1: internal compiler error: in vl_vtype_info, at 
config/riscv/riscv-vsetvl.cc:1679
   19 | }
  | ^
0x1439ec2 riscv_vector::vl_vtype_info::vl_vtype_info(riscv_vector::avl_info, 
unsigned char, riscv_vector::vlmul_type, unsigned char, bool, bool)
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:1679
0x143f788 get_vl_vtype_info
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:807
0x143f788 riscv_vector::vector_insn_info::parse_insn(rtl_ssa::insn_info*)
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:1843
0x1440371 riscv_vector::vector_infos_manager::vector_infos_manager()
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:2350
0x14407ee pass_vsetvl::init()
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:4581
0x14471cf pass_vsetvl::execute(function*)
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:4716
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins.def (vfloat16mf4x2_t): Change 
scalar type to float16, eliminate warning.
(vfloat16mf4x3_t): Ditto.
(vfloat16mf4x4_t): Ditto.
(vfloat16mf4x5_t): Ditto.
(vfloat16mf4x6_t): Ditto.
(vfloat16mf4x7_t): Ditto.
(vfloat16mf4x8_t): Ditto.
(vfloat16mf2x2_t): Ditto.
(vfloat16mf2x3_t): Ditto.
(vfloat16mf2x4_t): Ditto.
(vfloat16mf2x5_t): Ditto.
(vfloat16mf2x6_t): Ditto.
(vfloat16mf2x7_t): Ditto.
(vfloat16mf2x8_t): Ditto.
(vfloat16m1x2_t): Ditto.
(vfloat16m1x3_t): Ditto.
(vfloat16m1x4_t): Ditto.
(vfloat16m1x5_t): Ditto.
(vfloat16m1x6_t): Ditto.
(vfloat16m1x7_t): Ditto.
(vfloat16m1x8_t): Ditto.
(vfloat16m2x2_t): Ditto.
(vfloat16m2x3_t): Ditto.
(vfloat16m2x4_t): Ditto.
(vfloat16m4x2_t): Ditto.
* config/riscv/vector-iterators.md: add RVVM4x2DF in iterator V4T.
* config/riscv/vector.md: add tuple mode in attr sew.
---
gcc/config/riscv/riscv-vector-builtins.def | 50 +++---
gcc/config/riscv/vector-iterators.md   |  1 +
gcc/config/riscv/vector.md |  1 +
3 files changed, 27 insertions(+), 25 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-vector-builtins.def 
b/gcc/config/riscv/riscv-vector-builtins.def
index 0e49480703b..6661629aad8 100644
--- a/gcc/config/riscv/riscv-vector-builtins.def
+++ b/gcc/config/riscv/riscv-vector-builtins.def
@@ -441,47 +441,47 @@ DEF_RVV_TYPE (vuint64m8_t, 16, __rvv_uint64m8_t, uint64, 
RVVM8DI, _u64m8, _u64,
DEF_RVV_TYPE (vfloat16mf4_t, 18, __rvv_float16mf4_t, float16, RVVMF4HF, _f16mf4,
  _f16, _e16mf4)
/* Define tuple types for SEW = 16, LMUL = MF4. */
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x2_t, 20, __rvv_float16mf4x2_t, vfloat16mf4_t, 
float, 2, _f16mf4x2)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x3_t, 20, __rvv_float16mf4x3_t, vfloat16mf4_t, 
float, 3, _f16mf4x3)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x4_t, 20, __rvv_float16mf4x4_t, vfloat16mf4_t, 
float, 4, _f16mf4x4)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x5_t, 20, __rvv_float16mf4x5_t, vfloat16mf4_t, 
float, 5, _f16mf4x5)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x6_t, 20, __rvv_float16mf4x6_t, vfloat16mf4_t, 
float, 6, _f16mf4x6)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x7_t, 20, __rvv_float16mf4x7_t, vfloat16mf4_t, 
float, 7, _f16mf4x7)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x8_t, 20, __rvv_float16mf4x8_t, vfloat16mf4_t, 
float, 8, _f16mf4x8)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x2_t, 20, __rvv_float16mf4x2_t, vfloat16mf4_t, 
float16, 2, _f16mf4x2)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x3_t, 20, __rvv_float16mf4x3_t, vfloat16mf4_t, 
float16, 3, _f16mf4x3)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x4_t, 20, __rvv_float16mf4x4_t, vfloat16mf4_t, 
float16, 4, _f16mf4x4)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x5_t, 20, __rvv_float16mf4x5_t, vfloat16mf4_t, 
float16, 5, _f16mf4x5)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x6_t, 20, __rvv_float16mf4x6_t, vfloat16mf4_t, 
float16, 6, _f16mf4x6)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x7_t, 20, __rvv_float16mf4x7_t, vfloat16mf4_t, 
float16, 7, _f16mf4x7)

Re: [PATCH] - Devirtualization of array destruction (C++) - 110057

2023-07-25 Thread Jason Merrill via Gcc-patches

On 7/12/23 10:10, Ng YongXiang via Gcc-patches wrote:

Component:
c++

Bug ID:
110057

Bugzilla link:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110057

Description:
Array should not call virtual destructor of object when array is destructed

ChangeLog:

2023-07-12  Ng YongXiang

PR c++
* Devirtualize auto generated destructor calls of array

cp/
* init.c: Call non virtual destructor of objects in array

testsuite/
* g++.dg/devirt-array-destructor-1.C: New.
* g++.dg/devirt-array-destructor-2.C: New.


On Wed, Jul 12, 2023 at 5:02 PM Xi Ruoyao  wrote:


On Wed, 2023-07-12 at 16:58 +0800, Ng YongXiang via Gcc-patches wrote:

I'm writing to seek a review for an issue I filed some time ago:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110057 . A proposed patch is
attached in the bug tracker as well.


You should send the patch to gcc-patches@gcc.gnu.org for a review, see
https://gcc.gnu.org/contribute.html for the details.  Generally we
consider patches attached in bugzilla as drafts.


Thanks!  The change makes sense under 
https://eel.is/c++draft/expr.delete#3.sentence-2 , but please look again 
at contribute.html.


In particular, the Legal section; you don't seem to have a copyright 
assignment with the FSF, nor do I see a DCO certification 
(https://gcc.gnu.org/dco.html) in your patch.


Like the examples in contribute.html, the subject line should be more 
like "[PATCH] c++: devirtualization of array destruction [PR110057]"


The ChangeLog entry should be in the commit message.


 * g++.dg/warn/pr83054.C: Change expected number of devirtualized calls


This isn't just changing the expected number, it's also changing the 
array from a local variable to dynamically allocated, which is a big 
change to what's being tested.  If you want to test the dynamic case, 
please add a new test instead of making this change.
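
To spell out why the two cases are different tests, here is a minimal sketch. It is not the pr83054.C test; `Base`, `Derived` and the two helper functions are hypothetical names for this illustration. Elements of a local array are exactly of the declared element type, so their destructors can in principle be called without virtual dispatch, whereas objects deleted through a base pointer may be of a derived type and need it.

```cpp
#include <cassert>

struct Base {
  static int base_dtor_calls;
  virtual ~Base() { ++base_dtor_calls; }
};

struct Derived : Base {
  static int derived_dtor_calls;
  ~Derived() override { ++derived_dtor_calls; }
};

int Base::base_dtor_calls = 0;
int Derived::derived_dtor_calls = 0;

// Local array: every element is exactly a Base, so the dynamic type of each
// element is known when the array goes out of scope.
void destroy_local_array()
{
  Base array[3];
}

// Individually allocated objects deleted through Base pointers: the dynamic
// type is Derived here, so the destructor call must dispatch virtually.
void destroy_through_pointers()
{
  Base *ptrs[3];
  for (int i = 0; i < 3; ++i)
    ptrs[i] = new Derived();
  for (int i = 0; i < 3; ++i)
    delete ptrs[i];
}
```

Calling destroy_local_array() runs only ~Base three times, while destroy_through_pointers() runs ~Derived three times (each followed by ~Base), which is why swapping one form for the other changes what the test exercises.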



diff --git a/gcc/testsuite/g++.dg/warn/pr83054.C 
b/gcc/testsuite/g++.dg/warn/pr83054.C
index 5285f94acee..7cd0951713d 100644
--- a/gcc/testsuite/g++.dg/warn/pr83054.C
+++ b/gcc/testsuite/g++.dg/warn/pr83054.C
@@ -10,7 +10,7 @@
 #endif
 
 extern "C" int printf (const char *, ...);

-struct foo // { dg-warning "final would enable devirtualization of 5 calls" }
+struct foo // { dg-warning "final would enable devirtualization of 1 call" }
 {
   static int count;
   void print (int i, int j) { printf ("foo[%d][%d] = %d\n", i, j, x); }
@@ -29,19 +29,15 @@ int foo::count;
 
 int main ()

 {
-  {
-foo array[3][3];
-for (int i = 0; i < 3; i++)
-  {
-   for (int j = 0; j < 3; j++)
- {
-   printf("[%d][%d] = %x\n", i, j, (void *)[i][j]);
- }
-  }
-  // The count should be nine, if not, fail the test.
-  if (foo::count != 9)
-   return 1;
-  }
+  foo* arr[9];
+  for (int i = 0; i < 9; ++i)
+arr[i] = new foo();
+  if (foo::count != 9)
+return 1;
+  for (int i = 0; i < 9; ++i)
+arr[i]->print(i / 3, i % 3);
+  for (int i = 0; i < 9; ++i)
+delete arr[i];





Re: [PATCH] RISC-V: Fix vector tuple intrinsic

2023-07-25 Thread juzhe.zh...@rivai.ai
Thanks a lot for testing and fixing the RVV API.

Could you add a simple float16 tuple API test?

I know the API is so big that we can't add tests for all of it to the
testsuite, but adding a simple case would be nice.

By the way, do you have write access?




juzhe.zh...@rivai.ai
 
From: Li Xu
Date: 2023-07-26 12:04
To: gcc-patches
CC: kito.cheng; palmer; juzhe.zhong; Li Xu
Subject: [PATCH] RISC-V: Fix vector tuple intrinsic
Consider the following case:
void test_vsoxseg3ei32_v_i32mf2x3(int32_t *base, vuint32mf2_t bindex, 
vint32mf2x3_t v_tuple, size_t vl) {
  return __riscv_vsoxseg3ei32_v_i32mf2x3(base, bindex, v_tuple, vl);
}
 
Compiler failed with:
test.c:19:1: internal compiler error: in vl_vtype_info, at 
config/riscv/riscv-vsetvl.cc:1679
   19 | }
  | ^
0x1439ec2 riscv_vector::vl_vtype_info::vl_vtype_info(riscv_vector::avl_info, 
unsigned char, riscv_vector::vlmul_type, unsigned char, bool, bool)
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:1679
0x143f788 get_vl_vtype_info
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:807
0x143f788 riscv_vector::vector_insn_info::parse_insn(rtl_ssa::insn_info*)
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:1843
0x1440371 riscv_vector::vector_infos_manager::vector_infos_manager()
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:2350
0x14407ee pass_vsetvl::init()
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:4581
0x14471cf pass_vsetvl::execute(function*)
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:4716
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins.def (vfloat16mf4x2_t): Change 
scalar type to float16, eliminate warning.
(vfloat16mf4x3_t): Ditto.
(vfloat16mf4x4_t): Ditto.
(vfloat16mf4x5_t): Ditto.
(vfloat16mf4x6_t): Ditto.
(vfloat16mf4x7_t): Ditto.
(vfloat16mf4x8_t): Ditto.
(vfloat16mf2x2_t): Ditto.
(vfloat16mf2x3_t): Ditto.
(vfloat16mf2x4_t): Ditto.
(vfloat16mf2x5_t): Ditto.
(vfloat16mf2x6_t): Ditto.
(vfloat16mf2x7_t): Ditto.
(vfloat16mf2x8_t): Ditto.
(vfloat16m1x2_t): Ditto.
(vfloat16m1x3_t): Ditto.
(vfloat16m1x4_t): Ditto.
(vfloat16m1x5_t): Ditto.
(vfloat16m1x6_t): Ditto.
(vfloat16m1x7_t): Ditto.
(vfloat16m1x8_t): Ditto.
(vfloat16m2x2_t): Ditto.
(vfloat16m2x3_t): Ditto.
(vfloat16m2x4_t): Ditto.
(vfloat16m4x2_t): Ditto.
* config/riscv/vector-iterators.md: add RVVM4x2DF in iterator V4T.
* config/riscv/vector.md: add tuple mode in attr sew.
---
gcc/config/riscv/riscv-vector-builtins.def | 50 +++---
gcc/config/riscv/vector-iterators.md   |  1 +
gcc/config/riscv/vector.md |  1 +
3 files changed, 27 insertions(+), 25 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-vector-builtins.def 
b/gcc/config/riscv/riscv-vector-builtins.def
index 0e49480703b..6661629aad8 100644
--- a/gcc/config/riscv/riscv-vector-builtins.def
+++ b/gcc/config/riscv/riscv-vector-builtins.def
@@ -441,47 +441,47 @@ DEF_RVV_TYPE (vuint64m8_t, 16, __rvv_uint64m8_t, uint64, 
RVVM8DI, _u64m8, _u64,
DEF_RVV_TYPE (vfloat16mf4_t, 18, __rvv_float16mf4_t, float16, RVVMF4HF, _f16mf4,
  _f16, _e16mf4)
/* Define tuple types for SEW = 16, LMUL = MF4. */
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x2_t, 20, __rvv_float16mf4x2_t, vfloat16mf4_t, 
float, 2, _f16mf4x2)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x3_t, 20, __rvv_float16mf4x3_t, vfloat16mf4_t, 
float, 3, _f16mf4x3)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x4_t, 20, __rvv_float16mf4x4_t, vfloat16mf4_t, 
float, 4, _f16mf4x4)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x5_t, 20, __rvv_float16mf4x5_t, vfloat16mf4_t, 
float, 5, _f16mf4x5)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x6_t, 20, __rvv_float16mf4x6_t, vfloat16mf4_t, 
float, 6, _f16mf4x6)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x7_t, 20, __rvv_float16mf4x7_t, vfloat16mf4_t, 
float, 7, _f16mf4x7)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x8_t, 20, __rvv_float16mf4x8_t, vfloat16mf4_t, 
float, 8, _f16mf4x8)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x2_t, 20, __rvv_float16mf4x2_t, vfloat16mf4_t, 
float16, 2, _f16mf4x2)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x3_t, 20, __rvv_float16mf4x3_t, vfloat16mf4_t, 
float16, 3, _f16mf4x3)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x4_t, 20, __rvv_float16mf4x4_t, vfloat16mf4_t, 
float16, 4, _f16mf4x4)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x5_t, 20, __rvv_float16mf4x5_t, vfloat16mf4_t, 
float16, 5, _f16mf4x5)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x6_t, 20, __rvv_float16mf4x6_t, vfloat16mf4_t, 
float16, 6, _f16mf4x6)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x7_t, 20, __rvv_float16mf4x7_t, vfloat16mf4_t, 
float16, 7, _f16mf4x7)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x8_t, 20, __rvv_float16mf4x8_t, vfloat16mf4_t, 
float16, 8, _f16mf4x8)
/* LMUL = 1/2.  */
DEF_RVV_TYPE (vfloat16mf2_t, 18, __rvv_float16mf2_t, float16, RVVMF2HF, _f16mf2,
  _f16, _e16mf2)
/* Define tuple types for SEW = 16, LMUL = MF2. */
-DEF_RVV_TUPLE_TYPE (vfloat16mf2x2_t, 20, 

[PATCH] RISC-V: Fix vector tuple intrinsic

2023-07-25 Thread Li Xu
Consider the following case:
void test_vsoxseg3ei32_v_i32mf2x3(int32_t *base, vuint32mf2_t bindex, 
vint32mf2x3_t v_tuple, size_t vl) {
  return __riscv_vsoxseg3ei32_v_i32mf2x3(base, bindex, v_tuple, vl);
}

Compiler failed with:
test.c:19:1: internal compiler error: in vl_vtype_info, at 
config/riscv/riscv-vsetvl.cc:1679
   19 | }
  | ^
0x1439ec2 riscv_vector::vl_vtype_info::vl_vtype_info(riscv_vector::avl_info, 
unsigned char, riscv_vector::vlmul_type, unsigned char, bool, bool)
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:1679
0x143f788 get_vl_vtype_info
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:807
0x143f788 riscv_vector::vector_insn_info::parse_insn(rtl_ssa::insn_info*)
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:1843
0x1440371 riscv_vector::vector_infos_manager::vector_infos_manager()
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:2350
0x14407ee pass_vsetvl::init()
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:4581
0x14471cf pass_vsetvl::execute(function*)
../.././riscv-gcc/gcc/config/riscv/riscv-vsetvl.cc:4716

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.def (vfloat16mf4x2_t): Change 
scalar type to float16, eliminate warning.
(vfloat16mf4x3_t): Ditto.
(vfloat16mf4x4_t): Ditto.
(vfloat16mf4x5_t): Ditto.
(vfloat16mf4x6_t): Ditto.
(vfloat16mf4x7_t): Ditto.
(vfloat16mf4x8_t): Ditto.
(vfloat16mf2x2_t): Ditto.
(vfloat16mf2x3_t): Ditto.
(vfloat16mf2x4_t): Ditto.
(vfloat16mf2x5_t): Ditto.
(vfloat16mf2x6_t): Ditto.
(vfloat16mf2x7_t): Ditto.
(vfloat16mf2x8_t): Ditto.
(vfloat16m1x2_t): Ditto.
(vfloat16m1x3_t): Ditto.
(vfloat16m1x4_t): Ditto.
(vfloat16m1x5_t): Ditto.
(vfloat16m1x6_t): Ditto.
(vfloat16m1x7_t): Ditto.
(vfloat16m1x8_t): Ditto.
(vfloat16m2x2_t): Ditto.
(vfloat16m2x3_t): Ditto.
(vfloat16m2x4_t): Ditto.
(vfloat16m4x2_t): Ditto.
* config/riscv/vector-iterators.md: add RVVM4x2DF in iterator V4T.
* config/riscv/vector.md: add tuple mode in attr sew.
---
 gcc/config/riscv/riscv-vector-builtins.def | 50 +++---
 gcc/config/riscv/vector-iterators.md   |  1 +
 gcc/config/riscv/vector.md |  1 +
 3 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.def 
b/gcc/config/riscv/riscv-vector-builtins.def
index 0e49480703b..6661629aad8 100644
--- a/gcc/config/riscv/riscv-vector-builtins.def
+++ b/gcc/config/riscv/riscv-vector-builtins.def
@@ -441,47 +441,47 @@ DEF_RVV_TYPE (vuint64m8_t, 16, __rvv_uint64m8_t, uint64, 
RVVM8DI, _u64m8, _u64,
 DEF_RVV_TYPE (vfloat16mf4_t, 18, __rvv_float16mf4_t, float16, RVVMF4HF, 
_f16mf4,
  _f16, _e16mf4)
 /* Define tuple types for SEW = 16, LMUL = MF4. */
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x2_t, 20, __rvv_float16mf4x2_t, vfloat16mf4_t, 
float, 2, _f16mf4x2)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x3_t, 20, __rvv_float16mf4x3_t, vfloat16mf4_t, 
float, 3, _f16mf4x3)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x4_t, 20, __rvv_float16mf4x4_t, vfloat16mf4_t, 
float, 4, _f16mf4x4)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x5_t, 20, __rvv_float16mf4x5_t, vfloat16mf4_t, 
float, 5, _f16mf4x5)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x6_t, 20, __rvv_float16mf4x6_t, vfloat16mf4_t, 
float, 6, _f16mf4x6)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x7_t, 20, __rvv_float16mf4x7_t, vfloat16mf4_t, 
float, 7, _f16mf4x7)
-DEF_RVV_TUPLE_TYPE (vfloat16mf4x8_t, 20, __rvv_float16mf4x8_t, vfloat16mf4_t, 
float, 8, _f16mf4x8)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x2_t, 20, __rvv_float16mf4x2_t, vfloat16mf4_t, 
float16, 2, _f16mf4x2)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x3_t, 20, __rvv_float16mf4x3_t, vfloat16mf4_t, 
float16, 3, _f16mf4x3)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x4_t, 20, __rvv_float16mf4x4_t, vfloat16mf4_t, 
float16, 4, _f16mf4x4)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x5_t, 20, __rvv_float16mf4x5_t, vfloat16mf4_t, 
float16, 5, _f16mf4x5)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x6_t, 20, __rvv_float16mf4x6_t, vfloat16mf4_t, 
float16, 6, _f16mf4x6)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x7_t, 20, __rvv_float16mf4x7_t, vfloat16mf4_t, 
float16, 7, _f16mf4x7)
+DEF_RVV_TUPLE_TYPE (vfloat16mf4x8_t, 20, __rvv_float16mf4x8_t, vfloat16mf4_t, 
float16, 8, _f16mf4x8)
 /* LMUL = 1/2.  */
 DEF_RVV_TYPE (vfloat16mf2_t, 18, __rvv_float16mf2_t, float16, RVVMF2HF, 
_f16mf2,
  _f16, _e16mf2)
 /* Define tuple types for SEW = 16, LMUL = MF2. */
-DEF_RVV_TUPLE_TYPE (vfloat16mf2x2_t, 20, __rvv_float16mf2x2_t, vfloat16mf2_t, 
float, 2, _f16mf2x2)
-DEF_RVV_TUPLE_TYPE (vfloat16mf2x3_t, 20, __rvv_float16mf2x3_t, vfloat16mf2_t, 
float, 3, _f16mf2x3)
-DEF_RVV_TUPLE_TYPE (vfloat16mf2x4_t, 20, __rvv_float16mf2x4_t, vfloat16mf2_t, 
float, 4, _f16mf2x4)
-DEF_RVV_TUPLE_TYPE (vfloat16mf2x5_t, 20, __rvv_float16mf2x5_t, vfloat16mf2_t, 
float, 5, _f16mf2x5)
-DEF_RVV_TUPLE_TYPE (vfloat16mf2x6_t, 20, 

Re: Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-25 Thread juzhe.zh...@rivai.ai
Yes, I agree.

I didn't look at the spec, so I'm not sure whether fcsr has an immediate form.

What I mean is that this patch's change to 'fcsr' is quite confusing.

You should either fix the assembly code generation, if fcsr has an immediate
form,

or fix both the predicate and the constraint (not the constraint alone).

Thanks.


juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2023-07-26 11:45
To: juzhe.zh...@rivai.ai
CC: jinma; gcc-patches; jeffreyalaw; palmer; richard.sandiford; 
philipp.tomsich; christoph.muellner; Robin Dapp; jinma.contrib
Subject: Re: Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using 
immediate.
So I guess you should change `fscsr` to `fscsr%i0` instead of dropping
K from the constraint list?
 
On Wed, Jul 26, 2023 at 11:42 AM juzhe.zh...@rivai.ai
 wrote:
>
> I don't understand:
>  (define_insn "riscv_fscsr"
> -  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSCSR)]
> +  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] UNSPECV_FSCSR)]
>"TARGET_HARD_FLOAT || TARGET_ZFINX"
>"fscsr\t%0")
>
> This pattern never allows immediate in the constraint. Why still make 
> predicate allow immediate?
>
>
>
>
> juzhe.zh...@rivai.ai
>
> From: Jin Ma
> Date: 2023-07-26 11:33
> To: gcc-patches; juzhe.zh...@rivai.ai
> CC: jeffreyalaw; palmer; richard.sandiford; kito.cheng; philipp.tomsich; 
> christoph.muellner; Robin Dapp; jinma.contrib
> Subject: Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using 
> immediate.
> > -  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] 
> > UNSPECV_FSCSR)]
> > +  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] 
> > UNSPECV_FSCSR)]
> >
> > If you don't allow immediate value in range 0 ~ 31, it should be 
> > "register_operand" instead of "csr_operand".
> >
> >
>
> I think directives that support the immediate pattern might be better, on the 
> one
> hand fsflagsi are supported in the manual, on the other hand fsflagsi can be
> slightly faster than fsflags.
>
> Regards
> Jin
>
> >
> > juzhe.zh...@rivai.ai
> >
 


Re: Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-25 Thread Kito Cheng via Gcc-patches
So I guess you should change `fscsr` to `fscsr%i0` instead of dropping
K from the constraint list?
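
For anyone following along, here is a minimal sketch in C of what an `%i0`
piece in an output template does, assuming it behaves as in the
`fsflags%i0` change from the patch: it adds an "i" to the mnemonic only
when operand 0 is not a register.  The `struct operand` type and the
`emit_csr_write` helper are hypothetical stand-ins, not GCC code, and
whether fscsr has an immediate form at all is still the open question in
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an RTL operand: either a register or a
   small immediate.  This is not a GCC type.  */
struct operand
{
  bool is_reg;
  int value;	/* Register number, or the immediate itself.  */
};

/* Model of an output template such as "fsflags%i0\t%0": the "%i0"
   piece contributes an "i" suffix only when the operand is not a
   register.  Returns the number of characters written.  */
static int
emit_csr_write (char *buf, size_t len, const char *mnemonic,
		struct operand op)
{
  assert (buf && mnemonic);
  if (op.is_reg)
    return snprintf (buf, len, "%s\tx%d", mnemonic, op.value);
  return snprintf (buf, len, "%si\t%d", mnemonic, op.value);
}
```

With "fsflags" as the mnemonic, a register operand keeps the plain
spelling while an immediate operand selects the "fsflagsi" spelling.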

On Wed, Jul 26, 2023 at 11:42 AM juzhe.zh...@rivai.ai
 wrote:
>
> I don't understand:
>  (define_insn "riscv_fscsr"
> -  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSCSR)]
> +  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] UNSPECV_FSCSR)]
>"TARGET_HARD_FLOAT || TARGET_ZFINX"
>"fscsr\t%0")
>
> This pattern never allows immediate in the constraint. Why still make 
> predicate allow immediate?
>
>
>
>
> juzhe.zh...@rivai.ai
>
> From: Jin Ma
> Date: 2023-07-26 11:33
> To: gcc-patches; juzhe.zh...@rivai.ai
> CC: jeffreyalaw; palmer; richard.sandiford; kito.cheng; philipp.tomsich; 
> christoph.muellner; Robin Dapp; jinma.contrib
> Subject: Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using 
> immediate.
> > -  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] 
> > UNSPECV_FSCSR)]
> > +  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] 
> > UNSPECV_FSCSR)]
> >
> > If you don't allow immediate value in range 0 ~ 31, it should be 
> > "register_operand" instead of "csr_operand".
> >
> >
>
> I think directives that support the immediate pattern might be better, on the 
> one
> hand fsflagsi are supported in the manual, on the other hand fsflagsi can be
> slightly faster than fsflags.
>
> Regards
> Jin
>
> >
> > juzhe.zh...@rivai.ai
> >


Re: Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-25 Thread juzhe.zh...@rivai.ai
I don't understand:
 (define_insn "riscv_fscsr"
-  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSCSR)]
+  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] UNSPECV_FSCSR)]
   "TARGET_HARD_FLOAT || TARGET_ZFINX"
   "fscsr\t%0")

This pattern never allows immediate in the constraint. Why still make predicate 
allow immediate?
 



juzhe.zh...@rivai.ai
 
From: Jin Ma
Date: 2023-07-26 11:33
To: gcc-patches; juzhe.zh...@rivai.ai
CC: jeffreyalaw; palmer; richard.sandiford; kito.cheng; philipp.tomsich; 
christoph.muellner; Robin Dapp; jinma.contrib
Subject: Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using 
immediate.
> -  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSCSR)]
> +  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] UNSPECV_FSCSR)]
> 
> If you don't allow immediate value in range 0 ~ 31, it should be 
> "register_operand" instead of "csr_operand".
> 
> 
 
I think directives that support the immediate pattern might be better, on the 
one
hand fsflagsi are supported in the manual, on the other hand fsflagsi can be
slightly faster than fsflags.
 
Regards
Jin
 
> 
> juzhe.zh...@rivai.ai
>


Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-25 Thread Jin Ma via Gcc-patches
> -  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSCSR)]
> +  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] UNSPECV_FSCSR)]
> 
> If you don't allow immediate value in range 0 ~ 31, it should be 
> "register_operand" instead of "csr_operand".
> 
> 

I think an instruction form that supports immediates might be better: on the one
hand, fsflagsi is documented in the manual; on the other hand, fsflagsi can be
slightly faster than fsflags.

Regards
Jin

> 
> juzhe.zh...@rivai.ai
>

Re: RISC-V: Folding memory for FP + constant case

2023-07-25 Thread Jeff Law via Gcc-patches




On 7/25/23 05:24, Jivan Hakobyan wrote:

Hi.

I re-run the benchmarks and hopefully got the same profit.
I also compared the leela's code and figured out the reason.

Actually, my and Manolis's patches do the same thing. The difference is 
only execution order.
But shouldn't your patch also allow for at least the potential to 
pull the fp+offset computation out of a loop?  I'm pretty sure Manolis's 
patch can't do that.


Because of f-m-o held after the register allocation it cannot eliminate 
redundant move 'sp' to another register.
Actually that's supposed to be handled by a different patch that should 
already be upstream.  Specifically;



commit 6a2e8dcbbd4bab374b27abea375bf7a921047800
Author: Manolis Tsamis 
Date:   Thu May 25 13:44:41 2023 +0200

cprop_hardreg: Enable propagation of the stack pointer if possible

Propagation of the stack pointer in cprop_hardreg is currently
forbidden in all cases, due to maybe_mode_change returning NULL.
Relax this restriction and allow propagation when no mode change is
requested.

gcc/ChangeLog:

* regcprop.cc (maybe_mode_change): Enable stack pointer
propagation.
I think there were a couple of follow-ups.  But that's the key change that 
should allow propagation of copies from the stack pointer and thus 
eliminate the mov gpr,sp instructions.  If that's not happening, then 
it's worth investigating why.




Besides that, I have checked the build failure on x264_r. It is already 
fixed on the third version.

Yea, this was a problem with re-recognition.  I think it was fixed by:


commit ecfa870ff29d979bd2c3d411643b551f2b6915b0
Author: Vineet Gupta 
Date:   Thu Jul 20 11:15:37 2023 -0700

RISC-V: optim const DF +0.0 store to mem [PR/110748]

Fixes: ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT")

DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it.

[ ... ]


So I think the big question WRT your patch is does it still help the 
case where we weren't pulling the fp+offset computation out of a loop.


Jeff


Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]

2023-07-25 Thread Jeff Law via Gcc-patches




On 7/25/23 17:05, Palmer Dabbelt wrote:

On Fri, 21 Jul 2023 11:47:58 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

On 7/21/23 12:31, Palmer Dabbelt wrote:

(define_expand "len_mask_gather_load"
   [(match_operand:VNX1_QHSD 0 "register_operand")
-   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:P 1 "pmode_reg_or_0_operand")
    (match_operand:VNX1_QHSDI 2 "register_operand")
    (match_operand 3 "")
    (match_operand 4 "")

a bunch of times, as there's a ton of them?  I'm not entirely sure if that
could manifest as an actual bug, though...

But won't this cause (const_int 0) to no longer match because CONST_INT
nodes are modeless (VOIDmode)?


I poked around a bit and I'm not actually sure, I'm kind of lost on the docs
here.  IIUC we're eliding the VOIDmode in the predicate correctly

    (define_predicate "const_0_operand"
  (and (match_code "const_int,const_wide_int,const_vector")
   (match_test "op == CONST0_RTX (GET_MODE (op))")))

so we're OK there, otherwise we'd presumably have similar problems with
expanders like

    (define_expand "subsi3"
  [(set (match_operand:SI   0 "register_operand" "= r")
   (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
     (match_operand:SI 2 "register_operand" "  r")))]
  ""

which we have a few of -- though it'd be kind of a silent failure, as
presumably we'd just end up with some more move-x0s emitted?
It's a bit messy to say the least.  However, we can look at other ports 
and after doing so I'm less sure my concern is valid.


Take the typical movXX pattern or expander.  Both operands have a mode, 
so things like CONST_INT must be passing through, even though they're 
VOIDmode.


So it's probably a non-issue.
jeff
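
A rough model in C of why a modeless constant can still pass a
mode-qualified match_operand, assuming the predicate's mode test treats a
VOIDmode operand as compatible with whatever mode is requested.
Everything below (the enums, `struct rtx_model`, the function names) is an
illustrative stand-in rather than GCC's real data structures.

```c
#include <stdbool.h>

/* Illustrative stand-ins only; GCC's real enums are far larger.  */
enum mode_model { VOIDmode, SImode, DImode };
enum code_model { REG, CONST_INT };

struct rtx_model
{
  enum code_model code;
  enum mode_model mode;	/* CONST_INT operands carry VOIDmode.  */
  long value;
};

/* Assumed mode test: a VOIDmode operand is accepted for any requested
   mode, and a VOIDmode request accepts any operand.  */
static bool
mode_matches (struct rtx_model op, enum mode_model wanted)
{
  return wanted == VOIDmode || op.mode == wanted || op.mode == VOIDmode;
}

/* Model of a "register or zero" predicate used with an explicit mode.  */
static bool
reg_or_0_operand_p (struct rtx_model op, enum mode_model wanted)
{
  if (!mode_matches (op, wanted))
    return false;
  if (op.code == REG)
    return true;
  return op.code == CONST_INT && op.value == 0;
}
```

Under that assumption (const_int 0) is accepted for a DImode operand,
while an SImode register is still rejected there.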




[Bug target/110776] [14 Regression] powerpc-darwin bootstrap broken after r14-2490 with ICE rs6000.cc:5069 building libgfortran

2023-07-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110776

--- Comment #9 from Kewen Lin  ---
(In reply to Iain Sandoe from comment #8)
> (In reply to rguent...@suse.de from comment #7)
> > On Tue, 25 Jul 2023, linkw at gcc dot gnu.org wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110776
> > > 
> > > --- Comment #6 from Kewen Lin  ---
> > > (In reply to rguent...@suse.de from comment #5)
> > > > On Tue, 25 Jul 2023, linkw at gcc dot gnu.org wrote:
> > > > 
> > > > I think apart from the consideration what a single element vector
> > > > is compared to a scalar, a more to-the-point fix is
> > > > 
> > > >   if (VECTOR_TYPE_P (ltype)
> > > >   && memory_access_type != VMAT_ELEMENTWISE)
> > > 
> > > Thanks for the suggestion! I thought checking lnel can also cover
> > > VMAT_STRIDED_SLP's special case having const_nunits 1, but it seems 
> > > impossible
> > > to have?
> > 
> > I think so, unless I'm convinced with a testcase ;)

I guess there is no such test case. ;)

> 
> (sorry for being a bit slow - we had a power outage that wasted most of the
> day)
> 
> Richi's suggested patch fixes build of a cross-build for powerpc-darwin and
> the test results look OK too.  A non-expert look at the code suggests that
> VMAT_ELEMENTWISE is already accounted for on the write side, so that we
> should not see a call to the costing code for the equivalent write-side.

Thanks Iain, I also bootstrapped and regtested the suggested fix on x86 and
powerpc64{,le}, just posted it for review at
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625484.html.

[PATCH] rs6000: Correct vsx operands output for xxeval [PR110741]

2023-07-25 Thread Kewen.Lin via Gcc-patches
Hi,

PR110741 exposes an issue: we didn't use the correct modifier
character for VSX operands in output operand substitution,
so an operand can map to the wrong register, one that holds
an unexpected value.
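
As a small illustration of the numbering problem (a sketch in C, assuming
the usual VSX layout in which VSX registers 0-31 overlay the FPRs and
32-63 overlay the Altivec registers; the enum and function names are
hypothetical, not rs6000 code):

```c
#include <assert.h>

/* Which architected register file an operand was allocated to.  */
enum reg_file { REG_FPR, REG_ALTIVEC };

/* Plain operand output: only the index within its own register file.  */
static unsigned
plain_number (enum reg_file file, unsigned index)
{
  assert (index < 32);
  (void) file;
  return index;
}

/* VSX-context output: Altivec registers are offset by 32.  */
static unsigned
vsx_number (enum reg_file file, unsigned index)
{
  assert (index < 32);
  return file == REG_ALTIVEC ? index + 32 : index;
}
```

For an operand living in Altivec register 5, the plain number is 5, which
a VSX instruction would read as the register overlaying FPR 5, whereas the
VSX number is 37.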

Bootstrapped and regress-tested on powerpc64-linux-gnu
P7/P8/P9 and powerpc64le-linux-gnu P9/P10.

I'll push this soon and backport to release branches after
a week or so.

BR,
Kewen
-
PR target/110741

gcc/ChangeLog:

* config/rs6000/vsx.md (define_insn xxeval): Correct vsx
operands output with "x".

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr110741.C: New test.
---
 gcc/config/rs6000/vsx.md|   2 +-
 gcc/testsuite/g++.target/powerpc/pr110741.C | 552 
 2 files changed, 553 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/pr110741.C

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0c269e4e8d9..1a87f1c0b63 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6586,7 +6586,7 @@ (define_insn "xxeval"
  (match_operand:QI 4 "u8bit_cint_operand" "n")]
 UNSPEC_XXEVAL))]
"TARGET_POWER10"
-   "xxeval %0,%1,%2,%3,%4"
+   "xxeval %x0,%x1,%x2,%x3,%4"
[(set_attr "type" "vecperm")
 (set_attr "prefixed" "yes")])

diff --git a/gcc/testsuite/g++.target/powerpc/pr110741.C 
b/gcc/testsuite/g++.target/powerpc/pr110741.C
new file mode 100644
index 000..0214936b06d
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr110741.C
@@ -0,0 +1,552 @@
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+
+#include <altivec.h>
+
+typedef unsigned char uint8_t;
+
+template <uint8_t kTernLogOp>
+static inline vector unsigned long long
+VSXTernaryLogic (vector unsigned long long a, vector unsigned long long b,
+vector unsigned long long c)
+{
+  return vec_ternarylogic (a, b, c, kTernLogOp);
+}
+
+static vector unsigned long long
+VSXTernaryLogic (vector unsigned long long a, vector unsigned long long b,
+vector unsigned long long c, int ternary_logic_op)
+{
+  switch (ternary_logic_op & 0xFF)
+{
+case 0:
+  return VSXTernaryLogic<0> (a, b, c);
+case 1:
+  return VSXTernaryLogic<1> (a, b, c);
+case 2:
+  return VSXTernaryLogic<2> (a, b, c);
+case 3:
+  return VSXTernaryLogic<3> (a, b, c);
+case 4:
+  return VSXTernaryLogic<4> (a, b, c);
+case 5:
+  return VSXTernaryLogic<5> (a, b, c);
+case 6:
+  return VSXTernaryLogic<6> (a, b, c);
+case 7:
+  return VSXTernaryLogic<7> (a, b, c);
+case 8:
+  return VSXTernaryLogic<8> (a, b, c);
+case 9:
+  return VSXTernaryLogic<9> (a, b, c);
+case 10:
+  return VSXTernaryLogic<10> (a, b, c);
+case 11:
+  return VSXTernaryLogic<11> (a, b, c);
+case 12:
+  return VSXTernaryLogic<12> (a, b, c);
+case 13:
+  return VSXTernaryLogic<13> (a, b, c);
+case 14:
+  return VSXTernaryLogic<14> (a, b, c);
+case 15:
+  return VSXTernaryLogic<15> (a, b, c);
+case 16:
+  return VSXTernaryLogic<16> (a, b, c);
+case 17:
+  return VSXTernaryLogic<17> (a, b, c);
+case 18:
+  return VSXTernaryLogic<18> (a, b, c);
+case 19:
+  return VSXTernaryLogic<19> (a, b, c);
+case 20:
+  return VSXTernaryLogic<20> (a, b, c);
+case 21:
+  return VSXTernaryLogic<21> (a, b, c);
+case 22:
+  return VSXTernaryLogic<22> (a, b, c);
+case 23:
+  return VSXTernaryLogic<23> (a, b, c);
+case 24:
+  return VSXTernaryLogic<24> (a, b, c);
+case 25:
+  return VSXTernaryLogic<25> (a, b, c);
+case 26:
+  return VSXTernaryLogic<26> (a, b, c);
+case 27:
+  return VSXTernaryLogic<27> (a, b, c);
+case 28:
+  return VSXTernaryLogic<28> (a, b, c);
+case 29:
+  return VSXTernaryLogic<29> (a, b, c);
+case 30:
+  return VSXTernaryLogic<30> (a, b, c);
+case 31:
+  return VSXTernaryLogic<31> (a, b, c);
+case 32:
+  return VSXTernaryLogic<32> (a, b, c);
+case 33:
+  return VSXTernaryLogic<33> (a, b, c);
+case 34:
+  return VSXTernaryLogic<34> (a, b, c);
+case 35:
+  return VSXTernaryLogic<35> (a, b, c);
+case 36:
+  return VSXTernaryLogic<36> (a, b, c);
+case 37:
+  return VSXTernaryLogic<37> (a, b, c);
+case 38:
+  return VSXTernaryLogic<38> (a, b, c);
+case 39:
+  return VSXTernaryLogic<39> (a, b, c);
+case 40:
+  return VSXTernaryLogic<40> (a, b, c);
+case 41:
+  return VSXTernaryLogic<41> (a, b, c);
+case 42:
+  return VSXTernaryLogic<42> (a, b, c);
+case 43:
+  return VSXTernaryLogic<43> (a, b, c);
+case 44:
+  return VSXTernaryLogic<44> (a, b, c);
+case 45:
+  return VSXTernaryLogic<45> (a, b, c);
+case 46:
+  return VSXTernaryLogic<46> (a, b, c);
+case 47:
+  return VSXTernaryLogic<47> (a, b, c);
+case 48:
+  

[PATCH] vect: Treat VMAT_ELEMENTWISE as scalar load in costing [PR110776]

2023-07-25 Thread Kewen.Lin via Gcc-patches
Hi,

PR110776 exposes an issue: we can end up querying unaligned
load support for a vector type when no unaligned vector load
is actually supported.  The costed load has a single-lane
vector type and its memory access type is VMAT_ELEMENTWISE,
so we actually treat it as a scalar load and set its
alignment_support_scheme to dr_unaligned_supported.

To avoid the exposed ICE, following Richi's suggestion,
this patch makes VMAT_ELEMENTWISE always be costed as a
scalar load.
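
The costing decision can be sketched in isolation as follows.  This is a
simplified C model, not the vectorizer code: the enums and
`classify_load_cost` are made up for illustration, and only the condition
mirrors the change to vectorizable_load.

```c
#include <stdbool.h>

/* Hypothetical reduced enums; the enumerator names follow the memory
   access types mentioned above.  */
enum access_model { VMAT_ELEMENTWISE, VMAT_STRIDED_SLP };
enum load_cost_kind { COST_SCALAR_LOAD, COST_VECTOR_LOAD };

/* Cost as a vector load only when the load type is a vector type and
   the access is not element-wise; everything else is a scalar load.  */
static enum load_cost_kind
classify_load_cost (bool ltype_is_vector, enum access_model access)
{
  if (ltype_is_vector && access != VMAT_ELEMENTWISE)
    return COST_VECTOR_LOAD;
  return COST_SCALAR_LOAD;
}
```

A single-lane vector type with VMAT_ELEMENTWISE therefore takes the scalar
path and never reaches the query for unaligned vector load support.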

Bootstrapped and regress-tested on x86_64-redhat-linux,
powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9/P10.

Is it ok for trunk?

BR,
Kewen
-

Co-authored-by: Richard Biener 

PR tree-optimization/110776

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_load): Always cost VMAT_ELEMENTWISE
as scalar load.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr110776.c: New test.
---
 gcc/testsuite/gcc.target/powerpc/pr110776.c | 22 +
 gcc/tree-vect-stmts.cc  |  5 -
 2 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr110776.c

diff --git a/gcc/testsuite/gcc.target/powerpc/pr110776.c 
b/gcc/testsuite/gcc.target/powerpc/pr110776.c
new file mode 100644
index 000..749159fd675
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr110776.c
@@ -0,0 +1,22 @@
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power6 -maltivec" } */
+
+/* Verify there is no ICE.  */
+
+int a;
+long *b;
+int
+c ()
+{
+  long e;
+  int d = 0;
+  for (long f; f; f++)
+{
+  e = b[f * a];
+  if (e)
+   d = 1;
+}
+  if (d)
+for (;;)
+  ;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index ed28fbdced3..09705200594 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9840,7 +9840,10 @@ vectorizable_load (vec_info *vinfo,
{
  if (costing_p)
{
- if (VECTOR_TYPE_P (ltype))
+ /* For VMAT_ELEMENTWISE, just cost it as scalar_load to
+avoid ICE, see PR110776.  */
+ if (VECTOR_TYPE_P (ltype)
+ && memory_access_type != VMAT_ELEMENTWISE)
vect_get_load_cost (vinfo, stmt_info, 1,
alignment_support_scheme, misalignment,
false, _cost, nullptr, cost_vec,
--
2.39.1


Re: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-25 Thread juzhe.zh...@rivai.ai
-  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSCSR)]
+  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] UNSPECV_FSCSR)]

If you don't allow immediate value in range 0 ~ 31, it should be 
"register_operand" instead of "csr_operand".
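
To illustrate the mismatch with a simplified model in C (not GCC code):
the predicate decides which operands the pattern accepts at all, while the
constraint decides which of those an alternative can actually handle.  The
sketch assumes "csr_operand" accepts a register or an immediate in 0 ~ 31,
that 'r' means a register and 'K' means such an immediate; the struct and
function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand: a register or an integer immediate.  */
struct operand
{
  bool is_reg;
  long value;
};

/* Model of the "csr_operand" predicate.  */
static bool
csr_operand_p (struct operand op)
{
  return op.is_reg || (op.value >= 0 && op.value <= 31);
}

/* Model of the "register_operand" predicate.  */
static bool
register_operand_p (struct operand op)
{
  return op.is_reg;
}

/* Model of a constraint string built from 'r' and 'K'.  */
static bool
constraint_ok (const char *constraints, struct operand op)
{
  assert (constraints);
  for (const char *c = constraints; *c; c++)
    {
      if (*c == 'r' && op.is_reg)
	return true;
      if (*c == 'K' && !op.is_reg && op.value >= 0 && op.value <= 31)
	return true;
    }
  return false;
}
```

With "csr_operand" and constraint "r", the immediate 4 passes the
predicate but no alternative accepts it; pairing "register_operand" with
"r", or "csr_operand" with "rK", keeps the two in agreement.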



juzhe.zh...@rivai.ai
 
From: Jin Ma
Date: 2023-07-26 10:17
To: gcc-patches
CC: jeffreyalaw; palmer; richard.sandiford; kito.cheng; philipp.tomsich; 
christoph.muellner; rdapp.gcc; juzhe.zhong; jinma.contrib; Jin Ma
Subject: [PATCH v3] RISC-V: Fixbug for fsflags instruction error using 
immediate.
The pattern mistakenly believes that fsflags can use immediate numbers,
but in fact it does not support it. Immediate numbers should use fsflagsi.
 
For example:
__builtin_riscv_fsflags(4);
 
The following error occurred.
/tmp/ccoWdWqT.s: Assembler messages:
/tmp/ccoWdWqT.s:14: Error: illegal operands `fsflags 4'
 
gcc/ChangeLog:
 
* config/riscv/riscv.md: Likewise.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/fsflags.c: New test.
---
gcc/config/riscv/riscv.md|  4 ++--
gcc/testsuite/gcc.target/riscv/fsflags.c | 16 
2 files changed, 18 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/fsflags.c
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 4615e811947..74ff9ccc968 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3074,7 +3074,7 @@ (define_insn "riscv_frcsr"
   "frcsr\t%0")
(define_insn "riscv_fscsr"
-  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSCSR)]
+  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] UNSPECV_FSCSR)]
   "TARGET_HARD_FLOAT || TARGET_ZFINX"
   "fscsr\t%0")
@@ -3087,7 +3087,7 @@ (define_insn "riscv_frflags"
(define_insn "riscv_fsflags"
   [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSFLAGS)]
   "TARGET_HARD_FLOAT || TARGET_ZFINX"
-  "fsflags\t%0")
+  "fsflags%i0\t%0")
(define_insn "*riscv_fsnvsnan2"
   [(unspec_volatile [(match_operand:ANYF 0 "register_operand" "f")
diff --git a/gcc/testsuite/gcc.target/riscv/fsflags.c 
b/gcc/testsuite/gcc.target/riscv/fsflags.c
new file mode 100644
index 000..74a97b8a7c7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/fsflags.c
@@ -0,0 +1,16 @@
+/* Verify that fsflags is using the correct register or immediate.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target hard_float } */
+/* { dg-options "-O" } */
+
+void foo1 (int a)
+{
+   __builtin_riscv_fsflags(a);
+}
+void foo2 ()
+{
+   __builtin_riscv_fsflags(4);
+}
+
+/* { dg-final { scan-assembler-times "fsflags\t" 1 } } */
+/* { dg-final { scan-assembler-times "fsflagsi\t" 1 } } */
-- 
2.17.1
 
 


[PATCH v3] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-25 Thread Jin Ma via Gcc-patches
The pattern mistakenly assumes that fsflags accepts an immediate operand,
but it does not; immediates must use fsflagsi instead.

For example:
__builtin_riscv_fsflags(4);

The following error occurred.
/tmp/ccoWdWqT.s: Assembler messages:
/tmp/ccoWdWqT.s:14: Error: illegal operands `fsflags 4'

gcc/ChangeLog:

* config/riscv/riscv.md: Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/fsflags.c: New test.
---
 gcc/config/riscv/riscv.md|  4 ++--
 gcc/testsuite/gcc.target/riscv/fsflags.c | 16 
 2 files changed, 18 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/fsflags.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 4615e811947..74ff9ccc968 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3074,7 +3074,7 @@ (define_insn "riscv_frcsr"
   "frcsr\t%0")
 
 (define_insn "riscv_fscsr"
-  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSCSR)]
+  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "r")] UNSPECV_FSCSR)]
   "TARGET_HARD_FLOAT || TARGET_ZFINX"
   "fscsr\t%0")
 
@@ -3087,7 +3087,7 @@ (define_insn "riscv_frflags"
 (define_insn "riscv_fsflags"
   [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSFLAGS)]
   "TARGET_HARD_FLOAT || TARGET_ZFINX"
-  "fsflags\t%0")
+  "fsflags%i0\t%0")
 
 (define_insn "*riscv_fsnvsnan2"
   [(unspec_volatile [(match_operand:ANYF 0 "register_operand" "f")
diff --git a/gcc/testsuite/gcc.target/riscv/fsflags.c 
b/gcc/testsuite/gcc.target/riscv/fsflags.c
new file mode 100644
index 000..74a97b8a7c7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/fsflags.c
@@ -0,0 +1,16 @@
+/* Verify that fsflags is using the correct register or immediate.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target hard_float } */
+/* { dg-options "-O" } */
+
+void foo1 (int a)
+{
+   __builtin_riscv_fsflags(a);
+}
+void foo2 ()
+{
+   __builtin_riscv_fsflags(4);
+}
+
+/* { dg-final { scan-assembler-times "fsflags\t" 1 } } */
+/* { dg-final { scan-assembler-times "fsflagsi\t" 1 } } */
-- 
2.17.1



Re: [PATCH] c++: fix ICE with is_really_empty_class [PR110106]

2023-07-25 Thread Jason Merrill via Gcc-patches

On 7/25/23 16:30, Marek Polacek wrote:

On Tue, Jul 25, 2023 at 04:24:39PM -0400, Jason Merrill wrote:

On 7/25/23 15:59, Marek Polacek wrote:

Something like this, then?  I see that cp_parser_initializer_clause et al
offer further opportunities (because they sometimes use a dummy too) but
this should be a good start.


Looks good.  Please do update the other callers as well, while you're
looking at this.


Thanks.  Can I push this part first?


Ah, sure.  I had thought the other callers would be trivial to add.

Jason



Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-25 Thread Hao Liu OS via Gcc-patches
> When was STMT_VINFO_REDUC_DEF empty?  I just want to make sure that we're not 
> papering over an issue elsewhere.

Yes, I also wonder if this is an issue in vectorizable_reduction.  Below is the 
gimple of "gcc.target/aarch64/sve/cost_model_13.c":

  :
  # res_18 = PHI 
  # i_20 = PHI 
  _1 = (long unsigned int) i_20;
  _2 = _1 * 2;
  _3 = x_14(D) + _2;
  _4 = *_3;
  _5 = (unsigned short) _4;
  res.0_6 = (unsigned short) res_18;
  _7 = _5 + res.0_6; <-- The current stmt_info
  res_15 = (short int) _7;
  i_16 = i_20 + 1;
  if (n_11(D) > i_16)
goto ;
  else
goto ;

  :
  goto ;

It looks like STMT_VINFO_REDUC_DEF should be "res_18 = PHI "?
The status here is:
  STMT_VINFO_REDUC_IDX (stmt_info): 1
  STMT_VINFO_REDUC_TYPE (stmt_info): TREE_CODE_REDUCTION
  STMT_VINFO_REDUC_VECTYPE (stmt_info): 0x0

Thanks,
Hao
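
For reference, the latency rule from the patch quoted below can be
sketched in plain C.  This is a simplified model under the assumption that
only the single def-use cycle case multiplies by the copy count; the
function names are hypothetical and none of this is the aarch64.cc code
itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Return the larger of two unsigned values.  */
static unsigned
max_u (unsigned a, unsigned b)
{
  return a > b ? a : b;
}

/* Update the running reduction latency.  BASE is the latency of one
   reduction operation and COUNT is the number of vector copies.  Only
   a single def-use cycle serializes all COUNT copies.  */
static unsigned
update_reduction_latency (unsigned current, unsigned base,
			  unsigned count, bool single_defuse_cycle)
{
  assert (count > 0);
  unsigned latency = single_defuse_cycle ? base * count : base;
  return max_u (current, latency);
}
```

With base 2 and count 4, the single def-use cycle case gives 8 while the
other cases stay at 2.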


From: Richard Sandiford 
Sent: Tuesday, July 25, 2023 17:44
To: Hao Liu OS
Cc: GCC-patches@gcc.gnu.org
Subject: Re: [PATCH] AArch64: Do not increase the vect reduction latency by 
multiplying count [PR110625]

Hao Liu OS  writes:
> Hi,
>
> Thanks for the suggestion.  I tested it and found a gcc_assert failure:
> gcc.target/aarch64/sve/cost_model_13.c (internal compiler error: in 
> info_for_reduction, at tree-vect-loop.cc:5473)
>
> It is caused by empty STMT_VINFO_REDUC_DEF.

When was STMT_VINFO_REDUC_DEF empty?  I just want to make sure that
we're not papering over an issue elsewhere.

Thanks,
Richard

> So, I added an extra check before checking single_defuse_cycle. The updated 
> patch is below.  Is it OK for trunk?
>
> ---
>
> The new costs should only count reduction latency by multiplying count for
> single_defuse_cycle.  For other situations, this will increase the reduction
> latency a lot and miss vectorization opportunities.
>
> Tested on aarch64-linux-gnu.
>
> gcc/ChangeLog:
>
>   PR target/110625
>   * config/aarch64/aarch64.cc (count_ops): Only '* count' for
>   single_defuse_cycle while counting reduction_latency.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/pr110625_1.c: New testcase.
>   * gcc.target/aarch64/pr110625_2.c: New testcase.
> ---
>  gcc/config/aarch64/aarch64.cc | 13 --
>  gcc/testsuite/gcc.target/aarch64/pr110625_1.c | 46 +++
>  gcc/testsuite/gcc.target/aarch64/pr110625_2.c | 14 ++
>  3 files changed, 69 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr110625_1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr110625_2.c
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 560e5431636..478a4e00110 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -16788,10 +16788,15 @@ aarch64_vector_costs::count_ops (unsigned int 
> count, vect_cost_for_stmt kind,
>  {
>unsigned int base
>   = aarch64_in_loop_reduction_latency (m_vinfo, stmt_info, m_vec_flags);
> -
> -  /* ??? Ideally we'd do COUNT reductions in parallel, but unfortunately
> -  that's not yet the case.  */
> -  ops->reduction_latency = MAX (ops->reduction_latency, base * count);
> +  if (STMT_VINFO_REDUC_DEF (stmt_info)
> +   && STMT_VINFO_FORCE_SINGLE_CYCLE (
> + info_for_reduction (m_vinfo, stmt_info)))
> + /* ??? Ideally we'd use a tree to reduce the copies down to 1 vector,
> +and then accumulate that, but at the moment the loop-carried
> +dependency includes all copies.  */
> + ops->reduction_latency = MAX (ops->reduction_latency, base * count);
> +  else
> + ops->reduction_latency = MAX (ops->reduction_latency, base);
>  }
>
>/* Assume that multiply-adds will become a single operation.  */
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr110625_1.c 
> b/gcc/testsuite/gcc.target/aarch64/pr110625_1.c
> new file mode 100644
> index 000..0965cac33a0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr110625_1.c
> @@ -0,0 +1,46 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -mcpu=neoverse-n2 -fdump-tree-vect-details 
> -fno-tree-slp-vectorize" } */
> +/* { dg-final { scan-tree-dump-not "reduction latency = 8" "vect" } } */
> +
> +/* Do not increase the vector body cost due to the incorrect reduction 
> latency
> +Original vector body cost = 51
> +Scalar issue estimate:
> +  ...
> +  reduction latency = 2
> +  estimated min cycles per iteration = 2.00
> +  estimated cycles per vector iteration (for VF 2) = 4.00
> +Vector issue estimate:
> +  ...
> +  reduction latency = 8  <-- Too large
> +  estimated min cycles per iteration = 8.00
> +Increasing body cost to 102 because scalar code would issue more quickly
> +  ...
> +missed:  cost model: the vector iteration cost = 102 divided by the 
> scalar iteration cost = 44 is greater or equal to the vectorization factor = 
> 2.
> 

Re: [PATCH v5 0/3] c++: Track lifetimes in constant evaluation [PR70331, ...]

2023-07-25 Thread Jason Merrill via Gcc-patches

On 7/22/23 11:12, Nathaniel Shead wrote:

This is an update of the patch series at
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625050.html


I applied the patches, with an addition to the first patch to fix 
constexpr-mutable3.C in C++11 mode, which was not part of the default 
std set.  And fixed the testsuite to run that test (and others that test 
c++11_only behavior) in C++11 mode.  Thanks!


FWIW, I test C++ patches with GXX_TESTSUITE_STDS=98,11,14,17,20,impcx 
for more coverage.



Changes since v4:

- Reordered patches to be more independent from each other (they don't need
   to keep updating the new tests)
- Removed workaround for better locations in cxx_eval_store_expression
- Don't bother checking lifetime for CONST_DECLs
- Rewrite patch for dangling pointers to keep the transformation to
   `return (, nullptr)`, but only perform it when genericising. It turns out
   that implementing this wasn't as hard as I thought it might be, at least for
   this specific case.

Thanks very much for all the reviews and comments so far!

Bootstrapped and regtested on x86_64-pc-linux-gnu.

Nathaniel Shead (3):
   c++: Improve location information in constant evaluation
   c++: Prevent dangling pointers from becoming nullptr in constexpr
 [PR110619]
   c++: Track lifetimes in constant evaluation [PR70331,PR96630,PR98675]

  gcc/cp/constexpr.cc   | 159 +-
  gcc/cp/cp-gimplify.cc |  23 ++-
  gcc/cp/cp-tree.h  |   8 +-
  gcc/cp/semantics.cc   |   4 +-
  gcc/cp/typeck.cc  |   9 +-
  gcc/testsuite/g++.dg/cpp0x/constexpr-48089.C  |  10 +-
  gcc/testsuite/g++.dg/cpp0x/constexpr-70323.C  |   8 +-
  gcc/testsuite/g++.dg/cpp0x/constexpr-70323a.C |   8 +-
  .../g++.dg/cpp0x/constexpr-delete2.C  |   5 +-
  gcc/testsuite/g++.dg/cpp0x/constexpr-diag3.C  |   2 +-
  gcc/testsuite/g++.dg/cpp0x/constexpr-ice20.C  |   1 +
  .../g++.dg/cpp0x/constexpr-recursion.C|   6 +-
  gcc/testsuite/g++.dg/cpp0x/overflow1.C|   2 +-
  gcc/testsuite/g++.dg/cpp1y/constexpr-110619.C |  10 ++
  gcc/testsuite/g++.dg/cpp1y/constexpr-89285.C  |   5 +-
  gcc/testsuite/g++.dg/cpp1y/constexpr-89481.C  |   3 +-
  .../g++.dg/cpp1y/constexpr-lifetime1.C|  13 ++
  .../g++.dg/cpp1y/constexpr-lifetime2.C|  20 +++
  .../g++.dg/cpp1y/constexpr-lifetime3.C|  13 ++
  .../g++.dg/cpp1y/constexpr-lifetime4.C|  11 ++
  .../g++.dg/cpp1y/constexpr-lifetime5.C|  11 ++
  .../g++.dg/cpp1y/constexpr-lifetime6.C|  15 ++
  .../g++.dg/cpp1y/constexpr-tracking-const14.C |   3 +-
  .../g++.dg/cpp1y/constexpr-tracking-const16.C |   3 +-
  .../g++.dg/cpp1y/constexpr-tracking-const18.C |   4 +-
  .../g++.dg/cpp1y/constexpr-tracking-const19.C |   4 +-
  .../g++.dg/cpp1y/constexpr-tracking-const21.C |   4 +-
  .../g++.dg/cpp1y/constexpr-tracking-const22.C |   4 +-
  .../g++.dg/cpp1y/constexpr-tracking-const3.C  |   3 +-
  .../g++.dg/cpp1y/constexpr-tracking-const4.C  |   3 +-
  .../g++.dg/cpp1y/constexpr-tracking-const7.C  |   3 +-
  gcc/testsuite/g++.dg/cpp1y/constexpr-union5.C |   4 +-
  gcc/testsuite/g++.dg/cpp1y/pr68180.C  |   4 +-
  .../g++.dg/cpp1z/constexpr-lambda6.C  |   4 +-
  .../g++.dg/cpp1z/constexpr-lambda8.C  |   5 +-
  gcc/testsuite/g++.dg/cpp2a/bit-cast11.C   |  10 +-
  gcc/testsuite/g++.dg/cpp2a/bit-cast12.C   |  10 +-
  gcc/testsuite/g++.dg/cpp2a/bit-cast14.C   |  14 +-
  gcc/testsuite/g++.dg/cpp2a/constexpr-98122.C  |   4 +-
  .../g++.dg/cpp2a/constexpr-dynamic17.C|   5 +-
  gcc/testsuite/g++.dg/cpp2a/constexpr-init1.C  |   5 +-
  gcc/testsuite/g++.dg/cpp2a/constexpr-new12.C  |   6 +-
  gcc/testsuite/g++.dg/cpp2a/constexpr-new3.C   |  10 +-
  gcc/testsuite/g++.dg/cpp2a/constinit10.C  |   5 +-
  .../g++.dg/cpp2a/is-corresponding-member4.C   |   4 +-
  gcc/testsuite/g++.dg/ext/constexpr-vla2.C |   4 +-
  gcc/testsuite/g++.dg/ext/constexpr-vla3.C |   4 +-
  gcc/testsuite/g++.dg/ubsan/pr63956.C  |  23 +--
  .../25_algorithms/equal/constexpr_neg.cc  |   7 +-
  .../testsuite/26_numerics/gcd/105844.cc   |  10 +-
  .../testsuite/26_numerics/lcm/105844.cc   |  14 +-
  51 files changed, 361 insertions(+), 168 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-110619.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime1.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime3.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime4.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime5.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime6.C





[Bug c++/98675] Accessing member of temporary outside its lifetime allowed in constexpr function

2023-07-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98675

--- Comment #7 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:9fdbd7d6fa5e0a76898dd66658934e3184111680

commit r14-2773-g9fdbd7d6fa5e0a76898dd66658934e3184111680
Author: Nathaniel Shead 
Date:   Sun Jul 23 01:15:14 2023 +1000

c++: Track lifetimes in constant evaluation [PR70331,PR96630,PR98675]

This adds rudimentary lifetime tracking in C++ constexpr contexts,
allowing the compiler to report errors with using values after their
backing has gone out of scope. We don't yet handle other ways of
accessing values outside their lifetime (e.g. following explicit
destructor calls).

PR c++/96630
PR c++/98675
PR c++/70331

gcc/cp/ChangeLog:

* constexpr.cc (constexpr_global_ctx::is_outside_lifetime): New
function.
(constexpr_global_ctx::get_value): Don't return expired values.
(constexpr_global_ctx::get_value_ptr): Likewise.
(constexpr_global_ctx::remove_value): Mark value outside
lifetime.
(outside_lifetime_error): New function.
(cxx_eval_call_expression): No longer track save_exprs.
(cxx_eval_loop_expr): Likewise.
(cxx_eval_constant_expression): Add checks for outside lifetime
values. Remove local variables at end of bind exprs, and
temporaries after cleanup points.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-lifetime1.C: New test.
* g++.dg/cpp1y/constexpr-lifetime2.C: New test.
* g++.dg/cpp1y/constexpr-lifetime3.C: New test.
* g++.dg/cpp1y/constexpr-lifetime4.C: New test.
* g++.dg/cpp1y/constexpr-lifetime5.C: New test.
* g++.dg/cpp1y/constexpr-lifetime6.C: New test.

Signed-off-by: Nathaniel Shead 
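
For readers following the PRs, the kind of code this commit targets can be sketched as below. This is a minimal illustration, not one of the new constexpr-lifetime tests (their contents are not shown in this message); the function name read_through is hypothetical, and the note about the rejected call is an assumption drawn from the commit message.

```cpp
// Hypothetical example; needs C++14 or later for the relaxed constexpr rules.
constexpr int read_through(bool use_inner) {
  int outer = 7;
  int *p = &outer;
  if (use_inner) {
    int inner = 1;
    p = &inner;   // p now points at a block-scoped variable
  }               // inner's lifetime ends here
  return *p;      // dangling read when use_inner is true
}

// Valid: p still points at 'outer', which is alive at the read.
static_assert(read_through(false) == 7, "outer is still in scope");

// Assumed to be rejected once lifetimes are tracked, because *p reads
// 'inner' after its scope has ended:
// static_assert(read_through(true) == 1, "");
```

Only the false path is evaluated here, so the snippet stays well-formed with or without the patch.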

[Bug c++/96630] dangling reference accepted in constexpr evaluation

2023-07-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96630

--- Comment #1 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:9fdbd7d6fa5e0a76898dd66658934e3184111680

commit r14-2773-g9fdbd7d6fa5e0a76898dd66658934e3184111680
Author: Nathaniel Shead 
Date:   Sun Jul 23 01:15:14 2023 +1000

c++: Track lifetimes in constant evaluation [PR70331,PR96630,PR98675]

This adds rudimentary lifetime tracking in C++ constexpr contexts,
allowing the compiler to report errors with using values after their
backing has gone out of scope. We don't yet handle other ways of
accessing values outside their lifetime (e.g. following explicit
destructor calls).

PR c++/96630
PR c++/98675
PR c++/70331

gcc/cp/ChangeLog:

* constexpr.cc (constexpr_global_ctx::is_outside_lifetime): New
function.
(constexpr_global_ctx::get_value): Don't return expired values.
(constexpr_global_ctx::get_value_ptr): Likewise.
(constexpr_global_ctx::remove_value): Mark value outside
lifetime.
(outside_lifetime_error): New function.
(cxx_eval_call_expression): No longer track save_exprs.
(cxx_eval_loop_expr): Likewise.
(cxx_eval_constant_expression): Add checks for outside lifetime
values. Remove local variables at end of bind exprs, and
temporaries after cleanup points.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-lifetime1.C: New test.
* g++.dg/cpp1y/constexpr-lifetime2.C: New test.
* g++.dg/cpp1y/constexpr-lifetime3.C: New test.
* g++.dg/cpp1y/constexpr-lifetime4.C: New test.
* g++.dg/cpp1y/constexpr-lifetime5.C: New test.
* g++.dg/cpp1y/constexpr-lifetime6.C: New test.

Signed-off-by: Nathaniel Shead 

[Bug c++/70331] missing error dereferencing a dangling pointer (out of scope) in constexpr function

2023-07-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70331

--- Comment #4 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:9fdbd7d6fa5e0a76898dd66658934e3184111680

commit r14-2773-g9fdbd7d6fa5e0a76898dd66658934e3184111680
Author: Nathaniel Shead 
Date:   Sun Jul 23 01:15:14 2023 +1000

c++: Track lifetimes in constant evaluation [PR70331,PR96630,PR98675]

This adds rudimentary lifetime tracking in C++ constexpr contexts,
allowing the compiler to report errors with using values after their
backing has gone out of scope. We don't yet handle other ways of
accessing values outside their lifetime (e.g. following explicit
destructor calls).

PR c++/96630
PR c++/98675
PR c++/70331

gcc/cp/ChangeLog:

* constexpr.cc (constexpr_global_ctx::is_outside_lifetime): New
function.
(constexpr_global_ctx::get_value): Don't return expired values.
(constexpr_global_ctx::get_value_ptr): Likewise.
(constexpr_global_ctx::remove_value): Mark value outside
lifetime.
(outside_lifetime_error): New function.
(cxx_eval_call_expression): No longer track save_exprs.
(cxx_eval_loop_expr): Likewise.
(cxx_eval_constant_expression): Add checks for outside lifetime
values. Remove local variables at end of bind exprs, and
temporaries after cleanup points.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-lifetime1.C: New test.
* g++.dg/cpp1y/constexpr-lifetime2.C: New test.
* g++.dg/cpp1y/constexpr-lifetime3.C: New test.
* g++.dg/cpp1y/constexpr-lifetime4.C: New test.
* g++.dg/cpp1y/constexpr-lifetime5.C: New test.
* g++.dg/cpp1y/constexpr-lifetime6.C: New test.

Signed-off-by: Nathaniel Shead 

[Bug c++/110619] Dangling pointer returned from constexpr function converts in nullptr

2023-07-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110619

--- Comment #6 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:b8266af71c19a0bd7db4d08c8d2ee3c33214508c

commit r14-2772-gb8266af71c19a0bd7db4d08c8d2ee3c33214508c
Author: Nathaniel Shead 
Date:   Sun Jul 23 01:14:37 2023 +1000

c++: Prevent dangling pointers from becoming nullptr in constexpr
[PR110619]

Currently, when typeck discovers that a return statement will refer to a
local variable it rewrites to return a null pointer. This causes the
error messages for using the return value in a constant expression to be
unhelpful, especially for reference return values, and is also a visible
change to otherwise valid code (as in the linked PR).

The transformation is nonetheless important, both as a safety
guard against attackers being able to gain a handle to other data on the
stack, and to prevent duplicate warnings from later null-dereference
warning passes.

As such, this patch just delays the transformation until cp_genericize,
after constexpr function definitions have been generated.

PR c++/110619

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_genericize_r): Transform RETURN_EXPRs to
not return dangling pointers.
* cp-tree.h (RETURN_EXPR_LOCAL_ADDR_P): New flag.
(check_return_expr): Add a new parameter.
* semantics.cc (finish_return_stmt): Set flag on RETURN_EXPR
when referring to dangling pointer.
* typeck.cc (check_return_expr): Disable transformation of
dangling pointers, instead pass this information to caller.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-110619.C: New test.

Signed-off-by: Nathaniel Shead 

[pushed] testsuite: run C++11 tests in C++11 mode

2023-07-25 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

A recent change missed updating constexpr-mutable3.C because it wasn't run
in C++11 mode even though it checks the behavior for { target c++11_only }.

gcc/testsuite/ChangeLog:

* lib/g++-dg.exp (g++-dg-runtest): Check for c++11_only.
---
 gcc/testsuite/lib/g++-dg.exp | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/lib/g++-dg.exp b/gcc/testsuite/lib/g++-dg.exp
index 046d63170c8..142c52c8426 100644
--- a/gcc/testsuite/lib/g++-dg.exp
+++ b/gcc/testsuite/lib/g++-dg.exp
@@ -55,13 +55,16 @@ proc g++-dg-runtest { testcases flags default-extra-flags } {
} else {
# If the test requires a newer C++ version than which
# is tested by default, use that C++ version for that
-   # single test.  This should be updated or commented
-   # out whenever the default std_list is updated or newer
-   # C++ effective target is added.
+   # single test.  Or if a test checks behavior specifically for
+   # one C++ version, include that version in the default list.
+   # These should be adjusted whenever the default std_list is
+   # updated or newer C++ effective target is added.
if [search_for $test "\{ dg-do * \{ target c++23"] {
set std_list { 23 26 }
} elseif [search_for $test "\{ dg-do * \{ target c++26"] {
set std_list { 26 }
+   } elseif [search_for $test "c++11_only"] {
+   set std_list { 98 11 14 20 }
} else {
set std_list { 98 14 17 20 }
}

base-commit: 50656980497d77ac12a5e7179013a6af09ba32f7
-- 
2.39.3



Re: [gcc13 backport 12/12] riscv: fix error: control reaches end of non-void function

2023-07-25 Thread Kito Cheng via Gcc-patches
OK for backport :)

On Wed, Jul 26, 2023 at 2:11 AM Patrick O'Neill  wrote:
>
> From: Martin Liska 
>
> Fixes:
> gcc/config/riscv/sync.md:66:1: error: control reaches end of non-void function [-Werror=return-type]
> 66 |   [(set (attr "length") (const_int 4))])
>| ^
>
> PR target/109713
>
> gcc/ChangeLog:
>
> * config/riscv/sync.md: Add gcc_unreachable to a switch.
> ---
>  gcc/config/riscv/sync.md | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
> index 6e7c762ac57..9fc626267de 100644
> --- a/gcc/config/riscv/sync.md
> +++ b/gcc/config/riscv/sync.md
> @@ -62,6 +62,8 @@
> return "fence\tr,rw";
>  else if (model == MEMMODEL_RELEASE)
> return "fence\trw,w";
> +else
> +   gcc_unreachable ();
>}
>[(set (attr "length") (const_int 4))])
>
> --
> 2.34.1
>


Re: [PATCH] RISC-V: optim const DF +0.0 store to mem [PR/110748]

2023-07-25 Thread Palmer Dabbelt

On Fri, 21 Jul 2023 11:47:58 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

> On 7/21/23 12:31, Palmer Dabbelt wrote:
>> (define_expand "len_mask_gather_load"
>>    [(match_operand:VNX1_QHSD 0 "register_operand")
>> -   (match_operand 1 "pmode_reg_or_0_operand")
>> +   (match_operand:P 1 "pmode_reg_or_0_operand")
>>     (match_operand:VNX1_QHSDI 2 "register_operand")
>>     (match_operand 3 "")
>>     (match_operand 4 "")
>>
>> a bunch of times, as there's a ton of them?  I'm not entirely sure if that
>> could manifest as an actual bug, though...
>
> But won't this cause (const_int 0) to no longer match because CONST_INT
> nodes are modeless (VOIDmode)?


I poked around a bit and I'm not actually sure, I'm kind of lost on the docs
here.  IIUC we're eliding the VOIDmode in the predicate correctly

   (define_predicate "const_0_operand"
 (and (match_code "const_int,const_wide_int,const_vector")
  (match_test "op == CONST0_RTX (GET_MODE (op))")))

so we're OK there, otherwise we'd presumably have similar problems with
expanders like

   (define_expand "subsi3"
 [(set (match_operand:SI   0 "register_operand" "= r")
  (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
(match_operand:SI 2 "register_operand" "  r")))]
 ""

which we have a few of -- though it'd be kind of a silent failure, as
presumably we'd just end up with some more move-x0s emitted?


[COMMITTED v2 1/2] bpf: don't print () in bpf_print_operand_address

2023-07-25 Thread David Faust via Gcc-patches
[Changes from v1: save calls to fprintf]

Unfortunately, the pseudo-C dialect syntax used for some of the v3
atomic instructions clashes with unconditionally printing the
surrounding parentheses in bpf_print_operand_address.

Instead, place the parentheses in the output templates where needed.

gcc/

* config/bpf/bpf.cc (bpf_print_operand_address): Don't print
enclosing parentheses for pseudo-C dialect.
* config/bpf/bpf.md (zero_extendhidi2): Add parentheses around
operands of pseudo-C dialect output templates where needed.
(zero_extendqidi2): Likewise.
(zero_extendsidi2): Likewise.
(*mov): Likewise.
---
 gcc/config/bpf/bpf.cc | 11 +++
 gcc/config/bpf/bpf.md | 12 ++--
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index 55b6927a62f..2e1e3e3abcf 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -933,9 +933,10 @@ bpf_print_operand_address (FILE *file, rtx addr)
   switch (GET_CODE (addr))
 {
 case REG:
-  fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "(");
+  if (asm_dialect == ASM_NORMAL)
+   fprintf (file, "[");
   bpf_print_register (file, addr, 0);
-  fprintf (file, asm_dialect == ASM_NORMAL ? "+0]" : "+0)");
+  fprintf (file, asm_dialect == ASM_NORMAL ? "+0]" : "+0");
   break;
 case PLUS:
   {
@@ -944,11 +945,13 @@ bpf_print_operand_address (FILE *file, rtx addr)
 
if (GET_CODE (op0) == REG && GET_CODE (op1) == CONST_INT)
  {
-   fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "(");
+   if (asm_dialect == ASM_NORMAL)
+ fprintf (file, "[");
bpf_print_register (file, op0, 0);
fprintf (file, "+");
output_addr_const (file, op1);
-   fprintf (file, asm_dialect == ASM_NORMAL ? "]" : ")");
+   if (asm_dialect == ASM_NORMAL)
+ fprintf (file, "]");
  }
else
  fatal_insn ("invalid address in operand", addr);
diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
index 64342ea1de2..579a8213b09 100644
--- a/gcc/config/bpf/bpf.md
+++ b/gcc/config/bpf/bpf.md
@@ -260,7 +260,7 @@ (define_insn "zero_extendhidi2"
   "@
{and\t%0,0x|%0 &= 0x}
{mov\t%0,%1\;and\t%0,0x|%0 = %1;%0 &= 0x}
-   {ldxh\t%0,%1|%0 = *(u16 *) %1}"
+   {ldxh\t%0,%1|%0 = *(u16 *) (%1)}"
   [(set_attr "type" "alu,alu,ldx")])
 
 (define_insn "zero_extendqidi2"
@@ -270,7 +270,7 @@ (define_insn "zero_extendqidi2"
   "@
{and\t%0,0xff|%0 &= 0xff}
{mov\t%0,%1\;and\t%0,0xff|%0 = %1;%0 &= 0xff}
-   {ldxh\t%0,%1|%0 = *(u8 *) %1}"
+   {ldxh\t%0,%1|%0 = *(u8 *) (%1)}"
   [(set_attr "type" "alu,alu,ldx")])
 
 (define_insn "zero_extendsidi2"
@@ -280,7 +280,7 @@ (define_insn "zero_extendsidi2"
   ""
   "@
* return bpf_has_alu32 ? \"{mov32\t%0,%1|%0 = %1}\" : \"{mov\t%0,%1\;and\t%0,0x|%0 = %1;%0 &= 0x}\";
-   {ldxw\t%0,%1|%0 = *(u32 *) %1}"
+   {ldxw\t%0,%1|%0 = *(u32 *) (%1)}"
   [(set_attr "type" "alu,ldx")])
 
 ;;; Sign-extension
@@ -319,11 +319,11 @@ (define_insn "*mov"
 (match_operand:MM 1 "mov_src_operand"  " q,rI,B,r,I"))]
   ""
   "@
-   {ldx\t%0,%1|%0 = *( *) %1}
+   {ldx\t%0,%1|%0 = *( *) (%1)}
{mov\t%0,%1|%0 = %1}
{lddw\t%0,%1|%0 = %1 ll}
-   {stx\t%0,%1|*( *) %0 = %1}
-   {st\t%0,%1|*( *) %0 = %1}"
+   {stx\t%0,%1|*( *) (%0) = %1}
+   {st\t%0,%1|*( *) (%0) = %1}"
 [(set_attr "type" "ldx,alu,alu,stx,st")])
 
  Shifts
-- 
2.40.1
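
The division of labour after this change can be sketched as follows: the address printer brackets only the normal dialect, and each pseudo-C template adds parentheses itself. This is a simplified illustration on strings, not the bpf.cc code; Dialect, format_address and pseudo_c_load_u32 are hypothetical names, and the register spellings are supplied by the caller.

```cpp
#include <string>

// Hypothetical dialect tag; the patch itself tests asm_dialect == ASM_NORMAL.
enum class Dialect { Normal, PseudoC };

// Format "reg+offset".  Only the normal dialect adds delimiters ("[...]");
// the pseudo-C dialect returns the bare expression.
std::string format_address(Dialect d, const std::string &reg, int offset) {
  std::string body = reg + "+" + std::to_string(offset);
  return d == Dialect::Normal ? "[" + body + "]" : body;
}

// A pseudo-C load template supplies its own parentheses, mirroring
// "%0 = *(u32 *) (%1)" in the patch.
std::string pseudo_c_load_u32(const std::string &dst, const std::string &base,
                              int offset) {
  return dst + " = *(u32 *) (" +
         format_address(Dialect::PseudoC, base, offset) + ")";
}
```

Keeping the parentheses in the templates means an instruction whose pseudo-C syntax wants the bare address can simply omit them.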



[PATCH v2 2/2] bpf: add v3 atomic instructions

2023-07-25 Thread David Faust via Gcc-patches
[Changes from v1: fix merge issue in invoke.texi]

This patch adds support for the general atomic operations introduced in
eBPF v3. In addition to the existing atomic add instruction, this adds:
 - Atomic and, or, xor
 - Fetching versions of these operations (including add)
 - Atomic exchange
 - Atomic compare-and-exchange

To control emission of these instructions, a new target option
-m[no-]v3-atomics is added. This option is enabled by -mcpu=v3
and above.

Support for these instructions was recently added in binutils.
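
For orientation, the source-level operations in question can be written with the generic __atomic builtins, as in the sketch below. The wrapper names are hypothetical, the memory orders are arbitrary choices, and which eBPF instruction each call is lowered to is not shown here; the contents of the new tests are likewise not reproduced.

```cpp
#include <cstdint>

// Fetching variant: returns the value held before the AND.
inline std::int64_t fetch_and(std::int64_t *p, std::int64_t mask) {
  return __atomic_fetch_and(p, mask, __ATOMIC_RELAXED);
}

// Atomic exchange: stores 'v' and returns the previous value.
inline std::int64_t exchange(std::int64_t *p, std::int64_t v) {
  return __atomic_exchange_n(p, v, __ATOMIC_SEQ_CST);
}

// Compare-and-exchange: writes 'desired' only if *p equals 'expected'.
// Returns true when the store happened.
inline bool compare_exchange(std::int64_t *p, std::int64_t expected,
                             std::int64_t desired) {
  return __atomic_compare_exchange_n(p, &expected, desired, /*weak=*/false,
                                     __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

The non-fetching forms correspond to calling the same builtins and discarding the result.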

gcc/

* config/bpf/bpf.opt (mv3-atomics): New option.
* config/bpf/bpf.cc (bpf_option_override): Handle it here.
* config/bpf/bpf.h (enum_reg_class): Add R0 class.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(REGNO_REG_CLASS): Handle R0.
* config/bpf/bpf.md (UNSPEC_XADD): Rename to UNSPEC_AADD.
(UNSPEC_AAND): New unspec.
(UNSPEC_AOR): Likewise.
(UNSPEC_AXOR): Likewise.
(UNSPEC_AFADD): Likewise.
(UNSPEC_AFAND): Likewise.
(UNSPEC_AFOR): Likewise.
(UNSPEC_AFXOR): Likewise.
(UNSPEC_AXCHG): Likewise.
(UNSPEC_ACMPX): Likewise.
(atomic_add): Use UNSPEC_AADD and atomic type attribute.
Move to...
* config/bpf/atomic.md: ...Here. New file.
* config/bpf/constraints.md (t): New constraint for R0.
* doc/invoke.texi (eBPF Options): Document -mv3-atomics.

gcc/testsuite/

* gcc.target/bpf/atomic-cmpxchg-1.c: New test.
* gcc.target/bpf/atomic-cmpxchg-2.c: New test.
* gcc.target/bpf/atomic-fetch-op-1.c: New test.
* gcc.target/bpf/atomic-fetch-op-2.c: New test.
* gcc.target/bpf/atomic-fetch-op-3.c: New test.
* gcc.target/bpf/atomic-op-1.c: New test.
* gcc.target/bpf/atomic-op-2.c: New test.
* gcc.target/bpf/atomic-op-3.c: New test.
* gcc.target/bpf/atomic-xchg-1.c: New test.
* gcc.target/bpf/atomic-xchg-2.c: New test.
---
 gcc/config/bpf/atomic.md  | 185 ++
 gcc/config/bpf/bpf.cc |   3 +
 gcc/config/bpf/bpf.h  |   6 +-
 gcc/config/bpf/bpf.md |  29 ++-
 gcc/config/bpf/bpf.opt|   4 +
 gcc/config/bpf/constraints.md |   3 +
 gcc/doc/invoke.texi   |   8 +-
 .../gcc.target/bpf/atomic-cmpxchg-1.c |  19 ++
 .../gcc.target/bpf/atomic-cmpxchg-2.c |  19 ++
 .../gcc.target/bpf/atomic-fetch-op-1.c|  50 +
 .../gcc.target/bpf/atomic-fetch-op-2.c|  50 +
 .../gcc.target/bpf/atomic-fetch-op-3.c|  49 +
 gcc/testsuite/gcc.target/bpf/atomic-op-1.c|  49 +
 gcc/testsuite/gcc.target/bpf/atomic-op-2.c|  49 +
 gcc/testsuite/gcc.target/bpf/atomic-op-3.c|  49 +
 gcc/testsuite/gcc.target/bpf/atomic-xchg-1.c  |  20 ++
 gcc/testsuite/gcc.target/bpf/atomic-xchg-2.c  |  20 ++
 17 files changed, 593 insertions(+), 19 deletions(-)
 create mode 100644 gcc/config/bpf/atomic.md
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-cmpxchg-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-cmpxchg-2.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-fetch-op-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-fetch-op-2.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-fetch-op-3.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-op-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-op-2.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-op-3.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-xchg-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-xchg-2.c

diff --git a/gcc/config/bpf/atomic.md b/gcc/config/bpf/atomic.md
new file mode 100644
index 000..caf8cc15cd4
--- /dev/null
+++ b/gcc/config/bpf/atomic.md
@@ -0,0 +1,185 @@
+;; Machine description for eBPF.
+;; Copyright (C) 2023 Free Software Foundation, Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+
+(define_mode_iterator AMO [SI DI])
+
+;;; Plain atomic modify operations.
+
+;; Non-fetching atomic add predates all other BPF atomic insns.
+;; Use xadd{w,dw} for compatibility with older GAS without support
+;; for v3 atomics.  Newer GAS supports "aadd[32]" in line with the
+;; other 

[Bug c++/110809] ICE: in unify, at cp/pt.cc:25226 with floating-point NTTPs

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110809

--- Comment #2 from Andrew Pinski  ---
Reducing ...

Re: [PATCH 1/2] bpf: don't print () in bpf_print_operand_address

2023-07-25 Thread David Faust via Gcc-patches



On 7/25/23 15:14, Jose E. Marchesi wrote:
> 
> Hi David.
> 
>> Unfortunately, the pseudo-C dialect syntax used for some of the v3
>> atomic instructions clashes with unconditionally printing the
>> surrounding parentheses in bpf_print_operand_address.
>>
>> Instead, place the parentheses in the output templates where needed.
>>
>> Tested in bpf-unknown-none.
>> OK?
>>
>> gcc/
>>
>>  * config/bpf/bpf.cc (bpf_print_operand_address): Don't print
>>  enclosing parentheses for pseudo-C dialect.
>>  * config/bpf/bpf.md (zero_extendhidi2): Add parentheses around
>>  operands of pseudo-C dialect output templates where needed.
>>  (zero_extendqidi2): Likewise.
>>  (zero_extendsidi2): Likewise.
>>  (*mov): Likewise.
>> ---
>>  gcc/config/bpf/bpf.cc |  8 
>>  gcc/config/bpf/bpf.md | 12 ++--
>>  2 files changed, 10 insertions(+), 10 deletions(-)
>>
>> diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
>> index 55b6927a62f..2c077ea834e 100644
>> --- a/gcc/config/bpf/bpf.cc
>> +++ b/gcc/config/bpf/bpf.cc
>> @@ -933,9 +933,9 @@ bpf_print_operand_address (FILE *file, rtx addr)
>>switch (GET_CODE (addr))
>>  {
>>  case REG:
>> -  fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "(");
>> +  fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "");
> 
> We can save the call to fprintf there with a conditional.

Good point, thanks.
I will update these before pushing.

> 
>>bpf_print_register (file, addr, 0);
>> -  fprintf (file, asm_dialect == ASM_NORMAL ? "+0]" : "+0)");
>> +  fprintf (file, asm_dialect == ASM_NORMAL ? "+0]" : "+0");
>>break;
>>  case PLUS:
>>{
>> @@ -944,11 +944,11 @@ bpf_print_operand_address (FILE *file, rtx addr)
>>  
>>  if (GET_CODE (op0) == REG && GET_CODE (op1) == CONST_INT)
>>{
>> -fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "(");
>> +fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "");
> 
> Likewise.
> 
>>  bpf_print_register (file, op0, 0);
>>  fprintf (file, "+");
>>  output_addr_const (file, op1);
>> -fprintf (file, asm_dialect == ASM_NORMAL ? "]" : ")");
>> +fprintf (file, asm_dialect == ASM_NORMAL ? "]" : "");
>>}
>>  else
>>fatal_insn ("invalid address in operand", addr);
>> diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
>> index 64342ea1de2..579a8213b09 100644
>> --- a/gcc/config/bpf/bpf.md
>> +++ b/gcc/config/bpf/bpf.md
>> @@ -260,7 +260,7 @@ (define_insn "zero_extendhidi2"
>>"@
>> {and\t%0,0x|%0 &= 0x}
>> {mov\t%0,%1\;and\t%0,0x|%0 = %1;%0 &= 0x}
>> -   {ldxh\t%0,%1|%0 = *(u16 *) %1}"
>> +   {ldxh\t%0,%1|%0 = *(u16 *) (%1)}"
>>[(set_attr "type" "alu,alu,ldx")])
>>  
>>  (define_insn "zero_extendqidi2"
>> @@ -270,7 +270,7 @@ (define_insn "zero_extendqidi2"
>>"@
>> {and\t%0,0xff|%0 &= 0xff}
>> {mov\t%0,%1\;and\t%0,0xff|%0 = %1;%0 &= 0xff}
>> -   {ldxh\t%0,%1|%0 = *(u8 *) %1}"
>> +   {ldxh\t%0,%1|%0 = *(u8 *) (%1)}"
>>[(set_attr "type" "alu,alu,ldx")])
>>  
>>  (define_insn "zero_extendsidi2"
>> @@ -280,7 +280,7 @@ (define_insn "zero_extendsidi2"
>>""
>>"@
>> * return bpf_has_alu32 ? \"{mov32\t%0,%1|%0 = %1}\" : 
>> \"{mov\t%0,%1\;and\t%0,0x|%0 = %1;%0 &= 0x}\";
>> -   {ldxw\t%0,%1|%0 = *(u32 *) %1}"
>> +   {ldxw\t%0,%1|%0 = *(u32 *) (%1)}"
>>[(set_attr "type" "alu,ldx")])
>>  
>>  ;;; Sign-extension
>> @@ -319,11 +319,11 @@ (define_insn "*mov"
>>  (match_operand:MM 1 "mov_src_operand"  " q,rI,B,r,I"))]
>>""
>>"@
>> -   {ldx\t%0,%1|%0 = *( *) %1}
>> +   {ldx\t%0,%1|%0 = *( *) (%1)}
>> {mov\t%0,%1|%0 = %1}
>> {lddw\t%0,%1|%0 = %1 ll}
>> -   {stx\t%0,%1|*( *) %0 = %1}
>> -   {st\t%0,%1|*( *) %0 = %1}"
>> +   {stx\t%0,%1|*( *) (%0) = %1}
>> +   {st\t%0,%1|*( *) (%0) = %1}"
>>  [(set_attr "type" "ldx,alu,alu,stx,st")])
>>  
>>   Shifts
> 
> Otherwise, LGTM.
> OK.
> 
> Thanks!


Re: [PATCH 2/2] bpf: add v3 atomic instructions

2023-07-25 Thread David Faust via Gcc-patches



On 7/25/23 15:18, Jose E. Marchesi wrote:
> 
> Hi David.
> 
>> +<<< HEAD
> 
> There is a merge problem there.

Ugh, I swear I've fixed this twice now. Yet it keeps cropping up.
Sorry. v2 shortly.

> 
>>  @opindex mbswap
>>  @item -mbswap
>>  Enable byte swap instructions.  Enabled for CPU v4 and above.
>> @@ -24715,6 +24716,12 @@ Enable byte swap instructions.  Enabled for CPU v4 
>> and above.
>>  @item -msdiv
>>  Enable signed division and modulus instructions.  Enabled for CPU v4
>>  and above.
>> +===
>> +@opindex mv3-atomics
>> +@item -mv3-atomics
>> +Enable instructions for general atomic operations introduced in CPU v3.
>> +Enabled for CPU v3 and above.
>> +>>> 6de76bd11b6 (bpf: add v3 atomic instructions)


Re: [PATCH 2/2] bpf: add v3 atomic instructions

2023-07-25 Thread Jose E. Marchesi via Gcc-patches


Hi David.

> +<<< HEAD

There is a merge problem there.

>  @opindex mbswap
>  @item -mbswap
>  Enable byte swap instructions.  Enabled for CPU v4 and above.
> @@ -24715,6 +24716,12 @@ Enable byte swap instructions.  Enabled for CPU v4 
> and above.
>  @item -msdiv
>  Enable signed division and modulus instructions.  Enabled for CPU v4
>  and above.
> +===
> +@opindex mv3-atomics
> +@item -mv3-atomics
> +Enable instructions for general atomic operations introduced in CPU v3.
> +Enabled for CPU v3 and above.
> +>>> 6de76bd11b6 (bpf: add v3 atomic instructions)


Re: [PATCH 1/2] bpf: don't print () in bpf_print_operand_address

2023-07-25 Thread Jose E. Marchesi via Gcc-patches


Hi David.

> Unfortunately, the pseudo-C dialect syntax used for some of the v3
> atomic instructions clashes with unconditionally printing the
> surrounding parentheses in bpf_print_operand_address.
>
> Instead, place the parentheses in the output templates where needed.
>
> Tested in bpf-unknown-none.
> OK?
>
> gcc/
>
>   * config/bpf/bpf.cc (bpf_print_operand_address): Don't print
>   enclosing parentheses for pseudo-C dialect.
>   * config/bpf/bpf.md (zero_extendhidi2): Add parentheses around
>   operands of pseudo-C dialect output templates where needed.
>   (zero_extendqidi2): Likewise.
>   (zero_extendsidi2): Likewise.
>   (*mov): Likewise.
> ---
>  gcc/config/bpf/bpf.cc |  8 
>  gcc/config/bpf/bpf.md | 12 ++--
>  2 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
> index 55b6927a62f..2c077ea834e 100644
> --- a/gcc/config/bpf/bpf.cc
> +++ b/gcc/config/bpf/bpf.cc
> @@ -933,9 +933,9 @@ bpf_print_operand_address (FILE *file, rtx addr)
>switch (GET_CODE (addr))
>  {
>  case REG:
> -  fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "(");
> +  fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "");

We can save the call to fprintf there with a conditional.

>bpf_print_register (file, addr, 0);
> -  fprintf (file, asm_dialect == ASM_NORMAL ? "+0]" : "+0)");
> +  fprintf (file, asm_dialect == ASM_NORMAL ? "+0]" : "+0");
>break;
>  case PLUS:
>{
> @@ -944,11 +944,11 @@ bpf_print_operand_address (FILE *file, rtx addr)
>  
>   if (GET_CODE (op0) == REG && GET_CODE (op1) == CONST_INT)
> {
> - fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "(");
> + fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "");

Likewise.

>   bpf_print_register (file, op0, 0);
>   fprintf (file, "+");
>   output_addr_const (file, op1);
> - fprintf (file, asm_dialect == ASM_NORMAL ? "]" : ")");
> + fprintf (file, asm_dialect == ASM_NORMAL ? "]" : "");
> }
>   else
> fatal_insn ("invalid address in operand", addr);
> diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
> index 64342ea1de2..579a8213b09 100644
> --- a/gcc/config/bpf/bpf.md
> +++ b/gcc/config/bpf/bpf.md
> @@ -260,7 +260,7 @@ (define_insn "zero_extendhidi2"
>"@
> {and\t%0,0x|%0 &= 0x}
> {mov\t%0,%1\;and\t%0,0x|%0 = %1;%0 &= 0x}
> -   {ldxh\t%0,%1|%0 = *(u16 *) %1}"
> +   {ldxh\t%0,%1|%0 = *(u16 *) (%1)}"
>[(set_attr "type" "alu,alu,ldx")])
>  
>  (define_insn "zero_extendqidi2"
> @@ -270,7 +270,7 @@ (define_insn "zero_extendqidi2"
>"@
> {and\t%0,0xff|%0 &= 0xff}
> {mov\t%0,%1\;and\t%0,0xff|%0 = %1;%0 &= 0xff}
> -   {ldxh\t%0,%1|%0 = *(u8 *) %1}"
> +   {ldxh\t%0,%1|%0 = *(u8 *) (%1)}"
>[(set_attr "type" "alu,alu,ldx")])
>  
>  (define_insn "zero_extendsidi2"
> @@ -280,7 +280,7 @@ (define_insn "zero_extendsidi2"
>""
>"@
> * return bpf_has_alu32 ? \"{mov32\t%0,%1|%0 = %1}\" : 
> \"{mov\t%0,%1\;and\t%0,0x|%0 = %1;%0 &= 0x}\";
> -   {ldxw\t%0,%1|%0 = *(u32 *) %1}"
> +   {ldxw\t%0,%1|%0 = *(u32 *) (%1)}"
>[(set_attr "type" "alu,ldx")])
>  
>  ;;; Sign-extension
> @@ -319,11 +319,11 @@ (define_insn "*mov"
>  (match_operand:MM 1 "mov_src_operand"  " q,rI,B,r,I"))]
>""
>"@
> -   {ldx\t%0,%1|%0 = *( *) %1}
> +   {ldx\t%0,%1|%0 = *( *) (%1)}
> {mov\t%0,%1|%0 = %1}
> {lddw\t%0,%1|%0 = %1 ll}
> -   {stx\t%0,%1|*( *) %0 = %1}
> -   {st\t%0,%1|*( *) %0 = %1}"
> +   {stx\t%0,%1|*( *) (%0) = %1}
> +   {st\t%0,%1|*( *) (%0) = %1}"
>  [(set_attr "type" "ldx,alu,alu,stx,st")])
>  
>   Shifts

Otherwise, LGTM.
OK.

Thanks!


[PATCH 2/2] bpf: add v3 atomic instructions

2023-07-25 Thread David Faust via Gcc-patches
This patch adds support for the general atomic operations introduced in
eBPF v3. In addition to the existing atomic add instruction, this adds:
 - Atomic and, or, xor
 - Fetching versions of these operations (including add)
 - Atomic exchange
 - Atomic compare-and-exchange

To control emission of these instructions, a new target option
-m[no-]v3-atomics is added. This option is enabled by -mcpu=v3
and above.

Support for these instructions was recently added in binutils.

Tested in bpf-unknown-none.
OK?

gcc/

* config/bpf/bpf.opt (mv3-atomics): New option.
* config/bpf/bpf.cc (bpf_option_override): Handle it here.
* config/bpf/bpf.h (enum_reg_class): Add R0 class.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(REGNO_REG_CLASS): Handle R0.
* config/bpf/bpf.md (UNSPEC_XADD): Rename to UNSPEC_AADD.
(UNSPEC_AAND): New unspec.
(UNSPEC_AOR): Likewise.
(UNSPEC_AXOR): Likewise.
(UNSPEC_AFADD): Likewise.
(UNSPEC_AFAND): Likewise.
(UNSPEC_AFOR): Likewise.
(UNSPEC_AFXOR): Likewise.
(UNSPEC_AXCHG): Likewise.
(UNSPEC_ACMPX): Likewise.
(atomic_add): Use UNSPEC_AADD and atomic type attribute.
Move to...
* config/bpf/atomic.md: ...Here. New file.
* config/bpf/constraints.md (t): New constraint for R0.
* doc/invoke.texi (eBPF Options): Document -mv3-atomics.

gcc/testsuite/

* gcc.target/bpf/atomic-cmpxchg-1.c: New test.
* gcc.target/bpf/atomic-cmpxchg-2.c: New test.
* gcc.target/bpf/atomic-fetch-op-1.c: New test.
* gcc.target/bpf/atomic-fetch-op-2.c: New test.
* gcc.target/bpf/atomic-fetch-op-3.c: New test.
* gcc.target/bpf/atomic-op-1.c: New test.
* gcc.target/bpf/atomic-op-2.c: New test.
* gcc.target/bpf/atomic-op-3.c: New test.
* gcc.target/bpf/atomic-xchg-1.c: New test.
* gcc.target/bpf/atomic-xchg-2.c: New test.
---
 gcc/config/bpf/atomic.md  | 185 ++
 gcc/config/bpf/bpf.cc |   3 +
 gcc/config/bpf/bpf.h  |   6 +-
 gcc/config/bpf/bpf.md |  29 ++-
 gcc/config/bpf/bpf.opt|   4 +
 gcc/config/bpf/constraints.md |   3 +
 gcc/doc/invoke.texi   |  10 +-
 .../gcc.target/bpf/atomic-cmpxchg-1.c |  19 ++
 .../gcc.target/bpf/atomic-cmpxchg-2.c |  19 ++
 .../gcc.target/bpf/atomic-fetch-op-1.c|  50 +
 .../gcc.target/bpf/atomic-fetch-op-2.c|  50 +
 .../gcc.target/bpf/atomic-fetch-op-3.c|  49 +
 gcc/testsuite/gcc.target/bpf/atomic-op-1.c|  49 +
 gcc/testsuite/gcc.target/bpf/atomic-op-2.c|  49 +
 gcc/testsuite/gcc.target/bpf/atomic-op-3.c|  49 +
 gcc/testsuite/gcc.target/bpf/atomic-xchg-1.c  |  20 ++
 gcc/testsuite/gcc.target/bpf/atomic-xchg-2.c  |  20 ++
 17 files changed, 595 insertions(+), 19 deletions(-)
 create mode 100644 gcc/config/bpf/atomic.md
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-cmpxchg-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-cmpxchg-2.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-fetch-op-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-fetch-op-2.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-fetch-op-3.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-op-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-op-2.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-op-3.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-xchg-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/atomic-xchg-2.c

diff --git a/gcc/config/bpf/atomic.md b/gcc/config/bpf/atomic.md
new file mode 100644
index 000..caf8cc15cd4
--- /dev/null
+++ b/gcc/config/bpf/atomic.md
@@ -0,0 +1,185 @@
+;; Machine description for eBPF.
+;; Copyright (C) 2023 Free Software Foundation, Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+
+(define_mode_iterator AMO [SI DI])
+
+;;; Plain atomic modify operations.
+
+;; Non-fetching atomic add predates all other BPF atomic insns.
+;; Use xadd{w,dw} for compatibility with older GAS without support
+;; for v3 atomics.  Newer GAS supports "aadd[32]" in line with the
+;; other atomic operations.

[PATCH 1/2] bpf: don't print () in bpf_print_operand_address

2023-07-25 Thread David Faust via Gcc-patches
Unfortunately, the pseudo-C dialect syntax used for some of the v3
atomic instructions clashes with unconditionally printing the
surrounding parentheses in bpf_print_operand_address.

Instead, place the parentheses in the output templates where needed.
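
The split of responsibilities described above can be sketched in plain C. This is only an illustration, not the GCC code: print_address and print_load are hypothetical helpers standing in for the address printer and for an output template such as "%0 = *(u16 *) (%1)".

```c
#include <stdio.h>

/* Hypothetical stand-in for the address printer: emits "reg+offset"
   with no surrounding parentheses, as proposed for the pseudo-C
   dialect.  */
static int
print_address (char *buf, size_t size, const char *reg, int offset)
{
  return snprintf (buf, size, "%s+%d", reg, offset);
}

/* Hypothetical stand-in for an output template: the template, not the
   address printer, supplies the parentheses around the address.  */
static int
print_load (char *buf, size_t size, const char *dst, const char *type,
            const char *reg, int offset)
{
  char addr[32];
  print_address (addr, sizeof addr, reg, offset);
  return snprintf (buf, size, "%s = *(%s *) (%s)", dst, type, addr);
}
```

With this arrangement a template that needs a different surrounding syntax can reuse the same bare address text.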

Tested in bpf-unknown-none.
OK?

gcc/

* config/bpf/bpf.cc (bpf_print_operand_address): Don't print
enclosing parentheses for pseudo-C dialect.
* config/bpf/bpf.md (zero_extendhidi2): Add parentheses around
operands of pseudo-C dialect output templates where needed.
(zero_extendqidi2): Likewise.
(zero_extendsidi2): Likewise.
(*mov): Likewise.
---
 gcc/config/bpf/bpf.cc |  8 
 gcc/config/bpf/bpf.md | 12 ++--
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index 55b6927a62f..2c077ea834e 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -933,9 +933,9 @@ bpf_print_operand_address (FILE *file, rtx addr)
   switch (GET_CODE (addr))
 {
 case REG:
-  fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "(");
+  fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "");
   bpf_print_register (file, addr, 0);
-  fprintf (file, asm_dialect == ASM_NORMAL ? "+0]" : "+0)");
+  fprintf (file, asm_dialect == ASM_NORMAL ? "+0]" : "+0");
   break;
 case PLUS:
   {
@@ -944,11 +944,11 @@ bpf_print_operand_address (FILE *file, rtx addr)
 
if (GET_CODE (op0) == REG && GET_CODE (op1) == CONST_INT)
  {
-   fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "(");
+   fprintf (file, asm_dialect == ASM_NORMAL ? "[" : "");
bpf_print_register (file, op0, 0);
fprintf (file, "+");
output_addr_const (file, op1);
-   fprintf (file, asm_dialect == ASM_NORMAL ? "]" : ")");
+   fprintf (file, asm_dialect == ASM_NORMAL ? "]" : "");
  }
else
  fatal_insn ("invalid address in operand", addr);
diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
index 64342ea1de2..579a8213b09 100644
--- a/gcc/config/bpf/bpf.md
+++ b/gcc/config/bpf/bpf.md
@@ -260,7 +260,7 @@ (define_insn "zero_extendhidi2"
   "@
{and\t%0,0x|%0 &= 0x}
{mov\t%0,%1\;and\t%0,0x|%0 = %1;%0 &= 0x}
-   {ldxh\t%0,%1|%0 = *(u16 *) %1}"
+   {ldxh\t%0,%1|%0 = *(u16 *) (%1)}"
   [(set_attr "type" "alu,alu,ldx")])
 
 (define_insn "zero_extendqidi2"
@@ -270,7 +270,7 @@ (define_insn "zero_extendqidi2"
   "@
{and\t%0,0xff|%0 &= 0xff}
{mov\t%0,%1\;and\t%0,0xff|%0 = %1;%0 &= 0xff}
-   {ldxh\t%0,%1|%0 = *(u8 *) %1}"
+   {ldxh\t%0,%1|%0 = *(u8 *) (%1)}"
   [(set_attr "type" "alu,alu,ldx")])
 
 (define_insn "zero_extendsidi2"
@@ -280,7 +280,7 @@ (define_insn "zero_extendsidi2"
   ""
   "@
* return bpf_has_alu32 ? \"{mov32\t%0,%1|%0 = %1}\" : 
\"{mov\t%0,%1\;and\t%0,0x|%0 = %1;%0 &= 0x}\";
-   {ldxw\t%0,%1|%0 = *(u32 *) %1}"
+   {ldxw\t%0,%1|%0 = *(u32 *) (%1)}"
   [(set_attr "type" "alu,ldx")])
 
 ;;; Sign-extension
@@ -319,11 +319,11 @@ (define_insn "*mov"
 (match_operand:MM 1 "mov_src_operand"  " q,rI,B,r,I"))]
   ""
   "@
-   {ldx\t%0,%1|%0 = *( *) %1}
+   {ldx\t%0,%1|%0 = *( *) (%1)}
{mov\t%0,%1|%0 = %1}
{lddw\t%0,%1|%0 = %1 ll}
-   {stx\t%0,%1|*( *) %0 = %1}
-   {st\t%0,%1|*( *) %0 = %1}"
+   {stx\t%0,%1|*( *) (%0) = %1}
+   {st\t%0,%1|*( *) (%0) = %1}"
 [(set_attr "type" "ldx,alu,alu,stx,st")])
 
  Shifts
-- 
2.40.1



[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-25 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #7 from Tom de Vries  ---
(In reply to Alexander Monakov from comment #5)
> This trips Valgrind's data race detector (valgrind --tool=helgrind) too. So
> I don't think checking SANITIZE_THREAD is the correct approach.

Can you elaborate on what you consider a correct approach?

Re: [PATCH] match.pd: Implement missed optimization (~X | Y) ^ X -> ~(X & Y) [PR109986]

2023-07-25 Thread Andrew Pinski via Gcc-patches
On Tue, Jul 25, 2023 at 1:54 PM Andrew Pinski  wrote:
>
> On Tue, Jul 25, 2023 at 12:45 PM Jakub Jelinek via Gcc-patches
>  wrote:
> >
> > On Tue, Jul 25, 2023 at 03:42:21PM -0400, David Edelsohn via Gcc-patches 
> > wrote:
> > > Hi, Drew
> > >
> > > Thanks for addressing this missed optimization.
> > >
> > > The testcase includes an incorrect assumption: signed char, which
> > > causes the testcase to fail on PowerPC.
> > >
> > > Should the testcase be updated to specify signed char in the function
> > > signatures or should -fsigned-char be added to the command line
> > > options?
> >
> > I think we should use signed char instead of char in the testcase.
>
> I also think it should be `signed char` instead as I mentioned in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110803 .

Committed the testsuite fix as r14-2767-g67357270772b91 .

Thanks,
Andrew

>
> Thanks,
> Andrew
>
> >
> > Jakub
> >
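
For reference, the identity named in the subject line, (~X | Y) ^ X == ~(X & Y), can be checked exhaustively on 8-bit values. The sketch below is standalone C and is not taken from the patch or its testcase; the function names are made up for illustration.

```c
#include <stdint.h>

/* Left-hand side of the identity from the subject line.  */
static uint8_t
lhs (uint8_t x, uint8_t y)
{
  return (uint8_t) ((~x | y) ^ x);
}

/* Right-hand side of the identity.  */
static uint8_t
rhs (uint8_t x, uint8_t y)
{
  return (uint8_t) ~(x & y);
}

/* Returns 1 if both sides agree for all 65536 pairs of 8-bit inputs,
   0 otherwise.  */
static int
identity_holds (void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      if (lhs ((uint8_t) x, (uint8_t) y) != rhs ((uint8_t) x, (uint8_t) y))
        return 0;
  return 1;
}
```

The identity is bitwise, so it holds per bit: where X is 0 both sides give 1, and where X is 1 both sides give ~Y.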


[Bug testsuite/110803] new test case gcc.c-torture/execute/pr109986.c in r14-2751-g2a3556376c69a1 fails

2023-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110803

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #3 from Andrew Pinski  ---
Fixed.

[Bug testsuite/110803] new test case gcc.c-torture/execute/pr109986.c in r14-2751-g2a3556376c69a1 fails

2023-07-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110803

--- Comment #2 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:67357270772b9131f1780267485c9eba0331bd6f

commit r14-2767-g67357270772b9131f1780267485c9eba0331bd6f
Author: Andrew Pinski 
Date:   Tue Jul 25 21:50:33 2023 +

Fix 110803: use of plain char instead of signed char

So the problem here is that plain char can either be signed
or unsigned depending on the target (powerpc and aarch64 are
unsigned while most other targets are signed). So the testcase
gcc.c-torture/execute/pr109986.c was assuming plain char was signed
char which is wrong so it is better to just change the `char` to be
`signed char`.
Note gcc.c-torture/execute/pr109986.c includes gcc.dg/tree-ssa/pr109986.c
where the plain char was being used.

Committed as obvious after a quick test to make sure
gcc.c-torture/execute/pr109986.c
now passes and gcc.dg/tree-ssa/pr109986.c still passes.

gcc/testsuite/ChangeLog:

PR testsuite/110803
* gcc.dg/tree-ssa/pr109986.c: Change plain char to be
`signed char`.
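
The target dependence described in this commit message can be observed with a few lines of standard C. This is a minimal sketch, not part of the testcase; the helper names are hypothetical.

```c
#include <limits.h>

/* Nonzero when plain char is a signed type on this target.  */
static int
plain_char_is_signed (void)
{
  return CHAR_MIN < 0;
}

/* With plain char the widened value depends on the target's choice.  */
static int
widen_plain (char c)
{
  return c;
}

/* With signed char the widened value is the same on every target.  */
static int
widen_signed (signed char c)
{
  return c;
}
```

On a target where plain char is unsigned, widen_plain ((char) -1) yields 255 rather than -1, which is the kind of difference that made the original testcase fail.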

[COMMITTED] Fix 110803: use of plain char instead of signed char

2023-07-25 Thread Andrew Pinski via Gcc-patches
So the problem here is that plain char can either be signed
or unsigned depending on the target (powerpc and aarch64 are
unsigned while most other targets are signed). So the testcase
gcc.c-torture/execute/pr109986.c was assuming plain char was signed
char which is wrong so it is better to just change the `char` to be
`signed char`.
Note gcc.c-torture/execute/pr109986.c includes gcc.dg/tree-ssa/pr109986.c
where the plain char was being used.

Committed as obvious after a quick test to make sure 
gcc.c-torture/execute/pr109986.c
now passes and gcc.dg/tree-ssa/pr109986.c still passes.

gcc/testsuite/ChangeLog:

PR testsuite/110803
* gcc.dg/tree-ssa/pr109986.c: Change plain char to be
`signed char`.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr109986.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr109986.c b/gcc/testsuite/gcc.dg/tree-ssa/pr109986.c
index 45f099b5656..0724510e5d5 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr109986.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr109986.c
@@ -16,14 +16,14 @@ t2 (int a, int b)
   return a ^ (~a | (unsigned int) b);
 }
 
-__attribute__((noipa)) char
-t3 (char a, char b)
+__attribute__((noipa)) signed char
+t3 (signed char a, signed char b)
 {
   return (b | ~a) ^ a;
 }
 
 __attribute__((noipa)) unsigned char
-t4 (char a, char b)
+t4 (signed char a, signed char b)
 {
   return ((unsigned char) a) ^ (b | ~a);
 }
@@ -89,20 +89,20 @@ t12 (int a, unsigned int b)
   return t3;
 }
 
-__attribute__((noipa)) char
-t13 (char a, char b)
+__attribute__((noipa)) signed char
+t13 (signed char a, signed char b)
 {
-  char t1 = ~a;
-  char t2 = b | t1;
-  char t3 = t2 ^ a;
+  signed char t1 = ~a;
+  signed char t2 = b | t1;
+  signed char t3 = t2 ^ a;
   return t3;
 }
 
 __attribute__((noipa)) unsigned char
-t14 (unsigned char a, char b)
+t14 (unsigned char a, signed char b)
 {
   unsigned char t1 = ~a;
-  char t2 = b | t1;
+  signed char t2 = b | t1;
   unsigned char t3 = a ^ t2;
   return t3;
 }
-- 
2.31.1



[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-25 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #6 from Tom de Vries  ---
(In reply to rguent...@suse.de from comment #4)
> I'm suggesting to not fix it ;) 

Can you explain why ?

It doesn't look difficult to fix to me, and I don't see any downsides.

> That said, is TSAN a useful vehicle?

Well, false positives aside, yes.

[patch] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect

2023-07-25 Thread Tobias Burnus

The attached patch calls CUDA's cuMemcpy2D and cuMemcpy3D
for omp_target_memcpy_rect[_async] for dim=2/dim=3. This should
speed up the data transfer for noncontiguous data.

While at it, I also added support for copying from one device to another;
while potentially slow, it is still better than not being able to
copy, and with shared memory it shouldn't be that bad.

Comments, suggestions, remarks?
If there are none, will commit it...

Disclaimer: While I have done correctness tests (on a system with two nvptx GPUs),
I have not done any performance tests. (I also tested it without offloading
configured, but that's rather boring.)

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; limited liability company; Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: Munich; Commercial register:
Munich, HRB 106955
OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect

When copying a 2D or 3D rectangular memory block, the performance is
better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the
data one by one. That's what this commit does.

Additionally, it permits device-to-device copies, if necessary using a
temporary variable on the host.
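
To make the rectangular-copy terminology concrete, here is a host-only C sketch of what a 2D copy with separate source and destination pitches does. It does not call CUDA and is not the plugin code; copy_rect_2d is a hypothetical helper whose parameters loosely mirror the srcPitch, dstPitch, WidthInBytes and Height fields of CUDA_MEMCPY2D.

```c
#include <stddef.h>
#include <string.h>

/* Copy a rectangle of HEIGHT rows, each WIDTH_BYTES wide, between two
   buffers whose consecutive rows are SRC_PITCH and DST_PITCH bytes
   apart.  SRC and DST point at the first byte of the rectangle.  */
static void
copy_rect_2d (void *dst, size_t dst_pitch,
              const void *src, size_t src_pitch,
              size_t width_bytes, size_t height)
{
  char *d = dst;
  const char *s = src;

  /* One memcpy per row; a strided-copy primitive can do this in a
     single call instead.  */
  for (size_t row = 0; row < height; row++)
    memcpy (d + row * dst_pitch, s + row * src_pitch, width_bytes);
}
```

The loop issues one transfer per row, which is the overhead a dedicated 2D/3D copy call is meant to avoid.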

include/ChangeLog:

	* cuda/cuda.h (CUresult): Add CUDA_ERROR_NOT_INITIALIZED,
	CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE.
	(CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D,
	CUDA_MEMCPY3D_PEER): New typedefs.
	(cuMemcpy2D, cuMemcpy2DAsync, cuMemcpy2DUnaligned,
	cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer,
	cuMemcpy3DPeerAsync): New prototypes.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d,
	GOMP_OFFLOAD_memcpy3d): New prototypes.
	* libgomp.h (struct gomp_device_descr): Add memcpy2d_func
	and memcpy3d_func.
	* libgomp.texi (5.1 Impl. Status): Add 'defaultmap(:all)' with 'N'.
	(nvptx): Document when cuMemcpy2D/cuMemcpy3D is used.
	* oacc-host.c (memcpy2d_func, .memcpy3d_func): Init with NULL.
	* plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned,
	cuMemcpy3D): Invoke via CUDA_ONE_CALL.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
	GOMP_OFFLOAD_memcpy3d): New.
	* target.c (omp_target_memcpy_rect_worker):
	(omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy):
	Permit all device-to-device copies; invoke new plugins for
	2D and 3D copying when available.
	(gomp_load_plugin_for_device): DLSYM the new plugin functions.
	* testsuite/libgomp.c/target-12.c: Fix dimension bug.
	* testsuite/libgomp.fortran/target-12.f90: Likewise.
	* testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test.

 include/cuda/cuda.h|  85 
 libgomp/libgomp-plugin.h   |   7 +
 libgomp/libgomp.h  |   2 +
 libgomp/libgomp.texi   |   6 +
 libgomp/oacc-host.c|   2 +
 libgomp/plugin/cuda-lib.def|   3 +
 libgomp/plugin/plugin-nvptx.c  | 116 +
 libgomp/target.c   | 152 +-
 libgomp/testsuite/libgomp.c/target-12.c|   6 +-
 libgomp/testsuite/libgomp.fortran/target-12.f90|   6 +-
 .../libgomp.fortran/target-memcpy-rect-1.f90   | 531 +
 11 files changed, 885 insertions(+), 31 deletions(-)

diff --git a/include/cuda/cuda.h b/include/cuda/cuda.h
index 338626fb6dc..09c3c2b8dbe 100644
--- a/include/cuda/cuda.h
+++ b/include/cuda/cuda.h
@@ -47,6 +47,7 @@ typedef void *CUevent;
 typedef void *CUfunction;
 typedef void *CUlinkState;
 typedef void *CUmodule;
+typedef void *CUarray;
 typedef size_t (*CUoccupancyB2DSize)(int);
 typedef void *CUstream;
 
@@ -54,7 +55,10 @@ typedef enum {
   CUDA_SUCCESS = 0,
   CUDA_ERROR_INVALID_VALUE = 1,
   CUDA_ERROR_OUT_OF_MEMORY = 2,
+  CUDA_ERROR_NOT_INITIALIZED = 3,
+  CUDA_ERROR_DEINITIALIZED = 4,
   CUDA_ERROR_INVALID_CONTEXT = 201,
+  CUDA_ERROR_INVALID_HANDLE = 400,
   CUDA_ERROR_NOT_FOUND = 500,
   CUDA_ERROR_NOT_READY = 600,
   CUDA_ERROR_LAUNCH_FAILED = 719,
@@ -126,6 +130,75 @@ typedef enum {
   CU_LIMIT_MALLOC_HEAP_SIZE = 0x02,
 } CUlimit;
 
+typedef enum {
+  CU_MEMORYTYPE_HOST = 0x01,
+  CU_MEMORYTYPE_DEVICE = 0x02,
+  CU_MEMORYTYPE_ARRAY = 0x03,
+  CU_MEMORYTYPE_UNIFIED = 0x04
+} CUmemorytype;
+
+typedef struct {
+  size_t srcXInBytes, srcY;
+  CUmemorytype srcMemoryType;
+  const void *srcHost;
+  CUdeviceptr srcDevice;
+  CUarray srcArray;
+  size_t srcPitch;
+
+  size_t dstXInBytes, dstY;
+  CUmemorytype dstMemoryType;
+  const void *dstHost;
+  CUdeviceptr dstDevice;
+  CUarray dstArray;
+  size_t dstPitch;
+
+  size_t WidthInBytes, Height;
+} CUDA_MEMCPY2D;
+
+typedef struct {
+  size_t srcXInBytes, srcY, srcZ;
+  size_t srcLOD;
+  CUmemorytype srcMemoryType;
+  const void *srcHost;
+  CUdeviceptr srcDevice;
+  CUarray srcArray;
+  void *dummy;
+  size_t srcPitch, srcHeight;
+
+  size_t dstXInBytes, dstY, dstZ;
+  size_t dstLOD;

[Bug fortran/68569] ICE with automatic character object and DATA

2023-07-25 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68569

--- Comment #7 from anlauf at gcc dot gnu.org ---
Created attachment 55635
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55635&action=edit
Patch

This patch fixes the testcases in this PR and regtests OK, except for the
necessary minor adjustments to the patterns in gfortran.dg/data_char_4.f90
and gfortran.dg/data_char_5.f90 (not included).

Re: analyzer: New state machine should be C++ only

2023-07-25 Thread Martin Uecker
Am Mittwoch, dem 12.07.2023 um 15:23 +0200 schrieb Benjamin Priour via
Gcc:
> Hi David,
> 
> 
> Lately I've been working on adding a new state machine to keep track
> of
> ownership transfers
> 
> and misuses, e.g. to warn about use-after-move, partial or shallow
> copy/move.
> 
> I'm trying to stay abstracted from heap allocated regions, and to
> rather
> work with "resources",
> 
> so that the state machine could be easily further extended.
> 
> However, the whole concern of ownership is really C++-like, and most
> of the
> checks would require
> 
> things unheard of in vanilla C, such as copy/move operators, ctors &
> dtors
> ...
> 
> 
> Using those constructs, it is really doable to guess ownership of
> resources, whereas without them it becomes
> 
> much more hazardous.
> 
> So, should we make this new sm -adroitly called sm-ownership- C++-
> only ?
> 
> 
> Doing so would allow the sm to reuse code from under cp/*, thus it'd
> reduce
> duplicating code and would
> 
> likely lead to less false positives in C++ -more precise function
> checks-,
> though it would make any future C-support more tedious.
> 
> It's also going against the current flow of porting what's already
> done for
> C to C++.

A gnu::owner attribute is certainly something which also makes sense
for C.

Martin




Re: [gcc13 backport 00/12] RISC-V: Implement ISA Manual Table A.6 Mappings

2023-07-25 Thread Palmer Dabbelt

On Tue, 25 Jul 2023 14:02:24 PDT (-0700), jeffreya...@gmail.com wrote:



On 7/25/23 13:50, Jakub Jelinek wrote:

On Tue, Jul 25, 2023 at 11:01:54AM -0700, Patrick O'Neill wrote:

Discussed during the weekly RISC-V GCC meeting[1] and pre-approved by
Jeff Law.
If there aren't any objections I'll commit this cherry-picked series
on Thursday (July 27th).


Please don't commit before 13.2 is released; the branch is frozen and none of
this seems to be a release blocker.

Ugh.  Missed the boat :(

I could make an argument for inclusion given the strong desire to have
compatible mappings across the toolchains and alignment with the RVI
specs -- but I won't.  As Palmer has indicated, it's been broken for a
while and we can manage that breakage.


I think if we just merge it right after 13.2 and indicate that distros 
doing long-term binary builds before 13.3 backport the patches we should 
be fine.  I think that's just Debian right now, so while it's an 
important set of bugs to get fixed it's just the single user.


It's certainly a bummer to miss 13.2, but we've just got ourselves to 
blame for forgetting about the backport ;)







jeff


[Bug c++/84542] missing -Wdeprecated-declarations on a redeclared function template

2023-07-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84542

--- Comment #4 from Jonathan Wakely  ---
This affects std::random_shuffle in libstdc++, which only warns with clang, not
gcc. The attribute needs to be on the first declaration to work with gcc.

[Bug c++/110802] Missing warning for attribute deprecated on a function template definition with a previous declaration

2023-07-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110802

--- Comment #2 from Jonathan Wakely  ---
Not sure why that didn't show up when I searched, thanks Andrew!

Re: [gcc13 backport 00/12] RISC-V: Implement ISA Manual Table A.6 Mappings

2023-07-25 Thread Jeff Law via Gcc-patches




On 7/25/23 13:50, Jakub Jelinek wrote:

On Tue, Jul 25, 2023 at 11:01:54AM -0700, Patrick O'Neill wrote:

Discussed during the weekly RISC-V GCC meeting[1] and pre-approved by
Jeff Law.
If there aren't any objections I'll commit this cherry-picked series
on Thursday (July 27th).


Please don't commit before 13.2 is released; the branch is frozen and none of
this seems to be a release blocker.

Ugh.  Missed the boat :(

I could make an argument for inclusion given the strong desire to have 
compatible mappings across the toolchains and alignment with the RVI 
specs -- but I won't.  As Palmer has indicated, it's been broken for a 
while and we can manage that breakage.





jeff


Re: [PATCH] match.pd: Implement missed optimization (~X | Y) ^ X -> ~(X & Y) [PR109986]

2023-07-25 Thread Andrew Pinski via Gcc-patches
On Tue, Jul 25, 2023 at 12:45 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Tue, Jul 25, 2023 at 03:42:21PM -0400, David Edelsohn via Gcc-patches 
> wrote:
> > Hi, Drew
> >
> > Thanks for addressing this missed optimization.
> >
> > The testcase includes an incorrect assumption: signed char, which
> > causes the testcase to fail on PowerPC.
> >
> > Should the testcase be updated to specify signed char in the function
> > signatures or should -fsigned-char be added to the command line
> > options?
>
> I think we should use signed char instead of char in the testcase.

I also think it should be `signed char` instead as I mentioned in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110803 .

Thanks,
Andrew

>
> Jakub
>


Re: [PATCH 1/2][frontend] Add novector C++ pragma

2023-07-25 Thread Jason Merrill via Gcc-patches

On 7/19/23 11:15, Tamar Christina wrote:

Hi All,

FORTRAN currently has a pragma NOVECTOR for indicating that vectorization should
not be applied to a particular loop.

ICC/ICX also has such a pragma for C and C++ called #pragma novector.

As part of this patch series I need a way to easily turn off vectorization of
particular loops, particularly for testsuite reasons.

This patch proposes a #pragma GCC novector that does the same for C++
as gfortran does for FORTRAN and what ICC/ICX does for C++.
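
As a rough illustration of the intended usage, the pragma is placed directly before the loop it applies to. The sketch below is written in C and assumes a compiler that accepts the pragma there (the patch quoted here covers only the C++ front end); sum_novector is a hypothetical example function, and compilers that do not know the pragma are expected to ignore it, possibly with a warning.

```c
/* Sum N ints.  The pragma asks the compiler not to vectorize the loop
   that immediately follows it.  */
static int
sum_novector (const int *a, int n)
{
  int s = 0;
#pragma GCC novector
  for (int i = 0; i < n; i++)
    s += a[i];
  return s;
}
```

The result of the function is unaffected; only the code generated for the loop may differ.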

I added only some basic tests here, but the next patch in the series uses this
in the testsuite in about 800 tests.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/cp/ChangeLog:

* cp-tree.def (RANGE_FOR_STMT): Update comment.
* cp-tree.h (RANGE_FOR_NOVECTOR): New.
(cp_convert_range_for, finish_while_stmt_cond, finish_do_stmt,
finish_for_cond): Add novector param.
* init.cc (build_vec_init): Default novector to false.
* method.cc (build_comparison_op): Likewise.
* parser.cc (cp_parser_statement): Likewise.
(cp_parser_for, cp_parser_c_for, cp_parser_range_for,
cp_convert_range_for, cp_parser_iteration_statement,
cp_parser_omp_for_loop, cp_parser_pragma): Support novector.
(cp_parser_pragma_novector): New.
* pt.cc (tsubst_expr): Likewise.
* semantics.cc (finish_while_stmt_cond, finish_do_stmt,
finish_for_cond): Likewise.

gcc/ChangeLog:

* doc/extend.texi: Document it.

gcc/testsuite/ChangeLog:

* g++.dg/vect/vect.exp (support vect- prefix).
* g++.dg/vect/vect-novector-pragma.cc: New test.

--- inline copy of patch --
diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
index 0e66ca70e00caa1dc4beada1024ace32954e2aaf..c13c8ea98a523c4ef1c55a11e02d5da9db7e367e 100644
--- a/gcc/cp/cp-tree.def
+++ b/gcc/cp/cp-tree.def
@@ -305,8 +305,8 @@ DEFTREECODE (IF_STMT, "if_stmt", tcc_statement, 4)
  
  /* Used to represent a range-based `for' statement. The operands are

 RANGE_FOR_DECL, RANGE_FOR_EXPR, RANGE_FOR_BODY, RANGE_FOR_SCOPE,
-   RANGE_FOR_UNROLL, and RANGE_FOR_INIT_STMT, respectively.  Only used in
-   templates.  */
+   RANGE_FOR_UNROLL, RANGE_FOR_NOVECTOR and RANGE_FOR_INIT_STMT,
+   respectively.  Only used in templates.  */


This change is unnecessary; RANGE_FOR_NOVECTOR is a flag, not an operand.


  DEFTREECODE (RANGE_FOR_STMT, "range_for_stmt", tcc_statement, 6)
  
  /* Used to represent an expression statement.  Use `EXPR_STMT_EXPR' to

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index dd3665c8ccf48a8a0b1ba2c06400fe50999ea240..8776e8f4cf8266ee715c3e7f943602fdb1acaf79 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -13658,7 +13660,13 @@ cp_parser_c_for (cp_parser *parser, tree scope, tree init, bool ivdep,
   "% pragma");
condition = error_mark_node;
  }
-  finish_for_cond (condition, stmt, ivdep, unroll);
+  else if (novector)
+{
+  cp_parser_error (parser, "missing loop condition in loop with "
+  "% pragma");
+  condition = error_mark_node;
+}


Why is it a problem for a loop with novector to have no condition?  This 
error makes sense for the other pragmas that want to optimize based on 
the condition, it seems unneeded for this pragma.



+
+   cp_token *tok = pragma_tok;
+
+   do
  {
-   tok = cp_lexer_consume_token (parser->lexer);
-   ivdep = cp_parser_pragma_ivdep (parser, tok);
-   tok = cp_lexer_peek_token (the_parser->lexer);
+   switch (cp_parser_pragma_kind (tok))
+ {
+   case PRAGMA_IVDEP:
+ {
+   if (tok != pragma_tok)
+ tok = cp_lexer_consume_token (parser->lexer);
+   ivdep = cp_parser_pragma_ivdep (parser, tok);
+   tok = cp_lexer_peek_token (the_parser->lexer);
+   break;
+ }
+   case PRAGMA_UNROLL:
+ {
+   if (tok != pragma_tok)
+ tok = cp_lexer_consume_token (parser->lexer);
+   unroll = cp_parser_pragma_unroll (parser, tok);
+   tok = cp_lexer_peek_token (the_parser->lexer);
+   break;
+ }
+   case PRAGMA_NOVECTOR:
+ {
+   if (tok != pragma_tok)
+ tok = cp_lexer_consume_token (parser->lexer);
+   novector = cp_parser_pragma_novector (parser, tok);
+   tok = cp_lexer_peek_token (the_parser->lexer);
+   break;
+ }
+   default:
+ gcc_unreachable ();


This unreachable seems to assert that if a pragma follows one of these 
pragmas, it must be another one of these pragmas?  That seems wrong; 
instead of hitting gcc_unreachable() in that case we should 

[Bug fortran/68569] ICE with automatic character object and DATA

2023-07-25 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68569

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 CC||anlauf at gcc dot gnu.org

--- Comment #6 from anlauf at gcc dot gnu.org ---
Nicolas, are you still working on this?

I have played with extensions of your patch that fix the testcases here
as well as the fallout from regression testing.  To be attached later.

[Bug c++/108960] clear tf_partial et al in instantiate_template

2023-07-25 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108960

Marek Polacek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Marek Polacek  ---
Done.

[Bug c++/108960] clear tf_partial et al in instantiate_template

2023-07-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108960

--- Comment #2 from CVS Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:39004608e79b68fe7615a026ce58dea646dba20e

commit r14-2765-g39004608e79b68fe7615a026ce58dea646dba20e
Author: Marek Polacek 
Date:   Tue Jul 25 14:36:47 2023 -0400

c++: clear tf_partial et al in instantiate_template [PR108960]

In 
we concluded that we might clear all flags except tf_warning_or_error
when performing instantiate_template.

PR c++/108960

gcc/cp/ChangeLog:

* pt.cc (lookup_and_finish_template_variable): Don't clear
tf_partial
here.
(instantiate_template): Reset all complain flags except
tf_warning_or_error.
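
The difference between clearing one flag and keeping only the warning/error flags can be shown with a small bitmask sketch in C. The enumerators and their values below are hypothetical and chosen for illustration only; they are not GCC's tsubst_flags_t definitions.

```c
/* Hypothetical flag values, for illustration only.  */
enum demo_flags
{
  DEMO_NONE = 0,
  DEMO_ERROR = 1 << 0,
  DEMO_WARNING = 1 << 1,
  DEMO_WARNING_OR_ERROR = DEMO_ERROR | DEMO_WARNING,
  DEMO_PARTIAL = 1 << 2,
  DEMO_OTHER = 1 << 3
};

/* Earlier approach: drop just one flag and keep everything else.  */
static unsigned
clear_partial (unsigned complain)
{
  return complain & ~(unsigned) DEMO_PARTIAL;
}

/* Approach in this commit: keep only the warning/error bits.  */
static unsigned
keep_warning_or_error (unsigned complain)
{
  return complain & DEMO_WARNING_OR_ERROR;
}
```

Masking with the warning/error bits removes the partial flag and any other unrelated flag in one step.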

Re: [PATCH] c++: fix ICE with is_really_empty_class [PR110106]

2023-07-25 Thread Marek Polacek via Gcc-patches
On Tue, Jul 25, 2023 at 04:24:39PM -0400, Jason Merrill wrote:
> On 7/25/23 15:59, Marek Polacek wrote:
> > Something like this, then?  I see that cp_parser_initializer_clause et al
> > offer further opportunities (because they sometimes use a dummy too) but
> > this should be a good start.
> 
> Looks good.  Please do update the other callers as well, while you're
> looking at this.

Thanks.  Can I push this part first?



Re: [PATCH] c++: clear tf_partial et al in instantiate_template [PR108960]

2023-07-25 Thread Jason Merrill via Gcc-patches

On 7/25/23 15:55, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --

In 
we concluded that we might clear all flags except tf_warning_or_error
when performing instantiate_template.

PR c++/108960

gcc/cp/ChangeLog:

* pt.cc (lookup_and_finish_template_variable): Don't clear tf_partial
here.
(instantiate_template): Reset all complain flags except
tf_warning_or_error.
---
  gcc/cp/pt.cc | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 21b08a6266a..265e2a59a52 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -10396,12 +10396,6 @@ lookup_and_finish_template_variable (tree templ, tree targs,
tree var = lookup_template_variable (templ, targs, complain);
if (var == error_mark_node)
  return error_mark_node;
-  /* We may be called while doing a partial substitution, but the
- type of the variable template may be auto, in which case we
- will call do_auto_deduction in mark_used (which clears tf_partial)
- and the auto must be properly reduced at that time for the
- deduction to work.  */
-  complain &= ~tf_partial;
var = finish_template_variable (var, complain);
mark_used (var);
return var;
@@ -22008,6 +22002,14 @@ instantiate_template (tree tmpl, tree orig_args, tsubst_flags_t complain)
if (tmpl == error_mark_node)
  return error_mark_node;
  
+  /* The other flags are not relevant anymore here, especially tf_partial

+ shouldn't be set.  For instance, we may be called while doing a partial
+ substitution of a template variable, but the type of the variable
+ template may be auto, in which case we will call do_auto_deduction
+ in mark_used (which clears tf_partial) and the auto must be properly
+ reduced at that time for the deduction to work.  */
+  complain &= tf_warning_or_error;
+
gcc_assert (TREE_CODE (tmpl) == TEMPLATE_DECL);
  
if (modules_p ())


base-commit: 6e424febfbcb27c21a7fe3a137e614765f9cf9d2




Re: [PATCH] c++: fix ICE with is_really_empty_class [PR110106]

2023-07-25 Thread Jason Merrill via Gcc-patches

On 7/25/23 15:59, Marek Polacek wrote:

On Fri, Jul 21, 2023 at 01:44:17PM -0400, Jason Merrill wrote:

On 7/20/23 17:58, Marek Polacek wrote:

On Thu, Jul 20, 2023 at 03:51:32PM -0400, Marek Polacek wrote:

On Thu, Jul 20, 2023 at 02:37:07PM -0400, Jason Merrill wrote:

On 7/20/23 14:13, Marek Polacek wrote:

On Wed, Jul 19, 2023 at 10:11:27AM -0400, Patrick Palka wrote:

On Tue, 18 Jul 2023, Marek Polacek via Gcc-patches wrote:


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and branches?


Looks reasonable to me.


Thanks.

Though I wonder if we could also fix this by not checking potentiality
at all in this case?  The problematic call to is_rvalue_constant_expression
happens from cp_parser_constant_expression with 'allow_non_constant' != 0
and with 'non_constant_p' being a dummy out argument that comes from
cp_parser_functional_cast, so the result of is_rvalue_constant_expression
is effectively unused in this case, and we should be able to safely elide
it when 'allow_non_constant && non_constant_p == nullptr'.


Sounds plausible.  I think my patch could be applied first since it
removes a tiny bit of code, then I can hopefully remove the flag below,
then maybe go back and optimize the call to is_rvalue_constant_expression.
Does that sound sensible?


Relatedly, ISTM the member cp_parser::non_integral_constant_expression_p
is also effectively unused and could be removed?


It looks that way.  Seems it's only used in cp_parser_constant_expression:
10806   if (allow_non_constant_p)
10807 *non_constant_p = parser->non_integral_constant_expression_p;
but that could be easily replaced by a local var.  I'd be happy to see if
we can actually do away with it.  (I wonder why it was introduced and when
it actually stopped being useful.)


It was for the C++98 notion of constant-expression, which was more of a
parser-level notion, and has been supplanted by the C++11 version.  I'm
happy to remove it, and therefore remove the is_rvalue_constant_expression
call.


Wonderful.  I'll do that next.


I found a use of parser->non_integral_constant_expression_p:
finish_id_expression_1 can set it to true which then makes
a difference in cp_parser_constant_expression in C++98.  In
cp_parser_constant_expression we set n_i_c_e_p to false, call
cp_parser_assignment_expression in which finish_id_expression_1
sets n_i_c_e_p to true, then back in cp_parser_constant_expression
we skip the cxx11 block, and set *non_constant_p to true.  If I
remove n_i_c_e_p, we lose that.  This can be seen in init/array60.C.


Sure, we would need to use the C++11 code for C++98 mode, which is likely
fine but is more uncertain.

It's probably simpler to just ignore n_i_c_e_p for C++11 and up, along with
Patrick's suggestion of allowing null non_constant_p with true
allow_non_constant_p.
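
The idea of permitting a null out-parameter and skipping the check when nobody reads the result can be sketched in plain C. This is not the parser code: parse_value, is_non_constant and the counter are hypothetical names, and the "non-constant" criterion is a placeholder.

```c
#include <stddef.h>

/* Counts how often the (hypothetical) expensive check runs.  */
static int expensive_checks;

static int
is_non_constant (int value)
{
  expensive_checks++;
  return value < 0;   /* Placeholder criterion, for illustration only.  */
}

/* NON_CONSTANT_P may be NULL when the caller does not care about the
   answer; in that case the expensive check is skipped entirely.  */
static int
parse_value (int value, int allow_non_constant_p, int *non_constant_p)
{
  if (allow_non_constant_p && non_constant_p != NULL)
    *non_constant_p = is_non_constant (value);
  return value;
}
```

Callers that previously passed a dummy variable can pass NULL instead and avoid the cost of a check whose result they would discard.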


Something like this, then?  I see that cp_parser_initializer_clause et al
offer further opportunities (because they sometimes use a dummy too) but
this should be a good start.


Looks good.  Please do update the other callers as well, while you're 
looking at this.



Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
It's pointless to call *_rvalue_constant_expression when we're not using
the result.  Also apply some drive-by cleanups.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_constant_expression): Allow non_constant_p to be
nullptr even when allow_non_constant_p is true.  Don't call
_rvalue_constant_expression when not necessary.  Move local variable
declarations closer to their first use.
(cp_parser_static_assert): Don't pass a dummy down to
cp_parser_constant_expression.
---
  gcc/cp/parser.cc | 24 +++-
  1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 5e2b5cba57e..efaa806f107 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -10734,11 +10734,6 @@ cp_parser_constant_expression (cp_parser* parser,
   bool *non_constant_p /* = NULL */,
   bool strict_p /* = false */)
  {
-  bool saved_integral_constant_expression_p;
-  bool saved_allow_non_integral_constant_expression_p;
-  bool saved_non_integral_constant_expression_p;
-  cp_expr expression;
-
/* It might seem that we could simply parse the
   conditional-expression, and then check to see if it were
   TREE_CONSTANT.  However, an expression that is TREE_CONSTANT is
@@ -10757,10 +10752,12 @@ cp_parser_constant_expression (cp_parser* parser,
   will fold this operation to an INTEGER_CST for `3'.  */
  
/* Save the old settings.  */

-  saved_integral_constant_expression_p = parser->integral_constant_expression_p;
-  saved_allow_non_integral_constant_expression_p
+  bool saved_integral_constant_expression_p
+= parser->integral_constant_expression_p;
+  bool saved_allow_non_integral_constant_expression_p
  = parser->allow_non_integral_constant_expression_p;
-  

Re: [gcc13 backport 00/12] RISC-V: Implement ISA Manual Table A.6 Mappings

2023-07-25 Thread Palmer Dabbelt

On Tue, 25 Jul 2023 12:50:48 PDT (-0700), ja...@redhat.com wrote:

On Tue, Jul 25, 2023 at 11:01:54AM -0700, Patrick O'Neill wrote:

Discussed during the weekly RISC-V GCC meeting[1] and pre-approved by
Jeff Law.
If there aren't any objections I'll commit this cherry-picked series
on Thursday (July 27th).


Please don't commit this before 13.2 is released; the branch is frozen and
none of this seems to be a release blocker.


Sorry I missed this.  IMO it's fine to wait, this has been broken for 
5-10 years so we can wait another cycle ;)




Jakub


Re: [PATCH] c++: fix ICE with is_really_empty_class [PR110106]

2023-07-25 Thread Marek Polacek via Gcc-patches
On Fri, Jul 21, 2023 at 01:44:17PM -0400, Jason Merrill wrote:
> On 7/20/23 17:58, Marek Polacek wrote:
> > On Thu, Jul 20, 2023 at 03:51:32PM -0400, Marek Polacek wrote:
> > > On Thu, Jul 20, 2023 at 02:37:07PM -0400, Jason Merrill wrote:
> > > > On 7/20/23 14:13, Marek Polacek wrote:
> > > > > On Wed, Jul 19, 2023 at 10:11:27AM -0400, Patrick Palka wrote:
> > > > > > On Tue, 18 Jul 2023, Marek Polacek via Gcc-patches wrote:
> > > > > > 
> > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 
> > > > > > > branches?
> > > > > > 
> > > > > > Looks reasonable to me.
> > > > > 
> > > > > Thanks.
> > > > > > Though I wonder if we could also fix this by not checking 
> > > > > > potentiality
> > > > > > at all in this case?  The problematic call to 
> > > > > > is_rvalue_constant_expression
> > > > > > happens from cp_parser_constant_expression with 
> > > > > > 'allow_non_constant' != 0
> > > > > > and with 'non_constant_p' being a dummy out argument that comes from
> > > > > > cp_parser_functional_cast, so the result of 
> > > > > > is_rvalue_constant_expression
> > > > > > is effectively unused in this case, and we should be able to safely 
> > > > > > elide
> > > > > > it when 'allow_non_constant && non_constant_p == nullptr'.
> > > > > 
> > > > > Sounds plausible.  I think my patch could be applied first since it
> > > > > removes a tiny bit of code, then I can hopefully remove the flag 
> > > > > below,
> > > > > then maybe go back and optimize the call to 
> > > > > is_rvalue_constant_expression.
> > > > > Does that sound sensible?
> > > > > 
> > > > > > Relatedly, ISTM the member 
> > > > > > cp_parser::non_integral_constant_expression_p
> > > > > > is also effectively unused and could be removed?
> > > > > 
> > > > > It looks that way.  Seems it's only used in 
> > > > > cp_parser_constant_expression:
> > > > > 10806   if (allow_non_constant_p)
> > > > > 10807 *non_constant_p = 
> > > > > parser->non_integral_constant_expression_p;
> > > > > but that could be easily replaced by a local var.  I'd be happy to 
> > > > > see if
> > > > > we can actually do away with it.  (I wonder why it was introduced and 
> > > > > when
> > > > > it actually stopped being useful.)
> > > > 
> > > > It was for the C++98 notion of constant-expression, which was more of a
> > > > parser-level notion, and has been supplanted by the C++11 version.  I'm
> > > > happy to remove it, and therefore remove the 
> > > > is_rvalue_constant_expression
> > > > call.
> > > 
> > > Wonderful.  I'll do that next.
> > 
> > I found a use of parser->non_integral_constant_expression_p:
> > finish_id_expression_1 can set it to true which then makes
> > a difference in cp_parser_constant_expression in C++98.  In
> > cp_parser_constant_expression we set n_i_c_e_p to false, call
> > cp_parser_assignment_expression in which finish_id_expression_1
> > sets n_i_c_e_p to true, then back in cp_parser_constant_expression
> > we skip the cxx11 block, and set *non_constant_p to true.  If I
> > remove n_i_c_e_p, we lose that.  This can be seen in init/array60.C.
> 
> Sure, we would need to use the C++11 code for C++98 mode, which is likely
> fine but is more uncertain.
> 
> It's probably simpler to just ignore n_i_c_e_p for C++11 and up, along with
> Patrick's suggestion of allowing null non_constant_p with true
> allow_non_constant_p.

Something like this, then?  I see that cp_parser_initializer_clause et al
offer further opportunities (because they sometimes use a dummy too) but
this should be a good start.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
It's pointless to call *_rvalue_constant_expression when we're not using
the result.  Also apply some drive-by cleanups.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_constant_expression): Allow non_constant_p to be
nullptr even when allow_non_constant_p is true.  Don't call
_rvalue_constant_expression when not necessary.  Move local variable
declarations closer to their first use.
(cp_parser_static_assert): Don't pass a dummy down to
cp_parser_constant_expression.
---
 gcc/cp/parser.cc | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 5e2b5cba57e..efaa806f107 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -10734,11 +10734,6 @@ cp_parser_constant_expression (cp_parser* parser,
   bool *non_constant_p /* = NULL */,
   bool strict_p /* = false */)
 {
-  bool saved_integral_constant_expression_p;
-  bool saved_allow_non_integral_constant_expression_p;
-  bool saved_non_integral_constant_expression_p;
-  cp_expr expression;
-
   /* It might seem that we could simply parse the
  conditional-expression, and then check to see if it were
  TREE_CONSTANT.  However, an expression that is TREE_CONSTANT is
@@ -10757,10 +10752,12 @@ cp_parser_constant_expression 

Re: [gcc13 backport 00/12] RISC-V: Implement ISA Manual Table A.6 Mappings

2023-07-25 Thread Palmer Dabbelt

On Tue, 25 Jul 2023 11:01:54 PDT (-0700), Patrick O'Neill wrote:

Discussed during the weekly RISC-V GCC meeting[1] and pre-approved by
Jeff Law.
If there aren't any objections I'll commit this cherry-picked series
on Thursday (July 27th).


+Jakub

According to the "GCC 13.1.1 Status Report (2023-07-20)", it looks like 
we're frozen for 13.2 and thus would need a release maintainer to sign 
off on anything we backport until 13.2 is released.


I'm not opposed to the backport, but it does look like we're down to no 
P1 regressions which means we might release very soon.  So we should at 
least make sure this gets through all the tests and such.  It's kind of 
splitting hairs as this is a pretty bad set of bugs we're fixing and 
distros are probably going to just backport it anyway, so not sure what 
the right answer is.



Patchset on trunk:
https://inbox.sourceware.org/gcc-patches/20230427162301.1151333-1-patr...@rivosinc.com/
First commit: f37a36bce81b50a43ec1613c1d08d803642f7506

Also includes bugfix from:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109713
commit: 4bd434fbfc7865961a8e10d7e9601b28765ce7be

[1] 
https://inbox.sourceware.org/gcc/mhng-b7423fca-67ec-4ce4-9694-4e062632ceb0@palmer-ri-x1c9/T/#t

Martin Liska (1):
  riscv: fix error: control reaches end of non-void function

Patrick O'Neill (11):
  RISC-V: Eliminate SYNC memory models
  RISC-V: Enforce Libatomic LR/SC SEQ_CST
  RISC-V: Enforce subword atomic LR/SC SEQ_CST
  RISC-V: Enforce atomic compare_exchange SEQ_CST
  RISC-V: Add AMO release bits
  RISC-V: Strengthen atomic stores
  RISC-V: Eliminate AMO op fences
  RISC-V: Weaken LR/SC pairs
  RISC-V: Weaken mem_thread_fence
  RISC-V: Weaken atomic loads
  RISC-V: Table A.6 conformance tests

 gcc/config/riscv/riscv-protos.h   |   3 +
 gcc/config/riscv/riscv.cc |  66 --
 gcc/config/riscv/sync.md  | 196 --
 .../riscv/amo-table-a-6-amo-add-1.c   |  15 ++
 .../riscv/amo-table-a-6-amo-add-2.c   |  15 ++
 .../riscv/amo-table-a-6-amo-add-3.c   |  15 ++
 .../riscv/amo-table-a-6-amo-add-4.c   |  15 ++
 .../riscv/amo-table-a-6-amo-add-5.c   |  15 ++
 .../riscv/amo-table-a-6-compare-exchange-1.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-2.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-3.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-4.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-5.c  |   9 +
 .../riscv/amo-table-a-6-compare-exchange-6.c  |  10 +
 .../riscv/amo-table-a-6-compare-exchange-7.c  |   9 +
 .../gcc.target/riscv/amo-table-a-6-fence-1.c  |  14 ++
 .../gcc.target/riscv/amo-table-a-6-fence-2.c  |  15 ++
 .../gcc.target/riscv/amo-table-a-6-fence-3.c  |  15 ++
 .../gcc.target/riscv/amo-table-a-6-fence-4.c  |  15 ++
 .../gcc.target/riscv/amo-table-a-6-fence-5.c  |  15 ++
 .../gcc.target/riscv/amo-table-a-6-load-1.c   |  16 ++
 .../gcc.target/riscv/amo-table-a-6-load-2.c   |  17 ++
 .../gcc.target/riscv/amo-table-a-6-load-3.c   |  18 ++
 .../gcc.target/riscv/amo-table-a-6-store-1.c  |  16 ++
 .../gcc.target/riscv/amo-table-a-6-store-2.c  |  17 ++
 .../riscv/amo-table-a-6-store-compat-3.c  |  18 ++
 .../riscv/amo-table-a-6-subword-amo-add-1.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-2.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-3.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-4.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-5.c   |   9 +
 gcc/testsuite/gcc.target/riscv/pr89835.c  |   9 +
 libgcc/config/riscv/atomic.c  |   4 +-
 33 files changed, 563 insertions(+), 75 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-4.c
 create mode 100644 

[PATCH] c++: clear tf_partial et al in instantiate_template [PR108960]

2023-07-25 Thread Marek Polacek via Gcc-patches
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --

In 
we concluded that we might clear all flags except tf_warning_or_error
when performing instantiate_template.

PR c++/108960

gcc/cp/ChangeLog:

* pt.cc (lookup_and_finish_template_variable): Don't clear tf_partial
here.
(instantiate_template): Reset all complain flags except
tf_warning_or_error.
---
 gcc/cp/pt.cc | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 21b08a6266a..265e2a59a52 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -10396,12 +10396,6 @@ lookup_and_finish_template_variable (tree templ, tree targs,
   tree var = lookup_template_variable (templ, targs, complain);
   if (var == error_mark_node)
 return error_mark_node;
-  /* We may be called while doing a partial substitution, but the
- type of the variable template may be auto, in which case we
- will call do_auto_deduction in mark_used (which clears tf_partial)
- and the auto must be properly reduced at that time for the
- deduction to work.  */
-  complain &= ~tf_partial;
   var = finish_template_variable (var, complain);
   mark_used (var);
   return var;
@@ -22008,6 +22002,14 @@ instantiate_template (tree tmpl, tree orig_args, tsubst_flags_t complain)
   if (tmpl == error_mark_node)
 return error_mark_node;
 
+  /* The other flags are not relevant anymore here, especially tf_partial
+ shouldn't be set.  For instance, we may be called while doing a partial
+ substitution of a template variable, but the type of the variable
+ template may be auto, in which case we will call do_auto_deduction
+ in mark_used (which clears tf_partial) and the auto must be properly
+ reduced at that time for the deduction to work.  */
+  complain &= tf_warning_or_error;
+
   gcc_assert (TREE_CODE (tmpl) == TEMPLATE_DECL);
 
   if (modules_p ())

base-commit: 6e424febfbcb27c21a7fe3a137e614765f9cf9d2
-- 
2.41.0
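
The difference between the removed `complain &= ~tf_partial` and the added
`complain &= tf_warning_or_error` can be sketched with a toy flag set.  The
enumerator values below are illustrative assumptions only; the real
tsubst_flags enumerators are defined in GCC's C++ front end and may differ.

```cpp
#include <cstdint>

// Illustrative flag values only (assumption): not copied from GCC.
enum toy_flags : std::uint32_t {
    toy_none = 0,
    toy_error = 1u << 0,
    toy_warning = 1u << 1,
    toy_decltype = 1u << 7,
    toy_partial = 1u << 8,
    toy_warning_or_error = toy_warning | toy_error
};

// Old approach: strip one known-problematic flag, keep everything else.
constexpr std::uint32_t clear_partial(std::uint32_t complain) {
    return complain & ~static_cast<std::uint32_t>(toy_partial);
}

// New approach: keep only the diagnostic bits, dropping every other flag.
constexpr std::uint32_t keep_diagnostics(std::uint32_t complain) {
    return complain & toy_warning_or_error;
}
```

Masking with an allow-list means any other context-specific flag that
happens to be set is dropped as well, which matches the new comment's
statement that the other flags are not relevant at that point.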



Re: [gcc13 backport 00/12] RISC-V: Implement ISA Manual Table A.6 Mappings

2023-07-25 Thread Jakub Jelinek via Gcc-patches
On Tue, Jul 25, 2023 at 11:01:54AM -0700, Patrick O'Neill wrote:
> Discussed during the weekly RISC-V GCC meeting[1] and pre-approved by
> Jeff Law.
> If there aren't any objections I'll commit this cherry-picked series
> on Thursday (July 27th).

Please don't commit this before 13.2 is released; the branch is frozen and
none of this seems to be a release blocker.

Jakub



Re: [PATCH] match.pd: Implement missed optimization (~X | Y) ^ X -> ~(X & Y) [PR109986]

2023-07-25 Thread Jakub Jelinek via Gcc-patches
On Tue, Jul 25, 2023 at 03:42:21PM -0400, David Edelsohn via Gcc-patches wrote:
> Hi, Drew
> 
> Thanks for addressing this missed optimization.
> 
> The testcase assumes that plain char is signed, which causes it to fail
> on PowerPC, where plain char is unsigned by default.
> 
> Should the testcase be updated to specify signed char in the function
> signatures or should -fsigned-char be added to the command line
> options?

I think we should use signed char instead of char in the testcase.

Jakub
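
A small sketch of why spelling out `signed char` makes the test portable.
The function names are hypothetical and this is not the testcase from the
patch; it only checks the identity (~X | Y) ^ X == ~(X & Y) on explicitly
signed operands, so the outcome does not depend on the signedness of plain
char on the target.

```cpp
#include <climits>

// Explicitly signed parameters: the result does not depend on whether the
// target treats plain `char` as signed or unsigned.
signed char lhs_form(signed char x, signed char y) {
    return static_cast<signed char>((~x | y) ^ x);
}

signed char rhs_form(signed char x, signed char y) {
    return static_cast<signed char>(~(x & y));
}

// True when plain `char` is signed on the current target.
constexpr bool plain_char_is_signed() { return CHAR_MIN < 0; }

// Exhaustively compare the two forms over every pair of signed char values.
bool forms_agree_everywhere() {
    for (int x = SCHAR_MIN; x <= SCHAR_MAX; ++x)
        for (int y = SCHAR_MIN; y <= SCHAR_MAX; ++y) {
            const signed char sx = static_cast<signed char>(x);
            const signed char sy = static_cast<signed char>(y);
            if (lhs_form(sx, sy) != rhs_form(sx, sy))
                return false;
        }
    return true;
}
```

The alternative raised in the thread, adding -fsigned-char to the test
options, would change how plain char behaves for the whole test rather
than fixing the types in the function signatures.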



[Bug target/110776] [14 Regression] powerpc-darwin bootstrap broken after r14-2490 with ICE rs6000.cc:5069 building libgfortran

2023-07-25 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110776

--- Comment #8 from Iain Sandoe  ---
(In reply to rguent...@suse.de from comment #7)
> On Tue, 25 Jul 2023, linkw at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110776
> > 
> > --- Comment #6 from Kewen Lin  ---
> > (In reply to rguent...@suse.de from comment #5)
> > > On Tue, 25 Jul 2023, linkw at gcc dot gnu.org wrote:
> > > 
> > > I think apart from the consideration what a single element vector
> > > is compared to a scalar, a more to-the-point fix is
> > > 
> > >   if (VECTOR_TYPE_P (ltype)
> > >   && memory_access_type != VMAT_ELEMENTWISE)
> > 
> > Thanks for the suggestion! I thought checking lnel can also cover
> > VMAT_STRIDED_SLP's special case having const_nunits 1, but it seems 
> > impossible
> > to have?
> 
> I think so, unless I'm convinced with a testcase ;)

(sorry for being a bit slow - we had a power outage that wasted most of the
day)

Richi's suggested patch fixes build of a cross-build for powerpc-darwin and the
test results look OK too.  A non-expert look at the code suggests that
VMAT_ELEMENTWISE is already accounted for on the write side, so that we should
not see a call to the costing code for the equivalent write-side.

Re: [PATCH] match.pd: Implement missed optimization (x << c) >> c -> -(x & 1) [PR101955]

2023-07-25 Thread Jakub Jelinek via Gcc-patches
On Tue, Jul 25, 2023 at 03:25:57PM -0400, Drew Ross wrote:
> > With that fixed I think for non-vector integrals the above is the most
> > suitable canonical form of a sign-extension.  Note it should also work
> > for any other constant shift amount - just use the appropriate
> > intermediate precision for the truncating type.
> > We _might_ want
> > to consider to only use the converts when the intermediate type has
> > mode precision (and as a special case allow one bit as in your above case)
> > so it can expand to (sign_extend: (subreg: reg)).
> 
> Here is a pattern that only matches truncations that result in mode
> precision (or precision of 1):
> 
> (simplify
>  (rshift (nop_convert? (lshift @0 INTEGER_CST@1)) @@1)
>  (if (INTEGRAL_TYPE_P (type)
>   && !TYPE_UNSIGNED (type)
>   && wi::gt_p (element_precision (type), wi::to_wide (@1),
>                TYPE_SIGN (TREE_TYPE (@1))))

I'd use
 && wi::ltu_p (wi::to_wide (@1), element_precision (type))
If the shift count would be negative, you'd otherwise ICE in tree_to_uhwi on
it (sure, that is UB at runtime, but compiler shouldn't ICE on it).

>   (with {
> int width = element_precision (type) - tree_to_uhwi (@1);
> tree stype = build_nonstandard_integer_type (width, 0);
>}
>(if (TYPE_PRECISION (stype) == 1 || type_has_mode_precision_p (stype))
> (convert (convert:stype @0))))))

Jakub
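
The transformation under discussion can be checked on ordinary 32-bit
integers.  This is a sketch with hypothetical helper names, assuming a
two's-complement target where right-shifting a negative value is an
arithmetic shift (as GCC implements it); it is not the match.pd pattern
itself.

```cpp
#include <cassert>
#include <cstdint>

// (x << c) >> c on a signed 32-bit value, written so the left shift happens
// on the unsigned representation and cannot overflow a signed type.
std::int32_t shift_pair(std::int32_t x, unsigned c) {
    assert(c < 32);  // out-of-range shift counts are undefined; reject them
    const std::uint32_t shifted = static_cast<std::uint32_t>(x) << c;
    return static_cast<std::int32_t>(shifted) >> c;  // arithmetic shift
}

// Special case c == 31: only bit 0 survives and is sign-extended.
std::int32_t low_bit_form(std::int32_t x) { return -(x & 1); }

// Special case c == 24: the intermediate type has 8 bits, so the shift pair
// is a truncation to int8_t followed by a sign-extending conversion.
std::int32_t via_int8(std::int32_t x) {
    return static_cast<std::int32_t>(static_cast<std::int8_t>(x));
}
```

The assert on the shift count plays the same role as the unsigned
comparison suggested above: a negative or oversized count must be filtered
out before the width of the intermediate type is computed.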



Re: [PATCH] match.pd: Implement missed optimization (~X | Y) ^ X -> ~(X & Y) [PR109986]

2023-07-25 Thread David Edelsohn via Gcc-patches
Hi, Drew

Thanks for addressing this missed optimization.

The testcase assumes that plain char is signed, which causes it to fail
on PowerPC, where plain char is unsigned by default.

Should the testcase be updated to specify signed char in the function
signatures or should -fsigned-char be added to the command line
options?

Thanks, David


Re: [PATCH] match.pd: Implement missed optimization (x << c) >> c -> -(x & 1) [PR101955]

2023-07-25 Thread Drew Ross via Gcc-patches
> With that fixed I think for non-vector integrals the above is the most
> suitable canonical form of a sign-extension.  Note it should also work for
> any other constant shift amount - just use the appropriate intermediate
> precision for the truncating type.
> We _might_ want
> to consider to only use the converts when the intermediate type has
> mode precision (and as a special case allow one bit as in your above case)
> so it can expand to (sign_extend: (subreg: reg)).

Here is a pattern that only matches truncations that result in mode
precision (or precision of 1):

(simplify
 (rshift (nop_convert? (lshift @0 INTEGER_CST@1)) @@1)
 (if (INTEGRAL_TYPE_P (type)
  && !TYPE_UNSIGNED (type)
      && wi::gt_p (element_precision (type), wi::to_wide (@1),
                   TYPE_SIGN (TREE_TYPE (@1))))
  (with {
    int width = element_precision (type) - tree_to_uhwi (@1);
    tree stype = build_nonstandard_integer_type (width, 0);
   }
   (if (TYPE_PRECISION (stype) == 1 || type_has_mode_precision_p (stype))
    (convert (convert:stype @0))))))

Look ok?

> You might also want to verify what RTL expansion
> produces before/after - it at least shouldn't be worse.

The RTL is slightly better for the mode precision cases and slightly worse
for the precision 1 case.

> That said - do you have any testcase where the canonicalization is an
> enabler for further transforms or was this requested stand-alone?

No, I don't have any specific test cases. This patch is just in response to
PR101955.

On Tue, Jul 25, 2023 at 2:55 AM Richard Biener wrote:

> On Mon, Jul 24, 2023 at 9:42 PM Jakub Jelinek  wrote:
> >
> > On Mon, Jul 24, 2023 at 03:29:54PM -0400, Drew Ross via Gcc-patches wrote:
> > > So would something like
> > >
> > > (simplify
> > >  (rshift (nop_convert? (lshift @0 INTEGER_CST@1)) @@1)
> > >  (with { tree stype = build_nonstandard_integer_type (1, 0); }
> > >  (if (INTEGRAL_TYPE_P (type)
> > >   && !TYPE_UNSIGNED (type)
> > >   && wi::eq_p (wi::to_wide (@1), element_precision (type) - 1))
> > >   (convert (convert:stype @0)))))
> > >
> > > work?
> >
> > Certainly swap the if and with and the (with then should be indented by 1
> > column to the right of (if and (convert one further (the reason for the
> > swapping is not to call build_nonstandard_integer_type when it will not
> > be needed, which will be probably far more often than an actual match).
>
> With that fixed I think for non-vector integrals the above is the most suitable
> canonical form of a sign-extension.  Note it should also work for any other
> constant shift amount - just use the appropriate intermediate precision for
> the truncating type.  You might also want to verify what RTL expansion
> produces before/after - it at least shouldn't be worse.  We _might_ want
> to consider to only use the converts when the intermediate type has
> mode precision (and as a special case allow one bit as in your above case)
> so it can expand to (sign_extend: (subreg: reg)).
>
> > As discussed privately, the above isn't what we want for vectors and the
> > 2 shifts are probably best on most arches because even when using
> > -(x & 1) the { 1, 1, 1, ... } vector would often need to be loaded from
> > memory.
>
> I think for vectors a vpcmpgt {0,0,0,..}, %xmm is the cheapest way of
> producing the result.  Note that to reflect this on GIMPLE you'd need
>
>   _2 = _1 < { 0,0...};
>   res = _2 ? { -1, -1, ...} : { 0, 0,...};
>
> because whether the ISA has a way to produce all-ones masks isn't known.
>
> For scalars using -(T)(_1 < 0) would also be possible.
>
> That said - do you have any testcase where the canonicalization is an
> enabler for further transforms or was this requested stand-alone?
>
> Thanks,
> Richard.
>
> > Jakub
> >
>
>


List myself as "nvptx port" maintainer (was: Thomas Schwinge appointed co-maintainer of the nvptx backend)

2023-07-25 Thread Thomas Schwinge
Hi!

On 2023-07-19T23:41:47+0200, Gerald Pfeifer  wrote:
> It's my pleasure to announce Thomas Schwinge as co-maintainer of the
> nvptx backend.
>
> Congratulations and Happy Hacking, Thomas! Please go ahead and update
> MAINTAINERS accordingly.
>
> Gerald (on behalf of the steering committee)

Thanks!  I've pushed commit 28e3d361ba0cfa7ea2f90706159a144eaf4b650e
'List myself as "nvptx port" maintainer', see attached.


Regards
 Thomas


From 28e3d361ba0cfa7ea2f90706159a144eaf4b650e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 25 Jul 2023 21:17:52 +0200
Subject: [PATCH] List myself as "nvptx port" maintainer

	* MAINTAINERS: List myself as "nvptx port" maintainer.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index b626d89fe34..e9b11b43a0f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -102,6 +102,7 @@ nds32 port		Shiva Chen		
 nios2 port		Chung-Lin Tang		
 nios2 port		Sandra Loosemore	
 nvptx port		Tom de Vries		
+nvptx port		Thomas Schwinge		
 or1k port		Stafford Horne		
 pdp11 port		Paul Koning		
 powerpcspe port		Andrew Jenner		
-- 
2.34.1



Re: [PATCH] range-op-float: Fix up -frounding-math frange_arithmetic +- handling [PR110755]

2023-07-25 Thread Aldy Hernandez via Gcc-patches
The frange bits look fine to me, so if you feel confident in the math 
logic, go right ahead :).


Thanks.
Aldy

On 7/24/23 18:01, Jakub Jelinek wrote:

Hi!

IEEE754 says that x + (-x) and x - x result in +0 in all rounding modes
but rounding towards negative infinity, in which case the result is -0
for all finite x.  x + x and x - (-x) if it is zero retain sign of x.
Now, range_arithmetic implements the normal rounds to even rounding,
and as the addition or subtraction in those cases is exact, we don't do any
further rounding etc. and e.g. on the testcase below distilled from glibc
compute a range [+0, +INF], which is fine for -fno-rounding-math or
if we'd have a guarantee that those statements aren't executed with rounding
towards negative infinity.

I believe it is only +- which has this problematic behavior and I think
it is best to deal with it in frange_arithmetic; if we know -frounding-math
is on, it is x + (-x) or x - x and we are asked to round to negative
infinity (i.e. want low bound rather than high bound), change +0 result to
-0.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
after a while for 13.3?  I'm afraid of rushing this so late into 13.2...

2023-07-24  Jakub Jelinek  

PR tree-optimization/110755
* range-op-float.cc (frange_arithmetic): Change +0 result to -0
for PLUS_EXPR or MINUS_EXPR if -frounding-math, inf is negative and
it is exact op1 + (-op1) or op1 - op1.

* gcc.dg/pr110755.c: New test.

--- gcc/range-op-float.cc.jj	2023-07-23 19:32:20.832434105 +0200
+++ gcc/range-op-float.cc	2023-07-24 09:41:26.231030258 +0200
@@ -324,6 +324,24 @@ frange_arithmetic (enum tree_code code,
   bool inexact = real_arithmetic (&value, code, &op1, &op2);
   real_convert (&result, mode, &value);
 
+  /* When rounding towards negative infinity, x + (-x) and
+     x - x is -0 rather than +0 real_arithmetic computes.
+     So, when we are looking for lower bound (inf is negative),
+     use -0 rather than +0.  */
+  if (flag_rounding_math
+      && (code == PLUS_EXPR || code == MINUS_EXPR)
+      && !inexact
+      && real_iszero (&result)
+      && !real_isneg (&result)
+      && real_isneg (&inf))
+    {
+      REAL_VALUE_TYPE op2a = op2;
+      if (code == PLUS_EXPR)
+	op2a.sign ^= 1;
+      if (real_isneg (&op1) == real_isneg (&op2a) && real_equal (&op1, &op2a))
+	result.sign = 1;
+    }
+
// Be extra careful if there may be discrepancies between the
// compile and runtime results.
bool round = false;
--- gcc/testsuite/gcc.dg/pr110755.c.jj  2023-07-21 10:34:05.037251433 +0200
+++ gcc/testsuite/gcc.dg/pr110755.c 2023-07-21 10:35:10.986326816 +0200
@@ -0,0 +1,29 @@
+/* PR tree-optimization/110755 */
+/* { dg-do run } */
+/* { dg-require-effective-target fenv } */
+/* { dg-require-effective-target hard_float } */
+/* { dg-options "-O2 -frounding-math" } */
+
+#include <fenv.h>
+
+__attribute__((noipa)) float
+foo (float x)
+{
+  if (x > 0.0)
+{
+  x += 0x1p+23;
+  x -= 0x1p+23;
+  x = __builtin_fabsf (x);
+}
+  return x;
+}
+
+int
+main ()
+{
+#ifdef FE_DOWNWARD
+  fesetround (FE_DOWNWARD);
+  if (__builtin_signbit (foo (0.5)))
+__builtin_abort ();
+#endif
+}

Jakub
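
The IEEE 754 behaviour described above can be observed directly.  The
following is a sketch with hypothetical function names, assuming a hosted
environment with hardware floating point; volatile objects are used so the
compiler neither folds the subtraction nor moves it across the rounding
mode changes.

```cpp
#include <cfenv>
#include <cmath>

// Returns 1 if x - x evaluates to -0 under round-toward-negative-infinity,
// 0 if it evaluates to +0, and -1 if that rounding mode cannot be selected.
int sign_of_x_minus_x_downward(float x) {
#ifdef FE_DOWNWARD
    const int saved = std::fegetround();
    if (std::fesetround(FE_DOWNWARD) != 0)
        return -1;
    volatile float a = x;
    volatile float b = x;
    volatile float diff = a - b;  // evaluated at run time, rounding down
    const float result = diff;
    std::fesetround(saved);       // restore the caller's rounding mode
    return std::signbit(result) ? 1 : 0;
#else
    (void)x;
    return -1;
#endif
}

// Same computation in the current (by default round-to-nearest) mode.
int sign_of_x_minus_x_nearest(float x) {
    volatile float a = x;
    volatile float b = x;
    volatile float diff = a - b;
    return std::signbit(diff) ? 1 : 0;
}
```

On a target that supports FE_DOWNWARD the first function is expected to
report a negative zero for any finite x, which is why a computed range of
[+0, +INF] is too narrow once -frounding-math is in effect.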





[Bug c++/110810] New: ICE in check_noexcept_r, at cp/except.cc:1068

2023-07-25 Thread cuzdav at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110810

Bug ID: 110810
   Summary: ICE in check_noexcept_r, at cp/except.cc:1068
   Product: gcc
   Version: 12.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cuzdav at gmail dot com
  Target Milestone: ---

Created attachment 55634
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55634=edit
preprocessed code, zipped

Starting with x86_64 version of gcc 12.1 (linux), and through all newer
versions (including the trunk) on Compiler explorer, I receive an internal
compiler error on the following code:

Some overlap with https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109899 but there
is enough difference that I thought it might still be useful to report.

https://godbolt.org/z/ThTdxqzKG

#include 

struct Foo {
Foo() {}
~Foo() {}
};

struct X {
Foo data[4];
};

template
void f() {
char storage[256];
auto& object = *new(storage) X{};
object.~X();
}

* removing {} in the call to placement new "fixes" it, as does replacing it
with parentheses.  Another fix is removing either the default ctor or destructor
from Foo, or making data not be an array, or f() not be a template.  It's a
strange and specific combination.

Output:


<source>: In function 'void f()':
<source>:15:36: internal compiler error: in check_noexcept_r, at
cp/except.cc:1068
   15 | auto& object = *new(storage) X{};
  |^
0x246af2e internal_error(char const*, ...)
???:0
0xac8ab6 fancy_abort(char const*, int, char const*)
???:0
0x16d661c walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0x16d9912 walk_tree_without_duplicates_1(tree_node**, tree_node*
(*)(tree_node**, int*, void*), void*, tree_node* (*)(tree_node**, int*,
tree_node* (*)(tree_node**, int*, void*), void*, hash_set >*))
???:0
0xbd04a3 expr_noexcept_p(tree_node*, int)
???:0
0xbd64f0 build_vec_delete(unsigned int, tree_node*, tree_node*,
special_function_kind, int, int)
???:0
0xbd6a1e build_delete(unsigned int, tree_node*, tree_node*,
special_function_kind, int, int, int)
???:0
0xb8c5cc cxx_maybe_build_cleanup(tree_node*, int)
???:0
0xaf6b55 build_new_method_call(tree_node*, tree_node*, vec**, tree_node*, int, tree_node**, int)
???:0
0xaf78e7 build_special_member_call(tree_node*, tree_node*, vec**, tree_node*, int, int)
???:0
0xbde5fc build_new(unsigned int, vec**,
tree_node*, tree_node*, vec**, int, int)
???:0
0xcae89f c_parse_file()
???:0
0xdee919 c_common_parse_file()
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[PATCH] Initialize value in bit_value_unop.

2023-07-25 Thread Aldy Hernandez via Gcc-patches
bit_value_binop initializes VAL regardless of the final mask.  It even
has a comment to that effect:

  /* Ensure that VAL is initialized (to any value).  */

However, bit_value_unop, which in theory shares the same API, does not.
This causes range-ops to choke on uninitialized VALs for some inputs to
ABS.

Instead of fixing the callers, it's cleaner to make bit_value_unop and
bit_value_binop consistent.

OK for trunk?

gcc/ChangeLog:

* tree-ssa-ccp.cc (bit_value_unop): Initialize val when appropriate.
---
 gcc/tree-ssa-ccp.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index 73fb7c11c64..15e65f16008 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -1359,7 +1359,10 @@ bit_value_unop (enum tree_code code, signop type_sgn, int type_precision,
 case ABS_EXPR:
 case ABSU_EXPR:
   if (wi::sext (rmask, rtype_precision) == -1)
-   *mask = -1;
+   {
+ *mask = -1;
+ *val = 0;
+   }
   else if (wi::neg_p (rmask))
{
  /* Result is either rval or -rval.  */
@@ -1385,6 +1388,7 @@ bit_value_unop (enum tree_code code, signop type_sgn, int type_precision,
 
 default:
   *mask = -1;
+  *val = 0;
   break;
 }
 }
-- 
2.41.0
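
The contract this patch restores can be sketched with a toy value/mask
pair.  Everything here (toy_code, toy_bit_value_unop, the 64-bit
representation) is a hypothetical simplification rather than the code in
tree-ssa-ccp.cc; the point is that the value out-parameter is written on
every path, including the "nothing is known" one.

```cpp
#include <cstdint>

enum toy_code { TOY_BIT_NOT, TOY_OTHER };

// Toy value/mask lattice: a set bit in *mask means "this bit is unknown".
// *val is written on every path, even when *mask says nothing is known,
// so callers never read an uninitialized value.
void toy_bit_value_unop(toy_code code, std::int64_t rval, std::int64_t rmask,
                        std::int64_t *val, std::int64_t *mask) {
    switch (code) {
    case TOY_BIT_NOT:
        // Known bits are inverted; unknown bits stay unknown.
        *mask = rmask;
        *val = ~rval & ~rmask;
        break;
    default:
        // Nothing is known about the result, but still store a value.
        *mask = -1;
        *val = 0;
        break;
    }
}
```

Initializing in the callee keeps the unary and binary helpers consistent
and avoids having to audit each caller for reads of the value when the
mask is all ones.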



[COMMITTED] Make some functions in CCP static.

2023-07-25 Thread Aldy Hernandez via Gcc-patches
Committed as obvious.

gcc/ChangeLog:

* tree-ssa-ccp.cc (value_mask_to_min_max): Make static.
(bit_value_mult_const): Same.
(get_individual_bits): Same.
---
 gcc/tree-ssa-ccp.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index 64d5fa81334..73fb7c11c64 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -1297,7 +1297,7 @@ ccp_fold (gimple *stmt)
represented by the mask pair VAL and MASK with signedness SGN and
precision PRECISION.  */
 
-void
+static void
 value_mask_to_min_max (widest_int *min, widest_int *max,
		       const widest_int &val, const widest_int &mask,
   signop sgn, int precision)
@@ -1391,7 +1391,7 @@ bit_value_unop (enum tree_code code, signop type_sgn, int type_precision,
 
 /* Determine the mask pair *VAL and *MASK from multiplying the
argument mask pair RVAL, RMASK by the unsigned constant C.  */
-void
+static void
 bit_value_mult_const (signop sgn, int width,
  widest_int *val, widest_int *mask,
  const widest_int &rval, const widest_int &rmask,
@@ -1453,7 +1453,7 @@ bit_value_mult_const (signop sgn, int width,
bits in X (capped at the maximum value MAX).  For example, an X
value 11, places 1, 2 and 8 in BITS and returns the value 3.  */
 
-unsigned int
+static unsigned int
 get_individual_bits (widest_int *bits, widest_int x, unsigned int max)
 {
   unsigned int count = 0;
-- 
2.41.0
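
The comment on get_individual_bits quoted in the patch above ("an X value 11, places 1, 2 and 8 in BITS and returns the value 3") can be illustrated with a small sketch. This is only an assumption-laden simplification: `individual_bits_sketch` is a hypothetical name, it works on `std::uint64_t` instead of `widest_int`, and it simply stops after `max` entries rather than reproducing whatever the real function does when the cap is exceeded.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical simplification: collect the single-bit values that are set
// in x, lowest bit first, stopping once max entries have been collected.
std::vector<std::uint64_t> individual_bits_sketch(std::uint64_t x,
                                                  unsigned max) {
  std::vector<std::uint64_t> bits;
  while (x != 0 && bits.size() < max) {
    const std::uint64_t lowest = x & (~x + 1);  // isolate the lowest set bit
    bits.push_back(lowest);
    x ^= lowest;                                // clear it and continue
  }
  return bits;
}
```

For the value 11 (binary 1011) the returned vector holds 1, 2 and 8, and its size plays the role of the returned count.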



[Bug c++/110809] ICE: in unify, at cp/pt.cc:25226 with floating-point NTTPs

2023-07-25 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110809

Marek Polacek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |mpolacek at gcc dot gnu.org
   Last reconfirmed|                            |2023-07-25

--- Comment #1 from Marek Polacek  ---
Confirmed.  Even 11 ICEs; 10 gives errors.

[Bug c++/110809] New: ICE: in unify, at cp/pt.cc:25226 with floating-point NTTPs

2023-07-25 Thread ed at catmur dot uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110809

Bug ID: 110809
   Summary: ICE: in unify, at cp/pt.cc:25226 with floating-point
NTTPs
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ed at catmur dot uk
  Target Milestone: ---

#include 
using A = double;
template struct S {};
auto x = boost::hana::make_map(
boost::hana::make_pair(boost::hana::type_c>, 1),
boost::hana::make_pair(boost::hana::type_c>, 2))[
boost::hana::type_c>];

/opt/compiler-explorer/libs/boost_1_82_0/boost/hana/detail/hash_table.hpp:51:62:
internal compiler error: in unify, at cp/pt.cc:25226
   51 | using type =
decltype(detail::find_indices_impl(std::declval()));
  |  
~~~^
0x246af2e internal_error(char const*, ...)
???:0
0xac8ab6 fancy_abort(char const*, int, char const*)
???:0
0xd01e11 fn_type_unification(tree_node*, tree_node*, tree_node*, tree_node*
const*, unsigned int, tree_node*, unification_kind_t, int, conversion**, bool,
bool)
???:0
0xaf7db9 build_new_function_call(tree_node*, vec**, int)
???:0
0xd2379c finish_call_expr(tree_node*, vec**, bool,
bool, int)
???:0
0xcddfef tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0xcde6fc tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0xd06d50 instantiate_class_template(tree_node*)
???:0
0xd5954f complete_type_or_maybe_complain(tree_node*, tree_node*, int)
???:0
0xcdf1a7 tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0xcde6fc tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0xd06d50 instantiate_class_template(tree_node*)
???:0
0xd5954f complete_type_or_maybe_complain(tree_node*, tree_node*, int)
???:0
0xcdf1a7 tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0xcde6fc tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0xcd6618 instantiate_decl(tree_node*, bool, bool)
???:0
0xbb8461 maybe_instantiate_decl(tree_node*)
???:0
0xbb9ebf mark_used(tree_node*, int)
???:0
0xaf673e build_new_method_call(tree_node*, tree_node*, vec**, tree_node*, int, tree_node**, int)
???:0
0xd22eaf finish_call_expr(tree_node*, vec**, bool,
bool, int)
???:0

Changing `A` to `int` makes it compile, so this is a problem with
floating-point NTTPs.

[Bug c++/110808] [modules] Internal Compiler Error in check_mergeable_decl

2023-07-25 Thread mihi32 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110808

--- Comment #1 from Silviu Vrinceanu  ---
Created attachment 55633
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55633&action=edit
Archive with preprocessed files *.ii

[Bug c++/110808] New: [modules] Internal Compiler Error in check_mergeable_decl

2023-07-25 Thread mihi32 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110808

Bug ID: 110808
   Summary: [modules] Internal Compiler Error in
check_mergeable_decl
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mihi32 at gmail dot com
  Target Milestone: ---

Created attachment 55632
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55632&action=edit
Archive with the 3 cpp files that reproduce the error.

// tres.cpp
export module group:tres;

int mul()
{
return 0;
}

// group.cpp
export module group;
export import :tres;

// main.cpp
import group:tres;

int main() 
{
return mul();
}

$ g++-12 -freport-bug -std=c++20 -fmodules-ts tres.cpp group.cpp main.cpp -o main

main.cpp: In function ‘int main()’:
main.cpp:5:16: internal compiler error: Segmentation fault
5 | return mul();
  |^~~
0xd910e3 crash_signal
../../src/gcc/toplev.cc:322
0x7facc691751f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x76fc34 ovl_iterator::operator*() const
../../src/gcc/cp/cp-tree.h:842
0x76fc34 check_mergeable_decl
../../src/gcc/cp/module.cc:10570
0x792150 trees_in::key_mergeable(int, merge_kind, tree_node*, tree_node*,
tree_node*, tree_node*, bool)
../../src/gcc/cp/module.cc:10805
0x794f14 trees_in::decl_value()
../../src/gcc/cp/module.cc:7921
0x78fa77 trees_in::tree_node(bool)
../../src/gcc/cp/module.cc:9172
0x795b4b module_state::read_cluster(unsigned int)
../../src/gcc/cp/module.cc:14838
0x7960a5 module_state::load_section(unsigned int, binding_slot*)
../../src/gcc/cp/module.cc:18109
0x796237 lazy_load_binding(unsigned int, tree_node*, tree_node*, binding_slot*)
../../src/gcc/cp/module.cc:18801
0x7a3ec4 name_lookup::search_namespace_only(tree_node*)
../../src/gcc/cp/name-lookup.cc:919
0x7a523b name_lookup::search_unqualified(tree_node*, cp_binding_level*)
../../src/gcc/cp/name-lookup.cc:1142
0x7a6fab lookup_name(tree_node*, LOOK_where, LOOK_want)
../../src/gcc/cp/name-lookup.cc:7774
0x7aee01 lookup_name(tree_node*, LOOK_want)
../../src/gcc/cp/name-lookup.h:404
0x7aee01 cp_parser_lookup_name
../../src/gcc/cp/parser.cc:30605
0x7dd045 cp_parser_class_name
../../src/gcc/cp/parser.cc:25693
0x7dd31c cp_parser_type_name
../../src/gcc/cp/parser.cc:20140
0x7ebee0 cp_parser_simple_type_specifier
../../src/gcc/cp/parser.cc:19831
0x7d27fa cp_parser_postfix_expression
../../src/gcc/cp/parser.cc:7573
0x7bc7c6 cp_parser_binary_expression
../../src/gcc/cp/parser.cc:10035
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See  for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.
-
$ g++-12 -v
Using built-in specs.
COLLECT_GCC=g++-12
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12.1.0-2ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-sZcx2y/gcc-12-12.1.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-sZcx2y/gcc-12-12.1.0/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.1.0 (Ubuntu 12.1.0-2ubuntu1~22.04)

[Bug c++/110807] New: Copy list initialisation of a vector raises a warning with -O2

2023-07-25 Thread twic at urchin dot earth.li via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110807

Bug ID: 110807
   Summary: Copy list initialisation of a vector raises a
warning with -O2
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: twic at urchin dot earth.li
  Target Milestone: ---

Created attachment 55631
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55631&action=edit
the source code needed to reproduce the problem

If you have a class with a vector member:

struct Foo {
std::vector<bool> byCallSpread;
};

And try to initialise it with copy list initialisation:

Foo() { byCallSpread = {true, false}; }

Then that works fine with the default optimisation level, but gets a warning at
-O2 saying:

/usr/local/include/c++/13.1.0/bits/stl_algobase.h:437:30: warning: 'void*
__builtin_memmove(void*, const void*, long unsigned int)' writing between 9 and
9223372036854775807 bytes into a region of size 8 overflows the destination
[-Wstringop-overflow=]
  437 | __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
  | ~^~~

(i believe this is the relevant error - i am not very good at reading error
messages, so apologies if not)

The warning goes away if the initialisation uses an explicit constructor:

Foo() { byCallSpread = std::vector({true, false}); }

I did not get this warning with GCC 7.2.0. According to Compiler Explorer, it
does not occur with GCC 12.3.
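
For reference, the two forms can be put side by side in one compilable sketch. It assumes the member is a `std::vector<bool>` (as the stl_bvector.h frames in the diagnostic suggest); the struct names `FooCopyList` and `FooExplicit` are hypothetical and exist only to contrast the two constructors.

```cpp
#include <cassert>
#include <vector>

// Form reported to trigger -Wstringop-overflow at -O2 with GCC 13.1.0:
// assignment from a braced list, which goes through
// operator=(std::initializer_list<bool>).
struct FooCopyList {
  std::vector<bool> byCallSpread;
  FooCopyList() { byCallSpread = {true, false}; }
};

// Form reported to compile without the warning: build a temporary vector
// explicitly and move-assign it.
struct FooExplicit {
  std::vector<bool> byCallSpread;
  FooExplicit() { byCallSpread = std::vector<bool>({true, false}); }
};
```

Both constructors leave the member holding the same two elements; only the assignment path taken inside libstdc++ differs.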

Here is a transcript of a self-contained session using the GCC 13.1.0 official
Docker image (official for Docker, perhaps not for GCC!) which demonstrates the
problem:

$ sudo docker run -it gcc:13.1.0
root@4694030a8bea:/# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-linux-gnu/13.1.0/lto-wrapper
Target: x86_64-linux-gnu
Configured with: /usr/src/gcc/configure --build=x86_64-linux-gnu
--disable-multilib --enable-languages=c,c++,fortran,go
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.1.0 (GCC) 
root@4694030a8bea:/# cat >foo.cpp
#include <vector>

struct Foo {
std::vector<bool> byCallSpread;

Foo() { byCallSpread = {true, false}; }
};

Foo f;

int main(int argc, char** argv) {}
root@4694030a8bea:/# g++ foo.cpp 
root@4694030a8bea:/# g++ -O2 foo.cpp 
In file included from /usr/local/include/c++/13.1.0/vector:62,
 from foo.cpp:1:
In static member function 'static _Up* std::__copy_move<_IsMove, true,
std::random_access_iterator_tag>::__copy_m(_Tp*, _Tp*, _Up*) [with _Tp = long
unsigned int; _Up = long unsigned int; bool _IsMove = false]',
inlined from '_OI std::__copy_move_a2(_II, _II, _OI) [with bool _IsMove =
false; _II = long unsigned int*; _OI = long unsigned int*]' at
/usr/local/include/c++/13.1.0/bits/stl_algobase.h:506:30,
inlined from '_OI std::__copy_move_a1(_II, _II, _OI) [with bool _IsMove =
false; _II = long unsigned int*; _OI = long unsigned int*]' at
/usr/local/include/c++/13.1.0/bits/stl_algobase.h:533:42,
inlined from '_OI std::__copy_move_a(_II, _II, _OI) [with bool _IsMove =
false; _II = long unsigned int*; _OI = long unsigned int*]' at
/usr/local/include/c++/13.1.0/bits/stl_algobase.h:540:31,
inlined from '_OI std::copy(_II, _II, _OI) [with _II = long unsigned int*;
_OI = long unsigned int*]' at
/usr/local/include/c++/13.1.0/bits/stl_algobase.h:633:7,
inlined from 'std::vector::iterator std::vector::_M_copy_aligned(const_iterator, const_iterator, iterator) [with _Alloc
= std::allocator]' at
/usr/local/include/c++/13.1.0/bits/stl_bvector.h:1303:28,
inlined from 'void std::vector::_M_insert_range(iterator,
_ForwardIterator, _ForwardIterator, std::forward_iterator_tag) [with
_ForwardIterator = const bool*; _Alloc = std::allocator]' at
/usr/local/include/c++/13.1.0/bits/vector.tcc:915:33,
inlined from 'std::vector::iterator std::vector::insert(const_iterator, _InputIterator, _InputIterator) [with
_InputIterator = const bool*;  = void; _Alloc =
std::allocator]' at
/usr/local/include/c++/13.1.0/bits/stl_bvector.h:1180:19,
inlined from 'void std::vector::_M_assign_aux(_ForwardIterator, _ForwardIterator,
std::forward_iterator_tag) [with _ForwardIterator = const bool*; _Alloc =
std::allocator]' at
/usr/local/include/c++/13.1.0/bits/stl_bvector.h:1440:14,
inlined from 'void std::vector::assign(_InputIterator,
_InputIterator) [with _InputIterator = const bool*;  =
void; _Alloc = std::allocator]' at
/usr/local/include/c++/13.1.0/bits/stl_bvector.h:935:17,
inlined from 'std::vector& std::vector::operator=(std::initializer_list) [with _Alloc =
std::allocator]' at
/usr/local/include/c++/13.1.0/bits/stl_bvector.h:915:14,
inlined from 'Foo::Foo()' at foo.cpp:6:40,
inlined from 'void __static_initialization_and_destruction_0()' at
foo.cpp:9:5,

[PATCH] Replace invariant ternlog operands

2023-07-25 Thread Yan Simonaytes
Sometimes GCC generates ternlog with three operands, but some of them are 
invariant.
For example:

vpternlogq  $252, %zmm2, %zmm1, %zmm0

In this case the zmm2 register isn't used by ternlog,
so we should replace zmm2 with zmm0 or zmm1:

vpternlogq  $252, %zmm0, %zmm1, %zmm0

When the third operand of ternlog is in memory and both other operands are
invariant, we should add an instruction that loads this memory into a register
and replace the first and second operands with that register.
So instead of

vpternlogq  $85, (%rdi), %zmm1, %zmm0

Should emit

vmovdqa64   (%rdi), %zmm0
vpternlogq  $85, %zmm0, %zmm0, %zmm0
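
The notion of an "invariant" operand can be checked directly against the imm8 truth table. The sketch below is an illustration only: it assumes the usual ternlog encoding where bit i of imm8 is the result for inputs (a, b, c) with i = a*4 + b*2 + c, the function names are hypothetical, and the closed form mirrors the nibble/pair comparisons in the patch that follows.

```cpp
#include <cassert>
#include <cstdint>

// Brute force: operand `op` (0 = first, 1 = second, 2 = third) is invariant
// if flipping its input bit never changes the truth-table result.
// Returns a mask with bit `op` set for every invariant operand.
int invariant_mask_bruteforce(std::uint8_t imm8) {
  int mask = 0;
  for (int op = 0; op < 3; ++op) {
    const int weight = 4 >> op;  // truth-table index bit driven by this operand
    bool invariant = true;
    for (int idx = 0; idx < 8; ++idx) {
      if (idx & weight)
        continue;                // visit each pair of rows once
      const int with_zero = (imm8 >> idx) & 1;
      const int with_one = (imm8 >> (idx | weight)) & 1;
      if (with_zero != with_one)
        invariant = false;
    }
    if (invariant)
      mask |= 1 << op;
  }
  return mask;
}

// Closed form equivalent to the comparisons used in the patch.
int invariant_mask_closed_form(int imm8) {
  int mask = 0;
  if (((imm8 >> 4) & 0xF) == (imm8 & 0xF))
    mask |= 1;
  if (((imm8 >> 2) & 0x33) == (imm8 & 0x33))
    mask |= 1 << 1;
  if (((imm8 >> 1) & 0x55) == (imm8 & 0x55))
    mask |= 1 << 2;
  return mask;
}

// True if both versions agree for every possible immediate.
bool closed_form_agrees() {
  for (int imm = 0; imm < 256; ++imm)
    if (invariant_mask_bruteforce(static_cast<std::uint8_t>(imm))
        != invariant_mask_closed_form(imm))
      return false;
  return true;
}
```

For the two immediates used above, $252 yields a mask with only the third-operand bit set, and $85 yields a mask with the first- and second-operand bits set, which is the case the new define_split looks for.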

gcc/ChangeLog:

* config/i386/i386.cc (ternlog_invariant_operand_mask): New helper
function for replacing invariant operands.
(reduce_ternlog_operands): Likewise.
* config/i386/i386-protos.h (ternlog_invariant_operand_mask): Prototype 
here.
(reduce_ternlog_operands): Likewise.
* config/i386/sse.md:

gcc/testsuite/ChangeLog:

* gcc.target/i386/reduce-ternlog-operands-1.c: New test.
* gcc.target/i386/reduce-ternlog-operands-2.c: New test.
---
 gcc/config/i386/i386-protos.h |  2 +
 gcc/config/i386/i386.cc   | 45 +++
 gcc/config/i386/sse.md| 43 ++
 .../i386/reduce-ternlog-operands-1.c  | 20 +
 .../i386/reduce-ternlog-operands-2.c  | 11 +
 5 files changed, 121 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/reduce-ternlog-operands-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/reduce-ternlog-operands-2.c

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 27fe73ca65c..49398ef9936 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -57,6 +57,8 @@ extern int standard_80387_constant_p (rtx);
 extern const char *standard_80387_constant_opcode (rtx);
 extern rtx standard_80387_constant_rtx (int);
 extern int standard_sse_constant_p (rtx, machine_mode);
+extern int ternlog_invariant_operand_mask (rtx *operands);
+extern void reduce_ternlog_operands (rtx *operands);
 extern const char *standard_sse_constant_opcode (rtx_insn *, rtx *);
 extern bool ix86_standard_x87sse_constant_load_p (const rtx_insn *, rtx);
 extern bool ix86_pre_reload_split (void);
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index f0d6167e667..140de478571 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -5070,6 +5070,51 @@ ix86_check_no_addr_space (rtx insn)
 }
   return true;
 }
+
+/* Return mask of invariant operands:
+   bit number 0 1 2
+   operand number 1 2 3.  */
+
+int
+ternlog_invariant_operand_mask (rtx *operands)
+{
+  int mask = 0;
+  int imm8 = XINT (operands[4], 0);
+
+  if (((imm8 >> 4) & 0xF) == (imm8 & 0xF))
+mask |= 1;
+  if (((imm8 >> 2) & 0x33) == (imm8 & 0x33))
+mask |= (1 << 1);
+  if (((imm8 >> 1) & 0x55) == (imm8 & 0x55))
+mask |= (1 << 2);
+
+  return mask;
+}
+
+/* Replace one of the unused operands with the one used.  */
+
+void
+reduce_ternlog_operands (rtx *operands)
+{
+  int mask = ternlog_invariant_operand_mask (operands);
+
+  if (mask & 1) /* the first operand is invariant.  */
+operands[1] = operands[2];
+
+  if (mask & 2) /* the second operand is invariant.  */
+operands[2] = operands[1];
+
+  if (mask & 4)/* the third operand is invariant.  */
+   operands[3] = operands[1];
+  else if (!MEM_P (operands[3]))
+{
+  if (mask & 1) /* the first operand is invariant.  */
+   operands[1] = operands[3];
+  if (mask & 2) /* the second operands is invariant.  */
+   operands[2] = operands[3];
+}
+}
+
 
 /* Initialize the table of extra 80387 mathematical constants.  */
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a2099373123..f88d82b315c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -12625,6 +12625,49 @@
  (symbol_ref " == 64 || TARGET_AVX512VL")
  (const_string "*")))])
 
+;; If the first and the second operands of ternlog are invariant and
+;; the third operand is memory
+;; then we should add load third operand from memory to register and
+;; replace first and second operands with this register
+(define_split
+  [(set (match_operand:V 0 "register_operand")
+   (unspec:V
+ [(match_operand:V 1 "register_operand")
+  (match_operand:V 2 "register_operand")
+  (match_operand:V 3 "memory_operand")
+  (match_operand:SI 4 "const_0_to_255_operand")]
+ UNSPEC_VTERNLOG))]
+  "ternlog_invariant_operand_mask (operands) == 3 && !reload_completed"
+  [(set (match_dup 0)
+   (match_dup 3))
+   (set (match_dup 0)
+   (unspec:V
+ [(match_dup 0)
+  (match_dup 0)
+  (match_dup 0)
+  (match_dup 4)]
+ UNSPEC_VTERNLOG))])
+
+;; Replace invariant ternlog operands with used operands
+;; (except for 

[Bug c++/110382] [13 Regression] internal compiler error: in verify_ctor_sanity

2023-07-25 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110382

Marek Polacek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[13/14 Regression] internal |[13 Regression] internal
                   |compiler error: in          |compiler error: in
                   |verify_ctor_sanity          |verify_ctor_sanity

--- Comment #5 from Marek Polacek  ---
Fixed on trunk so far.

[Bug c++/110382] [13/14 Regression] internal compiler error: in verify_ctor_sanity

2023-07-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110382

--- Comment #4 from CVS Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:6e424febfbcb27c21a7fe3a137e614765f9cf9d2

commit r14-2762-g6e424febfbcb27c21a7fe3a137e614765f9cf9d2
Author: Marek Polacek 
Date:   Fri Jul 21 17:48:37 2023 -0400

c++: fix ICE with constexpr ARRAY_REF [PR110382]

This code in cxx_eval_array_reference has been hard to get right.
In r12-2304 I added some code; in r13-5693 I removed some of it.

Here the problematic line is "S s = arr[0];" which causes a crash
on the assert in verify_ctor_sanity:

  gcc_assert (!ctx->object || !DECL_P (ctx->object)
  || ctx->global->get_value (ctx->object) == ctx->ctor);

ctx->object is the VAR_DECL 's', which is correct here.  The second
line points to the problem: we replaced ctx->ctor in
cxx_eval_array_reference:

  new_ctx.ctor = build_constructor (elem_type, NULL); // #1

which I think we shouldn't have; the CONSTRUCTOR we created in
cxx_eval_constant_expression/DECL_EXPR

  new_ctx.ctor = build_constructor (TREE_TYPE (r), NULL);

had the right type.

We still need #1 though.  E.g., in constexpr-96241.C, we never
set ctx.ctor/object before calling cxx_eval_array_reference, so
we have to build a CONSTRUCTOR there.  And in constexpr-101371-2.C
we have a ctx.ctor, but it has the wrong type, so we need a new one.

We can fix the problem by always clearing the object, and, as an
optimization, only create/free a new ctor when actually needed.

PR c++/110382

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_array_reference): Create a new constructor
only when we don't already have a matching one.  Clear the object
when the type is non-scalar.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-110382.C: New test.

Re: [PATCH 2/1] c++: passing partially inst ttp as ttp [PR110566]

2023-07-25 Thread Jason Merrill via Gcc-patches

On 7/24/23 13:03, Patrick Palka wrote:

On Fri, 21 Jul 2023, Jason Merrill wrote:

On 7/21/23 14:34, Patrick Palka wrote:

(This is a follow-up of
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624951.html)

Bootstrapped and regtested on x86_64-pc-linux-gnu, how does this look?

-- >8 --

The previous fix doesn't work for partially instantiated ttps primarily
because most_general_template doesn't work for them.  This patch fixes
this by giving such ttps a DECL_TEMPLATE_INFO (extending the
r11-734-g2fb595f8348e16 fix) with which we can obtain the original ttp.

This patch additionally makes us be more careful about using the correct
amount of levels from the scope of a ttp argument during
coerce_template_template_parms.

PR c++/110566

gcc/cp/ChangeLog:

* pt.cc (reduce_template_parm_level): Set DECL_TEMPLATE_INFO
on the DECL_TEMPLATE_RESULT of a reduced template template
parameter.
(add_defaults_to_ttp): Also update DECL_TEMPLATE_INFO of the
ttp's DECL_TEMPLATE_RESULT.
(coerce_template_template_parms): Make sure 'scope_args' has
the right amount of levels for the ttp argument.
(most_general_template): Handle template template parameters.

gcc/testsuite/ChangeLog:

* g++.dg/template/ttp39.C: New test.
---
   gcc/cp/pt.cc  | 46 ---
   gcc/testsuite/g++.dg/template/ttp39.C | 16 ++
   2 files changed, 57 insertions(+), 5 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/template/ttp39.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index e0ed4bc8bbb..be7119dd9a0 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -4570,8 +4570,14 @@ reduce_template_parm_level (tree index, tree type,
int levels, tree args,
  TYPE_DECL, DECL_NAME (decl), type);
  DECL_TEMPLATE_RESULT (decl) = inner;
  DECL_ARTIFICIAL (inner) = true;
- DECL_TEMPLATE_PARMS (decl) = tsubst_template_parms
-   (DECL_TEMPLATE_PARMS (orig_decl), args, complain);
+ tree parms = tsubst_template_parms (DECL_TEMPLATE_PARMS (orig_decl),
+ args, complain);
+ DECL_TEMPLATE_PARMS (decl) = parms;
+ retrofit_lang_decl (inner);
+ tree orig_inner = DECL_TEMPLATE_RESULT (orig_decl);
+ DECL_TEMPLATE_INFO (inner)
+   = build_template_info (DECL_TI_TEMPLATE (orig_inner),
+  template_parms_to_args (parms));


Should we assert that orig_inner doesn't have its own DECL_TEMPLATE_INFO?  I'm
wondering if it's possible to reduce the level of a TTP more than once.


It's possible for a ttp belonging to a nested generic lambda:

   template
   void f() {
 [](auto) {
   [] class TT>() {
   };
 }(0);
   }

   template void f();




}
   /* Attach the TPI to the decl.  */
@@ -7936,6 +7942,19 @@ add_defaults_to_ttp (tree otmpl)
}
   }
   +  tree oresult = DECL_TEMPLATE_RESULT (otmpl);
+  tree gen_otmpl = DECL_TI_TEMPLATE (oresult);


Hmm, here we're assuming that all TTPs have DECL_TEMPLATE_INFO?


I figured it's a reasonable assumption since all "formal" ttps
originally start out with DECL_TEMPLATE_INFO (via process_template_parm).
Though I realized I missed adjusting rewrite_template_parm to set
DECL_TEMPLATE_INFO on the new ttp, which the below patch fixes (and
adds a testcase that we'd otherwise segfault on).




+  tree gen_ntmpl;
+  if (gen_otmpl == otmpl)
+gen_ntmpl = ntmpl;
+  else
+gen_ntmpl = add_defaults_to_ttp (gen_otmpl);
+
+  tree nresult = copy_node (oresult);


Another fixed bug: since we build the new DECL_TEMPLATE_RESULT via
copy_node, we need to avoid sharing its DECL_LANG_SPECIFIC with the
old decl.


+  DECL_TEMPLATE_INFO (nresult) = copy_node (DECL_TEMPLATE_INFO (oresult));
+  DECL_TI_TEMPLATE (nresult) = gen_ntmpl;
+  DECL_TEMPLATE_RESULT (ntmpl) = nresult;
+
 hash_map_safe_put (defaulted_ttp_cache, otmpl, ntmpl);
 return ntmpl;
   }
@@ -8121,15 +8140,29 @@ coerce_template_template_parms (tree parm_tmpl,
 OUTER_ARGS are not the right outer levels in this case, as they are
 the args we're building up for PARM, and for the coercion we want the
 args for ARG.  If DECL_CONTEXT isn't set for a template template
-parameter, we can assume that it's in the current scope.  In that
case
-we might end up adding more levels than needed, but that shouldn't be
-a problem; any args we need to refer to are at the right level.  */
+parameter, we can assume that it's in the current scope.  */
 tree ctx = DECL_CONTEXT (arg_tmpl);
 if (!ctx && DECL_TEMPLATE_TEMPLATE_PARM_P (arg_tmpl))
ctx = current_scope ();
 tree scope_args = NULL_TREE;
 if (tree tinfo = get_template_info (ctx))
scope_args = TI_ARGS (tinfo);
+  if (DECL_TEMPLATE_TEMPLATE_PARM_P (arg_tmpl))
+   {
+ int level = 

[gcc13 backport 12/12] riscv: fix error: control reaches end of non-void function

2023-07-25 Thread Patrick O'Neill
From: Martin Liska 

Fixes:
gcc/config/riscv/sync.md:66:1: error: control reaches end of non-void function 
[-Werror=return-type]
66 |   [(set (attr "length") (const_int 4))])
   | ^

PR target/109713

gcc/ChangeLog:

* config/riscv/sync.md: Add gcc_unreachable to a switch.
---
 gcc/config/riscv/sync.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 6e7c762ac57..9fc626267de 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -62,6 +62,8 @@
return "fence\tr,rw";
 else if (model == MEMMODEL_RELEASE)
return "fence\trw,w";
+else
+   gcc_unreachable ();
   }
   [(set (attr "length") (const_int 4))])
 
-- 
2.34.1
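
The class of error fixed above is easy to reproduce outside a machine description: an if/else-if chain that returns on every handled case still has a fall-through path as far as the compiler can tell. A minimal sketch, assuming a hypothetical `Model` enum and using `std::abort ()` as a stand-in for `gcc_unreachable ()`:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical subset of memory models, for illustration only.
enum class Model { Acquire, Release, Relaxed };

// Without the final else branch, -Wreturn-type reports that control can
// reach the end of a non-void function.  Calling a noreturn function on the
// remaining path makes every path either return or terminate.
const char *fence_for_sketch(Model model) {
  if (model == Model::Acquire)
    return "fence\tr,rw";
  else if (model == Model::Release)
    return "fence\trw,w";
  else
    std::abort();  // stand-in for gcc_unreachable ()
}
```

Calling `fence_for_sketch(Model::Relaxed)` aborts the program, which matches the intent of marking that path as unreachable rather than silently returning garbage.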



[gcc13 backport 10/12] RISC-V: Weaken atomic loads

2023-07-25 Thread Patrick O'Neill
This change brings atomic loads in line with table A.6 of the ISA
manual.

2023-04-27 Patrick O'Neill 

gcc/ChangeLog:

* config/riscv/sync.md (atomic_load): Implement atomic
load mapping.

Signed-off-by: Patrick O'Neill 
---
 gcc/config/riscv/sync.md | 28 ++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index ba132d8a1ce..6e7c762ac57 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -26,6 +26,7 @@
   UNSPEC_SYNC_OLD_OP_SUBWORD
   UNSPEC_SYNC_EXCHANGE
   UNSPEC_SYNC_EXCHANGE_SUBWORD
+  UNSPEC_ATOMIC_LOAD
   UNSPEC_ATOMIC_STORE
   UNSPEC_MEMORY_BARRIER
 ])
@@ -66,8 +67,31 @@
 
 ;; Atomic memory operations.
 
-;; Implement atomic stores with conservative fences.  Fall back to fences for
-;; atomic loads.
+(define_insn "atomic_load"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+(unspec_volatile:GPR
+  [(match_operand:GPR 1 "memory_operand" "A")
+   (match_operand:SI 2 "const_int_operand")]  ;; model
+  UNSPEC_ATOMIC_LOAD))]
+  "TARGET_ATOMIC"
+  {
+enum memmodel model = (enum memmodel) INTVAL (operands[2]);
+model = memmodel_base (model);
+
+if (model == MEMMODEL_SEQ_CST)
+  return "fence\trw,rw\;"
+"l\t%0,%1\;"
+"fence\tr,rw";
+if (model == MEMMODEL_ACQUIRE)
+  return "l\t%0,%1\;"
+"fence\tr,rw";
+else
+  return "l\t%0,%1";
+  }
+  [(set_attr "type" "atomic")
+   (set (attr "length") (const_int 12))])
+
+;; Implement atomic stores with conservative fences.
 ;; This allows us to be compatible with the ISA manual Table A.6 and Table A.7.
 (define_insn "atomic_store"
   [(set (match_operand:GPR 0 "memory_operand" "=A")
-- 
2.34.1
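
The load mapping implemented by the atomic_load pattern above can be summarised as a small lookup. This is a sketch under stated assumptions: the `Model` enum and `atomic_load_sketch` are hypothetical, the load mnemonic is passed in as a string (defaulting to `lw`, which is an assumption for a 32-bit load), and instructions are joined with newlines for readability.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of memory models, for illustration only.
enum class Model { Relaxed, Acquire, SeqCst };

// Mirrors the three cases in the pattern: seq_cst gets a leading and a
// trailing fence, acquire gets only the trailing fence, and everything else
// is a plain load.
std::string atomic_load_sketch(Model model,
                               const std::string &load = "lw\t%0,%1") {
  if (model == Model::SeqCst)
    return "fence\trw,rw\n" + load + "\nfence\tr,rw";
  if (model == Model::Acquire)
    return load + "\nfence\tr,rw";
  return load;
}
```

The worst case is three instructions, which is consistent with the pattern's length attribute of 12 bytes.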



[gcc13 backport 11/12] RISC-V: Table A.6 conformance tests

2023-07-25 Thread Patrick O'Neill
These tests cover basic cases to ensure the atomic mappings follow the
strengthened Table A.6 mappings that are compatible with Table A.7.

2023-04-27 Patrick O'Neill 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo-table-a-6-amo-add-1.c: New test.
* gcc.target/riscv/amo-table-a-6-amo-add-2.c: New test.
* gcc.target/riscv/amo-table-a-6-amo-add-3.c: New test.
* gcc.target/riscv/amo-table-a-6-amo-add-4.c: New test.
* gcc.target/riscv/amo-table-a-6-amo-add-5.c: New test.
* gcc.target/riscv/amo-table-a-6-compare-exchange-1.c: New test.
* gcc.target/riscv/amo-table-a-6-compare-exchange-2.c: New test.
* gcc.target/riscv/amo-table-a-6-compare-exchange-3.c: New test.
* gcc.target/riscv/amo-table-a-6-compare-exchange-4.c: New test.
* gcc.target/riscv/amo-table-a-6-compare-exchange-5.c: New test.
* gcc.target/riscv/amo-table-a-6-compare-exchange-6.c: New test.
* gcc.target/riscv/amo-table-a-6-compare-exchange-7.c: New test.
* gcc.target/riscv/amo-table-a-6-fence-1.c: New test.
* gcc.target/riscv/amo-table-a-6-fence-2.c: New test.
* gcc.target/riscv/amo-table-a-6-fence-3.c: New test.
* gcc.target/riscv/amo-table-a-6-fence-4.c: New test.
* gcc.target/riscv/amo-table-a-6-fence-5.c: New test.
* gcc.target/riscv/amo-table-a-6-load-1.c: New test.
* gcc.target/riscv/amo-table-a-6-load-2.c: New test.
* gcc.target/riscv/amo-table-a-6-load-3.c: New test.
* gcc.target/riscv/amo-table-a-6-store-1.c: New test.
* gcc.target/riscv/amo-table-a-6-store-2.c: New test.
* gcc.target/riscv/amo-table-a-6-store-compat-3.c: New test.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-1.c: New test.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-2.c: New test.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-3.c: New test.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-4.c: New test.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-5.c: New test.

Signed-off-by: Patrick O'Neill 
---
 .../gcc.target/riscv/amo-table-a-6-amo-add-1.c | 15 +++
 .../gcc.target/riscv/amo-table-a-6-amo-add-2.c | 15 +++
 .../gcc.target/riscv/amo-table-a-6-amo-add-3.c | 15 +++
 .../gcc.target/riscv/amo-table-a-6-amo-add-4.c | 15 +++
 .../gcc.target/riscv/amo-table-a-6-amo-add-5.c | 15 +++
 .../riscv/amo-table-a-6-compare-exchange-1.c   |  9 +
 .../riscv/amo-table-a-6-compare-exchange-2.c   |  9 +
 .../riscv/amo-table-a-6-compare-exchange-3.c   |  9 +
 .../riscv/amo-table-a-6-compare-exchange-4.c   |  9 +
 .../riscv/amo-table-a-6-compare-exchange-5.c   |  9 +
 .../riscv/amo-table-a-6-compare-exchange-6.c   | 10 ++
 .../riscv/amo-table-a-6-compare-exchange-7.c   |  9 +
 .../gcc.target/riscv/amo-table-a-6-fence-1.c   | 14 ++
 .../gcc.target/riscv/amo-table-a-6-fence-2.c   | 15 +++
 .../gcc.target/riscv/amo-table-a-6-fence-3.c   | 15 +++
 .../gcc.target/riscv/amo-table-a-6-fence-4.c   | 15 +++
 .../gcc.target/riscv/amo-table-a-6-fence-5.c   | 15 +++
 .../gcc.target/riscv/amo-table-a-6-load-1.c| 16 
 .../gcc.target/riscv/amo-table-a-6-load-2.c| 17 +
 .../gcc.target/riscv/amo-table-a-6-load-3.c| 18 ++
 .../gcc.target/riscv/amo-table-a-6-store-1.c   | 16 
 .../gcc.target/riscv/amo-table-a-6-store-2.c   | 17 +
 .../riscv/amo-table-a-6-store-compat-3.c   | 18 ++
 .../riscv/amo-table-a-6-subword-amo-add-1.c|  9 +
 .../riscv/amo-table-a-6-subword-amo-add-2.c|  9 +
 .../riscv/amo-table-a-6-subword-amo-add-3.c|  9 +
 .../riscv/amo-table-a-6-subword-amo-add-4.c|  9 +
 .../riscv/amo-table-a-6-subword-amo-add-5.c|  9 +
 28 files changed, 360 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-6.c
 create mode 100644 

[gcc13 backport 07/12] RISC-V: Eliminate AMO op fences

2023-07-25 Thread Patrick O'Neill
Atomic operations with the appropriate bits set already enforce release
semantics. Remove unnecessary release fences from atomic ops.

This change brings AMO ops in line with table A.6 of the ISA manual.
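
As a rough illustration of the suffix selection that replaces the leading
fence, here is a minimal standalone sketch. The Model enum and the
needs_aq/needs_rl helpers are hypothetical stand-ins for GCC's enum memmodel
and the riscv_memmodel_needs_amo_* predicates; which models they accept is an
assumption, since the predicate bodies are not part of this patch.

```cpp
#include <string_view>

// Hypothetical stand-in for GCC's enum memmodel (base models only).
enum class Model { Relaxed, Consume, Acquire, Release, AcqRel, SeqCst };

// Assumption: models that need acquire ordering on an AMO.
constexpr bool needs_aq(Model m) {
  return m == Model::Consume || m == Model::Acquire
         || m == Model::AcqRel || m == Model::SeqCst;
}

// Assumption: models that need release ordering on an AMO.
constexpr bool needs_rl(Model m) {
  return m == Model::Release || m == Model::AcqRel || m == Model::SeqCst;
}

// Mirrors the 'A' case in riscv_print_operand after this patch: the
// ordering is carried by the instruction suffix, not by a separate fence.
constexpr std::string_view amo_suffix(Model m) {
  if (needs_aq(m) && needs_rl(m))
    return ".aqrl";
  if (needs_aq(m))
    return ".aq";
  if (needs_rl(m))
    return ".rl";
  return "";
}
```

Under these assumptions a release AMO gets only the ".rl" suffix, so the insn
is a single 4-byte instruction, which matches the length attribute changing
from 8 to 4 in sync.md.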

2023-04-27 Patrick O'Neill 

gcc/ChangeLog:

* config/riscv/riscv.cc
(riscv_memmodel_needs_amo_release): Change function name.
(riscv_print_operand): Remove unneeded %F case.
* config/riscv/sync.md: Remove unneeded fences.

Signed-off-by: Patrick O'Neill 
---
 gcc/config/riscv/riscv.cc | 16 +---
 gcc/config/riscv/sync.md  | 12 ++--
 2 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index df55c427b1b..951f6b5cf42 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4307,11 +4307,11 @@ riscv_memmodel_needs_amo_acquire (enum memmodel model)
 }
 }
 
-/* Return true if a FENCE should be emitted to before a memory access to
-   implement the release portion of memory model MODEL.  */
+/* Return true if the .RL suffix should be added to an AMO to implement the
+   release portion of memory model MODEL.  */
 
 static bool
-riscv_memmodel_needs_release_fence (enum memmodel model)
+riscv_memmodel_needs_amo_release (enum memmodel model)
 {
   switch (model)
 {
@@ -4337,7 +4337,6 @@ riscv_memmodel_needs_release_fence (enum memmodel model)
'R' Print the low-part relocation associated with OP.
'C' Print the integer branch condition for comparison OP.
'A' Print the atomic operation suffix for memory model OP.
-   'F' Print a FENCE if the memory model requires a release.
'z' Print x0 if OP is zero, otherwise print OP normally.
'i' Print i if the operand is not a register.
'S' Print shift-index of single-bit mask OP.
@@ -4499,19 +4498,14 @@ riscv_print_operand (FILE *file, rtx op, int letter)
 
 case 'A':
   if (riscv_memmodel_needs_amo_acquire (model)
- && riscv_memmodel_needs_release_fence (model))
+ && riscv_memmodel_needs_amo_release (model))
fputs (".aqrl", file);
   else if (riscv_memmodel_needs_amo_acquire (model))
fputs (".aq", file);
-  else if (riscv_memmodel_needs_release_fence (model))
+  else if (riscv_memmodel_needs_amo_release (model))
fputs (".rl", file);
   break;
 
-case 'F':
-  if (riscv_memmodel_needs_release_fence (model))
-   fputs ("fence iorw,ow; ", file);
-  break;
-
 case 'i':
   if (code != REG)
 fputs ("i", file);
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 1acb78a9ae4..9a3b57bd09f 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -91,9 +91,9 @@
   (match_operand:SI 2 "const_int_operand")] ;; model
 UNSPEC_SYNC_OLD_OP))]
   "TARGET_ATOMIC"
-  "%F2amo.%A2 zero,%z1,%0"
+  "amo.%A2\tzero,%z1,%0"
   [(set_attr "type" "atomic")
-   (set (attr "length") (const_int 8))])
+   (set (attr "length") (const_int 4))])
 
 (define_insn "atomic_fetch_"
   [(set (match_operand:GPR 0 "register_operand" "=")
@@ -105,9 +105,9 @@
   (match_operand:SI 3 "const_int_operand")] ;; model
 UNSPEC_SYNC_OLD_OP))]
   "TARGET_ATOMIC"
-  "%F3amo.%A3 %0,%z2,%1"
+  "amo.%A3\t%0,%z2,%1"
   [(set_attr "type" "atomic")
-   (set (attr "length") (const_int 8))])
+   (set (attr "length") (const_int 4))])
 
 (define_insn "subword_atomic_fetch_strong_"
   [(set (match_operand:SI 0 "register_operand" "=") ;; old value at mem
@@ -247,9 +247,9 @@
(set (match_dup 1)
(match_operand:GPR 2 "register_operand" "0"))]
   "TARGET_ATOMIC"
-  "%F3amoswap.%A3 %0,%z2,%1"
+  "amoswap.%A3\t%0,%z2,%1"
   [(set_attr "type" "atomic")
-   (set (attr "length") (const_int 8))])
+   (set (attr "length") (const_int 4))])
 
 (define_expand "atomic_exchange"
   [(match_operand:SHORT 0 "register_operand") ;; old value at mem
-- 
2.34.1



[gcc13 backport 05/12] RISC-V: Add AMO release bits

2023-07-25 Thread Patrick O'Neill
This patch sets the relevant .rl bits on AMO operations.

2023-04-27 Patrick O'Neill 

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_print_operand): Change behavior
of %A to include release bits.

Signed-off-by: Patrick O'Neill 
---
 gcc/config/riscv/riscv.cc | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 11b897aca5c..df55c427b1b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4498,8 +4498,13 @@ riscv_print_operand (FILE *file, rtx op, int letter)
   break;
 
 case 'A':
-  if (riscv_memmodel_needs_amo_acquire (model))
+  if (riscv_memmodel_needs_amo_acquire (model)
+ && riscv_memmodel_needs_release_fence (model))
+   fputs (".aqrl", file);
+  else if (riscv_memmodel_needs_amo_acquire (model))
fputs (".aq", file);
+  else if (riscv_memmodel_needs_release_fence (model))
+   fputs (".rl", file);
   break;
 
 case 'F':
-- 
2.34.1



[gcc13 backport 08/12] RISC-V: Weaken LR/SC pairs

2023-07-25 Thread Patrick O'Neill
Introduce the %I and %J flags for setting the .aqrl bits on LR/SC pairs
as needed.

Atomic compare and exchange ops provide success and failure memory
models. C++17 and later place no restrictions on the relative strength
of each model, so ensure we cover both by using a model that enforces
the ordering of both given models.

This change brings LR/SC ops in line with table A.6 of the ISA manual.
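
The combination rule and the resulting LR/SC suffixes can be sketched outside
of GCC as follows. This is only an illustration: Model, union_models,
lr_suffix and sc_suffix are hypothetical names, the enumerator order is
assumed to match GCC's base memory models, and the needs_aq/needs_rl helpers
are assumed approximations of the riscv_memmodel_needs_amo_* predicates.

```cpp
#include <string_view>

// Hypothetical stand-in for GCC's enum memmodel; the enumerator order
// is assumed to match GCC's base models.
enum class Model { Relaxed, Consume, Acquire, Release, AcqRel, SeqCst };

// Sketch of riscv_union_memmodels: return a model that enforces the
// ordering of both given models.
constexpr Model union_models(Model a, Model b) {
  const Model weaker = a <= b ? a : b;
  const Model stronger = a > b ? a : b;
  // Release paired with an acquire-like model needs both orderings.
  if (stronger == Model::Release
      && (weaker == Model::Acquire || weaker == Model::Consume))
    return Model::AcqRel;
  return stronger;
}

// Assumed behaviour of the acquire/release predicates.
constexpr bool needs_aq(Model m) {
  return m == Model::Consume || m == Model::Acquire
         || m == Model::AcqRel || m == Model::SeqCst;
}
constexpr bool needs_rl(Model m) {
  return m == Model::Release || m == Model::AcqRel || m == Model::SeqCst;
}

// Mirrors the 'I' case: suffix for the LR half of the pair.
constexpr std::string_view lr_suffix(Model m) {
  if (m == Model::SeqCst)
    return ".aqrl";
  if (needs_aq(m))
    return ".aq";
  return "";
}

// Mirrors the 'J' case: suffix for the SC half of the pair.
constexpr std::string_view sc_suffix(Model m) {
  return needs_rl(m) ? ".rl" : "";
}
```

For example, a compare-exchange with a release success model and an acquire
failure model combines to acq_rel in this sketch, so the LR would get ".aq"
and the SC would get ".rl".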

2023-04-27 Patrick O'Neill 

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_union_memmodels): Expose
riscv_union_memmodels function to sync.md.
* config/riscv/riscv.cc (riscv_union_memmodels): Add function to
get the union of two memmodels in sync.md.
(riscv_print_operand): Add %I and %J flags that output the
optimal LR/SC flag bits for a given memory model.
* config/riscv/sync.md: Remove static .aqrl bits on LR op/.rl
bits on SC op and replace with optimized %I, %J flags.

Signed-off-by: Patrick O'Neill 
---
 gcc/config/riscv/riscv-protos.h |   3 +
 gcc/config/riscv/riscv.cc   |  44 
 gcc/config/riscv/sync.md| 114 +++-
 3 files changed, 114 insertions(+), 47 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 02b33e02020..b5616fb3e88 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -22,6 +22,8 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_RISCV_PROTOS_H
 #define GCC_RISCV_PROTOS_H
 
+#include "memmodel.h"
+
 /* Symbol types we understand.  The order of this list must match that of
the unspec enum in riscv.md, subsequent to UNSPEC_ADDRESS_FIRST.  */
 enum riscv_symbol_type {
@@ -81,6 +83,7 @@ extern bool riscv_v_ext_vector_mode_p (machine_mode);
 extern bool riscv_shamt_matches_mask_p (int, HOST_WIDE_INT);
 extern void riscv_subword_address (rtx, rtx *, rtx *, rtx *, rtx *);
 extern void riscv_lshift_subword (machine_mode, rtx, rtx, rtx *);
+extern enum memmodel riscv_union_memmodels (enum memmodel, enum memmodel);
 
 /* Routines implemented in riscv-c.cc.  */
 void riscv_cpu_cpp_builtins (cpp_reader *);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 951f6b5cf42..59899268918 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4284,6 +4284,36 @@ riscv_print_operand_reloc (FILE *file, rtx op, bool hi_reloc)
   fputc (')', file);
 }
 
+/* Return the memory model that encapsulates both given models.  */
+
+enum memmodel
+riscv_union_memmodels (enum memmodel model1, enum memmodel model2)
+{
+  model1 = memmodel_base (model1);
+  model2 = memmodel_base (model2);
+
+  enum memmodel weaker = model1 <= model2 ? model1: model2;
+  enum memmodel stronger = model1 > model2 ? model1: model2;
+
+  switch (stronger)
+{
+  case MEMMODEL_SEQ_CST:
+  case MEMMODEL_ACQ_REL:
+   return stronger;
+  case MEMMODEL_RELEASE:
+   if (weaker == MEMMODEL_ACQUIRE || weaker == MEMMODEL_CONSUME)
+ return MEMMODEL_ACQ_REL;
+   else
+ return stronger;
+  case MEMMODEL_ACQUIRE:
+  case MEMMODEL_CONSUME:
+  case MEMMODEL_RELAXED:
+   return stronger;
+  default:
+   gcc_unreachable ();
+}
+}
+
 /* Return true if the .AQ suffix should be added to an AMO to implement the
acquire portion of memory model MODEL.  */
 
@@ -4337,6 +4367,8 @@ riscv_memmodel_needs_amo_release (enum memmodel model)
'R' Print the low-part relocation associated with OP.
'C' Print the integer branch condition for comparison OP.
'A' Print the atomic operation suffix for memory model OP.
+   'I' Print the LR suffix for memory model OP.
+   'J' Print the SC suffix for memory model OP.
'z' Print x0 if OP is zero, otherwise print OP normally.
'i' Print i if the operand is not a register.
'S' Print shift-index of single-bit mask OP.
@@ -4506,6 +4538,18 @@ riscv_print_operand (FILE *file, rtx op, int letter)
fputs (".rl", file);
   break;
 
+case 'I':
+  if (model == MEMMODEL_SEQ_CST)
+   fputs (".aqrl", file);
+  else if (riscv_memmodel_needs_amo_acquire (model))
+   fputs (".aq", file);
+  break;
+
+case 'J':
+  if (riscv_memmodel_needs_amo_release (model))
+   fputs (".rl", file);
+  break;
+
 case 'i':
   if (code != REG)
 fputs ("i", file);
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 9a3b57bd09f..3e6345e83a3 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -116,21 +116,22 @@
(unspec_volatile:SI
  [(any_atomic:SI (match_dup 1)
 (match_operand:SI 2 "register_operand" "rI")) ;; value for op
-  (match_operand:SI 3 "register_operand" "rI")]   ;; mask
+  (match_operand:SI 3 "const_int_operand")]   ;; model
 UNSPEC_SYNC_OLD_OP_SUBWORD))
-(match_operand:SI 4 "register_operand" "rI")  ;; not_mask
- 

[gcc13 backport 09/12] RISC-V: Weaken mem_thread_fence

2023-07-25 Thread Patrick O'Neill
This change brings atomic fences in line with table A.6 of the ISA
manual.

Relax mem_thread_fence according to the memmodel given.
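
The mapping from memory model to fence can be sketched as a standalone
function. Model and fence_for are hypothetical names used only for
illustration. The patch defines no output for relaxed or consume, so the
empty string returned for them here is a placeholder assumption, not
behaviour taken from the patch.

```cpp
#include <string_view>

// Hypothetical stand-in for GCC's enum memmodel (base models only).
enum class Model { Relaxed, Consume, Acquire, Release, AcqRel, SeqCst };

// Fence text chosen per memory model, following the strings in the
// mem_thread_fence_1 output template.
constexpr std::string_view fence_for(Model m) {
  switch (m) {
    case Model::SeqCst:
      return "fence\trw,rw";
    case Model::AcqRel:
      return "fence.tso";
    case Model::Acquire:
      return "fence\tr,rw";
    case Model::Release:
      return "fence\trw,w";
    default:
      // Placeholder: the patch gives no fence for the remaining models.
      return "";
  }
}
```

Compared with the unconditional "fence iorw,iorw" that was emitted before,
every case orders only normal memory accesses (r/w) and drops the I/O bits.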

2023-04-27 Patrick O'Neill 

gcc/ChangeLog:

* config/riscv/sync.md (mem_thread_fence_1): Change fence
depending on the given memory model.

Signed-off-by: Patrick O'Neill 
---
 gcc/config/riscv/sync.md | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 3e6345e83a3..ba132d8a1ce 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -45,14 +45,24 @@
   DONE;
 })
 
-;; Until the RISC-V memory model (hence its mapping from C++) is finalized,
-;; conservatively emit a full FENCE.
 (define_insn "mem_thread_fence_1"
   [(set (match_operand:BLK 0 "" "")
(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
(match_operand:SI 1 "const_int_operand" "")] ;; model
   ""
-  "fence\tiorw,iorw")
+  {
+enum memmodel model = (enum memmodel) INTVAL (operands[1]);
+model = memmodel_base (model);
+if (model == MEMMODEL_SEQ_CST)
+   return "fence\trw,rw";
+else if (model == MEMMODEL_ACQ_REL)
+   return "fence.tso";
+else if (model == MEMMODEL_ACQUIRE)
+   return "fence\tr,rw";
+else if (model == MEMMODEL_RELEASE)
+   return "fence\trw,w";
+  }
+  [(set (attr "length") (const_int 4))])
 
 ;; Atomic memory operations.
 
-- 
2.34.1


