[Bug middle-end/110673] [14 regression] ICE when building opus (internal compiler error: in gimple_phi_arg_def_from_edge, at gimple.h:4699)

2023-07-14 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110673

--- Comment #2 from Sam James  ---
Created attachment 55548
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55548&action=edit
reduced.i

[Bug tree-optimization/110666] [14 Regression] wrong code at -O1 and above on x86_64-linux-gnu

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110666

--- Comment #6 from Andrew Pinski  ---
(In reply to Zhendong Su from comment #5)
> A couple of very likely related tests (especially #2 and #3):
> 
> *** (1) wrong code at -O2, -O3, and -Os

Case 1 is a phiopt issue (PR 110252).


> *** (2) wrong code at -O1 and above

Case 2 is this issue.

> 
> *** (3) wrong code at -O1 and above

Case 3 is this issue too.

[Bug tree-optimization/110666] [14 Regression] wrong code at -O1 and above on x86_64-linux-gnu

2023-07-14 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110666

--- Comment #5 from Zhendong Su  ---
A couple of very likely related tests (especially #2 and #3):

*** (1) wrong code at -O2, -O3, and -Os

[539] % gcctk -O1 small.c; ./a.out
[540] % gcctk -O2 small.c
[541] % timeout -s 9 5 ./a.out
Killed
[542] % cat small.c
int a, b = -1, c, d = -1;
int main() {
  if (a)
    goto L2;
 L1:
  a = 0;
 L2:
  b && c;
  c = ~c * (d ^ (0 || a) || d & b);
  if (c)
    goto L1;
  return 0;
}

*** (2) wrong code at -O1 and above

[558] % gcctk -O0 small.c; ./a.out
[559] % gcctk -O1 small.c
[560] % ./a.out
Floating point exception
[561] % cat small.c
int a = 1, b, c;
void f(int d) {
  for (; c < 2; c++) {
    if (!a)
      b = -1;
    a = (d != 4) == d;
    b = 1 % ~b;
  }
}
int main() { f(1 || b); }

*** (3) wrong code at -O1 and above

[577] % gcctk -O0 small.c; ./a.out
[578] % gcctk -O1 small.c
[579] % timeout -s 9 5 ./a.out
Killed
[580] % cat small.c
int a = 1, b, c, d;
void e(int f) {
  for (; c < 2; c++) {
    if (b)
      d = c;
    c = d;
    b = f;
  }
}
int main() { e(((a != 2) != a) != 1); }

[Bug middle-end/110673] [14 regression] ICE when building opus (internal compiler error: in gimple_phi_arg_def_from_edge, at gimple.h:4699)

2023-07-14 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110673

--- Comment #1 from Sam James  ---
Created attachment 55547
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55547&action=edit
test_unit_entropy.c.i

This is what cvise popped out. It's borderline invalid, though, so let me
touch it up.

Re: [PATCH v2 3/3] libstdc++: Optimize is_fundamental performance by __is_arithmetic built-in

2023-07-14 Thread Ken Matsui via Gcc-patches
Hi,

Here are the benchmarks for this change:

* is_fundamental

https://github.com/ken-matsui/gcc-benches/blob/main/is_fundamental.md#fri-jul-14-091146-pm-pdt-2023

Time: -37.1619%
Peak Memory Usage: -29.4294%
Total Memory Usage: -29.4783%

* is_fundamental_v

https://github.com/ken-matsui/gcc-benches/blob/main/is_fundamental_v.md#fri-jul-14-091757-pm-pdt-2023

Time: -35.5446%
Peak Memory Usage: -30.0096%
Total Memory Usage: -30.6021%

* is_fundamental with bool_constant (on trunk
[18dac101678b8c0aed4bd995351e47f26cd54dec])

https://github.com/ken-matsui/gcc-benches/blob/main/is_fundamental-bool_constant.md#fri-jul-14-094237-pm-pdt-2023

Time: -28.3908%
Peak Memory Usage: -18.5403%
Total Memory Usage: -19.9045%

---

It appears that using bool_constant is better than disjunction. If my
understanding is correct, disjunction can avoid later instantiations
by short-circuiting, but could evaluating the disjunction itself be more
expensive than evaluating is_void and is_null_pointer? Or my benchmark
might simply be incorrect.
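
To make the trade-off concrete, here is a minimal sketch (not taken from the patch) of what short-circuiting buys with std::disjunction compared to `||` on ::value. The names has_true_nested, with_nested, via_disjunction and via_or are hypothetical and exist only for this illustration; C++17 is assumed.

```cpp
#include <type_traits>

// Hypothetical trait: instantiating it is a hard error unless T has a
// static constant member named `nested`.
template <typename T>
struct has_true_nested : std::integral_constant<bool, T::nested> {};

struct with_nested { static constexpr bool nested = true; };

// std::disjunction stops at the first operand whose ::value is true, so
// has_true_nested<int> is never instantiated for arithmetic types.
template <typename T>
constexpr bool via_disjunction =
    std::disjunction<std::is_arithmetic<T>, has_true_nested<T>>::value;

// The `||` form names every ::value, so every operand is instantiated,
// even when the first one already decides the result.
template <typename T>
constexpr bool via_or =
    std::is_arithmetic<T>::value || has_true_nested<T>::value;

static_assert(via_disjunction<int>, "short-circuits before has_true_nested<int>");
static_assert(via_disjunction<with_nested>, "falls through to the second operand");
static_assert(via_or<with_nested>, "both operands are well-formed here");
// via_or<int> would not compile: has_true_nested<int> needs int::nested.
```

For is_void and is_null_pointer every operand is cheap and always well-formed, so the `||` form gives up nothing in correctness there; whether it is actually cheaper to compile is what the benchmark is trying to measure.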

Sincerely,
Ken Matsui

On Fri, Jul 14, 2023 at 9:57 PM Ken Matsui  wrote:
>
> This patch optimizes the performance of the is_fundamental trait by
> dispatching to the new __is_arithmetic built-in trait.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/type_traits (is_fundamental_v): Use __is_arithmetic
> built-in trait.
> (is_fundamental): Likewise. Optimize the original implementation.
>
> Signed-off-by: Ken Matsui 
> ---
>  libstdc++-v3/include/std/type_traits | 21 +
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/type_traits 
> b/libstdc++-v3/include/std/type_traits
> index 7ebbe04c77b..cf24de2fcac 100644
> --- a/libstdc++-v3/include/std/type_traits
> +++ b/libstdc++-v3/include/std/type_traits
> @@ -668,11 +668,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  #endif
>
>    /// is_fundamental
> +#if __has_builtin(__is_arithmetic)
> +  template<typename _Tp>
> +    struct is_fundamental
> +    : public __bool_constant<__is_arithmetic(_Tp)
> +                             || is_void<_Tp>::value
> +                             || is_null_pointer<_Tp>::value>
> +    { };
> +#else
>    template<typename _Tp>
>      struct is_fundamental
> -    : public __or_<is_arithmetic<_Tp>, is_void<_Tp>,
> -                   is_null_pointer<_Tp>>::type
> +    : public __bool_constant<is_arithmetic<_Tp>::value
> +                             || is_void<_Tp>::value
> +                             || is_null_pointer<_Tp>::value>
>      { };
> +#endif
>
>    /// is_object
>    template<typename _Tp>
> @@ -3209,13 +3219,16 @@ template <typename _Tp>
>  #if __has_builtin(__is_arithmetic)
>  template <typename _Tp>
>    inline constexpr bool is_arithmetic_v = __is_arithmetic(_Tp);
> +template <typename _Tp>
> +  inline constexpr bool is_fundamental_v
> +    = __is_arithmetic(_Tp) || is_void_v<_Tp> || is_null_pointer_v<_Tp>;
>  #else
>  template <typename _Tp>
>    inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
> -#endif
> -
>  template <typename _Tp>
>    inline constexpr bool is_fundamental_v = is_fundamental<_Tp>::value;
> +#endif
> +
>  template <typename _Tp>
>    inline constexpr bool is_object_v = is_object<_Tp>::value;
>  template <typename _Tp>
> --
> 2.41.0
>


[PATCH v2 3/3] libstdc++: Optimize is_fundamental performance by __is_arithmetic built-in

2023-07-14 Thread Ken Matsui via Gcc-patches
This patch optimizes the performance of the is_fundamental trait by
dispatching to the new __is_arithmetic built-in trait.
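
As a rough, self-contained illustration of the dispatch pattern (not the libstdc++ code itself), the sketch below guards the built-in with __has_builtin and falls back to the library trait otherwise. The namespace sketch and the macro SKETCH_HAS_IS_ARITHMETIC are hypothetical names; whether a given compiler provides __is_arithmetic is an assumption that the guard checks.

```cpp
#include <cstddef>
#include <type_traits>

// __has_builtin is itself an extension, so probe for it before using it.
#if defined(__has_builtin)
# if __has_builtin(__is_arithmetic)
#  define SKETCH_HAS_IS_ARITHMETIC 1
# endif
#endif

namespace sketch {

#ifdef SKETCH_HAS_IS_ARITHMETIC
// Built-in path: the compiler answers the arithmetic question directly.
template <typename T>
struct is_fundamental
  : std::integral_constant<bool, __is_arithmetic(T)
                                 || std::is_void<T>::value
                                 || std::is_null_pointer<T>::value>
{ };
#else
// Fallback path: same shape, using the library trait instead.
template <typename T>
struct is_fundamental
  : std::integral_constant<bool, std::is_arithmetic<T>::value
                                 || std::is_void<T>::value
                                 || std::is_null_pointer<T>::value>
{ };
#endif

} // namespace sketch

static_assert(sketch::is_fundamental<int>::value, "int is fundamental");
static_assert(sketch::is_fundamental<void>::value, "void is fundamental");
static_assert(sketch::is_fundamental<std::nullptr_t>::value, "nullptr_t is fundamental");
static_assert(!sketch::is_fundamental<int *>::value, "pointers are not fundamental");
```

Both branches derive from a single integral_constant specialization, which is the "bool_constant instead of __or_" shape the patch also applies to the fallback.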

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_fundamental_v): Use __is_arithmetic
built-in trait.
(is_fundamental): Likewise. Optimize the original implementation.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 7ebbe04c77b..cf24de2fcac 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -668,11 +668,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// is_fundamental
+#if __has_builtin(__is_arithmetic)
+  template<typename _Tp>
+    struct is_fundamental
+    : public __bool_constant<__is_arithmetic(_Tp)
+                             || is_void<_Tp>::value
+                             || is_null_pointer<_Tp>::value>
+    { };
+#else
   template<typename _Tp>
     struct is_fundamental
-    : public __or_<is_arithmetic<_Tp>, is_void<_Tp>,
-                   is_null_pointer<_Tp>>::type
+    : public __bool_constant<is_arithmetic<_Tp>::value
+                             || is_void<_Tp>::value
+                             || is_null_pointer<_Tp>::value>
     { };
+#endif
 
   /// is_object
   template<typename _Tp>
@@ -3209,13 +3219,16 @@ template <typename _Tp>
 #if __has_builtin(__is_arithmetic)
 template <typename _Tp>
   inline constexpr bool is_arithmetic_v = __is_arithmetic(_Tp);
+template <typename _Tp>
+  inline constexpr bool is_fundamental_v
+    = __is_arithmetic(_Tp) || is_void_v<_Tp> || is_null_pointer_v<_Tp>;
 #else
 template <typename _Tp>
   inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
-#endif
-
 template <typename _Tp>
   inline constexpr bool is_fundamental_v = is_fundamental<_Tp>::value;
+#endif
+
 template <typename _Tp>
   inline constexpr bool is_object_v = is_object<_Tp>::value;
 template <typename _Tp>
-- 
2.41.0



[PATCH v2 2/3] libstdc++: Optimize is_arithmetic performance by __is_arithmetic built-in

2023-07-14 Thread Ken Matsui via Gcc-patches
This patch optimizes the performance of the is_arithmetic trait by
dispatching to the new __is_arithmetic built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_arithmetic): Use __is_arithmetic
built-in trait.
(is_arithmetic_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..7ebbe04c77b 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -655,10 +655,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     { };
 
   /// is_arithmetic
+#if __has_builtin(__is_arithmetic)
+  template<typename _Tp>
+    struct is_arithmetic
+    : public __bool_constant<__is_arithmetic(_Tp)>
+    { };
+#else
   template<typename _Tp>
     struct is_arithmetic
     : public __or_<is_integral<_Tp>, is_floating_point<_Tp>>::type
     { };
+#endif
 
   /// is_fundamental
   template<typename _Tp>
@@ -3198,8 +3205,15 @@ template <typename _Tp>
   inline constexpr bool is_reference_v<_Tp&> = true;
 template <typename _Tp>
   inline constexpr bool is_reference_v<_Tp&&> = true;
+
+#if __has_builtin(__is_arithmetic)
+template <typename _Tp>
+  inline constexpr bool is_arithmetic_v = __is_arithmetic(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
+#endif
+
 template <typename _Tp>
   inline constexpr bool is_fundamental_v = is_fundamental<_Tp>::value;
 template <typename _Tp>
-- 
2.41.0



[PATCH v2 1/3] c++, libstdc++: Implement __is_arithmetic built-in trait

2023-07-14 Thread Ken Matsui via Gcc-patches
This patch implements the __is_arithmetic built-in trait for std::is_arithmetic.
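
For readers who want to see the behaviour the built-in has to reproduce, the following sketch checks the standard std::is_arithmetic trait across cv-qualifications, in the same spirit as the SA_TEST_CATEGORY macro in the new test. The macro SA_CV and the types color and int_ptr are hypothetical names used only here.

```cpp
#include <type_traits>

#define SA(X) static_assert((X), #X)
// Check a trait for a type and all of its cv-qualified variants.
#define SA_CV(TRAIT, TYPE, EXPECT)            \
  SA(TRAIT<TYPE>::value == EXPECT);           \
  SA(TRAIT<const TYPE>::value == EXPECT);     \
  SA(TRAIT<volatile TYPE>::value == EXPECT);  \
  SA(TRAIT<const volatile TYPE>::value == EXPECT)

enum color { red };
using int_ptr = int *;

// Integral and floating-point types are arithmetic, cv-qualified or not.
SA_CV(std::is_arithmetic, bool, true);
SA_CV(std::is_arithmetic, char, true);
SA_CV(std::is_arithmetic, unsigned long, true);
SA_CV(std::is_arithmetic, double, true);

// void, pointers and enumerations are not.
SA_CV(std::is_arithmetic, void, false);
SA_CV(std::is_arithmetic, int_ptr, false);
SA_CV(std::is_arithmetic, color, false);
```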

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_arithmetic.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_ARITHMETIC.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_arithmetic.
* g++.dg/ext/is_arithmetic.C: New test.
* g++.dg/tm/pr46567.C (__is_arithmetic): Rename to ...
(is_arithmetic): ... this.
* g++.dg/torture/pr57107.C: Likewise.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_arithmetic): Rename to ...
(is_arithmetic): ... this.
* include/c_global/cmath: Use is_arithmetic instead.
* include/c_std/cmath: Likewise.
* include/tr1/cmath: Likewise.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc|  3 ++
 gcc/cp/cp-trait.def |  1 +
 gcc/cp/semantics.cc |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
 gcc/testsuite/g++.dg/ext/is_arithmetic.C| 33 ++
 gcc/testsuite/g++.dg/tm/pr46567.C   |  6 +--
 gcc/testsuite/g++.dg/torture/pr57107.C  |  4 +-
 libstdc++-v3/include/bits/cpp_type_traits.h |  4 +-
 libstdc++-v3/include/c_global/cmath | 48 ++---
 libstdc++-v3/include/c_std/cmath| 24 +--
 libstdc++-v3/include/tr1/cmath  | 24 +--
 11 files changed, 99 insertions(+), 55 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_arithmetic.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..bd517d08843 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3754,6 +3754,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
+case CPTK_IS_ARITHMETIC:
+  inform (loc, "  %qT is not an arithmetic type", t1);
+  break;
 case CPTK_IS_TRIVIALLY_COPYABLE:
   inform (loc, "  %qT is not trivially copyable", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..a95aeeaf778 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_ARITHMETIC, "__is_arithmetic", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..4531f047d73 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
+case CPTK_IS_ARITHMETIC:
+  return ARITHMETIC_TYPE_P (type1);
+
 case CPTK_IS_ASSIGNABLE:
   return is_xible (MODIFY_EXPR, type1, type2);
 
@@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_ARITHMETIC:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..3d63b0101d1 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_arithmetic)
+# error "__has_builtin (__is_arithmetic) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_arithmetic.C 
b/gcc/testsuite/g++.dg/ext/is_arithmetic.C
new file mode 100644
index 000..fd35831f646
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_arithmetic.C
@@ -0,0 +1,33 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+SA_TEST_CATEGORY(__is_arithmetic, void, false);
+
+SA_TEST_CATEGORY(__is_arithmetic, char, true);
+SA_TEST_CATEGORY(__is_arithmetic, signed char, true);
+SA_TEST_CATEGORY(__is_arithmetic, unsigned char, true);
+SA_TEST_CATEGORY(__is_arithmetic, wchar_t, true);
+SA_TEST_CATEGORY(__is_arithmetic, short, 

[Bug middle-end/110673] New: [14 regression] ICE when building opus (internal compiler error: in gimple_phi_arg_def_from_edge, at gimple.h:4699)

2023-07-14 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110673

Bug ID: 110673
   Summary: [14 regression] ICE when building opus (internal
compiler error: in gimple_phi_arg_def_from_edge, at
gimple.h:4699)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

Created attachment 55546
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55546&action=edit
test_unit_entropy.c.i

Hit with opus-1.4.

```
# aarch64-unknown-linux-gnu-gcc -Icelt/tests/test_unit_entropy.p -Icelt/tests
-I../opus-1.4/celt/tests -I. -I../opus-1.4 -Iinclude -I../opus-1.4/include
-Icelt -I../opus-1.4/celt -Isilk -I../opus-1.4/silk -fdiagnostics-color=always
-D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=gnu99 -DOPUS_BUILD
-DHAVE_CONFIG_H -fvisibility=hidden -Wcast-align -Wnested-externs -Wshadow
-Wstrict-prototypes -fstack-protector-strong -O3 -pipe -mcpu=native
-fdiagnostics-color=always -ggdb3 -MD -MQ
celt/tests/test_unit_entropy.p/test_unit_entropy.c.o -MF
celt/tests/test_unit_entropy.p/test_unit_entropy.c.o.d -o
celt/tests/test_unit_entropy.p/test_unit_entropy.c.o -c
../opus-1.4/celt/tests/test_unit_entropy.c -save-temps
aarch64-unknown-linux-gnu-gcc: warning: ‘-pipe’ ignored because ‘-save-temps’
specified
during GIMPLE pass: sccp
../opus-1.4/celt/tests/test_unit_entropy.c: In function ‘main’:
../opus-1.4/celt/tests/test_unit_entropy.c:53:5: internal compiler error: in
gimple_phi_arg_def_from_edge, at gimple.h:4699
   53 | int main(int _argc,char **_argv){
  | ^~~~
0xd781675b gimple_phi_arg_def_from_edge(gphi const*, edge_def const*)
   
/usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/gimple.h:4699
0xd8160f8b gimple_phi_arg_def_from_edge(gphi const*, edge_def const*)
   
/usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/gimple-iterator.h:133
0xd8160f8b final_value_replacement_loop(loop*)
   
/usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree-scalar-evolution.cc:3732
0xd822566b execute
   
/usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/tree-ssa-loop.cc:411
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
```

Reproduces with `gcc -c test_unit_entropy.c.i -O3`.

[Bug tree-optimization/110204] [14 Regression] Suspicious warning when compiling ranges-v3 using GCC trunk (iteration 9223372036854775807 invokes undefined behavior)

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110204

Andrew Pinski  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #4 from Andrew Pinski  ---
Replaced _40 - _41 with pretmp_163 in all uses of _42 = _40 - _41;
Replaced _42 /[ex] 4 with pretmp_162 in all uses of _43 = _42 /[ex] 4;
Replaced (long unsigned int) _43 with pretmp_161 in all uses of _44 = (long
unsigned int) _43;
Removing unexecutable edge from if (_42 != 0)
Removing dead stmt _44 = (long unsigned int) _43;
Removing dead stmt _43 = _42 /[ex] 4;
Removing dead stmt _42 = _40 - _41;
Removing dead stmt _41 = MEM[(const struct vector
*)_3(D)].D.214899._M_impl.D.214244._M_start;
Removing dead stmt _40 = MEM[(const struct vector
*)_3(D)].D.214899._M_impl.D.214244._M_finish;


What is interesting is before PRE we had:
```
   [local count: 19488414]:
  if (_42 != 0)
goto ; [59.00%]
  else
goto ; [41.00%]

   [local count: 11498164]:
  __n_154 = _43 + -1;
  if (_42 != 0)
goto ; [89.00%]
  else
goto ; [11.00%]
```

After we got:
```
   [local count: 19488414]:
  if (pretmp_163 != 0)
goto ; [59.00%]
  else
goto ; [41.00%]

   [local count: 7990250]:
  goto ; [100.00%]

   [local count: 11498164]:
  __n_154 = pretmp_162 + -1;
```

That is, only the second condition based on _42 was removed, and not the first
...

[Bug tree-optimization/110204] [14 Regression] Suspicious warning when compiling ranges-v3 using GCC trunk (iteration 9223372036854775807 invokes undefined behavior)

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110204

--- Comment #3 from Andrew Pinski  ---
PRE leaves around:
   [local count: 118111600]:
...
  pretmp_163 = 0;

   [local count: 19488414]:
  if (pretmp_163 != 0)
goto ; [59.00%]
  else
goto ; [41.00%]

[Bug tree-optimization/110204] [14 Regression] Suspicious warning when compiling ranges-v3 using GCC trunk (iteration 9223372036854775807 invokes undefined behavior)

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110204

--- Comment #2 from Andrew Pinski  ---
The preprocessed source that is produced by GCC 13 is warning free.

[Bug tree-optimization/110252] [14 Regression] Wrong code at -O2/3/s on x86_64-linux-gnu

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110252

--- Comment #16 from Andrew Pinski  ---
Updated patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624563.html

(depends on:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624562.html
)

[PATCH 2/2] [PATCH] Fix tree-opt/110252: wrong code due to phiopt using flow sensitive info during match

2023-07-14 Thread Andrew Pinski via Gcc-patches
Match will query ranger via tree_nonzero_bits/get_nonzero_bits for the 2nd and
3rd operands of the COND_EXPR, and phiopt tries to create the COND_EXPR even
if we are moving one statement. That one statement could have some
flow-sensitive information on it based on the condition of the COND_EXPR, and
that information might produce wrong code once the statement is moved out.

This is similar to the previous version of the patch, except that we now use
flow_sensitive_info_storage instead of manually doing the save/restore,
and we also handle all defs on a gimple statement rather than just the lhs
of the gimple statement. A few more testcases that were failing before
have been added as well.
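
The "temporarily remove, then put back" idea can be sketched as a small RAII guard. This is only an illustration under simplifying assumptions, not the auto_flow_sensitive class from the patch: fake_name stands in for an SSA name, a plain int stands in for its flow-sensitive info, and scoped_clear_info and info_seen_while_guarded are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for an SSA name; 0 means "no flow-sensitive info".
struct fake_name {
  int range_info = 0;
};

// Clears the info on construction and restores it on destruction, so any
// query made while the guard is alive sees no flow-sensitive info.
class scoped_clear_info {
 public:
  explicit scoped_clear_info(fake_name &n)
    : name_(n), saved_(n.range_info) { n.range_info = 0; }
  ~scoped_clear_info() { name_.range_info = saved_; }

  scoped_clear_info(const scoped_clear_info &) = delete;
  scoped_clear_info &operator=(const scoped_clear_info &) = delete;

 private:
  fake_name &name_;
  int saved_;
};

// Returns what a query would observe while the guard is active.
inline int info_seen_while_guarded(fake_name &n) {
  scoped_clear_info guard(n);
  return n.range_info;
}
```

Calling info_seen_while_guarded on a name whose range_info is 7 yields 0, and the name holds 7 again afterwards. Tying the restore to a destructor means every exit path of the guarded scope puts the info back.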

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/110252

gcc/ChangeLog:

* tree-ssa-phiopt.cc (class auto_flow_sensitive): New class.
(auto_flow_sensitive::auto_flow_sensitive): New constructor.
(auto_flow_sensitive::~auto_flow_sensitive): New destructor.
(match_simplify_replacement): Temporarily
remove the flow sensitive info on the two statements that might
be moved.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/phi-opt-25b.c: Updated as
__builtin_parity loses the nonzerobits info.
* gcc.c-torture/execute/pr110252-1.c: New test.
* gcc.c-torture/execute/pr110252-2.c: New test.
* gcc.c-torture/execute/pr110252-3.c: New test.
* gcc.c-torture/execute/pr110252-4.c: New test.
---
 .../gcc.c-torture/execute/pr110252-1.c| 15 ++
 .../gcc.c-torture/execute/pr110252-2.c| 10 
 .../gcc.c-torture/execute/pr110252-3.c| 13 +
 .../gcc.c-torture/execute/pr110252-4.c|  8 +++
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25b.c   |  6 +--
 gcc/tree-ssa-phiopt.cc| 51 +--
 6 files changed, 96 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110252-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110252-2.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110252-3.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110252-4.c

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110252-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr110252-1.c
new file mode 100644
index 000..4ae93ca0647
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110252-1.c
@@ -0,0 +1,15 @@
+/* This is reduced from sel-sched.cc which was noticed was being miscompiled too. */
+int g(int min_need_stall) __attribute__((__noipa__));
+int g(int min_need_stall)
+{
+  return  min_need_stall < 0 ? 1 : ((min_need_stall) < (1) ? (min_need_stall) : (1));
+}
+int main(void)
+{
+  for(int i = -100; i <= 100; i++)
+{
+  int t = g(i);
+  if (t != (i!=0))
+__builtin_abort();
+}
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110252-2.c 
b/gcc/testsuite/gcc.c-torture/execute/pr110252-2.c
new file mode 100644
index 000..7f1a7dbf134
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110252-2.c
@@ -0,0 +1,10 @@
+signed char f() __attribute__((__noipa__));
+signed char f() { return 0; }
+int main()
+{
+  int g = f() - 1;
+  int e = g < 0 ? 1 : ((g >> (8-2))!=0);
+  asm("":"+r"(e));
+  if (e != 1)
+__builtin_abort();
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110252-3.c 
b/gcc/testsuite/gcc.c-torture/execute/pr110252-3.c
new file mode 100644
index 000..c24bf1ab1e4
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110252-3.c
@@ -0,0 +1,13 @@
+
+unsigned int a = 1387579096U;
+void sinkandcheck(unsigned b) __attribute__((noipa));
+void sinkandcheck(unsigned b)
+{
+if (a != b)
+__builtin_abort();
+}
+int main() {
+a = 1 < (~a) ? 1 : (~a);
+sinkandcheck(1);
+return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110252-4.c 
b/gcc/testsuite/gcc.c-torture/execute/pr110252-4.c
new file mode 100644
index 000..f97edd3f069
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110252-4.c
@@ -0,0 +1,8 @@
+
+int a, b = 2, c = 2;
+int main() {
+  b = ~(1 % (a ^ (b - (1 && c) || c & b)));
+  if (b < -1)
+__builtin_abort();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25b.c 
b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25b.c
index 7298da0c96e..0fd9b004a03 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25b.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-25b.c
@@ -65,8 +65,6 @@ int test_popcountll(unsigned long long x, unsigned long long 
y)
   return x ? __builtin_popcountll(y) : 0;
 }
 
-/* 3 types of functions (not including parity), each with 3 types and there 
are 2 goto each */
-/* { dg-final { scan-tree-dump-times "goto " 18 "optimized" } } */
+/* 4 types of functions, each with 3 types and there are 2 goto each */
+/* { dg-final { scan-tree-dump-times "goto " 24 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "x_..D. != 0" 12 "optimized" } } */
-/* parity case will be 

[PATCH 1/2] Add flow_sensitive_info_storage and use it in gimple-fold.

2023-07-14 Thread Andrew Pinski via Gcc-patches
This adds flow_sensitive_info_storage and uses it in
maybe_fold_comparisons_from_match_pd as mentioned in
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621817.html .
Since using it in maybe_fold_comparisons_from_match_pd was easy
and allowed me to test the storage earlier, I did it.

This also hides better how the flow sensitive information is
stored and only a single place needs to be updated if that
ever changes (again).
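
A simplified model of the save/clear/restore protocol, paired with an unwind stack in the style of fosa_unwind, might look like the sketch below. It is not the GCC class: fake_name, info_storage and unwind_restores_all are hypothetical, and a plain int replaces the real range and pointer info.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SSA name; 0 means "no flow-sensitive info".
struct fake_name {
  int range_info = 0;
};

// Holds one saved copy of the info and tracks whether it has been filled.
class info_storage {
 public:
  void save(const fake_name &n) {
    assert(!saved_);            // each storage object is filled only once
    value_ = n.range_info;
    saved_ = true;
  }
  void save_and_clear(fake_name &n) {
    save(n);
    n.range_info = 0;
  }
  void restore(fake_name &n) const {
    assert(saved_);             // restoring without a prior save is a bug
    n.range_info = value_;
  }

 private:
  int value_ = 0;
  bool saved_ = false;
};

// Clears the info on two names, then unwinds the stack to restore both.
inline bool unwind_restores_all() {
  fake_name a, b;
  a.range_info = 3;
  b.range_info = 9;

  std::vector<std::pair<fake_name *, info_storage>> unwind;
  for (fake_name *n : {&a, &b}) {
    info_storage s;
    s.save_and_clear(*n);
    unwind.emplace_back(n, s);
  }
  const bool cleared = a.range_info == 0 && b.range_info == 0;

  for (auto &p : unwind)
    p.second.restore(*p.first);
  return cleared && a.range_info == 3 && b.range_info == 9;
}
```

Because callers only see save, save_and_clear and restore, the representation of the stored info can change without touching the code that pushes to and pops from the unwind stack.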

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* gimple-fold.cc (fosa_unwind): Replace `vrange_storage *`
with flow_sensitive_info_storage.
(follow_outer_ssa_edges): Update how to save off the flow
sensitive info.
(maybe_fold_comparisons_from_match_pd): Update restoring
of flow sensitive info.
* tree-ssanames.cc (flow_sensitive_info_storage::save): New method.
(flow_sensitive_info_storage::restore): New method.
(flow_sensitive_info_storage::save_and_clear): New method.
(flow_sensitive_info_storage::clear_storage): New method.
* tree-ssanames.h (class flow_sensitive_info_storage): New class.
---
 gcc/gimple-fold.cc   | 17 +--
 gcc/tree-ssanames.cc | 72 
 gcc/tree-ssanames.h  | 21 +
 3 files changed, 100 insertions(+), 10 deletions(-)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 4027ff71e10..de94efbcff7 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -6947,7 +6947,7 @@ and_comparisons_1 (tree type, enum tree_code code1, tree 
op1a, tree op1b,
 }
 
 static basic_block fosa_bb;
-static vec<std::pair<tree, vrange_storage *> > *fosa_unwind;
+static vec<std::pair<tree, flow_sensitive_info_storage> > *fosa_unwind;
 static tree
 follow_outer_ssa_edges (tree val)
 {
@@ -6967,14 +6967,11 @@ follow_outer_ssa_edges (tree val)
   || POINTER_TYPE_P (TREE_TYPE (val)))
  && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (val)))
return NULL_TREE;
+  flow_sensitive_info_storage storage;
+  storage.save_and_clear (val);
   /* If the definition does not dominate fosa_bb temporarily reset
 flow-sensitive info.  */
-  if (val->ssa_name.info.range_info)
-   {
- fosa_unwind->safe_push (std::make_pair
-   (val, val->ssa_name.info.range_info));
- val->ssa_name.info.range_info = NULL;
-   }
+  fosa_unwind->safe_push (std::make_pair (val, storage));
   return val;
 }
   return val;
@@ -7034,14 +7031,14 @@ maybe_fold_comparisons_from_match_pd (tree type, enum 
tree_code code,
  type, gimple_assign_lhs (stmt1),
  gimple_assign_lhs (stmt2));
   fosa_bb = outer_cond_bb;
-  auto_vec<std::pair<tree, vrange_storage *>, 8> unwind_stack;
+  auto_vec<std::pair<tree, flow_sensitive_info_storage>, 8> unwind_stack;
   fosa_unwind = &unwind_stack;
   if (op.resimplify (NULL, (!outer_cond_bb
? follow_all_ssa_edges : follow_outer_ssa_edges)))
 {
   fosa_unwind = NULL;
   for (auto p : unwind_stack)
-   p.first->ssa_name.info.range_info = p.second;
+   p.second.restore (p.first);
       if (gimple_simplified_result_is_gimple_val (&op))
{
  tree res = op.ops[0];
@@ -7065,7 +7062,7 @@ maybe_fold_comparisons_from_match_pd (tree type, enum 
tree_code code,
 }
   fosa_unwind = NULL;
   for (auto p : unwind_stack)
-p.first->ssa_name.info.range_info = p.second;
+p.second.restore (p.first);
 
   return NULL_TREE;
 }
diff --git a/gcc/tree-ssanames.cc b/gcc/tree-ssanames.cc
index 5fdb6a37e9f..f81332451fc 100644
--- a/gcc/tree-ssanames.cc
+++ b/gcc/tree-ssanames.cc
@@ -916,3 +916,75 @@ make_pass_release_ssa_names (gcc::context *ctxt)
 {
   return new pass_release_ssa_names (ctxt);
 }
+
+/* Save and restore of flow sensitive information. */
+
+/* Save off the flow sensitive info from NAME. */
+
+void
+flow_sensitive_info_storage::save (tree name)
+{
+  gcc_assert (state == 0);
+  if (!POINTER_TYPE_P (TREE_TYPE (name)))
+{
+  range_info = SSA_NAME_RANGE_INFO (name);
+  state = 1;
+  return;
+}
+  state = -1;
+  auto ptr_info = SSA_NAME_PTR_INFO (name);
+  if (ptr_info)
+{
+  align = ptr_info->align;
+  misalign = ptr_info->misalign;
+  null = SSA_NAME_PTR_INFO (name)->pt.null;
+}
+  else
+{
+  align = 0;
+  misalign = 0;
+  null = true;
+}
+}
+
+/* Restore the flow sensitive info from NAME. */
+
+void
+flow_sensitive_info_storage::restore (tree name)
+{
+  gcc_assert (state != 0);
+  if (!POINTER_TYPE_P (TREE_TYPE (name)))
+{
+  gcc_assert (state == 1);
+  SSA_NAME_RANGE_INFO (name) = range_info;
+  return;
+}
+  gcc_assert (state == -1);
+  auto ptr_info = SSA_NAME_PTR_INFO (name);
+  /* If there was no flow sensitive info on the pointer
+ just return, there is nothing to restore to.  */
+  if (!ptr_info)
+return;
+  if (align != 0)
+set_ptr_info_alignment (ptr_info, align, misalign);
+  else
+mark_ptr_info_alignment_unknown (ptr_info);
+  SSA_NAME_PTR_INFO (name)->pt.null = null;

[PATCH] libstdc++: Use __bool_constant entirely

2023-07-14 Thread Ken Matsui via Gcc-patches
This patch uses __bool_constant entirely instead of integral_constant
in the type_traits header, specifically for true_type, false_type,
and bool_constant.
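
Because these are all alias declarations, the change does not alter any type identities. The short check below illustrates that with a stand-in alias template; sketch::bool_constant_11 is a hypothetical name playing the role of __bool_constant.

```cpp
#include <type_traits>

namespace sketch {

// Stand-in for __bool_constant: an alias template usable from C++11 on.
template <bool V>
using bool_constant_11 = std::integral_constant<bool, V>;

// true_type and false_type spelled through the alias template.
using true_type = bool_constant_11<true>;
using false_type = bool_constant_11<false>;

} // namespace sketch

// The aliases name exactly the same types as the standard ones.
static_assert(std::is_same<sketch::true_type, std::true_type>::value,
              "same type as std::true_type");
static_assert(std::is_same<sketch::false_type,
                           std::integral_constant<bool, false>>::value,
              "same type as integral_constant<bool, false>");
```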

libstdc++-v3/ChangeLog:

* include/std/type_traits (true_type): Use __bool_constant
instead.
(false_type): Likewise.
(bool_constant): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 9f086992ebc..7dc5791a7c5 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -78,24 +78,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr _Tp integral_constant<_Tp, __v>::value;
 #endif
 
-  /// The type used as a compile-time boolean with true value.
-  using true_type =  integral_constant<bool, true>;
-
-  /// The type used as a compile-time boolean with false value.
-  using false_type = integral_constant<bool, false>;
-
   /// @cond undocumented
   /// bool_constant for C++11
   template<bool __v>
     using __bool_constant = integral_constant<bool, __v>;
   /// @endcond
 
+  /// The type used as a compile-time boolean with true value.
+  using true_type =  __bool_constant<true>;
+
+  /// The type used as a compile-time boolean with false value.
+  using false_type = __bool_constant<false>;
+
 #if __cplusplus >= 201703L
 # define __cpp_lib_bool_constant 201505L
   /// Alias template for compile-time boolean constant types.
   /// @since C++17
   template<bool __v>
-    using bool_constant = integral_constant<bool, __v>;
+    using bool_constant = __bool_constant<__v>;
 #endif
 #endif
 
   // Metaprogramming helper types.
-- 
2.41.0



[PATCH v3] Introduce attribute reverse_alias

2023-07-14 Thread Alexandre Oliva via Gcc-patches


This patch introduces an attribute to add extra aliases to a symbol
when its definition is output.  The main goal is to ease interfacing
C++ with Ada, as C++ mangled names have to be named, and in some cases
(e.g. when using stdint.h typedefs in function arguments) the symbol
names may vary across platforms.
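
To illustrate the stdint.h point, the sketch below computes which mangled name a hypothetical function `void f(std::int64_t)` would get, assuming the Itanium C++ ABI, where `long` mangles as `l` and `long long` as `x`. The function f and the helper mangled_f_int64 are made up for this example, and nothing here uses the new attribute.

```cpp
#include <cstdint>
#include <type_traits>

// Picks the Itanium-ABI mangled name of a hypothetical `void f(std::int64_t)`
// from the underlying type of the typedef on the current platform.
constexpr const char *mangled_f_int64() {
  return std::is_same<std::int64_t, long>::value        ? "_Z1fl"
         : std::is_same<std::int64_t, long long>::value ? "_Z1fx"
                                                        : "unknown";
}
```

On a platform where int64_t is `long` the symbol is `_Z1fl`, and where it is `long long` the symbol is `_Z1fx`, so a fixed name written on the Ada side would have to differ per target. An extra alias with a stable name sidesteps that.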

The attribute is usable in C and C++, presumably in all C-family
languages.  It can be attached to global variables and functions.  In
C++, it can also be attached to namespace-scoped variables and
functions, static data members, member functions, explicit
instantiations and specializations of template functions, members and
classes.

When applied to constructors or destructors, additional reverse_aliases
with _Base and _Del suffixes are defined for variants other than
complete-object ones.  This changes the assumption that clones always
carry the same attributes as their abstract declarations, so there is
now a function to adjust them.

C++ also had a bug in which attributes from local extern declarations
failed to be propagated to a preexisting corresponding
namespace-scoped decl.  I've fixed that, and adjusted acc tests that
distinguished between C and C++ in this regard.

Applying the attribute to class types is only valid in C++, and the
effect is to attach the alias to the RTTI object associated with the
class type.

Regstrapped on x86_64-linux-gnu.  Ok to install?

This is refreshed and renamed from earlier versions that named the
attribute 'exalias', and that AFAICT got stuck in name bikeshedding.
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551614.html


for  gcc/ChangeLog

* attribs.cc: Include cgraph.h.
(decl_attributes): Allow late introduction of reverse_alias in
types.
(create_reverse_alias_decl, create_reverse_alias_decls): New.
* attribs.h: Declare them.
(FOR_EACH_REVERSE_ALIAS): New macro.
* cgraph.cc (cgraph_node::create): Create reverse_alias decls.
* varpool.cc (varpool_node::get_create): Create reverse_alias
decls.
* cgraph.h (symtab_node::remap_reverse_alias_target): New.
* symtab.cc (symtab_node::remap_reverse_alias_target):
Define.
* cgraphunit.cc (cgraph_node::analyze): Create alias_target
node if needed.
(analyze_functions): Fixup visibility of implicit alias only
after its node is analyzed.
* doc/extend.texi (reverse_alias): Document for variables,
functions and types.

for  gcc/ada/ChangeLog

* doc/gnat_rm/interfacing_to_other_languages.rst: Mention
attribute reverse_alias to give RTTI symbols mnemonic names.
* doc/gnat_ugn/the_gnat_compilation_model.rst: Mention
attribute reverse_alias.  Fix incorrect ref to C1 ctor variant.

for  gcc/c-family/ChangeLog

* c-ada-spec.cc (pp_asm_name): Use first reverse_alias if
available.
* c-attribs.cc (handle_reverse_alias_attribute): New.
(c_common_attribute_table): Add reverse_alias.
(handle_copy_attribute): Do not copy reverse_alias.

for  gcc/c/ChangeLog

* c-decl.cc (duplicate_decls): Remap reverse_alias target.

for  gcc/cp/ChangeLog

* class.cc (adjust_clone_attributes): New.
(copy_fndecl_with_name, build_clone): Call it.
* cp-tree.h (adjust_clone_attributes): Declare.
(update_reverse_alias_interface): Declare.
(update_tinfo_reverse_alias): Declare.
* decl.cc (duplicate_decls): Remap reverse_alias target.
Adjust clone attributes.
(grokfndecl): Tentatively create reverse_alias decls after
adding attributes in e.g. a template member function explicit
instantiation.
* decl2.cc (cplus_decl_attributes): Update tinfo
reverse_alias.
(copy_interface, update_reverse_alias_interface): New.
(determine_visibility): Update reverse_alias interface.
(tentative_decl_linkage, import_export_decl): Likewise.
* name-lookup.cc: Include target.h and cgraph.h.
(push_local_extern_decl_alias): Merge attributes with
namespace-scoped decl, and drop duplicate reverse_alias.
* optimize.cc (maybe_clone_body): Re-adjust attributes after
cloning them.  Update reverse_alias interface.
* rtti.cc: Include attribs.h and cgraph.h.
(get_tinfo_decl): Copy reverse_alias attributes from type to
tinfo decl.  Create reverse_alias decls.
(update_tinfo_reverse_alias): New.

for  gcc/testsuite/ChangeLog

* c-c++-common/goacc/declare-1.c: Adjust.
* c-c++-common/goacc/declare-2.c: Adjust.
* c-c++-common/torture/attr-revalias-1.c: New.
* c-c++-common/torture/attr-revalias-2.c: New.
* c-c++-common/torture/attr-revalias-3.c: New.
* c-c++-common/torture/attr-revalias-4.c: New.
* g++.dg/torture/attr-revalias-1.C: New.
* g++.dg/torture/attr-revalias-2.C: New.
* g++.dg/torture/attr-revalias-3.C: 

[Bug tree-optimization/110672] vec.h:1023:9: error: 'new_temp' may be used uninitialized [-Werror=maybe-uninitialized]

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110672

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Dup.

*** This bug has been marked as a duplicate of bug 110652 ***

[Bug tree-optimization/110652] [14 Regression] bootstrap failure on tree-vect-stmts.cc with --enable-checking=release: error: 'new_temp' may be used uninitialized

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110652

Andrew Pinski  changed:

   What|Removed |Added

 CC||danglin at gcc dot gnu.org

--- Comment #7 from Andrew Pinski  ---
*** Bug 110672 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/110672] New: vec.h:1023:9: error: 'new_temp' may be used uninitialized [-Werror=maybe-uninitialized]

2023-07-14 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110672

Bug ID: 110672
   Summary: vec.h:1023:9: error: 'new_temp' may be used
uninitialized [-Werror=maybe-uninitialized]
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: danglin at gcc dot gnu.org
  Target Milestone: ---
  Host: hppa64-hp-hpux11.11
Target: hppa64-hp-hpux11.11
 Build: hppa64-hp-hpux11.11

In stage2,

/home/dave/gnu/gcc/objdir64/./prev-gcc/xg++
-B/home/dave/gnu/gcc/objdir64/./prev-gcc/
-B/opt/gnu64/gcc/gcc-14/hppa64-hp-hpux11.11/bin/ -nostdinc++
-B/home/dave/gnu/gcc/objdir64/prev-hppa64-hp-hpux11.11/libstdc++-v3/src/.libs
-B/home/dave/gnu/gcc/objdir64/prev-hppa64-hp-hpux11.11/libstdc++-v3/libsupc++/.libs

-I/home/dave/gnu/gcc/objdir64/prev-hppa64-hp-hpux11.11/libstdc++-v3/include/hppa64-hp-hpux11.11
 -I/home/dave/gnu/gcc/objdir64/prev-hppa64-hp-hpux11.11/libstdc++-v3/include 
-I/home/dave/gnu/gcc/gcc/libstdc++-v3/libsupc++
-L/home/dave/gnu/gcc/objdir64/prev-hppa64-hp-hpux11.11/libstdc++-v3/src/.libs
-L/home/dave/gnu/gcc/objdir64/prev-hppa64-hp-hpux11.11/libstdc++-v3/libsupc++/.libs
 -fno-PIE -c   -g -O2 -fno-checking -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror   -DHAVE_CONFIG_H -fno-PIE -I. -I.
-I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include 
-I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libcody
-I/opt/gnu64/gcc/gmp/include  -I../../gcc/gcc/../libdecnumber
-I../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber
-I../../gcc/gcc/../libbacktrace -I/opt/gnu64/gcc/gmp/include  -o
tree-vect-slp.o -MT tree-vect-slp.o -MMD -MP -MF ./.deps/tree-vect-slp.TPo
../../gcc/gcc/tree-vect-slp.cc
In file included from ../../gcc/gcc/hash-table.h:248,
 from ../../gcc/gcc/coretypes.h:486,
 from ../../gcc/gcc/tree-vect-stmts.cc:24:
In member function 'T* vec::quick_push(const T&) [with T =
tree_node*; A = va_heap]',
inlined from 'T* vec::quick_push(const T&) [with T = tree_node*]' at
../../gcc/gcc/vec.h:1987:28,
inlined from 'bool vectorizable_load(vec_info*, stmt_vec_info,
gimple_stmt_iterator*, gimple**, slp_tree, stmt_vector_for_cost*)' at
../../gcc/gcc/tree-vect-stmts.cc:10962:23:
../../gcc/gcc/vec.h:1023:9: error: 'new_temp' may be used uninitialized
[-Werror=maybe-uninitialized]
 1023 |   *slot = obj;
  |   ~~^
../../gcc/gcc/tree-vect-stmts.cc: In function 'bool
vectorizable_load(vec_info*, stmt_vec_info, gimple_stmt_iterator*, gimple**,
slp_tree, stmt_vector_for_cost*)':
../../gcc/gcc/tree-vect-stmts.cc:9300:8: note: 'new_temp' was declared here
 9300 |   tree new_temp;
  |^~~~
cc1plus: all warnings being treated as errors

Initializing new_temp to NULL_TREE fixes the warning.
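
The shape of the problem can be sketched outside GCC as follows.  This is
only an illustration under assumptions: `pick` and its branches are
hypothetical and are not the code in vectorizable_load; the point is a
local that is assigned on some paths only and then stored into a
container.

```cpp
#include <vector>

// Hypothetical stand-in for the reported pattern: a local assigned only on
// some paths and then pushed into a vector.
int pick(int kind, std::vector<int> &out)
{
  int new_temp = 0;   // explicit initializer, analogous to NULL_TREE
  if (kind == 1)
    new_temp = 10;
  else if (kind == 2)
    new_temp = 20;
  // Without the initializer above, any other 'kind' would read an
  // indeterminate value here, which is what the warning guards against.
  out.push_back(new_temp);
  return new_temp;
}
```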

[PATCH] VECT: Add mask_len_fold_left_plus for in-order floating-point reduction

2023-07-14 Thread juzhe . zhong
From: Ju-Zhe Zhong 

Hi, Richard and Richi.

This patch adds the mask_len_fold_left_plus pattern to support in-order
floating-point reduction for targets that support len loop control.

Consider the following case:
double
foo2 (double *__restrict a,
 double init,
 int *__restrict cond,
 int n)
{
for (int i = 0; i < n; i++)
  if (cond[i])
init += a[i];
return init;
}

ARM SVE:

...
vec_mask_and_60 = loop_mask_54 & mask__23.33_57;
vect__ifc__35.37_64 = .VCOND_MASK (vec_mask_and_60, vect__8.36_61, { 0.0, ... });
_36 = .MASK_FOLD_LEFT_PLUS (init_20, vect__ifc__35.37_64, loop_mask_54);
...

For RVV, we want to see:
...
_36 = .MASK_LEN_FOLD_LEFT_PLUS (init_20, vect__ifc__35.37_64, control_mask, loop_len, bias);
...
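
As a reading aid, the intended behaviour can be written as a scalar
reference model.  This is a sketch, not GCC code: it follows the
pseudo-code the patch adds to md.texi, and the function name
`mask_len_fold_left_plus_ref` and the use of std::vector for the vector
operands are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference model: accumulate strictly left to right, visiting only
// the first len + bias lanes and skipping lanes whose mask bit is clear.
double mask_len_fold_left_plus_ref(double init,
                                   const std::vector<double> &vec,
                                   const std::vector<bool> &mask,
                                   std::ptrdiff_t len, std::ptrdiff_t bias)
{
  // The active lane count must fit inside both operands.
  assert(len + bias >= 0);
  assert(static_cast<std::size_t>(len + bias) <= vec.size());
  assert(vec.size() == mask.size());

  double acc = init;
  for (std::ptrdiff_t i = 0; i < len + bias; i++)
    if (mask[i])
      acc += vec[i];   // no reassociation: one add per active lane, in order
  return acc;
}
```

The loop order matters because floating-point addition is not
associative, which is why the reduction has to stay in-order.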

gcc/ChangeLog:

* doc/md.texi: Add mask_len_fold_left_plus.
* internal-fn.cc (mask_len_fold_left_direct): Ditto.
(expand_mask_len_fold_left_optab_fn): Ditto.
(direct_mask_len_fold_left_optab_supported_p): Ditto.
* internal-fn.def (MASK_LEN_FOLD_LEFT_PLUS): Ditto.
* optabs.def (OPTAB_D): Ditto.

---
 gcc/doc/md.texi | 13 +
 gcc/internal-fn.cc  |  5 +
 gcc/internal-fn.def |  3 +++
 gcc/optabs.def  |  1 +
 4 files changed, 22 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index cbcb992e5d7..6f44e66399d 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5615,6 +5615,19 @@ no reassociation.
 Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand
 (operand 3) that specifies which elements of the source vector should be added.
 
+@cindex @code{mask_len_fold_left_plus_@var{m}} instruction pattern
+@item @code{mask_len_fold_left_plus_@var{m}}
+Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand
+(operand 3), len operand (operand 4) and bias operand (operand 5) that
+performs following operations strictly in-order (no reassociation):
+
+@smallexample
+operand0 = operand1;
+for (i = 0; i < LEN + BIAS; i++)
+  if (operand3[i])
+operand0 += operand2[i];
+@end smallexample
+
 @cindex @code{sdot_prod@var{m}} instruction pattern
 @item @samp{sdot_prod@var{m}}
 
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index e698f0bffc7..2bf4fc492fe 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -190,6 +190,7 @@ init_internal_fns ()
 #define fold_extract_direct { 2, 2, false }
 #define fold_left_direct { 1, 1, false }
 #define mask_fold_left_direct { 1, 1, false }
+#define mask_len_fold_left_direct { 1, 1, false }
 #define check_ptrs_direct { 0, 0, false }
 
 const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
@@ -3890,6 +3891,9 @@ expand_convert_optab_fn (internal_fn fn, gcall *stmt, 
convert_optab optab,
 #define expand_mask_fold_left_optab_fn(FN, STMT, OPTAB) \
   expand_direct_optab_fn (FN, STMT, OPTAB, 3)
 
+#define expand_mask_len_fold_left_optab_fn(FN, STMT, OPTAB) \
+  expand_direct_optab_fn (FN, STMT, OPTAB, 5)
+
 #define expand_check_ptrs_optab_fn(FN, STMT, OPTAB) \
   expand_direct_optab_fn (FN, STMT, OPTAB, 4)
 
@@ -3997,6 +4001,7 @@ multi_vector_optab_supported_p (convert_optab optab, 
tree_pair types,
 #define direct_fold_extract_optab_supported_p direct_optab_supported_p
 #define direct_fold_left_optab_supported_p direct_optab_supported_p
 #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
+#define direct_mask_len_fold_left_optab_supported_p direct_optab_supported_p
 #define direct_check_ptrs_optab_supported_p direct_optab_supported_p
 #define direct_vec_set_optab_supported_p direct_optab_supported_p
 #define direct_vec_extract_optab_supported_p direct_optab_supported_p
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index ea750a921ed..d3aec51b1f2 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -319,6 +319,9 @@ DEF_INTERNAL_OPTAB_FN (FOLD_LEFT_PLUS, ECF_CONST | 
ECF_NOTHROW,
 DEF_INTERNAL_OPTAB_FN (MASK_FOLD_LEFT_PLUS, ECF_CONST | ECF_NOTHROW,
   mask_fold_left_plus, mask_fold_left)
 
+DEF_INTERNAL_OPTAB_FN (MASK_LEN_FOLD_LEFT_PLUS, ECF_CONST | ECF_NOTHROW,
+  mask_len_fold_left_plus, mask_len_fold_left)
+
 /* Unary math functions.  */
 DEF_INTERNAL_FLT_FN (ACOS, ECF_CONST, acos, unary)
 DEF_INTERNAL_FLT_FN (ACOSH, ECF_CONST, acosh, unary)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 3dae228fba6..7023392979e 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -385,6 +385,7 @@ OPTAB_D (reduc_ior_scal_optab,  "reduc_ior_scal_$a")
 OPTAB_D (reduc_xor_scal_optab,  "reduc_xor_scal_$a")
 OPTAB_D (fold_left_plus_optab, "fold_left_plus_$a")
 OPTAB_D (mask_fold_left_plus_optab, "mask_fold_left_plus_$a")
+OPTAB_D (mask_len_fold_left_plus_optab, "mask_len_fold_left_plus_$a")
 
 OPTAB_D (extract_last_optab, "extract_last_$a")
 OPTAB_D (fold_extract_last_optab, "fold_extract_last_$a")
-- 
2.36.1



Re: [PATCH v3 1/3] c++: Track lifetimes in constant evaluation [PR70331,PR96630,PR98675]

2023-07-14 Thread Jason Merrill via Gcc-patches

On 7/14/23 11:16, Jason Merrill wrote:
I'm not seeing either a copyright assignment or DCO certification for 
you; please see https://gcc.gnu.org/contribute.html#legal for more 
information.


Oops, now I see the DCO sign-off, not sure how I was missing it.

Jason



[Bug rtl-optimization/110206] [14 Regression] wrong code with -Os -march=cascadelake since r14-1246

2023-07-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206

Uroš Bizjak  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|14.0|12.4
 Status|ASSIGNED|RESOLVED

--- Comment #20 from Uroš Bizjak  ---
Fixed for gcc-12.4+.

[pushed] c++: c++26 regression fixes

2023-07-14 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

Apparently I wasn't actually running the testsuite in C++26 mode like I
thought I was, so there were some failures I wasn't seeing.

The constexpr hunk fixes regressions with the P2738 implementation; we still
need to use the old handling for casting from void pointers to heap
variables.
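
For readers unfamiliar with P2738, a minimal sketch of what it permits,
modelled on the constexpr-cast2.C test touched below.  It assumes the
feature-test value __cpp_constexpr >= 202306L that P2738R1 specifies; the
function `read_through_void` is a hypothetical helper added only so the
example has something to run.

```cpp
static int i = 7;
constexpr void *vpi = &i;

#if defined(__cpp_constexpr) && __cpp_constexpr >= 202306L
// With P2738 available, this cast is a constant expression because vpi
// points to an object whose type is similar to int.
constexpr int *p2 = static_cast<int *>(vpi);
static_assert(p2 == &i, "cast from void* round-trips in constant evaluation");
#endif

// The same cast has always been accepted outside constant evaluation.
int read_through_void()
{
  return *static_cast<int *>(vpi);
}
```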

PR c++/110344

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Move P2738 handling
after heap handling.
* name-lookup.cc (get_cxx_dialect_name): Add C++26.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-cast2.C: Adjust for P2738.
* g++.dg/ipa/devirt-45.C: Handle -fimplicit-constexpr.
---
 gcc/cp/constexpr.cc  | 21 ++--
 gcc/cp/name-lookup.cc|  2 ++
 gcc/testsuite/g++.dg/cpp0x/constexpr-cast2.C |  6 +++---
 gcc/testsuite/g++.dg/ipa/devirt-45.C |  2 +-
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index cca0435bafc..9f96a6c41ea 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -7681,17 +7681,6 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, 
tree t,
&& !is_std_construct_at (ctx->call)
&& !is_std_allocator_allocate (ctx->call))
  {
-   /* P2738 (C++26): a conversion from a prvalue P of type "pointer to
-  cv void" to a pointer-to-object type T unless P points to an
-  object whose type is similar to T.  */
-   if (cxx_dialect > cxx23)
- if (tree ob
- = cxx_fold_indirect_ref (ctx, loc, TREE_TYPE (type), op))
-   {
- r = build1 (ADDR_EXPR, type, ob);
- break;
-   }
-
/* Likewise, don't error when casting from void* when OP is
uninit and similar.  */
tree sop = tree_strip_nop_conversions (op);
@@ -7699,6 +7688,16 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, 
tree t,
&& VAR_P (TREE_OPERAND (sop, 0))
&& DECL_ARTIFICIAL (TREE_OPERAND (sop, 0)))
  /* OK */;
+   /* P2738 (C++26): a conversion from a prvalue P of type "pointer to
+  cv void" to a pointer-to-object type T unless P points to an
+  object whose type is similar to T.  */
+   else if (cxx_dialect > cxx23
+&& (sop = cxx_fold_indirect_ref (ctx, loc,
+ TREE_TYPE (type), sop)))
+ {
+   r = build1 (ADDR_EXPR, type, sop);
+   break;
+ }
else
  {
if (!ctx->quiet)
diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index 74565184403..2d747561e1f 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -6731,6 +6731,8 @@ get_cxx_dialect_name (enum cxx_dialect dialect)
   return "C++20";
 case cxx23:
   return "C++23";
+case cxx26:
+  return "C++26";
 }
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-cast2.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-cast2.C
index b79e8a90131..3efbd92f043 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-cast2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-cast2.C
@@ -6,11 +6,11 @@ static int i;
 constexpr void *vp0 = nullptr;
 constexpr void *vpi = &i;
 constexpr int *p1 = (int *) vp0; // { dg-error "cast from .void\\*. is not allowed" }
-constexpr int *p2 = (int *) vpi; // { dg-error "cast from .void\\*. is not allowed" }
+constexpr int *p2 = (int *) vpi; // { dg-error "cast from .void\\*. is not allowed" "" { target c++23_down } }
 constexpr int *p3 = static_cast<int *>(vp0); // { dg-error "cast from .void\\*. is not allowed" }
-constexpr int *p4 = static_cast<int *>(vpi); // { dg-error "cast from .void\\*. is not allowed" }
+constexpr int *p4 = static_cast<int *>(vpi); // { dg-error "cast from .void\\*. is not allowed" "" { target c++23_down } }
 constexpr void *p5 = vp0;
 constexpr void *p6 = vpi;
 
 constexpr int *pi = &i;
-constexpr bool b = ((int *)(void *) pi == pi); // { dg-error "cast from .void\\*. is not allowed" }
+constexpr bool b = ((int *)(void *) pi == pi); // { dg-error "cast from .void\\*. is not allowed" "" { target c++23_down } }
diff --git a/gcc/testsuite/g++.dg/ipa/devirt-45.C 
b/gcc/testsuite/g++.dg/ipa/devirt-45.C
index c26be21964c..019b454835c 100644
--- a/gcc/testsuite/g++.dg/ipa/devirt-45.C
+++ b/gcc/testsuite/g++.dg/ipa/devirt-45.C
@@ -37,5 +37,5 @@ int main()
 }
 
 /* One invocation is A::foo () other is B::foo () even though the type is destroyed and rebuilt in test() */
-/* { dg-final { scan-ipa-dump-times "Discovered a virtual call to a known target\[^\\n\]*A::foo" 2 "inline"  } } */
+/* { dg-final { scan-ipa-dump-times "Discovered a virtual call to a known target\[^\\n\]*A::foo" 2 "inline" { target { ! implicit_constexpr } } } }*/
 /* { dg-final { 

[Bug rtl-optimization/110206] [14 Regression] wrong code with -Os -march=cascadelake since r14-1246

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206

--- Comment #19 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:eeb8e9a36d7aa9bc4ac8b0d7abe1e84e9afc4250

commit r12-9774-geeb8e9a36d7aa9bc4ac8b0d7abe1e84e9afc4250
Author: Uros Bizjak 
Date:   Fri Jul 14 11:46:22 2023 +0200

cprop: Do not set REG_EQUAL note when simplifying paradoxical subreg
[PR110206]

cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
that it equals 8 elements of HImode by setting REG_EQUAL note:

(insn 21 19 22 4 (set (reg:V4QI 98)
(mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2]) [0  S4 A32]))
"pr110206.c":12:42 1530 {*movv4qi_internal}
 (expr_list:REG_EQUAL (const_vector:V4QI [
(const_int -52 [0xffcc]) repeated x4
])
(nil)))
(insn 22 21 23 4 (set (reg:V8HI 100)
(zero_extend:V8HI (vec_select:V8QI (subreg:V16QI (reg:V4QI 98) 0)
(parallel [
(const_int 0 [0])
(const_int 1 [0x1])
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 4 [0x4])
(const_int 5 [0x5])
(const_int 6 [0x6])
(const_int 7 [0x7])
])))) "pr110206.c":12:42 7471 {sse4_1_zero_extendv8qiv8hi2}
 (expr_list:REG_EQUAL (const_vector:V8HI [
(const_int 204 [0xcc]) repeated x8
])
(expr_list:REG_DEAD (reg:V4QI 98)
(nil))))

We rely on the "undefined" vals to have a specific value (from the earlier
REG_EQUAL note) but actual code generation doesn't ensure this (it doesn't
need to).  That said, the issue isn't the constant folding per-se but that
we do not actually constant fold but register an equality that doesn't
hold.

PR target/110206

gcc/ChangeLog:

* fwprop.cc (contains_paradoxical_subreg_p): Move to ...
* rtlanal.cc (contains_paradoxical_subreg_p): ... here.
* rtlanal.h (contains_paradoxical_subreg_p): Add prototype.
* cprop.cc (try_replace_reg): Do not set REG_EQUAL note
when the original source contains a paradoxical subreg.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110206.c: New test.

(cherry picked from commit 1815e313a8fb519a77c94a908eb6dafc4ce51ffe)

[Bug c++/110344] [C++26] P2738R1 - constexpr cast from void*

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110344

--- Comment #6 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:8d344146727da02eb5c62fbf6cee97a4e96d63db

commit r14-2535-g8d344146727da02eb5c62fbf6cee97a4e96d63db
Author: Jason Merrill 
Date:   Fri Jul 14 09:37:21 2023 -0400

c++: c++26 regression fixes

Apparently I wasn't actually running the testsuite in C++26 mode like I
thought I was, so there were some failures I wasn't seeing.

The constexpr hunk fixes regressions with the P2738 implementation; we
still need to use the old handling for casting from void pointers to heap
variables.

PR c++/110344

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Move P2738 handling
after heap handling.
* name-lookup.cc (get_cxx_dialect_name): Add C++26.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-cast2.C: Adjust for P2738.
* g++.dg/ipa/devirt-45.C: Handle -fimplicit-constexpr.

gcc-12-20230714 is now available

2023-07-14 Thread GCC Administrator via Gcc
Snapshot gcc-12-20230714 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/12-20230714/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 12 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-12 revision 995c717500c368c5aec7889dfa047cff7cb0139b

You'll find:

 gcc-12-20230714.tar.xz   Complete GCC

  SHA256=8b0060164a55d0b836d3750918ececb1f75bfe17dfd1c367c76b217b16bb037c
  SHA1=935daf1376c0bc6f2ade4c5262d3359735f15ac5

Diffs from 12-20230707 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-12
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: [PATCH] c++: mangling template-id of unknown template [PR110524]

2023-07-14 Thread Jason Merrill via Gcc-patches

On 7/13/23 09:20, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk and perhaps 13?


OK for both.


-- >8 --

This fixes a crash when mangling an ADL-enabled call to a template-id
naming an unknown template (as per P0846R0).

PR c++/110524

gcc/cp/ChangeLog:

* mangle.cc (write_expression): Handle TEMPLATE_ID_EXPR
whose template is already an IDENTIFIER_NODE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/fn-template26.C: New test.
---
  gcc/cp/mangle.cc   |  3 ++-
  gcc/testsuite/g++.dg/cpp2a/fn-template26.C | 16 
  2 files changed, 18 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/fn-template26.C

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 7dab4e62bc9..bef0fda6d22 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -3312,7 +3312,8 @@ write_expression (tree expr)
else if (TREE_CODE (expr) == TEMPLATE_ID_EXPR)
  {
tree fn = TREE_OPERAND (expr, 0);
-  fn = OVL_NAME (fn);
+  if (!identifier_p (fn))
+   fn = OVL_NAME (fn);
if (IDENTIFIER_ANY_OP_P (fn))
write_string ("on");
write_unqualified_id (fn);
diff --git a/gcc/testsuite/g++.dg/cpp2a/fn-template26.C 
b/gcc/testsuite/g++.dg/cpp2a/fn-template26.C
new file mode 100644
index 000..d4a17eb9bd1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/fn-template26.C
@@ -0,0 +1,16 @@
+// PR c++/110524
+// { dg-do compile { target c++20 } }
+
+template<class T>
+auto f(T t) -> decltype(g<T>(t));
+
+namespace N {
+  struct A { };
+  template<class T> void g(T);
+};
+
+int main() {
+  f(N::A{});
+}
+
+// { dg-final { scan-assembler "_Z1fIN1N1AEEDTcl1gIT_Efp_EES2_" } }




Re: [PATCH] c++: copy elision of object arg in static memfn call [PR110441]

2023-07-14 Thread Jason Merrill via Gcc-patches

On 7/13/23 14:49, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


-- >8 --

Here the call A().f() is represented as a COMPOUND_EXPR whose first
operand is the otherwise unused object argument A() and second operand
is the call result (both are TARGET_EXPRs).  Within the return statement,
this outermost COMPOUND_EXPR ends up foiling the copy elision check in
build_special_member_call, resulting in us introducing a bogus call to the
deleted move constructor.  (Within the variable initialization, which goes
through ocp_convert instead of convert_for_initialization, we've already
been eliding the copy despite the outermost COMPOUND_EXPR ever since
r10-7410-g72809d6fe8e085 made ocp_convert look through COMPOUND_EXPR).

In contrast, I noticed '(A(), A::f())' (which should be equivalent to
the above call) is represented with the COMPOUND_EXPR inside the RHS's
TARGET_EXPR initializer thanks to a special case in cp_build_compound_expr
thus avoiding the issue.

So this patch fixes this by making keep_unused_object_arg
use cp_build_compound_expr as well.

PR c++/110441

gcc/cp/ChangeLog:

* call.cc (keep_unused_object_arg): Use cp_build_compound_expr
instead of building a COMPOUND_EXPR directly.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/elide8.C: New test.
---
  gcc/cp/call.cc  |  2 +-
  gcc/testsuite/g++.dg/cpp1z/elide8.C | 25 +
  2 files changed, 26 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/elide8.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 119063979fa..b0a69cb46d4 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -5218,7 +5218,7 @@ keep_unused_object_arg (tree result, tree obj, tree fn)
if (TREE_THIS_VOLATILE (a))
  a = build_this (a);
if (TREE_SIDE_EFFECTS (a))
-return build2 (COMPOUND_EXPR, TREE_TYPE (result), a, result);
+return cp_build_compound_expr (a, result, tf_warning_or_error);
return result;
  }
  
diff --git a/gcc/testsuite/g++.dg/cpp1z/elide8.C b/gcc/testsuite/g++.dg/cpp1z/elide8.C

new file mode 100644
index 000..7d471be8a2a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/elide8.C
@@ -0,0 +1,25 @@
+// PR c++/110441
+// { dg-do compile { target c++11 } }
+
+struct immovable {
+  immovable(immovable &&) = delete;
+};
+
+struct A {
+  static immovable f();
+};
+
+immovable f() {
+  immovable m = A().f(); // { dg-error "deleted" "" { target c++14_down } }
+  return A().f(); // { dg-error "deleted" "" { target c++14_down } }
+}
+
+struct B {
+  A* operator->();
+};
+
+immovable g() {
+  B b;
+  immovable m = b->f(); // { dg-error "deleted" "" { target c++14_down } }
+  return b->f(); // { dg-error "deleted" "" { target c++14_down } }
+}




Re: [PATCH] c++: redundant targ coercion for var/alias tmpls

2023-07-14 Thread Jason Merrill via Gcc-patches

On 7/14/23 14:07, Patrick Palka wrote:

On Thu, 13 Jul 2023, Jason Merrill wrote:


On 7/13/23 11:48, Patrick Palka wrote:

On Wed, 28 Jun 2023, Patrick Palka wrote:


On Wed, Jun 28, 2023 at 11:50 AM Jason Merrill  wrote:


On 6/23/23 12:23, Patrick Palka wrote:

On Fri, 23 Jun 2023, Jason Merrill wrote:


On 6/21/23 13:19, Patrick Palka wrote:

When stepping through the variable/alias template specialization
code
paths, I noticed we perform template argument coercion twice:
first from
instantiate_alias_template / finish_template_variable and again
from
tsubst_decl (during instantiate_template).  It should suffice to
perform
coercion once.

To that end patch elides this second coercion from tsubst_decl
when
possible.  We can't get rid of it completely because we don't
always
specialize a variable template from finish_template_variable: we
could
also be doing so directly from instantiate_template during
variable
template partial specialization selection, in which case the
coercion
from tsubst_decl would be the first and only coercion.


Perhaps we should be coercing in lookup_template_variable rather
than
finish_template_variable?


Ah yes, there's a patch for that at
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617377.html :)


So after that patch, can we get rid of the second coercion completely?


On second thought it should be possible to get rid of it, if we
rearrange things to always pass the primary arguments to tsubst_decl,
and perform partial specialization selection from there instead of
instantiate_template.  Let me try...


Like so?  Bootstrapped and regtested on x86_64-pc-linux-gnu.

-- >8 --

When stepping through the variable/alias template specialization code
paths, I noticed we perform template argument coercion twice: first from
instantiate_alias_template / finish_template_variable and again from
tsubst_decl (during instantiate_template).  It'd be good to avoid this
redundant coercion.

It turns out that this coercion could be safely elided whenever
specializing a primary variable/alias template, because we can rely on
lookup_template_variable and instantiate_alias_template to already have
coerced the arguments.

The other situation to consider is when fully specializing a partial
variable template specialization (from instantiate_template), in which
case the passed 'args' are the (already coerced) arguments relative to
the partial template and 'argvec', the result of substitution into
DECL_TI_ARGS, are the (uncoerced) arguments relative to the primary
template, so coercion is still necessary.  We can still avoid this
coercion however if we always pass the primary variable template to
tsubst_decl from instantiate_template, and instead perform partial
specialization selection directly from tsubst_decl.  This patch
implements this approach.
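
To make the terms concrete, here is a small sketch of what "coercion"
and "partial specialization selection" mean for a variable template.  The
template `var` and the helper `pointer_case` are hypothetical and are not
taken from the patch or its tests.

```cpp
// Primary variable template with a defaulted non-type parameter.
template<class T, long N = 4>
constexpr long var = N;

// Partial specialization, selected for pointer types when N is 4.
template<class T>
constexpr long var<T *, 4> = -1;

// Coercion converts the int argument 5 to the parameter type long.
static_assert(var<int, 5> == 5, "");
// Coercion fills in the default argument, giving var<int, 4>.
static_assert(var<int> == 4, "");
// After coercion to <int *, 4>, the partial specialization matches.
static_assert(var<int *> == -1, "");

// Run-time view of the last case.
long pointer_case() { return var<int *>; }
```

In the last case the written arguments are <int *>, the coerced arguments
relative to the primary template are <int *, 4>, and the arguments
relative to the partial specialization are just <int>, which is the
distinction between 'argvec' and 'args' drawn above.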


The relationship between instantiate_template and tsubst_decl is pretty
tangled.  We use the former to substitute (often deduced) template arguments
into a template, and the latter to substitute template arguments into a use of
a template...and also to implement the former.

For substitution of uses of a template, we expect to need to coerce the
arguments after substitution.  But we avoid this issue for variable templates
by keeping them as TEMPLATE_ID_EXPR until substitution time, so if we see a
VAR_DECL in tsubst_decl it's either a non-template variable or under
instantiate_template.


FWIW it seems we could also be in tsubst_decl for a VAR_DECL if

   * we're partially instantiating a class-scope variable template
 during instantiation of the class


Hmm, why don't partial instantiations stay as TEMPLATE_ID_EXPR?


   * we're substituting a use of an already non-dependent variable
 template specialization


Sure.


So it seems like the current coercion for variable templates is only needed in
this case to support the redundant hash table lookup that we just did in
instantiate_template.  Perhaps instead of doing coercion here or moving the
partial spec lookup, we could skip the hash table lookup for the case of a
variable template?


It seems we'd then also have to make instantiate_template responsible
for registering the variable template specialization since tsubst_decl
no longer necessarily has the arguments relative to the primary template
('args' could be relative to the partial template).

Like so?  The following makes us perform all the specialization table
manipulation in instantiate_template instead of tsubst_decl for variable
template specializations.


Looks good.


I wonder if we might want to do this for alias template specializations too?


That would make sense.


@@ -15222,20 +15230,21 @@ tsubst_decl (tree t, tree args, tsubst_flags_t 
complain)
  {
tmpl = DECL_TI_TEMPLATE (t);
gen_tmpl = most_general_template (tmpl);
-   argvec = tsubst (DECL_TI_ARGS (t), args, complain, in_decl);
-   if (argvec != error_mark_node
-   && 

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-14 Thread Nathan Sidwell via Gcc-patches

On 7/14/23 11:56, Richard Sandiford wrote:

Summary: We'd like to be able to specify some attributes using
keywords, rather than the traditional __attribute__ or [[...]]
syntax.  Would that be OK?

In more detail:

We'd like to add some new target-specific attributes for Arm SME.
These attributes affect semantics and code generation and so they
can't simply be ignored.

Traditionally we've done this kind of thing by adding GNU attributes,
via TARGET_ATTRIBUTE_TABLE in GCC's case.  The problem is that both
GCC and Clang have traditionally only warned about unrecognised GNU
attributes, rather than raising an error.  Older compilers might
therefore be able to look past some uses of the new attributes and
still produce object code, even though that object code is almost
certainly going to be wrong.  (The compilers will also emit a default-on
warning, but that might go unnoticed when building a big project.)

There are some existing attributes that similarly affect semantics
in ways that cannot be ignored.  vector_size is one obvious example.
But that doesn't make it a good thing. :)

Also, C++ says this for standard [[...]] attributes:

   For an attribute-token (including an attribute-scoped-token)
   not specified in this document, the behavior is implementation-defined;
   any such attribute-token that is not recognized by the implementation
   is ignored.

which doubles down on the idea that attributes should not be used
for necessary semantic information.


There's been quite a bit of discussion about the practicalities of that.  As you 
say, there are existing, std-specified attributes, [[no_unique_address]] for 
instance, that affect user-visible object layout if ignored.
Further, my understanding is that implementation-specific attributes are 
permitted to affect program semantics -- they're implementation extensions.
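
To make the layout point concrete, here is a minimal sketch (not from the thread) comparing a struct that uses [[no_unique_address]] with one that does not. The struct names and the SKETCH_HAS_NUA macro are made up for illustration, and the exact sizes depend on the ABI, so the sketch only relies on the attributed struct being no larger than the plain one.

```cpp
// Illustrative only: Empty, Plain, Packed and SKETCH_HAS_NUA are
// hypothetical names, not anything defined by GCC or the standard.
struct Empty {};

// Without the attribute the empty member normally occupies storage.
struct Plain {
  Empty e;
  int value;
};

// Only use the attribute when the compiler reports that it knows it.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(no_unique_address)
#    define SKETCH_HAS_NUA 1
#  endif
#endif

#ifdef SKETCH_HAS_NUA
// With the attribute the empty member may share storage with `value`.
struct Packed {
  [[no_unique_address]] Empty e;
  int value;
};
#else
// A compiler that silently ignored the attribute would produce this layout.
struct Packed {
  Empty e;
  int value;
};
#endif
```

On a typical Itanium-ABI target one would expect sizeof(Packed) to be smaller than sizeof(Plain), which is exactly the kind of user-visible difference that arises if one compiler honours the attribute and another ignores it.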


IMHO, attributes are the accepted mechanism for what you're describing. 
Compilers already have a way of dealing with them -- both parsing and, in 
general, representing them.  I would be wary of inventing a different mechanism.


Have you approached C or C++ std bodies for input?



One of the attributes we'd like to add provides a new way of compiling
existing code.  The attribute doesn't require SME to be available;
it just says that the code must be compiled so that it can run in either
of two modes.  This is probably the most dangerous attribute of the set,
since compilers that ignore it would just produce normal code.  That
code might work in some test scenarios, but it would fail in others.

The feeling from the Clang community was therefore that these SME
attributes should use keywords instead, so that the keywords trigger
an error with older compilers.

However, it seemed wrong to define new SME-specific grammar rules,
since the underlying problem is pretty generic.  We therefore
proposed having a type of keyword that can appear exactly where
a standard [[...]] attribute can appear and that appertains to
exactly what a standard [[...]] attribute would appertain to.
No divergence or cherry-picking is allowed.

For example:

   [[arm::foo]]

would become:

   __arm_foo

and:

   [[arm::bar(args)]]

would become:

   __arm_bar(args)

It wouldn't be possible to retrofit arguments to a keyword that
previously didn't take arguments, since that could lead to parsing
ambiguities.  So when a keyword is first added, a binding decision
would need to be made whether the keyword always takes arguments
or is always standalone.

For that reason, empty argument lists are allowed for keywords,
even though they're not allowed for [[...]] attributes.

The argument-less version was accepted into Clang, and I have a follow-on
patch for handling arguments.  Would the same thing be OK for GCC,
in both the C and C++ frontends?

The patch below is a proof of concept for the C frontend.  It doesn't
bootstrap due to warnings about uninitialised fields.  And it doesn't
have tests.  But I did test it locally with various combinations of
attribute_spec and it seemed to work as expected.

The impact on the C frontend seems to be pretty small.  It looks like
the impact on the C++ frontend would be a bit bigger, but not much.

The patch contains a logically unrelated change: c-common.h set aside
16 keywords for address spaces, but of the in-tree ports, the maximum
number of keywords used is 6 (for amdgcn).  The patch therefore changes
the limit to 8 and uses 8 keywords for the new attributes.  This keeps
the number of reserved ids <= 256.

A real, non-proof-of-concept patch series would:

- Change the address-space keywords separately, and deal with any fallout.

- Clean up the way that attributes are specified, so that it isn't
   necessary to update all definitions when adding a new field.

- Allow more precise attribute requirements, such as "function decl only".

- Add tests :)

WDYT?  Does this approach look OK in principle, or is it a non-starter?

If it is a non-starter, the fallback would be to 

[Bug middle-end/87944] Wrong code with LRA pushing stack local variable

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87944

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Andrew Pinski  ---
(In reply to Mikael Pettersson from comment #5)
> Appears to have been fixed for gcc-14.0 by
> 
> 30038a207c10a2783fa2695b62c7c8458ef05e73 is the first new commit
> commit 30038a207c10a2783fa2695b62c7c8458ef05e73
> Author: Vladimir N. Makarov 
> Date:   Tue May 30 15:54:28 2023 -0400
> 
> LRA: Update insn sp offset if its input reload changes SP
> 
> which mentions a problem switching the h8300 to lra, but there's no PR
> reference in either the commit message or the mailing-list post.

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620128.html

This is definitely the fix here. 
In this case we had:
(insn 12 10 13 2 (set (mem/f:HI (pre_dec:HI (reg/f:HI 6 sp)) [1  S2 A16])
(reg/f:HI 24)) "../../testarg.c":9:12 25 {movhi}
 (expr_list:REG_DEAD (reg/f:HI 24)
(expr_list:REG_ARGS_SIZE (const_int 2 [0x2])
(nil

This would then be incorrectly transformed into what is described in this bug
report, which is in fact the same problem as the one described in the email.

Re: [PATCH V4] Optimize '(X - N * M) / N' to 'X / N - M' if valid

2023-07-14 Thread Andrew MacLeod via Gcc-patches



On 7/14/23 09:37, Richard Biener wrote:

On Fri, 14 Jul 2023, Aldy Hernandez wrote:


I don't know what you're trying to accomplish here, as I haven't been
following the PR, but adding all these helper functions to the ranger header
file seems wrong, especially since there's only one use of them. I see you're
tweaking the irange API, adding helper functions to range-op (which is only
for code dealing with implementing range operators for tree codes), etc etc.

If you need these helper functions, I suggest you put them closer to their
uses (i.e. wherever the match.pd support machinery goes).

Note I suggested the opposite because I thought these kinds of helpers
are closer to value-range support than to match.pd.



Probably vr-values.{cc,h} and the simplify_using_ranges paradigm would be 
the most sensible place to put these kinds of auxiliary routines?





But I take away from your answer that there's nothing close in the
value-range machinery that answers the question whether A op B may
overflow?


We don't track it in ranges themselves.  During calculation of a range 
we obviously know, but propagating that generally when we rarely care 
doesn't seem worthwhile.  The very first generation of irange 6 years 
ago had an overflow_p() flag, but it was removed as not being worth 
keeping.  It is easier to simply ask the question when it matters.


As the routines show, it is pretty easy to figure out when the need arises, 
so I think that should suffice.  At least for now.


Should we decide we would like it in general, it wouldn't be hard to add 
to irange.  wi_fold() currently returns void; it could easily return a 
bool indicating if an overflow happened, and wi_fold_in_parts and 
fold_range would simply OR together the results of all the component 
wi_fold() calls.  It would require updating/auditing a number of 
range-op entries and adding an overflowed_p() query to irange.
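
As a rough illustration of "asking the question when it matters", the sketch below checks whether A + B or A * B may overflow given only the bounds of A and B. It is not the irange or range-op API: the Range struct and the helper names are hypothetical, and 32-bit signed operands are assumed so that the bounds can be evaluated exactly in 64-bit arithmetic. The per-corner results are ORed together, similar in spirit to the wi_fold() idea described above.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a value range: inclusive bounds with lo <= hi.
struct Range {
  std::int32_t lo, hi;
};

// True if v is representable as a 32-bit signed integer.
constexpr bool fits_int32(std::int64_t v) {
  return v >= std::numeric_limits<std::int32_t>::min()
      && v <= std::numeric_limits<std::int32_t>::max();
}

// a + b is monotone in both operands, so only lo+lo and hi+hi can be extreme.
constexpr bool add_may_overflow(Range a, Range b) {
  return !fits_int32(std::int64_t{a.lo} + b.lo)
      || !fits_int32(std::int64_t{a.hi} + b.hi);
}

// a * b takes its extremes at the four corners of the box; OR the checks.
constexpr bool mul_may_overflow(Range a, Range b) {
  return !fits_int32(std::int64_t{a.lo} * b.lo)
      || !fits_int32(std::int64_t{a.lo} * b.hi)
      || !fits_int32(std::int64_t{a.hi} * b.lo)
      || !fits_int32(std::int64_t{a.hi} * b.hi);
}
```

For example, add_may_overflow(Range{0, 10}, Range{0, 10}) is false, while a range reaching INT32_MAX added to Range{1, 1} reports a possible overflow.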


Andrew



[Bug tree-optimization/110666] [14 Regression] wrong code at -O1 and above on x86_64-linux-gnu

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110666

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2023-July/62
   ||4551.html

--- Comment #4 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624551.html

[PATCH] Fix PR 110666: `(a != 2) == a` produces wrong code

2023-07-14 Thread Andrew Pinski via Gcc-patches
I had messed up the case where the outer operator is `==`.
The check for the result should have been `==` and not `!=`.
This patch fixes that and adds a full runtime testcase for
all cases to make sure it works.

OK? Bootstrapped and tested on x86-64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/110666
* match.pd (A NEEQ (A NEEQ CST)): Fix Outer EQ case.

gcc/testsuite/ChangeLog:

PR tree-optimization/110666
* gcc.c-torture/execute/pr110666-1.c: New test.
---
 gcc/match.pd  | 34 -
 .../gcc.c-torture/execute/pr110666-1.c| 51 +++
 2 files changed, 71 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110666-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 351d9285e92..88061fa4a6f 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6431,8 +6431,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* x != (typeof x)(x == CST) -> CST == 0 ? 1 : (CST == 1 ? (x!=0&!=1) : x != 0) */
 /* x != (typeof x)(x != CST) -> CST == 1 ? 1 : (CST == 0 ? (x!=0&!=1) : x != 1) */
-/* x == (typeof x)(x == CST) -> CST == 0 ? 0 : (CST == 1 ? (x==0||x==1) : x != 0) */
-/* x == (typeof x)(x != CST) -> CST == 1 ? 0 : (CST == 0 ? (x==0||x==1) : x != 1) */
+/* x == (typeof x)(x == CST) -> CST == 0 ? 0 : (CST == 1 ? (x==0||x==1) : x == 0) */
+/* x == (typeof x)(x != CST) -> CST == 1 ? 0 : (CST == 0 ? (x==0||x==1) : x == 1) */
 (for outer (ne eq)
  (for inner (ne eq)
   (simplify
@@ -6443,23 +6443,29 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  bool innereq = inner == EQ_EXPR;
  bool outereq = outer == EQ_EXPR;
 }
-   (switch
-(if (innereq ? cst0 : cst1)
- { constant_boolean_node (!outereq, type); })
-(if (innereq ? cst1 : cst0)
+(switch
+ (if (innereq ? cst0 : cst1)
+  { constant_boolean_node (!outereq, type); })
+ (if (innereq ? cst1 : cst0)
+  (with {
+tree utype = unsigned_type_for (TREE_TYPE (@0));
+tree ucst1 = build_one_cst (utype);
+   }
+   (if (!outereq)
+(gt (convert:utype @0) { ucst1; })
+(le (convert:utype @0) { ucst1; })
+   )
+  )
+ )
  (with {
-   tree utype = unsigned_type_for (TREE_TYPE (@0));
-   tree ucst1 = build_one_cst (utype);
+   tree value = build_int_cst (TREE_TYPE (@0), !innereq);
   }
-  (if (!outereq)
-   (gt (convert:utype @0) { ucst1; })
-   (le (convert:utype @0) { ucst1; })
+  (if (outereq)
+   (eq @0 { value; })
+   (ne @0 { value; })
   )
  )
 )
-(if (innereq)
- (ne @0 { build_zero_cst (TREE_TYPE (@0)); }))
-(ne @0 { build_one_cst (TREE_TYPE (@0)); }))
)
   )
  )
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110666-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr110666-1.c
new file mode 100644
index 000..b22eb7781da
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110666-1.c
@@ -0,0 +1,51 @@
+
+#define func_name(outer,inner,cst) outer##inner##_##cst
+#define func_name_v(outer,inner,cst) outer##inner##_##cst##_v
+
+#define func_decl(outer,inner,cst) \
+int outer##inner##_##cst (int) __attribute__((noipa)); \
+int outer##inner##_##cst (int a) { \
+  return (a op_##inner cst) op_##outer a; \
+} \
+int outer##inner##_##cst##_v (int) __attribute__((noipa)); \
+int outer##inner##_##cst##_v (volatile int a) { \
+  return (a op_##inner cst) op_##outer a; \
+}
+
+#define functions_n(outer, inner) \
+func_decl(outer,inner,0) \
+func_decl(outer,inner,1) \
+func_decl(outer,inner,2)
+
+#define functions() \
+functions_n(eq,eq) \
+functions_n(eq,ne) \
+functions_n(ne,eq) \
+functions_n(ne,ne)
+
+#define op_ne !=
+#define op_eq ==
+
+#define test(inner,outer,cst,arg) \
+func_name_v (inner,outer,cst)(arg) != func_name(inner,outer,cst)(arg)
+
+functions()
+
+#define tests_n(inner,outer,arg) \
+if (test(inner,outer,0,arg)) __builtin_abort(); \
+if (test(inner,outer,1,arg)) __builtin_abort(); \
+if (test(inner,outer,2,arg)) __builtin_abort();
+
+#define tests(arg) \
+tests_n(eq,eq,arg) \
+tests_n(eq,ne,arg) \
+tests_n(ne,eq,arg) \
+tests_n(ne,ne,arg)
+
+
+int main()
+{
+  for(int n = -1; n <= 2; n++) {
+tests(n)
+  }
+}
-- 
2.31.1
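
The corrected simplification table in the match.pd comments can be checked exhaustively over a small range. The following is only a sketch in plain C++ rather than match.pd syntax; Op, apply, original, fold and agrees_everywhere are hypothetical names, and fold mirrors the structure of the patched pattern (constant result, unsigned compare against 1, or compare against !innereq).

```cpp
// Illustrative check only; none of these helpers are GCC code.
enum class Op { EQ, NE };

constexpr bool apply(Op op, int lhs, int rhs) {
  return op == Op::EQ ? lhs == rhs : lhs != rhs;
}

// Unsimplified source form: x OUTER (int)(x INNER cst).
constexpr bool original(Op outer, Op inner, int x, int cst) {
  return apply(outer, x, apply(inner, x, cst) ? 1 : 0);
}

// Simplified form, following the corrected comments in the patch.
constexpr bool fold(Op outer, Op inner, int x, int cst) {
  const bool innereq = inner == Op::EQ;
  const bool outereq = outer == Op::EQ;
  // Here the inner result can never equal x, whatever x is.
  if (innereq ? cst == 0 : cst == 1)
    return !outereq;
  // Here the inner result equals x exactly when x is 0 or 1,
  // which is the same as (unsigned) x <= 1.
  if (innereq ? cst == 1 : cst == 0)
    return outereq ? static_cast<unsigned>(x) <= 1u
                   : static_cast<unsigned>(x) > 1u;
  // Otherwise compare x against !innereq, as the patch does.
  const int value = innereq ? 0 : 1;
  return outereq ? x == value : x != value;
}

// Compare both forms for all operator pairs and a few constants and inputs.
constexpr bool agrees_everywhere() {
  const Op ops[] = {Op::EQ, Op::NE};
  for (Op outer : ops)
    for (Op inner : ops)
      for (int cst = -1; cst <= 3; ++cst)
        for (int x = -3; x <= 3; ++x)
          if (original(outer, inner, x, cst) != fold(outer, inner, x, cst))
            return false;
  return true;
}
```

With the pre-patch rule (`x != value` for the outer `==` case) the case `(d != 4) == d` with d = 1 from the bug report would come out false instead of true, which matches the wrong code reported in the PR.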



Re: [PATCH][RFC] tree-optimization/88540 - FP x > y ? x : y if-conversion without -ffast-math

2023-07-14 Thread Andrew Pinski via Gcc-patches
On Thu, Jul 13, 2023 at 2:54 AM Richard Biener via Gcc-patches
 wrote:
>
> The following makes sure that FP x > y ? x : y style max/min operations
> are if-converted at the GIMPLE level.  While we can neither match
> it to MAX_EXPR nor .FMAX as both have different semantics with IEEE
> than the ternary ?: operation we can make sure to maintain this form
> as a COND_EXPR so backends have the chance to match this to instructions
> their ISA offers.
>
> The patch does this in phiopt where we recognize min/max and instead
> of giving up when we have to honor NaNs we alter the generated code
> to a COND_EXPR.
>
> This resolves PR88540 and we can then SLP vectorize the min operation
> for its testcase.  It also resolves part of the regressions observed
> with the change matching bit-inserts of bit-field-refs to vec_perm.
>
> Expansion from a COND_EXPR rather than from compare-and-branch
> regresses gcc.target/i386/pr54855-13.c and gcc.target/i386/pr54855-9.c
> by producing extra moves while the corresponding min/max operations
> are now already synthesized by RTL expansion, register selection
> isn't optimal.  This can be also provoked without this change by
> altering the operand order in the source.
>
> It regresses gcc.target/i386/pr110170.c where we end up CSEing the
> condition which makes RTL expansion no longer produce the min/max
> directly and code generation is obfuscated enough to confuse
> RTL if-conversion.
>
> It also regresses gcc.target/i386/ssefp-[12].c where oddly one
> variant isn't if-converted and ix86_expand_fp_movcc doesn't
> match directly (the FP constants get expanded twice).  A fix
> could be in emit_conditional_move where both prepare_cmp_insn
> and emit_conditional_move_1 force the constants to (different)
> registers.
>
> Otherwise bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> PR tree-optimization/88540
> * tree-ssa-phiopt.cc (minmax_replacement): Do not give up
> with NaNs but handle the simple case by if-converting to a
> COND_EXPR.

One thing which I was thinking about adding to phiopt is having the
last pass do the conversion to COND_EXPR if the target supports a
conditional move for that expression. That should fix this one right?
This was one of things I was working towards with the moving to use
match-and-simplify too.

Thanks,
Andrew

>
> * gcc.target/i386/pr88540.c: New testcase.
> * gcc.target/i386/pr54855-12.c: Adjust.
> * gcc.target/i386/pr54855-13.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/i386/pr54855-12.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr54855-13.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr88540.c| 10 ++
>  gcc/tree-ssa-phiopt.cc | 21 -
>  4 files changed, 28 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88540.c
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr54855-12.c 
> b/gcc/testsuite/gcc.target/i386/pr54855-12.c
> index 2f8af392c83..09e8ab8ae39 100644
> --- a/gcc/testsuite/gcc.target/i386/pr54855-12.c
> +++ b/gcc/testsuite/gcc.target/i386/pr54855-12.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -mavx512fp16" } */
> -/* { dg-final { scan-assembler-times "vmaxsh\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-times "vm\[ai\]\[nx\]sh\[ \\t\]" 1 } } */
>  /* { dg-final { scan-assembler-not "vcomish\[ \\t\]" } } */
>  /* { dg-final { scan-assembler-not "vmovsh\[ \\t\]" { target { ! ia32 } } } 
> } */
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr54855-13.c 
> b/gcc/testsuite/gcc.target/i386/pr54855-13.c
> index 87b4f459a5a..a4f25066f81 100644
> --- a/gcc/testsuite/gcc.target/i386/pr54855-13.c
> +++ b/gcc/testsuite/gcc.target/i386/pr54855-13.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -mavx512fp16" } */
> -/* { dg-final { scan-assembler-times "vmaxsh\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-times "vm\[ai\]\[nx\]sh\[ \\t\]" 1 } } */
>  /* { dg-final { scan-assembler-not "vcomish\[ \\t\]" } } */
>  /* { dg-final { scan-assembler-not "vmovsh\[ \\t\]" { target { ! ia32 } } } 
> } */
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr88540.c 
> b/gcc/testsuite/gcc.target/i386/pr88540.c
> new file mode 100644
> index 000..b927d0c57d5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88540.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2" } */
> +
> +void test(double* __restrict d1, double* __restrict d2, double* __restrict 
> d3)
> +{
> +  for (int n = 0; n < 2; ++n)
> +d3[n] = d1[n] < d2[n] ? d1[n] : d2[n];
> +}
> +
> +/* { dg-final { scan-assembler "minpd" } } */
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index 467c9fd108a..13ee486831d 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -1580,10 +1580,6 @@ minmax_replacement (basic_block cond_bb, basic_block 
> middle_bb, basic_block alt_
>
>tree type = TREE_TYPE (PHI_RESULT (phi));
>
> -  

[Bug ipa/93385] [11 Regression] wrong code with u128 modulo at -O2 -fno-dce -fno-ipa-cp -fno-tree-dce

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385

Andrew Pinski  changed:

   What|Removed |Added

 CC||19373742 at buaa dot edu.cn

--- Comment #51 from Andrew Pinski  ---
*** Bug 110662 has been marked as a duplicate of this bug. ***

[Bug ipa/110662] [11 Regression] Segmentation fault with '-O3'

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110662

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #4 from Andrew Pinski  ---
Dup of bug 93385.

*** This bug has been marked as a duplicate of bug 93385 ***

[Bug ipa/110662] [11 Regression] Segmentation fault with '-O3'

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110662

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.5
Summary|Segmentation fault with |[11 Regression]
   |'-O3'   |Segmentation fault with
   ||'-O3'

[Bug rtl-optimization/110206] [14 Regression] wrong code with -Os -march=cascadelake since r14-1246

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206

--- Comment #18 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:bef95ba085b0ae9bf3eb79a8eed685236d773116

commit r13-7565-gbef95ba085b0ae9bf3eb79a8eed685236d773116
Author: Uros Bizjak 
Date:   Fri Jul 14 11:46:22 2023 +0200

cprop: Do not set REG_EQUAL note when simplifying paradoxical subreg
[PR110206]

cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
that it equals 8 elements of HImode by setting REG_EQUAL note:

(insn 21 19 22 4 (set (reg:V4QI 98)
(mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2]) [0  S4 A32]))
"pr110206.c":12:42 1530 {*movv4qi_internal}
 (expr_list:REG_EQUAL (const_vector:V4QI [
(const_int -52 [0xffcc]) repeated x4
])
(nil)))
(insn 22 21 23 4 (set (reg:V8HI 100)
(zero_extend:V8HI (vec_select:V8QI (subreg:V16QI (reg:V4QI 98) 0)
(parallel [
(const_int 0 [0])
(const_int 1 [0x1])
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 4 [0x4])
(const_int 5 [0x5])
(const_int 6 [0x6])
(const_int 7 [0x7])
] "pr110206.c":12:42 7471
{sse4_1_zero_extendv8qiv8hi2}
 (expr_list:REG_EQUAL (const_vector:V8HI [
(const_int 204 [0xcc]) repeated x8
])
(expr_list:REG_DEAD (reg:V4QI 98)
(nil

We rely on the "undefined" vals to have a specific value (from the earlier
REG_EQUAL note) but actual code generation doesn't ensure this (it doesn't
need to).  That said, the issue isn't the constant folding per-se but that
we do not actually constant fold but register an equality that doesn't
hold.
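
A toy model of the problem, not GCC code: only four byte lanes of the V4QI register are defined, the paradoxical subreg exposes further lanes with unspecified contents, and the zero-extension reads eight of them. The V8HI struct and widen_low8 below are hypothetical, and the `garbage` parameter stands for whatever happens to be in the undefined lanes.

```cpp
#include <cstdint>

// Hypothetical model of the V8HI result: eight 16-bit lanes.
struct V8HI {
  std::uint16_t lane[8];
};

// Lanes 0..3 come from the defined V4QI value; lanes 4..7 come from the
// part of the paradoxical subreg whose contents are unspecified.
constexpr V8HI widen_low8(std::int8_t defined, std::uint8_t garbage) {
  V8HI r{};
  for (int i = 0; i < 8; ++i) {
    const std::uint8_t byte =
        i < 4 ? static_cast<std::uint8_t>(defined) : garbage;
    r.lane[i] = byte;  // zero-extend QImode to HImode
  }
  return r;
}
```

Zero-extending the byte -52 gives 204 (0xcc), so lanes 0 to 3 match the REG_EQUAL note, but lanes 4 to 7 are 204 only if the undefined bytes happen to hold 0xcc. That is why claiming "204 repeated x8" registers an equality that does not hold.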

PR target/110206

gcc/ChangeLog:

* fwprop.cc (contains_paradoxical_subreg_p): Move to ...
* rtlanal.cc (contains_paradoxical_subreg_p): ... here.
* rtlanal.h (contains_paradoxical_subreg_p): Add prototype.
* cprop.cc (try_replace_reg): Do not set REG_EQUAL note
when the original source contains a paradoxical subreg.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110206.c: New test.

(cherry picked from commit 1815e313a8fb519a77c94a908eb6dafc4ce51ffe)

[Bug target/59172] pdp11-aout makes a wrong code at the epilogue

2023-07-14 Thread mikpelinux at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59172

Mikael Pettersson  changed:

   What|Removed |Added

 CC||mikpelinux at gmail dot com

--- Comment #3 from Mikael Pettersson  ---
For completeness, this was fixed for gcc-9.1.0 by

commit 442fcea74d0c7797fc083fa7e5543268c0ff54a6
Author: Paul Koning 
Date:   Thu Nov 8 13:56:58 2018 -0500

The commit message mentions PR c/87795 but that seems like an error; the
mailing-list message just mentions it being a collection of fixes.

Re: [PATCH v2 1/2] c++, libstdc++: implement __is_pointer built-in trait

2023-07-14 Thread Ken Matsui via Gcc-patches
On Fri, Jul 14, 2023 at 3:49 AM Jonathan Wakely  wrote:
>
> On Fri, 14 Jul 2023 at 11:48, Jonathan Wakely  wrote:
> >
> > On Thu, 13 Jul 2023 at 21:04, Ken Matsui  wrote:
> > >
> > > On Thu, Jul 13, 2023 at 2:22 AM Jonathan Wakely  
> > > wrote:
> > > >
> > > > On Wed, 12 Jul 2023 at 21:42, Ken Matsui  
> > > > wrote:
> > > > >
> > > > > On Wed, Jul 12, 2023 at 3:01 AM Jonathan Wakely  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, 10 Jul 2023 at 06:51, Ken Matsui via Libstdc++
> > > > > >  wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Here is the benchmark result for is_pointer:
> > > > > > >
> > > > > > > https://github.com/ken-matsui/gcc-benches/blob/main/is_pointer.md#sun-jul--9-103948-pm-pdt-2023
> > > > > > >
> > > > > > > Time: -62.1344%
> > > > > > > Peak Memory Usage: -52.4281%
> > > > > > > Total Memory Usage: -53.5889%
> > > > > >
> > > > > > Wow!
> > > > > >
> > > > > > Although maybe we could have improved our std::is_pointer_v anyway, 
> > > > > > like so:
> > > > > >
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v = false;
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v<_Tp*> = true;
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v<_Tp* const> = true;
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v<_Tp* volatile> = true;
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
> > > > > >
> > > > > > I'm not sure why I didn't already do that.
> > > > > >
> > > > > > Could you please benchmark that? And if it is better than the 
> > > > > > current
> > > > > > impl using is_pointer<_Tp>::value then we should do this in the
> > > > > > library:
> > > > > >
> > > > > > > #if __has_builtin(__is_pointer)
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v = __is_pointer(_Tp);
> > > > > > > #else
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v = false;
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v<_Tp*> = true;
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v<_Tp* const> = true;
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v<_Tp* volatile> = true;
> > > > > > > template <typename _Tp>
> > > > > > >   inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
> > > > > > > #endif
> > > > >
> > > > > Hi François and Jonathan,
> > > > >
> > > > > Thank you for your reviews! I will rename the four underscores to the
> > > > > appropriate name and take a benchmark once I get home.
> > > > >
> > > > > If I apply your change on is_pointer_v, is it better to add the
> > > > > `Co-authored-by:` line in the commit?
> > > >
> > > > Yes, that would be the correct thing to do (although in this case the
> > > > change is small enough that I don't really care about getting credit
> > > > for it :-)
> > > >
> > > Thank you! I will include it in my commit :) I see that you included
> > > the DCO sign-off in the MAINTAINERS file. However, if a reviewer
> > > doesn't, should I include the `Signed-off-by:` line for the reviewer
> > > as well?
> >
> > No, reviewers should not sign-off, that's for the code author. And
> > authors should add that themselves (or clearly state that they agree
> > to the DCO terms). You should not sign-off on someone else's behalf.
>
> You can add Reviewed-by: if you want to record that information.
>
I see. Thank you!
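
For reference, the partial-specialization approach discussed in the quoted thread can be tried outside the library as follows. This is a sketch in a made-up `sketch` namespace, not the libstdc++ implementation; the quoted code uses `inline constexpr` (C++17), and `inline` is dropped here so the snippet also builds as C++14.

```cpp
namespace sketch {

// Primary template: by default a type is not a pointer.
template <typename T>
constexpr bool is_pointer_v = false;

// One partial specialization per cv-qualification of the pointer itself.
template <typename T>
constexpr bool is_pointer_v<T*> = true;
template <typename T>
constexpr bool is_pointer_v<T* const> = true;
template <typename T>
constexpr bool is_pointer_v<T* volatile> = true;
template <typename T>
constexpr bool is_pointer_v<T* const volatile> = true;

}  // namespace sketch
```

Each query instantiates only a variable template specialization, with no class template such as is_pointer<T> behind it, which is presumably why it is expected to be cheaper than the `is_pointer<_Tp>::value` form.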


Re: [PATCH] Fix bootstrap failure (with g++ 4.8.5) in tree-if-conv.cc.

2023-07-14 Thread Andrew Pinski via Gcc-patches
On Fri, Jul 14, 2023 at 11:56 AM Roger Sayle  wrote:
>
>
>
> This patch fixes the bootstrap failure I'm seeing using gcc 4.8.5 as
>
> the host compiler.  Ok for mainline?  [I might be missing something]

I think adding const here makes this well defined C++20 too.
See http://cplusplus.github.io/LWG/lwg-defects.html#3031 .
Also see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107850 .

(I could be reading these wrong too).

Thanks,
Andrew

>
>
>
>
>
> 2023-07-14  Roger Sayle  
>
>
>
> gcc/ChangeLog
>
> * tree-if-conv.cc (predicate_scalar_phi): Make the arguments
>
> to the std::sort comparison lambda function const.
>
>
>
>
>
> Cheers,
>
> Roger
>
> --
>
>
>


[Bug libstdc++/108556] std::sort changes objects' member values

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108556

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|INVALID |DUPLICATE

--- Comment #4 from Andrew Pinski  ---
dup

*** This bug has been marked as a duplicate of bug 553 ***

[Bug libstdc++/553] Call to sort () results in segfault

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=553

Andrew Pinski  changed:

   What|Removed |Added

 CC||gnu.iodaj at simplelogin dot 
com

--- Comment #9 from Andrew Pinski  ---
*** Bug 108556 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/110671] New: ICE on valid code at -O2 and -O3 on x86_64-linux-gnu: in gimple_phi_arg_def_from_edge, at gimple.h:4699

2023-07-14 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110671

Bug ID: 110671
   Summary: ICE on valid code at -O2 and -O3 on x86_64-linux-gnu:
in gimple_phi_arg_def_from_edge, at gimple.h:4699
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

It appears to be a very recent regression.

[631] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/home/suz/suz-local/software/local/gcc-trunk/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk
--enable-sanitizers --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230714 (experimental) (GCC) 
[632] % 
[632] % gcctk -O2 -c small.c
during GIMPLE pass: sccp
small.c: In function ‘main’:
small.c:3:5: internal compiler error: in gimple_phi_arg_def_from_edge, at
gimple.h:4699
3 | int main() {
  | ^~~~
0x80ec9d gimple_phi_arg_def_from_edge(gphi const*, edge_def const*)
../../gcc-trunk/gcc/gimple.h:4699
0x8101ae gimple_phi_arg_def_from_edge(gphi const*, edge_def const*)
../../gcc-trunk/gcc/tree.h:3700
0x8101ae final_value_replacement_loop(loop*)
../../gcc-trunk/gcc/tree-scalar-evolution.cc:3732
0x118a1a5 execute
../../gcc-trunk/gcc/tree-ssa-loop.cc:411
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
[633] % 
[633] % cat small.c
extern void f();
int a;
int main() {
  int *b = &a, c = 1, d;
 L:
  if (a)
for (; d < 1; d++) {
  if (a)
f();
  *b |= c;
}
  c = 0;
  if (a)
goto L;
  return 0;
}

[PATCH] Fix bootstrap failure (with g++ 4.8.5) in tree-if-conv.cc.

2023-07-14 Thread Roger Sayle
 

This patch fixes the bootstrap failure I'm seeing using gcc 4.8.5 as
the host compiler.  Ok for mainline?  [I might be missing something]

2023-07-14  Roger Sayle

gcc/ChangeLog
* tree-if-conv.cc (predicate_scalar_phi): Make the arguments
to the std::sort comparison lambda function const.

Cheers,
Roger
--


diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index 91e2eff..799f071 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -2204,7 +2204,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator 
*gsi)
 }
 
   /* Sort elements based on rankings ARGS.  */
-  std::sort(argsKV.begin(), argsKV.end(), [](ArgEntry &left, ArgEntry &right) {
+  std::sort(argsKV.begin(), argsKV.end(), [](const ArgEntry &left,
+                                             const ArgEntry &right) {
 return left.second < right.second;
   });
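
A standalone sketch of the same pattern, for anyone who wants to try it with an older host compiler. Entry and sort_by_rank are hypothetical stand-ins for the ArgEntry pairs in tree-if-conv.cc, assuming the second member holds the rank. The comparator only reads its arguments, so taking them by const reference lets the library call it with const objects or temporaries as well.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for ArgEntry: (argument id, rank).
using Entry = std::pair<int, unsigned>;

// Sort entries by ascending rank using a comparator with const parameters.
std::vector<Entry> sort_by_rank(std::vector<Entry> entries) {
  const auto by_rank = [](const Entry &left, const Entry &right) {
    return left.second < right.second;
  };
  std::sort(entries.begin(), entries.end(), by_rank);
  // Postcondition: ranks are now in non-decreasing order.
  assert(std::is_sorted(entries.begin(), entries.end(), by_rank));
  return entries;
}
```

Calling sort_by_rank({{10, 3u}, {11, 1u}, {12, 2u}}) returns the entries in the order 11, 12, 10.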
 


[Bug fortran/99139] ICE: gfc_get_default_type(): Bad symbol '__tmp_UNKNOWN_0_rank_1'

2023-07-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99139

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

  Known to fail||10.5.0, 11.4.0, 12.3.0,
   ||13.1.0
  Known to work||14.0

--- Comment #7 from anlauf at gcc dot gnu.org ---
Updating known-to-work/known-to-fail versions.

Paul/Steve: do you want to assign this PR to one of you?

[Bug target/110657] BPF verifier rejects generated code due to invalid stack access

2023-07-14 Thread jemarch at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110657

Jose E. Marchesi  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Jose E. Marchesi  ---
Thanks for confirming.  Resolving as fixed.

[Bug target/110657] BPF verifier rejects generated code due to invalid stack access

2023-07-14 Thread kris.van.hees at oracle dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110657

--- Comment #7 from Kris Van Hees  ---
Confirmed that it resolves the issue.

Thanks!

[Bug fortran/110288] [11/12/13/14] Regression: segfault in findloc with allocatable array of allocatable characters

2023-07-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110288

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from anlauf at gcc dot gnu.org ---
Fixed for gcc-14, and backported to affected branches.  Closing.

Thanks for the report!

[Bug fortran/110288] [11/12/13/14] Regression: segfault in findloc with allocatable array of allocatable characters

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110288

--- Comment #10 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Harald Anlauf
:

https://gcc.gnu.org/g:a348245bfb018f02b36d22575380b34aef58f52c

commit r11-10910-ga348245bfb018f02b36d22575380b34aef58f52c
Author: Harald Anlauf 
Date:   Tue Jul 11 21:21:25 2023 +0200

Fortran: formal symbol attributes for intrinsic procedures [PR110288]

gcc/fortran/ChangeLog:

PR fortran/110288
* symbol.c (gfc_copy_formal_args_intr): When deriving the formal
argument attributes from the actual ones for intrinsic procedure
calls, take special care of CHARACTER arguments that we do not
wrongly treat them formally as deferred-length.

gcc/testsuite/ChangeLog:

PR fortran/110288
* gfortran.dg/findloc_10.f90: New test.

(cherry picked from commit 3b2c523ae31b68fc3b8363b458a55eec53a44365)

[Bug fortran/110288] [11/12/13/14] Regression: segfault in findloc with allocatable array of allocatable characters

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110288

--- Comment #9 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Harald Anlauf
:

https://gcc.gnu.org/g:995c717500c368c5aec7889dfa047cff7cb0139b

commit r12-9773-g995c717500c368c5aec7889dfa047cff7cb0139b
Author: Harald Anlauf 
Date:   Tue Jul 11 21:21:25 2023 +0200

Fortran: formal symbol attributes for intrinsic procedures [PR110288]

gcc/fortran/ChangeLog:

PR fortran/110288
* symbol.cc (gfc_copy_formal_args_intr): When deriving the formal
argument attributes from the actual ones for intrinsic procedure
calls, take special care of CHARACTER arguments that we do not
wrongly treat them formally as deferred-length.

gcc/testsuite/ChangeLog:

PR fortran/110288
* gfortran.dg/findloc_10.f90: New test.

(cherry picked from commit 3b2c523ae31b68fc3b8363b458a55eec53a44365)

[Bug c/110654] inconsistent error message in presence of unexpected encoded characters

2023-07-14 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110654

--- Comment #2 from Ulrich Drepper  ---
(In reply to Andrew Pinski from comment #1)
> So this is due to differences in the languages themselves rather than due to
> C vs C++ front-end ...

This is certainly true for the validation.

But the standard never says anything about how an error should be reported.  I
don't think there is a reason to make this more obscure than necessary.

[Bug fortran/110288] [11/12/13/14] Regression: segfault in findloc with allocatable array of allocatable characters

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110288

--- Comment #8 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Harald Anlauf:

https://gcc.gnu.org/g:447dd2924e43884d798d8c40765cbfddd0fde0ae

commit r13-7564-g447dd2924e43884d798d8c40765cbfddd0fde0ae
Author: Harald Anlauf 
Date:   Tue Jul 11 21:21:25 2023 +0200

Fortran: formal symbol attributes for intrinsic procedures [PR110288]

gcc/fortran/ChangeLog:

PR fortran/110288
* symbol.cc (gfc_copy_formal_args_intr): When deriving the formal
argument attributes from the actual ones for intrinsic procedure
calls, take special care of CHARACTER arguments that we do not
wrongly treat them formally as deferred-length.

gcc/testsuite/ChangeLog:

PR fortran/110288
* gfortran.dg/findloc_10.f90: New test.

(cherry picked from commit 3b2c523ae31b68fc3b8363b458a55eec53a44365)

☺ Buildbot (Sourceware): gccrust - build successful (master)

2023-07-14 Thread builder--- via Gcc-rust
A restored build has been detected on builder gccrust-rawhide-x86_64 while 
building gccrust.

Full details are available at:
https://builder.sourceware.org/buildbot/#builders/132/builds/1208

Build state: build successful
Revision: 04353fb887c7e66769e817500bdca2ef6f53ba6f
Worker: bb1-1
Build Reason: (unknown)
Blamelist: Arthur Cohen, Marc Poulhiès, Matthew Jasper, Muhammad Mahad,
Owen Avery, Philip Herron, Pierre-Emmanuel Patry, Raiki Tamura

Steps:

- 0: worker_preparation ( success )

- 1: git checkout ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/1/logs/stdio

- 2: rm -rf gccrs-build ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/2/logs/stdio

- 3: configure ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/3/logs/stdio
- config.log: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/3/logs/config_log

- 4: make ( warnings )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/4/logs/stdio
- warnings (40): 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/4/logs/warnings__40_

- 5: make check ( warnings )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/5/logs/stdio
- rust.sum: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/5/logs/rust_sum
- rust.log: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/5/logs/rust_log
- warnings (6): 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/5/logs/warnings__6_

- 6: grep unexpected rust.sum ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/6/logs/stdio

- 7: prep ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/7/logs/stdio

- 8: build bunsen.cpio.gz ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/8/logs/stdio

- 9: fetch bunsen.cpio.gz ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/9/logs/stdio

- 10: unpack bunsen.cpio.gz ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/10/logs/stdio

- 11: pass .bunsen.source.gitname ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/11/logs/stdio

- 12: pass .bunsen.source.gitdescribe ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/12/logs/stdio

- 13: pass .bunsen.source.gitbranch ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/13/logs/stdio

- 14: pass .bunsen.source.gitrepo ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/14/logs/stdio

- 15: upload to bunsen ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/15/logs/stdio

- 16: clean up ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/16/logs/stdio

- 17: rm -rf gccrs-build_1 ( success )
Logs:
- stdio: 
https://builder.sourceware.org/buildbot/#builders/132/builds/1208/steps/17/logs/stdio

-- 
Gcc-rust mailing list
Gcc-rust@gcc.gnu.org
https://gcc.gnu.org/mailman/listinfo/gcc-rust


[Bug ada/110668] gcc/ada/gcc-interface/Make-lang.in:1012: ada/b_gnat1.adb] Error 5

2023-07-14 Thread dclarke at blastwave dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110668

Dennis Clarke  changed:

   What       |Removed     |Added
   -------------------------------------
   Resolution |---         |INVALID
   Status     |UNCONFIRMED |RESOLVED

--- Comment #4 from Dennis Clarke  ---
Let's call this invalid!

[Bug other/110669] [14 regression] ICE in gcc.dg/torture/pr105132.c since r14-2515-gb77161e60bce7b

2023-07-14 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110669

--- Comment #2 from David Binderman  ---
Reduced C code seems to be:

int g_29, func_47_p_48, func_47_p_51_l_129;
void func_47_p_51() {
  for (;;) {
func_47_p_51_l_129 = 0;
for (; func_47_p_51_l_129 <= 1; func_47_p_51_l_129 += 1) {
  short *l_160 = func_47_p_48 || *l_160;
  *l_160 &= g_29;
}
  }
}

$ ~/gcc/results/bin/gcc -c -Ofast bug942.c
bug942.c: In function ‘func_47_p_51’:
bug942.c:6:22: warning: initialization of ‘short int *’ from ‘int’ makes
pointer from integer without a cast [-Wint-conversion]
6 |   short *l_160 = func_47_p_48 || *l_160;
  |  ^~~~
during GIMPLE pass: sccp
bug942.c:2:6: internal compiler error: in gimple_phi_arg_def_from_edge, at
gimple.h:4699
2 | void func_47_p_51() {
  |  ^~~~
0xf4f20f final_value_replacement_loop(loop*)
../../trunk.year/gcc/tree-scalar-evolution.cc:0

[Bug target/110647] [14 Regression] 66% TSVC/s2712 regression on N1-neoverse between g:620a35b24a2b6edb (2023-07-01 07:24) and g:80ae426a195a0d03 (2023-07-02 01:37)

2023-07-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110647

--- Comment #2 from Jan Hubicka  ---
This is a testcase based on our testsuite version, so it can be copied to
Compiler Explorer:

#define iterations 1
#define LEN_1D 32000
#define LEN_2D 256
#define ARRAY_ALIGNMENT 64

typedef float real_t;
#define ABS fabsf
__attribute__((aligned(ARRAY_ALIGNMENT)))
real_t flat_2d_array[LEN_2D * LEN_2D];
__attribute__((aligned(ARRAY_ALIGNMENT))) real_t x[LEN_1D];
__attribute__((aligned(ARRAY_ALIGNMENT))) real_t a[LEN_1D], b[LEN_1D],
c[LEN_1D], d[LEN_1D], e[LEN_1D], aa[LEN_2D][LEN_2D], bb[LEN_2D][LEN_2D],
cc[LEN_2D][LEN_2D], tt[LEN_2D][LEN_2D];
__attribute__((aligned(ARRAY_ALIGNMENT))) int indx[LEN_1D];

int dummy(real_t[LEN_1D], real_t[LEN_1D], real_t[LEN_1D], real_t[LEN_1D],
  real_t[LEN_1D], real_t[LEN_2D][LEN_2D], real_t[LEN_2D][LEN_2D],
  real_t[LEN_2D][LEN_2D], real_t);
real_t s2712(struct args_t * func_args)
{
//control flow
//if to elemental min


for (int nl = 0; nl < 4*iterations; nl++) {
for (int i = 0; i < LEN_1D; i++) {
if (a[i] >= b[i]) {
a[i] += b[i] * c[i];
}
}
dummy(a, b, c, d, e, aa, bb, cc, 0.);
}
return 0;
}

So with GCC 13 I get:
s2712(args_t*):
        stp     x29, x30, [sp, -96]!
        mov     x29, sp
        stp     x19, x20, [sp, 16]
        adrp    x19, a
        adrp    x20, b
        add     x19, x19, :lo12:a
        add     x20, x20, :lo12:b
        stp     x21, x22, [sp, 32]
        adrp    x22, c
        mov     x21, 62464
        add     x22, x22, :lo12:c
        stp     x23, x24, [sp, 48]
        adrp    x24, e
        adrp    x23, d
        add     x24, x24, :lo12:e
        add     x23, x23, :lo12:d
        stp     x25, x26, [sp, 64]
        adrp    x26, bb
        adrp    x25, aa
        add     x26, x26, :lo12:bb
        add     x25, x25, :lo12:aa
        stp     x27, x28, [sp, 80]
        adrp    x27, cc
        add     x27, x27, :lo12:cc
        mov     w28, 4
        movk    x21, 0x1, lsl 16
.L2:
        mov     x0, 0
.L5:
        ldr     s0, [x19, x0]
        ldr     s1, [x20, x0]
        fcmpe   s0, s1
        bge     .L7
.L3:
        add     x0, x0, 4
        cmp     x0, x21
        bne     .L5
        movi    v0.2s, #0
        mov     x7, x27
        mov     x6, x26
        mov     x5, x25
        mov     x4, x24
        mov     x3, x23
        mov     x2, x22
        mov     x1, x20
        mov     x0, x19
        bl      dummy(float*, float*, float*, float*, float*, float (*) [256], float (*) [256], float (*) [256], float)
        subs    w28, w28, #1
        bne     .L2
        ldp     x19, x20, [sp, 16]
        movi    v0.2s, #0
        ldp     x21, x22, [sp, 32]
        ldp     x23, x24, [sp, 48]
        ldp     x25, x26, [sp, 64]
        ldp     x27, x28, [sp, 80]
        ldp     x29, x30, [sp], 96
        ret
.L7:
        ldr     s2, [x22, x0]
        fmadd   s0, s1, s2, s0
        str     s0, [x19, x0]
        b       .L3

and trunk:
s2712(args_t*):
        stp     x29, x30, [sp, -96]!
        mov     x29, sp
        stp     x19, x20, [sp, 16]
        adrp    x19, a
        adrp    x20, b
        add     x19, x19, :lo12:a
        add     x20, x20, :lo12:b
        stp     x21, x22, [sp, 32]
        adrp    x22, c
        mov     x21, 62464
        add     x22, x22, :lo12:c
        stp     x23, x24, [sp, 48]
        adrp    x24, e
        adrp    x23, d
        add     x24, x24, :lo12:e
        add     x23, x23, :lo12:d
        stp     x25, x26, [sp, 64]
        adrp    x26, bb
        adrp    x25, aa
        add     x26, x26, :lo12:bb
        add     x25, x25, :lo12:aa
        stp     x27, x28, [sp, 80]
        adrp    x27, cc
        add     x27, x27, :lo12:cc
        mov     w28, 4
        movk    x21, 0x1, lsl 16
.L2:
        mov     x0, 0
.L5:
        ldr     s31, [x19, x0]
        ldr     s30, [x20, x0]
        fcmpe   s31, s30
        bge     .L7
.L3:
        add     x0, x0, 4
        cmp     x0, x21
        bne     .L5
        movi    v0.2s, #0
        mov     x7, x27
        mov     x6, x26
        mov     x5, x25
        mov     x4, x24
        mov     x3, x23
        mov     x2, x22
        mov     x1, x20
        mov     x0, x19
        bl      dummy(float*, float*, float*, float*, float*, float (*) [256], float (*) [256], float (*) [256], float)
        subs    w28, w28, #1
        bne     .L2
        ldp     x19, x20, [sp, 16]
        movi    v0.2s, #0
        ldp     x21, x22, [sp, 32]
        ldp     x23, x24, [sp, 48]
        ldp     x25, x26, [sp, 64]
        ldp     x27, x28, [sp, 80]
        ldp     x29, x30, [sp], 96
        ret
.L7:
        ldr     s29, [x22, x0]
        fmadd   s31, s30, s29, s31
        str     s31, [x19, x0]
        b       .L3

The only difference seems to be:
 .L2:
 mov x0, 0
 .L5:
-ldr s31, [x19, x0]
-ldr s30, [x20, x0]
-fcmpe   s31, s30
+ldr s0, [x19, x0]
+  

Re: [PATCH] c++: redundant targ coercion for var/alias tmpls

2023-07-14 Thread Patrick Palka via Gcc-patches
On Thu, 13 Jul 2023, Jason Merrill wrote:

> On 7/13/23 11:48, Patrick Palka wrote:
> > On Wed, 28 Jun 2023, Patrick Palka wrote:
> > 
> > > On Wed, Jun 28, 2023 at 11:50 AM Jason Merrill  wrote:
> > > > 
> > > > On 6/23/23 12:23, Patrick Palka wrote:
> > > > > On Fri, 23 Jun 2023, Jason Merrill wrote:
> > > > > 
> > > > > > On 6/21/23 13:19, Patrick Palka wrote:
> > > > > > > When stepping through the variable/alias template specialization
> > > > > > > code
> > > > > > > paths, I noticed we perform template argument coercion twice:
> > > > > > > first from
> > > > > > > instantiate_alias_template / finish_template_variable and again
> > > > > > > from
> > > > > > > tsubst_decl (during instantiate_template).  It should suffice to
> > > > > > > perform
> > > > > > > coercion once.
> > > > > > > 
> > > > > > > To that end patch elides this second coercion from tsubst_decl
> > > > > > > when
> > > > > > > possible.  We can't get rid of it completely because we don't
> > > > > > > always
> > > > > > > specialize a variable template from finish_template_variable: we
> > > > > > > could
> > > > > > > also be doing so directly from instantiate_template during
> > > > > > > variable
> > > > > > > template partial specialization selection, in which case the
> > > > > > > coercion
> > > > > > > from tsubst_decl would be the first and only coercion.
> > > > > > 
> > > > > > Perhaps we should be coercing in lookup_template_variable rather
> > > > > > than
> > > > > > finish_template_variable?
> > > > > 
> > > > > Ah yes, there's a patch for that at
> > > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617377.html :)
> > > > 
> > > > So after that patch, can we get rid of the second coercion completely?
> > > 
> > > On second thought it should be possible to get rid of it, if we
> > > rearrange things to always pass the primary arguments to tsubst_decl,
> > > and perform partial specialization selection from there instead of
> > > instantiate_template.  Let me try...
> > 
> > Like so?  Bootstrapped and regtested on x86_64-pc-linux-gnu.
> > 
> > -- >8 --
> > 
> > When stepping through the variable/alias template specialization code
> > paths, I noticed we perform template argument coercion twice: first from
> > instantiate_alias_template / finish_template_variable and again from
> > tsubst_decl (during instantiate_template).  It'd be good to avoid this
> > redundant coercion.
> > 
> > It turns out that this coercion could be safely elided whenever
> > specializing a primary variable/alias template, because we can rely on
> > lookup_template_variable and instantiate_alias_template to already have
> > coerced the arguments.
> > 
> > The other situation to consider is when fully specializing a partial
> > variable template specialization (from instantiate_template), in which
> > case the passed 'args' are the (already coerced) arguments relative to
> > the partial template and 'argvec', the result of substitution into
> > DECL_TI_ARGS, are the (uncoerced) arguments relative to the primary
> > template, so coercion is still necessary.  We can still avoid this
> > coercion however if we always pass the primary variable template to
> > tsubst_decl from instantiate_template, and instead perform partial
> > specialization selection directly from tsubst_decl.  This patch
> > implements this approach.
> 
> The relationship between instantiate_template and tsubst_decl is pretty
> tangled.  We use the former to substitute (often deduced) template arguments
> into a template, and the latter to substitute template arguments into a use of
> a template...and also to implement the former.
> 
> For substitution of uses of a template, we expect to need to coerce the
> arguments after substitution.  But we avoid this issue for variable templates
> by keeping them as TEMPLATE_ID_EXPR until substitution time, so if we see a
> VAR_DECL in tsubst_decl it's either a non-template variable or under
> instantiate_template.

FWIW it seems we could also be in tsubst_decl for a VAR_DECL if

  * we're partially instantiating a class-scope variable template
during instantiation of the class
  * we're substituting a use of an already non-dependent variable
template specialization

> 
> So it seems like the current coercion for variable templates is only needed in
> this case to support the redundant hash table lookup that we just did in
> instantiate_template.  Perhaps instead of doing coercion here or moving the
> partial spec lookup, we could skip the hash table lookup for the case of a
> variable template?

It seems we'd then also have to make instantiate_template responsible
for registering the variable template specialization since tsubst_decl
no longer necessarily has the arguments relative to the primary template
('args' could be relative to the partial template).
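
To make the two kinds of argument lists concrete, here is a minimal sketch
using a hypothetical variable template `which` (it is not taken from the patch
or the testsuite).  For `which<int, int*>` the arguments relative to the
primary template are <int, int*>, while the arguments relative to the partial
specialization are just <int>.  The hypothetical `twice` template shows what
coercion means for a non-type argument: the int literal is converted to the
parameter's type, long.

```cpp
// Primary variable template: two type parameters.
template<class T, class U>
constexpr int which = 0;

// Partial specialization: one parameter T, matching the pattern <T, T*>.
template<class T>
constexpr int which<T, T*> = 1;

// Non-type parameter of type long: an int argument such as 3 is
// converted ("coerced") to long when the template-id is formed.
template<long N>
constexpr long twice = 2 * N;

static_assert(which<int, char> == 0, "primary template is used");
static_assert(which<int, int*> == 1, "partial specialization is used");
static_assert(twice<3> == 6L, "int argument is coerced to long");
```

This only illustrates the terminology; where in the front end the coercion
and the specialization lookup happen is exactly what the thread is debating.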

Like so?  The following makes us perform all the specialization table
manipulation in instantiate_template instead of tsubst_decl for variable
template 

[Bug ada/110668] gcc/ada/gcc-interface/Make-lang.in:1012: ada/b_gnat1.adb] Error 5

2023-07-14 Thread mikpelinux at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110668

--- Comment #3 from Mikael Pettersson  ---
https://gcc.gnu.org/install/prerequisites.html, GNAT section, 4th paragraph.

[Bug other/110669] [14 regression] ICE in gcc.dg/torture/pr105132.c since r14-2515-gb77161e60bce7b

2023-07-14 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110669

David Binderman  changed:

   What |Removed |Added
   ------------------------------------------
   CC   |        |dcb314 at hotmail dot com

--- Comment #1 from David Binderman  ---
I also see this. Reduction running now.

[Bug other/110669] New: [14 regression] ICE in gcc.dg/torture/pr105132.c since r14-2515-gb77161e60bce7b

2023-07-14 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110669

Bug ID: 110669
   Summary: [14 regression] ICE in gcc.dg/torture/pr105132.c since
r14-2515-gb77161e60bce7b
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:b77161e60bce7b4416319defe5f141f14fd375c4, r14-2515-gb77161e60bce7b
make  -k check-gcc RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr105132.c"
FAIL: gcc.dg/torture/pr105132.c   -O1  (internal compiler error: in
gimple_phi_arg_def_from_edge, at gimple.h:4699)
FAIL: gcc.dg/torture/pr105132.c   -O1  (test for excess errors)
# of expected passes            7
# of unexpected failures        2

spawn -ignore SIGHUP /home/seurer/gcc/git/build/gcc-test/gcc/xgcc
-B/home/seurer/gcc/git/build/gcc-test/gcc/
/home/seurer/gcc/git/gcc-test/gcc/testsuite/gcc.dg/torture/pr105132.c
-fdiagnostics-plain-output -O1 -S -o pr105132.s
during GIMPLE pass: sccp
/home/seurer/gcc/git/gcc-test/gcc/testsuite/gcc.dg/torture/pr105132.c: In
function 'd':
/home/seurer/gcc/git/gcc-test/gcc/testsuite/gcc.dg/torture/pr105132.c:7:6:
internal compiler error: in gimple_phi_arg_def_from_edge, at gimple.h:4699
0x10e0f357 gimple_phi_arg_def_from_edge(gphi const*, edge_def const*)
/home/seurer/gcc/git/gcc-test/gcc/gimple.h:4699
0x10e0f357 gimple_phi_arg_def_from_edge(gphi const*, edge_def const*)
/home/seurer/gcc/git/gcc-test/gcc/gimple.h:4697
0x10e0f357 final_value_replacement_loop(loop*)
/home/seurer/gcc/git/gcc-test/gcc/tree-scalar-evolution.cc:3732
0x10f0e01f execute
/home/seurer/gcc/git/gcc-test/gcc/tree-ssa-loop.cc:411

commit b77161e60bce7b4416319defe5f141f14fd375c4 (HEAD)
Author: Richard Biener 
Date:   Fri Jul 14 10:01:39 2023 +0200

Provide extra checking for phi argument access from edge
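
The commit title suggests that the accessor now verifies that the edge really
belongs to the PHI node's block before indexing the argument vector.  The
following is a toy model of that idea only: `Block`, `Edge`, `Phi` and both
functions are hypothetical stand-ins, not GCC's data structures, and the exact
condition checked in gimple.h is an assumption since the message does not show
it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for a basic block, a CFG edge and a PHI node.
struct Block { int id; };
struct Edge  { const Block* dest; std::size_t dest_idx; };
struct Phi   { const Block* block; const int* args; std::size_t nargs; };

// True if `e` is an incoming edge of the block that holds `phi`
// and its slot index is within the PHI's argument list.
constexpr bool edge_enters_phi_block(const Phi& phi, const Edge& e) {
    return e.dest == phi.block && e.dest_idx < phi.nargs;
}

// Fetch the PHI argument for an edge, asserting the precondition first.
// A caller that passes an edge into some other block trips the assertion
// instead of silently reading an unrelated argument slot.
inline int phi_arg_from_edge(const Phi& phi, const Edge& e) {
    assert(edge_enters_phi_block(phi, e));
    return phi.args[e.dest_idx];
}

// Sample data: a loop header with a two-argument PHI, and an exit block.
constexpr Block header{1}, exit_block{2};
constexpr int header_args[] = {10, 20};
constexpr Phi phi{&header, header_args, 2};
constexpr Edge back_edge{&header, 1};      // enters the PHI's block
constexpr Edge exit_edge{&exit_block, 0};  // enters a different block

static_assert(edge_enters_phi_block(phi, back_edge), "valid edge accepted");
static_assert(!edge_enters_phi_block(phi, exit_edge), "foreign edge rejected");
```

Under this reading, the ICE in final_value_replacement_loop would mean an
existing caller passed an edge that does not enter the PHI's block, which the
new check now exposes.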

[Bug c/110670] New: tree-vect-stmts.cc:9733:11: warning: variable 'offvar' is sometimes uninitialised

2023-07-14 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110670

Bug ID: 110670
   Summary: tree-vect-stmts.cc:9733:11: warning: variable 'offvar'
is sometimes uninitialised
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

From today's build of gcc trunk, clang says:

tree-vect-stmts.cc:9733:11: warning: variable 'offvar' is used uninitialized
whenever 'if' condition is false [-Wsometimes-uninitialized]

$ grep -n offvar ../trunk.year/gcc/tree-vect-stmts.cc
8377:  tree offvar;
8495:, NULL);
8504: running_off = offvar;
9692:  tree offvar;
9767:, NULL);
9772:  running_off = offvar;
$ 

It might be worthwhile initialising offvar to something sensible
for both its declarations.
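
As a rough illustration of the pattern clang is describing, here is a
hypothetical reduction.  The function and its parameters are made up for this
sketch and are not the code in tree-vect-stmts.cc; the "sensible" initial
value for the real variable is a separate question (a null tree would be one
guess).

```cpp
// Shape that triggers -Wsometimes-uninitialized (shown as a comment only,
// because reading an uninitialised variable is undefined behaviour):
//
//     int offvar;
//     if (have_offset)
//         offvar = base + 4;
//     return offvar;        // uninitialised when have_offset is false
//
// Suggested shape: give the variable a defined value on every path.
constexpr int running_offset(bool have_offset, int base) {
    int offvar = 0;          // initialised at the declaration
    if (have_offset)
        offvar = base + 4;
    return offvar;
}

static_assert(running_offset(true, 8) == 12, "branch taken");
static_assert(running_offset(false, 8) == 0, "branch not taken");
```

If the surrounding logic guarantees the variable is only read when the
condition held, the warning is a false positive, but the initialiser still
silences it at negligible cost.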

[Bug target/110588] btl (on x86_64) not always generated

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110588

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:43a0a5cd57eefd5a5bbead606ec4f6959af31802

commit r14-2528-g43a0a5cd57eefd5a5bbead606ec4f6959af31802
Author: Roger Sayle 
Date:   Fri Jul 14 18:21:56 2023 +0100

PR target/110588: Add *bt_setncqi_2 to generate btl on x86.

This patch resolves PR target/110588 to catch another case in combine
where the i386 backend should be generating a btl instruction.  This adds
another define_insn_and_split to recognize the RTL representation for this
case.

I also noticed that two related define_insn_and_split weren't using the
preferred string style for single statement preparation-statements, so
I've reformatted these to be consistent in style with the new one.

2023-07-14  Roger Sayle  

gcc/ChangeLog
PR target/110588
* config/i386/i386.md (*bt_setcqi): Prefer string form
preparation statement over braces for a single statement.
(*bt_setncqi): Likewise.
(*bt_setncqi_2): New define_insn_and_split.

gcc/testsuite/ChangeLog
PR target/110588
* gcc.target/i386/pr110588.c: New test case.
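
For readers unfamiliar with the idiom, this is a sketch of the kind of source
pattern a bit-test instruction can serve: it returns 1 when bit n of x is
clear.  The function name and exact shape are assumptions for illustration;
this is not the contents of pr110588.c, and whether a given compiler emits
btl/setnc for it depends on the version and flags.

```cpp
// Returns 1 when bit `n` of `x` is clear, 0 when it is set.
// On x86, "bt" copies the selected bit into the carry flag, and "setnc"
// then yields 1 if the carry flag (and so the bit) is clear.
constexpr unsigned char bit_is_clear(unsigned char x, int n) {
    return static_cast<unsigned char>(((x >> n) & 1) ^ 1);
}

static_assert(bit_is_clear(0x04, 2) == 0, "bit 2 of 0x04 is set");
static_assert(bit_is_clear(0x04, 1) == 1, "bit 1 of 0x04 is clear");
```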

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What                          |Removed |Added
   -----------------------------------------------
   Attachment #55542 is obsolete |0       |1

--- Comment #81 from Jakub Jelinek  ---
Created attachment 55545
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55545=edit
gcc14-bitint-wip.patch

_BitInt -> double conversion (float, long double, __float128, _Float16 and
__bf16 conversions still to be implemented).

[Bug c++/109876] [11/12/13 Regression] initializer_list not usable in constant expressions in a template

2023-07-14 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109876

Marek Polacek  changed:

   What|Removed |Added

Summary|[11/12/13/14 Regression]|[11/12/13 Regression]
   |initializer_list not usable |initializer_list not usable
   |in constant expressions in  |in constant expressions in
   |a template  |a template

--- Comment #14 from Marek Polacek  ---
Fixed on trunk so far.  I want to wait a bit before backporting this.

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-14 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 14, 2023 at 04:56:18PM +0100, Richard Sandiford via Gcc-patches 
wrote:
> Summary: We'd like to be able to specify some attributes using
> keywords, rather than the traditional __attribute__ or [[...]]
> syntax.  Would that be OK?

Will defer to C/C++ maintainers, but as you mentioned, there are many
attributes which really can't be ignored and change behavior significantly.
vector_size is one of those, mode attribute another,
no_unique_address another one (changes ABI in various cases),
the OpenMP attributes (omp::directive, omp::sequence) can change
behavior if -fopenmp, etc.
One can easily error with
#ifdef __has_cpp_attribute
#if !__has_cpp_attribute (arm::whatever)
#error arm::whatever attribute unsupported
#endif
#else
#error __has_cpp_attribute unsupported
#endif
Adding keywords instead of attributes seems to be too ugly to me.

Jakub



[Bug ada/110668] gcc/ada/gcc-interface/Make-lang.in:1012: ada/b_gnat1.adb] Error 5

2023-07-14 Thread dclarke at blastwave dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110668

--- Comment #2 from Dennis Clarke  ---
Oh darn. Is this documented anywhere in the build instructions?

[Bug c++/109876] [11/12/13/14 Regression] initializer_list not usable in constant expressions in a template

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109876

--- Comment #13 from CVS Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:b5138df96a93d3b5070c88b8617eabd38cb24ab6

commit r14-2527-gb5138df96a93d3b5070c88b8617eabd38cb24ab6
Author: Marek Polacek 
Date:   Thu May 25 18:54:18 2023 -0400

c++: wrong error with static constexpr var in tmpl [PR109876]

Since r8-509, we'll no longer create a static temporary var for
the initializer '{ 1, 2 }' for num in the attached test because
the code in finish_compound_literal is now guarded by
'&& fcl_context == fcl_c99' but it's fcl_functional here.  This
causes us to reject num as non-constant when evaluating it in
a template.

Jason's idea was to treat num as value-dependent even though it
actually isn't.  This patch implements that suggestion.

We weren't marking objects whose type is an empty class type
constant.  This patch changes that so that v_d_e_p doesn't need
to check is_really_empty_class.

Co-authored-by: Jason Merrill 

PR c++/109876

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): Set TREE_CONSTANT when initializing
an object of empty class type.
* pt.cc (value_dependent_expression_p) : Treat a
constexpr-declared non-constant variable as value-dependent.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-template12.C: New test.
* g++.dg/cpp1z/constexpr-template1.C: New test.
* g++.dg/cpp1z/constexpr-template2.C: New test.

[Bug middle-end/88873] missing vectorization for decomposed operations on a vector type

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88873

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:8911879415d6c2a7baad88235554a912887a1c5c

commit r14-2526-g8911879415d6c2a7baad88235554a912887a1c5c
Author: Roger Sayle 
Date:   Fri Jul 14 18:10:05 2023 +0100

i386: Improved insv of DImode/DFmode {high,low}parts into TImode.

This is the next piece towards a fix for (the x86_64 ABI issues affecting)
PR 88873.  This patch generalizes the recent tweak to ix86_expand_move
for setting the highpart of a TImode reg from a DImode source using
*insvti_highpart_1, to handle both DImode and DFmode sources, and also
use the recently added *insvti_lowpart_1 for setting the lowpart.

Although this is another intermediate step (not yet a fix), towards
enabling *insvti and *concat* patterns to be candidates for TImode STV
(by using V2DI/V2DF instructions), it already improves things a little.

For the test case from PR 88873

typedef struct { double x, y; } s_t;
typedef double v2df __attribute__ ((vector_size (2 * sizeof(double))));

s_t foo (s_t a, s_t b, s_t c)
{
  return (s_t) { fma(a.x, b.x, c.x), fma (a.y, b.y, c.y) };
}

With -O2 -march=cascadelake, GCC currently generates:

Before (29 instructions):
vmovq   %xmm2, -56(%rsp)
movq    -56(%rsp), %rdx
vmovq   %xmm4, -40(%rsp)
movq    $0, -48(%rsp)
movq    %rdx, -56(%rsp)
movq    -40(%rsp), %rdx
vmovq   %xmm0, -24(%rsp)
movq    %rdx, -40(%rsp)
movq    -24(%rsp), %rsi
movq    -56(%rsp), %rax
movq    $0, -32(%rsp)
vmovq   %xmm3, -48(%rsp)
movq    -48(%rsp), %rcx
vmovq   %xmm5, -32(%rsp)
vmovq   %rax, %xmm6
movq    -40(%rsp), %rax
movq    $0, -16(%rsp)
movq    %rsi, -24(%rsp)
movq    -32(%rsp), %rsi
vpinsrq $1, %rcx, %xmm6, %xmm6
vmovq   %rax, %xmm7
vmovq   %xmm1, -16(%rsp)
vmovapd %xmm6, %xmm3
vpinsrq $1, %rsi, %xmm7, %xmm7
vfmadd132pd -24(%rsp), %xmm7, %xmm3
vmovapd %xmm3, -56(%rsp)
vmovsd  -48(%rsp), %xmm1
vmovsd  -56(%rsp), %xmm0
ret

After (20 instructions):
vmovq   %xmm2, -56(%rsp)
movq    -56(%rsp), %rax
vmovq   %xmm3, -48(%rsp)
vmovq   %xmm4, -40(%rsp)
movq    -48(%rsp), %rcx
vmovq   %xmm5, -32(%rsp)
vmovq   %rax, %xmm6
movq    -40(%rsp), %rax
movq    -32(%rsp), %rsi
vpinsrq $1, %rcx, %xmm6, %xmm6
vmovq   %xmm0, -24(%rsp)
vmovq   %rax, %xmm7
vmovq   %xmm1, -16(%rsp)
vmovapd %xmm6, %xmm2
vpinsrq $1, %rsi, %xmm7, %xmm7
vfmadd132pd -24(%rsp), %xmm7, %xmm2
vmovapd %xmm2, -56(%rsp)
vmovsd  -48(%rsp), %xmm1
vmovsd  -56(%rsp), %xmm0
ret

2023-07-14  Roger Sayle  

gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_expand_move): Generalize special
case inserting of 64-bit values into a TImode register, to handle
both DImode and DFmode using either *insvti_lowpart_1
or *insvti_highpart_1.
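
At the source level, the operation these patterns describe is replacing one
64-bit half of a 128-bit (TImode) value while keeping the other half.  The
sketch below models only the integer (DImode) case with the GCC/Clang
`unsigned __int128` extension; the helper names are hypothetical and nothing
here claims to reproduce what ix86_expand_move emits.

```cpp
#include <cstdint>

// 128-bit unsigned integer; a GCC/Clang extension, not ISO C++.
__extension__ typedef unsigned __int128 u128;

// Replace the upper 64 bits of x with hi, keeping the lower 64 bits.
constexpr u128 set_highpart(u128 x, std::uint64_t hi) {
    return (x & static_cast<u128>(~std::uint64_t{0}))
         | (static_cast<u128>(hi) << 64);
}

// Replace the lower 64 bits of x with lo, keeping the upper 64 bits.
constexpr u128 set_lowpart(u128 x, std::uint64_t lo) {
    return ((x >> 64) << 64) | lo;
}

constexpr std::uint64_t highpart(u128 x) {
    return static_cast<std::uint64_t>(x >> 64);
}

constexpr std::uint64_t lowpart(u128 x) {
    return static_cast<std::uint64_t>(x);
}

static_assert(highpart(set_highpart(5, 7)) == 7, "high half replaced");
static_assert(lowpart(set_highpart(5, 7)) == 5, "low half preserved");
```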

[Bug ada/110668] gcc/ada/gcc-interface/Make-lang.in:1012: ada/b_gnat1.adb] Error 5

2023-07-14 Thread mikpelinux at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110668

Mikael Pettersson  changed:

   What |Removed |Added
   --------------------------------------------
   CC   |        |mikpelinux at gmail dot com

--- Comment #1 from Mikael Pettersson  ---
Ada only supports being built by the same or older releases, while you're
trying to bootstrap gcc-10.5 using gcc-13.1. That won't work in general. (For
Canadian crosses it must be the exact same version.)
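
Stated as a predicate, the rule above might look like the following sketch.
The `Release` type and function name are hypothetical, and the comparison is a
simplification of the one-sentence rule in this comment; the install
documentation remains the authority.

```cpp
// A GCC release number such as 10.5 or 13.1 (hypothetical helper type).
struct Release { int rel_major; int rel_minor; };

// True if a compiler of release `host` may be used to build Ada from the
// sources of release `target`: the host must be the same release or older,
// and for a Canadian cross it must be exactly the same release.
constexpr bool can_build_ada(Release host, Release target, bool canadian_cross) {
    return canadian_cross
        ? (host.rel_major == target.rel_major
           && host.rel_minor == target.rel_minor)
        : (host.rel_major < target.rel_major
           || (host.rel_major == target.rel_major
               && host.rel_minor <= target.rel_minor));
}

// The reported setup: building gcc-10.5 with gcc-13.1 is not supported.
static_assert(!can_build_ada(Release{13, 1}, Release{10, 5}, false),
              "newer host cannot build older Ada sources");
static_assert(can_build_ada(Release{10, 5}, Release{13, 1}, false),
              "older host can build newer Ada sources");
```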

Re: [PATCH v2] c++: wrong error with static constexpr var in tmpl [PR109876]

2023-07-14 Thread Jason Merrill via Gcc-patches

On 7/13/23 14:54, Marek Polacek wrote:

On Fri, May 26, 2023 at 09:47:10PM -0400, Jason Merrill wrote:

On 5/26/23 19:18, Marek Polacek wrote:

The is_really_empty_class check is sort of non-obvious but the
comment should explain why I added it.

+  /* When there's nothing to initialize, we'll never mark the
+ VAR_DECL TREE_CONSTANT, therefore it would remain
+ value-dependent and we wouldn't instantiate.  */
  
Sorry it's taken so long to get back to this.



Interesting.  Can we change that (i.e. mark it TREE_CONSTANT) rather than
work around it here?


I think we can.  Maybe as in the below:

-- >8 --
Since r8-509, we'll no longer create a static temporary var for
the initializer '{ 1, 2 }' for num in the attached test because
the code in finish_compound_literal is now guarded by
'&& fcl_context == fcl_c99' but it's fcl_functional here.  This
causes us to reject num as non-constant when evaluating it in
a template.

Jason's idea was to treat num as value-dependent even though it
actually isn't.  This patch implements that suggestion.

We weren't marking objects whose type is an empty class type
constant.  This patch changes that so that v_d_e_p doesn't need
to check is_really_empty_class.

Co-authored-by: Jason Merrill 

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK, thanks.

Incidentally, I prefer to put the "ok?" line above the scissors line 
since it isn't intended to be part of the commit message.



PR c++/109876

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): Set TREE_CONSTANT when initializing
an object of empty class type.
* pt.cc (value_dependent_expression_p) : Treat a
constexpr-declared non-constant variable as value-dependent.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-template12.C: New test.
* g++.dg/cpp1z/constexpr-template1.C: New test.
* g++.dg/cpp1z/constexpr-template2.C: New test.
---
  gcc/cp/decl.cc| 13 +--
  gcc/cp/pt.cc  |  7 
  .../g++.dg/cpp0x/constexpr-template12.C   | 38 +++
  .../g++.dg/cpp1z/constexpr-template1.C| 25 
  .../g++.dg/cpp1z/constexpr-template2.C| 25 
  5 files changed, 105 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-template12.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-template1.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-template2.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 60f107d50c4..792ab330dd0 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -8200,7 +8200,6 @@ void
  cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
tree asmspec_tree, int flags)
  {
-  tree type;
vec *cleanups = NULL;
const char *asmspec = NULL;
int was_readonly = 0;
@@ -8220,7 +8219,7 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
/* Parameters are handled by store_parm_decls, not cp_finish_decl.  */
gcc_assert (TREE_CODE (decl) != PARM_DECL);
  
-  type = TREE_TYPE (decl);

+  tree type = TREE_TYPE (decl);
if (type == error_mark_node)
  return;
  
@@ -8410,7 +8409,7 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,

  if (decl_maybe_constant_var_p (decl)
  /* FIXME setting TREE_CONSTANT on refs breaks the back end.  */
  && !TYPE_REF_P (type))
-   TREE_CONSTANT (decl) = 1;
+   TREE_CONSTANT (decl) = true;
}
/* This is handled mostly by gimplify.cc, but we have to deal with
 not warning about int x = x; as it is a GCC extension to turn off
@@ -8421,6 +8420,14 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
  && !warning_enabled_at (DECL_SOURCE_LOCATION (decl), OPT_Winit_self))
suppress_warning (decl, OPT_Winit_self);
  }
+  else if (VAR_P (decl)
+  && COMPLETE_TYPE_P (type)
+  && !TYPE_REF_P (type)
+  && !dependent_type_p (type)
+  && is_really_empty_class (type, /*ignore_vptr*/false))
+/* We have no initializer but there's nothing to initialize anyway.
+   Treat DECL as constant due to c++/109876.  */
+TREE_CONSTANT (decl) = true;
  
if (flag_openmp

&& TREE_CODE (decl) == FUNCTION_DECL
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index fa15b75b9c5..255d18b9539 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -27983,6 +27983,13 @@ value_dependent_expression_p (tree expression)
else if (TYPE_REF_P (TREE_TYPE (expression)))
/* FIXME cp_finish_decl doesn't fold reference initializers.  */
return true;
+  /* We have a constexpr variable and we're processing a template.  When
+there's lifetime extension involved (for which finish_compound_literal
+used to create a temporary), we'll not be able to evaluate the
+variable until instantiating, so 

[Bug tree-optimization/110666] [14 Regression] wrong code at -O1 and above on x86_64-linux-gnu

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110666

--- Comment #3 from Andrew Pinski  ---
Created attachment 55544
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55544=edit
Full testcase to make sure we don't create wrong code

[PATCH] [og13] OpenMP: Dimension ordering for array-shaping operator for C and C++

2023-07-14 Thread Julian Brown
This patch fixes a bug in non-contiguous 'target update' operations using
the new array-shaping operator for C and C++, processing the dimensions
of the array the wrong way round during the OpenMP lowering pass.
Fortran was also using the wrong ordering, but the second
reversal in omp-low.cc made it produce the correct result.

The C and C++ bug only affected array shapes where the dimension sizes
are different ([X][Y]) - several existing tests used the same value
for both/all dimensions ([X][X]), which masked the problem.  Only the
array dimensions (extents) are affected, not e.g. the indices, lengths
or strides for array sections.

This patch reverses the order used in both omp-low.cc and the Fortran
front-end, so the order should now be correct for all supported base
languages.
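As a rough illustration of why equal extents masked the problem, here is a minimal sketch of row-major offset arithmetic. It is not code from the patch; the function names `offset_in_order` and `offset_reversed` are hypothetical, and the model only covers the extents, not sections or strides.

```cpp
#include <cassert>
#include <cstddef>

// Row-major offset of element [j][i] for a shape written as ([rows][cols]).
// The outermost extent does not take part in the offset computation.
std::size_t offset_in_order(std::size_t /*rows*/, std::size_t cols,
                            std::size_t j, std::size_t i)
{
  return j * cols + i;
}

// The same computation with the two extents consumed the wrong way round,
// as a simplified model of the ordering bug (hypothetical helper).
std::size_t offset_reversed(std::size_t rows, std::size_t cols,
                            std::size_t j, std::size_t i)
{
  return offset_in_order(cols, rows, j, i);
}

// Small self-check: [9][11] exposes the difference, [9][9] hides it.
void check_offsets()
{
  assert(offset_in_order(9, 11, 3, 1) == 34);
  assert(offset_reversed(9, 11, 3, 1) == 28);
  assert(offset_in_order(9, 9, 3, 1) == offset_reversed(9, 9, 3, 1));
}
```

With a [9][11] shape the two orderings disagree about where element [3][1] lives, whereas with [9][9] they agree, which is consistent with the [X][X] tests not catching the bug.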

Tested with offloading to AMD GCN.  I will apply (to og13) shortly.

2023-07-14  Julian Brown  

gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_arrayshape_type): Reverse dimension
ordering for created array type.

gcc/
* omp-low.cc (lower_omp_target): Reverse iteration over array
dimensions.

libgomp/
* testsuite/libgomp.c-c++-common/array-shaping-14.c: New test.
---
 gcc/fortran/trans-openmp.cc   |  2 +-
 gcc/omp-low.cc|  6 ++--
 .../libgomp.c-c++-common/array-shaping-14.c   | 34 +++
 3 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/array-shaping-14.c

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 6cb5340687e..6b9a0430eba 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -4271,7 +4271,7 @@ gfc_trans_omp_arrayshape_type (tree type, vec *dims)
 {
   gcc_assert (dims->length () > 0);
 
-  for (int i = dims->length () - 1; i >= 0; i--)
+  for (unsigned i = 0; i < dims->length (); i++)
 {
   tree dim = fold_convert (sizetype, (*dims)[i]);
   /* We need the index of the last element, not the array size.  */
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index c7706a5921f..ab2e4145ab2 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -14290,7 +14290,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
  dims++;
}
 
-   int tdim = tdims.length () - 1;
+   unsigned tdim = 0;
 
vec *vdim;
vec *vindex;
@@ -14365,7 +14365,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
nc = nc2;
  }
 
-   if (tdim >= 0)
+   if (tdim < tdims.length ())
  {
/* We have an array shape -- use that to find the
   total size of the data on the target to look up
@@ -14403,7 +14403,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
"for array");
dim = index = len = stride = error_mark_node;
  }
-   tdim--;
+   tdim++;
 
c = nc;
  }
diff --git a/libgomp/testsuite/libgomp.c-c++-common/array-shaping-14.c b/libgomp/testsuite/libgomp.c-c++-common/array-shaping-14.c
new file mode 100644
index 00000000000..4ca6f794f93
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/array-shaping-14.c
@@ -0,0 +1,34 @@
+/* { dg-do run { target offload_device_nonshared_as } } */
+
+#include 
+#include 
+#include 
+
+typedef struct {
+  int *ptr;
+} S;
+
+int main(void)
+{
+  S q;
+  q.ptr = (int *) calloc (9 * 11, sizeof (int));
+
+#pragma omp target enter data map(to: q.ptr, q.ptr[0:9*11])
+
+#pragma omp target
+  for (int i = 0; i < 9*11; i++)
+q.ptr[i] = i;
+
+#pragma omp target update from(([9][11]) q.ptr[3:3:2][1:4:3])
+
+  for (int j = 0; j < 9; j++)
+for (int i = 0; i < 11; i++)
+  if (j >= 3 && j <= 7 && ((j - 3) % 2) == 0
+ && i >= 1 && i <= 10 && ((i - 1) % 3) == 0)
+   assert (q.ptr[j * 11 + i] == j * 11 + i);
+  else
+   assert (q.ptr[j * 11 + i] == 0);
+
+#pragma omp target exit data map(release: q.ptr, q.ptr[0:9*11])
+  return 0;
+}
-- 
2.25.1



[PATCH] [og13] OpenMP: Enable c-c++-common/gomp/declare-mapper-3.c for C

2023-07-14 Thread Julian Brown
This patch enables the c-c++-common/gomp/declare-mapper-3.c test for C.
This was seemingly overlooked in commit 393fd99c90e.

Tested with offloading to AMD GCN.  I will apply (to og13) shortly.

2023-07-14  Julian Brown  

gcc/testsuite/
* c-c++-common/gomp/declare-mapper-3.c: Enable for C.
---
 gcc/testsuite/c-c++-common/gomp/declare-mapper-3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/c-c++-common/gomp/declare-mapper-3.c b/gcc/testsuite/c-c++-common/gomp/declare-mapper-3.c
index 983d979d68c..e491bcd0ce6 100644
--- a/gcc/testsuite/c-c++-common/gomp/declare-mapper-3.c
+++ b/gcc/testsuite/c-c++-common/gomp/declare-mapper-3.c
@@ -1,4 +1,4 @@
-// { dg-do compile { target c++ } }
+// { dg-do compile }
 // { dg-additional-options "-fdump-tree-gimple" }
 
 #include 
-- 
2.25.1



RE: [PATCH 1/6] arm: [MVE intrinsics] Factorize vcaddq vhcaddq

2023-07-14 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Christophe Lyon 
> Sent: Thursday, July 13, 2023 11:22 AM
> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
> Richard Earnshaw ; Richard Sandiford
> 
> Cc: Christophe Lyon 
> Subject: [PATCH 1/6] arm: [MVE intrinsics] Factorize vcaddq vhcaddq
> 
> Factorize vcaddq, vhcaddq so that they use the same parameterized
> names.
> 
> To be able to use the same patterns, we add a suffix to vcaddq.
> 
> Note that vcadd uses UNSPEC_VCADDxx for builtins without predication,
> and VCADDQ_ROTxx_M_x (that is, not starting with "UNSPEC_").  The
> UNSPEC_* names are also used by neon.md.

Thanks for working on this.
The series is ok.
Kyrill

> 
> 2023-07-13  Christophe Lyon  
> 
>   gcc/
>   * config/arm/arm_mve_builtins.def (vcaddq_rot90_, vcaddq_rot270_)
>   (vcaddq_rot90_f, vcaddq_rot270_f): Add "_" or "_f" suffix.
>   * config/arm/iterators.md (mve_insn): Add vcadd, vhcadd.
>   (isu): Add UNSPEC_VCADD90, UNSPEC_VCADD270,
> VCADDQ_ROT270_M_U,
>   VCADDQ_ROT270_M_S, VCADDQ_ROT90_M_U,
> VCADDQ_ROT90_M_S,
>   VHCADDQ_ROT90_M_S, VHCADDQ_ROT270_M_S,
> VHCADDQ_ROT90_S,
>   VHCADDQ_ROT270_S.
>   (rot): Add VCADDQ_ROT90_M_F, VCADDQ_ROT90_M_S,
> VCADDQ_ROT90_M_U,
>   VCADDQ_ROT270_M_F, VCADDQ_ROT270_M_S,
> VCADDQ_ROT270_M_U,
>   VHCADDQ_ROT90_S, VHCADDQ_ROT270_S, VHCADDQ_ROT90_M_S,
>   VHCADDQ_ROT270_M_S.
>   (mve_rot): Add VCADDQ_ROT90_M_F, VCADDQ_ROT90_M_S,
>   VCADDQ_ROT90_M_U, VCADDQ_ROT270_M_F,
> VCADDQ_ROT270_M_S,
>   VCADDQ_ROT270_M_U, VHCADDQ_ROT90_S, VHCADDQ_ROT270_S,
>   VHCADDQ_ROT90_M_S, VHCADDQ_ROT270_M_S.
>   (supf): Add VHCADDQ_ROT90_M_S, VHCADDQ_ROT270_M_S,
>   VHCADDQ_ROT90_S, VHCADDQ_ROT270_S, UNSPEC_VCADD90,
>   UNSPEC_VCADD270.
>   (VCADDQ_ROT270_M): Delete.
>   (VCADDQ_M_F VxCADDQ VxCADDQ_M): New.
>   (VCADDQ_ROT90_M): Delete.
>   * config/arm/mve.md (mve_vcaddq)
>   (mve_vhcaddq_rot270_s, mve_vhcaddq_rot90_s):
> Merge
>   into ...
>   (@mve_q_): ... this.
>   (mve_vcaddq): Rename into ...
>   (@mve_q_f): ... this
>   (mve_vcaddq_rot270_m_)
>   (mve_vcaddq_rot90_m_,
> mve_vhcaddq_rot270_m_s)
>   (mve_vhcaddq_rot90_m_s): Merge into ...
>   (@mve_q_m_): ... this.
>   (mve_vcaddq_rot270_m_f, mve_vcaddq_rot90_m_f):
> Merge
>   into ...
>   (@mve_q_m_f): ... this.
> ---
>  gcc/config/arm/arm_mve_builtins.def |   6 +-
>  gcc/config/arm/iterators.md |  38 +++-
>  gcc/config/arm/mve.md   | 135 +---
>  3 files changed, 62 insertions(+), 117 deletions(-)
> 
> diff --git a/gcc/config/arm/arm_mve_builtins.def
> b/gcc/config/arm/arm_mve_builtins.def
> index 8de765de3b0..63ad1845593 100644
> --- a/gcc/config/arm/arm_mve_builtins.def
> +++ b/gcc/config/arm/arm_mve_builtins.def
> @@ -187,6 +187,10 @@ VAR3 (BINOP_NONE_NONE_NONE, vmaxvq_s, v16qi,
> v8hi, v4si)
>  VAR3 (BINOP_NONE_NONE_NONE, vmaxq_s, v16qi, v8hi, v4si)
>  VAR3 (BINOP_NONE_NONE_NONE, vhsubq_s, v16qi, v8hi, v4si)
>  VAR3 (BINOP_NONE_NONE_NONE, vhsubq_n_s, v16qi, v8hi, v4si)
> +VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot90_, v16qi, v8hi, v4si)
> +VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot270_, v16qi, v8hi, v4si)
> +VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot90_f, v8hf, v4sf)
> +VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot270_f, v8hf, v4sf)
>  VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot90_s, v16qi, v8hi, v4si)
>  VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot270_s, v16qi, v8hi, v4si)
>  VAR3 (BINOP_NONE_NONE_NONE, vhaddq_s, v16qi, v8hi, v4si)
> @@ -870,8 +874,6 @@ VAR3
> (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_vec_u, v16qi,
> v8hi, v4si)
>  VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_carry_u,
> v16qi, v8hi, v4si)
> 
>  /* optabs without any suffixes.  */
> -VAR5 (BINOP_NONE_NONE_NONE, vcaddq_rot90, v16qi, v8hi, v4si, v8hf,
> v4sf)
> -VAR5 (BINOP_NONE_NONE_NONE, vcaddq_rot270, v16qi, v8hi, v4si, v8hf,
> v4sf)
>  VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot90, v8hf, v4sf)
>  VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot270, v8hf, v4sf)
>  VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot180, v8hf, v4sf)
> diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> index 9e77af55d60..da1ead34e58 100644
> --- a/gcc/config/arm/iterators.md
> +++ b/gcc/config/arm/iterators.md
> @@ -902,6 +902,7 @@
>])
> 
>  (define_int_attr mve_insn [
> +  (UNSPEC_VCADD90 "vcadd") (UNSPEC_VCADD270 "vcadd")
>(VABAVQ_P_S "vabav") (VABAVQ_P_U "vabav")
>(VABAVQ_S "vabav") (VABAVQ_U "vabav")
>(VABDQ_M_S "vabd") (VABDQ_M_U "vabd") (VABDQ_M_F
> "vabd")
> @@ -925,6 +926,8 @@
>(VBICQ_N_S "vbic") (VBICQ_N_U "vbic")
>(VBRSRQ_M_N_S "vbrsr") (VBRSRQ_M_N_U "vbrsr")
> (VBRSRQ_M_N_F "vbrsr")
>(VBRSRQ_N_S "vbrsr") (VBRSRQ_N_U "vbrsr") (VBRSRQ_N_F
> "vbrsr")
> +  (VCADDQ_ROT270_M_U "vcadd") (VCADDQ_ROT270_M_S
> "vcadd") (VCADDQ_ROT270_M_F "vcadd")
> +  

[Bug ada/110668] New: gcc/ada/gcc-interface/Make-lang.in:1012: ada/b_gnat1.adb] Error 5

2023-07-14 Thread dclarke at blastwave dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110668

Bug ID: 110668
   Summary: gcc/ada/gcc-interface/Make-lang.in:1012:
ada/b_gnat1.adb] Error 5
   Product: gcc
   Version: 10.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dclarke at blastwave dot org
CC: dkm at gcc dot gnu.org
  Target Milestone: ---

Not sure where this arrived from however here are the particulars :

System is very off the shelf devuan/debian with a lean kernel : 

d# uname -a 
Linux dev 6.4.2-genunix #1 SMP PREEMPT_DYNAMIC Fri Jul  7 18:41:16 UTC 2023
x86_64 GNU/Linux
d# 

Compiler used for the bootstrap :

d# which gcc
/opt/bw/gcc13/bin/gcc
d# 
d# gcc --version 
gcc (GENUNIX Mon Jun 26 19:37:30 UTC 2023) 13.1.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

see https://gcc.gnu.org/pipermail/gcc-testresults/2023-June/788747.html

So I suspect that compiler to be good stuff. With gnat in there also :

d# ls /opt/bw/gcc13/bin/
c++         gfortran   gnatprep
cpp         gnat       lto-dump
g++         gnatbind   x86_64-pc-linux-gnu-c++
gcc         gnatchop   x86_64-pc-linux-gnu-g++
gcc-ar      gnatclean  x86_64-pc-linux-gnu-gcc
gcc-nm      gnatkr     x86_64-pc-linux-gnu-gcc-13.1.0
gcc-ranlib  gnatlink   x86_64-pc-linux-gnu-gcc-ar
gcov        gnatls     x86_64-pc-linux-gnu-gcc-nm
gcov-dump   gnatmake   x86_64-pc-linux-gnu-gcc-ranlib
gcov-tool   gnatname   x86_64-pc-linux-gnu-gfortran
d# 

So imagine my total surprise when configure looks so nice and the whole
compile/bootstrap takes off nicely but then falls over quickly : 

d# 
d# pwd
/opt/bw/build/gcc-10.5.0_linux-6.4.2_x86_64.001
d# date -u
Fri Jul 14 14:14:46 UTC 2023
d# ../gcc-10.5.0/configure --prefix=/opt/imed/gcc10 \
> --enable-languages=c,ada,c++,fortran,objc,obj-c++ \
> --enable-shared --without-included-gettext \
> --enable-threads=posix --enable-bootstrap --with-system-zlib \
> --enable-checking=misc --disable-multilib --disable-nls \
> --enable-__cxa_atexit --enable-tls \
> --with-pkgversion='GENUNIX Fri Jul 14 14:14:46 UTC 2023' 2>&1 | tee 
> ../gcc-10.5.0_linux-6.4.2_x86_64.001.config.log 
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking target system type... x86_64-pc-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether ln works... yes
checking whether ln -s works... yes
checking for a sed that does not truncate output... /bin/sed
checking for gawk... no
checking for mawk... mawk
checking for libatomic support... yes
checking for libitm support... yes
checking for libsanitizer support... yes
checking for libvtv support... yes
checking for libhsail-rt support... yes
checking for libphobos support... yes
checking for gcc... /opt/bw/gcc13/bin/gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether /opt/bw/gcc13/bin/gcc accepts -g... yes
checking for /opt/bw/gcc13/bin/gcc option to accept ISO C89... none needed
checking whether we are using the GNU C++ compiler... yes
checking whether /opt/bw/gcc13/bin/g++ accepts -g... yes
checking whether g++ accepts -static-libstdc++ -static-libgcc... yes
checking for gnatbind... gnatbind
checking for gnatmake... gnatmake
checking whether compiler driver understands Ada... yes
checking how to compare bootstrapped objects... cmp --ignore-initial=16 $$f1 $$f2
checking for objdir... .libs
configure: WARNING: using in-tree isl, disabling version check
The following languages will be built: c,ada,c++,fortran,lto,objc,obj-c++
*** This configuration is not supported in the following subdirectories:
 zlib gotools target-libhsail-rt target-libphobos target-zlib target-libgo
target-libffi target-liboffloadmic
(Any other directories should still work fine.)
checking for default BUILD_CONFIG... bootstrap-debug
checking for --enable-vtable-verify... no
checking for bison... bison -y
checking for bison... bison
checking for gm4... no
checking for gnum4... no
checking for m4... m4
checking for flex... flex
checking for flex... flex
checking for makeinfo... makeinfo
checking for expect... expect
checking for runtest... runtest
checking for ar... ar
checking for as... as
checking for dlltool... no
checking for dsymutil... no
checking for ld... ld
checking for lipo... no
checking for nm... nm
checking for ranlib... ranlib
checking for strip... strip
checking for windres... no
checking for windmc... no
checking for objcopy... objcopy
checking for objdump... 

[WIP RFC] Add support for keyword-based attributes

2023-07-14 Thread Richard Sandiford via Gcc-patches
Summary: We'd like to be able to specify some attributes using
keywords, rather than the traditional __attribute__ or [[...]]
syntax.  Would that be OK?

In more detail:

We'd like to add some new target-specific attributes for Arm SME.
These attributes affect semantics and code generation and so they
can't simply be ignored.

Traditionally we've done this kind of thing by adding GNU attributes,
via TARGET_ATTRIBUTE_TABLE in GCC's case.  The problem is that both
GCC and Clang have traditionally only warned about unrecognised GNU
attributes, rather than raising an error.  Older compilers might
therefore be able to look past some uses of the new attributes and
still produce object code, even though that object code is almost
certainly going to be wrong.  (The compilers will also emit a default-on
warning, but that might go unnoticed when building a big project.)

There are some existing attributes that similarly affect semantics
in ways that cannot be ignored.  vector_size is one obvious example.
But that doesn't make it a good thing. :)

Also, C++ says this for standard [[...]] attributes:

  For an attribute-token (including an attribute-scoped-token)
  not specified in this document, the behavior is implementation-defined;
  any such attribute-token that is not recognized by the implementation
  is ignored.

which doubles down on the idea that attributes should not be used
for necessary semantic information.

One of the attributes we'd like to add provides a new way of compiling
existing code.  The attribute doesn't require SME to be available;
it just says that the code must be compiled so that it can run in either
of two modes.  This is probably the most dangerous attribute of the set,
since compilers that ignore it would just produce normal code.  That
code might work in some test scenarios, but it would fail in others.

The feeling from the Clang community was therefore that these SME
attributes should use keywords instead, so that the keywords trigger
an error with older compilers.

However, it seemed wrong to define new SME-specific grammar rules,
since the underlying problem is pretty generic.  We therefore
proposed having a type of keyword that can appear exactly where
a standard [[...]] attribute can appear and that appertains to
exactly what a standard [[...]] attribute would appertain to.
No divergence or cherry-picking is allowed.

For example:

  [[arm::foo]]

would become:

  __arm_foo

and:

  [[arm::bar(args)]]

would become:

  __arm_bar(args)

It wouldn't be possible to retrofit arguments to a keyword that
previously didn't take arguments, since that could lead to parsing
ambiguities.  So when a keyword is first added, a binding decision
would need to be made whether the keyword always takes arguments
or is always standalone.

For that reason, empty argument lists are allowed for keywords,
even though they're not allowed for [[...]] attributes.

The argument-less version was accepted into Clang, and I have a follow-on
patch for handling arguments.  Would the same thing be OK for GCC,
in both the C and C++ frontends?

The patch below is a proof of concept for the C frontend.  It doesn't
bootstrap due to warnings about uninitialised fields.  And it doesn't
have tests.  But I did test it locally with various combinations of
attribute_spec and it seemed to work as expected.

The impact on the C frontend seems to be pretty small.  It looks like
the impact on the C++ frontend would be a bit bigger, but not much.

The patch contains a logically unrelated change: c-common.h set aside
16 keywords for address spaces, but of the in-tree ports, the maximum
number of keywords used is 6 (for amdgcn).  The patch therefore changes
the limit to 8 and uses 8 keywords for the new attributes.  This keeps
the number of reserved ids <= 256.

A real, non-proof-of-concept patch series would:

- Change the address-space keywords separately, and deal with any fallout.

- Clean up the way that attributes are specified, so that it isn't
  necessary to update all definitions when adding a new field.

- Allow more precise attribute requirements, such as "function decl only".

- Add tests :)

WDYT?  Does this approach look OK in principle, or is it a non-starter?

If it is a non-starter, the fallback would be to predefine macros
that expand to [[...]] or __attribute__.  Having the keywords gives
more precise semantics and better error messages though.

Thanks,
Richard
---
 gcc/attribs.cc| 30 +++-
 gcc/c-family/c-common.h   | 13 ++
 gcc/c/c-parser.cc | 88 +--
 gcc/config/aarch64/aarch64.cc |  1 +
 gcc/tree-core.h   | 19 
 5 files changed, 135 insertions(+), 16 deletions(-)

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index b8cb55b97df..706cd81329c 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -752,6 +752,11 @@ decl_attributes (tree *node, tree attributes, int flags,
 
   if (spec->decl_required && !DECL_P 

RE: [PATCH 2/2] [testsuite, arm]: Make mve_fp_fpu[12].c accept single or double precision FPU

2023-07-14 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Christophe Lyon 
> Sent: Thursday, July 13, 2023 11:22 AM
> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
> Richard Earnshaw 
> Cc: Christophe Lyon 
> Subject: [PATCH 2/2] [testsuite,arm]: Make mve_fp_fpu[12].c accept single or
> double precision FPU
> 
> These tests currently expect a directive containing .fpu fpv5-sp-d16
> and thus may fail if the test is executed for instance with
> -march=armv8.1-m.main+mve.fp+fp.dp
> 
> This patch accepts either fpv5-sp-d16 or fpv5-d16 to avoid the failure.
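The alternation in the new pattern can be sketched as follows. This is only an illustration: it uses `std::regex` (ECMAScript syntax) as a stand-in for the Tcl regular expressions that `scan-assembler` actually uses, and the helper name `mentions_fpv5_fpu` is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if LINE contains an .fpu directive naming either the
// single-precision (fpv5-sp-d16) or the double-precision (fpv5-d16) FPU.
// The group "(-sp|)" matches either "-sp" or the empty string.
bool mentions_fpv5_fpu(const std::string &line)
{
  static const std::regex pattern(R"(\.fpu fpv5(-sp|)-d16)");
  return std::regex_search(line, pattern);
}

// Small self-check with assembler-like lines.
void check_fpu_pattern()
{
  assert(mentions_fpv5_fpu("\t.fpu fpv5-sp-d16"));
  assert(mentions_fpv5_fpu("\t.fpu fpv5-d16"));
  assert(!mentions_fpv5_fpu("\t.fpu fpv4-sp-d16"));
}
```

The empty second alternative is what lets the same directive check pass for both the +mve.fp and +mve.fp+fp.dp configurations.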
> 

Ok.
Thanks,
Kyrill

> 2023-06-28  Christophe Lyon  
> 
>   gcc/testsuite/
>   * gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: Fix .fpu
>   scan-assembler.
>   * gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c | 2 +-
>  gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> index e375327fb97..8358a616bb5 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> @@ -12,4 +12,4 @@ foo1 (int8x16_t value)
>return b;
>  }
> 
> -/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
> +/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> index 1fca1100cf0..5dd2feefc35 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> @@ -12,4 +12,4 @@ foo1 (int8x16_t value)
>return b;
>  }
> 
> -/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
> +/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */
> --
> 2.34.1



RE: [PATCH 1/2] [testsuite,arm]: Make nomve_fp_1.c require arm_fp

2023-07-14 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Christophe Lyon 
> Sent: Thursday, July 13, 2023 11:22 AM
> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
> Richard Earnshaw 
> Cc: Christophe Lyon 
> Subject: [PATCH 1/2] [testsuite,arm]: Make nomve_fp_1.c require arm_fp
> 
> If GCC is configured with the default (soft) -mfloat-abi, and we don't
> override the target_board test flags appropriately,
> gcc.target/arm/mve/general-c/nomve_fp_1.c fails for lack of
> -mfloat-abi=softfp or -mfloat-abi=hard, because it doesn't use
> dg-add-options arm_v8_1m_mve (on purpose, see comment in the test).
> 
> Require and use the options needed for arm_fp to fix this problem.

Ok.
Thanks,
Kyrill

> 
> 2023-06-28  Christophe Lyon  
> 
>   gcc/testsuite/
>   * gcc.target/arm/mve/general-c/nomve_fp_1.c: Require arm_fp.
> ---
>  gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
> b/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
> index 21c2af16a61..c9d279ead68 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
> @@ -1,9 +1,11 @@
>  /* { dg-do compile } */
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
>  /* Do not use dg-add-options arm_v8_1m_mve, because this might expand
> to "",
> which could imply mve+fp depending on the user settings. We want to
> make
> sure the '+fp' extension is not enabled.  */
>  /* { dg-options "-mfpu=auto -march=armv8.1-m.main+mve" } */
> +/* { dg-add-options arm_fp } */
> 
>  #include 
> 
> --
> 2.34.1



[Bug c++/102854] [OpenMP] Bogus "initializer expression refers to iteration variable" when using templates

2023-07-14 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102854

--- Comment #4 from Tobias Burnus  ---
The testcase of comment 2 seems to be the same as the one of
bug 106449 comment 4, which went in as

r13-1887-g97d32048c04e97  openmp: Fix up handling of non-rectangular simd loops
with pointer type iterators [PR106449]

A slightly modified variant of that testcase went in follow-up commit

r13-1893-g85fe7e7dd1f146  Add libgomp.c-c++-common/pr106449-2.c

* * *

From the bug comments, it seems as if the following remains:

"Non-rectangular loops with class random access iterators remain broken, that
is something to be fixed incrementally."

* * *

However, given the commit for PR106449 (see above) and the earlier commit
r12-4733-g2084b5f42a4432  openmp: Allow non-rectangular loops with pointer
iterators
(committed right after the commit of comment 3):

I wonder whether the only thing that remains to be done is to add a real C++
testcase for random access iterator non-rect loops, assuming that the pointer
patches cover everything that's needed in terms of compiler support.

[Bug tree-optimization/110666] [14 Regression] wrong code at -O1 and above on x86_64-linux-gnu

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110666

--- Comment #2 from Andrew Pinski  ---
I totally messed up the !=1/!=0 cases for the outer eq case:
Patch
diff --git a/gcc/match.pd b/gcc/match.pd
index 351d9285e92..3de30df8b06 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6431,8 +6431,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)

 /* x != (typeof x)(x == CST) -> CST == 0 ? 1 : (CST == 1 ? (x!=0&!=1) : x != 0) */
 /* x != (typeof x)(x != CST) -> CST == 1 ? 1 : (CST == 0 ? (x!=0&!=1) : x != 1) */
-/* x == (typeof x)(x == CST) -> CST == 0 ? 0 : (CST == 1 ? (x==0||x==1) : x != 0) */
-/* x == (typeof x)(x != CST) -> CST == 1 ? 0 : (CST == 0 ? (x==0||x==1) : x != 1) */
+/* x == (typeof x)(x == CST) -> CST == 0 ? 0 : (CST == 1 ? (x==0||x==1) : x == 0) */
+/* x == (typeof x)(x != CST) -> CST == 1 ? 0 : (CST == 0 ? (x==0||x==1) : x == 1) */
 (for outer (ne eq)
  (for inner (ne eq)
   (simplify
@@ -6457,9 +6457,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   )
  )
 )
-(if (innereq)
- (ne @0 { build_zero_cst (TREE_TYPE (@0)); }))
-(ne @0 { build_one_cst (TREE_TYPE (@0)); }))
+(with {
+  tree value = build_int_cst (TREE_TYPE (@0), !innereq);
+ }
+ (if (outereq)
+  (eq @0 { value; })
+  (ne @0 { value; })
+ )
+)
)
   )
  )
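The corrected outer-eq identities from the comments in this patch can be checked by brute force over a small range. The sketch below is only a sanity check on plain `int` values, not part of the patch; the names `lhs`, `rhs` and `identities_hold` are hypothetical.

```cpp
#include <cassert>

// Left-hand side: x == (typeof x)(x == cst) when inner_eq is true,
// or x == (typeof x)(x != cst) when it is false.
bool lhs(int x, int cst, bool inner_eq)
{
  int inner = inner_eq ? (x == cst) : (x != cst);
  return x == inner;
}

// Right-hand side, following the corrected comments for the outer eq case.
bool rhs(int x, int cst, bool inner_eq)
{
  if (inner_eq)
    {
      if (cst == 0) return false;
      if (cst == 1) return x == 0 || x == 1;
      return x == 0;
    }
  if (cst == 1) return false;
  if (cst == 0) return x == 0 || x == 1;
  return x == 1;
}

// Compare both sides for every x and cst in [lo, hi].
bool identities_hold(int lo, int hi)
{
  for (int x = lo; x <= hi; ++x)
    for (int cst = lo; cst <= hi; ++cst)
      if (lhs(x, cst, true) != rhs(x, cst, true)
          || lhs(x, cst, false) != rhs(x, cst, false))
        return false;
  return true;
}

void check_identities()
{
  assert(identities_hold(-4, 4));
}
```

For example, with x = 5 and CST = 2 the left-hand side `5 == (5 == 2)` is false, which agrees with the corrected `x == 0` fallback and not with the earlier `x != 0`.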

[Bug middle-end/110667] gcc-14, ICE: internal compiler error: in replace_reg, at reg-stack.cc:722

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110667

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-07-14
 Depends on||69899
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
Confirmed. Every GCC version I can get my hands on ICEs (even though most hide it
with "confused by earlier errors, bailing out").


`-O2 -ffinite-math-only` is enough to reproduce the ICE, so I almost think this
is a dup of bug 69899.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69899
[Bug 69899] gcc ICE on invalid code on x86_64-linux-gnu in "replace_reg"

[Bug rtl-optimization/110206] [14 Regression] wrong code with -Os -march=cascadelake since r14-1246

2023-07-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110206

--- Comment #17 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:1815e313a8fb519a77c94a908eb6dafc4ce51ffe

commit r14-2525-g1815e313a8fb519a77c94a908eb6dafc4ce51ffe
Author: Uros Bizjak 
Date:   Fri Jul 14 11:46:22 2023 +0200

cprop: Do not set REG_EQUAL note when simplifying paradoxical subreg
[PR110206]

The cprop1 pass does not consider paradoxical subregs and, for (insn 22), claims
that it equals 8 elements of HImode by setting a REG_EQUAL note:

(insn 21 19 22 4 (set (reg:V4QI 98)
        (mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2]) [0  S4 A32]))
"pr110206.c":12:42 1530 {*movv4qi_internal}
     (expr_list:REG_EQUAL (const_vector:V4QI [
                (const_int -52 [0xffffffffffffffcc]) repeated x4
            ])
        (nil)))
(insn 22 21 23 4 (set (reg:V8HI 100)
        (zero_extend:V8HI (vec_select:V8QI (subreg:V16QI (reg:V4QI 98) 0)
                (parallel [
                        (const_int 0 [0])
                        (const_int 1 [0x1])
                        (const_int 2 [0x2])
                        (const_int 3 [0x3])
                        (const_int 4 [0x4])
                        (const_int 5 [0x5])
                        (const_int 6 [0x6])
                        (const_int 7 [0x7])
                    ])))) "pr110206.c":12:42 7471
{sse4_1_zero_extendv8qiv8hi2}
     (expr_list:REG_EQUAL (const_vector:V8HI [
                (const_int 204 [0xcc]) repeated x8
            ])
        (expr_list:REG_DEAD (reg:V4QI 98)
            (nil))))

We rely on the "undefined" vals to have a specific value (from the earlier
REG_EQUAL note) but actual code generation doesn't ensure this (it doesn't
need to).  That said, the issue isn't the constant folding per-se but that
we do not actually constant fold but register an equality that doesn't
hold.

PR target/110206

gcc/ChangeLog:

* fwprop.cc (contains_paradoxical_subreg_p): Move to ...
* rtlanal.cc (contains_paradoxical_subreg_p): ... here.
* rtlanal.h (contains_paradoxical_subreg_p): Add prototype.
* cprop.cc (try_replace_reg): Do not set REG_EQUAL note
when the original source contains a paradoxical subreg.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110206.c: New test.

Re: [PATCH v3 1/3] c++: Track lifetimes in constant evaluation [PR70331,PR96630,PR98675]

2023-07-14 Thread Jason Merrill via Gcc-patches

On 6/30/23 23:28, Nathaniel Shead via Gcc-patches wrote:

This adds rudimentary lifetime tracking in C++ constexpr contexts,


Thanks!

I'm not seeing either a copyright assignment or DCO certification for 
you; please see https://gcc.gnu.org/contribute.html#legal for more 
information.



diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index cca0435bafc..bc59b4aab67 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -1188,7 +1190,12 @@ public:
  if (!already_in_map && modifiable)
modifiable->add (t);
}
-  void remove_value (tree t) { values.remove (t); }
+  void remove_value (tree t)
+  {
+if (DECL_P (t))
+  outside_lifetime.add (t);
+values.remove (t);


What if, instead of removing the variable from one hash table and adding 
it to another, we change the value to, say, void_node?



+ /* Also don't cache a call if we return a pointer to an expired
+value.  */
+ if (cacheable && (cp_walk_tree_without_duplicates
+   (, find_expired_values,
+>global->outside_lifetime)))
+   cacheable = false;


I think we need to reconsider cacheability in general; I think we only 
want to cache calls that are themselves valid constant expressions, in 
that the return value is a "permitted result of a constant expression" 
(https://eel.is/c++draft/expr.const#13).  A pointer to an automatic 
variable is not, whether or not it is currently within its lifetime.


That is, only cacheable if reduced_constant_expression_p (result).

I'm experimenting with this now, you don't need to mess with it.
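To illustrate the "permitted result" distinction mentioned above, here is a minimal sketch. It is not taken from the patch, and the names `global_value` and `address_of_static` are hypothetical.

```cpp
#include <cassert>

constexpr int global_value = 42;

// Returns a pointer to an object with static storage duration.  Such a
// pointer is a permitted result of a constant expression.
constexpr const int *address_of_static() { return &global_value; }

constexpr const int *p = address_of_static();
static_assert(*p == 42, "pointer to a static object is usable at compile time");

// By contrast, a pointer to an automatic variable is not a permitted result,
// so an initialization like the one below is expected to be rejected:
//
//   constexpr const int *dangling() { int local = 0; return &local; }
//   constexpr const int *q = dangling();

void check_static_pointer()
{
  assert(*p == 42);
  assert(address_of_static() == &global_value);
}
```

Under the rule sketched in the reply, only a call like `address_of_static()` would be a candidate for caching.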


@@ -7085,7 +7138,7 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
  case PARM_DECL:
if (lval && !TYPE_REF_P (TREE_TYPE (t)))
/* glvalue use.  */;
-  else if (tree v = ctx->global->get_value (r))
+  else if (tree v = ctx->global->get_value (t))


I agree with this change, but it doesn't have any actual effect, right? 
I'll go ahead and apply it separately.



@@ -7328,17 +7386,28 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, 
tree t,
auto_vec cleanups;
vec *prev_cleanups = ctx->global->cleanups;
ctx->global->cleanups = 
-   r = cxx_eval_constant_expression (ctx, TREE_OPERAND (t, 0),
+
+   auto_vec save_exprs;


Now that we're going to track temporaries for each full-expression, I 
think we shouldn't also need to track them for loops and calls.


Jason



[PATCH v2] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-14 Thread Robin Dapp via Gcc-patches
>>> Can you add testcases?  Also the current restriction is because
>>> the variants you add are not always correct and I don't see any
>>> checks that the intermediate type doesn't lose significant bits?

I didn't manage to create one for aarch64 nor for x86 because AVX512
has direct conversions e.g. for int64_t -> _Float16 and the new code
will not be triggered.  Instead I added two separate RISC-V tests.

The attached V2 always checks trapping_math when converting float
to integer and, like the NARROW_DST case, checks if the operand fits
the intermediate type when demoting from int to float.

Would that be sufficient?
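
As a rough illustration of the "operand fits the intermediate type" check,
the scalar sketch below converts an int64_t to a floating-point value both
directly and through a narrower signed intermediate.  It uses float and
int32_t as stand-ins for _Float16 and the vectorizer's intermediate mode,
and all function names are made up for this example; it is not the code in
vectorizable_conversion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if X is representable in the 32-bit
   intermediate type.  */
bool fits_int32 (int64_t x)
{
  return x >= INT32_MIN && x <= INT32_MAX;
}

/* Direct conversion: the semantics of the original scalar loop.  */
float convert_direct (int64_t x)
{
  return (float) x;
}

/* Two-step conversion through the narrower intermediate.  The narrowing
   step keeps the value only if X fits, hence the precondition.  */
float convert_via_int32 (int64_t x)
{
  assert (fits_int32 (x));
  int32_t narrowed = (int32_t) x;
  return (float) narrowed;
}
```

When fits_int32 holds, for instance after masking with 0x7fff as in the
new test, both paths convert the same integer value and so agree.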

riscv seems to be the only backend not (yet?) providing pack/unpack
expanders for the vect conversions and rather relying on extend/trunc
which seems a disadvantage now, particularly for the cases requiring
!flag_trapping_math with NONE but not for NARROW_DST.  That might
be reason enough to implement pack/unpack in the backend.

Nevertheless the patch might improve the status quo a bit?

Regards
 Robin


The recent changes that allowed multi-step conversions for
"non-packing/unpacking", i.e. modifier == NONE targets included
promoting to-float and demoting to-int variants.  This patch
adds the missing demoting to-float and promoting to-int handling.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_conversion): Handle
more demotion/promotion for modifier == NONE.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c: 
New test.
* gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c: 
New test.
---
 .../conversions/vec-narrow-int64-float16.c| 12 
 .../conversions/vec-widen-float16-int64.c | 12 
 gcc/tree-vect-stmts.cc| 58 +++
 3 files changed, 71 insertions(+), 11 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c
 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c
new file mode 100644
index 000..ebee1cfa888
--- /dev/null
+++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-narrow-int64-float16.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh 
-mabi=lp64d --param=riscv-autovec-preference=scalable" } */
+
+#include 
+
+void convert (_Float16 *restrict dst, int64_t *restrict a, int n)
+{
+  for (int i = 0; i < n; i++)
+dst[i] = (_Float16) (a[i] & 0x7fff);
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c
 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c
new file mode 100644
index 000..eb0a17e99bc
--- /dev/null
+++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/conversions/vec-widen-float16-int64.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh 
-mabi=lp64d --param=riscv-autovec-preference=scalable -fno-trapping-math" } */
+
+#include 
+
+void convert (int64_t *restrict dst, _Float16 *restrict a, int n)
+{
+  for (int i = 0; i < n; i++)
+dst[i] = (int64_t) a[i];
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index c08d0ef951f..c78a750301d 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5192,29 +5192,65 @@ vectorizable_conversion (vec_info *vinfo,
break;
   }
 
-  /* For conversions between float and smaller integer types try whether we
-can use intermediate signed integer types to support the
+  /* For conversions between float and integer types try whether
+we can use intermediate signed integer types to support the
 conversion.  */
   if ((code == FLOAT_EXPR
-  && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
+  && GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode))
  || (code == FIX_TRUNC_EXPR
- && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
- && !flag_trapping_math))
+ && (GET_MODE_SIZE (rhs_mode) != GET_MODE_SIZE (lhs_mode)
+ && !flag_trapping_math)))
{
+ bool demotion = GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode);
  bool float_expr_p = code == FLOAT_EXPR;
- scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
- fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode);
+ unsigned short target_size;
+ scalar_mode intermediate_mode;
+ if (demotion)
+   {
+ 

vectorizer: Avoid an OOB access from vectorization

2023-07-14 Thread Matthew Malcomson via Gcc-patches
Our checks for whether the vectorization of a given loop would make an
out of bounds access miss the case when the vector we load is so large
as to span multiple iterations worth of data (while only being there to
implement a single iteration).

This patch adds a check for such an access.

Example where this was going wrong (smaller version of testcase added):

```
  extern unsigned short multi_array[5][16][16];
  extern void initialise_s(int *);
  extern int get_sval();

  void foo() {
    int s0 = get_sval();
    int s[31];
    int i,j;
    initialise_s(&s[0]);
    s0 = get_sval();
    for (j=0; j < 16; j++)
      for (i=0; i < 16; i++)
        multi_array[1][j][i]=s[j*2];
  }
```

With the above loop we would load the `s[j*2]` integer into a 4 element
vector, which reads 3 more elements than the scalar loop would.
`get_group_load_store_type` identifies that the loop requires a scalar
epilogue due to gaps.  However we do not identify that the above code
requires *two* scalar loops to be peeled due to the fact that each
iteration loads an amount of data from the *next* iteration (while not
using it).
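
The arithmetic behind the two peeled iterations can be sketched with a small
model.  This is not the logic in `get_group_load_store_type`;
`iterations_to_peel` is a hypothetical helper that assumes iteration j loads
`vec_elems` contiguous elements starting at index `j * stride`.

```c
#include <assert.h>

/* Count the trailing iterations whose vector load would read past the end
   of an array of ARRAY_LEN elements.  Iteration J is assumed to load
   VEC_ELEMS contiguous elements starting at index J * STRIDE.  */
int iterations_to_peel (int array_len, int niters, int stride, int vec_elems)
{
  assert (array_len > 0 && niters > 0 && stride > 0 && vec_elems > 0);

  int peel = 0;
  for (int j = niters - 1; j >= 0; j--)
    {
      /* Index of the last element touched by the vector load.  */
      int last_index = j * stride + vec_elems - 1;
      if (last_index < array_len)
        break;          /* This and all earlier iterations stay in bounds.  */
      peel++;
    }
  return peel;
}
```

For the example above (31 elements, 16 iterations, stride 2, 4-element
vectors) the last load would touch s[30..33] and the one before it
s[28..31], so the model returns 2.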

Bootstrapped and regtested on aarch64-none-linux-gnu.
N.b. out of interest we came across this working with Morello.


### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c 
b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
new file mode 100644
index 
..1b721fd26cab8d5583b153dd6b28c914db870ec3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
@@ -0,0 +1,60 @@
+/* For some targets we end up vectorizing the below loop such that the `sp`
+   single integer is loaded into a 4 integer vector.
+   While the writes are all safe, without 2 scalar loops being peeled into the
+   epilogue we would read past the end of the 31 integer array.  This happens
+   because we load a 4 integer chunk to only use the first integer and
+   increment by 2 integers at a time, hence the last load needs s[30-33] and
+   the penultimate load needs s[28-31].
+   This testcase ensures that we do not crash due to that behaviour.  */
+/* { dg-require-effective-target mmap } */
+#include 
+#include 
+
+#define MMAP_SIZE 0x2
+#define ADDRESS 0x112200
+
+#define MB_BLOCK_SIZE 16
+#define VERT_PRED_16 0
+#define HOR_PRED_16 1
+#define DC_PRED_16 2
+int *sptr;
+extern void intrapred_luma_16x16();
+unsigned short mprr_2[5][16][16];
+void initialise_s(int *s) { }
+int main() {
+void *s_mapping;
+void *end_s;
+s_mapping = mmap ((void *)ADDRESS, MMAP_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+if (s_mapping == MAP_FAILED)
+  {
+   perror ("mmap");
+   return 1;
+  }
+end_s = (s_mapping + MMAP_SIZE);
+sptr = (int*)(end_s - sizeof(int[31]));
+intrapred_luma_16x16(sptr);
+return 0;
+}
+
+void intrapred_luma_16x16(int * restrict sp) {
+for (int j=0; j < MB_BLOCK_SIZE; j++)
+  {
+   mprr_2[VERT_PRED_16][j][0]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][1]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][2]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][3]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][4]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][5]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][6]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][7]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][8]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][9]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][10]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][11]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][12]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][13]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][14]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][15]=sp[j*2];
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 
c08d0ef951fc63adcfffc601917134ddf51ece45..1c8c6784cc7b5f2d327339ff55a5a5ea08835aab
 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2217,7 +2217,9 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info 
stmt_info,
 but the access in the loop doesn't cover the full vector
 we can end up with no gap recorded but still excess
 elements accessed, see PR103116.  Make sure we peel for
-gaps if necessary and sufficient and give up if not.  */
+gaps if necessary and sufficient and give up if not.
+If there is a combination of the access not covering the full 
vector and
+a gap recorded then we may need to peel twice.  */
  if (loop_vinfo
  && *memory_access_type == VMAT_CONTIGUOUS
  && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ()
@@ -2233,7 +2235,7 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info 
stmt_info,
 access excess elements.
 ???  Enhancements include peeling multiple iterations
 or using masked loads with a static mask.  */
- 

[Bug tree-optimization/110666] [14 Regression] wrong code at -O1 and above on x86_64-linux-gnu

2023-07-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110666

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||wrong-code
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot 
gnu.org
   Target Milestone|--- |14.0
   Last reconfirmed||2023-07-14
 Ever confirmed|0   |1
Summary|wrong code at -O1 and above |[14 Regression] wrong code
   |on x86_64-linux-gnu |at -O1 and above on
   ||x86_64-linux-gnu
Version|unknown |14.0
 Status|UNCONFIRMED |ASSIGNED

--- Comment #1 from Andrew Pinski  ---
Mine. Let me look at what I did wrong.

[Bug middle-end/110659] Error from linker: .eh_frame_hdr refers to overlapping FDEs

2023-07-14 Thread townsend at astro dot wisc.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110659

--- Comment #9 from Rich Townsend  ---
OK, I managed to get things working by setting

export LDFLAGS='-Wl,--no-eh-frame-hdr'

prior to configuring. I'm hoping this won't affect the functionality of the
built compiler.

[Bug target/110649] [14 Regression] 25% sphinx3 spec2006 regression on Ice Lake and zen between g:acaa441a98bebc52 (2023-07-06 11:36) and g:55900189ab517906 (2023-07-07 00:23)

2023-07-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110649

--- Comment #4 from Jan Hubicka  ---
We also have PR98782 that is about sphinx being sensitive to LRA decisions. 
Reducing loopback probability might trigger LRA adding a spill to the loop.

[Bug target/110649] [14 Regression] 25% sphinx3 spec2006 regression on Ice Lake and zen between g:acaa441a98bebc52 (2023-07-06 11:36) and g:55900189ab517906 (2023-07-07 00:23)

2023-07-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110649

Jan Hubicka  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=10647
 Ever confirmed|0   |1
   Last reconfirmed||2023-07-14
 Status|UNCONFIRMED |NEW

--- Comment #3 from Jan Hubicka  ---
Thanks for bisecting this! We also have PR10647 which is tracked to this
change.
The change corrects the loop profile after header copying:

test()
{
for (int i = 0; i < 10; i++)
test2();
}

has probability of exit conditional 90.9% before loop header copying (since it
technically iterates 10 times) while after loop header copying and optimizing
out the constant "if (0<10)" test it has only 90% loopback probability.

So probably fixing the bug above triggers something else.
I will first look at PR10647 and see if I can figure out what is going on
there.
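
The two percentages follow from counting how often the exit test runs and
how often it continues the loop.  A minimal sketch of that arithmetic, with
hypothetical function names and assuming a loop with a known iteration
count:

```c
#include <assert.h>

/* Exit test in the loop header: it runs NITER + 1 times and continues
   into the body NITER times.  */
double continue_prob_header_test (int niter)
{
  assert (niter > 0);
  return (double) niter / (niter + 1);
}

/* After header copying (with the first, constant test optimized out) the
   test runs at the bottom of the loop: NITER times in total, of which
   NITER - 1 branch back.  */
double loopback_prob_after_copy (int niter)
{
  assert (niter > 0);
  return (double) (niter - 1) / niter;
}
```

With niter = 10 the first gives 10/11 (about 90.9%) and the second 9/10
(90%).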

[Bug target/54089] [SH] Refactor shift patterns

2023-07-14 Thread klepikov.alex+bugs at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

Alexander Klepikov  changed:

   What|Removed |Added

  Attachment #55503|0   |1
is obsolete||
  Attachment #55513|0   |1
is obsolete||

--- Comment #102 from Alexander Klepikov  
---
Created attachment 55543
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55543=edit
Arithmetic right shift late expanding v2

Here's the patch. I hope I did not miss anything.

Now considering regexp. I remade it using 'check-function-bodies' command and
now it looks less confusing. I also found that in this testcase right shift
expands to 'shad' instructions early on platforms that have dynamic shift
support, so I deleted checks for those CPUs.

I don't like that 'check-function-bodies' ignores asm labels but it's better
than nothing.

Turn TODO_rebuild_frequencies to a pass

2023-07-14 Thread Jan Hubicka via Gcc-patches
Hi,
currently we rebuild profile_counts from profile_probability after inlining,
because there is a chance that producing large loop nests may yield
unrealistically large profile_count values.  This is much less of a concern
since we switched to the new profile_count representation a while back.

This propagation can also compensate for profile inconsistencies caused by
optimization passes.  Since the inliner is followed by basic cleanup passes
that do not use the profile, we get a more realistic profile by delaying the
recomputation until the basic optimizations exposed by inlining are finished.

This does not fit into the TODO machinery, so I turn rebuilding into a
standalone pass and schedule it before the first consumer of the profile in
the optimization queue.

I also added logic that avoids repropagating when the CFG is good and not too
close to overflow.  Propagating visits every basic block loop_depth times, so
it is not linear and avoiding it may help a bit.

On tramp3d we get 14 functions repropagated and 916 are OK.  The repropagated
functions are RB tree ones where we produce crazy loop nests by recursive
inlining.  This is something to fix independently.

Bootstrapped/regtested x86_64-linux.  Plan to commit it later today
if there are no complaints.

Honza

gcc/ChangeLog:

* passes.cc (execute_function_todo): Remove
TODO_rebuild_frequencies
* passes.def: Add rebuild_frequencies pass.
* predict.cc (estimate_bb_frequencies): Drop
force parameter.
(tree_estimate_probability): Update call of
estimate_bb_frequencies.
(rebuild_frequencies): Turn into a pass; verify CFG profile consistency
first and do not rebuild if not necessary.
(class pass_rebuild_frequencies): New.
(make_pass_rebuild_frequencies): New.
* profile-count.h: Add profile_count::very_large_p.
* tree-inline.cc (optimize_inline_calls): Do not return
TODO_rebuild_frequencies
* tree-pass.h (TODO_rebuild_frequencies): Remove.
(make_pass_rebuild_frequencies): Declare.

diff --git a/gcc/passes.cc b/gcc/passes.cc
index 2f0e378b8b2..d7b0ad271a1 100644
--- a/gcc/passes.cc
+++ b/gcc/passes.cc
@@ -2075,9 +2075,6 @@ execute_function_todo (function *fn, void *data)
   if (flags & TODO_remove_unused_locals)
 remove_unused_locals ();
 
-  if (flags & TODO_rebuild_frequencies)
-rebuild_frequencies ();
-
   if (flags & TODO_rebuild_cgraph_edges)
 cgraph_edge::rebuild_edges ();
 
diff --git a/gcc/passes.def b/gcc/passes.def
index faa5208b26b..f2893ae8a8b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -206,6 +206,10 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_post_ipa_warn);
   /* Must run before loop unrolling.  */
   NEXT_PASS (pass_warn_access, /*early=*/true);
+  /* Profile count may overflow as a result of inlinining very large
+ loop nests.  This pass should run before any late pass that makes
+use of profile.  */
+  NEXT_PASS (pass_rebuild_frequencies);
   NEXT_PASS (pass_complete_unrolli);
   NEXT_PASS (pass_backprop);
   NEXT_PASS (pass_phiprop);
@@ -395,6 +399,10 @@ along with GCC; see the file COPYING3.  If not see
  to forward object-size and builtin folding results properly.  */
   NEXT_PASS (pass_copy_prop);
   NEXT_PASS (pass_dce);
+  /* Profile count may overflow as a result of inlinining very large
+ loop nests.  This pass should run before any late pass that makes
+use of profile.  */
+  NEXT_PASS (pass_rebuild_frequencies);
   NEXT_PASS (pass_sancov);
   NEXT_PASS (pass_asan);
   NEXT_PASS (pass_tsan);
diff --git a/gcc/predict.cc b/gcc/predict.cc
index 1aa4c25eb70..26f9f3f6a88 100644
--- a/gcc/predict.cc
+++ b/gcc/predict.cc
@@ -89,7 +89,7 @@ static void predict_paths_leading_to_edge (edge, enum 
br_predictor,
 static bool can_predict_insn_p (const rtx_insn *);
 static HOST_WIDE_INT get_predictor_value (br_predictor, HOST_WIDE_INT);
 static void determine_unlikely_bbs ();
-static void estimate_bb_frequencies (bool force);
+static void estimate_bb_frequencies ();
 
 /* Information we hold about each branch predictor.
Filled using information from predict.def.  */
@@ -3169,8 +3169,9 @@ tree_estimate_probability (bool dry_run)
   delete bb_predictions;
   bb_predictions = NULL;
 
-  if (!dry_run)
-estimate_bb_frequencies (false);
+  if (!dry_run
+  && profile_status_for_fn (cfun) != PROFILE_READ)
+estimate_bb_frequencies ();
   free_dominance_info (CDI_POST_DOMINATORS);
   remove_fake_exit_edges ();
 }
@@ -3923,103 +3924,97 @@ determine_unlikely_bbs ()
 }
 
 /* Estimate and propagate basic block frequencies using the given branch
-   probabilities.  If FORCE is true, the frequencies are used to estimate
-   the counts even when there are already non-zero profile counts.  */
+   probabilities.  */
 
 static void
-estimate_bb_frequencies (bool force)
+estimate_bb_frequencies ()
 {
   

[Bug c/110667] New: gcc-14, ICE: internal compiler error: in replace_reg, at reg-stack.cc:722

2023-07-14 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110667

Bug ID: 110667
   Summary: gcc-14, ICE: internal compiler error: in replace_reg,
at reg-stack.cc:722
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 141242068 at smail dot nju.edu.cn
  Target Milestone: ---

When compiling the program below with `gcc-14 -Ofast a.c`,
gcc-14 crashes:
```
#include 
#include 

void f() {
  long double res = 0;
  asm("" : "="(res) : "f"(40.), "f"(2.));
  assert(res == 42.);
}
```

GCC's output is pasted below:
```
: In function 'f':
:6:3: error: output constraint 0 must specify a single register
6 |   asm("" : "="(res) : "f"(40.), "f"(2.));
  |   ^~~
during RTL pass: stack
:8:1: internal compiler error: in replace_reg, at reg-stack.cc:722
8 | }
  | ^
0x214e13e internal_error(char const*, ...)
???:0
0x9cd8e8 fancy_abort(char const*, int, char const*)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
```

This can be verified by visiting the Compiler Explorer:
https://gcc.godbolt.org/z/cbYGeaze8

Re: [PATCH V4] Optimize '(X - N * M) / N' to 'X / N - M' if valid

2023-07-14 Thread Aldy Hernandez via Gcc-patches




On 7/14/23 15:37, Richard Biener wrote:

On Fri, 14 Jul 2023, Aldy Hernandez wrote:


I don't know what you're trying to accomplish here, as I haven't been
following the PR, but adding all these helper functions to the ranger header
file seems wrong, especially since there's only one use of them. I see you're
tweaking the irange API, adding helper functions to range-op (which is only
for code dealing with implementing range operators for tree codes), etc etc.

If you need these helper functions, I suggest you put them closer to their
uses (i.e. wherever the match.pd support machinery goes).


Note I suggested the opposite because I thought these kind of helpers
are closer to value-range support than to match.pd.


Oh sorry, I missed that.



But I take away from your answer that there's nothing close in the
value-range machinery that answers the question whether A op B may
overflow?


Not currently.

I vaguely recall we talked about some mechanism for doing range 
operations in a wider precision and comparing them with the result of 
doing it in the natural precision, and if the results differ, it must 
have overflowed.
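
A scalar sketch of that idea, assuming 32-bit signed operands widened to
64 bits.  The enum and function name are hypothetical, and this works on
single values rather than on irange objects; for ranges the same test
would have to hold for every value the operands can take.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum arith_op { OP_PLUS, OP_MINUS, OP_MULT };

/* Return true if "A op B" does not fit in int32_t.  The operation is
   carried out in int64_t, where none of the three operations can overflow
   for 32-bit inputs, and the wide result is compared against the range of
   the natural precision.  */
bool op_overflows_int32 (enum arith_op op, int32_t a, int32_t b)
{
  int64_t wide;

  switch (op)
    {
    case OP_PLUS:
      wide = (int64_t) a + b;
      break;
    case OP_MINUS:
      wide = (int64_t) a - b;
      break;
    case OP_MULT:
      wide = (int64_t) a * b;
      break;
    default:
      assert (!"unknown operation");
      return true;      /* Be conservative if asserts are disabled.  */
    }

  /* The wide and natural-precision results differ exactly when the wide
     result lies outside the int32_t range.  */
  return wide < INT32_MIN || wide > INT32_MAX;
}
```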


*hunts down PR*

Comment 23 here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100499#c23

Would something like that work?

I would prefer something more general, rather than having to re-invent 
every range-op entry to check for overflow.


Aldy



[Bug tree-optimization/110666] New: wrong code at -O1 and above on x86_64-linux-gnu

2023-07-14 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110666

Bug ID: 110666
   Summary: wrong code at -O1 and above on x86_64-linux-gnu
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

This appears to be a recent regression.

Compiler Explorer: https://godbolt.org/z/xG14vrs6K

[501] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/home/suz/suz-local/software/local/gcc-trunk/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk
--enable-sanitizers --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230714 (experimental) (GCC) 
[502] % 
[502] % gcctk -O0 small.c; ./a.out
[503] % gcctk -O1 small.c; ./a.out
Aborted
[504] % cat small.c
int a;
int main() {
  if ((a != 2) == a)
__builtin_abort();
  return 0;
}
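
Evaluating the condition by hand shows which result is correct.  Since `a`
has static storage duration it starts at 0, so `(a != 2)` is 1 and the
comparison `1 == 0` is false; the -O0 behaviour (no abort) is the expected
one.  A small sketch with hypothetical helper names that spells this out:

```c
#include <assert.h>

/* The condition from small.c, evaluated step by step.  */
int condition (int a)
{
  int not_two = (a != 2);   /* Always 0 or 1.  */
  return not_two == a;      /* True only when a == 1.  */
}

/* Expected values, written as assertions.  */
void verify_expected_values (void)
{
  assert (condition (0) == 0);   /* The case exercised by small.c.  */
  assert (condition (1) == 1);
  assert (condition (2) == 0);
  assert (condition (3) == 0);
}
```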

Re: [PATCH V4] Optimize '(X - N * M) / N' to 'X / N - M' if valid

2023-07-14 Thread Richard Biener via Gcc-patches
On Fri, 14 Jul 2023, Aldy Hernandez wrote:

> I don't know what you're trying to accomplish here, as I haven't been
> following the PR, but adding all these helper functions to the ranger header
> file seems wrong, especially since there's only one use of them. I see you're
> tweaking the irange API, adding helper functions to range-op (which is only
> for code dealing with implementing range operators for tree codes), etc etc.
> 
> If you need these helper functions, I suggest you put them closer to their
> uses (i.e. wherever the match.pd support machinery goes).

Note I suggested the opposite because I thought these kind of helpers
are closer to value-range support than to match.pd.

But I take away from your answer that there's nothing close in the 
value-range machinery that answers the question whether A op B may
overflow?

Richard.
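
For reference, the two forms of the expression from the patch description
quoted below can be compared directly on small values.  The function names
are hypothetical, and the sketch assumes plain int operands with no
overflow in n * m or in the subtraction.

```c
#include <assert.h>

/* The expression as written in the source program.  */
int original_form (int x, int n, int m)
{
  assert (n != 0);
  return (x - n * m) / n;
}

/* The proposed simplification.  */
int simplified_form (int x, int n, int m)
{
  assert (n != 0);
  return x / n - m;
}
```

With x = 7, n = 2, m = 1 both forms give 2.  With x = 1, n = 2, m = 1 the
sign of x - n * m differs from the sign of x, and because C division
truncates toward zero the forms give 0 and -1 respectively, which is why
the same-sign condition is needed.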

> Aldy
> 
> On 7/11/23 11:04, Jiufu Guo wrote:
> > Hi,
> > 
> > Integer expression "(X - N * M) / N" can be optimized to "X / N - M"
> > if there is no wrap/overflow/underflow and "X - N * M" has the same
> > sign with "X".
> > 
> > Compare the previous version:
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623028.html
> > - The APIs for checking overflow of range operation are moved to
> > other files: range-op and gimple-range.
> > - Improve the patterns with '(X + C)' for unsigned type.
> > 
> > Bootstrap & regtest pass on ppc64{,le} and x86_64.
> > Is this patch ok for trunk?
> > 
> > BR,
> > Jeff (Jiufu Guo)
> > 
> > 
> >  PR tree-optimization/108757
> > 
> > gcc/ChangeLog:
> > 
> >  * gimple-range.cc (arith_without_overflow_p): New function.
> >  (same_sign_p): New function.
> >  * gimple-range.h (arith_without_overflow_p): New declare.
> >  (same_sign_p): New declare.
> >  * match.pd ((X - N * M) / N): New pattern.
> >  ((X + N * M) / N): New pattern.
> >  ((X + C) div_rshift N): New pattern.
> >  * range-op.cc (plus_without_overflow_p): New function.
> >  (minus_without_overflow_p): New function.
> >  (mult_without_overflow_p): New function.
> >  * range-op.h (plus_without_overflow_p): New declare.
> >  (minus_without_overflow_p): New declare.
> >  (mult_without_overflow_p): New declare.
> >  * value-query.h (get_range): New function
> >  * value-range.cc (irange::nonnegative_p): New function.
> >  (irange::nonpositive_p): New function.
> >  * value-range.h (irange::nonnegative_p): New declare.
> >  (irange::nonpositive_p): New declare.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> >  * gcc.dg/pr108757-1.c: New test.
> >  * gcc.dg/pr108757-2.c: New test.
> >  * gcc.dg/pr108757.h: New test.
> > 
> > ---
> >   gcc/gimple-range.cc   |  50 +++
> >   gcc/gimple-range.h|   2 +
> >   gcc/match.pd  |  64 
> >   gcc/range-op.cc   |  77 ++
> >   gcc/range-op.h|   4 +
> >   gcc/value-query.h |  10 ++
> >   gcc/value-range.cc|  12 ++
> >   gcc/value-range.h |   2 +
> >   gcc/testsuite/gcc.dg/pr108757-1.c |  18 +++
> >   gcc/testsuite/gcc.dg/pr108757-2.c |  19 +++
> >   gcc/testsuite/gcc.dg/pr108757.h   | 233 ++
> >   11 files changed, 491 insertions(+)
> >   create mode 100644 gcc/testsuite/gcc.dg/pr108757-1.c
> >   create mode 100644 gcc/testsuite/gcc.dg/pr108757-2.c
> >   create mode 100644 gcc/testsuite/gcc.dg/pr108757.h
> > 
> > diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
> > index
> > 01e62d3ff3901143bde33dc73c0debf41d0c0fdd..620fe32e85e5fe3847a933554fc656b2939cf02d
> > 100644
> > --- a/gcc/gimple-range.cc
> > +++ b/gcc/gimple-range.cc
> > @@ -926,3 +926,53 @@ assume_query::dump (FILE *f)
> >   }
> > fprintf (f, "--\n");
> >   }
> > +
> > +/* Return true if the operation "X CODE Y" in type does not overflow
> > +   underflow or wrap with value range info, otherwise return false.  */
> > +
> > +bool
> > +arith_without_overflow_p (tree_code code, tree x, tree y, tree type)
> > +{
> > +  gcc_assert (INTEGRAL_TYPE_P (type));
> > +
> > +  if (TYPE_OVERFLOW_UNDEFINED (type))
> > +return true;
> > +
> > +  value_range vr0;
> > +  value_range vr1;
> > +  if (!(get_range (vr0, x) && get_range (vr1, y)))
> > +return false;
> > +
> > +  switch (code)
> > +{
> > +case PLUS_EXPR:
> > +  return plus_without_overflow_p (vr0, vr1, type);
> > +case MINUS_EXPR:
> > +  return minus_without_overflow_p (vr0, vr1, type);
> > +case MULT_EXPR:
> > +  return mult_without_overflow_p (vr0, vr1, type);
> > +default:
> > +  gcc_unreachable ();
> > +}
> > +
> > +  return false;
> > +}
> > +
> > +/* Return true if "X" and "Y" have the same sign or zero.  */
> > +
> > +bool
> > +same_sign_p (tree x, tree y, tree type)
> > +{
> > +  gcc_assert (INTEGRAL_TYPE_P (type));
> > +
> > +  if (TYPE_UNSIGNED (type))
> > +return true;
> > +
> > +  value_range vr0;
> > +  value_range vr1;
> > +  if (!(get_range (vr0, x) && get_range 

RE: [PATCH 12/19]middle-end: implement loop peeling and IV updates for early break.

2023-07-14 Thread Richard Biener via Gcc-patches
On Thu, 13 Jul 2023, Tamar Christina wrote:

> > -Original Message-
> > From: Richard Biener 
> > Sent: Thursday, July 13, 2023 6:31 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> > Subject: Re: [PATCH 12/19]middle-end: implement loop peeling and IV
> > updates for early break.
> > 
> > On Wed, 28 Jun 2023, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > This patch updates the peeling code to maintain LCSSA during peeling.
> > > The rewrite also naturally takes into account multiple exits and so it
> > > didn't make sense to split them off.
> > >
> > > For the purposes of peeling the only change for multiple exits is that the
> > > secondary exits are all wired to the start of the new loop preheader when
> > > doing epilogue peeling.
> > >
> > > When doing prologue peeling the CFG is kept intact.
> > >
> > > For both epilogue and prologue peeling we wire through between the two
> > > loops any PHI nodes that escape the first loop into the second loop if
> > > flow_loops is specified.  The reason for this conditionality is because
> > > slpeel_tree_duplicate_loop_to_edge_cfg is used in the compiler in 3 ways:
> > >   - prologue peeling
> > >   - epilogue peeling
> > >   - loop distribution
> > >
> > > for the last case the loops should remain independent, and so not be
> > > connected.  Because of this propagation of only used phi nodes
> > > get_current_def can be used to easily find the previous definitions.
> > > However live statements that are not used inside the loop itself are not
> > > propagated (since if unused, the moment we add the guard in between the
> > > two loops the value across the bypass edge can be wrong if the loop has
> > > been peeled.)
> > >
> > > This is dealt with easily enough in find_guard_arg.
> > >
> > > For multiple exits, while we are in LCSSA form, and have a correct DOM
> > > tree, the moment we add the guard block we will change the dominators
> > > again.  To deal with this slpeel_tree_duplicate_loop_to_edge_cfg can
> > > optionally return the blocks to update without having to recompute the
> > > list of blocks to update again.
> > >
> > > When multiple exits and doing epilogue peeling we will also temporarily
> > > have an incorrect VUSES chain for the secondary exits as it anticipates
> > > the final result after the VDEFs have been moved.  This will thus be
> > > corrected once the code motion is applied.
> > >
> > > Lastly by doing things this way we can remove the helper functions that
> > > previously did lock step iterations to update things as it went along.
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > 
> > Not sure if I get through all of this in one go - so be prepared that
> > the rest of the review follows another day.
> 
> No worries, I appreciate the reviews!
> Just giving some quick replies for when you continue.

Continuing.

> > 
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * tree-loop-distribution.cc (copy_loop_before): Pass flow_loops =
> > >   false.
> > >   * tree-ssa-loop-niter.cc (loop_only_exit_p):  Fix bug when exit==null.
> > >   * tree-vect-loop-manip.cc (adjust_phi_and_debug_stmts): Add
> > >   additional assert.
> > >   (vect_set_loop_condition_normal): Skip modifying loop IV for multiple
> > >   exits.
> > >   (slpeel_tree_duplicate_loop_to_edge_cfg): Support multiple exit
> > >   peeling.
> > >   (slpeel_can_duplicate_loop_p): Likewise.
> > >   (vect_update_ivs_after_vectorizer): Don't enter this...
> > >   (vect_update_ivs_after_early_break): ...but instead enter here.
> > >   (find_guard_arg): Update for new peeling code.
> > >   (slpeel_update_phi_nodes_for_loops): Remove.
> > >   (slpeel_update_phi_nodes_for_guard2): Remove hardcoded edge 0
> > >   checks.
> > >   (slpeel_update_phi_nodes_for_lcssa): Remove.
> > >   (vect_do_peeling): Fix VF for multiple exits and force epilogue.
> > >   * tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Initialize
> > >   non_break_control_flow and early_breaks.
> > >   (vect_need_peeling_or_partial_vectors_p): Force partial vector if
> > >   multiple exits and VLA.
> > >   (vect_analyze_loop_form): Support inner loop multiple exits.
> > >   (vect_create_loop_vinfo): Set LOOP_VINFO_EARLY_BREAKS.
> > >   (vect_create_epilog_for_reduction):  Update live phi nodes.
> > >   (vectorizable_live_operation): Ignore live operations in vector loop
> > >   when multiple exits.
> > >   (vect_transform_loop): Force unrolling for VF loops and multiple exits.
> > >   * tree-vect-stmts.cc (vect_stmt_relevant_p): Analyze ctrl statements.
> > >   (vect_mark_stmts_to_be_vectorized): Check for non-exit control flow
> > and
> > >   analyze gcond params.
> > >   (vect_analyze_stmt): Support gcond.
> > >   * tree-vectorizer.cc (pass_vectorize::execute): Support multiple exits
> > >   in RPO pass.
> > >   * 

Re: LRA for avr: help with FP and elimination

2023-07-14 Thread Vladimir Makarov via Gcc



On 7/13/23 05:27, SenthilKumar.Selvaraj--- via Gcc wrote:

Hi,

   I've been spending some (spare) time checking what it would take to
   make LRA work for the avr target.

   Right after I removed the TARGET_LRA_P hook disabling LRA, building
   libgcc failed with a weird ICE.



  On the avr, the stack pointer (SP)
   is not used to access stack slots

It is a very uncommon target then.

  - TARGET_CAN_ELIMINATE returns false
   if frame_pointer_needed, and TARGET_FRAME_POINTER_REQUIRED returns true
   if get_frame_size() > 0.

   With LRA, however, reload generates

(insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
 (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
 (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
  (nil))

   and the backend code errors out when it finds SP is being used as a
   pointer register.

   Digging through the RTL dumps, I found the following. For the
   following insn sequence in *.ira

(insn 189 128 159 7 (set (reg:HI 58 [ b ])
 (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
  (nil))
(insn 159 189 160 7 (set (subreg:QI (reg:HI 58 [ b ]) 0)
 (reg:QI 86 [ a ])) "case.c":7:7 86 {movqi_insn_split}
  (nil))
(insn 160 159 32 7 (set (subreg:QI (reg:HI 58 [ b ]) 1)
 (reg:QI 87 [ a+1 ])) "case.c":7:7 86 {movqi_insn_split}
  (nil))

   1. For r58, IRA picks R28:R29, which is the frame pointer for avr.

   Popping a13(r58,l0)  -- assign reg 28

   2. LRA sees the subreg in insn 159 and generates a reload reg
   (r125).  simplify_subreg_regno (lra-constraints.cc:1810) however
   bails (returns -1) if the reg involved is FRAME_POINTER_REGNUM and
   reload isn't completed yet. LRA therefore decides rclass for the
   pseudo reg is NO_REGS.


Creating newreg=125 from oldreg=58, assigning class NO_REGS to subreg reg r125
   159: r125:HI#0=r86:QI
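
To make step 2 concrete, here is a toy model of the refusal described above. It is only a sketch: the names (model_subreg_regno, MODEL_FRAME_POINTER_REGNUM, struct model_state) are hypothetical stand-ins, not GCC's definitions, and it assumes one hard register per byte, with register 28 as the frame pointer as in the dump above. The real simplify_subreg_regno performs many more checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the avr frame pointer register number
   (R28 in the dumps above); not the real GCC macro.  */
enum { MODEL_FRAME_POINTER_REGNUM = 28 };

/* Hypothetical stand-in for the relevant compiler state.  */
struct model_state {
  bool reload_completed;
};

/* Toy version of the check: return the hard register that holds the
   byte at BYTE_OFFSET of the value starting at XREGNO, or -1 when the
   simplification is refused.  Assumes one hard register per byte.  */
static int
model_subreg_regno (const struct model_state *st, int xregno,
                    int byte_offset)
{
  assert (byte_offset >= 0);

  /* The refusal described in step 2: the frame pointer is off limits
     until reload has completed.  */
  if (!st->reload_completed && xregno == MODEL_FRAME_POINTER_REGNUM)
    return -1;

  return xregno + byte_offset;
}
```

In this model, asking for byte 0 of a value in register 28 before reload completes yields -1, while the same request for any other register yields that register plus the byte offset. The -1 case is the one that, per the description above, leaves the reload pseudo with class NO_REGS.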

   4. As rclass is NO_REGS, LRA picks an insn alternative that involves memory.
   That is my understanding, please correct me if I'm wrong.

 0 Small class reload: reject+=3
 0 Non input pseudo reload: reject++
 Cycle danger: overall += LRA_MAX_REJECT
   alt=0,overall=610,losers=1,rld_nregs=1
 0 Small class reload: reject+=3
 0 Non input pseudo reload: reject++
 alt=1: Bad operand -- refuse
 0 Non pseudo reload: reject++
   alt=2,overall=1,losers=0,rld_nregs=0
 Choosing alt 2 in insn 159:  (0) Qm  (1) rY00 {movqi_insn_split}

   5. LRA creates stack slots, and then uses the FP register to access
   the slots. This is despite r58 already being assigned R28:R29.

   6. TARGET_FRAME_POINTER_REQUIRED is never called, and therefore
  frame_pointer_needed is not set, despite the creation of stack
  slots. TARGET_CAN_ELIMINATE therefore okays elimination of FP to SP,
  and this eventually causes the ICE when the avr backend sees SP being
  used as a pointer register.
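
The interaction in step 6 can be sketched with a toy model of the two hooks as described earlier in this mail. All names here (struct model_frame, model_frame_pointer_required, model_can_eliminate, model_add_stack_slot) are hypothetical and only mirror the stated behaviour: the "required" hook is true when the frame size is non-zero, and the "can eliminate" hook is false when frame_pointer_needed is set. The requery flag is an assumption of the sketch, not an LRA interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-function frame state.  */
struct model_frame {
  int frame_size;             /* bytes of stack slots allocated so far */
  bool frame_pointer_needed;  /* cached decision, as in the mail */
};

/* Mirrors the described TARGET_FRAME_POINTER_REQUIRED behaviour.  */
static bool
model_frame_pointer_required (const struct model_frame *f)
{
  return f->frame_size > 0;
}

/* Mirrors the described TARGET_CAN_ELIMINATE behaviour for FP -> SP.  */
static bool
model_can_eliminate (const struct model_frame *f)
{
  return !f->frame_pointer_needed;
}

/* Allocate a stack slot.  When REQUERY is false the cached
   frame_pointer_needed flag is left stale, which is the situation
   described in step 6.  */
static void
model_add_stack_slot (struct model_frame *f, int bytes, bool requery)
{
  assert (bytes > 0);
  f->frame_size += bytes;
  if (requery && model_frame_pointer_required (f))
    f->frame_pointer_needed = true;
}
```

With requery false, a frame that has gained a 2-byte slot still reports that FP can be eliminated to SP, which matches the symptom above. With requery true, the flag is refreshed and elimination is refused.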

   This is the relevant sequence after reload

(insn 189 128 239 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
 (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
  (nil))
(insn 239 189 159 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
 (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
 (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
  (nil))
(insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
 (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
 (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
  (nil))
(insn 240 159 241 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
 (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
 (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 {*movhi_split}
  (nil))
(insn 241 240 160 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
 (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
 (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
  (nil))
(insn 160 241 242 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
 (const_int 2 [0x2])) [2 %sfp+2 S1 A8])
 (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86 {movqi_insn_split}
  (nil))
(insn 242 160 33 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
 (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
 (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 {*movhi_split}
  (nil))

   For choices other than FP, simplify_subreg_regno returns the correct part
   of the wider HImode reg, so rclass is not NO_REGS, and things work out fine.

   I checked what classic reload does in the same situation - it picks a
   different register (R25) instead of spilling to a stack slot.


(insn 189 128 159 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
 (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
  (nil))
(insn 159 189 226 7 (set (reg:QI 25 r25)
 (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 
