[gcc/aoliva/heads/testbase] (71 commits) Align tight&hot loop without considering max skipping bytes

2024-05-29 Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testbase' was updated to point to:

 b644126237a... Align tight&hot loop without considering max skipping bytes

It previously pointed to:

 7acd5d71547... testsuite: adjust iteration count for ppc costmodel 76b

Diff:

Summary of changes (added commits):
---

  b644126... Align tight&hot loop without considering max skipping bytes (*)
  00ed542... Adjust generic loop alignment from 16:11:8 to 16 for Intel  (*)
  d9933e8... testsuite, rs6000: Replace powerpc_vsx_ok with powerpc_vsx  (*)
  a19f588... Gori_on_edge tweaks. (*)
  e5fc5d4... rs6000: Don't clobber return value when eh_return called [P (*)
  2b84169... Daily bump. (*)
  1d6199e... Reduce cost of MEM (A + imm). (*)
  6f36cc2... More tweaks from gimple_outgoing_range changes. (*)
  802a98d... resource.cc: Remove redundant conditionals (*)
  e1abce5... resource.cc (mark_target_live_regs): Remove check for bb no (*)
  933ab59... resource.cc: Replace calls to find_basic_block with cfgrtl  (*)
  84b4ed4... resource.cc (mark_target_live_regs): Don't look past target (*)
  91d7905... i386: Improve access to _Atomic DImode location via XMM reg (*)
  21fc89b... diagnostics: consolidate global state in diagnostic-color.c (*)
  9bda2c4... libcpp: move label_text to its own header (*)
  fb7a943... selftests: split out make_fndecl from selftest.h to its own (*)
  7cc529f... regenerate-opt-urls.py: fix transposed values for "vax" and (*)
  efaaae4... c++: extend -Wself-move for mem-init-list [PR109396] (*)
  5ada486... Do not invoke SCEV if it will use a different range query. (*)
  d52b44a... Strlen pass should set current range query. (*)
  5bc731b... c++: mark TARGET_EXPRs for function arguments eliding [PR11 (*)
  c0d7828... testsuite/*/gomp: Remove 'dg-prune-output "not supported ye (*)
  2dbb1c1... diagnostics: disable localization of events in selftest pat (*)
  b544ff8... Fix bootstrap on AIX by adding c-family/c-type-mismatch.cc  (*)
  2361160... [to-be-committed] [RISC-V] Some basic patterns for zbkb cod (*)
  a3aeff4... vect: Use vect representative statement instead of original (*)
  d8d70b7... target/115254 - fix gcc.dg/vect/vect-gather-4.c dump scanni (*)
  c08b0d3... tree-optimization/115236 - more points-to *ANYTHING = x fix (*)
  19cc611... Avoid pessimistic constraints for asm memory constraints (*)
  eaaa4b8... tree-optimization/115254 - don't account single-lane SLP ag (*)
  65aa46f... Fix SLP reduction neutral op value for pointer reductions (*)
  c650023... Fix predicate mismatch between vfcmaddcph's define_insn and (*)
  ded91d8... LoongArch: Guard REGNO with REG_P in loongarch_expand_condi (*)
  4fcdc37... Fix bitops-9.c for -m32 and other targets that don't have v (*)
  958a682... Daily bump. (*)
  c5a7628... match: Use uniform_integer_cst_p in bitwise_inverted_equal_ (*)
  a209f21... modula2: simplify xref usage in documentation, remove exter (*)
  07cdba6... Fix points-to SCC collapsing bug (*)
  f9fbb47... tree-optimization/115220 - fix store sinking virtual operan (*)
  f6c5f83... Define which threading model is in use on Windows (*)
  311d7f5... tree-optimization/115232 - demangle failure during -Waccess (*)
  88c9b96... Add testcase for PR c++/105229: ICE in lookup_template_clas (*)
  6e97482... doc: Use https for our own site (and GCC for the project) (*)
  06bb125... RISC-V: Fix missing boolean_expression in zmmul extension (*)
  314448f... VAX/doc: Fix issues with FP format option documentation (*)
  a7f6543... vax: Fix descriptions of the FP format options [PR79646] (*)
  1609294... [to-be-committed][RISC-V] Reassociate constants in logical  (*)
  0022064... x86: Fix Logical Shift Issue in expand_vec_perm_psrlw_psllw (*)
  5d99cf7... Gen-Match: Fix gen_kids_1 right hand braces mis-alignment (*)
  56d0d0d... Daily bump. (*)
  3a915d6... [to-be-committed] [RISC-V] Try inverting for constant synth (*)
  a06df66... go: Move web references from golang.org to go.dev. (*)
  53d9198... doc: Quote singular '=' signs (*)
  9566022... [to-be-committed][RISC-V] Generate nearby constant, then ad (*)
  8746373... [PATCH] libcpp: Correct typo 'r' -> '\r' (*)
  f981072... Delete gori_map during destruction of GORI. (*)
  3c7ae57... Daily bump. (*)
  05daf61... [committed] [v2] More logical op simplifications in simplif (*)
  28b5082... c++/modules: Improve diagnostic when redeclaring builtin in (*)
  6c0b7e1... Daily bump. (*)
  9561cf5... Fortran: improve attribute conflict checking [PR93635] (*)
  9376573... Fortran: fix bounds check for assignment, class component [ (*)
  73eef7a... Small enhancement to implementation of -fdump-ada-spec (*)
  9f1798c... c: Fix for some variably modified types not being recognize (*)
  dae606a... c++/modules: Improve errors for bad module-directives [PR11 (*)
  03531ec... c++/modules: Remember that header units have CMIs (*)
  0173dcc... c++/modules: Fix treatment of unnamed types (*)
  401994d... [to-be-committed,v2,RISC-V] Use bclri in constant synt

[gcc/aoliva/heads/testme] (78 commits) [testsuite] [powerpc] adjust -m32 counts for fold-vec-extra

2024-05-29 Alexandre Oliva via Gcc-cvs
The branch 'aoliva/heads/testme' was updated to point to:

 ca809ee3fbe... [testsuite] [powerpc] adjust -m32 counts for fold-vec-extra

It previously pointed to:

 3bcf4294d89... [rs6000] adjust return_pc debug attrs

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  3bcf429... [rs6000] adjust return_pc debug attrs
  a56062c... enable adjustment of return_pc debug attrs


Summary of changes (added commits):
---

  ca809ee... [testsuite] [powerpc] adjust -m32 counts for fold-vec-extra
  0276651... [libstdc++-v3] [rtems] enable filesystem support
  1c34040... [tree-prof] skip if errors were seen [PR113681]
  1b22d42... [testsuite] [arm] add effective target and options for pacb
  d34a3eb... add explicit ABI and align options to pr88233.c
  0bb10f1... [rs6000] adjust return_pc debug attrs
  99047b7... enable adjustment of return_pc debug attrs
  b644126... Align tight&hot loop without considering max skipping bytes (*)
  00ed542... Adjust generic loop alignment from 16:11:8 to 16 for Intel  (*)
  d9933e8... testsuite, rs6000: Replace powerpc_vsx_ok with powerpc_vsx  (*)
  a19f588... Gori_on_edge tweaks. (*)
  e5fc5d4... rs6000: Don't clobber return value when eh_return called [P (*)
  2b84169... Daily bump. (*)
  1d6199e... Reduce cost of MEM (A + imm). (*)
  6f36cc2... More tweaks from gimple_outgoing_range changes. (*)
  802a98d... resource.cc: Remove redundant conditionals (*)
  e1abce5... resource.cc (mark_target_live_regs): Remove check for bb no (*)
  933ab59... resource.cc: Replace calls to find_basic_block with cfgrtl  (*)
  84b4ed4... resource.cc (mark_target_live_regs): Don't look past target (*)
  91d7905... i386: Improve access to _Atomic DImode location via XMM reg (*)
  21fc89b... diagnostics: consolidate global state in diagnostic-color.c (*)
  9bda2c4... libcpp: move label_text to its own header (*)
  fb7a943... selftests: split out make_fndecl from selftest.h to its own (*)
  7cc529f... regenerate-opt-urls.py: fix transposed values for "vax" and (*)
  efaaae4... c++: extend -Wself-move for mem-init-list [PR109396] (*)
  5ada486... Do not invoke SCEV if it will use a different range query. (*)
  d52b44a... Strlen pass should set current range query. (*)
  5bc731b... c++: mark TARGET_EXPRs for function arguments eliding [PR11 (*)
  c0d7828... testsuite/*/gomp: Remove 'dg-prune-output "not supported ye (*)
  2dbb1c1... diagnostics: disable localization of events in selftest pat (*)
  b544ff8... Fix bootstrap on AIX by adding c-family/c-type-mismatch.cc  (*)
  2361160... [to-be-committed] [RISC-V] Some basic patterns for zbkb cod (*)
  a3aeff4... vect: Use vect representative statement instead of original (*)
  d8d70b7... target/115254 - fix gcc.dg/vect/vect-gather-4.c dump scanni (*)
  c08b0d3... tree-optimization/115236 - more points-to *ANYTHING = x fix (*)
  19cc611... Avoid pessimistic constraints for asm memory constraints (*)
  eaaa4b8... tree-optimization/115254 - don't account single-lane SLP ag (*)
  65aa46f... Fix SLP reduction neutral op value for pointer reductions (*)
  c650023... Fix predicate mismatch between vfcmaddcph's define_insn and (*)
  ded91d8... LoongArch: Guard REGNO with REG_P in loongarch_expand_condi (*)
  4fcdc37... Fix bitops-9.c for -m32 and other targets that don't have v (*)
  958a682... Daily bump. (*)
  c5a7628... match: Use uniform_integer_cst_p in bitwise_inverted_equal_ (*)
  a209f21... modula2: simplify xref usage in documentation, remove exter (*)
  07cdba6... Fix points-to SCC collapsing bug (*)
  f9fbb47... tree-optimization/115220 - fix store sinking virtual operan (*)
  f6c5f83... Define which threading model is in use on Windows (*)
  311d7f5... tree-optimization/115232 - demangle failure during -Waccess (*)
  88c9b96... Add testcase for PR c++/105229: ICE in lookup_template_clas (*)
  6e97482... doc: Use https for our own site (and GCC for the project) (*)
  06bb125... RISC-V: Fix missing boolean_expression in zmmul extension (*)
  314448f... VAX/doc: Fix issues with FP format option documentation (*)
  a7f6543... vax: Fix descriptions of the FP format options [PR79646] (*)
  1609294... [to-be-committed][RISC-V] Reassociate constants in logical  (*)
  0022064... x86: Fix Logical Shift Issue in expand_vec_perm_psrlw_psllw (*)
  5d99cf7... Gen-Match: Fix gen_kids_1 right hand braces mis-alignment (*)
  56d0d0d... Daily bump. (*)
  3a915d6... [to-be-committed] [RISC-V] Try inverting for constant synth (*)
  a06df66... go: Move web references from golang.org to go.dev. (*)
  53d9198... doc: Quote singular '=' signs (*)
  9566022... [to-be-committed][RISC-V] Generate nearby constant, then ad (*)
  8746373... [PATCH] libcpp: Correct typo 'r' -> '\r' (*)
  f981072... Delete gori_map during destruction of GORI. (*)
  3c7ae57... Daily bump. (*)
  05daf61... [committed] [v2] More logical op simplifications in simplif (*)
  28b5082... 

[gcc(refs/users/aoliva/heads/testme)] add explicit ABI and align options to pr88233.c

2024-05-29 Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:d34a3eb9286d547533a7226a504c229b2ab6d4b3

commit d34a3eb9286d547533a7226a504c229b2ab6d4b3
Author: Alexandre Oliva 
Date:   Wed May 29 02:52:14 2024 -0300

add explicit ABI and align options to pr88233.c

We've observed failures of this test on powerpc configurations that
default to different calling conventions and alignment requirements.
Both settings are needed for the original expectations to be met.

The test was later modified to have different expectations for big and
little endian code generation.  This patch restores the original
codegen expectations, which, with the explicit options, no longer
vary.


for  gcc/testsuite/ChangeLog

* gcc.target/powerpc/pr88233.c: Make some alignment strictness
and calling conventions assumptions explicit.  Restore uniform
codegen expectations.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/pr88233.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr88233.c b/gcc/testsuite/gcc.target/powerpc/pr88233.c
index 27c73717a3f..46a3ebfa287 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr88233.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr88233.c
@@ -1,5 +1,5 @@
 /* { dg-require-effective-target lp64 } */
-/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mno-strict-align -fpcc-struct-return" } */
 
 typedef struct { double a[2]; } A;
 A
@@ -9,6 +9,5 @@ foo (const A *a)
 }
 
 /* { dg-final { scan-assembler-not {\mmtvsr} } } */
-/* { dg-final { scan-assembler-times {\mlxvd2x\M} 1 { target { be } } } } */
-/* { dg-final { scan-assembler-times {\mstxvd2x\M} 1 { target { be } } } } */
-/* { dg-final { scan-assembler-times {\mlfd\M} 2 { target { le } } } } */
+/* { dg-final { scan-assembler-times {\mlxvd2x\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvd2x\M} 1 } } */


[gcc r15-889] libstdc++: Avoid MMX return types from __builtin_shufflevector

2024-05-29 Matthias Kretz via Libstdc++-cvs
https://gcc.gnu.org/g:241a6cc88d866fb36bd35ddb3edb659453d6322e

commit r15-889-g241a6cc88d866fb36bd35ddb3edb659453d6322e
Author: Matthias Kretz 
Date:   Wed May 15 11:02:22 2024 +0200

libstdc++: Avoid MMX return types from __builtin_shufflevector

This resolves a regression on i686 that was introduced with
r15-429-gfb1649f8b4ad50.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/115247
* include/experimental/bits/simd.h (__as_vector): Don't use
vector_size(8) on __i386__.
(__vec_shuffle): Never return MMX vectors, widen to 16 bytes
instead.
(concat): Fix padding calculation to pick up widening logic from
__as_vector.
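
The widening described in the ChangeLog entry above can be illustrated with a minimal sketch. It assumes GCC or Clang with GNU vector extensions on a target where 16-byte vectors are available (for example x86-64); the type names `v2f`, `v4f` and the helper `widen` are hypothetical and are not part of simd.h.

```cpp
// Minimal sketch, not the libstdc++ implementation.
// Assumption: GCC or Clang with GNU vector extensions.
typedef float v2f __attribute__((vector_size(8)));   // 8 bytes: MMX-sized on 32-bit x86
typedef float v4f __attribute__((vector_size(16)));  // 16 bytes: SSE-sized

// Hypothetical helper: copy the two lanes of an 8-byte vector into a
// 16-byte vector and zero the remaining lanes, so that no 8-byte vector
// has to be returned.
static v4f widen(v2f x)
{
  return v4f{x[0], x[1], 0.0f, 0.0f};
}
```

Calling `widen` on `{1.0f, 2.0f}` yields a 16-byte value whose first two lanes hold the inputs. The actual patch instead builds the wider vector from the elements of the `__builtin_shufflevector` result.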

Diff:
---
 libstdc++-v3/include/experimental/bits/simd.h | 39 +++
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 6a6fd4f109d..7c524625719 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -1665,7 +1665,12 @@ template 
  {
static_assert(is_simd<_V>::value);
using _Tp = typename _V::value_type;
+#ifdef __i386__
+   constexpr auto __bytes = sizeof(_Tp) == 8 ? 16 : sizeof(_Tp);
+   using _RV [[__gnu__::__vector_size__(__bytes)]] = _Tp;
+#else
using _RV [[__gnu__::__vector_size__(sizeof(_Tp))]] = _Tp;
+#endif
return _RV{__data(__x)};
  }
   }
@@ -2081,11 +2086,14 @@ template >
 // }}}
 // __vec_shuffle{{{
 template 
-  _GLIBCXX_SIMD_INTRINSIC constexpr auto
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  __vector_type_t()[0])>, sizeof...(_Is)>
   __vec_shuffle(_T0 __x, _T1 __y, index_sequence<_Is...> __seq, _Fun __idx_perm)
   {
 constexpr int _N0 = sizeof(__x) / sizeof(__x[0]);
 constexpr int _N1 = sizeof(__y) / sizeof(__y[0]);
+using _Tp = remove_reference_t()[0])>;
+using _RV [[maybe_unused]] = __vector_type_t<_Tp, sizeof...(_Is)>;
 #if __has_builtin(__builtin_shufflevector)
 #ifdef __clang__
 // Clang requires _T0 == _T1
@@ -2105,14 +2113,23 @@ template 
 });
 else
 #endif
-  return __builtin_shufflevector(__x, __y, [=] {
-  constexpr int __j = __idx_perm(_Is);
-  static_assert(__j < _N0 + _N1);
-  return __j;
-}()...);
+  {
+   const auto __r = __builtin_shufflevector(__x, __y, [=] {
+  constexpr int __j = __idx_perm(_Is);
+  static_assert(__j < _N0 + _N1);
+  return __j;
+}()...);
+#ifdef __i386__
+   if constexpr (sizeof(__r) == sizeof(_RV))
+ return __r;
+   else
+ return _RV {__r[_Is]...};
+#else
+   return __r;
+#endif
+  }
 #else
-using _Tp = __remove_cvref_t;
-return __vector_type_t<_Tp, sizeof...(_Is)> {
+return _RV {
   [=]() -> _Tp {
constexpr int __j = __idx_perm(_Is);
static_assert(__j < _N0 + _N1);
@@ -4393,9 +4410,9 @@ template 
__vec_shuffle(__as_vector(__xs)..., std::make_index_sequence<_RW::_S_full_size>(),
  [](int __i) {
constexpr int __sizes[2] = {int(simd_size_v<_Tp, _As>)...};
-   constexpr int __padding0
- = sizeof(__vector_type_t<_Tp, __sizes[0]>) / sizeof(_Tp)
- - __sizes[0];
+   constexpr int __vsizes[2]
+ = {int(sizeof(__as_vector(__xs)) / sizeof(_Tp))...};
+   constexpr int __padding0 = __vsizes[0] - __sizes[0];
return __i >= _Np ? -1 : __i < __sizes[0] ? __i : __i + __padding0;
  })};
   }


[gcc r15-890] libstdc++: Build libbacktrace and 19_diagnostics/stacktrace with -funwind-tables [PR111641]

2024-05-29 Rainer Orth via Libstdc++-cvs
https://gcc.gnu.org/g:a99ebb88f8f25e76ebed5afc22e64fa77a2f0d3f

commit r15-890-ga99ebb88f8f25e76ebed5afc22e64fa77a2f0d3f
Author: Rainer Orth 
Date:   Wed May 29 10:08:07 2024 +0200

libstdc++: Build libbacktrace and 19_diagnostics/stacktrace with -funwind-tables [PR111641]

Several of the 19_diagnostics/stacktrace tests FAIL on Solaris/SPARC (32
and 64-bit), Solaris/x86 (32-bit only), and several other targets:

FAIL: 19_diagnostics/stacktrace/current.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/current.cc  -std=gnu++26 execution test
FAIL: 19_diagnostics/stacktrace/entry.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/entry.cc  -std=gnu++26 execution test
FAIL: 19_diagnostics/stacktrace/output.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/output.cc  -std=gnu++26 execution test
FAIL: 19_diagnostics/stacktrace/stacktrace.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/stacktrace.cc  -std=gnu++26 execution test

As it turns out, both the copy of libbacktrace in libstdc++ and the
testcases proper need to be compiled with -funwind-tables, as is done
for libbacktrace itself.

This isn't an issue on Linux/x86_64 and Solaris/amd64 since 64-bit x86
always defaults to -funwind-tables.  32-bit x86 does, too, when
-fomit-frame-pointer is enabled, as it is on Linux/i686 but not on
Solaris/i386.

So this patch always enables the option both for the libbacktrace copy
and the testcases.

Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
x86_64-pc-linux-gnu.

2024-05-23  Rainer Orth  

libstdc++-v3:
PR libstdc++/111641
* src/libbacktrace/Makefile.am (AM_CFLAGS): Add -funwind-tables.
* src/libbacktrace/Makefile.in: Regenerate.

* testsuite/19_diagnostics/stacktrace/current.cc (dg-options): Add
-funwind-tables.
* testsuite/19_diagnostics/stacktrace/entry.cc: Likewise.
* testsuite/19_diagnostics/stacktrace/hash.cc: Likewise.
* testsuite/19_diagnostics/stacktrace/output.cc: Likewise.
* testsuite/19_diagnostics/stacktrace/stacktrace.cc: Likewise.

Diff:
---
 libstdc++-v3/src/libbacktrace/Makefile.am  | 2 +-
 libstdc++-v3/src/libbacktrace/Makefile.in  | 2 +-
 libstdc++-v3/testsuite/19_diagnostics/stacktrace/current.cc| 2 +-
 libstdc++-v3/testsuite/19_diagnostics/stacktrace/entry.cc  | 2 +-
 libstdc++-v3/testsuite/19_diagnostics/stacktrace/hash.cc   | 2 +-
 libstdc++-v3/testsuite/19_diagnostics/stacktrace/output.cc | 2 +-
 libstdc++-v3/testsuite/19_diagnostics/stacktrace/stacktrace.cc | 2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/src/libbacktrace/Makefile.am b/libstdc++-v3/src/libbacktrace/Makefile.am
index a2e78671259..82205db46de 100644
--- a/libstdc++-v3/src/libbacktrace/Makefile.am
+++ b/libstdc++-v3/src/libbacktrace/Makefile.am
@@ -51,7 +51,7 @@ C_WARN_FLAGS = $(WARN_FLAGS) -Wstrict-prototypes -Wmissing-prototypes -Wold-styl
 CXX_WARN_FLAGS = $(WARN_FLAGS) -Wno-unused-parameter
 AM_CFLAGS = \
$(glibcxx_lt_pic_flag) $(glibcxx_compiler_shared_flag) \
-   $(C_WARN_FLAGS)
+   $(C_WARN_FLAGS) -funwind-tables
 AM_CFLAGS += $(EXTRA_CFLAGS)
 AM_CXXFLAGS = \
$(glibcxx_lt_pic_flag) $(glibcxx_compiler_shared_flag) \
diff --git a/libstdc++-v3/src/libbacktrace/Makefile.in b/libstdc++-v3/src/libbacktrace/Makefile.in
index b5713b0c616..51c8092335a 100644
--- a/libstdc++-v3/src/libbacktrace/Makefile.in
+++ b/libstdc++-v3/src/libbacktrace/Makefile.in
@@ -473,7 +473,7 @@ libstdc___libbacktrace_la_CPPFLAGS = \
 C_WARN_FLAGS = $(WARN_FLAGS) -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wno-unused-but-set-variable
 CXX_WARN_FLAGS = $(WARN_FLAGS) -Wno-unused-parameter
 AM_CFLAGS = $(glibcxx_lt_pic_flag) $(glibcxx_compiler_shared_flag) \
-   $(C_WARN_FLAGS) $(EXTRA_CFLAGS)
+   $(C_WARN_FLAGS) -funwind-tables $(EXTRA_CFLAGS)
 AM_CXXFLAGS = $(glibcxx_lt_pic_flag) $(glibcxx_compiler_shared_flag) \
$(CXX_WARN_FLAGS) -fno-rtti -fno-exceptions $(EXTRA_CXXFLAGS)
 obj_prefix = std_stacktrace
diff --git a/libstdc++-v3/testsuite/19_diagnostics/stacktrace/current.cc b/libstdc++-v3/testsuite/19_diagnostics/stacktrace/current.cc
index b1af5f74fb2..cdebd5f1daa 100644
--- a/libstdc++-v3/testsuite/19_diagnostics/stacktrace/current.cc
+++ b/libstdc++-v3/testsuite/19_diagnostics/stacktrace/current.cc
@@ -1,4 +1,4 @@
-// { dg-options "-lstdc++exp" }
+// { dg-options "-funwind-tables -lstdc++exp" }
 // { dg-do run { target c++23 } }
 // { dg-require-cpp-feature-test __cpp_lib_stacktrace }
 
diff --git a/libstdc++-v3/testsuite/19_diagnostics/stacktrace/entry.cc b/libstdc++-v3/testsuite/19_diagnostics/stacktrace/entry.cc
index bb348ebef8f..90671e68f8b 100644
--- a/libstdc++-v3/testsuite/19_diagnostics

[gcc r15-891] Fix memory leak.

2024-05-29 Andre Vehreschild via Gcc-cvs
https://gcc.gnu.org/g:2f97d98d174e3ef9f3a9a83c179d787abde5e066

commit r15-891-g2f97d98d174e3ef9f3a9a83c179d787abde5e066
Author: Andre Vehreschild 
Date:   Wed Jul 12 16:52:15 2023 +0200

Fix memory leak.

Prevent calling a function that returns a class object twice,
and free the returned object after it has been copied.
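
The idea can be sketched outside of the Fortran front end. The following is an illustration only, in C++ rather than in generated GENERIC; `Box`, `make_box`, `copy_from_call` and the `calls` counter are hypothetical names that do not appear in the patch.

```cpp
#include <cstdlib>

// Hypothetical stand-in for a class container with allocated data.
struct Box { int *data; };

// Counts how often the side-effecting function is evaluated.
static int calls = 0;

// Hypothetical function with a side effect: it allocates its result.
static Box make_box(int v)
{
  ++calls;
  Box b;
  b.data = static_cast<int *>(std::malloc(sizeof(int)));
  if (b.data == nullptr)
    std::abort();
  *b.data = v;
  return b;
}

// Evaluate the call once into a temporary, deep-copy from the temporary,
// then free the temporary's data and reset the pointer.
static Box copy_from_call(int v)
{
  Box tmp = make_box(v);            // evaluated exactly once
  Box result;
  result.data = static_cast<int *>(std::malloc(sizeof(int)));
  if (result.data == nullptr)
    std::abort();
  *result.data = *tmp.data;         // copy from the saved temporary
  if (tmp.data != nullptr)          // same shape as the guarded free in the patch
    std::free(tmp.data);
  tmp.data = nullptr;               // reset, as the patch does after freeing
  return result;
}
```

After one call to `copy_from_call`, `calls` is 1 and only the copy remains allocated, which is the behaviour the commit message describes.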

gcc/fortran/ChangeLog:

PR fortran/90069
* trans-expr.cc (gfc_conv_procedure_call): Evaluate
expressions with side-effects only once and ensure
the old one is freed.

gcc/testsuite/ChangeLog:

PR fortran/90069
* gfortran.dg/class_76.f90: New test.

Diff:
---
 gcc/fortran/trans-expr.cc  | 29 +--
 gcc/testsuite/gfortran.dg/class_76.f90 | 66 ++
 2 files changed, 92 insertions(+), 3 deletions(-)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index dfc5b8e9b4a..9f6cc8f871e 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6725,9 +6725,32 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
{
  tree efield;
 
- /* Evaluate arguments just once.  */
- if (e->expr_type != EXPR_VARIABLE)
-   parmse.expr = save_expr (parmse.expr);
+ /* Evaluate arguments just once, when they have
+side effects.  */
+ if (TREE_SIDE_EFFECTS (parmse.expr))
+   {
+ tree cldata, zero;
+
+ parmse.expr = gfc_evaluate_now (parmse.expr,
+ &parmse.pre);
+
+ /* Prevent memory leak, when old component
+was allocated already.  */
+ cldata = gfc_class_data_get (parmse.expr);
+ zero = build_int_cst (TREE_TYPE (cldata),
+   0);
+ tmp = fold_build2_loc (input_location, NE_EXPR,
+logical_type_node,
+cldata, zero);
+ tmp = build3_v (COND_EXPR, tmp,
+ gfc_call_free (cldata),
+ build_empty_stmt (
+   input_location));
+ gfc_add_expr_to_block (&parmse.finalblock,
+tmp);
+ gfc_add_modify (&parmse.finalblock,
+ cldata, zero);
+   }
 
  /* Set the _data field.  */
  tmp = gfc_class_data_get (var);
diff --git a/gcc/testsuite/gfortran.dg/class_76.f90 b/gcc/testsuite/gfortran.dg/class_76.f90
new file mode 100644
index 000..1ee1e1fc25f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/class_76.f90
@@ -0,0 +1,66 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/90069
+!
+! Contributed by Brad Richardson  
+!
+
+program returned_memory_leak
+implicit none
+
+type, abstract :: base
+end type base
+
+type, extends(base) :: extended
+end type extended
+
+type :: container
+class(*), allocatable :: thing
+end type
+
+call run()
+contains
+subroutine run()
+type(container) :: a_container
+
+a_container = theRightWay()
+a_container = theWrongWay()
+end subroutine
+
+function theRightWay()
+type(container) :: theRightWay
+
+class(base), allocatable :: thing
+
+allocate(thing, source = newAbstract())
+theRightWay = newContainer(thing)
+end function theRightWay
+
+function theWrongWay()
+type(container) :: theWrongWay
+
+theWrongWay = newContainer(newAbstract())
+end function theWrongWay
+
+function  newAbstract()
+class(base), allocatable :: newAbstract
+
+allocate(newAbstract, source = newExtended())
+end function newAbstract
+
+function newExtended()
+type(extended) :: newExtended
+end function newExtended
+
+function newContainer(thing)
+class(*), intent(in) :: thing
+type(container) :: newContainer
+
+allocate(newContainer%thing, source = thing)
+end function newContainer
+end program returned_memory_leak
+
+! { dg-final { scan-tree-dump-times "newabstract" 14 "original" } }
+! { dg-final { scan-tree-dump-times "__builtin_free" 8 "original" } }
+


[gcc r15-892] c++: canonicity of fn types w/ instantiated eh specs [PR115223]

2024-05-29 Patrick Palka via Gcc-cvs
https://gcc.gnu.org/g:58b8c87b7fb281e35a6817cc91a292096fdc02dc

commit r15-892-g58b8c87b7fb281e35a6817cc91a292096fdc02dc
Author: Patrick Palka 
Date:   Wed May 29 04:49:37 2024 -0400

c++: canonicity of fn types w/ instantiated eh specs [PR115223]

When propagating structural equality in build_cp_fntype_variant, we
should consider structural equality of the exception-less variant, not
of the given type which might use structural equality only because it
has a (complex) noexcept-spec that we're intending to replace, as in
maybe_instantiate_noexcept which calls build_exception_variant using
the deferred-noexcept function type.  Otherwise we might pessimistically
use structural equality for a function type with a simple instantiated
noexcept-spec, leading to a LTO-triggered type verification failure if we
later use that (structural-equality) type as the canonical version of
some other variant.

PR c++/115223

gcc/cp/ChangeLog:

* tree.cc (build_cp_fntype_variant): Propagate structural
equality of the exception-less variant.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept87.C: New test.

Reviewed-by: Jason Merrill 

Diff:
---
 gcc/cp/tree.cc  |  4 
 gcc/testsuite/g++.dg/cpp0x/noexcept87.C | 11 +++
 2 files changed, 15 insertions(+)

diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index fe3f034d000..72dd46e1bd1 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -2796,6 +2796,10 @@ build_cp_fntype_variant (tree type, cp_ref_qualifier rqual,
   bool complex_eh_spec_p = (cr && cr != noexcept_true_spec
&& !UNPARSED_NOEXCEPT_SPEC_P (cr));
 
+  if (!complex_eh_spec_p && TYPE_RAISES_EXCEPTIONS (type))
+/* We want to consider structural equality of the exception-less
+   variant since we'll be replacing the exception specification.  */
+type = build_cp_fntype_variant (type, rqual, /*raises=*/NULL_TREE, late);
   if (TYPE_STRUCTURAL_EQUALITY_P (type) || complex_eh_spec_p)
 /* Propagate structural equality.  And always use structural equality
for function types with a complex noexcept-spec since their identity
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept87.C b/gcc/testsuite/g++.dg/cpp0x/noexcept87.C
new file mode 100644
index 000..339569d15ae
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept87.C
@@ -0,0 +1,11 @@
+// PR c++/115223
+// { dg-do compile { target c++11 } }
+// { dg-additional-options -flto }
+
+template
+void f() noexcept(bool(T() || true));
+
+void g() { f(); }
+
+using type = void;
+type callDestructorIfNecessary() noexcept {}


[gcc r15-893] i386: Fix ix86_option override after change [PR 113719]

2024-05-29 Hongyu Wang via Gcc-cvs
https://gcc.gnu.org/g:499d00127d39ba894b0f7216d73660b380bdc325

commit r15-893-g499d00127d39ba894b0f7216d73660b380bdc325
Author: Hongyu Wang 
Date:   Wed May 15 11:24:34 2024 +0800

i386: Fix ix86_option override after change [PR 113719]

In ix86_override_options_after_change, calls to ix86_default_align
and ix86_recompute_optlev_based_flags will cause mismatched target
opt_set when doing cl_optimization_restore. Move them back to
ix86_option_override_internal to solve the issue.

gcc/ChangeLog:

PR target/113719
* config/i386/i386-options.cc (ix86_override_options_after_change):
Remove call to ix86_default_align and
ix86_recompute_optlev_based_flags.
(ix86_option_override_internal): Call ix86_default_align and
ix86_recompute_optlev_based_flags.

Diff:
---
 gcc/config/i386/i386-options.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index 78602a17f7e..f2cecc0e254 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -1916,11 +1916,6 @@ ix86_recompute_optlev_based_flags (struct gcc_options *opts,
 void
 ix86_override_options_after_change (void)
 {
-  /* Default align_* from the processor table.  */
-  ix86_default_align (&global_options);
-
-  ix86_recompute_optlev_based_flags (&global_options, &global_options_set);
-
   /* Disable unrolling small loops when there's explicit
  -f{,no}unroll-loop.  */
   if ((OPTION_SET_P (flag_unroll_loops))
@@ -2491,6 +2486,8 @@ ix86_option_override_internal (bool main_args_p,
 
   set_ix86_tune_features (opts, ix86_tune, opts->x_ix86_dump_tunes);
 
+  ix86_recompute_optlev_based_flags (opts, opts_set);
+
   ix86_override_options_after_change ();
 
   ix86_tune_cost = processor_cost_table[ix86_tune];
@@ -2526,6 +2523,9 @@ ix86_option_override_internal (bool main_args_p,
   || TARGET_64BIT_P (opts->x_ix86_isa_flags))
 opts->x_ix86_regparm = REGPARM_MAX;
 
+  /* Default align_* from the processor table.  */
+  ix86_default_align (&global_options);
+
   /* Provide default for -mbranch-cost= value.  */
   SET_OPTION_IF_UNSET (opts, opts_set, ix86_branch_cost,
   ix86_tune_cost->branch_cost);


[gcc r15-894] Fix link failure of GNAT tools on 32-bit SPARC/Linux

2024-05-29 Eric Botcazou via Gcc-cvs
https://gcc.gnu.org/g:9c6e75a6d1cc2858fc945266a5edb700edb44389

commit r15-894-g9c6e75a6d1cc2858fc945266a5edb700edb44389
Author: Eric Botcazou 
Date:   Wed May 29 12:06:32 2024 +0200

Fix link failure of GNAT tools on 32-bit SPARC/Linux

There is an incorrect binding to the 64-bit compare-and-exchange builtin.

gcc/ada/
PR ada/115270
* Makefile.rtl (PowerPC/Linux): Use libgnat/s-atopri__32.ads for
the 32-bit library.
(SPARC/Linux): Likewise.

Diff:
---
 gcc/ada/Makefile.rtl | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 570d0b2703d..0f5ebb87d73 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2266,15 +2266,18 @@ ifeq ($(strip $(filter-out powerpc% linux%,$(target_cpu) $(target_os))),)
   system.ads

[gcc r14-10258] Fix link failure of GNAT tools on 32-bit SPARC/Linux

2024-05-29 Eric Botcazou via Gcc-cvs
https://gcc.gnu.org/g:fba2843b9b35b9700155677f90555700b6ad4e16

commit r14-10258-gfba2843b9b35b9700155677f90555700b6ad4e16
Author: Eric Botcazou 
Date:   Wed May 29 12:06:32 2024 +0200

Fix link failure of GNAT tools on 32-bit SPARC/Linux

There is an incorrect binding to the 64-bit compare-and-exchange builtin.

gcc/ada/
PR ada/115270
* Makefile.rtl (PowerPC/Linux): Use libgnat/s-atopri__32.ads for
the 32-bit library.
(SPARC/Linux): Likewise.

Diff:
---
 gcc/ada/Makefile.rtl | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 6e1ca305faf..32cbdb69247 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2238,15 +2238,18 @@ ifeq ($(strip $(filter-out powerpc% linux%,$(target_cpu) $(target_os))),)
   system.ads

[gcc r15-895] tree-optimization/114435 - pcom left around copies confusing SLP

2024-05-29 Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:1065a7db6f2a69770a85b4d53b9123b090dd1771

commit r15-895-g1065a7db6f2a69770a85b4d53b9123b090dd1771
Author: Richard Biener 
Date:   Wed May 29 10:41:51 2024 +0200

tree-optimization/114435 - pcom left around copies confusing SLP

The following arranges for the pre-SLP vectorization scalar cleanup
to be run when predictive commoning was applied to a loop in the
function.  This is similar to the complete unroll situation and
facilitates SLP vectorization.  Avoiding the SSA copies in predictive
commoning itself isn't easy (and predcom also sometimes unrolls,
asking for scalar cleanup).
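
The queuing mechanism can be pictured with a small model. This is a simplified sketch, not GCC's pass manager: `Function`, `PENDING_FORCE_SCALAR_CLEANUP`, `run_predcom_like_pass` and `take_scalar_cleanup_request` are hypothetical names chosen for illustration.

```cpp
// Simplified model: a pass that changed something sets a pending flag,
// and a later point in the pipeline consumes it.
// All names here are hypothetical, not GCC identifiers.
constexpr unsigned PENDING_FORCE_SCALAR_CLEANUP = 1u << 0;

struct Function { unsigned pending_todos = 0; };

// Returns nonzero when the pass did something; in that case it also
// queues the scalar cleanup for later.
static unsigned run_predcom_like_pass(Function &fn, bool changed)
{
  unsigned ret = changed ? 1u : 0u;
  if (ret != 0)
    fn.pending_todos |= PENDING_FORCE_SCALAR_CLEANUP;
  return ret;
}

// Reports whether cleanup was requested and clears the request.
static bool take_scalar_cleanup_request(Function &fn)
{
  bool requested = (fn.pending_todos & PENDING_FORCE_SCALAR_CLEANUP) != 0;
  fn.pending_todos &= ~PENDING_FORCE_SCALAR_CLEANUP;
  return requested;
}
```

In this model the request is raised only when the pass reports a change, which matches the `ret != 0` guard described in the commit message.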

PR tree-optimization/114435
* tree-predcom.cc (tree_predictive_commoning): Queue
the next scalar cleanup sub-pipeline to be run when we
did something.

* gcc.dg/vect/bb-slp-pr114435.c: New testcase.

Diff:
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr114435.c | 37 +
 gcc/tree-predcom.cc |  3 +++
 2 files changed, 40 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr114435.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr114435.c
new file mode 100644
index 000..d1eecf7979a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr114435.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+/* Predictive commoning is supposed to happen.  */
+/* { dg-additional-options "-O3 -fdump-tree-pcom" } */
+
+struct res {
+double r0;
+double r1;
+double r2;
+double r3;
+};
+
+struct pxl {
+double v0;
+double v1;
+double v2;
+double v3;
+};
+
+#define IS_NAN(x) ((x) == (x))
+
+void fold(struct res *r, struct pxl *in, double k, int sz)
+{
+  int i;
+
+  for (i = 0; i < sz; i++) {
+  if (IS_NAN(k)) continue;
+  r->r0 += in[i].v0 * k;
+  r->r1 += in[i].v1 * k;
+  r->r2 += in[i].v2 * k;
+  r->r3 += in[i].v3 * k;
+  }
+}
+
+/* { dg-final { scan-tree-dump "# r__r0_lsm\[^\r\n\]* = PHI" "pcom" } } */
+/* { dg-final { scan-tree-dump "optimized: basic block part vectorized" "slp1" 
} } */
+/* { dg-final { scan-tree-dump "# vect\[^\r\n\]* = PHI" "slp1" } } */
diff --git a/gcc/tree-predcom.cc b/gcc/tree-predcom.cc
index 75a4c85164c..9844fee1e97 100644
--- a/gcc/tree-predcom.cc
+++ b/gcc/tree-predcom.cc
@@ -3522,6 +3522,9 @@ tree_predictive_commoning (bool allow_unroll_p)
}
 }
 
+  if (ret != 0)
+cfun->pending_TODOs |= PENDING_TODO_force_next_scalar_cleanup;
+
   return ret;
 }


[gcc r15-896] tree-optimization/115252 - enhance peeling for gaps avoidance

2024-05-29 Thread Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:f46eaad445e680034df51bd0dec4e6c7b1f372a4

commit r15-896-gf46eaad445e680034df51bd0dec4e6c7b1f372a4
Author: Richard Biener 
Date:   Mon May 27 16:04:35 2024 +0200

tree-optimization/115252 - enhance peeling for gaps avoidance

Code generation for contiguous load vectorization can already deal
with generalized avoidance of loading from a gap.  The following
extends detection of peeling for gaps requirement with that,
gets rid of the old special casing of a half load and makes sure
when we do access the gap we have peeling for gaps enabled.
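The arithmetic behind the extended detection can be sketched with plain integers. This is a simplified model only: GCC works on poly_uint64 values and additionally checks alignment support, masking and whether the target can compose the vector from smaller pieces, none of which is modelled here. The function name gap_access_avoidable and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the check in get_group_load_store_type.
   GROUP_SIZE * VF - GAP elements are really needed.  If that count is a
   multiple of NUNITS, the last vector is full and the gap is never
   touched.  Otherwise the last vector holds REMAIN elements; when NUNITS
   is a multiple of REMAIN the vector can be built from NUNITS / REMAIN
   pieces, loading only the first one (assuming the target supports such
   a composition, which is not checked here).  */
static bool
gap_access_avoidable (unsigned group_size, unsigned vf, unsigned gap,
                      unsigned nunits, unsigned *pieces)
{
  assert (nunits != 0 && group_size * vf >= gap);

  unsigned needed = group_size * vf - gap;
  unsigned remain = needed % nunits;

  if (remain == 0)
    {
      *pieces = 1;		/* Whole vectors only.  */
      return true;
    }
  if (nunits % remain == 0)
    {
      *pieces = nunits / remain;	/* Partial load plus zero padding.  */
      return true;
    }
  return false;
}
```

With a group of 16 elements, a gap of 12 and 8-element vectors, 4 elements remain, so the vector is modelled as two 4-element pieces. The old special case only covered the situation where exactly half a vector was needed.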

PR tree-optimization/115252
* tree-vect-stmts.cc (get_group_load_store_type): Enhance
detecting the number of cases where we can avoid accessing a gap
during code generation.
(vectorizable_load): Remove old half-vector peeling for gap
avoidance which is now redundant.  Add gap-aligned case where
it's OK to access the gap.  Add assert that we have peeling for
gaps enabled when we access a gap.

* gcc.dg/vect/slp-gap-1.c: New testcase.

Diff:
---
 gcc/testsuite/gcc.dg/vect/slp-gap-1.c | 18 +++
 gcc/tree-vect-stmts.cc| 58 +--
 2 files changed, 46 insertions(+), 30 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/slp-gap-1.c 
b/gcc/testsuite/gcc.dg/vect/slp-gap-1.c
new file mode 100644
index 00000000000..36463ca22c5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/slp-gap-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+typedef unsigned char uint8_t;
+typedef short int16_t;
+void pixel_sub_wxh(int16_t * __restrict diff, uint8_t *pix1, uint8_t *pix2) {
+  for (int y = 0; y < 4; y++) {
+for (int x = 0; x < 4; x++)
+  diff[x + y * 4] = pix1[x] - pix2[x];
+pix1 += 16;
+pix2 += 32;
+  }
+}
+
+/* We can vectorize this without peeling for gaps and thus without epilogue,
+   but the only thing we can reliably scan is the zero-padding trick for the
+   partial loads.  */
+/* { dg-final { scan-tree-dump-times "\{_\[0-9\]\+, 0" 6 "vect" { target 
vect64 } } } */
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 4219ad832db..935d80f0e1b 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2072,16 +2072,22 @@ get_group_load_store_type (vec_info *vinfo, 
stmt_vec_info stmt_info,
  dr_alignment_support alss;
  int misalign = dr_misalignment (first_dr_info, vectype);
  tree half_vtype;
+ poly_uint64 remain;
+ unsigned HOST_WIDE_INT tem, num;
  if (overrun_p
  && !masked_p
  && (((alss = vect_supportable_dr_alignment (vinfo, first_dr_info,
  vectype, misalign)))
   == dr_aligned
  || alss == dr_unaligned_supported)
- && known_eq (nunits, (group_size - gap) * 2)
- && known_eq (nunits, group_size)
- && (vector_vector_composition_type (vectype, 2, &half_vtype)
- != NULL_TREE))
+ && can_div_trunc_p (group_size
+ * LOOP_VINFO_VECT_FACTOR (loop_vinfo) - gap,
+ nunits, &tem, &remain)
+ && (known_eq (remain, 0u)
+ || (constant_multiple_p (nunits, remain, &num)
+ && (vector_vector_composition_type (vectype, num,
+ &half_vtype)
+ != NULL_TREE
overrun_p = false;
 
  if (overrun_p && !can_overrun_p)
@@ -11513,33 +11519,14 @@ vectorizable_load (vec_info *vinfo,
unsigned HOST_WIDE_INT gap = DR_GROUP_GAP (first_stmt_info);
unsigned int vect_align
  = vect_known_alignment_in_bytes (first_dr_info, vectype);
-   unsigned int scalar_dr_size
- = vect_get_scalar_dr_size (first_dr_info);
-   /* If there's no peeling for gaps but we have a gap
-  with slp loads then load the lower half of the
-  vector only.  See get_group_load_store_type for
-  when we apply this optimization.  */
-   if (slp
-   && loop_vinfo
-   && !LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) && gap != 0
-   && known_eq (nunits, (group_size - gap) * 2)
-   && known_eq (nunits, group_size)
-   && gap >= (vect_align / scalar_dr_size))
- {
-   tree half_vtype;
-   new_vtype
- = vector_vector_composition_type (vectype, 2,
-   &half_vtype);
-   if (new_vtype != NULL_TREE)
-

[gcc r15-897] c-family: add hints for strerror

2024-05-29 Thread Jason Merrill via Gcc-cvs
https://gcc.gnu.org/g:19c491d1848a8410559247183597096778967edf

commit r15-897-g19c491d1848a8410559247183597096778967edf
Author: Oskari Pirhonen 
Date:   Tue Feb 27 19:13:30 2024 -0600

c-family: add hints for strerror

Add proper hints for implicit declaration of strerror.

The results could be confusing depending on the other included headers.
These example messages are from compiling a trivial program to print the
string for an errno value. It only includes stdio.h (cstdio for C++).

Before:
$ /tmp/gcc-master/bin/gcc test.c -o test_c
test.c: In function ‘main’:
test.c:4:20: warning: implicit declaration of function ‘strerror’; did you 
mean ‘perror’? [-Wimplicit-function-declaration]
4 | printf("%s\n", strerror(0));
  |^~~~
  |perror

$ /tmp/gcc-master/bin/g++ test.cpp -o test_cpp
test.cpp: In function ‘int main()’:
test.cpp:4:20: error: ‘strerror’ was not declared in this scope; did you 
mean ‘stderr’?
4 | printf("%s\n", strerror(0));
  |^~~~
  |stderr

After:
$ /tmp/gcc-known-headers/bin/gcc test.c -o test_c
test.c: In function ‘main’:
test.c:4:20: warning: implicit declaration of function ‘strerror’ 
[-Wimplicit-function-declaration]
4 | printf("%s\n", strerror(0));
  |^~~~
test.c:2:1: note: ‘strerror’ is defined in header ‘<string.h>’; this is 
probably fixable by adding ‘#include <string.h>’
1 | #include <stdio.h>
  +++ |+#include <string.h>
2 |

$ /tmp/gcc-known-headers/bin/g++ test.cpp -o test_cpp
test.cpp: In function ‘int main()’:
test.cpp:4:20: error: ‘strerror’ was not declared in this scope
4 | printf("%s\n", strerror(0));
  |^~~~
test.cpp:2:1: note: ‘strerror’ is defined in header ‘<cstring>’; this is 
probably fixable by adding ‘#include <cstring>’
1 | #include <cstdio>
  +++ |+#include <cstring>
2 |
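
For reference, a minimal sketch of the kind of C code the messages above refer to, with the suggested include already applied. The helper name print_errno_string is hypothetical; the original test.c is not part of this mail.

```c
#include <stdio.h>
#include <string.h>	/* Declares strerror; this is what the new hint suggests.  */

/* Print the message for ERRNUM followed by a newline and return the
   number of characters written, as reported by printf.  */
static int
print_errno_string (int errnum)
{
  return printf ("%s\n", strerror (errnum));
}
```

Dropping the string.h include from this sketch is what triggers the implicit-declaration diagnostic shown above.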

gcc/c-family/ChangeLog:

* known-headers.cc (get_stdlib_header_for_name): Add strerror.

gcc/testsuite/ChangeLog:

* g++.dg/spellcheck-stdlib.C: Add check for strerror.
* gcc.dg/spellcheck-stdlib-2.c: New test.

Signed-off-by: Oskari Pirhonen 

Diff:
---
 gcc/c-family/known-headers.cc  | 1 +
 gcc/testsuite/g++.dg/spellcheck-stdlib.C   | 2 ++
 gcc/testsuite/gcc.dg/spellcheck-stdlib-2.c | 8 
 3 files changed, 11 insertions(+)

diff --git a/gcc/c-family/known-headers.cc b/gcc/c-family/known-headers.cc
index dbc42eacde1..871fd714eb5 100644
--- a/gcc/c-family/known-headers.cc
+++ b/gcc/c-family/known-headers.cc
@@ -182,6 +182,7 @@ get_stdlib_header_for_name (const char *name, enum stdlib 
lib)
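 {"strchr", {"<string.h>", "<cstring>"} },
 {"strcmp", {"<string.h>", "<cstring>"} },
 {"strcpy", {"<string.h>", "<cstring>"} },
+{"strerror", {"<string.h>", "<cstring>"} },
 {"strlen", {"<string.h>", "<cstring>"} },
 {"strncat", {"<string.h>", "<cstring>"} },
 {"strncmp", {"<string.h>", "<cstring>"} },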
diff --git a/gcc/testsuite/g++.dg/spellcheck-stdlib.C 
b/gcc/testsuite/g++.dg/spellcheck-stdlib.C
index fd0f3a9b8c9..33718b8034e 100644
--- a/gcc/testsuite/g++.dg/spellcheck-stdlib.C
+++ b/gcc/testsuite/g++.dg/spellcheck-stdlib.C
@@ -104,6 +104,8 @@ void test_cstring (char *dest, char *src)
   // { dg-message "'#include <cstring>'" "" { target *-*-* } .-1 }
   strcpy(dest, "test"); // { dg-error "was not declared" }
   // { dg-message "'#include <cstring>'" "" { target *-*-* } .-1 }
+  strerror(0); // { dg-error "was not declared" }
+  // { dg-message "'#include <cstring>'" "" { target *-*-* } .-1 }
   strlen("test"); // { dg-error "was not declared" }
   // { dg-message "'#include <cstring>'" "" { target *-*-* } .-1 }
   strncat(dest, "test", 3); // { dg-error "was not declared" }
diff --git a/gcc/testsuite/gcc.dg/spellcheck-stdlib-2.c 
b/gcc/testsuite/gcc.dg/spellcheck-stdlib-2.c
new file mode 100644
index 00000000000..4762e2ddbbd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/spellcheck-stdlib-2.c
@@ -0,0 +1,8 @@
+/* { dg-options "-Wimplicit-function-declaration" } */
+
+/* Missing <string.h>.  */
+void test_string_h (void)
+{
+  strerror (0); /* { dg-error "implicit declaration of function 'strerror'" } 
*/
+  /* { dg-message "'strerror' is defined in header '<string.h>'" "" { target 
*-*-* } .-1 } */
+}


[gcc r15-898] libgomp: Enable USM for some nvptx devices

2024-05-29 Thread Tobias Burnus via Gcc-cvs
https://gcc.gnu.org/g:4ccb3366ade6ec9493f8ca20ab73b0da4b9816db

commit r15-898-g4ccb3366ade6ec9493f8ca20ab73b0da4b9816db
Author: Tobias Burnus 
Date:   Wed May 29 15:14:38 2024 +0200

libgomp: Enable USM for some nvptx devices

A few high-end nvptx devices support the attribute
CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS; for those, unified shared
memory is supported in hardware. This patch enables support for those -
if all installed nvptx devices have this feature (as the capabilities
are per device type).
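
From the user's side, the affected code looks roughly like the following sketch. The function name sum_on_device is hypothetical. Whether the region actually runs on an nvptx device depends on the check described above; otherwise host fallback is used. Built without -fopenmp, the pragmas are simply ignored.

```c
/* Request unified shared memory for this translation unit.  The
   directive has to appear at file scope, before any device construct.  */
#pragma omp requires unified_shared_memory

/* Sum N ints on the default device.  DATA is not mapped explicitly:
   with unified shared memory the device dereferences the host pointer
   directly.  With host fallback the loop simply runs on the host.  */
static long
sum_on_device (const int *data, int n)
{
  long sum = 0;

#pragma omp target map(tofrom: sum)
  for (int i = 0; i < n; i++)
    sum += data[i];

  return sum;
}
```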

This exposes a bug in gomp_copy_back_icvs, which previously used
omp_get_mapped_ptr to find mapped variables; that function returns
the unchanged pointer in the case of shared memory.  But in this case,
we have a few actually mapped pointers - like the ICV variables.
Additionally, there was a mismatch with regard to '-1' for the
device number as gomp_copy_back_icvs and omp_get_mapped_ptr count
differently.  Hence, do the lookup manually.

include/ChangeLog:

* cuda/cuda.h (CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS): Add.

libgomp/ChangeLog:

* libgomp.texi (nvptx): Update USM description.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices):
Claim support when requesting USM and all devices support
CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS.
* target.c (gomp_copy_back_icvs): Fix device ptr lookup.
(gomp_target_init): Set GOMP_OFFLOAD_CAP_SHARED_MEM if the
device supports USM.

Diff:
---
 include/cuda/cuda.h   |  3 ++-
 libgomp/libgomp.texi  |  7 +--
 libgomp/plugin/plugin-nvptx.c | 15 +++
 libgomp/target.c  | 24 +++-
 4 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/include/cuda/cuda.h b/include/cuda/cuda.h
index 0dca4b3a5c0..804d08ca57e 100644
--- a/include/cuda/cuda.h
+++ b/include/cuda/cuda.h
@@ -83,7 +83,8 @@ typedef enum {
   CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR = 39,
   CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT = 40,
   CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING = 41,
-  CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR = 82
+  CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR = 82,
+  CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS = 88
 } CUdevice_attribute;
 
 enum {
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 71d62105a20..22868635230 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -6435,8 +6435,11 @@ The implementation remark:
   the next reverse offload region is only executed after the previous
   one returned.
 @item OpenMP code that has a @code{requires} directive with
-  @code{unified_shared_memory} will remove any nvptx device from the
-  list of available devices (``host fallback'').
+  @code{unified_shared_memory} runs on nvptx devices if and only if
+  all of those support the @code{pageableMemoryAccess} property;@footnote{
+  
@uref{https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-requirements}}
+  otherwise, all nvptx device are removed from the list of available
+  devices (``host fallback'').
 @item The default per-warp stack size is 128 kiB; see also @code{-msoft-stack}
   in the GCC manual.
 @item The OpenMP routines @code{omp_target_memcpy_rect} and
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 5aad3448a8d..4cedc5390a3 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1201,8 +1201,23 @@ GOMP_OFFLOAD_get_num_devices (unsigned int 
omp_requires_mask)
   if (num_devices > 0
   && ((omp_requires_mask
   & ~(GOMP_REQUIRES_UNIFIED_ADDRESS
+  | GOMP_REQUIRES_UNIFIED_SHARED_MEMORY
   | GOMP_REQUIRES_REVERSE_OFFLOAD)) != 0))
 return -1;
+  /* Check whether host page access (direct or via migration) is supported;
+ if so, enable USM.  Currently, capabilities is per device type, hence,
+ check all devices.  */
+  if (num_devices > 0
+  && (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY))
+for (int dev = 0; dev < num_devices; dev++)
+  {
+   int pi;
+   CUresult r;
+   r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &pi,
+  CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS, dev);
+   if (r != CUDA_SUCCESS || pi == 0)
+ return -1;
+  }
   return num_devices;
 }
 
diff --git a/libgomp/target.c b/libgomp/target.c
index 5ec19ae489e..48689920d4a 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -2969,8 +2969,25 @@ gomp_copy_back_icvs (struct gomp_device_descr *devicep, 
int device)
   if (item == NULL)
 return;
 
+  gomp_mutex_lock (&devicep->lock);
+
+  struct splay_tree_s *mem_map = &devicep->mem_map;
+  struct splay_tree_key_s cur_node;
+  void *dev_ptr = NULL;
+
   void *host_ptr = &item->icvs;
-  void *dev_ptr = omp_get_mapped_ptr (host_ptr, device);
+

[gcc r15-899] libgomp: Enable USM for AMD APUs and MI200 devices

2024-05-29 Thread Tobias Burnus via Gcc-cvs
https://gcc.gnu.org/g:18f477980c8597fe3dca2c2e8bd533c0c2b17aa6

commit r15-899-g18f477980c8597fe3dca2c2e8bd533c0c2b17aa6
Author: Tobias Burnus 
Date:   Wed May 29 15:29:06 2024 +0200

libgomp: Enable USM for AMD APUs and MI200 devices

If HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT is true,
all GPUs on the system support unified shared memory. That's
the case for APUs and MI200 devices when XNACK is enabled.

XNACK can be enabled by setting HSA_XNACK=1 as env var for
supported devices; otherwise, if disabled, USM code will
use host fallback.

gcc/ChangeLog:

* config/gcn/gcn-hsa.h (gcn_local_sym_hash): Fix typo.

include/ChangeLog:

* hsa.h (HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT): Add
enum value.

libgomp/ChangeLog:

* libgomp.texi (gcn): Update USM handling
* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Handle
USM if HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT is true.

Diff:
---
 gcc/config/gcn/gcn-hsa.h|  2 +-
 include/hsa.h   |  4 +++-
 libgomp/libgomp.texi|  9 +++--
 libgomp/plugin/plugin-gcn.c | 17 +
 4 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/gcc/config/gcn/gcn-hsa.h b/gcc/config/gcn/gcn-hsa.h
index 4611bc55392..03220555075 100644
--- a/gcc/config/gcn/gcn-hsa.h
+++ b/gcc/config/gcn/gcn-hsa.h
@@ -80,7 +80,7 @@ extern unsigned int gcn_local_sym_hash (const char *name);
writes a new AMD GPU object file and the ABI version needs to be the
same. - LLVM <= 17 defaults to 4 while LLVM >= 18 defaults to 5.
GCC supports LLVM >= 13.0.1 and only LLVM >= 14 supports version 5.
-   Note that Fiji is only suppored with LLVM <= 17 as version 3 is no longer
+   Note that Fiji is only supported with LLVM <= 17 as version 3 is no longer
supported in LLVM >= 18.  */
 #define ABI_VERSION_SPEC "march=fiji:--amdhsa-code-object-version=3;" \
 "!march=*|march=*:--amdhsa-code-object-version=4"
diff --git a/include/hsa.h b/include/hsa.h
index f9b5d9daf85..3c7be95d7fd 100644
--- a/include/hsa.h
+++ b/include/hsa.h
@@ -466,7 +466,9 @@ typedef enum {
   /**
   * String containing the ROCr build identifier.
   */
-  HSA_AMD_SYSTEM_INFO_BUILD_VERSION = 0x200
+  HSA_AMD_SYSTEM_INFO_BUILD_VERSION = 0x200,
+
+  HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT = 0x202
 } hsa_system_info_t;
 
 /**
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 22868635230..e79bd7a3392 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -6360,8 +6360,13 @@ The implementation remark:
   such that the next reverse offload region is only executed after the 
previous
   one returned.
 @item OpenMP code that has a @code{requires} directive with
-  @code{unified_shared_memory} will remove any GCN device from the list of
-  available devices (``host fallback'').
+  @code{unified_shared_memory} is only supported if all AMD GPUs have the
+  @code{HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT} property; for
+  discrete GPUs, this may require setting the @code{HSA_XNACK} environment
+  variable to @samp{1}; for systems with both an APU and a discrete GPU 
that
+  does not support XNACK, consider using @code{ROCR_VISIBLE_DEVICES} to
+  enable only the APU.  If not supported, all AMD GPU devices are removed
+  from the list of available devices (``host fallback'').
 @item The available stack size can be changed using the @code{GCN_STACK_SIZE}
   environment variable; the default is 32 kiB per thread.
 @item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 3cdc7ba929f..3d882b5ab63 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3355,8 +3355,25 @@ GOMP_OFFLOAD_get_num_devices (unsigned int 
omp_requires_mask)
   if (hsa_context.agent_count > 0
   && ((omp_requires_mask
   & ~(GOMP_REQUIRES_UNIFIED_ADDRESS
+  | GOMP_REQUIRES_UNIFIED_SHARED_MEMORY
   | GOMP_REQUIRES_REVERSE_OFFLOAD)) != 0))
 return -1;
+  /* Check whether host page access is supported; this is per system level
+ (all GPUs supported by HSA).  While intrinsically true for APUs, it
+ requires XNACK support for discrete GPUs.  */
+  if (hsa_context.agent_count > 0
+  && (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY))
+{
+  bool b;
+  hsa_system_info_t type = HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT;
+  hsa_status_t status = hsa_fns.hsa_system_get_info_fn (type, &b);
+  if (status != HSA_STATUS_SUCCESS)
+   GOMP_PLUGIN_error ("HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT "
+  "failed");
+  if (!b)
+   return -1;
+}
+
   return hsa_context.agent_count;
 }


[gcc r15-900] c++: add module extensions

2024-05-29 Thread Jason Merrill via Gcc-cvs
https://gcc.gnu.org/g:ff41abdca0ab9993b6170b9b1f46b3a40921f1b0

commit r15-900-gff41abdca0ab9993b6170b9b1f46b3a40921f1b0
Author: Jason Merrill 
Date:   Thu May 16 16:09:12 2024 -0400

c++: add module extensions

There is a trend in the broader C++ community to use a different extension
for module interface units, even though (in GCC) they are compiled in the
same way as other source files.  Let's recognize these extensions as C++.

.ixx is the MSVC standard, while the .c*m are supported by Clang.  libc++
standard headers use .cppm, as their other source files use .cpp.
Perhaps libstdc++ might use .ccm for parallel consistency?

One issue with .c++m is that libcpp/mkdeps.cc has been using it for the
phony dependencies to express module dependencies, so I'm changing mkdeps to
something less likely to be an actual file, ".c++-module".

gcc/cp/ChangeLog:

* lang-specs.h: Add module interface extensions.

gcc/ChangeLog:

* doc/invoke.texi: Update module extension docs.

libcpp/ChangeLog:

* mkdeps.cc (make_write): Change .c++m to .c++-module.

gcc/testsuite/ChangeLog:

* g++.dg/modules/dep-1_a.C
* g++.dg/modules/dep-1_b.C
* g++.dg/modules/dep-2.C: Change .c++m to .c++-module.

Diff:
---
 gcc/doc/invoke.texi| 20 ++--
 gcc/cp/lang-specs.h|  6 ++
 gcc/testsuite/g++.dg/modules/dep-1_a.C |  4 ++--
 gcc/testsuite/g++.dg/modules/dep-1_b.C |  8 
 gcc/testsuite/g++.dg/modules/dep-2.C   |  4 ++--
 libcpp/mkdeps.cc   | 13 ++---
 6 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2cba380718b..517a782987d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2317,9 +2317,12 @@ other language.
 C++ source files conventionally use one of the suffixes @samp{.C},
 @samp{.cc}, @samp{.cpp}, @samp{.CPP}, @samp{.c++}, @samp{.cp}, or
 @samp{.cxx}; C++ header files often use @samp{.hh}, @samp{.hpp},
-@samp{.H}, or (for shared template code) @samp{.tcc}; and
-preprocessed C++ files use the suffix @samp{.ii}.  GCC recognizes
-files with these names and compiles them as C++ programs even if you
+@samp{.H}, or (for shared template code) @samp{.tcc};
+preprocessed C++ files use the suffix @samp{.ii}; and C++20 module interface
+units sometimes use @samp{.ixx}, @samp{.cppm}, @samp{.cxxm}, @samp{.c++m},
+or @samp{.ccm}.
+
+GCC recognizes files with these names and compiles them as C++ programs even 
if you
 call the compiler the same way as for compiling C programs (usually
 with the name @command{gcc}).
 
@@ -37705,13 +37708,10 @@ Modular compilation is @emph{not} enabled with just 
the
 version selected, although in pre-C++20 versions, it is of course an
 extension.
 
-No new source file suffixes are required or supported.  If you wish to
-use a non-standard suffix (@pxref{Overall Options}), you also need
-to provide a @option{-x c++} option too.@footnote{Some users like to
-distinguish module interface files with a new suffix, such as naming
-the source @code{module.cppm}, which involves
-teaching all tools about the new suffix.  A different scheme, such as
-naming @code{module-m.cpp} would be less invasive.}
+No new source file suffixes are required.  A few suffixes preferred
+for module interface units by other compilers (e.g. @samp{.ixx},
+@samp{.cppm}) are supported, but files with these suffixes are treated
+the same as any other C++ source file.
 
 Compiling a module interface unit produces an additional output (to
 the assembly or object file), called a Compiled Module Interface
diff --git a/gcc/cp/lang-specs.h b/gcc/cp/lang-specs.h
index 7a7f5ff0ab5..e5651567a2d 100644
--- a/gcc/cp/lang-specs.h
+++ b/gcc/cp/lang-specs.h
@@ -39,6 +39,12 @@ along with GCC; see the file COPYING3.  If not see
   {".HPP", "@c++-header", 0, 0, 0},
   {".tcc", "@c++-header", 0, 0, 0},
   {".hh",  "@c++-header", 0, 0, 0},
+  /* Module interface unit.  Should there also be a .C counterpart?  */
+  {".ixx", "@c++", 0, 0, 0}, /* MSVC */
+  {".cppm", "@c++", 0, 0, 0}, /* Clang/libc++ */
+  {".cxxm", "@c++", 0, 0, 0},
+  {".c++m", "@c++", 0, 0, 0},
+  {".ccm", "@c++", 0, 0, 0},
   {"@c++-header",
   "%{E|M|MM:cc1plus -E %{fmodules-ts:-fdirectives-only -fmodule-header}"
   "  %(cpp_options) %2 %(cpp_debug_options)}"
diff --git a/gcc/testsuite/g++.dg/modules/dep-1_a.C 
b/gcc/testsuite/g++.dg/modules/dep-1_a.C
index 5ec5dd30f6d..3e92eeaef9f 100644
--- a/gcc/testsuite/g++.dg/modules/dep-1_a.C
+++ b/gcc/testsuite/g++.dg/modules/dep-1_a.C
@@ -4,6 +4,6 @@ export module m:part;
 // { dg-module-cmi m:part }
 
 // All The Backslashes!
-// { dg-final { scan-file dep-1_a.d {\nm:part\.c\+\+m: gcm.cache/m-part\.gcm} 
} }
+// { dg-final { scan-file dep-1_a.d {\nm:part\.c\+\+-module: 
gcm.cache/m-part\.gcm} } }
 // { dg-final { scan-f

[gcc r15-901] [to-be-committed] [RISC-V] Use pack to handle repeating constants

2024-05-29 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:3ae02dcb108df426838bbbcc73d7d01855bc1196

commit r15-901-g3ae02dcb108df426838bbbcc73d7d01855bc1196
Author: Jeff Law 
Date:   Wed May 29 07:41:55 2024 -0600

[to-be-committed] [RISC-V] Use pack to handle repeating constants

This patch utilizes zbkb to improve the code we generate for 64bit constants
when the high half is a duplicate of the low half.

Basically we generate the low half and use a pack instruction with that same
register repeated.  ie

pack dest,src,src

That gives us a maximum sequence of 3 instructions and sometimes it will be
just 2 instructions (say if the low 32bits can be constructed with a single
addi or lui).

As with shadd, I'm abusing an RTL opcode.  This time it's CONCAT.  It's
reasonably close to what we're doing.  Obviously it's just how we identify
the desire to generate a pack in the array of opcodes.  We don't actually
emit a CONCAT.

Note that we don't care about the potential sign extension from bit 31.  pack
will only look at bits 0..31 of each input (for rv64).  So we go ahead and
sign extend before synthesizing the low part as that allows us to handle more
cases trivially.
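
The idea can be modelled in plain C. This is a sketch, not the code in riscv.cc: repeats_low_half, sext32 and pack_model are hypothetical helper names, and pack_model only mimics what the rv64 pack instruction does with the low 32 bits of each source register.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the high and low 32-bit halves of VALUE are identical, i.e.
   the constant is a candidate for the pack-based sequence.  */
static bool
repeats_low_half (uint64_t value)
{
  uint64_t loval = value & 0xffffffffu;
  return (value >> 32) == loval;
}

/* Sign extend the low 32 bits of V to 64 bits, as is done before the
   low part is synthesized.  */
static uint64_t
sext32 (uint64_t v)
{
  uint64_t lo = v & 0xffffffffu;
  return (lo ^ 0x80000000u) - 0x80000000u;
}

/* Model of rv64 "pack rd,rs1,rs2": the low 32 bits of RS1 form the low
   half of the result and the low 32 bits of RS2 form the high half.
   Bits 32..63 of both inputs are ignored.  */
static uint64_t
pack_model (uint64_t rs1, uint64_t rs2)
{
  return (rs1 & 0xffffffffu) | (rs2 << 32);
}
```

Because pack_model discards the upper halves of its inputs, feeding it the sign-extended low part twice still reproduces the original constant, even when bit 31 is set.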

I had my testsuite generator chew on random cases of a repeating constant
without any surprises.  I don't see much point in including all those in the
testcase (after all there's 2**32 of them).  I've got a set of 10 I'm
including.  Nothing particularly interesting in them.

An enterprising developer that needs this improved without zbkb could
probably do so with a bit of work.  First increase the cost by 1 unit.
Second avoid cases where bit 31 is set and restrict it to cases when we can
still create pseudos.  On the codegen side, when encountering the CONCAT,
generate the appropriate shift of "X" into a temporary register, then IOR
the temporary with "X" into the new destination.

Anyway, I've tested this in my tester (though it doesn't turn on zbkb, yet).
I'll let the CI system chew on it overnight, but like mine, I don't think it
lights up zbkb.  So it's unlikely to spit out anything interesting.

gcc/
* config/riscv/crypto.md (riscv_xpack___2): Remove '*' to
allow it to be used via the gen_* interface.
* config/riscv/riscv.cc (riscv_build_integer): Identify when Zbkb
can be used to profitably synthesize repeating constants.
(riscv_move_integer): Codegen changes to generate those Zbkb
sequences.

gcc/testsuite/

* gcc.target/riscv/synthesis-9.c: New test.

Diff:
---
 gcc/config/riscv/crypto.md   |  2 +-
 gcc/config/riscv/riscv.cc| 23 +++
 gcc/testsuite/gcc.target/riscv/synthesis-9.c | 28 
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
index b632312ade2..b9cac78fce1 100644
--- a/gcc/config/riscv/crypto.md
+++ b/gcc/config/riscv/crypto.md
@@ -107,7 +107,7 @@
 ;; This is slightly more complex than the other pack patterns
 ;; that fully expose the RTL as it needs to self-adjust to
 ;; rv32 and rv64.  But it's not that hard.
-(define_insn "*riscv_xpack__2"
+(define_insn "riscv_xpack___2"
   [(set (match_operand:X 0 "register_operand" "=r")
(ior:X (ashift:X (match_operand:X 1 "register_operand" "r")
 (match_operand 2 "immediate_operand" "n"))
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a99211d56b1..91fefacee80 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1123,6 +1123,22 @@ riscv_build_integer (struct riscv_integer_op *codes, 
HOST_WIDE_INT value,
}
 }
 
+  /* With pack we can generate a 64 bit constant with the same high
+ and low 32 bits triviall.  */
+  if (cost > 3 && TARGET_64BIT && TARGET_ZBKB)
+{
+  unsigned HOST_WIDE_INT loval = value & 0xffffffff;
+  unsigned HOST_WIDE_INT hival = value & ~loval;
+  if (hival >> 32 == loval)
+   {
+ cost = 1 + riscv_build_integer_1 (codes, sext_hwi (loval, 32), mode);
+ codes[cost - 1].code = CONCAT;
+ codes[cost - 1].value = 0;
+ codes[cost - 1].use_uw = false;
+   }
+
+}
+
   return cost;
 }
 
@@ -2679,6 +2695,13 @@ riscv_move_integer (rtx temp, rtx dest, HOST_WIDE_INT 
value,
  rtx t = can_create_pseudo_p () ? gen_reg_rtx (mode) : temp;
  x = riscv_emit_set (t, x);
}
+ else if (codes[i].code == CONCAT)
+   {
+ rtx t = can_create_pseudo_p () ? gen_reg_rtx (mode) : temp;
+ rtx t2 = gen_lowpart (SImode, x);
+ emit_insn (gen_riscv_xpack_di_si_2 (t, x, GEN_INT (32), t2));
+ x = t;
+   }
  else
x = gen_rtx_fmt_ee (codes[i].code, mode,

[gcc r15-902] c++: pragma target and static init [PR109753]

2024-05-29 Thread Jason Merrill via Gcc-cvs
https://gcc.gnu.org/g:eff00046409a7289bfdc1861e68b532895f91c0e

commit r15-902-geff00046409a7289bfdc1861e68b532895f91c0e
Author: Jason Merrill 
Date:   Wed Feb 14 17:18:17 2024 -0500

c++: pragma target and static init [PR109753]

 #pragma target and optimize should also apply to implicitly-generated
 functions like static initialization functions and defaulted special member
 functions.

The handle_optimize_attribute change is necessary to avoid regressing
g++.dg/opt/pr105306.C; maybe_clone_body creates a cgraph_node for the ~B
alias before handle_optimize_attribute, and the alias never goes through
finalize_function, so we need to adjust semantic_interposition somewhere
else.

PR c++/109753

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_optimize_attribute): Set
cgraph_node::semantic_interposition.

gcc/cp/ChangeLog:

* decl.cc (start_preparsed_function): Call decl_attributes.

gcc/testsuite/ChangeLog:

* g++.dg/opt/always_inline1.C: New test.

Diff:
---
 gcc/c-family/c-attribs.cc | 4 
 gcc/cp/decl.cc| 3 +++
 gcc/testsuite/g++.dg/opt/always_inline1.C | 8 
 3 files changed, 15 insertions(+)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 04e39b41bdf..605469dd7dd 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -5971,6 +5971,10 @@ handle_optimize_attribute (tree *node, tree name, tree 
args,
   if (prev_target_node != target_node)
DECL_FUNCTION_SPECIFIC_TARGET (*node) = target_node;
 
+  /* Also update the cgraph_node, if it's already built.  */
+  if (cgraph_node *cn = cgraph_node::get (*node))
+   cn->semantic_interposition = flag_semantic_interposition;
+
   /* Restore current options.  */
   cl_optimization_restore (&global_options, &global_options_set,
   &cur_opts);
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index a992d54dc8f..d481e1ec074 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -17832,6 +17832,9 @@ start_preparsed_function (tree decl1, tree attrs, int 
flags)
doing_friend = true;
 }
 
+  /* Adjust for #pragma target/optimize.  */
+  decl_attributes (&decl1, NULL_TREE, 0);
+
   if (DECL_DECLARED_INLINE_P (decl1)
   && lookup_attribute ("noinline", attrs))
 warning_at (DECL_SOURCE_LOCATION (decl1), 0,
diff --git a/gcc/testsuite/g++.dg/opt/always_inline1.C 
b/gcc/testsuite/g++.dg/opt/always_inline1.C
new file mode 100644
index 00000000000..a042a1cf0c6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/always_inline1.C
@@ -0,0 +1,8 @@
+// PR c++/109753
+// { dg-do compile { target x86_64-*-* } }
+
+#pragma GCC target("avx2")
+struct aa {
+__attribute__((__always_inline__)) aa() {}
+};
+aa _M_impl;


[gcc r15-903] vect: Unify bbs in loop_vec_info and bb_vec_info

2024-05-29 Thread Feng Xue via Gcc-cvs
https://gcc.gnu.org/g:9c747183efa555e45200523c162021e385511be5

commit r15-903-g9c747183efa555e45200523c162021e385511be5
Author: Feng Xue 
Date:   Thu May 16 11:08:38 2024 +0800

vect: Unify bbs in loop_vec_info and bb_vec_info

Both derived classes have their own "bbs" field.  The two fields serve
exactly the same purpose of recording all basic blocks inside the
corresponding vect region, but they have different data types: one is a
plain array, the other is an auto_vec.  This difference causes some
duplicated code even when handling the same stuff, mostly in
tree-vect-patterns.  One refinement is to lift this field into the base
class "vec_info", and set its value to the contiguous memory area pointed
to by the two old "bbs" in each constructor of the derived classes.

2024-05-16 Feng Xue 

gcc/
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Move
initialization of bbs to explicit construction code.  Adjust the
definition of nbbs.
(update_epilogue_loop_vinfo): Update nbbs for epilog vinfo.
* tree-vect-patterns.cc (vect_determine_precisions): Make
loop_vec_info and bb_vec_info share same code.
(vect_pattern_recog): Remove duplicated vect_pattern_recog_1 loop.
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Access to bbs[0]
via base vec_info class.
(_bb_vec_info::_bb_vec_info): Initialize bbs and nbbs using data
fields of input auto_vec<> bbs.
(vect_slp_region): Use access to nbbs to replace original
bbs.length().
(vect_schedule_slp_node): Access to bbs[0] via base vec_info class.
* tree-vectorizer.cc (vec_info::vec_info): Add initialization of
bbs and nbbs.
(vec_info::insert_seq_on_entry): Access to bbs[0] via base vec_info
class.
* tree-vectorizer.h (vec_info): Add new fields bbs and nbbs.
(LOOP_VINFO_NBBS): New macro.
(BB_VINFO_BBS): Rename BB_VINFO_BB to BB_VINFO_BBS.
(BB_VINFO_NBBS): New macro.
(_loop_vec_info): Remove field bbs.
(_bb_vec_info): Rename field bbs.

Diff:
---
 gcc/tree-vect-loop.c  |   0
 gcc/tree-vect-loop.cc |   7 ++-
 gcc/tree-vect-patterns.cc | 142 +-
 gcc/tree-vect-slp.cc  |  23 +---
 gcc/tree-vectorizer.cc|   7 ++-
 gcc/tree-vectorizer.h |  23 
 6 files changed, 74 insertions(+), 128 deletions(-)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
new file mode 100644
index 000..e69de29bb2d
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 3b94bb13a8b..04a9ac64df7 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -1028,7 +1028,6 @@ bb_in_loop_p (const_basic_block bb, const void *data)
 _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared)
   : vec_info (vec_info::loop, shared),
 loop (loop_in),
-bbs (XCNEWVEC (basic_block, loop->num_nodes)),
 num_itersm1 (NULL_TREE),
 num_iters (NULL_TREE),
 num_iters_unchanged (NULL_TREE),
@@ -1079,8 +1078,9 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared)
  case of the loop forms we allow, a dfs order of the BBs would the same
  as reversed postorder traversal, so we are safe.  */
 
-  unsigned int nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p,
- bbs, loop->num_nodes, loop);
+  bbs = XCNEWVEC (basic_block, loop->num_nodes);
+  nbbs = dfs_enumerate_from (loop->header, 0, bb_in_loop_p, bbs,
+loop->num_nodes, loop);
   gcc_assert (nbbs == loop->num_nodes);
 
   for (unsigned int i = 0; i < nbbs; i++)
@@ -11667,6 +11667,7 @@ update_epilogue_loop_vinfo (class loop *epilogue, tree advance)
 
   free (LOOP_VINFO_BBS (epilogue_vinfo));
   LOOP_VINFO_BBS (epilogue_vinfo) = epilogue_bbs;
+  LOOP_VINFO_NBBS (epilogue_vinfo) = epilogue->num_nodes;
 
   /* Advance data_reference's with the number of iterations of the previous
  loop and its prologue.  */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 8929e5aa7f3..88e7e34d78d 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -6925,81 +6925,41 @@ vect_determine_stmt_precisions (vec_info *vinfo, stmt_vec_info stmt_info)
 void
 vect_determine_precisions (vec_info *vinfo)
 {
+  basic_block *bbs = vinfo->bbs;
+  unsigned int nbbs = vinfo->nbbs;
+
   DUMP_VECT_SCOPE ("vect_determine_precisions");
 
-  if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
+  for (unsigned int i = 0; i < nbbs; i++)
 {
-  class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  unsigned int nbbs = loop->num_nodes;
-
-  for (unsigned int i = 0; i < nbbs; i++)
+  basic_block bb = bbs[i];
+  for (auto gsi 

[gcc r15-904] Delete a file due to push error

2024-05-29 Thread Feng Xue via Gcc-cvs
https://gcc.gnu.org/g:b24b081113c696f4e523c8ae53fc3ab89c3b4e4d

commit r15-904-gb24b081113c696f4e523c8ae53fc3ab89c3b4e4d
Author: Feng Xue 
Date:   Wed May 29 22:20:45 2024 +0800

Delete a file due to push error

gcc/
* tree-vect-loop.c : Removed.

Diff:
---
 gcc/tree-vect-loop.c | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
deleted file mode 100644
index e69de29bb2d..000


[gcc r15-905] libstdc++: Use RAII to replace try/catch blocks

2024-05-29 Thread Francois Dumont via Libstdc++-cvs
https://gcc.gnu.org/g:d22eaeca7634b57e80ea61cadd82902fdc7e57ea

commit r15-905-gd22eaeca7634b57e80ea61cadd82902fdc7e57ea
Author: François Dumont 
Date:   Thu May 16 06:59:50 2024 +0200

libstdc++: Use RAII to replace try/catch blocks

Move _Guard into std::vector declaration and use it to guard all calls to
vector _M_allocate.

Doing so gives the compiler more visibility into what is done with the
pointers, so it no longer raises the -Wfree-nonheap-object warning.
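
As a rough illustration of the pattern (not the libstdc++ code itself), the
sketch below replaces a try/catch cleanup with a guard whose destructor frees
the storage unless ownership was released. storage_guard, allocate_and_fill
and the deallocations counter are hypothetical names introduced here, and
std::allocator<int> stands in for the vector's allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Hypothetical counter so that callers can observe guard-driven cleanup.
int deallocations = 0;

// Frees the storage on scope exit unless release () was called first.
struct storage_guard
{
  storage_guard (int *s, std::size_t l, std::allocator<int> &a)
    : storage (s), len (l), alloc (a)
  { }

  ~storage_guard ()
  {
    if (storage)
      {
        alloc.deallocate (storage, len);
        ++deallocations;
      }
  }

  // Hand ownership back to the caller and disarm the guard.
  int *release ()
  {
    int *res = storage;
    storage = nullptr;
    return res;
  }

  storage_guard (const storage_guard &) = delete;
  storage_guard &operator= (const storage_guard &) = delete;

  int *storage;
  std::size_t len;
  std::allocator<int> &alloc;
};

// Allocate N ints and fill them; FAIL simulates a throwing element copy.
int *
allocate_and_fill (std::allocator<int> &alloc, std::size_t n, int value,
                   bool fail)
{
  assert (n > 0);
  storage_guard guard (alloc.allocate (n), n, alloc);
  if (fail)
    throw std::runtime_error ("simulated construction failure");
  std::uninitialized_fill_n (guard.storage, n, value);
  return guard.release ();
}
```

On the success path the guard is disarmed and the caller owns the memory; on
the throwing path stack unwinding runs the destructor, so no catch block is
needed.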

libstdc++-v3/ChangeLog:

* include/bits/vector.tcc (_Guard): Move all the nested duplicated class...
* include/bits/stl_vector.h (_Guard_alloc): ...here and rename.
(_M_allocate_and_copy): Use latter.
(_M_initialize_dispatch): Small code simplification.
(_M_range_initialize): Likewise and set _M_finish first from the result
of __uninitialize_fill_n_a that can throw.

Diff:
---
 libstdc++-v3/include/bits/stl_vector.h | 77 ++---
 libstdc++-v3/include/bits/vector.tcc   | 78 ++
 2 files changed, 55 insertions(+), 100 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 31169711a48..182ad41ed94 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1607,6 +1607,39 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   clear() _GLIBCXX_NOEXCEPT
   { _M_erase_at_end(this->_M_impl._M_start); }
 
+private:
+  // RAII guard for allocated storage.
+  struct _Guard_alloc
+  {
+   pointer _M_storage; // Storage to deallocate
+   size_type _M_len;
+   _Base& _M_vect;
+
+   _GLIBCXX20_CONSTEXPR
+   _Guard_alloc(pointer __s, size_type __l, _Base& __vect)
+   : _M_storage(__s), _M_len(__l), _M_vect(__vect)
+   { }
+
+   _GLIBCXX20_CONSTEXPR
+   ~_Guard_alloc()
+   {
+ if (_M_storage)
+   _M_vect._M_deallocate(_M_storage, _M_len);
+   }
+
+   _GLIBCXX20_CONSTEXPR
+   pointer
+   _M_release()
+   {
+ pointer __res = _M_storage;
+ _M_storage = pointer();
+ return __res;
+   }
+
+  private:
+   _Guard_alloc(const _Guard_alloc&);
+  };
+
 protected:
   /**
*  Memory expansion handler.  Uses the member allocation function to
@@ -1618,18 +1651,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
_M_allocate_and_copy(size_type __n,
 _ForwardIterator __first, _ForwardIterator __last)
{
- pointer __result = this->_M_allocate(__n);
- __try
-   {
- std::__uninitialized_copy_a(__first, __last, __result,
- _M_get_Tp_allocator());
- return __result;
-   }
- __catch(...)
-   {
- _M_deallocate(__result, __n);
- __throw_exception_again;
-   }
+ _Guard_alloc __guard(this->_M_allocate(__n), __n, *this);
+ std::__uninitialized_copy_a
+   (__first, __last, __guard._M_storage, _M_get_Tp_allocator());
+ return __guard._M_release();
}
 
 
@@ -1642,13 +1667,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   // 438. Ambiguity in the "do the right thing" clause
   template<typename _Integer>
void
-   _M_initialize_dispatch(_Integer __n, _Integer __value, __true_type)
+   _M_initialize_dispatch(_Integer __int_n, _Integer __value, __true_type)
{
- this->_M_impl._M_start = _M_allocate(_S_check_init_len(
-   static_cast<size_type>(__n), _M_get_Tp_allocator()));
- this->_M_impl._M_end_of_storage =
-   this->_M_impl._M_start + static_cast<size_type>(__n);
- _M_fill_initialize(static_cast<size_type>(__n), __value);
+ const size_type __n = static_cast<size_type>(__int_n);
+ pointer __start =
+   _M_allocate(_S_check_init_len(__n, _M_get_Tp_allocator()));
+ this->_M_impl._M_start = __start;
+ this->_M_impl._M_end_of_storage = __start + __n;
+ _M_fill_initialize(__n, __value);
}
 
   // Called by the range constructor to implement [23.1.1]/9
@@ -1690,13 +1716,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
std::forward_iterator_tag)
{
  const size_type __n = std::distance(__first, __last);
- this->_M_impl._M_start
-   = this->_M_allocate(_S_check_init_len(__n, _M_get_Tp_allocator()));
- this->_M_impl._M_end_of_storage = this->_M_impl._M_start + __n;
- this->_M_impl._M_finish =
-   std::__uninitialized_copy_a(__first, __last,
-   this->_M_impl._M_start,
-   _M_get_Tp_allocator());
+ pointer __start =
+   this->_M_allocate(_S_check_init_len(__n, _M_get_Tp_allocator()));
+ _Guard_alloc __guard(__start, __n, *this);
+ this->_M_im

[gcc r15-906] aarch64: Split aarch64_combinev16qi before RA [PR115258]

2024-05-29 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:39263ed2d39ac1cebde59bc5e72ddcad5dc7a1ec

commit r15-906-g39263ed2d39ac1cebde59bc5e72ddcad5dc7a1ec
Author: Richard Sandiford 
Date:   Wed May 29 16:43:33 2024 +0100

aarch64: Split aarch64_combinev16qi before RA [PR115258]

Two-vector TBL instructions are fed by an aarch64_combinev16qi, whose
purpose is to put the two input data vectors into consecutive registers.
This aarch64_combinev16qi was then split after reload into individual
moves (from the first input to the first half of the output, and from
the second input to the second half of the output).

In the worst case, the RA might allocate things so that the destination
of the aarch64_combinev16qi is the second input followed by the first
input.  In that case, the split form of aarch64_combinev16qi uses three
eors to swap the registers around.
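
For readers unfamiliar with the three-eor sequence mentioned above, it is the
classic exclusive-or swap, which exchanges two values without a scratch
register. The sketch below shows it on 64-bit integers rather than vector
registers; xor_swap and demo_xor_swap are hypothetical names used only for
illustration.

```cpp
#include <cassert>
#include <cstdint>

// Exchange A and B using three exclusive-or operations and no temporary.
void
xor_swap (std::uint64_t &a, std::uint64_t &b)
{
  if (&a == &b)
    return;   // With aliased operands the first xor would zero the value.
  a ^= b;     // a = a0 ^ b0
  b ^= a;     // b = b0 ^ a0 ^ b0 = a0
  a ^= b;     // a = a0 ^ b0 ^ a0 = b0
}

// Small self-check showing the intended use.
void
demo_xor_swap ()
{
  std::uint64_t lo = 0x1111, hi = 0x2222;
  xor_swap (lo, hi);
  assert (lo == 0x2222 && hi == 0x1111);
}
```

Each step depends on the previous one, which is part of why a sequence like
this is a poor outcome compared with an allocation that needs no swap at all.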

This PR is about a test where this worst case occurred.  And given the
insn description, that allocation doesn't seem unreasonable.

early-ra should (hopefully) mean that we're now better at allocating
subregs of vector registers.  The upcoming RA subreg patches should
improve things further.  The best fix for the PR therefore seems
to be to split the combination before RA, so that the RA can see
the underlying moves.

Perhaps it even makes sense to do this at expand time, avoiding the need
for aarch64_combinev16qi entirely.  That deserves more experimentation
though.

gcc/
PR target/115258
* config/aarch64/aarch64-simd.md (aarch64_combinev16qi): Allow
the split before reload.
* config/aarch64/aarch64.cc (aarch64_split_combinev16qi): Generalize
into a form that handles pseudo registers.

gcc/testsuite/
PR target/115258
* gcc.target/aarch64/pr115258.c: New test.

Diff:
---
 gcc/config/aarch64/aarch64-simd.md  |  2 +-
 gcc/config/aarch64/aarch64.cc   | 29 ++---
 gcc/testsuite/gcc.target/aarch64/pr115258.c | 19 +++
 3 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index c311888e4bd..868f4486218 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -8474,7 +8474,7 @@
UNSPEC_CONCAT))]
   "TARGET_SIMD"
   "#"
-  "&& reload_completed"
+  "&& 1"
   [(const_int 0)]
 {
   aarch64_split_combinev16qi (operands);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index ee12d8897a8..13191ec8e34 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25333,27 +25333,26 @@ aarch64_output_sve_ptrues (rtx const_unspec)
 void
 aarch64_split_combinev16qi (rtx operands[3])
 {
-  unsigned int dest = REGNO (operands[0]);
-  unsigned int src1 = REGNO (operands[1]);
-  unsigned int src2 = REGNO (operands[2]);
   machine_mode halfmode = GET_MODE (operands[1]);
-  unsigned int halfregs = REG_NREGS (operands[1]);
-  rtx destlo, desthi;
 
   gcc_assert (halfmode == V16QImode);
 
-  if (src1 == dest && src2 == dest + halfregs)
+  rtx destlo = simplify_gen_subreg (halfmode, operands[0],
+   GET_MODE (operands[0]), 0);
+  rtx desthi = simplify_gen_subreg (halfmode, operands[0],
+   GET_MODE (operands[0]),
+   GET_MODE_SIZE (halfmode));
+
+  bool skiplo = rtx_equal_p (destlo, operands[1]);
+  bool skiphi = rtx_equal_p (desthi, operands[2]);
+
+  if (skiplo && skiphi)
 {
   /* No-op move.  Can't split to nothing; emit something.  */
   emit_note (NOTE_INSN_DELETED);
   return;
 }
 
-  /* Preserve register attributes for variable tracking.  */
-  destlo = gen_rtx_REG_offset (operands[0], halfmode, dest, 0);
-  desthi = gen_rtx_REG_offset (operands[0], halfmode, dest + halfregs,
-  GET_MODE_SIZE (halfmode));
-
   /* Special case of reversed high/low parts.  */
   if (reg_overlap_mentioned_p (operands[2], destlo)
   && reg_overlap_mentioned_p (operands[1], desthi))
@@ -25366,16 +25365,16 @@ aarch64_split_combinev16qi (rtx operands[3])
 {
   /* Try to avoid unnecessary moves if part of the result
 is in the right place already.  */
-  if (src1 != dest)
+  if (!skiplo)
emit_move_insn (destlo, operands[1]);
-  if (src2 != dest + halfregs)
+  if (!skiphi)
emit_move_insn (desthi, operands[2]);
 }
   else
 {
-  if (src2 != dest + halfregs)
+  if (!skiphi)
emit_move_insn (desthi, operands[2]);
-  if (src1 != dest)
+  if (!skiplo)
emit_move_insn (destlo, operands[1]);
 }
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/pr115258.c b/gcc/testsuite/gcc.target/aarch64/pr115258.c
new file mode 100644
index 000..9a489d4604c
--- /dev/null
+++ b

[gcc r15-907] Match: Add maybe_bit_not instead of plain matching

2024-05-29 Thread Andrew Pinski via Gcc-cvs
https://gcc.gnu.org/g:0a9154d154957b21eb2c9e4fbe9869e50fb9742f

commit r15-907-g0a9154d154957b21eb2c9e4fbe9869e50fb9742f
Author: Andrew Pinski 
Date:   Sat May 25 23:29:48 2024 -0700

Match: Add maybe_bit_not instead of plain matching

While working on adding matching of negative expressions of `a - b`,
I noticed that we started to have "duplicated" patterns due to not having
a way to match maybe-negative expressions. So I went back to what I did for
bit_not and decided to improve the situation there: for some patterns
where we had 2 operands of an expression and one could have been a bit_not,
add back maybe_bit_not.
This does not add maybe_bit_not in every place where bitwise_inverted_equal_p
is used, just the ones where 2 operands of an expression could be swapped.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* match.pd (bit_not_with_nop): Unconditionalize.
(maybe_cmp): Likewise.
(maybe_bit_not): New match pattern.
(`~X & X`): Use maybe_bit_not and add `:c` back.
(`~x ^ x`/`~x | x`): Likewise.

Signed-off-by: Andrew Pinski 

Diff:
---
 gcc/match.pd | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 024e3350465..090ad4e08b0 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -167,7 +167,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)))
   && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE 
(@0))
 
-#if GIMPLE
 /* These are used by gimple_bitwise_inverted_equal_p to simplify
detection of BIT_NOT and comparisons. */
 (match (bit_not_with_nop @0)
@@ -188,7 +187,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (bit_xor@0 @1 @2)
  (if (INTEGRAL_TYPE_P (type)
   && TYPE_PRECISION (type) == 1)))
-#endif
+/* maybe_bit_not is used to match what
+   is acceptable for bitwise_inverted_equal_p. */
+(match (maybe_bit_not @0)
+ (bit_not_with_nop@0 @1))
+(match (maybe_bit_not @0)
+ (INTEGER_CST@0))
+(match (maybe_bit_not @0)
+ (maybe_cmp@0 @1))
 
 /* Transform likes of (char) ABS_EXPR <(int) x> into (char) ABSU_EXPR <x>
ABSU_EXPR returns unsigned absolute value of the operand and the operand
@@ -1332,7 +1338,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* Simplify ~X & X as zero.  */
 (simplify
- (bit_and (convert? @0) (convert? @1))
+ (bit_and:c (convert? @0) (convert? (maybe_bit_not @1)))
  (with { bool wascmp; }
   (if (types_match (TREE_TYPE (@0), TREE_TYPE (@1))
&& bitwise_inverted_equal_p (@0, @1, wascmp))
@@ -1597,7 +1603,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* ~x ^ x -> -1 */
 (for op (bit_ior bit_xor)
  (simplify
-  (op (convert? @0) (convert? @1))
+  (op:c (convert? @0) (convert? (maybe_bit_not @1)))
   (with { bool wascmp; }
(if (types_match (TREE_TYPE (@0), TREE_TYPE (@1))
 && bitwise_inverted_equal_p (@0, @1, wascmp))


[gcc r15-908] match: Add support for `a ^ CST` to bitwise_inverted_equal_p [PR115224]

2024-05-29 Thread Andrew Pinski via Gcc-cvs
https://gcc.gnu.org/g:547143df5aa0960fb149a26933dad7ca1c363afb

commit r15-908-g547143df5aa0960fb149a26933dad7ca1c363afb
Author: Andrew Pinski 
Date:   Sun May 26 17:38:37 2024 -0700

match: Add support for `a ^ CST` to bitwise_inverted_equal_p [PR115224]

While looking into something else, I noticed that `a ^ CST` needed to be
special-cased in bitwise_inverted_equal_p, as its bitwise not would simplify
to `a ^ ~CST`.
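
The identity behind this can be checked in isolation. The sketch below is not
GCC code; xor_cst_is_inverted, f1_as_written, f1_folded and demo_xor_cst are
hypothetical helpers. It verifies that ~(a ^ CST) equals a ^ ~CST, and that an
expression shaped like the new test can be rewritten accordingly.

```cpp
#include <cassert>
#include <cstdint>

// True when the bitwise not of `a ^ cst` equals `a ^ ~cst`.
constexpr bool
xor_cst_is_inverted (std::uint32_t a, std::uint32_t cst)
{
  return static_cast<std::uint32_t> (~(a ^ cst))
         == static_cast<std::uint32_t> (a ^ ~cst);
}

static_assert (xor_cst_is_inverted (0x12345678u, 1u),
               "~(a ^ CST) should equal a ^ ~CST");

// Same shape as the f1 test: a is xor'ed with 1, then c = ~a, c | (a ^ b).
constexpr int
f1_as_written (int a, int b)
{
  return ~(a ^ 1) | ((a ^ 1) ^ b);
}

// One of the two forms named in the test's comment.
constexpr int
f1_folded (int a, int b)
{
  return (a ^ -2) | ~b;
}

// Compare both forms over a small grid of inputs.
void
demo_xor_cst ()
{
  for (int a = -8; a <= 8; a++)
    for (int b = -8; b <= 8; b++)
      assert (f1_as_written (a, b) == f1_folded (a, b));
}
```

Because ~1 is -2 in two's complement, the `a ^ 1` and `a ^ -2` operands are
bitwise inverses of each other, which is the case the commit teaches
bitwise_inverted_equal_p to recognize.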

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/115224

gcc/ChangeLog:

* generic-match-head.cc (bitwise_inverted_equal_p): Add `a ^ CST`
case.
* gimple-match-head.cc (gimple_bit_xor_cst): New declaration.
(gimple_bitwise_inverted_equal_p): Add `a ^ CST` case.
* match.pd (bit_xor_cst): New match.
(maybe_bit_not): Add bit_xor_cst case.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bitops-8.c: New test.

Signed-off-by: Andrew Pinski 

Diff:
---
 gcc/generic-match-head.cc| 10 ++
 gcc/gimple-match-head.cc | 13 +
 gcc/match.pd |  4 
 gcc/testsuite/gcc.dg/tree-ssa/bitops-8.c | 15 +++
 4 files changed, 42 insertions(+)

diff --git a/gcc/generic-match-head.cc b/gcc/generic-match-head.cc
index 55ba369c6b3..641d8e9b2de 100644
--- a/gcc/generic-match-head.cc
+++ b/gcc/generic-match-head.cc
@@ -158,6 +158,16 @@ bitwise_inverted_equal_p (tree expr1, tree expr2, bool &wascmp)
   if (TREE_CODE (expr2) == BIT_NOT_EXPR
   && bitwise_equal_p (expr1, TREE_OPERAND (expr2, 0)))
 return true;
+
+  /* `X ^ CST` and `X ^ ~CST` match for ~. */
+  if (TREE_CODE (expr1) == BIT_XOR_EXPR && TREE_CODE (expr2) == BIT_XOR_EXPR
+  && bitwise_equal_p (TREE_OPERAND (expr1, 0), TREE_OPERAND (expr2, 0)))
+{
+  tree cst1 = uniform_integer_cst_p (TREE_OPERAND (expr1, 1));
+  tree cst2 = uniform_integer_cst_p (TREE_OPERAND (expr2, 1));
+  if (cst1 && cst2 && wi::to_wide (cst1) == ~wi::to_wide (cst2))
+   return true;
+}
   if (COMPARISON_CLASS_P (expr1)
   && COMPARISON_CLASS_P (expr2))
 {
diff --git a/gcc/gimple-match-head.cc b/gcc/gimple-match-head.cc
index 6220725b259..e26fa0860ee 100644
--- a/gcc/gimple-match-head.cc
+++ b/gcc/gimple-match-head.cc
@@ -283,6 +283,7 @@ gimple_bitwise_equal_p (tree expr1, tree expr2, tree (*valueize) (tree))
 
 bool gimple_bit_not_with_nop (tree, tree *, tree (*) (tree));
 bool gimple_maybe_cmp (tree, tree *, tree (*) (tree));
+bool gimple_bit_xor_cst (tree, tree *, tree (*) (tree));
 
 /* Helper function for bitwise_inverted_equal_p macro.  */
 
@@ -301,6 +302,18 @@ gimple_bitwise_inverted_equal_p (tree expr1, tree expr2, 
bool &wascmp, tree (*va
   if (operand_equal_p (expr1, expr2, 0))
 return false;
 
+  tree xor1[2];
+  tree xor2[2];
+  /* `X ^ CST` and `X ^ ~CST` match for ~. */
+  if (gimple_bit_xor_cst (expr1, xor1, valueize)
+  && gimple_bit_xor_cst (expr2, xor2, valueize))
+{
+  if (operand_equal_p (xor1[0], xor2[0], 0)
+ && (wi::to_wide (uniform_integer_cst_p (xor1[1]))
+ == ~wi::to_wide (uniform_integer_cst_p (xor2[1]))))
+   return true;
+}
+
   tree other;
   /* Try if EXPR1 was defined as ~EXPR2. */
   if (gimple_bit_not_with_nop (expr1, &other, valueize))
diff --git a/gcc/match.pd b/gcc/match.pd
index 090ad4e08b0..480e36bbbaf 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -174,6 +174,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (match (bit_not_with_nop @0)
  (convert (bit_not @0))
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
+(match (bit_xor_cst @0 @1)
+ (bit_xor @0 uniform_integer_cst_p@1))
 (for cmp (tcc_comparison)
  (match (maybe_cmp @0)
   (cmp@0 @1 @2))
@@ -195,6 +197,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (INTEGER_CST@0))
 (match (maybe_bit_not @0)
  (maybe_cmp@0 @1))
+(match (maybe_bit_not @0)
+ (bit_xor_cst@0 @1 @2))
 
 /* Transform likes of (char) ABS_EXPR <(int) x> into (char) ABSU_EXPR <x>
ABSU_EXPR returns unsigned absolute value of the operand and the operand
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitops-8.c b/gcc/testsuite/gcc.dg/tree-ssa/bitops-8.c
new file mode 100644
index 000..40f756e4455
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitops-8.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/115224 */
+
+int f1(int a, int b)
+{
+a = a ^ 1;
+int c = ~a;
+return c | (a ^ b);
+// ~((a ^ 1) & b) or (a ^ -2) | ~b
+}
+/* { dg-final { scan-tree-dump-times   "bit_xor_expr, "  1  "optimized" } } */
+/* { dg-final { scan-tree-dump-times   "bit_ior_expr, "  1  "optimized" } } */
+/* { dg-final { scan-tree-dump-times   "bit_not_expr, "  1  "optimized" } } */
+


[gcc r15-909] PR modula2/115276 bugfix libgm2 wraptime.InitTM returns NIL

2024-05-29 Thread Gaius Mulley via Gcc-cvs
https://gcc.gnu.org/g:d1a1f7e9f0bedea55c558ab95127679bc3e9ff72

commit r15-909-gd1a1f7e9f0bedea55c558ab95127679bc3e9ff72
Author: Gaius Mulley 
Date:   Wed May 29 17:26:59 2024 +0100

PR modula2/115276 bugfix libgm2 wraptime.InitTM returns NIL

This patch fixes libgm2/libm2iso/wraptime.cc:InitTM so that
it does not always return NULL.  The incorrect autoconf macro
was used (inside InitTM) and the function short circuited
to return NULL.  The fix is to use HAVE_SYS_TIME_H and use
AC_HEADER_TIME in libgm2/configure.ac.
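
The failure mode is easy to reproduce in miniature. The sketch below is not
the libgm2 source: SKETCH_HAVE_GOOD_MACRO, SKETCH_HAVE_MISSING_MACRO and the
two functions are hypothetical names. It shows how guarding a function body
with a macro that configure never defines silently selects the fallback that
returns a null pointer.

```cpp
#include <cassert>
#include <cstdlib>
#include <ctime>

// Hypothetical configure result: this macro is defined...
#define SKETCH_HAVE_GOOD_MACRO 1
// ...while SKETCH_HAVE_MISSING_MACRO is never defined anywhere.

// Guarded by the macro that is never defined: always takes the fallback.
std::tm *
init_tm_wrong_guard ()
{
#if defined(SKETCH_HAVE_MISSING_MACRO)
  return static_cast<std::tm *> (std::calloc (1, sizeof (std::tm)));
#else
  return nullptr;
#endif
}

// Guarded by a macro that configure does define: allocates as intended.
std::tm *
init_tm_right_guard ()
{
#if defined(SKETCH_HAVE_GOOD_MACRO)
  return static_cast<std::tm *> (std::calloc (1, sizeof (std::tm)));
#else
  return nullptr;
#endif
}

// Self-check contrasting the two guards.
void
demo_init_tm ()
{
  assert (init_tm_wrong_guard () == nullptr);
  std::tm *t = init_tm_right_guard ();
  assert (t != nullptr);
  std::free (t);
}
```

Nothing fails at compile time in the wrong-guard case, which is why a runtime
test such as the new testinittm.mod is useful.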

libgm2/ChangeLog:

PR modula2/115276
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use AC_HEADER_TIME.
* libm2iso/wraptime.cc (InitTM): Check HAVE_SYS_TIME_H
before using struct tm to obtain the size.

gcc/testsuite/ChangeLog:

PR modula2/115276
* gm2/isolib/run/pass/testinittm.mod: New test.

Signed-off-by: Gaius Mulley 

Diff:
---
 gcc/testsuite/gm2/isolib/run/pass/testinittm.mod | 17 +++
 libgm2/config.h.in   |  3 ++
 libgm2/configure | 39 ++--
 libgm2/configure.ac  |  1 +
 libgm2/libm2iso/wraptime.cc  |  2 +-
 5 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gm2/isolib/run/pass/testinittm.mod b/gcc/testsuite/gm2/isolib/run/pass/testinittm.mod
new file mode 100644
index 000..dfe041140f1
--- /dev/null
+++ b/gcc/testsuite/gm2/isolib/run/pass/testinittm.mod
@@ -0,0 +1,17 @@
+MODULE testinittm ;
+
+FROM wraptime IMPORT InitTM, tm ;
+FROM libc IMPORT printf, exit ;
+
+VAR
+   m: tm ;
+BEGIN
+   m := InitTM () ;
+   IF m = NIL
+   THEN
+  printf ("InitTM failed\n");
+  exit (1)
+   ELSE
+  printf ("InitTM passed\n")
+   END
+END testinittm.
diff --git a/libgm2/config.h.in b/libgm2/config.h.in
index 7426cb26cf8..321ef3b807f 100644
--- a/libgm2/config.h.in
+++ b/libgm2/config.h.in
@@ -335,6 +335,9 @@
 /* Define to 1 if you have the ANSI C header files. */
 #undef STDC_HEADERS
 
+/* Define to 1 if you can safely include both <sys/time.h> and <time.h>. */
+#undef TIME_WITH_SYS_TIME
+
 /* Enable extensions on AIX 3, Interix.  */
 #ifndef _ALL_SOURCE
 # undef _ALL_SOURCE
diff --git a/libgm2/configure b/libgm2/configure
index 13861f0ff93..c36fd7d4cac 100755
--- a/libgm2/configure
+++ b/libgm2/configure
@@ -6837,6 +6837,41 @@ $as_echo "#define HAVE_SYS_WAIT_H 1" >>confdefs.h
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether time.h and sys/time.h may both be included" >&5
+$as_echo_n "checking whether time.h and sys/time.h may both be included... " >&6; }
+if ${ac_cv_header_time+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <sys/types.h>
+#include <sys/time.h>
+#include <time.h>
+
+int
+main ()
+{
+if ((struct tm *) 0)
+return 0;
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  ac_cv_header_time=yes
+else
+  ac_cv_header_time=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_header_time" >&5
+$as_echo "$ac_cv_header_time" >&6; }
+if test $ac_cv_header_time = yes; then
+
+$as_echo "#define TIME_WITH_SYS_TIME 1" >>confdefs.h
+
+fi
+
 ac_fn_c_check_header_mongrel "$LINENO" "math.h" "ac_cv_header_math_h" "$ac_includes_default"
 if test "x$ac_cv_header_math_h" = xyes; then :
 
@@ -14544,7 +14579,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 14547 "configure"
+#line 14582 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -14650,7 +14685,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 14653 "configure"
+#line 14688 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/libgm2/configure.ac b/libgm2/configure.ac
index 9563831ddc5..1e6b82305ff 100644
--- a/libgm2/configure.ac
+++ b/libgm2/configure.ac
@@ -88,6 +88,7 @@ AC_ARG_WITH(cross-host,
 # Checks for header files.
 AC_HEADER_STDC
 AC_HEADER_SYS_WAIT
+AC_HEADER_TIME
 AC_CHECK_HEADER([math.h],
   [AC_DEFINE([HAVE_MATH_H], [1], [have math.h])])
 
diff --git a/libgm2/libm2iso/wraptime.cc b/libgm2/libm2iso/wraptime.cc
index 158086b75cc..4bbd5f9701d 100644
--- a/libgm2/libm2iso/wraptime.cc
+++ b/libgm2/libm2iso/wraptime.cc
@@ -113,7 +113,7 @@ EXPORT(KillTimezone) (struct timezone *tv)
 
 /* InitTM - returns a newly created opaque type.  */
 
-#if defined(HAVE_STRUCT_TM) && defined(HAVE_MALLOC_H)
+#if defined(HAVE_SYS_TIME_H) && defined(HAVE_MALLOC_H)
 extern "C" struct tm *
 EXPORT(InitTM) (void)
 {


[gcc r15-910] MIPS/testsuite: Fix bseli.b fail in msa-builtins.c

2024-05-29 Thread YunQiang Su via Gcc-cvs
https://gcc.gnu.org/g:9a92e5e56a7f2b19928b8cb7634f59d9c7b2b582

commit r15-910-g9a92e5e56a7f2b19928b8cb7634f59d9c7b2b582
Author: YunQiang Su 
Date:   Tue May 28 23:44:49 2024 +0800

MIPS/testsuite: Fix bseli.b fail in msa-builtins.c

commit 05daf617ea22e1d818295ed2d037456937e23530
Author: Jeff Law 
Date:   Sat May 25 12:39:05 2024 -0600

[committed] [v2] More logical op simplifications in simplify-rtx.cc

does some simplifications, after which `bseli.b $w1,$w0,255` is found to be
the same as `or.v $w1,$w0,$w1`. So no bseli.b instruction is generated.

Let's use 254 instead of 255 to test the generation of `bseli.b`.
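
The equivalence can be checked exhaustively on single bytes. The sketch below
assumes the bit-select semantics to be "take each result bit from the
immediate where the destination bit is set, and from the source otherwise";
that is an assumption about the instruction, not something stated in this
message. bit_select_imm, imm_acts_like_or and demo_bseli are hypothetical
names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed per-byte model: destination bits choose between IMM and WS.
std::uint8_t
bit_select_imm (std::uint8_t wd, std::uint8_t ws, std::uint8_t imm)
{
  return static_cast<std::uint8_t> ((ws & ~wd) | (imm & wd));
}

// True when selecting with IMM gives wd | ws for every pair of bytes.
bool
imm_acts_like_or (std::uint8_t imm)
{
  for (int wd = 0; wd < 256; wd++)
    for (int ws = 0; ws < 256; ws++)
      if (bit_select_imm (wd, ws, imm) != (wd | ws))
        return false;
  return true;
}

// Self-check: 255 degenerates to a plain OR, 254 does not.
void
demo_bseli ()
{
  assert (imm_acts_like_or (255));
  assert (!imm_acts_like_or (254));
}
```

Under that model an immediate of 255 makes the select collapse to an OR, while
254 keeps the low bit distinct, so the instruction cannot be simplified away.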

gcc/testsuite

* gcc.target/mips/msa-builtins.c: Use 254 instead of 255 for
bseli.b, as `bseli.b $w0,$w1,255` is same as `or.v $w0,$w0,$w1`.

Diff:
---
 gcc/testsuite/gcc.target/mips/msa-builtins.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/mips/msa-builtins.c b/gcc/testsuite/gcc.target/mips/msa-builtins.c
index a679f065f34..6a146b3e6ae 100644
--- a/gcc/testsuite/gcc.target/mips/msa-builtins.c
+++ b/gcc/testsuite/gcc.target/mips/msa-builtins.c
@@ -705,7 +705,7 @@
 #define BNEG(T) NOMIPS16 T FN (bneg, T ## _DF) (T i, T j) { return BUILTIN (bneg, T ## _DF) (i, j); }
 #define BNEGI(T) NOMIPS16 T FN (bnegi, T ## _DF) (T i) { return BUILTIN (bnegi, T ## _DF) (i, 0); }
 #define BSEL(T) NOMIPS16 T FN (bsel, v) (T i, T j, T k) { return BUILTIN (bsel, v) (i, j, k); }
-#define BSELI(T) NOMIPS16 T FN (bseli, T ## _DF) (T i, T j) { return BUILTIN (bseli, T ## _DF) (i, j, U8MAX); }
+#define BSELI(T) NOMIPS16 T FN (bseli, T ## _DF) (T i, T j) { return BUILTIN (bseli, T ## _DF) (i, j, U8MAX-1); }
 #define BSET(T) NOMIPS16 T FN (bset, T ## _DF) (T i, T j) { return BUILTIN (bset, T ## _DF) (i, j); }
 #define BSETI(T) NOMIPS16 T FN (bseti, T ## _DF) (T i) { return BUILTIN (bseti, T ## _DF) (i, 0); }
 #define NLOC(T) NOMIPS16 T FN (nloc, T ## _DF) (T i) { return BUILTIN (nloc, T ## _DF) (i); }


[gcc r15-911] MIPS16: Mark $2/$3 as clobbered if GP is used

2024-05-29 Thread YunQiang Su via Gcc-cvs
https://gcc.gnu.org/g:915440eed21de367cb41857afb5273aff5bcb737

commit r15-911-g915440eed21de367cb41857afb5273aff5bcb737
Author: YunQiang Su 
Date:   Wed May 29 02:28:25 2024 +0800

MIPS16: Mark $2/$3 as clobbered if GP is used

PR Target/84790.
The gp init sequence
li  $2,%hi(_gp_disp)
addiu   $3,$pc,%lo(_gp_disp)
sll $2,16
addu$2,$3
is generated directly in `mips_output_function_prologue`, and does
not appear in the RTL.

As a result, the IRA/IPA passes are not aware that $2/$3 have been clobbered,
and may use them across (local) function calls.

Let's mark $2/$3 as clobbered both:
  - Just after the UNSPEC_GP RTL of a function;
  - Just after a function call.

Reported-by: Matthias Schiffer 
Origin-Patch-by: Felix Fietkau .

gcc
* config/mips/mips.cc(mips16_gp_pseudo_reg): Mark
MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered.
(mips_emit_call_insn): Mark MIPS16_PIC_TEMP and
MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.

Diff:
---
 gcc/config/mips/mips.cc | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index b63d40a357b..b478cddc8ad 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -3233,6 +3233,9 @@ mips_emit_call_insn (rtx pattern, rtx orig_addr, rtx addr, bool lazy_p)
 {
   rtx post_call_tmp_reg = gen_rtx_REG (word_mode, POST_CALL_TMP_REG);
   clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), post_call_tmp_reg);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), MIPS16_PIC_TEMP);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn),
+   MIPS_PROLOGUE_TEMP (word_mode));
 }
 
   return insn;
@@ -3329,7 +3332,13 @@ mips16_gp_pseudo_reg (void)
   rtx set = gen_load_const_gp (cfun->machine->mips16_gp_pseudo_rtx);
   rtx_insn *insn = emit_insn_after (set, scan);
   INSN_LOCATION (insn) = 0;
-
+  /* NewABI support hasn't been implement.  NewABI should generate RTL
+sequence instead of ASM sequence directly.  */
+  if (mips_current_loadgp_style () == LOADGP_OLDABI)
+   {
+ emit_clobber (MIPS16_PIC_TEMP);
+ emit_clobber (MIPS_PROLOGUE_TEMP (Pmode));
+   }
   pop_topmost_sequence ();
 }


[gcc/ibm/heads/gcc-13-branch] (553 commits) ibm: Merge up to top of releases/gcc-13

2024-05-29 Thread Peter Bergner via Gcc-cvs
The branch 'ibm/heads/gcc-13-branch' was updated to point to:

 c3db5f495a1... ibm: Merge up to top of releases/gcc-13

It previously pointed to:

 efb4bfb219d... ibm: Merge up to top of releases/gcc-13

Diff:

Summary of changes (added commits):
---

  c3db5f4... ibm: Merge up to top of releases/gcc-13
  ebca600... Daily bump. (*)
  fd91953... libstdc++: Fix up 19_diagnostics/stacktrace/hash.cc on 13 b (*)
  3185cfe... Fortran: Fix SHAPE for zero-size arrays (*)
  67434fe... libstdc++: Guard use of sized deallocation [PR114940] (*)
  d7f9f23... Daily bump. (*)
  b954f15... Daily bump. (*)
  513d050... Daily bump. (*)
  91c7ec5... Daily bump. (*)
  53cdaa7... c++: unroll pragma in templates [PR111529] (*)
  5f14578... c++: array of PMF [PR113598] (*)
  cf76815... Daily bump. (*)
  6f8933c... Daily bump. (*)
  75d394c... testsuite: Verify r0-r3 are extended with CMSE (*)
  f0b88ec... Fortran: fix issues with class(*) assignment [PR114827] (*)
  2ebf3af... Fortran: fix reallocation on assignment of polymorphic vari (*)
  53bc98f... strlen: Fix up !si->full_string_p handling in count_nonzero (*)
  35ac28b... ubsan: Use right address space for MEM_REF created for bool (*)
  a841964... Daily bump. (*)
  9433e30... libstdc++: testsuite: Enhance codecvt_unicode with tests fo (*)
  bd5e672... libstdc++: Fix handling of surrogate CP in codecvt [PR10897 (*)
  0a9df2c... c++: Fix std dialect hint for std::to_address [PR107800] (*)
  5ed32d0... Fortran: fix dependency checks for inquiry refs [PR115039] (*)
  c827f46... testsuite: Adjust pr113359-2_*.c with unsigned long long [P (*)
  3f6a425... PHIOPT: Don't transform minmax if middle bb contains a phi  (*)
  d6cf49e... match: Disable `(type)zero_one_valuep*CST` for 1bit signed  (*)
  bde5894... Bump BASE-VER. (*)
  b71f1de... Update ChangeLog and version files for release (*)
  a021b58... Daily bump. (*)
  4416023... Daily bump. (*)
  94509b6... Daily bump. (*)
  162c441... [committed] Fix RISC-V missing stack tie (*)
  5b5342e... Daily bump. (*)
  851aa3b... Daily bump. (*)
  1db45e8... ipa: Compare jump functions in ICF (PR 113907) (*)
  10bf53a... ICF&SRA: Make ICF and SRA agree on padding (*)
  7dca716... libstdc++: Fix typo in std::stacktrace::max_size [PR115063] (*)
  71e941b... libstdc++: Fix infinite loop in std::binomial_distribution  (*)
  b9e2a32... libstdc++: Adjust expected locale-dependent date formats in (*)
  ebc61a9... libstdc++: Fix typo in Doxygen comment (*)
  bce15a5... libstdc++: Fix run_doxygen for Doxygen 1.10 man page format (*)
  47cac09... c++: build_extra_args recapturing local specs [PR114303] (*)
  12ee04d... Daily bump. (*)
  d3659e2... c++: constexpr union member access folding [PR114709] (*)
  2e353c6... Manually add ChangeLog entries for various commits from 202 (*)
  d629308... rtl-optimization/54052 - RTL SSA PHI insertion compile-time (*)
  6d1801f... Daily bump. (*)
  b7a2697... diagnostics: fix corrupt json/SARIF on stderr [PR114348] (*)
  2a6f99a... Fix ICE in -fdiagnostics-generate-patch [PR112684] (*)
  230f672... diagnostics: fix ICE on sarif output when source file is un (*)
  96f7a36... analyzer: fix ICE and false positive with -Wanalyzer-deref- (*)
  810d35a... analyzer: fix ICE due to type mismatch when replaying call  (*)
  ed02610... analyzer: fix -Wanalyzer-deref-before-check false positive  (*)
  67d104f... analyzer: fix -Wanalyzer-va-arg-type-mismatch false +ve on  (*)
  2c688f6... analyzer: fix skipping of debug stmts [PR113253] (*)
  0593151... analyzer: fix defaults in compound assignments from non-zer (*)
  132eb1a... analyzer: casting all zeroes should give all zeroes [PR1133 (*)
  994477c... analyzer: fix deref-before-check false positives due to inl (*)
  a1cb188... analyzer: fix ICE for 2 bits before the start of base regio (*)
  b8c772c... jit: dump string literal initializers correctly (*)
  44968a0... testsuite, analyzer: add test case [PR108171] (*)
  a0b13d0... analyzer: fix ICE on zero-sized arrays [PR110882] (*)
  0df1ee0... analyzer: fix ICE on division of tainted floating-point val (*)
  60dcb71... jit.exp: handle dwarf version mismatch in jit-check-debug-i (*)
  b38472f... jit: avoid using __vector in testcase [PR110466] (*)
  e0c5290... testsuite: Add more allocation size tests for conjured sval (*)
  ccf8d3e... analyzer: Fix allocation size false positive on conjured sv (*)
  89feb35... analyzer: add caching to globals with initializers [PR11011 (*)
  e30211c... [PR114415][scheduler]: Fixing wrong code generation (*)
  421311a... Fix range-ops operator_addr. (*)
  fefdb9f... Daily bump. (*)
  6f7674a... testsuite: Fix up vector-subaccess-1.C test for ia32 [PR892 (*)
  adba85b... AVR: target/114981 - Support __builtin_powi[l] / __powidf2. (*)
  44d84db... reassoc: Fix up optimize_range_tests_to_bit_test [PR114965] (*)
  cad27df... expansion: Use __trunchfbf2 calls rather than __extendhfbf2 (*)
  d1ec7bc... tree-inline: Remove .ASAN_MARK calls when inlining function 

[gcc(refs/vendors/ibm/heads/gcc-13-branch)] ibm: Merge up to top of releases/gcc-13

2024-05-29 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:c3db5f495a1543fb22f725be910dc46249a15e57

commit c3db5f495a1543fb22f725be910dc46249a15e57
Merge: efb4bfb219d ebca6006f44
Author: Peter Bergner 
Date:   Wed May 29 10:48:31 2024 -0500

ibm: Merge up to top of releases/gcc-13

2024-05-29  Peter Bergner  

Merge up to releases/gcc-13 ebca6006f44408b8084868da6613f185b810db74

Diff:

 ChangeLog  |   15 +
 Makefile.in|   30 +
 Makefile.tpl   |   24 +
 c++tools/ChangeLog |4 +
 config/ChangeLog   |4 +
 contrib/ChangeLog  |   13 +
 contrib/dg-extract-results.sh  |   17 +-
 contrib/header-tools/ChangeLog |4 +
 contrib/reghunt/ChangeLog  |4 +
 contrib/regression/ChangeLog   |4 +
 fixincludes/ChangeLog  |4 +
 gcc/BASE-VER   |2 +-
 gcc/ChangeLog  | 1964 ++
 gcc/ChangeLog.ibm  |4 +
 gcc/DATESTAMP  |2 +-
 gcc/ada/ChangeLog  |   50 +
 gcc/ada/exp_attr.adb   |   63 +-
 gcc/ada/exp_ch4.adb|2 -
 gcc/ada/exp_ch7.adb|   13 +
 gcc/ada/exp_util.adb   |   15 +-
 gcc/ada/sem_aggr.adb   |9 +-
 gcc/ada/sem_ch13.adb   |   12 +-
 gcc/ada/sem_res.adb|   14 +-
 gcc/analyzer/ChangeLog |  148 +
 gcc/analyzer/call-summary.cc   |   12 +
 gcc/analyzer/checker-event.cc  |   40 -
 gcc/analyzer/constraint-manager.cc |  131 +
 gcc/analyzer/constraint-manager.h  |1 +
 gcc/analyzer/engine.cc |7 +
 gcc/analyzer/inlining-iterator.h   |   40 +
 gcc/analyzer/kf.cc |   22 +
 gcc/analyzer/region-model-manager.cc   |9 +-
 gcc/analyzer/region-model.cc   |  110 +-
 gcc/analyzer/region.cc |   77 +-
 gcc/analyzer/region.h  |   14 +-
 gcc/analyzer/sm-malloc.cc  |   40 +
 gcc/analyzer/sm-taint.cc   |6 +
 gcc/analyzer/state-purge.cc|9 +
 gcc/analyzer/store.cc  |   11 +-
 gcc/analyzer/store.h   |   10 +-
 gcc/analyzer/supergraph.cc |4 +
 gcc/analyzer/varargs.cc|   38 +-
 gcc/asan.cc|   52 +-
 gcc/attribs.cc |   17 +-
 gcc/bb-reorder.cc  |3 +-
 gcc/bitmap.cc  |2 +-
 gcc/c-family/ChangeLog |   49 +
 gcc/c-family/c-attribs.cc  |   32 +-
 gcc/c-family/c-common.cc   |8 +-
 gcc/c-family/c-lex.cc  |   32 +-
 gcc/c-family/c-pch.cc  |5 +-
 gcc/c/ChangeLog|   14 +
 gcc/c/c-decl.cc|7 +-
 gcc/calls.cc   |7 +-
 gcc/cfgexpand.cc   |   32 +-
 gcc/cfgrtl.cc  |   27 +-
 gcc/cfgrtl.h   |1 +
 gcc/cgraph.cc  |   10 +-
 gcc/cgraph.h   |   15 +-
 gcc/cgraphunit.cc  |2 +
 gcc/combine.cc |   12 +-
 gcc/common.opt |2 +-
 gcc/common/config/avr/avr-common.cc|6 -
 gcc/common/config/i386/i386-common.cc  |2 +-
 gcc/config.gcc |1 +
 gcc/config.in  |   21 +-
 gcc/config/aarch64/aarch64-arches.def  |2 +-
 gcc/config/aarch64/aarch64-builtins.cc |2 +-
 gcc/config/aarch64/aarch64-cores.def   |2 +-
 gcc/config/aarch64/aarch64.cc  |   31 +-
 gcc/config/aarch64/aarch64.md  |   35 +-
 gcc/config/aarch64/iterators.md|3 +
 gcc/config/aarch64/t-aarch64-rtems |   42 +
 gcc/config/alpha/alpha.cc  |3 +-
 gcc/config/arc/arc.cc  |   

[gcc r15-912] C23: fix aliasing for structures/unions with incomplete types

2024-05-29 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:86b98d939989427ff025bcfd536ad361fcdc699c

commit r15-912-g86b98d939989427ff025bcfd536ad361fcdc699c
Author: Martin Uecker 
Date:   Sat Mar 30 19:49:48 2024 +0100

C23: fix aliasing for structures/unions with incomplete types

When incomplete structure/union types are completed later, compatibility
of struct types that contain pointers to such types changes.  When forming
equivalence classes for TYPE_CANONICAL, we therefore need to be conservative
and treat all structs with the same tag which are pointer targets as
equivalent for purposes of determining equivalency of structure/union
types which contain such types as members. This avoids having to update
TYPE_CANONICAL of such structure/unions recursively. The pointer types
themselves are updated in c_update_type_canonical.
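
The rule described above can be illustrated with a small stand-alone model.
This is only a sketch under stated assumptions: the toy_* names and the
one-member struct representation are hypothetical and are not GCC's tree
representation or API.  It mirrors the idea that, once a comparison has
passed through a pointer, only the struct tag is considered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in GCC.  */
enum toy_kind { TOY_INT, TOY_CHAR, TOY_POINTER, TOY_STRUCT };

struct toy_type {
  enum toy_kind kind;
  const char *tag;               /* struct tag, NULL for non-structs */
  const struct toy_type *target; /* pointee, for TOY_POINTER */
  const struct toy_type *member; /* single member; NULL if incomplete */
};

/* POINTEDTO plays the role of the new flag: it is set when the
   comparison walks through a pointer and cleared for each member.  */
static bool
toy_equiv_1 (const struct toy_type *a, const struct toy_type *b,
             bool pointedto)
{
  if (a->kind != b->kind)
    return false;
  switch (a->kind)
    {
    case TOY_INT:
    case TOY_CHAR:
      return true;
    case TOY_POINTER:
      return toy_equiv_1 (a->target, b->target, true);
    case TOY_STRUCT:
      assert (a->tag != NULL && b->tag != NULL);
      if (strcmp (a->tag, b->tag) != 0)
        return false;
      /* Pointer targets: the tag alone decides the class.  */
      if (pointedto)
        return true;
      if (a->member == NULL || b->member == NULL)
        return a->member == b->member;
      return toy_equiv_1 (a->member, b->member, false);
    }
  return false;
}

bool
toy_equiv_p (const struct toy_type *a, const struct toy_type *b)
{
  return toy_equiv_1 (a, b, false);
}

/* struct bar;  struct bar { int a; };  struct bar { char b; };  */
const struct toy_type toy_int = { TOY_INT, NULL, NULL, NULL };
const struct toy_type toy_char = { TOY_CHAR, NULL, NULL, NULL };
const struct toy_type bar_incomplete = { TOY_STRUCT, "bar", NULL, NULL };
const struct toy_type bar_int = { TOY_STRUCT, "bar", NULL, &toy_int };
const struct toy_type bar_char = { TOY_STRUCT, "bar", NULL, &toy_char };

/* Pointers to each variant of 'bar'.  */
const struct toy_type p_incomplete = { TOY_POINTER, NULL, &bar_incomplete, NULL };
const struct toy_type p_int = { TOY_POINTER, NULL, &bar_int, NULL };
const struct toy_type p_char = { TOY_POINTER, NULL, &bar_char, NULL };

/* struct foo { struct bar *x; } with each variant of 'bar'.  */
const struct toy_type foo_incomplete = { TOY_STRUCT, "foo", NULL, &p_incomplete };
const struct toy_type foo_int = { TOY_STRUCT, "foo", NULL, &p_int };
const struct toy_type foo_char = { TOY_STRUCT, "foo", NULL, &p_char };
```

In this model, toy_equiv_p puts all three 'foo' variants in one class,
while the two completed 'bar' types compared directly (not through a
pointer) remain distinct.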

gcc/c/
* c-typeck.cc (comptypes_internal): Add flag to track
whether a struct is the target of a pointer.
(tagged_types_tu_compatible_p): When forming equivalence
classes, treat nested pointed-to structs as equivalent.

gcc/testsuite/
* gcc.dg/c23-tag-incomplete-alias-1.c: New test.

Diff:
---
 gcc/c/c-typeck.cc | 43 +--
 gcc/testsuite/gcc.dg/c23-tag-incomplete-alias-1.c | 36 +++
 2 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index ad4c7add562..09b2c265a46 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -1172,6 +1172,7 @@ struct comptypes_data {
   bool different_types_p;
   bool warning_needed;
   bool anon_field;
+  bool pointedto;
   bool equiv;
 
   const struct tagged_tu_seen_cache* cache;
@@ -1235,8 +1236,36 @@ comptypes_check_different_types (tree type1, tree type2,
 }
 
 
-/* Like comptypes, but if it returns nonzero for struct and union
-   types considered equivalent for aliasing purposes.  */
+/* Like comptypes, but if it returns true for struct and union types
+   considered equivalent for aliasing purposes, i.e. for setting
+   TYPE_CANONICAL after completing a struct or union.
+
+   This function must return false only for types which are not
+   compatible according to C language semantics (cf. comptypes),
+   otherwise the middle-end would make incorrect aliasing decisions.
+   It may return true for some similar types that are not compatible
+   according to those stricter rules.
+
+   In particular, we ignore size expression in arrays so that the
+   following structs are in the same equivalence class:
+
+   struct foo { char (*buf)[]; };
+   struct foo { char (*buf)[3]; };
+   struct foo { char (*buf)[4]; };
+
+   We also treat unions / structs with members which are pointers to
+   structures or unions with the same tag as equivalent (if they are not
+   incompatible for other reasons).  Although incomplete structure
+   or union types are not compatible to any other type, they may become
+   compatible to different types when completed.  To avoid having to update
+   TYPE_CANONICAL at this point, we only consider the tag when forming
+   the equivalence classes.  For example, the following types with tag
+   'foo' are all considered equivalent:
+
+   struct bar;
+   struct foo { struct bar *x };
+   struct foo { struct bar { int a; } *x };
+   struct foo { struct bar { char b; } *x };  */
 
 bool
 comptypes_equiv_p (tree type1, tree type2)
@@ -1357,6 +1386,7 @@ comptypes_internal (const_tree type1, const_tree type2,
   /* Do not remove mode information.  */
   if (TYPE_MODE (t1) != TYPE_MODE (t2))
return false;
+  data->pointedto = true;
   return comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2), data);
 
 case FUNCTION_TYPE:
@@ -1375,7 +1405,7 @@ comptypes_internal (const_tree type1, const_tree type2,
 
if ((d1 == NULL_TREE) != (d2 == NULL_TREE))
  data->different_types_p = true;
-   /* Ignore size mismatches.  */
+   /* Ignore size mismatches when forming equivalence classes.  */
if (data->equiv)
  return true;
/* Sizes must match unless one is missing or variable.  */
@@ -1515,6 +1545,12 @@ tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
   if (TYPE_NAME (t1) != TYPE_NAME (t2))
 return false;
 
+  /* When forming equivalence classes for TYPE_CANONICAL in C23, we treat
+ structs with the same tag as equivalent, but only when they are targets
+ of pointers inside other structs.  */
+  if (data->equiv && data->pointedto)
+return true;
+
   if (!data->anon_field && NULL_TREE == TYPE_NAME (t1))
 return false;
 
@@ -1610,6 +1646,7 @@ tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
  return false;
 
data->anon_field = !DECL_NAME (s1);
+   data->pointedto = false;
 
data->cache = &entry;
if (!comptypes_internal (TREE_TYPE (s1), TREE_TYPE (s2), data))
diff --git a/gcc/testsuite/gcc.d

[gcc/ibm/heads/gcc-12-branch] (363 commits) ibm: Merge up to top of releases/gcc-12

2024-05-29 Thread Peter Bergner via Gcc-cvs
The branch 'ibm/heads/gcc-12-branch' was updated to point to:

 92786addfe0... ibm: Merge up to top of releases/gcc-12

It previously pointed to:

 9f2e51a88fb... ibm: Merge up to top of releases/gcc-12

Diff:

Summary of changes (added commits):
---

  92786ad... ibm: Merge up to top of releases/gcc-12
  342f577... Daily bump. (*)
  da9b7a5... ubsan: Use right address space for MEM_REF created for bool (*)
  e0b2c4f... Fortran: Fix SHAPE for zero-size arrays (*)
  72f6b7e... ipa: Compare jump functions in ICF (PR 113907) (*)
  3bb534d... Daily bump. (*)
  4507501... Daily bump. (*)
  0bd259a... Daily bump. (*)
  e11d3dd... Daily bump. (*)
  ba57a52... c++: __is_constructible ref binding [PR100667] (*)
  6a5dcdb... c++: fix PR111529 backport (*)
  1982783... c++: unroll pragma in templates [PR111529] (*)
  419b5e1... c++: array of PMF [PR113598] (*)
  7076c56... c++: binding reference to comma expr [PR114561] (*)
  a1ff317... Daily bump. (*)
  df19155... Daily bump. (*)
  d9c8940... testsuite: Verify r0-r3 are extended with CMSE (*)
  13ced60... Daily bump. (*)
  113ddbe... Daily bump. (*)
  2f0c2cc... Daily bump. (*)
  1ba6e8b... Daily bump. (*)
  65e5547... middle-end/110176 - wrong zext (bool) <= (int) 4294967295u  (*)
  47e6bff... tree-optimization/111039 - abnormals and bit test merging (*)
  5db4b54... tree-optimization/112281 - loop distribution and zero depen (*)
  dbb5273... tree-optimization/112495 - alias versioning and address spa (*)
  4a71557... tree-optimization/112505 - bit-precision induction vectoriz (*)
  1f41e8e... debug/112718 - reset all type units with -ffat-lto-objects (*)
  9bad5cf... tree-optimization/112793 - SLP of constant/external code-ge (*)
  2d650c0... tree-optimization/114027 - fix testcase (*)
  6661a7c... tree-optimization/114027 - conditional reduction chain (*)
  c1b2185... tree-optimization/114375 - disallow SLP discovery of permut (*)
  a7b1d81... tree-optimization/114231 - use patterns for BB SLP discover (*)
  46b2e98... middle-end/114734 - wrong code with expand_call_mem_ref (*)
  42a0393... lto/114655 - -flto=4 at link time doesn't override -flto=au (*)
  56415e3... gcov-profile/114715 - missing coverage for switch (*)
  b656e65... Daily bump. (*)
  2183e5b... ipa: Self-DCE of uses of removed call LHSs (PR 108007) (*)
  4419198... ipa: Force args obtined through pass-through maps to the ex (*)
  de66146... Daily bump. (*)
  2beef72... Daily bump. (*)
  c5c3a4a... Fix range-ops operator_addr. (*)
  f7db003... Daily bump. (*)
  587596d... Objective-C, NeXT, v2: Correct a regression in code-gen. (*)
  3349a6c... Daily bump. (*)
  ffa41c6... testsuite: Fix up vector-subaccess-1.C test for ia32 [PR892 (*)
  f5c7306... Fix PR 110386: backprop vs ABSU_EXPR (*)
  58d11bf... testsuite: fix Wmismatched-new-delete-8.C with -m32 (*)
  16319f8... warn-access: Fix handling of unnamed types [PR109804] (*)
  39d56b9... Fix PR 111331: wrong code for `a > 28 ? MIN : 29` (*)
  d88fe82... Fold: Fix up merge_truthop_with_opposite_arm for NaNs [PR95 (*)
  0ab30fb... libstdc++: Fix conversion of simd to vector builtin (*)
  79aa696... libstdc++: Silence irrelevant warnings in  (*)
  7abc861... libstdc++: Fix -Wsystem-headers warnings in tests (*)
  c0c1207... libstdc++: Update  synopsis test for C++11 and late (*)
  2d174d4... libstdc++: Fix -Wsystem-headers warnings (*)
  14876f3... libstdc++: Improve doxygen docs for  (*)
  0a9cfae... libstdc++: Improve doxygen docs for some of  (*)
  0d128f5... libstdc++: Improve doxygen docs for algorithms and more (*)
  54de91d... libstdc++: Improve doxygen docs for std::allocator (*)
  e1800b8... libstdc++: Improve doxygen docs for  (*)
  f0db5df... libstdc++: Improve doxygen docs for  (*)
  914a226... libstdc++: Stop defining C++0x compat symbols for versioned (*)
  f8ab9b7... libstdc++: Add macros for the inline namespace std::_V2 (*)
  d9f006d... libstdc++: Disable Doxygen GROUP_NESTED_COMPOUNDS config op (*)
  f3d4e25... libstdc++: Simplify fs::path construction using variable te (*)
  57eb035... libstdc++: Update std::pointer_traits to match new LWG 3545 (*)
  1bb467f... libstdc++: Simplify detection idiom using concepts (*)
  51e9dcc... libstdc++: Improve doxygen docs for std::pointer_traits (*)
  c6f80dc... libstdc++: use grep -E instead of egrep in scripts (*)
  5c156f5... libstdc++: Fix allocator propagation in regex algorithms [P (*)
  e35b26c... libstdc++: Define std::basic_stringbuf::view() for old std: (*)
  0135f93... libstdc++: Add autoconf checks for mkdir, chmod, chdir, and (*)
  a389921... libstdc++: Explicitly default some copy ctors and assignmen (*)
  dc0964f... libstdc++: Add static_assert to std::integer_sequence [PR11 (*)
  15c5170... libstdc++: Remove non-void static assertions in variant's s (*)
  c285c1b... libstdc++: Fix exception thrown by std::shared_lock::unlock (*)
  6f5dcea... libstdc++: Fix conditions for using memcmp in std::lexicogr (*)
  8ec265c... libstdc++: Do not use memmove 

[gcc(refs/vendors/ibm/heads/gcc-12-branch)] ibm: Merge up to top of releases/gcc-12

2024-05-29 Thread Peter Bergner via Libstdc++-cvs
https://gcc.gnu.org/g:92786addfe0797790a97ddc50f7709a1bf4791a9

commit 92786addfe0797790a97ddc50f7709a1bf4791a9
Merge: 9f2e51a88fb 342f577d8ea
Author: Peter Bergner 
Date:   Wed May 29 14:42:14 2024 -0500

ibm: Merge up to top of releases/gcc-12

2024-05-29  Peter Bergner  

Merge up to releases/gcc-12 342f577d8ea60c3473a6c1e66ef038b96f99f9d2

Diff:

 ChangeLog  |8 +
 configure  |2 +-
 configure.ac   |2 +-
 fixincludes/ChangeLog  |   20 +
 fixincludes/fixincl.x  |  109 +-
 fixincludes/inclhack.def   |   47 +
 fixincludes/tests/base/objc/runtime.h  |   24 +
 fixincludes/tests/base/stdio.h |7 +
 gcc/ChangeLog  |  954 +++
 gcc/ChangeLog.ibm  |4 +
 gcc/DATESTAMP  |2 +-
 gcc/ada/ChangeLog  |   18 +
 gcc/ada/exp_ch4.adb|2 -
 gcc/ada/exp_ch7.adb|   13 +
 gcc/ada/exp_util.adb   |   15 +-
 gcc/ada/sem_res.adb|   14 +-
 gcc/asan.cc|   15 +-
 gcc/c-family/ChangeLog |   16 +
 gcc/c-family/c-common.cc   |7 +-
 gcc/c-family/c-pch.cc  |5 +-
 gcc/cfgexpand.cc   |2 +-
 gcc/cfgrtl.cc  |   24 +-
 gcc/cfgrtl.h   |1 +
 gcc/cgraph.cc  |   10 +-
 gcc/cgraph.h   |   18 +-
 gcc/cgraphunit.cc  |2 +
 gcc/config.in  |   24 +
 gcc/config/aarch64/aarch64-cores.def   |2 +-
 gcc/config/aarch64/aarch64.cc  |   29 +-
 gcc/config/aarch64/aarch64.h   |2 +-
 gcc/config/aarch64/aarch64.md  |   35 +-
 gcc/config/aarch64/iterators.md|3 +
 gcc/config/arm/arm.cc  |   69 ++
 gcc/config/arm/neon.md |4 +-
 gcc/config/avr/avr-mcus.def|   83 +-
 gcc/config/avr/avr.cc  |   10 +
 gcc/config/darwin-protos.h |   11 +
 gcc/config/darwin-sections.def |4 +-
 gcc/config/darwin.cc   |  224 +++-
 gcc/config/darwin.h|   92 +-
 gcc/config/darwin.opt  |4 +
 gcc/config/i386/amxtileintrin.h|4 +-
 gcc/config/i386/darwin.h   |4 +-
 gcc/config/i386/i386-builtin.def   |4 +
 gcc/config/i386/i386-expand.cc |   19 +
 gcc/config/i386/i386-features.cc   |   50 +-
 gcc/config/i386/i386-features.h|1 +
 gcc/config/i386/i386.md|   24 +
 gcc/config/loongarch/genopts/loongarch.opt.in  |   31 +-
 gcc/config/loongarch/gnu-user.h|4 +-
 gcc/config/loongarch/loongarch-opts.cc |   22 +
 gcc/config/loongarch/loongarch-opts.h  |   18 +
 gcc/config/loongarch/loongarch-protos.h|2 +-
 gcc/config/loongarch/loongarch.cc  |   69 +-
 gcc/config/loongarch/loongarch.h   |   22 +-
 gcc/config/loongarch/loongarch.md  |   23 +-
 gcc/config/loongarch/loongarch.opt |   31 +-
 gcc/config/loongarch/sync.md   |   46 +-
 gcc/config/mips/mips-msa.md|   18 +-
 gcc/config/pa/pa.md|6 +-
 gcc/config/riscv/sync.md   |9 +
 gcc/config/rs6000/darwin.h |6 +-
 gcc/config/rs6000/mma.md   |8 +-
 gcc/config/rs6000/predicates.md|2 +-
 gcc/config/rs6000/rs6000-builtin.cc|6 +-
 gcc/config/rs6000/rs6000-c.cc  |   14 +-
 gcc/config/rs6000/rs6000-cpus.def  |5 +-
 gcc/config/rs6000/rs6000.cc|   19 +-
 gcc/config/rs6000/rs6000.h |4 +-
 gcc/config/rs6000/rs6000.md|8 +-
 gcc/config/rs6000/rs6000.opt   |6 +-
 gcc/config/rs6000/vsx.md   |4 +-
 gcc/config/sh/sh.cc|3 +-
 gcc/configure  |  149 ++-
 gcc/configure.ac 

[gcc r15-914] Revert "resource.cc: Remove redundant conditionals"

2024-05-29 Thread Hans-Peter Nilsson via Gcc-cvs
https://gcc.gnu.org/g:c31a9d3152d6119aab83c403308ddb933fe905c5

commit r15-914-gc31a9d3152d6119aab83c403308ddb933fe905c5
Author: Hans-Peter Nilsson 
Date:   Thu May 30 01:57:16 2024 +0200

Revert "resource.cc: Remove redundant conditionals"

This reverts commit 802a98d128f9b0eea2432f6511328d14e0bd721b.

Diff:
---
 gcc/resource.cc | 123 
 1 file changed, 71 insertions(+), 52 deletions(-)

diff --git a/gcc/resource.cc b/gcc/resource.cc
index 7c1de886432..62bd46f786e 100644
--- a/gcc/resource.cc
+++ b/gcc/resource.cc
@@ -658,41 +658,48 @@ mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource
   res->cc = 0;
 
   /* See if we have computed this value already.  */
-  for (tinfo = target_hash_table[INSN_UID (target) % TARGET_HASH_PRIME];
-   tinfo; tinfo = tinfo->next)
-if (tinfo->uid == INSN_UID (target))
-  break;
-
-  /* Start by getting the basic block number.  If we have saved
- information, we can get it from there unless the insn at the
- start of the basic block has been deleted.  */
-  if (tinfo && tinfo->block != -1
-  && ! BB_HEAD (BASIC_BLOCK_FOR_FN (cfun, tinfo->block))->deleted ())
-b = tinfo->block;
+  if (target_hash_table != NULL)
+{
+  for (tinfo = target_hash_table[INSN_UID (target) % TARGET_HASH_PRIME];
+  tinfo; tinfo = tinfo->next)
+   if (tinfo->uid == INSN_UID (target))
+ break;
+
+  /* Start by getting the basic block number.  If we have saved
+information, we can get it from there unless the insn at the
+start of the basic block has been deleted.  */
+  if (tinfo && tinfo->block != -1
+ && ! BB_HEAD (BASIC_BLOCK_FOR_FN (cfun, tinfo->block))->deleted ())
+   b = tinfo->block;
+}
 
   if (b == -1)
 b = BLOCK_FOR_INSN (target)->index;
   gcc_assert (b != -1);
 
-  if (tinfo)
+  if (target_hash_table != NULL)
 {
-  /* If the information is up-to-date, use it.  Otherwise, we will
-update it below.  */
-  if (b == tinfo->block && tinfo->bb_tick == bb_ticks[b])
+  if (tinfo)
{
- res->regs = tinfo->live_regs;
- return;
+ /* If the information is up-to-date, use it.  Otherwise, we will
+update it below.  */
+ if (b == tinfo->block && tinfo->bb_tick == bb_ticks[b])
+   {
+ res->regs = tinfo->live_regs;
+ return;
+   }
+   }
+  else
+   {
+ /* Allocate a place to put our results and chain it into the
+hash table.  */
+ tinfo = XNEW (struct target_info);
+ tinfo->uid = INSN_UID (target);
+ tinfo->block = b;
+ tinfo->next
+   = target_hash_table[INSN_UID (target) % TARGET_HASH_PRIME];
+ target_hash_table[INSN_UID (target) % TARGET_HASH_PRIME] = tinfo;
}
-}
-  else
-{
-  /* Allocate a place to put our results and chain it into the hash
-table.  */
-  tinfo = XNEW (struct target_info);
-  tinfo->uid = INSN_UID (target);
-  tinfo->block = b;
-  tinfo->next = target_hash_table[INSN_UID (target) % TARGET_HASH_PRIME];
-  target_hash_table[INSN_UID (target) % TARGET_HASH_PRIME] = tinfo;
 }
 
   CLEAR_HARD_REG_SET (pending_dead_regs);
@@ -818,12 +825,13 @@ mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource
 to be live here still are.  The fallthrough edge may have
 left a live register uninitialized.  */
  bb = BLOCK_FOR_INSN (real_insn);
- gcc_assert (bb);
-
- HARD_REG_SET extra_live;
+ if (bb)
+   {
+ HARD_REG_SET extra_live;
 
- REG_SET_TO_HARD_REG_SET (extra_live, DF_LR_IN (bb));
- current_live_regs |= extra_live;
+ REG_SET_TO_HARD_REG_SET (extra_live, DF_LR_IN (bb));
+ current_live_regs |= extra_live;
+   }
}
 
   /* The beginning of the epilogue corresponds to the end of the
@@ -839,8 +847,10 @@ mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource
 {
   tinfo->block = b;
   tinfo->bb_tick = bb_ticks[b];
-  tinfo->live_regs = res->regs;
 }
+
+  if (tinfo != NULL)
+tinfo->live_regs = res->regs;
 }
 
 /* Initialize the resources required by mark_target_live_regs ().
@@ -929,25 +939,31 @@ init_resource_info (rtx_insn *epilogue_insn)
 void
 free_resource_info (void)
 {
-  int i;
-
-  for (i = 0; i < TARGET_HASH_PRIME; ++i)
+  if (target_hash_table != NULL)
 {
-  struct target_info *ti = target_hash_table[i];
+  int i;
 
-  while (ti)
+  for (i = 0; i < TARGET_HASH_PRIME; ++i)
{
- struct target_info *next = ti->next;
- free (ti);
- ti = next;
+ struct target_info *ti = target_hash_table[i];
+
+ while (ti)
+   {
+ struct target_info *next = ti->next;
+ free 

[gcc r15-915] Revert "resource.cc (mark_target_live_regs): Remove check for bb not found"

2024-05-29 Thread Hans-Peter Nilsson via Gcc-cvs
https://gcc.gnu.org/g:afe48a45b8baa310c8373499b1e5b5407a3e2b94

commit r15-915-gafe48a45b8baa310c8373499b1e5b5407a3e2b94
Author: Hans-Peter Nilsson 
Date:   Thu May 30 01:57:29 2024 +0200

Revert "resource.cc (mark_target_live_regs): Remove check for bb not found"

This reverts commit e1abce5b6ad8f5aee86ec7729b516d81014db09e.

Diff:
---
 gcc/resource.cc | 270 +---
 1 file changed, 138 insertions(+), 132 deletions(-)

diff --git a/gcc/resource.cc b/gcc/resource.cc
index 62bd46f786e..0d8cde93570 100644
--- a/gcc/resource.cc
+++ b/gcc/resource.cc
@@ -704,150 +704,156 @@ mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource
 
   CLEAR_HARD_REG_SET (pending_dead_regs);
 
-  /* Get the live registers from the basic block and update them with
- anything set or killed between its start and the insn before
- TARGET; this custom life analysis is really about registers so we
- need to use the LR problem.  Otherwise, we must assume everything
- is live.  */
-  regset regs_live = DF_LR_IN (BASIC_BLOCK_FOR_FN (cfun, b));
-  rtx_insn *start_insn, *stop_insn;
-  df_ref def;
-
-  /* Compute hard regs live at start of block.  */
-  REG_SET_TO_HARD_REG_SET (current_live_regs, regs_live);
-  FOR_EACH_ARTIFICIAL_DEF (def, b)
-if (DF_REF_FLAGS (def) & DF_REF_AT_TOP)
-  SET_HARD_REG_BIT (current_live_regs, DF_REF_REGNO (def));
-
-  /* Get starting and ending insn, handling the case where each might
- be a SEQUENCE.  */
-  start_insn = (b == ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb->index ?
-   insns : BB_HEAD (BASIC_BLOCK_FOR_FN (cfun, b)));
-  stop_insn = target;
-
-  if (NONJUMP_INSN_P (start_insn)
-  && GET_CODE (PATTERN (start_insn)) == SEQUENCE)
-start_insn = as_a <rtx_sequence *> (PATTERN (start_insn))->insn (0);
-
-  if (NONJUMP_INSN_P (stop_insn)
-  && GET_CODE (PATTERN (stop_insn)) == SEQUENCE)
-stop_insn = next_insn (PREV_INSN (stop_insn));
-
-  for (insn = start_insn; insn != stop_insn;
-   insn = next_insn_no_annul (insn))
+  /* If we found a basic block, get the live registers from it and update
+ them with anything set or killed between its start and the insn before
+ TARGET; this custom life analysis is really about registers so we need
+ to use the LR problem.  Otherwise, we must assume everything is live.  */
+  if (b != -1)
 {
-  rtx link;
-  rtx_insn *real_insn = insn;
-  enum rtx_code code = GET_CODE (insn);
-
-  if (DEBUG_INSN_P (insn))
-   continue;
-
-  /* If this insn is from the target of a branch, it isn't going to
-be used in the sequel.  If it is used in both cases, this
-test will not be true.  */
-  if ((code == INSN || code == JUMP_INSN || code == CALL_INSN)
- && INSN_FROM_TARGET_P (insn))
-   continue;
-
-  /* If this insn is a USE made by update_block, we care about the
-underlying insn.  */
-  if (code == INSN
- && GET_CODE (PATTERN (insn)) == USE
- && INSN_P (XEXP (PATTERN (insn), 0)))
-   real_insn = as_a <rtx_insn *> (XEXP (PATTERN (insn), 0));
-
-  if (CALL_P (real_insn))
+  regset regs_live = DF_LR_IN (BASIC_BLOCK_FOR_FN (cfun, b));
+  rtx_insn *start_insn, *stop_insn;
+  df_ref def;
+
+  /* Compute hard regs live at start of block.  */
+  REG_SET_TO_HARD_REG_SET (current_live_regs, regs_live);
+  FOR_EACH_ARTIFICIAL_DEF (def, b)
+   if (DF_REF_FLAGS (def) & DF_REF_AT_TOP)
+ SET_HARD_REG_BIT (current_live_regs, DF_REF_REGNO (def));
+
+  /* Get starting and ending insn, handling the case where each might
+be a SEQUENCE.  */
+  start_insn = (b == ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb->index ?
+   insns : BB_HEAD (BASIC_BLOCK_FOR_FN (cfun, b)));
+  stop_insn = target;
+
+  if (NONJUMP_INSN_P (start_insn)
+ && GET_CODE (PATTERN (start_insn)) == SEQUENCE)
+   start_insn = as_a <rtx_sequence *> (PATTERN (start_insn))->insn (0);
+
+  if (NONJUMP_INSN_P (stop_insn)
+ && GET_CODE (PATTERN (stop_insn)) == SEQUENCE)
+   stop_insn = next_insn (PREV_INSN (stop_insn));
+
+  for (insn = start_insn; insn != stop_insn;
+  insn = next_insn_no_annul (insn))
{
- /* Values in call-clobbered registers survive a COND_EXEC CALL
-if that is not executed; this matters for resoure use because
-they may be used by a complementarily (or more strictly)
-predicated instruction, or if the CALL is NORETURN.  */
- if (GET_CODE (PATTERN (real_insn)) != COND_EXEC)
+ rtx link;
+ rtx_insn *real_insn = insn;
+ enum rtx_code code = GET_CODE (insn);
+
+ if (DEBUG_INSN_P (insn))
+   continue;
+
+ /* If this insn is from the target of a branch, it isn't going to
+be used in the sequel.  If it is used in both cases, this
+test will not be true.  */
+ if ((code == INSN

[gcc r15-916] Revert "resource.cc: Replace calls to find_basic_block with cfgrtl BLOCK_FOR_INSN"

2024-05-29 Thread Hans-Peter Nilsson via Gcc-cvs
https://gcc.gnu.org/g:c68bd7e8023f65d1dc23237f5a04a863344b1264

commit r15-916-gc68bd7e8023f65d1dc23237f5a04a863344b1264
Author: Hans-Peter Nilsson 
Date:   Thu May 30 01:57:39 2024 +0200

Revert "resource.cc: Replace calls to find_basic_block with cfgrtl BLOCK_FOR_INSN"

This reverts commit 933ab59c59bdc1ac9e3ca3a56527836564e1821b.
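
The find_basic_block helper restored by this revert scans backwards to the
previous BARRIER and then forwards over labels until it finds one that
starts a basic block.  The following is a minimal sketch of that search
over a plain array, assuming a hypothetical toy_insn representation; the
toy_* names and the first_block parameter are illustrative and are not part
of GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in GCC.  */
enum toy_insn_kind { TOY_INSN, TOY_NOTE, TOY_BARRIER, TOY_LABEL };

struct toy_insn {
  enum toy_insn_kind kind;
  int block; /* block index if this label starts a block, else -1 */
};

/* Index of the previous non-note insn, or -1 at the start.  */
static long
toy_prev_nonnote (const struct toy_insn *s, long i)
{
  for (i = i - 1; i >= 0 && s[i].kind == TOY_NOTE; --i)
    ;
  return i;
}

/* Index of the next non-note insn, or -1 at the end.  */
static long
toy_next_nonnote (const struct toy_insn *s, long n, long i)
{
  for (i = i + 1; i < n && s[i].kind == TOY_NOTE; ++i)
    ;
  return i < n ? i : -1;
}

/* Return the block number for the insn at index AT, or -1 if none is
   found within SEARCH_LIMIT steps (-1 means unlimited).  FIRST_BLOCK
   stands in for the first block after the function entry.  */
int
toy_find_basic_block (const struct toy_insn *s, long n, long at,
                      int search_limit, int first_block)
{
  long i;

  assert (at >= 0 && at < n);

  /* Scan backwards to the previous barrier.  */
  for (i = toy_prev_nonnote (s, at);
       i >= 0 && s[i].kind != TOY_BARRIER && search_limit != 0;
       i = toy_prev_nonnote (s, i), --search_limit)
    ;

  /* The closest barrier is too far away.  */
  if (search_limit == 0)
    return -1;

  /* Reached the start of the function.  */
  if (i < 0)
    return first_block;

  /* Look for a label that starts a basic block.  */
  for (i = toy_next_nonnote (s, n, i);
       i >= 0 && s[i].kind == TOY_LABEL;
       i = toy_next_nonnote (s, n, i))
    if (s[i].block != -1)
      return s[i].block;

  return -1;
}

/* Example stream: the last insn is the search target.  */
const struct toy_insn toy_stream[] = {
  { TOY_INSN, -1 }, { TOY_BARRIER, -1 }, { TOY_NOTE, -1 },
  { TOY_LABEL, -1 }, { TOY_LABEL, 3 }, { TOY_INSN, -1 },
  { TOY_INSN, -1 }, { TOY_INSN, -1 },
};
```

With the example stream, an unlimited search from the last insn finds the
label recorded for block 3, while a small search limit gives up before the
barrier is reached and yields -1.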

Diff:
---
 gcc/resource.cc | 66 -
 1 file changed, 56 insertions(+), 10 deletions(-)

diff --git a/gcc/resource.cc b/gcc/resource.cc
index 0d8cde93570..06fcfd3e44c 100644
--- a/gcc/resource.cc
+++ b/gcc/resource.cc
@@ -28,7 +28,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "tm_p.h"
 #include "regs.h"
 #include "emit-rtl.h"
-#include "cfgrtl.h"
 #include "resource.h"
 #include "insn-attr.h"
 #include "function-abi.h"
@@ -76,6 +75,7 @@ static HARD_REG_SET current_live_regs;
 static HARD_REG_SET pending_dead_regs;
 
 static void update_live_status (rtx, const_rtx, void *);
+static int find_basic_block (rtx_insn *, int);
 static rtx_insn *next_insn_no_annul (rtx_insn *);
 
 /* Utility function called from mark_target_live_regs via note_stores.
@@ -113,6 +113,46 @@ update_live_status (rtx dest, const_rtx x, void *data ATTRIBUTE_UNUSED)
CLEAR_HARD_REG_BIT (pending_dead_regs, i);
   }
 }
+
+/* Find the number of the basic block with correct live register
+   information that starts closest to INSN.  Return -1 if we couldn't
+   find such a basic block or the beginning is more than
+   SEARCH_LIMIT instructions before INSN.  Use SEARCH_LIMIT = -1 for
+   an unlimited search.
+
+   The delay slot filling code destroys the control-flow graph so,
+   instead of finding the basic block containing INSN, we search
+   backwards toward a BARRIER where the live register information is
+   correct.  */
+
+static int
+find_basic_block (rtx_insn *insn, int search_limit)
+{
+  /* Scan backwards to the previous BARRIER.  Then see if we can find a
+ label that starts a basic block.  Return the basic block number.  */
+  for (insn = prev_nonnote_insn (insn);
+   insn && !BARRIER_P (insn) && search_limit != 0;
+   insn = prev_nonnote_insn (insn), --search_limit)
+;
+
+  /* The closest BARRIER is too far away.  */
+  if (search_limit == 0)
+return -1;
+
+  /* The start of the function.  */
+  else if (insn == 0)
+return ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb->index;
+
+  /* See if any of the upcoming CODE_LABELs start a basic block.  If we reach
+ anything other than a CODE_LABEL or note, we can't find this code.  */
+  for (insn = next_nonnote_insn (insn);
+   insn && LABEL_P (insn);
+   insn = next_nonnote_insn (insn))
+if (BLOCK_FOR_INSN (insn))
+  return BLOCK_FOR_INSN (insn)->index;
+
+  return -1;
+}
 
 /* Similar to next_insn, but ignores insns in the delay slots of
an annulled branch.  */
@@ -674,8 +714,7 @@ mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource
 }
 
   if (b == -1)
-b = BLOCK_FOR_INSN (target)->index;
-  gcc_assert (b != -1);
+b = find_basic_block (target, param_max_delay_slot_live_search);
 
   if (target_hash_table != NULL)
 {
@@ -683,7 +722,7 @@ mark_target_live_regs (rtx_insn *insns, rtx target_maybe_return, struct resource
{
  /* If the information is up-to-date, use it.  Otherwise, we will
 update it below.  */
- if (b == tinfo->block && tinfo->bb_tick == bb_ticks[b])
+ if (b == tinfo->block && b != -1 && tinfo->bb_tick == bb_ticks[b])
{
  res->regs = tinfo->live_regs;
  return;
@@ -866,6 +905,7 @@ void
 init_resource_info (rtx_insn *epilogue_insn)
 {
   int i;
+  basic_block bb;
 
   /* Indicate what resources are required to be valid at the end of the current
  function.  The condition code never is and memory always is.
@@ -935,8 +975,10 @@ init_resource_info (rtx_insn *epilogue_insn)
   target_hash_table = XCNEWVEC (struct target_info *, TARGET_HASH_PRIME);
   bb_ticks = XCNEWVEC (int, last_basic_block_for_fn (cfun));
 
-  /* Set the BLOCK_FOR_INSN for each insn.  */
-  compute_bb_for_insn ();
+  /* Set the BLOCK_FOR_INSN of each label that starts a basic block.  */
+  FOR_EACH_BB_FN (bb, cfun)
+if (LABEL_P (BB_HEAD (bb)))
+  BLOCK_FOR_INSN (BB_HEAD (bb)) = bb;
 }
 
 /* Free up the resources allocated to mark_target_live_regs ().  This
@@ -945,6 +987,8 @@ init_resource_info (rtx_insn *epilogue_insn)
 void
 free_resource_info (void)
 {
+  basic_block bb;
+
   if (target_hash_table != NULL)
 {
   int i;
@@ -971,7 +1015,9 @@ free_resource_info (void)
   bb_ticks = NULL;
 }
 
-  free_bb_for_insn ();
+  FOR_EACH_BB_FN (bb, cfun)
+if (LABEL_P (BB_HEAD (bb)))
+  BLOCK_FOR_INSN (BB_HEAD (bb)) = NULL;
 }
 
 /* Clear any hashed information that we have stored for INSN.  */
@@ -1017,10 +1063,10 @@ clear_hashed_info_until_next_barrier (rtx_insn *insn)
 void
 incr_ticks

[gcc r14-10260] MIPS16: Mark $2/$3 as clobbered if GP is used

2024-05-29 Thread YunQiang Su via Gcc-cvs
https://gcc.gnu.org/g:201cfa725587d13867b4dc25955434ebe90aff7b

commit r14-10260-g201cfa725587d13867b4dc25955434ebe90aff7b
Author: YunQiang Su 
Date:   Wed May 29 02:28:25 2024 +0800

MIPS16: Mark $2/$3 as clobbered if GP is used

PR Target/84790.
The gp init sequence
li      $2,%hi(_gp_disp)
addiu   $3,$pc,%lo(_gp_disp)
sll     $2,16
addu    $2,$3
is generated directly in `mips_output_function_prologue`, and does
not appear in the RTL.

As a result, the IRA/IPA passes are not aware that $2/$3 have been
clobbered, so they may use these registers across (local) function
calls.

Mark $2/$3 as clobbered in both places:
  - Just after the UNSPEC_GP RTL of a function;
  - Just after a function call.
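
The effect described above can be illustrated with a toy model.  This is
only a sketch and not GCC's register allocator: `pick_register`, its
candidate list, and the clobber set are hypothetical names chosen for
illustration.  It shows that an allocator which is not told about a
clobber will happily pick $2, while one that sees the clobber avoids it.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model (assumption: not how IRA works internally).  Return the
// first candidate register number that is not in the set of registers
// known to be clobbered, or -1 if every candidate is clobbered.
int pick_register(const std::vector<int>& candidates,
                  const std::set<int>& known_clobbers) {
    for (int reg : candidates) {
        if (known_clobbers.count(reg) == 0) {
            return reg;
        }
    }
    return -1;
}
```

With an empty clobber set the model picks register 2, which the gp init
sequence would overwrite; once 2 and 3 are listed as clobbered it falls
back to the next candidate.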

Reported-by: Matthias Schiffer 
Origin-Patch-by: Felix Fietkau .

gcc
* config/mips/mips.cc (mips16_gp_pseudo_reg): Mark
MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered.
(mips_emit_call_insn): Mark MIPS16_PIC_TEMP and
MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.

(cherry picked from commit 915440eed21de367cb41857afb5273aff5bcb737)

Diff:
---
 gcc/config/mips/mips.cc | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index ce764a5cb35..1156d212c1f 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -3233,6 +3233,9 @@ mips_emit_call_insn (rtx pattern, rtx orig_addr, rtx addr, bool lazy_p)
 {
   rtx post_call_tmp_reg = gen_rtx_REG (word_mode, POST_CALL_TMP_REG);
   clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), post_call_tmp_reg);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), MIPS16_PIC_TEMP);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn),
+   MIPS_PROLOGUE_TEMP (word_mode));
 }
 
   return insn;
@@ -3329,7 +3332,13 @@ mips16_gp_pseudo_reg (void)
   rtx set = gen_load_const_gp (cfun->machine->mips16_gp_pseudo_rtx);
   rtx_insn *insn = emit_insn_after (set, scan);
   INSN_LOCATION (insn) = 0;
-
+  /* NewABI support hasn't been implement.  NewABI should generate RTL
+sequence instead of ASM sequence directly.  */
+  if (mips_current_loadgp_style () == LOADGP_OLDABI)
+   {
+ emit_clobber (MIPS16_PIC_TEMP);
+ emit_clobber (MIPS_PROLOGUE_TEMP (Pmode));
+   }
   pop_topmost_sequence ();
 }


[gcc r12-10480] MIPS16: Mark $2/$3 as clobbered if GP is used

2024-05-29 Thread YunQiang Su via Gcc-cvs
https://gcc.gnu.org/g:e26f16424f6279662efb210bc87c77148e956fed

commit r12-10480-ge26f16424f6279662efb210bc87c77148e956fed
Author: YunQiang Su 
Date:   Wed May 29 02:28:25 2024 +0800

MIPS16: Mark $2/$3 as clobbered if GP is used

PR Target/84790.
The gp init sequence
li      $2,%hi(_gp_disp)
addiu   $3,$pc,%lo(_gp_disp)
sll     $2,16
addu    $2,$3
is generated directly in `mips_output_function_prologue`, and does
not appear in the RTL.

As a result, the IRA/IPA passes are not aware that $2/$3 have been
clobbered, so they may use these registers across (local) function
calls.

Mark $2/$3 as clobbered in both places:
  - Just after the UNSPEC_GP RTL of a function;
  - Just after a function call.

Reported-by: Matthias Schiffer 
Origin-Patch-by: Felix Fietkau .

gcc
* config/mips/mips.cc (mips16_gp_pseudo_reg): Mark
MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered.
(mips_emit_call_insn): Mark MIPS16_PIC_TEMP and
MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.

(cherry picked from commit 915440eed21de367cb41857afb5273aff5bcb737)

Diff:
---
 gcc/config/mips/mips.cc | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index e64928f4113..d26630b20ce 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -3140,6 +3140,9 @@ mips_emit_call_insn (rtx pattern, rtx orig_addr, rtx addr, bool lazy_p)
 {
   rtx post_call_tmp_reg = gen_rtx_REG (word_mode, POST_CALL_TMP_REG);
   clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), post_call_tmp_reg);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), MIPS16_PIC_TEMP);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn),
+   MIPS_PROLOGUE_TEMP (word_mode));
 }
 
   return insn;
@@ -3236,7 +3239,13 @@ mips16_gp_pseudo_reg (void)
   rtx set = gen_load_const_gp (cfun->machine->mips16_gp_pseudo_rtx);
   rtx_insn *insn = emit_insn_after (set, scan);
   INSN_LOCATION (insn) = 0;
-
+  /* NewABI support hasn't been implement.  NewABI should generate RTL
+sequence instead of ASM sequence directly.  */
+  if (mips_current_loadgp_style () == LOADGP_OLDABI)
+   {
+ emit_clobber (MIPS16_PIC_TEMP);
+ emit_clobber (MIPS_PROLOGUE_TEMP (Pmode));
+   }
   pop_topmost_sequence ();
 }


[gcc r13-8809] MIPS16: Mark $2/$3 as clobbered if GP is used

2024-05-29 Thread YunQiang Su via Gcc-cvs
https://gcc.gnu.org/g:3be8fa7b19d218ca5812d71801e3e83ee2260ea0

commit r13-8809-g3be8fa7b19d218ca5812d71801e3e83ee2260ea0
Author: YunQiang Su 
Date:   Wed May 29 02:28:25 2024 +0800

MIPS16: Mark $2/$3 as clobbered if GP is used

PR Target/84790.
The gp init sequence
li      $2,%hi(_gp_disp)
addiu   $3,$pc,%lo(_gp_disp)
sll     $2,16
addu    $2,$3
is generated directly in `mips_output_function_prologue`, and does
not appear in the RTL.

As a result, the IRA/IPA passes are not aware that $2/$3 have been
clobbered, so they may use these registers across (local) function
calls.

Mark $2/$3 as clobbered in both places:
  - Just after the UNSPEC_GP RTL of a function;
  - Just after a function call.

Reported-by: Matthias Schiffer 
Origin-Patch-by: Felix Fietkau .

gcc
* config/mips/mips.cc (mips16_gp_pseudo_reg): Mark
MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered.
(mips_emit_call_insn): Mark MIPS16_PIC_TEMP and
MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.

(cherry picked from commit 915440eed21de367cb41857afb5273aff5bcb737)

Diff:
---
 gcc/config/mips/mips.cc | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 8e3dc313cb3..9bc73b2e77d 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -3140,6 +3140,9 @@ mips_emit_call_insn (rtx pattern, rtx orig_addr, rtx addr, bool lazy_p)
 {
   rtx post_call_tmp_reg = gen_rtx_REG (word_mode, POST_CALL_TMP_REG);
   clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), post_call_tmp_reg);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), MIPS16_PIC_TEMP);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn),
+   MIPS_PROLOGUE_TEMP (word_mode));
 }
 
   return insn;
@@ -3236,7 +3239,13 @@ mips16_gp_pseudo_reg (void)
   rtx set = gen_load_const_gp (cfun->machine->mips16_gp_pseudo_rtx);
   rtx_insn *insn = emit_insn_after (set, scan);
   INSN_LOCATION (insn) = 0;
-
+  /* NewABI support hasn't been implement.  NewABI should generate RTL
+sequence instead of ASM sequence directly.  */
+  if (mips_current_loadgp_style () == LOADGP_OLDABI)
+   {
+ emit_clobber (MIPS16_PIC_TEMP);
+ emit_clobber (MIPS_PROLOGUE_TEMP (Pmode));
+   }
   pop_topmost_sequence ();
 }


[gcc r15-917] tree-ssa-pre.c/115214(ICE in find_or_generate_expression, at tree-ssa-pre.c:2780): Return NULL_TREE

2024-05-29 Thread Jiawei Chen via Gcc-cvs
https://gcc.gnu.org/g:c9842f99042454bef99fe82506c6dd50f34e283e

commit r15-917-gc9842f99042454bef99fe82506c6dd50f34e283e
Author: Jiawei 
Date:   Mon May 27 15:40:51 2024 +0800

tree-ssa-pre.c/115214(ICE in find_or_generate_expression, at
tree-ssa-pre.c:2780): Return NULL_TREE when dealing with special cases.

Return NULL_TREE when genop3 is an EXACT_DIV_EXPR.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652641.html

version log v3: remove additional POLY_INT_CST check.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652795.html

gcc/ChangeLog:

* tree-ssa-pre.cc (create_component_ref_by_pieces_1): New conditions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr115214.c: New test.

Diff:
---
 .../gcc.target/riscv/rvv/vsetvl/pr115214.c | 52 ++
 gcc/tree-ssa-pre.cc| 10 +++--
 2 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr115214.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr115214.c
new file mode 100644
index 000..fce2e9da766
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr115214.c
@@ -0,0 +1,52 @@
+/* { dg-do compile } */
+/* { dg-options "-mrvv-vector-bits=scalable -march=rv64gcv -mabi=lp64d -O3 -w" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
+
+#include 
+
+static inline __attribute__(()) int vaddq_f32();
+static inline __attribute__(()) int vload_tillz_f32(int nlane) {
+  vint32m1_t __trans_tmp_9;
+  {
+int __trans_tmp_0 = nlane;
+{
+  vint64m1_t __trans_tmp_1;
+  vint64m1_t __trans_tmp_2;
+  vint64m1_t __trans_tmp_3;
+  vint64m1_t __trans_tmp_4;
+  if (__trans_tmp_0 == 1) {
+{
+  __trans_tmp_3 =
+  __riscv_vslideup_vx_i64m1(__trans_tmp_1, __trans_tmp_2, 1, 2);
+}
+__trans_tmp_4 = __trans_tmp_2;
+  }
+  __trans_tmp_4 = __trans_tmp_3;
+  __trans_tmp_9 = __riscv_vreinterpret_v_i64m1_i32m1(__trans_tmp_3);
+}
+  }
+  return vaddq_f32(__trans_tmp_9); /* { dg-error {RVV type 'vint32m1_t' cannot be passed to an unprototyped function} } */
+}
+
+char CFLOAT_add_args[3];
+const int *CFLOAT_add_steps;
+const int CFLOAT_steps;
+
+__attribute__(()) void CFLOAT_add() {
+  char *b_src0 = &CFLOAT_add_args[0], *b_src1 = &CFLOAT_add_args[1],
+   *b_dst = &CFLOAT_add_args[2];
+  const float *src1 = (float *)b_src1;
+  float *dst = (float *)b_dst;
+  const int ssrc1 = CFLOAT_add_steps[1] / sizeof(float);
+  const int sdst = CFLOAT_add_steps[2] / sizeof(float);
+  const int hstep = 4 / 2;
+  vfloat32m1x2_t a;
+  int len = 255;
+  for (; len > 0; len -= hstep, src1 += 4, dst += 4) {
+int b = vload_tillz_f32(len);
+int r = vaddq_f32(a.__val[0], b); /* { dg-error {RVV type '__rvv_float32m1_t' cannot be passed to an unprototyped function} } */
+  }
+  for (; len > 0; --len, b_src0 += CFLOAT_steps,
+  b_src1 += CFLOAT_add_steps[1], b_dst += CFLOAT_add_steps[2])
+;
+}
diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc
index 75217f5cde1..5cf1968bc26 100644
--- a/gcc/tree-ssa-pre.cc
+++ b/gcc/tree-ssa-pre.cc
@@ -2685,11 +2685,15 @@ create_component_ref_by_pieces_1 (basic_block block, vn_reference_t ref,
   here as the element alignment may be not visible.  See
   PR43783.  Simply drop the element size for constant
   sizes.  */
-   if (TREE_CODE (genop3) == INTEGER_CST
+   if ((TREE_CODE (genop3) == INTEGER_CST
&& TREE_CODE (TYPE_SIZE_UNIT (elmt_type)) == INTEGER_CST
&& wi::eq_p (wi::to_offset (TYPE_SIZE_UNIT (elmt_type)),
-(wi::to_offset (genop3)
- * vn_ref_op_align_unit (currop
+(wi::to_offset (genop3) * vn_ref_op_align_unit (currop
+ || (TREE_CODE (genop3) == EXACT_DIV_EXPR
+   && TREE_CODE (TREE_OPERAND (genop3, 1)) == INTEGER_CST
+   && operand_equal_p (TREE_OPERAND (genop3, 0), TYPE_SIZE_UNIT (elmt_type))
+   && wi::eq_p (wi::to_offset (TREE_OPERAND (genop3, 1)),
+vn_ref_op_align_unit (currop
  genop3 = NULL_TREE;
else
  {


[gcc r11-11457] MIPS16: Mark $2/$3 as clobbered if GP is used

2024-05-29 Thread YunQiang Su via Gcc-cvs
https://gcc.gnu.org/g:1bc4a777b21ae36b116e1842b7c482340ec929ef

commit r11-11457-g1bc4a777b21ae36b116e1842b7c482340ec929ef
Author: YunQiang Su 
Date:   Wed May 29 02:28:25 2024 +0800

MIPS16: Mark $2/$3 as clobbered if GP is used

PR Target/84790.
The gp init sequence
li      $2,%hi(_gp_disp)
addiu   $3,$pc,%lo(_gp_disp)
sll     $2,16
addu    $2,$3
is generated directly in `mips_output_function_prologue`, and does
not appear in the RTL.

As a result, the IRA/IPA passes are not aware that $2/$3 have been
clobbered, so they may use these registers across (local) function
calls.

Mark $2/$3 as clobbered in both places:
  - Just after the UNSPEC_GP RTL of a function;
  - Just after a function call.

Reported-by: Matthias Schiffer 
Origin-Patch-by: Felix Fietkau .

gcc
* config/mips/mips.c (mips16_gp_pseudo_reg): Mark
MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered.
(mips_emit_call_insn): Mark MIPS16_PIC_TEMP and
MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.

(cherry picked from commit 915440eed21de367cb41857afb5273aff5bcb737)

Diff:
---
 gcc/config/mips/mips.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index bb6ff08e94c..3cf09494aec 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -3138,6 +3138,9 @@ mips_emit_call_insn (rtx pattern, rtx orig_addr, rtx addr, bool lazy_p)
 {
   rtx post_call_tmp_reg = gen_rtx_REG (word_mode, POST_CALL_TMP_REG);
   clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), post_call_tmp_reg);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), MIPS16_PIC_TEMP);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn),
+   MIPS_PROLOGUE_TEMP (word_mode));
 }
 
   return insn;
@@ -3234,7 +3237,13 @@ mips16_gp_pseudo_reg (void)
   rtx set = gen_load_const_gp (cfun->machine->mips16_gp_pseudo_rtx);
   rtx_insn *insn = emit_insn_after (set, scan);
   INSN_LOCATION (insn) = 0;
-
+  /* NewABI support hasn't been implement.  NewABI should generate RTL
+sequence instead of ASM sequence directly.  */
+  if (mips_current_loadgp_style () == LOADGP_OLDABI)
+   {
+ emit_clobber (MIPS16_PIC_TEMP);
+ emit_clobber (MIPS_PROLOGUE_TEMP (Pmode));
+   }
   pop_topmost_sequence ();
 }


[gcc r15-918] [testsuite] conditionalize dg-additional-sources on target and type

2024-05-29 Thread Alexandre Oliva via Gcc-cvs
https://gcc.gnu.org/g:bdc264a16e327c63d133131a695a202fbbc0a6a0

commit r15-918-gbdc264a16e327c63d133131a695a202fbbc0a6a0
Author: Alexandre Oliva 
Date:   Thu May 30 02:06:48 2024 -0300

[testsuite] conditionalize dg-additional-sources on target and type

g++.dg/vect/pr95401.cc has dg-additional-sources, and that fails when
check_vect_support_and_set_flags finds vector support lacking for
execution tests: tests decay to compile tests, and additional sources
are rejected by the compiler when compiling to a named output file.

At first I considered using some effective target to conditionalize
the additional sources.  There was no support for target-specific
additional sources, so I added that.

But then, I found that adding an effective target to check whether the
test involves linking would just make for busy work in this case, and
so I went ahead and adjusted the handling of additional sources to
refrain from adding them on compile tests, reporting them as
unsupported.

That solves the problem without using the newly-added machinery for
per-target additional sources, but I figured since I'd implemented it
I might as well contribute it, since there might be other uses for it.


for  gcc/ChangeLog

* doc/sourcebuild.texi (dg-additional-sources): Document
newly-added support for target selectors, and implicit discard
on non-linking tests that name the compiler output explicitly.

for  gcc/testsuite/ChangeLog

* lib/gcc-defs.exp (dg-additional-sources): Support target
selectors.  Make it cumulative.
(dg-additional-files-options): Take dest and type.  Note
unsupported additional sources when not linking and naming the
compiler output.  Adjust source dirname prepending to cope
with leading blanks.
* lib/g++.exp (g++_target_compile): Pass dest and type on to
dg-additional-files-options.
* lib/gcc.exp (gcc_target_compile): Likewise.
* lib/gdc.exp (gdc_target_compile): Likewise.
* lib/gfortran.exp (gfortran_target_compile): Likewise.
* lib/go.exp (go_target_compile): Likewise.
* lib/obj-c++.exp (obj-c++_target_compile): Likewise.
* lib/objc.exp (objc_target_compile): Likewise.
* lib/rust.exp (rust_target_compile): Likewise.
* lib/profopt.exp (profopt-execute): Likewise-ish.

Diff:
---
 gcc/doc/sourcebuild.texi   |  8 +++-
 gcc/testsuite/lib/g++.exp  |  2 +-
 gcc/testsuite/lib/gcc-defs.exp | 35 ++-
 gcc/testsuite/lib/gcc.exp  |  2 +-
 gcc/testsuite/lib/gdc.exp  |  2 +-
 gcc/testsuite/lib/gfortran.exp |  2 +-
 gcc/testsuite/lib/go.exp   |  2 +-
 gcc/testsuite/lib/obj-c++.exp  |  2 +-
 gcc/testsuite/lib/objc.exp |  2 +-
 gcc/testsuite/lib/profopt.exp  |  2 +-
 gcc/testsuite/lib/rust.exp |  2 +-
 11 files changed, 46 insertions(+), 15 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 8e4e59ac44c..e997dbec333 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1320,9 +1320,15 @@ to @var{var_value} before execution of the program created by the test.
 Specify additional files, other than source files, that must be copied
 to the system where the compiler runs.
 
-@item @{ dg-additional-sources "@var{filelist}" @}
+@item @{ dg-additional-sources "@var{filelist}" [@{ target @var{selector} @}] @}
 Specify additional source files to appear in the compile line
 following the main test file.
+If the directive includes the optional @samp{@{ @var{selector} @}}
+then the additional sources are only added if the target system
+matches the @var{selector}.
+Additional sources are generally used only in @samp{link} and @samp{run}
+tests; they are reported as unsupported and discarded in other kinds of
+tests that direct the compiler to output to a single file.
 @end table
 
 @subsubsection Add checks at the end of a test
diff --git a/gcc/testsuite/lib/g++.exp b/gcc/testsuite/lib/g++.exp
index 0e47769c25b..a6b34d5d3a2 100644
--- a/gcc/testsuite/lib/g++.exp
+++ b/gcc/testsuite/lib/g++.exp
@@ -326,7 +326,7 @@ proc g++_target_compile { source dest type options } {
 append board_info($tboard,multilib_flags) " $flags_to_postpone"
 }
 
-set options [dg-additional-files-options $options $source]
+set options [dg-additional-files-options $options $source $dest $type]
 
 if { [target_info needs_status_wrapper] != "" && [info exists gluefile] } {
lappend options "libs=${gluefile}"
diff --git a/gcc/testsuite/lib/gcc-defs.exp b/gcc/testsuite/lib/gcc-defs.exp
index 70215ed4905..cdca4c254d6 100644
--- a/gcc/testsuite/lib/gcc-defs.exp
+++ b/gcc/testsuite/lib/gcc-defs.exp
@@ -307,7 +307,22 @@ set additional_sources_used ""
 
 proc dg-additional-sources { args } {
 g

[gcc r15-919] Don't reduce estimated unrolled size for innermost loop.

2024-05-29 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ef27b91b62c3aa8841c02665dffa8914c742fd37

commit r15-919-gef27b91b62c3aa8841c02665dffa8914c742fd37
Author: liuhongt 
Date:   Tue Feb 27 15:34:57 2024 +0800

Don't reduce estimated unrolled size for innermost loop.

For the innermost loop, after complete unrolling, it will most likely
not be possible to reduce the body size to 2/3.  The current 2/3
reduction causes some of the larger loops to be completely unrolled
during cunrolli, which then prevents them from being vectorized.  It
also increases register pressure.

The patch moves the 2/3 reduction from estimated_unrolled_size to
try_unroll_loop_completely.
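
The arithmetic involved can be sketched as follows.  This is a
simplified model, not the code in tree-ssa-loop-ivcanon.cc: `LoopSize`
is a hypothetical stand-in for GCC's `struct loop_size`, the fields
`overall` and `eliminated_by_peeling` are assumed here (only the
`last_iteration` fields appear in the diff excerpt), and
`apply_two_thirds` is an illustrative name for the discount that the
patch removes from the estimate and applies at the caller instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for GCC's struct loop_size.
struct LoopSize {
    int64_t overall;                               // insns in one body copy (assumed field)
    int64_t eliminated_by_peeling;                 // insns removed per copy (assumed field)
    int64_t last_iteration;                        // insns in the last copy
    int64_t last_iteration_eliminated_by_peeling;  // insns removed in the last copy
};

// Estimate of the unrolled size without any 2/3 discount, which is
// the shape of the estimate once the reduction has been moved out.
int64_t estimated_unrolled_size(const LoopSize& s, int64_t nunroll) {
    int64_t unr_insns = nunroll * (s.overall - s.eliminated_by_peeling);
    unr_insns += s.last_iteration - s.last_iteration_eliminated_by_peeling;
    return unr_insns;
}

// The discount removed from the estimate by the diff: scale by 2/3
// and never return less than 1.
int64_t apply_two_thirds(int64_t unr_insns) {
    unr_insns = unr_insns * 2 / 3;
    return unr_insns <= 0 ? 1 : unr_insns;
}
```

For a body of 30 insns with 3 eliminated per copy, a last iteration of
20 insns with 5 eliminated, and 8 unrolled copies, the raw estimate is
8 * 27 + 15 = 231 insns, and the discounted value is 154.  Whether a
loop passes a size limit can therefore depend on whether the discount
is applied.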

gcc/ChangeLog:

PR tree-optimization/112325
* tree-ssa-loop-ivcanon.cc (estimated_unrolled_size): Move the
2 / 3 loop body size reduction to ..
(try_unroll_loop_completely): .. here, add it for the check of
body size shrink, and the check of comparison against
param_max_completely_peeled_insns when
(!cunrolli || loop->inner).
(canonicalize_loop_induction_variables): Add new parameter
cunrolli and pass down.
(tree_unroll_loops_completely_1): Ditto.
(canonicalize_induction_variables): Pass cunrolli as false to
canonicalize_loop_induction_variables.
(tree_unroll_loops_completely): Set cunrolli to true at
beginning and set it to false after CHANGED is true.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr112325.c: New test.

Diff:
---
 gcc/testsuite/gcc.dg/vect/pr112325.c | 59 
 gcc/tree-ssa-loop-ivcanon.cc | 49 --
 2 files changed, 86 insertions(+), 22 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr112325.c b/gcc/testsuite/gcc.dg/vect/pr112325.c
new file mode 100644
index 000..71cf4099253
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr112325.c
@@ -0,0 +1,59 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -funroll-loops -fdump-tree-vect-details" } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-mavx2" { target x86_64-*-* i?86-*-* } } */
+
+typedef unsigned short ggml_fp16_t;
+static float table_f32_f16[1 << 16];
+
+inline static float ggml_lookup_fp16_to_fp32(ggml_fp16_t f) {
+unsigned short s;
+__builtin_memcpy(&s, &f, sizeof(unsigned short));
+return table_f32_f16[s];
+}
+
+typedef struct {
+ggml_fp16_t d;
+ggml_fp16_t m;
+unsigned char qh[4];
+unsigned char qs[32 / 2];
+} block_q5_1;
+
+typedef struct {
+float d;
+float s;
+char qs[32];
+} block_q8_1;
+
+void ggml_vec_dot_q5_1_q8_1(const int n, float * restrict s, const void * restrict vx, const void * restrict vy) {
+const int qk = 32;
+const int nb = n / qk;
+
+const block_q5_1 * restrict x = vx;
+const block_q8_1 * restrict y = vy;
+
+float sumf = 0.0;
+
+for (int i = 0; i < nb; i++) {
+unsigned qh;
+__builtin_memcpy(&qh, x[i].qh, sizeof(qh));
+
+int sumi = 0;
+
+for (int j = 0; j < qk/2; ++j) {
+const unsigned char xh_0 = ((qh >> (j + 0)) << 4) & 0x10;
+const unsigned char xh_1 = ((qh >> (j + 12)) ) & 0x10;
+
+const int x0 = (x[i].qs[j] & 0xF) | xh_0;
+const int x1 = (x[i].qs[j] >> 4) | xh_1;
+
+sumi += (x0 * y[i].qs[j]) + (x1 * y[i].qs[j + qk/2]);
+}
+
+sumf += (ggml_lookup_fp16_to_fp32(x[i].d)*y[i].d)*sumi + ggml_lookup_fp16_to_fp32(x[i].m)*y[i].s;
+}
+
+*s = sumf;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc
index bf017137260..5ef24a91917 100644
--- a/gcc/tree-ssa-loop-ivcanon.cc
+++ b/gcc/tree-ssa-loop-ivcanon.cc
@@ -437,11 +437,7 @@ tree_estimate_loop_size (class loop *loop, edge exit, edge edge_to_cancel,
It is (NUNROLL + 1) * size of loop body with taking into account
the fact that in last copy everything after exit conditional
is dead and that some instructions will be eliminated after
-   peeling.
-
-   Loop body is likely going to simplify further, this is difficult
-   to guess, we just decrease the result by 1/3.  */
-
+   peeling.  */
 static unsigned HOST_WIDE_INT
 estimated_unrolled_size (struct loop_size *size,
 unsigned HOST_WIDE_INT nunroll)
@@ -453,10 +449,6 @@ estimated_unrolled_size (struct loop_size *size,
 unr_insns = 0;
   unr_insns += size->last_iteration - size->last_iteration_eliminated_by_peeling;
 
-  unr_insns = unr_insns * 2 / 3;
-  if (unr_insns <= 0)
-unr_insns = 1;
-
   return unr_insns;
 }
 
@@ -734,7 +726,8 @@ try_unroll_loop_completely (class loop *loop,
edge exit, tree niter, bool may_be_zero,
enum unroll_level ul,
  

[gcc r15-920] Support vcond_mask_qiqi and friends.

2024-05-29 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b6c6d5abf0d31c936f50f8f9073c5e335b9e24b7

commit r15-920-gb6c6d5abf0d31c936f50f8f9073c5e335b9e24b7
Author: liuhongt 
Date:   Wed Feb 28 11:17:10 2024 +0800

Support vcond_mask_qiqi and friends.

gcc/ChangeLog:

* config/i386/sse.md (vcond_mask_): New expander.

gcc/testsuite/ChangeLog:
* gcc.target/i386/pr114125.c: New test.

Diff:
---
 gcc/config/i386/sse.md   | 20 
 gcc/testsuite/gcc.target/i386/pr114125.c | 10 ++
 2 files changed, 30 insertions(+)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 0f4fbcb2c5d..7cd912eeeb1 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4807,6 +4807,26 @@
   DONE;
 })
 
+(define_expand "vcond_mask_"
+  [(match_operand:SWI1248_AVX512BW 0 "register_operand")
+   (match_operand:SWI1248_AVX512BW 1 "register_operand")
+   (match_operand:SWI1248_AVX512BW 2 "register_operand")
+   (match_operand:SWI1248_AVX512BW 3 "register_operand")]
+  "TARGET_AVX512F"
+{
+  /* (operand[1] & operand[3]) | (operand[2] & ~operand[3])  */
+  rtx op1 = gen_reg_rtx (mode);
+  rtx op2 = gen_reg_rtx (mode);
+  rtx op3 = gen_reg_rtx (mode);
+
+  emit_insn (gen_and3 (op1, operands[1], operands[3]));
+  emit_insn (gen_one_cmpl2 (op3, operands[3]));
+  emit_insn (gen_and3 (op2, operands[2], op3));
+  emit_insn (gen_ior3 (operands[0], op1, op2));
+
+  DONE;
+})
+
 ;
 ;;
 ;; Parallel floating point logical operations
diff --git a/gcc/testsuite/gcc.target/i386/pr114125.c b/gcc/testsuite/gcc.target/i386/pr114125.c
new file mode 100644
index 000..e63fbffe965
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr114125.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=x86-64-v4 -fdump-tree-forwprop3-raw " } */
+
+typedef long vec __attribute__((vector_size(16)));
+vec f(vec x){
+  vec y = x < 10;
+  return y & (y == 0);
+}
+
+/* { dg-final { scan-tree-dump-not "_expr" "forwprop3" } } */
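
The comment in the new expander gives the formula
(operand[1] & operand[3]) | (operand[2] & ~operand[3]).  A minimal
scalar sketch of that bitwise select is shown below, assuming an 8-bit
mask-sized value; `mask_select` and its parameter names are
hypothetical, and the code only mirrors the and / one's-complement /
and / ior sequence of the expander rather than any GCC interface.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise select on an 8-bit value: each result bit comes from `a`
// where the corresponding bit of `mask` is 1 and from `b` where it is 0.
uint8_t mask_select(uint8_t a, uint8_t b, uint8_t mask) {
    uint8_t op1 = static_cast<uint8_t>(a & mask);   // bits taken from a
    uint8_t op3 = static_cast<uint8_t>(~mask);      // inverted mask
    uint8_t op2 = static_cast<uint8_t>(b & op3);    // bits taken from b
    return static_cast<uint8_t>(op1 | op2);         // combine both halves
}
```

For example, selecting between 0xF0 and 0x0F with mask 0xAA yields
0xA5: the bits set in the mask come from the first value, the rest
from the second.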