date:20240607

[gcc r15-1111] analyzer: Restore g++ 4.8 bootstrap; use std::move to return std::unique_ptr.

2024-06-07 Thread Roger Sayle via Gcc-cvs

https://gcc.gnu.org/g:e22b7f741ab54ff3a3f8a676ce9e7414fe174958

commit r15-1111-ge22b7f741ab54ff3a3f8a676ce9e7414fe174958
Author: Roger Sayle 
Date:   Sat Jun 8 05:01:38 2024 +0100

analyzer: Restore g++ 4.8 bootstrap; use std::move to return std::unique_ptr.

This patch restores bootstrap when using g++ 4.8 as a host compiler.
Returning a std::unique_ptr requires a std::move on C++ compilers
(pre-C++17) that don't guarantee copy elision/return value optimization.
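As a minimal sketch of the pattern this patch applies: the widget and
tree_widget types below are hypothetical stand-ins, not the analyzer's
text_art classes. The sketch assumes the case where the local variable's
type (a std::unique_ptr to a derived class) differs from the declared
return type (a std::unique_ptr to the base class), which is one situation
where an older host compiler may not treat the returned local as an rvalue.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for a base widget and a derived tree widget.
struct widget
{
  virtual ~widget () {}
  virtual int num_children () const { return 0; }
};

struct tree_widget : widget
{
  int m_children = 0;
  void add_child () { ++m_children; }
  int num_children () const override { return m_children; }
};

// The local is a unique_ptr<tree_widget>, but the function returns
// unique_ptr<widget>.  The explicit std::move requests the converting
// move constructor rather than relying on the compiler to treat the
// local as an rvalue in the return statement.
std::unique_ptr<widget>
make_dump_widget ()
{
  std::unique_ptr<tree_widget> w (new tree_widget ());
  w->add_child ();
  return std::move (w);
}
```

The sketch avoids std::make_unique so that it stays within C++11.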

2024-06-08  Roger Sayle  

gcc/analyzer/ChangeLog
* constraint-manager.cc (equiv_class::make_dump_widget): Use
std::move to return a std::unique_ptr.
(bounded_ranges_constraint::make_dump_widget): Likewise.
(constraint_manager::make_dump_widget): Likewise.
* program-state.cc (sm_state_map::make_dump_widget): Likewise.
(program_state::make_dump_widget): Likewise.
* region-model.cc (region_to_value_map::make_dump_widget): Likewise.
(region_model::make_dump_widget): Likewise.
* region.cc (region::make_dump_widget): Likewise.
* store.cc (binding_cluster::make_dump_widget): Likewise.
(store::make_dump_widget): Likewise.
* svalue.cc (svalue::make_dump_widget): Likewise.

Diff:
---
 gcc/analyzer/constraint-manager.cc | 6 +++---
 gcc/analyzer/program-state.cc  | 4 ++--
 gcc/analyzer/region-model.cc   | 4 ++--
 gcc/analyzer/region.cc | 2 +-
 gcc/analyzer/store.cc  | 4 ++--
 gcc/analyzer/svalue.cc | 2 +-
 6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/analyzer/constraint-manager.cc b/gcc/analyzer/constraint-manager.cc
index 707385d3fa6..883f33b2cdd 100644
--- a/gcc/analyzer/constraint-manager.cc
+++ b/gcc/analyzer/constraint-manager.cc
@@ -1176,7 +1176,7 @@ equiv_class::make_dump_widget (const text_art::dump_widget_info &dwi,
   ec_widget->add_child (tree_widget::make (dwi, &pp));
 }
 
-  return ec_widget;
+  return std::move (ec_widget);
 }
 
 /* Generate a hash value for this equiv_class.
@@ -1500,7 +1500,7 @@ make_dump_widget (const text_art::dump_widget_info &dwi) const
 (tree_widget::from_fmt (dwi, nullptr,
"ec%i bounded ranges", m_ec_id.as_int ()));
   m_ranges->add_to_dump_widget (*brc_widget.get (), dwi);
-  return brc_widget;
+  return std::move (brc_widget);
 }
 
 bool
@@ -1853,7 +1853,7 @@ constraint_manager::make_dump_widget (const text_art::dump_widget_info &dwi) con
   if (cm_widget->get_num_children () == 0)
 return nullptr;
 
-  return cm_widget;
+  return std::move (cm_widget);
 }
 
 /* Attempt to add the constraint LHS OP RHS to this constraint_manager.
diff --git a/gcc/analyzer/program-state.cc b/gcc/analyzer/program-state.cc
index dc2d4bdf7b0..efaf569a490 100644
--- a/gcc/analyzer/program-state.cc
+++ b/gcc/analyzer/program-state.cc
@@ -382,7 +382,7 @@ sm_state_map::make_dump_widget (const text_art::dump_widget_info &dwi,
   state_widget->add_child (tree_widget::make (dwi, pp));
 }
 
-  return state_widget;
+  return std::move (state_widget);
 }
 
 /* Return true if no states have been set within this map
@@ -1247,7 +1247,7 @@ program_state::make_dump_widget (const text_art::dump_widget_info &dwi) const
state_widget->add_child (smap->make_dump_widget (dwi, m_region_model));
   }
 
-  return state_widget;
+  return std::move (state_widget);
 }
 
 /* Update this program_state to reflect a top-level call to FUN.
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index a25181f2a3e..1a44ff073bd 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -288,7 +288,7 @@ make_dump_widget (const text_art::dump_widget_info &dwi) const
   sval->dump_to_pp (pp, true);
   w->add_child (text_art::tree_widget::make (dwi, pp));
 }
-  return w;
+  return std::move (w);
 }
 
 /* Attempt to merge THIS with OTHER, writing the result
@@ -556,7 +556,7 @@ region_model::make_dump_widget (const text_art::dump_widget_info &dwi) const
   m_mgr->get_store_manager ()));
   model_widget->add_child (m_constraints->make_dump_widget (dwi));
   model_widget->add_child (m_dynamic_extents.make_dump_widget (dwi));
-  return model_widget;
+  return std::move (model_widget);
 }
 
 /* Assert that this object is valid.  */
diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index 1fc42f2cd97..d5cfd476fd8 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -1101,7 +1101,7 @@ region::make_dump_widget (const text_art::dump_widget_info &dwi,
   if (m_parent)
 w->add_child (m_parent->make_dump_widget (dwi, "parent"));
 
-  return w;
+  return std::move (w);
 }
 
 void
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index d5c1a9f6aff..5a33d740ce2 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -1489,7 +1489,7 @@ binding_cluster::make_dump_widget (const

gcc-wwwdocs branch master updated. 97dfd479fc922ee33fb25096b99df8492152f750

2024-06-07 Thread Gerald Pfeifer via Gcc-cvs-wwwdocs

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gcc-wwwdocs".

The branch, master has been updated
   via  97dfd479fc922ee33fb25096b99df8492152f750 (commit)
  from  4260d675af42b9c97e29818ab3b3154d27103d49 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -
commit 97dfd479fc922ee33fb25096b99df8492152f750
Author: Gerald Pfeifer 
Date:   Sat Jun 8 00:15:45 2024 +0200

simtest-howto: Use https to link to our install docs

diff --git a/htdocs/simtest-howto.html b/htdocs/simtest-howto.html
index d9c027fd..3afbdb0b 100644
--- a/htdocs/simtest-howto.html
+++ b/htdocs/simtest-howto.html
@@ -115,7 +115,7 @@ cd gcc  find . -print | cpio -pdlmu ../combined 
 cd ..
 Build it
 
 Make sure the
-<a href="http://gcc.gnu.org/install/prerequisites.html">building
+<a href="https://gcc.gnu.org/install/prerequisites.html">building
 prerequisites for GCC are met, for example a host GCC no earlier
 than 3.4 or later, with C++ support enabled.
 

---

Summary of changes:
 htdocs/simtest-howto.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


hooks/post-receive
-- 
gcc-wwwdocs

[gcc r15-1108] analyzer: eliminate cast_region::m_original_region

2024-06-07 Thread David Malcolm via Gcc-cvs

https://gcc.gnu.org/g:70f26314b62e2d636b1f2d3db43e75abb026e535

commit r15-1108-g70f26314b62e2d636b1f2d3db43e75abb026e535
Author: David Malcolm 
Date:   Fri Jun 7 16:14:29 2024 -0400

analyzer: eliminate cast_region::m_original_region

cast_region had its own field m_original_region, rather than
simply using region::m_parent, leading to lots of pointless
special-casing of RK_CAST.

Remove the field and simply use the parent region.

Doing so revealed a bug (seen in gcc.dg/analyzer/taint-alloc-4.c)
where region_model::get_representative_path_var_1's RK_CAST case
was always failing, due to using the "parent region" (actually
that of the original region's parent), rather than the original region;
the patch fixes the bug by removing the distinction.
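The simplification can be pictured with a toy hierarchy. This is only a
sketch: the class and method names echo those in this commit message, but
the bodies are hypothetical and far simpler than the analyzer's real
region classes. Once a cast region stores the region being cast as its
ordinary parent, generic walks up the parent chain need no RK_CAST case.

```cpp
#include <cassert>

// Hypothetical, much-simplified region kinds.
enum region_kind { RK_DECL, RK_CAST };

class region
{
public:
  region (region_kind kind, const region *parent)
  : m_kind (kind), m_parent (parent) {}
  virtual ~region () {}

  region_kind get_kind () const { return m_kind; }
  const region *get_parent_region () const { return m_parent; }

  // Walk up the parent chain to the outermost region.  No special
  // case is needed for casts, since a cast's parent is the region
  // being cast.
  const region *get_base_region () const
  {
    const region *iter = this;
    while (iter->m_parent)
      iter = iter->m_parent;
    return iter;
  }

private:
  region_kind m_kind;
  const region *m_parent;
};

class cast_region : public region
{
public:
  // The region being cast is passed straight through as the parent,
  // so no separate "original region" field is stored.
  explicit cast_region (const region *parent)
  : region (RK_CAST, parent) {}
};
```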

gcc/analyzer/ChangeLog:
* call-summary.cc
(call_summary_replay::convert_region_from_summary_1): Update
for removal of cast_region::m_original_region.
* region-model-manager.cc
(region_model_manager::get_or_create_initial_value): Likewise.
* region-model.cc (region_model::get_store_value): Likewise.
* region.cc (region::get_base_region): Likewise.
(region::descendent_of_p): Likewise.
(region::maybe_get_frame_region): Likewise.
(region::get_memory_space): Likewise.
(region::calc_offset): Likewise.
(cast_region::accept): Delete.
(cast_region::dump_to_pp): Update for removal of
cast_region::m_original_region.
(cast_region::add_dump_widget_children): Delete.
* region.h (struct cast_region::key_t): Rename "original_region"
to "parent".
(cast_region::cast_region): Likewise.  Update for removal of
cast_region::m_original_region.
(cast_region::accept): Delete.
(cast_region::add_dump_widget_children): Delete.
(cast_region::get_original_region): Delete.
(cast_region::m_original_region): Delete.
* sm-taint.cc (region_model::check_region_for_taint): Remove
special-casing for RK_CAST.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/taint-alloc-4.c: Update expected result to
reflect change in message due to
region_model::get_representative_path_var_1 now handling RK_CAST.

Signed-off-by: David Malcolm 

Diff:
---
 gcc/analyzer/call-summary.cc  | 11 +++---
 gcc/analyzer/region-model-manager.cc  |  2 +-
 gcc/analyzer/region-model.cc  |  2 +-
 gcc/analyzer/region.cc| 50 ---
 gcc/analyzer/region.h | 37 +++-
 gcc/analyzer/sm-taint.cc  |  8 -
 gcc/testsuite/gcc.dg/analyzer/taint-alloc-4.c |  4 +--
 7 files changed, 29 insertions(+), 85 deletions(-)

diff --git a/gcc/analyzer/call-summary.cc b/gcc/analyzer/call-summary.cc
index 60ca78a334d..46b4e2a3bbd 100644
--- a/gcc/analyzer/call-summary.cc
+++ b/gcc/analyzer/call-summary.cc
@@ -726,13 +726,12 @@ call_summary_replay::convert_region_from_summary_1 (const region *summary_reg)
   {
const cast_region *summary_cast_reg
  = as_a <const cast_region *> (summary_reg);
-   const region *summary_original_reg
- = summary_cast_reg->get_original_region ();
-   const region *caller_original_reg
- = convert_region_from_summary (summary_original_reg);
-   if (!caller_original_reg)
+   const region *summary_parent_reg = summary_reg->get_parent_region ();
+   const region *caller_parent_reg
+ = convert_region_from_summary (summary_parent_reg);
+   if (!caller_parent_reg)
  return NULL;
-   return mgr->get_cast_region (caller_original_reg,
+   return mgr->get_cast_region (caller_parent_reg,
 summary_reg->get_type ());
   }
   break;
diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc
index b094b2f7e43..8154d914e81 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -327,7 +327,7 @@ region_model_manager::get_or_create_initial_value (const region *reg,
   /* The initial value of a cast is a cast of the initial value.  */
   if (const cast_region *cast_reg = reg->dyn_cast_cast_region ())
 {
-  const region *original_reg = cast_reg->get_original_region ();
+  const region *original_reg = cast_reg->get_parent_region ();
   return get_or_create_cast (cast_reg->get_type (),
 get_or_create_initial_value (original_reg));
 }
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index d6bcb8630cd..9f24011c17b 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -2933,7 +2933,7 @@ region_model::get_store_value (const region *reg,

[gcc r15-1109] analyzer: add logging to get_representative_path_var

2024-06-07 Thread David Malcolm via Gcc-cvs

https://gcc.gnu.org/g:d039eef925878e41e3df1448cac6add51dba6333

commit r15-1109-gd039eef925878e41e3df1448cac6add51dba6333
Author: David Malcolm 
Date:   Fri Jun 7 16:14:29 2024 -0400

analyzer: add logging to get_representative_path_var

This was very helpful when debugging the cast_region::m_original_region
removal, but is probably too verbose to enable except by hand on
specific calls to get_representative_tree.
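The general shape of such a change, threading an optional logger pointer
through a recursive lookup so that existing callers can pass nullptr, can
be sketched as follows. The logger class and the depth_of function here
are hypothetical illustrations, not the analyzer's own logger or
get_representative_path_var.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical minimal logger that simply records each message.
class logger
{
public:
  void log (const std::string &msg) { m_lines.push_back (msg); }
  const std::vector<std::string> &lines () const { return m_lines; }

private:
  std::vector<std::string> m_lines;
};

// Hypothetical recursive function with an optional logger parameter.
// Every recursive call forwards OUT, and every use is guarded so that
// callers not interested in logging can pass nullptr.
int
depth_of (int value, logger *out)
{
  if (out)
    out->log ("visiting " + std::to_string (value));
  if (value <= 0)
    return 0;
  return 1 + depth_of (value / 2, out);
}
```

Passing nullptr keeps the call silent; passing a logger records one line
per recursive step.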

gcc/analyzer/ChangeLog:
* engine.cc (impl_region_model_context::on_state_leak): Pass nullptr
to get_representative_path_var.
* region-model.cc (region_model::get_representative_path_var_1):
Add logger param and use it in both overloads.
(region_model::get_representative_path_var): Likewise.
(region_model::get_representative_tree): Likewise.
(selftest::test_get_representative_path_var): Pass nullptr to
get_representative_path_var.
* region-model.h (region_model::get_representative_tree): Add
optional logger param to both overloads.
(region_model::get_representative_path_var): Add logger param to
both overloads.
(region_model::get_representative_path_var_1): Likewise.
* store.cc (binding_cluster::get_representative_path_vars): Add
logger param and use it.
(store::get_representative_path_vars): Likewise.
* store.h (binding_cluster::get_representative_path_vars): Add
logger param.
(store::get_representative_path_vars): Likewise.

Signed-off-by: David Malcolm 

Diff:
---
 gcc/analyzer/engine.cc   |   3 +-
 gcc/analyzer/region-model.cc | 109 +--
 gcc/analyzer/region-model.h  |  18 ---
 gcc/analyzer/store.cc|  12 +++--
 gcc/analyzer/store.h |   2 +
 5 files changed, 109 insertions(+), 35 deletions(-)

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 8b3706cdfa8..30c0913c861 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -903,7 +903,8 @@ impl_region_model_context::on_state_leak (const state_machine &sm,
   svalue_set visited;
   path_var leaked_pv
 = m_old_state->m_region_model->get_representative_path_var (sval,
-   &visited);
+   &visited,
+   nullptr);
 
   /* Strip off top-level casts  */
   if (leaked_pv.m_tree && TREE_CODE (leaked_pv.m_tree) == NOP_EXPR)
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 9f24011c17b..a25181f2a3e 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5343,7 +5343,8 @@ region_model::eval_condition (tree lhs,
 
 path_var
 region_model::get_representative_path_var_1 (const svalue *sval,
-svalue_set *visited) const
+svalue_set *visited,
+logger *logger) const
 {
   gcc_assert (sval);
 
@@ -5360,7 +5361,8 @@ region_model::get_representative_path_var_1 (const svalue *sval,
   /* Handle casts by recursion into get_representative_path_var.  */
   if (const svalue *cast_sval = sval->maybe_undo_cast ())
 {
-  path_var result = get_representative_path_var (cast_sval, visited);
+  path_var result = get_representative_path_var (cast_sval, visited,
+logger);
   tree orig_type = sval->get_type ();
   /* If necessary, wrap the result in a cast.  */
   if (result.m_tree && orig_type)
@@ -5369,7 +5371,7 @@ region_model::get_representative_path_var_1 (const svalue *sval,
 }
 
   auto_vec<path_var> pvs;
-  m_store.get_representative_path_vars (this, visited, sval, &pvs);
+  m_store.get_representative_path_vars (this, visited, sval, logger, &pvs);
 
   if (tree cst = sval->maybe_get_constant ())
 pvs.safe_push (path_var (cst, 0));
@@ -5378,7 +5380,7 @@ region_model::get_representative_path_var_1 (const svalue *sval,
   if (const region_svalue *ptr_sval = sval->dyn_cast_region_svalue ())
 {
   const region *reg = ptr_sval->get_pointee ();
-  if (path_var pv = get_representative_path_var (reg, visited))
+  if (path_var pv = get_representative_path_var (reg, visited, logger))
return path_var (build1 (ADDR_EXPR,
 sval->get_type (),
 pv.m_tree),
@@ -5391,7 +5393,7 @@ region_model::get_representative_path_var_1 (const svalue *sval,
   const svalue *parent_sval = sub_sval->get_parent ();
   const region *subreg = sub_sval->get_subregion ();
   if (path_var parent_pv
-   = get_representative_path_var (parent_sval, visited))
+   = get_representative_path_var (parent_sval, visited, logger))
if

[gcc r15-1107] analyzer: new warning: -Wanalyzer-undefined-behavior-ptrdiff (PR analyzer/105892)

2024-06-07 Thread David Malcolm via Gcc-cvs

https://gcc.gnu.org/g:13dcaf1bb6d4f15665a47b14ac0c12cf454e38a2

commit r15-1107-g13dcaf1bb6d4f15665a47b14ac0c12cf454e38a2
Author: David Malcolm 
Date:   Fri Jun 7 16:14:28 2024 -0400

analyzer: new warning: -Wanalyzer-undefined-behavior-ptrdiff (PR analyzer/105892)

Add a new warning to complain about pointer subtraction involving
different chunks of memory.

For example, given:

  #include <stddef.h>

  int arr[42];
  int sentinel;

  ptrdiff_t
  test_invalid_calc_of_array_size (void)
  {
    return &sentinel - arr;
  }

this emits:

demo.c: In function ‘test_invalid_calc_of_array_size’:
demo.c:9:20: warning: undefined behavior when subtracting pointers [CWE-469] [-Wanalyzer-undefined-behavior-ptrdiff]
    9 |   return &sentinel - arr;
      |                    ^
  events 1-2
    │
    │    3 | int arr[42];
    │      |     ~~~
    │      |     |
    │      |     (2) underlying object for right-hand side of subtraction created here
    │    4 | int sentinel;
    │      |     ^~~~~~~~
    │      |     |
    │      |     (1) underlying object for left-hand side of subtraction created here
    │
    └──> ‘test_invalid_calc_of_array_size’: event 3
           │
           │    9 |   return &sentinel - arr;
           │      |                    ^
           │      |                    |
           │      |                    (3) ⚠️  subtraction of pointers has undefined behavior if they do not point into the same array object
           │
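For contrast, a well-defined way to compute the element count keeps both
operands of the subtraction inside (or one past the end of) the same
array, or avoids pointer arithmetic entirely. The following is a sketch;
the function name valid_calc_of_array_size and the constant array_size
are illustrative and do not come from the patch or its testsuite.

```cpp
#include <cassert>
#include <cstddef>

int arr[42];

// Both operands point into the same array object: the one-past-the-end
// pointer and the first element.  The rule quoted in event (3) is
// therefore satisfied.
std::ptrdiff_t
valid_calc_of_array_size ()
{
  return (arr + 42) - arr;
}

// An alternative that involves no pointer subtraction at all.
constexpr std::size_t array_size = sizeof (arr) / sizeof (arr[0]);
```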

gcc/analyzer/ChangeLog:
PR analyzer/105892
* analyzer.opt (Wanalyzer-undefined-behavior-ptrdiff): New option.
* analyzer.opt.urls: Regenerate.
* region-model.cc (class undefined_ptrdiff_diagnostic): New.
(check_for_invalid_ptrdiff): New.
(region_model::get_gassign_result): Call it for POINTER_DIFF_EXPR.

gcc/ChangeLog:
* doc/invoke.texi: Add -Wanalyzer-undefined-behavior-ptrdiff.

gcc/testsuite/ChangeLog:
PR analyzer/105892
* c-c++-common/analyzer/out-of-bounds-pr110387.c: Add
expected warnings about pointer subtraction.
* c-c++-common/analyzer/ptr-subtraction-1.c: New test.
* c-c++-common/analyzer/ptr-subtraction-CWE-469-example.c: New test.

Signed-off-by: David Malcolm 

Diff:
---
 gcc/analyzer/analyzer.opt  |   4 +
 gcc/analyzer/analyzer.opt.urls |   3 +
 gcc/analyzer/region-model.cc   | 141 +
 gcc/doc/invoke.texi|  16 +++
 .../c-c++-common/analyzer/out-of-bounds-pr110387.c |   4 +-
 .../c-c++-common/analyzer/ptr-subtraction-1.c  |  46 +++
 .../analyzer/ptr-subtraction-CWE-469-example.c |  81 
 7 files changed, 293 insertions(+), 2 deletions(-)

diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
index bbf2ba670d8..5335f7e1999 100644
--- a/gcc/analyzer/analyzer.opt
+++ b/gcc/analyzer/analyzer.opt
@@ -222,6 +222,10 @@ Wanalyzer-tainted-size
 Common Var(warn_analyzer_tainted_size) Init(1) Warning
 Warn about code paths in which an unsanitized value is used as a size.
 
+Wanalyzer-undefined-behavior-ptrdiff
+Common Var(warn_analyzer_undefined_behavior_ptrdiff) Init(1) Warning
+Warn about code paths in which pointer subtraction involves undefined behavior.
+
 Wanalyzer-undefined-behavior-strtok
 Common Var(warn_analyzer_undefined_behavior_strtok) Init(1) Warning
 Warn about code paths in which a call is made to strtok with undefined behavior.
diff --git a/gcc/analyzer/analyzer.opt.urls b/gcc/analyzer/analyzer.opt.urls
index 5fcab720582..18a0d6926de 100644
--- a/gcc/analyzer/analyzer.opt.urls
+++ b/gcc/analyzer/analyzer.opt.urls
@@ -114,6 +114,9 @@ UrlSuffix(gcc/Static-Analyzer-Options.html#index-Wanalyzer-tainted-offset)
 Wanalyzer-tainted-size
 UrlSuffix(gcc/Static-Analyzer-Options.html#index-Wanalyzer-tainted-size)
 
+Wanalyzer-undefined-behavior-ptrdiff
+UrlSuffix(gcc/Static-Analyzer-Options.html#index-Wanalyzer-undefined-behavior-ptrdiff)
+
 Wanalyzer-undefined-behavior-strtok
 UrlSuffix(gcc/Static-Analyzer-Options.html#index-Wanalyzer-undefined-behavior-strtok)
 
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index d142d851a26..d6bcb8630cd 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -841,6 +841,144 @@ private:
   tree m_count_cst;
 };
 
+/* A subclass of pending_diagnostic for complaining about pointer
+   subtractions involving unrelated buffers.  */
+
+class undefined_ptrdiff_diagnostic
+: public pending_diagnostic_subclass<undefined_ptrdiff_diagnostic>
+{
+public:
+  /* Region_creation_event subclass to give a custom wording when
+ talking about creation of buffers for LHS and RHS of the
+ subtraction.  */
+  class ptrdiff_region_creation_event : public

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for scalar unsigned SAT_ADD form 3

2024-06-07 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:4b3c0b3380d38553e76bbf01e1ac5b3f66dc3d5c

commit 4b3c0b3380d38553e76bbf01e1ac5b3f66dc3d5c
Author: Pan Li 
Date:   Mon Jun 3 10:24:47 2024 +0800

RISC-V: Add testcases for scalar unsigned SAT_ADD form 3

Now that the middle-end supports form 3 of unsigned SAT_ADD and the
RISC-V backend implements the scalar .SAT_ADD, add more test cases
to cover form 3 of unsigned .SAT_ADD.

Form 3:
  #define SAT_ADD_U_3(T) \
  T sat_add_u_3_##T (T x, T y) \
  { \
    T ret; \
    return __builtin_add_overflow (x, y, &ret) ? -1 : ret; \
  }

Passed the full rv64gcv regression tests.
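To make the behavior of form 3 concrete, here is the macro body expanded
by hand for an 8-bit unsigned type. This is an illustrative sketch rather
than one of the added tests; it assumes a compiler that provides the
__builtin_add_overflow built-in (GCC and Clang do).

```cpp
#include <cassert>
#include <cstdint>

// Form 3 written out for std::uint8_t.  __builtin_add_overflow stores
// the wrapped sum in RET and returns true if the mathematical sum did
// not fit in the result type.
std::uint8_t
sat_add_u_3_uint8_t (std::uint8_t x, std::uint8_t y)
{
  std::uint8_t ret;
  // On overflow, -1 converts to the all-ones value (255) on return.
  return __builtin_add_overflow (x, y, &ret) ? -1 : ret;
}
```

The result is clamped to the type's maximum instead of wrapping around.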

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test macro for form 3.
* gcc.target/riscv/sat_u_add-13.c: New test.
* gcc.target/riscv/sat_u_add-14.c: New test.
* gcc.target/riscv/sat_u_add-15.c: New test.
* gcc.target/riscv/sat_u_add-16.c: New test.
* gcc.target/riscv/sat_u_add-run-13.c: New test.
* gcc.target/riscv/sat_u_add-run-14.c: New test.
* gcc.target/riscv/sat_u_add-run-15.c: New test.
* gcc.target/riscv/sat_u_add-run-16.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit 39dde9200dd936339df7dd6c8f56e88866bcecc5)

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h| 10 +
 gcc/testsuite/gcc.target/riscv/sat_u_add-13.c | 19 +
 gcc/testsuite/gcc.target/riscv/sat_u_add-14.c | 21 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-15.c | 18 
 gcc/testsuite/gcc.target/riscv/sat_u_add-16.c | 17 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-13.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-14.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-15.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-16.c | 25 +++
 9 files changed, 185 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index d44fd63fd83..adb8be5886e 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -26,6 +26,15 @@ sat_u_add_##T##_fmt_3 (T x, T y)\
   return (T)(-overflow) | ret;  \
 }
 
+#define DEF_SAT_U_ADD_FMT_4(T)   \
+T __attribute__((noinline))  \
+sat_u_add_##T##_fmt_4 (T x, T y) \
+{\
+  T ret; \
+  return __builtin_add_overflow (x, y, &ret) ? -1 : ret; \
+}
+
+
 #define DEF_VEC_SAT_U_ADD_FMT_1(T)   \
 void __attribute__((noinline))   \
 vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
@@ -42,6 +51,7 @@ vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
 #define RUN_SAT_U_ADD_FMT_1(T, x, y) sat_u_add_##T##_fmt_1(x, y)
 #define RUN_SAT_U_ADD_FMT_2(T, x, y) sat_u_add_##T##_fmt_2(x, y)
 #define RUN_SAT_U_ADD_FMT_3(T, x, y) sat_u_add_##T##_fmt_3(x, y)
+#define RUN_SAT_U_ADD_FMT_4(T, x, y) sat_u_add_##T##_fmt_4(x, y)
 
 #define RUN_VEC_SAT_U_ADD_FMT_1(T, out, op_1, op_2, N) \
   vec_sat_u_add_##T##_fmt_1(out, op_1, op_2, N)
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-13.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-13.c
new file mode 100644
index 00000000000..b2d93f29f48
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-13.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint8_t_fmt_4:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** andi\s+a0,\s*a0,\s*0xff
+** ret
+*/
+DEF_SAT_U_ADD_FMT_4(uint8_t)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-14.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-14.c
new file mode 100644
index 00000000000..eafc578aafa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-14.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint16_t_fmt_4:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+**

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for scalar unsigned SAT_ADD form 1

2024-06-07 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:efe00579c04e02b6132c678962ce8050c8759bee

commit efe00579c04e02b6132c678962ce8050c8759bee
Author: Pan Li 
Date:   Wed May 29 14:15:45 2024 +0800

RISC-V: Add testcases for scalar unsigned SAT_ADD form 1

Now that the middle-end supports form 1 of unsigned SAT_ADD and the
RISC-V backend implements the scalar .SAT_ADD, add more test cases
to cover form 1 of unsigned .SAT_ADD.

Form 1:

  #define SAT_ADD_U_1(T)                   \
  T sat_add_u_1_##T(T x, T y)              \
  {                                        \
    return (T)(x + y) >= x ? (x + y) : -1; \
  }

Passed the full riscv regression tests.
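As a sketch of how form 1 behaves when instantiated, the macro can be
expanded for two narrow unsigned types. The choice of uint8_t and
uint16_t here is illustrative and is not taken from the new tests. The
cast (T)(x + y) truncates the promoted sum back to T, so a truncated sum
smaller than x signals that the addition wrapped.

```cpp
#include <cassert>
#include <stdint.h>

// Form 1 as given in the commit message.
#define SAT_ADD_U_1(T)                     \
  T sat_add_u_1_##T (T x, T y)             \
  {                                        \
    return (T)(x + y) >= x ? (x + y) : -1; \
  }

// Instantiate for two narrow unsigned types.  <stdint.h> is used so
// that the type names are plain identifiers usable with ## pasting.
SAT_ADD_U_1 (uint8_t)
SAT_ADD_U_1 (uint16_t)
```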

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add helper macro for form 1.
* gcc.target/riscv/sat_u_add-5.c: New test.
* gcc.target/riscv/sat_u_add-6.c: New test.
* gcc.target/riscv/sat_u_add-7.c: New test.
* gcc.target/riscv/sat_u_add-8.c: New test.
* gcc.target/riscv/sat_u_add-run-5.c: New test.
* gcc.target/riscv/sat_u_add-run-6.c: New test.
* gcc.target/riscv/sat_u_add-run-7.c: New test.
* gcc.target/riscv/sat_u_add-run-8.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit a737c2bf5212822b8225f65efa643a968e5a7c78)

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h   |  8 
 gcc/testsuite/gcc.target/riscv/sat_u_add-5.c | 19 ++
 gcc/testsuite/gcc.target/riscv/sat_u_add-6.c | 21 
 gcc/testsuite/gcc.target/riscv/sat_u_add-7.c | 18 +
 gcc/testsuite/gcc.target/riscv/sat_u_add-8.c | 17 
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-5.c | 25 
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-6.c | 25 
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-7.c | 25 
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-8.c | 25 
 9 files changed, 183 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index 2ef9fd825f3..2abc83d7666 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -10,6 +10,13 @@ sat_u_add_##T##_fmt_1 (T x, T y)   \
   return (x + y) | (-(T)((T)(x + y) < x)); \
 }
 
+#define DEF_SAT_U_ADD_FMT_2(T)   \
+T __attribute__((noinline))  \
+sat_u_add_##T##_fmt_2 (T x, T y) \
+{\
+  return (T)(x + y) >= x ? (x + y) : -1; \
+}
+
 #define DEF_VEC_SAT_U_ADD_FMT_1(T)   \
 void __attribute__((noinline))   \
 vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
@@ -24,6 +31,7 @@ vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
 }
 
 #define RUN_SAT_U_ADD_FMT_1(T, x, y) sat_u_add_##T##_fmt_1(x, y)
+#define RUN_SAT_U_ADD_FMT_2(T, x, y) sat_u_add_##T##_fmt_2(x, y)
 
 #define RUN_VEC_SAT_U_ADD_FMT_1(T, out, op_1, op_2, N) \
   vec_sat_u_add_##T##_fmt_1(out, op_1, op_2, N)
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-5.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-5.c
new file mode 100644
index 00000000000..4c73c7f8a21
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-5.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint8_t_fmt_2:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** andi\s+a0,\s*a0,\s*0xff
+** ret
+*/
+DEF_SAT_U_ADD_FMT_2(uint8_t)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-6.c b/gcc/testsuite/gcc.target/riscv/sat_u_add-6.c
new file mode 100644
index 00000000000..0d64f5631bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-6.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint16_t_fmt_2:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** slli\s+a0,\s*a0,\s*48
+** srli\s+a0,\s*a0,\s*48
+** ret
+*/
+DEF_SAT_U_ADD_FMT_2(uint16_t)
+
+/* { dg-final { scan-rtl-dump-times

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Regenerate opt urls.

2024-06-07 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:4f20feccf708ff7a7af5d776ca87d4995ef46f76

commit 4f20feccf708ff7a7af5d776ca87d4995ef46f76
Author: Robin Dapp 
Date:   Thu Jun 6 09:32:28 2024 +0200

RISC-V: Regenerate opt urls.

I wasn't aware that I needed to regenerate the opt urls when
adding an option.  This patch does that.

gcc/ChangeLog:

* config/riscv/riscv.opt.urls: Regenerate.

(cherry picked from commit 037fc4d1012dc9d533862ef7e2c946249877dd71)

Diff:
---
 gcc/config/riscv/riscv.opt.urls | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/config/riscv/riscv.opt.urls b/gcc/config/riscv/riscv.opt.urls
index d87e9d5c9a8..622cb6e7b44 100644
--- a/gcc/config/riscv/riscv.opt.urls
+++ b/gcc/config/riscv/riscv.opt.urls
@@ -47,6 +47,12 @@ UrlSuffix(gcc/RISC-V-Options.html#index-mcmodel_003d-4)
 mstrict-align
 UrlSuffix(gcc/RISC-V-Options.html#index-mstrict-align-4)
 
+mscalar-strict-align
+UrlSuffix(gcc/RISC-V-Options.html#index-mscalar-strict-align)
+
+mvector-strict-align
+UrlSuffix(gcc/RISC-V-Options.html#index-mvector-strict-align)
+
 ; skipping UrlSuffix for 'mexplicit-relocs' due to finding no URLs
 
 mrelax

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for scalar unsigned SAT_ADD form 5

2024-06-07 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:1a6d2ed7fbd20bfa3079da4700eb591f2abaa395

commit 1a6d2ed7fbd20bfa3079da4700eb591f2abaa395
Author: Pan Li 
Date:   Mon Jun 3 10:43:10 2024 +0800

RISC-V: Add testcases for scalar unsigned SAT_ADD form 5

Now that the middle-end supports form 5 of unsigned SAT_ADD and the
RISC-V backend implements the scalar .SAT_ADD, add more test cases
to cover form 5 of unsigned .SAT_ADD.

Form 5:
  #define SAT_ADD_U_5(T) \
  T sat_add_u_5_##T(T x, T y) \
  { \
    return (T)(x + y) < x ? -1 : (x + y); \
  }

Passed the full riscv regression tests.
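One way to gain confidence that form 5 really is a saturating add is to
compare it exhaustively against a straightforward reference for an 8-bit
type. The sketch below does that; reference_sat_add and count_mismatches
are hypothetical helper names and are not part of the testsuite.

```cpp
#include <cassert>
#include <stdint.h>

// Form 5 as given in the commit message.
#define SAT_ADD_U_5(T)                    \
  T sat_add_u_5_##T (T x, T y)            \
  {                                       \
    return (T)(x + y) < x ? -1 : (x + y); \
  }

SAT_ADD_U_5 (uint8_t)

// Reference implementation: add in a wider type, then clamp.
static uint8_t
reference_sat_add (uint8_t x, uint8_t y)
{
  unsigned sum = (unsigned) x + (unsigned) y;
  if (sum > UINT8_MAX)
    return UINT8_MAX;
  return (uint8_t) sum;
}

// Compare form 5 against the reference for all 65536 input pairs and
// return the number of disagreements.
int
count_mismatches ()
{
  int mismatches = 0;
  for (unsigned x = 0; x <= UINT8_MAX; ++x)
    for (unsigned y = 0; y <= UINT8_MAX; ++y)
      if (sat_add_u_5_uint8_t ((uint8_t) x, (uint8_t) y)
          != reference_sat_add ((uint8_t) x, (uint8_t) y))
        ++mismatches;
  return mismatches;
}
```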

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test macro for form 5.
* gcc.target/riscv/sat_u_add-21.c: New test.
* gcc.target/riscv/sat_u_add-22.c: New test.
* gcc.target/riscv/sat_u_add-23.c: New test.
* gcc.target/riscv/sat_u_add-24.c: New test.
* gcc.target/riscv/sat_u_add-run-21.c: New test.
* gcc.target/riscv/sat_u_add-run-22.c: New test.
* gcc.target/riscv/sat_u_add-run-23.c: New test.
* gcc.target/riscv/sat_u_add-run-24.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit 93f44e18cddb2b5eb3a00232d3be9a5bc8179f25)

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h|  8 
 gcc/testsuite/gcc.target/riscv/sat_u_add-21.c | 19 +
 gcc/testsuite/gcc.target/riscv/sat_u_add-22.c | 21 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-23.c | 18 
 gcc/testsuite/gcc.target/riscv/sat_u_add-24.c | 17 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-21.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-22.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-23.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-24.c | 25 +++
 9 files changed, 183 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index 6ca158d57c4..976ef1c44c1 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -42,6 +42,13 @@ sat_u_add_##T##_fmt_5 (T x, T y) \
   return __builtin_add_overflow (x, y, &ret) == 0 ? ret : -1; \
 }
 
+#define DEF_SAT_U_ADD_FMT_6(T)  \
+T __attribute__((noinline)) \
+sat_u_add_##T##_fmt_6 (T x, T y)\
+{   \
+  return (T)(x + y) < x ? -1 : (x + y); \
+}
+
 #define DEF_VEC_SAT_U_ADD_FMT_1(T)   \
 void __attribute__((noinline))   \
 vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
@@ -60,6 +67,7 @@ vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
 #define RUN_SAT_U_ADD_FMT_3(T, x, y) sat_u_add_##T##_fmt_3(x, y)
 #define RUN_SAT_U_ADD_FMT_4(T, x, y) sat_u_add_##T##_fmt_4(x, y)
 #define RUN_SAT_U_ADD_FMT_5(T, x, y) sat_u_add_##T##_fmt_5(x, y)
+#define RUN_SAT_U_ADD_FMT_6(T, x, y) sat_u_add_##T##_fmt_6(x, y)
 
 #define RUN_VEC_SAT_U_ADD_FMT_1(T, out, op_1, op_2, N) \
   vec_sat_u_add_##T##_fmt_1(out, op_1, op_2, N)
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-21.c 
b/gcc/testsuite/gcc.target/riscv/sat_u_add-21.c
new file mode 100644
index 000..f75e35a5fa9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-21.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details 
-fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint8_t_fmt_6:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** andi\s+a0,\s*a0,\s*0xff
+** ret
+*/
+DEF_SAT_U_ADD_FMT_6(uint8_t)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-22.c 
b/gcc/testsuite/gcc.target/riscv/sat_u_add-22.c
new file mode 100644
index 000..ad957a061f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-22.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details 
-fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint16_t_fmt_6:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** slli\s+a0,\s*a0,\s*48
+**

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for scalar unsigned SAT_ADD form 2

2024-06-07 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:6f5119eed91a2ec0708e38c9f2e5d58169a3f53e

commit 6f5119eed91a2ec0708e38c9f2e5d58169a3f53e
Author: Pan Li 
Date:   Mon Jun 3 09:35:49 2024 +0800

RISC-V: Add testcases for scalar unsigned SAT_ADD form 2

After the middle-end supports form 2 of unsigned SAT_ADD and
the RISC-V backend implements the scalar .SAT_ADD, add more test
cases to cover form 2 of unsigned .SAT_ADD.

Form 2:

  #define SAT_ADD_U_2(T) \
  T sat_add_u_2_##T(T x, T y) \
  { \
T ret; \
T overflow = __builtin_add_overflow (x, y, &ret); \
return (T)(-overflow) | ret; \
  }

Passed the rv64gcv full regression test.
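
The mask trick in form 2 may be easier to follow when written out for one concrete type. The sketch below expands it by hand for uint16_t; the name sat_add_u16_mask is an assumption made for this note only.

```c
#include <stdint.h>

/* Hypothetical hand-expanded instance of the mask form for uint16_t.
   __builtin_add_overflow returns 1 on overflow and 0 otherwise, so
   -overflow is either all ones or zero.  OR-ing that mask into the
   wrapped sum yields 0xffff on overflow and the plain sum otherwise.  */
static uint16_t
sat_add_u16_mask (uint16_t x, uint16_t y)
{
  uint16_t ret;
  uint16_t overflow = __builtin_add_overflow (x, y, &ret);
  return (uint16_t) (-overflow) | ret;
}
```

With these definitions, sat_add_u16_mask (40000, 30000) yields 65535 and sat_add_u16_mask (1000, 2000) yields 3000.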

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test macro for form 2.
* gcc.target/riscv/sat_u_add-10.c: New test.
* gcc.target/riscv/sat_u_add-11.c: New test.
* gcc.target/riscv/sat_u_add-12.c: New test.
* gcc.target/riscv/sat_u_add-9.c: New test.
* gcc.target/riscv/sat_u_add-run-10.c: New test.
* gcc.target/riscv/sat_u_add-run-11.c: New test.
* gcc.target/riscv/sat_u_add-run-12.c: New test.
* gcc.target/riscv/sat_u_add-run-9.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit 0261ed4337f62c247b33145a81cd4fb5a69bc5a7)

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h| 10 +
 gcc/testsuite/gcc.target/riscv/sat_u_add-10.c | 21 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-11.c | 18 
 gcc/testsuite/gcc.target/riscv/sat_u_add-12.c | 17 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-9.c  | 19 +
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-10.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-11.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-12.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-9.c  | 25 +++
 9 files changed, 185 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h 
b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index 2abc83d7666..d44fd63fd83 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -17,6 +17,15 @@ sat_u_add_##T##_fmt_2 (T x, T y) \
   return (T)(x + y) >= x ? (x + y) : -1; \
 }
 
+#define DEF_SAT_U_ADD_FMT_3(T)  \
+T __attribute__((noinline)) \
+sat_u_add_##T##_fmt_3 (T x, T y)\
+{   \
+  T ret;\
+  T overflow = __builtin_add_overflow (x, y, &ret); \
+  return (T)(-overflow) | ret;  \
+}
+
 #define DEF_VEC_SAT_U_ADD_FMT_1(T)   \
 void __attribute__((noinline))   \
 vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
@@ -32,6 +41,7 @@ vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned 
limit) \
 
 #define RUN_SAT_U_ADD_FMT_1(T, x, y) sat_u_add_##T##_fmt_1(x, y)
 #define RUN_SAT_U_ADD_FMT_2(T, x, y) sat_u_add_##T##_fmt_2(x, y)
+#define RUN_SAT_U_ADD_FMT_3(T, x, y) sat_u_add_##T##_fmt_3(x, y)
 
 #define RUN_VEC_SAT_U_ADD_FMT_1(T, out, op_1, op_2, N) \
   vec_sat_u_add_##T##_fmt_1(out, op_1, op_2, N)
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-10.c 
b/gcc/testsuite/gcc.target/riscv/sat_u_add-10.c
new file mode 100644
index 000..3f627ef80b1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-10.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details 
-fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint16_t_fmt_3:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** slli\s+a0,\s*a0,\s*48
+** srli\s+a0,\s*a0,\s*48
+** ret
+*/
+DEF_SAT_U_ADD_FMT_3(uint16_t)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-11.c 
b/gcc/testsuite/gcc.target/riscv/sat_u_add-11.c
new file mode 100644
index 000..b6dc779b212
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-11.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details 
-fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint32_t_fmt_3:
+** addw\s+[atx][0-9]+,\s*a0,\s*a1
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+**

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for scalar unsigned SAT_ADD form 4

2024-06-07 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:b93df02d58c0c448c4b524c07bdf5f3d7c305378

commit b93df02d58c0c448c4b524c07bdf5f3d7c305378
Author: Pan Li 
Date:   Mon Jun 3 10:33:15 2024 +0800

RISC-V: Add testcases for scalar unsigned SAT_ADD form 4

After the middle-end supports form 4 of unsigned SAT_ADD and
the RISC-V backend implements the scalar .SAT_ADD, add more test
cases to cover form 4 of unsigned .SAT_ADD.

Form 4:
  #define SAT_ADD_U_4(T) \
  T sat_add_u_4_##T (T x, T y) \
  { \
T ret; \
return __builtin_add_overflow (x, y, &ret) == 0 ? ret : -1; \
  }

Passed the rv64gcv full regression test.
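
One way to convince oneself that form 4 really is a saturating add is to compare it against a wider-type reference over every 8-bit input pair. The sketch below does that; all three function names are invented for this note and do not appear in the testsuite.

```c
#include <stdint.h>

/* Hypothetical hand-expanded instance of form 4 for uint8_t.  */
static uint8_t
sat_add_u8_form4 (uint8_t x, uint8_t y)
{
  uint8_t ret;
  return __builtin_add_overflow (x, y, &ret) == 0 ? ret : (uint8_t) -1;
}

/* Reference: add in a wider type, then clamp to UINT8_MAX.  */
static uint8_t
sat_add_u8_ref (uint8_t x, uint8_t y)
{
  unsigned sum = (unsigned) x + y;
  return sum > UINT8_MAX ? UINT8_MAX : (uint8_t) sum;
}

/* Count the input pairs on which the two functions disagree.  */
static unsigned
count_mismatches_u8 (void)
{
  unsigned bad = 0;
  for (unsigned x = 0; x <= UINT8_MAX; x++)
    for (unsigned y = 0; y <= UINT8_MAX; y++)
      if (sat_add_u8_form4 ((uint8_t) x, (uint8_t) y)
          != sat_add_u8_ref ((uint8_t) x, (uint8_t) y))
        bad++;
  return bad;
}
```

count_mismatches_u8 () is expected to return 0, since both functions compute min (x + y, 255).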

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test macro for form 4.
* gcc.target/riscv/sat_u_add-17.c: New test.
* gcc.target/riscv/sat_u_add-18.c: New test.
* gcc.target/riscv/sat_u_add-19.c: New test.
* gcc.target/riscv/sat_u_add-20.c: New test.
* gcc.target/riscv/sat_u_add-run-17.c: New test.
* gcc.target/riscv/sat_u_add-run-18.c: New test.
* gcc.target/riscv/sat_u_add-run-19.c: New test.
* gcc.target/riscv/sat_u_add-run-20.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit a171aac72408837ed0b20e3912a22c5b4891ace4)

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h|  8 
 gcc/testsuite/gcc.target/riscv/sat_u_add-17.c | 19 +
 gcc/testsuite/gcc.target/riscv/sat_u_add-18.c | 21 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-19.c | 18 
 gcc/testsuite/gcc.target/riscv/sat_u_add-20.c | 17 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-17.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-18.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-19.c | 25 +++
 gcc/testsuite/gcc.target/riscv/sat_u_add-run-20.c | 25 +++
 9 files changed, 183 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h 
b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index adb8be5886e..6ca158d57c4 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -34,6 +34,13 @@ sat_u_add_##T##_fmt_4 (T x, T y) \
   return __builtin_add_overflow (x, y, &ret) ? -1 : ret; \
 }
 
+#define DEF_SAT_U_ADD_FMT_5(T)\
+T __attribute__((noinline))   \
+sat_u_add_##T##_fmt_5 (T x, T y)  \
+{ \
+  T ret;  \
+  return __builtin_add_overflow (x, y, &ret) == 0 ? ret : -1; \
+}
 
 #define DEF_VEC_SAT_U_ADD_FMT_1(T)   \
 void __attribute__((noinline))   \
@@ -52,6 +59,7 @@ vec_sat_u_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned 
limit) \
 #define RUN_SAT_U_ADD_FMT_2(T, x, y) sat_u_add_##T##_fmt_2(x, y)
 #define RUN_SAT_U_ADD_FMT_3(T, x, y) sat_u_add_##T##_fmt_3(x, y)
 #define RUN_SAT_U_ADD_FMT_4(T, x, y) sat_u_add_##T##_fmt_4(x, y)
+#define RUN_SAT_U_ADD_FMT_5(T, x, y) sat_u_add_##T##_fmt_5(x, y)
 
 #define RUN_VEC_SAT_U_ADD_FMT_1(T, out, op_1, op_2, N) \
   vec_sat_u_add_##T##_fmt_1(out, op_1, op_2, N)
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-17.c 
b/gcc/testsuite/gcc.target/riscv/sat_u_add-17.c
new file mode 100644
index 000..7085ac835f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-17.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details 
-fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint8_t_fmt_5:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** andi\s+a0,\s*a0,\s*0xff
+** ret
+*/
+DEF_SAT_U_ADD_FMT_5(uint8_t)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add-18.c 
b/gcc/testsuite/gcc.target/riscv/sat_u_add-18.c
new file mode 100644
index 000..355ff8ba4ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add-18.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details 
-fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_uint16_t_fmt_5:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+**

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode.

2024-06-07 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:e46fc82745c1a917ade318222d514c881c68ce1a

commit e46fc82745c1a917ade318222d514c881c68ce1a
Author: liuhongt 
Date:   Fri Apr 19 10:29:34 2024 +0800

Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode.

When mask is ((1 << (prec - imm)) - 1), which clears the upper bits
of A, the AND can be simplified to LSHIFTRT.

i.e. simplify
(and:v8hi
  (ashiftrt:v8hi A 8)
  (const_vector 0xff x8))
to
(lshiftrt:v8hi A 8)
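
The identity behind this simplification can be checked on a single 16-bit lane in plain C. The sketch below is scalar only and is not the RTL code; the function names are made up for this note, and it assumes that >> on a negative signed value is an arithmetic shift, which is implementation-defined in ISO C but is what GCC does.

```c
#include <stdint.h>

/* (AND (ASHIFTRT a imm) mask) for one 16-bit lane, with
   mask = (1 << (16 - imm)) - 1.  Assumes arithmetic >> on negative
   values (implementation-defined in C, true for GCC).  */
static uint16_t
and_of_ashiftrt (int16_t a, unsigned imm)
{
  uint16_t mask = (uint16_t) ((1u << (16 - imm)) - 1u);
  return (uint16_t) (a >> imm) & mask;
}

/* (LSHIFTRT a imm) for one 16-bit lane.  */
static uint16_t
lshiftrt16 (int16_t a, unsigned imm)
{
  return (uint16_t) ((uint16_t) a >> imm);
}

/* Count disagreements over every lane value and every shift count
   below the lane precision.  */
static unsigned
count_shift_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned imm = 0; imm < 16; imm++)
    for (int a = INT16_MIN; a <= INT16_MAX; a++)
      if (and_of_ashiftrt ((int16_t) a, imm) != lshiftrt16 ((int16_t) a, imm))
        bad++;
  return bad;
}
```

The mask removes exactly the copies of the sign bit that the arithmetic shift brought in, so only the bits a logical shift would produce remain.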

gcc/ChangeLog:

PR target/114428
* simplify-rtx.cc
(simplify_context::simplify_binary_operation_1):
Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for
specific mask.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr114428-1.c: New test.

(cherry picked from commit 7876cde25cbd2f026a0ae488e5263e72f8e9bfa0)

Diff:
---
 gcc/simplify-rtx.cc| 25 +++
 gcc/testsuite/gcc.target/i386/pr114428-1.c | 39 ++
 2 files changed, 64 insertions(+)

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index bb562c3af2c..216aedbe7e2 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -4050,6 +4050,31 @@ simplify_context::simplify_binary_operation_1 (rtx_code 
code,
return tem;
}
 
+  /* (and:v4si
+  (ashiftrt:v4si A 16)
+  (const_vector: 0xffff x4))
+is just (lshiftrt:v4si A 16).  */
+  if (VECTOR_MODE_P (mode) && GET_CODE (op0) == ASHIFTRT
+ && (CONST_INT_P (XEXP (op0, 1))
+ || (GET_CODE (XEXP (op0, 1)) == CONST_VECTOR
+ && CONST_VECTOR_DUPLICATE_P (XEXP (op0, 1))))
+ && GET_CODE (op1) == CONST_VECTOR
+ && CONST_VECTOR_DUPLICATE_P (op1))
+   {
+ unsigned HOST_WIDE_INT shift_count
+   = (CONST_INT_P (XEXP (op0, 1))
+  ? UINTVAL (XEXP (op0, 1))
+  : UINTVAL (XVECEXP (XEXP (op0, 1), 0, 0)));
+ unsigned HOST_WIDE_INT inner_prec
+   = GET_MODE_PRECISION (GET_MODE_INNER (mode));
+
+ /* Avoid UD shift count.  */
+ if (shift_count < inner_prec
+ && (UINTVAL (XVECEXP (op1, 0, 0))
+ == (HOST_WIDE_INT_1U << (inner_prec - shift_count)) - 1))
+   return simplify_gen_binary (LSHIFTRT, mode, XEXP (op0, 0), XEXP 
(op0, 1));
+   }
+
   tem = simplify_byte_swapping_operation (code, mode, op0, op1);
   if (tem)
return tem;
diff --git a/gcc/testsuite/gcc.target/i386/pr114428-1.c 
b/gcc/testsuite/gcc.target/i386/pr114428-1.c
new file mode 100644
index 000..927476f2269
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr114428-1.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2" } */
+/* { dg-final { scan-assembler-times "psrlw" 1 } } */
+/* { dg-final { scan-assembler-times "psrld" 1 } } */
+/* { dg-final { scan-assembler-times "psrlq" 1 { target { ! ia32 } } } } */
+
+
+#define SHIFTC 12
+
+typedef int v4si __attribute__((vector_size(16)));
+typedef short v8hi __attribute__((vector_size(16)));
+typedef long long v2di __attribute__((vector_size(16)));
+
+v8hi
+foo1 (v8hi a)
+{
+  return
+(a >> (16 - SHIFTC)) & (__extension__(v8hi){(1<> (32 - SHIFTC)) & (__extension__(v4si){(1<> (long long)(64 - SHIFTC)) & (__extension__(v2di){(1ULL<

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Introduce -mvector-strict-align.

2024-06-07 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:0e0b666a30f53364292432903b68febd85a3e114

commit 0e0b666a30f53364292432903b68febd85a3e114
Author: Robin Dapp 
Date:   Tue May 28 21:19:26 2024 +0200

RISC-V: Introduce -mvector-strict-align.

This patch disables movmisalign by default and introduces
the -mno-vector-strict-align option to override it and re-enable
movmisalign.  For now, generic-ooo is the only uarch that supports
misaligned vector access.

The patch also adds a check_effective_target_riscv_v_misalign_ok to
the testsuite which enables or disables the vector misalignment tests
depending on whether the target under test can execute a misaligned
vle32.

Changes from v3:
 - Addressed Kito's comments.
 - Made -mscalar-strict-align a real alias.

gcc/ChangeLog:

* config/riscv/riscv-opts.h (TARGET_VECTOR_MISALIGN_SUPPORTED):
Move from here...
* config/riscv/riscv.h (TARGET_VECTOR_MISALIGN_SUPPORTED):
...to here and map to riscv_vector_unaligned_access_p.
* config/riscv/riscv.opt: Add -mvector-strict-align.
* config/riscv/riscv.cc (struct riscv_tune_param): Add
vector_unaligned_access.
(riscv_override_options_internal): Set
riscv_vector_unaligned_access_p.
* doc/invoke.texi: Document -mvector-strict-align.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add
check_effective_target_riscv_v_misalign_ok.
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: Add
-mno-vector-strict-align.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c: Ditto.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c: Ditto.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c: Ditto.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-8.c: Ditto.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/misalign-1.c: Ditto.

(cherry picked from commit 68b0742a49de7122d5023f0bf46460ff2fb3e3dd)

Diff:
---
 gcc/config/riscv/riscv-opts.h  |  3 --
 gcc/config/riscv/riscv.cc  | 19 
 gcc/config/riscv/riscv.h   |  5 
 gcc/config/riscv/riscv.opt |  8 +
 gcc/doc/invoke.texi| 16 ++
 .../vect/costmodel/riscv/rvv/dynamic-lmul2-7.c |  2 +-
 .../vect/costmodel/riscv/rvv/vla_vs_vls-10.c   |  2 +-
 .../vect/costmodel/riscv/rvv/vla_vs_vls-11.c   |  2 +-
 .../vect/costmodel/riscv/rvv/vla_vs_vls-12.c   |  2 +-
 .../gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-8.c |  2 +-
 .../gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-9.c |  2 +-
 .../gcc.target/riscv/rvv/autovec/vls/misalign-1.c  |  2 +-
 gcc/testsuite/lib/target-supports.exp  | 35 --
 13 files changed, 88 insertions(+), 12 deletions(-)

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1b2dd5757a8..f58a07abffc 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -147,9 +147,6 @@ enum rvv_vector_bits_enum {
  ? 0   
\
  : 32 << (__builtin_popcount (opts->x_riscv_zvl_flags) - 1))
 
-/* TODO: Enable RVV movmisalign by default for now.  */
-#define TARGET_VECTOR_MISALIGN_SUPPORTED 1
-
 /* The maximmum LMUL according to user configuration.  */
 #define TARGET_MAX_LMUL
\
   (int) (rvv_max_lmul == RVV_DYNAMIC ? RVV_M8 : rvv_max_lmul)
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index c5c4c777349..9704ff9c6a0 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -288,6 +288,7 @@ struct riscv_tune_param
   unsigned short memory_cost;
   unsigned short fmv_cost;
   bool slow_unaligned_access;
+  bool vector_unaligned_access;
   bool use_divmod_expansion;
   bool overlap_op_by_pieces;
   unsigned int fusible_ops;
@@ -300,6 +301,10 @@ struct riscv_tune_param
 /* Whether unaligned accesses execute very slowly.  */
 bool riscv_slow_unaligned_access_p;
 
+/* Whether misaligned vector accesses are supported (i.e. do not
+   throw an exception).  */
+bool riscv_vector_unaligned_access_p;
+
 /* Whether user explicitly passed -mstrict-align.  */
 bool riscv_user_wants_strict_align;
 
@@ -442,6 +447,7 @@ static const struct riscv_tune_param rocket_tune_info = {
   5,   /* memory_cost */
   8,   /* fmv_cost */
   true,/* 
slow_unaligned_access */
+  false,   /* vector_unaligned_access */
   false,   /* use_divmod_expansion */
   false,

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add Zfbfmin extension

2024-06-07 Thread Jeff Law via Gcc-cvs

https://gcc.gnu.org/g:f11cbf2edfbd9615cf0d8519bd7a570a2ae00397

commit f11cbf2edfbd9615cf0d8519bd7a570a2ae00397
Author: Xiao Zeng 
Date:   Wed May 15 13:56:42 2024 +0800

RISC-V: Add Zfbfmin extension

1 In the previous patch, the libcall for BF16 was implemented:



2 RISC-V provides the Zfbfmin extension, which completes the "Scalar BF16 
Converts":



3 Replaced the libcall with Zfbfmin extension instructions.

4 Reused previous testcases in:

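For readers unfamiliar with BF16, the following sketch shows in software roughly what a float to BF16 truncation and the reverse extension compute. It is not the libcall or the instruction semantics from this patch: the function names are invented here, round-to-nearest-even is assumed, and every NaN is mapped to the canonical quiet NaN 0x7fc0 as a simplification.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical software model of float -> BF16, assuming
   round-to-nearest-even.  BF16 is the upper 16 bits of an IEEE
   binary32 value, so the conversion rounds away the low 16 bits.  */
static uint16_t
float_to_bf16 (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);

  /* Simplification: return the canonical quiet NaN for any NaN.  */
  if ((bits & 0x7fffffffu) > 0x7f800000u)
    return 0x7fc0;

  /* Add 0x7fff plus the lowest kept bit so that ties go to even.  */
  uint32_t lsb = (bits >> 16) & 1u;
  bits += 0x7fffu + lsb;
  return (uint16_t) (bits >> 16);
}

/* BF16 -> float is exact: place the 16 bits in the upper half.  */
static float
bf16_to_float (uint16_t h)
{
  uint32_t bits = (uint32_t) h << 16;
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}
```

Under these assumptions 1.0f maps to 0x3f80, and a value exactly halfway between two BF16 neighbours rounds to the one whose low bit is zero.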

gcc/ChangeLog:

* config/riscv/iterators.md: Add mode_iterator between
floating-point modes and BFmode.
* config/riscv/riscv.cc (riscv_output_move): Handle BFmode move
for zfbfmin.
* config/riscv/riscv.md (trunc<mode>bf2): New pattern for BFmode.
(extendbfsf2): Ditto.
(*movhf_hardfloat): Add BFmode.
(*mov<mode>_hardfloat): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zfbfmin-bf16_arithmetic.c: New test.
* gcc.target/riscv/zfbfmin-bf16_comparison.c: New test.
* gcc.target/riscv/zfbfmin-bf16_float_libcall_convert.c: New test.
* gcc.target/riscv/zfbfmin-bf16_integer_libcall_convert.c: New test.

(cherry picked from commit 4638e508aa814d4aa2e204c3ab041c6a56aad2bd)

Diff:
---
 gcc/config/riscv/iterators.md  |  6 +-
 gcc/config/riscv/riscv.cc  |  4 +-
 gcc/config/riscv/riscv.md  | 49 +---
 .../gcc.target/riscv/zfbfmin-bf16_arithmetic.c | 35 
 .../gcc.target/riscv/zfbfmin-bf16_comparison.c | 33 +++
 .../riscv/zfbfmin-bf16_float_libcall_convert.c | 45 +++
 .../riscv/zfbfmin-bf16_integer_libcall_convert.c   | 66 ++
 7 files changed, 228 insertions(+), 10 deletions(-)

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 3c139bc2e30..1e37e843023 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -78,9 +78,13 @@
 ;; Iterator for floating-point modes that can be loaded into X registers.
 (define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")])
 
-;; Iterator for floating-point modes of BF16
+;; Iterator for floating-point modes of BF16.
 (define_mode_iterator HFBF [HF BF])
 
+;; Conversion between floating-point modes and BF16.
+;; SF to BF16 have hardware instructions.
+(define_mode_iterator FBF [HF DF TF])
+
 ;; ---
 ;; Mode attributes
 ;; ---
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 10af38a5a81..c5c4c777349 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4310,7 +4310,7 @@ riscv_output_move (rtx dest, rtx src)
switch (width)
  {
  case 2:
-   if (TARGET_ZFHMIN)
+   if (TARGET_ZFHMIN || TARGET_ZFBFMIN)
  return "fmv.x.h\t%0,%1";
/* Using fmv.x.s + sign-extend to emulate fmv.x.h.  */
return "fmv.x.s\t%0,%1;slli\t%0,%0,16;srai\t%0,%0,16";
@@ -4366,7 +4366,7 @@ riscv_output_move (rtx dest, rtx src)
switch (width)
  {
  case 2:
-   if (TARGET_ZFHMIN)
+   if (TARGET_ZFHMIN || TARGET_ZFBFMIN)
  return "fmv.h.x\t%0,%z1";
/* High 16 bits should be all-1, otherwise HW will treated
   as a n-bit canonical NaN, but isn't matter for softfloat.  */
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 25d341ec987..e57bfcf616a 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1763,6 +1763,31 @@
   [(set_attr "type" "fcvt")
(set_attr "mode" "HF")])
 
+(define_insn "truncsfbf2"
+  [(set (match_operand:BF    0 "register_operand" "=f")
+   (float_truncate:BF
+  (match_operand:SF 1 "register_operand" " f")))]
+  "TARGET_ZFBFMIN"
+  "fcvt.bf16.s\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "BF")])
+
+;; The conversion of HF/DF/TF to BF needs to be done with SF if there is a
+;; chance to generate at least one instruction, otherwise just using
+;; libfunc __trunc[h|d|t]fbf2.
+(define_expand "trunc<mode>bf2"
+  [(set (match_operand:BF  0 "register_operand" "=f")
+   (float_truncate:BF
+  (match_operand:FBF   1 "register_operand" " f")))]
+  "TARGET_ZFBFMIN"
+  {
+convert_move (operands[0],
+ convert_modes (SFmode, <MODE>mode, operands[1], 0), 0);
+DONE;
+  }
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "BF")])
+
 ;;
 ;;

[gcc r15-1106] libstdc++: Add missing header to <bits/ranges_algobase.h> for std::__memcmp

2024-06-07 Thread Jonathan Wakely via Gcc-cvs

https://gcc.gnu.org/g:674d213ab91871652e96dc2de06e6f50682eebe0

commit r15-1106-g674d213ab91871652e96dc2de06e6f50682eebe0
Author: Jonathan Wakely 
Date:   Fri Jun 7 09:49:06 2024 +0100

libstdc++: Add missing header to <bits/ranges_algobase.h> for std::__memcmp

As noticed by Michael Levine.

libstdc++-v3/ChangeLog:

* include/bits/ranges_algobase.h: Include <bits/stl_algobase.h>.

Diff:
---
 libstdc++-v3/include/bits/ranges_algobase.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/include/bits/ranges_algobase.h 
b/libstdc++-v3/include/bits/ranges_algobase.h
index e26a73a27d6..e1f00838818 100644
--- a/libstdc++-v3/include/bits/ranges_algobase.h
+++ b/libstdc++-v3/include/bits/ranges_algobase.h
@@ -38,6 +38,7 @@
 #include <bits/ranges_base.h> // ranges::begin, ranges::range etc.
 #include <bits/invoke.h>      // __invoke
 #include <bits/cpp_type_traits.h> // __is_byte
+#include <bits/stl_algobase.h> // __memcmp
 
 #if __cpp_lib_concepts
 namespace std _GLIBCXX_VISIBILITY(default)

[gcc r15-1105] c++: Handle erroneous DECL_LOCAL_DECL_ALIAS in duplicate_decls [PR107575]

2024-06-07 Thread Simon Martin via Gcc-cvs

https://gcc.gnu.org/g:0ce138694a6b40708a80691fa4003f6af1defa49

commit r15-1105-g0ce138694a6b40708a80691fa4003f6af1defa49
Author: Simon Martin 
Date:   Tue Jun 4 21:20:23 2024 +0200

c++: Handle erroneous DECL_LOCAL_DECL_ALIAS in duplicate_decls [PR107575]

We currently ICE upon the following because we don't properly handle local
functions with an error_mark_node as DECL_LOCAL_DECL_ALIAS in 
duplicate_decls.

=== cut here ===
void f (void) {
  virtual int f (void) const;
  virtual int f (void);
}
=== cut here ===

This patch fixes this by checking for error_mark_node.

Successfully tested on x86_64-pc-linux-gnu.

PR c++/107575

gcc/cp/ChangeLog:

* decl.cc (duplicate_decls): Check for error_mark_node
DECL_LOCAL_DECL_ALIAS.

gcc/testsuite/ChangeLog:

* g++.dg/parse/crash74.C: New test.

Diff:
---
 gcc/cp/decl.cc   | 11 +++
 gcc/testsuite/g++.dg/parse/crash74.C | 11 +++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index d481e1ec074..03deb1493a4 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -2792,10 +2792,13 @@ duplicate_decls (tree newdecl, tree olddecl, bool 
hiding, bool was_hidden)
  retrofit_lang_decl (newdecl);
  tree alias = DECL_LOCAL_DECL_ALIAS (newdecl)
= DECL_LOCAL_DECL_ALIAS (olddecl);
- DECL_ATTRIBUTES (alias)
-   = (*targetm.merge_decl_attributes) (alias, newdecl);
- if (TREE_CODE (newdecl) == FUNCTION_DECL)
-   merge_attribute_bits (newdecl, alias);
+ if (alias != error_mark_node)
+   {
+ DECL_ATTRIBUTES (alias)
+   = (*targetm.merge_decl_attributes) (alias, newdecl);
+ if (TREE_CODE (newdecl) == FUNCTION_DECL)
+   merge_attribute_bits (newdecl, alias);
+   }
}
 }
 
diff --git a/gcc/testsuite/g++.dg/parse/crash74.C 
b/gcc/testsuite/g++.dg/parse/crash74.C
new file mode 100644
index 000..a7ba5094be6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/crash74.C
@@ -0,0 +1,11 @@
+// PR c++/107575
+
+void f (void) {
+  virtual int f (void) const; // { dg-line line_4 }
+  virtual int f (void); // { dg-line line_5 }
+}
+
+// { dg-error "outside class declaration" {} { target *-*-* } line_4 }
+// { dg-error "cannot have cv-qualifier" {} { target *-*-* } line_4 }
+// { dg-error "ambiguating new declaration of" {} { target *-*-* } line_4 }
+// { dg-error "outside class declaration" {} { target *-*-* } line_5 }

[gcc r15-1104] c++: -include and header unit translation

2024-06-07 Thread Jason Merrill via Gcc-cvs

https://gcc.gnu.org/g:a29f481bbcaf2b196f358122a5f1e45c6869df82

commit r15-1104-ga29f481bbcaf2b196f358122a5f1e45c6869df82
Author: Jason Merrill 
Date:   Tue Jun 4 22:27:56 2024 -0400

c++: -include and header unit translation

 Within a source file, #include is translated to import if a suitable header
 unit is available, but this wasn't working with -include.  This turned out
 to be because we suppressed the translation before the beginning of the
 main file.  After removing that, I had to tweak libcpp file handling to
 accommodate the way it moves from an -include to the main file.

gcc/ChangeLog:

* doc/invoke.texi (C++ Modules): Mention -include.

gcc/cp/ChangeLog:

* module.cc (maybe_translate_include): Allow before the main file.

libcpp/ChangeLog:

* files.cc (_cpp_stack_file): LC_ENTER for -include header unit.

gcc/testsuite/ChangeLog:

* g++.dg/modules/dashinclude-1_b.C: New test.
* g++.dg/modules/dashinclude-1_a.H: New test.

Diff:
---
 gcc/doc/invoke.texi| 17 +
 gcc/cp/module.cc   |  4 
 gcc/testsuite/g++.dg/modules/dashinclude-1_b.C |  9 +
 libcpp/files.cc|  5 -
 gcc/testsuite/g++.dg/modules/dashinclude-1_a.H |  5 +
 5 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e5a5d1d9335..ca2591ce2c3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -37764,6 +37764,23 @@ installed.  Specifying the language as one of these 
variants also
 inhibits output of the object file, as header files have no associated
 object file.
 
+Header units can be used in much the same way as precompiled headers
+(@pxref{Precompiled Headers}), but with fewer restrictions: an
+#include that is translated to a header unit import can appear at any
+point in the source file, and multiple header units can be used
+together.  In particular, the @option{-include} strategy works: with
+the bits/stdc++.h header used for libstdc++ precompiled headers you
+can
+
+@smallexample
+g++ -fmodules-ts -x c++-system-header -c bits/stdc++.h
+g++ -fmodules-ts -include bits/stdc++.h mycode.C
+@end smallexample
+
+and any standard library #includes in mycode.C will be skipped,
+because the import brought in the whole library.  This can be a simple
+way to use modules to speed up compilation without any code changes.
+
 The @option{-fmodule-only} option disables generation of the
 associated object file for compiling a module interface.  Only the CMI
 is generated.  This option is implied when using the
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index ed24814b601..21fc85150c9 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -19976,10 +19976,6 @@ maybe_translate_include (cpp_reader *reader, line_maps 
*lmaps, location_t loc,
   return nullptr;
 }
 
-  if (!spans.init_p ())
-/* Before the main file, don't divert.  */
-return nullptr;
-
   dump.push (NULL);
 
   dump () && dump ("Checking include translation '%s'", path);
diff --git a/gcc/testsuite/g++.dg/modules/dashinclude-1_b.C 
b/gcc/testsuite/g++.dg/modules/dashinclude-1_b.C
new file mode 100644
index 000..6e6a33407a4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/dashinclude-1_b.C
@@ -0,0 +1,9 @@
+// Test that include translation works with command-line -include.
+// { dg-additional-options "-fmodules-ts -fdump-lang-module -include 
$srcdir/g++.dg/modules/dashinclude-1_a.H" }
+
+int main ()
+{
+  return f();
+}
+
+// { dg-final { scan-lang-dump {Translating include to import} module } }
diff --git a/libcpp/files.cc b/libcpp/files.cc
index c61df339e20..78f56e30bde 100644
--- a/libcpp/files.cc
+++ b/libcpp/files.cc
@@ -1008,7 +1008,10 @@ _cpp_stack_file (cpp_reader *pfile, _cpp_file *file, 
include_type type,
   if (decrement)
 pfile->line_table->highest_location--;
 
-  if (file->header_unit <= 0)
+  /* Normally a header unit becomes an __import directive in the current file,
+ but with -include we need something to LC_LEAVE to trigger the file_change
+ hook and continue to the next -include or the main source file.  */
+  if (file->header_unit <= 0 || type == IT_CMDLINE)
 /* Add line map and do callbacks.  */
 _cpp_do_file_change (pfile, LC_ENTER, file->path,
   /* With preamble injection, start on line zero,
diff --git a/gcc/testsuite/g++.dg/modules/dashinclude-1_a.H 
b/gcc/testsuite/g++.dg/modules/dashinclude-1_a.H
new file mode 100644
index 000..c1b40a53924
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/dashinclude-1_a.H
@@ -0,0 +1,5 @@
+// { dg-module-do run }
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+inline int f() { return 0; }

[gcc r15-1103] c++: lambda in pack expansion [PR115378]

2024-06-07 Thread Patrick Palka via Gcc-cvs

https://gcc.gnu.org/g:5c761395402a730535983a5e49ef1775561ebc61

commit r15-1103-g5c761395402a730535983a5e49ef1775561ebc61
Author: Patrick Palka 
Date:   Fri Jun 7 12:12:30 2024 -0400

c++: lambda in pack expansion [PR115378]

Here find_parameter_packs_r is incorrectly treating the 'auto' return
type of a lambda as a parameter pack due to Concepts-TS specific logic
added in r6-4517, leading to confusion later when expanding the pattern.

Since we intend on removing Concepts TS support soon anyway, this patch
fixes this by restricting the problematic logic with flag_concepts_ts.
Doing so revealed that add_capture was relying on this logic to set
TEMPLATE_TYPE_PARAMETER_PACK for the 'auto' type of a pack expansion
init-capture, which we now need to do explicitly.

PR c++/115378

gcc/cp/ChangeLog:

* lambda.cc (lambda_capture_field_type): Set
TEMPLATE_TYPE_PARAMETER_PACK on the auto type of an init-capture
pack expansion.
* pt.cc (find_parameter_packs_r) :
Restrict TEMPLATE_TYPE_PARAMETER_PACK promotion with
flag_concepts_ts.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto-103497.C: Adjust expected diagnostic.
* g++.dg/template/pr95672.C: Likewise.
* g++.dg/cpp2a/lambda-targ5.C: New test.

Reviewed-by: Jason Merrill 

Diff:
---
 gcc/cp/lambda.cc  |  3 ++-
 gcc/cp/pt.cc  |  2 +-
 gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C |  2 +-
 gcc/testsuite/g++.dg/cpp2a/lambda-targ5.C | 15 +++
 gcc/testsuite/g++.dg/template/pr95672.C   |  2 +-
 5 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/lambda.cc b/gcc/cp/lambda.cc
index 630cc4eade1..0770417810e 100644
--- a/gcc/cp/lambda.cc
+++ b/gcc/cp/lambda.cc
@@ -223,7 +223,8 @@ lambda_capture_field_type (tree expr, bool explicit_init_p,
   outermost CV qualifiers of EXPR.  */
type = build_reference_type (type);
   if (uses_parameter_packs (expr))
-   /* Stick with 'auto' even if the type could be deduced.  */;
+   /* Stick with 'auto' even if the type could be deduced.  */
+   TEMPLATE_TYPE_PARAMETER_PACK (auto_node) = true;
   else
type = do_auto_deduction (type, expr, auto_node);
 }
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index abbba7c6746..607753ae6b7 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -3940,7 +3940,7 @@ find_parameter_packs_r (tree *tp, int *walk_subtrees, void* data)
 parameter pack (14.6.3), or the type-specifier-seq of a type-id that
 is a pack expansion, the invented template parameter is a template
 parameter pack.  */
-  if (ppd->type_pack_expansion_p && is_auto (t)
+  if (flag_concepts_ts && ppd->type_pack_expansion_p && is_auto (t)
  && TEMPLATE_TYPE_LEVEL (t) != 0)
TEMPLATE_TYPE_PARAMETER_PACK (t) = true;
   if (TEMPLATE_TYPE_PARAMETER_PACK (t))
diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C 
b/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C
index cedd661710c..4162361d14f 100644
--- a/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C
+++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C
@@ -1,7 +1,7 @@
 // PR c++/103497
 // { dg-do compile { target c++14 } }
 
-void foo(decltype(auto)... args);  // { dg-error "cannot declare a parameter with .decltype.auto.." }
+void foo(decltype(auto)... args);  // { dg-error "contains no parameter packs" }
 
 int main() {
   foo();
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-targ5.C 
b/gcc/testsuite/g++.dg/cpp2a/lambda-targ5.C
new file mode 100644
index 000..efd4bb45d58
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-targ5.C
@@ -0,0 +1,15 @@
+// PR c++/115378
+// { dg-do compile { target c++20 } }
+
+struct tt {};
+
+template
+constexpr auto __counter = 1;
+
+template 
+using _as_base = tt;
+
+template 
+struct env : _as_base>... {};
+
+env t;
diff --git a/gcc/testsuite/g++.dg/template/pr95672.C 
b/gcc/testsuite/g++.dg/template/pr95672.C
index c752b4a2c08..d97b8db2e97 100644
--- a/gcc/testsuite/g++.dg/template/pr95672.C
+++ b/gcc/testsuite/g++.dg/template/pr95672.C
@@ -1,3 +1,3 @@
 // PR c++/95672
 // { dg-do compile { target c++14 } }
-struct g_class : decltype  (auto) ... {  }; // { dg-error "invalid use of pack expansion" }
+struct g_class : decltype  (auto) ... {  }; // { dg-error "contains no parameter packs" }

[gcc r15-1102] lto: Fix build on MacOS

2024-06-07 Thread Simon Martin via Gcc-cvs

https://gcc.gnu.org/g:a3d68b5155018817dd7eef5abbaeadf3959b8e5e

commit r15-1102-ga3d68b5155018817dd7eef5abbaeadf3959b8e5e
Author: Simon Martin 
Date:   Fri Jun 7 16:14:58 2024 +0200

lto: Fix build on MacOS

The build fails on x86_64-apple-darwin19.6.0 starting with 5b6d5a886ee because
vector is included after system.h and runs into poisoned identifiers.

This patch fixes this by defining INCLUDE_VECTOR before including system.h.

Validated by doing a full build on x86_64-apple-darwin19.6.0.

gcc/lto/ChangeLog:

* lto-partition.cc: Define INCLUDE_VECTOR to avoid running into
poisoned identifiers.

Diff:
---
 gcc/lto/lto-partition.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/lto/lto-partition.cc b/gcc/lto/lto-partition.cc
index 44b457d0b2a..2238650fa0e 100644
--- a/gcc/lto/lto-partition.cc
+++ b/gcc/lto/lto-partition.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 .  */
 
 #include "config.h"
+#define INCLUDE_VECTOR
 #include "system.h"
 #include "coretypes.h"
 #include "target.h"
@@ -38,7 +39,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "lto-partition.h"
 
 #include 
-#include 
 
 vec ltrans_partitions;

[gcc r14-10289] arm: Fix CASE_VECTOR_SHORTEN_MODE for thumb2.

2024-06-07 Thread Richard Ball via Gcc-cvs

https://gcc.gnu.org/g:ca1924947b5bed8105ae020bef6950bddda448f3

commit r14-10289-gca1924947b5bed8105ae020bef6950bddda448f3
Author: Richard Ball 
Date:   Thu Jun 6 16:10:14 2024 +0100

arm: Fix CASE_VECTOR_SHORTEN_MODE for thumb2.

The CASE_VECTOR_SHORTEN_MODE query is missing some equals signs
which causes suboptimal codegen due to missed optimisation
opportunities. This patch also adds a test for thumb2
switch statements as none exist currently.

gcc/ChangeLog:
PR target/115353
* config/arm/arm.h (enum arm_auto_incmodes):
Correct CASE_VECTOR_SHORTEN_MODE query.

gcc/testsuite/ChangeLog:

* gcc.target/arm/thumb2-switchstatement.c: New test.

(cherry picked from commit 2963c76e8e24d4ebaf2b1b4ac4d7ca44eb0a9025)

Diff:
---
 gcc/config/arm/arm.h   |   4 +-
 .../gcc.target/arm/thumb2-switchstatement.c| 144 +
 2 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 449e6935b32..0cd5d733952 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2111,8 +2111,8 @@ enum arm_auto_incmodes
   ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 0, HImode)   \
   : SImode)
\
: (TARGET_THUMB2\
-  ? ((min > 0 && max < 0x200) ? QImode \
-  : (min > 0 && max <= 0x2) ? HImode   \
+  ? ((min >= 0 && max < 0x200) ? QImode\
+  : (min >= 0 && max < 0x2) ? HImode   \
   : SImode)
\
: ((min >= 0 && max < 1024) \
   ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 1, QImode)   \
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-switchstatement.c 
b/gcc/testsuite/gcc.target/arm/thumb2-switchstatement.c
new file mode 100644
index 000..8badf318e62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb2-switchstatement.c
@@ -0,0 +1,144 @@
+/* { dg-do compile } */
+/* { dg-options "-mthumb --param case-values-threshold=1 -fno-reorder-blocks -fno-tree-dce -O2" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#define NOP "nop;"
+#define NOP2 NOP NOP
+#define NOP4 NOP2 NOP2
+#define NOP8 NOP4 NOP4
+#define NOP16 NOP8 NOP8
+#define NOP32 NOP16 NOP16
+#define NOP64 NOP32 NOP32
+#define NOP128 NOP64 NOP64
+#define NOP256 NOP128 NOP128
+#define NOP512 NOP256 NOP256
+#define NOP1024 NOP512 NOP512
+#define NOP2048 NOP1024 NOP1024
+#define NOP4096 NOP2048 NOP2048
+#define NOP8192 NOP4096 NOP4096
+#define NOP16384 NOP8192 NOP8192
+#define NOP32768 NOP16384 NOP16384
+#define NOP65536 NOP32768 NOP32768
+#define NOP131072 NOP65536 NOP65536
+
+enum z
+{
+  a = 1,
+  b,
+  c,
+  d,
+  e,
+  f = 7,
+};
+
+inline void QIFunction (const char* flag)
+{
+  asm volatile (NOP32);
+  return;
+}
+
+inline void HIFunction (const char* flag)
+{
+  asm volatile (NOP512);
+  return;
+}
+
+inline void SIFunction (const char* flag)
+{
+  asm volatile (NOP131072);
+  return;
+}
+
+/*
+**QImode_test:
+** ...
+** tbb \[pc, r[0-9]+\]
+** ...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* QImode_test(enum z x)
+{
+  switch (x)
+{
+  case d:
+QIFunction("QItest");
+return "InlineASM";
+  case f:
+return "TEST";
+  default:
+return "Default";
+}
+}
+
+/* { dg-final { scan-assembler ".byte" } } */
+
+/*
+**HImode_test:
+** ...
+** tbh \[pc, r[0-9]+, lsl #1\]
+** ...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* HImode_test(enum z x)
+{
+  switch (x)
+  {
+case d:
+  HIFunction("HItest");
+  return "InlineASM";
+case f:
+  return "TEST";
+default:
+  return "Default";
+  }
+}
+
+/* { dg-final { scan-assembler ".2byte" } } */
+
+/*
+**SImode_test:
+** ...
+** adr (r[0-9]+), .L[0-9]+
+** ldr pc, \[\1, r[0-9]+, lsl #2\]
+** ...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* SImode_test(enum z x)
+{
+  switch (x)
+  {
+case d:
+  SIFunction("SItest");
+  return "InlineASM";
+case f:
+  return "TEST";
+default:
+  return "Default";
+  }
+}
+
+/* { dg-final { scan-assembler ".word" } } */
+
+/*
+**backwards_branch_test:
+** ...
+** adr (r[0-9]+), .L[0-9]+
+** ldr pc, \[\1, r[0-9]+, lsl #2\]
+** ...
+*/
+__attribute__ ((noinline)) __attribute__ ((noclone)) const char* backwards_branch_test(enum z x, int flag)
+{
+  if (flag == 5)
+  {
+backwards:
+  asm volatile (NOP512);
+  return "ASM";
+  }
+  switch (x)
+  {
+case d:
+  goto backwards;
+

[gcc r15-1101] i386: PR target/115351: RTX costs for concatditi3 and insvti_highpart.

2024-06-07 Thread Roger Sayle via Gcc-cvs

https://gcc.gnu.org/g:fb3e4c549d16d5050e10114439ad77149f33c597

commit r15-1101-gfb3e4c549d16d5050e10114439ad77149f33c597
Author: Roger Sayle 
Date:   Fri Jun 7 14:03:20 2024 +0100

i386: PR target/115351: RTX costs for *concatditi3 and *insvti_highpart.

This patch addresses PR target/115351, which is a code quality regression
on x86 when passing floating point complex numbers.  The ABI considers
these arguments to have TImode, requiring interunit moves to place the
FP values (which are actually passed in SSE registers) into the upper
and lower parts of a TImode pseudo, and then similar moves back again
before they can be used.

The cause of the regression is that changes in how TImode initialization
is represented in RTL now prevent the RTL optimizers from eliminating
these redundant moves.  The specific cause is that the *concatditi3
pattern, (zext(hi)<<64)|zext(lo), has an inappropriately high (default)
rtx_cost, preventing fwprop1 from propagating it.  This pattern just
sets the hipart and lopart of a double-word register, typically two
instructions (fewer if reload can allocate things appropriately), but
the current ix86_rtx_costs actually returns COSTS_N_INSNS (13), i.e. 52.

propagating insn 5 into insn 6, replacing:
(set (reg:TI 110)
(ior:TI (and:TI (reg:TI 110)
(const_wide_int 0x0))
(ashift:TI (zero_extend:TI (subreg:DI (reg:DF 112 [ zD.2796+8 ]) 0))
(const_int 64 [0x40]
successfully matched this instruction to *concatditi3_3:
(set (reg:TI 110)
(ior:TI (ashift:TI (zero_extend:TI (subreg:DI (reg:DF 112 [ zD.2796+8 ]) 0))
(const_int 64 [0x40]))
(zero_extend:TI (subreg:DI (reg:DF 111 [ zD.2796 ]) 0
change not profitable (cost 50 -> cost 52)

This issue is resolved by having ix86_rtx_costs return more reasonable
values for these (place-holder) patterns.
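
To make the numbers concrete, here is a small sketch of the cost arithmetic.
It assumes GCC's usual definition of COSTS_N_INSNS (N) as N * 4 and mirrors it
in a local helper; the function names are illustrative and are not part of GCC.
The two cost functions follow the expressions in the i386.cc hunk of this
patch for the case where each operand is a DImode subreg of a DFmode register.

```cpp
#include <cassert>

// Local mirror of GCC's COSTS_N_INSNS macro (rtl.h): N instructions cost N * 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// New *concatditi3 estimate when both halves come from DFmode registers:
// one movq per half.
constexpr int concatditi3_cost_df() {
  return costs_n_insns(1) + costs_n_insns(1);
}

// New *insvti_highpart estimate for the same kind of operand:
// COSTS_N_INSNS (1) + 1 for the insertion, plus one movq.
constexpr int insvti_highpart_cost_df() {
  return costs_n_insns(1) + 1 + costs_n_insns(1);
}

static_assert(costs_n_insns(13) == 52, "the default cost quoted above");
static_assert(concatditi3_cost_df() == 8, "about two instructions");
static_assert(insvti_highpart_cost_df() == 9, "slightly more than two");
```

Under these estimates the concatenation form is no longer priced far above
the form it replaces, which is what lets the propagation go through.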

2024-06-07  Roger Sayle  

gcc/ChangeLog
PR target/115351
* config/i386/i386.cc (ix86_rtx_costs): Provide estimates for
the *concatditi3 and *insvti_highpart patterns, about two insns.

gcc/testsuite/ChangeLog
PR target/115351
* g++.target/i386/pr115351.C: New test case.

Diff:
---
 gcc/config/i386/i386.cc  | 43 
 gcc/testsuite/g++.target/i386/pr115351.C | 19 ++
 2 files changed, 62 insertions(+)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 4126ab24a79..173db213d14 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -21912,6 +21912,49 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
}
  *total = ix86_vec_cost (mode, cost->sse_op);
}
+  else if (TARGET_64BIT
+  && mode == TImode
+  && GET_CODE (XEXP (x, 0)) == ASHIFT
+  && GET_CODE (XEXP (XEXP (x, 0), 0)) == ZERO_EXTEND
+  && GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == DImode
+  && CONST_INT_P (XEXP (XEXP (x, 0), 1))
+  && INTVAL (XEXP (XEXP (x, 0), 1)) == 64
+  && GET_CODE (XEXP (x, 1)) == ZERO_EXTEND
+  && GET_MODE (XEXP (XEXP (x, 1), 0)) == DImode)
+   {
+ /* *concatditi3 is cheap.  */
+ rtx op0 = XEXP (XEXP (XEXP (x, 0), 0), 0);
+ rtx op1 = XEXP (XEXP (x, 1), 0);
+ *total = (SUBREG_P (op0) && GET_MODE (SUBREG_REG (op0)) == DFmode)
+  ? COSTS_N_INSNS (1)/* movq.  */
+  : set_src_cost (op0, DImode, speed);
+ *total += (SUBREG_P (op1) && GET_MODE (SUBREG_REG (op1)) == DFmode)
+   ? COSTS_N_INSNS (1)/* movq.  */
+   : set_src_cost (op1, DImode, speed);
+ return true;
+   }
+  else if (TARGET_64BIT
+  && mode == TImode
+  && GET_CODE (XEXP (x, 0)) == AND
+  && REG_P (XEXP (XEXP (x, 0), 0))
+  && CONST_WIDE_INT_P (XEXP (XEXP (x, 0), 1))
+  && CONST_WIDE_INT_NUNITS (XEXP (XEXP (x, 0), 1)) == 2
+  && CONST_WIDE_INT_ELT (XEXP (XEXP (x, 0), 1), 0) == -1
+  && CONST_WIDE_INT_ELT (XEXP (XEXP (x, 0), 1), 1) == 0
+  && GET_CODE (XEXP (x, 1)) == ASHIFT
+  && GET_CODE (XEXP (XEXP (x, 1), 0)) == ZERO_EXTEND
+  && GET_MODE (XEXP (XEXP (XEXP (x, 1), 0), 0)) == DImode
+  && CONST_INT_P (XEXP (XEXP (x, 1), 1))
+  && INTVAL (XEXP (XEXP (x, 1), 1)) == 64)
+   {
+ /* *insvti_highpart is cheap.  */
+ rtx op = XEXP (XEXP (XEXP (x, 1), 0), 0);
+ *total = COSTS_N_INSNS (1) + 1;
+ *total += (SUBREG_P (op) && GET_MODE (SUBREG_REG (op)) == DFmode)
+   ? COSTS_N_INSNS (1)/* movq.  */
+   : set_src_cost (op, DImode, speed);
+ return

[gcc r15-1100] i386: Improve handling of ternlog instructions in i386/sse.md

2024-06-07 Thread Roger Sayle via Gcc-cvs

https://gcc.gnu.org/g:ec985bc97a01577bca8307f986caba7ba7633cde

commit r15-1100-gec985bc97a01577bca8307f986caba7ba7633cde
Author: Roger Sayle 
Date:   Fri Jun 7 13:57:23 2024 +0100

i386: Improve handling of ternlog instructions in i386/sse.md

This patch improves the way that the x86 backend recognizes and
expands AVX512's bitwise ternary logic (vpternlog) instructions.

As a motivating example consider the following code which calculates
the carry out from a (binary) full adder:

typedef unsigned long long v4di __attribute((vector_size(32)));

v4di foo(v4di a, v4di b, v4di c)
{
return (a & b) | ((a ^ b) & c);
}

with -O2 -march=cascadelake current mainline produces:

foo:vpternlogq  $96, %ymm0, %ymm1, %ymm2
vmovdqa %ymm0, %ymm3
vmovdqa %ymm2, %ymm0
vpternlogq  $248, %ymm3, %ymm1, %ymm0
ret

with the patch below, we now generate a single instruction:

foo:vpternlogq  $232, %ymm2, %ymm1, %ymm0
ret

The AVX512 vpternlog[qd] instructions are a very cool addition to the
x86 instruction set, that can calculate any Boolean function of three
inputs in a single fast instruction.  As the truth table for any
three-input function has 8 rows, any specific function can be represented
by specifying those bits, i.e. by an 8-bit byte, an immediate integer
between 0 and 255.

Examples of ternary functions and their indices are given below:

0x01   1:  ~((b|a)|c)
0x02   2:  (~(b|a))&c
0x03   3:  ~(b|a)
0x04   4:  (~(c|a))&b
0x05   5:  ~(c|a)
0x06   6:  (c^b)&~a
0x07   7:  ~((c&b)|a)
0x08   8:  (~a&c)&b (~a&b)&c (c&b)&~a
0x09   9:  ~((c^b)|a)
0x0a  10:  ~a&c
0x0b  11:  ~((~c&b)|a) (~b|c)&~a
0x0c  12:  ~a&b
0x0d  13:  ~((~b&c)|a) (~c|b)&~a
0x0e  14:  (c|b)&~a
0x0f  15:  ~a
0x10  16:  (~(c|b))&a
0x11  17:  ~(c|b)
...
0xf4 244:  (~c&b)|a
0xf5 245:  ~c|a
0xf6 246:  (c^b)|a
0xf7 247:  (~(c&b))|a
0xf8 248:  (c&b)|a
0xf9 249:  (~(c^b))|a
0xfa 250:  c|a
0xfb 251:  (c|a)|~b (~b|a)|c (~b|c)|a
0xfc 252:  b|a
0xfd 253:  (b|a)|~c (~c|a)|b (~c|b)|a
0xfe 254:  (b|a)|c (c|a)|b (c|b)|a
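
The indices can be reproduced by evaluating a function on three constant
"truth-table columns".  The sketch below assumes the assignment a = 0xF0,
b = 0xCC, c = 0xAA, chosen because it agrees with the entries listed above
and with the $232 immediate in the motivating example; ternlog_index and the
sample functions are illustrative names, not GCC functions.

```cpp
#include <cassert>
#include <cstdint>

// Bit i of each constant is the value of that input in row i of the
// 8-row truth table (assumed convention, see the text above).
constexpr std::uint8_t kA = 0xF0, kB = 0xCC, kC = 0xAA;

// Evaluate a three-input Boolean function on all eight rows at once;
// the low byte of the result is the ternlog immediate.
template <typename F>
constexpr std::uint8_t ternlog_index(F f) {
  return static_cast<std::uint8_t>(f(kA, kB, kC));
}

// Carry out of a full adder, as in the motivating example.
constexpr unsigned carry(unsigned a, unsigned b, unsigned c) {
  return (a & b) | ((a ^ b) & c);
}

// Two entries taken from the table.
constexpr unsigned xor_and_not(unsigned a, unsigned b, unsigned c) {
  return (c ^ b) & ~a;
}
constexpr unsigned or3(unsigned a, unsigned b, unsigned c) {
  return (b | a) | c;
}

static_assert(ternlog_index(carry) == 232, "0xe8, the single-insn immediate");
static_assert(ternlog_index(xor_and_not) == 0x06, "table row 0x06");
static_assert(ternlog_index(or3) == 0xfe, "table row 0xfe");
```

Because the index depends only on the function's truth table, syntactically
different but equivalent expressions map to the same immediate.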

A naive implementation (in many compilers) might be to add define_insn
patterns for all 256 different functions.  The situation is even
worse as many of these Boolean functions don't have a "canonical form"
(as produced by simplify_rtx) and would each need multiple patterns.
See the space-separated equivalent expressions in the table above.

This need to provide instruction "templates" might explain why GCC,
LLVM and ICC all exhibit similar coverage problems in their ability
to recognize x86 ternlog ternary functions.

Perhaps a unique feature of GCC's design is that in addition to regular
define_insn templates, machine descriptions can also perform pattern
matching via a match_operator (and its corresponding predicate).
This patch introduces a ternlog_operand predicate that matches a
(possibly infinite) set of expression trees, identifying those that
have at most three unique operands.  This then allows a
define_insn_and_split to recognize suitable expressions and then
transform them into the appropriate UNSPEC_VTERNLOG as a pre-reload
splitter.  This design allows combine to smash together arbitrarily
complex Boolean expressions, then transform them into an UNSPEC
before register allocation.  As an "optimization", ix86_expand_ternlog
generates a simpler binary operation (AND, XOR, IOR or ANDN) where
possible, and in a few cases attempts
to "canonicalize" the ternlog, by reordering or duplicating operands,
so that later CSE passes have a hope of spotting equivalent values.

This patch leaves the existing ternlog patterns in sse.md (for now),
many of which are made obsolete by these changes.  In theory we now
only need one define_insn for UNSPEC_VTERNLOG.  One complication from
these previous variants was that they inconsistently used decimal vs.
hexadecimal to specify the immediate constant operand in assembly
language, making the list of tweaks to the testsuite with this patch
larger than it might have been.  I propose to remove the vestigial
patterns in a follow-up patch, once this approach has baked (proven
to be stable) on mainline.

2024-06-07  Roger Sayle  
Hongtao Liu  

gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_expand_args_builtin): Call
fixup_modeless_constant before testing predicates.  Only call
copy_to_mode_reg on memory operands (after the first one).
(ix86_gen_bcst_mem): Helper function to convert a CONST_VECTOR
into a VEC_DUPLICATE if possible.
(ix86_ternlog_idx):  Convert an RTX

[gcc r15-1099] lto: Implement cache partitioning

2024-06-07 Thread Michal Jires via Gcc-cvs

https://gcc.gnu.org/g:5b6d5a886ee45bb969b4de23528311472b4ab66b

commit r15-1099-g5b6d5a886ee45bb969b4de23528311472b4ab66b
Author: Michal Jires 
Date:   Fri Nov 17 21:17:18 2023 +0100

lto: Implement cache partitioning

This patch implements a new cache partitioning. It tries to keep symbols
from a single source file together to minimize propagation of divergence.

It starts with symbols already grouped by source files. If reasonably
possible it only either combines several files into one final partition,
or, if a file is large, splits the file into several final partitions.

The intermediate representation is partition_set, which contains a set of
groups of symbols (each group corresponding to an original source file) and
the number of final partitions this partition_set should split into.

First, partition_fixed_split splits a partition_set into a constant number of
partition_sets with equal numbers of symbol groups. If for example there
are 39 source files, the resulting partition_sets will contain 10, 10,
10, and 9 source files. This splitting intentionally ignores estimated
instruction counts to minimize propagation of divergence.
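
As a rough illustration of the fixed split, the sketch below computes bucket
sizes for an even distribution.  It assumes the constant is four, which is
what the 10, 10, 10, 9 example suggests; even_split_sizes is a hypothetical
helper and not the function added by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sizes of `parts` buckets holding `groups` items as evenly as possible,
// with the larger buckets first.  Depends only on the counts, never on
// estimated instruction counts.
std::vector<std::size_t> even_split_sizes(std::size_t groups,
                                          std::size_t parts) {
  std::vector<std::size_t> sizes;
  if (parts == 0)
    return sizes;
  const std::size_t base = groups / parts;
  const std::size_t extra = groups % parts;
  for (std::size_t i = 0; i < parts; ++i)
    sizes.push_back(base + (i < extra ? 1 : 0));
  return sizes;
}
```

For 39 groups and 4 parts this yields 10, 10, 10 and 9, matching the example
in the text.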

Second, partition_over_target_split separates files that are too large and
splits them into individual symbols to be combined back into several smaller
files in the next step.

Third, partition_binary_split splits a partition_set into two halves until
it should be split into only one final partition, at which point the
remaining symbols are joined into one final partition.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* common.opt: Add cache partitioning.
* flag-types.h (enum lto_partition_model): Likewise.

gcc/lto/ChangeLog:

* lto-partition.cc (new_partition): Use new_partition_no_push.
(new_partition_no_push): New.
(free_ltrans_partition): New.
(free_ltrans_partitions): Use free_ltrans_partition.
(join_partitions): New.
(split_partition_into_nodes): New.
(is_partition_reorder): New.
(class partition_set): New.
(distribute_n_partitions): New.
(partition_over_target_split): New.
(partition_binary_split): New.
(partition_fixed_split): New.
(class partitioner_base): New.
(class partitioner_default): New.
(lto_cache_map): New.
* lto-partition.h (lto_cache_map): New.
* lto.cc (do_whole_program_analysis): Use lto_cache_map.

gcc/testsuite/ChangeLog:

* gcc.dg/completion-2.c: Add -flto-partition=cache.

Diff:
---
 gcc/common.opt  |   3 +
 gcc/flag-types.h|   3 +-
 gcc/lto/lto-partition.cc| 605 +++-
 gcc/lto/lto-partition.h |   1 +
 gcc/lto/lto.cc  |   2 +
 gcc/testsuite/gcc.dg/completion-2.c |   1 +
 6 files changed, 605 insertions(+), 10 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 2c078fdd1f8..f2bc47fdc5e 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2233,6 +2233,9 @@ Enum(lto_partition_model) String(1to1) 
Value(LTO_PARTITION_1TO1)
 EnumValue
 Enum(lto_partition_model) String(max) Value(LTO_PARTITION_MAX)
 
+EnumValue
+Enum(lto_partition_model) String(cache) Value(LTO_PARTITION_CACHE)
+
 flto-partition=
 Common Joined RejectNegative Enum(lto_partition_model) Var(flag_lto_partition) 
Init(LTO_PARTITION_BALANCED)
 Specify the algorithm to partition symbols and vars at linktime.
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 5a2b461fa75..1e497f0bb91 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -396,7 +396,8 @@ enum lto_partition_model {
   LTO_PARTITION_ONE = 1,
   LTO_PARTITION_BALANCED = 2,
   LTO_PARTITION_1TO1 = 3,
-  LTO_PARTITION_MAX = 4
+  LTO_PARTITION_MAX = 4,
+  LTO_PARTITION_CACHE = 5
 };
 
 /* flag_lto_linker_output initialization values.  */
diff --git a/gcc/lto/lto-partition.cc b/gcc/lto/lto-partition.cc
index 19f91e5d660..44b457d0b2a 100644
--- a/gcc/lto/lto-partition.cc
+++ b/gcc/lto/lto-partition.cc
@@ -37,6 +37,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-fnsummary.h"
 #include "lto-partition.h"
 
+#include 
+#include 
+
 vec ltrans_partitions;
 
 static void add_symbol_to_partition (ltrans_partition part, symtab_node *node);
@@ -60,20 +63,41 @@ cmp_partitions_order (const void *a, const void *b)
   return orderb - ordera;
 }
 
-/* Create new partition with name NAME.  */
-
+/* Create new partition with name NAME.
+   Does not push into ltrans_partitions.  */
 static ltrans_partition
-new_partition (const char *name)
+new_partition_no_push (const char *name)
 {
   ltrans_partition part = XCNEW (struct ltrans_partition_def);
   part->encoder = lto_symtab_encoder_new (false);
   part->name = name;
   part->insns = 0;

[gcc r12-10497] Disable FMADD in chains for Zen4 and generic

2024-06-07 Thread hongtao Liu via Gcc-cvs

https://gcc.gnu.org/g:5d52558a531130675329d72ca5c4713abf5bf885

commit r12-10497-g5d52558a531130675329d72ca5c4713abf5bf885
Author: Jan Hubicka 
Date:   Fri Dec 29 23:51:03 2023 +0100

Disable FMADD in chains for Zen4 and generic

This patch disables use of FMA in the matrix multiplication loop for generic
(for x86-64-v3) and zen4.  I tested this on zen4 and Xeon Gold 6212U.

For Intel this is neutral both on the matrix multiplication microbenchmark
(attached) and spec2k17 where the difference was within noise for Core.

On core the micro-benchmark runs as follows:

With FMA:

   578,500,241  cycles:u         #    3.645 GHz             ( +-  0.12% )
   753,318,477  instructions:u   #    1.30  insn per cycle  ( +-  0.00% )
   125,417,701  branches:u       #  790.227 M/sec           ( +-  0.00% )
  0.159146 +- 0.000363 seconds time elapsed  ( +-  0.23% )

No FMA:

   577,573,960  cycles:u         #    3.514 GHz             ( +-  0.15% )
   878,318,479  instructions:u   #    1.52  insn per cycle  ( +-  0.00% )
   125,417,702  branches:u       #  763.035 M/sec           ( +-  0.00% )
  0.164734 +- 0.000321 seconds time elapsed  ( +-  0.19% )

So the cycle count is unchanged and discrete multiply+add takes the same
time as FMA.

While on zen:

With FMA:
     484875179  cycles:u          #    3.599 GHz              ( +-  0.05% )  (82.11%)
     752031517  instructions:u    #    1.55  insn per cycle
     125106525  branches:u        #  928.712 M/sec            ( +-  0.03% )  (85.09%)
        128356  branch-misses:u   #    0.10% of all branches  ( +-  0.06% )  (83.58%)

No FMA:
     375875209  cycles:u          #    3.592 GHz              ( +-  0.08% )  (80.74%)
     875725341  instructions:u    #    2.33  insn per cycle
     124903825  branches:u        #    1.194 G/sec            ( +-  0.04% )  (84.59%)
  0.105203 +- 0.000188 seconds time elapsed  ( +-  0.18% )

The difference is that Core understands the fact that fmadd does not need
all three parameters to start computation, while Zen cores do not.

Since this seems a noticeable win on zen and no loss on Core, it seems like
a good default for generic.

float a[SIZE][SIZE];
float b[SIZE][SIZE];
float c[SIZE][SIZE];

void init(void)
{
   int i, j, k;
   for(i=0; i

[gcc r15-1098] [libstdc++] drop workaround for clang<=7

2024-06-07 Thread Alexandre Oliva via Libstdc++-cvs

https://gcc.gnu.org/g:b6a9deb1e2ae01ee906e78e06e3a1b073d20e023

commit r15-1098-gb6a9deb1e2ae01ee906e78e06e3a1b073d20e023
Author: Alexandre Oliva 
Date:   Fri Jun 7 07:00:11 2024 -0300

[libstdc++] drop workaround for clang<=7

In response to a request in the review of the patch that introduced
_GLIBCXX_CLANG, this patch removes from std/variant an obsolete
workaround for clang 7 and earlier.


for  libstdc++-v3/ChangeLog

* include/std/variant: Drop obsolete workaround.

Diff:
---
 libstdc++-v3/include/std/variant | 5 -
 1 file changed, 5 deletions(-)

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 51aaa620851..13ea1dd3849 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -1758,11 +1758,6 @@ namespace __detail::__variant
  }, __rhs);
   }
 
-#if defined(_GLIBCXX_CLANG) && __clang_major__ <= 7
-public:
-  using _Base::_M_u; // See https://bugs.llvm.org/show_bug.cgi?id=31852
-#endif
-
 private:
   template
friend constexpr decltype(auto)

[gcc r15-1097] Fix fold-left reduction vectorization with multiple stmt copies

2024-06-07 Thread Richard Biener via Gcc-cvs

https://gcc.gnu.org/g:dd6f942c266533b2f72610f354bc9184f8276beb

commit r15-1097-gdd6f942c266533b2f72610f354bc9184f8276beb
Author: Richard Biener 
Date:   Fri Jun 7 09:41:11 2024 +0200

Fix fold-left reduction vectorization with multiple stmt copies

There's a typo when code generating the mask operand for conditional
fold-left reductions in the case we have multiple stmt copies.  The
latter is now allowed for SLP and possibly disabled for non-SLP by
accident.

This fixes the observed run-FAIL for
gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c with AVX512
and 256bit sized vectors.
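
The effect of the indexing typo can be seen in a scalar model of an in-order
conditional reduction over several vector copies.  This is only a sketch
under simplifying assumptions (fixed lane count, plain addition); the names
are illustrative and do not correspond to vectorizer functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;
using Vec = std::array<double, kLanes>;
using Mask = std::array<bool, kLanes>;

// Fold-left reduction: lanes are accumulated strictly in order, and copy i
// is filtered by its own mask, masks[i].
template <std::size_t N>
double fold_left_masked(double init, const std::array<Vec, N>& vals,
                        const std::array<Mask, N>& masks) {
  double acc = init;
  for (std::size_t i = 0; i < N; ++i)
    for (std::size_t lane = 0; lane < kLanes; ++lane)
      if (masks[i][lane])  // masks[0][lane] here would model the typo
        acc += vals[i][lane];
  return acc;
}

// Two copies with different masks: 1 + 3 + 20 + 40.
inline double fold_left_demo() {
  const std::array<Vec, 2> vals = {
      {{{1.0, 2.0, 3.0, 4.0}}, {{10.0, 20.0, 30.0, 40.0}}}};
  const std::array<Mask, 2> masks = {
      {{{true, false, true, false}}, {{false, true, false, true}}}};
  return fold_left_masked(0.0, vals, masks);
}
```

With a single copy the two indexings coincide, which is why the problem only
shows up once multiple stmt copies are generated.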

* tree-vect-loop.cc (vectorize_fold_left_reduction): Fix
mask vector operand indexing.

Diff:
---
 gcc/tree-vect-loop.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index ceb92156b58..028692614bb 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -7217,7 +7217,7 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
   if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
mask = vect_get_loop_mask (loop_vinfo, gsi, masks, vec_num, vectype_in, i);
   else if (is_cond_op)
-   mask = vec_opmask[0];
+   mask = vec_opmask[i];
   if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
{
  len = vect_get_loop_len (loop_vinfo, gsi, lens, vec_num, vectype_in,

[gcc r15-1096] libstdc++: Optimize std::to_address

2024-06-07 Thread Jonathan Wakely via Libstdc++-cvs

https://gcc.gnu.org/g:94997567ea5cbeb35571e94cf76d7f99ea3f9c43

commit r15-1096-g94997567ea5cbeb35571e94cf76d7f99ea3f9c43
Author: Jonathan Wakely 
Date:   Mon Mar 18 16:58:23 2024 +

libstdc++: Optimize std::to_address

We can use if-constexpr and variable templates to simplify and optimize
std::to_address. This should compile faster (and run faster for -O0)
than dispatching to the pre-C++20 std::__to_address overloads.
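
The general technique can be sketched outside the library as follows.  This
is not the libstdc++ code: it is a C++17 approximation that replaces the
C++20 requires-expression with a detection variable template, and to_addr and
has_traits_to_address are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Variable template: true when pointer_traits<P>::to_address(p) is
// well-formed for a const P&.
template <typename P, typename = void>
inline constexpr bool has_traits_to_address = false;

template <typename P>
inline constexpr bool has_traits_to_address<
    P, std::void_t<decltype(std::pointer_traits<P>::to_address(
           std::declval<const P&>()))>> = true;

// Raw pointers are returned unchanged.
template <typename T>
constexpr T* to_addr(T* p) noexcept {
  static_assert(!std::is_function_v<T>,
                "argument must not be a function pointer");
  return p;
}

// Pointer-like types: a single function whose branch is selected by
// if-constexpr rather than by overload resolution between helpers.
template <typename P>
constexpr auto to_addr(const P& p) noexcept {
  if constexpr (has_traits_to_address<P>)
    return std::pointer_traits<P>::to_address(p);
  else
    return to_addr(p.operator->());
}

// Usage example: a smart pointer and a raw pointer.
inline void to_addr_demo() {
  auto up = std::make_unique<int>(7);
  int x = 3;
  assert(to_addr(up) == up.get());
  assert(to_addr(&x) == &x);
}
```

Only the branch that is taken gets instantiated, so there is no overload set
to build and rank for each call.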

libstdc++-v3/ChangeLog:

* include/bits/ptr_traits.h (to_address): Optimize.
* testsuite/20_util/to_address/1_neg.cc: Adjust dg-error text.

Diff:
---
 libstdc++-v3/include/bits/ptr_traits.h | 47 +-
 libstdc++-v3/testsuite/20_util/to_address/1_neg.cc |  2 +-
 2 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/libstdc++-v3/include/bits/ptr_traits.h 
b/libstdc++-v3/include/bits/ptr_traits.h
index 6c65001cb74..ca67feecca3 100644
--- a/libstdc++-v3/include/bits/ptr_traits.h
+++ b/libstdc++-v3/include/bits/ptr_traits.h
@@ -200,36 +200,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 using __ptr_rebind = typename pointer_traits<_Ptr>::template rebind<_Tp>;
 
+#ifndef __glibcxx_to_address // C++ < 20
   template
+[[__gnu__::__always_inline__]]
 constexpr _Tp*
 __to_address(_Tp* __ptr) noexcept
 {
-  static_assert(!std::is_function<_Tp>::value, "not a function pointer");
+  static_assert(!std::is_function<_Tp>::value, "std::to_address argument "
+   "must not be a function pointer");
   return __ptr;
 }
 
-#ifndef __glibcxx_to_address // C++ < 20
   template
 constexpr typename std::pointer_traits<_Ptr>::element_type*
 __to_address(const _Ptr& __ptr)
 { return std::__to_address(__ptr.operator->()); }
 #else
-  template
-constexpr auto
-__to_address(const _Ptr& __ptr) noexcept
--> decltype(std::pointer_traits<_Ptr>::to_address(__ptr))
-{ return std::pointer_traits<_Ptr>::to_address(__ptr); }
-
-  template
-constexpr auto
-__to_address(const _Ptr& __ptr, _None...) noexcept
-{
-  if constexpr (is_base_of_v<__gnu_debug::_Safe_iterator_base, _Ptr>)
-   return std::__to_address(__ptr.base().operator->());
-  else
-   return std::__to_address(__ptr.operator->());
-}
-
   /**
* @brief Obtain address referenced by a pointer to an object
* @param __ptr A pointer to an object
@@ -237,9 +223,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @ingroup pointer_abstractions
   */
   template
+[[__gnu__::__always_inline__]]
 constexpr _Tp*
 to_address(_Tp* __ptr) noexcept
-{ return std::__to_address(__ptr); }
+{
+  static_assert(!is_function_v<_Tp>, "std::to_address argument "
+   "must not be a function pointer");
+  return __ptr;
+}
 
   /**
* @brief Obtain address referenced by a pointer to an object
@@ -251,7 +242,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 constexpr auto
 to_address(const _Ptr& __ptr) noexcept
-{ return std::__to_address(__ptr); }
+{
+  if constexpr (requires { pointer_traits<_Ptr>::to_address(__ptr); })
+   return pointer_traits<_Ptr>::to_address(__ptr);
+  else if constexpr (is_base_of_v<__gnu_debug::_Safe_iterator_base, _Ptr>)
+   return std::to_address(__ptr.base().operator->());
+  else
+   return std::to_address(__ptr.operator->());
+}
+
+  /// @cond undocumented
+  /// Compatibility for use in code that is also compiled as pre-C++20.
+  template
+[[__gnu__::__always_inline__]]
+constexpr auto
+__to_address(const _Ptr& __ptr) noexcept
+{ return std::to_address(__ptr); }
+  /// @endcond
 #endif // __glibcxx_to_address
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/testsuite/20_util/to_address/1_neg.cc 
b/libstdc++-v3/testsuite/20_util/to_address/1_neg.cc
index 7385f0f335c..10e919757bb 100644
--- a/libstdc++-v3/testsuite/20_util/to_address/1_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/to_address/1_neg.cc
@@ -16,7 +16,7 @@
 // .
 
 // { dg-do compile { target c++20 } }
-// { dg-error "not a function pointer" "" { target *-*-* } 0 }
+// { dg-error "must not be a function pointer" "" { target *-*-* } 0 }
 
 #include

[gcc r15-1095] fixincludes: bypass some fixes for recent darwin headers

2024-06-07 Thread François-Xavier Coudert via Gcc-cvs

https://gcc.gnu.org/g:e4f1c1be61d916345655d5edba309502046c9473

commit r15-1095-ge4f1c1be61d916345655d5edba309502046c9473
Author: Francois-Xavier Coudert 
Date:   Sun Jun 2 21:07:23 2024 +0200

fixincludes: bypass some fixes for recent darwin headers

fixincludes/ChangeLog:

* fixincl.x: Regenerate.
* inclhack.def (darwin_stdint_7, darwin_dispatch_object_1,
darwin_os_trace_2, darwin_os_base_1): Include bypasses
for recent headers, fixed by Apple.
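
As a simplified model of what a bypass does: a fix is applied only when its
select pattern is found in the header and its bypass pattern is not.  The
sketch below uses std::regex as a stand-in for the regex engine used by
fixincludes, and should_apply_fix is an illustrative name, not a fixincludes
function.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when the header text matches `select` and does not match
// `bypass` (patterns are interpreted with std::regex's default grammar,
// which is an assumption of this sketch).
bool should_apply_fix(const std::string& header, const std::string& select,
                      const std::string& bypass) {
  const std::regex select_re(select);
  const std::regex bypass_re(bypass);
  if (std::regex_search(header, bypass_re))
    return false;  // header already fixed upstream: skip the fix
  return std::regex_search(header, select_re);
}

// Usage example with the darwin_dispatch_object_1 patterns from this patch.
inline void bypass_demo() {
  const std::string select = "typedef void.*\\^dispatch_block_t.*";
  const std::string bypass = "#ifdef __BLOCKS__";
  const std::string old_header = "typedef void (^dispatch_block_t)(void);\n";
  const std::string new_header =
      "#ifdef __BLOCKS__\n"
      "typedef void (^dispatch_block_t)(void);\n"
      "#endif\n";
  assert(should_apply_fix(old_header, select, bypass));
  assert(!should_apply_fix(new_header, select, bypass));
}
```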

Diff:
---
 fixincludes/fixincl.x| 42 +++---
 fixincludes/inclhack.def |  4 
 2 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/fixincludes/fixincl.x b/fixincludes/fixincl.x
index e52f11d8460..caaff2883e0 100644
--- a/fixincludes/fixincl.x
+++ b/fixincludes/fixincl.x
@@ -2,11 +2,11 @@
  *
  * DO NOT EDIT THIS FILE   (fixincl.x)
  *
- * It has been AutoGen-ed  August 17, 2023 at 10:16:38 AM by AutoGen 5.18.12
+ * It has been AutoGen-ed  June  4, 2024 at 02:35:55 PM by AutoGen 5.18.16
  * From the definitionsinclhack.def
  * and the template file   fixincl
  */
-/* DO NOT SVN-MERGE THIS FILE, EITHER Thu Aug 17 10:16:38 CEST 2023
+/* DO NOT SVN-MERGE THIS FILE, EITHER Tue Jun  4 14:35:55 CEST 2024
  *
  * You must regenerate it.  Use the ./genfixes script.
  *
@@ -3070,8 +3070,15 @@ tSCC* apzDarwin_Os_Trace_2Machs[] = {
 tSCC zDarwin_Os_Trace_2Select0[] =
"typedef.*\\^os_trace_payload_t.*";
 
-#defineDARWIN_OS_TRACE_2_TEST_CT  1
+/*
+ *  content bypass pattern - skip fix if pattern found
+ */
+tSCC zDarwin_Os_Trace_2Bypass0[] =
+   "#ifdef __BLOCKS__";
+
+#defineDARWIN_OS_TRACE_2_TEST_CT  2
 static tTestDesc aDarwin_Os_Trace_2Tests[] = {
+  { TT_NEGREP,   zDarwin_Os_Trace_2Bypass0, (regex_t*)NULL },
   { TT_EGREP,zDarwin_Os_Trace_2Select0, (regex_t*)NULL }, };
 
 /*
@@ -3199,8 +3206,15 @@ tSCC zDarwin_Os_Base_1Select0[] =
"#define __has_attribute.*\n\
 #endif";
 
-#defineDARWIN_OS_BASE_1_TEST_CT  1
+/*
+ *  content bypass pattern - skip fix if pattern found
+ */
+tSCC zDarwin_Os_Base_1Bypass0[] =
+   "#define __has_extension";
+
+#defineDARWIN_OS_BASE_1_TEST_CT  2
 static tTestDesc aDarwin_Os_Base_1Tests[] = {
+  { TT_NEGREP,   zDarwin_Os_Base_1Bypass0, (regex_t*)NULL },
   { TT_EGREP,zDarwin_Os_Base_1Select0, (regex_t*)NULL }, };
 
 /*
@@ -3239,8 +3253,15 @@ tSCC* apzDarwin_Dispatch_Object_1Machs[] = {
 tSCC zDarwin_Dispatch_Object_1Select0[] =
"typedef void.*\\^dispatch_block_t.*";
 
-#defineDARWIN_DISPATCH_OBJECT_1_TEST_CT  1
+/*
+ *  content bypass pattern - skip fix if pattern found
+ */
+tSCC zDarwin_Dispatch_Object_1Bypass0[] =
+   "#ifdef __BLOCKS__";
+
+#defineDARWIN_DISPATCH_OBJECT_1_TEST_CT  2
 static tTestDesc aDarwin_Dispatch_Object_1Tests[] = {
+  { TT_NEGREP,   zDarwin_Dispatch_Object_1Bypass0, (regex_t*)NULL },
   { TT_EGREP,zDarwin_Dispatch_Object_1Select0, (regex_t*)NULL }, };
 
 /*
@@ -3591,8 +3612,15 @@ tSCC zDarwin_Stdint_7Select0[] =
"#define INTMAX_C\\(v\\)[ \t]+\\(v ## LL\\)\n\
 #define UINTMAX_C\\(v\\)[ \t]+\\(v ## ULL\\)";
 
-#defineDARWIN_STDINT_7_TEST_CT  1
+/*
+ *  content bypass pattern - skip fix if pattern found
+ */
+tSCC zDarwin_Stdint_7Bypass0[] =
+   "#ifdef __LP64__";
+
+#defineDARWIN_STDINT_7_TEST_CT  2
 static tTestDesc aDarwin_Stdint_7Tests[] = {
+  { TT_NEGREP,   zDarwin_Stdint_7Bypass0, (regex_t*)NULL },
   { TT_EGREP,zDarwin_Stdint_7Select0, (regex_t*)NULL }, };
 
 /*
@@ -11169,7 +11197,7 @@ static const char* apzX11_SprintfPatch[] = {
  *
  *  List of all fixes
  */
-#define REGEX_COUNT  313
+#define REGEX_COUNT  317
 #define MACH_LIST_SIZE_LIMIT 187
 #define FIX_COUNT274
 
diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def
index 19e0ea2df66..35402d0621c 100644
--- a/fixincludes/inclhack.def
+++ b/fixincludes/inclhack.def
@@ -1486,6 +1486,7 @@ fix = {
   mach  = "*-*-darwin*";
   files = os/trace.h;
   select= "typedef.*\\^os_trace_payload_t.*";
+  bypass= "#ifdef __BLOCKS__";
   c_fix = format;
   c_fix_arg = "#if __BLOCKS__\n%0\n#endif";
   test_text = "typedef void (^os_trace_payload_t)(xpc_object_t xdict);";
@@ -1566,6 +1567,7 @@ fix = {
 #define __has_attribute.*
 #endif
 OS_BASE_1_SEL;
+  bypass= "#define __has_extension";
   c_fix = format;
   c_fix_arg = <<- OS_BASE_1_FIX
 %0
@@ -1589,6 +1591,7 @@ fix = {
   mach  = "*-*-darwin*";
   files = dispatch/object.h;
   select= "typedef void.*\\^dispatch_block_t.*";
+  bypass= "#ifdef __BLOCKS__";
   c_fix = format;
   c_fix_arg = "#if __BLOCKS__\n%0\n#endif";
   test_text = <<- DISPATCH_OBJECT_1_TEST
@@ -1791,6 +1794,7 @@ fix = {
"#endif";
 select= "#define INTMAX_C\\(v\\)[ \t]+\\(v ## LL\\)\n"
"#define UINTMAX_C\\(v\\)[ \t]+\\(v ## ULL\\)";
+bypass = '#ifdef __LP64__';
 test_text = "#define INTMAX_C(v)  (v ## LL)\n"

[gcc r15-1094] Add finalizer creation to array constructor for functions of derived type.

2024-06-07 Thread Andre Vehreschild via Gcc-cvs

https://gcc.gnu.org/g:c3190756487080a11e819746f00b6e30fd0a0c2e

commit r15-1094-gc3190756487080a11e819746f00b6e30fd0a0c2e
Author: Andre Vehreschild 
Date:   Thu Jul 27 14:51:34 2023 +0200

Add finalizer creation to array constructor for functions of derived type.

PR fortran/90068

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_array_ctor_element): Eval non-
variable expressions once only.
(gfc_trans_array_constructor_value): Add statements of
final block.
(trans_array_constructor): Detect when final block is required.

gcc/testsuite/ChangeLog:

* gfortran.dg/finalize_57.f90: New test.

Diff:
---
 gcc/fortran/trans-array.cc| 18 -
 gcc/testsuite/gfortran.dg/finalize_57.f90 | 63 +++
 2 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index eec62c296ff..cc50b961a97 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -1885,6 +1885,16 @@ gfc_trans_array_ctor_element (stmtblock_t * pblock, tree 
desc,
 gfc_conv_descriptor_data_get (desc));
   tmp = gfc_build_array_ref (tmp, offset, NULL);
 
+  if (expr->expr_type == EXPR_FUNCTION && expr->ts.type == BT_DERIVED
+  && expr->ts.u.derived->attr.alloc_comp)
+{
+  if (!VAR_P (se->expr))
+   se->expr = gfc_evaluate_now (se->expr, >pre);
+  gfc_add_expr_to_block (>finalblock,
+gfc_deallocate_alloc_comp_no_caf (
+  expr->ts.u.derived, se->expr, expr->rank, true));
+}
+
   if (expr->ts.type == BT_CHARACTER)
 {
   int i = gfc_validate_kind (BT_CHARACTER, expr->ts.kind, false);
@@ -2147,6 +2157,8 @@ gfc_trans_array_constructor_value (stmtblock_t * pblock,
  *poffset = fold_build2_loc (input_location, PLUS_EXPR,
  gfc_array_index_type,
  *poffset, gfc_index_one_node);
+ if (finalblock)
+   gfc_add_block_to_block (finalblock, );
}
  else
{
@@ -2795,6 +2807,7 @@ trans_array_constructor (gfc_ss * ss, locus * where)
   tree neg_len;
   char *msg;
   stmtblock_t finalblock;
+  bool finalize_required;
 
   /* Save the old values for nested checking.  */
   old_first_len = first_len;
@@ -2973,8 +2986,11 @@ trans_array_constructor (gfc_ss * ss, locus * where)
   TREE_USED (offsetvar) = 0;
 
   gfc_init_block ();
+  finalize_required = expr->must_finalize;
+  if (expr->ts.type == BT_DERIVED && expr->ts.u.derived->attr.alloc_comp)
+finalize_required = true;
   gfc_trans_array_constructor_value (_loop->pre,
-expr->must_finalize ?  : NULL,
+finalize_required ?  : NULL,
 type, desc, c, , ,
 dynamic);
 
diff --git a/gcc/testsuite/gfortran.dg/finalize_57.f90 
b/gcc/testsuite/gfortran.dg/finalize_57.f90
new file mode 100644
index 000..b6257357c75
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/finalize_57.f90
@@ -0,0 +1,63 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/90068
+!
+! Contributed by Brad Richardson  
+! 
+
+program array_memory_leak
+implicit none
+
+type, abstract :: base
+end type base
+
+type, extends(base) :: extended
+end type extended
+
+type :: container
+class(base), allocatable :: thing
+end type
+
+type, extends(base) :: collection
+type(container), allocatable :: stuff(:)
+end type collection
+
+call run()
+call bad()
+contains
+subroutine run()
+type(collection) :: my_thing
+type(container) :: a_container
+
+a_container = newContainer(newExtended()) ! This is fine
+my_thing = newCollection([a_container])
+end subroutine run
+
+subroutine bad()
+type(collection) :: my_thing
+
+my_thing = newCollection([newContainer(newExtended())]) ! This is a memory leak
+end subroutine bad
+
+function newExtended()
+type(extended) :: newExtended
+end function newExtended
+
+function newContainer(thing)
+class(base), intent(in) :: thing
+type(container) :: newContainer
+
+allocate(newContainer%thing, source = thing)
+end function newContainer
+
+function newCollection(things)
+type(container), intent(in) :: things(:)
+type(collection) :: newCollection
+
+newCollection%stuff = things
+end function newCollection
+end program array_memory_leak
+
+! { dg-final { scan-tree-dump-times "__builtin_free" 15 "original" } }
+

[gcc/redhat/heads/gcc-14-branch] (58 commits) Merge commit 'r14-10288-g0f616e75f32083e1bc6d08f31e3fbc3dea

2024-06-07 Thread Jakub Jelinek via Gcc-cvs

The branch 'redhat/heads/gcc-14-branch' was updated to point to:

 1de1e03e8bd... Merge commit 'r14-10288-g0f616e75f32083e1bc6d08f31e3fbc3dea

It previously pointed to:

 e6b72839728... Merge commit 'r14-10231-gfc9fb69ad624fd4cc89ff31ad0a7b8d884

Diff:

Summary of changes (added commits):
---

  1de1e03... Merge commit 'r14-10288-g0f616e75f32083e1bc6d08f31e3fbc3dea
  0f616e7... bitint: Fix up lower_addsub_overflow [PR115352] (*)
  7d40974... Daily bump. (*)
  56c7372... c: Fix up pointer types to may_alias structures [PR114493] (*)
  35ed54f... aarch64: Add missing ACLE macro for NEON-SVE Bridge (*)
  d576034... Daily bump. (*)
  e11a42b... testsuite: i386: Require ifunc support in gcc.target/i386/a (*)
  7f0f88e... Daily bump. (*)
  c6e6258... libstdc++: Only define std::span::at for C++26 [PR115335] (*)
  a88e13b... fold-const: Fix up CLZ handling in tree_call_nonnegative_wa (*)
  f9af4a0... builtins: Force SAVE_EXPR for __builtin_{add,sub,mul}_overf (*)
  1c1bc25... invoke.texi: Clarify -march=lujiazui (*)
  a7dd44c... rs6000: Fix up PCH in --enable-host-pie builds [PR115324] (*)
  14a7296... combine: Fix up simplify_compare_const [PR115092] (*)
  e805232... testsuite: gm2: Remove timeout overrides [PR114886] (*)
  d92b508... libstdc++: Build libbacktrace and 19_diagnostics/stacktrace (*)
  b2bbf98... Daily bump. (*)
  955202e... libstdc++: Fix -Wstringop-overflow warning coming from std: (*)
  97474ba... Add AVX10.1 target_clones support (*)
  1dbf796... Daily bump. (*)
  a31676a... Daily bump. (*)
  d7f4279... AVR: target/115317 - Make isinf(-Inf) return -1. (*)
  2f097c0... libstdc++: Replace link to gcc-4.3.2 docs in manual [PR1152 (*)
  9d08c55... AVR: tree-optimization/115307 - Work around isinf bloat fro (*)
  5ca4e16... Daily bump. (*)
  ec92744... alpha: Fix invalid RTX in divmodsi insn patterns [PR115297] (*)
  36575f5... vect: Fix access size alignment assumption [PR115192] (*)
  cd161b3... i386: Fix ix86_option override after change [PR 113719] (*)
  06333a1... Daily bump. (*)
  201cfa7... MIPS16: Mark $2/$3 as clobbered if GP is used (*)
  8f6c56c... Daily bump. (*)
  fba2843... Fix link failure of GNAT tools on 32-bit SPARC/Linux (*)
  90a4476... tree-optimization/115149 - VOP live and missing PHIs (*)
  2a1fdd5... tree-optimization/115197 - fix ICE w/ constant in LC PHI an (*)
  9e971c6... tree-optimization/114921 - _Float16 -> __bf16 isn't noop fi (*)
  b4d4ece... Align tight loop without considering max skipping bytes (*)
  8060035... Adjust generic loop alignment from 16:11:8 to 16 for Intel  (*)
  e2b66da... Daily bump. (*)
  dbeb3d1... Fortran: Fix SHAPE for zero-size arrays (*)
  89dff14... libstdc++: Guard use of sized deallocation [PR114940] (*)
  e78980f... LoongArch: Guard REGNO with REG_P in loongarch_expand_condi (*)
  133da68... Daily bump. (*)
  4790076... tree-optimization/115232 - demangle failure during -Waccess (*)
  0cae44a... Daily bump. (*)
  2e0f832... Daily bump. (*)
  b0b21d5... Fortran: fix bounds check for assignment, class component [ (*)
  cab8941... Daily bump. (*)
  9031c02... c++: deleting array temporary [PR115187] (*)
  782ad20... c++: Propagate using decls from partitions [PR114868] (*)
  fd6fd88... c++: Fix instantiation of imported temploid friends [PR1142 (*)
  557cddc... c++: Standardise errors for module_may_redeclare (*)
  5429e6a... Daily bump. (*)
  1a6c1c8... sra: Do not leave work for DSE (that it can sometimes not p (*)
  137e7a8... Daily bump. (*)
  c27d6c7... c++: failure to suppress -Wsizeof-array-div in template [PR (*)
  da3a6b0... testsuite: Verify r0-r3 are extended with CMSE (*)
  2f0e086... Fix internal error in seh_cfa_offset with -O2 -fno-omit-fra (*)
  4896bb3... libstdc++: Implement std::formatter withou (*)

(*) This commit already exists in another branch.
Because the reference `refs/vendors/redhat/heads/gcc-14-branch' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.

[gcc(refs/vendors/redhat/heads/gcc-14-branch)] Merge commit 'r14-10288-g0f616e75f32083e1bc6d08f31e3fbc3dea41fa0c' into redhat/gcc-14-branch

2024-06-07 Thread Jakub Jelinek via Gcc-cvs

https://gcc.gnu.org/g:1de1e03e8bd3490b53f6fe454f7a48ddc1c839f2

commit 1de1e03e8bd3490b53f6fe454f7a48ddc1c839f2
Merge: e6b72839728 0f616e75f32
Author: Jakub Jelinek 
Date:   Fri Jun 7 10:39:08 2024 +0200

Merge commit 'r14-10288-g0f616e75f32083e1bc6d08f31e3fbc3dea41fa0c' into 
redhat/gcc-14-branch

Diff:

 gcc/ChangeLog  | 234 +++
 gcc/DATESTAMP  |   2 +-
 gcc/ada/ChangeLog  |   7 +
 gcc/ada/Makefile.rtl   |  13 +-
 gcc/builtins.cc|  22 +-
 gcc/c/ChangeLog|  10 +
 gcc/c/c-decl.cc|  15 ++
 gcc/combine.cc |   6 +-
 gcc/common/config/i386/i386-common.cc  |   4 +-
 gcc/common/config/i386/i386-cpuinfo.h  |   5 +-
 gcc/common/config/i386/i386-isas.h |   4 +-
 gcc/config/aarch64/aarch64-c.cc|   1 +
 gcc/config/alpha/alpha.md  |  21 +-
 gcc/config/alpha/constraints.md|   2 +-
 gcc/config/avr/avr.md  |  16 ++
 gcc/config/i386/i386-options.cc|  10 +-
 gcc/config/i386/i386.cc| 148 +++-
 gcc/config/i386/i386.md|  10 +-
 gcc/config/i386/x86-tune-costs.h   |   2 +-
 gcc/config/loongarch/loongarch.cc  |  17 +-
 gcc/config/mips/mips.cc|  11 +-
 gcc/config/rs6000/rs6000-builtin.cc|   2 +-
 gcc/config/rs6000/rs6000-c.cc  |  62 ++---
 gcc/config/rs6000/rs6000-gen-builtins.cc   |  72 +++---
 gcc/cp/ChangeLog   |  66 ++
 gcc/cp/cp-tree.h   |   5 +-
 gcc/cp/decl.cc |  69 +++---
 gcc/cp/init.cc |   9 +-
 gcc/cp/module.cc   | 201 
 gcc/cp/name-lookup.cc  |  53 +
 gcc/cp/pt.cc   |  33 ++-
 gcc/cp/semantics.cc|   8 +-
 gcc/cp/tree.cc |   6 +-
 gcc/doc/invoke.texi|   6 +-
 gcc/fold-const.cc  |   6 +-
 gcc/fold-mem-offsets.cc|   2 +-
 gcc/fortran/ChangeLog  |  20 ++
 gcc/fortran/trans-array.cc |   7 +-
 gcc/fortran/trans-expr.cc  |  40 ++--
 gcc/fortran/trans-intrinsic.cc |   4 +-
 gcc/gimple-lower-bitint.cc |   6 +-
 gcc/gimple-ssa-warn-access.cc  |   2 +-
 gcc/testsuite/ChangeLog| 253 +
 gcc/testsuite/g++.dg/cpp1z/array-prvalue3.C|   8 +
 gcc/testsuite/g++.dg/modules/enum-12.C |   2 +-
 gcc/testsuite/g++.dg/modules/friend-5_b.C  |   2 +-
 gcc/testsuite/g++.dg/modules/shadow-1_b.C  |   5 +-
 gcc/testsuite/g++.dg/modules/tpl-friend-10_a.C |  15 ++
 gcc/testsuite/g++.dg/modules/tpl-friend-10_b.C |   5 +
 gcc/testsuite/g++.dg/modules/tpl-friend-10_c.C |   7 +
 gcc/testsuite/g++.dg/modules/tpl-friend-10_d.C |   8 +
 gcc/testsuite/g++.dg/modules/tpl-friend-11_a.C |  14 ++
 gcc/testsuite/g++.dg/modules/tpl-friend-11_b.C |   5 +
 gcc/testsuite/g++.dg/modules/tpl-friend-12_a.C |  10 +
 gcc/testsuite/g++.dg/modules/tpl-friend-12_b.C |   9 +
 gcc/testsuite/g++.dg/modules/tpl-friend-12_c.C |  10 +
 gcc/testsuite/g++.dg/modules/tpl-friend-12_d.C |   8 +
 gcc/testsuite/g++.dg/modules/tpl-friend-12_e.C |   7 +
 gcc/testsuite/g++.dg/modules/tpl-friend-12_f.C |   8 +
 gcc/testsuite/g++.dg/modules/tpl-friend-13_a.C |  13 ++
 gcc/testsuite/g++.dg/modules/tpl-friend-13_b.C |  11 +
 gcc/testsuite/g++.dg/modules/tpl-friend-13_c.C |  13 ++
 gcc/testsuite/g++.dg/modules/tpl-friend-13_d.C |   7 +
 gcc/testsuite/g++.dg/modules/tpl-friend-13_e.C |  18 ++
 gcc/testsuite/g++.dg/modules/tpl-friend-13_f.C |   7 +
 gcc/testsuite/g++.dg/modules/tpl-friend-13_g.C |  11 +
 gcc/testsuite/g++.dg/modules/tpl-friend-14_a.C |   8 +
 gcc/testsuite/g++.dg/modules/tpl-friend-14_b.C |   8 +
 gcc/testsuite/g++.dg/modules/tpl-friend-14_c.C |   7 +
 gcc/testsuite/g++.dg/modules/tpl-friend-14_d.C |   9 +
 gcc/testsuite/g++.dg/modules/tpl-friend-9.C|  13 ++
 gcc/testsuite/g++.dg/modules/using-15_a.C  |  14 ++
 gcc/testsuite/g++.dg/modules/using-15_b.C  |   6 +
 gcc/testsuite/g++.dg/modules/using-15_c.C  |   8 +
 gcc/testsuite/g++.dg/opt/fmo1.C|  25 ++
 gcc/testsuite/g++.dg/pr115232.C|

[gcc r14-10288] bitint: Fix up lower_addsub_overflow [PR115352]

2024-06-07 Thread Jakub Jelinek via Gcc-cvs

https://gcc.gnu.org/g:0f616e75f32083e1bc6d08f31e3fbc3dea41fa0c

commit r14-10288-g0f616e75f32083e1bc6d08f31e3fbc3dea41fa0c
Author: Jakub Jelinek 
Date:   Fri Jun 7 10:32:08 2024 +0200

bitint: Fix up lower_addsub_overflow [PR115352]

The following testcase is miscompiled because of a flawed optimization.
If one changes the 65 in the testcase to e.g. 66, one gets:
...
  _25 = .USUBC (0, _24, _14);
  _12 = IMAGPART_EXPR <_25>;
  _26 = REALPART_EXPR <_25>;
  if (_23 >= 1)
goto ; [80.00%]
  else
goto ; [20.00%]

   :
  if (_23 != 1)
goto ; [80.00%]
  else
goto ; [20.00%]

   :
  _27 = (signed long) _26;
  _28 = _27 >> 1;
  _29 = (unsigned long) _28;
  _31 = _29 + 1;
  _30 = _31 > 1;
  goto ; [100.00%]

   :
  _32 = _26 != _18;
  _33 = _22 | _32;

   :
  # _17 = PHI <_30(9), _22(7), _33(10)>
  # _19 = PHI <_29(9), _18(7), _18(10)>
...
so there is one path for limbs below the boundary (in this case there are
actually no limbs there, maybe we could consider optimizing that further,
say with simply folding that _23 >= 1 condition to 1 == 1 and letting
cfg cleanup handle it), another case where it is exactly the limb on the
boundary (that is the bb 9 handling where it extracts the interesting
bits (the first 3 statements) and then checks if it is zero or all ones), and
finally the case of limbs above that where it compares the current result
limb against the previously recorded 0 or all ones and ors differences into
the accumulated result.

Now, the optimization which the first hunk removes was based on the idea
that for that case the extraction of the interesting bits from the limb
doesn't need anything special, so the _27/_28/_29 statements above aren't
needed, the whole limb is interesting bits, so it handled the >= 1
case like the bb 9 above without the first 3 statements and bb 10 wasn't
there at all.  There are 2 problems with that: for the higher limbs it
only checks if the result limb bits are all zeros or all ones, but
doesn't check if they are the same as the other extension bits, and
it forgets the previous flag whether there was an overflow.
First I wanted to fix it just by adding the _33 = _22 | _30; statement
to the end of bb 9 above, which fixed the originally filed huge testcase
and the first 2 foo calls in the testcase included in the patch, it no
longer forgets about previously checked differences from 0/1.
But as the last 2 foo calls show, it still didn't check whether each
even (or each odd depending on the exact position) result limb is
equal to the first one, so every second limb it could choose some other
0 vs. all ones value and as long as it repeated in another limb above it
it would be ok.

So, the optimization just can't work properly and the following patch
removes it.
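
The check that the three paths implement together can be sketched in portable C. This is a simplified illustration, not the code GCC emits: `overflows_signed` is a hypothetical name, limbs are assumed to be 64 bits wide and stored least significant first, and the loop handles one limb per iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if the two's-complement value held in LIMBS[0..N-1]
   (least significant limb first) does not fit in a signed type of
   PREC bits.  */
bool
overflows_signed (const uint64_t *limbs, size_t n, unsigned prec)
{
  assert (limbs != NULL && prec >= 1 && prec <= 64 * n);

  size_t start = (prec - 1) / 64;    /* limb holding the sign bit */
  unsigned shift = (prec - 1) % 64;  /* position of the sign bit in it */

  /* Boundary limb: the sign bit and everything above it must be all
     zeros or all ones.  Limbs below START need no check.  */
  uint64_t high = limbs[start] >> shift;
  uint64_t all_ones = UINT64_MAX >> shift;
  bool ovf = high != 0 && high != all_ones;

  /* Remember which extension value the boundary limb selected.  */
  uint64_t ext = (high & 1) ? UINT64_MAX : 0;

  /* Limbs above the boundary: each must equal the recorded extension,
     and any difference is ORed into the accumulated flag.  */
  for (size_t i = start + 1; i < n; i++)
    ovf |= limbs[i] != ext;

  return ovf;
}
```

For `prec` 65 the whole of limb 1 is extension bits (`shift` is 0). A value whose limbs are `{0, 0, UINT64_MAX}` must still be reported as overflowing, because limb 2 disagrees with the extension recorded from limb 1, even though each of those limbs is individually all zeros or all ones.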

2024-06-07  Jakub Jelinek  

PR middle-end/115352
* gimple-lower-bitint.cc (lower_addsub_overflow): Don't disable
single_comparison if cmp_code is GE_EXPR.

* gcc.dg/torture/bitint-71.c: New test.

(cherry picked from commit a47b1aaa7a76201da7e091d9f8d4488105786274)

Diff:
---
 gcc/gimple-lower-bitint.cc   |  6 +-
 gcc/testsuite/gcc.dg/torture/bitint-71.c | 28 
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/gcc/gimple-lower-bitint.cc b/gcc/gimple-lower-bitint.cc
index 7e8b6e3c51a..56e5f826a8d 100644
--- a/gcc/gimple-lower-bitint.cc
+++ b/gcc/gimple-lower-bitint.cc
@@ -4286,11 +4286,7 @@ bitint_large_huge::lower_addsub_overflow (tree obj, 
gimple *stmt)
  bool single_comparison
= (startlimb + 2 >= fin || (startlimb & 1) != (i & 1));
  if (!single_comparison)
-   {
- cmp_code = GE_EXPR;
- if (!check_zero && (start % limb_prec) == 0)
-   single_comparison = true;
-   }
+   cmp_code = GE_EXPR;
  else if ((startlimb & 1) == (i & 1))
cmp_code = EQ_EXPR;
  else
diff --git a/gcc/testsuite/gcc.dg/torture/bitint-71.c 
b/gcc/testsuite/gcc.dg/torture/bitint-71.c
new file mode 100644
index 000..8ebd42b30b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/bitint-71.c
@@ -0,0 +1,28 @@
+/* PR middle-end/115352 */
+/* { dg-do run { target bitint } } */
+/* { dg-options "-std=c23" } */
+/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
+/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
+
+#if __BITINT_MAXWIDTH__ >= 385
+int
+foo (_BitInt (385) b)
+{
+  return __builtin_sub_overflow_p (0, b, (_BitInt (65)) 0);
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 385
+  if

[gcc r15-1093] bitint: Fix up lower_addsub_overflow [PR115352]

2024-06-07 Thread Jakub Jelinek via Gcc-cvs

https://gcc.gnu.org/g:a47b1aaa7a76201da7e091d9f8d4488105786274

commit r15-1093-ga47b1aaa7a76201da7e091d9f8d4488105786274
Author: Jakub Jelinek 
Date:   Fri Jun 7 10:32:08 2024 +0200

bitint: Fix up lower_addsub_overflow [PR115352]

The following testcase is miscompiled because of a flawed optimization.
If one changes the 65 in the testcase to e.g. 66, one gets:
...
  _25 = .USUBC (0, _24, _14);
  _12 = IMAGPART_EXPR <_25>;
  _26 = REALPART_EXPR <_25>;
  if (_23 >= 1)
goto ; [80.00%]
  else
goto ; [20.00%]

   :
  if (_23 != 1)
goto ; [80.00%]
  else
goto ; [20.00%]

   :
  _27 = (signed long) _26;
  _28 = _27 >> 1;
  _29 = (unsigned long) _28;
  _31 = _29 + 1;
  _30 = _31 > 1;
  goto ; [100.00%]

   :
  _32 = _26 != _18;
  _33 = _22 | _32;

   :
  # _17 = PHI <_30(9), _22(7), _33(10)>
  # _19 = PHI <_29(9), _18(7), _18(10)>
...
so there is one path for limbs below the boundary (in this case there are
actually no limbs there, maybe we could consider optimizing that further,
say with simply folding that _23 >= 1 condition to 1 == 1 and letting
cfg cleanup handle it), another case where it is exactly the limb on the
boundary (that is the bb 9 handling where it extracts the interesting
bits (the first 3 statements) and then checks if it is zero or all ones), and
finally the case of limbs above that where it compares the current result
limb against the previously recorded 0 or all ones and ors differences into
the accumulated result.

Now, the optimization which the first hunk removes was based on the idea
that for that case the extraction of the interesting bits from the limb
doesn't need anything special, so the _27/_28/_29 statements above aren't
needed, the whole limb is interesting bits, so it handled the >= 1
case like the bb 9 above without the first 3 statements and bb 10 wasn't
there at all.  There are 2 problems with that: for the higher limbs it
only checks if the result limb bits are all zeros or all ones, but
doesn't check if they are the same as the other extension bits, and
it forgets the previous flag whether there was an overflow.
First I wanted to fix it just by adding the _33 = _22 | _30; statement
to the end of bb 9 above, which fixed the originally filed huge testcase
and the first 2 foo calls in the testcase included in the patch, it no
longer forgets about previously checked differences from 0/1.
But as the last 2 foo calls show, it still didn't check whether each
even (or each odd depending on the exact position) result limb is
equal to the first one, so every second limb it could choose some other
0 vs. all ones value and as long as it repeated in another limb above it
it would be ok.

So, the optimization just can't work properly and the following patch
removes it.

2024-06-07  Jakub Jelinek  

PR middle-end/115352
* gimple-lower-bitint.cc (lower_addsub_overflow): Don't disable
single_comparison if cmp_code is GE_EXPR.

* gcc.dg/torture/bitint-71.c: New test.

Diff:
---
 gcc/gimple-lower-bitint.cc   |  6 +-
 gcc/testsuite/gcc.dg/torture/bitint-71.c | 28 
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/gcc/gimple-lower-bitint.cc b/gcc/gimple-lower-bitint.cc
index 7e8b6e3c51a..56e5f826a8d 100644
--- a/gcc/gimple-lower-bitint.cc
+++ b/gcc/gimple-lower-bitint.cc
@@ -4286,11 +4286,7 @@ bitint_large_huge::lower_addsub_overflow (tree obj, 
gimple *stmt)
  bool single_comparison
= (startlimb + 2 >= fin || (startlimb & 1) != (i & 1));
  if (!single_comparison)
-   {
- cmp_code = GE_EXPR;
- if (!check_zero && (start % limb_prec) == 0)
-   single_comparison = true;
-   }
+   cmp_code = GE_EXPR;
  else if ((startlimb & 1) == (i & 1))
cmp_code = EQ_EXPR;
  else
diff --git a/gcc/testsuite/gcc.dg/torture/bitint-71.c 
b/gcc/testsuite/gcc.dg/torture/bitint-71.c
new file mode 100644
index 000..8ebd42b30b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/bitint-71.c
@@ -0,0 +1,28 @@
+/* PR middle-end/115352 */
+/* { dg-do run { target bitint } } */
+/* { dg-options "-std=c23" } */
+/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
+/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
+
+#if __BITINT_MAXWIDTH__ >= 385
+int
+foo (_BitInt (385) b)
+{
+  return __builtin_sub_overflow_p (0, b, (_BitInt (65)) 0);
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 385
+  if (!foo (-(_BitInt (385))

[gcc r13-8825] Disable FMADD in chains for Zen4 and generic

2024-06-07 Thread hongtao Liu via Gcc-cvs

https://gcc.gnu.org/g:e4f85ea6271a10e13c6874709a05e04ab0508fbf

commit r13-8825-ge4f85ea6271a10e13c6874709a05e04ab0508fbf
Author: Jan Hubicka 
Date:   Fri Dec 29 23:51:03 2023 +0100

Disable FMADD in chains for Zen4 and generic

This patch disables use of FMA in the matrix multiplication loop for generic
(for x86-64-v3) and zen4.  I tested this on zen4 and Xeon Gold 6212U.

For Intel this is neutral both on the matrix multiplication microbenchmark
(attached) and spec2k17 where the difference was within noise for Core.

On core the micro-benchmark runs as follows:

With FMA:

   578,500,241  cycles:u          #    3.645 GHz             ( +-  0.12% )
   753,318,477  instructions:u    #    1.30  insn per cycle  ( +-  0.00% )
   125,417,701  branches:u        #  790.227 M/sec           ( +-  0.00% )
      0.159146 +- 0.000363 seconds time elapsed  ( +-  0.23% )

No FMA:

   577,573,960  cycles:u          #    3.514 GHz             ( +-  0.15% )
   878,318,479  instructions:u    #    1.52  insn per cycle  ( +-  0.00% )
   125,417,702  branches:u        #  763.035 M/sec           ( +-  0.00% )
      0.164734 +- 0.000321 seconds time elapsed  ( +-  0.19% )

So the cycle count is unchanged and a discrete multiply+add takes the same
time as FMA.

While on zen:

With FMA:
 484875179  cycles:u          #    3.599 GHz              ( +-  0.05% )  (82.11%)
 752031517  instructions:u    #    1.55  insn per cycle
 125106525  branches:u        #  928.712 M/sec            ( +-  0.03% )  (85.09%)
    128356  branch-misses:u   #    0.10% of all branches  ( +-  0.06% )  (83.58%)

No FMA:
 375875209  cycles:u          #    3.592 GHz              ( +-  0.08% )  (80.74%)
 875725341  instructions:u    #    2.33  insn per cycle
 124903825  branches:u        #    1.194 G/sec            ( +-  0.04% )  (84.59%)
  0.105203 +- 0.000188 seconds time elapsed  ( +-  0.18% )

The difference is that Cores understand the fact that fmadd does not need
all three parameters to start computation, while Zen cores don't.

Since this seems a noticeable win on zen and no loss on Core, it seems like a
good default for generic.
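
The two code shapes being compared can be sketched in C as follows. This is an illustration only, not the benchmark from this commit: the function names are hypothetical, and the `volatile` temporary is merely an assumed way to keep a compiler from contracting the multiply and add into one FMA. In the first loop, a compiler that is allowed to contract (for example GCC targeting x86-64-v3) may emit an FMA whose accumulator input is the loop-carried dependency; in the second, only the add sits on that chain.

```c
#include <assert.h>
#include <stddef.h>

/* Single accumulator; the multiply-add may be contracted into one FMA,
   which puts the whole FMA on the loop-carried dependency chain.  */
float
dot_fusable (const float *a, const float *b, size_t n)
{
  assert (a != NULL && b != NULL);
  float acc = 0.0f;
  for (size_t i = 0; i < n; i++)
    acc += a[i] * b[i];
  return acc;
}

/* Same reduction with the product kept as a separate value, so the
   multiply does not depend on ACC and only the add is on the chain.
   The volatile temporary is an assumption used to block contraction.  */
float
dot_unfused (const float *a, const float *b, size_t n)
{
  assert (a != NULL && b != NULL);
  float acc = 0.0f;
  for (size_t i = 0; i < n; i++)
    {
      volatile float prod = a[i] * b[i];
      acc += prod;
    }
  return acc;
}
```

Both functions compute the same dot product; for inputs whose products and sums are exactly representable, the results are identical, and only the shape of the dependency chain differs.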

float a[SIZE][SIZE];
float b[SIZE][SIZE];
float c[SIZE][SIZE];

void init(void)
{
   int i, j, k;
   for(i=0; i

[gcc r15-1092] go: Fix gccgo -v on Solaris with ld

2024-06-07 Thread Rainer Orth via Gcc-cvs

https://gcc.gnu.org/g:9fff0be2f849b84e4c427bdd7a4716158b80a511

commit r15-1092-g9fff0be2f849b84e4c427bdd7a4716158b80a511
Author: Rainer Orth 
Date:   Fri Jun 7 10:14:23 2024 +0200

go: Fix gccgo -v on Solaris with ld

The Go testsuite's go.sum file ends in

Couldn't determine version of 
/var/gcc/regression/master/11.4-gcc-64/build/gcc/gccgo

on Solaris.  It turns out this happens because gccgo -v is confused:

[...]
gcc version 15.0.0 20240531 (experimental) [master 
a0d60660f2aae2d79685f73d568facb2397582d8] (GCC)
COMPILER_PATH=./:/usr/ccs/bin/
LIBRARY_PATH=./:/lib/amd64/:/usr/lib/amd64/:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-g1' '-B' './' '-v' '-shared-libgcc' '-mtune=generic' 
'-march=x86-64' '-dumpdir' 'a.'
 ./collect2 -V -M ./libgcc-unwind.map -Qy /usr/lib/amd64/crt1.o ./crtp.o 
/usr/lib/amd64/crti.o /usr/lib/amd64/values-Xa.o /usr/lib/amd64/values-xpg6.o 
./crtbegin.o -L. -L/lib/amd64 -L/usr/lib/amd64 -t -lgcc_s -lgcc -lc -lgcc_s 
-lgcc ./crtend.o /usr/lib/amd64/crtn.o
ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.3297
Undefined   first referenced
 symbol in file
main/usr/lib/amd64/crt1.o
ld: fatal: symbol referencing errors
collect2: error: ld returned 1 exit status

trying to invoke the linker without adding any object file.  This only
happens when Solaris ld is in use.  gccgo passes -t to the linker in
that case, but does it unconditionally, even with -v.

When configured to use GNU ld, gccgo -v is fine instead.

This patch avoids this by restricting the -t to actually linking.
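
The shape of the fix can be sketched in a few lines of C. This is a simplified stand-in and not the driver code: `add_linker_options` and its parameters are hypothetical, and the `linking` flag is assumed to play the role of the patch's `library > 0` test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Append driver-added linker options to OPTS, which currently holds
   COUNT entries and has room for CAP.  Returns the new count.  The
   option is added only when a link step will actually run.  */
size_t
add_linker_options (const char **opts, size_t count, size_t cap,
                    bool linking)
{
  assert (opts != NULL && count <= cap);
  if (linking)
    {
      assert (count < cap);
      opts[count++] = "-Wl,-t";
    }
  return count;
}
```

With `linking` false, as for a bare `gccgo -v`, the option list is left unchanged, so nothing is added that could trigger a link with no object files.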

Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11 (ld and gld).

2024-06-05  Rainer Orth  

gcc/go:
* gospec.cc (lang_specific_driver) [TARGET_SOLARIS !USE_GLD]: Only
add -t if linking.

Diff:
---
 gcc/go/gospec.cc | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/go/gospec.cc b/gcc/go/gospec.cc
index b866d47a942..a3da23dfa3a 100644
--- a/gcc/go/gospec.cc
+++ b/gcc/go/gospec.cc
@@ -443,8 +443,11 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
  using the GNU linker, the Solaris linker needs an option to not
  warn about this.  Everything works without this option, but you
  get unsightly warnings at link time.  */
-  generate_option (OPT_Wl_, "-t", 1, CL_DRIVER, _decoded_options[j]);
-  j++;
+  if (library > 0)
+{
+  generate_option (OPT_Wl_, "-t", 1, CL_DRIVER, _decoded_options[j]);
+  j++;
+}
 #endif
 
   *in_decoded_options_count = j;

[gcc r15-1091] testsuite: go: Require split-stack support for go.test/test/index0.go [PR87589]

2024-06-07 Thread Rainer Orth via Gcc-cvs

https://gcc.gnu.org/g:9ab90fc627301b1701cf19bf4ca220f02a93d894

commit r15-1091-g9ab90fc627301b1701cf19bf4ca220f02a93d894
Author: Rainer Orth 
Date:   Fri Jun 7 10:12:09 2024 +0200

testsuite: go: Require split-stack support for go.test/test/index0.go 
[PR87589]

The index0-out.go test FAILs on Solaris (SPARC and x86, 32 and 64-bit),
as well as several others:

FAIL: ./index0-out.go execution,  -O0 -g -fno-var-tracking-assignments

The test SEGVs because it tries a stack access way beyond the stack
area.  As Ian analyzed in the PR, the testcase currently requires
split-stack support, so this patch requires just that.

Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11.

2024-06-05  Rainer Orth  

gcc/testsuite:
PR go/87589
* go.test/go-test.exp (go-gc-tests): Require split-stack support
for index0.go.

Diff:
---
 gcc/testsuite/go.test/go-test.exp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/go.test/go-test.exp 
b/gcc/testsuite/go.test/go-test.exp
index 8fdc7b420be..98317380746 100644
--- a/gcc/testsuite/go.test/go-test.exp
+++ b/gcc/testsuite/go.test/go-test.exp
@@ -477,7 +477,8 @@ proc go-gc-tests { } {
if { ( [file tail $test] == "select2.go" \
   || [file tail $test] == "stack.go" \
   || [file tail $test] == "peano.go" \
-  || [file tail $test] == "nilptr2.go" ) \
+  || [file tail $test] == "nilptr2.go" \
+  || [file tail $test] == "index0.go" ) \
 && ! [check_effective_target_split_stack] } {
# These tests fails on targets without split stack.
untested $name

[gcc r15-1090] Fix returned type to be allocatable for user-functions.

2024-06-07 Thread Andre Vehreschild via Gcc-cvs

https://gcc.gnu.org/g:51046e46ae66ca95bf2b93ae60f0c4d6b338f8af

commit r15-1090-g51046e46ae66ca95bf2b93ae60f0c4d6b338f8af
Author: Andre Vehreschild 
Date:   Wed Jul 19 11:57:43 2023 +0200

Fix returned type to be allocatable for user-functions.

The returned type of a user-defined function returning a
class object was not detected and handled correctly, which
led to memory leaks.

PR fortran/90072

gcc/fortran/ChangeLog:

* expr.cc (gfc_is_alloc_class_scalar_function): Detect
allocatable class return types also for user-defined
functions.
* trans-expr.cc (gfc_conv_procedure_call): Same.
(trans_class_vptr_len_assignment): Compute vptr len
assignment correctly for user-defined functions.

gcc/testsuite/ChangeLog:

* gfortran.dg/class_77.f90: New test.

Diff:
---
 gcc/fortran/expr.cc| 13 --
 gcc/fortran/trans-expr.cc  | 35 +++---
 gcc/testsuite/gfortran.dg/class_77.f90 | 83 ++
 3 files changed, 109 insertions(+), 22 deletions(-)

diff --git a/gcc/fortran/expr.cc b/gcc/fortran/expr.cc
index a162744c719..be138d196a2 100644
--- a/gcc/fortran/expr.cc
+++ b/gcc/fortran/expr.cc
@@ -5573,11 +5573,14 @@ bool
 gfc_is_alloc_class_scalar_function (gfc_expr *expr)
 {
   if (expr->expr_type == EXPR_FUNCTION
-  && expr->value.function.esym
-  && expr->value.function.esym->result
-  && expr->value.function.esym->result->ts.type == BT_CLASS
-  && !CLASS_DATA (expr->value.function.esym->result)->attr.dimension
-  && CLASS_DATA (expr->value.function.esym->result)->attr.allocatable)
+  && ((expr->value.function.esym
+  && expr->value.function.esym->result
+  && expr->value.function.esym->result->ts.type == BT_CLASS
+  && !CLASS_DATA (expr->value.function.esym->result)->attr.dimension
+  && CLASS_DATA (expr->value.function.esym->result)->attr.allocatable)
+ || (expr->ts.type == BT_CLASS
+ && CLASS_DATA (expr)->attr.allocatable
+ && !CLASS_DATA (expr)->attr.dimension)))
 return true;
 
   return false;
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 9f6cc8f871e..d6f4d6bfe45 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -8301,7 +8301,9 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
}
 
  /* Finalize the result, if necessary.  */
- attr = CLASS_DATA (expr->value.function.esym->result)->attr;
+ attr = expr->value.function.esym
+? CLASS_DATA (expr->value.function.esym->result)->attr
+: CLASS_DATA (expr)->attr;
  if (!((gfc_is_class_array_function (expr)
 || gfc_is_alloc_class_scalar_function (expr))
&& attr.pointer))
@@ -10085,27 +10087,26 @@ trans_class_vptr_len_assignment (stmtblock_t *block, gfc_expr * le,
   if (re->expr_type != EXPR_VARIABLE && re->expr_type != EXPR_NULL
   && rse->expr != NULL_TREE)
 {
-  if (re->ts.type == BT_CLASS && !GFC_CLASS_TYPE_P (TREE_TYPE (rse->expr)))
-   class_expr = gfc_get_class_from_expr (rse->expr);
+  if (!DECL_P (rse->expr))
+   {
+ if (re->ts.type == BT_CLASS && !GFC_CLASS_TYPE_P (TREE_TYPE (rse->expr)))
+   class_expr = gfc_get_class_from_expr (rse->expr);
 
-  if (rse->loop)
-   pre = &rse->loop->pre;
-  else
-   pre = &rse->pre;
+ if (rse->loop)
+   pre = &rse->loop->pre;
+ else
+   pre = &rse->pre;
 
-  if (class_expr != NULL_TREE && UNLIMITED_POLY (re))
-   {
- tmp = TREE_OPERAND (rse->expr, 0);
- tmp = gfc_create_var (TREE_TYPE (tmp), "rhs");
- gfc_add_modify (&rse->pre, tmp, TREE_OPERAND (rse->expr, 0));
+ if (class_expr != NULL_TREE && UNLIMITED_POLY (re))
+ tmp = gfc_evaluate_now (TREE_OPERAND (rse->expr, 0), &rse->pre);
+ else
+ tmp = gfc_evaluate_now (rse->expr, &rse->pre);
+
+ rse->expr = tmp;
}
   else
-   {
- tmp = gfc_create_var (TREE_TYPE (rse->expr), "rhs");
- gfc_add_modify (&rse->pre, tmp, rse->expr);
-   }
+   pre = &rse->pre;
 
-  rse->expr = tmp;
   temp_rhs = true;
 }
 
diff --git a/gcc/testsuite/gfortran.dg/class_77.f90 b/gcc/testsuite/gfortran.dg/class_77.f90
new file mode 100644
index 000..ef38dd67743
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/class_77.f90
@@ -0,0 +1,83 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/90072
+!
+! Contributed by Brad Richardson  
+! 
+
+module types
+implicit none
+
+type, abstract :: base_returned
+end type base_returned
+
+type, extends(base_returned) :: first_returned
+end type first_returned
+
+type, extends(base_returned) :: second_returned
+end type second_returned
+
+type, abstract ::

gcc-wwwdocs branch master updated. 4260d675af42b9c97e29818ab3b3154d27103d49

2024-06-07 Thread Tobias Burnus via Gcc-cvs-wwwdocs

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gcc-wwwdocs".

The branch, master has been updated
   via  4260d675af42b9c97e29818ab3b3154d27103d49 (commit)
  from  8507122b38e6b60e8f2f3c8cd339d4f318377203 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -
commit 4260d675af42b9c97e29818ab3b3154d27103d49
Author: Tobias Burnus 
Date:   Fri Jun 7 10:06:52 2024 +0200

gcc-15/changes.html + projects/gomp: update for new OpenMP features

GCC 15 now supports unified-shared memory and the tile/unroll constructs
in OpenMP.
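
For readers unfamiliar with the loop-transformation constructs mentioned above, the following is a minimal sketch of how `tile` and `unroll` are written in C++ source. The function names `sum_matrix` and `sum_array` are hypothetical and chosen only for illustration; they do not come from the commit. When the file is compiled without OpenMP enabled, the pragmas are ignored and the loops run exactly as written, so the results are the same either way.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums an n x n row-major matrix.
// The pragma asks an OpenMP 5.1 compiler to tile the two-deep
// loop nest into 4x4 blocks; it does not change the result.
long sum_matrix(const std::vector<int>& m, int n) {
    long total = 0;
    #pragma omp tile sizes(4, 4)
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            total += m[static_cast<std::size_t>(i) * n + j];
    return total;
}

// Hypothetical helper: sums a vector.
// The pragma asks for partial unrolling by a factor of 2.
long sum_array(const std::vector<int>& v) {
    long total = 0;
    const int n = static_cast<int>(v.size());
    #pragma omp unroll partial(2)
    for (int i = 0; i < n; ++i)
        total += v[i];
    return total;
}
```

Calling `sum_matrix` on a 6x6 matrix of ones yields 36, and `sum_array` on {1, 2, 3, 4, 5} yields 15, with or without the transformations applied. The unified-shared-memory part of the change is not covered by this sketch.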

diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index 0ea7bdec..a121f40a 100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs/gcc-15/changes.html
@@ -40,6 +40,24 @@ a work-in-progress.
 
 New Languages and Language specific improvements
 
+
+  OpenMP
+  
+
+  Support for unified-shared memory has been added for some AMD and Nvidia
+  GPU devices, enabled when using the unified_shared_memory
+  clause to the requires directive. For details,
+  see the offload-target specifics section in the
+  https://gcc.gnu.org/onlinedocs/libgomp/Offload-Target-Specifics.html;
+  >GNU Offloading and Multi Processing Runtime Library Manual.
+
+
+  OpenMP 5.1: The unroll and tile
+  loop-transformation constructs are now supported.
+
+  
+
+
 
 
 
diff --git a/htdocs/projects/gomp/index.html b/htdocs/projects/gomp/index.html
index 94bda5ff..d1765fc3 100644
--- a/htdocs/projects/gomp/index.html
+++ b/htdocs/projects/gomp/index.html
@@ -313,18 +313,21 @@ than listed, depending on resolved corner cases and optimizations.
   
   
 requires directive
-
+
   GCC9
   GCC12
   GCC13
-  GCC14
+  GCC14
+  GCC15
 
 
   (atomic_default_mem_order)
   (dynamic_allocators)
   complete but no non-host devices provides unified_address or
   unified_shared_memory
-  complete but no non-host devices provides unified_shared_memory
+  complete but no non-host devices provides unified_shared_memory
+  complete; see also https://gcc.gnu.org/onlinedocs/libgomp/Offload-Target-Specifics.html;>
+  Offload-Target Specifics
 
   
   
@@ -706,7 +709,7 @@ than listed, depending on resolved corner cases and optimizations.
   
   
 Loop transformation constructs
-No
+GCC15
 
   
   

---

Summary of changes:
 htdocs/gcc-15/changes.html  | 18 ++
 htdocs/projects/gomp/index.html | 11 +++
 2 files changed, 25 insertions(+), 4 deletions(-)


hooks/post-receive
-- 
gcc-wwwdocs

gcc-wwwdocs branch master updated. 8507122b38e6b60e8f2f3c8cd339d4f318377203

2024-06-07 Thread Tobias Burnus via Gcc-cvs-wwwdocs

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gcc-wwwdocs".

The branch, master has been updated
   via  8507122b38e6b60e8f2f3c8cd339d4f318377203 (commit)
  from  1db5b34eb8cf47f070f643f993d835149bce2ec7 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -
commit 8507122b38e6b60e8f2f3c8cd339d4f318377203
Author: Tobias Burnus 
Date:   Fri Jun 7 09:58:52 2024 +0200

gcc-15/changes.html (nvptx): Constructors are now supported
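
As a rough illustration of what "constructors and destructors" means here, the sketch below uses the GNU `constructor` and `destructor` function attributes in ordinary host code. The names `on_load`, `on_unload`, `init_count` and `get_init_count` are hypothetical. This shows only the source-level form; how such functions are built and run for nvptx depends on the toolchain setup and is not shown.

```cpp
#include <cstdio>

// Hypothetical flag, zero-initialized before any constructor runs.
static int init_count = 0;

// Runs before main() on toolchains that honor the GNU attribute.
__attribute__((constructor)) static void on_load() {
    init_count = 1;
}

// Runs after main() returns or exit() is called.
__attribute__((destructor)) static void on_unload() {
    std::puts("on_unload ran");
}

// Hypothetical accessor so callers can check that on_load() ran.
int get_init_count() {
    return init_count;
}
```

By the time `main` starts, `get_init_count()` returns 1 because `on_load` has already executed.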

diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index b59fd3be..0ea7bdec 100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs/gcc-15/changes.html
@@ -85,7 +85,14 @@ a work-in-progress.
 
 
 
-
+NVPTX
+
+
+  GCC's nvptx target now supports constructors and destructors.
+  For this, a recent version of https://gcc.gnu.org/install/specific.html#nvptx-x-none;
+  >nvptx-tools is required.
+
 
 
 

---

Summary of changes:
 htdocs/gcc-15/changes.html | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)


hooks/post-receive
-- 
gcc-wwwdocs
