Re: [PATCH, rs6000] Add multiply-add expand pattern [PR103109]

2022-07-31 Thread Kewen.Lin via Gcc-patches
Hi Haochen,

Thanks for the patch; some comments are inline below.

on 2022/7/25 13:11, HAO CHEN GUI wrote:
> Hi,
>   This patch adds an expand and several insns for multiply-add with
> three 64bit operands.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
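
For readers following the thread, the arithmetic the new expander targets can be sketched in C++ as below. This is only an illustration of the 64x64+64 -> 128-bit operation, not code from the patch: `MaddResult` and `madd_u64` are hypothetical names, and `unsigned __int128` is assumed to be available (a GCC/Clang extension on 64-bit targets). The low half is what `maddld` computes; the high half corresponds to `maddhd` (signed) or `maddhdu` (unsigned) in Power ISA 3.0.

```cpp
#include <cstdint>

// Result of a 64x64+64 -> 128-bit multiply-add, split into 64-bit halves.
// (Hypothetical helper type, used only for this illustration.)
struct MaddResult {
  std::uint64_t lo;  // low doubleword of a * b + c
  std::uint64_t hi;  // high doubleword of a * b + c
};

// Unsigned variant: widen to 128 bits, multiply, add, then split.
// The sum cannot overflow 128 bits: (2^64-1)^2 + (2^64-1) < 2^128.
inline MaddResult madd_u64 (std::uint64_t a, std::uint64_t b, std::uint64_t c)
{
  unsigned __int128 r = (unsigned __int128) a * b + c;
  return { (std::uint64_t) r, (std::uint64_t) (r >> 64) };
}
```

In the quoted expander, the two halves are written to the DImode subregs of the TImode destination: the low half sits at byte offset 8 on big-endian and 0 on little-endian, and the high half at the opposite offset.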
> 
> ChangeLog
> 2022-07-22  Haochen Gui  
> 
> gcc/
>   PR target/103109
>   * config/rs6000/rs6000.md (maddditi4): New pattern for
>   multiply-add.
>   (madddi4_lowpart): New.
>   (madddi4_lowpart_le): New.
>   (madddi4_highpart): New.
>   (madddi4_highpart_le): New.
> 
> gcc/testsuite/
>   PR target/103109
>   * gcc.target/powerpc/pr103109.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index c55ee7e171a..4f3b56e103e 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -3226,6 +3226,97 @@ (define_insn "*maddld4"
>"maddld %0,%1,%2,%3"
>[(set_attr "type" "mul")])
> 
> +(define_expand "maddditi4"
> +  [(set (match_operand:TI 0 "gpc_reg_operand")
> + (plus:TI
> +   (mult:TI (any_extend:TI
> +  (match_operand:DI 1 "gpc_reg_operand"))
> +(any_extend:TI
> +  (match_operand:DI 2 "gpc_reg_operand")))
> +   (any_extend:TI
> + (match_operand:DI 3 "gpc_reg_operand"))))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD"
> +{
> +  rtx op0_lo = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 8 : 0);
> +  rtx op0_hi = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 0 : 8);
> +
> +  if (BYTES_BIG_ENDIAN)
> +{
> +  emit_insn (gen_madddi4_lowpart (op0_lo, operands[1], operands[2],
> +  operands[3]));
> +  emit_insn (gen_madddi4_highpart (op0_hi, operands[1], operands[2],
> +   operands[3]));
> +}
> +  else
> +{
> +  emit_insn (gen_madddi4_lowpart_le (op0_lo, operands[1], operands[2],
> + operands[3]));
> +  emit_insn (gen_madddi4_highpart_le (op0_hi, operands[1], operands[2],
> +  operands[3]));
> +}
> +  DONE;
> +})
> +
> +(define_insn "madddi4_lowpart"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> + (subreg:DI
> +   (plus:TI
> + (mult:TI (any_extend:TI
> +(match_operand:DI 1 "gpc_reg_operand" "r"))
> +  (any_extend:TI
> +(match_operand:DI 2 "gpc_reg_operand" "r")))
> + (any_extend:TI
> +   (match_operand:DI 3 "gpc_reg_operand" "r")))
> +  8))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && BYTES_BIG_ENDIAN"
> +  "maddld %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
> +(define_insn "madddi4_lowpart_le"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> + (subreg:DI
> +   (plus:TI
> + (mult:TI (any_extend:TI
> +(match_operand:DI 1 "gpc_reg_operand" "r"))
> +  (any_extend:TI
> +(match_operand:DI 2 "gpc_reg_operand" "r")))
> + (any_extend:TI
> +   (match_operand:DI 3 "gpc_reg_operand" "r")))
> +  0))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && !BYTES_BIG_ENDIAN"
> +  "maddld %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
> +(define_insn "madddi4_highpart"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> + (subreg:DI
> +   (plus:TI
> + (mult:TI (any_extend:TI
> +(match_operand:DI 1 "gpc_reg_operand" "r"))
> +  (any_extend:TI
> +(match_operand:DI 2 "gpc_reg_operand" "r")))
> + (any_extend:TI
> +   (match_operand:DI 3 "gpc_reg_operand" "r")))
> +  0))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && BYTES_BIG_ENDIAN"
> +  "maddhd %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
> +(define_insn "madddi4_highpart_le"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> + (subreg:DI
> +   (plus:TI
> + (mult:TI (any_extend:TI
> +(match_operand:DI 1 "gpc_reg_operand" "r"))
> +  (any_extend:TI
> +(match_operand:DI 2 "gpc_reg_operand" "r")))
> + (any_extend:TI
> +   (match_operand:DI 3 "gpc_reg_operand" "r")))
> +  8))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && !BYTES_BIG_ENDIAN"
> +  "maddhd %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
>  (define_insn "udiv3"
>[(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>  (udiv:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103109.c 
> b/gcc/testsuite/gcc.target/powerpc/pr103109.c
> new file mode 100644
> index 000..256e05d5677
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103109.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile { target { lp64 } } } */

Since the guard is TARGET_POWERPC64, should use

[COMMITTED] Make irange dependency explicit for range_of_ssa_name_with_loop_info.

2022-07-31 Thread Aldy Hernandez via Gcc-patches
Even though ranger is type agnostic, SCEV seems to only work with
integers.  This patch removes some FIXME notes making it explicit that
bounds_of_var_in_loop only works with iranges.
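
The idea of moving the type check from the callee to the caller can be sketched with a toy model. Nothing below is GCC code: `TypeKind`, `IntRange`, `int_range_supports`, `loop_bounds` and `query_loop_range` are hypothetical stand-ins, and the bounds arithmetic is a placeholder that assumes a non-negative step and no overflow.

```cpp
// Toy classification of types; stands in for the real type predicates.
enum class TypeKind { Integer, Pointer, Float };

// Toy integer-only range; stands in for an integer range class.
struct IntRange {
  long lo = 0;
  long hi = 0;
  bool varying = true;
};

// Toy analogue of "can this type be represented as an integer range?".
inline bool int_range_supports (TypeKind k)
{
  return k == TypeKind::Integer || k == TypeKind::Pointer;
}

// Callee: the IntRange& parameter already states the requirement,
// so no runtime type check is needed inside.
inline void loop_bounds (IntRange &r, long init, long step, long iterations)
{
  r.lo = init;
  r.hi = init + step * iterations;  // placeholder arithmetic
  r.varying = false;
}

// Caller: checks the type first, and only then creates an integer range.
inline bool query_loop_range (TypeKind k, long init, long step,
                              long iterations, IntRange &out)
{
  if (k == TypeKind::Pointer || !int_range_supports (k))
    return false;
  IntRange r;
  loop_bounds (r, init, step, iterations);
  out = r;
  return true;
}
```

With this shape, an unsupported type never reaches the callee, which is roughly what the hunks below do for range_of_phi and range_of_ssa_name_with_loop_info.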

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-fold.cc (fold_using_range::range_of_phi): Only
query SCEV for integers.
(fold_using_range::range_of_ssa_name_with_loop_info): Remove
irange check.
---
 gcc/gimple-range-fold.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index 6f907df5bf5..7665c954f2b 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -853,12 +853,14 @@ fold_using_range::range_of_phi (vrange &r, gphi *phi, 
fur_source &src)
   }
 
   // If SCEV is available, query if this PHI has any knonwn values.
-  if (scev_initialized_p () && !POINTER_TYPE_P (TREE_TYPE (phi_def)))
+  if (scev_initialized_p ()
+  && !POINTER_TYPE_P (TREE_TYPE (phi_def))
+  && irange::supports_p (TREE_TYPE (phi_def)))
 {
-  value_range loop_range;
   class loop *l = loop_containing_stmt (phi);
   if (l && loop_outer (l))
{
+ int_range_max loop_range;
  range_of_ssa_name_with_loop_info (loop_range, phi_def, l, phi, src);
  if (!loop_range.varying_p ())
{
@@ -1337,9 +1339,7 @@ fold_using_range::range_of_ssa_name_with_loop_info 
(irange &r, tree name,
 {
   gcc_checking_assert (TREE_CODE (name) == SSA_NAME);
   tree min, max, type = TREE_TYPE (name);
-  // FIXME: Remove the supports_p() once all this can handle floats, etc.
-  if (irange::supports_p (type)
-  && bounds_of_var_in_loop (&min, &max, src.query (), l, phi, name))
+  if (bounds_of_var_in_loop (&min, &max, src.query (), l, phi, name))
 {
   if (TREE_CODE (min) != INTEGER_CST)
{
-- 
2.37.1



[COMMITTED] const_tree conversion of vrange::supports_*

2022-07-31 Thread Aldy Hernandez via Gcc-patches
Make all vrange::supports_*_p methods const_tree as they can end up
being called from functions that are const_tree.
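
A minimal sketch of why the const qualification matters, using hypothetical stand-in types (`node`, `tree_t` and `const_tree_t` are not the real GCC typedefs):

```cpp
// Stand-ins for a node type and its mutable/const pointer typedefs.
struct node { int code; };
typedef node *tree_t;
typedef const node *const_tree_t;

// A predicate that needlessly demands a mutable pointer.
inline bool supports_mutable (tree_t t) { return t->code == 1; }

// The same predicate taking a pointer-to-const.
inline bool supports_const (const_tree_t t) { return t->code == 1; }

// A caller that only holds a pointer-to-const.
inline bool caller (const_tree_t t)
{
  // supports_mutable (t);   // would not compile: discards the const qualifier
  return supports_const (t); // fine
}
```

A mutable pointer converts implicitly to a pointer-to-const, so existing callers that pass a non-const pointer keep working after such a change.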

Tested on x86-64 Linux.

gcc/ChangeLog:

* value-range.cc (vrange::supports_type_p): Use const_tree.
(irange::supports_type_p): Same.
(frange::supports_type_p): Same.
* value-range.h (Value_Range::supports_type_p): Same.
(irange::supports_p): Same.
---
 gcc/value-range.cc |  6 +++---
 gcc/value-range.h  | 16 
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 2923f4f5a0e..7adbf55c6a6 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -105,7 +105,7 @@ vrange::type () const
 }
 
 bool
-vrange::supports_type_p (tree) const
+vrange::supports_type_p (const_tree) const
 {
   return false;
 }
@@ -229,7 +229,7 @@ vrange::dump (FILE *file) const
 }
 
 bool
-irange::supports_type_p (tree type) const
+irange::supports_type_p (const_tree type) const
 {
   return supports_p (type);
 }
@@ -416,7 +416,7 @@ frange::operator== (const frange &src) const
 }
 
 bool
-frange::supports_type_p (tree type) const
+frange::supports_type_p (const_tree type) const
 {
   return supports_p (type);
 }
diff --git a/gcc/value-range.h b/gcc/value-range.h
index e43fbe30f27..c6ab955c407 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -78,7 +78,7 @@ public:
   virtual void accept (const class vrange_visitor &v) const = 0;
   virtual void set (tree, tree, value_range_kind = VR_RANGE);
   virtual tree type () const;
-  virtual bool supports_type_p (tree type) const;
+  virtual bool supports_type_p (const_tree type) const;
   virtual void set_varying (tree type);
   virtual void set_undefined ();
   virtual bool union_ (const vrange &);
@@ -122,8 +122,8 @@ public:
   virtual void set_undefined () override;
 
   // Range types.
-  static bool supports_p (tree type);
-  virtual bool supports_type_p (tree type) const override;
+  static bool supports_p (const_tree type);
+  virtual bool supports_type_p (const_tree type) const override;
   virtual tree type () const override;
 
   // Iteration over sub-ranges.
@@ -336,7 +336,7 @@ class frange : public vrange
 public:
   frange ();
   frange (const frange &);
-  static bool supports_p (tree type)
+  static bool supports_p (const_tree type)
   {
 // Disabled until floating point range-ops come live.
 return 0 && SCALAR_FLOAT_TYPE_P (type);
@@ -347,7 +347,7 @@ public:
   virtual void set_undefined () override;
   virtual bool union_ (const vrange &) override;
   virtual bool intersect (const vrange &) override;
-  virtual bool supports_type_p (tree type) const override;
+  virtual bool supports_type_p (const_tree type) const override;
   virtual void accept (const vrange_visitor &v) const override;
   frange& operator= (const frange &);
   bool operator== (const frange &) const;
@@ -457,7 +457,7 @@ public:
   operator vrange &();
   operator const vrange &() const;
   void dump (FILE *) const;
-  static bool supports_type_p (tree type);
+  static bool supports_type_p (const_tree type);
 
   // Convenience methods for vrange compatability.
   void set (tree min, tree max, value_range_kind kind = VR_RANGE)
@@ -588,7 +588,7 @@ Value_Range::operator const vrange &() const
 // Return TRUE if TYPE is supported by the vrange infrastructure.
 
 inline bool
-Value_Range::supports_type_p (tree type)
+Value_Range::supports_type_p (const_tree type)
 {
   return irange::supports_p (type) || frange::supports_p (type);
 }
@@ -730,7 +730,7 @@ irange::nonzero_p () const
 }
 
 inline bool
-irange::supports_p (tree type)
+irange::supports_p (const_tree type)
 {
   return INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type);
 }
-- 
2.37.1



[COMMITTED] Cleanups to frange.

2022-07-31 Thread Aldy Hernandez via Gcc-patches
These are some assorted cleanups to the frange class to make it easier
to drop in an implementation with FP endpoints:

* frange::set() had some asserts limiting the type of arguments
  passed.  There's no reason why we can't handle all the variants.
  If worst comes to worst, we can always return a VARYING, which is
  conservative and correct.

* frange::normalize_kind() now returns a boolean that can be used in
  union and intersection to indicate that the range changed.

* Implement vrp_val_max and vrp_val_min for floats.  Also, move them
  earlier in the header file so frange can use them.
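
The second point can be sketched with a toy range that tracks a single tri-state property. This is only an illustration under simplifying assumptions, not the frange implementation: `ToyRange`, `Kind` and `Tri` are hypothetical names, only one property is modelled, and the undefined state is left out.

```cpp
enum class Kind { Range, Varying };
enum class Tri { Yes, No, Varying };  // property known-yes, known-no, unknown

struct ToyRange {
  Kind kind = Kind::Varying;
  Tri nan = Tri::Varying;

  // Keep a single representation per range.  Return true if the kind changed.
  bool normalize_kind ()
  {
    if (kind == Kind::Range && nan == Tri::Varying)
      {
        kind = Kind::Varying;   // no known properties: drop to varying
        return true;
      }
    if (kind == Kind::Varying && nan != Tri::Varying)
      {
        kind = Kind::Range;     // a varying with a property is a range
        return true;
      }
    return false;
  }

  // Union keeps a property only if both sides agree on it.
  // Return true if *this changed, folding in normalize_kind's result.
  bool union_ (const ToyRange &other)
  {
    Tri merged = (nan == other.nan) ? nan : Tri::Varying;
    bool changed = merged != nan;
    nan = merged;
    changed |= normalize_kind ();
    return changed;
  }
};
```

Returning the flag from normalize_kind means the caller does not have to snapshot the range before the operation and compare afterwards.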

Tested on x86-64 Linux.

gcc/ChangeLog:

* value-range.cc (tree_compare): New.
(frange::set): Make more general.
(frange::normalize_kind): Cleanup and return bool.
(frange::union_): Use normalize_kind return value.
(frange::intersect): Same.
(frange::verify_range): Remove unnecessary else.
* value-range.h (vrp_val_max): Move before frange class.
(vrp_val_min): Same.
(frange::frange): Remove set to m_type.
---
 gcc/value-range.cc | 102 +
 gcc/value-range.h  |  70 ++-
 2 files changed, 105 insertions(+), 67 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 7adbf55c6a6..dc06f8b0078 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -260,66 +260,93 @@ frange::accept (const vrange_visitor &v) const
   v.visit (*this);
 }
 
-// Setter for franges.  Currently only singletons are supported.
+// Helper function to compare floats.  Returns TRUE if op1 .CODE. op2
+// is nonzero.
+
+static inline bool
+tree_compare (tree_code code, tree op1, tree op2)
+{
+  return !integer_zerop (fold_build2 (code, integer_type_node, op1, op2));
+}
+
+// Setter for franges.
 
 void
 frange::set (tree min, tree max, value_range_kind kind)
 {
-  gcc_checking_assert (kind == VR_RANGE);
-  gcc_checking_assert (operand_equal_p (min, max));
   gcc_checking_assert (TREE_CODE (min) == REAL_CST);
+  gcc_checking_assert (TREE_CODE (max) == REAL_CST);
+
+  if (kind == VR_UNDEFINED)
+{
+  set_undefined ();
+  return;
+}
+
+  // Treat VR_ANTI_RANGE and VR_VARYING as varying.
+  if (kind != VR_RANGE)
+{
+  set_varying (TREE_TYPE (min));
+  return;
+}
 
   m_kind = kind;
   m_type = TREE_TYPE (min);
 
-  REAL_VALUE_TYPE *const cst = TREE_REAL_CST_PTR (min);
-  if (real_isnan (cst))
-m_props.nan_set_yes ();
-  else
-m_props.nan_set_no ();
-
-  if (real_isinf (cst))
+  // Mark NANness.
+  if (real_isnan (TREE_REAL_CST_PTR (min))
+  || real_isnan (TREE_REAL_CST_PTR (max)))
 {
-  if (real_isneg (cst))
-   {
- m_props.inf_set_no ();
- m_props.ninf_set_yes ();
-   }
-  else
-   {
- m_props.inf_set_yes ();
- m_props.ninf_set_no ();
-   }
+  gcc_checking_assert (operand_equal_p (min, max));
+  m_props.nan_set_yes ();
 }
   else
+m_props.nan_set_no ();
+
+  bool is_min = vrp_val_is_min (min);
+  bool is_max = vrp_val_is_max (max);
+
+  // Mark when the endpoints can't be INF.
+  if (!is_min)
+m_props.ninf_set_no ();
+  if (!is_max)
+m_props.inf_set_no ();
+
+  // Mark when the endpoints are definitely INF.
+  if (operand_equal_p (min, max))
 {
-  m_props.inf_set_no ();
-  m_props.ninf_set_no ();
+  if (is_min)
+   m_props.ninf_set_yes ();
+  else if (is_max)
+   m_props.inf_set_yes ();
 }
 
+  // Check for swapped ranges.
+  gcc_checking_assert (m_props.nan_yes_p ()
+  || tree_compare (LE_EXPR, min, max));
+
   if (flag_checking)
 verify_range ();
 }
 
-// Normalize range to VARYING or UNDEFINED, or vice versa.
+// Normalize range to VARYING or UNDEFINED, or vice versa.  Return
+// TRUE if anything changed.
 //
 // A range with no known properties can be dropped to VARYING.
 // Similarly, a VARYING with any properties should be dropped to a
 // VR_RANGE.  Normalizing ranges upon changing them ensures there is
 // only one representation for a given range.
 
-void
+bool
 frange::normalize_kind ()
 {
   if (m_kind == VR_RANGE)
 {
   // No FP properties set means varying.
-  if (m_props.nan_varying_p ()
- && m_props.inf_varying_p ()
- && m_props.ninf_varying_p ())
+  if (m_props.varying_p ())
{
  set_varying (m_type);
- return;
+ return true;
}
   // Undefined is viral.
   if (m_props.nan_undefined_p ()
@@ -327,17 +354,19 @@ frange::normalize_kind ()
  || m_props.ninf_undefined_p ())
{
  set_undefined ();
- return;
+ return true;
}
 }
   else if (m_kind == VR_VARYING)
 {
   // If a VARYING has any FP properties, it's no longer VARYING.
-  if (!m_props.nan_varying_p ()
- || !m_props.inf_varying_p ()
- || !m_props.ninf_varying_p ())
-   m_kind = VR_RANGE;
+  

[PATCH V2] libcpp: Optimize #pragma once with a hash table [PR58770]

2022-07-31 Thread Paul Hollinsky via Gcc-patches
Rather than traversing the all_files linked list for every include,
this factors out the quick idempotency checks (modification time
and size) to be the keys in a hash table so we can find matching
files quickly.

The hash table value type is a linked list, in case more than one
file matches the quick check.

The table is only built if a once-only file is seen, so include
guard performance is not affected.
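
The data structure described above can be sketched with standard containers. This is a simplified model, not the libcpp code (which uses libiberty's htab and a hand-rolled list): `FileInfo`, `QuickKey`, `QuickKeyHash` and `OnceTable` are hypothetical names, and file contents are held in memory as a string for the sake of the example.

```cpp
#include <cstddef>
#include <ctime>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a file record.
struct FileInfo {
  std::string path;
  std::time_t mtime;
  long size;
  std::string contents;
};

// The cheap attributes used as the hash key.
struct QuickKey {
  std::time_t mtime;
  long size;
  bool operator== (const QuickKey &o) const
  { return mtime == o.mtime && size == o.size; }
};

struct QuickKeyHash {
  std::size_t operator() (const QuickKey &k) const
  {
    std::hash<long long> h;
    return h ((long long) k.mtime) * 31u ^ h ((long long) k.size);
  }
};

// Maps (mtime, size) to the list of once-only files sharing those values.
// Stores pointers, so the FileInfo objects must outlive the table.
class OnceTable {
  std::unordered_map<QuickKey, std::vector<const FileInfo *>, QuickKeyHash>
    buckets_;

public:
  void add (const FileInfo &f)
  {
    QuickKey key{f.mtime, f.size};
    buckets_[key].push_back (&f);
  }

  // True if a different, already-recorded file has identical contents.
  // Only files with the same (mtime, size) are compared in full.
  bool seen_same_contents (const FileInfo &f) const
  {
    QuickKey key{f.mtime, f.size};
    auto it = buckets_.find (key);
    if (it == buckets_.end ())
      return false;
    for (const FileInfo *c : it->second)
      if (c != &f && c->contents == f.contents)
        return true;
    return false;
  }
};
```

The expensive content comparison runs only against files in the same bucket, so the cost per include depends on the bucket length rather than on the total number of files seen.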

My laptop would previously complete Ricardo's benchmark from the
PR in ~1.1s using #pragma once, and ~0.35s using include guards.

After this change, both benchmarks now complete in ~0.35s. I did
have to randomize the modification dates on the benchmark headers
so the files did not all end up in the same hash table list, but
that would likely not come up outside of the contrived benchmark.

I bootstrapped and ran the testsuite on x86_64 Darwin, as well as
ppc64le and aarch64 Linux.

libcpp/ChangeLog:

PR preprocessor/58770
* internal.h: Add hash table for #pragma once
* files.cc: Optimize #pragma once with the hash table

Signed-off-by: Paul Hollinsky 
---
 libcpp/files.cc   | 116 +++---
 libcpp/internal.h |   3 ++
 2 files changed, 112 insertions(+), 7 deletions(-)

diff --git a/libcpp/files.cc b/libcpp/files.cc
index 24208f7b0f8..d4ffd77578e 100644
--- a/libcpp/files.cc
+++ b/libcpp/files.cc
@@ -167,6 +167,33 @@ struct file_hash_entry_pool
   struct cpp_file_hash_entry pool[FILE_HASH_POOL_SIZE];
 };
 
+/* A set of attributes designed to quickly identify obviously different files
+   in a hashtable.  Just in case there are collisions, we still maintain a
+   list.  These sub-lists can then be checked for #pragma once rather than
+   interating through all_files.  */
+struct file_quick_idempotency_attrs
+{
+  file_quick_idempotency_attrs(const _cpp_file *f)
+: mtime(f->st.st_mtime), size(f->st.st_size) {}
+
+  time_t mtime;
+  off_t size;
+
+  static hashval_t hash (/* _cpp_file* */ const void *p);
+};
+
+/* Sub-list of very similar files kept in a hashtable to check for #pragma
+   once.  */
+struct file_sublist
+{
+  _cpp_file *f;
+  file_sublist *next;
+
+  static int eq (/* _cpp_file* */ const void *p,
+/* file_sublist* */ const void *q);
+  static void del (/* file_sublist* */ void *p);
+};
+
 static bool open_file (_cpp_file *file);
 static bool pch_open_file (cpp_reader *pfile, _cpp_file *file,
   bool *invalid_pch);
@@ -849,17 +876,17 @@ has_unique_contents (cpp_reader *pfile, _cpp_file *file, 
bool import,
   if (!pfile->seen_once_only)
 return true;
 
-  /* We may have read the file under a different name.  Look
- for likely candidates and compare file contents to be sure.  */
-  for (_cpp_file *f = pfile->all_files; f; f = f->next_file)
+  /* We may have read the file under a different name.  We've kept
+ similar looking files in this lists under this hash table, so
+ check those more thoroughly.  */
+  void* ent = htab_find(pfile->pragma_once_files, file);
+  for (file_sublist *e = static_cast<file_sublist *> (ent); e; e = e->next)
 {
+  _cpp_file *f = e->f;
   if (f == file)
continue; /* It'sa me!  */
 
-  if ((import || f->once_only)
- && f->err_no == 0
- && f->st.st_mtime == file->st.st_mtime
- && f->st.st_size == file->st.st_size)
+  if ((import || f->once_only) && f->err_no == 0)
{
  _cpp_file *ref_file;
 
@@ -895,6 +922,38 @@ has_unique_contents (cpp_reader *pfile, _cpp_file *file, 
bool import,
   return true;
 }
 
+/* Add the given file to the #pragma once table so it can be
+   quickly identified and excluded the next time it's seen.  */
+static void
+update_pragma_once_table (cpp_reader *pfile, _cpp_file *file)
+{
+  void **slot = htab_find_slot (pfile->pragma_once_files, file, INSERT);
+  if (slot)
+{
+  if (!*slot)
+   *slot = xcalloc (1, sizeof (file_sublist));
+
+  file_sublist *e = static_cast<file_sublist *> (*slot);
+  while (e->f)
+   {
+ if (!e->next)
+   {
+ void *new_sublist = xcalloc(1, sizeof (file_sublist));
+ e->next = static_cast<file_sublist *> (new_sublist);
+   }
+ e = e->next;
+   }
+
+  e->f = file;
+}
+  else
+{
+  cpp_error (pfile, CPP_DL_ERROR,
+"Unable to create #pragma once table space for %s",
+_cpp_get_file_name (file));
+}
+}
+
 /* Place the file referenced by FILE into a new buffer on the buffer
stack if possible.  Returns true if a buffer is stacked.  Use LOC
for any diagnostics.  */
@@ -950,6 +1009,9 @@ _cpp_stack_file (cpp_reader *pfile, _cpp_file *file, 
include_type type,
   if (!has_unique_contents (pfile, file, type == IT_IMPORT, loc))
return false;
 
+  if (pfile->seen_once_only && file->once_only)
+   update_pragma_once_table (pfile, file);
+
   if (pfile->buffer && file->dir)
sysp = MAX (pfile->buffer->sysp, file->dir->sysp);
 

Re: [PATCH] libsanitizer: Cherry-pick 2bfb0fcb51510f22723c8cdfefe from upstream

2022-07-31 Thread Martin Liška
On 7/29/22 08:38, Dimitrije Milosevic wrote:
> Thanks Martin! I'm sending out the output from git format-patch as an 
> attachment to this email.

You're welcome, pushed as r13-1909-g1efeaf99bd8bdf.

Cheers,
Martin


Re: ICE after folding svld1rq to vec_perm_expr during forwprop

2022-07-31 Thread Prathamesh Kulkarni via Gcc-patches
On Thu, 21 Jul 2022 at 12:21, Richard Biener  wrote:
>
> On Wed, Jul 20, 2022 at 5:36 PM Prathamesh Kulkarni
>  wrote:
> >
> > On Mon, 18 Jul 2022 at 11:57, Richard Biener  
> > wrote:
> > >
> > > On Fri, Jul 15, 2022 at 3:49 PM Prathamesh Kulkarni
> > >  wrote:
> > > >
> > > > On Thu, 14 Jul 2022 at 17:22, Richard Sandiford
> > > >  wrote:
> > > > >
> > > > > Richard Biener  writes:
> > > > > > On Thu, Jul 14, 2022 at 9:55 AM Prathamesh Kulkarni
> > > > > >  wrote:
> > > > > >>
> > > > > >> On Wed, 13 Jul 2022 at 12:22, Richard Biener 
> > > > > >>  wrote:
> > > > > >> >
> > > > > >> > On Tue, Jul 12, 2022 at 9:12 PM Prathamesh Kulkarni via 
> > > > > >> > Gcc-patches
> > > > > >> >  wrote:
> > > > > >> > >
> > > > > >> > > Hi Richard,
> > > > > >> > > For the following test:
> > > > > >> > >
> > > > > >> > > svint32_t f2(int a, int b, int c, int d)
> > > > > >> > > {
> > > > > >> > >   int32x4_t v = (int32x4_t) {a, b, c, d};
> > > > > >> > >   return svld1rq_s32 (svptrue_b8 (), &v[0]);
> > > > > >> > > }
> > > > > >> > >
> > > > > >> > > The compiler emits following ICE with -O3 -mcpu=generic+sve:
> > > > > >> > > foo.c: In function ‘f2’:
> > > > > >> > > foo.c:4:11: error: non-trivial conversion in 
> > > > > >> > > ‘view_convert_expr’
> > > > > >> > > 4 | svint32_t f2(int a, int b, int c, int d)
> > > > > >> > >   |   ^~
> > > > > >> > > svint32_t
> > > > > >> > > __Int32x4_t
> > > > > >> > > _7 = VIEW_CONVERT_EXPR<__Int32x4_t>(_8);
> > > > > >> > > during GIMPLE pass: forwprop
> > > > > >> > > dump file: foo.c.109t.forwprop2
> > > > > >> > > foo.c:4:11: internal compiler error: verify_gimple failed
> > > > > >> > > 0xfda04a verify_gimple_in_cfg(function*, bool)
> > > > > >> > > ../../gcc/gcc/tree-cfg.cc:5568
> > > > > >> > > 0xe9371f execute_function_todo
> > > > > >> > > ../../gcc/gcc/passes.cc:2091
> > > > > >> > > 0xe93ccb execute_todo
> > > > > >> > > ../../gcc/gcc/passes.cc:2145
> > > > > >> > >
> > > > > >> > > This happens because, after folding svld1rq_s32 to 
> > > > > >> > > vec_perm_expr, we have:
> > > > > >> > >   int32x4_t v;
> > > > > >> > >   __Int32x4_t _1;
> > > > > >> > >   svint32_t _9;
> > > > > >> > >   vector(4) int _11;
> > > > > >> > >
> > > > > >> > >:
> > > > > >> > >   _1 = {a_3(D), b_4(D), c_5(D), d_6(D)};
> > > > > >> > >   v_12 = _1;
> > > > > >> > >   _11 = v_12;
> > > > > >> > >   _9 = VEC_PERM_EXPR <_11, _11, { 0, 1, 2, 3, ... }>;
> > > > > >> > >   return _9;
> > > > > >> > >
> > > > > >> > > During forwprop, simplify_permutation simplifies vec_perm_expr 
> > > > > >> > > to
> > > > > >> > > view_convert_expr,
> > > > > >> > > and the end result becomes:
> > > > > >> > >   svint32_t _7;
> > > > > >> > >   __Int32x4_t _8;
> > > > > >> > >
> > > > > >> > > ;;   basic block 2, loop depth 0
> > > > > >> > > ;;pred:   ENTRY
> > > > > >> > >   _8 = {a_2(D), b_3(D), c_4(D), d_5(D)};
> > > > > >> > >   _7 = VIEW_CONVERT_EXPR<__Int32x4_t>(_8);
> > > > > >> > >   return _7;
> > > > > >> > > ;;succ:   EXIT
> > > > > >> > >
> > > > > > >> > > which causes the error during verify_gimple since 
> > > > > >> > > VIEW_CONVERT_EXPR
> > > > > >> > > has incompatible types (svint32_t, int32x4_t).
> > > > > >> > >
> > > > > >> > > The attached patch disables simplification of VEC_PERM_EXPR
> > > > > >> > > in simplify_permutation, if lhs and rhs have non compatible 
> > > > > >> > > types,
> > > > > >> > > which resolves ICE, but am not sure if it's the correct 
> > > > > >> > > approach ?
> > > > > >> >
> > > > > >> > It for sure papers over the issue.  I think the error happens 
> > > > > >> > earlier,
> > > > > >> > the V_C_E should have been built with the type of the 
> > > > > >> > VEC_PERM_EXPR
> > > > > >> > which is the type of the LHS.  But then you probably run into the
> > > > > >> > different sizes ICE (VLA vs constant size).  I think for this 
> > > > > >> > case you
> > > > > >> > want a BIT_FIELD_REF instead of a VIEW_CONVERT_EXPR,
> > > > > >> > selecting the "low" part of the VLA vector.
> > > > > >> Hi Richard,
> > > > > >> Sorry I don't quite follow. In this case, we use VEC_PERM_EXPR to
> > > > > >> represent dup operation
> > > > > >> from fixed width to VLA vector. I am not sure how folding it to
> > > > > >> BIT_FIELD_REF will work.
> > > > > >> Could you please elaborate ?
> > > > > >>
> > > > > >> Also, the issue doesn't seem restricted to this case.
> > > > > >> The following test case also ICE's during forwprop:
> > > > > >> svint32_t foo()
> > > > > >> {
> > > > > >>   int32x4_t v = (int32x4_t) {1, 2, 3, 4};
> > > > > >>   svint32_t v2 = svld1rq_s32 (svptrue_b8 (), &v[0]);
> > > > > >>   return v2;
> > > > > >> }
> > > > > >>
> > > > > >> foo2.c: In function ‘foo’:
> > > > > >> foo2.c:9:1: error: non-trivial conversion in ‘vector_cst’
> > > > > >> 9 | }
> > > > > >>   | ^
> > > > > >> svint32_t
> > > > > >> int32x4_t
> > > > > >> v2_4 = { 1, 2, 3, 4 };
> > > > > >>
> > > > > >> because simplify

Ping^2 [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-07-31 Thread HAO CHEN GUI via Gcc-patches
Hi,
   Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
Thanks.
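
For context on the quoted description below, the C99 NaN behaviour of fmin can be contrasted with a plain comparison-based minimum. This is a small illustration only; `naive_min` is a hypothetical helper and is not meant to model GCC's smin RTL code exactly.

```cpp
#include <cmath>

// Comparison-based minimum: the result depends on which operand is NaN,
// because every ordered comparison with NaN is false.
inline double naive_min (double a, double b)
{
  return a < b ? a : b;
}

// C99 fmin: if exactly one operand is a quiet NaN, the other is returned.
inline double c99_min (double a, double b)
{
  return std::fmin (a, b);
}
```

With a quiet NaN as the second operand, `naive_min (1.0, NaN)` yields NaN while `c99_min (1.0, NaN)` yields 1.0, assuming default floating-point options (no -ffast-math or -ffinite-math-only).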


On 4/7/2022 2:32 PM, HAO CHEN GUI wrote:
> Hi,
>Gentle ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
> Thanks.
> 
> On 24/6/2022 10:02 AM, HAO CHEN GUI wrote:
>> Hi,
>>   This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000.
>> Tests show that outputs of xs[min/max]dp are consistent with the standard
>> of C99 fmin/max.
>>
>>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
>> of smin/max. So the builtins always generate xs[min/max]dp on all
>> platforms.
>>
>>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
>> Is this okay for trunk? Any recommendations? Thanks a lot.
>>
>> ChangeLog
>> 2022-06-24 Haochen Gui 
>>
>> gcc/
>>  PR target/103605
>>  * config/rs6000/rs6000.md (FMINMAX): New.
>>  (minmax_op): New.
>>  (f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
>>  * config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
>>  pattern to fmaxdf3.
>>  (__builtin_vsx_xsmindp): Set pattern to fmindf3.
>>
>> gcc/testsuite/
>>  PR target/103605
>>  * gcc.dg/powerpc/pr103605.c: New.
>>
>>
>> patch.diff
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
>> b/gcc/config/rs6000/rs6000-builtins.def
>> index f4a9f24bcc5..8b735493b40 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1613,10 +1613,10 @@
>>  XSCVSPDP vsx_xscvspdp {}
>>
>>const double __builtin_vsx_xsmaxdp (double, double);
>> -XSMAXDP smaxdf3 {}
>> +XSMAXDP fmaxdf3 {}
>>
>>const double __builtin_vsx_xsmindp (double, double);
>> -XSMINDP smindf3 {}
>> +XSMINDP fmindf3 {}
>>
>>const double __builtin_vsx_xsrdpi (double);
>>  XSRDPI vsx_xsrdpi {}
>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>> index bf85baa5370..ae0dd98f0f9 100644
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -158,6 +158,8 @@ (define_c_enum "unspec"
>> UNSPEC_HASHCHK
>> UNSPEC_XXSPLTIDP_CONST
>> UNSPEC_XXSPLTIW_CONST
>> +   UNSPEC_FMAX
>> +   UNSPEC_FMIN
>>])
>>
>>  ;;
>> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s3_fpr"
>>DONE;
>>  })
>>
>> +
>> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
>> +
>> +(define_int_attr  minmax_op [(UNSPEC_FMAX "max")
>> + (UNSPEC_FMIN "min")])
>> +
>> +(define_insn "f<minmax_op><mode>3"
>> +  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
>> +(unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
>> +  (match_operand:SFDF 2 "vsx_register_operand" "wa")]
>> + FMINMAX))]
>> +  "TARGET_VSX && !flag_finite_math_only"
>> +  "xs<minmax_op>dp %x0,%x1,%x2"
>> +  [(set_attr "type" "fp")]
>> +)
>> +
>>  (define_expand "movcc"
>> [(set (match_operand:GPR 0 "gpc_reg_operand")
>>   (if_then_else:GPR (match_operand 1 "comparison_operator")
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr103605.c
>> new file mode 100644
>> index 000..1c938d40e61
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
>> @@ -0,0 +1,37 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target powerpc_vsx_ok } */
>> +/* { dg-options "-O2 -mvsx" } */
>> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
>> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
>> +
>> +#include <math.h>
>> +
>> +double test1 (double d0, double d1)
>> +{
>> +  return fmin (d0, d1);
>> +}
>> +
>> +float test2 (float d0, float d1)
>> +{
>> +  return fmin (d0, d1);
>> +}
>> +
>> +double test3 (double d0, double d1)
>> +{
>> +  return fmax (d0, d1);
>> +}
>> +
>> +float test4 (float d0, float d1)
>> +{
>> +  return fmax (d0, d1);
>> +}
>> +
>> +double test5 (double d0, double d1)
>> +{
>> +  return __builtin_vsx_xsmindp (d0, d1);
>> +}
>> +
>> +double test6 (double d0, double d1)
>> +{
>> +  return __builtin_vsx_xsmaxdp (d0, d1);
>> +}


Ping^2 [PATCH v2, rs6000] Use CC for BCD operations [PR100736]

2022-07-31 Thread HAO CHEN GUI via Gcc-patches
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597020.html
Thanks.

On 4/7/2022 2:33 PM, HAO CHEN GUI wrote:
> Hi,
>Gentle ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597020.html
> Thanks.
> 
> On 22/6/2022 4:26 PM, HAO CHEN GUI wrote:
>> Hi,
>>   This patch uses CC instead of CCFP for all BCD operations. Thus, infinite
>> math flag has no impact on BCD operations. To support BCD overflow and
>> invalid coding, an UNSPEC is defined to move the bit to a general register.
>> The patterns of condition branch and return with overflow bit are defined as
>> the UNSPEC and branch/return can be combined to one jump insn. The split
>> pattern of overflow bit extension is defined for optimization.
>>
>>   This patch also replaces bcdadd with bcdsub for BCD invaliding coding
>> expand.
>>
>> ChangeLog
>> 2022-06-22 Haochen Gui 
>>
>> gcc/
>>  PR target/100736
>>  * config/rs6000/altivec.md (BCD_TEST): Remove unordered.
>>  (bcd_): Replace CCFP with CC.
>>  (*bcd_test_): Replace CCFP with CC.  Generate
>>  condition insn with CC mode.
>>  (bcd_overflow_): New.
>>  (*bcdoverflow_): New.
>>  (*bcdinvalid_): Removed.
>>  (bcdinvalid_): Implement by UNSPEC_BCDSUB and UNSPEC_BCD_OVERFLOW.
>>  (nuun): New.
>>  (*overflow_cbranch): New.
>>  (*overflow_creturn): New.
>>  (*overflow_extendsidi): New.
>>  (bcdshift_v16qi): Replace CCFP with CC.
>>  (bcdmul10_v16qi): Likewise.
>>  (bcddiv10_v16qi): Likewise.
>>  (peephole for bcd_add/sub): Likewise.
>>  * config/rs6000/rs6000-builtins.def (__builtin_bcdadd_ov_v1ti): Set
>>  pattern to bcdadd_overflow_v1ti.
>>  (__builtin_bcdadd_ov_v16qi): Set pattern to bcdadd_overflow_v16qi.
>>  (__builtin_bcdsub_ov_v1ti): Set pattern to bcdsub_overflow_v1ti.
>>  (__builtin_bcdsub_ov_v16qi): Set pattern to bcdsub_overflow_v16qi.
>>
>> gcc/testsuite/
>>  PR target/100736
>>  * gcc.target/powerpc/bcd-4.c: Adjust number of bcdadd and bcdsub.
>>  Scan no cror insns.
>>
>> patch.diff
>> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
>> index efc8ae35c2e..26f131e61ea 100644
>> --- a/gcc/config/rs6000/altivec.md
>> +++ b/gcc/config/rs6000/altivec.md
>> @@ -4370,7 +4370,7 @@ (define_int_iterator UNSPEC_BCD_ADD_SUB [UNSPEC_BCDADD 
>> UNSPEC_BCDSUB])
>>  (define_int_attr bcd_add_sub [(UNSPEC_BCDADD "add")
>>(UNSPEC_BCDSUB "sub")])
>>
>> -(define_code_iterator BCD_TEST [eq lt le gt ge unordered])
>> +(define_code_iterator BCD_TEST [eq lt le gt ge])
>>  (define_mode_iterator VBCD [V1TI V16QI])
>>
>>  (define_insn "bcd_"
>> @@ -4379,7 +4379,7 @@ (define_insn "bcd_"
>>(match_operand:VBCD 2 "register_operand" "v")
>>(match_operand:QI 3 "const_0_to_1_operand" "n")]
>>   UNSPEC_BCD_ADD_SUB))
>> -   (clobber (reg:CCFP CR6_REGNO))]
>> +   (clobber (reg:CC CR6_REGNO))]
>>"TARGET_P8_VECTOR"
>>"bcd. %0,%1,%2,%3"
>>[(set_attr "type" "vecsimple")])
>> @@ -4389,9 +4389,9 @@ (define_insn "bcd_"
>>  ;; UNORDERED test on an integer type (like V1TImode) is not defined.  The 
>> type
>>  ;; probably should be one that can go in the VMX (Altivec) registers, so we
>>  ;; can't use DDmode or DFmode.
>> -(define_insn "*bcd_test_"
>> -  [(set (reg:CCFP CR6_REGNO)
>> -(compare:CCFP
>> +(define_insn "bcd_test_"
>> +  [(set (reg:CC CR6_REGNO)
>> +(compare:CC
>>   (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
>> (match_operand:VBCD 2 "register_operand" "v")
>> (match_operand:QI 3 "const_0_to_1_operand" "i")]
>> @@ -4408,8 +4408,8 @@ (define_insn "*bcd_test2_"
>>(match_operand:VBCD 2 "register_operand" "v")
>>(match_operand:QI 3 "const_0_to_1_operand" "i")]
>>   UNSPEC_BCD_ADD_SUB))
>> -   (set (reg:CCFP CR6_REGNO)
>> -(compare:CCFP
>> +   (set (reg:CC CR6_REGNO)
>> +(compare:CC
>>   (unspec:V2DF [(match_dup 1)
>> (match_dup 2)
>> (match_dup 3)]
>> @@ -4502,8 +4502,8 @@ (define_insn "vclrrb"
>> [(set_attr "type" "vecsimple")])
>>
>>  (define_expand "bcd__"
>> -  [(parallel [(set (reg:CCFP CR6_REGNO)
>> -   (compare:CCFP
>> +  [(parallel [(set (reg:CC CR6_REGNO)
>> +   (compare:CC
>>  (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
>>(match_operand:VBCD 2 "register_operand")
>>(match_operand:QI 3 "const_0_to_1_operand")]
>> @@ -4511,46 +4511,138 @@ (define_expand "bcd__"
>>  (match_dup 4)))
>>(clobber (match_scratch:VBCD 5))])
>> (set (match_operand:SI 0 "register_operand")
>> -(BCD_TEST:SI (reg:CCFP CR6_REGNO)
>> +(BCD_TEST:SI (reg:CC CR6_REGNO)
>>   (const_int 0)))]
>>"TARGET_P8_VECTOR"
>>  {
>>operands[4] = CONST

Ping [PATCH, rs6000] Add multiply-add expand pattern [PR103109]

2022-07-31 Thread HAO CHEN GUI via Gcc-patches
Hi,
   Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598744.html
Thanks

On 25/7/2022 1:11 PM, HAO CHEN GUI wrote:
> Hi,
>   This patch adds an expand and several insns for multiply-add with
> three 64bit operands.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-07-22  Haochen Gui  
> 
> gcc/
>   PR target/103109
>   * config/rs6000/rs6000.md (maddditi4): New pattern for
>   multiply-add.
>   (madddi4_lowpart): New.
>   (madddi4_lowpart_le): New.
>   (madddi4_highpart): New.
>   (madddi4_highpart_le): New.
> 
> gcc/testsuite/
>   PR target/103109
>   * gcc.target/powerpc/pr103109.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index c55ee7e171a..4f3b56e103e 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -3226,6 +3226,97 @@ (define_insn "*maddld4"
>"maddld %0,%1,%2,%3"
>[(set_attr "type" "mul")])
> 
> +(define_expand "maddditi4"
> +  [(set (match_operand:TI 0 "gpc_reg_operand")
> + (plus:TI
> +   (mult:TI (any_extend:TI
> +  (match_operand:DI 1 "gpc_reg_operand"))
> +(any_extend:TI
> +  (match_operand:DI 2 "gpc_reg_operand")))
> +   (any_extend:TI
> + (match_operand:DI 3 "gpc_reg_operand"))))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD"
> +{
> +  rtx op0_lo = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 8 : 0);
> +  rtx op0_hi = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 0 : 8);
> +
> +  if (BYTES_BIG_ENDIAN)
> +{
> +  emit_insn (gen_madddi4_lowpart (op0_lo, operands[1], operands[2],
> +  operands[3]));
> +  emit_insn (gen_madddi4_highpart (op0_hi, operands[1], operands[2],
> +   operands[3]));
> +}
> +  else
> +{
> +  emit_insn (gen_madddi4_lowpart_le (op0_lo, operands[1], operands[2],
> + operands[3]));
> +  emit_insn (gen_madddi4_highpart_le (op0_hi, operands[1], operands[2],
> +  operands[3]));
> +}
> +  DONE;
> +})
> +
> +(define_insn "madddi4_lowpart"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> + (subreg:DI
> +   (plus:TI
> + (mult:TI (any_extend:TI
> +(match_operand:DI 1 "gpc_reg_operand" "r"))
> +  (any_extend:TI
> +(match_operand:DI 2 "gpc_reg_operand" "r")))
> + (any_extend:TI
> +   (match_operand:DI 3 "gpc_reg_operand" "r")))
> +  8))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && BYTES_BIG_ENDIAN"
> +  "maddld %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
> +(define_insn "madddi4_lowpart_le"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> + (subreg:DI
> +   (plus:TI
> + (mult:TI (any_extend:TI
> +(match_operand:DI 1 "gpc_reg_operand" "r"))
> +  (any_extend:TI
> +(match_operand:DI 2 "gpc_reg_operand" "r")))
> + (any_extend:TI
> +   (match_operand:DI 3 "gpc_reg_operand" "r")))
> +  0))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && !BYTES_BIG_ENDIAN"
> +  "maddld %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
> +(define_insn "madddi4_highpart"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> + (subreg:DI
> +   (plus:TI
> + (mult:TI (any_extend:TI
> +(match_operand:DI 1 "gpc_reg_operand" "r"))
> +  (any_extend:TI
> +(match_operand:DI 2 "gpc_reg_operand" "r")))
> + (any_extend:TI
> +   (match_operand:DI 3 "gpc_reg_operand" "r")))
> +  0))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && BYTES_BIG_ENDIAN"
> +  "maddhd %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
> +(define_insn "madddi4_highpart_le"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> + (subreg:DI
> +   (plus:TI
> + (mult:TI (any_extend:TI
> +(match_operand:DI 1 "gpc_reg_operand" "r"))
> +  (any_extend:TI
> +(match_operand:DI 2 "gpc_reg_operand" "r")))
> + (any_extend:TI
> +   (match_operand:DI 3 "gpc_reg_operand" "r")))
> +  8))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && !BYTES_BIG_ENDIAN"
> +  "maddhd %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
>  (define_insn "udiv3"
>[(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>  (udiv:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103109.c b/gcc/testsuite/gcc.target/powerpc/pr103109.c
> new file mode 100644
> index 000..256e05d5677
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103109.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile { target { lp64 } } } */
> +/* { dg-
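A minimal C sketch of the arithmetic the expand pattern describes, assuming a compiler with the GCC `__int128` extension. The names `u128_parts`, `umadd128` and `smadd128` are hypothetical and do not come from the patch. The low doubleword is what the patch emits `maddld` for and the high doubleword is what it emits `maddhd` for; the low half is the same for the signed and unsigned variants.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two 64-bit halves of a 128-bit value.  */
typedef struct { uint64_t hi, lo; } u128_parts;

/* Unsigned a * b + c, computed in 128 bits and split into halves.  */
u128_parts umadd128 (uint64_t a, uint64_t b, uint64_t c)
{
  unsigned __int128 r = (unsigned __int128) a * b + c;
  u128_parts p = { (uint64_t) (r >> 64), (uint64_t) r };
  return p;
}

/* Signed a * b + c; all three operands are sign-extended to 128 bits.  */
u128_parts smadd128 (int64_t a, int64_t b, int64_t c)
{
  __int128 r = (__int128) a * b + c;
  u128_parts p = { (uint64_t) ((unsigned __int128) r >> 64), (uint64_t) r };
  return p;
}
```

For example, `umadd128 (UINT64_MAX, UINT64_MAX, UINT64_MAX)` yields a high half of `UINT64_MAX` and a low half of 0, because (2^64 - 1)^2 + (2^64 - 1) = 2^128 - 2^64.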

Ping [PATCH v3] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453]

2022-07-31 Thread HAO CHEN GUI via Gcc-patches
Hi,
   Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598685.html
Thanks.

On 22/7/2022 3:07 PM, HAO CHEN GUI wrote:
> Hi,
>   This patch creates a new function, change_pseudo_and_mask. If recog fails,
> the function converts a single pseudo into that pseudo ANDed with a mask when
> the outer operator is IOR/XOR/PLUS and the inner operator is ASHIFT or AND.
> The conversion helps the pattern match rotate-and-mask insns on some targets.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-07-22  Haochen Gui  
> 
> gcc/
>   PR target/93453
>   * combine.cc (change_pseudo_and_mask): New.
>   (recog_for_combine): If recog fails, try again with the pattern
>   modified by change_pseudo_and_mask.
>   * config/rs6000/rs6000.md (plus_ior_xor): Remove.
>   (anonymous split pattern for plus_ior_xor): Remove.
> 
> gcc/testsuite/
>   PR target/93453
>   * gcc.target/powerpc/pr93453-2.c: New.
>   * gcc.target/powerpc/rlwimi-2.c: Both 32/64 bit platforms generate the
>   same number of rlwimi.  Reset the counter.
> 
> patch.diff
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index a5fabf397f7..e1c1aa7da1c 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -11599,6 +11599,48 @@ change_zero_ext (rtx pat)
>return changed;
>  }
> 
> +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is
> +   ASHIFT/AND, convert a pseudo to pseudo AND with a mask if its nonzero_bits
> +   is less than its mode mask.  The nonzero_bits in later passes is not a
> +   superset of what is known in combine pass.  So an insn with nonzero_bits
> +   can't be recoged later.  */
> +static bool
> +change_pseudo_and_mask (rtx pat)
> +{
> +  rtx src = SET_SRC (pat);
> +  if ((GET_CODE (src) == IOR
> +   || GET_CODE (src) == XOR
> +   || GET_CODE (src) == PLUS)
> +  && (((GET_CODE (XEXP (src, 0)) == ASHIFT
> + || GET_CODE (XEXP (src, 0)) == AND)
> +&& REG_P (XEXP (src, 1)))))
> +{
> +  rtx reg = XEXP (src, 1);
> +  machine_mode mode = GET_MODE (reg);
> +  unsigned HOST_WIDE_INT nonzero = nonzero_bits (reg, mode);
> +  if (nonzero < GET_MODE_MASK (mode))
> + {
> +   int shift;
> +
> +   if (GET_CODE (XEXP (src, 0)) == ASHIFT)
> + shift = INTVAL (XEXP (XEXP (src, 0), 1));
> +   else
> + shift = ctz_hwi (INTVAL (XEXP (XEXP (src, 0), 1)));
> +
> +   if (shift > 0
> +   && (HOST_WIDE_INT_1U << shift) - 1 >= nonzero)
> + {
> +   unsigned HOST_WIDE_INT mask = (HOST_WIDE_INT_1U << shift) - 1;
> +   rtx x = gen_rtx_AND (mode, reg, GEN_INT (mask));
> +   SUBST (XEXP (SET_SRC (pat), 1), x);
> +   maybe_swap_commutative_operands (SET_SRC (pat));
> +   return true;
> + }
> + }
> +}
> +  return false;
> +}
> +
>  /* Like recog, but we receive the address of a pointer to a new pattern.
> We try to match the rtx that the pointer points to.
> If that fails, we may try to modify or replace the pattern,
> @@ -11646,7 +11688,10 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx *pnotes)
>   }
>   }
>else
> - changed = change_zero_ext (pat);
> + {
> +   changed = change_pseudo_and_mask (pat);
> +   changed |= change_zero_ext (pat);
> + }
>  }
>else if (GET_CODE (pat) == PARALLEL)
>  {
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 1367a2cb779..2bd6bd5f908 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -4207,24 +4207,6 @@ (define_insn_and_split "*rotl3_insert_3_"
>   (ior:GPR (and:GPR (match_dup 3) (match_dup 4))
>    (ashift:GPR (match_dup 1) (match_dup 2))))]
> 
> -(define_code_iterator plus_ior_xor [plus ior xor])
> -
> -(define_split
> -  [(set (match_operand:GPR 0 "gpc_reg_operand")
> - (plus_ior_xor:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand")
> -   (match_operand:SI 2 "const_int_operand"))
> -   (match_operand:GPR 3 "gpc_reg_operand")))]
> -  "nonzero_bits (operands[3], mode)
> -   < HOST_WIDE_INT_1U << INTVAL (operands[2])"
> -  [(set (match_dup 0)
> - (ior:GPR (and:GPR (match_dup 3)
> -   (match_dup 4))
> -  (ashift:GPR (match_dup 1)
> -  (match_dup 2))))]
> -{
> -  operands[4] = GEN_INT ((HOST_WIDE_INT_1U << INTVAL (operands[2])) - 1);
> -})
> -
>  (define_insn "*rotlsi3_insert_4"
>[(set (match_operand:SI 0 "gpc_reg_operand" "=r")
>   (ior:SI (and:SI (match_operand:SI 3 "gpc_reg_operand" "0")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr93453-2.c b/gcc/testsuite/gcc.target/powerpc/pr93453-2.c
> new file mode 100644
> index 000..a83a6511653
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-2.c
> @@ -0,0 +1,19
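A small C sketch of the identity behind change_pseudo_and_mask, under the assumption that the register's nonzero bits all lie below the shift amount. The helper names `low_mask` and `forms_agree` are hypothetical. When the low operand fits under the mask, PLUS, IOR and XOR all produce the same value as the explicit "AND with mask, then IOR" form that a rotate-and-insert pattern expects.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the bits below SHIFT (assumes 0 < SHIFT < 64).  */
uint64_t low_mask (unsigned shift)
{
  return (UINT64_C (1) << shift) - 1;
}

/* Return 1 when (a << shift) OP b is the same for OP in {+, |, ^}
   and equals the masked-insert form (a << shift) | (b & mask).  */
int forms_agree (uint64_t a, uint64_t b, unsigned shift)
{
  uint64_t hi = a << shift;
  uint64_t insert = hi | (b & low_mask (shift));
  return hi + b == insert && (hi | b) == insert && (hi ^ b) == insert;
}
```

With `shift` equal to 8, `b = 0xff` satisfies the condition and the forms agree, while `b = 0x1ff` has a bit at position 8 and the forms diverge. This is why the patch only rewrites when the mask covers every nonzero bit of the pseudo.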

Re: [PATCH] tree-optimization/105679 - disable backward threading of unlikely entry

2022-07-31 Thread Jeff Law via Gcc-patches




On 7/31/2022 1:17 PM, Iain Sandoe via Gcc-patches wrote:
> Hi Richi,
>
> > On 29 Jul 2022, at 09:54, Richard Biener via Gcc-patches wrote:
> >
> > The following makes the backward threader reject threads whose entry
> > edge is probably never executed according to the profile.  That in
> > particular, for the testcase, avoids threading the irq == 1 check
> > on the path where irq > 31, thereby avoiding spurious -Warray-bounds
> > diagnostics
>
> This breaks bootstrap on i686-darwin{9,17} with what looks like a valid
> warning (werrors on stage2)
>
> cc1plus  … -O2 -Wall … is enough to.
>
> I can repeat it on a cross from x86_64-darwin19, so I can probably reduce the
> .ii (it’s like 2M5 raw) and file a PR if you like - depends if the solution
> might be obvious to you …

I suspect what's happening here is by suppressing the jump thread we're
leaving an unexecutable path through the CFG in the IL.  The warning is
likely on that unexecutable path or at a join point where the
unexecutable path re-joins the main path through the CFG.


Jeff



Re: [PATCH 2/3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-07-31 Thread Tom Honermann via Gcc-patches

On 7/27/22 7:23 PM, Joseph Myers wrote:
> On Mon, 25 Jul 2022, Tom Honermann via Gcc-patches wrote:
>
> > This change provides new tests for the core language and compiler
> > dependent library changes adopted for C2X via WG14 N2653.
>
> I'd expect this patch also to add tests verifying that u8"" strings have
> the old type for C11 (unless there are existing such tests, but I don't
> see them).

Agreed, good catch. Thank you.

> > diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
> > new file mode 100644
> > index 000..37ea4c8926c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
> > @@ -0,0 +1,42 @@
> > +/* Test atomic_is_lock_free for char8_t.  */
> > +/* { dg-do run } */
> > +/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */
>
> I don't think _ISOC2X_SOURCE belongs in any GCC tests.

That was necessary because the first patch in this series omitted the
atomic_char8_t and ATOMIC_CHAR8_T_LOCK_FREE definitions unless one of
_GNU_SOURCE or _ISOC2X_SOURCE was defined. Per review of that first
patch, those conditions will be removed, so there will be no need to
define them here.

> > diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
> > new file mode 100644
> > index 000..a017b134817
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
> > @@ -0,0 +1,5 @@
> > +/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
> > +/* { dg-do run } */
> > +/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */
>
> Nor does _GNU_SOURCE (unless the test depends on glibc functionality
> that's only available with _GNU_SOURCE, but in that case you also need
> some effective-target conditionals to restrict it to appropriate glibc
> targets).

Ditto.

I'll post new patches shortly.

Tom.



Re: [PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-07-31 Thread Tom Honermann via Gcc-patches

On 7/31/22 11:05 AM, Lewis Hyatt wrote:
> On Sat, Jul 30, 2022 at 7:06 PM Tom Honermann via Gcc-patches wrote:
> > On 7/27/22 7:09 PM, Joseph Myers wrote:
> > > On Sun, 24 Jul 2022, Tom Honermann via Gcc-patches wrote:
> > >
> > > > Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
> > > > (see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
> > > > require that the target diagnostic option be enabled for the preprocessor
> > > > (see c_option_is_from_cpp_diagnostics).  This change modifies the
> > > > -Wc++20-compat option definition to register it as a preprocessor option
> > > > so that its associated diagnostics can be suppressed.  The changes also
> > >
> > > There are lots of C++ warning options, all of which should support pragma
> > > suppression regardless of whether they are relevant to the preprocessor or
> > > not.  Do they all need this kind of handling, or is it only -Wc++20-compat
> > > that has some kind of problem?
> >
> > I had only checked -Wc++20-compat when working on the patch.
> >
> > I did some spot checking now and confirmed that suppression works as
> > expected for C++ for at least the following warnings:
> > -Wuninitialized
> > -Warray-compare
> > -Wbool-compare
> > -Wtautological-compare
> > -Wterminate
> >
> > I don't know the diagnostic framework well. As best I can tell, this
> > issue is specific to the -Wc++20-compat option and when the particular
> > diagnostic is issued (e.g., during lexing as opposed to during parsing).
> > The following call chains appear to be relevant.
> > cp_lexer_new_main -> cp_lexer_handle_early_pragma ->
> > c_invoke_early_pragma_handler
> > cp_parser_* -> cp_parser_pragma -> c_invoke_pragma_handler
> > (where * might be "declaration", "toplevel_declaration",
> > "class_head", "objc_interstitial_code", ...)
> >
> > The -Wc++20-compat enabled warning regarding new keywords in C++20 is
> > issued from cp_lexer_get_preprocessor_token.
> >
> > Tom.
>
> I have been working on improving the handling of "#pragma GCC
> diagnostic" lately. The behavior for C++ changed since r13-1544
> (https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e46f4d7430c5210465791603735ab219ef263c51).
> I have some more comments about the patch's approach on the PR
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c44).
>
> "#pragma GCC diagnostic" formerly did not work in C++ at all, for
> diagnostics generated by libcpp, because C++ obtains all the tokens
> from libcpp first (including deferred pragmas), and then processes
> them afterward, too late to take effect for diagnostics that libcpp
> has already emitted. r13-1544 fixed this up by adding an early pragma
> handler, which runs as soon as a deferred pragma token is seen and
> handles diagnostic pragmas if they pertain to libcpp-controlled
> diagnostics. Non-libcpp diagnostics still need to be handled later,
> during parsing, or else they get processed too early and it leads to
> other problems. Basically, now each diagnostic pragma is handled as
> close in time as possible to the time the associated diagnostics might
> be generated.
>
> The early pragma handler determines that an option comes from libcpp,
> and so should be subject to early processing, if it was marked as such
> in the options definition file. Tom's patch points out that
> -Wc++20-compat needs to be handled early, and so marking it as a
> libcpp diagnostic in c-family/c.opt arranges for that to work as
> intended. Now one potential objection here is that -Wc++20-compat
> warnings are not technically generated by libcpp. They are generated
> by the C++ frontend immediately after lexing an identifier token from
> libcpp (cp_lexer_get_preprocessor_token()). But the distinction
> between these two steps is rather blurry and it seems logical to me,
> to denote this as a libcpp-related option. Also, the same is already
> done for -Wc++11-compat. Otherwise, we would need to add some new
> option property to indicate which ones need to be handled for pragmas
> at lexing time rather than parsing time.
>
> At the moment I don't see any other diagnostics issued from
> cp_lexer_get_preprocessor_token() that would need similar adjustments.
> Assuming the approach is OK, it might be nice to add a comment to that
> function, indicating that any diagnostics emitted there should be
> annotated as libcpp options in the .opt file?


Thank you for those details; I wasn't aware of that history.

If I'm interpreting your response correctly, it sounds like you agree 
with the direction of the patch.


If you like, I can add a comment as you suggested and re-post the patch. 
Perhaps:


diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
/* Store the next token from the preprocessor in *TOKEN.  Return true
   if we reach EOF.  If LEXER is NULL, assume we are handling an
   initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strings.
+
+   Diagnostics issued from this function must have their controlling option (if
+   any) in c.opt annotated as a libcpp option
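
For readers unfamiliar with the mechanism under discussion, here is a minimal sketch of a push/ignored/pop region. It is written in C with -Wunused-variable purely for illustration; the function name `compute` is hypothetical, and in the case discussed in this thread the option string would be "-Wc++20-compat" in a C++ translation unit. Whether a given warning is actually silenced depends on when the front end processes the pragma relative to when the diagnostic is issued, which is the timing question raised above.

```c
#include <assert.h>

/* Hypothetical function: the pragma region brackets the one declaration
   whose warning should be silenced, then restores the previous state.  */
int compute (void)
{
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"
  int scratch = 7;   /* unused on purpose; normally flagged under -Wall */
#pragma GCC diagnostic pop
  return 42;
}
```

The push/pop pair keeps the suppression local, so code after the region is diagnosed with the options given on the command line.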

[PATCH] configure: respect --with-build-time-tools [PR43301]

2022-07-31 Thread Eric Gallager via Gcc-patches
Hi, there's been a patch sitting in bug 43301 for over a decade that I
think still makes sense to apply, so I rebased it against current
trunk and found that it still applies. It just makes the configure
script respect the --with-build-time-tools flag. OK to commit?

ChangeLog:

PR bootstrap/43301
* configure: Regenerate.
* configure.ac: Respect --with-build-time-tools flag.


patch-configure_1.diff
Description: Binary data


Re: [PATCH] tree-optimization/105679 - disable backward threading of unlikely entry

2022-07-31 Thread Iain Sandoe via Gcc-patches
Hi Richi,

> On 29 Jul 2022, at 09:54, Richard Biener via Gcc-patches 
>  wrote:
> 
> The following makes the backward threader reject threads whose entry
> edge is probably never executed according to the profile.  That in
> particular, for the testcase, avoids threading the irq == 1 check
> on the path where irq > 31, thereby avoiding spurious -Warray-bounds
> diagnostics

This breaks bootstrap on i686-darwin{9,17} with what looks like a valid  
warning (werrors on stage2)

cc1plus  … -O2 -Wall … is enough to.

I can repeat it on a cross from x86_64-darwin19, so I can probably reduce the 
.ii (it’s like 2M5 raw) and file a PR if you like - depends if the solution 
might be obvious to you …

thanks
Iain



In file included from /src-local/gcc-master/gcc/hash-table.h:248,
 from /src-local/gcc-master/gcc/coretypes.h:486,
 from /src-local/gcc-master/gcc/tree-ssa-threadbackward.cc:22:
In member function ‘T& vec::operator[](unsigned int) [with T = 
basic_block_def*; A = va_heap]’,
inlined from ‘const T& vec::operator[](unsigned int) const [with T = 
basic_block_def*]’ at /src-local/gcc-master/gcc/vec.h:1486:20,
inlined from ‘bool back_threader_profitability::profitable_path_p(const 
vec&, tree, edge, bool*)’ at 
/src-local/gcc-master/gcc/tree-ssa-threadbackward.cc:781:37:
/src-local/gcc-master/gcc/vec.h:890:19: warning: array subscript 4294967294 is 
above array bounds of ‘basic_block_def* [1]’ [-Warray-bounds]
  890 |   return m_vecdata[ix];
  |  ~^
/src-local/gcc-master/gcc/vec.h: In member function ‘bool 
back_threader_profitability::profitable_path_p(const vec&, 
tree, edge, bool*)’:
/src-local/gcc-master/gcc/vec.h:635:5: note: while referencing 
‘vec::m_vecdata’
  635 |   T m_vecdata[1];
  | ^

=
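
A short C sketch of where a subscript such as 4294967294 can come from. It rests on an assumption, since the source at tree-ssa-threadbackward.cc:781 is not shown in this thread: that the index is computed as "length minus two" in 32-bit unsigned arithmetic. The function name `second_to_last_index` is hypothetical.

```c
#include <assert.h>

/* Hypothetical: index of the second-to-last element of a container
   holding LENGTH elements, computed in unsigned arithmetic.  For
   LENGTH < 2 the subtraction wraps around instead of going negative.  */
unsigned second_to_last_index (unsigned length)
{
  return length - 2;
}
```

With a 32-bit `unsigned`, a length of 0 gives 4294967294, the value in the warning. If the compiler keeps a path on which the vector is empty, it may report exactly this out-of-bounds subscript even though that path cannot execute.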

Re: [PATCH, v4] Fortran: detect blanks within literal constants in free-form mode [PR92805]

2022-07-31 Thread Harald Anlauf via Gcc-patches

Hi Mikael,

On 31.07.22 at 10:35, Mikael Morin wrote:
> On 30/07/2022 at 21:40, Harald Anlauf wrote:
> > Hi Mikael,
> >
> > On 30.07.22 at 10:28, Mikael Morin wrote:
> > > Meh! We killed one check for gfc_current_form but the other one is still
> > > there.
> > > OK, match_kind_param calls two functions that also gobble space, so
> > > there is work remaining here.
> > > So please make match_small_literal_constant and gfc_match_name
> > > space-gobbling wrappers around space-non-gobbling inner functions and
> > > call those inner functions instead in match_kind_param.
> >
> > well, here's the shortest solution I could come up with.
> > I added a new argument to 3 functions used in parsing that
> > controls the gobbling of whitespace.  We use this to handle
> > whitespace for numerical literals, while the parsing of string
> > literals remains as in the previous version of the patch.
> >
> > This version obviously ignores Thomas' request, as that would
> > require to treat gfc_match_char specially...
> >
> > Regtested again.  OK now?
> >
> > PR fortran/92805
> > * match.cc (gfc_match_small_literal_int): Make gobbling of leading
> > whitespace optional.
> > (gfc_match_name): Likewise.
> > (gfc_match_char): Likewise.
> > * match.h (gfc_match_small_literal_int): Adjust prototype.
> > (gfc_match_name): Likewise.
> > (gfc_match_char): Likewise.
> > * primary.cc (match_kind_param): Match small literal int or name
> > without gobbling whitespace.
> > (get_kind): Do not skip over blanks in free-form mode.
>
> I think the "in free-form mode" applied to the preceding patches but can
> be dropped now.
>
> > (match_string_constant): Likewise.
> >
> > diff --git a/gcc/fortran/match.cc b/gcc/fortran/match.cc
> > index 1aa3053e70e..c0dc0e89361 100644
> > --- a/gcc/fortran/match.cc
> > +++ b/gcc/fortran/match.cc
> > @@ -457,7 +457,7 @@ gfc_match_eos (void)
> >     will be set to the number of digits.  */
>
> Please add a note about GOBBLE_WS here, like you did for gfc_match_char.
>
> >  match
> > -gfc_match_small_literal_int (int *value, int *cnt)
> > +gfc_match_small_literal_int (int *value, int *cnt, bool gobble_ws)
> >  {
> >    locus old_loc;
> >    char c;
>
> (...)
>
> > @@ -611,14 +612,15 @@ gfc_match_label (void)
> >     than GFC_MAX_SYMBOL_LEN.  */
>
> Same here.
>
> >  match
> > -gfc_match_name (char *buffer)
> > +gfc_match_name (char *buffer, bool gobble_ws)
> >  {
> >    locus old_loc;
> >    int i;
> >    char c;
>
> (...)
>
> > @@ -1052,16 +1054,19 @@ cleanup:
> >  }
> >
> >
> > -/* Tries to match the next non-whitespace character on the input.
> > -   This subroutine does not return MATCH_ERROR.  */
> > +/* Tries to match the next non-whitespace character on the input.  This
> > +   subroutine does not return MATCH_ERROR.  When gobble_ws is false, do not
> > +   skip over leading blanks.
> > +*/
>
> There should be no line feed before end of comment.

I've adjusted the patch (see attached) and pushed it as

commit r13-1905-gd325e7048c85e13f12ea79aebf9623eddc7ffcaf

Thanks,
Harald

> OK with those changes.
> Thanks for your patience.
>
> Mikael


From d325e7048c85e13f12ea79aebf9623eddc7ffcaf Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 28 Jul 2022 22:07:02 +0200
Subject: [PATCH] Fortran: detect blanks within literal constants in free-form
 mode [PR92805]

gcc/fortran/ChangeLog:

	PR fortran/92805
	* match.cc (gfc_match_small_literal_int): Make gobbling of leading
	whitespace optional.
	(gfc_match_name): Likewise.
	(gfc_match_char): Likewise.
	* match.h (gfc_match_small_literal_int): Adjust prototype.
	(gfc_match_name): Likewise.
	(gfc_match_char): Likewise.
	* primary.cc (match_kind_param): Match small literal int or name
	without gobbling whitespace.
	(get_kind): Do not skip over blanks.
	(match_string_constant): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/92805
	* gfortran.dg/literal_constants.f: New test.
	* gfortran.dg/literal_constants.f90: New test.

Co-authored-by: Steven G. Kargl 
---
 gcc/fortran/match.cc  | 24 ---
 gcc/fortran/match.h   |  6 ++---
 gcc/fortran/primary.cc| 14 +++
 gcc/testsuite/gfortran.dg/literal_constants.f | 20 
 .../gfortran.dg/literal_constants.f90 | 24 +++
 5 files changed, 65 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/literal_constants.f
 create mode 100644 gcc/testsuite/gfortran.dg/literal_constants.f90

diff --git a/gcc/fortran/match.cc b/gcc/fortran/match.cc
index 1aa3053e70e..8b8b6e79c8b 100644
--- a/gcc/fortran/match.cc
+++ b/gcc/fortran/match.cc
@@ -454,10 +454,11 @@ gfc_match_eos (void)
 /* Match a literal integer on the input, setting the value on
MATCH_YES.  Literal ints occur in kind-parameters as well as
old-style character length specifications.  If cnt is non-NULL it
-   will be set to the number of digits.  */
+   will be set to the number of digits.
+   When gobble_ws is false, do not skip over leading blanks.  */
 
 match
-gfc_match_small_literal_int (int *value, int *cnt)
+gfc_match_small_literal_int (int *value, int *cnt, bool gobble_ws)
 {
   locus old_loc;
   char c;
@@ -466,7 +467,8 @@ gfc_match_sm
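
The idea of an optional whitespace-gobbling flag can be sketched outside the Fortran front end. The following C function is a hypothetical stand-in, not gfortran code: `match_small_int` and its signature are assumptions made for illustration, and it works on a plain string rather than on gfortran's locus machinery.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical stand-in for a matcher: parse a non-negative decimal
   integer at *P.  When GOBBLE_WS is false, leading blanks are not
   skipped, so a blank before the digits makes the match fail.
   On success, store the value and advance *P past the digits.
   No overflow checking is done in this sketch.  */
bool match_small_int (const char **p, int *value, bool gobble_ws)
{
  const char *s = *p;

  if (gobble_ws)
    while (*s == ' ' || *s == '\t')
      s++;

  if (!isdigit ((unsigned char) *s))
    return false;

  int v = 0;
  while (isdigit ((unsigned char) *s))
    v = v * 10 + (*s++ - '0');

  *value = v;
  *p = s;
  return true;
}
```

Called on the text "  42", the function succeeds with `gobble_ws` true and fails with it false, leaving the input position untouched. A caller such as a kind-parameter matcher can use the failing case to detect a blank inside a literal constant.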

Re: [RFA] Implement basic range operators to enable floating point VRP.

2022-07-31 Thread Aldy Hernandez via Gcc-patches
PING

Andrew, anyone, would you mind giving this a once over?  I realize
reviewing ranger's range-op code is not on anyone's list of
priorities, but I could use a sanity check.

The patch is sufficiently self-contained to easily catch anything
caused by it, and I'd like to commit earlier in the week to have
enough time to field any possible fallout before I take a few days off
next week.

Updated patch attached.

Thanks.
Aldy

On Mon, Jul 25, 2022 at 8:50 PM Aldy Hernandez  wrote:
>
> Without further ado, here is the implementation for floating point
> range operators, plus the switch to enable all ranger clients to
> handle floats.
>
> These are bare-bones implementations good enough for relation operators
> to work, while keeping the NAN bits up to date in the frange.  There
> is also minimal support for keeping track of +-INF when it is obvious.
>
> I have included some basic tests to help get a feel of what is
> ultimately handled.
>
> Since range-ops is the domain specific core of ranger, I think it's
> best if a global maintainer or an FP expert could review this.
>
> OK for trunk?
>
> Tested on x86-64 Linux.
>
> p.s. I haven't done extensive testing in DOM, but with this we're mighty
> close for the forward threader there to be replaceable with the backward
> threader, thus removing the last use of the forward threader.
>
> gcc/ChangeLog:
>
> * range-op-float.cc (finite_operands_p): New.
> (frelop_early_resolve): New.
> (default_frelop_fold_range): New.
> (class foperator_equal): New.
> (class foperator_not_equal): New.
> (class foperator_lt): New.
> (class foperator_le): New.
> (class foperator_gt): New.
> (class foperator_ge): New.
> (class foperator_unordered): New.
> (class foperator_ordered): New.
> (class foperator_relop_unknown): New.
> (floating_op_table::floating_op_table): Add above classes to
> floating op table.
> * value-range.h (frange::supports_p): Enable.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/opt/pr94589-2.C: Add notes.
> * gcc.dg/tree-ssa/vrp-float-1.c: New test.
> * gcc.dg/tree-ssa/vrp-float-11.c: New test.
> * gcc.dg/tree-ssa/vrp-float-3.c: New test.
> * gcc.dg/tree-ssa/vrp-float-4.c: New test.
> * gcc.dg/tree-ssa/vrp-float-6.c: New test.
> * gcc.dg/tree-ssa/vrp-float-7.c: New test.
> * gcc.dg/tree-ssa/vrp-float-8.c: New test.
> ---
>  gcc/range-op-float.cc| 564 +++
>  gcc/testsuite/g++.dg/opt/pr94589-2.C |  25 +
>  gcc/testsuite/gcc.dg/tree-ssa/vrp-float-1.c  |  19 +
>  gcc/testsuite/gcc.dg/tree-ssa/vrp-float-11.c |  26 +
>  gcc/testsuite/gcc.dg/tree-ssa/vrp-float-3.c  |  18 +
>  gcc/testsuite/gcc.dg/tree-ssa/vrp-float-4.c  |  16 +
>  gcc/testsuite/gcc.dg/tree-ssa/vrp-float-6.c  |  20 +
>  gcc/testsuite/gcc.dg/tree-ssa/vrp-float-7.c  |  14 +
>  gcc/testsuite/gcc.dg/tree-ssa/vrp-float-8.c  |  26 +
>  gcc/value-range.h|   3 +-
>  10 files changed, 729 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-11.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-4.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-6.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-7.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-8.c
>
> diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
> index 8e9d83e3827..d94ff6f915a 100644
> --- a/gcc/range-op-float.cc
> +++ b/gcc/range-op-float.cc
> @@ -150,6 +150,50 @@ range_operator_float::op1_op2_relation (const irange 
> &lhs ATTRIBUTE_UNUSED) cons
>return VREL_VARYING;
>  }
>
> +// Return TRUE if OP1 and OP2 are known to be free of NANs.
> +
> +static inline bool
> +finite_operands_p (const frange &op1, const frange &op2)
> +{
> +  return (flag_finite_math_only
> + || (op1.get_nan ().no_p ()
> + && op2.get_nan ().no_p ()));
> +}
> +
> +// Floating version of relop_early_resolve that takes into account NAN
> +// and -ffinite-math-only.
> +
> +inline bool
> +frelop_early_resolve (irange &r, tree type,
> + const frange &op1, const frange &op2,
> + relation_kind rel, relation_kind my_rel)
> +{
> +  // If either operand is undefined, return VARYING.
> +  if (empty_range_varying (r, type, op1, op2))
> +return true;
> +
> +  // We can fold relations from the oracle when we know both operands
> +  // are free of NANs, or when -ffinite-math-only.
> +  return (finite_operands_p (op1, op2)
> + && relop_early_resolve (r, type, op1, op2, rel, my_rel));
> +}
> +
> +// Default implementation of fold_range for relational operators.
> +// This amounts to passing on any known relations from the oracle, iff
> +// we 
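
A small C sketch of why known relations may only be folded when both operands are free of NaNs. The helper names `both_free_of_nan` and `negation_is_safe` are hypothetical and operate on plain doubles; they only mirror the idea of finite_operands_p and are not the ranger API.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical analogue of finite_operands_p for concrete values.  */
bool both_free_of_nan (double a, double b)
{
  return !isnan (a) && !isnan (b);
}

/* Return true when "not (a < b)" and "a >= b" agree for this pair.
   They always agree for ordered operands, but every ordered comparison
   involving a NaN is false, so the two sides differ for NaNs.  */
bool negation_is_safe (double a, double b)
{
  return !(a < b) == (a >= b);
}
```

For ordinary values the negated relation can be substituted freely. With a NaN operand both `a < b` and `a >= b` are false, so a fold that derives one relation from the negation of the other would be wrong. That is the case the NaN check guards against, and -ffinite-math-only removes it by assumption.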

Re: PING [PATCH] x86: Add ix86_ifunc_ref_local_ok

2022-07-31 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 27, 2022 at 4:47 PM H.J. Lu  wrote:
>
> On Thu, Jul 21, 2022 at 11:53 AM H.J. Lu  wrote:
> >
> > We can't always use the PLT entry as the function address for local IFUNC
> > functions.  When the PIC register is needed for PLT call, indirect call
> > via the PLT entry will fail since the PIC register may not be set up
> > properly for indirect call.  Add ix86_ifunc_ref_local_ok to return false
> > when the PLT entry can't be used as local IFUNC function pointers.
> >
> > gcc/
> >
> > PR target/83782
> > * config/i386/i386.cc (ix86_ifunc_ref_local_ok): New.
> > (TARGET_IFUNC_REF_LOCAL_OK): Use it.
> >
> > gcc/testsuite/
> >
> > PR target/83782
> > * gcc.target/i386/pr83782-1.c: Require non-ia32.
> > * gcc.target/i386/pr83782-2.c: Likewise.
> > * gcc.target/i386/pr83782-3.c: New test.

You are the expert in this area, I'll blindly rubber-stamp OK.

Thanks,
Uros.

> > ---
> >  gcc/config/i386/i386.cc   | 15 ++-
> >  gcc/testsuite/gcc.target/i386/pr83782-1.c |  8 +++---
> >  gcc/testsuite/gcc.target/i386/pr83782-2.c |  4 +--
> >  gcc/testsuite/gcc.target/i386/pr83782-3.c | 32 +++
> >  4 files changed, 50 insertions(+), 9 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr83782-3.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index e03f86d4a23..5e30dc884bf 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -16070,6 +16070,19 @@ ix86_call_use_plt_p (rtx call_op)
> >return true;
> >  }
> >
> > +/* Implement TARGET_IFUNC_REF_LOCAL_OK.  If this hook returns true,
> > +   the PLT entry will be used as the function address for local IFUNC
> > +   functions.  When the PIC register is needed for PLT call, indirect
> > +   call via the PLT entry will fail since the PIC register may not be
> > +   set up properly for indirect call.  In this case, we should return
> > +   false.  */
> > +
> > +static bool
> > +ix86_ifunc_ref_local_ok (void)
> > +{
> > +  return !flag_pic || (TARGET_64BIT && ix86_cmodel != CM_LARGE_PIC);
> > +}
> > +
> >  /* Return true if the function being called was marked with attribute
> > "noplt" or using -fno-plt and we are compiling for non-PIC.  We need
> > to handle the non-PIC case in the backend because there is no easy
> > @@ -24953,7 +24966,7 @@ ix86_libgcc_floating_mode_supported_p
> >ix86_get_multilib_abi_name
> >
> >  #undef TARGET_IFUNC_REF_LOCAL_OK
> > -#define TARGET_IFUNC_REF_LOCAL_OK hook_bool_void_true
> > +#define TARGET_IFUNC_REF_LOCAL_OK ix86_ifunc_ref_local_ok
> >
> >  #if !TARGET_MACHO && !TARGET_DLLIMPORT_DECL_ATTRIBUTES
> >  # undef TARGET_ASM_RELOC_RW_MASK
> > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-1.c 
> > b/gcc/testsuite/gcc.target/i386/pr83782-1.c
> > index ce97b12e65d..85674346aec 100644
> > --- a/gcc/testsuite/gcc.target/i386/pr83782-1.c
> > +++ b/gcc/testsuite/gcc.target/i386/pr83782-1.c
> > @@ -1,4 +1,4 @@
> > -/* { dg-do compile } */
> > +/* { dg-do compile { target { ! ia32 } } } */
> >  /* { dg-require-ifunc "" } */
> >  /* { dg-options "-O2 -fpic" } */
> >
> > @@ -20,7 +20,5 @@ bar(void)
> >return foo;
> >  }
> >
> > -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { 
> > target ia32 } } } */
> > -/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ 
> > \t]%(?:e|r)ax} { target { ! ia32 } } } } */
> > -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */
> > -/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } 
> > } } } */
> > +/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ 
> > \t]%(?:e|r)ax} } } */
> > +/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" } } */
> > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-2.c 
> > b/gcc/testsuite/gcc.target/i386/pr83782-2.c
> > index e25d258bbda..a654ded771f 100644
> > --- a/gcc/testsuite/gcc.target/i386/pr83782-2.c
> > +++ b/gcc/testsuite/gcc.target/i386/pr83782-2.c
> > @@ -1,4 +1,4 @@
> > -/* { dg-do compile } */
> > +/* { dg-do compile { target { ! ia32 } } } */
> >  /* { dg-require-ifunc "" } */
> >  /* { dg-options "-O2 -fpic" } */
> >
> > @@ -20,7 +20,5 @@ bar(void)
> >return foo;
> >  }
> >
> > -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { 
> > target ia32 } } } */
> >  /* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ 
> > \t]%(?:e|r)ax} { target { ! ia32 } } } } */
> > -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */
> >  /* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } 
> > } } } */
> > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-3.c 
> > b/gcc/testsuite/gcc.target/i386/pr83782-3.c
> > new file mode 100644
> > index 000..1536481cb79
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr83782-3.c
> > @@ -0,0 +1,32 @@
> > +/* { dg-do run }  */
> > +/* { dg-require-ifunc "" } */
> > +/* { dg-

Re: [x86_64 PATCH] Add rotl64ti2_doubleword pattern to i386.md

2022-07-31 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 29, 2022 at 8:10 AM Roger Sayle  wrote:
>
>
> This patch adds rot[lr]64ti2_doubleword patterns to the x86_64 backend,
> to move splitting of 128-bit TImode rotates by 64 bits after reload,
> matching what we now do for 64-bit DImode rotations by 32 bits with -m32.
>
> In theory moving when this rotation is split should have little
> influence on code generation, but in practice "reload" sometimes
> decides to make use of the increased flexibility to reduce the number
> of registers used, and the code size, by using xchg.
>
> For example:
> __int128 x;
> __int128 y;
> __int128 a;
> __int128 b;
>
> void foo()
> {
> unsigned __int128 t = x;
> t ^= a;
> t = (t<<64) | (t>>64);
> t ^= b;
> y = t;
> }
>
> Before:
> movq    x(%rip), %rsi
> movq    x+8(%rip), %rdi
> xorq    a(%rip), %rsi
> xorq    a+8(%rip), %rdi
> movq    %rdi, %rax
> movq    %rsi, %rdx
> xorq    b(%rip), %rax
> xorq    b+8(%rip), %rdx
> movq    %rax, y(%rip)
> movq    %rdx, y+8(%rip)
> ret
>
> After:
> movq    x(%rip), %rax
> movq    x+8(%rip), %rdx
> xorq    a(%rip), %rax
> xorq    a+8(%rip), %rdx
> xchgq   %rdx, %rax
> xorq    b(%rip), %rax
> xorq    b+8(%rip), %rdx
> movq    %rax, y(%rip)
> movq    %rdx, y+8(%rip)
> ret
>
> On some modern architectures this is a small win, on some older
> architectures this is a small loss.  The decision which code to
> generate is made in "reload", and could probably be tweaked by
> register preferencing.  The much bigger win is that (eventually) all
> TImode mode shifts and rotates by constants will become potential
> candidates for TImode STV.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
>
>
> 2022-07-29  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386.md (define_expand ti3): For
> rotations by 64 bits use new rot[lr]64ti2_doubleword pattern.
> (rot[lr]64ti2_doubleword): New post-reload splitter.

OK.

Thanks,
Uros.
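
For readers following the example in the quoted mail: a 128-bit rotate by 64 bits is the same as exchanging the two 64-bit halves, which is why a single xchgq can implement it once the split is delayed until after reload. The following is a minimal sketch of that equivalence, not code from the patch. The names u128, rotate_by_64, swap_halves and check_rotate_is_swap are hypothetical, and the sketch assumes a compiler that provides unsigned __int128.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;  /* GCC/Clang extension on 64-bit targets */

/* The rotate as written in the test case: t = (t<<64) | (t>>64).  */
static inline u128
rotate_by_64 (u128 t)
{
  return (t << 64) | (t >> 64);
}

/* Hypothetical helper: the same result expressed as a swap of halves.  */
static inline u128
swap_halves (u128 t)
{
  uint64_t lo = (uint64_t) t;
  uint64_t hi = (uint64_t) (t >> 64);
  return ((u128) lo << 64) | hi;
}

/* Usage: both formulations agree for the given input.  */
static inline void
check_rotate_is_swap (u128 t)
{
  assert (rotate_by_64 (t) == swap_halves (t));
}
```

Whether the compiler picks xchgq or two register moves for the swap is a register-allocation decision, as the mail notes; the sketch only shows that the two operations compute the same value.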


Re: [x86_64 PATCH take #2] PR target/106450: Tweak timode_remove_non_convertible_regs.

2022-07-31 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 30, 2022 at 11:42 AM Roger Sayle  wrote:
>
>
> Many thanks to H.J. for pointing out a better idiom for traversing
> the USEs (and also DEFs) of TImode registers in an instruction.
>
> This revised patch has been tested on x86_64-pc-linux-gnu with
> make bootstrap and make -k check, both with and without
> --target_board=unix{-m32}, with no new failures.  Ok for mainline?
>
>
> 2022-07-30  Roger Sayle  
> H.J. Lu  
>
> gcc/ChangeLog
> PR target/106450
> * config/i386/i386-features.cc (timode_check_non_convertible_regs):
> Do nothing if REGNO is set in the REGS bitmap, or is a hard reg.
> (timode_remove_non_convertible_regs): Update comment.
> Call timode_check_non_convertible_reg on all TImode register
> DEFs and USEs in each instruction.
>
> gcc/testsuite/ChangeLog
> PR target/106450
> * gcc.target/i386/pr106450.c: New test case.

LGTM.

Thanks,
Uros.

>
>
> Thanks (H.J. and Uros),
> Roger
> --
>
> > -Original Message-
> > From: H.J. Lu 
> > Sent: 28 July 2022 17:55
> > To: Roger Sayle 
> > Cc: GCC Patches 
> > Subject: Re: [x86_64 PATCH] PR target/106450: Tweak
> > timode_remove_non_convertible_regs.
> >
> > On Thu, Jul 28, 2022 at 9:43 AM Roger Sayle 
> > wrote:
> > >
> > > This patch resolves PR target/106450, some more fall-out from more
> > > aggressive TImode scalar-to-vector (STV) optimizations.  I continue to
> > > be caught out by how far TImode STV has diverged from DImode/SImode
> > > STV, and therefore requires additional (unexpected) tweaking.  Many
> > > thanks to H.J. Lu for pointing out timode_remove_non_convertible_regs
> > > needs to be extended to handle XOR (and other new operations).
> > >
> > > Unhelpfully the comment above this function states that it's the
> > > TImode version of "remove_non_convertible_regs", which doesn't exist
> > > anymore, so I've resurrected an explanatory comment from the git history.
> > > By refactoring the checks for hard regs and already "marked" regs into
> > > timode_check_non_convertible_regs itself, all its callers are
> > > simplified.  This patch then uses GET_RTX_CLASS to generically handle
> > > unary and binary operations, calling timode_check_non_convertible_regs
> > > on each TImode register operand in the single_set's SET_SRC.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > > and make -k check, both with and without --target_board=unix{-m32},
> > > with no new failures.  Ok for mainline?
> > >
> > >
> > > 2022-07-28  Roger Sayle  
> > >
> > > gcc/ChangeLog
> > > PR target/106450
> > > * config/i386/i386-features.cc 
> > > (timode_check_non_convertible_regs):
> > > Do nothing if REGNO is set in the REGS bitmap, or is a hard reg.
> > > (timode_remove_non_convertible_regs): Update comment.
> > > Call timode_check_non_convertible_regs on all register operands
> > > of supported (binary and unary) operations.
> >
> > Should we use
> >
> > df_ref ref;
> > FOR_EACH_INSN_USE (ref, insn)
> >if (!DF_REF_REG_MEM_P (ref))
> >  timode_check_non_convertible_regs (candidates, regs,
> >   DF_REF_REGNO (ref));
> >
> > to check each use?
> >
> > > gcc/testsuite/ChangeLog
> > > PR target/106450
> > > * gcc.target/i386/pr106450.c: New test case.
> > >
> > >
> > > Thanks in advance,
> > > Roger
> > > --
> > --
> > H.J.


Re: [x86 PATCH] Support logical shifts by (some) integer constants in TImode STV.

2022-07-31 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 29, 2022 at 12:18 AM Roger Sayle  wrote:
>
>
> This patch improves TImode STV by adding support for logical shifts by
> integer constants that are multiples of 8.  For the test case:
>
> __int128 a, b;
> void foo() { a = b << 16; }
>
> on x86_64, gcc -O2 currently generates:
>
> movq    b(%rip), %rax
> movq    b+8(%rip), %rdx
> shldq   $16, %rax, %rdx
> salq    $16, %rax
> movq    %rax, a(%rip)
> movq    %rdx, a+8(%rip)
> ret
>
> with this patch we now generate:
>
> movdqa  b(%rip), %xmm0
> pslldq  $2, %xmm0
> movaps  %xmm0, a(%rip)
> ret
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
>
>
> 2022-07-28  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386-features.cc (compute_convert_gain): Add gain
> for converting suitable TImode shift to a V1TImode shift.
> (timode_scalar_chain::convert_insn): Add support for converting
> suitable ASHIFT and LSHIFTRT.
> (timode_scalar_to_vector_candidate_p): Consider logical shifts
> by integer constants that are multiples of 8 to be candidates.
>
> gcc/testsuite/ChangeLog
> * gcc.target/i386/sse4_1-stv-7.c: New test case.

+ case ASHIFT:
+ case LSHIFTRT:
+  /* For logical shifts by constant multiples of 8. */
+  igain = optimize_insn_for_size_p () ? COSTS_N_BYTES (4)
+  : COSTS_N_INSNS (1);

Isn't the conversion a universal win for -O2 as well as for -Os? The
conversion to/from XMM register is already accounted for, so for -Os
substituting shldq/salq with pslldq should always be a win. I'd expect
the cost calculation to be similar to the
general_scalar_chain::compute_convert_gain cost calculation with m =
2.

Uros.
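
The restriction to shift counts that are multiples of 8 follows from pslldq shifting whole bytes: a 128-bit logical left shift by 16 bits is a byte shift by 2, as in the quoted assembly. The following is a rough sketch of that correspondence in portable C, not code from the patch. The helpers byte_shift_left and bit_shift_left are hypothetical, and the byte loop only models the whole-byte, zero-filling shift rather than using the instruction itself.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;  /* GCC/Clang extension on 64-bit targets */

/* Hypothetical model of a whole-byte left shift: byte i of the result is
   byte i-n of the input, and the low n bytes are zero-filled.  */
static inline u128
byte_shift_left (u128 v, unsigned n)
{
  uint8_t in[16], out[16];
  for (unsigned i = 0; i < 16; i++)
    in[i] = (uint8_t) (v >> (8 * i));
  for (unsigned i = 0; i < 16; i++)
    out[i] = (i >= n) ? in[i - n] : 0;

  u128 r = 0;
  for (unsigned i = 0; i < 16; i++)
    r |= (u128) out[i] << (8 * i);
  return r;
}

/* The scalar TImode shift, limited to the counts the mail describes.  */
static inline u128
bit_shift_left (u128 v, unsigned bits)
{
  assert (bits % 8 == 0 && bits < 128);
  return v << bits;
}
```

For example, byte_shift_left (v, 2) and bit_shift_left (v, 16) give the same value for any v, which matches the pslldq $2 in the generated code above.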


Re: [GCC 12] [PATCH] x86: Support 2/4/8 byte constant vector stores

2022-07-31 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 27, 2022 at 4:24 PM H.J. Lu  wrote:
>
> On Fri, Jul 1, 2022 at 8:31 AM Uros Bizjak  wrote:
> >
> > On Thu, Jun 30, 2022 at 4:50 PM H.J. Lu  wrote:
> > >
> > > 1. Add a predicate for constant vectors which can be converted to integer
> > > constants suitable for constant integer stores.  For a 8-byte constant
> > > vector, the converted 64-bit integer must be valid for store with 64-bit
> > > immediate, which is a 64-bit integer sign-extended from a 32-bit integer.
> > > 2. Add a new pattern to allow 2-byte, 4-byte and 8-byte constant vector
> > > stores, like
> > >
> > > (set (mem:V2HI (reg:DI 84))
> > >  (const_vector:V2HI [(const_int 0 [0]) (const_int 1 [0x1])]))
> > >
> > > 3. After reload, convert constant vector stores to constant integer
> > > stores, like
> > >
> > > (set (mem:SI (reg:DI 5 di [84]))
> > >  (const_int 65536 [0x10000]))
> > >
> > > For
> > >
> > > void
> > > foo (short * c)
> > > {
> > >   c[0] = 0;
> > >   c[1] = 1;
> > > }
> > >
> > > it generates
> > >
> > > movl    $65536, (%rdi)
> > >
> > > instead of
> > >
> > > movl    .LC0(%rip), %eax
> > > movl    %eax, (%rdi)
> > >
> > > gcc/
> > >
> > > PR target/106022
> > > * config/i386/i386-protos.h 
> > > (ix86_convert_const_vector_to_integer):
> > > New.
> > > * config/i386/i386.cc (ix86_convert_const_vector_to_integer):
> > > New.
> > > * config/i386/mmx.md (V_16_32_64): New.
> > > (*mov_imm): New patterns for stores with 16-bit, 32-bit
> > > and 64-bit constant vector.
> > > * config/i386/predicates.md (x86_64_const_vector_operand): New.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/106022
> > > * gcc.target/i386/pr106022-1.c: New test.
> > > * gcc.target/i386/pr106022-2.c: Likewise.
> > > * gcc.target/i386/pr106022-3.c: Likewise.
> > > * gcc.target/i386/pr106022-4.c: Likewise.
> >
> > OK.
>
> OK to backport to GCC 12 branch?

Let's keep this in mainline only. It isn't something that makes a lot
of difference.

Uros.
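
The arithmetic behind the quoted example can be checked by hand: on a little-endian target, the V2HI constant {0, 1} occupies the same four bytes as the 32-bit integer 65536 (0x10000), and an 8-byte vector qualifies only if its 64-bit value is a sign-extended 32-bit integer. The following is a small sketch of both checks, not the implementation in i386.cc. The names pack_v2hi, fits_sign_extended_imm32 and example_from_thread are hypothetical, and the packing assumes little-endian element order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a two-element 16-bit vector viewed as one 32-bit
   integer, assuming little-endian layout (element 0 in the low half).  */
static inline uint32_t
pack_v2hi (uint16_t elt0, uint16_t elt1)
{
  return (uint32_t) elt0 | ((uint32_t) elt1 << 16);
}

/* Hypothetical model of the 8-byte condition from the mail: the value
   must be representable as a 32-bit integer sign-extended to 64 bits.  */
static inline bool
fits_sign_extended_imm32 (int64_t value)
{
  return value >= INT32_MIN && value <= INT32_MAX;
}

/* Usage: the constant from the quoted RTL, c[0] = 0 and c[1] = 1.  */
static inline void
example_from_thread (void)
{
  assert (pack_v2hi (0, 1) == 65536u);
}
```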


Re: [PATCH] c: Fix location for _Pragma tokens [PR97498]

2022-07-31 Thread Jeff Law via Gcc-patches




On 7/31/2022 6:44 AM, Lewis Hyatt wrote:
> On Sat, Jul 30, 2022 at 10:43 PM Jeff Law  wrote:
> > > There was a request to backport this
> > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97498#c7) since it is
> > > relevant to this one:
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106267. Is that OK as
> > > well for any of the current release branches please? It will work fine
> > > as far back as 10. Thanks...
> > Generally we try to focus mostly on codegen issues and regressions on
> > the release branches, but it's not a strict rule.  Given this has been
> > on the trunk for nearly a couple weeks without issues, feel free to go
> > ahead and backport per Martin's request.
> >
> > jeff
>
> Thank you, I'll do that. One question, does a backport need to be an
> exact cherry-pick, or is it OK if I need to tweak a few things as
> well? I wasn't sure if I need to re-post the patch here in that case.
> The patch itself applies to gcc 12 branch fine, however I think I need
> a couple small changes to the testsuite parts. Thanks...

I personally prefer cherry-pick when we can, but as you note, sometimes
minor twiddling is necessary, particularly as you go to older and older
branches.  If you need to make minor changes, go ahead.  Consider the
patch pre-approved with any minor changes you need to make to work with
the branch and just post the patch with a note that it was installed as
pre-approved.

Jeff


Re: [PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-07-31 Thread Lewis Hyatt via Gcc-patches
On Sat, Jul 30, 2022 at 7:06 PM Tom Honermann via Gcc-patches
 wrote:
>
> On 7/27/22 7:09 PM, Joseph Myers wrote:
> > On Sun, 24 Jul 2022, Tom Honermann via Gcc-patches wrote:
> >
> >> Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
> >> (see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
> >> require that the target diagnostic option be enabled for the preprocessor
> >> (see c_option_is_from_cpp_diagnostics).  This change modifies the
> >> -Wc++20-compat option definition to register it as a preprocessor option
> >> so that its associated diagnostics can be suppressed.  The changes also
> > There are lots of C++ warning options, all of which should support pragma
> > suppression regardless of whether they are relevant to the preprocessor or
> > not.  Do they all need this kind of handling, or is it only -Wc++20-compat
> > that has some kind of problem?
>
> I had only checked -Wc++20-compat when working on the patch.
>
> I did some spot checking now and confirmed that suppression works as
> expected for C++ for at least the following warnings:
>-Wuninitialized
>-Warray-compare
>-Wbool-compare
>-Wtautological-compare
>-Wterminate
>
> I don't know the diagnostic framework well. As best I can tell, this
> issue is specific to the -Wc++20-compat option and when the particular
> diagnostic is issued (e.g., during lexing as opposed to during parsing).
> The following call chains appear to be relevant.
>cp_lexer_new_main -> cp_lexer_handle_early_pragma ->
> c_invoke_early_pragma_handler
>cp_parser_* -> cp_parser_pragma -> c_invoke_pragma_handler
>(where * might be "declaration", "toplevel_declaration",
> "class_head", "objc_interstitial_code", ...)
>
> The -Wc++20-compat enabled warning regarding new keywords in C++20 is
> issued from cp_lexer_get_preprocessor_token.
>
> Tom.
>

I have been working on improving the handling of "#pragma GCC
diagnostic" lately. The behavior for C++ changed since r13-1544
(https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e46f4d7430c5210465791603735ab219ef263c51).
I have some more comments about the patch's approach on the PR
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c44).

"#pragma GCC diagnostic" formerly did not work in C++ at all, for
diagnostics generated by libcpp, because C++ obtains all the tokens
from libcpp first (including deferred pragmas), and then processes
them afterward, too late to take effect for diagnostics that libcpp
has already emitted. r13-1544 fixed this up by adding an early pragma
handler, which runs as soon as a deferred pragma token is seen and
handles diagnostic pragmas if they pertain to libcpp-controlled
diagnostics. Non-libcpp diagnostics still need to be handled later,
during parsing, or else they get processed too early and it leads to
other problems. Basically, now each diagnostic pragma is handled as
close in time as possible to the time the associated diagnostics might
be generated.

The early pragma handler determines that an option comes from libcpp,
and so should be subject to early processing, if it was marked as such
in the options definition file. Tom's patch points out that
-Wc++20-compat needs to be handled early, and so marking it as a
libcpp diagnostic in c-family/c.opt arranges for that to work as
intended. Now one potential objection here is that -Wc++20-compat
warnings are not technically generated by libcpp. They are generated
by the C++ frontend immediately after lexing an identifier token from
libcpp (cp_lexer_get_preprocessor_token()). But the distinction
between these two steps is rather blurry and it seems logical to me,
to denote this as a libcpp-related option. Also, the same is already
done for -Wc++11-compat. Otherwise, we would need to add some new
option property to indicate which ones need to be handled for pragmas
at lexing time rather than parsing time.

At the moment I don't see any other diagnostics issued from
cp_lexer_get_preprocessor_token() that would need similar adjustments.
Assuming the approach is OK, it might be nice to add a comment to that
function, indicating that any diagnostics emitted there should be
annotated as libcpp options in the .opt file?

-Lewis
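
For context, the suppression idiom under discussion is the usual push / ignored / pop sequence. The following is a minimal sketch in C rather than C++, so it uses -Wc++-compat as a stand-in for -Wc++20-compat; as far as I know that option similarly covers identifiers that are keywords in the other language, but treat that parallel as an assumption. The function names are hypothetical and nothing here is taken from the patch.

```c
#include <assert.h>

/* Assumption: compiled as C, where "class" is an ordinary identifier.
   With -Wc++-compat enabled, the declaration below is the kind of code
   the pragma is meant to silence.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++-compat"
static inline int
local_named_like_a_keyword (void)
{
  int class = 42;
  return class;
}
#pragma GCC diagnostic pop

/* Usage: the pragmas affect diagnostics only, not the generated code.  */
static inline void
check_value_unchanged (void)
{
  assert (local_named_like_a_keyword () == 42);
}
```

The point raised in the thread is about timing: for the suppression to work, the "ignored" pragma has to be processed before the token that triggers the warning is lexed, which is why options whose diagnostics are issued at lexing time need the early handler.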


Re: [PATCH] c: Fix location for _Pragma tokens [PR97498]

2022-07-31 Thread Lewis Hyatt via Gcc-patches
On Sat, Jul 30, 2022 at 10:43 PM Jeff Law  wrote:
> > There was a request to backport this
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97498#c7) since it is
> > relevant to this one:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106267. Is that OK as
> > well for any of the current release branches please? It will work fine
> > as far back as 10. Thanks...
> Generally we try to focus mostly on codegen issues and regressions on
> the release branches, but it's not a strict rule.  Given this has been
> on the trunk for nearly a couple weeks without issues, feel free to go
> ahead and backport per Martin's request.
>
> jeff

Thank you, I'll do that. One question, does a backport need to be an
exact cherry-pick, or is it OK if I need to tweak a few things as
well? I wasn't sure if I need to re-post the patch here in that case.
The patch itself applies to gcc 12 branch fine, however I think I need
a couple small changes to the testsuite parts. Thanks...

-Lewis


Re: [PATCH] libfortran: Fix up boz_15.f90 on powerpc64le with -mabi=ieeelongdouble [PR106079]

2022-07-31 Thread Thomas Koenig via Gcc-patches

Hi Jakub,


The boz_15.f90 test FAILs on powerpc64le-linux when -mabi=ieeelongdouble
is used (either default through --with-long-double-format=ieee or
when used explicitly).
The problem is that the read/write transfer routines are called with
BT_REAL (or BT_COMPLEX) type and kind 17 which is magic we use to say
it is the IEEE quad real(kind=16) rather than the IBM double double
real(kind=16).  For the floating point input/output we then handle kind
17 specially, but for B/O/Z we just treat the bytes of the floating point
value as binary blob and using 17 in that case results in unexpected
behavior, for write it means we don't estimate right how many chars we'll
need and print  etc. rather than what we should, and
even with explicit size we'd print one further byte than intended.
For read it would even mean overwriting some unrelated byte after the
floating point object.

Fixed by using 16 instead of 17 in the read_radix and write_{b,o,z} calls.

Bootstrapped/regtested on powerpc64le-linux, ok for trunk / 12.2?


OK for both.

Best regards

Thomas


Re: [PATCH v3] LoongArch: add addr_global attribute

2022-07-31 Thread Chenghua Xu



On 2022/7/30 at 1:17 AM, Xi Ruoyao via Gcc-patches wrote:

Change v2 to v3:
- Disable section anchor for addr_global symbols.
- Use -O2 in test to make sure section anchor is disabled.

--

Background:
https://lore.kernel.org/loongarch/d7670b60-2782-4642-995b-7baa01779...@loongson.cn/T/#e1d47e2fe185f2e2be8fdc0784f0db2f644119379

Question:  Do you have a better name than "addr_global"?

Alternatives:

1. Just use "-mno-explicit-relocs -mla-local-with-abs" for kernel
modules.  It's stupid IMO.
2. Implement a "-maddress-local-with-got" option for GCC and use it for
kernel modules.  It seems like overkill: we might create many unnecessary
GOT entries.
3. For all variables with a section attribute, consider it global.  It
may make sense, but I just checked x86_64 and riscv and they don't do
this.
4. Implement -mcmodel=extreme and use it for kernel modules.  To me
"extreme" seems really too extreme.
5. More hacks in kernel. (Convert relocations against .data..percpu with
objtool?  But objtool is not even implemented for LoongArch yet.)

Note: I'll be mostly AFK in the following week.  My attempt to finish
the kernel support for new relocs before going AFK now failed miserably
:(.

-- >8 --

A linker script and/or a section attribute may locate a local object in
some way unexpected by the code model, leading to a link failure.  This
happens when the Linux kernel loads a module with "local" per-CPU
variables.

Add an attribute to explicitly mark a variable with the address
unlimited by the code model so we would be able to work around such
problems.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_attribute_table):
New attribute table.
(TARGET_ATTRIBUTE_TABLE): Define the target hook.
(loongarch_handle_addr_global_attribute): New static function.
(loongarch_classify_symbol): Return SYMBOL_GOT_DISP for
SYMBOL_REF_DECL with addr_global attribute.
(loongarch_use_anchors_for_symbol_p): New static function.
(TARGET_USE_ANCHORS_FOR_SYMBOL_P): Define the target hook.
* doc/extend.texi (Variable Attributes): Document new
LoongArch specific attribute.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/addr-global.c: New test.
---
  gcc/config/loongarch/loongarch.cc | 61 +++
  gcc/doc/extend.texi   | 17 ++
  .../gcc.target/loongarch/addr-global.c| 28 +
  3 files changed, 106 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/addr-global.c

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 79687340dfd..db6f84d4e66 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -1643,6 +1643,13 @@ loongarch_classify_symbol (const_rtx x)
&& !loongarch_symbol_binds_local_p (x))
  return SYMBOL_GOT_DISP;
  
+  if (SYMBOL_REF_P (x))

+{
+  tree decl = SYMBOL_REF_DECL (x);
+  if (decl && lookup_attribute ("addr_global", DECL_ATTRIBUTES (decl)))
+   return SYMBOL_GOT_DISP;
+}
+
return SYMBOL_PCREL;
  }
  
@@ -6068,6 +6075,54 @@ loongarch_starting_frame_offset (void)

return crtl->outgoing_args_size;
  }
  
+static tree

+loongarch_handle_addr_global_attribute (tree *node, tree name, tree, int,
+bool *no_add_attrs)
+{
+  tree decl = *node;
+  if (TREE_CODE (decl) == VAR_DECL)
+{
+  if (DECL_CONTEXT (decl)
+ && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL
+ && !TREE_STATIC (decl))
+   {
+ error_at (DECL_SOURCE_LOCATION (decl),
+   "%qE attribute cannot be specified for local "
+   "variables", name);
+ *no_add_attrs = true;
+   }
+}
+  else
+{
+  warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+}
+  return NULL_TREE;
+}
+
+static const struct attribute_spec loongarch_attribute_table[] =
+{
+  /* { name, min_len, max_len, decl_req, type_req, fn_type_req,
+   affects_type_identity, handler, exclude } */
+  { "addr_global", 0, 0, true, false, false, false,
+loongarch_handle_addr_global_attribute, NULL },
+  /* The last attribute spec is set to be NULL.  */
+  {}
+};
+
+bool
+loongarch_use_anchors_for_symbol_p (const_rtx symbol)
+{
+  tree decl = SYMBOL_REF_DECL (symbol);
+
+  /* addr_global indicates we don't know how the linker will locate the symbol,
+ so the use of anchor may cause relocation overflow.  */
+  if (decl && lookup_attribute ("addr_global", DECL_ATTRIBUTES (decl)))
+return false;
+
+  return default_use_anchors_for_symbol_p (symbol);
+}
+
  /* Initialize the GCC target structure.  */
  #undef TARGET_ASM_ALIGNED_HI_OP
  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -6256,6 +6311,12 @@ loongarch_starting_frame_offset (void)
  #undef  TARGET_HAVE_SPECULATION_SAFE_VALUE
  #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed

Re: [committed] wwwdocs: cxx-status: Move www.open-std.org to https

2022-07-31 Thread Jonathan Wakely via Gcc-patches

On Sat, 30 Jul 2022 at 22:28, Gerald Pfeifer  wrote:
>
> This is a trivial change which fixes several dozen links.
>
> Marek, Jason, Jonathan - I noticed that (in other places) we have both
> links to www.open-std.org and open-std.org, both of which seem to work.
>
> What is the preferred spelling of that site? With or without www? (The
> latter would be shorter and sweeter. ;-)

https://www.open-std.org/ says "The site www.open-std.org is holding a
number of web pages for groups producing open standards:" but I don't
think it really matters which we use.


Re: [PATCH, v4] Fortran: detect blanks within literal constants in free-form mode [PR92805]

2022-07-31 Thread Mikael Morin

On 30/07/2022 at 21:40, Harald Anlauf wrote:

Hi Mikael,

On 30.07.22 at 10:28, Mikael Morin wrote:

Meh! We killed one check for gfc_current_form but the other one is still
there.
OK, match_kind_param calls two functions that also gobble space, so
there is work remaining here.
So please make match_small_literal_constant and gfc_match_name
space-gobbling wrappers around space-non-gobbling inner functions and
call those inner functions instead in match_kind_param.


well, here's the shortest solution I could come up with.
I added a new argument to 3 functions used in parsing that
controls the gobbling of whitespace.  We use this to handle
whitespace for numerical literals, while the parsing of string
literals remains as in the previous version of the patch.

This version obviously ignores Thomas' request, as that would
require to treat gfc_match_char specially...

Regtested again.  OK now?



PR fortran/92805
* match.cc (gfc_match_small_literal_int): Make gobbling of leading
whitespace optional.
(gfc_match_name): Likewise.
(gfc_match_char): Likewise.
* match.h (gfc_match_small_literal_int): Adjust prototype.
(gfc_match_name): Likewise.
(gfc_match_char): Likewise.
* primary.cc (match_kind_param): Match small literal int or name
without gobbling whitespace.
(get_kind): Do not skip over blanks in free-form mode.
I think the "in free-form mode" applied to the preceding patches but can 
be dropped now.

(match_string_constant): Likewise.



diff --git a/gcc/fortran/match.cc b/gcc/fortran/match.cc
index 1aa3053e70e..c0dc0e89361 100644
--- a/gcc/fortran/match.cc
+++ b/gcc/fortran/match.cc
@@ -457,7 +457,7 @@ gfc_match_eos (void)
will be set to the number of digits.  */

Please add a note about GOBBLE_WS here, like you did for gfc_match_char.


 match
-gfc_match_small_literal_int (int *value, int *cnt)
+gfc_match_small_literal_int (int *value, int *cnt, bool gobble_ws)
 {
   locus old_loc;
   char c;

(...)

@@ -611,14 +612,15 @@ gfc_match_label (void)
than GFC_MAX_SYMBOL_LEN.  */

Same here.
 
 match

-gfc_match_name (char *buffer)
+gfc_match_name (char *buffer, bool gobble_ws)
 {
   locus old_loc;
   int i;
   char c;
 

(...)

@@ -1052,16 +1054,19 @@ cleanup:
 }
 
 
-/* Tries to match the next non-whitespace character on the input.

-   This subroutine does not return MATCH_ERROR.  */
+/* Tries to match the next non-whitespace character on the input.  This
+   subroutine does not return MATCH_ERROR.  When gobble_ws is false, do not
+   skip over leading blanks.
+*/

There should be no line feed before end of comment.

OK with those changes.
Thanks for your patience.

Mikael
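
The idea of an optional whitespace-gobbling flag, as reviewed above, can be illustrated outside gfortran. The following is a simplified sketch in C, not the code from match.cc. match_small_int and kind_param_example are hypothetical stand-ins, the cursor is a plain string pointer rather than a gfortran locus, and overflow of large literals is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical stand-in for a matcher with a gobble_ws argument.
   On success the cursor is advanced past the digits; on failure the
   cursor is left untouched.  */
static inline bool
match_small_int (const char **cursor, int *value, bool gobble_ws)
{
  const char *p = *cursor;

  /* Skip leading blanks only when the caller asks for it.  */
  if (gobble_ws)
    while (*p == ' ' || *p == '\t')
      p++;

  if (!isdigit ((unsigned char) *p))
    return false;

  int v = 0;
  while (isdigit ((unsigned char) *p))
    {
      v = v * 10 + (*p - '0');
      p++;
    }

  *cursor = p;
  *value = v;
  return true;
}

/* Usage: a blank before the digits is accepted only when gobbling.  */
static inline void
kind_param_example (void)
{
  int v = 0;
  const char *lenient = " 8";
  const char *strict = " 8";
  assert (match_small_int (&lenient, &v, true) && v == 8);
  assert (!match_small_int (&strict, &v, false));
}
```

A caller such as a kind-parameter matcher would pass false so that blanks inside a literal constant are rejected, while existing callers keep the old behaviour by passing true.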