[PATCH] testsuite/70230 - fix failures with default SSP

2022-01-26 Thread Allan McRae via Gcc-patches
Configuring with --enable-default-ssp triggers various testsuite
failures.  These contain asm statements that are not compatible with
-fstack-protector.  Adding -fno-stack-protector to dg-options to
work around this issue.

Tested on x86_64-linux.

2022-01-26  Allan McRae  

PR testsuite/70230
* gcc.dg/asan/use-after-scope-4.c (dg-options): Add
-fno-stack-protector.
* gcc.dg/stack-usage-1.c: Likewise
* gcc.dg/superblock.c: Likewise
* gcc.target/i386/avx-vzeroupper-17.c: Likewise
* gcc.target/i386/cleanup-1.c: Likewise
* gcc.target/i386/cleanup-2.c: Likewise
* gcc.target/i386/interrupt-redzone-1.c: Likewise
* gcc.target/i386/interrupt-redzone-2.c: Likewise
* gcc.target/i386/pr79793-1.c: Likewise
* gcc.target/i386/pr79793-2.c: Likewise
* gcc.target/i386/shrink_wrap_1.c: Likewise
* gcc.target/i386/stack-check-11.c: Likewise
* gcc.target/i386/stack-check-18.c: Likewise
* gcc.target/i386/stack-check-19.c: Likewise
* gcc.target/i386/stackalign/pr88483-1.c: Likewise
* gcc.target/i386/stackalign/pr88483-2.c: Likewise
* gcc.target/i386/sw-1.c: Likewise
---
 gcc/testsuite/gcc.dg/asan/use-after-scope-4.c| 1 +
 gcc/testsuite/gcc.dg/stack-usage-1.c | 2 +-
 gcc/testsuite/gcc.dg/superblock.c| 2 +-
 gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c| 2 +-
 gcc/testsuite/gcc.target/i386/cleanup-1.c| 2 +-
 gcc/testsuite/gcc.target/i386/cleanup-2.c| 2 +-
 gcc/testsuite/gcc.target/i386/interrupt-redzone-1.c  | 2 +-
 gcc/testsuite/gcc.target/i386/interrupt-redzone-2.c  | 2 +-
 gcc/testsuite/gcc.target/i386/pr79793-1.c| 2 +-
 gcc/testsuite/gcc.target/i386/pr79793-2.c| 2 +-
 gcc/testsuite/gcc.target/i386/shrink_wrap_1.c| 2 +-
 gcc/testsuite/gcc.target/i386/stack-check-11.c   | 2 +-
 gcc/testsuite/gcc.target/i386/stack-check-18.c   | 2 +-
 gcc/testsuite/gcc.target/i386/stack-check-19.c   | 2 +-
 gcc/testsuite/gcc.target/i386/stackalign/pr88483-1.c | 2 +-
 gcc/testsuite/gcc.target/i386/stackalign/pr88483-2.c | 2 +-
 gcc/testsuite/gcc.target/i386/sw-1.c | 2 +-
 17 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/asan/use-after-scope-4.c 
b/gcc/testsuite/gcc.dg/asan/use-after-scope-4.c
index 44dc79535d2..e1486e75045 100644
--- a/gcc/testsuite/gcc.dg/asan/use-after-scope-4.c
+++ b/gcc/testsuite/gcc.dg/asan/use-after-scope-4.c
@@ -1,4 +1,5 @@
 // { dg-do run }
+/* { dg-options "-fno-stack-protector" } */
 
 #define FN(NAME) \
 NAME (void) \
diff --git a/gcc/testsuite/gcc.dg/stack-usage-1.c 
b/gcc/testsuite/gcc.dg/stack-usage-1.c
index 93cfe7c0163..1d7d1fee435 100644
--- a/gcc/testsuite/gcc.dg/stack-usage-1.c
+++ b/gcc/testsuite/gcc.dg/stack-usage-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fstack-usage" } */
+/* { dg-options "-fstack-usage -fno-stack-protector" } */
 /* nvptx doesn't have a reg allocator, and hence no stack usage data.  */
 /* { dg-skip-if "" { nvptx-*-* } } */
 
diff --git a/gcc/testsuite/gcc.dg/superblock.c 
b/gcc/testsuite/gcc.dg/superblock.c
index 2b2fa9e154f..6b4419adaf5 100644
--- a/gcc/testsuite/gcc.dg/superblock.c
+++ b/gcc/testsuite/gcc.dg/superblock.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-asynchronous-unwind-tables -fsched2-use-superblocks 
-fdump-rtl-sched2 -fdump-rtl-bbro" } */
+/* { dg-options "-O2 -fno-asynchronous-unwind-tables -fsched2-use-superblocks 
-fdump-rtl-sched2 -fdump-rtl-bbro -fno-stack-protector" } */
 /* { dg-require-effective-target scheduling } */
 
 typedef int aligned __attribute__ ((aligned (64)));
diff --git a/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c 
b/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c
index 6dc0dc05321..d677e6f10e0 100644
--- a/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c
+++ b/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target lp64 } } */
-/* { dg-options "-O2 -mavx -mabi=ms -dp" } */
+/* { dg-options "-O2 -mavx -mabi=ms -dp -fno-stack-protector" } */
 
 typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
 
diff --git a/gcc/testsuite/gcc.target/i386/cleanup-1.c 
b/gcc/testsuite/gcc.target/i386/cleanup-1.c
index dcfcc4edb5f..6e7544c6b7a 100644
--- a/gcc/testsuite/gcc.target/i386/cleanup-1.c
+++ b/gcc/testsuite/gcc.target/i386/cleanup-1.c
@@ -1,5 +1,5 @@
 /* { dg-do run { target *-*-linux* *-*-gnu* } } */
-/* { dg-options "-fexceptions -fnon-call-exceptions 
-fasynchronous-unwind-tables -O2" } */
+/* { dg-options "-fexceptions -fnon-call-exceptions 
-fasynchronous-unwind-tables -O2 -fno-stack-protector" } */
 /* Test complex CFA value expressions.  */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/i386/cleanup-2.c 
b/gcc/testsuite/gcc.target/i386/cleanup-2.c
index 7e60323373b..a24daba73da 100644
--- 

Re: [PATCH] c++: new-expr of array of deduced class tmpl [PR101988]

2022-01-26 Thread Jason Merrill via Gcc-patches

On 1/26/22 18:55, Marek Polacek wrote:

In r12-1933 I attempted to implement DR2397 aka allowing

   int a[3];
   auto ()[3] = a;

by removing the type_uses_auto check in create_array_type_for_decl.
That may have gone too far, because it also allows arrays of
CLASS_PLACEHOLDER_TEMPLATE and it looks like [dcl.type.class.deduct]
prohibits that: "...the declared type of the variable shall be cv T,
where T is the placeholder."  However, in /2 it explicitly states that
"A placeholder for a deduced class type can also be used in the
type-specifier-seq in the new-type-id or type-id of a new-expression."

In this PR, it manifested by making us accept invalid

   template struct A { A(T); };
   auto p = new A[]{1};

[expr.new]/2 says that such a construct is treated as an invented
declaration of the form

   A x[]{1};

but, I think, that ought to be ill-formed as per above.  So this patch
sort of restores the create_array_type_for_decl check.  I should mention
that the difference between [] and [1] is due to cp_parser_new_type_id:

   if (*nelts == NULL_TREE)
 /* Leave [] in the declarator.  */;

and groktypename returning different types based on that.

Does this make sense?  Bootstrapped/regtested on x86_64-pc-linux-gnu.


OK.


PR c++/101988

gcc/cp/ChangeLog:

* decl.cc (create_array_type_for_decl): Reject forming an array of
placeholder for a deduced class type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction-new1.C: New test.
* g++.dg/cpp23/auto-array2.C: New test.
---
  gcc/cp/decl.cc   | 12 
  .../g++.dg/cpp1z/class-deduction-new1.C  | 16 
  gcc/testsuite/g++.dg/cpp23/auto-array2.C | 11 +++
  3 files changed, 39 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array2.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 6534a7fd320..10e6956117e 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -11087,6 +11087,18 @@ create_array_type_for_decl (tree name, tree type, tree 
size, location_t loc)
if (type == error_mark_node || size == error_mark_node)
  return error_mark_node;
  
+  /* [dcl.type.class.deduct] prohibits forming an array of placeholder

+ for a deduced class type.  */
+  if (is_auto (type) && CLASS_PLACEHOLDER_TEMPLATE (type))
+{
+  if (name)
+   error_at (loc, "%qD declared as array of template placeholder "
+ "type %qT", name, type);
+  else
+   error ("creating array of template placeholder type %qT", type);
+  return error_mark_node;
+}
+
/* If there are some types which cannot be array elements,
   issue an error-message and return.  */
switch (TREE_CODE (type))
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C
new file mode 100644
index 000..70283353619
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C
@@ -0,0 +1,16 @@
+// PR c++/101988
+// { dg-do compile { target c++17 } }
+
+template
+struct A {
+  A(T);
+  A();
+};
+auto p1 = new A[]{1}; // { dg-error "creating array of template placeholder 
type" }
+auto p2 = new A[1]{1}; // { dg-error "invalid use of placeholder" }
+auto p3 = new A[]{1};
+auto p4 = new A[1]{1};
+auto p5 = new A[]{1, 2}; // { dg-error "creating array of template placeholder 
type" }
+auto p6 = new A[]{1, 2};
+auto p7 = new A[]{A(1), A(1)};
+auto p8 = new A[2]{A(1), A(1)};
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array2.C 
b/gcc/testsuite/g++.dg/cpp23/auto-array2.C
new file mode 100644
index 000..06431685b30
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-array2.C
@@ -0,0 +1,11 @@
+// PR c++/101988
+// { dg-do compile { target c++17 } }
+
+template struct A { A(); };
+A a[3];
+auto (*p)[3] = 
+A (*p2)[3] = 
+A (*p3)[3] =  // { dg-error "template placeholder type" }
+auto ()[3] = a;
+A ()[3] = a;
+A ()[3] = a; // { dg-error "template placeholder type" }

base-commit: fd5b0488ad5e4f29b65238e06a2d65b7de120235




[PATCH] c++: new-expr of array of deduced class tmpl [PR101988]

2022-01-26 Thread Marek Polacek via Gcc-patches
In r12-1933 I attempted to implement DR2397 aka allowing

  int a[3];
  auto ()[3] = a;

by removing the type_uses_auto check in create_array_type_for_decl.
That may have gone too far, because it also allows arrays of
CLASS_PLACEHOLDER_TEMPLATE and it looks like [dcl.type.class.deduct]
prohibits that: "...the declared type of the variable shall be cv T,
where T is the placeholder."  However, in /2 it explicitly states that
"A placeholder for a deduced class type can also be used in the
type-specifier-seq in the new-type-id or type-id of a new-expression."

In this PR, it manifested by making us accept invalid

  template struct A { A(T); };
  auto p = new A[]{1};

[expr.new]/2 says that such a construct is treated as an invented
declaration of the form

  A x[]{1};

but, I think, that ought to be ill-formed as per above.  So this patch
sort of restores the create_array_type_for_decl check.  I should mention
that the difference between [] and [1] is due to cp_parser_new_type_id:

  if (*nelts == NULL_TREE)
/* Leave [] in the declarator.  */;

and groktypename returning different types based on that.

Does this make sense?  Bootstrapped/regtested on x86_64-pc-linux-gnu.

PR c++/101988

gcc/cp/ChangeLog:

* decl.cc (create_array_type_for_decl): Reject forming an array of
placeholder for a deduced class type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction-new1.C: New test.
* g++.dg/cpp23/auto-array2.C: New test.
---
 gcc/cp/decl.cc   | 12 
 .../g++.dg/cpp1z/class-deduction-new1.C  | 16 
 gcc/testsuite/g++.dg/cpp23/auto-array2.C | 11 +++
 3 files changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array2.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 6534a7fd320..10e6956117e 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -11087,6 +11087,18 @@ create_array_type_for_decl (tree name, tree type, tree 
size, location_t loc)
   if (type == error_mark_node || size == error_mark_node)
 return error_mark_node;
 
+  /* [dcl.type.class.deduct] prohibits forming an array of placeholder
+ for a deduced class type.  */
+  if (is_auto (type) && CLASS_PLACEHOLDER_TEMPLATE (type))
+{
+  if (name)
+   error_at (loc, "%qD declared as array of template placeholder "
+ "type %qT", name, type);
+  else
+   error ("creating array of template placeholder type %qT", type);
+  return error_mark_node;
+}
+
   /* If there are some types which cannot be array elements,
  issue an error-message and return.  */
   switch (TREE_CODE (type))
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C
new file mode 100644
index 000..70283353619
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C
@@ -0,0 +1,16 @@
+// PR c++/101988
+// { dg-do compile { target c++17 } }
+
+template
+struct A {
+  A(T);
+  A();
+};
+auto p1 = new A[]{1}; // { dg-error "creating array of template placeholder 
type" }
+auto p2 = new A[1]{1}; // { dg-error "invalid use of placeholder" }
+auto p3 = new A[]{1};
+auto p4 = new A[1]{1};
+auto p5 = new A[]{1, 2}; // { dg-error "creating array of template placeholder 
type" }
+auto p6 = new A[]{1, 2};
+auto p7 = new A[]{A(1), A(1)};
+auto p8 = new A[2]{A(1), A(1)};
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array2.C 
b/gcc/testsuite/g++.dg/cpp23/auto-array2.C
new file mode 100644
index 000..06431685b30
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-array2.C
@@ -0,0 +1,11 @@
+// PR c++/101988
+// { dg-do compile { target c++17 } }
+
+template struct A { A(); };
+A a[3];
+auto (*p)[3] = 
+A (*p2)[3] = 
+A (*p3)[3] =  // { dg-error "template placeholder type" }
+auto ()[3] = a;
+A ()[3] = a;
+A ()[3] = a; // { dg-error "template placeholder type" }

base-commit: fd5b0488ad5e4f29b65238e06a2d65b7de120235
-- 
2.34.1



[PATCH RFA (tree)] c++: lambda in template default argument [PR103186]

2022-01-26 Thread Jason Merrill via Gcc-patches
The problem with this testcase was that since my patch for PR97900 we
weren't preserving DECL_UID identity for parameters of instantiations of
templated functions, so using those parameters as the keys for the
defarg_inst map broke.  I think this was always fragile given the
possibility of redeclarations, so instead of reverting that change let's
switch to keying off the function.

Memory use compiling stdc++.h is not noticeably different.

Tested x86_64-pc-linux-gnu.  Is the tree.{h,cc} change, just moving the class
definition from one to the other, OK for trunk?

PR c++/103186

gcc/ChangeLog:

* tree.h (struct tree_vec_map_cache_hasher): Move from...
* tree.cc (struct tree_vec_map_cache_hasher): ...here.

gcc/cp/ChangeLog:

* pt.cc (defarg_inst): Use tree_vec_map_cache_hasher.
(defarg_inst_for): New.
(tsubst_default_argument): Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-defarg10.C: New test.
---
 gcc/tree.h| 17 
 gcc/cp/pt.cc  | 43 +++
 .../g++.dg/cpp0x/lambda/lambda-defarg10.C | 21 +
 gcc/tree.cc   | 17 
 4 files changed, 72 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg10.C

diff --git a/gcc/tree.h b/gcc/tree.h
index 30bc53c2996..c5617fbcc61 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5559,6 +5559,23 @@ struct tree_decl_map_cache_hasher : 
ggc_cache_ptr_hash
 #define tree_vec_map_hash tree_decl_map_hash
 #define tree_vec_map_marked_p tree_map_base_marked_p
 
+struct tree_vec_map_cache_hasher : ggc_cache_ptr_hash
+{
+  static hashval_t hash (tree_vec_map *m) { return DECL_UID (m->base.from); }
+
+  static bool
+  equal (tree_vec_map *a, tree_vec_map *b)
+  {
+return a->base.from == b->base.from;
+  }
+
+  static int
+  keep_cache_entry (tree_vec_map *)
+  {
+return ggc_marked_p (m->base.from);
+  }
+};
+
 /* Hasher for tree decls.  Pointer equality is enough here, but the DECL_UID
is a better hash than the pointer value and gives a predictable traversal
order.  Additionally it can be used across PCH save/restore.  */
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 6fbda676527..db99b988fc3 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -13666,7 +13666,36 @@ tsubst_aggr_type (tree t,
 }
 }
 
-static GTY((cache)) decl_tree_cache_map *defarg_inst;
+/* Map from a FUNCTION_DECL to a vec of default argument instantiations.  */
+
+static GTY((cache)) hash_table *defarg_inst;
+
+/* Return a reference to the slot for the defarg inst of PARM.  */
+
+tree &
+defarg_inst_for (tree parm)
+{
+  if (!defarg_inst)
+defarg_inst = hash_table::create_ggc (13);
+  tree fn = DECL_CONTEXT (parm);
+  tree_vec_map in = { fn, nullptr };
+  tree_vec_map **slot
+= defarg_inst->find_slot_with_hash (, DECL_UID (fn), INSERT);
+  if (!*slot)
+{
+  *slot = ggc_alloc ();
+  **slot = in;
+}
+
+  /* Index in reverse order to avoid allocating space for initial parameters
+ that don't have default arguments.  */
+  unsigned ridx = list_length (parm);
+
+  vec * = (*slot)->to;
+  if (vec_safe_length (defs) < ridx)
+vec_safe_grow_cleared (defs, ridx);
+  return (*defs)[ridx - 1];
+}
 
 /* Substitute into the default argument ARG (a default argument for
FN), which has the indicated TYPE.  */
@@ -13696,9 +13725,9 @@ tsubst_default_argument (tree fn, int parmnum, tree 
type, tree arg,
 
   gcc_assert (same_type_ignoring_top_level_qualifiers_p (type, parmtype));
 
-  tree *slot;
-  if (defarg_inst && (slot = defarg_inst->get (parm)))
-return *slot;
+  tree  = defarg_inst_for (parm);
+  if (inst)
+return inst;
 
   /* This default argument came from a template.  Instantiate the
  default argument here, not in tsubst.  In the case of
@@ -13743,11 +13772,7 @@ tsubst_default_argument (tree fn, int parmnum, tree 
type, tree arg,
   pop_from_top_level ();
 
   if (arg != error_mark_node && !cp_unevaluated_operand)
-{
-  if (!defarg_inst)
-   defarg_inst = decl_tree_cache_map::create_ggc (37);
-  defarg_inst->put (parm, arg);
-}
+inst = arg;
 
   return arg;
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg10.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg10.C
new file mode 100644
index 000..e08eb4f5e56
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg10.C
@@ -0,0 +1,21 @@
+// PR c++/103186
+// { dg-do compile { target c++11 } }
+
+struct f
+{
+  template
+   f(const T1&){}
+};
+
+
+template class A {
+public:
+void foo(A a, const f& fn = [](){}) { }
+void bar(A a) { foo(a); }
+};
+int main() {
+A a;
+a.foo(a);
+a.bar(a);
+return 0;
+}
diff --git a/gcc/tree.cc b/gcc/tree.cc
index c4b36619e6a..6bd3b14c93c 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -242,23 +242,6 @@ static GTY ((cache))
 static GTY ((cache))
  hash_table 

Re: [PATCH] c++: non-dependent immediate member fn call [PR99895]

2022-01-26 Thread Patrick Palka via Gcc-patches
On Wed, 26 Jan 2022, Patrick Palka wrote:

> On Wed, Jan 19, 2022 at 2:19 PM Jason Merrill  wrote:
> >
> > On 1/19/22 11:15, Patrick Palka wrote:
> > > Here we're emitting a bogus error during ahead of time evaluation of a
> > > non-dependent immediate member function call such as a.f(args) because
> > > the defacto templated form for such a call is (a.f)(args) but we're
> > > trying to evaluate it using the intermediate CALL_EXPR built by
> > > build_over_call, which has the non-member form f(a, args).  The defacto
> > > member form is built in build_new_method_call, so it seems we should
> > > handle the immediate call there instead.
> >
> > Hmm, there's already a bunch of code in build_over_call to try to fix up
> > the object argument, and there seem to be many places other than
> > build_new_method_call that call build_over_call for member functions; I
> > think it's probably better to build the needed COMPONENT_REF in
> > build_over_call.
> 
> Ah, but in build_over_call the arguments (including the implicit
> object argument) are potentially wrapped in a NON_DEPENDENT_EXPR.  So
> even if we built a COMPONENT_REF in build_over_call, we'd  have to
> keep rebuilding the entire CALL_EXPR (including the COMPONENT_REF) in
> terms of the original arguments in build_new_method_call, IIUC.
> 
> On a related note, the NON_DEPENDENT_EXPR wrapping seems to be a
> problem for non-member functions too, because the constexpr engine
> punts on them:
> 
>   struct fixed_string { };
> 
>   static consteval void size(fixed_string) { }
> 
>   template
>   void VerifyHash(fixed_string s) {
> size(s);  // error: expression ‘s’ is not a constant expression
> (because it's wrapped in NON_DEPENDENT_EXPR)
>   }
> 
> I wonder if this means we should be evaluating non-dependent
> non-member immediate calls elsewhere, e.g. in finish_call_expr?  Or
> perhaps we should stop punting on NON_DEPENDENT_EXPR during constexpr
> evaluation?

Actually, for that particular example, we probably should just avoid
wrapping PARM_DECL in a NON_DEPENDENT_EXPR...

> 
> >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > trunk and perhaps 11?
> > >
> > >   PR c++/99895
> > >
> > > gcc/cp/ChangeLog:
> > >
> > >   * call.cc (build_over_call): Don't evaluate non-dependent
> > >   immediate member function calls here.
> > >   (build_new_method_call): Instead evaluate them here.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * g++.dg/cpp2a/consteval-memfn1.C: New test.
> > >   * g++.dg/cpp2a/consteval-memfn2.C: New test.
> > > ---
> > >   gcc/cp/call.cc|  9 -
> > >   gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C | 15 
> > >   gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C | 34 +++
> > >   3 files changed, 57 insertions(+), 1 deletion(-)
> > >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C
> > >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C
> > >
> > > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> > > index d4a07a7a9b3..0583cc0083b 100644
> > > --- a/gcc/cp/call.cc
> > > +++ b/gcc/cp/call.cc
> > > @@ -9241,7 +9241,10 @@ build_over_call (struct z_candidate *cand, int 
> > > flags, tsubst_flags_t complain)
> > >  addr, nargs, argarray);
> > > if (TREE_THIS_VOLATILE (fn) && cfun)
> > >   current_function_returns_abnormally = 1;
> > > -  if (immediate_invocation_p (fn, nargs))
> > > +  if (!DECL_FUNCTION_MEMBER_P (fn)
> > > +   /* Non-dependent immediate member function calls are evaluated in
> > > +  build_new_method_call.  */
> > > +   && immediate_invocation_p (fn, nargs))
> > >   {
> > > tree obj_arg = NULL_TREE, exprimm = expr;
> > > if (DECL_CONSTRUCTOR_P (fn))
> > > @@ -11227,6 +11230,10 @@ skip_prune:
> > > call = convert_from_reference (call);
> > > if (cast_to_void)
> > >   call = build_nop (void_type_node, call);
> > > +
> > > +  if (immediate_invocation_p (fn, vec_safe_length (orig_args)))
> > > + fold_non_dependent_expr (call, complain,
> > > +  /*manifestly_const_eval=*/true);
> > >   }
> > >
> > >/* Free all the conversions we allocated.  */
> > > diff --git a/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C 
> > > b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C
> > > new file mode 100644
> > > index 000..d2df2e9b5ae
> > > --- /dev/null
> > > +++ b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C
> > > @@ -0,0 +1,15 @@
> > > +// PR c++/99895
> > > +// { dg-do compile { target c++20 } }
> > > +
> > > +struct fixed_string {
> > > +  consteval int size(int n) const {
> > > +if (n < 0) throw; // { dg-error "not a constant" }
> > > +return n;
> > > +  }
> > > +};
> > > +
> > > +template
> > > +void VerifyHash(fixed_string s) {
> > > +  s.size(0); // { dg-bogus "" }
> > > +  s.size(-1); // { dg-message "expansion of" }
> > 

Re: [PATCH] rs6000: Move the hunk affecting VSX/ALTIVEC ahead [PR103627]

2022-01-26 Thread Segher Boessenkool
Hi!

On Thu, Dec 23, 2021 at 10:12:19AM +0800, Kewen.Lin wrote:
>   PR target/103627
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Move the
>   hunk affecting VSX and ALTIVEC to the appropriate place.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/103627
>   * gcc.target/powerpc/pr103627-3.c: New test.


> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -3955,6 +3955,15 @@ rs6000_option_override_internal (bool global_init_p)
>else if (TARGET_ALTIVEC)
>  rs6000_isa_flags |= (OPTION_MASK_PPC_GFXOPT & ~ignore_masks);
> 
> +  /* Disable VSX and Altivec silently if the user switched cpus to power7 in 
> a
> + target attribute or pragma which automatically enables both options,
> + unless the altivec ABI was set.  This is set by default for 64-bit, but
> + not for 32-bit.  Don't move this before the above code using 
> ignore_masks,
> + since it can reset the cleared VSX/ALTIVEC flag again.  */
> +  if (main_target_opt != NULL && !main_target_opt->x_rs6000_altivec_abi)
> +rs6000_isa_flags &= ~((OPTION_MASK_VSX | OPTION_MASK_ALTIVEC)
> +   & ~rs6000_isa_flags_explicit);

Could you at the same time get rid of the != NULL please?
  if (bla != NULL)
is sillier than
  if (bla != 0)
which is about the same as
  if (!!bla)
but that is certainly better than
  if (bla != 0 != 0)
although I am not sure about the more stylish
  if (bla != 0 != 0 != 0 != 0 != 0)
but what is wrong with
  if (bla)
?  :-)

> +/* There are no error messages for either LE or BE 64bit.  */
> +/* { dg-require-effective-target be }*/

(space before */)

> +/* { dg-require-effective-target ilp32 } */
> +/* We don't have one powerpc.*_ok for Power6, use altivec_ok conservatively. 
>  */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-mdejagnu-cpu=power6" } */

It is okay always, no _ok at all please.

Okay for trunk with those things (but do test of course).  Thanks!


Segher


Re: [PATCH] c++: non-dependent immediate member fn call [PR99895]

2022-01-26 Thread Patrick Palka via Gcc-patches
On Wed, Jan 19, 2022 at 2:19 PM Jason Merrill  wrote:
>
> On 1/19/22 11:15, Patrick Palka wrote:
> > Here we're emitting a bogus error during ahead of time evaluation of a
> > non-dependent immediate member function call such as a.f(args) because
> > the defacto templated form for such a call is (a.f)(args) but we're
> > trying to evaluate it using the intermediate CALL_EXPR built by
> > build_over_call, which has the non-member form f(a, args).  The defacto
> > member form is built in build_new_method_call, so it seems we should
> > handle the immediate call there instead.
>
> Hmm, there's already a bunch of code in build_over_call to try to fix up
> the object argument, and there seem to be many places other than
> build_new_method_call that call build_over_call for member functions; I
> think it's probably better to build the needed COMPONENT_REF in
> build_over_call.

Ah, but in build_over_call the arguments (including the implicit
object argument) are potentially wrapped in a NON_DEPENDENT_EXPR.  So
even if we built a COMPONENT_REF in build_over_call, we'd  have to
keep rebuilding the entire CALL_EXPR (including the COMPONENT_REF) in
terms of the original arguments in build_new_method_call, IIUC.

On a related note, the NON_DEPENDENT_EXPR wrapping seems to be a
problem for non-member functions too, because the constexpr engine
punts on them:

  struct fixed_string { };

  static consteval void size(fixed_string) { }

  template
  void VerifyHash(fixed_string s) {
size(s);  // error: expression ‘s’ is not a constant expression
(because it's wrapped in NON_DEPENDENT_EXPR)
  }

I wonder if this means we should be evaluating non-dependent
non-member immediate calls elsewhere, e.g. in finish_call_expr?  Or
perhaps we should stop punting on NON_DEPENDENT_EXPR during constexpr
evaluation?

>
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk and perhaps 11?
> >
> >   PR c++/99895
> >
> > gcc/cp/ChangeLog:
> >
> >   * call.cc (build_over_call): Don't evaluate non-dependent
> >   immediate member function calls here.
> >   (build_new_method_call): Instead evaluate them here.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * g++.dg/cpp2a/consteval-memfn1.C: New test.
> >   * g++.dg/cpp2a/consteval-memfn2.C: New test.
> > ---
> >   gcc/cp/call.cc|  9 -
> >   gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C | 15 
> >   gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C | 34 +++
> >   3 files changed, 57 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C
> >
> > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> > index d4a07a7a9b3..0583cc0083b 100644
> > --- a/gcc/cp/call.cc
> > +++ b/gcc/cp/call.cc
> > @@ -9241,7 +9241,10 @@ build_over_call (struct z_candidate *cand, int 
> > flags, tsubst_flags_t complain)
> >  addr, nargs, argarray);
> > if (TREE_THIS_VOLATILE (fn) && cfun)
> >   current_function_returns_abnormally = 1;
> > -  if (immediate_invocation_p (fn, nargs))
> > +  if (!DECL_FUNCTION_MEMBER_P (fn)
> > +   /* Non-dependent immediate member function calls are evaluated in
> > +  build_new_method_call.  */
> > +   && immediate_invocation_p (fn, nargs))
> >   {
> > tree obj_arg = NULL_TREE, exprimm = expr;
> > if (DECL_CONSTRUCTOR_P (fn))
> > @@ -11227,6 +11230,10 @@ skip_prune:
> > call = convert_from_reference (call);
> > if (cast_to_void)
> >   call = build_nop (void_type_node, call);
> > +
> > +  if (immediate_invocation_p (fn, vec_safe_length (orig_args)))
> > + fold_non_dependent_expr (call, complain,
> > +  /*manifestly_const_eval=*/true);
> >   }
> >
> >/* Free all the conversions we allocated.  */
> > diff --git a/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C 
> > b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C
> > new file mode 100644
> > index 000..d2df2e9b5ae
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C
> > @@ -0,0 +1,15 @@
> > +// PR c++/99895
> > +// { dg-do compile { target c++20 } }
> > +
> > +struct fixed_string {
> > +  consteval int size(int n) const {
> > +if (n < 0) throw; // { dg-error "not a constant" }
> > +return n;
> > +  }
> > +};
> > +
> > +template
> > +void VerifyHash(fixed_string s) {
> > +  s.size(0); // { dg-bogus "" }
> > +  s.size(-1); // { dg-message "expansion of" }
> > +}
> > diff --git a/gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C 
> > b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C
> > new file mode 100644
> > index 000..71748f46b13
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C
> > @@ -0,0 +1,34 @@
> > +// PR c++/99895
> > +// { dg-do compile { target c++20 } }
> > +
> > +static constexpr unsigned hash(const 

Re: [PATCH] rs6000: Disable MMA if no P9 VECTOR support [PR103627]

2022-01-26 Thread Segher Boessenkool
Hi!

On Thu, Dec 23, 2021 at 10:09:27AM +0800, Kewen.Lin wrote:
> As PR103627 shows, there is an unexpected case where !TARGET_VSX
> and TARGET_MMA co-exist.  As ISA3.1 claims, SIMD is a requirement
> for MMA.  By looking into the ICE, I noticed that the current
> MMA implementation depends on vector pairs load/store, but since
> we don't have a separated option to control Power10 vector, this
> patch is to check for Power9 vector instead.
> 
> Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
> powerpc64-linux-gnu P8.
> 
> Is it ok for trunk?

No, sorry.

> +  /* MMA requires SIMD support as ISA 3.1 claims and our implementation
> + such as "*movoo" uses vector pair access which are only supported
> + from ISA 3.1.  But since we don't have one separated option to
> + control Power10 vector, check for Power9 vector instead.  */
> +  if (TARGET_MMA && !TARGET_P9_VECTOR)
> +{
> +  if ((rs6000_isa_flags_explicit & OPTION_MASK_MMA) != 0)
> + error ("%qs requires %qs", "-mmma", "-mpower9-vector");
> +  rs6000_isa_flags &= ~OPTION_MASK_MMA;
> +}

-mpower9-vector is a workaround that should go away.  TARGET_MMA should
require ISA 3.1 (or POWER10) directly, and require VSX.

> +/* { dg-options "-mdejagnu-cpu=power10 -mno-power9-vector" } */

It should be impossible to select this at all.  Either you have vectors
or you don't, but it should be impossible to selectively disable part of
vector support.  That way madness lies.

We can allow no VSRs, only VRs, or all VSRs.  There is precedent for
that (see -msoft-float for example, which means "do not use FPRs") --
when compiling certain code we want to disallow whole register banks.
But disallowing or allowing some instructions separately is not a good
idea: it doesn't give any useful extra functionality, it is hard to use,
and it is a lot of extra work for us (it is impossible to test, already,
too many combinations).


Segher


Re: [Patch] libgomp.texi: Update OpenMP implementation status

2022-01-26 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 26, 2022 at 11:14:54PM +0100, Tobias Burnus wrote:
> ]I saw that the following now is implemented:
> 
> * requires: dynamic_allocators is now also works
>   (by assuming that it is supported, no longer giving 'sorry')
>   https://gcc.gnu.org/r12-6641-g450c85b81f4dd67bf6211d307afdc0f3c07ef44f
> 
> * in_reduction on target: Now also supported for Fortran
>   https://gcc.gnu.org/r12-4574-gd98626bf451dea6a28a42d953f7d0bd7659ad4d5
> (I kept the "P" because of "nowait")
> 
> 
> I also note – but did not change:
> 
> * omp_display_env runtime routine  P  Not inside target regions
> 
> I observe that OpenMP 5.2 added:
>   "When called from within a target region the effect is unspecified."
> Thus, regarding this as bug fix, it could also be "Y".
> Thoughts?

Yes, we should say Y instead of P for it.

> Otherwise: OK?

Yes.

Jakub



[Patch] libgomp.texi: Update OpenMP implementation status

2022-01-26 Thread Tobias Burnus

]I saw that the following now is implemented:

* requires: dynamic_allocators is now also works
  (by assuming that it is supported, no longer giving 'sorry')
  https://gcc.gnu.org/r12-6641-g450c85b81f4dd67bf6211d307afdc0f3c07ef44f

* in_reduction on target: Now also supported for Fortran
  https://gcc.gnu.org/r12-4574-gd98626bf451dea6a28a42d953f7d0bd7659ad4d5
(I kept the "P" because of "nowait")


I also note – but did not change:

* omp_display_env runtime routine  P  Not inside target regions

I observe that OpenMP 5.2 added:
  "When called from within a target region the effect is unspecified."
Thus, regarding this as bug fix, it could also be "Y".
Thoughts?

[Some output modifications are also part of the pending device-specific
ICV patch, https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588728.html
thus, as that patch are not yet it, one can also argue for "P" in terms of
the output.]


Otherwise: OK?

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
libgomp.texi: Update OpenMP implementation status

libgomp/
	* libgomp.texi (OpenMP 5.0): Update implementation status.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 3be9de51f11..73d9f4f76cb 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -187,7 +187,8 @@ The OpenMP 4.5 specification is fully supported.
   env variable @tab Y @tab
 @item Nested-parallel changes to @emph{max-active-levels-var} ICV @tab Y @tab
 @item @code{requires} directive @tab P
-  @tab Only fulfillable requirement is @code{atomic_default_mem_order}
+  @tab Only fulfillable requirement are @code{atomic_default_mem_order}
+  and @code{dynamic_allocators}
 @item @code{teams} construct outside an enclosing target region @tab Y @tab
 @item Non-rectangular loop nests @tab P @tab Only C/C++
 @item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
@@ -203,7 +204,7 @@ The OpenMP 4.5 specification is fully supported.
   @code{reduction} clause @tab Y @tab
 @item @code{in_reduction} clause on @code{task} constructs @tab Y @tab
 @item @code{in_reduction} clause on @code{target} constructs @tab P
-  @tab Only C/C++, @code{nowait} only stub
+  @tab @code{nowait} only stub
 @item @code{task_reduction} clause with @code{taskgroup} @tab Y @tab
 @item @code{task} modifier to @code{reduction} clause @tab Y @tab
 @item @code{affinity} clause to @code{task} construct @tab Y @tab Stub only


Re: [committed] libstdc++: Avoid symlink race in filesystem::remove_all [PR104161]

2022-01-26 Thread Jonathan Wakely via Gcc-patches
On Wed, 26 Jan 2022 at 22:08, Dimitar Dimitrov  wrote:
>
> On Tue, Jan 25, 2022 at 09:09:51PM +, Jonathan Wakely via Gcc-patches 
> wrote:
> > Tested x86_64-linux, pushed to trunk. Backports to follow.
> >
> >
> > This adds a new internal flag to the filesystem::directory_iterator
> > constructor that makes it fail if the path is a symlink that resolves to
> > a directory. This prevents filesystem::remove_all from following a
> > symlink to a directory, rather than deleting the symlink itself.
> >
> > We can also use that new flag in recursive_directory_iterator to ensure
> > that we don't follow symlinks if the follow_directory_symlink option is
> > not set.
> >
> > This also moves an error check in filesystem::remove_all after the while
> > loop, so that errors from the directory_iterator constructor are
> > reproted, instead of continuing to the filesystem::remove call below.
> >
> > libstdc++-v3/ChangeLog:
> >
> >   PR libstdc++/104161
> >   * acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for
> >   fdopendir.
> >   * config.h.in: Regenerate.
> >   * configure: Regenerate.
> >   * src/c++17/fs_dir.cc (_Dir): Add nofollow flag to constructor
> >   and pass it to base class constructor.
> >   (directory_iterator): Pass nofollow flag to _Dir constructor.
> >   (fs::recursive_directory_iterator::increment): Likewise.
> >   * src/c++17/fs_ops.cc (do_remove_all): Use nofollow option for
> >   directory_iterator constructor. Move error check outside loop.
> >   * src/filesystem/dir-common.h (_Dir_base): Add nofollow flag to
> >   constructor and when it's set use ::open with O_NOFOLLOW and
> >   O_DIRECTORY.
> >   * src/filesystem/dir.cc (_Dir): Add nofollow flag to constructor
> >   and pass it to base class constructor.
> >   (directory_iterator): Pass nofollow flag to _Dir constructor.
> >   (fs::recursive_directory_iterator::increment): Likewise.
> >   * src/filesystem/ops.cc (remove_all): Use nofollow option for
> >   directory_iterator constructor. Move error check outside loop.
> > ---
> >  libstdc++-v3/acinclude.m4| 12 ++
> >  libstdc++-v3/config.h.in |  3 ++
> >  libstdc++-v3/configure   | 55 
> >  libstdc++-v3/src/c++17/fs_dir.cc | 13 --
> >  libstdc++-v3/src/c++17/fs_ops.cc | 12 +++---
> >  libstdc++-v3/src/filesystem/dir-common.h | 48 -
> >  libstdc++-v3/src/filesystem/dir.cc   | 13 --
> >  libstdc++-v3/src/filesystem/ops.cc   |  6 +--
> >  8 files changed, 134 insertions(+), 28 deletions(-)
> >
> > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> > index d996477254c..7b6b807114a 100644
> > --- a/libstdc++-v3/acinclude.m4
> > +++ b/libstdc++-v3/acinclude.m4
> > @@ -4735,6 +4735,18 @@ dnl
> >if test $glibcxx_cv_truncate = yes; then
> >  AC_DEFINE(HAVE_TRUNCATE, 1, [Define if truncate is available in 
> > .])
> >fi
> > +dnl
> > +  AC_CACHE_CHECK([for fdopendir],
> > +glibcxx_cv_fdopendir, [dnl
> > +GCC_TRY_COMPILE_OR_LINK(
> > +  [#include ],
> > +  [::fdopendir(1);],
> > +  [glibcxx_cv_fdopendir=yes],
> > +  [glibcxx_cv_fdopendir=no])
> > +  ])
> > +  if test $glibcxx_cv_truncate = yes; then
>
> This is a typo. Should check glibcxx_cv_fdopendir.

Oops, thanks! Copy



Re: [committed] libstdc++: Avoid symlink race in filesystem::remove_all [PR104161]

2022-01-26 Thread Dimitar Dimitrov
On Tue, Jan 25, 2022 at 09:09:51PM +, Jonathan Wakely via Gcc-patches wrote:
> Tested x86_64-linux, pushed to trunk. Backports to follow.
> 
> 
> This adds a new internal flag to the filesystem::directory_iterator
> constructor that makes it fail if the path is a symlink that resolves to
> a directory. This prevents filesystem::remove_all from following a
> symlink to a directory, rather than deleting the symlink itself.
> 
> We can also use that new flag in recursive_directory_iterator to ensure
> that we don't follow symlinks if the follow_directory_symlink option is
> not set.
> 
> This also moves an error check in filesystem::remove_all after the while
> loop, so that errors from the directory_iterator constructor are
> reproted, instead of continuing to the filesystem::remove call below.
> 
> libstdc++-v3/ChangeLog:
> 
>   PR libstdc++/104161
>   * acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for
>   fdopendir.
>   * config.h.in: Regenerate.
>   * configure: Regenerate.
>   * src/c++17/fs_dir.cc (_Dir): Add nofollow flag to constructor
>   and pass it to base class constructor.
>   (directory_iterator): Pass nofollow flag to _Dir constructor.
>   (fs::recursive_directory_iterator::increment): Likewise.
>   * src/c++17/fs_ops.cc (do_remove_all): Use nofollow option for
>   directory_iterator constructor. Move error check outside loop.
>   * src/filesystem/dir-common.h (_Dir_base): Add nofollow flag to
>   constructor and when it's set use ::open with O_NOFOLLOW and
>   O_DIRECTORY.
>   * src/filesystem/dir.cc (_Dir): Add nofollow flag to constructor
>   and pass it to base class constructor.
>   (directory_iterator): Pass nofollow flag to _Dir constructor.
>   (fs::recursive_directory_iterator::increment): Likewise.
>   * src/filesystem/ops.cc (remove_all): Use nofollow option for
>   directory_iterator constructor. Move error check outside loop.
> ---
>  libstdc++-v3/acinclude.m4| 12 ++
>  libstdc++-v3/config.h.in |  3 ++
>  libstdc++-v3/configure   | 55 
>  libstdc++-v3/src/c++17/fs_dir.cc | 13 --
>  libstdc++-v3/src/c++17/fs_ops.cc | 12 +++---
>  libstdc++-v3/src/filesystem/dir-common.h | 48 -
>  libstdc++-v3/src/filesystem/dir.cc   | 13 --
>  libstdc++-v3/src/filesystem/ops.cc   |  6 +--
>  8 files changed, 134 insertions(+), 28 deletions(-)
> 
> diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> index d996477254c..7b6b807114a 100644
> --- a/libstdc++-v3/acinclude.m4
> +++ b/libstdc++-v3/acinclude.m4
> @@ -4735,6 +4735,18 @@ dnl
>if test $glibcxx_cv_truncate = yes; then
>  AC_DEFINE(HAVE_TRUNCATE, 1, [Define if truncate is available in 
> .])
>fi
> +dnl
> +  AC_CACHE_CHECK([for fdopendir],
> +glibcxx_cv_fdopendir, [dnl
> +GCC_TRY_COMPILE_OR_LINK(
> +  [#include ],
> +  [::fdopendir(1);],
> +  [glibcxx_cv_fdopendir=yes],
> +  [glibcxx_cv_fdopendir=no])
> +  ])
> +  if test $glibcxx_cv_truncate = yes; then

This is a typo. Should check glibcxx_cv_fdopendir.

Regards,
Dimitar


Re: [PATCH] rs6000: Fix up *intrin.h for C89 [PR104239]

2022-01-26 Thread Segher Boessenkool
Hi!

On Wed, Jan 26, 2022 at 10:03:36PM +0100, Jakub Jelinek wrote:
> When writing testcases for the previously posted patch, I have noticed
> that 3 of the headers aren't valid C89 (I didn't have any dg-options
> so -ansi -pedantic-errors was implied and these errors were reported).

Hrm.  Do they even work like that?

> Note, as can be seen even in this patch, seems older rs6000/*intrin.h
> headers uglify not just argument names (__A instead of A etc.), but also
> automatic variable names and other local identifiers, while e.g. emmintrin.h
> or bmi2intrin.h clearly uglify only the argument names and not local
> variables.  I think that should be fixed but don't have time for that myself
> (libstdc++ or e.g. the x86 headers uglify everything; this is so that one
> can
> #define result a + b
> #include 
> etc.).

Heh, I argued for this before, but mistakenly for arguments as well.
Not that I can remember right now how function arguments can mess up
either?

> --- gcc/config/rs6000/bmi2intrin.h.jj 2022-01-26 20:42:53.132315506 +
> +++ gcc/config/rs6000/bmi2intrin.h2022-01-26 20:46:33.687983641 +
> @@ -115,10 +115,11 @@ _pext_u64 (unsigned long long __X, unsig
> the Power8 Bit permute instruction.  */
>if (__builtin_constant_p (__M) && (__builtin_popcountl (__M) <= 8))
>  {
> +  long i;
>/* Also if the pext mask is constant, then the popcount is
> constant, we can evaluate the following loop at compile
> time and use a constant bit permute vector.  */
> -  for (long i = 0; i < __builtin_popcountl (__M); i++)
> +  for (i = 0; i < __builtin_popcountl (__M); i++)

Please put the declaration immediately before it is used, not at the
start of the function?

Okay for trunk like that.  Thanks!  Also any and all backports you want
are fine.


Segher


Re: [PATCH][GCC13?] RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'

2022-01-26 Thread Joseph Myers
On Wed, 26 Jan 2022, Joseph Myers wrote:

> fmin and fmax:

Also:

* If the arguments are +0 and -0 in some order, it's unspecified what sign 
the result has.

> fminimum and fmaximum:

> fminimum_num and fmaximum_num:

Also:

* These four functions are required to treat -0 as less than +0.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] rs6000: Fix up *intrin.h for C89 [PR104239]

2022-01-26 Thread Jakub Jelinek via Gcc-patches
Hi!

When writing testcases for the previously posted patch, I have noticed
that 3 of the headers aren't valid C89 (I didn't have any dg-options
so -ansi -pedantic-errors was implied and these errors were reported).

The following patch fixes those, ok for trunk?

Note, as can be seen even in this patch, seems older rs6000/*intrin.h
headers uglify not just argument names (__A instead of A etc.), but also
automatic variable names and other local identifiers, while e.g. emmintrin.h
or bmi2intrin.h clearly uglify only the argument names and not local
variables.  I think that should be fixed but don't have time for that myself
(libstdc++ or e.g. the x86 headers uglify everything; this is so that one
can
#define result a + b
#include 
etc.).

2022-01-26  Jakub Jelinek  

PR target/104239
* config/rs6000/emmintrin.h (_mm_sad_epu8): Use __asm__ instead of
asm.
* config/rs6000/smmintrin.h (_mm_minpos_epu16): Declare iterator
before for loop instead of for init clause.
* config/rs6000/bmi2intrin.h (_pext_u64): Likewise.

* gcc.target/powerpc/pr104239-3.c: New test.

--- gcc/config/rs6000/emmintrin.h.jj2022-01-11 22:56:21.316686838 +
+++ gcc/config/rs6000/emmintrin.h   2022-01-26 20:47:26.319336232 +
@@ -2215,7 +2215,7 @@ _mm_sad_epu8 (__m128i __A, __m128i __B)
   vsum = (__vector signed int) vec_sum4s (vabsdiff, zero);
 #ifdef __LITTLE_ENDIAN__
   /* Sum across four integers with two integer results.  */
-  asm ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "v" (zero));
+  __asm__ ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "v" (zero));
   /* Note: vec_sum2s could be used here, but on little-endian, vector
  shifts are added that are not needed for this use-case.
  A vector shift to correctly position the 32-bit integer results
--- gcc/config/rs6000/smmintrin.h.jj2022-01-19 10:27:58.529911366 +
+++ gcc/config/rs6000/smmintrin.h   2022-01-26 20:48:05.720348812 +
@@ -687,7 +687,8 @@ _mm_minpos_epu16 (__m128i __A)
   union __u __u = { .__m = __A }, __r = { .__m = {0} };
   unsigned short __ridx = 0;
   unsigned short __rmin = __u.__uh[__ridx];
-  for (unsigned long __i = 1; __i < 8; __i++)
+  unsigned long __i;
+  for (__i = 1; __i < 8; __i++)
 {
   if (__u.__uh[__i] < __rmin)
{
--- gcc/config/rs6000/bmi2intrin.h.jj   2022-01-26 20:42:53.132315506 +
+++ gcc/config/rs6000/bmi2intrin.h  2022-01-26 20:46:33.687983641 +
@@ -115,10 +115,11 @@ _pext_u64 (unsigned long long __X, unsig
the Power8 Bit permute instruction.  */
   if (__builtin_constant_p (__M) && (__builtin_popcountl (__M) <= 8))
 {
+  long i;
   /* Also if the pext mask is constant, then the popcount is
constant, we can evaluate the following loop at compile
time and use a constant bit permute vector.  */
-  for (long i = 0; i < __builtin_popcountl (__M); i++)
+  for (i = 0; i < __builtin_popcountl (__M); i++)
{
  c = __builtin_clzl (m);
  p = (p << 8) | c;
--- gcc/testsuite/gcc.target/powerpc/pr104239-3.c.jj2022-01-26 
20:52:42.987474394 +
+++ gcc/testsuite/gcc.target/powerpc/pr104239-3.c   2022-01-26 
20:52:36.547308886 +
@@ -0,0 +1,7 @@
+/* PR target/104239 */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -DNO_WARN_X86_INTRINSICS -std=c89" } 
*/
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+
+#include 
+
+int i;

Jakub



[PATCH] PR fortran/84784 - ICEs: verify_gimple failed with -fdefault-integer-8

2022-01-26 Thread Harald Anlauf via Gcc-patches
Dear Fortranners,

the use of -fdefault-integer-8 exhibits several cases where
we missed to convert the result of an intrinsic from the
declared to the effective resulting type.

The attached obvious patch fixes this for IMAGE_STATUS,
TEAM_NUMBER, and POPCNT/POPPAR.

OK for mainline if regtesting passes on x86_64-pc-linux-gnu?

Thanks,
Harald

From af5cb1f0ec1cacb47acc8c2b0c0629cf3808e1af Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 26 Jan 2022 21:50:41 +0100
Subject: [PATCH] Fortran: add missing conversions for result of intrinsics to
 result type

gcc/fortran/ChangeLog:

	PR fortran/84784
	* trans-intrinsic.cc (conv_intrinsic_image_status): Convert result
	to resulting (default) integer type.
	(conv_intrinsic_team_number): Likewise.
	(gfc_conv_intrinsic_popcnt_poppar): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/84784
	* gfortran.dg/pr84784.f90: New test.
---
 gcc/fortran/trans-intrinsic.cc| 13 +++--
 gcc/testsuite/gfortran.dg/pr84784.f90 | 22 ++
 2 files changed, 29 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr84784.f90

diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc
index fccf0a9b229..da854fad89d 100644
--- a/gcc/fortran/trans-intrinsic.cc
+++ b/gcc/fortran/trans-intrinsic.cc
@@ -2620,7 +2620,7 @@ conv_intrinsic_image_status (gfc_se *se, gfc_expr *expr)
   else
 gcc_unreachable ();

-  se->expr = tmp;
+  se->expr = fold_convert (gfc_get_int_type (gfc_default_integer_kind), tmp);
 }

 static void
@@ -2662,7 +2662,7 @@ conv_intrinsic_team_number (gfc_se *se, gfc_expr *expr)
   else
 gcc_unreachable ();

-  se->expr = tmp;
+  se->expr = fold_convert (gfc_get_int_type (gfc_default_integer_kind), tmp);
 }


@@ -7255,12 +7255,13 @@ gfc_conv_intrinsic_popcnt_poppar (gfc_se * se, gfc_expr *expr, int parity)

   /* Combine the results.  */
   if (parity)
-	se->expr = fold_build2_loc (input_location, BIT_XOR_EXPR, result_type,
-call1, call2);
+	se->expr = fold_build2_loc (input_location, BIT_XOR_EXPR,
+integer_type_node, call1, call2);
   else
-	se->expr = fold_build2_loc (input_location, PLUS_EXPR, result_type,
-call1, call2);
+	se->expr = fold_build2_loc (input_location, PLUS_EXPR,
+integer_type_node, call1, call2);

+  se->expr = convert (result_type, se->expr);
   return;
 }

diff --git a/gcc/testsuite/gfortran.dg/pr84784.f90 b/gcc/testsuite/gfortran.dg/pr84784.f90
new file mode 100644
index 000..48dd4dd4b0a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr84784.f90
@@ -0,0 +1,22 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=lib -fdefault-integer-8" }
+! { dg-require-effective-target fortran_integer_16 }
+! PR fortran/84784 - ICEs: verify_gimple failed with -fdefault-integer-8
+
+  use iso_fortran_env, only : team_type, STAT_FAILED_IMAGE
+  implicit none
+  type(team_type) :: team
+  integer :: new_team
+  new_team = mod(this_image(),2)+1
+  form team (new_team,team)
+change team (team)
+if (team_number() /= new_team) STOP 1
+  end team
+  if (image_status (1) == STAT_FAILED_IMAGE) ERROR STOP "cannot recover"
+  if (runtime_popcnt(0_16) /= 0) STOP 2
+contains
+  integer function runtime_popcnt (i)
+integer(kind=16), intent(in) :: i
+runtime_popcnt = popcnt(i)
+  end function
+end
--
2.31.1



Re: [PATCH] rs6000: Fix up #include or [PR104239]

2022-01-26 Thread David Edelsohn via Gcc-patches
On Wed, Jan 26, 2022 at 3:45 PM Jakub Jelinek  wrote:
>
> Hi!
>
> r12-4717-g7d37abedf58d66 added immintrin.h and x86gprintrin.h headers
> to rs6000, these headers are on x86 standalone headers that various
> programs include directly rather than including them through
> .
> Unfortunately, for that change the bmiintrin.h and bmi2intrin.h
> headers haven't been adjusted, so the effect is that if one includes them
> (without including also x86intrin.h first) #error will trigger.
> Furthermore, when including such headers conditionally as some real-world
> packages do, this means a regression.
>
> The following patch fixes it and matches what the x86 bmi{,2}intrin.h
> headers do.
>
> Bootstrapped/regtested on powerpc64le-linux, ok for trunk?

Okay.

Thanks for catching this.

- David

>
> 2022-01-26  Jakub Jelinek  
>
> PR target/104239
> * config/rs6000/bmiintrin.h: Test _X86GPRINTRIN_H_INCLUDED instead of
> _X86INTRIN_H_INCLUDED and adjust #error wording.
> * config/rs6000/bmi2intrin.h: Likewise.
>
> * gcc.target/powerpc/pr104239-1.c: New test.
> * gcc.target/powerpc/pr104239-2.c: New test.
>
> --- gcc/config/rs6000/bmiintrin.h.jj2022-01-11 23:11:21.936296534 +0100
> +++ gcc/config/rs6000/bmiintrin.h   2022-01-26 13:35:08.705945170 +0100
> @@ -29,8 +29,8 @@
> standard C or GNU C extensions, which are more portable and better
> optimized across multiple targets.  */
>
> -#if !defined _X86INTRIN_H_INCLUDED
> -# error "Never use  directly; include  instead."
> +#if !defined _X86GPRINTRIN_H_INCLUDED
> +# error "Never use  directly; include  instead."
>  #endif
>
>  #ifndef _BMIINTRIN_H_INCLUDED
> --- gcc/config/rs6000/bmi2intrin.h.jj   2022-01-11 23:11:21.936296534 +0100
> +++ gcc/config/rs6000/bmi2intrin.h  2022-01-26 13:34:53.373162122 +0100
> @@ -29,8 +29,8 @@
> standard C or GNU C extensions, which are more portable and better
> optimized across multiple targets.  */
>
> -#if !defined _X86INTRIN_H_INCLUDED
> -# error "Never use  directly; include  instead."
> +#if !defined _X86GPRINTRIN_H_INCLUDED
> +# error "Never use  directly; include  
> instead."
>  #endif
>
>  #ifndef _BMI2INTRIN_H_INCLUDED
> --- gcc/testsuite/gcc.target/powerpc/pr104239-1.c.jj2022-01-26 
> 13:42:34.103643030 +0100
> +++ gcc/testsuite/gcc.target/powerpc/pr104239-1.c   2022-01-26 
> 13:42:23.101798701 +0100
> @@ -0,0 +1,9 @@
> +/* PR target/104239 */
> +/* { dg-do compile } */
> +/* { dg-options "-DNO_WARN_X86_INTRINSICS" } */
> +
> +#if __has_include()
> +#include 
> +#endif
> +
> +int i;
> --- gcc/testsuite/gcc.target/powerpc/pr104239-2.c.jj2022-01-26 
> 13:42:42.279527345 +0100
> +++ gcc/testsuite/gcc.target/powerpc/pr104239-2.c   2022-01-26 
> 13:42:23.101798701 +0100
> @@ -0,0 +1,9 @@
> +/* PR target/104239 */
> +/* { dg-do compile } */
> +/* { dg-options "-DNO_WARN_X86_INTRINSICS" } */
> +
> +#if __has_include()
> +#include 
> +#endif
> +
> +int i;
>
> Jakub
>


[PATCH] rs6000: Fix up #include or [PR104239]

2022-01-26 Thread Jakub Jelinek via Gcc-patches
Hi!

r12-4717-g7d37abedf58d66 added immintrin.h and x86gprintrin.h headers
to rs6000, these headers are on x86 standalone headers that various
programs include directly rather than including them through
.
Unfortunately, for that change the bmiintrin.h and bmi2intrin.h
headers haven't been adjusted, so the effect is that if one includes them
(without including also x86intrin.h first) #error will trigger.
Furthermore, when including such headers conditionally as some real-world
packages do, this means a regression.

The following patch fixes it and matches what the x86 bmi{,2}intrin.h
headers do.

Bootstrapped/regtested on powerpc64le-linux, ok for trunk?

2022-01-26  Jakub Jelinek  

PR target/104239
* config/rs6000/bmiintrin.h: Test _X86GPRINTRIN_H_INCLUDED instead of
_X86INTRIN_H_INCLUDED and adjust #error wording.
* config/rs6000/bmi2intrin.h: Likewise.

* gcc.target/powerpc/pr104239-1.c: New test.
* gcc.target/powerpc/pr104239-2.c: New test.

--- gcc/config/rs6000/bmiintrin.h.jj2022-01-11 23:11:21.936296534 +0100
+++ gcc/config/rs6000/bmiintrin.h   2022-01-26 13:35:08.705945170 +0100
@@ -29,8 +29,8 @@
standard C or GNU C extensions, which are more portable and better
optimized across multiple targets.  */
 
-#if !defined _X86INTRIN_H_INCLUDED
-# error "Never use  directly; include  instead."
+#if !defined _X86GPRINTRIN_H_INCLUDED
+# error "Never use  directly; include  instead."
 #endif
 
 #ifndef _BMIINTRIN_H_INCLUDED
--- gcc/config/rs6000/bmi2intrin.h.jj   2022-01-11 23:11:21.936296534 +0100
+++ gcc/config/rs6000/bmi2intrin.h  2022-01-26 13:34:53.373162122 +0100
@@ -29,8 +29,8 @@
standard C or GNU C extensions, which are more portable and better
optimized across multiple targets.  */
 
-#if !defined _X86INTRIN_H_INCLUDED
-# error "Never use  directly; include  instead."
+#if !defined _X86GPRINTRIN_H_INCLUDED
+# error "Never use  directly; include  instead."
 #endif
 
 #ifndef _BMI2INTRIN_H_INCLUDED
--- gcc/testsuite/gcc.target/powerpc/pr104239-1.c.jj2022-01-26 
13:42:34.103643030 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr104239-1.c   2022-01-26 
13:42:23.101798701 +0100
@@ -0,0 +1,9 @@
+/* PR target/104239 */
+/* { dg-do compile } */
+/* { dg-options "-DNO_WARN_X86_INTRINSICS" } */
+
+#if __has_include()
+#include 
+#endif
+
+int i;
--- gcc/testsuite/gcc.target/powerpc/pr104239-2.c.jj2022-01-26 
13:42:42.279527345 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr104239-2.c   2022-01-26 
13:42:23.101798701 +0100
@@ -0,0 +1,9 @@
+/* PR target/104239 */
+/* { dg-do compile } */
+/* { dg-options "-DNO_WARN_X86_INTRINSICS" } */
+
+#if __has_include()
+#include 
+#endif
+
+int i;

Jakub



Re: [PATCH][GCC13?] RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'

2022-01-26 Thread Maciej W. Rozycki
On Wed, 26 Jan 2022, Joseph Myers wrote:

> fmin and fmax:
> 
> * Treat quiet NaN as missing data and return the other argument (if a 
> number).
> 
> * Treat signaling NaN like most functions (raise invalid, return quiet 
> NaN).
> 
> fminimum and fmaximum:
> 
> * Treat quiet NaN like most functions (return quiet NaN, no exceptions 
> raised unless the other argument is signaling NaN).
> 
> * Treat signaling NaN like most functions (raise invalid, return quiet 
> NaN).
> 
> fminimum_num and fmaximum_num:
> 
> * Treat both quiet and signaling NaN as missing data and return the other 
> argument (if a number).  "invalid" is still raised if one argument is a 
> signaling NaN (contrary to the normal rule that raising "invalid" means 
> also returning a quiet NaN).

 Thanks!

  Maciej


[PATCH v2][GCC13] RISC-V: Provide `fmin'/`fmax' RTL patterns

2022-01-26 Thread Maciej W. Rozycki
As at r2.2 of the RISC-V ISA specification[1] the FMIN and FMAX machine 
instructions fully match our requirement for the `fminM3' and `fmaxM3' 
standard RTL patterns:

"For FMIN and FMAX, if at least one input is a signaling NaN, or if both 
inputs are quiet NaNs, the result is the canonical NaN.  If one operand 
is a quiet NaN and the other is not a NaN, the result is the non-NaN 
operand."

suitably for the IEEE 754-2008 `minNum' and `maxNum' operations.

However we only define `sminM3' and `smaxM3' standard RTL patterns to 
produce the FMIN and FMAX machine instructions, which in turn causes the 
`__builtin_fmin' and `__builtin_fmax' family of intrinsics to emit the 
corresponding libcalls rather than the relevant machine instructions.  
This is according to earlier revisions of the RISC-V ISA specification, 
which we however do not support anymore, as from commit 4b81528241ca 
("RISC-V: Support version controling for ISA standard extensions").

As from r20190608 of the RISC-V ISA specification the definition of the 
FMIN and FMAX machine instructions has been updated[2]:

"Defined the signed-zero behavior of FMIN.fmt and FMAX.fmt, and changed 
their behavior on signaling-NaN inputs to conform to the minimumNumber 
and maximumNumber operations in the proposed IEEE 754-201x 
specification."

and specifically[3]:

"Floating-point minimum-number and maximum-number instructions FMIN.S 
and FMAX.S write, respectively, the smaller or larger of rs1 and rs2 to 
rd.  For the purposes of these instructions only, the value -0.0 is 
considered to be less than the value +0.0.  If both inputs are NaNs, the 
result is the canonical NaN.  If only one operand is a NaN, the result 
is the non-NaN operand.  Signaling NaN inputs set the invalid operation 
exception flag, even when the result is not NaN."

Consequently for forwards compatibility with r20190608+ hardware we 
cannot use the FMIN and FMAX machine instructions unconditionally even 
where the ISA level of r2.2 has been specified with the `-misa-spec=2.2' 
option where operation would be different between ISA revisions, that 
is the handling of signaling NaN inputs.

Therefore provide new `fmin3' and `fmax3' patterns removing 
the need to emit libcalls with the `__builtin_fmin' and `__builtin_fmax' 
family of intrinsics, however limit them to where `-fno-signaling-nans' 
is in effect, deferring to the existing `smin3' and `smax3' 
patterns otherwise.  For clarity and maintenance error resistance add an 
explicit condition to the latter patterns rather than relying on source 
code ordering for where the respective RTL insns are matched by their 
operation rather than being explicitly referred to by their names.

References:

[1] "The RISC-V Instruction Set Manual, Volume I: User-Level ISA",
Document Version 2.2, May 7, 2017, Section 8.3 "NaN Generation and 
Propagation", p. 48

[1] "The RISC-V Instruction Set Manual, Volume I: Unprivileged ISA",
Document Version 20190608-Base-Ratified, June 8, 2019, "Preface",
p. ii

[2] same, Section 11.6 "Single-Precision Floating-Point Computational
Instructions", p. 66

gcc/
* config/riscv/riscv.md (fmin3, fmax3): New insns.
(smin3, smax3): Only enable if `HONOR_SNANS'.
---
Hi,

 This updated version has passed full GCC regression testing (including 
the D frontend this time) with the `riscv64-linux-gnu' target using the 
HiFive Unmatched (U74 CPU) target board.

 Any further questions or comments?  Otherwise OK once GCC 13 has opened?

  Maciej

Changes from v1:

- Adjust heading from "RISC-V: Replace `smin'/`smax' RTL patterns with 
  `fmin'/`fmax'".

- Do not remove `smin'/`smax' patterns; instead make them available if
  `HONOR_SNANS (mode)'.

- Make `fmin'/`fmax' patterns available if `!HONOR_SNANS (mode)' 
  only.

- Update description accordingly; refer r2.2 and r20190608 ISA specs.
---
 gcc/config/riscv/riscv.md |   22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

gcc-riscv-fmin-fmax.diff
Index: gcc/gcc/config/riscv/riscv.md
===
--- gcc.orig/gcc/config/riscv/riscv.md
+++ gcc/gcc/config/riscv/riscv.md
@@ -1214,11 +1214,29 @@
 ;;
 ;;  
 
+(define_insn "fmin3"
+  [(set (match_operand:ANYF0 "register_operand" "=f")
+   (smin:ANYF (match_operand:ANYF 1 "register_operand" " f")
+  (match_operand:ANYF 2 "register_operand" " f")))]
+  "TARGET_HARD_FLOAT && !HONOR_SNANS (mode)"
+  "fmin.\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "")])
+
+(define_insn "fmax3"
+  [(set (match_operand:ANYF0 "register_operand" "=f")
+   (smax:ANYF (match_operand:ANYF 1 "register_operand" " f")
+  (match_operand:ANYF 2 "register_operand" " f")))]
+  "TARGET_HARD_FLOAT && !HONOR_SNANS (mode)"
+  "fmax.\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "")])
+
 (define_insn "smin3"
   [(set 

Re: [PATCH][GCC13?] RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'

2022-01-26 Thread Joseph Myers
On Wed, 26 Jan 2022, Maciej W. Rozycki wrote:

> > The newer instruction semantics correspond to the functions fminimum_num 
> > and fmaximum_num, not fminimum and fmaximum (which are functions that 
> > treat both quiet and signaling NaNs like most libm functions do - any 
> > argument a NaN means the result is NaN).
> 
>  I got these wrong then, sorry.  I got lost too: what is the difference 
> between `fmin'/`fmax' and `fminimum'/`fmaximum' then?  It looks to me like 
> we have a zoo of selection functions now with very subtle differences 
> between each other.

fmin and fmax:

* Treat quiet NaN as missing data and return the other argument (if a 
number).

* Treat signaling NaN like most functions (raise invalid, return quiet 
NaN).

fminimum and fmaximum:

* Treat quiet NaN like most functions (return quiet NaN, no exceptions 
raised unless the other argument is signaling NaN).

* Treat signaling NaN like most functions (raise invalid, return quiet 
NaN).

fminimum_num and fmaximum_num:

* Treat both quiet and signaling NaN as missing data and return the other 
argument (if a number).  "invalid" is still raised if one argument is a 
signaling NaN (contrary to the normal rule that raising "invalid" means 
also returning a quiet NaN).

-- 
Joseph S. Myers
jos...@codesourcery.com


[pushed] c++: vector compound literal [PR104206]

2022-01-26 Thread Jason Merrill via Gcc-patches
My patch for PR101072 removed the specific VECTOR_TYPE handling here, which
broke pr72747-2.c on PPC; this patch restores it.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/104206
PR c++/101072

gcc/cp/ChangeLog:

* semantics.cc (finish_compound_literal): Restore VECTOR_TYPE check.
---
 gcc/cp/semantics.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 07c2b3393be..466d6b56871 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -3272,8 +3272,9 @@ finish_compound_literal (tree type, tree compound_literal,
 
   /* Represent other compound literals with TARGET_EXPR so we produce
  a prvalue, and can elide copies.  */
-  if (TREE_CODE (compound_literal) == CONSTRUCTOR
-  || TREE_CODE (compound_literal) == VEC_INIT_EXPR)
+  if (!VECTOR_TYPE_P (type)
+  && (TREE_CODE (compound_literal) == CONSTRUCTOR
+ || TREE_CODE (compound_literal) == VEC_INIT_EXPR))
 {
   /* The CONSTRUCTOR is now an initializer, not a compound literal.  */
   if (TREE_CODE (compound_literal) == CONSTRUCTOR)

base-commit: 00d8321124123daf41f7c51526355a5a610cdeb8
prerequisite-patch-id: c7ab4056fcbd782232c8dc091597602ecd4a7a48
-- 
2.27.0



Re: [PATCH] MIPS: use 8bit for IPL in Cause register

2022-01-26 Thread Maciej W. Rozycki
On Wed, 26 Jan 2022, YunQiang Su wrote:

> Since MIPS r2, the IPL section in Cause register has been expand
> to 8bit instead of 6bit.

 Hmm, I cannot see it in my copy of the architecture manual I'm afraid.  
The interpretation may have changed, but the field is still 6-bit (not 
counting the software interrupts).  Now the MCU ASE does expand the IPL 
field, but we can't rely on that here, not at least unconditionally, and 
then MCU is MIPSr3+.

 What problem are you trying to solve anyway?

  Maciej


Re: [PATCH] rs6000: Fix constraint v with rs6000_constraints[RS6000_CONSTRAINT_v]

2022-01-26 Thread Segher Boessenkool
Hi!

On Wed, Jan 26, 2022 at 10:26:45AM +0800, Kewen.Lin wrote:
> on 2022/1/14 上午12:31, David Edelsohn wrote:
> Yeah, but IMHO it still can confuse new comers at first glance.

Yes, or at least cause to read (well, grep) the whole backend and
scratch heads.

> >> 2) Bootstrapped and tested one below patch to remove all the code using
> >> RS6000_CONSTRAINT_v on powerpc64le-linux-gnu P10 and P9,
> >> powerpc64-linux-gnu P8 and P7 with no regressions.

> > I would prefer that we not make gratuitous changes to this code, but
> > maybe Segher has a different opinion.
> 
> Thanks David for the comments!
> 
> Hi Segher, what's your preference on this?

I like your original patch better.  But for stage 1, sorry.

Indeed using ALTIVEC_REGS directly in the define_regiater_constraint
works fine, but it isn't as clear as it could be that is correct.


Segher


[PATCH] aarch64: [PR101529] Fix vector shuffle insertion expansion

2022-01-26 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

The function aarch64_evpc_ins would reuse the target even though
it might be the same register as the two inputs.
Instead of checking to see if we can reuse the target, just use the
original input directly.

Committed as approved after bootstrapped and tested on
aarch64-linux-gnu with no regressions.
Note the testcases are not backported as __builtin_shufflevector
does not exist in GCC 11.

PR target/101529

gcc/ChangeLog:

* config/aarch64/aarch64.c (aarch64_evpc_ins): Don't use target
as an input, use original one.

(cherry picked from commit 52fa771758635d9c53cddb9116e5a66fae592230)
---
 gcc/config/aarch64/aarch64.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index bbcf5ed4a61..b58a379759d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -23026,11 +23026,10 @@ aarch64_evpc_ins (struct expand_vec_perm_d *d)
 }
   gcc_assert (extractindex < nelt);
 
-  emit_move_insn (d->target, insv);
   insn_code icode = code_for_aarch64_simd_vec_copy_lane (mode);
   expand_operand ops[5];
   create_output_operand ([0], d->target, mode);
-  create_input_operand ([1], d->target, mode);
+  create_input_operand ([1], insv, mode);
   create_integer_operand ([2], 1 << idx);
   create_input_operand ([3], extractv, mode);
   create_integer_operand ([4], extractindex);
-- 
2.17.1



[PATCH] c++: constrained partial spec using qualified name [PR92944]

2022-01-26 Thread Patrick Palka via Gcc-patches
In the nested_name_specifier branch within cp_parser_class_head, we need
to update TYPE with the result of maybe_process_partial_specialization
just like we do in the template_id_p branch.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/92944

gcc/cp/ChangeLog:

* parser.cc (cp_parser_class_head): Update TYPE with the result
of maybe_process_partial_specialization in the
nested_name_specifier branch too.  Refactor nearby code to
accomodate that maybe_process_partial_specialization returns a
_TYPE, not a TYPE_DECL, and eliminate local variable CLASS_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-partial-spec10.C: New test.
---
 gcc/cp/parser.cc   | 18 +++---
 .../g++.dg/cpp2a/concepts-partial-spec10.C |  6 ++
 2 files changed, 13 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index ed219d79dc9..aa1768ffab1 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -26534,7 +26534,7 @@ cp_parser_class_head (cp_parser* parser,
 }
   else if (nested_name_specifier)
 {
-  tree class_type;
+  type = TREE_TYPE (type);
 
   /* Given:
 
@@ -26544,31 +26544,27 @@ cp_parser_class_head (cp_parser* parser,
 we will get a TYPENAME_TYPE when processing the definition of
 `S::T'.  We need to resolve it to the actual type before we
 try to define it.  */
-  if (TREE_CODE (TREE_TYPE (type)) == TYPENAME_TYPE)
+  if (TREE_CODE (type) == TYPENAME_TYPE)
{
- class_type = resolve_typename_type (TREE_TYPE (type),
- /*only_current_p=*/false);
- if (TREE_CODE (class_type) != TYPENAME_TYPE)
-   type = TYPE_NAME (class_type);
- else
+ type = resolve_typename_type (type, /*only_current_p=*/false);
+ if (TREE_CODE (type) == TYPENAME_TYPE)
{
  cp_parser_error (parser, "could not resolve typename type");
  type = error_mark_node;
}
}
 
-  if (maybe_process_partial_specialization (TREE_TYPE (type))
- == error_mark_node)
+  type = maybe_process_partial_specialization (type);
+  if (type == error_mark_node)
{
  type = NULL_TREE;
  goto done;
}
 
-  class_type = current_class_type;
   /* Enter the scope indicated by the nested-name-specifier.  */
   pushed_scope = push_scope (nested_name_specifier);
   /* Get the canonical version of this type.  */
-  type = TYPE_MAIN_DECL (TREE_TYPE (type));
+  type = TYPE_MAIN_DECL (type);
   /* Call push_template_decl if it seems like we should be defining a
 template either from the template headers or the type we're
 defining, so that we diagnose both extra and missing headers.  */
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C
new file mode 100644
index 000..8504a055ac5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C
@@ -0,0 +1,6 @@
+// PR c++/92944
+// { dg-do compile { target c++20 } }
+
+namespace ns { template struct A { }; }
+template requires true struct ns::A { };
+template requires false struct ns::A { };
-- 
2.35.0



[PATCH v2] warn-access: Prevent -Wuse-after-free on ARM [PR104213]

2022-01-26 Thread Marek Polacek via Gcc-patches
On Wed, Jan 26, 2022 at 09:39:46AM -0700, Martin Sebor wrote:
> On 1/26/22 08:24, Jason Merrill via Gcc-patches wrote:
> > On 1/25/22 18:33, Marek Polacek wrote:
> > > Here, -Wuse-after-free warns about using 'this' which, on ARM, cdtors
> > > return, as mandated by the EABI.  To be entirely correct, it only
> > > requires it for C1 and C2 ctors and D2 and D1 dtors, but I don't feel
> > > like changing that now and possibly running into issues later on.
> > > 
> > > This patch uses suppress_warning on 'this' for certain cdtor_returns_this
> > > cases in the C++ FE, and then warn_invalid_pointer makes use of this
> > > information and doesn't warn.
> > > 
> > > In my first attempt I tried suppress_warning the MODIFY_EXPR or
> > > RETURN_EXPR
> > > we build in build_delete_destructor_body, but the complication is that
> > > the suppress_warning bits don't always survive gimplification; see e.g.
> > > gimplify_modify_expr where we do
> > > 
> > >   6130   if (COMPARISON_CLASS_P (*from_p))
> > >   6131 copy_warning (assign, *from_p);
> > > 
> > > but here we're not dealing with a comparison.  Removing that check
> > > regresses uninit-pr74762.C.  Adding copy_warning (assign, *expr_p)
> > > regresses c-c++-common/uninit-17.c.
> > > 
> > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > OK if Martin doesn't have any suggestions.
> 
> Nothing further from me.

Here's what I've pushed then (it adds the testcase Martin mentioned earlier).

Thanks,

-- >8 --
Here, -Wuse-after-free warns about using 'this' which, on ARM, cdtors
return, as mandated by the EABI.  To be entirely correct, it only
requires it for C1 and C2 ctors and D2 and D1 dtors, but I don't feel
like changing that now and possibly running into issues later on.

This patch uses suppress_warning on 'this' for certain cdtor_returns_this
cases in the C++ FE, and then warn_invalid_pointer makes use of this
information and doesn't warn.

In my first attempt I tried suppress_warning the MODIFY_EXPR or RETURN_EXPR
we build in build_delete_destructor_body, but the complication is that
the suppress_warning bits don't always survive gimplification; see e.g.
gimplify_modify_expr where we do

 6130   if (COMPARISON_CLASS_P (*from_p))
 6131 copy_warning (assign, *from_p);

but here we're not dealing with a comparison.  Removing that check
regresses uninit-pr74762.C.  Adding copy_warning (assign, *expr_p)
regresses c-c++-common/uninit-17.c.

PR target/104213

gcc/cp/ChangeLog:

* decl.cc (finish_constructor_body): Suppress -Wuse-after-free.
(finish_destructor_body): Likewise.
* optimize.cc (build_delete_destructor_body): Likewise.

gcc/ChangeLog:

* gimple-ssa-warn-access.cc (pass_waccess::warn_invalid_pointer): Don't
warn when the SSA_NAME_VAR of REF has supressed -Wuse-after-free.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wuse-after-free2.C: New test.
* g++.dg/warn/Wuse-after-free3.C: New test.
---
 gcc/cp/decl.cc   |  2 ++
 gcc/cp/optimize.cc   |  1 +
 gcc/gimple-ssa-warn-access.cc| 14 +++---
 gcc/testsuite/g++.dg/warn/Wuse-after-free2.C | 10 ++
 gcc/testsuite/g++.dg/warn/Wuse-after-free3.C | 16 
 5 files changed, 40 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wuse-after-free2.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wuse-after-free3.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 22d3dd1e2ad..6534a7fd320 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -17315,6 +17315,7 @@ finish_constructor_body (void)
   add_stmt (build_stmt (input_location, LABEL_EXPR, cdtor_label));
 
   val = DECL_ARGUMENTS (current_function_decl);
+  suppress_warning (val, OPT_Wuse_after_free);
   val = build2 (MODIFY_EXPR, TREE_TYPE (val),
DECL_RESULT (current_function_decl), val);
   /* Return the address of the object.  */
@@ -17408,6 +17409,7 @@ finish_destructor_body (void)
   tree val;
 
   val = DECL_ARGUMENTS (current_function_decl);
+  suppress_warning (val, OPT_Wuse_after_free);
   val = build2 (MODIFY_EXPR, TREE_TYPE (val),
DECL_RESULT (current_function_decl), val);
   /* Return the address of the object.  */
diff --git a/gcc/cp/optimize.cc b/gcc/cp/optimize.cc
index 4ad3f1dc9aa..13ab8b7361e 100644
--- a/gcc/cp/optimize.cc
+++ b/gcc/cp/optimize.cc
@@ -166,6 +166,7 @@ build_delete_destructor_body (tree delete_dtor, tree 
complete_dtor)
   if (targetm.cxx.cdtor_returns_this ())
 {
   tree val = DECL_ARGUMENTS (delete_dtor);
+  suppress_warning (val, OPT_Wuse_after_free);
   val = build2 (MODIFY_EXPR, TREE_TYPE (val),
 DECL_RESULT (delete_dtor), val);
   add_stmt (build_stmt (0, RETURN_EXPR, val));
diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index afcf0f71bec..3dcaf4230b8 100644

Re: [PATCH] warn-access: Prevent -Wuse-after-free on ARM [PR104213]

2022-01-26 Thread Martin Sebor via Gcc-patches

On 1/26/22 08:24, Jason Merrill via Gcc-patches wrote:

On 1/25/22 18:33, Marek Polacek wrote:

Here, -Wuse-after-free warns about using 'this' which, on ARM, cdtors
return, as mandated by the EABI.  To be entirely correct, it only
requires it for C1 and C2 ctors and D2 and D1 dtors, but I don't feel
like changing that now and possibly running into issues later on.

This patch uses suppress_warning on 'this' for certain cdtor_returns_this
cases in the C++ FE, and then warn_invalid_pointer makes use of this
information and doesn't warn.

In my first attempt I tried suppress_warning the MODIFY_EXPR or 
RETURN_EXPR

we build in build_delete_destructor_body, but the complication is that
the suppress_warning bits don't always survive gimplification; see e.g.
gimplify_modify_expr where we do

  6130   if (COMPARISON_CLASS_P (*from_p))
  6131 copy_warning (assign, *from_p);

but here we're not dealing with a comparison.  Removing that check
regresses uninit-pr74762.C.  Adding copy_warning (assign, *expr_p)
regresses c-c++-common/uninit-17.c.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK if Martin doesn't have any suggestions.


Nothing further from me.

Thanks
Martin




PR target/104213

gcc/cp/ChangeLog:

* decl.cc (finish_constructor_body): Suppress -Wuse-after-free.
(finish_destructor_body): Likewise.
* optimize.cc (build_delete_destructor_body): Likewise.

gcc/ChangeLog:

* gimple-ssa-warn-access.cc (pass_waccess::warn_invalid_pointer): 
Don't

warn when the SSA_NAME_VAR of REF has supressed -Wuse-after-free.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wuse-after-free2.C: New test.
---
  gcc/cp/decl.cc   |  2 ++
  gcc/cp/optimize.cc   |  1 +
  gcc/gimple-ssa-warn-access.cc    | 14 +++---
  gcc/testsuite/g++.dg/warn/Wuse-after-free2.C | 10 ++
  4 files changed, 24 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wuse-after-free2.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 22d3dd1e2ad..6534a7fd320 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -17315,6 +17315,7 @@ finish_constructor_body (void)
    add_stmt (build_stmt (input_location, LABEL_EXPR, cdtor_label));
    val = DECL_ARGUMENTS (current_function_decl);
+  suppress_warning (val, OPT_Wuse_after_free);
    val = build2 (MODIFY_EXPR, TREE_TYPE (val),
  DECL_RESULT (current_function_decl), val);
    /* Return the address of the object.  */
@@ -17408,6 +17409,7 @@ finish_destructor_body (void)
    tree val;
    val = DECL_ARGUMENTS (current_function_decl);
+  suppress_warning (val, OPT_Wuse_after_free);
    val = build2 (MODIFY_EXPR, TREE_TYPE (val),
  DECL_RESULT (current_function_decl), val);
    /* Return the address of the object.  */
diff --git a/gcc/cp/optimize.cc b/gcc/cp/optimize.cc
index 4ad3f1dc9aa..13ab8b7361e 100644
--- a/gcc/cp/optimize.cc
+++ b/gcc/cp/optimize.cc
@@ -166,6 +166,7 @@ build_delete_destructor_body (tree delete_dtor, 
tree complete_dtor)

    if (targetm.cxx.cdtor_returns_this ())
  {
    tree val = DECL_ARGUMENTS (delete_dtor);
+  suppress_warning (val, OPT_Wuse_after_free);
    val = build2 (MODIFY_EXPR, TREE_TYPE (val),
  DECL_RESULT (delete_dtor), val);
    add_stmt (build_stmt (0, RETURN_EXPR, val));
diff --git a/gcc/gimple-ssa-warn-access.cc 
b/gcc/gimple-ssa-warn-access.cc

index 8bc33eeb6fa..484bcd34c25 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -3880,9 +3880,17 @@ pass_waccess::warn_invalid_pointer (tree ref, 
gimple *use_stmt,

  bool maybe, bool equality /* = false */)
  {
    /* Avoid printing the unhelpful "" in the diagnostics.  */
-  if (ref && TREE_CODE (ref) == SSA_NAME
-  && (!SSA_NAME_VAR (ref) || DECL_ARTIFICIAL (SSA_NAME_VAR (ref
-    ref = NULL_TREE;
+  if (ref && TREE_CODE (ref) == SSA_NAME)
+    {
+  tree var = SSA_NAME_VAR (ref);
+  if (!var)
+    ref = NULL_TREE;
+  /* Don't warn for cases like when a cdtor returns 'this' on 
ARM.  */

+  else if (warning_suppressed_p (var, OPT_Wuse_after_free))
+    return;
+  else if (DECL_ARTIFICIAL (var))
+    ref = NULL_TREE;
+    }
    location_t use_loc = gimple_location (use_stmt);
    if (use_loc == UNKNOWN_LOCATION)
diff --git a/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C 
b/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C

new file mode 100644
index 000..6d5f2bf01b5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C
@@ -0,0 +1,10 @@
+// PR target/104213
+// { dg-do compile }
+// { dg-options "-Wuse-after-free" }
+
+class C
+{
+    virtual ~C();
+};
+
+C::~C() {} // { dg-bogus "used after" }

base-commit: aeac414923aa1e87986c7fc6f9b921d89a9b86cf






[pushed] c++: ->template and using-decl [PR104235]

2022-01-26 Thread Jason Merrill via Gcc-patches
cp_parser_template_id wasn't prepared to handle getting a USING_DECL back
from cp_parser_template_name.  Let's defer that case to instantiation time,
as well.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/104235

gcc/cp/ChangeLog:

* parser.cc (cp_parser_template_name): Repeat lookup of USING_DECL.

gcc/testsuite/ChangeLog:

* g++.dg/parse/template-keyword2.C: New test.
---
 gcc/cp/parser.cc   | 3 ++-
 gcc/testsuite/g++.dg/parse/template-keyword2.C | 8 
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/parse/template-keyword2.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index ed219d79dc9..8b38165020f 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -18680,7 +18680,8 @@ cp_parser_template_name (cp_parser* parser,
  cp_parser_error (parser, "expected template-name");
  return error_mark_node;
}
-  else if (!DECL_P (decl) && !is_overloaded_fn (decl))
+  else if ((!DECL_P (decl) && !is_overloaded_fn (decl))
+  || TREE_CODE (decl) == USING_DECL)
/* Repeat the lookup at instantiation time.  */
decl = identifier;
 }
diff --git a/gcc/testsuite/g++.dg/parse/template-keyword2.C 
b/gcc/testsuite/g++.dg/parse/template-keyword2.C
new file mode 100644
index 000..ecd066787bc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/template-keyword2.C
@@ -0,0 +1,8 @@
+// PR c++/104235
+
+template 
+struct L: M {
+  using M::a;
+  void a();
+  void p() { this->template a<>(); }
+};

base-commit: 1bc00a489086b0bdde89ccb11dfa4f50b6c4e8f0
prerequisite-patch-id: c7ab4056fcbd782232c8dc091597602ecd4a7a48
-- 
2.27.0



Re: [PATCH] tree-optimization/104162 - CSE of [ptr].a[i] and ptr + CST

2022-01-26 Thread Jeff Law via Gcc-patches




On 1/26/2022 8:56 AM, Richard Biener via Gcc-patches wrote:

This adds the capability to value-numbering of treating complex
address expressions where the offset becomes invariant as equal
to a POINTER_PLUS_EXPR.  This restores CSE that is now prevented
by early lowering of [ptr + CST] to a POINTER_PLUS_EXPR.

Unfortunately this regresses gcc.dg/asan/pr99673.c again.

Bootstrapped and tested on x86_64-unknown-linux-gnu - any
opinions?

Richard.

2022-01-26  Richard Biener  

PR tree-optimization/104162
* tree-ssa-sccvn.cc (vn_reference_lookup): Handle
[_1 + 5].a[i] like a POINTER_PLUS_EXPR if the offset
becomes invariant.
(vn_reference_insert): Likewise.

* gcc.dg/tree-ssa/ssa-fre-99.c: New testcase.
IIRC there's a handful of old regressions this might help.  More than 
once I've been tempted to do something similar in DOM -- with this 
addition it's just one more reason we should be looking to drop DOM and 
replace it with FRE going forward.


jeff



[PATCH] tree-optimization/104162 - CSE of [ptr].a[i] and ptr + CST

2022-01-26 Thread Richard Biener via Gcc-patches
This adds the capability to value-numbering of treating complex
address expressions where the offset becomes invariant as equal
to a POINTER_PLUS_EXPR.  This restores CSE that is now prevented
by early lowering of [ptr + CST] to a POINTER_PLUS_EXPR.

Unfortunately this regresses gcc.dg/asan/pr99673.c again.

Bootstrapped and tested on x86_64-unknown-linux-gnu - any
opinions?

Richard.

2022-01-26  Richard Biener  

PR tree-optimization/104162
* tree-ssa-sccvn.cc (vn_reference_lookup): Handle
[_1 + 5].a[i] like a POINTER_PLUS_EXPR if the offset
becomes invariant.
(vn_reference_insert): Likewise.

* gcc.dg/tree-ssa/ssa-fre-99.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c | 27 ++
 gcc/tree-ssa-sccvn.cc  | 62 +-
 2 files changed, 88 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c
new file mode 100644
index 000..101d0d63f7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* Disable FRE1 because that for the sake of __builtin_object_size
+   will not consider the equality but still valueize 'i', defeating
+   the purpose of the check.  */
+/* { dg-options "-O -fdump-tree-fre3 -fdisable-tree-fre1" } */
+
+struct S { int a[4]; };
+
+int i;
+int bar (struct S *p)
+{
+  char *q = (char *)p + 4;
+  i = 1;
+  int *r = &((struct S *)p)->a[i];
+  return q == (char *)r;
+}
+int baz (struct S *p)
+{
+  i = 1;
+  int *r = &((struct S *)p)->a[i];
+  char *q = (char *)p + 4;
+  return q == (char *)r;
+}
+
+/* Verify FRE can handle valueizing >a[i] and value-numbering it
+   equal to a POINTER_PLUS_EXPR.  */
+/* { dg-final { scan-tree-dump-times "return 1;" 2 "fre3" } } */
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index a03f0aae924..482cf372686 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -3669,6 +3669,36 @@ vn_reference_lookup (tree op, tree vuse, vn_lookup_kind 
kind,
   vr1.vuse = vuse_ssa_val (vuse);
   vr1.operands = operands
 = valueize_shared_reference_ops_from_ref (op, _anything);
+
+  /* Handle [ptr + 5].b[1].c as POINTER_PLUS_EXPR.  */
+  if (operands[0].opcode == ADDR_EXPR
+  && operands.last ().opcode == SSA_NAME)
+{
+  poly_int64 off = 0;
+  vn_reference_op_t vro;
+  unsigned i;
+  for (i = 1; operands.iterate (i, ); ++i)
+   {
+ if (vro->opcode == SSA_NAME)
+   break;
+ else if (known_eq (vro->off, -1))
+   break;
+ off += vro->off;
+   }
+  if (i == operands.length () - 1)
+   {
+ gcc_assert (operands[i-1].opcode == MEM_REF);
+ tree ops[2];
+ ops[0] = operands[i].op0;
+ ops[1] = wide_int_to_tree (sizetype, off);
+ tree res = vn_nary_op_lookup_pieces (2, POINTER_PLUS_EXPR,
+  TREE_TYPE (op), ops, NULL);
+ if (res)
+   return res;
+ return NULL_TREE;
+   }
+}
+
   vr1.type = TREE_TYPE (op);
   ao_ref op_ref;
   ao_ref_init (_ref, op);
@@ -3760,13 +3790,43 @@ vn_reference_insert (tree op, tree result, tree vuse, 
tree vdef)
   vn_reference_t vr1;
   bool tem;
 
+  vec operands
+= valueize_shared_reference_ops_from_ref (op, );
+  /* Handle [ptr + 5].b[1].c as POINTER_PLUS_EXPR.  */
+  if (operands[0].opcode == ADDR_EXPR
+  && operands.last ().opcode == SSA_NAME)
+{
+  poly_int64 off = 0;
+  vn_reference_op_t vro;
+  unsigned i;
+  for (i = 1; operands.iterate (i, ); ++i)
+   {
+ if (vro->opcode == SSA_NAME)
+   break;
+ else if (known_eq (vro->off, -1))
+   break;
+ off += vro->off;
+   }
+  if (i == operands.length () - 1)
+   {
+ gcc_assert (operands[i-1].opcode == MEM_REF);
+ tree ops[2];
+ ops[0] = operands[i].op0;
+ ops[1] = wide_int_to_tree (sizetype, off);
+ vn_nary_op_insert_pieces (2, POINTER_PLUS_EXPR,
+   TREE_TYPE (op), ops, result,
+   VN_INFO (result)->value_id);
+ return;
+   }
+}
+
   vr1 = XOBNEW (_tables_obstack, vn_reference_s);
   if (TREE_CODE (result) == SSA_NAME)
 vr1->value_id = VN_INFO (result)->value_id;
   else
 vr1->value_id = get_or_alloc_constant_value_id (result);
   vr1->vuse = vuse_ssa_val (vuse);
-  vr1->operands = valueize_shared_reference_ops_from_ref (op, ).copy ();
+  vr1->operands = operands.copy ();
   vr1->type = TREE_TYPE (op);
   vr1->punned = false;
   ao_ref op_ref;
-- 
2.31.1


Re: [PATCH v1] rtl: builtins: Fix builtins feclearexcept and feraiseexcept operand check [PR94193]

2022-01-26 Thread Segher Boessenkool
On Wed, Jan 26, 2022 at 12:05:54PM -0300, Raoni Fassina Firmino wrote:
> Commit 4343f5e25679 ("rtl: builtins: (not just) rs6000: Add builtins
> for fegetround, feclearexcept and feraiseexcept [PR94193]") broke gcc
> bootstra when building with --enable-checking=rtl[1].
> 
> The function expand_builtin_feclear_feraise_except was failing to
> proper validate op0 predicate before emit_insn leading to the mismatch
> type failure.

> gcc/
>   PR target/94193
>   * builtins.cc (expand_builtin_feclear_feraise_except): Add op0
>   predicate check.

Perfect, and pushed.  Thank you!


Segher


Re: [PATCH] IPA mod-ref: fix usage of --param names in dump messages.

2022-01-26 Thread Richard Biener via Gcc-patches
On Wed, Jan 26, 2022 at 3:49 PM Martin Liška  wrote:
>
> The patch fixed bad --param param=foo names.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> * ipa-modref-tree.cc (modref_access_node::update):
> Remove "--param param=foo" with "--param foo".
> (modref_access_node::insert): Likewise.
> (modref_access_node::insert_kill): Likewise.
> * ipa-modref-tree.h (struct modref_ref_node): Likewise.
> (struct modref_base_node): Likewise.
> (struct modref_tree): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/modref-7.c: Update scanned patterns.
> * gcc.dg/tree-ssa/modref-8.c: Likewise.
> ---
>   gcc/ipa-modref-tree.cc   | 10 --
>   gcc/ipa-modref-tree.h|  9 -
>   gcc/testsuite/gcc.dg/tree-ssa/modref-7.c |  2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/modref-8.c |  4 ++--
>   4 files changed, 11 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/ipa-modref-tree.cc b/gcc/ipa-modref-tree.cc
> index 97e497accf2..828994f3536 100644
> --- a/gcc/ipa-modref-tree.cc
> +++ b/gcc/ipa-modref-tree.cc
> @@ -130,8 +130,7 @@ modref_access_node::update (poly_int64 parm_offset1,
> else
>   {
> if (dump_file)
> -   fprintf (dump_file,
> -"--param param=modref-max-adjustments limit reached:");
> +   fprintf (dump_file, "--param modref-max-adjustments limit reached:");
> if (!known_eq (parm_offset, parm_offset1))
> {
>   if (dump_file)
> @@ -594,11 +593,11 @@ modref_access_node::insert (vec  va_gc> *,
> return -1;
> if (dump_file && best2 >= 0)
> fprintf (dump_file,
> -"--param param=modref-max-accesses limit reached;"
> +"--param modref-max-accesses limit reached;"
>  " merging %i and %i\n", best1, best2);
> else if (dump_file)
> fprintf (dump_file,
> -"--param param=modref-max-accesses limit reached;"
> +"--param modref-max-accesses limit reached;"
>  " merging with %i\n", best1);
> modref_access_node::try_merge_with (accesses, best1);
> if (best2 >= 0)
> @@ -825,8 +824,7 @@ modref_access_node::insert_kill (vec 
> ,
> if ((int)kills.length () >= param_modref_max_accesses)
> {
>   if (dump_file)
> -   fprintf (dump_file,
> -"--param param=modref-max-accesses limit reached:");
> +   fprintf (dump_file, "--param modref-max-accesses limit reached:");
>   return false;
> }
> a.adjustments = 0;
> diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
> index edb3b49374a..fdaa9612e9a 100644
> --- a/gcc/ipa-modref-tree.h
> +++ b/gcc/ipa-modref-tree.h
> @@ -197,8 +197,7 @@ struct GTY((user)) modref_ref_node
> {
> if (dump_file)
>   fprintf (dump_file,
> -  "--param param=modref-max-accesses limit reached;"
> -  " collapsing\n");
> +  "--param modref-max-accesses limit reached; collapsing\n");
> collapse ();
> }
>   return ret != 0;
> @@ -252,7 +251,7 @@ struct GTY((user)) modref_base_node
>   if (ref && refs && refs->length () >= max_refs)
> {
> if (dump_file)
> - fprintf (dump_file, "--param param=modref-max-refs limit reached;"
> + fprintf (dump_file, "--param modref-max-refs limit reached;"
>" using 0\n");
> ref = 0;
> ref_node = search (ref);
> @@ -344,12 +343,12 @@ struct GTY((user)) modref_tree
> if (base_node)
>   {
> if (dump_file)
> - fprintf (dump_file, "--param param=modref-max-bases"
> + fprintf (dump_file, "--param modref-max-bases"
>" limit reached; using ref\n");
> return base_node;
>   }
> if (dump_file)
> - fprintf (dump_file, "--param param=modref-max-bases"
> + fprintf (dump_file, "--param modref-max-bases"
>" limit reached; using 0\n");
> base = 0;
> base_node = search (base);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c
> index 53ffa1c394c..b55d7066b22 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c
> @@ -10,4 +10,4 @@ int test(struct a *a, int p)
> a->array[0] = 1;
>   }
>   /* All three accesses combine to one bigger access.  */
> -/* { dg-final { scan-tree-dump-not "param=modref-max-accesses" "modref1" } } 
> */
> +/* { dg-final { scan-tree-dump-not "--param modref-max-accesses" "modref1" } 
> } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c
> index 4a18e34cd16..c51590ff0ba 100644
> --- 

RFA: libiberty: Fix infinite recursion in rust demangler (PRs 98886 and 99935)

2022-01-26 Thread Nick Clifton via Gcc-patches
Hi Guys,

  I would like to propose the patch below to fix a couple of sources
  of infinite recursion in libiberty's rust demangling code.  This patch
  is based upon the one submitted for PR 99935, but extended to cope
  with the case presented in PR 98886 and also fixed so that the "uint"
  type is not used.

  Tested with a patched version of the binutils sources on an
  x86-pc-linux-gnu target.

Cheers
  Nick

2022-01-26  Nick Clifton  

* rust-demangle.c (struct rust_demangler): Add a recursion
counter.
(demangle_path): Increment/decrement the recursion counter upon
entry and exit.  Fail if the counter exceeds a fixed limit.
(demangle_type): Likewise.
(rust_demangle_callback): Initialise the recursion counter,
disabling if requested by the option flags.

diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c
index 18c760491bd..3b24d63892a 100644
--- a/libiberty/rust-demangle.c
+++ b/libiberty/rust-demangle.c
@@ -74,6 +74,12 @@ struct rust_demangler
   /* Rust mangling version, with legacy mangling being -1. */
   int version;
 
+  /* Recursion depth.  */
+  unsigned int recursion;
+  /* Maximum number of times demangle_path may be called recursively.  */
+#define RUST_MAX_RECURSION_COUNT  1024
+#define RUST_NO_RECURSION_LIMIT   ((unsigned int) -1)
+
   uint64_t bound_lifetime_depth;
 };
 
@@ -671,6 +677,15 @@ demangle_path (struct rust_demangler *rdm, int in_value)
   if (rdm->errored)
 return;
 
+  if (rdm->recursion != RUST_NO_RECURSION_LIMIT)
+{
+  ++ rdm->recursion;
+  if (rdm->recursion > RUST_MAX_RECURSION_COUNT)
+   /* FIXME: There ought to be a way to report
+  that the recursion limit has been reached.  */
+   goto fail_return;
+}
+
   switch (tag = next (rdm))
 {
 case 'C':
@@ -688,10 +703,7 @@ demangle_path (struct rust_demangler *rdm, int in_value)
 case 'N':
   ns = next (rdm);
   if (!ISLOWER (ns) && !ISUPPER (ns))
-{
-  rdm->errored = 1;
-  return;
-}
+   goto fail_return;
 
   demangle_path (rdm, in_value);
 
@@ -776,9 +788,15 @@ demangle_path (struct rust_demangler *rdm, int in_value)
 }
   break;
 default:
-  rdm->errored = 1;
-  return;
+  goto fail_return;
 }
+  goto pass_return;
+
+ fail_return:
+  rdm->errored = 1;
+ pass_return:
+  if (rdm->recursion != RUST_NO_RECURSION_LIMIT)
+-- rdm->recursion;
 }
 
 static void
@@ -870,6 +888,19 @@ demangle_type (struct rust_demangler *rdm)
   return;
 }
 
+   if (rdm->recursion != RUST_NO_RECURSION_LIMIT)
+{
+  ++ rdm->recursion;
+  if (rdm->recursion > RUST_MAX_RECURSION_COUNT)
+   /* FIXME: There ought to be a way to report
+  that the recursion limit has been reached.  */
+   {
+ rdm->errored = 1;
+ -- rdm->recursion;
+ return;
+   }
+}
+
   switch (tag)
 {
 case 'R':
@@ -1030,6 +1061,9 @@ demangle_type (struct rust_demangler *rdm)
   rdm->next--;
   demangle_path (rdm, 0);
 }
+
+  if (rdm->recursion != RUST_NO_RECURSION_LIMIT)
+-- rdm->recursion;
 }
 
 /* A trait in a trait object may have some "existential projections"
@@ -1320,6 +1354,7 @@ rust_demangle_callback (const char *mangled, int options,
   rdm.skipping_printing = 0;
   rdm.verbose = (options & DMGL_VERBOSE) != 0;
   rdm.version = 0;
+  rdm.recursion = (options & DMGL_NO_RECURSE_LIMIT) ? RUST_NO_RECURSION_LIMIT 
: 0;
   rdm.bound_lifetime_depth = 0;
 
   /* Rust symbols always start with _R (v0) or _ZN (legacy). */



Re: [PATCH] warn-access: Prevent -Wuse-after-free on ARM [PR104213]

2022-01-26 Thread Jason Merrill via Gcc-patches

On 1/25/22 18:33, Marek Polacek wrote:

Here, -Wuse-after-free warns about using 'this' which, on ARM, cdtors
return, as mandated by the EABI.  To be entirely correct, it only
requires it for C1 and C2 ctors and D2 and D1 dtors, but I don't feel
like changing that now and possibly running into issues later on.

This patch uses suppress_warning on 'this' for certain cdtor_returns_this
cases in the C++ FE, and then warn_invalid_pointer makes use of this
information and doesn't warn.

In my first attempt I tried suppress_warning the MODIFY_EXPR or RETURN_EXPR
we build in build_delete_destructor_body, but the complication is that
the suppress_warning bits don't always survive gimplification; see e.g.
gimplify_modify_expr where we do

  6130   if (COMPARISON_CLASS_P (*from_p))
  6131 copy_warning (assign, *from_p);

but here we're not dealing with a comparison.  Removing that check
regresses uninit-pr74762.C.  Adding copy_warning (assign, *expr_p)
regresses c-c++-common/uninit-17.c.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK if Martin doesn't have any suggestions.


PR target/104213

gcc/cp/ChangeLog:

* decl.cc (finish_constructor_body): Suppress -Wuse-after-free.
(finish_destructor_body): Likewise.
* optimize.cc (build_delete_destructor_body): Likewise.

gcc/ChangeLog:

* gimple-ssa-warn-access.cc (pass_waccess::warn_invalid_pointer): Don't
warn when the SSA_NAME_VAR of REF has supressed -Wuse-after-free.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wuse-after-free2.C: New test.
---
  gcc/cp/decl.cc   |  2 ++
  gcc/cp/optimize.cc   |  1 +
  gcc/gimple-ssa-warn-access.cc| 14 +++---
  gcc/testsuite/g++.dg/warn/Wuse-after-free2.C | 10 ++
  4 files changed, 24 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wuse-after-free2.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 22d3dd1e2ad..6534a7fd320 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -17315,6 +17315,7 @@ finish_constructor_body (void)
add_stmt (build_stmt (input_location, LABEL_EXPR, cdtor_label));
  
val = DECL_ARGUMENTS (current_function_decl);

+  suppress_warning (val, OPT_Wuse_after_free);
val = build2 (MODIFY_EXPR, TREE_TYPE (val),
DECL_RESULT (current_function_decl), val);
/* Return the address of the object.  */
@@ -17408,6 +17409,7 @@ finish_destructor_body (void)
tree val;
  
val = DECL_ARGUMENTS (current_function_decl);

+  suppress_warning (val, OPT_Wuse_after_free);
val = build2 (MODIFY_EXPR, TREE_TYPE (val),
DECL_RESULT (current_function_decl), val);
/* Return the address of the object.  */
diff --git a/gcc/cp/optimize.cc b/gcc/cp/optimize.cc
index 4ad3f1dc9aa..13ab8b7361e 100644
--- a/gcc/cp/optimize.cc
+++ b/gcc/cp/optimize.cc
@@ -166,6 +166,7 @@ build_delete_destructor_body (tree delete_dtor, tree 
complete_dtor)
if (targetm.cxx.cdtor_returns_this ())
  {
tree val = DECL_ARGUMENTS (delete_dtor);
+  suppress_warning (val, OPT_Wuse_after_free);
val = build2 (MODIFY_EXPR, TREE_TYPE (val),
  DECL_RESULT (delete_dtor), val);
add_stmt (build_stmt (0, RETURN_EXPR, val));
diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 8bc33eeb6fa..484bcd34c25 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -3880,9 +3880,17 @@ pass_waccess::warn_invalid_pointer (tree ref, gimple 
*use_stmt,
bool maybe, bool equality /* = false */)
  {
/* Avoid printing the unhelpful "" in the diagnostics.  */
-  if (ref && TREE_CODE (ref) == SSA_NAME
-  && (!SSA_NAME_VAR (ref) || DECL_ARTIFICIAL (SSA_NAME_VAR (ref
-ref = NULL_TREE;
+  if (ref && TREE_CODE (ref) == SSA_NAME)
+{
+  tree var = SSA_NAME_VAR (ref);
+  if (!var)
+   ref = NULL_TREE;
+  /* Don't warn for cases like when a cdtor returns 'this' on ARM.  */
+  else if (warning_suppressed_p (var, OPT_Wuse_after_free))
+   return;
+  else if (DECL_ARTIFICIAL (var))
+   ref = NULL_TREE;
+}
  
location_t use_loc = gimple_location (use_stmt);

if (use_loc == UNKNOWN_LOCATION)
diff --git a/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C 
b/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C
new file mode 100644
index 000..6d5f2bf01b5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C
@@ -0,0 +1,10 @@
+// PR target/104213
+// { dg-do compile }
+// { dg-options "-Wuse-after-free" }
+
+class C
+{
+virtual ~C();
+};
+
+C::~C() {} // { dg-bogus "used after" }

base-commit: aeac414923aa1e87986c7fc6f9b921d89a9b86cf




Re: [PATCH] c++: Fix up handling of vector CONSTRUCTORs with vectors in it in constexpr.cc [PR104226]

2022-01-26 Thread Jason Merrill via Gcc-patches

On 1/26/22 04:33, Jakub Jelinek wrote:

Hi!

The middle-end uses sometimes VECTOR_TYPE CONSTRUCTORs that contain
some other VECTOR_TYPE elements in it (should be with compatible element
size and smaller number of elements, e.g. a V8SImode vector can be
constructed as { V4SImode_var_1, V4SImode_var_2 }), and expansion of
__builtin_shufflevector emits these early, so constexpr.cc can see those
too.
constexpr.cc already has special cases for NULL index which is typical
for VECTOR_TYPE CONSTRUCTORs, and for VECTOR_TYPE CONSTRUCTORs that
contain just scalar elts that works just fine - init_subob_ctx just
returns on non-aggregate elts and get_or_insert_ctor_field has
   if (TREE_CODE (type) == VECTOR_TYPE && index == NULL_TREE)
 {
   CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (ctor), index, NULL_TREE);
   return _ELTS (ctor)->last();
 }
handling for it.  But for the vector in vector case init_subob_ctx would
try to create a sub-CONSTRUCTOR and even didn't handle the NULL index
case well, so instead of creating the sub-CONSTRUCTOR after the elts already
in it overwrote the first one.  So
(V8SImode) { { 0, 0, 0, 0 }, { 0, 0, 0, 0 } }
became
(V8SImode) { 0, 0, 0, 0 }
The following patch fixes it by not forcing a sub-CONSTRUCTOR for this
vector in vector case.


OK.


Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-01-26  Jakub Jelinek  

PR c++/104226
* constexpr.cc (init_subob_ctx): For vector ctors containing
vector elements, ensure appending to the same ctor instead of
creating another one.

* g++.dg/cpp0x/constexpr-104226.C: New test.

--- gcc/cp/constexpr.cc.jj  2022-01-19 00:42:11.0 +0100
+++ gcc/cp/constexpr.cc 2022-01-25 21:44:28.459208756 +0100
@@ -4658,6 +4658,13 @@ init_subob_ctx (const constexpr_ctx *ctx
if (!AGGREGATE_TYPE_P (type) && !VECTOR_TYPE_P (type))
  /* A non-aggregate member doesn't get its own CONSTRUCTOR.  */
  return;
+  if (VECTOR_TYPE_P (type)
+  && VECTOR_TYPE_P (TREE_TYPE (ctx->ctor))
+  && index == NULL_TREE)
+/* A vector inside of a vector CONSTRUCTOR, e.g. when a larger
+   vector is constructed from smaller vectors, doesn't get its own
+   CONSTRUCTOR either.  */
+return;
  
/* The sub-aggregate initializer might contain a placeholder;

   update object to refer to the subobject and ctor to refer to
--- gcc/testsuite/g++.dg/cpp0x/constexpr-104226.C.jj2022-01-25 
21:50:34.977031244 +0100
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-104226.C   2022-01-25 
21:51:41.851086559 +0100
@@ -0,0 +1,15 @@
+// PR c++/104226
+// { dg-do compile }
+// { dg-options "-Wno-psabi" }
+
+typedef unsigned short __attribute__((__vector_size__(16))) U;
+typedef unsigned int __attribute__((__vector_size__(16))) V;
+typedef unsigned int __attribute__((__vector_size__(32))) W;
+
+U
+foo (void)
+{
+  return __builtin_convertvector (__builtin_shufflevector ((V){}, (W){},
+  0, 0, 1, 0,
+  5, 5, 0, 2), U);
+}

Jakub





[PATCH v1] rtl: builtins: Fix builtins feclearexcept and feraiseexcept operand check [PR94193]

2022-01-26 Thread Raoni Fassina Firmino via Gcc-patches
Tested on top of master (e0b8716f53ed6455e9f18931940141692793068d)
using --enable-checking=yes,rtl, on the following plataforms with no
regression:
  - powerpc64le-linux-gnu (Power 9)
  - powerpc64le-linux-gnu (Power 8)
  - powerpc64-linux-gnu (Power 9, with 32 and 64 bits tests)

I did not include a testcase because I could not figure out one that
works without --enable-checking=rtl yet.

 8< 

Commit 4343f5e25679 ("rtl: builtins: (not just) rs6000: Add builtins
for fegetround, feclearexcept and feraiseexcept [PR94193]") broke gcc
bootstra when building with --enable-checking=rtl[1].

The function expand_builtin_feclear_feraise_except was failing to
proper validate op0 predicate before emit_insn leading to the mismatch
type failure.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589186.html

2022-01-26  Raoni Fassina Firmino  

gcc/
PR target/94193
* builtins.cc (expand_builtin_feclear_feraise_except): Add op0
predicate check.

Signed-off-by: Raoni Fassina Firmino 
---
 gcc/builtins.cc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index e84208035dab..d784a57c2b81 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -2598,6 +2598,9 @@ expand_builtin_feclear_feraise_except (tree exp, rtx 
target,
   if (icode == CODE_FOR_nothing)
 return NULL_RTX;
 
+  if (!(*insn_data[icode].operand[1].predicate) (op0, GET_MODE (op0)))
+return NULL_RTX;
+
   if (target == 0
   || GET_MODE (target) != target_mode
   || !(*insn_data[icode].operand[0].predicate) (target, target_mode))
-- 
2.34.1



[committed] analyzer: fix missing uninit warning on args to stdio builtins [PR104224]

2022-01-26 Thread David Malcolm via Gcc-patches
We were failing to check for uninitialized arguments to stdio builtins,
such as when passing local "go" to the call to "printf" in "main" in
the testcase.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-6876-g9ff3e2368d86c1bf7d1e8876a14e58c0d6617ffe.

gcc/analyzer/ChangeLog:
PR analyzer/104224
* region-model.cc (region_model::check_call_args): New.
(region_model::on_call_pre): Call it when ignoring stdio builtins.
* region-model.h (region_model::check_call_args): New decl

gcc/testsuite/ChangeLog:
PR analyzer/104224
* gcc.dg/analyzer/pr104224.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model.cc |  11 +++
 gcc/analyzer/region-model.h  |   2 +
 gcc/testsuite/gcc.dg/analyzer/pr104224.c | 106 +++
 3 files changed, 119 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr104224.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index f6b7f986a39..a559bc84eb0 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1044,6 +1044,16 @@ region_model::on_stmt_pre (const gimple *stmt,
 }
 }
 
+/* Ensure that all arguments at the call described by CD are checked
+   for poisoned values, by calling get_rvalue on each argument.  */
+
+void
+region_model::check_call_args (const call_details ) const
+{
+  for (unsigned arg_idx = 0; arg_idx < cd.num_args (); arg_idx++)
+cd.get_arg_svalue (arg_idx);
+}
+
 /* Update this model for the CALL stmt, using CTXT to report any
diagnostics - the first half.
 
@@ -1173,6 +1183,7 @@ region_model::on_call_pre (const gcall *call, 
region_model_context *ctxt,
/* These stdio builtins have external effects that are out
   of scope for the analyzer: we only want to model the effects
   on the return value.  */
+   check_call_args (cd);
break;
  }
   else if (is_named_call_p (callee_fndecl, "malloc", call, 1))
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index c78efe8f215..40958842bce 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -832,6 +832,8 @@ class region_model
   void check_region_for_read (const region *src_reg,
  region_model_context *ctxt) const;
 
+  void check_call_args (const call_details ) const;
+
   /* Storing this here to avoid passing it around everywhere.  */
   region_model_manager *const m_mgr;
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr104224.c 
b/gcc/testsuite/gcc.dg/analyzer/pr104224.c
new file mode 100644
index 000..8f69d72befa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr104224.c
@@ -0,0 +1,106 @@
+#include 
+
+struct test {
+int one;
+int two;
+};
+
+void func2(const struct test *t)
+{
+if (t->one == 0)
+printf("init func2\n");
+
+if (t->two == 0)  /* { dg-warning "uninitialized" } */
+printf("uninit func2\n");
+}
+
+void func1(struct test *t)
+{
+t->one = 1;
+func2(t);
+}
+
+int func3(int num)
+{
+if (num)
+return num;
+else
+return 0;
+}
+
+void func4(int *a, int max)
+{
+int i;
+// skip the first
+for (i=1; i

[committed] analyzer: fix sense in range::add_bound [PR94362]

2022-01-26 Thread David Malcolm via Gcc-patches
On Sun, 2022-01-23 at 17:34 +0100, Mikael Morin wrote:
> Hello,
> 
> Le 21/01/2022 à 00:59, David Malcolm via Gcc-patches a écrit :
> > diff --git a/gcc/analyzer/constraint-manager.cc
> > b/gcc/analyzer/constraint-manager.cc
> > index 568e7150ea7..7c4a85bbb24 100644
> > --- a/gcc/analyzer/constraint-manager.cc
> > +++ b/gcc/analyzer/constraint-manager.cc
> > @@ -301,6 +301,80 @@ range::above_upper_bound (tree rhs_const)
> > const
> > m_upper_bound.m_constant).is_true ();
> >   }
> >   
> > +/* Attempt to add B to the bound of the given kind of this range.
> > +   Return true if feasible; false if infeasible.  */
> > +
> > +bool
> > +range::add_bound (bound b, enum bound_kind bound_kind)
> > +{
> > +  b.ensure_closed (bound_kind);
> > +
> > +  switch (bound_kind)
> > +{
> > +default:
> > +  gcc_unreachable ();
> > +case BK_LOWER:
> > +  /* Discard redundant bounds.  */
> > +  if (m_lower_bound.m_constant)
> > +   {
> > + m_lower_bound.ensure_closed (BK_LOWER);
> > + if (!tree_int_cst_lt (b.m_constant,
> > +   m_lower_bound.m_constant))
> > +   return true;
> 
> isn’t this condition reversed?
> 
> > +   }
> > +  m_lower_bound = b;
> > +  break;
> > +case BK_UPPER:
> > +  /* Discard redundant bounds.  */
> > +  if (m_upper_bound.m_constant)
> > +   {
> > + m_upper_bound.ensure_closed (BK_UPPER);
> > + if (tree_int_cst_le (b.m_constant,
> > +  m_upper_bound.m_constant))
> > +   return true;
> 
> same here.
> 
> All the tests added have just one lower and one upper bound, so they 
> don’t use the short-circuit code, but amending one of them as follows
> makes the problem appear as the test starts to fails.  It should 
> continue to work, shouldn’t it?
> 
> 
> diff --git a/gcc/analyzer/constraint-manager.cc 
> b/gcc/analyzer/constraint-manager.cc
> index 7c4a85bbb24..3f38b857722 100644
> --- a/gcc/analyzer/constraint-manager.cc
> +++ b/gcc/analyzer/constraint-manager.cc
> @@ -3697,6 +3697,7 @@ test_constant_comparisons ()
>   region_model_manager mgr;
>   {
> region_model model ();
> +  ADD_SAT_CONSTRAINT (model, int_1, LT_EXPR, a);
> ADD_SAT_CONSTRAINT (model, int_3, LT_EXPR, a);
> ADD_UNSAT_CONSTRAINT (model, a, LT_EXPR, int_4);
>   }

Good catch, thanks.

Fixed as follows, which also moves the rejection of contradictory
constraints in range::add_bound to earlier, so that this code can
be self-tested.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-6875-ge966a508e03fe28bfca65a1e60e579fa90355ea6.

gcc/analyzer/ChangeLog:
PR analyzer/94362
* constraint-manager.cc (range::add_bound): Fix tests for
discarding redundant constraints.  Perform test for rejecting
unsatisfiable constraints earlier so that they don't update
the object on failure.
(selftest::test_range): New.
(selftest::test_constant_comparisons): Add test coverage for
existing constraints becoming narrower until they are
unsatisfiable.
(selftest::run_constraint_manager_tests): Call test_range.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/constraint-manager.cc | 93 +-
 1 file changed, 79 insertions(+), 14 deletions(-)

diff --git a/gcc/analyzer/constraint-manager.cc 
b/gcc/analyzer/constraint-manager.cc
index 7c4a85bbb24..88b0988513a 100644
--- a/gcc/analyzer/constraint-manager.cc
+++ b/gcc/analyzer/constraint-manager.cc
@@ -318,35 +318,42 @@ range::add_bound (bound b, enum bound_kind bound_kind)
   if (m_lower_bound.m_constant)
{
  m_lower_bound.ensure_closed (BK_LOWER);
- if (!tree_int_cst_lt (b.m_constant,
-   m_lower_bound.m_constant))
+ if (tree_int_cst_le (b.m_constant,
+  m_lower_bound.m_constant))
return true;
}
+  if (m_upper_bound.m_constant)
+   {
+ m_upper_bound.ensure_closed (BK_UPPER);
+ /* Reject B <= V <= UPPER when B > UPPER.  */
+ if (!tree_int_cst_le (b.m_constant,
+   m_upper_bound.m_constant))
+   return false;
+   }
   m_lower_bound = b;
   break;
+
 case BK_UPPER:
   /* Discard redundant bounds.  */
   if (m_upper_bound.m_constant)
{
  m_upper_bound.ensure_closed (BK_UPPER);
- if (tree_int_cst_le (b.m_constant,
-  m_upper_bound.m_constant))
+ if (!tree_int_cst_lt (b.m_constant,
+   m_upper_bound.m_constant))
return true;
}
+  if (m_lower_bound.m_constant)
+   {
+ m_lower_bound.ensure_closed (BK_LOWER);
+ /* Reject LOWER <= V <= B when LOWER > B.  */
+ if (!tree_int_cst_le (m_lower_bound.m_constant,
+   

[PATCH] IPA mod-ref: fix usage of --param names in dump messages.

2022-01-26 Thread Martin Liška

The patch fixed bad --param param=foo names.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* ipa-modref-tree.cc (modref_access_node::update):
Remove "--param param=foo" with "--param foo".
(modref_access_node::insert): Likewise.
(modref_access_node::insert_kill): Likewise.
* ipa-modref-tree.h (struct modref_ref_node): Likewise.
(struct modref_base_node): Likewise.
(struct modref_tree): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/modref-7.c: Update scanned patterns.
* gcc.dg/tree-ssa/modref-8.c: Likewise.
---
 gcc/ipa-modref-tree.cc   | 10 --
 gcc/ipa-modref-tree.h|  9 -
 gcc/testsuite/gcc.dg/tree-ssa/modref-7.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/modref-8.c |  4 ++--
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/gcc/ipa-modref-tree.cc b/gcc/ipa-modref-tree.cc
index 97e497accf2..828994f3536 100644
--- a/gcc/ipa-modref-tree.cc
+++ b/gcc/ipa-modref-tree.cc
@@ -130,8 +130,7 @@ modref_access_node::update (poly_int64 parm_offset1,
   else
 {
   if (dump_file)
-   fprintf (dump_file,
-"--param param=modref-max-adjustments limit reached:");
+   fprintf (dump_file, "--param modref-max-adjustments limit reached:");
   if (!known_eq (parm_offset, parm_offset1))
{
  if (dump_file)
@@ -594,11 +593,11 @@ modref_access_node::insert (vec  
*,
return -1;
   if (dump_file && best2 >= 0)
fprintf (dump_file,
-"--param param=modref-max-accesses limit reached;"
+"--param modref-max-accesses limit reached;"
 " merging %i and %i\n", best1, best2);
   else if (dump_file)
fprintf (dump_file,
-"--param param=modref-max-accesses limit reached;"
+"--param modref-max-accesses limit reached;"
 " merging with %i\n", best1);
   modref_access_node::try_merge_with (accesses, best1);
   if (best2 >= 0)
@@ -825,8 +824,7 @@ modref_access_node::insert_kill (vec 
,
   if ((int)kills.length () >= param_modref_max_accesses)
{
  if (dump_file)
-   fprintf (dump_file,
-"--param param=modref-max-accesses limit reached:");
+   fprintf (dump_file, "--param modref-max-accesses limit reached:");
  return false;
}
   a.adjustments = 0;
diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
index edb3b49374a..fdaa9612e9a 100644
--- a/gcc/ipa-modref-tree.h
+++ b/gcc/ipa-modref-tree.h
@@ -197,8 +197,7 @@ struct GTY((user)) modref_ref_node
   {
if (dump_file)
  fprintf (dump_file,
-  "--param param=modref-max-accesses limit reached;"
-  " collapsing\n");
+  "--param modref-max-accesses limit reached; collapsing\n");
collapse ();
   }
 return ret != 0;
@@ -252,7 +251,7 @@ struct GTY((user)) modref_base_node
 if (ref && refs && refs->length () >= max_refs)
   {
if (dump_file)
- fprintf (dump_file, "--param param=modref-max-refs limit reached;"
+ fprintf (dump_file, "--param modref-max-refs limit reached;"
   " using 0\n");
ref = 0;
ref_node = search (ref);
@@ -344,12 +343,12 @@ struct GTY((user)) modref_tree
if (base_node)
  {
if (dump_file)
- fprintf (dump_file, "--param param=modref-max-bases"
+ fprintf (dump_file, "--param modref-max-bases"
   " limit reached; using ref\n");
return base_node;
  }
if (dump_file)
- fprintf (dump_file, "--param param=modref-max-bases"
+ fprintf (dump_file, "--param modref-max-bases"
   " limit reached; using 0\n");
base = 0;
base_node = search (base);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c
index 53ffa1c394c..b55d7066b22 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c
@@ -10,4 +10,4 @@ int test(struct a *a, int p)
   a->array[0] = 1;
 }
 /* All three accesses combine to one bigger access.  */
-/* { dg-final { scan-tree-dump-not "param=modref-max-accesses" "modref1" } } */
+/* { dg-final { scan-tree-dump-not "--param modref-max-accesses" "modref1" } } 
*/
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c 
b/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c
index 4a18e34cd16..c51590ff0ba 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c
@@ -17,8 +17,8 @@ recurse (char *p, int n)
if (n)
  recurse (p+1,n-1);
 }
-/* { dg-final { scan-tree-dump-not "param=modref-max-accesses" "modref1" } } */
-/* { dg-final { scan-tree-dump "param=modref-max-adjustments" "modref1" } } */

Re: [PATCH][GCC13?] RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'

2022-01-26 Thread Maciej W. Rozycki
On Mon, 24 Jan 2022, Joseph Myers wrote:

> >  I think we have consensus that we can ignore pre-r2.2 hardware.  What we 
> > actually support is `-misa-spec=<2.2|20190608|20191213>', so we can assume 
> > r2.2 semantics as the absolute minimum, matching the description in my 
> > submission.
> 
> Where, to repeat the point about possibly confusing version numbers, 
> that's saying we can ignore hardware from before *ISA* revision 2.2 (which 
> contained 'F' and 'D' extension version 2.0) - not that we can ignore 
> hardware from before 'F' and 'D' extension version 2.2 (which changed the 
> semantics of the FMIN and FMAX instructions).

 OK, I'll try to make it clear in the change description of v2.

> >  Then once we have IEEE 754-2019 support in place, which will require new 
> > RTL standard pattern names, say `fminimum'/`fmaximum', we will provide 
> > them iff (!HONOR_SNANS || riscv_isa_spec >= ISA_SPEC_CLASS_20190608).  It 
> > may be a good idea to start adding IEEE 754-2019 support, including the 
> > relevant `__builtin_fminimum' and `__builtin_fmaximum' intrinsics, once 
> > the GCC 13 development cycle has started.
> 
> The newer instruction semantics correspond to the functions fminimum_num 
> and fmaximum_num, not fminimum and fmaximum (which are functions that 
> treat both quiet and signaling NaNs like most libm functions do - any 
> argument a NaN means the result is NaN).

 I got these wrong then, sorry.  I got lost too: what is the difference 
between `fmin'/`fmax' and `fminimum'/`fmaximum' then?  It looks to me like 
we have a zoo of selection functions now with very subtle differences 
between each other.

  Maciej


Re: [PATCH] openmp: Add support for target_device selector set in metadirectives

2022-01-26 Thread Kwok Cheung Yeung

Hello

Just noticed a bug in the ISA checking in the nvptx plugin - the minor 
version should only be compared if the major version is equal, otherwise 
it would reject an isa of sm_35 if the card is capable of supporting 
sm_52, for example. This patch fixes the issue.


Thanks

Kwokdiff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 7427677e69d..86a12c3fcfd 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -2047,7 +2047,8 @@ GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, 
void **args)
 /* TODO: Implement GOMP_OFFLOAD_async_run. */
 
 #define CHECK_ISA(major, minor) \
-  if (device->compute_major >= major && device->compute_minor >= minor \
+  if (((device->compute_major == major && device->compute_minor >= minor) \
+   || device->compute_major > major) \
   && strcmp (isa, "sm_"#major#minor) == 0) \
 return true
 


Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-01-26 Thread Dan Li via Gcc-patches




On 1/26/22 03:09, Ard Biesheuvel wrote:

On Wed, 26 Jan 2022 at 11:40, Dan Li  wrote:


Thanks, Ard,

On 1/26/22 00:10, Ard Biesheuvel wrote:

On Wed, 26 Jan 2022 at 08:53, Dan Li  wrote:


Hi, all,

Sorry for bothering.

I'm trying to commit aarch64 scs code to the gcc and there is an issue
that I'm not sure about, could someone give me some suggestions?
(To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) )

When clang enables scs, the following instructions are usually generated:

str x30, [x18], 8
ldp x29, x30, [sp], 16
..
ldp x29, x30, [sp], 16  ## x30 pop
ldr x30, [x18, -8]! ## x30 pop again
ret

The x30 register is popped twice here, Richard suggested that we can
omit the first x30 pop here.

AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm
missing something with the kernel, could someone give some suggestions?

The previous discussion can be found here [1].

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html



As was pointed out in the discussion, binary patching is in fact a
concern for the Linux kernel. E.g., Android uses generic binary
builds, but we would like to be able to switch between pointer
authentication and shadow call stack at boot time, rather than always
support both, and take the SCS performance hit on systems that
implement PAC as well.

However, it seems more straight-forward to patch PACIASP and AUTIASP
instructions into SCS push/pop instructions rather than the other way
around, as we can force the use of these exact opcodes [in the NOP
space]), as well as rely on existing unwind annotations to locate any
such instruction in the binary.



Well, then I think I don't need to submit a kernel patch to
enable SCS for gcc :)



Not entirely.


BTW:
Do we have a plan to submit patches of dynamic patch PAC into
the kernel recently?



At the moment, there are just some ideas floating around. I did
implement a proof of concept that uses unwind data, but it hit some
issues with cfi_negate_ra_state being emitted imprecisely (GCC) or not
at all (Clang) in some cases. Using objtool would be another
possibility.

So in summary, getting SCS support proper into GCC is definitely worth
the effort.


Got it!

And thanks for the suggestion, Ard :)


[PATCH] tree-optimization/100499 - niter analysis and multiple_of_p

2022-01-26 Thread Richard Biener via Gcc-patches
niter analysis uses multiple_of_p which currently assumes
operations like MULT_EXPR do not wrap.  We've got to rely on this
for optimizing size expressions like those in DECL_SIZE and those
generally use unsigned arithmetic with no indication that they
are not expected to wrap.  To preserve that the following adds
a parameter to multiple_of_p, defaulted to true, indicating that
the TOP expression is not expected to wrap for outer computations
in TYPE.  This mostly follows a patch proposed by Bin last year
with the conversion behavior added.

Applying to all users the new effect is that upon type conversions
in the TOP expression the behavior will switch to honor
TYPE_OVERFLOW_UNDEFINED for the converted sub-expressions.

The patch also changes the occurance in niter analysis that we
know is problematic and we have testcases for to pass false
to multiple_of_p.  The patch also contains a change to the
PR72817 fix from Bin to avoid regressing gcc.dg/tree-ssa/loop-42.c.

The intent for stage1 is to introduce a size_multiple_of_p and
internalize the added parameter so all multiple_of_p users will
honor TYPE_OVERFLOW_UNDEFINED and users dealing with size expressions
need to be switched to size_multiple_of_p.

Bootstrapped and tested on x86_64-unknown-linux-gnu with all languages
and {,-m32} testing.

The patch applies ontop of the three earlier posted ones that touch
multiple_of_p but have not yet been reviewed/pushed.

OK?

Thanks,
Richard.

2022-01-26  Richard Biener  

PR tree-optimization/100499
* fold-const.h (multiple_of_p): Add nowrap parameter, defaulted
to true.
* fold-const.cc (multiple_of_p): Likewise.  Honor it for
MULT_EXPR, PLUS_EXPR and MINUS_EXPR and pass it along,
switching to false for conversions.
* tree-ssa-loop-niter.cc (number_of_iterations_ne): Do not
claim the outermost expression does not wrap when calling
multiple_of_p.  Refactor the check done to check the
original IV, avoiding a bias that might wrap.

* gcc.dg/torture/pr100499-1.c: New testcase.
* gcc.dg/torture/pr100499-2.c: Likewise.
* gcc.dg/torture/pr100499-3.c: Likewise.

Co-authored-by: Bin Cheng  
---
 gcc/fold-const.cc | 81 +++
 gcc/fold-const.h  |  2 +-
 gcc/testsuite/gcc.dg/torture/pr100499-1.c | 27 
 gcc/testsuite/gcc.dg/torture/pr100499-2.c | 16 +
 gcc/testsuite/gcc.dg/torture/pr100499-3.c | 14 
 gcc/tree-ssa-loop-niter.cc| 52 ++-
 6 files changed, 131 insertions(+), 61 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-2.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-3.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index a0a4913c45e..7c204fb6265 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -14062,10 +14062,16 @@ fold_binary_initializer_loc (location_t loc, 
tree_code code, tree type,
  SAVE_EXPR (I) * SAVE_EXPR (J)
 
(where the same SAVE_EXPR (J) is used in the original and the
-   transformed version).  */
+   transformed version).
+
+   NOWRAP specifies whether all outer operations in TYPE should
+   be considered not wrapping.  Any type conversion within TOP acts
+   as a barrier and we will fall back to NOWRAP being false.
+   NOWRAP is mostly used to treat expressions in TYPE_SIZE and friends
+   as not wrapping even though they are generally using unsigned arithmetic.  
*/
 
 int
-multiple_of_p (tree type, const_tree top, const_tree bottom)
+multiple_of_p (tree type, const_tree top, const_tree bottom, bool nowrap)
 {
   gimple *stmt;
   tree op1, op2;
@@ -14083,10 +14089,17 @@ multiple_of_p (tree type, const_tree top, const_tree 
bottom)
 a multiple of BOTTOM then TOP is a multiple of BOTTOM.  */
   if (!integer_pow2p (bottom))
return 0;
-  return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
- || multiple_of_p (type, TREE_OPERAND (top, 0), bottom));
+  return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom, nowrap)
+ || multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap));
 
 case MULT_EXPR:
+  /* If the multiplication can wrap we cannot recurse further unless
+the second operand is a power of two which is where wrapping
+does not matter.  */
+  if (!nowrap
+ && !TYPE_OVERFLOW_UNDEFINED (type)
+ && !integer_pow2p (TREE_OPERAND (top, 1)))
+   return 0;
   if (TREE_CODE (bottom) == INTEGER_CST)
{
  op1 = TREE_OPERAND (top, 0);
@@ -14095,24 +14108,24 @@ multiple_of_p (tree type, const_tree top, const_tree 
bottom)
std::swap (op1, op2);
  if (TREE_CODE (op2) == INTEGER_CST)
{
- if (multiple_of_p (type, op2, bottom))
+ if (multiple_of_p (type, op2, bottom, nowrap))
return 1;
 

[PATCH] Improve profile handling in switch lowering.

2022-01-26 Thread Martin Liška

Hello.

Right now, switch lowering does not update basic_block::count values
so that they are uninitiliazed. Moreover, I've updated probability scaling
when a more complex expansion happens. There are still some situations where
the profile is a bit imprecise, but the patch improves rapidly the current 
situation.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR tree-optimization/101301
PR tree-optimization/103680

gcc/ChangeLog:

* tree-switch-conversion.cc (bit_test_cluster::emit):
Handle correctly remaining probability.
(switch_decision_tree::try_switch_expansion): Fix BB's count
where a cluster expansion happens.
(switch_decision_tree::emit_cmp_and_jump_insns): Fill up also
BB count.
(switch_decision_tree::do_jump_if_equal): Likewise.
(switch_decision_tree::emit_case_nodes): Handle special case
for BT expansion which can also fallback to a default BB.
* tree-switch-conversion.h (cluster::cluster): Add
m_default_prob probability.
---
 gcc/tree-switch-conversion.cc | 51 ---
 gcc/tree-switch-conversion.h  |  8 +-
 2 files changed, 43 insertions(+), 16 deletions(-)

diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc
index 670397c87e4..d6679e8dee3 100644
--- a/gcc/tree-switch-conversion.cc
+++ b/gcc/tree-switch-conversion.cc
@@ -1538,10 +1538,12 @@ bit_test_cluster::emit (tree index_expr, tree 
index_type,
  test[k].target_bb = n->m_case_bb;
  test[k].label = n->m_case_label_expr;
  test[k].bits = 0;
+ test[k].prob = profile_probability::never ();
  count++;
}
 
   test[k].bits += n->get_range (n->get_low (), n->get_high ());

+  test[k].prob += n->m_prob;
 
   lo = tree_to_uhwi (int_const_binop (MINUS_EXPR, n->get_low (), minval));

   if (n->get_high () == NULL_TREE)
@@ -1629,6 +1631,11 @@ bit_test_cluster::emit (tree index_expr, tree index_type,
  /*simple=*/true, NULL_TREE,
  /*before=*/true, GSI_SAME_STMT);
 
+  profile_probability subtree_prob = m_subtree_prob;

+  profile_probability default_prob = m_default_prob;
+  if (!default_prob.initialized_p ())
+default_prob = m_subtree_prob.invert ();
+
   if (m_handles_entire_switch && entry_test_needed)
 {
   tree range = int_const_binop (MINUS_EXPR, maxval, minval);
@@ -1639,9 +1646,10 @@ bit_test_cluster::emit (tree index_expr, tree index_type,
/*simple=*/true, NULL_TREE,
/*before=*/true, GSI_SAME_STMT);
   tmp = fold_build2 (GT_EXPR, boolean_type_node, idx, range);
+  default_prob = default_prob.apply_scale (1, 2);
   basic_block new_bb
= hoist_edge_and_branch_if_true (, tmp, default_bb,
-profile_probability::unlikely ());
+default_prob);
   gsi = gsi_last_bb (new_bb);
 }
 
@@ -1662,14 +1670,12 @@ bit_test_cluster::emit (tree index_expr, tree index_type,

   else
 csui = tmp;
 
-  profile_probability prob = profile_probability::always ();

-
   /* for each unique set of cases:
if (const & csui) goto target  */
   for (k = 0; k < count; k++)
 {
-  prob = profile_probability::always ().apply_scale (test[k].bits,
-bt_range);
+  profile_probability prob = test[k].prob / (subtree_prob + default_prob);
+  subtree_prob -= test[k].prob;
   bt_range -= test[k].bits;
   tmp = wide_int_to_tree (word_type_node, test[k].mask);
   tmp = fold_build2 (BIT_AND_EXPR, word_type_node, csui, tmp);
@@ -1908,9 +1914,13 @@ switch_decision_tree::try_switch_expansion (vec 
)
   /* Emit cluster-specific switch handling.  */
   for (unsigned i = 0; i < clusters.length (); i++)
if (clusters[i]->get_type () != SIMPLE_CASE)
- clusters[i]->emit (index_expr, index_type,
-gimple_switch_default_label (m_switch),
-m_default_bb, gimple_location (m_switch));
+ {
+   edge e = single_pred_edge (clusters[i]->m_case_bb);
+   e->dest->count = e->src->count.apply_probability (e->probability);
+   clusters[i]->emit (index_expr, index_type,
+  gimple_switch_default_label (m_switch),
+  m_default_bb, gimple_location (m_switch));
+ }
 }
 
   fix_phi_operands_for_edges ();

@@ -2162,6 +2172,7 @@ switch_decision_tree::emit_cmp_and_jump_insns 
(basic_block bb, tree op0,
   edge false_edge = split_block (bb, cond);
   false_edge->flags = EDGE_FALSE_VALUE;
   false_edge->probability = prob.invert ();
+  false_edge->dest->count = bb->count.apply_probability (prob.invert ());
 
   edge true_edge 

Re: [PATCH v2] x86: Also check mode of memory broadcast in bcst_mem_operand

2022-01-26 Thread Jakub Jelinek via Gcc-patches
On Sun, Jan 23, 2022 at 04:39:34PM -0800, H.J. Lu via Gcc-patches wrote:
> On Sun, Jan 23, 2022 at 4:35 PM Hongtao Liu  wrote:
> >
> > On Sun, Jan 23, 2022 at 8:28 PM H.J. Lu via Gcc-patches
> >  wrote:
> > >
> > > Return false for invalid mode on memory broadcast in bcst_mem_operand:
> > >
> > > (vec_duplicate:V16SF (mem/j:V4SF (reg/v/f:DI 109 [ b ])))
> > >
> > Yes, thanks.
> 
> I will also backport it to GCC 11 branch.

On i686-linux this new testcase FAILs with:
cc1: warning: SSE instruction set disabled, using 387 arithmetics
FAIL: gcc.target/i386/pr104188.c (test for excess errors)
Excess errors:
cc1: warning: SSE instruction set disabled, using 387 arithmetics
This is because it uses -mfpmath=sse, but -msse2 isn't on.  Fixed
by adding -msse2 to dg-options and requiring sse2_runtime effective
target.

Tested on x86_64-linux and i686-linux, committed as obvious to trunk/11:

2022-01-26  Jakub Jelinek  

PR target/104188
* gcc.target/i386/pr104188.c: Add dg-require-effective-target
sse2_runtime.  Add -msse2 to dg-options.

--- gcc/testsuite/gcc.target/i386/pr104188.c.jj 2022-01-24 10:18:21.174512441 
+0100
+++ gcc/testsuite/gcc.target/i386/pr104188.c2022-01-26 11:54:58.025950692 
+0100
@@ -1,5 +1,6 @@
 /* { dg-do run { target avx512f } } */
-/* { dg-options "-O2 -mfpmath=sse" } */
+/* { dg-require-effective-target sse2_runtime } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
 
 #include 
 

Jakub



Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-01-26 Thread Ard Biesheuvel via Gcc-patches
On Wed, 26 Jan 2022 at 11:40, Dan Li  wrote:
>
> Thanks, Ard,
>
> On 1/26/22 00:10, Ard Biesheuvel wrote:
> > On Wed, 26 Jan 2022 at 08:53, Dan Li  wrote:
> >>
> >> Hi, all,
> >>
> >> Sorry for bothering.
> >>
> >> I'm trying to commit aarch64 scs code to the gcc and there is an issue
> >> that I'm not sure about, could someone give me some suggestions?
> >> (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) )
> >>
> >> When clang enables scs, the following instructions are usually generated:
> >>
> >> str x30, [x18], 8
> >> ldp x29, x30, [sp], 16
> >> ..
> >> ldp x29, x30, [sp], 16  ## x30 pop
> >> ldr x30, [x18, -8]! ## x30 pop again
> >> ret
> >>
> >> The x30 register is popped twice here, Richard suggested that we can
> >> omit the first x30 pop here.
> >>
> >> AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm
> >> missing something with the kernel, could someone give some suggestions?
> >>
> >> The previous discussion can be found here [1].
> >>
> >> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html
> >>
> >
> > As was pointed out in the discussion, binary patching is in fact a
> > concern for the Linux kernel. E.g., Android uses generic binary
> > builds, but we would like to be able to switch between pointer
> > authentication and shadow call stack at boot time, rather than always
> > support both, and take the SCS performance hit on systems that
> > implement PAC as well.
> >
> > However, it seems more straight-forward to patch PACIASP and AUTIASP
> > instructions into SCS push/pop instructions rather than the other way
> > around, as we can force the use of these exact opcodes [in the NOP
> > space]), as well as rely on existing unwind annotations to locate any
> > such instruction in the binary.
> >
>
> Well, then I think I don't need to submit a kernel patch to
> enable SCS for gcc :)
>

Not entirely.

> BTW:
> Do we have a plan to submit patches of dynamic patch PAC into
> the kernel recently?
>

At the moment, there are just some ideas floating around. I did
implement a proof of concept that uses unwind data, but it hit some
issues with cfi_negate_ra_state being emitted imprecisely (GCC) or not
at all (Clang) in some cases. Using objtool would be another
possibility.

So in summary, getting SCS support proper into GCC is definitely worth
the effort.

-- 
Ard.


Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-01-26 Thread Dan Li via Gcc-patches

Thanks, Ard,

On 1/26/22 00:10, Ard Biesheuvel wrote:

On Wed, 26 Jan 2022 at 08:53, Dan Li  wrote:


Hi, all,

Sorry for bothering.

I'm trying to commit aarch64 scs code to the gcc and there is an issue
that I'm not sure about, could someone give me some suggestions?
(To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) )

When clang enables scs, the following instructions are usually generated:

str x30, [x18], 8
ldp x29, x30, [sp], 16
..
ldp x29, x30, [sp], 16  ## x30 pop
ldr x30, [x18, -8]! ## x30 pop again
ret

The x30 register is popped twice here, Richard suggested that we can
omit the first x30 pop here.

AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm
missing something with the kernel, could someone give some suggestions?

The previous discussion can be found here [1].

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html



As was pointed out in the discussion, binary patching is in fact a
concern for the Linux kernel. E.g., Android uses generic binary
builds, but we would like to be able to switch between pointer
authentication and shadow call stack at boot time, rather than always
support both, and take the SCS performance hit on systems that
implement PAC as well.

However, it seems more straight-forward to patch PACIASP and AUTIASP
instructions into SCS push/pop instructions rather than the other way
around, as we can force the use of these exact opcodes [in the NOP
space]), as well as rely on existing unwind annotations to locate any
such instruction in the binary.



Well, then I think I don't need to submit a kernel patch to
enable SCS for gcc :)

BTW:
Do we have a plan to submit patches of dynamic patch PAC into
the kernel recently?


So omitting the load of X30 from the ordinary stack seems fine to me.


On 1/25/22 22:51, Dan Li wrote:



On 1/25/22 02:19, Richard Sandiford wrote:

Well, probably sticking to pop x30 twice is not a good idea.
AFAICT, there doesn't seem to be an explicit requirement that






Ok, I'll cc some kernel folks to make sure I didn't miss something.


To Richard:

Sorry for my mistake.

Due to binary compatibility issues, SCS related code may not
be directly merged into libgcc/glibc, do we still need to
add this patch into GCC? (I'd like to finish it if that
makes sense).


Thanks all for your time!
Dan


If binary patching is supposed to be possible then scs_push and
scs_pop *do* need to be separate define_insns.  But they also need
to have some magic unspec that differentiates them from normal
pushes and pops, e.g.:

(set ...mem...
 (unspec:DI [...reg...] UNSPEC_SCS_PUSH))

so that there is no chance that the pattern would be treated as
a normal move and optimised accordingly.



Yeah, this template looks more appropriate if it is to be treated
as a special directive.

Thanks for your suggestions,
Dan


[PATCH] c++: Fix up handling of vector CONSTRUCTORs with vectors in it in constexpr.cc [PR104226]

2022-01-26 Thread Jakub Jelinek via Gcc-patches
Hi!

The middle-end uses sometimes VECTOR_TYPE CONSTRUCTORs that contain
some other VECTOR_TYPE elements in it (should be with compatible element
size and smaller number of elements, e.g. a V8SImode vector can be
constructed as { V4SImode_var_1, V4SImode_var_2 }), and expansion of
__builtin_shufflevector emits these early, so constexpr.cc can see those
too.
constexpr.cc already has special cases for NULL index which is typical
for VECTOR_TYPE CONSTRUCTORs, and for VECTOR_TYPE CONSTRUCTORs that
contain just scalar elts that works just fine - init_subob_ctx just
returns on non-aggregate elts and get_or_insert_ctor_field has
  if (TREE_CODE (type) == VECTOR_TYPE && index == NULL_TREE)
{
  CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (ctor), index, NULL_TREE);
  return _ELTS (ctor)->last();
} 
handling for it.  But for the vector in vector case init_subob_ctx would
try to create a sub-CONSTRUCTOR and even didn't handle the NULL index
case well, so instead of creating the sub-CONSTRUCTOR after the elts already
in it overwrote the first one.  So
(V8SImode) { { 0, 0, 0, 0 }, { 0, 0, 0, 0 } }
became
(V8SImode) { 0, 0, 0, 0 }
The following patch fixes it by not forcing a sub-CONSTRUCTOR for this
vector in vector case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-01-26  Jakub Jelinek  

PR c++/104226
* constexpr.cc (init_subob_ctx): For vector ctors containing
vector elements, ensure appending to the same ctor instead of
creating another one.

* g++.dg/cpp0x/constexpr-104226.C: New test.

--- gcc/cp/constexpr.cc.jj  2022-01-19 00:42:11.0 +0100
+++ gcc/cp/constexpr.cc 2022-01-25 21:44:28.459208756 +0100
@@ -4658,6 +4658,13 @@ init_subob_ctx (const constexpr_ctx *ctx
   if (!AGGREGATE_TYPE_P (type) && !VECTOR_TYPE_P (type))
 /* A non-aggregate member doesn't get its own CONSTRUCTOR.  */
 return;
+  if (VECTOR_TYPE_P (type)
+  && VECTOR_TYPE_P (TREE_TYPE (ctx->ctor))
+  && index == NULL_TREE)
+/* A vector inside of a vector CONSTRUCTOR, e.g. when a larger
+   vector is constructed from smaller vectors, doesn't get its own
+   CONSTRUCTOR either.  */
+return;
 
   /* The sub-aggregate initializer might contain a placeholder;
  update object to refer to the subobject and ctor to refer to
--- gcc/testsuite/g++.dg/cpp0x/constexpr-104226.C.jj2022-01-25 
21:50:34.977031244 +0100
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-104226.C   2022-01-25 
21:51:41.851086559 +0100
@@ -0,0 +1,15 @@
+// PR c++/104226
+// { dg-do compile }
+// { dg-options "-Wno-psabi" }
+
+typedef unsigned short __attribute__((__vector_size__(16))) U;
+typedef unsigned int __attribute__((__vector_size__(16))) V;
+typedef unsigned int __attribute__((__vector_size__(32))) W;
+
+U
+foo (void)
+{
+  return __builtin_convertvector (__builtin_shufflevector ((V){}, (W){},
+  0, 0, 1, 0,
+  5, 5, 0, 2), U);
+}

Jakub



Re: [PATCH v3 00/15] ARM/MVE use vectors of boolean for predicates

2022-01-26 Thread Christophe Lyon via Gcc-patches
Ping?

As discussed elsewhere with André, I'll drop patch #15 from this series,
since his patch
is a better fix.

Since v2 of this series had been approved, I think only patches 4,7,8,9,12
and 13 need
proper review.

Thanks,

Christophe


On Fri, Jan 14, 2022 at 3:22 PM Kyrylo Tkachov 
wrote:

> Hi Christophe, Richard,
>
> > -Original Message-
> > From: Gcc-patches  > bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Richard
> > Biener via Gcc-patches
> > Sent: Friday, January 14, 2022 1:33 PM
> > To: Christophe Lyon 
> > Cc: GCC Patches 
> > Subject: Re: [PATCH v3 00/15] ARM/MVE use vectors of boolean for
> > predicates
> >
> > On Fri, Jan 14, 2022 at 2:18 PM Christophe Lyon via Gcc-patches
> >  wrote:
> > >
> > > Hi,
> > >
> > > I hadn't realized we are moving to stage 4 this week-end :-(
> > >
> > > The PRs I'm fixing are P3, but without these fixes MVE support is badly
> > > broken, so I think I would be really good to fix that before the buggy
> > > version becomes part of an actual release.
> > > Anyway I posted v1 of the patches during stage1, so it should still be
> OK
> > > if they are accepted as-is ?
> >
> > In the end it's up to the target maintainers to weight the risk of
> breakage
> > vs. the risk of not usefulness ;)  But stage3 is where the "was posted
> > during stage1"
> > rule can easily apply - at some point we have to stop with such general
> ruling.
> >
>
> Thanks, that's in line with my interpretation.
> These patches resolve some nasty brokenness in the MVE support that I'm
> keen to see fixed and from what I can tell the patches shouldn't have a
> large effect on non-MVE code.
> So the risk vs reward balance for the arm port as a whole looks good to me.
> Andre has kindly agreed to help review the patches and I'll also try to
> get to them today and next week so that we can get them in early stage4.
>
> Thanks,
> Kyrill
>
> > Richard.
> >
> > > Thanks,
> > >
> > > Christophe
> > >
> > > On Thu, Jan 13, 2022 at 3:58 PM Christophe Lyon via Gcc-patches <
> > > gcc-patches@gcc.gnu.org> wrote:
> > >
> > > >
> > > > This is v3 of this patch series, fixing issues I discovered before
> > > > committing v2 (which had been approved).
> > > >
> > > > Thanks a lot to Richard Sandiford for his help.
> > > >
> > > > The changes v2 -> v3 are:
> > > >
> > > > Patch 4: Fix arm_hard_regno_nregs and CLASS_MAX_NREGS to support
> > VPR.
> > > >
> > > > Patch 7: Changes to the underlying representation of vectors of
> > > > booleans to account for the different expectations between
> AArch64/SVE
> > > > and Arm/MVE.
> > > >
> > > > Patch 8: Re-use and extend existing thumb2_movhi* patterns instead of
> > > > duplicating them in mve_mov. This requires the introduction of
> a
> > > > new constraint to match a constant vector of booleans. Add a new RTL
> > > > test.
> > > >
> > > > Patch 9: Introduce check_effective_target_arm_mve and skip
> > > > gcc.dg/signbit-2.c, because with MVE there is no fallback
> architecture
> > > > unlike SVE or AVX512.
> > > >
> > > > Patch 12: Update less load/store MVE builtins
> > > > (mve_vldrdq_gather_base_z_v2di,
> > > > mve_vldrdq_gather_offset_z_v2di,
> > > > mve_vldrdq_gather_shifted_offset_z_v2di,
> > > > mve_vstrdq_scatter_base_p_v2di,
> > > > mve_vstrdq_scatter_offset_p_v2di,
> > > > mve_vstrdq_scatter_offset_p_v2di_insn,
> > > > mve_vstrdq_scatter_shifted_offset_p_v2di,
> > > > mve_vstrdq_scatter_shifted_offset_p_v2di_insn,
> > > > mve_vstrdq_scatter_base_wb_p_v2di,
> > > > mve_vldrdq_gather_base_wb_z_v2di,
> > > > mve_vldrdq_gather_base_nowb_z_v2di,
> > > > mve_vldrdq_gather_base_wb_z_v2di_insn) for which we keep HI
> > mode
> > > > for vpr_register_operand.
> > > >
> > > > Patch 13: No need to update
> > > > gcc.target/arm/acle/cde-mve-full-assembly.c anymore since we re-use
> > > > the mov pattern that emits '@ movhi' in the assembly.
> > > >
> > > > Patch 15: This is a new patch to fix a problem I noticed during this
> > > > v2->v3 update.
> > > >
> > > >
> > > >
> > > > I'll squash patch 2 with patch 9 and patch 3 with patch 8.
> > > >
> > > > Original text:
> > > >
> > > > This patch series addresses PR 100757 and 101325 by representing
> > > > vectors of predicates (MVE VPR.P0 register) as vectors of booleans
> > > > rather than using HImode.
> > > >
> > > > As this implies a lot of mostly mechanical changes, I have tried to
> > > > split the patches in a way that should help reviewers, but the split
> > > > is a bit artificial.
> > > >
> > > > Patches 1-3 add new tests.
> > > >
> > > > Patches 4-6 are small independent improvements.
> > > >
> > > > Patch 7 implements the predicate qualifier, but does not change any
> > > > builtin yet.
> > > >
> > > > Patch 8 is the first of the two main patches, and uses the new
> > > > qualifier to describe the vcmp and vpsel builtins that are useful for
> > > > auto-vectorization of comparisons.
> > > >
> > > > Patch 9 is the second main patch, which fixes the vcond_mask
> expander.
> > > >
> > > > 

Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-01-26 Thread Ard Biesheuvel via Gcc-patches
On Wed, 26 Jan 2022 at 08:53, Dan Li  wrote:
>
> Hi, all,
>
> Sorry for bothering.
>
> I'm trying to commit aarch64 scs code to the gcc and there is an issue
> that I'm not sure about, could someone give me some suggestions?
> (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) )
>
> When clang enables scs, the following instructions are usually generated:
>
> str x30, [x18], 8
> ldp x29, x30, [sp], 16
> ..
> ldp x29, x30, [sp], 16  ## x30 pop
> ldr x30, [x18, -8]! ## x30 pop again
> ret
>
> The x30 register is popped twice here, Richard suggested that we can
> omit the first x30 pop here.
>
> AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm
> missing something with the kernel, could someone give some suggestions?
>
> The previous discussion can be found here [1].
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html
>

As was pointed out in the discussion, binary patching is in fact a
concern for the Linux kernel. E.g., Android uses generic binary
builds, but we would like to be able to switch between pointer
authentication and shadow call stack at boot time, rather than always
support both, and take the SCS performance hit on systems that
implement PAC as well.

However, it seems more straight-forward to patch PACIASP and AUTIASP
instructions into SCS push/pop instructions rather than the other way
around, as we can force the use of these exact opcodes [in the NOP
space]), as well as rely on existing unwind annotations to locate any
such instruction in the binary.

So omitting the load of X30 from the ordinary stack seems fine to me.

> On 1/25/22 22:51, Dan Li wrote:
> >
> >
> > On 1/25/22 02:19, Richard Sandiford wrote:
> >> Dan Li  writes:
> > +
> >   if (flag_stack_usage_info)
> > current_function_static_stack_size = constant_lower_bound 
> > (frame_size);
> > @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall)
> >   RTX_FRAME_RELATED_P (insn) = 1;
> > }
> > +  /* Pop return address from shadow call stack.  */
> > +  if (aarch64_shadow_call_stack_enabled ())
> > +emit_insn (gen_scs_pop ());
> > +
> 
>  This looks correct, but following on from the above, I guess we continue
>  to restore x30 from the frame in the traditional way first, and then
>  overwrite it with the SCS value.  I think the output code would be
>  slightly neater if we suppressed the first restore of x30.
> 
> >>> Yes, the current epilogue is like:
> >>>   ...
> >>>   ldp x29, x30, [sp], #16
> >>> +   ldr x30, [x18, #-8]!
> >>>
> >>> I think may be we can keep the two x30 pops here, for two reasons:
> >>> 1) The implementation of scs in clang is to pop x30 twice, it might be
> >>> better to be consistent with clang here[1].
> >>
> >> This doesn't seem a very compelling reason on its own, unless it's
> >> explicitly meant to be observable behaviour.  (But then, observed how?)
> >>
> >
> > Well, probably sticking to pop x30 twice is not a good idea.
> > AFAICT, there doesn't seem to be an explicit requirement that
> > this behavior must be followed.
> >
> > BTW:
> > Do we still need to emit the .cfi_restore 30 directive here if we
> > want to save a pop x30? (Sorry I'm not familiar enough with DWARF.)
> >
> > Since the aarch64 linux kernel always enables -fno-omit-frame-pointer,
> > the generated assembly code may be as follows:
> >
> > str x30, [x18], 8
> > ldp x29, x30, [sp], 16
> > ..
> > ldr x29, [sp], 16
> >  ## Do we still need a .cfi_restore 30 here
> > .cfi_restore 29
> > .cfi_def_cfa_offset 0
> > ldr x30, [x18, -8]!
> >  ## Or may be a non-CFA based directive here
> > ret
> >
> >>> 2) If we keep the first restore of x30, it may provide some flexibility
> >>> for other scenarios. For example, we can directly patch the scs_push/pop
> >>> insns in the binary to turn it into a binary without scs;
> >>
> >> Is that a supported (and ideally documented) use case?  If it is,
> >> then it becomes important for correctness that the compiler emits
> >> exactly the opcodes in the original patch.  If it isn't, then the
> >> compiler can emit other instructions that have the same effect.
> >>
> >
> > Oh, no, this is just a little trick that can be used for compatibility.
> > (Maybe some scenarios such as our company sometimes need to be
> > compatible with some non-open source binaries from different
> > manufacturers, two pops could make life easier :). )
> >
> > If this is not a consideration for our community, please ignore
> > this request.
> >
> >> I'd like a definitive ruling on this from the kernel folks before
> >> the patch goes in.
> >>
> >
> > Ok, I'll cc some kernel folks to make sure I didn't miss something.
> >
> >> If binary patching is supposed to be possible then scs_push and
> >> scs_pop *do* need to be separate