[PATCH] phiopt: Fix up two_value_replacement BOOLEAN_TYPE handling for Ada [PR98188]

2020-12-08 Thread Jakub Jelinek via Gcc-patches
Hi!

For Ada with LTO, boolean_{false,true}_node can be 1-bit precision boolean,
while TREE_TYPE (lhs) can be 8-bit precision boolean and thus we can end up
with wide_int mismatches.

The following patch fixes it by using TYPE_{MIN,MAX}_VALUE instead.

Bootstrapped/regtested on x86_64-linux and i686-linux (the former including
Ada as usual), ok for trunk?
Or do you prefer your version with wi::zero/wi::one?

2020-12-09  Jakub Jelinek  

PR bootstrap/98188
* tree-ssa-phiopt.c (two_value_replacement): For boolean, set
min and max from minimum and maximum of the type.

--- gcc/tree-ssa-phiopt.c.jj	2020-12-06 10:57:00.142847537 +0100
+++ gcc/tree-ssa-phiopt.c	2020-12-08 15:00:09.091063392 +0100
@@ -660,8 +660,8 @@ two_value_replacement (basic_block cond_
   wide_int min, max;
   if (TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE)
     {
-      min = wi::to_wide (boolean_false_node);
-      max = wi::to_wide (boolean_true_node);
+      min = wi::to_wide (TYPE_MIN_VALUE (TREE_TYPE (lhs)));
+      max = wi::to_wide (TYPE_MAX_VALUE (TREE_TYPE (lhs)));
     }
   else if (get_range_info (lhs, &min, &max) != VR_RANGE)
     return false;

Jakub
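
As an illustration of the mismatch described above, here is a toy model of
why the bounds have to come from the type of lhs.  It is only a sketch and
does not use GCC's wide_int API; toy_wide_int, toy_bool_type and
bounds_match_lhs are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a fixed-precision integer.  This is NOT GCC's wide_int.
struct toy_wide_int
{
  unsigned precision;   // number of significant bits
  std::uint64_t value;
};

// Toy stand-in for a boolean type node with a given precision.
struct toy_bool_type
{
  unsigned precision;
  toy_wide_int min_value () const { return { precision, 0 }; }
  toy_wide_int max_value () const { return { precision, 1 }; }
};

// Comparison that, in this model, is only meaningful when both operands
// have the same precision.
inline bool
toy_eq (const toy_wide_int &a, const toy_wide_int &b)
{
  assert (a.precision == b.precision);
  return a.value == b.value;
}

// Bounds derived from BOUNDS_TYPE are usable with LHS_CONST only if the
// precisions agree.
inline bool
bounds_match_lhs (const toy_bool_type &bounds_type,
                  const toy_wide_int &lhs_const)
{
  toy_wide_int min = bounds_type.min_value ();
  toy_wide_int max = bounds_type.max_value ();
  return min.precision == lhs_const.precision
         && max.precision == lhs_const.precision;
}
```

In this model, bounds taken from an 8-bit boolean type match an 8-bit
constant, while bounds taken from a 1-bit boolean type do not, which
mirrors the 1-bit versus 8-bit situation described for Ada with LTO.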



Re: [PATCH] fold-const: Fix up native_encode_initializer missing field handling [PR98193]

2020-12-08 Thread Richard Biener
On Wed, 9 Dec 2020, Jakub Jelinek wrote:

> Hi!
> 
> When native_encode_initializer is called with non-NULL mask (i.e. ATM
> bit_cast only), it checks if the current index in the CONSTRUCTOR (if any)
> is the next initializable FIELD_DECL, and if not, decrements cnt and
> performs the iteration with that FIELD_DECL as field and val of zero
> (so that it computes mask properly).  As the testcase shows, I forgot to
> set pos to the byte position of the field though (like it is done
> for e.g. index referenced FIELD_DECLs in the constructor).
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

Richard.

> 2020-12-09  Jakub Jelinek  
> 
>   PR c++/98193
>   * fold-const.c (native_encode_initializer): Set pos to field's
>   byte position if iterating over a field with missing initializer.
> 
>   * g++.dg/cpp2a/bit-cast7.C: New test.
> 
> --- gcc/fold-const.c.jj   2020-12-04 18:00:47.0 +0100
> +++ gcc/fold-const.c  2020-12-08 12:42:53.913529423 +0100
> @@ -8256,6 +8256,7 @@ native_encode_initializer (tree init, un
>   {
> cnt--;
> field = fld;
> +   pos = int_byte_position (field);
> val = build_zero_cst (TREE_TYPE (fld));
> if (TREE_CODE (val) == CONSTRUCTOR)
>   to_free = val;
> --- gcc/testsuite/g++.dg/cpp2a/bit-cast7.C.jj	2020-12-08 13:08:39.623341446 +0100
> +++ gcc/testsuite/g++.dg/cpp2a/bit-cast7.C	2020-12-08 13:07:45.443943866 +0100
> @@ -0,0 +1,39 @@
> +// PR c++/98193
> +// { dg-do compile { target c++20 } }
> +
> +template <typename To, typename From>
> +constexpr To
> +bit_cast (const From &from)
> +{
> +  return __builtin_bit_cast (To, from);
> +}
> +
> +struct J
> +{
> +  long int a, b : 11, h;
> +};
> +
> +struct K
> +{
> +  long int a, b : 11, c;
> +  constexpr bool operator == (const K &x)
> +  {
> +    return a == x.a && b == x.b && c == x.c;
> +  }
> +};
> +
> +struct L
> +{
> +  long long int a, b : 11, h;
> +};
> +struct M
> +{
> +  long long int a, b : 11, c;
> +  constexpr bool operator == (const M &x)
> +  {
> +    return a == x.a && b == x.b && c == x.c;
> +  }
> +};
> +
> +static_assert (bit_cast <K> (J{}) == K{}, "");
> +static_assert (bit_cast <M> (L{0x0feedbacdeadbeefLL}) == M{0x0feedbacdeadbeefLL}, "");
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[PATCH] fold-const: Fix up native_encode_initializer missing field handling [PR98193]

2020-12-08 Thread Jakub Jelinek via Gcc-patches
Hi!

When native_encode_initializer is called with non-NULL mask (i.e. ATM
bit_cast only), it checks if the current index in the CONSTRUCTOR (if any)
is the next initializable FIELD_DECL, and if not, decrements cnt and
performs the iteration with that FIELD_DECL as field and val of zero
(so that it computes mask properly).  As the testcase shows, I forgot to
set pos to the byte position of the field though (like it is done
for e.g. index referenced FIELD_DECLs in the constructor).

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2020-12-09  Jakub Jelinek  

PR c++/98193
* fold-const.c (native_encode_initializer): Set pos to field's
byte position if iterating over a field with missing initializer.

* g++.dg/cpp2a/bit-cast7.C: New test.

--- gcc/fold-const.c.jj 2020-12-04 18:00:47.0 +0100
+++ gcc/fold-const.c2020-12-08 12:42:53.913529423 +0100
@@ -8256,6 +8256,7 @@ native_encode_initializer (tree init, un
{
  cnt--;
  field = fld;
+ pos = int_byte_position (field);
  val = build_zero_cst (TREE_TYPE (fld));
  if (TREE_CODE (val) == CONSTRUCTOR)
to_free = val;
--- gcc/testsuite/g++.dg/cpp2a/bit-cast7.C.jj	2020-12-08 13:08:39.623341446 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast7.C	2020-12-08 13:07:45.443943866 +0100
@@ -0,0 +1,39 @@
+// PR c++/98193
+// { dg-do compile { target c++20 } }
+
+template <typename To, typename From>
+constexpr To
+bit_cast (const From &from)
+{
+  return __builtin_bit_cast (To, from);
+}
+
+struct J
+{
+  long int a, b : 11, h;
+};
+
+struct K
+{
+  long int a, b : 11, c;
+  constexpr bool operator == (const K &x)
+  {
+    return a == x.a && b == x.b && c == x.c;
+  }
+};
+
+struct L
+{
+  long long int a, b : 11, h;
+};
+struct M
+{
+  long long int a, b : 11, c;
+  constexpr bool operator == (const M &x)
+  {
+    return a == x.a && b == x.b && c == x.c;
+  }
+};
+
+static_assert (bit_cast <K> (J{}) == K{}, "");
+static_assert (bit_cast <M> (L{0x0feedbacdeadbeefLL}) == M{0x0feedbacdeadbeefLL}, "");

Jakub
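
One way to picture the bug is with a toy model of the mask computation.
This is a sketch only, not the code in fold-const.c: it assumes a per-byte
mask that starts as "padding" everywhere and is cleared for every byte
covered by a field, and toy_field / toy_compute_mask are hypothetical names.
With fix_pos false, a field without an initializer keeps using the stale
position, as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one field: where it lives, how large it is,
// and whether the initializer list mentions it.
struct toy_field
{
  std::size_t byte_pos;
  std::size_t size;
  bool has_initializer;
};

// Return a per-byte mask: 1 = byte not covered by any field (padding),
// 0 = byte belongs to a field.  When FIX_POS is false, a field with a
// missing initializer reuses the previous position, which models the bug.
std::vector<unsigned char>
toy_compute_mask (const std::vector<toy_field> &fields, std::size_t total,
                  bool fix_pos)
{
  std::vector<unsigned char> mask (total, 1);
  std::size_t pos = 0;
  for (const toy_field &f : fields)
    {
      // The fix corresponds to always updating pos here.
      if (f.has_initializer || fix_pos)
        pos = f.byte_pos;
      for (std::size_t i = 0; i < f.size && pos + i < total; ++i)
        mask[pos + i] = 0;
    }
  return mask;
}
```

For fields at byte offsets 0, 8 and 16 where only the first one has an
initializer, the model marks byte 16 as a field byte only when fix_pos is
true.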



Re: [PATCH] IBM Z: Build autovec-*-signaling-eq.c tests with exceptions

2020-12-08 Thread Andreas Krebbel via Gcc-patches
On 12/3/20 2:22 AM, Ilya Leoshkevich wrote:
> According to
> https://gcc.gnu.org/pipermail/gcc/2020-November/234344.html, GCC is
> allowed to perform optimizations that remove floating point traps,
> since they do not affect the modeled control flow.  This interferes with
> two signaling comparison tests, where (a <= b && a >= b) is turned into
> (a <= b && a == b) by test_for_singularity, into ((a <= b) & (a == b))
> by vectorizer and then into (a == b) by eliminate_redundant_comparison.
> 
> Fix by making traps affect the control flow by turning them into
> exceptions.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-12-03  Ilya Leoshkevich  
> 
>   * gcc.target/s390/zvector/autovec-double-signaling-eq.c: Build
>   with exceptions.
>   * gcc.target/s390/zvector/autovec-float-signaling-eq.c:
>   Likewise.

Ok. Thanks!

Andreas


Re: [PATCH] c++: Fix printing of decltype(nullptr) [PR97517]

2020-12-08 Thread Jason Merrill via Gcc-patches

On 12/8/20 5:53 PM, Marek Polacek wrote:

The C++ printer doesn't handle NULLPTR_TYPE, so we issue the ugly
"'nullptr_type' not supported by...".  Since NULLPTR_TYPE is
decltype(nullptr), it seemed reasonable to handle it where we
handle DECLTYPE_TYPE, that is, in the simple-type-specifier handler.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97517
* cxx-pretty-print.c (cxx_pretty_printer::simple_type_specifier): Handle
NULLPTR_TYPE.
(pp_cxx_type_specifier_seq): Likewise.
(cxx_pretty_printer::type_id): Likewise.

gcc/testsuite/ChangeLog:

PR c++/97517
* g++.dg/diagnostic/nullptr.C: New test.
---
  gcc/cp/cxx-pretty-print.c | 6 ++
  gcc/testsuite/g++.dg/diagnostic/nullptr.C | 8 
  2 files changed, 14 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/diagnostic/nullptr.C

diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c
index b97f70e2bd0..02721e88a5b 100644
--- a/gcc/cp/cxx-pretty-print.c
+++ b/gcc/cp/cxx-pretty-print.c
@@ -1381,6 +1381,10 @@ cxx_pretty_printer::simple_type_specifier (tree t)
pp_cxx_right_paren (this);
break;
  
+case NULLPTR_TYPE:

+  pp_cxx_ws_string (this, "nullptr_t");


Let's say std::nullptr_t.  OK with that change.


+  break;
+
  default:
c_pretty_printer::simple_type_specifier (t);
break;
@@ -1408,6 +1412,7 @@ pp_cxx_type_specifier_seq (cxx_pretty_printer *pp, tree t)
  case TYPE_DECL:
  case BOUND_TEMPLATE_TEMPLATE_PARM:
  case DECLTYPE_TYPE:
+case NULLPTR_TYPE:
pp_cxx_cv_qualifier_seq (pp, t);
pp->simple_type_specifier (t);
break;
@@ -1873,6 +1878,7 @@ cxx_pretty_printer::type_id (tree t)
  case TYPEOF_TYPE:
  case UNDERLYING_TYPE:
  case DECLTYPE_TYPE:
+case NULLPTR_TYPE:
  case TEMPLATE_ID_EXPR:
  case OFFSET_TYPE:
pp_cxx_type_specifier_seq (this, t);
diff --git a/gcc/testsuite/g++.dg/diagnostic/nullptr.C b/gcc/testsuite/g++.dg/diagnostic/nullptr.C
new file mode 100644
index 000..2c9e5a80bd5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/diagnostic/nullptr.C
@@ -0,0 +1,8 @@
+// PR c++/97517
+// { dg-do compile { target c++20 } }
+// Test that we print "decltype(nullptr)" correctly.
+
+template<typename T> struct Trait { static constexpr bool value = false; };
+template<typename T> concept Concept = Trait<T>::value; // { dg-message {\[with T = nullptr_t\]} }
+static_assert( Concept<decltype(nullptr)> ); // { dg-error "static assertion failed" }
+// { dg-message "constraints not satisfied" "" { target *-*-* } .-1 }

base-commit: 0221c656bbe5b4ab54e784df3b109c60cb27e5b6





[pushed] c++: Avoid [[nodiscard]] warning in requires-expr [PR98019]

2020-12-08 Thread Jason Merrill via Gcc-patches
If we aren't really evaluating the expression, it doesn't matter that the
return value is discarded.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/98019
* cvt.c (maybe_warn_nodiscard): Check c_inhibit_evaluation_warnings.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-nodiscard1.C: Remove xfail.
---
 gcc/cp/cvt.c | 3 +++
 gcc/testsuite/g++.dg/cpp2a/concepts-nodiscard1.C | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
index bcd7c5af81c..29ceaeb24ce 100644
--- a/gcc/cp/cvt.c
+++ b/gcc/cp/cvt.c
@@ -1031,6 +1031,9 @@ cp_get_callee_fndecl_nofold (tree call)
 static void
 maybe_warn_nodiscard (tree expr, impl_conv_void implicit)
 {
+  if (!warn_unused_result || c_inhibit_evaluation_warnings)
+return;
+
   tree call = expr;
   if (TREE_CODE (expr) == TARGET_EXPR)
 call = TARGET_EXPR_INITIAL (expr);
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-nodiscard1.C b/gcc/testsuite/g++.dg/cpp2a/concepts-nodiscard1.C
index 907e68b1fc2..3d5cd85bc94 100644
--- a/gcc/testsuite/g++.dg/cpp2a/concepts-nodiscard1.C
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-nodiscard1.C
@@ -1,6 +1,6 @@
 // PR c++/98019
 // { dg-do compile { target c++20 } }
-// { dg-excess-errors *-*-* }
+// Don't give [[nodiscard]] warning for an expression requirement.
 
 template <class T, class U> concept same_as = __is_same_as (T, U);
 

base-commit: f6e8e2797ebae21e483373e303ec1c7596309625
prerequisite-patch-id: 3a906bda30cfdb62957823e990826e9d6eaa474a
-- 
2.27.0



Re: [PATCH] c++: Don't require accessible dtors for some forms of new [PR59238]

2020-12-08 Thread Jason Merrill via Gcc-patches

On 12/8/20 5:57 PM, Jakub Jelinek wrote:

On Tue, Dec 08, 2020 at 05:34:13PM -0500, Jason Merrill wrote:

On 12/8/20 4:23 AM, Jakub Jelinek wrote:

The earlier cases in build_new_1 already use | tf_no_cleanup, these are
cases where the type isn't type_build_ctor_call nor explicit_value_init_p.
It is true that often one can't delete these (unless e.g. the dtor would be
private or protected and deletion done in some method), but diagnosing that
belongs to delete, not new.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?


It wasn't clear to me why adding tf_no_cleanup in those places made a
difference; after some investigation I tried moving the tf_no_cleanup closer
to where we build a TARGET_EXPR and then soon put it in an INIT_EXPR.



I'm afraid I don't know the FE enough to know.
I see cp_build_modify_expr can be called with INIT_EXPR from
perform_member_init, this build_new_1 case and get_temp_regvar.
Whether it is ok for all of them not to build destructors is something I
have no idea about, for build_new_1 case I was confident it shouldn't be
built.


It is OK; those other places are also using it to initialize something, 
so there's never a temporary.


Applied, thanks.
commit 3421b3cc35317815a60ee224b9593549d617d0ac
Author: Jason Merrill 
Date:   Tue Dec 8 22:05:45 2020 -0500

c++: Don't require accessible dtors for some forms of new [PR59238]

Jakub noticed that in build_new_1 we needed to add tf_no_cleanup to avoid
building a cleanup for a TARGET_EXPR that we already know is going to be
used to initialize something, so the cleanup will never be run.  The best
place to add it is close to where we build the INIT_EXPR: adding it in
cp_build_modify_expr fixes the single-object new, and adding it in
expand_default_init fixes array new.

Co-authored-by: Jakub Jelinek  

gcc/cp/ChangeLog:

PR c++/59238
* init.c (expand_default_init): Pass tf_no_cleanup when building
a TARGET_EXPR to go on the RHS of an INIT_EXPR.
* typeck.c (cp_build_modify_expr): Likewise.

gcc/testsuite/ChangeLog:

PR c++/59238
* g++.dg/cpp0x/new4.C: New test.

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 0b98f338feb..3c3e05d9b21 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -1922,7 +1922,7 @@ expand_default_init (tree binfo, tree true_exp, tree exp, tree init, int flags,
 	   in an exception region.  */;
   else
 	init = ocp_convert (type, init, CONV_IMPLICIT|CONV_FORCE_TEMP,
-			flags, complain);
+			flags, complain | tf_no_cleanup);
 
   if (TREE_CODE (init) == MUST_NOT_THROW_EXPR)
 	/* We need to protect the initialization of a catch parm with a
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 4d499af5ccb..afbb8ef02e6 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -8860,7 +8860,7 @@ cp_build_modify_expr (location_t loc, tree lhs, enum tree_code modifycode,
LOOKUP_ONLYCONVERTING.  */
 newrhs = convert_for_initialization (lhs, olhstype, newrhs, LOOKUP_NORMAL,
 	 ICR_INIT, NULL_TREE, 0,
- complain);
+	 complain | tf_no_cleanup);
   else
 newrhs = convert_for_assignment (olhstype, newrhs, ICR_ASSIGN,
  NULL_TREE, 0, complain, LOOKUP_IMPLICIT);
diff --git a/gcc/testsuite/g++.dg/cpp0x/new4.C b/gcc/testsuite/g++.dg/cpp0x/new4.C
new file mode 100644
index 000..728ef4ee7ce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/new4.C
@@ -0,0 +1,36 @@
+// PR c++/59238
+// { dg-do compile { target c++11 } }
+
+struct A { ~A () = delete; };
+A *pa{new A{}};
+A *pa2{new A[2]{}};
+
+class B { ~B () = default; };
+B *pb{new B{}};
+
+struct E {
+  ~E () = delete;
+private:
+  int x;
+};
+E *pe{new E{}};
+
+class C { ~C (); };
+C *pc{new C{}};
+
+class D { ~D () {} };
+D *pd{new D{}};
+
+struct F {
+  F () = default;
+  ~F () = delete;
+};
+F *pf{new F{}};
+
+struct G {
+  G () = default;
+  ~G () = delete;
+private:
+  int x;
+};
+G *pg{new G{}};
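
As a side illustration of the point that diagnosing an inaccessible
destructor belongs to delete rather than new, here is a minimal sketch
(not taken from the patch or the testsuite; heap_only and make_and_destroy
are hypothetical names).  The class has a private destructor, is created
with a plain new expression from outside the class, and is deleted from a
member function, which is the "deletion done in some method" case
mentioned in the thread.

```cpp
#include <cassert>

class heap_only
{
public:
  explicit heap_only (int v) : value (v) { ++live; }

  // Deletion is only possible from inside the class, where the private
  // destructor is accessible.
  static void destroy (heap_only *p) { delete p; }

  int value;
  static int live;

private:
  ~heap_only () { --live; }
};

int heap_only::live = 0;

int
make_and_destroy (int v)
{
  // The new expression does not need an accessible destructor here.
  heap_only *p = new heap_only (v);
  int seen = p->value;
  heap_only::destroy (p);
  return seen;
}
```

Writing delete p directly in make_and_destroy would be rejected, because
the destructor is private; that is the diagnostic that belongs to delete.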


Re: [r11-5391 Regression] FAIL: gcc.target/i386/avx512vl-vxorpd-2.c execution test on Linux/x86_64

2020-12-08 Thread Hongtao Liu via Gcc-patches
On Tue, Dec 8, 2020 at 6:23 PM Jakub Jelinek  wrote:
>
> On Mon, Nov 30, 2020 at 06:16:06PM +0800, Hongtao Liu via Gcc-patches wrote:
> > Add no strict aliasing to function CALC, since there are
> >
> > "long long tmp = (*(long long *) &src1[i]) ^ (*(long long *) &src2[i]);"
> >  in function CALC.
> >
> >
> > modified   gcc/testsuite/gcc.target/i386/avx512dq-vandnpd-2.c
> > @@ -9,6 +9,7 @@
> >  #include "avx512f-mask-type.h"
> >
> >  void
> > +__attribute__ ((optimize ("no-strict-aliasing"), noinline))
> >  CALC (double *s1, double *s2, double *r)
> >  {
> >int i;
> > modified   gcc/testsuite/gcc.target/i386/avx512dq-vandpd-2.c
>
> I think that is not the best fix, the CALC routines just want to
> model the behavior of the instructions, they are just part of the
> verification that the rest of the test works correctly and so we
> can just rewrite the code not to violate aliasing.
>
> Fixed thusly, committed to the trunk as obvious:
>
> 2020-12-08  Jakub Jelinek  
>
> * gcc.target/i386/avx512dq-vandnpd-2.c (CALC): Use union
> to avoid aliasing violations.
> * gcc.target/i386/avx512dq-vandnps-2.c (CALC): Likewise.
> * gcc.target/i386/avx512dq-vandpd-2.c (CALC): Likewise.
> * gcc.target/i386/avx512dq-vandps-2.c (CALC): Likewise.
> * gcc.target/i386/avx512dq-vorpd-2.c (CALC): Likewise.
> * gcc.target/i386/avx512dq-vorps-2.c (CALC): Likewise.
> * gcc.target/i386/avx512dq-vxorpd-2.c (CALC): Likewise.
> * gcc.target/i386/avx512dq-vxorps-2.c (CALC): Likewise.
>
> --- gcc/testsuite/gcc.target/i386/avx512dq-vandnpd-2.c.jj   2020-01-14 
> 20:02:47.785594824 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512dq-vandnpd-2.c  2020-12-08 
> 11:12:37.106053066 +0100
> @@ -16,8 +16,11 @@ CALC (double *s1, double *s2, double *r)
>
>for (i = 0; i < SIZE; i++)
>  {
> -  tmp = (~(*(long long *) &s1[i])) & (*(long long *) &s2[i]);
> -  r[i] = *(double *) &tmp;
> +  union U { double d; long long l; } u1, u2;
> +  u1.d = s1[i];
> +  u2.d = s2[i];
> +  u1.l = (~u1.l) & u2.l;
> +  r[i] = u1.d;
>  }
>  }
>
> --- gcc/testsuite/gcc.target/i386/avx512dq-vandnps-2.c.jj   2020-01-14 
> 20:02:47.785594824 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512dq-vandnps-2.c  2020-12-08 
> 11:12:55.033852659 +0100
> @@ -16,8 +16,11 @@ CALC (float *s1, float *s2, float *r)
>
>for (i = 0; i < SIZE; i++)
>  {
> -  tmp = (~(*(int *) &s1[i])) & (*(int *) &s2[i]);
> -  r[i] = *(float *) &tmp;
> +  union U { float f; int i; } u1, u2;
> +  u1.f = s1[i];
> +  u2.f = s2[i];
> +  u1.i = (~u1.i) & u2.i;
> +  r[i] = u1.f;
>  }
>  }
>
> --- gcc/testsuite/gcc.target/i386/avx512dq-vandpd-2.c.jj2020-01-14 
> 20:02:47.785594824 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512dq-vandpd-2.c   2020-12-08 
> 11:10:03.767767230 +0100
> @@ -16,8 +16,11 @@ CALC (double *s1, double *s2, double *r)
>
>for (i = 0; i < SIZE; i++)
>  {
> -  tmp = (*(long long *) &s1[i]) & (*(long long *) &s2[i]);
> -  r[i] = *(double *) &tmp;
> +  union U { double d; long long l; } u1, u2;
> +  u1.d = s1[i];
> +  u2.d = s2[i];
> +  u1.l &= u2.l;
> +  r[i] = u1.d;
>  }
>  }
>
> --- gcc/testsuite/gcc.target/i386/avx512dq-vandps-2.c.jj2020-01-14 
> 20:02:47.785594824 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512dq-vandps-2.c   2020-12-08 
> 11:11:51.548562356 +0100
> @@ -16,8 +16,11 @@ CALC (float *s1, float *s2, float *r)
>
>for (i = 0; i < SIZE; i++)
>  {
> -  tmp = (*(int *) &s1[i]) & (*(int *) &s2[i]);
> -  r[i] = *(float *) &tmp;
> +  union U { float f; int i; } u1, u2;
> +  u1.f = s1[i];
> +  u2.f = s2[i];
> +  u1.i &= u2.i;
> +  r[i] = u1.f;
>  }
>  }
>
> --- gcc/testsuite/gcc.target/i386/avx512dq-vorpd-2.c.jj 2020-01-14 
> 20:02:47.786594810 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512dq-vorpd-2.c2020-12-08 
> 11:15:35.497058846 +0100
> @@ -15,8 +15,11 @@ CALC (double *src1, double *src2, double
>
>for (i = 0; i < SIZE; i++)
>  {
> -  long long tmp = (*(long long *) &src1[i]) | (*(long long *) &src2[i]);
> -  dst[i] = *(double *) &tmp;
> +  union U { double d; long long l; } u1, u2;
> +  u1.d = src1[i];
> +  u2.d = src2[i];
> +  u1.l |= u2.l;
> +  dst[i] = u1.d;
>  }
>  }
>
> --- gcc/testsuite/gcc.target/i386/avx512dq-vorps-2.c.jj 2020-01-14 
> 20:02:47.786594810 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512dq-vorps-2.c2020-12-08 
> 11:15:45.737944364 +0100
> @@ -15,8 +15,11 @@ CALC (float *src1, float *src2, float *d
>
>for (i = 0; i < SIZE; i++)
>  {
> -  int tmp = (*(int *) &src1[i]) | (*(int *) &src2[i]);
> -  dst[i] = *(float *) &tmp;
> +  union U { float f; int i; } u1, u2;
> +  u1.f = src1[i];
> +  u2.f = src2[i];
> +  u1.i |= u2.i;
> +  dst[i] = u1.f;
>  }
>  }
>
> --- gcc/testsuite/gcc.target/i386/avx512dq-vxorpd-2.c.jj202
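
The idea of modelling the instruction without casting between pointer
types can be sketched as a standalone function.  Note that this sketch
uses memcpy rather than the union from the patch, since in C++ reading an
inactive union member is only guaranteed as a GNU extension; andnot_bits
is a hypothetical name, and a 64-bit double is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Compute (~a) & b on the object representations of two doubles without
// accessing a double through a pointer to an integer type.
double
andnot_bits (double a, double b)
{
  static_assert (sizeof (double) == sizeof (std::uint64_t),
                 "this sketch assumes a 64-bit double");
  std::uint64_t ia, ib;
  std::memcpy (&ia, &a, sizeof ia);
  std::memcpy (&ib, &b, sizeof ib);
  ia = ~ia & ib;
  double r;
  std::memcpy (&r, &ia, sizeof r);
  return r;
}
```

Compilers typically turn these fixed-size memcpy calls into plain register
moves, so the reference computation stays cheap.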

Re: [PATCH v4 1/2] asan: specify alignment for LASANPC labels

2020-12-08 Thread Ilya Leoshkevich via Gcc-patches
On Thu, 2020-07-09 at 14:07 +0200, Ilya Leoshkevich wrote:
> On Wed, 2020-07-01 at 21:48 +0200, Ilya Leoshkevich wrote:
> > On Wed, 2020-07-01 at 11:57 -0600, Jeff Law wrote:
> > > On Wed, 2020-07-01 at 14:29 +0200, Ilya Leoshkevich via Gcc-
> > > patches
> > > wrote:
> > > > gcc/ChangeLog:
> > > > 
> > > > 2020-06-30  Ilya Leoshkevich  
> > > > 
> > > > * asan.c (asan_emit_stack_protection): Use
> > > > CODE_LABEL_BOUNDARY.
> > > > * defaults.h (CODE_LABEL_BOUNDARY): New macro.
> > > > * doc/tm.texi: Document CODE_LABEL_BOUNDARY.
> > > > * doc/tm.texi.in: Likewise.
> > > Don't we already have the ability to set label alignments?  See
> > > LABEL_ALIGN.
> > 
> > The following works with -falign-labels=2:
> > 
> > --- a/gcc/asan.c
> > +++ b/gcc/asan.c
> > @@ -1524,7 +1524,7 @@ asan_emit_stack_protection (rtx base, rtx
> > pbase,
> > unsigned int alignb,
> >DECL_INITIAL (decl) = decl;
> >TREE_ASM_WRITTEN (decl) = 1;
> >TREE_ASM_WRITTEN (id) = 1;
> > -  SET_DECL_ALIGN (decl, CODE_LABEL_BOUNDARY);
> > +  SET_DECL_ALIGN (decl, (1 << LABEL_ALIGN (gen_label_rtx ())) *
> > BITS_PER_UNIT);
> >emit_move_insn (mem, expand_normal (build_fold_addr_expr
> > (decl)));
> >shadow_base = expand_binop (Pmode, lshr_optab, base,
> >   gen_int_shift_amount (Pmode,
> > ASAN_SHADOW_SHIFT),
> > 
> > In order to go this way, we would need to raise `-falign-labels=`
> > default to 2 for s390, which is not incorrect, but would
> > unnecessarily
> > clutter asm with `.align 2` before each label.  So IMHO it would be
> > nicer to simply ask the backend "what is your target's instruction
> > alignment?".
> 
> Besides that it would clutter asm with .align 2, another argument
> against using LABEL_ALIGN here is that it's semantically different
> from
> what is needed: -falign-labels value, which it returns, is specified
> by
> user for optimization purposes, whereas here we need to query the
> architecture's property.
> 
> In practical terms, if user specifies -falign-labels=4096, this would
> affect how the code is generated here. However, this would be
> completely unnecessary: we never jump to decl, its address is only
> saved for reporting.

Hi Jeff,

Could you please have another look at this one?

Best regards,
Ilya
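
For reference, the arithmetic in the quoted LABEL_ALIGN expression can be
sketched in isolation.  This assumes, as the expression suggests, that the
value being shifted is the base-2 logarithm of the byte alignment, and that
a unit is 8 bits; toy_bits_per_unit and log2_bytes_to_bits are hypothetical
names, not GCC macros.

```cpp
#include <cassert>

// Assumption for this sketch: 8 bits per addressable unit.
constexpr unsigned toy_bits_per_unit = 8;

// Convert a log2 byte alignment into an alignment in bits, mirroring
// (1 << log) * BITS_PER_UNIT.
constexpr unsigned
log2_bytes_to_bits (unsigned log2_bytes)
{
  return (1u << log2_bytes) * toy_bits_per_unit;
}
```

Under these assumptions a 2-byte label alignment (log2 of 1) gives 16 bits,
while a 4096-byte alignment (log2 of 12) gives 32768 bits, which is the
kind of unnecessary over-alignment the message above argues against.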



[committed] testsuite: Fix up testcase for ia32 [PR98191]

2020-12-08 Thread Jakub Jelinek via Gcc-patches
Hi!

This new test fails on i686-linux, due to -Wpsabi warnings.
Fixed the usual way, tested on x86_64-linux -m32,-m64,-m32/-mno-sse,
committed to trunk as obvious.

2020-12-09  Jakub Jelinek  

PR tree-optimization/98191
* gcc.dg/torture/pr98191.c: Add dg-additional-options with
-w -Wno-psabi.

diff --git a/gcc/testsuite/gcc.dg/torture/pr98191.c b/gcc/testsuite/gcc.dg/torture/pr98191.c
index 93cd27c21e1..7c4a6d19613 100644
--- a/gcc/testsuite/gcc.dg/torture/pr98191.c
+++ b/gcc/testsuite/gcc.dg/torture/pr98191.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-additional-options "-w -Wno-psabi" } */
 
 typedef double v2df __attribute__((vector_size(2*sizeof(double))));
 

Jakub



Re: [PATCH v5] Practical Improvement to libgcc Complex Divide

2020-12-08 Thread Jakub Jelinek via Gcc-patches
On Tue, Dec 08, 2020 at 10:32:33PM +, Patrick McGehearty via Gcc-patches 
wrote:
> 2020-12-08 Patrick McGehearty 
> 
> * gcc/c-family/c-cppbuiltin.c - Add supporting macros for new complex divide.
> * libgcc2.c (__divsc3, __divdc3, __divxc3, __divtc3): Improve complex divide.
> * libgcc/config/rs6000/_divkc3.c - Complex divide changes for rs6000.
> * gcc/testsuite/gcc.c-torture/execute/ieee/cdivchkd.c - double cdiv test.
> * gcc/testsuite/gcc.c-torture/execute/ieee/cdivchkf.c - float cdiv test.
> * gcc/testsuite/gcc.c-torture/execute/ieee/cdivchkld.c - long double cdiv 
> test.

Thanks for working on this, I'll defer review to Joseph, just want to add a few
random comments.
The above ChangeLog will not get through the commit checking scripts,
one needs two spaces before and after name instead of just one,
pathnames should be relative to the corresponding ChangeLog file and one
should separate what goes to each ChangeLog, and lines except the first one
should be tab indented.  So it should look like:

2020-12-08  Patrick McGehearty  

gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Add supporting macros for new
complex divide.
libgcc/
* libgcc2.c (XMTYPE, XCTYPE, RBIG, RMIN, RMIN2, RMINSCAL, RMAX2):
Define.
(__divsc3, __divdc3, __divxc3, __divtc3): Improve complex divide.
* config/rs6000/_divkc3.c (RBIG, RMIN, RMIN2, RMINSCAL, RMAX2):
Define.
(__divkc3): Improve complex divide.
gcc/testsuite/
* gcc.c-torture/execute/ieee/cdivchkd.c: New test.
* gcc.c-torture/execute/ieee/cdivchkf.c: New test.
* gcc.c-torture/execute/ieee/cdivchkld.c: New test.

or so.

> --- a/gcc/c-family/c-cppbuiltin.c
> +++ b/gcc/c-family/c-cppbuiltin.c
> @@ -1347,6 +1347,47 @@ c_cpp_builtins (cpp_reader *pfile)
> "PRECISION__"));
> sprintf (macro_name, "__LIBGCC_%s_EXCESS_PRECISION__", name);
> builtin_define_with_int_value (macro_name, excess_precision);
> +
> +   if ((mode == TYPE_MODE (float_type_node))
> +   || (mode == TYPE_MODE (double_type_node))
> +   || (mode == TYPE_MODE (long_double_type_node)))
> + {
> +   char val_name[64];
> +   char fname[8] = "";
> +   if (mode == TYPE_MODE (float_type_node))
> + strncpy(fname, "FLT",4);

Formatting, there should be space before ( for calls, and space in between
, and 4.  Also, what is the point of using strncpy?  strcpy or
memcpy would do.

> +   else if (mode == TYPE_MODE (double_type_node))
> + strncpy(fname, "DBL",4);
> +   else if (mode == TYPE_MODE (long_double_type_node))
> + strncpy(fname, "LDBL",5);
> +
> +   if ( (mode == TYPE_MODE (float_type_node))
> +|| (mode == TYPE_MODE (double_type_node)) )

Formatting, no spaces in between the ( ( and ) ).
> + {
> +   macro_name = (char *) alloca (strlen (name)
> + + sizeof ("__LIBGCC_EPSILON__"
> +   ));

This should use XALLOCAVEC macro, so
  macro_name
    = XALLOCAVEC (char, strlen (name)
                        + sizeof ("__LIBGCC_EPSILON__"));

I admit it is a preexisting problem in the code above it too.

> +   sprintf (macro_name, "__LIBGCC_%s_EPSILON__", name);
> +   sprintf( val_name, "__%s_EPSILON__", fname);

Space before ( rather than after it.

> +   builtin_define_with_value (macro_name, val_name, 0);
> + }
> +
> +   macro_name = (char *) alloca (strlen (name)
> + + sizeof ("__LIBGCC_MAX__"));

Again, XALLOCAVEC.  You could have remembered strlen (name) in a temporary
when you use it multiple times.  Again it is used in the code earlier
multiple times too and could be just remembered there.  GCC strlen
pass can optimize some cases of using multiple strlen calls on the same
string, but if there are intervening calls that could in theory change the
string lengths it needs to recompute those.
So just size_t name_len = strlen (name); and using name_len would be
IMHO better.
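
A minimal sketch of that suggestion, outside of GCC: the helper below
measures name once and reuses the length when sizing the buffer.  It uses
std::vector instead of alloca/XALLOCAVEC so that it is portable, and
make_libgcc_macro_name is a hypothetical name, not something in
c-cppbuiltin.c.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, not part of GCC: build "__LIBGCC_<name>_<suffix>__".
std::string
make_libgcc_macro_name (const char *name, const char *suffix)
{
  // Measure each string once and reuse the lengths.
  const std::size_t name_len = std::strlen (name);
  const std::size_t suffix_len = std::strlen (suffix);

  // sizeof of the fixed pieces also counts the terminating NUL.
  const std::size_t size
    = name_len + suffix_len + sizeof ("__LIBGCC_" "_" "__");

  std::vector<char> buf (size);
  int written = std::snprintf (buf.data (), buf.size (), "__LIBGCC_%s_%s__",
                               name, suffix);
  assert (written >= 0 && static_cast<std::size_t> (written) == size - 1);
  return std::string (buf.data (), size - 1);
}
```

The same name_len could then be reused for the EPSILON, MAX and MIN
variants instead of calling strlen (name) for each of them.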

> +   sprintf (macro_name, "__LIBGCC_%s_MAX__", name);
> +   sprintf( val_name, "__%s_MAX__", fname);
> +   builtin_define_with_value (macro_name, val_name, 0);
> +
> +   macro_name = (char *) alloca (strlen (name)
> + + sizeof ("__LIBGCC_MIN__"));
> +   sprintf (macro_name, "__LIBGCC_%s_MIN__", name);
> +   sprintf( val_name, "__%s_MIN__", fname);
> +   builtin_define_with_value (macro_name, val_name, 0);
> + }
> +#ifdef HAVE_adddf3
> +   builtin_define_with_int_value ("__LIBGCC_HAVE_HWDBL__",
> +  HAVE_adddf3);
> +#endif

Jakub



[PATCH] Correct -fdump-go-spec's handling of incomplete types

2020-12-08 Thread Nikhil Benesch via Gcc-patches

This patch corrects -fdump-go-spec's handling of incomplete types.
To my knowledge the issue fixed here has not been previously
reported. It was exposed by an in-progress port of gccgo to FreeBSD.

Given the following C code

struct s_fwd v_fwd;
struct s_fwd { };

-fdump-go-spec currently produces the following Go code

var v_fwd struct {};
type s_fwd s_fwd;

whereas the correct Go code is:

var v_fwd s_fwd;
type s_fwd struct {};

(Go is considerably more permissive than C with out-of-order
declarations, so anywhere an out-of-order declaration is valid in
C it is valid in Go.)

gcc/:
* godump.c (go_format_type): Don't consider whether a type has
been seen when determining whether to output a type by name.
Consider only the use_type_name parameter.
(go_output_typedef): When outputting a typedef, format the
declaration's original type, which contains the name of the
underlying type rather than the name of the typedef.
gcc/testsuite:
* gcc.misc-tests/godump-1.c: Add test case.

diff --git a/gcc/godump.c b/gcc/godump.c
index 29a45ce8979..b457965bdc8 100644
--- a/gcc/godump.c
+++ b/gcc/godump.c
@@ -697,9 +697,8 @@ go_format_type (class godump_container *container, tree type,
   ret = true;
   ob = &container->type_obstack;
 
-  if (TYPE_NAME (type) != NULL_TREE
-  && (container->decls_seen.contains (type)
- || container->decls_seen.contains (TYPE_NAME (type)))
+  if (use_type_name
+  && TYPE_NAME (type) != NULL_TREE
   && (AGGREGATE_TYPE_P (type)
  || POINTER_TYPE_P (type)
  || TREE_CODE (type) == FUNCTION_TYPE))
@@ -707,6 +706,12 @@ go_format_type (class godump_container *container, tree type,
   tree name;
   void **slot;
 
+  /* References to complex builtin types cannot be translated to
+     Go.  */
+  if (DECL_P (TYPE_NAME (type))
+ && DECL_IS_UNDECLARED_BUILTIN (TYPE_NAME (type)))
+   ret = false;
+
   name = TYPE_IDENTIFIER (type);
 
   slot = htab_find_slot (container->invalid_hash, IDENTIFIER_POINTER (name),
@@ -714,13 +719,17 @@ go_format_type (class godump_container *container, tree type,
   if (slot != NULL)
ret = false;
 
+  /* References to incomplete structs are permitted in many
+contexts, like behind a pointer or inside of a typedef. So
+consider any referenced struct a potential dummy type.  */
+  if (RECORD_OR_UNION_TYPE_P (type))
+   container->pot_dummy_types.add (IDENTIFIER_POINTER (name));
+
   obstack_1grow (ob, '_');
   go_append_string (ob, name);
   return ret;
 }
 
-  container->decls_seen.add (type);
-
   switch (TREE_CODE (type))
 {
 case TYPE_DECL:
@@ -821,34 +830,6 @@ go_format_type (class godump_container *container, tree type,
   break;
 
 case POINTER_TYPE:
-  if (use_type_name
-  && TYPE_NAME (TREE_TYPE (type)) != NULL_TREE
-  && (RECORD_OR_UNION_TYPE_P (TREE_TYPE (type))
- || (POINTER_TYPE_P (TREE_TYPE (type))
-  && (TREE_CODE (TREE_TYPE (TREE_TYPE (type)))
- == FUNCTION_TYPE))))
-{
- tree name;
- void **slot;
-
- name = TYPE_IDENTIFIER (TREE_TYPE (type));
-
- slot = htab_find_slot (container->invalid_hash,
-IDENTIFIER_POINTER (name), NO_INSERT);
- if (slot != NULL)
-   ret = false;
-
- obstack_grow (ob, "*_", 2);
- go_append_string (ob, name);
-
- /* The pointer here can be used without the struct or union
-definition.  So this struct or union is a potential dummy
-type.  */
- if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (type)))
-   container->pot_dummy_types.add (IDENTIFIER_POINTER (name));
-
- return ret;
-}
   if (TREE_CODE (TREE_TYPE (type)) == FUNCTION_TYPE)
obstack_grow (ob, "func", 4);
   else
@@ -1182,8 +1163,8 @@ go_output_typedef (class godump_container *container, tree decl)
return;
   *slot = CONST_CAST (void *, (const void *) type);
 
-  if (!go_format_type (container, TREE_TYPE (decl), true, false, NULL,
-  false))
+  if (!go_format_type (container, DECL_ORIGINAL_TYPE (decl), true, false,
+  NULL, false))
{
  fprintf (go_dump_file, "// ");
  slot = htab_find_slot (container->invalid_hash, type, INSERT);
diff --git a/gcc/testsuite/gcc.misc-tests/godump-1.c b/gcc/testsuite/gcc.misc-tests/godump-1.c
index f97bbecc9bc..96c25863374 100644
--- a/gcc/testsuite/gcc.misc-tests/godump-1.c
+++ b/gcc/testsuite/gcc.misc-tests/godump-1.c
@@ -471,6 +471,9 @@ typedef struct s_undef_t s_undef_t2;
 typedef struct s_fwd *s_fwd_p;
 /* { dg-final { scan-file godump-1.out "(?n)^type _s_fwd_p \\*_s_fwd$" } } */
 
+struct s_fwd v_fwd;

+/* { dg-final { scan-file godump-1.out "(?n)^var _v_fwd

Re: [PATCH] c++: Don't require accessible dtors for some forms of new [PR59238]

2020-12-08 Thread Jakub Jelinek via Gcc-patches
On Tue, Dec 08, 2020 at 05:34:13PM -0500, Jason Merrill wrote:
> On 12/8/20 4:23 AM, Jakub Jelinek wrote:
> > The earlier cases in build_new_1 already use | tf_no_cleanup, these are
> > cases where the type isn't type_build_ctor_call nor explicit_value_init_p.
> > It is true that often one can't delete these (unless e.g. the dtor would be
> > private or protected and deletion done in some method), but diagnosing that
> > belongs to delete, not new.
> > 
> > Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> > trunk?
> 
> It wasn't clear to me why adding tf_no_cleanup in those places made a
> difference; after some investigation I tried moving the tf_no_cleanup closer
> to where we build a TARGET_EXPR and then soon put it in an INIT_EXPR.

I'm afraid I don't know the FE well enough to say.
I see cp_build_modify_expr can be called with INIT_EXPR from
perform_member_init, this build_new_1 case and get_temp_regvar.
Whether it is ok for all of them not to build destructors is something I
have no idea about; for the build_new_1 case I was confident it shouldn't be
built.

> I also added an array-new to the testcase.

I thought I should look at the array new case too, but didn't get to that,
sorry.

> diff --git a/gcc/cp/init.c b/gcc/cp/init.c
> index 0b98f338feb..3c3e05d9b21 100644
> --- a/gcc/cp/init.c
> +++ b/gcc/cp/init.c
> @@ -1922,7 +1922,7 @@ expand_default_init (tree binfo, tree true_exp, tree 
> exp, tree init, int flags,
>  in an exception region.  */;
>else
>   init = ocp_convert (type, init, CONV_IMPLICIT|CONV_FORCE_TEMP,
> - flags, complain);
> + flags, complain | tf_no_cleanup);
>  
>if (TREE_CODE (init) == MUST_NOT_THROW_EXPR)
>   /* We need to protect the initialization of a catch parm with a
> diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
> index 4d499af5ccb..afbb8ef02e6 100644
> --- a/gcc/cp/typeck.c
> +++ b/gcc/cp/typeck.c
> @@ -8860,7 +8860,7 @@ cp_build_modify_expr (location_t loc, tree lhs, enum 
> tree_code modifycode,
> LOOKUP_ONLYCONVERTING.  */
>  newrhs = convert_for_initialization (lhs, olhstype, newrhs, 
> LOOKUP_NORMAL,
>ICR_INIT, NULL_TREE, 0,
> - complain);
> +  complain | tf_no_cleanup);
>else
>  newrhs = convert_for_assignment (olhstype, newrhs, ICR_ASSIGN,
>NULL_TREE, 0, complain, LOOKUP_IMPLICIT);

Jakub



[PATCH] c++: Fix printing of decltype(nullptr) [PR97517]

2020-12-08 Thread Marek Polacek via Gcc-patches
The C++ printer doesn't handle NULLPTR_TYPE, so we issue the ugly
"'nullptr_type' not supported by...".  Since NULLPTR_TYPE is
decltype(nullptr), it seemed reasonable to handle it where we
handle DECLTYPE_TYPE, that is, in the simple-type-specifier handler.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97517
* cxx-pretty-print.c (cxx_pretty_printer::simple_type_specifier): Handle
NULLPTR_TYPE.
(pp_cxx_type_specifier_seq): Likewise.
(cxx_pretty_printer::type_id): Likewise.

gcc/testsuite/ChangeLog:

PR c++/97517
* g++.dg/diagnostic/nullptr.C: New test.
---
 gcc/cp/cxx-pretty-print.c | 6 ++
 gcc/testsuite/g++.dg/diagnostic/nullptr.C | 8 
 2 files changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/nullptr.C

diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c
index b97f70e2bd0..02721e88a5b 100644
--- a/gcc/cp/cxx-pretty-print.c
+++ b/gcc/cp/cxx-pretty-print.c
@@ -1381,6 +1381,10 @@ cxx_pretty_printer::simple_type_specifier (tree t)
   pp_cxx_right_paren (this);
   break;
 
+case NULLPTR_TYPE:
+  pp_cxx_ws_string (this, "nullptr_t");
+  break;
+
 default:
   c_pretty_printer::simple_type_specifier (t);
   break;
@@ -1408,6 +1412,7 @@ pp_cxx_type_specifier_seq (cxx_pretty_printer *pp, tree t)
 case TYPE_DECL:
 case BOUND_TEMPLATE_TEMPLATE_PARM:
 case DECLTYPE_TYPE:
+case NULLPTR_TYPE:
   pp_cxx_cv_qualifier_seq (pp, t);
   pp->simple_type_specifier (t);
   break;
@@ -1873,6 +1878,7 @@ cxx_pretty_printer::type_id (tree t)
 case TYPEOF_TYPE:
 case UNDERLYING_TYPE:
 case DECLTYPE_TYPE:
+case NULLPTR_TYPE:
 case TEMPLATE_ID_EXPR:
 case OFFSET_TYPE:
   pp_cxx_type_specifier_seq (this, t);
diff --git a/gcc/testsuite/g++.dg/diagnostic/nullptr.C 
b/gcc/testsuite/g++.dg/diagnostic/nullptr.C
new file mode 100644
index 000..2c9e5a80bd5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/diagnostic/nullptr.C
@@ -0,0 +1,8 @@
+// PR c++/97517
+// { dg-do compile { target c++20 } }
+// Test that we print "decltype(nullptr)" correctly.
+
+template<typename T> struct Trait { static constexpr bool value = false; };
+template<typename T> concept Concept = Trait<T>::value; // { dg-message {\[with T = nullptr_t\]} }
+static_assert( Concept<decltype(nullptr)> ); // { dg-error "static assertion failed" }
+// { dg-message "constraints not satisfied" "" { target *-*-* } .-1 }

base-commit: 0221c656bbe5b4ab54e784df3b109c60cb27e5b6
-- 
2.29.2



Re: [PATCH] c++: Don't require accessible dtors for some forms of new [PR59238]

2020-12-08 Thread Jason Merrill via Gcc-patches

On 12/8/20 4:23 AM, Jakub Jelinek wrote:

Hi!

The earlier cases in build_new_1 already use | tf_no_cleanup, these are
cases where the type isn't type_build_ctor_call nor explicit_value_init_p.
It is true that often one can't delete these (unless e.g. the dtor would be
private or protected and deletion done in some method), but diagnosing that
belongs to delete, not new.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?


It wasn't clear to me why adding tf_no_cleanup in those places made a 
difference; after some investigation I tried moving the tf_no_cleanup 
closer to where we build a TARGET_EXPR and then soon put it in an INIT_EXPR.


I also added an array-new to the testcase.

Tested x86_64-linux.  Does this make sense to you?
diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 0b98f338feb..3c3e05d9b21 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -1922,7 +1922,7 @@ expand_default_init (tree binfo, tree true_exp, tree exp, tree init, int flags,
 	   in an exception region.  */;
   else
 	init = ocp_convert (type, init, CONV_IMPLICIT|CONV_FORCE_TEMP,
-			flags, complain);
+			flags, complain | tf_no_cleanup);
 
   if (TREE_CODE (init) == MUST_NOT_THROW_EXPR)
 	/* We need to protect the initialization of a catch parm with a
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 4d499af5ccb..afbb8ef02e6 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -8860,7 +8860,7 @@ cp_build_modify_expr (location_t loc, tree lhs, enum tree_code modifycode,
LOOKUP_ONLYCONVERTING.  */
 newrhs = convert_for_initialization (lhs, olhstype, newrhs, LOOKUP_NORMAL,
 	 ICR_INIT, NULL_TREE, 0,
- complain);
+	 complain | tf_no_cleanup);
   else
 newrhs = convert_for_assignment (olhstype, newrhs, ICR_ASSIGN,
  NULL_TREE, 0, complain, LOOKUP_IMPLICIT);
diff --git a/gcc/testsuite/g++.dg/cpp0x/new4.C b/gcc/testsuite/g++.dg/cpp0x/new4.C
new file mode 100644
index 000..728ef4ee7ce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/new4.C
@@ -0,0 +1,36 @@
+// PR c++/59238
+// { dg-do compile { target c++11 } }
+
+struct A { ~A () = delete; };
+A *pa{new A{}};
+A *pa2{new A[2]{}};
+
+class B { ~B () = default; };
+B *pb{new B{}};
+
+struct E {
+  ~E () = delete;
+private:
+  int x;
+};
+E *pe{new E{}};
+
+class C { ~C (); };
+C *pc{new C{}};
+
+class D { ~D () {} };
+D *pd{new D{}};
+
+struct F {
+  F () = default;
+  ~F () = delete;
+};
+F *pf{new F{}};
+
+struct G {
+  G () = default;
+  ~G () = delete;
+private:
+  int x;
+};
+G *pg{new G{}};
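
A minimal sketch of the point made in the message above, that an
inaccessible destructor is a matter for delete rather than for new.  The
class HeapOnly and its destroy helper are hypothetical names used only for
illustration; they are not taken from the patch or its testcase.  The sketch
deliberately uses a plain parenthesized new rather than the brace forms
PR59238 is about, so it should not depend on the fix being present.

```cpp
#include <cassert>

// Hypothetical illustration type: the destructor is private, so code
// outside the class cannot destroy a HeapOnly object directly.
class HeapOnly {
public:
  explicit HeapOnly(int v) : value_(v) { ++live; }
  int value() const { return value_; }

  // Deletion is done inside a member, where the private destructor is
  // accessible.  This is the "deletion done in some method" case.
  static void destroy(HeapOnly *p) { delete p; }

  static int live;  // number of objects currently alive

private:
  ~HeapOnly() { --live; }
  int value_;
};

int HeapOnly::live = 0;

// Creating the object needs no access to the destructor; only the
// delete inside HeapOnly::destroy does.
int make_and_destroy(int v) {
  HeapOnly *p = new HeapOnly(v);
  assert(HeapOnly::live == 1);
  int seen = p->value();
  HeapOnly::destroy(p);
  return seen;
}
```

Writing "delete p;" outside the class would be rejected because the
destructor is private, which is where the diagnostic belongs.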


[PATCH v5] Practical Improvement to libgcc Complex Divide

2020-12-08 Thread Patrick McGehearty via Gcc-patches
Summary of Purpose

The following patch to libgcc/libgcc2.c __divdc3 provides an
opportunity to gain important improvements to the quality of answers
for the default complex divide routine (half, float, double, extended,
long double precisions) when dealing with very large or very small exponents.

The current code correctly implements Smith's method (1962) [2]
further modified by c99's requirements for dealing with NaN (not a
number) results. When working with input values where the exponents
are greater than *_MAX_EXP/2 or less than -(*_MAX_EXP)/2, results are
substantially different from the answers provided by quad precision
more than 1% of the time. This error rate may be unacceptable for many
applications that cannot a priori restrict their computations to the
safe range. The proposed method reduces the frequency of
"substantially different" answers by more than 99% for double
precision at a modest cost of performance.

Differences between current gcc methods and the new method will be
described. Then accuracy and performance differences will be discussed.

Background

This project started with an investigation related to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714.  Study of Beebe[1]
provided an overview of past and recent practice for computing complex
divide. The current glibc implementation is based on Robert Smith's
algorithm [2] from 1962.  A google search found the paper by Baudin
and Smith [3] (same Robert Smith) published in 2012. Elen Kalda's
proposed patch [4] is based on that paper.
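
For readers unfamiliar with Smith's method, here is a minimal sketch of it
next to the textbook formula.  It is an illustration only, not the libgcc
code: the names Cplx, naive_divide and smith_divide are made up for this
sketch, and it leaves out the NaN/infinity handling that C99 requires as
well as the subnormal tests and scaling this patch adds.

```cpp
#include <cassert>
#include <cmath>

struct Cplx { double re; double im; };

// Textbook formula for (a + ib) / (c + id).  The products c*c and d*d
// can overflow or underflow even when the quotient is representable.
Cplx naive_divide(double a, double b, double c, double d) {
  double denom = c * c + d * d;
  return { (a * c + b * d) / denom, (b * c - a * d) / denom };
}

// Smith (1962): divide through by the larger of |c| and |d| so that
// the intermediate ratio has magnitude at most 1.
Cplx smith_divide(double a, double b, double c, double d) {
  assert(c != 0.0 || d != 0.0);  // sketch only: no zero-divisor handling
  if (std::fabs(c) < std::fabs(d)) {
    double ratio = c / d;
    double denom = c * ratio + d;
    return { (a * ratio + b) / denom, (b * ratio - a) / denom };
  }
  double ratio = d / c;
  double denom = d * ratio + c;
  return { (b * ratio + a) / denom, (b - a * ratio) / denom };
}
```

With all four inputs equal to 1e200, smith_divide returns 1 + 0i, while the
squared terms in naive_divide can overflow in double arithmetic.  The
failures described below involve cases where "ratio" itself becomes zero or
subnormal, which this sketch makes no attempt to handle.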

I developed two sets of tests by randomly distributing values over
a restricted range and the full range of input values. The current
complex divide handled the restricted range well enough, but failed on
the full range more than 1% of the time. Baudin and Smith's primary
test for "ratio" equals zero reduced the cases with 16 or more error
bits by a factor of 5, but still left too many flawed answers. Adding
debug print out to cases with substantial errors allowed me to see the
intermediate calculations for test values that failed. I noted that
for many of the failures, "ratio" was a subnormal. Changing the
"ratio" test from check for zero to check for subnormal reduced the 16
bit error rate by another factor of 12. This single modified test
provides the greatest benefit for the least cost, but the percentage
of cases with greater than 16 bit errors (double precision data) is
still greater than 0.027% (2.7 in 10,000).

Continued examination of remaining errors and their intermediate
computations led to the various tests of input values and scaling
to avoid under/overflow. The current patch does not handle some of the
rare and most extreme combinations of input values, but the random
test data is only showing 1 case in 10 million that has an error of
greater than 12 bits. That case has 18 bits of error and is due to
subtraction cancellation. These results are significantly better
than the results reported by Baudin and Smith.

Support for half, float, double, extended, and long double precision
is included as all are handled with suitable preprocessor symbols in a
single source routine. Since half precision is computed with float
precision as per current libgcc practice, the enhanced algorithm
provides no benefit for half precision and would cost performance.
Further investigation showed changing the half precision algorithm
to use the simple formula (real=a*c+b*d imag=b*c-a*d) caused no
loss of precision and modest improvement in performance.

The existing constants for each precision:
float: FLT_MAX, FLT_MIN;
double: DBL_MAX, DBL_MIN;
extended and/or long double: LDBL_MAX, LDBL_MIN
are used for avoiding the more common overflow/underflow cases.  This
use is made generic by defining appropriate __LIBGCC2_* macros in
c-cppbuiltin.c.

Tests are added for when both parts of the denominator have exponents
small enough that all input values can be scaled up, shifting any
subnormal values to normal values, without risking overflow. That
gained a clear improvement in accuracy. Similarly, when either
numerator was subnormal and the other numerator and both denominator
values were not too large, scaling could be used to reduce risk of
computing with subnormals.  The test and scaling values used all fit
within the allowed exponent range for each precision required by the C
standard.

Float precision has more difficulty with getting correct answers than
double precision. When hardware for double precision floating point
operations is available, float precision is now handled in double
precision intermediate calculations with the simple algorithm the same
as the half-precision method of using float precision for intermediate
calculations. Using the higher precision yields exact results for all
tested input values (64-bit double, 32-bit float) with the only
performance cost being the requirement to convert the four input
values from float to double. If double precision hardware is not
available, then fl
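
The float-in-double approach described above can be sketched roughly as
follows.  This is an illustration under stated assumptions, not the patch
itself: CplxF and divide_float_via_double are hypothetical names, double is
assumed to be a 64-bit type with hardware support, and special-value
handling is omitted.

```cpp
#include <cassert>
#include <cmath>

struct CplxF { float re; float im; };

// Widen the four float inputs to double, apply the simple formula, and
// narrow the results back to float.  Squares and products of float
// values stay well inside the exponent range of a 64-bit double.
CplxF divide_float_via_double(float a, float b, float c, float d) {
  assert(c != 0.0f || d != 0.0f);  // sketch only: no zero-divisor handling
  const double da = a, db = b, dc = c, dd = d;
  const double denom = dc * dc + dd * dd;
  return { static_cast<float>((da * dc + db * dd) / denom),
           static_cast<float>((db * dc - da * dd) / denom) };
}
```

For example, with all four inputs equal to 1e30f the product c*c would
overflow if computed in float, but the double intermediates give 1 + 0i.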

Re: [PATCH] Practical Improvement to libgcc Complex Divide

2020-12-08 Thread Patrick McGehearty via Gcc-patches

It took some work, but I think I've responded to all the issues raised here.
Patch V5 coming right after this mail.

On 11/16/2020 8:34 PM, Joseph Myers wrote:

On Tue, 8 Sep 2020, Patrick McGehearty via Gcc-patches wrote:


This project started with an investigation related to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714.  Study of Beebe[1]
provided an overview of past and recent practice for computing complex
divide. The current glibc implementation is based on Robert Smith's
algorithm [2] from 1962.  A google search found the paper by Baudin
and Smith [3] (same Robert Smith) published in 2012. Elen Kalda's
proposed patch [4] is based on that paper.

Thanks, I've now read Baudin and Smith so can review the patch properly.
I'm fine with the overall algorithm, so my comments generally relate to
how the code should best be integrated into libgcc while keeping it
properly machine-mode-generic as far as possible.


I developed two sets of tests by randomly distributing values over
a restricted range and the full range of input values. The current

Are these tests available somewhere?


After some polishing, the development tests are now ready to share.
I've got them in a single directory (a README, 47 mostly small .c files,
various scripts for running tests, and sample outputs from all the tests;
the tarball totals about 0.5 MBytes). These tests are intended for
developing and comparing different complex divide algorithms.
They are NOT intended or structured for routine compiler testing.
The complex divide compiler tests are in the accompanying patch
and are discussed later in this note.




Support for half, float, double, extended, and long double precision
is included as all are handled with suitable preprocessor symbols in a
single source routine. Since half precision is computed with float
precision as per current libgcc practice, the enhanced algorithm
provides no benefit for half precision and would cost performance.
Therefore half precision is left unchanged.

The existing constants for each precision:
float: FLT_MAX, FLT_MIN;
double: DBL_MAX, DBL_MIN;
extended and/or long double: LDBL_MAX, LDBL_MIN
are used for avoiding the more common overflow/underflow cases.

In general, libgcc code works with modes, not types; hardcoding references
to a particular mapping between modes and types is problematic.  Rather,
the existing code in c-cppbuiltin.c that has a loop over modes should be
extended to provide whatever information is needed, as macros defined for
each machine mode.

   /* For libgcc-internal use only.  */
   if (flag_building_libgcc)
 {
   /* Properties of floating-point modes for libgcc2.c.  */
   opt_scalar_float_mode mode_iter;
   FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_FLOAT)
 {
[...]

For example, that defines macros such as __LIBGCC_DF_FUNC_EXT__ and
__LIBGCC_DF_MANT_DIG__.  The _FUNC_EXT__ definition involves that code
computing a mapping to types.

I'd suggest defining additional macros such as __LIBGCC_DF_MAX__ in the
same code - for each supported floating-point mode.  They can be defined
to __FLT_MAX__ etc. (the predefined macros rather than the ones in
float.h) - the existing code that computes a suffix for functions can be
adjusted so it also computes the string such as "FLT", "DBL", "LDBL",
"FLT128" etc.

(I suggest defining to __FLT_MAX__ rather than to the expansion of
__FLT_MAX__ because that avoids any tricky interactions with the logic to
compute such expansions lazily.  I suggest __FLT_MAX__ rather than the
FLT_MAX name from float.h because that way you avoid any need to define
feature test macros to access names such as FLT128_MAX.)
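
A rough sketch of the idea, assuming a GCC-compatible compiler that
predefines __FLT_MAX__, __DBL_MAX__ and friends.  The DEMO_* names are
hypothetical stand-ins chosen for this illustration; in the real build the
per-mode macros would be emitted by c-cppbuiltin.c, not written by hand.

```cpp
#include <cassert>
#include <cfloat>

// Hypothetical per-mode names, each defined to the predefined macro
// itself rather than to its expansion.
#define DEMO_SF_MAX __FLT_MAX__
#define DEMO_SF_MIN __FLT_MIN__
#define DEMO_DF_MAX __DBL_MAX__
#define DEMO_DF_MIN __DBL_MIN__

// Mode-generic code selects a limit by pasting the mode name into the
// macro name, so no type-to-mode mapping is hardcoded at the use site.
#define DEMO_LIMIT(mode, which) DEMO_##mode##_##which

double df_half_max() { return DEMO_LIMIT(DF, MAX) / 2; }

// float.h is included here only to cross-check the values; the macros
// above do not need it.
void check_against_float_h() {
  assert(DEMO_LIMIT(SF, MAX) == FLT_MAX);
  assert(DEMO_LIMIT(SF, MIN) == FLT_MIN);
  assert(DEMO_LIMIT(DF, MAX) == DBL_MAX);
  assert(DEMO_LIMIT(DF, MIN) == DBL_MIN);
}
```

Because the DEMO_* macros refer to the double-underscore names, nothing
from float.h is needed to use them, which is the motivation given above.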


After some study, I've done my best to follow your recommendations
for using modes. I've defined __LIBGCC_xx_yyy__, where xx is SF, DF,
XF, TF and yyy is MIN, MAX, and EPSILON in c-cppbuiltin.c. SF uses FLT,
DF uses DBL, and XF and TF use LDBL. There is no need for those values
in HF mode because the HF code always uses SF precision.




diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 74ecca8..02c06d8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1343,6 +1343,11 @@ c_cpp_builtins (cpp_reader *pfile)
builtin_define_with_value ("__LIBGCC_INIT_SECTION_ASM_OP__",
 INIT_SECTION_ASM_OP, 1);
  #endif
+  /* For libgcc float/double optimization */
+#ifdef HAVE_adddf3
+  builtin_define_with_int_value ("__LIBGCC_HAVE_HWDBL__",
+HAVE_adddf3);
+#endif

This is another thing to handle more generically - possibly with something
like the mode_has_fma function, and defining a macro for each mode, named
after the mode, rather than only for DFmode.  For an alternative, see the
discussion below.


Defining this value generically but not using/testing it seems
more likely to be subject to future issues when someone tries
to use it, especially since I have no knowledge of how to
test for presen

libgo patch committed: Update to Go 1.15.6 release

2020-12-08 Thread Ian Lance Taylor via Gcc-patches
This patch updates libgo to the Go 1.15.6 release.  Bootstrapped and
ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
7a75590577dd3da6ab5091097cc9b80f02615360
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 619f1c001f0..dc2682d95d1 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-f4069d94a25893afc9f2fcf641359366f3ede017
+0d0b423739b2fee9788cb6cb8af9ced29375e545
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/MERGE b/libgo/MERGE
index b753907837d..e95c59a132d 100644
--- a/libgo/MERGE
+++ b/libgo/MERGE
@@ -1,4 +1,4 @@
-c53315d6cf1b4bfea6ff356b4a1524778c683bb9
+9b955d2d3fcff6a5bc8bce7bafdc4c634a28e95b
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
diff --git a/libgo/VERSION b/libgo/VERSION
index 701454707cd..7b6d7469626 100644
--- a/libgo/VERSION
+++ b/libgo/VERSION
@@ -1 +1 @@
-go1.15.5
+go1.15.6
diff --git a/libgo/go/cmd/go/internal/work/exec.go 
b/libgo/go/cmd/go/internal/work/exec.go
index 4f689438d1d..3898b2047c3 100644
--- a/libgo/go/cmd/go/internal/work/exec.go
+++ b/libgo/go/cmd/go/internal/work/exec.go
@@ -2778,6 +2778,21 @@ func (b *Builder) cgo(a *Action, cgoExe, objdir string, 
pcCFLAGS, pcLDFLAGS, cgo
idx = bytes.Index(src, []byte(cgoLdflag))
}
}
+
+   // We expect to find the contents of cgoLDFLAGS in flags.
+   if len(cgoLDFLAGS) > 0 {
+   outer:
+   for i := range flags {
+   for j, f := range cgoLDFLAGS {
+   if f != flags[i+j] {
+   continue outer
+   }
+   }
+   flags = append(flags[:i], 
flags[i+len(cgoLDFLAGS):]...)
+   break
+   }
+   }
+
if err := checkLinkerFlags("LDFLAGS", "go:cgo_ldflag", flags); 
err != nil {
return nil, nil, err
}
diff --git a/libgo/go/internal/poll/copy_file_range_linux.go 
b/libgo/go/internal/poll/copy_file_range_linux.go
index 09de299ff71..fc34aef4cba 100644
--- a/libgo/go/internal/poll/copy_file_range_linux.go
+++ b/libgo/go/internal/poll/copy_file_range_linux.go
@@ -10,15 +10,61 @@ import (
"syscall"
 )
 
-var copyFileRangeSupported int32 = 1 // accessed atomically
+var copyFileRangeSupported int32 = -1 // accessed atomically
 
 const maxCopyFileRangeRound = 1 << 30
 
+func kernelVersion() (major int, minor int) {
+   var uname syscall.Utsname
+   if err := syscall.Uname(&uname); err != nil {
+   return
+   }
+
+   rl := uname.Release
+   var values [2]int
+   vi := 0
+   value := 0
+   for _, c := range rl {
+   if '0' <= c && c <= '9' {
+   value = (value * 10) + int(c-'0')
+   } else {
+   // Note that we're assuming N.N.N here.  If we see 
anything else we are likely to
+   // mis-parse it.
+   values[vi] = value
+   vi++
+   if vi >= len(values) {
+   break
+   }
+   value = 0
+   }
+   }
+   switch vi {
+   case 0:
+   return 0, 0
+   case 1:
+   return values[0], 0
+   case 2:
+   return values[0], values[1]
+   }
+   return
+}
+
 // CopyFileRange copies at most remain bytes of data from src to dst, using
 // the copy_file_range system call. dst and src must refer to regular files.
 func CopyFileRange(dst, src *FD, remain int64) (written int64, handled bool, 
err error) {
-   if atomic.LoadInt32(&copyFileRangeSupported) == 0 {
+   if supported := atomic.LoadInt32(&copyFileRangeSupported); supported == 0 {
return 0, false, nil
+   } else if supported == -1 {
+   major, minor := kernelVersion()
+   if major > 5 || (major == 5 && minor >= 3) {
+   atomic.StoreInt32(&copyFileRangeSupported, 1)
+   } else {
+   // copy_file_range(2) is broken in various ways on 
kernels older than 5.3,
+   // see issue #42400 and
+   // 
https://man7.org/linux/man-pages/man2/copy_file_range.2.html#VERSIONS
+   atomic.StoreInt32(&copyFileRangeSupported, 0)
+   return 0, false, nil
+   }
}
for remain > 0 {
max := remain
@@ -41,7 +87,7 @@ func CopyFileRange(dst, src *FD, remain int64) (written 
int64, handled bool, err
// use copy_file_range(2) 
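
The release-string parsing and version gate in the hunk above can be
restated as a small C++ sketch.  The Go code in the diff is the real
implementation; kernel_version and copy_file_range_usable are hypothetical
names for this illustration, and the release string is passed in as a
parameter instead of being read with uname.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Mirror of the loop in the diff: accumulate digits, and on each
// non-digit record the value seen so far, stopping after two values.
// Like the original, it assumes an N.N.N layout, so a component is only
// recorded once a non-digit follows it.
std::pair<int, int> kernel_version(const std::string &release) {
  int values[2] = {0, 0};
  int vi = 0;
  int value = 0;
  for (char c : release) {
    if ('0' <= c && c <= '9') {
      value = value * 10 + (c - '0');
    } else {
      values[vi] = value;
      ++vi;
      if (vi >= 2)
        break;
      value = 0;
    }
  }
  // Unfilled slots are still zero, which matches the switch on vi.
  return {values[0], values[1]};
}

// The gate applied in CopyFileRange: use the syscall on 5.3 or newer.
bool copy_file_range_usable(int major, int minor) {
  assert(major >= 0 && minor >= 0);
  return major > 5 || (major == 5 && minor >= 3);
}
```

For instance, "5.3.0-42-generic" parses as (5, 3) and passes the gate,
while "4.19.128" parses as (4, 19) and does not.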

Re: [PATCH] c++: ICE with -fsanitize=vptr and constexpr dynamic_cast [PR98103]

2020-12-08 Thread Jason Merrill via Gcc-patches

On 12/4/20 10:40 PM, Marek Polacek wrote:

On Wed, Dec 02, 2020 at 09:01:48PM -0500, Jason Merrill wrote:

On 12/2/20 6:18 PM, Marek Polacek wrote:

-fsanitize=vptr initializes all vtable pointers to null so that it can
catch invalid calls; see cp_ubsan_maybe_initialize_vtbl_ptrs.  That
means that evaluating a vtable reference can produce a null pointer
in this mode, so cxx_eval_dynamic_cast_fn should check that.


Yes, but we shouldn't accept it silently; sanitize is supposed to flag
undefined behavior, not allow it.  If we see a null vptr, we should complain
and set *non_constant_p.


True, I shouldn't have left it for the run-time diagnostic.  How's this, then?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
-fsanitize=vptr initializes all vtable pointers to null so that it can
catch invalid calls; see cp_ubsan_maybe_initialize_vtbl_ptrs.  That
means that evaluating a vtable reference can produce a null pointer
in this mode, so cxx_eval_dynamic_cast_fn should check that and give
an error.

gcc/cp/ChangeLog:

PR c++/98103
* constexpr.c (cxx_eval_dynamic_cast_fn): If the evaluating of vtable
yields a null pointer, give an error and return.  Use objtype.

gcc/testsuite/ChangeLog:

PR c++/98103
* g++.dg/ubsan/vptr-18.C: New test.
---
  gcc/cp/constexpr.c   | 11 ++-
  gcc/testsuite/g++.dg/ubsan/vptr-18.C | 25 +
  2 files changed, 35 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/ubsan/vptr-18.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index e0d358027c9..c413313fbe1 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1998,11 +1998,20 @@ cxx_eval_dynamic_cast_fn (const constexpr_ctx *ctx, 
tree call,
   to the object under construction or destruction, this object is
   considered to be a most derived object that has the type of the
   constructor or destructor's class.  */
-  tree vtable = build_vfield_ref (obj, TREE_TYPE (obj));
+  tree vtable = build_vfield_ref (obj, objtype);
vtable = cxx_eval_constant_expression (ctx, vtable, /*lval*/false,
 non_constant_p, overflow_p);
if (*non_constant_p)
  return call;
+  /* With -fsanitize=vptr, we initialize all vtable pointers to null,
+ so it's possible that we got a null pointer now.  */
+  if (integer_zerop (vtable))
+{
+  if (!ctx->quiet)
+   error_at (loc, "virtual table pointer is used uninitialized");
+  *non_constant_p = true;
+  return integer_zero_node;
+}
/* VTABLE will be &_ZTV1A + 16 or similar, get _ZTV1A.  */
vtable = extract_obj_from_addr_offset (vtable);
const tree mdtype = DECL_CONTEXT (vtable);
diff --git a/gcc/testsuite/g++.dg/ubsan/vptr-18.C 
b/gcc/testsuite/g++.dg/ubsan/vptr-18.C
new file mode 100644
index 000..cd2ca0a9fb6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/vptr-18.C
@@ -0,0 +1,25 @@
+// PR c++/98103
+// { dg-do compile { target c++20 } }
+// { dg-additional-options "-fsanitize=vptr -fno-sanitize-recover=vptr" }
+// Modified constexpr-dynamic17.C.
+
+struct V {
+  virtual void f();
+};
+
+struct A : V { };
+
+struct B : V {
+  constexpr B(V*, A*);
+};
+
+struct D : B, A {
+  constexpr D() : B((A*)this, this) { }
+};
+
+constexpr B::B(V* v, A* a)
+{
+  dynamic_cast<B*>(a); // { dg-error "uninitialized" }
+}
+
+constexpr D d;

base-commit: df933e307b1950ce12472660dcac1765b8eb431d





[PATCH, rs6000] Update "size" attribute for Power10

2020-12-08 Thread Pat Haugen via Gcc-patches
Update size attribute for Power10.


This patch was broken out from my larger patch to update various attributes for
Power10, in order to make the review process hopefully easier. This patch only
updates the size attribute for various new instructions. There were no changes
requested to this portion of the original patch, so nothing is new here.

Bootstrap/regtest on powerpc64le (Power8/Power10) with no new regressions. Ok 
for trunk?

-Pat


2020-11-08  Pat Haugen  

gcc/
* config/rs6000/dfp.md (extendddtd2, trunctddd2, *cmp_internal1,
floatditd2, ftrunc2, fixdi2, dfp_ddedpd_,
dfp_denbcd_, dfp_dxex_, dfp_diex_,
*dfp_sgnfcnc_, dfp_dscli_, dfp_dscri_): Update size
attribute for Power10.
* config/rs6000/mma.md (*movoo): Likewise.
* config/rs6000/rs6000.md (define_attr "size"): Add 256.
(define_mode_attr bits): Add DD/TD modes.
* config/rs6000/sync.md (load_quadpti, store_quadpti, load_lockedpti,
store_conditionalpti): Update size attribute for Power10.

diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
index 9a952300cd6..7562e63a919 100644
--- a/gcc/config/rs6000/dfp.md
+++ b/gcc/config/rs6000/dfp.md
@@ -139,7 +139,8 @@ (define_insn "extendddtd2"
(float_extend:TD (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dctqpq %0,%1"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "128")])
 
 ;; The result of drdpq is an even/odd register pair with the converted
 ;; value in the even register and zero in the odd register.
@@ -153,6 +154,7 @@ (define_insn "trunctddd2"
   "TARGET_DFP"
   "drdpq %2,%1\;fmr %0,%2"
   [(set_attr "type" "dfp")
+   (set_attr "size" "128")
(set_attr "length" "8")])
 
 (define_insn "trunctdsd2"
@@ -206,7 +208,8 @@ (define_insn "*cmp_internal1"
  (match_operand:DDTD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dcmpu %0,%1,%2"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "<bits>")])
 
 (define_insn "floatdidd2"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
@@ -220,7 +223,8 @@ (define_insn "floatditd2"
(float:TD (match_operand:DI 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dcffixq %0,%1"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "128")])
 
 ;; Convert a decimal64/128 to a decimal64/128 whose value is an integer.
 ;; This is the first stage of converting it to an integer type.
@@ -230,7 +234,8 @@ (define_insn "ftrunc2"
(fix:DDTD (match_operand:DDTD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "drintn. 0,%0,%1,1"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "<bits>")])
 
 ;; Convert a decimal64/128 whose value is an integer to an actual integer.
 ;; This is the second stage of converting decimal float to integer type.
@@ -240,7 +245,8 @@ (define_insn "fixdi2"
(fix:DI (match_operand:DDTD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dctfix %0,%1"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "<bits>")])
 
 ;; Decimal builtin support
 
@@ -262,7 +268,8 @@ (define_insn "dfp_ddedpd_"
 UNSPEC_DDEDPD))]
   "TARGET_DFP"
   "ddedpd %1,%0,%2"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "<bits>")])
 
 (define_insn "dfp_denbcd_"
   [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
@@ -271,7 +278,8 @@ (define_insn "dfp_denbcd_"
 UNSPEC_DENBCD))]
   "TARGET_DFP"
   "denbcd %1,%0,%2"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "<bits>")])
 
 (define_insn "dfp_denbcd_v16qi_inst"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -301,7 +309,8 @@ (define_insn "dfp_dxex_"
   UNSPEC_DXEX))]
   "TARGET_DFP"
   "dxex %0,%1"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "<bits>")])
 
 (define_insn "dfp_diex_"
   [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
@@ -310,7 +319,8 @@ (define_insn "dfp_diex_"
 UNSPEC_DXEX))]
   "TARGET_DFP"
   "diex %0,%1,%2"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "<bits>")])
 
 (define_expand "dfptstsfi__"
   [(set (match_dup 3)
@@ -349,7 +359,8 @@ (define_insn "*dfp_sgnfcnc_"
 operands[1] = GEN_INT (63);
   return "dtstsfi %0,%1,%2";
 }
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "fp")
+   (set_attr "size" "<bits>")])
 
 (define_insn "dfp_dscli_"
   [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
@@ -358,7 +369,8 @@ (define_insn "dfp_dscli_"
 UNSPEC_DSCLI))]
   "TARGET_DFP"
   "dscli %0,%1,%2"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "dfp")
+   (set_attr "size" "<bits>")])
 
 (define_insn "dfp_dscri_"
   [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
@@ -367,4 +379,5 @@ (define_insn "dfp_dscri_"
 UNSPEC_DSCRI))]
   "TARGET_DFP"
   "dscri %0,%1,%2"
-  [(set_attr "type" "dfp")])
+  [(set_attr "type" "

Re: [PATCH] correct -Wmismatched-new-delete (PR 98160, 98166)

2020-12-08 Thread Martin Sebor via Gcc-patches

On 12/8/20 1:46 PM, Martin Sebor wrote:

PR 98160 reports an ICE in pretty printer code called from the newly
added -Wmismatched-new-delete.  The ICE is just a simple oversight,
but much more interesting is the warning issued for the test case.
It highlights an issue I didn't consider in the initial implementation:
that inlining one of a pair of allocation/deallocation functions but
not the other might lead to false positives when the inlined function
calls another allocator that the deallocator isn't associated with.

In addition, tests for the changes exposed the overly simplistic
nature of the detection of calls to mismatched forms of operator
new and delete which fails to consider member operators, also
resulting in false positives.

Finally, in a comment on the initial implementation Jonathan notes
that the -Wmismatched-new-delete warning should trigger not only
in user code but also in libstdc++ functions inlined into user code.
I thought I had done that but as it turns out, the "standard code
sequence" I put in place isn't sufficient to make this work.


I forgot to mention one other issue: the initial implementation is
also susceptible to false positives for calls to __builtin_free (and
__builtin_realloc) when the library function (i.e., free or realloc)
was associated with an allocator.  The patch also avoids those by
making the built-in handling more robust.  Since Glibc headers are
not allowed to declare symbols from other headers (e.g., <stdio.h>
is not allowed to declare free()), referring to the __builtin_xxx
forms of the functions might be the only way to associate, say,
tempnam with free.  I'm hoping to add this for the next Glibc
release.


The attached changes avoid the false positives a) by ignoring (with
a warning) the new form of the malloc attribute on inline functions,
and disabling the inlining of others by implicitly adding attribute
noinline to their declaration, and b) by making more robust
the detection of mismatched operators new and delete.  Furthermore,
the patch also arranges for the warning to trigger even for inlined
calls to functions defined in system headers.

To make review a little (marginally) easier the changes are split into two files:
1) gcc-98166-1.diff: introduces valid_new_delete_pair_p and
tree_inlined_location.
2) gcc-98166-2.diff: adjusts the attribute/warning implementation.

Tested on x86_64-linux.

Martin




RE: [PATCH v2 13/16]Arm: Add support for auto-vectorization using HF mode.

2020-12-08 Thread Tamar Christina via Gcc-patches
ping

> -Original Message-
> From: Gcc-patches  On Behalf Of Tamar
> Christina
> Sent: Friday, September 25, 2020 3:31 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; nd ;
> Ramana Radhakrishnan 
> Subject: [PATCH v2 13/16]Arm: Add support for auto-vectorization using HF
> mode.
> 
> Hi All,
> 
> This adds support to the auto-vectorizer to support HFmode vectorization for
> AArch32.  This is supported when +fp16 is used.  I wonder if I should disable
> the returning of the type if the option isn't enabled.
> 
> At the moment it will be returned but the vectorizer will try and fail to use 
> it.
> It wastes a few compile cycles but doesn't result in bad code.
> 
> Bootstrapped Regtested on arm-none-linux-gnueabihf and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.c (arm_preferred_simd_mode): Add E_HFmode.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/vect-half-floats.c: New test.
> 
> --


Re: [pushed] c++: Fix defaulted <=> fallback to < and == [PR96299]

2020-12-08 Thread Jakub Jelinek via Gcc-patches
On Tue, Dec 08, 2020 at 03:05:09PM -0500, Jason Merrill via Gcc-patches wrote:
> I thought I had implemented P1186R3, but apparently I didn't read it closely
> enough to understand the point of the paper, namely that for a defaulted
> operator<=>, if a member type doesn't have a viable operator<=>, we will use
> its operator< and operator== if the defaulted operator has a specific
> comparison category as its return type; the compiler can't guess if it
> should be strong_ordering or something else, but the user can make that
> choice explicit.

Thanks.

> The libstdc++ test change was necessary because of the change in
> genericize_spaceship from op0 > op1 to op1 < op0; this should be equivalent,
> but isn't because of PR88173.

So shall we announce that in cxx-status.html?

diff --git a/htdocs/projects/cxx-status.html b/htdocs/projects/cxx-status.html
index 23081245..403d6740 100644
--- a/htdocs/projects/cxx-status.html
+++ b/htdocs/projects/cxx-status.html
@@ -186,7 +186,7 @@
 
Consistent comparison 
(operator<=>)
   https://wg21.link/p0515r3";>P0515R3
-  10
+  10
__cpp_impl_three_way_comparison >= 201711 
 
 
@@ -200,9 +200,11 @@
 
 
   https://wg21.link/p1186r3";>P1186R3
+  11
 
 
   https://wg21.link/p1630r1";>P1630R1
+  10
 
 
   
@@ -312,7 +314,7 @@
 
Atomic Compare-and-Exchange with Padding Bits 
   https://wg21.link/p0528r3";>P0528R3
-  No (https://gcc.gnu.org/PR88101";>PR88101)
+   11 

 
 

Jakub



[PATCH] correct -Wmismatched-new-delete (PR 98160, 98166)

2020-12-08 Thread Martin Sebor via Gcc-patches

PR 98160 reports an ICE in pretty printer code called from the newly
added -Wmismatched-new-delete.  The ICE is just a simple oversight,
but much more interesting is the warning issued for the test case.
It highlights an issue I didn't consider in the initial implementation:
that inlining one of a pair of allocation/deallocation functions but
not the other might lead to false positives when the inlined function
calls another allocator that the deallocator isn't associated with.

In addition, tests for the changes exposed the overly simplistic
nature of the detection of calls to mismatched forms of operator
new and delete which fails to consider member operators, also
resulting in false positives.

Finally, in a comment on the initial implementation Jonathan notes
that the -Wmismatched-new-delete warning should trigger not only
in user code but also in libstdc++ functions inlined into user code.
I thought I had done that but as it turns out, the "standard code
sequence" I put in place isn't sufficient to make this work.

The attached changes avoid the false positives a) by ignoring (with
a warning) the new form of the malloc attribute on inline functions,
and disabling the inlining of others by implicitly adding attribute
noinline to their declaration, and b) by making more robust
the detection of mismatched operators new and delete.  Furthermore,
the patch also arranges for the warning to trigger even for inlined
calls to functions defined in system headers.
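
For readers unfamiliar with the new form of the attribute, a minimal
sketch of a user-defined allocator/deallocator pair follows.  It assumes
GCC 11 or later; the names buffer, buffer_acquire, buffer_release and
PAIRED_WITH are invented for this example.  Both functions are
deliberately not declared inline, since the attribute is ignored on
inline functions.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical resource type used only for this illustration.
struct buffer
{
  std::size_t size;
  unsigned char *data;
};

// The deallocator must be declared before the attribute refers to it.
void buffer_release (buffer *b);

#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
#  define PAIRED_WITH(dealloc) __attribute__ ((malloc (dealloc, 1)))
#else
#  define PAIRED_WITH(dealloc)
#endif

// Pointers returned by buffer_acquire are to be passed to buffer_release.
PAIRED_WITH (buffer_release) buffer *buffer_acquire (std::size_t size);

buffer *
buffer_acquire (std::size_t size)
{
  buffer *b = static_cast<buffer *> (std::malloc (sizeof (buffer)));
  if (!b)
    return nullptr;
  b->size = size;
  b->data = static_cast<unsigned char *> (std::malloc (size ? size : 1));
  if (!b->data)
    {
      std::free (b);
      return nullptr;
    }
  return b;
}

void
buffer_release (buffer *b)
{
  if (!b)
    return;
  std::free (b->data);
  std::free (b);
}
```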

To make review a little (marginally) easier, the changes are split into
two files:
1) gcc-98166-1.diff: introduces valid_new_delete_pair_p and
tree_inlined_location.
2) gcc-98166-2.diff: adjusts the attribute/warning implementation.

Tested on x86_64-linux.

Martin
Introduce an overload of valid_new_delete_pair_p and tree_inlined_location.

gcc/ChangeLog:

	* tree-ssa-dce.c (valid_new_delete_pair_p): Factor code out into
	valid_new_delete_pair_p in tree.c.
	* tree.c (tree_inlined_location): Define new function.
	(valid_new_delete_pair_p): Define.
	* tree.h (tree_inlined_location): Declare.
	(valid_new_delete_pair_p): Declare.

diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 9fb156c120d..5ec872967b7 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -656,67 +656,7 @@ valid_new_delete_pair_p (gimple *new_call, gimple *delete_call)
 {
   tree new_asm = DECL_ASSEMBLER_NAME (gimple_call_fndecl (new_call));
   tree delete_asm = DECL_ASSEMBLER_NAME (gimple_call_fndecl (delete_call));
-  const char *new_name = IDENTIFIER_POINTER (new_asm);
-  const char *delete_name = IDENTIFIER_POINTER (delete_asm);
-  unsigned int new_len = IDENTIFIER_LENGTH (new_asm);
-  unsigned int delete_len = IDENTIFIER_LENGTH (delete_asm);
-
-  if (new_len < 5 || delete_len < 6)
-return false;
-  if (new_name[0] == '_')
-++new_name, --new_len;
-  if (new_name[0] == '_')
-++new_name, --new_len;
-  if (delete_name[0] == '_')
-++delete_name, --delete_len;
-  if (delete_name[0] == '_')
-++delete_name, --delete_len;
-  if (new_len < 4 || delete_len < 5)
-return false;
-  /* *_len is now just the length after initial underscores.  */
-  if (new_name[0] != 'Z' || new_name[1] != 'n')
-return false;
-  if (delete_name[0] != 'Z' || delete_name[1] != 'd')
-return false;
-  /* _Znw must match _Zdl, _Zna must match _Zda.  */
-  if ((new_name[2] != 'w' || delete_name[2] != 'l')
-  && (new_name[2] != 'a' || delete_name[2] != 'a'))
-return false;
-  /* 'j', 'm' and 'y' correspond to size_t.  */
-  if (new_name[3] != 'j' && new_name[3] != 'm' && new_name[3] != 'y')
-return false;
-  if (delete_name[3] != 'P' || delete_name[4] != 'v')
-return false;
-  if (new_len == 4
-  || (new_len == 18 && !memcmp (new_name + 4, "RKSt9nothrow_t", 14)))
-{
-  /* _ZnXY or _ZnXYRKSt9nothrow_t matches
-	 _ZdXPv, _ZdXPvY and _ZdXPvRKSt9nothrow_t.  */
-  if (delete_len == 5)
-	return true;
-  if (delete_len == 6 && delete_name[5] == new_name[3])
-	return true;
-  if (delete_len == 19 && !memcmp (delete_name + 5, "RKSt9nothrow_t", 14))
-	return true;
-}
-  else if ((new_len == 19 && !memcmp (new_name + 4, "St11align_val_t", 15))
-	   || (new_len == 33
-	   && !memcmp (new_name + 4, "St11align_val_tRKSt9nothrow_t", 29)))
-{
-  /* _ZnXYSt11align_val_t or _ZnXYSt11align_val_tRKSt9nothrow_t matches
-	 _ZdXPvSt11align_val_t or _ZdXPvYSt11align_val_t or  or
-	 _ZdXPvSt11align_val_tRKSt9nothrow_t.  */
-  if (delete_len == 20 && !memcmp (delete_name + 5, "St11align_val_t", 15))
-	return true;
-  if (delete_len == 21
-	  && delete_name[5] == new_name[3]
-	  && !memcmp (delete_name + 6, "St11align_val_t", 15))
-	return true;
-  if (delete_len == 34
-	  && !memcmp (delete_name + 5, "St11align_val_tRKSt9nothrow_t", 29))
-	return true;
-}
-  return false;
+  return valid_new_delete_pair_p (new_asm, delete_asm);
 }
 
 /* Propagate necessity using the operands of necessary statements.
diff --git a/gcc/tree.c b/gcc/tree.c
index
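
As an aid to reading the hunk removed from tree-ssa-dce.c above, here is
a simplified standalone rendering of the same mangled-name check.  It is
a sketch only: it takes the names as strings rather than identifier
trees, omits the align_val_t forms, and the function name
mangled_new_delete_pair_p is invented for this example.  It does not
reflect the new overload in tree.c.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Return true if the mangled operator new name and the mangled operator
// delete name form a matching pair (plain, sized and nothrow forms only).
bool
mangled_new_delete_pair_p (const std::string &new_mangled,
                           const std::string &delete_mangled)
{
  const char *new_name = new_mangled.c_str ();
  const char *delete_name = delete_mangled.c_str ();
  std::size_t new_len = new_mangled.size ();
  std::size_t delete_len = delete_mangled.size ();

  if (new_len < 5 || delete_len < 6)
    return false;
  // Skip up to two leading underscores on each name.
  for (int i = 0; i < 2; ++i)
    {
      if (new_name[0] == '_')
        ++new_name, --new_len;
      if (delete_name[0] == '_')
        ++delete_name, --delete_len;
    }
  if (new_len < 4 || delete_len < 5)
    return false;
  if (new_name[0] != 'Z' || new_name[1] != 'n')
    return false;
  if (delete_name[0] != 'Z' || delete_name[1] != 'd')
    return false;
  // _Znw must match _Zdl, _Zna must match _Zda.
  if ((new_name[2] != 'w' || delete_name[2] != 'l')
      && (new_name[2] != 'a' || delete_name[2] != 'a'))
    return false;
  // 'j', 'm' and 'y' correspond to size_t.
  if (new_name[3] != 'j' && new_name[3] != 'm' && new_name[3] != 'y')
    return false;
  if (delete_name[3] != 'P' || delete_name[4] != 'v')
    return false;

  static const char nothrow[] = "RKSt9nothrow_t";
  if (new_len == 4
      || (new_len == 18 && !std::memcmp (new_name + 4, nothrow, 14)))
    {
      // Plain delete, sized delete, or nothrow delete.
      if (delete_len == 5)
        return true;
      if (delete_len == 6 && delete_name[5] == new_name[3])
        return true;
      if (delete_len == 19 && !std::memcmp (delete_name + 5, nothrow, 14))
        return true;
    }
  return false;
}
```

For instance, _Znwm pairs with _ZdlPv and with the sized form _ZdlPvm,
while _Znam (array new) does not pair with _ZdlPv.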

[PATCH][pushed] contrib: modernize filter-clang-warnings.py

2020-12-08 Thread Martin Liška

contrib/ChangeLog:

* filter-clang-warnings.py: Modernize and filter 2 more
patterns.
---
 contrib/filter-clang-warnings.py | 41 +++-
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/contrib/filter-clang-warnings.py b/contrib/filter-clang-warnings.py
index 15cca5ff2df..2b7b42fd099 100755
--- a/contrib/filter-clang-warnings.py
+++ b/contrib/filter-clang-warnings.py
@@ -21,17 +21,24 @@
 #
 #
 
-import sys

 import argparse
 
+

 def skip_warning(filename, message):
 ignores = {
-'': ['-Warray-bounds', '-Wmismatched-tags', 'gcc_gfc: -Wignored-attributes', '-Wchar-subscripts',
-'string literal (potentially insecure): -Wformat-security', '-Wdeprecated-register',
-'-Wvarargs', 'keyword is hidden by macro definition', "but the argument has type 'char *': -Wformat-pedantic",
-'-Wnested-anon-types', 'qualifier in explicit instantiation of', 'attribute argument not supported: asm_fprintf',
-'when in C++ mode, this behavior is deprecated', '-Wignored-attributes', '-Wgnu-zero-variadic-macro-arguments',
-'-Wformat-security'],
+'': ['-Warray-bounds', '-Wmismatched-tags',
+ 'gcc_gfc: -Wignored-attributes', '-Wchar-subscripts',
+ 'string literal (potentially insecure): -Wformat-security',
+ '-Wdeprecated-register',
+ '-Wvarargs', 'keyword is hidden by macro definition',
+ "but the argument has type 'char *': -Wformat-pedantic",
+ '-Wnested-anon-types',
+ 'qualifier in explicit instantiation of',
+ 'attribute argument not supported: asm_fprintf',
+ 'when in C++ mode, this behavior is deprecated',
+ '-Wignored-attributes', '-Wgnu-zero-variadic-macro-arguments',
+ '-Wformat-security', '-Wundefined-internal',
+ '-Wunknown-warning-option'],
 'insn-modes.c': ['-Wshift-count-overflow'],
 'insn-emit.c': ['-Wtautological-compare'],
 'insn-attrtab.c': ['-Wparentheses-equality'],
@@ -47,26 +54,26 @@ def skip_warning(filename, message):
 for i in ignores:
 if name in filename and i in message:
 return True
-
 return False
 
+

 parser = argparse.ArgumentParser()
-parser.add_argument('log', help = 'Log file with clang warnings')
+parser.add_argument('log', help='Log file with clang warnings')
 args = parser.parse_args()
 
-lines = [l.strip() for l in open(args.log)]

+lines = [line.strip() for line in open(args.log)]
 total = 0
 messages = []
-for l in lines:
+for line in lines:
 token = ': warning: '
-i = l.find(token)
+i = line.find(token)
 if i != -1:
-location = l[:i]
-message = l[i + len(token):]
+location = line[:i]
+message = line[i + len(token):]
 if not skip_warning(location, message):
 total += 1
-messages.append(l)
+messages.append(line)
 
-for l in sorted(messages):

-print(l)
+for line in sorted(messages):
+print(line)
 print('\nTotal warnings: %d' % total)
--
2.29.2



c++: Originating and instantiating module

2020-12-08 Thread Nathan Sidwell


With modules, streamed entities have two new properties -- the module
that declares them and the module that instantiates them.  Here
'instantiate' applies to more than just templates -- for instance an
implicit member fn.  These may well be the same module.  This adds the
calls to places that need it.   

gcc/cp/
* class.c (layout_class_type): Call set_instantiating_module.
(build_self_reference): Likewise.
* decl.c (grokfndecl): Call set_originating_module.
(grokvardecl): Likewise.
(grokdeclarator): Likewise.
* pt.c (maybe_new_partial_specialization): Call
set_instantiating_module, propagate DECL_MODULE_EXPORT_P.
(lookup_template_class_1): Likewise.
(tsubst_function_decl): Likewise.
(tsubst_decl, instantiate_template_1): Likewise.
(build_template_decl): Propagate module flags.
(tsubst_template_decl): Likewise.
(finish_concept_definition): Call set_originating_module.
* module.c (set_instantiating_module, set_originating_module): 
Stubs.



--
Nathan Sidwell
diff --git i/gcc/cp/class.c w/gcc/cp/class.c
index 2ab123d6ccf..bc0d3d6bf86 100644
--- i/gcc/cp/class.c
+++ w/gcc/cp/class.c
@@ -6759,6 +6759,8 @@ layout_class_type (tree t, tree *virtuals_p)
   TYPE_CONTEXT (base_t) = t;
   DECL_CONTEXT (base_d) = t;
 
+  set_instantiating_module (base_d);
+
   /* If the ABI version is not at least two, and the last
 	 field was a bit-field, RLI may not be on a byte
 	 boundary.  In particular, rli_size_unit_so_far might
@@ -8738,6 +8740,7 @@ build_self_reference (void)
   DECL_ARTIFICIAL (decl) = 1;
   SET_DECL_SELF_REFERENCE_P (decl);
   set_underlying_type (decl);
+  set_instantiating_module (decl);  
 
   if (processing_template_decl)
 decl = push_template_decl (decl);
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index 7da8c65e984..bb5bb2f1a18 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -9878,6 +9878,8 @@ grokfndecl (tree ctype,
   && !processing_template_decl)
 deduce_noexcept_on_destructor (decl);
 
+  set_originating_module (decl);
+
   decl = check_explicit_specialization (orig_declarator, decl,
 	template_count,
 	2 * funcdef_flag +
@@ -10122,6 +10124,8 @@ grokvardecl (tree type,
   TREE_PUBLIC (decl) = DECL_EXTERNAL (decl);
 }
 
+  set_originating_module (decl);
+
   if (decl_spec_seq_has_spec_p (declspecs, ds_thread))
 {
   if (DECL_EXTERNAL (decl) || TREE_STATIC (decl))
@@ -12965,6 +12969,8 @@ grokdeclarator (const cp_declarator *declarator,
revert this subsequently if it determines that
the clones should share a common implementation.  */
 	DECL_ABSTRACT_P (decl) = true;
+
+	  set_originating_module (decl);
 	}
   else if (current_class_type
 	   && constructor_name_p (unqualified_id, current_class_type))
@@ -13499,6 +13505,8 @@ grokdeclarator (const cp_declarator *declarator,
 	  ;  /* We already issued a permerror.  */
 	else if (decl && DECL_NAME (decl))
 	  {
+		set_originating_module (decl, true);
+		
 		if (initialized)
 		  /* Kludge: We need funcdef_flag to be true in do_friend for
 		 in-class defaulted functions, but that breaks grokfndecl.
diff --git i/gcc/cp/module.cc w/gcc/cp/module.cc
index 705804a5515..948ca2a6cab 100644
--- i/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -99,6 +99,16 @@ get_originating_module (tree, bool)
   return 0;
 }
 
+void
+set_instantiating_module (tree)
+{
+}
+
+void
+set_originating_module (tree, bool)
+{
+}
+
 module_state *
 preprocess_module (module_state *, unsigned, bool, bool, bool, cpp_reader *)
 {
diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index 56d7b560229..6b8e486a642 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -960,6 +960,9 @@ maybe_new_partial_specialization (tree type)
   TREE_PRIVATE (d) = (current_access_specifier == access_private_node);
   TREE_PROTECTED (d) = (current_access_specifier == access_protected_node);
 
+  set_instantiating_module (d);
+  DECL_MODULE_EXPORT_P (d) = DECL_MODULE_EXPORT_P (tmpl);
+
   return t;
 }
 
@@ -4922,6 +4925,17 @@ build_template_decl (tree decl, tree parms, bool member_template_p)
   DECL_SOURCE_LOCATION (tmpl) = DECL_SOURCE_LOCATION (decl);
   DECL_MEMBER_TEMPLATE_P (tmpl) = member_template_p;
 
+  if (modules_p ())
+{
+  /* Propagate module information from the decl.  */
+  DECL_MODULE_EXPORT_P (tmpl) = DECL_MODULE_EXPORT_P (decl);
+  if (DECL_LANG_SPECIFIC (decl))
+	{
+	  DECL_MODULE_PURVIEW_P (tmpl) = DECL_MODULE_PURVIEW_P (decl);
+	  gcc_checking_assert (!DECL_MODULE_IMPORT_P (decl));
+	}
+}
+
   return tmpl;
 }
 
@@ -9994,6 +10008,12 @@ lookup_template_class_1 (tree d1, tree arglist, tree in_decl, tree context,
 	= DECL_SOURCE_LOCATION (TYPE_STUB_DECL (template_type));
 	}
 
+  set_instantiating_module (type_decl);
+  /* Although GEN_TMPL is the TEMPLATE_DECL, it has the same value
+	 of export flag.  We want to propagate this

[pushed] c++: Fix defaulted <=> fallback to < and == [PR96299]

2020-12-08 Thread Jason Merrill via Gcc-patches
I thought I had implemented P1186R3, but apparently I didn't read it closely
enough to understand the point of the paper, namely that for a defaulted
operator<=>, if a member type doesn't have a viable operator<=>, we will use
its operator< and operator== if the defaulted operator has a specific
comparison category as its return type; the compiler can't guess if it
should be strong_ordering or something else, but the user can make that
choice explicit.

The libstdc++ test change was necessary because of the change in
genericize_spaceship from op0 > op1 to op1 < op0; this should be equivalent,
but isn't because of PR88173.
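
The order of comparisons used for the partial_ordering case can be
sketched in plain C++ without <compare>.  This is only an illustration of
the "==, then <, then reversed <" sequence; the enum ordering and the
function synth_three_way are invented names, not anything in the patch.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for the comparison category values.
enum class ordering { less, equivalent, greater, unordered };

// Build a three-way result from only operator== and operator<:
//   a == b ? equivalent : a < b ? less : b < a ? greater : unordered
template <typename T>
ordering
synth_three_way (const T &a, const T &b)
{
  return a == b ? ordering::equivalent
         : a < b ? ordering::less
         : b < a ? ordering::greater
         : ordering::unordered;
}
```

With floating-point operands a NaN on either side falls through all three
tests and yields unordered.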

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/96299
* cp-tree.h (build_new_op): Add overload that omits some parms.
(genericize_spaceship): Add location_t parm.
* constexpr.c (cxx_eval_binary_expression): Pass it.
* cp-gimplify.c (genericize_spaceship): Pass it.
* method.c (genericize_spaceship): Handle class-type arguments.
(build_comparison_op): Fall back to op **,
 tsubst_flags_t);
 extern bool aligned_allocation_fn_p(tree);
@@ -7807,7 +7814,7 @@ extern tree merge_types   (tree, 
tree);
 extern tree strip_array_domain (tree);
 extern tree check_return_expr  (tree, bool *);
 extern tree spaceship_type (tree, tsubst_flags_t = 
tf_warning_or_error);
-extern tree genericize_spaceship   (tree, tree, tree);
+extern tree genericize_spaceship   (location_t, tree, tree, tree);
 extern tree cp_build_binary_op  (const op_location_t &,
 enum tree_code, tree, tree,
 tsubst_flags_t);
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index cb477c848d1..2ef6de83830 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3159,7 +3159,7 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, 
tree t,
  overflow_p);
   else if (code == SPACESHIP_EXPR)
 {
-  r = genericize_spaceship (type, lhs, rhs);
+  r = genericize_spaceship (loc, type, lhs, rhs);
   return cxx_eval_constant_expression (ctx, r, lval, non_constant_p,
   overflow_p);
 }
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 8bbcf017369..4f62398dfb0 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -882,7 +882,7 @@ static tree genericize_spaceship (tree expr)
   tree type = TREE_TYPE (expr);
   tree op0 = TREE_OPERAND (expr, 0);
   tree op1 = TREE_OPERAND (expr, 1);
-  return genericize_spaceship (type, op0, op1);
+  return genericize_spaceship (input_location, type, op0, op1);
 }
 
 /* If EXPR involves an anonymous VLA type, prepend a DECL_EXPR for that type
diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 4de192fac00..da580a868b8 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1063,43 +1063,60 @@ spaceship_type (tree optype, tsubst_flags_t complain)
   return lookup_comparison_category (tag, complain);
 }
 
-/* Turn <=> with type TYPE and operands OP0 and OP1 into GENERIC.  */
+/* Turn <=> with type TYPE and operands OP0 and OP1 into GENERIC.
+   This is also used by build_comparison_op for fallback to op< and op==
+   in a defaulted op<=>.  */
 
 tree
-genericize_spaceship (tree type, tree op0, tree op1)
+genericize_spaceship (location_t loc, tree type, tree op0, tree op1)
 {
   /* ??? maybe optimize based on knowledge of representation? */
   comp_cat_tag tag = cat_tag_for (type);
+
+  if (tag == cc_last && is_auto (type))
+{
+  /* build_comparison_op is checking to see if we want to suggest changing
+the op<=> return type from auto to a specific comparison category; any
+category will do for now.  */
+  tag = cc_strong_ordering;
+  type = lookup_comparison_category (tag, tf_none);
+  if (type == error_mark_node)
+   return error_mark_node;
+}
+
   gcc_checking_assert (tag < cc_last);
 
   tree r;
-  op0 = save_expr (op0);
-  op1 = save_expr (op1);
+  if (SCALAR_TYPE_P (TREE_TYPE (op0)))
+{
+  op0 = save_expr (op0);
+  op1 = save_expr (op1);
+}
 
   tree gt = lookup_comparison_result (tag, type, 1);
 
+  int flags = LOOKUP_NORMAL;
+  tsubst_flags_t complain = tf_none;
+
   if (tag == cc_partial_ordering)
 {
   /* op0 == op1 ? equivalent : op0 < op1 ? less :
-op0 > op1 ? greater : unordered */
+op1 < op0 ? greater : unordered */
   tree uo = lookup_comparison_result (tag, type, 3);
-  tree comp = fold_build2 (GT_EXPR, boolean_type_node, op0, op1);
-  r = fold_build3 (COND_EXPR, type, comp, gt, uo);
+  tree comp = build_new_op (loc, LT_EXPR, flags, op1, op0, complain);
+  r = build_conditional_expr (loc, comp, gt, uo, complain);
 }
   else

[pushed] c++: Distinguish ambiguity from no valid candidate

2020-12-08 Thread Jason Merrill via Gcc-patches
Several recent C++ features are specified to try overload resolution, and if
no viable candidate is found, do something else.  But our error return
doesn't distinguish between that situation and finding multiple viable
candidates that end up being ambiguous.  We're already trying to separately
return the single function we found even if it ends up being ill-formed for
some reason; for ambiguity let's pass back error_mark_node, to be
distinguished from NULL_TREE meaning no viable candidate.  Most callers
won't notice the change, as they only look at this information if the call
succeeds.
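
The three-way distinction can be modelled outside the compiler as
follows.  This is a toy sketch under simplifying assumptions (candidates
carry a single integer rank, lower is better); the types and names are
invented, with ambiguous_marker playing the role of error_mark_node and
a null pointer the role of NULL_TREE.

```cpp
#include <cassert>
#include <vector>

// Hypothetical candidate description for this illustration.
struct candidate
{
  const char *name;
  bool viable;
  int rank;   // lower is better
};

// Sentinel returned when several viable candidates tie for best.
static const candidate ambiguous_marker = { "<ambiguous>", false, 0 };

// Return the unique best viable candidate, &ambiguous_marker if the best
// rank is shared, or nullptr if there is no viable candidate at all.
const candidate *
pick_candidate (const std::vector<candidate> &cands)
{
  const candidate *best = nullptr;
  bool tie = false;
  for (const candidate &c : cands)
    {
      if (!c.viable)
        continue;
      if (!best || c.rank < best->rank)
        {
          best = &c;
          tie = false;
        }
      else if (c.rank == best->rank)
        tie = true;
    }
  if (!best)
    return nullptr;
  return tie ? &ambiguous_marker : best;
}
```

A caller that wants to "try overload resolution and otherwise do
something else" can then fall back only on the null result and still
diagnose the ambiguous one.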

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* call.c (build_new_op_1): Set *overload for ambiguity.
(build_new_method_call_1): Likewise.
---
 gcc/cp/call.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index f1e0bcb796b..221e3de0c70 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -6357,6 +6357,8 @@ build_new_op_1 (const op_location_t &loc, enum tree_code code, int flags,
  print_z_candidates (loc, candidates);
}
  result = error_mark_node;
+ if (overload)
+   *overload = error_mark_node;
}
   else if (TREE_CODE (cand->fn) == FUNCTION_DECL)
{
@@ -10438,6 +10440,8 @@ build_new_method_call_1 (tree instance, tree fns, 
vec **args,
free (pretty_name);
}
  call = error_mark_node;
+ if (fn_p)
+   *fn_p = error_mark_node;
}
   else
{

base-commit: f7251a2c103bc48775cb9726a4bebeaebde96684
-- 
2.27.0



Re: How to traverse all the local variables that declared in the current routine?

2020-12-08 Thread Qing Zhao via Gcc-patches



> On Dec 8, 2020, at 1:40 AM, Richard Biener  wrote:
> 
> On Mon, Dec 7, 2020 at 5:20 PM Qing Zhao  > wrote:
>> 
>> 
>> 
>> On Dec 7, 2020, at 1:12 AM, Richard Biener  
>> wrote:
>> 
>> On Fri, Dec 4, 2020 at 5:19 PM Qing Zhao  wrote:
>> 
>> 
>> 
>> 
>> On Dec 4, 2020, at 2:50 AM, Richard Biener  
>> wrote:
>> 
>> On Thu, Dec 3, 2020 at 6:33 PM Richard Sandiford
>>  wrote:
>> 
>> 
>> Richard Biener via Gcc-patches  writes:
>> 
>> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao  wrote:
>> 
>> Another issue is, in order to check whether an auto-variable has 
>> initializer, I plan to add a new bit in “decl_common” as:
>> /* In a VAR_DECL, this is DECL_IS_INITIALIZED.  */
>> unsigned decl_is_initialized :1;
>> 
>> /* IN VAR_DECL, set when the decl is initialized at the declaration.  */
>> #define DECL_IS_INITIALIZED(NODE) \
>> (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized)
>> 
>> set this bit when setting DECL_INITIAL for the variables in FE. then keep it
>> even though DECL_INITIAL might be NULLed.
>> 
>> 
>> For locals it would be more reliable to set this flag during gimplification.
>> 
>> Do you have any comment and suggestions?
>> 
>> 
>> As said above - do you want to cover registers as well as locals?  I'd do
>> the actual zeroing during RTL expansion instead since otherwise you
>> have to figure out yourself whether a local is actually used (see 
>> expand_stack_vars)
>> 
>> Note that optimization will already have made use of "uninitialized" state
>> of locals so depending on what the actual goal is here "late" may be too 
>> late.
>> 
>> 
>> Haven't thought about this much, so it might be a daft idea, but would a
>> compromise be to use a const internal function:
>> 
>> X1 = .DEFERRED_INIT (X0, INIT)
>> 
>> where the X0 argument is an uninitialised value and the INIT argument
>> describes the initialisation pattern?  So for a decl we'd have:
>> 
>> X = .DEFERRED_INIT (X, INIT)
>> 
>> and for an SSA name we'd have:
>> 
>> X_2 = .DEFERRED_INIT (X_1(D), INIT)
>> 
>> with all other uses of X_1(D) being replaced by X_2.  The idea is that:
>> 
>> * Having the X0 argument would keep the uninitialised use of the
>> variable around for the later warning passes.
>> 
>> * Using a const function should still allow the UB to be deleted as dead
>> if X1 isn't needed.
>> 
>> * Having a function in the way should stop passes from taking advantage
>> of direct uninitialised uses for optimisation.
>> 
>> This means we won't be able to optimise based on the actual init
>> value at the gimple level, but that seems like a fair trade-off.
>> AIUI this is really a security feature or anti-UB hardening feature
>> (in the sense that users are more likely to see predictable behaviour
>> “in the field” even if the program has UB).
>> 
>> 
>> The question is whether it's in line with people's expectations that
>> explicitly zero-initialized code behaves differently from
>> implicitly zero-initialized code with respect to optimization
>> and secondary side-effects (late diagnostics, latent bugs, etc.).
>> 
>> Introducing a new concept like .DEFERRED_INIT is much more
>> heavy-weight than an explicit zero initializer.
>> 
>> 
>> What exactly do you mean by “heavy-weight”? More difficult to implement, much 
>> more run-time overhead, or both? Or something else?
>> 
>> The major benefit of the approach of “.DEFERRED_INIT”  is to enable us keep 
>> the current -Wuninitialized analysis untouched and also pass
>> the “uninitialized” info from source code level to “pass_expand”.
>> 
>> 
>> Well, "untouched" is a bit oversimplified.  You do need to handle
>> .DEFERRED_INIT as not
>> being an initialization which will definitely get interesting.
>> 
>> 
>> Yes, during uninitialized variable analysis pass, we should specially handle 
>> the defs with “.DEFERRED_INIT”, to treat them as uninitializations.
>> 
>> If we want to keep the current -Wuninitialized analysis untouched, this is a 
>> quite reasonable approach.
>> 
>> However, if it’s not required to keep the current -Wuninitialized analysis 
>> untouched, adding zero-initializer directly during gimplification should
>> be much easier and simpler, and also smaller run-time overhead.
>> 
>> 
>> As for optimization I fear you'll get a load of redundant zero-init
>> actually emitted if you can just rely on RTL DSE/DCE to remove it.
>> 
>> 
>> Runtime overhead for -fauto-init=zero is one important consideration for the 
>> whole feature, we should minimize the runtime overhead for zero
>> Initialization since it will be used in production build.
>> We can do some run-time performance evaluation when we have an 
>> implementation ready.
>> 
>> 
>> Note there will be other passes "confused" by .DEFERRED_INIT.  Note
>> that there's going to be other
>> considerations - namely where to emit the .DEFERRED_INIT - when
>> emitting it during gimplification
>> you can emit it at the start of the block of block-scope variables.
>> When emitting after gimplification
>

Re: [PATCH] Avoid atomic for guard acquire when that is expensive

2020-12-08 Thread Jason Merrill via Gcc-patches

On 12/7/20 11:17 AM, Bernd Edlinger wrote:

On 12/7/20 4:04 PM, Jason Merrill wrote:

On 12/5/20 7:37 AM, Bernd Edlinger wrote:

On 12/2/20 7:57 PM, Jason Merrill wrote:

On 12/1/20 1:28 PM, Bernd Edlinger wrote:

+  tree type = targetm.cxx.guard_mask_bit ()
+  ? TREE_TYPE (guard) : char_type_node;
+
+  if (is_atomic_expensive_p (TYPE_MODE (type)))
+    guard = integer_zero_node;
+  else
+    guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE, type);


It should still work to load a single byte, it just needs to be the 
least-significant byte.


I still don't think you need to load the whole word to check the guard bit.


Of course that is also possible.  But I would not expect an
address offset and a byte access to be cheaper than a word access.


Fair point.


I just saw that get_guard_bits does the same thing:
access the first byte if !targetm.cxx.guard_mask_bit ()
and access the whole word otherwise, which is only
there for ARM EABI.



And this isn't an EABI issue; it looks like the non-EABI code is also broken 
for big-endian targets, both the atomic load and the normal load in 
get_guard_bits.

I think the non-EABI code is always using bit 0 in the first byte,
by using the endian-neutral #define _GLIBCXX_GUARD_BIT __guard_test_bit (0, 1).


Except that set_guard sets the least-significant bit on all targets.


Hmm, as I said, get_guard_bits gets the guard as a word if 
!targetm.cxx.guard_mask_bit (),
and as the first byte otherwise.  Of course it could get the third byte,
if !targetm.cxx.guard_mask_bit () && BYTES_BIG_ENDIAN, but it would be more 
complicated
this way, wouldn't it?


Ah, yes, I was overlooking that set_guard uses get_guard_bits.

The patch is OK.


Only ARM EABI uses bit 0 in byte 3 if big-endian and bit 0 in byte 0 otherwise.
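
The byte-versus-word point can be checked on a host with a small sketch.
This is not the code in the patch; the function names are invented, and
a 32-bit word stands in for the guard variable.  It shows that a
single-byte load sees the least-significant bit only if it reads the byte
that actually holds bit 0, which is byte 0 on little-endian and byte 3 on
big-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Index of the byte that holds bit 0 of a 32-bit word on this host.
int
lsb_byte_index ()
{
  std::uint32_t word = 1;
  unsigned char bytes[sizeof word];
  std::memcpy (bytes, &word, sizeof word);
  for (int i = 0; i < (int) sizeof word; ++i)
    if (bytes[i] == 1)
      return i;
  return -1;
}

// Test the guard bit by loading the whole word.
bool
guard_set_via_word (const std::uint32_t *guard)
{
  return (*guard & 1) != 0;
}

// Test the guard bit by loading only the byte at BYTE_INDEX.
bool
guard_set_via_byte (const std::uint32_t *guard, int byte_index)
{
  unsigned char b;
  std::memcpy (&b, reinterpret_cast<const unsigned char *> (guard)
                   + byte_index, 1);
  return (b & 1) != 0;
}
```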

For all other targets when _GLIBCXX_USE_FUTEX is defined,
__cxa_guard_XXX accesses the value as int* while the memory
is a 64-bit long, so I could imagine that is an aliasing violation.


But nothing that needs to be fixed immediately.


Agreed.


Attached is the corrected patch.

Tested again on arm-none-eabi with arm-sim.
Is it OK for trunk?

Thanks
Bernd.


Jason









[r11-5839 Regression] FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3 on Linux/x86_64

2020-12-08 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1 is the first bad commit
commit 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1
Author: Prathamesh Kulkarni 
Date:   Tue Dec 8 14:30:04 2020 +0530

gimple-isel: Fold x CMP y ? -1 : 0 to x CMP y [PR97872]

caused

FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-5839/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr78102.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr78102.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr78102.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr78102.c --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PATCH] Remove misleading debug line entries

2020-12-08 Thread Bernd Edlinger
On 12/8/20 11:35 AM, Richard Biener wrote:
> 
> + {
> +   /* Remove a nonbind marker when the outer scope of the
> +  inline function is completely removed.  */
> +   if (gimple_debug_nonbind_marker_p (stmt)
> +   && BLOCK_ABSTRACT_ORIGIN (b))
> + {
> +   while (TREE_CODE (b) == BLOCK
> +  && !inlined_function_outer_scope_p (b))
> + b = BLOCK_SUPERCONTEXT (b);
> 
> So given we never remove a inlined_function_outer_scope_p BLOCK from
> the block tree can we assert that we find such a BLOCK?  If we never
> elide those BLOCKs how can it happen that we elide it in the end?
> 
> +   if (TREE_CODE (b) == BLOCK)
> + {
> +   if (TREE_USED (b))
> 

For gimple_debug_inline_entry_p inlined_function_outer_scope_p (b)
is always true, and the code always proceeds to gsi_remove the
gimple_debug_inline_entry_p.

But for gimple_debug_begin_stmt_p all paths are actually taken.

> Iff we ever elide such block then this will still never be
> false since we've otherwise removed the BLOCK from the block

I can assure you TREE_USED (b) can be false here.

> tree already when we arrive here.  Iff we did that, then the
> above search for a inlined_function_outer_scope_p BLOCK
> might find the "wrong" inline instance.
> 

Indeed, that is a good question.

I tried to find out if there are constraints that are
preserved by remove_unused_scope_block_p, that can be
used to prove that the BLOCK_SUPERCONTEXT walking loop
does not find the "wrong" inline instance.

diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 9ea24a1..997ccee 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -525,6 +525,10 @@ remove_unused_scope_block_p (tree scope, bool in_ctor_dtor_block)
*t = BLOCK_SUBBLOCKS (*t);
while (BLOCK_CHAIN (*t))
  {
+   gcc_assert (supercontext == BLOCK_SUPERCONTEXT (BLOCK_SUPERCONTEXT (*t)));
+   if (!TREE_USED (*t))
+ gcc_assert (!inlined_function_outer_scope_p (BLOCK_SUPERCONTEXT (*t)));
+   gcc_assert (!TREE_USED (BLOCK_SUPERCONTEXT (*t)));
BLOCK_SUPERCONTEXT (*t) = supercontext;
t = &BLOCK_CHAIN (*t);
  }


So I think I can prove that the above assertions hold,
at least when this block is not taken:

   else if (!flag_auto_profile && debug_info_level == DINFO_LEVEL_NONE
&& !optinfo_wants_inlining_info_p ())
 {
   /* Even for -g0 don't prune outer scopes from artificial
  functions, otherwise diagnostics using tree_nonartificial_location
  will not be emitted properly.  */
   if (inlined_function_outer_scope_p (scope))
 {
   tree ao = BLOCK_ORIGIN (scope);
   if (ao
   && TREE_CODE (ao) == FUNCTION_DECL
   && DECL_DECLARED_INLINE_P (ao)
   && lookup_attribute ("artificial", DECL_ATTRIBUTES (ao)))
 unused = false;
 }
 }

in that case it should be irrelevant, since we won't have debug_begin_stmt_p
if debug_info_level == DINFO_LEVEL_NONE.

So the above assertions mean that *if* !TREE_USED (b),
which is the precondition for the while loop, then we know that
BLOCK_SUPERCONTEXT (b) *was* not an inlined_function_outer_scope_p,
and it was replaced by BLOCK_SUPERCONTEXT (BLOCK_SUPERCONTEXT (b)),
so the loop skips one step, but the result is still the same
inline block.
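
To make that argument concrete, here is a minimal toy model in C.  It is
only a sketch: struct toy_block and both helpers are hypothetical stand-ins,
not GCC's tree types, and the walk stops at NULL where the real loop stops
at a non-BLOCK node.  It shows that re-parenting a block to its grandparent
does not change which inline scope the outward walk finds, as long as the
skipped parent was not itself an inline outer scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a GCC BLOCK node; not the real tree type.  */
struct toy_block
{
  struct toy_block *supercontext;   /* models BLOCK_SUPERCONTEXT */
  bool inline_outer_scope;          /* models inlined_function_outer_scope_p */
};

/* Walk outwards until an inline outer scope is found, or return NULL
   when the walk leaves the block tree.  */
static struct toy_block *
enclosing_inline_scope (struct toy_block *b)
{
  while (b != NULL && !b->inline_outer_scope)
    b = b->supercontext;
  return b;
}

/* Model the elision of B's parent: B is re-parented to its grandparent.
   Only valid here when the parent is not itself an inline outer scope.  */
static void
elide_parent (struct toy_block *b)
{
  assert (b->supercontext != NULL && !b->supercontext->inline_outer_scope);
  b->supercontext = b->supercontext->supercontext;
}
```

With a chain leaf -> lexical block -> inline scope -> function, the walk
from leaf finds the same inline scope before and after elide_parent (leaf).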

However, what concerns me is that the assertion
if (TREE_USED (*t))
   gcc_assert (!inlined_function_outer_scope_p (BLOCK_SUPERCONTEXT (*t)));
does not hold.  I think that means that it can happen that
a USED block is moved out of the inline block.  And while I have
no test case for it, that raises a big question about the correctness
of the debug info when that happens.

But I think that is a different issue, if it is one at all.


> So I think we only eliminate the inline scopes if all
> subblocks are unused but then if there's a used outer
> inline instance your patch will assign that outer inline
> instances BLOCK to debug stmts formerly belonging to the
> elided inline instance, no?  (we also mess with
> BLOCK_SUPERCONTEXT during BLOCK removal so I'm not sure
> the walking works as expected)
> 

I did not look at that before, and just expected it to behave
reasonably, that is, to just elide *unused* lexical scopes and
*empty* subroutines.  But it seems to move *used* lexical
scopes to the grandfather scope while *removing* the now
*emptied* original inline block, which is odd.

Or maybe I missed something...


> I guess we could, during the remove_unused_scope_block_p
> DFS walk, record and pass down the "current"
> inlined_function_outer_scope_p BLOCK and for BLOCKs
> we elide record that somehow?  (since we elide the block
> we could abuse fragment_origin/chain for this)
> Then in the patched loop do
> 

I would rather not move *used* blocks out of an inline scope,
whe

c++: template and clone fns for modules

2020-12-08 Thread Nathan Sidwell


We need to expose build_cdtor_clones, it fortunately has the desired
API -- gosh, how did that happen? :) The template machinery will need
to cache path-of-instantiation information, so add two more fields to
the tinst_level struct.  I also had to adjust the
match_mergeable_specialization API since adding it, so I am including
that change too.

gcc/cp/
* cp-tree.h (struct tinst_level): Add path & visible fields.
(build_cdtor_clones): Declare.
(match_mergeable_specialization): Use a spec_entry, add insert parm.

* class.c (build_cdtor_clones): Externalize.
* pt.c (push_tinst_level_loc): Clear new fields.
(match_mergeable_specialization): Adjust API.


--
Nathan Sidwell
diff --git i/gcc/cp/class.c w/gcc/cp/class.c
index ec47b0698ab..2ab123d6ccf 100644
--- i/gcc/cp/class.c
+++ w/gcc/cp/class.c
@@ -4920,7 +4920,7 @@ build_clone (tree fn, tree name, bool need_vtt_parm_p,
 /* Build the clones of FN, return the number of clones built.  These
will be inserted onto DECL_CHAIN of FN.  */
 
-static void
+void
 build_cdtor_clones (tree fn, bool needs_vtt_p, bool base_omits_inherited_p,
 		bool update_methods)
 {
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index f11cf87f190..66ad114567d 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -6196,6 +6196,13 @@ struct GTY((chain_next ("%h.next"))) tinst_level {
  arguments.  */
   tree tldcl, targs;
 
+  /* For modules we need to know (a) the modules on the path of
+ instantiation and (b) the transitive imports along that path.
+ Note that these two bitmaps may be inherited from NEXT, if this
+ decl is in the same module as NEXT (or has no new information).  */
+  bitmap path;
+  bitmap visible;
+
  private:
   /* Return TRUE iff the original node is a split list.  */
   bool split_list_p () const { return targs; }
@@ -6497,6 +6504,7 @@ extern void check_abi_tags			(tree);
 extern tree missing_abi_tags			(tree);
 extern void fixup_type_variants			(tree);
 extern void fixup_attribute_variants		(tree);
+extern void build_cdtor_clones 			(tree, bool, bool, bool);
 extern void clone_cdtor(tree, bool);
 extern tree copy_operator_fn			(tree, tree_code code);
 extern void adjust_clone_args			(tree);
@@ -7189,8 +7197,8 @@ extern void walk_specializations		(bool,
 		 void (*)(bool, spec_entry *,
 			  void *),
 		 void *);
-extern tree match_mergeable_specialization	(bool is_decl, tree tmpl,
-		 tree args, tree spec);
+extern tree match_mergeable_specialization	(bool is_decl, spec_entry *,
+		 bool insert = true);
 extern unsigned get_mergeable_specialization_flags (tree tmpl, tree spec);
 extern void add_mergeable_specialization(tree tmpl, tree args,
 		 tree spec, unsigned);
diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index 2d3ab92dfd1..56d7b560229 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -1704,10 +1704,11 @@ register_specialization (tree spec, tree tmpl, tree args, bool is_friend,
   return spec;
 }
 
-/* Returns true iff two spec_entry nodes are equivalent.  */
-
+/* Restricts tree and type comparisons.  */
 int comparing_specializations;
 
+/* Returns true iff two spec_entry nodes are equivalent.  */
+
 bool
 spec_hasher::equal (spec_entry *e1, spec_entry *e2)
 {
@@ -10909,6 +10910,7 @@ push_tinst_level_loc (tree tldcl, tree targs, location_t loc)
   new_level->errors = errorcount + sorrycount;
   new_level->next = NULL;
   new_level->refcount = 0;
+  new_level->path = new_level->visible = nullptr;
   set_refcount_ptr (new_level->next, current_tinst_level);
   set_refcount_ptr (current_tinst_level, new_level);
 
@@ -29668,29 +29670,27 @@ walk_specializations (bool decls_p,
 fn (decls_p, *iter, data);
 }
 
-/* Lookup the specialization of TMPL, ARGS in the decl or type
-   specialization table.  Return what's there, or if SPEC is non-null,
-   add it and return NULL.  */
+/* Lookup the specialization of *ELT, in the decl or type
+   specialization table.  Return the SPEC that's already there (NULL if
+   nothing).  If INSERT is true, and there was nothing, add the new
+   spec.  */
 
 tree
-match_mergeable_specialization (bool decl_p, tree tmpl, tree args, tree spec)
+match_mergeable_specialization (bool decl_p, spec_entry *elt, bool insert)
 {
-  spec_entry elt = {tmpl, args, spec};
   hash_table *specializations
 = decl_p ? decl_specializations : type_specializations;
-  hashval_t hash = spec_hasher::hash (&elt);
+  hashval_t hash = spec_hasher::hash (elt);
   spec_entry **slot
-= specializations->find_slot_with_hash (&elt, hash,
-	spec ? INSERT : NO_INSERT);
-  spec_entry *entry = slot ? *slot: NULL;
-  
-  if (entry)
-return entry->spec;
+= specializations->find_slot_with_hash (elt, hash,
+	insert ? INSERT : NO_INSERT);
+  if (slot && *slot)
+return (*slot)->spec;
 
-  if (spec)
+  if (insert)
 {
-  entry = ggc_alloc ();
-  *entry = elt;
+  auto entry = ggc_alloc ();
+  *entry = *elt;
   *slo

Re: [PATCH 0/6] Add missing calls to `onlyjump_p'

2020-12-08 Thread Maciej W. Rozycki
On Thu, 3 Dec 2020, Jeff Law wrote:

> >  Note that I have included unrelated though contextually connected 6/6 as
> > an RFC to verify whether this potential anomaly I have spotted has been
> > intentional.  I'll be happy to drop it if that is the case.  The remaining
> > changes are I believe actual bug fixes.
> I doubt it's intentional.  I'd tend to think this specific patch in the
> series should wait until gcc-12 out of an abundance of caution.

 Makes sense to me.  Since we have a working instance of patchwork and I 
actively use it for patch management (for my own patches anyway, as I can't 
update the status of other people's patches) all I need to do is get back 
to it once we reopen for GCC 12 and see what's been outstanding there.

 Thanks for your review.

  Maciej


Raw tree accessors

2020-12-08 Thread Nathan Sidwell

Unchanged from the original posting

Here are the couple of raw accessors I make use of in the module streaming.

gcc/
* tree.h (DECL_ALIGN_RAW): New.
(DECL_ALIGN): Use it.
(DECL_WARN_IF_NOT_ALIGN_RAW): New.
(DECL_WARN_IF_NOT_ALIGN): Use it.
(SET_DECL_WARN_IF_NOT_ALIGN): Likewise.

pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/tree.h w/gcc/tree.h
index 7faa49d42ba..b44039f61ff 100644
--- i/gcc/tree.h
+++ w/gcc/tree.h
@@ -2529,25 +2529,28 @@ extern tree vector_element_bits_tree (const_tree);
 #define DECL_SIZE(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.size)
 /* Likewise for the size in bytes.  */
 #define DECL_SIZE_UNIT(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.size_unit)
+#define DECL_ALIGN_RAW(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.align)
 /* Returns the alignment required for the datum, in bits.  It must
be a power of two, but an "alignment" of zero is supported
(e.g. as "uninitialized" sentinel).  */
-#define DECL_ALIGN(NODE) \
-(DECL_COMMON_CHECK (NODE)->decl_common.align \
- ? ((unsigned)1) << ((NODE)->decl_common.align - 1) : 0)
+#define DECL_ALIGN(NODE)	\
+  (DECL_ALIGN_RAW (NODE)	\
+   ? ((unsigned)1) << (DECL_ALIGN_RAW (NODE) - 1) : 0)
 /* Specify that DECL_ALIGN(NODE) is X.  */
 #define SET_DECL_ALIGN(NODE, X) \
-(DECL_COMMON_CHECK (NODE)->decl_common.align = ffs_hwi (X))
+  (DECL_ALIGN_RAW (NODE) = ffs_hwi (X))
 
 /* The minimum alignment necessary for the datum, in bits, without
warning.  */
-#define DECL_WARN_IF_NOT_ALIGN(NODE) \
-(DECL_COMMON_CHECK (NODE)->decl_common.warn_if_not_align \
- ? ((unsigned)1) << ((NODE)->decl_common.warn_if_not_align - 1) : 0)
+#define DECL_WARN_IF_NOT_ALIGN_RAW(NODE)			\
+  (DECL_COMMON_CHECK (NODE)->decl_common.warn_if_not_align)
+#define DECL_WARN_IF_NOT_ALIGN(NODE)	\
+  (DECL_WARN_IF_NOT_ALIGN_RAW (NODE)	\
+   ? ((unsigned)1) << (DECL_WARN_IF_NOT_ALIGN_RAW (NODE) - 1) : 0)
 
 /* Specify that DECL_WARN_IF_NOT_ALIGN(NODE) is X.  */
-#define SET_DECL_WARN_IF_NOT_ALIGN(NODE, X) \
-(DECL_COMMON_CHECK (NODE)->decl_common.warn_if_not_align = ffs_hwi (X))
+#define SET_DECL_WARN_IF_NOT_ALIGN(NODE, X)		\
+  (DECL_WARN_IF_NOT_ALIGN_RAW (NODE) = ffs_hwi (X))
 
 /* The alignment of NODE, in bytes.  */
 #define DECL_ALIGN_UNIT(NODE) (DECL_ALIGN (NODE) / BITS_PER_UNIT)
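
As a reading aid for the macros above: the raw field stores an alignment as
log2 plus one, with zero meaning "unset".  The following is only a sketch in
plain C; toy_ffs, encode_align and decode_align are hypothetical names, with
toy_ffs assumed to behave like ffs_hwi for non-negative inputs.

```c
#include <assert.h>

/* Hypothetical helper modelling ffs_hwi for non-negative values: the
   1-based index of the least significant set bit, or 0 when X is 0.  */
static unsigned
toy_ffs (unsigned long x)
{
  unsigned n = 0;
  while (x != 0)
    {
      n++;
      if (x & 1)
        return n;
      x >>= 1;
    }
  return 0;
}

/* Encode an alignment (a power of two, or 0 for "unset") as stored in the
   raw field, in the style of SET_DECL_ALIGN.  */
static unsigned
encode_align (unsigned align)
{
  assert ((align & (align - 1)) == 0);   /* zero or a power of two */
  return toy_ffs (align);
}

/* Decode the raw field back to an alignment, in the style of DECL_ALIGN.  */
static unsigned
decode_align (unsigned raw)
{
  return raw ? 1u << (raw - 1) : 0;
}
```

Streaming the raw value avoids a decode and re-encode round trip, which is
presumably why the module code wants direct access to it.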


Re: [PATCH 0/6] Add missing calls to `onlyjump_p'

2020-12-08 Thread Maciej W. Rozycki
On Thu, 3 Dec 2020, Maciej W. Rozycki wrote:

>  These changes have been successfully bootstrapped and regression-tested 
> with the `powerpc64le-linux-gnu' and `x86_64-linux-gnu' native systems; 
> verification with the `vax-netbsdelf' target using `powerpc64le-linux-gnu' 
> host has been underway.
> 
>  I meant to do size checks across the test suites with the native builds, 
> but I forgot that the test framework deletes built test cases after use by 
> default.  I have restarted verification now with a modified configuration 
> and will have results sometime tomorrow.

 So for the record, `size' has reported no text changes whatsoever with 
the `x86_64-linux-gnu' compilation or the testsuite, which makes me fairly 
confident no code change has resulted.  With `powerpc64le-linux-gnu' there 
were a couple of text size changes across libgo objects, namely these:

powerpc64le-linux-gnu/libgo/cmd/go/internal/modfetch.gox
powerpc64le-linux-gnu/libgo/cmd/go/internal/modfetch.o
powerpc64le-linux-gnu/libgo/cmd/go/internal/modload.gox
powerpc64le-linux-gnu/libgo/cmd/go/internal/modload.o
powerpc64le-linux-gnu/libgo/html/template.gox
powerpc64le-linux-gnu/libgo/html/template.o
powerpc64le-linux-gnu/libgo/net/.libs/http.o

however upon a closer inspection all the differences turned out to be in 
`.go_export' sections, whose flags are "ALLOC, EXCLUDE".  Oddly they are 
considered text due to an obscure change to `size' from years ago:

Mon Jan 20 14:24:04 1997  Ian Lance Taylor  

* size.c (berkeley_sum): Rewrite.  Skip sections which are not
SEC_ALLOC.  Count SEC_READONLY sections as text.

Switching to trunk binutils and `--format=gnu', added last year, made the 
phenomenon go away.  Sadly among all the wondering around the submission 
here:  nobody 
thought of actually asking Ian what the motivation for that old change of 
his might have been.

 Ian, it's been a while and the mailing lists carried much less actual 
discussion about changes made, but do you remember by any chance?

  Maciej


Go patch committed: Use correct location for iota errors

2020-12-08 Thread Ian Lance Taylor via Gcc-patches
This patch to the Go frontend uses the correct location for errors
involving iota in a constant expression.  We also check for a valid
array length when reducing len or cap of an array to a constant value.
This is for https://golang.org/issue/8183.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
f1b6e17b3f753980527721aa8e949d2481b2560b
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index d4c8e30d1b4..619f1c001f0 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-3363fc239f642d3c3fb9a138d2833985d85dc083
+f4069d94a25893afc9f2fcf641359366f3ede017
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 6d484d9a339..79ed44510a9 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -59,6 +59,67 @@ Expression::traverse_subexpressions(Traverse* traverse)
   return this->do_traverse(traverse);
 }
 
+// A traversal used to set the location of subexpressions.
+
+class Set_location : public Traverse
+{
+ public:
+  Set_location(Location loc)
+: Traverse(traverse_expressions),
+  loc_(loc)
+  { }
+
+  int
+  expression(Expression** pexpr);
+
+ private:
+  Location loc_;
+};
+
+// Set the location of an expression.
+
+int
+Set_location::expression(Expression** pexpr)
+{
+  // Some expressions are shared or don't have an independent
+  // location, so we shouldn't change their location.  This is the set
+  // of expressions for which do_copy is just "return this" or
+  // otherwise does not pass down the location.
+  switch ((*pexpr)->classification())
+{
+case Expression::EXPRESSION_ERROR:
+case Expression::EXPRESSION_VAR_REFERENCE:
+case Expression::EXPRESSION_ENCLOSED_VAR_REFERENCE:
+case Expression::EXPRESSION_STRING:
+case Expression::EXPRESSION_FUNC_DESCRIPTOR:
+case Expression::EXPRESSION_TYPE:
+case Expression::EXPRESSION_BOOLEAN:
+case Expression::EXPRESSION_CONST_REFERENCE:
+case Expression::EXPRESSION_NIL:
+case Expression::EXPRESSION_TYPE_DESCRIPTOR:
+case Expression::EXPRESSION_GC_SYMBOL:
+case Expression::EXPRESSION_PTRMASK_SYMBOL:
+case Expression::EXPRESSION_TYPE_INFO:
+case Expression::EXPRESSION_STRUCT_FIELD_OFFSET:
+  return TRAVERSE_CONTINUE;
+default:
+  break;
+}
+
+  (*pexpr)->location_ = this->loc_;
+  return TRAVERSE_CONTINUE;
+}
+
+// Set the location of an expression and its subexpressions.
+
+void
+Expression::set_location(Location loc)
+{
+  this->location_ = loc;
+  Set_location sl(loc);
+  this->traverse_subexpressions(&sl);
+}
+
 // Default implementation for do_traverse for child classes.
 
 int
@@ -9389,6 +9450,8 @@ Builtin_call_expression::do_is_constant() const
if (arg == NULL)
  return false;
Type* arg_type = arg->type();
+   if (arg_type->is_error())
+ return true;
 
if (arg_type->points_to() != NULL
&& arg_type->points_to()->array_type() != NULL
@@ -9460,6 +9523,8 @@ Builtin_call_expression::do_numeric_constant_value(Numeric_constant* nc) const
   if (arg == NULL)
return false;
   Type* arg_type = arg->type();
+  if (arg_type->is_error())
+   return false;
 
   if (this->code_ == BUILTIN_LEN && arg_type->is_string_type())
{
@@ -9482,17 +9547,25 @@ Builtin_call_expression::do_numeric_constant_value(Numeric_constant* nc) const
{
  if (this->seen_)
return false;
- Expression* e = arg_type->array_type()->length();
- this->seen_ = true;
- bool r = e->numeric_constant_value(nc);
- this->seen_ = false;
- if (r)
+
+ // We may be replacing this expression with a constant
+ // during lowering, so verify the type to report any errors.
+ // It's OK to verify an array type more than once.
+ arg_type->verify();
+ if (!arg_type->is_error())
{
- if (!nc->set_type(Type::lookup_integer_type("int"), false,
-   this->location()))
-   r = false;
+ Expression* e = arg_type->array_type()->length();
+ this->seen_ = true;
+ bool r = e->numeric_constant_value(nc);
+ this->seen_ = false;
+ if (r)
+   {
+ if (!nc->set_type(Type::lookup_integer_type("int"), false,
+   this->location()))
+   r = false;
+   }
+ return r;
}
- return r;
}
 }
   else if (this->code_ == BUILTIN_SIZEOF
diff --git a/gcc/go/gofrontend/expressions.h b/gcc/go/gofrontend/expressions.h
index 259eeb6027e..712f6870211 100644
--- a/gcc/go/gofrontend/expressions.h
+++ b/gcc/go/gofrontend/expressions.h
@@ -549,6 +549,16 @@ class Expression
   locat

libgcc patch committed: Block signals when releasing split-stack memory

2020-12-08 Thread Ian Lance Taylor via Gcc-patches
This patch to libgcc blocks signals when releasing split-stack memory
due to a thread exiting.  Without this, if a signal arrives, the
signal handler may try to split the stack itself, which won't work as
the data structures won't be in a stable state.  We just leave signals
blocked while completing the exit; this should do no harm, and
prevents a signal handler from jumping in and allocating new
split-stack structures which will then never be freed.  I will shortly
check in a test for this case, as part of updating libgo to the Go
1.15.6 release.  Bootstrapped this patch and ran Go and split-stack
tests on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

* generic-morestack-thread.c (free_segments): Block signals during
thread exit.
f41dd93ade24f22f8cd1863129ab20c821000134
diff --git a/libgcc/generic-morestack-thread.c b/libgcc/generic-morestack-thread.c
index 83a65501272..fd391bb2e1f 100644
--- a/libgcc/generic-morestack-thread.c
+++ b/libgcc/generic-morestack-thread.c
@@ -38,6 +38,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #ifndef inhibit_libc
 
 #include 
+#include <signal.h>
 #include 
 
 #include "generic-morestack.h"
@@ -54,6 +55,9 @@ extern int pthread_key_create (pthread_key_t *, void (*) (void *))
 extern int pthread_setspecific (pthread_key_t, const void *)
   __attribute__ ((weak));
 
+extern int pthread_sigmask (int, const sigset_t *, sigset_t *)
+  __attribute__ ((weak));
+
 /* The key for the list of stack segments to free when the thread
exits.  This is created by pthread_key_create.  */
 
@@ -70,6 +74,16 @@ static pthread_once_t create_key_once = PTHREAD_ONCE_INIT;
 static void
 free_segments (void* arg)
 {
+  /* We must block signals in case the signal handler tries to split
+ the stack.  We leave them blocked while the thread exits.  */
+  if (pthread_sigmask)
+{
+  sigset_t mask;
+
+  sigfillset (&mask);
+  pthread_sigmask (SIG_BLOCK, &mask, NULL);
+}
+
   __morestack_release_segments ((struct stack_segment **) arg, 1);
 }
 


Re: [Patch] OpenMP: C/C++ parse 'omp allocate'

2020-12-08 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 23, 2020 at 03:50:33PM +0100, Tobias Burnus wrote:
> Given that (at least for C/C++) there is some initial support for
> OpenMP 5.0's allocators, it is likely that users will try it.

Sadly, at least the current implementation doesn't offer many benefits;
I meant to add e.g. HBM support through dlopening of the memkind library,
but I haven't found a box with hw I could test it on.
Something that could be done is memory pinning; we can use mlock for
that at least on Linux, the question is how to handle small allocations.
For larger allocations (say 2 or more pages) we could just round to a number
of pages with page alignment, mlock that part, and munlock on freeing.
Also, we should do something smarter for the NVPTX and AMDGCN offloading
targets for the allocators, perhaps handle omp_alloc etc. with some constant
allocator arguments directly through PTX etc. directives.
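
The pinning idea for larger allocations could look roughly like the sketch
below.  This is not libgomp code: round_up_to_pages, pinned_alloc and
pinned_free are hypothetical names, and the sketch assumes a POSIX system
with posix_memalign, mlock and munlock.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round SIZE up to a whole number of pages of PAGE bytes (a power of two).  */
static size_t
round_up_to_pages (size_t size, size_t page)
{
  return (size + page - 1) & ~(page - 1);
}

/* Hypothetical helper: allocate page-aligned memory and try to pin it.
   Returns NULL on failure; *LOCKED_SIZE receives the pinned length.  */
static void *
pinned_alloc (size_t size, size_t *locked_size)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  size_t len = round_up_to_pages (size, page);
  void *p = NULL;

  if (posix_memalign (&p, page, len) != 0)
    return NULL;
  if (mlock (p, len) != 0)   /* may fail, e.g. due to RLIMIT_MEMLOCK */
    {
      free (p);
      return NULL;
    }
  *locked_size = len;
  return p;
}

/* Unpin and release memory obtained from pinned_alloc.  */
static void
pinned_free (void *p, size_t locked_size)
{
  munlock (p, locked_size);
  free (p);
}
```

Small allocations remain the open question: locking is page-granular, so
pinning each one separately would pin whole pages shared with other data.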

> Also the release notes state: "the allocator routines of OpenMP 5.0,
> including initial allocate clause support in C/C++."
> 
> The latter does not include the omp allocate directive, still,
> it can be expected that users will try:
> 
>   #pragma omp allocate(...)
> 
> And that will fail at runtime. I think that's undesirable,
> even if - like any unknown directive - -Wunknown-pragmas
> (-Wall) warns about it.
> 
> Thoughts? OK?
> 
> Tobias
> 
> PS: I have not tried to implement restrictions or additions
> like 'allocate(a[5])', which is currently rejected. I also

I think allocate(a[5]) is not valid, allocate can't mention parts of other
variables (array elements, array sections, structure members).

> did not check whether there are differences between the clause
> ([partially] implemented) and the directive (this patch).

I guess your patch is ok, but I should find time to implement at least
the rest of the restrictions; in particular e.g.:

A declarative allocate directive must appear in the same scope as the
declarations of each of its list items and must follow all such declarations.

Check if the current scope is the scope that contains all the vars.

Stick the allocator as an artificial attribute to the decls rather than
throwing it away.

I think we should also implement the 5.1 restriction:
A declared variable may appear as a list item in at most one declarative
allocate directive in a given compilation unit.
because having multiple allocate directives for the same variable is just
insane, and it is unspecified what it would do.

While the patch tests for C that the allocator has the right type, for C++
(for obvious reasons) it isn't checked, so we need the checking there later
from the attributes or so, at least if it is dependent.

For static storage vars, we need to verify the allocator is a constant
expression, and most likely otherwise just ignore, unless we want to use say
PTX etc. directives to allocate stuff in special kinds of memory.

For automatic variables, we likely need to handle it during gimplification,
that is the last time we can reasonably add the destructors easily
(GOMP_free) such that it would be destructed on C++ exceptions, goto out of
scope etc.

> OpenMP: C/C++ parse 'omp allocate'
> 
> gcc/c-family/ChangeLog:
> 
>   * c-pragma.c (omp_pragmas): Add 'allocate'.
>   * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_ALLOCATE.
> 
> gcc/c/ChangeLog:
> 
>   * c-parser.c (c_parser_omp_allocate): New.
>   (c_parser_omp_construct): Call it.
> 
> gcc/cp/ChangeLog:
> 
>   * parser.c (cp_parser_omp_allocate): New.
>   (cp_parser_omp_construct, cp_parser_pragma): Call it.
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/gomp/allocate-5.c: New test.

Ok, thanks.

Jakub



Re: [PATCH] nvptx: Cache stacks block for OpenMP kernel launch

2020-12-08 Thread Alexander Monakov via Gcc-patches


On Tue, 8 Dec 2020, Julian Brown wrote:

> Ping?

This has addressed my concerns, thanks.

Alexander

> On Fri, 13 Nov 2020 20:54:54 +
> Julian Brown  wrote:
> 
> > Hi Alexander,
> > 
> > Thanks for the review! Comments below.
> > 
> > On Tue, 10 Nov 2020 00:32:36 +0300
> > Alexander Monakov  wrote:
> > 
> > > On Mon, 26 Oct 2020, Jakub Jelinek wrote:
> > >   
> > > > On Mon, Oct 26, 2020 at 07:14:48AM -0700, Julian Brown wrote:
> > > > > This patch adds caching for the stack block allocated for
> > > > > offloaded OpenMP kernel launches on NVPTX. This is a performance
> > > > > optimisation -- we observed an average 11% or so performance
> > > > > improvement with this patch across a set of accelerated GPU
> > > > > benchmarks on one machine (results vary according to individual
> > > > > benchmark and with hardware used).
> > > 
> > > In this patch you're folding two changes together: reuse of
> > > allocated stacks and removing one host-device synchronization.  Why
> > > is that? Can you report performance change separately for each
> > > change (and split out the patches)?  
> > 
> > An accident of the development process of the patch, really -- the
> > idea for removing the post-kernel-launch synchronisation came from the
> > OpenACC side, and adapting it to OpenMP meant the stacks had to remain
> > allocated after the return of the GOMP_OFFLOAD_run function.
> > 
> > > > > A given kernel launch will reuse the stack block from the
> > > > > previous launch if it is large enough, else it is freed and
> > > > > reallocated. A slight caveat is that memory will not be freed
> > > > > until the device is closed, so e.g. if code is using highly
> > > > > variable launch geometries and large amounts of GPU RAM, you
> > > > > might run out of resources slightly quicker with this patch.
> > > > > 
> > > > > Another way this patch gains performance is by omitting the
> > > > > synchronisation at the end of an OpenMP offload kernel launch --
> > > > > it's safe for the GPU and CPU to continue executing in parallel
> > > > > at that point, because e.g. copies-back from the device will be
> > > > > synchronised properly with kernel completion anyway.
> > > 
> > > I don't think this explanation is sufficient. My understanding is
> > > that OpenMP forbids the host to proceed asynchronously after the
> > > target construct unless it is a 'target nowait' construct. This may
> > > be observable if there's a printf in the target region for example
> > > (or if it accesses memory via host pointers).
> > > 
> > > So this really needs to be a separate patch with more explanation
> > > why this is okay (if it is okay).  
> > 
> > As long as the offload kernel only touches GPU memory and does not
> > have any CPU-visible side effects (like the printf you mentioned -- I
> > hadn't really considered that, oops!), it's probably OK.
> > 
> > But anyway, the benefit obtained on OpenMP code (the same set of
> > benchmarks run before) of omitting the synchronisation at the end of
> > GOMP_OFFLOAD_run seems minimal. So it's good enough to just do the
> > stacks caching, and miss out the synchronisation removal for now. (It
> > might still be something worth considering later, perhaps, as long as
> > we can show some given kernel doesn't use printf or access memory via
> > host pointers -- I guess the former might be easier than the latter. I
> > have observed the equivalent OpenACC patch provide a significant boost
> > on some benchmarks, so there's probably something that could be gained
> > on the OpenMP side too.)
> > 
> > The benefit with the attached patch -- just stacks caching, no
> > synchronisation removal -- is about 12% on the same set of benchmarks
> > as before. Results are a little noisy on the machine I'm benchmarking
> > on, so this isn't necessarily proof that the synchronisation removal
> > is harmful for performance!
> > 
> > > > > In turn, the last part necessitates a change to the way
> > > > > "(perhaps abort was called)" errors are detected and reported.
> > > > >   
> > > 
> > > As already mentioned using callbacks is problematic. Plus, I'm sure
> > > the way you lock out other threads is a performance loss when
> > > multiple threads have target regions: even though they will not run
> > > concurrently on the GPU, you still want to allow host threads to
> > > submit GPU jobs while the GPU is occupied.
> > > 
> > > I would suggest to have a small pool (up to 3 entries perhaps) of
> > > stacks. Then you can arrange reuse without totally serializing host
> > > threads on target regions.  
> > 
> > I'm really wary of the additional complexity of adding a stack pool,
> > and the memory allocation/freeing code paths in CUDA appear to be so
> > slow that we get a benefit with this patch even when the GPU stream
> > has to wait for the CPU to unlock the stacks block. Also, for large
> > GPU launches, the size of the soft-stacks block isn't really trivial
> > (I've seen something like 50MB on the ha

c++: Named module global initializers

2020-12-08 Thread Nathan Sidwell

C++ 20 modules adds some new rules about when the global initializers
of imported modules run.  They must run no later than before any
initializers in the importer that appear after the import.  To provide
this, each named module emits an idempotent global initializer that
calls the global initializer functions of its imports (these of course
may call further import initializers).  This is the machinery in our
global-init emission to accomplish that, other than the actual
emission of calls, which is in the module file.  The naming of this
global init is a new piece of the ABI.

FWIW, the module's emitter does some optimization to avoid calling a
direct import's initializer when it can determine that the import is also
indirect.
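
The shape of such an idempotent initializer can be sketched in plain C.
This is only an illustration: the function names and counters are
hypothetical, and the real functions are emitted by the compiler under
mangled names that form part of the ABI.

```c
#include <stdbool.h>

/* Counters standing in for the real work of each module's dynamic
   initialization; purely illustrative.  */
static int base_init_runs;
static int mid_init_runs;

/* Hypothetical initializer for a module "base" with no imports.  */
static void
init_module_base (void)
{
  static bool done;
  if (done)
    return;
  done = true;
  base_init_runs++;
}

/* Hypothetical initializer for a module "mid" that imports "base".  */
static void
init_module_mid (void)
{
  static bool done;
  if (done)
    return;
  done = true;
  init_module_base ();   /* imports are initialized first */
  mid_init_runs++;
}

/* Hypothetical initializer for an importer of both "mid" and "base".  */
static void
init_importer (void)
{
  init_module_mid ();
  init_module_base ();   /* already done; the guard makes this a no-op */
}
```

The second call in init_importer is the kind of call the emitter can drop:
"base" is a direct import that is also reached indirectly through "mid".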

gcc/cp/
* decl2.c (start_objects): Refactor and adjust for named module
initializers.
(finish_objects): Likewise.
(generate_ctor_or_dtor_function): Likewise.
* module.cc (module_initializer_kind)
(module_add_import_initializers): Stubs.

pushing to trunk

--
Nathan Sidwell
diff --git i/gcc/cp/decl2.c w/gcc/cp/decl2.c
index 46069cb66a6..e713033a7f4 100644
--- i/gcc/cp/decl2.c
+++ w/gcc/cp/decl2.c
@@ -3636,35 +3636,45 @@ generate_tls_wrapper (tree fn)
 static tree
 start_objects (int method_type, int initp)
 {
-  tree body;
-  tree fndecl;
-  char type[14];
-
   /* Make ctor or dtor function.  METHOD_TYPE may be 'I' or 'D'.  */
+  int module_init = 0;
+
+  if (initp == DEFAULT_INIT_PRIORITY && method_type == 'I')
+module_init = module_initializer_kind ();
 
-  if (initp != DEFAULT_INIT_PRIORITY)
+  tree name = NULL_TREE;
+  if (module_init > 0)
+name = mangle_module_global_init (0);
+  else
 {
-  char joiner;
+  char type[14];
 
+  unsigned len = sprintf (type, "sub_%c", method_type);
+  if (initp != DEFAULT_INIT_PRIORITY)
+	{
+	  char joiner = '_';
 #ifdef JOINER
-  joiner = JOINER;
-#else
-  joiner = '_';
+	  joiner = JOINER;
 #endif
+	  type[len++] = joiner;
+	  sprintf (type + len, "%.5u", initp);
+	}
+  name = get_file_function_name (type);
+}
 
-  sprintf (type, "sub_%c%c%.5u", method_type, joiner, initp);
+  tree fntype =	build_function_type (void_type_node, void_list_node);
+  tree fndecl = build_lang_decl (FUNCTION_DECL, name, fntype);
+  DECL_CONTEXT (fndecl) = FROB_CONTEXT (global_namespace);
+  if (module_init > 0)
+{
+  SET_DECL_ASSEMBLER_NAME (fndecl, name);
+  TREE_PUBLIC (fndecl) = true;
+  determine_visibility (fndecl);
 }
   else
-sprintf (type, "sub_%c", method_type);
-
-  fndecl = build_lang_decl (FUNCTION_DECL,
-			get_file_function_name (type),
-			build_function_type_list (void_type_node,
-		  NULL_TREE));
+TREE_PUBLIC (fndecl) = 0;
   start_preparsed_function (fndecl, /*attrs=*/NULL_TREE, SF_PRE_PARSED);
 
-  TREE_PUBLIC (current_function_decl) = 0;
-
   /* Mark as artificial because it's not explicitly in the user's
  source code.  */
   DECL_ARTIFICIAL (current_function_decl) = 1;
@@ -3678,7 +3688,35 @@ start_objects (int method_type, int initp)
   else
 DECL_GLOBAL_DTOR_P (current_function_decl) = 1;
 
-  body = begin_compound_stmt (BCS_FN_BODY);
+  tree body = begin_compound_stmt (BCS_FN_BODY);
+
+  if (module_init > 0)
+{
+  // 'static bool __in_chrg = false;
+  // if (__inchrg) return;
+  // __inchrg = true
+  tree var = build_lang_decl (VAR_DECL, in_charge_identifier,
+  boolean_type_node);
+  DECL_CONTEXT (var) = fndecl;
+  DECL_ARTIFICIAL (var) = true;
+  TREE_STATIC (var) = true;
+  pushdecl (var);
+  cp_finish_decl (var, NULL_TREE, false, NULL_TREE, 0);
+
+  tree if_stmt = begin_if_stmt ();
+  finish_if_stmt_cond (var, if_stmt);
+  finish_return_stmt (NULL_TREE);
+  finish_then_clause (if_stmt);
+  finish_if_stmt (if_stmt);
+
+  tree assign = build2 (MODIFY_EXPR, boolean_type_node,
+			var, boolean_true_node);
+  TREE_SIDE_EFFECTS (assign) = true;
+  finish_expr_stmt (assign);
+}
+
+  if (module_init)
+module_add_import_initializers ();
 
   return body;
 }
@@ -3689,11 +3727,9 @@ start_objects (int method_type, int initp)
 static void
 finish_objects (int method_type, int initp, tree body)
 {
-  tree fn;
-
   /* Finish up.  */
   finish_compound_stmt (body);
-  fn = finish_function (/*inline_p=*/false);
+  tree fn = finish_function (/*inline_p=*/false);
 
   if (method_type == 'I')
 {
@@ -4228,50 +4264,50 @@ static void
 generate_ctor_or_dtor_function (bool constructor_p, int priority,
 location_t *locus)
 {
-  char function_key;
-  tree fndecl;
-  tree body;
-  size_t i;
-
   input_location = *locus;
-  /* ??? */
-  /* Was: locus->line++; */
 
   /* We use `I' to indicate initialization and `D' to indicate
  destruction.  */
-  function_key = constructor_p ? 'I' : 'D';
+  char function_key = constructor_p ? 'I' : 'D';
 
   /* We emit the function lazily, to avoid generating empty
  global constr
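
A minimal sketch of the run-once guard that the start_objects hunk above emits for a module initializer, written as ordinary source code. The function names and the log are hypothetical stand-ins and are not GCC identifiers; only the guard shape (static flag, early return, set flag, then run the import initializers) follows the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical log used only to observe the order of initialization.
static std::vector<std::string> init_log;

// Initializer of a module with no imports.
static void module_a_init ()
{
  static bool in_charge = false;   // static bool flag, initially false
  if (in_charge)                   // already initialized: nothing to do
    return;
  in_charge = true;
  init_log.push_back ("A");        // this module's own dynamic initializers
}

// Initializer of a module that imports A.
static void module_b_init ()
{
  static bool in_charge = false;
  if (in_charge)
    return;
  in_charge = true;
  module_a_init ();                // import initializers run first
  init_log.push_back ("B");
}

// Initializer of a module that imports both A and B.
static void module_c_init ()
{
  static bool in_charge = false;
  if (in_charge)
    return;
  in_charge = true;
  module_a_init ();
  module_b_init ();                // B calls A again; the guard makes that a no-op
  init_log.push_back ("C");
}
```

Calling module_c_init twice in this sketch runs each module's initialization exactly once, in import order, which is why the function has to be public and idempotent when several importers may call it.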

Re: Nested declare target support

2020-12-08 Thread Jakub Jelinek via Gcc-patches
On Fri, Nov 20, 2020 at 04:41:15PM +, Kwok Cheung Yeung wrote:
> Hello
> 
> > New OpenMP 5.0 features that won't be available in GCC 9, are planned for 
> > GCC 10
> > or later versions as time permits:
> > 
> ...
> > - nested declare target support
> 
> You said in an email two years ago that nested declare target was not
> supported yet. I do not see any patches that claim to implement this since
> then, but when I ran a quick test with a trunk build:
> 
> #pragma omp declare target
>   #pragma omp declare target
> int foo() { return 1; }
>   #pragma omp end declare target
>   int bar() { return 2; }
> #pragma omp end declare target

> It looks like this was written to handle nesting to begin with (since at
> least 2013) by making current_omp_declare_target_attribute (which
> effectively tracks the nesting level) an integer. Is there anything that is
> currently missing for nested declare target support?

We used to reject omp declare target with clauses in between declare target
without clauses, but that restriction has been lifted, so I don't know what
is missing, perhaps just add testcases that aren't covered in the testsuite
already.  I'm a little bit worried about the interaction between declare
target with nohost vs. normal one etc.
But to answer the question, I don't remember anymore why it was on the
unfinished list.

Jakub
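
A minimal sketch of the nesting-counter idea discussed above, assuming that a single integer depth is enough to decide whether a declaration is inside a declare target region. The type and member names are hypothetical; this is not the front-end code, only a model of how an integer like current_omp_declare_target_attribute can track nesting.

```cpp
#include <string>
#include <vector>

// Hypothetical model: depth counts currently open 'declare target' regions.
struct DeclareTargetTracker
{
  int depth = 0;
  std::vector<std::string> device_decls;

  // '#pragma omp declare target' without clauses opens a region.
  void begin () { ++depth; }

  // '#pragma omp end declare target' closes the innermost region.
  // Returns false when there is no region to close.
  bool end ()
  {
    if (depth == 0)
      return false;
    --depth;
    return true;
  }

  // A declaration is marked for the device if any region is open.
  void declare (const std::string &name)
  {
    if (depth > 0)
      device_decls.push_back (name);
  }
};
```

With the quoted test case, both foo (depth 2) and bar (depth 1) would be marked, and a declaration after the outer end pragma would not.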



Re: [PATCH] openmp: Implicit 'declare target' for C++ static initializers

2020-12-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 19, 2020 at 06:07:28PM +, Kwok Cheung Yeung wrote:
> Even without this patch, G++ currently accepts something like

Sorry for the delay.

> int foo() { return 1; }
> int x = foo();
> #pragma omp declare target to(x)
> 
> but will not generate the device-side initializer for x, even though x is
> now present on the device. So this part of the implementation is broken with
> or without the patch.
> 
> Given that my patch doesn't make the current situation any worse, can I
> commit this portion of it to trunk for now, and leave device-side dynamic
> initialization for later?

Ok, but for the patch I have a few nits:

> +/* The C++ version of the get_decl_init langhook returns the static
> +   initializer for a variable declaration if present, otherwise it
> +   tries to find and return the dynamic initializer.  If not present,
> +   it returns NULL.  */
> +
> +static tree*
> +cxx_get_decl_init (tree decl)

The GCC coding style (apart from libstdc++) is type * rather than type*;
this occurs several times in the patch.

> +{
> +  tree node;
> +
> +  if (DECL_INITIAL (decl))
> +return &DECL_INITIAL (decl);
> +
> +  for (node = dynamic_initializers; node; node = TREE_CHAIN (node))
> +if (TREE_VALUE (node) == decl)
> +  return &TREE_PURPOSE (node);

I'm worried that with many dynamic initializers this will be worst-case
quadratic.  Can't you use a hash map instead?  Note, as this is in the
FE, we might need to worry about PCH and GC.
Thus the hash map needs to be indexed by DECL_UIDs rather than pointers,
so perhaps use decl_tree_map?
Also, I'm worried that nothing releases dynamic_initializers (or the
decl_tree_map replacement).  We need it only during the discovery and not
afterwards, so it would be nice if the omp declare target discovery at the
end called another lang hook that would free the decl_tree_map, so that GC
can take it all.
If trees would remain there afterwards, we'd need to worry about destructive
gimplifier too and would need to unshare the dynamic initializers or
something.

I think it would be best to use omp_ in the hook name(s), and:
> --- a/gcc/cp/decl2.c
> +++ b/gcc/cp/decl2.c
> @@ -4940,6 +4940,11 @@ c_parse_final_cleanups (void)
>loop.  */
>vars = prune_vars_needing_no_initialization (&static_aggregates);
>  
> +  /* Copy the contents of VARS into DYNAMIC_INITIALIZERS.  */
> +  for (t = vars; t; t = TREE_CHAIN (t))
> + dynamic_initializers = tree_cons (TREE_PURPOSE (t), TREE_VALUE (t),
> +   dynamic_initializers);

Please don't add anything there if (!flag_openmp).  We don't need to waste memory
when nobody is going to look at it.

Jakub
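
A minimal sketch of the review suggestions above: a lookup keyed by a stable integer id instead of a linear list walk, nothing recorded when OpenMP is disabled, and an explicit release once discovery is done. All names here (Decl, DynamicInitTable, record, lookup, release) are hypothetical, and an int stands in for an initializer tree; GCC's decl_tree_map and its GC/PCH handling are not modelled.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for a declaration: uid plays the role of DECL_UID.
struct Decl
{
  unsigned uid;
  bool has_static_init;
  int static_init;
};

class DynamicInitTable
{
public:
  explicit DynamicInitTable (bool openmp_enabled) : enabled_ (openmp_enabled) {}

  // Record a dynamic initializer, but only if somebody will look at it.
  void record (const Decl &d, int init)
  {
    if (enabled_)
      map_[d.uid] = init;
  }

  // Static initializer if present, else the dynamic one, else nullptr.
  int *lookup (Decl &d)
  {
    if (d.has_static_init)
      return &d.static_init;
    auto it = map_.find (d.uid);
    return it == map_.end () ? nullptr : &it->second;
  }

  // Drop all entries once declare target discovery has finished.
  void release () { std::unordered_map<unsigned, int> ().swap (map_); }

  std::size_t size () const { return map_.size (); }

private:
  bool enabled_;
  std::unordered_map<unsigned, int> map_;
};
```

Each lookup is expected constant time on average, so discovery over N variables avoids the worst-case quadratic behaviour of walking a chain per variable.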



Re: [PATCH] if-to-switch: fix matching of negative conditions

2020-12-08 Thread Richard Biener via Gcc-patches
On December 8, 2020 2:35:35 PM GMT+01:00, "Martin Liška"  wrote:
>We must be careful which edge we follow for conditions
>of a negative form (index != 2).
>
>Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
>Ready to be installed?

Ok. 

Richard. 

>Thanks,
>Martin
>
>gcc/ChangeLog:
>
>   PR tree-optimization/98182
>   * gimple-if-to-switch.cc (pass_if_to_switch::execute): Request
>   chain linkage through false edges only.
>
>gcc/testsuite/ChangeLog:
>
>   PR tree-optimization/98182
>   * gcc.dg/tree-ssa/if-to-switch-10.c: New test.
>   * gcc.dg/tree-ssa/pr98182.c: New test.
>---
>  gcc/gimple-if-to-switch.cc|  6 +++
> .../gcc.dg/tree-ssa/if-to-switch-10.c | 44 +++
>  gcc/testsuite/gcc.dg/tree-ssa/pr98182.c   | 18 
>  3 files changed, 68 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr98182.c
>
>diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc
>index 8e1043ae7c4..311f6f6ac97 100644
>--- a/gcc/gimple-if-to-switch.cc
>+++ b/gcc/gimple-if-to-switch.cc
>@@ -522,6 +522,12 @@ pass_if_to_switch::execute (function *fun)
> if (!info2 || info->m_ranges[0].exp != info2->m_ranges[0].exp)
>   break;
>  
>+/* It is important that the blocks are linked through
>FALSE_EDGE.
>+   For an expression of index != VALUE, true and false edges
>+   are flipped.  */
>+if (info2->m_false_edge != e)
>+  break;
>+
> chain->m_entries.safe_push (info2);
> bitmap_set_bit (seen_bbs, e->src->index);
> info = info2;
>diff --git a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
>b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
>new file mode 100644
>index 000..7b8da1c9f3c
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
>@@ -0,0 +1,44 @@
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -fdump-tree-iftoswitch-optimized" } */
>+
>+int global;
>+int foo ();
>+
>+int main(int argc, char **argv)
>+{
>+  if (argc != 1)
>+{
>+  if (argc != 2)
>+  {
>+if (argc == 3)
>+  {
>+foo ();
>+foo ();
>+  }
>+else if (argc == 4)
>+  {
>+foo ();
>+  }
>+else if (argc == 5)
>+  {
>+global = 2;
>+  }
>+else
>+  global -= 123;
>+  }
>+  else
>+  {
>+global += 1;
>+  }
>+}
>+  else
>+foo ();
>+
>+
>+  global -= 12;
>+  return 0;
>+}
>+
>+/* { dg-final { scan-tree-dump "Canonical GIMPLE case clusters: 1 2 3
>4 5" "iftoswitch" } } */
>+/* { dg-final { scan-tree-dump "Condition chain with \[^\n\r]\* BBs
>transformed into a switch statement." "iftoswitch" } } */
>+
>diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr98182.c
>b/gcc/testsuite/gcc.dg/tree-ssa/pr98182.c
>new file mode 100644
>index 000..29a547e3788
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr98182.c
>@@ -0,0 +1,18 @@
>+/* PR tree-optimization/98182 */
>+/* { dg-do compile } */
>+/* { dg-options "-O1 --param case-values-threshold=1
>-fdump-tree-iftoswitch-optimized" } */
>+
>+int global;
>+int foo ();
>+
>+int main(int argc, char **argv)
>+{
>+  if (argc != 1)
>+__builtin_abort ();
>+  else if (argc != 2)
>+__builtin_abort ();
>+  else
>+return 0;
>+}
>+
>+/* { dg-final { scan-tree-dump-not "Condition chain" "iftoswitch" } }
>*/
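
A minimal model of the edge check added by the patch above, assuming that each condition block is normalized so that its "false" edge is the one taken when the index does not equal the tested value (for index != VALUE the written true and false edges are flipped, as the patch comment says). The struct and function names are hypothetical and block ids are plain integers.

```cpp
// Hypothetical model of one 'index ==/!= value' condition block.
struct CondBlock
{
  int value;
  bool negated;     // true for 'index != value'
  int true_succ;    // block reached when the written condition holds
  int false_succ;   // block reached when the written condition fails
};

// Successor taken when index == value (the case body).
inline int match_succ (const CondBlock &b)
{
  return b.negated ? b.false_succ : b.true_succ;
}

// Successor taken when index != value (where the chain may continue).
inline int mismatch_succ (const CondBlock &b)
{
  return b.negated ? b.true_succ : b.false_succ;
}

// NEXT may join the chain after PREV only through the mismatch edge.
inline bool can_chain (const CondBlock &prev, int next_block)
{
  return mismatch_succ (prev) == next_block;
}
```

In pr98182.c the second test is reached only when argc == 1, that is through the match edge of the first condition, so the two blocks must not be merged into one switch; in if-to-switch-10.c the inner test is reached when argc != 1, so chaining is valid.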



Re: [PATCH, powerpc] testsuite update tests for powerpc power10 target codegen.

2020-12-08 Thread will schmidt via Gcc-patches
On Tue, 2020-12-08 at 20:20 +1030, Alan Modra wrote:
> On Mon, Dec 07, 2020 at 05:49:05PM -0600, will schmidt via Gcc-
> patches wrote:
> > [PATCH, powerpc] testsuite update tests for powerpc power10 target
> > codegen.
> 
> Appears to duplicate work I did earlier,
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html
> 
> Except I omitted fold-vec-store-builtin_vec_xst-longlong.c, due to
> -mdejagnu-cpu=power8 in that test meaning we don't see any power10
> insns.

Ah shoot, I hate to duplicate work..  and I prob even looked over your
patches (a week + or so ago?)

Your previously submitted patch should 'win'.  I'll take a peek back
and make sure I've at least posted a lgtm for your submission. :-)

Thanks
-Will




Re: [PATCH 00/31] VAX: Bring the port up to date (yes, MODE_CC conversion is included)

2020-12-08 Thread Martin Husemann
On Tue, Dec 08, 2020 at 02:38:59PM +, Maciej W. Rozycki wrote:
>  Here's the full list of math functions that the `configure' script in 
> libgfortran reports as missing:
> 
> checking for acosl... no
> checking for acoshf... no
[..]
> Except for the Bessel functions these are a part of ISO C; `long double' 
> versions, some of which appear missing unlike their `float' or `double' 
> counterparts, should probably just alias to the corresponding `double' 
> versions as I doubt we want to get into the H-floating format, largely 
> missing from actual VAX hardware and meant to be emulated by the OS.

Thanks for the list - I'll add the aliases soonish (they are likely already
there for the IEEE versions but missing from the vax code) and check
what remains missing then.

Martin


Re: [PATCH 00/31] VAX: Bring the port up to date (yes, MODE_CC conversion is included)

2020-12-08 Thread Maciej W. Rozycki
On Thu, 26 Nov 2020, Martin Husemann wrote:

> >  The VAX/NetBSD port however does use hardware FP in their libm as far as 
> > I can tell, so I guess it would be reasonable for libgfortran to do so as 
> > well.  I haven't checked how correct their implementation actually is, but 
> > barring evidence otherwise I would assume they did the right thing.  
> 
> It does, but it is not totally correct in all places (due to gcc issues
> some parts have not received good testing, and others clearly are broken,
> eg. when tables are used that have not been adjusted for the different
> limits in VAX float/double formats).

 I have realised that with my VAX/Linux effort, more than 10 years ago, I 
did not encounter such issues, and I did port all the GCC components the 
compiler provided at the time (although the port of libjava could have 
been only partially functional as I didn't properly verify the IEEE<->VAX 
FP conversion stubs I have necessarily implemented), though what I chose was 
4.1.2 rather than the most recent version (to avoid the need to port NPTL 
right away).  I should have tripped over this issue then, but I did not.

 So with the objective of this effort out of the way I have now looked 
into what happened with libgfortran here and realised that the cause of 
the compilation error was an attempt to provide a standard ISO C function 
missing from NetBSD's libc or libm (even though it's declared).  Indeed:

$ grep tgamma usr/include/math.h
double  tgamma(double);
float   tgammaf(float);
long double tgammal(long double);
$ readelf -s usr/lib/libc.so usr/lib/libm.so usr/lib/libc.a usr/lib/libm.a | 
grep tgamma
$ 

So clearly something went wrong there and I think it's that that has to be 
fixed rather than the fallback implementations in libgfortran (which I 
gather have been only provided for legacy systems that do not implement a 
full ISO C environment and are no longer maintained).  I suspect that once 
this function (and any other ones that may be missing) has been supplied 
by the system libraries libgfortran will just work out of the box.

 Here's the full list of math functions that the `configure' script in 
libgfortran reports as missing:

checking for acosl... no
checking for acoshf... no
checking for acoshl... no
checking for asinl... no
checking for asinhf... no
checking for asinhl... no
checking for atan2l... no
checking for atanl... no
checking for atanhl... no
checking for cosl... no
checking for coshl... no
checking for expl... no
checking for fmaf... no
checking for fma... no
checking for fmal... no
checking for frexpf... no
checking for frexpl... no
checking for logl... no
checking for log10l... no
checking for clog10f... no
checking for clog10... no
checking for clog10l... no
checking for nextafterf... no
checking for nextafter... no
checking for nextafterl... no
checking for lroundl... no
checking for llroundf... no
checking for llround... no
checking for llroundl... no
checking for sinl... no
checking for sinhl... no
checking for tanl... no
checking for tanhl... no
checking for erfcl... no
checking for j0f... no
checking for j1f... no
checking for jnf... no
checking for jnl... no
checking for y0f... no
checking for y1f... no
checking for ynf... no
checking for ynl... no
checking for tgamma... no
checking for tgammaf... no
checking for lgammaf... no

Except for the Bessel functions these are a part of ISO C; `long double' 
versions, some of which appear missing unlike their `float' or `double' 
counterparts, should probably just alias to the corresponding `double' 
versions as I doubt we want to get into the H-floating format, largely 
missing from actual VAX hardware and meant to be emulated by the OS.

 Please note that this is with NetBSD 9 rather than 9.1 (which has only 
been recently released and therefore I decided not to get distracted with 
an upgrade) and I don't know if it has been fixed in the latter release.

  Maciej
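
A minimal sketch of the aliasing idea above: a `long double' entry point that simply forwards to the `double' implementation. The function name is hypothetical (a real libm would export the standard name, typically via a symbol alias rather than a wrapper), and the sketch assumes `long double' carries no more precision than `double' on the target, which is the premise of the suggestion; on hosts where it does, the wrapper loses precision.

```cpp
#include <cmath>

// Hypothetical wrapper: compute the long double result via the double routine.
inline long double
acosl_via_double (long double x)
{
  return static_cast<long double> (std::acos (static_cast<double> (x)));
}
```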


Re: c++: Add module includes

2020-12-08 Thread Nathan Sidwell

On 12/8/20 8:01 AM, Nathan Sidwell wrote:
This adds MODULE_VERSION to the makefile, so it's generated from the 
date of the module.cc file in development.  Also adds the include files 
to module.cc


It broke :(


c++: Fix MODULE_VERSION breakage

Adding includes to module.cc triggered the kind of build failure I
wanted to check for.  In this case it was MODULE_VERSION not being
defined, and module.cc's internal #error triggering.  I've relaxed the
check in Make-lang, so we provide MODULE_VERSION when DEVPHASE is not
empty (rather than when it is 'experimental').  AFAICT devphase is
empty for release builds, and the #error will force us to decide
whether modules is sufficiently baked at that point.

gcc/cp
* Make-lang.in (MODULE_VERSION): Override when DEVPHASE not empty.
* module.cc: Comment.


--
Nathan Sidwell
diff --git c/gcc/cp/Make-lang.in w/gcc/cp/Make-lang.in
index d7dc0dec2b8..52116652900 100644
--- c/gcc/cp/Make-lang.in
+++ w/gcc/cp/Make-lang.in
@@ -57,7 +57,8 @@ CFLAGS-cp/g++spec.o += $(DRIVER_DEFINES)
 CFLAGS-cp/module.o += -DHOST_MACHINE=\"$(host)\" \
 	-DTARGET_MACHINE=\"$(target)\"
 
-ifeq ($(DEVPHASE_c),experimental)
+# In non-release builds, use a date-related module version.
+ifneq ($(DEVPHASE_c),)
 # Some date's don't grok 'r', if so, simply use today's
 # date (don't bootstrap at midnight).
 MODULE_VERSION := $(shell date -r $(srcdir)/cp/module.cc '+%y%m%d-%H%M' \
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index 24580c70907..9a5d73af20e 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #define MODULE_MINOR(V) ((V) % 1)
 #define EXPERIMENT(A,B) (IS_EXPERIMENTAL (MODULE_VERSION) ? (A) : (B))
 #ifndef MODULE_VERSION
+// Be sure you're ready!  Remove #error this before release!
 #error "Shtopp! What are you doing? This is not ready yet."
 #include "bversion.h"
 #define MODULE_VERSION (BUILDING_GCC_MAJOR * 1U + BUILDING_GCC_MINOR)


c++: Mangling for modules

2020-12-08 Thread Nathan Sidwell


This is the mangling changes for modules.  These were developed in
collaboration with clang, which also implements the same ABI (or
plans to, I do not think the global init is in clang).  The global
init mangling is captured in
https://github.com/itanium-cxx-abi/cxx-abi/issues/99

gcc/cp/
* cp-tree.h (mangle_module_substitution, mangle_identifier)
(mangle_module_global_init): Declare.
* mangle.c (struct globals): Add mod field.
 (mangle_module_substitution, mangle_identifier)
(mangle_module_global_init): Define.
(write_module, maybe_write_module): New.
(write_name): Call it.
(start_mangling): Clear mod field.
(finish_mangling_internal): Adjust.
* module.cc (mangle_module, mangle_module_fini)
(get_originating_module): Stubs.

pushing to trunk
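
A minimal sketch of the back-reference encoding that mangle_module_substitution in the patch below writes, returning a string instead of emitting into the mangling obstack. The function name module_substitution is hypothetical; the branch structure follows the patch, and the number is assumed to be written in decimal.

```cpp
#include <cassert>
#include <string>

// Short backref for 0..9 is '_' + digit; longer ones are 'W' + (v - 10) + '_'.
inline std::string
module_substitution (int v)
{
  assert (v >= 0);
  if (v < 10)
    return "_" + std::string (1, static_cast<char> ('0' + v));
  return "W" + std::to_string (v - 10) + "_";
}
```

Under these assumptions substitution 3 is encoded as "_3" and substitution 12 as "W2_".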

--
Nathan Sidwell
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index aa2b0f782fa..f11cf87f190 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -7916,6 +7916,9 @@ extern tree mangle_template_parm_object		(tree);
 extern char *get_mangled_vtable_map_var_name(tree);
 extern bool mangle_return_type_p		(tree);
 extern tree mangle_decomp			(tree, vec &);
+extern void mangle_module_substitution		(int);
+extern void mangle_identifier			(char, tree);
+extern tree mangle_module_global_init		(int);
 
 /* in dump.c */
 extern bool cp_dump_tree			(void *, tree);
diff --git i/gcc/cp/mangle.c w/gcc/cp/mangle.c
index 5548e51d39d..c1d9c737a16 100644
--- i/gcc/cp/mangle.c
+++ w/gcc/cp/mangle.c
@@ -117,6 +117,9 @@ struct GTY(()) globals {
 
   /* True if the mangling will be different in C++17 mode.  */
   bool need_cxx17_warning;
+
+  /* True if we mangled a module name.  */
+  bool mod;
 };
 
 static GTY (()) globals G;
@@ -832,6 +835,62 @@ write_encoding (const tree decl)
 }
 }
 
+/* Interface to substitution and identifer mangling, used by the
+   module name mangler.  */
+
+void
+mangle_module_substitution (int v)
+{
+  if (v < 10)
+{
+  write_char ('_');
+  write_char ('0' + v);
+}
+  else
+{
+  write_char ('W');
+  write_unsigned_number (v - 10);
+  write_char ('_');
+}
+}
+
+void
+mangle_identifier (char c, tree id)
+{
+  if (c)
+write_char (c);
+  write_source_name (id);
+}
+
+/* If the outermost non-namespace context (including DECL itself) is
+   a module-linkage decl, mangle the module information.  For module
+   global initializers we need to include the partition part.
+
+::= W + E
+:: 
+   || _   ;; short backref
+	   || W  _  ;; long backref
+   || P  ;; partition introducer
+*/
+
+static void
+write_module (int m, bool include_partition)
+{
+  G.mod = true;
+
+  write_char ('W');
+  mangle_module (m, include_partition);
+  write_char ('E');
+}
+
+static void
+maybe_write_module (tree decl)
+{
+  int m = get_originating_module (decl, true);
+  if (m >= 0)
+write_module (m, false);
+}
+
 /* Lambdas can have a bit more context for mangling, specifically VAR_DECL
or PARM_DECL context, which doesn't belong in DECL_CONTEXT.  */
 
@@ -894,6 +953,9 @@ write_name (tree decl, const int ignore_local_scope)
   decl = TYPE_NAME (TYPE_MAIN_VARIANT (TREE_TYPE (decl)));
 }
 
+  if (modules_p ())
+maybe_write_module (decl);
+
   context = decl_mangling_context (decl);
 
   gcc_assert (context != NULL_TREE);
@@ -3825,14 +3887,13 @@ start_mangling (const tree entity)
   G.entity = entity;
   G.need_abi_warning = false;
   G.need_cxx17_warning = false;
+  G.mod = false;
   obstack_free (&name_obstack, name_base);
   mangle_obstack = &name_obstack;
   name_base = obstack_alloc (&name_obstack, 0);
 }
 
-/* Done with mangling. If WARN is true, and the name of G.entity will
-   be mangled differently in a future version of the ABI, issue a
-   warning.  */
+/* Done with mangling.  Release the data.  */
 
 static void
 finish_mangling_internal (void)
@@ -3840,6 +3901,9 @@ finish_mangling_internal (void)
   /* Clear all the substitutions.  */
   vec_safe_truncate (G.substitutions, 0);
 
+  if (G.mod)
+mangle_module_fini ();
+
   /* Null-terminate the string.  */
   write_char ('\0');
 }
@@ -3884,6 +3948,20 @@ init_mangle (void)
   subst_identifiers[SUBID_BASIC_IOSTREAM] = get_identifier ("basic_iostream");
 }
 
+/* Generate a mangling for MODULE's global initializer fn.  */
+
+tree
+mangle_module_global_init (int module)
+{
+  start_mangling (NULL_TREE);
+
+  write_string ("_ZGI");
+  write_module (module, true);
+  write_char ('v');
+
+  return finish_mangling_get_identifier ();
+}
+
 /* Generate the mangled name of DECL.  */
 
 static tree
diff --git i/gcc/cp/module.cc w/gcc/cp/module.cc
index 91a16815811..24580c70907 100644
--- i/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -70,6 +70,23 @@ get_module (tree, module_state *, bool)
   return nullptr;
 }
 
+
+void
+mangle_module (int, bool)
+{
+}
+
+void
+mangle_module_fini ()
+{
+}
+
+int
+get_originating_module (tree, bool)
+{
+  

[PATCH][GCC] arm: Add support for Cortex-A78C

2020-12-08 Thread Przemyslaw Wirkus via Gcc-patches
This patch adds support for -mcpu=cortex-a78c command line option.
For more information about this processor, see [0]:

[0] https://developer.arm.com/ip-products/processors/cortex-a/cortex-a78c

OK for master?

gcc/ChangeLog:

* config/arm/arm-cpus.in: Add Cortex-A78C core.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Update docs.



rb13728.patch
Description: rb13728.patch


[PATCH][GCC] aarch64: Add support for Cortex-A78C

2020-12-08 Thread Przemyslaw Wirkus via Gcc-patches
This patch adds support for -mcpu=cortex-a78c command line option.
For more information about this processor, see [0]:

[0] https://developer.arm.com/ip-products/processors/cortex-a/cortex-a78c

OK for master?

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Cortex-A78C core.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Update docs.


rb13727.patch
Description: rb13727.patch


RE: [PATCH 1/5] arm: Auto-vectorization for MVE: vand

2020-12-08 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Christophe Lyon 
> Sent: 08 December 2020 13:59
> To: Kyrylo Tkachov 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH 1/5] arm: Auto-vectorization for MVE: vand
> 
> On Tue, 8 Dec 2020 at 14:19, Kyrylo Tkachov 
> wrote:
> >
> > Hi Christophe
> >
> > > -Original Message-
> > > From: Gcc-patches  On Behalf Of
> > > Christophe Lyon via Gcc-patches
> > > Sent: 08 December 2020 13:06
> > > To: gcc-patches@gcc.gnu.org
> > > Subject: [PATCH 1/5] arm: Auto-vectorization for MVE: vand
> > >
> > > This patch enables MVE vandq instructions for auto-vectorization.  MVE
> > > vandq insns in mve.md are modified to use 'and' instead of unspec
> > > expression to support and3.  The and3 expander is added
> to
> > > vec-common.md
> > >
> > > 2020-12-03  Christophe Lyon  
> > >
> > >   gcc/
> > >   * config/arm/iterators.md (supf): Remove VANDQ_S and VANDQ_U.
> > >   (VANQ): Remove.
> > >   (VDQ): Add TARGET_HAVE_MVE condition where relevant.
> > >   * config/arm/mve.md (mve_vandq_u): New entry for vand
> > >   instruction using expression 'and'.
> > >   (mve_vandq_s): New expander.
> > >   (mve_vaddq_n_f): Use 'and' code instead of unspec.
> > >   * config/arm/neon.md (and3): Rename into
> > > and3_neon.
> > >   * config/arm/predicates.md (imm_for_neon_inv_logic_operand):
> > >   Enable for MVE.
> > >   * config/arm/unspecs.md (VANDQ_S, VANDQ_U, VANDQ_F):
> > > Remove.
> > >   * config/arm/vec-common.md (and3): New expander.
> > >
> > >   gcc/testsuite/
> > >   * gcc.target/arm/simd/mve-vand.c: New test.
> > > ---
> > >  gcc/config/arm/iterators.md  | 11 +++--
> > >  gcc/config/arm/mve.md| 40 +-
> > >  gcc/config/arm/neon.md   |  2 +-
> > >  gcc/config/arm/predicates.md |  2 +-
> > >  gcc/config/arm/unspecs.md|  3 --
> > >  gcc/config/arm/vec-common.md |  8 
> > >  gcc/testsuite/gcc.target/arm/simd/mve-vand.c | 63
> > > 
> > >  7 files changed, 109 insertions(+), 20 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vand.c
> > >
> > > diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> > > index 592af35..badad2b 100644
> > > --- a/gcc/config/arm/iterators.md
> > > +++ b/gcc/config/arm/iterators.md
> > > @@ -147,7 +147,12 @@ (define_mode_iterator VW [V8QI V4HI V2SI])
> > >  (define_mode_iterator VN [V8HI V4SI V2DI])
> > >
> > >  ;; All supported vector modes (except singleton DImode).
> > > -(define_mode_iterator VDQ [V8QI V16QI V4HI V8HI V2SI V4SI V4HF
> V8HF
> > > V2SF V4SF V2DI])
> > > +(define_mode_iterator VDQ [(V8QI "!TARGET_HAVE_MVE") V16QI
> > > +(V4HI "!TARGET_HAVE_MVE") V8HI
> > > +(V2SI "!TARGET_HAVE_MVE") V4SI
> > > +(V4HF "!TARGET_HAVE_MVE") V8HF
> > > +(V2SF "!TARGET_HAVE_MVE") V4SF
> > > +(V2DI "!TARGET_HAVE_MVE")])
> > >
> > >  ;; All supported floating-point vector modes (except V2DF).
> > >  (define_mode_iterator VF [(V4HF "TARGET_NEON_FP16INST")
> > > @@ -1232,8 +1237,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s")
> > > (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
> > >  (VADDLVQ_P_U "u") (VCMPNEQ_U "u") (VCMPNEQ_S "s")
> > >  (VABDQ_M_S "s") (VABDQ_M_U "u") (VABDQ_S "s")
> > >  (VABDQ_U "u") (VADDQ_N_S "s") (VADDQ_N_U "u")
> > > -(VADDVQ_P_S "s") (VADDVQ_P_U "u") (VANDQ_S "s")
> > > -(VANDQ_U "u") (VBICQ_S "s") (VBICQ_U "u")
> > > +(VADDVQ_P_S "s") (VADDVQ_P_U "u") (VBICQ_S "s")
> > > (VBICQ_U "u")
> > >  (VBRSRQ_N_S "s") (VBRSRQ_N_U "u")
> > > (VCADDQ_ROT270_S "s")
> > >  (VCADDQ_ROT270_U "u") (VCADDQ_ROT90_S "s")
> > >  (VCMPEQQ_S "s") (VCMPEQQ_U "u")
> > > (VCADDQ_ROT90_U "u")
> > > @@ -1501,7 +1505,6 @@ (define_int_iterator VABDQ [VABDQ_S
> VABDQ_U])
> > >  (define_int_iterator VADDQ_N [VADDQ_N_S VADDQ_N_U])
> > >  (define_int_iterator VADDVAQ [VADDVAQ_S VADDVAQ_U])
> > >  (define_int_iterator VADDVQ_P [VADDVQ_P_U VADDVQ_P_S])
> > > -(define_int_iterator VANDQ [VANDQ_U VANDQ_S])
> > >  (define_int_iterator VBICQ [VBICQ_S VBICQ_U])
> > >  (define_int_iterator VBRSRQ_N [VBRSRQ_N_U VBRSRQ_N_S])
> > >  (define_int_iterator VCADDQ_ROT270 [VCADDQ_ROT270_S
> > > VCADDQ_ROT270_U])
> > > diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> > > index ecbaaa9..238c828 100644
> > > --- a/gcc/config/arm/mve.md
> > > +++ b/gcc/config/arm/mve.md
> > > @@ -894,17 +894,36 @@ (define_insn "mve_vaddvq_p_"
> > >  ;;
> > >  ;; [vandq_u, vandq_s])
> > >  ;;
> > > -(define_insn "mve_vandq_"
> > > +;; signed and unsigned versions are the same: define the unsigned
> > > +;; insn, and use an expander for the signed one as

Re: [PATCH 1/5] arm: Auto-vectorization for MVE: vand

2020-12-08 Thread Christophe Lyon via Gcc-patches
On Tue, 8 Dec 2020 at 14:19, Kyrylo Tkachov  wrote:
>
> Hi Christophe
>
> > -Original Message-
> > From: Gcc-patches  On Behalf Of
> > Christophe Lyon via Gcc-patches
> > Sent: 08 December 2020 13:06
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH 1/5] arm: Auto-vectorization for MVE: vand
> >
> > This patch enables MVE vandq instructions for auto-vectorization.  MVE
> > vandq insns in mve.md are modified to use 'and' instead of unspec
> > expression to support and3.  The and3 expander is added to
> > vec-common.md
> >
> > 2020-12-03  Christophe Lyon  
> >
> >   gcc/
> >   * config/arm/iterators.md (supf): Remove VANDQ_S and VANDQ_U.
> >   (VANQ): Remove.
> >   (VDQ): Add TARGET_HAVE_MVE condition where relevant.
> >   * config/arm/mve.md (mve_vandq_u): New entry for vand
> >   instruction using expression 'and'.
> >   (mve_vandq_s): New expander.
> >   (mve_vaddq_n_f): Use 'and' code instead of unspec.
> >   * config/arm/neon.md (and3): Rename into
> > and3_neon.
> >   * config/arm/predicates.md (imm_for_neon_inv_logic_operand):
> >   Enable for MVE.
> >   * config/arm/unspecs.md (VANDQ_S, VANDQ_U, VANDQ_F):
> > Remove.
> >   * config/arm/vec-common.md (and3): New expander.
> >
> >   gcc/testsuite/
> >   * gcc.target/arm/simd/mve-vand.c: New test.
> > ---
> >  gcc/config/arm/iterators.md  | 11 +++--
> >  gcc/config/arm/mve.md| 40 +-
> >  gcc/config/arm/neon.md   |  2 +-
> >  gcc/config/arm/predicates.md |  2 +-
> >  gcc/config/arm/unspecs.md|  3 --
> >  gcc/config/arm/vec-common.md |  8 
> >  gcc/testsuite/gcc.target/arm/simd/mve-vand.c | 63
> > 
> >  7 files changed, 109 insertions(+), 20 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vand.c
> >
> > diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> > index 592af35..badad2b 100644
> > --- a/gcc/config/arm/iterators.md
> > +++ b/gcc/config/arm/iterators.md
> > @@ -147,7 +147,12 @@ (define_mode_iterator VW [V8QI V4HI V2SI])
> >  (define_mode_iterator VN [V8HI V4SI V2DI])
> >
> >  ;; All supported vector modes (except singleton DImode).
> > -(define_mode_iterator VDQ [V8QI V16QI V4HI V8HI V2SI V4SI V4HF V8HF
> > V2SF V4SF V2DI])
> > +(define_mode_iterator VDQ [(V8QI "!TARGET_HAVE_MVE") V16QI
> > +(V4HI "!TARGET_HAVE_MVE") V8HI
> > +(V2SI "!TARGET_HAVE_MVE") V4SI
> > +(V4HF "!TARGET_HAVE_MVE") V8HF
> > +(V2SF "!TARGET_HAVE_MVE") V4SF
> > +(V2DI "!TARGET_HAVE_MVE")])
> >
> >  ;; All supported floating-point vector modes (except V2DF).
> >  (define_mode_iterator VF [(V4HF "TARGET_NEON_FP16INST")
> > @@ -1232,8 +1237,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s")
> > (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
> >  (VADDLVQ_P_U "u") (VCMPNEQ_U "u") (VCMPNEQ_S "s")
> >  (VABDQ_M_S "s") (VABDQ_M_U "u") (VABDQ_S "s")
> >  (VABDQ_U "u") (VADDQ_N_S "s") (VADDQ_N_U "u")
> > -(VADDVQ_P_S "s") (VADDVQ_P_U "u") (VANDQ_S "s")
> > -(VANDQ_U "u") (VBICQ_S "s") (VBICQ_U "u")
> > +(VADDVQ_P_S "s") (VADDVQ_P_U "u") (VBICQ_S "s")
> > (VBICQ_U "u")
> >  (VBRSRQ_N_S "s") (VBRSRQ_N_U "u")
> > (VCADDQ_ROT270_S "s")
> >  (VCADDQ_ROT270_U "u") (VCADDQ_ROT90_S "s")
> >  (VCMPEQQ_S "s") (VCMPEQQ_U "u")
> > (VCADDQ_ROT90_U "u")
> > @@ -1501,7 +1505,6 @@ (define_int_iterator VABDQ [VABDQ_S VABDQ_U])
> >  (define_int_iterator VADDQ_N [VADDQ_N_S VADDQ_N_U])
> >  (define_int_iterator VADDVAQ [VADDVAQ_S VADDVAQ_U])
> >  (define_int_iterator VADDVQ_P [VADDVQ_P_U VADDVQ_P_S])
> > -(define_int_iterator VANDQ [VANDQ_U VANDQ_S])
> >  (define_int_iterator VBICQ [VBICQ_S VBICQ_U])
> >  (define_int_iterator VBRSRQ_N [VBRSRQ_N_U VBRSRQ_N_S])
> >  (define_int_iterator VCADDQ_ROT270 [VCADDQ_ROT270_S
> > VCADDQ_ROT270_U])
> > diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> > index ecbaaa9..238c828 100644
> > --- a/gcc/config/arm/mve.md
> > +++ b/gcc/config/arm/mve.md
> > @@ -894,17 +894,36 @@ (define_insn "mve_vaddvq_p_"
> >  ;;
> >  ;; [vandq_u, vandq_s])
> >  ;;
> > -(define_insn "mve_vandq_"
> > +;; signed and unsigned versions are the same: define the unsigned
> > +;; insn, and use an expander for the signed one as we still reference
> > +;; both names from arm_mve.h.
> > +;; We use the same code as in neon.md (TODO: avoid this duplication).
> > +(define_insn "mve_vandq_u"
> > +  [
> > +   (set (match_operand:MVE_2 0 "s_register_operand" "=w,w")
> > + (and:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w,0")
> > +(match_operand:MVE_2 2 "neon_inv_logic_op2" "w,DL")))
> > +  ]
> > +  "TARGET_HAVE_MVE"

RFC: ARM MVE and Neon auto-vectorization

2020-12-08 Thread Christophe Lyon via Gcc-patches
Hi,

I've been working for a while on enabling auto-vectorization for ARM
MVE, and I find it a bit awkward to keep things common with Neon as
much as possible.

I've just sent a few patches for logical operators
(vand/vorr/veor/vbic), and I have a few more WIP patches where I
struggle to avoid duplication.

For example, vneg is supported in different modes by MVE and Neon:
* Neon: VDQ and VH iterators: V8QI V16QI V4HI V8HI V2SI V4SI V4HF V8HF
V2SF V4SF V2DI  and V8HF V4HF
* MVE: MVE_2 and MVE_0 iterators: V16QI V8HI V4SI and V8HF V4SF

My 'vand' patch changes the definition of VDQ so that the relevant
modes are enabled only when !TARGET_HAVE_MVE (V8QI, ...), and this
helps writing a simpler expander.

However, vneg is used by vshr (right-shifts by register are
implemented as left-shift by negation of that register), so the
expander uses something like:

  emit_insn (gen_neg<mode>2 (neg, operands[2]));
  if (TARGET_NEON)
    emit_insn (gen_ashl<mode>3_signed (operands[0], operands[1], neg));
  else
    emit_insn (gen_mve_vshlq_s<mode> (operands[0], operands[1], neg));

which does not work if the iterator has conditional members: the
'else' part is still generated for <mode> values unsupported by MVE.
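
For readers less familiar with the shift idiom mentioned above, here is a
scalar C++ sketch of it. It only models one lane, assuming the usual
semantics of a register-controlled shift where a negative count shifts
right; the names shl_by_reg and shr_via_neg are hypothetical and are not
GCC or ACLE functions.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one lane of a register-controlled signed shift:
// a non-negative count shifts left, a negative count shifts right.
// Counts are assumed to lie in (-32, 32).
inline int32_t shl_by_reg(int32_t value, int count) {
    assert(count > -32 && count < 32);
    if (count >= 0) {
        // Shift in the unsigned domain to avoid signed-overflow issues.
        return static_cast<int32_t>(static_cast<uint32_t>(value) << count);
    }
    // Arithmetic right shift on GCC (guaranteed by the language from C++20).
    return value >> -count;
}

// Right shift expressed the way the expander does it: negate the count,
// then issue the "left shift by register" operation.
inline int32_t shr_via_neg(int32_t value, int count) {
    int neg = -count;               // models the emitted negation
    return shl_by_reg(value, neg);  // models the emitted shift-left insn
}
```

The point of the sketch is only that the negation and the shift are two
separate steps, which is why the expander needs both a neg pattern and a
shift pattern for every mode it is instantiated for.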

So I guess my question is:  do we want to enforce implementation
of Neon / MVE common parts? There are already lots of partly
overlapping/duplicate iterators. I have tried to split iterators into
eg VDQ_COMMON_TO_NEON_AND_MVE and VDQ_NEON_ONLY but this means we have
to basically duplicate the expanders which defeats the point...

Or we can keep different expanders for Neon and MVE? But we have
already quite a few in vec-common.md.

I hoped I could submit more vectorization patches for MVE, but it
seems there's more preparation work needed.

Any advice highly appreciated!

Thanks,

Christophe


[committed] libstdc++: Adjust whitespace in documentation

2020-12-08 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* doc/xml/manual/appendix_contributing.xml: Use consistent
indentation.
* doc/html/manual/source_code_style.html: Regenerate.

Committed to trunk.

commit edbbf7363cff62fc7ff536b5fa64e39f5a4d6496
Author: Jonathan Wakely 
Date:   Tue Dec 8 13:35:07 2020

libstdc++: Adjust whitespace in documentation

libstdc++-v3/ChangeLog:

* doc/xml/manual/appendix_contributing.xml: Use consistent
indentation.
* doc/html/manual/source_code_style.html: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
index d346b922907..ceb21f4478a 100644
--- a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
+++ b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
@@ -446,7 +446,7 @@ indicate a place that may require attention for multi-thread safety.
 
   MS adds:
   _T
-   __deref
+  __deref
 
   BSD adds:
   __used


[PATCH] if-to-switch: fix matching of negative conditions

2020-12-08 Thread Martin Liška

We must be careful which edge we follow for conditions
of a negative form (index != 2).
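
As an illustration only (this is not the new testcase, and the function
names are made up), the following C++ sketch shows, in source terms, why
the direction of the link matters. In a chain written with "index != VALUE"
tests, the not-equal branch is the one that continues to the next test, and
the equal branch handles VALUE. The third function has the same shape as
pr98182.c: its second test sits on the path where the first value already
matched, so it is not a chain over distinct cases.

```cpp
#include <cassert>

// A condition chain written with negative tests.
inline int classify_chain(int index) {
    if (index != 1) {
        if (index != 2) {
            if (index == 3)
                return 30;
            return -1;   // default
        }
        return 20;       // reached when index == 2
    }
    return 10;           // reached when index == 1
}

// The switch that such a chain is equivalent to.
inline int classify_switch(int index) {
    switch (index) {
    case 1: return 10;
    case 2: return 20;
    case 3: return 30;
    default: return -1;
    }
}

// Not a valid chain: the second test is only reached when index == 1,
// so merging both tests into one switch would change behaviour.
inline int not_a_chain(int index) {
    if (index != 1)
        return -1;
    else if (index != 2)
        return -2;       // always taken when index == 1
    else
        return 0;        // unreachable
}
```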

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR tree-optimization/98182
* gimple-if-to-switch.cc (pass_if_to_switch::execute): Request
chain linkage through false edges only.

gcc/testsuite/ChangeLog:

PR tree-optimization/98182
* gcc.dg/tree-ssa/if-to-switch-10.c: New test.
* gcc.dg/tree-ssa/pr98182.c: New test.
---
 gcc/gimple-if-to-switch.cc|  6 +++
 .../gcc.dg/tree-ssa/if-to-switch-10.c | 44 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr98182.c   | 18 
 3 files changed, 68 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr98182.c

diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc
index 8e1043ae7c4..311f6f6ac97 100644
--- a/gcc/gimple-if-to-switch.cc
+++ b/gcc/gimple-if-to-switch.cc
@@ -522,6 +522,12 @@ pass_if_to_switch::execute (function *fun)
  if (!info2 || info->m_ranges[0].exp != info2->m_ranges[0].exp)
break;
 
+	  /* It is important that the blocks are linked through FALSE_EDGE.
+	     For an expression of index != VALUE, true and false edges
+	     are flipped.  */
+ if (info2->m_false_edge != e)
+   break;
+
  chain->m_entries.safe_push (info2);
  bitmap_set_bit (seen_bbs, e->src->index);
  info = info2;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
new file mode 100644
index 000..7b8da1c9f3c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-iftoswitch-optimized" } */
+
+int global;
+int foo ();
+
+int main(int argc, char **argv)
+{
+  if (argc != 1)
+{
+  if (argc != 2)
+   {
+ if (argc == 3)
+   {
+ foo ();
+ foo ();
+   }
+ else if (argc == 4)
+   {
+ foo ();
+   }
+ else if (argc == 5)
+   {
+ global = 2;
+   }
+ else
+   global -= 123;
+   }
+  else
+   {
+ global += 1;
+   }
+}
+  else
+foo ();
+
+
+  global -= 12;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "Canonical GIMPLE case clusters: 1 2 3 4 5" "iftoswitch" } } */
+/* { dg-final { scan-tree-dump "Condition chain with \[^\n\r]\* BBs transformed into a switch statement." "iftoswitch" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr98182.c b/gcc/testsuite/gcc.dg/tree-ssa/pr98182.c
new file mode 100644
index 000..29a547e3788
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr98182.c
@@ -0,0 +1,18 @@
+/* PR tree-optimization/98182 */
+/* { dg-do compile } */
+/* { dg-options "-O1 --param case-values-threshold=1 -fdump-tree-iftoswitch-optimized" } */
+
+int global;
+int foo ();
+
+int main(int argc, char **argv)
+{
+  if (argc != 1)
+__builtin_abort ();
+  else if (argc != 2)
+__builtin_abort ();
+  else
+return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "Condition chain" "iftoswitch" } } */
--
2.29.2



RE: [PATCH 1/5] arm: Auto-vectorization for MVE: vand

2020-12-08 Thread Kyrylo Tkachov via Gcc-patches
Hi Christophe

> -Original Message-
> From: Gcc-patches  On Behalf Of
> Christophe Lyon via Gcc-patches
> Sent: 08 December 2020 13:06
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH 1/5] arm: Auto-vectorization for MVE: vand
> 
> This patch enables MVE vandq instructions for auto-vectorization.  MVE
> vandq insns in mve.md are modified to use 'and' instead of unspec
> expression to support and<mode>3.  The and<mode>3 expander is added to
> vec-common.md
> 
> 2020-12-03  Christophe Lyon  
> 
>   gcc/
>   * config/arm/iterators.md (supf): Remove VANDQ_S and VANDQ_U.
>   (VANDQ): Remove.
>   (VDQ): Add TARGET_HAVE_MVE condition where relevant.
>   * config/arm/mve.md (mve_vandq_u): New entry for vand
>   instruction using expression 'and'.
>   (mve_vandq_s): New expander.
>   (mve_vaddq_n_f): Use 'and' code instead of unspec.
>   * config/arm/neon.md (and3): Rename into
> and3_neon.
>   * config/arm/predicates.md (imm_for_neon_inv_logic_operand):
>   Enable for MVE.
>   * config/arm/unspecs.md (VANDQ_S, VANDQ_U, VANDQ_F):
> Remove.
>   * config/arm/vec-common.md (and3): New expander.
> 
>   gcc/testsuite/
>   * gcc.target/arm/simd/mve-vand.c: New test.
> ---
>  gcc/config/arm/iterators.md  | 11 +++--
>  gcc/config/arm/mve.md| 40 +-
>  gcc/config/arm/neon.md   |  2 +-
>  gcc/config/arm/predicates.md |  2 +-
>  gcc/config/arm/unspecs.md|  3 --
>  gcc/config/arm/vec-common.md |  8 
>  gcc/testsuite/gcc.target/arm/simd/mve-vand.c | 63
> 
>  7 files changed, 109 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vand.c
> 
> diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> index 592af35..badad2b 100644
> --- a/gcc/config/arm/iterators.md
> +++ b/gcc/config/arm/iterators.md
> @@ -147,7 +147,12 @@ (define_mode_iterator VW [V8QI V4HI V2SI])
>  (define_mode_iterator VN [V8HI V4SI V2DI])
> 
>  ;; All supported vector modes (except singleton DImode).
> -(define_mode_iterator VDQ [V8QI V16QI V4HI V8HI V2SI V4SI V4HF V8HF
> V2SF V4SF V2DI])
> +(define_mode_iterator VDQ [(V8QI "!TARGET_HAVE_MVE") V16QI
> +(V4HI "!TARGET_HAVE_MVE") V8HI
> +(V2SI "!TARGET_HAVE_MVE") V4SI
> +(V4HF "!TARGET_HAVE_MVE") V8HF
> +(V2SF "!TARGET_HAVE_MVE") V4SF
> +(V2DI "!TARGET_HAVE_MVE")])
> 
>  ;; All supported floating-point vector modes (except V2DF).
>  (define_mode_iterator VF [(V4HF "TARGET_NEON_FP16INST")
> @@ -1232,8 +1237,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s")
> (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
>  (VADDLVQ_P_U "u") (VCMPNEQ_U "u") (VCMPNEQ_S "s")
>  (VABDQ_M_S "s") (VABDQ_M_U "u") (VABDQ_S "s")
>  (VABDQ_U "u") (VADDQ_N_S "s") (VADDQ_N_U "u")
> -(VADDVQ_P_S "s") (VADDVQ_P_U "u") (VANDQ_S "s")
> -(VANDQ_U "u") (VBICQ_S "s") (VBICQ_U "u")
> +(VADDVQ_P_S "s") (VADDVQ_P_U "u") (VBICQ_S "s")
> (VBICQ_U "u")
>  (VBRSRQ_N_S "s") (VBRSRQ_N_U "u")
> (VCADDQ_ROT270_S "s")
>  (VCADDQ_ROT270_U "u") (VCADDQ_ROT90_S "s")
>  (VCMPEQQ_S "s") (VCMPEQQ_U "u")
> (VCADDQ_ROT90_U "u")
> @@ -1501,7 +1505,6 @@ (define_int_iterator VABDQ [VABDQ_S VABDQ_U])
>  (define_int_iterator VADDQ_N [VADDQ_N_S VADDQ_N_U])
>  (define_int_iterator VADDVAQ [VADDVAQ_S VADDVAQ_U])
>  (define_int_iterator VADDVQ_P [VADDVQ_P_U VADDVQ_P_S])
> -(define_int_iterator VANDQ [VANDQ_U VANDQ_S])
>  (define_int_iterator VBICQ [VBICQ_S VBICQ_U])
>  (define_int_iterator VBRSRQ_N [VBRSRQ_N_U VBRSRQ_N_S])
>  (define_int_iterator VCADDQ_ROT270 [VCADDQ_ROT270_S
> VCADDQ_ROT270_U])
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index ecbaaa9..238c828 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -894,17 +894,36 @@ (define_insn "mve_vaddvq_p_"
>  ;;
>  ;; [vandq_u, vandq_s])
>  ;;
> -(define_insn "mve_vandq_"
> +;; signed and unsigned versions are the same: define the unsigned
> +;; insn, and use an expander for the signed one as we still reference
> +;; both names from arm_mve.h.
> +;; We use the same code as in neon.md (TODO: avoid this duplication).
> +(define_insn "mve_vandq_u"
> +  [
> +   (set (match_operand:MVE_2 0 "s_register_operand" "=w,w")
> + (and:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w,0")
> +(match_operand:MVE_2 2 "neon_inv_logic_op2" "w,DL")))
> +  ]
> +  "TARGET_HAVE_MVE"
> +  {
> +switch (which_alternative)
> +  {
> +  case 0: return "vand\t%q0, %q1, %q2";
> +  case 1: return neon_output_logic_immediate ("vand", &operands[2],
> + mode, 1, VALID_NEON_QREG_MODE
> (mode));
> +  default: gcc_u

Re: [wwwdocs] Document libstdc++ changes in GCC 11

2020-12-08 Thread Jonathan Wakely via Gcc-patches

On 08/12/20 13:07 +, Jonathan Wakely wrote:

Also add porting-to notes about tr1::bind.


And the traditional follow-up patch with markup fixes.

Pushed to wwwdocs.


commit 38b28b4e95d8331d28e70a0272618135d5f69c79
Author: Jonathan Wakely 
Date:   Tue Dec 8 13:15:13 2020 +

Fix markup errors

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index 50e35505..5c3519ba 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -243,8 +243,8 @@ a work-in-progress.
 
   Improved C++17 support, including:
 
-  std::from_chars for floating-point types.
-
+  std::from_chars for floating-point types.
+
   
   Improved experimental C++2a support, including:
 


Re: [PATCH] gcc: handle double quotes in symbol name during stabstrings generation

2020-12-08 Thread CHIGOT, CLEMENT via Gcc-patches
Hi Ian,

Any news about this bug? It's not urgent, even though it breaks GCC builds with
the Go language; I just want to know whether you need any input or help from me.

Thanks,
Clément

From: CHIGOT, CLEMENT 
Sent: Wednesday, December 2, 2020 5:14 PM
To: Ian Lance Taylor 
Cc: gcc-patches@gcc.gnu.org ; David Edelsohn 

Subject: Re: [PATCH] gcc: handle double quotes in symbol name during 
stabstrings generation

Hi Ian,

Here is the test case.
If you're compiling with -gstabs you should have a line looking like:
.stabs  "type..struct{Type go.bug1.ObjectIdentifier;Value [][]go.bug1.Extension{asn1:"set"}}:G(0,7)=xsStructType:",32,0,0,0

As you can see, the double quotes around "set" aren't escaped.
I didn't try to reproduce it on linux/amd64, but I did on linux/ppc64le and I 
don't think it's a ppc-only bug.
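
To make the problem concrete, here is a minimal C++ sketch of the kind of
escaping that is missing. It is not the attached patch: escape_for_stabs is
a hypothetical helper, and it assumes the assembler accepts a backslash
before a double quote or backslash inside a quoted string, which would need
checking on each target.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: return NAME with every double quote and backslash
// preceded by a backslash, so that the result can be placed between the
// quotes of a .stabs directive without ending the string early.
inline std::string escape_for_stabs(const std::string &name) {
    std::string out;
    out.reserve(name.size());
    for (char c : name) {
        if (c == '"' || c == '\\')
            out.push_back('\\');
        out.push_back(c);
    }
    return out;
}
```

With such a helper the fragment {asn1:"set"} would be emitted as
{asn1:\"set\"}, and names without quotes would be left unchanged.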

Clément

From: Ian Lance Taylor 
Sent: Wednesday, December 2, 2020 4:55 PM
To: CHIGOT, CLEMENT 
Cc: gcc-patches@gcc.gnu.org ; David Edelsohn 

Subject: Re: [PATCH] gcc: handle double quotes in symbol name during 
stabstrings generation

On Wed, Dec 2, 2020 at 4:24 AM CHIGOT, CLEMENT  wrote:
>
> Since the new gccgo mangling scheme, libgo compilation is broken on AIX (or 
> in Linux with -gstabs) because of a type symbol having a " in its name. I've 
> made a patch (see attachment) in order to fix stabstring generation, because, 
> IMO, it should be handled anyway.
> However, it happens only once in the whole libgo so I don't know if this " is 
> intended or not. The problematic type is here:
> https://github.com/golang/go/blob/master/src/crypto/x509/x509.go#L2674
> Other similar types don't trigger the bug though.
>
> I have a minimal test that could be added if you wish, to the Golang tests,
> to the GCC Go tests, or to both.
>
> If the patch is okay, could you please apply it for me ?

Could you show me the small test case?  I don't think I understand the
problem.  In DWARF I don't see any symbol names with quotation marks.
I'm not yet sure that your patch is the right fix.  Thanks.

Ian


[wwwdocs] Document libstdc++ changes in GCC 11

2020-12-08 Thread Jonathan Wakely via Gcc-patches
Also add porting-to notes about tr1::bind.

Pushed to wwwdocs.
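
For the tr1::bind porting note in this patch, a small C++ sketch of the
suggested resolution may help. It only shows the unambiguous, fully
qualified form; add_offset and call_with_qualified_bind are illustrative
names, and the sketch deliberately avoids the TR1 header.

```cpp
#include <cassert>
#include <functional>

inline int add_offset(int base, int offset) { return base + offset; }

// Qualifying the call as std::bind means the result does not depend on
// unqualified lookup or Argument-Dependent Lookup, so there is nothing
// left to be ambiguous.
inline int call_with_qualified_bind(int offset) {
    auto add_ten = std::bind(add_offset, 10, std::placeholders::_1);
    return add_ten(offset);
}
```

Code that must keep using the TR1 version can qualify the call as
std::tr1::bind in the same way.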


commit 927e80dc01f505a625f1fcc4e1ca38aeb9f88e67
Author: Jonathan Wakely 
Date:   Tue Dec 8 13:05:42 2020 +

Document libstdc++ changes in GCC 11

Also add porting-to notes about tr1::bind.

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index 4d3efed5..50e35505 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -239,7 +239,28 @@ a work-in-progress.
   
 
 
-
+Runtime Library (libstdc++)
+
+  Improved C++17 support, including:
+
+  std::from_chars for floating-point types.
+
+  
+  Improved experimental C++2a support, including:
+
+  Calendar additions to .
+  std::bit_cast
+  std::source_location
+  Atomic wait and notify operations.
+   and 
+  
+  Efficient access to basic_stringbuf's buffer.
+
+  
+  Faster std::uniform_int_distribution,
+  thanks to Daniel Lemire.
+  
+
 
 Fortran
 
diff --git a/htdocs/gcc-11/porting_to.html b/htdocs/gcc-11/porting_to.html
index 41efc3b6..4187dd8e 100644
--- a/htdocs/gcc-11/porting_to.html
+++ b/htdocs/gcc-11/porting_to.html
@@ -114,6 +114,33 @@ be included explicitly when compiled with GCC 11:
 
 
 
+Old iostream Members
+
+The deprecated iostream members ios_base::io_state,
+ios_base::open_mode, ios_base::seek_dir, and
+basic_streambuf::stossc are not available in C++17 mode.
+References to those members should be replaced by
+std::ios_base::iostate, std::ios_base::openmode,
+std::ios_base::seekdir, and basic_streambuf::sbumpc respectively.
+
+
+Call of overloaded 'bind(...)' is ambiguous
+
+The placeholders for std::tr1::bind have been changed to use
+the same placeholder objects as std::bind.  This means that
+following using std::tr1::bind; an unqualified call to
+bind(f, std::tr1::placeholders::_1) may be ambiguous.
+This happens because std::tr1::bind is brought into scope by
+the using-declaration and std::bind is found by
+Argument-Dependent Lookup due to the type of the _1 placeholder.
+
+
+To resolve this ambiguity replace unqualified calls to bind
+with std::tr1::bind or std::bind. Alternatively,
+change the code to not include the  header,
+so that only std::bind is declared.
+
+
 


c++: module directive FSM

2020-12-08 Thread Nathan Sidwell


As mentioned in the preprocessor patches, there's a new kind of
preprocessor directive for modules, and it interacts with the
compiler-proper, as that has to stream in header-unit macro
information (when the directive is an import that names a
header-unit).  This is that machinery.  It's an FSM that inspects the
token stream and does the minimal parsing to detect such imports.
This ends up being called from the C++ parser's tokenizer and from the
-E tokenizer (via a lang hook).  The actual module streaming is a stub
here.
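
To give a feel for the approach, here is a much-reduced C++ sketch of such
a token filter. It is an illustration under simplifying assumptions, not
the code in this patch: the token kinds, the Token struct and the
ImportScanner class are all hypothetical, module names with dots and
colons are not parsed, and the scanner merely records header-unit names
instead of streaming anything in.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified token kinds; the real filter looks at cpp_token types.
enum class Tok { Import, Export, Name, HeaderName, Eol, Other };

struct Token {
    Tok kind;
    std::string text;
};

// Watches a token stream and records the header-units named by import
// directives. Each call to feed() advances the state machine by one token.
class ImportScanner {
public:
    void feed(const Token &tok) {
        switch (state_) {
        case State::Idle:
            // "export" may precede "import"; only "import" starts a directive.
            if (tok.kind == Tok::Import)
                state_ = State::First;
            break;

        case State::First:
            if (tok.kind == Tok::HeaderName) {
                pending_ = tok.text;      // import of a header-unit
                state_ = State::End;
            } else if (tok.kind == Tok::Eol) {
                state_ = State::Idle;     // empty directive
            } else {
                pending_.clear();         // named module: not recorded here
                state_ = State::End;
            }
            break;

        case State::End:
            // Skip everything up to the end of the directive.
            if (tok.kind == Tok::Eol) {
                if (!pending_.empty())
                    header_units_.push_back(pending_);
                pending_.clear();
                state_ = State::Idle;
            }
            break;
        }
    }

    const std::vector<std::string> &header_units() const {
        return header_units_;
    }

private:
    enum class State { Idle, First, End };
    State state_ = State::Idle;
    std::string pending_;
    std::vector<std::string> header_units_;
};
```

Keeping the state in an object that is resumed once per token is what lets
the same logic be driven both from the parser's tokenizer and from the -E
path.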

gcc/cp/
* cp-tree.h (module_token_pre, module_token_cdtor)
(module_token_lang): Declare.
* lex.c: Include langhooks.
(struct module_token_filter): New.
(module_token_pre, module_token_cdtor)
(module_token_lang): Define.
* module.cc (get_module, preprocess_module, preprocessed_module):
Nop stubs.

--
Nathan Sidwell
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index b72069eecda..aa2b0f782fa 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -6849,6 +6849,10 @@ extern void set_identifier_kind			(tree, cp_identifier_kind);
 extern bool cxx_init(void);
 extern void cxx_finish(void);
 extern bool in_main_input_context		(void);
+extern uintptr_t module_token_pre (cpp_reader *, const cpp_token *, uintptr_t);
+extern uintptr_t module_token_cdtor (cpp_reader *, uintptr_t);
+extern uintptr_t module_token_lang (int type, int keyword, tree value,
+location_t, uintptr_t);
 
 /* in method.c */
 extern void init_method(void);
diff --git i/gcc/cp/lex.c w/gcc/cp/lex.c
index 795f5718198..6053848535e 100644
--- i/gcc/cp/lex.c
+++ w/gcc/cp/lex.c
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "c-family/c-objc.h"
 #include "gcc-rich-location.h"
 #include "cp-name-hint.h"
+#include "langhooks.h"
 
 static int interface_strcmp (const char *);
 static void init_cp_pragma (void);
@@ -380,7 +381,206 @@ interface_strcmp (const char* s)
   return 1;
 }
 
-
+/* We've just read a cpp-token, figure out our next state.  Hey, this
+   is a hand-coded co-routine!  */
+
+struct module_token_filter
+{
+  enum state
+  {
+   idle,
+   module_first,
+   module_cont,
+   module_end,
+  };
+
+  enum state state : 8;
+  bool is_import : 1;
+  bool got_export : 1;
+  bool got_colon : 1;
+  bool want_dot : 1;
+
+  location_t token_loc;
+  cpp_reader *reader;
+  module_state *module;
+  module_state *import;
+
+  module_token_filter (cpp_reader *reader)
+: state (idle), is_import (false),
+got_export (false), got_colon (false), want_dot (false),
+token_loc (UNKNOWN_LOCATION),
+reader (reader), module (NULL), import (NULL)
+  {
+  };
+
+  /* Process the next token.  Note we cannot see CPP_EOF inside a
+ pragma -- a CPP_PRAGMA_EOL always happens.  */
+  uintptr_t resume (int type, int keyword, tree value, location_t loc)
+  {
+unsigned res = 0;
+
+switch (state)
+  {
+  case idle:
+	if (type == CPP_KEYWORD)
+	  switch (keyword)
+	{
+	default:
+	  break;
+
+	case RID__EXPORT:
+	  got_export = true;
+	  res = lang_hooks::PT_begin_pragma;
+	  break;
+
+	case RID__IMPORT:
+	  is_import = true;
+	  /* FALLTHRU */
+	case RID__MODULE:
+	  state = module_first;
+	  want_dot = false;
+	  got_colon = false;
+	  token_loc = loc;
+	  import = NULL;
+	  if (!got_export)
+		res = lang_hooks::PT_begin_pragma;
+	  break;
+	}
+	break;
+
+  case module_first:
+	if (is_import && type == CPP_HEADER_NAME)
+	  {
+	/* A header name.  The preprocessor will have already
+	   done include searching and canonicalization.  */
+	state = module_end;
+	goto header_unit;
+	  }
+	
+	if (type == CPP_PADDING || type == CPP_COMMENT)
+	  break;
+
+	state = module_cont;
+	if (type == CPP_COLON && module)
+	  {
+	got_colon = true;
+	import = module;
+	break;
+	  }
+	/* FALLTHROUGH  */
+
+  case module_cont:
+	switch (type)
+	  {
+	  case CPP_PADDING:
+	  case CPP_COMMENT:
+	break;
+
+	  default:
+	/* If we ever need to pay attention to attributes for
+	   header modules, more logic will be needed.  */
+	state = module_end;
+	break;
+
+	  case CPP_COLON:
+	if (got_colon)
+	  state = module_end;
+	got_colon = true;
+	/* FALLTHROUGH  */
+	  case CPP_DOT:
+	if (!want_dot)
+	  state = module_end;
+	want_dot = false;
+	break;
+
+	  case CPP_PRAGMA_EOL:
+	goto module_end;
+
+	  case CPP_NAME:
+	if (want_dot)
+	  {
+		/* Got name instead of [.:].  */
+		state = module_end;
+		break;
+	  }
+	  header_unit:
+	import = get_module (value, import, got_colon);
+	want_dot = true;
+	break;
+	  }
+	break;
+
+  case module_end:
+	if (type == CPP_PRAGMA_EOL)
+	  {
+	  module_end:;
+	/* End of the directive, handle the name.  */
+	if (import)
+	  if (module_state *m
+		  = preprocess_module (import, token_lo

[PATCH 2/5] arm: Auto-vectorization for MVE: vorr

2020-12-08 Thread Christophe Lyon via Gcc-patches
This patch enables MVE vorrq instructions for auto-vectorization.  MVE
vorrq insns in mve.md are modified to use ior instead of unspec
expression to support ior<mode>3.  The ior<mode>3 expander is added to
vec-common.md

2020-12-03  Christophe Lyon  

gcc/
* config/arm/iterators.md (supf): Remove VORRQ_S and VORRQ_U.
(VORRQ): Remove.
* config/arm/mve.md (mve_vorrq_s): New entry for vorr
instruction using expression ior.
(mve_vorrq_u): New expander.
(mve_vorrq_f): Use ior code instead of unspec.
* config/arm/neon.md (ior3): Renamed into ior3_neon.
* config/arm/predicates.md (imm_for_neon_logic_operand): Enable
for MVE.
* config/arm/unspecs.md (VORRQ_S, VORRQ_U, VORRQ_F): Remove.
* config/arm/vec-common.md (ior3): New expander.

gcc/testsuite/
* gcc.target/arm/simd/mve-vorr.c: Add vorr tests.
---
 gcc/config/arm/iterators.md  |  5 +--
 gcc/config/arm/mve.md| 36 
 gcc/config/arm/neon.md   |  2 +-
 gcc/config/arm/predicates.md |  2 +-
 gcc/config/arm/unspecs.md|  3 --
 gcc/config/arm/vec-common.md |  8 
 gcc/testsuite/gcc.target/arm/simd/mve-vorr.c | 64 
 7 files changed, 103 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vorr.c

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index badad2b..f0e1d60 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -1252,8 +1252,8 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U 
"u") (VREV16Q_S "s")
   (VMULLBQ_INT_S "s") (VMULLBQ_INT_U "u") (VQADDQ_S "s")
   (VMULLTQ_INT_S "s") (VMULLTQ_INT_U "u") (VQADDQ_U "u")
   (VMULQ_N_S "s") (VMULQ_N_U "u") (VMULQ_S "s")
-  (VMULQ_U "u") (VORNQ_S "s") (VORNQ_U "u") (VORRQ_S "s")
-  (VORRQ_U "u") (VQADDQ_N_S "s") (VQADDQ_N_U "u")
+  (VMULQ_U "u") (VORNQ_S "s") (VORNQ_U "u")
+  (VQADDQ_N_S "s") (VQADDQ_N_U "u")
   (VQRSHLQ_N_S "s") (VQRSHLQ_N_U "u") (VQRSHLQ_S "s")
   (VQRSHLQ_U "u") (VQSHLQ_N_S "s") (VQSHLQ_N_U "u")
   (VQSHLQ_R_S "s") (VQSHLQ_R_U "u") (VQSHLQ_S "s")
@@ -1528,7 +1528,6 @@ (define_int_iterator VMULLTQ_INT [VMULLTQ_INT_U 
VMULLTQ_INT_S])
 (define_int_iterator VMULQ [VMULQ_U VMULQ_S])
 (define_int_iterator VMULQ_N [VMULQ_N_U VMULQ_N_S])
 (define_int_iterator VORNQ [VORNQ_U VORNQ_S])
-(define_int_iterator VORRQ [VORRQ_S VORRQ_U])
 (define_int_iterator VQADDQ [VQADDQ_U VQADDQ_S])
 (define_int_iterator VQADDQ_N [VQADDQ_N_S VQADDQ_N_U])
 (define_int_iterator VQRSHLQ [VQRSHLQ_S VQRSHLQ_U])
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 238c828..0fcbe62 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1619,17 +1619,36 @@ (define_insn "mve_vornq_"
 ;;
 ;; [vorrq_s, vorrq_u])
 ;;
-(define_insn "mve_vorrq_"
+;; signed and unsigned versions are the same: define the unsigned
+;; insn, and use an expander for the signed one as we still reference
+;; both names from arm_mve.h.
+;; We use the same code as in neon.md (TODO: avoid this duplication).
+(define_insn "mve_vorrq_s"
   [
-   (set (match_operand:MVE_2 0 "s_register_operand" "=w")
-   (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
-  (match_operand:MVE_2 2 "s_register_operand" "w")]
-VORRQ))
+   (set (match_operand:MVE_2 0 "s_register_operand" "=w,w")
+   (ior:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w,0")
+  (match_operand:MVE_2 2 "neon_logic_op2" "w,Dl")))
   ]
   "TARGET_HAVE_MVE"
-  "vorr %q0, %q1, %q2"
+  {
+switch (which_alternative)
+  {
+  case 0: return "vorr\t%q0, %q1, %q2";
+  case 1: return neon_output_logic_immediate ("vorr", &operands[2],
+   mode, 0, VALID_NEON_QREG_MODE (mode));
+  default: gcc_unreachable ();
+  }
+  }
   [(set_attr "type" "mve_move")
 ])
+(define_expand "mve_vorrq_u"
+  [
+   (set (match_operand:MVE_2 0 "s_register_operand")
+   (ior:MVE_2 (match_operand:MVE_2 1 "s_register_operand")
+  (match_operand:MVE_2 2 "neon_logic_op2")))
+  ]
+  "TARGET_HAVE_MVE"
+)
 
 ;;
 ;; [vqaddq_n_s, vqaddq_n_u])
@@ -2664,9 +2683,8 @@ (define_insn "mve_vornq_f"
 (define_insn "mve_vorrq_f"
   [
(set (match_operand:MVE_0 0 "s_register_operand" "=w")
-   (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w")
-  (match_operand:MVE_0 2 "s_register_operand" "w")]
-VORRQ_F))
+   (ior:MVE_0 (match_operand:MVE_0 1 "s_register_operand" "w")
+  (match_operand:MVE_0 2 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
   "vorr %q0, %q1, %q2"
diff --git a/gcc/

[PATCH 5/5] arm: Auto-vectorization for MVE: vmvn

2020-12-08 Thread Christophe Lyon via Gcc-patches
This patch enables MVE vmvnq instructions for auto-vectorization.  MVE
vmvnq insns in mve.md are modified to use 'not' instead of unspec
expression to support one_cmpl<mode>2.  The one_cmpl<mode>2 expander
is added to vec-common.md.

2020-12-03  Christophe Lyon  

gcc/
* config/arm/iterators.md (VDQNOTM2): New mode iterator.
(supf): Remove VMVNQ_S and VMVNQ_U.
(VMVNQ): Remove.
* config/arm/mve.md (mve_vmvnq_u): New entry for vmvn
instruction using expression not.
(mve_vmvnq_s): New expander.
* config/arm/neon.md (one_cmpl2): Renamed into
one_cmpl2_insn.
* config/arm/unspecs.md (VMVNQ_S, VMVNQ_U): Remove.
* config/arm/vec-common.md (one_cmpl2): New expander.

gcc/testsuite/
* gcc.target/arm/simd/mve-vmvn.c: Add tests for vmvn.
---
 gcc/config/arm/iterators.md  |  3 +--
 gcc/config/arm/mve.md| 14 +++
 gcc/config/arm/neon.md   |  4 ++--
 gcc/config/arm/unspecs.md|  2 --
 gcc/config/arm/vec-common.md |  7 ++
 gcc/testsuite/gcc.target/arm/simd/mve-vmvn.c | 35 
 6 files changed, 55 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vmvn.c

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index b3c8999..cfbce24 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -1221,7 +1221,7 @@ (define_int_attr mmla_sfx [(UNSPEC_MATMUL_S "s8") 
(UNSPEC_MATMUL_U "u8")
 (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
   (VREV16Q_U "u") (VMVNQ_N_S "s") (VMVNQ_N_U "u")
   (VCVTAQ_U "u") (VCVTAQ_S "s") (VREV64Q_S "s")
-  (VREV64Q_U "u") (VMVNQ_S "s") (VMVNQ_U "u")
+  (VREV64Q_U "u")
   (VDUPQ_N_U "u") (VDUPQ_N_S"s") (VADDVQ_S "s")
   (VADDVQ_U "u") (VADDVQ_S "s") (VADDVQ_U "u")
   (VMOVLTQ_U "u") (VMOVLTQ_S "s") (VMOVLBQ_S "s")
@@ -1481,7 +1481,6 @@ (define_int_iterator VREV64Q [VREV64Q_S VREV64Q_U])
 (define_int_iterator VCVTQ_FROM_F [VCVTQ_FROM_F_S VCVTQ_FROM_F_U])
 (define_int_iterator VREV16Q [VREV16Q_U VREV16Q_S])
 (define_int_iterator VCVTAQ [VCVTAQ_U VCVTAQ_S])
-(define_int_iterator VMVNQ [VMVNQ_U VMVNQ_S])
 (define_int_iterator VDUPQ_N [VDUPQ_N_U VDUPQ_N_S])
 (define_int_iterator VCLZQ [VCLZQ_U VCLZQ_S])
 (define_int_iterator VADDVQ [VADDVQ_U VADDVQ_S])
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 1ec2395..f298546 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -433,16 +433,22 @@ (define_insn "mve_vnegq_s"
 ;;
 ;; [vmvnq_u, vmvnq_s])
 ;;
-(define_insn "mve_vmvnq_"
+(define_insn "mve_vmvnq_u"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
-   (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")]
-VMVNQ))
+   (not:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE"
-  "vmvn %q0, %q1"
+  "vmvn\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
+(define_expand "mve_vmvnq_s"
+  [
+   (set (match_operand:MVE_2 0 "s_register_operand")
+   (not:MVE_2 (match_operand:MVE_2 1 "s_register_operand")))
+  ]
+  "TARGET_HAVE_MVE"
+)
 
 ;;
 ;; [vdupq_n_u, vdupq_n_s])
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index e1263b0..42b82ee 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -756,7 +756,7 @@ (define_insn "xor3_neon"
   [(set_attr "type" "neon_logic")]
 )
 
-(define_insn "one_cmpl2"
+(define_insn "one_cmpl2_insn"
   [(set (match_operand:VDQ 0 "s_register_operand" "=w")
 (not:VDQ (match_operand:VDQ 1 "s_register_operand" "w")))]
   "TARGET_NEON"
@@ -3240,7 +3240,7 @@ (define_expand "neon_vmvn"
(match_operand:VDQIW 1 "s_register_operand")]
   "TARGET_NEON"
 {
-  emit_insn (gen_one_cmpl2 (operands[0], operands[1]));
+  emit_insn (gen_one_cmpl2_insn (operands[0], operands[1]));
   DONE;
 })
 
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 8a4389a..e581645 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -550,8 +550,6 @@ (define_c_enum "unspec" [
   VREV64Q_U
   VQABSQ_S
   VNEGQ_S
-  VMVNQ_S
-  VMVNQ_U
   VDUPQ_N_U
   VDUPQ_N_S
   VCLZQ_U
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index 54078f9..9b18ab2 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -196,3 +196,10 @@ (define_expand "xor3"
   "TARGET_NEON
|| TARGET_HAVE_MVE"
 )
+
+(define_expand "one_cmpl2"
+  [(set (match_operand:VDQ 0 "s_register_operand")
+   (not:VDQ (match_operand:VDQ 1 "s_register_operand")))]
+  "TARGET_NEON
+   || TARGET_HAVE_MVE"
+)
diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vmvn.c 
b/gcc/testsuite/gcc.target/arm/simd/mve-vmvn.c
new file mode 100644
index 000..73e897a
--- /dev/null
+++ b/gcc/testsuite

[PATCH 4/5] arm: Auto-vectorization for MVE: vbic

2020-12-08 Thread Christophe Lyon via Gcc-patches
This patch enables MVE vbic instructions for auto-vectorization.  MVE
vbicq insns in mve.md are modified to use 'and not' instead of unspec
expression.
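
As a point of reference, the source-level shape that an 'and not' pattern
is meant to match can be sketched as a plain C++ loop. This is an
illustration rather than the new testcase, and bit_clear is a made-up name;
whether a given compiler actually vectorizes it depends on target and
options.

```cpp
#include <cassert>
#include <cstdint>

// Element-wise bit clear: dest[i] = a[i] & ~b[i]. A vectorizer that knows
// an "and with complement" vector operation can handle several elements
// per instruction.
inline void bit_clear(int32_t *dest, const int32_t *a, const int32_t *b,
                      int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        dest[i] = a[i] & ~b[i];
}
```

Note that the operation is not commutative: the operand that gets
complemented is the second source, so operand order in the RTL pattern
matters.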

2020-12-03  Christophe Lyon  

gcc/
* config/arm/iterators.md (supf): Remove VBICQ_S and VBICQ_U.
(VBICQ): Remove.
* config/arm/mve.md (mve_vbicq_u): New entry for vbic
instruction using expression and not.
(mve_vbicq_s): New expander.
(mve_vbicq_f): Replace use of unspec by 'and not'.
* config/arm/unspecs.md (VBICQ_S, VBICQ_U, VBICQ_F): Remove.

gcc/testsuite/
* gcc.target/arm/simd/mve-vbic.c: Add tests for vbic.
---
 gcc/config/arm/iterators.md  |  3 +-
 gcc/config/arm/mve.md| 23 ++
 gcc/config/arm/unspecs.md|  3 --
 gcc/testsuite/gcc.target/arm/simd/mve-vbic.c | 65 
 4 files changed, 81 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vbic.c

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index ae597be..b3c8999 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -1237,7 +1237,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U 
"u") (VREV16Q_S "s")
   (VADDLVQ_P_U "u") (VCMPNEQ_U "u") (VCMPNEQ_S "s")
   (VABDQ_M_S "s") (VABDQ_M_U "u") (VABDQ_S "s")
   (VABDQ_U "u") (VADDQ_N_S "s") (VADDQ_N_U "u")
-  (VADDVQ_P_S "s") (VADDVQ_P_U "u") (VBICQ_S "s") (VBICQ_U 
"u")
+  (VADDVQ_P_S "s") (VADDVQ_P_U "u")
   (VBRSRQ_N_S "s") (VBRSRQ_N_U "u") (VCADDQ_ROT270_S "s")
   (VCADDQ_ROT270_U "u") (VCADDQ_ROT90_S "s")
   (VCMPEQQ_S "s") (VCMPEQQ_U "u") (VCADDQ_ROT90_U "u")
@@ -1505,7 +1505,6 @@ (define_int_iterator VABDQ [VABDQ_S VABDQ_U])
 (define_int_iterator VADDQ_N [VADDQ_N_S VADDQ_N_U])
 (define_int_iterator VADDVAQ [VADDVAQ_S VADDVAQ_U])
 (define_int_iterator VADDVQ_P [VADDVQ_P_U VADDVQ_P_S])
-(define_int_iterator VBICQ [VBICQ_S VBICQ_U])
 (define_int_iterator VBRSRQ_N [VBRSRQ_N_U VBRSRQ_N_S])
 (define_int_iterator VCADDQ_ROT270 [VCADDQ_ROT270_S VCADDQ_ROT270_U])
 (define_int_iterator VCADDQ_ROT90 [VCADDQ_ROT90_U VCADDQ_ROT90_S])
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index d7d7c1a..1ec2395 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -928,18 +928,26 @@ (define_expand "mve_vandq_s"
 ;;
 ;; [vbicq_s, vbicq_u])
 ;;
-(define_insn "mve_vbicq_"
+(define_insn "mve_vbicq_u"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
-   (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
-  (match_operand:MVE_2 2 "s_register_operand" "w")]
-VBICQ))
+   (and:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand" "w"))
+ (match_operand:MVE_2 1 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE"
-  "vbic %q0, %q1, %q2"
+  "vbic\t%q0, %q1, %q2"
   [(set_attr "type" "mve_move")
 ])
 
+(define_expand "mve_vbicq_s"
+  [
+   (set (match_operand:MVE_2 0 "s_register_operand")
+   (and:MVE_2 (not:MVE_2 (match_operand:MVE_2 2 "s_register_operand"))
+  (match_operand:MVE_2 1 "s_register_operand")))
+  ]
+  "TARGET_HAVE_MVE"
+)
+
 ;;
 ;; [vbrsrq_n_u, vbrsrq_n_s])
 ;;
@@ -2078,9 +2086,8 @@ (define_insn "mve_vandq_f"
 (define_insn "mve_vbicq_f"
   [
(set (match_operand:MVE_0 0 "s_register_operand" "=w")
-   (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w")
-  (match_operand:MVE_0 2 "s_register_operand" "w")]
-VBICQ_F))
+   (and:MVE_0 (not:MVE_0 (match_operand:MVE_0 1 "s_register_operand" "w"))
+ (match_operand:MVE_0 2 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
   "vbic %q0, %q1, %q2"
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index fe240e8..8a4389a 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -601,7 +601,6 @@ (define_c_enum "unspec" [
   VADDQ_N_S
   VADDVAQ_S
   VADDVQ_P_S
-  VBICQ_S
   VBRSRQ_N_S
   VCADDQ_ROT270_S
   VCADDQ_ROT90_S
@@ -645,7 +644,6 @@ (define_c_enum "unspec" [
   VADDQ_N_U
   VADDVAQ_U
   VADDVQ_P_U
-  VBICQ_U
   VBRSRQ_N_U
   VCADDQ_ROT270_U
   VCADDQ_ROT90_U
@@ -715,7 +713,6 @@ (define_c_enum "unspec" [
   VABDQ_M_U
   VABDQ_F
   VADDQ_N_F
-  VBICQ_F
   VCADDQ_ROT270_F
   VCADDQ_ROT90_F
   VCMPEQQ_F
diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vbic.c 
b/gcc/testsuite/gcc.target/arm/simd/mve-vbic.c
new file mode 100644
index 000..c9a64c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/mve-vbic.c
@@ -0,0 +1,65 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O3" } */
+
+#include 
+
+#define FUNC(SIGN, TYPE, 

[PATCH 1/5] arm: Auto-vectorization for MVE: vand

2020-12-08 Thread Christophe Lyon via Gcc-patches
This patch enables MVE vandq instructions for auto-vectorization.  MVE
vandq insns in mve.md are modified to use 'and' instead of unspec
expression to support and<mode>3.  The and<mode>3 expander is added to
vec-common.md.
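
As a small illustration of why a single insn can serve both the signed
and the unsigned intrinsic, the sketch below applies a bitwise AND to
signed and unsigned bytes.  The function names and_s8 and and_u8 are
hypothetical and do not come from the patch or its testcase; the point
is only that the resulting bit patterns are identical.

```c
#include <stdint.h>

/* Hypothetical example: element-wise AND on signed bytes.  */
void
and_s8 (int8_t *restrict dest, const int8_t *a, const int8_t *b, int n)
{
  for (int i = 0; i < n; i++)
    dest[i] = a[i] & b[i];
}

/* Hypothetical example: the same loop on unsigned bytes.  Bitwise AND
   does not depend on signedness, so both loops produce the same bits.  */
void
and_u8 (uint8_t *restrict dest, const uint8_t *a, const uint8_t *b, int n)
{
  for (int i = 0; i < n; i++)
    dest[i] = a[i] & b[i];
}
```

Feeding both functions inputs with the same bit patterns should give
outputs with the same bit patterns as well.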

2020-12-03  Christophe Lyon  

gcc/
* config/arm/iterators.md (supf): Remove VANDQ_S and VANDQ_U.
(VANDQ): Remove.
(VDQ): Add TARGET_HAVE_MVE condition where relevant.
* config/arm/mve.md (mve_vandq_u): New entry for vand
instruction using expression 'and'.
(mve_vandq_s): New expander.
(mve_vandq_f): Use 'and' code instead of unspec.
* config/arm/neon.md (and3): Rename into and3_neon.
* config/arm/predicates.md (imm_for_neon_inv_logic_operand):
Enable for MVE.
* config/arm/unspecs.md (VANDQ_S, VANDQ_U, VANDQ_F): Remove.
* config/arm/vec-common.md (and3): New expander.

gcc/testsuite/
* gcc.target/arm/simd/mve-vand.c: New test.
---
 gcc/config/arm/iterators.md  | 11 +++--
 gcc/config/arm/mve.md| 40 +-
 gcc/config/arm/neon.md   |  2 +-
 gcc/config/arm/predicates.md |  2 +-
 gcc/config/arm/unspecs.md|  3 --
 gcc/config/arm/vec-common.md |  8 
 gcc/testsuite/gcc.target/arm/simd/mve-vand.c | 63 
 7 files changed, 109 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vand.c

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 592af35..badad2b 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -147,7 +147,12 @@ (define_mode_iterator VW [V8QI V4HI V2SI])
 (define_mode_iterator VN [V8HI V4SI V2DI])
 
 ;; All supported vector modes (except singleton DImode).
-(define_mode_iterator VDQ [V8QI V16QI V4HI V8HI V2SI V4SI V4HF V8HF V2SF V4SF 
V2DI])
+(define_mode_iterator VDQ [(V8QI "!TARGET_HAVE_MVE") V16QI
+  (V4HI "!TARGET_HAVE_MVE") V8HI
+  (V2SI "!TARGET_HAVE_MVE") V4SI
+  (V4HF "!TARGET_HAVE_MVE") V8HF
+  (V2SF "!TARGET_HAVE_MVE") V4SF
+  (V2DI "!TARGET_HAVE_MVE")])
 
 ;; All supported floating-point vector modes (except V2DF).
 (define_mode_iterator VF [(V4HF "TARGET_NEON_FP16INST")
@@ -1232,8 +1237,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U 
"u") (VREV16Q_S "s")
   (VADDLVQ_P_U "u") (VCMPNEQ_U "u") (VCMPNEQ_S "s")
   (VABDQ_M_S "s") (VABDQ_M_U "u") (VABDQ_S "s")
   (VABDQ_U "u") (VADDQ_N_S "s") (VADDQ_N_U "u")
-  (VADDVQ_P_S "s") (VADDVQ_P_U "u") (VANDQ_S "s")
-  (VANDQ_U "u") (VBICQ_S "s") (VBICQ_U "u")
+  (VADDVQ_P_S "s") (VADDVQ_P_U "u") (VBICQ_S "s") (VBICQ_U 
"u")
   (VBRSRQ_N_S "s") (VBRSRQ_N_U "u") (VCADDQ_ROT270_S "s")
   (VCADDQ_ROT270_U "u") (VCADDQ_ROT90_S "s")
   (VCMPEQQ_S "s") (VCMPEQQ_U "u") (VCADDQ_ROT90_U "u")
@@ -1501,7 +1505,6 @@ (define_int_iterator VABDQ [VABDQ_S VABDQ_U])
 (define_int_iterator VADDQ_N [VADDQ_N_S VADDQ_N_U])
 (define_int_iterator VADDVAQ [VADDVAQ_S VADDVAQ_U])
 (define_int_iterator VADDVQ_P [VADDVQ_P_U VADDVQ_P_S])
-(define_int_iterator VANDQ [VANDQ_U VANDQ_S])
 (define_int_iterator VBICQ [VBICQ_S VBICQ_U])
 (define_int_iterator VBRSRQ_N [VBRSRQ_N_U VBRSRQ_N_S])
 (define_int_iterator VCADDQ_ROT270 [VCADDQ_ROT270_S VCADDQ_ROT270_U])
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index ecbaaa9..238c828 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -894,17 +894,36 @@ (define_insn "mve_vaddvq_p_"
 ;;
 ;; [vandq_u, vandq_s])
 ;;
-(define_insn "mve_vandq_"
+;; signed and unsigned versions are the same: define the unsigned
+;; insn, and use an expander for the signed one as we still reference
+;; both names from arm_mve.h.
+;; We use the same code as in neon.md (TODO: avoid this duplication).
+(define_insn "mve_vandq_u"
+  [
+   (set (match_operand:MVE_2 0 "s_register_operand" "=w,w")
+   (and:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w,0")
+  (match_operand:MVE_2 2 "neon_inv_logic_op2" "w,DL")))
+  ]
+  "TARGET_HAVE_MVE"
+  {
+switch (which_alternative)
+  {
+  case 0: return "vand\t%q0, %q1, %q2";
+  case 1: return neon_output_logic_immediate ("vand", &operands[2],
+   mode, 1, VALID_NEON_QREG_MODE (mode));
+  default: gcc_unreachable ();
+}
+  }
+  [(set_attr "type" "mve_move")
+])
+(define_expand "mve_vandq_s"
   [
-   (set (match_operand:MVE_2 0 "s_register_operand" "=w")
-   (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
-  (match_operand:MVE_2 2 "s_register_operand" "w")]
-VANDQ))
+   (set (match_operand:MVE_2 0 "s_register_operand")
+   

[PATCH 3/5] arm: Auto-vectorization for MVE: veor

2020-12-08 Thread Christophe Lyon via Gcc-patches
This patch enables MVE veorq instructions for auto-vectorization.  MVE
veorq insns in mve.md are modified to use xor instead of unspec
expression to support xor<mode>3.  The xor<mode>3 expander is added to
vec-common.md.

2020-12-04  Christophe Lyon  

gcc/
* config/arm/iterators.md (supf): Remove VEORQ_S and VEORQ_U.
(VEORQ): Remove.
* config/arm/mve.md (mve_veorq_u): New entry for veor
instruction using expression xor.
(mve_veorq_s): New expander.
(mve_veorq_f): Use 'xor' code instead of unspec.
* config/arm/neon.md (xor3): Renamed into xor3_neon.
* config/arm/unspecs.md (VEORQ_S, VEORQ_U, VEORQ_F): Remove.
* config/arm/vec-common.md (xor3): New expander.

gcc/testsuite/
* gcc.target/arm/simd/mve-veor.c: Add tests for veor.
---
 gcc/config/arm/iterators.md  |  3 +-
 gcc/config/arm/mve.md| 22 ++
 gcc/config/arm/neon.md   |  2 +-
 gcc/config/arm/unspecs.md|  3 --
 gcc/config/arm/vec-common.md |  8 
 gcc/testsuite/gcc.target/arm/simd/mve-veor.c | 61 
 6 files changed, 85 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-veor.c

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f0e1d60..ae597be 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -1242,7 +1242,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U 
"u") (VREV16Q_S "s")
   (VCADDQ_ROT270_U "u") (VCADDQ_ROT90_S "s")
   (VCMPEQQ_S "s") (VCMPEQQ_U "u") (VCADDQ_ROT90_U "u")
   (VCMPEQQ_N_S "s") (VCMPEQQ_N_U "u") (VCMPNEQ_N_S "s")
-  (VCMPNEQ_N_U "u") (VEORQ_S "s") (VEORQ_U "u")
+  (VCMPNEQ_N_U "u")
   (VHADDQ_N_S "s") (VHADDQ_N_U "u") (VHADDQ_S "s")
   (VHADDQ_U "u") (VHSUBQ_N_S "s")  (VHSUBQ_N_U "u")
   (VHSUBQ_S "s") (VMAXQ_S "s") (VMAXQ_U "u") (VHSUBQ_U "u")
@@ -1512,7 +1512,6 @@ (define_int_iterator VCADDQ_ROT90 [VCADDQ_ROT90_U 
VCADDQ_ROT90_S])
 (define_int_iterator VCMPEQQ [VCMPEQQ_U VCMPEQQ_S])
 (define_int_iterator VCMPEQQ_N [VCMPEQQ_N_S VCMPEQQ_N_U])
 (define_int_iterator VCMPNEQ_N [VCMPNEQ_N_U VCMPNEQ_N_S])
-(define_int_iterator VEORQ [VEORQ_U VEORQ_S])
 (define_int_iterator VHADDQ [VHADDQ_S VHADDQ_U])
 (define_int_iterator VHADDQ_N [VHADDQ_N_U VHADDQ_N_S])
 (define_int_iterator VHSUBQ [VHSUBQ_S VHSUBQ_U])
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 0fcbe62..d7d7c1a 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1213,17 +1213,24 @@ (define_insn "mve_vcmpneq_n_"
 ;;
 ;; [veorq_u, veorq_s])
 ;;
-(define_insn "mve_veorq_"
+(define_insn "mve_veorq_u"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
-   (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
-  (match_operand:MVE_2 2 "s_register_operand" "w")]
-VEORQ))
+   (xor:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w")
+  (match_operand:MVE_2 2 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE"
-  "veor %q0, %q1, %q2"
+  "veor\t%q0, %q1, %q2"
   [(set_attr "type" "mve_move")
 ])
+(define_expand "mve_veorq_s"
+  [
+   (set (match_operand:MVE_2 0 "s_register_operand")
+   (xor:MVE_2 (match_operand:MVE_2 1 "s_register_operand")
+  (match_operand:MVE_2 2 "s_register_operand")))
+  ]
+  "TARGET_HAVE_MVE"
+)
 
 ;;
 ;; [vhaddq_n_u, vhaddq_n_s])
@@ -2416,9 +2423,8 @@ (define_insn "mve_vcvttq_f16_f32v8hf"
 (define_insn "mve_veorq_f"
   [
(set (match_operand:MVE_0 0 "s_register_operand" "=w")
-   (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w")
-  (match_operand:MVE_0 2 "s_register_operand" "w")]
-VEORQ_F))
+   (xor:MVE_0 (match_operand:MVE_0 1 "s_register_operand" "w")
+  (match_operand:MVE_0 2 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
   "veor %q0, %q1, %q2"
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 669c34d..e1263b0 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -747,7 +747,7 @@ (define_insn "bic3_neon"
   [(set_attr "type" "neon_logic")]
 )
 
-(define_insn "xor3"
+(define_insn "xor3_neon"
   [(set (match_operand:VDQ 0 "s_register_operand" "=w")
(xor:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
 (match_operand:VDQ 2 "s_register_operand" "w")))]
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index c2076c9..fe240e8 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -608,7 +608,6 @@ (define_c_enum "unspec" [
   VCMPEQQ_S
   VCMPEQQ_N_S
   VCMPNEQ_N_S
-  VEORQ_S
   VHADDQ_S
   VHADDQ_N_S
   VHSUBQ_S
@@ -653,7 +652,6 @@ (define_c_enum "unspec" [
   VCMPEQQ_U
   VCMP

c++: Add module includes

2020-12-08 Thread Nathan Sidwell
This adds MODULE_VERSION to the makefile, so that in development it is
generated from the modification date of the module.cc file.  It also adds
the include files to module.cc.
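
To make the date-based scheme concrete, here is a rough sketch of how a
"yymmdd-HHMM" stamp becomes a number that passes the IS_EXPERIMENTAL
test.  IS_EXPERIMENTAL is copied from the patch below; the helper
stamp_to_version is hypothetical and only mimics what the $(subst -,,...)
in Make-lang.in does to the output of date.

```c
/* Copied from the patch: date stamps are far larger than 1 << 20, so
   they are distinguishable from small release version numbers.  */
#define IS_EXPERIMENTAL(V) ((V) >= (1U << 20))

/* Hypothetical helper, not in the patch: drop every non-digit (the '-'
   in "yymmdd-HHMM") and read the remaining digits as one unsigned
   decimal number, as the Makefile substitution would.  */
unsigned
stamp_to_version (const char *stamp)
{
  unsigned v = 0;
  for (; *stamp; stamp++)
    if (*stamp >= '0' && *stamp <= '9')
      v = v * 10 + (unsigned) (*stamp - '0');
  return v;
}
```

For instance, a module.cc last modified on 2020-12-08 at 15:30 would give
the stamp "201208-1530", which this sketch turns into 2012081530U.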


gcc/cp/
* Make-lang.in (MODULE_VERSION): Define.
* module.cc: Add includes.

pushing to trunk

--
Nathan Sidwell
diff --git i/gcc/cp/Make-lang.in w/gcc/cp/Make-lang.in
index ebfdc902192..d7dc0dec2b8 100644
--- i/gcc/cp/Make-lang.in
+++ w/gcc/cp/Make-lang.in
@@ -57,6 +57,15 @@ CFLAGS-cp/g++spec.o += $(DRIVER_DEFINES)
 CFLAGS-cp/module.o += -DHOST_MACHINE=\"$(host)\" \
 	-DTARGET_MACHINE=\"$(target)\"
 
+ifeq ($(DEVPHASE_c),experimental)
+# Some date's don't grok 'r', if so, simply use today's
+# date (don't bootstrap at midnight).
+MODULE_VERSION := $(shell date -r $(srcdir)/cp/module.cc '+%y%m%d-%H%M' \
+  2>/dev/null || date '+%y%m%d-' 2>/dev/null || echo 0)
+
+CFLAGS-cp/module.o += -DMODULE_VERSION='($(subst -,,$(MODULE_VERSION))U)'
+endif
+
 # Create the compiler driver for g++.
 GXX_OBJS = $(GCC_OBJS) cp/g++spec.o
 xg++$(exeext): $(GXX_OBJS) $(EXTRA_GCC_OBJS) libcommon-target.a $(LIBDEPS)
diff --git i/gcc/cp/module.cc w/gcc/cp/module.cc
index 596061b3c49..f250d6c1819 100644
--- i/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -18,4 +18,49 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
-/* This file intentionally left empty.  */
+/* This file intentionally left empty of all but barest minium.  */
+
+/* In expermental (trunk) sources, MODULE_VERSION is a #define passed
+   in from the Makefile.  It records the modification date of the
+   source directory -- that's the only way to stay sane.  In release
+   sources, we (plan to) use the compiler's major.minor versioning.
+   While the format might not change between at minor versions, it
+   seems simplest to tie the two together.  There's no concept of
+   inter-version compatibility.  */
+#define IS_EXPERIMENTAL(V) ((V) >= (1U << 20))
+#define MODULE_MAJOR(V) ((V) / 1)
+#define MODULE_MINOR(V) ((V) % 1)
+#define EXPERIMENT(A,B) (IS_EXPERIMENTAL (MODULE_VERSION) ? (A) : (B))
+#ifndef MODULE_VERSION
+#error "Shtopp! What are you doing? This is not ready yet."
+#include "bversion.h"
+#define MODULE_VERSION (BUILDING_GCC_MAJOR * 1U + BUILDING_GCC_MINOR)
+#elif !IS_EXPERIMENTAL (MODULE_VERSION)
+#error "This is not the version I was looking for."
+#endif
+
+#define _DEFAULT_SOURCE 1 /* To get TZ field of struct tm, if available.  */
+#include "config.h"
+
+#include "system.h"
+#include "coretypes.h"
+#include "cp-tree.h"
+#include "timevar.h"
+#include "stringpool.h"
+#include "dumpfile.h"
+#include "bitmap.h"
+#include "cgraph.h"
+#include "tree-iterator.h"
+#include "cpplib.h"
+#include "mkdeps.h"
+#include "incpath.h"
+#include "libiberty.h"
+#include "stor-layout.h"
+#include "version.h"
+#include "tree-diagnostic.h"
+#include "toplev.h"
+#include "opts.h"
+#include "attribs.h"
+#include "intl.h"
+#include "langhooks.h"
+


V3 [PATCH 1/2] Switch to a new section if the SECTION_RETAIN bit doesn't match

2020-12-08 Thread H.J. Lu via Gcc-patches
When definitions marked with used attribute and unmarked definitions are
placed in the section with the same name, switch to a new section if the
SECTION_RETAIN bit doesn't match.
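
The check can be pictured with the simplified model below.  This is only
a sketch under stated assumptions: struct sketch_section, the SKETCH_*
flag values and both function names are invented here and are not GCC's
real definitions.  The flag update mirrors what the varasm.c hunk in
patch 2/2 shows for switch_to_section.

```c
#include <stdbool.h>

/* Hypothetical flag values; GCC's real SECTION_RETAIN and
   SECTION_DECLARED constants are not reproduced here.  */
#define SKETCH_SECTION_DECLARED 0x1u
#define SKETCH_SECTION_RETAIN   0x2u

/* Hypothetical stand-in for a named section.  */
struct sketch_section
{
  const char *name;
  unsigned flags;
};

/* True when the section's retain bit disagrees with whether the
   definition being emitted carries the used attribute.  */
bool
retain_mismatch (const struct sketch_section *sec, bool decl_is_used)
{
  bool section_retained = (sec->flags & SKETCH_SECTION_RETAIN) != 0;
  return section_retained != decl_is_used;
}

/* On a mismatch, adjust the flags so that a new section with the same
   name is started: set the retain bit for a used definition, otherwise
   clear both the retain and the declared bits.  */
void
retarget_section (struct sketch_section *sec, bool decl_is_used)
{
  if (decl_is_used)
    sec->flags |= SKETCH_SECTION_RETAIN;
  else
    sec->flags &= ~(SKETCH_SECTION_RETAIN | SKETCH_SECTION_DECLARED);
}
```

In this model, emitting an unmarked definition into a section that
currently has the retain bit set reports a mismatch, and the section is
then re-declared without the bit.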

gcc/

PR target/98146
* output.h (switch_to_section): Add a tree argument, default to
nullptr.
* varasm.c (get_section): If the SECTION_RETAIN bit doesn't match,
return and switch to a new section later.
(assemble_start_function): Pass decl to switch_to_section.
(assemble_variable): Likewise.
(switch_to_section): If the SECTION_RETAIN bit doesn't match,
switch to a new section.

gcc/testsuite/

PR target/98146
* c-c++-common/attr-used-5.c: New test.
* c-c++-common/attr-used-6.c: Likewise.
* c-c++-common/attr-used-7.c: Likewise.
* c-c++-common/attr-used-8.c: Likewise.
* c-c++-common/attr-used-9.c: Likewise.
---
 gcc/output.h |  2 +-
 gcc/testsuite/c-c++-common/attr-used-5.c | 26 
 gcc/testsuite/c-c++-common/attr-used-6.c | 26 
 gcc/testsuite/c-c++-common/attr-used-7.c |  8 +++
 gcc/testsuite/c-c++-common/attr-used-8.c |  8 +++
 gcc/testsuite/c-c++-common/attr-used-9.c | 28 ++
 gcc/varasm.c | 30 
 7 files changed, 123 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-5.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-6.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-7.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-8.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-9.c

diff --git a/gcc/output.h b/gcc/output.h
index fa8ace1f394..1f9af46da1d 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -548,7 +548,7 @@ extern void switch_to_other_text_partition (void);
 extern section *get_cdtor_priority_section (int, bool);
 
 extern bool unlikely_text_section_p (section *);
-extern void switch_to_section (section *);
+extern void switch_to_section (section *, tree = nullptr);
 extern void output_section_asm_op (const void *);
 
 extern void record_tm_clone_pair (tree, tree);
diff --git a/gcc/testsuite/c-c++-common/attr-used-5.c 
b/gcc/testsuite/c-c++-common/attr-used-5.c
new file mode 100644
index 000..9fc0d3834e9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-used-5.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+
+struct dtv_slotinfo_list
+{
+  struct dtv_slotinfo_list *next;
+};
+
+extern struct dtv_slotinfo_list *list;
+
+static int __attribute__ ((section ("__libc_freeres_fn")))
+free_slotinfo (struct dtv_slotinfo_list **elemp)
+{
+  if (!free_slotinfo (&(*elemp)->next))
+return 0;
+  return 1;
+}
+
+__attribute__ ((used, section ("__libc_freeres_fn")))
+static void free_mem (void)
+{
+  free_slotinfo (&list);
+}
+
+/* { dg-final { scan-assembler "__libc_freeres_fn,\"ax\"" { target 
R_flag_in_section } } } */
+/* { dg-final { scan-assembler "__libc_freeres_fn,\"axR\"" { target 
R_flag_in_section } } } */
diff --git a/gcc/testsuite/c-c++-common/attr-used-6.c 
b/gcc/testsuite/c-c++-common/attr-used-6.c
new file mode 100644
index 000..0cb82ade5a9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-used-6.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+
+struct dtv_slotinfo_list
+{
+  struct dtv_slotinfo_list *next;
+};
+
+extern struct dtv_slotinfo_list *list;
+
+static int __attribute__ ((used, section ("__libc_freeres_fn")))
+free_slotinfo (struct dtv_slotinfo_list **elemp)
+{
+  if (!free_slotinfo (&(*elemp)->next))
+return 0;
+  return 1;
+}
+
+__attribute__ ((section ("__libc_freeres_fn")))
+void free_mem (void)
+{
+  free_slotinfo (&list);
+}
+
+/* { dg-final { scan-assembler "__libc_freeres_fn,\"ax\"" { target 
R_flag_in_section } } } */
+/* { dg-final { scan-assembler "__libc_freeres_fn,\"axR\"" { target 
R_flag_in_section } } } */
diff --git a/gcc/testsuite/c-c++-common/attr-used-7.c 
b/gcc/testsuite/c-c++-common/attr-used-7.c
new file mode 100644
index 000..fba2706ffc1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-used-7.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+
+int __attribute__((used,section(".data.foo"))) foo2 = 2;
+int __attribute__((section(".data.foo"))) foo1 = 1;
+
+/* { dg-final { scan-assembler ".data.foo,\"aw\"" { target R_flag_in_section } 
} } */
+/* { dg-final { scan-assembler ".data.foo,\"awR\"" { target R_flag_in_section 
} } } */
diff --git a/gcc/testsuite/c-c++-common/attr-used-8.c 
b/gcc/testsuite/c-c++-common/attr-used-8.c
new file mode 100644
index 000..4da4aabe573
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-used-8.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+
+int __attribute__((section(".data.foo"))) foo1 = 1;
+int __attribute__((used,section(".data.foo"))) foo2 = 2;
+
+/* { dg-fin

V3 [PATCH 2/2] Warn used and not used symbols in section with the same name

2020-12-08 Thread H.J. Lu via Gcc-patches
When SECTION_RETAIN is used, issue a warning when a symbol without used
attribute and a symbol with used attribute are placed in the section with
the same name, like

int __attribute__((used,section(".data.foo"))) foo2 = 2;
int __attribute__((section(".data.foo"))) foo1 = 1;

since the assembler will put them in different sections with the same
section name.

gcc/

PR target/98146
* varasm.c (switch_to_section): Warn when a symbol without used
attribute and a symbol with used attribute are placed in the
section with the same name.

gcc/testsuite/

PR target/98146
* c-c++-common/attr-used-5.c: Updated.
* c-c++-common/attr-used-6.c: Likewise.
* c-c++-common/attr-used-7.c: Likewise.
* c-c++-common/attr-used-8.c: Likewise.
---
 gcc/testsuite/c-c++-common/attr-used-5.c |  1 +
 gcc/testsuite/c-c++-common/attr-used-6.c |  1 +
 gcc/testsuite/c-c++-common/attr-used-7.c |  1 +
 gcc/testsuite/c-c++-common/attr-used-8.c |  1 +
 gcc/varasm.c | 22 +++---
 5 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/attr-used-5.c 
b/gcc/testsuite/c-c++-common/attr-used-5.c
index 9fc0d3834e9..ba59326e452 100644
--- a/gcc/testsuite/c-c++-common/attr-used-5.c
+++ b/gcc/testsuite/c-c++-common/attr-used-5.c
@@ -10,6 +10,7 @@ extern struct dtv_slotinfo_list *list;
 
 static int __attribute__ ((section ("__libc_freeres_fn")))
 free_slotinfo (struct dtv_slotinfo_list **elemp)
+/* { dg-warning "'.*' without 'used' attribute and '.*' with 'used' attribute 
are placed in a section with the same name" "" { target R_flag_in_section } .-1 
} */
 {
   if (!free_slotinfo (&(*elemp)->next))
 return 0;
diff --git a/gcc/testsuite/c-c++-common/attr-used-6.c 
b/gcc/testsuite/c-c++-common/attr-used-6.c
index 0cb82ade5a9..5d20f875bf0 100644
--- a/gcc/testsuite/c-c++-common/attr-used-6.c
+++ b/gcc/testsuite/c-c++-common/attr-used-6.c
@@ -18,6 +18,7 @@ free_slotinfo (struct dtv_slotinfo_list **elemp)
 
 __attribute__ ((section ("__libc_freeres_fn")))
 void free_mem (void)
+/* { dg-warning "'.*' without 'used' attribute and '.*' with 'used' attribute 
are placed in a section with the same name" "" { target R_flag_in_section } .-1 
} */
 {
   free_slotinfo (&list);
 }
diff --git a/gcc/testsuite/c-c++-common/attr-used-7.c 
b/gcc/testsuite/c-c++-common/attr-used-7.c
index fba2706ffc1..75576bcabe5 100644
--- a/gcc/testsuite/c-c++-common/attr-used-7.c
+++ b/gcc/testsuite/c-c++-common/attr-used-7.c
@@ -3,6 +3,7 @@
 
 int __attribute__((used,section(".data.foo"))) foo2 = 2;
 int __attribute__((section(".data.foo"))) foo1 = 1;
+/* { dg-warning "'.*' without 'used' attribute and '.*' with 'used' attribute 
are placed in a section with the same name" "" { target R_flag_in_section } .-1 
} */
 
 /* { dg-final { scan-assembler ".data.foo,\"aw\"" { target R_flag_in_section } 
} } */
 /* { dg-final { scan-assembler ".data.foo,\"awR\"" { target R_flag_in_section 
} } } */
diff --git a/gcc/testsuite/c-c++-common/attr-used-8.c 
b/gcc/testsuite/c-c++-common/attr-used-8.c
index 4da4aabe573..e4982db1044 100644
--- a/gcc/testsuite/c-c++-common/attr-used-8.c
+++ b/gcc/testsuite/c-c++-common/attr-used-8.c
@@ -2,6 +2,7 @@
 /* { dg-options "-Wall -O2" } */
 
 int __attribute__((section(".data.foo"))) foo1 = 1;
+/* { dg-warning "'.*' without 'used' attribute and '.*' with 'used' attribute 
are placed in a section with the same name" "" { target R_flag_in_section } .-1 
} */
 int __attribute__((used,section(".data.foo"))) foo2 = 2;
 
 /* { dg-final { scan-assembler ".data.foo,\"aw\"" { target R_flag_in_section } 
} } */
diff --git a/gcc/varasm.c b/gcc/varasm.c
index c5ea29c4e4c..346b3bea890 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -7732,11 +7732,27 @@ switch_to_section (section *new_section, tree decl)
{
  /* If the SECTION_RETAIN bit doesn't match, switch to a new
 section.  */
+ tree used_decl, no_used_decl;
+
  if (DECL_PRESERVE_P (decl))
-   new_section->common.flags |= SECTION_RETAIN;
+   {
+ new_section->common.flags |= SECTION_RETAIN;
+ used_decl = decl;
+ no_used_decl = new_section->named.decl;
+   }
  else
-   new_section->common.flags &= ~(SECTION_RETAIN
-  | SECTION_DECLARED);
+   {
+ new_section->common.flags &= ~(SECTION_RETAIN
+| SECTION_DECLARED);
+ used_decl = new_section->named.decl;
+ no_used_decl = decl;
+   }
+ warning (OPT_Wattributes,
+  "%+qD without % attribute and %qD with "
+  "% attribute are placed in a section with "
+  "the same name", no_used_decl, used_decl);
+ inform (DECL_SOURCE_LOCATION (used_decl),
+ "%qD was declared here", used_decl);
}
   else

V3 [PATCH 0/2] Switch to a new section if the SECTION_RETAIN bit doesn't match

2020-12-08 Thread H.J. Lu via Gcc-patches
When SECTION_RETAIN is used, definitions marked with used attribute and
unmarked definitions are placed in a section with the same name.  Instead
of issuing an error:

[hjl@gnu-cfl-2 gcc]$ /usr/gcc-11.0.0-x32/bin/gcc -S c.c 
-fdiagnostics-plain-output
c.c:2:49: error: ‘foo1’ causes a section type conflict with ‘foo2’
c.c:1:54: note: ‘foo2’ was declared here
[hjl@gnu-cfl-2 gcc]$

the first patch switches to a new section if the SECTION_RETAIN bit
doesn't match.  The second optional patch issues a warning:

[hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S c.c
c.c:2:49: warning: ‘foo1’ without ‘used’ attribute and ‘foo2’ with ‘used’ 
attribute are placed in a section with the same name [-Wattributes]
2 | const int __attribute__((section(".data.foo"))) foo1 = 1;
  | ^~~~
c.c:1:54: note: ‘foo2’ was declared here
1 | const int __attribute__((used,section(".data.foo"))) foo2 = 2;
  |
[hjl@gnu-cfl-2 gcc]$

Changes from V2:

1. Add (new_section->common.flags & SECTION_NAMED) check since
SHF_GNU_RETAIN section must be named.
2. Move c-c++-common/attr-used-9.c to the first patch since there are
no new warnings.
3. Check new warnings only for R_flag_in_section target.

H.J. Lu (2):
  Switch to a new section if the SECTION_RETAIN bit doesn't match
  Warn used and not used symbols in section with the same name

 gcc/output.h |  2 +-
 gcc/testsuite/c-c++-common/attr-used-5.c | 27 ++
 gcc/testsuite/c-c++-common/attr-used-6.c | 27 ++
 gcc/testsuite/c-c++-common/attr-used-7.c |  9 +
 gcc/testsuite/c-c++-common/attr-used-8.c |  9 +
 gcc/testsuite/c-c++-common/attr-used-9.c | 28 +++
 gcc/varasm.c | 46 +---
 7 files changed, 143 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-5.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-6.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-7.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-8.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-9.c

-- 
2.28.0



[COMMITTED] testsuite: i386: Require avx512vpopcntdq in two tests

2020-12-08 Thread Rainer Orth
Two recent AVX512 tests FAIL on Solaris/x86 with /bin/as:

FAIL: gcc.target/i386/avx512vpopcntdq-pr97770-2.c (test for excess errors)

Excess errors:
Assembler: avx512vpopcntdq-pr97770-2.c
"/var/tmp//ccM4Gt1a.s", line 171 : Illegal mnemonic
Near line: "vpopcntd(%eax), %zmm0"
"/var/tmp//ccM4Gt1a.s", line 171 : Syntax error
Near line: "vpopcntd(%eax), %zmm0"

FAIL: gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c (test for excess errors)

similarly.

Fixed as follows.

Tested on i386-pc-solaris2.11 with as and gas and x86_64-pc-linux-gnu.
Installed on master.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2020-12-07  Rainer Orth  

gcc/testsuite:
* gcc.target/i386/avx512vpopcntdq-pr97770-2.c: Require
avx512vpopcntdq support.
* gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c: Require
avx512vpopcntdq, avx512vl support.

# HG changeset patch
# Parent  bb5cf0aa086befe7ea1de833e2dcb5f53d638eb8
testsuite: i386: Require avx512vpopcntdq in two tests

diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-pr97770-2.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-pr97770-2.c
--- a/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-pr97770-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-pr97770-2.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -mavx512vpopcntdq" } */
+/* { dg-require-effective-target avx512vpopcntdq } */
 
 #define AVX512VPOPCNTDQ
 
diff --git a/gcc/testsuite/gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c b/gcc/testsuite/gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c
--- a/gcc/testsuite/gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c
@@ -1,5 +1,7 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -mavx512vpopcntdq -mavx512vl" } */
+/* { dg-require-effective-target avx512vpopcntdq } */
+/* { dg-require-effective-target avx512vl } */
 
 #define AVX512VL
 #define AVX512F_LEN 256


[COMMITTED] testsuite: i386: Require ifunc support in gcc.target/i386/pr98100.c

2020-12-08 Thread Rainer Orth
The new gcc.target/i386/pr98100.c test FAILs on Solaris/x86:

FAIL: gcc.target/i386/pr98100.c (test for excess errors)

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr98100.c:6:1: 
error: the call requires 'ifunc', which is not supported by this target

Fixed as follows.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.  Installed on
master.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2020-12-07  Rainer Orth  

gcc/testsuite:
* gcc.target/i386/pr98100.c: Require ifunc support.

# HG changeset patch
# Parent  00eb61f16bb6f4eab0547064ee5b74c25add665a
testsuite: i386: Require ifunc support in gcc.target/i386/pr98100.c

diff --git a/gcc/testsuite/gcc.target/i386/pr98100.c b/gcc/testsuite/gcc.target/i386/pr98100.c
--- a/gcc/testsuite/gcc.target/i386/pr98100.c
+++ b/gcc/testsuite/gcc.target/i386/pr98100.c
@@ -1,6 +1,7 @@
 /* PR target/98100 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -mno-avx -fvar-tracking-assignments -g0" } */
+/* { dg-require-ifunc "" } */
 
 __attribute__((target_clones("default","avx2"))) void
 foo ()


Re: [Patch] Fortran: Add 'omp scan' support of OpenMP 5.0

2020-12-08 Thread Jakub Jelinek via Gcc-patches
On Tue, Dec 08, 2020 at 01:13:07PM +0100, Tobias Burnus wrote:
> + if (list == OMP_LIST_REDUCTION)
> +   has_inscan = true;

This looks weird, I would have expected
if (list == OMP_LIST_REDUCTION_INSCAN)

> @@ -6151,6 +6203,28 @@ gfc_resolve_omp_do_blocks (gfc_code *code, 
> gfc_namespace *ns)
>   }
>if (i < omp_current_do_collapse || omp_current_do_collapse <= 0)
>   omp_current_do_collapse = 1;
> +  if (code->ext.omp_clauses->lists[OMP_LIST_REDUCTION_INSCAN])
> + {
> +   locus *loc
> + = &code->ext.omp_clauses->lists[OMP_LIST_REDUCTION_INSCAN]->where;
> +   if (code->ext.omp_clauses->ordered)
> + gfc_error ("ORDERED clause specified together with % "
> +"REDUCTION clause at %L", loc);
> +   if (code->ext.omp_clauses->sched_kind != OMP_SCHED_NONE)
> + gfc_error ("SCHEDULE clause specified together with % "
> +"REDUCTION clause at %L", loc);
> +   if (!c->block
> +   || !c->block->next
> +   || !c->block->next->next
> +   || c->block->next->next->op != EXEC_OMP_SCAN
> +   || !c->block->next->next->next
> +   || c->block->next->next->next->next)
> + gfc_error ("With INSCAN at %L, expected loop body with !$OMP SCAN "
> +"between two structured-block-sequences", loc);
> +   else
> + /* Mark as checked; flag will be unset later.  */
> + c->block->next->next->ext.omp_clauses->if_present = true;
> + }

So you initially accept !$omp scan everywhere and only later complain if it
is misplaced?  I think e.g. for !$omp section I used to hardcode it in
parse_omp_structured_block - allow it only there and nowhere else:
  else if (st == ST_OMP_SECTION
   && (omp_st == ST_OMP_SECTIONS
   || omp_st == ST_OMP_PARALLEL_SECTIONS))

> @@ -7046,6 +7122,14 @@ gfc_resolve_omp_directive (gfc_code *code, 
> gfc_namespace *ns ATTRIBUTE_UNUSED)
>   gfc_error ("OMP CRITICAL at %L with HINT clause requires a NAME, "
>  "except when omp_sync_hint_none is used", &code->loc);
>break;
> +case EXEC_OMP_SCAN:
> +  /* Flag is only used to checking, hence, it is unset afterwards.  */
> +  if (!code->ext.omp_clauses->if_present)

Isn't if_present used also for OpenACC?  Then can't it with -fopenmp
-fopenacc allow
!$acc ... if_present...
!$omp scan inclusive(...)
!$add end ...
?
> + gfc_error ("Unexpected !$OMP SCAN at %L outside loop construct with "
> +"% REDUCTION clause", &code->loc);
> +  code->ext.omp_clauses->if_present = false;
> +  resolve_omp_clauses (code, code->ext.omp_clauses, ns);
> +  break;
>  default:
>break;
>  }

Otherwise LGTM.

Jakub



[Patch] Fortran: Add 'omp scan' support of OpenMP 5.0

2020-12-08 Thread Tobias Burnus

In a previous patch, the 'inscan' reduction-clause modifier was added.
This patch adds the associated 'omp scan' for two reasons:

First, to make it actually usable and, secondly, to avoid some corner
cases where 'inscan' slips through without the required 'sorry'
(as it can happen with the current code).
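
For orientation, here is a minimal sketch of the construct in question,
written in C rather than Fortran.  The function name prefix_sum_inclusive
is made up for illustration and is not part of the patch.  It shows the
shape the resolver checks for: a loop with an 'inscan' reduction whose body
is split by the 'scan' directive into two structured-block-sequences.
Without OpenMP flags the pragmas are ignored and the loop should compute
the same result serially.

```c
#include <stddef.h>

/* Hypothetical helper: out[i] = in[0] + ... + in[i].  */
void
prefix_sum_inclusive (const int *in, int *out, size_t n)
{
  int sum = 0;
#pragma omp simd reduction (inscan, +: sum)
  for (size_t i = 0; i < n; i++)
    {
      sum += in[i];		/* Input phase: update the scan variable.  */
#pragma omp scan inclusive (sum)
      out[i] = sum;		/* Scan phase: read the prefix value.  */
    }
}
```

Calling it on the four values 1, 2, 3, 4 fills the output with the
running totals 1, 3, 6, 10.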

(The change to 'gfc_match_omp_taskgroup' is an unrelated cleanup.)

This still works with the current list OMP_LIST_* and adds two more
items; I still need to update my previous patch to avoid carrying
around this long list.

The testcases are mostly converted C/C++ test cases; I moved some
code as some errors are FE and some are ME errors and currently
ME errors only show up if there are no FE errors.

OK?

Tobias

Fortran: Add 'omp scan' support of OpenMP 5.0

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_clauses, show_omp_node,
	show_code_node): Handle OMP SCAN.
	* gfortran.h (enum gfc_statement): Add ST_OMP_SCAN.
	(enum): Add OMP_LIST_SCAN_IN and OMP_LIST_SCAN_EX.
	(enum gfc_exec_op): Add EXEC_OMP_SCAN.
	* match.h (gfc_match_omp_scan): New prototype.
	* openmp.c (gfc_match_omp_scan): New.
	(gfc_match_omp_taskgroup): Cleanup.
	(resolve_omp_clauses, gfc_resolve_omp_do_blocks,
	omp_code_to_statement, gfc_resolve_omp_directive): Handle 'omp scan'.
	* parse.c (decode_omp_directive, next_statement,
	gfc_ascii_statement): Likewise.
	* resolve.c (gfc_resolve_code): Handle EXEC_OMP_SCAN.
	* st.c (gfc_free_statement): Likewise.
	* trans-openmp.c (gfc_trans_omp_clauses, gfc_trans_omp_do,
	gfc_split_omp_clauses): Handle 'omp scan'.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/scan-1.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/reduction4.f90: Update; move some FE tests to ...
	* gfortran.dg/gomp/reduction6.f90: ... this new test and ...
	* gfortran.dg/gomp/reduction7.f90: ... this new test.
	* gfortran.dg/gomp/reduction5.f90: Add dg-error.
	* gfortran.dg/gomp/scan-1.f90: New test.
	* gfortran.dg/gomp/scan-2.f90: New test.
	* gfortran.dg/gomp/scan-3.f90: New test.
	* gfortran.dg/gomp/scan-4.f90: New test.
	* gfortran.dg/gomp/scan-5.f90: New test.
	* gfortran.dg/gomp/scan-6.f90: New test.
	* gfortran.dg/gomp/scan-7.f90: New test.

 gcc/fortran/dump-parse-tree.c |   7 +-
 gcc/fortran/gfortran.h|   6 +-
 gcc/fortran/match.h   |   1 +
 gcc/fortran/openmp.c  | 102 ++--
 gcc/fortran/parse.c   |   6 +-
 gcc/fortran/resolve.c |   1 +
 gcc/fortran/st.c  |   1 +
 gcc/fortran/trans-openmp.c|  40 -
 gcc/testsuite/gfortran.dg/gomp/reduction4.f90 |  29 +---
 gcc/testsuite/gfortran.dg/gomp/reduction5.f90 |   7 +-
 gcc/testsuite/gfortran.dg/gomp/reduction6.f90 |  18 +++
 gcc/testsuite/gfortran.dg/gomp/reduction7.f90 |   9 ++
 gcc/testsuite/gfortran.dg/gomp/scan-1.f90 | 213 ++
 gcc/testsuite/gfortran.dg/gomp/scan-2.f90 |  21 +++
 gcc/testsuite/gfortran.dg/gomp/scan-3.f90 |  21 +++
 gcc/testsuite/gfortran.dg/gomp/scan-4.f90 |  22 +++
 gcc/testsuite/gfortran.dg/gomp/scan-5.f90 |  18 +++
 gcc/testsuite/gfortran.dg/gomp/scan-6.f90 |  16 ++
 gcc/testsuite/gfortran.dg/gomp/scan-7.f90 |  60 
 libgomp/testsuite/libgomp.fortran/scan-1.f90  | 115 ++
 20 files changed, 670 insertions(+), 43 deletions(-)

diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 1012b11fb98..b3fa1785b14 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -1600,6 +1600,8 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses)
 	  case OMP_LIST_USE_DEVICE_PTR: type = "USE_DEVICE_PTR"; break;
 	  case OMP_LIST_USE_DEVICE_ADDR: type = "USE_DEVICE_ADDR"; break;
 	  case OMP_LIST_NONTEMPORAL: type = "NONTEMPORAL"; break;
+	  case OMP_LIST_SCAN_IN: type = "INCLUSIVE"; break;
+	  case OMP_LIST_SCAN_EX: type = "EXCLUSIVE"; break;
 	  default:
 	gcc_unreachable ();
 	  }
@@ -1803,6 +1805,7 @@ show_omp_node (int level, gfc_code *c)
 case EXEC_OMP_PARALLEL_DO_SIMD: name = "PARALLEL DO SIMD"; break;
 case EXEC_OMP_PARALLEL_SECTIONS: name = "PARALLEL SECTIONS"; break;
 case EXEC_OMP_PARALLEL_WORKSHARE: name = "PARALLEL WORKSHARE"; break;
+case EXEC_OMP_SCAN: name = "SCAN"; break;
 case EXEC_OMP_SECTIONS: name = "SECTIONS"; break;
 case EXEC_OMP_SIMD: name = "SIMD"; break;
 case EXEC_OMP_SINGLE: name = "SINGLE"; break;
@@ -1873,6 +1876,7 @@ show_omp_node (int level, gfc_code *c)
 case EXEC_OMP_PARALLEL_DO_SIMD:
 case EXEC_OMP_PARALLEL_SECTIONS:
 case EXEC_OMP_PARALLEL_WORKSHARE:
+case EXEC_OMP_SCAN:
 case EXEC_OMP_SECTIONS:
 case EXEC_OMP_SIMD:
 case EXEC_OMP_SINGLE:
@@ -1933,7 +1937,7 @@ show_

[PATCH] tree-optimization/98192 - fix double free in SLP

2020-12-08 Thread Richard Biener
This makes sure to clear the vector pointer on release.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
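
As a rough illustration of why the parameter has to be taken by reference,
here is a toy C sketch.  It assumes a much simplified stand-in for the
vector type (struct toy_vec and all function names are invented for this
example), and a pointer plays the role of the C++ reference.  Releasing a
by-value copy clears only the copy, so the caller still holds a stale
pointer and may free it again.

```c
#include <stdlib.h>

/* Toy stand-in for a heap vector: an owning pointer and a length.  */
struct toy_vec
{
  int *data;
  size_t len;
};

/* Free the storage and clear the pointer, so a second release is a no-op.  */
void
toy_release (struct toy_vec *v)
{
  free (v->data);
  v->data = NULL;
  v->len = 0;
}

/* By value: only the callee's copy is cleared.  The caller's data pointer
   is left dangling, and a later release by the caller is a double free.  */
void
consume_by_value (struct toy_vec v)
{
  toy_release (&v);
}

/* By pointer (the analogue of by reference): the caller's own object is
   cleared, so a later release by the caller is harmless.  */
void
consume_by_pointer (struct toy_vec *v)
{
  toy_release (v);
}
```

After consume_by_pointer the caller sees data == NULL, so any cleanup path
that releases the vector again does nothing.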

2020-12-08  Richard Biener  

PR tree-optimization/98192
* tree-vect-slp.c (vect_build_slp_instance): Get scalar_stmts
by reference.
---
 gcc/tree-vect-slp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 35e783505b4..d248ce2c3f7 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2212,7 +2212,7 @@ vect_analyze_slp_instance (vec_info *vinfo,
 static bool
 vect_build_slp_instance (vec_info *vinfo,
 slp_instance_kind kind,
-vec<stmt_vec_info> scalar_stmts,
+vec<stmt_vec_info> &scalar_stmts,
 stmt_vec_info root_stmt_info,
 unsigned max_tree_size,
 scalar_stmts_to_slp_tree_map_t *bst_map,
-- 
2.26.2


[PATCH] testsuite/95900 - fix gcc.dg/vect/bb-slp-pr95866.c target requirement

2020-12-08 Thread Richard Biener
We require a vector-by-scalar shift; there's no appropriate target
selector, so use SSE2 for now.

Tested on x86_64-linux, pushed.

2020-12-08  Richard Biener  

PR testsuite/95900
* gcc.dg/vect/bb-slp-pr95866.c: Require sse2 for the
BIT_FIELD_REF match.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr95866.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr95866.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr95866.c
index edcaf17728e..14826b53cab 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr95866.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr95866.c
@@ -13,5 +13,5 @@ void foo()
 }
 
 /* The scalar shift argument should be extracted from the available vector.  */
-/* { dg-final { scan-tree-dump "BIT_FIELD_REF" "slp2" } } */
+/* { dg-final { scan-tree-dump "BIT_FIELD_REF" "slp2" { target sse2 } } } */
 /* { dg-final { scan-tree-dump "optimized: basic block" "slp2" } } */
-- 
2.26.2


Re: [PATCH][GCC10] arm: Fix unwanted fall-through in arm.c

2020-12-08 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov  writes:

>> -Original Message-
>> From: Andrea Corallo 
>> Sent: 02 December 2020 16:06
>> To: gcc-patches@gcc.gnu.org
>> Cc: Kyrylo Tkachov ; Richard Earnshaw
>> ; nd 
>> Subject: [PATCH][GCC10] arm: Fix unwanted fall-through in arm.c
>> 
>> Hi all,
>> 
>> this is to fix in GCC 10 the incomplete backport done by 1aabb312f of
>> what at the time I fixed on master with dd019ef07.
>> 
>> Regtested and bootstrapped on arm-linux-gnueabihf.
>> 
>> I guess should be under the obvious rule but prefer to ask, okay for
>> gcc-10?
>
> Ok.
> I'd consider it obvious.
> Thanks,
> Kyrill

In as 725179f3e40.

Thanks

  Andrea


RE: [PATCH][GCC10] arm: Fix unwanted fall-through in arm.c

2020-12-08 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: 02 December 2020 16:06
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; nd 
> Subject: [PATCH][GCC10] arm: Fix unwanted fall-through in arm.c
> 
> Hi all,
> 
> this is to fix in GCC 10 the incomplete backport done by 1aabb312f of
> what at the time I fixed on master with dd019ef07.
> 
> Regtested and bootstrapped on arm-linux-gnueabihf.
> 
> I guess should be under the obvious rule but prefer to ask, okay for
> gcc-10?

Ok.
I'd consider it obvious.
Thanks,
Kyrill

> 
> Thanks
> 
>   Andrea



Re: [PATCH] arm: Improve documentation for effective target 'arm_softfloat'

2020-12-08 Thread Andrea Corallo via Gcc-patches
Andrea Corallo via Gcc-patches  writes:

> Hi all,
>
> I'd like to submit the following patch to better specify the meaning
> of the 'arm_softfloat' effective target.
>
> As I've recently discovered we can have cases where '-mfloat-abi=hard'
> is used and the compiler correctly defines '__SOFTFP__'.
>
> Effectively 'arm_softfloat' is checking if the target requires floating
> point emulation but not what ABI is used, so I think it would be nice to
> specify that in the documentation.
>
> Okay for trunk?
>
> Thanks
>
>   Andrea
>   

Ping


Re: [PATCH V2] arm: [testsuite] fix lob tests for -mfloat-abi=hard

2020-12-08 Thread Andrea Corallo via Gcc-patches
Andrea Corallo via Gcc-patches  writes:

> Hi all,
>
> second version of this patch here fixing lob[2-5].c tests for hard float
> abi targets implementing Kyrill's suggestions.
>
> Okay for trunk?
>
>   Andrea

Ping


Re: [PATCH][GCC10] arm: Fix unwanted fall-through in arm.c

2020-12-08 Thread Andrea Corallo via Gcc-patches
Andrea Corallo via Gcc-patches  writes:

> Andrea Corallo via Gcc-patches  writes:
>
>> Hi all,
>>
>> this is to fix in GCC 10 the incomplete backport done by 1aabb312f of
>> what at the time I fixed on master with dd019ef07.
>>
>> Regtested and bootstrapped on arm-linux-gnueabihf.
>>
>> I guess should be under the obvious rule but prefer to ask, okay for
>> gcc-10?
>>
>> Thanks
>>
>>   Andrea
>
> Adding references to the two mentioned patches for clarity.
>
> Original patch (1aabb312f)
>
> 
>
> Backport (dd019ef07)
>
> 
>
>   Andrea

Ping


[ARM][PR66791] Replace builtins for vclt and vcgt intrinsics in arm_neon.h

2020-12-08 Thread Prathamesh Kulkarni via Gcc-patches
Hi,
This patch replaces calls to __builtin_neon_vcgt* with < and > in the vclt
and vcgt intrinsics, respectively, and removes the vcgt entry from
arm_neon_builtins.def.
OK to commit?
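
To make the idea concrete, here is a small sketch using the generic
GCC/Clang vector extension rather than the real arm_neon.h definitions.
The typedef and function names (sketch_*) are invented for this example,
and the attached patch may look different.  A lane-wise comparison on
vector types yields all-ones in lanes where it holds and zero elsewhere,
which is the result a vcgt/vclt intrinsic is expected to produce.

```c
#include <stdint.h>

/* 128-bit vectors of four 32-bit lanes via the generic vector extension.
   These names are illustrative, not the arm_neon.h ones.  */
typedef int32_t sketch_int32x4 __attribute__ ((vector_size (16)));
typedef uint32_t sketch_uint32x4 __attribute__ ((vector_size (16)));

/* Lane-wise a > b, expressed with the plain operator instead of a builtin.  */
sketch_uint32x4
sketch_vcgtq_s32 (sketch_int32x4 a, sketch_int32x4 b)
{
  return (sketch_uint32x4) (a > b);
}

/* Lane-wise a < b; no separate builtin is needed for the mirrored form.  */
sketch_uint32x4
sketch_vcltq_s32 (sketch_int32x4 a, sketch_int32x4 b)
{
  return (sketch_uint32x4) (a < b);
}
```

Writing the comparison as an operator lets the middle end see and fold it
like any other expression, which is presumably the motivation here.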

Thanks,
Prathamesh


vclt-2.diff
Description: Binary data


Re: [PATCH] Remove misleading debug line entries

2020-12-08 Thread Richard Biener
On Mon, 7 Dec 2020, Bernd Edlinger wrote:

> On 12/7/20 11:50 AM, Richard Biener wrote:
> > The ipa-param-manipulation.c hunk is OK, please commit separately.
> > 
> 
> done.
> 
> > The tree-inline.c and cfgexpand.c changes are OK as well, for the
> > tree-ssa-live.c change see below
> > 
> > 
> > From 85b0e37d0c0d3ecac4908ebbfd67edc612ef22b2 Mon Sep 17 00:00:00 2001
> > From: Bernd Edlinger 
> > Date: Wed, 2 Dec 2020 12:32:02 +0100
> > Subject: [PATCH] Remove misleading debug line entries
> > 
> > This removes gimple_debug_begin_stmts without block info which remain
> > after a gimple block originating from an inline function is unused.
> > 
> > The line numbers from these stmts are from the inline function,
> > but since the inline function is completely optimized away,
> > there will be no DW_TAG_inlined_subroutine so the debugger has
> > no callstack available at this point, and therefore those
> > line table entries are not helpful to the user.
> > 
> > 2020-12-02  Bernd Edlinger  
> > 
> > * cfgexpand.c (expand_gimple_basic_block): Remove special handling
> > of debug_inline_entries without block info.
> > * ipa-param-manipulation.c
> > (ipa_param_body_adjustments::modify_call_stmt): Set location info.
> > * tree-inline.c (remap_gimple_stmt): Drop debug_nonbind_markers when
> > the call statement has no block info.
> > (copy_debug_stmt): Remove debug_nonbind_markers when inlining
> > and the block info is mapped to NULL.
> > * tree-ssa-live.c (clear_unused_block_pointer): Remove
> > debug_nonbind_markers originating from inlined functions.
> > ---
> >  gcc/cfgexpand.c  |  9 +
> >  gcc/ipa-param-manipulation.c |  2 ++
> >  gcc/tree-inline.c| 14 ++
> >  gcc/tree-ssa-live.c  | 22 --
> >  4 files changed, 33 insertions(+), 14 deletions(-)
> > 
> > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> > index 7e0bdd58e85..df7b62080b6 100644
> > --- a/gcc/cfgexpand.c
> > +++ b/gcc/cfgexpand.c
> > @@ -5953,14 +5953,7 @@ expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
> >   else if (gimple_debug_begin_stmt_p (stmt))
> > val = GEN_RTX_DEBUG_MARKER_BEGIN_STMT_PAT ();
> >   else if (gimple_debug_inline_entry_p (stmt))
> > -   {
> > - tree block = gimple_block (stmt);
> > -
> > - if (block)
> > -   val = GEN_RTX_DEBUG_MARKER_INLINE_ENTRY_PAT ();
> > - else
> > -   goto delink_debug_stmt;
> > -   }
> > +   val = GEN_RTX_DEBUG_MARKER_INLINE_ENTRY_PAT ();
> >   else
> > gcc_unreachable ();
> >  
> > diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
> > index 2bbea21be2e..9ab4a10096d 100644
> > --- a/gcc/ipa-param-manipulation.c
> > +++ b/gcc/ipa-param-manipulation.c
> > @@ -1681,6 +1681,8 @@ ipa_param_body_adjustments::modify_call_stmt (gcall **stmt_p)
> > }
> > }
> >gcall *new_stmt = gimple_build_call_vec (gimple_call_fn (stmt), 
> > vargs);
> > +  if (gimple_has_location (stmt))
> > +   gimple_set_location (new_stmt, gimple_location (stmt));
> >gimple_call_set_chain (new_stmt, gimple_call_chain (stmt));
> >gimple_call_copy_flags (new_stmt, stmt);
> >if (tree lhs = gimple_call_lhs (stmt))
> > diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> > index d9814bd10d3..360b85f14dc 100644
> > --- a/gcc/tree-inline.c
> > +++ b/gcc/tree-inline.c
> > @@ -1819,12 +1819,11 @@ remap_gimple_stmt (gimple *stmt, copy_body_data *id)
> >   /* If the inlined function has too many debug markers,
> >  don't copy them.  */
> >   if (id->src_cfun->debug_marker_count
> > - > param_max_debug_marker_count)
> > + > param_max_debug_marker_count
> > + || id->reset_location)
> > return stmts;
> >  
> >   gdebug *copy = as_a <gdebug *> (gimple_copy (stmt));
> > - if (id->reset_location)
> > -   gimple_set_location (copy, input_location);
> >   id->debug_stmts.safe_push (copy);
> >   gimple_seq_add_stmt (&stmts, copy);
> >   return stmts;
> > @@ -3169,7 +3168,14 @@ copy_debug_stmt (gdebug *stmt, copy_body_data *id)
> >  }
> >  
> >if (gimple_debug_nonbind_marker_p (stmt))
> > -return;
> > +{
> > +  if (id->call_stmt && !gimple_block (stmt))
> > +   {
> > + gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> > + gsi_remove (&gsi, true);
> > +   }
> > +  return;
> > +}
> >  
> >/* Remap all the operands in COPY.  */
> >memset (&wi, 0, sizeof (wi));
> > diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
> > index 21a9ee43e6b..acba4a58626 100644
> > --- a/gcc/tree-ssa-live.c
> > +++ b/gcc/tree-ssa-live.c
> > @@ -623,13 +623,31 @@ clear_unused_block_pointer (void)
> >{
> > unsigned i;
> > tree b;
> > -   gimple *stmt = gsi_stmt (gsi);
> > +   gimple *stmt;
> >  
> > +  next:
> > +   stmt =

Re: [r11-5391 Regression] FAIL: gcc.target/i386/avx512vl-vxorpd-2.c execution test on Linux/x86_64

2020-12-08 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 30, 2020 at 06:16:06PM +0800, Hongtao Liu via Gcc-patches wrote:
> Add no strict aliasing to function CALC, since there are
> 
> "long long tmp = (*(long long *) &src1[i]) ^ (*(long long *) &src2[i]);"
>  in function CALC.
> 
> 
> modified   gcc/testsuite/gcc.target/i386/avx512dq-vandnpd-2.c
> @@ -9,6 +9,7 @@
>  #include "avx512f-mask-type.h"
> 
>  void
> +__attribute__ ((optimize ("no-strict-aliasing"), noinline))
>  CALC (double *s1, double *s2, double *r)
>  {
>int i;
> modified   gcc/testsuite/gcc.target/i386/avx512dq-vandpd-2.c

I think that is not the best fix.  The CALC routines just want to
model the behavior of the instructions; they are just part of the
verification that the rest of the test works correctly, so we
can simply rewrite the code not to violate aliasing.
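
A minimal sketch of two aliasing-safe ways to do such a bitwise operation
on a double follows.  The function names are invented for this example.
The union form mirrors the approach taken in the fix; the memcpy form is
an alternative shown only for comparison.  Both assume double and
long long have the same size.

```c
#include <string.h>

_Static_assert (sizeof (double) == sizeof (long long),
		"sketch assumes 64-bit double and long long");

/* Bitwise AND-NOT of two doubles through a union.  */
double
andn_double_union (double a, double b)
{
  union { double d; long long l; } u1, u2;
  u1.d = a;
  u2.d = b;
  u1.l = ~u1.l & u2.l;
  return u1.d;
}

/* The same operation, copying the bytes with memcpy instead.  */
double
andn_double_memcpy (double a, double b)
{
  long long la, lb;
  memcpy (&la, &a, sizeof la);
  memcpy (&lb, &b, sizeof lb);
  la = ~la & lb;
  memcpy (&a, &la, sizeof a);
  return a;
}
```

Neither version dereferences a pointer cast to an incompatible type, so
no "no-strict-aliasing" attribute is needed on the function.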

Fixed thusly, committed to the trunk as obvious:

2020-12-08  Jakub Jelinek  

* gcc.target/i386/avx512dq-vandnpd-2.c (CALC): Use union
to avoid aliasing violations.
* gcc.target/i386/avx512dq-vandnps-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vandpd-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vandps-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vorpd-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vorps-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vxorpd-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vxorps-2.c (CALC): Likewise.

--- gcc/testsuite/gcc.target/i386/avx512dq-vandnpd-2.c.jj  2020-01-14 20:02:47.785594824 +0100
+++ gcc/testsuite/gcc.target/i386/avx512dq-vandnpd-2.c  2020-12-08 11:12:37.106053066 +0100
@@ -16,8 +16,11 @@ CALC (double *s1, double *s2, double *r)
 
   for (i = 0; i < SIZE; i++)
 {
-  tmp = (~(*(long long *) &s1[i])) & (*(long long *) &s2[i]);
-  r[i] = *(double *) &tmp;
+  union U { double d; long long l; } u1, u2;
+  u1.d = s1[i];
+  u2.d = s2[i];
+  u1.l = (~u1.l) & u2.l;
+  r[i] = u1.d;
 }
 }
 
--- gcc/testsuite/gcc.target/i386/avx512dq-vandnps-2.c.jj  2020-01-14 20:02:47.785594824 +0100
+++ gcc/testsuite/gcc.target/i386/avx512dq-vandnps-2.c  2020-12-08 11:12:55.033852659 +0100
@@ -16,8 +16,11 @@ CALC (float *s1, float *s2, float *r)
 
   for (i = 0; i < SIZE; i++)
 {
-  tmp = (~(*(int *) &s1[i])) & (*(int *) &s2[i]);
-  r[i] = *(float *) &tmp;
+  union U { float f; int i; } u1, u2;
+  u1.f = s1[i];
+  u2.f = s2[i];
+  u1.i = (~u1.i) & u2.i;
+  r[i] = u1.f;
 }
 }
 
--- gcc/testsuite/gcc.target/i386/avx512dq-vandpd-2.c.jj  2020-01-14 20:02:47.785594824 +0100
+++ gcc/testsuite/gcc.target/i386/avx512dq-vandpd-2.c  2020-12-08 11:10:03.767767230 +0100
@@ -16,8 +16,11 @@ CALC (double *s1, double *s2, double *r)
 
   for (i = 0; i < SIZE; i++)
 {
-  tmp = (*(long long *) &s1[i]) & (*(long long *) &s2[i]);
-  r[i] = *(double *) &tmp;
+  union U { double d; long long l; } u1, u2;
+  u1.d = s1[i];
+  u2.d = s2[i];
+  u1.l &= u2.l;
+  r[i] = u1.d;
 }
 }
 
--- gcc/testsuite/gcc.target/i386/avx512dq-vandps-2.c.jj  2020-01-14 20:02:47.785594824 +0100
+++ gcc/testsuite/gcc.target/i386/avx512dq-vandps-2.c  2020-12-08 11:11:51.548562356 +0100
@@ -16,8 +16,11 @@ CALC (float *s1, float *s2, float *r)
 
   for (i = 0; i < SIZE; i++)
 {
-  tmp = (*(int *) &s1[i]) & (*(int *) &s2[i]);
-  r[i] = *(float *) &tmp;
+  union U { float f; int i; } u1, u2;
+  u1.f = s1[i];
+  u2.f = s2[i];
+  u1.i &= u2.i;
+  r[i] = u1.f;
 }
 }
 
--- gcc/testsuite/gcc.target/i386/avx512dq-vorpd-2.c.jj  2020-01-14 20:02:47.786594810 +0100
+++ gcc/testsuite/gcc.target/i386/avx512dq-vorpd-2.c  2020-12-08 11:15:35.497058846 +0100
@@ -15,8 +15,11 @@ CALC (double *src1, double *src2, double
 
   for (i = 0; i < SIZE; i++)
 {
-  long long tmp = (*(long long *) &src1[i]) | (*(long long *) &src2[i]);
-  dst[i] = *(double *) &tmp;
+  union U { double d; long long l; } u1, u2;
+  u1.d = src1[i];
+  u2.d = src2[i];
+  u1.l |= u2.l;
+  dst[i] = u1.d;
 }
 }
 
--- gcc/testsuite/gcc.target/i386/avx512dq-vorps-2.c.jj  2020-01-14 20:02:47.786594810 +0100
+++ gcc/testsuite/gcc.target/i386/avx512dq-vorps-2.c  2020-12-08 11:15:45.737944364 +0100
@@ -15,8 +15,11 @@ CALC (float *src1, float *src2, float *d
 
   for (i = 0; i < SIZE; i++)
 {
-  int tmp = (*(int *) &src1[i]) | (*(int *) &src2[i]);
-  dst[i] = *(float *) &tmp;
+  union U { float f; int i; } u1, u2;
+  u1.f = src1[i];
+  u2.f = src2[i];
+  u1.i |= u2.i;
+  dst[i] = u1.f;
 }
 }
 
--- gcc/testsuite/gcc.target/i386/avx512dq-vxorpd-2.c.jj  2020-01-14 20:02:47.787594795 +0100
+++ gcc/testsuite/gcc.target/i386/avx512dq-vxorpd-2.c  2020-12-08 11:15:59.644788891 +0100
@@ -15,8 +15,11 @@ CALC (double *src1, double *src2, double
 
   for (i = 0; i < SIZE; i++)
 {
-  long long tmp = (*(long long *) &src1[i]) ^ (*(long long *) &src2[i]);
-  dst

Re: [PATCH] i386: Fix up X87_ENABLE_{FLOAT, ARITH} in conditions [PR94440]

2020-12-08 Thread Uros Bizjak via Gcc-patches
On Tue, Dec 8, 2020 at 10:36 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The documentation says
>  For a named pattern, the condition may not depend on the data in
>  the insn being matched, but only the target-machine-type flags.
> The i386 backend violates that by using flag_excess_precision and
> flag_unsafe_math_optimizations in the conditions too, which is bad
> when optimize attribute or pragmas are used.  The problem is that the
> middle-end caches the enabled conditions for the optabs for a particular
> switchable target, but multiple functions can share the same
> TARGET_OPTION_NODE, but have different TREE_OPTIMIZATION_NODE with different
> flag_excess_precision or flag_unsafe_math_optimizations, so the enabled
> conditions then match only one of those.
>
> I think best would be to just have a single options node for both the
> generic and target options, then such problems wouldn't exist, but that
> would be very risky at this point and quite large change.
>
> So, instead the following patch just shadows flag_excess_precision and
> flag_unsafe_math_optimizations values for uses in the instruction conditions
> in TargetVariable and during set_cfun artificially creates new
> TARGET_OPTION_NODE if flag_excess_precision and/or
> flag_unsafe_math_optimizations change from what is recorded in their
> TARGET_OPTION_NODE.  The target nodes are hashed, so worst case we can get 4
> times as many target option nodes if one would for each unique target option
> try all the flag_excess_precision and flag_unsafe_math_optimizations values.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2020-12-08  Jakub Jelinek  
>
> PR target/94440
> * config/i386/i386.opt (ix86_excess_precision,
> ix86_unsafe_math_optimizations): New TargetVariables.
> * config/i386/i386.h (X87_ENABLE_ARITH, X87_ENABLE_FLOAT): Use
> ix86_unsafe_math_optimizations instead of
> flag_unsafe_math_optimizations and ix86_excess_precision instead of
> flag_excess_precision.
> * config/i386/i386.c (ix86_excess_precision): Rename to ...
> (ix86_get_excess_precision): ... this.
> (TARGET_C_EXCESS_PRECISION): Define to ix86_get_excess_precision.
> * config/i386/i386-options.c (ix86_valid_target_attribute_tree,
> ix86_option_override_internal): Update ix86_unsafe_math_optimization
> from flag_unsafe_math_optimizations and ix86_excess_precision
> from flag_excess_precision when constructing target option nodes.
> (ix86_set_current_function): If flag_unsafe_math_optimizations
> or flag_excess_precision is different from the one recorded
> in TARGET_OPTION_NODE, create a new target option node for the
> current function and switch to that.

LGTM, but I'm not really experienced in option processing functionality.

Thanks,
Uros.

> --- gcc/config/i386/i386.opt.jj 2020-12-02 14:42:52.195054633 +0100
> +++ gcc/config/i386/i386.opt2020-12-07 16:05:16.898814331 +0100
> @@ -49,6 +49,16 @@ int recip_mask_explicit
>  TargetSave
>  int x_recip_mask_explicit
>
> +;; A copy of flag_excess_precision as a target variable that should
> +;; force a different DECL_FUNCTION_SPECIFIC_TARGET upon
> +;; flag_excess_precision changes.
> +TargetVariable
> +enum excess_precision ix86_excess_precision = EXCESS_PRECISION_DEFAULT
> +
> +;; Similarly for flag_unsafe_math_optimizations.
> +TargetVariable
> +bool ix86_unsafe_math_optimizations = false
> +
>  ;; Definitions to add to the cl_target_option structure
>  ;; -march= processor
>  TargetSave
> --- gcc/config/i386/i386.h.jj   2020-12-05 11:37:19.817423434 +0100
> +++ gcc/config/i386/i386.h  2020-12-07 16:17:13.051866670 +0100
> @@ -829,15 +829,15 @@ extern const char *host_detect_local_cpu
> SFmode, DFmode and XFmode) in the current excess precision
> configuration.  */
>  #define X87_ENABLE_ARITH(MODE) \
> -  (flag_unsafe_math_optimizations  \
> -   || flag_excess_precision == EXCESS_PRECISION_FAST   \
> +  (ix86_unsafe_math_optimizations  \
> +   || ix86_excess_precision == EXCESS_PRECISION_FAST   \
> || (MODE) == XFmode)
>
>  /* Likewise, whether to allow direct conversions from integer mode
> IMODE (HImode, SImode or DImode) to MODE.  */
>  #define X87_ENABLE_FLOAT(MODE, IMODE)  \
> -  (flag_unsafe_math_optimizations  \
> -   || flag_excess_precision == EXCESS_PRECISION_FAST   \
> +  (ix86_unsafe_math_optimizations  \
> +   || ix86_excess_precision == EXCESS_PRECISION_FAST   \
> || (MODE) == XFmode \
> || ((MODE) == DFmode && (IMODE) == SImode)  \
> || (IMODE) == HImode)
> --- gcc/config/i386/i386.c.jj   2020-12-05 11:37:19.0 +0100
> +++ gcc/config/i386/i386.c  2020-12-07 16:34:39.460252324 +0100
> @@ -23001,7 +23001,7 @@ ix86_init_libfuncs (void)

Re: [PR66791][ARM] Replace __builtin_neon_vneg* by - for vneg intrinsics

2020-12-08 Thread Prathamesh Kulkarni via Gcc-patches
On Thu, 3 Dec 2020 at 16:23, Prathamesh Kulkarni
 wrote:
>
> Hi,
> This patch replaces calls to __builtin_neon_vneg* builtins by -
> operator, for vneg intrinsics in arm_neon.h.
> Cross-tested on arm*-*-*.
> OK to commit ?
This patch removes the entry for vneg from arm_neon_builtins.def.
There's another entry:
VAR2 (UNOP, vneg, v8hf, v4hf)
I am not sure whether I should remove it, since the patch doesn't remove
calls to either of these builtins.

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh


vneg-2.diff
Description: Binary data


[committed] openmp: -fopenmp-simd fixes [PR98187]

2020-12-08 Thread Jakub Jelinek via Gcc-patches
Hi!

This patch fixes two bugs in the -fopenmp-simd support.  One is that
in C++ #pragma omp parallel master would actually create OMP_PARALLEL
in the IL, which is a big no-no for -fopenmp-simd; we should be creating
only the constructs -fopenmp-simd handles (mainly OMP_SIMD, OMP_LOOP which
is gimplified as simd in that case, declare simd/reduction and ordered simd).

The other bug was that #pragma omp master taskloop simd combined construct
contains simd and thus should be recognized as #pragma omp simd (with only
the simd applicable clauses), but as master wasn't included in
omp_pragmas_simd, we'd ignore it completely instead.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk,
queued for backporting.

2020-12-08  Jakub Jelinek  

PR c++/98187
* c-pragma.c (omp_pragmas): Remove "master".
(omp_pragmas_simd): Add "master".

* parser.c (cp_parser_omp_parallel): For parallel master with
-fopenmp-simd only, just call cp_parser_omp_master instead of
wrapping it in OMP_PARALLEL.

* c-c++-common/gomp/pr98187.c: New test.

--- gcc/c-family/c-pragma.c.jj  2020-11-09 23:01:10.874738865 +0100
+++ gcc/c-family/c-pragma.c 2020-12-07 21:19:59.319988185 +0100
@@ -1317,7 +1317,6 @@ static const struct omp_pragma_def omp_p
   { "depobj", PRAGMA_OMP_DEPOBJ },
   { "end", PRAGMA_OMP_END_DECLARE_TARGET },
   { "flush", PRAGMA_OMP_FLUSH },
-  { "master", PRAGMA_OMP_MASTER },
   { "requires", PRAGMA_OMP_REQUIRES },
   { "section", PRAGMA_OMP_SECTION },
   { "sections", PRAGMA_OMP_SECTIONS },
@@ -1333,6 +1332,7 @@ static const struct omp_pragma_def omp_p
   { "distribute", PRAGMA_OMP_DISTRIBUTE },
   { "for", PRAGMA_OMP_FOR },
   { "loop", PRAGMA_OMP_LOOP },
+  { "master", PRAGMA_OMP_MASTER },
   { "ordered", PRAGMA_OMP_ORDERED },
   { "parallel", PRAGMA_OMP_PARALLEL },
   { "scan", PRAGMA_OMP_SCAN },
--- gcc/cp/parser.c.jj  2020-12-04 21:39:14.418768272 +0100
+++ gcc/cp/parser.c 2020-12-07 21:22:23.400385209 +0100
@@ -40491,6 +40491,9 @@ cp_parser_omp_parallel (cp_parser *parse
  cclauses = cclauses_buf;
 
  cp_lexer_consume_token (parser->lexer);
+ if (!flag_openmp)  /* flag_openmp_simd  */
+   return cp_parser_omp_master (parser, pragma_tok, p_name, mask,
+cclauses, if_p);
  block = begin_omp_parallel ();
  save = cp_parser_begin_omp_structured_block (parser);
  tree ret = cp_parser_omp_master (parser, pragma_tok, p_name, mask,
--- gcc/testsuite/c-c++-common/gomp/pr98187.c.jj  2020-12-07 21:25:38.108218964 +0100
+++ gcc/testsuite/c-c++-common/gomp/pr98187.c  2020-12-07 21:25:28.960320755 +0100
@@ -0,0 +1,109 @@
+/* PR c++/98187 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp-simd -O2 -fdump-tree-gimple" } */
+/* { dg-final { scan-tree-dump-times "#pragma omp simd" 17 "gimple" } } */
+
+void
+foo (int *p)
+{
+  int i;
+  #pragma omp distribute parallel for
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp distribute parallel for simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp distribute simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+}
+
+void
+bar (int *p)
+{
+  int i;
+  #pragma omp for simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp master taskloop
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp master taskloop simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp parallel for
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp parallel for simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp parallel loop
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp parallel master
+  p[0]++;
+  #pragma omp parallel master taskloop
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp parallel master taskloop simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp parallel sections
+  {
+p[0]++;
+#pragma omp section
+p[1]++;
+#pragma omp section
+p[2]++;
+  }
+  #pragma omp target parallel
+  #pragma omp master
+  p[0]++;
+  #pragma omp target parallel for
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp target parallel for simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp target parallel loop
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp target teams private (i)
+  i = 0;
+  #pragma omp target teams distribute
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp target teams distribute parallel for
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp target teams distribute parallel for simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp target teams distribute simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp target teams loop
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp target simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp taskloop simd
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp teams distribute
+  for (i = 0; i < 64; i++)
+p[i]++;
+  #pragma omp team

Re: [PATCH, powerpc] testsuite update tests for powerpc power10 target codegen.

2020-12-08 Thread Alan Modra via Gcc-patches
On Mon, Dec 07, 2020 at 05:49:05PM -0600, will schmidt via Gcc-patches wrote:
> [PATCH, powerpc] testsuite update tests for powerpc power10 target codegen.

Appears to duplicate work I did earlier,
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html

Except I omitted fold-vec-store-builtin_vec_xst-longlong.c, due to
-mdejagnu-cpu=power8 in that test meaning we don't see any power10
insns.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PR66791][ARM] Replace __builtin_neon_vcreate* for vcreate intrinsics

2020-12-08 Thread Prathamesh Kulkarni via Gcc-patches
On Fri, 4 Dec 2020 at 16:26, Prathamesh Kulkarni
 wrote:
>
> On Thu, 3 Dec 2020 at 16:50, Kyrylo Tkachov  wrote:
> >
> > Hi Prathamesh,
> >
> > > -Original Message-
> > > From: Prathamesh Kulkarni 
> > > Sent: 03 December 2020 10:50
> > > To: gcc Patches ; Kyrylo Tkachov
> > > 
> > > Subject: [PR66791][ARM] Replace __builtin_neon_vcreate* for vcreate
> > > intrinsics
> > >
> > > Hi,
> > > This patch replaces calls to __builtin_neon_vcreate* builtins for
> > > vcreate intrinsics in arm_neon.h.
> > > Cross-tested on arm*-*-*.
> > > OK to commit ?
> >
> > Just remembered for this and the previous patch...
> > Do we need to remove the builtins from being created in the backend if they 
> > are now unused?
> Hi Kyrill,
> Indeed, I will resend patch(es) with builtins removed (if they're not
> used in other places).
Hi Kyrill,
Does attached patch for vcreate look OK ?

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Kyrill
> >
> > >
> > > Thanks,
> > > Prathamesh


vcreate-2.diff
Description: Binary data


[PATCH] tree-optimization/97559 - fix sinking in irreducible regions

2020-12-08 Thread Richard Biener
This fixes sinking of loads when irreducible regions are involved
and the heuristic to find stores on the path along the sink
breaks down, since it uses dominator queries.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-12-08  Richard Biener  

PR tree-optimization/97559
* tree-ssa-sink.c (statement_sink_location): Never ignore
PHIs on sink paths in irreducible regions.

* gcc.dg/torture/pr97559-1.c: New testcase.
* gcc.dg/torture/pr97559-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/torture/pr97559-1.c | 21 +
 gcc/testsuite/gcc.dg/torture/pr97559-2.c | 18 ++
 gcc/tree-ssa-sink.c  | 14 +-
 3 files changed, 48 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr97559-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr97559-2.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr97559-1.c b/gcc/testsuite/gcc.dg/torture/pr97559-1.c
new file mode 100644
index 000..d5de3bdb39e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr97559-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+
+int printf (char *, ...);
+
+int a, b, c, d;
+
+void e () {
+  int f = a;
+  if (b) {
+  L1:
+b = 0;
+  L2:
+if (c) {
+  if (f)
+printf("0");
+  goto L1;
+}
+  }
+  if (d)
+goto L2;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr97559-2.c b/gcc/testsuite/gcc.dg/torture/pr97559-2.c
new file mode 100644
index 000..b512e6db7ca
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr97559-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+
+int a, b, c, d;
+
+void e() {
+  int f = b;
+  if (a) {
+  L1:
+a = 0;
+  L2:
+if (a) {
+  c = b;
+  goto L1;
+}
+  }
+  if (d)
+goto L2;
+}
diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index 207aae2818a..b0abf4147d6 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -390,12 +390,15 @@ statement_sink_location (gimple *stmt, basic_block frombb,
 with the use.  */
  if (gimple_code (use_stmt) == GIMPLE_PHI)
{
- /* In case the PHI node post-dominates the current insert location
-we can disregard it.  But make sure it is not dominating
-it as well as can happen in a CFG cycle.  */
+ /* In case the PHI node post-dominates the current insert
+location we can disregard it.  But make sure it is not
+dominating it as well as can happen in a CFG cycle.  */
  if (commondom != bb
  && !dominated_by_p (CDI_DOMINATORS, commondom, bb)
- && dominated_by_p (CDI_POST_DOMINATORS, commondom, bb))
+ && dominated_by_p (CDI_POST_DOMINATORS, commondom, bb)
+ /* If the blocks are possibly within the same irreducible
+cycle the above check breaks down.  */
+ && !(bb->flags & commondom->flags & BB_IRREDUCIBLE_LOOP))
continue;
  bb = EDGE_PRED (bb, PHI_ARG_INDEX_FROM_USE (use_p))->src;
}
@@ -407,7 +410,8 @@ statement_sink_location (gimple *stmt, basic_block frombb,
continue;
  /* There is no easy way to disregard defs not on the path from
 frombb to commondom so just consider them all.  */
- commondom = nearest_common_dominator (CDI_DOMINATORS, bb, commondom);
+ commondom = nearest_common_dominator (CDI_DOMINATORS,
+   bb, commondom);
  if (commondom == frombb)
return false;
}
-- 
2.26.2


[PATCH] tree-optimization/98191 - fix BIT_INSERT_EXPR sequence vectorization

2020-12-08 Thread Richard Biener
This adds a missing check.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-12-08  Richard Biener  

PR tree-optimization/98191
* tree-vect-slp.c (vect_slp_check_for_constructors): Do not
follow a non-SSA def chain.

* gcc.dg/torture/pr98191.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr98191.c | 10 ++
 gcc/tree-vect-slp.c|  3 ++-
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr98191.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr98191.c b/gcc/testsuite/gcc.dg/torture/pr98191.c
new file mode 100644
index 000..93cd27c21e1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr98191.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+
+typedef double v2df __attribute__((vector_size(2*sizeof(double))));
+
+v2df foo (double *y)
+{
+  v2df x = (v2df){ 1.0, 2.0 };
+  x[0] = *y;
+  return x;
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index a2757e707ff..35e783505b4 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -4243,7 +4243,8 @@ vect_slp_check_for_constructors (bb_vec_info bb_vinfo)
  def = gimple_assign_rhs1 (assign);
  do
{
- if (!has_single_use (def))
+ if (TREE_CODE (def) != SSA_NAME
+ || !has_single_use (def))
break;
  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
  unsigned this_lane;
-- 
2.26.2


[PATCH] i386: Fix up X87_ENABLE_{FLOAT, ARITH} in conditions [PR94440]

2020-12-08 Thread Jakub Jelinek via Gcc-patches
Hi!

The documentation says
 For a named pattern, the condition may not depend on the data in
 the insn being matched, but only the target-machine-type flags.
The i386 backend violates that by also using flag_excess_precision and
flag_unsafe_math_optimizations in the conditions, which is bad when the
optimize attribute or pragmas are used.  The problem is that the
middle-end caches the enabled conditions for the optabs per switchable
target, yet multiple functions can share the same TARGET_OPTION_NODE
while having different TREE_OPTIMIZATION_NODEs with different
flag_excess_precision or flag_unsafe_math_optimizations, so the cached
enabled conditions then match only one of those.
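
As a rough illustration of the situation described above (not the
testcase from the PR), the following sketch has two functions in one
translation unit that use the same target options but differ in
flag_unsafe_math_optimizations through the optimize attribute.  The
names OPTIMIZE, scale_fast, scale_strict and check_variants_agree are
made up for this example.

```c
#include <assert.h>

/* The optimize attribute is a GCC extension; other compilers get an
   empty macro so that the sketch still builds.  */
#if defined (__GNUC__) && !defined (__clang__)
# define OPTIMIZE(opt) __attribute__ ((optimize (opt)))
#else
# define OPTIMIZE(opt)
#endif

/* Hypothetical function compiled as if with
   -funsafe-math-optimizations.  The int to double conversion is the
   kind of operation X87_ENABLE_FLOAT guards.  */
OPTIMIZE ("unsafe-math-optimizations") double
scale_fast (double x, int n)
{
  return x * n;
}

/* Same body, but compiled as if with
   -fno-unsafe-math-optimizations.  Both functions use the same target
   options, so only their optimization options differ.  */
OPTIMIZE ("no-unsafe-math-optimizations") double
scale_strict (double x, int n)
{
  return x * n;
}

/* Both variants must agree on inputs whose product is exactly
   representable, whatever instructions get selected.  */
void
check_variants_agree (void)
{
  assert (scale_fast (1.5, 2) == scale_strict (1.5, 2));
}
```

With a single cached set of enabled conditions per target, only one of
scale_fast and scale_strict would see the conditions that match its own
optimization options.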

I think it would be best to have a single options node for both the
generic and target options, so that such problems couldn't exist, but that
would be very risky at this point and quite a large change.

So, instead the following patch just shadows the flag_excess_precision and
flag_unsafe_math_optimizations values used in the instruction conditions
in TargetVariables, and during set_cfun artificially creates a new
TARGET_OPTION_NODE if flag_excess_precision and/or
flag_unsafe_math_optimizations differ from what is recorded in the
function's TARGET_OPTION_NODE.  The target nodes are hashed, so in the worst
case we can get 4 times as many target option nodes, if for each unique
target option node one tried all the flag_excess_precision and
flag_unsafe_math_optimizations values.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-12-08  Jakub Jelinek  

PR target/94440
* config/i386/i386.opt (ix86_excess_precision,
ix86_unsafe_math_optimizations): New TargetVariables.
* config/i386/i386.h (X87_ENABLE_ARITH, X87_ENABLE_FLOAT): Use
ix86_unsafe_math_optimizations instead of
flag_unsafe_math_optimizations and ix86_excess_precision instead of
flag_excess_precision.
* config/i386/i386.c (ix86_excess_precision): Rename to ...
(ix86_get_excess_precision): ... this.
(TARGET_C_EXCESS_PRECISION): Define to ix86_get_excess_precision.
* config/i386/i386-options.c (ix86_valid_target_attribute_tree,
ix86_option_override_internal): Update ix86_unsafe_math_optimizations
from flag_unsafe_math_optimizations and ix86_excess_precision
from flag_excess_precision when constructing target option nodes.
(ix86_set_current_function): If flag_unsafe_math_optimizations
or flag_excess_precision is different from the one recorded
in TARGET_OPTION_NODE, create a new target option node for the
current function and switch to that.

--- gcc/config/i386/i386.opt.jj 2020-12-02 14:42:52.195054633 +0100
+++ gcc/config/i386/i386.opt 2020-12-07 16:05:16.898814331 +0100
@@ -49,6 +49,16 @@ int recip_mask_explicit
 TargetSave
 int x_recip_mask_explicit
 
+;; A copy of flag_excess_precision as a target variable that should
+;; force a different DECL_FUNCTION_SPECIFIC_TARGET upon
+;; flag_excess_precision changes.
+TargetVariable
+enum excess_precision ix86_excess_precision = EXCESS_PRECISION_DEFAULT
+
+;; Similarly for flag_unsafe_math_optimizations.
+TargetVariable
+bool ix86_unsafe_math_optimizations = false
+
 ;; Definitions to add to the cl_target_option structure
 ;; -march= processor
 TargetSave
--- gcc/config/i386/i386.h.jj   2020-12-05 11:37:19.817423434 +0100
+++ gcc/config/i386/i386.h  2020-12-07 16:17:13.051866670 +0100
@@ -829,15 +829,15 @@ extern const char *host_detect_local_cpu
SFmode, DFmode and XFmode) in the current excess precision
configuration.  */
 #define X87_ENABLE_ARITH(MODE) \
-  (flag_unsafe_math_optimizations  \
-   || flag_excess_precision == EXCESS_PRECISION_FAST   \
+  (ix86_unsafe_math_optimizations  \
+   || ix86_excess_precision == EXCESS_PRECISION_FAST   \
|| (MODE) == XFmode)
 
 /* Likewise, whether to allow direct conversions from integer mode
IMODE (HImode, SImode or DImode) to MODE.  */
 #define X87_ENABLE_FLOAT(MODE, IMODE)  \
-  (flag_unsafe_math_optimizations  \
-   || flag_excess_precision == EXCESS_PRECISION_FAST   \
+  (ix86_unsafe_math_optimizations  \
+   || ix86_excess_precision == EXCESS_PRECISION_FAST   \
|| (MODE) == XFmode \
|| ((MODE) == DFmode && (IMODE) == SImode)  \
|| (IMODE) == HImode)
--- gcc/config/i386/i386.c.jj   2020-12-05 11:37:19.0 +0100
+++ gcc/config/i386/i386.c  2020-12-07 16:34:39.460252324 +0100
@@ -23001,7 +23001,7 @@ ix86_init_libfuncs (void)
apparently at random.  */
 
 static enum flt_eval_method
-ix86_excess_precision (enum excess_precision_type type)
+ix86_get_excess_precision (enum excess_precision_type type)
 {
   switch (type)
 {
@@ -23527,7 +23527,7 @@ ix86_run_selftests (void)
 #define TARGET_MD_ASM_ADJUST ix86_md_asm_adjust
 
 #undef TARGET

[PATCH] c++: Don't require accessible dtors for some forms of new [PR59238]

2020-12-08 Thread Jakub Jelinek via Gcc-patches
Hi!

The earlier cases in build_new_1 already use | tf_no_cleanup; these are the
cases where the type isn't type_build_ctor_call nor explicit_value_init_p.
It is true that often one can't delete these (unless e.g. the dtor is
private or protected and the deletion is done in some method), but diagnosing
that belongs to delete, not new.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2020-12-08  Jakub Jelinek  

PR c++/59238
* init.c (build_new_1): Pass complain | tf_no_cleanup to digest_init,
build_x_compound_expr_from_vec and cp_build_modify_expr.

* g++.dg/cpp0x/new4.C: New test.

--- gcc/cp/init.c.jj 2020-11-19 20:08:22.831676932 +0100
+++ gcc/cp/init.c   2020-12-07 12:50:04.990586139 +0100
@@ -3485,13 +3485,14 @@ build_new_1 (vec **placemen
  ie = build_constructor_from_vec (init_list_type_node, *init);
  CONSTRUCTOR_IS_DIRECT_INIT (ie) = true;
  CONSTRUCTOR_IS_PAREN_INIT (ie) = true;
- ie = digest_init (type, ie, complain);
+ ie = digest_init (type, ie, complain | tf_no_cleanup);
}
  else
ie = build_x_compound_expr_from_vec (*init, "new initializer",
-complain);
+complain | tf_no_cleanup);
  init_expr = cp_build_modify_expr (input_location, init_expr,
-   INIT_EXPR, ie, complain);
+   INIT_EXPR, ie,
+   complain | tf_no_cleanup);
}
  /* If the initializer uses C++14 aggregate NSDMI that refer to the
 object being initialized, replace them now and don't try to
--- gcc/testsuite/g++.dg/cpp0x/new4.C.jj 2020-12-07 12:54:00.369948493 +0100
+++ gcc/testsuite/g++.dg/cpp0x/new4.C   2020-12-07 12:53:25.917334573 +0100
@@ -0,0 +1,35 @@
+// PR c++/59238
+// { dg-do compile { target c++11 } }
+
+struct A { ~A () = delete; };
+A *pa{new A{}};
+
+class B { ~B () = default; };
+B *pb{new B{}};
+
+struct E {
+  ~E () = delete; 
+private: 
+  int x;
+};
+E *pe{new E{}};
+
+class C { ~C (); };
+C *pc{new C{}};
+
+class D { ~D () {} };
+D *pd{new D{}};
+
+struct F {
+  F () = default;
+  ~F () = delete; 
+};
+F *pf{new F{}};
+
+struct G {
+  G () = default;
+  ~G () = delete; 
+private: 
+  int x;
+};
+G *pg{new G{}};

Jakub



Re: Help with PR97872

2020-12-08 Thread Prathamesh Kulkarni via Gcc-patches
On Mon, 7 Dec 2020 at 17:37, Hongtao Liu  wrote:
>
> On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni
>  wrote:
> >
> > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu  wrote:
> > >
> > > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener  wrote:
> > > >
> > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Mon, 7 Dec 2020 at 13:01, Richard Biener  wrote:
> > > > > >
> > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener 
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > For the test mentioned in PR, I was trying to see if 
> > > > > > > > > > > > > we could do
> > > > > > > > > > > > > specialized expansion for vcond in target when 
> > > > > > > > > > > > > operands are -1 and 0.
> > > > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > > > >
> > > > > > > > > > > > > where r117 and r118 are set to vector constants -1 
> > > > > > > > > > > > > and 0 respectively.
> > > > > > > > > > > > > However, I am not sure if there's a way to check if 
> > > > > > > > > > > > > the register is
> > > > > > > > > > > > > constant during expansion time (since we don't have 
> > > > > > > > > > > > > df analysis yet) ?
> > >
> > > It seems to me that all you need to do is relax the predicates of op1
> > > and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> > > debugged it, but I see that vcondmn in neon.md only accepts
> > > s_register_operand.
> > >
> > > (define_expand "vcond"
> > >   [(set (match_operand:VDQW 0 "s_register_operand")
> > > (if_then_else:VDQW
> > >   (match_operator 3 "comparison_operator"
> > > [(match_operand:VDQW 4 "s_register_operand")
> > >  (match_operand:VDQW 5 "reg_or_zero_operand")])
> > >   (match_operand:VDQW 1 "s_register_operand")
> > >   (match_operand:VDQW 2 "s_register_operand")))]
> > >   "TARGET_NEON && (! || flag_unsafe_math_optimizations)"
> > > {
> > >   arm_expand_vcond (operands, mode);
> > >   DONE;
> > > })
> > >
> > > in sse.md it's defined as
> > > (define_expand "vcondu"
> > >   [(set (match_operand:V_512 0 "register_operand")
> > > (if_then_else:V_512
> > >   (match_operator 3 ""
> > > [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
> > >  (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
> > >   (match_operand:V_512 1 "general_operand")
> > >   (match_operand:V_512 2 "general_operand")))]
> > >   "TARGET_AVX512F
> > >&& (GET_MODE_NUNITS (mode)
> > >== GET_MODE_NUNITS (mode))"
> > > {
> > >   bool ok = ix86_expand_int_vcond (operands);
> > >   gcc_assert (ok);
> > >   DONE;
> > > })
> > >
> > > then we can get operands[1] and operands[2] as
> > >
> > > (gdb) p debug_rtx (operands[1])
> > >  (const_vector:V16QI [
> > > (const_int -1 [0x]) repeated x16
> > > ])
> > > (gdb) p debug_rtx (operands[2])
> > > (reg:V16QI 82 [ _2 ])
> > > (const_vector:V16QI [
> > > (const_int 0 [0]) repeated x16
> > > ])
> > Hi Hongtao,
> > Thanks for the suggestions!
> > However IIUC from vector extensions doc page, the result of vector
> > comparison is defined to be 0
> > or -1, so would it be better to canonicalize
> > x cmp y ? -1 : 0 to x cmp y, on GIMPLE itself during gimple-isel and
> > adjust targets if required ?
>
> Yes, it would be more straightforward to handle it in gimple isel, I
> would adjust the backend and testcase after you check in the patch.
Thanks! I have committed the attached patch in
3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1.
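
For reference, the property the canonicalization relies on can be checked
with a small sketch using the GNU vector extensions.  This is only an
illustration and not the committed patch; v8qi and compare_matches_select
are hypothetical names, and the select is written as a scalar loop so
that the example stays plain C.

```c
#include <assert.h>
#include <stdbool.h>

/* Eight signed 8-bit lanes, similar in shape to the V8QI mode in the
   RTL quoted earlier in the thread.  */
typedef signed char v8qi __attribute__ ((vector_size (8)));

/* Return true if every lane of the vector comparison *a < *b equals
   the lane-wise select (a[i] < b[i] ? -1 : 0).  */
bool
compare_matches_select (const v8qi *a, const v8qi *b)
{
  assert (a && b);

  /* With the vector extensions, a comparison yields an integer vector
     whose lanes are -1 where the comparison is true and 0 elsewhere.
     __typeof__ (a GNU extension) avoids naming the result type.  */
  __typeof__ (*a < *b) mask = *a < *b;

  for (int i = 0; i < 8; i++)
    {
      signed char selected = (*a)[i] < (*b)[i] ? -1 : 0;
      if (mask[i] != selected)
        return false;
    }
  return true;
}
```

Since the comparison result already has the -1/0 form, wrapping it in a
select of -1 and 0 adds nothing.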

Regards,
Prathamesh
>
> > Alternatively, I could try fixing this in backend as you suggest above.
> >
> > Thanks,
> > Prathamesh
> > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Alternatively, should we add a target hook that 
> > > > > > > > > > > > > returns true if the
> > > > > > > > > > > > > result of vector comparison is set to all-ones or 
> > > > > > > > > > > > > all-zeros, and then
> > > > > > > > > > > > > use this hook in gimple ISEL to effectively turn 
> > > > > > > >

[PATCH] tree-optimization/98180 - fix BIT_INSERT_EXPR sequence vectorization

2020-12-08 Thread Richard Biener
This adds a missing check for the first inserted value.

Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.

2020-12-08  Richard Biener  

PR tree-optimization/98180
* tree-vect-slp.c (vect_slp_check_for_constructors): Check the
first inserted value has a def.
---
 gcc/tree-vect-slp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 2dccca02aa0..a2757e707ff 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -4193,10 +4193,12 @@ vect_slp_check_for_constructors (bb_vec_info bb_vinfo)
   else if (gimple_assign_rhs_code (assign) == BIT_INSERT_EXPR
   && VECTOR_TYPE_P (TREE_TYPE (rhs))
   && TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs)).is_constant ()
+  && TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs)).to_constant () > 1
   && integer_zerop (gimple_assign_rhs3 (assign))
   && useless_type_conversion_p
(TREE_TYPE (TREE_TYPE (rhs)),
-TREE_TYPE (gimple_assign_rhs2 (assign))))
+TREE_TYPE (gimple_assign_rhs2 (assign)))
+  && bb_vinfo->lookup_def (gimple_assign_rhs2 (assign)))
{
  /* We start to match on insert to lane zero but since the
 inserts need not be ordered we'd have to search both
-- 
2.26.2