Re: [PATCH] c++: Fix wrong conversion error with non-viable overload [PR94124]

2020-03-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Mar 10, 2020 at 07:38:17PM -0400, Marek Polacek via Gcc-patches wrote:
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -6062,6 +6062,13 @@ reshape_init_array_1 (tree elt_type, tree max_index, 
> reshape_iter *d,
>else if (last_nonzero < nelts - 1)
>   nelts = last_nonzero + 1;
>  
> +  /* Sharing a stripped constructor can get in the way of
> +  overload resolution.  E.g., initializing a class from
> +  {{0}} might be invalid while initializing the same class
> +  from {{}} might be valid.  */
> +  if (reuse)
> + new_init = unshare_constructor (new_init);
> +
>vec_safe_truncate (CONSTRUCTOR_ELTS (new_init), nelts);

Isn't it wasteful to first copy perhaps a large constructor (recursively)
and then truncate it to very few elts (zero in this case)?
So, perhaps doing instead:
  if (reuse)
    {
      vec *v = NULL;
      if (nelts)
        vec_alloc (v, nelts);
      for (unsigned int i = 0; i < nelts; i++)
        {
          constructor_elt elt = CONSTRUCTOR_ELT (new_init, i);
          if (TREE_CODE (elt.value) == CONSTRUCTOR)
            elt.value = unshare_constructor (elt.value);
          v->quick_push (elt);
        }
      new_init = build_constructor (TREE_TYPE (new_init), v);
    }
  else
    vec_safe_truncate (CONSTRUCTOR_ELTS (new_init), nelts);
?

Jakub
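
To make the cost argument concrete, here is a minimal sketch of the two strategies in plain C++. It does not use GCC's tree API: Ctor, g_copies, copy_then_truncate and copy_prefix are hypothetical stand-ins, and the copy counter exists only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many Ctor nodes have been copy-constructed (illustration only).
static std::size_t g_copies = 0;

// Hypothetical stand-in for a CONSTRUCTOR node: just a list of nested nodes.
struct Ctor {
  std::vector<Ctor> elts;
  Ctor() = default;
  Ctor(const Ctor& other) : elts(other.elts) { ++g_copies; }
  Ctor(Ctor&&) noexcept = default;
  Ctor& operator=(Ctor&&) noexcept = default;
};

// Strategy of the original patch: deep-copy everything, then drop the tail.
Ctor copy_then_truncate(const Ctor& src, std::size_t nelts) {
  assert(nelts <= src.elts.size());
  Ctor copy = src;  // copies every element, including the ones about to go
  copy.elts.resize(nelts);
  return copy;
}

// Strategy of the reply: build a fresh node from the first nelts elements.
Ctor copy_prefix(const Ctor& src, std::size_t nelts) {
  assert(nelts <= src.elts.size());
  Ctor out;
  out.elts.reserve(nelts);
  for (std::size_t i = 0; i < nelts; ++i)
    out.elts.push_back(src.elts[i]);  // deep-copies only the kept elements
  return out;
}
```

Both functions return the same shape of result; with a large source and a small nelts the second one performs far fewer node copies, which is the saving the suggestion is after.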



Re: [testsuite] Fix PR93935 to guard case under vect_hw_misalign

2020-03-10 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping on this patch.  I would also like to request a backport to GCC 9
after some burn-in time.

BR,
Kewen

on 2020/2/26 2:17 PM, Kewen.Lin wrote:
> Hi,
> 
> This patch applies the same fix as r267528 to a similar case,
> bb-slp-over-widen-2.c, which requires misaligned vector access.
> 
> Verified it on ppc64-redhat-linux (Power7 BE).
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> ---
> 
> gcc/testsuite/ChangeLog
> 
> 2020-02-26  Kewen Lin  
> 
>   PR testsuite/93935
>   * gcc.dg/vect/bb-slp-over-widen-2.c: Expect basic block vectorized
>   messages only on vect_hw_misalign targets.
> 
> * patch *:
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c
> index 3750fb7..042b7e9 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c
> @@ -63,4 +63,4 @@ main (void)
>  /* { dg-final { scan-tree-dump "demoting int to signed short" "slp2" { 
> target { ! vect_widen_shift } } } } */
>  /* { dg-final { scan-tree-dump "demoting int to unsigned short" "slp2" { 
> target { ! vect_widen_shift } } } } */
>  /* { dg-final { scan-tree-dump {\.AVG_FLOOR} "slp2" { target vect_avg_qi } } 
> } */
> -/* { dg-final { scan-tree-dump-times "basic block vectorized" 2 "slp2" } } */
> +/* { dg-final { scan-tree-dump-times "basic block vectorized" 2 "slp2" { 
> target vect_hw_misalign } } } */
> 



[committed] Fix length computation for movsi_insv on bfin

2020-03-10 Thread Jeff Law via Gcc-patches

The tester started spitting out these errors on bfin recently:

> Tests that now fail, but worked before (3 tests):
> 
> bfin-sim: c-c++-common/torture/vector-compare-1.c   -Os  (test for excess
> errors)
> bfin-sim: c-c++-common/torture/vector-compare-1.c   -Os  (test for excess
> errors)
> bfin-sim: gcc.c-torture/execute/scal-to-vec1.c   -Os  (test for excess errors)
> 

http://gcc.gnu.org/jenkins/job/bfin-elf/865/console

This can be clearly seen in the assembly files:

>   27  40E10100  R0.H = 1;   // 75   [c=4 l=2]  *movsi_insv/1
>  147  40E10200  R0.H = 2;   // 276  [c=4 l=2]  *movsi_insv/1
>  296  40E10400  R0.H = 4;   // 555  [c=4 l=2]  *movsi_insv/1
>  327  40E10100  R0.H = 1;   // 615  [c=4 l=2]  *movsi_insv/1
>  461  40E10100  R0.H = 1;   // 834  [c=4 l=2]  *movsi_insv/1
>  463  43E10100  R3.H = 1;   // 839  [c=4 l=2]  *movsi_insv/1
>  465  42E10100  R2.H = 1;   // 844  [c=4 l=2]  *movsi_insv/1
>  467  41E10100  R1.H = 1;   // 849  [c=4 l=2]  *movsi_insv/1

According to comments in bfin.md, insns with the type "mvi" should provide
length information directly rather than using the default length computation.
The movsi_insv pattern fails to do that, and the default computation gets it
wrong (note l=2 next to the 4-byte encodings above), resulting in an out of
range branch error from the assembler.

This patch adds an explicit length attribute to the movsi_insv insn.  For
alternative zero, we use the default computation.  For alternative one, the mvi
alternative, we use a constant length of 4, which seems to make everything happy
again.
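
Why a wrong length leads to an out of range branch can be sketched as follows. This is only a toy model in C++, not how GCC's branch shortening is implemented: Insn, span and short_branch_reaches are hypothetical names, and the reach of 2048 bytes used in the usage note is an arbitrary illustrative number, not the real bfin limit.

```cpp
#include <cassert>
#include <vector>

// One instruction as the compiler sees it: the length it assumed while laying
// out branches, and the number of bytes the assembler really emits.
struct Insn {
  int assumed_len;
  int encoded_len;
};

// Byte distance a branch must span to jump over `body`.
long span(const std::vector<Insn>& body, bool use_encoded) {
  long total = 0;
  for (const Insn& insn : body)
    total += use_encoded ? insn.encoded_len : insn.assumed_len;
  return total;
}

// True when a short branch with the given reach can cover the distance.
bool short_branch_reaches(long distance, long max_reach) {
  assert(max_reach >= 0);
  return distance <= max_reach;
}
```

With 600 insns that are assumed to be 2 bytes but encode as 4, the compiler computes a span of 1200 and picks a short branch for a hypothetical reach of 2048, while the assembler sees 2400 bytes and rejects it.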

We can see the effect of fixing that in build #868 where I made the attached
patch available to the tester:

http://gcc.gnu.org/jenkins/job/bfin-elf/868/console

You'll note it no longer complains about vector-compare-1.c at all, nor does it
complain about scal-to-vec1 -Os.  Additionally scal-to-vec1.c now passes at -O1.


Committing to the trunk.

Jeff



commit 5115542a5cc17c5096e6e498c363e75d5bc14276
Author: Jeff Law 
Date:   Tue Mar 10 22:16:19 2020 -0600

Fix length computation for movsi_insv which resulted in regressions due to 
out of range branches on the bfin port.

* config/bfin/bfin.md (movsi_insv): Add length attribute.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5b67b79745f..887a55097db 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2020-03-10  Jeff Law  
+
+   * config/bfin/bfin.md (movsi_insv): Add length attribute.
+
 2020-03-10  Jiufu Guo  
 
PR target/93709
diff --git a/gcc/config/bfin/bfin.md b/gcc/config/bfin/bfin.md
index bb71a32..aecb8138181 100644
--- a/gcc/config/bfin/bfin.md
+++ b/gcc/config/bfin/bfin.md
@@ -752,7 +752,8 @@
   "@
%d0 = %h1 << 0%!
%d0 = %1;"
-  [(set_attr "type" "dsp32shiftimm,mvi")])
+  [(set_attr "type" "dsp32shiftimm,mvi")
+   (set_attr "length" "*,4")])
 
 (define_expand "insv"
   [(set (zero_extract:SI (match_operand:SI 0 "register_operand" "")


Re: [PATCH] c++: Fix wrong conversion error with non-viable overload [PR94124]

2020-03-10 Thread Jason Merrill via Gcc-patches

On 3/10/20 7:38 PM, Marek Polacek wrote:

This is a bad interaction between sharing a constructor for an array
and stripping its trailing zero-initializers.  Here we reuse a ctor
and then strip its 0s.  This breaks overload resolution in this test:
D can be initialized from {} but not from {0}, so if we truncate the
constructor not to include the zero, the F(D) overload becomes valid
and then we get the ambiguous conversion error.

Bootstrapped/regtested on x86_64-linux, ok for trunk?


OK.


PR c++/94124 - wrong conversion error with non-viable overload.
* decl.c (reshape_init_array_1): Unshare a constructor if we
stripped trailing zero-initializers.

* g++.dg/cpp0x/initlist-overload1.C: New test.
---
  gcc/cp/decl.c   |  7 +++
  gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C | 15 +++
  2 files changed, 22 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index bb242743074..aa58e5f88ae 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6062,6 +6062,13 @@ reshape_init_array_1 (tree elt_type, tree max_index, 
reshape_iter *d,
else if (last_nonzero < nelts - 1)
nelts = last_nonzero + 1;
  
+  /* Sharing a stripped constructor can get in the way of
+overload resolution.  E.g., initializing a class from
+{{0}} might be invalid while initializing the same class
+from {{}} might be valid.  */
+  if (reuse)
+   new_init = unshare_constructor (new_init);
+
vec_safe_truncate (CONSTRUCTOR_ELTS (new_init), nelts);
  }
  
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C b/gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C
new file mode 100644
index 000..12bb606ce67
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C
@@ -0,0 +1,15 @@
+// PR c++/94124 - wrong conversion error with non-viable overload.
+// { dg-do compile { target c++11 } }
+
+template  struct A { typedef int _Type[N]; };
+template  struct B { typename A::_Type _M_elems; };
+class C { };
+struct D {
+  D(C);
+};
+
+struct F {
+  F(B<2>);
+  F(D); // This overload should not be viable.
+};
+F fn1() { return {{{0}}}; }

base-commit: 0b7f1e24316cfc1f85408918d1734d3266d65089





[pushed] c++: Fix deferred noexcept on constructor [PR93901].

2020-03-10 Thread Jason Merrill via Gcc-patches
My change in r10-4394 to only update clones when we actually instantiate a
deferred noexcept-spec broke this because deferred parsing updates the
primary function but not the clones.  For GCC 10, let's just revert it.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-03-10  Jason Merrill  

PR c++/93901
* pt.c (maybe_instantiate_noexcept): Always update clones.
---
 gcc/cp/pt.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 179716b5680..cb237ba0d9d 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -25097,14 +25097,14 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t 
complain)
   TREE_TYPE (fn) = build_exception_variant (fntype, spec);
   if (orig_fn)
TREE_TYPE (orig_fn) = TREE_TYPE (fn);
+}
 
-  FOR_EACH_CLONE (clone, fn)
-   {
- if (TREE_TYPE (clone) == fntype)
-   TREE_TYPE (clone) = TREE_TYPE (fn);
- else
-   TREE_TYPE (clone) = build_exception_variant (TREE_TYPE (clone), 
spec);
-   }
+  FOR_EACH_CLONE (clone, fn)
+{
+  if (TREE_TYPE (clone) == fntype)
+   TREE_TYPE (clone) = TREE_TYPE (fn);
+  else
+   TREE_TYPE (clone) = build_exception_variant (TREE_TYPE (clone), spec);
 }
 
   return true;

base-commit: b269a014771776f860730874095dffb34839a466
-- 
2.18.1



[pushed] c++: Fix ICE with omitted template args [PR93956].

2020-03-10 Thread Jason Merrill via Gcc-patches
reshape_init only wants to work on BRACE_ENCLOSED_INITIALIZER_P, i.e. raw
initializer lists, and here was getting a CONSTRUCTOR that had already been
processed for type A.  maybe_aggr_guide should also use that test.
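
A toy model of the distinction, assuming (as I understand the macro) that BRACE_ENCLOSED_INITIALIZER_P means "a CONSTRUCTOR that still carries the special initializer-list marker type". Everything below is hypothetical illustration in plain C++, not GCC's tree representation: Code, Type, Node, init_list_type, type_A and both predicates are made-up names.

```cpp
#include <cassert>

enum class Code { Constructor, Other };

struct Type {
  const char* name;
};

// Hypothetical marker type for "raw braces, not yet given a real type".
static const Type init_list_type{"<brace-enclosed initializer list>"};
static const Type type_A{"A"};

struct Node {
  Code code;
  const Type* type;
};

// The old test: any CONSTRUCTOR node qualifies, processed or not.
bool is_constructor(const Node& n) { return n.code == Code::Constructor; }

// The new test: a CONSTRUCTOR that still carries the marker type.
bool is_raw_brace_list(const Node& n) {
  assert(n.type != nullptr);
  return n.code == Code::Constructor && n.type == &init_list_type;
}
```

A node that has already been processed for type A passes the first predicate but not the second, so a caller guarded by the second one never hands it to reshaping again.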

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog
2020-03-10  Jason Merrill  

PR c++/93956
* pt.c (maybe_aggr_guide): Check BRACE_ENCLOSED_INITIALIZER_P.
---
 gcc/cp/pt.c| 2 +-
 gcc/testsuite/g++.dg/cpp1z/class-deduction70.C | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction70.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 49ee3920049..179716b5680 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -28182,7 +28182,7 @@ maybe_aggr_guide (tree tmpl, tree init, vec 
*args)
   tsubst_flags_t complain = tf_none;
 
   tree parms = NULL_TREE;
-  if (TREE_CODE (init) == CONSTRUCTOR)
+  if (BRACE_ENCLOSED_INITIALIZER_P (init))
 {
   init = reshape_init (type, init, complain);
   if (init == error_mark_node)
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction70.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction70.C
new file mode 100644
index 000..f14bdf0b8ec
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction70.C
@@ -0,0 +1,7 @@
+// PR c++/93596
+
+template  struct A {};
+template  struct B {};
+template  struct C {
+  void foo () { B a = A { foo }; } // { dg-error "" }
+};

base-commit: b269a014771776f860730874095dffb34839a466
-- 
2.18.1



[PATCH] c++: Fix wrong conversion error with non-viable overload [PR94124]

2020-03-10 Thread Marek Polacek via Gcc-patches
This is a bad interaction between sharing a constructor for an array
and stripping its trailing zero-initializers.  Here we reuse a ctor
and then strip its 0s.  This breaks overload resolution in this test:
D can be initialized from {} but not from {0}, so if we truncate the
constructor not to include the zero, the F(D) overload becomes valid
and then we get the ambiguous conversion error.
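
The difference between the two initializers can be seen in a reduced, standalone form. This is a sketch, not the testcase from the patch: B2 is a simplified stand-in for B<2>, and pick, only_d and demo are hypothetical names.

```cpp
#include <cassert>

struct C {};                  // empty aggregate: {} is fine, {0} has too many initializers
struct D { D(C) {} };         // only reachable through a C
struct B2 { int elems[2]; };  // simplified stand-in for B<2>

int pick(B2) { return 1; }
int pick(D) { return 2; }
int only_d(D) { return 2; }

// With the zero kept, D(C) would need C from {0}, which is invalid, so the
// D overload is not viable and the call resolves to pick(B2).
int call_with_zero() { return pick({{0}}); }

// Without the zero, D(C) takes C from {}, which is valid.
int call_d_with_empty() { return only_d({{}}); }

void demo() {
  assert(call_with_zero() == 1);
  assert(call_d_with_empty() == 2);
}
```

Calling pick({{}}) is deliberately left out: there both overloads are viable, which is the ambiguity the compiler wrongly reported once the shared constructor had lost its zero.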

Bootstrapped/regtested on x86_64-linux, ok for trunk?

PR c++/94124 - wrong conversion error with non-viable overload.
* decl.c (reshape_init_array_1): Unshare a constructor if we
stripped trailing zero-initializers.

* g++.dg/cpp0x/initlist-overload1.C: New test.
---
 gcc/cp/decl.c   |  7 +++
 gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C | 15 +++
 2 files changed, 22 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index bb242743074..aa58e5f88ae 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6062,6 +6062,13 @@ reshape_init_array_1 (tree elt_type, tree max_index, 
reshape_iter *d,
   else if (last_nonzero < nelts - 1)
nelts = last_nonzero + 1;
 
+  /* Sharing a stripped constructor can get in the way of
+overload resolution.  E.g., initializing a class from
+{{0}} might be invalid while initializing the same class
+from {{}} might be valid.  */
+  if (reuse)
+   new_init = unshare_constructor (new_init);
+
   vec_safe_truncate (CONSTRUCTOR_ELTS (new_init), nelts);
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C
new file mode 100644
index 000..12bb606ce67
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-overload1.C
@@ -0,0 +1,15 @@
+// PR c++/94124 - wrong conversion error with non-viable overload.
+// { dg-do compile { target c++11 } }
+
+template  struct A { typedef int _Type[N]; };
+template  struct B { typename A::_Type _M_elems; };
+class C { };
+struct D {
+  D(C);
+};
+
+struct F {
+  F(B<2>);
+  F(D); // This overload should not be viable.
+};
+F fn1() { return {{{0}}}; }

base-commit: 0b7f1e24316cfc1f85408918d1734d3266d65089
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



Re: c: ignore initializers for elements of variable-size types [PR93577]

2020-03-10 Thread Joseph Myers
On Tue, 10 Mar 2020, Christophe Lyon wrote:

> sizeless-1.c and sizeless-2.c have the same code, but the latter is
> compiled with -msve-vector-bits=256 and expects different
> warnings/errors.
> For line 33:
> svint8_t *invalid_sve_sc_ptr = &(svint8_t) { *global_sve_sc_ptr };
> we now have:
> sizeless-1.c:33:44: error: empty scalar initializer
> sizeless-1.c:33:44: note: (near initialization for '(anonymous)')
> and
> sizeless-2.c:33:44: error: initializer element is not constant
> sizeless-2.c:33:44: note: (near initialization for 'invalid_sve_sc_ptr')
> sizeless-2.c:33:44: error: SVE type 'svint8_t' does not have a fixed size
> so I think the error comes from the compound literal being treated
> differently with -msve-vector-bits=256

I think the sizeless-2.c diagnostics are correct while there's a problem 
in the sizeless-1.c case (the initializer is not empty, so it should not 
be diagnosed as such).

Does the process_init_element code

  /* Ignore elements of an initializer for a variable-size type.
 Those are diagnosed in digest_init.  */
  if (COMPLETE_TYPE_P (constructor_type)
  && TREE_CODE (TYPE_SIZE (constructor_type)) != INTEGER_CST)
return;

fire for the sizeless-1.c case?  If so, maybe it needs to be restricted in 
some way to apply only to variable size structs / unions / arrays rather 
than whatever kind of variable-size type the SVE types are.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [committed] libstdc++: Fix invalid noexcept-specifier (PR 94117)

2020-03-10 Thread Jonathan Wakely via Gcc-patches

On 10/03/20 17:52 +, Jonathan Wakely wrote:

On 10/03/20 11:40 +, Jonathan Wakely wrote:

G++ fails to diagnose this non-dependent expression, but Clang doesn't
like it.

PR c++/94117
* include/std/ranges (ranges::transform_view::_Iterator::iter_move):
Change expression in noexcept-specifier to match function body.




This patch goes further and removes the __iter_move helper completely,
and the __iter_swap one, in transform_view.

It also does the same in split_view, and fixes a bug where the
noexcept-specifier was always false.

I've also added new _M_i_current() accessors (overloaded for const and
non-const) to return _M_i.__current(). Using this instead of
_M_i._M_current fixes a bug in inner-iterator::operator*() (which is
also present in the working draft).


I missed a few more bugs in the outer iterator, where _M_current was
being used directly despite only being valid for forward ranges. Fixed
by this patch.

Tested powerpc64le-linux, committed to master.


commit 0b7f1e24316cfc1f85408918d1734d3266d65089
Author: Jonathan Wakely 
Date:   Tue Mar 10 22:15:58 2020 +

libstdc++: Fix uses of _M_current in split_view's outer iterator

These direct uses of _M_current should all be __current() so they are
valid when the base type doesn't satisfy the forward_range concept.

* include/std/ranges (split_view::_OuterIter::__at_end): Use __current
instead of _M_current.
(split_view::_OuterIter::operator++): Likewise.
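
    The accessor pattern can be sketched in isolation like this. It is a
    simplified C++17 model, not the libstdc++ code: Parent, OuterIter,
    own_current and the bool template parameter (standing in for "V models
    forward_range") are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for the view that owns the shared position.
struct Parent {
  int current = 0;
};

// Forward == true models a forward range: the iterator keeps its own
// position.  Otherwise the position lives in the parent and is shared.
template <bool Forward>
struct OuterIter {
  Parent* parent = nullptr;
  int own_current = 0;  // only meaningful when Forward is true

  int& current() {
    assert(parent != nullptr);
    if constexpr (Forward)
      return own_current;
    else
      return parent->current;
  }

  OuterIter& operator++() {
    ++current();  // goes through the accessor, never own_current directly
    return *this;
  }
};
```

    Incrementing OuterIter<false> advances the parent's position and leaves
    own_current untouched; using the member directly there would be the kind
    of bug this commit fixes.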

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 4dc7342e2f7..de120d6b55d 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -2703,9 +2703,9 @@ namespace views
 
 	  constexpr bool
 	  __at_end() const
-	  { return _M_current == ranges::end(_M_parent->_M_base); }
+	  { return __current() == ranges::end(_M_parent->_M_base); }
 
-	  // XXX: [24.7.11.3.1]
+	  // [range.split.outer] p1
 	  //  Many of the following specifications refer to the notional member
 	  //  current of outer-iterator.  current is equivalent to current_ if
 	  //  V models forward_range, and parent_->current_ otherwise.
@@ -2798,21 +2798,21 @@ namespace views
 	  operator++()
 	  {
 	const auto __end = ranges::end(_M_parent->_M_base);
-	if (_M_current == __end)
+	if (__current() == __end)
 	  return *this;
 	const auto [__pbegin, __pend] = subrange{_M_parent->_M_pattern};
 	if (__pbegin == __pend)
-	  ++_M_current;
+	  ++__current();
 	else
 	  do
 		{
 		  auto [__b, __p]
-		= __detail::mismatch(std::move(_M_current), __end,
+		= __detail::mismatch(std::move(__current()), __end,
 	 __pbegin, __pend);
-		  _M_current = std::move(__b);
+		  __current() = std::move(__b);
 		  if (__p == __pend)
 		break;
-		} while (++_M_current != __end);
+		} while (++__current() != __end);
 	return *this;
 	  }
 


Re: [PATCH] c++: Fix wrong modifying const object error for COMPONENT_REF [PR94074]

2020-03-10 Thread Jason Merrill via Gcc-patches

On 3/9/20 4:34 PM, Marek Polacek wrote:

On Mon, Mar 09, 2020 at 04:25:00PM -0400, Marek Polacek wrote:

On Mon, Mar 09, 2020 at 03:37:56PM -0400, Jason Merrill wrote:

On 3/9/20 9:40 AM, Marek Polacek wrote:

On Mon, Mar 09, 2020 at 09:19:30AM -0400, Jason Merrill wrote:

On 3/9/20 8:58 AM, Jakub Jelinek wrote:

On Fri, Mar 06, 2020 at 07:43:43PM -0500, Jason Merrill wrote:

On 3/6/20 6:54 PM, Marek Polacek wrote:

I got a report that building Chromium fails with the "modifying a const
object" error.  After some poking I realized it's a bug in GCC, not in
their codebase.

Much like with ARRAY_REFs, which can be const even though the array
itself isn't, COMPONENT_REFs can be const although neither the object
nor the field were declared const.  So let's dial down the checking.
Here the COMPONENT_REF was const because of the "const_cast(m)"
thing -- cxx_eval_component_reference then builds a COMPONENT_REF with
TREE_TYPE (t).


What is folding the const into the COMPONENT_REF?


cxx_eval_component_reference when it is called on
((const struct array *) this)->elems
with /*lval=*/true and lval is true because we are evaluating
 = (const int &) &((const struct array *) 
this)->elems[VIEW_CONVERT_EXPR(n)];


Ah, sure.  We're pretty loose with cv-quals in the constexpr code in
general, so it's probably not worth trying to change that here.  Getting
back to the patch:


Yes, here the additional const was caused by a const_cast adding a const.

But this could also happen with wrapper functions like this one from
__array_traits in std::array:

static constexpr _Tp&
_S_ref(const _Type& __t, std::size_t __n) noexcept
{ return const_cast<_Tp&>(__t[__n]); }

where the ref-to-const parameter added the const.
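
A reduced sketch of that situation, assuming C++14 or later. It is not the testcase from the PR: IntArray3, ref_at, store_and_read and demo are hypothetical names that only mirror the shape of _S_ref.

```cpp
#include <cassert>
#include <cstddef>

struct IntArray3 {
  int elems[3];
};

// Mirrors the shape of _S_ref: takes the storage by reference-to-const and
// hands back a mutable reference.
constexpr int& ref_at(const int (&storage)[3], std::size_t n) noexcept {
  return const_cast<int&>(storage[n]);
}

constexpr int store_and_read() {
  IntArray3 a = {{1, 2, 3}};  // a itself is NOT const
  ref_at(a.elems, 1) = 42;    // fine: the underlying object is non-const
  return a.elems[1];
}

// The store must be accepted during constant evaluation.
static_assert(store_and_read() == 42, "store through const_cast is allowed");

void demo() { assert(store_and_read() == 42); }
```

The const only appears on the access path, not on the object, so the "modifying a const object" diagnostic should not fire for code of this shape.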


+  if (TREE_CODE (obj) == COMPONENT_REF)
+   {
+ tree op1 = TREE_OPERAND (obj, 1);
+ if (CP_TYPE_CONST_P (TREE_TYPE (op1)))
+   return true;
+ else
+   {
+ tree op0 = TREE_OPERAND (obj, 0);
+ /* The LHS of . or -> might itself be a COMPONENT_REF.  */
+ if (TREE_CODE (op0) == COMPONENT_REF)
+   op0 = TREE_OPERAND (op0, 1);
+ return CP_TYPE_CONST_P (TREE_TYPE (op0));
+   }
+   }


Shouldn't this be a loop?


I don't think so, though my earlier patch had a call to

+static bool
+cref_has_const_field (tree ref)
+{
+  while (TREE_CODE (ref) == COMPONENT_REF)
+{
+  if (CP_TYPE_CONST_P (TREE_TYPE (TREE_OPERAND (ref, 1))))
+   return true;
+  ref = TREE_OPERAND (ref, 0);
+}
+  return false;
+}



here.  A problem arose when I checked even the outermost expression (which is
not a field_decl): I then saw another problematic error.

The outer fields are expected to be checked by subsequent calls to
modifying_const_object_p in later iterations of the

4459   for (tree probe = target; object == NULL_TREE; )

loop in cxx_eval_store_expression.


OK, but then why do you want to check two levels here rather than just one?


It's a hack to keep constexpr-tracking-const7.C working.  There we have

   b.a.c.d.n

wherein 'd' is const struct D, but 'n' isn't const.  Without the hack
const_object_being_modified would be 'b.a.c.d', but due to the problem I
described in the original mail[1] the constructor for D wouldn't have
TREE_READONLY set.  With the hack const_object_being_modified will be
'b.a.c.d.n', which is of non-class type so we error:

4710   if (!CLASS_TYPE_P (const_objtype))
4711 fail = true;

I could remove the hack and maybe XFAIL constexpr-tracking-const7.C if you
want.  Unfortunately I wasn't aware of [1] when I added that feature and
checking if the whole COMPONENT_REF is const seemed to be enough.


So if D was a wrapper around another class with the int field, this hack 
looking one level out wouldn't help?



It's probably not a good idea to make this checking more strict at this
stage.

[1] "While looking into this I noticed that we don't detect modifying a const
object in certain cases like in
.  That's because
we never evaluate an X::X() CALL_EXPR -- there's none.  So there's no
CONSTRUCTOR to set TREE_READONLY on.  No idea how to fix this, but it's
likely something for GCC 11 anyway."

How about this?

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 76af0d710c4..b3d3499b9ac 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -4759,6 +4759,14 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
   else
 *valp = init;
 
+  /* After initialization, 'const' semantics apply to the value of the
+ object. Make a note of this fact by marking the CONSTRUCTOR
+ TREE_READONLY.  */
+  if (TREE_CODE (t) == INIT_EXPR
+  && TREE_CODE (*valp) == CONSTRUCTOR
+  && TYPE_READONLY (type))
+TREE_READONLY (*valp) = true;
+
   /* Update TREE_CONSTANT and TREE_SIDE_EFFECTS on enclosing
  CONSTRUCTORs, if any.  */
   tree elt;


[PATCH v2][ARM][GCC][4/2x]: MVE intrinsics with binary operands.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534336.html




Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vsubq_u8, vsubq_n_u8, vrmulhq_u8, vrhaddq_u8, vqsubq_u8, vqsubq_n_u8, vqaddq_u8,
vqaddq_n_u8, vorrq_u8, vornq_u8, vmulq_u8, vmulq_n_u8, vmulltq_int_u8, 
vmullbq_int_u8,
vmulhq_u8, vmladavq_u8, vminvq_u8, vminq_u8, vmaxvq_u8, vmaxq_u8, vhsubq_u8, 
vhsubq_n_u8,
vhaddq_u8, vhaddq_n_u8, veorq_u8, vcmpneq_n_u8, vcmphiq_u8, vcmphiq_n_u8, 
vcmpeqq_u8, 
vcmpeqq_n_u8, vcmpcsq_u8, vcmpcsq_n_u8, vcaddq_rot90_u8, vcaddq_rot270_u8, 
vbicq_u8,
vandq_u8, vaddvq_p_u8, vaddvaq_u8, vaddq_n_u8, vabdq_u8, vshlq_r_u8, vrshlq_u8,
vrshlq_n_u8, vqshlq_u8, vqshlq_r_u8, vqrshlq_u8, vqrshlq_n_u8, vminavq_s8, 
vminaq_s8,
vmaxavq_s8, vmaxaq_s8, vbrsrq_n_u8, vshlq_n_u8, vrshrq_n_u8, vqshlq_n_u8, 
vcmpneq_n_s8,
vcmpltq_s8, vcmpltq_n_s8, vcmpleq_s8, vcmpleq_n_s8, vcmpgtq_s8, vcmpgtq_n_s8, 
vcmpgeq_s8,
vcmpgeq_n_s8, vcmpeqq_s8, vcmpeqq_n_s8, vqshluq_n_s8, vaddvq_p_s8, vsubq_s8, 
vsubq_n_s8,
vshlq_r_s8, vrshlq_s8, vrshlq_n_s8, vrmulhq_s8, vrhaddq_s8, vqsubq_s8, 
vqsubq_n_s8,
vqshlq_s8, vqshlq_r_s8, vqrshlq_s8, vqrshlq_n_s8, vqrdmulhq_s8, vqrdmulhq_n_s8, 
vqdmulhq_s8,
vqdmulhq_n_s8, vqaddq_s8, vqaddq_n_s8, vorrq_s8, vornq_s8, vmulq_s8, 
vmulq_n_s8, vmulltq_int_s8,
vmullbq_int_s8, vmulhq_s8, vmlsdavxq_s8, vmlsdavq_s8, vmladavxq_s8, 
vmladavq_s8, vminvq_s8,
vminq_s8, vmaxvq_s8, vmaxq_s8, vhsubq_s8, vhsubq_n_s8, vhcaddq_rot90_s8, 
vhcaddq_rot270_s8,
vhaddq_s8, vhaddq_n_s8, veorq_s8, vcaddq_rot90_s8, vcaddq_rot270_s8, 
vbrsrq_n_s8, vbicq_s8,
vandq_s8, vaddvaq_s8, vaddq_n_s8, vabdq_s8, vshlq_n_s8, vrshrq_n_s8, 
vqshlq_n_s8, vsubq_u16,
vsubq_n_u16, vrmulhq_u16, vrhaddq_u16, vqsubq_u16, vqsubq_n_u16, vqaddq_u16, 
vqaddq_n_u16,
vorrq_u16, vornq_u16, vmulq_u16, vmulq_n_u16, vmulltq_int_u16, vmullbq_int_u16, 
vmulhq_u16,
vmladavq_u16, vminvq_u16, vminq_u16, vmaxvq_u16, vmaxq_u16, vhsubq_u16, 
vhsubq_n_u16,
vhaddq_u16, vhaddq_n_u16, veorq_u16, vcmpneq_n_u16, vcmphiq_u16, vcmphiq_n_u16, 
vcmpeqq_u16,
vcmpeqq_n_u16, vcmpcsq_u16, vcmpcsq_n_u16, vcaddq_rot90_u16, vcaddq_rot270_u16, 
vbicq_u16,
vandq_u16, vaddvq_p_u16, vaddvaq_u16, vaddq_n_u16, vabdq_u16, vshlq_r_u16, 
vrshlq_u16,
vrshlq_n_u16, vqshlq_u16, vqshlq_r_u16, vqrshlq_u16, vqrshlq_n_u16, 
vminavq_s16, vminaq_s16,
vmaxavq_s16, vmaxaq_s16, vbrsrq_n_u16, vshlq_n_u16, vrshrq_n_u16, vqshlq_n_u16, 
vcmpneq_n_s16,
vcmpltq_s16, vcmpltq_n_s16, vcmpleq_s16, vcmpleq_n_s16, vcmpgtq_s16, 
vcmpgtq_n_s16, 
vcmpgeq_s16, vcmpgeq_n_s16, vcmpeqq_s16, vcmpeqq_n_s16, vqshluq_n_s16, 
vaddvq_p_s16, vsubq_s16,
vsubq_n_s16, vshlq_r_s16, vrshlq_s16, vrshlq_n_s16, vrmulhq_s16, vrhaddq_s16, 
vqsubq_s16,
vqsubq_n_s16, vqshlq_s16, vqshlq_r_s16, vqrshlq_s16, vqrshlq_n_s16, 
vqrdmulhq_s16,
vqrdmulhq_n_s16, vqdmulhq_s16, vqdmulhq_n_s16, vqaddq_s16, vqaddq_n_s16, 
vorrq_s16, vornq_s16,
vmulq_s16, vmulq_n_s16, vmulltq_int_s16, vmullbq_int_s16, vmulhq_s16, 
vmlsdavxq_s16, vmlsdavq_s16,
vmladavxq_s16, vmladavq_s16, vminvq_s16, vminq_s16, vmaxvq_s16, vmaxq_s16, 
vhsubq_s16,
vhsubq_n_s16, vhcaddq_rot90_s16, vhcaddq_rot270_s16, vhaddq_s16, vhaddq_n_s16, 
veorq_s16,
vcaddq_rot90_s16, vcaddq_rot270_s16, vbrsrq_n_s16, vbicq_s16, vandq_s16, 
vaddvaq_s16, vaddq_n_s16,
vabdq_s16, vshlq_n_s16, vrshrq_n_s16, vqshlq_n_s16, vsubq_u32, vsubq_n_u32, 
vrmulhq_u32,
vrhaddq_u32, vqsubq_u32, vqsubq_n_u32, vqaddq_u32, vqaddq_n_u32, vorrq_u32, 
vornq_u32, vmulq_u32,
vmulq_n_u32, vmulltq_int_u32, vmullbq_int_u32, vmulhq_u32, vmladavq_u32, 
vminvq_u32, vminq_u32,
vmaxvq_u32, vmaxq_u32, vhsubq_u32, vhsubq_n_u32, vhaddq_u32, vhaddq_n_u32, 
veorq_u32, vcmpneq_n_u32,
vcmphiq_u32, vcmphiq_n_u32, vcmpeqq_u32, vcmpeqq_n_u32, vcmpcsq_u32, 
vcmpcsq_n_u32,
vcaddq_rot90_u32, vcaddq_rot270_u32, vbicq_u32, vandq_u32, vaddvq_p_u32, 
vaddvaq_u32, vaddq_n_u32,
vabdq_u32, vshlq_r_u32, vrshlq_u32, vrshlq_n_u32, vqshlq_u32, vqshlq_r_u32, 
vqrshlq_u32, vqrshlq_n_u32,
vminavq_s32, vminaq_s32, vmaxavq_s32, vmaxaq_s32, vbrsrq_n_u32, vshlq_n_u32, 
vrshrq_n_u32,
vqshlq_n_u32, vcmpneq_n_s32, vcmpltq_s32, vcmpltq_n_s32, vcmpleq_s32, 
vcmpleq_n_s32, vcmpgtq_s32,
vcmpgtq_n_s32, vcmpgeq_s32, vcmpgeq_n_s32, vcmpeqq_s32, vcmpeqq_n_s32, 
vqshluq_n_s32, vaddvq_p_s32,
vsubq_s32, vsubq_n_s32, vshlq_r_s32, vrshlq_s32, vrshlq_n_s32, vrmulhq_s32, 
vrhaddq_s32, vqsubq_s32,
vqsubq_n_s32, vqshlq_s32, vqshlq_r_s32, vqrshlq_s32, vqrshlq_n_s32, 
vqrdmulhq_s32, vqrdmulhq_n_s32,
vqdmulhq_s32, vqdmulhq_n_s32, vqaddq_s32, vqaddq_n_s32, vorrq_s32, vornq_s32, 
vmulq_s32, vmulq_n_s32,
vmulltq_int_s32, vmullbq_int_s32, vmulhq_s32, vmlsdavxq_s32, vmlsdavq_s32, 
vmladavxq_s32, vmladavq_s32,
vminvq_s32, vminq_s32, vmaxvq_s32, vmaxq_s32, vhsubq_s32, vhsubq_n_s32, 
vhcaddq_rot90_s32,
vhcaddq_rot270_s32, vhaddq_s32, vhaddq_n_s32, veorq_s32, vcaddq_rot90_s32, 
vcaddq_rot270_s32,
vbrsrq_n_s32, vbicq_s32, vandq_s32, vaddvaq_s32, vaddq_n_s32, vabdq_s32, 
vshlq_n_s32, vrshrq_n_s32,
vqshlq_n_

[PATCH v2][ARM][GCC][3/2x]: MVE intrinsics with binary operands.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534329.html




Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vaddlvq_p_s32, vaddlvq_p_u32, vcmpneq_s8, vcmpneq_s16, vcmpneq_s32, vcmpneq_u8,
vcmpneq_u16, vcmpneq_u32, vshlq_s8, vshlq_s16, vshlq_s32, vshlq_u8, vshlq_u16, 
vshlq_u32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details.
[1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (BINOP_NONE_NONE_UNONE_QUALIFIERS): Define
qualifier for binary operands.
(BINOP_UNONE_NONE_NONE_QUALIFIERS): Likewise.
(BINOP_UNONE_UNONE_NONE_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vaddlvq_p_s32): Define macro.
(vaddlvq_p_u32): Likewise.
(vcmpneq_s8): Likewise.
(vcmpneq_s16): Likewise.
(vcmpneq_s32): Likewise.
(vcmpneq_u8): Likewise.
(vcmpneq_u16): Likewise.
(vcmpneq_u32): Likewise.
(vshlq_s8): Likewise.
(vshlq_s16): Likewise.
(vshlq_s32): Likewise.
(vshlq_u8): Likewise.
(vshlq_u16): Likewise.
(vshlq_u32): Likewise.
(__arm_vaddlvq_p_s32): Define intrinsic.
(__arm_vaddlvq_p_u32): Likewise.
(__arm_vcmpneq_s8): Likewise.
(__arm_vcmpneq_s16): Likewise.
(__arm_vcmpneq_s32): Likewise.
(__arm_vcmpneq_u8): Likewise.
(__arm_vcmpneq_u16): Likewise.
(__arm_vcmpneq_u32): Likewise.
(__arm_vshlq_s8): Likewise.
(__arm_vshlq_s16): Likewise.
(__arm_vshlq_s32): Likewise.
(__arm_vshlq_u8): Likewise.
(__arm_vshlq_u16): Likewise.
(__arm_vshlq_u32): Likewise.
(vaddlvq_p): Define polymorphic variant.
(vcmpneq): Likewise.
(vshlq): Likewise.
* config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_UNONE_QUALIFIERS):
Use it.
(BINOP_UNONE_NONE_NONE_QUALIFIERS): Likewise.
(BINOP_UNONE_UNONE_NONE_QUALIFIERS): Likewise.
* config/arm/mve.md (mve_vaddlvq_p_v4si): Define RTL pattern.
(mve_vcmpneq_): Likewise.
(mve_vshlq_): Likewise.

gcc/testsuite/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vaddlvq_p_s32.c: New test.
* gcc.target/arm/mve/intrinsics/vaddlvq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshlq_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 5db4db498b71224a4b5a0f5b6aa3476b351f7fd3..264006bc8645d8e73149961debad8eb82bab8af0 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -415,6 +415,24 @@ arm_binop_unone_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define BINOP_UNONE_NONE_IMM_QUALIFIERS \
   (arm_binop_unone_none_imm_qualifiers)
 
+static enum arm_type_qualifiers
+arm_binop_none_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_unsigned };
+#define BINOP_NONE_NONE_UNONE_QUALIFIERS \
+  (arm_binop_none_none_unone_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_unone_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_none, qualifier_none };
+#define BINOP_UNONE_NONE_NONE_QUALIFIERS \
+  (arm_binop_unone_none_none_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_unone_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_none };
+#define BINOP_UNONE_UNONE_NONE_QUALIFIERS \
+  (arm_binop_unone_unone_none_qualifiers)
+
 /* End of Qualifier for MVE builtins.  */
 
/* void ([T element type] *, T, immediate).  */
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
37d5b1c0c4914efab1b765298a7ce15726e45183..5aa347312ae2bf160e9664600ae1d5057f10

[PATCH v2][ARM][GCC][3/1x]: MVE intrinsics with unary operand.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534321.html




Hello,

This patch supports the following MVE ACLE intrinsics with a unary operand.

vdupq_n_s8, vdupq_n_s16, vdupq_n_s32, vabsq_s8, vabsq_s16, vabsq_s32, vclsq_s8, 
vclsq_s16, vclsq_s32, vclzq_s8, vclzq_s16, vclzq_s32, vnegq_s8, vnegq_s16, 
vnegq_s32, vaddlvq_s32, vaddvq_s8, vaddvq_s16, vaddvq_s32, vmovlbq_s8, 
vmovlbq_s16, vmovltq_s8, vmovltq_s16, vmvnq_s8, vmvnq_s16, vmvnq_s32, 
vrev16q_s8, vrev32q_s8, vrev32q_s16, vqabsq_s8, vqabsq_s16, vqabsq_s32, 
vqnegq_s8, vqnegq_s16, vqnegq_s32, vcvtaq_s16_f16, vcvtaq_s32_f32, 
vcvtnq_s16_f16, vcvtnq_s32_f32, vcvtpq_s16_f16, vcvtpq_s32_f32, vcvtmq_s16_f16, 
vcvtmq_s32_f32, vmvnq_u8, vmvnq_u16, vmvnq_u32, vdupq_n_u8, vdupq_n_u16, 
vdupq_n_u32, vclzq_u8, vclzq_u16, vclzq_u32, vaddvq_u8, vaddvq_u16, vaddvq_u32, 
vrev32q_u8, vrev32q_u16, vmovltq_u8, vmovltq_u16, vmovlbq_u8, vmovlbq_u16, 
vrev16q_u8, vaddlvq_u32, vcvtpq_u16_f16, vcvtpq_u32_f32, vcvtnq_u16_f16, 
vcvtmq_u16_f16, vcvtmq_u32_f32, vcvtaq_u16_f16, vcvtaq_u32_f32, vdupq_n, vabsq, 
vclsq, vclzq, vnegq, vaddlvq, vaddvq, vmovlbq, vmovltq, vmvnq, vrev16q, 
vrev32q, vqabsq, vqnegq.

A new register class "EVEN_REGS", which allows only even registers, is added in 
this patch.

The new constraint "e" allows only registers of the EVEN_REGS class.
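
To illustrate what an "even registers only" class means, here is a minimal
sketch of a membership test on a register bitmask. The mask value 0x5555
(r0, r2, ..., r14) and the helper name are assumptions made for illustration;
the actual contents of EVEN_REGS are defined by the patch in arm.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask: one bit per core register r0..r15, with only the
   even-numbered ones set.  Not copied from the patch.  */
#define EVEN_REGS_MASK_SKETCH UINT32_C (0x5555)

/* Return true if REGNO would belong to the sketched even-register class.  */
static bool
in_even_regs_sketch (unsigned int regno)
{
  return regno < 16 && ((EVEN_REGS_MASK_SKETCH >> regno) & 1u) != 0;
}
```

A constraint such as "e" would then accept an operand only when a check of
this kind holds for the hard register it is allocated to.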

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
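
For readers unfamiliar with the saturating variants in the list above, the
following is a scalar sketch of what vqabsq_s8 and vqnegq_s8 compute per lane.
The function names are made up for illustration and the code is a portable
model, not the intrinsic implementation.

```c
#include <stdint.h>

/* Saturating absolute value of one 8-bit lane: |-128| does not fit in
   int8_t, so it saturates to 127.  */
static int8_t
qabs_s8_model (int8_t x)
{
  if (x == INT8_MIN)
    return INT8_MAX;
  return (int8_t) (x < 0 ? -x : x);
}

/* Saturating negation of one 8-bit lane, with the same special case.  */
static int8_t
qneg_s8_model (int8_t x)
{
  if (x == INT8_MIN)
    return INT8_MAX;
  return (int8_t) -x;
}
```

The non-saturating vabsq and vnegq differ only in that special case, where
the result wraps instead.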

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.


Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm.h (enum reg_class): Define new class EVEN_REGS.
* config/arm/arm_mve.h (vdupq_n_s8): Define macro.
(vdupq_n_s16): Likewise.
(vdupq_n_s32): Likewise.
(vabsq_s8): Likewise.
(vabsq_s16): Likewise.
(vabsq_s32): Likewise.
(vclsq_s8): Likewise.
(vclsq_s16): Likewise.
(vclsq_s32): Likewise.
(vclzq_s8): Likewise.
(vclzq_s16): Likewise.
(vclzq_s32): Likewise.
(vnegq_s8): Likewise.
(vnegq_s16): Likewise.
(vnegq_s32): Likewise.
(vaddlvq_s32): Likewise.
(vaddvq_s8): Likewise.
(vaddvq_s16): Likewise.
(vaddvq_s32): Likewise.
(vmovlbq_s8): Likewise.
(vmovlbq_s16): Likewise.
(vmovltq_s8): Likewise.
(vmovltq_s16): Likewise.
(vmvnq_s8): Likewise.
(vmvnq_s16): Likewise.
(vmvnq_s32): Likewise.
(vrev16q_s8): Likewise.
(vrev32q_s8): Likewise.
(vrev32q_s16): Likewise.
(vqabsq_s8): Likewise.
(vqabsq_s16): Likewise.
(vqabsq_s32): Likewise.
(vqnegq_s8): Likewise.
(vqnegq_s16): Likewise.
(vqnegq_s32): Likewise.
(vcvtaq_s16_f16): Likewise.
(vcvtaq_s32_f32): Likewise.
(vcvtnq_s16_f16): Likewise.
(vcvtnq_s32_f32): Likewise.
(vcvtpq_s16_f16): Likewise.
(vcvtpq_s32_f32): Likewise.
(vcvtmq_s16_f16): Likewise.
(vcvtmq_s32_f32): Likewise.
(vmvnq_u8): Likewise.
(vmvnq_u16): Likewise.
(vmvnq_u32): Likewise.
(vdupq_n_u8): Likewise.
(vdupq_n_u16): Likewise.
(vdupq_n_u32): Likewise.
(vclzq_u8): Likewise.
(vclzq_u16): Likewise.
(vclzq_u32): Likewise.
(vaddvq_u8): Likewise.
(vaddvq_u16): Likewise.
(vaddvq_u32): Likewise.
(vrev32q_u8): Likewise.
(vrev32q_u16): Likewise.
(vmovltq_u8): Likewise.
(vmovltq_u16): Likewise.
(vmovlbq_u8): Likewise.
(vmovlbq_u16): Likewise.
(vrev16q_u8): Likewise.
(vaddlvq_u32): Likewise.
(vcvtpq_u16_f16): Likewise.
(vcvtpq_u32_f32): Likewise.
(vcvtnq_u16_f16): Likewise.
(vcvtmq_u16_f16): Likewise.
(vcvtmq_u32_f32): Likewise.
(vcvtaq_u16_f16): Likewise.
(vcvtaq_u32_f32): Likewise.
(__arm_vdupq_n_s8): Define intrinsic.
(__arm_vdupq_n_s16): Likewise.
(__arm_vdupq_n_s32): Likewise.
(__arm_vabsq_s8): Likewise.
(__arm_vabsq_s16): Likewise.
(__arm_vabsq_s32): Likewise.
(__arm_vclsq_s8): Likewise.
(__arm_vclsq_s16): Likewise.
(__arm_vclsq_s32): Likewise.
(__arm_vclzq_s8): Likewise.
(__arm_vclzq_s16): Likewise.
(__arm_vclzq_s32): Likewise.
(__arm_vnegq_s8): Likewise.
(__arm_vnegq_s16): Likewise.
(__arm_vnegq_s32): Likewise.
(__arm_vaddlvq_s32): Likewise.
(__arm_vaddvq_s8): Likewise.
(__arm_vaddvq_s16): Likewise.
(__arm_vaddvq_s32): Likewise.
(__arm_vmovlbq_s8): Likewise.
(__arm_vmovlbq_s16): Likewise.
(__arm_vmov

[PATCH v2][ARM][GCC][5/2x]: MVE intrinsics with binary operands.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534330.html




Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vqmovntq_u16, vqmovnbq_u16, vmulltq_poly_p8, vmullbq_poly_p8, vmovntq_u16,
vmovnbq_u16, vmlaldavxq_u16, vmlaldavq_u16, vqmovuntq_s16, vqmovunbq_s16, 
vshlltq_n_u8, vshllbq_n_u8, vorrq_n_u16, vbicq_n_u16, vcmpneq_n_f16, 
vcmpneq_f16,
vcmpltq_n_f16, vcmpltq_f16, vcmpleq_n_f16, vcmpleq_f16, vcmpgtq_n_f16,
vcmpgtq_f16, vcmpgeq_n_f16, vcmpgeq_f16, vcmpeqq_n_f16, vcmpeqq_f16, vsubq_f16,
vqmovntq_s16, vqmovnbq_s16, vqdmulltq_s16, vqdmulltq_n_s16, vqdmullbq_s16,
vqdmullbq_n_s16, vorrq_f16, vornq_f16, vmulq_n_f16, vmulq_f16, vmovntq_s16,
vmovnbq_s16, vmlsldavxq_s16, vmlsldavq_s16, vmlaldavxq_s16, vmlaldavq_s16,
vminnmvq_f16, vminnmq_f16, vminnmavq_f16, vminnmaq_f16, vmaxnmvq_f16, 
vmaxnmq_f16,
vmaxnmavq_f16, vmaxnmaq_f16, veorq_f16, vcmulq_rot90_f16, vcmulq_rot270_f16,
vcmulq_rot180_f16, vcmulq_f16, vcaddq_rot90_f16, vcaddq_rot270_f16, vbicq_f16,
vandq_f16, vaddq_n_f16, vabdq_f16, vshlltq_n_s8, vshllbq_n_s8, vorrq_n_s16, 
vbicq_n_s16, vqmovntq_u32, vqmovnbq_u32, vmulltq_poly_p16, vmullbq_poly_p16,
vmovntq_u32, vmovnbq_u32, vmlaldavxq_u32, vmlaldavq_u32, vqmovuntq_s32,
vqmovunbq_s32, vshlltq_n_u16, vshllbq_n_u16, vorrq_n_u32, vbicq_n_u32, 
vcmpneq_n_f32, vcmpneq_f32, vcmpltq_n_f32, vcmpltq_f32, vcmpleq_n_f32, 
vcmpleq_f32, vcmpgtq_n_f32, vcmpgtq_f32, vcmpgeq_n_f32, vcmpgeq_f32, 
vcmpeqq_n_f32, vcmpeqq_f32, vsubq_f32, vqmovntq_s32, vqmovnbq_s32, 
vqdmulltq_s32, vqdmulltq_n_s32, vqdmullbq_s32, vqdmullbq_n_s32, vorrq_f32,
vornq_f32, vmulq_n_f32, vmulq_f32, vmovntq_s32, vmovnbq_s32, vmlsldavxq_s32,
vmlsldavq_s32, vmlaldavxq_s32, vmlaldavq_s32, vminnmvq_f32, vminnmq_f32,
vminnmavq_f32, vminnmaq_f32, vmaxnmvq_f32, vmaxnmq_f32, vmaxnmavq_f32,
vmaxnmaq_f32, veorq_f32, vcmulq_rot90_f32, vcmulq_rot270_f32, vcmulq_rot180_f32,
vcmulq_f32, vcaddq_rot90_f32, vcaddq_rot270_f32, vbicq_f32, vandq_f32, 
vaddq_n_f32, vabdq_f32, vshlltq_n_s16, vshllbq_n_s16, vorrq_n_s32, vbicq_n_s32, 
vrmlaldavhq_u32, vctp8q_m, vctp64q_m, vctp32q_m, vctp16q_m, vaddlvaq_u32, 
vrmlsldavhxq_s32, vrmlsldavhq_s32, vrmlaldavhxq_s32, vrmlaldavhq_s32,
vcvttq_f16_f32, vcvtbq_f16_f32, vaddlvaq_s32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
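
As an example of one of the across-vector operations listed above, here is a
scalar sketch of vmlaldavq_s16: the products of corresponding 16-bit lanes
are widened and summed into a 64-bit result. The function name is
hypothetical and the code is a portable model only.

```c
#include <stdint.h>

/* Scalar model of a multiply-accumulate-long across eight 16-bit lanes:
   every product is widened before it is added to the 64-bit sum.  */
static int64_t
vmlaldavq_s16_model (const int16_t a[8], const int16_t b[8])
{
  int64_t sum = 0;
  for (int i = 0; i < 8; i++)
    sum += (int64_t) a[i] * (int64_t) b[i];
  return sum;
}
```

The 64-bit accumulator matters here: eight products of 32767 * 32767 already
exceed the range of a 32-bit integer.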

The above intrinsics are defined using the already defined builtin qualifiers 
BINOP_NONE_NONE_IMM, BINOP_NONE_NONE_NONE, BINOP_UNONE_NONE_NONE,
BINOP_UNONE_UNONE_IMM, BINOP_UNONE_UNONE_NONE and BINOP_UNONE_UNONE_UNONE.

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-23  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm_mve.h (vqmovntq_u16): Define macro.
(vqmovnbq_u16): Likewise.
(vmulltq_poly_p8): Likewise.
(vmullbq_poly_p8): Likewise.
(vmovntq_u16): Likewise.
(vmovnbq_u16): Likewise.
(vmlaldavxq_u16): Likewise.
(vmlaldavq_u16): Likewise.
(vqmovuntq_s16): Likewise.
(vqmovunbq_s16): Likewise.
(vshlltq_n_u8): Likewise.
(vshllbq_n_u8): Likewise.
(vorrq_n_u16): Likewise.
(vbicq_n_u16): Likewise.
(vcmpneq_n_f16): Likewise.
(vcmpneq_f16): Likewise.
(vcmpltq_n_f16): Likewise.
(vcmpltq_f16): Likewise.
(vcmpleq_n_f16): Likewise.
(vcmpleq_f16): Likewise.
(vcmpgtq_n_f16): Likewise.
(vcmpgtq_f16): Likewise.
(vcmpgeq_n_f16): Likewise.
(vcmpgeq_f16): Likewise.
(vcmpeqq_n_f16): Likewise.
(vcmpeqq_f16): Likewise.
(vsubq_f16): Likewise.
(vqmovntq_s16): Likewise.
(vqmovnbq_s16): Likewise.
(vqdmulltq_s16): Likewise.
(vqdmulltq_n_s16): Likewise.
(vqdmullbq_s16): Likewise.
(vqdmullbq_n_s16): Likewise.
(vorrq_f16): Likewise.
(vornq_f16): Likewise.
(vmulq_n_f16): Likewise.
(vmulq_f16): Likewise.
(vmovntq_s16): Likewise.
(vmovnbq_s16): Likewise.
(vmlsldavxq_s16): Likewise.
(vmlsldavq_s16): Likewise.
(vmlaldavxq_s16): Likewise.
(vmlaldavq_s16): Likewise.
(vminnmvq_f16): Likewise.
(vminnmq_f16): Likewise.
(vminnmavq_f16): Likewise.
(vminnmaq_f16): Likewise.
(vmaxnmvq_f16): Likewise.
(vmaxnmq_f16): Likewise.
(vmaxnmavq_f16): Likewise.
(vmaxnmaq_f16): Likewise.
(veorq_f16): Likewise.
(vcmulq_rot90_f16): Likewise.
(vcmulq_rot270_f16): Likewise.
(vcmulq_rot180_f16): Likewise.
(vcmulq_f16): Likewise.
(vcaddq_rot90_f16): Likewise.
(vcaddq_rot270_f16): Likewise.
(vbicq_f16): Likewise.
 

[PATCH v2][ARM][GCC][4/1x]: MVE intrinsics with unary operand.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534342.html




Hello,

This patch supports the following MVE ACLE intrinsics with a unary operand.

vctp16q, vctp32q, vctp64q, vctp8q, vpnot.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
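
To make the predicate intrinsics concrete, below is a scalar sketch of what
vctp8q/vctp16q/vctp32q/vctp64q and vpnot compute, assuming the 16-bit
predicate holds one bit per byte of the 128-bit vector. The type and function
names are stand-ins invented for this sketch.

```c
#include <stdint.h>

typedef uint16_t pred16_model_t;  /* stand-in for mve_pred16_t */

/* Build a tail predicate: enable the first N lanes of LANE_BYTES bytes
   each (1, 2, 4 or 8), using one predicate bit per byte of the vector.  */
static pred16_model_t
vctp_model (uint32_t n, unsigned int lane_bytes)
{
  uint32_t lanes = 16u / lane_bytes;
  uint32_t active = n < lanes ? n : lanes;
  uint32_t bits = active * lane_bytes;          /* 0..16 */
  return (pred16_model_t) ((UINT32_C (1) << bits) - 1u);
}

/* Invert every predicate bit.  */
static pred16_model_t
vpnot_model (pred16_model_t p)
{
  return (pred16_model_t) ~p;
}
```

For instance, vctp_model (3, 2) models vctp16q (3): three 16-bit lanes are
active, so the low six predicate bits are set.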

There are a few conflicts in defining the machine registers; they are resolved by 
re-ordering VPR_REGNUM, APSRQ_REGNUM and APSRGE_REGNUM.

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.


Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (hi_UP): Define mode.
* config/arm/arm.h (IS_VPR_REGNUM): Move.
* config/arm/arm.md (VPR_REGNUM): Define before APSRQ_REGNUM.
(APSRQ_REGNUM): Modify.
(APSRGE_REGNUM): Modify.
* config/arm/arm_mve.h (vctp16q): Define macro.
(vctp32q): Likewise.
(vctp64q): Likewise.
(vctp8q): Likewise.
(vpnot): Likewise.
(__arm_vctp16q): Define intrinsic.
(__arm_vctp32q): Likewise.
(__arm_vctp64q): Likewise.
(__arm_vctp8q): Likewise.
(__arm_vpnot): Likewise.
* config/arm/arm_mve_builtins.def (UNOP_UNONE_UNONE): Use builtin
qualifier.
* config/arm/mve.md (mve_vctpqhi): Define RTL pattern.
(mve_vpnothi): Likewise.

gcc/testsuite/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vctp16q.c: New test.
* gcc.target/arm/mve/intrinsics/vctp32q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp64q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vctp8q.c: Likewise.
* gcc.target/arm/mve/intrinsics/vpnot.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
0d5da5ea3133ca39b55b4ca76507e01956997c03..fd449458bcb1f9a899a16e432aa015d48b665868
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -415,6 +415,7 @@ arm_set_sat_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define hf_UP   E_HFmode
 #define bf_UP   E_BFmode
 #define si_UP   E_SImode
+#define hi_UP   E_HImode
 #define void_UP E_VOIDmode
 #define sf_UP   E_SFmode
 #define UP(X) X##_UP
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
912849f0acd36e9c8c3a00f4253a691b7085e72d..7f94e11c0ea23dfbdb7e64fafbdfc68d83411865
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -192,6 +192,11 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t;
 #define vcvtmq_u32_f32(__a) __arm_vcvtmq_u32_f32(__a)
 #define vcvtaq_u16_f16(__a) __arm_vcvtaq_u16_f16(__a)
 #define vcvtaq_u32_f32(__a) __arm_vcvtaq_u32_f32(__a)
+#define vctp16q(__a) __arm_vctp16q(__a)
+#define vctp32q(__a) __arm_vctp32q(__a)
+#define vctp64q(__a) __arm_vctp64q(__a)
+#define vctp8q(__a) __arm_vctp8q(__a)
+#define vpnot(__a) __arm_vpnot(__a)
 #endif
 
 __extension__ extern __inline void
@@ -703,6 +708,41 @@ __arm_vaddlvq_u32 (uint32x4_t __a)
   return __builtin_mve_vaddlvq_uv4si (__a);
 }
 
+__extension__ extern __inline mve_pred16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vctp16q (uint32_t __a)
+{
+  return __builtin_mve_vctp16qhi (__a);
+}
+
+__extension__ extern __inline mve_pred16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vctp32q (uint32_t __a)
+{
+  return __builtin_mve_vctp32qhi (__a);
+}
+
+__extension__ extern __inline mve_pred16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vctp64q (uint32_t __a)
+{
+  return __builtin_mve_vctp64qhi (__a);
+}
+
+__extension__ extern __inline mve_pred16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vctp8q (uint32_t __a)
+{
+  return __builtin_mve_vctp8qhi (__a);
+}
+
+__extension__ extern __inline mve_pred16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vpnot (mve_pred16_t __a)
+{
+  return __builtin_mve_vpnothi (__a);
+}
+
 #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point.  */
 
 __extension__ extern __inline void
diff --git a/gcc/config/arm/arm_mve_builtins.def 
b/gcc/config/arm/arm_mve_builtins.def
index 
44807d6e8c4a4717c4f2fd2ef7015708ca3af4bc..5d5696965457e4fe138c194d7f3c3c5737bf68d0
 100644
--- a/gcc/config/arm/arm_mve_builtins.def
+++ b/gcc/config/arm/arm_mve_builtins.def
@@ -71,3 +71,8 @@ VAR2 (UNOP_UNONE_NONE, vcvtaq_u, v8hi, v4si)
 VAR2 (UNOP_UNONE_IMM, vmvnq_n_u, v8hi, v4si)
 VAR1 (UNOP_UNONE_UNONE, vrev16q_u, v16qi)
 VAR1 (UNOP_UNONE_UNONE, vaddlvq_u, v4si)
+VAR1 (UNOP_UNONE_UNONE, vctp16q, hi)
+VAR1 (UNOP_UNONE_UNONE, vctp32q, hi

[PATCH v2][ARM][GCC][2/2x]: MVE intrinsics with binary operands.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534325.html




Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vcvtq_n_s16_f16, vcvtq_n_s32_f32, vcvtq_n_u16_f16, vcvtq_n_u32_f32, vcreateq_u8,
vcreateq_u16, vcreateq_u32, vcreateq_u64, vcreateq_s8, vcreateq_s16, 
vcreateq_s32,
vcreateq_s64, vshrq_n_s8, vshrq_n_s16, vshrq_n_s32, vshrq_n_u8, vshrq_n_u16, 
vshrq_n_u32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

In this patch new constraints "Rb" and "Rf" are added, which check that the 
constant is within the range of 1 to 8 and 1 to 32 respectively.

New predicates "mve_imm_8" and "mve_imm_32" are also added, to check the 
matching constraints Rb and Rf respectively.
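
The range checks described above can be sketched in plain C as follows,
together with a scalar model of vshrq_n_u8 that uses the 1 to 8 range for its
shift amount. All names are hypothetical; in the patch the checks are
expressed as constraints and predicates in the machine description, not as C
helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range checks corresponding to the two ranges quoted above.  */
static bool imm_in_1_to_8 (long v)  { return v >= 1 && v <= 8; }
static bool imm_in_1_to_32 (long v) { return v >= 1 && v <= 32; }

/* Scalar model of a per-lane logical right shift on sixteen 8-bit lanes.
   Returns false and leaves OUT untouched when SHIFT is out of range.  */
static bool
vshrq_n_u8_model (uint8_t out[16], const uint8_t in[16], long shift)
{
  if (!imm_in_1_to_8 (shift))
    return false;
  for (int i = 0; i < 16; i++)
    out[i] = (uint8_t) (in[i] >> shift);
  return true;
}
```

In the compiler an out-of-range immediate is rejected at compile time; the
boolean return value here only stands in for that diagnostic.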

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (BINOP_UNONE_UNONE_IMM_QUALIFIERS): Define
qualifier for binary operands.
(BINOP_UNONE_UNONE_UNONE_QUALIFIERS): Likewise.
(BINOP_UNONE_NONE_IMM_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vcvtq_n_s16_f16): Define macro.
(vcvtq_n_s32_f32): Likewise.
(vcvtq_n_u16_f16): Likewise.
(vcvtq_n_u32_f32): Likewise.
(vcreateq_u8): Likewise.
(vcreateq_u16): Likewise.
(vcreateq_u32): Likewise.
(vcreateq_u64): Likewise.
(vcreateq_s8): Likewise.
(vcreateq_s16): Likewise.
(vcreateq_s32): Likewise.
(vcreateq_s64): Likewise.
(vshrq_n_s8): Likewise.
(vshrq_n_s16): Likewise.
(vshrq_n_s32): Likewise.
(vshrq_n_u8): Likewise.
(vshrq_n_u16): Likewise.
(vshrq_n_u32): Likewise.
(__arm_vcreateq_u8): Define intrinsic.
(__arm_vcreateq_u16): Likewise.
(__arm_vcreateq_u32): Likewise.
(__arm_vcreateq_u64): Likewise.
(__arm_vcreateq_s8): Likewise.
(__arm_vcreateq_s16): Likewise.
(__arm_vcreateq_s32): Likewise.
(__arm_vcreateq_s64): Likewise.
(__arm_vshrq_n_s8): Likewise.
(__arm_vshrq_n_s16): Likewise.
(__arm_vshrq_n_s32): Likewise.
(__arm_vshrq_n_u8): Likewise.
(__arm_vshrq_n_u16): Likewise.
(__arm_vshrq_n_u32): Likewise.
(__arm_vcvtq_n_s16_f16): Likewise.
(__arm_vcvtq_n_s32_f32): Likewise.
(__arm_vcvtq_n_u16_f16): Likewise.
(__arm_vcvtq_n_u32_f32): Likewise.
(vshrq_n): Define polymorphic variant.
* config/arm/arm_mve_builtins.def (BINOP_UNONE_UNONE_IMM_QUALIFIERS):
Use it.
(BINOP_UNONE_UNONE_UNONE_QUALIFIERS): Likewise.
(BINOP_UNONE_NONE_IMM_QUALIFIERS): Likewise.
* config/arm/constraints.md (Rb): Define constraint to check constant is
in the range of 1 to 8.
(Rf): Define constraint to check constant is in the range of 1 to 32.
* config/arm/mve.md (mve_vcreateq_): Define RTL pattern.
(mve_vshrq_n_): Likewise.
(mve_vcvtq_n_from_f_): Likewise.
* config/arm/predicates.md (mve_imm_8): Define predicate to check
the matching constraint Rb.
(mve_imm_32): Define predicate to check the matching constraint Rf.

gcc/testsuite/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vcreateq_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vcreateq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_s16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vshrq_n_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
in

[PATCH v2][ARM][GCC][2/1x]: MVE intrinsics with unary operand.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534326.html




Hello,

This patch supports the following MVE ACLE intrinsics with a unary operand.

vmvnq_n_s16, vmvnq_n_s32, vrev64q_s8, vrev64q_s16, vrev64q_s32, vcvtq_s16_f16, 
vcvtq_s32_f32,
vrev64q_u8, vrev64q_u16, vrev64q_u32, vmvnq_n_u16, vmvnq_n_u32, vcvtq_u16_f16, 
vcvtq_u32_f32,
vrev64q.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
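
As an illustration of the vrev64q family, here is a scalar sketch for the
8-bit variant: the element order is reversed within each 64-bit half of the
vector. The function name is hypothetical and the code is a portable model,
not the builtin.

```c
#include <stdint.h>

/* Scalar model of vrev64q on sixteen 8-bit lanes: lanes 0..7 are reversed
   among themselves, and so are lanes 8..15.  */
static void
vrev64q_u8_model (uint8_t out[16], const uint8_t in[16])
{
  for (int i = 0; i < 16; i++)
    out[i] = in[(i & 8) | (7 - (i & 7))];
}
```

The 16-bit and 32-bit variants follow the same pattern with four and two
elements per 64-bit half respectively.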

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.


Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (UNOP_SNONE_SNONE_QUALIFIERS): Define.
(UNOP_SNONE_NONE_QUALIFIERS): Likewise.
(UNOP_SNONE_IMM_QUALIFIERS): Likewise.
(UNOP_UNONE_NONE_QUALIFIERS): Likewise.
(UNOP_UNONE_UNONE_QUALIFIERS): Likewise.
(UNOP_UNONE_IMM_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vmvnq_n_s16): Define macro.
(vmvnq_n_s32): Likewise.
(vrev64q_s8): Likewise.
(vrev64q_s16): Likewise.
(vrev64q_s32): Likewise.
(vcvtq_s16_f16): Likewise.
(vcvtq_s32_f32): Likewise.
(vrev64q_u8): Likewise.
(vrev64q_u16): Likewise.
(vrev64q_u32): Likewise.
(vmvnq_n_u16): Likewise.
(vmvnq_n_u32): Likewise.
(vcvtq_u16_f16): Likewise.
(vcvtq_u32_f32): Likewise.
(__arm_vmvnq_n_s16): Define intrinsic.
(__arm_vmvnq_n_s32): Likewise.
(__arm_vrev64q_s8): Likewise.
(__arm_vrev64q_s16): Likewise.
(__arm_vrev64q_s32): Likewise.
(__arm_vrev64q_u8): Likewise.
(__arm_vrev64q_u16): Likewise.
(__arm_vrev64q_u32): Likewise.
(__arm_vmvnq_n_u16): Likewise.
(__arm_vmvnq_n_u32): Likewise.
(__arm_vcvtq_s16_f16): Likewise.
(__arm_vcvtq_s32_f32): Likewise.
(__arm_vcvtq_u16_f16): Likewise.
(__arm_vcvtq_u32_f32): Likewise.
(vrev64q): Define polymorphic variant.
* config/arm/arm_mve_builtins.def (UNOP_SNONE_SNONE): Use it.
(UNOP_SNONE_NONE): Likewise.
(UNOP_SNONE_IMM): Likewise.
(UNOP_UNONE_UNONE): Likewise.
(UNOP_UNONE_NONE): Likewise.
(UNOP_UNONE_IMM): Likewise.
* config/arm/mve.md (mve_vrev64q_): Define RTL pattern.
(mve_vcvtq_from_f_): Likewise.
(mve_vmvnq_n_): Likewise.

gcc/testsuite/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vcvtq_s16_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vcvtq_s32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_u16_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_u32_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmvnq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrev64q_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
7bf5ef5479722a6cb08cc030e1aa7d6d7fad4599..97354c245ff42e7db2934e1045b8707033511e11
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -337,6 +337,42 @@ arm_unop_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define UNOP_NONE_UNONE_QUALIFIERS \
   (arm_unop_none_unone_qualifiers)
 
+static enum arm_type_qualifiers
+arm_unop_snone_snone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none };
+#define UNOP_SNONE_SNONE_QUALIFIERS \
+  (arm_unop_snone_snone_qualifiers)
+
+static enum arm_type_qualifiers
+arm_unop_snone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none };
+#define UNOP_SNONE_NONE_QUALIFIERS \
+  (arm_unop_snone_none_qualifiers)
+
+static enum arm_type_qualifiers
+arm_unop_snone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_immediate };
+#define UNOP_SNONE_IMM_QUALIFIERS \
+  (arm_unop_snone_imm_qualifiers)
+
+static enum arm_type_qualifiers
+arm_unop_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_none };
+#define UNOP_UNONE_NONE_QUALIFIERS \
+  (arm_unop_unone_none_qualifiers)
+
+static

[PATCH v2][ARM][GCC][4/x]: MVE ACLE vector interleaving store intrinsics.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534328.html



Hello,

This patch supports MVE ACLE intrinsics vst4q_s8, vst4q_s16, vst4q_s32, vst4q_u8,
vst4q_u16, vst4q_u32, vst4q_f16 and vst4q_f32.
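
For reference, the net effect of a 4-way interleaving store can be sketched
in scalar C as below, here for 32-bit lanes. The struct and function names
are made up for this sketch (the struct loosely mirrors the val[4] layout of
the x4 vector types), and the model describes only the resulting memory
layout, not how the instructions are sequenced.

```c
#include <stdint.h>

/* Stand-in for a struct of four 4-lane vectors.  */
typedef struct { uint32_t val[4][4]; } u32x4x4_model_t;

/* Scalar model of a 4-way interleaving store of 32-bit lanes: lane i of
   vector j ends up at DST[4 * i + j].  */
static void
vst4q_u32_model (uint32_t dst[16], const u32x4x4_model_t *src)
{
  for (int i = 0; i < 4; i++)       /* lane index */
    for (int j = 0; j < 4; j++)     /* vector index within the struct */
      dst[4 * i + j] = src->val[j][i];
}
```

This layout is what makes the operation convenient for array-of-structures
data such as RGBA pixels, where each vector holds one channel.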

In this patch the arm_mve_builtins.def file is added to the source code; it
defines the builtins for the MVE ACLE intrinsics using builtin qualifiers.
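
A .def file of this kind is typically consumed with the "X-macro" technique:
the same list is expanded several times under different definitions of VAR1.
The sketch below shows the idea in a single self-contained file, so a list
macro stands in for the #include of the .def file. The entries and all
SKETCH_/sketch_ names are illustrative assumptions, not taken from the patch.

```c
#include <string.h>

/* Stand-in for the contents of a .def file; the entries are illustrative.  */
#define SKETCH_BUILTINS(VAR1)   \
  VAR1 (STORE1, vst4q, v16qi)   \
  VAR1 (STORE1, vst4q, v8hi)    \
  VAR1 (STORE1, vst4q, v4si)

/* First expansion: one enumerator per entry.  */
#define AS_ENUM(T, N, X) SKETCH_BUILTIN_##N##X,
enum sketch_builtin { SKETCH_BUILTINS (AS_ENUM) SKETCH_BUILTIN_MAX };

/* Second expansion: a name table in the same order as the enum.  */
#define AS_NAME(T, N, X) #N #X,
static const char *const sketch_builtin_names[] = { SKETCH_BUILTINS (AS_NAME) };

/* Look a builtin up by name; returns SKETCH_BUILTIN_MAX if not found.  */
static int
sketch_builtin_lookup (const char *name)
{
  for (int i = 0; i < SKETCH_BUILTIN_MAX; i++)
    if (strcmp (sketch_builtin_names[i], name) == 0)
      return i;
  return SKETCH_BUILTIN_MAX;
}
```

Because both expansions come from one list, the enum values and the table
indices cannot drift apart when an entry is added.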

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (CF): Define mve_builtin_data.
(VAR1): Define.
(ARM_BUILTIN_MVE_PATTERN_START): Define.
(arm_init_mve_builtins): Define function.
(arm_init_builtins): Add TARGET_HAVE_MVE check.
(arm_expand_builtin_1): Check the range of fcode.
(arm_expand_mve_builtin): Define function to expand MVE builtins.
(arm_expand_builtin): Check the range of fcode.
* config/arm/arm_mve.h (__ARM_FEATURE_MVE): Define MVE floating point
types.
(__ARM_MVE_PRESERVE_USER_NAMESPACE): Define to protect user namespace.
(vst4q_s8): Define macro.
(vst4q_s16): Likewise.
(vst4q_s32): Likewise.
(vst4q_u8): Likewise.
(vst4q_u16): Likewise.
(vst4q_u32): Likewise.
(vst4q_f16): Likewise.
(vst4q_f32): Likewise.
(__arm_vst4q_s8): Define inline builtin.
(__arm_vst4q_s16): Likewise.
(__arm_vst4q_s32): Likewise.
(__arm_vst4q_u8): Likewise.
(__arm_vst4q_u16): Likewise.
(__arm_vst4q_u32): Likewise.
(__arm_vst4q_f16): Likewise.
(__arm_vst4q_f32): Likewise.
(__ARM_mve_typeid): Define macro with MVE types.
(__ARM_mve_coerce): Define macro with _Generic feature.
(vst4q): Define polymorphic variant for different vst4q builtins.
* config/arm/arm_mve_builtins.def: New file.
* config/arm/iterators.md (VSTRUCT): Modify to allow XI and OI
modes in MVE.
* config/arm/mve.md (MVE_VLD_ST): Define iterator.
(unspec): Define unspec.
(mve_vst4q): Define RTL pattern.
* config/arm/neon.md (mov): Modify expand to allow XI and OI
modes in MVE.
(neon_mov): Modify RTL define_insn to allow XI and OI modes
in MVE.
(define_split): Allow OI mode split for MVE after reload.
(define_split): Allow XI mode split for MVE after reload.
* config/arm/t-arm (arm.o): Add entry for arm_mve_builtins.def.
(arm-builtins.o): Likewise.

gcc/testsuite/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vst4q_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vst4q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u8.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
28917363eeae51b7cc39576f3c3e77a0350b8877..b9ee45d5950ac9c1e12d88cd7d3ece1953dc51d0
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -432,6 +432,13 @@ static arm_builtin_datum neon_builtin_data[] =
 };
 
 #undef CF
+#define CF(N,X) CODE_FOR_mve_##N##X
+static arm_builtin_datum mve_builtin_data[] =
+{
+#include "arm_mve_builtins.def"
+};
+
+#undef CF
 #undef VAR1
 #define VAR1(T, N, A) \
   {#N, UP (A), CODE_FOR_arm_##N, 0, T##_QUALIFIERS},
@@ -736,6 +743,13 @@ enum arm_builtins
 
 #include "arm_acle_builtins.def"
 
+  ARM_BUILTIN_MVE_BASE,
+
+#undef VAR1
+#define VAR1(T, N, X) \
+  ARM_BUILTIN_MVE_##N##X,
+#include "arm_mve_builtins.def"
+
   ARM_BUILTIN_MAX
 };
 
@@ -745,6 +759,9 @@ enum arm_builtins
 #define ARM_BUILTIN_NEON_PATTERN_START \
   (ARM_BUILTIN_NEON_BASE + 1)
 
+#define ARM_BUILTIN_MVE_PATTERN_START \
+  (ARM_BUILTIN_MVE_BASE + 1)
+
 #define ARM_BUILTIN_ACLE_PATTERN_START \
   (ARM_BUILTIN_ACLE_BASE + 1)
 
@@ -1276,6 +1293,22 @@ arm_init_acle_builtins (void)
 }
 }
 
+/* Set up all the MVE builtins mentioned in arm_mve_builtins.def file.  */
+static void
+arm_init_mve_builtins (void)
+{
+  volatile unsigned int i, fcode = ARM_BUILTIN_MVE_PATTERN_START;
+
+  arm_init_simd_builtin_scalar_types ();
+  arm_init_simd_builtin_types ();
+
+  for

[PATCH v2][ARM][GCC][1/2x]: MVE intrinsics with binary operands.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534331.html




Hello,

This patch supports the following MVE ACLE intrinsics with binary operands.

vsubq_n_f16, vsubq_n_f32, vbrsrq_n_f16, vbrsrq_n_f32, vcvtq_n_f16_s16,
vcvtq_n_f32_s32, vcvtq_n_f16_u16, vcvtq_n_f32_u32, vcreateq_f16, vcreateq_f32.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more 
details.
[1] 
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

In this patch a new constraint "Rd" is added, which checks that the constant is 
within the range of 1 to 16.
A new predicate "mve_imm_16" is also added, to check the matching 
constraint Rd.
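
To show where such an immediate is used, here is a scalar sketch of a
fixed-point to floating-point conversion in the style of vcvtq_n_f16_s16: the
immediate gives the number of fractional bits. The function name is
hypothetical, and float is used in place of a 16-bit floating-point type to
keep the example portable.

```c
#include <stdint.h>

/* Scalar model of a fixed-point to floating-point conversion: VALUE holds
   FRACBITS fractional bits, so the result is VALUE / 2^FRACBITS.
   Returns 0 and leaves *OUT untouched if FRACBITS is outside 1..16.  */
static int
fixed_to_float_model (int16_t value, int fracbits, float *out)
{
  if (fracbits < 1 || fracbits > 16)
    return 0;
  *out = (float) value / (float) (1L << fracbits);
  return 1;
}
```

For example, the raw value 256 with 8 fractional bits represents 1.0.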

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (BINOP_NONE_NONE_NONE_QUALIFIERS): Define
qualifier for binary operands.
(BINOP_NONE_NONE_IMM_QUALIFIERS): Likewise.
(BINOP_NONE_UNONE_IMM_QUALIFIERS): Likewise.
(BINOP_NONE_UNONE_UNONE_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vsubq_n_f16): Define macro.
(vsubq_n_f32): Likewise.
(vbrsrq_n_f16): Likewise.
(vbrsrq_n_f32): Likewise.
(vcvtq_n_f16_s16): Likewise.
(vcvtq_n_f32_s32): Likewise.
(vcvtq_n_f16_u16): Likewise.
(vcvtq_n_f32_u32): Likewise.
(vcreateq_f16): Likewise.
(vcreateq_f32): Likewise.
(__arm_vsubq_n_f16): Define intrinsic.
(__arm_vsubq_n_f32): Likewise.
(__arm_vbrsrq_n_f16): Likewise.
(__arm_vbrsrq_n_f32): Likewise.
(__arm_vcvtq_n_f16_s16): Likewise.
(__arm_vcvtq_n_f32_s32): Likewise.
(__arm_vcvtq_n_f16_u16): Likewise.
(__arm_vcvtq_n_f32_u32): Likewise.
(__arm_vcreateq_f16): Likewise.
(__arm_vcreateq_f32): Likewise.
(vsubq): Define polymorphic variant.
(vbrsrq): Likewise.
(vcvtq_n): Likewise.
* config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_NONE_QUALIFIERS): Use
it.
(BINOP_NONE_NONE_IMM_QUALIFIERS): Likewise.
(BINOP_NONE_UNONE_IMM_QUALIFIERS): Likewise.
(BINOP_NONE_UNONE_UNONE_QUALIFIERS): Likewise.
* config/arm/constraints.md (Rd): Define constraint to check constant is
in the range of 1 to 16.
* config/arm/mve.md (mve_vsubq_n_f): Define RTL pattern.
(mve_vbrsrq_n_f): Likewise.
(mve_vcvtq_n_to_f_): Likewise.
(mve_vcreateq_f): Likewise.
* config/arm/predicates.md (mve_imm_16): Define predicate to check
the matching constraint Rd.

gcc/testsuite/ChangeLog:

2019-10-21  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vbrsrq_n_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vbrsrq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_n_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_f32.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
b61094303443960bb173219e46a4d8e351deeb25..658886831ff55a6fa3350f9a654be4887e115bbc
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -373,6 +373,30 @@ arm_unop_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define UNOP_UNONE_IMM_QUALIFIERS \
   (arm_unop_unone_imm_qualifiers)
 
+static enum arm_type_qualifiers
+arm_binop_none_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_none };
+#define BINOP_NONE_NONE_NONE_QUALIFIERS \
+  (arm_binop_none_none_none_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_none_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_none, qualifier_immediate };
+#define BINOP_NONE_NONE_IMM_QUALIFIERS \
+  (arm_binop_none_none_imm_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_none_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_unsigned, qualifier_immediate };
+#define BINOP_NONE_UNONE_IMM_QUALIFIERS \
+  (arm_binop_none_unone_imm_qualifiers)
+
+static enum arm_type_qualifiers
+arm_binop_none_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_unsigned, qualifier_unsigned };
+#define BINOP_NONE_UNONE_UNONE_QUALIFIERS \
+  (

[PATCH v2][ARM][GCC][1/1x]: Patch to support MVE ACLE intrinsics with unary operand.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

The following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534343.html




Hello,

This patch supports the MVE ACLE intrinsics vcvtq_f16_s16, vcvtq_f32_s32,
vcvtq_f16_u16, vcvtq_f32_u32, vrndxq_f16, vrndxq_f32, vrndq_f16, vrndq_f32,
vrndpq_f16, vrndpq_f32, vrndnq_f16, vrndnq_f32, vrndmq_f16, vrndmq_f32,
vrndaq_f16, vrndaq_f32, vrev64q_f16, vrev64q_f32, vnegq_f16, vnegq_f32,
vdupq_n_f16, vdupq_n_f32, vabsq_f16, vabsq_f32, vrev32q_f16, vcvttq_f32_f16,
vcvtbq_f32_f16.


Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details.
[1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config/arm/arm-builtins.c (UNOP_NONE_NONE_QUALIFIERS): Define macro.
(UNOP_NONE_SNONE_QUALIFIERS): Likewise.
(UNOP_NONE_UNONE_QUALIFIERS): Likewise.
* config/arm/arm_mve.h (vrndxq_f16): Define macro.
(vrndxq_f32): Likewise.
(vrndq_f16): Likewise.
(vrndq_f32): Likewise.
(vrndpq_f16): Likewise. 
(vrndpq_f32): Likewise.
(vrndnq_f16): Likewise.
(vrndnq_f32): Likewise.
(vrndmq_f16): Likewise.
(vrndmq_f32): Likewise. 
(vrndaq_f16): Likewise.
(vrndaq_f32): Likewise.
(vrev64q_f16): Likewise.
(vrev64q_f32): Likewise.
(vnegq_f16): Likewise.
(vnegq_f32): Likewise.
(vdupq_n_f16): Likewise.
(vdupq_n_f32): Likewise.
(vabsq_f16): Likewise. 
(vabsq_f32): Likewise.
(vrev32q_f16): Likewise.
(vcvttq_f32_f16): Likewise.
(vcvtbq_f32_f16): Likewise.
(vcvtq_f16_s16): Likewise. 
(vcvtq_f32_s32): Likewise.
(vcvtq_f16_u16): Likewise.
(vcvtq_f32_u32): Likewise.
(__arm_vrndxq_f16): Define intrinsic.
(__arm_vrndxq_f32): Likewise.
(__arm_vrndq_f16): Likewise.
(__arm_vrndq_f32): Likewise.
(__arm_vrndpq_f16): Likewise.
(__arm_vrndpq_f32): Likewise.
(__arm_vrndnq_f16): Likewise.
(__arm_vrndnq_f32): Likewise.
(__arm_vrndmq_f16): Likewise.
(__arm_vrndmq_f32): Likewise.
(__arm_vrndaq_f16): Likewise.
(__arm_vrndaq_f32): Likewise.
(__arm_vrev64q_f16): Likewise.
(__arm_vrev64q_f32): Likewise.
(__arm_vnegq_f16): Likewise.
(__arm_vnegq_f32): Likewise.
(__arm_vdupq_n_f16): Likewise.
(__arm_vdupq_n_f32): Likewise.
(__arm_vabsq_f16): Likewise.
(__arm_vabsq_f32): Likewise.
(__arm_vrev32q_f16): Likewise.
(__arm_vcvttq_f32_f16): Likewise.
(__arm_vcvtbq_f32_f16): Likewise.
(__arm_vcvtq_f16_s16): Likewise.
(__arm_vcvtq_f32_s32): Likewise.
(__arm_vcvtq_f16_u16): Likewise.
(__arm_vcvtq_f32_u32): Likewise.
(vrndxq): Define polymorphic variants.
(vrndq): Likewise.
(vrndpq): Likewise.
(vrndnq): Likewise.
(vrndmq): Likewise.
(vrndaq): Likewise.
(vrev64q): Likewise.
(vnegq): Likewise.
(vabsq): Likewise.
(vrev32q): Likewise.
(vcvtbq_f32): Likewise.
(vcvttq_f32): Likewise.
(vcvtq): Likewise.
* config/arm/arm_mve_builtins.def (VAR2): Define.
(VAR1): Define.
* config/arm/mve.md (mve_vrndxq_f): Add RTL pattern.
(mve_vrndq_f): Likewise.
(mve_vrndpq_f): Likewise.
(mve_vrndnq_f): Likewise.
(mve_vrndmq_f): Likewise.
(mve_vrndaq_f): Likewise.
(mve_vrev64q_f): Likewise.
(mve_vnegq_f): Likewise.
(mve_vdupq_n_f): Likewise.
(mve_vabsq_f): Likewise.
(mve_vrev32q_fv8hf): Likewise.
(mve_vcvttq_f32_f16v4sf): Likewise.
(mve_vcvtbq_f32_f16v4sf): Likewise.
(mve_vcvtq_to_f_): Likewise.

gcc/testsuite/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/vabsq_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vabsq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtbq_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f16_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f16_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f32_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvtq_f32_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcvttq_f32_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_f32.c: Like

[PATCH v3][ARM][GCC][3/x]: MVE ACLE intrinsics framework patch.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

This patch addresses all the comments in patch version v2.
(version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540417.html




Hello,

This patch is part of MVE ACLE intrinsics framework.

The patch supports the use of emulation for the single-precision arithmetic
operations for MVE. These changes support the MVE ACLE intrinsics that
perform vector floating-point arithmetic operations.

Please refer to Arm reference manual [1] for more details.
[1] https://developer.arm.com/docs/ddi0553/latest

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2020-03-06  Andre Vieira  
Srinath Parvathaneni  

* config/arm/arm.c (arm_libcall_uses_aapcs_base): Modify function to add
emulator calls for double precision arithmetic operations for MVE.

2020-03-06  Srinath Parvathaneni  

* gcc.target/arm/mve/intrinsics/mve_libcall1.c: New test.
* gcc.target/arm/mve/intrinsics/mve_libcall2.c: Likewise.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c28a475629c7fbad48730beed5550e0cffdf2e1b..40db35a2a8b6dedb4f536b4995e80c8b9a38b588 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -5754,9 +5754,25 @@ arm_libcall_uses_aapcs_base (const_rtx libcall)
   /* Values from double-precision helper functions are returned in core
 registers if the selected core only supports single-precision
 arithmetic, even if we are using the hard-float ABI.  The same is
-true for single-precision helpers, but we will never be using the
-hard-float ABI on a CPU which doesn't support single-precision
-operations in hardware.  */
+true for single-precision helpers except in case of MVE, because in
+MVE we will be using the hard-float ABI on a CPU which doesn't support
+single-precision operations in hardware.  In MVE the following check
+enables use of emulation for the single-precision arithmetic
+operations.  */
+  if (TARGET_HAVE_MVE)
+   {
+ add_libcall (libcall_htab, optab_libfunc (add_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (sdiv_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (smul_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (neg_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (sub_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (eq_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (lt_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (le_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (ge_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (gt_optab, SFmode));
+ add_libcall (libcall_htab, optab_libfunc (unord_optab, SFmode));
+   }
   add_libcall (libcall_htab, optab_libfunc (add_optab, DFmode));
   add_libcall (libcall_htab, optab_libfunc (sdiv_optab, DFmode));
   add_libcall (libcall_htab, optab_libfunc (smul_optab, DFmode));
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c
new file mode 100644
index 
..f89301228c577291fc3095420df1937e1a0c7104
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c
@@ -0,0 +1,70 @@
+/* { dg-do compile  } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb -mfpu=auto" } */
+
+float
+foo (float a, float b, float c)
+{
+  return a + b + c;
+}
+
+/* { dg-final { scan-assembler "bl\\t__aeabi_fadd" }  } */
+/* { dg-final { scan-assembler-times "bl\\t__aeabi_fadd" 2 } } */
+
+float
+foo1 (float a, float b, float c)
+{
+  return a - b - c;
+}
+
+/* { dg-final { scan-assembler "bl\\t__aeabi_fsub" }  } */
+/* { dg-final { scan-assembler-times "bl\\t__aeabi_fsub" 2 } } */
+
+float
+foo2 (float a, float b, float c)
+{
+  return a * b * c;
+}
+
+/* { dg-final { scan-assembler "bl\\t__aeabi_fmul" }  } */
+/* { dg-final { scan-assembler-times "bl\\t__aeabi_fmul" 2 } } */
+
+float
+foo3 (float b, float c)
+{
+  return b / c;
+}
+
+/* { dg-final { scan-assembler "bl\\t__aeabi_fdiv" }  } */
+
+int
+foo4 (float b, float c)
+{
+  return b < c;
+}
+
+/* { dg-final { scan-assembler "bl\\t__aeabi_fcmplt" }  } */
+
+int
+foo5 (float b, float c)
+{
+  return b > c;
+}
+
+/* { dg-final { scan-assembler "bl\\t__aeabi_fcmpgt" }  } */
+
+int
+foo6 (float b, float c)
+{
+  return b != c;
+}
+
+/* { dg-final { scan-assembler "bl\\t__aeabi_fcmpeq" }  } */
+
+int
+foo7 (float b, float c)
+{
+  return b == c;
+}
+
+/* { dg-final { scan-assembler-times "bl\\t__aeabi_fcmpeq" 2 } } */
diff --git a/gcc/testsuite/gcc.target/ar

[PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

This patch addresses all the comments in patch version v2.
(version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html




Hello,

This patch is part of MVE ACLE intrinsics framework.
This patch adds support for updating (reading/writing) the APSR (Application
Program Status Register) and the FPSCR (Floating-point Status and Control
Register) for MVE.
This patch also enables thumb2 mov RTL patterns for MVE.

A new feature bit vfp_base is added. This bit is enabled for all VFP, MVE and
MVE with floating point extensions. This bit is used to enable the macro
TARGET_VFP_BASE. Previously, all the VFP instructions, RTL patterns, and status
and control registers were guarded by TARGET_HARD_FLOAT. This patch changes
that: the instructions, RTL patterns, and status and control registers common
to MVE and VFP are now guarded by the TARGET_VFP_BASE macro.
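
As a rough illustration of the feature-bit arrangement described above (not
part of the patch), the fgroups can be modelled as sets of feature bits. The
group names follow the ChangeLog below; the contents of FPV5 and the helper
target_vfp_base are simplified assumptions made only for this sketch.

```python
# Hypothetical model of the fgroups in arm-cpus.in; only the bits named in
# the ChangeLog are listed, everything else is omitted.
VFPV2 = frozenset({"vfp_base", "vfpv2"})
# Assumption for this sketch: FPv5 builds on VFPv2 and so carries vfp_base.
FPV5 = VFPV2 | {"fpv5"}
MVE = frozenset({"mve", "vfp_base", "armv7em"})
MVE_FP = MVE | FPV5 | {"mve_float"}


def target_vfp_base(features):
    """Stand-in for the TARGET_VFP_BASE macro: true when vfp_base is set."""
    return "vfp_base" in features


if __name__ == "__main__":
    for name, group in (("VFPv2", VFPV2), ("MVE", MVE), ("MVE_FP", MVE_FP)):
        print(name, target_vfp_base(group))
```

The point of the sketch is that integer-only MVE, which has no VFP
floating-point arithmetic, still satisfies the same guard as the VFP
configurations.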

The RTL patterns set_fpscr and get_fpscr are updated to use VFPCC_REGNUM because
a few MVE intrinsics set/get the carry bit of the FPSCR register.

Please refer to Arm reference manual [1] for more details.
[1] https://developer.arm.com/docs/ddi0553/latest

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.

Ok for trunk?

Thanks,
Srinath

gcc/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* common/config/arm/arm-common.c (arm_asm_auto_mfpu): When vfp_base
feature bit is on and -mfpu=auto is passed as compiler option, do not
generate an error when no matching fpu is found, because in this case
an fpu is not required.
* config/arm/arm-cpus.in (vfp_base): Define feature bit, this bit is
enabled for MVE and also for all VFP extensions.
(VFPv2): Modify fgroup to enable vfp_base feature bit whenever VFPv2
is enabled.
(MVE): Define fgroup to enable feature bits mve, vfp_base and armv7em.
(MVE_FP): Define fgroup to enable feature bits in fgroup MVE and FPv5
along with feature bit mve_float.
(mve): Modify add options in armv8.1-m.main arch for MVE.
(mve.fp): Modify add options in armv8.1-m.main arch for MVE with
floating point.
* config/arm/arm.c (use_return_insn): Replace the
check with TARGET_VFP_BASE.
(thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with
TARGET_VFP_BASE.
(arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
with TARGET_VFP_BASE, to allow cost calculations for copies in MVE as
well.
(arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with
TARGET_VFP_BASE, to allow space calculation for VFP registers in MVE
as well.
(arm_compute_frame_layout): Likewise.
(arm_save_coproc_regs): Likewise.
(arm_fixed_condition_code_regs): Modify to enable using VFPCC_REGNUM
in MVE as well.
(arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
with equivalent macro TARGET_VFP_BASE.
(arm_expand_epilogue_apcs_frame): Likewise.
(arm_expand_epilogue): Likewise.
(arm_conditional_register_usage): Likewise.
(arm_declare_function_name): Add check to skip printing .fpu directive
in assembly file when TARGET_VFP_BASE is enabled and fpu_to_print is
"softvfp".
* config/arm/arm.h (TARGET_VFP_BASE): Define.
* config/arm/arm.md (arch): Add "mve" to arch.
(eq_attr "arch" "mve"): Enable when TARGET_HAVE_MVE is true.
(vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT
|| TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE.
* config/arm/constraints.md (Uf): Define to allow modification to FPCCR
in MVE.
* config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify target guard
to not allow for MVE.
* config/arm/unspecs.md (UNSPEC_GET_FPSCR): Move to volatile unspecs
enum.
(VUNSPEC_GET_FPSCR): Define.
* config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR and VMRS
instructions which move to general-purpose Register from Floating-point
Special register and vice-versa.
(thumb2_movhi_fp16): Likewise.
(thumb2_movsi_vfp): Add support for VMSR and VMRS instructions along
with MCR and MRC instructions which set and get Floating-point Status
and Control Register (FPSCR).
(movdi_vfp): Modify pattern to enable Single-precision scalar float move
in MVE.
(thumb2_movdf_vfp): Modify pattern to enable Double-precision scalar
float move patterns in MVE.
(thumb2_movsfcc_vfp): Modify pattern to enable single float conditional
code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
(thumb2_movdfcc_vfp): Modify pattern to enable double float conditional
code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
(

[PATCH v3][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch.

2020-03-10 Thread Srinath Parvathaneni
Hello Kyrill,

This patch addresses all the comments in patch version v2.
(version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540415.html



Hello,

This patch creates the required framework for MVE ACLE intrinsics.

The following changes are done in this patch to support MVE ACLE intrinsics.

Header file arm_mve.h is added to the source code; it contains the definitions
of the MVE ACLE intrinsics and the different data types used in MVE. Machine
description file mve.md is also added; it contains the RTL patterns defined
for MVE.

A new register "p0" is added, which is used by MVE predicated patterns. A new
register class "VPR_REG" is added and its contents are defined in
REG_CLASS_CONTENTS.

The vec-common.md file is modified to support the standard move patterns. The
prefix of neon functions which are also used by MVE is changed from "neon_" to
"simd_".
eg: neon_immediate_valid_for_move changed to simd_immediate_valid_for_move.

In the patch, standard patterns mve_move, mve_store and move_load for MVE are
added, and the neon.md and vfp.md files are modified to support these common
patterns.

Please refer to Arm reference manual [1] for more details.

[1] https://developer.arm.com/docs/ddi0553/latest

Regression tested on target arm-none-eabi and armeb-none-eabi and found no 
regressions.

Ok for trunk?

Thanks,
Srinath

gcc/ChangeLog:

2020-03-06  Andre Vieira  
Mihail Ionescu  
Srinath Parvathaneni  

* config.gcc (arm_mve.h): Include mve intrinsics header file.
* config/arm/aout.h (p0): Add new register name for MVE predicated
cases.
* config/arm/arm-builtins.c (ARM_BUILTIN_SIMD_LANE_CHECK): Define macro
common to Neon and MVE.
(ARM_BUILTIN_NEON_LANE_CHECK): Renamed to ARM_BUILTIN_SIMD_LANE_CHECK.
(arm_init_simd_builtin_types): Disable poly types for MVE.
(arm_init_neon_builtins): Move a check to arm_init_builtins function.
(arm_init_builtins): Use ARM_BUILTIN_SIMD_LANE_CHECK instead of
ARM_BUILTIN_NEON_LANE_CHECK.
(mve_dereference_pointer): Add function.
(arm_expand_builtin_args): Call to mve_dereference_pointer when MVE is
enabled.
(arm_expand_neon_builtin): Moved to arm_expand_builtin function.
(arm_expand_builtin): Moved from arm_expand_neon_builtin function.
* config/arm/arm-c.c (__ARM_FEATURE_MVE): Define macro for MVE and MVE
with floating point enabled.
* config/arm/arm-protos.h (neon_immediate_valid_for_move): Renamed to
simd_immediate_valid_for_move.
(simd_immediate_valid_for_move): Renamed from
neon_immediate_valid_for_move function.
* config/arm/arm.c (arm_options_perform_arch_sanity_checks): Generate
error if vfpv2 feature bit is disabled and mve feature bit is also
disabled for HARD_FLOAT_ABI.
(use_return_insn): Check to not push VFP regs for MVE.
(aapcs_vfp_allocate): Add MVE check to have same Procedure Call Standard
as Neon.
(aapcs_vfp_allocate_return_reg): Likewise.
(thumb2_legitimate_address_p): Check to return 0 on valid Thumb-2
address operand for MVE.
(arm_rtx_costs_internal): MVE check to determine cost of rtx.
(neon_valid_immediate): Rename to simd_valid_immediate.
(simd_valid_immediate): Rename from neon_valid_immediate.
(simd_valid_immediate): MVE check on size of vector is 128 bits.
(neon_immediate_valid_for_move): Rename to
simd_immediate_valid_for_move.
(simd_immediate_valid_for_move): Rename from
neon_immediate_valid_for_move.
(neon_immediate_valid_for_logic): Modify call to neon_valid_immediate
function.
(neon_make_constant): Modify call to neon_valid_immediate function.
(neon_vector_mem_operand): Return VFP register for POST_INC or PRE_DEC
for MVE.
(output_move_neon): Add MVE check to generate vldm/vstm instructions.
(arm_compute_frame_layout): Calculate space for saved VFP registers for
MVE.
(arm_save_coproc_regs): Save coproc registers for MVE.
(arm_print_operand): Add case 'E' to print memory operands for MVE.
(arm_print_operand_address): Check to print register number for MVE.
(arm_hard_regno_mode_ok): Check for arm hard regno mode ok for MVE.
(arm_modes_tieable_p): Check to allow structure mode for MVE.
(arm_regno_class): Add VPR_REGNUM check.
(arm_expand_epilogue_apcs_frame): MVE check to calculate epilogue code
for APCS frame.
(arm_expand_epilogue): MVE check for enabling pop instructions in
epilogue.
(arm_print_asm_arch_directives): Modify function to disable print of
.arch_extension "mve" and "fp" for cases where MVE is enabled with
"SOFT FLOAT ABI".
(arm_vector_mode_supported_p): Check for modes available in MVE integer
and MVE floating point.

Re: [GCC][PATCH][AArch64] ACLE intrinsics for BFCVTN, BFCVTN2 (AArch64 AdvSIMD) and BFCVT (AArch64 FP)

2020-03-10 Thread Vasee Vinayagamoorthy
Hi,

I think this commit causes a failure on aarch64-none-elf due to a DejaGnu
typo in gcc.target/aarch64/advsimd-intrinsics/bfcvt-nosimd.c.

{ dg-final { check-function-bodies "**" "" "-O[^0]" } }

I think the square brackets need to be escaped or use {-O[^0]}.

Regards
Vasee

On Fri, Mar 06, 2020 at 09:54:07AM +, Richard Sandiford wrote:
> Delia Burduv  writes:
> > Hi,
> >
> > Here is the latest version of the  patch. That test should now work.
> 
> Thanks, pushed.
> 
> Richard


Re: [PATCH] libstdc++: LWG 3286 ranges::size is not required to be valid after ...

2020-03-10 Thread Jonathan Wakely via Gcc-patches

On 09/03/20 14:17 -0400, Patrick Palka wrote:

... a call to ranges::begin on an input range.

This implements LWG 3286.  The new wording for the single-argument
subrange::subrange constructor is implemented by splitting the constructor into
two delegating constructors, one constrained by _S_store_size and the other by
!_S_store_size.

Tested on x86_64-pc-linux-gnu, both tests fail before the patch and pass with
the patch.

libstdc++-v3/ChangeLog:

LWG 3286 ranges::size is not required to be valid after a call to
ranges::begin on an input range
* include/std/ranges (subrange::subrange): Split single-argument
constructor into two, one constrained by _S_store_size and another by
!_S_store_size.
(take_view::begin): Call size() before calling ranges::begin(_M_base).
* testsuite/std/ranges/adaptors/lwg3286.cc: New test.
* testsuite/std/ranges/subrange/lwg3286.cc: New test.


OK, thanks.



Re: [committed] libstdc++: Fix invalid noexcept-specifier (PR 94117)

2020-03-10 Thread Jonathan Wakely via Gcc-patches

On 10/03/20 11:40 +, Jonathan Wakely wrote:

G++ fails to diagnose this non-dependent expression, but Clang doesn't
like it.

PR c++/94117
* include/std/ranges (ranges::transform_view::_Iterator::iter_move):
Change expression in noexcept-specifier to match function body.




This patch goes further and removes the __iter_move helper completely,
and the __iter_swap one, in transform_view.

It also does the same in split_view, and fixes a bug where the
noexcept-specifier was always false.

I've also added new _M_i_current() accessors (overloaded for const and
non-const) to return _M_i.__current(). Using this instead of
_M_i._M_current fixes a bug in inner-iterator::operator*() (which is
also present in the working draft).

Tested powerpc64le-linux, committed to master.

commit cf0c3a457319df1e8dc9321227162a7c57169a39
Author: Jonathan Wakely 
Date:   Tue Mar 10 17:45:45 2020 +

libstdc++: Fix noexcept guarantees for ranges::split_view

Also introduce the _M_i_current() accessors to solve the problem of
access to the private member of _OuterIter from the iter_move and
iter_swap overloads (which are only friends of _InnerIter not
_OuterIter).

* include/std/ranges (transform_view::_Iterator::__iter_move): Remove.
(transform_view::_Iterator::operator*): Add noexcept-specifier.
(transform_view::_Iterator::iter_move): Inline __iter_move body here.
(split_view::_OuterIter::__current): Add noexcept.
(split_view::_InnerIter::__iter_swap): Remove.
(split_view::_InnerIter::__iter_move): Remove.
(split_view::_InnerIter::_M_i_current): New accessors.
(split_view::_InnerIter::__at_end): Use _M_i_current().
(split_view::_InnerIter::operator*): Likewise.
(split_view::_InnerIter::operator++): Likewise.
(iter_move(const _InnerIter&)): Likewise.
(iter_swap(const _InnerIter&, const _InnerIter&)): Likewise.
* testsuite/std/ranges/adaptors/split.cc: Check noexcept-specifier
for iter_move and iter_swap on split_view's inner iterator.

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 292132db990..4dc7342e2f7 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -1679,17 +1679,6 @@ namespace views
 	  return input_iterator_tag{};
 	  }
 
-	  static constexpr decltype(auto)
-	  __iter_move(const _Iterator& __i = {})
-	noexcept(noexcept(std::__invoke(*__i._M_parent->_M_fun,
-	*__i._M_current)))
-	  {
-	if constexpr (is_lvalue_reference_v)
-	  return std::move(*__i);
-	else
-	  return *__i;
-	  }
-
 	  using _Base_iter = iterator_t<_Base>;
 
 	  _Base_iter _M_current = _Base_iter();
@@ -1728,6 +1717,7 @@ namespace views
 
 	  constexpr decltype(auto)
 	  operator*() const
+	noexcept(noexcept(std::__invoke(*_M_parent->_M_fun, *_M_current)))
 	  { return std::__invoke(*_M_parent->_M_fun, *_M_current); }
 
 	  constexpr _Iterator&
@@ -1837,8 +1827,13 @@ namespace views
 	  { return __x._M_current - __y._M_current; }
 
 	  friend constexpr decltype(auto)
-	  iter_move(const _Iterator& __i) noexcept(noexcept(__iter_move(__i)))
-	  { return __iter_move(__i); }
+	  iter_move(const _Iterator& __i) noexcept(noexcept(*__i))
+	  {
+	if constexpr (is_lvalue_reference_v)
+	  return std::move(*__i);
+	else
+	  return *__i;
+	  }
 
 	  friend constexpr void
 	  iter_swap(const _Iterator& __x, const _Iterator& __y)
@@ -2715,7 +2710,7 @@ namespace views
 	  //  current of outer-iterator.  current is equivalent to current_ if
 	  //  V models forward_range, and parent_->current_ otherwise.
 	  constexpr auto&
-	  __current()
+	  __current() noexcept
 	  {
 	if constexpr (forward_range<_Vp>)
 	  return _M_current;
@@ -2724,7 +2719,7 @@ namespace views
 	  }
 
 	  constexpr auto&
-	  __current() const
+	  __current() const noexcept
 	  {
 	if constexpr (forward_range<_Vp>)
 	  return _M_current;
@@ -2860,7 +2855,7 @@ namespace views
 	auto __end = ranges::end(_M_i._M_parent->_M_base);
 	if constexpr (__detail::__tiny_range<_Pattern>)
 	  {
-		const auto& __cur = _M_i.__current();
+		const auto& __cur = _M_i_current();
 		if (__cur == __end)
 		  return true;
 		if (__pcur == __pend)
@@ -2869,7 +2864,7 @@ namespace views
 	  }
 	else
 	  {
-		auto __cur = _M_i.__current();
+		auto __cur = _M_i_current();
 		if (__cur == __end)
 		  return true;
 		if (__pcur == __pend)
@@ -2896,16 +2891,13 @@ namespace views
 	  return _Cat{};
 	  }
 
-	  static constexpr decltype(auto)
-	  __iter_move(const _InnerIter& __i = {})
-	  noexcept(noexcept(ranges::iter_move(__i._M_i.__current(
-	  { return ranges::iter_move(__i._M_i.__current()); }
+	  constexpr auto&
+	  _M_i_current() noexcept
+	  { return _M_i.__current(); }
 
-	  static constexpr void
-	  __iter_swap(const _InnerIter&

Fwd: [testsuite] Add @ lines to check-function-bodies fluff

2020-03-10 Thread Matthew Malcomson

Cc'ing maintainers and original author of `check-function-bodies`.

It looks like I missed that the first time around.


 Forwarded Message 
Subject: [testsuite] Add @ lines to check-function-bodies fluff
Date: Tue, 10 Mar 2020 17:22:52 +
From: Matthew Malcomson 
To: gcc-patches@gcc.gnu.org
CC: n...@arm.com

When using `check-function-bodies`, the subroutine `parse_function_bodies` uses
the `fluff` regexp to remove uninteresting assembly lines.

Arm targets generate assembly with some lines prefixed by `@`; these lines are
not removed by this process.

As an example of some lines prefixed by `@`: the assembly output from the
`stacktest1` function in "bfloat16_simd_3_1.c" is:

.align  2
.global stacktest1
.arch armv8.2-a
.syntax unified
.arm
.fpu neon-fp-armv8
.type   stacktest1, %function
stacktest1:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
sub sp, sp, #8
add r3, sp, #6
vst1.16 {d0[0]}, [r3]
vld1.16 {d0[0]}, [r3]
add sp, sp, #8
@ sp needed
bx  lr
.size   stacktest1, .-stacktest1



It seems that previous uses of `check-function-bodies` in the arm backend have
avoided problems with such lines since they use the `...` regexp in each place
such fluff occurs.

I'm currently writing a patch that I'd like to match the entire function body,
so I'd like to remove such `@` lines automatically.

gcc/testsuite/ChangeLog:

2020-03-10  Matthew Malcomson  

* lib/scanasm.exp (parse_function_bodies): Lines starting with '@' also
counted as fluff.



### Attachment also inlined for ease of reply###



diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
 
 # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
 
 set fd [open $filename r]
 set in_function 0



[testsuite] Add @ lines to check-function-bodies fluff

2020-03-10 Thread Matthew Malcomson
When using `check-function-bodies`, the subroutine `parse_function_bodies` uses
the `fluff` regexp to remove uninteresting assembly lines.

Arm targets generate assembly with some lines prefixed by `@`; these lines are
not removed by this process.

As an example of some lines prefixed by `@`: the assembly output from the
`stacktest1` function in "bfloat16_simd_3_1.c" is:

.align  2
.global stacktest1
.arch armv8.2-a
.syntax unified
.arm
.fpu neon-fp-armv8
.type   stacktest1, %function
stacktest1:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
sub sp, sp, #8
add r3, sp, #6
vst1.16 {d0[0]}, [r3]
vld1.16 {d0[0]}, [r3]
add sp, sp, #8
@ sp needed
bx  lr
.size   stacktest1, .-stacktest1



It seems that previous uses of `check-function-bodies` in the arm backend have
avoided problems with such lines since they use the `...` regexp in each place
such fluff occurs.

I'm currently writing a patch that I'd like to match the entire function body,
so I'd like to remove such `@` lines automatically.

gcc/testsuite/ChangeLog:

2020-03-10  Matthew Malcomson  

* lib/scanasm.exp (parse_function_bodies): Lines starting with '@' also
counted as fluff.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
 
 # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
 
 set fd [open $filename r]
 set in_function 0
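
To see the effect of the one-character change outside DejaGnu, the two
patterns can be tried on a few lines from the `stacktest1` output above. This
is only a sketch: it uses Python's `re` module rather than Tcl's regexp engine
(both accept this particular pattern), and `strip_fluff` is a hypothetical
helper, not the logic of `parse_function_bodies`, which also tracks function
labels and the `.size` terminator.

```python
import re

# Patterns copied from the diff above: the old and new values of `fluff`.
OLD_FLUFF = re.compile(r'^\s*(?:\.|//)')
NEW_FLUFF = re.compile(r'^\s*(?:\.|//|@)')


def strip_fluff(lines, fluff=NEW_FLUFF):
    """Return only the lines that the fluff pattern does not match."""
    return [line for line in lines if not fluff.match(line)]


# A few lines from the stacktest1 example, with tab indentation assumed.
ASM = [
    "\t.type\tstacktest1, %function",
    "\t@ args = 0, pretend = 0, frame = 8",
    "\tsub\tsp, sp, #8",
    "\t@ sp needed",
    "\tbx\tlr",
]

if __name__ == "__main__":
    print("old:", strip_fluff(ASM, OLD_FLUFF))
    print("new:", strip_fluff(ASM, NEW_FLUFF))
```

With the old pattern the two `@` comment lines survive alongside the
instructions; with the new pattern only the instructions remain.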




[PING^3][PATCH] Generalize -fuse-ld= to support absolute path or arbitrary ld.linker

2020-03-10 Thread Fangrui Song via Gcc-patches

On 2020-02-24, Fangrui Song wrote:

On 2020-02-13, Fangrui Song wrote:

On 2020-02-09, Fangrui Song wrote:

PR driver/93645
* common.opt (-fuse-ld=): Delete -fuse-ld=[bfd|gold|lld]. Add -fuse-ld=.
* opts.c (common_handle_option): Handle OPT_fuse_ld_.
* gcc.c (driver_handle_option): Likewise.
* collect2.c (main): Likewise.
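
For readers skimming the thread, the intended option handling can be sketched
roughly as follows. This is an illustration under assumptions, not the
collect2 code: `select_linker` is a hypothetical helper, and the rule "an
absolute path is used as given, any other value names ld.<value>" is inferred
from the subject line rather than taken from the patch.

```python
import posixpath

PREFIX = "-fuse-ld="


def select_linker(argv, default="ld"):
    """Return the linker chosen by the last -fuse-ld= option, else default."""
    linker = default
    for arg in argv:
        if arg.startswith(PREFIX):
            value = arg[len(PREFIX):]
            # Assumption: absolute paths are taken verbatim; any other value
            # is treated as the suffix of an "ld.<value>" program name.
            linker = value if posixpath.isabs(value) else "ld." + value
    return linker


if __name__ == "__main__":
    print(select_linker(["-O2", "-fuse-ld=gold"]))
    print(select_linker(["-fuse-ld=/usr/bin/ld.lld"]))
```

As in the patched loop in collect2.c, a later -fuse-ld= option overrides an
earlier one.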
---
gcc/ChangeLog   |  8 ++
gcc/collect2.c  | 67 -
gcc/common.opt  | 14 ++
gcc/doc/invoke.texi | 15 +++---
gcc/gcc.c   | 14 --
gcc/opts.c  |  4 +--
6 files changed, 57 insertions(+), 65 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index feb2d066d0b..6bcec12d841 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2020-02-09  Fangrui Song  
+
+   PR driver/93645
+   * common.opt (-fuse-ld=): Delete -fuse-ld=[bfd|gold|lld]. Add -fuse-ld=.
+   * opts.c (common_handle_option): Handle OPT_fuse_ld_.
+   * gcc.c (driver_handle_option): Likewise.
+   * collect2.c (main): Likewise.
+
2020-02-09  Uroš Bizjak  

* recog.c: Move pass_split_before_sched2 code in front of
diff --git a/gcc/collect2.c b/gcc/collect2.c
index 502d629141c..a3ef525a93b 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -859,18 +859,12 @@ main (int argc, char **argv)
  {
USE_DEFAULT_LD,
USE_PLUGIN_LD,
-  USE_GOLD_LD,
-  USE_BFD_LD,
-  USE_LLD_LD,
-  USE_LD_MAX
+  USE_LD
  } selected_linker = USE_DEFAULT_LD;
-  static const char *const ld_suffixes[USE_LD_MAX] =
+  static const char *const ld_suffixes[USE_LD] =
  {
"ld",
-  PLUGIN_LD_SUFFIX,
-  "ld.gold",
-  "ld.bfd",
-  "ld.lld"
+  PLUGIN_LD_SUFFIX
  };
static const char *const real_ld_suffix = "real-ld";
static const char *const collect_ld_suffix = "collect-ld";
@@ -882,7 +876,7 @@ main (int argc, char **argv)
static const char *const strip_suffix = "strip";
static const char *const gstrip_suffix = "gstrip";

-  const char *full_ld_suffixes[USE_LD_MAX];
+  const char *full_ld_suffixes[USE_LD];
#ifdef CROSS_DIRECTORY_STRUCTURE
/* If we look for a program in the compiler directories, we just use
   the short name, since these directories are already system-specific.
@@ -924,6 +918,7 @@ main (int argc, char **argv)
const char **ld1;
bool use_plugin = false;
bool use_collect_ld = false;
+  const char *use_ld = NULL;

/* The kinds of symbols we will have to consider when scanning the
   outcome of a first pass link.  This is ALL to start with, then might
@@ -948,7 +943,7 @@ main (int argc, char **argv)
#endif
int i;

-  for (i = 0; i < USE_LD_MAX; i++)
+  for (i = 0; i < USE_LD; i++)
  full_ld_suffixes[i]
#ifdef CROSS_DIRECTORY_STRUCTURE
= concat (target_machine, "-", ld_suffixes[i], NULL);
@@ -1041,12 +1036,11 @@ main (int argc, char **argv)
if (selected_linker == USE_DEFAULT_LD)
  selected_linker = USE_PLUGIN_LD;
  }
-   else if (strcmp (argv[i], "-fuse-ld=bfd") == 0)
- selected_linker = USE_BFD_LD;
-   else if (strcmp (argv[i], "-fuse-ld=gold") == 0)
- selected_linker = USE_GOLD_LD;
-   else if (strcmp (argv[i], "-fuse-ld=lld") == 0)
- selected_linker = USE_LLD_LD;
+   else if (!strncmp (argv[i], "-fuse-ld=", 9))
+ {
+   use_ld = argv[i] + 9;
+   selected_linker = USE_LD;
+ }
else if (strncmp (argv[i], "-o", 2) == 0)
  {
/* Parse the output filename if it's given so that we can make
@@ -1152,8 +1146,7 @@ main (int argc, char **argv)
/* Maybe we know the right file to use (if not cross).  */
ld_file_name = 0;
#ifdef DEFAULT_LINKER
-  if (selected_linker == USE_BFD_LD || selected_linker == USE_GOLD_LD ||
-  selected_linker == USE_LLD_LD)
+  if (!ld_file_name && selected_linker == USE_LD)
  {
char *linker_name;
# ifdef HOST_EXECUTABLE_SUFFIX
@@ -1168,15 +1161,13 @@ main (int argc, char **argv)
  if (! strcmp (&default_linker[len], HOST_EXECUTABLE_SUFFIX))
{
  default_linker[len] = '\0';
- linker_name = concat (default_linker,
-   &ld_suffixes[selected_linker][2],
+ linker_name = concat (default_linker, ".", use_ld,
HOST_EXECUTABLE_SUFFIX, NULL);
}
}
if (linker_name == NULL)
# endif
-  linker_name = concat (DEFAULT_LINKER,
-   &ld_suffixes[selected_linker][2],
+  linker_name = concat (DEFAULT_LINKER, ".", use_ld,
NULL);
if (access (linker_name, X_OK) == 0)
ld_file_name = linker_name;
@@ -1197,14 +1188,28 @@ main (int argc, char **argv)
ld_file_name = find_a_file (&cpath, collect_ld_suffix, X_OK);
use_collect_ld = ld_file_name != 0;
  }
-  /* Search the compiler directories for `ld'.  We have protection against
- recursive calls in find_a_file.  */
-  if (ld_file_name == 0)
- 
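
For readers following the collect2.c hunk above: the patch replaces three
exact-string comparisons with a single prefix match and keeps whatever follows
"-fuse-ld=" as the linker name. A minimal standalone sketch of that prefix
match, not part of the patch; the helper name fuse_ld_value is made up for
illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the patch: return the text following
   "-fuse-ld=" in ARG, or NULL when ARG does not start with that prefix.  */
static const char *
fuse_ld_value (const char *arg)
{
  static const char prefix[] = "-fuse-ld=";
  size_t len = sizeof prefix - 1;	/* 9, the constant used in the hunk.  */

  return strncmp (arg, prefix, len) == 0 ? arg + len : NULL;
}
```

With this shape, both a short name ("lld") and an absolute path come back
unchanged; deciding how to resolve the value is left to the caller.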

[committed] loop-iv: make find_simple_exit static

2020-03-10 Thread Roman Zhuykov via Gcc-patches
This patch marks find_simple_exit function as static.
Committed as r10-7107 after successful bootstrap on x86_64 and powerpc64le.

Roman

--
Function 'find_simple_exit' is used only in loop-iv.c.
In 2004-2006 it was also used in predict.c, but since r118694
(992c31e62304ed5d34247dbdef2db276d08fac05) it no longer is.

gcc/ChangeLog:
* loop-iv.c (find_simple_exit): Make it static.
* cfgloop.h: Remove the corresponding prototype.

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 11378cadd41..1c49a8b8c2d 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -499,7 +499,6 @@ extern bool iv_analyze_expr (rtx_insn *, scalar_int_mode, rtx,
 class rtx_iv *);
 extern rtx get_iv_value (class rtx_iv *, rtx);
 extern bool biv_p (rtx_insn *, scalar_int_mode, rtx);
-extern void find_simple_exit (class loop *, class niter_desc *);
 extern void iv_analysis_done (void);
 
 extern class niter_desc *get_simple_loop_desc (class loop *loop);
diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c
index 6a59954f60a..d7b3d90f047 100644
--- a/gcc/loop-iv.c
+++ b/gcc/loop-iv.c
@@ -2915,7 +2915,7 @@ check_simple_exit (class loop *loop, edge e, class niter_desc *desc)
 
 /* Finds a simple exit of LOOP and stores its description into DESC.  */
 
-void
+static void
 find_simple_exit (class loop *loop, class niter_desc *desc)
 {
   unsigned i;


[Fortran, OpenACC] Reject vars of different scope in $acc declare (PR94120)

2020-03-10 Thread Tobias Burnus

[This fixes a bunch of issues found when actually only
wanting to add a check for the following restriction.]

OpenACC's "Declare Directive" has the following restriction:

"A declare directive must be in the same scope
 as the declaration of any var that appears in
 the data clauses of the directive."

gfortran now rejects a "var" that is declared in a different
namespace than the "$acc declare". (Use-associated variables
are already rejected.) Testing showed that a straightforward
check fails if the result variable is the function name, as
the function symbol is then in the parent scope. Extending
the failing test to use a result variable showed that the
current is-a-module-variable check didn't work, and fixing
it ran into a similar issue.

The reason that I exclude 's' being a module is that at
resolution time, an is-variable check is done.


The other changes are because the following restriction was
mishandled:

"In a Fortran module declaration section, only
 create, copyin, device_resident, and link clauses are allowed."

But all examples were for variables using those in a module
procedure …


OK for the trunk?

Cheers,

Tobias

PS: For those who wonder what happens in a BLOCK DATA construct:
gfortran outright rejects 'acc declare'. (It probably should
work for COMMON blocks with 'declare device_resident' – but
currently the spec only permits it in program/subroutine/function
+ declaration part of a module.)

PPS: The PR shows (for C) that one can construct a test case,
which violates the OpenACC restriction and actually fails with
an ICE. I have a draft patch for C (see PR) but not yet one for
C++, hence, I start with the Fortran side. – I currently still
struggle to write a same-scope check in C++.
[The C test case in turn was a fallout of debugging an
ICE-on-valid-code issue …]

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
[Fortran, OpenACC] Reject vars of different scope in $acc declare (PR94120)

2020-10-03  Tobias Burnus  

	PR middle-end/94120
	* openmp.c (gfc_match_oacc_declare): Accept function-result
	variables; reject variables declared in a different scoping unit.

2020-10-03  Tobias Burnus  

	PR middle-end/94120
	* gfortran.dg/goacc/pr78260-2.f90: Correct scan-tree-dump-times.
	Extend test case to result variables.
	* gfortran.dg/goacc/declare-2.f95: Actually check module-declaration
	restriction of OpenACC.
	* gfortran.dg/goacc/declare-3.f95: Remove case where this
	restriction is violated.
	* gfortran.dg/goacc/pr94120-1.f90: New.
	* gfortran.dg/goacc/pr94120-2.f90: New.
	* gfortran.dg/goacc/pr94120-3.f90: New.

 gcc/fortran/openmp.c  | 12 +++-
 gcc/testsuite/gfortran.dg/goacc/declare-2.f95 | 21 -
 gcc/testsuite/gfortran.dg/goacc/declare-3.f95 | 10 +-
 gcc/testsuite/gfortran.dg/goacc/pr78260-2.f90 | 13 +++--
 gcc/testsuite/gfortran.dg/goacc/pr94120-1.f90 | 11 +++
 gcc/testsuite/gfortran.dg/goacc/pr94120-2.f90 | 12 
 gcc/testsuite/gfortran.dg/goacc/pr94120-3.f90 | 13 +
 7 files changed, 75 insertions(+), 17 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 35f6b2f4938..930bca541b9 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -2155,7 +2155,8 @@ gfc_match_oacc_declare (void)
 {
   gfc_symbol *s = n->sym;
 
-  if (s->ns->proc_name && s->ns->proc_name->attr.proc == PROC_MODULE)
+  if (gfc_current_ns->proc_name
+	  && gfc_current_ns->proc_name->attr.flavor == FL_MODULE)
 	{
 	  if (n->u.map_op != OMP_MAP_ALLOC && n->u.map_op != OMP_MAP_TO)
 	{
@@ -2174,6 +2175,15 @@ gfc_match_oacc_declare (void)
 	  return MATCH_ERROR;
 	}
 
+  if ((s->result == s && s->ns->contained != gfc_current_ns)
+	  || ((s->attr.flavor == FL_UNKNOWN || s->attr.flavor == FL_VARIABLE)
+	  && s->ns != gfc_current_ns))
+	{
+	  gfc_error ("Variable %qs shall be declared in the same scoping unit "
+		 "as !$ACC DECLARE at %L", s->name, &where);
+	  return MATCH_ERROR;
+	}
+
   if ((s->attr.dimension || s->attr.codimension)
 	  && s->attr.dummy && s->as->type != AS_EXPLICIT)
 	{
diff --git a/gcc/testsuite/gfortran.dg/goacc/declare-2.f95 b/gcc/testsuite/gfortran.dg/goacc/declare-2.f95
index 7aa3dab4707..bad5de9d757 100644
--- a/gcc/testsuite/gfortran.dg/goacc/declare-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/declare-2.f95
@@ -1,9 +1,5 @@
 
 module amod
-
-contains
-
-subroutine asubr (b)
   implicit none
   integer :: b(8)
 
@@ -16,9 +12,24 @@ subroutine asubr (b)
   !$acc declare present_or_create (b) ! { dg-error "present on multiple" }
   !$acc declare deviceptr (b) ! { dg-error "Invalid clause in module" }
   !$acc declare create (b) copyin (b) ! { dg-error "present on multiple" }
+end module
 
+module amod2
+contains
+subroutine asubr (a, b, c, d, e, f, g, h, i, j, k)
+  implicit 

[committed] minor: fix indentation in ddg.c

2020-03-10 Thread Roman Zhuykov via Gcc-patches
This obvious patch fixes indentation in PR90001-related code.
Committed as r10-7106.

Roman

--
gcc/ChangeLog:
* ddg.c (create_ddg): Fix indentation.
(set_recurrence_length): Likewise.
(create_ddg_all_sccs): Likewise.

diff --git a/gcc/ddg.c b/gcc/ddg.c
index aae92adf89a..ca8cb74823d 100644
--- a/gcc/ddg.c
+++ b/gcc/ddg.c
@@ -633,7 +633,7 @@ create_ddg (basic_block bb, int closing_branch_deps)
   g->nodes[i].aux.count = -1;
   g->nodes[i].max_dist = XCNEWVEC (int, num_nodes);
   for (j = 0; j < num_nodes; j++)
- g->nodes[i].max_dist[j] = -1;
+   g->nodes[i].max_dist[j] = -1;
 
   g->nodes[i++].insn = insn;
   first_note = NULL;
@@ -838,7 +838,7 @@ set_recurrence_length (ddg_scc_ptr scc)
   int length = src->max_dist[dest->cuid];
 
   if (length < 0)
-continue;
+   continue;
 
   length += backarc->latency;
   result = MAX (result, (length / distance));
@@ -1069,8 +1069,8 @@ create_ddg_all_sccs (ddg_ptr g)
 
   n->max_dist[k] = 0;
   for (e = n->out; e; e = e->next_out)
-if (e->distance == 0 && g->nodes[e->dest->cuid].aux.count == n->aux.count)
-  n->max_dist[e->dest->cuid] = e->latency;
+   if (e->distance == 0 && g->nodes[e->dest->cuid].aux.count == n->aux.count)
+ n->max_dist[e->dest->cuid] = e->latency;
 }
 
   /* Run main Floid-Warshall loop.  We use only non-backarc edges
@@ -1079,19 +1079,19 @@ create_ddg_all_sccs (ddg_ptr g)
 {
   scc = g->nodes[k].aux.count;
   if (scc != -1)
-{
-  for (i = 0; i < num_nodes; i++)
-if (g->nodes[i].aux.count == scc)
-  for (j = 0; j < num_nodes; j++)
-if (g->nodes[j].aux.count == scc
-&& g->nodes[i].max_dist[k] >= 0
-&& g->nodes[k].max_dist[j] >= 0)
-  {
-way = g->nodes[i].max_dist[k] + g->nodes[k].max_dist[j];
-if (g->nodes[i].max_dist[j] < way)
-  g->nodes[i].max_dist[j] = way;
-  }
-}
+   {
+ for (i = 0; i < num_nodes; i++)
+   if (g->nodes[i].aux.count == scc)
+ for (j = 0; j < num_nodes; j++)
+   if (g->nodes[j].aux.count == scc
+   && g->nodes[i].max_dist[k] >= 0
+   && g->nodes[k].max_dist[j] >= 0)
+ {
+   way = g->nodes[i].max_dist[k] + g->nodes[k].max_dist[j];
+   if (g->nodes[i].max_dist[j] < way)
+ g->nodes[i].max_dist[j] = way;
+ }
+   }
 }
 
   /* Calculate recurrence_length using max_dist info.  */



[committed] testsuite: Scan for SSE reg-reg moves only in pr80481.C

2020-03-10 Thread Uros Bizjak
The function needs more than 8 SSE registers, so avoid
false positives triggered by SSE spills on 32-bit targets.
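
To illustrate what the narrowed scan is meant to distinguish, here is a small
standalone sketch that applies the same expression to single assembly lines.
It is an assumption-laden illustration, not part of the testsuite: it uses
POSIX regcomp/regexec rather than the Tcl regexp engine DejaGnu uses, the
function name is invented, and the Tcl-level backslashes before '[' are dropped
because they are, as I read it, only quoting for Tcl.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if LINE contains a vmovaps whose operands
   name two zmm registers separated by a comma, 0 otherwise (also 0 if the
   pattern fails to compile).  */
static int
is_zmm_reg_reg_vmovaps (const char *line)
{
  /* POSIX ERE spelling of the pattern added to pr80481.C.  */
  static const char pattern[]
    = "vmovaps[^\n\r]*zmm[0-9]+,[^\n\r]*zmm[0-9]+";
  regex_t re;
  int matched;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  matched = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

A register-to-register move mentions two zmm registers and matches; a spill to
or reload from a stack slot mentions only one and does not, which is the false
positive the old bare "vmovaps" scan tripped over.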

2020-03-10  Uroš Bizjak  

* g++.dg/pr80481.C (dg-final): Scan for SSE reg-reg moves only.

Tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/testsuite/g++.dg/pr80481.C b/gcc/testsuite/g++.dg/pr80481.C
index c565ad24d90..78c463b8e3b 100644
--- a/gcc/testsuite/g++.dg/pr80481.C
+++ b/gcc/testsuite/g++.dg/pr80481.C
@@ -1,11 +1,9 @@
 // { dg-do compile { target { i?86-*-* x86_64-*-* }  && { ! *-*-solaris* } } }
 // { dg-options "-Ofast -funroll-loops -fopenmp -march=knl" }
-// { dg-final { scan-assembler-not "vmovaps" } }
 // Disabling epilogues until we find a better way to deal with scans.
 // { dg-additional-options "--param vect-epilogues-nomask=0" }
 
 
-
 #include 
 
 #include 
@@ -72,3 +70,5 @@ void foo (Sdata *in, int idx, float *out)
   _mm_free(y3);
   _mm_free(y4);
 }
+
+// { dg-final { scan-assembler-not "vmovaps\[^\n\r]*zmm\[0-9]+,\[^\n\r]*zmm\[0-9]+" } }


Fix modulo-scheduler -fcompare-debug issues

2020-03-10 Thread Roman Zhuykov
Hi!

The current modulo-sched implementation is a bit faulty from the
-fcompare-debug perspective.

I found that a few years ago. The simplest example is pr42631.c, which fails
(with just -fmodulo-sched or together with -fmodulo-sched-allow-regmoves) on
powerpc64le with at least gcc-4.9 and newer compilers.
I investigated that difference about 3 years ago. It is mostly technical:
the dumps show there are some "flying" NOTE_INSN_DELETED items.
I understood that it was minor, and I planned to commit the fix only when my
other modulo-sched work was ready.

But right now I see that when I enable -fmodulo-sched by default, powerpc64le
bootstrap gives a comparison failure as of r10-7056.

Comparing stages 2 and 3
Bootstrap comparison failure!
gcc/ggc-page.o differs

That doesn't happen on released branches, so it is a kind of "regression" 
(certainly, nobody runs bootstrap with -fmodulo-sched).

Is that a good reason to commit the patch right now in stage4?

Patch was successfully regstrapped (based on r10-7056) using x86_64 and 
powerpc64le, both with default options and with -fmodulo-sched enabled.

Roman

--
modulo-sched: fix compare-debug issues

This patch fixes bootstrap comparison failure on powerpc64le when running it
with -fmodulo-sched enabled.

When applying the schedule we have to move debug insns in the same
way as we move note insns.  Also we have to avoid adding debug insns
to SCCs in the DDG graph.

20YY-MM-DD  Roman Zhuykov  
   
* ddg.c (create_ddg_dep_from_intra_loop_link): Adjust assertions.
(create_ddg_dep_no_link): Likewise.
(add_inter_loop_mem_dep): Do not create "debug --> non-debug" anti-deps.
(create_ddg): Adjust first_note field filling.
(check_sccs): Assert if any debug instruction is in SCC.
* modulo-sched.c (ps_first_note): Add an assertion if first_note
is empty.

testsuite:

20YY-MM-DD  Roman Zhuykov  

* gcc.dg/pr42631-sms1.c: New test.
* gcc.dg/pr42631-sms2.c: New test.

diff --git a/gcc/ddg.c b/gcc/ddg.c
index ca8cb74823..048207a354 100644
--- a/gcc/ddg.c
+++ b/gcc/ddg.c
@@ -185,8 +185,8 @@ create_ddg_dep_from_intra_loop_link (ddg_ptr g, ddg_node_ptr src_node,
   else if (DEP_TYPE (link) == REG_DEP_OUTPUT)
 t = OUTPUT_DEP;
 
-  gcc_assert (!DEBUG_INSN_P (dest_node->insn) || t == ANTI_DEP);
-  gcc_assert (!DEBUG_INSN_P (src_node->insn) || t == ANTI_DEP);
+  gcc_assert (NONDEBUG_INSN_P (dest_node->insn) || t == ANTI_DEP);
+  gcc_assert (NONDEBUG_INSN_P (src_node->insn) || DEBUG_INSN_P (dest_node->insn));
 
   /* We currently choose not to create certain anti-deps edges and
  compensate for that by generating reg-moves based on the life-range
@@ -222,9 +222,9 @@ create_ddg_dep_from_intra_loop_link (ddg_ptr g, ddg_node_ptr src_node,
 }
 }
 
-   latency = dep_cost (link);
-   e = create_ddg_edge (src_node, dest_node, t, dt, latency, distance);
-   add_edge_to_ddg (g, e);
+  latency = dep_cost (link);
+  e = create_ddg_edge (src_node, dest_node, t, dt, latency, distance);
+  add_edge_to_ddg (g, e);
 }
 
 /* The same as the above function, but it doesn't require a link parameter.  */
@@ -237,8 +237,8 @@ create_ddg_dep_no_link (ddg_ptr g, ddg_node_ptr from, ddg_node_ptr to,
   enum reg_note dep_kind;
   struct _dep _dep, *dep = &_dep;
 
-  gcc_assert (!DEBUG_INSN_P (to->insn) || d_t == ANTI_DEP);
-  gcc_assert (!DEBUG_INSN_P (from->insn) || d_t == ANTI_DEP);
+  gcc_assert (NONDEBUG_INSN_P (to->insn) || d_t == ANTI_DEP);
+  gcc_assert (NONDEBUG_INSN_P (from->insn) || DEBUG_INSN_P (to->insn));
 
   if (d_t == ANTI_DEP)
 dep_kind = REG_DEP_ANTI;
@@ -455,8 +455,12 @@ add_inter_loop_mem_dep (ddg_ptr g, ddg_node_ptr from, ddg_node_ptr to)
return;
   else if (from->cuid != to->cuid)
{
- create_ddg_dep_no_link (g, from, to, ANTI_DEP, MEM_DEP, 1);
- if (DEBUG_INSN_P (from->insn) || DEBUG_INSN_P (to->insn))
+ gcc_assert (NONDEBUG_INSN_P (to->insn));
+
+ if (NONDEBUG_INSN_P (from->insn))
+   create_ddg_dep_no_link (g, from, to, ANTI_DEP, MEM_DEP, 1);
+
+ if (DEBUG_INSN_P (from->insn))
create_ddg_dep_no_link (g, to, from, ANTI_DEP, MEM_DEP, 1);
  else
create_ddg_dep_no_link (g, to, from, TRUE_DEP, MEM_DEP, 1);
@@ -607,28 +611,34 @@ create_ddg (basic_block bb, int closing_branch_deps)
   if (! INSN_P (insn))
{
  if (! first_note && NOTE_P (insn)
- && NOTE_KIND (insn) !=  NOTE_INSN_BASIC_BLOCK)
+ && NOTE_KIND (insn) != NOTE_INSN_BASIC_BLOCK)
first_note = insn;
  continue;
}
+
   if (JUMP_P (insn))
{
  gcc_assert (!g->closing_branch);
  g->closing_branch = &g->nodes[i];
}
-  else if (GET_CODE (PATTERN (insn)) == USE)
-   {
- if (! first_note)
-   first_note = insn;
- continue;
-   }
+
+  if (! first_note)
+   first_note = insn;
+
+

Re: c: ignore initializers for elements of variable-size types [PR93577]

2020-03-10 Thread Christophe Lyon
On Tue, 10 Mar 2020 at 01:52, Joseph Myers  wrote:
>
> On Mon, 9 Mar 2020, Christophe Lyon wrote:
>
> > Hi Joseph,
> >
> > I've noticed that your patch introduces regressions on aarch64:
> > FAIL: gcc.target/aarch64/sve/acle/general-c/sizeless-1.c
> > -march=armv8.2-a+sve  (test for errors, line 33)
> > we now get
> > /gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/sizeless-1.c:33:44:
> > error: empty scalar initializer
> > while we expect dg-error {initializer element is not constant}
> >
> > FAIL: gcc.target/aarch64/sve/acle/general-c/sizeless-1.c
> > -march=armv8.2-a+sve  (test for errors, line 85)
> > we no longer emit dg-error {empty scalar initializer }
> >
> > FAIL: gcc.target/aarch64/sve/acle/general-c/sizeless-1.c
> > -march=armv8.2-a+sve (test for excess errors)
> > FAIL: gcc.target/aarch64/sve/acle/general-c/sizeless-2.c
> > -march=armv8.2-a+sve  (test for errors, line 85)
> > we no longer emit dg-error {empty scalar initializer }
> >
> > Since the compiler did not ICE before your patch, is that new
> > behaviour expected (and the tests need an update), or is that a
> > problem with the patch?
>
> Where there has already been an error about the type of an initializer,
> it's expected that some other errors about the value of that initializer
> will disappear.  So I think the cases where a previously expected error
> has disappeared are cases where the tests need an update; they already
> expect an error for the type.
>
OK, makes sense.


> That leaves the case where you report that "empty scalar initializer" has
> appeared.  That seems like a bug.  Maybe some SVE case means
> process_init_element is ignoring something in an initializer for an SVE
> type that did not in fact result in the variable-size error from
> digest_init?  Is the "empty scalar initializer" error coming from the
> compound literal on line 33, as opposed to the variable initialized on
> that line?
>
sizeless-1.c and sizeless-2.c have the same code, but the latter is
compiled with -msve-vector-bits=256 and expects different
warnings/errors.
For line 33:
svint8_t *invalid_sve_sc_ptr = &(svint8_t) { *global_sve_sc_ptr };
we now have:
sizeless-1.c:33:44: error: empty scalar initializer
sizeless-1.c:33:44: note: (near initialization for '(anonymous)')
and
sizeless-2.c:33:44: error: initializer element is not constant
sizeless-2.c:33:44: note: (near initialization for 'invalid_sve_sc_ptr')
sizeless-2.c:33:44: error: SVE type 'svint8_t' does not have a fixed size
so I think the error comes from the compound literal being treated
differently with -msve-vector-bits=256


> --
> Joseph S. Myers
> jos...@codesourcery.com


[committed] Revert testcase change after IRA change reversion

2020-03-10 Thread Jeff Law
I suspect there'll be a couple more as the tester churns through the targets.


commit aed151bb53b44d523e2732ca6add9324c4ff9798
Author: Jeff Law 
Date:   Tue Mar 10 08:38:14 2020 -0600

Revert "Fix regression reported by tester due to recent IRA changes"

This reverts commit d48e1175279a551bf90aa5b165fc46a1d5a2c07e.

2020-03-10  Jeff Law  

Revert:
2020-02-29  Jeff Law  

* gcc.target/xstormy16/sfr/06_sfrw_to_var.c: Update expected output.

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 5ed497dc44c..089874e5f7c 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2020-03-10  Jeff Law  
+
+   Revert:
+   2020-02-29  Jeff Law  
+
+   * gcc.target/xstormy16/sfr/06_sfrw_to_var.c: Update expected output.
+
 2020-03-10  Jakub Jelinek  
 
PR target/94088
diff --git a/gcc/testsuite/gcc.target/xstormy16/sfr/06_sfrw_to_var.c b/gcc/testsuite/gcc.target/xstormy16/sfr/06_sfrw_to_var.c
index 54c9baf8746..39cbab5c3e9 100644
--- a/gcc/testsuite/gcc.target/xstormy16/sfr/06_sfrw_to_var.c
+++ b/gcc/testsuite/gcc.target/xstormy16/sfr/06_sfrw_to_var.c
@@ -1,5 +1,5 @@
 /* { dg-options { -nostartfiles below100.o -Tbelow100.ld -O2 } } */
-/* { dg-final { scan-assembler "mov.w r1,32532" } } */
+/* { dg-final { scan-assembler "mov.w r6,32532" } } */
 
 #define SFR (*((volatile unsigned short*)0x7f14))
 unsigned short *p = (unsigned short *) 0x7f14;


Re: [PATCH][RFC] API extension for binutils (type of symbols).

2020-03-10 Thread Michael Matz
Hello,

On Tue, 10 Mar 2020, Martin Liška wrote:

> >>> We nee to support different variables, like TLS, data and bss variables.
> >>
> >> Why do we need TLS? Right now, it's not supported by nm.
> > 
> > Of course it does.  It's the 'T' (or 't') character.
> 
> Thank you reply!
> 
> Are you sure about it?

Err, I bungled in between writing emails, you are right, nm(1) in BSD mode 
doesn't make a difference for TLS symbols.  (And of course T/t are for 
text, aka code, symbols).

Doesn't invalidate the rest of my email, though.


Ciao,
Michael.


[PATCH][AARCH64] Fix for PR94121

2020-03-10 Thread lizekun (A)
Hi,

This is a fix trying to solve PR94121.

The ICE appears when generating an add insn with the offset. If the offset is
negative, function aarch64_add_offset_1 in aarch64.c takes its absolute
value.
With this fix, the absolute value of the offset is not taken if the offset
equals HOST_WIDE_INT_MIN, the minimum value of a HOST_WIDE_INT.
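
As background for why the minimum needs its own branch: in two's complement the
most negative value has no positive counterpart, so its magnitude cannot be
stored in the same signed type. A minimal sketch of a guarded absolute value,
assuming a 64-bit HOST_WIDE_INT modelled here with int64_t; the function name
is hypothetical and this is not the code from the patch:

```c
#include <stdint.h>

/* Hypothetical helper: store |VALUE| in *MAGNITUDE and return 1 when the
   magnitude is representable; return 0 for INT64_MIN, whose magnitude
   does not fit in int64_t (negating it would overflow).  */
static int
checked_abs_i64 (int64_t value, int64_t *magnitude)
{
  if (value == INT64_MIN)
    return 0;
  *magnitude = value < 0 ? -value : value;
  return 1;
}
```

A caller that gets 0 back has to take a separate path for that one value,
which is the role of the new early-return block in the patch below.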

Added one test case for this. Bootstrap and tested on aarch64 Linux platform.  

Zekun Li


Log:
PR 94121
* aarch64.c (aarch64_add_offset_1): Add a branch 
when generating addition assembly expression with 
offset HOST_WIDE_INT_MIN.
* gcc.target/aarch64/PR94121.c: New test.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4b9747b4c5e..a6b60cf8a92 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3713,9 +3713,31 @@ aarch64_add_offset_1 (scalar_int_mode mode, rtx dest,
   gcc_assert (emit_move_imm || temp1 != NULL_RTX);
   gcc_assert (temp1 == NULL_RTX || !reg_overlap_mentioned_p (temp1, src));
 
-  HOST_WIDE_INT moffset = abs_hwi (offset);
   rtx_insn *insn;
 
+  if (offset == HOST_WIDE_INT_MIN)
+{
+  if (emit_move_imm)
+   {
+ gcc_assert (temp1 != NULL_RTX || can_create_pseudo_p ());
+ temp1 = aarch64_force_temporary (mode, temp1, GEN_INT (offset));
+ insn = emit_insn (gen_add3_insn (dest, src, temp1));
+   }
+  else
+   {
+ insn = emit_insn (gen_sub3_insn (dest, src, temp1));
+   }
+  if (frame_related_p)
+   {
+ RTX_FRAME_RELATED_P (insn) = frame_related_p;
+ rtx adj = plus_constant (mode, src, offset);
+ add_reg_note (insn, REG_CFA_ADJUST_CFA, gen_rtx_SET (dest, adj));
+   }
+   return;
+}
+
+  HOST_WIDE_INT moffset = abs_hwi (offset);
+
   if (!moffset)
 {
   if (!rtx_equal_p (dest, src))
diff --git a/gcc/testsuite/gcc.target/aarch64/PR94121.c b/gcc/testsuite/gcc.target/aarch64/PR94121.c
new file mode 100644
index 000..8960f8cbf62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/PR94121.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpie" } */
+
+#define DIFF_MAX __PTRDIFF_MAX__
+#define DIFF_MIN (-DIFF_MAX - 1)
+
+extern
+void foo ();
+
+void test_global_char_array (void)
+{
+  extern char gcar3[1];
+  char *p = gcar3;
+  foo (&p[DIFF_MIN]);
+}


[PATCHv2] Ada: gcc-interface: fixed assertion for aliased entities

2020-03-10 Thread Richard Wai
This is the second go at this patch, and now with a testcase!

In summary:

If the type is derived in the current compilation unit, and Allocate is not
overridden on derivation (as is typically the case with
Root_Storage_Pool_With_Subpools), the entity for Allocate for the derived
type is then an alias to System.Storage_Pools.Subpools.Allocate. When the
allocator is built, gnat_to_gnu_entity is called with definition == false
for the derived storage pool's allocate operation. An assertion in
gnat_to_gnu_entity fails in this case, since it is not a definition, and
Is_Public is false (since the entity is nested in the same compilation
unit).

This patch adds an extra check in the assertion (decl.c: gnat_to_gnu_entity)
that the entity has an Alias, and that the Alias is also Public.


Added a regression test for the declaration and allocation from a
Root_Storage_Pool_With_Subpools type derived within the same compilation unit
(a package nested in a subprogram in this testcase).

diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
index 871a309ab7d..ae49c2625f8 100644
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -447,6 +447,17 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
   /* If we get here, it means we have not yet done anything with this entity.
  If we are not defining it, it must be a type or an entity that is defined
  elsewhere or externally, otherwise we should have defined it already.  */
+
+  /* One exception relates to an entity, typically an inherited operation,
+ which has an alias pointing to the parent's operation. Often such an
+ aliased entity will also carry with it the Is_Public property if it was
+ declared in a separate compilation unit, but when a type is extended
+ within the current unit, the aliased entity will not pass this
+ assertion. It is neither defined (since it is an inherited operation,
+ and is not Public, since it is within the current compilation unit.
+
+For this case we look for an Alias that is also Public */
+  
   gcc_assert (definition
  || is_type
  || kind == E_Discriminant
@@ -454,6 +465,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
  || kind == E_Label
  || (kind == E_Constant && Present (Full_View (gnat_entity)))
  || Is_Public (gnat_entity)
+  || (Present (Alias (gnat_entity)) && Is_Public (Alias (gnat_entity)))
  || type_annotate_only);
 
   /* Get the name of the entity and set up the line number and filename of
diff --git a/gcc/testsuite/gnat.dg/subpools1.adb b/gcc/testsuite/gnat.dg/subpools1.adb
new file mode 100644
index 000..87b9e53baca
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/subpools1.adb
@@ -0,0 +1,99 @@
+-- { dg-do compile }
+
+with System.Storage_Elements;
+with System.Storage_Pools.Subpools;
+
+procedure Subpools1 is
+
+   use System.Storage_Pools.Subpools;
+   
+   package Local_Pools is
+  
+  use System.Storage_Elements;
+  
+  subtype Address is System.Address;
+  use type Address;
+  
+  type Local_Pool is new Root_Storage_Pool_With_Subpools 
+with null record;
+  
+  overriding
+  function Create_Subpool (Pool: in out Local_Pool)
+  return not null Subpool_Handle;
+  
+  overriding
+  procedure Allocate_From_Subpool
+(Pool: in out Local_Pool;
+ Storage_Address :out Address;
+ Size_In_Storage_Elements: in Storage_Count;
+ Alignment   : in Storage_Count;
+ Subpool : in not null Subpool_Handle);
+  
+  overriding
+  procedure Deallocate_Subpool
+(Pool   : in out Local_Pool;
+ Subpool: in out Subpool_Handle) is null;
+  
+  overriding
+  function Default_Subpool_For_Pool (Pool: in out Local_Pool)
+return not null Subpool_Handle;
+  
+   end Local_Pools;
+   
+   package body Local_Pools is
+  
+  type Local_Subpool is new Root_Subpool with null record;
+  
+  Dummy_Subpool: aliased Local_Subpool;
+  
+  overriding
+  function Create_Subpool (Pool: in out Local_Pool)
+  return not null Subpool_Handle 
+  is 
+  begin 
+ return Result: not null Subpool_Handle 
+   := Dummy_Subpool'Unchecked_Access
+ do
+Set_Pool_Of_Subpool (Result, Pool);
+ end return;
+  end Create_Subpool;
+  
+  
+  overriding
+  procedure Allocate_From_Subpool
+(Pool: in out Local_Pool;
+ Storage_Address :out Address;
+ Size_In_Storage_Elements: in Storage_Count;
+ Alignment   : in Storage_Count;
+ Subpool : in not null S

Re: [PATCH 1/6] i386: Properly encode vector registers in vector move

2020-03-10 Thread H.J. Lu
On Thu, Mar 5, 2020 at 3:47 PM Jeff Law  wrote:
>
> On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> > On x86, when AVX and AVX512 are enabled, vector move instructions can
> > be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):
> >
> >0: c5 f9 6f d1 vmovdqa %xmm1,%xmm2
> >4: 62 f1 fd 08 6f d1   vmovdqa64 %xmm1,%xmm2
> >
> > We prefer VEX encoding over EVEX since VEX is shorter.  Also AVX512F
> > only supports 512-bit vector moves.  AVX512F + AVX512VL supports 128-bit
> > and 256-bit vector moves.  xmm16-xmm31 and ymm16-ymm31 are disallowed in
> > 128-bit and 256-bit modes when AVX512VL is disabled.  Mode attributes on
> > x86 vector move patterns indicate target preferences of vector move
> > encoding.  For scalar register to register move, we can use 512-bit
> > vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
> > available.  With AVX512F and AVX512VL, we should use VEX encoding for
> > 128-bit/256-bit vector moves if upper 16 vector registers aren't used.
> > This patch adds a function, ix86_output_ssemov, to generate vector moves:
> >
> > 1. If zmm registers are used, use EVEX encoding.
> > 2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
> > will be generated.
> > 3. If xmm16-xmm31/ymm16-ymm31 registers are used:
> >a. With AVX512VL, AVX512VL vector moves will be generated.
> >b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
> >   move will be done with zmm register move.
> >
> > There is no need to set mode attribute to XImode explicitly since
> > ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
> > with and without AVX512VL.
> >
> > Tested on AVX2 and AVX512 with and without --with-arch=native.
> >
> > gcc/
> >
> >   PR target/89229
> >   PR target/89346
> >   * config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
> >   * config/i386/i386.c (ix86_get_ssemov): New function.
> >   (ix86_output_ssemov): Likewise.
> >   * config/i386/sse.md (VMOVE:mov_internal): Call
> >   ix86_output_ssemov for TYPE_SSEMOV.  Remove TARGET_AVX512VL
> >   check.
> >   (*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV.
> >   (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
> >   Remove ext_sse_reg_operand and TARGET_AVX512VL check.
> >   (*movti_internal): Likewise.
> >   (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
> >
> > gcc/testsuite/
> >
> >   PR target/89229
> >   PR target/89346
> >   * gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
> >   * gcc.target/i386/pr89346.c: New test.
> >
> > gcc/testsuite/
> >
> >   PR target/89229
> >   * gcc.target/i386/pr89229-2a.c: New test.
> >   * gcc.target/i386/pr89229-2b.c: Likewise.
> >   * gcc.target/i386/pr89229-2c.c: Likewise.
> >   * gcc.target/i386/pr89229-3a.c: Likewise.
> >   * gcc.target/i386/pr89229-3b.c: Likewise.
> >   * gcc.target/i386/pr89229-3c.c: Likewise.
> OK.  Let's get this one installed, let the various testers out there chew on
> it for a day, then we'll iterate through the rest.
>
> Thanks again for your patience.

Hi, Jeff,

My first patch has been installed for 5 days without problems.  Can you
review the rest?

Thanks.


-- 
H.J.
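
For readers following the thread, the three encoding rules quoted above can be
sketched as a small decision function.  This is only an illustration of the
rules as described in the patch summary, not the code of ix86_output_ssemov;
the struct, the enum and the function name pick_move_encoding are hypothetical
names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the outcomes described in the patch summary;
   they are not identifiers from i386.c.  */
enum move_encoding
{
  ENC_SSE_OR_VEX,   /* rule 2: no xmm16-31/ymm16-31 register involved */
  ENC_EVEX_VL,      /* rule 3a: 128/256-bit EVEX move, needs AVX512VL */
  ENC_EVEX_ZMM      /* rules 1 and 3b: 512-bit EVEX move */
};

/* Hypothetical summary of one vector move.  */
struct move_info
{
  bool uses_zmm;          /* an operand is a zmm register */
  bool uses_ext_sse_reg;  /* an operand is xmm16-31 or ymm16-31 */
  bool have_avx512vl;     /* AVX512VL is enabled */
};

static enum move_encoding
pick_move_encoding (const struct move_info *m)
{
  assert (m != NULL);
  if (m->uses_zmm)
    return ENC_EVEX_ZMM;                 /* rule 1 */
  if (!m->uses_ext_sse_reg)
    return ENC_SSE_OR_VEX;               /* rule 2 */
  /* Rule 3: upper 16 registers are involved.  */
  return m->have_avx512vl ? ENC_EVEX_VL : ENC_EVEX_ZMM;
}
```

The point of rule 3b is that a register-to-register move of xmm16-xmm31 can
still be expressed without AVX512VL by widening it to a zmm move.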


Patch ping

2020-03-10 Thread Jakub Jelinek
Hi!

I'd like to ping the
https://gcc.gnu.org/legacy-ml/gcc-patches/2020-03/msg00154.html
  P1 PR94015
patch, with the https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94015#c5
testcase instead of the one sent in the patch, so that it FAILs without the
fix on more targets and more reliably.

Thanks

Jakub



Re: [PATCH] issues with configure --enable-checking option

2020-03-10 Thread Roman Zhuykov
Hi!

25.02.2020 13:36, Roman Zhuykov wrote:

> So, IMHO the best next step is to improve the behavior rather than docs :)

I want to add one more point to this discussion.  I recently
decided to test some stuff on old branches, e.g. gcc-4.9, 5 and 6.
On modern systems there are some issues with building old branches; at
least I met "struct ucontext -> ucontext_t",
"sys/ustat.h - no such file" and a few others.  But in this particular
experiment I was using the fairly old Ubuntu 16.04,
and there were no issues with building the unpatched frozen branches.

But since the stuff was really experimental, at some point I
decided to apply the following to enable more checking:

diff --git a/gcc/DEV-PHASE b/gcc/DEV-PHASE
--- a/gcc/DEV-PHASE
+++ b/gcc/DEV-PHASE
@@ -0,0 +1 @@
+experimental
diff --git a/gcc/configure b/gcc/configure
--- a/gcc/configure
+++ b/gcc/configure
@@ -6727,7 +6727,7 @@ do
 # these set all the flags to specific states
 yes)        ac_assert_checking=1 ; ac_checking=1 ; ac_df_checking= ;
         ac_fold_checking= ; ac_gc_checking=1 ;
-            ac_gc_always_collect= ; ac_gimple_checking=1 ; ac_rtl_checking= ;
+            ac_gc_always_collect= ; ac_gimple_checking=1 ; ac_rtl_checking=1 ;
         ac_rtlflag_checking=1 ; ac_runtime_checking=1 ;
         ac_tree_checking=1 ; ac_valgrind_checking= ;
         ac_types_checking=1 ;;
diff --git a/gcc/configure.ac b/gcc/configure.ac
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -427,7 +427,7 @@ do
 # these set all the flags to specific states
 yes)        ac_assert_checking=1 ; ac_checking=1 ; ac_df_checking= ;
         ac_fold_checking= ; ac_gc_checking=1 ;
-            ac_gc_always_collect= ; ac_gimple_checking=1 ; ac_rtl_checking= ;
+            ac_gc_always_collect= ; ac_gimple_checking=1 ; ac_rtl_checking=1 ;
         ac_rtlflag_checking=1 ; ac_runtime_checking=1 ;
         ac_tree_checking=1 ; ac_valgrind_checking= ;
         ac_types_checking=1 ;;

And that gives broken basic x86_64 bootstrap on gcc-5 and gcc-6 branches!

First, the gcc-4.9 branch: it works fine.  A separate story is
that it failed with RTL checks on ppc64.  Alex told me there was a
period when everybody forgot to test that.  Everything went fine
after backporting r243144 (r5-10072) and r212829 (r5-1977).

But the gcc-5 branch case was much more tricky.  The following 3 hunks
are needed to fix x86_64 bootstrap.
And that one with UNKNOWN_LOCATION is really a null pointer dereference
we put into a released compiler!

diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -138,7 +138,7 @@ gfc_run_passes (gfc_namespace *ns)
  */
 
 static int
-realloc_string_callback (gfc_code **c, int *walk_subtrees,
+realloc_string_callback (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED,
          void *data ATTRIBUTE_UNUSED)
 {
   gfc_expr *expr1, *expr2;
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -38,7 +38,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"    /* For create_tmp_var_raw.  */
 #include "stringpool.h"
 #include "gfortran.h"
-#include "diagnostic-core.h"    /* For internal_error.  */
 #include "trans.h"
 #include "trans-stmt.h"
 #include "trans-types.h"
diff --git a/gcc/toplev.c b/gcc/toplev.c
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1386,8 +1386,7 @@ process_options (void)
 
   if (flag_sanitize & SANITIZE_THREAD)
 {
-      error (UNKNOWN_LOCATION,
-         "%<-fcheck-pointer-bounds%> is not supported with "
+      error ("%<-fcheck-pointer-bounds%> is not supported with "
      "Thread Sanitizer");
 
   flag_check_pointer_bounds = 0;


In the gcc-6 branch the solution was simple: I had to revert r262046
(r6-10168), though I haven't investigated that deeply.

Overall, IMHO this is one more reason to review the current
branching/checking approach.
I understand that the main purpose of an empty DEV-PHASE is to test an
"almost released" compiler in the same way it will be compiled from
release archives.
But these issues show it's also necessary to run a checking bootstrap
when backporting any patch.

PS. Everything seems fine in gcc-7 branch.

Roman
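
To illustrate why a checking-enabled build can expose bugs that a release
build silently tolerates, here is a minimal sketch of a checked accessor.  It
is not GCC code: the node type, the ENABLE_CHECKING_SKETCH macro and
node_operand are hypothetical names, and the only assumption is the general
idea that checking builds validate accessor use at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for a configure-time checking flag.  */
#ifndef ENABLE_CHECKING_SKETCH
#define ENABLE_CHECKING_SKETCH 1
#endif

enum node_code { NODE_INT, NODE_PAIR };

struct node
{
  enum node_code code;
  long value;            /* meaningful only for NODE_INT */
  struct node *op[2];    /* meaningful only for NODE_PAIR */
};

/* Return operand I of N.  With checking enabled, misuse (wrong node
   kind or index out of range) aborts immediately; without it, the same
   misuse would quietly read unrelated or uninitialized data.  */
static struct node *
node_operand (struct node *n, int i)
{
#if ENABLE_CHECKING_SKETCH
  assert (n != NULL && n->code == NODE_PAIR && i >= 0 && i < 2);
#endif
  return n->op[i];
}
```

A branch that is only ever bootstrapped with such checks disabled can
accumulate misuses of this kind without anyone noticing.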




Re: [PATCH][RFC] API extension for binutils (type of symbols).

2020-03-10 Thread Martin Liška

On 3/10/20 10:39 AM, Martin Liška wrote:


Are you sure about it?
$ gcc gcc/testsuite/gcc.target/i386/pr56564-3.c -c -fpic && nm pr56564-3.o
...
 D s
0010 D t

?


A nicer example:

$ cat tls.c
__thread struct S { long a, b; } s = { 0, 0 };
__thread char t[16] = { 7 };

$ gcc tls.c -c && nm -B tls.o
 B s
 D t

Well, TLS seems to me an orthogonal problem. Similarly other special symbols
like .func, .symver, alias attribute, ...

Martin


Re: [PATCH][RFC] API extension for binutils (type of symbols).

2020-03-10 Thread Martin Liška

On 3/10/20 12:24 PM, Richard Biener wrote:

> Not sure how symtab is encoded right now but we also could have


Ok, right now I don't see the symtab entry as very extensible.

But what I am suggesting is to parse the LTO bytecode version and then
conditionally parse the lto_symtab section.

Thoughts?
Martin
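
A rough sketch of the "conditional parsing" idea: the fixed-size tail of a
symtab entry is read with or without the extra symbol-type byte, depending on
what the version check decided.  The field order (kind, visibility, optional
type, 8-byte size, 4-byte slot) follows write_symbol in the attached patch;
the struct, the function name and the has_type flag are hypothetical, and the
name/comdat strings that precede these fields are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of the fixed-size part of one entry.  */
struct sym_tail
{
  unsigned char kind;
  unsigned char visibility;
  unsigned char type;     /* 0 stands for "unknown" */
  uint64_t size;
  uint32_t slot;
};

/* Parse the tail starting at P and return a pointer just past it.
   HAS_TYPE says whether the producer streamed the symbol-type byte;
   in the proposal that would be derived from the LTO bytecode version.
   Integers are copied in host byte order, an assumption of this sketch.  */
static const unsigned char *
parse_sym_tail (const unsigned char *p, bool has_type, struct sym_tail *out)
{
  assert (p != NULL && out != NULL);
  out->kind = *p++;
  out->visibility = *p++;
  out->type = has_type ? *p++ : 0;
  memcpy (&out->size, p, 8);
  p += 8;
  memcpy (&out->slot, p, 4);
  p += 4;
  return p;
}
```

An entry in the old format is simply one byte shorter, so a plugin that knows
the version can walk both old and new tables.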
From 017dd9cba9a0222104682bef094d9f5057d2c9ae Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 6 Mar 2020 18:09:35 +0100
Subject: [PATCH] API extension for binutils (type of symbols).

gcc/ChangeLog:

2020-03-09  Martin Liska  

	* lto-streamer-out.c (write_symbol): Stream
	symbol type.

include/ChangeLog:

2020-03-09  Martin Liska  

	* lto-symtab.h (enum gcc_plugin_symbol_type): New.
	* plugin-api.h (struct ld_plugin_symbol): New member
	symbol_type.
	(enum ld_plugin_symbol_type): New.
	(enum ld_plugin_tag): Add new tag LDPT_GET_SYMBOLS_V4.

lto-plugin/ChangeLog:

2020-03-09  Martin Liska  

	* lto-plugin.c (parse_table_entry): Parse symbol type.
	(LTO_LTO_PREFIX): New.
	(LTO_LTO_PREFIX_LEN): New.
	(struct lto_section): New.
---
 gcc/lto-streamer-out.c  | 14 
 include/lto-symtab.h|  8 +++
 include/plugin-api.h| 14 +++-
 lto-plugin/lto-plugin.c | 48 -
 4 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index cea5e71cffb..ead606eb665 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "print-tree.h"
 #include "tree-dfa.h"
 #include "file-prefix-map.h" /* remap_debug_filename()  */
+#include "output.h"
 
 
 static void lto_write_tree (struct output_block*, tree, bool);
@@ -2773,6 +2774,19 @@ write_symbol (struct streamer_tree_cache_d *cache,
   lto_write_data (&c, 1);
   c = (unsigned char) visibility;
   lto_write_data (&c, 1);
+
+  gcc_plugin_symbol_type st;
+  if (TREE_CODE (t) == VAR_DECL)
+{
+  section *s = get_variable_section (t, false);
+  st = (s->common.flags & SECTION_BSS
+	? GCCST_VARIABLE_BSS : GCCST_VARIABLE_DATA);
+}
+  else
+st = GCCST_FUNCTION;
+
+  c = (unsigned char) st;
+  lto_write_data (&c, 1);
   lto_write_data (&size, 8);
   lto_write_data (&slot_num, 4);
 }
diff --git a/include/lto-symtab.h b/include/lto-symtab.h
index 0ce0de10121..901bc3585c2 100644
--- a/include/lto-symtab.h
+++ b/include/lto-symtab.h
@@ -38,4 +38,12 @@ enum gcc_plugin_symbol_visibility
 GCCPV_HIDDEN
   };
 
+enum gcc_plugin_symbol_type
+{
+  GCCST_UNKNOWN,
+  GCCST_FUNCTION,
+  GCCST_VARIABLE_DATA,
+  GCCST_VARIABLE_BSS
+};
+
 #endif /* GCC_LTO_SYMTAB_H  */
diff --git a/include/plugin-api.h b/include/plugin-api.h
index 09e1202df07..794a2dcc4ee 100644
--- a/include/plugin-api.h
+++ b/include/plugin-api.h
@@ -92,6 +92,7 @@ struct ld_plugin_symbol
   uint64_t size;
   char *comdat_key;
   int resolution;
+  int symbol_type;
 };
 
 /* An object's section.  */
@@ -123,6 +124,16 @@ enum ld_plugin_symbol_visibility
   LDPV_HIDDEN
 };
 
+/* The type of the symbol.  */
+
+enum ld_plugin_symbol_type
+{
+  LDST_UNKNOWN,
+  LDST_FUNCTION,
+  LDST_VARIABLE_DATA,
+  LDST_VARIABLE_BSS
+};
+
 /* How a symbol is resolved.  */
 
 enum ld_plugin_symbol_resolution
@@ -431,7 +442,8 @@ enum ld_plugin_tag
   LDPT_GET_INPUT_SECTION_ALIGNMENT = 29,
   LDPT_GET_INPUT_SECTION_SIZE = 30,
   LDPT_REGISTER_NEW_INPUT_HOOK = 31,
-  LDPT_GET_WRAP_SYMBOLS = 32
+  LDPT_GET_WRAP_SYMBOLS = 32,
+  LDPT_GET_SYMBOLS_V4 = 33,
 };
 
 /* The plugin transfer vector.  */
diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
index c307fc871bf..33afae9afb6 100644
--- a/lto-plugin/lto-plugin.c
+++ b/lto-plugin/lto-plugin.c
@@ -90,6 +90,8 @@ along with this program; see the file COPYING3.  If not see
 
 #define LTO_SECTION_PREFIX	".gnu.lto_.symtab"
 #define LTO_SECTION_PREFIX_LEN	(sizeof (LTO_SECTION_PREFIX) - 1)
+#define LTO_LTO_PREFIX		".gnu.lto_.lto"
+#define LTO_LTO_PREFIX_LEN	(sizeof (LTO_LTO_PREFIX) - 1)
 #define OFFLOAD_SECTION		".gnu.offload_lto_.opts"
 #define OFFLOAD_SECTION_LEN	(sizeof (OFFLOAD_SECTION) - 1)
 
@@ -221,6 +223,20 @@ check_1 (int gate, enum ld_plugin_level level, const char *text)
 }
 }
 
+/* Structure that represents LTO ELF section with information
+   about the format.  */
+
+struct lto_section
+ {
+   int16_t major_version;
+   int16_t minor_version;
+   unsigned char slim_object: 1;
+   unsigned char compression: 4;
+   int32_t reserved0: 27;
+};
+
+struct lto_section lto_header;
+
 /* This little wrapper allows check to be called with a non-integer
first argument, such as a pointer that must be non-NULL.  We can't
use c99 bool type to coerce it into range, so we explicitly test.  */
@@ -252,6 +268,14 @@ parse_table_entry (char *p, struct ld_plugin_symbol *entry,
   LDPV_HIDDEN
 };
 
+  enum ld_plugin_symbol_type symbol_types[] =
+{
+  LDST_UNKNOWN,
+  LDST_FUNCTION,
+  LDST_VARIABLE_DATA,
+  LDST_VARIABLE_BSS
+};
+
   switc

Re: [PATCH][GCC][AArch64]: Break apart paradoxical subregs for VSTRUCT writes (PR target/94052)

2020-03-10 Thread Richard Sandiford
Tamar Christina  writes:
> Hi All,
>
> This works around an ICE in reload where from expand we get the following RTL
> generated for VSTRUCT mode writes:
>
> (insn 446 354 445 2 (set (reg:CI 383)
>  (subreg:CI (reg:V4SI 291) 0)) "small.i":146:22 3408 {*aarch64_movci}
>  (nil))
>
> This sequence is trying to say two things:
>
> 1) liveness: It's trying to say that eventually the whole CI reg will be
>  written to. It does this by generating the paradoxical subreg.
> 2) write data: It's also trying, in the same instruction, to write the V4SI mode
>  component at offset 0 in the CI reg.
>
> Reload is unable to understand this concept and so it attempts to handle this
> instruction by breaking it apart, first writing the data and then
> trying to reload the paradoxical part.  This gets it to the same instruction
> again and eventually we ICE since we reach the limit on the number of reloads.

reload/LRA does understand the concept.  It just isn't handling this
particular case very well :-)

The pre-RA insn is:

(insn 210 218 209 6 (set (reg/v:OI 182 [ vres ])
(subreg:OI (reg:V4SI 289) 0)) "pr94052.C":157:31 3518 {*aarch64_movoi}
 (expr_list:REG_DEAD (reg:V4SI 289)
(nil)))

IRA allocates a hard register to r182 but not r289:

  126:r182 l032  ...
  ...
  129:r289 l0   mem  ...

This is because memory appears to be cheaper:

  a129(r289,l2) costs: FP_LO8_REGS:136,136 FP_LO_REGS:136,136 FP_REGS:136,136 MEM:110,110

That's probably a bug in itself (possibly in the target costs), but it
shouldn't be a correctness issue.

LRA then handles insn 210 like this:

  Creating newreg=317, assigning class ALL_REGS to slow/invalid mem r317
  Creating newreg=318, assigning class ALL_REGS to slow/invalid mem r318
  210: r182:OI=r318:OI
  REG_DEAD r289:V4SI
Inserting slow/invalid mem reload before:
  355: r317:V4SI=[sfp:DI+0x60]
  356: r318:OI=r317:V4SI#0

1 Non pseudo reload: reject++
  alt=0,overall=1,losers=0,rld_nregs=0
 Choosing alt 0 in insn 210:  (0) =w  (1) w {*aarch64_movoi}
  Change to class FP_REGS for r318

That looks OK so far, given the allocation.  But later we have:

** Assignment #2: **

 Assigning to 318 (cl=FP_REGS, orig=318, freq=44, tfirst=318, tfreq=44)...
   Assign 32 to subreg reload r318 (freq=44)
 Assigning to 317 (cl=ALL_REGS, orig=317, freq=44, tfirst=317, tfreq=44)...
   Assign 0 to subreg reload r317 (freq=44)

So LRA is assigning a GPR (x0) to the new V4SI register r317.  It's this
allocation that induces the cycling, because we get:

   Cycle danger: overall += LRA_MAX_REJECT
  alt=0,overall=606,losers=1,rld_nregs=2
0 Spill pseudo into memory: reject+=3
Using memory insn operand 0: reject+=3
0 Non input pseudo reload: reject++
Cycle danger: overall += LRA_MAX_REJECT
  alt=1,overall=619,losers=2,rld_nregs=2
1 Spill pseudo into memory: reject+=3
Using memory insn operand 1: reject+=3
  alt=2,overall=12,losers=1,rld_nregs=0
 Choosing alt 2 in insn 356:  (0) w  (1) Utv {*aarch64_movoi}
  Creating newreg=319, assigning class NO_REGS to r319
  356: r318:OI=r319:OI
  REG_DEAD r317:V4SI
Inserting insn reload before:
  357: r319:OI=r317:V4SI#0

0 Non input pseudo reload: reject++
  alt=0,overall=13,losers=2,rld_nregs=4
0 Non pseudo reload: reject++
  alt=1,overall=7,losers=1,rld_nregs=2
0 Non input pseudo reload: reject++
1 Spill pseudo into memory: reject+=3
Using memory insn operand 1: reject+=3
alt=2,overall=19,losers=2 -- refuse
 Choosing alt 1 in insn 357:  (0) Utv  (1) w {*aarch64_movoi}
  Creating newreg=320, assigning class FP_REGS to r320
  357: r319:OI=r320:OI
Inserting insn reload before:
  358: r320:OI=r317:V4SI#0

and we keep oscillating between those two choices (alt 2 vs alt 1).
This wouldn't have happened if we'd allocated an FPR instead.

I think the problem here is that we're always trying to reload the
subreg rather than the inner register, even though the allocation for
the inner register isn't valid for the subreg.  There is code in
simplify_operand_subreg to detect this kind of situation, but it
seems to be missing a check for hard_regno_mode_ok.  The first patch
below seems to fix that.

> This patch fixes it in the backend: when we see such a paradoxical
> construction, we break it apart, issue a clobber to correct the liveness
> information, and then emit a normal subreg write for the component that the
> paradoxical subreg was trying to write to.
>
> Concretely we generate this:
>
> (insn 42 41 43 (clobber (reg/v:CI 122 [ diD.5226 ])) "small.i":121:23 -1
>  (nil))
>
> (insn 43 42 44 (set (subreg:V4SI (reg/v:CI 122 [ diD.5226 ]) 0)
> (reg:V4SI 136)) "small.i":121:23 -1
>  (nil))
>
> B

[committed] libstdc++: Fix invalid noexcept-specifier (PR 94117)

2020-03-10 Thread Jonathan Wakely
G++ fails to diagnose this non-dependent expression, but Clang doesn't
like it.

PR c++/94117
* include/std/ranges (ranges::transform_view::_Iterator::iter_move):
Change expression in noexcept-specifier to match function body.

Tested x86_64-linux, committed to master.


commit c222eabcf8be0e3f644e4bd4c3316b40dba4b514
Author: Jonathan Wakely 
Date:   Tue Mar 10 10:50:40 2020 +

libstdc++: Fix invalid noexcept-specifier (PR 94117)

G++ fails to diagnose this non-dependent expression, but Clang doesn't
like it.

PR c++/94117
* include/std/ranges (ranges::transform_view::_Iterator::iter_move):
Change expression in noexcept-specifier to match function body.

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index eb54b110c04..292132db990 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -1837,7 +1837,7 @@ namespace views
  { return __x._M_current - __y._M_current; }
 
  friend constexpr decltype(auto)
- iter_move(const _Iterator& __i) noexcept(noexcept(__iter_move()))
+ iter_move(const _Iterator& __i) noexcept(noexcept(__iter_move(__i)))
  { return __iter_move(__i); }
 
  friend constexpr void


Re: [PATCH][RFC] API extension for binutils (type of symbols).

2020-03-10 Thread Richard Biener
On Tue, Mar 10, 2020 at 12:09 PM Jan Hubicka  wrote:
>
> > > >>> @Honza/Richi: Do you have any opinion about that?
> > > >
> > > > I guess we indeed want to get as close to non-LTO nm behaviour as
> > > > possible. So we want to support them and perhaps think of .symtab
> > > > section file format that can be made backward compatible (such as having
> > > > attribute string for symbols where we can add new info in future in a
> > > > way that old plugins will still get info they want).
> > >
> > > > I like the idea. But it's probably next stage1 material. Or can you
> > > > prepare a patch?
> >
> > I think what's important is that the LTO plugin needs to understand
> > the old and the new version since there's only one for auto-loading.
> >
> > The other missing feature of the linker plugin API is file claiming
> > > which should be on a section basis instead - but that's a different
> > part of the API and not related to symbol tables.  Enhancing that
> > part of the API would allow to elide the LTO debug copying ...
>
> Thinking of it, it seems to me that we do not need to break
> compatibility with existing plugins.  We could keep existing .symtab
> section the way it is implemented right now
> and add additional data to new .symtab_ext section so existing plugins
> will work as expected.
>
> We could tag symtab_ext by a version string and keep adding extensions
> of extensions in the future being compatible with old plugins.

Not sure how symtab is encoded right now but we also could have

entry1: symbol 
entry2: ^ext-ver-n 
entry3: ^ext-ver-m 
entry4: symbol 

and ^ext-ver-X being escape byte(s) denoting we're providing
additional data for the preceding symbol.  Not sure if there's
something conveniently available in the current encoding that
would make older plugins skip an entry ;)

> Indeed that would help in the situation where a user already has a
> distro-provided plugin in the search path but compiles their own gcc10.
>
> Honza
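
The escape-entry idea above can be sketched as a reader that counts symbols
and either consumes or skips the extension records that follow them.  The
record framing used here (one tag byte, one length byte, then the payload,
with the first payload byte of an extension holding its version) is purely
hypothetical; it is not the current LTO symtab encoding, which is exactly the
open question raised above.

```c
#include <assert.h>
#include <stddef.h>

#define TAG_SYMBOL 0x00   /* hypothetical tag of a plain symbol record */
#define TAG_EXT    0x5e   /* hypothetical escape byte, ASCII '^' */

struct scan_result
{
  size_t symbols;       /* plain symbol records seen */
  size_t ext_used;      /* extension records this reader understands */
  size_t ext_skipped;   /* extension records with a newer version */
};

/* Walk LEN bytes of records in BUF.  Extensions whose version exceeds
   MAX_EXT_VERSION are skipped using their length byte.  Returns 0 on
   success and -1 on a malformed table.  */
static int
scan_records (const unsigned char *buf, size_t len,
              unsigned max_ext_version, struct scan_result *res)
{
  assert (buf != NULL && res != NULL);
  res->symbols = res->ext_used = res->ext_skipped = 0;
  size_t pos = 0;
  while (pos < len)
    {
      if (len - pos < 2)
        return -1;
      unsigned char tag = buf[pos];
      size_t payload = buf[pos + 1];
      pos += 2;
      if (payload > len - pos)
        return -1;
      if (tag == TAG_SYMBOL)
        res->symbols++;
      else if (tag == TAG_EXT)
        {
          /* An extension must follow a symbol and carry a version.  */
          if (payload == 0 || res->symbols == 0)
            return -1;
          if (buf[pos] <= max_ext_version)
            res->ext_used++;
          else
            res->ext_skipped++;
        }
      else
        return -1;
      pos += payload;
    }
  return 0;
}
```

With explicit lengths an older reader can step over data it does not
understand; whether the existing encoding offers anything comparable is left
open in the mail above.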


Re: [PATCH][RFC] API extension for binutils (type of symbols).

2020-03-10 Thread Jan Hubicka
> > >>> @Honza/Richi: Do you have any opinion about that?
> > >
> > > I guess we indeed want to get as close to non-LTO nm behaviour as
> > > possible. So we want to support them and perhaps think of .symtab
> > > section file format that can be made backward compatible (such as having
> > > attribute string for symbols where we can add new info in future in a
> > > way that old plugins will still get info they want).
> >
> > I like the idea. But it's probably next stage1 material. Or can you prepare
> > a patch?
> 
> I think what's important is that the LTO plugin needs to understand
> the old and the new version since there's only one for auto-loading.
> 
> The other missing feature of the linker plugin API is file claiming
> which should be on a section basis instead - but that's a different
> part of the API and not related to symbol tables.  Enhancing that
> part of the API would allow to elide the LTO debug copying ...

Thinking of it, it seems to me that we do not need to break
compatibility with existing plugins.  We could keep existing .symtab
section the way it is implemented right now
and add additional data to new .symtab_ext section so existing plugins
will work as expected.

We could tag symtab_ext by a version string and keep adding extensions
of extensions in the future being compatible with old plugins.

Indeed that would help in the situation where a user already has a
distro-provided plugin in the search path but compiles their own gcc10.

Honza
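
A minimal sketch of the side-section idea, under assumptions that are mine and
not part of the proposal: the extension section starts with one version byte
and is followed by one type byte per symbol, in the same order as the existing
.symtab entries.  The function name merge_symtab_ext and the constant are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { SYM_TYPE_UNKNOWN = 0 };   /* hypothetical "no information" value */

/* Fill TYPES[0..NSYMS-1] from the optional extension section EXT of
   EXT_LEN bytes.  A missing, empty or version-0 section leaves every
   symbol as unknown, which is what an old producer would give us.
   Returns how many symbols received a type from the section.  */
static size_t
merge_symtab_ext (unsigned char *types, size_t nsyms,
                  const unsigned char *ext, size_t ext_len)
{
  assert (types != NULL || nsyms == 0);
  size_t avail = 0;
  if (ext != NULL && ext_len >= 1 && ext[0] >= 1)
    avail = ext_len - 1;
  if (avail > nsyms)
    avail = nsyms;
  for (size_t i = 0; i < nsyms; i++)
    types[i] = i < avail ? ext[1 + i] : SYM_TYPE_UNKNOWN;
  return avail;
}
```

Because the original section is untouched, a plugin that has never heard of
the extension keeps working, and a new plugin degrades to "unknown" on old
objects.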


Re: [PATCH][RFC] API extension for binutils (type of symbols).

2020-03-10 Thread Richard Biener
On Tue, Mar 10, 2020 at 11:05 AM Martin Liška  wrote:
>
> On 3/9/20 9:19 PM, Jan Hubicka wrote:
> >> On Mon, Mar 9, 2020 at 9:56 AM Martin Liška  wrote:
> >>>
> >>> On 3/9/20 4:36 PM, H.J. Lu wrote:
> >>>> We need to support different variables, like TLS, data and bss variables.
> >>>
> >>> Why do we need TLS? Right now, it's not supported by nm. Or am I wrong?
> >>
> >> Since you are introducing symbol types, why not support TLS?
> >>
> >>> About BSS and DATA I agree that it would be handy. It can theoretically
> >>> be covered with the code in get_variable_section/bss_initializer_p. But it's
> >>> quite a bit of logic and I'm not sure we should simulate it.
> >
> > I think it should not be that hard to factor out the logic from
> > get_variable_section to return enum of what we want to do and then
> > have get_variable_section as a wrapper parsing this enum to actual
> > section.
>
> So it was easier than I expected and I'm sending an updated version
> of the patch.
>
> >>>
> >>> @Honza/Richi: Do you have any opinion about that?
> >
> > I guess we indeed want to get as close to non-LTO nm behaviour as
> > possible. So we want to support them and perhaps think of .symtab
> > section file format that can be made backward compatible (such as having
> > attribute string for symbols where we can add new info in future in a
> > way that old plugins will still get info they want).
>
> I like the idea. But it's probably next stage1 material. Or can you prepare
> a patch?

I think what's important is that the LTO plugin needs to understand
the old and the new version since there's only one for auto-loading.

The other missing feature of the linker plugin API is file claiming
which should be on a section basis instead - but that's a different
part of the API and not related to symbol tables.  Enhancing that
part of the API would allow to elide the LTO debug copying ...

> >
> > Of course IPA optimizations may migrate symbols around (say from data to
> > bss)/take them away/rename them, but with that we need to live. I would
> > expect most tools inspecting nm are interested in what will enter
> > linking not what will be in final output.
>
> Yes, these are mostly used during configure script runs, where they commonly
> do not take final linked executables/shared libs.
>
> >
> > Since we discuss plugin extensions (and I do not want this to complicate
> > finishing Martin's patch).
>
> Please suggest that in another patch.
>
> The current situation with binutils is bad because we can't build distribution
> with -fno-common and LTO.
>
> Martin
>
> >  Are we aware of other plugin limitations?
> > One thing that I consider unsafe is the way we produce local names when
> > we need to promote symbol to hidden due to partitioning.  We add
> > .lto_priv, but that is not safe if we link with .o file that was
> > incrementally lto-optimized to target object file (this is reason why I
> > did not enable WHOPR path for it).
> >
> > We may also want to inform lld and llvm's gold plugin maintainers about
> > intended changes.
> > Honza
> >>>
> >>> Thanks,
> >>> Martin
> >>
> >>
> >>
> >> --
> >> H.J.
>


Re: GCC 9 backports

2020-03-10 Thread Martin Liška

Hi.

One more that I've just tested.

Martin
From 40b6c70febc36e523caf9d8615fa4e1e1d68508b Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 9 Mar 2020 14:13:04 +0100
Subject: [PATCH] Backport 314b91220a07bd63f13c58e37f1b5b9430a3702b

gcc/ChangeLog:

2020-03-09  Martin Liska  

	PR target/93800
	* config/rs6000/rs6000.c (rs6000_option_override_internal):
	Remove set of str_align_loops and str_align_jumps as these
	should be set in previous 2 conditions in the function.

gcc/testsuite/ChangeLog:

2020-03-09  Martin Liska  

	PR target/93800
	* gcc.target/powerpc/pr93800.c: New test.
---
 gcc/config/rs6000/rs6000.c |  5 -
 gcc/testsuite/gcc.target/powerpc/pr93800.c | 14 ++
 2 files changed, 14 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr93800.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 87d60078bb0..d45294302cb 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4715,11 +4715,6 @@ rs6000_option_override_internal (bool global_init_p)
 		  str_align_loops = "16";
 		}
 	}
-
-	  if (flag_align_jumps && !str_align_jumps)
-	str_align_jumps = "16";
-	  if (flag_align_loops && !str_align_loops)
-	str_align_loops = "16";
 	}
 
   /* Arrange to save and restore machine status around nested functions.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr93800.c b/gcc/testsuite/gcc.target/powerpc/pr93800.c
new file mode 100644
index 000..f8dfbe7c082
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr93800.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-mcpu=860 -O2" } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-final { scan-assembler-not "\\.p2align 4" } } */
+
+volatile int g;
+int f(int a, int b)
+{
+	int i;
+
+	for (i = 0; i < b; i++)
+		a += g;
+	return a;
+}
-- 
2.25.1



Re: [PATCH][RFC] API extension for binutils (type of symbols).

2020-03-10 Thread Martin Liška

On 3/9/20 9:19 PM, Jan Hubicka wrote:
>> On Mon, Mar 9, 2020 at 9:56 AM Martin Liška  wrote:
>>>
>>> On 3/9/20 4:36 PM, H.J. Lu wrote:
>>>> We need to support different variables, like TLS, data and bss variables.
>>>
>>> Why do we need TLS? Right now, it's not supported by nm. Or am I wrong?
>>
>> Since you are introducing symbol types, why not support TLS?
>>
>>> About BSS and DATA I agree that it would be handy. It can theoretically
>>> be covered with the code in get_variable_section/bss_initializer_p. But it's
>>> quite a bit of logic and I'm not sure we should simulate it.
>
> I think it should not be that hard to factor out the logic from
> get_variable_section to return enum of what we want to do and then
> have get_variable_section as a wrapper parsing this enum to actual
> section.

So it was easier than I expected and I'm sending an updated version
of the patch.

>>>
>>> @Honza/Richi: Do you have any opinion about that?
>
> I guess we indeed want to get as close to non-LTO nm behaviour as
> possible. So we want to support them and perhaps think of .symtab
> section file format that can be made backward compatible (such as having
> attribute string for symbols where we can add new info in future in a
> way that old plugins will still get info they want).

I like the idea. But it's probably next stage1 material. Or can you prepare
a patch?

>
> Of course IPA optimizations may migrate symbols around (say from data to
> bss)/take them away/rename them, but with that we need to live. I would
> expect most tools inspecting nm are interested in what will enter
> linking not what will be in final output.

Yes, these are mostly used during configure script runs, where they commonly
do not take final linked executables/shared libs.

>
> Since we discuss plugin extensions (and I do not want this to complicate
> finishing Martin's patch).

Please suggest that in another patch.

The current situation with binutils is bad because we can't build distribution
with -fno-common and LTO.

Martin

>  Are we aware of other plugin limitations?
> One thing that I consider unsafe is the way we produce local names when
> we need to promote symbol to hidden due to partitioning.  We add
> .lto_priv, but that is not safe if we link with .o file that was
> incrementally lto-optimized to target object file (this is reason why I
> did not enable WHOPR path for it).
>
> We may also want to inform lld and llvm's gold plugin maintainers about
> intended changes.
> Honza
>>>
>>> Thanks,
>>> Martin
>>
>>
>>
>> --
>> H.J.

From 4214743fc011fd8900a89166759a5511d8da5da2 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 6 Mar 2020 18:09:35 +0100
Subject: [PATCH] API extension for binutils (type of symbols).

gcc/ChangeLog:

2020-03-09  Martin Liska  

	* lto-streamer-out.c (write_symbol): Stream
	symbol type.

include/ChangeLog:

2020-03-09  Martin Liska  

	* lto-symtab.h (enum gcc_plugin_symbol_type): New.
	* plugin-api.h (struct ld_plugin_symbol): New member
	symbol_type.
	(enum ld_plugin_symbol_type): New.
	(enum ld_plugin_tag): Add new tag LDPT_GET_SYMBOLS_V4.

lto-plugin/ChangeLog:

2020-03-09  Martin Liska  

	* lto-plugin.c (parse_table_entry): Parse symbol type.
---
 gcc/lto-streamer-out.c  | 14 ++
 include/lto-symtab.h|  8 
 include/plugin-api.h| 14 +-
 lto-plugin/lto-plugin.c | 13 +
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index cea5e71cffb..ead606eb665 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "print-tree.h"
 #include "tree-dfa.h"
 #include "file-prefix-map.h" /* remap_debug_filename()  */
+#include "output.h"
 
 
 static void lto_write_tree (struct output_block*, tree, bool);
@@ -2773,6 +2774,19 @@ write_symbol (struct streamer_tree_cache_d *cache,
   lto_write_data (&c, 1);
   c = (unsigned char) visibility;
   lto_write_data (&c, 1);
+
+  gcc_plugin_symbol_type st;
+  if (TREE_CODE (t) == VAR_DECL)
+{
+  section *s = get_variable_section (t, false);
+  st = (s->common.flags & SECTION_BSS
+	? GCCST_VARIABLE_BSS : GCCST_VARIABLE_DATA);
+}
+  else
+st = GCCST_FUNCTION;
+
+  c = (unsigned char) st;
+  lto_write_data (&c, 1);
   lto_write_data (&size, 8);
   lto_write_data (&slot_num, 4);
 }
diff --git a/include/lto-symtab.h b/include/lto-symtab.h
index 0ce0de10121..901bc3585c2 100644
--- a/include/lto-symtab.h
+++ b/include/lto-symtab.h
@@ -38,4 +38,12 @@ enum gcc_plugin_symbol_visibility
 GCCPV_HIDDEN
   };
 
+enum gcc_plugin_symbol_type
+{
+  GCCST_UNKNOWN,
+  GCCST_FUNCTION,
+  GCCST_VARIABLE_DATA,
+  GCCST_VARIABLE_BSS
+};
+
 #endif /* GCC_LTO_SYMTAB_H  */
diff --git a/include/plugin-api.h b/include/plugin-api.h
index 09e1202df07..794a2dcc4ee 100644
--- a/include/plugin-api.h
+++ b/include/plugin-api.h
@@ -92,6 +92,7 @@ struct ld_plugin_symbol
   uint64_t size;
   char *comdat_key;
   int resolution;
+  int symbol_type;
 };
 
 /* An object's section.  */
@@ -123,6 +124,16 @@ enum ld_plugin

Re: [PING PATCH coroutines] Do not strip cleanup_point when promote temporaries out of current stmt

2020-03-10 Thread Bin.Cheng
On Thu, Mar 5, 2020 at 10:18 PM Iain Sandoe  wrote:
>
> Hello JunMa,
>
> JunMa  wrote:
>
> > Ping
>
> Once again, sorry for taking time to review this.
>
> > 在 2020/2/27 上午10:18, JunMa 写道:
> >> 在 2020/2/11 上午10:14, JunMa 写道:
> >> Kindly ping
> >>
> >> Regards
> >> JunMa
> >>> Hi
> >>> In maybe_promote_captured_temps, the cleanup_point_stmt has been
> >>> stripped when handling temporaries captured by reference. However, there
> >>> may be non-reference temporaries in the current stmt, which cause an ICE
> >>> in the gimplify pass.
> >>>
> >>> This patch fixes this. The testcase comes from cppcoro and is reduced by
> >>> creduce.
>
> With current trunk + Bin’s two approved patches.
>
> I see no change in the testcase (lambda-09-capture-object.C) before / after
> the patch
>   (it fails for me at -O0 only - in both cases).

Hi Iain,
I tried exactly what you did, but got a different result.
With current trunk (cb2c60206f4f2218f84ccde21663b00de068d8c7) plus my
approved patch, the testcase (lambda-09-capture-object.C) still causes an ICE.
Actually, the same ICE happens in the testcase (co-await-syntax-11.C) added
by my patch.
So this one is a prerequisite for my approved patch.

Thanks,
bin
>
> please could you check?
> thanks
> Iain
>


[PATCH 2/2] [aarch64] Rework fpcr fpsr getter/setter builtins

2020-03-10 Thread Andrea Corallo
Hi all,

second and last patch of the two reworking FPCR and FPSR builtins.

This reworks __builtin_aarch64_set_fpcr (unsigned) and
__builtin_aarch64_set_fpsr (unsigned) to emit a read-modify-write sequence
such as:

 mrs x1, fpsr
 bfi x1, x0, 0, 32
 msr fpsr, x1

This preserves the original high 32 bits of the system register.  Both
FPSR and FPCR became 64-bit registers with Armv8.1.
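
The effect of the sequence above can be modelled in plain C.  This is only
a sketch: insert_low32 and model_set_low32 are hypothetical names, and the
"register" is an ordinary 64-bit variable rather than the real FPSR, so it
shows the bit manipulation and nothing about the builtin's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "bfi x1, x0, 0, 32": replace bits [31:0] of OLD_REG with
   NEW_LOW and keep bits [63:32] untouched.  */
static uint64_t
insert_low32 (uint64_t old_reg, uint32_t new_low)
{
  return (old_reg & ~UINT64_C (0xffffffff)) | (uint64_t) new_low;
}

/* Hypothetical stand-in for the whole mrs/bfi/msr sequence, acting on a
   plain variable instead of the real system register.  */
static void
model_set_low32 (uint64_t *sysreg, uint32_t value)
{
  assert (sysreg);
  uint64_t scratch = *sysreg;              /* mrs x1, fpsr */
  scratch = insert_low32 (scratch, value); /* bfi x1, x0, 0, 32 */
  *sysreg = scratch;                       /* msr fpsr, x1 */
}
```

A plain "msr fpsr, x0" corresponds to overwriting the whole variable, which
is what would lose the upper half.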

Bootstrapped on aarch64-linux-gnu, does not introduce regressions.

Regards

  Andrea

gcc/ChangeLog:

2020-??-??  Andrea Corallo  

* config/aarch64/aarch64.md (insv_reg): Pattern renamed.
(set_fpcr, set_fpsr): Pattern modified for read-modify-write
sequence respecting high 32bit register content.

gcc/testsuite/ChangeLog:

2020-??-??  Andrea Corallo  

* gcc.target/aarch64/set_fpcr.c: New test.
* gcc.target/aarch64/get_fpcr.c: New test.
* gcc.target/aarch64/set_fpsr.c: New test.
* gcc.target/aarch64/get_fpsr.c: New test.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index b6836710c9c2..b6ee2e1a946e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5613,7 +5613,7 @@
   operands[3] = force_reg (mode, value);
 })
 
-(define_insn "*insv_reg"
+(define_insn "insv_reg"
   [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
 			  (match_operand 1 "const_int_operand" "n")
 			  (match_operand 2 "const_int_operand" "n"))
@@ -7173,10 +7173,21 @@
(set_attr "type" "multiple")])
 
 ;; Write Floating-point Control Register.
-(define_insn "set_fpcr"
-  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] UNSPECV_SET_FPCR)]
+
+(define_expand "set_fpcr"
+  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")]
+UNSPECV_SET_FPCR)]
   ""
-  "msr\\tfpcr, %0"
+  {
+rtx val32 = simplify_gen_subreg (DImode, operands[0], SImode, 0);
+/* Read-modify-write sequence.  */
+rtx scratch = gen_reg_rtx (DImode);
+emit_insn (gen_get_fpcr64 (scratch));
+emit_insn (gen_insv_regdi (scratch, GEN_INT (32), GEN_INT (0),
+			   val32));
+emit_insn (gen_set_fpcr64 (scratch));
+DONE;
+  }
   [(set_attr "type" "mrs")])
 
 ;; Read Floating-point Control Register.
@@ -7188,10 +7199,19 @@
   [(set_attr "type" "mrs")])
 
 ;; Write Floating-point Status Register.
-(define_insn "set_fpsr"
+(define_expand "set_fpsr"
   [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] UNSPECV_SET_FPSR)]
   ""
-  "msr\\tfpsr, %0"
+  {
+rtx val32 = simplify_gen_subreg (DImode, operands[0], SImode, 0);
+/* Read-modify-write sequence.  */
+rtx scratch = gen_reg_rtx (DImode);
+emit_insn (gen_get_fpsr64 (scratch));
+emit_insn (gen_insv_regdi (scratch, GEN_INT (32), GEN_INT (0),
+			   val32));
+emit_insn (gen_set_fpsr64 (scratch));
+DONE;
+  }
   [(set_attr "type" "mrs")])
 
 ;; Read Floating-point Status Register.
diff --git a/gcc/testsuite/gcc.target/aarch64/get_fpcr.c b/gcc/testsuite/gcc.target/aarch64/get_fpcr.c
new file mode 100644
index ..f33e70e34cd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/get_fpcr.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned int
+get_fpcr ()
+{
+  return __builtin_aarch64_get_fpcr ();
+}
+
+/* { dg-final { scan-assembler-times "mrs.*fpcr" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/get_fpsr.c b/gcc/testsuite/gcc.target/aarch64/get_fpsr.c
new file mode 100644
index ..2f7d75637d20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/get_fpsr.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned int
+get_fpsr ()
+{
+  return __builtin_aarch64_get_fpsr ();
+}
+
+/* { dg-final { scan-assembler-times "mrs.*fpsr" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/set_fpcr.c b/gcc/testsuite/gcc.target/aarch64/set_fpcr.c
new file mode 100644
index ..74525981f323
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/set_fpcr.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+set_fpcr (unsigned int x)
+{
+  return __builtin_aarch64_set_fpcr (x);
+}
+
+/* { dg-final { scan-assembler-times "bfi" 1 } } */
+/* { dg-final { scan-assembler-times "mrs.*fpcr" 1 } } */
+/* { dg-final { scan-assembler-times "msr.*fpcr" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/set_fpsr.c b/gcc/testsuite/gcc.target/aarch64/set_fpsr.c
new file mode 100644
index ..e3e2e631f70c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/set_fpsr.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+set_fpsr (unsigned int x)
+{
+  return __builtin_aarch64_set_fpsr (x);
+}
+
+/* { dg-final { scan-assembler-times "bfi" 1 } } */
+/* { dg-final { scan-assembler-times "mrs.*fpsr" 1 } } */
+/* { dg-final { scan-assembler-times "msr.*fpsr" 1 } } */


[PATCH 1/2] [aarch64] Rework fpcr fpsr getter/setter builtins

2020-03-10 Thread Andrea Corallo
Hi all,

I'd like to submit this patch introducing the following 64-bit builtin
variants as getters/setters for the FPCR and FPSR registers:

unsigned long long __builtin_aarch64_get_fpcr64 ()
void __builtin_aarch64_set_fpcr64 (unsigned long long)
unsigned long long __builtin_aarch64_get_fpsr64 ()
void __builtin_aarch64_set_fpsr64 (unsigned long long)

Regards
  Andrea

gcc/ChangeLog:

2020-??-??  Andrea Corallo  

* config/aarch64/aarch64-builtins.c (aarch64_builtins): Add enums
for 64-bit fpsr/fpcr getter/setter builtin variants.
(aarch64_init_fpsr_fpcr_builtins): New function.
(aarch64_expand_fcr_fpsr_builtin): New function.
(aarch64_general_expand_builtin): Modify to make use of the latter.
* config/aarch64/aarch64.md (UNSPECV_GET_FPCR64)
(UNSPECV_SET_FPCR64, UNSPECV_GET_FPSR64, UNSPECV_SET_FPSR64): Add
4 new unspecv.
(set_fpcr64, get_fpcr64, set_fpsr64, get_fpsr64): New patterns.
* doc/extend.texi (__builtin_aarch64_get_fpcr64)
(__builtin_aarch64_set_fpcr64, __builtin_aarch64_get_fpsr64)
(__builtin_aarch64_set_fpsr64): Add into AArch64 Built-in
Functions.

gcc/testsuite/ChangeLog:

2020-??-??  Andrea Corallo  

* gcc.target/aarch64/get_fpcr64.c: New test.
* gcc.target/aarch64/set_fpcr64.c: New test.
* gcc.target/aarch64/get_fpsr64.c: New test.
* gcc.target/aarch64/set_fpsr64.c: New test.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 9c9c6d86ae29..b3ffa3893c08 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -443,6 +443,11 @@ enum aarch64_builtins
   AARCH64_BUILTIN_GET_FPSR,
   AARCH64_BUILTIN_SET_FPSR,
 
+  AARCH64_BUILTIN_GET_FPCR64,
+  AARCH64_BUILTIN_SET_FPCR64,
+  AARCH64_BUILTIN_GET_FPSR64,
+  AARCH64_BUILTIN_SET_FPSR64,
+
   AARCH64_BUILTIN_RSQRT_DF,
   AARCH64_BUILTIN_RSQRT_SF,
   AARCH64_BUILTIN_RSQRT_V2DF,
@@ -1240,33 +1245,65 @@ aarch64_init_memtag_builtins (void)
 #undef AARCH64_INIT_MEMTAG_BUILTINS_DECL
 }
 
-/* Initialize all builtins in the AARCH64_BUILTIN_GENERAL group.  */
+/* Initialize fpsr fpcr getter and setters.  */
 
-void
-aarch64_general_init_builtins (void)
+static void
+aarch64_init_fpsr_fpcr_builtins (void)
 {
-  tree ftype_set_fpr
+  tree ftype_set
 = build_function_type_list (void_type_node, unsigned_type_node, NULL);
-  tree ftype_get_fpr
+  tree ftype_get
 = build_function_type_list (unsigned_type_node, NULL);
 
   aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR]
 = aarch64_general_add_builtin ("__builtin_aarch64_get_fpcr",
-   ftype_get_fpr,
+   ftype_get,
    AARCH64_BUILTIN_GET_FPCR);
   aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR]
 = aarch64_general_add_builtin ("__builtin_aarch64_set_fpcr",
-   ftype_set_fpr,
+   ftype_set,
    AARCH64_BUILTIN_SET_FPCR);
   aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR]
 = aarch64_general_add_builtin ("__builtin_aarch64_get_fpsr",
-   ftype_get_fpr,
+   ftype_get,
    AARCH64_BUILTIN_GET_FPSR);
   aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR]
 = aarch64_general_add_builtin ("__builtin_aarch64_set_fpsr",
-   ftype_set_fpr,
+   ftype_set,
    AARCH64_BUILTIN_SET_FPSR);
 
+  ftype_set
+= build_function_type_list (void_type_node, long_long_unsigned_type_node,
+NULL);
+  ftype_get
+= build_function_type_list (long_long_unsigned_type_node, NULL);
+
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR64]
+= aarch64_general_add_builtin ("__builtin_aarch64_get_fpcr64",
+   ftype_get,
+   AARCH64_BUILTIN_GET_FPCR64);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR64]
+= aarch64_general_add_builtin ("__builtin_aarch64_set_fpcr64",
+   ftype_set,
+   AARCH64_BUILTIN_SET_FPCR64);
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR64]
+= aarch64_general_add_builtin ("__builtin_aarch64_get_fpsr64",
+   ftype_get,
+   AARCH64_BUILTIN_GET_FPSR64);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR64]
+= aarch64_general_add_builtin ("__builtin_aarch64_set_fpsr64",
+   ftype_set,
+   AARCH64_BUILTIN_SET_FPSR64);
+}
+
+/* Initialize all builtins in the AARCH64_BUILTIN_GENERAL group.  */
+
+void
+aarch64_general_init_builtins (void)
+{
+
+  aarch64_init_fpsr_fpcr_builtins ();
+
   aarch64_init_fp16_types ();
 
   aarch64_init_bf16_types ();
@@ -1871,6 +1908,40 @@ aarch64_expand_builtin_memtag (int fcode, tree exp, rtx target)
   return target;
 }
 
+static rtx
+aarch64_expand_fcr_fpsr_builtin (tree exp, machine_mode mode, bool getter,
+ bool fpsr)
+{
+  int icode;
+  rtx pat;
+  rtx target = NULL_RTX;
+
+  gcc_assert (mode == SImode || (mode == DImode));
+
+  if (getter)
+{
+  if (mode == SImode)
+	icode = fpsr ? CODE_FOR_get_fpsr : CODE_FOR_get_fpcr;
+  else
+	icode = fpsr ? CODE_FOR_get_fpsr64 : CODE_FOR_get_fpcr64;
+  target = gen_reg_rtx (mode);
+  pat = GEN_FCN (icode) (ta

[committed] libstdc++: Change compile-only test to run

2020-03-10 Thread Jonathan Wakely
The 24_iterators/ostream_iterator/1.cc test uses VERIFY and so is
obviously meant to have been run, not just compiled.

* testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Add
comment explaining multiple dg-do directives.
* testsuite/24_iterators/ostream_iterator/1.cc: Fix dg-do directive
so test is run as well as compiled.

Tested x86_64-linux, committed to master.


commit 3654d49d0ff651b2a78401bc2430428711e7d2eb
Author: Jonathan Wakely 
Date:   Tue Mar 10 09:47:15 2020 +

libstdc++: Change compile-only test to run

The 24_iterators/ostream_iterator/1.cc test uses VERIFY and so is
obviously meant to have been run, not just compiled.

* testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Add
comment explaining multiple dg-do directives.
* testsuite/24_iterators/ostream_iterator/1.cc: Fix dg-do directive
so test is run as well as compiled.

diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc
index 5cbc76e0d8c..f6b908ac03e 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc
@@ -15,6 +15,8 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
+// This test fails to compile since C++17 (see xfail-if below) so we can only
+// do a "run" test for C++11 and C++14, and a "compile" test for C++17 and up.
 // { dg-do run { target { c++11_only || c++14_only } } }
 // { dg-do compile { target c++17 } }
 
diff --git a/libstdc++-v3/testsuite/24_iterators/ostream_iterator/1.cc b/libstdc++-v3/testsuite/24_iterators/ostream_iterator/1.cc
index 640ff61afa7..718dad3b684 100644
--- a/libstdc++-v3/testsuite/24_iterators/ostream_iterator/1.cc
+++ b/libstdc++-v3/testsuite/24_iterators/ostream_iterator/1.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-do compile { target c++11 } }
+// { dg-do run { target c++11 } }
 
 #include 
 #include 


Re: [PATCH][RFC] API extension for binutils (type of symbols).

2020-03-10 Thread Martin Liška

On 3/9/20 8:45 PM, Michael Matz wrote:

Hello,

On Mon, 9 Mar 2020, Martin Liška wrote:


On 3/9/20 4:36 PM, H.J. Lu wrote:

We need to support different variables, like TLS, data and bss variables.


Why do we need TLS? Right now, it's not supported by nm.


Of course it does.  It's the 'T' (or 't') character.


Thank you for the reply!

Are you sure about it?
$ gcc gcc/testsuite/gcc.target/i386/pr56564-3.c -c -fpic && nm pr56564-3.o
...
 D s
0010 D t

?


 When you introduce
symbol categories into the plugin system it would be advisable to include
all we usually care about, and as the ELF categories are (roughly) a
superset of everything we support, I'd say that should be the list to look
at.  I.e. a mixture of visibility, locality (aka binding) and type:


I agree with that.



{object,function,common,tls}


Here LDPK_COMMON is already handled.


  x {local,global,weak,unique}
  x {default,internal,hidden,protected}

That doesn't include the symbol types section, file and ifunc, or OS- or
arch-specific types, visibilities or bindings.  But it would probably not be
the worst idea to simply encode what we need with ELF constants and names.
While not all the world is ELF, all concepts we have can be mapped onto
ELF.


Ciao,
Michael.
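
As a rough illustration of "encode what we need with ELF constants", the
three axes above fit into the same two bytes ELF uses for st_info and
st_other.  This is only a sketch: the type, enumerator and function names
are hypothetical and are not part of the plugin API; only the numeric
values follow the ELF symbol-table encoding (with 10 being the GNU
"unique" binding).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values mirror ELF STT_*, STB_* and STV_*.  */
enum sym_type { SYM_OBJECT = 1, SYM_FUNC = 2, SYM_COMMON = 5, SYM_TLS = 6 };
enum sym_binding { SYM_LOCAL = 0, SYM_GLOBAL = 1, SYM_WEAK = 2,
		   SYM_UNIQUE = 10 };
enum sym_visibility { SYM_DEFAULT = 0, SYM_INTERNAL = 1, SYM_HIDDEN = 2,
		      SYM_PROTECTED = 3 };

/* Same layout as ELF: binding in the high nibble of INFO, type in the
   low nibble, visibility in the low two bits of OTHER.  */
struct sym_category
{
  uint8_t info;
  uint8_t other;
};

static struct sym_category
sym_category_pack (enum sym_type type, enum sym_binding binding,
		   enum sym_visibility vis)
{
  assert ((unsigned) type <= 0xf && (unsigned) binding <= 0xf);
  struct sym_category c;
  c.info = (uint8_t) (((unsigned) binding << 4) | ((unsigned) type & 0xf));
  c.other = (uint8_t) ((unsigned) vis & 0x3);
  return c;
}

static unsigned
sym_category_type (struct sym_category c)
{
  return c.info & 0xf;
}

static unsigned
sym_category_binding (struct sym_category c)
{
  return c.info >> 4;
}

static unsigned
sym_category_visibility (struct sym_category c)
{
  return c.other & 0x3;
}
```

A weak, hidden TLS symbol would then be
sym_category_pack (SYM_TLS, SYM_WEAK, SYM_HIDDEN).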





Re: [PATCH] i386: Fix up *testqi_ext_3 insn&split for the *testdi_1 changes [PR94088]

2020-03-10 Thread Uros Bizjak
On Tue, Mar 10, 2020 at 7:39 AM Jakub Jelinek  wrote:
>
> Hi!
>
> In r10-1938-g460bf043c8266dd080308f4783137aee0d0f862c *testdi_1 has been
> changed, so that if the mask has upper 32-bits 0 and then at least one bit
> set, it requires CCZmode rather than CCNOmode, because in that case it uses
> testl instruction rather than testq and so the SF flag wouldn't respect the
> state of the 64-bit result.
> The *testqi_ext_3 define_insn_and_split needs to match that though,
> otherwise it can create an RTL pattern that used to match *testdi_1 but
> doesn't anymore and we'd ICE due to an unrecognizable insn.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
>
> 2020-03-10  Jakub Jelinek  
>
> PR target/94088
> * config/i386/i386.md (*testqi_ext_3): Call ix86_match_ccmode with
> CCZmode instead of CCNOmode if operands[2] has DImode and pos + len
> is 32.
>
> * gcc.target/i386/pr94088.c: New test.

OK.

Thanks,
Uros.
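
For readers following along, the condition in the quoted patch can be
illustrated with a small C model.  This is only a sketch: field_mask,
sf_testq and sf_testl are hypothetical helpers that mimic how the sign
flag would be derived from a 64-bit versus a 32-bit AND; they are not GCC
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask covering LEN bits starting at bit POS (LEN > 0, POS + LEN <= 64).  */
static uint64_t
field_mask (unsigned pos, unsigned len)
{
  assert (len > 0 && pos + len <= 64);
  uint64_t ones = len == 64 ? ~UINT64_C (0) : (UINT64_C (1) << len) - 1;
  return ones << pos;
}

/* SF as a 64-bit test would set it: bit 63 of the AND result.  */
static bool
sf_testq (uint64_t x, uint64_t mask)
{
  return ((x & mask) >> 63) != 0;
}

/* SF as a 32-bit test would set it: bit 31 of the truncated AND.  */
static bool
sf_testl (uint64_t x, uint64_t mask)
{
  return (((uint32_t) x & (uint32_t) mask) >> 31) != 0;
}
```

With pos + len == 32 the mask has bit 31 set and nothing above it, so for
an input with bit 31 set the two helpers disagree, which is why only the
zero flag can be relied on in that case.  When pos + len is below 32, bit
31 of the mask is clear and both helpers return false for any input.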

> --- gcc/config/i386/i386.md.jj  2020-03-06 11:35:46.279074931 +0100
> +++ gcc/config/i386/i386.md 2020-03-09 13:25:47.045165188 +0100
> @@ -8826,18 +8826,23 @@ (define_insn_and_split "*testqi_ext_3"
>  (match_operand 3 "const_int_operand" "n")
>  (match_operand 4 "const_int_operand" "n"))
>(const_int 0)]))]
> -  "ix86_match_ccmode (insn, CCNOmode)
> -   && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
> -   || GET_MODE (operands[2]) == SImode
> -   || GET_MODE (operands[2]) == HImode
> -   || GET_MODE (operands[2]) == QImode)
> +  "((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
> +|| GET_MODE (operands[2]) == SImode
> +|| GET_MODE (operands[2]) == HImode
> +|| GET_MODE (operands[2]) == QImode)
> /* Ensure that resulting mask is zero or sign extended operand.  */
> && INTVAL (operands[4]) >= 0
> && ((INTVAL (operands[3]) > 0
> && INTVAL (operands[3]) + INTVAL (operands[4]) <= 32)
> || (mode == DImode
>&& INTVAL (operands[3]) > 32
> -  && INTVAL (operands[3]) + INTVAL (operands[4]) == 64))"
> +  && INTVAL (operands[3]) + INTVAL (operands[4]) == 64))
> +   && ix86_match_ccmode (insn,
> +/* *testdi_1 requires CCZmode if the mask has bit
> +   31 set and all bits above it clear.  */
> +GET_MODE (operands[2]) == DImode
> +&& INTVAL (operands[3]) + INTVAL (operands[4]) == 32
> +? CCZmode : CCNOmode)"
>"#"
>"&& 1"
>[(set (match_dup 0) (match_op_dup 1 [(match_dup 2) (const_int 0)]))]
> --- gcc/testsuite/gcc.target/i386/pr94088.c.jj  2020-03-09 13:23:56.485796409 +0100
> +++ gcc/testsuite/gcc.target/i386/pr94088.c 2020-03-09 13:23:21.627310722 +0100
> @@ -0,0 +1,9 @@
> +/* PR target/94088 */
> +/* { dg-do compile } */
> +/* { dg-options "-mtbm -O1 -fira-loop-pressure -fno-dce" } */
> +
> +double
> +foo (int x)
> +{
> +  return x / (4294950402U % -65472 + 161);
> +}
>
> Jakub
>