date:20170613

Re: [PATCH] Enhance dump_probability function.

2017-06-13 Thread Rainer Orth

Hi Paul,

> New "ERROR: (DejaGnu)" on mips64el target.
>
> my DejaGnu version is 1.5.1.
>
> 1)
> make check-gcc RUNTESTFLAGS="tree-ssa.exp=builtin-sprintf-2.c"
> ...
> ERROR: (DejaGnu) proc "^:\\" does not exist.
> The error code is TCL LOOKUP COMMAND ^:\\
> The info on the error is:
> invalid command name "^:\"
> while executing
> "::tcl_unknown ^:\\"
> ("uplevel" body line 1)
> invoked from within
> "uplevel 1 ::tcl_unknown $args"
> ...
>
> 2)
> make check-gcc RUNTESTFLAGS="tree-ssa.exp=vrp101.c"
> ...
> ERROR: (DejaGnu) proc "^:\\" does not exist.
> The error code is TCL LOOKUP COMMAND ^:\\
> The info on the error is:
> invalid command name "^:\"
> while executing
> "::tcl_unknown ^:\\"
> ("uplevel" body line 1)
> invoked from within
> "uplevel 1 ::tcl_unknown $args"
> ...
>
> I don't know how to debug this, any advice?

both revised scan-tree-dump patterns got the quoting wrong, leading to
attempts to run unknown procs ^:\\ instead of matching [^:] ;-(

This totally broke make check-gcc: the affected partial test runs
aborted at that point, leading to gcc.{sum,log} files that make
contrib/dg-extract-results.py choke, producing empty combined
gcc.{sum,log} files.  No idea how this was tested (probably not at all).

The following patch fixes the syntax error

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c
@@ -290,7 +290,7 @@ RNG (0,  6,   8, "%s%ls", "1", L"2");
 
 /*  Only conditional calls to must_not_eliminate must be made (with
 any probability):
-{ dg-final { scan-tree-dump-times "> \\\[\[0-9.\]+%\\\]\\ \\\[count:\\[^:\\]*\\\]:\n *must_not_eliminate" 127 "optimized" { target { ilp32 || lp64 } } } }
-{ dg-final { scan-tree-dump-times "> \\\[\[0-9.\]+%\\\]\\ \\\[count:\\[^:\\]*\\\]:\n *must_not_eliminate" 96 "optimized" { target { { ! ilp32 } && { ! lp64 } } } } }
+{ dg-final { scan-tree-dump-times "> \\\[\[0-9.\]+%\\\]\\ \\\[count:\\\[^:\\]*\\\]:\n *must_not_eliminate" 127 "optimized" { target { ilp32 || lp64 } } } }
+{ dg-final { scan-tree-dump-times "> \\\[\[0-9.\]+%\\\]\\ \\\[count:\\\[^:\\]*\\\]:\n *must_not_eliminate" 96 "optimized" { target { { ! ilp32 } && { ! lp64 } } } } }
 No unconditional calls to abort should be made:
 { dg-final { scan-tree-dump-not ";\n *must_not_eliminate" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp101.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp101.c
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp101.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp101.c
@@ -10,4 +10,4 @@ int main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump " \\\[\[0-9.\]+%\\\] \\\[count: \\[^:\\]*\\\]:\[\n\r \]*return 0;" "optimized" } } */
+/* { dg-final { scan-tree-dump " \\\[\[0-9.\]+%\\\] \\\[count: \\\[^:\\]*\\\]:\[\n\r \]*return 0;" "optimized" } } */

but both tests still come out as FAIL:

+FAIL: gcc.dg/tree-ssa/builtin-sprintf-2.c scan-tree-dump-times optimized "> \\\
\[[0-9.]+%] [count:[^:]*]:\\n *must_not_eliminate" 127

+FAIL: gcc.dg/tree-ssa/vrp101.c scan-tree-dump optimized " [[0-9.]+%\\
\\] [count: [^:]*]:[\\n\\r ]*return 0;"

Martin should check what he really meant to match here and fix the
patterns accordingly.
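
For illustration, the block header that the patterns presumably aim at
can be checked outside DejaGnu.  The sketch below uses C++ std::regex
rather than Tcl, so it sidesteps the Tcl quoting problem entirely.  The
helper name header_matches and the "<bb 5>" label in the usage note are
assumptions made for this example; they are not taken from the testsuite
or from the dump.

```cpp
#include <cassert>  // assert(), used by the checks in the usage note
#include <regex>
#include <string>

// Hypothetical helper: true if LINE contains a block header of the form
// " [<probability>%] [count: <n>]:" as in the dump excerpt in this thread.
// The regex is the unquoted form of the vrp101.c pattern, without the
// trailing "return 0;" part.
inline bool header_matches(const std::string& line)
{
  static const std::regex header(R"( \[[0-9.]+%\] \[count: [^:]*\]:)");
  return std::regex_search(line, header);
}
```

With this, header_matches("<bb 5> [0.01%] [count: 7864]:") should hold,
while a header without the count field, such as "<bb 5> [0.01%]:", should
not match.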

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Enhance dump_probability function.

2017-06-13 Thread Paul Hua

New "ERROR: (DejaGnu)" on mips64el target.

my DejaGnu version is 1.5.1.

1)
make check-gcc RUNTESTFLAGS="tree-ssa.exp=builtin-sprintf-2.c"
...
ERROR: (DejaGnu) proc "^:\\" does not exist.
The error code is TCL LOOKUP COMMAND ^:\\
The info on the error is:
invalid command name "^:\"
while executing
"::tcl_unknown ^:\\"
("uplevel" body line 1)
invoked from within
"uplevel 1 ::tcl_unknown $args"
...

2)
make check-gcc RUNTESTFLAGS="tree-ssa.exp=vrp101.c"
...
ERROR: (DejaGnu) proc "^:\\" does not exist.
The error code is TCL LOOKUP COMMAND ^:\\
The info on the error is:
invalid command name "^:\"
while executing
"::tcl_unknown ^:\\"
("uplevel" body line 1)
invoked from within
"uplevel 1 ::tcl_unknown $args"
...

I don't know how to debug this, any advice?

Paul.

On Tue, Jun 13, 2017 at 4:14 PM, Martin Liška  wrote:
> Hi.
>
> This is pre-approved patch that displays edge counts in dump files:
>
> ...
>   _85 = _83 + _84;
>   len_86 = SQRT (_85);
>   if (_85 u>= 0.0)
> goto ; [99.00%] [count: 778568]
>   else
> goto ; [1.00%] [count: 7864]
>
>[0.01%] [count: 7864]:
>   sqrt (_85);
> ...
>
> That makes it possible to understand why a profile mismatch happens.
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Martin
>
> gcc/ChangeLog:
>
> 2017-06-12  Martin Liska  
>
> * gimple-pretty-print.c (dump_probability): Add new argument.
> (dump_edge_probability): Dump both probability and count.
> (dump_gimple_label): Likewise.
> (dump_gimple_bb_header): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 2017-06-12  Martin Liska  
>
> * gcc.dg/tree-ssa/builtin-sprintf-2.c: Adjust scanned pattern.
> * gcc.dg/tree-ssa/dump-2.c: Likewise.
> * gcc.dg/tree-ssa/vrp101.c: Likewise.
> ---
>  gcc/gimple-pretty-print.c | 22 ++
>  gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c |  4 ++--
>  gcc/testsuite/gcc.dg/tree-ssa/dump-2.c|  2 +-
>  gcc/testsuite/gcc.dg/tree-ssa/vrp101.c|  2 +-
>  4 files changed, 18 insertions(+), 12 deletions(-)
>
>

Re: [PATCH 13/13] D: Phobos config, makefiles, and testsuite.

2017-06-13 Thread Iain Buclaw

On 13 June 2017 at 19:41, Joseph Myers  wrote:
> There appear to be various GPLv2 notices with old FSF addresses in here.
> Where those are on source files (as opposed to generated files), they
> should be updated to the usual GPLv3+ notice for GCC (and I'd expect FSF
> copyright notices throughout the contributed GCC-specific files, not
> "Copyright (C) 2012 Iain Buclaw").
>

I'll have a look, though it sounds like very old files from before I
got the assignment papers sorted out that missed being updated when I
sifted through them.

Regards,
Iain.

Re: [PATCH 11/13] D: GCC builtins and runtime support.

2017-06-13 Thread Iain Buclaw

On 13 June 2017 at 19:38, Joseph Myers  wrote:
> Presumably all of these GCC-specific files should have the GCC Runtime
> Library Exception notice.
>

OK, noted.  I will update them.

Re: [PATCH 2/13] D: The front-end (GDC) implementation.

2017-06-13 Thread Iain Buclaw

On 13 June 2017 at 19:29, Joseph Myers  wrote:
> As I read it, the front end has functions with names such as error, but no
> useful i18n will actually occur because the functions in d-diagnostic.cc
> format the messages with xvasprintf before passing to the common
> diagnostic code.
>

That could be changed I guess to interact with
diagnostic_report_diagnostic() directly, rather than just being a high
level wrapper around gcc error_at()/warning_at().

> But will exgettext nevertheless extract messages from the dfrontend code,
> if the functions happen to have string arguments in the same position as
> the generic diagnostic functions do?  If so, I think that should be
> disabled, to avoid putting a lot of messages in gcc.pot that won't
> actually be translated.  (If actual i18n support is desired, it should be
> shared with other users of the front end, which would mean using dgettext
> to extract translations in a different domain from the default GCC one,
> and so the messages shouldn't go in gcc.pot anyway.)
>

I would say I'm open to i18n, however upstream D probably wouldn't be.
However as it is my intention to eventually switch the dfrontend
sources to D, exgettext extracting messages would cease to be a
problem.

> In d-target.cc you have code like:
>
> +  else if (global.params.isLinux)
> +{
> +  /* sizeof(pthread_mutex_t) for Linux.  */
> +  if (global.params.is64bit)
> +   return global.params.isLP64 ? 40 : 32;
> +  else
> +   return global.params.isLP64 ? 40 : 24;
> +}
>
> which feels like it belongs in the config/ configuration for each target
> (as a target hook returning the required information), not in the D front
> end code.  I'm not clear what global.params.is64bit is meant to mean; it
> looks like "this is x86_64, possibly x32" in this patch.  These values
> aren't correct in general anyway; on AArch64, glibc has pthread_mutex_t of
> size 48 for LP64 and 32 for ILP32; on HPPA (only ILP32 supported for
> Linux) it's 48.
>

That is something that I have been meaning to handle better.  I was
originally thinking something along the lines of it being determined
at configure time.  Then again, its only use is for the dfrontend to
generate a lowering of synchronized statements that looks like:

static byte[] critsec;
_d_criticalenter(critsec.ptr);
try { ... }
finally {
_d_criticalexit(critsec.ptr);
}

Returning a size that is large enough for all would work also in the worst case.
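
As a rough sketch of that worst-case approach, the sizes mentioned in
this thread can be collected in a table and the largest one returned.
The names known_mutex_sizes and critsec_size_upper_bound are
hypothetical, and the table only covers the values quoted here, so it is
not a complete list of Linux targets.

```cpp
#include <algorithm>
#include <cassert>  // assert(), used by the check in the usage note
#include <cstddef>
#include <iterator>

// sizeof(pthread_mutex_t) values quoted in this thread for Linux.
// Illustrative only; other targets may need larger values.
static const std::size_t known_mutex_sizes[] = {
  40,  // d-target.cc: isLP64
  32,  // d-target.cc: is64bit && !isLP64
  24,  // d-target.cc: !is64bit && !isLP64
  48,  // AArch64, LP64
  32,  // AArch64, ILP32
  48,  // HPPA, ILP32
};

// Hypothetical helper: a size large enough for every entry above.
inline std::size_t critsec_size_upper_bound()
{
  return *std::max_element(std::begin(known_mutex_sizes),
                           std::end(known_mutex_sizes));
}
```

For the values listed, critsec_size_upper_bound() yields 48.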

There are a number of fields in global.params that I would prefer
removed from the shared frontend.  I went through them all with a
co-collaborator on the core dlang team during Dconf, but I wouldn't
hold my breath for it to happen.

> You have two new target macros TARGET_CPU_D_BUILTINS and
> TARGET_OS_D_BUILTINS.  You're missing any documentation for them in
> tm.texi.in.  And we prefer target hooks to macros.  So please try to
> convert them to (documented) target hooks.  (See c-family/c-target.def,
> and c_target_objs etc., for how there can be hooks that are specific to
> particular front ends.  See the comment in config/default-c.c regarding
> how to deal with a mixture of OS-dependent and architecture-dependent
> hooks.)
>

OK, thanks for the suggestion!

Iain.

Re: Default std::vector default and move constructor

2017-06-13 Thread François Dumont


On 01/06/2017 15:34, Jonathan Wakely wrote:


I would expect the constructor to look like this:

  _Bvector_impl()
  _GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type()) )
 : _Bit_alloc_type()
 { }

What happens when you do that?



  _Bvector_impl(const _Bit_alloc_type& __a)
-: _Bit_alloc_type(__a), _M_start(), _M_finish(), 
_M_end_of_storage()

+ _GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type(__a)) )


Copying the allocator is not allowed to throw. You can use simply
_GLIBCXX_NOEXCEPT here.


Now that we have found out what the problem was with default/value 
initialization of the allocator, I would like to re-submit this patch with 
the correct constructor.
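
The exception specifications discussed above can be sketched in
isolation.  QuietAlloc, NoisyAlloc and Impl below are made-up stand-ins
for _Bit_alloc_type and _Bvector_impl, and plain noexcept stands in for
the _GLIBCXX_NOEXCEPT_IF and _GLIBCXX_NOEXCEPT macros, so this assumes
C++11 or later.

```cpp
#include <cassert>  // assert(), used by the checks in the usage note
#include <type_traits>

// Stand-in allocators: one with a non-throwing default constructor,
// one whose default constructor is allowed to throw.
struct QuietAlloc { QuietAlloc() noexcept {} };
struct NoisyAlloc { NoisyAlloc() {} };

template<typename Alloc>
struct Impl : Alloc
{
  // Default construction is noexcept only if the allocator's is.
  Impl() noexcept(noexcept(Alloc())) : Alloc() { }

  // Copying an allocator must not throw, so this is unconditional.
  Impl(const Alloc& a) noexcept : Alloc(a) { }
};
```

Here std::is_nothrow_default_constructible is true for Impl<QuietAlloc>
and false for Impl<NoisyAlloc>, while construction from an allocator is
non-throwing for both.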


Tested under Linux x86_64 normal mode.

Ok to commit ?

François
diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 78195c1..6e58503 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -388,10 +388,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   { return __x + __n; }
 
   inline void
-  __fill_bvector(_Bit_iterator __first, _Bit_iterator __last, bool __x)
+  __fill_bvector(_Bit_type * __v,
+		 unsigned int __first, unsigned int __last, bool __x)
   {
-for (; __first != __last; ++__first)
-  *__first = __x;
+const _Bit_type __fmask = ~0ul << __first;
+const _Bit_type __lmask = ~0ul >> (_S_word_bit - __last);
+const _Bit_type __mask = __fmask & __lmask;
+
+if (__x)
+  *__v |= __mask;
+else
+  *__v &= ~__mask;
   }
 
   inline void
@@ -399,12 +406,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   {
 if (__first._M_p != __last._M_p)
   {
-	std::fill(__first._M_p + 1, __last._M_p, __x ? ~0 : 0);
-	__fill_bvector(__first, _Bit_iterator(__first._M_p + 1, 0), __x);
-	__fill_bvector(_Bit_iterator(__last._M_p, 0), __last, __x);
+	_Bit_type *__first_p = __first._M_p;
+	if (__first._M_offset != 0)
+	  __fill_bvector(__first_p++, __first._M_offset, _S_word_bit, __x);
+
+	__builtin_memset(__first_p, __x ? ~0 : 0,
+			 (__last._M_p - __first_p) * sizeof(_Bit_type));
+
+	if (__last._M_offset != 0)
+	  __fill_bvector(__last._M_p, 0, __last._M_offset, __x);
   }
 else
-  __fill_bvector(__first, __last, __x);
+  __fill_bvector(__first._M_p, __first._M_offset, __last._M_offset, __x);
   }
 
   template<typename _Alloc>
@@ -416,33 +429,62 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	_Bit_alloc_traits;
   typedef typename _Bit_alloc_traits::pointer _Bit_pointer;
 
-  struct _Bvector_impl
-  : public _Bit_alloc_type
+  struct _Bvector_impl_data
   {
 	_Bit_iterator 	_M_start;
 	_Bit_iterator 	_M_finish;
 	_Bit_pointer 	_M_end_of_storage;
 
+	_Bvector_impl_data() _GLIBCXX_NOEXCEPT
+	: _M_start(), _M_finish(), _M_end_of_storage()
+	{ }
+
+#if __cplusplus >= 201103L
+	_Bvector_impl_data(_Bvector_impl_data&& __x) noexcept
+	: _M_start(__x._M_start), _M_finish(__x._M_finish)
+	, _M_end_of_storage(__x._M_end_of_storage)
+	{ __x._M_reset(); }
+
+	void
+	_M_move_data(_Bvector_impl_data&& __x) noexcept
+	{
+	  this->_M_start = __x._M_start;
+	  this->_M_finish = __x._M_finish;
+	  this->_M_end_of_storage = __x._M_end_of_storage;
+	  __x._M_reset();
+	}
+#endif
+
+	void
+	_M_reset() _GLIBCXX_NOEXCEPT
+	{
+	  _M_start = _M_finish = _Bit_iterator();
+	  _M_end_of_storage = _Bit_pointer();
+	}
+  };
+
+  struct _Bvector_impl
+	: public _Bit_alloc_type, public _Bvector_impl_data
+	{
+	public:
 	  _Bvector_impl()
-	: _Bit_alloc_type(), _M_start(), _M_finish(), _M_end_of_storage()
+	_GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type()) )
+	  : _Bit_alloc_type()
 	  { }
 
-	_Bvector_impl(const _Bit_alloc_type& __a)
-	: _Bit_alloc_type(__a), _M_start(), _M_finish(), _M_end_of_storage()
+	  _Bvector_impl(const _Bit_alloc_type& __a) _GLIBCXX_NOEXCEPT
+	  : _Bit_alloc_type(__a)
 	  { }
 
 #if __cplusplus >= 201103L
-	_Bvector_impl(_Bit_alloc_type&& __a)
-	: _Bit_alloc_type(std::move(__a)), _M_start(), _M_finish(),
-	  _M_end_of_storage()
-	{ }
+	_Bvector_impl(_Bvector_impl&&) = default;
 #endif
 
 	_Bit_type*
 	_M_end_addr() const _GLIBCXX_NOEXCEPT
 	{
-	  if (_M_end_of_storage)
-	return std::__addressof(_M_end_of_storage[-1]) + 1;
+	  if (this->_M_end_of_storage)
+	return std::__addressof(this->_M_end_of_storage[-1]) + 1;
 	  return 0;
 	}
   };
@@ -452,33 +494,27 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
   _Bit_alloc_type&
   _M_get_Bit_allocator() _GLIBCXX_NOEXCEPT
-  { return *static_cast<_Bit_alloc_type*>(&this->_M_impl); }
+  { return this->_M_impl; }
 
   const _Bit_alloc_type&
   _M_get_Bit_allocator() const _GLIBCXX_NOEXCEPT
-  { return *static_cast<const _Bit_alloc_type*>(&this->_M_impl); }
+  { return this->_M_impl; }
 
   allocator_type
   get_allocator() const _GLIBCXX_NOEXCEPT
   { return allocator_type(_M_get_Bit_allocator()); }
 
-  _Bvector_base()
-  : _M_impl() { }
+#if __cplusplus >= 201103L
+  _Bvector_base() = default;
+#else
+  _Bvector_base() { }
+#endif

gotools patch committed: Build for host_alias = target_alias

2017-06-13 Thread Ian Lance Taylor

This patch to gotools/configure.ac fixes the build so that the tools are built
when host_alias = target_alias, or, in other words, when the system
for which we are building code is the same as the system where that
code will run.  The earlier test of cross_compiling effectively tested
whether build_alias and host_alias were different, which is not the
same thing when doing a Canadian Cross.  This is for PR 80964.
Bootstrapped on x86_64-pc-linux-gnu (there are no tests for this code,
which is unfortunate).  Committed to mainline.
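
The difference between the two conditions can be modelled with a small
sketch.  The Aliases struct and both function names are hypothetical,
and the old condition is modelled as "build equals host", following the
description above rather than the autoconf sources.

```cpp
#include <cassert>  // assert(), used by the checks in the usage note
#include <string>

// Hypothetical model of the three configure aliases.
struct Aliases {
  std::string build;   // system running configure and make
  std::string host;    // system where the built tools will run
  std::string target;  // system the tools generate code for
};

// Old condition, as described above: effectively build == host.
inline bool native_by_cross_compiling(const Aliases& a)
{ return a.build == a.host; }

// New condition from the patch: host == target.
inline bool native_by_host_target(const Aliases& a)
{ return a.host == a.target; }
```

For a Canadian Cross with build x86_64-pc-linux-gnu and, say, both host
and target set to arm-linux-gnueabihf (an example triplet chosen for
illustration), the old condition is false and the new one is true.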

Ian

2017-06-13  Ian Lance Taylor  

PR go/80964
* configure.ac: Set NATIVE if host_alias = target_alias.
* configure: Rebuild.
Index: configure.ac
===
--- configure.ac(revision 249171)
+++ configure.ac(working copy)
@@ -46,7 +46,7 @@ AC_PROG_INSTALL
 AC_PROG_CC
 AC_PROG_GO
 
-AM_CONDITIONAL(NATIVE, test "$cross_compiling" = no)
+AM_CONDITIONAL(NATIVE, test "$host_alias" = "$target_alias")
 
 dnl Test for -lsocket and -lnsl.  Copied from libjava/configure.ac.
 AC_CACHE_CHECK([for socket libraries], gotools_cv_lib_sockets,

Re: Avoid _Rb_tree_rotate_[left,right] symbols export

2017-06-13 Thread François Dumont


On 12/05/2017 13:03, Jonathan Wakely wrote:

A much simpler (but equivalent) change would be:

--- a/libstdc++-v3/src/c++98/tree.cc
+++ b/libstdc++-v3/src/c++98/tree.cc
@@ -153,6 +153,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  /* Static keyword was missing on _Rb_tree_rotate_left.
 Export the symbol for backward compatibility until
 next ABI change.  */
+#if _GLIBCXX_INLINE_VERSION
+  static
+#endif


Ok, so it looks like you are not a great fan of the anonymous namespace 
in this context.


Here is a new proposal. We don't need to add the static keyword, since this 
function is only here to be exported for backward compatibility.



Tested under Linux x86_64 with versioned namespace.


What about the normal configuration? It's much more important that the
default configuration works. The versioned namespace that nobody uses
doesn't matter.


Tested under Linux x86_64 normal mode.

Ok to commit ?

François


diff --git a/libstdc++-v3/src/c++98/tree.cc b/libstdc++-v3/src/c++98/tree.cc
index 50fa7cf..0984b05 100644
--- a/libstdc++-v3/src/c++98/tree.cc
+++ b/libstdc++-v3/src/c++98/tree.cc
@@ -150,15 +150,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __x->_M_parent = __y;
   }
 
+#if !_GLIBCXX_INLINE_VERSION
   /* Static keyword was missing on _Rb_tree_rotate_left.
  Export the symbol for backward compatibility until
  next ABI change.  */
   void
   _Rb_tree_rotate_left(_Rb_tree_node_base* const __x,
 		   _Rb_tree_node_base*& __root)
-  {
-local_Rb_tree_rotate_left (__x, __root);
-  }
+  { local_Rb_tree_rotate_left (__x, __root); }
+#endif
 
   static void
   local_Rb_tree_rotate_right(_Rb_tree_node_base* const __x,
@@ -181,15 +181,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __x->_M_parent = __y;
   }
 
+#if !_GLIBCXX_INLINE_VERSION
   /* Static keyword was missing on _Rb_tree_rotate_right
  Export the symbol for backward compatibility until
  next ABI change.  */
   void
   _Rb_tree_rotate_right(_Rb_tree_node_base* const __x,
 			_Rb_tree_node_base*& __root)
-  {
-local_Rb_tree_rotate_right (__x, __root);
-  }
+  { local_Rb_tree_rotate_right (__x, __root); }
+#endif
 
   void
   _Rb_tree_insert_and_rebalance(const bool  __insert_left,

[PATCH, rs6000] (v3) Fold vector shifts in GIMPLE

2017-06-13 Thread Will Schmidt

Hi, 

Add support for early expansion of vector shifts, including
vec_sl (shift left), vec_sr (shift right),
vec_sra (shift right algebraic), and vec_rl (rotate left).
Part of this includes adding the vector shift right instructions to
the list of those instructions having an unsigned second argument.

The VSR (vector shift right) folding is a bit more complex than
the others.  arg0 has to be converted to unsigned before the
gimple RSHIFT_EXPR assignment is built, because RSHIFT_EXPR on a
signed operand would be an algebraic shift.
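
A scalar analogue of that conversion, written as plain C++ rather than
GIMPLE, may make the point clearer.  The function name
logical_shift_right is made up for this sketch; the patch itself works
on vector types and uses VIEW_CONVERT_EXPR for the conversions.

```cpp
#include <cassert>
#include <cstdint>

// Logical right shift of a signed 32-bit value: reinterpret the bits as
// unsigned, shift (zeros enter from the left), then convert back.
inline std::int32_t logical_shift_right(std::int32_t value, unsigned amount)
{
  assert(amount < 32);  // larger shift counts are undefined in C++
  const std::uint32_t bits = static_cast<std::uint32_t>(value);
  return static_cast<std::int32_t>(bits >> amount);
}
```

For example, logical_shift_right(-8, 1) gives 0x7FFFFFFC, whereas an
algebraic shift of -8 by one bit would keep the sign and give -4.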

[V2 update] Guard the folding of left shifts with TYPE_OVERFLOW_WRAPS.
Add -fwrapv test variations for the left shifts.

[V3 update] Rework the vector shift right folding logic to use the
gimple_build convenience routines.  Add a #include of ssa-propagate.h
to get at the update_call_from_tree() function.

I sniff-tested the latest changes on Power8, with good results.  Full
regtest running.  OK for trunk?


[gcc]

2017-06-13  Will Schmidt  

* config/rs6000/rs6000.c: Add include of ssa-propagate.h for
update_call_from_tree().
(rs6000_gimple_fold_builtin): Add handling
for early expansion of vector shifts (sl,sr,sra,rl).
(builtin_function_type): Add vector shift right instructions
to the unsigned argument list.

[gcc/testsuite]

2017-06-13  Will Schmidt  

* testsuite/gcc.target/powerpc/fold-vec-shift-char.c: New.
* testsuite/gcc.target/powerpc/fold-vec-shift-int.c: New.
* testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c: New.
* testsuite/gcc.target/powerpc/fold-vec-shift-short.c: New.
* testsuite/gcc.target/powerpc/fold-vec-shift-left.c: New.
* testsuite/gcc.target/powerpc/fold-vec-shift-left-fwrapv.c: New.
* testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong-fwrapv.c: 
New.
* testsuite/gcc.target/powerpc/fold-vec-shift-left-longlong.c: New.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 63ca2d1..a88fc18 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -77,6 +77,7 @@
 #endif
 #include "case-cfn-macros.h"
 #include "ppc-auxv.h"
+#include "tree-ssa-propagate.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -16588,6 +16589,76 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
gsi_replace (gsi, g, true);
return true;
   }
+/* Flavors of vec_rotate_left.  */
+case ALTIVEC_BUILTIN_VRLB:
+case ALTIVEC_BUILTIN_VRLH:
+case ALTIVEC_BUILTIN_VRLW:
+case P8V_BUILTIN_VRLD:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple *g = gimple_build_assign (lhs, LROTATE_EXPR, arg0, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
+  /* Flavors of vector shift right algebraic.
+   * vec_sra{b,h,w} -> vsra{b,h,w}.  */
+case ALTIVEC_BUILTIN_VSRAB:
+case ALTIVEC_BUILTIN_VSRAH:
+case ALTIVEC_BUILTIN_VSRAW:
+case P8V_BUILTIN_VSRAD:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple *g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
+   /* Flavors of vector shift left.
+* builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}.  */
+case ALTIVEC_BUILTIN_VSLB:
+case ALTIVEC_BUILTIN_VSLH:
+case ALTIVEC_BUILTIN_VSLW:
+case P8V_BUILTIN_VSLD:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (arg0)))
+   && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (TREE_TYPE (arg0))))
+ return false;
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple *g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
+/* Flavors of vector shift right.  */
+case ALTIVEC_BUILTIN_VSRB:
+case ALTIVEC_BUILTIN_VSRH:
+case ALTIVEC_BUILTIN_VSRW:
+case P8V_BUILTIN_VSRD:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple_seq stmts = NULL;
+   /* convert arg0 to unsigned.  */
+   tree arg0_unsigned
+  = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+  unsigned_type_for (TREE_TYPE (arg0)), arg0);
+   tree res
+  = gimple_build (&stmts, RSHIFT_EXPR,
+  TREE_TYPE (arg0_unsigned), arg0_unsigned, arg1);
+   /* convert result back to the lhs type.  */
+   res = gimple_build (&stmts, VIEW_CONVERT_EXPR, TREE_TYPE (lhs), res);
+

Re: [PATCH, GCC/testsuite/ARM] Consistently check for neon in vect effective targets

2017-06-13 Thread Christophe Lyon

Hi Thomas,

On 13 June 2017 at 11:08, Thomas Preudhomme
 wrote:
> Hi,
>
> Conditions checked for ARM targets in vector-related effective targets
> are inconsistent:
>
> * sometimes arm*-*-* is checked
> * sometimes Neon is checked
> * sometimes arm_neon_ok and sometimes arm_neon is used for neon check
> * sometimes check_effective_target_* is used, sometimes is-effective-target
>
> This patch consolidates all of these checks into using is-effective-target
> arm_neon; where little endian was checked, that check is kept.
>
> ChangeLog entry is as follows:
>
> *** gcc/testsuite/ChangeLog ***
>
> 2017-06-06  Thomas Preud'homme  
>
> * lib/target-supports.exp (check_effective_target_vect_int): Replace
> current ARM check by ARM NEON's availability check.
> (check_effective_target_vect_intfloat_cvt): Likewise.
> (check_effective_target_vect_uintfloat_cvt): Likewise.
> (check_effective_target_vect_floatint_cvt): Likewise.
> (check_effective_target_vect_floatuint_cvt): Likewise.
> (check_effective_target_vect_shift): Likewise.
> (check_effective_target_whole_vector_shift): Likewise.
> (check_effective_target_vect_bswap): Likewise.
> (check_effective_target_vect_shift_char): Likewise.
> (check_effective_target_vect_long): Likewise.
> (check_effective_target_vect_float): Likewise.
> (check_effective_target_vect_perm): Likewise.
> (check_effective_target_vect_perm_byte): Likewise.
> (check_effective_target_vect_perm_short): Likewise.
> (check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
> (check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
> (check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
> (check_effective_target_vect_widen_mult_hi_to_si): Likewise.
> (check_effective_target_vect_widen_mult_qi_to_hi_pattern): Likewise.
> (check_effective_target_vect_widen_mult_hi_to_si_pattern): Likewise.
> (check_effective_target_vect_widen_shift): Likewise.
> (check_effective_target_vect_extract_even_odd): Likewise.
> (check_effective_target_vect_interleave): Likewise.
> (check_effective_target_vect_multiple_sizes): Likewise.
> (check_effective_target_vect64): Likewise.
> (check_effective_target_vect_max_reduc): Likewise.
>
> Testing: Testsuite shows no regression when targeting ARMv7-A with
> -mfpu=neon-fpv4 and -mfloat-abi=hard or when targeting Cortex-M3 with
> default FPU and float ABI (soft).
>

That's strange, my testing detects a syntax error:

  Executed from: gcc.dg/vect/vect.exp
gcc.dg/vect/slp-9.c: error executing dg-final: unbalanced close paren

See 
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/249142-consistent_neon_check/report-build-info.html
for a full picture.

Note that the cells with "BETTER" seem to be mostly several PASSes
becoming unsupported.

Thanks,

Christophe

> Is this ok for trunk?
>
> Best regards,
>
> Thomas

Re: Containers default initialization

2017-06-13 Thread François Dumont


On 12/06/2017 13:57, Jonathan Wakely wrote:

Ok to commit ?


OK, thanks.


Done yesterday.

I guess that considering the compiler bug and rare occasions for this 
bug to show up we don't backport.


François

Re: [PATCH 2/2] [MSP430] Fix issues handling .persistent attribute (PR 78818)

2017-06-13 Thread Jozef Lawrynowicz


On 13/06/2017 16:54, Nick Clifton wrote:

Hi Jozef,


Ok for trunk and gcc-7-branch?


Approved - please apply (to both).

Cheers
   Nick




Sorry, didn't mention in that last post that I don't have write access,
could someone please apply this for me.

Thanks,
Jozef

Re: [PATCH 13/13] D: Phobos config, makefiles, and testsuite.

2017-06-13 Thread Joseph Myers

There appear to be various GPLv2 notices with old FSF addresses in here.  
Where those are on source files (as opposed to generated files), they 
should be updated to the usual GPLv3+ notice for GCC (and I'd expect FSF 
copyright notices throughout the contributed GCC-specific files, not 
"Copyright (C) 2012 Iain Buclaw").

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 11/13] D: GCC builtins and runtime support.

2017-06-13 Thread Joseph Myers

Presumably all of these GCC-specific files should have the GCC Runtime 
Library Exception notice.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 2/13] D: The front-end (GDC) implementation.

2017-06-13 Thread Joseph Myers

As I read it, the front end has functions with names such as error, but no 
useful i18n will actually occur because the functions in d-diagnostic.cc 
format the messages with xvasprintf before passing to the common 
diagnostic code.

But will exgettext nevertheless extract messages from the dfrontend code, 
if the functions happen to have string arguments in the same position as 
the generic diagnostic functions do?  If so, I think that should be 
disabled, to avoid putting a lot of messages in gcc.pot that won't 
actually be translated.  (If actual i18n support is desired, it should be 
shared with other users of the front end, which would mean using dgettext 
to extract translations in a different domain from the default GCC one, 
and so the messages shouldn't go in gcc.pot anyway.)

In d-target.cc you have code like:

+  else if (global.params.isLinux)
+{
+  /* sizeof(pthread_mutex_t) for Linux.  */
+  if (global.params.is64bit)
+   return global.params.isLP64 ? 40 : 32;
+  else
+   return global.params.isLP64 ? 40 : 24;
+}

which feels like it belongs in the config/ configuration for each target 
(as a target hook returning the required information), not in the D front 
end code.  I'm not clear what global.params.is64bit is meant to mean; it 
looks like "this is x86_64, possibly x32" in this patch.  These values 
aren't correct in general anyway; on AArch64, glibc has pthread_mutex_t of 
size 48 for LP64 and 32 for ILP32; on HPPA (only ILP32 supported for 
Linux) it's 48.

You have two new target macros TARGET_CPU_D_BUILTINS and 
TARGET_OS_D_BUILTINS.  You're missing any documentation for them in 
tm.texi.in.  And we prefer target hooks to macros.  So please try to 
convert them to (documented) target hooks.  (See c-family/c-target.def, 
and c_target_objs etc., for how there can be hooks that are specific to 
particular front ends.  See the comment in config/default-c.c regarding 
how to deal with a mixture of OS-dependent and architecture-dependent 
hooks.)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 32/30][arm][testsuite] Fix neon-thumb2-move.c test

2017-06-13 Thread Richard Earnshaw (lists)

This test was overriding the options that had been detected as being
necessary to enable Neon.  The result was that the combination of the
test's options and those auto-detected were not compatible with neon
leading to a test failure.  The correct fix here is to stick with the
options that dg-add-options arm_neon has worked out.

* gcc.target/arm/neon-thumb2-move.c (dg-options): Don't override
the architecture options added by dg-add-options arm_neon.


diff --git a/gcc/testsuite/gcc.target/arm/neon-thumb2-move.c b/gcc/testsuite/gcc.target/arm/neon-thumb2-move.c
index 9cf86dd..d8c6748 100644
--- a/gcc/testsuite/gcc.target/arm/neon-thumb2-move.c
+++ b/gcc/testsuite/gcc.target/arm/neon-thumb2-move.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_neon_ok } */
 /* { dg-require-effective-target arm_thumb2_ok } */
-/* { dg-options "-O2 -mthumb -march=armv7-a" } */
+/* { dg-options "-O2 -mthumb" } */
 /* { dg-add-options arm_neon } */
 /* { dg-prune-output "switch .* conflicts with" } */

Re: [PATCH 31/30] [arm] Mark -marm and -mthumb as being inverse options

2017-06-13 Thread Richard Earnshaw (lists)

-marm and -mthumb are opposites: one cancels out the other.  This patch
marks them as such so that the driver will eliminate all but the last
option on the command line.  This aids multilib selection which
otherwise can get confused if both are present.
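
The effect described above, where only the last of two mutually
exclusive options survives, can be sketched as follows.  The function
keep_last_of_pair is a hypothetical model of the behaviour and is not
the driver's actual implementation.

```cpp
#include <cassert>  // assert(), used by the check in the usage note
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: drop every occurrence of options A and B except
// the last one on the command line; all other arguments are kept.
inline std::vector<std::string>
keep_last_of_pair(const std::vector<std::string>& args,
                  const std::string& a, const std::string& b)
{
  // Index of the last occurrence of either option, or args.size().
  std::size_t last = args.size();
  for (std::size_t i = 0; i < args.size(); ++i)
    if (args[i] == a || args[i] == b)
      last = i;

  std::vector<std::string> out;
  for (std::size_t i = 0; i < args.size(); ++i)
    {
      const bool in_pair = (args[i] == a || args[i] == b);
      if (!in_pair || i == last)
        out.push_back(args[i]);
    }
  return out;
}
```

Given the arguments -mthumb -O2 -marm, the sketch keeps -O2 -marm, which
is the single state that multilib selection would then see.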

* config/arm/arm.opt (marm): Mark as the negative of -mthumb.
(mthumb): Mark as the negative of -marm.


R.
diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
index efee1be..dad5257 100644
--- a/gcc/config/arm/arm.opt
+++ b/gcc/config/arm/arm.opt
@@ -91,7 +91,7 @@ EnumValue
 Enum(arm_arch) String(native) Value(-1) DriverOnly
 
 marm
-Target Report RejectNegative InverseMask(THUMB)
+Target Report RejectNegative Negative(mthumb) InverseMask(THUMB)
 Generate code in 32 bit ARM state.
 
 mbig-endian
@@ -195,7 +195,7 @@ Target RejectNegative Joined UInteger Var(arm_structure_size_boundary) Init(DEFA
 Specify the minimum bit alignment of structures.
 
 mthumb
-Target Report RejectNegative Mask(THUMB) Save
+Target Report RejectNegative Negative(marm) Mask(THUMB) Save
 Generate code for Thumb state.
 
 mthumb-interwork

Re: [PATCH 14/30] [arm] Generate a canonical form for -march

2017-06-13 Thread Richard Earnshaw (lists)

On 09/06/17 13:53, Richard Earnshaw wrote:
> 
> This patch uses the driver and some spec rewrite rules to generate a
> canonicalized form of the -march= option.  We want to do this for
> several reasons, all relating to making multi-lib selection sane.
> 
> 1) It can remove redundant extension options to produce a minimal
> list.
> 
> 2) The general syntax of the option permits a plethora of features;
> these are permitted in any order.  Canonicalization ensures that there
> is a single ordering of the options that are needed.
> 
> 3) It can use additional options to remove extensions that aren't
> relevant, such as removing all features that relate to the FPU when
> use of that is disabled.
> 
> Once we have this information in a sensible form the multilib rules
> can be vastly simplified making for much more understandable Makefile
> fragments.
> 
>   * common/config/arm/arm-common.c: Define INCLUDE_LIST.
>   (configargs.h): Include it.
>   (arm_print_hint_for_fpu_option): New function.
>   (arm_parse_fpu_option): New function.
>   (candidate_extension): New class.
>   (arm_canon_for_multilib): New function.
>   * config/arm/arm.h (CANON_ARCH_SPEC_FUNCTION): New macro.
>   (EXTRA_SPEC_FUNCTIONS): Add CANON_ARCH_SPEC_FUNCTION.
>   (ARCH_CANONICAL_SPECS): New macro.
>   (DRIVER_SELF_SPECS): Add ARCH_CANONICAL_SPECS.

Fix the canonicalization issue by deferring deletion of multiple -march
options until after the array substitution iteration process has completed.

As Joseph mentioned, it would be nice if this code did a degree of
validation of overridden options; the canonicalizer could in principle do
this, but I haven't added that code at this time.  I'll consider that
for a follow-up.
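
To illustrate what "canonical" means here, a simplified sketch follows.  It
only sorts and de-duplicates the +ext components of an -march string;
canon_march is a hypothetical name, and alphabetical order is an assumption
made for the example (the patch derives its ordering and redundancy removal
from the ISA feature bits, which this sketch does not model).

```c
#include <stdlib.h>
#include <string.h>

#define MAX_EXT 32

/* qsort comparator for an array of C strings.  */
static int
cmp_str (const void *a, const void *b)
{
  return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Hypothetical helper: write a canonical form of the -march value IN
   to OUT (OUTSZ bytes).  Extensions are sorted alphabetically and
   duplicates are dropped.  Returns 0 on success, -1 if something does
   not fit.  */
int
canon_march (const char *in, char *out, size_t outsz)
{
  char buf[256];
  char *ext[MAX_EXT];
  size_t n = 0;

  if (strlen (in) >= sizeof buf)
    return -1;
  strcpy (buf, in);

  /* Split at each '+'; the base name is what precedes the first one.  */
  char *p = strchr (buf, '+');
  while (p != NULL)
    {
      *p++ = '\0';
      if (n == MAX_EXT)
        return -1;
      ext[n++] = p;
      p = strchr (p, '+');
    }

  qsort (ext, n, sizeof ext[0], cmp_str);

  if (strlen (buf) >= outsz)
    return -1;
  strcpy (out, buf);

  for (size_t i = 0; i < n; i++)
    {
      /* After sorting, duplicates are adjacent.  */
      if (i > 0 && strcmp (ext[i], ext[i - 1]) == 0)
        continue;
      if (strlen (out) + 1 + strlen (ext[i]) >= outsz)
        return -1;
      strcat (out, "+");
      strcat (out, ext[i]);
    }
  return 0;
}
```

Both "armv8-a+simd+crc+crc" and "armv8-a+crc+simd" come out as
"armv8-a+crc+simd", which is the property the multilib rules rely on.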

R.
diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index 42f1ad4..30cb61e 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -17,6 +17,7 @@
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */
 
+#define INCLUDE_LIST
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -305,6 +306,41 @@ arm_parse_arch_option_name (const arch_option *list, const char *optname,
   return NULL;
 }
 
+/* List the permitted architecture option names.  If TARGET is a near
+   miss for an entry, print out the suggested alternative.  */
+static void
+arm_print_hint_for_fpu_option (const char *target)
+{
+  auto_vec<const char*> candidates;
+  for (int i = 0; i < TARGET_FPU_auto; i++)
+candidates.safe_push (all_fpus[i].name);
+  char *s;
+  const char *hint = candidates_list_and_hint (target, s, candidates);
+  if (hint)
+inform (input_location, "valid arguments are: %s; did you mean %qs?",
+	s, hint);
+  else
+inform (input_location, "valid arguments are: %s", s);
+
+  XDELETEVEC (s);
+}
+
+static const arm_fpu_desc *
+arm_parse_fpu_option (const char *opt)
+{
+  int i;
+
+  for (i = 0; i < TARGET_FPU_auto; i++)
+{
+  if (strcmp (all_fpus[i].name, opt) == 0)
+	return all_fpus + i;
+}
+
+  error_at (input_location, "unrecognized -mfpu target: %s", opt);
+  arm_print_hint_for_fpu_option (opt);
+  return NULL;
+}
+
 /* Convert a static initializer array of feature bits to sbitmap
representation.  */
 void
@@ -405,6 +441,324 @@ arm_parse_option_features (sbitmap isa, const cpu_arch_option *target,
 }
 }
 
+class candidate_extension
+{
+public:
+  const cpu_arch_extension *extension;
+  sbitmap isa_bits;
+  bool required;
+
+  candidate_extension (const cpu_arch_extension *ext, sbitmap bits)
+: extension (ext), isa_bits (bits), required (true)
+{}
+  ~candidate_extension ()
+{
+  sbitmap_free (isa_bits);
+}
+};
+
+/* Generate a canonical representation of the -march option from the
+   current -march string (if given) and other options on the command
+   line that might affect the architecture.  This aids multilib selection
+   by ensuring that:
+   a) the option is always present
+   b) only the minimal set of options are used
+   c) when there are multiple extensions, they are in a consistent order.
+
+   The options array consists of couplets of information where the
+   first item in each couplet is the string describing which option
+   name was selected (arch, cpu, fpu) and the second is the value
+   passed for that option.  */
+const char *
+arm_canon_arch_option (int argc, const char **argv)
+{
+  const char *arch = NULL;
+  const char *cpu = NULL;
+  const char *fpu = NULL;
+  const char *abi = NULL;
+  static char *canonical_arch = NULL;
+
+  /* Just in case we're called more than once.  */
+  if (canonical_arch)
+{
+  free (canonical_arch);
+  canonical_arch = NULL;
+}
+
+  if (argc & 1)
+fatal_error (input_location,
+		 "%%:canon_for_mlib takes 1 or more pairs of parameters");
+
+  while (argc)
+{
+  if (strcmp (argv[0], "arch") == 0)
+	arch = argv[1];
+  else if (strcmp (argv[0], "cpu") == 0)

Re: [PATCH 09/30] [ARM] Move cpu and architecture option name parsing code to arm-common.c

2017-06-13 Thread Richard Earnshaw (lists)

On 09/06/17 13:53, Richard Earnshaw wrote:
> 
> This patch has no functional change.  The code used for parsing -mcpu,
> -mtune and -march options is simply moved from arm.c to arm-common.c.
> The list of FPU options is also moved.  Subsequent patches will make
> use of this within the driver.
> 
> Some small adjustments are needed as a consequence of moving the
> definitions of the data objects to another object file, in that we
> no longer have direct access to the size of the object.
> 
>   * common/config/arm/arm-common.c (arm_initialize_isa): Moved here from
>   config/arm/arm.c.
>   (arm_print_hint_for_cpu_option): Likewise.
>   (arm_print_hint_for_arch_option): Likewise.
>   (arm_parse_cpu_option_name): Likewise.
>   (arm_parse_arch_option_name): Likewise.
>   * config/arm/arm.c (arm_identify_fpu_from_isa): Use the computed number
>   of entries in the all_fpus list.
>   * config/arm/arm-protos.h (all_architectures, all_cores): Declare.
>   (arm_parse_cpu_option_name): Declare.
>   (arm_parse_arch_option_name): Declare.
>   (arm_parse_option_features): Declare.
>   (arm_initialize_isa): Declare.
>   * config/arm/parsecpu.awk (gen_data): Move CPU and architecture
>   data tables to ...
>   (gen_comm_data): ... here.  Make definitions non-static.
>   * config/arm/arm-cpu-data.h: Regenerated.
>   * config/arm/arm-cpu-cdata.h: Regenerated.

More typo fixes.
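
As a reading aid for the name lookup that is being moved, here is a
stand-alone sketch of the matching rule: only the text before the first '+'
is compared, and it must match a table entry exactly rather than as a prefix.
The struct and function names are hypothetical stand-ins, not the types used
in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a table entry; the table is terminated by
   an entry whose name is NULL.  */
struct name_entry
{
  const char *name;
};

/* Return the entry whose name equals the part of TARGET that precedes
   the first '+', or NULL if there is none.  */
const struct name_entry *
find_base_name (const struct name_entry *list, const char *target)
{
  const char *end = strchr (target, '+');
  size_t len = end ? (size_t) (end - target) : strlen (target);

  for (; list->name != NULL; list++)
    {
      /* strncmp alone would accept "armv8" for "armv8-a"; checking
         for the terminator at LEN rejects such prefix matches.  */
      if (strncmp (list->name, target, len) == 0
          && list->name[len] == '\0')
        return list;
    }
  return NULL;
}
```

So "armv8-a+crc" resolves to the "armv8-a" entry, while a bare "armv8" does
not match anything.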

diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index fd0c616..f44ba1f 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -27,6 +27,8 @@
 #include "common/common-target-def.h"
 #include "opts.h"
 #include "flags.h"
+#include "sbitmap.h"
+#include "diagnostic.h"
 
 /* Set default optimization options.  */
 static const struct default_options arm_option_optimization_table[] =
@@ -187,6 +189,194 @@ arm_target_thumb_only (int argc, const char **argv)
 return NULL;
 }
 
+/* List the permitted CPU option names.  If TARGET is a near miss for an
+   entry, print out the suggested alternative.  */
+static void
+arm_print_hint_for_cpu_option (const char *target,
+			   const cpu_option *list)
+{
+  auto_vec<const char*> candidates;
+  for (; list->common.name != NULL; list++)
+candidates.safe_push (list->common.name);
+  char *s;
+  const char *hint = candidates_list_and_hint (target, s, candidates);
+  if (hint)
+inform (input_location, "valid arguments are: %s; did you mean %qs?",
+	s, hint);
+  else
+inform (input_location, "valid arguments are: %s", s);
+
+  XDELETEVEC (s);
+}
+
+/* Parse the base component of a CPU selection in LIST.  Return a
+   pointer to the entry in the architecture table.  OPTNAME is the
+   name of the option we are parsing and can be used if a diagnostic
+   is needed.  */
+const cpu_option *
+arm_parse_cpu_option_name (const cpu_option *list, const char *optname,
+			   const char *target)
+{
+  const cpu_option *entry;
+  const char *end  = strchr (target, '+');
+  size_t len = end ? end - target : strlen (target);
+
+  for (entry = list; entry->common.name != NULL; entry++)
+{
+  if (strncmp (entry->common.name, target, len) == 0
+	  && entry->common.name[len] == '\0')
+	return entry;
+}
+
+  error_at (input_location, "unrecognized %s target: %s", optname, target);
+  arm_print_hint_for_cpu_option (target, list);
+  return NULL;
+}
+
+/* List the permitted architecture option names.  If TARGET is a near
+   miss for an entry, print out the suggested alternative.  */
+static void
+arm_print_hint_for_arch_option (const char *target,
+			   const arch_option *list)
+{
+  auto_vec<const char*> candidates;
+  for (; list->common.name != NULL; list++)
+candidates.safe_push (list->common.name);
+  char *s;
+  const char *hint = candidates_list_and_hint (target, s, candidates);
+  if (hint)
+inform (input_location, "valid arguments are: %s; did you mean %qs?",
+	s, hint);
+  else
+inform (input_location, "valid arguments are: %s", s);
+
+  XDELETEVEC (s);
+}
+
+/* Parse the base component of a CPU or architecture selection in
+   LIST.  Return a pointer to the entry in the architecture table.
+   OPTNAME is the name of the option we are parsing and can be used if
+   a diagnostic is needed.  */
+const arch_option *
+arm_parse_arch_option_name (const arch_option *list, const char *optname,
+			const char *target)
+{
+  const arch_option *entry;
+  const char *end  = strchr (target, '+');
+  size_t len = end ? end - target : strlen (target);
+
+  for (entry = list; entry->common.name != NULL; entry++)
+{
+  if (strncmp (entry->common.name, target, len) == 0
+	  && entry->common.name[len] == '\0')
+	return entry;
+}
+
+  error_at (input_location, "unrecognized %s target: %s", optname, target);
+  arm_print_hint_for_arch_option (target, list);
+  return NULL;
+}
+
+/* Convert a static initializer array of feature bits to sbitmap
+   representation.

Re: [PATCH 08/30] [arm] Split CPU, architecture and tuning data tables.

2017-06-13 Thread Richard Earnshaw (lists)

On 09/06/17 13:53, Richard Earnshaw wrote:
> 
> The driver really needs to handle some canonicalization of the new
> -mcpu and -march options in order to make multilib selection
> tractable.  This will require moving much of the logic to parse the
> new options into the common code file.  However, the tuning data
> definitely does not want to be there as it is very specific to the
> compiler passes.  To facilitate this we need to split up the generated
> configuration data into architectural and tuning related tables.
> 
> This patch starts that process, but does not yet move any code out of
> the compiler backend.  Since I'm reworking all that code I took the
> opportunity to also separate out the CPU data tables from the
> architecture data tables.  Although they are related, there is a lot
> of redundancy in the CPU options that is best handled by simply
> indirecting to the architecture entry.
> 
>   * config/arm/arm-protos.h (arm_build_target): Remove arch_core.
>   (cpu_arch_extension): New structure.
>   (cpu_arch_option, arch_option, cpu_option): New structures.
>   * config/arm/parsecpu.awk (gen_headers): Build an enumeration of
>   architecture types.
>   (gen_data): Generate new format data tables.
>   * config/arm/arm.c (cpu_tune): New structure.
>   (cpu_option, processors): Delete.
>   (arm_print_hint_for_core_or_arch): Delete.  Replace with ...
>   (arm_print_hint_for_cpu_option): ... this and ...
>   (arm_print_hint_for_arch_option): ... this.
>   (arm_parse_arch_cpu_name): Delete.  Replace with ...
>   (arm_parse_cpu_option_name): ... this and ...
>   (arm_parse_arch_option_name): ... this.
>   (arm_unrecognized_feature): Change type of target parameter to
>   cpu_arch_option.
>   (arm_parse_arch_cpu_features): Delete.  Replace with ...
>   (arm_parse_option_features): ... this.
>   (arm_configure_build_target): Rework to use new configuration data
>   tables.
>   (arm_print_tune_info): Rework for new configuration data tables.
>   * config/arm/arm-cpu-data.h: Regenerated.
>   * config/arm/arm-cpu.h: Regenerated.
> ---

Fix for using quirk bits when -mcpu and -march match up.

R.
diff --git a/gcc/config/arm/arm-cpu-data.h b/gcc/config/arm/arm-cpu-data.h
index da9d273..0e45b23 100644
--- a/gcc/config/arm/arm-cpu-data.h
+++ b/gcc/config/arm/arm-cpu-data.h
@@ -20,7 +20,7 @@
License along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */
 
-static const struct cpu_option cpu_opttab_arm9e[] = {
+static const cpu_arch_extension cpu_opttab_arm9e[] = {
   {
 "nofp", true,
 { ISA_ALL_FP, isa_nobit }
@@ -28,7 +28,7 @@ static const struct cpu_option cpu_opttab_arm9e[] = {
   { NULL, false, {isa_nobit}}
 };
 
-static const struct cpu_option cpu_opttab_arm946es[] = {
+static const cpu_arch_extension cpu_opttab_arm946es[] = {
   {
 "nofp", true,
 { ISA_ALL_FP, isa_nobit }
@@ -36,7 +36,7 @@ static const struct cpu_option cpu_opttab_arm946es[] = {
   { NULL, false, {isa_nobit}}
 };
 
-static const struct cpu_option cpu_opttab_arm966es[] = {
+static const cpu_arch_extension cpu_opttab_arm966es[] = {
   {
 "nofp", true,
 { ISA_ALL_FP, isa_nobit }
@@ -44,7 +44,7 @@ static const struct cpu_option cpu_opttab_arm966es[] = {
   { NULL, false, {isa_nobit}}
 };
 
-static const struct cpu_option cpu_opttab_arm968es[] = {
+static const cpu_arch_extension cpu_opttab_arm968es[] = {
   {
 "nofp", true,
 { ISA_ALL_FP, isa_nobit }
@@ -52,7 +52,7 @@ static const struct cpu_option cpu_opttab_arm968es[] = {
   { NULL, false, {isa_nobit}}
 };
 
-static const struct cpu_option cpu_opttab_arm10e[] = {
+static const cpu_arch_extension cpu_opttab_arm10e[] = {
   {
 "nofp", true,
 { ISA_ALL_FP, isa_nobit }
@@ -60,7 +60,7 @@ static const struct cpu_option cpu_opttab_arm10e[] = {
   { NULL, false, {isa_nobit}}
 };
 
-static const struct cpu_option cpu_opttab_arm1020e[] = {
+static const cpu_arch_extension cpu_opttab_arm1020e[] = {
   {
 "nofp", true,
 { ISA_ALL_FP, isa_nobit }
@@ -68,7 +68,7 @@ static const struct cpu_option cpu_opttab_arm1020e[] = {
   { NULL, false, {isa_nobit}}
 };
 
-static const struct cpu_option cpu_opttab_arm1022e[] = {
+static const cpu_arch_extension cpu_opttab_arm1022e[] = {
   {
 "nofp", true,
 { ISA_ALL_FP, isa_nobit }
@@ -76,7 +76,7 @@ static const struct cpu_option cpu_opttab_arm1022e[] = {
   { NULL, false, {isa_nobit}}
 };
 
-static const struct cpu_option cpu_opttab_arm926ejs[] = {
+static const cpu_arch_extension cpu_opttab_arm926ejs[] = {
   {
 "nofp", true,
 { ISA_ALL_FP, isa_nobit }
@@ -84,7 +84,7 @@ static const struct cpu_option cpu_opttab_arm926ejs[] = {
   { NULL, false, {isa_nobit}}
 };
 
-static const struct cpu_option cpu_opttab_arm1026ejs[] = {
+static const cpu_arch_extension cpu_opttab_arm1026ejs[] = {
   {
 "nofp", true,
 { ISA_ALL_FP, isa_nobit }
@@ -92,7 +92,7 @@ static const

Re: [PATCH 0/13] D: Submission of D Front End

2017-06-13 Thread Jeff Law

On 06/13/2017 02:05 AM, Richard Biener wrote:
> On Tue, Jun 13, 2017 at 2:09 AM, Iain Buclaw  wrote:
>> On 13 June 2017 at 01:22, Mike Stump  wrote:
>>> On Jun 12, 2017, at 11:34 AM, Richard Sandiford 
>>>  wrote:

>>>> I'm not sure who this is a question to really, but how much value is
>>>> there in reviewing the other patches?
>>>
>>>> Maybe people who know the
>>>> frontend interface well could comment on that part, but would anyone
>>>> here be able to do a meaningful review of the core frontend?  And AIUI
>>>> some of the patches are straight imports from an external upstream.
>>>>
>>>> I was just wondering whether, once 5, 6 and 7 have been reviewed,
>>>> accepting the rest would be a policy decision, or whether there
>>>> was a plan for someone to review the whole series.
>>>
>>> So Iain might not have the whole game plan pre-arranged.  My guess is that 
>>> it isn't yet.  So, technically, people can argue for or against the FE as 
>>> they want, but ultimately, the SC I think gets to make the decision in the 
>>> form of accepting the FE contribution and appointing a FE maintainer.  If 
>>> they say yes, then that person can technically self-review the changes to 
>>> the non-shared bits.  For the shared bits, the usual maintainer for those 
>>> bits should review and approve those bits.  For example, the testsuite 
>>> changes are reviewed by the testsuite maintainer; I've done that, so that's 
>>> done.  If there are doc changes, a doc reviewer will review those bits and 
>>> so on.
>>>
>>> I'd expect that for the changes that aren't shared, we treat it kinda like 
>>> we do for a new port.  There, we usually have a person or two go through 
>>> and weigh in where useful and help refine things a little.  If someone 
>>> wants to help out and volunteer to do this, they will.  If not, then we 
>>> just trust the FE coming in.  The SC will weigh in if they want the 
>>> contribution contingent upon a review.  Of course, the global reviewers 
>>> and/or the SC might be able to clarify, as they keep track of the little 
>>> details better than I, the above is just my guess to help get the process 
>>> started.
>>
>>
>> Right, I actually gave no forewarning other than via IRC, where it got
>> an acknowledgement from Jakub and Richi, if I recall right, the
>> response was asking if the SC has formally accepted D and myself as a
>> maintainer.  The answer is 'no' on that front.  My initial intent was
>> to get things in motion again, after they were abruptly halted 4 years
>> ago.
> 
> Yeah, it was to make sure the issue is raised with the SC.  Jeff?
David E. raised it earlier today.

Jeff

Re: [PATCH 04/30] [arm] Allow +opt on arbitrary cpu and architecture specifications

2017-06-13 Thread Richard Earnshaw (lists)

On 09/06/17 13:53, Richard Earnshaw wrote:
> 
> This is the main patch to provide the infrastructure for adding
> feature extensions to CPU and architecture specifications.  It does not,
> however, add all the extensions that we intend to support (just a small
> number to permit some basic testing).  Now, instead of having specific
> entries in the architecture table for variants such as armv8-a+crc, the
> crc extension is specified as an optional component of the armv8-a
> architecture entry.  Similar control can be added to CPU option names.
> In both cases the list of permitted options is controlled by the main
> architecture or CPU name to prevent arbitrary cross-products of options.
> 
>   * config/arm/arm-cpus.in (armv8-a): Add options crc, simd, crypto and
>   nofp.
>   (armv8-a+crc): Delete.
>   (armv8.1-a): Add options simd, crypto and nofp.
>   (armv8.2-a): Add options fp16, simd, crypto and nofp.
>   (armv8.2-a+fp16): Delete.
>   (armv8-m.main): Add option dsp.
>   (armv8-m.main+dsp): Delete.
>   (cortex-a8): Add fpu.  Add nofp option.
>   (cortex-a9): Add fpu.  Add nofp and nosimd options.
>   * config/arm/parsecpu.awk (gen_data): Generate option tables and
>   link to main cpu and architecture data structures.
>   (gen_comm_data): Only put isa attributes from the main architecture
>   in common tables.
>   (option): New statement for architecture and CPU entries.
>   * arm.c (struct cpu_option): New structure.
>   (struct processors): Add entry for options.
>   (arm_unrecognized_feature): New function.
>   (arm_parse_arch_cpu_name): Ignore any characters after the first
>   '+' character.
>   (arm_parse_arch_cpu_feature): New function.
>   (arm_configure_build_target): Separate out any CPU and architecture
>   features and parse separately.  Don't error out if -mfpu=auto is
>   used with only an architecture string.
>   (arm_print_asm_arch_directives): New function.
>   (arm_file_start): Call it.
>   * config/arm/arm-cpu-cdata.h: Regenerated.
>   * config/arm/arm-cpu-data.h: Likewise.
>   * config/arm/arm-tables.opt: Likewise.
> ---
>  gcc/config/arm/arm-cpu-cdata.h |  51 +++
>  gcc/config/arm/arm-cpu-data.h  | 305 
> ++---
>  gcc/config/arm/arm-cpus.in |  39 +++---
>  gcc/config/arm/arm-tables.opt  |  21 +--
>  gcc/config/arm/arm.c   | 191 +-
>  gcc/config/arm/parsecpu.awk| 110 +--
>  6 files changed, 555 insertions(+), 162 deletions(-)
> 

Updated with typo corrected.
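
A small sketch of the idea, for anyone skimming: each architecture entry
carries its own list of permitted extensions, so an extension is only
accepted when the selected base name lists it.  The table layout and the
extension_permitted function are hypothetical; the two sample rows are taken
from the ChangeLog above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: each architecture points at a NULL-terminated
   list of the extension names it accepts.  */
struct arch_entry
{
  const char *name;
  const char *const *extensions;
};

static const char *const armv8_a_ext[]
  = { "crc", "simd", "crypto", "nofp", NULL };
static const char *const armv8_m_main_ext[] = { "dsp", NULL };

static const struct arch_entry arch_table[] = {
  { "armv8-a", armv8_a_ext },
  { "armv8-m.main", armv8_m_main_ext },
  { NULL, NULL }
};

/* Return 1 if EXT is listed for ARCH, 0 otherwise (including when
   ARCH itself is unknown).  */
int
extension_permitted (const char *arch, const char *ext)
{
  for (const struct arch_entry *e = arch_table; e->name != NULL; e++)
    if (strcmp (e->name, arch) == 0)
      {
        for (const char *const *x = e->extensions; *x != NULL; x++)
          if (strcmp (*x, ext) == 0)
            return 1;
        return 0;
      }
  return 0;
}
```

Because the lists hang off the base entry, armv8-m.main+crc is rejected
without any table needing to enumerate that combination.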


diff --git a/gcc/config/arm/arm-cpu-cdata.h b/gcc/config/arm/arm-cpu-cdata.h
index b388812..878d226 100644
--- a/gcc/config/arm/arm-cpu-cdata.h
+++ b/gcc/config/arm/arm-cpu-cdata.h
@@ -577,6 +577,7 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
 "cortex-a8",
 {
   ISA_ARMv7a,
+  ISA_VFPv3,ISA_NEON,
   isa_nobit
 },
   },
@@ -584,6 +585,7 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
 "cortex-a9",
 {
   ISA_ARMv7a,
+  ISA_VFPv3,ISA_NEON,
   isa_nobit
 },
   },
@@ -693,63 +695,63 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
   {
 "cortex-a32",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "cortex-a35",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "cortex-a53",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "cortex-a57",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "cortex-a72",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "cortex-a73",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "exynos-m1",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "falkor",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "qdf24xx",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
@@ -763,28 +765,28 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
   {
 "cortex-a57.cortex-a53",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "cortex-a72.cortex-a53",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "cortex-a73.cortex-a35",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
   {
 "cortex-a73.cortex-a53",
 {
-  ISA_ARMv8a,isa_bit_crc32,
+  ISA_ARMv8a,
   isa_nobit
 },
   },
@@ -798,7 +800,7 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
   {
 "cortex-m33",
 {
-

Re: [PATCH 01/30] [arm] Use strings for -march, -mcpu and -mtune options

2017-06-13 Thread Richard Earnshaw (lists)

On 09/06/17 13:53, Richard Earnshaw wrote:
> 
> In order to support more complex specifications for cpus and architectures
> we need to move away from using enumerations to represent the set of
> permitted options.  This basic change just moves the option parsing
> infrastructure over to that, but changes nothing more beyond generating
> a hint when the specified option does not match a known target (previously
> the help option was able to print out all the permitted values, but we
> can no longer do that).
> 
>   * config/arm/arm.opt (x_arm_arch_string): New TargetSave option.
>   (x_arm_cpu_string, x_arm_tune_string): Likewise.
>   (march, mcpu, mtune): Convert to string-based options.
>   * config/arm/arm.c (arm_print_hint_for_core_or_arch): New function.
>   (arm_parse_arch_cpu_name): New function.
>   (arm_configure_build_target): Use arm_parse_arch_cpu_name to
>   identify selected architecture or CPU.
>   (arm_option_save): New function.
>   (TARGET_OPTION_SAVE): Redefine.
>   (arm_option_restore): Restore string options.
>   (arm_option_print): Print string options.

Updated with typo fixed.
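
For context on the "did you mean" hint, a minimal sketch of how a near-miss
suggestion can be picked from a list of names.  This is not GCC's
implementation: suggest_name, the plain Levenshtein distance and the
"at most half the typed length" cutoff are all assumptions made for the
example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAME 64

/* Classic single-row Levenshtein distance.  Returns (size_t) -1 if B
   is longer than MAX_NAME.  */
static size_t
edit_distance (const char *a, const char *b)
{
  size_t la = strlen (a), lb = strlen (b);
  size_t row[MAX_NAME + 1];

  if (lb > MAX_NAME)
    return (size_t) -1;

  for (size_t j = 0; j <= lb; j++)
    row[j] = j;

  for (size_t i = 1; i <= la; i++)
    {
      size_t diag = row[0];	/* Previous row, column j - 1.  */
      row[0] = i;
      for (size_t j = 1; j <= lb; j++)
        {
          size_t up = row[j];	/* Previous row, column j.  */
          size_t best = diag + (a[i - 1] == b[j - 1] ? 0 : 1);
          if (up + 1 < best)
            best = up + 1;
          if (row[j - 1] + 1 < best)
            best = row[j - 1] + 1;
          row[j] = best;
          diag = up;
        }
    }
  return row[lb];
}

/* Hypothetical helper: return the entry of the NULL-terminated array
   NAMES closest to TARGET, or NULL if nothing is close enough.  */
const char *
suggest_name (const char *target, const char *const *names)
{
  const char *best = NULL;
  size_t best_dist = (size_t) -1;

  for (; *names != NULL; names++)
    {
      size_t d = edit_distance (target, *names);
      if (d < best_dist)
        {
          best_dist = d;
          best = *names;
        }
    }

  /* Assumed cutoff: only suggest when at most half the typed name
     would need to change.  */
  if (best != NULL && best_dist <= strlen (target) / 2)
    return best;
  return NULL;
}
```

With the names armv7-a, armv8-a and armv8.1-a, a mistyped "armv8a" yields
"armv8-a", while something unrelated such as "xyz" yields no hint and only
the list of valid arguments would be printed.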

R.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 42b0e86..5288000 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -233,6 +233,7 @@ static tree arm_build_builtin_va_list (void);
 static void arm_expand_builtin_va_start (tree, rtx);
 static tree arm_gimplify_va_arg_expr (tree, tree, gimple_seq *, gimple_seq *);
 static void arm_option_override (void);
+static void arm_option_save (struct cl_target_option *, struct gcc_options *);
 static void arm_option_restore (struct gcc_options *,
 struct cl_target_option *);
 static void arm_override_options_after_change (void);
@@ -413,6 +414,9 @@ static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE
 #define TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE arm_override_options_after_change
 
+#undef TARGET_OPTION_SAVE
+#define TARGET_OPTION_SAVE arm_option_save
+
 #undef TARGET_OPTION_RESTORE
 #define TARGET_OPTION_RESTORE arm_option_restore
 
@@ -2924,9 +2928,22 @@ arm_override_options_after_change (void)
   arm_override_options_after_change_1 (&global_options);
 }
 
+/* Implement TARGET_OPTION_SAVE.  */
+static void
+arm_option_save (struct cl_target_option *ptr, struct gcc_options *opts)
+{
+  ptr->x_arm_arch_string = opts->x_arm_arch_string;
+  ptr->x_arm_cpu_string = opts->x_arm_cpu_string;
+  ptr->x_arm_tune_string = opts->x_arm_tune_string;
+}
+
+/* Implement TARGET_OPTION_RESTORE.  */
 static void
-arm_option_restore (struct gcc_options *, struct cl_target_option *ptr)
+arm_option_restore (struct gcc_options *opts, struct cl_target_option *ptr)
 {
+  opts->x_arm_arch_string = ptr->x_arm_arch_string;
+  opts->x_arm_cpu_string = ptr->x_arm_cpu_string;
+  opts->x_arm_tune_string = ptr->x_arm_tune_string;
   arm_configure_build_target (&arm_active_target, ptr, &global_options_set,
 			  false);
 }
@@ -3044,6 +3061,46 @@ arm_initialize_isa (sbitmap isa, const enum isa_feature *isa_bits)
 bitmap_set_bit (isa, *(isa_bits++));
 }
 
+/* List the permitted CPU or architecture names.  If TARGET is a near
+   miss for an entry, print out the suggested alternative.  */
+static void
+arm_print_hint_for_core_or_arch (const char *target,
+ const struct processors *list)
+{
+  auto_vec<const char*> candidates;
+  for (; list->name != NULL; list++)
+candidates.safe_push (list->name);
+  char *s;
+  const char *hint = candidates_list_and_hint (target, s, candidates);
+  if (hint)
+inform (input_location, "valid arguments are: %s; did you mean %qs?",
+	s, hint);
+  else
+inform (input_location, "valid arguments are: %s", s);
+
+  XDELETEVEC (s);
+}
+
+/* Parse the base component of a CPU or architecture selection in
+   LIST.  Return a pointer to the entry in the architecture table.
+   OPTNAME is the name of the option we are parsing and can be used if
+   a diagnostic is needed.  */
+static const struct processors *
+arm_parse_arch_cpu_name (const struct processors *list, const char *optname,
+			 const char *target)
+{
+  const struct processors *entry;
+  for (entry = list; entry->name != NULL; entry++)
+{
+  if (streq (entry->name, target))
+	return entry;
+}
+
+  error_at (input_location, "unrecognized %s target: %s", optname, target);
+  arm_print_hint_for_core_or_arch (target, list);
+  return NULL;
+}
+
 static sbitmap isa_all_fpubits;
 static sbitmap isa_quirkbits;
 
@@ -3065,17 +3122,20 @@ arm_configure_build_target (struct arm_build_target *target,
   target->core_name = NULL;
   target->arch_name = NULL;
 
-  if (opts_set->x_arm_arch_option)
-arm_selected_arch = &all_architectures[opts->x_arm_arch_option];
-
-  if (opts_set->x_arm_cpu_option)
+  if (opts_set->x_arm_arch_string)
+arm_selected_arch = arm_parse_arch_cpu_name (all_architectures,
+		 "-march",
+		 opts->x_arm_arch_string);
+  if (opts_set->x_arm_cpu_string)
 {
-

Re: [PATCH] Fix ICE with -Wduplicated-branches (PR objc/80949)

2017-06-13 Thread Jakub Jelinek

On Tue, Jun 13, 2017 at 07:01:11PM +0200, Marek Polacek wrote:
> -Wduplicated-branches can crash on a weird ObjC testcase that we haven't
> managed to reduce, so no testcase attached.  On that testcase, we end up
> calling do_warn_duplicated_branches with null COND_EXPR_THEN, and the code
> wasn't prepared to handle that.  The fix is trivial.  Eric G. verified that
> this indeed fixes the ICE (thanks).
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2017-06-13  Marek Polacek  
> 
>   PR objc/80949
>   * c-warn.c (do_warn_duplicated_branches): Return if any of the
>   branches is null.

Ok, thanks.

> --- gcc/c-family/c-warn.c
> +++ gcc/c-family/c-warn.c
> @@ -2354,8 +2354,8 @@ do_warn_duplicated_branches (tree expr)
>tree thenb = COND_EXPR_THEN (expr);
>tree elseb = COND_EXPR_ELSE (expr);
>  
> -  /* Don't bother if there's no else branch.  */
> -  if (elseb == NULL_TREE)
> +  /* Don't bother if any of the branches is missing.  */
> +  if (thenb == NULL_TREE || elseb == NULL_TREE)
>  return;
>  
>/* And don't warn for empty statements.  */
> 
>   Marek

Jakub

Re: [PATCH][GCC][AArch64] optimize float immediate moves (1 /4) - infrastructure.

2017-06-13 Thread Richard Sandiford

James Greenhalgh  writes:
>> +
>> + /* First determine number of instructions to do the move
>> +as an integer constant.  */
>> +if (!aarch64_float_const_representable_p (x)
>> +&& !aarch64_can_const_movi_rtx_p (x, mode)
>> +&& aarch64_float_const_rtx_p (x))
>> +  {
>> +unsigned HOST_WIDE_INT ival;
>> +bool succeed = aarch64_reinterpret_float_as_int (x, &ival);
>> +gcc_assert (succeed);
>
> Just:
>
>   gcc_assert (aarch64_reinterpret_float_as_int (x, &ival));
>
> There's not much extra information in the name "succeed", so no extra value
> in the variable assignment.

That's not the same thing with --enable-checking=no
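
To spell out the difference, a small sketch, assuming a configuration in
which the assertion macro does not evaluate its argument at all.
CHECK_DISABLED and reinterpret_bits are hypothetical stand-ins for
gcc_assert in such a build and for the conversion helper; they are not
the real definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an assertion macro in a build where
   checking is off: the operand is parsed but never evaluated.  */
#define CHECK_DISABLED(expr) ((void) (0 && (expr)))

/* Hypothetical stand-in for the helper: stores the bit pattern of X
   in *OUT and reports success.  */
static bool
reinterpret_bits (double x, uint64_t *out)
{
  memcpy (out, &x, sizeof *out);
  return true;
}

/* Side effect happens first, then the result is checked: IVAL is set
   whether or not checking is enabled.  */
uint64_t
with_variable (double x)
{
  uint64_t ival = 0;
  bool succeed = reinterpret_bits (x, &ival);
  CHECK_DISABLED (succeed);
  return ival;
}

/* Side effect buried in the assertion: with checking disabled the
   call never runs and IVAL keeps its initial value.  */
uint64_t
inside_assert (double x)
{
  uint64_t ival = 0;
  CHECK_DISABLED (reinterpret_bits (x, &ival));
  return ival;
}
```

Under that assumption with_variable (1.0) returns the bit pattern of 1.0
while inside_assert (1.0) returns 0, which is why the separate variable
is worth keeping.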

>> +
>> +machine_mode imode = mode == HFmode ? SImode : int_mode_for_mode 
>> (mode);
>> +int ncost = aarch64_internal_mov_immediate
>> +(NULL_RTX, gen_int_mode (ival, imode), false, imode);
>> +*cost += COSTS_N_INSNS (ncost);
>> +return true;
>> +  }

[PATCH] Fix ICE with -Wduplicated-branches (PR objc/80949)

2017-06-13 Thread Marek Polacek

-Wduplicated-branches can crash on a weird ObjC testcase that we haven't
managed to reduce, so no testcase attached.  On that testcase, we end up
calling do_warn_duplicated_branches with null COND_EXPR_THEN, and the code
wasn't prepared to handle that.  The fix is trivial.  Eric G. verified that
this indeed fixes the ICE (thanks).

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-06-13  Marek Polacek  

PR objc/80949
* c-warn.c (do_warn_duplicated_branches): Return if any of the
branches is null.

--- gcc/c-family/c-warn.c
+++ gcc/c-family/c-warn.c
@@ -2354,8 +2354,8 @@ do_warn_duplicated_branches (tree expr)
   tree thenb = COND_EXPR_THEN (expr);
   tree elseb = COND_EXPR_ELSE (expr);
 
-  /* Don't bother if there's no else branch.  */
-  if (elseb == NULL_TREE)
+  /* Don't bother if any of the branches is missing.  */
+  if (thenb == NULL_TREE || elseb == NULL_TREE)
 return;
 
   /* And don't warn for empty statements.  */

Marek

[PATCH v9] add -fpatchable-function-entry=N,M option

2017-06-13 Thread Torsten Duwe

Changes since v8:

  * Documentation changes as requested by Sandra
  * 3 functional test cases added

Torsten


gcc/c-family/ChangeLog
2017-06-13  Torsten Duwe  

* c-attribs.c (c_common_attribute_table): Add entry for
"patchable_function_entry".

gcc/lto/ChangeLog
2017-06-13  Torsten Duwe  

* lto-lang.c (lto_attribute_table): Add entry for
"patchable_function_entry".

gcc/ChangeLog
2017-06-13  Torsten Duwe  

* common.opt: Introduce -fpatchable-function-entry
command line option, and its variables function_entry_patch_area_size
and function_entry_patch_area_start.
* opts.c (common_handle_option): Add -fpatchable_function_entry_ case,
including a two-value parser.
* target.def (print_patchable_function_entry): New target hook.
* targhooks.h (default_print_patchable_function_entry): New function.
* targhooks.c (default_print_patchable_function_entry): Likewise.
* toplev.c (process_options): Switch off IPA-RA if
patchable function entries are being generated.
* varasm.c (assemble_start_function): Look at the
patchable-function-entry command line switch and current
function attributes and maybe generate NOP instructions by
calling the print_patchable_function_entry hook.
* doc/extend.texi: Document patchable_function_entry attribute.
* doc/invoke.texi: Document -fpatchable_function_entry
command line option.
* doc/tm.texi.in (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY):
New target hook.
* doc/tm.texi: Likewise.

gcc/testsuite/ChangeLog
2017-06-13  Torsten Duwe  

* c-c++-common/patchable_function_entry-default.c: New test.
* c-c++-common/patchable_function_entry-decl.c: Likewise.
* c-c++-common/patchable_function_entry-definition.c: Likewise.
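
For reviewers who want the option syntax at a glance, here is a stand-alone
sketch of a two-value "N[,M]" parser.  It is not the code added to opts.c;
parse_patch_area is a hypothetical name, and the rules that M defaults to 0
and must not exceed N are assumptions for this example.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse "N" or "N,M" from ARG.  On success store
   N in *SIZE and M (0 if absent) in *START and return 0; return -1 on
   malformed input, negative values, or M > N.  */
int
parse_patch_area (const char *arg, long *size, long *start)
{
  char *end;
  long n, m = 0;

  errno = 0;
  n = strtol (arg, &end, 10);
  if (end == arg || errno != 0 || n < 0)
    return -1;

  if (*end == ',')
    {
      const char *second = end + 1;
      m = strtol (second, &end, 10);
      if (end == second || errno != 0 || m < 0)
        return -1;
    }

  /* Reject trailing junk and an entry point outside the area.  */
  if (*end != '\0' || m > n)
    return -1;

  *size = n;
  *start = m;
  return 0;
}
```

"3,1" then means three NOPs in total, with the entry point placed after the
first of them; a plain "5" leaves the entry point at the start of the area.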

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index f2a88e147ba..31137ce0433 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -139,6 +139,8 @@ static tree handle_bnd_variable_size_attribute (tree *, 
tree, tree, int, bool *)
 static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
 static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
+static tree handle_patchable_function_entry_attribute (tree *, tree, tree,
+  int, bool *);
 
 /* Table of machine-independent attributes common to all C-like languages.
 
@@ -345,6 +347,9 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_bnd_instrument, false },
   { "fallthrough",   0, 0, false, false, false,
  handle_fallthrough_attribute, false },
+  { "patchable_function_entry",1, 2, true, false, false,
+ handle_patchable_function_entry_attribute,
+ false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -3173,3 +3178,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, int,
   *no_add_attrs = true;
   return NULL_TREE;
 }
+
+static tree
+handle_patchable_function_entry_attribute (tree *, tree, tree, int, bool *)
+{
+  /* Nothing to be done here.  */
+  return NULL_TREE;
+}
diff --git a/gcc/common.opt b/gcc/common.opt
index a5c3aeaa336..f542590650c 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
 Variable
 int flag_debug_asm
 
+; How many NOP insns to place at each function entry by default
+Variable
+HOST_WIDE_INT function_entry_patch_area_size
+
+; And how far the real asm entry point is into this area
+Variable
+HOST_WIDE_INT function_entry_patch_area_start
 
 ; Balance between GNAT encodings and standard DWARF to emit.
 Variable
@@ -2022,6 +2029,10 @@ fprofile-reorder-functions
 Common Report Var(flag_profile_reorder_functions)
 Enable function reordering that improves code placement.
 
+fpatchable-function-entry=
+Common Joined Optimization
+Insert NOP instructions at each function entry.
+
 frandom-seed
 Common Var(common_deferred_options) Defer
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1255995eb78..d09ccd90c42 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3083,6 +3083,25 @@ that affect more than one function.
 This attribute should be used for debugging purposes only.  It is not
 suitable in production code.
 
+@item patchable_function_entry
+@cindex @code{patchable_function_entry} function attribute
+@cindex extra NOP instructions at the function entry point
+In case the target's text segment can be made writable at run time by
+any means, padding the function entry with a number of NOPs can be
+used to provide a universal tool for instrumentation.  Usually,
+patchable function

Re: [PATCH][GCC][AArch64] optimize float immediate moves (1 /4) - infrastructure.

2017-06-13 Thread James Greenhalgh

This patch is pretty huge, are there any opportunities to further split
it to aid review?

I have some comments in line.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index a069427f576f6bd7336bbe4497249773bd33d138..2ab2d96e40e80a79b5648046ca2d6e202d3939a2 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -147,6 +147,8 @@ static bool aarch64_builtin_support_vector_misalignment (machine_mode mode,
>const_tree type,
>int misalignment,
>bool is_packed);
> +static machine_mode
> +aarch64_simd_container_mode (machine_mode mode, unsigned width);
>  
>  /* Major revision number of the ARM Architecture implemented by the target.  */
>  unsigned aarch64_architecture_version;
> @@ -4613,6 +4615,66 @@ aarch64_legitimize_address_displacement (rtx *disp, rtx *off, machine_mode mode)
>return true;
>  }
>  
> +/* Return the binary representation of floating point constant VALUE in INTVAL.
> +   If the value cannot be converted, return false without setting INTVAL.
> +   The conversion is done in the given MODE.  */
> +bool
> +aarch64_reinterpret_float_as_int (rtx value, unsigned HOST_WIDE_INT *intval)
> +{
> +  machine_mode mode = GET_MODE (value);
> +  if (GET_CODE (value) != CONST_DOUBLE
> +  || !SCALAR_FLOAT_MODE_P (mode)
> +  || GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT)
> +return false;
> +
> +  unsigned HOST_WIDE_INT ival = 0;
> +
> +  /* Only support up to DF mode.  */
> +  gcc_assert (GET_MODE_BITSIZE (mode) <= 64);
> +  int needed = GET_MODE_BITSIZE (mode) == 64 ? 2 : 1;
> +
> +  long res[2];
> +  real_to_target (res,
> +   CONST_DOUBLE_REAL_VALUE (value),
> +   REAL_MODE_FORMAT (mode));
> +
> +  ival = zext_hwi (res[needed - 1], 32);
> +  for (int i = needed - 2; i >= 0; i--)
> +{
> +  ival <<= 32;
> +  ival |= zext_hwi (res[i], 32);
> +}
> +
> +  *intval = ival;

???

Two cases here, needed is either 2 if GET_MODE_BITSIZE (mode) == 64, or it
is 1 otherwise. So i starts at either -1 or 0. So this for loop either runs
0 or 1 times. What am I missing? I'm sure this is all an indirect way of
writing:

  *intval = 0;
  if (GET_MODE_BITSIZE (mode) == 64)
    *intval = zext_hwi (res[1], 32) << 32;
  *intval |= zext_hwi (res[0], 32);
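
To make the equivalence concrete, here is a small standalone sketch (not GCC
code) that compares the loop from the patch with the straight-line form above.
zext32 is a hypothetical stand-in for zext_hwi (x, 32), combine_loop and
combine_direct are made-up names, and the sample words are arbitrary.  The
static_asserts only check that both forms agree for needed == 1 and
needed == 2.

```cpp
#include <cstdint>

// Hypothetical stand-in for GCC's zext_hwi (value, 32): keep the low 32 bits.
constexpr std::uint64_t zext32 (std::uint64_t v)
{
  return v & 0xffffffffu;
}

// The loop form from the patch.  NEEDED is 2 for 64-bit modes, 1 otherwise,
// so the loop body runs at most once.
constexpr std::uint64_t combine_loop (const long *res, int needed)
{
  std::uint64_t ival = zext32 (res[needed - 1]);
  for (int i = needed - 2; i >= 0; i--)
    {
      ival <<= 32;
      ival |= zext32 (res[i]);
    }
  return ival;
}

// The straight-line form suggested in the review.
constexpr std::uint64_t combine_direct (const long *res, bool is_64bit)
{
  std::uint64_t ival = 0;
  if (is_64bit)
    ival = zext32 (res[1]) << 32;
  ival |= zext32 (res[0]);
  return ival;
}

// Arbitrary sample data: res[0] is the low word, res[1] the high word.
constexpr long words[2] = { 0x54442d18L, 0x400921fbL };

static_assert (combine_loop (words, 2) == combine_direct (words, true), "");
static_assert (combine_loop (words, 1) == combine_direct (words, false), "");
```

Compiling this with -std=c++14 or later is enough to run the checks, since
they are evaluated at compile time.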



> +  return true;
> +}
> +
> +/* Return TRUE if rtx X is an immediate constant that can be moved using a
> +   single MOV(+MOVK) followed by an FMOV.  */
> +bool
> +aarch64_float_const_rtx_p (rtx x)
> +{
> +  machine_mode mode = GET_MODE (x);
> +  if (mode == VOIDmode)
> +return false;
> +
> +  /* Determine whether it's cheaper to write float constants as
> + mov/movk pairs over ldr/adrp pairs.  */
> +  unsigned HOST_WIDE_INT ival;
> +
> +  if (GET_CODE (x) == CONST_DOUBLE
> +  && SCALAR_FLOAT_MODE_P (mode)
> +  && aarch64_reinterpret_float_as_int (x, &ival))
> +{
> +  machine_mode imode = mode == HFmode ? SImode : int_mode_for_mode (mode);
> +  int num_instr = aarch64_internal_mov_immediate
> + (NULL_RTX, gen_int_mode (ival, imode), false, imode);
> +  return num_instr < 3;

Should this cost model be static on a magic number? Is it not the case that
the decision should be based on the relative speeds of a memory access
compared with mov/movk/fmov ?

> +}
> +
> +  return false;
> +}
> +
>  /* Return TRUE if rtx X is immediate constant 0.0 */
>  bool
>  aarch64_float_const_zero_rtx_p (rtx x)
> @@ -4625,6 +4687,46 @@ aarch64_float_const_zero_rtx_p (rtx x)
>return real_equal (CONST_DOUBLE_REAL_VALUE (x), );
>  }
>  
> +/* Return TRUE if rtx X is immediate constant that fits in a single
> +   MOVI immediate operation.  */
> +bool
> +aarch64_can_const_movi_rtx_p (rtx x, machine_mode mode)
> +{
> +  if (!TARGET_SIMD)
> + return false;
> +
> +  machine_mode vmode, imode;
> +  unsigned HOST_WIDE_INT ival;
> +
> +  /* Don't write float constants out to memory.  */
> +  if (GET_CODE (x) == CONST_DOUBLE
> +  && SCALAR_FLOAT_MODE_P (mode))
> +{
> +  if (!aarch64_reinterpret_float_as_int (x, &ival))
> + return false;
> +
> +  imode = int_mode_for_mode (mode);
> +}
> +  else if (GET_CODE (x) == CONST_INT
> +&& SCALAR_INT_MODE_P (mode))
> +{
> +   imode = mode;
> +   ival = INTVAL (x);
> +}
> +  else
> +return false;
> +
> +  unsigned width = GET_MODE_BITSIZE (mode) * 2;

Why * 2? It isn't obvious to me from my understanding of movi why that would
be better than just clamping to 64-bit?

> +  if (width < GET_MODE_BITSIZE (DFmode))
> + width = GET_MODE_BITSIZE (DFmode);
> +
> +  vmode = aarch64_simd_container_mode (imode, width);
> +  rtx v_op = aarch64_simd_gen_const_vector_dup (vmode, ival);
> +
> +  return aarch64_simd_valid_immediate (v_op, vmode, false, NULL);
> +}
> +
> +
>

Re: [PATCH 00/30] [ARM] Reworking the -mcpu, -march and -mfpu options

2017-06-13 Thread Christophe Lyon

On 13 June 2017 at 17:25, Richard Earnshaw (lists)
 wrote:
> On 12/06/17 15:34, Richard Earnshaw (lists) wrote:
>> On 12/06/17 12:49, Christophe Lyon wrote:
>>> On 10 June 2017 at 01:27, Richard Earnshaw (lists)
>>>  wrote:
 On 09/06/17 23:45, Christophe Lyon wrote:
> Hi Richard,
>
>
> On 9 June 2017 at 14:53, Richard Earnshaw  
> wrote:
>>
>> During the ARM BoF at the Cauldron last year I mentioned that I wanted
>> to rework the way GCC on ARM handles the command line options.  The
>> problem was that most users, and even many experts, can't remember
>> which FPU/SIMD unit comes with which CPU and that consequently many
> >> users were inadvertently generating sub-optimal code for their system.
>>
>> This patch series implements the proposed change and provides support
>> for a generic way of adding optional features to architectures and CPU
>> names.  The documentation patches at the end of the series explain the
>> new syntax, so I won't repeat all that here.  Suffice to say here that
>> the result is that the -mfpu option now defaults to 'auto', which
>> allows the compiler to infer the floating-point and simd options from
>> the CPU/architecture options and that these options can normally be
>> expressed in a context-specific manner like +simd or +fp without
>> having to know precisely which variant is implemented.  Long term I'd
>> like to deprecate -mfpu and entirely move over to the new syntax; but
>> it's too early to start that process now.
>>
>> All the patches in the series should build a working basic compiler,
>> but the multilib selection will not work correctly until the relevant
>> patches towards the end are applied.  It is not really feasible to
>> retain that functionality without collapsing too many of the patches
>> together into one hunk.  It's also possible that some tests in the
>> testsuite may exhibit transient misbehaviour, but there should be no
>> regressions by the end of the sequence (some tests no-longer run in
>> the default configurations because the default CPU does not have
>> floating-point support).
>>
>> Just two patches are to the generic code, but both are fairly trivial.
>> One permits the sbitmap code to be used in the driver programs and the
>> other provides a way of escaping the meta-character in some multilib
>> reuse strings.
>>
>> I won't apply any of this series until those two patches have been
>> approved, and I won't commit anything before the middle of next week
>> even then.  This is a fairly complex change and it deserves some time
>> for people to comment before committing.
>>
>> R.
>>
>> Richard Earnshaw (30):
>>   [arm] Use strings for -march, -mcpu and -mtune options
>>   [arm] Rewrite -march and -mcpu options for passing to the assembler
>>   [arm] Don't pass -mfpu=auto through to the assembler.
>>   [arm] Allow +opt on arbitrary cpu and architecture specifications
>>   [arm] Add architectural options
>>   [arm] Add default FPUs for CPUs.
>>   [build] Make sbitmap code available to the driver programs
>>   [arm] Split CPU, architecture and tuning data tables.
>>   [ARM] Move cpu and architecture option name parsing code to
>> arm-common.c
>>   [arm] Use standard option parsing code for detecting thumb-only
>> targets
>>   [arm] Allow CPU and architecture extensions to be defined as aliases
>>   [arm] Allow new extended syntax CPU and architecture names during
>> configure
>>   [arm] Force a CPU default in the config args defaults list.
>>   [arm] Generate a canonical form for -march
>>   [arm] Make -mfloat-abi=softfp work when there are no FPU instructions
>>   [arm] Update basic multilib configuration
>>   [arm] Make 'auto' the default FPU selection option.
>>   [arm] Rewrite t-aprofile using new selector methodology
>>   [arm] Explicitly set .fpu in cmse_nonsecure_call.S
>>   [genmultilib] Allow explicit periods to be escaped in MULTILIB_REUSE
>>   [arm][testsuite] Use -march=armv7-a+fp when testing hard-float ABI.
>>   [arm] Rewrite t-rmprofile multilib specification
>>   [arm][rtems] Update t-rtems for new option framework
>>   [arm][linux-eabi] Ensure all multilib variables are reset
>>   [arm][phoenix] reset all multilib variables
>>   [arm] Rework multlib builds for symbianelf
>>   [arm][fuchsia] Rework multilib support
>>   [arm] Add a few missing architecture extension options.
>>   [arm][doc] Document new -march= syntax.
>>   [arm][doc] Document changes to -mcpu, -mtune and -mfpu.
>>
>>  gcc/Makefile.in   |2 +-
>>  gcc/common/config/arm/arm-common.c|  651 +++-
>>

Re: [libgomp, OpenACC] Add more map handling for enter/exit data directives

2017-06-13 Thread Jakub Jelinek

On Tue, Jun 13, 2017 at 06:48:18PM +0800, Chung-Lin Tang wrote:
> Hi Jakub,
> this patch has been posted before, but hasn't really been reviewed yet:
> https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01927.html
> 
> This has been deployed on gomp-4_0-branch for a long time, and was re-tested
> on current trunk, test results okay.

I don't see sufficient information on what you want to change and why
and whether the changes are backwards compatible (say will a valid
OpenACC 2.0 program compiled by GCC 7 work against both libgomp from GCC 7
as well as one with this patch)?
Can you write a few paragraphs on it (doesn't have to be comments in the
source, mailing list is fine)?

> @@ -318,25 +337,24 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>   {
> unsigned char kind = kinds[i] & 0xff;
>  
> -   /* Scan for PSETs.  */
> -   int psets = find_pset (i, mapnum, kinds);
> +   /* Scan for pointers and PSETs.  */
> +   int pointer = find_pointer (i, mapnum, kinds);
>  
> -   if (!psets)
> +   if (!pointer)
>   {
> switch (kind)
>   {
> - case GOMP_MAP_POINTER:
> -   gomp_acc_insert_pointer (1, [i], [i],
> - [i]);
> + case GOMP_MAP_ALLOC:
> +   acc_present_or_create (hostaddrs[i], sizes[i]);
> break;
>   case GOMP_MAP_FORCE_ALLOC:
> acc_create (hostaddrs[i], sizes[i]);
> break;
> - case GOMP_MAP_FORCE_PRESENT:
> + case GOMP_MAP_TO:
> acc_present_or_copyin (hostaddrs[i], sizes[i]);
> break;
>   case GOMP_MAP_FORCE_TO:
> -   acc_present_or_copyin (hostaddrs[i], sizes[i]);
> +   acc_copyin (hostaddrs[i], sizes[i]);
> break;

E.g. in this hunk you remove GOMP_MAP_POINTER and GOMP_MAP_FORCE_PRESENT
handling and significantly change GOMP_MAP_FORCE_TO.  The first two will
now gomp_fatal, right?  Can they ever appear in GOACC_enter_exit_data
calls?

>   default:
> gomp_fatal (" GOACC_enter_exit_data UNHANDLED kind 0x%.2x",

Jakub

Re: [PATCH] Finish implementing P0426R1 "Constexpr for std::char_traits" for C++17

2017-06-13 Thread Jonathan Wakely


On 12/06/17 23:28 +0100, Pedro Alves wrote:

On 06/05/2017 03:27 PM, Jonathan Wakely wrote:


Pedro, this is OK for trunk now we're in stage 1. Please go ahead and
commit it - thanks.


Thanks Jonathan.  I've pushed it in now.



It's probably safe for gcc-7-branch too, but let's leave it on trunk
for a while first.


OK.

BTW, for extra thoroughness, to confirm we're handling both
const & non-const arrays correctly, I had written this testsuite
tweak too.  Would you like to have this in?


Yes please, this looks useful.



@@ -98,7 +220,12 @@ static_assert( test_compare() );
static_assert( test_length() );
static_assert( test_find() );

-struct C { unsigned char c; };
+struct C
+{
+  C() = default;
+  constexpr C(auto c_) : c(c_) {}


Placeholder types as function parameters are non-standard, so this
would fail with -pedantic-errors.

How about:

struct C {
 constexpr C(unsigned char c_ = 0) : c(c_) { }
 unsigned char c;
};
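
As a quick sanity check of that suggestion, the following sketch shows that a
single constexpr constructor with a default argument covers both the
default-constructed and the value-constructed cases without a placeholder
parameter.  The operator== matches the one in the test; the variable names
zero, a and arr are only illustrative.

```cpp
struct C {
  // One constructor serves as both the default and the converting constructor.
  constexpr C(unsigned char c_ = 0) : c(c_) { }
  unsigned char c;
};

constexpr bool operator==(const C& c1, const C& c2) { return c1.c == c2.c; }

// Illustrative objects: default-constructed, value-constructed, and an array
// mixing both, all usable in constant expressions.
constexpr C zero{};
constexpr C a('a');
constexpr C arr[] = { C('a'), C() };

static_assert( zero == C(0), "default argument yields c == 0" );
static_assert( a.c == 'a', "value constructor stores its argument" );
static_assert( arr[0] == a, "" );
static_assert( arr[1] == zero, "" );
```

This is valid C++11, so it should also be accepted with -pedantic-errors.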



+  unsigned char c;
+};
constexpr bool operator==(const C& c1, const C& c2) { return c1.c == c2.c; }
constexpr bool operator<(const C& c1, const C& c2) { return c1.c < c2.c; }
static_assert( test_assign() );
--
2.5.5

Re: [PATCH 2/2] [MSP430] Fix issues handling .persistent attribute (PR 78818)

2017-06-13 Thread Nick Clifton

Hi Jozef,

> Ok for trunk and gcc-7-branch?

Approved - please apply (to both).

Cheers
  Nick

Re: [PATCH 01/30] [arm] Use strings for -march, -mcpu and -mtune options

2017-06-13 Thread Richard Earnshaw (lists)

On 13/06/17 14:23, Christophe Lyon wrote:
> On 9 June 2017 at 14:53, Richard Earnshaw  wrote:
>>
>> In order to support more complex specifications for cpus and architectures
>> we need to move away from using enumerations to represent the set of
>> permitted options.  This basic change just moves the option parsing
>> infrastructure over to that, but changes nothing more beyond generating
>> a hint when the specified option does not match a known target (previously
>> the help option was able to print out all the permitted values, but we
> >> can no longer do that).
>>
>> * config/arm/arm.opt (x_arm_arch_string): New TargetSave option.
>> (x_arm_cpu_string, x_arm_tune_string): Likewise.
>> (march, mcpu, mtune): Convert to string-based options.
>> * config/arm/arm.c (arm_print_hint_for_core_or_arch): New function.
>> (arm_parse_arch_cpu_name): New function.
>> (arm_configure_build_target): Use arm_parse_arch_cpu_name to
>> identify selected architecture or CPU.
>> (arm_option_save): New function.
>> (TARGET_OPTION_SAVE): Redefine.
>> (arm_option_restore): Restore string options.
>> (arm_option_print): Print string options.
>> ---
>>  gcc/config/arm/arm.c   | 92 
>> --
>>  gcc/config/arm/arm.opt | 15 ++--
>>  2 files changed, 94 insertions(+), 13 deletions(-)
>>
> 
> 
> I've noticed a typo (:premitted):
> +/* List the premitted CPU or architecture names.  If TARGET is a near
> 

Thanks.  That code gets moved to arm-common.c later on.  I'll fix up the
moved copies.

R.

Re: C/C++ PATCH to implement -Wmultistatement-macros (PR c/80116)

2017-06-13 Thread Joseph Myers

On Tue, 13 Jun 2017, Marek Polacek wrote:

>   * c-parser.c (c_parser_if_body): Set the location of the
>   body of the conditional after parsing all the labels.  Call
>   warn_for_multistatement_macros.
>   (c_parser_else_body): Likewise.
>   (c_parser_switch_statement): Likewise.
>   (c_parser_while_statement): Likewise.
>   (c_parser_for_statement): Likewise.
>   (c_parser_statement): Add a default argument.  Save the location
>   after labels have been parsed.
>   (c_parser_c99_block_statement): Likewise.

The gcc/c/ changes are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 00/30] [ARM] Reworking the -mcpu, -march and -mfpu options

2017-06-13 Thread Richard Earnshaw (lists)

On 12/06/17 15:34, Richard Earnshaw (lists) wrote:
> On 12/06/17 12:49, Christophe Lyon wrote:
>> On 10 June 2017 at 01:27, Richard Earnshaw (lists)
>>  wrote:
>>> On 09/06/17 23:45, Christophe Lyon wrote:
 Hi Richard,


 On 9 June 2017 at 14:53, Richard Earnshaw  wrote:
>
> During the ARM BoF at the Cauldron last year I mentioned that I wanted
> to rework the way GCC on ARM handles the command line options.  The
> problem was that most users, and even many experts, can't remember
> which FPU/SIMD unit comes with which CPU and that consequently many
> users were inadvertently generating sub-optimal code for their system.
>
> This patch series implements the proposed change and provides support
> for a generic way of adding optional features to architectures and CPU
> names.  The documentation patches at the end of the series explain the
> new syntax, so I won't repeat all that here.  Suffice to say here that
> the result is that the -mfpu option now defaults to 'auto', which
> allows the compiler to infer the floating-point and simd options from
> the CPU/architecture options and that these options can normally be
> expressed in a context-specific manner like +simd or +fp without
> having to know precisely which variant is implemented.  Long term I'd
> like to deprecate -mfpu and entirely move over to the new syntax; but
> it's too early to start that process now.
>
> All the patches in the series should build a working basic compiler,
> but the multilib selection will not work correctly until the relevant
> patches towards the end are applied.  It is not really feasible to
> retain that functionality without collapsing too many of the patches
> together into one hunk.  It's also possible that some tests in the
> testsuite may exhibit transient misbehaviour, but there should be no
> regressions by the end of the sequence (some tests no-longer run in
> the default configurations because the default CPU does not have
> floating-point support).
>
> Just two patches are to the generic code, but both are fairly trivial.
> One permits the sbitmap code to be used in the driver programs and the
> other provides a way of escaping the meta-character in some multilib
> reuse strings.
>
> I won't apply any of this series until those two patches have been
> approved, and I won't commit anything before the middle of next week
> even then.  This is a fairly complex change and it deserves some time
> for people to comment before committing.
>
> R.
>
> Richard Earnshaw (30):
>   [arm] Use strings for -march, -mcpu and -mtune options
>   [arm] Rewrite -march and -mcpu options for passing to the assembler
>   [arm] Don't pass -mfpu=auto through to the assembler.
>   [arm] Allow +opt on arbitrary cpu and architecture specifications
>   [arm] Add architectural options
>   [arm] Add default FPUs for CPUs.
>   [build] Make sbitmap code available to the driver programs
>   [arm] Split CPU, architecture and tuning data tables.
>   [ARM] Move cpu and architecture option name parsing code to
> arm-common.c
>   [arm] Use standard option parsing code for detecting thumb-only
> targets
>   [arm] Allow CPU and architecture extensions to be defined as aliases
>   [arm] Allow new extended syntax CPU and architecture names during
> configure
>   [arm] Force a CPU default in the config args defaults list.
>   [arm] Generate a canonical form for -march
>   [arm] Make -mfloat-abi=softfp work when there are no FPU instructions
>   [arm] Update basic multilib configuration
>   [arm] Make 'auto' the default FPU selection option.
>   [arm] Rewrite t-aprofile using new selector methodology
>   [arm] Explicitly set .fpu in cmse_nonsecure_call.S
>   [genmultilib] Allow explicit periods to be escaped in MULTILIB_REUSE
>   [arm][testsuite] Use -march=armv7-a+fp when testing hard-float ABI.
>   [arm] Rewrite t-rmprofile multilib specification
>   [arm][rtems] Update t-rtems for new option framework
>   [arm][linux-eabi] Ensure all multilib variables are reset
>   [arm][phoenix] reset all multilib variables
>   [arm] Rework multlib builds for symbianelf
>   [arm][fuchsia] Rework multilib support
>   [arm] Add a few missing architecture extension options.
>   [arm][doc] Document new -march= syntax.
>   [arm][doc] Document changes to -mcpu, -mtune and -mfpu.
>
>  gcc/Makefile.in   |2 +-
>  gcc/common/config/arm/arm-common.c|  651 +++-
>  gcc/config.gcc|   17 +-
>  gcc/config/arm/arm-builtins.c |4 +-
>  gcc/config/arm/arm-cpu-cdata.h| 2444 
>

Re: Merge from GCC trunk to gccgo branch

2017-06-13 Thread Ian Lance Taylor

I've merged GCC trunk revision 249156 to the gccgo branch.

Ian

Go patch committed: Fix function passed in write_globals

2017-06-13 Thread Ian Lance Taylor

This patch by Than McIntosh fixes a bug in the Go frontend: in
Gogo::write_globals in a couple of places the wrong Bfunction was
being used for the containing (not target) function when creating
calls for init functions.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 249156)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-c4ecdd3edb9febe72b5527481ae3d7310105ca67
+be5fa26b2b1b5d0755bc1c7ce25f3aa26bea9d9c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 249125)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -1504,10 +1504,10 @@ Gogo::write_globals()
   Bfunction* initfn = func->get_or_make_decl(this, *p);
   Bexpression* func_code =
   this->backend()->function_code_expression(initfn, func_loc);
-  Bexpression* call = this->backend()->call_expression(initfn, func_code,
+  Bexpression* call = this->backend()->call_expression(init_bfn, func_code,
empty_args,
   NULL, func_loc);
-  Bstatement* ist = this->backend()->expression_statement(initfn, call);
+  Bstatement* ist = this->backend()->expression_statement(init_bfn, call);
   init_stmts.push_back(ist);
 }

Re: [PATCH][ARM] Remove DImode expansions for 1-bit shifts

2017-06-13 Thread Wilco Dijkstra


ping

From: Wilco Dijkstra
Sent: 17 January 2017 19:23
To: GCC Patches
Cc: nd; Kyrill Tkachov; Richard Earnshaw
Subject: [PATCH][ARM] Remove DImode expansions for 1-bit shifts
    
A left shift of 1 can always be done using an add, so slightly adjust rtx
cost for DImode left shift by 1 so that adddi3 is preferred in all cases,
and the arm_ashldi3_1bit is redundant.

DImode right shifts of 1 are rarely used (6 in total in the GCC binary),
so there is little benefit of the arm_ashrdi3_1bit and arm_lshrdi3_1bit
patterns.
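
For illustration only (this is not part of the patch): a 64-bit left shift by
one is the same as adding the value to itself, which on a 32-bit target is an
add of the low words followed by an add-with-carry of the high words.  The
Pair type and the helper names below are hypothetical and just model that
split.

```cpp
#include <cstdint>

// A 64-bit value held as two 32-bit halves, as on a 32-bit target.
struct Pair { std::uint32_t lo, hi; };

constexpr Pair split (std::uint64_t x)
{
  return Pair{ static_cast<std::uint32_t> (x),
               static_cast<std::uint32_t> (x >> 32) };
}

constexpr std::uint64_t join (Pair p)
{
  return (static_cast<std::uint64_t> (p.hi) << 32) | p.lo;
}

// Models an add of the low words followed by an add-with-carry of the
// high words.
constexpr Pair add_pair (Pair a, Pair b)
{
  std::uint32_t lo = a.lo + b.lo;             // wraps modulo 2^32
  std::uint32_t carry = lo < a.lo ? 1u : 0u;  // carry out of the low word
  std::uint32_t hi = a.hi + b.hi + carry;
  return Pair{ lo, hi };
}

// x << 1 computed as x + x on the split representation.
constexpr std::uint64_t shl1_via_add (std::uint64_t x)
{
  return join (add_pair (split (x), split (x)));
}

static_assert (shl1_via_add (0x0123456789abcdefULL)
               == (0x0123456789abcdefULL << 1), "");
static_assert (shl1_via_add (0x80000000ffffffffULL)
               == (0x80000000ffffffffULL << 1), "");
```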

Bootstrap OK on arm-linux-gnueabihf.

ChangeLog:
2017-01-17  Wilco Dijkstra  

    * config/arm/arm.md (ashldi3): Remove shift by 1 expansion.
    (arm_ashldi3_1bit): Remove pattern.
    (ashrdi3): Remove shift by 1 expansion.
    (arm_ashrdi3_1bit): Remove pattern.
    (lshrdi3): Remove shift by 1 expansion.
    (arm_lshrdi3_1bit): Remove pattern.
    * config/arm/arm.c (arm_rtx_costs_internal): Slightly increase
    cost of ashldi3 by 1.
    * config/arm/neon.md (ashldi3_neon): Remove shift by 1 expansion.
    (di3_neon): Likewise.
--
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7d82ba358306189535bf7eee08a54e2f84569307..d47f4005446ff3e81968d7888c6573c0360cfdbd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9254,6 +9254,9 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
    + rtx_cost (XEXP (x, 0), mode, code, 0, speed_p));
   if (speed_p)
 *cost += 2 * extra_cost->alu.shift;
+ /* Slightly disparage left shift by 1 so we prefer adddi3.  */
+ if (code == ASHIFT && XEXP (x, 1) == CONST1_RTX (SImode))
+   *cost += 1;
   return true;
 }
   else if (mode == SImode)
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 0d69c8be9a2f98971c23c3b6f1659049f369920e..92b734ca277079f5f7343c7cc21a343f48d234c5 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4061,12 +4061,6 @@
 {
   rtx scratch1, scratch2;
 
-  if (operands[2] == CONST1_RTX (SImode))
-    {
-  emit_insn (gen_arm_ashldi3_1bit (operands[0], operands[1]));
-  DONE;
-    }
-
   /* Ideally we should use iwmmxt here if we could know that operands[1]
  ends up already living in an iwmmxt register. Otherwise it's
  cheaper to have the alternate code being generated than moving
@@ -4083,18 +4077,6 @@
   "
 )
 
-(define_insn "arm_ashldi3_1bit"
-  [(set (match_operand:DI    0 "s_register_operand" "=r,")
-    (ashift:DI (match_operand:DI 1 "s_register_operand" "0,r")
-   (const_int 1)))
-   (clobber (reg:CC CC_REGNUM))]
-  "TARGET_32BIT"
-  "movs\\t%Q0, %Q1, asl #1\;adc\\t%R0, %R1, %R1"
-  [(set_attr "conds" "clob")
-   (set_attr "length" "8")
-   (set_attr "type" "multiple")]
-)
-
 (define_expand "ashlsi3"
   [(set (match_operand:SI    0 "s_register_operand" "")
 (ashift:SI (match_operand:SI 1 "s_register_operand" "")
@@ -4130,12 +4112,6 @@
 {
   rtx scratch1, scratch2;
 
-  if (operands[2] == CONST1_RTX (SImode))
-    {
-  emit_insn (gen_arm_ashrdi3_1bit (operands[0], operands[1]));
-  DONE;
-    }
-
   /* Ideally we should use iwmmxt here if we could know that operands[1]
  ends up already living in an iwmmxt register. Otherwise it's
  cheaper to have the alternate code being generated than moving
@@ -4152,18 +4128,6 @@
   "
 )
 
-(define_insn "arm_ashrdi3_1bit"
-  [(set (match_operand:DI  0 "s_register_operand" "=r,")
-    (ashiftrt:DI (match_operand:DI 1 "s_register_operand" "0,r")
- (const_int 1)))
-   (clobber (reg:CC CC_REGNUM))]
-  "TARGET_32BIT"
-  "movs\\t%R0, %R1, asr #1\;mov\\t%Q0, %Q1, rrx"
-  [(set_attr "conds" "clob")
-   (set_attr "length" "8")
-   (set_attr "type" "multiple")]
-)
-
 (define_expand "ashrsi3"
   [(set (match_operand:SI  0 "s_register_operand" "")
 (ashiftrt:SI (match_operand:SI 1 "s_register_operand" "")
@@ -4196,12 +4160,6 @@
 {
   rtx scratch1, scratch2;
 
-  if (operands[2] == CONST1_RTX (SImode))
-    {
-  emit_insn (gen_arm_lshrdi3_1bit (operands[0], operands[1]));
-  DONE;
-    }
-
   /* Ideally we should use iwmmxt here if we could know that operands[1]
  ends up already living in an iwmmxt register. Otherwise it's
  cheaper to have the alternate code being generated than moving
@@ -4218,18 +4176,6 @@
   "
 )
 
-(define_insn "arm_lshrdi3_1bit"
-  [(set (match_operand:DI  0 "s_register_operand" "=r,")
-    (lshiftrt:DI (match_operand:DI 1 "s_register_operand" "0,r")
- (const_int 1)))
-   (clobber (reg:CC CC_REGNUM))]
-  "TARGET_32BIT"
-  "movs\\t%R0, %R1, lsr #1\;mov\\t%Q0, %Q1, rrx"
-  [(set_attr "conds" "clob")
-   (set_attr "length" "8")
-   (set_attr "type" "multiple")]

Re: [PATCH v3][AArch64] Fix symbol offset limit

2017-06-13 Thread Wilco Dijkstra


ping

From: Wilco Dijkstra
Sent: 17 January 2017 15:14
To: Richard Earnshaw; GCC Patches; James Greenhalgh
Cc: nd
Subject: Re: [PATCH v3][AArch64] Fix symbol offset limit
    
Here is v3 of the patch - tree_fits_uhwi_p was necessary to ensure the size of a
declaration is an integer. So the question is whether we should allow
largish offsets outside of the bounds of symbols (v1), no offsets (this
version), or small offsets (small negative and positive offsets just outside a
symbol are common).
The only thing we can't allow is any offset like we currently do...

In aarch64_classify_symbol symbols are allowed full-range offsets on
relocations.  This means the offset can use all of the +/-4GB offset, leaving
no offset available for the symbol itself.  This results in relocation overflow
and link-time errors for simple expressions like _char + 0xff00.

To avoid this, limit the offset to +/-1GB so that the symbol needs to be within
a 3GB offset from its references.  For the tiny code model use a 64KB offset,
allowing most of the 1MB range for code/data between the symbol and its
references.  For symbols with a defined size, limit the offset to be within the
size of the symbol.

ChangeLog:
2017-01-17  Wilco Dijkstra  

    gcc/
    * config/aarch64/aarch64.c (aarch64_classify_symbol):
    Apply reasonable limit to symbol offsets.

    testsuite/
    * gcc.target/aarch64/symbol-range.c (foo): Set new limit.
    * gcc.target/aarch64/symbol-range-tiny.c (foo): Likewise.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e8d65ead95a3c5730c2ffe64a9e057779819f7b4..f1d54e332dc1cf1ef0bc4b1e46b0ebebe1c4cea4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9809,6 +9809,8 @@ aarch64_classify_symbol (rtx x, rtx offset)
   if (aarch64_tls_symbol_p (x))
 return aarch64_classify_tls_symbol (x);
 
+  const_tree decl = SYMBOL_REF_DECL (x);
+
   switch (aarch64_cmodel)
 {
 case AARCH64_CMODEL_TINY:
@@ -9817,25 +9819,45 @@ aarch64_classify_symbol (rtx x, rtx offset)
  we have no way of knowing the address of symbol at compile time
  so we can't accurately say if the distance between the PC and
  symbol + offset is outside the addressible range of +/-1M in the
-    TINY code model.  So we rely on images not being greater than
-    1M and cap the offset at 1M and anything beyond 1M will have to
-    be loaded using an alternative mechanism.  Furthermore if the
-    symbol is a weak reference to something that isn't known to
-    resolve to a symbol in this module, then force to memory.  */
+    TINY code model.  So we limit the maximum offset to +/-64KB and
+    assume the offset to the symbol is not larger than +/-(1M - 64KB).
+    Furthermore force to memory if the symbol is a weak reference to
+    something that doesn't resolve to a symbol in this module.  */
   if ((SYMBOL_REF_WEAK (x)
    && !aarch64_symbol_binds_local_p (x))
- || INTVAL (offset) < -1048575 || INTVAL (offset) > 1048575)
+ || !IN_RANGE (INTVAL (offset), -0x10000, 0x10000))
 return SYMBOL_FORCE_TO_MEM;
+
+ /* Limit offset to within the size of a declaration if available.  */
+ if (decl && DECL_P (decl))
+   {
+ const_tree decl_size = DECL_SIZE (decl);
+
+ if (tree_fits_uhwi_p (decl_size)
+ && !IN_RANGE (INTVAL (offset), 0, tree_to_uhwi (decl_size)))
+   return SYMBOL_FORCE_TO_MEM;
+   }
+
   return SYMBOL_TINY_ABSOLUTE;
 
 case AARCH64_CMODEL_SMALL:
   /* Same reasoning as the tiny code model, but the offset cap here is
-    4G.  */
+    1G, allowing +/-3G for the offset to the symbol.  */
   if ((SYMBOL_REF_WEAK (x)
    && !aarch64_symbol_binds_local_p (x))
- || !IN_RANGE (INTVAL (offset), HOST_WIDE_INT_C (-4294967263),
-   HOST_WIDE_INT_C (4294967264)))
+ || !IN_RANGE (INTVAL (offset), -0x40000000, 0x40000000))
 return SYMBOL_FORCE_TO_MEM;
+
+ /* Limit offset to within the size of a declaration if available.  */
+ if (decl && DECL_P (decl))
+   {
+ const_tree decl_size = DECL_SIZE (decl);
+
+ if (tree_fits_uhwi_p (decl_size)
+ && !IN_RANGE (INTVAL (offset), 0, tree_to_uhwi (decl_size)))
+   return SYMBOL_FORCE_TO_MEM;
+   }
+
   return SYMBOL_SMALL_ABSOLUTE;
 
 case AARCH64_CMODEL_TINY_PIC:
diff --git a/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c b/gcc/testsuite/gcc.target/aarch64/symbol-range-tiny.c
index d7e46b059e41f2672b3a1da5506fa8944e752e01..d49ff4dbe5786ef6d343d2b90052c09676dd7fe5 100644
---

Re: [PATCH][ARM] Remove Thumb-2 iordi_not patterns

2017-06-13 Thread Wilco Dijkstra


ping


From: Wilco Dijkstra
Sent: 17 January 2017 18:00
To: GCC Patches
Cc: nd; Kyrylo Tkachov; Richard Earnshaw
Subject: [PATCH][ARM] Remove Thumb-2 iordi_not patterns
    
After Bernd's DImode patch [1] almost all DImode operations are expanded
early (except for -mfpu=neon). This means the Thumb-2 iordi_notdi_di
patterns are no longer used - the split ORR and NOT instructions are merged
into ORN by Combine.  With -mfpu=neon the iordi_notdi_di patterns are used
on Thumb-2, and after this patch the orndi3_neon pattern matches instead
(which still emits ORN).  After this there are no Thumb-2 specific DImode 
patterns.

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02796.html

ChangeLog:
2017-01-17  Wilco Dijkstra  

    * config/arm/thumb2.md (iordi_notdi_di): Remove pattern.
    (iordi_notzesidi_di): Likewise.
    (iordi_notdi_zesidi): Likewise.
    (iordi_notsesidi_di): Likewise.

--

diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 2e7580f220eae1524fef69719b1796f50f5cf27c..91471d4650ecae4f4e87b549d84d11adf3014ad2 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1434,103 +1434,6 @@
    (set_attr "type" "alu_sreg")]
 )
 
-; Constants for op 2 will never be given to these patterns.
-(define_insn_and_split "*iordi_notdi_di"
-  [(set (match_operand:DI 0 "s_register_operand" "=,")
-   (ior:DI (not:DI (match_operand:DI 1 "s_register_operand" "0,r"))
-   (match_operand:DI 2 "s_register_operand" "r,0")))]
-  "TARGET_THUMB2"
-  "#"
-  "TARGET_THUMB2 && reload_completed"
-  [(set (match_dup 0) (ior:SI (not:SI (match_dup 1)) (match_dup 2)))
-   (set (match_dup 3) (ior:SI (not:SI (match_dup 4)) (match_dup 5)))]
-  "
-  {
-    operands[3] = gen_highpart (SImode, operands[0]);
-    operands[0] = gen_lowpart (SImode, operands[0]);
-    operands[4] = gen_highpart (SImode, operands[1]);
-    operands[1] = gen_lowpart (SImode, operands[1]);
-    operands[5] = gen_highpart (SImode, operands[2]);
-    operands[2] = gen_lowpart (SImode, operands[2]);
-  }"
-  [(set_attr "length" "8")
-   (set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
-   (set_attr "type" "multiple")]
-)
-
-(define_insn_and_split "*iordi_notzesidi_di"
-  [(set (match_operand:DI 0 "s_register_operand" "=,")
-   (ior:DI (not:DI (zero_extend:DI
-    (match_operand:SI 2 "s_register_operand" "r,r")))
-   (match_operand:DI 1 "s_register_operand" "0,?r")))]
-  "TARGET_THUMB2"
-  "#"
-  ; (not (zero_extend...)) means operand0 will always be 0xffffffff
-  "TARGET_THUMB2 && reload_completed"
-  [(set (match_dup 0) (ior:SI (not:SI (match_dup 2)) (match_dup 1)))
-   (set (match_dup 3) (const_int -1))]
-  "
-  {
-    operands[3] = gen_highpart (SImode, operands[0]);
-    operands[0] = gen_lowpart (SImode, operands[0]);
-    operands[1] = gen_lowpart (SImode, operands[1]);
-  }"
-  [(set_attr "length" "4,8")
-   (set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
-   (set_attr "type" "multiple")]
-)
-
-(define_insn_and_split "*iordi_notdi_zesidi"
-  [(set (match_operand:DI 0 "s_register_operand" "=,")
-   (ior:DI (not:DI (match_operand:DI 2 "s_register_operand" "0,?r"))
-   (zero_extend:DI
-    (match_operand:SI 1 "s_register_operand" "r,r"))))]
-  "TARGET_THUMB2"
-  "#"
-  "TARGET_THUMB2 && reload_completed"
-  [(set (match_dup 0) (ior:SI (not:SI (match_dup 2)) (match_dup 1)))
-   (set (match_dup 3) (not:SI (match_dup 4)))]
-  "
-  {
-    operands[3] = gen_highpart (SImode, operands[0]);
-    operands[0] = gen_lowpart (SImode, operands[0]);
-    operands[1] = gen_lowpart (SImode, operands[1]);
-    operands[4] = gen_highpart (SImode, operands[2]);
-    operands[2] = gen_lowpart (SImode, operands[2]);
-  }"
-  [(set_attr "length" "8")
-   (set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
-   (set_attr "type" "multiple")]
-)
-
-(define_insn_and_split "*iordi_notsesidi_di"
-  [(set (match_operand:DI 0 "s_register_operand" "=,")
-   (ior:DI (not:DI (sign_extend:DI
-    (match_operand:SI 2 "s_register_operand" "r,r")))
-   (match_operand:DI 1 "s_register_operand" "0,r")))]
-  "TARGET_THUMB2"
-  "#"
-  "TARGET_THUMB2 && reload_completed"
-  [(set (match_dup 0) (ior:SI (not:SI (match_dup 2)) (match_dup 1)))
-   (set (match_dup 3) (ior:SI (not:SI
-   (ashiftrt:SI (match_dup 2) (const_int 31)))
-  (match_dup 4)))]
-  "
-  {
-    operands[3] = gen_highpart (SImode, operands[0]);
-    operands[0] = gen_lowpart (SImode, operands[0]);
-    operands[4] = gen_highpart (SImode, operands[1]);
-    operands[1] = gen_lowpart (SImode, operands[1]);
-  }"
-  [(set_attr "length" "8")
-   (set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")
-   (set_attr "type" "multiple")]
-)
-
 (define_insn "*orsi_notsi_si"
   [(set (match_operand:SI 0

Re: [PATCH][ARM] Improve max_insns_skipped logic

2017-06-13 Thread Wilco Dijkstra


ping


From: Wilco Dijkstra
Sent: 10 November 2016 17:19
To: GCC Patches
Cc: nd
Subject: [PATCH][ARM] Improve max_insns_skipped logic
    
Improve the logic when setting max_insns_skipped.  Limit the maximum size of an
IT block to MAX_INSN_PER_IT_BLOCK, as otherwise multiple IT instructions are
needed, increasing code size.  Given that 4 works well for Thumb-2, use the same
limit for ARM for consistency.
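
As a rough illustration of the new logic, here is a standalone sketch.  It is
not the GCC code itself: it_block_limit and compute_max_insns_skipped are
hypothetical names, tune_max stands in for current_tune->max_insns_skipped,
and the IT block limit is assumed to be "arm_restrict_it ? 1 : 4" because that
is what the removed lines in the diff below use.

```c
#include <stdbool.h>

/* Hypothetical stand-in for MAX_INSN_PER_IT_BLOCK.  Assumed to be
   "restrict_it ? 1 : 4", mirroring the expression removed by the patch.  */
static int
it_block_limit (bool restrict_it)
{
  return restrict_it ? 1 : 4;
}

/* Model of the new max_insns_skipped computation.  TUNE_MAX stands in
   for current_tune->max_insns_skipped.  */
static int
compute_max_insns_skipped (bool optimize_size, bool thumb2,
                           bool restrict_it, int tune_max)
{
  /* With -Os use 4 conditional instructions, otherwise the tuning value.  */
  int max_insns_skipped = optimize_size ? 4 : tune_max;

  /* For Thumb-2, never exceed a single IT block.  */
  if (thumb2)
    {
      int limit = it_block_limit (restrict_it);
      if (max_insns_skipped > limit)
        max_insns_skipped = limit;
    }

  return max_insns_skipped;
}
```

Under these assumptions a tuning value of 5 is capped to 4 on Thumb-2 (or to 1
when IT blocks are restricted), and ARM with -Os gets 4 rather than the
previous 6.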

ChangeLog:
2016-11-04  Wilco Dijkstra  

    * config/arm/arm.c (arm_option_params_internal): Improve setting of
    max_insns_skipped.
--

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index f046854e9665d54911616fc1c60fee407188f7d6..29e8d1d07d918fbb2a627a653510dfc8587ee01a 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2901,20 +2901,12 @@ arm_option_params_internal (void)
   targetm.max_anchor_offset = TARGET_MAX_ANCHOR_OFFSET;
 }
 
-  if (optimize_size)
-    {
-  /* If optimizing for size, bump the number of instructions that we
- are prepared to conditionally execute (even on a StrongARM).  */
-  max_insns_skipped = 6;
+  /* Increase the number of conditional instructions with -Os.  */
+  max_insns_skipped = optimize_size ? 4 : current_tune->max_insns_skipped;
 
-  /* For THUMB2, we limit the conditional sequence to one IT block.  */
-  if (TARGET_THUMB2)
-    max_insns_skipped = arm_restrict_it ? 1 : 4;
-    }
-  else
-    /* When -mrestrict-it is in use tone down the if-conversion.  */
-    max_insns_skipped = (TARGET_THUMB2 && arm_restrict_it)
-  ? 1 : current_tune->max_insns_skipped;
+  /* For THUMB2, we limit the conditional sequence to one IT block.  */
+  if (TARGET_THUMB2)
+    max_insns_skipped = MIN (max_insns_skipped, MAX_INSN_PER_IT_BLOCK);
 }
 
 /* True if -mflip-thumb should next add an attribute for the default

Re: [PATCH][ARM] Fix ldrd offsets

2017-06-13 Thread Wilco Dijkstra


    

ping

From: Wilco Dijkstra
Sent: 03 November 2016 12:20
To: GCC Patches
Cc: nd
Subject: [PATCH][ARM] Fix ldrd offsets
    
Fix ldrd offsets on Thumb-2: with TARGET_LDRD the range is +-1020,
without it -255..4091.  This reduces the number of addressing instructions
when using DI mode operations (such as in PR77308).

Bootstrap & regress OK.
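
For reference, the two ranges can be modelled outside GCC with a small sketch.
The function name thumb2_di_index_ok and the target_ldrd parameter are
hypothetical; IN_RANGE is redefined locally so the example is self-contained.

```c
#include <stdbool.h>

/* Local stand-in for GCC's IN_RANGE macro (inclusive bounds).  */
#define IN_RANGE(v, lo, hi) ((v) >= (lo) && (v) <= (hi))

/* Hypothetical model of the DI/DF constant-offset check on Thumb-2.
   With ldrd the offset must be a multiple of 4 within +-1020; without
   it two ldr instructions are assumed, so the second access at
   offset + 4 must still fit the -255..4095 ldr range.  */
static bool
thumb2_di_index_ok (long val, bool target_ldrd)
{
  if (target_ldrd)
    return IN_RANGE (val, -1020, 1020) && (val & 3) == 0;
  else
    return IN_RANGE (val, -255, 4095 - 4);
}
```

So an offset of 1020 is accepted with ldrd while 1022 and 1024 are not, and
4091 is the largest offset accepted in the two-ldr case.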

ChangeLog:
2016-11-03  Wilco Dijkstra  

    gcc/
    * config/arm/arm.c (arm_legitimate_index_p): Add comment.
    (thumb2_legitimate_index_p): Use correct range for DI/DF mode.
--

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3c4c7042d9c2101619722b5822b3d1ca37d637b9..5d12cf9c46c27d60a278d90584bde36ec86bb3fe 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -7486,6 +7486,8 @@ arm_legitimate_index_p (machine_mode mode, rtx index, RTX_CODE outer,
 {
   HOST_WIDE_INT val = INTVAL (index);
 
+ /* Assume we emit ldrd or 2x ldr if !TARGET_LDRD.
+    If vldr is selected it uses arm_coproc_mem_operand.  */
   if (TARGET_LDRD)
 return val > -256 && val < 256;
   else
@@ -7613,11 +7615,13 @@ thumb2_legitimate_index_p (machine_mode mode, rtx index, int strict_p)
   if (code == CONST_INT)
 {
   HOST_WIDE_INT val = INTVAL (index);
- /* ??? Can we assume ldrd for thumb2?  */
- /* Thumb-2 ldrd only has reg+const addressing modes.  */
- /* ldrd supports offsets of +-1020.
-    However the ldr fallback does not.  */
- return val > -256 && val < 256 && (val & 3) == 0;
+ /* Thumb-2 ldrd only has reg+const addressing modes.
+    Assume we emit ldrd or 2x ldr if !TARGET_LDRD.
+    If vldr is selected it uses arm_coproc_mem_operand.  */
+ if (TARGET_LDRD)
+   return IN_RANGE (val, -1020, 1020) && (val & 3) == 0;
+ else
+   return IN_RANGE (val, -255, 4095 - 4);
 }
   else
 return 0;

Re: [PATCH v2] Implement no_sanitize function attribute

2017-06-13 Thread Martin Liška

On 06/13/2017 03:49 PM, Richard Biener wrote:
> On Tue, Jun 13, 2017 at 1:09 PM, Martin Liška  wrote:
>> On 06/09/2017 03:35 PM, Richard Biener wrote:
>>> You can directly transform to no_sanitize with integer mask, not sure why
>>> you'd need an intermediate step with a string?
>>
>> Hello.
>>
>> Done in attached patch, I'm sending both incremental and final version 
>> (complete patch).
>> I also decided to support no_sanitize attribute in pretty printer:
>>
>> __attribute__((no_sanitize (address | shift | shift-base | shift-exponent | 
>> integer-divide-by-zero | undefined | unreachable | vla-bound | return | null 
>> | signed-integer-overflow | bool | enum | float-divide-by-zero | 
>> float-cast-overflow | bounds | bounds-strict | alignment | nonnull-attribute 
>> | returns-nonnull-attribute | object-size | vptr)))
>> fn1 ()
>> {
>>   char my_char[9];
>>   char * ptr2;
>>   char * ptr;
>> ..
>>
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
> 
> 
> +unsigned int
> +parse_no_sanitize_attribute (char *value, char **wrong_argument)
> +{
> 
> functions need a comment.
> 
> Otherwise looks ok to me.

Done and patch installed as r249158.

Thanks for help with that.
Martin

> 
> Thanks,
> Richard.
> 
>> Martin

Re: [RFC][PATCH][AArch64] Cleanup frame pointer usage

2017-06-13 Thread Wilco Dijkstra

ping

From: Wilco Dijkstra
Sent: 31 October 2016 18:29
To: GCC Patches
Cc: nd
Subject: [RFC][PATCH][AArch64] Cleanup frame pointer usage

This patch cleans up all code related to the frame pointer.  On AArch64 we
emit a frame chain even in cases where the frame pointer is not required.
So make this explicit by introducing a boolean emit_frame_chain in
aarch64_frame record.

When the frame pointer is enabled but not strictly required (e.g. no use of
alloca), we emit a frame chain in non-leaf functions, but continue to use the
stack pointer to access locals.  This results in smaller code and unwind info.

Also simplify the complex logic in aarch64_override_options_after_change_1
and compute whether the frame chain is required in aarch64_layout_frame
instead.  As a result aarch64_frame_pointer_required is now redundant and
aarch64_can_eliminate can be greatly simplified.
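
The decision described above can be sketched as a standalone predicate.  This
is only a model: struct fn_state and its field names are hypothetical, and the
value 2 for flag_omit_frame_pointer is simply the marker the patch below tests
for; how it gets set is not shown here.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the per-function state the patch consults.  */
struct fn_state
{
  bool frame_pointer_needed;
  bool calls_eh_return;
  int flag_omit_frame_pointer;       /* 2 is the marker tested by the patch.  */
  bool flag_omit_leaf_frame_pointer;
  bool is_leaf;
  bool lr_live;                      /* Models df_regs_ever_live_p (LR_REGNUM).  */
};

/* Model of the emit_frame_chain computation in aarch64_layout_frame.  */
static bool
emit_frame_chain (const struct fn_state *f)
{
  /* Force a frame chain when the FP is needed or for EH returns.  */
  bool emit = f->frame_pointer_needed || f->calls_eh_return;

  /* With the frame pointer enabled, emit a frame chain unless this is a
     leaf function that does not use LR and leaf frame pointers are
     omitted.  */
  if (f->flag_omit_frame_pointer == 2
      && !(f->flag_omit_leaf_frame_pointer && f->is_leaf && !f->lr_live))
    emit = true;

  return emit;
}
```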

Finally convert all callee save/restore functions to use gen_frame_mem.

Bootstrap OK. Any comments?

ChangeLog:
2016-10-31  Wilco Dijkstra  

    gcc/
    * config/aarch64/aarch64.h (aarch64_frame):
 Add emit_frame_chain boolean.
    * config/aarch64/aarch64.c (aarch64_frame_pointer_required):
    Remove.
    (aarch64_layout_frame): Initialise emit_frame_chain.
    (aarch64_pushwb_single_reg): Use gen_frame_mem.
    (aarch64_pop_regs): Likewise.
    (aarch64_gen_load_pair): Likewise.
    (aarch64_save_callee_saves): Likewise.
    (aarch64_restore_callee_saves): Likewise.
    (aarch64_expand_prologue): Use emit_frame_chain.
    (aarch64_can_eliminate): Simplify. When FP needed or outgoing
    arguments are large, eliminate to FP, otherwise SP.
    (aarch64_override_options_after_change_1): Simplify.
    (TARGET_FRAME_POINTER_REQUIRED): Remove define.

--

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
fa81e4b853daf08842955288861ec7e7acca..6e32dc9f6f171dde0c182fdd7857230251f71712
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -583,6 +583,9 @@ struct GTY (()) aarch64_frame
   /* The size of the stack adjustment after saving callee-saves.  */
   HOST_WIDE_INT final_adjust;

+  /* Store FP,LR and setup a frame pointer.  */
+  bool emit_frame_chain;
+
   unsigned wb_candidate1;
   unsigned wb_candidate2;

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f07d771ea343803e054e03f59c8c1efb698bf474..6c06ac18d16f8afa7ee1cc5e8530e285a60e2b0f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2728,24 +2728,6 @@ aarch64_output_probe_stack_range (rtx reg1, rtx reg2)
   return "";
 }

-static bool
-aarch64_frame_pointer_required (void)
-{
-  /* In aarch64_override_options_after_change
- flag_omit_leaf_frame_pointer turns off the frame pointer by
- default.  Turn it back on now if we've not got a leaf
- function.  */
-  if (flag_omit_leaf_frame_pointer
-  && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM)))
-    return true;
-
-  /* Force a frame pointer for EH returns so the return address is at FP+8.  */
-  if (crtl->calls_eh_return)
-    return true;
-
-  return false;
-}
-
 /* Mark the registers that need to be saved by the callee and calculate
    the size of the callee-saved registers area and frame record (both FP
    and LR may be omitted).  */
@@ -2758,6 +2740,18 @@ aarch64_layout_frame (void)
   if (reload_completed && cfun->machine->frame.laid_out)
 return;

+  /* Force a frame chain for EH returns so the return address is at FP+8.  */
+  cfun->machine->frame.emit_frame_chain
+    = frame_pointer_needed || crtl->calls_eh_return;
+
+  /* Emit a frame chain if the frame pointer is enabled.
+ If -momit-leaf-frame-pointer is used, do not use a frame chain
+ in leaf functions which do not use LR.  */
+  if (flag_omit_frame_pointer == 2
+  && !(flag_omit_leaf_frame_pointer && crtl->is_leaf
+  && !df_regs_ever_live_p (LR_REGNUM)))
+    cfun->machine->frame.emit_frame_chain = true;
+
 #define SLOT_NOT_REQUIRED (-2)
 #define SLOT_REQUIRED (-1)

@@ -2789,7 +2783,7 @@ aarch64_layout_frame (void)
 && !call_used_regs[regno])
   cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;

-  if (frame_pointer_needed)
+  if (cfun->machine->frame.emit_frame_chain)
 {
   /* FP and LR are placed in the linkage record.  */
   cfun->machine->frame.reg_offset[R29_REGNUM] = 0;
@@ -2937,7 +2931,7 @@ aarch64_pushwb_single_reg (machine_mode mode, unsigned regno,
   reg = gen_rtx_REG (mode, regno);
   mem = gen_rtx_PRE_MODIFY (Pmode, base_rtx,
 plus_constant (Pmode, base_rtx, -adjustment));
-  mem = gen_rtx_MEM (mode, mem);
+  mem = gen_frame_mem (mode, mem);

   insn = emit_move_insn (mem, reg);
   RTX_FRAME_RELATED_P (insn) = 1;
@@ -3011,7 +3005,7 @@ aarch64_pop_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT adjustment,
 {
   rtx mem =

Re: [PATCH][ARM] Remove movdi_vfp_cortexa8

2017-06-13 Thread Wilco Dijkstra

ping

Richard Earnshaw (lists) wrote:
>  (define_insn "*movdi_vfp"
> -  [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,r,w,w, Uv")
> +  [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,!r,w,w, Uv")

> Why have you introduced a no-reloads block on the 9th alternative for
> all variants?

That is the default behaviour when you don't explicitly set a cpu, so I kept that.
See https://patches.linaro.org/patch/541/ for the original reason for adding it -
duplicating this pattern was a mistake since '!' wouldn't pessimize other cores
as int<->fp moves typically have a non-trivial cost.

However given Cortex-A8 is ancient now we could just remove the '!'.

Wilco

Re: [PATCH][AArch64] Improve Cortex-A53 shift bypass

2017-06-13 Thread Wilco Dijkstra

ping
    
Richard Earnshaw (lists) wrote:

> --- a/gcc/config/arm/aarch-common.c
> +++ b/gcc/config/arm/aarch-common.c
> @@ -254,12 +254,7 @@ arm_no_early_alu_shift_dep (rtx producer, rtx consumer)
>  return 0;
>  
>    if ((early_op = arm_find_shift_sub_rtx (op)))
> -    {
> -  if (REG_P (early_op))
> - early_op = op;
> -
> -  return !reg_overlap_mentioned_p (value, early_op);
> -    }
> +    return !reg_overlap_mentioned_p (value, early_op);
>  
>    return 0;
>  }

> This function is used by several aarch32 pipeline description models.
> What testing have you given it there.  Are the changes appropriate for
> those cores as well?

arm_find_shift_sub_rtx can only ever return NULL_RTX or a shift rtx, so the
check for REG_P is dead code. Bootstrap passes on ARM too of course.

Wilco

Re: [PATCH][ARM] Update max_cond_insns settings

2017-06-13 Thread Wilco Dijkstra

ping
    
Richard Earnshaw (lists) wrote:
> On 05/05/17 13:42, Wilco Dijkstra wrote:
>> Richard Earnshaw (lists) wrote:
>>> On 04/05/17 18:38, Wilco Dijkstra wrote:
>>> > Richard Earnshaw wrote:
>>> > 
>>>>> -  5, /* Max cond insns.  */
>>>>> +  2, /* Max cond insns.  */
>>>>
>>>>> This parameter is also used for A32 code.  Is that really the right
>>>>> number there as well?
>>>>
>>>> Yes, this parameter has always been the same for ARM and Thumb-2.
>>>
>>> I know that.  I'm questioning whether that number (2) is right when on
>>> ARM.  It seems very low to me, especially when branches are unpredictable.
>> 
>> Why does it seem low? Benchmarking showed 2 was the best value for modern
>> cores. The same branch predictor is used, so the same settings should be
>> used for ARM and Thumb-2.
>
> Thumb2 code has to execute an additional instruction to start an IT
> sequence.  It might therefore seem reasonable for the ARM sequence to be
> one instruction longer.

The IT instruction has no inputs/outputs and thus behaves like a NOP - unlike
conditional instructions which have real latencies and additional dependencies
due to being conditional. So the overhead of IT itself is small.

Wilco

Re: [PATCH] [AArch64] PR target/71663 Improve Vector Initializtion

2017-06-13 Thread James Greenhalgh

On Tue, Jun 13, 2017 at 10:24:59AM +0000, Hurugalawadi, Naveen wrote:
> Hi James,
> 
> Thanks for your review and useful comments.
> 
> >> If you could try to keep one reply chain for each patch series
> Will keep that in mind for sure :-)
> 
> >> Very minor, but what is wrong with:
> >> int matches[16][2] = {0};
> Done.
> 
> >> nummatches is unused.
> Removed.
> 
> >> This search algorithm is tough to follow
> Updated as per your comments.
> 
> >> Put braces round this and write it as two statements
> Done.
> 
> >> Move your new code above the part-variable case.
> Done.
> 
> >> c is unused.
> Removed.
> 
> Bootstrapped and Regression tested on aarch64-thunder-linux.
> 
> Please review the patch and let us know if any comments or suggestions.

Almost OK. Could you make the testcase a bit more comprehensive? Test
that the right thing happens with multiple duplicates, that the right thing
happens when there are no duplicates, etc. At the moment the test does not
provide good coverage of the cases your code handles.

With a fuller testcase this will likely be OK, but please repost the patch
for another review.

Thanks,
James

> diff --git a/gcc/testsuite/gcc.target/aarch64/pr71663.c b/gcc/testsuite/gcc.target/aarch64/pr71663.c
> new file mode 100644
> index 0000000..65f368d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr71663.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +#define vector __attribute__((vector_size(16)))
> +
> +vector float combine (float a, float b, float d)
> +{
> +  return (vector float) { a, b, a, d };
> +}
> +
> +/* { dg-final { scan-assembler-not "movi\t" } } */
> +/* { dg-final { scan-assembler-not "orr\t" } } */
> +/* { dg-final { scan-assembler-times "ins\t" 2 } } */
> +/* { dg-final { scan-assembler-times "dup\t" 1 } } */

Re: [PATCH v2] Implement no_sanitize function attribute

2017-06-13 Thread Richard Biener

On Tue, Jun 13, 2017 at 1:09 PM, Martin Liška  wrote:
> On 06/09/2017 03:35 PM, Richard Biener wrote:
>> You can directly transform to no_sanitize with integer mask, not sure why
>> you'd need an intermediate step with a string?
>
> Hello.
>
> Done in attached patch, I'm sending both incremental and final version 
> (complete patch).
> I also decided to support no_sanitize attribute in pretty printer:
>
> __attribute__((no_sanitize (address | shift | shift-base | shift-exponent | 
> integer-divide-by-zero | undefined | unreachable | vla-bound | return | null 
> | signed-integer-overflow | bool | enum | float-divide-by-zero | 
> float-cast-overflow | bounds | bounds-strict | alignment | nonnull-attribute 
> | returns-nonnull-attribute | object-size | vptr)))
> fn1 ()
> {
>   char my_char[9];
>   char * ptr2;
>   char * ptr;
> ..
>
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed?


+unsigned int
+parse_no_sanitize_attribute (char *value, char **wrong_argument)
+{

functions need a comment.

Otherwise looks ok to me.

Thanks,
Richard.

> Martin

Re: C/C++ PATCH to implement -Wmultistatement-macros (PR c/80116)

2017-06-13 Thread Marek Polacek

On Sat, Jun 10, 2017 at 12:03:18AM +0200, Gerald Pfeifer wrote:
> On Thu, 8 Jun 2017, David Malcolm wrote:
> > How about:
> > 
> > "Warn about unsafe multiple statement macros that appear to be guarded
> > by a clause such as if, else, while, or for, in which only the first
> > statement is actually guarded after the macro is expanded."
> > 
> > or somesuch?
> 
> Yes, I like this.
> 
> On Thu, 8 Jun 2017, Martin Sebor wrote:
> > I don't have strong feelings about the current wording but if it
> > should be tweaked for accuracy I would suggest to use the formal
> > term "controlling expression", similarly to -Wswitch-unreachable.
> 
> That sounds good to me.
> 
> Some comments on the original patch:
> 
>   +Warn about macros expanding to multiple statements in a body of a 
> conditional,
>   +such as @code{if}, @code{else}, @code{for}, or @code{while}.
> 
> "in the body of a $WHATEVER_WE_SHALL_CALL_IT"
 
It now says something other than that, so that mistake is not there anymore.

>   +The can usually be fixed by wrapping the macro in a do-while loop:
> 
> Is there a particular reason for not using an if(1) { } statement?
> 
> Ah, of course, a following else statement would be impacted by that.
> Do we want to note that in the documentation?

I don't think so, we only suggest do {} while (0);.
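
To make the problem concrete, here is a small sketch of the kind of code the
warning is aimed at, together with the do {} while (0) fix.  The macro and
function names (SWAP_UNSAFE, SWAP_SAFE, swap_if_unsafe, swap_if_safe) are made
up for illustration and do not come from the patch or its testsuite.

```c
/* Hypothetical multi-statement macro: only the first statement ends up
   guarded by a preceding "if".  */
#define SWAP_UNSAFE(a, b) tmp = (a); (a) = (b); (b) = tmp

/* The same macro wrapped in do {} while (0), so it expands to a single
   statement.  */
#define SWAP_SAFE(a, b) do { tmp = (a); (a) = (b); (b) = tmp; } while (0)

static void
swap_if_unsafe (int cond, int *x, int *y)
{
  int tmp = 0;
  /* Expands to: if (cond) tmp = *x; *x = *y; *y = tmp;
     so the last two assignments always run.  */
  if (cond)
    SWAP_UNSAFE (*x, *y);
}

static void
swap_if_safe (int cond, int *x, int *y)
{
  int tmp = 0;
  /* The whole swap is guarded by the condition.  */
  if (cond)
    SWAP_SAFE (*x, *y);
}
```

Calling swap_if_unsafe with a false condition still overwrites both values,
which is the situation the new warning is meant to flag; swap_if_safe leaves
them untouched.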

>   +This warning is enabled by @option{-Wall} in C and C++.
> 
> "for C and C++" instead of "in"?

Other parts of invoke.texi use both, so I left that as it was.  I think
both work here.

> I'm curious to see how many issues this is going to find in real-world
> code out there!

Yeah, me too.

Marek

libgo patch committed: Don't always show frames with no function in traceback

2017-06-13 Thread Ian Lance Taylor

In a Go traceback, if there is no function name, that traceback entry
is generally uninformative.  In earlier versions we did not show such
frames.  This patch restores that behavior.  These frames can be seen
with GOTRACEBACK=system.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===================================================================
--- gcc/go/gofrontend/MERGE (revision 249143)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-3b44ad058abda0d1b0b6c928987270da50ab7431
+c4ecdd3edb9febe72b5527481ae3d7310105ca67
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/traceback_gccgo.go
===================================================================
--- libgo/go/runtime/traceback_gccgo.go (revision 249125)
+++ libgo/go/runtime/traceback_gccgo.go (working copy)
@@ -94,7 +94,7 @@ func showframe(name string, gp *g) bool
// We want to print those in the traceback.
// But unless GOTRACEBACK > 1 (checked below), still skip
// internal C functions and cgo-generated functions.
-   if !contains(name, ".") && !hasprefix(name, "__go_") && !hasprefix(name, "_cgo_") {
+   if name != "" && !contains(name, ".") && !hasprefix(name, "__go_") && !hasprefix(name, "_cgo_") {
return true
}

Re: [PATCH 01/30] [arm] Use strings for -march, -mcpu and -mtune options

2017-06-13 Thread Christophe Lyon

On 9 June 2017 at 14:53, Richard Earnshaw  wrote:
>
> In order to support more complex specifications for cpus and architectures
> we need to move away from using enumerations to represent the set of
> permitted options.  This basic change just moves the option parsing
> infrastructure over to that, but changes nothing more beyond generating
> a hint when the specified option does not match a known target (previously
> the help option was able to print out all the permitted values, but we
> can no longer do that).
>
> * config/arm/arm.opt (x_arm_arch_string): New TargetSave option.
> (x_arm_cpu_string, x_arm_tune_string): Likewise.
> (march, mcpu, mtune): Convert to string-based options.
> * config/arm/arm.c (arm_print_hint_for_core_or_arch): New function.
> (arm_parse_arch_cpu_name): New function.
> (arm_configure_build_target): Use arm_parse_arch_cpu_name to
> identify selected architecture or CPU.
> (arm_option_save): New function.
> (TARGET_OPTION_SAVE): Redefine.
> (arm_option_restore): Restore string options.
> (arm_option_print): Print string options.
> ---
>  gcc/config/arm/arm.c   | 92 
> --
>  gcc/config/arm/arm.opt | 15 ++--
>  2 files changed, 94 insertions(+), 13 deletions(-)
>


I've noticed a typo (premitted):
+/* List the premitted CPU or architecture names.  If TARGET is a near

Re: [PATCH 9/9] rs6000: Comment fixes + some leftovers

2017-06-13 Thread David Edelsohn

On Tue, Jun 13, 2017 at 8:53 AM, Segher Boessenkool
 wrote:
> 2017-06-13  Segher Boessenkool  
>
> * config/rs6000/rs6000.c: Update all comments that mentioned SPE.
> (rs6000_expand_builtin): Remove RS6000_BTC_EVSEL.
> * config/rs6000/rs6000.h (RS6000_BTC_EVSEL): Delete.
> * config/rs6000/vxworks.h (VXCPU_FOR_8548): Delete.  Adjust former 
> use.
> * config/rs6000/vxworksae.h (VXCPU_FOR_8548): Delete.
> * config/rs6000/vxworksmils.h (VXCPU_FOR_8548): Delete.

Okay.

Thanks, David

Re: [PATCH 7/9] rs6000: Remove FIXED_SCRATCH

2017-06-13 Thread David Edelsohn

On Tue, Jun 13, 2017 at 8:53 AM, Segher Boessenkool
 wrote:
> 2017-06-13  Segher Boessenkool  
>
> * config/rs6000/rs6000.h (FIXED_SCRATCH): Delete.

Okay.

Thanks, David

Re: [PATCH 8/9] rs6000: Remove VECTOR_SPE

2017-06-13 Thread David Edelsohn

On Tue, Jun 13, 2017 at 8:53 AM, Segher Boessenkool
 wrote:
> 2017-06-13  Segher Boessenkool  
>
> * config/rs6000/rs6000-opts.h (enum rs6000_vector): Delete VECTOR_SPE.
> * config/rs6000/rs6000.c (rs6000_debug_vector_unit): Delete 
> VECTOR_SPE.

Okay.

Thanks, David

Re: [PATCH 6/9] rs6000: Updates to t-rtems

2017-06-13 Thread David Edelsohn

On Tue, Jun 13, 2017 at 8:53 AM, Segher Boessenkool
 wrote:
> 2017-06-13  Segher Boessenkool  
>
> * config/rs6000/t-rtems: Don't handle SPE.

Okay.

Thanks, David

Re: [PATCH 5/9] rs6000: Updates to t-linux

2017-06-13 Thread David Edelsohn

On Tue, Jun 13, 2017 at 8:53 AM, Segher Boessenkool
 wrote:
> 2017-06-13  Segher Boessenkool  
>
> * config/rs6000/t-linux: Don't handle SPE.

Okay.

Thanks, David

Re: [PATCH 4/9] rs6000: Remove eabispe.h

2017-06-13 Thread David Edelsohn

On Tue, Jun 13, 2017 at 8:53 AM, Segher Boessenkool
 wrote:
> 2017-06-13  Segher Boessenkool  
>
> * config/rs6000/eabispe.h: Delete file.

Okay.

Thanks, David

Re: [PATCH 3/9] rs6000: Remove t-spe

2017-06-13 Thread David Edelsohn

On Tue, Jun 13, 2017 at 8:53 AM, Segher Boessenkool
 wrote:
> 2017-06-13  Segher Boessenkool  
>
> * config/rs6000/t-spe: Delete file.

Okay.

Thanks, David

Re: [PATCH][RFC] Canonize names of attributes.

2017-06-13 Thread Richard Biener

On Tue, Jun 13, 2017 at 2:32 PM, Martin Liška  wrote:
> Hello.
>
> After some discussions with Richi, I would like to propose patch that will
> come up with a canonical name of attribute names. That means 
> __attribute__((__abi_tag__))
> will be given 'abi_tag' as IDENTIFIER_NAME of the attribute. The change can 
> improve
> attribute name lookup and we can delete all the ugly code that compares 
> strlen(i1)
> == strlen(i2) + 4, etc.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests (w/ 
> default
> languages). I'm currently testing objc, obj-c++ and go.
>
> Ready to be installed?


+tree
+canonize_attr_name (tree attr_name)
+{

needs a comment.

+  if (l > 4 && s[0] == '_')
+{
+  gcc_assert (s[1] == '_');
+  gcc_assert (s[l - 2] == '_');
+  gcc_assert (s[l - 1] == '_');
+  return get_identifier_with_length (s + 2, l - 4);
+}

a single gcc_checking_assert please.  I think this belongs in attribs.[ch].
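
For readers following along, the canonicalization being discussed amounts to
stripping a leading and trailing "__" from the attribute name.  A standalone
sketch on plain strings (not on GCC identifier trees) might look as follows;
strip_attr_underscores is a hypothetical name, and unlike the quoted hunk it
checks all four underscores up front instead of asserting on them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical string-level model of canonize_attr_name.  S points to a
   name of *LEN characters.  If the name has the form __name__, return a
   pointer to the inner name and update *LEN; otherwise return S with
   *LEN unchanged.  */
static const char *
strip_attr_underscores (const char *s, size_t *len)
{
  size_t l = *len;

  if (l > 4
      && s[0] == '_' && s[1] == '_'
      && s[l - 2] == '_' && s[l - 1] == '_')
    {
      *len = l - 4;
      return s + 2;
    }

  return s;
}
```

With this sketch "__abi_tag__" maps to the 7 characters "abi_tag", while a
name such as "noreturn" is returned unchanged.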

Seeing

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index e1c8bdff986..6d0e9279ed6 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -316,6 +316,7 @@ c_common_has_attribute (cpp_reader *pfile)
 {
   attr_name = get_identifier ((const char *)
  cpp_token_as_text (pfile, token));
+  attr_name = canonize_attr_name (attr_name);

I wondered if we can save allocating the non-canonical identifier.  Like
with

tree
canonize_attr_name (const char *attr_name, size_t len)

as we can pass it IDENTIFIER_POINTER/LENGTH or the token.  OTOH
all other cases do have IDENTIFIERs already...

@@ -24638,6 +24639,11 @@ cp_parser_gnu_attribute_list (cp_parser* parser)
  else
{
  arguments = build_tree_list_vec (vec);
+ tree tv;
+ if (arguments != NULL_TREE
+ && ((tv = TREE_VALUE (arguments)) != NULL_TREE)
+ && TREE_CODE (tv) == IDENTIFIER_NODE)
+ TREE_VALUE (arguments) = canonize_attr_name (tv);
  release_tree_vector (vec);
}

are you sure this is needed?  This seems to be solely arguments to
attributes.

The rest of the changes look good but please wait for input from FE maintainers.

Thanks,
Richard.

> Martin
>
>
> gcc/cp/ChangeLog:
>
> 2017-06-09  Martin Liska  
>
> * parser.c (cp_parser_gnu_attribute_list): Canonize attribute
> names.
> (cp_parser_std_attribute): Likewise.
>
> gcc/go/ChangeLog:
>
> 2017-06-09  Martin Liska  
>
> * go-gcc.cc (Gcc_backend::function): Use no_split_stack
> instead of __no_split_stack__.
>
> gcc/c/ChangeLog:
>
> 2017-06-09  Martin Liska  
>
> * c-parser.c (c_parser_attributes): Canonize attribute names.
>
> gcc/c-family/ChangeLog:
>
> 2017-06-09  Martin Liska  
>
> * c-format.c (cmp_attribs): Simplify comparison of attributes.
> * c-lex.c (c_common_has_attribute): Canonize attribute names.
>
> gcc/ChangeLog:
>
> 2017-06-09  Martin Liska  
>
> * tree.c (cmp_attrib_identifiers): Simplify comparison of attributes.
> (private_is_attribute_p): Likewise.
> (private_lookup_attribute): Likewise.
> (private_lookup_attribute_by_prefix): Likewise.
> (remove_attribute): Likewise.
> (canonize_attr_name): New function.
> * tree.h: Declared here.
>
> gcc/testsuite/ChangeLog:
>
> 2017-06-09  Martin Liska  
>
> * g++.dg/cpp0x/pr65558.C: Change expected warning.
> * gcc.dg/parm-impl-decl-1.c: Likewise.
> * gcc.dg/parm-impl-decl-3.c: Likewise.
> ---
>  gcc/c-family/c-format.c |  13 ++--
>  gcc/c-family/c-lex.c|   1 +
>  gcc/c/c-parser.c|   9 +++
>  gcc/cp/parser.c |  11 +++-
>  gcc/go/go-gcc.cc|   2 +-
>  gcc/testsuite/g++.dg/cpp0x/pr65558.C|   2 +-
>  gcc/testsuite/gcc.dg/parm-impl-decl-1.c |   2 +-
>  gcc/testsuite/gcc.dg/parm-impl-decl-3.c |   2 +-
>  gcc/tree.c  | 108 
> +++-
>  gcc/tree.h  |   4 ++
>  10 files changed, 69 insertions(+), 85 deletions(-)
>
>

Re: [PATCH 2/9] rs6000: Remove SPE_CONST_OFFSET_OK

2017-06-13 Thread David Edelsohn

On Tue, Jun 13, 2017 at 8:53 AM, Segher Boessenkool
 wrote:
> 2017-06-13  Segher Boessenkool  
>
> * config/rs6000/rs6000.c (SPE_CONST_OFFSET_OK): Delete.
> (rs6000_legitimate_offset_address_p): Return false for anything in
> V2SImode or V2SFmode.

Okay.

Thanks, David

Re: [PATCH 1/9] rs6000: Sanitize vector modes

2017-06-13 Thread David Edelsohn

On Tue, Jun 13, 2017 at 8:53 AM, Segher Boessenkool
 wrote:
> This removes the vector modes that were only used by SPE.  It also
> rearranges things so it is easier to see what is there, and for what.
>
>
> 2017-06-13  Segher Boessenkool  
>
> * config/rs6000/rs6000-modes.def: Remove all 8-byte vector modes
> except V2SF and V2SI.  Rearrange the vector modes, and add comments.
> * config/rs6000/rs6000.c (rs6000_debug_reg_global): Remove V8QImode
> and V4HImode.
> (reg_offset_addressing_ok_p): Remove V4HImode and V1DImode.
> (rs6000_legitimate_offset_address_p): Ditto.
> (rs6000_emit_move): Ditto.
> (rs6000_init_builtins): Remove V4HI_type_node.

Okay.

Thanks, David

Re: [PING^3][RFC, PATCH][ASAN] Implement dynamic allocas/VLAs sanitization.

2017-06-13 Thread Jakub Jelinek

On Tue, Jun 13, 2017 at 03:11:41PM +0300, Maxim Ostapenko wrote:
> @@ -531,11 +533,166 @@ get_mem_ref_of_assignment (const gassign *assignment,
>return true;
>  }
>  
> +/* Return address of last allocated dynamic alloca.  */
> +
> +static tree
> +get_last_alloca_addr ()
> +{
> +  if (last_alloca_addr)
> +return last_alloca_addr;
> +
> +  gimple_seq seq = NULL;
> +  gassign *g;
> +
> +  last_alloca_addr = create_tmp_reg (ptr_type_node, "last_alloca_addr");
> +/* Insert __asan_allocas_unpoison(top, bottom) call after
> +   __builtin_stackrestore(new_sp) call.

s/stackrestore/stack_restore/, that is how the builtin is called, right?
Also, please put a space before ( even in comments.

> +static void
> +handle_builtin_stackrestore (gcall *call, gimple_stmt_iterator *iter)

Again, stack_restore

> +  bool alloca_with_align
> += (DECL_FUNCTION_CODE (callee) == BUILT_IN_ALLOCA_WITH_ALIGN);

Unnecessary ()s around the comparison?

> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -2241,6 +2241,10 @@ expand_used_vars (void)
>expand_stack_vars (NULL, );
>  }
>  
> +  if ((flag_sanitize & SANITIZE_ADDRESS) && cfun->calls_alloca)
> +var_end_seq = asan_emit_allocas_unpoison (virtual_stack_dynamic_rtx,
> +   virtual_stack_vars_rtx);
> +

Doesn't this mean the old var_end_seq is lost because of this
(in functions that call alloca, but also have addressable variables we
asan instrument)?
I'd think you need to append the sequences, or call
asan_emit_allocas_unpoison with the var_end_seq as argument and insert
it into the new sequence.

Jakub

fix libcc1 dependencies in toplevel Makefile

2017-06-13 Thread Olivier Hainque

Hello,

During highly parallel builds on fast hosts, we have experienced
sporadic bootstrap failures on libquadmath like

  In file included from ../../../src/libquadmath/printf/printf_fp.c:39:0:
  ../../../src/libquadmath/printf/quadmath-printf.h:24:20: fatal error: .../build/./gcc/include-fixed/limits.h: No such file or directory
  #include 

A pretty clear sign of a race condition caused by some inaccuracy in the
dependency statements.

Investigation led us to suspect this piece in the toplevel Makefile.in:

  all-libcc1: maybe-all-gcc

which differs from all the other dependencies on maybe-all-gcc in that it's
unconditional whereas the other ones are conditioned on @if gcc-no-bootstrap.

(Thanks to Nico Roche, cc'ed, for the work involved in finding this out)

Our understanding is that it's incorrect to have dependencies on maybe-all-gcc
in the bootstrap case; that this should be a dependency on stage_current
instead.
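
In Makefile.in terms, the shape we are after for libcc1 is roughly the
following. This is a sketch reconstructed from the description and the
ChangeLog in this message, not a copy of the attached diff, so the exact
text emitted by Makefile.tpl may differ:

```make
@if gcc-bootstrap
all-libcc1: stage_current
@endif gcc-bootstrap
@if gcc-no-bootstrap
all-libcc1: maybe-all-gcc
@endif gcc-no-bootstrap
```

The point is only that the dependency on maybe-all-gcc stops being
unconditional.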

This patch is a proposal to address this by first removing the following
statement in Makefile.def:

  dependencies = { module=all-libcc1; on=all-gcc; };

(which emits the dependency unconditionally), then refining the expansion
of "all" targets in Makefile.tpl so they include a possible dep conditioned by
gcc-no-bootstrap, on demand for "host_module"s that ask for it by way of a new
"depgcc" parameter.

We have been using this in-house for months now. The sporadic failures
have disappeared since then and we haven't observed any related fallout
so far.

Bootstrapped and regression tested on x86_64-linux.

OK to commit ?

Thanks in advance for your feedback,

With Kind Regards,

Olivier

2017-06-13  Olivier Hainque  

* Makefile.def (host_modules): Set depgcc to true for libcc1,
meaning need of a dep on stage_current if gcc-bootstrap and on
maybe-all-gcc otherwise.
(dependencies): Remove unconditional dependency on all-gcc.

* Makefile.tpl ("all" targets): Handle depgcc.
* Makefile.in: Regenerate.
 


libcc1-deps.diff
Description: Binary data

[PATCH 9/9] rs6000: Comment fixes + some leftovers

2017-06-13 Thread Segher Boessenkool

2017-06-13  Segher Boessenkool  

* config/rs6000/rs6000.c: Update all comments that mentioned SPE.
(rs6000_expand_builtin): Remove RS6000_BTC_EVSEL.
* config/rs6000/rs6000.h (RS6000_BTC_EVSEL): Delete.
* config/rs6000/vxworks.h (VXCPU_FOR_8548): Delete.  Adjust former use.
* config/rs6000/vxworksae.h (VXCPU_FOR_8548): Delete.
* config/rs6000/vxworksmils.h (VXCPU_FOR_8548): Delete.

---
 gcc/config/rs6000/rs6000.c  | 79 -
 gcc/config/rs6000/rs6000.h  |  5 ++-
 gcc/config/rs6000/vxworks.h |  8 +
 gcc/config/rs6000/vxworksae.h   |  4 ---
 gcc/config/rs6000/vxworksmils.h |  4 ---
 5 files changed, 33 insertions(+), 67 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 58ef789..6b28658 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2014,10 +2014,6 @@ rs6000_cpu_name_lookup (const char *name)
This is ordinarily the length in words of a value of mode MODE
but can be less for certain modes in special long registers.
 
-   For the SPE, GPRs are 64 bits but only 32 bits are visible in
-   scalar instructions.  The upper 32 bits are only available to the
-   SIMD instructions.
-
POWER and PowerPC GPRs hold 32 bits worth;
PowerPC64 GPRs and FPRs point register holds 64 bits worth.  */
 
@@ -2901,9 +2897,7 @@ rs6000_setup_reg_addr_masks (void)
addr_mask |= RELOAD_REG_INDEXED;
 
  /* Figure out if we can do PRE_INC, PRE_DEC, or PRE_MODIFY
-addressing.  Restrict addressing on SPE for 64-bit types
-because of the SUBREG hackery used to address 64-bit floats in
-'32-bit' GPRs.  If we allow scalars into Altivec registers,
+addressing.  If we allow scalars into Altivec registers,
 don't allow PRE_INC, PRE_DEC, or PRE_MODIFY.  */
 
  if (TARGET_UPDATE
@@ -3171,7 +3165,7 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
   rs6000_vector_align[TImode] = align64;
 }
 
-  /* TODO add SPE and paired floating point vector support.  */
+  /* TODO add paired floating point vector support.  */
 
   /* Register class constraints for the constraints that depend on compile
  switches. When the VSX code was added, different constraints were added
@@ -3827,8 +3821,7 @@ darwin_rs6000_override_options (void)
 
 /* Return the builtin mask of the various options used that could affect which
builtins were used.  In the past we used target_flags, but we've run out of
-   bits, and some options like SPE and PAIRED are no longer in
-   target_flags.  */
+   bits, and some options like PAIRED are no longer in target_flags.  */
 
 HOST_WIDE_INT
 rs6000_builtin_mask_calculate (void)
@@ -5479,8 +5472,7 @@ rs6000_option_override_internal (bool global_init_p)
 
   /* Set the builtin mask of the various options used that could affect which
  builtins were used.  In the past we used target_flags, but we've run out
- of bits, and some options like SPE and PAIRED are no longer in
- target_flags.  */
+ of bits, and some options like PAIRED are no longer in target_flags.  */
   rs6000_builtin_mask = rs6000_builtin_mask_calculate ();
   if (TARGET_DEBUG_BUILTIN || TARGET_DEBUG_TARGET)
 rs6000_print_builtin_options (stderr, 0, "builtin mask",
@@ -11767,7 +11759,6 @@ function_arg_padding (machine_mode mode, const_tree type)
However, we're stuck with this because changing the ABI might break
existing library interfaces.
 
-   Doubleword align SPE vectors.
Quadword align Altivec/VSX vectors.
Quadword align large synthetic vector types.   */
 
@@ -12188,18 +12179,17 @@ rs6000_function_arg_advance_1 (CUMULATIVE_ARGS *cum, machine_mode mode,
  int n_words = rs6000_arg_size (mode, type);
  int gregno = cum->sysv_gregno;
 
- /* Long long and SPE vectors are put in (r3,r4), (r5,r6),
-(r7,r8) or (r9,r10).  As does any other 2 word item such
-as complex int due to a historical mistake.  */
+ /* Long long is put in (r3,r4), (r5,r6), (r7,r8) or (r9,r10).
+As does any other 2 word item such as complex int due to a
+historical mistake.  */
  if (n_words == 2)
gregno += (1 - gregno) & 1;
 
  /* Multi-reg args are not split between registers and stack.  */
  if (gregno + n_words - 1 > GP_ARG_MAX_REG)
{
- /* Long long and SPE vectors are aligned on the stack.
-So are other 2 word items such as complex int due to
-a historical mistake.  */
+ /* Long long is aligned on the stack.  So are other 2 word
+items such as complex int due to a historical mistake.  */
  if (n_words == 2)
cum->words += cum->words & 1;
  cum->words += n_words;
@@ -12736,9 +12726,9 @@
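
A note on the register-pair comments in the rs6000_function_arg_advance_1
hunk above: the two alignment expressions are easy to misread, so here is a
minimal standalone sketch of the arithmetic. The function names are made up
for illustration and are not part of rs6000.c; the expressions are taken
from the hunk's context lines, and a two's-complement int is assumed.

```c
/* Hypothetical helper (not in rs6000.c): round a GPR number up to the
   next odd register, so that a 2-word argument starts at r3, r5, r7
   or r9.  Same expression as "gregno += (1 - gregno) & 1".  */
static int
align_gregno_for_pair (int gregno)
{
  return gregno + ((1 - gregno) & 1);
}

/* Hypothetical helper: round a stack word count up to the next even
   value.  Same expression as "cum->words += cum->words & 1".  */
static int
align_words_for_pair (int words)
{
  return words + (words & 1);
}
```

With these, r3 stays r3 while r4 moves up to r5, and a word count of 5
becomes 6 while 6 is left alone.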

[PATCH 7/9] rs6000: Remove FIXED_SCRATCH

2017-06-13 Thread Segher Boessenkool

2017-06-13  Segher Boessenkool  

* config/rs6000/rs6000.h (FIXED_SCRATCH): Delete.

---
 gcc/config/rs6000/rs6000.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index edfa546..e8305aa 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1330,13 +1330,6 @@ enum data_align { align_abi, align_opt, align_both };
 
 #define LOGICAL_OP_NON_SHORT_CIRCUIT 0
 
-/* A fixed register used at epilogue generation to address SPE registers
-   with negative offsets.  The 64-bit load/store instructions on the SPE
-   only take positive offsets (and small ones at that), so we need to
-   reserve a register for consing up negative offsets.  */
-
-#define FIXED_SCRATCH 0
-
 /* Specify the registers used for certain standard purposes.
The values of these macros are register numbers.  */
 
-- 
1.9.3

[PATCH 8/9] rs6000: Remove VECTOR_SPE

2017-06-13 Thread Segher Boessenkool

2017-06-13  Segher Boessenkool  

* config/rs6000/rs6000-opts.h (enum rs6000_vector): Delete VECTOR_SPE.
* config/rs6000/rs6000.c (rs6000_debug_vector_unit): Delete VECTOR_SPE.

---
 gcc/config/rs6000/rs6000-opts.h | 1 -
 gcc/config/rs6000/rs6000.c  | 1 -
 2 files changed, 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-opts.h b/gcc/config/rs6000/rs6000-opts.h
index 086217a..6dffe8d 100644
--- a/gcc/config/rs6000/rs6000-opts.h
+++ b/gcc/config/rs6000/rs6000-opts.h
@@ -150,7 +150,6 @@ enum rs6000_vector {
   VECTOR_VSX,  /* Use VSX for vector processing */
   VECTOR_P8_VECTOR,/* Use ISA 2.07 VSX for vector processing */
   VECTOR_PAIRED,   /* Use paired floating point for vectors */
-  VECTOR_SPE,  /* Use SPE for vector processing */
   VECTOR_OTHER /* Some other vector unit */
 };
 
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index a1005c0..58ef789 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2236,7 +2236,6 @@ rs6000_debug_vector_unit (enum rs6000_vector v)
 case VECTOR_VSX:  ret = "vsx";   break;
 case VECTOR_P8_VECTOR: ret = "p8_vector"; break;
 case VECTOR_PAIRED:   ret = "paired";break;
-case VECTOR_SPE:  ret = "spe";   break;
 case VECTOR_OTHER:ret = "other"; break;
 default:  ret = "unknown";   break;
 }
-- 
1.9.3

[PATCH 6/9] rs6000: Updates to t-rtems

2017-06-13 Thread Segher Boessenkool

2017-06-13  Segher Boessenkool  

* config/rs6000/t-rtems: Don't handle SPE.

---
 gcc/config/rs6000/t-rtems | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/t-rtems b/gcc/config/rs6000/t-rtems
index 723c6a3..8290f5c 100644
--- a/gcc/config/rs6000/t-rtems
+++ b/gcc/config/rs6000/t-rtems
@@ -33,8 +33,8 @@ MULTILIB_DIRNAMES += m32
 MULTILIB_OPTIONS += msoft-float
 MULTILIB_DIRNAMES += nof
 
-MULTILIB_OPTIONS += mno-spe/mno-altivec
-MULTILIB_DIRNAMES += nospe noaltivec
+MULTILIB_OPTIONS += mno-altivec
+MULTILIB_DIRNAMES += noaltivec
 
 MULTILIB_MATCHES   += ${MULTILIB_MATCHES_ENDIAN}
 MULTILIB_MATCHES   += ${MULTILIB_MATCHES_SYSV}
@@ -68,7 +68,7 @@ MULTILIB_REQUIRED += mcpu=604/msoft-float
 MULTILIB_REQUIRED += mcpu=7400
 MULTILIB_REQUIRED += mcpu=7400/msoft-float
 MULTILIB_REQUIRED += mcpu=8540
-MULTILIB_REQUIRED += mcpu=8540/msoft-float/mno-spe
+MULTILIB_REQUIRED += mcpu=8540/msoft-float
 MULTILIB_REQUIRED += mcpu=860
 MULTILIB_REQUIRED += mcpu=e6500/m32
 MULTILIB_REQUIRED += mcpu=e6500/m32/msoft-float/mno-altivec
-- 
1.9.3

[PATCH 5/9] rs6000: Updates to t-linux

2017-06-13 Thread Segher Boessenkool

2017-06-13  Segher Boessenkool  

* config/rs6000/t-linux: Don't handle SPE.

---
 gcc/config/rs6000/t-linux | 4 
 1 file changed, 4 deletions(-)

diff --git a/gcc/config/rs6000/t-linux b/gcc/config/rs6000/t-linux
index 4cb63bd..acfde1f 100644
--- a/gcc/config/rs6000/t-linux
+++ b/gcc/config/rs6000/t-linux
@@ -4,12 +4,8 @@ ifeq (,$(filter $(with_cpu),$(SOFT_FLOAT_CPUS))$(findstring soft,$(with_float)))
 ifneq (,$(findstring powerpc64,$(target)))
 MULTILIB_OSDIRNAMES := .=../lib64$(call if_multiarch,:powerpc64-linux-gnu)
 else
-ifneq (,$(findstring spe,$(target)))
-MULTIARCH_DIRNAME := powerpc-linux-gnuspe$(if $(findstring 8548,$(with_cpu)),,v1)
-else
 MULTIARCH_DIRNAME := powerpc-linux-gnu
 endif
-endif
 ifneq (,$(findstring powerpcle,$(target)))
 MULTIARCH_DIRNAME := $(subst -linux,le-linux,$(MULTIARCH_DIRNAME))
 endif
-- 
1.9.3

[PATCH 4/9] rs6000: Remove eabispe.h

2017-06-13 Thread Segher Boessenkool

2017-06-13  Segher Boessenkool  

* config/rs6000/eabispe.h: Delete file.

---
 gcc/config/rs6000/eabispe.h | 26 --
 1 file changed, 26 deletions(-)
 delete mode 100644 gcc/config/rs6000/eabispe.h

diff --git a/gcc/config/rs6000/eabispe.h b/gcc/config/rs6000/eabispe.h
deleted file mode 100644
index db8030a..000
--- a/gcc/config/rs6000/eabispe.h
+++ /dev/null
@@ -1,26 +0,0 @@
-/* Core target definitions for GNU compiler
-   for PowerPC embedded targeted systems with SPE support.
-   Copyright (C) 2002-2017 Free Software Foundation, Inc.
-   Contributed by Aldy Hernandez (al...@redhat.com).
-
-   This file is part of GCC.
-
-   GCC is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published
-   by the Free Software Foundation; either version 3, or (at your
-   option) any later version.
-
-   GCC is distributed in the hope that it will be useful, but WITHOUT
-   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
-   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
-   License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with GCC; see the file COPYING3.  If not see
-   .  */
-
-#undef  TARGET_DEFAULT
-#define TARGET_DEFAULT (MASK_STRICT_ALIGN | MASK_EABI)
-
-#undef  ASM_DEFAULT_SPEC
-#define ASM_DEFAULT_SPEC "-mppc -mspe -me500"
-- 
1.9.3

[PATCH 3/9] rs6000: Remove t-spe

2017-06-13 Thread Segher Boessenkool

2017-06-13  Segher Boessenkool  

* config/rs6000/t-spe: Delete file.

---
 gcc/config/rs6000/t-spe | 72 -
 1 file changed, 72 deletions(-)
 delete mode 100644 gcc/config/rs6000/t-spe

diff --git a/gcc/config/rs6000/t-spe b/gcc/config/rs6000/t-spe
deleted file mode 100644
index fe5de53..000
--- a/gcc/config/rs6000/t-spe
+++ /dev/null
@@ -1,72 +0,0 @@
-# Multilibs for e500
-#
-# Copyright (C) 2003-2017 Free Software Foundation, Inc.
-#
-# This file is part of GCC.
-#
-# GCC is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 3, or (at your option)
-# any later version.
-#
-# GCC is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with GCC; see the file COPYING3.  If not see
-# .
-
-# What we really want are these variants:
-#  -mcpu=7400
-#  -mcpu=7400 -maltivec -mabi=altivec
-#  -mcpu=7400 -msoft-float
-#  -msoft-float
-#  -mno-spe -mabi=no-spe
-#  -mno-spe -mabi=no-spe -mno-isel
-# so we'll need to create exceptions later below.
-
-MULTILIB_OPTIONS   = mcpu=7400 \
- maltivec \
- mabi=altivec \
- msoft-float \
- mno-spe \
- mabi=no-spe \
- mno-isel \
- mlittle
-
-MULTILIB_DIRNAMES  = mpc7400 altivec abi-altivec \
- nof no-spe no-abi-spe no-isel le
-
-MULTILIB_EXCEPTIONS= maltivec mabi=altivec mno-spe mabi=no-spe mno-isel \
- maltivec/mabi=altivec \
- mcpu=7400/maltivec \
- mcpu=7400/mabi=altivec \
- *mcpu=7400/*mno-spe* \
- *mcpu=7400/*mabi=no-spe* \
- *mcpu=7400/*mno-isel* \
- *maltivec/*msoft-float* \
- *maltivec/*mno-spe* \
- *maltivec/*mabi=no-spe* \
- *maltivec/*mno-isel* \
- *mabi=altivec/*msoft-float* \
- *mabi=altivec/*mno-spe* \
- *mabi=altivec/*mabi=no-spe* \
- *mabi=altivec/*mno-isel* \
- *msoft-float/*mno-spe* \
- *msoft-float/*mabi=no-spe* \
- *msoft-float/*mno-isel* \
- mno-spe/mno-isel \
- mabi=no-spe/mno-isel \
- mno-isel/mlittle \
- mabi=no-spe/mno-isel/mlittle \
- mno-spe/mlittle \
- mabi=spe/mlittle \
- mcpu=7400/mabi=altivec/mlittle \
- mcpu=7400/maltivec/mlittle \
- mabi=no-spe/mlittle \
- mno-spe/mno-isel/mlittle \
- mabi=altivec/mlittle \
- maltivec/mlittle \
- maltivec/mabi=altivec/mlittle
-- 
1.9.3

[PATCH 2/9] rs6000: Remove SPE_CONST_OFFSET_OK

2017-06-13 Thread Segher Boessenkool

2017-06-13  Segher Boessenkool  

* config/rs6000/rs6000.c (SPE_CONST_OFFSET_OK): Delete.
(rs6000_legitimate_offset_address_p): Return false for anything in
V2SImode or V2SFmode.

---
 gcc/config/rs6000/rs6000.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index b51ffcc..a1005c0 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -8697,9 +8697,6 @@ legitimate_small_data_p (machine_mode mode, rtx x)
  && small_data_operand (x, mode));
 }
 
-/* SPE offset addressing is limited to 5-bits worth of double words.  */
-#define SPE_CONST_OFFSET_OK(x) (((x) & ~0xf8) == 0)
-
 bool
 rs6000_legitimate_offset_address_p (machine_mode mode, rtx x,
bool strict, bool worst_case)
@@ -8728,8 +8725,8 @@ rs6000_legitimate_offset_address_p (machine_mode mode, rtx x,
 {
 case V2SImode:
 case V2SFmode:
-  /* SPE vector modes.  */
-  return SPE_CONST_OFFSET_OK (offset);
+  /* Paired single modes: offset addressing isn't valid.  */
+  return false;
 
 case DFmode:
 case DDmode:
-- 
1.9.3
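
For readers wondering what the deleted macro used to accept, here is a small
standalone sketch. It assumes nothing beyond the macro definition shown in
the hunk above; the helper name is invented for this example.

```c
/* Copied from the deleted lines in rs6000.c.  */
#define SPE_CONST_OFFSET_OK(x) (((x) & ~0xf8) == 0)

/* Hypothetical helper (illustration only): count the offsets in
   [lo, hi] that the macro accepts.  */
static int
count_spe_ok_offsets (long lo, long hi)
{
  int n = 0;
  for (long off = lo; off <= hi; off++)
    if (SPE_CONST_OFFSET_OK (off))
      n++;
  return n;
}
```

The macro accepts exactly the 32 multiples of 8 from 0 to 248, i.e. a 5-bit
count of doublewords, which matches the comment on the deleted macro.
Negative offsets and offsets that are not a multiple of 8 are rejected.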

[PATCH 1/9] rs6000: Sanitize vector modes

2017-06-13 Thread Segher Boessenkool

This removes the vector modes that were only used by SPE.  It also
rearranges things so it is easier to see what is there, and for what.


2017-06-13  Segher Boessenkool  

* config/rs6000/rs6000-modes.def: Remove all 8-byte vector modes
except V2SF and V2SI.  Rearrange the vector modes, and add comments.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Remove V8QImode
and V4HImode.
(reg_offset_addressing_ok_p): Remove V4HImode and V1DImode.
(rs6000_legitimate_offset_address_p): Ditto.
(rs6000_emit_move): Ditto.
(rs6000_init_builtins): Remove V4HI_type_node.

---
 gcc/config/rs6000/rs6000-modes.def | 15 ++-
 gcc/config/rs6000/rs6000.c | 10 --
 2 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def
index fc66fca..65f890e 100644
--- a/gcc/config/rs6000/rs6000-modes.def
+++ b/gcc/config/rs6000/rs6000-modes.def
@@ -41,15 +41,20 @@ CC_MODE (CCFP);
 CC_MODE (CCEQ);
 
 /* Vector modes.  */
-VECTOR_MODES (INT, 8);/*   V8QI  V4HI V2SI */
+
+/* VMX/VSX.  */
 VECTOR_MODES (INT, 16);   /* V16QI V8HI  V4SI V2DI */
-VECTOR_MODES (INT, 32);   /* V32QI V16HI V8SI V4DI */
-VECTOR_MODE (INT, DI, 1);
-VECTOR_MODE (INT, TI, 1);
-VECTOR_MODES (FLOAT, 8);  /* V4HF V2SF */
+VECTOR_MODE (INT, TI, 1); /*  V1TI */
 VECTOR_MODES (FLOAT, 16); /*   V8HF  V4SF V2DF */
+
+/* Two VMX/VSX vectors (for permute, select, concat, etc.)  */
+VECTOR_MODES (INT, 32);   /* V32QI V16HI V8SI V4DI */
 VECTOR_MODES (FLOAT, 32); /*   V16HF V8SF V4DF */
 
+/* Paired single.  */
+VECTOR_MODE (FLOAT, SF, 2);   /* The only valid paired-single mode.  */
+VECTOR_MODE (INT, SI, 2); /* For paired-single permutes.  */
+
 /* Replacement for TImode that only is allowed in GPRs.  We also use PTImode
for quad memory atomic operations to force getting an even/odd register
combination.  */
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8e82570..b51ffcc 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2450,8 +2450,6 @@ rs6000_debug_reg_global (void)
 SDmode,
 DDmode,
 TDmode,
-V8QImode,
-V4HImode,
 V2SImode,
 V16QImode,
 V8HImode,
@@ -8490,9 +8488,7 @@ reg_offset_addressing_ok_p (machine_mode mode)
return mode_supports_vsx_dform_quad (mode);
   break;
 
-case V4HImode:
 case V2SImode:
-case V1DImode:
 case V2SFmode:
/* Paired vector modes.  Only reg+reg addressing is valid.  */
   if (TARGET_PAIRED_FLOAT)
@@ -8730,9 +8726,7 @@ rs6000_legitimate_offset_address_p (machine_mode mode, rtx x,
   extra = 0;
   switch (mode)
 {
-case V4HImode:
 case V2SImode:
-case V1DImode:
 case V2SFmode:
   /* SPE vector modes.  */
   return SPE_CONST_OFFSET_OK (offset);
@@ -10981,10 +10975,8 @@ rs6000_emit_move (rtx dest, rtx source, machine_mode mode)
 case V8HImode:
 case V4SFmode:
 case V4SImode:
-case V4HImode:
 case V2SFmode:
 case V2SImode:
-case V1DImode:
 case V2DFmode:
 case V2DImode:
 case V1TImode:
@@ -16843,7 +16835,6 @@ rs6000_init_builtins (void)
   : "__vector long long",
   intDI_type_node, 2);
   V2DF_type_node = rs6000_vector_type ("__vector double", double_type_node, 2);
-  V4HI_type_node = build_vector_type (intHI_type_node, 4);
   V4SI_type_node = rs6000_vector_type ("__vector signed int",
   intSI_type_node, 4);
   V4SF_type_node = rs6000_vector_type ("__vector float", float_type_node, 4);
@@ -16991,7 +16982,6 @@ rs6000_init_builtins (void)
   builtin_mode_to_type[V2DImode][0] = V2DI_type_node;
   builtin_mode_to_type[V2DImode][1] = unsigned_V2DI_type_node;
   builtin_mode_to_type[V2DFmode][0] = V2DF_type_node;
-  builtin_mode_to_type[V4HImode][0] = V4HI_type_node;
   builtin_mode_to_type[V4SImode][0] = V4SI_type_node;
   builtin_mode_to_type[V4SImode][1] = unsigned_V4SI_type_node;
   builtin_mode_to_type[V4SFmode][0] = V4SF_type_node;
-- 
1.9.3

[PATCH 0/9] rs6000: SPE removal, part 2

2017-06-13 Thread Segher Boessenkool

This patch series makes further updates to remove SPE from the rs6000
port.  The only thing that should be left now is the documentation.

Tested on powerpc64-linux {-m32,-m64}; I'll test on some more systems and
commit later today.


Segher


 gcc/config/rs6000/eabispe.h| 26 --
 gcc/config/rs6000/rs6000-modes.def | 15 --
 gcc/config/rs6000/rs6000-opts.h|  1 -
 gcc/config/rs6000/rs6000.c | 97 +-
 gcc/config/rs6000/rs6000.h | 12 +
 gcc/config/rs6000/t-linux  |  4 --
 gcc/config/rs6000/t-rtems  |  6 +--
 gcc/config/rs6000/t-spe| 72 
 gcc/config/rs6000/vxworks.h|  8 +---
 gcc/config/rs6000/vxworksae.h  |  4 --
 gcc/config/rs6000/vxworksmils.h|  4 --
 11 files changed, 48 insertions(+), 201 deletions(-)
 delete mode 100644 gcc/config/rs6000/eabispe.h
 delete mode 100644 gcc/config/rs6000/t-spe

-- 
1.9.3

[PATCH][RFC] Canonize names of attributes.

2017-06-13 Thread Martin Liška

Hello.

After some discussions with Richi, I would like to propose a patch that
gives attributes a canonical name. That means __attribute__((__abi_tag__))
will be given 'abi_tag' as the IDENTIFIER_NAME of the attribute. The change
can improve attribute name lookup, and we can delete all the ugly code that
compares strlen(i1) == strlen(i2) + 4, etc.
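
To make the idea concrete, here is a minimal sketch on plain C strings. It is
not the canonize_attr_name from the patch (which works on identifier trees);
the function name and the buffer interface are assumptions for illustration
only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical string-level analogue of attribute name canonicalization:
   copy NAME into OUT (of size OUTSZ), dropping a leading and a trailing
   "__" when both are present.  Returns OUT, or NULL if OUT is too small.  */
static char *
canonical_attr_name (const char *name, char *out, size_t outsz)
{
  size_t len = strlen (name);

  if (len > 4 && name[0] == '_' && name[1] == '_'
      && name[len - 1] == '_' && name[len - 2] == '_')
    {
      name += 2;
      len -= 4;
    }

  if (len + 1 > outsz)
    return NULL;

  memcpy (out, name, len);
  out[len] = '\0';
  return out;
}
```

Once every stored name has been through such a step, a comparison reduces to
a length check plus strncmp, which is what the simplified cmp_attribs in the
patch does.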

The patch bootstraps on ppc64le-redhat-linux and survives regression tests
(w/ default languages). I'm currently testing objc, obj-c++ and go.

Ready to be installed?
Martin


gcc/cp/ChangeLog:

2017-06-09  Martin Liska  

* parser.c (cp_parser_gnu_attribute_list): Canonize attribute
names.
(cp_parser_std_attribute): Likewise.

gcc/go/ChangeLog:

2017-06-09  Martin Liska  

* go-gcc.cc (Gcc_backend::function): Use no_split_stack
instead of __no_split_stack__.

gcc/c/ChangeLog:

2017-06-09  Martin Liska  

* c-parser.c (c_parser_attributes): Canonize attribute names.

gcc/c-family/ChangeLog:

2017-06-09  Martin Liska  

* c-format.c (cmp_attribs): Simplify comparison of attributes.
* c-lex.c (c_common_has_attribute): Canonize attribute names.

gcc/ChangeLog:

2017-06-09  Martin Liska  

* tree.c (cmp_attrib_identifiers): Simplify comparison of attributes.
(private_is_attribute_p): Likewise.
(private_lookup_attribute): Likewise.
(private_lookup_attribute_by_prefix): Likewise.
(remove_attribute): Likewise.
(canonize_attr_name): New function.
* tree.h: Declared here.

gcc/testsuite/ChangeLog:

2017-06-09  Martin Liska  

* g++.dg/cpp0x/pr65558.C: Change expected warning.
* gcc.dg/parm-impl-decl-1.c: Likewise.
* gcc.dg/parm-impl-decl-3.c: Likewise.
---
 gcc/c-family/c-format.c |  13 ++--
 gcc/c-family/c-lex.c|   1 +
 gcc/c/c-parser.c|   9 +++
 gcc/cp/parser.c |  11 +++-
 gcc/go/go-gcc.cc|   2 +-
 gcc/testsuite/g++.dg/cpp0x/pr65558.C|   2 +-
 gcc/testsuite/gcc.dg/parm-impl-decl-1.c |   2 +-
 gcc/testsuite/gcc.dg/parm-impl-decl-3.c |   2 +-
 gcc/tree.c  | 108 +++-
 gcc/tree.h  |   4 ++
 10 files changed, 69 insertions(+), 85 deletions(-)


diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 732339b9b5e..30f60d42cca 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -3982,15 +3982,10 @@ cmp_attribs (const char *tattr_name, const char *attr_name)
 {
   int alen = strlen (attr_name);
   int slen = (tattr_name ? strlen (tattr_name) : 0);
-  if (alen > 4 && attr_name[0] == '_' && attr_name[1] == '_'
-  && attr_name[alen - 1] == '_' && attr_name[alen - 2] == '_')
-{
-  attr_name += 2;
-  alen -= 4;
-}
-  if (alen != slen || strncmp (tattr_name, attr_name, alen) != 0)
-return false;
-  return true;
+  gcc_checking_assert (alen == 0 || attr_name[0] != '_');
+  gcc_checking_assert (slen == 0 || tattr_name[0] != '_');
+
+  return (alen == slen && strncmp (tattr_name, attr_name, alen) == 0);
 }
 
 /* Handle a "format" attribute; arguments as in
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index e1c8bdff986..6d0e9279ed6 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -316,6 +316,7 @@ c_common_has_attribute (cpp_reader *pfile)
 {
   attr_name = get_identifier ((const char *)
   cpp_token_as_text (pfile, token));
+  attr_name = canonize_attr_name (attr_name);
   if (c_dialect_cxx ())
 	{
 	  int idx = 0;
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 6f954f21fa2..400b65380e2 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -4168,9 +4168,11 @@ c_parser_attributes (c_parser *parser)
 	  attr_name = c_parser_attribute_any_word (parser);
 	  if (attr_name == NULL)
 	break;
+	  attr_name = canonize_attr_name (attr_name);
 	  if (is_cilkplus_vector_p (attr_name))
 	{
 	  c_token *v_token = c_parser_peek_token (parser);
+	  v_token->value = canonize_attr_name (v_token->value);
 	  c_parser_cilk_simd_fn_vector_attrs (parser, *v_token);
 	  /* If the next token isn't a comma, we're done.  */
 	  if (!c_parser_next_token_is (parser, CPP_COMMA))
@@ -4234,6 +4236,13 @@ c_parser_attributes (c_parser *parser)
 		  release_tree_vector (expr_list);
 		}
 	}
+
+	  if (attr_args
+	  && TREE_VALUE (attr_args)
+	  && TREE_CODE (TREE_VALUE (attr_args)) == IDENTIFIER_NODE)
+	TREE_VALUE (attr_args)
+	  = canonize_attr_name (TREE_VALUE (attr_args));
+
 	  attr = build_tree_list (attr_name, attr_args);
 	  if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
 	c_parser_consume_token (parser);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index d02ad360d16..ea6b9a61390 100644
---

Re: [GCC][PATCH][ARM] Require arm_arch_v8a_ok for sdiv_costs_1.c

2017-06-13 Thread Christophe Lyon

On 13 June 2017 at 12:13, Kyrill Tkachov  wrote:
>
> On 13/06/17 11:12, Tamar Christina wrote:
>>
>> Hi All,
>>
>> This fixes the failing test gcc.target/arm/sdiv_costs_1.c by
>> requiring arm_arch_v8a_ok.
>>
>>
>> OK for trunk?
>>
>
> Ok.
> Thanks,
> Kyrill
>
>
>> gcc/testsuite/
>> 2017-06-13  Tamar Christina  
>>
>> * gcc.target/arm/sdiv_costs_1.c:
>> Require arm_arch_v8a_ok and add march option.
>>

Shouldn't you use
add_options_for_arm_arch_v8a instead?
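
Something along these lines, as a sketch only. The test's existing options
are not shown in this thread, so the -O2 below is a placeholder; keep
whatever the test already passes:

```c
/* { dg-do compile } */
/* { dg-require-effective-target arm_arch_v8a_ok } */
/* { dg-options "-O2" } */
/* { dg-add-options arm_arch_v8a } */
```

dg-add-options arm_arch_v8a ends up calling add_options_for_arm_arch_v8a, so
the -march option no longer needs to be spelled out by hand.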

>> Thanks,
>> Tamar
>
>

[PING^3][RFC, PATCH][ASAN] Implement dynamic allocas/VLAs sanitization.

2017-06-13 Thread Maxim Ostapenko


Hi,

I would like to ping the following patch: 
https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01374.html

Rebased version is attached.

Thanks,
-Maxim
gcc/ChangeLog:

2017-06-13  Maxim Ostapenko  

	* asan.c: Include gimple-fold.h.
	(get_last_alloca_addr): New function.
	(handle_builtin_stackrestore): Likewise.
	(handle_builtin_alloca): Likewise.
	(asan_emit_allocas_unpoison): Likewise.
	(get_mem_refs_of_builtin_call): Add new parameter, remove const
	qualifier from first parameter. Handle BUILT_IN_ALLOCA,
	BUILT_IN_ALLOCA_WITH_ALIGN and BUILT_IN_STACK_RESTORE builtins.
	(instrument_builtin_call): Pass gimple iterator to
	get_mem_refs_of_builtin_call.
	(last_alloca_addr): New global.
	* asan.h (asan_emit_allocas_unpoison): Declare.
	* builtins.c (expand_asan_emit_allocas_unpoison): New function.
	(expand_builtin): Handle BUILT_IN_ASAN_ALLOCAS_UNPOISON.
	* cfgexpand.c (expand_used_vars): Call asan_emit_allocas_unpoison
	if function calls alloca.
	* gimple-fold.c (replace_call_with_value): Remove static keyword.
	* gimple-fold.h (replace_call_with_value): Declare.
	* internal-fn.c: Include asan.h.
	* sanitizer.def (BUILT_IN_ASAN_ALLOCA_POISON,
	BUILT_IN_ASAN_ALLOCAS_UNPOISON): New builtins.

gcc/testsuite/ChangeLog:

2017-06-13  Maxim Ostapenko  

	* c-c++-common/asan/alloca_big_alignment.c: New test.
	* c-c++-common/asan/alloca_detect_custom_size.c: Likewise.
	* c-c++-common/asan/alloca_instruments_all_paddings.c: Likewise.
	* c-c++-common/asan/alloca_loop_unpoisoning.c: Likewise.
	* c-c++-common/asan/alloca_overflow_partial.c: Likewise.
	* c-c++-common/asan/alloca_overflow_right.c: Likewise.
	* c-c++-common/asan/alloca_safe_access.c: Likewise.
	* c-c++-common/asan/alloca_underflow_left.c: Likewise.

diff --git a/gcc/asan.c b/gcc/asan.c
index bf564a4..4835db9 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "langhooks.h"
 #include "cfgloop.h"
 #include "gimple-builder.h"
+#include "gimple-fold.h"
 #include "ubsan.h"
 #include "params.h"
 #include "builtins.h"
@@ -245,6 +246,7 @@ along with GCC; see the file COPYING3.  If not see
 static unsigned HOST_WIDE_INT asan_shadow_offset_value;
 static bool asan_shadow_offset_computed;
 static vec sanitized_sections;
+static tree last_alloca_addr = NULL_TREE;
 
 /* Set of variable declarations that are going to be guarded by
use-after-scope sanitizer.  */
@@ -531,11 +533,166 @@ get_mem_ref_of_assignment (const gassign *assignment,
   return true;
 }
 
+/* Return address of last allocated dynamic alloca.  */
+
+static tree
+get_last_alloca_addr ()
+{
+  if (last_alloca_addr)
+return last_alloca_addr;
+
+  gimple_seq seq = NULL;
+  gassign *g;
+
+  last_alloca_addr = create_tmp_reg (ptr_type_node, "last_alloca_addr");
+  g = gimple_build_assign (last_alloca_addr, NOP_EXPR,
+			   build_int_cst (ptr_type_node, 0));
+  gimple_seq_add_stmt_without_update (&seq, g);
+
+  edge e = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  gsi_insert_seq_on_edge_immediate (e, seq);
+  return last_alloca_addr;
+}
+
+/* Insert __asan_allocas_unpoison(top, bottom) call after
+   __builtin_stackrestore(new_sp) call.
+   The pseudocode of this routine should look like this:
+     __builtin_stackrestore(new_sp);
+     top = last_alloca_addr;
+     bot = virtual_dynamic_stack_rtx;
+     __asan_allocas_unpoison(top, bot);
+     last_alloca_addr = new_sp;
+   We don't use new_sp as bot parameter because on some architectures
+   SP has non zero offset from dynamic stack area.  Moreover, on some
+   architectures this offset (STACK_DYNAMIC_OFFSET) becomes known for each
+   particular function only after all callees were expanded to rtl.
+   The most noticeable example is PowerPC{,64}, see
+   http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html#DYNAM-STACK.
+*/
+
+static void
+handle_builtin_stackrestore (gcall *call, gimple_stmt_iterator *iter)
+{
+  if (!iter)
+    return;
+
+  gimple_stmt_iterator gsi = *iter;
+  gimple_seq seq = NULL;
+  tree last_alloca_addr = get_last_alloca_addr ();
+  tree restored_stack = gimple_call_arg (call, 0);
+  tree fn = builtin_decl_implicit (BUILT_IN_ASAN_ALLOCAS_UNPOISON);
+  gimple *g = gimple_build_call (fn, 2, last_alloca_addr, restored_stack);
+  gimple_seq_add_stmt_without_update (&seq, g);
+  g = gimple_build_assign (last_alloca_addr, NOP_EXPR, restored_stack);
+  gimple_seq_add_stmt_without_update (&seq, g);
+  gsi_insert_seq_after (&gsi, seq, GSI_SAME_STMT);
+}
+
+/* Deploy and poison redzones around __builtin_alloca call.  To do this, we
+   should replace this call with another one with changed parameters and
+   replace all its uses with new address, so
+     addr = __builtin_alloca (old_size, align);
+   is replaced by
+     new_size = old_size + additional_size;
+     tmp = __builtin_alloca (new_size, max(align, 32))
+     addr = tmp + 32 (first 32 bytes are for the left redzone);
+   ADDITIONAL_SIZE is added to

Re: [PATCH] Fix PR66313

2017-06-13 Thread Richard Biener

On Tue, 13 Jun 2017, Bin.Cheng wrote:

> On Tue, Jun 13, 2017 at 12:48 PM, Richard Biener  wrote:
> > On Tue, 13 Jun 2017, Richard Sandiford wrote:
> >
> >> Richard Biener  writes:
> >> > On Tue, 13 Jun 2017, Richard Sandiford wrote:
> >> >> Richard Biener  writes:
> >> >> > So I've come back to PR66313 and found a solution to the tailrecursion
> >> >> > missed optimization when fixing the factoring folding to use an unsigned
> >> >> > type when we're not sure of overflow.
> >> >> >
> >> >> > The folding part is identical to my last try from 2015, the tailrecursion
> >> >> > part makes us handle intermittent stmts that were introduced by foldings
> >> >> > that "clobber" our quest walking the single-use chain of stmts between
> >> >> > the call and the return (and failing at all stmts that are not part
> >> >> > of said chain).  A simple solution is to move the stmts that are not
> >> >> > part of the chain and that we can move before the call.  That handles
> >> >> > the leaf conversions that now appear for tree-ssa/tailrecursion-6.c
> >> >> >
> >> >> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> >> >> >
> >> >> > Richard.
> >> >> >
> >> >> > 2017-05-31  Richard Biener  
> >> >> >
> >> >> >  PR middle-end/66313
> >> >> >  * fold-const.c (fold_plusminus_mult_expr): If the factored
> >> >> >  factor may be zero use a wrapping type for the inner operation.
> >> >> >  * tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap
> >> >> >  and handle moved defs.
> >> >> >  (process_assignment): Properly guard the unary op case.  Return a
> >> >> >  tri-state indicating that moving the stmt before the call may allow
> >> >> >  to continue.  Pass through to_move.
> >> >> >  (find_tail_calls): Handle moving unrelated defs before
> >> >> >  the call.
> >> >> >
> >> >> >  * c-c++-common/ubsan/pr66313.c: New testcase.
> >> >> >  * gcc.dg/tree-ssa/loop-15.c: Adjust.
> >> >> >
> >> >> > Index: gcc/fold-const.c
> >> >> > ===
> >> >> > *** gcc/fold-const.c.orig  2015-10-29 12:32:33.302782318 +0100
> >> >> > --- gcc/fold-const.c 2015-10-29 14:08:39.936497739 +0100
> >> >> > *** fold_plusminus_mult_expr (location_t loc
> >> >> > *** 6916,6925 
> >> >> >   }
> >> >> > same = NULL_TREE;
> >> >> >
> >> >> > !   if (operand_equal_p (arg01, arg11, 0))
> >> >> > ! same = arg01, alt0 = arg00, alt1 = arg10;
> >> >> > !   else if (operand_equal_p (arg00, arg10, 0))
> >> >> >   same = arg00, alt0 = arg01, alt1 = arg11;
> >> >> > else if (operand_equal_p (arg00, arg11, 0))
> >> >> >   same = arg00, alt0 = arg01, alt1 = arg10;
> >> >> > else if (operand_equal_p (arg01, arg10, 0))
> >> >> > --- 6916,6926 
> >> >> >   }
> >> >> > same = NULL_TREE;
> >> >> >
> >> >> > !   /* Prefer factoring a common non-constant.  */
> >> >> > !   if (operand_equal_p (arg00, arg10, 0))
> >> >> >   same = arg00, alt0 = arg01, alt1 = arg11;
> >> >> > +   else if (operand_equal_p (arg01, arg11, 0))
> >> >> > + same = arg01, alt0 = arg00, alt1 = arg10;
> >> >> > else if (operand_equal_p (arg00, arg11, 0))
> >> >> >   same = arg00, alt0 = arg01, alt1 = arg10;
> >> >> > else if (operand_equal_p (arg01, arg10, 0))
> >> >> > *** fold_plusminus_mult_expr (location_t loc
> >> >> > *** 6974,6987 
> >> >> >  }
> >> >> >   }
> >> >> >
> >> >> > !   if (same)
> >> >> >   return fold_build2_loc (loc, MULT_EXPR, type,
> >> >> >  fold_build2_loc (loc, code, type,
> >> >> >   fold_convert_loc (loc, type, alt0),
> >> >> >   fold_convert_loc (loc, type, alt1)),
> >> >> >  fold_convert_loc (loc, type, same));
> >> >> >
> >> >> > !   return NULL_TREE;
> >> >> >   }
> >> >> >
> >> >> >   /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
> >> >> > --- 6975,7010 
> >> >> >  }
> >> >> >   }
> >> >> >
> >> >> > !   if (!same)
> >> >> > ! return NULL_TREE;
> >> >> > !
> >> >> > !   if (! INTEGRAL_TYPE_P (type)
> >> >> > !   || TYPE_OVERFLOW_WRAPS (type)
> >> >> > !   /* We are neither factoring zero nor minus one.  */
> >> >> > !   || TREE_CODE (same) == INTEGER_CST)
> >> >> >   return fold_build2_loc (loc, MULT_EXPR, type,
> >> >> >  fold_build2_loc (loc, code, type,
> >> >> >   fold_convert_loc (loc, type, alt0),
> >> >> >   fold_convert_loc (loc, type, alt1)),
> >> >> >  fold_convert_loc (loc, type, same));
> >> >> >
> >> >> > !   /* Same may be zero and thus the operation 'code' may overflow.  Likewise
> >> >> > !  same may

Re: [PATCH] Fix PR66313

2017-06-13 Thread Bin.Cheng

On Tue, Jun 13, 2017 at 12:48 PM, Richard Biener  wrote:
> On Tue, 13 Jun 2017, Richard Sandiford wrote:
>
>> Richard Biener  writes:
>> > On Tue, 13 Jun 2017, Richard Sandiford wrote:
>> >> Richard Biener  writes:
>> >> > So I've come back to PR66313 and found a solution to the tailrecursion
>> >> > missed optimization when fixing the factoring folding to use an unsigned
>> >> > type when we're not sure of overflow.
>> >> >
>> >> > The folding part is identical to my last try from 2015, the tailrecursion
>> >> > part makes us handle intermittent stmts that were introduced by foldings
>> >> > that "clobber" our quest walking the single-use chain of stmts between
>> >> > the call and the return (and failing at all stmts that are not part
>> >> > of said chain).  A simple solution is to move the stmts that are not
>> >> > part of the chain and that we can move before the call.  That handles
>> >> > the leaf conversions that now appear for tree-ssa/tailrecursion-6.c
>> >> >
>> >> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>> >> >
>> >> > Richard.
>> >> >
>> >> > 2017-05-31  Richard Biener  
>> >> >
>> >> >  PR middle-end/66313
>> >> >  * fold-const.c (fold_plusminus_mult_expr): If the factored
>> >> >  factor may be zero use a wrapping type for the inner operation.
>> >> >  * tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap
>> >> >  and handle moved defs.
>> >> >  (process_assignment): Properly guard the unary op case.  Return a
>> >> >  tri-state indicating that moving the stmt before the call may allow
>> >> >  to continue.  Pass through to_move.
>> >> >  (find_tail_calls): Handle moving unrelated defs before
>> >> >  the call.
>> >> >
>> >> >  * c-c++-common/ubsan/pr66313.c: New testcase.
>> >> >  * gcc.dg/tree-ssa/loop-15.c: Adjust.
>> >> >
>> >> > Index: gcc/fold-const.c
>> >> > ===
>> >> > *** gcc/fold-const.c.orig  2015-10-29 12:32:33.302782318 +0100
>> >> > --- gcc/fold-const.c 2015-10-29 14:08:39.936497739 +0100
>> >> > *** fold_plusminus_mult_expr (location_t loc
>> >> > *** 6916,6925 
>> >> >   }
>> >> > same = NULL_TREE;
>> >> >
>> >> > !   if (operand_equal_p (arg01, arg11, 0))
>> >> > ! same = arg01, alt0 = arg00, alt1 = arg10;
>> >> > !   else if (operand_equal_p (arg00, arg10, 0))
>> >> >   same = arg00, alt0 = arg01, alt1 = arg11;
>> >> > else if (operand_equal_p (arg00, arg11, 0))
>> >> >   same = arg00, alt0 = arg01, alt1 = arg10;
>> >> > else if (operand_equal_p (arg01, arg10, 0))
>> >> > --- 6916,6926 
>> >> >   }
>> >> > same = NULL_TREE;
>> >> >
>> >> > !   /* Prefer factoring a common non-constant.  */
>> >> > !   if (operand_equal_p (arg00, arg10, 0))
>> >> >   same = arg00, alt0 = arg01, alt1 = arg11;
>> >> > +   else if (operand_equal_p (arg01, arg11, 0))
>> >> > + same = arg01, alt0 = arg00, alt1 = arg10;
>> >> > else if (operand_equal_p (arg00, arg11, 0))
>> >> >   same = arg00, alt0 = arg01, alt1 = arg10;
>> >> > else if (operand_equal_p (arg01, arg10, 0))
>> >> > *** fold_plusminus_mult_expr (location_t loc
>> >> > *** 6974,6987 
>> >> >  }
>> >> >   }
>> >> >
>> >> > !   if (same)
>> >> >   return fold_build2_loc (loc, MULT_EXPR, type,
>> >> >  fold_build2_loc (loc, code, type,
>> >> >   fold_convert_loc (loc, type, alt0),
>> >> >   fold_convert_loc (loc, type, alt1)),
>> >> >  fold_convert_loc (loc, type, same));
>> >> >
>> >> > !   return NULL_TREE;
>> >> >   }
>> >> >
>> >> >   /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
>> >> > --- 6975,7010 
>> >> >  }
>> >> >   }
>> >> >
>> >> > !   if (!same)
>> >> > ! return NULL_TREE;
>> >> > !
>> >> > !   if (! INTEGRAL_TYPE_P (type)
>> >> > !   || TYPE_OVERFLOW_WRAPS (type)
>> >> > !   /* We are neither factoring zero nor minus one.  */
>> >> > !   || TREE_CODE (same) == INTEGER_CST)
>> >> >   return fold_build2_loc (loc, MULT_EXPR, type,
>> >> >  fold_build2_loc (loc, code, type,
>> >> >   fold_convert_loc (loc, type, alt0),
>> >> >   fold_convert_loc (loc, type, alt1)),
>> >> >  fold_convert_loc (loc, type, same));
>> >> >
>> >> > !   /* Same may be zero and thus the operation 'code' may overflow.  Likewise
>> >> > !  same may be minus one and thus the multiplication may overflow.  Perform
>> >> > !  the operations in an unsigned type.  */
>> >> > !   tree utype = unsigned_type_for (type);
>> >> > !   tree tem = fold_build2_loc (loc, code, utype,
>> >> > !

Re: [PATCH] Fix PR66313

2017-06-13 Thread Richard Biener

On Tue, 13 Jun 2017, Bin.Cheng wrote:

> On Tue, Jun 13, 2017 at 12:23 PM, Richard Sandiford
>  wrote:
> > Richard Biener  writes:
> >> On Tue, 13 Jun 2017, Richard Sandiford wrote:
> >>> Richard Biener  writes:
> >>> > So I've come back to PR66313 and found a solution to the tailrecursion
> >>> > missed optimization when fixing the factoring folding to use an unsigned
> >>> > type when we're not sure of overflow.
> >>> >
> >>> > The folding part is identical to my last try from 2015, the tailrecursion
> >>> > part makes us handle intermittent stmts that were introduced by foldings
> >>> > that "clobber" our quest walking the single-use chain of stmts between
> >>> > the call and the return (and failing at all stmts that are not part
> >>> > of said chain).  A simple solution is to move the stmts that are not
> >>> > part of the chain and that we can move before the call.  That handles
> >>> > the leaf conversions that now appear for tree-ssa/tailrecursion-6.c
> >>> >
> >>> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> >>> >
> >>> > Richard.
> >>> >
> >>> > 2017-05-31  Richard Biener  
> >>> >
> >>> >PR middle-end/66313
> >>> >* fold-const.c (fold_plusminus_mult_expr): If the factored
> >>> >factor may be zero use a wrapping type for the inner operation.
> >>> >* tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap
> >>> >and handle moved defs.
> >>> >(process_assignment): Properly guard the unary op case.  Return a
> >>> >tri-state indicating that moving the stmt before the call may allow
> >>> >to continue.  Pass through to_move.
> >>> >(find_tail_calls): Handle moving unrelated defs before
> >>> >the call.
> >>> >
> >>> >* c-c++-common/ubsan/pr66313.c: New testcase.
> >>> >* gcc.dg/tree-ssa/loop-15.c: Adjust.
> >>> >
> >>> > Index: gcc/fold-const.c
> >>> > ===
> >>> > *** gcc/fold-const.c.orig  2015-10-29 12:32:33.302782318 +0100
> >>> > --- gcc/fold-const.c   2015-10-29 14:08:39.936497739 +0100
> >>> > *** fold_plusminus_mult_expr (location_t loc
> >>> > *** 6916,6925 
> >>> >   }
> >>> > same = NULL_TREE;
> >>> >
> >>> > !   if (operand_equal_p (arg01, arg11, 0))
> >>> > ! same = arg01, alt0 = arg00, alt1 = arg10;
> >>> > !   else if (operand_equal_p (arg00, arg10, 0))
> >>> >   same = arg00, alt0 = arg01, alt1 = arg11;
> >>> > else if (operand_equal_p (arg00, arg11, 0))
> >>> >   same = arg00, alt0 = arg01, alt1 = arg10;
> >>> > else if (operand_equal_p (arg01, arg10, 0))
> >>> > --- 6916,6926 
> >>> >   }
> >>> > same = NULL_TREE;
> >>> >
> >>> > !   /* Prefer factoring a common non-constant.  */
> >>> > !   if (operand_equal_p (arg00, arg10, 0))
> >>> >   same = arg00, alt0 = arg01, alt1 = arg11;
> >>> > +   else if (operand_equal_p (arg01, arg11, 0))
> >>> > + same = arg01, alt0 = arg00, alt1 = arg10;
> >>> > else if (operand_equal_p (arg00, arg11, 0))
> >>> >   same = arg00, alt0 = arg01, alt1 = arg10;
> >>> > else if (operand_equal_p (arg01, arg10, 0))
> >>> > *** fold_plusminus_mult_expr (location_t loc
> >>> > *** 6974,6987 
> >>> >}
> >>> >   }
> >>> >
> >>> > !   if (same)
> >>> >   return fold_build2_loc (loc, MULT_EXPR, type,
> >>> >fold_build2_loc (loc, code, type,
> >>> > fold_convert_loc (loc, type, alt0),
> >>> > fold_convert_loc (loc, type, alt1)),
> >>> >fold_convert_loc (loc, type, same));
> >>> >
> >>> > !   return NULL_TREE;
> >>> >   }
> >>> >
> >>> >   /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
> >>> > --- 6975,7010 
> >>> >}
> >>> >   }
> >>> >
> >>> > !   if (!same)
> >>> > ! return NULL_TREE;
> >>> > !
> >>> > !   if (! INTEGRAL_TYPE_P (type)
> >>> > !   || TYPE_OVERFLOW_WRAPS (type)
> >>> > !   /* We are neither factoring zero nor minus one.  */
> >>> > !   || TREE_CODE (same) == INTEGER_CST)
> >>> >   return fold_build2_loc (loc, MULT_EXPR, type,
> >>> >fold_build2_loc (loc, code, type,
> >>> > fold_convert_loc (loc, type, alt0),
> >>> > fold_convert_loc (loc, type, alt1)),
> >>> >fold_convert_loc (loc, type, same));
> >>> >
> >>> > !   /* Same may be zero and thus the operation 'code' may overflow.  Likewise
> >>> > !  same may be minus one and thus the multiplication may overflow.  Perform
> >>> > !  the operations in an unsigned type.  */
> >>> > !   tree utype = unsigned_type_for (type);
> >>> > !   tree tem = fold_build2_loc (loc, code, utype,
> >>> > !fold_convert_loc (loc, utype, alt0),
> >>> > !

Re: [PATCH] Fix PR66313

2017-06-13 Thread Richard Biener

On Tue, 13 Jun 2017, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Tue, 13 Jun 2017, Richard Sandiford wrote:
> >> Richard Biener  writes:
> >> > So I've come back to PR66313 and found a solution to the tailrecursion
> >> > missed optimization when fixing the factoring folding to use an unsigned
> >> > type when we're not sure of overflow.
> >> >
> >> > The folding part is identical to my last try from 2015, the tailrecursion
> >> > part makes us handle intermittent stmts that were introduced by foldings
> >> > that "clobber" our quest walking the single-use chain of stmts between
> >> > the call and the return (and failing at all stmts that are not part
> >> > of said chain).  A simple solution is to move the stmts that are not
> >> > part of the chain and that we can move before the call.  That handles
> >> > the leaf conversions that now appear for tree-ssa/tailrecursion-6.c
> >> >
> >> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> >> >
> >> > Richard.
> >> >
> >> > 2017-05-31  Richard Biener  
> >> >
> >> >  PR middle-end/66313
> >> >  * fold-const.c (fold_plusminus_mult_expr): If the factored
> >> >  factor may be zero use a wrapping type for the inner operation.
> >> >  * tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap
> >> >  and handle moved defs.
> >> >  (process_assignment): Properly guard the unary op case.  Return a
> >> >  tri-state indicating that moving the stmt before the call may allow
> >> >  to continue.  Pass through to_move.
> >> >  (find_tail_calls): Handle moving unrelated defs before
> >> >  the call.
> >> >
> >> >  * c-c++-common/ubsan/pr66313.c: New testcase.
> >> >  * gcc.dg/tree-ssa/loop-15.c: Adjust.
> >> >
> >> > Index: gcc/fold-const.c
> >> > ===
> >> > *** gcc/fold-const.c.orig  2015-10-29 12:32:33.302782318 +0100
> >> > --- gcc/fold-const.c 2015-10-29 14:08:39.936497739 +0100
> >> > *** fold_plusminus_mult_expr (location_t loc
> >> > *** 6916,6925 
> >> >   }
> >> > same = NULL_TREE;
> >> >   
> >> > !   if (operand_equal_p (arg01, arg11, 0))
> >> > ! same = arg01, alt0 = arg00, alt1 = arg10;
> >> > !   else if (operand_equal_p (arg00, arg10, 0))
> >> >   same = arg00, alt0 = arg01, alt1 = arg11;
> >> > else if (operand_equal_p (arg00, arg11, 0))
> >> >   same = arg00, alt0 = arg01, alt1 = arg10;
> >> > else if (operand_equal_p (arg01, arg10, 0))
> >> > --- 6916,6926 
> >> >   }
> >> > same = NULL_TREE;
> >> >   
> >> > !   /* Prefer factoring a common non-constant.  */
> >> > !   if (operand_equal_p (arg00, arg10, 0))
> >> >   same = arg00, alt0 = arg01, alt1 = arg11;
> >> > +   else if (operand_equal_p (arg01, arg11, 0))
> >> > + same = arg01, alt0 = arg00, alt1 = arg10;
> >> > else if (operand_equal_p (arg00, arg11, 0))
> >> >   same = arg00, alt0 = arg01, alt1 = arg10;
> >> > else if (operand_equal_p (arg01, arg10, 0))
> >> > *** fold_plusminus_mult_expr (location_t loc
> >> > *** 6974,6987 
> >> >  }
> >> >   }
> >> >   
> >> > !   if (same)
> >> >   return fold_build2_loc (loc, MULT_EXPR, type,
> >> >  fold_build2_loc (loc, code, type,
> >> >   fold_convert_loc (loc, type, alt0),
> >> >   fold_convert_loc (loc, type, alt1)),
> >> >  fold_convert_loc (loc, type, same));
> >> >   
> >> > !   return NULL_TREE;
> >> >   }
> >> >   
> >> >   /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
> >> > --- 6975,7010 
> >> >  }
> >> >   }
> >> >   
> >> > !   if (!same)
> >> > ! return NULL_TREE;
> >> > ! 
> >> > !   if (! INTEGRAL_TYPE_P (type)
> >> > !   || TYPE_OVERFLOW_WRAPS (type)
> >> > !   /* We are neither factoring zero nor minus one.  */
> >> > !   || TREE_CODE (same) == INTEGER_CST)
> >> >   return fold_build2_loc (loc, MULT_EXPR, type,
> >> >  fold_build2_loc (loc, code, type,
> >> >   fold_convert_loc (loc, type, alt0),
> >> >   fold_convert_loc (loc, type, alt1)),
> >> >  fold_convert_loc (loc, type, same));
> >> >   
> >> > !   /* Same may be zero and thus the operation 'code' may overflow.  Likewise
> >> > !  same may be minus one and thus the multiplication may overflow.  Perform
> >> > !  the operations in an unsigned type.  */
> >> > !   tree utype = unsigned_type_for (type);
> >> > !   tree tem = fold_build2_loc (loc, code, utype,
> >> > !  fold_convert_loc (loc, utype, alt0),
> >> > !  fold_convert_loc (loc, utype, alt1));
> >> > !   /* If the sum evaluated to a constant that is not -INF the 
> >>

RE: [PATCH][X86] Fix rounding pattern similar to PR73350

2017-06-13 Thread Koval, Julia

Thank you for your help. I fixed the test similar to existing sigaction tests.

gcc/
* config/i386/i386.c: Fix rounding expand for new pattern.
* config/i386/subst.md: Fix pattern (parallel -> unspec).
gcc/testsuite/
* gcc.target/i386/pr73350-2.c: New test.

Thanks,
Julia

> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Tuesday, June 13, 2017 10:09 AM
> To: Koval, Julia 
> Cc: Jakub Jelinek ; H.J. Lu ; GCC
> Patches ; Uros Bizjak ; Kirill
> Yukhin 
> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> 
> On Mon, Jun 12, 2017 at 6:50 PM, Koval, Julia  wrote:
> > I'm so sorry, but I really don't get it. The right result of the test is:
> > Floating point exception (core dumped). The wrong result of the test is:
> > nan (no exception). If I get an exception (which is right), the test fails
> > anyway. The exception is raised in one instruction, so I can't get any
> > intermediate value there.
> 
> We do have a few testcases catching these cases by installing a signal
> handler (grep for sigaction in testsuite/)
> 
> Richard.
> 
> > I tried to replace it with a compile-time test (attached), which shows that
> > both instructions are generated (not combined). Is that OK?
> >
> > Thanks,
> > Julia
> >
> >> -Original Message-
> >> From: Jakub Jelinek [mailto:ja...@redhat.com]
> >> Sent: Monday, June 12, 2017 6:18 PM
> >> To: H.J. Lu 
> >> Cc: Koval, Julia ; GCC Patches ; Uros Bizjak ; Kirill Yukhin
> >> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> >>
> >> On Mon, Jun 12, 2017 at 09:08:00AM -0700, H.J. Lu wrote:
> >> > On Mon, Jun 12, 2017 at 9:06 AM, Koval, Julia wrote:
> >> > > I would like to, but as far as I know the only possible testcase is the
> >> > > one below, and there is no way to use dg-error for runtime exceptions
> >> > > (sorry if I'm wrong). There are only two versions of the flag, exception
> >> > > or no exception, and the error occurs when they are combined in CSE.
> >> >
> >> > Can you use
> >> >
> >> > if (wrong)
> >> >   abort ();
> >> >
> >> > in testcase?
> >>
> >> Where wrong can also be if (__builtin_fabsf (somefloatval - expectedval) >
> >> epsilon) or similar if needed.  Also, the testcase contains many unnecessary
> >> includes; if you use __builtin_abort, I'd hope you only need x86intrin.h and
> >> nothing else.  And main can be just int main (); argc and argv aren't used.
> >> >
> >> > >> -Original Message-
> >> > >> From: H.J. Lu [mailto:hjl.to...@gmail.com]
> >> > >> Sent: Monday, June 12, 2017 3:43 PM
> >> > >> To: Koval, Julia 
> >> > >> Cc: GCC Patches ; Uros Bizjak
> >> > >> ; Kirill Yukhin 
> >> > >> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> >> > >>
> >> > >> On Mon, Jun 12, 2017 at 6:21 AM, Koval, Julia wrote:
> >> > >> > This is the same issue as PR73350 and PR80862 for disabling FP exceptions.
> >> > >> >
> >> > >> > gcc -O0 -mavx512f -mavx512er returns exception
> >> > >> > gcc -O2 -mavx512f -mavx512er returns nan
> >> > >> >
> >> > >> > For this code:
> >> > >> >
> >> > >> > #include 
> >> > >> > #include 
> >> > >> > #include 
> >> > >> > #include 
> >> > >> > #include 
> >> > >> >
> >> > >> > int main(int argc, char *argv[]) {
> >> > >> >     __m512 a = _mm512_set1_ps((float) -1);
> >> > >> >     __m512 b = _mm512_set1_ps((float) -1);
> >> > >> >     _mm_setcsr( _MM_MASK_MASK &~ (_MM_MASK_OVERFLOW|_MM_MASK_INVALID|_MM_MASK_DIV_ZERO) );
> >> > >> >     __m512 result1 = _mm512_rsqrt28_round_ps(a, _MM_FROUND_NO_EXC );
> >> > >> >     printf("%d %d\n", _MM_FROUND_CUR_DIRECTION, _MM_FROUND_NO_EXC);
> >> > >> >     __m512 result2 = _mm512_rsqrt28_round_ps(a, _MM_FROUND_CUR_DIRECTION);
> >> > >> >
> >> > >> >     printf("%g\n", result1[0] - result2[0]);
> >> > >> >
> >> > >> >     return 0;
> >> > >> > }
> >>
> >>   Jakub


0001-fix.patch
Description: 0001-fix.patch
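
A minimal sketch of the signal-handler approach suggested in the quoted reply,
assuming a POSIX system.  raise (SIGFPE) is only a stand-in for the unmasked
exception from _mm512_rsqrt28_round_ps, which needs AVX-512ER hardware;
raised_sigfpe, trap_stand_in and no_trap are illustrative names, not code from
the testsuite or from the attached patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Jump target used to get out of the SIGFPE handler.  */
static sigjmp_buf fpe_env;

static void on_sigfpe (int signo)
{
  (void) signo;
  siglongjmp (fpe_env, 1);
}

/* Run FN with a temporary SIGFPE handler installed.  Return 1 if SIGFPE
   was delivered while FN ran, 0 if FN returned normally, -1 on setup
   failure.  The previous handler is restored before returning.  */
int raised_sigfpe (void (*fn) (void))
{
  struct sigaction sa, old;
  int caught;

  assert (fn != NULL);
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_sigfpe;
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGFPE, &sa, &old) != 0)
    return -1;

  /* Saving the signal mask (second argument 1) unblocks SIGFPE again
     after the jump out of the handler.  */
  if (sigsetjmp (fpe_env, 1) == 0)
    {
      fn ();
      caught = 0;
    }
  else
    caught = 1;

  sigaction (SIGFPE, &old, NULL);
  return caught;
}

/* Stand-in for code that is expected to trap.  */
void trap_stand_in (void)
{
  raise (SIGFPE);
}

/* Stand-in for code that must not trap.  */
void no_trap (void)
{
}
```

In a real test the expected-to-trap callback would contain the intrinsic call
with exceptions unmasked, and the test would call __builtin_abort () when
raised_sigfpe reports the wrong outcome.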

Re: [PATCH try 2 resend] [i386] Remove warnings for ignoring -mcall-ms2sysv-xlogues.

2017-06-13 Thread Bernd Edlinger



On 06/11/17 22:35, Daniel Santos wrote:
> I appear to have forgotten to cc gcc-patches, sorry about that.
> 
> There are currently three cases where we issue a warning when disabling
> -mcall-ms2sysv-xlogues for a function, but I never added a proper
> warning, so there's no mechanism for disabling it.  This is something
> that I meant to address sooner.  I'm thinking that it's better to just
> remove the warning entirely and document these cases, rather than adding
> a new warning.  Any thoughts?
> 
> These are the conditions:
> 
> * the use of -fsplit-stack,
> * the use of static call chains (not sure if we can ever have that), and
> * if the function calls __builtin_eh_return.
> 
> Some of these cases can likely be supported, but they are just on the
> "not yet tested" list.
> 
> 2017-06-11  Daniel Santos   
>   * config/i386/i386.c (warn_once_call_ms2sysv_xlogues): Remove.
>   (ix86_compute_frame_layout): Don't call warn_once_call_ms2sysv_xlogues.
>   (ix86_expand_call): Likewise.
> 

Your change log should also mention the changed doc/invoke.texi


> Thanks,
> Daniel
> 
> Signed-off-by: Daniel Santos 
> ---
>   gcc/config/i386/i386.c | 26 +++---
>   gcc/doc/invoke.texi| 25 -
>   2 files changed, 23 insertions(+), 28 deletions(-)
> 
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index d5c2d46bf5e..2dc6e53c765 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12772,18 +12772,6 @@ ix86_builtin_setjmp_frame_value (void)
> return stack_realign_fp ? hard_frame_pointer_rtx : virtual_stack_vars_rtx;
>   }
>   
> -/* Emits a warning for unsupported msabi to sysv pro/epilogues.  */
> -static void warn_once_call_ms2sysv_xlogues (const char *feature)
> -{
> -  static bool warned_once = false;
> -  if (!warned_once)
> -{
> -  warning (0, "-mcall-ms2sysv-xlogues is not compatible with %s",
> -feature);
> -  warned_once = true;
> -}
> -}
> -
>   /* When using -fsplit-stack, the allocation routines set a field in
>  the TCB to the bottom of the stack plus this much space, measured
>  in bytes.  */
> @@ -12814,18 +12802,10 @@ ix86_compute_frame_layout (void)
> gcc_assert (TARGET_SSE);
> gcc_assert (!ix86_using_red_zone ());
>   
> -  if (crtl->calls_eh_return)
> +  if (crtl->calls_eh_return || ix86_static_chain_on_stack)
>   {
> gcc_assert (!reload_completed);
> m->call_ms2sysv = false;
> -   warn_once_call_ms2sysv_xlogues ("__builtin_eh_return");
> - }
> -
> -  else if (ix86_static_chain_on_stack)
> - {
> -   gcc_assert (!reload_completed);
> -   m->call_ms2sysv = false;
> -   warn_once_call_ms2sysv_xlogues ("static call chains");
>   }
>   
> /* Finally, compute which registers the stub will manage.  */
> @@ -29290,9 +29270,9 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
> else if (ix86_function_ms_hook_prologue (current_function_decl))
>   ;
>   
> -   /* TODO: Cases not yet examined.  */
> +   /* TODO: Compatibility not yet examined.  */
> else if (flag_split_stack)
> - warn_once_call_ms2sysv_xlogues ("-fsplit-stack");
> + ;
>   
> else
>   {
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index c1168823af7..eec02b43a4f 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -25389,11 +25389,26 @@ using the function attributes @code{ms_abi} and @code{sysv_abi}.
>   @opindex mno-call-ms2sysv-xlogues
>   Due to differences in 64-bit ABIs, any Microsoft ABI function that calls a
>   System V ABI function must consider RSI, RDI and XMM6-15 as clobbered.  By
> -default, the code for saving and restoring these registers is emitted inline,
> -resulting in fairly lengthy prologues and epilogues.  Using
> -@option{-mcall-ms2sysv-xlogues} emits prologues and epilogues that
> -use stubs in the static portion of libgcc to perform these saves and restores,
> -thus reducing function size at the cost of a few extra instructions.
> +default, the instructions for saving and restoring these registers are emitted
> +inline, resulting in fairly lengthy pro- and epilogues.  Using
> +@option{-mcall-ms2sysv-xlogues} emits pro- and epilogues that use stubs in the
> +static portion of libgcc to perform these saves and restores, thus reducing
> +function size at the cost of executing a few extra instructions.  This cost is
> +theoretically mitigated or eliminated by reduced instruction cache utilization,
> +temporal locality of the stubs, and the stubs' use of MOV instructions over
> +PUSH and POP.
> +
> +This option is not supported with SEH, so it is completely unavailable on
> +Windows.  It is also silently disabled if a function:
> +
> +@enumerate
> +@item is built with @option{-mno-sse2} or @option{-fsplit-stack},
> +@item has

Re: [PATCH] Fix PR66313

2017-06-13 Thread Bin.Cheng

On Tue, Jun 13, 2017 at 12:23 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Tue, 13 Jun 2017, Richard Sandiford wrote:
>>> Richard Biener  writes:
>>> > So I've come back to PR66313 and found a solution to the tailrecursion
>>> > missed optimization when fixing the factoring folding to use an unsigned
>>> > type when we're not sure of overflow.
>>> >
>>> > The folding part is identical to my last try from 2015, the tailrecursion
>>> > part makes us handle intermittent stmts that were introduced by foldings
>>> > that "clobber" our quest walking the single-use chain of stmts between
>>> > the call and the return (and failing at all stmts that are not part
>>> > of said chain).  A simple solution is to move the stmts that are not
>>> > part of the chain and that we can move before the call.  That handles
>>> > the leaf conversions that now appear for tree-ssa/tailrecursion-6.c
>>> >
>>> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>>> >
>>> > Richard.
>>> >
>>> > 2017-05-31  Richard Biener  
>>> >
>>> >PR middle-end/66313
>>> >* fold-const.c (fold_plusminus_mult_expr): If the factored
>>> >factor may be zero use a wrapping type for the inner operation.
>>> >* tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap
>>> >and handle moved defs.
>>> >(process_assignment): Properly guard the unary op case.  Return a
>>> >tri-state indicating that moving the stmt before the call may allow
>>> >to continue.  Pass through to_move.
>>> >(find_tail_calls): Handle moving unrelated defs before
>>> >the call.
>>> >
>>> >* c-c++-common/ubsan/pr66313.c: New testcase.
>>> >* gcc.dg/tree-ssa/loop-15.c: Adjust.
>>> >
>>> > Index: gcc/fold-const.c
>>> > ===
>>> > *** gcc/fold-const.c.orig  2015-10-29 12:32:33.302782318 +0100
>>> > --- gcc/fold-const.c   2015-10-29 14:08:39.936497739 +0100
>>> > *** fold_plusminus_mult_expr (location_t loc
>>> > *** 6916,6925 
>>> >   }
>>> > same = NULL_TREE;
>>> >
>>> > !   if (operand_equal_p (arg01, arg11, 0))
>>> > ! same = arg01, alt0 = arg00, alt1 = arg10;
>>> > !   else if (operand_equal_p (arg00, arg10, 0))
>>> >   same = arg00, alt0 = arg01, alt1 = arg11;
>>> > else if (operand_equal_p (arg00, arg11, 0))
>>> >   same = arg00, alt0 = arg01, alt1 = arg10;
>>> > else if (operand_equal_p (arg01, arg10, 0))
>>> > --- 6916,6926 
>>> >   }
>>> > same = NULL_TREE;
>>> >
>>> > !   /* Prefer factoring a common non-constant.  */
>>> > !   if (operand_equal_p (arg00, arg10, 0))
>>> >   same = arg00, alt0 = arg01, alt1 = arg11;
>>> > +   else if (operand_equal_p (arg01, arg11, 0))
>>> > + same = arg01, alt0 = arg00, alt1 = arg10;
>>> > else if (operand_equal_p (arg00, arg11, 0))
>>> >   same = arg00, alt0 = arg01, alt1 = arg10;
>>> > else if (operand_equal_p (arg01, arg10, 0))
>>> > *** fold_plusminus_mult_expr (location_t loc
>>> > *** 6974,6987 
>>> >}
>>> >   }
>>> >
>>> > !   if (same)
>>> >   return fold_build2_loc (loc, MULT_EXPR, type,
>>> >fold_build2_loc (loc, code, type,
>>> > fold_convert_loc (loc, type, alt0),
>>> > fold_convert_loc (loc, type, alt1)),
>>> >fold_convert_loc (loc, type, same));
>>> >
>>> > !   return NULL_TREE;
>>> >   }
>>> >
>>> >   /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
>>> > --- 6975,7010 
>>> >}
>>> >   }
>>> >
>>> > !   if (!same)
>>> > ! return NULL_TREE;
>>> > !
>>> > !   if (! INTEGRAL_TYPE_P (type)
>>> > !   || TYPE_OVERFLOW_WRAPS (type)
>>> > !   /* We are neither factoring zero nor minus one.  */
>>> > !   || TREE_CODE (same) == INTEGER_CST)
>>> >   return fold_build2_loc (loc, MULT_EXPR, type,
>>> >fold_build2_loc (loc, code, type,
>>> > fold_convert_loc (loc, type, alt0),
>>> > fold_convert_loc (loc, type, alt1)),
>>> >fold_convert_loc (loc, type, same));
>>> >
>>> > !   /* Same may be zero and thus the operation 'code' may overflow.  Likewise
>>> > !  same may be minus one and thus the multiplication may overflow.  Perform
>>> > !  the operations in an unsigned type.  */
>>> > !   tree utype = unsigned_type_for (type);
>>> > !   tree tem = fold_build2_loc (loc, code, utype,
>>> > !fold_convert_loc (loc, utype, alt0),
>>> > !fold_convert_loc (loc, utype, alt1));
>>> > !   /* If the sum evaluated to a constant that is not -INF the multiplication
>>> > !  cannot overflow.  */
>>> > !   if (TREE_CODE (tem) == INTEGER_CST
>>> > !   && ! wi::eq_p (tem, wi::min_value

Re: [PATCH] Fix PR66313

2017-06-13 Thread Richard Sandiford

Richard Biener  writes:
> On Tue, 13 Jun 2017, Richard Sandiford wrote:
>> Richard Biener  writes:
>> > So I've come back to PR66313 and found a solution to the tailrecursion
>> > missed optimization when fixing the factoring folding to use an unsigned
>> > type when we're not sure of overflow.
>> >
>> > The folding part is identical to my last try from 2015, the tailrecursion
>> > part makes us handle intermittent stmts that were introduced by foldings
>> > that "clobber" our quest walking the single-use chain of stmts between
>> > the call and the return (and failing at all stmts that are not part
>> > of said chain).  A simple solution is to move the stmts that are not
>> > part of the chain and that we can move before the call.  That handles
>> > the leaf conversions that now appear for tree-ssa/tailrecursion-6.c
>> >
>> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>> >
>> > Richard.
>> >
>> > 2017-05-31  Richard Biener  
>> >
>> >PR middle-end/66313
>> >* fold-const.c (fold_plusminus_mult_expr): If the factored
>> >factor may be zero use a wrapping type for the inner operation.
>> >* tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap
>> >and handle moved defs.
>> >(process_assignment): Properly guard the unary op case.  Return a
>> >tri-state indicating that moving the stmt before the call may allow
>> >to continue.  Pass through to_move.
>> >(find_tail_calls): Handle moving unrelated defs before
>> >the call.
>> >
>> >* c-c++-common/ubsan/pr66313.c: New testcase.
>> >* gcc.dg/tree-ssa/loop-15.c: Adjust.
>> >
>> > Index: gcc/fold-const.c
>> > ===
>> > *** gcc/fold-const.c.orig  2015-10-29 12:32:33.302782318 +0100
>> > --- gcc/fold-const.c   2015-10-29 14:08:39.936497739 +0100
>> > *** fold_plusminus_mult_expr (location_t loc
>> > *** 6916,6925 
>> >   }
>> > same = NULL_TREE;
>> >   
>> > !   if (operand_equal_p (arg01, arg11, 0))
>> > ! same = arg01, alt0 = arg00, alt1 = arg10;
>> > !   else if (operand_equal_p (arg00, arg10, 0))
>> >   same = arg00, alt0 = arg01, alt1 = arg11;
>> > else if (operand_equal_p (arg00, arg11, 0))
>> >   same = arg00, alt0 = arg01, alt1 = arg10;
>> > else if (operand_equal_p (arg01, arg10, 0))
>> > --- 6916,6926 
>> >   }
>> > same = NULL_TREE;
>> >   
>> > !   /* Prefer factoring a common non-constant.  */
>> > !   if (operand_equal_p (arg00, arg10, 0))
>> >   same = arg00, alt0 = arg01, alt1 = arg11;
>> > +   else if (operand_equal_p (arg01, arg11, 0))
>> > + same = arg01, alt0 = arg00, alt1 = arg10;
>> > else if (operand_equal_p (arg00, arg11, 0))
>> >   same = arg00, alt0 = arg01, alt1 = arg10;
>> > else if (operand_equal_p (arg01, arg10, 0))
>> > *** fold_plusminus_mult_expr (location_t loc
>> > *** 6974,6987 
>> >}
>> >   }
>> >   
>> > !   if (same)
>> >   return fold_build2_loc (loc, MULT_EXPR, type,
>> >fold_build2_loc (loc, code, type,
>> > fold_convert_loc (loc, type, alt0),
>> > fold_convert_loc (loc, type, alt1)),
>> >fold_convert_loc (loc, type, same));
>> >   
>> > !   return NULL_TREE;
>> >   }
>> >   
>> >   /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
>> > --- 6975,7010 
>> >}
>> >   }
>> >   
>> > !   if (!same)
>> > ! return NULL_TREE;
>> > ! 
>> > !   if (! INTEGRAL_TYPE_P (type)
>> > !   || TYPE_OVERFLOW_WRAPS (type)
>> > !   /* We are neither factoring zero nor minus one.  */
>> > !   || TREE_CODE (same) == INTEGER_CST)
>> >   return fold_build2_loc (loc, MULT_EXPR, type,
>> >fold_build2_loc (loc, code, type,
>> > fold_convert_loc (loc, type, alt0),
>> > fold_convert_loc (loc, type, alt1)),
>> >fold_convert_loc (loc, type, same));
>> >   
>> > !   /* Same may be zero and thus the operation 'code' may overflow.  Likewise
>> > !  same may be minus one and thus the multiplication may overflow.  Perform
>> > !  the operations in an unsigned type.  */
>> > !   tree utype = unsigned_type_for (type);
>> > !   tree tem = fold_build2_loc (loc, code, utype,
>> > !fold_convert_loc (loc, utype, alt0),
>> > !fold_convert_loc (loc, utype, alt1));
>> > !   /* If the sum evaluated to a constant that is not -INF the multiplication
>> > !  cannot overflow.  */
>> > !   if (TREE_CODE (tem) == INTEGER_CST
>> > !   && ! wi::eq_p (tem, wi::min_value (TYPE_PRECISION (utype), SIGNED)))
>> > ! return fold_build2_loc (loc, MULT_EXPR, type,
>> > !  fold_convert (type, tem), same);
>> > ! 
>> > !   return

Re: [PATCH GCC][06/13]Preserve loop nest in whole distribution life time

2017-06-13 Thread Richard Biener

On Tue, Jun 13, 2017 at 1:15 PM, Bin.Cheng  wrote:
> On Tue, Jun 13, 2017 at 12:06 PM, Richard Biener
>  wrote:
>> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
>>> Hi,
>>> This simple patch computes and preserves loop nest vector for whole
>>> distribution life time.  The loop nest will be used multiple times in
>>> on-demand data dependence computation.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>
>> Don't like it too much but I guess we can see if refactoring it back
>> to pass down loop_nest can work.
> The global data is partly to avoid patch conflicts when separating
> patches; otherwise several parameters are needed for quite a number of
> functions.  We can introduce a global distribution data structure and
> only pass it to the various functions.

Or make a class covering distribution of one loop (nest) and make all
functions members ...

struct one_loop_distribution
{
  one_loop_distribution (loop *);
...
};
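
A minimal sketch of that suggestion, assuming stand-in types rather than GCC's real loop structure: the loop nest becomes a member computed once in the constructor, so helpers need neither a global nor an extra parameter.  The members depth and outermost_num and the helper demo_nest_depth are hypothetical and only illustrate the shape.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for GCC's loop structure; only the fields used here.
struct loop
{
  int num;      // loop number
  loop *inner;  // first inner loop, or nullptr
};

// Hypothetical wrapper: state that would otherwise be global lives in
// the object, and the helpers become member functions.
class one_loop_distribution
{
public:
  explicit one_loop_distribution (loop *l)
  {
    assert (l != nullptr);
    // Collect the nest once; every member function can reuse it.
    for (loop *p = l; p != nullptr; p = p->inner)
      loop_nest_.push_back (p);
  }

  std::size_t depth () const { return loop_nest_.size (); }
  int outermost_num () const { return loop_nest_.front ()->num; }

private:
  std::vector<loop *> loop_nest_;
};

// Demonstration helper: depth of a two-level nest.
inline std::size_t
demo_nest_depth ()
{
  loop inner = { 2, nullptr };
  loop outer = { 1, &inner };
  one_loop_distribution dist (&outer);
  return dist.depth ();
}
```

The object's lifetime then matches the distribution of one loop, which is what the global was approximating.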

Richard.

> Thanks,
> bin
>>
>> Ok.
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> bin
>>> 2017-06-07  Bin Cheng  
>>>
>>> * tree-loop-distribution.c (loop_nest): New global var.
>>> (build_rdg): Use loop directly, rather than loop nest.
>>> (pg_add_dependence_edges): Remove loop nest parameter.  Use global
>>> variable directly.
>>> (distribute_loop): Compute global variable loop nest.  Update use.

Re: [PATCH GCC][06/13]Preserve loop nest in whole distribution life time

2017-06-13 Thread Bin.Cheng

On Tue, Jun 13, 2017 at 12:06 PM, Richard Biener
 wrote:
> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
>> Hi,
>> This simple patch computes and preserves loop nest vector for whole
>> distribution life time.  The loop nest will be used multiple times in
>> on-demand data dependence computation.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> Don't like it too much but I guess we can see if refactoring it back
> to pass down loop_nest can work.
The global data is partly to avoid patch conflicts when separating
patches; otherwise several parameters are needed for quite a number of
functions.  We can introduce a global distribution data structure and
only pass it to the various functions.

Thanks,
bin
>
> Ok.
>
> Thanks,
> Richard.
>
>> Thanks,
>> bin
>> 2017-06-07  Bin Cheng  
>>
>> * tree-loop-distribution.c (loop_nest): New global var.
>> (build_rdg): Use loop directly, rather than loop nest.
>> (pg_add_dependence_edges): Remove loop nest parameter.  Use global
>> variable directly.
>> (distribute_loop): Compute global variable loop nest.  Update use.

Re: [PATCH GCC][07/13]Preserve data references for whole distribution life time

2017-06-13 Thread Richard Biener

On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
> Hi,
> This patch collects and preserves all data references in loop for whole
> distribution life time.  It will be used afterwards.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

+/* Vector of data references in the loop to be distributed.  */
+static vec *datarefs_vec;
+
+/* Map of data reference in the loop to a unique id.  */
+static hash_map *datarefs_map;
+

There is no need to make those pointers.  It's not a unique id but
the index into the datarefs_vec vector, right?

loop distribution doesn't yet use dr->aux so it would be nice
to avoid the hash_map in favor of using that field.

#define DR_INDEX(dr) ((uintptr_t)(dr)->aux)
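
A small sketch of the aux-field idea, assuming a stand-in data_ref type with a spare pointer-sized field (the real data_reference has more members).  The helper assign_indices is hypothetical.  Note that the macro must be written without a space before its parameter list, or it becomes an object-like macro.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

// Stand-in for a data reference with a spare pointer-sized field.
struct data_ref
{
  void *aux = nullptr;
};

// Function-like macro: no space between the name and the parameter list.
#define DR_INDEX(dr) ((uintptr_t) (dr)->aux)

// Record each reference's position in the vector in its aux field, so
// no separate map from reference to index is needed.
inline void
assign_indices (std::vector<data_ref> &refs)
{
  for (std::size_t i = 0; i < refs.size (); ++i)
    {
      refs[i].aux = (void *) (uintptr_t) i;
      assert (DR_INDEX (&refs[i]) == i);
    }
}
```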

+  if (datarefs_vec->length () > 64)

There is PARAM_VALUE (PARAM_LOOP_MAX_DATAREFS_FOR_DATADEPS)
with a default value of 1000.  Please use that instead of magic numbers.

+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file,
+"Loop %d not distributed: more than 64 memory references.\n",
+loop->num);
+
+  free_rdg (rdg);
+  loop_nest->release ();
+  delete loop_nest;
+  free_data_refs (*datarefs_vec);
+  delete datarefs_vec;
+  return 0;
+}

auto_* were so nice ...
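
To illustrate the remark, here is a sketch of the RAII pattern that the auto_* helpers provide, written with a hypothetical auto_buffer type rather than GCC's own classes: the early return needs no explicit cleanup sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts live buffers so the effect of cleanup is observable.
static int live_buffers = 0;

// Hypothetical RAII holder: the destructor releases the storage on
// every exit path.
struct auto_buffer
{
  std::vector<int> data;
  auto_buffer () { ++live_buffers; }
  ~auto_buffer () { --live_buffers; }
  auto_buffer (const auto_buffer &) = delete;
  auto_buffer &operator= (const auto_buffer &) = delete;
};

// Bail out when there are too many references, without manual frees.
inline int
process (std::size_t n_refs, std::size_t limit)
{
  auto_buffer refs;
  refs.data.resize (n_refs);
  assert (live_buffers > 0);
  if (refs.data.size () > limit)
    return 0;  // refs is released here automatically
  return 1;    // ... and here
}
```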

> Thanks,
> bin
> 2017-06-07  Bin Cheng  
>
> * tree-loop-distribution.c (datarefs_vec, datarefs_map): New
> global var.
> (create_rdg_vertices): Use datarefs_vec directly.
> (free_rdg): Don't free data references.
> (build_rdg): Update use.  Don't free data references.
> (distribute_loop): Compute global variable for data references.
> Bail out if there are too many data references.

Re: [PATCH v2] Implement no_sanitize function attribute

2017-06-13 Thread Martin Liška

On 06/09/2017 03:35 PM, Richard Biener wrote:
> You can directly transform to no_sanitize with integer mask, not sure why
> you'd need an intermediate step with a string?

Hello.

Done in the attached patch; I'm sending both the incremental and the final
version (complete patch).
I also decided to support the no_sanitize attribute in the pretty printer:

__attribute__((no_sanitize (address | shift | shift-base | shift-exponent | 
integer-divide-by-zero | undefined | unreachable | vla-bound | return | null | 
signed-integer-overflow | bool | enum | float-divide-by-zero | 
float-cast-overflow | bounds | bounds-strict | alignment | nonnull-attribute | 
returns-nonnull-attribute | object-size | vptr)))
fn1 ()
{
  char my_char[9];
  char * ptr2;
  char * ptr;
..


Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
diff --git a/gcc/asan.h b/gcc/asan.h
index a590d0a5ace..95bb89e197c 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -154,4 +154,24 @@ asan_protect_stack_decl (tree decl)
 	|| (asan_sanitize_use_after_scope () && TREE_ADDRESSABLE (decl)));
 }
 
+/* Return true when flag_sanitize & FLAG is non-zero.  If FN is non-null,
+   remove all flags mentioned in "no_sanitize" of DECL_ATTRIBUTES.  */
+
+static inline bool
+sanitize_flags_p (unsigned int flag, const_tree fn = current_function_decl)
+{
+  unsigned int result_flags = flag_sanitize & flag;
+  if (result_flags == 0)
+return false;
+
+  if (fn != NULL_TREE)
+{
+  tree value = lookup_attribute ("no_sanitize", DECL_ATTRIBUTES (fn));
+  if (value)
+	result_flags &= ~tree_to_uhwi (TREE_VALUE (value));
+}
+
+  return result_flags;
+}
+
 #endif /* TREE_ASAN */
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index abb43d0d02c..2b6845f2cbd 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -558,17 +558,22 @@ handle_cold_attribute (tree *node, tree name, tree ARG_UNUSED (args),
 void
 add_no_sanitize_value (tree node, unsigned int flags)
 {
-  tree attr = lookup_attribute ("no_sanitize_flags", DECL_ATTRIBUTES (node));
+  tree attr = lookup_attribute ("no_sanitize", DECL_ATTRIBUTES (node));
   if (attr)
 {
   unsigned int old_value = tree_to_uhwi (TREE_VALUE (attr));
   flags |= old_value;
-}
 
-  DECL_ATTRIBUTES (node)
-= tree_cons (get_identifier ("no_sanitize_flags"),
-		 build_int_cst (unsigned_type_node, flags),
-		 DECL_ATTRIBUTES (node));
+  if (flags == old_value)
+	return;
+
+  TREE_VALUE (attr) = build_int_cst (unsigned_type_node, flags);
+}
+  else
+DECL_ATTRIBUTES (node)
+  = tree_cons (get_identifier ("no_sanitize"),
+		   build_int_cst (unsigned_type_node, flags),
+		   DECL_ATTRIBUTES (node));
 }
 
 /* Handle a "no_sanitize" attribute; arguments as in
@@ -578,11 +583,11 @@ static tree
 handle_no_sanitize_attribute (tree *node, tree name, tree args, int,
 			  bool *no_add_attrs)
 {
+  *no_add_attrs = true;
   tree id = TREE_VALUE (args);
   if (TREE_CODE (*node) != FUNCTION_DECL)
 {
   warning (OPT_Wattributes, "%qE attribute ignored", name);
-  *no_add_attrs = true;
   return NULL_TREE;
 }
 
@@ -614,11 +619,9 @@ static tree
 handle_no_sanitize_address_attribute (tree *node, tree name, tree, int,
   bool *no_add_attrs)
 {
+  *no_add_attrs = true;
   if (TREE_CODE (*node) != FUNCTION_DECL)
-{
-  warning (OPT_Wattributes, "%qE attribute ignored", name);
-  *no_add_attrs = true;
-}
+warning (OPT_Wattributes, "%qE attribute ignored", name);
   else
 add_no_sanitize_value (*node, SANITIZE_ADDRESS);
 
@@ -632,11 +635,9 @@ static tree
 handle_no_sanitize_thread_attribute (tree *node, tree name, tree, int,
   bool *no_add_attrs)
 {
+  *no_add_attrs = true;
   if (TREE_CODE (*node) != FUNCTION_DECL)
-{
-  warning (OPT_Wattributes, "%qE attribute ignored", name);
-  *no_add_attrs = true;
-}
+warning (OPT_Wattributes, "%qE attribute ignored", name);
   else
 add_no_sanitize_value (*node, SANITIZE_THREAD);
 
@@ -651,11 +652,9 @@ static tree
 handle_no_address_safety_analysis_attribute (tree *node, tree name, tree, int,
 	 bool *no_add_attrs)
 {
+  *no_add_attrs = true;
   if (TREE_CODE (*node) != FUNCTION_DECL)
-{
-  warning (OPT_Wattributes, "%qE attribute ignored", name);
-  *no_add_attrs = true;
-}
+warning (OPT_Wattributes, "%qE attribute ignored", name);
   else
 add_no_sanitize_value (*node, SANITIZE_ADDRESS);
 
@@ -669,11 +668,9 @@ static tree
 handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int,
   bool *no_add_attrs)
 {
+  *no_add_attrs = true;
   if (TREE_CODE (*node) != FUNCTION_DECL)
-{
-  warning (OPT_Wattributes, "%qE attribute ignored", name);
-  *no_add_attrs = true;
-}
+warning (OPT_Wattributes, "%qE attribute ignored", name);
   else
 add_no_sanitize_value (*node,
 			   SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT);
diff --git
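
The masking performed by sanitize_flags_p and add_no_sanitize_value in the patch above can be sketched in isolation.  This is only an illustration: the SAN_* values and both function names below are hypothetical and do not match GCC's SANITIZE_* constants or its tree-based attribute storage.

```cpp
#include <cassert>

// Hypothetical flag values for illustration; not GCC's actual constants.
enum : unsigned
{
  SAN_ADDRESS = 1u << 0,
  SAN_THREAD = 1u << 1,
  SAN_SHIFT = 1u << 2
};

// ENABLED: sanitizers requested globally.  FLAG: sanitizers being queried.
// DISABLED_FOR_FN: mask taken from the function's no_sanitize attribute.
inline bool
sanitize_enabled_p (unsigned enabled, unsigned flag, unsigned disabled_for_fn)
{
  unsigned result = enabled & flag;
  if (result == 0)
    return false;
  result &= ~disabled_for_fn;
  return result != 0;
}

// Merge a new no_sanitize mask into an existing one.
inline unsigned
merge_no_sanitize (unsigned old_mask, unsigned new_mask)
{
  unsigned merged = old_mask | new_mask;
  assert ((merged & old_mask) == old_mask);
  return merged;
}
```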

Re: [PATCH GCC][06/13]Preserve loop nest in whole distribution life time

2017-06-13 Thread Richard Biener

On Tue, Jun 13, 2017 at 1:06 PM, Richard Biener
 wrote:
> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
>> Hi,
>> This simple patch computes and preserves loop nest vector for whole
>> distribution life time.  The loop nest will be used multiple times in
>> on-demand data dependence computation.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> Don't like it too much but I guess we can see if refactoring it back
> to pass down loop_nest can work.
>
> Ok.

Oh.

+/* The loop (nest) to be distributed.  */
+static vec<loop_p> *loop_nest;
+

please make it

static vec<loop_p> loop_nest;

instead to avoid a pointless indirection (vec<> just contains a
pointer to allocated storage).
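
A sketch of why the extra pointer buys nothing, assuming a simplified vec-like handle (tiny_vec is hypothetical and far smaller than GCC's vec): the handle is already pointer-sized, so a static object costs the same space as a static pointer to one and saves a dereference.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-in for a vec-like handle: the object itself is just a
// pointer to heap storage (an assumption made for illustration).
template <typename T>
struct tiny_vec
{
  struct storage { std::size_t len; T items[8]; };
  storage *data = nullptr;

  void push (const T &v)
  {
    if (!data)
      data = new storage ();  // value-initialized: len starts at 0
    assert (data->len < 8);
    data->items[data->len++] = v;
  }
  std::size_t length () const { return data ? data->len : 0; }
  void release () { delete data; data = nullptr; }
};

static_assert (sizeof (tiny_vec<int>) == sizeof (void *),
               "the handle is pointer-sized");

// A plain static object: one pointer load reaches the storage.
static tiny_vec<int> nest;
```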

Richard.

> Thanks,
> Richard.
>
>> Thanks,
>> bin
>> 2017-06-07  Bin Cheng  
>>
>> * tree-loop-distribution.c (loop_nest): New global var.
>> (build_rdg): Use loop directly, rather than loop nest.
>> (pg_add_dependence_edges): Remove loop nest parameter.  Use global
>> variable directly.
>> (distribute_loop): Compute global variable loop nest.  Update use.

Re: [PATCH GCC][06/13]Preserve loop nest in whole distribution life time

2017-06-13 Thread Richard Biener

On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
> Hi,
> This simple patch computes and preserves loop nest vector for whole
> distribution life time.  The loop nest will be used multiple times in
> on-demand data dependence computation.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

Don't like it too much but I guess we can see if refactoring it back
to pass down loop_nest can work.

Ok.

Thanks,
Richard.

> Thanks,
> bin
> 2017-06-07  Bin Cheng  
>
> * tree-loop-distribution.c (loop_nest): New global var.
> (build_rdg): Use loop directly, rather than loop nest.
> (pg_add_dependence_edges): Remove loop nest parameter.  Use global
> variable directly.
> (distribute_loop): Compute global variable loop nest.  Update use.

Re: [PATCH GCC][05/13]Refactoring partition merge

2017-06-13 Thread Richard Biener

On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
> Hi,
> This simple patch refactors partition merge code and dump information.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
> 2017-06-07  Bin Cheng  
>
> * tree-loop-distribution.c (enum fuse_type, fuse_message): New.
> (partition_merge_into): New parameter.  Dump reason for fusion.
> (distribute_loop): Update use of partition_merge_into.

Re: [PATCH 2/3] Make early return predictor more precise.

2017-06-13 Thread Martin Liška

On 06/09/2017 04:08 PM, Jan Hubicka wrote:
>> gcc/ChangeLog:
>>
>> 2017-05-26  Martin Liska  
>>
>>  PR tree-optimization/79489
>>  * gimplify.c (maybe_add_early_return_predict_stmt): New
>>  function.
>>  (gimplify_return_expr): Call the function.
>>  * predict.c (tree_estimate_probability_bb): Remove handling
>>  of early return.
>>  * predict.def: Update comment about early return predictor.
>>  * gimple-predict.h (is_gimple_predict): New function.
>>  * tree-inline.c (remap_gimple_stmt): Do not copy early return
>>  predictors during inlining.
>>  * predict.def: Change default value of early return to 66.
> 
> Thanks for working on this.
> Doing tail recursion early is quite useful.  Can't we make the pass
> skip predict statements in analysis in a similar way as debug statements
> are skipped?

Hi.

Yes, this was easy to fix, skipping here helps.

>> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
>> index f3ec404ef09..3f3813cb062 100644
>> --- a/gcc/tree-inline.c
>> +++ b/gcc/tree-inline.c
>> @@ -1629,6 +1629,13 @@ remap_gimple_stmt (gimple *stmt, copy_body_data *id)
>>gimple_seq_add_stmt (&stmts, copy);
>>return stmts;
>>  }
>> +  if (is_gimple_predict (stmt))
>> +{
>> +  /* Do not copy early return predictor that does not make sense
>> + after inlining.  */
>> +  if (gimple_predict_predictor (stmt) == PRED_TREE_EARLY_RETURN)
>> +return stmts;
>> +}
> 
> I am also not quite sure about this one.  The code was still structured in
> such a way that there was an early return in the inlined function, so we may
> still assume that the heuristic works for it?

Ok, you're right that we can preserve the predictor. However, let's consider
the following test case:

static
int baz(int a)
{
  if (a == 1)
    return 1;

  return 0;
}

static
int bar(int a)
{
  if (a == 1)
    return baz(a);

  return 0;
}

static
int foo(int a)
{
  if (a == 1)
    return bar(a);

  return 12;
}

int main(int argc, char **argv)
{
  return foo(argc);
}

There, after einline, we have:

main (int argc, char * * argv)
{
  int D.1832;
  int _3;
  int _4;

  <bb 2> [100.00%]:
  if (argc_2(D) == 1)
    goto <bb 3>; [37.13%]
  else
    goto <bb 4>; [62.87%]

  <bb 3> [37.13%]:
  // predicted unlikely by early return (on trees) predictor.
  // predicted unlikely by early return (on trees) predictor.
  // predicted unlikely by early return (on trees) predictor.

  <bb 4> [100.00%]:
  # _3 = PHI <12(2), 1(3)>
  _5 = _3;
  _4 = _5;
  return _4;

}

I'm wondering what the best place to merge all the predictor
statements would be.

Thanks,
Martin

> 
> Where did you find this case?
> Honza
>>  
>>/* Create a new deep copy of the statement.  */
>>copy = gimple_copy (stmt);
>> -- 
>> 2.13.0
>>

Re: [PATCH GCC][04/13]Sort statements in topological order for loop distribution

2017-06-13 Thread Richard Biener

On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
> Hi,
> During the work I ran into a latent bug for distributing.  For the moment
> we sort statements in dominance order, but that's not enough because basic
> blocks may be sorted in reverse order of execution flow.  This results in
> wrong data dependence direction later.  This patch fixes the issue by
> sorting in topological order.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

I suppose you are fixing

static int
pg_add_dependence_edges (struct graph *rdg, vec<loop_p> loops, int dir,
                         vec<data_reference_p> drs1,
                         vec<data_reference_p> drs2)
{
...
        /* Re-shuffle data-refs to be in dominator order.  */
        if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
            > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
          {
            std::swap (dr1, dr2);
            this_dir = -this_dir;
          }

but then for stmts that are not "ordered" by RPO or DOM like

   if (flag)
 ... = dr1;
   else
 ... = dr2;

this doesn't avoid spurious swaps?  Also the code was basically
copied from tree-data-refs.c:find_data_references_in_loop which
does iterate over get_loop_body_in_dom_order as well.  So isn't the
issue latent there as well?

That said, what about those "unordered" stmts?  I suppose
dependence analysis still happily computes a dependence
distance but in reality we'd have to consider both execution
orders?
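
To make the ordering question concrete, here is a sketch that numbers basic blocks in reverse postorder on a small diamond-shaped CFG.  The cfg type and rpo_index are hypothetical stand-ins, not the GCC CFG API.  It shows that the two arms of an if/else receive distinct indices even though neither executes before the other, which is the "unordered" case raised above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Successor lists indexed by basic-block number (hypothetical CFG type).
using cfg = std::vector<std::vector<int>>;

// Depth-first walk that appends each block after all its successors.
static void
dfs (const cfg &g, int bb, std::vector<bool> &seen, std::vector<int> &post)
{
  seen[bb] = true;
  for (int succ : g[bb])
    if (!seen[succ])
      dfs (g, succ, seen, post);
  post.push_back (bb);
}

// Return, for each block, its position in reverse postorder from ENTRY;
// blocks unreachable from ENTRY keep the index -1.
inline std::vector<int>
rpo_index (const cfg &g, int entry)
{
  std::vector<bool> seen (g.size (), false);
  std::vector<int> post;
  dfs (g, entry, seen, post);
  assert (post.size () <= g.size ());

  std::vector<int> index (g.size (), -1);
  int next = 0;
  for (auto it = post.rbegin (); it != post.rend (); ++it)
    index[*it] = next++;
  return index;
}
```

For an acyclic region this numbering is a topological order, but the relative order of the two arms depends only on the order in which successors are visited.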

Thanks,
Richard.


>
> Thanks,
> bin
> 2017-06-07  Bin Cheng  
>
> * tree-loop-distribution.c (bb_top_order_index): New.
> (bb_top_order_index_size, bb_top_order_cmp): New.
> (stmts_from_loop): Use topological order.
> (pass_loop_distribution::execute): Compute topological order for
> basic blocks.

Re: [PATCH GCC][03/13]Mark and skip distributed loops

2017-06-13 Thread Bin.Cheng

On Tue, Jun 13, 2017 at 11:47 AM, Richard Biener
 wrote:
> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
>> Hi,
>> This simple patch marks distributed loops and skips it in following
>> distribution.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> This is not necessary, FOR_EACH_LOOP first builds a vector of loops to
> iterate over so it will never pick up loops added during iteration.
This was originally for loop nest distribution, but I didn't include
that part in this series.  I will withdraw it for now.

Thanks,
bin
>
> Richard.
>
>> Thanks,
>> bin
>> 2017-06-07  Bin Cheng  
>>
>> * tree-loop-distribution.c (generate_loops_for_partition): Mark
>> distributed loops.
>> (pass_loop_distribution::execute): Skip distributed loops.

[libgomp, OpenACC] Add more map handling for enter/exit data directives

2017-06-13 Thread Chung-Lin Tang

Hi Jakub,
this patch has been posted before, but hasn't really been reviewed yet:
https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01927.html

This has been deployed on gomp-4_0-branch for a long time, and was re-tested
on current trunk, test results okay.

Is this okay for trunk?

Thanks,
Chung-Lin

2016-06-13  Cesar Philippidis  
Thomas Schwinge  
Chung-Lin Tang  

libgomp/
* oacc-parallel.c (find_pset): Adjust and rename from...
(find_pointer): ...this function.
(GOACC_enter_exit_data): Handle GOMP_MAP_TO and GOMP_MAP_ALLOC,
adjust find_pointer calls into find_pset, adjust pointer map handling,
add acc_is_present guards to calls to gomp_acc_insert_pointer and
gomp_acc_remove_pointer.

* testsuite/libgomp.oacc-c-c++-common/data-2.c: Update test.
* testsuite/libgomp.oacc-c-c++-common/enter-data.c: New test.
* testsuite/libgomp.oacc-fortran/data-2.f90: Update test.
Index: oacc-parallel.c
===
--- oacc-parallel.c (revision 249147)
+++ oacc-parallel.c (working copy)
@@ -38,8 +38,11 @@
 #include 
 #include 
 
+/* Returns the number of mappings associated with the pointer or pset. PSET
+   have three mappings, whereas pointer have two.  */
+
 static int
-find_pset (int pos, size_t mapnum, unsigned short *kinds)
+find_pointer (int pos, size_t mapnum, unsigned short *kinds)
 {
   if (pos + 1 >= mapnum)
 return 0;
@@ -46,7 +49,12 @@ static int
 
   unsigned char kind = kinds[pos+1] & 0xff;
 
-  return kind == GOMP_MAP_TO_PSET;
+  if (kind == GOMP_MAP_TO_PSET)
+return 3;
+  else if (kind == GOMP_MAP_POINTER)
+return 2;
+
+  return 0;
 }
 
 static void goacc_wait (int async, int num_waits, va_list *ap);
@@ -298,7 +306,9 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 
   if (kind == GOMP_MAP_FORCE_ALLOC
  || kind == GOMP_MAP_FORCE_PRESENT
- || kind == GOMP_MAP_FORCE_TO)
+ || kind == GOMP_MAP_FORCE_TO
+ || kind == GOMP_MAP_TO
+ || kind == GOMP_MAP_ALLOC)
{
  data_enter = true;
  break;
@@ -312,6 +322,15 @@ GOACC_enter_exit_data (int device, size_t mapnum,
  kind);
 }
 
+  /* In c, non-pointers and arrays are represented by a single data clause.
+ Dynamically allocated arrays and subarrays are represented by a data
+ clause followed by an internal GOMP_MAP_POINTER.
+
+ In fortran, scalars and not allocated arrays are represented by a
+ single data clause. Allocated arrays and subarrays have three mappings:
+ 1) the original data clause, 2) a PSET 3) a pointer to the array data.
+  */
+
   if (data_enter)
 {
   for (i = 0; i < mapnum; i++)
@@ -318,25 +337,24 @@ GOACC_enter_exit_data (int device, size_t mapnum,
{
  unsigned char kind = kinds[i] & 0xff;
 
- /* Scan for PSETs.  */
- int psets = find_pset (i, mapnum, kinds);
+ /* Scan for pointers and PSETs.  */
+ int pointer = find_pointer (i, mapnum, kinds);
 
- if (!psets)
+ if (!pointer)
{
  switch (kind)
{
-   case GOMP_MAP_POINTER:
- gomp_acc_insert_pointer (1, &hostaddrs[i], &sizes[i],
-   &kinds[i]);
+   case GOMP_MAP_ALLOC:
+ acc_present_or_create (hostaddrs[i], sizes[i]);
  break;
case GOMP_MAP_FORCE_ALLOC:
  acc_create (hostaddrs[i], sizes[i]);
  break;
-   case GOMP_MAP_FORCE_PRESENT:
+   case GOMP_MAP_TO:
  acc_present_or_copyin (hostaddrs[i], sizes[i]);
  break;
case GOMP_MAP_FORCE_TO:
- acc_present_or_copyin (hostaddrs[i], sizes[i]);
+ acc_copyin (hostaddrs[i], sizes[i]);
  break;
default:
  gomp_fatal (" GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
@@ -346,12 +364,16 @@ GOACC_enter_exit_data (int device, size_t mapnum,
}
  else
{
- gomp_acc_insert_pointer (3, &hostaddrs[i], &sizes[i], &kinds[i]);
+ if (!acc_is_present (hostaddrs[i], sizes[i]))
+   {
+ gomp_acc_insert_pointer (pointer, &hostaddrs[i],
+  &sizes[i], &kinds[i]);
+   }
  /* Increment 'i' by two because OpenACC requires fortran
 arrays to be contiguous, so each PSET is associated with
 one of MAP_FORCE_ALLOC/MAP_FORCE_PRESET/MAP_FORCE_TO, and
 one MAP_POINTER.  */
- i += 2;
+ i += pointer - 1;
}
}
 }
@@ -360,19 +382,15 @@ GOACC_enter_exit_data (int device, size_t mapnum,
   {
unsigned char kind = kinds[i] & 0xff;
 
-
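
The grouping logic in the hunks above can be sketched on its own.  The MAP_* values below are hypothetical placeholders, not libgomp's GOMP_MAP_* constants, and count_groups is an illustrative helper: it walks the kinds array and steps over the extra entries of each pointer or PSET group the way the patched loop does with i += pointer - 1.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kind values for illustration; not libgomp's constants.
enum map_kind : unsigned short
{
  MAP_ALLOC = 0,
  MAP_TO = 1,
  MAP_POINTER = 2,
  MAP_TO_PSET = 3
};

// Number of mappings in the group starting at POS: 3 when the next entry
// is a PSET, 2 when it is a pointer, 0 when POS stands alone.
inline int
find_pointer (std::size_t pos, std::size_t mapnum, const unsigned short *kinds)
{
  if (pos + 1 >= mapnum)
    return 0;
  unsigned char kind = kinds[pos + 1] & 0xff;
  if (kind == MAP_TO_PSET)
    return 3;
  if (kind == MAP_POINTER)
    return 2;
  return 0;
}

// Count the groups, stepping over the extra entries of each group.
inline int
count_groups (std::size_t mapnum, const unsigned short *kinds)
{
  int groups = 0;
  for (std::size_t i = 0; i < mapnum; i++)
    {
      int pointer = find_pointer (i, mapnum, kinds);
      if (pointer)
        {
          assert (i + static_cast<std::size_t> (pointer) <= mapnum);
          i += pointer - 1;
        }
      groups++;
    }
  return groups;
}
```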

Re: [PATCH GCC][03/13]Mark and skip distributed loops

2017-06-13 Thread Richard Biener

On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
> Hi,
> This simple patch marks distributed loops and skips it in following
> distribution.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

This is not necessary, FOR_EACH_LOOP first builds a vector of loops to
iterate over so it will never pick up loops added during iteration.
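
A sketch of the snapshot behaviour described here, using a plain std::vector in place of GCC's loop list.  The names for_each_initial_loop and demo_visits are hypothetical and this is not the FOR_EACH_LOOP implementation: because iteration runs over a copy taken up front, entries appended by the callback are never visited.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Visit the loops that exist when iteration starts; loops appended by
// the callback are not visited.  Returns the number of loops visited.
template <typename Fn>
inline std::size_t
for_each_initial_loop (std::vector<int> &loops, Fn visit)
{
  const std::vector<int> snapshot = loops;  // copy taken up front
  for (int l : snapshot)
    visit (l, loops);
  return snapshot.size ();
}

// Demonstration: every visit appends a "distributed" loop to the list.
inline std::size_t
demo_visits ()
{
  std::vector<int> loops = { 1, 2, 3 };
  std::size_t visited
    = for_each_initial_loop (loops, [] (int l, std::vector<int> &all)
                             { all.push_back (l + 100); });
  assert (loops.size () == 6);  // three loops were appended
  return visited;
}
```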

Richard.

> Thanks,
> bin
> 2017-06-07  Bin Cheng  
>
> * tree-loop-distribution.c (generate_loops_for_partition): Mark
> distributed loops.
> (pass_loop_distribution::execute): Skip distributed loops.

Re: [PATCH GCC][02/13]Skip distribution if there is no loop

2017-06-13 Thread Richard Biener

On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
> Hi,
> this is a simple patch skipping distribution if there is no loop at all.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?
> Thanks,
> bin
>
> 2017-06-07  Bin Cheng  
>
> * cfgloop.h (pass_loop_distribution::execute): Skip if no loops.

The ChangeLog entry should name tree-loop-distribution.c.

Ok.

Richard.

Re: [PATCH 00/30] [ARM] Reworking the -mcpu, -march and -mfpu options

2017-06-13 Thread Joseph Myers

On Tue, 13 Jun 2017, Richard Earnshaw (lists) wrote:

> I wonder if we should/could add a LAST attribute to the options
> specification such that the driver discards all but the final instance
> of such an option.  This would also solve the -mcpu=native problem since
> the discard rule would kick in and eliminate that option if it wasn't
> the final one in the list.

As noted, I think all the options should be validated before discarding in
such a case (which is easy for Enum options, but for options with custom
parsing code care would need to be taken that this code is run for
validation purposes before discarding).
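
A minimal sketch of that ordering, assuming a hypothetical list of accepted values and a hypothetical helper last_validated (this is not the GCC driver's option machinery): every occurrence is checked first, and only then are all but the final one dropped.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical set of accepted values, for illustration only.
inline bool
valid_cpu_p (const std::string &name)
{
  return name == "native" || name == "cortex-a53" || name == "cortex-a72";
}

// Validate every occurrence, then keep only the final one.  Throws on an
// invalid value even if a later, valid one would have overridden it.
// Returns an empty string when the option was not given at all.
inline std::string
last_validated (const std::vector<std::string> &values)
{
  for (const std::string &v : values)
    if (!valid_cpu_p (v))
      throw std::invalid_argument ("unrecognized value: " + v);
  if (values.empty ())
    return std::string ();
  assert (valid_cpu_p (values.back ()));
  return values.back ();
}
```

With this order, a mistyped earlier value is still diagnosed instead of being silently discarded.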

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Fix PR66313

2017-06-13 Thread Richard Biener

On Tue, 13 Jun 2017, Richard Sandiford wrote:

> Richard Biener  writes:
> > So I've come back to PR66313 and found a solution to the tailrecursion
> > missed optimization when fixing the factoring folding to use an unsigned
> > type when we're not sure of overflow.
> >
> > The folding part is identical to my last try from 2015, the tailrecursion
> > part makes us handle intermittent stmts that were introduced by foldings
> > that "clobber" our quest walking the single-use chain of stmts between
> > the call and the return (and failing at all stmts that are not part
> > of said chain).  A simple solution is to move the stmts that are not
> > part of the chain and that we can move before the call.  That handles
> > the leaf conversions that now appear for tree-ssa/tailrecursion-6.c
> >
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> >
> > Richard.
> >
> > 2017-05-31  Richard Biener  
> >
> > PR middle-end/66313
> > * fold-const.c (fold_plusminus_mult_expr): If the factored
> > factor may be zero use a wrapping type for the inner operation.
> > * tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap
> > and handle moved defs.
> > (process_assignment): Properly guard the unary op case.  Return a
> > tri-state indicating that moving the stmt before the call may allow
> > to continue.  Pass through to_move.
> > (find_tail_calls): Handle moving unrelated defs before
> > the call.
> >
> > * c-c++-common/ubsan/pr66313.c: New testcase.
> > * gcc.dg/tree-ssa/loop-15.c: Adjust.
> >
> > Index: gcc/fold-const.c
> > ===
> > *** gcc/fold-const.c.orig   2015-10-29 12:32:33.302782318 +0100
> > --- gcc/fold-const.c2015-10-29 14:08:39.936497739 +0100
> > *** fold_plusminus_mult_expr (location_t loc
> > *** 6916,6925 
> >   }
> > same = NULL_TREE;
> >   
> > !   if (operand_equal_p (arg01, arg11, 0))
> > ! same = arg01, alt0 = arg00, alt1 = arg10;
> > !   else if (operand_equal_p (arg00, arg10, 0))
> >   same = arg00, alt0 = arg01, alt1 = arg11;
> > else if (operand_equal_p (arg00, arg11, 0))
> >   same = arg00, alt0 = arg01, alt1 = arg10;
> > else if (operand_equal_p (arg01, arg10, 0))
> > --- 6916,6926 
> >   }
> > same = NULL_TREE;
> >   
> > !   /* Prefer factoring a common non-constant.  */
> > !   if (operand_equal_p (arg00, arg10, 0))
> >   same = arg00, alt0 = arg01, alt1 = arg11;
> > +   else if (operand_equal_p (arg01, arg11, 0))
> > + same = arg01, alt0 = arg00, alt1 = arg10;
> > else if (operand_equal_p (arg00, arg11, 0))
> >   same = arg00, alt0 = arg01, alt1 = arg10;
> > else if (operand_equal_p (arg01, arg10, 0))
> > *** fold_plusminus_mult_expr (location_t loc
> > *** 6974,6987 
> > }
> >   }
> >   
> > !   if (same)
> >   return fold_build2_loc (loc, MULT_EXPR, type,
> > fold_build2_loc (loc, code, type,
> >  fold_convert_loc (loc, type, alt0),
> >  fold_convert_loc (loc, type, alt1)),
> > fold_convert_loc (loc, type, same));
> >   
> > !   return NULL_TREE;
> >   }
> >   
> >   /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
> > --- 6975,7010 
> > }
> >   }
> >   
> > !   if (!same)
> > ! return NULL_TREE;
> > ! 
> > !   if (! INTEGRAL_TYPE_P (type)
> > !   || TYPE_OVERFLOW_WRAPS (type)
> > !   /* We are neither factoring zero nor minus one.  */
> > !   || TREE_CODE (same) == INTEGER_CST)
> >   return fold_build2_loc (loc, MULT_EXPR, type,
> > fold_build2_loc (loc, code, type,
> >  fold_convert_loc (loc, type, alt0),
> >  fold_convert_loc (loc, type, alt1)),
> > fold_convert_loc (loc, type, same));
> >   
> > !   /* Same may be zero and thus the operation 'code' may overflow.  Likewise
> > !  same may be minus one and thus the multiplication may overflow.  Perform
> > !  the operations in an unsigned type.  */
> > !   tree utype = unsigned_type_for (type);
> > !   tree tem = fold_build2_loc (loc, code, utype,
> > ! fold_convert_loc (loc, utype, alt0),
> > ! fold_convert_loc (loc, utype, alt1));
> > !   /* If the sum evaluated to a constant that is not -INF the multiplication
> > !  cannot overflow.  */
> > !   if (TREE_CODE (tem) == INTEGER_CST
> > !   && ! wi::eq_p (tem, wi::min_value (TYPE_PRECISION (utype), SIGNED)))
> > ! return fold_build2_loc (loc, MULT_EXPR, type,
> > !   fold_convert (type, tem), same);
> > ! 
> > !   return fold_convert_loc (loc, type,
> > !  fold_build2_loc (loc, MULT_EXPR, utype, tem,
> > !
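
The hazard that the quoted fold-const.c change guards against can be illustrated outside the compiler.  This is a sketch with hypothetical helper names, not the folding code itself: with a == 0, the expression a*b + a*c is fine, but the factored form a*(b + c) would overflow in the signed sum, so the factored arithmetic is carried out in an unsigned type where wraparound is well defined.

```cpp
#include <cassert>
#include <climits>

// a*b + a*c evaluated as written; the caller guarantees no overflow here.
inline int
unfactored (int a, int b, int c)
{
  return a * b + a * c;
}

// a*(b + c) with the arithmetic done in unsigned.  Converting back to int
// recovers the unfactored value whenever that form did not overflow
// (the conversion of out-of-range values is modular on GCC and in C++20).
inline int
factored_unsigned (int a, int b, int c)
{
  unsigned sum = static_cast<unsigned> (b) + static_cast<unsigned> (c);
  unsigned prod = static_cast<unsigned> (a) * sum;
  return static_cast<int> (prod);
}

// With a == 0 the signed sum b + c would overflow; the unsigned form
// still agrees with the unfactored expression.
inline int
demo_zero_factor ()
{
  int r = factored_unsigned (0, INT_MAX, 1);
  assert (r == unfactored (0, INT_MAX, 1));
  return r;
}
```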
