date:20190510


Re: Implement numeric_limits<__float128>.

2019-05-10 Thread Marc Glisse


On Fri, 10 May 2019, Ed Smith-Rowland via libstdc++ wrote:

I know people are mostly looking at release branch work but I'd like to post 
this.  Other projects like mppp and boost use our __float128 with C++.  I use 
it for specfun testing and various other projects. I'd like to offer a series 
of patches to enable this support straight from libstdc++.  This is the first 
patch. Next will be  and then  a bit later.


It's pretty straightforward but others might have tips on configuration (and 
anything else ;-)).  Built and tested on x86_64-linux.


+#  define FLT128_MAX 1.18973149535723176508575932662800702e4932Q
+#  define FLT128_MIN 3.36210314311209350626267781732175260e-4932Q
etc

do such names really belong in ?

I see that the preprocessor has macros
#define __FLT128_MAX__ 1.18973149535723176508575932662800702e+4932F128
#define __FLT128_MIN__ 3.36210314311209350626267781732175260e-4932F128

for type _Float128 but the C++ compiler does not know that type or 
operator""F128. It seems broken that the compiler defines the macro for 
something it does not support. Having FLT128_MIN and __FLT128_MIN__ refer 
to different things seems like a bad idea. If you define 
__STDC_WANT_IEC_60559_TYPES_EXT__ and include  you already get 
FLT128_MIN as _Float128 and if you include  you already get 
FLT128_MIN as __float128 (and if you do both in this order you get a 
redefinition warning with -Wsystem-headers).


--
Marc Glisse
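
To check which of the compiler's own __FLT128_*__ macros are visible in a given C++ translation unit, a small probe along these lines may help. It is only a sketch: it touches the integer-valued macro and merely tests for the presence of the value-like one, because the latter expands to a literal with an F128 suffix that the C++ front end may reject. The helper names are made up for illustration.

```cpp
#include <cassert>

// Hypothetical helper: returns the mantissa width the preprocessor
// advertises for _Float128, or -1 when the macro is not predefined
// for this compiler/target.
constexpr int probe_flt128_mant_dig()
{
#ifdef __FLT128_MANT_DIG__
  return __FLT128_MANT_DIG__;
#else
  return -1;
#endif
}

// Hypothetical helper: reports whether __FLT128_MIN__ is predefined,
// without expanding it (its F128-suffixed literal might not parse as C++).
constexpr bool has_flt128_min_macro()
{
#ifdef __FLT128_MIN__
  return true;
#else
  return false;
#endif
}
```

Calling probe_flt128_mant_dig() is expected to give 113 wherever the macro exists, since _Float128 is the IEEE binary128 format.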

Implement numeric_limits<__float128>.

2019-05-10 Thread Ed Smith-Rowland via gcc-patches


Greetings,

I know people are mostly looking at release branch work but I'd like to 
post this.  Other projects like mppp and boost use our __float128 with 
C++.  I use it for specfun testing and various other projects. I'd like 
to offer a series of patches to enable this support straight from 
libstdc++.  This is the first patch. Next will be  and then 
 a bit later.


It's pretty straightforward but others might have tips on configuration 
(and anything else ;-)).  Built and tested on x86_64-linux.


Ok?

Ed


2019-05-11  Ed Smith-Rowland  <3dw...@verizon.net>

Implement numeric_limits<__float128>.
* include/std/limits: Copy limit macros from quadmath.h;
(__glibcxx_float128_has_denorm_loss, __glibcxx_float128_traps,
__glibcxx_float128_tinyness_before): New macros (set to false);
(numeric_limits<__float128>): New specialization.
* : Add __float128 test guarded by _GLIBCXX_USE_FLOAT128.
* testsuite/18_support/numeric_limits/denorm_min.cc: Add __float128
test guarded by _GLIBCXX_USE_FLOAT128.
* testsuite/18_support/numeric_limits/dr559.cc: Same.
* testsuite/18_support/numeric_limits/epsilon.cc: Same.
* testsuite/18_support/numeric_limits/infinity.cc: Same.
* testsuite/18_support/numeric_limits/is_iec559.cc: Same.
* testsuite/18_support/numeric_limits/lowest.cc: Same.
* testsuite/18_support/numeric_limits/max_digits10.cc: Same.
* testsuite/18_support/numeric_limits/min_max.cc: Same.
* testsuite/18_support/numeric_limits/quiet_NaN.cc: Same.

Index: include/std/limits
===
--- include/std/limits  (revision 271076)
+++ include/std/limits  (working copy)
@@ -41,6 +41,19 @@
 
 #include 
 
+#if defined(_GLIBCXX_USE_FLOAT128) && !defined(__STRICT_ANSI__)
+#  define FLT128_MAX 1.18973149535723176508575932662800702e4932Q
+#  define FLT128_MIN 3.36210314311209350626267781732175260e-4932Q
+#  define FLT128_EPSILON 1.92592994438723585305597794258492732e-34Q
+#  define FLT128_DENORM_MIN 6.475175119438025110924438958227646552e-4966Q
+#  define FLT128_MANT_DIG 113
+#  define FLT128_MIN_EXP (-16381)
+#  define FLT128_MAX_EXP 16384
+#  define FLT128_DIG 33
+#  define FLT128_MIN_10_EXP (-4931)
+#  define FLT128_MAX_10_EXP 4932
+#endif
+
 //
 // The numeric_limits<> traits document implementation-defined aspects
 // of fundamental arithmetic data types (integers and floating points).
@@ -123,6 +136,24 @@
 #  define __glibcxx_long_double_tinyness_before false
 #endif
 
+#if defined(_GLIBCXX_USE_FLOAT128) && !defined(__STRICT_ANSI__)
+
+// __float128
+
+// Default values.  Should be overridden in configuration files if necessary.
+
+#  ifndef __glibcxx_float128_has_denorm_loss
+#define __glibcxx_float128_has_denorm_loss false
+#  endif
+#  ifndef __glibcxx_float128_traps
+#define __glibcxx_float128_traps false
+#  endif
+#  ifndef __glibcxx_float128_tinyness_before
+#define __glibcxx_float128_tinyness_before false
+#  endif
+
+#  endif // _GLIBCXX_USE_FLOAT128
+
 // You should not need to define any macros below this point.
 
 #define __glibcxx_signed_b(T,B)((T)(-1) < 0)
@@ -1880,6 +1911,85 @@
 #undef __glibcxx_long_double_traps
 #undef __glibcxx_long_double_tinyness_before
 
+#if defined(_GLIBCXX_USE_FLOAT128) && !defined(__STRICT_ANSI__)
+
+  /// numeric_limits<__float128> specialization.
+  template<>
+struct numeric_limits<__float128>
+{
+  static _GLIBCXX_USE_CONSTEXPR bool is_specialized = true;
+
+  static _GLIBCXX_CONSTEXPR __float128
+  min() _GLIBCXX_USE_NOEXCEPT { return FLT128_MIN; }
+
+  static _GLIBCXX_CONSTEXPR __float128
+  max() _GLIBCXX_USE_NOEXCEPT { return FLT128_MAX; }
+
+#if __cplusplus >= 201103L
+  static _GLIBCXX_CONSTEXPR __float128
+  lowest() _GLIBCXX_USE_NOEXCEPT { return -FLT128_MAX; }
+#endif
+
+  static _GLIBCXX_USE_CONSTEXPR int digits = FLT128_MANT_DIG;
+  static _GLIBCXX_USE_CONSTEXPR int digits10 = FLT128_DIG;
+#if __cplusplus >= 201103L
+  static _GLIBCXX_USE_CONSTEXPR int max_digits10
+= __glibcxx_max_digits10 (FLT128_MANT_DIG);
+#endif
+  static _GLIBCXX_USE_CONSTEXPR bool is_signed = true;
+  static _GLIBCXX_USE_CONSTEXPR bool is_integer = false;
+  static _GLIBCXX_USE_CONSTEXPR bool is_exact = false;
+  static _GLIBCXX_USE_CONSTEXPR int radix = __FLT_RADIX__;
+
+  static _GLIBCXX_CONSTEXPR __float128
+  epsilon() _GLIBCXX_USE_NOEXCEPT { return FLT128_EPSILON; }
+
+  static _GLIBCXX_CONSTEXPR __float128
+  round_error() _GLIBCXX_USE_NOEXCEPT { return 0.5Q; }
+
+  static _GLIBCXX_USE_CONSTEXPR int min_exponent = FLT128_MIN_EXP;
+  static _GLIBCXX_USE_CONSTEXPR int min_exponent10 = FLT128_MIN_10_EXP;
+  static _GLIBCXX_USE_CONSTEXPR int max_exponent = FLT128_MAX_EXP;
+  static _GLIBCXX_USE_CONSTEXPR int max_exponent10 = FLT128_MAX_10_EXP;
+
+  static
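
As a rough illustration of what such a specialization provides, here is a user-side sketch that derives epsilon and round_error for a 113-bit significand without using Q-suffixed literals (which need GNU extensions enabled). It is not the code from the patch: quad_t, pow2_neg and quad_limits are hypothetical names, and the sketch falls back to long double when the compiler does not advertise __float128 through __SIZEOF_FLOAT128__.

```cpp
#include <cassert>
#include <limits>

#if defined(__SIZEOF_FLOAT128__)
typedef __float128 quad_t;   // GNU extension type
#else
typedef long double quad_t;  // fallback so the sketch still builds
#endif

// 2^-n by repeated halving.  Written recursively so that it is a valid
// C++11 constexpr function; the depth stays far below compiler limits.
constexpr quad_t pow2_neg(int n)
{ return n == 0 ? quad_t(1) : pow2_neg(n - 1) / 2; }

// Hypothetical stand-in, NOT std::numeric_limits<__float128>.
struct quad_limits
{
#if defined(__SIZEOF_FLOAT128__)
  static constexpr int digits = 113;  // same value as FLT128_MANT_DIG
#else
  static constexpr int digits = std::numeric_limits<long double>::digits;
#endif

  // Gap between 1 and the next representable value: 2^(1 - digits).
  static constexpr quad_t epsilon() noexcept
  { return pow2_neg(digits - 1); }

  // Maximum rounding error in units of the last place: one half.
  static constexpr quad_t round_error() noexcept
  { return pow2_neg(1); }
};
```

With digits equal to 113 the epsilon computed here is 2^-112, which is the value spelled out as FLT128_EPSILON in the patch.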

libgo patch committed: Set up g early

2019-05-10 Thread Ian Lance Taylor

This libgo patch by Cherry Zhang changes the runtime to set up g
early.  runtime.throw needs a g to work properly.  Set up g early, to
ensure that if something goes wrong in the runtime startup (e.g.
runtime.check fails), the program terminates in a reasonable way.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 271074)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-76ab85364745e445498fe53f9ca8e37b49650779
+5c2c4743980556c041561533ef31762f524737ca
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/proc.go
===
--- libgo/go/runtime/proc.go(revision 270877)
+++ libgo/go/runtime/proc.go(working copy)
@@ -18,6 +18,7 @@ import (
 //go:linkname acquirep runtime.acquirep
 //go:linkname releasep runtime.releasep
 //go:linkname incidlelocked runtime.incidlelocked
+//go:linkname ginit runtime.ginit
 //go:linkname schedinit runtime.schedinit
 //go:linkname ready runtime.ready
 //go:linkname stopm runtime.stopm
@@ -515,6 +516,15 @@ func cpuinit() {
cpu.Initialize(env)
 }
 
+func ginit() {
+   _m_ := 
+   _g_ := 
+   _m_.g0 = _g_
+   _m_.curg = _g_
+   _g_.m = _m_
+   setg(_g_)
+}
+
 // The bootstrap sequence is:
 //
 // call osinit
@@ -524,13 +534,7 @@ func cpuinit() {
 //
 // The new G calls runtime·main.
 func schedinit() {
-   _m_ := 
-   _g_ := 
-   _m_.g0 = _g_
-   _m_.curg = _g_
-   _g_.m = _m_
-   setg(_g_)
-
+   _g_ := getg()
sched.maxmcount = 1
 
usestackmaps = probestackmaps()
Index: libgo/runtime/go-libmain.c
===
--- libgo/runtime/go-libmain.c  (revision 270877)
+++ libgo/runtime/go-libmain.c  (working copy)
@@ -225,6 +225,7 @@ gostart (void *arg)
 return NULL;
   runtime_isstarted = true;
 
+  runtime_ginit ();
   runtime_check ();
   runtime_args (a->argc, (byte **) a->argv);
   setncpu (getproccount ());
Index: libgo/runtime/go-main.c
===
--- libgo/runtime/go-main.c (revision 270877)
+++ libgo/runtime/go-main.c (working copy)
@@ -48,6 +48,7 @@ main (int argc, char **argv)
 setIsCgo ();
 
   __go_end = (uintptr)_end;
+  runtime_ginit ();
   runtime_cpuinit ();
   runtime_check ();
   runtime_args (argc, (byte **) argv);
Index: libgo/runtime/runtime.h
===
--- libgo/runtime/runtime.h (revision 270877)
+++ libgo/runtime/runtime.h (working copy)
@@ -240,6 +240,8 @@ int32   runtime_snprintf(byte*, int32, con
 #define runtime_memmove(a, b, s) __builtin_memmove((a), (b), (s))
 String runtime_gostringnocopy(const byte*)
   __asm__ (GOSYM_PREFIX "runtime.gostringnocopy");
+void   runtime_ginit(void)
+  __asm__ (GOSYM_PREFIX "runtime.ginit");
 void   runtime_schedinit(void)
   __asm__ (GOSYM_PREFIX "runtime.schedinit");
 void   runtime_initsig(bool)
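
The ordering argument (set up g before anything that may need to report a failure) can be pictured with a small C++ analogy. This is not libgo code: the struct layout, the sketch namespace, boot_m, boot_g and can_report_failure are all invented for illustration, and only the link-up performed in ginit follows the shape of the patch.

```cpp
#include <cassert>

struct M;
struct G { M* m = nullptr; };
struct M { G* g0 = nullptr; G* curg = nullptr; };

namespace sketch
{
  M boot_m;                             // hypothetical bootstrap objects
  G boot_g;
  thread_local G* current_g = nullptr;  // what getg()/setg() operate on

  G* getg() { return current_g; }
  void setg(G* g) { current_g = g; }

  // Mirrors the shape of ginit in the patch: link m and g, then publish g.
  void ginit()
  {
    M* m = &boot_m;
    G* g = &boot_g;
    m->g0 = g;
    m->curg = g;
    g->m = m;
    setg(g);
  }

  // Stand-in for a failure path that needs a current g to work properly.
  bool can_report_failure()
  { return getg() != nullptr && getg()->m != nullptr; }
}
```

Before sketch::ginit() runs, can_report_failure() is false; calling ginit first, as the patched startup code does with runtime_ginit, makes it true for every later step.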

[C++ PATCH] Fix up C++ loop construct debug info without -gno-statement-frontiers (PR debug/90197, take 2)

2019-05-10 Thread Jakub Jelinek

On Fri, May 10, 2019 at 04:26:47PM -0400, Jason Merrill wrote:
> That is strange.  That seems to go back to
> 
> Surely we should only set the incr location if it doesn't already have one,
> as would have been the case before that patch.

So, like this then?  Bootstrapped/regtested again on x86_64-linux and
i686-linux:

2019-05-11  Jakub Jelinek  

PR debug/90197
* cp-gimplify.c (genericize_cp_loop): Emit a DEBUG_BEGIN_STMT
before the condition (or if missing or constant non-zero at the end
of the loop).  Emit a DEBUG_BEGIN_STMT before the increment expression
if any.  Don't call protected_set_expr_location on incr if it already
has a location.

--- gcc/cp/cp-gimplify.c.jj 2019-05-10 23:20:35.812735806 +0200
+++ gcc/cp/cp-gimplify.c2019-05-10 23:23:20.712991042 +0200
@@ -241,8 +241,10 @@ genericize_cp_loop (tree *stmt_p, locati
   tree blab, clab;
   tree exit = NULL;
   tree stmt_list = NULL;
+  tree debug_begin = NULL;
 
-  protected_set_expr_location (incr, start_locus);
+  if (EXPR_LOCATION (incr) == UNKNOWN_LOCATION)
+protected_set_expr_location (incr, start_locus);
 
   cp_walk_tree (, cp_genericize_r, data, NULL);
   cp_walk_tree (, cp_genericize_r, data, NULL);
@@ -253,6 +255,13 @@ genericize_cp_loop (tree *stmt_p, locati
   cp_walk_tree (, cp_genericize_r, data, NULL);
   *walk_subtrees = 0;
 
+  if (MAY_HAVE_DEBUG_MARKER_STMTS
+  && (!cond || !integer_zerop (cond)))
+{
+  debug_begin = build0 (DEBUG_BEGIN_STMT, void_type_node);
+  SET_EXPR_LOCATION (debug_begin, cp_expr_loc_or_loc (cond, start_locus));
+}
+
   if (cond && TREE_CODE (cond) != INTEGER_CST)
 {
   /* If COND is constant, don't bother building an exit.  If it's false,
@@ -265,10 +274,24 @@ genericize_cp_loop (tree *stmt_p, locati
 }
 
   if (exit && cond_is_first)
-append_to_statement_list (exit, _list);
+{
+  append_to_statement_list (debug_begin, _list);
+  debug_begin = NULL_TREE;
+  append_to_statement_list (exit, _list);
+}
   append_to_statement_list (body, _list);
   finish_bc_block (_list, bc_continue, clab);
-  append_to_statement_list (incr, _list);
+  if (incr)
+{
+  if (MAY_HAVE_DEBUG_MARKER_STMTS)
+   {
+ tree d = build0 (DEBUG_BEGIN_STMT, void_type_node);
+ SET_EXPR_LOCATION (d, cp_expr_loc_or_loc (incr, start_locus));
+ append_to_statement_list (d, _list);
+   }
+  append_to_statement_list (incr, _list);
+}
+  append_to_statement_list (debug_begin, _list);
   if (exit && !cond_is_first)
 append_to_statement_list (exit, _list);
 


Jakub
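
For readers who want something concrete to step through in a debugger, the two loop shapes the ChangeLog distinguishes look roughly like this at the source level. The functions are made-up examples, and the comments only restate where the ChangeLog says a DEBUG_BEGIN_STMT goes; they are not a claim about the exact line table GCC ends up emitting.

```cpp
#include <cassert>

// Loop with a non-constant condition and an increment.
int sum_below(int n)
{
  int total = 0;
  for (int i = 0;  // init: runs once
       i < n;      // condition: a marker is emitted before it
       ++i)        // increment: a marker is emitted before it
    total += i;
  return total;
}

// Loop without a condition: per the ChangeLog the marker that would
// precede the condition is placed at the end of the loop instead.
// Requires step > 0 to terminate.
int first_multiple_at_least(int step, int floor)
{
  int v = 0;
  for (;; v += step)
    if (v >= floor)
      break;
  return v;
}
```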

C++ PATCH to improve diagnostic for non-type template parameters

2019-05-10 Thread Marek Polacek

When we have 

  template
  struct S { };

then in

   S s;

"int()" is resolved to a type-id, as per [temp.arg]/2, causing this program to
fail to compile.  This can be rather confusing so I think we want to improve the
diagnostic a bit. 

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-05-10  Marek Polacek  

* pt.c (convert_template_argument): Add a diagnostic for the
[temp.arg]/2 ambiguity case.

* g++.dg/cpp2a/nontype-class17.C: New test.

diff --git gcc/cp/pt.c gcc/cp/pt.c
index 08da94ae0c9..b38e65d7f7e 100644
--- gcc/cp/pt.c
+++ gcc/cp/pt.c
@@ -7961,10 +7961,22 @@ convert_template_argument (tree parm,
 "parameter list for %qD",
 i + 1, in_decl);
  if (is_type)
-   inform (input_location,
-   "  expected a constant of type %qT, got %qT",
-   TREE_TYPE (parm),
-   (DECL_P (arg) ? DECL_NAME (arg) : orig_arg));
+   {
+ /* The template argument is a type, but we're expecting
+an expression.  */
+ inform (input_location,
+ "  expected a constant of type %qT, got %qT",
+ TREE_TYPE (parm),
+ (DECL_P (arg) ? DECL_NAME (arg) : orig_arg));
+ /* [temp.arg]/2: "In a template-argument, an ambiguity
+between a type-id and an expression is resolved to a
+type-id, regardless of the form of the corresponding
+template-parameter."  So give the user a clue.  */
+ if (TREE_CODE (arg) == FUNCTION_TYPE)
+   inform (input_location, "  template argument for "
+   "non-type template parameter is treated as "
+   "function type");
+   }
  else if (requires_tmpl_type)
inform (input_location,
"  expected a class template, got %qE", orig_arg);
diff --git gcc/testsuite/g++.dg/cpp2a/nontype-class17.C 
gcc/testsuite/g++.dg/cpp2a/nontype-class17.C
new file mode 100644
index 000..ca5f68e1611
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp2a/nontype-class17.C
@@ -0,0 +1,17 @@
+// { dg-do compile { target c++2a } }
+
+template
+struct S { };
+
+struct R { };
+
+void
+g (void)
+{
+  S s; // { dg-error "mismatch" }
+// { dg-message "treated as function" "note" { target *-*-* } .-1 }
+  S s2;
+  S s3; // { dg-error "mismatch" }
+// { dg-message "treated as function" "note" { target *-*-* } .-1 }
+  S s4;
+}
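
A self-contained way to see the [temp.arg]/2 rule in action, independent of the new diagnostic, is sketched here. The template parameter lists are assumptions chosen for the illustration (a plain int non-type parameter and a separate type parameter); they are not copied from the test above.

```cpp
#include <cassert>
#include <type_traits>

// Assumed for illustration: a non-type parameter of type int.
template<int N>
struct S { static constexpr int value = N; };

// A type parameter, used to show what int() names as a template argument.
template<typename T>
struct Probe
{ static constexpr bool is_function = std::is_function<T>::value; };

// In a template-argument, int() is a type-id: "function taking no
// arguments and returning int".
static_assert(Probe<int()>::is_function,
	      "int() is parsed as a function type");

// int{} can only be an expression, so it is usable as a non-type argument.
static_assert(S<int{}>::value == 0, "int{} is a value-initialized int");

// S<int()> s;  // rejected: a type where a constant of type int is expected
```

Writing the argument with braces is therefore the simple way out when a value-initialized constant is what was meant.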

Re: [PATCH] PR libstdc++/90388 fix std::hash> bugs

2019-05-10 Thread Ville Voutilainen

On Sat, 11 May 2019 at 00:55, Jonathan Wakely  wrote:
>
> On 11/05/19 00:51 +0300, Ville Voutilainen wrote:
> >On Sat, 11 May 2019 at 00:42, Jonathan Wakely  wrote:
> >>
> >> A disabled specialization should not be callable, so move the function
> >> call operator into a new base class which correctly implements the
> >> disabled hash semantics. For the versioned namespace configuration do
> >> not derive from __poison_hash in the enabled case, as the empty base
> >> class serves no purpose but potentially increases the object size. For
> >> the default configuration that base class must be kept, to preserve
> >> layout.
> >
> >I continue to not be a fan of the versioned namespace ifdeffery in
> >this, but I can live with it.
>
> The versioned namespace configuration should be as good as we can make
> it, with no limitations due to ABI compatibility. Removing redundant
> base classes allows more compact class layouts. With current trunk
> this type has size 3, in the versioned namespace after my patch it has
> size 2 (and if we patched hash> it would have size 1):

I understand all of that, but I do question the value of optimizing
the versioned namespace configuration at the cost
of sprinkling such ifdefs into our code.

Re: [PATCH] PR libstdc++/90388 fix std::hash> bugs

2019-05-10 Thread Jonathan Wakely


On 11/05/19 00:51 +0300, Ville Voutilainen wrote:

On Sat, 11 May 2019 at 00:42, Jonathan Wakely  wrote:


A disabled specialization should not be callable, so move the function
call operator into a new base class which correctly implements the
disabled hash semantics. For the versioned namespace configuration do
not derive from __poison_hash in the enabled case, as the empty base
class serves no purpose but potentially increases the object size. For
the default configuration that base class must be kept, to preserve
layout.


I continue to not be a fan of the versioned namespace ifdeffery in
this, but I can live with it.


The versioned namespace configuration should be as good as we can make
it, with no limitations due to ABI compatibility. Removing redundant
base classes allows more compact class layouts. With current trunk
this type has size 3, in the versioned namespace after my patch it has
size 2 (and if we patched hash> it would have size 1):

template
struct HashPtr
: std::hash,
 std::hash>,
 std::hash>,
 std::hash>
{
};
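
The layout effect can be reproduced without any library types. In this sketch (all names invented) three empty classes share a common empty base, so the derived object contains three base subobjects of the same type, which must live at distinct addresses. Dropping the shared base lets every empty base sit at offset zero.

```cpp
#include <cassert>
#include <cstddef>

struct Poison { };            // stands in for a shared empty base class

struct H1 : Poison { };
struct H2 : Poison { };
struct H3 : Poison { };

// Three Poison subobjects of the same type cannot overlap.
struct WithSharedBase : H1, H2, H3 { };

struct P1 { };
struct P2 { };
struct P3 { };

// Distinct empty base types may all share the same address.
struct WithDistinctBases : P1, P2, P3 { };

// Hypothetical helpers, only there to make the sizes easy to inspect.
constexpr std::size_t shared_size() { return sizeof(WithSharedBase); }
constexpr std::size_t distinct_size() { return sizeof(WithDistinctBases); }
```

With the Itanium C++ ABI used by GCC one would typically expect sizes 3 and 1 here, but the only property this sketch relies on is that the first is larger than the second.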

Re: 30_threads/thread/native_handle/typesizes.cc is no good

2019-05-10 Thread Jonathan Wakely


On 10/05/19 19:57 +0100, Iain Sandoe wrote:

Hi Jonathan


On 10 May 2019, at 15:20, Iain Sandoe  wrote:


On 10 May 2019, at 14:57, Jonathan Wakely  wrote:

Resending as plain text so the lists don't reject it …



In order to test what it should, we'd need to use an alternate test
function that does not strip off one indirection level from
native_handle_type, if the test is to remain.



Or just adapt the current test to work for the std::thread case too, by
only removing the pointer when we know we need to remove it, as in the
attached patch. Does this work on targets using a pointer type for
pthread_t?


this will fix PR81266, if so, will add to my next run.


The attached minor update to the posted patch does this.
cheers


Sorry for not testing the patch!

The attached version has been committed to trunk.

commit 6c6a062b248be24fd498b5ba184a130904320f11
Author: Jonathan Wakely 
Date:   Fri May 10 22:21:40 2019 +0100

PR libstdc++/81266 fix std::thread::native_handle_type test

The test uses remove_pointer because in most cases native_handle_type is
a pointer to the actual type that the C++ class contains. However, for
std::thread, native_handle_type is the same type as the type contained
in std::thread, and so remove_pointer is not needed. On targets where
pthread_t is a pointer type remove_pointer is not a
no-op, instead it transforms pthread_t and causes the test to fail.

The fix is to not apply remove_pointer when testing std::thread.

PR libstdc++/81266
* testsuite/util/thread/all.h: Do not use remove_pointer for
std::thread::native_handle_type.

diff --git a/libstdc++-v3/testsuite/util/thread/all.h b/libstdc++-v3/testsuite/util/thread/all.h
index e5794fa4a97..2aacae4f2fc 100644
--- a/libstdc++-v3/testsuite/util/thread/all.h
+++ b/libstdc++-v3/testsuite/util/thread/all.h
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 // C++11 only.
 namespace __gnu_test
@@ -39,7 +40,12 @@ namespace __gnu_test
 
   // Remove possible pointer type.
   typedef typename test_type::native_handle_type native_handle;
-  typedef typename std::remove_pointer::type native_type;
+  // For std::thread native_handle_type is the type of its data member,
+  // for other types it's a pointer to the type of the data member.
+  typedef typename std::conditional<
+	std::is_same::value,
+	native_handle,
+	typename std::remove_pointer::type>::type native_type;
 
   int st = sizeof(test_type);
   int snt = sizeof(native_type);
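
The idea of the fix can be tried in isolation with an alias template like the following. It is a sketch, not the testsuite code: native_type_of is a hypothetical name, and the arguments to std::is_same are my reading of the intent described in the commit message.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <type_traits>

// Hypothetical helper mirroring the shape of the typedef in the patch.
// For std::thread the handle type is used as-is; for any other type one
// level of pointer is stripped from native_handle_type.
template<typename T>
  using native_type_of = typename std::conditional<
    std::is_same<T, std::thread>::value,
    typename T::native_handle_type,
    typename std::remove_pointer<typename T::native_handle_type>::type
  >::type;

// std::thread keeps its handle type untouched, even where pthread_t
// happens to be a pointer type.
static_assert(std::is_same<native_type_of<std::thread>,
			   std::thread::native_handle_type>::value,
	      "no remove_pointer for std::thread");
```

For std::mutex the alias still strips the pointer, which is what the original test relied on.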

Re: [PATCH] PR libstdc++/90388 fix std::hash> bugs

2019-05-10 Thread Ville Voutilainen

On Sat, 11 May 2019 at 00:42, Jonathan Wakely  wrote:
>
> A disabled specialization should not be callable, so move the function
> call operator into a new base class which correctly implements the
> disabled hash semantics. For the versioned namespace configuration do
> not derive from __poison_hash in the enabled case, as the empty base
> class serves no purpose but potentially increases the object size. For
> the default configuration that base class must be kept, to preserve
> layout.

I continue to not be a fan of the versioned namespace ifdeffery in
this, but I can live with it.

[PATCH] Improve API docs for and

2019-05-10 Thread Jonathan Wakely


More Doxygenation.

Tested powerpc64le-linux. Committed to trunk.


commit 5ff3d9a181fcd565a1a54b7c8bc5016cb8d71bb4
Author: Jonathan Wakely 
Date:   Wed May 8 00:13:39 2019 +0100

Improve API docs for  and 

* include/bits/shared_ptr.h: Improve docs.
* include/bits/shared_ptr_base.h: Likewise.
* include/bits/stl_uninitialized.h: Likewise.
* include/bits/unique_ptr.h: Likewise.
* libsupc++/new: Likewise.

diff --git a/libstdc++-v3/include/bits/shared_ptr.h b/libstdc++-v3/include/bits/shared_ptr.h
index a38c1988973..8f219e73d60 100644
--- a/libstdc++-v3/include/bits/shared_ptr.h
+++ b/libstdc++-v3/include/bits/shared_ptr.h
@@ -60,7 +60,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @{
*/
 
-  /// 20.7.2.2.11 shared_ptr I/O
+  // 20.7.2.2.11 shared_ptr I/O
+
+  /// Write the stored pointer to an ostream.
+  /// @relates shared_ptr
   template
 inline std::basic_ostream<_Ch, _Tr>&
 operator<<(std::basic_ostream<_Ch, _Tr>& __os,
@@ -82,6 +85,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   /// 20.7.2.2.10 shared_ptr get_deleter
+
+  /// If `__p` has a deleter of type `_Del`, return a pointer to it.
   /// @relates shared_ptr
   template
 inline _Del*
@@ -106,6 +111,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* A `shared_ptr` also stores another pointer, which is usually
* (but not always) the same pointer as it owns. The stored pointer
* can be retrieved by calling the `get()` member function.
+   *
+   * The equality and relational operators for `shared_ptr` only compare
+   * the stored pointer returned by `get()`, not the owned pointer.
+   * To test whether two `shared_ptr` objects share ownership of the same
+   * pointer see `std::shared_ptr::owner_before` and `std::owner_less`.
   */
   template
 class shared_ptr : public __shared_ptr<_Tp>
@@ -122,10 +132,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 public:
 
+  /// The type pointed to by the stored pointer, remove_extent_t<_Tp>
   using element_type = typename __shared_ptr<_Tp>::element_type;
 
-#if __cplusplus > 201402L
+#if __cplusplus >= 201703L
 # define __cpp_lib_shared_ptr_weak_type 201606
+  /// The corresponding weak_ptr type for this shared_ptr
   using weak_type = weak_ptr<_Tp>;
 #endif
   /**
@@ -134,7 +146,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   constexpr shared_ptr() noexcept : __shared_ptr<_Tp>() { }
 
-  shared_ptr(const shared_ptr&) noexcept = default;
+  shared_ptr(const shared_ptr&) noexcept = default; ///< Copy constructor
 
   /**
*  @brief  Construct a %shared_ptr that owns the pointer @a __p.
@@ -378,8 +390,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   friend class weak_ptr<_Tp>;
 };
 
-  /// @relates shared_ptr @{
-
 #if __cpp_deduction_guides >= 201606
   template
 shared_ptr(weak_ptr<_Tp>) ->  shared_ptr<_Tp>;
@@ -388,36 +398,46 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   // 20.7.2.2.7 shared_ptr comparisons
+
+  /// @relates shared_ptr @{
+
+  /// Equality operator for shared_ptr objects, compares the stored pointers
   template
 _GLIBCXX_NODISCARD inline bool
 operator==(const shared_ptr<_Tp>& __a, const shared_ptr<_Up>& __b) noexcept
 { return __a.get() == __b.get(); }
 
+  /// shared_ptr comparison with nullptr
   template
 _GLIBCXX_NODISCARD inline bool
 operator==(const shared_ptr<_Tp>& __a, nullptr_t) noexcept
 { return !__a; }
 
+  /// shared_ptr comparison with nullptr
   template
 _GLIBCXX_NODISCARD inline bool
 operator==(nullptr_t, const shared_ptr<_Tp>& __a) noexcept
 { return !__a; }
 
+  /// Inequality operator for shared_ptr objects, compares the stored pointers
   template
 _GLIBCXX_NODISCARD inline bool
 operator!=(const shared_ptr<_Tp>& __a, const shared_ptr<_Up>& __b) noexcept
 { return __a.get() != __b.get(); }
 
+  /// shared_ptr comparison with nullptr
   template
 _GLIBCXX_NODISCARD inline bool
 operator!=(const shared_ptr<_Tp>& __a, nullptr_t) noexcept
 { return (bool)__a; }
 
+  /// shared_ptr comparison with nullptr
   template
 _GLIBCXX_NODISCARD inline bool
 operator!=(nullptr_t, const shared_ptr<_Tp>& __a) noexcept
 { return (bool)__a; }
 
+  /// Relational operator for shared_ptr objects, compares the stored pointers
   template
 _GLIBCXX_NODISCARD inline bool
 operator<(const shared_ptr<_Tp>& __a, const shared_ptr<_Up>& __b) noexcept
@@ -428,6 +448,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return less<_Vp>()(__a.get(), __b.get());
 }
 
+  /// shared_ptr comparison with nullptr
   template
 _GLIBCXX_NODISCARD inline bool
 operator<(const shared_ptr<_Tp>& __a, nullptr_t) noexcept
@@ -436,6 +457,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return less<_Tp_elt*>()(__a.get(), nullptr);
 }
 
+  /// shared_ptr comparison with nullptr
   template
 _GLIBCXX_NODISCARD inline bool
 operator<(nullptr_t, const shared_ptr<_Tp>& __a) noexcept
@@
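
The distinction the new documentation draws between the stored pointer and the owned pointer is easiest to see with the aliasing constructor. In this sketch, Pair and shares_ownership are made-up names; owner_before and the aliasing constructor are standard std::shared_ptr members.

```cpp
#include <cassert>
#include <memory>

struct Pair { int first; int second; };

// Hypothetical helper: two shared_ptr objects share ownership exactly
// when neither one's owner orders before the other's.
template<typename T, typename U>
  bool
  shares_ownership(const std::shared_ptr<T>& a,
		   const std::shared_ptr<U>& b) noexcept
  { return !a.owner_before(b) && !b.owner_before(a); }

// Builds two pointers that own the same Pair but store the addresses of
// different members, then reports whether operator== calls them equal.
inline bool aliases_compare_equal()
{
  std::shared_ptr<Pair> whole = std::make_shared<Pair>(Pair{1, 2});
  std::shared_ptr<int> a(whole, &whole->first);   // aliasing constructor
  std::shared_ptr<int> b(whole, &whole->second);
  return a == b;  // compares get(), not ownership
}
```

So aliases_compare_equal() returns false even though both pointers keep the same object alive, which is the case shares_ownership is meant to detect.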

[PATCH] PR libstdc++/90397 fix std::variant friend declarations

2019-05-10 Thread Jonathan Wakely


Clang diagnoses the inconsistent noexcept-specifier on the friend
declaration of __get. Add it, and also on __get_storage.

PR libstdc++/90397
* include/std/variant (_Variant_storage::_M_storage())
(_Variant_storage::_M_reset()))
(_Variant_storage::_M_storage())): Add noexcept.
(__get_storage): Likewise.
(variant): Add noexcept to friend declarations for __get and
__get_storage.

Tested powerpc64le-linux (and with clang), committed to trunk.

Backport to gcc-9-branch to follow soon.

commit ca72894f26baa3702ba0b699564bd3e356a98dba
Author: Jonathan Wakely 
Date:   Fri May 10 22:17:41 2019 +0100

PR libstdc++/90397 fix std::variant friend declarations

Clang diagnoses the inconsistent noexcept-specifier on the friend
declaration of __get. Add it, and also on __get_storage.

PR libstdc++/90397
* include/std/variant (_Variant_storage::_M_storage())
(_Variant_storage::_M_reset()))
(_Variant_storage::_M_storage())): Add noexcept.
(__get_storage): Likewise.
(variant): Add noexcept to friend declarations for __get and
__get_storage.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 8c7d7f37fe2..d539df125bf 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -402,7 +402,7 @@ namespace __variant
   { _M_reset(); }
 
   void*
-  _M_storage() const
+  _M_storage() const noexcept
   {
return const_cast(static_cast(
std::addressof(_M_u)));
@@ -432,11 +432,11 @@ namespace __variant
_M_index(_Np)
{ }
 
-  void _M_reset()
+  void _M_reset() noexcept
   { _M_index = variant_npos; }
 
   void*
-  _M_storage() const
+  _M_storage() const noexcept
   {
return const_cast(static_cast(
std::addressof(_M_u)));
@@ -760,7 +760,7 @@ namespace __variant
 
   // Returns the raw storage for __v.
   template
-void* __get_storage(_Variant&& __v)
+void* __get_storage(_Variant&& __v) noexcept
 { return __v._M_storage(); }
 
   template 
@@ -1556,10 +1556,12 @@ namespace __variant
 #endif
 
   template
-   friend constexpr decltype(auto) __detail::__variant::__get(_Vp&& __v);
+   friend constexpr decltype(auto)
+   __detail::__variant::__get(_Vp&& __v) noexcept;
 
   template
-   friend void* __detail::__variant::__get_storage(_Vp&& __v);
+   friend void*
+   __detail::__variant::__get_storage(_Vp&& __v) noexcept;
 
 #define _VARIANT_RELATION_FUNCTION_TEMPLATE(__OP) \
   template \

[PATCH] PR libstdc++/90388 fix std::hash> bugs

2019-05-10 Thread Jonathan Wakely


A disabled specialization should not be callable, so move the function
call operator into a new base class which correctly implements the
disabled hash semantics. For the versioned namespace configuration do
not derive from __poison_hash in the enabled case, as the empty base
class serves no purpose but potentially increases the object size. For
the default configuration that base class must be kept, to preserve
layout.

An enabled specialization should not be unconditionally noexcept,
because the underlying hash object might throw.

PR libstdc++/90388
* include/bits/unique_ptr.h (default_delete, default_delete):
Use _Require for constraints.
(operator>(nullptr_t, const unique_ptr&)): Implement exactly as
per the standard.
(__uniq_ptr_hash): New base class with conditionally-disabled call
operator.
(hash>): Derive from __uniq_ptr_hash.
* testsuite/20_util/default_delete/48631_neg.cc: Adjust dg-error line.
* testsuite/20_util/unique_ptr/hash/90388.cc: New test.

Tested powerpc64le-linux, committed to trunk.


commit e8d13e93aacad3587d00b71281da953dc1ad2cfd
Author: Jonathan Wakely 
Date:   Fri May 10 22:06:39 2019 +0100

PR libstdc++/90388 fix std::hash> bugs

A disabled specialization should not be callable, so move the function
call operator into a new base class which correctly implements the
disabled hash semantics. For the versioned namespace configuration do
not derive from __poison_hash in the enabled case, as the empty base
class serves no purpose but potentially increases the object size. For
the default configuration that base class must be kept, to preserve
layout.

An enabled specialization should not be unconditionally noexcept,
because the underlying hash object might throw.

PR libstdc++/90388
* include/bits/unique_ptr.h (default_delete, default_delete):
Use _Require for constraints.
(operator>(nullptr_t, const unique_ptr&)): Implement exactly as
per the standard.
(__uniq_ptr_hash): New base class with conditionally-disabled call
operator.
(hash>): Derive from __uniq_ptr_hash.
* testsuite/20_util/default_delete/48631_neg.cc: Adjust dg-error 
line.
* testsuite/20_util/unique_ptr/hash/90388.cc: New test.

diff --git a/libstdc++-v3/include/bits/unique_ptr.h 
b/libstdc++-v3/include/bits/unique_ptr.h
index 6a23669f119..a9e74725dfd 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -66,8 +66,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* Allows conversion from a deleter for objects of another type, `_Up`,
* only if `_Up*` is convertible to `_Tp*`.
*/
-  template::value>::type>
+  template>>
 default_delete(const default_delete<_Up>&) noexcept { }
 
   /// Calls `delete __ptr`
@@ -102,19 +102,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* it is undefined to `delete[]` an array of derived types through a
* pointer to the base type.
*/
-  template::value>::type>
+  template>>
 default_delete(const default_delete<_Up[]>&) noexcept { }
 
   /// Calls `delete[] __ptr`
   template
-  typename enable_if::value>::type
+   typename enable_if::value>::type
operator()(_Up* __ptr) const
-  {
-   static_assert(sizeof(_Tp)>0,
- "can't delete pointer to incomplete type");
-   delete [] __ptr;
-  }
+   {
+ static_assert(sizeof(_Tp)>0,
+   "can't delete pointer to incomplete type");
+ delete [] __ptr;
+   }
 };
 
   /// @cond undocumented
@@ -712,7 +712,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 unique_ptr<_Tp, _Dp>&) = delete;
 #endif
 
-  /// Equality operator for unique_ptr objects, compares the owned pointers.
+  /// Equality operator for unique_ptr objects, compares the owned pointers
   template
 _GLIBCXX_NODISCARD inline bool
@@ -848,23 +848,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_NODISCARD inline bool
 operator>=(nullptr_t, const unique_ptr<_Tp, _Dp>& __x)
 { return !(nullptr < __x); }
+  // @} relates unique_ptr
+
+  /// @cond undocumented
+  template::__enable_hash_call>
+struct __uniq_ptr_hash
+#if ! _GLIBCXX_INLINE_VERSION
+: private __poison_hash<_Ptr>
+#endif
+{
+  size_t
+  operator()(const _Up& __u) const
+  noexcept(noexcept(std::declval>()(std::declval<_Ptr>(
+  { return hash<_Ptr>()(__u.get()); }
+};
+
+  template
+struct __uniq_ptr_hash<_Up, _Ptr, false>
+: private __poison_hash<_Ptr>
+{ };
+  /// @endcond
 
   /// std::hash specialization for unique_ptr.
   template
 struct hash>
 : public __hash_base>,
-private __poison_hash::pointer>
-{
-  size_t
-  operator()(const unique_ptr<_Tp, _Dp>& __u) const noexcept
-  {
-
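
The two points of the description (a disabled specialization has no call operator, and an enabled one is only conditionally noexcept) can be sketched outside the library as follows. Everything here is hypothetical: Handle, Fancy, handle_hash_base, handle_hash and can_call are illustration names, and the enabled/disabled decision is approximated with std::is_default_constructible on std::hash, which assumes a standard library that implements disabled hash specializations as non-default-constructible. libstdc++ itself uses an internal __poison_hash helper instead.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical owner type exposing its pointer through get().
template<typename Ptr>
struct Handle
{
  Ptr p;
  Ptr get() const { return p; }
};

struct Fancy { int* raw; };  // no std::hash<Fancy> specialization exists

// Enabled case: forward to std::hash<Ptr>, and be noexcept only if the
// underlying hash call is.
template<typename H, typename Ptr,
	 bool = std::is_default_constructible<std::hash<Ptr>>::value>
struct handle_hash_base
{
  std::size_t
  operator()(const H& h) const
  noexcept(noexcept(std::declval<std::hash<Ptr>>()(std::declval<Ptr>())))
  { return std::hash<Ptr>()(h.get()); }
};

// Disabled case: no call operator at all.
template<typename H, typename Ptr>
struct handle_hash_base<H, Ptr, false>
{ };

template<typename Ptr>
struct handle_hash : handle_hash_base<Handle<Ptr>, Ptr>
{ };

// Small detection helper: is F callable with Arg?
template<typename F, typename Arg, typename = void>
struct can_call : std::false_type { };

template<typename F, typename Arg>
struct can_call<F, Arg,
		decltype(void(std::declval<F&>()(std::declval<Arg>())))>
: std::true_type { };
```

With these definitions handle_hash<int*> is callable on a Handle<int*>, while handle_hash<Fancy> exists as a type but cannot be called.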

Re: [PATCH 1/2] or1k: Fix code quality for volatile memory loads

2019-05-10 Thread Stafford Horne

On Thu, May 09, 2019 at 07:44:15PM +0200, Bernhard Reutner-Fischer wrote:
> On 6 May 2019 15:16:20 CEST, Stafford Horne  wrote:
> >Volatile memory does not match the memory_operand predicate.  This
> >causes extra extend/mask instructions when reading
> >from volatile memory.  On OpenRISC loading volitile memory can be
> 
> s/volitile/volatile/g
> 
> also at least in the test.
> Thanks,

Thank you,

I always misspell that one.

-Stafford

> 
> >diff --git a/gcc/testsuite/gcc.target/or1k/swap-2.c
> >b/gcc/testsuite/gcc.target/or1k/swap-2.c
> >new file mode 100644
> >index 000..8ddea4e659f
> >--- /dev/null
> >+++ b/gcc/testsuite/gcc.target/or1k/swap-2.c
> 
> >+/* Check to ensure the volitile load does not get zero extended.  */
>

Re: [PATCH, X86] Fix PR82920 (code part).

2019-05-10 Thread Uros Bizjak

On Fri, May 10, 2019 at 11:03 PM Iain Sandoe  wrote:
>
> Hi!
>
> PR82920 is about CET fails on Darwin.
>
> Initially, this was expected to be just a testsuite issue, however it turns 
> out that the indirection thunks code has inconsistent handling of the output 
> of labels.  Thus some of the output is missing the leading “_” on Darwin, 
> which breaks ABI and won’t link.
>
> Since most of the tests are scan-asms that check for what’s expected, they 
> currently pass on Darwin but will begin failing when the codegen is fixed.  
> Thus there is a larger, but mechanical, testsuite change needed to deal with 
> this.   I will post that if anyone’s interested, but otherwise will just 
> apply it once the codegen fix is agreed.
>
> The patch factors out some common code that writes out the jumps and uses the 
> regular output scheme that accounts for __USER_LABEL_PREFIX__.
>
> I will note in passing that there’s very little PIC test coverage for the 
> indirection thunks code, although Darwin is PIC-only for m64 - Linux has only 
> a few tests.
>
> OK for trunk?

OK for trunk if the patch doesn't regress x86_64-linux.

> Backports?

Also OK for backports after a couple of days of soaking in the trunk
without problems (so autotesters will test the patched compiler from
the trunk).

Thanks,
Uros.

> Iain
>
> gcc/
>
> * config/i386/i386.c (ix86_output_jmp_thunk_or_indirect): New.
> (ix86_output_indirect_branch_via_reg): Use output mechanism 
> accounting for
> __USER_LABEL_PREFIX__.
> (ix86_output_indirect_branch_via_push): Likewise.
> (ix86_output_function_return): Likewise.
> (ix86_output_indirect_function_return): Likewise.
>
> From 4da5837cd7bbe61b6d2687e552e3afb5bfdb2765 Mon Sep 17 00:00:00 2001
> From: Iain Sandoe 
> Date: Tue, 7 May 2019 07:27:19 -0400
> Subject: [PATCH] [Darwin] Fix PR82920 - code changes.
>
> Emit labels using machinery that includes the __USER_LABEL_PREFIX__
> ---
>  gcc/config/i386/i386.c | 48 --
>  1 file changed, 27 insertions(+), 21 deletions(-)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index c51d775b89..08aa9d9475 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -15130,6 +15130,20 @@ ix86_nopic_noplt_attribute_p (rtx call_op)
>return false;
>  }
>
> +/* Helper to output the jmp/call.  */
> +static void
> +ix86_output_jmp_thunk_or_indirect (const char *thunk_name, const int regno)
> +{
> +  if (thunk_name != NULL)
> +{
> +  fprintf (asm_out_file, "\tjmp\t");
> +  assemble_name (asm_out_file, thunk_name);
> +  putc ('\n', asm_out_file);
> +}
> +  else
> +output_indirect_thunk (regno);
> +}
> +
>  /* Output indirect branch via a call and return thunk.  CALL_OP is a
> register which contains the branch target.  XASM is the assembly
> template for CALL_OP.  Branch is a tail call if SIBCALL_P is true.
> @@ -15168,17 +15182,14 @@ ix86_output_indirect_branch_via_reg (rtx call_op, 
> bool sibcall_p)
>  thunk_name = NULL;
>
>if (sibcall_p)
> -{
> -  if (thunk_name != NULL)
> -   fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
> -  else
> -   output_indirect_thunk (regno);
> -}
> + ix86_output_jmp_thunk_or_indirect (thunk_name, regno);
>else
>  {
>if (thunk_name != NULL)
> {
> - fprintf (asm_out_file, "\tcall\t%s\n", thunk_name);
> + fprintf (asm_out_file, "\tcall\t");
> + assemble_name (asm_out_file, thunk_name);
> + putc ('\n', asm_out_file);
>   return;
> }
>
> @@ -15199,10 +15210,7 @@ ix86_output_indirect_branch_via_reg (rtx call_op, 
> bool sibcall_p)
>
>ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
>
> -  if (thunk_name != NULL)
> -   fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
> -  else
> -   output_indirect_thunk (regno);
> + ix86_output_jmp_thunk_or_indirect (thunk_name, regno);
>
>ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
>
> @@ -15259,10 +15267,7 @@ ix86_output_indirect_branch_via_push (rtx call_op, 
> const char *xasm,
>if (sibcall_p)
>  {
>output_asm_insn (push_buf, &call_op);
> -  if (thunk_name != NULL)
> -   fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
> -  else
> -   output_indirect_thunk (regno);
> +  ix86_output_jmp_thunk_or_indirect (thunk_name, regno);
>  }
>else
>  {
> @@ -15318,10 +15323,7 @@ ix86_output_indirect_branch_via_push (rtx call_op, 
> const char *xasm,
>
>output_asm_insn (push_buf, &call_op);
>
> -  if (thunk_name != NULL)
> -   fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
> -  else
> -   output_indirect_thunk (regno);
> +  ix86_output_jmp_thunk_or_indirect (thunk_name, regno);
>
>ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
>
> @@ -15420,7 +15422,9 @@ ix86_output_function_return (bool long_p)
>

[PATCH, X86] Fix PR82920 (code part).

2019-05-10 Thread Iain Sandoe

Hi!

PR82920 is about CET fails on Darwin.

Initially, this was expected to be just a testsuite issue, however it turns out 
that the indirection thunks code has inconsistent handling of the output of 
labels.  Thus some of the output is missing the leading “_” on Darwin, which 
breaks ABI and won’t link.

Since most of the tests are scan-asms that check for what’s expected, they 
currently pass on Darwin but will begin failing when the codegen is fixed.  
Thus there is a larger, but mechanical, testsuite change needed to deal with 
this.   I will post that if anyone’s interested, but otherwise will just apply 
it once the codegen fix is agreed.

The patch factors out some common code that writes out the jumps and uses the 
regular output scheme that accounts for __USER_LABEL_PREFIX__.
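
To illustrate the difference, here is a standalone sketch (not GCC code).  The
helper names and the user_label_prefix parameter are hypothetical stand-ins:
the first function mimics printing the name with a bare "%s", the second
mimics routing it through something like assemble_name, which I am assuming
here simply prepends the target's user label prefix ("_" on Darwin, empty on
Linux).  The thunk name in the usage note is only an example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the old path: the thunk name is printed
// verbatim, so any target-specific label prefix is lost.
std::string emit_jump_raw(const std::string& thunk_name) {
  return "\tjmp\t" + thunk_name + "\n";
}

// Hypothetical stand-in for the new path: the prefix is prepended before
// the name is written out (a simplification of what the real output
// machinery does).
std::string emit_jump_prefixed(const std::string& user_label_prefix,
                               const std::string& thunk_name) {
  return "\tjmp\t" + user_label_prefix + thunk_name + "\n";
}
```

With an empty prefix both helpers produce the same text, which is why the
problem would not be visible on Linux; with a "_" prefix only the second one
yields the underscore-prefixed symbol.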

I will note in passing that there’s very little PIC test coverage for the 
indirection thunks code, although Darwin is PIC-only for m64 - Linux has only a 
few tests.

OK for trunk?
Backports?
Iain

gcc/

* config/i386/i386.c (ix86_output_jmp_thunk_or_indirect): New.
(ix86_output_indirect_branch_via_reg): Use output mechanism accounting 
for
__USER_LABEL_PREFIX__.
(ix86_output_indirect_branch_via_push): Likewise.
(ix86_output_function_return): Likewise.
(ix86_output_indirect_function_return): Likewise.

From 4da5837cd7bbe61b6d2687e552e3afb5bfdb2765 Mon Sep 17 00:00:00 2001
From: Iain Sandoe 
Date: Tue, 7 May 2019 07:27:19 -0400
Subject: [PATCH] [Darwin] Fix PR82920 - code changes.

Emit labels using machinery that includes the __USER_LABEL_PREFIX__
---
 gcc/config/i386/i386.c | 48 --
 1 file changed, 27 insertions(+), 21 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c51d775b89..08aa9d9475 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -15130,6 +15130,20 @@ ix86_nopic_noplt_attribute_p (rtx call_op)
   return false;
 }
 
+/* Helper to output the jmp/call.  */
+static void
+ix86_output_jmp_thunk_or_indirect (const char *thunk_name, const int regno)
+{
+  if (thunk_name != NULL)
+{
+  fprintf (asm_out_file, "\tjmp\t");
+  assemble_name (asm_out_file, thunk_name);
+  putc ('\n', asm_out_file);
+}
+  else
+output_indirect_thunk (regno);
+}
+
 /* Output indirect branch via a call and return thunk.  CALL_OP is a
register which contains the branch target.  XASM is the assembly
template for CALL_OP.  Branch is a tail call if SIBCALL_P is true.
@@ -15168,17 +15182,14 @@ ix86_output_indirect_branch_via_reg (rtx call_op, 
bool sibcall_p)
 thunk_name = NULL;
 
   if (sibcall_p)
-{
-  if (thunk_name != NULL)
-   fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
-  else
-   output_indirect_thunk (regno);
-}
+ ix86_output_jmp_thunk_or_indirect (thunk_name, regno);
   else
 {
   if (thunk_name != NULL)
{
- fprintf (asm_out_file, "\tcall\t%s\n", thunk_name);
+ fprintf (asm_out_file, "\tcall\t");
+ assemble_name (asm_out_file, thunk_name);
+ putc ('\n', asm_out_file);
  return;
}
 
@@ -15199,10 +15210,7 @@ ix86_output_indirect_branch_via_reg (rtx call_op, bool 
sibcall_p)
 
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1);
 
-  if (thunk_name != NULL)
-   fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
-  else
-   output_indirect_thunk (regno);
+ ix86_output_jmp_thunk_or_indirect (thunk_name, regno);
 
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
 
@@ -15259,10 +15267,7 @@ ix86_output_indirect_branch_via_push (rtx call_op, 
const char *xasm,
   if (sibcall_p)
 {
   output_asm_insn (push_buf, &call_op);
-  if (thunk_name != NULL)
-   fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
-  else
-   output_indirect_thunk (regno);
+  ix86_output_jmp_thunk_or_indirect (thunk_name, regno);
 }
   else
 {
@@ -15318,10 +15323,7 @@ ix86_output_indirect_branch_via_push (rtx call_op, 
const char *xasm,
 
   output_asm_insn (push_buf, &call_op);
 
-  if (thunk_name != NULL)
-   fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
-  else
-   output_indirect_thunk (regno);
+  ix86_output_jmp_thunk_or_indirect (thunk_name, regno);
 
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
 
@@ -15420,7 +15422,9 @@ ix86_output_function_return (bool long_p)
  indirect_thunk_name (thunk_name, INVALID_REGNUM, need_prefix,
   true);
  indirect_return_needed |= need_thunk;
- fprintf (asm_out_file, "\tjmp\t%s\n", thunk_name);
+ fprintf (asm_out_file, "\tjmp\t");
+ assemble_name (asm_out_file, thunk_name);
+ putc ('\n', asm_out_file);
}
   else
output_indirect_thunk (INVALID_REGNUM);
@@ -15460,7 +15464,9 @@ ix86_output_indirect_function_return (rtx ret_op)
  indirect_return_via_cx =

Re: [C++ PATCH] Fix up C++ loop construct debug info without -gno-statement-frontiers (PR debug/90197)

2019-05-10 Thread Jason Merrill


On 4/26/19 11:45 AM, Jakub Jelinek wrote:

Hi!

On Fri, Apr 26, 2019 at 09:31:36AM -0600, Jeff Law wrote:

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


Thanks, committed to trunk now.


I'll work on a C++ FE version of this next (needed as well).


Here is the C++ FE version of this patch, bootstrapped/regtested on
x86_64-linux and i686-linux and tested with the same testcases.

For some strange reason the C++ FE does that
protected_set_expr_location (incr, start_locus);
on the incr expression, so the patch also uses that locus for the
corresponding DEBUG_BEGIN_STMT instead of trying to figure out
original locus for the incr.


That is strange.  That seems to go back to


2012-04-17  Tom de Vries  

* cp-gimplify.c (begin_bc_block): Add location parameter and use as
location argument to create_artificial_label.
(finish_bc_block): Change return type to void.  Remove body_seq
parameter, and add block parameter.  Append label to STMT_LIST and
return in block.
(gimplify_cp_loop, gimplify_for_stmt, gimplify_while_stmt)
(gimplify_do_stmt, gimplify_switch_stmt): Remove function.
(genericize_cp_loop, genericize_for_stmt, genericize_while_stmt)
(genericize_do_stmt, genericize_switch_stmt, genericize_continue_stmt)
(genericize_break_stmt, genericize_omp_for_stmt): New function.
(cp_gimplify_omp_for): Remove bc_continue processing.
(cp_gimplify_expr): Genericize VEC_INIT_EXPR.
(cp_gimplify_expr): Mark FOR_STMT, WHILE_STMT, DO_STMT, SWITCH_STMT,
CONTINUE_STMT, and BREAK_STMT as unreachable.
(cp_genericize_r): Genericize FOR_STMT, WHILE_STMT, DO_STMT,
SWITCH_STMT, CONTINUE_STMT, BREAK_STMT and OMP_FOR.
(cp_genericize_tree): New function, factored out of ...
(cp_genericize): ... this function.


Surely we should only set the incr location if it doesn't already have 
one, as would have been the case before that patch.


Jason

Re: [PATCH] Append to target_gtfiles in order to fix Darwin bootstrap.

2019-05-10 Thread Eric Gallager

On 5/6/19, Martin Liška  wrote:
> On 5/6/19 3:52 PM, Jakub Jelinek wrote:
>> On Mon, May 06, 2019 at 03:47:53PM +0200, Martin Liška wrote:
>>> The patch appends to target_gtfiles in 3 places instead of overwriting
>>> it.
>>>
>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>
>>> Ready to be installed?
>>> Thanks,
>>> Martin
>>>
>>> gcc/ChangeLog:
>>>
>>> 2019-05-06  Martin Liska  
>>>
>>> * config.gcc: Append to target_gtfiles.
>>> ---
>>>  gcc/config.gcc | 6 +++---
>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>>
>>
>>> diff --git a/gcc/config.gcc b/gcc/config.gcc
>>> index 5124ea00792..f119f82e475 100644
>>> --- a/gcc/config.gcc
>>> +++ b/gcc/config.gcc
>>> @@ -383,7 +383,7 @@ i[34567]86-*-*)
>>> cxx_target_objs="i386-c.o"
>>> d_target_objs="i386-d.o"
>>> extra_objs="x86-tune-sched.o x86-tune-sched-bd.o x86-tune-sched-atom.o
>>> x86-tune-sched-core.o i386-options.o i386-builtins.o i386-expand.o
>>> i386-features.o"
>>> -  target_gtfiles="\$(srcdir)/config/i386/i386-builtins.c
>>> \$(srcdir)/config/i386/i386-expand.c
>>> \$(srcdir)/config/i386/i386-options.c"
>>> +   target_gtfiles="$target_gtfiles \$(srcdir)/config/i386/i386-builtins.c
>>> \$(srcdir)/config/i386/i386-expand.c
>>> \$(srcdir)/config/i386/i386-options.c"
>>
>> I think there is no need to add $target_gtfiles here, you know it is
>> empty,
>> the first spot in config.gcc that touches it is this switch based on CPU.
>> Just fix up the indentation.
>
> Ah, got it.
>
>>
>>> @@ -416,7 +416,7 @@ x86_64-*-*)
>>> d_target_objs="i386-d.o"
>>> extra_options="${extra_options} fused-madd.opt"
>>> extra_objs="x86-tune-sched.o x86-tune-sched-bd.o x86-tune-sched-atom.o
>>> x86-tune-sched-core.o i386-options.o i386-builtins.o i386-expand.o
>>> i386-features.o"
>>> -  target_gtfiles="\$(srcdir)/config/i386/i386-builtins.c
>>> \$(srcdir)/config/i386/i386-expand.c
>>> \$(srcdir)/config/i386/i386-options.c"
>>> +   target_gtfiles="$target_gtfiles \$(srcdir)/config/i386/i386-builtins.c
>>> \$(srcdir)/config/i386/i386-expand.c
>>> \$(srcdir)/config/i386/i386-options.c"
>>> extra_headers="cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
>>>pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
>>>nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
>>
>> Ditto.
>>
>>> @@ -693,7 +693,7 @@ case ${target} in
>>>esac
>>>tm_file="${tm_file} ${cpu_type}/darwin.h"
>>>tm_p_file="${tm_p_file} darwin-protos.h"
>>> -  target_gtfiles="\$(srcdir)/config/darwin.c"
>>> +  target_gtfiles="$target_gtfiles \$(srcdir)/config/darwin.c"
>>>extra_options="${extra_options} darwin.opt"
>>>c_target_objs="${c_target_objs} darwin-c.o"
>>>cxx_target_objs="${cxx_target_objs} darwin-c.o"
>>>
>>
>> This is insufficient, needs to be done also in the 3
>>  target_gtfiles="\$(srcdir)/config/i386/winnt.c"
>> cases.
>
> Done that. I'm going to install the patch.
>

This reminded me about bug 36994:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36994
It's about all_gtfiles, but I'm pretty sure that should contain
target_gtfiles, so if target_gtfiles is getting longer... well, it
seems like bug 36994 would be relevant.

> Martin
>
>>
>> Ok with those changes.
>>
>>  Jakub
>>
>
>

Re: [patch, fortran] C prototype writing improvements for gfortran

2019-05-10 Thread Janne Blomqvist

On Thu, May 9, 2019 at 12:31 AM Thomas Koenig  wrote:
>
> Hello world,
>
> the attached patch fixes PR 90351 (not all prototypes were written
> to standard output with -fc-prototypes) and introduces new
> functionality to also write C prototypes for external functions,
> at the same time discouraging their use (because BIND(C) is really
> the better, standard-conforming and portable way).  While looking
> at the code, I also noticed that COMPLEX was not handled before,
> so I added that, too.
>
> Example:
>
> $ cat c.f90
> integer function r(i)
> end
>
> subroutine foo(a,b,c)
>character*(*) a
>real b
>complex c
> end
>
> character*(*) function x(r, c1,c2)
>real r
>character*(*) c1,c2
> end
> $ gfortran -fsyntax-only -fc-prototypes-external c.f90
> /* Prototypes for external procedures generated from c.f90
> by GNU Fortran (GCC) 10.0.0 20190427 (experimental).
>
> Use of this interface is dicsouraged, consider using the
> BIND(C) feature of standard Fortran instead.  */
>
> int r_ (int *i);
> void foo_ (char *a, float *b, float complex *c, size_t a_len);
> void x_ (char *result_x, size_t result_x_len, float *r, char *c1, char
> *c2, size_t c1_len, size_t c2_len);
>
> I'd like to commit this to trunk and to gcc-9, to help users of
> old-fashioned Lapack bindings, such as R, with their transition
> to something that does not violate gfortran's ABI.
>
> Tested with "make dvi" and "make info".  Otherwise, since these flags
> are not tested in the testsuite (maybe they should be, I just don't
> know how), regression test passed.
>
> OK?
>
> 2019-05-08  Thomas Koenig  
>
>  PR fortran/90351
>  PR fortran/90329
>  * gfortran.dg/dump-parse-tree.c: Include version.h.
>  (gfc_dump_external_c_prototypes): New function.
>  (get_c_type_name): Select "char" as a name for a simple char.
>  Adjust to handling external functions. Also handle complex.
>  (write_decl): Add argument bind_c. Adjust for dumping of external
>  procedures.
>  (write_proc): Likewise.
>  (write_interop_decl): Add bind_c argument to call of write_proc.
>  * gfortran.h: Add prototype for gfc_dump_external_c_prototypes.
>  * lang.opt: Add -fc-prototypes-external flag.
>  * parse.c (gfc_parse_file): Move dumping of BIND(C) prototypes.
>  Call gfc_dump_external_c_prototypes if option is set.
>  * invoke.texi: Document -fc-prototypes-external.
>

Thanks for this. I committed as obvious r271075

Index: ChangeLog
===
--- ChangeLog   (revision 271074)
+++ ChangeLog   (working copy)
@@ -28,7 +28,7 @@

PR fortran/90351
PR fortran/90329
-   * gfortran.dg/dump-parse-tree.c: Include version.h.
+   * dump-parse-tree.c: Include version.h.
(gfc_dump_external_c_prototypes): New function.
(get_c_type_name): Select "char" as a name for a simple char.
Adjust to handling external functions. Also handle complex.



-- 
Janne Blomqvist

Re: Ping [patch, fortran] Fix PR 61968

2019-05-10 Thread Steve Kargl

Seems short enough to be committed as 'obvious'.

Ok.


On Fri, May 10, 2019 at 09:57:49PM +0200, Thomas Koenig wrote:
> Hi,
> 
> ping?
> 
> >> Not for me, I still get
> >>
> >> % gfc pr61968.f90 -c -O3
> >> pr61968.f90:32:0:
> >>
> >>     32 | call test_lib (a, int (sizeof (a), kind=c_size_t))
> >>    |
> >> internal compiler error: in gfc_trans_create_temp_array, at 
> >> fortran/trans-array.c:1265
> > 
> > You're right, I will clear this up separately.
> > 
> > In the meantime, here is the one-line patch with the test case above
> > with -O3 added, so any failure will be noted soon.
> > 
> > OK for trunk?
> 
> Re

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow

Ping [patch, fortran] Fix PR 61968

2019-05-10 Thread Thomas Koenig


Hi,

ping?


Not for me, I still get

% gfc pr61968.f90 -c -O3
pr61968.f90:32:0:

    32 | call test_lib (a, int (sizeof (a), kind=c_size_t))
   |
internal compiler error: in gfc_trans_create_temp_array, at 
fortran/trans-array.c:1265


You're right, I will clear this up separately.

In the meantime, here is the one-line patch with the test case above
with -O3 added, so any failure will be noted soon.

OK for trunk?


Re

Go patch committed: Permit inlining receive expressions

2019-05-10 Thread Ian Lance Taylor

This patch to the Go frontend permits inlining functions with receive
expressions.  This does not permit any new inlinable functions in the
standard library.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.
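
As a rough picture of the export/import round trip (a simplified sketch in
plain C++, not the gofrontend classes): the exporter writes the "<-" token
followed by the operand, and the importer requires that token before reading
the operand back.  The function names are hypothetical, and the operand is
assumed to be a plain string rather than a nested expression.

```cpp
#include <cassert>
#include <string>

// Hypothetical exporter: emit the receive operator, then the channel operand.
std::string export_receive(const std::string& channel) {
  return "<-" + channel;
}

// Hypothetical importer: succeed only if the text starts with "<-", and
// hand back whatever follows as the channel operand.
bool import_receive(const std::string& text, std::string* channel) {
  if (text.compare(0, 2, "<-") != 0)
    return false;
  *channel = text.substr(2);
  return true;
}
```

Importing the exporter's output gives back the original operand, and text
that lacks the "<-" prefix is rejected.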

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 271063)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-b5e4ba88a2e7f3c34e9183f43370c38ea639c393
+76ab85364745e445498fe53f9ca8e37b49650779
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 271063)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -15423,6 +15423,15 @@ Receive_expression::do_get_backend(Trans
   return Expression::make_compound(recv, recv_ref, loc)->get_backend(context);
 }
 
+// Export a receive expression.
+
+void
+Receive_expression::do_export(Export_function_body* efb) const
+{
+  efb->write_c_string("<-");
+  this->channel_->export_expression(efb);
+}
+
 // Dump ast representation for a receive expression.
 
 void
@@ -15432,6 +15441,16 @@ Receive_expression::do_dump_expression(A
   ast_dump_context->dump_expression(channel_);
 }
 
+// Import a receive expression.
+
+Expression*
+Receive_expression::do_import(Import_expression* imp, Location loc)
+{
+  imp->require_c_string("<-");
+  Expression* expr = Expression::import_expression(imp, loc);
+  return Expression::make_receive(expr, loc);
+}
+
 // Make a receive expression.
 
 Receive_expression*
@@ -16783,6 +16802,8 @@ Expression::import_expression(Import_exp
   // This handles integers, floats and complex constants.
   return Integer_expression::do_import(imp, loc);
 }
+  else if (imp->match_c_string("<-"))
+return Receive_expression::do_import(imp, loc);
   else if (imp->match_c_string("$nil")
   || (imp->version() < EXPORT_FORMAT_V3
   && imp->match_c_string("nil")))
Index: gcc/go/gofrontend/expressions.h
===
--- gcc/go/gofrontend/expressions.h (revision 271021)
+++ gcc/go/gofrontend/expressions.h (working copy)
@@ -3982,6 +3982,9 @@ class Receive_expression : public Expres
   channel()
   { return this->channel_; }
 
+  static Expression*
+  do_import(Import_expression*, Location);
+
  protected:
   int
   do_traverse(Traverse* traverse)
@@ -4010,6 +4013,10 @@ class Receive_expression : public Expres
 return Expression::make_receive(this->channel_->copy(), this->location());
   }
 
+  int
+  do_inlining_cost() const
+  { return 1; }
+
   bool
   do_must_eval_in_order() const
   { return true; }
@@ -4018,6 +4025,9 @@ class Receive_expression : public Expres
   do_get_backend(Translate_context*);
 
   void
+  do_export(Export_function_body*) const;
+
+  void
   do_dump_expression(Ast_dump_context*) const;
 
  private:

Re: C++ PATCH to implement deferred parsing of noexcept-specifiers (c++/86476, c++/52869)

2019-05-10 Thread Marek Polacek

Coming back to this.  I didn't think this was suitable for GCC 9.

On Mon, Jan 07, 2019 at 10:44:37AM -0500, Jason Merrill wrote:
> On 12/19/18 3:27 PM, Marek Polacek wrote:
> > Prompted by Jon's observation in 52869, I noticed that we don't treat
> > a noexcept-specifier as a complete-class context of a class ([class.mem]/6).
> > As with member function bodies, default arguments, and NSDMIs, names used in
> > a noexcept-specifier of a member-function can be declared later in the class
> > body, so we need to wait and parse them at the end of the class.
> > For that, I've made use of DEFAULT_ARG (now best to be renamed to 
> > UNPARSED_ARG).
> 
> Or DEFERRED_PARSE, yes.

I didn't change the name but I'm happy to do it as a follow up.

> > +  /* We can't compare unparsed noexcept-specifiers.  Save the old decl
> > + and check this again after we've parsed the noexcept-specifiers
> > + for real.  */
> > +  if (UNPARSED_NOEXCEPT_SPEC_P (new_exceptions))
> > +{
> > +  vec_safe_push (DEFARG_INSTANTIATIONS (TREE_PURPOSE (new_exceptions)),
> > +copy_decl (old_decl));
> > +  return;
> > +}
> 
> Why copy_decl?

This is so that we don't lose the diagnostic in noexcept46.C.  If I don't use
copy_decl then the tree is shared and subsequent changes to it make us not
detect discrepancies like noexcept(false) vs. noexcept(true) on the same decl.

> It seems wasteful to allocate a vec to hold this single decl; let's make the
> last field of tree_default_arg a union instead.  And add a new macro for the
> single decl case.

Done.  But that required also adding GTY markers *and* a new BOOL_BITFIELD for
the sake of GTY((desc)).

> I notice that default_arg currently uses tree_common for some reason, and we
> ought to be able to save two words by switching to tree_base

Done.

> > @@ -1245,6 +1245,7 @@ nothrow_spec_p (const_tree spec)
> >   || TREE_VALUE (spec)
> >   || spec == noexcept_false_spec
> >   || TREE_PURPOSE (spec) == error_mark_node
> > + || TREE_CODE (TREE_PURPOSE (spec)) == DEFAULT_ARG
> 
> Maybe use UNPARSED_NOEXCEPT_SPEC_P here?
 
Done.

> > +/* Make sure that any member-function parameters are in scope.
> > +   For instance, a function's noexcept-specifier can use the function's
> > +   parameters:
> > +
> > +   struct S {
> > + void fn (int p) noexcept(noexcept(p));
> > +   };
> > +
> > +   so we need to make sure name lookup can find them.  This is used
> > +   when we delay parsing of the noexcept-specifier.  */
> > +
> > +static void
> > +maybe_begin_member_function_processing (tree decl)
> 
> This name is pretty misleading.  How about inject_parm_decls, to go with
> inject_this_parameter?

Done.

> > +/* Undo the effects of maybe_begin_member_function_processing.  */
> > +
> > +static void
> > +maybe_end_member_function_processing (void)
> 
> And then perhaps pop_injected_parms.

Done.

> > +/* Check throw specifier of OVERRIDER is at least as strict as
> > +   the one of BASEFN.  */
> > +
> > +bool
> > +maybe_check_throw_specifier (tree overrider, tree basefn)
> > +{
> > +  maybe_instantiate_noexcept (basefn);
> > +  maybe_instantiate_noexcept (overrider);
> > +  tree base_throw = TYPE_RAISES_EXCEPTIONS (TREE_TYPE (basefn));
> > +  tree over_throw = TYPE_RAISES_EXCEPTIONS (TREE_TYPE (overrider));
> > +
> > +  if (DECL_INVALID_OVERRIDER_P (overrider))
> > +return true;
> > +
> > +  /* Can't check this yet.  Pretend this is fine and let
> > + noexcept_override_late_checks check this later.  */
> > +  if (UNPARSED_NOEXCEPT_SPEC_P (base_throw)
> > +  || UNPARSED_NOEXCEPT_SPEC_P (over_throw))
> > +return true;
> > +
> > +  if (!comp_except_specs (base_throw, over_throw, ce_derived))
> > +{
> > +  auto_diagnostic_group d;
> > +  error ("looser throw specifier for %q+#F", overrider);
> 
> Since we're touching this diagnostic, let's correct it now to "exception
> specification".  And add "on overriding virtual function".

Ok, changed to the more up-to-date term.

Two further changes were required since my changes to detecting 'this' for
static member functions:
1) use THIS_FORBIDDEN if needed when parsing the delayed noexcept,
2) be careful about friend member functions: their DECL_CONTEXT is not the
   containing class, so we need to use DECL_FRIEND_CONTEXT.

Both of these points are tested in g++.dg/cpp0x/this1.C.
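
For readers who want to see the user-visible effect, here is a small sketch of
what treating the noexcept-specifier as a complete-class context allows.  The
struct and member names are made up for illustration; a compiler that does not
implement the deferred parsing is expected to reject the use of g() before its
declaration.

```cpp
#include <cassert>

struct S {
  // g() is declared further down in the class body; with deferred parsing
  // the noexcept-specifier is only parsed once the class is complete.
  void f() noexcept(noexcept(g())) {}

  // The function's own parameter is in scope in its noexcept-specifier.
  void fn(int p) noexcept(noexcept(p)) { (void)p; }

  void g() noexcept {}
};

struct T {
  // Same shape, but g() may throw, so f() is not noexcept either.
  void f() noexcept(noexcept(g())) {}
  void g() {}
};
```

Evaluating noexcept(S().f()) should give true and noexcept(T().f()) false,
since each specifier mirrors the later-declared g().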

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-05-10  Marek Polacek  

PR c++/86476 - noexcept-specifier is a complete-class context.
PR c++/52869
* cp-tree.def (DEFAULT_ARG): Update commentary.
* cp-tree.h (DEFARG_DECL, DEFARG_NOEXCEPT_P, UNPARSED_NOEXCEPT_SPEC_P):
New macros.
(tree_default_arg): Add a tree field, make the last two fields into a
union.  Add GTY markers.
(check_redeclaration_exception_specification): Declare.
(maybe_check_throw_specifier): Declare.
* decl.c

[Darwin, testsuite,x86, committed] Provide an asm shim for AVX512F tests.

2019-05-10 Thread Iain Sandoe

These tests fail to build on Darwin versions with assembler support for the 
insns; the addition of the shim fixes those failures.  However, at present, 
there’s no suitable hardware available (to the regular GCC contributors) to 
test the execution.

Applied.

gcc/testsuite/

2019-05-10  Iain Sandoe  

* gcc.target/x86_64/abi/avx512f/abi-avx512f.exp: Darwin is
now tested.
* gcc.target/x86_64/abi/avx512f/asm-support-darwin.s: New.

thanks
Iain



avx512f-shim.diff
Description: avx512f-shim.diff

Re: 30_threads/thread/native_handle/typesizes.cc is no good

2019-05-10 Thread Iain Sandoe

Hi Jonathan

> On 10 May 2019, at 15:20, Iain Sandoe  wrote:
> 
>> On 10 May 2019, at 14:57, Jonathan Wakely  wrote:
>> 
>> Resending as plain text so the lists don't reject it …
> 
>>> In order to test what it should, we'd need to use an alternate test
>>> function that does not strip off one indirection level from
>>> native_handle_type, if the test is to remain.
>>> 
>> 
>> Or just adapt the current test to work for the std::thread case too, by
>> only removing the pointer when we know we need to remove it, as in the
>> attached patch. Does this work on targets using a pointer type for
>> pthread_t?
> 
> this will fix PR81266, if so, will add to my next run.

The attached minor update to the posted patch does this.
cheers
Iain



pr81266.diff
Description: Binary data

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Jakub Jelinek

On Fri, May 10, 2019 at 01:46:07PM +, Wilco Dijkstra wrote:
> And looking at the generated code, emitting a tailcall for this case is
> actually a de-optimization since the large eh_return epilog must be
> repeated for every tailcall.

On x86_64, the code is shorter with the tail call rather than without.

That said, here is an actually tested workaround until targets are fixed.
Richard or Jeff, do we want this workaround?

I don't see why we would need extra testsuite coverage for this, given the
number of FAILs or bootstrap issues on targets that are broken.

2019-05-10  Jakub Jelinek  

PR c++/59813
* unwind.inc (_Unwind_Resume_or_Rethrow): Add optimize attribute
to temporarily avoid tail calls in the function.

--- libgcc/unwind.inc.jj2019-01-01 12:38:17.345987416 +0100
+++ libgcc/unwind.inc   2019-05-10 17:10:32.436329516 +0200
@@ -252,6 +252,7 @@ _Unwind_Resume (struct _Unwind_Exception
a normal exception that was handled.  */

 _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE
+__attribute__((optimize ("no-optimize-sibling-calls")))
 _Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
 {
   struct _Unwind_Context this_context, cur_context;

Jakub
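
For anyone unfamiliar with the attribute, a minimal standalone sketch of the
same technique outside libgcc follows.  The function names are hypothetical;
the attribute is GCC-specific, so it is guarded here on the assumption that
other compilers may not accept it.

```cpp
#include <cassert>

int helper(int x);

// Ask GCC not to turn the call to helper() into a sibling (tail) call in
// this one function, regardless of the optimization level in effect.
#if defined(__GNUC__) && !defined(__clang__)
__attribute__((optimize("no-optimize-sibling-calls")))
#endif
int wrapper(int x) { return helper(x + 1); }

int helper(int x) { return x * 2; }
```

The attribute only changes code generation, not behaviour: wrapper(3) still
returns 8.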

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Wilco Dijkstra

Hi Jakub,

>> That's great to hear, but all the other targets still need to be fixed 
>> somehow.
>
> Not all other targets, just some.  I don't see what you find wrong on
> actually fixing them.  The tail call in that spot is actually very
> desirable, but only if we can do a shrink wrapping for it, by doing

Without a torture test it's not obvious how many targets actually get it right.

If it's desirable to optimize RaiseException (is that really the common case?),
then why not optimize the source code to:

_Unwind_Reason_Code
_Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
{
  if (exc->private_1 == 0)
return _Unwind_RaiseException (exc);
  
  _Unwind_Resume_Tailcall (exc);
}

_Unwind_Reason_Code __attribute__((optimize ("no-omit-frame-pointer")))
_Unwind_Resume_Tailcall (struct _Unwind_Exception *exc)
{
  struct _Unwind_Context this_context, cur_context;
  _Unwind_Reason_Code code;
  unsigned long frames;
  do { __builtin_unwind_init (); uw_init_context_1 (&this_context,
       __builtin_dwarf_cfa (), __builtin_return_address (0)); } while (0);
  cur_context = this_context;
  code = _Unwind_ForcedUnwind_Phase2 (exc, &cur_context, &frames);
  ((void)(!(code == _URC_INSTALL_CONTEXT) ? abort (), 0 : 0));
  do { long offset = uw_install_context_1 ((&this_context), (&cur_context));
       void *handler = uw_frob_return_addr ((&this_context), (&cur_context));
       _Unwind_DebugHook ((&cur_context)->cfa, handler); ;
       __builtin_eh_return (offset, handler); } while (0);
}


> If you want just ignore the problem, we can just
--- libgcc/unwind.inc.jj    2019-01-01 12:38:17.345987416 +0100
+++ libgcc/unwind.inc   2019-05-10 16:12:18.330555718 +0200
@@ -252,6 +252,7 @@ _Unwind_Resume (struct _Unwind_Exception
    a normal exception that was handled.  */
 
 _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE
+__attribute__((optimize ("no-omit-sibling-calls")))
 _Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
 {
   struct _Unwind_Context this_context, cur_context;

> and be done with that.

Yes, that's a safe temporary workaround to get things working again; I don't
think anyone can object to that.

Note the typo: the option should be "no-optimize-sibling-calls". The patch
also needs a comment referencing the PRs created for this.
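
For illustration only, here is a minimal sketch of how that attribute is spelled on an ordinary function. The names step and call_step are hypothetical and are not libgcc code; the sketch assumes GCC, where the optimize attribute applies the given -f option to a single function.

```cpp
#include <cassert>

// Hypothetical callee, kept out of line so the call below is a real call
// in tail position.
__attribute__((noinline)) int step (int n)
{
  return n + 1;
}

// With GCC, this asks for -fno-optimize-sibling-calls on this function
// only, so the call to step () is not turned into a jump.
__attribute__((optimize ("no-optimize-sibling-calls")))
int call_step (int n)
{
  return step (n);
}
```

The attribute changes only how the call is emitted; the value returned by call_step is the same either way.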

Wilco

Re: [C++ Patch] Consistently use FUNC_OR_METHOD_TYPE_P

2019-05-10 Thread Jason Merrill


On 5/6/19 5:13 AM, Paolo Carlini wrote:

Hi,

one of the most straightforward changes in my pending queue. Tested 
x86_64-linux.


OK.

Jason

Re: [C++ Patch] Avoid some duplicate error messages

2019-05-10 Thread Jason Merrill


On 4/29/19 2:50 PM, Paolo Carlini wrote:

Hi,

I have a small back queue of tweaks of various kinds and sizes, this one 
seems small enough to be safe wrt last minute release branch fixes.


While working on the regression c++/88969, some duplicate errors showed 
up when we started giving appropriate diagnostics instead of ICEing. 
They can be cured by unconditionally returning error_mark_node from 
cp_build_function_call_vec when mark_used fails. In general, the 
front end has a mix of unconditional mark_used checks and checks that 
are conditional on SFINAE context, and I think it's often a delicate 
choice. The change below passes testing, and I would give it a spin at 
the beginning of Stage1. Tested x86_64-linux.


OK.

Jason

Re: C++ PATCH for c++/78010 - bogus -Wsuggest-override warning on final function

2019-05-10 Thread Jason Merrill


On 5/6/19 5:07 PM, Marek Polacek wrote:

-Wsuggest-override warns for virtual functions that could use the override
keyword.  But as everyone in this PR agrees, the warning shouldn't suggest
adding "override" for "final" functions.  This trivial patch implements that.

Bootstrapped/regtested on x86_64-linux, ok for trunk/9?


OK.

Jason

Re: [C++ Patch] One more location fix

2019-05-10 Thread Jason Merrill


On 5/8/19 5:02 AM, Paolo Carlini wrote:

Hi again,

one more straightforward fixlet which remained in my tree for a while. 
Tested x86_64-linux.


Thanks, Paolo.




OK.

Jason

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Uros Bizjak

On Fri, May 10, 2019 at 4:35 PM Segher Boessenkool
 wrote:
>
> On Fri, May 10, 2019 at 09:42:47AM +0200, Richard Biener wrote:
> > On Fri, May 10, 2019 at 9:25 AM Uros Bizjak  wrote:
> > > > But IL semantic differences based on mode is even worse.  What happens
> > > > if STV then substitutes a vector register/op but you previously optimized
> > > > with the assumption the count would be truncated since the shift was 
> > > > SImode?
>
> What is STV?

scalar-to-vector pass for x86 targets that uses vector instructions
and XMM registers to perform DImode instructions.

Uros.

> > But that's more a combine limitation than a reason going for the
> > "hidden" IL semantic change.  But yes, if the and is used by
> > non-masking insns then it's likely cheap enough to retain it.
> >
> > If the masking were always in place (combined with the shift
> > if a suitable insn exists) then STV handling should be possible,
> > it just would need to split the insn to do the masking and then the shift
> > (of course that might not be very profitable).
>
> Why does the pattern with masking split?  It could just be a define_insn.
> Combine will try to get rid of the masking, to simplify the RTL.  Going in the
> other direction is probably not profitable :-/
>
>
> Segher

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Segher Boessenkool

On Fri, May 10, 2019 at 09:42:47AM +0200, Richard Biener wrote:
> On Fri, May 10, 2019 at 9:25 AM Uros Bizjak  wrote:
> > > But IL semantic differences based on mode is even worse.  What happens
> > > if STV then substitutes a vector register/op but you previously optimized
> > > with the assumption the count would be truncated since the shift was 
> > > SImode?

What is STV?

> But that's more a combine limitation than a reason going for the
> "hidden" IL semantic change.  But yes, if the and is used by
> non-masking insns then it's likely cheap enough to retain it.
> 
> If the masking were always in place (combined with the shift
> if a suitable insn exists) then STV handling should be possible,
> it just would need to split the insn to do the masking and then the shift
> (of course that might not be very profitable).

Why does the pattern with masking split?  It could just be a define_insn.
Combine will try to get rid of the masking, to simplify the RTL.  Going in the
other direction is probably not profitable :-/


Segher

[C++ Patch] PR 89875 ("[7/8/9/10 Regression] invalid typeof reference to a member of an incomplete struct accepted at function scope")

2019-05-10 Thread Paolo Carlini


Hi,

a while ago Martin noticed that an unintended consequence of an old 
tweak of mine - which avoided redundant error messages emitted from 
cp_parser_init_declarator - is that, in some cases, we started accepting 
ill-formed typeofs. Luckily, decltype isn't affected and that points to 
the real issue: by the time that place in cp_parser_init_declarator is 
reached, for a decltype version we already emitted a correct error 
message. Thus I think the right way to fix the problem is simply 
committing to the tentative parse when, toward the end of 
cp_parser_sizeof_operand, we know that we must be looking at a (possibly 
ill-formed) expression. Tested x86_64-linux.


Thanks, Paolo.


/cp
2019-05-10  Paolo Carlini  

PR c++/89875
* parser.c (cp_parser_sizeof_operand): When the type-id production
did not work out commit to the tentative parse.

/testsuite
2019-05-10  Paolo Carlini  

PR c++/89875
* g++.dg/cpp0x/decltype-pr66548.C: Remove xfail.
* g++.dg/template/sizeof-template-argument.C: Adjust expected error.
Index: cp/parser.c
===
--- cp/parser.c (revision 271059)
+++ cp/parser.c (working copy)
@@ -28998,7 +28998,11 @@ cp_parser_sizeof_operand (cp_parser* parser, enum
   /* If the type-id production did not work out, then we must be
  looking at the unary-expression production.  */
   if (!expr)
-expr = cp_parser_unary_expression (parser);
+{
+  cp_parser_commit_to_tentative_parse (parser);
+  
+  expr = cp_parser_unary_expression (parser);
+}
 
   /* Go back to evaluating expressions.  */
   --cp_unevaluated_operand;
Index: testsuite/g++.dg/cpp0x/decltype-pr66548.C
===
--- testsuite/g++.dg/cpp0x/decltype-pr66548.C   (revision 271059)
+++ testsuite/g++.dg/cpp0x/decltype-pr66548.C   (working copy)
@@ -11,7 +11,7 @@ struct Meow {};
 
 void f ()
 {
-  decltype (Meow.purr ()) d;   // { dg-error "expected primary-expression" "pr89875" { xfail c++98_only } }
+  decltype (Meow.purr ()) d;   // { dg-error "expected primary-expression" }
   (void)
 }
 
Index: testsuite/g++.dg/template/sizeof-template-argument.C
===
--- testsuite/g++.dg/template/sizeof-template-argument.C (revision 271059)
+++ testsuite/g++.dg/template/sizeof-template-argument.C(working copy)
@@ -3,9 +3,9 @@
 
 template struct A {};
 
-template struct B : A  {}; /* { dg-error "template argument" } */
+template struct B : A  {}; /* { dg-error "expected primary-expression" } */
 
-template struct C : A  {}; /* { dg-error "template argument" } */
+template struct C : A  {}; /* { dg-error "expected primary-expression" } */
 
 int a;

Re: [PATCH] netbsd EABI support

2019-05-10 Thread coypu

On Fri, May 10, 2019 at 11:44:28AM +0100, Richard Earnshaw (lists) wrote:
> I was hoping to get a comment from one of the netbsd port maintainers on
> the changes to the common NetBSD code.  Are they no-longer active?

Jason Thorpe said he can't contribute to GCC because of where he works.
Krister Walfridsson is slow to respond but does eventually and knows a
lot about GCC.

Re: [PATCH][RFC] Come up with TARGET_HAS_FAST_MEMPCPY_ROUTINE (PR middle-end/90263).

2019-05-10 Thread Wilco Dijkstra

Hi Martin,

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

However I guess some existing tests checking for mempcpy may fail on
other targets, so might need to be changed.

> @Wilco: Can you please come up with a test-case for aarch64?

A simplified version of gcc/testsuite/gcc.dg/fold-bcopy.c should be good enough
if existing tests don't provide enough cover. Literally just compiling the
example I gave and checking that it calls memcpy, using a target exclude,
will do, e.g.

int *f (int *p, int *q, long n)
{
  return __builtin_mempcpy (p, q, n);
}

/* { dg-final { scan-assembler "mempcpy" { target { i?86-*-* x86_64-*-* } } } } */
/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-* x86_64-*-* } } } } } */
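
As a reminder of why the replacement is valid, a minimal sketch of mempcpy expressed through memcpy follows. The helper name mempcpy_via_memcpy is hypothetical; it only illustrates the semantics (copy n bytes, return the address one past the last byte written) and is not the code GCC emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: memcpy returns dst, so adding n gives the pointer
// that mempcpy would return.
void *mempcpy_via_memcpy (void *dst, const void *src, std::size_t n)
{
  return static_cast<char *> (std::memcpy (dst, src, n)) + n;
}
```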

Wilco

Re: 30_threads/thread/native_handle/typesizes.cc is no good

2019-05-10 Thread Iain Sandoe



> On 10 May 2019, at 14:57, Jonathan Wakely  wrote:
> 
> Resending as plain text so the lists don't reject it …
> 

>> In order to test what it should, we'd need to use an alternate test
>> function that does not strip off one indirection level from
>> native_handle_type, if the test is to remain.
>> 
> 
> Or just adapt the current test to work for the std::thread case too, by
> only removing the pointer when we know we need to remove it, as in the
> attached patch. Does this work on targets using a pointer type for
> pthread_t?

If so, this will fix PR81266; I will add it to my next test run.

Iain

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Jakub Jelinek

On Fri, May 10, 2019 at 01:46:07PM +, Wilco Dijkstra wrote:
> > So given this and the fact there is no real gain from emitting tailcalls in 
> > eh_return
> > functions, wouldn't it make more sense to always block tailcalls in the 
> > mid-end?
> 
> > No.  E.g. x86 handles that just fine, 
> 
> That's great to hear, but all the other targets still need to be fixed 
> somehow.

Not all other targets, just some.  I don't see what you find wrong with
actually fixing them.  The tail call in that spot is actually very
desirable, but only if we can do a shrink wrapping for it, by doing
if (exc->private)
  tail_call _Unwind_RaiseException (exc);
prologue;
whatever;
eh_epilogue;

Or
if (exc->private)
  tail_call _Unwind_RaiseException (exc);
else
  tail_call _Unwind_Resume (exc);
Except that _Unwind_Resume returns void and is (determined by the compiler)
noreturn, so it doesn't matter what we return.  That second option would be
shortest, but we don't have ways to convince the compiler to do that right
now (__attribute__((noipa)) on _Unwind_Resume perhaps, so that it doesn't
tell callers it is noreturn, but the mismatch of the return value is bad).

If you just want to ignore the problem, we can just
--- libgcc/unwind.inc.jj2019-01-01 12:38:17.345987416 +0100
+++ libgcc/unwind.inc   2019-05-10 16:12:18.330555718 +0200
@@ -252,6 +252,7 @@ _Unwind_Resume (struct _Unwind_Exception
a normal exception that was handled.  */

 _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE
+__attribute__((optimize ("no-omit-sibling-calls")))
 _Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
 {
   struct _Unwind_Context this_context, cur_context;

and be done with that.

Jakub

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Jeff Law

On 5/10/19 1:08 AM, Richard Biener wrote:
> On Thu, May 9, 2019 at 6:00 PM Jeff Law  wrote:
>>
>> On 5/9/19 5:52 AM, Robin Dapp wrote:
>>> Hi,
>>>
>>> while trying to improve s390 code generation for rotate and shift I
>>> noticed superfluous subregs for shift count operands. In our backend we
>>> already have quite cumbersome patterns that would need to be duplicated
>>> (or complicated further by more subst patterns) in order to get rid of
>>> the subregs.
>>>
>>> I had already finished all the patterns when I realized that
>>> SHIFT_COUNT_TRUNCATED and the target hook shift_truncation_mask already
>>> exist and could do what is needed without extra patterns.  Just defining
>>>  shift_truncation_mask was not enough though as most of the additional
>>> insns get introduced by combine.
>>>
>>> Even SHIFT_COUNT_TRUNCATED is not a perfect match to what our hardware
>>> does because we always only consider the last 6 bits of a shift operand.
>>>
>>> Despite all the warnings in the other backends, most notably
>>> SHIFT_COUNT_TRUNCATED being "discouraged" as mentioned in riscv.h, I
>>> wrote the attached tentative patch.  It's a little ad-hoc, uses the
>>> SHIFT_COUNT_TRUNCATED paths only if shift_truncation_mask () != 0 and,
>>> instead of truncating via & (GET_MODE_BITSIZE (mode) - 1), it applies
>>> the mask returned by shift_truncation_mask.  Doing so, usage of both
>>> "methods" actually reduces to a single way.
>> The main reason it's discouraged is because some targets have insns
>> where the count would be truncated and others where it would not.   So
>> for example normal shifts might truncate, but vector shifts might not
>> (mips) or shifts might truncate but bit tests do not (x86).
>>
>> I don't know enough about the s390 architecture to know if there are any
>> corner cases.  You'd have to look at every pattern in your machine
>> description with a shift and verify that it's going to DTRT if the count
>> hasn't been truncated.
>>
>>
>> It would really help if you could provide testcases which show the
>> suboptimal code and any analysis you've done.
> 
> The main reason I dislike SHIFT_COUNT_TRUNCATED is that it
> changes the meaning of the IL.  We generally want these kinds
> of things to be explicit.
Understood and agreed.  So I think this argues that Robin should look at
a different approach and we should start pushing ports away from
SHIFT_COUNT_TRUNCATED a bit more aggressively.
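
To make "explicit" concrete at the source level, here is a minimal sketch, assuming a 32-bit operand; the function name shl_masked is hypothetical. The count is reduced modulo the operand width in the code itself, so nothing depends on how a particular shift instruction treats out-of-range counts.

```cpp
#include <cassert>
#include <cstdint>

// Shift a 32-bit value left with the count masked to the range 0..31.
// Without the mask, a count of 32 or more would be undefined behaviour.
std::uint32_t shl_masked (std::uint32_t value, unsigned count)
{
  return value << (count & 31u);
}
```

A backend whose shift instruction already truncates the count can still combine the mask into the shift, which is the alternative to SHIFT_COUNT_TRUNCATED discussed in this thread.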

Jeff

Re: [PATCH][OBVIOUS] Reapply r269790 which was missed during rebase.

2019-05-10 Thread Jeff Law

On 5/10/19 1:19 AM, Martin Liška wrote:
> Hi.
> 
> When I split i386.c I forgot to rebase this patch, which caused the
> gcc.target/i386/fpprec-1.c execution test to fail.
> 
> Thank you Jeff for reporting that.
> 
> I'm going to install the patch.
Thanks for fixing.  And this is precisely what the tester is meant to
catch :-)  Just glad it's doing its job.

Jeff

Re: Make vector iterator operators hidden friends

2019-05-10 Thread Jonathan Wakely


On 10/05/19 14:40 +0100, Jonathan Wakely wrote:

On Thu, 9 May 2019 at 06:49, François Dumont wrote:


Hi

 Patch similar to the one I just applied for the deque iterator, including
the NRVO copy elision fix.

 * include/bits/stl_bvector.h
 (operator==(const _Bit_iterator_base&, const _Bit_iterator_base&)):
 Make hidden friend.
 (operator<(const _Bit_iterator_base&, const _Bit_iterator_base&)):
 Likewise.
 (operator!=(const _Bit_iterator_base&, const _Bit_iterator_base&)):
 Likewise.
 (operator>(const _Bit_iterator_base&, const _Bit_iterator_base&)):
 Likewise.
 (operator<=(const _Bit_iterator_base&, const _Bit_iterator_base&)):
 Likewise.
 (operator>=(const _Bit_iterator_base&, const _Bit_iterator_base&)):
 Likewise.
 (operator-(const _Bit_iterator_base&, const _Bit_iterator_base&)):
 Likewise.
 (_Bit_iterator::operator+(difference_type)): Likewise and allow NRVO
 copy elision.
 (_Bit_iterator::operator-(difference_type)): Likewise.
 (operator+(ptrdiff_t, const _Bit_iterator&)): Make hidden friend.
 (_Bit_const_iterator::operator+(difference_type)): Likewise and allow
 NRVO copy elision.
 (_Bit_const_iterator::operator-(difference_type)): Likewise.
 (operator+(ptrdiff_t, const _Bit_const_iterator&)): Make hidden friend.


These const_iterator overloads seem to be missing the NRVO fix.


But the patch looks good otherwise, so OK for trunk with the NRVO
changes. Thanks.

Re: 30_threads/thread/native_handle/typesizes.cc is no good

2019-05-10 Thread Jonathan Wakely

Resending as plain text so the lists don't reject it ...

On Thu, 9 May 2019, 11:37 Alexandre Oliva wrote:

> Other classes that have native_handle/typesizes.cc tests have
> native_type_handle defined as a pointer type, and the pointed-to type is
> native_handle, the type of the single data member of the class (*).  It
> thus makes sense to check that the single data member and the enclosing
> class type have the same size and alignment: a pointer to the enclosing
> type should also be a valid pointer to its single member.
>
> (*) this single data member is actually inherited or enclosed in another
> class, but let's not get distracted by this irrelevant detail.
>
> std::thread, however, does not follow this pattern.  Not because the
> single data member is defined as a direct data member of a nested class,
> rather than inherited from another class.  The key difference is that it
> does not use native_type_handle's pointed-to type as the single data
> member, it rather uses native_type_handle directly as the type of the
> single data member.
>
> On GNU/Linux, and presumably on most platforms, native_handle_type =
> __gthread_t is not even a pointer type.  This key difference would have
> been obvious if remove_pointer rejected non-pointer types, but that's
> not the case.  When given __gthread_t -> pthread_t -> unsigned long int,
> remove_pointer::type is T, so we get unsigned long int back.  The
> test works because there's no pointer type to strip off.
>

Which is by design. The test is written to work for the cases where
native_handle() returns _handle and also the std::thread case where it
returns m_handle directly.

The problem is this:


> However, on a platform that defines __gthread_t as a pointer type, we
> use that pointer type as native_type_handle and as the type for the
> single data member of std::thread.  But then, the test compares size and
> alignment of that type with those of the type obtained by removing one
> indirection level.  We're comparing properties of different, unrelated
> types.  Unless the pointed-to type happens to have, by chance, the size
> and alignment of a pointer, the test will fail.
>
>
Good catch.


>
> IOW, the test is no good: it's not testing what it should, and wherever
> it passes, it's by accident.
>


It's not by accident on GNU/Linux, or other targets where pthread_t is not
a pointer.



> In order to test what it should, we'd need to use an alternate test
> function that does not strip off one indirection level from
> native_handle_type, if the test is to remain.
>

Or just adapt the current test to work for the std::thread case too, by
only removing the pointer when we know we need to remove it, as in the
attached patch. Does this work on targets using a pointer type for
pthread_t?



Should I introduce such an alternate test function and adjust the test,
> or just remove the broken test?
>
> Or should we change std::thread::native_handle_type to __gthread_t*,
> while keeping unchanged the type of the single data member
> std::thread::id::_M_thread?
>

We could do that, but it would break any user code using the native handle.
I'd prefer not to do that just because we have a buggy testcase.
diff --git a/libstdc++-v3/testsuite/util/thread/all.h b/libstdc++-v3/testsuite/util/thread/all.h
index e5794fa4a97..ec24822a8ce 100644
--- a/libstdc++-v3/testsuite/util/thread/all.h
+++ b/libstdc++-v3/testsuite/util/thread/all.h
@@ -39,7 +39,12 @@ namespace __gnu_test
 
   // Remove possible pointer type.
   typedef typename test_type::native_handle_type native_handle;
-  typedef typename std::remove_pointer::type native_type;
+  // For std::thread native_handle_type is the type of its data member,
+  // for other types it's a pointer to the type of the data member.
+  typedef typename std::conditional<
+   std::is_same::value,
+   ::native_handle,
+   typename std::remove_pointer::type>::type native_type;
 
   int st = sizeof(test_type);
   int snt = sizeof(native_type);
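
The idea of the patch can be sketched in isolation as follows. This is only an illustration under stated assumptions: direct_handle, pointer_handle and stored_type are hypothetical names, not the libstdc++ classes or the testsuite helper. The pointer is stripped from native_handle_type only for wrappers that expose a pointer to their data member, and kept for the one wrapper that stores the handle type directly.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical wrapper that stores its handle type directly, as
// std::thread does.  The handle is a pointer here on purpose.
struct direct_handle
{
  typedef void* native_handle_type;
};

// Hypothetical wrapper whose native_handle_type points to the stored type.
struct pointer_handle
{
  typedef int* native_handle_type;
};

// Remove the pointer only when T is not direct_handle.
template<typename T>
struct stored_type
{
  typedef typename T::native_handle_type handle;
  typedef typename std::conditional<
    std::is_same<T, direct_handle>::value,
    handle,
    typename std::remove_pointer<handle>::type>::type type;
};

static_assert (std::is_same<stored_type<direct_handle>::type, void*>::value,
	       "handle kept as is");
static_assert (std::is_same<stored_type<pointer_handle>::type, int>::value,
	       "pointer removed");
```

With an unconditional remove_pointer, the first case would yield void instead of void*, which is the mismatch described above for targets where the thread handle is a pointer.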

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Wilco Dijkstra

Hi Jakub,

> So given this and the fact there is no real gain from emitting tailcalls in 
> eh_return
> functions, wouldn't it make more sense to always block tailcalls in the 
> mid-end?

> No.  E.g. x86 handles that just fine, 

That's great to hear, but all the other targets still need to be fixed somehow.

Is the tailcall optimization so important for this case it is worth forcing 20+
targets to find and fix this issue? Given eh_return is already overly complex
that seems like a lot of effort that could be used better on more productive
optimizations.

And looking at the generated code, emitting a tailcall for this case is
actually a de-optimization since the large eh_return epilog must be repeated
for every tailcall.

Wilco

Go patch committed: Permit inlining variable declaration statements

2019-05-10 Thread Ian Lance Taylor

This patch to the Go frontend permits inlining variable declaration
statements.  This adds all of two inlinable functions to the standard
library: crypto/subtle.ConstantTimeLessOrEq, regexp.(*Regexp).Copy.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 271044)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-3dbf51c01c5d0acbf9ae47f77166fa9935881749
+b5e4ba88a2e7f3c34e9183f43370c38ea639c393
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 271044)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -843,14 +843,14 @@ Var_expression::do_address_taken(bool es
 }
 
 // The cost to inline a variable reference.  We currently only support
-// references to parameters.
+// references to parameters and local variables.
 
 int
 Var_expression::do_inlining_cost() const
 {
   if (this->variable_->is_variable())
 {
-  if (this->variable_->var_value()->is_parameter())
+  if (!this->variable_->var_value()->is_global())
return 1;
 }
   else if (this->variable_->is_result_variable())
Index: gcc/go/gofrontend/statements.cc
===
--- gcc/go/gofrontend/statements.cc (revision 271044)
+++ gcc/go/gofrontend/statements.cc (working copy)
@@ -155,6 +155,8 @@ Statement::import_statement(Import_funct
   ifb->advance(6);
   return Statement::make_return_statement(NULL, loc);
 }
+  else if (ifb->match_c_string("var "))
+return Variable_declaration_statement::do_import(ifb, loc);
 
   Expression* lhs = Expression::import_expression(ifb, loc);
   ifb->require_c_string(" = ");
@@ -408,6 +410,57 @@ Statement::make_variable_declaration(Nam
   return new Variable_declaration_statement(var);
 }
 
+// Export a variable declaration.
+
+void
+Variable_declaration_statement::do_export_statement(Export_function_body* efb)
+{
+  efb->write_c_string("var ");
+  efb->write_string(Gogo::unpack_hidden_name(this->var_->name()));
+  efb->write_c_string(" ");
+  Variable* var = this->var_->var_value();
+  Type* type = var->type();
+  efb->write_type(type);
+  Expression* init = var->init();
+  if (init != NULL)
+{
+  efb->write_c_string(" = ");
+
+  go_assert(efb->type_context() == NULL);
+  efb->set_type_context(type);
+
+  init->export_expression(efb);
+
+  efb->set_type_context(NULL);
+}
+}
+
+// Import a variable declaration.
+
+Statement*
+Variable_declaration_statement::do_import(Import_function_body* ifb,
+ Location loc)
+{
+  ifb->require_c_string("var ");
+  std::string id = ifb->read_identifier();
+  ifb->require_c_string(" ");
+  Type* type = ifb->read_type();
+  Expression* init = NULL;
+  if (ifb->match_c_string(" = "))
+{
+  ifb->advance(3);
+  init = Expression::import_expression(ifb, loc);
+  Type_context context(type, false);
+  init->determine_type();
+}
+  Variable* var = new Variable(type, init, false, false, false, loc);
+  var->set_is_used();
+  // FIXME: The package we are importing does not yet exist, so we
+  // can't pass the correct package here.  It probably doesn't matter.
+  Named_object* no = ifb->block()->bindings()->add_variable(id, NULL, var);
+  return Statement::make_variable_declaration(no);
+}
+
 // Class Temporary_statement.
 
 // Return the type of the temporary variable.
Index: gcc/go/gofrontend/statements.h
===
--- gcc/go/gofrontend/statements.h  (revision 270877)
+++ gcc/go/gofrontend/statements.h  (working copy)
@@ -16,6 +16,7 @@ class Block;
 class Function;
 class Unnamed_label;
 class Export_function_body;
+class Import_function_body;
 class Assignment_statement;
 class Temporary_statement;
 class Variable_declaration_statement;
@@ -332,8 +333,7 @@ class Statement
   inlining_cost()
   { return this->do_inlining_cost(); }
 
-  // Export data for this statement to BODY.  INDENT is an indentation
-  // level used if the export data requires multiple lines.
+  // Export data for this statement to BODY.
   void
   export_statement(Export_function_body* efb)
   { this->do_export_statement(efb); }
@@ -514,10 +514,8 @@ class Statement
   { return 0x10; }
 
   // Implemented by child class: write export data for this statement
-  // to the string.  The integer is an indentation level used if the
-  // export data requires multiple lines.  This need only be
-  // implemented by classes that implement do_inlining_cost with a
-  // reasonable value.
+  // to the string.  This need only be implemented by classes that
+  //

Re: [PATCH 0/3] Small clean up of profile_{count,probability}

2019-05-10 Thread Martin Liška

PING^1

On 4/29/19 1:01 PM, marxin wrote:
> The patch makes a couple of adjustments:
> 1) I changed enum values to use capital letters. It's aligned with
>our coding style and it makes the code more readable. E.g.
>profile_quality::profile_guessed_local
>vs. static profile_probability guessed_always ()
> 
> 2) I removed usage of fully qualified names.
> 3) I added vertical spaces to separate different functions (operators).
> 
> Survives bootstrap and tests on x86_64-linux-gnu.
> 
> Ready for trunk after 9.1 release?
> Thanks,
> Martin
> 
> marxin (3):
>   Use cappital letters for enum value names.
>   Do not use full qualified names if possible.
>   Add vertical spacing in order to separate functions.
> 
>  gcc/profile-count.c |  76 +--
>  gcc/profile-count.h | 320 +---
>  2 files changed, 220 insertions(+), 176 deletions(-)
>

Re: [PATCH] Support again multiple --help options (PR other/90315).

2019-05-10 Thread Martin Liška

PING^1

On 5/3/19 12:59 PM, Martin Liška wrote:
> Hi.
> 
> The patch prints all values requested in multiple --help options.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-05-03  Martin Liska  
> 
>   PR other/90315
>   * opts-global.c (decode_options): Print help for all
>   help_option_arguments.
>   * opts.c (print_help): Add new argument.
>   (common_handle_option): Remember all values into
>   help_option_arguments.
>   * opts.h (print_help): Add new argument.
> ---
>  gcc/opts-global.c | 4 ++--
>  gcc/opts.c| 7 ---
>  gcc/opts.h| 5 +++--
>  3 files changed, 9 insertions(+), 7 deletions(-)
> 
>

Re: [PATCH 2/2] Do not follow zero edges in cycle detection (PR gcov-profile/90380).

2019-05-10 Thread Martin Liška

On 5/9/19 10:12 AM, marxin wrote:
> 
> gcc/ChangeLog:
> 
> 2019-05-09  Martin Liska  
> 
>   * gcov.c (circuit): Ignore zero count arcs.
> ---
>  gcc/gcov.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 

Hi.

Here's the second version of the patch; the reporter confirmed that it
works fine.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin
From 028e216cb182d2f0e114d7838bb1f1c7f1102e85 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 9 May 2019 10:12:42 +0200
Subject: [PATCH 2/2] Do not follow zero edges in cycle detection (PR
 gcov-profile/90380).

gcc/ChangeLog:

2019-05-10  Martin Liska  

	PR gcov-profile/90380
	* gcov.c (handle_cycle): Do not support zero cycle count,
	it should not be possible.
	(path_contains_zero_cycle_arc): New function.
	(circuit): Ignore zero cycle arc counts.
---
 gcc/gcov.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 6bcd2b23748..b06a6714c2e 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -696,7 +696,7 @@ handle_cycle (const arc_vector_t , int64_t )
   for (unsigned i = 0; i < edges.size (); i++)
 edges[i]->cs_count -= cycle_count;
 
-  gcc_assert (cycle_count >= 0);
+  gcc_assert (cycle_count > 0);
 }
 
 /* Unblock a block U from BLOCKED.  Apart from that, iterate all blocks
@@ -722,6 +722,17 @@ unblock (const block_info *u, block_vector_t ,
 unblock (*it, blocked, block_lists);
 }
 
+/* Return true when PATH contains a zero cycle arc count.  */
+
+static bool
+path_contains_zero_cycle_arc (arc_vector_t &path)
+{
+  for (unsigned i = 0; i < path.size (); i++)
+if (path[i]->cs_count == 0)
+  return true;
+  return false;
+}
+
 /* Find circuit going to block V, PATH is provisional seen cycle.
BLOCKED is vector of blocked vertices, BLOCK_LISTS contains vertices
blocked by a block.  COUNT is accumulated count of the current LINE.
@@ -742,7 +753,9 @@ circuit (block_info *v, arc_vector_t , block_info *start,
   for (arc_info *arc = v->succ; arc; arc = arc->succ_next)
 {
   block_info *w = arc->dst;
-  if (w < start || !linfo.has_block (w))
+  if (w < start
+	  || arc->cs_count == 0
+	  || !linfo.has_block (w))
 	continue;
 
   path.push_back (arc);
@@ -752,7 +765,8 @@ circuit (block_info *v, arc_vector_t , block_info *start,
 	  handle_cycle (path, count);
 	  loop_found = true;
 	}
-  else if (find (blocked.begin (), blocked.end (), w) == blocked.end ())
+  else if (!path_contains_zero_cycle_arc (path)
+	   &&  find (blocked.begin (), blocked.end (), w) == blocked.end ())
 	loop_found |= circuit (w, path, start, blocked, block_lists, linfo,
 			   count);
 
@@ -765,7 +779,9 @@ circuit (block_info *v, arc_vector_t , block_info *start,
 for (arc_info *arc = v->succ; arc; arc = arc->succ_next)
   {
 	block_info *w = arc->dst;
-	if (w < start || !linfo.has_block (w))
+	if (w < start
+	|| arc->cs_count == 0
+	|| !linfo.has_block (w))
 	  continue;
 
 	size_t index
-- 
2.21.0
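
For readers without the gcov sources at hand, the new predicate can be sketched standalone as below. The arc struct and the name path_has_zero_arc are simplified, hypothetical stand-ins for gcov's arc_info and path_contains_zero_cycle_arc. The sketch assumes that a cycle's count is the smallest arc count on its path, so a path containing a zero-count arc could only produce a cycle count of zero.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified arc; gcov's arc_info carries more state.
struct arc
{
  std::int64_t count;
};

// Return true when any arc on PATH has a zero count.
bool path_has_zero_arc (const std::vector<const arc *> &path)
{
  for (const arc *a : path)
    if (a->count == 0)
      return true;
  return false;
}
```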

Re: Make vector iterator operators hidden friends

2019-05-10 Thread Jonathan Wakely

On Thu, 9 May 2019 at 06:49, François Dumont wrote:
>
> Hi
>
>  Patch similar to the one I just applied for the deque iterator, including
> the NRVO copy elision fix.
>
>  * include/bits/stl_bvector.h
>  (operator==(const _Bit_iterator_base&, const _Bit_iterator_base&)):
>  Make hidden friend.
>  (operator<(const _Bit_iterator_base&, const _Bit_iterator_base&)):
>  Likewise.
>  (operator!=(const _Bit_iterator_base&, const _Bit_iterator_base&)):
>  Likewise.
>  (operator>(const _Bit_iterator_base&, const _Bit_iterator_base&)):
>  Likewise.
>  (operator<=(const _Bit_iterator_base&, const _Bit_iterator_base&)):
>  Likewise.
>  (operator>=(const _Bit_iterator_base&, const _Bit_iterator_base&)):
>  Likewise.
>  (operator-(const _Bit_iterator_base&, const _Bit_iterator_base&)):
>  Likewise.
>  (_Bit_iterator::operator+(difference_type)): Likewise and allow NRVO
>  copy elision.
>  (_Bit_iterator::operator-(difference_type)): Likewise.
>  (operator+(ptrdiff_t, const _Bit_iterator&)): Make hidden friend.
>  (_Bit_const_iterator::operator+(difference_type)): Likewise and allow
>  NRVO copy elision.
>  (_Bit_const_iterator::operator-(difference_type)): Likewise.
>  (operator+(ptrdiff_t, const _Bit_const_iterator&)): Make hidden friend.

These const_iterator overloads seem to be missing the NRVO fix.
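
For anyone following along, here is a minimal sketch of the two techniques named in the ChangeLog, using a hypothetical index_iterator rather than the real _Bit_iterator or _Bit_const_iterator. The operators are defined as friends inside the class, so they are found only by argument-dependent lookup, and operator+ returns a named local so the copy can be elided.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an iterator; not the libstdc++ type.
struct index_iterator
{
  std::ptrdiff_t pos = 0;

  index_iterator& operator+= (std::ptrdiff_t n)
  {
    pos += n;
    return *this;
  }

  // Hidden friend: declared and defined inside the class.
  friend bool operator== (const index_iterator& a, const index_iterator& b)
  { return a.pos == b.pos; }

  friend index_iterator operator+ (const index_iterator& it, std::ptrdiff_t n)
  {
    index_iterator tmp = it;
    tmp += n;
    return tmp;   // returns the local by name, so NRVO can apply
  }
};
```

Writing "return tmp += n;" instead would return the result of operator+=, which is a reference, and the compiler would then have to copy from it rather than elide.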

Re: Deque code cleanup and optimizations

2019-05-10 Thread Jonathan Wakely

This seems generally OK, but ...

On Fri, 10 May 2019, 05:59 François Dumont wrote:
>  I remove several _M_map != nullptr checks cause in current
> implementation it can't be null. I have several patches following this
> one to support it but in this case we will be using a different code path.

You can't remove those checks. If _M_map can ever be null now or in
the future, then we need the checks. Otherwise code compiled today
would break if passed a deque compiled with a future GCC that allows
the map to be null.

I'm curious how you plan to support it though, I don't think it's
possible without an ABI break.

>  (_Deque_base::_Deque_impl::_M_move_impl()): Remove _M_impl._M_map
> check.

_M_move_impl and the constructor that calls it can be removed
completely, because https://cplusplus.github.io/LWG/issue2593 means
that the same allocator can still be used after moving from it. That
function only exists to handle the case where an allocator changes
value after being moved from.

[DWARF] dwarf2out cleanups

2019-05-10 Thread Nathan Sidwell

When poking at some dwarf bugs, which were ultimately fixed by Richard, 
I made a couple of cleanups.  The first two are pretty obvious comment 
clarification, but the last fragment is a more invasive control flow 
change.  In that case we essentially repeat an inline-static-var check 
in two places.  I've replaced that with a bool to control whether we 
should splice.  That at least helped me understand the code better, and 
should give the optimizer better visibility to simplify the generated 
control flow.
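
A rough model of that control flow change, reduced to plain booleans. The names (Action, decide_old, decide_new, var_at_cu, has_spec_ref) are made up for this sketch and only mirror the shape of the gen_member_die fragment, not its real data structures:

```cpp
#include <cassert>

enum class Action { Reparent, Splice };

// Old shape: the "variable DIE parented at the CU" test appears twice.
Action decide_old(bool var_at_cu, bool has_spec_ref) {
  bool ref = false;
  if (var_at_cu)
    ref = has_spec_ref;

  if (var_at_cu && !ref)
    return Action::Reparent;
  else
    return Action::Splice;
}

// New shape: the test appears once and a bool records whether to splice.
Action decide_new(bool var_at_cu, bool has_spec_ref) {
  bool splice = true;
  if (var_at_cu) {
    bool ref = has_spec_ref;
    if (!ref)
      splice = false;  // the child gets reparented instead
  }
  return splice ? Action::Splice : Action::Reparent;
}
```

Both functions agree on all four input combinations, which is the property the cleanup relies on.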


tested on x86_64-linux, ok?

nathan

--
Nathan Sidwell
2019-05-10  Nathan Sidwell  

	* dwarf2out.c (break_out_comdat_types): Move comment to correct
	piece of code.
	(const_ok_for_output_1): Balance parens around #if/#else/#endif.
	(gen_member_die): Move abstract origin check earlier.  Only VARs
	can be static_inline_p.  Simplify splicing control flow.

Index: dwarf2out.c
===
--- dwarf2out.c	(revision 271062)
+++ dwarf2out.c	(working copy)
@@ -8576,11 +8576,12 @@ break_out_comdat_types (dw_die_ref die)
 /* Break out nested types into their own type units.  */
 break_out_comdat_types (c);
 
-/* Create a new type unit DIE as the root for the new tree, and
-   add it to the list of comdat types.  */
+/* Create a new type unit DIE as the root for the new tree.  */
 unit = new_die (DW_TAG_type_unit, NULL, NULL);
 add_AT_unsigned (unit, DW_AT_language,
  get_AT_unsigned (comp_unit_die (), DW_AT_language));
+
+	/* Add the new unit's type DIE into the comdat type list.  */
 type_node = ggc_cleared_alloc ();
 type_node->root_die = unit;
 type_node->next = comdat_type_list;
@@ -14509,11 +14510,10 @@ const_ok_for_output_1 (rtx rtl)
 		"non-delegitimized UNSPEC %s (%d) found in variable location",
 		((XINT (rtl, 1) >= 0 && XINT (rtl, 1) < NUM_UNSPEC_VALUES)
 		 ? unspec_strings[XINT (rtl, 1)] : "unknown"),
-		XINT (rtl, 1));
 #else
 		"non-delegitimized UNSPEC %d found in variable location",
-		XINT (rtl, 1));
 #endif
+		XINT (rtl, 1));
   expansion_failed (NULL_TREE, rtl,
 			"UNSPEC hasn't been delegitimized.\n");
   return false;
@@ -25161,19 +25161,20 @@ gen_member_die (tree type, dw_die_ref co
 			 context_die);
 }
 
-  /* Now output info about the data members and type members.  */
+  /* Now output info about the members. */
   for (member = TYPE_FIELDS (type); member; member = DECL_CHAIN (member))
 {
+  /* Ignore clones.  */
+  if (DECL_ABSTRACT_ORIGIN (member))
+	continue;
+
   struct vlr_context vlr_ctx = { type, NULL_TREE };
   bool static_inline_p
-	= (TREE_STATIC (member)
+	= (VAR_P (member)
+	   && TREE_STATIC (member)
 	   && (lang_hooks.decls.decl_dwarf_attribute (member, DW_AT_inline)
 	   != -1));
 
-  /* Ignore clones.  */
-  if (DECL_ABSTRACT_ORIGIN (member))
-	continue;
-
   /* If we thought we were generating minimal debug info for TYPE
 	 and then changed our minds, some of the member declarations
 	 may have already been defined.  Don't define them again, but
@@ -25183,11 +25184,14 @@ gen_member_die (tree type, dw_die_ref co
 	{
 	  /* Handle inline static data members, which only have in-class
 	 declarations.  */
-	  dw_die_ref ref = NULL; 
+	  bool splice = true;
+
+	  dw_die_ref ref = NULL;
 	  if (child->die_tag == DW_TAG_variable
 	  && child->die_parent == comp_unit_die ())
 	{
 	  ref = get_AT_ref (child, DW_AT_specification);
+
 	  /* For C++17 inline static data members followed by redundant
 		 out of class redeclaration, we might get here with
 		 child being the DIE created for the out of class
@@ -25206,17 +25210,17 @@ gen_member_die (tree type, dw_die_ref co
 		  ref = NULL;
 		  static_inline_p = false;
 		}
-	}
 
-	  if (child->die_tag == DW_TAG_variable
-	  && child->die_parent == comp_unit_die ()
-	  && ref == NULL)
-	{
-	  reparent_child (child, context_die);
-	  if (dwarf_version < 5)
-		child->die_tag = DW_TAG_member;
+	  if (!ref)
+		{
+		  reparent_child (child, context_die);
+		  if (dwarf_version < 5)
+		child->die_tag = DW_TAG_member;
+		  splice = false;
+		}
 	}
-	  else
+
+	  if (splice)
 	splice_child_die (context_die, child);
 	}

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Jakub Jelinek

On Fri, May 10, 2019 at 11:38:50AM +0000, Wilco Dijkstra wrote:
> > __builtin_eh_return is a noreturn call, and we never tail call optimize
> > noreturn calls:
> >  if (flags & ECF_NORETURN)
> >    {
> >  maybe_complain_about_tail_call (exp, "callee does not return");
> >  return false;
> >    }
> > As both the __builtin_eh_return and other tail calls are points in the CFG
> > that are followed by EXIT only, it doesn't make sense to talk about
> > tailcalls ignoring __builtin_eh_return.  The tailcall is in one epilogue in
> > the function, __builtin_eh_return is in another epilogue.  We do want that
> > add sp, sp, x4 in the eh return epilogue, not in the tail call epilogue.
> 
> Thanks that makes sense, so this change would work fine. 
> 
> However I think almost all targets are affected by this bug - I tried 4 random
> backends and all had code similar to AArch64:
> 
>   if (crtl->calls_eh_return)
> {
>   rtx sa = EH_RETURN_STACKADJ_RTX;
>   emit_insn (gen_add3_insn (sp_reg_rtx, sp_reg_rtx, sa));
> }
>  ... with sibcall case handled after this...
> 
> And no code in _function_ok_for_sibcall to block tailcalls...
> 
> So given this and the fact there is no real gain from emitting tailcalls in
> eh_return functions, wouldn't it make more sense to always block tailcalls in
> the mid-end?

No.  E.g. x86 handles that just fine, the EH_RETURN_STACKADJ_RTX addition is
guarded with style == 2 argument to ix86_expand_epilogue, which is only set
in eh_return epilogue, not in normal epilogue or sibcall_epilogue.
That seems to be the cleanest approach to me, although guarding it with
!sibcall or similar as done in this patch should be enough for most of the
targets that have the problem; of course there needs to be a way to let
targets disallow tail calls in __builtin_eh_return functions, ATM there is
not an easy one, because crtl->calls_eh_return is set only during expansion
and in the case of
if (cond)
  return tail_call (...);
...
__builtin_eh_return (...);
the tail call is expanded before crtl->calls_eh_return is set.
The question is if we want to replace crtl->calls_eh_return with
cfun->calls_eh_return and where we should set it (say in builtins folding
and propagate through inlining), or some pass should do that?
Can't find a good match ATM though, perhaps just the tailcall pass should
check that, like below (untested):

--- gcc/tree-tailcall.c.jj  2019-05-08 19:04:58.947797821 +0200
+++ gcc/tree-tailcall.c 2019-05-10 14:16:57.203930630 +0200
@@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.
 #include "common/common-target.h"
 #include "ipa-utils.h"
 #include "tree-ssa-live.h"
+#include "memmodel.h"
+#include "emit-rtl.h"
 
 /* The file implements the tail recursion elimination.  It is also used to
analyze the tail calls in general, passing the results to the rtl level
@@ -1171,6 +1173,25 @@ tree_optimize_tail_calls_1 (bool opt_tai
   if (phis_constructed)
 mark_virtual_operands_for_renaming (cfun);
 
+  /* If there are any tail calls, check if the function has any
+ __builtin_eh_return calls, as some targets can't handle tail calls
+ in functions that call __builtin_eh_return.  */
+  if (cfun->tail_call_marked)
+{
+  basic_block bb;
+
+  FOR_EACH_BB_FN (bb, cfun)
+if (EDGE_COUNT (bb->succs) == 0)
+ {
+   gimple *stmt = last_stmt (bb);
+   if (stmt && gimple_call_builtin_p (stmt, BUILT_IN_EH_RETURN))
+ {
+   crtl->calls_eh_return = 1;
+   break;
+ }
+ }
+}
+
   if (changed)
 return TODO_cleanup_cfg | TODO_update_ssa_only_virtuals;
   return 0;
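
As an aside, the x86 approach described above (emit the EH_RETURN_STACKADJ_RTX addition only in the eh_return epilogue) can be modelled roughly as follows. This is a toy sketch, not backend code: EpilogueKind, expand_epilogue and the instruction strings are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical epilogue kinds; x86 distinguishes them via a "style" argument.
enum class EpilogueKind { Normal, Sibcall, EhReturn };

const std::string kStackAdjust = "add sp, sp, <stackadj>";

// Returns a symbolic list of the instructions an epilogue would contain.
std::vector<std::string> expand_epilogue(EpilogueKind kind) {
  std::vector<std::string> insns;
  insns.push_back("restore callee-saved registers");
  insns.push_back("restore sp");
  // Only the eh_return epilogue applies the extra stack adjustment.
  if (kind == EpilogueKind::EhReturn)
    insns.push_back(kStackAdjust);
  insns.push_back(kind == EpilogueKind::Sibcall ? "b <callee>" : "ret");
  return insns;
}

bool has_stack_adjust(const std::vector<std::string>& insns) {
  return std::find(insns.begin(), insns.end(), kStackAdjust) != insns.end();
}
```

With the adjustment keyed on the epilogue kind rather than on a per-function flag, a sibcall epilogue in the same function never sees it.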


Jakub

[PATCH] Fix missing next_value_id initialization

2019-05-10 Thread Richard Biener



Also, the value_id for inserted calls was uninitialized.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied to trunk and 
branch.

Richard.

2019-05-10  Richard Biener  

* tree-ssa-sccvn.c (visit_reference_op_call): Initialize value-id.
(do_rpo_vn): Initialize next_value_id.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 271001)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -4059,6 +4059,7 @@ visit_reference_op_call (tree lhs, gcall
   vr2->hashcode = vr1.hashcode;
   vr2->result = lhs;
   vr2->result_vdef = vdef_val;
+  vr2->value_id = 0;
   slot = valid_info->references->find_slot_with_hash (vr2, vr2->hashcode,
  INSERT);
   gcc_assert (!*slot);
@@ -6467,6 +6468,7 @@ do_rpo_vn (function *fn, edge entry, bit
   unsigned region_size = (((unsigned HOST_WIDE_INT)n * num_ssa_names)
  / (n_basic_blocks_for_fn (fn) - NUM_FIXED_BLOCKS));
   VN_TOP = create_tmp_var_raw (void_type_node, "vn_top");
+  next_value_id = 1;
 
   vn_ssa_aux_hash = new hash_table  (region_size * 2);
   gcc_obstack_init (&vn_ssa_aux_obstack);

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Wilco Dijkstra

Hi Jakub,

> __builtin_eh_return is a noreturn call, and we never tail call optimize
> noreturn calls:
>  if (flags & ECF_NORETURN)
>    {
>  maybe_complain_about_tail_call (exp, "callee does not return");
>  return false;
>    }
> As both the __builtin_eh_return and other tail calls are points in the CFG
> that are followed by EXIT only, it doesn't make sense to talk about
> tailcalls ignoring __builtin_eh_return.  The tailcall is in one epilogue in
> the function, __builtin_eh_return is in another epilogue.  We do want that
> add sp, sp, x4 in the eh return epilogue, not in the tail call epilogue.

Thanks that makes sense, so this change would work fine. 

However I think almost all targets are affected by this bug - I tried 4 random
backends and all had code similar to AArch64:

  if (crtl->calls_eh_return)
{
  rtx sa = EH_RETURN_STACKADJ_RTX;
  emit_insn (gen_add3_insn (sp_reg_rtx, sp_reg_rtx, sa));
}
 ... with sibcall case handled after this...

And no code in _function_ok_for_sibcall to block tailcalls...

So given this and the fact there is no real gain from emitting tailcalls in
eh_return functions, wouldn't it make more sense to always block tailcalls in
the mid-end?

Wilco

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Jakub Jelinek

On Fri, May 10, 2019 at 10:25:58AM +0000, Wilco Dijkstra wrote:
> Hi Jakub,
> 
> 27ec:   912143ff        add sp, sp, #0x850
> 27f0:   8b2463ff        add sp, sp, x4
> 27f4:   1400b   23c8 <_Unwind_RaiseException>
> 
> > This does a lot of register saving and restoring, which is not needed but is
> > not wrong-code (guess separate shrink wrapping would help here if
> > implemented for the target).  The only wrong-code is actually the
> > add sp, sp, x4 instruction though.  The previous instruction restored sp to
> > the value it had at the start of the function and then we should just tail
> > call.  This instruction is something that is needed in the spot where
> > __builtin_eh_return is emitted.
> 
> The issue seems to be that x4 isn't initialized correctly for tailcalls. However
> doesn't that mean tailcalls will ignore __builtin_eh_return? Or does the midend
> block any tailcalls that are reachable from __builtin_eh_return?

__builtin_eh_return is a noreturn call, and we never tail call optimize
noreturn calls:
  if (flags & ECF_NORETURN)
{
  maybe_complain_about_tail_call (exp, "callee does not return");
  return false;
}
As both the __builtin_eh_return and other tail calls are points in the CFG
that are followed by EXIT only, it doesn't make sense to talk about
tailcalls ignoring __builtin_eh_return.  The tailcall is in one epilogue in
the function, __builtin_eh_return is in another epilogue.  We do want that
add sp, sp, x4 in the eh return epilogue, not in the tail call epilogue.

Jakub

Re: [PATCH] netbsd EABI support

2019-05-10 Thread Richard Earnshaw (lists)

On 10/05/2019 00:33, co...@sdf.org wrote:
> On Tue, Apr 09, 2019 at 05:36:47PM +0100, Richard Earnshaw (lists) wrote:
>>> So we're well into stage4 which means technically it's too late for
>>> something like this.  However, given it's limited scope I won't object
>>> if the ARM port maintainers want to go forward.  Otherwise I'll queue it
>>> for gcc-10.
>>>
>>> jeff
>>>
>>
>> I was about to approve this (modulo removing the now obsolete
>> FPU_DEFAULT macro), until I noticed that it also modifies the generic
>> NetBSD code as well.  I'm certainly not willing to approve that myself
>> at this late stage, but if one of the NetBSD OS maintainers wants to
>> step up and do so, I'll happily take the Arm back-end code as that's not
>> a primary or secondary target.
>>
>> R.
> 
> Congrats on a new release :-)
> ping
> 

I was hoping to get a comment from one of the netbsd port maintainers on
the changes to the common NetBSD code.  Are they no longer active?

R.

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Wilco Dijkstra

Hi,

> Would a shift_truncation_mask (rtx op, machine_mode mode) as replacement
> for SHIFT_COUNT_TRUNCATED and the old hook make more sense than just
> relying on the mode?  That would be a way to alleviate the first counter
> argument to SHIFT_COUNT_TRUNCATED ("not flexible enough").

Well I think it would be a good idea to fix the issue once and for all in a way
that works for most targets. I think the best option is to have extra ops that
assume masking of the shift count. If the target supports the optab, the
optimizer can then change (x << (y & 31)) into (x <<& y) where <<& is the
shift+mask operator.
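
To make the semantics concrete, here is a small sketch of what such a shift+mask operation would compute for a 32-bit value. shl_masked is a name invented for this example; it is not an existing optab or builtin:

```cpp
#include <cassert>
#include <cstdint>

// What a hypothetical "<<&" operator would mean for a 32-bit operand:
// the shift count is reduced modulo 32 before shifting.  At the source
// level the mask is required, since shifting a 32-bit value by 32 or
// more is undefined behaviour in C and C++.
std::uint32_t shl_masked(std::uint32_t x, unsigned y) {
  return x << (y & 31u);
}
```

On a target whose shift instruction already ignores the upper bits of the count, the AND is redundant and the whole expression maps onto a single instruction.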

> That said, I don't intend to pursue the proposal given the opinions so
> far - after all it would only save a few lines in our backend and some
> others.  I would find it useful to more clearly document the current
> methods as deprecated since they are apparently not useful for any of
> the major targets.

Well I do think it is worth sorting out. It will be more effort to get something
that works for most targets, but it's a long standing issue that shouldn't
really cause so many issues and a lot of extra work for all targets.

Wilco

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Wilco Dijkstra

Hi Jakub,

27ec:   912143ff        add sp, sp, #0x850
27f0:   8b2463ff        add sp, sp, x4
27f4:   1400b   23c8 <_Unwind_RaiseException>

> This does a lot of register saving and restoring, which is not needed but is
> not wrong-code (guess separate shrink wrapping would help here if
> implemented for the target).  The only wrong-code is actually the
> add sp, sp, x4 instruction though.  The previous instruction restored sp to
> the value it had at the start of the function and then we should just tail
> call.  This instruction is something that is needed in the spot where
> __builtin_eh_return is emitted.

The issue seems to be that x4 isn't initialized correctly for tailcalls. However
doesn't that mean tailcalls will ignore __builtin_eh_return? Or does the midend
block any tailcalls that are reachable from __builtin_eh_return?

Wilco

[PATCH][AArch64] PR tree-optimization/90332: Implement vec_init<M><N> where N is a vector mode

2019-05-10 Thread Kyrill Tkachov


Hi all,

This patch fixes the failing gcc.dg/vect/slp-reduc-sad-2.c testcase on
aarch64 by implementing a vec_init optab that can handle two half-width
vectors producing a full-width one by concatenating them.

In the gcc.dg/vect/slp-reduc-sad-2.c case it's a V8QI reg concatenated
with a V8QI const_vector of zeroes.
This can be implemented efficiently using the aarch64_combinez pattern
that just loads a D-register to make use of the implicit zero-extending
semantics of that load.
Otherwise it concatenates the two vectors using aarch64_simd_combine.

With this patch I'm seeing the effect from richi's original patch that
added gcc.dg/vect/slp-reduc-sad-2.c on aarch64, and 525.x264_r improves
by about 1.5%.
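
In scalar terms, the two cases look roughly like this. It is only a sketch of the lane semantics in little-endian lane order: Half, Full, combine and combinez are names chosen for the example and do not correspond to the RTL patterns one to one.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Half = std::array<std::uint8_t, 8>;   // models a 64-bit vector of bytes
using Full = std::array<std::uint8_t, 16>;  // models a 128-bit vector of bytes

// General case: the low half comes from LO, the high half from HI.
Full combine(const Half& lo, const Half& hi) {
  Full out{};
  for (std::size_t i = 0; i < lo.size(); ++i) {
    out[i] = lo[i];
    out[i + lo.size()] = hi[i];
  }
  return out;
}

// Special case: the high half is all zeroes, which is what writing only
// the low 64 bits of the register gives for free.
Full combinez(const Half& lo) {
  return combine(lo, Half{});
}
```

The big-endian variant in the patch swaps which operand carries the zeroes; that is not modelled here.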

Bootstrapped and tested on aarch64-none-linux-gnu. Also tested on 
aarch64_be-none-elf.


Ok for trunk?
Thanks,
Kyrill

2019-05-10  Kyrylo Tkachov  

    PR tree-optimization/90332
    * config/aarch64/aarch64.c (aarch64_expand_vector_init):
    Handle VALS containing two vectors.
    * config/aarch64/aarch64-simd.md (*aarch64_combinez): Rename
    to...
    (@aarch64_combinez): ... This.
    (*aarch64_combinez_be): Rename to...
    (@aarch64_combinez_be): ... This.
    (vec_init): New define_expand.
    * config/aarch64/iterators.md (Vhalf): Handle V8HF.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 8e1afe1704490cf399ba46e68acc8131cc932259..7d917584d5bce612ce2ebff68fd3e67b57e0d403 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3211,7 +3211,7 @@
 ;; In this insn, operand 1 should be low, and operand 2 the high part of the
 ;; dest vector.
 
-(define_insn "*aarch64_combinez"
+(define_insn "@aarch64_combinez"
   [(set (match_operand: 0 "register_operand" "=w,w,w")
 	(vec_concat:
 	  (match_operand:VDC 1 "general_operand" "w,?r,m")
@@ -3225,7 +3225,7 @@
(set_attr "arch" "simd,fp,simd")]
 )
 
-(define_insn "*aarch64_combinez_be"
+(define_insn "@aarch64_combinez_be"
   [(set (match_operand: 0 "register_operand" "=w,w,w")
 (vec_concat:
 	  (match_operand:VDC 2 "aarch64_simd_or_scalar_imm_zero")
@@ -5954,6 +5954,15 @@
   DONE;
 })
 
+(define_expand "vec_init"
+  [(match_operand:VQ_NO2E 0 "register_operand" "")
+   (match_operand 1 "" "")]
+  "TARGET_SIMD"
+{
+  aarch64_expand_vector_init (operands[0], operands[1]);
+  DONE;
+})
+
 (define_insn "*aarch64_simd_ld1r"
   [(set (match_operand:VALL_F16 0 "register_operand" "=w")
 	(vec_duplicate:VALL_F16
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 0c2c17ed8269923723d066b250974ee1ff423d26..52c933cfdac20c5c566c13ae2528f039efda4c46 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15075,6 +15075,43 @@ aarch64_expand_vector_init (rtx target, rtx vals)
   rtx v0 = XVECEXP (vals, 0, 0);
   bool all_same = true;
 
+  /* This is a special vec_init<M><N> where N is not an element mode but a
+ vector mode with half the elements of M.  We expect to find two entries
+ of mode N in VALS and we must put their concatenation into TARGET.  */
+  if (XVECLEN (vals, 0) == 2 && VECTOR_MODE_P (GET_MODE (XVECEXP (vals, 0, 0))))
+{
+  rtx lo = XVECEXP (vals, 0, 0);
+  rtx hi = XVECEXP (vals, 0, 1);
+  machine_mode narrow_mode = GET_MODE (lo);
+  gcc_assert (GET_MODE_INNER (narrow_mode) == inner_mode);
+  gcc_assert (narrow_mode == GET_MODE (hi));
+
+  /* When we want to concatenate a half-width vector with zeroes we can
+	 use the aarch64_combinez[_be] patterns.  Just make sure that the
+	 zeroes are in the right half.  */
+  if (BYTES_BIG_ENDIAN
+	  && aarch64_simd_imm_zero (lo, narrow_mode)
+	  && general_operand (hi, narrow_mode))
+	emit_insn (gen_aarch64_combinez_be (narrow_mode, target, hi, lo));
+  else if (!BYTES_BIG_ENDIAN
+	   && aarch64_simd_imm_zero (hi, narrow_mode)
+	   && general_operand (lo, narrow_mode))
+	emit_insn (gen_aarch64_combinez (narrow_mode, target, lo, hi));
+  else
+	{
+	  /* Else create the two half-width registers and combine them.  */
+	  if (!REG_P (lo))
+	lo = force_reg (GET_MODE (lo), lo);
+	  if (!REG_P (hi))
+	hi = force_reg (GET_MODE (hi), hi);
+
+	  if (BYTES_BIG_ENDIAN)
+	std::swap (lo, hi);
+	  emit_insn (gen_aarch64_simd_combine (narrow_mode, target, lo, hi));
+	}
+ return;
+   }
+
   /* Count the number of variable elements to initialise.  */
   for (int i = 0; i < n_elts; ++i)
 {
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index d59d72833c527da80af1ac8dd7264c8dd86047c7..c495e76b3b73e3a19603330989121fce51c3fe35 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -771,6 +771,7 @@
 ;; Half modes of all vector modes, in lower-case.
 (define_mode_attr Vhalf [(V8QI "v4qi")  (V16QI "v8qi")
 			 (V4HI "v2hi")  (V8HI  "v4hi")
+			 (V8HF  "v4hf")
 			 (V2SI "si")(V4SI  "v2si")
 			 (V2DI "di")(V2SF  "sf")
 			 (V4SF "v2sf")  (V2DF

[PATCH][OBVIOUS] Fix a plural in a param description.

2019-05-10 Thread Martin Liška

Hi.

One obvious fix seen by Bernhard.

Martin

gcc/ChangeLog:

2019-05-10  Martin Liska  

* params.def (PARAM_GIMPLE_FE_COMPUTED_HOT_BB_THRESHOLD):
Fix plural form.
---
 gcc/params.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/gcc/params.def b/gcc/params.def
index 23b8743786c..6b7f7eb5bae 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1436,7 +1436,7 @@ DEFPARAM(PARAM_LOOP_VERSIONING_MAX_OUTER_INSNS,
 DEFPARAM(PARAM_GIMPLE_FE_COMPUTED_HOT_BB_THRESHOLD,
 	 "gimple-fe-computed-hot-bb-threshold",
 	 "The number of executions of a basic block which is considered hot."
-	 " The parameters is used only in GIMPLE FE.",
+	 " The parameter is used only in GIMPLE FE.",
 	 0, 0, 0)
 
 /*

Re: [PATCH][RFC] Come up with TARGET_HAS_FAST_MEMPCPY_ROUTINE (PR middle-end/90263).

2019-05-10 Thread Jakub Jelinek

On Fri, May 10, 2019 at 11:04:12AM +0200, Martin Liška wrote:
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -1906,6 +1906,9 @@ typedef struct ix86_args {
>  
>  #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
>  
> +/* C library provides fast implementation of mempcpy function.  */
> +#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
> +

1) we shouldn't be adding further target macros, but target hooks
2) I don't think this is a property of the x86 target, but of x86 glibc,
   so you should set it on x86 glibc only (i.e. i?86/x86_64 linux and hurd
   when using glibc, not newlib, nor bionic/android, nor uclibc, nor musl)

> --- a/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy.c
> @@ -56,9 +56,8 @@ main_test (void)
>if (__builtin_mempcpy (p, "ABCDE", 6) != p + 6 || memcmp (p, "ABCDE", 6))
>  abort ();
>  
> -  /* If the result of mempcpy is ignored, gcc should use memcpy.
> - This should be optimized always, so set inside_main again.  */
> -  inside_main = 1;
> +  /* Set inside main in order to not abort because of usage of mempcpy.  */
> +  inside_main = 0;
>mempcpy (p + 5, s3, 1);
>if (memcmp (p, "ABCDEFg", 8))
>  abort ();

Why this?  Do you mean mempcpy is called here, even when the lhs is unused?
We should be calling memcpy in that case.

Jakub

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Robin Dapp

>> Bit tests on x86 also truncate [1], if the bit base operand specifies
>> a register, and we don't use BT with a memory location as a bit base.
>> I don't know what is referred with "(real or pretended) bit field
>> operations" in the documentation for SHIFT_COUNT_TRUNCATED:
>>
>>  However, on some machines, such as the 80386 and the 680x0,
>>  truncation only applies to shift operations and not the (real or
>>  pretended) bit-field operations.  Define 'SHIFT_COUNT_TRUNCATED' to
>>
>> Vector shifts don't truncate on x86, so x86 probably shares the same
>> destiny with MIPS. Maybe a machine mode argument could be passed to
>> SHIFT_COUNT_TRUNCATED to distinguish modes that truncate from modes
>> that don't.
> 
> But IL semantic differences based on mode is even worse.  What happens
> if STV then substitutes a vector register/op but you previously optimized
> with the assumption the count would be truncated since the shift was SImode?
> 
> IMHO a recipe for disaster.

Would a shift_truncation_mask (rtx op, machine_mode mode) as replacement
for SHIFT_COUNT_TRUNCATED and the old hook make more sense than just
relying on the mode?  That would be a way to alleviate the first counter
argument to SHIFT_COUNT_TRUNCATED ("not flexible enough").  Richard's
argument about the premature optimization however, I can agree on and I
don't see a way to avoid this.  Doesn't it apply to other middle-end
optimizations as well, though, and is handled via additional flags that
prohibit selected follow-up optimizations?

The problem with my proposed unification is that the hook is/will be called
by RTL as well as tree passes and would need to have an argument type
that can hold all of their respective shift-like operations in order to
return a useful mask.  My local version uses an rtx that only contains
an rtx_code which is filled in an ad-hoc fashion for the tree passes.

That said, I don't intend to pursue the proposal given the opinions so
far - after all it would only save a few lines in our backend and some
others.  I would find it useful to more clearly document the current
methods as deprecated since they are apparently not useful for any of
the major targets.

Regards
 Robin

[PATCH][RFC] Come up with TARGET_HAS_FAST_MEMPCPY_ROUTINE (PR middle-end/90263).

2019-05-10 Thread Martin Liška

Hi.

This is an updated version of the patch I've sent here:
https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00149.html

The patch introduces a new target macro that drives how we expand
mempcpy.  On a target with TARGET_HAS_FAST_MEMPCPY_ROUTINE == 1,
we do not expand using memcpy, but using mempcpy.
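
For readers less familiar with mempcpy, the default expansion (a memcpy libcall plus an adjustment of the return value) is equivalent to the sketch below. mempcpy_via_memcpy is a name made up for the example; only std::memcpy is a real library call here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// mempcpy returns a pointer just past the last byte written, whereas
// memcpy returns the destination pointer itself.  This helper models
// "call memcpy, then add the length to the result".
void* mempcpy_via_memcpy(void* dest, const void* src, std::size_t n) {
  return static_cast<char*>(std::memcpy(dest, src, n)) + n;
}
```

When the result is unused, the adjustment disappears and a plain memcpy call is all that remains.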

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

@Wilco: Can you please come up with a test-case for aarch64?

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-05-09  Martin Liska  

PR middle-end/90263
* builtins.c (expand_builtin_memory_copy_args): When having a
target with fast mempcpy implementation do not use memcpy.
* config/i386/i386.h (TARGET_HAS_FAST_MEMPCPY_ROUTINE): New.
* defaults.h (TARGET_HAS_FAST_MEMPCPY_ROUTINE): By default
target does not have fast mempcpy routine.
* doc/tm.texi: Document the new hook.
* doc/tm.texi.in: Likewise.
* expr.h (emit_block_move_hints): Add 2 new arguments.
* expr.c (emit_block_move_hints): Bail out when libcall
to memcpy would be used.

gcc/testsuite/ChangeLog:

2019-05-09  Martin Liska  

* gcc.c-torture/execute/builtins/mempcpy.c: Use mempcpy
without LHS.
---
 gcc/builtins.c | 18 --
 gcc/config/i386/i386.h |  3 +++
 gcc/defaults.h |  7 +++
 gcc/doc/tm.texi|  5 +
 gcc/doc/tm.texi.in |  5 +
 gcc/expr.c | 13 -
 gcc/expr.h |  4 +++-
 .../gcc.c-torture/execute/builtins/mempcpy.c   |  5 ++---
 8 files changed, 53 insertions(+), 7 deletions(-)


diff --git a/gcc/builtins.c b/gcc/builtins.c
index d37d73fc4a0..09d5b540ae8 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3839,6 +3839,8 @@ expand_builtin_memory_copy_args (tree dest, tree src, tree len,
   unsigned HOST_WIDE_INT max_size;
   unsigned HOST_WIDE_INT probable_max_size;
 
+  bool is_move_done;
+
   /* If DEST is not a pointer type, call the normal function.  */
   if (dest_align == 0)
 return NULL_RTX;
@@ -3888,11 +3890,23 @@ expand_builtin_memory_copy_args (tree dest, tree src, tree len,
   if (CALL_EXPR_TAILCALL (exp)
   && (retmode == RETURN_BEGIN || target == const0_rtx))
 method = BLOCK_OP_TAILCALL;
-  if (retmode == RETURN_END && target != const0_rtx)
+  if (TARGET_HAS_FAST_MEMPCPY_ROUTINE
+  && retmode == RETURN_END
+  && target != const0_rtx)
 method = BLOCK_OP_NO_LIBCALL_RET;
   dest_addr = emit_block_move_hints (dest_mem, src_mem, len_rtx, method,
  expected_align, expected_size,
- min_size, max_size, probable_max_size);
+ min_size, max_size, probable_max_size,
+ TARGET_HAS_FAST_MEMPCPY_ROUTINE
+ && retmode == RETURN_END,
+ &is_move_done);
+
+  /* Bail out when a mempcpy call would be expanded as libcall and when
+ we have a target that provides a fast implementation
+ of mempcpy routine.  */
+  if (!is_move_done)
+return NULL_RTX;
+
   if (dest_addr == pc_rtx)
 return NULL_RTX;
 
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 3fee779296f..7d20178f432 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1906,6 +1906,9 @@ typedef struct ix86_args {
 
 #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2)
 
+/* C library provides fast implementation of mempcpy function.  */
+#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 1
+
 /* Define if shifts truncate the shift count which implies one can
omit a sign-extension or zero-extension of a shift count.
 
diff --git a/gcc/defaults.h b/gcc/defaults.h
index b7534256119..eca19d1977f 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1348,6 +1348,13 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define SET_RATIO(speed) MOVE_RATIO (speed)
 #endif
 
+/* By default do not generate libcall to mempcpy and rather use
+   libcall to memcpy and adjustment of return value.  */
+
+#ifndef TARGET_HAS_FAST_MEMPCPY_ROUTINE
+#define TARGET_HAS_FAST_MEMPCPY_ROUTINE 0
+#endif
+
 /* Supply a default definition of STACK_SAVEAREA_MODE for emit_stack_save.
Normally move_insn, so Pmode stack pointer.  */
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8c8978bb13a..cf709dfb843 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6735,6 +6735,11 @@ optimized for speed rather than size.
 If you don't define this, it defaults to the value of @code{MOVE_RATIO}.
 @end defmac
 
+@defmac TARGET_HAS_FAST_MEMPCPY_ROUTINE
+By default do not generate libcall to mempcpy and rather use
+libcall to memcpy and adjustment of return value.
+@end defmac
+
 @defmac USE_LOAD_POST_INCREMENT (@var{mode})
 A C expression used to determine whether a load postincrement is a good
 thing to use for a given mode.  Defaults to the

[PATCH] True IPA reimplementation of IPA-SRA

2019-05-10 Thread Martin Jambor

Hello,

this is a follow-up from a WIP patch I sent here in late December:
https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01765.html

Just like the last time, the patch below is a reimplementation of
IPA-SRA to make it a full IPA pass that can handle strongly connected
components in the call-graph, can take advantage of LTO and does not
weirdly switch functions in pass-pipeline like our current quasi-IPA SRA
does.  Unlike the current IPA-SRA it can also remove return values, even
in SCCs.  On the other hand, it is less powerful when it comes to
structures passed by reference.  By design it will not create references
to bits of an aggregate because that turned out to be just obfuscation
in practice.  However, it also cannot usually split aggregates passed by
reference that are just passed to another function (where splitting
would be useful) because it cannot perform the same TBAA analysis as
the current implementation, which already knows what types it should look
at because it has access to bodies of all functions it attempts to modify.
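
As a reminder of the kind of transformation involved, here is a hand-written before/after pair for the simplest case, an aggregate passed by value of which only some fields are used. Rect, area_before and area_after are hypothetical names; this illustrates the idea and is not output of the pass.

```cpp
#include <cassert>

// Hypothetical aggregate with a field the callee never reads.
struct Rect {
  int width;
  int height;
  int unused_tag;
};

// Before: the whole aggregate is passed although only two fields are used.
int area_before(Rect r) {
  return r.width * r.height;
}

// After: the aggregate parameter is replaced by the scalars actually used,
// and every call site is rewritten to pass r.width and r.height.
int area_after(int width, int height) {
  return width * height;
}
```

Doing this as a real IPA pass means the decision is made with the whole call graph in view, so callers and callees in an SCC can be adjusted consistently.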

Since the last time I have fixed a number of bugs that Martin Liška
found when compiling a portion of openSUSE with the patch, removed all
the FIXMEs, made long living memory structures more compact and
self-reviewed the entire patch once.

Therefore, I would like to ask for a review and eventually for an
approval to commit the patch to the trunk.  The patch survives
bootstrap, LTO bootstrap and LTO profiledbootstrap on x86_64-linux.  In
the testsuite, it "fixes" 24 guality passes (all LTO ones) but breaks 12
other ones (one is non-LTO).  I would welcome any help with addressing
these.  Because the patch removes the old IPA-SRA, the input to the IPA
pipeline looks different and so I could not just try to make it "process
the debug statements like before."

Because the patch is big I had to compress it to get it through to
gcc-patches.  Because of its size and because it contains a completely
new file ipa-sra.c and total reimplementations of
ipa-param-manipulation.[ch], I have also pushed my development branch to
branch jamborm/ipa-sra on the GCC git server
(https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/jamborm/ipa-sra).
It may be more convenient to check it out and review it that way.

Thanks in advance for any questions, comments and suggestions,

Martin



2019-05-09  Martin Jambor  

* coretypes.h (cgraph_edge): Declare.
* ipa-param-manipulation.c: Rewrite.
* ipa-param-manipulation.h: Likewise.
* Makefile.in (GTFILES): Added ipa-param-manipulation.h and ipa-sra.c.
(OBJS): Added ipa-sra.o.
* cgraph.h (ipa_replace_map): Removed fields old_tree, replace_p
and ref_p, added fields param_adjustments and performed_splits.
(struct cgraph_clone_info): Remove args_to_skip and
combined_args_to_skip, new field param_adjustments.
(cgraph_node::create_clone): Changed parameters to use
ipa_param_adjustments.
(cgraph_node::create_virtual_clone): Likewise.
(cgraph_node::create_virtual_clone_with_body): Likewise.
(tree_function_versioning): Likewise.
(cgraph_build_function_type_skip_args): Removed.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Convert to
using ipa_param_adjustments.
(clone_of_p): Likewise.
* cgraphclones.c (cgraph_build_function_type_skip_args): Removed.
(build_function_decl_skip_args): Likewise.
(duplicate_thunk_for_node): Adjust parameters using
ipa_param_body_adjustments, copy param_adjustments instead of
args_to_skip.
(cgraph_node::create_clone): Convert to using ipa_param_adjustments.
(cgraph_node::create_virtual_clone): Likewise.
(cgraph_node::create_version_clone_with_body): Likewise.
(cgraph_materialize_clone): Likewise.
(symbol_table::materialize_all_clones): Likewise.
* ipa-fnsummary.c (ipa_fn_summary_t::duplicate): Simplify
ipa_replace_map check.
* ipa-cp.c (get_replacement_map): Do not initialize removed fields.
(initialize_node_lattices): Make aware that some parameters might have
already been removed.
(want_remove_some_param_p): New function.
(create_specialized_node): Convert to using ipa_param_adjustments and
deal with possibly pre-existing adjustments.
* lto-cgraph.c (output_cgraph_opt_summary_p): Likewise.
(output_node_opt_summary): Do not stream removed fields.  Stream
parameter adjustments instead of arguments to skip.
(input_node_opt_summary): Likewise.
* lto-section-in.c (lto_section_name): Added ipa-sra section.
* lto-streamer.h (lto_section_type): Likewise.
* tree-inline.h (copy_body_data): New field killed_new_ssa_names.
(copy_decl_to_var): Declare.
* tree-inline.c (update_clone_info): Do not remap old_tree.

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Uros Bizjak

On Fri, May 10, 2019 at 9:42 AM Richard Biener
 wrote:
>
> On Fri, May 10, 2019 at 9:25 AM Uros Bizjak  wrote:
> >
> > On Fri, May 10, 2019 at 9:10 AM Richard Biener
> >  wrote:
> > >
> > > On Fri, May 10, 2019 at 12:44 AM Uros Bizjak  wrote:
> > > >
> > > > >> Even SHIFT_COUNT_TRUNCATED is no perfect match to what our hardware
> > > > >> does because we always only consider the last 6 bits of a shift
> > > > >> operand.
> > > > >>
> > > > >> Despite all the warnings in the other backends, most notably
> > > > >> SHIFT_COUNT_TRUNCATED being "discouraged" as mentioned in riscv.h, I
> > > > >> wrote the attached tentative patch.  It's a little ad-hoc, uses the
> > > > >> SHIFT_COUNT_TRUNCATED paths only if shift_truncation_mask () != 0 and,
> > > > >> instead of truncating via & (GET_MODE_BITSIZE (mode) - 1), it applies
> > > > >> the mask returned by shift_truncation_mask.  Doing so, usage of both
> > > > >> "methods" actually reduces to a single way.
> > > > > The main reason it's discouraged is because some targets have insns
> > > > > where the count would be truncated and others where it would not.   So
> > > > > for example normal shifts might truncate, but vector shifts might not
> > > > > (mips), or shifts might truncate but bit tests do not (x86).
> > > >
> > > > Bit tests on x86 also truncate [1], if the bit base operand specifies
> > > > a register, and we don't use BT with a memory location as a bit base.
> > > > I don't know what is referred to by "(real or pretended) bit field
> > > > operations" in the documentation for SHIFT_COUNT_TRUNCATED:
> > > >
> > > >  However, on some machines, such as the 80386 and the 680x0,
> > > >  truncation only applies to shift operations and not the (real or
> > > >  pretended) bit-field operations.  Define 'SHIFT_COUNT_TRUNCATED' to
> > > >
> > > > Vector shifts don't truncate on x86, so x86 probably shares the same
> > > > destiny with MIPS. Maybe a machine mode argument could be passed to
> > > > SHIFT_COUNT_TRUNCATED to distinguish modes that truncate from modes
> > > > that don't.
> > >
> > > But IL semantic differences based on mode is even worse.  What happens
> > > if STV then substitutes a vector register/op but you previously optimized
> > > with the assumption the count would be truncated since the shift was
> > > SImode?
> >
> > I have removed support to STV shifts with dynamic count just because
> > of the issue you mentioned. It is not possible to substitute
> > truncating shift with a non-truncating one in STV, so there is IMO no
> > problem with the proposed idea.
> >
> > BTW: We do implement the removal of useless count argument masking
> > (for shifts and bit test insns) with several define_insn_and_split
> > patterns that allow combine to create compound insn (please grep for
> > "Avoid useless masking" in i386.md). However, the generic handling
> > would probably be more effective, since the implemented approach
> > doesn't remove masking in cases where the masked argument is used in
> > several shift instructions.
>
> But that's more a combine limitation than a reason to go for the
> "hidden" IL semantic change.  But yes, if the AND is used by
> non-masking insns then it's likely cheap enough to retain it.
>
> If the masking were always in place (combined with the shift
> if a suitable insn exists) then STV handling should be possible,
> it just would need to split the insn to do the masking and then the shift
> (of course that might not be very profitable).

Unfortunately, STV substitutes a register in one go, so if we have

or $cnt, $0x12345
shl $reg, $cnt

the sequence gets converted to:


por $xmm_cnt, $xmm_imm
psll $xmm_reg, $xmm_cnt

which is not the same; we would need to mask $xmm_cnt in between the insns.
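
A minimal C model of the difference, under the assumption that the scalar
shift is a 32-bit x86 SHL (count taken modulo 32) and the vector shift
behaves like one 32-bit lane of PSLLD (counts above 31 give zero).  The
function names are made up for this sketch and do not exist in GCC:

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit scalar SHL: only the low 5 bits of the count
   are used by the hardware.  */
static uint32_t scalar_shl32 (uint32_t value, uint64_t count)
{
  return value << (count & 31);
}

/* Model of one 32-bit lane of a vector shift such as PSLLD: a count
   above 31 clears the lane instead of being truncated.  */
static uint32_t vector_lane_shl32 (uint32_t value, uint64_t count)
{
  return count > 31 ? 0 : value << count;
}

/* The count from the example above: cnt | 0x12345, here with cnt == 0.  */
static void check_shift_models (void)
{
  uint64_t count = 0 | 0x12345;

  /* 0x12345 & 31 == 5, so the scalar form shifts by 5.  */
  assert (scalar_shl32 (1, count) == 32);
  /* The unmasked vector form sees a count above 31 and yields 0.  */
  assert (vector_lane_shl32 (1, count) == 0);
  /* Masking the count first makes the two forms agree again.  */
  assert (vector_lane_shl32 (1, count & 31) == 32);
}
```

The last assertion corresponds to the explicit masking insn that would
have to be emitted between the two vector insns.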

Uros.

> Richard.
>
> > Uros.
> >
> > > IMHO a recipe for disaster.
> > >
> > > Richard.
> > >
> > > > [1] https://www.felixcloutier.com/x86/bt
> > > >
> > > > Uros.

[PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Jakub Jelinek

Hi!

My recent patch for tail call improvement apparently affects also the
_Unwind_Resume_or_Rethrow function in libgcc:

_Unwind_Reason_Code __attribute__((optimize ("no-omit-frame-pointer")))
_Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
{
  struct _Unwind_Context this_context, cur_context;
  _Unwind_Reason_Code code;
  unsigned long frames;
  if (exc->private_1 == 0)
    return _Unwind_RaiseException (exc);
  do { __builtin_unwind_init (); uw_init_context_1 (&this_context,
    __builtin_dwarf_cfa (), __builtin_return_address (0)); } while (0);
  cur_context = this_context;
  code = _Unwind_ForcedUnwind_Phase2 (exc, &cur_context, &frames);
  ((void)(!(code == _URC_INSTALL_CONTEXT) ? abort (), 0 : 0));
  do { long offset = uw_install_context_1 ((&this_context), (&cur_context));
    void *handler = uw_frob_return_addr ((&this_context), (&cur_context));
    _Unwind_DebugHook ((&cur_context)->cfa, handler); ; __builtin_eh_return
    (offset, handler); } while (0);
}

Previously, the mere existence of the addressable variables this_context
and cur_context prevented tail call on the early out
return _Unwind_RaiseException (exc);
but since r271013 the tailcall analysis figures that while those two
variables are there, they aren't touched before the possible tail call
site, so they can't be really live during the call.
The problem is that this is one of the few functions that call
__builtin_eh_return, which user code normally doesn't use; we use it just
for the unwinder routines, and so some targets (in this case aarch64, in
another report powerpc-darwin) show that the combination of
cfun->calls_eh_return + tail calls has not really been tested.
One possibility is to do if (cfun->calls_eh_return) return false; in
the target hook *_ok_for_sibcall, the other possibility is to fix that
case properly.

The following patch fixes the aarch64 case.
The code emitted for the code path from the start of the function
till the tail call was:
2778 <_Unwind_Resume_or_Rethrow>:
2778:   d12143ff   sub sp, sp, #0x850
277c:   a9007bfd   stp x29, x30, [sp]
2780:   910003fd   mov x29, sp
2784:   a90107e0   stp x0, x1, [sp, #16]
2788:   f9400801   ldr x1, [x0, #16]
278c:   a9020fe2   stp x2, x3, [sp, #32]
2790:   a90353f3   stp x19, x20, [sp, #48]
2794:   aa0003f3   mov x19, x0
2798:   a9045bf5   stp x21, x22, [sp, #64]
279c:   a90563f7   stp x23, x24, [sp, #80]
27a0:   a9066bf9   stp x25, x26, [sp, #96]
27a4:   a90773fb   stp x27, x28, [sp, #112]
27a8:   6d0827e8   stp d8, d9, [sp, #128]
27ac:   6d092fea   stp d10, d11, [sp, #144]
27b0:   6d0a37ec   stp d12, d13, [sp, #160]
27b4:   6d0b3fee   stp d14, d15, [sp, #176]
27b8:   b5000201   cbnz x1, 27f8 <_Unwind_Resume_or_Rethrow+0x80>
27bc:   a9407bfd   ldp x29, x30, [sp]
27c0:   a94107e0   ldp x0, x1, [sp, #16]
27c4:   a9420fe2   ldp x2, x3, [sp, #32]
27c8:   a94353f3   ldp x19, x20, [sp, #48]
27cc:   a9445bf5   ldp x21, x22, [sp, #64]
27d0:   a94563f7   ldp x23, x24, [sp, #80]
27d4:   a9466bf9   ldp x25, x26, [sp, #96]
27d8:   a94773fb   ldp x27, x28, [sp, #112]
27dc:   6d4827e8   ldp d8, d9, [sp, #128]
27e0:   6d492fea   ldp d10, d11, [sp, #144]
27e4:   6d4a37ec   ldp d12, d13, [sp, #160]
27e8:   6d4b3fee   ldp d14, d15, [sp, #176]
27ec:   912143ff   add sp, sp, #0x850
27f0:   8b2463ff   add sp, sp, x4
27f4:   1400b   23c8 <_Unwind_RaiseException>
27f4: R_AARCH64_JUMP26  _Unwind_RaiseException
This does a lot of register saving and restoring, which is not needed but is
not wrong-code (guess separate shrink wrapping would help here if
implemented for the target).  The only wrong-code is actually the
add sp, sp, x4 instruction though.  The previous instruction restored sp to
the value it had at the start of the function and then we should just tail
call.  This instruction is something that is needed in the spot where
__builtin_eh_return is emitted.
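
The effect on the stack pointer can be modelled with a toy C function.
This is only a sketch of the reasoning above, not the aarch64 backend
code; struct toy_frame, toy_epilogue_sp and the skip_adj_for_sibcall
switch are invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy description of a function frame; all fields are illustrative.  */
struct toy_frame
{
  long frame_size;        /* bytes allocated by the prologue */
  long eh_stackadj;       /* value the EH stack adjustment register holds */
  bool calls_eh_return;   /* function contains __builtin_eh_return */
};

/* Return sp after the epilogue, given sp at function entry.
   skip_adj_for_sibcall selects the behaviour with the fix applied.  */
static long toy_epilogue_sp (const struct toy_frame *f, long entry_sp,
                             bool for_sibcall, bool skip_adj_for_sibcall)
{
  long sp = entry_sp - f->frame_size;   /* sp while the body runs */
  sp += f->frame_size;                  /* add sp, sp, #frame_size */
  if (f->calls_eh_return && !(for_sibcall && skip_adj_for_sibcall))
    sp += f->eh_stackadj;               /* add sp, sp, x4 */
  return sp;
}
```

With the switch off, a sibcall epilogue leaves sp displaced by whatever
the adjustment register happens to contain; with it on, the tail call
sees the same sp the function was entered with, while the real
__builtin_eh_return path still applies the adjustment.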

Fixed thusly, bootstrapped/regtested on aarch64-linux, ok for trunk?

2019-05-10  Jakub Jelinek  

PR c++/59813
* config/aarch64/aarch64.c (aarch64_expand_epilogue): Don't add
EH_RETURN_STACKADJ_RTX to sp in sibcall epilogues.

--- gcc/config/aarch64/aarch64.c.jj 2019-05-02 12:18:40.004979690 +0200
+++ gcc/config/aarch64/aarch64.c    2019-05-09 20:08:00.774718003 +0200
@@ -5913,7 +5913,7 @@ aarch64_expand_epilogue (bool for_sibcal
 }
 
   /* Stack adjustment for exception handler.  */
-  if (crtl->calls_eh_return)
+  if (crtl->calls_eh_return && !for_sibcall)
 {

Patch ping^2 (was Re: [C++ PATCH] Fix up C++ loop construct debug info without -gno-statement-frontiers (PR debug/90197))

2019-05-10 Thread Jakub Jelinek

On Fri, May 03, 2019 at 09:10:39AM +0200, Jakub Jelinek wrote:
> On Fri, Apr 26, 2019 at 05:45:27PM +0200, Jakub Jelinek wrote:
> > 2019-04-26  Jakub Jelinek  
> > 
> > PR debug/90197
> > * cp-gimplify.c (genericize_cp_loop): Emit a DEBUG_BEGIN_STMT
> > before the condition (or if missing or constant non-zero at the end
> > of the loop).  Emit a DEBUG_BEGIN_STMT before the increment expression
> > if any.
> 
> I'd like to ping this patch for trunk.  Thanks.

Ping.

Jakub

[committed] Adjust store_merging_29.c testcase (PR tree-optimization/88709, PR tree-optimization/90271)

2019-05-10 Thread Jakub Jelinek

Hi!

Some arm options are unfortunately the only exception, where the store_merge
predicate is true even on a target with STRICT_ALIGNMENT; in that case we
can't really do what the testcase wants to test, namely an unaligned store
covering the 3 separate stores.  There are other testcases that already have
to apply arm workarounds.

Fixed thusly, committed as obvious to trunk.

2019-05-10  Jakub Jelinek  

PR tree-optimization/88709
PR tree-optimization/90271
* gcc.dg/store_merging_29.c: Allow 4 stores to replace 6 stores on
arm*-*-*.

--- gcc/testsuite/gcc.dg/store_merging_29.c.jj  2019-05-06 23:49:33.378041182 +0200
+++ gcc/testsuite/gcc.dg/store_merging_29.c 2019-05-09 16:27:49.053535179 +0200
@@ -2,8 +2,8 @@
 /* { dg-do run { target int32 } } */
 /* { dg-require-effective-target store_merge } */
 /* { dg-options "-O2 -fdump-tree-store-merging-details" } */
-/* { dg-final { scan-tree-dump "New sequence of 3 stores to replace old one of 6 stores" "store-merging" { target le } } } */
-/* { dg-final { scan-tree-dump "New sequence of \[34] stores to replace old one of 6 stores" "store-merging" { target be } } } */
+/* { dg-final { scan-tree-dump "New sequence of 3 stores to replace old one of 6 stores" "store-merging" { target { le && { ! arm*-*-* } } } } } */
+/* { dg-final { scan-tree-dump "New sequence of \[34] stores to replace old one of 6 stores" "store-merging" { target { be || { arm*-*-* } } } } } */
 
 struct T { char a[1024]; };
 

Jakub

Re: [Patch, fortran] ISO_Fortran_binding PRs 90093, 90352 & 90355

2019-05-10 Thread Paul Richard Thomas

Committed to trunk as revision 271057.

Will do likewise with 9-branch asap.

Cheers

Paul

On Wed, 8 May 2019 at 19:40, Paul Richard Thomas
 wrote:
>
> Unless there are any objections to this patch, I plan to commit to
> trunk and 9-branch tomorrow night, with the change to the testcase
> pointed out by Dominique.
>
> I sincerely hope that will be the end of CFI PRs for a little while,
> at least. I have a load of pending patches and want to get on with
> fixing PDTs.
>
> Cheers
>
> Paul
>
> On Mon, 6 May 2019 at 19:59, Paul Richard Thomas
>  wrote:
> >
> > It helps to attach the patch!
> >
> > On Mon, 6 May 2019 at 19:57, Paul Richard Thomas
> >  wrote:
> > >
> > > Unfortunately, this patch was still in the making at the release of
> > > 9.1. It is more or less self explanatory with the ChangeLogs.
> > >
> > > It should be noted that gfc_conv_expr_present could not be used in the
> > > fix for PR90093 because the passed descriptor is a CFI type. Instead,
> > > the test is for a null pointer passed.
> > >
> > > The changes to trans-array.c(gfc_trans_create_temp_array) have an eye
> > > on the future, as well as PR90355. I am progressing towards the point
> > > where all descriptors have 'span' set correctly so that
> > > trans.c(get_array_span) can be eliminated and much of the code in the
> > > callers can be simplified.
> > >
> > > Bootstrapped and regtested on FC29/x86_64 - OK for trunk and 9-branch?
> > >
> > > Paul
> > >
> > > 2019-05-06  Paul Thomas  
> > >
> > > PR fortran/90093
> > > * trans-decl.c (convert_CFI_desc): Test that the dummy is
> > > present before doing any of the conversions.
> > >
> > > PR fortran/90352
> > > * decl.c (gfc_verify_c_interop_param): Restore the error for
> > > charlen > 1 actual arguments passed to bind(C) procs.
> > > Clean up trailing white space.
> > >
> > > PR fortran/90355
> > > * trans-array.c (gfc_trans_create_temp_array): Set the 'span'
> > > field to the element length for all types.
> > > (gfc_conv_expr_descriptor): The force_no_tmp flag is used to
> > > prevent temporary creation, especially for substrings.
> > > * trans-decl.c (gfc_trans_deferred_vars): Rather than assert
> > > that the backend decl for the string length is non-null, use it
> > > as a condition before calling gfc_trans_vla_type_sizes.
> > > * trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): 'force_no_tmp'
> > > is set before calling gfc_conv_expr_descriptor.
> > > * trans.c (get_array_span): Move the code for extracting 'span'
> > > from gfc_build_array_ref to this function. This is specific to
> > > descriptors that are component and indirect references.
> > > * trans.h : Add the force_no_tmp flag bitfield to gfc_se.
> > >
> > > 2019-05-06  Paul Thomas  
> > >
> > > PR fortran/90093
> > > * gfortran.dg/ISO_Fortran_binding_12.f90: New test.
> > > * gfortran.dg/ISO_Fortran_binding_12.c: Supplementary code.
> > >
> > > PR fortran/90352
> > > * gfortran.dg/iso_c_binding_char_1.f90: New test.
> > >
> > > PR fortran/90355
> > > * gfortran.dg/ISO_Fortran_binding_4.f90: Add 'substr' to test
> > > the direct passing of substrings as descriptors to bind(C).
> > > * gfortran.dg/assign_10.f90: Increase the tree_dump count of
> > > 'atmp' to account for the setting of the 'span' field.
> > > * gfortran.dg/transpose_optimization_2.f90: Ditto.
>
>
>
> --
> "If you can't explain it simply, you don't understand it well enough"
> - Albert Einstein



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein

[committed] Fix up C++ and ObjC gtfiles (PR pch/90326)

2019-05-10 Thread Jakub Jelinek

Hi!

The following testcase failed with C++ but succeeded with C.
The problem was that c-family/c-cppbuiltin.c was registered for GTY
only for C, ObjC and ObjC++, but not for C++, so
lazy_hex_fp_values[...].hex_str was garbage after reading the PCH file.

The first hunk fixes that; on the other hand, c-lex.c doesn't have any GTY
markers and is not a gtfile for any other C family language.
I've then compared the C, C++, ObjC and ObjC++ lists and found ObjC was
missing c-format.c; the ObjC++ change is just to make sure c-cppbuiltin.c
isn't mentioned twice after adding it to C++, because that is the only
FE that generates gtfiles using scripting instead of a full list.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk as
obvious, will backport to 9.2 later.

2019-05-10  Jakub Jelinek  

PR pch/90326
cp/
* config-lang.in (gtfiles): Remove c-family/c-lex.c, add
c-family/c-cppbuiltin.c.
objc/
* config-lang.in (gtfiles): Add c-family/c-format.c.
objcp/
* config-lang.in (gtfiles): Don't add c-family/c-cppbuiltin.c.
testsuite/
* g++.dg/pch/pr90326.C: New test.
* g++.dg/pch/pr90326.Hs: New file.

--- gcc/cp/config-lang.in.jj    2019-01-01 12:37:46.850487770 +0100
+++ gcc/cp/config-lang.in   2019-05-09 14:54:13.210293195 +0200
@@ -37,7 +37,7 @@ gtfiles="\
 \$(srcdir)/c-family/c-pragma.h \$(srcdir)/cp/decl.h \
 \$(srcdir)/cp/parser.h \
 \$(srcdir)/c-family/c-common.c \$(srcdir)/c-family/c-format.c \
-\$(srcdir)/c-family/c-lex.c \$(srcdir)/c-family/c-pragma.c \
+\$(srcdir)/c-family/c-cppbuiltin.c \$(srcdir)/c-family/c-pragma.c \
 \$(srcdir)/cp/call.c \$(srcdir)/cp/class.c \$(srcdir)/cp/constexpr.c \
 \$(srcdir)/cp/cp-gimplify.c \
 \$(srcdir)/cp/cp-lang.c \$(srcdir)/cp/cp-objcp-common.c \
--- gcc/objc/config-lang.in.jj  2019-01-01 12:37:49.936437137 +0100
+++ gcc/objc/config-lang.in 2019-05-09 14:48:11.082368771 +0200
@@ -35,4 +35,4 @@ lang_requires="c"
 # Order is important.  If you change this list, make sure you test
 # building without C++ as well; that is, remove the gcc/cp directory,
 # and build with --enable-languages=c,objc.
-gtfiles="\$(srcdir)/objc/objc-map.h \$(srcdir)/c-family/c-objc.h 
\$(srcdir)/objc/objc-act.h \$(srcdir)/objc/objc-act.c 
\$(srcdir)/objc/objc-runtime-shared-support.c 
\$(srcdir)/objc/objc-gnu-runtime-abi-01.c 
\$(srcdir)/objc/objc-next-runtime-abi-01.c 
\$(srcdir)/objc/objc-next-runtime-abi-02.c \$(srcdir)/c/c-parser.h 
\$(srcdir)/c/c-parser.c \$(srcdir)/c/c-tree.h \$(srcdir)/c/c-decl.c 
\$(srcdir)/c/c-lang.h \$(srcdir)/c/c-objc-common.c 
\$(srcdir)/c-family/c-common.c \$(srcdir)/c-family/c-common.h 
\$(srcdir)/c-family/c-cppbuiltin.c \$(srcdir)/c-family/c-pragma.h 
\$(srcdir)/c-family/c-pragma.c"
+gtfiles="\$(srcdir)/objc/objc-map.h \$(srcdir)/c-family/c-objc.h 
\$(srcdir)/objc/objc-act.h \$(srcdir)/objc/objc-act.c 
\$(srcdir)/objc/objc-runtime-shared-support.c 
\$(srcdir)/objc/objc-gnu-runtime-abi-01.c 
\$(srcdir)/objc/objc-next-runtime-abi-01.c 
\$(srcdir)/objc/objc-next-runtime-abi-02.c \$(srcdir)/c/c-parser.h 
\$(srcdir)/c/c-parser.c \$(srcdir)/c/c-tree.h \$(srcdir)/c/c-decl.c 
\$(srcdir)/c/c-lang.h \$(srcdir)/c/c-objc-common.c 
\$(srcdir)/c-family/c-common.c \$(srcdir)/c-family/c-common.h 
\$(srcdir)/c-family/c-cppbuiltin.c \$(srcdir)/c-family/c-pragma.h 
\$(srcdir)/c-family/c-pragma.c \$(srcdir)/c-family/c-format.c"
--- gcc/objcp/config-lang.in.jj 2019-01-01 12:37:51.388413314 +0100
+++ gcc/objcp/config-lang.in    2019-05-09 14:49:18.714235493 +0200
@@ -52,7 +52,6 @@ gtfiles="$(. $srcdir/cp/config-lang.in ;
 gtfiles="$gtfiles \
 \$(srcdir)/objc/objc-act.h \
 \$(srcdir)/objc/objc-map.h \
-\$(srcdir)/c-family/c-cppbuiltin.c \
 \$(srcdir)/objc/objc-act.c \
 \$(srcdir)/objc/objc-gnu-runtime-abi-01.c \
 \$(srcdir)/objc/objc-next-runtime-abi-01.c \
--- gcc/testsuite/g++.dg/pch/pr90326.C.jj   2019-05-09 15:04:44.475697021 +0200
+++ gcc/testsuite/g++.dg/pch/pr90326.C  2019-05-09 15:07:15.145167975 +0200
@@ -0,0 +1,9 @@
+#include "pr90326.H"
+
+int main()
+{
+  float f = __FLT_MAX__;
+  if (f == 0.0)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/g++.dg/pch/pr90326.Hs.jj  2019-05-09 15:04:47.853640320 +0200
+++ gcc/testsuite/g++.dg/pch/pr90326.Hs 2019-05-09 15:07:22.296048802 +0200
@@ -0,0 +1,1 @@
+// empty

Jakub

Re: [PATCH][OBVIOUS] Reapply r269790 which was missed during rebase.

2019-05-10 Thread Martin Liška

On 5/10/19 9:26 AM, Jakub Jelinek wrote:
> On Fri, May 10, 2019 at 09:19:54AM +0200, Martin Liška wrote:
>> Hi.
>>
>> When I split i386.c, I forgot to rebase this patch. That caused a failure
>> of the gcc.target/i386/fpprec-1.c execution test.
>>
>> Thank you Jeff for reporting that.
>>
>> I'm going to install the patch.
> 
> Ok.  Have you verified all other i386.c changes since the time you did the
> splitting?
> 
>> 2019-05-10  Martin Liska  
>>
>>  * config/i386/i386-expand.c (ix86_expand_floorceildf_32):
>>  Reapply changes from r269790.
> 
>   Jakub
> 

I verified that now. Luckily, all the recent commits have test cases that cover them.

Martin

Re: [PATCH] Do not fold anything during copy_fn (PR c++/90383)

2019-05-10 Thread Jakub Jelinek

On Fri, May 10, 2019 at 09:32:00AM +0200, Richard Biener wrote:
> OK.  Note in general canonicalization might be necessary if
> we have any DECL_VALUE_EXPRs resolved.

We only resolve DECL_VALUE_EXPRs in gimple_regimplify_operands I believe,
during copy_fn we don't have any gimple statements plus id->regimplify
is false as well.  The only direct handling of DECL_VALUE_EXPR in
tree-inline.c is copying of their values to the copied decls (which we do
want even for copy_fn).
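
To make the distinction concrete, here is a toy tree copier in C, loosely
modelled on the idea of the do_not_fold flag.  It is not GCC's tree
representation; toy_node, toy_make and toy_copy are invented for this
sketch.  The copy always remaps declarations, and folds *&x to x only
when folding is allowed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum toy_code { TOY_DECL, TOY_ADDR, TOY_INDIRECT };

struct toy_node
{
  enum toy_code code;
  struct toy_node *op;   /* operand of TOY_ADDR / TOY_INDIRECT */
  int decl_id;           /* identifier, meaningful for TOY_DECL only */
};

static struct toy_node *toy_make (enum toy_code code, struct toy_node *op,
                                  int decl_id)
{
  struct toy_node *n = malloc (sizeof *n);
  if (!n)
    abort ();
  n->code = code;
  n->op = op;
  n->decl_id = decl_id;
  return n;
}

/* Copy T, remapping every decl id by REMAP_OFFSET.  Unless DO_NOT_FOLD
   is set, an INDIRECT of an ADDR is simplified to the inner operand.  */
static struct toy_node *toy_copy (const struct toy_node *t, int remap_offset,
                                  bool do_not_fold)
{
  if (t->code == TOY_DECL)
    return toy_make (TOY_DECL, NULL, t->decl_id + remap_offset);

  struct toy_node *op = toy_copy (t->op, remap_offset, do_not_fold);
  if (t->code == TOY_INDIRECT && op->code == TOY_ADDR && !do_not_fold)
    {
      struct toy_node *inner = op->op;
      free (op);
      return inner;
    }
  return toy_make (t->code, op, 0);
}
```

Copying with do_not_fold set preserves the shape of the original
expression, which is what a verbatim copy for later diagnosis needs; only
the declarations change.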

Jakub

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Richard Biener

On Fri, May 10, 2019 at 9:25 AM Uros Bizjak  wrote:
>
> On Fri, May 10, 2019 at 9:10 AM Richard Biener
>  wrote:
> >
> > On Fri, May 10, 2019 at 12:44 AM Uros Bizjak  wrote:
> > >
> > > > >> Even SHIFT_COUNT_TRUNCATED is no perfect match to what our hardware
> > > > >> does because we always only consider the last 6 bits of a shift
> > > > >> operand.
> > > > >>
> > > > >> Despite all the warnings in the other backends, most notably
> > > > >> SHIFT_COUNT_TRUNCATED being "discouraged" as mentioned in riscv.h, I
> > > > >> wrote the attached tentative patch.  It's a little ad-hoc, uses the
> > > > >> SHIFT_COUNT_TRUNCATED paths only if shift_truncation_mask () != 0 and,
> > > > >> instead of truncating via & (GET_MODE_BITSIZE (mode) - 1), it applies
> > > > >> the mask returned by shift_truncation_mask.  Doing so, usage of both
> > > > >> "methods" actually reduces to a single way.
> > > > > The main reason it's discouraged is because some targets have insns
> > > > > where the count would be truncated and others where it would not.   So
> > > > > for example normal shifts might truncate, but vector shifts might not
> > > > > (mips), or shifts might truncate but bit tests do not (x86).
> > >
> > > Bit tests on x86 also truncate [1], if the bit base operand specifies
> > > a register, and we don't use BT with a memory location as a bit base.
> > > I don't know what is referred to by "(real or pretended) bit field
> > > operations" in the documentation for SHIFT_COUNT_TRUNCATED:
> > >
> > >  However, on some machines, such as the 80386 and the 680x0,
> > >  truncation only applies to shift operations and not the (real or
> > >  pretended) bit-field operations.  Define 'SHIFT_COUNT_TRUNCATED' to
> > >
> > > Vector shifts don't truncate on x86, so x86 probably shares the same
> > > destiny with MIPS. Maybe a machine mode argument could be passed to
> > > SHIFT_COUNT_TRUNCATED to distinguish modes that truncate from modes
> > > that don't.
> >
> > But IL semantic differences based on mode is even worse.  What happens
> > if STV then substitutes a vector register/op but you previously optimized
> > with the assumption the count would be truncated since the shift was SImode?
>
> I have removed support to STV shifts with dynamic count just because
> of the issue you mentioned. It is not possible to substitute
> truncating shift with a non-truncating one in STV, so there is IMO no
> problem with the proposed idea.
>
> BTW: We do implement the removal of useless count argument masking
> (for shifts and bit test insns) with several define_insn_and_split
> patterns that allow combine to create compound insn (please grep for
> "Avoid useless masking" in i386.md). However, the generic handling
> would probably be more effective, since the implemented approach
> doesn't remove masking in cases where the masked argument is used in
> several shift instructions.

But that's more a combine limitation than a reason to go for the
"hidden" IL semantic change.  But yes, if the AND is used by
non-masking insns then it's likely cheap enough to retain it.

If the masking were always in place (combined with the shift
if a suitable insn exists) then STV handling should be possible,
it just would need to split the insn to do the masking and then the shift
(of course that might not be very profitable).

Richard.

> Uros.
>
> > IMHO a recipe for disaster.
> >
> > Richard.
> >
> > > [1] https://www.felixcloutier.com/x86/bt
> > >
> > > Uros.

Re: [PATCH] Fix another parloops reduction ICE (PR tree-optimization/90385)

2019-05-10 Thread Richard Biener

On Fri, 10 May 2019, Jakub Jelinek wrote:

> Hi!
> 
> Based on the single testcase we had I thought the rest of parloops would
> handle the exit PHIs with non-SSA_NAME arguments just fine, but this patch
> shows that is not the case and doesn't seem trivial to fix (just punting
> on the other ICE spot doesn't work).  As both the testcases are about
> massive -fno-* disabling of optimizations, I don't see a sufficient use case
> to actually support that, so this patch modifies the last change to punt
> instead of trying to support it.  If we find an important use case, anyone
> with sufficient motivation can change this again and add full support for
> that.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
> after a while for 9.2?

OK.

Richard.

> 2019-05-10  Jakub Jelinek  
> 
>   PR tree-optimization/90385
>   * tree-parloops.c (try_create_reduction_list): Punt on non-SSA_NAME
>   arguments of the exit phis.
> 
>   * gfortran.dg/pr90385.f90: New test.
> 
> --- gcc/tree-parloops.c.jj2019-05-03 15:22:07.0 +0200
> +++ gcc/tree-parloops.c   2019-05-09 11:33:19.238730902 +0200
> @@ -2794,8 +2794,16 @@ try_create_reduction_list (loop_p loop,
>gimple *reduc_phi;
>tree val = PHI_ARG_DEF_FROM_EDGE (phi, exit);
>  
> -  if (TREE_CODE (val) == SSA_NAME && !virtual_operand_p (val))
> +  if (!virtual_operand_p (val))
>   {
> +   if (TREE_CODE (val) != SSA_NAME)
> + {
> +   if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file,
> +  "  FAILED: exit PHI argument invariant.\n");
> +   return false;
> + }
> +
> if (dump_file && (dump_flags & TDF_DETAILS))
>   {
> fprintf (dump_file, "phi is ");
> --- gcc/testsuite/gfortran.dg/pr90385.f90.jj  2019-05-09 11:42:21.463573092 
> +0200
> +++ gcc/testsuite/gfortran.dg/pr90385.f90 2019-05-09 11:42:14.622688340 
> +0200
> @@ -0,0 +1,6 @@
> +! PR tree-optimization/90385
> +! { dg-do compile }
> +! { dg-require-effective-target pthread }
> +! { dg-options "-O1 -ftree-parallelize-loops=2 -fno-tree-ccp -fno-tree-ch 
> -fno-tree-copy-prop -fno-tree-forwprop -fno-tree-sink --param 
> parloops-min-per-thread=5" }
> +
> +include 'array_constructor_47.f90'
> 
>   Jakub
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

Re: [PATCH] Do not fold anything during copy_fn (PR c++/90383)

2019-05-10 Thread Richard Biener

On Fri, 10 May 2019, Jakub Jelinek wrote:

> Hi!
> 
> The following testcases are rejects-valid or wrong-code, because
> when we make copies of the function for constexpr evaluation purposes (the
> primary intent is to have the functions as is, with no folding whatsoever, so
> we diagnose everything), the inliner used under the hood to copy the function
> actually folds in some cases.  This folding is done there to canonicalize
> some cases (MEM_REFs, INDIRECT_REFs, ADDR_EXPRs), but such canonicalization
> is only really needed if we replace their operands by something different
> (e.g. replace a PARM_DECL for the corresponding value etc.).  When doing
> copy_fn, all we are changing is one set of decls for another set of decls
> of the same category.
> 
> The following patch avoids those foldings during copy_fn.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
> after a while to 9.2?

OK.  Note in general canonicalization might be necessary if
we have any DECL_VALUE_EXPRs resolved.

Thanks,
Richard.

> 2019-05-10  Jakub Jelinek  
> 
>   PR c++/90383
>   * tree-inline.h (struct copy_body_data): Add do_not_fold member.
>   * tree-inline.c (remap_gimple_op_r): Avoid folding expressions if
>   id->do_not_fold.
>   (copy_tree_body_r): Likewise.
>   (copy_fn): Set id.do_not_fold to true.
> 
>   * g++.dg/cpp1y/constexpr-90383-1.C: New test.
>   * g++.dg/cpp1y/constexpr-90383-2.C: New test.
> 
> --- gcc/tree-inline.h.jj  2019-05-08 19:04:58.947797821 +0200
> +++ gcc/tree-inline.h 2019-05-09 10:18:43.567373908 +0200
> @@ -113,6 +113,9 @@ struct copy_body_data
>/* True if trees may not be unshared.  */
>bool do_not_unshare;
>  
> +  /* True if trees should not be folded during the copying.  */
> +  bool do_not_fold;
> +
>/* True if new declarations may not be created during type remapping.  */
>bool prevent_decl_creation_for_types;
>  
> --- gcc/tree-inline.c.jj  2019-05-08 19:04:58.949797788 +0200
> +++ gcc/tree-inline.c 2019-05-09 10:41:49.691949208 +0200
> @@ -1101,7 +1101,7 @@ remap_gimple_op_r (tree *tp, int *walk_s
>/* Otherwise, just copy the node.  Note that copy_tree_r already
>knows not to copy VAR_DECLs, etc., so this is safe.  */
>  
> -  if (TREE_CODE (*tp) == MEM_REF)
> +  if (TREE_CODE (*tp) == MEM_REF && !id->do_not_fold)
>   {
> /* We need to re-canonicalize MEM_REFs from inline substitutions
>that can happen when a pointer argument is an ADDR_EXPR.
> @@ -1327,11 +1327,11 @@ copy_tree_body_r (tree *tp, int *walk_su
> tree type = TREE_TYPE (*tp);
> tree ptr = id->do_not_unshare ? *n : unshare_expr (*n);
> tree old = *tp;
> -   *tp = gimple_fold_indirect_ref (ptr);
> +   *tp = id->do_not_fold ? NULL : gimple_fold_indirect_ref (ptr);
> if (! *tp)
>   {
> type = remap_type (type, id);
> -   if (TREE_CODE (ptr) == ADDR_EXPR)
> +   if (TREE_CODE (ptr) == ADDR_EXPR && !id->do_not_fold)
>   {
> *tp
>   = fold_indirect_ref_1 (EXPR_LOCATION (ptr), type, ptr);
> @@ -1360,7 +1360,7 @@ copy_tree_body_r (tree *tp, int *walk_su
> return NULL;
>   }
>   }
> -  else if (TREE_CODE (*tp) == MEM_REF)
> +  else if (TREE_CODE (*tp) == MEM_REF && !id->do_not_fold)
>   {
> /* We need to re-canonicalize MEM_REFs from inline substitutions
>that can happen when a pointer argument is an ADDR_EXPR.
> @@ -1432,7 +1432,8 @@ copy_tree_body_r (tree *tp, int *walk_su
>  
> /* Handle the case where we substituted an INDIRECT_REF
>into the operand of the ADDR_EXPR.  */
> -   if (TREE_CODE (TREE_OPERAND (*tp, 0)) == INDIRECT_REF)
> +   if (TREE_CODE (TREE_OPERAND (*tp, 0)) == INDIRECT_REF
> +   && !id->do_not_fold)
>   {
> tree t = TREE_OPERAND (TREE_OPERAND (*tp, 0), 0);
> if (TREE_TYPE (t) != TREE_TYPE (*tp))
> @@ -6370,6 +6371,7 @@ copy_fn (tree fn, tree& parms, tree& res
>   since front-end specific mechanisms may rely on sharing.  */
>id.regimplify = false;
>id.do_not_unshare = true;
> +  id.do_not_fold = true;
>  
>/* We're not inside any EH region.  */
>id.eh_lp_nr = 0;
> --- gcc/testsuite/g++.dg/cpp1y/constexpr-90383-1.C.jj 2019-05-09 
> 10:49:10.222509867 +0200
> +++ gcc/testsuite/g++.dg/cpp1y/constexpr-90383-1.C2019-05-09 
> 10:48:46.538910236 +0200
> @@ -0,0 +1,15 @@
> +// PR c++/90383
> +// { dg-do compile { target c++14 } }
> +
> +struct alignas(8) A { constexpr A (bool x) : a(x) {} A () = delete; bool a; 
> };
> +struct B { A b; };
> +
> +constexpr bool
> +foo ()
> +{
> +  B w{A (true)};
> +  w.b = A (true);
> +  return w.b.a;
> +}
> +
> +static_assert (foo (), "");
> --- gcc/testsuite/g++.dg/cpp1y/constexpr-90383-2.C.jj 2019-05-09 
> 10:49:18.194375099 +0200
> +++

[PATCH] Fix another parloops reduction ICE (PR tree-optimization/90385)

2019-05-10 Thread Jakub Jelinek

Hi!

Based on the single testcase we had I thought the rest of parloops will
handle the exit PHIs with non-SSA_NAME arguments just fine, but this patch
shows that is not the case and doesn't seem trivial to fix (just punting
on the other ICE spot doesn't work).  As both the testcases are about
massive -fno-* disabling of optimizations, I don't see a sufficient use case
to actually support that, so this patch modifies the last change to punt
instead of trying to support it.  If we find an important use case, anyone
with sufficient motivation can change this again and add full support for
that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
after a while for 9.2?

2019-05-10  Jakub Jelinek  

PR tree-optimization/90385
* tree-parloops.c (try_create_reduction_list): Punt on non-SSA_NAME
arguments of the exit phis.

* gfortran.dg/pr90385.f90: New test.

--- gcc/tree-parloops.c.jj  2019-05-03 15:22:07.0 +0200
+++ gcc/tree-parloops.c 2019-05-09 11:33:19.238730902 +0200
@@ -2794,8 +2794,16 @@ try_create_reduction_list (loop_p loop,
   gimple *reduc_phi;
   tree val = PHI_ARG_DEF_FROM_EDGE (phi, exit);
 
-  if (TREE_CODE (val) == SSA_NAME && !virtual_operand_p (val))
+  if (!virtual_operand_p (val))
{
+ if (TREE_CODE (val) != SSA_NAME)
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file,
+"  FAILED: exit PHI argument invariant.\n");
+ return false;
+   }
+
  if (dump_file && (dump_flags & TDF_DETAILS))
{
  fprintf (dump_file, "phi is ");
--- gcc/testsuite/gfortran.dg/pr90385.f90.jj	2019-05-09 11:42:21.463573092 +0200
+++ gcc/testsuite/gfortran.dg/pr90385.f90	2019-05-09 11:42:14.622688340 +0200
@@ -0,0 +1,6 @@
+! PR tree-optimization/90385
+! { dg-do compile }
+! { dg-require-effective-target pthread }
+! { dg-options "-O1 -ftree-parallelize-loops=2 -fno-tree-ccp -fno-tree-ch -fno-tree-copy-prop -fno-tree-forwprop -fno-tree-sink --param parloops-min-per-thread=5" }
+
+include 'array_constructor_47.f90'

Jakub

Re: [PATCH][OBVIOUS] Reapply r269790 which was missed during rebase.

2019-05-10 Thread Jakub Jelinek

On Fri, May 10, 2019 at 09:19:54AM +0200, Martin Liška wrote:
> Hi.
> 
> When I split i386.c I forgot to rebase this patch. That caused the
> gcc.target/i386/fpprec-1.c execution test to fail.
> 
> Thank you Jeff for reporting that.
> 
> I'm going to install the patch.

Ok.  Have you verified all other i386.c changes since the time you did the
splitting?

> 2019-05-10  Martin Liska  
> 
>   * config/i386/i386-expand.c (ix86_expand_floorceildf_32):
>   Reapply changes from r269790.

Jakub

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Uros Bizjak

On Fri, May 10, 2019 at 9:10 AM Richard Biener
 wrote:
>
> On Fri, May 10, 2019 at 12:44 AM Uros Bizjak  wrote:
> >
> > >> Even SHIFT_COUNT_TRUNCATED is not a perfect match for what our hardware
> > >> does because we always only consider the last 6 bits of a shift operand.
> > >> Despite all the warnings in the other backends, most notably
> > >> SHIFT_COUNT_TRUNCATED being "discouraged" as mentioned in riscv.h, I
> > >> wrote the attached tentative patch.  It's a little ad-hoc, uses the
> > >> SHIFT_COUNT_TRUNCATED paths only if shift_truncation_mask () != 0 and,
> > >> instead of truncating via & (GET_MODE_BITSIZE (mode) - 1), it applies
> > >> the mask returned by shift_truncation_mask.  Doing so, usage of both
> > >> "methods" actually reduces to a single way.
> > > The main reason it's discouraged is because some targets have insns
> > > where the count would be truncated and others where it would not.   So
> > > for example normal shifts might truncate, but vector shifts might not
> > > (mips), or shifts might truncate but bit tests do not (x86).
> >
> > Bit tests on x86 also truncate [1], if the bit base operand specifies
> > a register, and we don't use BT with a memory location as a bit base.
> > I don't know what is referred with "(real or pretended) bit field
> > operations" in the documentation for SHIFT_COUNT_TRUNCATED:
> >
> >  However, on some machines, such as the 80386 and the 680x0,
> >  truncation only applies to shift operations and not the (real or
> >  pretended) bit-field operations.  Define 'SHIFT_COUNT_TRUNCATED' to
> >
> > Vector shifts don't truncate on x86, so x86 probably shares the same
> > destiny with MIPS. Maybe a machine mode argument could be passed to
> > SHIFT_COUNT_TRUNCATED to distinguish modes that truncate from modes
> > that don't.
>
> But IL semantic differences based on mode are even worse.  What happens
> if STV then substitutes a vector register/op but you previously optimized
> with the assumption the count would be truncated since the shift was SImode?

I have removed STV support for shifts with a dynamic count precisely
because of the issue you mentioned. It is not possible to substitute
a truncating shift with a non-truncating one in STV, so there is IMO no
problem with the proposed idea.

BTW: We do implement the removal of useless count argument masking
(for shifts and bit test insns) with several define_insn_and_split
patterns that allow combine to create a compound insn (please grep for
"Avoid useless masking" in i386.md). However, generic handling
would probably be more effective, since the implemented approach
doesn't remove the masking when the masked argument is used in several
shift instructions.
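
To illustrate why such a mask is redundant when the instruction itself
truncates the count, here is a minimal sketch in plain C.  It is not GCC
code: hw_shl32 is a made-up model of a 32-bit scalar shift that only looks
at the low 5 bits of the count, and the other two names are hypothetical
as well.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit scalar shift instruction that
   truncates the count to its low 5 bits.  */
static uint32_t
hw_shl32 (uint32_t x, uint32_t count)
{
  return x << (count & 31);
}

/* What the source asks for: an explicit mask, then the shift.  */
static uint32_t
shl_with_mask (uint32_t x, uint32_t count)
{
  return hw_shl32 (x, count & 31);
}

/* The same shift with the explicit mask dropped.  */
static uint32_t
shl_without_mask (uint32_t x, uint32_t count)
{
  return hw_shl32 (x, count);
}
```

Under that assumption both functions agree for every count, so the explicit
AND can go away regardless of how many shifts use the masked value.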

Uros.

> IMHO a recipe for disaster.
>
> Richard.
>
> > [1] https://www.felixcloutier.com/x86/bt
> >
> > Uros.

[PATCH][OBVIOUS] Reapply r269790 which was missed during rebase.

2019-05-10 Thread Martin Liška

Hi.

When I split i386.c I forgot to rebase this patch. That caused the
gcc.target/i386/fpprec-1.c execution test to fail.

Thank you Jeff for reporting that.

I'm going to install the patch.

gcc/ChangeLog:

2019-05-10  Martin Liska  

* config/i386/i386-expand.c (ix86_expand_floorceildf_32):
Reapply changes from r269790.
---
 gcc/config/i386/i386-expand.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)


diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index b7ce5d0975b..a55d4923be4 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -15533,8 +15533,10 @@ ix86_expand_floorceildf_32 (rtx operand0, rtx operand1, bool do_floor)
   x2 -= 1;
  Compensate.  Ceil:
 if (x2 < x)
-  x2 -= -1;
-return x2;
+  x2 += 1;
+	if (HONOR_SIGNED_ZEROS (mode))
+	  x2 = copysign (x2, x);
+	return x2;
*/
   machine_mode mode = GET_MODE (operand0);
   rtx xa, TWO52, tmp, one, res, mask;
@@ -15560,17 +15562,16 @@ ix86_expand_floorceildf_32 (rtx operand0, rtx operand1, bool do_floor)
   /* xa = copysign (xa, operand1) */
   ix86_sse_copysign_to_positive (xa, xa, res, mask);
 
-  /* generate 1.0 or -1.0 */
-  one = force_reg (mode,
-	   const_double_from_real_value (do_floor
-		 ? dconst1 : dconstm1, mode));
+  /* generate 1.0 */
+  one = force_reg (mode, const_double_from_real_value (dconst1, mode));
 
   /* Compensate: xa = xa - (xa > operand1 ? 1 : 0) */
   tmp = ix86_expand_sse_compare_mask (UNGT, xa, res, !do_floor);
   emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, one, tmp)));
-  /* We always need to subtract here to preserve signed zero.  */
-  tmp = expand_simple_binop (mode, MINUS,
+  tmp = expand_simple_binop (mode, do_floor ? MINUS : PLUS,
 			 xa, tmp, NULL_RTX, 0, OPTAB_DIRECT);
+  if (!do_floor && HONOR_SIGNED_ZEROS (mode))
+ix86_sse_copysign_to_positive (tmp, tmp, res, mask);
   emit_move_insn (res, tmp);
 
   emit_label (label);
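
For readers following the comment in the hunk above, the ceil variant can
be sketched in plain C roughly as below.  This is only an illustration
under stated assumptions, not the expander itself: ceil_sketch is a made-up
name, the default round-to-nearest mode is assumed, and the volatile
temporary is there only to stop the compiler from folding the add/subtract
pair.  The final copysign corresponds to the HONOR_SIGNED_ZEROS case.

```c
#include <math.h>

/* Sketch of the ceil sequence for doubles, assuming round-to-nearest.  */
static double
ceil_sketch (double x)
{
  const double two52 = 4503599627370496.0;	/* 2**52 */
  double xa = fabs (x);

  /* Values of magnitude >= 2**52 are already integral; NaN also
     takes this path because isless is false for unordered operands.  */
  if (!isless (xa, two52))
    return x;

  /* Adding and subtracting 2**52 rounds xa to the nearest integer.  */
  volatile double t = xa + two52;
  xa = t - two52;

  double x2 = copysign (xa, x);

  /* Compensate: rounding to nearest may have gone below x.  */
  if (x2 < x)
    x2 += 1.0;

  /* -1.0 + 1.0 yields +0.0, so restore the sign of x to keep -0.0.  */
  return copysign (x2, x);
}
```

With x = -0.7 the rounding step gives -1.0, the compensation adds 1.0 and
produces +0.0, and the last copysign turns that back into -0.0.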

[PATCH] Do not fold anything during copy_fn (PR c++/90383)

2019-05-10 Thread Jakub Jelinek

Hi!

The following testcases are rejects-valid or wrong-code, because
when we make copies of the function for constexpr evaluation purposes (the
primary intent is to have the functions as is, with no folding whatsoever, so
we diagnose everything), the inliner used under the hood to copy the function
actually folds in some cases.  This folding is done there to canonicalize
some cases (MEM_REFs, INDIRECT_REFs, ADDR_EXPRs), but such canonicalization
is only really needed if we replace their operands by something different
(e.g. replace a PARM_DECL with the corresponding value etc.).  When doing
copy_fn, all we are changing is one set of decls for another set of decls
of the same category.

The following patch avoids those foldings during copy_fn.
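
As a rough analogy only (a toy expression tree, nothing from tree-inline.c;
struct node, make_node and copy_node are all made-up names), the idea of a
copier that folds by default but can be told to copy verbatim looks like
this:

```c
#include <stdbool.h>
#include <stdlib.h>

enum kind { K_CONST, K_ADD };

struct node
{
  enum kind kind;
  int value;			/* used by K_CONST */
  struct node *lhs, *rhs;	/* used by K_ADD */
};

static struct node *
make_node (enum kind kind, int value, struct node *lhs, struct node *rhs)
{
  struct node *n = malloc (sizeof *n);
  if (!n)
    abort ();
  n->kind = kind;
  n->value = value;
  n->lhs = lhs;
  n->rhs = rhs;
  return n;
}

/* Copy N.  Unless DO_NOT_FOLD is set, an addition of two constants
   is folded into a single constant while copying.  */
static struct node *
copy_node (const struct node *n, bool do_not_fold)
{
  if (n->kind == K_CONST)
    return make_node (K_CONST, n->value, NULL, NULL);

  struct node *l = copy_node (n->lhs, do_not_fold);
  struct node *r = copy_node (n->rhs, do_not_fold);

  if (!do_not_fold && l->kind == K_CONST && r->kind == K_CONST)
    {
      int sum = l->value + r->value;
      free (l);
      free (r);
      return make_node (K_CONST, sum, NULL, NULL);
    }

  return make_node (K_ADD, 0, l, r);
}
```

A caller that needs the copy to have exactly the shape of the original, as
the constexpr evaluator does, passes true for the flag.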

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
after a while to 9.2?

2019-05-10  Jakub Jelinek  

PR c++/90383
* tree-inline.h (struct copy_body_data): Add do_not_fold member.
* tree-inline.c (remap_gimple_op_r): Avoid folding expressions if
id->do_not_fold.
(copy_tree_body_r): Likewise.
(copy_fn): Set id.do_not_fold to true.

* g++.dg/cpp1y/constexpr-90383-1.C: New test.
* g++.dg/cpp1y/constexpr-90383-2.C: New test.

--- gcc/tree-inline.h.jj2019-05-08 19:04:58.947797821 +0200
+++ gcc/tree-inline.h   2019-05-09 10:18:43.567373908 +0200
@@ -113,6 +113,9 @@ struct copy_body_data
   /* True if trees may not be unshared.  */
   bool do_not_unshare;
 
+  /* True if trees should not be folded during the copying.  */
+  bool do_not_fold;
+
   /* True if new declarations may not be created during type remapping.  */
   bool prevent_decl_creation_for_types;
 
--- gcc/tree-inline.c.jj2019-05-08 19:04:58.949797788 +0200
+++ gcc/tree-inline.c   2019-05-09 10:41:49.691949208 +0200
@@ -1101,7 +1101,7 @@ remap_gimple_op_r (tree *tp, int *walk_s
   /* Otherwise, just copy the node.  Note that copy_tree_r already
 knows not to copy VAR_DECLs, etc., so this is safe.  */
 
-  if (TREE_CODE (*tp) == MEM_REF)
+  if (TREE_CODE (*tp) == MEM_REF && !id->do_not_fold)
{
  /* We need to re-canonicalize MEM_REFs from inline substitutions
 that can happen when a pointer argument is an ADDR_EXPR.
@@ -1327,11 +1327,11 @@ copy_tree_body_r (tree *tp, int *walk_su
  tree type = TREE_TYPE (*tp);
  tree ptr = id->do_not_unshare ? *n : unshare_expr (*n);
  tree old = *tp;
- *tp = gimple_fold_indirect_ref (ptr);
+ *tp = id->do_not_fold ? NULL : gimple_fold_indirect_ref (ptr);
  if (! *tp)
{
  type = remap_type (type, id);
- if (TREE_CODE (ptr) == ADDR_EXPR)
+ if (TREE_CODE (ptr) == ADDR_EXPR && !id->do_not_fold)
{
  *tp
= fold_indirect_ref_1 (EXPR_LOCATION (ptr), type, ptr);
@@ -1360,7 +1360,7 @@ copy_tree_body_r (tree *tp, int *walk_su
  return NULL;
}
}
-  else if (TREE_CODE (*tp) == MEM_REF)
+  else if (TREE_CODE (*tp) == MEM_REF && !id->do_not_fold)
{
  /* We need to re-canonicalize MEM_REFs from inline substitutions
 that can happen when a pointer argument is an ADDR_EXPR.
@@ -1432,7 +1432,8 @@ copy_tree_body_r (tree *tp, int *walk_su
 
  /* Handle the case where we substituted an INDIRECT_REF
 into the operand of the ADDR_EXPR.  */
- if (TREE_CODE (TREE_OPERAND (*tp, 0)) == INDIRECT_REF)
+ if (TREE_CODE (TREE_OPERAND (*tp, 0)) == INDIRECT_REF
+ && !id->do_not_fold)
{
  tree t = TREE_OPERAND (TREE_OPERAND (*tp, 0), 0);
  if (TREE_TYPE (t) != TREE_TYPE (*tp))
@@ -6370,6 +6371,7 @@ copy_fn (tree fn, tree& parms, tree& res
  since front-end specific mechanisms may rely on sharing.  */
   id.regimplify = false;
   id.do_not_unshare = true;
+  id.do_not_fold = true;
 
   /* We're not inside any EH region.  */
   id.eh_lp_nr = 0;
--- gcc/testsuite/g++.dg/cpp1y/constexpr-90383-1.C.jj	2019-05-09 10:49:10.222509867 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-90383-1.C	2019-05-09 10:48:46.538910236 +0200
@@ -0,0 +1,15 @@
+// PR c++/90383
+// { dg-do compile { target c++14 } }
+
+struct alignas(8) A { constexpr A (bool x) : a(x) {} A () = delete; bool a; };
+struct B { A b; };
+
+constexpr bool
+foo ()
+{
+  B w{A (true)};
+  w.b = A (true);
+  return w.b.a;
+}
+
+static_assert (foo (), "");
--- gcc/testsuite/g++.dg/cpp1y/constexpr-90383-2.C.jj	2019-05-09 10:49:18.194375099 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-90383-2.C	2019-05-09 10:51:08.433511507 +0200
@@ -0,0 +1,22 @@
+// PR c++/90383
+// { dg-do run { target c++14 } }
+// { dg-options "-O2" }
+
+extern "C" void abort ();
+struct alignas(8) A { constexpr A (bool x) : a(x) {} A () = default; bool a; };
+struct B { A b; };
+

Re: [committed] Clean up MPX-related stuff: CIF_CHKP (was: [PATCH] Clean up another MPX-related stuff.)

2019-05-10 Thread Richard Biener

On Thu, May 9, 2019 at 11:59 AM Thomas Schwinge  wrote:
>
> Hi!
>
> On Wed, 13 Feb 2019 14:47:36 +0100, Richard Biener 
>  wrote:
> > On February 13, 2019 6:53:17 AM GMT+01:00, "Martin Liška"  
> > wrote:
> > >As Honza noticed, there's still some leftover from MPX removal.
> > >May I remove another bunch of fields now, or should I wait
> > >for next stage1?
> >
> > You can do it now.
>
> I recently stumbled across an additional leftover piece:

OK.

Richard.

> > 2019-02-13  Martin Liska  
>
> >   * ipa-fnsummary.c (compute_fn_summary): Likewise.
>
> | --- a/gcc/ipa-fnsummary.c
> | +++ b/gcc/ipa-fnsummary.c
> | @@ -2449,13 +2449,7 @@ compute_fn_summary (struct cgraph_node *node, bool 
> early)
> |info->account_size_time (2 * ipa_fn_summary::size_scale, 0, t, t);
> |ipa_update_overall_fn_summary (node);
> |info->self_size = info->size;
> | -  /* We cannot inline instrumentation clones.  */
> | -  if (node->thunk.add_pointer_bounds_args)
> | - {
> | -  info->inlinable = false;
> | -  node->callees->inline_failed = CIF_CHKP;
> | - }
> | -  else if (stdarg_p (TREE_TYPE (node->decl)))
> | +  if (stdarg_p (TREE_TYPE (node->decl)))
> |   {
>
> This removed the (only) user of 'CIF_CHKP', but didn't remove its
> definition.  (Probably because of that one going by the un-prefixed
> short-hand name of 'CHKP'?)  As obvious, now cleaned up on trunk in
> r271029, and on gcc-9-branch in r271030, see attached.
>
>
> Grüße
>  Thomas
>
>

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Richard Biener

On Fri, May 10, 2019 at 12:44 AM Uros Bizjak  wrote:
>
> >> Even SHIFT_COUNT_TRUNCATED is not a perfect match for what our hardware
> >> does because we always only consider the last 6 bits of a shift operand.
> >> Despite all the warnings in the other backends, most notably
> >> SHIFT_COUNT_TRUNCATED being "discouraged" as mentioned in riscv.h, I
> >> wrote the attached tentative patch.  It's a little ad-hoc, uses the
> >> SHIFT_COUNT_TRUNCATED paths only if shift_truncation_mask () != 0 and,
> >> instead of truncating via & (GET_MODE_BITSIZE (mode) - 1), it applies
> >> the mask returned by shift_truncation_mask.  Doing so, usage of both
> >> "methods" actually reduces to a single way.
> > The main reason it's discouraged is because some targets have insns
> > where the count would be truncated and others where it would not.   So
> > for example normal shifts might truncate, but vector shifts might not
> > (mips), or shifts might truncate but bit tests do not (x86).
>
> Bit tests on x86 also truncate [1], if the bit base operand specifies
> a register, and we don't use BT with a memory location as a bit base.
> I don't know what is referred with "(real or pretended) bit field
> operations" in the documentation for SHIFT_COUNT_TRUNCATED:
>
>  However, on some machines, such as the 80386 and the 680x0,
>  truncation only applies to shift operations and not the (real or
>  pretended) bit-field operations.  Define 'SHIFT_COUNT_TRUNCATED' to
>
> Vector shifts don't truncate on x86, so x86 probably shares the same
> destiny with MIPS. Maybe a machine mode argument could be passed to
> SHIFT_COUNT_TRUNCATED to distinguish modes that truncate from modes
> that don't.

But IL semantic differences based on mode are even worse.  What happens
if STV then substitutes a vector register/op but you previously optimized
with the assumption the count would be truncated since the shift was SImode?

IMHO a recipe for disaster.
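
A small sketch of the mismatch, in plain C rather than RTL.  Both functions
are made-up models: scalar_shl32 truncates the count to 5 bits like a
scalar SImode shift on x86, while vector_lane_shl32 models one lane of a
vector shift where any count above 31 clears the lane.

```c
#include <stdint.h>

/* Hypothetical model: scalar 32-bit shift, count truncated to 5 bits.  */
static uint32_t
scalar_shl32 (uint32_t x, uint32_t count)
{
  return x << (count & 31);
}

/* Hypothetical model: one 32-bit lane of a vector shift, where an
   out-of-range count produces zero instead of being truncated.  */
static uint32_t
vector_lane_shl32 (uint32_t x, uint32_t count)
{
  return count > 31 ? 0 : x << count;
}
```

For counts 0..31 the two agree, but for a count of 33 the scalar model
returns x << 1 and the vector model returns 0, so an optimization that
dropped a mask on the scalar form is no longer valid after the swap.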

Richard.

> [1] https://www.felixcloutier.com/x86/bt
>
> Uros.

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Richard Biener

On Thu, May 9, 2019 at 6:00 PM Jeff Law  wrote:
>
> On 5/9/19 5:52 AM, Robin Dapp wrote:
> > Hi,
> >
> > while trying to improve s390 code generation for rotate and shift I
> > noticed superfluous subregs for shift count operands. In our backend we
> > already have quite cumbersome patterns that would need to be duplicated
> > (or complicated further by more subst patterns) in order to get rid of
> > the subregs.
> >
> > I had already finished all the patterns when I realized that
> > SHIFT_COUNT_TRUNCATED and the target hook shift_truncation_mask already
> > exist and could do what is needed without extra patterns.  Just defining
> >  shift_truncation_mask was not enough though as most of the additional
> > insns get introduced by combine.
> >
> > Even SHIFT_COUNT_TRUNCATED is not a perfect match for what our hardware
> > does because we always only consider the last 6 bits of a shift operand.
> > Despite all the warnings in the other backends, most notably
> > SHIFT_COUNT_TRUNCATED being "discouraged" as mentioned in riscv.h, I
> > wrote the attached tentative patch.  It's a little ad-hoc, uses the
> > SHIFT_COUNT_TRUNCATED paths only if shift_truncation_mask () != 0 and,
> > instead of truncating via & (GET_MODE_BITSIZE (mode) - 1), it applies
> > the mask returned by shift_truncation_mask.  Doing so, usage of both
> > "methods" actually reduces to a single way.
> The main reason it's discouraged is because some targets have insns
> where the count would be truncated and others where it would not.   So
> for example normal shifts might truncate, but vector shifts might not
> (mips), or shifts might truncate but bit tests do not (x86).
>
> I don't know enough about the s390 architecture to know if there's any
> corner cases.  You'd have to look at every pattern in your machine
> description with a shift and verify that it's going to DTRT if the count
> hasn't been truncated.
>
>
> It would really help if you could provide testcases which show the
> suboptimal code and any analysis you've done.

The main reason I dislike SHIFT_COUNT_TRUNCATED is that it
changes the meaning of the IL.  We generally want these kinds
of things to be explicit.

Richard.



>
> Jeff

Re: [PATCH][stage1] Support profile (BB counts and edge probabilities) in GIMPLE FE.

2019-05-10 Thread Bernhard Reutner-Fischer

On 7 May 2019 14:00:32 CEST, "Martin Liška"  wrote:

/The parameters is/s/parameters/parameter/

thanks,
