Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Jan Hubicka
> On 07/20/16 20:08, Richard Biener wrote:
> > On July 20, 2016 6:54:48 PM GMT+02:00, Bernd Edlinger 
> >  wrote:
> >>
> >> Yes. That is another interesting observation.  I think, originally this
> >> flag was introduced by Jan Hubicka, and should mean, "it may be alloca
> >> or a weak alias to alloca or maybe even something different".
> >> But some of the later optimizations use it in a way as if it meant
> >> "it must be alloca".  However I have not been able to come up with
> >> a test case that makes this assumption false, but I probably just
> >> did not try hard enough.
> >>
> >> But I think that alloca just should not be recognized by name any
> >> more.
> >
> > It was introduced to mark calls that should not be duplicated by inlining 
> > or unrolling to avoid increasing stack usage too much.  Sth worthwhile to 
> > keep even with -ffreestanding.
> >
> > Richard.
> >
> 
> Apparently the MAY_BE_ALLOCA issue is worse than I ever thought...
> 
> But I could not imagine that alloca can be anything else than a
> built-in.
> 
> Is there any implementation where alloca is like an ordinary function
> call?
> 
> I mean, does something like a function that allocates n bytes
> from the caller's stack frame work at all with any calling convention?

I just looked up the original patch introducing MAY_BE_ALLOCA
https://gcc.gnu.org/ml/gcc-patches/2000-03/msg00998.html
I did not introduce the flag and it was there for ages (I just came up with
ECF_*).  The reason for it back then was only to prevent pending stack
adjustments before a call to alloca. It is not completely clear to me why it is
needed - if alloca is a builtin it will do the pending adjustments. There used
to be a library implementation of alloca which did malloc and then garbage
collected based on the stack pointer. For that you need no special handling.

Unrolling is safe WRT alloca (because it depends on how many times the function
is invoked). I added the logic to prevent inlining a long time ago because
otherwise the gcc benchmark from SPEC exploded in stack usage.
It has been on my TODO for ages to remove it and instead save/restore the stack
pointer around inlined functions which do alloca.

Not fully relevant for this thread, just my 2 cents ;)
Honza
> 
> 
> Bernd.
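
As a rough illustration of the inlining concern Honza describes above, here is
a minimal sketch. It is not GCC code: sum_with_scratch and caller are
hypothetical names, and it assumes a compiler that provides __builtin_alloca
and __attribute__((noinline)) (GCC or Clang).

```cpp
#include <cassert>   // used by the usage checks
#include <cstddef>
#include <cstring>

// Hypothetical helper: sums n bytes through a scratch buffer taken from the
// stack with alloca.  Because the function is kept out of line, the scratch
// space lives in its own frame and is released every time it returns.
__attribute__((noinline))
static std::size_t sum_with_scratch(std::size_t n)
{
  unsigned char *buf = static_cast<unsigned char *>(__builtin_alloca(n));
  std::memset(buf, 1, n);
  std::size_t total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += buf[i];
  return total;
}

// If sum_with_scratch were inlined into this loop without saving and
// restoring the stack pointer around the inlined body, every iteration's
// allocation would stay live until caller() itself returns, so stack usage
// would grow with the iteration count.
static std::size_t caller(std::size_t iterations, std::size_t n)
{
  std::size_t total = 0;
  for (std::size_t i = 0; i < iterations; ++i)
    total += sum_with_scratch(n);
  return total;
}
```

Calling caller(1000, 4096) needs only about one 4096-byte scratch area at a
time as long as the helper stays out of line.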


Re: [libstdc++, C++17] Implement C++17 P0330 size_t UDL.

2016-07-20 Thread Jonathan Wakely

On 21/07/16 00:18 -0400, Ed Smith-Rowland wrote:

This patch defines

 operator""zu(unsigned long long __n)

for size_t literals.

for (auto k = 0zu; k < v.size(); ++k)

  ...


Testing on x86-64-linux is finishing but I'm past these tests.

OK?


P0330 isn't in C++17. In Oulu LEWG voted to forward it to LWG for
C++Next (i.e. C++20) and LFTSv3. I don't think LWG reviewed it yet.

https://issues.isocpp.org/show_bug.cgi?id=77


C++ PATCH for c++/70781 (ICE with ill-formed lambda)

2016-07-20 Thread Jason Merrill
Here we were returning OK from cp_parser_lambda_declarator_opt even
though we had encountered parse errors, and so parsing the inner
lambda aborted due to seeing an error_mark_node expression-statement
without ever having emitted any errors.

Fixed by clearing OK if there were parse errors.

Tested x86_64-pc-linux-gnu, applying to trunk and 6.
commit 17fea383f9a2861903ae6d144b3ba7e6a4bf191a
Author: Jason Merrill 
Date:   Wed Jul 20 17:03:50 2016 -0400

PR c++/70781 - ICE on ill-formed lambda.

* parser.c (cp_parser_lambda_expression): Unset OK if there was an
error parsing the lambda-declarator.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 9bdb108..b71b9e5 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -9771,10 +9771,12 @@ cp_parser_lambda_expression (cp_parser* parser)
 
 ok &= cp_parser_lambda_declarator_opt (parser, lambda_expr);
 
+if (ok && cp_parser_error_occurred (parser))
+  ok = false;
+
 if (ok)
   {
-   if (!cp_parser_error_occurred (parser)
-   && cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE)
+   if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE)
&& cp_parser_start_tentative_firewall (parser))
  start = token;
cp_parser_lambda_body (parser, lambda_expr);
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-ice16.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-ice16.C
new file mode 100644
index 000..e94a0b6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-ice16.C
@@ -0,0 +1,8 @@
+// PR c++/70781
+// { dg-do compile { target c++11 } }
+
+template < typename T >  
+void foo ()
+{
+  T ([=] (S) { [=] {}; }); // { dg-error "" }
+}
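
The control-flow change in the patch above can be sketched with a toy parser.
This is an illustration only: toy_parser, parse_declarator and parse_lambda
are hypothetical stand-ins, not the cp_parser API, and the apply_fix parameter
exists just to compare the behaviour before and after.

```cpp
#include <cassert>

// Hypothetical stand-in for the parser state: it only tracks whether a
// (tentative) parse error has been seen.
struct toy_parser
{
  bool error_occurred = false;
};

// Mimics a sub-parser that records an error in the parser state but still
// reports success to its caller.
static bool parse_declarator(toy_parser &p, bool well_formed)
{
  if (!well_formed)
    p.error_occurred = true;
  return true;
}

// With apply_fix set, a recorded error clears OK before the body is parsed,
// which is the shape of the fix described above.
static bool parse_lambda(toy_parser &p, bool well_formed, bool apply_fix)
{
  bool ok = true;
  ok &= parse_declarator(p, well_formed);

  if (apply_fix && ok && p.error_occurred)
    ok = false;

  if (ok)
    {
      // The lambda body would be parsed here.
    }
  return ok;
}
```

Without the extra check, an ill-formed declarator still yields ok == true and
the body is parsed as if nothing had gone wrong.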


C++ PATCH for c++/71896 (constexpr pointer to member comparison)

2016-07-20 Thread Jason Merrill
Here one operand of the comparison had been reduced to an INTEGER_CST
and the other was still a PTRMEM_CST.  We should deal with that
situation by reducing the PTRMEM_CST.

Tested x86_64-pc-linux-gnu, applying to trunk and 6.
commit 984c524f8a302059a1f71f84935dcae5f9914c7f
Author: Jason Merrill 
Date:   Wed Jul 20 17:11:00 2016 -0400

PR c++/71896 - constexpr pointer-to-member comparison.

* constexpr.c (cxx_eval_binary_expression): Handle comparison
between lowered and unlowered PTRMEM_CST.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 346fdfa..240c606 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1838,6 +1838,10 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, tree t,
   && (null_member_pointer_value_p (lhs)
   || null_member_pointer_value_p (rhs)))
r = constant_boolean_node (!is_code_eq, type);
+  else if (TREE_CODE (lhs) == PTRMEM_CST)
+   lhs = cplus_expand_constant (lhs);
+  else if (TREE_CODE (rhs) == PTRMEM_CST)
+   rhs = cplus_expand_constant (rhs);
 }
 
   if (r == NULL_TREE)
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-ptrmem6.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-ptrmem6.C
new file mode 100644
index 000..ed18ab1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-ptrmem6.C
@@ -0,0 +1,13 @@
+// PR c++/71896
+// { dg-do compile { target c++11 } }
+
+struct Foo {
+  int x;
+};
+
+constexpr bool compare(int Foo::*t) { return t == &Foo::x; }
+
+constexpr bool b = compare(&Foo::x);
+
+#define SA(X) static_assert ((X),#X)
+SA(b);


C++ PATCH for -Waddress false positive in unevaluated context

2016-07-20 Thread Jason Merrill
The fix for 65168 didn't check c_inhibit_evaluation_warnings, which
means getting warnings about comparing the address of a reference to
NULL in places where we aren't actually interested in the value.  This
patch corrects that, but this led to some regressions because
cp_truthvalue_conversion was inappropriately setting that flag to
avoid -Wzero-as-null-pointer-constant warnings.  So now we avoid those
warnings by comparing to nullptr rather than 0.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit e274607c02db6d92af948b670cb90d4af452aa77
Author: Jason Merrill 
Date:   Wed Jul 20 12:03:48 2016 -0400

PR c++/65168 - -Waddress in unevaluated context.

gcc/c-family/
* c-common.c (c_common_truthvalue_conversion): Check
c_inhibit_evaluation_warnings for warning about address of
reference.
gcc/cp/
* typeck.c (cp_truthvalue_conversion): Compare pointers to nullptr.
Don't set c_inhibit_evaluation_warnings.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 936ddfb..9900e93 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -4551,6 +4551,7 @@ c_common_truthvalue_conversion (location_t location, tree expr)
tree fromtype = TREE_TYPE (TREE_OPERAND (expr, 0));
 
if (POINTER_TYPE_P (totype)
+   && !c_inhibit_evaluation_warnings
&& TREE_CODE (fromtype) == REFERENCE_TYPE)
  {
tree inner = expr;
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index f9e45ee..d4bfb11 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -5459,21 +5459,10 @@ tree
 cp_truthvalue_conversion (tree expr)
 {
   tree type = TREE_TYPE (expr);
-  if (TYPE_PTRDATAMEM_P (type)
+  if (TYPE_PTR_OR_PTRMEM_P (type)
   /* Avoid ICE on invalid use of non-static member function.  */
   || TREE_CODE (expr) == FUNCTION_DECL)
-return build_binary_op (EXPR_LOCATION (expr),
-   NE_EXPR, expr, nullptr_node, 1);
-  else if (TYPE_PTR_P (type) || TYPE_PTRMEMFUNC_P (type))
-{
-  /* With -Wzero-as-null-pointer-constant do not warn for an
-'if (p)' or a 'while (!p)', where p is a pointer.  */
-  tree ret;
-  ++c_inhibit_evaluation_warnings;
-  ret = c_common_truthvalue_conversion (input_location, expr);
-  --c_inhibit_evaluation_warnings;
-  return ret;
-}
+return build_binary_op (input_location, NE_EXPR, expr, nullptr_node, 1);
   else
 return c_common_truthvalue_conversion (input_location, expr);
 }
diff --git a/gcc/testsuite/g++.dg/warn/Waddress-3.C b/gcc/testsuite/g++.dg/warn/Waddress-3.C
new file mode 100644
index 000..13d7cd2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Waddress-3.C
@@ -0,0 +1,14 @@
+// PR c++/65168
+// { dg-do compile { target c++11 } }
+// { dg-options -Waddress }
+// We shouldn't warn in unevaluated context about the address of a reference
+// always being true.
+
+template 
+auto f(U&& u) -> decltype(T(u)) { }
+
+int main()
+{
+  bool ar[4];
+  f(ar);
+}
diff --git a/gcc/testsuite/g++.dg/warn/Walways-true-1.C b/gcc/testsuite/g++.dg/warn/Walways-true-1.C
index ae6f9dc..48b9f72 100644
--- a/gcc/testsuite/g++.dg/warn/Walways-true-1.C
+++ b/gcc/testsuite/g++.dg/warn/Walways-true-1.C
@@ -12,19 +12,19 @@ void
 bar (int a)
 {
  lab:
-  if (foo) // { dg-warning "always evaluate as" "correct warning" }
+  if (foo) // { dg-warning "always evaluate as|never be NULL" "correct warning" }
 foo (0);
   if (foo (1))
 ;
-  if (&i)  // { dg-warning "always evaluate as" "correct warning" }
+  if (&i)  // { dg-warning "always evaluate as|never be NULL" "correct warning" }
 foo (2);
   if (i)
 foo (3);
-  if (&a)  // { dg-warning "always evaluate as" "correct warning" }
+  if (&a)  // { dg-warning "always evaluate as|never be NULL" "correct warning" }
 foo (4);
   if (a)
 foo (5);
-  if (&&lab)   // { dg-warning "always evaluate as" "correct warning" }
+  if (&&lab)   // { dg-warning "always evaluate as|never be NULL" "correct warning" }
 foo (6);
   if (foo == 0)// { dg-warning "never be NULL" "correct warning" }
 foo (7);
diff --git a/gcc/testsuite/g++.dg/warn/Walways-true-2.C b/gcc/testsuite/g++.dg/warn/Walways-true-2.C
index f157347..e4b5713 100644
--- a/gcc/testsuite/g++.dg/warn/Walways-true-2.C
+++ b/gcc/testsuite/g++.dg/warn/Walways-true-2.C
@@ -23,11 +23,11 @@ bar (int a)
 foo (2);
   if (i)
 foo (3);
-  if (&a)  // { dg-warning "always evaluate as" "correct warning" }
+  if (&a)  // { dg-warning "always evaluate as|never be NULL" "correct warning" }
 foo (4);
   if (a)
 foo (5);
-  if (&&lab)   // { dg-warning "always evaluate as" "correct warning" }
+  if (&&lab)   // { dg-warning "always evaluate as|never be NULL" "correct warning" }
 foo (6);
   if (foo == 0)
 foo (7);


C++ PATCH for C++/71121 (wrong -Waddress warning with PMF and constexpr)

2016-07-20 Thread Jason Merrill
The problem here was that the code that tries to prevent the -Waddress
warning used cp_fully_fold, and later code used maybe_constant_value,
and the latter simplified the operand more so that it exposed the
ADDR_EXPR to the -Waddress warning.  Fixed by calling
maybe_constant_value from cp_fully_fold.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 16703ee40f25fc3afab05b4d25741eb27ce70825
Author: Jason Merrill 
Date:   Wed Jul 20 13:10:40 2016 -0400

PR c++/71121 - -Waddress, constexpr, and PMFs.

* cp-gimplify.c (cp_fully_fold): First call maybe_constant_value.

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 41ab35f..ee28ba5 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -1954,6 +1954,11 @@ cxx_omp_disregard_value_expr (tree decl, bool shared)
 tree
 cp_fully_fold (tree x)
 {
+  if (processing_template_decl)
+return x;
+  /* FIXME cp_fold ought to be a superset of maybe_constant_value so we don't
+ have to call both.  */
+  x = maybe_constant_value (x);
   return cp_fold (x);
 }
 
diff --git a/gcc/testsuite/g++.dg/warn/Waddress-4.C b/gcc/testsuite/g++.dg/warn/Waddress-4.C
new file mode 100644
index 000..a9fdfc4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Waddress-4.C
@@ -0,0 +1,15 @@
+// PR c++/71121
+// { dg-do compile { target c++14 } }
+// { dg-options -Waddress }
+
+struct CC { void mbr(); };
+
+constexpr auto getFunc() {
+return &CC::mbr;
+}
+
+constexpr bool xxx(void (CC::*_a)())
+{
+constexpr auto f = getFunc();
+return (f == _a);
+}
diff --git a/gcc/testsuite/g++.dg/warn/overflow-warn-7.C b/gcc/testsuite/g++.dg/warn/overflow-warn-7.C
deleted file mode 100644
index b536563..000
--- a/gcc/testsuite/g++.dg/warn/overflow-warn-7.C
+++ /dev/null
@@ -1,17 +0,0 @@
-// PR c/62096 - unexpected warning overflow in implicit constant conversion
-// { dg-do compile { target c++11 } }
-
-enum E {
-E_val  = 1,
-};
-
-inline constexpr E operator~(E e)
-{
-  return E(~static_cast(e));
-}
-
-int main()
-{
-  int val = ~E_val;   // { dg-bogus "overflow in implicit constant conversion" }
-  (void) val;
-}


Re: [C++ PATCH] Fix up genericization ICE (PR c++/71941)

2016-07-20 Thread Jason Merrill
OK.

On Wed, Jul 20, 2016 at 5:05 PM, Jakub Jelinek  wrote:
> Hi!
>
> In PR69315, we've recently allowed recursive genericization, unfortunately
> the bc_label handling isn't prepared for that, we ICE if we cp_genericize
> some function (usually newly instantiated method) while inside of some loop
> in the outer function.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk/6.2?
>
> 2016-07-20  Jakub Jelinek  
>
> PR c++/71941
> * cp-gimplify.c (cp_genericize): For nested cp_genericize calls
> save/restore bc_label array.
>
> * g++.dg/gomp/pr71941.C: New test.
>
> --- gcc/cp/cp-gimplify.c.jj 2016-07-18 20:44:59.0 +0200
> +++ gcc/cp/cp-gimplify.c2016-07-20 12:14:28.203381211 +0200
> @@ -1632,6 +1632,13 @@ cp_genericize (tree fndecl)
>if (DECL_CLONED_FUNCTION_P (fndecl))
>  return;
>
> +  /* Allow cp_genericize calls to be nested.  */
> +  tree save_bc_label[2];
> +  save_bc_label[bc_break] = bc_label[bc_break];
> +  save_bc_label[bc_continue] = bc_label[bc_continue];
> +  bc_label[bc_break] = NULL_TREE;
> +  bc_label[bc_continue] = NULL_TREE;
> +
>/* Expand all the array notations here.  */
>if (flag_cilkplus
>&& contains_array_notation_expr (DECL_SAVED_TREE (fndecl)))
> @@ -1651,6 +1658,8 @@ cp_genericize (tree fndecl)
>
>gcc_assert (bc_label[bc_break] == NULL);
>gcc_assert (bc_label[bc_continue] == NULL);
> +  bc_label[bc_break] = save_bc_label[bc_break];
> +  bc_label[bc_continue] = save_bc_label[bc_continue];
>  }
>
>  /* Build code to apply FN to each member of ARG1 and ARG2.  FN may be
> --- gcc/testsuite/g++.dg/gomp/pr71941.C.jj  2016-07-20 12:11:29.793638764 +0200
> +++ gcc/testsuite/g++.dg/gomp/pr71941.C 2016-07-20 12:11:14.0 +0200
> @@ -0,0 +1,22 @@
> +// PR c++/71941
> +// { dg-do compile }
> +// { dg-options "-fopenmp" }
> +
> +struct A { A (); A (A &); ~A (); };
> +
> +template 
> +struct B
> +{
> +  struct C { A a; C () : a () {} };
> +  C c;
> +  void foo ();
> +};
> +
> +void
> +bar ()
> +{
> +  B<0> b;
> +#pragma omp task
> +  for (int i = 0; i < 2; i++)
> +b.foo ();
> +}
>
> Jakub
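
The save/restore idea in the patch above can be sketched in isolation. This is
a toy model under stated assumptions: g_label stands in for the bc_label
array, label_saver and process are hypothetical names, and an RAII guard is
used here for brevity where the patch uses explicit assignments.

```cpp
#include <cassert>

// Stand-in for the bc_label array: slot 0 plays the "break" label, slot 1
// the "continue" label.  0 means "no label is active".
static int g_label[2] = {0, 0};
static int g_next_id = 1;

// Saves the outer labels on entry, clears them for the nested run, and puts
// them back on exit.
struct label_saver
{
  int saved[2];
  label_saver()
  {
    saved[0] = g_label[0];
    saved[1] = g_label[1];
    g_label[0] = g_label[1] = 0;
  }
  ~label_saver()
  {
    // Same invariant the real function asserts before returning.
    assert(g_label[0] == 0 && g_label[1] == 0);
    g_label[0] = saved[0];
    g_label[1] = saved[1];
  }
};

// Toy "genericize": opens a loop, optionally triggers a nested run while the
// loop labels are still active, then closes the loop.  Returns the number of
// functions processed.
static int process(int nesting)
{
  label_saver guard;
  g_label[0] = g_next_id++;           // entering a loop: labels become active
  g_label[1] = g_next_id++;
  const int mine = g_label[0];

  int processed = 1;
  if (nesting > 0)
    processed += process(nesting - 1); // nested run starts from cleared labels

  assert(g_label[0] == mine);          // outer labels survived the nested run
  g_label[0] = g_label[1] = 0;         // leaving the loop
  return processed;
}
```

Without the guard, the nested call would start with the outer loop's labels
still set, which is the situation the assertions at the end of the real
function reject.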


[libstdc++, C++17] Implement C++17 P0330 size_t UDL.

2016-07-20 Thread Ed Smith-Rowland

This patch defines

  operator""zu(unsigned long long __n)

for size_t literals.

for (auto k = 0zu; k < v.size(); ++k)

   ...


Testing on x86-64-linux is finishing but I'm past these tests.

OK?


Ed


2016-07-21  Edward Smith-Rowland  <3dw...@verizon.net>

Implement C++17 P0330 size_t UDL.
* include/c_global/cstddef: Add size_t operator""zu().
* testsuite/18_support/headers/cstddef/literals/types.cc: New test.
* testsuite/18_support/headers/cstddef/literals/values.cc: New test.

Index: include/c_global/cstddef
===
--- include/c_global/cstddef(revision 238557)
+++ include/c_global/cstddef(working copy)
@@ -57,4 +57,20 @@
 }
 #endif
 
+#if __cplusplus >= 201500L
+#define __cpp_lib_support_udls 201605
+namespace std
+{
+inline namespace literals
+{
+inline namespace support_literals
+{
+  constexpr size_t
+  operator""zu(unsigned long long __n)
+  { return static_cast<size_t>(__n); }
+}
+}
+}
+#endif // C++17
+
 #endif // _GLIBCXX_CSTDDEF
Index: testsuite/18_support/headers/cstddef/literals/types.cc
===
--- testsuite/18_support/headers/cstddef/literals/types.cc  (nonexistent)
+++ testsuite/18_support/headers/cstddef/literals/types.cc  (working copy)
@@ -0,0 +1,31 @@
+// { dg-options "-std=gnu++17" }
+// { dg-do compile }
+
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+#include 
+#include 
+
+void
+test01()
+{
+  using namespace std::literals::support_literals;
+
+  static_assert(std::is_same::value,
+   "1zu is std::size_t");
+}
Index: testsuite/18_support/headers/cstddef/literals/values.cc
===
--- testsuite/18_support/headers/cstddef/literals/values.cc (nonexistent)
+++ testsuite/18_support/headers/cstddef/literals/values.cc (working copy)
@@ -0,0 +1,42 @@
+// { dg-options "-std=gnu++1z" }
+// { dg-do run }
+
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+#include 
+#include 
+#include 
+
+void
+test01()
+{
+  using namespace std::literals::support_literals;
+  bool test [[gnu::unused]] = true;
+
+  std::size_t s = 1zu;
+  std::size_t t = -1zu;
+
+  VERIFY( s == 1 );
+  VERIFY( t == std::numeric_limits<std::size_t>::max() );
+}
+
+int
+main()
+{
+  test01();
+}
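
For readers who want to try the idea without patching libstdc++, a user-level
sketch follows. It is not the proposed library interface: the namespace
demo_literals, the suffix _zu (user code needs the leading underscore) and the
helper count_elements are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <type_traits>
#include <vector>

namespace demo_literals   // hypothetical namespace, not from the patch
{
  // Same shape as the operator in the patch, but with a suffix that user
  // code is allowed to define.
  constexpr std::size_t
  operator""_zu(unsigned long long n)
  { return static_cast<std::size_t>(n); }
}

using namespace demo_literals;

static_assert(std::is_same<decltype(1_zu), std::size_t>::value,
              "1_zu is std::size_t");
// Unary minus is applied to the size_t value 1, so the result wraps around.
static_assert(-1_zu == std::numeric_limits<std::size_t>::max(),
              "-1_zu wraps to the maximum size_t value");

// The motivating loop: the index has the same type as v.size(), so the
// comparison involves no signed/unsigned mismatch.
inline std::size_t count_elements(const std::vector<int> &v)
{
  std::size_t count = 0;
  for (auto k = 0_zu; k < v.size(); ++k)
    ++count;
  return count;
}
```

The two static_asserts correspond to what the types.cc and values.cc tests in
the patch check for the library suffix.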


Re: [PATCH, rs6000] Fix PR target/71733, ICE with -mcpu=power9 -mno-vsx

2016-07-20 Thread Peter Bergner

On 7/12/16 8:48 AM, Alan Modra wrote:

On Tue, Jul 12, 2016 at 02:02:43PM +0200, Ulrich Weigand wrote:

The second time around, get_secondary_mem should reuse the
same stack slot it already allocated, and the elimination
offsets should already be set to accommodate that stack slot,
which means the second time around, the correct RTX should be
generated for the memory access.

Is this not happening somehow?


Duh, yes, of course.  Second time around the mem is
(mem/c:V16QI (plus:DI (reg/f:DI 1 1)
(const_int -16 [0xfff0])) [0  S16 A128])
so we're checking the correct offset.

The problem now is that this passes rs6000_legitimate_address_p due to
mode_supports_vsx_dform_quad and quad_address_p being true.  That
doesn't seem correct for -mno-vsx.


Catching up with email, now that I'm back from vacation.

This still doesn't answer David's question about what will happen if
we generate this pattern (or one of the older VSX reg+reg patterns)
when we are NOT using -mno-vsx.  In those cases, quad_address_p and
mode_supports_vsx_dform_quad will return true and it seems like
we'll go ahead and generate these reg+offset addresses when they're
not legal for these patterns.

As I said in my previous note, I wasn't able to actually generate the
altivec pattern (I haven't tried the vsx reg+reg patterns), but if we
could, I assume we'll still have the same issue, will we not?

Peter




Re: [PATCH] Fix source locations of bad enum values (PR c/71610 and PR c/71613)

2016-07-20 Thread David Malcolm
On Wed, 2016-07-20 at 16:16 -0400, Jason Merrill wrote:
> On Thu, Jun 30, 2016 at 1:49 PM, Jason Merrill 
> wrote:
> > This needs a template testcase.
> 
> Did you get this reply before?  It bounced from the mailing list, but
> I thought you would have gotten it directly.

I did; sorry for not responding - I have quite a few projects underway
for gcc 7 and this one managed to slip off my radar.  I'll try to
resubmit with extra test coverage tomorrow.


Re: [PATCH v2] x86: allow to suppress default clobbers added to asm()s

2016-07-20 Thread Jeff Law

On 07/06/2016 08:32 AM, Jan Beulich wrote:

While it always seemed wrong to me that there's no way to avoid the
default "flags" and "fpsr" clobbers, the regression the fix for
PR/60663 introduced (see PR/63637) makes it even more desirable to have
such a mechanism: This way, at least asm()s with a single output and no
explicit clobbers could again have been made subject to CSE even with
that bug unfixed.
---
There wasn't much feedback on v1
(https://gcc.gnu.org/ml/gcc-patches/2014-10/msg03251.html)
and the feedback I did get from Jeff I didn't really mean to address
in this version:


I really don't like having an option that's globally applied for this
feature. Though I am OK with having a mechanism to avoid
implicit clobbers on specific ASMs.


I don't really understand what's wrong with a command line option
allowing to state this globally for a source file or even entire project.

It's been a while, but I believe my concern was that if you flip a flag
globally, then it's an all-or-nothing proposition.  ie, you either have
default clobbers or you do not have default clobbers for the affected
TU.  While that may be OK for some projects, I think that folks will
want/need this applied on a per-asm basis far far more often.  That's
just my opinion -- I don't have anything hard to back that up.






Why use negative numbers for the hard register numbers? I
wouldn't be at all surprised if lots of random code assumes
register numbers are always positive.


I'm lacking an idea (or suggestion) of a better alternative. Using
positive numbers resulted in far more problems, as such registers
then got accepted elsewhere as valid too.

Sadly, I don't have a better suggestion.  Perhaps Bernd or someone else
has ideas on how to represent.





I don't like adding new registers with special names like !foo.
Instead I think that listing "!cc" or something similar in the asm
itself if it doesn't clobber the cc register would be better.


I didn't really understand what was meant here, i.e. how the
proposed alternative was supposed to look like in an actual
asm().

I think what I was suggesting was to use the ! syntax, but not as
a distinct register name.  ie, the ! essentially could be applied to any
register and would be handled outside the target specific bits.


Though I think this implies a new API for decode_reg_name_and_count 
where certain negative values have special meaning.  But that ought to 
be trivial.



Jeff


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Bernd Edlinger


On 07/21/16 00:19, Bernd Edlinger wrote:
> On 07/21/16 00:00, Jakub Jelinek wrote:
>> On Wed, Jul 20, 2016 at 09:50:03PM +, Bernd Edlinger wrote:
>>> But the built-in alloca is still recognized because the builtin
>>> does have ECF_MAY_BE_ALLOCA and ECF_MALLOC.
>>
>> But __builtin_alloca_with_align likely doesn't have ECF_MALLOC set (even
>> when it should).
>>
>> Jakub
>>
>
>
> DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN,
> "__builtin_alloca_with_align")
>
> do you know what the attributes are instead,
> or where that is constructed?
>

tree.c:

   /* If we're checking the stack, `alloca' can throw.  */
   const int alloca_flags
 = ECF_MALLOC | ECF_LEAF | (flag_stack_check ? 0 : ECF_NOTHROW);

   if (!builtin_decl_explicit_p (BUILT_IN_ALLOCA))
 {
   ftype = build_function_type_list (ptr_type_node,
 size_type_node, NULL_TREE);
   local_define_builtin ("__builtin_alloca", ftype, BUILT_IN_ALLOCA,
 "alloca", alloca_flags);
 }

   ftype = build_function_type_list (ptr_type_node, size_type_node,
 size_type_node, NULL_TREE);
   local_define_builtin ("__builtin_alloca_with_align", ftype,
 BUILT_IN_ALLOCA_WITH_ALIGN,
 "__builtin_alloca_with_align",
 alloca_flags);


looks like ECF_MALLOC and ECF_LEAF are always there.

Right?


Bernd.


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Bernd Edlinger
On 07/21/16 00:00, Jakub Jelinek wrote:
> On Wed, Jul 20, 2016 at 09:50:03PM +, Bernd Edlinger wrote:
>> But the built-in alloca is still recognized because the builtin
>> does have ECF_MAY_BE_ALLOCA and ECF_MALLOC.
>
> But __builtin_alloca_with_align likely doesn't have ECF_MALLOC set (even
> when it should).
>
>   Jakub
>


DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN, "__builtin_alloca_with_align")

do you know what the attributes are instead,
or where that is constructed?


At least in this example it seems to work:

extern "C"
void *alloca(unsigned long);
void test(void*);
void bar(unsigned long n)
{
   char *x = (char*) __builtin_alloca_with_align(n,64);
   if (x)
 *x = 0;
}

g++ -O3 -S -Wall test.cc -ansi

results in

_Z3barm:
.LFB0:
 .cfi_startproc
 rep ret
 .cfi_endproc


and make check-c has no regressions.


Thanks
Bernd.


Re: fold x ^ y to 0 if x == y

2016-07-20 Thread Prathamesh Kulkarni
On 20 July 2016 at 23:07, Prathamesh Kulkarni
 wrote:
> On 20 July 2016 at 16:35, Richard Biener  wrote:
>> On Wed, 20 Jul 2016, Prathamesh Kulkarni wrote:
>>
>>> On 8 July 2016 at 12:29, Richard Biener  wrote:
>>> > On Fri, 8 Jul 2016, Richard Biener wrote:
>>> >
>>> >> On Fri, 8 Jul 2016, Prathamesh Kulkarni wrote:
>>> >>
>>> >> > Hi Richard,
>>> >> > For the following test-case:
>>> >> >
>>> >> > int f(int x, int y)
>>> >> > {
>>> >> >int ret;
>>> >> >
>>> >> >if (x == y)
>>> >> >  ret = x ^ y;
>>> >> >else
>>> >> >  ret = 1;
>>> >> >
>>> >> >return ret;
>>> >> > }
>>> >> >
>>> >> > I was wondering if x ^ y should be folded to 0 since
>>> >> > it's guarded by condition x == y ?
>>> >> >
>>> >> > optimized dump shows:
>>> >> > f (int x, int y)
>>> >> > {
>>> >> >   int iftmp.0_1;
>>> >> >   int iftmp.0_4;
>>> >> >
>>> >> >   :
>>> >> >   if (x_2(D) == y_3(D))
>>> >> > goto ;
>>> >> >   else
>>> >> > goto ;
>>> >> >
>>> >> >   :
>>> >> >   iftmp.0_4 = x_2(D) ^ y_3(D);
>>> >> >
>>> >> >   :
>>> >> >   # iftmp.0_1 = PHI 
>>> >> >   return iftmp.0_1;
>>> >> >
>>> >> > }
>>> >> >
>>> >> > The attached patch tries to fold for above case.
>>> >> > I am checking if op0 and op1 are equal using:
>>> >> > if (bitmap_intersect_p (vr1->equiv, vr2->equiv)
>>> >> >&& operand_equal_p (vr1->min, vr1->max)
>>> >> >&& operand_equal_p (vr2->min, vr2->max))
>>> >> >   { /* equal */ }
>>> >> >
>>> >> > I suppose intersection would check if op0 and op1 have equivalent 
>>> >> > ranges,
>>> >> > and added operand_equal_p check to ensure that there is only one
>>> >> > element within the range. Does that look correct ?
>>> >> > Bootstrap+test in progress on x86_64-unknown-linux-gnu.
>>> >>
>>> >> I think VRP is the wrong place to catch this and DOM should have but it
>>> >> does
>>> >>
>>> >> Optimizing block #3
>>> >>
>>> >> 1>>> STMT 1 = x_2(D) le_expr y_3(D)
>>> >> 1>>> STMT 1 = x_2(D) ge_expr y_3(D)
>>> >> 1>>> STMT 1 = x_2(D) eq_expr y_3(D)
>>> >> 1>>> STMT 0 = x_2(D) ne_expr y_3(D)
>>> >> 0>>> COPY x_2(D) = y_3(D)
>>> >> 0>>> COPY y_3(D) = x_2(D)
>>> >> Optimizing statement ret_4 = x_2(D) ^ y_3(D);
>>> >>   Replaced 'x_2(D)' with variable 'y_3(D)'
>>> >>   Replaced 'y_3(D)' with variable 'x_2(D)'
>>> >>   Folded to: ret_4 = x_2(D) ^ y_3(D);
>>> >> LKUP STMT ret_4 = x_2(D) bit_xor_expr y_3(D)
>>> >>
>>> >> heh, registering both equivalences is obviously not going to help...
>>> >>
>>> >> The 2nd equivalence is from doing
>>> >>
>>> >>   /* We already recorded that LHS = RHS, with canonicalization,
>>> >>  value chain following, etc.
>>> >>
>>> >>  We also want to record RHS = LHS, but without any
>>> >> canonicalization
>>> >>  or value chain following.  */
>>> >>   if (TREE_CODE (rhs) == SSA_NAME)
>>> >> const_and_copies->record_const_or_copy_raw (rhs, lhs,
>>> >> SSA_NAME_VALUE (rhs));
>>> >>
>>> >> generally recording both is not helpful.  Jeff?  This seems to be
>>> >> r233207 (fix for PR65917) which must have regressed this testcase.
>>> >
>>> > Just verified it works fine on the GCC 5 branch:
>>> >
>>> > Optimizing block #3
>>> >
>>> > 0>>> COPY y_3(D) = x_2(D)
>>> > 1>>> STMT 1 = x_2(D) le_expr y_3(D)
>>> > 1>>> STMT 1 = x_2(D) ge_expr y_3(D)
>>> > 1>>> STMT 1 = x_2(D) eq_expr y_3(D)
>>> > 1>>> STMT 0 = x_2(D) ne_expr y_3(D)
>>> > Optimizing statement ret_4 = x_2(D) ^ y_3(D);
>>> >   Replaced 'y_3(D)' with variable 'x_2(D)'
>>> > Applying pattern match.pd:240, gimple-match.c:11346
>>> > gimple_simplified to ret_4 = 0;
>>> >   Folded to: ret_4 = 0;
>>> I have reported it as PR71947.
>>> Could you help me point out how to fix this ?
>>
>> Not record both equivalences.  This might break the testcase it was
>> introduced for (obviously).  Which is why I CCed Jeff for his opinion.
> Well, folding happens for x - y, if x == y.
>
> int f(int x, int y)
> {
>   int ret;
>   if (x == y)
> ret = x - y;
>   else
> ret = 1;
>
>   return ret;
> }
Oops wrong test-case, the dump below is of the following test-case:
int f(int x, int y)
{
  int ret = 10;
  if (x == y)
ret = x  -  y;
  return ret;
}
>
> For the above test-case, extract_range_from_binary_expr_1()
> determines that range of ret = [0, 0]
> and propagates it.
>
> vrp1 dump:
> f (int x, int y)
> {
>   int ret;
>
>   :
>   if (x_2(D) == y_3(D))
> goto ;
>   else
> goto ;
>
>   :
>   ret_4 = x_2(D) - y_3(D);
>
>   :
>   # ret_1 = PHI <0(3), 10(2)>
>   return ret_1;
>
> }
>
> Then the dce pass removes ret_4 = x_2(D) - y_3(D) since it's
> redundant.
> However it appears vrp fails to notice the equality for the following 
> test-case,
> and sets range for ret to VARYING.
>
> int f(int x, int y, int a, int b)
> {
>   int ret = 10;
>   if (a == x
>   && b == y
>   && a == b)
> ret = x - y;
>
>   return ret;
> }
>
> Looking at the vrp dump, shows the following form
> after inserting ASSERT_EXPR:
>
> SSA form aft
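
Richard's point in the thread above, that recording both x = y and y = x
defeats the fold, can be shown with a deliberately tiny model. This is not
DOM's data structure: copy_table, substitute and xor_folds_to_zero are
hypothetical names, and operands are plain strings.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy copy table: maps a name to the name that should replace it.
using copy_table = std::map<std::string, std::string>;

// One raw lookup per operand, without following value chains.
static std::string substitute(const copy_table &t, const std::string &name)
{
  auto it = t.find(name);
  return it == t.end() ? name : it->second;
}

// "a ^ b" folds to 0 in this model only if both operands end up with the
// same name after substitution.
static bool xor_folds_to_zero(const copy_table &t,
                              const std::string &a, const std::string &b)
{
  return substitute(t, a) == substitute(t, b);
}
```

With only y -> x recorded, x ^ y becomes x ^ x and folds. With both x -> y and
y -> x recorded, the operands are merely swapped to y ^ x, which matches the
"Replaced ... / Folded to: ret_4 = x_2(D) ^ y_3(D)" lines in the dump quoted
above.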

Re: fold x ^ y to 0 if x == y

2016-07-20 Thread Prathamesh Kulkarni
On 20 July 2016 at 16:35, Richard Biener  wrote:
> On Wed, 20 Jul 2016, Prathamesh Kulkarni wrote:
>
>> On 8 July 2016 at 12:29, Richard Biener  wrote:
>> > On Fri, 8 Jul 2016, Richard Biener wrote:
>> >
>> >> On Fri, 8 Jul 2016, Prathamesh Kulkarni wrote:
>> >>
>> >> > Hi Richard,
>> >> > For the following test-case:
>> >> >
>> >> > int f(int x, int y)
>> >> > {
>> >> >int ret;
>> >> >
>> >> >if (x == y)
>> >> >  ret = x ^ y;
>> >> >else
>> >> >  ret = 1;
>> >> >
>> >> >return ret;
>> >> > }
>> >> >
>> >> > I was wondering if x ^ y should be folded to 0 since
>> >> > it's guarded by condition x == y ?
>> >> >
>> >> > optimized dump shows:
>> >> > f (int x, int y)
>> >> > {
>> >> >   int iftmp.0_1;
>> >> >   int iftmp.0_4;
>> >> >
>> >> >   :
>> >> >   if (x_2(D) == y_3(D))
>> >> > goto ;
>> >> >   else
>> >> > goto ;
>> >> >
>> >> >   :
>> >> >   iftmp.0_4 = x_2(D) ^ y_3(D);
>> >> >
>> >> >   :
>> >> >   # iftmp.0_1 = PHI 
>> >> >   return iftmp.0_1;
>> >> >
>> >> > }
>> >> >
>> >> > The attached patch tries to fold for above case.
>> >> > I am checking if op0 and op1 are equal using:
>> >> > if (bitmap_intersect_p (vr1->equiv, vr2->equiv)
>> >> >&& operand_equal_p (vr1->min, vr1->max)
>> >> >&& operand_equal_p (vr2->min, vr2->max))
>> >> >   { /* equal */ }
>> >> >
>> >> > I suppose intersection would check if op0 and op1 have equivalent 
>> >> > ranges,
>> >> > and added operand_equal_p check to ensure that there is only one
>> >> > element within the range. Does that look correct ?
>> >> > Bootstrap+test in progress on x86_64-unknown-linux-gnu.
>> >>
>> >> I think VRP is the wrong place to catch this and DOM should have but it
>> >> does
>> >>
>> >> Optimizing block #3
>> >>
>> >> 1>>> STMT 1 = x_2(D) le_expr y_3(D)
>> >> 1>>> STMT 1 = x_2(D) ge_expr y_3(D)
>> >> 1>>> STMT 1 = x_2(D) eq_expr y_3(D)
>> >> 1>>> STMT 0 = x_2(D) ne_expr y_3(D)
>> >> 0>>> COPY x_2(D) = y_3(D)
>> >> 0>>> COPY y_3(D) = x_2(D)
>> >> Optimizing statement ret_4 = x_2(D) ^ y_3(D);
>> >>   Replaced 'x_2(D)' with variable 'y_3(D)'
>> >>   Replaced 'y_3(D)' with variable 'x_2(D)'
>> >>   Folded to: ret_4 = x_2(D) ^ y_3(D);
>> >> LKUP STMT ret_4 = x_2(D) bit_xor_expr y_3(D)
>> >>
>> >> heh, registering both equivalences is obviously not going to help...
>> >>
>> >> The 2nd equivalence is from doing
>> >>
>> >>   /* We already recorded that LHS = RHS, with canonicalization,
>> >>  value chain following, etc.
>> >>
>> >>  We also want to record RHS = LHS, but without any
>> >> canonicalization
>> >>  or value chain following.  */
>> >>   if (TREE_CODE (rhs) == SSA_NAME)
>> >> const_and_copies->record_const_or_copy_raw (rhs, lhs,
>> >> SSA_NAME_VALUE (rhs));
>> >>
>> >> generally recording both is not helpful.  Jeff?  This seems to be
>> >> r233207 (fix for PR65917) which must have regressed this testcase.
>> >
>> > Just verified it works fine on the GCC 5 branch:
>> >
>> > Optimizing block #3
>> >
>> > 0>>> COPY y_3(D) = x_2(D)
>> > 1>>> STMT 1 = x_2(D) le_expr y_3(D)
>> > 1>>> STMT 1 = x_2(D) ge_expr y_3(D)
>> > 1>>> STMT 1 = x_2(D) eq_expr y_3(D)
>> > 1>>> STMT 0 = x_2(D) ne_expr y_3(D)
>> > Optimizing statement ret_4 = x_2(D) ^ y_3(D);
>> >   Replaced 'y_3(D)' with variable 'x_2(D)'
>> > Applying pattern match.pd:240, gimple-match.c:11346
>> > gimple_simplified to ret_4 = 0;
>> >   Folded to: ret_4 = 0;
>> I have reported it as PR71947.
>> Could you point out how to fix this?
>
> Not record both equivalences.  This might break the testcase it was
> introduced for (obviously).  Which is why I CCed Jeff for his opinion.
Well, folding happens for x - y, if x == y.

int f(int x, int y)
{
  int ret;
  if (x == y)
ret = x - y;
  else
ret = 1;

  return ret;
}

For the above test-case, extract_range_from_binary_expr_1()
determines that range of ret = [0, 0]
and propagates it.
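
To make the folding under discussion concrete, here is a minimal sketch in plain C. The function names are illustrative only and do not come from any patch: inside the guard x == y, both x ^ y and x - y are necessarily 0, so the guarded assignment can be replaced by a constant.

```c
#include <assert.h>

/* Shape of the test-case: the XOR is only evaluated when x == y.  */
int f_xor(int x, int y)
{
  return (x == y) ? (x ^ y) : 1;
}

/* Same shape for the subtraction variant.  */
int f_sub(int x, int y)
{
  return (x == y) ? (x - y) : 1;
}

/* What an optimizer may emit once it knows x == y inside the guard.  */
int f_folded(int x, int y)
{
  return (x == y) ? 0 : 1;
}

/* Checks that the folded form agrees with both originals for one input.  */
void check_fold(int x, int y)
{
  assert(f_xor(x, y) == f_folded(x, y));
  assert(f_sub(x, y) == f_folded(x, y));
}
```

Calling check_fold with equal and with unequal arguments exercises both arms of the condition.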

vrp1 dump:
f (int x, int y)
{
  int ret;

  :
  if (x_2(D) == y_3(D))
goto ;
  else
goto ;

  :
  ret_4 = x_2(D) - y_3(D);

  :
  # ret_1 = PHI <0(3), 10(2)>
  return ret_1;

}

Then the dce pass removes ret_4 = x_2(D) - y_3(D) since it's
redundant.
However it appears vrp fails to notice the equality for the following test-case,
and sets range for ret to VARYING.

int f(int x, int y, int a, int b)
{
  int ret = 10;
  if (a == x
  && b == y
  && a == b)
ret = x - y;

  return ret;
}

Looking at the vrp dump, shows the following form
after inserting ASSERT_EXPR:

SSA form after inserting ASSERT_EXPRs
f (int x, int y, int a, int b)
{
  int ret;
  _Bool _1;
  _Bool _2;
  _Bool _3;

  :
  _1 = a_5(D) == x_6(D);
  _2 = b_7(D) == y_8(D);
  _3 = _1 & _2;
  if (_3 != 0)
goto ;
  else
goto ;

  :
  a_11 = ASSERT_EXPR ;
  x_12 = ASSERT_EXPR ;
  b_13 = ASSERT_EXPR ;
  y_14 = ASSERT_EXPR ;
  if (a_11 == b_13)
goto ;
  else
goto ;

  :
  ret_9 = x_12 ^ y_14;

  :
  # ret_4 = PHI <10(2), 10(3), ret_

Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Jakub Jelinek
On Wed, Jul 20, 2016 at 09:50:03PM +, Bernd Edlinger wrote:
> But the built-in alloca is still recognized because the builtin
> does have ECF_MAY_BE_ALLOCA and ECF_MALLOC.

But __builtin_alloca_with_align likely doesn't have ECF_MALLOC set (even
when it should).

Jakub


Re: [PATCH]: Introduce HOST_WIDE_INT_0{,U}

2016-07-20 Thread Bernd Schmidt

On 07/20/2016 11:16 PM, Uros Bizjak wrote:

As suggested by Jakub.

2016-07-20  Uros Bizjak  

* hwint.h (HOST_WIDE_INT_0): New define.
(HOST_WIDE_INT_0U): Ditto.
* double-int.c: Use HOST_WIDE_INT_0 instead of (HOST_WIDE_INT) 0.
* dse.c: Use HOST_WIDE_INT_0U instead of (unsigned HOST_WIDE_INT) 0.
* simplify-rtx.c: Ditto.
* tree-object-size.c: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu. In addition, I
have checked that .o files didn't differ.

OK for mainline?


Seems slightly less useful than the others, but I guess it's fine too.


Bernd



Re: [PATCH][C++] C++ bitfield memory model for as-base classes

2016-07-20 Thread Jeff Law

On 06/29/2016 05:54 AM, Richard Biener wrote:


Currently as-base classes lack DECL_BIT_FIELD_REPRESENTATIVEs which
means RTL expansion doesn't honor the C++ memory model for bitfields
in them thus for the following testcase

struct B {
B() {}
int x;
int a : 6;
int b : 6;
int c : 6;
};

struct C : B {
char d;
};

C c;

int main()
{
  c.c = 1;
  c.d = 2;
}

on x86 we happily store to c.c in a way creating a store data race with
c.d:

main:
.LFB6:
.cfi_startproc
movl    c+4(%rip), %eax
andl    $-258049, %eax
orb     $16, %ah
movl    %eax, c+4(%rip)
movb    $2, c+7(%rip)
xorl    %eax, %eax
ret
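
The constants in the listing above can be checked by hand. The sketch below assumes the layout that the listing implies on little-endian x86 (a, b and c packed into bits 0-5, 6-11 and 12-17 of the 32-bit word at offset 4, and d in the byte at offset 7). The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { C_SHIFT = 12, C_WIDTH = 6 };

/* Mask covering the bits of field c inside the 32-bit word (0x3F000).  */
uint32_t c_field_mask(void)
{
  return ((UINT32_C(1) << C_WIDTH) - 1) << C_SHIFT;
}

/* The read-modify-write from the listing: clear c (andl), then set it
   (orb on %ah).  The whole 32-bit word is written back.  */
uint32_t store_c_wide(uint32_t word, uint32_t value)
{
  assert(value < (UINT32_C(1) << C_WIDTH));
  return (word & ~c_field_mask()) | (value << C_SHIFT);
}

/* First and last byte of the word that actually hold bits of c.  */
unsigned c_first_byte(void) { return C_SHIFT / 8; }
unsigned c_last_byte(void)  { return (C_SHIFT + C_WIDTH - 1) / 8; }
```

Under these assumptions c only occupies bytes 1 and 2 of the word (object offsets 5 and 6), while the 32-bit store also rewrites byte 3 (object offset 7), which is where c.d lives. That extra byte is the store data race.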

Fixing the lack of DECL_BIT_FIELD_REPRESENTATIVEs in as-base
classes doesn't help though as the C++ FE builds access trees
for c.c using the non-as-base class FIELD_DECLs which is because
of layout_class_type doing

  /* Now that we're done with layout, give the base fields the real types.  */
  for (field = TYPE_FIELDS (t); field; field = DECL_CHAIN (field))
if (DECL_ARTIFICIAL (field) && IS_FAKE_BASE_TYPE (TREE_TYPE (field)))
  TREE_TYPE (field) = TYPE_CONTEXT (TREE_TYPE (field));

this would basically require us to always treat tail-padding in a
struct conservatively in finish_bitfield_representative (according
to the doubt by the ??? comment I patch out below).

Simply commenting out the above makes fixing build_simple_base_path
necessary but even after that it then complains in the verifier
later ("type mismatch in component reference" - as-base to class
assignment).

But it still somehow ends up using the wrong FIELD_DECL in the end.

Now I think we need to fix the wrong-code issue somehow and
doing so in stor-layout.c by conservatively treating tail-padding
is a possibility.  But that will pessimize PODs and other languages
unless we have a way to know whether a RECORD_TYPE possibly can
have its tail-padding re-used (I'd hate to put a lang_hooks.name
check there and even that would pessimize C++ PODs).

Any guidance here?
Note that if we change tail-padding re-use properties, then we 
effectively have an ABI change.  Given that, the only path forward is to 
use smaller memory operations.


Do any other compilers get this right (LLVM?)

jeff




Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Bernd Edlinger
On 07/20/16 20:08, Richard Biener wrote:
> On July 20, 2016 6:54:48 PM GMT+02:00, Bernd Edlinger 
>  wrote:
>>
>> But I think that alloca just should not be recognized by name any
>> more.
>
> It was introduced to mark calls that should not be duplicated by inlining or 
> unrolling to avoid increasing stack usage too much.  Sth worthwhile to keep 
> even with -ffreestanding.
>
> Richard.
>

On second thought I start to think that an external alloca function
might still work.  And returning ECF_MAY_BE_ALLOCA just based on the
name could be made safe by checking the malloc attribute at the right
places.

With this new incremental patch the example

extern "C"
void *alloca(unsigned long);
void bar(unsigned long n)
{
   char *x = (char*) alloca(n);
   if (x)
 *x = 0;
}

might actually work when -ansi is used,
i.e. it does no longer assume that alloca cannot return null,
but still creates a frame pointer, which it would not have done
for allocb for instance.

But the built-in alloca is still recognized because the builtin
does have ECF_MAY_BE_ALLOCA and ECF_MALLOC.


Is it OK for trunk after boot-strap and reg-testing?


Thanks
Bernd.
2016-07-19  Bernd Edlinger  

	PR middle-end/71876
	* fold-const.c (tree_expr_nonzero_warnv_p): Check for real built-in
	alloca.
	* tree-vrp.c (gimple_stmt_nonzero_warnv_p): Likewise.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 238513)
+++ gcc/fold-const.c	(working copy)
@@ -9018,7 +9018,7 @@ tree_expr_nonzero_warnv_p (tree t, bool *strict_ov
 	&& lookup_attribute ("returns_nonnull",
 		 TYPE_ATTRIBUTES (TREE_TYPE (fndecl
 	  return true;
-	return alloca_call_p (t);
+	return alloca_call_p (t) && DECL_IS_MALLOC (fndecl);
   }
 
 default:
Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c	(revision 238513)
+++ gcc/tree-vrp.c	(working copy)
@@ -1065,7 +1065,7 @@ gimple_stmt_nonzero_warnv_p (gimple *stmt, bool *s
 	lookup_attribute ("returns_nonnull",
 			  TYPE_ATTRIBUTES (gimple_call_fntype (stmt
 	  return true;
-	return gimple_alloca_call_p (stmt);
+	return gimple_alloca_call_p (stmt) && DECL_IS_MALLOC (fndecl);
   }
 default:
   gcc_unreachable ();


Re: [PATCH GCC]Vectorize possible infinite loops by versioning

2016-07-20 Thread Jeff Law

On 06/28/2016 12:18 AM, Bin Cheng wrote:

Hi,
This patch improves vectorizer in order to handle possible infinite loops by 
versioning.  Its changes fall in three categories.
A) Changes in vect_get_loop_niters.  At the moment, it computes niter using 
number_of_executions_latch, in this way the assumption is discarded and loop 
not vectorized.  To fix the issue, we need assumption information from niter 
analyzer and use that as a break condition to version the loop.  This patch 
uses newly introduced interface number_of_iterations_exit_assumptions and 
passes assumptions all the way to vect_analyze_loop_form.  The assumptions will 
be finally recorded in LOOP_VINFO_NITERS_ASSUMPTIONS.
B) It sets and clears flag LOOP_F_ASSUMPTIONS for loop.  The flag is important 
because during checking if a loop can be vectorized (with versioning), all data 
references need to be analyzed by assuming LOOP_VINFO_NITERS_ASSUMPTIONS is 
TRUE.  Otherwise it's very likely address expression of data reference won't be 
identified as SCEV and vectorization would fail.  With this flag set to TRUE, 
niter analyzer will bypass assumptions recorded in LOOP_VINFO_NITERS_ASSUMPTIONS.  
I also keep this flag for versioned loop because the assumption is guaranteed 
to be TRUE after versioning.  For now, I didn't copy these flags in 
copy_loop_info, but I think this can be done so that the flags can be inherited 
by peeled pre/post loop.  Maybe in follow up patches.  Also it's possible to 
turn other bool fields into flags in the future?
C) This patch uses existing infrastructure to version a loop against 
LOOP_VINFO_NITERS_ASSUMPTIONS, just like for alignment or alias check.  The 
change is straightforward, however, I did refactoring to versioning related 
macros hoping the code would be cleaner.
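
As a rough illustration of what versioning against a niter assumption means, here is a hand-written C sketch. It is not the vectorizer's output and not the PR's test-case; the loop and the function names are assumptions chosen for the example. The loop "i <= n" with an unsigned n never terminates when n == UINT_MAX, so the iteration count n + 1 is only valid under the assumption n != UINT_MAX.

```c
#include <assert.h>
#include <limits.h>

/* Loop as written: possibly infinite, because i <= n is always true
   when n == UINT_MAX.  */
unsigned sum_inclusive_reference(const unsigned char *a, unsigned n)
{
  unsigned sum = 0;
  for (unsigned i = 0; i <= n; i++)
    sum += a[i];
  return sum;
}

/* Versioned form: the assumption becomes a run-time check.  */
unsigned sum_inclusive_versioned(const unsigned char *a, unsigned n)
{
  if (n != UINT_MAX)
    {
      /* Assumption holds: exactly n + 1 iterations, no wrap-around.
         A loop of this shape has a computable trip count.  */
      unsigned niters = n + 1;
      unsigned sum = 0;
      for (unsigned k = 0; k < niters; k++)
        sum += a[k];
      return sum;
    }
  /* Assumption fails: fall back to the original loop unchanged.  */
  return sum_inclusive_reference(a, n);
}

/* Both versions agree whenever the assumption holds.  */
void check_versions_agree(const unsigned char *a, unsigned n)
{
  assert(n != UINT_MAX);
  assert(sum_inclusive_versioned(a, n) == sum_inclusive_reference(a, n));
}
```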

Bootstrap and test along with previous niter patches on x86_64 and AArch64.  Is 
it OK?
So I have one high level concern -- how (if at all) does this interact 
with Ilya's changes to vectorize loop tails that are just about through 
the review process?


Related -- I see that you throw away the SCEV/iteration knowledge then 
analyze the loop using the given assumptions, then eventually throw that 
information away.  Which sounds generally reasonable -- except for one 
potential issue -- does anything still want to look at the original 
SCEV/iteration information (that we've lost)?  I'm assuming no since you 
didn't try to restore it and we pass the testsuite with your change.





Thanks,
bin

2016-06-27  Bin Cheng  

PR tree-optimization/57558
* tree-vect-loop-manip.c (vect_create_cond_for_niters_checks): New
function.
(vect_loop_versioning): Support versioning with niter assumptions.
* tree-vect-loop.c (tree-ssa-loop.h): Include new header file.
(vect_get_loop_niters): New parameter.  Reimplement to support
assumptions in loop niter info.
(vect_analyze_loop_form_1, vect_analyze_loop_form): Ditto.
(new_loop_vec_info): Init LOOP_VINFO_NITERS_ASSUMPTIONS.
(vect_estimate_min_profitable_iters): Use LOOP_REQUIRES_VERSIONING.
* tree-vectorizer.c (vect_free_loop_info_assumptions): New function.
(vectorize_loops): Free loop niter info for loops with flag
LOOP_F_ASSUMPTIONS set.
* tree-vectorizer.h (struct _loorefactoredp_vec_info): New field

Typo in the ChangeLog entry.


num_iters_assumptions.
(LOOP_VINFO_NITERS_ASSUMPTIONS): New macro.
(LOOP_REQUIRES_VERSIONING_FOR_NITERS): New macro.
(LOOP_REQUIRES_VERSIONING): New macro.
(vect_free_loop_info_assumptions): New decl.

gcc/testsuite/ChangeLog
2016-06-27  Bin Cheng  

* gcc.dg/vect/pr57558-1.c: New test.
* gcc.dg/vect/pr57558-2.c: New test.
I was rather surprised at how simple supporting this case was.  The two 
high level questions above are the only things I'm worried about.  Let's 
figure out the answers to those questions before we OK this for the trunk.

Jeff



[PATCH]: Introduce HOST_WIDE_INT_0{,U}

2016-07-20 Thread Uros Bizjak
As suggested by Jakub.

2016-07-20  Uros Bizjak  

* hwint.h (HOST_WIDE_INT_0): New define.
(HOST_WIDE_INT_0U): Ditto.
* double-int.c: Use HOST_WIDE_INT_0 instead of (HOST_WIDE_INT) 0.
* dse.c: Use HOST_WIDE_INT_0U instead of (unsigned HOST_WIDE_INT) 0.
* simplify-rtx.c: Ditto.
* tree-object-size.c: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu. In addition, I
have checked that .o files didn't differ.

OK for mainline?

Uros.
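
For readers without hwint.h at hand, the following sketch mirrors the two new defines with stand-in definitions. It assumes HOST_WIDE_INT is long long and that HOST_WIDE_INT_C appends an LL suffix; the real header picks the type and suffix per host, so treat this as an illustration only.

```c
#include <assert.h>

/* Stand-ins for the hwint.h machinery, assuming a long long host type.  */
#define HOST_WIDE_INT long long
#define HOST_WIDE_INT_C(X) X ## LL
#define HOST_WIDE_INT_UC(X) HOST_WIDE_INT_C (X ## U)

/* The two defines added by the patch.  */
#define HOST_WIDE_INT_0 HOST_WIDE_INT_C (0)
#define HOST_WIDE_INT_0U HOST_WIDE_INT_UC (0)

/* Returns 1 if the signed zero has the signed wide type (C11 _Generic).  */
int zero_is_signed_wide(void)
{
  return _Generic(HOST_WIDE_INT_0, long long: 1, default: 0);
}

/* Returns 1 if the unsigned zero has the unsigned wide type.  */
int zero_u_is_unsigned_wide(void)
{
  return _Generic(HOST_WIDE_INT_0U, unsigned long long: 1, default: 0);
}

/* The macros are drop-in replacements for the casts they supersede.  */
void check_same_as_casts(void)
{
  assert(HOST_WIDE_INT_0 == (HOST_WIDE_INT) 0);
  assert(HOST_WIDE_INT_0U == (unsigned HOST_WIDE_INT) 0);
}
```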
Index: double-int.c
===
--- double-int.c(revision 238538)
+++ double-int.c(working copy)
@@ -557,7 +557,7 @@ div_and_round_double (unsigned code, int uns,
 case CEIL_MOD_EXPR:/* round toward positive infinity */
   if (!quo_neg && (*lrem != 0 || *hrem != 0))  /* ratio > 0 && rem != 0 */
{
- add_double (*lquo, *hquo, HOST_WIDE_INT_1, (HOST_WIDE_INT) 0,
+ add_double (*lquo, *hquo, HOST_WIDE_INT_1, HOST_WIDE_INT_0,
  lquo, hquo);
}
   else
@@ -593,7 +593,7 @@ div_and_round_double (unsigned code, int uns,
  HOST_WIDE_INT_M1, HOST_WIDE_INT_M1, lquo, hquo);
else
  /* quo = quo + 1; */
- add_double (*lquo, *hquo, HOST_WIDE_INT_1, (HOST_WIDE_INT) 0,
+ add_double (*lquo, *hquo, HOST_WIDE_INT_1, HOST_WIDE_INT_0,
  lquo, hquo);
  }
else
Index: dse.c
===
--- dse.c   (revision 238538)
+++ dse.c   (working copy)
@@ -1217,7 +1217,7 @@ set_all_positions_unneeded (store_info *s_info)
   s_info->positions_needed.large.count = end;
 }
   else
-s_info->positions_needed.small_bitmask = (unsigned HOST_WIDE_INT) 0;
+s_info->positions_needed.small_bitmask = HOST_WIDE_INT_0U;
 }
 
 /* Return TRUE if any bytes from S_INFO store are needed.  */
@@ -1229,8 +1229,7 @@ any_positions_needed_p (store_info *s_info)
 return (s_info->positions_needed.large.count
< s_info->end - s_info->begin);
   else
-return (s_info->positions_needed.small_bitmask
-   != (unsigned HOST_WIDE_INT) 0);
+return (s_info->positions_needed.small_bitmask != HOST_WIDE_INT_0U);
 }
 
 /* Return TRUE if all bytes START through START+WIDTH-1 from S_INFO
Index: hwint.h
===
--- hwint.h (revision 238538)
+++ hwint.h (working copy)
@@ -63,6 +63,8 @@ extern char sizeof_long_long_must_be_8[sizeof (lon
 #endif
 
 #define HOST_WIDE_INT_UC(X) HOST_WIDE_INT_C (X ## U)
+#define HOST_WIDE_INT_0 HOST_WIDE_INT_C (0)
+#define HOST_WIDE_INT_0U HOST_WIDE_INT_UC (0)
 #define HOST_WIDE_INT_1 HOST_WIDE_INT_C (1)
 #define HOST_WIDE_INT_1U HOST_WIDE_INT_UC (1)
 #define HOST_WIDE_INT_M1 HOST_WIDE_INT_C (-1)
Index: simplify-rtx.c
===
--- simplify-rtx.c  (revision 238538)
+++ simplify-rtx.c  (working copy)
@@ -40,7 +40,7 @@ along with GCC; see the file COPYING3.  If not see
occasionally need to sign extend from low to high as if low were a
signed wide int.  */
 #define HWI_SIGN_EXTEND(low) \
- ((((HOST_WIDE_INT) low) < 0) ? HOST_WIDE_INT_M1 : ((HOST_WIDE_INT) 0))
+  ((((HOST_WIDE_INT) low) < 0) ? HOST_WIDE_INT_M1 : HOST_WIDE_INT_0)
 
 static rtx neg_const_int (machine_mode, const_rtx);
 static bool plus_minus_operand_p (const_rtx);
Index: tree-object-size.c
===
--- tree-object-size.c  (revision 238538)
+++ tree-object-size.c  (working copy)
@@ -738,7 +738,7 @@ merge_object_sizes (struct object_size_info *osi,
   orig_bytes = object_sizes[object_size_type][SSA_NAME_VERSION (orig)];
   if (orig_bytes != unknown[object_size_type])
 orig_bytes = (offset > orig_bytes)
-? (unsigned HOST_WIDE_INT) 0 : orig_bytes - offset;
+? HOST_WIDE_INT_0U : orig_bytes - offset;
 
   if ((object_size_type & 2) == 0)
 {


[C++ PATCH] Fix up genericization ICE (PR c++/71941)

2016-07-20 Thread Jakub Jelinek
Hi!

In PR69315, we've recently allowed recursive genericization, unfortunately
the bc_label handling isn't prepared for that, we ICE if we cp_genericize
some function (usually newly instantiated method) while inside of some loop
in the outer function.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk/6.2?

2016-07-20  Jakub Jelinek  

PR c++/71941
* cp-gimplify.c (cp_genericize): For nested cp_genericize calls
save/restore bc_label array.

* g++.dg/gomp/pr71941.C: New test.

--- gcc/cp/cp-gimplify.c.jj 2016-07-18 20:44:59.0 +0200
+++ gcc/cp/cp-gimplify.c2016-07-20 12:14:28.203381211 +0200
@@ -1632,6 +1632,13 @@ cp_genericize (tree fndecl)
   if (DECL_CLONED_FUNCTION_P (fndecl))
 return;
 
+  /* Allow cp_genericize calls to be nested.  */
+  tree save_bc_label[2];
+  save_bc_label[bc_break] = bc_label[bc_break];
+  save_bc_label[bc_continue] = bc_label[bc_continue];
+  bc_label[bc_break] = NULL_TREE;
+  bc_label[bc_continue] = NULL_TREE;
+
   /* Expand all the array notations here.  */
   if (flag_cilkplus 
   && contains_array_notation_expr (DECL_SAVED_TREE (fndecl)))
@@ -1651,6 +1658,8 @@ cp_genericize (tree fndecl)
 
   gcc_assert (bc_label[bc_break] == NULL);
   gcc_assert (bc_label[bc_continue] == NULL);
+  bc_label[bc_break] = save_bc_label[bc_break];
+  bc_label[bc_continue] = save_bc_label[bc_continue];
 }
 
 /* Build code to apply FN to each member of ARG1 and ARG2.  FN may be
--- gcc/testsuite/g++.dg/gomp/pr71941.C.jj  2016-07-20 12:11:29.793638764 +0200
+++ gcc/testsuite/g++.dg/gomp/pr71941.C 2016-07-20 12:11:14.0 +0200
@@ -0,0 +1,22 @@
+// PR c++/71941
+// { dg-do compile }
+// { dg-options "-fopenmp" }
+
+struct A { A (); A (A &); ~A (); };
+
+template 
+struct B
+{
+  struct C { A a; C () : a () {} };
+  C c;
+  void foo ();
+};
+
+void
+bar ()
+{
+  B<0> b;
+#pragma omp task
+  for (int i = 0; i < 2; i++)
+b.foo ();
+}

Jakub
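
The save/restore idea in the cp-gimplify.c hunk can be sketched outside of GCC as follows. This is a simplified model, not the real code: labels are plain strings instead of trees, and all names ending in _sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum bc_kind { BC_BREAK, BC_CONTINUE };

/* Global "current loop labels" state, as in the real pass.  */
static const char *bc_label[2];

/* Models a pass that can be re-entered while the outer function is in
   the middle of a loop, i.e. while bc_label is non-null.  */
void genericize_sketch(int nested_depth)
{
  /* Allow nested calls: save the caller's state and start clean.  */
  const char *saved[2] = { bc_label[BC_BREAK], bc_label[BC_CONTINUE] };
  bc_label[BC_BREAK] = NULL;
  bc_label[BC_CONTINUE] = NULL;

  /* Pretend to enter a loop body, where a nested call may happen.  */
  bc_label[BC_BREAK] = "break_label";
  bc_label[BC_CONTINUE] = "continue_label";
  if (nested_depth > 0)
    genericize_sketch(nested_depth - 1);
  bc_label[BC_BREAK] = NULL;
  bc_label[BC_CONTINUE] = NULL;

  /* Same invariant as the gcc_assert calls in the patch context.  */
  assert(bc_label[BC_BREAK] == NULL);
  assert(bc_label[BC_CONTINUE] == NULL);

  /* Hand the caller's state back.  */
  bc_label[BC_BREAK] = saved[BC_BREAK];
  bc_label[BC_CONTINUE] = saved[BC_CONTINUE];
}

/* Returns 1 if the outer labels survive nested calls.  */
int nested_call_preserves_labels(void)
{
  static const char outer_break[] = "outer_break";
  static const char outer_continue[] = "outer_continue";
  bc_label[BC_BREAK] = outer_break;
  bc_label[BC_CONTINUE] = outer_continue;
  genericize_sketch(2);
  return bc_label[BC_BREAK] == outer_break
         && bc_label[BC_CONTINUE] == outer_continue;
}
```

Without the save step, the first assertion would fire in the nested call, which corresponds to the ICE described above.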


Re: [PATCH] target lib tests with build sysroot PR testsuite/71931

2016-07-20 Thread Jeff Law

On 07/20/2016 08:04 AM, Szabolcs Nagy wrote:

Fix target library tests when gcc is built using --with-build-sysroot.

The dejagnu find_gcc function cannot handle if CC needs extra flags
like --sysroot. So for testing target libraries use the same CC that
was used for building the target libs. This change assumes the test
is run from make.

Another approach would be to pass down the sysroot flags
separately and add
set TEST_ALWAYS_FLAGS "$(SYSROOT_CFLAGS_FOR_TARGET)"
to site.exp like the gcc site.exp does, but that's more
changes.
But isn't all this supposed to still work if someone invokes runtest 
directly?


Can we get the right magic into the generated site.exp, which might 
resolve these issues?


jeff



Re: [PATCH] Fix source locations of bad enum values (PR c/71610 and PR c/71613)

2016-07-20 Thread Jason Merrill
On Thu, Jun 30, 2016 at 1:49 PM, Jason Merrill  wrote:
> This needs a template testcase.

Did you get this reply before?  It bounced from the mailing list, but
I thought you would have gotten it directly.

Jason


Re: [PATCH] Fix source locations of bad enum values (PR c/71610 and PR c/71613)

2016-07-20 Thread Jeff Law

On 06/22/2016 08:52 PM, David Malcolm wrote:

PR c/71613 identifies a problem where we fail to report this enum:

  enum { e1 = LLONG_MIN };

with -pedantic, due to LLONG_MIN being inside a system header.

This patch updates the C and C++ frontends to use the location of the
name as the primary location in the diagnostic, supplying the location
of the value as a secondary location, fixing the issue.

Before:
  $ gcc -c /tmp/test.c -Wpedantic
  /tmp/test.c: In function 'main':
  /tmp/test.c:3:14: warning: ISO C restricts enumerator values to range of 
'int' [-Wpedantic]
 enum { c = -30 };
^

After:
  $ ./xgcc -B. -c /tmp/test.c -Wpedantic
  /tmp/test.c: In function 'main':
  /tmp/test.c:3:10: warning: ISO C restricts enumerator values to range of 
'int' [-Wpedantic]
 enum { c = -30 };
^   ~~~

Successfully bootstrapped and regression-tested on x86_64-pc-linux-gnu;
adds 13 PASS results to gcc.sum and 9 PASS results to g++.sum.

OK for trunk?

gcc/c/ChangeLog:
PR c/71610
PR c/71613
* c-decl.c (build_enumerator): Fix description of LOC in comment.
Update diagnostics to use a rich_location at decl_loc, rather than
at loc, adding loc as a secondary range if available.
* c-parser.c (c_parser_enum_specifier): Use the full location of
the expression for value_loc, rather than just the first token.

gcc/cp/ChangeLog:
PR c/71610
PR c/71613
* cp-tree.h (build_enumerator): Add location_t param.
* decl.c (build_enumerator): Add "value_loc" param.  Update
"not an integer constant" diagnostic to use "loc" rather than
input_location, and to add "value_loc" as a secondary range if
available.
* parser.c (cp_parser_enumerator_definition): Extract the
location of the value from the cp_expr for the constant
expression, if any, and pass it to build_enumerator.
* pt.c (tsubst_enum): Extract EXPR_LOCATION of the value,
and pass it to build_enumerator.

gcc/ChangeLog:
PR c/71610
PR c/71613
* diagnostic-core.h (pedwarn_at_rich_loc): New prototype.
* diagnostic.c (pedwarn_at_rich_loc): New function.

gcc/testsuite/ChangeLog:
PR c/71610
PR c/71613
* c-c++-common/pr71610.c: New test case.
* gcc.dg/c90-const-expr-8.c: Update expected column of diagnostic.
* gcc.dg/pr71610-2.c: New test case.
* gcc.dg/pr71613.c: New test case.

OK.

jeff



Re: [PING**2] [PATCH] Fix asm X constraint (PR inline-asm/59155)

2016-07-20 Thread Jeff Law

On 06/22/2016 02:48 PM, Bernd Edlinger wrote:

On 06/22/16 21:51, Jeff Law wrote:

On 06/19/2016 07:25 AM, Bernd Edlinger wrote:

Hi,

ping...

As this discussion did not make any progress, I just attached
the latest version of my patch with the changes that
Vladimir proposed.

Boot-strapped and reg-tested again on x86_64-linux-gnu.
Is it OK for the trunk?

Well, I don't think we've got any kind of consensus on whether or not
this is reasonable.

The fundamental issue is that "X" is supposed to accept anything,
literally anything.  That implies it's really the downstream users of
those operands that are broken.



Hmm...

I think it must be pretty easy to write something in a .md file with the
X constraint that ends up in an ICE, right?

Probably not terribly hard.



But in an .md file we have much more control on what happens.
That's why I did not propose to change the meaning of "X" in .md files.
We have control over RTL generation, operand predicates and the like. 
And those are how we control things like combine.




And we only have problems with asm statements that use "X" constraints.
But I'd disagree.  I think we could easily have problems with "X" 
constraints in the MD file.  But the most common uses of "X" probably 
don't try to refer to that operand in the output string and use good 
predicates.


And that's one of the key differences here.  In an MD file the operand 
predicate has to pass -- that's not the case in an ASM.  The operand 
predicate allows the backend to prevent all kinds of things from showing up.




But I think we have a use case where "X" means really more possible
registers (i.e. includes ss2, mmx etc.) than "g" (only general
registers).  Otherwise, in the test cases of pr59155 we would not
have any benefit for using "+X" instead of "+g" or "+r".

Does that sound reasonable?
If it's the case that the real benefit of +X is that it's allowing more 
registers, then that argues that the backend ought to be providing 
another (larger) register class.



jeff


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Eric Botcazou
> Very few targets continue to use SJLJ eh (perhaps just cygwin/mingw).
> *But* I think the Ada front-end explicitly uses SJLJ EH, so if you want
> to get some smoke testing, the Ada testsuite is probably the place to go.

Right, the Ada front-end uses an EH scheme directly based on __builtin_setjmp 
(which is similar but distinct from the regular SJLJ EH because the front-end 
directly manages the SJLJ buffers) for internal EH.  Note that it's on the 
host only, for the target it uses the same EH scheme as C++/Java/etc.

-- 
Eric Botcazou


Re: [PATCH build/doc] Replacing libiberty with gnulib

2016-07-20 Thread Manuel López-Ibáñez
On 20 July 2016 at 19:21, ayush goel  wrote:
> Hey,
> As a first step of my GSOC project
> (https://gcc.gnu.org/wiki/replacelibibertywithgnulib) I have imported
> the gnulib library inside the gcc tree. I have created gnulib as a top
> level directory which contains the necessary scripts to import the
> modules. It also contains the necessary Makefile.in and configure.ac
> files.

Looks good to me, but I cannot approve it. Joseph, what do you think?

Minor nit: It should be as follows (you can also use the script contrib/mklog )

2016-07-20 Ayush Goel 

* Makefile.def: Add gnulib as build & host library and dependency of
all-gcc on gnulib.
...


Re: [PATCH] RFC: On-demand locations within string-literals

2016-07-20 Thread David Malcolm
On Fri, 2016-07-08 at 17:49 -0400, David Malcolm wrote:
[...]

> Also, this patch currently makes the assumption (in charset.c)
> that there's a 1:1 correspondence between bytes in the source
> character set and bytes in the execution character set.  This can
> be the case if both are, say, UTF-8, but might not hold in
> general.
> 
> The source char set is UTF-8 or UTF-EBCDIC, and safe-ctype.c has:
> 
> # if HOST_CHARSET == HOST_CHARSET_EBCDIC
>   #error "FIXME: write tables for EBCDIC"
> 
> so presumably we don't actually have any hosts that support EBCDIC
> (do we?); as far as I can tell, we only currently support UTF-8
> as the source char set.
> 
> Similarly, do we support any targets for which the execution
> character set is *not* UTF-8?

I brought this up in this thread on the gcc mailing list:
"gcc/libcpp: non-UTF-8 source or execution encodings?"
  https://gcc.gnu.org/ml/gcc/2016-07/msg00091.html
and in particular:
  https://gcc.gnu.org/ml/gcc/2016-07/msg00106.html
it's possible to select the execution char set at the command
line for C-family frontends using:
  -fexec-charset=
  -fwide-exec-charset=
e.g. "-fexec-charset=IBM1047" will give one of the variants of EBCDIC.

Given that the internal interface already has a failure mode, I'm
thinking that a reasonable restriction is to only support locations
within string literals for the case where source character set ==
execution character set, and hence we have "convert_no_conversion" as
the converter.  Does that sound sane?  (I can write test coverage for
this).
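
A small sketch of why the 1:1 byte assumption breaks once the two character sets differ. ISO-8859-1 is picked here purely as an example of a non-UTF-8 execution character set; the byte strings are spelled out explicitly so the example does not depend on how this file itself is encoded.

```c
#include <assert.h>
#include <string.h>

/* U+00E9 (e with acute accent) in two encodings, as explicit bytes.  */
static const char e_acute_utf8[]   = "\xC3\xA9";  /* two bytes */
static const char e_acute_latin1[] = "\xE9";      /* one byte  */

/* Number of bytes the encoded text occupies.  */
size_t encoded_length(const char *bytes)
{
  return strlen(bytes);
}

/* Returns 1 when byte offsets in one encoding can be reused as-is in
   the other for this text, i.e. when both occupy the same byte count.  */
int offsets_line_up(const char *source_bytes, const char *exec_bytes)
{
  return encoded_length(source_bytes) == encoded_length(exec_bytes);
}

void check_charset_example(void)
{
  /* Identical encodings trivially line up ...  */
  assert(offsets_line_up(e_acute_utf8, e_acute_utf8));
  /* ... but UTF-8 source and Latin-1 execution bytes do not.  */
  assert(!offsets_line_up(e_acute_utf8, e_acute_latin1));
}
```

Any character after the accented letter sits at a different byte offset in the two encodings, so a location computed from execution-charset bytes cannot simply be mapped back to a source column.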

[...]


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Bernd Edlinger
On 07/20/16 21:00, Jeff Law wrote:
> On 07/20/2016 10:30 AM, Bernd Edlinger wrote:
>> On 07/20/16 18:15, Jeff Law wrote:
>>> On 07/20/2016 05:53 AM, Richard Biener wrote:
>>>>> Is it OK after boot-strap and regression-testing?
>>>>
>>>> I think the __builtin_setjmp change is wrong - __builtin_setjmp is
>>>> _not_ 'setjmp' it is part of the GCC internal machinery (using setjmp
>>>> and longjmp in the end) for SJLJ exception handling.
>>>>
>>>> Am I correct Eric?
>>> That is correct.  __builtin_setjmp (and friends) are part of the SJLJ
>>> exception handling code.   They use a fixed sized buffer (5 words) to
>>> store the key items (as opposed to the OS defined jmp_buf structure
>>> which is usually considerably larger).
>>>
>>> jeff
>>
>> Yes. __builtin_setjmp is declared in builtins.def:
>>
>> DEF_GCC_BUILTIN(BUILT_IN_SETJMP, "setjmp", BT_FN_INT_PTR,
>> ATTR_NOTHROW_LEAF_LIST)
>>
>> It is visible in C as __builtin_setjmp, and special_function_p
>> adds the ECF_RETURNS_TWICE | ECF_LEAF.
>>
>> So it becomes equivalent to this:
>>
>> int __builtin_setjmp(void*) __attribute__((returns_twice, nothrow,
>> leaf))
>>
>> after special_function_p does its magic.
>>
>> If I remove the recognition of "__builtin_" from special_function_p
>> I have to add the returns_twice attribute in the DEF_GCC_BUILTIN.
>> Otherwise, I would get wrong code on all platforms, because
>> __builtin_setjmp saves only IP, SP, and FP registers.
>>
>> Everything in the normal test suite keeps on going with the patch,
>> but is there anything that I have to do to make sure that the
>> SJLJ eh is still working? It is not the default on x86_64, right?
> Very few targets continue to use SJLJ eh (perhaps just cygwin/mingw).
> *But* I think the Ada front-end explicitly uses SJLJ EH, so if you want
> to get some smoke testing, the Ada testsuite is probably the place to go.
>
> Jeff

Good.  I always include ada and go, just to be on the safe side.

The reg-test of the __builtin_setjmp patch is still running but the
ada part is already complete:

 === acats Summary ===
# of expected passes            2320
# of unexpected failures        0
Native configuration is x86_64-pc-linux-gnu
 === gnat tests ===


Running target unix
FAIL: gnat.dg/vect3.adb scan-tree-dump-times vect "vectorized 1 loops" 15
FAIL: gnat.dg/vect6.adb scan-tree-dump-times vect "vectorized 1 loops" 15

 === gnat Summary ===

# of expected passes            2511
# of unexpected failures        2
# of expected failures          22
# of unsupported tests          3
/home/ed/gnu/gcc-build/gcc/gnatmake version 7.0.0 20160720 (experimental)

The failures have already been there for a few months.



Bernd.


Re: [Fortran, Patch] First patch for coarray FAILED IMAGES (TS 18508)

2016-07-20 Thread Mikael Morin

Le 20/07/2016 à 11:39, Andre Vehreschild a écrit :

Hi Mikael,



+  if(st == ST_FAIL_IMAGE)
+new_st.op = EXEC_FAIL_IMAGE;
+  else
+gcc_unreachable();

You can use
gcc_assert (st == ST_FAIL_IMAGE);
foo...;
instead of
if (st == ST_FAIL_IMAGE)
foo...;
else
gcc_unreachable ();


Be careful, this is not 100% identical in the general case. For older
gcc versions (gcc < 4008) gcc_assert() is mapped to nothing, esp. not to
an abort(), so the behavior can change. But in this case everything is
fine, because the patch is most likely not backported.
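
The subtle difference can be sketched like this. SKETCH_ASSERT is a stand-in that models an assert macro mapped to nothing, as described above for older host compilers; it is not GCC's gcc_assert, and the enum values and function names are illustrative.

```c
#include <assert.h>

/* Models an assertion macro that expands to nothing.  */
#define SKETCH_ASSERT(expr) ((void) 0)

enum stmt_sketch { ST_FAIL_IMAGE_SKETCH, ST_OTHER_SKETCH };
enum op_sketch   { EXEC_NOP_SKETCH, EXEC_FAIL_IMAGE_SKETCH };

/* Set when the "unreachable" arm is actually reached.  */
static int reached_unreachable;

/* Assert form: with the assert compiled out, the assignment happens
   for any statement kind.  */
enum op_sketch with_assert(enum stmt_sketch st)
{
  SKETCH_ASSERT(st == ST_FAIL_IMAGE_SKETCH);
  return EXEC_FAIL_IMAGE_SKETCH;
}

/* Branch form: an unexpected statement kind takes the else arm.  */
enum op_sketch with_branch(enum stmt_sketch st)
{
  if (st == ST_FAIL_IMAGE_SKETCH)
    return EXEC_FAIL_IMAGE_SKETCH;
  reached_unreachable = 1;   /* stands in for gcc_unreachable () */
  return EXEC_NOP_SKETCH;
}

/* For the expected input the two forms behave identically.  */
void check_expected_input(void)
{
  assert(with_assert(ST_FAIL_IMAGE_SKETCH)
         == with_branch(ST_FAIL_IMAGE_SKETCH));
}
```

Only an unexpected statement kind tells the two forms apart, which is why the choice does not matter for this particular patch.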


Didn't know about this. The difference seems to be very subtle.
I don't mind much anyway. The original version can stay if preferred, 
this was just a suggestion.


By the way, if the function is inlined in its single caller, the assert 
or unreachable statement can be removed, which avoids choosing between them.

That's another suggestion.


+
+  return MATCH_YES;
+
+ syntax:
+  gfc_syntax_error (st);
+
+  return MATCH_ERROR;
+}
+
+match
+gfc_match_fail_image (void)
+{
+  /* if (!gfc_notify_std (GFC_STD_F2008_TS, "FAIL IMAGE statement
at %C")) */
+  /*   return MATCH_ERROR; */
+

Can this be uncommented?


+  return fail_image_statement (ST_FAIL_IMAGE);
+}

 /* Match LOCK/UNLOCK statement. Syntax:
  LOCK ( lock-variable [ , lock-stat-list ] )
diff --git a/gcc/fortran/trans-intrinsic.c
b/gcc/fortran/trans-intrinsic.c index 1aaf4e2..b2f5596 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -1647,6 +1647,24 @@ trans_this_image (gfc_se * se, gfc_expr
*expr) m, lbound));
 }

+static void
+gfc_conv_intrinsic_image_status (gfc_se *se, gfc_expr *expr)
+{
+  unsigned int num_args;
+  tree *args,tmp;
+
+  num_args = gfc_intrinsic_argument_list_length (expr);
+  args = XALLOCAVEC (tree, num_args);
+
+  gfc_conv_intrinsic_function_args (se, expr, args, num_args);
+
+  if (flag_coarray == GFC_FCOARRAY_LIB)
+{

Can everything be put under the if?
Does it work with -fcoarray=single?


IMO coarray=single should not generate code here, therefore putting
everything under the if should be fine.

My point was more avoiding generating code for the arguments if they are 
not used in the end.
Regarding the -fcoarray=single case, the function returns a result, 
which can be used in an expression, so I don't think it will work 
without at least hardcoding a fixed value as result in that case.
But even that wouldn't be enough, as the function wouldn't work 
consistently with the fail image statement.



Sorry for the comments ...


Comments are welcome here, as far as I know. ;-)

Mikael


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Jeff Law

On 07/20/2016 10:30 AM, Bernd Edlinger wrote:

On 07/20/16 18:15, Jeff Law wrote:

On 07/20/2016 05:53 AM, Richard Biener wrote:

Is it OK after boot-strap and regression-testing?


I think the __builtin_setjmp change is wrong - __builtin_setjmp is
_not_ 'setjmp' it is part of the GCC internal machinery (using setjmp
and longjmp in the end) for SJLJ exception handling.

Am I correct Eric?

That is correct.  __builtin_setjmp (and friends) are part of the SJLJ
exception handling code.   They use a fixed sized buffer (5 words) to
store the key items (as opposed to the OS defined jmp_buf structure
which is usually considerably larger).

jeff


Yes. __builtin_setjmp is declared in builtins.def:

DEF_GCC_BUILTIN(BUILT_IN_SETJMP, "setjmp", BT_FN_INT_PTR,
ATTR_NOTHROW_LEAF_LIST)

It is visible in C as __builtin_setjmp, and special_function_p
adds ECF_RETURNS_TWICE | ECF_LEAF.

So it becomes equivalent to this:

int __builtin_setjmp(void*) __attribute__((returns_twice, nothrow,
leaf))

after special_function_p does its magic.
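
To make the "returns twice" property concrete, here is a small sketch. It uses the standard setjmp/longjmp from <csetjmp>, not __builtin_setjmp, and the function name count_returns is made up for illustration.

```cpp
#include <csetjmp>

static std::jmp_buf env;

// Counts how often control comes back out of the single setjmp call.
int
count_returns ()
{
  // volatile: the value is modified between setjmp and longjmp, so it
  // must not be kept only in a register.
  volatile int returns = 0;
  if (setjmp (env) == 0)
    {
      returns = returns + 1;	// first return, directly from setjmp
      std::longjmp (env, 1);	// jump back into the setjmp call
    }
  else
    {
      returns = returns + 1;	// second return, caused by longjmp
    }
  return returns;
}
```

Without the knowledge that such a call may return a second time, the compiler would be free to assume that values live across the call are unchanged on every path, which is why the attribute matters for correctness.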

If I remove the recognition of "__builtin_" from special_function_p
I have to add the returns_twice attribute in the DEF_GCC_BUILTIN.
Otherwise, I would get wrong code on all platforms, because
__builtin_setjmp saves only IP, SP, and FP registers.

Everything in the normal test suite keeps on going with the patch,
but is there anything that I have to do to make sure that the
SJLJ eh is still working? It is not the default on x86_64, right?
Very few targets continue to use SJLJ eh (perhaps just cygwin/mingw). 
*But* I think the Ada front-end explicitly uses SJLJ EH, so if you want 
to get some smoke testing, the Ada testsuite is probably the place to go.


Jeff


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Jeff Law

On 07/20/2016 10:54 AM, Bernd Edlinger wrote:



Yes. That is another interesting observation.  I think, originally this
flag was introduced by Jan Hubicka, and should mean, "it may be alloca
or a weak alias to alloca or maybe even something different".
But some of the later optimizations use it in a way as if it meant
"it must be alloca".  However I have not been able to come up with
a test case that makes this assumption false, but I probably just
did not try hard enough.

But I think that alloca just should not be recognized by name any
more.
And those optimizations probably aren't safe in the presence of alloca 
implemented on top of malloc.  They are safe for the built-in alloca though.


jeff



Re: [PATCH] c++/60760 - arithmetic on null pointers should not be allowed in constant expressions

2016-07-20 Thread Jason Merrill
On Wed, Jul 20, 2016 at 2:15 PM, Martin Sebor  wrote:
> On 07/20/2016 07:52 AM, Jason Merrill wrote:
>>
>> On Mon, Jul 18, 2016 at 6:15 PM, Martin Sebor  wrote:
>>>
>>> On 07/18/2016 11:51 AM, Jason Merrill wrote:


 On 07/06/2016 06:20 PM, Martin Sebor wrote:
>
>
> @@ -2911,6 +2923,14 @@ cxx_eval_indirect_ref (const constexpr_ctx
> *ctx, tree t,
> if (*non_constant_p)
>   return t;
>
> +  if (integer_zerop (op0))
> +{
> +  if (!ctx->quiet)
> +error ("dereferencing a null pointer");
> +  *non_constant_p = true;
> +  return t;
> +}


 I'm skeptical of checking this here, since *p is valid for null p; &*p
 is even a constant expression.  And removing this hunk doesn't seem to
 break any of your tests.

 OK with that hunk removed.
>>>
>>>
>>> With it removed the constexpr-nullptr-2.C test fails on line 64:
>>>
>>>constexpr const int *pi0 = &pa2->pa1->pa0->i;   // { dg-error "null
>>> pointer|not a constant" }
>>>
>>> Here, pa2 and pa1 are non-null but pa0 is null.
>>
>>
>> It doesn't fail for me; that line hits the error in
>> cxx_eval_component_reference.  I'm only talking about removing the
>> cxx_eval_indirect_ref hunk.
>
>
> Sorry, I may have been referring to an older patch.  With the latest
> patch, the assertion is on line 75.  It's also not failing, even
> though it should be.  The problem is that I had misunderstood how
> the vertical bar in DejaGnu directives works.  I thought it meant
> that both sides had to match a message on that line, when it means
> only one side has to.  I'll need to fix that (how does one match
> two messages on the same line?)
>
> But removing the hunk as you suggest does break the intent of the
> test.  With it there, we get a descriptive message for the invalid
> code below clearly explaining the problem:
>
> $ cat xyz.c && /build/gcc-60760/gcc/xgcc -B /build/gcc-60760/gcc -S -Wall
> -Wextra -Wpedantic -xc++ xyz.c
> struct S { const S *p; int i; };
>
> constexpr S s0 = { 0, 0 };
> constexpr S s1 = { &s0, 1 };
>
> constexpr int i = s1.p->p->i;
> xyz.c:6:28: error: dereferencing a null pointer
>  constexpr int i = s1.p->p->i;
> ^
>
> With the hunk removed, all we get is the generic:
>
> xyz.c:6:28: error: ‘*(const S*)((const S*)s1.S::p)->S::p’ is not a constant
> expression
>  constexpr int i = s1.p->p->i;
> ^
>
> Re-reading your comment above now: "since *p is valid for null p;"
> I agree that &*p is valid when p is null.  Unless I missed a case
> it is accepted with or without the hunk.  Otherwise, *p is not valid,
> and it is also rejected with or without it.
>
> Is there something else you're worried about with the hunk that
> makes you want to trade it off for the less informative message?

OK, we can keep the hunk, but only when !lval, since that means we
access the value.
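
A minimal sketch of the distinction, reusing Martin's struct S (nullptr is used here instead of the literal 0; the variable name inner is an assumption for illustration):

```cpp
struct S { const S *p; int i; };

constexpr S s0 = { nullptr, 0 };
constexpr S s1 = { &s0, 1 };

// Reading the pointer value itself is fine, even though it is null.
constexpr const S *inner = s1.p->p;
static_assert (inner == nullptr, "s0.p is a null pointer");

// Reading through a non-null pointer is fine as well.
static_assert (s1.p->i == 0, "s1.p points to s0");

// Reading a value through the null pointer is not a constant expression;
// a conforming compiler is expected to reject the next line if it is
// uncommented.
// constexpr int i = s1.p->p->i;
```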

Jason


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Jeff Law

On 07/20/2016 12:30 PM, Bernd Edlinger wrote:

On 07/20/16 20:08, Richard Biener wrote:

On July 20, 2016 6:54:48 PM GMT+02:00, Bernd Edlinger 
 wrote:


Yes. That is another interesting observation.  I think, originally this
flag was introduced by Jan Hubicka, and should mean, "it may be alloca
or a weak alias to alloca or maybe even something different".
But some of the later optimizations use it in a way as if it meant
"it must be alloca".  However I have not been able to come up with
a test case that makes this assumption false, but I probably just
did not try hard enough.

But I think that alloca just should not be recognized by name any
more.


It was introduced to mark calls that should not be duplicated by inlining or 
unrolling to avoid increasing stack usage too much.  Sth worthwhile to keep 
even with -ffreestanding.

Richard.



Apparently the MAY_BE_ALLOCA issue is worse than I ever thought...

But I could not imagine that alloca can be anything else than a
built-in.

Is there any implementation where alloca is like an ordinary function
call?

I mean, does something like a function that allocates n bytes
from the caller's stack frame work at all with any calling convention?
IIRC alloca was actually a normal call on some systems.  It'd use the 
current value of the stack pointer to record the depth of the call chain 
and the objects allocated at that particular depth.  Calls to alloca 
would walk those chains to automatically deallocate objects that had 
been allocated in frames that had been subsequently popped.


Of course it was a heuristic and if you used alloca in fun1, then 
returned from fun1 and called fun2, then used alloca in fun2, you might 
expect objects allocated in fun1 to go away, but that didn't always 
happen (consider the case where fun1 and fun2 have the same stack size) :(


I don't know if any systems still implement alloca this way, but it 
certainly was implemented as a normal call on some systems in the past.
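
The scheme described above can be sketched roughly like this. It is an illustration only, not any system's actual alloca: the class name emulated_alloca and its members are invented, and the frame depth is passed explicitly instead of being derived from the stack pointer so that the behavior is deterministic.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical malloc-backed alloca.  A larger DEPTH means a deeper
// (more recently entered) frame.
class emulated_alloca
{
  struct block { std::size_t depth; void *memory; };
  std::vector<block> live_;

public:
  void *allocate (std::size_t size, std::size_t depth)
  {
    // Blocks recorded at a greater depth than the caller belong to
    // frames that must already have returned, so release them now.
    while (!live_.empty () && live_.back ().depth > depth)
      {
	std::free (live_.back ().memory);
	live_.pop_back ();
      }
    void *memory = std::malloc (size);
    if (memory)
      live_.push_back (block{depth, memory});
    return memory;
  }

  std::size_t live_blocks () const { return live_.size (); }

  ~emulated_alloca ()
  {
    for (const block &b : live_)
      std::free (b.memory);
  }
};
```

If fun1 and fun2 are both called at the same depth, a block allocated in fun1 is still recorded at a depth that is not greater than fun2's, so it survives until some allocation happens in a shallower frame. That is the heuristic weakness mentioned above.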


jeff


Re: [PATCH] Add priority_queue::value_compare (LWG 2684)

2016-07-20 Thread Jonathan Wakely

On 24/05/16 17:02 +0100, Jonathan Wakely wrote:

* include/bits/stl_queue.h (priority_queue::value_compare): Define.

This is only Tentatively Ready but I don't think there's any harm in
making the change now. Libc++ have been shipping this for years,
without realising it wasn't actually in the standard :-)

Tested x86_64, committed to trunk.


Now that the issue has been resolved I've documented it and
regenerated the docs.

Committed to trunk.
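
As a quick illustration of what the new member typedef gives users, here is a sketch; min_queue and compare_with_queue_order are names chosen for this example only.

```cpp
#include <functional>
#include <queue>
#include <type_traits>
#include <vector>

using min_queue
  = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// value_compare names the comparator type the queue was instantiated with.
static_assert (std::is_same<min_queue::value_compare,
			    std::greater<int>>::value,
	       "value_compare should be the Compare template argument");

// Generic code can now build a comparator without spelling out its type.
inline bool
compare_with_queue_order (int a, int b)
{
  min_queue::value_compare cmp;
  return cmp (a, b);
}
```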

commit d86801688332d159d49bc55b7cd16d24ef3be423
Author: redi 
Date:   Wed Jul 20 18:22:05 2016 +

Document LWG DR 2684 status and regenerate libstdc++ manual

	* doc/xml/manual/intro.xml: Document DR 2684 status.
	* doc/html/*: Regenerate.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@238535 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/doc/xml/manual/intro.xml b/libstdc++-v3/doc/xml/manual/intro.xml
index c6b0656..a5e0a3b 100644
--- a/libstdc++-v3/doc/xml/manual/intro.xml
+++ b/libstdc++-v3/doc/xml/manual/intro.xml
@@ -1052,6 +1052,12 @@ requirements of the license of GCC.
 Divide by the object type.
 
 
+http://www.w3.org/1999/xlink"; xlink:href="../ext/lwg-defects.html#2684">2684:
+   priority_queue lacking comparator typedef
+   
+
+Define the value_compare typedef.
+
 
   
 


[PATCH] LWG 2441 Provide exact-width atomic typedefs

2016-07-20 Thread Jonathan Wakely

* include/std/atomic (atomic_int8_t, atomic_uint8_t, atomic_int16_t)
(atomic_uint16_t, atomic_int32_t, atomic_uint32_t, atomic_int64_t)
(atomic_uint64_t): Define (LWG 2441).
* testsuite/29_atomics/headers/atomic/std_c++0x_neg.cc: Remove empty
lines.
* testsuite/29_atomics/headers/atomic/types_std_c++0x.cc: Test for
the new types.
* doc/xml/manual/intro.xml: Document DR 2441 status.

I've chosen to treat this as a DR and defined the types even for
-std=c++11 and -std=c++14. Again, if there are strong objections that
could be changed to only define them for -std=c++17 or -std=gnu++NN
but I'd rather not do that.
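
A short usage sketch of one of the new typedefs, assuming a library that provides them; next_ticket is a hypothetical helper, not part of the patch.

```cpp
#include <atomic>
#include <cstdint>
#include <type_traits>

// The typedef is expected to be just another name for the specialization.
static_assert (std::is_same<std::atomic_uint32_t,
			    std::atomic<std::uint32_t>>::value,
	       "atomic_uint32_t should name atomic<uint32_t>");

// Hands out consecutive ticket numbers; returns the value before the
// increment.
inline std::uint32_t
next_ticket (std::atomic_uint32_t &counter)
{
  return counter.fetch_add (1, std::memory_order_relaxed);
}
```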

Tested x86_64-linux, committed to trunk.
commit bc8afb20f0bcb079d2e132edf440693fcb243b22
Author: Jonathan Wakely 
Date:   Wed Jul 20 18:32:00 2016 +0100

LWG 2441 Provide exact-width atomic typedefs

* include/std/atomic (atomic_int8_t, atomic_uint8_t, atomic_int16_t)
(atomic_uint16_t, atomic_int32_t, atomic_uint32_t, atomic_int64_t)
(atomic_uint64_t): Define (LWG 2441).
* testsuite/29_atomics/headers/atomic/std_c++0x_neg.cc: Remove empty
lines.
* testsuite/29_atomics/headers/atomic/types_std_c++0x.cc: Test for
the new types.
* doc/xml/manual/intro.xml: Document DR 2441 status.

diff --git a/libstdc++-v3/doc/xml/manual/intro.xml 
b/libstdc++-v3/doc/xml/manual/intro.xml
index 6335614..c6b0656 100644
--- a/libstdc++-v3/doc/xml/manual/intro.xml
+++ b/libstdc++-v3/doc/xml/manual/intro.xml
@@ -1011,6 +1011,12 @@ requirements of the license of GCC.
 Add noexcept.
 
 
+http://www.w3.org/1999/xlink"; 
xlink:href="../ext/lwg-defects.html#2441">2441:
+   Exact-width atomic typedefs should be provided
+
+Define the typedefs.
+
+
 http://www.w3.org/1999/xlink"; 
xlink:href="../ext/lwg-defects.html#2454">2454:
Add raw_storage_iterator::base() member

diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index 8cbc91f..f8894bf 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -833,6 +833,34 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef atomic<char32_t>		atomic_char32_t;
 
 
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 2441. Exact-width atomic typedefs should be provided
+
+  /// atomic_int8_t
+  typedef atomic<int8_t>		atomic_int8_t;
+
+  /// atomic_uint8_t
+  typedef atomic<uint8_t>		atomic_uint8_t;
+
+  /// atomic_int16_t
+  typedef atomic<int16_t>		atomic_int16_t;
+
+  /// atomic_uint16_t
+  typedef atomic<uint16_t>		atomic_uint16_t;
+
+  /// atomic_int32_t
+  typedef atomic<int32_t>		atomic_int32_t;
+
+  /// atomic_uint32_t
+  typedef atomic<uint32_t>		atomic_uint32_t;
+
+  /// atomic_int64_t
+  typedef atomic<int64_t>		atomic_int64_t;
+
+  /// atomic_uint64_t
+  typedef atomic<uint64_t>		atomic_uint64_t;
+
+
   /// atomic_int_least8_t
   typedef atomic<int_least8_t>		atomic_int_least8_t;
 
diff --git a/libstdc++-v3/testsuite/29_atomics/headers/atomic/std_c++0x_neg.cc 
b/libstdc++-v3/testsuite/29_atomics/headers/atomic/std_c++0x_neg.cc
index a13f66a..1f56f5a 100644
--- a/libstdc++-v3/testsuite/29_atomics/headers/atomic/std_c++0x_neg.cc
+++ b/libstdc++-v3/testsuite/29_atomics/headers/atomic/std_c++0x_neg.cc
@@ -21,6 +21,3 @@
 #include <atomic>
 
 // { dg-error "ISO C.. 2011" "" { target *-*-* } 32 }
-
-
-
diff --git 
a/libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc 
b/libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
index 32fe2a4..51adbf5 100644
--- a/libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
+++ b/libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
@@ -72,4 +72,14 @@ void test01()
   using std::atomic_ptrdiff_t;
   using std::atomic_intmax_t;
   using std::atomic_uintmax_t;
+
+  // DR 2441
+  using std::atomic_int8_t;
+  using std::atomic_uint8_t;
+  using std::atomic_int16_t;
+  using std::atomic_uint16_t;
+  using std::atomic_int32_t;
+  using std::atomic_uint32_t;
+  using std::atomic_int64_t;
+  using std::atomic_uint64_t;
 }


[Committed] S/390: Remove mode size check in encode_section_info.

2016-07-20 Thread Andreas Krebbel
With the last change the not-aligned symbol ref markers are always set
for modes with size zero.  This is wrong since for larl the size of
the access does not matter.  This patch removes that check entirely
from s390_encode_section_info.  Modes with a size of 0 get rejected in
s390_check_symref_alignment which is used for the load/store relative
instructions to check for natural alignment.
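
The alignment tests that remain after this change can be sketched as below. This is a simplified model, not the s390 backend code: align_flag and classify_alignment are invented names, only the two cases visible in the hunks are modelled, and the alignment is assumed to be expressed in bits.

```cpp
// Hypothetical names; only the modulo tests mirror the patch.
enum class align_flag { none, not_aligned_2, not_aligned_4 };

constexpr align_flag
classify_alignment (unsigned align_in_bits)
{
  // Unknown alignment, or not a multiple of 16 bits: not 2-byte aligned.
  // Otherwise, not a multiple of 32 bits: not 4-byte aligned.
  return (align_in_bits == 0 || align_in_bits % 16 != 0)
	   ? align_flag::not_aligned_2
	   : (align_in_bits % 32 != 0) ? align_flag::not_aligned_4
				       : align_flag::none;
}

static_assert (classify_alignment (8) == align_flag::not_aligned_2, "");
static_assert (classify_alignment (16) == align_flag::not_aligned_4, "");
```

Note that the size of the access no longer appears anywhere in the classification, which is the point of the patch.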

Bootstrapped and regression tested on s390 and s390x with
--with-arch=z900 and --with-arch=z13.

gcc/ChangeLog:

2016-07-20  Andreas Krebbel  

* config/s390/s390.c (s390_encode_section_info): Remove mode size
check.
---
 gcc/config/s390/s390.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 318c021..23d758c 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -12413,8 +12413,7 @@ s390_encode_section_info (tree decl, rtx rtl, int first)
   /* Store the alignment to be able to check if we can use
 a larl/load-relative instruction.  We only handle the cases
 that can go wrong (i.e. no FUNC_DECLs).  */
-  if (DECL_ALIGN (decl) == 0
- || DECL_ALIGN (decl) % 16)
+  if (DECL_ALIGN (decl) == 0 || DECL_ALIGN (decl) % 16)
SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0));
   else if (DECL_ALIGN (decl) % 32)
SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
@@ -12429,9 +12428,7 @@ s390_encode_section_info (tree decl, rtx rtl, int first)
   && GET_CODE (XEXP (rtl, 0)) == SYMBOL_REF
   && TREE_CONSTANT_POOL_ADDRESS_P (XEXP (rtl, 0)))
 {
-  if (MEM_ALIGN (rtl) == 0
- || GET_MODE_SIZE (GET_MODE (rtl)) == 0
- || MEM_ALIGN (rtl) % 16)
+  if (MEM_ALIGN (rtl) == 0 || MEM_ALIGN (rtl) % 16)
SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0));
   else if (MEM_ALIGN (rtl) % 32)
SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
-- 
2.9.1



Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Bernd Edlinger
On 07/20/16 20:08, Richard Biener wrote:
> On July 20, 2016 6:54:48 PM GMT+02:00, Bernd Edlinger 
>  wrote:
>>
>> Yes. That is another interesting observation.  I think, originally this
>> flag was introduced by Jan Hubicka, and should mean, "it may be alloca
>> or a weak alias to alloca or maybe even something different".
>> But some of the later optimizations use it in a way as if it meant
>> "it must be alloca".  However I have not been able to come up with
>> a test case that makes this assumption false, but I probably just
>> did not try hard enough.
>>
>> But I think that alloca just should not be recognized by name any
>> more.
>
> It was introduced to mark calls that should not be duplicated by inlining or 
> unrolling to avoid increasing stack usage too much.  Sth worthwhile to keep 
> even with -ffreestanding.
>
> Richard.
>

Apparently the MAY_BE_ALLOCA issue is worse than I ever thought...

But I could not imagine that alloca can be anything else than a
built-in.

Is there any implementation where alloca is like an ordinary function
call?

I mean, does something like a function that allocates n bytes
from the caller's stack frame work at all with any calling convention?


Bernd.


[PATCH] LWG 2328 Rvalue stream extraction should use perfect forwarding

2016-07-20 Thread Jonathan Wakely

* include/std/istream (operator>>(basic_istream&&, _Tp&)): Adjust
to use perfect forwarding (LWG 2328).
* testsuite/27_io/rvalue_streams.cc: Test perfect forwarding.
* doc/xml/manual/intro.xml: Document DR 2328 status.

Tested x86_64-linux, committed to trunk.
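
A sketch of user code that relies on the forwarding behavior: a temporary stream extracting into a temporary proxy object. The names hex_into and parse_hex are made up for this example and are not part of the patch or its testsuite.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Proxy that remembers where the parsed value should be stored.
struct hex_into { unsigned &target; };

// Extractor that only accepts the proxy as an rvalue.
inline std::istream &
operator>> (std::istream &is, hex_into &&h)
{
  return is >> std::hex >> h.target;
}

inline unsigned
parse_hex (const std::string &text)
{
  unsigned value = 0;
  // Both operands are rvalues.  The library's rvalue-stream operator>>
  // has to forward the right operand as an rvalue for the overload above
  // to be selected.
  std::istringstream{text} >> hex_into{value};
  return value;
}
```

With the old signature taking _Tp&, the temporary hex_into could not bind to the right operand at all, so the extraction from a temporary stream would not compile.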
commit 8a9c61de6663071a562e90ce7d2a2b7a2228fd95
Author: Jonathan Wakely 
Date:   Wed Oct 21 21:33:37 2015 +0100

LWG 2328 Rvalue stream extraction should use perfect forwarding

* include/std/istream (operator>>(basic_istream&&, _Tp&)): Adjust
to use perfect forwarding (LWG 2328).
* testsuite/27_io/rvalue_streams.cc: Test perfect forwarding.
* doc/xml/manual/intro.xml: Document DR 2328 status.

diff --git a/libstdc++-v3/doc/xml/manual/intro.xml 
b/libstdc++-v3/doc/xml/manual/intro.xml
index 7b836cd..6335614 100644
--- a/libstdc++-v3/doc/xml/manual/intro.xml
+++ b/libstdc++-v3/doc/xml/manual/intro.xml
@@ -948,6 +948,12 @@ requirements of the license of GCC.
 Update definitions of the partial specializations for 
const and volatile types.
 
 
+http://www.w3.org/1999/xlink"; 
xlink:href="../ext/lwg-defects.html#2328">2328:
+   Rvalue stream extraction should use perfect 
forwarding
+
+Use perfect forwarding for right operand.
+
+
 http://www.w3.org/1999/xlink"; 
xlink:href="../ext/lwg-defects.html#2329">2329:
regex_match()/regex_search() with 
match_results should forbid temporary strings
 
diff --git a/libstdc++-v3/include/std/istream b/libstdc++-v3/include/std/istream
index d4cf7bc..c8a2e08e 100644
--- a/libstdc++-v3/include/std/istream
+++ b/libstdc++-v3/include/std/istream
@@ -909,6 +909,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus >= 201103L
   // [27.7.1.6] Rvalue stream extraction
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 2328. Rvalue stream extraction should use perfect forwarding
   /**
*  @brief  Generic extractor for rvalue stream
*  @param  __is  An input stream.
@@ -921,9 +923,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   */
   template<typename _CharT, typename _Traits, typename _Tp>
 inline basic_istream<_CharT, _Traits>&
-operator>>(basic_istream<_CharT, _Traits>&& __is, _Tp& __x)
+operator>>(basic_istream<_CharT, _Traits>&& __is, _Tp&& __x)
 {
-  __is >> __x;
+  __is >> std::forward<_Tp>(__x);
   return __is;
 }
 #endif // C++11
diff --git a/libstdc++-v3/testsuite/27_io/rvalue_streams.cc 
b/libstdc++-v3/testsuite/27_io/rvalue_streams.cc
index eba5bc3..5918595 100644
--- a/libstdc++-v3/testsuite/27_io/rvalue_streams.cc
+++ b/libstdc++-v3/testsuite/27_io/rvalue_streams.cc
@@ -34,9 +34,33 @@ test01()
   VERIFY (i == i2);
 }
 
+struct X { bool as_rvalue; };
+
+void operator>>(std::istream&, X& x) { x.as_rvalue = false; }
+void operator>>(std::istream&, X&& x) { x.as_rvalue = true; }
+
+// LWG 2328 Rvalue stream extraction should use perfect forwarding
+void
+test02()
+{
+  X x;
+  std::istringstream is;
+  auto& ref1 = (std::move(is) >> x);
+  VERIFY( &ref1 == &is );
+  VERIFY( x.as_rvalue == false );
+  auto& ref2 = (std::move(is) >> std::move(x));
+  VERIFY( &ref2 == &is );
+  VERIFY( x.as_rvalue == true );
+
+  char arr[2];
+  std::istringstream("x") >> &arr[0];
+  std::istringstream("x") >> arr;
+}
+
 int
 main()
 {
   test01();
+  test02();
   return 0;
 }


[PATCH build/doc] Replacing libiberty with gnulib

2016-07-20 Thread ayush goel
Hey,
As a first step of my GSOC project
(https://gcc.gnu.org/wiki/replacelibibertywithgnulib) I have imported
the gnulib library inside the gcc tree. I have created gnulib as a top
level directory which contains the necessary scripts to import the
modules. It also contains the necessary Makefile.in and configure.ac
files.
I have made the corresponding changes in the Makefile.def and
configure.ac files, adding gnulib both as a build and host library,
and subsequently regenerated the Makefile.in and configure files.

In order to show the setup works, I’ve replaced libiberty’s version of
obstack with gnulib’s. This was made possible by replacing the
corresponding header file and then including gnulib headers and gnulib
static library in the build path required to compile gcc files.
Also, in order to ensure that the setup works fine, I locally removed
obstack.[ch] from libiberty so that the setup uses the corresponding
files from gnulib.

Used gdb’s scripts to import gnulib
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=tree;f=gdb/gnulib;h=cdf326774716ae427dc4fb47c9a410fcdf715563;hb=HEAD

Bootstrapped with multiple languages and multilib enabled for maximum coverage.
Regression tested on x86_64-apple-darwin15.5.0 and x86_64-linux




20-7-16 Ayush Goel 

* Makefile.def: Added gnulib as build & host library and dependency of
all-gcc on gnulib
* Makefile.in: regenerated
* configure.ac: Added gnulib as build and host library
* configure: regenerated
* gcc/Makefile.in: Added path to gnulib static library (libgnu.a) and
gnulib header files
* gcc/doc/sourcebuild.texi: Added gnulib and how to use the update
script to update/import gnu lib modules
* gnulib: created directory
* gnulib/Makefile.in: new file
* gnulib/configure.ac: new file
* gnulib/update-gnulib.sh: script to import gnulib modules using gnulib-tool
* gnulib/import: created by update-gnulib.sh
* gnulib/import/Makefile.in: imported from gnulib
* gnulib/import/alignof.h: Imported from gnulib
* gnulib/import/exitfail.c: Imported from gnulib
* gnulib/import/exitfail.h: Imported from gnulib
* gnulib/import/extra: Imported from gnulib
* gnulib/import/extra/snippet: Imported from gnulib
* gnulib/import/extra/snippet/_Noreturn.h: Imported from gnulib
* gnulib/import/extra/snippet/arg-nonnull.h: Imported from gnulib
* gnulib/import/extra/snippet/c++defs.h: Imported from gnulib
* gnulib/import/extra/snippet/warn-on-use.h: Imported from gnulib
* gnulib/import/gettext.h: Imported from gnulib
* gnulib/import/m4: Imported from gnulib
* gnulib/import/m4/00gnulib.m4: Imported from gnulib
* gnulib/import/m4/absolute-header.m4: Imported from gnulib
* gnulib/import/m4/extern-inline.m4: Imported from gnulib
* gnulib/import/m4/gnulib-cache.m4: Imported from gnulib
* gnulib/import/m4/gnulib-common.m4: Imported from gnulib
* gnulib/import/m4/gnulib-comp.m4: Imported from gnulib
* gnulib/import/m4/gnulib-tool.m4: Imported from gnulib
* gnulib/import/m4/include_next.m4: Imported from gnulib
* gnulib/import/m4/longlong.m4: Imported from gnulib
* gnulib/import/m4/multiarch.m4: Imported from gnulib
* gnulib/import/m4/obstack.m4: Imported from gnulib
* gnulib/import/m4/off_t.m4: Imported from gnulib
* gnulib/import/m4/ssize_t.m4: Imported from gnulib
* gnulib/import/m4/stddef_h.m4: Imported from gnulib
* gnulib/import/m4/stdint.m4: Imported from gnulib
* gnulib/import/m4/stdlib_h.m4: Imported from gnulib
* gnulib/import/m4/sys_types_h.m4: Imported from gnulib
* gnulib/import/m4/unistd_h.m4: Imported from gnulib
* gnulib/import/m4/warn-on-use.m4: Imported from gnulib
* gnulib/import/m4/wchar_t.m4: Imported from gnulib
* gnulib/import/obstack.c: Imported from gnulib
* gnulib/import/obstack.h: Imported from gnulib
* gnulib/import/stddef.in.h: Imported from gnulib
* gnulib/import/stdint.in.h: Imported from gnulib
* gnulib/import/stdlib.in.h: Imported from gnulib
* gnulib/import/sys: Imported from gnulib
* gnulib/import/sys_types.in.h: Imported from gnulib
* gnulib/import/unistd.c: Imported from gnulib
* gnulib/import/unistd.in.h: Imported from gnulib
* gnulib/stamp-h1: generated

Also note that I have a copyright assignment in place already.

-Ayush Goel


importgnulib_7_20.patch
Description: Binary data


Re: [PATCH] c++/60760 - arithmetic on null pointers should not be allowed in constant expressions

2016-07-20 Thread Martin Sebor

On 07/20/2016 07:52 AM, Jason Merrill wrote:

On Mon, Jul 18, 2016 at 6:15 PM, Martin Sebor  wrote:

On 07/18/2016 11:51 AM, Jason Merrill wrote:


On 07/06/2016 06:20 PM, Martin Sebor wrote:


@@ -2911,6 +2923,14 @@ cxx_eval_indirect_ref (const constexpr_ctx
*ctx, tree t,
if (*non_constant_p)
  return t;

+  if (integer_zerop (op0))
+{
+  if (!ctx->quiet)
+error ("dereferencing a null pointer");
+  *non_constant_p = true;
+  return t;
+}


I'm skeptical of checking this here, since *p is valid for null p; &*p
is even a constant expression.  And removing this hunk doesn't seem to
break any of your tests.

OK with that hunk removed.


With it removed the constexpr-nullptr-2.C test fails on line 64:

   constexpr const int *pi0 = &pa2->pa1->pa0->i;   // { dg-error "null
pointer|not a constant" }

Here, pa2 and pa1 are non-null but pa0 is null.


It doesn't fail for me; that line hits the error in
cxx_eval_component_reference.  I'm only talking about removing the
cxx_eval_indirect_ref hunk.


Sorry, I may have been referring to an older patch.  With the latest
patch, the assertion is on line 75.  It's also not failing, even
though it should be.  The problem is that I had misunderstood how
the vertical bar in DejaGnu directives works.  I thought it meant
that both sides had to match a message on that line, when it means
only one side has to.  I'll need to fix that (how does one match
two messages on the same line?)
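
The alternation semantics can be checked with a small sketch. It uses std::regex rather than the Tcl regular expressions DejaGnu actually uses, on the assumption that plain alternation behaves the same way in both; matches_directive is a hypothetical helper.

```cpp
#include <regex>
#include <string>

// True if MESSAGE would satisfy a directive whose pattern is
// "null pointer|not a constant".
inline bool
matches_directive (const std::string &message)
{
  static const std::regex pattern ("null pointer|not a constant");
  return std::regex_search (message, pattern);
}
```

Either diagnostic alone satisfies the pattern, so a directive written this way cannot verify that both messages are issued on the line.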

But removing the hunk as you suggest does break the intent of the
test.  With it there, we get a descriptive message for the invalid
code below clearly explaining the problem:

$ cat xyz.c && /build/gcc-60760/gcc/xgcc -B /build/gcc-60760/gcc -S 
-Wall -Wextra -Wpedantic -xc++ xyz.c

struct S { const S *p; int i; };

constexpr S s0 = { 0, 0 };
constexpr S s1 = { &s0, 1 };

constexpr int i = s1.p->p->i;
xyz.c:6:28: error: dereferencing a null pointer
 constexpr int i = s1.p->p->i;
^

With the hunk removed, all we get is the generic:

xyz.c:6:28: error: ‘*(const S*)((const S*)s1.S::p)->S::p’ is not a 
constant expression

 constexpr int i = s1.p->p->i;
^

Re-reading your comment above now: "since *p is valid for null p;"
I agree that &*p is valid when p is null.  Unless I missed a case
it is accepted with or without the hunk.  Otherwise, *p is not valid,
and it is also rejected with or without it.

Is there something else you're worried about with the hunk that
makes you want to trade it off for the less informative message?

Martin


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Richard Biener
On July 20, 2016 6:54:48 PM GMT+02:00, Bernd Edlinger 
 wrote:
>On 07/20/16 18:20, Jeff Law wrote:
>> On 07/20/2016 09:41 AM, Bernd Edlinger wrote:
>>> On 07/20/16 12:44, Richard Biener wrote:
 On Tue, 19 Jul 2016, Bernd Edlinger wrote:

> Hi!
>
> As discussed at
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71876,
> we have a _very_ old hack in gcc, that recognizes certain
>functions by
> name, and inserts in some cases unsafe attributes, that don't work
>for
> a freestanding environment.
>
> It is unsafe to return ECF_MAY_BE_ALLOCA, ECF_LEAF and
>ECF_NORETURN
> from special_function_p, just by the name of the function,
>especially
> for less well known functions, like "getcontext" or "savectx",
>which
> could easily be used for something completely different.

 Returning ECF_MAY_BE_ALLOCA is safe.  Just wanted to mention this,
 regardless of the followups you already received.
>>>
>>>
>>> I don't think so.
>>>
>>>
>>> Consider this example:
>>>
>>> cat test.cc
>>> //extern "C"
>>> void *alloca(unsigned long);
>>> void bar(unsigned long n)
>>> {
>>>    char *x = (char*) alloca(n);
>>>    if (x)
>>>      *x = 0;
>>> }
>>>
>>> g++ -O3 -S test.cc
>>>
>>> result:
>>>
>>> _Z3barm:
>>> .LFB0:
>>> .cfi_startproc
>>> pushq   %rbp
>>> .cfi_def_cfa_offset 16
>>> .cfi_offset 6, -16
>>> movq    %rsp, %rbp
>>> .cfi_def_cfa_register 6
>>> call    _Z6allocam
>>> movb    $0, (%rax)
>>> leave
>>> .cfi_def_cfa 7, 8
>>> ret
>>>
>>> So we call a C++ function with name alloca, but because
>>> special_function_p adds ECF_MAY_BE_ALLOCA, the null-pointer
>>> check is eliminated, but it is not the builtin alloca,
>>> but for the C++ front end it is a pretty normal function.
>> Clearly if something "may be alloca", then the properties on the
>> arguments/return values & side effects that are specific to alloca
>can
>> not be relied upon.  That to me seems like a bug elsewhere in the
>> compiler independent of the changes you're trying to make.
>>
>
>
>Yes. That is another interesting observation.  I think, originally this
>flag was introduced by Jan Hubicka, and should mean, "it may be alloca
>or a weak alias to alloca or maybe even something different".
>But some of the later optimizations use it in a way as if it meant
>"it must be alloca".  However I have not been able to come up with
>a test case that makes this assumption false, but I probably just
>did not try hard enough.
>
>But I think that alloca just should not be recognized by name any
>more.

It was introduced to mark calls that should not be duplicated by inlining or 
unrolling to avoid increasing stack usage too much.  Sth worthwhile to keep 
even with -ffreestanding.

Richard.

>
>>
>>
>> Jeff
>>




Re: [PATCH test]XFAIL gcc.dg/vect/vect-mask-store-move-1.c

2016-07-20 Thread Jeff Law

On 07/20/2016 10:58 AM, Bin Cheng wrote:

Hi,
After patch @238301, issue reported in PR65206 is also exposed by case 
gcc.dg/vect/vect-mask-store-move-1.c.  This patch XFAILs the case for the moment.
Test result checked, is it OK?

Thanks,
bin
gcc/testsuite/ChangeLog
2016-07-14  Bin Cheng  

* gcc.dg/vect/vect-mask-store-move-1.c: XFAIL.

OK.
jeff



Re: [Patch, testsuite, tentative] Explicitly disable pointer <-> int cast warnings for avr?

2016-07-20 Thread Mike Stump
On Jul 19, 2016, at 10:37 PM, Senthil Kumar Selvaraj 
 wrote:
>  The patch fixes a couple of testsuite failures that show up for the
>  avr target because it has different sizes for longs and pointers (4
>  bytes versus 2), by explicitly disabling the warning for avr.
> 
>  Does this make sense?

I don't feel too strongly about it, but it would be nice either to have a cast 
in the code to the target type to shut it up, or alternatively, use the 
intptr_t type so that the sizes are the same.  Since this is compile only, 
a cast would work.  As a general rule, if a test case runs, I think we'd need 
the intptr_t type involved.

Can you give that a try, as then we don't have to have any special flags for 
any target and the test case is then more portable?

If the edits are too much, your patch is fine.  I'm hoping someone reduced the 
test case and it is small with very few changes required, but I didn't look at 
the test case.
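
For reference, the intptr_t variant could look roughly like the sketch below. The helper names are hypothetical and are not taken from the test case under discussion.

```cpp
#include <cstdint>

// intptr_t is wide enough to hold a pointer on the target, so the
// conversion does not depend on how large long happens to be.
static_assert (sizeof (std::intptr_t) >= sizeof (void *),
	       "intptr_t must be able to hold a pointer");

inline std::intptr_t
pointer_to_integer (const void *p)
{
  return reinterpret_cast<std::intptr_t> (p);
}

inline const void *
integer_to_pointer (std::intptr_t value)
{
  return reinterpret_cast<const void *> (value);
}
```

Because the integer type follows the pointer size, no target-specific warning flags are needed, which keeps a test written this way portable.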

Re: [PATCH] S/390: Fix pr67443.c.

2016-07-20 Thread Andreas Krebbel
On 07/20/2016 01:55 PM, Dominik Vogt wrote:
> The attached patch rewrites the pr67443.c testcase in a different
> way so that the test still works with the changed allocation of
> globals pinned to registers.  The test is hopefully more robust
> now.  Tested on s390 and s390x biarch.

Applied.  Thanks!

-Andreas-



Re: [PATCH] S/390: Xfail some tests in insv-[12].c.

2016-07-20 Thread Andreas Krebbel
On 07/19/2016 11:40 AM, Dominik Vogt wrote:
> The attached patch XFAILs some of the "insv" testcases as
> discussed internally.  Tested on s390x biarch and s390.

Applied.  Thanks!

-Andreas-



[PATCH #2], PowerPC support to enable -mlra and/or -mfloat128

2016-07-20 Thread Michael Meissner
This patch renames the configure switches to be explicit that they are for the
PowerPC, and that they are temporary.  I would hope by the time GCC 7 exits
stage1 that these switches will be removed, but having them now will allow us
to move to LRA and __float128 in an orderly fashion.

I built a bootstrap compiler using the --enable-powerpc-lra option, and it ran
fine.  There were two additional tests that generate different code with -mlra
and now fail.  These will be fixed in later patches.

I also built a C only compiler using the --enable-powerpc-float128 option
(disabling libquadmath and bootstrap), and the C tests looked fine.

Can I install these patches in the trunk?

2016-07-20  Michael Meissner  

* doc/install.texi (Configuration): Document PowerPC specific
configuration options --enable-powerpc-lra and
--enable-powerpc-float128.
* configure.ac: Add support for the configuration option
--enable-powerpc-lra to enable the use of the LRA register
allocator by default.  Add support for the configuration option
--enable-powerpc-float128 to enable the use of the __float128 type
in PowerPC Linux systems.
* configure: Regenerate.
* config.gcc (powerpc*-*-linux*): Add --enable-powerpc-lra and
--enable-powerpc-float128 support.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
support for --enable-powerpc-lra and --enable-powerpc-float128.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)  (revision 238445)
+++ gcc/config/rs6000/rs6000.c  (.../gcc/config/rs6000) (working copy)
@@ -4306,6 +4306,17 @@ rs6000_option_override_internal (bool gl
   rs6000_isa_flags &= ~OPTION_MASK_P9_DFORM_SCALAR;
 }
 
+  /* Enable LRA if the compiler was configured with --enable-lra.  */
+#ifdef ENABLE_LRA
+  if ((rs6000_isa_flags_explicit & OPTION_MASK_LRA) == 0)
+    {
+      if (ENABLE_LRA)
+        rs6000_isa_flags |= OPTION_MASK_LRA;
+      else
+        rs6000_isa_flags &= ~OPTION_MASK_LRA;
+    }
+#endif
+
   /* There have been bugs with -mvsx-timode that don't show up with -mlra,
  but do show up with -mno-lra.  Given -mlra will become the default once
  PR 69847 is fixed, turn off the options with problems by default if
@@ -4372,6 +4383,17 @@ rs6000_option_override_internal (bool gl
}
 }
 
+  /* Enable FLOAT128 if the compiler was configured with --enable-float128.  */
+#ifdef ENABLE_FLOAT128
+  if (TARGET_VSX && (rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128) == 0)
+    {
+      if (ENABLE_FLOAT128)
+        rs6000_isa_flags |= OPTION_MASK_FLOAT128;
+      else
+        rs6000_isa_flags &= ~(OPTION_MASK_FLOAT128 | OPTION_MASK_FLOAT128_HW);
+    }
+#endif
+
   /* __float128 requires VSX support.  */
   if (TARGET_FLOAT128 && !TARGET_VSX)
 {
Index: gcc/doc/install.texi
===
--- gcc/doc/install.texi  (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/doc)  (revision 238445)
+++ gcc/doc/install.texi  (.../gcc/doc)  (working copy)
@@ -1661,6 +1661,35 @@ Using the GNU Compiler Collection (GCC)}
 See ``RS/6000 and PowerPC Options'' in the main manual
 @end ifhtml
 
+@item --enable-powerpc-lra
+This option enables @option{-mlra} by default for powerpc-linux.  This
+switch is a temporary configuration switch that is intended to allow
+for the transition from the reload register allocator to the newer lra
+register allocator.  When the transition is complete, this switch
+may be deleted.
+@ifnothtml
+@xref{RS/6000 and PowerPC Options,, RS/6000 and PowerPC Options, gcc,
+Using the GNU Compiler Collection (GCC)},
+@end ifnothtml
+@ifhtml
+See ``RS/6000 and PowerPC Options'' in the main manual
+@end ifhtml
+
+@item --enable-powerpc-float128
+This option enables @option{-mfloat128} by default for powerpc-linux.
+This switch is a temporary configuration switch that is intended to
+allow the PowerPC GCC developers to work on implementing library
+support for PowerPC IEEE 128-bit floating point functions.  When the
+standard GCC libraries are enhanced to support @code{__float128} by
+default, this switch may be deleted.
+@ifnothtml
+@xref{RS/6000 and PowerPC Options,, RS/6000 and PowerPC Options, gcc,
+Using the GNU Compiler Collection (GCC)},
+@end ifnothtml
+@ifhtml
+See ``RS/6000 and PowerPC Options'' in the main manual
+@end ifhtml
+
 @item --enable-default-ssp
 Turn on @option{-fstack-protector-strong} by default.
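
The two rs6000.c hunks above follow the same pattern: a configure-time default is applied only when the user did not pass the corresponding -m option explicitly.  A minimal standalone sketch of that pattern follows; the TOY_ mask and the function names are hypothetical stand-ins, not the real rs6000 definitions, and the extra TARGET_VSX condition of the float128 hunk is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit standing in for OPTION_MASK_LRA.  */
#define TOY_MASK_LRA ((uint64_t) 1 << 0)

/* Apply a configure-time default for MASK, unless the user set the
   option explicitly (recorded in EXPLICIT_FLAGS).  Returns the new
   flag word.  */
uint64_t
toy_apply_configured_default (uint64_t flags, uint64_t explicit_flags,
                              uint64_t mask, int configured_default)
{
  if ((explicit_flags & mask) == 0)
    {
      if (configured_default)
        flags |= mask;
      else
        flags &= ~mask;
    }
  return flags;
}

/* Small self-check covering the three interesting cases.  */
void
toy_default_demo (void)
{
  /* Not explicit, default on: the bit gets set.  */
  assert (toy_apply_configured_default (0, 0, TOY_MASK_LRA, 1)
          == TOY_MASK_LRA);
  /* Explicitly chosen by the user: the default is ignored.  */
  assert (toy_apply_configured_default (0, TOY_MASK_LRA, TOY_MASK_LRA, 1)
          == 0);
  /* Not explicit, default off: the bit gets cleared.  */
  assert (toy_apply_configured_default (TOY_MASK_LRA, 0, TOY_MASK_LRA, 0)
          == 0);
}
```

The point of checking the explicit mask first is that an explicit -mlra or -mno-lra on the command line keeps winning over whatever the compiler was configured with.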
 


Re: [AArch64][8/14] ARMv8.2-A FP16 two operands scalar intrinsics

2016-07-20 Thread Jiong Wang

On 07/07/16 17:17, Jiong Wang wrote:

This patch adds ARMv8.2-A FP16 two-operand scalar intrinsics.


The updated patch resolves the conflict with

   https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00309.html

The change is to let aarch64_emit_approx_div return false for HFmode.

gcc/
2016-07-20  Jiong Wang

* config/aarch64/aarch64-simd-builtins.def: Register new builtins.
* config/aarch64/aarch64.md (hf3): 
New.
(hf3): Likewise.
(add3): Likewise.
(sub3): Likewise.
(mul3): Likewise.
(div3): Likewise.
(*div3): Likewise.
(3): Extend to HF.
* config/aarch64/aarch64.c (aarch64_emit_approx_div): Return
false for HFmode.
* config/aarch64/aarch64-simd.md (aarch64_rsqrts): Likewise.
(fabd3): Likewise.
(3): Likewise.
(3): Likewise.
(aarch64_fmulx): Likewise.
(aarch64_fac): Likewise.
(aarch64_frecps): Likewise.
(hfhi3): New.
(hihf3): Likewise.
* config/aarch64/iterators.md (VHSDF_SDF): Delete.
(VSDQ_HSDI): Support HI.
(fcvt_target, FCVT_TARGET): Likewise.
* config/aarch64/arm_fp16.h: (vaddh_f16): New.
(vsubh_f16): Likewise.
(vabdh_f16): Likewise.
(vcageh_f16): Likewise.
(vcagth_f16): Likewise.
(vcaleh_f16): Likewise.
(vcalth_f16): Likewise.
(vcleh_f16): Likewise.
(vclth_f16): Likewise.
(vcvth_n_f16_s16): Likewise.
(vcvth_n_f16_s32): Likewise.
(vcvth_n_f16_s64): Likewise.
(vcvth_n_f16_u16): Likewise.
(vcvth_n_f16_u32): Likewise.
(vcvth_n_f16_u64): Likewise.
(vcvth_n_s16_f16): Likewise.
(vcvth_n_s32_f16): Likewise.
(vcvth_n_s64_f16): Likewise.
(vcvth_n_u16_f16): Likewise.
(vcvth_n_u32_f16): Likewise.
(vcvth_n_u64_f16): Likewise.
(vdivh_f16): Likewise.
(vmaxh_f16): Likewise.
(vmaxnmh_f16): Likewise.
(vminh_f16): Likewise.
(vminnmh_f16): Likewise.
(vmulh_f16): Likewise.
(vmulxh_f16): Likewise.
(vrecpsh_f16): Likewise.
(vrsqrtsh_f16): Likewise.

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 6f50d8405d3ee8c4823037bb2022a4f2f08b72fe..31abc077859254e3696adacb3f8f2b9b2da0647f 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -41,7 +41,7 @@
 
   BUILTIN_VDC (COMBINE, combine, 0)
   BUILTIN_VB (BINOP, pmul, 0)
-  BUILTIN_VHSDF_SDF (BINOP, fmulx, 0)
+  BUILTIN_VHSDF_HSDF (BINOP, fmulx, 0)
   BUILTIN_VHSDF_DF (UNOP, sqrt, 2)
   BUILTIN_VD_BHSI (BINOP, addp, 0)
   VAR1 (UNOP, addp, 0, di)
@@ -393,13 +393,12 @@
   /* Implemented by
  aarch64_frecp.  */
   BUILTIN_GPF_F16 (UNOP, frecpe, 0)
-  BUILTIN_GPF (BINOP, frecps, 0)
   BUILTIN_GPF_F16 (UNOP, frecpx, 0)
 
   BUILTIN_VDQ_SI (UNOP, urecpe, 0)
 
   BUILTIN_VHSDF (UNOP, frecpe, 0)
-  BUILTIN_VHSDF (BINOP, frecps, 0)
+  BUILTIN_VHSDF_HSDF (BINOP, frecps, 0)
 
   /* Implemented by a mixture of abs2 patterns.  Note the DImode builtin is
  only ever used for the int64x1_t intrinsic, there is no scalar version.  */
@@ -496,17 +495,23 @@
   /* Implemented by <*><*>3.  */
   BUILTIN_VSDQ_HSDI (SHIFTIMM, scvtf, 3)
   BUILTIN_VSDQ_HSDI (FCVTIMM_SUS, ucvtf, 3)
-  BUILTIN_VHSDF_SDF (SHIFTIMM, fcvtzs, 3)
-  BUILTIN_VHSDF_SDF (SHIFTIMM_USS, fcvtzu, 3)
+  BUILTIN_VHSDF_HSDF (SHIFTIMM, fcvtzs, 3)
+  BUILTIN_VHSDF_HSDF (SHIFTIMM_USS, fcvtzu, 3)
+  VAR1 (SHIFTIMM, scvtfsi, 3, hf)
+  VAR1 (SHIFTIMM, scvtfdi, 3, hf)
+  VAR1 (FCVTIMM_SUS, ucvtfsi, 3, hf)
+  VAR1 (FCVTIMM_SUS, ucvtfdi, 3, hf)
+  BUILTIN_GPI (SHIFTIMM, fcvtzshf, 3)
+  BUILTIN_GPI (SHIFTIMM_USS, fcvtzuhf, 3)
 
   /* Implemented by aarch64_rsqrte.  */
   BUILTIN_VHSDF_HSDF (UNOP, rsqrte, 0)
 
   /* Implemented by aarch64_rsqrts.  */
-  BUILTIN_VHSDF_SDF (BINOP, rsqrts, 0)
+  BUILTIN_VHSDF_HSDF (BINOP, rsqrts, 0)
 
   /* Implemented by fabd3.  */
-  BUILTIN_VHSDF_SDF (BINOP, fabd, 3)
+  BUILTIN_VHSDF_HSDF (BINOP, fabd, 3)
 
   /* Implemented by aarch64_faddp.  */
   BUILTIN_VHSDF (BINOP, faddp, 0)
@@ -522,10 +527,10 @@
   BUILTIN_VHSDF_HSDF (UNOP, neg, 2)
 
   /* Implemented by aarch64_fac.  */
-  BUILTIN_VHSDF_SDF (BINOP_USS, faclt, 0)
-  BUILTIN_VHSDF_SDF (BINOP_USS, facle, 0)
-  BUILTIN_VHSDF_SDF (BINOP_USS, facgt, 0)
-  BUILTIN_VHSDF_SDF (BINOP_USS, facge, 0)
+  BUILTIN_VHSDF_HSDF (BINOP_USS, faclt, 0)
+  BUILTIN_VHSDF_HSDF (BINOP_USS, facle, 0)
+  BUILTIN_VHSDF_HSDF (BINOP_USS, facgt, 0)
+  BUILTIN_VHSDF_HSDF (BINOP_USS, facge, 0)
 
   /* Implemented by sqrt2.  */
   VAR1 (UNOP, sqrt, 2, hf)
@@ -543,3 +548,7 @@
   BUILTIN_GPI_I16 (UNOPUS, fixuns_trunchf, 2)
   BUILTIN_GPI (UNOPUS, fixuns_truncsf, 2)
   BUILTIN_GPI (UNOPUS, fixuns_truncdf, 2)
+
+  /* Implemented by 3.  */
+  VAR1 (BINOP, fmax, 3, hf)
+  VAR1 (BINOP, fmin, 3, hf)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/conf
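
For readers unfamiliar with the absolute-compare intrinsics in the ChangeLog above (vcageh_f16, vcagth_f16 and friends, registered as facge/facgt with BINOP_USS, i.e. an unsigned result from two plain operands), here is a portable sketch of what I understand the scalar semantics to be: compare |a| with |b| and return an all-ones or all-zeros 16-bit mask.  It uses float as a stand-in for float16_t so that it builds without ARMv8.2-A FP16 support; the toy_ names are made up and are not part of arm_fp16.h.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute value without pulling in libm.  */
static float
toy_abs (float x)
{
  return x < 0.0f ? -x : x;
}

/* Sketch of an absolute compare "greater than or equal":
   all ones if |a| >= |b|, otherwise zero.  */
uint16_t
toy_cageh (float a, float b)
{
  return toy_abs (a) >= toy_abs (b) ? UINT16_MAX : 0;
}

/* Sketch of an absolute compare "greater than".  */
uint16_t
toy_cagth (float a, float b)
{
  return toy_abs (a) > toy_abs (b) ? UINT16_MAX : 0;
}

/* Small self-check.  */
void
toy_cage_demo (void)
{
  assert (toy_cageh (-3.0f, 2.0f) == 0xffff);
  assert (toy_cageh (1.0f, -2.0f) == 0);
  assert (toy_cageh (2.0f, -2.0f) == 0xffff);
  assert (toy_cagth (2.0f, -2.0f) == 0);
}
```

The unsigned mask result is why these builtins need a qualifier set with an unsigned return type rather than the plain BINOP one.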

Re: [AArch64][7/14] ARMv8.2-A FP16 one operand scalar intrinsics

2016-07-20 Thread Jiong Wang

On 07/07/16 17:17, Jiong Wang wrote:

This patch adds ARMv8.2-A FP16 one-operand scalar intrinsics.

Scalar intrinsics are kept in arm_fp16.h instead of arm_neon.h.


The updated patch resolves the conflict with

   https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00308.html

The change is to let aarch64_emit_approx_sqrt return false for HFmode.

gcc/
2016-07-20  Jiong Wang

* config.gcc (aarch64*-*-*): Install arm_fp16.h.
* config/aarch64/aarch64-builtins.c (hf_UP): New.
* config/aarch64/aarch64-simd-builtins.def: Register new builtins.
* config/aarch64/aarch64-simd.md (aarch64_frsqrte): Extend to HF 
mode.
(aarch64_frecp): Likewise.
(aarch64_cm): Likewise.
* config/aarch64/aarch64.md (2): Likewise.
(l2): Likewise.
(fix_trunc2): Likewise.
(sqrt2): Likewise.
(*sqrt2): Likewise.
(abs2): Likewise.
(hf2): New pattern for HF mode.
(hihf2): Likewise.
* config/aarch64/aarch64.c (aarch64_emit_approx_sqrt): Return
false for HF mode.
* config/aarch64/arm_neon.h: Include arm_fp16.h.
* config/aarch64/iterators.md (GPF_F16): New.
(GPI_F16): Likewise.
(VHSDF_HSDF): Likewise.
(w1): Support HF mode.
(w2): Likewise.
(v): Likewise.
(s): Likewise.
(q): Likewise.
(Vmtype): Likewise.
(V_cmp_result): Likewise.
(fcvt_iesize): Likewise.
(FCVT_IESIZE): Likewise.
* config/aarch64/arm_fp16.h: New file.
(vabsh_f16): New.
(vceqzh_f16): Likewise.
(vcgezh_f16): Likewise.
(vcgtzh_f16): Likewise.
(vclezh_f16): Likewise.
(vcltzh_f16): Likewise.
(vcvth_f16_s16): Likewise.
(vcvth_f16_s32): Likewise.
(vcvth_f16_s64): Likewise.
(vcvth_f16_u16): Likewise.
(vcvth_f16_u32): Likewise.
(vcvth_f16_u64): Likewise.
(vcvth_s16_f16): Likewise.
(vcvth_s32_f16): Likewise.
(vcvth_s64_f16): Likewise.
(vcvth_u16_f16): Likewise.
(vcvth_u32_f16): Likewise.
(vcvth_u64_f16): Likewise.
(vcvtah_s16_f16): Likewise.
(vcvtah_s32_f16): Likewise.
(vcvtah_s64_f16): Likewise.
(vcvtah_u16_f16): Likewise.
(vcvtah_u32_f16): Likewise.
(vcvtah_u64_f16): Likewise.
(vcvtmh_s16_f16): Likewise.
(vcvtmh_s32_f16): Likewise.
(vcvtmh_s64_f16): Likewise.
(vcvtmh_u16_f16): Likewise.
(vcvtmh_u32_f16): Likewise.
(vcvtmh_u64_f16): Likewise.
(vcvtnh_s16_f16): Likewise.
(vcvtnh_s32_f16): Likewise.
(vcvtnh_s64_f16): Likewise.
(vcvtnh_u16_f16): Likewise.
(vcvtnh_u32_f16): Likewise.
(vcvtnh_u64_f16): Likewise.
(vcvtph_s16_f16): Likewise.
(vcvtph_s32_f16): Likewise.
(vcvtph_s64_f16): Likewise.
(vcvtph_u16_f16): Likewise.
(vcvtph_u32_f16): Likewise.
(vcvtph_u64_f16): Likewise.
(vnegh_f16): Likewise.
(vrecpeh_f16): Likewise.
(vrecpxh_f16): Likewise.
(vrndh_f16): Likewise.
(vrndah_f16): Likewise.
(vrndih_f16): Likewise.
(vrndmh_f16): Likewise.
(vrndnh_f16): Likewise.
(vrndph_f16): Likewise.
(vrndxh_f16): Likewise.
(vrsqrteh_f16): Likewise.
(vsqrth_f16): Likewise.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 1f75f17877334c2bb61cd16b69539ec7514db8ae..8827dc830d374c2512be5713d6dd143913f53c7d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -300,7 +300,7 @@ m32c*-*-*)
 ;;
 aarch64*-*-*)
 	cpu_type=aarch64
-	extra_headers="arm_neon.h arm_acle.h"
+	extra_headers="arm_fp16.h arm_neon.h arm_acle.h"
 	c_target_objs="aarch64-c.o"
 	cxx_target_objs="aarch64-c.o"
 	extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o"
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index af5fac5b29cf5373561d9bf9a69c401d2bec5cec..ca91d9108ead3eb83c21ee86d9e6ed44c8f4ad2d 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -62,6 +62,7 @@
 #define si_UP    SImode
 #define sf_UP    SFmode
 #define hi_UP    HImode
+#define hf_UP    HFmode
 #define qi_UP    QImode
 #define UP(X) X##_UP
 
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 363e131327d6be04dd94e664ef839e46f26940e4..6f50d8405d3ee8c4823037bb2022a4f2f08b72fe 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -274,6 +274,14 @@
   BUILTIN_VHSDF (UNOP, round, 2)
   BUILTIN_VHSDF_DF (UNOP, frintn, 2)
 
+  VAR1 (UNOP, btrunc, 2, hf)
+  VAR1 (UNOP, ceil, 2, hf)
+  VAR1 (UNOP, floor, 2, hf)
+  VAR1 (UNOP, frintn, 2, hf)
+  VAR1 (UNOP, nearbyint, 2, hf)
+  VAR1 (UNOP, rint, 2, hf)
+  VAR1 (UNOP, round, 2, hf)
+
   /* Implemented by l2.  */
   VAR1 (UNOP, lbtruncv4hf, 2, v4hi)
   VAR1 (UNO

Re: [AArch64][3/14] ARMv8.2-A FP16 two operands vector intrinsics

2016-07-20 Thread Jiong Wang

On 07/07/16 17:15, Jiong Wang wrote:

This patch adds ARMv8.2-A FP16 two-operand vector intrinsics.


The updated patch resolves the conflict with

   https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00309.html

The change is to let aarch64_emit_approx_div return false for
V4HFmode and V8HFmode.

gcc/
2016-07-20  Jiong Wang

* config/aarch64/aarch64-simd-builtins.def: Register new builtins.
* config/aarch64/aarch64-simd.md
(aarch64_rsqrts): Extend to HF modes.
(fabd3): Likewise.
(3): Likewise.
(3): Likewise.
(aarch64_p): Likewise.
(3): Likewise.
(3): Likewise.
(3): Likewise.
(aarch64_faddp): Likewise.
(aarch64_fmulx): Likewise.
(aarch64_frecps): Likewise.
(*aarch64_fac): Rename to aarch64_fac.
(add3): Extend to HF modes.
(sub3): Likewise.
(mul3): Likewise.
(div3): Likewise.
(*div3): Likewise.
* config/aarch64/aarch64.c (aarch64_emit_approx_div): Return
false for V4HF and V8HF.
* config/aarch64/iterators.md (VDQ_HSDI, VSDQ_HSDI): New mode
iterator.
* config/aarch64/arm_neon.h (vadd_f16): Likewise.
(vaddq_f16): Likewise.
(vabd_f16): Likewise.
(vabdq_f16): Likewise.
(vcage_f16): Likewise.
(vcageq_f16): Likewise.
(vcagt_f16): Likewise.
(vcagtq_f16): Likewise.
(vcale_f16): Likewise.
(vcaleq_f16): Likewise.
(vcalt_f16): Likewise.
(vcaltq_f16): Likewise.
(vceq_f16): Likewise.
(vceqq_f16): Likewise.
(vcge_f16): Likewise.
(vcgeq_f16): Likewise.
(vcgt_f16): Likewise.
(vcgtq_f16): Likewise.
(vcle_f16): Likewise.
(vcleq_f16): Likewise.
(vclt_f16): Likewise.
(vcltq_f16): Likewise.
(vcvt_n_f16_s16): Likewise.
(vcvtq_n_f16_s16): Likewise.
(vcvt_n_f16_u16): Likewise.
(vcvtq_n_f16_u16): Likewise.
(vcvt_n_s16_f16): Likewise.
(vcvtq_n_s16_f16): Likewise.
(vcvt_n_u16_f16): Likewise.
(vcvtq_n_u16_f16): Likewise.
(vdiv_f16): Likewise.
(vdivq_f16): Likewise.
(vdup_lane_f16): Likewise.
(vdup_laneq_f16): Likewise.
(vdupq_lane_f16): Likewise.
(vdupq_laneq_f16): Likewise.
(vdups_lane_f16): Likewise.
(vdups_laneq_f16): Likewise.
(vmax_f16): Likewise.
(vmaxq_f16): Likewise.
(vmaxnm_f16): Likewise.
(vmaxnmq_f16): Likewise.
(vmin_f16): Likewise.
(vminq_f16): Likewise.
(vminnm_f16): Likewise.
(vminnmq_f16): Likewise.
(vmul_f16): Likewise.
(vmulq_f16): Likewise.
(vmulx_f16): Likewise.
(vmulxq_f16): Likewise.
(vpadd_f16): Likewise.
(vpaddq_f16): Likewise.
(vpmax_f16): Likewise.
(vpmaxq_f16): Likewise.
(vpmaxnm_f16): Likewise.
(vpmaxnmq_f16): Likewise.
(vpmin_f16): Likewise.
(vpminq_f16): Likewise.
(vpminnm_f16): Likewise.
(vpminnmq_f16): Likewise.
(vrecps_f16): Likewise.
(vrecpsq_f16): Likewise.
(vrsqrts_f16): Likewise.
(vrsqrtsq_f16): Likewise.
(vsub_f16): Likewise.
(vsubq_f16): Likewise.

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 22c87be429ba1aac2bbe77f1119d16b6b8bd6e80..007dad60b6999158a1c9c1cf2a501a9f0712af54 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -41,7 +41,7 @@
 
   BUILTIN_VDC (COMBINE, combine, 0)
   BUILTIN_VB (BINOP, pmul, 0)
-  BUILTIN_VALLF (BINOP, fmulx, 0)
+  BUILTIN_VHSDF_SDF (BINOP, fmulx, 0)
   BUILTIN_VHSDF_DF (UNOP, sqrt, 2)
   BUILTIN_VD_BHSI (BINOP, addp, 0)
   VAR1 (UNOP, addp, 0, di)
@@ -248,22 +248,22 @@
   BUILTIN_VDQ_BHSI (BINOP, smin, 3)
   BUILTIN_VDQ_BHSI (BINOP, umax, 3)
   BUILTIN_VDQ_BHSI (BINOP, umin, 3)
-  BUILTIN_VDQF (BINOP, smax_nan, 3)
-  BUILTIN_VDQF (BINOP, smin_nan, 3)
+  BUILTIN_VHSDF (BINOP, smax_nan, 3)
+  BUILTIN_VHSDF (BINOP, smin_nan, 3)
 
   /* Implemented by 3.  */
-  BUILTIN_VDQF (BINOP, fmax, 3)
-  BUILTIN_VDQF (BINOP, fmin, 3)
+  BUILTIN_VHSDF (BINOP, fmax, 3)
+  BUILTIN_VHSDF (BINOP, fmin, 3)
 
   /* Implemented by aarch64_p.  */
   BUILTIN_VDQ_BHSI (BINOP, smaxp, 0)
   BUILTIN_VDQ_BHSI (BINOP, sminp, 0)
   BUILTIN_VDQ_BHSI (BINOP, umaxp, 0)
   BUILTIN_VDQ_BHSI (BINOP, uminp, 0)
-  BUILTIN_VDQF (BINOP, smaxp, 0)
-  BUILTIN_VDQF (BINOP, sminp, 0)
-  BUILTIN_VDQF (BINOP, smax_nanp, 0)
-  BUILTIN_VDQF (BINOP, smin_nanp, 0)
+  BUILTIN_VHSDF (BINOP, smaxp, 0)
+  BUILTIN_VHSDF (BINOP, sminp, 0)
+  BUILTIN_VHSDF (BINOP, smax_nanp, 0)
+  BUILTIN_VHSDF (BINOP, smin_nanp, 0)
 
   /* Implemented by 2.  */
   BUILTIN_VHSDF (UNOP, btrunc, 2)
@@ -383,7 +383,7 @@
   BUILTIN_VDQ_SI (UNOP, urecpe, 0)
 
   BUILTIN_VHSDF (UNOP, frecpe, 0)
-  BUILTIN_VDQF (BINOP, frecps, 0)

Re: [AArch64][2/14] ARMv8.2-A FP16 one operand vector intrinsics

2016-07-20 Thread Jiong Wang

On 07/07/16 17:14, Jiong Wang wrote:

This patch adds ARMv8.2-A FP16 one-operand vector intrinsics.

We introduced new mode iterators to cover HF modes; qualifying patterns
that were using the old mode iterators are switched to the new ones.

We can't simply extend an old iterator like VDQF to cover HF modes,
because not all patterns using VDQF have FP16 support yet.  Thus we
introduced new, temporary iterators, and only apply them to
those patterns which do have FP16 support.


I noticed the patchset at

  https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00308.html

modifies the standard names "div" and "sqrt", so there
are minor conflicts, as this patch touches "sqrt" as well.

This patch resolves the conflict; the change is to let
aarch64_emit_approx_sqrt simply return false for V4HFmode and V8HFmode.

gcc/
2016-07-20  Jiong Wang

* config/aarch64/aarch64-builtins.c (TYPES_BINOP_USS): New.
* config/aarch64/aarch64-simd-builtins.def: Register new builtins.
* config/aarch64/aarch64-simd.md (aarch64_rsqrte): Extend to HF 
modes.
(neg2): Likewise.
(abs2): Likewise.
(2): Likewise.
(l2): Likewise.
(2): Likewise.
(2): Likewise.
(ftrunc2): Likewise.
(2): Likewise.
(sqrt2): Likewise.
(*sqrt2): Likewise.
(aarch64_frecpe): Likewise.
(aarch64_cm): Likewise.
* config/aarch64/aarch64.c (aarch64_emit_approx_sqrt): Return
false for V4HF and V8HF.
* config/aarch64/iterators.md (VHSDF, VHSDF_DF, VHSDF_SDF): New.
(VDQF_COND, fcvt_target, FCVT_TARGET, hcon): Extend mode attribute to 
HF modes.
(stype): New.
* config/aarch64/arm_neon.h (vdup_n_f16): New.
(vdupq_n_f16): Likewise.
(vld1_dup_f16): Use vdup_n_f16.
(vld1q_dup_f16): Use vdupq_n_f16.
(vabs_f16): New.
(vabsq_f16): Likewise.
(vceqz_f16): Likewise.
(vceqzq_f16): Likewise.
(vcgez_f16): Likewise.
(vcgezq_f16): Likewise.
(vcgtz_f16): Likewise.
(vcgtzq_f16): Likewise.
(vclez_f16): Likewise.
(vclezq_f16): Likewise.
(vcltz_f16): Likewise.
(vcltzq_f16): Likewise.
(vcvt_f16_s16): Likewise.
(vcvtq_f16_s16): Likewise.
(vcvt_f16_u16): Likewise.
(vcvtq_f16_u16): Likewise.
(vcvt_s16_f16): Likewise.
(vcvtq_s16_f16): Likewise.
(vcvt_u16_f16): Likewise.
(vcvtq_u16_f16): Likewise.
(vcvta_s16_f16): Likewise.
(vcvtaq_s16_f16): Likewise.
(vcvta_u16_f16): Likewise.
(vcvtaq_u16_f16): Likewise.
(vcvtm_s16_f16): Likewise.
(vcvtmq_s16_f16): Likewise.
(vcvtm_u16_f16): Likewise.
(vcvtmq_u16_f16): Likewise.
(vcvtn_s16_f16): Likewise.
(vcvtnq_s16_f16): Likewise.
(vcvtn_u16_f16): Likewise.
(vcvtnq_u16_f16): Likewise.
(vcvtp_s16_f16): Likewise.
(vcvtpq_s16_f16): Likewise.
(vcvtp_u16_f16): Likewise.
(vcvtpq_u16_f16): Likewise.
(vneg_f16): Likewise.
(vnegq_f16): Likewise.
(vrecpe_f16): Likewise.
(vrecpeq_f16): Likewise.
(vrnd_f16): Likewise.
(vrndq_f16): Likewise.
(vrnda_f16): Likewise.
(vrndaq_f16): Likewise.
(vrndi_f16): Likewise.
(vrndiq_f16): Likewise.
(vrndm_f16): Likewise.
(vrndmq_f16): Likewise.
(vrndn_f16): Likewise.
(vrndnq_f16): Likewise.
(vrndp_f16): Likewise.
(vrndpq_f16): Likewise.
(vrndx_f16): Likewise.
(vrndxq_f16): Likewise.
(vrsqrte_f16): Likewise.
(vrsqrteq_f16): Likewise.
(vsqrt_f16): Likewise.
(vsqrtq_f16): Likewise.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 6b90b2af5e9d2b5e7f48569ec1ebcb0ef16314ee..af5fac5b29cf5373561d9bf9a69c401d2bec5cec 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -139,6 +139,10 @@ aarch64_types_binop_ssu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_none, qualifier_none, qualifier_unsigned };
 #define TYPES_BINOP_SSU (aarch64_types_binop_ssu_qualifiers)
 static enum aarch64_type_qualifiers
+aarch64_types_binop_uss_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_none, qualifier_none };
+#define TYPES_BINOP_USS (aarch64_types_binop_uss_qualifiers)
+static enum aarch64_type_qualifiers
 aarch64_types_binopp_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_poly, qualifier_poly, qualifier_poly };
 #define TYPES_BINOPP (aarch64_types_binopp_qualifiers)
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index f1ad325f464f89c981cbdee8a8f6afafa938639a..22c87be429ba1aac2bbe77f1119d16b6b8bd6e80 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -42,7 +42,7 @@
   

[PATCH test]XFAIL gcc.dg/vect/vect-mask-store-move-1.c

2016-07-20 Thread Bin Cheng
Hi,
After patch @238301, the issue reported in PR65206 is also exposed by test case 
gcc.dg/vect/vect-mask-store-move-1.c.  This patch XFAILs the test case for the moment.
Test results checked; is it OK?

Thanks,
bin
gcc/testsuite/ChangeLog
2016-07-14  Bin Cheng  

* gcc.dg/vect/vect-mask-store-move-1.c: XFAIL.

diff --git a/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c b/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c
index f928dbf..1e06b58 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c
@@ -15,4 +15,4 @@ void foo (int n)
   }
 }
 
-/* { dg-final { scan-tree-dump-times "Move stmt to created bb" 4 "vect" { target { i?86-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "Move stmt to created bb" 4 "vect" { target { i?86-*-* x86_64-*-* } xfail { i?86-*-* x86_64-*-* } } } } */


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Bernd Edlinger
On 07/20/16 18:20, Jeff Law wrote:
> On 07/20/2016 09:41 AM, Bernd Edlinger wrote:
>> On 07/20/16 12:44, Richard Biener wrote:
>>> On Tue, 19 Jul 2016, Bernd Edlinger wrote:
>>>
>>>> Hi!
>>>>
>>>> As discussed at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71876,
>>>> we have a _very_ old hack in gcc, that recognizes certain functions by
>>>> name, and inserts in some cases unsafe attributes, that don't work for
>>>> a freestanding environment.
>>>>
>>>> It is unsafe to return ECF_MAY_BE_ALLOCA, ECF_LEAF and ECF_NORETURN
>>>> from special_function_p, just by the name of the function, especially
>>>> for less well known functions, like "getcontext" or "savectx", which
>>>> could easily be used for something completely different.
>>>
>>> Returning ECF_MAY_BE_ALLOCA is safe.  Just wanted to mention this,
>>> regardless of the followups you already received.
>>
>>
>> I don't think so.
>>
>>
>> Consider this example:
>>
>> cat test.cc
>> //extern "C"
>> void *alloca(unsigned long);
>> void bar(unsigned long n)
>> {
>>    char *x = (char*) alloca(n);
>>    if (x)
>>      *x = 0;
>> }
>>
>> g++ -O3 -S test.cc
>>
>> result:
>>
>> _Z3barm:
>> .LFB0:
>>         .cfi_startproc
>>         pushq   %rbp
>>         .cfi_def_cfa_offset 16
>>         .cfi_offset 6, -16
>>         movq    %rsp, %rbp
>>         .cfi_def_cfa_register 6
>>         call    _Z6allocam
>>         movb    $0, (%rax)
>>         leave
>>         .cfi_def_cfa 7, 8
>>         ret
>>
>> So we call a C++ function with name alloca, but because
>> special_function_p adds ECF_MAY_BE_ALLOCA, the null-pointer
>> check is eliminated, but it is not the builtin alloca,
>> but for the C++ front end it is a pretty normal function.
> Clearly if something "may be alloca", then the properties on the
> arguments/return values & side effects that are specific to alloca can
> not be relied upon.  That to me seems like a bug elsewhere in the
> compiler independent of the changes you're trying to make.
>


Yes. That is another interesting observation.  I think, originally this
flag was introduced by Jan Hubicka, and should mean, "it may be alloca
or a weak alias to alloca or maybe even something different".
But some of the later optimizations use it in a way as if it meant
"it must be alloca".  However I have not been able to come up with
a test case that makes this assumption false, but I probably just
did not try hard enough.

But I think that alloca just should not be recognized by name any
more.

>
>
> Jeff
>
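
To make the concern concrete, here is a deliberately simplified model of recognizing a function purely by its source-level name.  It is not GCC's special_function_p; the flag value and function name are hypothetical.  It only illustrates that a name match cannot tell the real builtin apart from an unrelated user function that happens to be called "alloca", so anything derived from the match has to be treated as "may be", never as "must be".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bit; the real ECF_* values live inside GCC.  */
#define TOY_ECF_MAY_BE_ALLOCA 1

/* Toy name-based classifier: looks only at the identifier, which is
   all the information a by-name check has.  */
int
toy_flags_from_name (const char *name)
{
  int flags = 0;
  if (strcmp (name, "alloca") == 0)
    flags |= TOY_ECF_MAY_BE_ALLOCA;
  return flags;
}

/* Small self-check.  */
void
toy_name_demo (void)
{
  /* Both the builtin and a user-defined C++ function named alloca
     present the same identifier, so both get the flag.  */
  assert (toy_flags_from_name ("alloca") == TOY_ECF_MAY_BE_ALLOCA);
  /* Other names are unaffected.  */
  assert (toy_flags_from_name ("malloc") == 0);
}
```

In the test.cc example quoted above, the user-declared function is allowed to return a null pointer, which is exactly the property that gets lost once the flag is read as "this is the builtin".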


[PATCH GCC]Cleanup lt_to_ne handling in niter analyzer

2016-07-20 Thread Bin Cheng
Hi,
This patch cleans up function number_of_iterations_lt_to_ne, mainly by removing 
the computation of may_be_zero.  The computation is unnecessary, and may_be_zero in 
this case must be true.  Specifically, DELTA is an integer constant and iv0.base < 
iv1.base is bound to be true because the false case is handled in function 
number_of_iterations_cond before.  This patch also refines the comment a little.

Bootstrapped and tested on x86_64; is it OK?

Thanks,
bin

2016-07-19  Bin Cheng  

* tree-ssa-loop-niter.c (number_of_iterations_lt_to_ne): Clean up
by removing computation of may_be_zero.

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 0723752..3302f62 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -1072,12 +1072,8 @@ number_of_iterations_lt_to_ne (tree type, affine_iv *iv0, affine_iv *iv1,
   tree niter_type = TREE_TYPE (step);
   tree mod = fold_build2 (FLOOR_MOD_EXPR, niter_type, *delta, step);
   tree tmod;
-  mpz_t mmod;
-  tree assumption = boolean_true_node, bound, noloop;
-  bool ret = false, fv_comp_no_overflow;
-  tree type1 = type;
-  if (POINTER_TYPE_P (type))
-type1 = sizetype;
+  tree assumption = boolean_true_node, bound;
+  tree type1 = (POINTER_TYPE_P (type)) ? sizetype : type;
 
   if (TREE_CODE (mod) != INTEGER_CST)
 return false;
@@ -1085,96 +1081,51 @@ number_of_iterations_lt_to_ne (tree type, affine_iv *iv0, affine_iv *iv1,
 mod = fold_build2 (MINUS_EXPR, niter_type, step, mod);
   tmod = fold_convert (type1, mod);
 
-  mpz_init (mmod);
-  wi::to_mpz (mod, mmod, UNSIGNED);
-  mpz_neg (mmod, mmod);
-
   /* If the induction variable does not overflow and the exit is taken,
- then the computation of the final value does not overflow.  This is
- also obviously the case if the new final value is equal to the
- current one.  Finally, we postulate this for pointer type variables,
- as the code cannot rely on the object to that the pointer points being
- placed at the end of the address space (and more pragmatically,
- TYPE_{MIN,MAX}_VALUE is not defined for pointers).  */
-  if (integer_zerop (mod) || POINTER_TYPE_P (type))
-fv_comp_no_overflow = true;
-  else if (!exit_must_be_taken)
-fv_comp_no_overflow = false;
-  else
-fv_comp_no_overflow =
-   (iv0->no_overflow && integer_nonzerop (iv0->step))
-   || (iv1->no_overflow && integer_nonzerop (iv1->step));
-
-  if (integer_nonzerop (iv0->step))
+ then the computation of the final value does not overflow.  There
+ are three cases:
+   1) The case if the new final value is equal to the current one.
+   2) Induction variable has pointer type, as the code cannot rely
+ on the object to that the pointer points being placed at the
+ end of the address space (and more pragmatically,
+ TYPE_{MIN,MAX}_VALUE is not defined for pointers).
+   3) EXIT_MUST_BE_TAKEN is true, note it implies that the induction
+ variable does not overflow.  */
+  if (!integer_zerop (mod) && !POINTER_TYPE_P (type) && !exit_must_be_taken)
 {
-  /* The final value of the iv is iv1->base + MOD, assuming that this
-computation does not overflow, and that
-iv0->base <= iv1->base + MOD.  */
-  if (!fv_comp_no_overflow)
+  if (integer_nonzerop (iv0->step))
{
+ /* The final value of the iv is iv1->base + MOD, assuming
+that this computation does not overflow, and that
+iv0->base <= iv1->base + MOD.  */
  bound = fold_build2 (MINUS_EXPR, type1,
   TYPE_MAX_VALUE (type1), tmod);
  assumption = fold_build2 (LE_EXPR, boolean_type_node,
iv1->base, bound);
- if (integer_zerop (assumption))
-   goto end;
}
-  if (mpz_cmp (mmod, bnds->below) < 0)
-   noloop = boolean_false_node;
-  else if (POINTER_TYPE_P (type))
-   noloop = fold_build2 (GT_EXPR, boolean_type_node,
- iv0->base,
- fold_build_pointer_plus (iv1->base, tmod));
   else
-   noloop = fold_build2 (GT_EXPR, boolean_type_node,
- iv0->base,
- fold_build2 (PLUS_EXPR, type1,
-  iv1->base, tmod));
-}
-  else
-{
-  /* The final value of the iv is iv0->base - MOD, assuming that this
-computation does not overflow, and that
-iv0->base - MOD <= iv1->base. */
-  if (!fv_comp_no_overflow)
{
+ /* The final value of the iv is iv0->base - MOD, assuming
+that this computation does not overflow, and that
+iv0->base - MOD <= iv1->base.  */
  bound = fold_build2 (PLUS_EXPR, type1,
   TYPE_MIN_VALUE (type1), tmod);
  assumption = fold_build2 (GE_EXPR, boolean_type_node,
iv0->base, bound
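
For context, the transformation that number_of_iterations_lt_to_ne performs can be sketched on ordinary integers: a loop with a `<` exit test is rewritten to use `!=` by moving the bound up by MOD, so that the distance between the bases becomes a multiple of the step.  The following is a simplified standalone sketch under the stated assumptions (unsigned arithmetic, step > 0, base0 < base1, and base1 + MOD does not overflow); the function names are made up and this is not the GCC code.

```c
#include <assert.h>

/* Count iterations of: for (i = base0; i < base1; i += step).  */
unsigned long
toy_count_lt (unsigned long base0, unsigned long base1, unsigned long step)
{
  unsigned long n = 0;
  for (unsigned long i = base0; i < base1; i += step)
    n++;
  return n;
}

/* The same loop with a != exit test against an adjusted bound.
   MOD is chosen so that (base1 + MOD - base0) is divisible by STEP,
   which guarantees the induction variable hits the bound exactly.  */
unsigned long
toy_count_ne (unsigned long base0, unsigned long base1, unsigned long step)
{
  assert (step > 0 && base0 < base1);

  unsigned long delta = base1 - base0;
  unsigned long mod = delta % step;
  if (mod != 0)
    mod = step - mod;

  /* Assumption: this addition does not overflow.  */
  unsigned long bound = base1 + mod;

  unsigned long n = 0;
  for (unsigned long i = base0; i != bound; i += step)
    n++;
  return n;
}

/* Small self-check: both forms iterate the same number of times.  */
void
toy_niter_demo (void)
{
  assert (toy_count_lt (0, 10, 3) == 4);
  assert (toy_count_ne (0, 10, 3) == 4);
  assert (toy_count_ne (2, 11, 3) == toy_count_lt (2, 11, 3));
}
```

The "does not overflow" assumption in the sketch corresponds to the ASSUMPTION tree that the patch still builds when none of the three cases listed in the new comment applies.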

Re: [Re: RFC: Patch 1/2 v3] New target hook: max_noce_ifcvt_seq_cost

2016-07-20 Thread James Greenhalgh
On Wed, Jul 20, 2016 at 01:41:39PM +0200, Bernd Schmidt wrote:
> On 07/20/2016 11:51 AM, James Greenhalgh wrote:
> 
> >
> >2016-07-20  James Greenhalgh  
> >
> > * target.def (max_noce_ifcvt_seq_cost): New.
> > * doc/tm.texi.in (TARGET_MAX_NOCE_IFCVT_SEQ_COST): Document it.
> > * doc/tm.texi: Regenerate.
> > * targhooks.h (default_max_noce_ifcvt_seq_cost): New.
> > * targhooks.c (default_max_noce_ifcvt_seq_cost): New.
> > * params.def (PARAM_MAX_RTL_IF_CONVERSION_PREDICTABLE_COST): New.
> > (PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST): Likewise.
> > * doc/invoke.texi: Document new params.
> 
> I think this is starting to look like a clear improvement, so I'll
> ack patches 1-3 with a few minor comments, and with the expectation
> that you'll address performance regressions on other targets if they
> occur.

I'll gladly take a look if I've caused anyone any trouble.

> Number 4 I still need to figure out.
> 
> Minor details:
> 
> >+  if (!speed_p)
> >+{
> >+  return cost <= if_info->original_cost;
> >+}
> 
> No braces around single statements in ifs. There's an instance of
> this in patch 4 as well.
> 
> >+  if (global_options_set.x_param_values[param])
> >+return PARAM_VALUE (param);
> 
> How about wrapping the param value into COSTS_N_INSNS, to make the
> value of the param less dependent on compiler internals?

I did consider this, but found it hard to word for the user documentation.
I found it easier to understand when it was in the same units as
rtx_cost, particularly as the AArch64 backend prints RTX costs to most
dump files (including ce1, ce2, ce3) so comparing directly was easy for me
to grok. I think going in either direction has the potential to confuse
users, the cost metrics of the RTL passes are very tightly coupled to
compiler internals.

I don't have a strong feeling either way, just a slight preference to keep
everything in the same units as rtx_cost where I can.

Let me know if you'd rather I follow this comment. There's some precedent
to wrapping it in COSTS_N_INSNS in GCSE_UNRESTRICTED_COST, but I find this
less clear than what I've done (well, I would say that :-) ).
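
To make the unit question concrete, here is a minimal sketch of the two readings of the param. It assumes COSTS_N_INSNS is the usual scaling macro from rtl.h, (N) * 4; the two helper names are hypothetical and are not part of the patch.

```c
#include <stdbool.h>

/* Scaling macro as found in gcc/rtl.h; treat the factor as an
   assumption if your tree differs.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical: the param is given directly in rtx_cost units, so it
   can be compared with the costs printed in the dump files.  */
bool
seq_ok_raw_units (unsigned seq_cost, unsigned param)
{
  return seq_cost <= param;
}

/* Hypothetical: the param counts instructions and is scaled before the
   comparison, hiding the internal factor from the user.  */
bool
seq_ok_insn_units (unsigned seq_cost, unsigned param)
{
  return seq_cost <= (unsigned) COSTS_N_INSNS (param);
}
```

With this scaling a param of 2 in instruction units admits the same sequences as a param of 8 in rtx_cost units, which is the whole difference between the two proposals.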

> In patch 4:
> 
> >+  /* Check that our new insn isn't going to clobber ORIG_OTHER_DEST.  */
> >+  bool modified_in_x = (set_tmp != NULL_RTX)
> >+&& modified_in_p (orig_other_dest, set_tmp);
> 
> Watch line wrapping. No parens around the first subexpression (there
> are other examples of unnecessary ones in invocations of
> noce_arith_helper), but around the full one.

I'll catch these and others on commit, thanks for pointing them out.

Thanks,
James



Re: [PATCH] Fix assembler arguments for -m16

2016-07-20 Thread Roger Pau Monne
On Wed, Jul 06, 2016 at 04:18:49PM +0200, Roger Pau Monne wrote:
> At the moment the -m16 option only passes the "--32" parameter to the
> assembler on glibc OSes, while on other OSes the assembler is called without
> any specific flag. This is wrong and causes the assembler to fail. Fix it
> by adding support for the -m16 option to x86-64.h.
> 
> 2016-07-06  Roger Pau Monné  
> 
>   * x86-64.h: append --32 to the assembler options when -m16 is used
>   even on non-glibc OSes.
> 
> ---
> Cc: h...@gcc.gnu.org
> Cc: ger...@freebsd.org
> ---
> This should be backported to all stable branches up to 4.9 (when -m16 was
> introduced).
> 
> Please keep me on Cc since I'm not subscribed to the list, thanks.

Ping?

Roger.


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Bernd Edlinger
On 07/20/16 18:15, Jeff Law wrote:
> On 07/20/2016 05:53 AM, Richard Biener wrote:
>>> Is it OK after boot-strap and regression-testing?
>>
>> I think the __builtin_setjmp change is wrong - __builtin_setjmp is
>> _not_ 'setjmp' it is part of the GCC internal machinery (using setjmp
>> and longjmp in the end) for SJLJ exception handling.
>>
>> Am I correct Eric?
> That is correct.  __builtin_setjmp (and friends) are part of the SJLJ
> exception handling code.   They use a fixed sized buffer (5 words) to
> store the key items (as opposed to the OS defined jmp_buf structure
> which is usually considerably larger).
>
> jeff

Yes. __builtin_setjmp is declared in builtins.def:

DEF_GCC_BUILTIN(BUILT_IN_SETJMP, "setjmp", BT_FN_INT_PTR, 
ATTR_NOTHROW_LEAF_LIST)

It is visible in C as __builtin_setjmp, and special_function_p
adds ECF_RETURNS_TWICE | ECF_LEAF to it.

So it becomes equivalent to this:

int __builtin_setjmp(void*) __attribute__((returns_twice, nothrow,
leaf))

after special_function_p does its magic.

If I remove the recognition of "__builtin_" from special_function_p
I have to add the returns_twice attribute in the DEF_GCC_BUILTIN.
Otherwise, I would get wrong code on all platforms, because
__builtin_setjmp saves only IP, SP, and FP registers.
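
As a rough illustration of what "returns twice" means for the code around the call, here is a small sketch. It uses the standard setjmp/longjmp pair rather than __builtin_setjmp (whose buffer layout is internal to GCC), and the function names are made up for the example.

```c
#include <setjmp.h>

static jmp_buf env;

/* Jump back into count_returns; this never returns normally.  */
static void
bounce (void)
{
  longjmp (env, 1);
}

/* Count how many times the single setjmp call below returns.  */
int
count_returns (void)
{
  /* volatile: the variable is modified between setjmp and longjmp, so
     it must not be cached in a register across the call.  */
  volatile int returns = 0;

  if (setjmp (env) == 0)
    {
      returns++;	/* First, direct return.  */
      bounce ();
    }
  else
    returns++;		/* Second return, via longjmp.  */

  return returns;
}
```

The compiler has to know that control can reappear after the call with most registers not restored; that is the information the returns_twice flag carries.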

Everything in the normal test suite keeps working with the patch,
but is there anything that I have to do to make sure that
SJLJ EH is still working? It is not the default on x86_64, right?



Bernd.


Re: Merge switch statements in tree-cfgcleanup

2016-07-20 Thread Bernd Schmidt

On 07/20/2016 06:09 PM, Jeff Law wrote:

> So I'm going to let Richi run with the review on this one since the two
> of you are already iterating.  But I did have one comment on the
> placement of the pass.
>
> I believe one of the key things to consider for whether or not something
> like this belongs in the cfgcleanup code is whether or not the
> optimization is likely exposed repeatedly through the optimization
> pipeline.  If it's mostly a source level issue or only usually exposed
> by a limited set of optimizers, then a separate pass might be better.


It can trigger before switchconv, and could expose optimization 
opportunities there, but I've also seen it trigger much later. Since I 
think it's cheap I don't see a reason not to put it in cfgcleanup, IMO 
it's the best fit conceptually.



Bernd



Re: [PATCH, vec-tails 07/10] Support loop epilogue combining

2016-07-20 Thread Jeff Law

On 07/20/2016 08:37 AM, Ilya Enkovich wrote:


Here is an updated version.

Thanks,
Ilya
--
gcc/

2016-07-20  Ilya Enkovich  

* dbgcnt.def (vect_tail_combine): New.
* params.def (PARAM_VECT_COST_INCREASE_COMBINE_THRESHOLD): New.
* tree-vect-data-refs.c (vect_get_new_ssa_name): Support vect_mask_var.
* tree-vect-loop-manip.c (slpeel_tree_peel_loop_to_edge): Support
epilogue combined with loop body.
(vect_do_peeling_for_loop_bound): Likewise.
(vect_do_peeling_for_alignment): ???
* tree-vect-loop.c: Include alias.h and dbgcnt.h.
(vect_estimate_min_profitable_iters): Add ret_min_profitable_combine_niters
arg, compute number of iterations for which loop epilogue combining is
profitable.
(vect_generate_tmps_on_preheader): Support combined epilogue.
(vect_gen_ivs_for_masking): New.
(vect_get_mask_index_for_elems): New.
(vect_get_mask_index_for_type): New.
(vect_create_narrowed_masks): New.
(vect_create_widened_masks): New.
(vect_gen_loop_masks): New.
(vect_mask_reduction_stmt): New.
(vect_mask_mask_load_store_stmt): New.
(vect_mask_load_store_stmt): New.
(vect_combine_loop_epilogue): New.
(vect_transform_loop): Support combined epilogue.

I think this is OK.  We've just got patch #5 to work through now, correct?

Jeff



Re: [PATCH GCC]Improve no-overflow check in SCEV using value range info.

2016-07-20 Thread Bin.Cheng
On Wed, Jul 20, 2016 at 11:01 AM, Richard Biener
 wrote:
> On Tue, Jul 19, 2016 at 6:15 PM, Bin.Cheng  wrote:
>> On Tue, Jul 19, 2016 at 1:10 PM, Richard Biener
>>  wrote:
>>> On Mon, Jul 18, 2016 at 6:27 PM, Bin Cheng  wrote:
>>>> Hi,
>>>> Scalar evolution needs to prove no-overflow for source variable when
>>>> handling type conversion.  This is important because otherwise we
>>>> would fail to recognize result of the conversion as SCEV, resulting
>>>> in missing loop optimizations.  Take case added by this patch as an
>>>> example, the loop can't be distributed as memset call because address
>>>> of memory reference is not recognized.  At the moment, we rely on type
>>>> overflow semantics and loop niter info for no-overflow checking,
>>>> unfortunately that's not enough.  This patch introduces new method
>>>> checking no-overflow using value range information.  As commented in
>>>> the patch, value range can only be used when source operand variable
>>>> evaluates on every loop iteration, rather than guarded by some
>>>> conditions.
>>>>
>>>> This together with patch improving loop niter analysis
>>>> (https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00736.html) can help
>>>> various loop passes like vectorization.
>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>
>>> @@ -3187,7 +3187,8 @@ idx_infer_loop_bounds (tree base, tree *idx, void 
>>> *dta)
>>>/* If access is not executed on every iteration, we must ensure that 
>>> overlow
>>>   may not make the access valid later.  */
>>>if (!dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (data->stmt))
>>> -  && scev_probably_wraps_p (initial_condition_in_loop_num (ev, 
>>> loop->num),
>>> +  && scev_probably_wraps_p (NULL,
>>>
>>> use NULL_TREE for the null pointer constant of tree.
>>>
>>> +  /* Check if VAR evaluates in every loop iteration.  */
>>> +  gimple *def;
>>> +  if ((def = SSA_NAME_DEF_STMT (var)) != NULL
>>>
>>> def is never NULL but it might be a GIMPLE_NOP which has a NULL gimple_bb.
>>> Better check for ! SSA_DEFAULT_DEF_P (var)
>>>
>>> +  if (TREE_CODE (step) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE 
>>> (var)))
>>> +return false;
>>>
>>> this looks like a cheaper test so please do that first.
>>>
>>> +  step_wi = step;
>>> +  type = TREE_TYPE (var);
>>> +  if (tree_int_cst_sign_bit (step))
>>> +{
>>> +  diff = lower_bound_in_type (type, type);
>>> +  diff = minv - diff;
>>> +  step_wi = - step_wi;
>>> +}
>>> +  else
>>> +{
>>> +  diff = upper_bound_in_type (type, type);
>>> +  diff = diff - maxv;
>>> +}
>>>
>>> this lacks a comment - it's not obvious to me what the gymnastics
>>> with lower/upper_bound_in_type are supposed to achieve.
>>
>> Thanks for reviewing, I will prepare another version of patch.
>>>
>>> As VRP uses niter analysis itself I wonder how this fires back-to-back 
>>> between
>> I am not sure if I mis-understood the question.  If the VRP
>> information comes from loop niter, I think it will not change loop
>> niter or VRP2 in back because that's the best information we got in
>> the first place in niter.  If the VRP information comes from other
>> places (guard conditions?)  SCEV and loop niter after vrp1 might be
>> improved and thus VRP2.  There should be no problems in either case,
>> as long as GCC breaks the recursive chain among niter/scev/vrp
>> correctly.
>
> Ok.
>
>>> VRP1 and VRP2?  If the def of var dominates the latch isn't it enough to do
>>> a + 1 to check whether VRP bumped the range up to INT_MAX/MIN?  That is,
>>> why do we need to add step if not for the TYPE_OVERFLOW_UNDEFINED case
>>> of VRP handling the ranges optimistically?
>> Again, please correct me if I mis-understood.  Considering a variable
>> whose type is unsigned int and scev is {0, 4}_loop, the value range
>> could be computed as [0, 0xfffc], thus MAX + 1 is smaller than
>> type_MAX, but the scev could be overflow.
>
> Yes.  I was wondering about the case where VRP bumps the range to +INF
> because it gave up during iteration or because overflow behavior is undefined.
> Do I understand correctly that the code is mostly to improve the not
> undefined-overflow case?
Hi Richard,

I think we resolved these on IRC, here are words for the record.
The motivation case is for unsigned type loop counter, while the patch
should work for signed type in theory.  Considering a loop has signed
char counter i and it's used in array_ref[i + 10], since front-end
converts signed char addition into unsigned operation, we may need the
range information to prove (unsigned char)i + 10 doesn't overflow,
thus address of array reference is a scev.  I am not sure if the
signed case can be handled by current code, or there are other
fallouts preventing this patch from working.
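
For the record, the range-based check can be sketched as below. This is only an illustration of the idea for an unsigned int counter whose value range is [0, maxv]; the helper name is hypothetical, and the final comparison of step against the headroom is my reading of how diff and step are meant to be used, not a copy of the patch.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true if adding STEP to any value in the range
   [0, MAXV] cannot wrap around in unsigned int.  */
bool
range_plus_step_no_wrap (unsigned maxv, unsigned step)
{
  /* Headroom between the top of the value range and the top of the
     type; this subtraction itself can never wrap.  */
  unsigned diff = UINT_MAX - maxv;

  return step <= diff;
}
```

This also shows why checking MAX + 1 is not sufficient: with maxv equal to UINT_MAX - 3 a step of 1 still fits, but a step of 4 wraps.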

>
> Also I was wondering if the range DEF dominates the latch then why
> do we necessarily need to add step to verify overflow?  Can't we do better
> if we for example see that the DEF is the loop header

Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-20 Thread Jeff Law

On 07/20/2016 05:53 AM, Richard Biener wrote:

>> Is it OK after boot-strap and regression-testing?
>
> I think the __builtin_setjmp change is wrong - __builtin_setjmp is
> _not_ 'setjmp' it is part of the GCC internal machinery (using setjmp
> and longjmp in the end) for SJLJ exception handling.
>
> Am I correct Eric?
That is correct.  __builtin_setjmp (and friends) are part of the SJLJ 
exception handling code.   They use a fixed sized buffer (5 words) to 
store the key items (as opposed to the OS defined jmp_buf structure 
which is usually considerably larger).


jeff


Re: Merge switch statements in tree-cfgcleanup

2016-07-20 Thread Jeff Law

On 07/20/2016 05:14 AM, Bernd Schmidt wrote:

On 07/19/2016 01:18 PM, Richard Biener wrote:

On Tue, Jul 19, 2016 at 1:07 PM, Bernd Schmidt
 wrote:

On 07/19/2016 12:35 PM, Richard Biener wrote:


I think that start/end_recording_case_labels also merged
adjacent labels via group_case_labels_stmt.  Not sure why you
need to stop recording case labels during the transform.  Is
this because you are building a new switch stmt?



It's because the cached mapping gets invalidated. Look in
tree-cfg, it has a edge_to_cases map which I think cannot be
maintained if you modify the structure. I certainly got lots of
internal errors until I added that pair of calls.


Yeah, I see that.  OTOH cfgcleanup relies on this cache to be
efficient and you (repeatedly) clear it.  Clearing parts of it
should be sufficient and if you used redirect_edge_and_branch
instead of redirect_edge_pred it would have maintained the cache as
far as I can see,


I don't think that would work, since we're modifying and/or
discarding case labels as well and they can't remain part of the
cache.


or you can make sure to maintain it yourself or just clear the info
associated with the edges you redirect from one switch to another.


How's this? Tested as before.
So I'm going to let Richi run with the review on this one since the two 
of you are already iterating.  But I did have one comment on the 
placement of the pass.


I believe one of the key things to consider for whether or not something 
like this belongs in the cfgcleanup code is whether or not the 
optimization is likely exposed repeatedly through the optimization 
pipeline.  If it's mostly a source level issue or only usually exposed 
by a limited set of optimizers, then a separate pass might be better.




jeff


Re: [PATCH]: Use HOST_WIDE_INT_{,M}1{,U} some more

2016-07-20 Thread Uros Bizjak
On Wed, Jul 20, 2016 at 3:15 PM, Bernd Schmidt  wrote:
>
>
> On 07/20/2016 02:25 PM, Uros Bizjak wrote:
>>
>> 2016-07-19 14:46 GMT+02:00 Uros Bizjak :
>>>
>>> The result of exercises with sed in gcc/ directory.
>>
>>
>> Some more conversions:
>>
>> 2016-07-20  Uros Bizjak  
>>
>> * cse.c: Use HOST_WIDE_INT_M1 instead of ~(HOST_WIDE_INT) 0.
>> * combine.c: Use HOST_WIDE_INT_M1U instead of
>> ~(unsigned HOST_WIDE_INT) 0.
>> * double-int.h: Ditto.
>> * dse.c: Ditto.
>> * dwarf2asm.c:Ditto.
>> * expmed.c: Ditto.
>> * genmodes.c: Ditto.
>> * match.pd: Ditto.
>> * read-rtl.c: Ditto.
>> * tree-ssa-loop-ivopts.c: Ditto.
>> * tree-ssa-loop-prefetch.c: Ditto.
>> * tree-vect-generic.c: Ditto.
>> * tree-vect-patterns.c: Ditto.
>> * tree.c: Ditto.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>>
>> OK for mainline?
>
>
> I think this is a good set of changes which makes the code easier to read.
> Can I impose one additional requirement, building before/after and verifying
> that all the object files are identical? If you do this, these and all other
> similar changes are preapproved.

I did check for differences of object files in stage1 and stage3
(final) directory when the compiler was bootstrapped w/ and w/o the
patch. As expected, objdump -dr didn't show any, so I'm pretty
confident that the sources are functionally the same.
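
For anyone wondering why the object files come out identical, here is a tiny sketch of the equivalence. The type and macro names below are stand-ins chosen for the example; GCC's real HOST_WIDE_INT_M1 and HOST_WIDE_INT_M1U are defined in hwint.h.

```c
#include <stdint.h>

/* Stand-ins for illustration only, assuming a 64-bit HOST_WIDE_INT.  */
typedef int64_t hwi_t;
typedef uint64_t uhwi_t;
#define HWI_M1	((hwi_t) -1)
#define HWI_M1U	((uhwi_t) -1)

/* The old spelling ~(HOST_WIDE_INT) 0 and the new constant agree.  */
int
m1_matches (void)
{
  return ~(hwi_t) 0 == HWI_M1;
}

/* Same for the unsigned variant.  */
int
m1u_matches (void)
{
  return ~(uhwi_t) 0 == HWI_M1U;
}
```

Both spellings denote the all-ones value of the type, so the change is purely cosmetic.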

I have committed the patch to mainline.

Uros.


Re: fold x ^ y to 0 if x == y

2016-07-20 Thread Jeff Law

On 07/20/2016 09:35 AM, Richard Biener wrote:

>> I have reported it as PR71947.
>> Could you help me point out how to fix this ?
>
> Not record both equivalences.  This might break the testcase it was
> introduced for (obviously).  Which is why I CCed Jeff for his opinion.

It's on my todo list.  I'm still catching up from my PTO last month.

It'll certainly regress the testcase that was introduced when we 
recorded both equivalences.


jeff



Re: [PATCH 2/3][AArch64] Improve zero extend

2016-07-20 Thread Wilco Dijkstra
Richard Earnshaw wrote:
> So under what circumstances does it lead to sub-optimal code?

If the cost is incorrect Combine can make the wrong decision, for example
whether to emit a multiply-add or not. I'm not sure whether this still happens
as Kyrill fixed several issues in Combine since this patch was written.

Wilco





Re: fold x ^ y to 0 if x == y

2016-07-20 Thread Richard Biener
On Wed, 20 Jul 2016, Prathamesh Kulkarni wrote:

> On 8 July 2016 at 12:29, Richard Biener  wrote:
> > On Fri, 8 Jul 2016, Richard Biener wrote:
> >
> >> On Fri, 8 Jul 2016, Prathamesh Kulkarni wrote:
> >>
> >> > Hi Richard,
> >> > For the following test-case:
> >> >
> >> > int f(int x, int y)
> >> > {
> >> >int ret;
> >> >
> >> >if (x == y)
> >> >  ret = x ^ y;
> >> >else
> >> >  ret = 1;
> >> >
> >> >return ret;
> >> > }
> >> >
> >> > I was wondering if x ^ y should be folded to 0 since
> >> > it's guarded by condition x == y ?
> >> >
> >> > optimized dump shows:
> >> > f (int x, int y)
> >> > {
> >> >   int iftmp.0_1;
> >> >   int iftmp.0_4;
> >> >
> >> >   :
> >> >   if (x_2(D) == y_3(D))
> >> > goto ;
> >> >   else
> >> > goto ;
> >> >
> >> >   :
> >> >   iftmp.0_4 = x_2(D) ^ y_3(D);
> >> >
> >> >   :
> >> >   # iftmp.0_1 = PHI 
> >> >   return iftmp.0_1;
> >> >
> >> > }
> >> >
> >> > The attached patch tries to fold for above case.
> >> > I am checking if op0 and op1 are equal using:
> >> > if (bitmap_intersect_p (vr1->equiv, vr2->equiv)
> >> >&& operand_equal_p (vr1->min, vr1->max)
> >> >&& operand_equal_p (vr2->min, vr2->max))
> >> >   { /* equal */ }
> >> >
> >> > I suppose intersection would check if op0 and op1 have equivalent ranges,
> >> > and added operand_equal_p check to ensure that there is only one
> >> > element within the range. Does that look correct ?
> >> > Bootstrap+test in progress on x86_64-unknown-linux-gnu.
> >>
> >> I think VRP is the wrong place to catch this and DOM should have but it
> >> does
> >>
> >> Optimizing block #3
> >>
> >> 1>>> STMT 1 = x_2(D) le_expr y_3(D)
> >> 1>>> STMT 1 = x_2(D) ge_expr y_3(D)
> >> 1>>> STMT 1 = x_2(D) eq_expr y_3(D)
> >> 1>>> STMT 0 = x_2(D) ne_expr y_3(D)
> >> 0>>> COPY x_2(D) = y_3(D)
> >> 0>>> COPY y_3(D) = x_2(D)
> >> Optimizing statement ret_4 = x_2(D) ^ y_3(D);
> >>   Replaced 'x_2(D)' with variable 'y_3(D)'
> >>   Replaced 'y_3(D)' with variable 'x_2(D)'
> >>   Folded to: ret_4 = x_2(D) ^ y_3(D);
> >> LKUP STMT ret_4 = x_2(D) bit_xor_expr y_3(D)
> >>
> >> heh, registering both equivalences is obviously not going to help...
> >>
> >> The 2nd equivalence is from doing
> >>
> >>   /* We already recorded that LHS = RHS, with canonicalization,
> >>  value chain following, etc.
> >>
> >>  We also want to record RHS = LHS, but without any
> >> canonicalization
> >>  or value chain following.  */
> >>   if (TREE_CODE (rhs) == SSA_NAME)
> >> const_and_copies->record_const_or_copy_raw (rhs, lhs,
> >> SSA_NAME_VALUE (rhs));
> >>
> >> generally recording both is not helpful.  Jeff?  This seems to be
> >> r233207 (fix for PR65917) which must have regressed this testcase.
> >
> > Just verified it works fine on the GCC 5 branch:
> >
> > Optimizing block #3
> >
> > 0>>> COPY y_3(D) = x_2(D)
> > 1>>> STMT 1 = x_2(D) le_expr y_3(D)
> > 1>>> STMT 1 = x_2(D) ge_expr y_3(D)
> > 1>>> STMT 1 = x_2(D) eq_expr y_3(D)
> > 1>>> STMT 0 = x_2(D) ne_expr y_3(D)
> > Optimizing statement ret_4 = x_2(D) ^ y_3(D);
> >   Replaced 'y_3(D)' with variable 'x_2(D)'
> > Applying pattern match.pd:240, gimple-match.c:11346
> > gimple_simplified to ret_4 = 0;
> >   Folded to: ret_4 = 0;
> I have reported it as PR71947.
> Could you help me point out how to fix this ?

Not record both equivalences.  This might break the testcase it was
introduced for (obviously).  Which is why I CCed Jeff for his opinion.
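
For reference, the fold under discussion amounts to the following. The first function is the testcase from the report; the second, with a name made up for this example, is what the optimized form would compute once x ^ y is folded to 0 under the x == y guard.

```c
/* The testcase from the report.  */
int
f (int x, int y)
{
  int ret;

  if (x == y)
    ret = x ^ y;	/* x == y on this path, so this is always 0.  */
  else
    ret = 1;

  return ret;
}

/* What the folded version computes (hypothetical name).  */
int
f_folded (int x, int y)
{
  return x == y ? 0 : 1;
}
```

DOM only gets there if it substitutes one name for the other consistently; replacing x with y and y with x at the same time leaves x ^ y unchanged, as in the dump above.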

Richard.

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH 2/3][AArch64] Improve zero extend

2016-07-20 Thread Richard Earnshaw (lists)
On 20/07/16 16:28, Wilco Dijkstra wrote:
> Richard Earnshaw wrote:
>> Both of which look reasonable to me.
> 
> Yes the code we generate for these examples is fine, I don't believe this
> example ever went bad. It's just the cost calculation that is incorrect with
> the outer check.
> 
> Wilco
> 
> 

So under what circumstances does it lead to sub-optimal code?

R.


Re: [PATCH 2/3][AArch64] Improve zero extend

2016-07-20 Thread Wilco Dijkstra
Richard Earnshaw wrote:
> Both of which look reasonable to me.

Yes the code we generate for these examples is fine, I don't believe this
example ever went bad. It's just the cost calculation that is incorrect with
the outer check.

Wilco




Re: [AArch64][3/3] Migrate aarch64_expand_prologue/epilogue to aarch64_add_constant

2016-07-20 Thread Richard Earnshaw (lists)
On 20/07/16 16:02, Jiong Wang wrote:
> On 20/07/16 15:18, Richard Earnshaw (lists) wrote:
>> On 20/07/16 14:03, Jiong Wang wrote:
>>> Those stack adjustment sequences inside aarch64_expand_prologue/epilogue
>>> are doing exactly what aarch64_add_constant offers, except they also
>>> need to be aware of dwarf generation.
>>>
>>> This patch teaches the existing aarch64_add_constant about dwarf
>>> generation; currently the SP register is supported.  Whenever SP is
>>> updated, there should be a CFA update, so we mark these instructions as
>>> frame related, and if the update is too complex for gcc to guess the
>>> adjustment, we attach an explicit annotation.
>>>
>>> Both dwarf frame info size and pro/epilogue scheduling are improved
>>> after
>>> this patch as aarch64_add_constant has better utilization of scratch
>>> register.
>>>
>>> OK for trunk?
>>>
>>> gcc/
>>> 2016-07-20  Jiong Wang  
>>>
>>>  * config/aarch64/aarch64.c (aarch64_add_constant): Mark
>>>  instruction as frame related when it is.  Generate CFA
>>>  annotation when it's necessary.
>>>  (aarch64_expand_prologue): Use aarch64_add_constant.
>>>  (aarch64_expand_epilogue): Likewise.
>>>
>> Are you sure using aarch64_add_constant is unconditionally safe?  Stack
>> adjustments need to be done very carefully to ensure that we never
>> transiently deallocate part of the stack.
> 
> Richard,
> 
>   Thanks for the review, yes, I believe using aarch64_add_constant is
> unconditionally safe here, because we have generated a stack tie to
> clobber the whole memory and thus prevent any instruction which
> accesses the stack from being scheduled after that.
> 
>   The access to deallocated stack issue was there and fixed by
> 
>   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02292.html.
> 
>   aarch64_add_constant itself is generating the same instruction
> sequences as the original code, except for a few cases, it will prefer
> 
>   move scratch_reg, #imm
>   add sp, sp, scratch_reg
> 
> than:
>   add sp, sp, #imm_part1
>   add sp, sp, #imm_part2
> 
> 
> 
> 

But can you guarantee we will never get an add and a sub in a single
adjustment?  If not, then we need to ensure the two instructions appear
in the right order.
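
The ordering concern can be modelled with a few lines of C. This is a deliberately simplified model and an assumption on my part, not GCC code: the stack grows down, and anything at or above the higher of the starting and final SP is treated as live, so SP must never rise above that limit part-way through the sequence. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Apply DELTAS to the stack pointer in order (negative allocates,
   positive deallocates) and report whether SP ever rises above
   max (start, end), i.e. transiently releases live stack.  */
bool
adjust_is_safe (long start, const long *deltas, size_t n)
{
  long end = start;
  for (size_t i = 0; i < n; i++)
    end += deltas[i];

  long limit = start > end ? start : end;

  long sp = start;
  for (size_t i = 0; i < n; i++)
    {
      sp += deltas[i];
      if (sp > limit)
	return false;	/* Live stack was released part-way through.  */
    }
  return true;
}
```

Under this model a mixed pair is only safe with the sub first, for allocation and deallocation alike.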

R.


Re: [PATCH 8/9] shrink-wrap: shrink-wrapping for separate concerns

2016-07-20 Thread Segher Boessenkool
On Wed, Jul 20, 2016 at 01:23:44PM +0200, Bernd Schmidt wrote:
> >>>But you need the profile to make even reasonably good decisions.
> >>
> >>I'm not worried about making cost decisions: as far as I'm concerned
> >>it's perfectly fine for that. I'm worried about correctness - you can't
> >>validly save registers inside a loop.
> >
> >Of course you can.  It needs to be paired with a restore; and we do
> >that just fine.
> > Pretty much *all* implementations in the literature do this, fwiw.

> I, however, fail to see where this happens.

See for example one of the better papers on shrink-wrapping, "Post Register
Allocation Spill Code Optimization", by Lupo and Wilken.

See the problem definition (section 2); figure 1, which clearly shows
multiple save/restores (executed more than once); and section 4.2
for why we don't need to look at loops.

[ In this paper prologue/epilogue pairs are only placed around SESE
regions, which we do not have many in GCC that late in RTL (often the
CFG isn't even reducible); there is no reason to restrict to SESE regions
though ].

> If you have references to 
> somewhere where this algorithm is described, that would be helpful, 

No, of course not, because I just made this up, as should be clear.

The problem definition is simple: we have a CFG, and some of the blocks
in that CFG need some things done by the prologue done before they
execute.  We don't want to run that prologue code more often than
necessary, because it can be expensive (compared to the parts of the
function that are executed at all).  Considering all possible combinations
of blocks (or edges) where we can place a prologue is not computationally
feasible.  But there is a shortcut: if a block X gets a prologue, all
blocks dominated by it will for zero cost have that prologue established
as well (by simply not doing an epilogue until they are reached).  So
this algo does the obvious thing, simply walking the dom tree (which is
O(n)).  Then, from the prologue placement, we compute which blocks will
execute with that concern "active"; and we insert prologue/epilogue code
to make that assignment true (a prologue or epilogue for every edge that
crosses from "does not have" to "does have", or the other way around; and
then there is the head/tail thing because cross-jumping fails to unify
many of those *logues, so we take care of it manually).
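
The last step, deciding what goes on each edge once we know which blocks run with the concern active, can be sketched like this. It is a simplified stand-alone illustration with made-up names, using plain arrays instead of GCC's CFG structures.

```c
/* What has to be inserted on an edge.  */
enum action { NONE, PROLOGUE, EPILOGUE };

/* ACTIVE[b] is nonzero if block b executes with the concern
   established.  An edge that crosses from "does not have" to "does
   have" needs a prologue, the other direction needs an epilogue, and
   everything else needs nothing.  */
enum action
edge_action (const int *active, int src, int dest)
{
  if (!active[src] && active[dest])
    return PROLOGUE;
  if (active[src] && !active[dest])
    return EPILOGUE;
  return NONE;
}
```

For a chain of blocks 0 -> 1 -> 2 -> 3 where only blocks 1 and 2 need the concern, this puts the prologue on 0 -> 1 and the epilogue on 2 -> 3.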

> because at this stage I think I really don't understand what you're 
> trying to achieve. The submission lacks examples.

It says what it does right at the start of the head comment:

"""
   Instead of putting all of the prologue and epilogue in one spot, we
   can put parts of it in places that are executed less frequently.  The
   following code does this, for concerns that can have more than one
   prologue and epilogue, and where those pro- and epilogues can be
   executed more than once.
"""

followed by a bunch of detail.

> So I could see things could work if you place an epilogue part in the 
> last block of a loop if the start of the loop contains a corresponding 
> part of the prologue, but taking just the comment in the code:
>Prologue concerns are placed in such a way that they are executed as
>infrequently as possible.  Epilogue concerns are put everywhere where
>there is an edge from a bb dominated by such a prologue concern to a
>bb not dominated by one.
> 
> this describes no mechanism by which such a thing would happen.

Sure it does.  The edge leaving the loop, for example.

You can have a prologue/epilogue pair within a loop (which is unusual,
but *can* happen, e.g. as part of a conditional that executes almost
never -- this is quite frequent btw, assertions, on-the-run initialisation,
etc.)

The situation you describe has all the blocks in the loop needing the
prologue (but, say, nothing outside the loop).  Then of course the prologue
is placed on the entry (edge) into the loop, and the epilogue on the exit
edge(s).

> And I 
> fail to see how moving parts of the prologue into a loop would be 
> beneficial as an optimization.

  for (i = 0; i < 10; i++)
    if (less_than_one_in_ten_times)
      do_something_that_needs_a_prologue;

or

  for (i = 0; i < 10; i++)
    if (whatever)
      do_something_that_needs_a_prologue_and_does_not_return;

or whatever other situation.  We do not have natural loops, often.  The
algorithm places prologues so that their dynamic execution frequency is
optimal, which results in their dynamic execution being optimal, whatever
the CFG looks like.


Segher


Re: [AArch64][3/3] Migrate aarch64_expand_prologue/epilogue to aarch64_add_constant

2016-07-20 Thread Jiong Wang

On 20/07/16 15:18, Richard Earnshaw (lists) wrote:

> On 20/07/16 14:03, Jiong Wang wrote:
>> Those stack adjustment sequences inside aarch64_expand_prologue/epilogue
>> are doing exactly what aarch64_add_constant offers, except they also
>> need to be aware of dwarf generation.
>>
>> This patch teaches the existing aarch64_add_constant about dwarf
>> generation; currently the SP register is supported.  Whenever SP is
>> updated, there should be a CFA update, so we mark these instructions as
>> frame related, and if the update is too complex for gcc to guess the
>> adjustment, we attach an explicit annotation.
>>
>> Both dwarf frame info size and pro/epilogue scheduling are improved after
>> this patch as aarch64_add_constant has better utilization of scratch
>> register.
>>
>> OK for trunk?
>>
>> gcc/
>> 2016-07-20  Jiong Wang
>>
>>  * config/aarch64/aarch64.c (aarch64_add_constant): Mark
>>  instruction as frame related when it is.  Generate CFA
>>  annotation when it's necessary.
>>  (aarch64_expand_prologue): Use aarch64_add_constant.
>>  (aarch64_expand_epilogue): Likewise.
>>
> Are you sure using aarch64_add_constant is unconditionally safe?  Stack
> adjustments need to be done very carefully to ensure that we never
> transiently deallocate part of the stack.

Richard,

  Thanks for the review, yes, I believe using aarch64_add_constant is
unconditionally safe here, because we have generated a stack tie to
clobber the whole memory and thus prevent any instruction which
accesses the stack from being scheduled after that.

  The access to deallocated stack issue was there and fixed by

  https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02292.html.

  aarch64_add_constant itself is generating the same instruction
sequences as the original code, except for a few cases, it will prefer

  move scratch_reg, #imm
  add sp, sp, scratch_reg

than:
  add sp, sp, #imm_part1
  add sp, sp, #imm_part2






Re: [PATCH v2] C++ FE: handle misspelled identifiers and typenames

2016-07-20 Thread Jason Merrill
On Wed, Jul 20, 2016 at 10:46 AM, David Malcolm  wrote:
> @@ -1407,6 +1407,10 @@ lookup_field_fuzzy_info::fuzzy_lookup_field (tree type)
> The TYPE_FIELDS of TYPENAME_TYPE is its TYPENAME_TYPE_FULLNAME.  */
>  return;
>
> +  /* TYPE_FIELDS is not valid for a TYPE_PACK_EXPANSION.  */
> +  if (TREE_CODE (type) == TYPE_PACK_EXPANSION)
> +return;

Instead of checking for various invalid codes, why don't we just check
CLASS_TYPE_P at the top, like fuzzy_lookup_fnfields?

OK with that change.

Jason


Re: [PATCH, vec-tails 07/10] Support loop epilogue combining

2016-07-20 Thread Ilya Enkovich
On 14 Jul 16:04, Jeff Law wrote:
> On 06/28/2016 06:24 AM, Ilya Enkovich wrote:
> 
> >
> >Here is an updated patch version.
> >
> >Thanks,
> >Ilya
> >--
> >gcc/
> >
> >+/* Function vect_gen_loop_masks.
> >+
> >+   Create masks to mask a loop described by LOOP_VINFO.  Masks
> >+   are created according to LOOP_VINFO_REQUIRED_MASKS and are stored
> >+   into MASKS vector.
> >+
> >+   Index of a mask in a vector is computed according to a number
> >+   of masks's elements.  Masks are sorted by number of its elements
> >+   in descending order.  Index 0 is used to access a mask with
> >+   current_vector_size elements.  Among masks with the same number
> >+   of elements the one with lower index is used to mask iterations
> >+   with smaller iteration counter.  Note that you may get NULL elements
> >+   for masks which are not required.  Use vect_get_mask_index_for_elems
> >+   or vect_get_mask_index_for_type to access resulting vector.  */
> >+
> >+static void
> >+vect_gen_loop_masks (loop_vec_info loop_vinfo, vec *masks)
> I find myself wondering if this ought to be broken down a bit (without
> changing the underlying semantics).
> 
> >+
> >+  /* Create narrowed masks.  */
> >+  cur_mask_elems = iv_elems;
> >+  nmasks = ivs.length ();
> >+  while (cur_mask_elems < max_mask_elems)
> >+{
> >+  prev_mask = vect_get_mask_index_for_elems (cur_mask_elems);
> >+
> >+  cur_mask_elems <<= 1;
> >+  nmasks >>= 1;
> >+
> >+  cur_mask = vect_get_mask_index_for_elems (cur_mask_elems);
> >+
> >+  mask_type = build_truth_vector_type (cur_mask_elems, vec_size);
> >+
> >+  for (unsigned i = 0; i < nmasks; i++)
> >+{
> >+  tree mask_low = (*masks)[prev_mask++];
> >+  tree mask_hi = (*masks)[prev_mask++];
> >+  mask = vect_get_new_ssa_name (mask_type, vect_mask_var);
> >+  stmt = gimple_build_assign (mask, VEC_PACK_TRUNC_EXPR,
> >+  mask_low, mask_hi);
> >+  gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
> >+  (*masks)[cur_mask++] = mask;
> >+}
> >+}
> For example, pull this into its own function as well as the code to create
> widened masks.  In fact, didn't I see those functions in one of the other
> patches as their own separate subroutines?

There were functions which check we may generate such masks.  Here we
actually generate them.  I moved the code into separate functions.

> 
> It's not a huge deal and I don't think it requires another round of review.
> I just found myself scrolling through multiple pages of this function and
> thought it'd be slightly easier to grok if it were simply smaller.
> 
> 
> >+
> >+/* Function vect_mask_reduction_stmt.
> >+
> >+   Mask given vectorized reduction statement STMT using
> >+   MASK.  In case scalar reduction statement is vectorized
> >+   into several vector statements then PREV holds a
> >+   preceding vetor statement copy for STMT.
> s/vetor/vector/
> 
> With the one function split up and the typo fix I think this is OK for the
> trunk when the set as a whole is ready.
> 
> jeff
> 
> 

Here is an updated version.

Thanks,
Ilya
--
gcc/

2016-07-20  Ilya Enkovich  

* dbgcnt.def (vect_tail_combine): New.
* params.def (PARAM_VECT_COST_INCREASE_COMBINE_THRESHOLD): New.
* tree-vect-data-refs.c (vect_get_new_ssa_name): Support vect_mask_var.
* tree-vect-loop-manip.c (slpeel_tree_peel_loop_to_edge): Support
epilogue combined with loop body.
(vect_do_peeling_for_loop_bound): Likewise.
(vect_do_peeling_for_alignment): ???
* tree-vect-loop.c: Include alias.h and dbgcnt.h.
(vect_estimate_min_profitable_iters): Add
ret_min_profitable_combine_niters arg, compute number of iterations
for which loop epilogue combining is profitable.
(vect_generate_tmps_on_preheader): Support combined epilogue.
(vect_gen_ivs_for_masking): New.
(vect_get_mask_index_for_elems): New.
(vect_get_mask_index_for_type): New.
(vect_create_narrowed_masks): New.
(vect_create_widened_masks): New.
(vect_gen_loop_masks): New.
(vect_mask_reduction_stmt): New.
(vect_mask_mask_load_store_stmt): New.
(vect_mask_load_store_stmt): New.
(vect_combine_loop_epilogue): New.
(vect_transform_loop): Support combined epilogue.


diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index 78ddcc2..73c2966 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -192,4 +192,5 @@ DEBUG_COUNTER (treepre_insert)
 DEBUG_COUNTER (tree_sra)
 DEBUG_COUNTER (vect_loop)
 DEBUG_COUNTER (vect_slp)
+DEBUG_COUNTER (vect_tail_combine)
 DEBUG_COUNTER (dom_unreachable_edges)
diff --git a/gcc/params.def b/gcc/params.def
index b86d592..745da4c 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1232,6 +1232,11 @@ DEFPARAM (PARAM_MAX_SPECULATIVE_DEVIRT_MAYDEFS,
  "Maximum number of may-defs visited when devirtualizing "
  "speculatively", 50, 0, 0)
 
+DEFPARAM (PARAM_VECT_COST_INCREA

Re: [PATCH] disable ifunc on *-musl by default

2016-07-20 Thread Szabolcs Nagy
On 20/07/16 15:13, David Edelsohn wrote:
> On Wed, Jul 20, 2016 at 7:09 AM, Szabolcs Nagy  wrote:
>> On 20/07/16 14:45, David Edelsohn wrote:
>>>> Musl libc does not support gnu ifunc, so disable it by default.
>>>> (not disabled on s390-* since that has no musl support yet.)
>>>
>>> Musl libc now supports PPC64. Support for s390 is in progress.
>>>
>>
>> it seemed to me that on ppc64 ifunc is disabled by default.
>> (at least it is not enabled in config.gcc)
> 
> Ifunc is used on PPC64.
> 

only if you build gcc with --enable-gnu-indirect-function

otherwise the ifunc attribute does not work, and target
libs (e.g. libatomic on x86_64) don't use ifunc.

in glibc i think you need --enable-multiarch to use ifunc,
but it handles ifunc in user code independently of that.

i just want to make sure that --enable-gnu-indirect-function
is not the default with *-musl (since musl has no ifunc support).



Re: [patch,avr] make progmem work on AVR_TINY, use TARGET_ADDR_SPACE_DIAGNOSE_USAGE

2016-07-20 Thread Georg-Johann Lay

On 18.07.2016 08:58, Denis Chertykov wrote:

2016-07-15 18:26 GMT+03:00 Georg-Johann Lay :

This patch needs new hook TARGET_ADDR_SPACE_DIAGNOSE_USAGE:
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00839.html

This patch turns attribute progmem into a working feature for AVR_TINY
cores.

It boils down to adding 0x4000 to all symbols with progmem:  Flash memory
can be seen in the RAM address space starting at 0x4000, i.e. data in flash
can be read by means of LD instruction if we add offsets of 0x4000.  There
is no need for special access macros like pgm_read_* or special address
spaces as there is nothing like a LPM instruction.

This is simply achieved by setting a respective symbol_ref_flag, and when
such a symbol has to be printed, then plus_constant with 0x4000 is used.

Diagnosing of unsupported address spaces is now performed by
TARGET_ADDR_SPACE_DIAGNOSE_USAGE which has exact location information.
Hence there is no need to scan all decls for invalid address spaces.

For AVR_TINY, all address spaces have been disabled.  They are of no use.
Supporting __flash would just make the backend more complicated without any
gains.


Ok for trunk?

Johann


gcc/
* doc/extend.texi (AVR Variable Attributes) [progmem]: Add
documentation how it works on reduced Tiny cores.
(AVR Named Address Spaces): No support for reduced Tiny.
* avr-protos.h (avr_addr_space_supported_p): New prototype.
* avr.c (AVR_SYMBOL_FLAG_TINY_PM): New macro.
(avr_address_tiny_pm_p): New static function.
(avr_print_operand_address) [AVR_TINY]: Add AVR_TINY_PM_OFFSET
if the address is in progmem.
(avr_assemble_integer): Same.
(avr_encode_section_info) [AVR_TINY]: Set AVR_SYMBOL_FLAG_TINY_PM
for symbol_ref in progmem.
(TARGET_ADDR_SPACE_DIAGNOSE_USAGE): New hook define...
(avr_addr_space_diagnose_usage): ...and implementation.
(avr_addr_space_supported_p): New function.
(avr_nonconst_pointer_addrspace, avr_pgm_check_var_decl): Only
report bad address space usage if that space is supported.
(avr_insert_attributes): Same.  No more complain about unsupported
address spaces.
* avr.h (AVR_TINY_PM_OFFSET): New macro.
* avr-c.c (tm_p.h): Include it.
(avr_cpu_cpp_builtins) [__AVR_TINY_PM_BASE_ADDRESS__]: Use
AVR_TINY_PM_OFFSET instead of magic 0x4000 when built-in def'ing.
Only define addr-space related built-in macro if
avr_addr_space_supported_p.
gcc/testsuite/
* gcc.target/avr/torture/tiny-progmem.c: New test.



Approved.


Committed, but I split it into 2 change-sets.  The only effective change is 
that the hook has a different prototype (returns void instead of bool).



Part1: Implement new target hook TARGET_ADDR_SPACE_DIAGNOSE_USAGE.

https://gcc.gnu.org/r238519

gcc/
* avr-protos.h (avr_addr_space_supported_p): New prototype.
* avr.c (TARGET_ADDR_SPACE_DIAGNOSE_USAGE): New hook define...
(avr_addr_space_diagnose_usage): ...and implementation.
(avr_addr_space_supported_p): New function.
(avr_nonconst_pointer_addrspace, avr_pgm_check_var_decl): Only
report bad address space usage if that space is supported.
(avr_insert_attributes): Same.  No more complain about unsupported
address spaces.
* avr-c.c (tm_p.h): Include it.
(avr_cpu_cpp_builtins): Only define addr-space related built-in
macro if avr_addr_space_supported_p.

Part2: Make progmem work for reduced Tiny cores

https://gcc.gnu.org/r238525

gcc/
Implement attribute progmem on reduced Tiny cores by adding
flash offset 0x4000 to respective symbols.

PR target/71948
* doc/extend.texi (AVR Variable Attributes) [progmem]: Add
documentation how it works on reduced Tiny cores.
(AVR Named Address Spaces): No support for reduced Tiny.
* config/avr/avr.c (AVR_SYMBOL_FLAG_TINY_PM): New macro.
(avr_address_tiny_pm_p): New static function.
(avr_print_operand_address) [AVR_TINY]: Add AVR_TINY_PM_OFFSET
if the address is in progmem.
(avr_assemble_integer): Same.
(avr_encode_section_info) [AVR_TINY]: Set AVR_SYMBOL_FLAG_TINY_PM
for symbol_ref in progmem.
* config/avr/avr.h (AVR_TINY_PM_OFFSET): New macro.
* config/avr/avr-c.c (avr_cpu_cpp_builtins): Use it instead of
magic 0x4000 when built-in def'ing __AVR_TINY_PM_BASE_ADDRESS__.
gcc/testsuite/
PR target/71948
* gcc.target/avr/torture/tiny-progmem.c: New test.

Index: config/avr/avr-c.c
===
--- config/avr/avr-c.c	(revision 238518)
+++ config/avr/avr-c.c	(revision 238519)
@@ -26,7 +26,7 @@
 #include "c-family/c-common.h"
 #include "stor-layout.h"
 #include "langhooks.h"
-
+#include "tm_p.h"
 
 /* IDs for all the AVR builtins.  */
 
@@ -253,7 +253,10 @@ avr_register

Re: [PATCH v2] C++ FE: handle misspelled identifiers and typenames

2016-07-20 Thread Jakub Jelinek
On Wed, Jul 20, 2016 at 10:46:58AM -0400, David Malcolm wrote:
> +  /* Skip anticipated decls of builtin functions.  */
> +  if (TREE_CODE (t) == FUNCTION_DECL)
> + if (DECL_BUILT_IN (t))
> +   if (DECL_ANTICIPATED (t))

Just a style comment, wouldn't
  if (TREE_CODE (t) == FUNCTION_DECL
      && DECL_BUILT_IN (t)
      && DECL_ANTICIPATED (t))
    continue;
be better?

Jakub


Re: [AArch64][3/3] Migrate aarch64_expand_prologue/epilogue to aarch64_add_constant

2016-07-20 Thread Richard Earnshaw (lists)
On 20/07/16 14:03, Jiong Wang wrote:
> The stack adjustment sequences inside aarch64_expand_prologue/epilogue
> do exactly what aarch64_add_constant offers, except that they also
> need to be aware of dwarf generation.
> 
> This patch teaches the existing aarch64_add_constant about dwarf
> generation; currently only the SP register is supported.  Whenever SP is
> updated there should be a CFA update, so we mark these instructions as
> frame related, and if the update is too complex for gcc to guess the
> adjustment, we attach an explicit annotation.
> 
> Both dwarf frame info size and pro/epilogue scheduling are improved by
> this patch, as aarch64_add_constant makes better use of the scratch
> register.
> 
> OK for trunk?
> 
> gcc/
> 2016-07-20  Jiong Wang  
> 
> * config/aarch64/aarch64.c (aarch64_add_constant): Mark
> instruction as frame related when it is.  Generate CFA
> annotation when it's necessary.
> (aarch64_expand_prologue): Use aarch64_add_constant.
> (aarch64_expand_epilogue): Likewise.
> 

Are you sure using aarch64_add_constant is unconditionally safe?  Stack
adjustments need to be done very carefully to ensure that we never
transiently deallocate part of the stack.

R.


> 
> build-const-3.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 41844a101247c939ecb31f8a8c17cf79759255aa..b38f3f1e8f85a5f3191d0c96080327dac7b2eaed
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1874,6 +1874,8 @@ aarch64_add_constant (machine_mode mode, int regnum, 
> int scratchreg,
>  {
>HOST_WIDE_INT mdelta = abs_hwi (delta);
>rtx this_rtx = gen_rtx_REG (mode, regnum);
> +  bool frame_related_p = (regnum == SP_REGNUM);
> +  rtx_insn *insn;
>  
>/* Do nothing if mdelta is zero.  */
>if (!mdelta)
> @@ -1882,7 +1884,8 @@ aarch64_add_constant (machine_mode mode, int regnum, 
> int scratchreg,
>/* We only need single instruction if the offset fit into add/sub.  */
>if (aarch64_uimm12_shift (mdelta))
>  {
> -  emit_insn (gen_add2_insn (this_rtx, GEN_INT (delta)));
> +  insn = emit_insn (gen_add2_insn (this_rtx, GEN_INT (delta)));
> +  RTX_FRAME_RELATED_P (insn) = frame_related_p;
>return;
>  }
>  
> @@ -1895,15 +1898,23 @@ aarch64_add_constant (machine_mode mode, int regnum, 
> int scratchreg,
>HOST_WIDE_INT low_off = mdelta & 0xfff;
>  
>low_off = delta < 0 ? -low_off : low_off;
> -  emit_insn (gen_add2_insn (this_rtx, GEN_INT (low_off)));
> -  emit_insn (gen_add2_insn (this_rtx, GEN_INT (delta - low_off)));
> +  insn = emit_insn (gen_add2_insn (this_rtx, GEN_INT (low_off)));
> +  RTX_FRAME_RELATED_P (insn) = frame_related_p;
> +  insn = emit_insn (gen_add2_insn (this_rtx, GEN_INT (delta - low_off)));
> +  RTX_FRAME_RELATED_P (insn) = frame_related_p;
>return;
>  }
>  
>/* Otherwise use generic function to handle all other situations.  */
>rtx scratch_rtx = gen_rtx_REG (mode, scratchreg);
>aarch64_internal_mov_immediate (scratch_rtx, GEN_INT (delta), true, mode);
> -  emit_insn (gen_add2_insn (this_rtx, scratch_rtx));
> +  insn = emit_insn (gen_add2_insn (this_rtx, scratch_rtx));
> +  if (frame_related_p)
> +{
> +  RTX_FRAME_RELATED_P (insn) = frame_related_p;
> +  rtx adj = plus_constant (mode, this_rtx, delta);
> +  add_reg_note (insn , REG_CFA_ADJUST_CFA, gen_rtx_SET (this_rtx, adj));
> +}
>  }
>  
>  static bool
> @@ -3038,36 +3049,7 @@ aarch64_expand_prologue (void)
>frame_size -= (offset + crtl->outgoing_args_size);
>fp_offset = 0;
>  
> -  if (frame_size >= 0x1000000)
> - {
> -   rtx op0 = gen_rtx_REG (Pmode, IP0_REGNUM);
> -   emit_move_insn (op0, GEN_INT (-frame_size));
> -   insn = emit_insn (gen_add2_insn (stack_pointer_rtx, op0));
> -
> -   add_reg_note (insn, REG_CFA_ADJUST_CFA,
> - gen_rtx_SET (stack_pointer_rtx,
> -  plus_constant (Pmode, stack_pointer_rtx,
> - -frame_size)));
> -   RTX_FRAME_RELATED_P (insn) = 1;
> - }
> -  else if (frame_size > 0)
> - {
> -   int hi_ofs = frame_size & 0xfff000;
> -   int lo_ofs = frame_size & 0x000fff;
> -
> -   if (hi_ofs)
> - {
> -   insn = emit_insn (gen_add2_insn
> - (stack_pointer_rtx, GEN_INT (-hi_ofs)));
> -   RTX_FRAME_RELATED_P (insn) = 1;
> - }
> -   if (lo_ofs)
> - {
> -   insn = emit_insn (gen_add2_insn
> - (stack_pointer_rtx, GEN_INT (-lo_ofs)));
> -   RTX_FRAME_RELATED_P (insn) = 1;
> - }
> - }
> +  aarch64_add_constant (Pmode, SP_REGNUM, IP0_REGNUM, -frame_size);
>  }
>else
>  frame_size = -1;
> @@ -3287,31 +3269,7 @@ aarch64_expand_epilogue (bool for_sibcall)
>if 

Re: [PATCH] disable ifunc on *-musl by default

2016-07-20 Thread David Edelsohn
On Wed, Jul 20, 2016 at 7:09 AM, Szabolcs Nagy  wrote:
> On 20/07/16 14:45, David Edelsohn wrote:
>>> Musl libc does not support gnu ifunc, so disable it by default.
>>> (not disabled on s390-* since that has no musl support yet.)
>>
>> Musl libc now supports PPC64. Support for s390 is in progress.
>>
>
> it seemed to me that on ppc64 ifunc is disabled by default.
> (at least it is not enabled in config.gcc)

Ifunc is used on PPC64.

- David


[PATCH v2] C++ FE: handle misspelled identifiers and typenames

2016-07-20 Thread David Malcolm
Changes in v2:
 - split out the non-C++ parts already approved by Jeff (I've committed
   these as r238522).
 - updated to mirror the fixes for PR c/71858 Jakub made to the
   corresponding C implementation in r238352, skipping anticipated decls
   of builtin functions
 - rewritten to more closely resemble the C FE's implementation

This is a port of the C frontend's r237714 [1] to the C++ frontend:
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01052.html
offering spelling suggestions for misspelled identifiers, macro names,
and some keywords (e.g. "singed" vs "signed" aka PR c/70339).

Examples of suggestions can be seen in the test case.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu; adds
267 PASS results to g++.sum.

OK for trunk?

[1] aka 8469aece13814deddf2cd80538d33c2d0a8d60d9 in the git mirror

gcc/cp/ChangeLog:
PR c/70339
PR c/71858
* name-lookup.c: Include gcc-rich-location.h, spellcheck-tree.h,
and parser.h.
(suggest_alternatives_for): If no candidates are found, try
lookup_name_fuzzy and report if it finds a suggestion.
(consider_binding_level): New function.
(lookup_name_fuzzy): New function.
* parser.c: Include gcc-rich-location.h.
(cp_lexer_next_token_is_decl_specifier_keyword): Move most of
logic into...
(cp_keyword_starts_decl_specifier_p): ...this new function.
(cp_parser_diagnose_invalid_type_name): When issuing
"does not name a type" errors, attempt to make a suggestion using
lookup_name_fuzzy.
* parser.h (cp_keyword_starts_decl_specifier_p): New prototype.
* search.c (lookup_field_fuzzy_info::fuzzy_lookup_field): Don't
attempt to access TYPE_FIELDS within a TYPE_PACK_EXPANSION.

gcc/testsuite/ChangeLog:
PR c/70339
PR c/71858
* g++.dg/spellcheck-identifiers.C: New test case, based on
gcc.dg/spellcheck-identifiers.c.
* g++.dg/spellcheck-identifiers-2.C: New test case, based on
gcc.dg/spellcheck-identifiers-2.c.
* g++.dg/spellcheck-typenames.C: New test case, based on
gcc.dg/spellcheck-typenames.c
---
 gcc/cp/name-lookup.c| 116 ++-
 gcc/cp/parser.c |  43 +++-
 gcc/cp/parser.h |   1 +
 gcc/cp/search.c |   4 +
 gcc/testsuite/g++.dg/spellcheck-identifiers-2.C |  43 
 gcc/testsuite/g++.dg/spellcheck-identifiers.C   | 255 
 gcc/testsuite/g++.dg/spellcheck-typenames.C |  84 
 7 files changed, 533 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/spellcheck-identifiers-2.C
 create mode 100644 gcc/testsuite/g++.dg/spellcheck-identifiers.C
 create mode 100644 gcc/testsuite/g++.dg/spellcheck-typenames.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index cbd5209..561bf71 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -29,6 +29,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "debug.h"
 #include "c-family/c-pragma.h"
 #include "params.h"
+#include "gcc-rich-location.h"
+#include "spellcheck-tree.h"
+#include "parser.h"
 
 /* The bindings for a particular name in a particular scope.  */
 
@@ -4435,9 +4438,20 @@ suggest_alternatives_for (location_t location, tree name)
 
   namespaces_to_search.release ();
 
-  /* Nothing useful to report.  */
+  /* Nothing useful to report for NAME.  Report on likely misspellings,
+ or do nothing.  */
   if (candidates.is_empty ())
-return;
+{
+  const char *fuzzy_name = lookup_name_fuzzy (name, FUZZY_LOOKUP_NAME);
+  if (fuzzy_name)
+   {
+ gcc_rich_location richloc (location);
+ richloc.add_fixit_misspelled_id (location, fuzzy_name);
+ inform_at_rich_loc (&richloc, "suggested alternative: %qs",
+ fuzzy_name);
+   }
+  return;
+}
 
   inform_n (location, candidates.length (),
"suggested alternative:",
@@ -4672,6 +4686,104 @@ qualified_lookup_using_namespace (tree name, tree scope,
   return result->value != error_mark_node;
 }
 
+/* Helper function for lookup_name_fuzzy.
+   Traverse binding level LVL, looking for good name matches for NAME
+   (and BM).  */
+static void
+consider_binding_level (tree name, best_match  &bm,
+   cp_binding_level *lvl, bool look_within_fields,
+   enum lookup_name_fuzzy_kind kind)
+{
+  if (look_within_fields)
+if (lvl->this_entity && TREE_CODE (lvl->this_entity) == RECORD_TYPE)
+  {
+   tree type = lvl->this_entity;
+   bool want_type_p = (kind == FUZZY_LOOKUP_TYPENAME);
+   tree best_matching_field
+ = lookup_member_fuzzy (type, name, want_type_p);
+   if (best_matching_field)
+ bm.consider (best_matching_field);
+  }
+
+  for (tree t = lvl->names; t; t = TREE_CHAIN (t))
+{
+  /* Don't use bindin

Re: [AArch64][2/3] Optimize aarch64_add_constant to generate better addition sequences

2016-07-20 Thread Richard Earnshaw (lists)
On 20/07/16 14:02, Jiong Wang wrote:
> This patch optimizes the immediate addition sequences generated by
> aarch64_add_constant.
> 
> The addition sequences currently generated are:
> 
>   * If the immediate fits into the unsigned 12-bit range, generate a
> single add/sub.
>   * Otherwise, if it fits into the unsigned 24-bit range, generate two
> add/sub.
>   * Otherwise invoke the general constant build function.
> 
> 
> This does not consider the situation where the immediate cannot fit
> into the unsigned 12-bit range but can be loaded by a single mov
> instruction, in which case we can generate one move and one addition.
> The move does not touch the destination register, so the sequence is
> better than two additions which both touch the destination register.
> 
> 
> This patch thus optimizes the addition sequences into:
> 
>   * If the immediate fits into the unsigned 12-bit range, generate a
> single add/sub.
>   * Otherwise, if it fits into the unsigned 24-bit range, generate two
> add/sub.  Don't do this if it fits into a single move instruction; in
> that case move the immediate to the scratch register first, then
> generate one addition to add the scratch register to the destination
> register.
>   * Otherwise invoke the general constant build function.
> 
> 
> OK for trunk?
> 
> gcc/
> 2016-07-20  Jiong Wang  
> 
> * config/aarch64/aarch64.c (aarch64_add_constant): Optimize
> instruction sequences.
> 
> 

OK with the updates to the comments as mentioned below.

> build-const-2.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> aeea3b3ebc514663043ac8d7cd13361f06f78502..41844a101247c939ecb31f8a8c17cf79759255aa
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1865,6 +1865,47 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm)
>aarch64_internal_mov_immediate (dest, imm, true, GET_MODE (dest));
>  }
>  
> +/* Add DELTA onto REGNUM in MODE, using SCRATCHREG to held intermediate 
> value if
> +   it is necessary.  */

Add DELTA to REGNUM in mode MODE.  SCRATCHREG can be used to hold an
intermediate value if necessary.


> +
> +static void
> +aarch64_add_constant (machine_mode mode, int regnum, int scratchreg,
> +   HOST_WIDE_INT delta)
> +{
> +  HOST_WIDE_INT mdelta = abs_hwi (delta);
> +  rtx this_rtx = gen_rtx_REG (mode, regnum);
> +
> +  /* Do nothing if mdelta is zero.  */
> +  if (!mdelta)
> +return;
> +
> +  /* We only need single instruction if the offset fit into add/sub.  */
> +  if (aarch64_uimm12_shift (mdelta))
> +{
> +  emit_insn (gen_add2_insn (this_rtx, GEN_INT (delta)));
> +  return;
> +}
> +
> +  /* We need two add/sub instructions, each one perform part of the
> + addition/subtraction, but don't this if the addend can be loaded into
> + register by single instruction, in that case we prefer a move to scratch
> + register following by addition.  */

We need two add/sub instructions, each one performing part of the
calculation.  Don't do this if the addend can be loaded into a
register with a single instruction; in that case we prefer a move to a
scratch register followed by an addition.



> +  if (mdelta < 0x1000000 && !aarch64_move_imm (delta, mode))
> +{
> +  HOST_WIDE_INT low_off = mdelta & 0xfff;
> +
> +  low_off = delta < 0 ? -low_off : low_off;
> +  emit_insn (gen_add2_insn (this_rtx, GEN_INT (low_off)));
> +  emit_insn (gen_add2_insn (this_rtx, GEN_INT (delta - low_off)));
> +  return;
> +}
> +
> +  /* Otherwise use generic function to handle all other situations.  */
> +  rtx scratch_rtx = gen_rtx_REG (mode, scratchreg);
> +  aarch64_internal_mov_immediate (scratch_rtx, GEN_INT (delta), true, mode);
> +  emit_insn (gen_add2_insn (this_rtx, scratch_rtx));
> +}
> +
>  static bool
>  aarch64_function_ok_for_sibcall (tree decl ATTRIBUTE_UNUSED,
>tree exp ATTRIBUTE_UNUSED)
> @@ -3337,44 +3378,6 @@ aarch64_final_eh_return_addr (void)
>  - 2 * UNITS_PER_WORD));
>  }
>  
> -static void
> -aarch64_add_constant (machine_mode mode, int regnum, int scratchreg,
> -   HOST_WIDE_INT delta)
> -{
> -  HOST_WIDE_INT mdelta = delta;
> -  rtx this_rtx = gen_rtx_REG (mode, regnum);
> -  rtx scratch_rtx = gen_rtx_REG (mode, scratchreg);
> -
> -  if (mdelta < 0)
> -mdelta = -mdelta;
> -
> -  if (mdelta >= 4096 * 4096)
> -{
> -  aarch64_internal_mov_immediate (scratch_rtx, GEN_INT (delta), true, 
> mode);
> -  emit_insn (gen_add3_insn (this_rtx, this_rtx, scratch_rtx));
> -}
> -  else if (mdelta > 0)
> -{
> -  if (mdelta >= 4096)
> - {
> -   emit_insn (gen_rtx_SET (scratch_rtx, GEN_INT (mdelta / 4096)));
> -   rtx shift = gen_rtx_ASHIFT (mode, scratch_rtx, GEN_INT (12));
> -   if (delta < 0)
> - emit_insn (gen_rtx_SET (this_rtx,
> - gen_rtx_MINUS (mode, this_rtx, shift)));
> -   else
> -   

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-20 Thread Patrick Palka
On Wed, 20 Jul 2016, Bernd Schmidt wrote:

> On 07/19/2016 10:20 AM, Richard Biener wrote:
> > I like it.  Improving re-build time in my dev tree is very much
> > welcome, and yes,
> > libbackend build time is a big part of it usually (plus of course cc1
> > link time).
> 
> Since that wasn't an entirely explicit ack, I'll add mine. Thank you for doing
> this.
> 
> 
> Bernd
> 
> 

Committed as r238524 with the following minor change to the configure
test to use $CFLAGS and $LDFLAGS consistently:

diff --git a/gcc/configure.ac b/gcc/configure.ac
index 63052ba..241e82d 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4905,7 +4905,7 @@ echo 'int main (void) { return 0; }' > conftest.c
 if ($AR --version | sed 1q | grep "GNU ar" \
 && $CC $CFLAGS -c conftest.c \
 && $AR rcT conftest.a conftest.o \
-&& $CC -o conftest conftest.a) >/dev/null 2>&1; then
+&& $CC $CFLAGS $LDFLAGS -o conftest conftest.a) >/dev/null 2>&1; then
   thin_archive_support=yes
 fi
 rm -f conftest.c conftest.o conftest.a conftest


Re: [PATCH] disable ifunc on *-musl by default

2016-07-20 Thread Szabolcs Nagy
On 20/07/16 14:45, David Edelsohn wrote:
>> Musl libc does not support gnu ifunc, so disable it by default.
>> (not disabled on s390-* since that has no musl support yet.)
> 
> Musl libc now supports PPC64. Support for s390 is in progress.
> 

it seemed to me that on ppc64 ifunc is disabled by default.
(at least it is not enabled in config.gcc)



Re: [AArch64][1/3] Migrate aarch64_add_constant to new interface & kill aarch64_build_constant

2016-07-20 Thread Richard Earnshaw (lists)
On 20/07/16 14:02, Jiong Wang wrote:
> Currently aarch64_add_constant uses aarch64_build_constant to move
> an immediate into the destination register.
> 
> It handles the following situations:
> 
>   * the immediate fits a bitmask pattern that needs only a single
> instruction.
>   * the immediate fits a single movz/movn.
>   * the immediate needs a single movz/movn plus multiple movk.
> 
> 
> Actually we have another constant building helper function,
> "aarch64_internal_mov_immediate", which covers all these situations and
> more.
> 
> This patch thus migrates aarch64_add_constant to
> aarch64_internal_mov_immediate so that we can kill the old
> aarch64_build_constant.
> 
> OK for trunk?
> 
> gcc/
> 2016-07-20  Jiong Wang  
> 
> * config/aarch64/aarch64.c (aarch64_add_constant): New
> parameter "mode".  Use aarch64_internal_mov_immediate
> instead of aarch64_build_constant.
> (aarch64_build_constant): Delete.
> 

Really you should also list the callers of aarch64_add_constant that
have been updated as well (there aren't that many).

OK with that change.

R.

> 
> build-const-1.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 512ef10d158d2eaa1384d28c43b9a8f90387099d..aeea3b3ebc514663043ac8d7cd13361f06f78502
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3337,98 +3337,20 @@ aarch64_final_eh_return_addr (void)
>  - 2 * UNITS_PER_WORD));
>  }
>  
> -/* Possibly output code to build up a constant in a register.  For
> -   the benefit of the costs infrastructure, returns the number of
> -   instructions which would be emitted.  GENERATE inhibits or
> -   enables code generation.  */
> -
> -static int
> -aarch64_build_constant (int regnum, HOST_WIDE_INT val, bool generate)
> -{
> -  int insns = 0;
> -
> -  if (aarch64_bitmask_imm (val, DImode))
> -{
> -  if (generate)
> - emit_move_insn (gen_rtx_REG (Pmode, regnum), GEN_INT (val));
> -  insns = 1;
> -}
> -  else
> -{
> -  int i;
> -  int ncount = 0;
> -  int zcount = 0;
> -  HOST_WIDE_INT valp = val >> 16;
> -  HOST_WIDE_INT valm;
> -  HOST_WIDE_INT tval;
> -
> -  for (i = 16; i < 64; i += 16)
> - {
> -   valm = (valp & 0xffff);
> -
> -   if (valm != 0)
> - ++ zcount;
> -
> -   if (valm != 0xffff)
> - ++ ncount;
> -
> -   valp >>= 16;
> - }
> -
> -  /* zcount contains the number of additional MOVK instructions
> -  required if the constant is built up with an initial MOVZ instruction,
> -  while ncount is the number of MOVK instructions required if starting
> -  with a MOVN instruction.  Choose the sequence that yields the fewest
> -  number of instructions, preferring MOVZ instructions when they are both
> -  the same.  */
> -  if (ncount < zcount)
> - {
> -   if (generate)
> - emit_move_insn (gen_rtx_REG (Pmode, regnum),
> - GEN_INT (val | ~(HOST_WIDE_INT) 0xffff));
> -   tval = 0xffff;
> -   insns++;
> - }
> -  else
> - {
> -   if (generate)
> - emit_move_insn (gen_rtx_REG (Pmode, regnum),
> - GEN_INT (val & 0xffff));
> -   tval = 0;
> -   insns++;
> - }
> -
> -  val >>= 16;
> -
> -  for (i = 16; i < 64; i += 16)
> - {
> -   if ((val & 0xffff) != tval)
> - {
> -   if (generate)
> - emit_insn (gen_insv_immdi (gen_rtx_REG (Pmode, regnum),
> -GEN_INT (i),
> -GEN_INT (val & 0xffff)));
> -   insns++;
> - }
> -   val >>= 16;
> - }
> -}
> -  return insns;
> -}
> -
>  static void
> -aarch64_add_constant (int regnum, int scratchreg, HOST_WIDE_INT delta)
> +aarch64_add_constant (machine_mode mode, int regnum, int scratchreg,
> +   HOST_WIDE_INT delta)
>  {
>HOST_WIDE_INT mdelta = delta;
> -  rtx this_rtx = gen_rtx_REG (Pmode, regnum);
> -  rtx scratch_rtx = gen_rtx_REG (Pmode, scratchreg);
> +  rtx this_rtx = gen_rtx_REG (mode, regnum);
> +  rtx scratch_rtx = gen_rtx_REG (mode, scratchreg);
>  
>if (mdelta < 0)
>  mdelta = -mdelta;
>  
>if (mdelta >= 4096 * 4096)
>  {
> -  (void) aarch64_build_constant (scratchreg, delta, true);
> +  aarch64_internal_mov_immediate (scratch_rtx, GEN_INT (delta), true, 
> mode);
>emit_insn (gen_add3_insn (this_rtx, this_rtx, scratch_rtx));
>  }
>else if (mdelta > 0)
> @@ -3436,19 +3358,19 @@ aarch64_add_constant (int regnum, int scratchreg, 
> HOST_WIDE_INT delta)
>if (mdelta >= 4096)
>   {
> emit_insn (gen_rtx_SET (scratch_rtx, GEN_INT (mdelta / 4096)));
> -   rtx shift = gen_rtx_ASHIFT (Pmode, scratch_rtx, GEN_INT (12));
> +   rtx shift = gen_rtx_ASHIFT (mode, scratch_rtx, GEN_INT (12));
>  

[PATCH] target lib tests with build sysroot PR testsuite/71931

2016-07-20 Thread Szabolcs Nagy
Fix target library tests when gcc is built using --with-build-sysroot.

The dejagnu find_gcc function cannot handle a CC that needs extra flags
such as --sysroot.  So for testing target libraries use the same CC that
was used for building the target libs.  This change assumes the test
is run from make.

Another approach would be to pass down the sysroot flags
separately and add
set TEST_ALWAYS_FLAGS "$(SYSROOT_CFLAGS_FOR_TARGET)"
to site.exp like the gcc site.exp does, but that's more
changes.

libatomic/
2016-07-20  Szabolcs Nagy  

PR testsuite/71931
* testsuite/lib/libatomic.exp (libatomic_init): Use CC.
* testsuite/Makefile.am: Export CC.
* testsuite/Makefile.in: Regenerated.

libgomp/
2016-07-20  Szabolcs Nagy  

PR testsuite/71931
* testsuite/lib/libgomp.exp (libgomp_init): Use CC.
* testsuite/Makefile.am: Export CC.
* testsuite/Makefile.in: Regenerated.

libitm/
2016-07-20  Szabolcs Nagy  

PR testsuite/71931
* testsuite/lib/libitm.exp (libitm_init): Use CC.
* testsuite/Makefile.am: Export CC.
* testsuite/Makefile.in: Regenerated.

libvtv/
2016-07-20  Szabolcs Nagy  

PR testsuite/71931
* testsuite/lib/libvtv.exp (libvtv_init): Use CC.
* testsuite/Makefile.am: Export CC.
* testsuite/Makefile.in: Regenerated.
diff --git a/libatomic/testsuite/Makefile.am b/libatomic/testsuite/Makefile.am
index 561b7e2..d9af02a 100644
--- a/libatomic/testsuite/Makefile.am
+++ b/libatomic/testsuite/Makefile.am
@@ -11,3 +11,5 @@ EXPECT = $(shell if test -f $(top_builddir)/../expect/expect; then \
 _RUNTEST = $(shell if test -f $(top_srcdir)/../dejagnu/runtest; then \
 	 echo $(top_srcdir)/../dejagnu/runtest; else echo runtest; fi)
 RUNTEST = "$(_RUNTEST) $(AM_RUNTESTFLAGS)"
+
+export CC
diff --git a/libatomic/testsuite/Makefile.in b/libatomic/testsuite/Makefile.in
index 34f83e0..8392e01 100644
--- a/libatomic/testsuite/Makefile.in
+++ b/libatomic/testsuite/Makefile.in
@@ -428,6 +428,8 @@ uninstall-am:
 	uninstall uninstall-am
 
 
+export CC
+
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
 .NOEXPORT:
diff --git a/libatomic/testsuite/lib/libatomic.exp b/libatomic/testsuite/lib/libatomic.exp
index cafab54..6ba67e8 100644
--- a/libatomic/testsuite/lib/libatomic.exp
+++ b/libatomic/testsuite/lib/libatomic.exp
@@ -90,7 +90,7 @@ proc libatomic_init { args } {
 	if [info exists TOOL_EXECUTABLE] {
 	set GCC_UNDER_TEST $TOOL_EXECUTABLE
 	} else {
-	set GCC_UNDER_TEST "[find_gcc]"
+	set GCC_UNDER_TEST "[getenv CC]"
 	}
 }
 
diff --git a/libgomp/testsuite/Makefile.am b/libgomp/testsuite/Makefile.am
index 66a9d94..821ab31 100644
--- a/libgomp/testsuite/Makefile.am
+++ b/libgomp/testsuite/Makefile.am
@@ -25,3 +25,5 @@ libgomp-test-support.exp: libgomp-test-support.pt.exp Makefile
 	mv $@.tmp $@
 
 all-local: libgomp-test-support.exp
+
+export CC
diff --git a/libgomp/testsuite/Makefile.in b/libgomp/testsuite/Makefile.in
index 4dbb406..7bd8b86 100644
--- a/libgomp/testsuite/Makefile.in
+++ b/libgomp/testsuite/Makefile.in
@@ -475,6 +475,8 @@ libgomp-test-support.exp: libgomp-test-support.pt.exp Makefile
 
 all-local: libgomp-test-support.exp
 
+export CC
+
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
 .NOEXPORT:
diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index 1cb4991..063 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -108,7 +108,7 @@ proc libgomp_init { args } {
 	if [info exists TOOL_EXECUTABLE] {
 	set GCC_UNDER_TEST $TOOL_EXECUTABLE
 	} else {
-	set GCC_UNDER_TEST "[find_gcc]"
+	set GCC_UNDER_TEST "[getenv CC]"
 	}
 }
 
diff --git a/libitm/testsuite/Makefile.am b/libitm/testsuite/Makefile.am
index 561b7e2..d9af02a 100644
--- a/libitm/testsuite/Makefile.am
+++ b/libitm/testsuite/Makefile.am
@@ -11,3 +11,5 @@ EXPECT = $(shell if test -f $(top_builddir)/../expect/expect; then \
 _RUNTEST = $(shell if test -f $(top_srcdir)/../dejagnu/runtest; then \
 	 echo $(top_srcdir)/../dejagnu/runtest; else echo runtest; fi)
 RUNTEST = "$(_RUNTEST) $(AM_RUNTESTFLAGS)"
+
+export CC
diff --git a/libitm/testsuite/Makefile.in b/libitm/testsuite/Makefile.in
index 4d79781..49a333a 100644
--- a/libitm/testsuite/Makefile.in
+++ b/libitm/testsuite/Makefile.in
@@ -438,6 +438,8 @@ uninstall-am:
 	uninstall uninstall-am
 
 
+export CC
+
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
 .NOEXPORT:
diff --git a/libitm/testsuite/lib/libitm.exp b/libitm/testsuite/lib/libitm.exp
index 0416296..8679e92 100644
--- a/libitm/testsuite/lib/libitm.exp
+++ b/libitm/testsuite/lib/libitm.exp
@@ -90,7 +90,7 @@ proc libitm_init { args } {
 	if [info exists TOOL_EXECUTABLE] {
 	set G

[PATCH] check -nopie in configure

2016-07-20 Thread Szabolcs Nagy
Since gcc can be built with --enable-default-pie, there
is a -no-pie flag to turn off PIE.

gcc cannot be built as PIE (PR 71934), so the gcc build
system has to detect the -no-pie flag to disable PIE.

Historically, default-PIE toolchains (e.g. Gentoo Hardened)
used the -nopie flag instead. Those toolchains can no longer
build gcc, so detect -nopie too.

gcc/
2016-07-20  Szabolcs Nagy  

* configure.ac: Detect -nopie flag just like -no-pie.
* configure: Regenerate.
diff --git a/gcc/configure b/gcc/configure
index ed44472..ca16e66 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -29566,6 +29566,33 @@ fi
 $as_echo "$gcc_cv_no_pie" >&6; }
 if test "$gcc_cv_no_pie" = "yes"; then
   NO_PIE_FLAG="-no-pie"
+else
+  # Check if -nopie works.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for -nopie option" >&5
+$as_echo_n "checking for -nopie option... " >&6; }
+if test "${gcc_cv_nopie+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  saved_LDFLAGS="$LDFLAGS"
+ LDFLAGS="$LDFLAGS -nopie"
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+int main(void) {return 0;}
+_ACEOF
+if ac_fn_cxx_try_link "$LINENO"; then :
+  gcc_cv_nopie=yes
+else
+  gcc_cv_nopie=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+ LDFLAGS="$saved_LDFLAGS"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_nopie" >&5
+$as_echo "$gcc_cv_nopie" >&6; }
+  if test "$gcc_cv_nopie" = "yes"; then
+NO_PIE_FLAG="-nopie"
+  fi
 fi
 
 
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 086d0fc..98ab5cb 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -6200,6 +6200,19 @@ AC_CACHE_CHECK([for -no-pie option],
LDFLAGS="$saved_LDFLAGS"])
 if test "$gcc_cv_no_pie" = "yes"; then
   NO_PIE_FLAG="-no-pie"
+else
+  # Check if -nopie works.
+  AC_CACHE_CHECK([for -nopie option],
+[gcc_cv_nopie],
+[saved_LDFLAGS="$LDFLAGS"
+ LDFLAGS="$LDFLAGS -nopie"
+ AC_LINK_IFELSE([int main(void) {return 0;}],
+   [gcc_cv_nopie=yes],
+   [gcc_cv_nopie=no])
+ LDFLAGS="$saved_LDFLAGS"])
+  if test "$gcc_cv_nopie" = "yes"; then
+NO_PIE_FLAG="-nopie"
+  fi
 fi
 AC_SUBST([NO_PIE_FLAG])
 


Re: [PATCH 2/3][AArch64] Improve zero extend

2016-07-20 Thread Richard Earnshaw (lists)
On 20/07/16 14:40, Wilco Dijkstra wrote:
> Richard Earnshaw wrote:
>> Why does combine care what the cost is if the instruction isn't valid?
> 
> No idea. Combine does lots of odd things that don't make sense to me. 
> Unfortunately the costs we give for cases like this need to be accurate or
> they negatively affect code quality. The reason for this patch was to fix
> some unexpected slowdowns caused by the cost for zero_extend being
> too high.
> 
> Wilco
> 

Well if I take your testcase and plug it into a fairly recent gcc I get:

x:
mov w1, 20
umull   x0, w0, w1
ret

If I change the constant to 33, I then get:

x:
uxtw    x0, w0
add x0, x0, x0, lsl 5
ret

Both of which look reasonable to me.
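
As an aside for readers, the two listings above can be modelled in C++. This is
only a sketch: the function names are made up for illustration, and fixed-width
types stand in for the unsigned/long of the original testcase (assumed to be 32
and 64 bits, as on AArch64 Linux). It shows that the uxtw/add sequence computes
the same value as a widening multiply by 33.

```cpp
#include <cassert>
#include <cstdint>

// The testcase from the thread, written with fixed-width types: the 32-bit
// argument is zero-extended to 64 bits and then multiplied by 20.
std::int64_t mul20(std::uint32_t x) {
    return static_cast<std::int64_t>(x) * 20;
}

// Multiplication by 33 spelled the way the second listing computes it.
std::int64_t mul33_shift_add(std::uint32_t x) {
    std::uint64_t wide = x;                                // uxtw x0, w0
    return static_cast<std::int64_t>(wide + (wide << 5));  // add x0, x0, x0, lsl 5
}

// Checks that the shift-and-add form agrees with a plain widening multiply.
void check_zero_extend_multiply() {
    const std::uint32_t samples[] = {0u, 1u, 7u, 0x80000000u, 0xFFFFFFFFu};
    for (std::uint32_t x : samples) {
        assert(mul33_shift_add(x) == static_cast<std::int64_t>(x) * 33);
        assert(mul20(x) >= 0);  // zero extension never produces a negative value
    }
}
```

Calling check_zero_extend_multiply() exercises both forms, including inputs
with the top bit set, where a sign extension would give a different answer.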


Re: [PATCH] c++/60760 - arithmetic on null pointers should not be allowed in constant expressions

2016-07-20 Thread Jason Merrill
On Mon, Jul 18, 2016 at 6:15 PM, Martin Sebor  wrote:
> On 07/18/2016 11:51 AM, Jason Merrill wrote:
>>
>> On 07/06/2016 06:20 PM, Martin Sebor wrote:
>>>
>>> @@ -2911,6 +2923,14 @@ cxx_eval_indirect_ref (const constexpr_ctx
>>> *ctx, tree t,
>>>if (*non_constant_p)
>>>  return t;
>>>
>>> +  if (integer_zerop (op0))
>>> +{
>>> +  if (!ctx->quiet)
>>> +error ("dereferencing a null pointer");
>>> +  *non_constant_p = true;
>>> +  return t;
>>> +}
>>
>> I'm skeptical of checking this here, since *p is valid for null p; &*p
>> is even a constant expression.  And removing this hunk doesn't seem to
>> break any of your tests.
>>
>> OK with that hunk removed.
>
> With it removed the constexpr-nullptr-2.C test fails on line 64:
>
>   constexpr const int *pi0 = &pa2->pa1->pa0->i;   // { dg-error "null
> pointer|not a constant" }
>
> Here, pa2 and pa1 are non-null but pa0 is null.

It doesn't fail for me; that line hits the error in
cxx_eval_component_reference.  I'm only talking about removing the
cxx_eval_indirect_ref hunk.

Jason


[PATCH] disable ifunc on *-musl by default

2016-07-20 Thread Szabolcs Nagy
Musl libc does not support gnu ifunc, so disable it by default.
(not disabled on s390-* since that has no musl support yet.)

gcc/
2016-07-20  Szabolcs Nagy  

* config.gcc (*-*-*musl*): Disable gnu-indirect-function.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 1f75f17..f3f6e14 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1465,7 +1465,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-gnu* | i[34567]8
 		extra_options="${extra_options} linux-android.opt"
 		# Assume modern glibc if not targeting Android nor uclibc.
 		case ${target} in
-		*-*-*android*|*-*-*uclibc*)
+		*-*-*android*|*-*-*uclibc*|*-*-*musl*)
 		  ;;
 		*)
 		  default_gnu_indirect_function=yes
@@ -1531,7 +1531,7 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu)
 		extra_options="${extra_options} linux-android.opt"
 		# Assume modern glibc if not targeting Android nor uclibc.
 		case ${target} in
-		*-*-*android*|*-*-*uclibc*)
+		*-*-*android*|*-*-*uclibc*|*-*-*musl*)
 		  ;;
 		*)
 		  default_gnu_indirect_function=yes


Re: [PATCH] disable ifunc on *-musl by default

2016-07-20 Thread David Edelsohn
> Musl libc does not support gnu ifunc, so disable it by default.
> (not disabled on s390-* since that has no musl support yet.)

Musl libc now supports PPC64. Support for s390 is in progress.

- David


Re: [C++ PATCH] cp_parser_save_member_function_body fix (PR c++/71909)

2016-07-20 Thread Jason Merrill
OK.

On Mon, Jul 18, 2016 at 5:14 PM, Jakub Jelinek  wrote:
> Hi!
>
> This patch fixes two issues:
> 1) as shown in the first testcase, cp_parser_save_member_function_body
>adds the catch () { ... } tokens into the saved token range
>even when there is no function try block (missing try keyword)
> 2) if the method starts with __transaction_{atomic,relaxed}, and
>e.g. contains {}s somewhere in the mem-initializers, then
>cp_parser_save_member_function_body stops saving the tokens early
>instead of late
>
> The following patch attempts to handle the same cases
> cp_parser_function_definition_after_declarator handles (ok, ignores
> the already unsupported return extension) - note that
> cp_parser_txn_attribute_opt handles only a small subset of C++11 attributes
> (and only once, not multiple times).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-07-18  Jakub Jelinek  
>
> PR c++/71909
> * parser.c (cp_parser_save_member_function_body): Consume
> __transaction_relaxed or __transaction_atomic with optional
> attribute.  Only skip catch with block if try keyword is seen.
>
> * g++.dg/parse/pr71909.C: New test.
> * g++.dg/tm/pr71909.C: New test.
>
> --- gcc/cp/parser.c.jj  2016-07-16 10:41:04.0 +0200
> +++ gcc/cp/parser.c 2016-07-18 11:47:49.487748010 +0200
> @@ -26044,6 +26044,7 @@ cp_parser_save_member_function_body (cp_
>cp_token *first;
>cp_token *last;
>tree fn;
> +  bool function_try_block = false;
>
>/* Create the FUNCTION_DECL.  */
>fn = grokmethod (decl_specifiers, declarator, attributes);
> @@ -26065,9 +26066,43 @@ cp_parser_save_member_function_body (cp_
>/* Save away the tokens that make up the body of the
>   function.  */
>first = parser->lexer->next_token;
> +
> +  if (cp_lexer_next_token_is_keyword (parser->lexer, 
> RID_TRANSACTION_RELAXED))
> +cp_lexer_consume_token (parser->lexer);
> +  else if (cp_lexer_next_token_is_keyword (parser->lexer,
> +  RID_TRANSACTION_ATOMIC))
> +{
> +  cp_lexer_consume_token (parser->lexer);
> +  /* Match cp_parser_txn_attribute_opt [[ identifier ]].  */
> +  if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_SQUARE)
> + && cp_lexer_nth_token_is (parser->lexer, 2, CPP_OPEN_SQUARE)
> + && (cp_lexer_nth_token_is (parser->lexer, 3, CPP_NAME)
> + || cp_lexer_nth_token_is (parser->lexer, 3, CPP_KEYWORD))
> + && cp_lexer_nth_token_is (parser->lexer, 4, CPP_CLOSE_SQUARE)
> + && cp_lexer_nth_token_is (parser->lexer, 5, CPP_CLOSE_SQUARE))
> +   {
> + cp_lexer_consume_token (parser->lexer);
> + cp_lexer_consume_token (parser->lexer);
> + cp_lexer_consume_token (parser->lexer);
> + cp_lexer_consume_token (parser->lexer);
> + cp_lexer_consume_token (parser->lexer);
> +   }
> +  else
> +   while (cp_next_tokens_can_be_gnu_attribute_p (parser)
> +  && cp_lexer_nth_token_is (parser->lexer, 2, CPP_OPEN_PAREN))
> + {
> +   cp_lexer_consume_token (parser->lexer);
> +   if (cp_parser_cache_group (parser, CPP_CLOSE_PAREN, /*depth=*/0))
> + break;
> + }
> +}
> +
>/* Handle function try blocks.  */
>if (cp_lexer_next_token_is_keyword (parser->lexer, RID_TRY))
> -cp_lexer_consume_token (parser->lexer);
> +{
> +  cp_lexer_consume_token (parser->lexer);
> +  function_try_block = true;
> +}
>/* We can have braced-init-list mem-initializers before the fn body.  */
>if (cp_lexer_next_token_is (parser->lexer, CPP_COLON))
>  {
> @@ -26085,8 +26120,9 @@ cp_parser_save_member_function_body (cp_
>  }
>cp_parser_cache_group (parser, CPP_CLOSE_BRACE, /*depth=*/0);
>/* Handle function try blocks.  */
> -  while (cp_lexer_next_token_is_keyword (parser->lexer, RID_CATCH))
> -cp_parser_cache_group (parser, CPP_CLOSE_BRACE, /*depth=*/0);
> +  if (function_try_block)
> +while (cp_lexer_next_token_is_keyword (parser->lexer, RID_CATCH))
> +  cp_parser_cache_group (parser, CPP_CLOSE_BRACE, /*depth=*/0);
>last = parser->lexer->next_token;
>
>/* Save away the inline definition; we will process it when the
> --- gcc/testsuite/g++.dg/parse/pr71909.C.jj 2016-07-18 11:55:51.169600236 
> +0200
> +++ gcc/testsuite/g++.dg/parse/pr71909.C2016-07-18 11:57:09.99364 
> +0200
> @@ -0,0 +1,22 @@
> +// PR c++/71909
> +// { dg-do compile }
> +
> +struct S
> +{
> +  S () try : m (0) {}
> +  catch (...) {}
> +  void foo () try {}
> +  catch (int) {}
> +  catch (...) {}
> +  int m;
> +};
> +
> +struct T
> +{
> +  T () : m (0) {}
> +  catch (...) {}   // { dg-error "expected unqualified-id before" }
> +  void foo () {}
> +  catch (int) {}   // { dg-error "expected unqualified-id before" }
> +  catch (...) {}   // { dg-error "expected unqualified-id before" }
> +  int
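
For readers unfamiliar with the construct the quoted patch deals with, here is a
minimal sketch of function-try-blocks on inline member functions. The types
Member and Holder and the helper functions are hypothetical and are not part of
the patch or its tests.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical member type whose constructor can throw.
struct Member {
    explicit Member(int v) {
        if (v < 0) throw std::invalid_argument("negative");
    }
};

struct Holder {
    // Function-try-block on a constructor: the handler also covers
    // exceptions thrown by the mem-initializers.
    explicit Holder(int v) try : m(v) {
    } catch (const std::invalid_argument&) {
        ++failures;  // the exception is rethrown automatically after this
    }

    // Function-try-block on an ordinary member function.
    int get() const try {
        return 42;
    } catch (...) {
        return -1;
    }

    static int failures;
    Member m;
};
int Holder::failures = 0;

// Returns true if constructing a Holder from v throws.
bool construction_fails(int v) {
    try {
        Holder h(v);
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}

void check_function_try_blocks() {
    assert(construction_fails(-1));
    assert(Holder::failures >= 1);
    assert(!construction_fails(3));
    assert(Holder(1).get() == 42);
}
```

Both bodies are defined inside the class, which is the situation where the
parser has to save the tokens of the body, including any catch handlers, for
later processing.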

Re: [PATCH 2/3][AArch64] Improve zero extend

2016-07-20 Thread Wilco Dijkstra
Richard Earnshaw wrote:
> Why does combine care what the cost is if the instruction isn't valid?

No idea. Combine does lots of odd things that don't make sense to me. 
Unfortunately the costs we give for cases like this need to be accurate or
they negatively affect code quality. The reason for this patch was to fix
some unexpected slowdowns caused by the cost for zero_extend being
too high.

Wilco



Re: [C++ PATCH] Allow frexp etc. builtins in c++14 constexpr (PR c++/50060)

2016-07-20 Thread Jason Merrill
OK.

On Mon, Jul 18, 2016 at 5:07 PM, Jakub Jelinek  wrote:
> On Mon, Jul 18, 2016 at 02:42:43PM -0400, Jason Merrill wrote:
>> Ah, I guess we need to check cxx_dialect in cxx_eval_store_expression,
>> not just in potential_constant_expression.
>
> Here is an updated version, bootstrapped/regtested on x86_64-linux and
> i686-linux, ok for trunk?
>
> 2016-07-18  Jakub Jelinek  
>
> PR c++/50060
> * constexpr.c (cxx_eval_builtin_function_call): Pass false as lval
> when evaluating call arguments.  Use fold_builtin_call_array instead
> of fold_build_call_array_loc, return t if it returns NULL.  Otherwise
> check the result with potential_constant_expression and call
> cxx_eval_constant_expression on it.
>
> * g++.dg/cpp0x/constexpr-50060.C: New test.
> * g++.dg/cpp1y/constexpr-50060.C: New test.
>
> --- gcc/cp/constexpr.c.jj   2016-07-18 20:42:51.163955883 +0200
> +++ gcc/cp/constexpr.c  2016-07-18 20:55:47.246152938 +0200
> @@ -1105,7 +1105,7 @@ cxx_eval_builtin_function_call (const co
>for (i = 0; i < nargs; ++i)
>  {
>args[i] = cxx_eval_constant_expression (&new_ctx, CALL_EXPR_ARG (t, i),
> - lval, &dummy1, &dummy2);
> + false, &dummy1, &dummy2);
>if (bi_const_p)
> /* For __built_in_constant_p, fold all expressions with constant 
> values
>even if they aren't C++ constant-expressions.  */
> @@ -1114,13 +1114,31 @@ cxx_eval_builtin_function_call (const co
>
>bool save_ffbcp = force_folding_builtin_constant_p;
>force_folding_builtin_constant_p = true;
> -  new_call = fold_build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
> -   CALL_EXPR_FN (t), nargs, args);
> -  /* Fold away the NOP_EXPR from fold_builtin_n.  */
> -  new_call = fold (new_call);
> +  new_call = fold_builtin_call_array (EXPR_LOCATION (t), TREE_TYPE (t),
> + CALL_EXPR_FN (t), nargs, args);
>force_folding_builtin_constant_p = save_ffbcp;
> -  VERIFY_CONSTANT (new_call);
> -  return new_call;
> +  if (new_call == NULL)
> +{
> +  if (!*non_constant_p && !ctx->quiet)
> +   {
> + new_call = build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
> +  CALL_EXPR_FN (t), nargs, args);
> + error ("%q+E is not a constant expression", new_call);
> +   }
> +  *non_constant_p = true;
> +  return t;
> +}
> +
> +  if (!potential_constant_expression (new_call))
> +{
> +  if (!*non_constant_p && !ctx->quiet)
> +   error ("%q+E is not a constant expression", new_call);
> +  *non_constant_p = true;
> +  return t;
> +}
> +
> +  return cxx_eval_constant_expression (&new_ctx, new_call, lval,
> +  non_constant_p, overflow_p);
>  }
>
>  /* TEMP is the constant value of a temporary object of type TYPE.  Adjust
> --- gcc/testsuite/g++.dg/cpp0x/constexpr-50060.C.jj 2016-07-18 
> 21:03:12.505532831 +0200
> +++ gcc/testsuite/g++.dg/cpp0x/constexpr-50060.C2016-07-18 
> 21:05:41.306655422 +0200
> @@ -0,0 +1,21 @@
> +// PR c++/50060
> +// { dg-do compile { target c++11 } }
> +
> +extern "C" double frexp (double, int *);
> +
> +struct S
> +{
> +  constexpr S (double a) : y {}, x (frexp (a, &y)) {}  // { dg-error "is not 
> a constant expression" "S" { target { ! c++14 } } }
> +  double x;
> +  int y;
> +};
> +
> +struct T
> +{
> +  constexpr T (double a) : y {}, x ((y = 1, 0.8125)) {}// { dg-error 
> "is not a constant-expression" "T" { target { ! c++14 } } }
> +  double x;
> +  int y;
> +};
> +
> +static_assert (S (6.5).x == 0.8125, "");   // { dg-error "non-constant 
> condition for static assertion|in constexpr expansion" "" { target { ! c++14 
> } } }
> +static_assert (T (6.5).x == 0.8125, "");   // { dg-error "non-constant 
> condition for static assertion|called in a constant expression" "" { target { 
> ! c++14 } } }
> --- gcc/testsuite/g++.dg/cpp1y/constexpr-50060.C.jj 2016-07-18 
> 20:46:00.992553765 +0200
> +++ gcc/testsuite/g++.dg/cpp1y/constexpr-50060.C2016-07-18 
> 20:46:00.992553765 +0200
> @@ -0,0 +1,100 @@
> +// PR c++/50060
> +// { dg-do compile { target c++14 } }
> +
> +// sincos and lgamma_r aren't available in -std=c++14,
> +// only in -std=gnu++14.  Use __builtin_* in that case.
> +extern "C" void sincos (double, double *, double *);
> +extern "C" double frexp (double, int *);
> +extern "C" double modf (double, double *);
> +extern "C" double remquo (double, double, int *);
> +extern "C" double lgamma_r (double, int *);
> +
> +constexpr double
> +f0 (double x)
> +{
> +  double y {};
> +  double z {};
> +  __builtin_sincos (x, &y, &z);
> +  return y;
> +}
> +
> +constexpr double
> +f1 (double x)
> +{
> +  double y {};
> +  double z {};
> +  __builtin_sincos (x, &y, &z);
> +  return z;
> +}
>
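
The constant 0.8125 in the static_asserts of the quoted tests follows from what
frexp returns for 6.5. The sketch below checks this at run time with std::frexp,
so it does not depend on any constexpr support for the builtin; the Split type
and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Result of splitting a value into mantissa * 2^exponent.
struct Split {
    double mantissa;
    int exponent;
};

// Wraps std::frexp, which writes the exponent through its pointer argument.
Split split(double a) {
    Split s{};
    s.mantissa = std::frexp(a, &s.exponent);
    return s;
}

void check_split() {
    Split s = split(6.5);
    assert(s.mantissa == 0.8125);                       // 6.5 == 0.8125 * 8
    assert(s.exponent == 3);
    assert(std::ldexp(s.mantissa, s.exponent) == 6.5);  // round trip
}
```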

[PATCH] report supported function classes correctly on *-musl

2016-07-20 Thread Szabolcs Nagy
All function classes listed in gcc/coretypes.h are supported by musl.

Most of the optimizations based on these function classes are not
relevant for standard-conforming C code, but this is required to get
rid of some test system noise.
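
The effect of the one-line change in the patch below can be modelled as follows.
This is a sketch only: the enums are stand-ins for GCC's enum function_class and
for the OPTION_GLIBC / OPTION_MUSL / OPTION_BIONIC macros, and the Bionic branch
stops at the one class that is visible in the hunk.

```cpp
#include <cassert>

// Stand-ins for the libc selection macros and for enum function_class.
enum class Libc { Glibc, Musl, Bionic, Other };
enum class FunctionClass { C94, C99Misc, Sincos };

// Model of the patched predicate: musl is now treated like glibc.
bool libc_has_function(Libc libc, FunctionClass fn_class) {
    if (libc == Libc::Glibc || libc == Libc::Musl)
        return true;
    if (libc == Libc::Bionic)
        return fn_class == FunctionClass::C94;  // rest of the condition is not shown in the hunk
    return false;
}

void check_libc_has_function() {
    assert(libc_has_function(Libc::Musl, FunctionClass::Sincos));
    assert(libc_has_function(Libc::Glibc, FunctionClass::C99Misc));
    assert(!libc_has_function(Libc::Other, FunctionClass::C94));
}
```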

gcc/
2016-07-20  Szabolcs Nagy  

* config/linux.c (linux_libc_has_function): Return true on musl.
From 294b908f9a7577bcfe8036a601262ca0bc7c2ca2 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy 
Date: Fri, 6 Nov 2015 23:59:20 +
Subject: [PATCH 2/7] linux_libc_has_function

---
 gcc/config/linux.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/linux.c b/gcc/config/linux.c
index 2081e34..37515bf 100644
--- a/gcc/config/linux.c
+++ b/gcc/config/linux.c
@@ -26,7 +26,7 @@ along with GCC; see the file COPYING3.  If not see
 bool
 linux_libc_has_function (enum function_class fn_class)
 {
-  if (OPTION_GLIBC)
+  if (OPTION_GLIBC || OPTION_MUSL)
 return true;
   if (OPTION_BIONIC)
 if (fn_class == function_c94
-- 
2.4.1



[PATCH] Consider functions with xloc.file == NULL (PR gcov-profile/69028)

2016-07-20 Thread Martin Liška
Hi.

The following patch addresses an ICE that happens when coverage.c computes the
checksum of a function without xloc.file. The patch assumes that a function
without xloc.file is a valid state, a situation exposed by cilkplus functions.
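
The shape of the fix can be sketched outside of GCC as follows. The function
checksum_string is a hypothetical stand-in for coverage_checksum_string (it is
not the hash GCC uses), and lineno_checksum only mirrors the order of
operations in the patched code: the file name is mixed in only when present.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for coverage_checksum_string: mixes the bytes of a
// non-null C string into a running checksum. Not GCC's actual hash.
std::uint32_t checksum_string(std::uint32_t chksum, const char* s) {
    for (; *s != '\0'; ++s)
        chksum = (chksum ^ static_cast<unsigned char>(*s)) * 16777619u;
    return chksum;
}

// Mirrors the patched logic: skip the file name when there is none.
std::uint32_t lineno_checksum(unsigned line, const char* file, const char* name) {
    std::uint32_t chksum = line;
    if (file != nullptr)
        chksum = checksum_string(chksum, file);
    return checksum_string(chksum, name);
}

void check_lineno_checksum() {
    // Without a file, only the line and the name contribute.
    assert(lineno_checksum(3, nullptr, "f") == checksum_string(3, "f"));
    // With a file, it is mixed in before the name.
    assert(lineno_checksum(3, "a.c", "f")
           == checksum_string(checksum_string(3, "a.c"), "f"));
}
```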

The patch bootstraps on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
From ac1ba622f394d9914c5f8250719780595f54b571 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 20 Jul 2016 09:54:12 +0200
Subject: [PATCH] Consider functions with xloc.file == NULL (PR
 gcov-profile/69028)

gcc/testsuite/ChangeLog:

2016-07-20  Martin Liska  

	PR gcov-profile/69028
	PR gcov-profile/62047
	* g++.dg/cilk-plus/pr69028.C: New test.

gcc/ChangeLog:

2016-07-20  Martin Liska  

	* coverage.c (coverage_compute_lineno_checksum): Do not
	calculate checksum for fns w/o xloc.file.
	(coverage_compute_profile_id): Likewise.
---
 gcc/coverage.c                           |  6 ++++--
 gcc/testsuite/g++.dg/cilk-plus/pr69028.C | 13 +++++++++++++
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cilk-plus/pr69028.C

diff --git a/gcc/coverage.c b/gcc/coverage.c
index 67cc908..d4d371e 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -553,7 +553,8 @@ coverage_compute_lineno_checksum (void)
 = expand_location (DECL_SOURCE_LOCATION (current_function_decl));
   unsigned chksum = xloc.line;
 
-  chksum = coverage_checksum_string (chksum, xloc.file);
+  if (xloc.file)
+chksum = coverage_checksum_string (chksum, xloc.file);
   chksum = coverage_checksum_string
 (chksum, IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (current_function_decl)));
 
@@ -580,7 +581,8 @@ coverage_compute_profile_id (struct cgraph_node *n)
   bool use_name_only = (PARAM_VALUE (PARAM_PROFILE_FUNC_INTERNAL_ID) == 0);
 
   chksum = (use_name_only ? 0 : xloc.line);
-  chksum = coverage_checksum_string (chksum, xloc.file);
+  if (xloc.file)
+	chksum = coverage_checksum_string (chksum, xloc.file);
   chksum = coverage_checksum_string
 	(chksum, IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (n->decl)));
   if (!use_name_only && first_global_object_name)
diff --git a/gcc/testsuite/g++.dg/cilk-plus/pr69028.C b/gcc/testsuite/g++.dg/cilk-plus/pr69028.C
new file mode 100644
index 0000000..31542f3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cilk-plus/pr69028.C
@@ -0,0 +1,13 @@
+// PR c++/69028
+// { dg-require-effective-target c++11 }
+// { dg-options "-fcilkplus -fprofile-arcs" }
+
+void parallel()
+{
+}
+
+int main()
+{
+   _Cilk_spawn parallel();
+   _Cilk_sync;
+}
-- 
2.9.0



Re: [PATCH] c++/58796 Make nullptr match exception handlers of pointer type

2016-07-20 Thread Jonathan Wakely

On 19/07/16 10:32 +0100, Jonathan Wakely wrote:
> On 18/07/16 12:49 -0400, Jason Merrill wrote:
>> Perhaps the right answer is to drop support for catching nullptr as a
>> pointer to member from the language.
>
> Yes, I've been drafting a ballot comment along those lines.


On the CWG reflector Richard Smith suggested using static objects as
the result for pointer to member handlers. I had tried that
unsuccessfully, but must have done something wrong because it works
fine, and avoids any races.

Tested x86_64-linux. I'll commit this to trunk later today.
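
For readers, the behaviour under discussion can be sketched as below. This is
not the testcase from the patch; the struct A and the function names are made
up, and the result assumes a C++ runtime that already matches nullptr against
pointer and pointer-to-member handlers.

```cpp
#include <cassert>

struct A { int i; };

// Throws nullptr and records, as a bit mask, which handlers matched.
int catch_nullptr_mask() {
    int mask = 0;
    try {
        try {
            try {
                try {
                    throw nullptr;
                } catch (void* p) {            // object pointer handler
                    if (p == nullptr) mask |= 1;
                    throw;
                }
            } catch (int A::* pm) {            // pointer to data member handler
                if (pm == nullptr) mask |= 2;
                throw;
            }
        } catch (int (A::*pmf)()) {            // pointer to member function handler
            if (pmf == nullptr) mask |= 4;
            throw;
        }
    } catch (decltype(nullptr)) {              // the exception's own type
        mask |= 8;
    }
    return mask;
}

// Holds only on runtimes that support all three pointer-like handlers.
void check_catch_nullptr() {
    assert(catch_nullptr_mask() == 15);
}
```

On a runtime without that support, the handlers that do not match are simply
skipped and the corresponding bits stay clear.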

commit 6cc1a2bca8dddb8ff5994849fcd3ee22de8776ed
Author: Jonathan Wakely 
Date:   Wed Jul 20 12:49:50 2016 +0100

Use static pointer to member when catching nullptr

libstdc++-v3:

	* libsupc++/pbase_type_info.cc (__pbase_type_info::__do_catch): Use
	static objects for catching nullptr as pointer to member types.

gcc/testsuite:

	* g++.dg/cpp0x/nullptr35.C: Change expected result for catching as
	pointer to member function and also test catching by reference.

diff --git a/gcc/testsuite/g++.dg/cpp0x/nullptr35.C b/gcc/testsuite/g++.dg/cpp0x/nullptr35.C
index c84966f..d932114 100644
--- a/gcc/testsuite/g++.dg/cpp0x/nullptr35.C
+++ b/gcc/testsuite/g++.dg/cpp0x/nullptr35.C
@@ -39,7 +39,7 @@ int main()
   caught(4);
 throw;
   }
-} catch (int (A::*pmf)()) {  // FIXME: currently unsupported
+} catch (int (A::*pmf)()) {
   if (pmf == nullptr)
 caught(8);
   throw;
@@ -47,6 +47,35 @@ int main()
   } catch (nullptr_t) {
   }
 
-  if (result != 7) // should be 15
+  try {
+try {
+  try {
+try {
+  try {
+throw nullptr;
+  } catch (void* const& p) {
+if (p == nullptr)
+  caught(16);
+throw;
+  }
+} catch (void(* const& pf)()) {
+  if (pf == nullptr)
+caught(32);
+  throw;
+}
+  } catch (int A::* const& pm) {
+if (pm == nullptr)
+  caught(64);
+throw;
+  }
+} catch (int (A::* const& pmf)()) {
+  if (pmf == nullptr)
+caught(128);
+  throw;
+}
+  } catch (nullptr_t) {
+  }
+
+  if (result != 255)
 abort ();
 }
diff --git a/libstdc++-v3/libsupc++/pbase_type_info.cc b/libstdc++-v3/libsupc++/pbase_type_info.cc
index a2993e4..ff6b756 100644
--- a/libstdc++-v3/libsupc++/pbase_type_info.cc
+++ b/libstdc++-v3/libsupc++/pbase_type_info.cc
@@ -50,14 +50,16 @@ __do_catch (const type_info *thr_type,
 {
   if (__pointee->__is_function_p ())
 {
-  // A pointer-to-member-function is two words  but the
-  // nullptr_t exception object at *(nullptr_t*)*thr_obj is only
-  // one word, so we can't safely return it as a PMF. FIXME.
-  return false;
+  using pmf_type = void (__pbase_type_info::*)();
+  static const pmf_type pmf = nullptr;
+  *thr_obj = const_cast<pmf_type*>(&pmf);
+  return true;
 }
   else
 {
-  *(ptrdiff_t*)*thr_obj = -1; // null pointer to data member
+  using pm_type = int __pbase_type_info::*;
+  static const pm_type pm = nullptr;
+  *thr_obj = const_cast<pm_type*>(&pm);
   return true;
 }
 }


Re: [PATCH 2/3][AArch64] Improve zero extend

2016-07-20 Thread Richard Earnshaw (lists)
On 20/07/16 14:08, Wilco Dijkstra wrote:
> Richard Earnshaw wrote:
>> I'm not sure about this, while rtx_cost is called recursively as it
>> walks the RTL, I'd normally expect the outer levels of the recursion to
>> catch the cases where zero-extend is folded into a more complex
>> operation.  Hitting a case like this suggests that something isn't doing
>> that correctly.
> 
> As mentioned, the query is about a non-existent instruction, so the existing
> rtx_cost code won't handle it. In fact there is no other check for "outer"
> anywhere in aarch64_rtx_cost. We either assume outer == SET or know that if
> it isn't, the expression will be split.
> 
>> So what was the top-level RTX passed into rtx_cost?  I'd like to get a
>> better understanding about the use case before acking this patch.
> 
> An example would be:
> 
> long f(unsigned x) { return (long)x * 20; }
> 
> Combine tries to merge the constant into the multiply, so we get this cost
> query:
> 
> (mult:DI (zero_extend:DI (reg/v:SI 74 [ x ]))
> (const_int 20 [0x14]))
> 
> Given this is not a legal multiply, rtx_mult_cost recurses, assuming both the
> zero_extend and the immediate are going to be split off. But then the
> zero_extend is a SET, i.e. a zero-cost operation. So not checking outer is
> correct.
> 
> Wilco
> 

Why does combine care what the cost is if the instruction isn't valid?

R.


Re: [PATCH]: Use HOST_WIDE_INT_{,M}1{,U} some more

2016-07-20 Thread Bernd Schmidt



On 07/20/2016 02:25 PM, Uros Bizjak wrote:
> 2016-07-19 14:46 GMT+02:00 Uros Bizjak :
>> The result of exercises with sed in gcc/ directory.
>
> Some more conversions:
>
> 2016-07-20  Uros Bizjak  
>
> * cse.c: Use HOST_WIDE_INT_M1 instead of ~(HOST_WIDE_INT) 0.
> * combine.c: Use HOST_WIDE_INT_M1U instead of
> ~(unsigned HOST_WIDE_INT) 0.
> * double-int.h: Ditto.
> * dse.c: Ditto.
> * dwarf2asm.c: Ditto.
> * expmed.c: Ditto.
> * genmodes.c: Ditto.
> * match.pd: Ditto.
> * read-rtl.c: Ditto.
> * tree-ssa-loop-ivopts.c: Ditto.
> * tree-ssa-loop-prefetch.c: Ditto.
> * tree-vect-generic.c: Ditto.
> * tree-vect-patterns.c: Ditto.
> * tree.c: Ditto.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> OK for mainline?


I think this is a good set of changes which makes the code easier to 
read. Can I impose one additional requirement, building before/after and 
verifying that all the object files are identical? If you do this, these 
and all other similar changes are preapproved.
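
The reason identical object files are a reasonable expectation can be sketched
as follows. The type aliases and constants are stand-ins for GCC's
HOST_WIDE_INT, HOST_WIDE_INT_M1 and HOST_WIDE_INT_M1U (assumed here to be
64 bits wide); the point is only that both spellings denote the same all-ones
value.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for HOST_WIDE_INT and unsigned HOST_WIDE_INT.
using host_wide_int = std::int64_t;
using unsigned_host_wide_int = std::uint64_t;

// Stand-ins for HOST_WIDE_INT_M1 and HOST_WIDE_INT_M1U.
constexpr host_wide_int kM1 = -1;
constexpr unsigned_host_wide_int kM1U = static_cast<unsigned_host_wide_int>(-1);

// The old spellings from the ChangeLog, written as C++ casts.
constexpr host_wide_int kOldSigned = ~static_cast<host_wide_int>(0);
constexpr unsigned_host_wide_int kOldUnsigned = ~static_cast<unsigned_host_wide_int>(0);

static_assert(kM1 == kOldSigned, "signed spellings agree");
static_assert(kM1U == kOldUnsigned, "unsigned spellings agree");

// Run-time view of the same comparison.
bool spellings_agree() {
    return kM1 == kOldSigned && kM1U == kOldUnsigned;
}

void check_spellings() {
    assert(spellings_agree());
    assert(kM1U == UINT64_MAX);
}
```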



Bernd



Re: [PATCH 2/3][AArch64] Improve zero extend

2016-07-20 Thread Wilco Dijkstra
Richard Earnshaw wrote:
> I'm not sure about this, while rtx_cost is called recursively as it
> walks the RTL, I'd normally expect the outer levels of the recursion to
> catch the cases where zero-extend is folded into a more complex
> operation.  Hitting a case like this suggests that something isn't doing
> that correctly.

As mentioned, the query is about a non-existent instruction, so the existing
rtx_cost code won't handle it. In fact there is no other check for "outer"
anywhere in aarch64_rtx_cost. We either assume outer == SET or know that if it
isn't, the expression will be split.

> So what was the top-level RTX passed into rtx_cost?  I'd like to get a
> better understanding about the use case before acking this patch.

An example would be:

long f(unsigned x) { return (long)x * 20; }

Combine tries to merge the constant into the multiply, so we get this cost
query:

(mult:DI (zero_extend:DI (reg/v:SI 74 [ x ]))
(const_int 20 [0x14]))

Given this is not a legal multiply, rtx_mult_cost recurses, assuming both the
zero_extend and the immediate are going to be split off. But then the
zero_extend is a SET, i.e. a zero-cost operation. So not checking outer is
correct.

Wilco


