Re: [PATCH 0/3] Support for mandatory tail calls

2016-05-18 Thread Basile Starynkevitch

On 05/19/2016 12:12 AM, Jeff Law wrote:

On 05/17/2016 04:01 PM, David Malcolm wrote:

There have been requests [1] for libgccjit to better support
functional programming by supporting the continuation-passing style,
in which every function "returns" by calling a "continuation"
function pointer.

These calls must be guaranteed to be implemented as a jump,
otherwise the program could consume an arbitrary amount of stack
space as it executed.

This patch kit implements this.

Patch 1 is a preliminary tweak to calls.c

Patch 2 implements a new flag in tree.h: CALL_EXPR_MUST_TAIL_CALL,
which makes calls.c try harder to implement a flagged call as a
tail-call/sibling call, and makes it issue an error if
the optimization is impossible.  It doesn't implement any
frontend support for setting the flag (instead using a plugin
to test it).  We had some discussion on the jit list about possibly
introducing a new builtin for this, but the patch punts on this
issue.

I wonder if we should have an attribute so that the flag can be set 
for C/C++ code.  I've seen requests for forcing tail calls in C/C++ 
code several times in the past, precisely to support continuations.


Why an attribute? Attributes are on declarations. I think it would be 
better to use a pragma like _Pragma(GCC tail call, foo(x,y)) or a 
builtin (or else a syntax extension like goto return foo(x,y); ...), 
because what we really want is to annotate one particular call as a 
tail call.
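
To make the per-call nature of the request concrete, here is a minimal C sketch of continuation-passing style. The names sum_down and finish are hypothetical and not part of the patch kit. Each "return" is a call in tail position, and nothing in standard C guarantees that the compiler turns these calls into jumps, which is the gap a per-call annotation would close.

```c
/* Continuation type: receives the final result.  */
typedef long (*cont_fn) (long acc);

/* Hypothetical final continuation: just hands the result back.  */
static long
finish (long acc)
{
  return acc;
}

/* Sum n + (n-1) + ... + 1 in continuation-passing style.  Every
   "return" is a call in tail position, either to ourselves or to
   the continuation K.  */
static long
sum_down (long n, long acc, cont_fn k)
{
  if (n == 0)
    return k (acc);                     /* tail call to the continuation */
  return sum_down (n - 1, acc + n, k);  /* tail call to ourselves */
}
```

Calling sum_down (10, 0, finish) walks the chain and ends in the continuation. Without a tail-call guarantee, stack use grows with n.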


Cheers

--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***



Re: inhibit the sincos optimization when the target has sin and cos instructions

2016-05-18 Thread Cesar Philippidis
On 05/18/2016 05:29 AM, Nathan Sidwell wrote:
> On 05/17/16 17:30, Cesar Philippidis wrote:
>> On 05/17/2016 02:22 PM, Andrew Pinski wrote:

 gcc.sum
 Tests that now fail, but worked before:

 nvptx-none-run: gcc.c-torture/execute/20100316-1.c   -Os  execution
 test
 nvptx-none-run: gcc.c-torture/execute/20100708-1.c   -O1  execution
 test
 nvptx-none-run: gcc.c-torture/execute/20100805-1.c   -O0  execution
 test
 nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer
 -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
 nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -g  execution test

> 
> Please determine why these now fail.

Those were failing intermittently, at least on my desktop. I'll look
into that next.

>> +(define_expand "sincossf3"
>> +  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
>> +(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
>> +   UNSPEC_COS))
>> +   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
>> +(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
>> +  "flag_unsafe_math_optimizations"
>> +{
>> +  emit_insn (gen_sinsf2 (operands[1], operands[2]));
>> +  emit_insn (gen_cossf2 (operands[0], operands[2]));
>> +
>> +  DONE;
>> +})
> 
> Why the emit_insn code?  that seems to be replicating the RTL
> representation -- you're saying the same thing twice.
> 
> Doesn't operands[2] need (conditionally) copying to a new register --
> what if it aliases operands[1]?

This patch does that now.
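
The aliasing hazard raised above can be illustrated in plain C, with pointers standing in for registers. This is only a sketch under assumptions: op_first and op_second are hypothetical stand-ins for the two single-result operations, and the functions below are not the nvptx expander.

```c
/* Stand-ins for the two single-result operations.  */
static float op_first (float x)  { return x + 1.0f; }
static float op_second (float x) { return x * 2.0f; }

/* Naive expansion: the first store may overwrite *in when out_first
   aliases in, so the second operation sees the wrong input.  */
static void
expand_naive (float *out_second, float *out_first, const float *in)
{
  *out_first = op_first (*in);
  *out_second = op_second (*in);
}

/* Safe expansion: copy the input to a temporary before any store.  */
static void
expand_safe (float *out_second, float *out_first, const float *in)
{
  float tmp = *in;
  *out_first = op_first (tmp);
  *out_second = op_second (tmp);
}
```

When an output shares storage with the input, only the variant that copies the input first computes both results from the original value.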

>> +++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
>> @@ -0,0 +1,30 @@
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 -ffast-math" } */
>> +
> 
> What is this test trying to test?  I'm puzzled by it.  (btw, don't use
> assert, either abort, exit(1) or return from main.)

My intent was to verify that I got the sin and cos arguments right,
i.e., make sure that this sincos expansion didn't mix up sin(x) with
cos(x). I guess I can create a test that uses vprintf and scans
dg-output for the proper results. But in this patch I just omitted that
test case altogether.

Is this patch ok for trunk?

Cesar

2016-05-18  Cesar Philippidis  

	gcc/
	* config/nvptx/nvptx.md (sincossf3): New pattern.

	gcc/testsuite/
	* gcc.target/nvptx/sincos.c: New test.


diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..69bbb22 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -794,6 +794,24 @@
   ""
   "%.\\tsqrt%#%t0\\t%0, %1;")
 
+(define_expand "sincossf3"
+  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+	   UNSPEC_COS))
+   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+  "flag_unsafe_math_optimizations"
+{
+  if (REGNO (operands[0]) == REGNO (operands[2]))
+{
+  rtx tmp = gen_reg_rtx (GET_MODE (operands[2]));
+  emit_insn (gen_rtx_SET (tmp, operands[2]));
+  emit_insn (gen_sinsf2 (operands[1], tmp));
+  emit_insn (gen_cossf2 (operands[0], tmp));
+  DONE;
+}
+})
+
 (define_insn "sinsf2"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
 	(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos.c b/gcc/testsuite/gcc.target/nvptx/sincos.c
new file mode 100644
index 000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */


[PATCH] c++/71147 - [6 Regression] Flexible array member wrongly rejected in template

2016-05-18 Thread Martin Sebor

The handling of flexible array members whose element type was
dependent tried to deal with the case when the element type
was not yet completed but it did it wrong.  The attached patch
corrects the handling by trying to complete the element type
first.
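
For readers less familiar with the feature, a minimal C sketch of a flexible array member follows. The struct and function names are hypothetical. The point it illustrates is that the array type itself stays incomplete while its element type must be complete, since sizeof is applied to an element.

```c
#include <stdlib.h>

struct elem { unsigned u; };

struct packet {
  int count;
  struct elem array[];   /* flexible array member: incomplete array type */
};

/* Allocate a packet with room for COUNT trailing elements.  The
   element type must be complete for the size computation below.  */
static struct packet *
packet_new (int count)
{
  struct packet *p
    = malloc (sizeof *p + (size_t) count * sizeof p->array[0]);
  if (!p)
    return NULL;
  p->count = count;
  for (int i = 0; i < count; i++)
    p->array[i].u = (unsigned) i * 10u;
  return p;
}
```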

Thanks
Martin
PR c++/71147 - [6 Regression] Flexible array member wrongly rejected in template

gcc/testsuite/ChangeLog:
2016-05-18  Martin Sebor  

	PR c++/71147
	* g++.dg/ext/flexary16.C: New test.

gcc/cp/ChangeLog:
2016-05-18  Martin Sebor  

	PR c++/71147
	* pt.c (instantiate_class_template_1): Try to complete the element
	type of a flexible array member.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 65bfd42..73291e0 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10119,16 +10119,25 @@ instantiate_class_template_1 (tree type)
 			  if (can_complete_type_without_circularity (rtype))
 			complete_type (rtype);
 
+			  bool complete;
+
   if (TREE_CODE (r) == FIELD_DECL
   && TREE_CODE (rtype) == ARRAY_TYPE
-  && COMPLETE_TYPE_P (TREE_TYPE (rtype))
   && !COMPLETE_TYPE_P (rtype))
 {
-  /* Flexible array mmembers of elements
- of complete type have an incomplete type
- and that's okay.  */
+  /* Flexible array members have an incomplete
+ type and that's okay as long as their element
+ type is complete.  */
+			  tree eltype = TREE_TYPE (rtype);
+			  if (can_complete_type_without_circularity (eltype))
+complete_type (eltype);
+
+			  complete = COMPLETE_TYPE_P (eltype);
 }
-  else if (!COMPLETE_TYPE_P (rtype))
+			  else
+			complete = COMPLETE_TYPE_P (rtype);
+
+			  if (!complete)
 			{
 			  cxx_incomplete_type_error (r, rtype);
 			  TREE_TYPE (r) = error_mark_node;
diff --git a/gcc/testsuite/g++.dg/ext/flexary16.C b/gcc/testsuite/g++.dg/ext/flexary16.C
new file mode 100644
index 000..a3e040d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/flexary16.C
@@ -0,0 +1,37 @@
+// PR c++/71147 - [6 Regression] Flexible array member wrongly rejected
+//   in template
+// { dg-do compile }
+
+template 
+struct container
+{
+  struct elem {
+unsigned u;
+  };
+
+  struct incomplete {
+int x;
+elem array[];
+  };
+};
+
+unsigned f (container::incomplete* i)
+{
+  return i->array [0].u;
+}
+
+
+template 
+struct D: container
+{
+  struct S {
+int x;
+typename container::elem array[];
+  };
+};
+
+
+unsigned g (D::S *s)
+{
+  return s->array [0].u;
+}


Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-18 Thread Jim Wilson
On Mon, May 16, 2016 at 4:30 AM, James Greenhalgh
 wrote:
> As this change will change code generation for all cores (except
> Exynos-M1), I'd like to hear from those with more detailed knowledge of
> ThunderX, X-Gene and qdf24xx before I take this patch.

It looks like a slight loss on qdf24xx on SPEC CPU2006 at -O3.  I see
about a 0.37% loss on the integer benchmarks, and no significant
change on the FP benchmarks.  The integer loss is mainly due to
458.sjeng which drops 2%.  We had tried various values for
max_case_values earlier, and didn't see any performance improvement
from setting it, so we are using the default value.

We've been tracking changes to the FSF tree, and adjusting our tuning
structure as necessary, so I'm not too concerned about this.  We will
just set the max_case_values field in the tuning structure to get the
result we want.  What I am slightly concerned about is that the
max_case_values field is only used at -O3 and above which limits the
usefulness.  If a port has specified a value, it probably should be
used for all non-size optimization, which means we should check for
optimize_size first, then check for a cpu specific value, then use the
default.  If you do that, then you don't need to change the default to
get better generic/a53 code, you can change it in the generic and/or
a53 tuning tables.
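
The ordering suggested above can be sketched as a small C function. This is not the aarch64 backend code. The function name, the parameters, and the convention that 0 means "no CPU-specific value" are assumptions made for illustration.

```c
/* Pick a case-values threshold: size optimization wins, then a
   CPU-specific tuning value if one was given, then the default.
   A cpu_max_case_values of 0 is assumed to mean "not specified".  */
static unsigned
case_values_threshold (int optimizing_for_size,
                       unsigned cpu_max_case_values,
                       unsigned size_threshold,
                       unsigned default_threshold)
{
  if (optimizing_for_size)
    return size_threshold;
  if (cpu_max_case_values != 0)
    return cpu_max_case_values;
  return default_threshold;
}
```

With this ordering the tuning value applies at every non-size optimization level, not only at -O3.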

Though I see that the original patch from Samsung that added the
max_case_values field has the -O3 check, so there was apparently some
reason why they wanted it to work that way.  The value that the
exynos-m1 is using, 48, looks pretty large, so maybe they thought that
the code size expansion from that is only OK at -O3 and above.  Worst
case, we might need two max_case_value fields, one to use at -O1/-O2,
and one to use at -O3.

Jim


[PATCH] PR c/71171: Fix uninitialized source_range in c_parser_postfix_expression

2016-05-18 Thread David Malcolm
PR c/71171 reports yet another instance of the src_range of a
c_expr being used without initialization.  Investigation shows
that this was due to error-handling, where the "value" field of
a c_expr is set to error_mark_node without touching the
src_range, leading to complaints from valgrind.

This seems to be a common mistake, so this patch introduces a
new method, c_expr::set_error, which sets the value to
error_mark_node whilst initializing the src_range to
UNKNOWN_LOCATION.
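
The idea can be sketched in C as a single helper that sets both fields together, so no error path can forget one of them. The types and names here are hypothetical stand-ins, not the c_expr definition from c-tree.h.

```c
/* Hypothetical stand-ins for the location and expression types.  */
#define UNKNOWN_LOC 0u

struct src_range { unsigned start, finish; };

struct expr {
  const void *value;
  struct src_range range;
};

/* Unique address playing the role of an error marker.  */
static const int error_marker = 0;

/* Mark E as erroneous and give its range a defined value, so later
   reads of the range never touch uninitialized memory.  */
static void
expr_set_error (struct expr *e)
{
  e->value = &error_marker;
  e->range.start = UNKNOWN_LOC;
  e->range.finish = UNKNOWN_LOC;
}
```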

This fixes the valgrind issue seen in PR c/71171, along with various
similar issues seen when running the testsuite using the checker
patch I posted here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00887.html
(this checker still doesn't fully work yet, but it seems to be good
for easily detecting these issues without needing Valgrind).

Successfully bootstrapped on x86_64-pc-linux-gnu.

OK for trunk and for gcc-6-branch?

gcc/c/ChangeLog:
PR c/71171
* c-parser.c (c_parser_generic_selection): Use c_expr::set_error
in error-handling.
(c_parser_postfix_expression): Likewise.
* c-tree.h (c_expr::set_error): New method.
* c-typeck.c (parser_build_binary_op): In error-handling, ensure
that result's range is initialized.
---
 gcc/c/c-parser.c | 72 
 gcc/c/c-tree.h   |  9 +++
 gcc/c/c-typeck.c |  7 +-
 3 files changed, 51 insertions(+), 37 deletions(-)

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 5703989..5edeb64 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7200,7 +7200,7 @@ c_parser_generic_selection (c_parser *parser)
 
   error_expr.original_code = ERROR_MARK;
   error_expr.original_type = NULL;
-  error_expr.value = error_mark_node;
+  error_expr.set_error ();
   matched_assoc.type_location = UNKNOWN_LOCATION;
   matched_assoc.type = NULL_TREE;
   matched_assoc.expression = error_expr;
@@ -7511,13 +7511,13 @@ c_parser_postfix_expression (c_parser *parser)
gcc_assert (c_dialect_objc ());
if (!c_parser_require (parser, CPP_DOT, "expected %<.%>"))
  {
-   expr.value = error_mark_node;
+   expr.set_error ();
break;
  }
if (c_parser_next_token_is_not (parser, CPP_NAME))
  {
c_parser_error (parser, "expected identifier");
-   expr.value = error_mark_node;
+   expr.set_error ();
break;
  }
c_token *component_tok = c_parser_peek_token (parser);
@@ -7531,7 +7531,7 @@ c_parser_postfix_expression (c_parser *parser)
  }
default:
  c_parser_error (parser, "expected expression");
- expr.value = error_mark_node;
+ expr.set_error ();
  break;
}
   break;
@@ -7553,7 +7553,7 @@ c_parser_postfix_expression (c_parser *parser)
  parser->error = true;
  c_parser_skip_until_found (parser, CPP_CLOSE_BRACE, NULL);
  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
- expr.value = error_mark_node;
+ expr.set_error ();
  break;
}
  stmt = c_begin_stmt_expr ();
@@ -7582,7 +7582,7 @@ c_parser_postfix_expression (c_parser *parser)
 "expected %<)%>");
  if (type_name == NULL)
{
- expr.value = error_mark_node;
+ expr.set_error ();
}
  else
expr = c_parser_postfix_expression_after_paren_type (parser,
@@ -7642,7 +7642,7 @@ c_parser_postfix_expression (c_parser *parser)
c_parser_consume_token (parser);
if (!c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
  {
-   expr.value = error_mark_node;
+   expr.set_error ();
break;
  }
e1 = c_parser_expr_no_commas (parser, NULL);
@@ -7651,7 +7651,7 @@ c_parser_postfix_expression (c_parser *parser)
if (!c_parser_require (parser, CPP_COMMA, "expected %<,%>"))
  {
c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
-   expr.value = error_mark_node;
+   expr.set_error ();
break;
  }
loc = c_parser_peek_token (parser)->location;
@@ -7661,7 +7661,7 @@ c_parser_postfix_expression (c_parser *parser)
   "expected %<)%>");
if (t1 == NULL)
  {
-   expr.value = error_mark_node;
+   expr.set_error ();
  }
else
  {
@@ -7683,7 +7683,7 @@ c_parser_postfix_expression (c_parser *parser)
  c_parser_consume_token (parser);
  if (!c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
{
- expr.value = error_mark_node;
+ expr.set_error ();
  

[PATCH] PR c++/71184: Fix NULL dereference in cp_parser_operator

2016-05-18 Thread David Malcolm
The source-range handling for the array form of operator
new/delete erroneously assumed that the "]" was present,
leading to a dereference of NULL when it's absent.

Fix it thusly.
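
The shape of the fix, checking the returned pointer before reading a field through it, can be sketched in C. The token type and the require_token helper are hypothetical and only mimic a "require" function that returns NULL when the expected token is absent.

```c
#include <stddef.h>

struct token { int kind; unsigned location; };

/* Hypothetical stand-in: returns NEXT if it has the expected kind,
   NULL otherwise.  */
static const struct token *
require_token (const struct token *next, int kind)
{
  return (next && next->kind == kind) ? next : NULL;
}

/* Return the location of the required token, or FALLBACK when the
   token is missing, instead of dereferencing NULL.  */
static unsigned
end_location (const struct token *next, int kind, unsigned fallback)
{
  const struct token *tok = require_token (next, kind);
  return tok ? tok->location : fallback;
}
```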

Successfully bootstrapped on x86_64-pc-linux-gnu;
adds 6 PASS results to g++.sum.

OK for trunk and gcc-6-branch?

gcc/cp/ChangeLog:
PR c++/71184
* parser.c (cp_parser_operator): For array new/delete, check that
cp_parser_require returned a non-NULL token before dereferencing
it.

gcc/testsuite/ChangeLog:
PR c++/71184
* g++.dg/pr71184.C: New test case.
---
 gcc/cp/parser.c| 6 --
 gcc/testsuite/g++.dg/pr71184.C | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr71184.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 539f165..1d1e574 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -13791,8 +13791,10 @@ cp_parser_operator (cp_parser* parser)
/* Consume the `[' token.  */
cp_lexer_consume_token (parser->lexer);
/* Look for the `]' token.  */
-   end_loc = cp_parser_require (parser, CPP_CLOSE_SQUARE,
- RT_CLOSE_SQUARE)->location;
+   cp_token *close_token =
+ cp_parser_require (parser, CPP_CLOSE_SQUARE, RT_CLOSE_SQUARE);
+   if (close_token)
+ end_loc = close_token->location;
id = ansi_opname (op == NEW_EXPR
  ? VEC_NEW_EXPR : VEC_DELETE_EXPR);
  }
diff --git a/gcc/testsuite/g++.dg/pr71184.C b/gcc/testsuite/g++.dg/pr71184.C
new file mode 100644
index 000..452303e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr71184.C
@@ -0,0 +1 @@
+operator new[ // { dg-error "expected type-specifier before 'new'" }
-- 
1.8.5.3



[PATCH] c/71115 - Missing warning: excess elements in struct initializer

2016-05-18 Thread Martin Sebor

The bug points out that the following and similar invalid uses
of NULL are not diagnosed.

  #include 

  const char* a[1] = { "", NULL };

The attached patch implements the suggestion on the Diagnostics
Guidelines Wiki to call
expansion_point_location_if_in_system_header to determine the
location where the macro is used.  Making these changes and
noticing the already existing calls to the function made me
wonder if this approach (warning on system macros) should be
the default strategy rather than a special case.  Aren't there
many more contexts where we would like to see warnings for
them?

In comment #8 on the bug Manuel also suggests removing the note:
(near initialization for 'decl').  I tried it but decided not to
include it in this change because of the large number of tests it
will require making changes to (I counted at least 20).  I think
it's a worthwhile change but it seems that it might better be
made on its own.

Martin
PR c/71115 - Missing warning: excess elements in struct initializer

gcc/testsuite/ChangeLog:
2016-05-18  Martin Sebor  

	PR c/71115
	* gcc.dg/init-excess-2.c: New test.

gcc/c/ChangeLog:
2016-05-18  Martin Sebor  

	PR c/71115
	* c-typeck.c (error_init): Use
	expansion_point_location_if_in_system_header.
	(warning_init): Same.

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 0fa9653..2abb171 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -5871,16 +5871,21 @@ error_init (location_t loc, const char *gmsgid)
component name is taken from the spelling stack.  */
 
 static void
-pedwarn_init (location_t location, int opt, const char *gmsgid)
+pedwarn_init (location_t loc, int opt, const char *gmsgid)
 {
   char *ofwhat;
   bool warned;
 
+  /* Use the location where a macro was expanded rather than where
+ it was defined to make sure macros defined in system headers
+ but used incorrectly elsewhere are diagnosed.  */
+  source_location exploc = expansion_point_location_if_in_system_header (loc);
+
   /* The gmsgid may be a format string with %< and %>. */
-  warned = pedwarn (location, opt, gmsgid);
+  warned = pedwarn (exploc, opt, gmsgid);
   ofwhat = print_spelling ((char *) alloca (spelling_length () + 1));
   if (*ofwhat && warned)
-inform (location, "(near initialization for %qs)", ofwhat);
+inform (exploc, "(near initialization for %qs)", ofwhat);
 }
 
 /* Issue a warning for a bad initializer component.
@@ -5895,11 +5900,16 @@ warning_init (location_t loc, int opt, const char *gmsgid)
   char *ofwhat;
   bool warned;
 
+  /* Use the location where a macro was expanded rather than where
+ it was defined to make sure macros defined in system headers
+ but used incorrectly elsewhere are diagnosed.  */
+  source_location exploc = expansion_point_location_if_in_system_header (loc);
+
   /* The gmsgid may be a format string with %< and %>. */
-  warned = warning_at (loc, opt, gmsgid);
+  warned = warning_at (exploc, opt, gmsgid);
   ofwhat = print_spelling ((char *) alloca (spelling_length () + 1));
   if (*ofwhat && warned)
-inform (loc, "(near initialization for %qs)", ofwhat);
+inform (exploc, "(near initialization for %qs)", ofwhat);
 }
 
 /* If TYPE is an array type and EXPR is a parenthesized string
diff --git a/gcc/testsuite/gcc.dg/init-excess-2.c b/gcc/testsuite/gcc.dg/init-excess-2.c
new file mode 100644
index 000..1bf0a96
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/init-excess-2.c
@@ -0,0 +1,47 @@
+/* Test for diagnostics about excess initializers when using a macro
+   defined in a system header:
+   c/71115 - Missing warning: excess elements in struct initializer.  */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+#include 
+
+int* a[1] = {
+  0,
+  NULL  /* { dg-warning "excess elements|near init" } */
+};
+
+const char str[1] = {
+  0,
+  NULL  /* { dg-warning "excess elements|near init" } */
+};
+
+struct S {
+  int *a;
+} s = {
+  0,
+  NULL  /* { dg-warning "excess elements|near init" } */
+};
+
+struct __attribute__ ((designated_init)) S2 {
+  int *a;
+} s2 = {
+  NULL  /* { dg-warning "positional initialization|near init" } */
+};
+
+union U {
+  int *a;
+} u = {
+  0,
+  NULL  /* { dg-warning "excess elements|near init" } */
+};
+
+int __attribute__ ((vector_size (16))) ivec = {
+  0, 0, 0, 0,
+  NULL  /* { dg-warning "excess elements|near init" } */
+};
+
+int* scal = {
+  0,
+  NULL  /* { dg-warning "excess elements|near init" } */
+};


[PATCH] Ensure source_date_epoch is always initialised

2016-05-18 Thread James Clarke
gcc/c-family
PR preprocessor/71183
* c-common.c (get_source_date_epoch): Move to libcpp/init.c.
* c-common.h (get_source_date_epoch): Remove definition, as it
is now internal to libcpp/init.c.
* c-lex.c (c_lex_with_flags): Remove source_date_epoch
initialization, as this is now done by libcpp.

gcc/testsuite/
PR preprocessor/71183
* gcc.dg/cpp/special/date-time.c: New testcase.
* gcc.dg/cpp/special/date-time.exp: New file. Sets the
SOURCE_DATE_EPOCH environment variable for date-time.c.

libcpp/
PR preprocessor/71183
* include/cpplib.h (cpp_init_source_date_epoch): Remove
definition, as it is now internal to init.c.
* init.c (cpp_create_reader): Initialize source_date_epoch.
(get_source_date_epoch): Moved from gcc/c-family/c-common.c, and
uses cpp_error instead of fatal_error.
(cpp_init_source_date_epoch): Drop source_date_epoch argument
and call get_source_date_epoch to get the value.
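
The validation steps that get_source_date_epoch performs can be sketched as a standalone C function. This is an illustration under assumptions, not the libcpp code: the function name and the status enum are hypothetical, and it returns a status code where the real function reports a diagnostic.

```c
#include <errno.h>
#include <stdlib.h>

enum epoch_status {
  EPOCH_OK, EPOCH_UNSET, EPOCH_RANGE,
  EPOCH_NO_DIGITS, EPOCH_TRAILING, EPOCH_NEGATIVE
};

/* Parse TEXT (for example the value of getenv ("SOURCE_DATE_EPOCH"))
   as a nonnegative decimal number of seconds.  *OUT is written only
   when EPOCH_OK is returned.  */
static enum epoch_status
parse_epoch (const char *text, long long *out)
{
  char *endptr;
  long long value;

  if (!text)
    return EPOCH_UNSET;

  errno = 0;
  value = strtoll (text, &endptr, 10);
  if (errno == ERANGE)
    return EPOCH_RANGE;        /* overflow or underflow */
  if (endptr == text)
    return EPOCH_NO_DIGITS;    /* nothing was consumed */
  if (*endptr != '\0')
    return EPOCH_TRAILING;     /* garbage after the number */
  if (value < 0)
    return EPOCH_NEGATIVE;

  *out = value;
  return EPOCH_OK;
}
```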
---
 gcc/c-family/c-common.c| 33 
 gcc/c-family/c-common.h|  5 ---
 gcc/c-family/c-lex.c   |  3 --
 gcc/testsuite/gcc.dg/cpp/special/date-time.c   |  5 +++
 gcc/testsuite/gcc.dg/cpp/special/date-time.exp | 35 +
 libcpp/include/cpplib.h|  3 --
 libcpp/init.c  | 43 --
 7 files changed, 80 insertions(+), 47 deletions(-)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 146e805..83f38dd 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -12791,39 +12791,6 @@ valid_array_size_p (location_t loc, tree type, tree name)
   return true;
 }
 
-/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
-   timestamp to replace embedded current dates to get reproducible
-   results.  Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
-time_t
-get_source_date_epoch ()
-{
-  char *source_date_epoch;
-  long long epoch;
-  char *endptr;
-
-  source_date_epoch = getenv ("SOURCE_DATE_EPOCH");
-  if (!source_date_epoch)
-return (time_t) -1;
-
-  errno = 0;
-  epoch = strtoll (source_date_epoch, &endptr, 10);
-  if ((errno == ERANGE && (epoch == LLONG_MAX || epoch == LLONG_MIN))
-  || (errno != 0 && epoch == 0))
-fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
-"strtoll: %s\n", xstrerror(errno));
-  if (endptr == source_date_epoch)
-fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
-"no digits were found: %s\n", endptr);
-  if (*endptr != '\0')
-fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
-"trailing garbage: %s\n", endptr);
-  if (epoch < 0)
-fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
-"value must be nonnegative: %lld \n", epoch);
-
-  return (time_t) epoch;
-}
-
 /* Check and possibly warn if two declarations have contradictory
attributes, such as always_inline vs. noinline.  */
 
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 0ee9f56..63fd2b9 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1482,9 +1482,4 @@ extern bool valid_array_size_p (location_t, tree, tree);
 extern bool cilk_ignorable_spawn_rhs_op (tree);
 extern bool cilk_recognize_spawn (tree, tree *);
 
-/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
-   timestamp to replace embedded current dates to get reproducible
-   results.  Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
-extern time_t get_source_date_epoch (void);
-
 #endif /* ! GCC_C_COMMON_H */
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 38a428d..5bab8d1 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -389,9 +389,6 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
   enum cpp_ttype type;
   unsigned char add_flags = 0;
   enum overflow_type overflow = OT_NONE;
-  time_t source_date_epoch = get_source_date_epoch ();
-
-  cpp_init_source_date_epoch (parse_in, source_date_epoch);
 
   timevar_push (TV_CPP);
  retry:
diff --git a/gcc/testsuite/gcc.dg/cpp/special/date-time.c b/gcc/testsuite/gcc.dg/cpp/special/date-time.c
new file mode 100644
index 000..3304b75
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/special/date-time.c
@@ -0,0 +1,5 @@
+/* { dg-do preprocess } */
+__DATE__
+__TIME__
+/* { dg-final { scan-file date-time.i "\"Jul  4 1978\"" } } */
+/* { dg-final { scan-file date-time.i "\"21:24:16\"" } } */
diff --git a/gcc/testsuite/gcc.dg/cpp/special/date-time.exp b/gcc/testsuite/gcc.dg/cpp/special/date-time.exp
new file mode 100644
index 000..3c43143
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/special/date-time.exp
@@ -0,0 +1,35 @@
+#   Copyright (C) 1997-2016 Free Software Foundation, Inc.
+
+# This program 

Re: [C++ Patch/RFC] PR 70572 ("[4.9/5/6/7 Regression] ICE on code with decltype (auto) on x86_64-linux-gnu in digest_init_r")

2016-05-18 Thread Paolo Carlini

Hi,

On 18/05/2016 23:13, Jason Merrill wrote:
Shouldn't we have complained about declaring a variable with function 
type before we get here?

Ah, interesting, I think all the other compilers I have at hand don't 
even try to catch the issue so early.


In any case, something as simple as the below appears to work, I'm 
finishing testing it. An earlier version didn't have the assignment of 
error_mark_node, which I added to avoid redundant diagnostics 
later (eg, in check_field_decls for fields) and had a VAR_P (decl) check 
in the condition which doesn't seem necessary. What do you think?


Thanks,
Paolo.

/
Index: cp/decl.c
===
--- cp/decl.c   (revision 236433)
+++ cp/decl.c   (working copy)
@@ -6609,6 +6609,12 @@ cp_finish_decl (tree decl, tree init, bool init_co
adc_variable_type);
   if (type == error_mark_node)
return;
+  if (TREE_CODE (type) == FUNCTION_TYPE)
+   {
+ error ("cannot declare variable %q+D with function type", decl);
+ TREE_TYPE (decl) = error_mark_node;
+ return;
+   }
   cp_apply_type_quals_to_decl (cp_type_quals (type), decl);
 }
 
Index: testsuite/g++.dg/cpp1y/auto-fn31.C
===
--- testsuite/g++.dg/cpp1y/auto-fn31.C  (revision 0)
+++ testsuite/g++.dg/cpp1y/auto-fn31.C  (working copy)
@@ -0,0 +1,7 @@
+// PR c++/70572
+// { dg-do compile { target c++14 } }
+
+void foo ()
+{
+  decltype (auto) a = foo;  // { dg-error "cannot declare" }
+}


Re: [PATCH 2/3] function: Factor out make_*logue_seq

2016-05-18 Thread Segher Boessenkool
On Wed, May 18, 2016 at 01:35:16PM -0500, Segher Boessenkool wrote:
> On Wed, May 18, 2016 at 11:20:29AM -0700, H.J. Lu wrote:
> > >> > * function.c (make_split_prologue_seq, make_prologue_seq,
> > >> > make_epilogue_seq): New functions, factored out from...
> > >> > (thread_prologue_and_epilogue_insns): Here.
> > >>
> > >> It breaks x86:
> > >
> > > Are you sure it is this patch causing it?  As noted, it was tested on x86.
> > 
> > I am pretty sure.  How did you test it on x86?
> 
> "make -k check".  I'll test 32-bit now.

Actually, it also fails on 64 bit.  It passed my testing because it does
not fail together with patch 3/3, and does not fail on powerpc at all.


Segher


Re: [PATCH 0/3] Support for mandatory tail calls

2016-05-18 Thread Jeff Law

On 05/17/2016 04:01 PM, David Malcolm wrote:

There have been requests [1] for libgccjit to better support
functional programming by supporting the continuation-passing style,
in which every function "returns" by calling a "continuation"
function pointer.

These calls must be guaranteed to be implemented as a jump,
otherwise the program could consume an arbitrary amount of stack
space as it executed.

This patch kit implements this.

Patch 1 is a preliminary tweak to calls.c

Patch 2 implements a new flag in tree.h: CALL_EXPR_MUST_TAIL_CALL,
which makes calls.c try harder to implement a flagged call as a
tail-call/sibling call, and makes it issue an error if
the optimization is impossible.  It doesn't implement any
frontend support for setting the flag (instead using a plugin
to test it).  We had some discussion on the jit list about possibly
introducing a new builtin for this, but the patch punts on this
issue.

I wonder if we should have an attribute so that the flag can be set for 
C/C++ code.  I've seen requests for forcing tail calls in C/C++ code 
several times in the past, precisely to support continuations.


Jeff



Re: [PATCH] PR driver/69265: add hint for options with misspelled arguments

2016-05-18 Thread Jeff Law

On 05/09/2016 06:14 PM, David Malcolm wrote:

opts-common.c's cmdline_handle_error handles invalid arguments
for options with CL_ERR_ENUM_ARG by building a string listing the
valid arguments.  By also building a vec of valid arguments, we
can use find_closest_string and provide a hint if we see a close
misspelling.
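
The hinting approach described above, picking the valid argument nearest to what the user typed, can be sketched in C with a plain Levenshtein distance. This is not the find_closest_string implementation from spellcheck.c. The function names, the 63-character limit, and the "at most half the argument's length" cutoff are assumptions chosen for the example.

```c
#include <stddef.h>
#include <string.h>

/* Levenshtein distance between A and B, using one rolling row.
   Returns (size_t) -1 if B is longer than 63 characters.  */
static size_t
edit_distance (const char *a, const char *b)
{
  size_t la = strlen (a), lb = strlen (b);
  size_t row[64];

  if (lb >= 64)
    return (size_t) -1;
  for (size_t j = 0; j <= lb; j++)
    row[j] = j;
  for (size_t i = 1; i <= la; i++)
    {
      size_t diag = row[0];          /* D[i-1][j-1] */
      row[0] = i;
      for (size_t j = 1; j <= lb; j++)
        {
          size_t up = row[j];        /* D[i-1][j] */
          size_t best = diag + (a[i - 1] == b[j - 1] ? 0 : 1);
          if (up + 1 < best)
            best = up + 1;
          if (row[j - 1] + 1 < best)
            best = row[j - 1] + 1;
          row[j] = best;
          diag = up;
        }
    }
  return row[lb];
}

/* Return the candidate closest to ARG, or NULL if none is close
   enough to be a plausible misspelling.  */
static const char *
closest_candidate (const char *arg, const char *const *candidates, size_t n)
{
  const char *best = NULL;
  size_t best_dist = (size_t) -1;

  for (size_t i = 0; i < n; i++)
    {
      size_t d = edit_distance (arg, candidates[i]);
      if (d < best_dist)
        {
          best_dist = d;
          best = candidates[i];
        }
    }
  if (best && best_dist * 2 > strlen (arg))
    return NULL;
  return best;
}
```

A caller would pass the vec of valid arguments as the candidate array and print a "did you mean" hint only when the result is non-NULL.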

Successfully bootstrapped on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
PR driver/69265
* Makefile.in (GCC_OBJS): Move spellcheck.o to...
(OBJS-libcommon-target): ...here.
* opts-common.c: Include spellcheck.h.
(cmdline_handle_error): Build a vec of valid options and use it
to provide hints for misspelled arguments.

gcc/testsuite/ChangeLog:
PR driver/69265
* gcc.dg/spellcheck-options-11.c: New test case.

OK.
jeff



Re: [PATCH] Make basic asm implicitly clobber memory

2016-05-18 Thread Jeff Law

On 05/07/2016 11:38 AM, Andrew Haley wrote:

On 06/05/16 07:35, David Wohlferd wrote:


1) I'm not clear precisely what problem this patch fixes.  It's true
that some people have incorrectly assumed that basic asm clobbers
memory and this change would fix their code.  But some people also
incorrectly assume it clobbers registers.  I assume that's why Jeff
Law proposed making basic asm "an opaque blob that
read/write/clobber any register or memory location."


A few more things:

Jeff Law did propose this, but it's impossible to do because it
inevitably causes reload failures.

Right.



My argument in support of Bernd's proposal is that it makes sense from
a *practical* software reliability point of view.  It wouldn't hurt,
and might fix some significant bugs.  It's similar to the targets
which always implicitly clobber "cc".  It corresponds to what I always
assumed basic asm did, and I'm sure that I'm not alone.  This change
might fix some real bugs and it is extremely unlikely to break
anything.
And by making basic asms use/clobber memory in particular, it means code 
using them is less likely to break as the optimizers continue to get 
smarter about memory loads/stores.
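
For comparison, this is how the memory dependence is spelled out explicitly today with extended asm, using the GNU C "memory" clobber. It is a minimal sketch with hypothetical names. The proposal under discussion would make a basic asm statement behave like this with respect to memory without the explicit clobber.

```c
static int shared_flag;

/* Store V, then force the compiler to assume memory may have been
   read or written before the value is loaded again.  */
static int
store_then_reload (int v)
{
  shared_flag = v;
  /* Empty extended asm with an explicit "memory" clobber: the
     compiler may not cache shared_flag in a register across it.  */
  __asm__ __volatile__ ("" : : : "memory");
  return shared_flag;
}
```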


I haven't gone through the actual patch yet, but I like its basic goals.

Jeff



Re: [PATCH] Re-use cc1-checksum.c for stage-final

2016-05-18 Thread Jeff Law

On 05/03/2016 04:32 AM, Richard Biener wrote:

On Tue, 3 May 2016, Richard Biener wrote:


On Mon, 2 May 2016, Jeff Law wrote:


On 04/29/2016 05:36 AM, Richard Biener wrote:

On Thu, 28 Apr 2016, Jeff Law wrote:


On 04/28/2016 02:49 AM, Richard Biener wrote:


The following prototype patch re-uses cc1-checksum.c from the
previous stage when compiling stage-final.  This eventually
allows comparing cc1 from the last two stages to fix the
lack of a true comparison when doing LTO bootstrap (it
compiles LTO bytecode from the compile-stage there, not the
final optimization result).

Bootstrapped on x86_64-unknown-linux-gnu.

When stripping gcc/cc1 and prev-gcc/cc1 after the bootstrap
they now compare identical (with LTO bootstrap it should
not require stripping as that doesn't do a bootstrap-debug AFAIK).

Is something like this acceptable?  (consider it also done for
cp/Make-lang.in)

In theory we can compare all stage1 languages but I guess comparing
the required ones for a LTO bootstrap, cc1, cc1plus and lto1 would
be sufficient (or even just comparing one binary in which case
comparing lto1 would not require any patches).

This also gets rid of the annoying warning that cc1-checksum.o
differs (obviously).

Thanks,
Richard.

2016-04-28  Richard Biener  

c/
* Make-lang.in (cc1-checksum.c): For stage-final re-use
the checksum from the previous stage.

I won't object if you add a comment into the fragment indicating why
you're
doing this.


So the following is a complete patch (not considering people may
add objc or obj-c++ to stage1 languages).  Built with --disable-bootstrap,
bootstrapped and profilebootstrapped with verifying it works as
intended (looks like we don't compare with profiledbootstrap - huh,
we're building stagefeedback only once)

Ok for trunk?

Step 2 will now be to figure out how to also compare cc1 (for example)
when using bootstrap-lto ... (we don't want to do this unconditionally
as it is a waste of time when the objects are not only LTO bytecode).

Thanks,
Richard.

2016-04-29  Richard Biener  

c/
* Make-lang.in (cc1-checksum.c): For stage-final re-use
the checksum from the previous stage.

cp/
* Make-lang.in (cc1plus-checksum.c): For stage-final re-use
the checksum from the previous stage.

LGTM.
jeff


Thanks - applied as rev. 235804.

I'll now play with some way to add additional compare objects.  Thinking
of adding sth like

EXTRA_COMPARE_OBJS = lto1 cc1 cc1plus

to bootstrap-lto.mk for example.


To my surprise this works.

LTO bootstrapped on x86_64-unknown-linux-gnu - I've added an additional
echo comparing $$f1 $$f2 which then shows

...
comparing /abuild/rguenther/obj/stage2-zlib/libz_a-inftrees.o
/abuild/rguenther/obj/stage3-zlib/libz_a-inftrees.o
comparing /abuild/rguenther/obj/stage2-cc1
/abuild/rguenther/obj/stage3-cc1
comparing /abuild/rguenther/obj/stage2-cc1plus
/abuild/rguenther/obj/stage3-cc1plus
comparing /abuild/rguenther/obj/stage2-lto1
/abuild/rguenther/obj/stage3-lto1
Comparison successful.

Ok for trunk?  This probably slows down the compare phase for
LTO bootstrap a bit (and LTO IL of the .o files is still compared).
I'm also not 100% sure that what works on .o files works on
executables on all targets (hmm, and I suppose I might miss
some exec-suffix?  Ah, there is $(exeext) but not available in
the toplevel makefile yet.).

Thanks,
Richard.

2016-05-03  Richard Biener  

* Makefile.tpl: Also compare EXTRA_COMPARE_OBJS.
* Makefile.in: Regenerate.

config/
* bootstrap-lto.mk: Add cc1, cc1plus and lto1 to EXTRA_COMPARE_OBJS.

LGTM.
jeff
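
As a rough sketch of what "comparing" two stage outputs amounts to, the helper below checks two files byte for byte. It is not the logic in Makefile.tpl; the function name same_bytes and its skip parameter (bytes to ignore at the start of each file) are assumptions made for this illustration.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper, not part of the GCC build machinery: returns true when
// both files can be opened and hold identical bytes after the first `skip`
// bytes of each.
bool same_bytes(const std::string& path_a, const std::string& path_b,
                std::streamoff skip = 0) {
    std::ifstream fa(path_a, std::ios::binary);
    std::ifstream fb(path_b, std::ios::binary);
    if (!fa || !fb)
        return false;                       // a missing file never compares equal

    fa.seekg(skip, std::ios::beg);
    fb.seekg(skip, std::ios::beg);

    std::istreambuf_iterator<char> ia(fa), ib(fb), end;
    while (ia != end && ib != end) {
        if (*ia != *ib)
            return false;                   // first differing byte
        ++ia;
        ++ib;
    }
    return ia == end && ib == end;          // both must end at the same point
}
```

A caller would use it along the lines of same_bytes("stage2-cc1", "stage3-cc1"), with the paths here being placeholders.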



Re: [PATCH] Clean up tests where a later dg-do completely overrides another.

2016-05-18 Thread Jeff Law

On 05/02/2016 10:24 AM, Dominik Vogt wrote:

On Mon, May 02, 2016 at 09:29:50AM -0600, Jeff Law wrote:

On 04/29/2016 05:56 PM, Dominik Vogt wrote:

...
Maybe a comment should be added to the test case

 /* If this test is *run* (not just compiled) and therefore fails
on non-sh* targets, this is because of a bug in older DejaGnu
versions.  This is fixed with DejaGnu-1.6.  */

I think we have a couple issues now that are resolved if we step
forward to a newer version of dejagnu.

Given dejagnu-1.6 was recently released, should we just bite the
bullet and ask everyone to step forward?


I'm all for that.  I've recently added s390 test cases that
require Dejagnu 1.6.  Apart from the discussed problem with
spec-options.c, there are a number of Power (and some other
target) test cases that do not work properly with older Dejagnu
versions but would finally work (read: actually test something) if
the new version were required.

FWIW, Fedora 24 uses dejagnu-1.6.  Not sure about other distributions.

jeff


Re: [PATCH] libiberty: support demangling of rvalue reference typenames

2016-05-18 Thread Jeff Law

On 01/04/2016 06:43 PM, Artemiy Volkov wrote:

2016-01-04  Artemiy Volkov  

* cplus-dem.c (enum type_kind_t): Add tk_rvalue_reference
constant.
(demangle_template_value_parm): Handle tk_rvalue_reference
type kind.
(do_type): Support 'O' type id (rvalue references).

* testsuite/demangle-expected: Add tests.
Thanks.  I've installed this on the trunk after fixing some minor 
whitespace issues.


jeff
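
To show what an rvalue-reference type looks like once demangled, here is a small sketch. Note the assumption: it calls abi::__cxa_demangle from <cxxabi.h> (the Itanium C++ ABI runtime interface shipped with GCC and Clang), not the cplus-dem.c code this patch touches; the wrapper name demangle is made up for the example.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical wrapper around the ABI runtime demangler.  Returns the input
// unchanged when demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr)
        return mangled;
    std::string result(out);
    std::free(out);             // the buffer is malloc-allocated by the runtime
    return result;
}
```

In the Itanium mangling, "_Z1fOi" names a function f taking an rvalue reference to int, where 'O' is the rvalue-reference marker, so demangle("_Z1fOi") yields "f(int&&)".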



Re: [C++ Patch/RFC] PR 70572 ("[4.9/5/6/7 Regression] ICE on code with decltype (auto) on x86_64-linux-gnu in digest_init_r")

2016-05-18 Thread Jason Merrill

On 05/18/2016 11:48 AM, Paolo Carlini wrote:

Hi,

this issue should be easy to fix. Broken code like:

void foo ()
{
  decltype (auto) a = foo;
}

triggers the gcc_assert in digest_init_r:

  /* Come here only for aggregates: records, arrays, unions, complex
numbers
 and vectors.  */
  gcc_assert (TREE_CODE (type) == ARRAY_TYPE
  || VECTOR_TYPE_P (type)
  || TREE_CODE (type) == RECORD_TYPE
  || TREE_CODE (type) == UNION_TYPE
  || TREE_CODE (type) == COMPLEX_TYPE);

because of course TREE_CODE (type) == FUNCTION_TYPE, none of the above.
I said should be easy to fix because in fact convert_for_initialization
is perfectly able to handle these cases and emit proper diagnostic, if
called. What shall we do then? The patchlet below passes testing but we
could also relax the gcc_assert itself, include FUNCTION_TYPE
with/without checking cxx_dialect >= cxx14. We could drop the latter
check in my patchlet. Or something else entirely.

Thanks!
Paolo.




Shouldn't we have complained about declaring a variable with function 
type before we get here?


Jason
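
For context on why the testcase is ill-formed, the sketch below shows how the deduction goes for a function name (C++14 or later assumed). With plain auto the function decays to a pointer, while decltype (auto) keeps the declared type of an unparenthesized name, which for foo is the function type itself. The function name function_deduction_demo is made up for this illustration.

```cpp
#include <type_traits>

void foo() {}

// Illustration only: well-formed neighbours of the broken declaration.
bool function_deduction_demo() {
    auto a = foo;                 // function-to-pointer decay: void (*)()
    decltype(auto) p = &foo;      // type of the expression &foo: void (*)()
    decltype(auto) r = (foo);     // parenthesized lvalue: void (&)()

    static_assert(std::is_same<decltype(a), void (*)()>::value, "auto decays");
    static_assert(std::is_same<decltype(p), void (*)()>::value, "pointer kept");
    static_assert(std::is_same<decltype(r), void (&)()>::value, "reference kept");

    // decltype (auto) bad = foo;  // ill-formed: would declare a variable of
    //                             // function type void ()

    return a == p && &r == p;     // all three designate the same function
}
```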



Re: [PATCH] Make C++ honor the enum mode attribute

2016-05-18 Thread Jason Merrill

OK.

Jason


Re: [PATCH] Fix another case of noreturn call with TREE_ADDRESSABLE type on lhs (PR c++/71100)

2016-05-18 Thread Jason Merrill
OK.

Jason


On Wed, May 18, 2016 at 4:55 PM, Jakub Jelinek  wrote:
> Hi!
>
> In 6+ we require that lhs is present if the return type of a call is
> TREE_ADDRESSABLE, apparently this function has been missed.
>
> Fixed thusly, bootstrapped/regtested (at r236371, later trunk fails
> miserably on both x86_64-linux and i686-linux), ok for trunk/6.2?
>
> 2016-05-18  Jakub Jelinek  
>
> PR c++/71100
> * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Don't drop
> lhs if it has TREE_ADDRESSABLE type.
>
> * g++.dg/opt/pr71100.C: New test.
>
> --- gcc/cgraph.c.jj 2016-05-16 18:53:43.0 +0200
> +++ gcc/cgraph.c 2016-05-18 10:53:39.491189469 +0200
> @@ -1515,7 +1515,8 @@ cgraph_edge::redirect_call_stmt_to_calle
>/* If the call becomes noreturn, remove the LHS if possible.  */
>if (lhs
>&& (gimple_call_flags (new_stmt) & ECF_NORETURN)
> -  && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST)
> +  && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST
> +  && !TREE_ADDRESSABLE (TREE_TYPE (lhs)))
>  {
>if (TREE_CODE (lhs) == SSA_NAME)
> {
> --- gcc/testsuite/g++.dg/opt/pr71100.C.jj   2016-05-18 10:56:12.798070065 +0200
> +++ gcc/testsuite/g++.dg/opt/pr71100.C  2016-05-18 10:56:07.974136754 +0200
> @@ -0,0 +1,18 @@
> +// PR c++/71100
> +// { dg-do compile }
> +// { dg-options "-O2" }
> +
> +struct D { ~D (); };
> +struct E { D foo () { throw 1; } };
> +
> +inline void
> +bar (D (E::*f) (), E *o)
> +{
> +  (o->*f) ();
> +}
> +
> +void
> +baz (E *o)
> +{
> +  bar (&E::foo, o);
> +}
>
> Jakub


[PATCH] Fix up vec_set_* for -mavx512vl -mno-avx512dq

2016-05-18 Thread Jakub Jelinek
Hi!

The vinsert[if]64x2 instructions are AVX512VL & AVX512DQ, so
if only AVX512VL is on, we should emit the other insns - 32x4,
which without masking do the same thing.
With masking, we have to require TARGET_AVX512DQ.
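
A scalar model may help show why the element size only matters once masking is involved. This is a sketch under my own assumptions, not the instruction semantics copied from a manual: registers are modelled as byte arrays, and the names Vec128, Vec256, insert128 and masked_insert128 are invented for the example.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using Vec128 = std::array<std::uint8_t, 16>;
using Vec256 = std::array<std::uint8_t, 32>;

// Unmasked insert: copy src and overwrite 128-bit half `imm` (0 = low,
// 1 = high).  Nothing here depends on whether the lane is viewed as 4 x 32
// or 2 x 64 bits.
Vec256 insert128(const Vec256& src, const Vec128& lane, int imm) {
    Vec256 out = src;
    std::memcpy(out.data() + 16 * (imm & 1), lane.data(), 16);
    return out;
}

// Merge-masked insert: mask bit i selects element i of the insert result,
// otherwise the element comes from passthru.  elem_bytes is 4 for the 32x4
// view and 8 for the 64x2 view, so the same mask means different things.
Vec256 masked_insert128(const Vec256& passthru, const Vec256& src,
                        const Vec128& lane, int imm, unsigned mask,
                        int elem_bytes) {
    const Vec256 full = insert128(src, lane, imm);
    Vec256 out = passthru;
    for (int i = 0; i < 32 / elem_bytes; ++i)
        if (mask & (1u << i))
            std::memcpy(out.data() + i * elem_bytes,
                        full.data() + i * elem_bytes, elem_bytes);
    return out;
}
```

With an all-ones mask both views reproduce insert128, but with mask 0x1 the 32-bit view updates 4 bytes and the 64-bit view updates 8, which is the reason the masked patterns cannot simply substitute one instruction for the other.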

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-18  Jakub Jelinek  

* config/i386/sse.md (vec_set_lo_,
vec_set_hi_): Add && 
condition.  For !TARGET_AVX512DQ, emit 32x4 instruction instead
of 64x2.

* gcc.target/i386/avx512dq-vinsert-1.c: New test.
* gcc.target/i386/avx512vl-vinsert-1.c: New test.

--- gcc/config/i386/sse.md.jj   2016-05-18 15:02:54.0 +0200
+++ gcc/config/i386/sse.md  2016-05-18 15:54:20.944236472 +0200
@@ -17823,10 +17823,12 @@ (define_insn "vec_set_lo_
(match_operand:VI8F_256 1 "register_operand" "v")
(parallel [(const_int 2) (const_int 3)]]
-  "TARGET_AVX"
+  "TARGET_AVX && "
 {
-  if (TARGET_AVX512VL)
+  if (TARGET_AVX512DQ)
 return "vinsert64x2\t{$0x0, %2, %1, 
%0|%0, %1, %2, 0x0}";
+  else if (TARGET_AVX512VL)
+return "vinsert32x4\t{$0x0, %2, %1, 
%0|%0, %1, %2, 0x0}";
   else
 return "vinsert\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}";
 }
@@ -17843,10 +17845,12 @@ (define_insn "vec_set_hi_ 2 "nonimmediate_operand" "vm")))]
-  "TARGET_AVX"
+  "TARGET_AVX && "
 {
-  if (TARGET_AVX512VL)
+  if (TARGET_AVX512DQ)
 return "vinsert64x2\t{$0x1, %2, %1, 
%0|%0, %1, %2, 0x1}";
+  else if (TARGET_AVX512VL)
+return "vinsert32x4\t{$0x1, %2, %1, 
%0|%0, %1, %2, 0x1}";
   else
 return "vinsert\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}";
 }
--- gcc/testsuite/gcc.target/i386/avx512dq-vinsert-1.c.jj   2016-05-18 16:08:48.572351388 +0200
+++ gcc/testsuite/gcc.target/i386/avx512dq-vinsert-1.c  2016-05-18 16:09:18.114947627 +0200
@@ -0,0 +1,100 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mavx512dq -masm=att" } */
+
+typedef int V1 __attribute__((vector_size (32)));
+typedef long long V2 __attribute__((vector_size (32)));
+typedef float V3 __attribute__((vector_size (32)));
+typedef double V4 __attribute__((vector_size (32)));
+
+void
+f1 (V1 x, int y)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[3] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f2 (V1 x, int y)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[6] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f3 (V2 x, long long y)
+{
+  register V2 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[1] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f4 (V2 x, long long y)
+{
+  register V2 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[3] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f5 (V3 x, float y)
+{
+  register V3 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[3] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f6 (V3 x, float y)
+{
+  register V3 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[6] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f7 (V4 x, double y)
+{
+  register V4 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[1] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f8 (V4 x, double y)
+{
+  register V4 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[3] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler-times "vinserti32x4\[^\n\r]*0x0\[^\n\r]*%ymm16" 1 } } */
+/* { dg-final { scan-assembler-times "vinserti32x4\[^\n\r]*0x1\[^\n\r]*%ymm16" 1 } } */
+/* { dg-final { scan-assembler-times "vinsertf32x4\[^\n\r]*0x0\[^\n\r]*%ymm16" 1 } } */
+/* { dg-final { scan-assembler-times "vinsertf32x4\[^\n\r]*0x1\[^\n\r]*%ymm16" 1 } } */
+/* { dg-final { scan-assembler-times "vextracti32x4\[^\n\r]*0x1\[^\n\r]*%\[yz]mm16" 1 } } */
+/* { dg-final { scan-assembler-times "vextractf32x4\[^\n\r]*0x1\[^\n\r]*%\[yz]mm16" 1 } } */
+/* { dg-final { scan-assembler-times "vinserti64x2\[^\n\r]*0x0\[^\n\r]*%ymm16" 1 } } */
+/* { dg-final { scan-assembler-times "vinserti64x2\[^\n\r]*0x1\[^\n\r]*%ymm16" 1 } } */
+/* { dg-final { scan-assembler-times "vinsertf64x2\[^\n\r]*0x0\[^\n\r]*%ymm16" 1 } } */
+/* { dg-final { scan-assembler-times "vinsertf64x2\[^\n\r]*0x1\[^\n\r]*%ymm16" 1 } } */
+/* { dg-final { scan-assembler-times "vextracti64x2\[^\n\r]*0x1\[^\n\r]*%\[yz]mm16" 1 } } */
+/* { dg-final { scan-assembler-times "vextractf64x2\[^\n\r]*0x1\[^\n\r]*%\[yz]mm16" 1 } } */
--- gcc/testsuite/gcc.target/i386/avx512vl-vinsert-1.c.jj   2016-05-18 16:07:03.928781560 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-vinsert-1.c  2016-05-18 16:08:29.500612043 +0200
@@ -0,0 +1,98 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mno-avx512dq -masm=att" } */
+
+typedef int V1 __attribute__((vector_size (32)));
+typedef long long V2 __attribute__((vector_size (32)));
+typedef float V3 

[PATCH] Improve XMM16+ handling in vec_set*

2016-05-18 Thread Jakub Jelinek
Hi!

vinserti32x4 is in AVX512VL.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-18  Jakub Jelinek  

* config/i386/sse.md (vec_set_lo_v16hi, vec_set_hi_v16hi,
vec_set_lo_v32qi, vec_set_hi_v32qi): Add alternative with
v constraint instead of x and vinserti32x4 insn.

* gcc.target/i386/avx512vl-vinserti32x4-3.c: New test.

--- gcc/config/i386/sse.md.jj   2016-05-18 13:21:35.0 +0200
+++ gcc/config/i386/sse.md  2016-05-18 15:02:54.574685438 +0200
@@ -17899,47 +17899,50 @@ (define_insn "vec_set_hi_")])
 
 (define_insn "vec_set_lo_v16hi"
-  [(set (match_operand:V16HI 0 "register_operand" "=x")
+  [(set (match_operand:V16HI 0 "register_operand" "=x,v")
(vec_concat:V16HI
- (match_operand:V8HI 2 "nonimmediate_operand" "xm")
+ (match_operand:V8HI 2 "nonimmediate_operand" "xm,vm")
  (vec_select:V8HI
-   (match_operand:V16HI 1 "register_operand" "x")
+   (match_operand:V16HI 1 "register_operand" "x,v")
(parallel [(const_int 8) (const_int 9)
   (const_int 10) (const_int 11)
   (const_int 12) (const_int 13)
   (const_int 14) (const_int 15)]]
   "TARGET_AVX"
-  "vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
+  "@vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}
+   vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
   [(set_attr "type" "sselog")
(set_attr "prefix_extra" "1")
(set_attr "length_immediate" "1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "vex,evex")
(set_attr "mode" "OI")])
 
 (define_insn "vec_set_hi_v16hi"
-  [(set (match_operand:V16HI 0 "register_operand" "=x")
+  [(set (match_operand:V16HI 0 "register_operand" "=x,v")
(vec_concat:V16HI
  (vec_select:V8HI
-   (match_operand:V16HI 1 "register_operand" "x")
+   (match_operand:V16HI 1 "register_operand" "x,v")
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
   (const_int 6) (const_int 7)]))
- (match_operand:V8HI 2 "nonimmediate_operand" "xm")))]
+ (match_operand:V8HI 2 "nonimmediate_operand" "xm,vm")))]
   "TARGET_AVX"
-  "vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
+  "@
+   vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}
+   vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
   [(set_attr "type" "sselog")
(set_attr "prefix_extra" "1")
(set_attr "length_immediate" "1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "vex,evex")
(set_attr "mode" "OI")])
 
 (define_insn "vec_set_lo_v32qi"
-  [(set (match_operand:V32QI 0 "register_operand" "=x")
+  [(set (match_operand:V32QI 0 "register_operand" "=x,v")
(vec_concat:V32QI
- (match_operand:V16QI 2 "nonimmediate_operand" "xm")
+ (match_operand:V16QI 2 "nonimmediate_operand" "xm,v")
  (vec_select:V16QI
-   (match_operand:V32QI 1 "register_operand" "x")
+   (match_operand:V32QI 1 "register_operand" "x,v")
(parallel [(const_int 16) (const_int 17)
   (const_int 18) (const_int 19)
   (const_int 20) (const_int 21)
@@ -17949,18 +17952,20 @@ (define_insn "vec_set_lo_v32qi"
   (const_int 28) (const_int 29)
   (const_int 30) (const_int 31)]]
   "TARGET_AVX"
-  "vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
+  "@
+   vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}
+   vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
   [(set_attr "type" "sselog")
(set_attr "prefix_extra" "1")
(set_attr "length_immediate" "1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "vex,evex")
(set_attr "mode" "OI")])
 
 (define_insn "vec_set_hi_v32qi"
-  [(set (match_operand:V32QI 0 "register_operand" "=x")
+  [(set (match_operand:V32QI 0 "register_operand" "=x,v")
(vec_concat:V32QI
  (vec_select:V16QI
-   (match_operand:V32QI 1 "register_operand" "x")
+   (match_operand:V32QI 1 "register_operand" "x,v")
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
@@ -17969,13 +17974,15 @@ (define_insn "vec_set_hi_v32qi"
   (const_int 10) (const_int 11)
   (const_int 12) (const_int 13)
   (const_int 14) (const_int 15)]))
- (match_operand:V16QI 2 "nonimmediate_operand" "xm")))]
+ (match_operand:V16QI 2 "nonimmediate_operand" "xm,vm")))]
   "TARGET_AVX"
-  "vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
+  "@
+   vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}
+   vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
   [(set_attr "type" "sselog")
(set_attr "prefix_extra" "1")
(set_attr "length_immediate" "1")
-   (set_attr "prefix" "vex")
+   

[PATCH] Improve 128-bit to 256-bit broadcasts

2016-05-18 Thread Jakub Jelinek
Hi!

vbroadcast[fi]32x4 and vinsert[fi]32x4 are in AVX512VL,
vbroadcast[fi]64x2 and vinsert[fi]64x2 are in AVX512VL & AVX512DQ.
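
The patch below pairs each broadcast alternative with an insert alternative where the input is already in the destination register. A scalar model (my own sketch, with invented names Lane128, Reg256, broadcast128 and insert_high128) shows why the two forms give the same result in that case.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using Lane128 = std::array<std::uint8_t, 16>;
using Reg256 = std::array<std::uint8_t, 32>;

// 128-bit to 256-bit broadcast: both halves of the result hold the lane.
Reg256 broadcast128(const Lane128& lane) {
    Reg256 out{};
    std::memcpy(out.data(), lane.data(), 16);
    std::memcpy(out.data() + 16, lane.data(), 16);
    return out;
}

// Insert into the high half only, keeping whatever dst has in the low half.
Reg256 insert_high128(const Reg256& dst, const Lane128& lane) {
    Reg256 out = dst;
    std::memcpy(out.data() + 16, lane.data(), 16);
    return out;
}
```

When the low half of dst already equals the lane, insert_high128(dst, lane) equals broadcast128(lane), which is the situation the tied-operand alternatives describe.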

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-18  Jakub Jelinek  

* config/i386/sse.md (i128vldq): New mode iterator.
(avx2_vbroadcasti128_, avx_vbroadcastf128_): Add
avx512dq and avx512vl alternatives.

* gcc.target/i386/avx512dq-vbroadcast-2.c: New test.
* gcc.target/i386/avx512vl-vbroadcast-2.c: New test.

--- gcc/config/i386/sse.md.jj   2016-05-18 12:30:50.0 +0200
+++ gcc/config/i386/sse.md  2016-05-18 13:21:35.339616623 +0200
@@ -778,6 +778,12 @@ (define_mode_attr i128
(V64QI "i64x4") (V32QI "%~128") (V32HI "i64x4") (V16HI "%~128")
(V16SI "i64x4") (V8SI "%~128") (V8DI "i64x4") (V4DI "%~128")])
 
+;; For 256-bit modes for TARGET_AVX512VL && TARGET_AVX512DQ
+;; i32x4, f32x4, i64x2 or f64x2 suffixes.
+(define_mode_attr i128vldq
+  [(V8SF "f32x4") (V4DF "f64x2")
+   (V32QI "i32x4") (V16HI "i32x4") (V8SI "i32x4") (V4DI "i64x2")])
+
 ;; Mix-n-match
 (define_mode_iterator AVX256MODE2P [V8SI V8SF V4DF])
 (define_mode_iterator AVX512MODE2P [V16SI V16SF V8DF])
@@ -17038,15 +17044,19 @@ (define_insn "*vec_dupv2di"
(set_attr "mode" "TI,TI,DF,V4SF")])
 
 (define_insn "avx2_vbroadcasti128_"
-  [(set (match_operand:VI_256 0 "register_operand" "=x")
+  [(set (match_operand:VI_256 0 "register_operand" "=x,v,v")
(vec_concat:VI_256
- (match_operand: 1 "memory_operand" "m")
+ (match_operand: 1 "memory_operand" "m,m,m")
  (match_dup 1)))]
   "TARGET_AVX2"
-  "vbroadcasti128\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssemov")
+  "@
+   vbroadcasti128\t{%1, %0|%0, %1}
+   vbroadcast\t{%1, %0|%0, %1}
+   vbroadcast32x4\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "*,avx512dq,avx512vl")
+   (set_attr "type" "ssemov")
(set_attr "prefix_extra" "1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "vex,evex,evex")
(set_attr "mode" "OI")])
 
 ;; Modes handled by AVX vec_dup patterns.
@@ -17123,19 +17133,24 @@ (define_split
   "operands[2] = gen_lowpart (mode, operands[0]);")
 
 (define_insn "avx_vbroadcastf128_"
-  [(set (match_operand:V_256 0 "register_operand" "=x,x,x")
+  [(set (match_operand:V_256 0 "register_operand" "=x,x,x,v,v,v,v")
(vec_concat:V_256
- (match_operand: 1 "nonimmediate_operand" "m,0,?x")
+ (match_operand: 1 "nonimmediate_operand" 
"m,0,?x,m,0,m,0")
  (match_dup 1)))]
   "TARGET_AVX"
   "@
vbroadcast\t{%1, %0|%0, %1}
vinsert\t{$1, %1, %0, %0|%0, %0, %1, 1}
-   vperm2\t{$0, %t1, %t1, %0|%0, %t1, %t1, 0}"
-  [(set_attr "type" "ssemov,sselog1,sselog1")
+   vperm2\t{$0, %t1, %t1, %0|%0, %t1, %t1, 0}
+   vbroadcast\t{%1, %0|%0, %1}
+   vinsert\t{$1, %1, %0, %0|%0, %0, %1, 1}
+   vbroadcast32x4\t{%1, %0|%0, %1}
+   vinsert32x4\t{$1, %1, %0, %0|%0, %0, %1, 1}"
+  [(set_attr "isa" "*,*,*,avx512dq,avx512dq,avx512vl,avx512vl")
+   (set_attr "type" "ssemov,sselog1,sselog1,ssemov,sselog1,ssemov,sselog1")
(set_attr "prefix_extra" "1")
-   (set_attr "length_immediate" "0,1,1")
-   (set_attr "prefix" "vex")
+   (set_attr "length_immediate" "0,1,1,0,1,0,1")
+   (set_attr "prefix" "vex,vex,vex,evex,evex,evex,evex")
(set_attr "mode" "")])
 
 ;; For broadcast[i|f]32x2.  Yes there is no v4sf version, only v4si.
--- gcc/testsuite/gcc.target/i386/avx512dq-vbroadcast-2.c.jj 2016-05-18 13:46:05.757523635 +0200
+++ gcc/testsuite/gcc.target/i386/avx512dq-vbroadcast-2.c   2016-05-18 13:50:31.330891648 +0200
@@ -0,0 +1,49 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mavx512dq" } */
+
+#include 
+
+void
+f1 (__m128i x)
+{
+  register __m128i a __asm ("xmm16");
+  register __m256i c;
+  a = x;
+  asm volatile ("" : "+v" (a));
+  c = _mm256_broadcastsi128_si256 (a);
+  register __m256i b __asm ("xmm16");
+  b = c;
+  asm volatile ("" : "+v" (b));
+}
+
+/* { dg-final { scan-assembler "vinserti64x2\[^\n\r]*(xmm16\[^\n\r]*ymm16\[^\n\r]*ymm16|ymm16\[^\n\r]*ymm16\[^\n\r]*xmm16)" } } */
+
+void
+f2 (__m128i *x)
+{
+  register __m256i a __asm ("xmm16");
+  a = _mm256_broadcastsi128_si256 (*x);
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler "vbroadcasti64x2\[^\n\r]*ymm16" } } */
+
+void
+f3 (__m128 *x)
+{
+  register __m256 a __asm ("xmm16");
+  a = _mm256_broadcast_ps (x);
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler "vbroadcastf32x4\[^\n\r]*ymm16" } } */
+
+void
+f4 (__m128d *x)
+{
+  register __m256d a __asm ("xmm16");
+  a = _mm256_broadcast_pd (x);
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler "vbroadcastf64x2\[^\n\r]*ymm16" } } */
--- gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-2.c.jj 2016-05-18 13:45:40.449869743 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-2.c   2016-05-18 13:50:46.922678414 +0200
@@ -0,0 +1,47 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { 

[PATCH] Improve XMM16-31 handling in various *vec_dup* patterns

2016-05-18 Thread Jakub Jelinek
Hi!

These instructions are available in AVX512VL, so we can use
XMM16+ in there.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-18  Jakub Jelinek  

* config/i386/sse.md (avx2_vec_dupv4df): Use v instead of x
constraint, use maybe_evex prefix instead of vex.
(vec_dupv4sf): Use v constraint instead of x for output
operand except for noavx alternative, use Yv constraint
instead of x for input.  Use maybe_evex prefix instead of vex.
(*vec_dupv4si): Likewise.
(*vec_dupv2di): Likewise.

* gcc.target/i386/avx512vl-vbroadcast-1.c: New test.

--- gcc/config/i386/sse.md.jj   2016-05-18 11:24:21.0 +0200
+++ gcc/config/i386/sse.md  2016-05-18 12:30:50.929220572 +0200
@@ -16880,15 +16880,15 @@ (define_insn "avx2_permv2ti"
(set_attr "mode" "OI")])
 
 (define_insn "avx2_vec_dupv4df"
-  [(set (match_operand:V4DF 0 "register_operand" "=x")
+  [(set (match_operand:V4DF 0 "register_operand" "=v")
(vec_duplicate:V4DF
  (vec_select:DF
-   (match_operand:V2DF 1 "register_operand" "x")
+   (match_operand:V2DF 1 "register_operand" "v")
(parallel [(const_int 0)]]
   "TARGET_AVX2"
   "vbroadcastsd\t{%1, %0|%0, %1}"
   [(set_attr "type" "sselog1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
(set_attr "mode" "V4DF")])
 
 (define_insn "_vec_dup_1"
@@ -16991,9 +16991,9 @@ (define_insn "_vec
(const_int 1)))])
 
 (define_insn "vec_dupv4sf"
-  [(set (match_operand:V4SF 0 "register_operand" "=x,x,x")
+  [(set (match_operand:V4SF 0 "register_operand" "=v,v,x")
(vec_duplicate:V4SF
- (match_operand:SF 1 "nonimmediate_operand" "x,m,0")))]
+ (match_operand:SF 1 "nonimmediate_operand" "Yv,m,0")))]
   "TARGET_SSE"
   "@
vshufps\t{$0, %1, %1, %0|%0, %1, %1, 0}
@@ -17003,13 +17003,13 @@ (define_insn "vec_dupv4sf"
(set_attr "type" "sseshuf1,ssemov,sseshuf1")
(set_attr "length_immediate" "1,0,1")
(set_attr "prefix_extra" "0,1,*")
-   (set_attr "prefix" "vex,vex,orig")
+   (set_attr "prefix" "maybe_evex,maybe_evex,orig")
(set_attr "mode" "V4SF")])
 
 (define_insn "*vec_dupv4si"
-  [(set (match_operand:V4SI 0 "register_operand" "=x,x,x")
+  [(set (match_operand:V4SI 0 "register_operand" "=v,v,x")
(vec_duplicate:V4SI
- (match_operand:SI 1 "nonimmediate_operand" " x,m,0")))]
+ (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0")))]
   "TARGET_SSE"
   "@
%vpshufd\t{$0, %1, %0|%0, %1, 0}
@@ -17019,13 +17019,13 @@ (define_insn "*vec_dupv4si"
(set_attr "type" "sselog1,ssemov,sselog1")
(set_attr "length_immediate" "1,0,1")
(set_attr "prefix_extra" "0,1,*")
-   (set_attr "prefix" "maybe_vex,vex,orig")
+   (set_attr "prefix" "maybe_vex,maybe_evex,orig")
(set_attr "mode" "TI,V4SF,V4SF")])
 
 (define_insn "*vec_dupv2di"
-  [(set (match_operand:V2DI 0 "register_operand" "=x,x,x,x")
+  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,x")
(vec_duplicate:V2DI
- (match_operand:DI 1 "nonimmediate_operand" " 0,x,m,0")))]
+ (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,m,0")))]
   "TARGET_SSE"
   "@
punpcklqdq\t%0, %0
@@ -17034,7 +17034,7 @@ (define_insn "*vec_dupv2di"
movlhps\t%0, %0"
   [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
(set_attr "type" "sselog1,sselog1,sselog1,ssemov")
-   (set_attr "prefix" "orig,vex,maybe_vex,orig")
+   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig")
(set_attr "mode" "TI,TI,DF,V4SF")])
 
 (define_insn "avx2_vbroadcasti128_"
--- gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-1.c.jj 2016-05-18 12:31:29.486693255 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-1.c   2016-05-18 12:33:41.202891888 +0200
@@ -0,0 +1,41 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl" } */
+
+#include 
+
+void
+f1 (__m128d x)
+{
+  register __m128d a __asm ("xmm16");
+  register __m256d b __asm ("xmm17");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  b = _mm256_broadcastsd_pd (a);
+  asm volatile ("" : "+v" (b));
+}
+
+/* { dg-final { scan-assembler "vbroadcastsd\[^\n\r]*(xmm16\[^\n\r]*ymm17|ymm17\[^\n\r]*xmm16)" } } */
+
+void
+f2 (float const *x)
+{
+  register __m128 a __asm ("xmm16");
+  a = _mm_broadcast_ss (x);
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler "vbroadcastss\[^\n\r]*(\\)\[^\n\r]*xmm16|xmm16\[^\n\r]*PTR)" } } */
+
+void
+f3 (float x)
+{
+  register float a __asm ("xmm16");
+  register __m128 b __asm ("xmm17");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  float c = a;
+  b = _mm_broadcast_ss (&c);
+  asm volatile ("" : "+v" (b));
+}
+
+/* { dg-final { scan-assembler "vbroadcastss\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" } } */

Jakub


[PATCH] Fix another case of noreturn call with TREE_ADDRESSABLE type on lhs (PR c++/71100)

2016-05-18 Thread Jakub Jelinek
Hi!

In 6+ we require that lhs is present if the return type of a call is
TREE_ADDRESSABLE, apparently this function has been missed.

Fixed thusly, bootstrapped/regtested (at r236371, later trunk fails
miserably on both x86_64-linux and i686-linux), ok for trunk/6.2?

2016-05-18  Jakub Jelinek  

PR c++/71100
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Don't drop
lhs if it has TREE_ADDRESSABLE type.

* g++.dg/opt/pr71100.C: New test.

--- gcc/cgraph.c.jj 2016-05-16 18:53:43.0 +0200
+++ gcc/cgraph.c 2016-05-18 10:53:39.491189469 +0200
@@ -1515,7 +1515,8 @@ cgraph_edge::redirect_call_stmt_to_calle
   /* If the call becomes noreturn, remove the LHS if possible.  */
   if (lhs
   && (gimple_call_flags (new_stmt) & ECF_NORETURN)
-  && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST)
+  && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST
+  && !TREE_ADDRESSABLE (TREE_TYPE (lhs)))
 {
   if (TREE_CODE (lhs) == SSA_NAME)
{
--- gcc/testsuite/g++.dg/opt/pr71100.C.jj   2016-05-18 10:56:12.798070065 +0200
+++ gcc/testsuite/g++.dg/opt/pr71100.C  2016-05-18 10:56:07.974136754 +0200
@@ -0,0 +1,18 @@
+// PR c++/71100
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct D { ~D (); };
+struct E { D foo () { throw 1; } };
+
+inline void
+bar (D (E::*f) (), E *o)
+{
+  (o->*f) ();
+}
+
+void
+baz (E *o)
+{
+  bar (&E::foo, o);
+}

Jakub


[PATCH, alpha]: Fix PR 71145: Error: No lda !gpdisp!278 was found

2016-05-18 Thread Uros Bizjak
Hello!

Alpha assembler requires that matching "lda $29,0($29)
!gpdisp!NNN" always follow "ldah $29,0($26)!gpdisp!NNN".
However, when the compiler inserts trap insn, it (correctly) figures
out that $29 is unused, and removes "lda" from insn stream. Since ldah
is defined as unspec_volatile, it remains present.

The solution is to make trap insn dependent on register $29.

2016-05-18  Uros Bizjak  

PR target/71145
* config/alpha/alpha.md (trap): Add (use (reg:DI 29)).
(*exception_receiver_1): Return "#" for TARGET_EXPLICIT_RELOCS.

Patch was bootstrapped and regression tested on alpha-linux-gnu.

Committed to mainline and all release branches.

Uros.
Index: alpha.md
===
--- alpha.md (revision 236296)
+++ alpha.md (working copy)
@@ -3738,7 +3738,8 @@
 
 ;; BUGCHK is documented common to OSF/1 and VMS PALcode.
 (define_insn "trap"
-  [(trap_if (const_int 1) (const_int 0))]
+  [(trap_if (const_int 1) (const_int 0))
+   (use (reg:DI 29))]
   ""
   "call_pal 0x81"
   [(set_attr "type" "callpal")])
@@ -5157,7 +5158,7 @@
   "TARGET_ABI_OSF"
 {
   if (TARGET_EXPLICIT_RELOCS)
-return "ldah $29,0($26)\t\t!gpdisp!%*\;lda $29,0($29)\t\t!gpdisp!%*";
+return "#";
   else
 return "ldgp $29,0($26)";
 }


Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.

2016-05-18 Thread David Malcolm
On Wed, 2016-05-18 at 19:59 +0300, Pekka Jääskeläinen wrote:
> Hi Joseph,
> 
> Thanks for the comments.  Updated patch attached. Hopefully I
> didn't miss any diags.

It looks like the attachment doesn't contain the patch; on unzipping I
just see a 27 byte file reading "The BRIG frontend itself."

A similar problem seems to have happened with the update to 1/4:
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01385.html
where the "patch" just has the descriptive text, but not the actual
changes.

Looks like you need to resend both.

Hope this is helpful; good luck
Dave

> On Wed, May 18, 2016 at 3:20 AM, Joseph Myers <
> jos...@codesourcery.com> wrote:
> > This patch has many improperly formatted diagnostic messages (e.g.
> > starting with capital letters, ending with '.' or failing to use %q
> > for
> > quoting).  I also note cases where you use %lu as a format for
> > size_t,
> > which is not correct (you'd need to add pretty-print.c support for
> > %zu
> > before you could use that, however).
> > 
> > --
> > Joseph S. Myers
> > jos...@codesourcery.com


Re: [PATCH 2/3] function: Factor out make_*logue_seq

2016-05-18 Thread Segher Boessenkool
On Wed, May 18, 2016 at 11:20:29AM -0700, H.J. Lu wrote:
> >> > * function.c (make_split_prologue_seq, make_prologue_seq,
> >> > make_epilogue_seq): New functions, factored out from...
> >> > (thread_prologue_and_epilogue_insns): Here.
> >>
> >> It breaks x86:
> >
> > Are you sure it is this patch causing it?  As noted, it was tested on x86.
> 
> I am pretty sure.  How did you test it on x86?

"make -k check".  I'll test 32-bit now.


Segher


[PATCH], Add support for PowerPC ISA 3.0 VNEGD/VNEGW instructions

2016-05-18 Thread Michael Meissner
Unlike some of my patches, this is a fairly simple patch to add support for the
VNEGW and VNEGD instructions that were added in ISA 3.0.  Note, ISA 3.0 does
not provide negation for V16QImode/V8HImode, just V4SImode/V2DImode.

I discovered that when we added ISA 2.07 support for V2DImode, we didn't
provide an expander for negv2di2, which I added with this patch.
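
The fallback path in the expander computes the negation as zero minus the operand. A scalar model of that, one lane at a time, is sketched below; it is only an illustration with invented names (V4U32, negate_via_subtract, negate_direct), and it uses unsigned lanes so that wraparound is well defined in C++.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4U32 = std::array<std::uint32_t, 4>;

// Fallback shape: materialize a zero vector, then subtract lane by lane.
V4U32 negate_via_subtract(const V4U32& a) {
    const V4U32 zero{};
    V4U32 out{};
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = zero[i] - a[i];          // modulo 2^32
    return out;
}

// What a dedicated negate computes per lane: two's complement negation.
V4U32 negate_direct(const V4U32& a) {
    V4U32 out{};
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = ~a[i] + 1u;
    return out;
}
```

Both functions agree on every input, including 0x80000000, which negates to itself.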

[gcc]
2016-05-18  Michael Meissner  

* config/rs6000/altivec.md (VNEG iterator): New iterator for
VNEGW/VNEGD instructions.
(p9_neg2): New insns for ISA 3.0 VNEGW/VNEGD.
(neg2): Add expander for V2DImode added in ISA 2.06, and
support for ISA 3.0 VNEGW/VNEGD instructions.

[gcc/testsuite]
2016-05-18  Michael Meissner  

* gcc.target/powerpc/p9-vneg.c: New test for ISA 3.0 VNEGW/VNEGD
instructions.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/altivec.md
===
--- gcc/config/rs6000/altivec.md (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 236398)
+++ gcc/config/rs6000/altivec.md (.../gcc/config/rs6000) (working copy)
@@ -203,6 +203,9 @@ (define_mode_attr VP_small [(V2DI "V4SI"
 (define_mode_attr VP_small_lc [(V2DI "v4si") (V4SI "v8hi") (V8HI "v16qi")])
 (define_mode_attr VU_char [(V2DI "w") (V4SI "h") (V8HI "b")])
 
+;; Vector negate
+(define_mode_iterator VNEG [V4SI V2DI])
+
 ;; Vector move instructions.
 (define_insn "*altivec_mov"
   [(set (match_operand:VM2 0 "nonimmediate_operand" "=Z,v,v,*Y,*r,*r,v,v,*r")
@@ -2740,19 +2743,28 @@ (define_expand "reduc_plus_scal_"
   DONE;
 })
 
+(define_insn "*p9_neg2"
+  [(set (match_operand:VNEG 0 "altivec_register_operand" "=v")
+   (neg:VNEG (match_operand:VNEG 1 "altivec_register_operand" "v")))]
+  "TARGET_P9_VECTOR"
+  "vneg %0,%1"
+  [(set_attr "type" "vecsimple")])
+
 (define_expand "neg2"
-  [(use (match_operand:VI 0 "register_operand" ""))
-   (use (match_operand:VI 1 "register_operand" ""))]
-  "TARGET_ALTIVEC"
+  [(set (match_operand:VI2 0 "register_operand" "")
+   (neg:VI2 (match_operand:VI2 1 "register_operand" "")))]
+  ""
   "
 {
-  rtx vzero;
+  if (!TARGET_P9_VECTOR || (mode != V4SImode && mode != V2DImode))
+{
+  rtx vzero;
 
-  vzero = gen_reg_rtx (GET_MODE (operands[0]));
-  emit_insn (gen_altivec_vspltis (vzero, const0_rtx));
-  emit_insn (gen_sub3 (operands[0], vzero, operands[1])); 
-  
-  DONE;
+  vzero = gen_reg_rtx (GET_MODE (operands[0]));
+  emit_move_insn (vzero, CONST0_RTX (mode));
+  emit_insn (gen_sub3 (operands[0], vzero, operands[1])); 
+  DONE;
+}
 }")
 
 (define_expand "udot_prod"
Index: gcc/testsuite/gcc.target/powerpc/p9-vneg.c
===
--- gcc/testsuite/gcc.target/powerpc/p9-vneg.c	(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-vneg.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 236415)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc64*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+/* Verify P9 vector negate instructions.  */
+
+vector long long v2di_neg (vector long long a) { return -a; }
+vector int v4si_neg (vector int a) { return -a; }
+
+/* { dg-final { scan-assembler "vnegd" } } */
+/* { dg-final { scan-assembler "vnegw" } } */



[patch, fortran] PR66461 ICE on missing end program in fixed source

2016-05-18 Thread Jerry DeLisle
Hi all,

The following patch was regression tested on x86-64.  The ICE is from an attempt to
free a bad expression after a MATCH_ERROR is returned. I have not been able to
identify an exact cause, there being numerous matchers involved attempting to
match the logical expression.
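
The general shape of the fix is "check the matcher's status before using or freeing what it was supposed to produce". A minimal sketch in plain C, under the assumption of toy stand-ins for the real matcher (toy_match_expr and toy_match_if are hypothetical names, not functions from match.c):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the matcher result codes.  */
typedef enum { MATCH_NO, MATCH_YES, MATCH_ERROR } match;

/* Toy matcher: stores a heap copy of TEXT in *OUT only on MATCH_YES.
   On MATCH_NO or MATCH_ERROR, *OUT is left untouched.  */
static match
toy_match_expr (const char *text, char **out)
{
  if (text == NULL)
    return MATCH_ERROR;
  if (text[0] == '\0')
    return MATCH_NO;

  char *copy = malloc (strlen (text) + 1);
  if (copy == NULL)
    return MATCH_ERROR;
  strcpy (copy, text);
  *out = copy;
  return MATCH_YES;
}

/* Caller: bail out on anything but MATCH_YES, so that the expression
   is never freed (or used) when the matcher did not produce one.  */
static match
toy_match_if (const char *text)
{
  char *expr = NULL;
  match m = toy_match_expr (text, &expr);

  if (m != MATCH_YES)
    return m;

  free (expr);
  return MATCH_YES;
}
```

The patch below applies the same idea at one call site: it stops assuming the match is guaranteed and returns MATCH_ERROR early.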

Regardless, it is an error on invalid code, so I suggest we commit this patch and
close the PR.  I don't think it's a regression as marked in bugzilla.  I see the
internal error as far back as 4.5.  If someone has an earlier build and can
see where this does not occur, please let me know.  (In case I missed something.)

The patch gives the following result:

$ gfc s.f
s.f:4:9:

  if ( x(1) < 0 .or.
 1
Error: Can not process after the IF statement shown at (1)
f951: Error: Unexpected end of file in ‘s.f’


OK for trunk?

Regards,

Jerry

2016-05-18  Jerry DeLisle  

PR fortran/66461
* match.c (gfc_match_if): Catch unexpected MATCH_ERROR and issue an error
message.


diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index f3a4a43..85e6f92 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -1560,7 +1560,16 @@ gfc_match_if (gfc_statement *if_type)
   if (m == MATCH_ERROR)
 return MATCH_ERROR;

-  gfc_match (" if ( %e ) ", );/* Guaranteed to match.  */
+  m = gfc_match (" if ( %e ) ", );  /* Not always guaranteed to match.  */
+
+  if (m == MATCH_ERROR)
+{
+  /* Under some invalid conditions like unexpected end of file, one
+can get an error in the match. We bail out here and hope for
+the best (the best being an error reported somewhere else).  */
+  gfc_error ("Can not process after the IF statement shown at %C");
+  return MATCH_ERROR;
+}

   m = gfc_match_pointer_assignment ();
   if (m == MATCH_YES)




Re: [PATCH 2/3] function: Factor out make_*logue_seq

2016-05-18 Thread H.J. Lu
On Wed, May 18, 2016 at 11:20 AM, H.J. Lu  wrote:
> On Wed, May 18, 2016 at 11:11 AM, Segher Boessenkool
>  wrote:
>> On Wed, May 18, 2016 at 10:17:32AM -0700, H.J. Lu wrote:
>>> On Mon, May 16, 2016 at 6:09 PM, Segher Boessenkool
>>>  wrote:
>>> > Make new functions make_split_prologue_seq, make_prologue_seq, and
>>> > make_epilogue_seq.
>>> >
>>> > Tested as in the previous patch; is this okay for trunk?
>>> >
>>> >
>>> > Segher
>>> >
>>> >
>>> > 2016-05-16  Segher Boessenkool  
>>> >
>>> > * function.c (make_split_prologue_seq, make_prologue_seq,
>>> > make_epilogue_seq): New functions, factored out from...
>>> > (thread_prologue_and_epilogue_insns): Here.
>>> >
>>>
>>> It breaks x86:
>>
>> Are you sure it is this patch causing it?  As noted, it was tested on x86.
>>
>
> I am pretty sure.  How did you test it on x86?  What do you get with
>
> # make check-c++ RUNTESTFLAGS="dg.exp=ctor*.C --target_board='unix{-m32,}'"
>

I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71180


-- 
H.J.


Re: [PATCH 2/3] function: Factor out make_*logue_seq

2016-05-18 Thread H.J. Lu
On Wed, May 18, 2016 at 11:11 AM, Segher Boessenkool
 wrote:
> On Wed, May 18, 2016 at 10:17:32AM -0700, H.J. Lu wrote:
>> On Mon, May 16, 2016 at 6:09 PM, Segher Boessenkool
>>  wrote:
>> > Make new functions make_split_prologue_seq, make_prologue_seq, and
>> > make_epilogue_seq.
>> >
>> > Tested as in the previous patch; is this okay for trunk?
>> >
>> >
>> > Segher
>> >
>> >
>> > 2016-05-16  Segher Boessenkool  
>> >
>> > * function.c (make_split_prologue_seq, make_prologue_seq,
>> > make_epilogue_seq): New functions, factored out from...
>> > (thread_prologue_and_epilogue_insns): Here.
>> >
>>
>> It breaks x86:
>
> Are you sure it is this patch causing it?  As noted, it was tested on x86.
>

I am pretty sure.  How did you test it on x86?  What do you get with

# make check-c++ RUNTESTFLAGS="dg.exp=ctor*.C --target_board='unix{-m32,}'"


-- 
H.J.


[PING**2] [PATCH] Make C++ honor the enum mode attribute

2016-05-18 Thread Bernd Edlinger
Ping...

On 07.05.2016 11:54 Bernd Edlinger wrote:
> Ping..
>
> For this patch: https://gcc.gnu.org/ml/gcc-patches/2016-04/msg02069.html
>
> Thanks
> Bernd.
>


Re: [PATCH 2/3] function: Factor out make_*logue_seq

2016-05-18 Thread Segher Boessenkool
On Wed, May 18, 2016 at 10:17:32AM -0700, H.J. Lu wrote:
> On Mon, May 16, 2016 at 6:09 PM, Segher Boessenkool
>  wrote:
> > Make new functions make_split_prologue_seq, make_prologue_seq, and
> > make_epilogue_seq.
> >
> > Tested as in the previous patch; is this okay for trunk?
> >
> >
> > Segher
> >
> >
> > 2016-05-16  Segher Boessenkool  
> >
> > * function.c (make_split_prologue_seq, make_prologue_seq,
> > make_epilogue_seq): New functions, factored out from...
> > (thread_prologue_and_epilogue_insns): Here.
> >
> 
> It breaks x86:

Are you sure it is this patch causing it?  As noted, it was tested on x86.


Segher


Re: [PATCH] Fix PR71104 - call gimplification

2016-05-18 Thread Jeff Law

On 05/17/2016 06:28 AM, Richard Biener wrote:


The following patch addresses PR71104 which shows verify-SSA ICEs
after gimplify-into-SSA.  The issue is that for returns-twice calls
we gimplify register uses in the LHS before the actual call which leads to

  p.0_1 = p;
  _2 = vfork ();
  *p.0_1 = _2;

when gimplifying *p = vfork ().  That of course does not work -
fortunately the C standard allows operands in the LHS to be evaluated
in unspecified order relative to the RHS.  That also makes this order aligned
with that scary C++ proposal of defined evaluation order.  It also
improves code-generation, avoiding spilling of the pointer load
around the call.
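
To make the ordering point concrete, here is a small sketch, assuming only standard C: a call on the RHS changes the pointer that the LHS dereferences, so which object receives the value depends on whether the pointer is read before or after the call. The names a, b, retarget and store_through_p are invented for this illustration and are not from the PR testcase.

```c
static int a, b;
static int *p = &a;

/* A call with a side effect on p: redirects it and returns a value.  */
static int
retarget (void)
{
  p = &b;
  return 42;
}

/* If p is loaded before the call, the store lands in a; if it is loaded
   after the call, the store lands in b.  As far as I can tell C leaves
   the choice to the implementation, so portable code must not rely on
   either outcome.  */
static void
store_through_p (void)
{
  *p = retarget ();
}
```

With the change described above, GCC evaluates the RHS first, so the pointer load happens after the call; that is the ordering a returns-twice call such as vfork needs.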

Exchanging the gimplify calls doesn't fix the issue fully as for
aggregate returns we don't gimplify the call result into a
temporary.  So we need to make sure to not emit an SSA when
gimplifying the LHS of a returns-twice call (this path only applies
to non-register returns).

A bootstrap with just the gimplification order exchange is building
target libs right now, I'll re-bootstrap and test the whole thing
again if that succeeds.

Is this ok?  I think it makes sense code-generation-wise.  Code
changes from GCC 6

bar:
.LFB0:
        .cfi_startproc
        subq    $24, %rsp
        .cfi_def_cfa_offset 32
        call    foo
        movq    p(%rip), %rax
        movq    %rax, 8(%rsp)
        call    vfork
        movq    8(%rsp), %rdx
        movl    %eax, (%rdx)
        addq    $24, %rsp
        .cfi_def_cfa_offset 8
        ret

to

bar:
.LFB0:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    foo
        call    vfork
        movq    p(%rip), %rdx
        movl    %eax, (%rdx)
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret

Thanks,
Richard.

2016-05-17  Richard Biener  

PR middle-end/71104
* gimplify.c (gimplify_modify_expr): Gimplify the RHS before
gimplifying the LHS.  Make sure to gimplify a returning twice
call LHS without using SSA names.

* gcc.dg/pr71104-1.c: New testcase.
* gcc.dg/pr71104-2.c: Likewise.

LGTM.
jeff



Re: [PATCH][PR sanitizer/64354] Define __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros if corresponding switches are enabled.

2016-05-18 Thread Maxim Ostapenko

On 18/05/16 20:39, Yuri Gribov wrote:
> On Wed, May 18, 2016 at 8:33 PM, Maxim Ostapenko
>  wrote:
>> Hi,
>>
>> when compiling with -fsanitize=address we define __SANITIZE_ADDRESS__
>> macros, but we don't do this for -fsanitize=thread and -fsanitize=undefined.
>> Perhaps we should be more symmetric here and define corresponding
>> __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros respectively?
>
> Are these available in Clang as well?

AFAIK Clang has __has_feature(address_sanitizer) and
__has_feature(thread_sanitizer). UBSan doesn't have such things.

>> I added two simple test cases to c-c++-common/{ub, t}san/ directories that
>> just verify if __SANITIZE_THREAD__ (__SANITIZE_UNDEFINED__) is defined. Is
>> that a proper way how we check that the macros defined correctly? Does this
>> patch looks reasonable?
>>
>> -Maxim


Re: New C++ PATCH for c++/10200 et al

2016-05-18 Thread Jason Merrill

On 05/13/2016 03:17 PM, Jason Merrill wrote:

On 02/16/2016 07:49 PM, Jason Merrill wrote:

Clearly the DR 141 change is requiring much larger adjustments in the
rest of the compiler than I'm comfortable making at this point in the
GCC 6 schedule, so I'm backing out my earlier changes for 10200 and
69753 and replacing them with a more modest fix for 10200: Now we will
still find member function templates by unqualified lookup, we just
won't find namespace-scope function templates.  The earlier approach
will return in GCC 7 stage 1.


As promised.  The prerequisite for the DR 141 change was fixing the
C++11 handling of type-dependence of member access expressions,
including calls.  14.6.2.2 says,

A class member access expression (5.2.5) is type-dependent if the
expression refers to a member of the current instantiation and the type
of the referenced member is dependent, or the class member access
expression refers to a member of an unknown specialization. [ Note: In
an expression of the form x.y or xp->y the type of the expression is
usually the type of the member y of the class of x (or the class pointed
to by xp). However, if x or xp refers to a dependent type that is not
the current instantiation, the type of y is always dependent. If x or xp
refers to a non-dependent type or refers to the current instantiation,
the type of y is the type of the class member access expression. —end
note ]

Previously we had been treating such expressions as type-dependent if
the object-expression is type-dependent, even if its type is the current
instantiation.  Fixing this required a few changes in other areas that
now have to deal with non-dependent member function calls within a
template.


A small tweak to handling of value-dependent functions to better match 
the text of the standard.



commit 9a4d77c31e361644e4deffb6a6e21a87948b0ac0
Author: Jason Merrill 
Date:   Mon May 16 15:52:25 2016 -0400

	* pt.c (value_dependent_expression_p): Tweak new cases to better
	match the wording in the standard.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 65bfd42..fde3091 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -22649,19 +22649,14 @@ value_dependent_expression_p (tree expression)
   switch (TREE_CODE (expression))
 {
 case BASELINK:
-  /* A member function of a dependent class has dependent template
-	 arguments from its class.  */
-  if (dependent_type_p (BINFO_TYPE (BASELINK_BINFO (expression))))
-	return true;
-  return value_dependent_expression_p (BASELINK_FUNCTIONS (expression));
+  /* A dependent member function of the current instantiation.  */
+  return dependent_type_p (BINFO_TYPE (BASELINK_BINFO (expression)));
 
 case FUNCTION_DECL:
-  /* A function template specialization is value-dependent if it has any
-	 dependent template arguments, since that means it cannot be
-	 instantiated for constexpr evaluation.  */
-  if (DECL_LANG_SPECIFIC (expression)
-	  && DECL_TEMPLATE_INFO (expression))
-	return any_dependent_template_arguments_p (DECL_TI_ARGS (expression));
+  /* A dependent member function of the current instantiation.  */
+  if (DECL_CLASS_SCOPE_P (expression)
+	  && dependent_type_p (DECL_CONTEXT (expression)))
+	return true;
   break;
 
 case IDENTIFIER_NODE:


Re: [PATCH][PR sanitizer/64354] Define __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros if corresponding switches are enabled.

2016-05-18 Thread Yuri Gribov
On Wed, May 18, 2016 at 8:33 PM, Maxim Ostapenko
 wrote:
> Hi,
>
> when compiling with -fsanitize=address we define __SANITIZE_ADDRESS__
> macros, but we don't do this for -fsanitize=thread and -fsanitize=undefined.
> Perhaps we should be more symmetric here and define corresponding
> __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros respectively?

Are these available in Clang as well?

> I added two simple test cases to c-c++-common/{ub, t}san/ directories that
> just verify if __SANITIZE_THREAD__ (__SANITIZE_UNDEFINED__) is defined. Is
> that a proper way how we check that the macros defined correctly? Does this
> patch looks reasonable?
>
> -Maxim


Re: [PATCH][PR sanitizer/64354] Define __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros if corresponding switches are enabled.

2016-05-18 Thread Yuri Gribov
On Wed, May 18, 2016 at 8:36 PM, Jakub Jelinek  wrote:
> On Wed, May 18, 2016 at 08:33:53PM +0300, Maxim Ostapenko wrote:
>> when compiling with -fsanitize=address we define __SANITIZE_ADDRESS__
>> macros, but we don't do this for -fsanitize=thread and -fsanitize=undefined.
>> Perhaps we should be more symmetric here and define corresponding
>> __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros respectively?
>>
>> I added two simple test cases to c-c++-common/{ub, t}san/ directories that
>> just verify if __SANITIZE_THREAD__ (__SANITIZE_UNDEFINED__) is defined. Is
>> that a proper way how we check that the macros defined correctly? Does this
>> patch looks reasonable?
>
> I can understand __SANITIZE_THREAD__, but I fail to see what
> __SANITIZE_UNDEFINED__ would be good for, especially when it is not just
> a single sanitizer, but dozens of them.

Some low-level code does nasty things like unaligned or NULL memory
accesses. This would allow them to selectively disable UBSan. Good
point about different UB checks though.

-Y


Re: [PATCH][PR sanitizer/64354] Define __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros if corresponding switches are enabled.

2016-05-18 Thread Jakub Jelinek
On Wed, May 18, 2016 at 08:33:53PM +0300, Maxim Ostapenko wrote:
> when compiling with -fsanitize=address we define __SANITIZE_ADDRESS__
> macros, but we don't do this for -fsanitize=thread and -fsanitize=undefined.
> Perhaps we should be more symmetric here and define corresponding
> __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros respectively?
> 
> I added two simple test cases to c-c++-common/{ub, t}san/ directories that
> just verify if __SANITIZE_THREAD__ (__SANITIZE_UNDEFINED__) is defined. Is
> that a proper way how we check that the macros defined correctly? Does this
> patch looks reasonable?

I can understand __SANITIZE_THREAD__, but I fail to see what
__SANITIZE_UNDEFINED__ would be good for, especially when it is not just
a single sanitizer, but dozens of them.

Jakub


[PATCH][PR sanitizer/64354] Define __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros if corresponding switches are enabled.

2016-05-18 Thread Maxim Ostapenko

Hi,

when compiling with -fsanitize=address we define the __SANITIZE_ADDRESS__
macro, but we don't do this for -fsanitize=thread and
-fsanitize=undefined. Perhaps we should be more symmetric here and 
define corresponding __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ 
macros respectively?
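
As a rough illustration of how user code might consume such macros, the sketch below combines the GCC-style macros with Clang's __has_feature checks. It assumes the __SANITIZE_THREAD__ macro proposed in this thread; BUILT_WITH_TSAN, BUILT_WITH_ASAN and the two accessor functions are hypothetical names chosen for this example.

```c
/* Compilers without __has_feature get a stub that always yields 0.  */
#ifndef __has_feature
# define __has_feature(x) 0
#endif

/* __SANITIZE_THREAD__ is the macro proposed in this thread;
   thread_sanitizer is Clang's feature name.  */
#if defined (__SANITIZE_THREAD__) || __has_feature(thread_sanitizer)
# define BUILT_WITH_TSAN 1
#else
# define BUILT_WITH_TSAN 0
#endif

/* __SANITIZE_ADDRESS__ already exists in GCC.  */
#if defined (__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
# define BUILT_WITH_ASAN 1
#else
# define BUILT_WITH_ASAN 0
#endif

/* Accessors so the result can be inspected at run time.  */
int
built_with_tsan (void)
{
  return BUILT_WITH_TSAN;
}

int
built_with_asan (void)
{
  return BUILT_WITH_ASAN;
}
```

Code that needs to behave differently under a sanitizer can then test BUILT_WITH_TSAN or BUILT_WITH_ASAN instead of repeating the compiler-specific checks.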


I added two simple test cases to the c-c++-common/{ub, t}san/ directories
that just verify that __SANITIZE_THREAD__ (__SANITIZE_UNDEFINED__) is
defined. Is that the proper way to check that the macros are defined
correctly? Does this patch look reasonable?


-Maxim
gcc/ChangeLog:

2016-05-19  Maxim Ostapenko  

	PR sanitizer/64354
	* cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add new
	builtin __SANITIZE_THREAD__ and __SANITIZE_UNDEFINED__ macros for
	-fsanitize=thread and -fsanitize=undefined switches respectively.
	* doc/cpp.texi: Document new macros.

gcc/testsuite/ChangeLog:

2016-05-19  Maxim Ostapenko  

	PR sanitizer/64354
	* c-c++-common/tsan/sanitize-thread-macro.c: New test.
	* c-c++-common/ubsan/sanitize-undefined-macro.c: Likewise.

diff --git a/gcc/cppbuiltin.c b/gcc/cppbuiltin.c
index 6d494ad..8fd0723 100644
--- a/gcc/cppbuiltin.c
+++ b/gcc/cppbuiltin.c
@@ -92,6 +92,12 @@ define_builtin_macros_for_compilation_flags (cpp_reader *pfile)
   if (flag_sanitize & SANITIZE_ADDRESS)
 cpp_define (pfile, "__SANITIZE_ADDRESS__");
 
+  if (flag_sanitize & SANITIZE_THREAD)
+cpp_define (pfile, "__SANITIZE_THREAD__");
+
+  if (flag_sanitize & SANITIZE_UNDEFINED)
+cpp_define (pfile, "__SANITIZE_UNDEFINED__");
+
   if (optimize_size)
 cpp_define (pfile, "__OPTIMIZE_SIZE__");
   if (optimize)
diff --git a/gcc/doc/cpp.texi b/gcc/doc/cpp.texi
index 9f914b2..965795b 100644
--- a/gcc/doc/cpp.texi
+++ b/gcc/doc/cpp.texi
@@ -2362,6 +2362,12 @@ in use.
 This macro is defined, with value 1, when @option{-fsanitize=address}
 or @option{-fsanitize=kernel-address} are in use.
 
+@item __SANITIZE_THREAD__
+This macro is defined, with value 1, when @option{-fsanitize=thread} is in use.
+
+@item __SANITIZE_UNDEFINED__
+This macro is defined, with value 1, when @option{-fsanitize=undefined} is in use.
+
 @item __TIMESTAMP__
 This macro expands to a string constant that describes the date and time
 of the last modification of the current source file. The string constant
diff --git a/gcc/testsuite/c-c++-common/tsan/sanitize-thread-macro.c b/gcc/testsuite/c-c++-common/tsan/sanitize-thread-macro.c
new file mode 100644
index 000..2b8a840
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/tsan/sanitize-thread-macro.c
@@ -0,0 +1,12 @@
+/* Check that the -fsanitize=thread option defines the __SANITIZE_THREAD__ macro.  */
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
+
+int
+main ()
+{
+#ifndef __SANITIZE_THREAD__
+  bad construction
+#endif
+  return 0;
+}
diff --git a/gcc/testsuite/c-c++-common/ubsan/sanitize-undefined-macro.c b/gcc/testsuite/c-c++-common/ubsan/sanitize-undefined-macro.c
new file mode 100644
index 000..7461324
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/sanitize-undefined-macro.c
@@ -0,0 +1,13 @@
+/* Check that the -fsanitize=undefined option defines the __SANITIZE_UNDEFINED__ macro.  */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=undefined" } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
+
+int
+main ()
+{
+#ifndef __SANITIZE_UNDEFINED__
+  bad construction
+#endif
+  return 0;
+}


Re: [PATCH 2/3] function: Factor out make_*logue_seq

2016-05-18 Thread H.J. Lu
On Mon, May 16, 2016 at 6:09 PM, Segher Boessenkool
 wrote:
> Make new functions make_split_prologue_seq, make_prologue_seq, and
> make_epilogue_seq.
>
> Tested as in the previous patch; is this okay for trunk?
>
>
> Segher
>
>
> 2016-05-16  Segher Boessenkool  
>
> * function.c (make_split_prologue_seq, make_prologue_seq,
> make_epilogue_seq): New functions, factored out from...
> (thread_prologue_and_epilogue_insns): Here.
>

It breaks x86:

https://gcc.gnu.org/ml/gcc-regression/2016-05/msg00263.html

FAIL: 18_support/exception_ptr/lifespan.cc execution test
FAIL: 18_support/nested_exception/rethrow_if_nested.cc execution test
FAIL: 20_util/function/63840.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/char/stod.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/char/stof.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/char/stoi.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/char/stol.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/char/stold.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/char/stoll.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/char/stoul.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/char/stoull.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/wchar_t/stod.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/wchar_t/stof.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/wchar_t/stoi.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/wchar_t/stol.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/wchar_t/stold.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/wchar_t/stoll.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/wchar_t/stoul.cc execution test
FAIL: 21_strings/basic_string/numeric_conversions/wchar_t/stoull.cc execution test
FAIL: 22_locale/locale/cons/12352.cc execution test
FAIL: 22_locale/locale/cons/2.cc execution test
FAIL: 22_locale/numpunct/members/pod/2.cc execution test
FAIL: 23_containers/bitset/to_ulong/1.cc execution test
FAIL: 23_containers/deque/cons/2.cc execution test
FAIL: 23_containers/deque/requirements/exception/basic.cc execution test
FAIL: 23_containers/deque/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/forward_list/requirements/exception/basic.cc execution test
FAIL: 23_containers/forward_list/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/list/modifiers/insert/25288.cc execution test
FAIL: 23_containers/list/requirements/exception/basic.cc execution test
FAIL: 23_containers/list/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/map/requirements/exception/basic.cc execution test
FAIL: 23_containers/map/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/multimap/requirements/exception/basic.cc execution test
FAIL: 23_containers/multimap/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/multiset/requirements/exception/basic.cc execution test
FAIL: 23_containers/multiset/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/set/requirements/exception/basic.cc execution test
FAIL: 23_containers/set/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/unordered_map/requirements/exception/basic.cc execution test
FAIL: 23_containers/unordered_map/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/unordered_multimap/requirements/exception/basic.cc execution test
FAIL: 23_containers/unordered_multimap/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/unordered_multiset/insert/hash_policy.cc execution test
FAIL: 23_containers/unordered_multiset/requirements/exception/basic.cc execution test
FAIL: 23_containers/unordered_multiset/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/unordered_set/insert/hash_policy.cc execution test
FAIL: 23_containers/unordered_set/max_load_factor/robustness.cc execution test
FAIL: 23_containers/unordered_set/requirements/exception/basic.cc execution test
FAIL: 23_containers/unordered_set/requirements/exception/propagation_consistent.cc execution test
FAIL: 23_containers/vector/capacity/2.cc execution test
FAIL: 23_containers/vector/cons/4.cc execution test
FAIL: 23_containers/vector/modifiers/push_back/strong_guarantee.cc execution test
FAIL: 23_containers/vector/requirements/exception/basic.cc execution test
FAIL: 23_containers/vector/requirements/exception/propagation_consistent.cc execution test
FAIL: 25_algorithms/stable_sort/mem_check.cc execution test
FAIL: 27_io/basic_istream/exceptions/char/9561.cc execution test
FAIL: 

Re: [PATCH #2] Add PowerPC ISA 3.0 word splat and byte immediate splat support

2016-05-18 Thread Michael Meissner
On Wed, May 18, 2016 at 06:53:51AM -0500, Segher Boessenkool wrote:
> On Tue, May 17, 2016 at 07:08:52PM -0400, Michael Meissner wrote:
> > FWIW, the problem after subversion id 236136 shows up when the trunk compiler
> > is built with the host compiler (4.3.4).
> 
> That compiler is almost seven years old.  It would be interesting to find
> out what the oldest compiler that *does* work is.

FWIW, I can now build a non-bootstrap compiler (subversion id 236398) with the
4.3.4 host compiler, and it builds SPEC.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH 0/4] BRIG (HSAIL) frontend

2016-05-18 Thread Pekka Jääskeläinen
Hi Joseph,

Updated diffstat below:
 Makefile.def  | 3 +
 Makefile.in   |   489 +
 configure | 1 +
 configure.ac  | 1 +
 gcc/brig/Make-lang.in |   246 +
 gcc/brig/brig-c.h |68 +
 gcc/brig/brig-lang.c  |   454 +
 gcc/brig/brigfrontend/brig-arg-block-handler.cc   |67 +
 gcc/brig/brigfrontend/brig-atomic-inst-handler.cc |   377 +
 gcc/brig/brigfrontend/brig-basic-inst-handler.cc  |   733 +
 gcc/brig/brigfrontend/brig-branch-inst-handler.cc |   217 +
 gcc/brig/brigfrontend/brig-cmp-inst-handler.cc|   212 +
 gcc/brig/brigfrontend/brig-code-entry-handler.cc  |  2325 +++
 gcc/brig/brigfrontend/brig-code-entry-handler.h   |   449 +
 gcc/brig/brigfrontend/brig-comment-handler.cc |39 +
 gcc/brig/brigfrontend/brig-control-handler.cc |29 +
 .../brigfrontend/brig-copy-move-inst-handler.cc   |56 +
 gcc/brig/brigfrontend/brig-cvt-inst-handler.cc|   250 +
 gcc/brig/brigfrontend/brig-fbarrier-handler.cc|44 +
 gcc/brig/brigfrontend/brig-function-handler.cc|   373 +
 gcc/brig/brigfrontend/brig-function.cc|   698 +
 gcc/brig/brigfrontend/brig-function.h |   216 +
 gcc/brig/brigfrontend/brig-inst-mod-handler.cc|   168 +
 gcc/brig/brigfrontend/brig-label-handler.cc   |37 +
 gcc/brig/brigfrontend/brig-lane-inst-handler.cc   |83 +
 gcc/brig/brigfrontend/brig-machine.c  |37 +
 gcc/brig/brigfrontend/brig-machine.h  |35 +
 gcc/brig/brigfrontend/brig-mem-inst-handler.cc|   181 +
 gcc/brig/brigfrontend/brig-module-handler.cc  |30 +
 gcc/brig/brigfrontend/brig-queue-inst-handler.cc  |92 +
 gcc/brig/brigfrontend/brig-seg-inst-handler.cc|   134 +
 gcc/brig/brigfrontend/brig-signal-inst-handler.cc |42 +
 gcc/brig/brigfrontend/brig-util.cc|   348 +
 gcc/brig/brigfrontend/brig-util.h |49 +
 gcc/brig/brigfrontend/brig-variable-handler.cc|   256 +
 gcc/brig/brigfrontend/brig_to_generic.cc  |   773 +
 gcc/brig/brigfrontend/brig_to_generic.h   |   245 +
 gcc/brig/brigfrontend/phsa.h  |40 +
 gcc/brig/brigspec.c   |   193 +
 gcc/brig/config-lang.in   |41 +
 gcc/brig/lang-specs.h |28 +
 gcc/brig/lang.opt |41 +
 gcc/doc/frontends.texi| 2 +-
 gcc/doc/invoke.texi   | 4 +
 gcc/doc/standards.texi| 8 +
 gcc/testsuite/brig.dg/README  |10 +
 gcc/testsuite/brig.dg/dg.exp  |27 +
 gcc/testsuite/brig.dg/test/gimple/alloca.hsail|37 +
 gcc/testsuite/brig.dg/test/gimple/atomics.hsail   |33 +
 gcc/testsuite/brig.dg/test/gimple/branches.hsail  |58 +
 gcc/testsuite/brig.dg/test/gimple/fbarrier.hsail  |74 +
 .../brig.dg/test/gimple/function_calls.hsail  |59 +
 gcc/testsuite/brig.dg/test/gimple/mem.hsail   |39 +
 gcc/testsuite/brig.dg/test/gimple/mulhi.hsail |33 +
 gcc/testsuite/brig.dg/test/gimple/packed.hsail|78 +
 .../brig.dg/test/gimple/smoke_test.hsail  |91 +
 gcc/testsuite/brig.dg/test/gimple/variables.hsail |   124 +
 gcc/testsuite/brig.dg/test/gimple/vector.hsail|57 +
 gcc/testsuite/lib/brig-dg.exp |29 +
 gcc/testsuite/lib/brig.exp|40 +
 include/hsa-interface.h   |   630 +
 libhsail-rt/Makefile.am   |   123 +
 libhsail-rt/Makefile.in   |   721 +
 libhsail-rt/README| 4 +
 libhsail-rt/aclocal.m4|   979 +
 libhsail-rt/config.h.in   |   217 +
 libhsail-rt/configure | 17162 ++
 libhsail-rt/configure.ac  |   150 +
 libhsail-rt/include/internal/phsa-rt.h|97 +
 .../include/internal/phsa_queue_interface.h   |60 +
 libhsail-rt/include/internal/workitems.h  |   103 +
 libhsail-rt/m4/libtool.m4 |  7997 
 libhsail-rt/m4/ltoptions.m4   |   384 +
 libhsail-rt/m4/ltsugar.m4 |   123 +
 libhsail-rt/m4/ltversion.m4   |23 +
 libhsail-rt/m4/lt~obsolete.m4 |98 +
 libhsail-rt/m4/pth.m4 |   402 +
 libhsail-rt/rt/arithmetic.c   |   374 +
 libhsail-rt/rt/atomics.c  |   115 +
 libhsail-rt/rt/bitstring.c|   188 +
 libhsail-rt/rt/fbarrier.c |   

Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.

2016-05-18 Thread Pekka Jääskeläinen
Hi Joseph,

Thanks for the comments.  Updated patch attached. Hopefully I
didn't miss any diags.

On Wed, May 18, 2016 at 3:20 AM, Joseph Myers  wrote:
> This patch has many improperly formatted diagnostic messages (e.g.
> starting with capital letters, ending with '.' or failing to use %q for
> quoting).  I also note cases where you use %lu as a format for size_t,
> which is not correct (you'd need to add pretty-print.c support for %zu
> before you could use that, however).
>
> --
> Joseph S. Myers
> jos...@codesourcery.com
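
On the size_t point: %lu expects an unsigned long, and size_t is not guaranteed to be that type (on LLP64 targets, for instance, long is 32 bits while size_t is 64). Where the formatter does not understand %zu, one common workaround is an explicit cast to the type the format actually names. The sketch below uses the standard snprintf only to show the idea; format_count is a hypothetical helper, and GCC's own diagnostic formatter is a different API.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format COUNT into BUF using only %lu.
   The cast makes the argument type match the format; note that it
   would truncate on targets where size_t is wider than unsigned long.  */
int
format_count (char *buf, size_t buflen, size_t count)
{
  return snprintf (buf, buflen, "%lu items", (unsigned long) count);
}
```

With a formatter that supports %zu, the cast is unnecessary and the value can be passed as a size_t directly.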


002-brig-fe-new-files.patch.gz
Description: GNU Zip compressed data


Re: [PATCH 1/4] BRIG (HSAIL) frontend: configuration file changes and misc

2016-05-18 Thread Pekka Jääskeläinen
Hi,

Attached an updated patch (rebased + added .texi docs).

On Mon, May 16, 2016 at 8:25 PM, Pekka Jääskeläinen  wrote:
> The configuration file changes and misc. updates required
> by the BRIG frontend.
>
> Also, added include/hsa-interface.h which is hsa.h taken from libgomp
> and will be shared by it (agreed with Martin Liška / SUSE).
>
> --
> Pekka Jääskeläinen
> Parmance
The configuration file changes, documentation updates and other updates.

Also, added include/hsa-interface.h which is hsa.h taken from libgomp
and will be shared by it (agreed with Martin Liška / SUSE).



[PATCH, PR rtl-optimization/71148] Avoid cleanup_cfg called with invalidated dominance info

2016-05-18 Thread Ilya Enkovich
Hi,

This patch resolves PR71148 by releasing dominance info before
cleanup_cfg calls to avoid attempts to fixup invalid dominance
info.
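
The pattern in the patch is "if the graph changed and a cached analysis exists, drop the cache so that later code recomputes it and does not try to repair stale data". A self-contained sketch of that pattern follows; struct dom_cache, dom_cache_free and drop_stale_dom_cache are hypothetical names for this illustration, not GCC's dominance API.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical cached analysis attached to a graph.  */
struct dom_cache
{
  int *idom;       /* immediate dominator per block, or NULL */
  bool available;  /* true while the cache may be consulted */
};

/* Release the cached data and mark it unavailable.  */
static void
dom_cache_free (struct dom_cache *c)
{
  free (c->idom);
  c->idom = NULL;
  c->available = false;
}

/* Drop the cache when the graph was altered, so later passes recompute
   it from scratch.  Returns true if the cache was dropped.  */
static bool
drop_stale_dom_cache (struct dom_cache *c, bool graph_altered)
{
  if (graph_altered && c->available)
    {
      dom_cache_free (c);
      return true;
    }
  return false;
}
```

In the patch, the role of graph_altered is played by cse_cfg_altered, and the check is repeated before each of the three cleanup_cfg call sites.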

Dominance info handling in cleanup_cfg looks weird though.  It
tries to fix it but can invalidate it at the same time (PR71084).
We should probably do something with that.

The tracker entry is P1, and this patch may be an OK solution for now.

Bootstrapped and regtested on x86_64-pc-linux-gnu.  Ok for trunk?

Thanks,
Ilya
--
gcc/

2016-05-18  Ilya Enkovich  

PR rtl-optimization/71148
* cse.c (rest_of_handle_cse): Free dominance info
before cleanup_cfg call if required.
(rest_of_handle_cse2): Likewise.
(rest_of_handle_cse_after_global_opts): Likewise.

gcc/testsuite/

2016-05-18  Ilya Enkovich  

PR rtl-optimization/71148
* gcc.dg/pr71148.c: New test.


diff --git a/gcc/cse.c b/gcc/cse.c
index 322e352..4aa4443 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -7558,6 +7558,12 @@ rest_of_handle_cse (void)
  expecting CSE to be run.  But always rerun it in a cheap mode.  */
   cse_not_expected = !flag_rerun_cse_after_loop && !flag_gcse;
 
+  /* Check if we need to free dominance info before cleanup_cfg
+ because it may become really slow in case of invalid
+ dominance info.  */
+  if (cse_cfg_altered && dom_info_available_p (CDI_DOMINATORS))
+free_dominance_info (CDI_DOMINATORS);
+
   if (tem == 2)
 {
   timevar_push (TV_JUMP);
@@ -7630,6 +7636,12 @@ rest_of_handle_cse2 (void)
 
   delete_trivially_dead_insns (get_insns (), max_reg_num ());
 
+  /* Check if we need to free dominance info before cleanup_cfg
+ because it may become really slow in case of invalid
+ dominance info.  */
+  if (cse_cfg_altered && dom_info_available_p (CDI_DOMINATORS))
+free_dominance_info (CDI_DOMINATORS);
+
   if (tem == 2)
 {
   timevar_push (TV_JUMP);
@@ -7706,6 +7718,12 @@ rest_of_handle_cse_after_global_opts (void)
 
   cse_not_expected = !flag_rerun_cse_after_loop;
 
+  /* Check if we need to free dominance info before cleanup_cfg
+ because it may become really slow in case of invalid
+ dominance info.  */
+  if (cse_cfg_altered && dom_info_available_p (CDI_DOMINATORS))
+free_dominance_info (CDI_DOMINATORS);
+
   /* If cse altered any jumps, rerun jump opts to clean things up.  */
   if (tem == 2)
 {
diff --git a/gcc/testsuite/gcc.dg/pr71148.c b/gcc/testsuite/gcc.dg/pr71148.c
new file mode 100644
index 000..6aa4920
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr71148.c
@@ -0,0 +1,46 @@
+/* PR rtl-optimization/71148 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -funroll-loops" } */
+
+int rh, ok, kq, fu;
+
+void
+js (int cs)
+{
+  rh = fu;
+  if (fu != 0)
+{
+  cs /= 3;
+  if (cs <= 0)
+{
+  int z9;
+  for (z9 = 0; z9 < 2; ++z9)
+{
+  z9 += cs;
+  ok += z9;
+  fu += ok;
+}
+}
+}
+}
+
+void
+vy (int s3)
+{
+  int yo, g2 = 0;
+ sd:
+  js (g2);
+  for (yo = 0; yo < 2; ++yo)
+{
+  if (fu != 0)
+goto sd;
+  kq += (s3 != (g2 ? s3 : 0));
+  for (s3 = 0; s3 < 72; ++s3)
+g2 *= (~0 - 1);
+  g2 -= yo;
+}
+  for (fu = 0; fu < 18; ++fu)
+for (yo = 0; yo < 17; ++yo)
+  if (g2 < 0)
+goto sd;
+}


Re: [ARM] Enable __fp16 as a function parameter and return type.

2016-05-18 Thread Joseph Myers
On Wed, 18 May 2016, Matthew Wahab wrote:

> On 18/05/16 09:41, Ramana Radhakrishnan wrote:
> > On Mon, May 16, 2016 at 2:16 PM, Tejas Belagod
> >  wrote:
> > 
> > > 
> > > We do have plans to fix pre-ACLE behavior of fp16 to conform to current
> > > ACLE
> > > spec, but can't say when exactly.
> > 
> > Matthew, could you please take a look at this while you are in this area ?
> 
> Ok.

FWIW, the obvious (to me) approach to doing the conversion without double 
rounding issues while properly respecting exceptions and rounding modes 
would be to set a sticky bit in the double value and ensure its precision 
is no more than that of float before converting to float.  Something like 
(example for little-endian, untested):

union { double d; struct { uint32_t lo, hi; } r; } x;
__fp16 ret;

if (x.r.lo) x.r.hi |= 1;
x.r.lo = 0;
ret = (__fp16) (float) x.d;

By using floating point for the final conversion, you ensure it respects 
the rounding mode and produces the proper exceptions.
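
An endian-neutral variant of the same idea can be sketched as below. This is only a sketch under assumptions: the function name fold_low_bits_to_sticky is made up for illustration, IEEE binary64 doubles are assumed, NaNs and values outside float's exponent range are not considered, and the final (__fp16) cast is left out because __fp16 is not available on every target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): fold the low 32 mantissa
   bits of a double into one sticky bit, the lowest bit of the high
   word, and then clear them.  Operating on the 64-bit pattern through
   memcpy avoids the little-endian assumption of the union version.  */
double
fold_low_bits_to_sticky (double d)
{
  uint64_t bits;

  memcpy (&bits, &d, sizeof bits);
  if (bits & UINT64_C (0xffffffff))
    bits |= UINT64_C (1) << 32;       /* remember that something was lost */
  bits &= ~UINT64_C (0xffffffff);     /* at most 21 significant bits remain */
  memcpy (&d, &bits, sizeof d);
  return d;
}
```

For values within float's normal range, the folded value has at most 21 significant bits, so converting it to float is exact and the only rounding happens in the final float-to-fp16 step.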

-- 
Joseph S. Myers
jos...@codesourcery.com


[C++ Patch/RFC] PR 70572 ("[4.9/5/6/7 Regression] ICE on code with decltype (auto) on x86_64-linux-gnu in digest_init_r")

2016-05-18 Thread Paolo Carlini

Hi,

this issue should be easy to fix. Broken code like:

void foo ()
{
  decltype (auto) a = foo;
}

triggers the gcc_assert in digest_init_r:

  /* Come here only for aggregates: records, arrays, unions, complex numbers
     and vectors.  */
  gcc_assert (TREE_CODE (type) == ARRAY_TYPE
  || VECTOR_TYPE_P (type)
  || TREE_CODE (type) == RECORD_TYPE
  || TREE_CODE (type) == UNION_TYPE
  || TREE_CODE (type) == COMPLEX_TYPE);

because of course TREE_CODE (type) == FUNCTION_TYPE, none of the above. 
I said should be easy to fix because in fact convert_for_initialization 
is perfectly able to handle these cases and emit proper diagnostic, if 
called. What shall we do then? The patchlet below passes testing but we 
could also relax the gcc_assert itself, include FUNCTION_TYPE 
with/without checking cxx_dialect >= cxx14. We could drop the latter 
check in my patchlet. Or something else entirely.


Thanks!
Paolo.


Index: cp/typeck2.c
===
--- cp/typeck2.c(revision 236400)
+++ cp/typeck2.c(working copy)
@@ -1074,10 +1074,14 @@ digest_init_r (tree type, tree init, bool nested,
}
 }
 
-  /* Handle scalar types (including conversions) and references.  */
+  /* Handle scalar types (including conversions) and references.
+ Also handle cases of erroneous C++14 code involving function types
+ like (c++/70572): void foo () { decltype (auto) a = foo; }
+ and get a proper error message from convert_for_initialization.  */
   if ((TREE_CODE (type) != COMPLEX_TYPE
|| BRACE_ENCLOSED_INITIALIZER_P (init))
-  && (SCALAR_TYPE_P (type) || code == REFERENCE_TYPE))
+  && (SCALAR_TYPE_P (type) || code == REFERENCE_TYPE
+ || (TREE_CODE (type) == FUNCTION_TYPE && cxx_dialect >= cxx14)))
 {
   if (nested)
flags |= LOOKUP_NO_NARROWING;
Index: testsuite/g++.dg/cpp1y/auto-fn31.C
===
--- testsuite/g++.dg/cpp1y/auto-fn31.C  (revision 0)
+++ testsuite/g++.dg/cpp1y/auto-fn31.C  (working copy)
@@ -0,0 +1,7 @@
+// PR c++/70572
+// { dg-do compile { target c++14 } }
+
+void foo ()
+{
+  decltype (auto) a = foo;  // { dg-error "cannot convert" }
+}


Re: [C++ Patch] PR 69793 ("ICE on invalid code in "cp_lexer_peek_nth_token"")

2016-05-18 Thread Paolo Carlini

Hi,

On 18/05/2016 17:17, Jason Merrill wrote:

On 05/18/2016 11:05 AM, Paolo Carlini wrote:

On 18/05/2016 16:39, Jason Merrill wrote:

On 05/17/2016 04:47 PM, Paolo Carlini wrote:

this ICE during error recovery exposes a rather more general weakness:
we should never call cp_lexer_peek_nth_token (*, 2) when a previous
cp_lexer_peek_token returns CPP_EOF.



Hmm, that seems fragile, I would expect it to keep returning EOF.



Indeed. I didn't explain myself well enough. I meant something along the
lines: outside this specific and minor case of ICE during error
recovery, we should audit our code and keep in mind that calling
cp_lexer_peek_nth_token (*, anything > 1, the common case) right after
cp_lexer_peek_token is, how shall I put it, "suspect", due to that
assert at the beginning of cp_lexer_peek_nth_token.


I understood that, but I think that assert should be replaced with 
code to properly handle that case.

Ah yes, that comment before cp_lexer_peek_nth_token. I read it 
yesterday and got interested, but afterward focused on the specific issue 
in the bug report, which seemed easy to fix.


Paolo.


RE: [Patch V2] Fix SLP PR58135.

2016-05-18 Thread Kumar, Venkataramanan
Hi Richard,

> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Tuesday, May 17, 2016 5:40 PM
> To: Kumar, Venkataramanan 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Patch V2] Fix SLP PR58135.
> 
> On Tue, May 17, 2016 at 1:56 PM, Kumar, Venkataramanan
>  wrote:
> > Hi Richard,
> >
> > I created the patch by passing -b option to git. Now the patch is more
> readable.
> >
> > As per your suggestion I tried to fix the PR by splitting the SLP store 
> > group at
> vector boundary after the SLP tree is built.
> >
> > Boot strap PASSED on x86_64.
> > Checked the patch with check_GNU_style.sh.
> >
> > The gfortran.dg/pr46519-1.f test now does SLP vectorization. Hence it
> generated 2 more vzeroupper.
> > As recommended I adjusted the test case by adding -fno-tree-slp-vectorize
> to make it as expected after loop vectorization.
> >
> > The following tests are now passing.
> >
> > -- Snip-
> > Tests that now work, but didn't before:
> >
> > gcc.dg/vect/bb-slp-19.c -flto -ffat-lto-objects  scan-tree-dump-times
> > slp2 "basic block vectorized" 1
> >
> > gcc.dg/vect/bb-slp-19.c scan-tree-dump-times slp2 "basic block
> > vectorized" 1
> >
> > New tests that PASS:
> >
> > gcc.dg/vect/pr58135.c (test for excess errors) gcc.dg/vect/pr58135.c
> > -flto -ffat-lto-objects (test for excess errors)
> >
> > -- Snip-
> >
> > ChangeLog
> >
> > 2016-05-14  Venkataramanan Kumar
> 
> >  PR tree-optimization/58135
> > * tree-vect-slp.c:  When group size is not multiple of vector size,
> >  allow splitting of store group at vector boundary.
> >
> > Test suite  ChangeLog
> > 2016-05-14  Venkataramanan Kumar
> 
> > * gcc.dg/vect/bb-slp-19.c:  Remove XFAIL.
> > * gcc.dg/vect/pr58135.c:  Add new.
> > * gfortran.dg/pr46519-1.f: Adjust test case.
> >
> > The attached patch Ok for trunk?
> 
> 
> Please avoid the excessive vertical space around the vect_build_slp_tree call.
Yes fixed in the attached patch.
> 
> +  /* Calculate the unrolling factor.  */
> +  unrolling_factor = least_common_multiple
> + (nunits, group_size) / group_size;
> ...
> +  else
> {
>   /* Calculate the unrolling factor based on the smallest type.  */
>   if (max_nunits > nunits)
> -unrolling_factor = least_common_multiple (max_nunits, group_size)
> -   / group_size;
> +   unrolling_factor
> +   = least_common_multiple (max_nunits,
> + group_size)/group_size;
> 
> please compute the "correct" unroll factor immediately and move the
> "unrolling of BB required" error into the if() case by post-poning the nunits 
> <
> group_size check (and use max_nunits here).
> 
Yes fixed in the attached patch.

> +  if (is_a  (vinfo)
> + && nunits < group_size
> + && unrolling_factor != 1
> + && is_a  (vinfo))
> +   {
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +  "Build SLP failed: store group "
> +  "size not a multiple of the vector size "
> +  "in basic block SLP\n");
> + /* Fatal mismatch.  */
> + matches[nunits] = false;
> 
> this is too pessimistic - you want to add the extra 'false' at group_size /
> max_nunits * max_nunits.
Yes fixed in attached patch. 

> 
> It looks like you leak 'node' in the if () path as well.  You need
> 
>   vect_free_slp_tree (node);
>   loads.release ();
> 
> thus treat it as a failure case.

Yes fixed. I added an else part before scalar_stmts.release call for the case 
when SLP tree is not built. This avoids double freeing.
Bootstrapped and reg tested on X86_64.

Ok for trunk ?
> 
> Thanks,
> Richard.
> 
> > Regards,
> > Venkat.
> >

Regards,
Venkat.
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-19.c b/gcc/testsuite/gcc.dg/vect/bb-slp-19.c
index 42cd294..c282155 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-19.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-19.c
@@ -53,5 +53,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp2"  { xfail *-*-* }  } } */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp2" } } */
   
diff --git a/gcc/testsuite/gcc.dg/vect/pr58135.c b/gcc/testsuite/gcc.dg/vect/pr58135.c
new file mode 100644
index 000..ca25000
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr58135.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+int a[100];
+void foo ()
+{
+  a[0] = a[1] = a[2] = a[3] = a[4]= 0;
+}
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp2" } } */
diff --git a/gcc/testsuite/gfortran.dg/pr46519-1.f b/gcc/testsuite/gfortran.dg/pr46519-1.f
index 51c64b8..46be9f5 100644
--- a/gcc/testsuite/gfortran.dg/pr46519-1.f

Re: [C++ Patch] PR 69793 ("ICE on invalid code in "cp_lexer_peek_nth_token"")

2016-05-18 Thread David Malcolm
On Wed, 2016-05-18 at 17:05 +0200, Paolo Carlini wrote:
> Hi,
> 
> On 18/05/2016 16:39, Jason Merrill wrote:
> > On 05/17/2016 04:47 PM, Paolo Carlini wrote:
> > > this ICE during error recovery exposes a rather more general
> > > weakness:
> > > we should never call cp_lexer_peek_nth_token (*, 2) when a
> > > previous
> > > cp_lexer_peek_token returns CPP_EOF.
> > 
> > Hmm, that seems fragile, I would expect it to keep returning EOF.
> Indeed. I didn't explain myself well enough. I meant something along
> the 
> lines: outside this specific and minor case of ICE during error 
> recovery, we should audit our code and keep in mind that calling 
> cp_lexer_peek_nth_token (*, anything > 1, the common case) right
> after 
> cp_lexer_peek_token is, how shall I put it, "suspect", due to that 
> assert at the beginning of cp_lexer_peek_nth_token.

Thinking aloud, I wonder if a plugin could detect this kind of thing? -
verify that any call of cp_lexer_peek_nth_token for N > 1 happens after
the result of cp_lexer_peek_token has been checked for EOF.

(another idea might be a fault-injection option for the C++ frontend,
perhaps simulating EOF at a given token index, and then using this,
somehow, to torture-test the existing test cases, but that seems like a
*lot* of extra parsing; static checking seems more efficient).



Re: [PATCH 8/17][ARM] Add VFP FP16 arithmetic instructions.

2016-05-18 Thread Joseph Myers
On Wed, 18 May 2016, Matthew Wahab wrote:

> AArch64 follows IEEE-754 but ARM (AArch32) adds restrictions like
> flush-to-zero that could affect the outcome of a calculation.

The result of a float computation on two values immediately promoted from 
fp16 cannot be within the subnormal range for float.  Thus, only one flush 
to zero can happen, on the final conversion back to fp16, and that cannot 
make the result different from doing direct arithmetic in fp16 (assuming 
flush to zero affects conversion from float to fp16 the same way it 
affects direct fp16 arithmetic).
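
The range argument can be checked with a few lines of C. This is a rough sketch, assuming IEEE binary16 limits (smallest positive value 2^-24, largest finite value 65504) and IEEE binary32 for float; the macro and function names are invented for illustration.

```c
#include <float.h>

/* binary16 limits, written as float constants (assumed IEEE binary16).  */
#define FP16_MIN_SUBNORMAL 0x1p-24f   /* smallest positive fp16 value */
#define FP16_MAX           65504.0f   /* largest finite fp16 value */

/* Return 1 if the smallest nonzero magnitudes that a single float
   division or multiplication of two promoted fp16 values can produce
   are still normal floats, i.e. not below FLT_MIN.  */
int
fp16_binop_stays_float_normal (void)
{
  float min_quotient = FP16_MIN_SUBNORMAL / FP16_MAX;           /* about 2^-40 */
  float min_product = FP16_MIN_SUBNORMAL * FP16_MIN_SUBNORMAL;  /* 2^-48 */

  return min_quotient >= FLT_MIN && min_product >= FLT_MIN;
}
```

Sums and differences of promoted fp16 values are either zero or at least 2^-24 in magnitude, so they stay clear of float's subnormal range as well.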

> > So I'd expect e.g.
> > 
> > __fp16 a, b;
> > __fp16 c = a / b;
> > 
> > to generate the new instructions, because direct binary16 arithmetic is a
> > correct implementation of (__fp16) ((float) a / (float) b).
> 
> Something like
> 
> __fp16 a, b, c;
> __fp16 d = (a / b) * c;
> 
> would be done as the sequence of single precision operations:
> 
> vcvtb.f32.f16 s0, s0
> vcvtb.f32.f16 s1, s1
> vcvtb.f32.f16 s2, s2
> vdiv.f32 s15, s0, s1
> vmul.f32 s0, s15, s2
> vcvtb.f16.f32 s0, s0
> 
> Doing this with vdiv.f16 and vmul.f16 could change the calculated result
> because the flush-to-zero rule is related to operation precision so affects
> the value of a vdiv.f16 differently from the vdiv.f32.

Flush to zero is irrelevant here, since that sequence of three operations 
also cannot produce anything in the subnormal range for float.  (It's true 
that double rounding is relevant for your example and so converting it to 
direct fp16 arithmetic would not be safe for that reason.)

That example is also not relevant to my point.  In my example

> > __fp16 a, b;
> > __fp16 c = a / b;

it's already the case that GCC will (a) promote to float, because the 
target hooks say to do so, (b) notice that the result is immediately 
converted back to fp16, and that this means fp16 arithmetic could be used 
directly, and so adjust it back to fp16 arithmetic (see convert_to_real_1, 
and the call therein to real_can_shorten_arithmetic which knows conditions 
under which it's safe to change such promoted arithmetic back to 
arithmetic on a narrower type).  Then the expanders (I think) notice the 
lack of direct HFmode arithmetic and so put the widening / narrowing back 
again.

But in your example, *because* doing it with direct fp16 arithmetic would 
not be equivalent, convert_to_real_1 would not eliminate the conversions 
to float, the float operations would still be present at expansion time, 
and so direct HFmode arithmetic patterns would not match.

In short: instructions for direct HFmode arithmetic should be described 
with patterns with the standard names.  It's the job of the 
architecture-independent compiler to ensure that fp16 arithmetic in the 
user's source code only generates direct fp16 arithmetic in GIMPLE (and 
thus ends up using those patterns) if that is a correct representation of 
the source code's semantics according to ACLE.

The intrinsics you provide can then be written to use direct arithmetic, 
and rely on convert_to_real_1 eliminating the promotions, rather than 
needing built-in functions at all, just like many arm_neon.h intrinsics 
make direct use of GNU C vector arithmetic.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [C++ Patch] PR 69793 ("ICE on invalid code in "cp_lexer_peek_nth_token"")

2016-05-18 Thread Jason Merrill

On 05/18/2016 11:05 AM, Paolo Carlini wrote:

On 18/05/2016 16:39, Jason Merrill wrote:

On 05/17/2016 04:47 PM, Paolo Carlini wrote:

this ICE during error recovery exposes a rather more general weakness:
we should never call cp_lexer_peek_nth_token (*, 2) when a previous
cp_lexer_peek_token returns CPP_EOF.



Hmm, that seems fragile, I would expect it to keep returning EOF.



Indeed. I didn't explain myself well enough. I meant something along the
lines: outside this specific and minor case of ICE during error
recovery, we should audit our code and keep in mind that calling
cp_lexer_peek_nth_token (*, anything > 1, the common case) right after
cp_lexer_peek_token is, how shall I put it, "suspect", due to that
assert at the beginning of cp_lexer_peek_nth_token.


I understood that, but I think that assert should be replaced with code 
to properly handle that case.
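
The behaviour being asked for (peeking past the end keeps yielding EOF instead of asserting) can be illustrated with a toy lexer. This is a hypothetical sketch only: the toy_* names are invented and this is not the cp_lexer code or its data structures.

```c
#include <stddef.h>

enum toy_token { TOY_NAME, TOY_NUMBER, TOY_EOF };

struct toy_lexer
{
  const enum toy_token *tokens;  /* always terminated by TOY_EOF */
  size_t next;                   /* index of the next unread token */
};

/* Peek at the Nth upcoming token, counting from 1.  Once TOY_EOF is
   reached the scan stops there, so any larger N also yields TOY_EOF
   rather than reading past the end of the buffer.  */
enum toy_token
toy_peek_nth_token (const struct toy_lexer *lexer, size_t n)
{
  size_t i = lexer->next;

  for (size_t k = 1; k < n && lexer->tokens[i] != TOY_EOF; k++)
    i++;
  return lexer->tokens[i];
}
```

With this shape, a caller that peeks at token 2 after having already seen EOF at token 1 simply gets EOF again.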


Jason



Re: [PATCH PR69848/partial]Propagate comparison into VEC_COND_EXPR if target supports

2016-05-18 Thread Bin.Cheng
On Tue, May 17, 2016 at 12:08 PM, Richard Biener
 wrote:
> On Mon, May 16, 2016 at 10:09 AM, Bin.Cheng  wrote:
>> On Fri, May 13, 2016 at 5:53 PM, Richard Biener
>>  wrote:
>>> On May 13, 2016 6:02:27 PM GMT+02:00, Bin Cheng  wrote:
Hi,
As PR69848 reported, GCC vectorizer now generates comparison outside of
VEC_COND_EXPR for COND_REDUCTION case, as below:

  _20 = vect__1.6_8 != { 0, 0, 0, 0 };
  vect_c_2.8_16 = VEC_COND_EXPR <_20, { 0, 0, 0, 0 }, vect_c_2.7_13>;
  _21 = VEC_COND_EXPR <_20, ivtmp_17, _19>;

This results in inefficient expanding.  With IR like:

  vect_c_2.8_16 = VEC_COND_EXPR <vect__1.6_8 != { 0, 0, 0, 0 }, { 0, 0, 0, 0 }, vect_c_2.7_13>;
  _21 = VEC_COND_EXPR <vect__1.6_8 != { 0, 0, 0, 0 }, ivtmp_17, _19>;

We can do:
1) Expanding time optimization, for example, reverting comparison
operator by switching VEC_COND_EXPR operands.  This is useful when
backend only supports some comparison operators.
2) For backends not supporting vcond_mask patterns, saving one LT_EXPR
instruction which is introduced by expand_vec_cond_expr.

This patch fixes this by propagating comparison into VEC_COND_EXPR even
if it's used multiple times.  For now, GCC does single_use_only
propagation.  Ideally, we may duplicate the comparison before each use
statement just before expanding, so that TER can successfully backtrack
it from each VEC_COND_EXPR.  Unfortunately I didn't find a good pass to
do this.  Tree-vect-generic.c looks like a good candidate, but it's so
early that following CSE could undo the transform.  Another possible
fix is to generate comparison inside VEC_COND_EXPR directly in function
vectorizable_reduction.
>>>
>>> I prefer this for now.
>> Hi Richard, you mean this patch, or the possible fix before your comment?
>
> The possible fix before my comment - make the vectorizer generate 
> VEC_COND_EXPRs
> with embedded comparison.
Hi,
Here is updated patch doing that.  It's definitely clearer than the
original version.
Bootstrap and test on x86_64.  Also checked the expanding time
optimization still happens.  Is it OK?
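
For readers unfamiliar with COND_REDUCTION, the kind of source loop involved looks roughly like the sketch below. It is an illustrative assumption, not the testcase from PR69848, and whether GCC actually vectorizes it depends on the target and options.

```c
/* Return the index of the last element of A[0..N-1] that is below
   LIMIT, or -1 if there is none.  The conditional update of LAST is
   the reduction; when vectorized, both the data select and the index
   select depend on the same comparison.  */
int
last_index_below (const int *a, int n, int limit)
{
  int last = -1;

  for (int i = 0; i < n; i++)
    if (a[i] < limit)
      last = i;
  return last;
}
```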

Thanks,
bin
>
> Thanks,
> Richard.
>
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index d673c67..67053af 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6159,21 +6159,14 @@ vectorizable_reduction (gimple *stmt, 
gimple_stmt_iterator *gsi,
 Finally, we update the phi (NEW_PHI_TREE) to take the value of
 the new cond_expr (INDEX_COND_EXPR).  */
 
- /* Turn the condition from vec_stmt into an ssa name.  */
- gimple_stmt_iterator vec_stmt_gsi = gsi_for_stmt (*vec_stmt);
- tree ccompare = gimple_assign_rhs1 (*vec_stmt);
- tree ccompare_name = make_ssa_name (TREE_TYPE (ccompare));
- gimple *ccompare_stmt = gimple_build_assign (ccompare_name,
-  ccompare);
- gsi_insert_before (&vec_stmt_gsi, ccompare_stmt, GSI_SAME_STMT);
- gimple_assign_set_rhs1 (*vec_stmt, ccompare_name);
- update_stmt (*vec_stmt);
+ /* Duplicate the condition from vec_stmt.  */
+ tree ccompare = unshare_expr (gimple_assign_rhs1 (*vec_stmt));
 
  /* Create a conditional, where the condition is taken from vec_stmt
-(CCOMPARE_NAME), then is the induction index (INDEX_BEFORE_INCR)
-and else is the phi (NEW_PHI_TREE).  */
+(CCOMPARE), then is the induction index (INDEX_BEFORE_INCR) and
+else is the phi (NEW_PHI_TREE).  */
  tree index_cond_expr = build3 (VEC_COND_EXPR, cr_index_vector_type,
-ccompare_name, indx_before_incr,
+ccompare, indx_before_incr,
 new_phi_tree);
  cond_name = make_ssa_name (cr_index_vector_type);
  gimple *index_condition = gimple_build_assign (cond_name,


Re: [C++ Patch] PR 69793 ("ICE on invalid code in "cp_lexer_peek_nth_token"")

2016-05-18 Thread Paolo Carlini

Hi,

On 18/05/2016 16:39, Jason Merrill wrote:

On 05/17/2016 04:47 PM, Paolo Carlini wrote:

this ICE during error recovery exposes a rather more general weakness:
we should never call cp_lexer_peek_nth_token (*, 2) when a previous
cp_lexer_peek_token returns CPP_EOF.


Hmm, that seems fragile, I would expect it to keep returning EOF.
Indeed. I didn't explain myself well enough. I meant something along the 
lines: outside this specific and minor case of ICE during error 
recovery, we should audit our code and keep in mind that calling 
cp_lexer_peek_nth_token (*, anything > 1, the common case) right after 
cp_lexer_peek_token is, how shall I put it, "suspect", due to that 
assert at the beginning of cp_lexer_peek_nth_token.

But your patch is OK.

Thanks. I'm going to commit it then.

Paolo.


Re: [C++ Patch] PR 69793 ("ICE on invalid code in "cp_lexer_peek_nth_token"")

2016-05-18 Thread Jason Merrill

On 05/17/2016 04:47 PM, Paolo Carlini wrote:

this ICE during error recovery exposes a rather more general weakness:
we should never call cp_lexer_peek_nth_token (*, 2) when a previous
cp_lexer_peek_token returns CPP_EOF.


Hmm, that seems fragile, I would expect it to keep returning EOF.

But your patch is OK.

Jason



Re: [ARM] Enable __fp16 as a function parameter and return type.

2016-05-18 Thread Ramana Radhakrishnan
On 18/05/16 15:33, Matthew Wahab wrote:
> On 18/05/16 09:41, Ramana Radhakrishnan wrote:
>> On Mon, May 16, 2016 at 2:16 PM, Tejas Belagod
>>  wrote:
>>
>>>
>>> We do have plans to fix pre-ACLE behavior of fp16 to conform to current ACLE
>>> spec, but can't say when exactly.
>>
>> Matthew, could you please take a look at this while you are in this area ?
> 
> Ok.
> 
> Part of this is likely to involve removing TARGET_CONVERT_TO_TYPE from the 
> ARM backend. grep doesn't show anywhere that uses this hook, the only other 
> occurrence is in the ARC backend. Does the hook ever get used?
> 
> Matthew
> 
> 

The use in the front-ends / mid-end is usually in lower case - thus grepping 
for convert_to_type will give you the answer.

grep -r convert_to_type *

gcc/jit/jit-playback.c:  t_ret = targetm.convert_to_type (t_dst_type, t_expr);
gcc/cp/cvt.c:  e1 = targetm.convert_to_type (type, e);
gcc/c/c-convert.c:  ret = targetm.convert_to_type (type, expr);
gcc/config/arm/arm.c:static tree arm_convert_to_type (tree type, tree expr);
gcc/config/arm/arm.c:#define TARGET_CONVERT_TO_TYPE arm_convert_to_type
gcc/config/arm/arm.c:arm_convert_to_type (tree type, tree expr)
gcc/config/avr/avr.c:avr_convert_to_type (tree type, tree expr)
gcc/config/avr/avr.c:#define TARGET_CONVERT_TO_TYPE avr_convert_to_type

I don't think you can poison this as the avr backend appears to use this rather 
than arc.
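
The reason the upper-case name rarely shows up at call sites can be sketched with a toy hook table. Everything named toy_* below is hypothetical and greatly simplified; it only mirrors the general shape of a macro-initialized table of function pointers.

```c
/* Toy stand-in for the hook table; the real structure is much larger.  */
struct toy_target_hooks
{
  int (*convert_to_type) (int type, int expr);
};

/* Backend implementation of the hook.  */
static int
toy_backend_convert_to_type (int type, int expr)
{
  return type + expr;
}

/* The backend names its implementation through an upper-case macro...  */
#define TOY_TARGET_CONVERT_TO_TYPE toy_backend_convert_to_type

/* ...and the macro is expanded only once, in the table initializer.  */
struct toy_target_hooks toy_targetm = { TOY_TARGET_CONVERT_TO_TYPE };

/* Generic code reaches the hook through the lower-case field name.  */
int
toy_frontend_convert (int type, int expr)
{
  return toy_targetm.convert_to_type (type, expr);
}
```

A search for the upper-case macro therefore finds only the backend definitions, while the callers show up under the lower-case field name.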


regards
Ramana


Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x

2016-05-18 Thread H.J. Lu
On Wed, May 4, 2016 at 6:57 PM, kugan  wrote:
> Hi Richard,
>
>>
>> maybe insert_stmt_after will help here, I don't think you got the
>> insertion
>> logic correct, thus insert_stmt_after (mul_stmt, def_stmt) which I think
>> misses GIMPLE_NOP handling.  At least
>>
>> +  if (SSA_NAME_VAR (op) != NULL
>>
>> huh?  I suppose you could have tested SSA_NAME_IS_DEFAULT_DEF
>> but just the GIMPLE_NOP def-stmt test should be enough.
>>
>> + && gimple_code (def_stmt) == GIMPLE_NOP)
>> +   {
>> + gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN
>> (cfun)));
>> + stmt = gsi_stmt (gsi);
>> + gsi_insert_before (&gsi, mul_stmt, GSI_NEW_STMT);
>>
>> not sure if that is the best insertion point choice, it un-does all
>> code-sinking done
>> (and no further sinking is run after the last reassoc pass).  We do know
>> we
>> are handling all uses of op in our chain so inserting before the plus-expr
>> chain root should work here (thus 'stmt' in the caller context).  I'd
>> use that here instead.
>> I think I'd use that unconditionally even if it works and not bother
>> finding something
>> more optimal.
>>
>
> I now tried using insert_stmt_after with special handling for GIMPLE_PHI as
> you described.
>
>
>> Apart from this this now looks ok to me.
>>
>> But the testcases need some work
>>
>>
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
>> @@ -0,0 +1,29 @@
>> +/* { dg-do compile } */
>> ...
>> +
>> +/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
>>
>> I would have expected 3.
>
>
> We now have an additional _15 = x_1(D) * 2
>
>   Also please check for \\\* 5 for example
>>
>> to be more specific (and change the cases so you get different constants
>> for the different functions).
>
>
>>
>> That said, please make the scans more specific.
>
>
> I have now changed the test cases to use more specific multiplication scans,
> as you wanted.
>
>
> Does this now look better?

It caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71172


-- 
H.J.


Re: [ARM] Enable __fp16 as a function parameter and return type.

2016-05-18 Thread Matthew Wahab

On 18/05/16 09:41, Ramana Radhakrishnan wrote:

On Mon, May 16, 2016 at 2:16 PM, Tejas Belagod
 wrote:



We do have plans to fix pre-ACLE behavior of fp16 to conform to current ACLE
spec, but can't say when exactly.


Matthew, could you please take a look at this while you are in this area ?


Ok.

Part of this is likely to involve removing TARGET_CONVERT_TO_TYPE from the ARM 
backend. grep doesn't show anywhere that uses this hook, the only other occurrence is 
in the ARC backend. Does the hook ever get used?


Matthew




Re: [C++ Patch] PR 70466 ("ICE on invalid code in tree check: expected constructor, have parm_decl in convert_like_real...")

2016-05-18 Thread Jason Merrill

On 05/18/2016 10:22 AM, Paolo Carlini wrote:

Hi,

On 18/05/2016 16:08, Jason Merrill wrote:

On 05/17/2016 05:57 PM, Paolo Carlini wrote:

On 17/05/2016 20:15, Jason Merrill wrote:

On 05/17/2016 04:47 AM, Paolo Carlini wrote:

... alternately, if the substance of my patchlet is right, we could
simplify a bit the logic per the below.


Here's a well-formed variant that was accepted by 4.5.  Does your
patch fix it?  I also think with your patch we can drop the C++11
check, since list-initialization doesn't exist in C++98.

Oh nice, the new testcase indeed passes with my patch. However, removing
completely C++11 check causes a regression in c++98 mode for
init/explicit1.C, we start warning for it:


Ah, that makes sense.  Your patch is OK, then.

Committed. Since you noticed that actually this is a regression, please
let me know in which branches we want to fix it, I would guess at least
gcc-6-branch too.


All the release branches, I think; it looks very safe.

Jason




Re: [Patch] PR rtl-optimization/71150, guard in_class_p check with REG_P

2016-05-18 Thread Vladimir Makarov

On 05/17/2016 06:02 AM, Jiong Wang wrote:

This bug was introduced by my commit r236181, where the inner rtx of a
SUBREG is not checked although it should be, because "in_class_p" only
works with REG, and SUBREG_REG is not always a REG.  If the REG_P
check fails, we should fall back to the normal code path. The
following simple testcase for x86 reproduces this bug.

long
foo (long a)
{
  return (unsigned) foo;
}

OK for trunk?


Yes.  Thank you, Jiong.

x86-64 bootstrap OK and no regression on check-gcc/g++.

2016-05-17  Jiong Wang  

gcc/
  PR rtl-optimization/71150
  * lra-constraint (process_addr_reg): Guard "in_class_p" with REG_P 
check.






Re: [C++ Patch] PR 70466 ("ICE on invalid code in tree check: expected constructor, have parm_decl in convert_like_real...")

2016-05-18 Thread Paolo Carlini

Hi,

On 18/05/2016 16:08, Jason Merrill wrote:

On 05/17/2016 05:57 PM, Paolo Carlini wrote:

On 17/05/2016 20:15, Jason Merrill wrote:

On 05/17/2016 04:47 AM, Paolo Carlini wrote:

... alternately, if the substance of my patchlet is right, we could
simplify a bit the logic per the below.


Here's a well-formed variant that was accepted by 4.5.  Does your
patch fix it?  I also think with your patch we can drop the C++11
check, since list-initialization doesn't exist in C++98.

Oh nice, the new testcase indeed passes with my patch. However, removing
completely C++11 check causes a regression in c++98 mode for
init/explicit1.C, we start warning for it:


Ah, that makes sense.  Your patch is OK, then.
Committed. Since you noticed that actually this is a regression, please 
let me know in which branches we want to fix it, I would guess at least 
gcc-6-branch too.


Thanks,
Paolo.


Re: [PATCH #2] Add PowerPC ISA 3.0 word splat and byte immediate splat support

2016-05-18 Thread Michael Meissner
On Wed, May 18, 2016 at 07:15:10AM -0500, Segher Boessenkool wrote:
> On Tue, May 17, 2016 at 06:45:49PM -0400, Michael Meissner wrote:
> > As I mentioned in the last message, my previous patch had some problems that
> > showed up on big endian systems, using RELOAD (one of the tests that failed 
> > was
> > the vshuf-v32qi.c test in the testsuite).  Little endian and IRA did 
> > compiled
> > the test fine.  This patch fixes the problem.  I went over the alternatives 
> > in
> > the vsx_mov_{32bit,64bit} patterns, and I removed the '*' constraints,
> > and checked all of the other constraints.
> 
> So those * is the only change?

In this last patch, I started with the constraints from the current
vsx_mov before the patch.  As documented in the ChangeLog, I made the
following changes:

1)  I collapsed the ,? cases to just  (i.e. eliminating the
idea in the move of a favored register constraint compared).

2)  Add XXSPLTIB support.

3)  Prefer using VSPLTIWS for creating 0/-1 over XXLXOR/XXLORC.

> 
> > The patches bootstrap and pass regression tests on both little endian power8
> > and big endian power7 systems.  Are these patches ok to install in the 
> > trunk?
> 
> This patch is okay for trunk.
> 
> > After a burn-in period, will they be ok to back port to the GCC 6.2 branch?
> 
> Yes.  Please make sure you test it there as well (BE/LE, p7/p8).

Yes.

> A few nits still:
> 
> > +(define_predicate "xxspltib_constant_split"
> > +  (match_code "const_vector,vec_duplicate,const_int")
> > +{
> > +  int value = 256;
> > +  int num_insns= -1;
> 
> Still has a tab character.

Fixed.

> > +(define_predicate "xxspltib_constant_nosplit"
> > +  (match_code "const_vector,vec_duplicate,const_int")
> > +{
> > +  int value = 256;
> > +  int num_insns= -1;
> 
> And here.

Fixed.

> > @@ -1024,6 +1068,10 @@ (define_predicate "splat_input_operand"
> > mode = V2DFmode;
> >else if (mode == DImode)
> > mode = V2DImode;
> > +  else if (mode == SImode && TARGET_P9_VECTOR)
> > +   mode = V4SImode;
> > +  else if (mode == SFmode && TARGET_P9_VECTOR)
> > +   mode = V4SFmode;
> 
> Trailing tabs (twice).

Fixed.

> > +;; VSX store  VSX load   VSX move  VSX->GPR   GPR->VSXLQ (GPR)
> > +;;  STQ (GPR)  GPR load   GPR store GPR move   XXSPLTIB
> > VSPLTISW
> > +;; VSX 0/-1   GPR 0/-1   VMX const GPR const  LVX (VMX)   STVX 
> > (VMX)
> > +(define_insn "*vsx_mov_64bit"
> > +  [(set (match_operand:VSX_M 0 "nonimmediate_operand"
> > +   "=ZwO,  , , r, we,?wQ,
> > +   ?,   ??r,   ??Y,   ??r,   wo,v,
> > +   ?,*r,v, ??r,   wZ,v")
> > +
> > +   (match_operand:VSX_M 1 "input_operand" 
> > +   ", ZwO,   , we,r, r,
> > +   wQ,Y, r, r, wE,jwM,
> > +   ?jwM,  jwM,   W, W, v, wZ"))]
> > +
> > +  "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (mode)
> > +   && (register_operand (operands[0], mode) 
> > +   || register_operand (operands[1], mode))"
> > +{
> > +  return rs6000_output_move_128bit (operands);
> > +}
> > +  [(set_attr "type"
> > +   "vecstore,  vecload,   vecsimple, mffgpr,mftgpr,
> > load,
> > +   store, load,  store, *, vecsimple, 
> > vecsimple,
> > +   vecsimple, *, *, *, vecstore,  vecload")
> > +
> > +   (set_attr "length"
> > +   "4, 4, 4, 8, 4, 8,
> > +   8, 8, 8, 8, 4, 4,
> > +   4, 8, 20,20,4, 4")])
> 
> Some of these lines are indented with spaces instead of tabs, please fix.
> Looks great otherwise, thanks!

Fixed.

> > +;; V4SI splat (ISA 3.0)
> > +;; When SI's are allowed in VSX registers, add XXSPLTW support
> > +(define_expand "vsx_splat_"
> > +  [(set (match_operand:VSX_W 0 "vsx_register_operand" "")
> > +   (vec_duplicate:VSX_W
> > +(match_operand: 1 "splat_input_operand" "")))]
> 
> You can leave off the default arg "" nowadays.  I know we're not consistent
> in that.

I tend to be conservative, since I do have to backport patches to older
branches.

Here is the patch that I checked in as subversion id 236394:

[gcc]
2016-05-18  Michael Meissner  

PR target/70915
* config/rs6000/constraints.md (wE constraint): New constraint
for a vector constant that can be loaded with XXSPLTIB.
(wM constraint): New constraint for a vector constant of all 1's.
(wS constraint): New constraint for a vector constant that can be
loaded with XXSPLTIB and a vector sign extend instruction.
* config/rs6000/predicates.md (xxspltib_constant_split): New
predicates for wE/wS constraints.

Re: [C++ Patch] PR 70466 ("ICE on invalid code in tree check: expected constructor, have parm_decl in convert_like_real...")

2016-05-18 Thread Jason Merrill

On 05/17/2016 05:57 PM, Paolo Carlini wrote:

On 17/05/2016 20:15, Jason Merrill wrote:

On 05/17/2016 04:47 AM, Paolo Carlini wrote:

... alternately, if the substance of my patchlet is right, we could
simplify the logic a bit, per the below.


Here's a well-formed variant that was accepted by 4.5.  Does your
patch fix it?  I also think with your patch we can drop the C++11
check, since list-initialization doesn't exist in C++98.

Oh nice, the new testcase indeed passes with my patch. However, completely
removing the C++11 check causes a regression in C++98 mode for
init/explicit1.C: we start warning for it:


Ah, that makes sense.  Your patch is OK, then.

Jason



Re: [PATCH, ARM 3/7, ping1] Fix indentation of FL_FOR_ARCH* definition after adding support for ARMv8-M

2016-05-18 Thread Kyrill Tkachov


On 18/05/16 14:45, Thomas Preudhomme wrote:

On Wednesday 18 May 2016 11:30:43 Kyrill Tkachov wrote:

Hi Thomas,

On 17/05/16 11:10, Thomas Preudhomme wrote:

Ping?

*** gcc/ChangeLog ***

2015-11-06  Thomas Preud'homme  

  * config/arm/arm-protos.h: Reindent FL_FOR_* macro definitions.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index
63235cb63acf3e676fac5b61e1195081efd64075..f437d0d8baa5534f9519dd28cd2c4ac5
2d48685c 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -395,30 +395,31 @@ extern bool arm_is_constant_pool_ref (rtx);

   #define FL_TUNE  (FL_WBUF | FL_VFPV2 | FL_STRONG | FL_LDSCHED \
   
   			 | FL_CO_PROC)


-#define FL_FOR_ARCH2   FL_NOTM
-#define FL_FOR_ARCH3   (FL_FOR_ARCH2 | FL_MODE32)
-#define FL_FOR_ARCH3M  (FL_FOR_ARCH3 | FL_ARCH3M)
-#define FL_FOR_ARCH4   (FL_FOR_ARCH3M | FL_ARCH4)
-#define FL_FOR_ARCH4T  (FL_FOR_ARCH4 | FL_THUMB)
-#define FL_FOR_ARCH5   (FL_FOR_ARCH4 | FL_ARCH5)
-#define FL_FOR_ARCH5T  (FL_FOR_ARCH5 | FL_THUMB)
-#define FL_FOR_ARCH5E  (FL_FOR_ARCH5 | FL_ARCH5E)
-#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB)
-#define FL_FOR_ARCH5TEJ FL_FOR_ARCH5TE
-#define FL_FOR_ARCH6   (FL_FOR_ARCH5TE | FL_ARCH6)
-#define FL_FOR_ARCH6J  FL_FOR_ARCH6
-#define FL_FOR_ARCH6K  (FL_FOR_ARCH6 | FL_ARCH6K)
-#define FL_FOR_ARCH6Z  FL_FOR_ARCH6
-#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ)
-#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2)
-#define FL_FOR_ARCH6M  (FL_FOR_ARCH6 & ~FL_NOTM)
-#define FL_FOR_ARCH7   ((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7)
-#define FL_FOR_ARCH7A  (FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K)
-#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | FL_THUMB_DIV | FL_ARM_DIV)
-#define FL_FOR_ARCH7R  (FL_FOR_ARCH7A | FL_THUMB_DIV)
-#define FL_FOR_ARCH7M  (FL_FOR_ARCH7 | FL_THUMB_DIV)
-#define FL_FOR_ARCH7EM  (FL_FOR_ARCH7M | FL_ARCH7EM)
-#define FL_FOR_ARCH8A  (FL_FOR_ARCH7VE | FL_ARCH8)
+#define FL_FOR_ARCH2   FL_NOTM
+#define FL_FOR_ARCH3   (FL_FOR_ARCH2 | FL_MODE32)
+#define FL_FOR_ARCH3M  (FL_FOR_ARCH3 | FL_ARCH3M)
+#define FL_FOR_ARCH4   (FL_FOR_ARCH3M | FL_ARCH4)
+#define FL_FOR_ARCH4T  (FL_FOR_ARCH4 | FL_THUMB)
+#define FL_FOR_ARCH5   (FL_FOR_ARCH4 | FL_ARCH5)
+#define FL_FOR_ARCH5T  (FL_FOR_ARCH5 | FL_THUMB)
+#define FL_FOR_ARCH5E  (FL_FOR_ARCH5 | FL_ARCH5E)
+#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB)
+#define FL_FOR_ARCH5TEJ FL_FOR_ARCH5TE

This one looks misindented.
Ok with that fixed once the prerequisites are approved.

It is in the patch but not in the result. If you remove the + in the patch for
the last two lines you'll see that they are perfectly aligned.


Ah ok, thanks.
The patch is ok then once the prerequisites are approved.

Kyrill


Best regards,

Thomas





Re: [PATCH, ARM 3/7, ping1] Fix indentation of FL_FOR_ARCH* definition after adding support for ARMv8-M

2016-05-18 Thread Thomas Preudhomme
On Wednesday 18 May 2016 11:30:43 Kyrill Tkachov wrote:
> Hi Thomas,
> 
> On 17/05/16 11:10, Thomas Preudhomme wrote:
> > Ping?
> > 
> > *** gcc/ChangeLog ***
> > 
> > 2015-11-06  Thomas Preud'homme  
> > 
> >  * config/arm/arm-protos.h: Reindent FL_FOR_* macro definitions.
> > 
> > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> > index
> > 63235cb63acf3e676fac5b61e1195081efd64075..f437d0d8baa5534f9519dd28cd2c4ac5
> > 2d48685c 100644
> > --- a/gcc/config/arm/arm-protos.h
> > +++ b/gcc/config/arm/arm-protos.h
> > @@ -395,30 +395,31 @@ extern bool arm_is_constant_pool_ref (rtx);
> > 
> >   #define FL_TUNE   (FL_WBUF | FL_VFPV2 | FL_STRONG | FL_LDSCHED \
> >   
> >  | FL_CO_PROC)
> > 
> > -#define FL_FOR_ARCH2   FL_NOTM
> > -#define FL_FOR_ARCH3   (FL_FOR_ARCH2 | FL_MODE32)
> > -#define FL_FOR_ARCH3M  (FL_FOR_ARCH3 | FL_ARCH3M)
> > -#define FL_FOR_ARCH4   (FL_FOR_ARCH3M | FL_ARCH4)
> > -#define FL_FOR_ARCH4T  (FL_FOR_ARCH4 | FL_THUMB)
> > -#define FL_FOR_ARCH5   (FL_FOR_ARCH4 | FL_ARCH5)
> > -#define FL_FOR_ARCH5T  (FL_FOR_ARCH5 | FL_THUMB)
> > -#define FL_FOR_ARCH5E  (FL_FOR_ARCH5 | FL_ARCH5E)
> > -#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB)
> > -#define FL_FOR_ARCH5TEJ FL_FOR_ARCH5TE
> > -#define FL_FOR_ARCH6   (FL_FOR_ARCH5TE | FL_ARCH6)
> > -#define FL_FOR_ARCH6J  FL_FOR_ARCH6
> > -#define FL_FOR_ARCH6K  (FL_FOR_ARCH6 | FL_ARCH6K)
> > -#define FL_FOR_ARCH6Z  FL_FOR_ARCH6
> > -#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ)
> > -#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2)
> > -#define FL_FOR_ARCH6M  (FL_FOR_ARCH6 & ~FL_NOTM)
> > -#define FL_FOR_ARCH7   ((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7)
> > -#define FL_FOR_ARCH7A  (FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K)
> > -#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | FL_THUMB_DIV | FL_ARM_DIV)
> > -#define FL_FOR_ARCH7R  (FL_FOR_ARCH7A | FL_THUMB_DIV)
> > -#define FL_FOR_ARCH7M  (FL_FOR_ARCH7 | FL_THUMB_DIV)
> > -#define FL_FOR_ARCH7EM  (FL_FOR_ARCH7M | FL_ARCH7EM)
> > -#define FL_FOR_ARCH8A  (FL_FOR_ARCH7VE | FL_ARCH8)
> > +#define FL_FOR_ARCH2   FL_NOTM
> > +#define FL_FOR_ARCH3   (FL_FOR_ARCH2 | FL_MODE32)
> > +#define FL_FOR_ARCH3M  (FL_FOR_ARCH3 | FL_ARCH3M)
> > +#define FL_FOR_ARCH4   (FL_FOR_ARCH3M | FL_ARCH4)
> > +#define FL_FOR_ARCH4T  (FL_FOR_ARCH4 | FL_THUMB)
> > +#define FL_FOR_ARCH5   (FL_FOR_ARCH4 | FL_ARCH5)
> > +#define FL_FOR_ARCH5T  (FL_FOR_ARCH5 | FL_THUMB)
> > +#define FL_FOR_ARCH5E  (FL_FOR_ARCH5 | FL_ARCH5E)
> > +#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB)
> > +#define FL_FOR_ARCH5TEJ FL_FOR_ARCH5TE
> 
> This one looks misindented.
> Ok with that fixed once the prerequisites are approved.

It is in the patch but not in the result. If you remove the + in the patch for 
the last two lines you'll see that they are perfectly aligned.

Best regards,

Thomas


Re: [PATCH 8/17][ARM] Add VFP FP16 arithmetic instructions.

2016-05-18 Thread Matthew Wahab

On 18/05/16 01:51, Joseph Myers wrote:

On Tue, 17 May 2016, Matthew Wahab wrote:


In most cases the instructions are added using non-standard pattern
names. This is to force operations on __fp16 values to be done, by
conversion, using the single-precision instructions. The exceptions are
the precision preserving operations ABS and NEG.


But why do you need to force that?  If the instructions follow IEEE
semantics including for exceptions and rounding modes, then X OP Y
computed directly with binary16 arithmetic has the same value as results
from promoting to binary32, doing binary32 arithmetic and converting back
to binary16, for OP in + - * /.  (Double-rounding problems can only occur
in round-to-nearest and if the binary32 result is exactly half way between
two representable binary16 values but the exact result is not exactly half
way between.  It's obvious that this can't occur to + - * and only a bit
harder to see this for /.  According to the logic used in
convert.c:convert_to_real_1, double rounding can't occur in this case for
square root either, though I haven't verified that.)


AArch64 follows IEEE-754 but ARM (AArch32) adds restrictions like flush-to-zero that 
could affect the outcome of a calculation.



So I'd expect e.g.

__fp16 a, b;
__fp16 c = a / b;

to generate the new instructions, because direct binary16 arithmetic is a
correct implementation of (__fp16) ((float) a / (float) b).


Something like

__fp16 a, b, c;
__fp16 d = (a / b) * c;

would be done as the sequence of single precision operations:

vcvtb.f32.f16 s0, s0
vcvtb.f32.f16 s1, s1
vcvtb.f32.f16 s2, s2
vdiv.f32 s15, s0, s1
vmul.f32 s0, s15, s2
vcvtb.f16.f32 s0, s0

Doing this with vdiv.f16 and vmul.f16 could change the calculated result, because the
flush-to-zero rule depends on the precision of the operation and so affects the
result of a vdiv.f16 differently from that of a vdiv.f32.


(At least, that's my understanding.)
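
To make the flush-to-zero concern concrete, here is a minimal C sketch. It is not ARM code: it models half-precision results on ordinary floats, and it assumes a simplified flush-to-zero rule in which any nonzero result smaller in magnitude than the smallest normal binary16 value, 2^-14, becomes zero. Rounding to the binary16 significand is ignored, which is harmless for exact powers of two. The names ftz_half, div_mul_direct and div_mul_promoted are made up for illustration.

```c
#include <assert.h>

/* Smallest normal binary16 magnitude, 2^-14.  */
#define HALF_MIN_NORMAL 0x1p-14f

/* Model of a half-precision result under flush-to-zero (assumed rule):
   anything nonzero but below the smallest normal value becomes zero.  */
float
ftz_half (float x)
{
  float mag = x < 0.0f ? -x : x;
  return (mag != 0.0f && mag < HALF_MIN_NORMAL) ? 0.0f : x;
}

/* (a / b) * c with each operation done "in half precision":
   the intermediate quotient is flushed before the multiply.  */
float
div_mul_direct (float a, float b, float c)
{
  assert (b != 0.0f);
  float q = ftz_half (a / b);
  return ftz_half (q * c);
}

/* (a / b) * c done in single precision, converted once at the end.  */
float
div_mul_promoted (float a, float b, float c)
{
  assert (b != 0.0f);
  return ftz_half ((a / b) * c);
}
```

With a = 2^-10, b = 2^10 and c = 2^10, the quotient 2^-20 lies below 2^-14. Under the assumed rule the direct version flushes it and returns 0, while the promoted version keeps it in single precision and returns 2^-10.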

Matthew


Re: [PATCH] Optimize strchr (s, 0) to strlen

2016-05-18 Thread Richard Biener
On Wed, May 18, 2016 at 2:29 PM, Wilco Dijkstra  wrote:
> Richard Biener wrote:
>>
>> Yeah ;)  I'm currently bootstrapping/testing the patch that makes it
>> possible to write all this in match.pd.
>
> So what was the conclusion? Improving match.pd to be able to handle more cases
> like this seems like a nice thing.

I'm stuck with fallout, and making this work requires some serious
thought.  Don't hold your breath here :/

The restricted case of strchr (a, 0) -> strlen () can be made to work
more easily, but I haven't yet tried to implement a restriction that
only allows the cases that would work.

Meanwhile, the strlenopt pass would be an appropriate place to handle
this transform (well, if we now agree on its usefulness).
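
For readers skimming the thread, the identity behind the transform can be sketched in a few lines of C: searching for the terminating NUL with strchr lands on the same address as s + strlen (s). The helper names below are invented for illustration and are not part of any patch.

```c
#include <assert.h>
#include <string.h>

/* Distance from S to the terminating NUL, found via strchr.  Searching
   for '\0' is well defined: the terminator counts as part of the string.  */
size_t
len_via_strchr (const char *s)
{
  assert (s != NULL);
  return (size_t) (strchr (s, 0) - s);
}

/* The same position, computed the way the transform would rewrite it.  */
const char *
end_via_strlen (const char *s)
{
  assert (s != NULL);
  return s + strlen (s);
}
```

For any valid string s, end_via_strlen (s) and strchr (s, 0) point at the same byte, so len_via_strchr ("hello") is 5 and len_via_strchr ("") is 0.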

Richard.

>
> Wilco
>


Re: [AArch64, 2/4] Extend vector mutiply by element to all supported modes

2016-05-18 Thread Jiong Wang



On 18/05/16 09:17, Christophe Lyon wrote:

On 17 May 2016 at 14:27, James Greenhalgh  wrote:

On Mon, May 16, 2016 at 10:09:31AM +0100, Jiong Wang wrote:

AArch64 supports vector multiply by element for V2DF, V2SF, V4SF, V2SI,
V4SI, V4HI, V8HI.

All of the above are well supported by the "*aarch64_mul3_elt" pattern, and
by "*aarch64_mul3_elt_" if there is a lane size change.

Those patterns try to match "(mul (vec_dup (vec_select)))",
which is genuinely vector multiply by element.

Vector multiply by element can also come from "(mul (vec_dup
(scalar)))", where the scalar value is already sitting in a vector register
and is then duplicated to other lanes, with no lane size change.

We have "*aarch64_mul3_elt_to_128df" to match this already, but it's
restricted to V2DF. This patch extends that support to more modes,
for example vector integer operations.

For the testcase included, the following codegen change will happen:


-   ldr w0, [x3, 160]
-   dup v1.2s, w0
-   mul v1.2s, v1.2s, v2.2s
+   ldr s1, [x3, 160]
+   mul v1.2s, v0.2s, v1.s[0]

OK for trunk?

2016-05-16  Jiong Wang

gcc/
   * config/aarch64/aarch64-simd.md (*aarch64_mul3_elt_to_128df): Extend to all
   supported modes.  Rename to "*aarch64_mul3_elt_from_dup".

gcc/testsuite/
   * /gcc.target/aarch64/simd/vmul_elem_1.c: New.


This ChangeLog formatting is incorrect. It should look like:

gcc/

2016-05-17  Jiong Wang  

 * config/aarch64/aarch64-simd.md (*aarch64_mul3_elt_to_128df): Extend
 to all supported modes.  Rename to...
 (*aarch64_mul3_elt_from_dup): ...this.

gcc/testsuite/

2016-05-17  Jiong Wang  

 * gcc.target/aarch64/simd/vmul_elem_1.c: New.

Otherwise, this patch is OK.


Hi Jiong,

The new testcase fails on aarch64_be, at execution time.

Christophe.


Thanks for reporting this.

Yes, reproduced. I should force those res* local variables into
memory so they are in the same order as the expected result,
which is kept in memory.

The following patch fixes this.

vmul_elem_1 passes on both aarch64_be-none-elf and aarch64-linux.

OK for trunk?

gcc/testsuite/

2016-05-18  Jiong Wang  

* gcc.target/aarch64/simd/vmul_elem_1.c: Force result variables to be
kept in memory.

diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vmul_elem_1.c b/gcc/testsuite/gcc.target/aarch64/simd/vmul_elem_1.c
index 155cac3..a1faefd 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd/vmul_elem_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vmul_elem_1.c
@@ -142,13 +142,15 @@ check_v2sf (float32_t elemA, float32_t elemB)
   int32_t indx;
   const float32_t vec32x2_buf[2] = {A, B};
   float32x2_t vec32x2_src = vld1_f32 (vec32x2_buf);
-  float32x2_t vec32x2_res = vmul_n_f32 (vec32x2_src, elemA);
+  float32_t vec32x2_res[2];
+
+  vst1_f32 (vec32x2_res, vmul_n_f32 (vec32x2_src, elemA));
 
   for (indx = 0; indx < 2; indx++)
 if (* (uint32_t *) _res[indx] != * (uint32_t *) _1[indx])
   abort ();
 
-  vec32x2_res = vmul_n_f32 (vec32x2_src, elemB);
+  vst1_f32 (vec32x2_res, vmul_n_f32 (vec32x2_src, elemB));
 
   for (indx = 0; indx < 2; indx++)
 if (* (uint32_t *) _res[indx] != * (uint32_t *) _2[indx])
@@ -163,25 +165,27 @@ check_v4sf (float32_t elemA, float32_t elemB, float32_t elemC, float32_t elemD)
   int32_t indx;
   const float32_t vec32x4_buf[4] = {A, B, C, D};
   float32x4_t vec32x4_src = vld1q_f32 (vec32x4_buf);
-  float32x4_t vec32x4_res = vmulq_n_f32 (vec32x4_src, elemA);
+  float32_t vec32x4_res[4];
+
+  vst1q_f32 (vec32x4_res, vmulq_n_f32 (vec32x4_src, elemA));
 
   for (indx = 0; indx < 4; indx++)
 if (* (uint32_t *) _res[indx] != * (uint32_t *) _1[indx])
   abort ();
 
-  vec32x4_res = vmulq_n_f32 (vec32x4_src, elemB);
+  vst1q_f32 (vec32x4_res, vmulq_n_f32 (vec32x4_src, elemB));
 
   for (indx = 0; indx < 4; indx++)
 if (* (uint32_t *) _res[indx] != * (uint32_t *) _2[indx])
   abort ();
 
-  vec32x4_res = vmulq_n_f32 (vec32x4_src, elemC);
+  vst1q_f32 (vec32x4_res, vmulq_n_f32 (vec32x4_src, elemC));
 
   for (indx = 0; indx < 4; indx++)
 if (* (uint32_t *) _res[indx] != * (uint32_t *) _3[indx])
   abort ();
 
-  vec32x4_res = vmulq_n_f32 (vec32x4_src, elemD);
+  vst1q_f32 (vec32x4_res, vmulq_n_f32 (vec32x4_src, elemD));
 
   for (indx = 0; indx < 4; indx++)
 if (* (uint32_t *) _res[indx] != * (uint32_t *) _4[indx])
@@ -196,13 +200,15 @@ check_v2df (float64_t elemdC, float64_t elemdD)
   int32_t indx;
   const float64_t vec64x2_buf[2] = {AD, BD};
   float64x2_t vec64x2_src = vld1q_f64 (vec64x2_buf);
-  float64x2_t vec64x2_res = vmulq_n_f64 (vec64x2_src, elemdC);
+  float64_t vec64x2_res[2];
+
+  vst1q_f64 (vec64x2_res, vmulq_n_f64 (vec64x2_src, elemdC));
 
   for (indx = 0; indx < 2; indx++)
 if (* (uint64_t *) _res[indx] != * (uint64_t *) _1[indx])
   abort ();
 
-  vec64x2_res = vmulq_n_f64 (vec64x2_src, 

Re: [PATCH] Improve TBAA with unions

2016-05-18 Thread Eric Botcazou
> We have a good place in the middle-end to apply such rules which
> is component_uses_parent_alias_set_from - this is where I move
> the logic that is duplicated in various frontends.
> 
> The Java and Ada frontends do not allow union type punning (LTO does),
> so this patch may eventually pessimize them.  I don't care anything
> about Java but Ada folks might want to chime in.

The role of UNION_TYPE is negligible in Ada, it's only used for unchecked 
unions, which are quite rare.  Moreover, they are explicitly designed to be 
compatible with C unions so behaving like them kind of makes sense I think.
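
As a reminder of what "union type punning" means on the C side, here is a small sketch: a value is stored through one union member and read back through another. C11 describes this reinterpretation in a footnote to 6.5.2.3, and the GCC manual states that it is allowed even with -fstrict-aliasing, provided the access goes through the union type. The type and function names are made up, and the expected bit patterns assume float is IEEE binary32.

```c
#include <assert.h>
#include <stdint.h>

/* Two views of the same four bytes.  */
union float_bits
{
  float f;
  uint32_t u;
};

/* Store X as a float, then read the same storage back as an integer.
   The read goes through the union type, which is what keeps it valid
   under type-based alias analysis.  */
uint32_t
bits_of_float (float x)
{
  union float_bits pun;

  assert (sizeof (float) == sizeof (uint32_t));
  pun.f = x;
  return pun.u;
}
```

Under the binary32 assumption, bits_of_float (1.0f) yields 0x3f800000.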

-- 
Eric Botcazou


GCC 5.4 Status report (2016-05-18)

2016-05-18 Thread Richard Biener

Status
======

The GCC 5 branch is currently open for regression and documentation fixes.

I plan to do a release candidate of GCC 5.4 at the end of next week
followed by a release at the beginning of June.

This is a good time to look through your assigned bugs looking for
patches you might want to backport to open release branches which
also still includes the GCC 4.9 branch which will see one last release
before it will be closed.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                0
P2              147   +  38
P3               13   -  15
P4               80   -   5
P5               28   -   4
--------        ---   -----------------------
Total P1-P3     163   +  26
Total           271   +  17


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2015-12/msg00051.html


Re: [testuite,AArch64] Make scan for 'br' more robust

2016-05-18 Thread Christophe Lyon
On 13 May 2016 at 15:51, James Greenhalgh  wrote:
> On Wed, May 04, 2016 at 11:55:42AM +0200, Christophe Lyon wrote:
>> On 4 May 2016 at 10:43, Kyrill Tkachov  wrote:
>> >
>> > Hi Christophe,
>> >
>> >
>> > On 02/05/16 12:50, Christophe Lyon wrote:
>> >>
>> >> Hi,
>> >>
>> >> I've noticed a "regression" of AArch64's noplt_3.c in the gcc-6-branch
>> >> because my validation script adds the branch name to gcc/REVISION.
>> >>
>> >> As a result scan-assembler-times "br" also matched "gcc-6-branch",
>> >> hence the failure.
>> >>
>> >> The small attached patch replaces "br" by "br\t" to fix the problem.
>> >>
>> >> I've also made a similar change to tail_indirect_call_1 although the
>> >> problem did not happen for this test because it uses scan-assembler
>> >> instead of scan-assembler-times. I think it's better to make it more
>> >> robust too.
>> >>
>> >> OK?
>> >>
>> >> Christophe
>> >
>> >
>> > diff --git a/gcc/testsuite/gcc.target/aarch64/noplt_3.c
>> > b/gcc/testsuite/gcc.target/aarch64/noplt_3.c
>> > index ef6e65d..a382618 100644
>> > --- a/gcc/testsuite/gcc.target/aarch64/noplt_3.c
>> > +++ b/gcc/testsuite/gcc.target/aarch64/noplt_3.c
>> > @@ -16,5 +16,5 @@ cal_novalue (int a)
>> >dec (a);
>> >  }
>> >  -/* { dg-final { scan-assembler-times "br" 2 } } */
>> > +/* { dg-final { scan-assembler-times "br\t" 2 } } */
>> >  /* { dg-final { scan-assembler-not "b\t" } } */
>> > diff --git a/gcc/testsuite/gcc.target/aarch64/tail_indirect_call_1.c
>> > b/gcc/testsuite/gcc.target/aarch64/tail_indirect_call_1.c
>> > index 4759d20..e863323 100644
>> > --- a/gcc/testsuite/gcc.target/aarch64/tail_indirect_call_1.c
>> > +++ b/gcc/testsuite/gcc.target/aarch64/tail_indirect_call_1.c
>> > @@ -3,7 +3,7 @@
>> >   typedef void FP (int);
>> >  -/* { dg-final { scan-assembler "br" } } */
>> > +/* { dg-final { scan-assembler "br\t" } } */
>> >
>> > Did you mean to make this scan-assembler-times as well?
>> >
>>
>> I kept the changes minimal, but you are right, it would be more robust
>> as attached.
>>
>> OK for trunk and gcc-6 branch?
>
> OK.
>
> If you want completeness on this, the
> gcc.target/aarch64/tail_indirect_call_1.c change should go back to the
> gcc-5 branch too.
>

Thanks,  I've committed to trunk, backported to gcc-6,
and partially to gcc-5.

Christophe.


> Cheers,
> James
>
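
The failure mode discussed in this thread is ordinary unanchored matching: a bare "br" pattern also hits any other text that happens to contain those two letters, such as a branch name embedded in an .ident string. The sketch below uses a plain substring search as a stand-in for the testsuite's pattern scan; count_matches is a hypothetical helper, not a DejaGnu function.

```c
#include <assert.h>
#include <string.h>

/* Count non-overlapping occurrences of NEEDLE in TEXT using a plain
   substring search (a stand-in for an unanchored pattern scan).  */
int
count_matches (const char *text, const char *needle)
{
  size_t n = strlen (needle);
  int count = 0;

  assert (n > 0);
  for (const char *p = strstr (text, needle); p != NULL;
       p = strstr (p + n, needle))
    count++;
  return count;
}
```

On a made-up listing with two "br" instructions followed by an .ident line that mentions gcc-6-branch, searching for "br" finds three matches, while searching for "br" followed by a tab finds only the two instructions.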


Re: [PATCH, libbacktrace]: Fix PR 71161, Lots of ASAN and libgo runtime FAILs on 32bit x86 targets

2016-05-18 Thread Ian Lance Taylor
Uros Bizjak  writes:

> 2016-05-18  Uros Bizjak  
>
> * elf.c (phdr_callback) [__i386__]: Add
> __attribute__((__force_align_arg_pointer__)).

This is OK.

Thanks.

Ian


Re: inhibit the sincos optimization when the target has sin and cos instructions

2016-05-18 Thread Nathan Sidwell

On 05/17/16 17:30, Cesar Philippidis wrote:

On 05/17/2016 02:22 PM, Andrew Pinski wrote:



Good eyes, thanks! I thought I had to create a new insn, but I got away
with an expand. I attached the updated patch.

Cesar



gcc.sum
Tests that now fail, but worked before:

nvptx-none-run: gcc.c-torture/execute/20100316-1.c   -Os  execution test
nvptx-none-run: gcc.c-torture/execute/20100708-1.c   -O1  execution test
nvptx-none-run: gcc.c-torture/execute/20100805-1.c   -O0  execution test
nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
nvptx-none-run: gcc.dg/torture/pr52028.c   -O3 -g  execution test



Please determine why these now fail.


+(define_expand "sincossf3"
+  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+   (unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+  UNSPEC_COS))
+   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+   (unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+  "flag_unsafe_math_optimizations"
+{
+  emit_insn (gen_sinsf2 (operands[1], operands[2]));
+  emit_insn (gen_cossf2 (operands[0], operands[2]));
+
+  DONE;
+})


Why the emit_insn code?  That seems to be replicating the RTL representation --
you're saying the same thing twice.


Doesn't operands[2] need (conditionally) copying to a new register -- what if it 
aliases operands[1]?
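
The aliasing concern can be shown with a plain C analogy, where pointers play the role of register operands. If the first result is written to the same location the input lives in, the second computation reads a clobbered value. The two unary operations below are hypothetical stand-ins for the sin and cos instructions, and all names are invented for the sketch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two single-result instructions; any two
   different unary operations show the hazard.  */
float op_first (float x)  { return x * 2.0f; }
float op_second (float x) { return x + 1.0f; }

/* Mirrors the order in the expander: write *OUT1 from *IN, then *OUT0
   from *IN.  Wrong if OUT1 and IN are the same object.  */
void
pair_naive (float *out0, float *out1, const float *in)
{
  assert (out0 != out1);
  *out1 = op_first (*in);
  *out0 = op_second (*in);   /* reads the clobbered value if out1 == in */
}

/* Copy the input first, so neither store can disturb it.  */
void
pair_safe (float *out0, float *out1, const float *in)
{
  assert (out0 != out1);
  float tmp = *in;

  *out1 = op_first (tmp);
  *out0 = op_second (tmp);
}
```

Starting from an input of 3 that shares storage with the first output, pair_naive computes the second result from 6 rather than 3, whereas pair_safe computes it from the original 3.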



+++ b/gcc/testsuite/gcc.target/nvptx/sincos-2.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math" } */
+


What is this test trying to test?  I'm puzzled by it.  (BTW, don't use assert;
use abort or exit(1), or return from main.)


nathan


Re: [PATCH] Optimize strchr (s, 0) to strlen

2016-05-18 Thread Wilco Dijkstra
Richard Biener wrote:
>
> Yeah ;)  I'm currently bootstrapping/testing the patch that makes it possible
> to write all this in match.pd.

So what was the conclusion? Improving match.pd to be able to handle more cases
like this seems like a nice thing.

Wilco



Re: [PATCH #2] Add PowerPC ISA 3.0 word splat and byte immediate splat support

2016-05-18 Thread Segher Boessenkool
On Tue, May 17, 2016 at 06:45:49PM -0400, Michael Meissner wrote:
> As I mentioned in the last message, my previous patch had some problems that
> showed up on big endian systems, using RELOAD (one of the tests that failed
> was the vshuf-v32qi.c test in the testsuite).  Little endian and IRA compiled
> the test fine.  This patch fixes the problem.  I went over the alternatives in
> the vsx_mov_{32bit,64bit} patterns, and I removed the '*' constraints,
> and checked all of the other constraints.

So those * are the only change?

> The patches bootstrap and pass regression tests on both little endian power8
> and big endian power7 systems.  Are these patches ok to install in the trunk?

This patch is okay for trunk.

> After a burn-in period, will they be ok to back port to the GCC 6.2 branch?

Yes.  Please make sure you test it there as well (BE/LE, p7/p8).

A few nits still:

> +(define_predicate "xxspltib_constant_split"
> +  (match_code "const_vector,vec_duplicate,const_int")
> +{
> +  int value = 256;
> +  int num_insns  = -1;

Still has a tab character.

> +(define_predicate "xxspltib_constant_nosplit"
> +  (match_code "const_vector,vec_duplicate,const_int")
> +{
> +  int value = 256;
> +  int num_insns  = -1;

And here.

> @@ -1024,6 +1068,10 @@ (define_predicate "splat_input_operand"
>   mode = V2DFmode;
>else if (mode == DImode)
>   mode = V2DImode;
> +  else if (mode == SImode && TARGET_P9_VECTOR)
> + mode = V4SImode;
> +  else if (mode == SFmode && TARGET_P9_VECTOR)
> + mode = V4SFmode;

Trailing tabs (twice).

> +;;   VSX store  VSX load   VSX move  VSX->GPR   GPR->VSX   LQ (GPR)
> +;;  STQ (GPR)  GPR load   GPR store GPR move   XXSPLTIB   VSPLTISW
> +;;   VSX 0/-1   GPR 0/-1   VMX const GPR const  LVX (VMX)   STVX (VMX)
> +(define_insn "*vsx_mov_64bit"
> +  [(set (match_operand:VSX_M 0 "nonimmediate_operand"
> +   "=ZwO,  , , r, we,?wQ,
> + ?,   ??r,   ??Y,   ??r,   wo,v,
> + ?,*r,v, ??r,   wZ,v")
> +
> + (match_operand:VSX_M 1 "input_operand" 
> +   ", ZwO,   , we,r, r,
> + wQ,Y, r, r, wE,jwM,
> + ?jwM,  jwM,   W, W, v, wZ"))]
> +
> +  "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (mode)
> +   && (register_operand (operands[0], mode) 
> +   || register_operand (operands[1], mode))"
> +{
> +  return rs6000_output_move_128bit (operands);
> +}
> +  [(set_attr "type"
> +   "vecstore,  vecload,   vecsimple, mffgpr,mftgpr,load,
> + store, load,  store, *, vecsimple, vecsimple,
> + vecsimple, *, *, *, vecstore,  vecload")
> +
> +   (set_attr "length"
> +   "4, 4, 4, 8, 4, 8,
> + 8, 8, 8, 8, 4, 4,
> + 4, 8, 20,20,4, 4")])

Some of these lines are indented with spaces instead of tabs, please fix.
Looks great otherwise, thanks!

> +;; V4SI splat (ISA 3.0)
> +;; When SI's are allowed in VSX registers, add XXSPLTW support
> +(define_expand "vsx_splat_"
> +  [(set (match_operand:VSX_W 0 "vsx_register_operand" "")
> + (vec_duplicate:VSX_W
> +  (match_operand: 1 "splat_input_operand" "")))]

You can leave off the default arg "" nowadays.  I know we're not consistent
in that.


Segher


Re: [PATCH] Fix PR fortran/70856

2016-05-18 Thread Richard Biener
On Wed, May 18, 2016 at 2:03 PM, Martin Liška  wrote:
> On 05/18/2016 01:24 PM, Richard Biener wrote:
>> Ok.
>>
>> Richard.
>
> Thanks, I'll install the same patch to the GCC 6 branch once the
> tests finish, OK?

Ok.

Richard.

> Martin


[PATCH, libbacktrace]: Fix PR 71161, Lots of ASAN and libgo runtime FAILs on 32bit x86 targets

2016-05-18 Thread Uros Bizjak
Hello!

The issue here is a misaligned stack with some old(er) 32-bit x86
glibcs, where the dl_iterate_phdr callback gets called with a misaligned
stack.

The attached patch makes phdr_callback in libbacktrace resistant to this
ABI violation.

2016-05-18  Uros Bizjak  

* elf.c (phdr_callback) [__i386__]: Add
__attribute__((__force_align_arg_pointer__)).

Patch was bootstrapped on x86_64-linux-gnu and regression tested with -m32.

OK for mainline and release branches?

Uros.

diff --git a/libbacktrace/elf.c b/libbacktrace/elf.c
index f85ac65..81ba344 100644
--- a/libbacktrace/elf.c
+++ b/libbacktrace/elf.c
@@ -866,6 +866,9 @@ struct phdr_data
libraries.  */

 static int
+#ifdef __i386__
+__attribute__ ((__force_align_arg_pointer__))
+#endif
 phdr_callback (struct dl_phdr_info *info, size_t size ATTRIBUTE_UNUSED,
   void *pdata)
 {


Re: [PATCH] Fix PR fortran/70856

2016-05-18 Thread Martin Liška
On 05/18/2016 01:24 PM, Richard Biener wrote:
> Ok.
> 
> Richard.

Thanks, I'll install the same patch to the GCC 6 branch once the
tests finish, OK?

Martin


Re: [PATCH #2] Add PowerPC ISA 3.0 word splat and byte immediate splat support

2016-05-18 Thread Segher Boessenkool
On Tue, May 17, 2016 at 07:08:52PM -0400, Michael Meissner wrote:
> FWIW, the problem after subversion id 236136 shows up when the trunk compiler
> is built with the host compiler (4.3.4).

That compiler is almost seven years old.  It would be interesting to find
out what the oldest compiler that *does* work is.


Segher


Re: [committed] Cherry-pick upstream asan fix for upcoming glibc (PR sanitizer/71160)

2016-05-18 Thread Florian Weimer

On 05/17/2016 04:46 PM, Jakub Jelinek wrote:

On Tue, May 17, 2016 at 05:38:27PM +0300, Maxim Ostapenko wrote:

Hi Jakub,

thanks for backporting this! Do you have any plans to apply this patch to
the GCC 5 and 6 branches? AFAIK people hit this ASan + newer Glibc bug by
using GCC 5.3.1 on Fedora 23.


I don't have the newer glibc on my box, therefore I'm waiting until somebody
confirms the trunk change fixed it before backporting.


I compiled GCC trunk (r236371) and today's glibc master (around commit 
0014680d6a5bdeb4fe17682450105ebed19f35da), and both work together, in 
the sense that this test program, when compiled with ASAN, reports a 
memory leak:


#include <stdlib.h>
int main () { malloc(35); return 0; }

=
==30982==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 35 byte(s) in 1 object(s) allocated from:
#0 0x7fa00ea15928 in __interceptor_malloc ../../../../trunk/libsanitizer/asan/asan_malloc_linux.cc:62
#1 0x400763  (/home/fweimer/src/gnu/glibc/build/elf/ld.so+0x400763)
#2 0x7fa00e5d625f in __libc_start_main ../csu/libc-start.c:289

SUMMARY: AddressSanitizer: 35 byte(s) leaked in 1 allocation(s).

Before, when compiled with GCC 5.3.1 (in Fedora 22), it would report an 
internal error:


==30945==AddressSanitizer CHECK failed: ../../../../libsanitizer/asan/asan_rtl.cc:556 "((!asan_init_is_running && "ASan init calls itself!")) != (0)" (0x0, 0x0)




Thanks,
Florian


Re: [PATCH] Fix PR71132

2016-05-18 Thread Richard Biener
On Wed, 18 May 2016, H.J. Lu wrote:

> On Wed, May 18, 2016 at 12:50 AM, Richard Biener  wrote:
> > On Tue, 17 May 2016, H.J. Lu wrote:
> >
> >> On Tue, May 17, 2016 at 5:51 AM, Richard Biener  wrote:
> >> >
> >> > The following fixes a latent issue in loop distribution caught by
> >> > the fake edge placement adjustment.
> >> >
> >> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
> >> >
> >> > Richard.
> >> >
> >> > 2016-05-17  Richard Biener  
> >> >
> >> > PR tree-optimization/71132
> >> > * tree-loop-distribution.c (create_rdg_cd_edges): Pass in loop.
> >> > Only add control dependences for blocks in the loop.
> >> > (build_rdg): Adjust.
> >> > (generate_code_for_partition): Return whether loop should
> >> > be destroyed and delay that.
> >> > (distribute_loop): Likewise.
> >> > (pass_loop_distribution::execute): Record loops to be destroyed
> >> > and perform delayed destroying of loops.
> >> >
> >> > * gcc.dg/torture/pr71132.c: New testcase.
> >> >
> >>
> >> On x86, this caused:
> >>
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_custom.c  -O3 -fcilkplus
> >> (internal compiler error)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_custom.c  -O3 -fcilkplus
> >> (test for excess errors)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_mutating.c  -fcilkplus -O3
> >> -std=c99 (internal compiler error)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_mutating.c  -fcilkplus -O3
> >> -std=c99 (test for excess errors)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_mutating.c  -O3 -fcilkplus
> >> (internal compiler error)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_mutating.c  -O3 -fcilkplus
> >> (test for excess errors)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -fcilkplus -O3
> >> -std=c99 (internal compiler error)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -fcilkplus -O3
> >> -std=c99 (test for excess errors)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -O3 -fcilkplus
> >> (internal compiler error)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -O3 -fcilkplus
> >> (internal compiler error)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -O3 -fcilkplus
> >> (test for excess errors)
> >> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -O3 -fcilkplus
> >> (test for excess errors)
> >> FAIL: c-c++-common/cilk-plus/AN/sec_reduce_ind_same_value.c
> >> -fcilkplus -O3 -std=c99 (internal compiler error)
> >> FAIL: c-c++-common/cilk-plus/AN/sec_reduce_ind_same_value.c
> >> -fcilkplus -O3 -std=c99 (test for excess errors)
> >> FAIL: c-c++-common/cilk-plus/AN/sec_reduce_ind_same_value.c  -O3
> >> -fcilkplus (internal compiler error)
> >> FAIL: c-c++-common/cilk-plus/AN/sec_reduce_ind_same_value.c  -O3
> >> -fcilkplus (test for excess errors)
> >> FAIL: gcc.c-torture/compile/pr32399.c   -O3 -fomit-frame-pointer
> >> -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal
> >> compiler error)
> >> FAIL: gcc.c-torture/compile/pr32399.c   -O3 -fomit-frame-pointer
> >> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
> >> excess errors)
> >> FAIL: gcc.c-torture/compile/pr32399.c   -O3 -g  (internal compiler error)
> >> FAIL: gcc.c-torture/compile/pr32399.c   -O3 -g  (test for excess errors)
> >> FAIL: gcc.c-torture/execute/20010221-1.c   -O3 -g  (internal compiler 
> >> error)
> >> FAIL: gcc.c-torture/execute/20010221-1.c   -O3 -g  (test for excess errors)
> >> FAIL: gcc.c-torture/execute/20120919-1.c   -O3 -fomit-frame-pointer
> >> -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal
> >> compiler error)
> >> FAIL: gcc.c-torture/execute/20120919-1.c   -O3 -fomit-frame-pointer
> >> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
> >> excess errors)
> >> FAIL: gcc.c-torture/execute/20120919-1.c   -O3 -g  (internal compiler 
> >> error)
> >> FAIL: gcc.c-torture/execute/20120919-1.c   -O3 -g  (test for excess errors)
> >> FAIL: gcc.dg/torture/pr61383-1.c   -O3 -fomit-frame-pointer
> >> -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal
> >> compiler error)
> >> FAIL: gcc.dg/torture/pr61383-1.c   -O3 -fomit-frame-pointer
> >> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
> >> excess errors)
> >> FAIL: gcc.dg/torture/pr69452.c   -O3 -fomit-frame-pointer
> >> -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal
> >> compiler error)
> >> FAIL: gcc.dg/torture/pr69452.c   -O3 -fomit-frame-pointer
> >> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
> >> excess errors)
> >> FAIL: gcc.dg/torture/pr69452.c   -O3 -g  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69452.c   -O3 -g  (test for excess errors)
> >> FAIL: g++.dg/cilk-plus/AN/builtin_fn_mutating_tplt.cc  -g -O3
> >> -fcilkplus (internal compiler error)
> >> FAIL: 

Re: Splitting up gcc/omp-low.c?

2016-05-18 Thread Thomas Schwinge
Hi!

Ping.

On Wed, 11 May 2016 15:44:14 +0200, I wrote:
> Ping.
> 
> On Tue, 03 May 2016 11:34:39 +0200, I wrote:
> > On Wed, 13 Apr 2016 18:01:09 +0200, I wrote:
> > > On Fri, 08 Apr 2016 11:36:03 +0200, I wrote:
> > > > > On Thu, 10 Dec 2015 09:08:35 +0100, Jakub Jelinek wrote:
> > > > > On Wed, Dec 09, 2015 at 06:23:22PM +0100, Bernd Schmidt wrote:
> > > > > > On 12/09/2015 05:24 PM, Thomas Schwinge wrote:
> > > > > > > >how about we split up gcc/omp-low.c into several
> > > > > > > >files?  Would it make sense (I have not yet looked in detail) to do so
> > > > > > > >along the borders of the several passes defined therein?
> > > 
> > > > > > > I suspect a split along the ompexp/omplow boundary would be quite easy to
> > > > > > > achieve.
> > > 
> > > That was indeed the first one that I tackled, omp-expand.c (spelled out
> > > "expand" instead of "exp" to avoid confusion as "exp" might also be short
> > > for "expression"; OK?) [...]
> > 
> > That's the one I'd suggest to pursue next, now that GCC 6.1 has been
> > released.  How would you like me to submit the patch for review?  (It's
> > huge, obviously.)
> > 
> > A few high-level comments, and questions that remain to be answered:
> > 
> > > > So far, I did not move stuff that does not relate to OMP lowering out of
> > > > omp-low.c (into a new omp.c, or omp-misc.c, for example), but
> > > > instead just left all that in omp-low.c.  We'll see how far we get.
> > > 
> > > One thing I noticed is that there sometimes is more than one suitable
> > > place to put stuff: omp-low.c and omp-expand.c categorize by compiler
> > > > passes, and omp-offload.c -- at least in part -- [would be] about the orthogonal
> > > "offloading" category.  For example, see the OMPTODO "struct oacc_loop
> > > and enum oacc_loop_flags" in gcc/omp-offload.h.  We'll see how that goes.
> > 
> > > Some more comments, to help review:
> > 
> > > As I don't know how this is usually done: is it appropriate to remove
> > > "Contributed by Diego Novillo" from omp-low.c (he does get mentioned for
> > > his OpenMP work in gcc/doc/contrib.texi; a ton of other people have been
> > > contributing a ton of other stuff since omp-low.c has been created), or
> > > does this line stay in omp-low.c, or do I even duplicate it into the new
> > > files?
> > > 
> > > I tried not to re-order stuff when moving.  But: we may actually want to
> > > reorder stuff, to put it into a more sensible order.  Any suggestions?
> > 
> > > I had to export a small number of functions (see the prototypes not moved
> > > but added to the header files).
> > > 
> > > Because it's also used in omp-expand.c, I moved the one-line static
> > > inline is_reference function from omp-low.c to omp-low.h, and renamed it
> > > to omp_is_reference because of the very generic name.  Similar functions
> > > stay in omp-low.c however, so they're no longer defined next to each
> > > other.  OK, or does this need a different solution?


Grüße
 Thomas


Re: libgomp: In OpenACC testing, cycle through $offload_targets, and by default only build for the offload target that we're actually going to test

2016-05-18 Thread Thomas Schwinge
Hi!

Ping.

On Wed, 11 May 2016 15:45:13 +0200, I wrote:
> Ping.
> 
> On Mon, 02 May 2016 11:54:27 +0200, I wrote:
> > On Fri, 29 Apr 2016 09:43:41 +0200, Jakub Jelinek  wrote:
> > > On Thu, Apr 28, 2016 at 12:43:43PM +0200, Thomas Schwinge wrote:
> > > > commit 3b521f3e35fdb4b320e95b5f6a82b8d89399481a
> > > > Author: Thomas Schwinge 
> > > > Date:   Thu Apr 21 11:36:39 2016 +0200
> > > > 
> > > > libgomp: Unconfuse offload plugins vs. offload targets
> > > 
> > > I don't like this patch at all, rather than unconfusing stuff it
> > > makes stuff confusing.  Plugins are just a way to support various
> > > offloading targets.
> > 
> > Huh; my patch exactly clarifies that the offload_targets variable does
> > not actually list offload target names, but does list libgomp offload
> > plugin names...
> > 
> > > Can you please post just a short patch without all those changes
> > > that does what you want, rather than renaming everything at the same time?
> > 
> > I thought incremental, self-contained patches were easier to review.
> > Anyway, here's the three patches merged into one:
> > 
> > commit 8060ae3474072eef685381d80f566d1c0942c603
> > Author: Thomas Schwinge 
> > Date:   Thu Apr 21 11:36:39 2016 +0200
> > 
> > > libgomp: In OpenACC testing, cycle through $offload_targets, and by default only build for the offload target that we're actually going to test
> > 
> > libgomp/
> > * plugin/configfrag.ac (offload_targets): Actually enumerate
> > offload targets, and add...
> > (offload_plugins): ... this one to enumerate offload plugins.
> > (OFFLOAD_PLUGINS): Renamed from OFFLOAD_TARGETS.
> > * target.c (gomp_target_init): Adjust to that.
> > * testsuite/lib/libgomp.exp: Likewise.
> > > (offload_targets_s, offload_targets_s_openacc): Remove variables.
> > (offload_target_to_openacc_device_type): New proc.
> > (check_effective_target_openacc_nvidia_accel_selected)
> > (check_effective_target_openacc_host_selected): Examine
> > $openacc_device_type instead of $offload_target_openacc.
> > * Makefile.in: Regenerate.
> > * config.h.in: Likewise.
> > * configure: Likewise.
> > * testsuite/Makefile.in: Likewise.
> > * testsuite/libgomp.oacc-c++/c++.exp: Cycle through
> > $offload_targets (plus "disable") instead of
> > > $offload_targets_s_openacc, and add "-foffload=$offload_target" to
> > > tagopt.
> > * testsuite/libgomp.oacc-c/c.exp: Likewise.
> > * testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
> > ---
> >  libgomp/Makefile.in|  1 +
> >  libgomp/config.h.in|  4 +-
> >  libgomp/configure  | 44 +++--
> >  libgomp/plugin/configfrag.ac   | 39 +++-
> >  libgomp/target.c   |  8 +--
> >  libgomp/testsuite/Makefile.in  |  1 +
> > >  libgomp/testsuite/lib/libgomp.exp  | 72 ++
> >  libgomp/testsuite/libgomp.oacc-c++/c++.exp | 30 +
> >  libgomp/testsuite/libgomp.oacc-c/c.exp | 30 +
> >  libgomp/testsuite/libgomp.oacc-fortran/fortran.exp | 22 ---
> >  10 files changed, 142 insertions(+), 109 deletions(-)
> > 
> > diff --git libgomp/Makefile.in libgomp/Makefile.in
> > [snipped]
> > diff --git libgomp/config.h.in libgomp/config.h.in
> > [snipped]
> > diff --git libgomp/configure libgomp/configure
> > [snipped]
> > diff --git libgomp/plugin/configfrag.ac libgomp/plugin/configfrag.ac
> > index 88b4156..de0a6f6 100644
> > --- libgomp/plugin/configfrag.ac
> > +++ libgomp/plugin/configfrag.ac
> > @@ -26,8 +26,6 @@
> >  # see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> >  # .
> >  
> > -offload_targets=
> > -AC_SUBST(offload_targets)
> >  plugin_support=yes
> >  AC_CHECK_LIB(dl, dlsym, , [plugin_support=no])
> >  if test x"$plugin_support" = xyes; then
> > @@ -142,7 +140,13 @@ AC_SUBST(PLUGIN_HSA_LIBS)
> >  
> >  
> >  
> > -# Get offload targets and path to install tree of offloading compiler.
> > +# Parse offload targets, and figure out libgomp plugin, and configure the
> > > +# corresponding offload compiler.  offload_plugins and offload_targets will be
> > +# populated in the same order.
> > +offload_plugins=
> > +offload_targets=
> > +AC_SUBST(offload_plugins)
> > +AC_SUBST(offload_targets)
> >  offload_additional_options=
> >  offload_additional_lib_paths=
> >  AC_SUBST(offload_additional_options)
> > @@ -151,13 +155,13 @@ if test x"$enable_offload_targets" != x; then
> >for tgt in `echo $enable_offload_targets | sed -e 's#,# #g'`; do
> >  

Re: [PATCH] Fix PR71132

2016-05-18 Thread H.J. Lu
On Wed, May 18, 2016 at 12:50 AM, Richard Biener  wrote:
> On Tue, 17 May 2016, H.J. Lu wrote:
>
>> On Tue, May 17, 2016 at 5:51 AM, Richard Biener  wrote:
>> >
>> > The following fixes a latent issue in loop distribution caught by
>> > the fake edge placement adjustment.
>> >
>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>> >
>> > Richard.
>> >
>> > 2016-05-17  Richard Biener  
>> >
>> > PR tree-optimization/71132
>> > * tree-loop-distribution.c (create_rdg_cd_edges): Pass in loop.
>> > Only add control dependences for blocks in the loop.
>> > (build_rdg): Adjust.
>> > (generate_code_for_partition): Return whether loop should
>> > be destroyed and delay that.
>> > (distribute_loop): Likewise.
>> > (pass_loop_distribution::execute): Record loops to be destroyed
>> > and perform delayed destroying of loops.
>> >
>> > * gcc.dg/torture/pr71132.c: New testcase.
>> >
>>
>> On x86, this caused:
>>
>> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_custom.c  -O3 -fcilkplus
>> (internal compiler error)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_custom.c  -O3 -fcilkplus
>> (test for excess errors)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_mutating.c  -fcilkplus -O3
>> -std=c99 (internal compiler error)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_mutating.c  -fcilkplus -O3
>> -std=c99 (test for excess errors)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_mutating.c  -O3 -fcilkplus
>> (internal compiler error)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_fn_mutating.c  -O3 -fcilkplus
>> (test for excess errors)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -fcilkplus -O3
>> -std=c99 (internal compiler error)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -fcilkplus -O3
>> -std=c99 (test for excess errors)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -O3 -fcilkplus
>> (internal compiler error)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -O3 -fcilkplus
>> (internal compiler error)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -O3 -fcilkplus
>> (test for excess errors)
>> FAIL: c-c++-common/cilk-plus/AN/builtin_func_double.c  -O3 -fcilkplus
>> (test for excess errors)
>> FAIL: c-c++-common/cilk-plus/AN/sec_reduce_ind_same_value.c
>> -fcilkplus -O3 -std=c99 (internal compiler error)
>> FAIL: c-c++-common/cilk-plus/AN/sec_reduce_ind_same_value.c
>> -fcilkplus -O3 -std=c99 (test for excess errors)
>> FAIL: c-c++-common/cilk-plus/AN/sec_reduce_ind_same_value.c  -O3
>> -fcilkplus (internal compiler error)
>> FAIL: c-c++-common/cilk-plus/AN/sec_reduce_ind_same_value.c  -O3
>> -fcilkplus (test for excess errors)
>> FAIL: gcc.c-torture/compile/pr32399.c   -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal
>> compiler error)
>> FAIL: gcc.c-torture/compile/pr32399.c   -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
>> excess errors)
>> FAIL: gcc.c-torture/compile/pr32399.c   -O3 -g  (internal compiler error)
>> FAIL: gcc.c-torture/compile/pr32399.c   -O3 -g  (test for excess errors)
>> FAIL: gcc.c-torture/execute/20010221-1.c   -O3 -g  (internal compiler error)
>> FAIL: gcc.c-torture/execute/20010221-1.c   -O3 -g  (test for excess errors)
>> FAIL: gcc.c-torture/execute/20120919-1.c   -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal
>> compiler error)
>> FAIL: gcc.c-torture/execute/20120919-1.c   -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
>> excess errors)
>> FAIL: gcc.c-torture/execute/20120919-1.c   -O3 -g  (internal compiler error)
>> FAIL: gcc.c-torture/execute/20120919-1.c   -O3 -g  (test for excess errors)
>> FAIL: gcc.dg/torture/pr61383-1.c   -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal
>> compiler error)
>> FAIL: gcc.dg/torture/pr61383-1.c   -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
>> excess errors)
>> FAIL: gcc.dg/torture/pr69452.c   -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal
>> compiler error)
>> FAIL: gcc.dg/torture/pr69452.c   -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
>> excess errors)
>> FAIL: gcc.dg/torture/pr69452.c   -O3 -g  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69452.c   -O3 -g  (test for excess errors)
>> FAIL: g++.dg/cilk-plus/AN/builtin_fn_mutating_tplt.cc  -g -O3
>> -fcilkplus (internal compiler error)
>> FAIL: g++.dg/cilk-plus/AN/builtin_fn_mutating_tplt.cc  -g -O3
>> -fcilkplus (test for excess errors)
>> FAIL: g++.dg/cilk-plus/AN/builtin_fn_mutating_tplt.cc  -O3 -fcilkplus
>> (internal compiler error)
>> FAIL: g++.dg/cilk-plus/AN/builtin_fn_mutating_tplt.cc  -O3 -fcilkplus
>> (test 

Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z

2016-05-18 Thread Kugan
Hi Martin,

> 
> I see various ICE after your commit r236356:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71170

Sorry for the breakage. Looking into it.

Thanks,
Kugan


Re: [PATCH, ARM 5/7, ping1] Add support for MOVT/MOVW to ARMv8-M Baseline

2016-05-18 Thread Kyrill Tkachov

Hi Thomas,

This looks mostly good with a few nits inline.
Please repost with the comments addressed.

Thanks,
Kyrill

On 17/05/16 11:13, Thomas Preudhomme wrote:

Ping?

*** gcc/ChangeLog ***

2015-11-13  Thomas Preud'homme  

 * config/arm/arm.h (TARGET_HAVE_MOVT): Include ARMv8-M as having MOVT.
 * config/arm/arm.c (arm_arch_name): (const_ok_for_op): Check MOVT/MOVW
 availability with TARGET_HAVE_MOVT.
 (thumb_legitimate_constant_p): Legalize high part of a label_ref as a
 constant.


I don't think "Legalize" is the right word here. How about "Strip the HIGH part of a label_ref"?


 (thumb1_rtx_costs): Also return 0 if setting a half word constant and
 movw is available.
 (thumb1_size_rtx_costs): Make set of half word constant also cost 1
 extra instruction if MOVW is available.  Make constant with bottom half
 word zero cost 2 instructions if MOVW is available.
 * config/arm/arm.md (define_attr "arch"): Add v8mb.
 (define_attr "arch_enabled"): Set to yes if arch value is v8mb and
 target is ARMv8-M Baseline.
 * config/arm/thumb1.md (thumb1_movdi_insn): Add ARMv8-M Baseline only
 alternative for constants satisfying j constraint.
 (thumb1_movsi_insn): Likewise.
 (movsi splitter for K alternative): Tighten condition to not trigger
 if movt is available and j constraint is satisfied.
 (Pe immediate splitter): Likewise.
 (thumb1_movhi_insn): Add ARMv8-M Baseline only alternative for
 constant fitting in an halfword to use movw.


Please use 'MOVW' consistently in the ChangeLog rather than the lowercase 'movw'


 * doc/sourcebuild.texi (arm_thumb1_movt_ko): Document new ARM
 effective target.


*** gcc/testsuite/ChangeLog ***

2015-11-13  Thomas Preud'homme  

 * lib/target-supports.exp (check_effective_target_arm_thumb1_movt_ko):
 Define effective target.
 * gcc.target/arm/pr42574.c: Require arm_thumb1_movt_ko instead of
 arm_thumb1_ok as effective target to exclude ARMv8-M Baseline.


diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 47216b4a1959ccdb18e329db411bf7f941e67163..f42e996e5a7ce979fe406b8261d50fb2ba005f6b 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -269,7 +269,7 @@ extern void (*arm_lang_output_object_attributes_hook) (void);
  #define TARGET_HAVE_LDACQ (TARGET_ARM_ARCH >= 8 && arm_arch_notm)
  
  /* Nonzero if this chip provides the movw and movt instructions.  */

-#define TARGET_HAVE_MOVT   (arm_arch_thumb2)
+#define TARGET_HAVE_MOVT   (arm_arch_thumb2 || arm_arch8)
  
  /* Nonzero if integer division instructions supported.  */

  #define TARGET_IDIV   ((TARGET_ARM && arm_arch_arm_hwdiv) \
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d75a34f10d5ed22cff0a0b5d3ad433f111b059ee..13b4b71ac8f9c1da8ef1945f7ff6985ca59f6832 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8220,6 +8220,12 @@ arm_legitimate_constant_p_1 (machine_mode, rtx x)
  static bool
  thumb_legitimate_constant_p (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
  {
+  /* Splitters for TARGET_USE_MOVT call arm_emit_movpair which creates high
+ RTX.  These RTX must therefore be allowed for Thumb-1 so that when run
+ for ARMv8-M baseline or later the result is valid.  */
+  if (TARGET_HAVE_MOVT && GET_CODE (x) == HIGH)
+x = XEXP (x, 0);
+
return (CONST_INT_P (x)
  || CONST_DOUBLE_P (x)
  || CONSTANT_ADDRESS_P (x)
@@ -8306,7 +8312,8 @@ thumb1_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer)
  case CONST_INT:
if (outer == SET)
{
- if ((unsigned HOST_WIDE_INT) INTVAL (x) < 256)
+ if ((unsigned HOST_WIDE_INT) INTVAL (x) < 256
+ || (TARGET_HAVE_MOVT && !(INTVAL (x) & 0xffff0000)))
return 0;


Since you're modifying this line please replace (unsigned HOST_WIDE_INT) INTVAL (x)
with UINTVAL (x).
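
For readers following the cost discussion, here is a minimal sketch (not part
of the patch) of why a constant whose upper 16 bits are zero becomes cheap once
MOVW is available: MOVW writes a 16-bit immediate into the low half of a
register and zeroes the top half, and MOVT then fills in the top half if
needed.  All helper names below are hypothetical and only model the
arithmetic; they are not GCC functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if VALUE can be loaded by a single MOVW,
   i.e. its upper 16 bits are all zero.  */
static bool
fits_movw (uint32_t value)
{
  return (value & 0xffff0000u) == 0;
}

/* Hypothetical helpers: the immediates a MOVW/MOVT pair would carry.  */
static uint16_t
movw_imm (uint32_t value)
{
  return (uint16_t) (value & 0xffffu);
}

static uint16_t
movt_imm (uint32_t value)
{
  return (uint16_t) (value >> 16);
}

/* Model of the register contents after MOVW followed by MOVT.  */
static uint32_t
movw_movt_result (uint16_t lo, uint16_t hi)
{
  uint32_t reg = lo;                              /* MOVW: top half zeroed.  */
  reg = (reg & 0xffffu) | ((uint32_t) hi << 16);  /* MOVT: top half set.  */
  return reg;
}
```

With these, fits_movw (0x1234) holds while fits_movw (0x12345678) does not,
and the latter constant is rebuilt from its two halves by movw_movt_result.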



Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z

2016-05-18 Thread Richard Biener
On Wed, May 18, 2016 at 10:38 AM, Kugan Vivekanandarajah
 wrote:
 Please move the whole thing under the else { } case of the ops.length
 == 0, ops.length == 1 test chain
 as you did for the actual emit of the negate.

>>>
>>> I see your point. However, when we remove the (-1) from the ops list, that
>>> in turn can result in ops.length becoming 1. Therefore, I moved the
>>> following if (negate_result) outside the condition.
>>
>> Ah, indeed.   But now you have to care for ops.length () == 0 and thus
>> the unconditional ops.last () may now trap.  So I suggest to
>> do
>
> Done.
>
>> Yes - the patch is ok with the above suggested change.
>
>
> While testing on an arm variant, vector types are not handled.
> Therefore, I had to change:
>
> +  || ((TREE_CODE (last->op) == REAL_CST)
> +&& real_equal (&TREE_REAL_CST (last->op), &dconstm1))
>
>
> to
>
> +  || real_minus_onep (last->op))
>
>
> Is this still OK? Bootstrap and regression testing on ARM, AARCH64 and
> x86-64 didn’t show any new regressions.

Yes.

Thanks,
Richard.

> Thanks,
> Kugan


Re: [PATCH] Fix PR fortran/70856

2016-05-18 Thread Richard Biener
On Wed, May 18, 2016 at 12:30 PM, Martin Liška  wrote:
> Hello.
>
> The following patch adds support for IPA ICF, where we miss support for
> a proper DECL_PT_UID update in situations where we merge variables.
>
> Patch can bootstrap and no new regression is introduced for x86_64-linux-gnu.
>
> Ready for trunk?

Ok.

Richard.

> Thanks,
> Martin


Re: PR 71020: Handle abnormal PHIs in tree-call-cdce.c

2016-05-18 Thread Richard Biener
On Wed, May 18, 2016 at 11:10 AM, Richard Sandiford
 wrote:
> The PR is about a case where tree-call-cdce.c causes two abnormal
> PHIs for the same variable to be live at the same time, leading to
> a coalescing failure.  It seemed like getting rid of these kinds of
> input would be generally useful, so I added a utility to tree-dfa.c.
>
> Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> PR middle-end/71020
> * tree-dfa.h (replace_abnormal_ssa_names): Declare.
> * tree-dfa.c (replace_abnormal_ssa_names): New function.
> * tree-call-cdce.c: Include tree-dfa.h.
> (can_guard_call_p): New function, extracted from...
> (can_use_internal_fn): ...here.
> (shrink_wrap_one_built_in_call_with_conds): Remove failure path
> and return void.
> (shrink_wrap_one_built_in_call): Likewise.
> (use_internal_fn): Likewise.
> (shrink_wrap_conditional_dead_built_in_calls): Update accordingly
> and return void.  Call replace_abnormal_ssa_names.
> (pass_call_cdce::execute): Check can_guard_call_p during the
> initial walk.  Assume shrink_wrap_conditional_dead_built_in_calls
> will always change something.
>
> gcc/testsuite/
> * gcc.dg/torture/pr71020.c: New test.
>
> Index: gcc/tree-dfa.h
> ===
> --- gcc/tree-dfa.h
> +++ gcc/tree-dfa.h
> @@ -35,6 +35,7 @@ extern tree get_addr_base_and_unit_offset_1 (tree, HOST_WIDE_INT *,
>  tree (*) (tree));
>  extern tree get_addr_base_and_unit_offset (tree, HOST_WIDE_INT *);
>  extern bool stmt_references_abnormal_ssa_name (gimple *);
> +extern void replace_abnormal_ssa_names (gimple *);
>  extern void dump_enumerated_decls (FILE *, int);
>
>
> Index: gcc/tree-dfa.c
> ===
> --- gcc/tree-dfa.c
> +++ gcc/tree-dfa.c
> @@ -823,6 +823,29 @@ stmt_references_abnormal_ssa_name (gimple *stmt)
>return false;
>  }
>
> +/* If STMT takes any abnormal PHI values as input, replace them with
> +   local copies.  */
> +
> +void
> +replace_abnormal_ssa_names (gimple *stmt)
> +{
> +  ssa_op_iter oi;
> +  use_operand_p use_p;
> +
> +  FOR_EACH_SSA_USE_OPERAND (use_p, stmt, oi, SSA_OP_USE)
> +{
> +  tree op = USE_FROM_PTR (use_p);
> +  if (TREE_CODE (op) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
> +   {
> + gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> + tree new_name = make_ssa_name (TREE_TYPE (op));
> + gassign *assign = gimple_build_assign (new_name, op);
> + gsi_insert_before (&gsi, assign, GSI_SAME_STMT);
> + SET_USE (use_p, new_name);
> +   }
> +}
> +}
> +
>  /* Pair of tree and a sorting index, for dump_enumerated_decls.  */
>  struct GTY(()) numbered_tree
>  {
> Index: gcc/tree-call-cdce.c
> ===
> --- gcc/tree-call-cdce.c
> +++ gcc/tree-call-cdce.c
> @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-into-ssa.h"
>  #include "builtins.h"
>  #include "internal-fn.h"
> +#include "tree-dfa.h"
>
>
>  /* This pass serves two closely-related purposes:
> @@ -349,6 +350,15 @@ edom_only_function (gcall *call)
>return false;
>  }
>  }
> +
> +/* Return true if it is structurally possible to guard CALL.  */
> +
> +static bool
> +can_guard_call_p (gimple *call)
> +{
> +  return (!stmt_ends_bb_p (call)
> + || find_fallthru_edge (gimple_bb (call)->succs));
> +}
>
>  /* A helper function to generate gimple statements for one bound
> comparison, so that the built-in function is called whenever
> @@ -747,11 +757,9 @@ gen_shrink_wrap_conditions (gcall *bi_call, vec<gimple *> conds,
>  #define ERR_PROB 0.01
>
>  /* Shrink-wrap BI_CALL so that it is only called when one of the NCONDS
> -   conditions in CONDS is false.
> +   conditions in CONDS is false.  */
>
> -   Return true on success, in which case the cfg will have been updated.  */
> -
> -static bool
> +static void
>  shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds,
>   unsigned int nconds)
>  {
> @@ -795,11 +803,10 @@ shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds,
>/* Now find the join target bb -- split bi_call_bb if needed.  */
>if (stmt_ends_bb_p (bi_call))
>  {
> -  /* If the call must be the last in the bb, don't split the block,
> -it could e.g. have EH edges.  */
> +  /* We checked that there was a fallthrough edge in
> +can_guard_call_p.  */
>join_tgt_in_edge_from_call = find_fallthru_edge (bi_call_bb->succs);
> -  if (join_tgt_in_edge_from_call == NULL)
> -return false;
> +  gcc_assert (join_tgt_in_edge_from_call);
>free_dominance_info 

[PATCH] Bump LTO bytecode major

2016-05-18 Thread Richard Biener

Committed.

Richard.

2016-05-18  Richard Biener  

* lto-streamer.h (LTO_major_version): Bump to 6.

Index: gcc/lto-streamer.h
===
--- gcc/lto-streamer.h  (revision 236373)
+++ gcc/lto-streamer.h  (working copy)
@@ -128,7 +128,7 @@ along with GCC; see the file COPYING3.
  String are represented in the table as pairs, a length in ULEB128
  form followed by the data for the string.  */
 
-#define LTO_major_version 5
+#define LTO_major_version 6
 #define LTO_minor_version 0
 
 typedef unsigned char  lto_decl_flags_t;


[PATCH] Improve TBAA with unions

2016-05-18 Thread Richard Biener

The following adjusts get_alias_set behavior when applied to
union accesses to use the union alias-set rather than alias-set
zero.  This is in line with behavior from the alias oracle
which (bogously) circumvents alias-set zero with looking at
the alias-sets of the base object.  Thus for

union U { int i; float f; };

float
foo (union U *u, double *p)
{
  u->f = 1.;
  *p = 0;
  return u->f;
}

the langhooks ensured u->f has alias-set zero and thus disambiguation
against *p was not allowed.  Still the alias-oracle did the disambiguation
by using the alias set of the union here (I think optimizing the
return to return 1. is valid).
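
As an illustration only (this is not part of the patch), the sketch below
repeats the function from the example and adds a second, hypothetical helper
showing the kind of type punning that goes directly through a union, which is
what the moved comment describes as a GCC extension.  The float_bits union and
float_to_bits name are assumptions made for the example, and the expected bit
pattern assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>

union U { int i; float f; };

/* The function from the message: the store through P has type double,
   so the question is whether it may be disambiguated against u->f.  */
float
foo (union U *u, double *p)
{
  u->f = 1.;
  *p = 0;
  return u->f;
}

/* Hypothetical helper: punning done directly through a union access.
   Assumes float is 32 bits wide.  */
union float_bits { float f; uint32_t bits; };

uint32_t
float_to_bits (float f)
{
  union float_bits fb;
  fb.f = f;
  return fb.bits;
}
```

When P points to an ordinary double object, foo returns 1.0f regardless of
whether the compiler reloads u->f or reuses the stored value.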

We have a good place in the middle-end to apply such rules which
is component_uses_parent_alias_set_from - this is where I move
the logic that is duplicated in various frontends.

The Java and Ada frontends do not allow union type punning (LTO does),
so this patch may eventually pessimize them.  I don't care anything
about Java but Ada folks might want to chime in.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Ok for trunk?

Thanks,
Richard.

2016-05-18  Richard Biener  

* alias.c (component_uses_parent_alias_set_from): Handle
type punning through union accesses by using the union alias set.
* gimple.c (gimple_get_alias_set): Remove union type punning case.

c-family/
* c-common.c (c_common_get_alias_set): Remove union type punning case.

fortran/
* f95-lang.c (LANG_HOOKS_GET_ALIAS_SET): Remove (un-)define.
(gfc_get_alias_set): Remove.


Index: trunk/gcc/alias.c
===
*** trunk.orig/gcc/alias.c  2016-05-18 11:15:41.744792403 +0200
--- trunk/gcc/alias.c   2016-05-18 11:31:40.139709782 +0200
*** component_uses_parent_alias_set_from (co
*** 619,624 
--- 619,632 
case COMPONENT_REF:
  if (DECL_NONADDRESSABLE_P (TREE_OPERAND (t, 1)))
found = t;
+ /* Permit type-punning when accessing a union, provided the access
+is directly through the union.  For example, this code does not
+permit taking the address of a union member and then storing
+through it.  Even the type-punning allowed here is a GCC
+extension, albeit a common and useful one; the C standard says
+that such accesses have implementation-defined behavior.  */
+ else if (TREE_CODE (TREE_TYPE (TREE_OPERAND (t, 0))) == UNION_TYPE)
+   found = t;
  break;
  
case ARRAY_REF:
Index: trunk/gcc/c-family/c-common.c
===
*** trunk.orig/gcc/c-family/c-common.c  2016-05-18 11:15:41.744792403 +0200
--- trunk/gcc/c-family/c-common.c   2016-05-18 11:31:40.143709828 +0200
*** static GTY(()) hash_table
*** 4734,4741 
  alias_set_type
  c_common_get_alias_set (tree t)
  {
-   tree u;
- 
/* For VLAs, use the alias set of the element type rather than the
   default of alias set 0 for types compared structurally.  */
if (TYPE_P (t) && TYPE_STRUCTURAL_EQUALITY_P (t))
--- 4734,4739 
*** c_common_get_alias_set (tree t)
*** 4745,4763 
return -1;
  }
  
-   /* Permit type-punning when accessing a union, provided the access
-  is directly through the union.  For example, this code does not
-  permit taking the address of a union member and then storing
-  through it.  Even the type-punning allowed here is a GCC
-  extension, albeit a common and useful one; the C standard says
-  that such accesses have implementation-defined behavior.  */
-   for (u = t;
-TREE_CODE (u) == COMPONENT_REF || TREE_CODE (u) == ARRAY_REF;
-u = TREE_OPERAND (u, 0))
- if (TREE_CODE (u) == COMPONENT_REF
-   && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
-   return 0;
- 
/* That's all the expressions we handle specially.  */
if (!TYPE_P (t))
  return -1;
--- 4743,4748 
Index: trunk/gcc/fortran/f95-lang.c
===
*** trunk.orig/gcc/fortran/f95-lang.c   2016-05-18 11:15:41.744792403 +0200
--- trunk/gcc/fortran/f95-lang.c2016-05-18 11:31:48.623806334 +0200
*** static bool global_bindings_p (void);
*** 74,80 
  static bool gfc_init (void);
  static void gfc_finish (void);
  static void gfc_be_parse_file (void);
- static alias_set_type gfc_get_alias_set (tree);
  static void gfc_init_ts (void);
  static tree gfc_builtin_function (tree);
  
--- 74,79 
*** static const struct attribute_spec gfc_a
*** 110,116 
  #undef LANG_HOOKS_MARK_ADDRESSABLE
  #undef LANG_HOOKS_TYPE_FOR_MODE
  #undef LANG_HOOKS_TYPE_FOR_SIZE
- #undef LANG_HOOKS_GET_ALIAS_SET
  #undef LANG_HOOKS_INIT_TS
  #undef LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE
  #undef 

Re: [PATCH 16/17][ARM] Add tests for VFP FP16 ACLE instrinsics.

2016-05-18 Thread Matthew Wahab

On 18/05/16 02:06, Joseph Myers wrote:

On Tue, 17 May 2016, Matthew Wahab wrote:


In some tests, there are unavoidable differences in precision when calculating
the actual and the expected results of an FP16 operation. A new support function
CHECK_FP_BIAS is used so that these tests can check for an acceptable margin of
error. In these tests, the tolerance is given as the absolute integer difference
between the bitvectors of the expected and the actual results.
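
A minimal sketch of such a tolerance check, for illustration: it compares the
raw binary16 bit patterns and accepts them if their absolute integer
difference is within a given bound.  The function name and signature are
assumptions; this is not the CHECK_FP_BIAS macro from the patch, whose
definition is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the check described above.  EXPECTED and
   ACTUAL are raw binary16 bit patterns; the comparison is only
   meaningful when both values have the same sign.  */
static bool
fp16_bits_within (uint16_t expected, uint16_t actual, uint16_t tolerance)
{
  uint16_t diff = expected > actual
                  ? (uint16_t) (expected - actual)
                  : (uint16_t) (actual - expected);
  return diff <= tolerance;
}
```

For example, 0x3c00 (1.0 in binary16) and 0x3c01 differ by one in the last
place, so they pass with a tolerance of 1 but not with a tolerance of 0.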


As far as I can see, CHECK_FP_BIAS is only used in the following patch, but there
is another bias test in vsqrth_f16_1.c in this patch.


This is my mistake, the CHECK_FP_BIAS is used for the NEON tests and should have gone
into that patch. The VFP test can do a simpler check so doesn't need the macro.


Could you clarify where the "unavoidable differences in precision" come from? Are
the results of some of the new instructions not fully specified, only specified
within a given precision?  (As far as I can tell the existing v8 instructions for
reciprocal and reciprocal square root estimates do have fully defined results,
despite being loosely described as estimates.)


The expected results in the new tests are represented as expressions whose value is
expected to be calculated at compile-time. This makes the tests more readable but
differences in the precision between the compiler and the HW calculations mean
that for vrecpe_f16, vrecps_f16, vrsqrts_f16 and vsqrth_f16_1.c the expected and
actual results are different.


On reflection, it may be better to remove the CHECK_FP_BIAS macro and, for the tests 
that needed it, to drop the compiler calculation and just use the expected 
hexadecimal value.


Other tests depending on compile-time calculations involve relatively simple
arithmetic operations and it's not clear if they are susceptible to the same rounding
errors. I have limited knowledge in FP arithmetic though so I'll look into this.


Matthew


[PATCH] Help PR70729, shuffle LIM and PRE

2016-05-18 Thread Richard Biener

The following patch moves LIM before PRE to allow it to cleanup CSE
(and copyprop) opportunities LIM exposes.  It also moves the DCE done
in loop before the loop pipeline as otherwise it is no longer executed
unconditionally at this point (since we have the no_loop pipeline).

The patch requires some testsuite adjustments, such as coping with LIM now
running before PRE by disabling the former, and adjusting
for the better optimization we now do in the two testcases with redundant
stores, where store motion enables sinking to sink all interesting code
out of the innermost loop.
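
To make the "redundant stores" point concrete, here is a small sketch modelled
on the outer-6.c adjustment (y[i]=x[i][j] becoming y[i]+=x[i][j]).  The
function names and the fixed size N are assumptions for the example, and the
"reduced" form only illustrates what a compiler is allowed to do, not what any
particular pass emits.

```c
#define N 4

/* Inner loop in the style of the original testcase: every iteration
   overwrites y[i], so only the store from the last j iteration is
   observable afterwards.  */
void
overwrite_rows (int x[N][N], int y[N])
{
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      y[i] = x[i][j];
}

/* Equivalent code once the redundant stores are gone: the inner loop
   has no interesting work left.  */
void
overwrite_rows_reduced (int x[N][N], int y[N])
{
  for (int i = 0; i < N; i++)
    y[i] = x[i][N - 1];
}

/* Adjusted form: each iteration contributes to the result, so the
   inner loop cannot be emptied this way.  */
void
accumulate_rows (int x[N][N], int y[N])
{
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      y[i] += x[i][j];
}
```

The first two functions always leave the same values in y, which is why the
testcase needed an accumulation to keep the inner loop meaningful.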

It also requires the LIM PHI hoisting cost adjustment patch I am
testing separately.

Bootstrapped and tested on x86_64-unknown-linux-gnu (with testsuite
fallout resulting in the following adjustments).

I'm going to re-test before committing.

Richard.

2016-05-18  Richard Biener  

PR tree-optimization/70729
* passes.def: Move LIM pass before PRE.  Remove no longer
required copyprop and move first DCE out of the loop pipeline.

* gcc.dg/autopar/outer-6.c: Adjust to avoid redundant store.
* gcc.dg/graphite/scop-18.c: Likewise.
* gcc.dg/pr41783.c: Disable LIM.
* gcc.dg/tree-ssa/loadpre10.c: Likewise.
* gcc.dg/tree-ssa/loadpre23.c: Likewise.
* gcc.dg/tree-ssa/loadpre24.c: Likewise.
* gcc.dg/tree-ssa/loadpre25.c: Likewise.
* gcc.dg/tree-ssa/loadpre4.c: Likewise.
* gcc.dg/tree-ssa/loadpre8.c: Likewise.
* gcc.dg/tree-ssa/ssa-pre-16.c: Likewise.
* gcc.dg/tree-ssa/ssa-pre-18.c: Likewise.
* gcc.dg/tree-ssa/ssa-pre-20.c: Likewise.
* gcc.dg/tree-ssa/ssa-pre-3.c: Likewise.
* gfortran.dg/pr42108.f90: Likewise.

Index: trunk/gcc/passes.def
===
--- trunk.orig/gcc/passes.def   2016-05-18 11:46:56.518134310 +0200
+++ trunk/gcc/passes.def   2016-05-18 11:47:16.006355920 +0200
@@ -243,12 +243,14 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_cse_sincos);
   NEXT_PASS (pass_optimize_bswap);
   NEXT_PASS (pass_laddress);
+  NEXT_PASS (pass_lim);
   NEXT_PASS (pass_split_crit_edges);
   NEXT_PASS (pass_pre);
   NEXT_PASS (pass_sink_code);
   NEXT_PASS (pass_sancov);
   NEXT_PASS (pass_asan);
   NEXT_PASS (pass_tsan);
+  NEXT_PASS (pass_dce);
   /* Pass group that runs when 1) enabled, 2) there are loops
 in the function.  Make sure to run pass_fix_loops before
 to discover/remove loops before running the gate function
@@ -257,9 +259,6 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_tree_loop);
   PUSH_INSERT_PASSES_WITHIN (pass_tree_loop)
  NEXT_PASS (pass_tree_loop_init);
- NEXT_PASS (pass_lim);
- NEXT_PASS (pass_copy_prop);
- NEXT_PASS (pass_dce);
  NEXT_PASS (pass_tree_unswitch);
  NEXT_PASS (pass_scev_cprop);
  NEXT_PASS (pass_record_bounds);
Index: trunk/gcc/testsuite/gcc.dg/autopar/outer-6.c
===
--- trunk.orig/gcc/testsuite/gcc.dg/autopar/outer-6.c   2016-01-20 15:36:51.477802338 +0100
+++ trunk/gcc/testsuite/gcc.dg/autopar/outer-6.c   2016-05-18 12:40:29.342665450 +0200
@@ -24,7 +24,7 @@ void parloop (int N)
   for (i = 0; i < N; i++)
   {
 for (j = 0; j < N; j++)
-  y[i]=x[i][j];
+  y[i]+=x[i][j];
 sum += y[i];
   }
   g_sum = sum;
Index: trunk/gcc/testsuite/gcc.dg/graphite/scop-18.c
===
--- trunk.orig/gcc/testsuite/gcc.dg/graphite/scop-18.c  2015-09-14 10:21:31.364089947 +0200
+++ trunk/gcc/testsuite/gcc.dg/graphite/scop-18.c   2016-05-18 12:38:35.673369299 +0200
@@ -13,13 +13,13 @@ void test (void)
   for (i = 0; i < 24; i++)
 for (j = 0; j < 24; j++)
   for (k = 0; k < 24; k++)
-A[i][j] = B[i][k] * C[k][j];
+A[i][j] += B[i][k] * C[k][j];
 
   /* These loops should still be strip mined.  */
   for (i = 0; i < 1000; i++)
 for (j = 0; j < 1000; j++)
   for (k = 0; k < 1000; k++)
-A[i][j] = B[i][k] * C[k][j];
+A[i][j] += B[i][k] * C[k][j];
 }
 
 /* { dg-final { scan-tree-dump-times "number of SCoPs: 1" 1 "graphite"} } */
Index: trunk/gcc/testsuite/gcc.dg/pr41783.c
===
--- trunk.orig/gcc/testsuite/gcc.dg/pr41783.c   2015-06-09 15:45:14.092224446 +0200
+++ trunk/gcc/testsuite/gcc.dg/pr41783.c   2016-05-18 11:47:31.454531583 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fdump-tree-pre" } */
+/* { dg-options "-O3 -fdump-tree-pre -fno-tree-loop-im" } */
 int db[100];
 int a_global_var, fact;
 int main()
Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/loadpre10.c
===
--- trunk.orig/gcc/testsuite/gcc.dg/tree-ssa/loadpre10.c

Re: [PATCH, ARM 4/7, ping1] Factor out MOVW/MOVT availability and desirability checks

2016-05-18 Thread Kyrill Tkachov

Hi Thomas,

On 17/05/16 11:11, Thomas Preudhomme wrote:

Ping?

*** gcc/ChangeLog ***

2015-11-09  Thomas Preud'homme  

 * config/arm/arm.h (TARGET_USE_MOVT): Check MOVT/MOVW availability
 with TARGET_HAVE_MOVT.
 (TARGET_HAVE_MOVT): Define.
 * config/arm/arm.c (const_ok_for_op): Check MOVT/MOVW
 availability with TARGET_HAVE_MOVT.
 * config/arm/arm.md (arm_movt): Use TARGET_HAVE_MOVT to check movt
 availability.
 (addsi splitter): Use TARGET_USE_MOVT to check whether to use
 movt + movw.
 (symbol_refs movsi splitter): Remove TARGET_32BIT check.
 (arm_movtas_ze): Use TARGET_HAVE_MOVT to check movt availability.
 * config/arm/constraints.md (define_constraint "j"): Use
 TARGET_HAVE_MOVT to check movt availability.


diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 1d976b36300d92d538098b3cf83c60d62ed2be1c..47216b4a1959ccdb18e329db411bf7f941e67163 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -237,7 +237,7 @@ extern void (*arm_lang_output_object_attributes_hook)
(void);
  
  /* Should MOVW/MOVT be used in preference to a constant pool.  */

  #define TARGET_USE_MOVT \
-  (arm_arch_thumb2 \
+  (TARGET_HAVE_MOVT \
 && (arm_disable_literal_pool \
 || (!optimize_size && !current_tune->prefer_constant_pool)))
  
@@ -268,6 +268,9 @@ extern void (*arm_lang_output_object_attributes_hook)

(void);
  /* Nonzero if this chip supports load-acquire and store-release.  */
  #define TARGET_HAVE_LDACQ (TARGET_ARM_ARCH >= 8 && arm_arch_notm)
  
+/* Nonzero if this chip provides the movw and movt instructions.  */

+#define TARGET_HAVE_MOVT   (arm_arch_thumb2)
+
  /* Nonzero if integer division instructions supported.  */
  #define TARGET_IDIV   ((TARGET_ARM && arm_arch_arm_hwdiv) \
 || (TARGET_THUMB2 && arm_arch_thumb_hwdiv))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7b95ba0b379c31ee650e714ce2198a43b1cadbac..d75a34f10d5ed22cff0a0b5d3ad433f111b059ee 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3897,7 +3897,7 @@ const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code)
  {
  case SET:
/* See if we can use movw.  */
-  if (arm_arch_thumb2 && (i & 0x) == 0)
+  if (TARGET_HAVE_MOVT && (i & 0x) == 0)
return 1;
else
/* Otherwise, try mvn.  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 4049f104c6d5fd8bfd8f68ecdfae6a3d34d4333f..094423477acb8d9223fd06c17e82bfd0a94d 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5705,7 +5705,7 @@
[(set (match_operand:SI 0 "nonimmediate_operand" "=r")
(lo_sum:SI (match_operand:SI 1 "nonimmediate_operand" "0")
   (match_operand:SI 2 "general_operand"  "i")))]
-  "arm_arch_thumb2 && arm_valid_symbolic_address_p (operands[2])"
+  "TARGET_HAVE_MOVT && arm_valid_symbolic_address_p (operands[2])"
"movt%?\t%0, #:upper16:%c2"
[(set_attr "predicable" "yes")
 (set_attr "predicable_short_it" "no")
@@ -5765,8 +5765,7 @@
[(set (match_operand:SI 0 "arm_general_register_operand" "")
(const:SI (plus:SI (match_operand:SI 1 "general_operand" "")
   (match_operand:SI 2 "const_int_operand" ""]
-  "TARGET_THUMB2
-   && arm_disable_literal_pool
+  "TARGET_USE_MOVT


This is not an equivalent change.
First, TARGET_THUMB2 and arm_arch_thumb2 are not exactly the same.
TARGET_THUMB2 means that the selected architecture supports Thumb2 AND
the user is compiling for Thumb2. arm_arch_thumb2, on the other hand, means
that the selected architecture supports Thumb2, but it will be set even
when compiling for -marm (for example -march=armv7-a -marm).

In this case we want the pattern to apply only when actually targeting Thumb2.

Second, TARGET_USE_MOVT is not just TARGET_THUMB2 && arm_disable_literal_pool.
(With this patch) it's defined as:
#define TARGET_USE_MOVT \
  (TARGET_HAVE_MOVT \
   && (arm_disable_literal_pool \
   || (!optimize_size && !current_tune->prefer_constant_pool)))

So, if you want to enable this pattern for ARMv8-M in the next patch, I think
what you want is to replace TARGET_THUMB2 with TARGET_THUMB && TARGET_HAVE_MOVT.
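
To make the difference concrete, the sketch below models the conditions
as plain boolean functions.  The struct and function names are invented
for illustration and are not GCC code; the formulas follow the
descriptions and the TARGET_USE_MOVT definition quoted above, with
thumb_mode standing in for "the user is compiling for Thumb".

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant target state.  */
struct arm_cfg {
  bool arch_thumb2;           /* models arm_arch_thumb2 */
  bool thumb_mode;            /* true for -mthumb, false for -marm */
  bool disable_literal_pool;  /* models arm_disable_literal_pool */
  bool optimize_size;
  bool prefer_constant_pool;  /* models current_tune->prefer_constant_pool */
};

static bool target_thumb2 (const struct arm_cfg *c)
{
  return c->thumb_mode && c->arch_thumb2;
}

static bool target_have_movt (const struct arm_cfg *c)
{
  return c->arch_thumb2;
}

static bool target_use_movt (const struct arm_cfg *c)
{
  return target_have_movt (c)
	 && (c->disable_literal_pool
	     || (!c->optimize_size && !c->prefer_constant_pool));
}

/* Leading part of the splitter condition before the patch.  */
static bool old_split_cond (const struct arm_cfg *c)
{
  return target_thumb2 (c) && c->disable_literal_pool;
}

/* Leading part of the splitter condition as posted.  */
static bool posted_split_cond (const struct arm_cfg *c)
{
  return target_use_movt (c);
}

/* Leading part with only TARGET_THUMB2 replaced, as suggested.  */
static bool suggested_split_cond (const struct arm_cfg *c)
{
  return c->thumb_mode && target_have_movt (c) && c->disable_literal_pool;
}
```

For -march=armv7-a -marm with default tuning, the old and suggested
conditions are false while the posted one is true, which is the
non-equivalence pointed out above.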

Kyrill


 && reload_completed
 && GET_CODE (operands[1]) == SYMBOL_REF"
[(clobber (const_int 0))]
@@ -5796,8 +5795,7 @@
  (define_split
[(set (match_operand:SI 0 "arm_general_register_operand" "")
 (match_operand:SI 1 "general_operand" ""))]
-  "TARGET_32BIT
-   && TARGET_USE_MOVT && GET_CODE (operands[1]) == SYMBOL_REF
+  "TARGET_USE_MOVT && GET_CODE (operands[1]) == SYMBOL_REF
 && !flag_pic && !target_word_relocations
 && !arm_tls_referenced_p (operands[1])"
[(clobber (const_int 0))]
@@ -10965,7 +10963,7 @@
 (const_int 16)
 (const_int 16))
  (match_operand:SI 1 

[PATCH] Fix LIM PHI movement cost

2016-05-18 Thread Richard Biener

The PHI movement penalty is bogusly biased by adding the cost of moving
the condition to total_cost - but the condition is computed unconditionally
and thus its cost should be added to the PHI cost.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

This patch is required to avoid regressing with a followup that will
exchange LIM and PRE (PRE can clean up after LIM, which avoids one
copyprop pass, and it can perform CSE, which increases vectorization
opportunities - failed to find the PR right now).

Richard.

2016-05-18  Richard Biener  

* tree-ssa-loop-im.c (determine_max_movement): Properly add
condition cost to PHI cost instead of total_cost.

Index: gcc/tree-ssa-loop-im.c
===
--- gcc/tree-ssa-loop-im.c  (revision 236361)
+++ gcc/tree-ssa-loop-im.c  (working copy)
@@ -717,7 +717,7 @@ determine_max_movement (gimple *stmt, bo
return false;
  def_data = get_lim_data (SSA_NAME_DEF_STMT (val));
  if (def_data)
-   total_cost += def_data->cost;
+   lim_data->cost += def_data->cost;
}
 
  /* We want to avoid unconditionally executing very expensive
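
A toy model of the accounting change, for illustration only.  The struct,
the function names and the simplified rejection test are assumptions and
not the code in tree-ssa-loop-im.c; the sketch assumes the PHI cost is
that of the cheapest argument, the penalty is total_cost minus that
minimum, and the threshold is an arbitrary number.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the costs computed for one PHI.  */
struct phi_costs {
  unsigned phi_cost;    /* cost credited to moving the PHI */
  unsigned total_cost;  /* sum over all PHI argument chains */
  unsigned min_cost;    /* cheapest PHI argument chain */
};

/* Assumes n >= 1.  With FIXED false the condition cost goes into
   total_cost (old behaviour); with FIXED true it goes into the PHI
   cost (patched behaviour).  */
static struct phi_costs
account_phi (const unsigned *arg_costs, size_t n, unsigned cond_cost,
	     bool fixed)
{
  struct phi_costs r = { 0, 0, UINT_MAX };
  for (size_t i = 0; i < n; i++)
    {
      if (arg_costs[i] < r.min_cost)
	r.min_cost = arg_costs[i];
      r.total_cost += arg_costs[i];
    }
  r.phi_cost = r.min_cost;
  if (fixed)
    r.phi_cost += cond_cost;
  else
    r.total_cost += cond_cost;
  return r;
}

/* Simplified stand-in for the "too expensive to execute
   unconditionally" check.  */
static bool
movement_rejected (struct phi_costs r, unsigned threshold)
{
  return r.total_cost - r.min_cost >= threshold;
}
```

With argument costs 1 and 5, a condition cost of 20 and a threshold of
20, the old accounting yields a penalty of 25 and rejects the movement,
while the patched accounting yields a penalty of 5 and a PHI cost of 21.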


[PATCH Obvious/r236200]Check invariant expression pointer, not pointer to the pointer

2016-05-18 Thread Bin Cheng
Hi,
Revision 236200 checks the wrong pointer for the invariant expression, which
accidentally clears all depends_on information.  This causes wrong costs and bad
code generation, for example for gcc.target/arm/pr42505.c on thumb1 targets.
This patch fixes the issue.
Test result checked for gcc.target/arm/pr42505.c.  Apply as obvious.

Thanks,
bin

2016-05-18 Bin Cheng  

* tree-ssa-loop-ivopts.c (get_computation_cost_at): Check invariant
expression pointer, not pointer to the pointer.

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index e8953a0..9ce6b64 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -4874,7 +4874,7 @@ get_computation_cost_at (struct ivopts_data *data,
   *inv_expr = get_loop_invariant_expr (data, ubase, cbase, ratio,
   address_p);
   /* Clear depends on.  */
-  if (inv_expr != NULL)
+  if (*inv_expr != NULL)
bitmap_clear (*depends_on);
 }
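
The one-character fix is easier to see in isolation.  In the reduced
sketch below the struct and the function names are hypothetical; only the
shape of the check mirrors the hunk above.  The caller passes the address
of its own pointer, so the outer pointer is never NULL and testing it
says nothing about whether an invariant expression was found.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the invariant expression type.  */
struct inv_expr_ent { int id; };

/* Returns true when the dependency bitmap would be cleared.
   Checks the out-parameter itself, as revision 236200 did.  */
static bool
clears_depends_on_buggy (struct inv_expr_ent **inv_expr,
			 struct inv_expr_ent *found)
{
  *inv_expr = found;
  return inv_expr != NULL;
}

/* Checks the pointer that was just stored, as the fix does.  */
static bool
clears_depends_on_fixed (struct inv_expr_ent **inv_expr,
			 struct inv_expr_ent *found)
{
  *inv_expr = found;
  return *inv_expr != NULL;
}
```

When no invariant expression is found, the first variant still reports
that depends_on gets cleared, while the second one does not.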
 


Re: [PATCH, ARM 3/7, ping1] Fix indentation of FL_FOR_ARCH* definition after adding support for ARMv8-M

2016-05-18 Thread Kyrill Tkachov

Hi Thomas,

On 17/05/16 11:10, Thomas Preudhomme wrote:

Ping?

*** gcc/ChangeLog ***

2015-11-06  Thomas Preud'homme  

 * config/arm/arm-protos.h: Reindent FL_FOR_* macro definitions.


diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 63235cb63acf3e676fac5b61e1195081efd64075..f437d0d8baa5534f9519dd28cd2c4ac52d48685c 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -395,30 +395,31 @@ extern bool arm_is_constant_pool_ref (rtx);
  #define FL_TUNE   (FL_WBUF | FL_VFPV2 | FL_STRONG | FL_LDSCHED \
 | FL_CO_PROC)
  
-#define FL_FOR_ARCH2	FL_NOTM

-#define FL_FOR_ARCH3   (FL_FOR_ARCH2 | FL_MODE32)
-#define FL_FOR_ARCH3M  (FL_FOR_ARCH3 | FL_ARCH3M)
-#define FL_FOR_ARCH4   (FL_FOR_ARCH3M | FL_ARCH4)
-#define FL_FOR_ARCH4T  (FL_FOR_ARCH4 | FL_THUMB)
-#define FL_FOR_ARCH5   (FL_FOR_ARCH4 | FL_ARCH5)
-#define FL_FOR_ARCH5T  (FL_FOR_ARCH5 | FL_THUMB)
-#define FL_FOR_ARCH5E  (FL_FOR_ARCH5 | FL_ARCH5E)
-#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB)
-#define FL_FOR_ARCH5TEJFL_FOR_ARCH5TE
-#define FL_FOR_ARCH6   (FL_FOR_ARCH5TE | FL_ARCH6)
-#define FL_FOR_ARCH6J  FL_FOR_ARCH6
-#define FL_FOR_ARCH6K  (FL_FOR_ARCH6 | FL_ARCH6K)
-#define FL_FOR_ARCH6Z  FL_FOR_ARCH6
-#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ)
-#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2)
-#define FL_FOR_ARCH6M  (FL_FOR_ARCH6 & ~FL_NOTM)
-#define FL_FOR_ARCH7   ((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7)
-#define FL_FOR_ARCH7A  (FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K)
-#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | FL_THUMB_DIV | FL_ARM_DIV)
-#define FL_FOR_ARCH7R  (FL_FOR_ARCH7A | FL_THUMB_DIV)
-#define FL_FOR_ARCH7M  (FL_FOR_ARCH7 | FL_THUMB_DIV)
-#define FL_FOR_ARCH7EM  (FL_FOR_ARCH7M | FL_ARCH7EM)
-#define FL_FOR_ARCH8A  (FL_FOR_ARCH7VE | FL_ARCH8)
+#define FL_FOR_ARCH2   FL_NOTM
+#define FL_FOR_ARCH3   (FL_FOR_ARCH2 | FL_MODE32)
+#define FL_FOR_ARCH3M  (FL_FOR_ARCH3 | FL_ARCH3M)
+#define FL_FOR_ARCH4   (FL_FOR_ARCH3M | FL_ARCH4)
+#define FL_FOR_ARCH4T  (FL_FOR_ARCH4 | FL_THUMB)
+#define FL_FOR_ARCH5   (FL_FOR_ARCH4 | FL_ARCH5)
+#define FL_FOR_ARCH5T  (FL_FOR_ARCH5 | FL_THUMB)
+#define FL_FOR_ARCH5E  (FL_FOR_ARCH5 | FL_ARCH5E)
+#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB)
+#define FL_FOR_ARCH5TEJFL_FOR_ARCH5TE


This one looks misindented.
Ok with that fixed once the prerequisites are approved.

Kyrill


+#define FL_FOR_ARCH6   (FL_FOR_ARCH5TE | FL_ARCH6)
+#define FL_FOR_ARCH6J  FL_FOR_ARCH6
+#define FL_FOR_ARCH6K  (FL_FOR_ARCH6 | FL_ARCH6K)
+#define FL_FOR_ARCH6Z  FL_FOR_ARCH6
+#define FL_FOR_ARCH6ZK FL_FOR_ARCH6K
+#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ)
+#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2)
+#define FL_FOR_ARCH6M  (FL_FOR_ARCH6 & ~FL_NOTM)
+#define FL_FOR_ARCH7   ((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7)
+#define FL_FOR_ARCH7A  (FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K)
+#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | FL_THUMB_DIV | FL_ARM_DIV)
+#define FL_FOR_ARCH7R  (FL_FOR_ARCH7A | FL_THUMB_DIV)
+#define FL_FOR_ARCH7M  (FL_FOR_ARCH7 | FL_THUMB_DIV)
+#define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | FL_ARCH7EM)
+#define FL_FOR_ARCH8A  (FL_FOR_ARCH7VE | FL_ARCH8)
  #define FL2_FOR_ARCH8_1A  FL2_ARCH8_1
  #define FL_FOR_ARCH8M_BASE(FL_FOR_ARCH6M | FL_ARCH8 | FL_THUMB_DIV)
  #define FL_FOR_ARCH8M_MAIN(FL_FOR_ARCH7M | FL_ARCH8)


Best regards,

Thomas

On Thursday 17 December 2015 15:50:31 Thomas Preud'homme wrote:

Hi,

This patch is part of a patch series to add support for ARMv8-M[1] to GCC.
This specific patch fixes the indentation of FL_FOR_ARCH* macros definition
following the patch to add support for ARMv8-M. Since this is an obvious
change, I'm not expecting a review and will commit it as soon as the other
patches in the series are accepted.

[1] For a quick overview of ARMv8-M please refer to the initial cover
letter.

ChangeLog entry is as follows:


*** gcc/ChangeLog ***

2015-11-06  Thomas Preud'homme  

 * config/arm/arm-protos.h: Reindent FL_FOR_* macro definitions.


diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 1371ee7..bf0d1b4 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -391,32 +391,33 @@ extern bool arm_is_constant_pool_ref (rtx);
  #define FL_TUNE   (FL_WBUF | FL_VFPV2 | FL_STRONG | FL_LDSCHED \

 | FL_CO_PROC)

-#define FL_FOR_ARCH2   FL_NOTM
-#define FL_FOR_ARCH3   (FL_FOR_ARCH2 | FL_MODE32)
-#define FL_FOR_ARCH3M  (FL_FOR_ARCH3 | FL_ARCH3M)
-#define FL_FOR_ARCH4   (FL_FOR_ARCH3M | FL_ARCH4)
-#define FL_FOR_ARCH4T  (FL_FOR_ARCH4 | FL_THUMB)
-#define FL_FOR_ARCH5   (FL_FOR_ARCH4 | FL_ARCH5)
-#define FL_FOR_ARCH5T  (FL_FOR_ARCH5 | FL_THUMB)

[PATCH] Fix PR fortran/70856

2016-05-18 Thread Martin Liška
Hello.

The following patch adds support to IPA ICF, which was missing a proper
DECL_PT_UID update in situations where we merge variables.

The patch bootstraps and introduces no new regressions on x86_64-linux-gnu.

Ready for trunk?
Thanks,
Martin
From 35ec4381940677e9491f28b7d83c8b0fbbc45d6c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 18 May 2016 10:07:04 +0200
Subject: [PATCH] Set DECL_PT_UID for merged variables in IPA ICF (PR70856).

gcc/ChangeLog:

2016-05-18  Martin Liska  

	* ipa-icf.c (sem_variable::merge): Set DECL_PT_UID for
	merged variables.
---
 gcc/ipa-icf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index dda5cac..3c04b5a 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -2258,6 +2258,8 @@ sem_variable::merge (sem_item *alias_item)
 
   varpool_node::create_alias (alias_var->decl, decl);
   alias->resolve_alias (original);
+  if (DECL_PT_UID_SET_P (original->decl))
+	SET_DECL_PT_UID (alias->decl, DECL_PT_UID (original->decl));
 
   if (dump_file)
 	fprintf (dump_file, "Unified; Variable alias has been created.\n\n");
-- 
2.8.2



Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z

2016-05-18 Thread Martin Liška
On 05/18/2016 10:38 AM, Kugan Vivekanandarajah wrote:
> Is this still OK? Bootstrap and regression testing on ARM, AARCH64 and
> x86-64 didn’t show any new regressions.
> 
> Thanks,
> Kugan

Hello.

I see various ICEs after your commit r236356:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71170

Martin

