Re: [patch, fortran] PR66461 ICE on missing end program in fixed source

2016-05-23 Thread Jerry DeLisle
On 05/22/2016 06:00 PM, Jerry DeLisle wrote:
> On 05/22/2016 04:53 AM, Andre Vehreschild wrote:
>> Hi Jerry,
>>
>> I have tested your patch and gave it a review and the only thing I would
>> like to have is a testcase. Can you provide one from the PR? With a testcase
>> I say the patch is ok for trunk and thanks for the patch.
>>
>> Please note, I don't have review rights in the area the patch
>> addresses, although I am familiar with the matcher having worked in it.
>> This "review" is just a helper for an official reviewer to "second" my
>> opinion, hoping to get your patch faster into trunk.
>>
>> Regards and thanks for the patch,
>>  Andre
> 
> Thanks Andre, I am going to hold off just a bit after some further info from 
> Mikael.
> 
> We are obviously intercepting the problem in different ways, but the root
> problem still exists.
> 
> Regards,
> 
> Jerry
> 

Committed rev 236627 under simple rule. Also credit Mikael Morin for breaking
through the ICE so to speak.  Many thanks,

Jerry

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=236627


Re: [PATCH] c++/71147 - [6 Regression] Flexible array member wrongly rejected in template

2016-05-23 Thread Jason Merrill

On 05/23/2016 07:18 PM, Martin Sebor wrote:

On 05/19/2016 07:30 AM, Jason Merrill wrote:

On 05/18/2016 09:40 PM, Martin Sebor wrote:

The handling of flexible array members whose element type was
dependent tried to deal with the case when the element type
was not yet completed but it did it wrong.  The attached patch
corrects the handling by trying to complete the element type
first.


How about changing complete_type to complete the element type even for
an array of unknown bound?  It seems to me that it would be useful to do
that to set alignment and 'structor flags even if we can't set TYPE_SIZE.

It would also be useful to have a 'complete type or array of unknown
bound of complete type' predicate to use here and in layout_var_decl,
grokdeclarator, and type_with_alias_set_p.


Thanks for the suggestions.  I implemented them in the attached
update to the patch.  The macro I added evaluates its argument
multiple times.  That normally isn't a problem unless it's invoked
with a non-trivial argument like a call to complete_type() that's
passed to COMPLETE_TYPE_P() in grokdeclarator.  One way to avoid
possible problems due to evaluating the macro argument more than
once is to introduce a helper inline function.  I haven't seen
it done in tree.h so I didn't introduce one in this patch either,
but it might be worth considering for the new macro and any other
non-trivial macros like it.


Yes, let's just make it an inline function (of which there are already 
quite a few in tree.h).


Jason
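
To illustrate the double-evaluation concern discussed above, here is a
minimal sketch. It assumes a simplified, hypothetical Type struct and a
complete_type_stub() function with a visible side effect; none of these
names are the real tree.h API, and the macro is only modelled on the
"complete type or array of unknown bound of complete type" predicate.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a type node (not the real tree API).
struct Type {
  bool complete;      // type has a known size
  bool is_array;      // type is an array
  bool has_domain;    // array has a known bound
  const Type *elem;   // element type for arrays, otherwise nullptr
};

// Counts calls so that repeated evaluation of an argument becomes visible.
int complete_type_calls = 0;

// Hypothetical stand-in for a function with side effects, such as
// complete_type ().
const Type *complete_type_stub (const Type *t)
{
  ++complete_type_calls;
  return t;
}

// Macro form: every use of T in the expansion re-evaluates the argument.
#define COMPLETE_OR_UNBOUND_ARRAY_MACRO(T)                            \
  ((T)->complete                                                      \
   || ((T)->is_array && !(T)->has_domain && (T)->elem->complete))

// Inline-function form: the argument is evaluated once, at the call site.
inline bool complete_or_unbound_array_p (const Type *t)
{
  return t->complete
         || (t->is_array && !t->has_domain && t->elem->complete);
}

// Returns how many times the stub ran while the macro was evaluated.
int calls_made_by_macro (const Type *t)
{
  complete_type_calls = 0;
  (void) COMPLETE_OR_UNBOUND_ARRAY_MACRO (complete_type_stub (t));
  return complete_type_calls;
}

// Returns how many times the stub ran while the function was evaluated.
int calls_made_by_function (const Type *t)
{
  complete_type_calls = 0;
  (void) complete_or_unbound_array_p (complete_type_stub (t));
  return complete_type_calls;
}
```

For an incomplete array of unknown bound whose element type is complete,
the macro form runs the stub once per mention of its argument, while the
function form runs it exactly once.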



Re: [C++ PATCH] Reject self-recursive constexpr calls even in templates (PR c++/70449)

2016-05-23 Thread Jason Merrill

On 04/01/2016 09:35 PM, Jason Merrill wrote:

On 04/01/2016 09:34 PM, Jason Merrill wrote:

On 04/01/2016 03:19 PM, Jakub Jelinek wrote:

As the testcase shows, when not in a template, cxx_eval_call_expression
already complains about self-recursive calls in constexpr contexts,
but if we are in a function template, we ICE on the testcase,
because we try to instantiate the function template we are in the
middle of parsing, e.g. function_end_locus is UNKNOWN_LOCATION, and only
the statements that have been already parsed are in there.


That's odd, we should have failed to instantiate the template.
Investigating further, it seems that we can check for DECL_INITIAL ==
error_mark_node to tell that a function is still being defined.  So this
patch does that, and also replaces my earlier fix for 70344.


This doesn't quite work for the 70344 testcase when optimizing, because 
we call cp_fold_function between when we set DECL_INITIAL to a BLOCK and 
when we lower invisiref parameters, which leads to confusion when we try 
to evaluate a recursive call.  So let's put back the check against 
current_function_decl.


Tested x86_64-pc-linux-gnu, applying to trunk and 6.
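
As a rough illustration of the kind of call this diagnostic is about
(this is a made-up sketch, not the testcase from either PR): a recursive
constexpr function is fine when the recursive call is only evaluated
after the definition is complete, but requiring a constant evaluation of
the function inside its own body is not. The names factorial and broken
are hypothetical.

```cpp
#include <cassert>

// Well-formed: the recursive call is only evaluated after the definition
// of factorial is complete, and the recursion has a base case.
constexpr unsigned factorial (unsigned n)
{
  return n <= 1 ? 1u : n * factorial (n - 1);
}

static_assert (factorial (5) == 120, "evaluated at compile time");

// Ill-formed variant, left commented out so that the block still compiles.
// The initializer of x must be a constant expression, so the compiler has
// to evaluate broken () while broken () is still being defined:
//
//   constexpr int broken ()
//   {
//     constexpr int x = broken ();
//     return x;
//   }
```

With the commented-out variant enabled, GCC is expected to reject the
call as one made in a constant expression before the definition is
complete, which is the error path the patch below touches.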

commit a65d5bc7e718824989e02233e6edf990afb15358
Author: Jason Merrill 
Date:   Fri May 20 12:35:30 2016 -0400

	PR c++/70344 - ICE with recursive constexpr

	* constexpr.c (cxx_eval_call_expression): Check for
	fun == current_function_decl again.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 7b56260..bb723f4 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1371,11 +1371,17 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
   else
 {
   new_call.fundef = retrieve_constexpr_fundef (fun);
-  if (new_call.fundef == NULL || new_call.fundef->body == NULL)
+  if (new_call.fundef == NULL || new_call.fundef->body == NULL
+	  || fun == current_function_decl)
 {
 	  if (!ctx->quiet)
 	{
-	  if (DECL_INITIAL (fun) == error_mark_node)
+	  /* We need to check for current_function_decl here in case we're
+		 being called during cp_fold_function, because at that point
+		 DECL_INITIAL is set properly and we have a fundef but we
+		 haven't lowered invisirefs yet (c++/70344).  */
+	  if (DECL_INITIAL (fun) == error_mark_node
+		  || fun == current_function_decl)
 		error_at (loc, "%qD called in a constant expression before its "
 			  "definition is complete", fun);
 	  else if (DECL_INITIAL (fun))
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-recursion2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-recursion2.C
index 978b998..ce2280c 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-recursion2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-recursion2.C
@@ -1,5 +1,6 @@
 // PR c++/70344
 // { dg-do compile { target c++11 } }
+// { dg-options -O }
 
 struct Z
 {


Re: [PATCH] Fix PR tree-optimization/71170

2016-05-23 Thread Kugan Vivekanandarajah
On 23 May 2016 at 21:35, Richard Biener  wrote:
> On Sat, May 21, 2016 at 8:08 AM, Kugan Vivekanandarajah
>  wrote:
>> On 20 May 2016 at 21:07, Richard Biener  wrote:
>>> On Fri, May 20, 2016 at 1:51 AM, Kugan Vivekanandarajah
>>>  wrote:
 Hi Richard,

> I think it should have the same rank as op or op + 1 which is the current
> behavior.  Sth else doesn't work correctly here I think, like inserting the
> multiplication not near the definition of op.
>
> Well, the whole "clever insertion" logic is simply flawed.

 What I meant to say was that the simple logic we have now wouldn’t
 work. "Clever logic" is knowing exactly where it is needed and
 inserting there.  I think that's what you are suggesting below in a
 simple-to-implement way.

> I'd say that ideally we would delay inserting the multiplication to
> rewrite_expr_tree time.  For example by adding a ops->stmt_to_insert
> member.
>

 Here is an implementation based on above. Bootstrap on x86-linux-gnu
 is OK. regression testing is ongoing.
>>>
>>> I like it.  Please push the insertion code to a helper as I think you need
>>> to postpone setting the stmts UID to that point.
>>>
>>> Ideally we'd make use of the same machinery in attempt_builtin_powi,
>>> removing the special-casing of powi_result.  (same as I said that ideally
>>> the plus->mult stuff would use the repeat-ops machinery...)
>>>
>>> I'm not 100% convinced the place you insert the stmt is correct but I
>>> haven't spent too much time to decipher reassoc in this area.
>>
>>
>> Hi Richard,
>>
>> Thanks. Here is a tested version of the patch. I did miss one place,
>> which I have fixed now (transform_stmt_to_copy). I also created a function
>> to do the insertion.
>>
>>
>> Bootstrap and regression testing on x86_64-linux-gnu are fine. Is this
>> OK for trunk?
>
> @@ -3798,6 +3805,7 @@ rewrite_expr_tree (gimple *stmt, unsigned int opindex,
>oe1 = ops[opindex];
>oe2 = ops[opindex + 1];
>
> +
>if (rhs1 != oe1->op || rhs2 != oe2->op)
> {
>   gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
>
> please remove this stray change.
>
> Ok with that change.

Hi Richard,

Thanks for the review. I also found another issue with this patch:
for the stmt_to_insert we will get a gimple_bb of NULL, which is not
expected in sort_by_operand_rank. This showed up only while
building a version of glibc.

Bootstrap and regression testing are ongoing. Is this OK for trunk if
it passes regression testing and bootstrap?

Thanks,
Kugan


gcc/ChangeLog:

2016-05-24  Kugan Vivekanandarajah  

* tree-ssa-reassoc.c (sort_by_operand_rank): Check for gimple_bb of NULL
for stmt_to_insert.


gcc/testsuite/ChangeLog:

2016-05-24  Kugan Vivekanandarajah  

* gcc.dg/tree-ssa/reassoc-44.c: New test.
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-44.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-44.c
index e69de29..9b12212 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-44.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-44.c
@@ -0,0 +1,10 @@
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned int a;
+int b, c;
+void fn1 ()
+{
+  b = a + c + c;
+}
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index fb683ad..06f4d1b 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -525,7 +525,7 @@ sort_by_operand_rank (const void *pa, const void *pb)
  gimple *stmtb = SSA_NAME_DEF_STMT (oeb->op);
  basic_block bba = gimple_bb (stmta);
  basic_block bbb = gimple_bb (stmtb);
- if (bbb != bba)
+ if (bba && bbb && bbb != bba)
{
  if (bb_rank[bbb->index] != bb_rank[bba->index])
return bb_rank[bbb->index] - bb_rank[bba->index];
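
The guard in the hunk above can be sketched in isolation as follows.
This is only an illustration with hypothetical, simplified types (Block,
Operand, compare_operands); it is not the reassoc data structure or the
real comparator, and the id tie-breaker is an assumption made for the
example.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; not the reassoc data structures.
struct Block { int rank; };

struct Operand {
  unsigned rank;       // primary sort key
  const Block *bb;     // defining block, or nullptr if not yet inserted
  unsigned id;         // tie-breaker (assumed for this sketch)
};

// Three-way comparison in the style of a qsort callback.
int compare_operands (const Operand &a, const Operand &b)
{
  if (a.rank != b.rank)
    return a.rank < b.rank ? -1 : 1;

  // Only consult the blocks when both are known; a null block is skipped
  // rather than dereferenced.
  if (a.bb && b.bb && a.bb != b.bb && a.bb->rank != b.bb->rank)
    return a.bb->rank < b.bb->rank ? -1 : 1;

  if (a.id != b.id)
    return a.id < b.id ? -1 : 1;
  return 0;
}
```

Note that skipping a criterion for some pairs but not others can make
the resulting order inconsistent across three or more elements, so the
sketch only demonstrates avoiding the null dereference, not a sort order
that is safe to rely on.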


Re: [patch,openacc] use firstprivate pointers for subarrays in c and c++

2016-05-23 Thread Cesar Philippidis
On 05/20/2016 02:42 AM, Jakub Jelinek wrote:
> On Tue, May 10, 2016 at 01:29:50PM -0700, Cesar Philippidis wrote:

>> @@ -5796,12 +5796,14 @@ tree
>>  finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>>  {
>>bitmap_head generic_head, firstprivate_head, lastprivate_head;
>> -  bitmap_head aligned_head, map_head, map_field_head;
>> +  bitmap_head aligned_head, map_head, map_field_head, oacc_reduction_head;
>>tree c, t, *pc;
>>tree safelen = NULL_TREE;
>>bool branch_seen = false;
>>bool copyprivate_seen = false;
>>bool ordered_seen = false;
>> +  bool allow_fields = (ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
>> +|| ort == C_ORT_ACC;
>>  
> 
> Formatting.  You want = already on the new line, or add () around the whole
> rhs and align || below (ort &.
> 
> Though, this looks wrong to me, does OpenACC all of a sudden support
> privatization of non-static data members in methods?
> 
>>bitmap_obstack_initialize (NULL);
>>bitmap_initialize (_head, _default_obstack);
>> @@ -5810,6 +5812,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>>bitmap_initialize (_head, _default_obstack);
>>bitmap_initialize (_head, _default_obstack);
>>bitmap_initialize (_field_head, _default_obstack);
>> +  bitmap_initialize (_reduction_head, _default_obstack);
>>  
>>for (pc = , c = clauses; c ; c = *pc)
>>  {
>> @@ -5829,8 +5832,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>>t = OMP_CLAUSE_DECL (c);
>>if (TREE_CODE (t) == TREE_LIST)
>>  {
>> -  if (handle_omp_array_sections (c, ((ort & C_ORT_OMP_DECLARE_SIMD)
>> - == C_ORT_OMP)))
>> +  if (handle_omp_array_sections (c, allow_fields))
> 
> IMNSHO you don't want to change this, instead adjust C++
> handle_omp_array_sections* where it deals with array sections to just use
> the is_omp variant; there are still other places where it deals with
> non-static data members and I think you don't want to change those.

That should be fixed now. It looks like I only needed to prevent
handle_omp_array_sections_1 from calling omp_privatize_field for acc
regions. So I modified handle_omp_array_sections* to take a
c_omp_region_type argument instead of a bool is_omp to enable that.

finish_omp_clauses should be ok because it already has a field_ok
variable to guard the calls to omp_privatize_field.

>>  {
>>remove = true;
>>break;
>> @@ -6040,6 +6042,17 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>> omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
>>remove = true;
>>  }
>> +  else if (ort == C_ORT_ACC
>> +   && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION)
>> +{
>> +  if (bitmap_bit_p (_reduction_head, DECL_UID (t)))
>> +{
>> +  error ("%qD appears more than once in reduction clauses", t);
>> +  remove = true;
>> +}
>> +  else
>> +bitmap_set_bit (_reduction_head, DECL_UID (t));
>> +}
>>else if (bitmap_bit_p (_head, DECL_UID (t))
>> || bitmap_bit_p (_head, DECL_UID (t))
>> || bitmap_bit_p (_head, DECL_UID (t)))
>> @@ -6050,7 +6063,10 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>>else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE
>> && bitmap_bit_p (_head, DECL_UID (t)))
>>  {
>> -  error ("%qD appears both in data and map clauses", t);
>> +  if (ort == C_ORT_ACC)
>> +error ("%qD appears more than once in data clauses", t);
>> +  else
>> +error ("%qD appears both in data and map clauses", t);
>>remove = true;
>>  }
>>else
>> @@ -6076,7 +6092,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>>  omp_note_field_privatization (t, OMP_CLAUSE_DECL (c));
>>else
>>  t = OMP_CLAUSE_DECL (c);
>> -  if (t == current_class_ptr)
>> +  if (ort != C_ORT_ACC && t == current_class_ptr)
>>  {
>>error ("% allowed in OpenMP only in %"
>>   " clauses");
>> @@ -6103,7 +6119,10 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>>  }
>>else if (bitmap_bit_p (_head, DECL_UID (t)))
>>  {
>> -  error ("%qD appears both in data and map clauses", t);
>> +  if (ort == C_ORT_ACC)
>> +error ("%qD appears more than once in data clauses", t);
>> +  else
>> +error ("%qD appears both in data and map clauses", t);
>>remove = true;
>>  }
>>else
>> @@ -6551,8 +6570,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>>  }
>>if (TREE_CODE (t) == TREE_LIST)
>>  {
>> -  if (handle_omp_array_sections (c, ((ort 

Re: C PATCH to add -Wswitch-unreachable (PR c/49859)

2016-05-23 Thread Martin Sebor

Sorry I'm a little late with my comments but I noticed one minor
problem (I raised bug 71249 for it since the patch has already
been checked in), and have a question about the hunk below:


@@ -1595,6 +1595,32 @@ gimplify_switch_expr (tree *expr_p, gimple_seq *pre_p)
gimplify_ctxp->case_labels.create (8);

gimplify_stmt (_BODY (switch_expr), _body_seq);
+
+  /* Possibly warn about unreachable statements between switch's
+controlling expression and the first case.  */
+  if (warn_switch_unreachable
+ /* This warning doesn't play well with Fortran when optimizations
+are on.  */
+ && !lang_GNU_Fortran ()
+ && switch_body_seq != NULL)
+   {
+ gimple_seq seq = switch_body_seq;
+ if (gimple_code (switch_body_seq) == GIMPLE_BIND)
+   seq = gimple_bind_body (as_a  (switch_body_seq));
+ gimple *stmt = gimple_seq_first_stmt (seq);
+ enum gimple_code code = gimple_code (stmt);
+ if (code != GIMPLE_LABEL && code != GIMPLE_TRY)


Why exempt GIMPLE_TRY?  It suppresses the warning in cases like:

  switch (i) {
  try { } catch (...) { }
  case 1: ;
  }

(If excluding GIMPLE_TRY is unavoidable, it might be worthwhile
to add a comment to the code, and perhaps also mention it in
the documentation to preempt bug reports by nitpickers like me ;)

Finally, while even this simple warning can be useful, it would
be even more helpful if it could also point out other unreachable
statements within the body of the switch statements after
a break/goto/return and before a subsequent label.  This could
be especially valuable with optimization to make possible
diagnosing non-trivial problems like this:

  switch (i) {
  case 3:
if (i < 3)
   return 1;
i = 8;
  }

(I realize this might be outside the scope of the feature request
and starting to creep into the -Wunreachable-code territory.)
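
For reference, the basic case the warning targets can be written as a
small compilable sketch. The function name classify and the values used
are made up for illustration; the statement placed before the first case
label is valid code but is never executed, because control jumps
straight to a label.

```cpp
#include <cassert>

int classify (int i)
{
  int result = 0;
  switch (i)
    {
      result = 42;   // never executed: control jumps straight to a label
    case 1:
      result += 1;
      break;
    default:
      break;
    }
  return result;
}
```

A compiler with -Wswitch-unreachable would be expected to warn about the
assignment of 42; the program itself still behaves as if that statement
were absent.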

Martin


Re: [PATCH] c++/71147 - [6 Regression] Flexible array member wrongly rejected in template

2016-05-23 Thread Martin Sebor

On 05/19/2016 07:30 AM, Jason Merrill wrote:

On 05/18/2016 09:40 PM, Martin Sebor wrote:

The handling of flexible array members whose element type was
dependent tried to deal with the case when the element type
was not yet completed but it did it wrong.  The attached patch
corrects the handling by trying to complete the element type
first.


How about changing complete_type to complete the element type even for
an array of unknown bound?  It seems to me that it would be useful to do
that to set alignment and 'structor flags even if we can't set TYPE_SIZE.

It would also be useful to have a 'complete type or array of unknown
bound of complete type' predicate to use here and in layout_var_decl,
grokdeclarator, and type_with_alias_set_p.


Thanks for the suggestions.  I implemented them in the attached
update to the patch.  The macro I added evaluates its argument
multiple times.  That normally isn't a problem unless it's invoked
with a non-trivial argument like a call to complete_type() that's
passed to COMPLETE_TYPE_P() in grokdeclarator.  One way to avoid
possible problems due to evaluating the macro argument more than
once is to introduce a helper inline function.  I haven't seen
it done in tree.h so I didn't introduce one in this patch either,
but it might be worth considering for the new macro and any other
non-trivial macros like it.

Martin
PR c++/71147 - [6 Regression] Flexible array member wrongly rejected in template

gcc/ChangeLog:
2016-05-23  Martin Sebor  

	PR c++/71147
	* gcc/tree.h (COMPLETE_OR_ARRAY_TYPE_P): New macro.

gcc/testsuite/ChangeLog:
2016-05-23  Martin Sebor  

	PR c++/71147
	* g++.dg/ext/flexary16.C: New test.

gcc/cp/ChangeLog:
2016-05-23  Martin Sebor  

	PR c++/71147
	* decl.c (layout_var_decl, grokdeclarator): Use COMPLETE_OR_ARRAY_TYPE_P.
	* pt.c (instantiate_class_template_1): Try to complete the element
	type of a flexible array member.
	(can_complete_type_without_circularity): Handle arrays of unknown bound.
	* typeck.c (complete_type): Also complete the type of the elements of
	arrays with an unspecified bound.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 7a69711..82bbe19 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -5305,10 +5305,7 @@ layout_var_decl (tree decl)
 complete_type (type);
   if (!DECL_SIZE (decl)
   && TREE_TYPE (decl) != error_mark_node
-  && (COMPLETE_TYPE_P (type)
-	  || (TREE_CODE (type) == ARRAY_TYPE
-	  && !TYPE_DOMAIN (type)
-	  && COMPLETE_TYPE_P (TREE_TYPE (type)
+  && COMPLETE_OR_ARRAY_TYPE_P (type))
 layout_decl (decl, 0);
 
   if (!DECL_EXTERNAL (decl) && DECL_SIZE (decl) == NULL_TREE)
@@ -11165,8 +11162,7 @@ grokdeclarator (const cp_declarator *declarator,
 	  }
 	else if (!staticp && !dependent_type_p (type)
 		 && !COMPLETE_TYPE_P (complete_type (type))
-		 && (TREE_CODE (type) != ARRAY_TYPE
-		 || !COMPLETE_TYPE_P (TREE_TYPE (type))
+		 && (!COMPLETE_OR_ARRAY_TYPE_P (type)
 		 || initialized == 0))
 	  {
 	if (TREE_CODE (type) != ARRAY_TYPE
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2bba571..04ae378 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -9554,7 +9554,7 @@ can_complete_type_without_circularity (tree type)
 return 0;
   else if (COMPLETE_TYPE_P (type))
 return 1;
-  else if (TREE_CODE (type) == ARRAY_TYPE && TYPE_DOMAIN (type))
+  else if (TREE_CODE (type) == ARRAY_TYPE)
 return can_complete_type_without_circularity (TREE_TYPE (type));
   else if (CLASS_TYPE_P (type)
 	   && TYPE_BEING_DEFINED (TYPE_MAIN_VARIANT (type)))
@@ -10119,17 +10119,12 @@ instantiate_class_template_1 (tree type)
 			  if (can_complete_type_without_circularity (rtype))
 			complete_type (rtype);
 
-  if (TREE_CODE (r) == FIELD_DECL
-  && TREE_CODE (rtype) == ARRAY_TYPE
-  && COMPLETE_TYPE_P (TREE_TYPE (rtype))
-  && !COMPLETE_TYPE_P (rtype))
-{
-  /* Flexible array mmembers of elements
- of complete type have an incomplete type
- and that's okay.  */
-}
-  else if (!COMPLETE_TYPE_P (rtype))
+			  if (!COMPLETE_OR_ARRAY_TYPE_P (rtype))
 			{
+			  /* If R's type couldn't be completed and
+ it isn't a flexible array member (whose
+ type is incomplete by definition) give
+ an error.  */
 			  cxx_incomplete_type_error (r, rtype);
 			  TREE_TYPE (r) = error_mark_node;
 			}
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index cd058fa..2688ab4 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -112,7 +112,7 @@ complete_type (tree type)
 
   if (type == error_mark_node || COMPLETE_TYPE_P (type))
 ;
-  else if (TREE_CODE (type) == ARRAY_TYPE && TYPE_DOMAIN (type))
+  else if (TREE_CODE (type) == ARRAY_TYPE)
 {
  

Re: [PATCH #3], Add PowerPC ISA 3.0 vpermr/xxpermr support

2016-05-23 Thread Segher Boessenkool
On Mon, May 23, 2016 at 06:22:22PM -0400, Michael Meissner wrote:
> Here are the patches for xxpermr/vpermr support that are broken out from fixing
> the xxperm fusion bug.  I have built a compiler with these patches (and the
> xxperm patches) and it bootstraps and does not cause a regression.  Are they ok
> to add to GCC 7 and eventually to GCC 6.2?
> 
> [gcc]
> 2016-05-23  Michael Meissner  
>   Kelvin Nilsen  
> 
>   * config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
>   vpermr/xxpermr on ISA 3.0.
>   (altivec_expand_vec_perm_le): Likewise.
>   * config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
>   (altivec_vpermr__internal): Add VPERMR/XXPERMR support for
>   ISA 3.0.
> 
> [gcc/testsuite]
> 2016-05-23  Michael Meissner  
>   Kelvin Nilsen  
> 
>   * gcc.target/powerpc/p9-vpermr.c: New test for ISA 3.0 vpermr
>   support.

Okay for trunk.  Okay for 6 after a week or so.

Thanks,


Segher


Re: [PATCH #2], PR 71201, Fix xxperm fusion on PowerPC ISA 3.0

2016-05-23 Thread Segher Boessenkool
Hi,

On Mon, May 23, 2016 at 06:17:36PM -0400, Michael Meissner wrote:
> > > Unfortunately, in running the testsuite on the power9 simulator, we discovered
> > > that the test gcc.c-torture/execute/pr56866.c would fail because the fusion
> > > alternatives confused the register allocator and/or the passes after the
> > > register allocator.  This patch removes the explicit fusion support from
> > > XXPERM.
> > 
> > Okay.  Please keep the PR open until that problem is fixed.  It also
> > shouldn't be "target" category, if the problem is RA.

> This patch just fixes the xxperm fusion problem, and I will submit the
> vpermr/xxpermr support in another patch.

Thanks.

> Note, if you believe the register allocator and the post reload RTL passes need
> to be fixed to allow the fusion of the move to the xxperm, that is fine.
> However, take it on yourself.

I'm just saying that if the RA (and later) are "confused", that is their
problem, not a target problem.  Or I'm not understanding what the problem
is.  Maybe it is just target abusing the RA?  Either way...

> As the person who wrote the code to add fusion
> support for xxperm, I now think it was a bad idea, and I want to remove that
> support.  It would probably be better done by modifying the scheduler to keep
> the move and xxperm together, rather than including it in the insn.

That should give most of the win without most of the complexity.  I like
that plan ;-)

> I have bootstrapped the compiler on a little endian power9 system and there
> were no regressions in the test suite.  Is it ok to check into the trunk and on
> the 6.1 branch?
> 
> [gcc]
> 2016-05-23  Michael Meissner  
> 
>   PR target/71201
>   * config/rs6000/altivec.md (altivec_vperm__internal): Drop
>   ISA 3.0 xxperm fusion alternative.
>   (altivec_vperm_v8hiv16qi): Likewise.
>   (altivec_vperm__uns_internal): Likewise.
>   (vperm_v8hiv4si): Likewise.
>   (vperm_v16qiv8hi): Likewise.
> 
> [gcc/testsuite]
> 2016-05-23  Michael Meissner  
>   Kelvin Nilsen  
> 
>   * gcc.target/powerpc/p9-permute.c: Run test on big endian as well
>   as little endian.

Okay for trunk.  Okay for 6 after a week or so.


Segher


Re: Allow embedded timestamps by C/C++ macros to be set externally (3)

2016-05-23 Thread Dhole
PING

-- 
Dhole




Re: [PATCH #3], Add PowerPC ISA 3.0 vpermr/xxpermr support

2016-05-23 Thread Michael Meissner
On Thu, May 19, 2016 at 10:33:41AM -0500, Segher Boessenkool wrote:
> On Thu, May 19, 2016 at 10:53:41AM -0400, Michael Meissner wrote:
> > GCC 6.1 added support for the XXPERM instruction for the PowerPC ISA 3.0.  The
> > XXPERM instruction is essentially a 4 operand instruction, with only 3 operands
> > in the instruction (the target register overlaps with the first input
> > register).  The Power9 hardware has fusion support where if the instruction
> > that precedes the XXPERM is a XXLOR move instruction to set the first input
> > argument, it is fused with the XXPERM.  I added code to support this fusion.
> > 
> > Unfortunately, in running the testsuite on the power9 simulator, we discovered
> > that the test gcc.c-torture/execute/pr56866.c would fail because the fusion
> > alternatives confused the register allocator and/or the passes after the
> > register allocator.  This patch removes the explicit fusion support from
> > XXPERM.
> 
> Okay.  Please keep the PR open until that problem is fixed.  It also
> shouldn't be "target" category, if the problem is RA.
> 
> > In addition, ISA 3.0 added XXPERMR and VPERMR instructions for little endian
> > support where the permute vector reverses the bytes.  This patch adds support
> > for XXPERMR/VPERMR.
> 
> Please send that as a separate patch, it has nothing to do with the PR.
> 
> > +   x = gen_rtx_UNSPEC (mode,
> > +   gen_rtvec (3, target, reg, 
> 
> Trailing space.
> 
> > +  if (TARGET_P9_VECTOR)
> > +{
> > +  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op0, op1, sel), 
> 
> And another.
> 
> > +The VNAND is preferred for future fusion opportunities.  */
> > +  notx = gen_rtx_NOT (V16QImode, sel);
> > +  iorx = (TARGET_P8_VECTOR
> > + ? gen_rtx_IOR (V16QImode, notx, notx)
> > + : gen_rtx_AND (V16QImode, notx, notx));
> > +  emit_insn (gen_rtx_SET (norreg, iorx));
> > +  
> 
> Some more.
> 
> > +/* { dg-final { scan-assembler"vpermr\|xxpermr" } } */
> 
> Tab in the middle of the line.

Here are the patches for xxpermr/vpermr support that are broken out from fixing
the xxperm fusion bug.  I have built a compiler with these patches (and the
xxperm patches) and it bootstraps and does not cause a regression.  Are they ok
to add to GCC 7 and eventually to GCC 6.2?

[gcc]
2016-05-23  Michael Meissner  
Kelvin Nilsen  

* config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
vpermr/xxpermr on ISA 3.0.
(altivec_expand_vec_perm_le): Likewise.
* config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
(altivec_vpermr__internal): Add VPERMR/XXPERMR support for
ISA 3.0.

[gcc/testsuite]
2016-05-23  Michael Meissner  
Kelvin Nilsen  

* gcc.target/powerpc/p9-vpermr.c: New test for ISA 3.0 vpermr
support.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 236608)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -6863,21 +6863,29 @@ rs6000_expand_vector_set (rtx target, rt
gen_rtvec (3, target, reg,
   force_reg (V16QImode, x)),
UNSPEC_VPERM);
-  else 
+  else
 {
-  /* Invert selector.  We prefer to generate VNAND on P8 so
- that future fusion opportunities can kick in, but must
- generate VNOR elsewhere.  */
-  rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
-  rtx iorx = (TARGET_P8_VECTOR
- ? gen_rtx_IOR (V16QImode, notx, notx)
- : gen_rtx_AND (V16QImode, notx, notx));
-  rtx tmp = gen_reg_rtx (V16QImode);
-  emit_insn (gen_rtx_SET (tmp, iorx));
-
-  /* Permute with operands reversed and adjusted selector.  */
-  x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
- UNSPEC_VPERM);
+  if (TARGET_P9_VECTOR)
+   x = gen_rtx_UNSPEC (mode,
+   gen_rtvec (3, target, reg,
+  force_reg (V16QImode, x)),
+   UNSPEC_VPERMR);
+  else
+   {
+ /* Invert selector.  We prefer to generate VNAND on P8 so
+that future fusion opportunities can kick in, but must
+generate VNOR elsewhere.  */
+ rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
+ rtx iorx = (TARGET_P8_VECTOR
+ ? gen_rtx_IOR (V16QImode, notx, notx)
+ : gen_rtx_AND (V16QImode, notx, notx));
+ rtx tmp = gen_reg_rtx (V16QImode);
+ emit_insn (gen_rtx_SET (tmp, iorx));
+
+ /* Permute with 
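
The selector inversion in the hunk above builds (IOR (NOT x) (NOT x)) or
(AND (NOT x) (NOT x)). By De Morgan's laws these are NAND (x, x) and
NOR (x, x), and both reduce to a plain NOT of x, which is why either
instruction can be used to invert the selector. The following scalar
sketch checks that identity for every byte value; the helper names are
illustrative only and model a single byte lane, not the vector
instructions themselves.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of a single byte lane; names are illustrative only.
std::uint8_t not8 (std::uint8_t x)
{
  return static_cast<std::uint8_t> (~x);
}

std::uint8_t nand8 (std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t> (~(a & b));
}

std::uint8_t nor8 (std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t> (~(a | b));
}

// (IOR (NOT x) (NOT x)): by De Morgan this is NAND (x, x).
std::uint8_t ior_of_nots (std::uint8_t x)
{
  return static_cast<std::uint8_t> (not8 (x) | not8 (x));
}

// (AND (NOT x) (NOT x)): by De Morgan this is NOR (x, x).
std::uint8_t and_of_nots (std::uint8_t x)
{
  return static_cast<std::uint8_t> (not8 (x) & not8 (x));
}

// Checks that both forms invert every possible byte value.
bool inversion_identity_holds ()
{
  for (unsigned v = 0; v < 256; ++v)
    {
      std::uint8_t x = static_cast<std::uint8_t> (v);
      if (ior_of_nots (x) != nand8 (x, x)
          || and_of_nots (x) != nor8 (x, x)
          || nand8 (x, x) != not8 (x)
          || nor8 (x, x) != not8 (x))
        return false;
    }
  return true;
}
```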

Re: [Patch wwwdocs] Add aarch64-none-linux-gnu as a primary platform for GCC-7

2016-05-23 Thread Gerald Pfeifer
On Mon, 23 May 2016, Richard Biener wrote:
> So I propose to demote -freebsd to secondary and use
> i686-unknown-freebsd (or x86_64-unknown-freebsd?).
> 
> Gerald, Andreas, can you comment on both issues?  Esp. i386 
> is putting quite some burden on libstdc++ and atomics support
> for example.

As Jeff noted, i386 actually is the "marketing" name used for the 
platform, GCC has been defaulting to i486 for ages, and I upgraded 
to i586 last year:

2015-11-15  Gerald Pfeifer  

* config/i386/freebsd.h (SUBTARGET32_DEFAULT_CPU): Change to i586.
Remove support for FreeBSD 5 and earlier.

And, yes, the system compiler on current versions of FreeBSD is 
LLVM (for most platforms including x86).  There is still a fair 
user base, though.

Given the above, do you still see a desire to make this change?

Gerald


Re: [PATCH #2], PR 71201, Fix xxperm fusion on PowerPC ISA 3.0

2016-05-23 Thread Michael Meissner
On Thu, May 19, 2016 at 10:33:41AM -0500, Segher Boessenkool wrote:
> On Thu, May 19, 2016 at 10:53:41AM -0400, Michael Meissner wrote:
> > GCC 6.1 added support for the XXPERM instruction for the PowerPC ISA 3.0.  The
> > XXPERM instruction is essentially a 4 operand instruction, with only 3 operands
> > in the instruction (the target register overlaps with the first input
> > register).  The Power9 hardware has fusion support where if the instruction
> > that precedes the XXPERM is a XXLOR move instruction to set the first input
> > argument, it is fused with the XXPERM.  I added code to support this fusion.
> > 
> > Unfortunately, in running the testsuite on the power9 simulator, we discovered
> > that the test gcc.c-torture/execute/pr56866.c would fail because the fusion
> > alternatives confused the register allocator and/or the passes after the
> > register allocator.  This patch removes the explicit fusion support from
> > XXPERM.
> 
> Okay.  Please keep the PR open until that problem is fixed.  It also
> shouldn't be "target" category, if the problem is RA.
> 
> > In addition, ISA 3.0 added XXPERMR and VPERMR instructions for little endian
> > support where the permute vector reverses the bytes.  This patch adds support
> > for XXPERMR/VPERMR.
> 
> Please send that as a separate patch, it has nothing to do with the PR.
> 
> > +   x = gen_rtx_UNSPEC (mode,
> > +   gen_rtvec (3, target, reg, 
> 
> Trailing space.
> 
> > +  if (TARGET_P9_VECTOR)
> > +{
> > +  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op0, op1, sel), 
> 
> And another.
> 
> > +The VNAND is preferred for future fusion opportunities.  */
> > +  notx = gen_rtx_NOT (V16QImode, sel);
> > +  iorx = (TARGET_P8_VECTOR
> > + ? gen_rtx_IOR (V16QImode, notx, notx)
> > + : gen_rtx_AND (V16QImode, notx, notx));
> > +  emit_insn (gen_rtx_SET (norreg, iorx));
> > +  
> 
> Some more.
> 
> > +/* { dg-final { scan-assembler"vpermr\|xxpermr" } } */
> 
> Tab in the middle of the line.

This patch just fixes the xxperm fusion problem, and I will submit the
vpermr/xxpermr support in another patch.

Note, if you believe the register allocator and the post reload RTL passes need
to be fixed to allow the fusion of the move to the xxperm, that is fine.
However, take it on yourself.  As the person who wrote the code to add fusion
support for xxperm, I now think it was a bad idea, and I want to remove that
support.  It would probably be better done by modifying the scheduler to keep
the move and xxperm together, rather than including it in the insn.

I have bootstrapped the compiler on a little endian power9 system and there
were no regressions in the test suite.  Is it ok to check into the trunk and on
the 6.1 branch?

[gcc]
2016-05-23  Michael Meissner  

PR target/71201
* config/rs6000/altivec.md (altivec_vperm__internal): Drop
ISA 3.0 xxperm fusion alternative.
(altivec_vperm_v8hiv16qi): Likewise.
(altivec_vperm__uns_internal): Likewise.
(vperm_v8hiv4si): Likewise.
(vperm_v16qiv8hi): Likewise.

[gcc/testsuite]
2016-05-23  Michael Meissner  
Kelvin Nilsen  

* gcc.target/powerpc/p9-permute.c: Run test on big endian as well
as little endian.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/altivec.md
===
--- gcc/config/rs6000/altivec.md (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc)  (revision 236600)
+++ gcc/config/rs6000/altivec.md(.../gcc)   (working copy)
@@ -1952,32 +1952,30 @@ (define_expand "altivec_vperm_"
 
 ;; Slightly prefer vperm, since the target does not overlap the source
 (define_insn "*altivec_vperm__internal"
-  [(set (match_operand:VM 0 "register_operand" "=v,?wo,?")
-   (unspec:VM [(match_operand:VM 1 "register_operand" "v,0,wo")
-   (match_operand:VM 2 "register_operand" "v,wo,wo")
-   (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+   (unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+   (match_operand:VM 2 "register_operand" "v,wo")
+   (match_operand:V16QI 3 "register_operand" "v,wo")]
   UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
   "@
vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 (define_insn "altivec_vperm_v8hiv16qi"
-  [(set (match_operand:V16QI 0 "register_operand" "=v,?wo,?")
-   (unspec:V16QI [(match_operand:V8HI 1 

Re: [PATCH] Make basic asm implicitly clobber memory

2016-05-23 Thread David Wohlferd

On 5/23/2016 12:46 AM, Richard Biener wrote:
> On Sun, 22 May 2016, Andrew Haley wrote:
>> On 05/20/2016 07:50 AM, David Wohlferd wrote:
>>> I realize deprecation/removal is drastic.  Especially since basic
>>> asm (mostly) works as is.  But fixing memory clobbers while leaving
>>> the rest broken feels like half a solution, meaning that some day
>>> we're going to have to fiddle with this again.
>>
>> Yes, we will undoubtedly have to fiddle with basic asm again.  We
>> should plan for deprecation.
>
> I think adding memory clobbers is worth having.  I also think that
> deprecating basic asms would be a good thing, so can we please
> add a new warning for that?  "warning: basic asms are deprecated"

I've still got the -Wbasic-asm patch where I proposed this for v6. I can 
dust it off again and re-submit it.  A couple questions first:


1) In this patch the warning was disabled by default.  But it sounds 
like you want it enabled by default?  Easy to change, I'm just 
confirming your intent.


2) Is 'deprecated' handled differently than other types of warnings?  
There is a -Wno-deprecated, but it seems to have a very specific meaning 
that does not apply here.


3) The warning text in the old patch was "asm statement in function does 
not use extended syntax".  The intent was:


a) Don't make it sound like basic asm is completely gone, since it can 
still be used at top level.
b) Don't make it sound like all inline asm is gone, since extended asm 
can still be used in functions.

c) Convey all that in as few words as possible.

Now that we want to add the word 'deprecated,' perhaps one of these:

- Basic asm in functions is deprecated in favor of extended syntax
- asm in functions without extended syntax is deprecated
- Deprecated: basic asm in function
- Deprecated: asm in function without extended syntax

I like the last one (people may not know what 'basic' means in this 
context), but any of these would work for me.  Preferences?
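
For readers unsure what 'basic' versus 'extended' means in this context, here is a minimal sketch of the three forms discussed above. The function names are made up for illustration, and the empty asm templates are used only so the snippet builds on any target; which of these forms would actually trigger the proposed warning is an assumption based on the wording in this thread.

```cpp
// Basic asm at top level: under the proposal this form would stay valid.
asm("");

// Basic asm inside a function: no operands and no clobber list, so the
// compiler is told nothing about what the statement reads or writes.
// This is the form the proposed warning is aimed at.
int basic_in_function(int x)
{
  asm("");
  return x;
}

// Extended asm inside a function: operands and clobbers are explicit.
// "+r"(x) marks x as read and written; "memory" is the clobber that the
// patch under discussion would make implicit for basic asm.
int extended_in_function(int x)
{
  asm volatile("" : "+r"(x) : : "memory");
  return x;
}
```

Converting a basic asm to extended syntax is often just a matter of adding the colons, although any literal % in the template then has to be doubled.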


In order to avoid conflicts, I'll wait for Bernd to commit his patch first.

dw


Re: [PATCH 0/3][AArch64] Add infrastructure for more approximate FP operations

2016-05-23 Thread Evandro Menezes

On 04/27/16 16:13, Evandro Menezes wrote:
This patch suite increases the granularity of target selections of 
approximate FP operations and adds the options of emitting approximate 
square root and division.


The full suite is contained in the emails tagged:

1. [PATCH 1/3][AArch64] Add more choices for the reciprocal square
   root approximation

2. [PATCH 2/3][AArch64] Emit square root using the Newton series

3. [PATCH 3/3][AArch64] Emit division using the Newton series


Ping.
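
For context, the Newton series mentioned in the patch titles can be sketched in scalar code as below. This is only an illustration of the arithmetic, not the code the patches emit: the function names, the caller-supplied initial estimate, and the step count are assumptions made for the example. A real target would presumably take the estimate from a hardware instruction and pick the number of steps per mode.

```cpp
// One Newton step refining y ~ 1/sqrt(x):  y' = y * (3 - x*y*y) / 2.
double rsqrt_step(double x, double y)
{
  return y * (1.5 - 0.5 * x * y * y);
}

// One Newton step refining r ~ 1/d:  r' = r * (2 - d*r).
double recip_step(double d, double r)
{
  return r * (2.0 - d * r);
}

// sqrt(x) computed as x * (1/sqrt(x)), starting from a rough estimate.
double approx_sqrt(double x, double estimate, int steps)
{
  double y = estimate;
  for (int i = 0; i < steps; ++i)
    y = rsqrt_step(x, y);
  return x * y;
}

// n / d computed as n * (1/d), starting from a rough estimate of 1/d.
double approx_div(double n, double d, double estimate, int steps)
{
  double r = estimate;
  for (int i = 0; i < steps; ++i)
    r = recip_step(d, r);
  return n * r;
}
```

Each step roughly doubles the number of correct bits, so a better initial estimate means fewer steps. Special inputs such as zero or infinity are not handled in this sketch.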

--
Evandro Menezes



Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-23 Thread Evandro Menezes

On 05/18/16 20:03, Jim Wilson wrote:
Though I see that the original patch from Samsung that added the 
max_case_values field has the -O3 check, so there was apparently some 
reason why they wanted it to work that way. The value that the 
exynos-m1 is using, 48, looks pretty large, so maybe they thought that 
the code size expansion from that is only OK at -O3 and above. Worst 
case, we might need two max_case_value fields, one to use at -O1/-O2, 
and one to use at -O3.

Jim


Indeed, the reason to restrict the new limit to -O3 in my original patch 
was the increased cost in code size.


I'm fine with this patch, as it achieves in part what I intended before: 
going beyond the default_case_values_threshold, which is too conservative 
for Exynos M1.  My concern is particularly what happens to in-order 
targets, like the ubiquitous A53.


I'll make some figures available soon.

Cheers,

--
Evandro Menezes



[PATCH v3] gcov: Runtime configurable destination output

2016-05-23 Thread Aaron Conole
The previous gcov behavior was to always output errors on the stderr channel.
This is fine for most uses, but some programs will require stderr to be
untouched by libgcov for certain tests. This change allows configuring
the gcov output via an environment variable which will be used to open
the appropriate file.
---
 libgcc/libgcov-driver-system.c | 43 +-
 libgcc/libgcov-driver.c|  6 ++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/libgcc/libgcov-driver-system.c b/libgcc/libgcov-driver-system.c
index 4e3b244..461715f 100644
--- a/libgcc/libgcov-driver-system.c
+++ b/libgcc/libgcov-driver-system.c
@@ -23,6 +23,31 @@ a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .  */
 
+/* Configured via the GCOV_ERROR_FILE environment variable;
+   it will either be stderr, or a file of the user's choosing. */
+static FILE *gcov_error_file;
+
+/* A utility function to populate the gcov_error_file pointer */
+
+static FILE *
+get_gcov_error_file(void)
+{
+#if IN_GCOV_TOOL
+  return stderr;
+#endif
+  char *gcov_error_filename = getenv ("GCOV_ERROR_FILE");
+
+  if (gcov_error_filename)
+{
+  FILE *openfile = fopen (gcov_error_filename, "a");
+  if (openfile)
+gcov_error_file = openfile;
+}
+  if (!gcov_error_file)
+gcov_error_file = stderr;
+  return gcov_error_file;
+}
+
 /* A utility function for outputing errors.  */
 
 static int __attribute__((format(printf, 1, 2)))
@@ -30,12 +55,28 @@ gcov_error (const char *fmt, ...)
 {
   int ret;
   va_list argp;
+
+  if (!gcov_error_file)
+gcov_error_file = get_gcov_error_file ();
+
   va_start (argp, fmt);
-  ret = vfprintf (stderr, fmt, argp);
+  ret = vfprintf (gcov_error_file, fmt, argp);
   va_end (argp);
   return ret;
 }
 
+#if !IN_GCOV_TOOL
+static void
+gcov_error_exit(void)
+{
+  if (gcov_error_file && gcov_error_file != stderr)
+{
+  fclose(gcov_error_file);
+  gcov_error_file = NULL;
+}
+}
+#endif
+
 /* Make sure path component of the given FILENAME exists, create
missing directories. FILENAME must be writable.
Returns zero on success, or -1 if an error occurred.  */
diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
index 9c4eeca..92fb8ab 100644
--- a/libgcc/libgcov-driver.c
+++ b/libgcc/libgcov-driver.c
@@ -46,6 +46,10 @@ void __gcov_init (struct gcov_info *p __attribute__ ((unused))) {}
 /* A utility function for outputing errors.  */
 static int gcov_error (const char *, ...);
 
+#if !IN_GCOV_TOOL
+static void gcov_error_exit(void);
+#endif
+
 #include "gcov-io.c"
 
 struct gcov_fn_buffer
@@ -878,6 +882,8 @@ gcov_exit (void)
 __gcov_root.prev->next = __gcov_root.next;
   else
 __gcov_master.root = __gcov_root.next;
+
+  gcov_error_exit ();
 }
 
 /* Add a new object file onto the bb chain.  Invoked automatically
-- 
2.5.5
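
The pattern used by the patch above (resolve the destination lazily on the first error, fall back to stderr, close and reset at exit) can be sketched on its own as follows. This is a standalone illustration rather than the libgcov code: report_error and close_error_file are made-up names, and only the GCOV_ERROR_FILE variable name is taken from the patch.

```cpp
#include <cstdarg>
#include <cstdio>
#include <cstdlib>

// Destination for error messages; resolved on first use.
static std::FILE *error_file = nullptr;

// Pick the destination: the file named by GCOV_ERROR_FILE if it can be
// opened for appending, otherwise stderr.
std::FILE *get_error_file()
{
  const char *name = std::getenv("GCOV_ERROR_FILE");
  if (name)
    {
      std::FILE *f = std::fopen(name, "a");
      if (f)
        error_file = f;
    }
  if (!error_file)
    error_file = stderr;
  return error_file;
}

// printf-style reporting that resolves the destination lazily.
int report_error(const char *fmt, ...)
{
  if (!error_file)
    error_file = get_error_file();

  va_list argp;
  va_start(argp, fmt);
  int ret = std::vfprintf(error_file, fmt, argp);
  va_end(argp);
  return ret;
}

// Close a user-supplied file and reset the pointer, so that a later call
// to report_error re-reads the environment instead of using a stale FILE.
void close_error_file()
{
  if (error_file && error_file != stderr)
    {
      std::fclose(error_file);
      error_file = nullptr;
    }
}
```

Resetting the pointer to null after closing is what allows reporting to be re-activated before the process exits, which is the point raised in the v2 review.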



C++ PATCH for c++/70735 (static locals and generic lambdas)

2016-05-23 Thread Jason Merrill
Here we were failing to handle static locals referred to from a generic 
lambda properly: we decided that in that situation rather than try to 
look up the primary decl for the variable (since its function is 
probably out of scope when the lambda op() is instantiated), we can just 
build a new VAR_DECL and use that instead.  The problem with this was 
that we weren't setting DECL_CONTEXT on the new decl, so it mangled 
differently from the real decl, so references within the lambda were 
finding a different object.


Fixed for statics in non-template functions by just using the variable 
directly, and for statics in template functions by setting DECL_CONTEXT 
appropriately.
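
As an illustration of the property being restored, the sketch below checks that a generic lambda and its enclosing function refer to the same static object, in both the non-template and the template case. The function names are made up for the example. On a compiler with the bug described above the comparisons would be expected to fail, though that is an inference from the description rather than something shown in this thread.

```cpp
// Non-template enclosing function: the lambda names the static directly.
bool non_template_case()
{
  static int counter = 0;
  auto addr = [](auto) { return &counter; };
  counter = 1;
  // Same object, and the write made outside the lambda is visible inside.
  return addr(0) == &counter && *addr(0) == 1;
}

// Template enclosing function: each instantiation has its own static,
// and the lambda inside that instantiation must refer to exactly it.
template <typename T>
bool template_case()
{
  static int counter = 0;
  auto addr = [](auto) { return &counter; };
  counter = 1;
  return addr(0) == &counter && *addr(0) == 1;
}
```

Comparing addresses rather than values makes the failure mode explicit: two distinct objects with the same initial value would still pass a value-only check until one of them is written.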


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 120812acdc3feda6cd79f634a4ac26ae0db8c087
Author: Jason Merrill 
Date:   Fri May 20 14:00:38 2016 -0400

	PR c++/70735 - generic lambda and local static variable

	* pt.c (tsubst_copy): Just return a local variable from
	non-template context.  Don't call rest_of_decl_compilation for
	duplicated static locals.
	(tsubst_decl): Set DECL_CONTEXT of local static from another
	function.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2bba571..59d6a95 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -12280,6 +12280,14 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain)
 	local_p = true;
 	/* Subsequent calls to pushdecl will fill this in.  */
 	ctx = NULL_TREE;
+	/* Unless this is a reference to a static variable from an
+	   enclosing function, in which case we need to fill it in now.  */
+	if (TREE_STATIC (t))
+	  {
+		tree fn = tsubst (DECL_CONTEXT (t), args, complain, in_decl);
+		if (fn != current_function_decl)
+		  ctx = fn;
+	  }
 	spec = retrieve_local_specialization (t);
 	  }
 	/* If we already have the specialization we need, there is
@@ -13991,7 +13999,8 @@ tsubst_copy (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 case FUNCTION_DECL:
   if (DECL_LANG_SPECIFIC (t) && DECL_TEMPLATE_INFO (t))
 	r = tsubst (t, args, complain, in_decl);
-  else if (local_variable_p (t))
+  else if (local_variable_p (t)
+	   && uses_template_parms (DECL_CONTEXT (t)))
 	{
 	  r = retrieve_local_specialization (t);
 	  if (r == NULL_TREE)
@@ -14035,14 +14044,9 @@ tsubst_copy (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 		  gcc_assert (cp_unevaluated_operand || TREE_STATIC (r)
 			  || decl_constant_var_p (r)
 			  || errorcount || sorrycount);
-		  if (!processing_template_decl)
-		{
-		  if (TREE_STATIC (r))
-			rest_of_decl_compilation (r, toplevel_bindings_p (),
-		  at_eof);
-		  else
-			r = process_outer_var_ref (r, complain);
-		}
+		  if (!processing_template_decl
+		  && !TREE_STATIC (r))
+		r = process_outer_var_ref (r, complain);
 		}
 	  /* Remember this for subsequent uses.  */
 	  if (local_specializations)
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-static1.C b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-static1.C
new file mode 100644
index 000..a1667a2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-static1.C
@@ -0,0 +1,13 @@
+// PR c++/70735
+// { dg-do run { target c++1y } }
+
+int main()
+{
+  static int a;
+  auto f = [](auto) { return a; };
+  if (f(0) != 0)
+__builtin_abort();
+  a = 1;
+  if (f(0) != 1)
+__builtin_abort();
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-static2.C b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-static2.C
new file mode 100644
index 000..51bf75f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-static2.C
@@ -0,0 +1,19 @@
+// PR c++/70735
+// { dg-do run { target c++1y } }
+
+template 
+static void g()
+{
+  static int a;
+  auto f = [](auto) { return a; };
+  if (f(0) != 0)
+__builtin_abort();
+  a = 1;
+  if (f(0) != 1)
+__builtin_abort();
+}
+
+int main()
+{
+  g();
+}


[PATCH, i386]: Improve IS_STACK_MODE and use it some more

2016-05-23 Thread Uros Bizjak
Hello!

2016-05-23  Uros Bizjak  

* config/i386/i386.h (IS_STACK_MODE): Enable for
TARGET_MIX_SSE_I387.  Rewrite using X87_FLOAT_MODE_P and
SSE_FLOAT_MODE_P macros.
* config/i386/i386.c (ix86_preferred_reload_class): Use
IS_STACK_MODE, INTEGER_CLASS_P and FLOAT_CLASS_P macros.  Cleanup
regclass processing for CONST_DOUBLE_P.
(ix86_preferred_output_reload_class): Use IS_STACK_MODE macro.
(ix86_rtx_costs): Remove redundant TARGET_80387 check
with IS_STACK_MODE macro.
* config/i386/i386.md: Replace SSE_FLOAT_MODE_P (DFmode)
with TARGET_SSE2.
(*movdf_internal): Use IS_STACK_MODE macro.
(*movsf_internal): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.
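
For readers following the IS_STACK_MODE change, here is a small standalone model of the old and new predicates. It is a sketch, not the GCC macros: the Mode enum, the TargetFlags struct, and the function names are made up, and the definitions used for X87_FLOAT_MODE_P and SSE_FLOAT_MODE_P (x87 enabled and mode is SF/DF/XF; SSE with SFmode or SSE2 with DFmode) are my assumption about i386.h rather than something shown in this patch.

```cpp
enum class Mode { SF, DF, XF };

// Stand-ins for the TARGET_* flags; names are illustrative only.
struct TargetFlags
{
  bool x87;           // TARGET_80387
  bool sse;           // TARGET_SSE
  bool sse2;          // TARGET_SSE2
  bool sse_math;      // TARGET_SSE_MATH
  bool mix_sse_i387;  // TARGET_MIX_SSE_I387
};

// Assumed shape of X87_FLOAT_MODE_P.
constexpr bool x87_float_mode_p(const TargetFlags &t, Mode m)
{
  return t.x87 && (m == Mode::SF || m == Mode::DF || m == Mode::XF);
}

// Assumed shape of SSE_FLOAT_MODE_P.
constexpr bool sse_float_mode_p(const TargetFlags &t, Mode m)
{
  return (t.sse && m == Mode::SF) || (t.sse2 && m == Mode::DF);
}

// Model of the macro before the patch.
constexpr bool is_stack_mode_old(const TargetFlags &t, Mode m)
{
  return (m == Mode::SF && !(t.sse && t.sse_math))
         || (m == Mode::DF && !(t.sse2 && t.sse_math))
         || m == Mode::XF;
}

// Model of the macro after the patch.
constexpr bool is_stack_mode_new(const TargetFlags &t, Mode m)
{
  return x87_float_mode_p(t, m)
         && (!(sse_float_mode_p(t, m) && t.sse_math) || t.mix_sse_i387);
}
```

Under these assumptions the two versions differ in two places: with mixed SSE/x87 math the new predicate accepts SFmode and DFmode, and with x87 disabled it rejects every mode, which would explain why the separate TARGET_80387 test in ix86_rtx_costs becomes redundant.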

Uros.
Index: i386.c
===
--- i386.c  (revision 236592)
+++ i386.c  (working copy)
@@ -43301,37 +43301,35 @@
  || MAYBE_MASK_CLASS_P (regclass)))
 return NO_REGS;
 
-  /* Prefer SSE regs only, if we can use them for math.  */
-  if (TARGET_SSE_MATH && !TARGET_MIX_SSE_I387 && SSE_FLOAT_MODE_P (mode))
-return SSE_CLASS_P (regclass) ? regclass : NO_REGS;
-
   /* Floating-point constants need more complex checks.  */
   if (CONST_DOUBLE_P (x))
 {
   /* General regs can load everything.  */
-  if (reg_class_subset_p (regclass, GENERAL_REGS))
+  if (INTEGER_CLASS_P (regclass))
 return regclass;
 
   /* Floats can load 0 and 1 plus some others.  Note that we eliminated
 zero above.  We only want to wind up preferring 80387 registers if
 we plan on doing computation with them.  */
-  if (TARGET_80387
+  if (IS_STACK_MODE (mode)
  && standard_80387_constant_p (x) > 0)
{
- /* Limit class to non-sse.  */
- if (regclass == FLOAT_SSE_REGS)
+ /* Limit class to FP regs.  */
+ if (FLOAT_CLASS_P (regclass))
return FLOAT_REGS;
- if (regclass == FP_TOP_SSE_REGS)
+ else if (regclass == FP_TOP_SSE_REGS)
return FP_TOP_REG;
- if (regclass == FP_SECOND_SSE_REGS)
+ else if (regclass == FP_SECOND_SSE_REGS)
return FP_SECOND_REG;
- if (regclass == FLOAT_INT_REGS || regclass == FLOAT_REGS)
-   return regclass;
}
 
   return NO_REGS;
 }
 
+  /* Prefer SSE regs only, if we can use them for math.  */
+  if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
+return SSE_CLASS_P (regclass) ? regclass : NO_REGS;
+
   /* Generally when we see PLUS here, it's the function invariant
  (plus soft-fp const_int).  Which can only be computed into general
  regs.  */
@@ -43363,10 +43361,10 @@
  math on.  If we would like not to return a subset of CLASS, reject this
  alternative: if reload cannot do this, it will still use its choice.  */
   mode = GET_MODE (x);
-  if (TARGET_SSE_MATH && SSE_FLOAT_MODE_P (mode))
+  if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
 return MAYBE_SSE_CLASS_P (regclass) ? ALL_SSE_REGS : NO_REGS;
 
-  if (X87_FLOAT_MODE_P (mode))
+  if (IS_STACK_MODE (mode))
 {
   if (regclass == FP_TOP_SSE_REGS)
return FP_TOP_REG;
@@ -44071,7 +44069,7 @@
   return true;
 
 case CONST_DOUBLE:
-  if (TARGET_80387 && IS_STACK_MODE (mode))
+  if (IS_STACK_MODE (mode))
switch (standard_80387_constant_p (x))
  {
  case -1:
Index: i386.h
===
--- i386.h  (revision 236592)
+++ i386.h  (working copy)
@@ -957,10 +957,10 @@
 
 #define STACK_REGS
 
-#define IS_STACK_MODE(MODE)\
-  (((MODE) == SFmode && !(TARGET_SSE && TARGET_SSE_MATH))  \
-   || ((MODE) == DFmode && !(TARGET_SSE2 && TARGET_SSE_MATH))  \
-   || (MODE) == XFmode)
+#define IS_STACK_MODE(MODE)\
+  (X87_FLOAT_MODE_P (MODE) \
+   && (!(SSE_FLOAT_MODE_P (MODE) && TARGET_SSE_MATH)   \
+   || TARGET_MIX_SSE_I387))
 
 /* Number of actual hardware registers.
The hardware registers are assigned numbers for the compiler
Index: i386.md
===
--- i386.md (revision 236592)
+++ i386.md (working copy)
@@ -3276,7 +3276,7 @@
|| !CONST_DOUBLE_P (operands[1])
|| ((optimize_function_for_size_p (cfun)
|| (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))
-  && ((!(TARGET_SSE2 && TARGET_SSE_MATH)
+  && ((IS_STACK_MODE (DFmode)
&& standard_80387_constant_p (operands[1]) > 0)
   || (TARGET_SSE2 && TARGET_SSE_MATH
   && standard_sse_constant_p (operands[1], DFmode) == 1))
@@ -3478,9 +3478,9 @@
|| !CONST_DOUBLE_P (operands[1])
|| ((optimize_function_for_size_p (cfun)
|| (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))
-   

C++ PATCH for c++/70584 (parenthesized argument to x86 builtin)

2016-05-23 Thread Jason Merrill
The C++14 decltype(auto) obfuscation was confusing the x86 builtin; it's 
a simple matter to undo it during delayed folding, thanks to the 
maybe_undo_parenthesized_ref function that Patrick recently introduced.
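
Some background on why parentheses are tracked at all: decltype(auto) deduces a different type for x and (x), so the front end has to remember whether an id-expression was parenthesized. The sketch below shows only that language rule, with made-up names; it does not reproduce the builtin error fixed by this patch.

```cpp
#include <type_traits>

static int global_value = 7;

// Unparenthesized id-expression: deduces int, returns a copy.
decltype(auto) plain() { return global_value; }

// Parenthesized id-expression: deduces int&, returns a reference.
decltype(auto) parenthesized() { return (global_value); }

constexpr bool plain_is_value =
  std::is_same<decltype(plain()), int>::value;
constexpr bool parenthesized_is_reference =
  std::is_same<decltype(parenthesized()), int &>::value;

static_assert(plain_is_value, "plain() should return int");
static_assert(parenthesized_is_reference,
              "parenthesized() should return int&");
```

Once the type has been deduced the distinction no longer matters, which is presumably why it is safe to undo the marker during delayed folding.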


Tested x86_64-pc-linux-gnu, applying to trunk.
commit d8cfb8653df9583386a9825809dfdfa5e8d99759
Author: Jason Merrill 
Date:   Fri May 20 17:21:51 2016 -0400

	PR c++/70584 - error with parenthesized builtin arg

	* cp-gimplify.c (cp_fold) [INDIRECT_REF]: Call
	maybe_undo_parenthesized_ref.

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index aaa2db2..57f5d35 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -1996,6 +1996,15 @@ cp_fold (tree x)
 
   break;
 
+case INDIRECT_REF:
+  /* We don't need the decltype(auto) obfuscation anymore.  */
+  if (REF_PARENTHESIZED_P (x))
+	{
+	  tree p = maybe_undo_parenthesized_ref (x);
+	  return cp_fold_maybe_rvalue (p, rval_ops);
+	}
+  goto unary;
+
 case ADDR_EXPR:
 case REALPART_EXPR:
 case IMAGPART_EXPR:
@@ -2008,7 +2017,7 @@ cp_fold (tree x)
 case BIT_NOT_EXPR:
 case TRUTH_NOT_EXPR:
 case FIXED_CONVERT_EXPR:
-case INDIRECT_REF:
+unary:
 
   loc = EXPR_LOCATION (x);
   op0 = cp_fold_maybe_rvalue (TREE_OPERAND (x, 0), rval_ops);
diff --git a/gcc/testsuite/g++.dg/other/i386-10.C b/gcc/testsuite/g++.dg/other/i386-10.C
new file mode 100644
index 000..96def72
--- /dev/null
+++ b/gcc/testsuite/g++.dg/other/i386-10.C
@@ -0,0 +1,12 @@
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options -maes }
+
+typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
+
+int main()
+{
+const char index = 1;
+__m128i r = { };
+
+r = __builtin_ia32_aeskeygenassist128 (r, (int)(index));
+}


Re: tuple move constructor

2016-05-23 Thread Marc Glisse

Ping

(re-attaching, I just added a one-line comment before the tag class as 
asked by Ville)


On Thu, 21 Apr 2016, Marc Glisse wrote:


On Thu, 21 Apr 2016, Jonathan Wakely wrote:


On 20 April 2016 at 21:42, Marc Glisse wrote:

Hello,

does anyone remember why the move constructor of _Tuple_impl is not
defaulted? The attached patch does not cause any test to fail (whitespace
kept to avoid line number changes). Maybe something about tuples of
references?


I don't know/remember why. It's possible it was to workaround a
front-end bug that required it, or maybe just a mistake and it should
always have been defaulted.


Ok, then how about something like this? In order to suppress the move
constructor in tuple (when there is a non-movable element), we need to
either declare it with suitable constraints, or keep it defaulted and
ensure that we don't bypass a missing move constructor anywhere along
the way (_Tuple_impl, _Head_base). There is a strange mix of 2
strategies in the patch, I prefer the tag class, but I started using
enable_if before I realized how many places needed those horrors.
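
To make the tag class idea concrete, here is a reduced sketch with made-up names (element_arg_t, head_base, no_move) rather than the library's reserved identifiers. It assumes the point of the tag is to keep the forwarding constructor out of overload resolution for copy and move, so that a defaulted move constructor which ends up deleted is not silently bypassed.

```cpp
#include <type_traits>
#include <utility>

// Tag type: a constructor taking it can never be chosen for copy/move.
struct element_arg_t { };

template <typename Head>
struct head_base
{
  constexpr head_base() : value() { }

  constexpr head_base(const head_base &) = default;
  constexpr head_base(head_base &&) = default;

  // Forwarding constructor. Without the tag parameter, a call such as
  // head_base(std::move(other)) could also match this template.
  template <typename U>
  constexpr head_base(element_arg_t, U &&u)
  : value(std::forward<U>(u)) { }

  Head value;
};

// An element type that can be neither copied nor moved.
struct no_move
{
  no_move() = default;
  no_move(const no_move &) = delete;
  no_move(no_move &&) = delete;
};

static_assert(std::is_move_constructible<head_base<int>>::value,
              "movable element keeps the wrapper movable");
static_assert(!std::is_move_constructible<head_base<no_move>>::value,
              "non-movable element makes the wrapper non-movable");
```

The alternative of constraining the forwarding constructor with enable_if achieves the same effect, but needs the constraint repeated at every level that forwards to the next.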

Bootstrap+regtest on powerpc64le-unknown-linux-gnu.


2016-04-22  Marc Glisse  

* include/std/tuple (__element_arg_t): New class.
(_Head_base(const _Head&), _Tuple_impl(const _Head&, const _Tail&...)):
Remove.
(_Head_base(_UHead&&)): Add __element_arg_t argument...
(_Tuple_impl): ... and adjust callers.
(_Tuple_impl(_Tuple_impl&&)): Default.
(_Tuple_impl(const _Tuple_impl&),
_Tuple_impl(_Tuple_impl&&), _Tuple_impl(_UHead&&)): Constrain.
* testsuite/20_util/tuple/nomove.cc: New.


--
Marc Glisse
Index: libstdc++-v3/include/std/tuple
===
--- libstdc++-v3/include/std/tuple	(revision 236338)
+++ libstdc++-v3/include/std/tuple	(working copy)
@@ -41,38 +41,38 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /**
*  @addtogroup utilities
*  @{
*/
 
+  // Tag type to distinguish forwarding constructors from copy/move.
+  struct __element_arg_t { };
+
   template
 struct _Head_base;
 
   template
 struct _Head_base<_Idx, _Head, true>
 : public _Head
 {
   constexpr _Head_base()
   : _Head() { }
 
-  constexpr _Head_base(const _Head& __h)
-  : _Head(__h) { }
-
   constexpr _Head_base(const _Head_base&) = default;
   constexpr _Head_base(_Head_base&&) = default;
 
   template
-constexpr _Head_base(_UHead&& __h)
+constexpr _Head_base(__element_arg_t, _UHead&& __h)
 	: _Head(std::forward<_UHead>(__h)) { }
 
   _Head_base(allocator_arg_t, __uses_alloc0)
   : _Head() { }
 
   template
 	_Head_base(allocator_arg_t, __uses_alloc1<_Alloc> __a)
 	: _Head(allocator_arg, *__a._M_a) { }
 
   template
@@ -97,28 +97,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static constexpr const _Head&
   _M_head(const _Head_base& __b) noexcept { return __b; }
 };
 
   template
 struct _Head_base<_Idx, _Head, false>
 {
   constexpr _Head_base()
   : _M_head_impl() { }
 
-  constexpr _Head_base(const _Head& __h)
-  : _M_head_impl(__h) { }
-
   constexpr _Head_base(const _Head_base&) = default;
   constexpr _Head_base(_Head_base&&) = default;
 
   template
-constexpr _Head_base(_UHead&& __h)
+constexpr _Head_base(__element_arg_t, _UHead&& __h)
 	: _M_head_impl(std::forward<_UHead>(__h)) { }
 
   _Head_base(allocator_arg_t, __uses_alloc0)
   : _M_head_impl() { }
 
   template
 	_Head_base(allocator_arg_t, __uses_alloc1<_Alloc> __a)
 	: _M_head_impl(allocator_arg, *__a._M_a) { }
 
   template
@@ -194,50 +191,49 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   static constexpr _Inherited&
   _M_tail(_Tuple_impl& __t) noexcept { return __t; }
 
   static constexpr const _Inherited&
   _M_tail(const _Tuple_impl& __t) noexcept { return __t; }
 
   constexpr _Tuple_impl()
   : _Inherited(), _Base() { }
 
-  explicit 
-  constexpr _Tuple_impl(const _Head& __head, const _Tail&... __tail)
-  : _Inherited(__tail...), _Base(__head) { }
-
   template::type> 
 explicit
 constexpr _Tuple_impl(_UHead&& __head, _UTail&&... __tail)
 	: _Inherited(std::forward<_UTail>(__tail)...),
-	  _Base(std::forward<_UHead>(__head)) { }
+	  _Base(__element_arg_t(), std::forward<_UHead>(__head)) { }
 
   constexpr _Tuple_impl(const _Tuple_impl&) = default;
+  constexpr _Tuple_impl(_Tuple_impl&&) = default;
 
-  constexpr
-  _Tuple_impl(_Tuple_impl&& __in)
-  noexcept(__and_,
-	  is_nothrow_move_constructible<_Inherited>>::value)
-  : _Inherited(std::move(_M_tail(__in))), 
-	_Base(std::forward<_Head>(_M_head(__in))) { }
-
-  template
+  template>::value,
+	  bool>::type = false>
 constexpr 

Re: "omp declare target" on DECL_EXTERNAL vars

2016-05-23 Thread Jakub Jelinek
On Mon, May 23, 2016 at 09:19:47PM +0300, Alexander Monakov wrote:
> 
> So unlike for functions, for variables GCC needs to know exactly whether they
> are 'omp declare target [link]' at all points of use, not just at the point of
> definition.

There are many bugs that just can't be diagnosed by the compiler.
It is up to the users to make sure they write sane code.

> There's a pitfall if the user forgets the pragma on the external declaration:
> 
> === a.c
> 
> #pragma omp declare target
> int a;
> void set_a()
> {
>   a = 42;
> }
> #pragma omp end declare target
> 
> === main.c
> 
> extern int a;
> extern void set_a();
> #pragma omp declare target to(set_a)
> 
> int main()
> {
>   a = 0;
>   #pragma omp target map(tofrom:a)
> set_a();
> 
>   if (a != 42) abort();
> }
> ===

The above will always abort, whether or not you have
#pragma omp declare target to(a)
in main.c, because a is already mapped (with infinite refcount), so
the map(tofrom:a) doesn't actually do anything (but prevent
firstprivatization of the var).  With map clause on the target, the only
change would be that the body of the target (but not functions it calls), if
they reference a, would be less efficient (would reference a through some
pointer set up during the mapping, instead of a directly).

Jakub


Re: "omp declare target" on DECL_EXTERNAL vars

2016-05-23 Thread Alexander Monakov
On Mon, 23 May 2016, Jakub Jelinek wrote:
> Having the externs specified in omp declare target to is important for
> code generation, we need to know that whether the vars should be mapped
> implicitly on target constructs and remapped in the target construct bodies,
> or whether the actual vars should be used in the regions.

Yep, sorry for missing that.

> Thus, 
> 
> > So from that perspective it's undesirable to have 'omp declare target' on
> > declarations that don't define anything.
> 
> is just wrong, we at least need the symbol_table::offloadable bit set.

So unlike for functions, for variables GCC needs to know exactly whether they
are 'omp declare target [link]' at all points of use, not just at the point of
definition.

There's a pitfall if the user forgets the pragma on the external declaration:

=== a.c

#pragma omp declare target
int a;
void set_a()
{
  a = 42;
}
#pragma omp end declare target

=== main.c

extern int a;
extern void set_a();
#pragma omp declare target to(set_a)

int main()
{
  a = 0;
  #pragma omp target map(tofrom:a)
set_a();

  if (a != 42) abort();
}
===

As I understand, this aborts, and it's not obvious how to take measures to
produce a compile-time diagnostic.  And I'm not sure if the letter of the spec
is being violated there.

Sorry if I'm elaborating on the more obvious stuff without contributing to
your original question; I hope this is of some value (like it is for me).

> About g->head_offload and offload_vars, I guess it is fine not to set those
> for externs but we need to arrange that to be set when we actually define it
> when it has been previously extern,

+1, it should be nice to avoid unnecessary streaming of externs; as for the
latter point, wouldn't moving handling from frontends to a point in the
middle-end when the symtab is complete solve that automatically?

> and we need some sensible handling of the case where the var is only
> declared extern and omp declare target, used in some target region, but not
> actually defined anywhere in the same shared library or executable.

I think on NVPTX it yields a link error at mkoffload time.

Alexander


Re: [PATCH v2] gcov: Runtime configurable destination output

2016-05-23 Thread Aaron Conole
Nathan Sidwell  writes:

> On 05/19/16 14:40, Aaron Conole wrote:
>> Nathan Sidwell  writes:
>
>>>> +FILE *__gcov_error_file = NULL;
>>>
>>> Unless I'm missing something, isn't this only accessed from this file?
>>> (So could be static with a non-underbarred name)
>>
>> Ack.
>
> I have a vague memory that perhaps the __gcov_error_file is seen from
> other dynamic objects, and one of them gets to open/close it?  I think
> the closing function needs to reset it to NULL though?  (In case it's
> reactivated before the process exits)

This is being introduced here, so the actual variable won't be seen,
however you're correct - the APIs could still be called.

I think there does exist a possibility that it can get re-activated
before the process exits. So, I've changed it to have a proper block
scope and to reset gcov_error_file to NULL.

>>> And this protection here, makes me wonder what happens if one is
>>> IN_GCOV_TOOL. Does it pay attention to GCOV_ERROR_FILE?  That would
>>> seem incorrect, and thus the above should be changed so that stderr is
>>> unconditionally used when IN_GCOV_TOOL?
>>
>> You are correct.  I will fix it.
>
> thanks.
>
>>>> +static void
>>>> +gcov_error_exit(void)
>>>> +{
>>>> +  if (__gcov_error_file && __gcov_error_file != stderr)
>>>> +{
>>>
>>> Braces are not needed here.
>
> Unless of course my speculation about setting it to NULL is right.

It is - I've fixed it, and will post the v3 patch shortly.

Thank you for your help, Nathan!

> nathan


Re: [C++ Patch] PR 70972 ("[6/7 Regression] Inheriting constructors taking parameters by value should move them, not copy")

2016-05-23 Thread Jason Merrill

OK.

Jason


Re: [C++ Patch] PR 69095

2016-05-23 Thread Jason Merrill

On 05/23/2016 11:25 AM, Paolo Carlini wrote:

On 23/05/2016 15:32, Jason Merrill wrote:

On 05/22/2016 02:26 PM, Paolo Carlini wrote:

finally sending a patch for this issue. As noticed by submitter himself,
it appears to boil down to a rather straightforward case of not
rejecting unexpanded parameter packs in default arguments. In order to
handle all the combinations (in/out of class, template
parameter/function parameter) I added calls of
check_for_bare_parameter_packs both to cp_parser_default_argument and
cp_parser_late_parsing_default_args


Hmm, would it make sense to check in cp_parser_initializer?

Oh yes. The below is already past g++.dg/tm...


OK if testing passes.

Jason



[PATCH] Improve avx_vec_concat

2016-05-23 Thread Jakub Jelinek
Hi!

Not sure how to easily test these.
In any case, for the vinsert* case, we don't have vinserti128 nor
vinsertf128 in evex, so need to use vinsert[if]{64x4,32x4} or
for DQ {64x2,32x8}.  For the case with zero in the other half,
we need AVX512VL and it isn't guaranteed for the output operand,
because it can be 512-bit mode too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-23  Jakub Jelinek  

* config/i386/sse.md (avx_vec_concat): Add v=v,vm and
Yv=Yv,C alternatives.

--- gcc/config/i386/sse.md.jj   2016-05-23 15:42:49.0 +0200
+++ gcc/config/i386/sse.md  2016-05-23 16:25:58.434925572 +0200
@@ -18178,10 +18178,10 @@ (define_insn "_
(set_attr "mode" "")])
 
 (define_insn "avx_vec_concat"
-  [(set (match_operand:V_256_512 0 "register_operand" "=x,x")
+  [(set (match_operand:V_256_512 0 "register_operand" "=x,v,x,Yv")
(vec_concat:V_256_512
- (match_operand: 1 "register_operand" "x,x")
- (match_operand: 2 "vector_move_operand" "xm,C")))]
+ (match_operand: 1 "register_operand" "x,v,x,v")
+ (match_operand: 2 "vector_move_operand" "xm,vm,C,C")))]
   "TARGET_AVX"
 {
   switch (which_alternative)
@@ -18189,6 +18189,22 @@ (define_insn "avx_vec_concat"
 case 0:
   return "vinsert\t{$0x1, %2, %1, %0|%0, 
%1, %2, 0x1}";
 case 1:
+  if ( == 64)
+   {
+ if (TARGET_AVX512DQ && GET_MODE_SIZE (mode) == 4)
+   return "vinsert32x8\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}";
+ else
+   return "vinsert64x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}";
+   }
+  else
+   {
+ if (TARGET_AVX512DQ && GET_MODE_SIZE (mode) == 8)
+   return "vinsert64x2\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}";
+ else
+   return "vinsert32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}";
+   }
+case 2:
+case 3:
   switch (get_attr_mode (insn))
{
case MODE_V16SF:
@@ -18200,9 +18216,19 @@ (define_insn "avx_vec_concat"
case MODE_V4DF:
  return "vmovapd\t{%1, %x0|%x0, %1}";
case MODE_XI:
- return "vmovdqa\t{%1, %t0|%t0, %1}";
+ if (which_alternative == 2)
+   return "vmovdqa\t{%1, %t0|%t0, %1}";
+ else if (GET_MODE_SIZE (mode) == 8)
+   return "vmovdqa64\t{%1, %t0|%t0, %1}";
+ else
+   return "vmovdqa32\t{%1, %t0|%t0, %1}";
case MODE_OI:
- return "vmovdqa\t{%1, %x0|%x0, %1}";
+ if (which_alternative == 2)
+   return "vmovdqa\t{%1, %x0|%x0, %1}";
+ else if (GET_MODE_SIZE (mode) == 8)
+   return "vmovdqa64\t{%1, %x0|%x0, %1}";
+ else
+   return "vmovdqa32\t{%1, %x0|%x0, %1}";
default:
  gcc_unreachable ();
}
@@ -18210,9 +18236,9 @@ (define_insn "avx_vec_concat"
   gcc_unreachable ();
 }
 }
-  [(set_attr "type" "sselog,ssemov")
-   (set_attr "prefix_extra" "1,*")
-   (set_attr "length_immediate" "1,*")
+  [(set_attr "type" "sselog,sselog,ssemov,ssemov")
+   (set_attr "prefix_extra" "1,1,*,*")
+   (set_attr "length_immediate" "1,1,*,*")
(set_attr "prefix" "maybe_evex")
(set_attr "mode" "")])
 

Jakub


[PATCH] Improve vcvtps2ph

2016-05-23 Thread Jakub Jelinek
Hi!

These insns are available in AVX512VL, so we can just use v instead of x.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-23  Jakub Jelinek  

* config/i386/sse.md (*vcvtps2ph_store): Use v constraint
instead of x constraint.
(vcvtps2ph256): Likewise.

* gcc.target/i386/avx512vl-vcvtps2ph-3.c: New test.

--- gcc/config/i386/sse.md.jj   2016-05-23 15:07:41.0 +0200
+++ gcc/config/i386/sse.md  2016-05-23 15:42:49.854873998 +0200
@@ -18299,7 +18299,7 @@ (define_insn "*vcvtps2ph"
 
 (define_insn "*vcvtps2ph_store"
   [(set (match_operand:V4HI 0 "memory_operand" "=m")
-   (unspec:V4HI [(match_operand:V4SF 1 "register_operand" "x")
+   (unspec:V4HI [(match_operand:V4SF 1 "register_operand" "v")
  (match_operand:SI 2 "const_0_to_255_operand" "N")]
 UNSPEC_VCVTPS2PH))]
   "TARGET_F16C || TARGET_AVX512VL"
@@ -18309,8 +18309,8 @@ (define_insn "*vcvtps2ph_store"
-  [(set (match_operand:V8HI 0 "nonimmediate_operand" "=xm")
-   (unspec:V8HI [(match_operand:V8SF 1 "register_operand" "x")
+  [(set (match_operand:V8HI 0 "nonimmediate_operand" "=vm")
+   (unspec:V8HI [(match_operand:V8SF 1 "register_operand" "v")
  (match_operand:SI 2 "const_0_to_255_operand" "N")]
 UNSPEC_VCVTPS2PH))]
   "TARGET_F16C || TARGET_AVX512VL"
--- gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-3.c.jj 2016-05-23 15:51:51.913742438 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-vcvtps2ph-3.c 2016-05-23 15:54:03.316021252 +0200
@@ -0,0 +1,41 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mf16c -mavx512vl -masm=att" } */
+
+#include 
+
+void
+f1 (__m128 x)
+{
+  register __m128 a __asm ("xmm16");
+  register __m128i b __asm ("xmm17");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  b = _mm_cvtps_ph (a, 1);
+  asm volatile ("" : "+v" (b));
+}
+
+/* { dg-final { scan-assembler "vcvtps2ph\[^\n\r]*\\\$1\[^\n\r]*%xmm16\[^\n\r]*%xmm17" } } */
+
+void
+f2 (__m256 x)
+{
+  register __m256 a __asm ("xmm16");
+  register __m128i b __asm ("xmm17");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  b = _mm256_cvtps_ph (a, 1);
+  asm volatile ("" : "+v" (b));
+}
+
+/* { dg-final { scan-assembler "vcvtps2ph\[^\n\r]*\\\$1\[^\n\r]*%ymm16\[^\n\r]*%xmm17" } } */
+
+void
+f3 (__m256 x, __v8hi *y)
+{
+  register __m256 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  *y = (__v8hi) _mm256_cvtps_ph (a, 1);
+}
+
+/* { dg-final { scan-assembler "vcvtps2ph\[^\n\r]*\\\$1\[^\n\r]*%ymm16\[^\n\r]*%rdi" } } */

Jakub


[PATCH] Improve *ssse3_palignr_perm

2016-05-23 Thread Jakub Jelinek
Hi!

This pattern is used to improve __builtin_shuffle in some cases;
VPALIGNR is AVX512BW & AVX512VL.
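
For readers following along, here is a rough scalar model of the single-source rotation this pattern matches.  It is only a sketch: the helper names are made up for illustration and are not part of the patch.  The pattern scales the first selector index by the element size to get the byte immediate, which is why the tests below look for $6 (char), $10 (short) and $8 (long long).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte immediate for a rotate-by-elements shuffle,
   mirroring INTVAL (operands[3]) * GET_MODE_UNIT_SIZE (...) in the pattern.  */
static unsigned
palignr_imm (unsigned first_index, unsigned unit_size)
{
  assert (first_index * unit_size < 16);
  return first_index * unit_size;
}

/* Scalar model of "vpalignr $imm, src, src, dst" on a 16-byte vector:
   with both sources equal, the result is SRC rotated right by IMM bytes.  */
static void
palignr_same_src (unsigned char dst[16], const unsigned char src[16],
                  unsigned imm)
{
  for (size_t i = 0; i < 16; i++)
    dst[i] = src[(i + imm) % 16];
}
```

With a byte vector holding 0..15 and an immediate of 6, element 0 of the result is source byte 6, which matches the selector { 6, 7, ..., 15, 0, ..., 5 } used in f1 below.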

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-23  Jakub Jelinek  

* config/i386/sse.md (*ssse3_palignr_perm): Add avx512bw
alternative.  Formatting fix.

* gcc.target/i386/avx512bw-vpalignr-4.c: New test.
* gcc.target/i386/avx512vl-vpalignr-4.c: New test.

--- gcc/config/i386/sse.md.jj   2016-05-23 14:53:36.0 +0200
+++ gcc/config/i386/sse.md  2016-05-23 15:07:41.518548599 +0200
@@ -17747,33 +17747,34 @@ (define_insn "*avx_vperm2f128_noze
(set_attr "mode" "")])
 
 (define_insn "*ssse3_palignr_perm"
-  [(set (match_operand:V_128 0 "register_operand" "=x,x")
+  [(set (match_operand:V_128 0 "register_operand" "=x,x,v")
   (vec_select:V_128
-   (match_operand:V_128 1 "register_operand" "0,x")
+   (match_operand:V_128 1 "register_operand" "0,x,v")
(match_parallel 2 "palignr_operand"
- [(match_operand 3 "const_int_operand" "n, n")])))]
+ [(match_operand 3 "const_int_operand" "n,n,n")])))]
   "TARGET_SSSE3"
 {
-  operands[2] =
-   GEN_INT (INTVAL (operands[3]) * GET_MODE_UNIT_SIZE (GET_MODE (operands[0])));
+  operands[2] = (GEN_INT (INTVAL (operands[3])
+* GET_MODE_UNIT_SIZE (GET_MODE (operands[0]))));
 
   switch (which_alternative)
 {
 case 0:
   return "palignr\t{%2, %1, %0|%0, %1, %2}";
 case 1:
+case 2:
   return "vpalignr\t{%2, %1, %1, %0|%0, %1, %1, %2}";
 default:
   gcc_unreachable ();
 }
 }
-  [(set_attr "isa" "noavx,avx")
+  [(set_attr "isa" "noavx,avx,avx512bw")
(set_attr "type" "sseishft")
(set_attr "atom_unit" "sishuf")
-   (set_attr "prefix_data16" "1,*")
+   (set_attr "prefix_data16" "1,*,*")
(set_attr "prefix_extra" "1")
(set_attr "length_immediate" "1")
-   (set_attr "prefix" "orig,vex")])
+   (set_attr "prefix" "orig,vex,evex")])
 
 (define_expand "avx512vl_vinsert"
   [(match_operand:VI48F_256 0 "register_operand")
--- gcc/testsuite/gcc.target/i386/avx512bw-vpalignr-4.c.jj  2016-05-23 15:18:57.787640379 +0200
+++ gcc/testsuite/gcc.target/i386/avx512bw-vpalignr-4.c 2016-05-23 15:18:26.0 +0200
@@ -0,0 +1,86 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mavx512bw -masm=att" } */
+
+typedef char V1 __attribute__((vector_size (16)));
+
+void
+f1 (V1 x)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V1) { 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5 });
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler-times "vpalignr\[^\n\r]*\\\$6\[^\n\r]*%xmm16\[^\n\r]*%xmm16\[^\n\r]*%xmm16" 1 } } */
+
+typedef short V2 __attribute__((vector_size (16)));
+
+void
+f2 (V2 x)
+{
+  register V2 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V2) { 5, 6, 7, 0, 1, 2, 3, 4 });
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler-times "vpalignr\[^\n\r]*\\\$10\[^\n\r]*%xmm16\[^\n\r]*%xmm16\[^\n\r]*%xmm16" 1 } } */
+
+typedef int V3 __attribute__((vector_size (16)));
+
+void
+f3 (V3 x)
+{
+  register V3 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V3) { 3, 0, 1, 2 });
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler-times "vpshufd\[^\n\r]*\\\$147\[^\n\r]*%xmm16\[^\n\r]*%xmm16" 1 } } */
+
+typedef long long V4 __attribute__((vector_size (16)));
+
+void
+f4 (V4 x)
+{
+  register V4 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V4) { 1, 0 });
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler-times "vpalignr\[^\n\r]*\\\$8\[^\n\r]*%xmm16\[^\n\r]*%xmm16\[^\n\r]*%xmm16" 1 } } */
+
+typedef float V5 __attribute__((vector_size (16)));
+
+void
+f5 (V5 x)
+{
+  register V5 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V3) { 3, 0, 1, 2 });
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler-times "vpermilps\[^\n\r]*\\\$147\[^\n\r]*%xmm16\[^\n\r]*%xmm16" 1 } } */
+
+typedef double V6 __attribute__((vector_size (16)));
+
+void
+f6 (V6 x)
+{
+  register V6 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V4) { 1, 0 });
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler-times "vpermilpd\[^\n\r]*\\\$1\[^\n\r]*%xmm16\[^\n\r]*%xmm16" 1 } } */
--- gcc/testsuite/gcc.target/i386/avx512vl-vpalignr-4.c.jj  2016-05-23 15:19:34.352162361 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-vpalignr-4.c 2016-05-23 15:20:02.570793519 +0200
@@ -0,0 +1,86 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mno-avx512bw -masm=att" } */
+
+typedef char V1 __attribute__((vector_size (16)));
+
+void
+f1 (V1 x)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+ 

[PATCH] Improve *avx_vperm_broadcast_*

2016-05-23 Thread Jakub Jelinek
Hi!

The vbroadcastss and vpermilps insns are already in AVX512F & AVX512VL,
so they can be used with v instead of x.  The splitter case where for AVX
we emit vpermilps plus vperm2f128 is more problematic, because the latter
insn isn't available in EVEX.  But we can get the same effect with
vshuff32x4 when both source operands are the same.
Alternatively, I think we could replace the vpermilps and vshuff32x4 insns
with the AVX512VL arbitrary permutations; the question is which is faster,
because we'd need to load the mask from memory.
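
As a rough scalar model of the two-step split for the 8-float case (a sketch only; the function names are invented for illustration): step 1 replicates the chosen element within each 128-bit lane, and step 2 copies the lane that holds it into both lanes.  The lane mask is computed the same way as in the splitter, so 0x00 selects the low lane (the broadcast case) and 0x11 the high lane (the vshuff32x4 case with both sources equal).

```c
#include <assert.h>

/* Lane-selection mask as computed in the splitter for 8 elements:
   0x00 picks the low 128-bit lane, 0x11 the high one.  */
static unsigned
lane_mask (unsigned elt)
{
  assert (elt < 8);
  return (elt / (8 / 2)) * 0x11;
}

/* Hypothetical scalar model of broadcasting element ELT of V.  */
static void
broadcast_elt (float v[8], unsigned elt)
{
  unsigned mask = lane_mask (elt);

  /* Step 1 (vpermilps model): within each lane, replicate the element
     at position ELT % 4 of that lane.  */
  for (unsigned lane = 0; lane < 2; lane++)
    {
      float x = v[lane * 4 + elt % 4];
      for (unsigned i = 0; i < 4; i++)
        v[lane * 4 + i] = x;
    }

  /* Step 2: copy the selected lane into both lanes.  When the low bit of
     MASK is clear the low lane is the source, otherwise the high lane.  */
  unsigned src = (mask & 1) ? 4 : 0;
  for (unsigned i = 0; i < 4; i++)
    {
      float x = v[src + i];
      v[i] = x;
      v[4 + i] = x;
    }
}
```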

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-23  Jakub Jelinek  

* config/i386/sse.md
(avx512vl_shuf_32x4_1): Rename
to ...
(avx512vl_shuf_32x4_1): ... this.
(*avx_vperm_broadcast_v4sf): Use v constraint instead of x.  Use
maybe_evex prefix instead of vex.
(*avx_vperm_broadcast_): Use v constraint instead of x.  Handle
EXT_REX_SSE_REG_P (op0) case in the splitter.

* gcc.target/i386/avx512vl-vbroadcast-3.c: New test.

--- gcc/config/i386/sse.md.jj   2016-05-22 12:27:34.0 +0200
+++ gcc/config/i386/sse.md  2016-05-23 13:54:22.211998751 +0200
@@ -12380,7 +12380,7 @@
   DONE;
 })
 
-(define_insn "avx512vl_shuf_32x4_1"
+(define_insn "avx512vl_shuf_32x4_1"
   [(set (match_operand:VI4F_256 0 "register_operand" "=v")
(vec_select:VI4F_256
  (vec_concat:
@@ -17247,9 +17247,9 @@
 ;; If it so happens that the input is in memory, use vbroadcast.
 ;; Otherwise use vpermilp (and in the case of 256-bit modes, vperm2f128).
 (define_insn "*avx_vperm_broadcast_v4sf"
-  [(set (match_operand:V4SF 0 "register_operand" "=x,x,x")
+  [(set (match_operand:V4SF 0 "register_operand" "=v,v,v")
(vec_select:V4SF
- (match_operand:V4SF 1 "nonimmediate_operand" "m,o,x")
+ (match_operand:V4SF 1 "nonimmediate_operand" "m,o,v")
  (match_parallel 2 "avx_vbroadcast_operand"
[(match_operand 3 "const_int_operand" "C,n,n")])))]
   "TARGET_AVX"
@@ -17271,13 +17271,13 @@
   [(set_attr "type" "ssemov,ssemov,sselog1")
(set_attr "prefix_extra" "1")
(set_attr "length_immediate" "0,0,1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
(set_attr "mode" "SF,SF,V4SF")])
 
 (define_insn_and_split "*avx_vperm_broadcast_"
-  [(set (match_operand:VF_256 0 "register_operand" "=x,x,x")
+  [(set (match_operand:VF_256 0 "register_operand" "=v,v,v")
(vec_select:VF_256
- (match_operand:VF_256 1 "nonimmediate_operand" "m,o,?x")
+ (match_operand:VF_256 1 "nonimmediate_operand" "m,o,?v")
  (match_parallel 2 "avx_vbroadcast_operand"
[(match_operand 3 "const_int_operand" "C,n,n")])))]
   "TARGET_AVX"
@@ -17309,6 +17309,23 @@
 
   /* Shuffle the lane we care about into both lanes of the dest.  */
   mask = (elt / ( / 2)) * 0x11;
+  if (EXT_REX_SSE_REG_P (op0))
+   {
+ /* There is no EVEX VPERM2F128, but we can use either VBROADCASTSS
+or VSHUFF128.  */
+ gcc_assert (mode == V8SFmode);
+ if ((mask & 1) == 0)
+   emit_insn (gen_avx2_vec_dupv8sf (op0,
+gen_lowpart (V4SFmode, op0)));
+ else
+   emit_insn (gen_avx512vl_shuf_f32x4_1 (op0, op0, op0,
+ GEN_INT (4), GEN_INT (5),
+ GEN_INT (6), GEN_INT (7),
+ GEN_INT (12), GEN_INT (13),
+ GEN_INT (14), GEN_INT (15)));
+ DONE;
+   }
+
   emit_insn (gen_avx_vperm2f1283 (op0, op0, op0, GEN_INT (mask)));
   DONE;
 }
--- gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-3.c.jj 2016-05-23 14:07:36.266695992 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-vbroadcast-3.c   2016-05-23 14:14:49.495012459 +0200
@@ -0,0 +1,162 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -masm=att" } */
+
+typedef float V1 __attribute__((vector_size (16)));
+typedef float V2 __attribute__((vector_size (32)));
+typedef int V4 __attribute__((vector_size (16)));
+typedef int V5 __attribute__((vector_size (32)));
+
+void
+f1 (V1 x)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V4) { 0, 0, 0, 0 });
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f2 (V1 x)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V4) { 1, 1, 1, 1 });
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f3 (V1 x)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V4) { 2, 2, 2, 2 });
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f4 (V1 x)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a = __builtin_shuffle (a, (V4) { 3, 

Refactor a bit of the backwards jump threading code

2016-05-23 Thread Jeff Law


fsm_find_control_statement_thread_paths is getting unwieldy and I wanted 
to extend it further.  So a bit of refactoring is warranted.


This pulls out the code to examine a thread path and determine if it's a 
profitable path to thread.  It shouldn't affect the generated code in 
any way.


The primary goal here was to make the management of the last entry on 
the jump threading path more visible (ie, is BBI in PATH or not).  I'm 
not at all happy with how that detail is managed, and it may well change 
in the future.
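
To make the "is BBI in PATH or not" contract concrete, here is a toy C analogue.  It is a sketch under stated assumptions, not the GCC code: the struct, the fixed length limit and the "cost" test are all invented stand-ins.  The candidate block is counted with a +1 before it is pushed, it is popped again on rejection, and on success it stays on the path so the caller must remove it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PATH_LENGTH 4   /* stand-in for PARAM_MAX_FSM_THREAD_LENGTH */

struct path
{
  int blocks[MAX_PATH_LENGTH];
  size_t length;
};

/* Toy analogue of the extracted helper.  Returns 1 and leaves BBI on P
   when the extended path is accepted; returns 0 with P unchanged
   otherwise.  The "cost" (sum of block ids) is purely illustrative.  */
static int
try_extend_path (struct path *p, int bbi, int max_cost)
{
  /* BBI is not on the path yet, hence the +1 in the length test.  */
  if (p->length + 1 > MAX_PATH_LENGTH)
    return 0;

  /* From here on, any rejection must remove BBI again.  */
  p->blocks[p->length++] = bbi;

  int cost = 0;
  for (size_t j = 0; j < p->length; j++)
    cost += p->blocks[j];

  if (cost > max_cost)
    {
      p->length--;
      return 0;
    }
  return 1;
}

/* On success the caller owns the cleanup of the last entry.  */
static void
pop_block (struct path *p)
{
  assert (p->length > 0);
  p->length--;
}
```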


The diffs look far more extensive than they really are.

Bootstrapped and regression tested on x86_64.  Installed on the trunk.

Jeff
commit e5c47bcb335caf84b6abdb6c6b075919465acf0b
Author: Jeff Law 
Date:   Mon May 23 12:46:27 2016 -0400

* tree-ssa-threadbackward.c (profitable_jump_thread_path): New function
extracted from ...
(fsm_find_control_statement_thread_paths): Call it.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f7a012c..f487139 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2016-05-23  Jeff Law  
+
+   * tree-ssa-threadbackward.c (profitable_jump_thread_path): New function
+   extracted from ...
+   (fsm_find_control_statement_thread_paths): Call it.
+
 2016-05-23  Martin Jambor  
 
PR ipa/71234
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 44b1b47..73ab4ea 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -89,6 +89,273 @@ fsm_find_thread_path (basic_block start_bb, basic_block end_bb,
   return false;
 }
 
+/* Examine jump threading path PATH to which we want to add BBI.
+
+   If the resulting path is profitable to thread, then return the
+   final taken edge from the path, NULL otherwise.
+
+   NAME is the SSA_NAME of the variable we found to have a constant
+   value on PATH.  ARG is the value of that SSA_NAME.
+
+   BBI will be appended to PATH when we have a profitable jump threading
+   path.  Callers are responsible for removing BBI from PATH in that case. */
+
+static edge
+profitable_jump_thread_path (vec<basic_block, va_gc> *&path,
+basic_block bbi, tree name, tree arg)
+{
+  /* Note BBI is not in the path yet, hence the +1 in the test below
+ to make sure BBI is accounted for in the path length test.  */
+  int path_length = path->length ();
+  if (path_length + 1 > PARAM_VALUE (PARAM_MAX_FSM_THREAD_LENGTH))
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "FSM jump-thread path not considered: "
+"the number of basic blocks on the path "
+"exceeds PARAM_MAX_FSM_THREAD_LENGTH.\n");
+  return NULL;
+}
+
+  if (max_threaded_paths <= 0)
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "FSM jump-thread path not considered: "
+"the number of previously recorded FSM paths to "
+"thread exceeds PARAM_MAX_FSM_THREAD_PATHS.\n");
+  return NULL;
+}
+
+  /* Add BBI to the path.
+ From this point onward, if we decide the path is not profitable
+ to thread, we must remove BBI from the path.  */
+  vec_safe_push (path, bbi);
+  ++path_length;
+
+  int n_insns = 0;
+  gimple_stmt_iterator gsi;
+  int j;
+  loop_p loop = (*path)[0]->loop_father;
+  bool path_crosses_loops = false;
+  bool threaded_through_latch = false;
+  bool multiway_branch_in_path = false;
+  bool threaded_multiway_branch = false;
+
+  /* Count the number of instructions on the path: as these instructions
+ will have to be duplicated, we will not record the path if there
+ are too many instructions on the path.  Also check that all the
+ blocks in the path belong to a single loop.  */
+  for (j = 0; j < path_length; j++)
+{
+  basic_block bb = (*path)[j];
+
+  /* Remember, blocks in the path are stored in opposite order
+in the PATH array.  The last entry in the array represents
+the block with an outgoing edge that we will redirect to the
+jump threading path.  Thus we don't care about that block's
+loop father, nor how many statements are in that block because
+it will not be copied or whether or not it ends in a multiway
+branch.  */
+  if (j < path_length - 1)
+   {
+ if (bb->loop_father != loop)
+   {
+ path_crosses_loops = true;
+ break;
+   }
+
+ /* PHIs in the path will create degenerate PHIS in the
+copied path which will then get propagated away, so
+looking at just the duplicate path the PHIs would
+seem unimportant.
+
+But those PHIs, because they're assignments to objects
+typically with lives that exist outside the thread path,
+will tend to generate PHIs (or at least new PHI arguments)
+at 

Re: "omp declare target" on DECL_EXTERNAL vars

2016-05-23 Thread Jakub Jelinek
On Mon, May 23, 2016 at 07:15:48PM +0300, Alexander Monakov wrote:
> (it's unclear to me what you mean by 'non-local vars' here, from the context
> it looks like it's 'variables with an external declaration and no definition
> in the current TU'; correct?)

Sure.

> Looking at the OpenMP 4.5 spec, there's a requirement that
> 
> [2.10.6 declare target directive, Restrictions, C/C++]
> * All declarations and definitions for a function must have a declare
> target directive if one is specified for any of them. Otherwise, the
> result is unspecified.
> 
> (why are variables exempted?)

I'll ask on the lang committee.  That said, for external declarations of
variables, all we care about is that if any external declaration is specified
in a to/link clause on a declare target directive, then the definition (for
common vars all the definitions, I guess) is also specified in the same kind
of clause.

Having the externs specified in omp declare target to is important for
code generation: we need to know whether the vars should be mapped
implicitly on target constructs and remapped in the target construct bodies,
or whether the actual vars should be used in the regions.

Thus, 

> So from that perspective it's undesirable to have 'omp declare target' on
> declarations that don't define anything.

is just wrong, we at least need the symbol_table::offloadable bit set.

About g->head_offload and offload_vars, I guess it is fine not to set those
for externs, but we need to arrange for them to be set when a var that was
previously only declared extern is actually defined, and we need some sensible
handling of the case where the var is only declared extern and omp declare
target, used in some target region, but not actually defined anywhere in the
same shared library or executable.

Jakub


[fortran] Re: Make array_at_struct_end_p to grok MEM_REFs

2016-05-23 Thread Jan Hubicka
> 
> The assert below is unnecessary btw - it is ensured by IL checking.
I removed the assert but had to add a check that sizes match. As spotted by the
testsuite, the declaration size doesn't need to match the size of the object
that we see.
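
For context, a minimal C sketch of the trailing-array idiom that array_at_struct_end_p has to be conservative about (names are hypothetical, and this uses a C99 flexible array member where older code often declares a one-element array instead): the storage behind the struct is larger than what the declared type suggests, so accesses past the declared bound are intended.

```c
#include <assert.h>
#include <stdlib.h>

struct foo
{
  int len;
  int trailing_array[];   /* really LEN elements, allocated below */
};

/* Allocate a struct foo with room for LEN trailing elements.  */
static struct foo *
foo_alloc (int len)
{
  assert (len >= 1);
  struct foo *p = malloc (sizeof (struct foo) + (size_t) len * sizeof (int));
  if (p)
    p->len = len;
  return p;
}

/* Every index below LEN is valid even though the type gives no bound.  */
static int
foo_sum (const struct foo *p)
{
  int sum = 0;
  for (int i = 0; i < p->len; i++)
    sum += p->trailing_array[i];
  return sum;
}
```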
> 
> Rather than annotating an ARRAY_REF I'd have FEs annotate FIELD_DECLs
> that they are possibly flexible-size members.

This was my original plan. The problem, however, is that in many cases we do
not see any FIELD_DECL.  When I dump the Fortran cases we give up on, I
typically see something like:
Index: trans-types.c
===
--- trans-types.c   (revision 236556)
+++ trans-types.c   (working copy)
@@ -1920,7 +1920,7 @@ gfc_get_array_type_bounds (tree etype, i
 
   /* We define data as an array with the correct size if possible.
  Much better than doing pointer arithmetic.  */
-  if (stride)
+  if (stride && akind >= GFC_ARRAY_ALLOCATABLE)
 rtype = build_range_type (gfc_array_index_type, gfc_index_zero_node,
  int_const_binop (MINUS_EXPR, stride,
   build_int_cst (TREE_TYPE (stride), 1)));

It does not seem to make sense to build range types for arrays where the
permitted value range is often above the upper bound.

In that case I think we may just add ARRAY_TYPE_STRICT_DOMAIN flag 
specifying that the value must be within the given range. Then we can just
build arrays with strict ranges when we know these are not trailing.

Honza
> 
> Richard.
> 

* tree.c (array_at_struct_end_p): Look through MEM_REF.
Index: tree.c
===
--- tree.c  (revision 236529)
+++ tree.c  (working copy)
@@ -13076,9 +13076,28 @@ array_at_struct_end_p (tree ref)
   ref = TREE_OPERAND (ref, 0);
 }
 
+  tree size = NULL;
+
+  if (TREE_CODE (ref) == MEM_REF
+  && TREE_CODE (TREE_OPERAND (ref, 0)) == ADDR_EXPR)
+{
+  size = TYPE_SIZE (TREE_TYPE (ref));
+  ref = TREE_OPERAND (TREE_OPERAND (ref, 0), 0);
+}
+
   /* If the reference is based on a declared entity, the size of the array
  is constrained by its given domain.  (Do not trust commons PR/69368).  */
   if (DECL_P (ref)
+  /* Be sure the size of MEM_REF target match.  For example:
+
+  char buf[10];
+  struct foo *str = (struct foo *)&buf;
+
+  str->trailin_array[2] = 1;
+
+is valid because BUF allocate enough space.  */
+
+  && (!size || operand_equal_p (DECL_SIZE (ref), size, 0))
   && !(flag_unconstrained_commons
   && TREE_CODE (ref) == VAR_DECL && DECL_COMMON (ref)))
 return false;


[PR 71234] Avoid valgrind warning in ipa-cp

2016-05-23 Thread Martin Jambor
Hi,

ipa_find_agg_cst_for_param can leave from_global_constant as it is
when it returns NULL.  Its user ipa_get_indirect_edge_target_1 then
reads that uninitialized value when it tests whether it should NULLify
the result itself, which was caught by valgrind.

Fixed by the patch below, which checks whether
ipa_find_agg_cst_for_param returned non-NULL before loading
from_global_constant.  I decided to address it here rather than in
ipa_find_agg_cst_for_param because that would require a check that
from_global_constant is not NULL there and because it is consistent
with how by_ref is returned in other functions in ipa-prop.
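
The shape of the bug and of the fix can be shown with a small stand-alone C analogue (a sketch; find_value and lookup_checked are invented names, not GCC functions): the callee only writes its out-parameter on success, so the caller has to test the returned pointer before looking at the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lookup with the same contract as described above: on
   failure it returns NULL and leaves *FROM_GLOBAL untouched.  */
static const int *
find_value (const int *table, size_t n, int key, bool *from_global)
{
  for (size_t i = 0; i < n; i++)
    if (table[i] == key)
      {
        *from_global = (i == 0);   /* illustrative rule only */
        return &table[i];
      }
  return NULL;
}

/* Caller in the style of the fix: the flag is read only when the
   lookup succeeded.  */
static const int *
lookup_checked (const int *table, size_t n, int key)
{
  bool from_global;   /* deliberately left uninitialized */
  const int *t = find_value (table, n, key, &from_global);

  /* Testing T first means the short-circuit keeps from_global unread
     when the lookup failed.  */
  if (t && !from_global)
    t = NULL;
  return t;
}
```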

Bootstrapped and tested on x86_64-linux, I will go ahead and commit it
as obvious.

Martin


2016-05-23  Martin Jambor  

PR ipa/71234
* ipa-cp.c (ipa_get_indirect_edge_target_1): Only check value of
from_global_constant if t is not NULL.
---
 gcc/ipa-cp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 8caa973..4b7f6bb 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -2027,7 +2027,8 @@ ipa_get_indirect_edge_target_1 (struct cgraph_edge *ie,
  ie->indirect_info->offset,
  ie->indirect_info->by_ref,
  &from_global_constant);
- if (!from_global_constant
+ if (t
+ && !from_global_constant
  && !ie->indirect_info->guaranteed_unmodified)
t = NULL_TREE;
}
-- 
2.8.2



Re: "omp declare target" on DECL_EXTERNAL vars

2016-05-23 Thread Alexander Monakov
On Fri, 20 May 2016, Jakub Jelinek wrote:
> but that made me think on what handling do we want for the
> "omp declare target" DECL_EXTERNAL vars.
[snip]
> In the C/C++ FEs, we set not just node->offloadable, but also for
> ENABLE_OFFLOADING g->have_offload and offload_vars too.  Wonder if that
> means we register even non-local vars, that would be IMHO a bug.

(it's unclear to me what you mean by 'non-local vars' here, from the context
it looks like it's 'variables with an external declaration and no definition
in the current TU'; correct?)

Looking at the OpenMP 4.5 spec, there's a requirement that

[2.10.6 declare target directive, Restrictions, C/C++]
* All declarations and definitions for a function must have a declare
target directive if one is specified for any of them. Otherwise, the
result is unspecified.

(why are variables exempted?)

A natural way to conform to that requirement is to have a '#pragma omp declare
target' in the header file declaring the offloaded function. But that means
every TU that includes that header will have g->have_offload set, even if
otherwise it doesn't touch OpenMP at all.

So from that perspective it's undesirable to have 'omp declare target' on
declarations that don't define anything.
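
To illustrate the header scenario, a minimal sketch with made-up names, header and defining TU folded into one file: the first declare target region is what a shared header would contribute to every TU that includes it, even a TU that never uses OpenMP otherwise.  Built without -fopenmp the pragmas are simply ignored and everything runs on the host.

```c
/* --- what would live in a shared header (hypothetical names) --- */
#pragma omp declare target
extern int scale;          /* declaration only, no definition here */
int scaled (int x);
#pragma omp end declare target

/* --- one TU providing the definitions --- */
#pragma omp declare target
int scale = 3;

int
scaled (int x)
{
  return x * scale;
}
#pragma omp end declare target

/* A user of the offloaded function; falls back to the host when no
   offload device (or no OpenMP support) is available.  */
int
run_on_target (int x)
{
  int r = 0;
  #pragma omp target map(tofrom: r)
  r = scaled (x);
  return r;
}
```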

> On the other side, we need to watch for an extern declaration
> of a VAR_DECL marked for offloading and only later on locally defined,
> in that case if we haven't set g->have_offload and added entry to
> offload_vars, we'd need to do it when merging the extern decl with the
> definition.

Yes, but I wonder if setting g->have_offload etc. in the front-ends is the
right thing to do at all.  Shouldn't frontends simply set 'omp declare target'
and leave the rest to omp-low?

Hope that's constructive.
Alexander


[PATCH, i386, AVX-512] Add vectorizer support builtins

2016-05-23 Thread Ilya Verbin
Hi!

This patch adds missing 512-bit rounding builtins for vectorization.
Regtested on x86_64-linux and i686-linux.  OK for trunk?


gcc/
* config/i386/i386-builtin-types.def: Add V16SI_FTYPE_V16SF,
V8DF_FTYPE_V8DF_ROUND, V16SF_FTYPE_V16SF_ROUND, V16SI_FTYPE_V16SF_ROUND.
* config/i386/i386.c (enum ix86_builtins): Add
IX86_BUILTIN_CVTPS2DQ512_MASK, IX86_BUILTIN_FLOORPS512,
IX86_BUILTIN_FLOORPD512, IX86_BUILTIN_CEILPS512, IX86_BUILTIN_CEILPD512,
IX86_BUILTIN_TRUNCPS512, IX86_BUILTIN_TRUNCPD512,
IX86_BUILTIN_CVTPS2DQ512, IX86_BUILTIN_VEC_PACK_SFIX512,
IX86_BUILTIN_FLOORPS_SFIX512, IX86_BUILTIN_CEILPS_SFIX512,
IX86_BUILTIN_ROUNDPS_AZ_SFIX512.
(builtin_description bdesc_args): Add __builtin_ia32_floorps512,
__builtin_ia32_ceilps512, __builtin_ia32_truncps512,
__builtin_ia32_floorpd512, __builtin_ia32_ceilpd512,
__builtin_ia32_truncpd512, __builtin_ia32_cvtps2dq512,
__builtin_ia32_vec_pack_sfix512, __builtin_ia32_roundps_az_sfix512,
__builtin_ia32_floorps_sfix512, __builtin_ia32_ceilps_sfix512.
Change IX86_BUILTIN_CVTPS2DQ512 to IX86_BUILTIN_CVTPS2DQ512_MASK for
__builtin_ia32_cvtps2dq512_mask.
(ix86_expand_args_builtin): Handle V8DF_FTYPE_V8DF_ROUND,
V16SF_FTYPE_V16SF_ROUND, V16SI_FTYPE_V16SF_ROUND, V16SI_FTYPE_V16SF.
(ix86_builtin_vectorized_function): Handle builtins mentioned above.
* config/i386/sse.md
(avx512f_fix_notruncv16sfv16si):
Rename to ...
(avx512f_fix_notruncv16sfv16si): ... this.
(avx512f_cvtpd2dq512): Rename
to ...
(avx512f_cvtpd2dq512): ... this.
(avx512f_vec_pack_sfix_v8df): New define_expand.
(avx512f_roundpd512): Rename to ...
(avx512f_round512): ... this.  Change iterator.
(avx512f_roundps512_sfix): New define_expand.
(round2_sfix): Change iterator.
gcc/testsuite/
* gcc.target/i386/avx512f-ceil-vec-1.c: New test.
* gcc.target/i386/avx512f-ceil-vec-2.c: New test.
* gcc.target/i386/avx512f-ceilf-sfix-vec-1.c: New test.
* gcc.target/i386/avx512f-ceilf-sfix-vec-2.c: New test.
* gcc.target/i386/avx512f-ceilf-vec-1.c: New test.
* gcc.target/i386/avx512f-ceilf-vec-2.c: New test.
* gcc.target/i386/avx512f-floor-vec-1.c: New test.
* gcc.target/i386/avx512f-floor-vec-2.c: New test.
* gcc.target/i386/avx512f-floorf-sfix-vec-1.c: New test.
* gcc.target/i386/avx512f-floorf-sfix-vec-2.c: New test.
* gcc.target/i386/avx512f-floorf-vec-1.c: New test.
* gcc.target/i386/avx512f-floorf-vec-2.c: New test.
* gcc.target/i386/avx512f-rint-sfix-vec-1.c: New test.
* gcc.target/i386/avx512f-rint-sfix-vec-2.c: New test.
* gcc.target/i386/avx512f-rintf-sfix-vec-1.c: New test.
* gcc.target/i386/avx512f-rintf-sfix-vec-2.c: New test.
* gcc.target/i386/avx512f-round-sfix-vec-1.c: New test.
* gcc.target/i386/avx512f-round-sfix-vec-2.c: New test.
* gcc.target/i386/avx512f-roundf-sfix-vec-1.c: New test.
* gcc.target/i386/avx512f-roundf-sfix-vec-2.c: New test.
* gcc.target/i386/avx512f-trunc-vec-1.c: New test.
* gcc.target/i386/avx512f-trunc-vec-2.c: New test.
* gcc.target/i386/avx512f-truncf-vec-1.c: New test.
* gcc.target/i386/avx512f-truncf-vec-2.c: New test.


diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index 75d57d9..c66f651 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -294,6 +294,7 @@ DEF_FUNCTION_TYPE (V8DF, V4DF)
 DEF_FUNCTION_TYPE (V8DF, V2DF)
 DEF_FUNCTION_TYPE (V16SI, V4SI)
 DEF_FUNCTION_TYPE (V16SI, V8SI)
+DEF_FUNCTION_TYPE (V16SI, V16SF)
 DEF_FUNCTION_TYPE (V16SI, V16SI, V16SI, UHI)
 DEF_FUNCTION_TYPE (V8DI, V8DI, V8DI, UQI)
 DEF_FUNCTION_TYPE (V8DI, PV8DI)
@@ -1061,14 +1062,17 @@ DEF_FUNCTION_TYPE (VOID, QI, V8DI, PCINT, INT, INT)
 
 DEF_FUNCTION_TYPE_ALIAS (V2DF_FTYPE_V2DF, ROUND)
 DEF_FUNCTION_TYPE_ALIAS (V4DF_FTYPE_V4DF, ROUND)
+DEF_FUNCTION_TYPE_ALIAS (V8DF_FTYPE_V8DF, ROUND)
 DEF_FUNCTION_TYPE_ALIAS (V4SF_FTYPE_V4SF, ROUND)
 DEF_FUNCTION_TYPE_ALIAS (V8SF_FTYPE_V8SF, ROUND)
+DEF_FUNCTION_TYPE_ALIAS (V16SF_FTYPE_V16SF, ROUND)
 
 DEF_FUNCTION_TYPE_ALIAS (V4SI_FTYPE_V2DF_V2DF, ROUND)
 DEF_FUNCTION_TYPE_ALIAS (V8SI_FTYPE_V4DF_V4DF, ROUND)
 DEF_FUNCTION_TYPE_ALIAS (V16SI_FTYPE_V8DF_V8DF, ROUND)
 DEF_FUNCTION_TYPE_ALIAS (V4SI_FTYPE_V4SF, ROUND)
 DEF_FUNCTION_TYPE_ALIAS (V8SI_FTYPE_V8SF, ROUND)
+DEF_FUNCTION_TYPE_ALIAS (V16SI_FTYPE_V16SF, ROUND)
 
 DEF_FUNCTION_TYPE_ALIAS (INT_FTYPE_V2DF_V2DF, PTEST)
 DEF_FUNCTION_TYPE_ALIAS (INT_FTYPE_V2DI_V2DI, PTEST)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1cb88d6..049a006 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -30935,7 +30935,7 @@ enum ix86_builtins
   IX86_BUILTIN_CVTPD2PS512,
   

Re: [PATCH] Make basic asm implicitly clobber memory

2016-05-23 Thread Jeff Law

On 05/22/2016 04:33 AM, Andrew Haley wrote:

On 05/20/2016 07:50 AM, David Wohlferd wrote:


At a minimum, suddenly forcing an unexpected/unneeded memory clobber
can adversely impact the optimization of surrounding code.  This can
be particularly annoying if the reason for the asm was to improve
performance.  And adding a memory clobber does add a dependency of
sorts, which might cause the location of the asm to shift in an
unfortunate way.  And there's always the long-shot possibility that
some weird quirk or (very) badly-written code will cause the asm to
flat out fail when used with a memory clobber.  And if this change
does produce any of these problems, I feel pity for whoever has to
track it down.


OTOH, if a memory clobber does change code gen it probably changes it
in a way which better fits user expectations, and perhaps it fixes a
bug.  That's a win, and it is far, far more important than any other
consideration.

My thoughts precisely.




I realize deprecation/removal is drastic.  Especially since basic
asm (mostly) works as is.  But fixing memory clobbers while leaving
the rest broken feels like half a solution, meaning that some day
we're going to have to fiddle with this again.


Yes, we will undoubtedly have to fiddle with basic asm again.  We
should plan for deprecation.

Right.  There are some fundamental problems with basic asms and I think 
we want to deprecate them in the long term.  In the immediate/medium 
term, I think addressing the memory dependency issue is the right thing 
to do.


While it may make some code somewhere less optimized, it brings the 
basic asm semantics closer to what most programmers expect and prevents 
them from suddenly breaking as the optimizers continue to improve.  If 
someone wants better optimized code, they ought to be using extended 
asms anyway.
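
For anyone unsure of the distinction, a small GNU C sketch (function names are made up): the first asm is a basic asm with no operand or clobber list, which is the form the proposal would treat as implicitly clobbering memory; the second is an extended asm that states the memory clobber explicitly, which is what new code should use today.

```c
static int flag;

/* Basic asm: no operands and no clobber list, so the compiler is told
   nothing explicit about memory.  */
static int
with_basic_asm (void)
{
  flag = 1;
  __asm__ ("");
  return flag;
}

/* Extended asm with an explicit "memory" clobber: the store to FLAG
   must be complete before the asm and FLAG must be reloaded after it.  */
static int
with_extended_asm (void)
{
  flag = 2;
  __asm__ __volatile__ ("" : : : "memory");
  return flag;
}
```

Both functions return the value just stored; the difference is only in what the optimizers are allowed to assume around the asm.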


Jeff


Re: [Patch wwwdocs] Add aarch64-none-linux-gnu as a primary platform for GCC-7

2016-05-23 Thread Jeff Law

On 05/23/2016 03:26 AM, Richard Biener wrote:

On Mon, May 23, 2016 at 11:16 AM, Ramana Radhakrishnan
 wrote:

Hi,

The Steering Committee has decided to add aarch64-none-linux-gnu as a primary 
platform for GCC-7. This reflects the increasing popularity of the port and the 
increased general availability of hardware. I also took the opportunity of 
creating a GCC-7 criteria page at the same time.

Applied.


Sorry to hijack the thread but I continue to notice that we have
i386-unknown-freebsd as a primary target.  I notice here
the 'i386' (the only primary target still explicitly listing that
sub-target) and the fact that freebsd switched to LLVM as
far as I know.

So I propose to demote -freebsd to secondary and use
i686-unknown-freebsd (or x86_64-unknown-freebsd?).

Gerald, Andreas, can you comment on both issues?  Esp. i386 is putting
quite some burden on libstdc++ and atomics support
for example.

The target may claim i386, but it's actually i486+ so we don't have any 
real issues around atomics.  This came up in 2014.


We can look at demoting FreeBSD again; when we looked at it in 2014 it 
wasn't seen as a particularly useful thing to do.  But I'm open to 
revisiting.


jeff



Re: [PATCH] Introduce can_remove_lhs_p

2016-05-23 Thread Marek Polacek
On Mon, May 23, 2016 at 04:36:30PM +0200, Jakub Jelinek wrote:
> On Mon, May 23, 2016 at 04:28:33PM +0200, Marek Polacek wrote:
> > As promised in ,
> > this is a simple clean-up which makes use of a new predicate.  Richi 
> > suggested
> > adding maybe_drop_lhs_from_noreturn_call which would be nicer, but I didn't
> > know how to do that, given the handling if lhs is an SSA_NAME.
> 
> Shouldn't it be should_remove_lhs_p instead?
> I mean, it is not just an optimization, but part of how we define the IL.
 
Aha, ok.  Renamed.

> Shouldn't it be also used in tree-cfg.c (verify_gimple_call)?

I left that spot on purpose but now I don't quite see why, fixed.  Thanks,

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-05-23  Marek Polacek  

* tree.h (should_remove_lhs_p): New predicate.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Use it.
* gimple-fold.c (gimple_fold_call): Likewise.
* gimplify.c (gimplify_modify_expr): Likewise.
* tree-cfg.c (verify_gimple_call): Likewise.
* tree-cfgcleanup.c (fixup_noreturn_call): Likewise.

diff --git gcc/cgraph.c gcc/cgraph.c
index cf9192f..1a4f665 100644
--- gcc/cgraph.c
+++ gcc/cgraph.c
@@ -1513,10 +1513,7 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 }
 
   /* If the call becomes noreturn, remove the LHS if possible.  */
-  if (lhs
-  && (gimple_call_flags (new_stmt) & ECF_NORETURN)
-  && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST
-  && !TREE_ADDRESSABLE (TREE_TYPE (lhs)))
+  if (gimple_call_noreturn_p (new_stmt) && should_remove_lhs_p (lhs))
 {
   if (TREE_CODE (lhs) == SSA_NAME)
{
diff --git gcc/gimple-fold.c gcc/gimple-fold.c
index 858f484..6b50d43 100644
--- gcc/gimple-fold.c
+++ gcc/gimple-fold.c
@@ -3052,12 +3052,9 @@ gimple_fold_call (gimple_stmt_iterator *gsi, bool 
inplace)
  == void_type_node))
gimple_call_set_fntype (stmt, TREE_TYPE (fndecl));
  /* If the call becomes noreturn, remove the lhs.  */
- if (lhs
- && (gimple_call_flags (stmt) & ECF_NORETURN)
+ if (gimple_call_noreturn_p (stmt)
  && (VOID_TYPE_P (TREE_TYPE (gimple_call_fntype (stmt)))
- || ((TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs)))
-  == INTEGER_CST)
- && !TREE_ADDRESSABLE (TREE_TYPE (lhs)))))
+ || should_remove_lhs_p (lhs)))
{
  if (TREE_CODE (lhs) == SSA_NAME)
{
diff --git gcc/gimplify.c gcc/gimplify.c
index 4a544e3..c77eb51 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -4847,9 +4847,7 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p,
}
}
   notice_special_calls (call_stmt);
-  if (!gimple_call_noreturn_p (call_stmt)
- || TREE_ADDRESSABLE (TREE_TYPE (*to_p))
- || TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (*to_p))) != INTEGER_CST)
+  if (!gimple_call_noreturn_p (call_stmt) || !should_remove_lhs_p (*to_p))
gimple_call_set_lhs (call_stmt, *to_p);
   else if (TREE_CODE (*to_p) == SSA_NAME)
/* The above is somewhat premature, avoid ICEing later for a
diff --git gcc/tree-cfg.c gcc/tree-cfg.c
index 7c2ee78..82f0da6c 100644
--- gcc/tree-cfg.c
+++ gcc/tree-cfg.c
@@ -3385,11 +3385,9 @@ verify_gimple_call (gcall *stmt)
   return true;
 }
 
-  if (lhs
-  && gimple_call_ctrl_altering_p (stmt)
+  if (gimple_call_ctrl_altering_p (stmt)
   && gimple_call_noreturn_p (stmt)
-  && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST
-  && !TREE_ADDRESSABLE (TREE_TYPE (lhs)))
+  && should_remove_lhs_p (lhs))
 {
   error ("LHS in noreturn call");
   return true;
diff --git gcc/tree-cfgcleanup.c gcc/tree-cfgcleanup.c
index 46d0fa3..4134c38 100644
--- gcc/tree-cfgcleanup.c
+++ gcc/tree-cfgcleanup.c
@@ -604,8 +604,7 @@ fixup_noreturn_call (gimple *stmt)
  temporaries of variable-sized types is not supported.  Also don't
  do this with TREE_ADDRESSABLE types, as assign_temp will abort.  */
   tree lhs = gimple_call_lhs (stmt);
-  if (lhs && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST
-  && !TREE_ADDRESSABLE (TREE_TYPE (lhs)))
+  if (should_remove_lhs_p (lhs))
 {
   gimple_call_set_lhs (stmt, NULL_TREE);
 
diff --git gcc/tree.h gcc/tree.h
index 2510d16..1d72437 100644
--- gcc/tree.h
+++ gcc/tree.h
@@ -5471,4 +5471,14 @@ desired_pro_or_demotion_p (const_tree to_type, 
const_tree from_type)
   return to_type_precision <= TYPE_PRECISION (from_type);
 }
 
+/* Return true if the LHS of a call should be removed.  */
+
+inline bool
+should_remove_lhs_p (tree lhs)
+{
+  return (lhs
+ && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST
+

[Ping ^ 2] Re: [ARM] Add support for overflow add, sub, and neg operations

2016-05-23 Thread Michael Collison


Ping. Previous Patch posted here:

https://gcc.gnu.org/ml/gcc-patches/2016-03/msg01472.html



[C++ Patch] PR 70972 ("[6/7 Regression] Inheriting constructors taking parameters by value should move them, not copy")

2016-05-23 Thread Paolo Carlini

Hi,

admittedly I didn't spend much time on this issue, but submitter himself 
provided a strong hint and a fix may be very simple... Essentially, he 
noticed that the implementation of forward_parm, being kind of a 
std::forward for internal uses, misses a cp_build_reference_type (*, true) 
when handling the testcases at issue. Adding it indeed fixes the 
testcases (both the dg-do compile and the dg-do run). Beyond submitter's 
hint, we can't just unconditionally call cp_build_reference_type however, for 
example cpp0x/inh-ctor16.C would regress: that doesn't surprise me much, 
when I look at other helper functions in method.c... Then, is the below 
good enough? Tested x86_64-linux.


Thanks,
Paolo.
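
For readers without method.c at hand, the effect of casting a by-value parameter to an rvalue reference before passing it on can be sketched in plain C++. This is only an analogy: Tracker, Base, Derived and the counters are invented names, and the hand-written constructor stands in for what the inherited constructor is expected to do.

```cpp
// Hypothetical counters, reset by copies_when_forwarding().
int copies = 0;
int moves = 0;

// Records how often it is copied or moved.
struct Tracker {
  Tracker() = default;
  Tracker(const Tracker&) { ++copies; }
  Tracker(Tracker&&) noexcept { ++moves; }
};

struct Base {
  explicit Base(Tracker) {}
};

// Hand-written stand-in for an inherited constructor: the by-value
// parameter is cast to an rvalue reference before being passed on,
// so Base's parameter is move-constructed rather than copied.
struct Derived : Base {
  explicit Derived(Tracker t) : Base(static_cast<Tracker&&>(t)) {}
};

// Constructs a Derived from a temporary and reports the copy count.
int copies_when_forwarding()
{
  copies = 0;
  moves = 0;
  Derived d{Tracker{}};
  (void) d;
  return copies;
}
```

Dropping the static_cast in Derived would pass t as an lvalue and select the copy constructor instead, which is the behaviour the PR complains about.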

//
/cp
2016-05-23

PR c++/70972
* method.c (forward_parm): Use cp_build_reference_type.

/testsuite
2016-05-23

PR c++/70972
* g++.dg/cpp0x/inh-ctor20.C: New.
* g++.dg/cpp0x/inh-ctor21.C: Likewise.
Index: cp/method.c
===
--- cp/method.c (revision 236581)
+++ cp/method.c (working copy)
@@ -484,6 +484,8 @@ forward_parm (tree parm)
   tree type = TREE_TYPE (parm);
   if (DECL_PACK_P (parm))
 type = PACK_EXPANSION_PATTERN (type);
+  if (TREE_CODE (type) != REFERENCE_TYPE)
+type = cp_build_reference_type (type, /*rval=*/true);
   exp = build_static_cast (type, exp, tf_warning_or_error);
   if (DECL_PACK_P (parm))
 exp = make_pack_expansion (exp);
Index: testsuite/g++.dg/cpp0x/inh-ctor20.C
===
--- testsuite/g++.dg/cpp0x/inh-ctor20.C (revision 0)
+++ testsuite/g++.dg/cpp0x/inh-ctor20.C (working copy)
@@ -0,0 +1,16 @@
+// PR c++/70972
+// { dg-do compile { target c++11 } }
+
+struct moveonly {
+moveonly(moveonly&&) = default;
+moveonly() = default;
+};
+
+struct A {
+A(moveonly) {}
+};
+struct B : A {
+using A::A;
+};
+
+B b(moveonly{});
Index: testsuite/g++.dg/cpp0x/inh-ctor21.C
===
--- testsuite/g++.dg/cpp0x/inh-ctor21.C (revision 0)
+++ testsuite/g++.dg/cpp0x/inh-ctor21.C (working copy)
@@ -0,0 +1,19 @@
+// PR c++/70972
+// { dg-do run { target c++11 } }
+
+struct abort_on_copy{
+abort_on_copy(abort_on_copy&&) = default;
+abort_on_copy(const abort_on_copy&) { __builtin_abort(); }
+abort_on_copy() = default;
+};
+
+struct A {
+A(abort_on_copy) {}
+};
+struct B : A {
+using A::A;
+};
+
+int main() {
+B b(abort_on_copy{});
+}


Re: [C++ Patch] PR 69095

2016-05-23 Thread Paolo Carlini

Hi,

On 23/05/2016 15:32, Jason Merrill wrote:

On 05/22/2016 02:26 PM, Paolo Carlini wrote:

finally sending a patch for this issue. As noticed by submitter himself,
it appears to boil down to a rather straightforward case of not
rejecting unexpanded parameter packs in default arguments. In order to
handle all the combinations (in/out of class, template
parameter/function parameter) I added calls of
check_for_bare_parameter_packs both to cp_parser_default_argument and
cp_parser_late_parsing_default_args


Hmm, would it make sense to check in cp_parser_initializer?

Oh yes. The below is already past g++.dg/tm...

Thanks!
Paolo.
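
As a point of comparison for the new diagnostic, here is a small sketch of an accepted spelling, where the pack is consumed by sizeof... instead of appearing unexpanded in the default argument. The names count_extra, none and two are invented here and are not part of the testcase.

```cpp
// The pack Args is used through sizeof..., which needs no expansion,
// so the default argument is well-formed.  Writing sizeof(Args) here
// instead would leave the pack unexpanded.
template <typename Ret, typename... Args>
unsigned count_extra(Ret, unsigned n = sizeof...(Args))
{
  return n;
}

// Args cannot be deduced from the call, so it is empty unless the
// caller names the extra types explicitly.
unsigned none() { return count_extra(0); }
unsigned two()  { return count_extra<int, char, long>(0); }
```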


Index: cp/parser.c
===
--- cp/parser.c (revision 236592)
+++ cp/parser.c (working copy)
@@ -20800,6 +20800,9 @@ cp_parser_initializer (cp_parser* parser, bool* is
   init = error_mark_node;
 }
 
+  if (check_for_bare_parameter_packs (init))
+init = error_mark_node;
+
   return init;
 }
 
Index: testsuite/g++.dg/cpp0x/variadic168.C
===
--- testsuite/g++.dg/cpp0x/variadic168.C(revision 0)
+++ testsuite/g++.dg/cpp0x/variadic168.C(working copy)
@@ -0,0 +1,18 @@
+// PR c++/69095
+// { dg-do compile { target c++11 } }
+
+struct B1 {
+  template  // { 
dg-error "parameter packs not expanded" }
+  void insert(Ret);
+};
+
+struct B2 {
+  template 
+  void insert(Ret, unsigned = sizeof(Args)); // { dg-error "parameter packs 
not expanded" }
+};
+
+template  // { 
dg-error "parameter packs not expanded" }
+void insert1(Ret);
+
+template 
+void insert2(Ret, unsigned = sizeof(Args)); // { dg-error "parameter packs not 
expanded" }


Re: [PATCH][Testsuite] Force testing of vectorized builtins rather than inlined i387 asm

2016-05-23 Thread Ilya Verbin
On Sat, May 21, 2016 at 09:51:36 +0200, Uros Bizjak wrote:
> On Fri, May 20, 2016 at 8:01 PM, Ilya Verbin  wrote:
> > In some cases the i387 version of a math function may be inlined from 
> > math.h,
> > and the testcase (like gcc.target/i386/sse4_1-ceil-vec.c) will actually test
> > inlined asm instead of vectorized builtin.  To fix this I've created a new 
> > file
> > gcc.dg/mathfunc.h (similar to gcc.dg/strlenopt.h) and changed vectorization
> > tests so that they include it instead of math.h.
> > Regtested on x86_64-linux and i686-linux.  Is it OK for trunk?
> 
> No, please just #define __NO_MATH_INLINES before math.h is included.
> This will solve unwanted inlining.

Thanks for the hint.  I'll check-in this patch tomorrow.
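
For reference, the pattern added to each test can be sketched as below. It assumes a glibc-style math.h that honours __NO_MATH_INLINES; with other C libraries the define is simply unused. The helper name round_up is made up for this sketch.

```cpp
// Must appear before the first inclusion of math.h; once the header
// has been processed the define no longer has any effect.
#define __NO_MATH_INLINES
#include <math.h>

// With the header's inline variants suppressed, this stays an
// ordinary call (or compiler builtin) to ceil.
double round_up(double x)
{
  return ceil(x);
}
```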


gcc/testsuite/
* gcc.target/i386/avx-ceil-sfix-2-vec.c: Define __NO_MATH_INLINES before
math.h is included.
* gcc.target/i386/avx-floor-sfix-2-vec.c: Likewise.
* gcc.target/i386/avx-rint-sfix-2-vec.c: Likewise.
* gcc.target/i386/avx-round-sfix-2-vec.c: Likewise.
* gcc.target/i386/avx512f-ceil-sfix-vec-1.c: Likewise.
* gcc.target/i386/avx512f-floor-sfix-vec-1.c: Likewise.
* gcc.target/i386/sse4_1-ceil-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-ceil-vec.c: Likewise.
* gcc.target/i386/sse4_1-ceilf-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-ceilf-vec.c: Likewise.
* gcc.target/i386/sse4_1-floor-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-floor-vec.c: Likewise.
* gcc.target/i386/sse4_1-rint-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-rint-vec.c: Likewise.
* gcc.target/i386/sse4_1-rintf-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-rintf-vec.c: Likewise.
* gcc.target/i386/sse4_1-round-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-round-vec.c: Likewise.
* gcc.target/i386/sse4_1-roundf-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-roundf-vec.c: Likewise.
* gcc.target/i386/sse4_1-trunc-vec.c: Likewise.
* gcc.target/i386/sse4_1-truncf-vec.c: Likewise.
* gcc.target/i386/sse4_1-floorf-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-floorf-vec.c: Likewise.


diff --git a/gcc/testsuite/gcc.target/i386/avx-ceil-sfix-2-vec.c 
b/gcc/testsuite/gcc.target/i386/avx-ceil-sfix-2-vec.c
index bf48b80..45b7af7 100644
--- a/gcc/testsuite/gcc.target/i386/avx-ceil-sfix-2-vec.c
+++ b/gcc/testsuite/gcc.target/i386/avx-ceil-sfix-2-vec.c
@@ -13,6 +13,7 @@
 
 #include CHECK_H
 
+#define __NO_MATH_INLINES
 #include <math.h>
 
 extern double ceil (double);
diff --git a/gcc/testsuite/gcc.target/i386/avx-floor-sfix-2-vec.c 
b/gcc/testsuite/gcc.target/i386/avx-floor-sfix-2-vec.c
index 275199c..0a28c76 100644
--- a/gcc/testsuite/gcc.target/i386/avx-floor-sfix-2-vec.c
+++ b/gcc/testsuite/gcc.target/i386/avx-floor-sfix-2-vec.c
@@ -13,6 +13,7 @@
 
 #include CHECK_H
 
+#define __NO_MATH_INLINES
 #include <math.h>
 
 extern double floor (double);
diff --git a/gcc/testsuite/gcc.target/i386/avx-rint-sfix-2-vec.c 
b/gcc/testsuite/gcc.target/i386/avx-rint-sfix-2-vec.c
index 9f273af..e6c47b8 100644
--- a/gcc/testsuite/gcc.target/i386/avx-rint-sfix-2-vec.c
+++ b/gcc/testsuite/gcc.target/i386/avx-rint-sfix-2-vec.c
@@ -13,6 +13,7 @@
 
 #include CHECK_H
 
+#define __NO_MATH_INLINES
 #include <math.h>
 
 extern double rint (double);
diff --git a/gcc/testsuite/gcc.target/i386/avx-round-sfix-2-vec.c 
b/gcc/testsuite/gcc.target/i386/avx-round-sfix-2-vec.c
index ddb46d9..dc0a7db 100644
--- a/gcc/testsuite/gcc.target/i386/avx-round-sfix-2-vec.c
+++ b/gcc/testsuite/gcc.target/i386/avx-round-sfix-2-vec.c
@@ -13,6 +13,7 @@
 
 #include CHECK_H
 
+#define __NO_MATH_INLINES
 #include <math.h>
 
 extern double round (double);
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-ceil-sfix-vec-1.c 
b/gcc/testsuite/gcc.target/i386/avx512f-ceil-sfix-vec-1.c
index 038d25e..d7d6916 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-ceil-sfix-vec-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-ceil-sfix-vec-1.c
@@ -3,6 +3,7 @@
 /* { dg-require-effective-target avx512f } */
 /* { dg-skip-if "no M_PI" { vxworks_kernel } } */
 
+#define __NO_MATH_INLINES
 #include <math.h>
 #include "avx512f-check.h"
 
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-floor-sfix-vec-1.c 
b/gcc/testsuite/gcc.target/i386/avx512f-floor-sfix-vec-1.c
index fab7e65..b46ea9f 100644
--- a/gcc/testsuite/gcc.target/i386/avx512f-floor-sfix-vec-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-floor-sfix-vec-1.c
@@ -3,6 +3,7 @@
 /* { dg-require-effective-target avx512f } */
 /* { dg-skip-if "no M_PI" { vxworks_kernel } } */
 
+#define __NO_MATH_INLINES
 #include <math.h>
 #include "avx512f-check.h"
 
diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-ceil-sfix-vec.c 
b/gcc/testsuite/gcc.target/i386/sse4_1-ceil-sfix-vec.c
index ca07d9c..bb32c8d 100644
--- a/gcc/testsuite/gcc.target/i386/sse4_1-ceil-sfix-vec.c
+++ b/gcc/testsuite/gcc.target/i386/sse4_1-ceil-sfix-vec.c
@@ -13,6 +13,7 @@
 
 #include CHECK_H
 

[PR libffi/65567] libffi: Fix, and simply libffi_feature_test (was: [PATCH] libffi testsuite: Use split to ensure valid tcl list)

2016-05-23 Thread Thomas Schwinge
Hi!

On Thu, 25 Feb 2016 20:10:18 +0100, I wrote:
> On Sat, 28 Mar 2015 13:59:30 -0400, John David Anglin  
> wrote:
> > The attached change fixes tcl errors that occur running the complex.exp and 
> > go.exp test sets.
> > See: .
> > 
> > Tested on hppa2.0w-hp-hpux11.11.  Okay for trunk?
> 
> (Got approved, and installed as r221765.)
> 
> > 2015-03-28  John David Anglin  
> > 
> > PR libffi/65567
> > * testsuite/lib/libffi.exp (libffi_feature_test): Use split to ensure
> > lindex is applied to a list.
> > 
> > Index: testsuite/lib/libffi.exp
> > ===
> > --- testsuite/lib/libffi.exp(revision 221591)
> > +++ testsuite/lib/libffi.exp(working copy)
> > @@ -238,7 +239,7 @@
> >  set lines [libffi_target_compile $src "" "preprocess" ""]
> >  file delete $src
> >  
> > -set last [lindex $lines end]
> > +set last [lindex [split $lines] end]
> >  return [regexp -- "xyzzy" $last]
> >  }
> 
> On my several systems, this has the effect that any user of
> libffi_feature_test has their test results regress from PASS to
> UNSUPPORTED.  Apparently the regexp xyzzy matching doesn't work as
> intended.  If I revert your patch, it's OK for me -- but still not for
> you, I suppose.  ;-)
> 
> How about the following instead?  It's conceptually simpler (and similar
> to what other such tests are doing), works for me -- but can you also
> please test this?

Committed to trunk in r236594:

commit 84c1a54dea2e39c012a11b49a898ed3773d3587b
Author: tschwinge 
Date:   Mon May 23 14:54:04 2016 +

[PR libffi/65567] libffi: Fix, and simply libffi_feature_test

libffi/
PR libffi/65567
* testsuite/lib/libffi.exp (libffi_feature_test): Fix, and simply.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@236594 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libffi/ChangeLog|5 +
 libffi/testsuite/lib/libffi.exp |   11 ++-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git libffi/ChangeLog libffi/ChangeLog
index 5027680..8245f5b 100644
--- libffi/ChangeLog
+++ libffi/ChangeLog
@@ -1,3 +1,8 @@
+2016-05-23  Thomas Schwinge  
+
+   PR libffi/65567
+   * testsuite/lib/libffi.exp (libffi_feature_test): Fix, and simply.
+
 2016-03-17  Andreas Schwab  
 
* src/aarch64/ffitarget.h (FFI_SIZEOF_JAVA_RAW) [__ILP32__]:
diff --git libffi/testsuite/lib/libffi.exp libffi/testsuite/lib/libffi.exp
index 169fe74..a0f6ab3 100644
--- libffi/testsuite/lib/libffi.exp
+++ libffi/testsuite/lib/libffi.exp
@@ -227,20 +227,21 @@ proc libffi_target_compile { source dest type options } {
 
 # TEST should be a preprocessor condition.  Returns true if it holds.
 proc libffi_feature_test { test } {
-set src "ffitest.c"
+set src "ffitest[pid].c"
 
 set f [open $src "w"]
 puts $f "#include "
 puts $f $test
-puts $f "xyzzy"
+puts $f "/* OK */"
+puts $f "#else"
+puts $f "# error Failed $test"
 puts $f "#endif"
 close $f
 
-set lines [libffi_target_compile $src "" "preprocess" ""]
+set lines [libffi_target_compile $src /dev/null assembly ""]
 file delete $src
 
-set last [lindex [split $lines] end]
-return [regexp -- "xyzzy" $last]
+return [string match "" $lines]
 }
 
 # Utility routines.
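
The idea behind the rewritten check is that the probe file either compiles cleanly or stops at #error, so the caller only has to look for empty compiler output. A self-contained sketch of such a probe follows; HYPOTHETICAL_FEATURE stands in for whatever condition is passed as TEST, and the header include is left out so that the snippet stands alone.

```cpp
// Stand-in for a macro that the real header might define.
#define HYPOTHETICAL_FEATURE 1

// TEST would be substituted here, for example an #ifdef on a
// feature macro.
#ifdef HYPOTHETICAL_FEATURE
/* OK */
#else
# error Failed HYPOTHETICAL_FEATURE
#endif

// Only reached when the condition held; otherwise compilation stops
// at the #error and the diagnostic is what the caller sees.
bool probe_compiled() { return true; }
```

Compared with grepping preprocessed output for a marker string, this does not depend on how the compiler output is split into lines or words.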


Backported to gcc-6-branch in r236595:

commit eb4cfb9897caae0af36c9bc496a104c9f8aa11c6
Author: tschwinge 
Date:   Mon May 23 15:00:41 2016 +

[PR libffi/65567] libffi: Fix, and simply libffi_feature_test

Backport trunk r236594:

libffi/
PR libffi/65567
* testsuite/lib/libffi.exp (libffi_feature_test): Fix, and simply.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-6-branch@236595 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libffi/ChangeLog|7 +++
 libffi/testsuite/lib/libffi.exp |   11 ++-
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git libffi/ChangeLog libffi/ChangeLog
index e00113c..c851366 100644
--- libffi/ChangeLog
+++ libffi/ChangeLog
@@ -1,3 +1,10 @@
+2016-05-23  Thomas Schwinge  
+
+   Backport trunk r236594:
+
+   PR libffi/65567
+   * testsuite/lib/libffi.exp (libffi_feature_test): Fix, and simply.
+
 2016-04-27  Release Manager
 
* GCC 6.1.0 released.
diff --git libffi/testsuite/lib/libffi.exp libffi/testsuite/lib/libffi.exp
index 169fe74..a0f6ab3 100644
--- libffi/testsuite/lib/libffi.exp
+++ libffi/testsuite/lib/libffi.exp
@@ -227,20 +227,21 @@ proc libffi_target_compile { source dest type options } {
 
 # TEST should be a preprocessor condition.  Returns true if it holds.
 proc libffi_feature_test { test } {
-set src 

Re: "omp declare target" on DECL_EXTERNAL vars

2016-05-23 Thread Jakub Jelinek
On Mon, May 23, 2016 at 05:37:17PM +0300, Alexander Monakov wrote:
> Hello,
> 
> On Fri, 20 May 2016, Jakub Jelinek wrote:
> [snip]
> > The reason I needed the above is that both gimplify.c and omp-low.c
> > test just the node->offloadable flag, not the attribute, and so when
> > it is external and the flag wasn't set, we could privatize the vars
> > even when we were supposed to map them etc.
> > In the C/C++ FEs, we set not just node->offloadable, but also
> > for ENABLE_OFFLOADING g->have_offload and offload_vars too.
> > Wonder if that means we register even non-local vars, that would be IMHO a
> > bug.  On the other side, we need to watch for an extern declaration
> > of a VAR_DECL marked for offloading and only later on locally defined,
> > in that case if we haven't set g->have_offload and added entry to
> > offload_vars, we'd need to do it when merging the extern decl with the
> > definition.
> > 
> > So, your thoughts on that?
> 
> As I'm relatively late to this game, at times like this it's hard for me to
> follow what the general model is. It appears that 'omp declare target' is
> superfluous given symtab_node::offloadable. Offloading compilation still needs
> to distinguish target region entry points from the rest of the functions
> (hence 'omp target entrypoint' serves a clear purpose), but does plain 'omp
> declare target' have a particular meaning not conveyed by
> symtab_node::offloadable && !'omp target entrypoint'?

"omp declare target" and "omp declare target link" attributes are FE
representation, symtab_node::offloadable is ME representation.
We have just one bit in the latter right now, so e.g. it does not
differentiate between the two kinds of offloadable vars.  In the C/C++ FE,
we set the offloadable bit right away next to the creation of the attribute,
in the Fortran FE we don't (and not sure if it is even safe to create symtab
node at that point yet).

> Is/should be there an invariant like "when omp-low is completed, all decls
> annotated with 'omp declare target' will also have symtab_node::offloadable
> set"?

Without my patch to gomp4.5 branch, that invariant didn't hold for Fortran
DECL_EXTERNAL vars.  With the patch it holds, but we need to come to
agreement what behavior we do want for DECL_EXTERNAL vars.

Jakub
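
To illustrate the source-level situation being discussed (an extern declaration marked for offloading that only later gets a definition), here is a small sketch. The variable and function names are invented, and without -fopenmp the pragmas are simply ignored, so it says nothing about how the front ends set node->offloadable.

```cpp
// Extern declaration marked for offloading; at this point the
// variable has no definition in the translation unit.
#pragma omp declare target
extern int offload_counter;
int read_counter();
#pragma omp end declare target

// The definition arrives later in the same translation unit.
#pragma omp declare target
int offload_counter = 42;
int read_counter() { return offload_counter; }
#pragma omp end declare target
```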


Re: "omp declare target" on DECL_EXTERNAL vars

2016-05-23 Thread Alexander Monakov
Hello,

On Fri, 20 May 2016, Jakub Jelinek wrote:
[snip]
> The reason I needed the above is that both gimplify.c and omp-low.c
> test just the node->offloadable flag, not the attribute, and so when
> it is external and the flag wasn't set, we could privatize the vars
> even when we were supposed to map them etc.
> In the C/C++ FEs, we set not just node->offloadable, but also
> for ENABLE_OFFLOADING g->have_offload and offload_vars too.
> Wonder if that means we register even non-local vars, that would be IMHO a
> bug.  On the other side, we need to watch for an extern declaration
> of a VAR_DECL marked for offloading and only later on locally defined,
> in that case if we haven't set g->have_offload and added entry to
> offload_vars, we'd need to do it when merging the extern decl with the
> definition.
> 
> So, your thoughts on that?

As I'm relatively late to this game, at times like this it's hard for me to
follow what the general model is. It appears that 'omp declare target' is
superfluous given symtab_node::offloadable. Offloading compilation still needs
to distinguish target region entry points from the rest of the functions
(hence 'omp target entrypoint' serves a clear purpose), but does plain 'omp
declare target' have a particular meaning not conveyed by
symtab_node::offloadable && !'omp target entrypoint'?

Is/should be there an invariant like "when omp-low is completed, all decls
annotated with 'omp declare target' will also have symtab_node::offloadable
set"?

Thanks.
Alexander


Re: [PATCH] Introduce can_remove_lhs_p

2016-05-23 Thread Jakub Jelinek
On Mon, May 23, 2016 at 04:28:33PM +0200, Marek Polacek wrote:
> As promised in ,
> this is a simple clean-up which makes use of a new predicate.  Richi suggested
> adding maybe_drop_lhs_from_noreturn_call which would be nicer, but I didn't
> know how to do that, given the handling if lhs is an SSA_NAME.

Shouldn't it be should_remove_lhs_p instead?
I mean, it is not just an optimization, but part of how we define the IL.

Shouldn't it be also used in tree-cfg.c (verify_gimple_call)?

> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2016-05-23  Marek Polacek  
> 
>   * tree.h (can_remove_lhs_p): New predicate.
>   * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Use it.
>   * gimple-fold.c (gimple_fold_call): Likewise.
>   * gimplify.c (gimplify_modify_expr): Likewise.
>   * tree-cfgcleanup.c (fixup_noreturn_call): Likewise.

Jakub


Re: [PATCH GCC]A latent alignment bug in tree-ssa-address.c

2016-05-23 Thread Richard Biener
On Mon, May 23, 2016 at 4:29 PM, Bin.Cheng  wrote:
> On Mon, May 23, 2016 at 3:16 PM, Richard Biener
>  wrote:
>> On Mon, May 23, 2016 at 3:23 PM, Bin Cheng  wrote:
>>> Hi,
>>> When working on PR69710, I ran into this latent bug in which alignment 
>>> information is wrongly updated for pointer variables.  It results in memory 
>>> exceptions on x86_64 after patch for PR69710.  Scenario is that 
>>> copy_ref_info tries to update base's alignment in TARGET_MEM_REF[base + 
>>> index << step].  But case with NULL TMR_STEP (which implies the step is 1) 
>>> is not handled here.  This patch fixes the bug by simply checking NULL 
>>> TMR_STEP.  The conditions about TMR_STEP could be relaxed if TMR_INDEX is 
>>> an induction variable which has aligned initial value and step.  But that 
>>> needs non-trivial code refactoring since copy_ref_info is used by different 
>>> parts of compiler.
>>>
>>> Bootstrap and test on x86_64.  Is it OK?
>>
>> I think it is ok but having !TMR_STEP when TMR_INDEX is non-NULL is
>> IMHO bad and we should enforce
>> that this doesn't happen in TMR verification in
>> tree-cfg.c:verify_types_in_gimple_reference.  I also notice
>> that if TMR_INDEX is NULL then TMR_STEP being NULL shouldn't block
> Yeah, patch updated slightly to handle this case.

Ok.

Thanks,
Richard.

> Thanks,
> bin
>> alignment transfer (in that case
>> TMR_INDEX2 is likely non-NULL though).
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> bin
>>>
>>> 2016-05-20 Bin Cheng  
>>>
>>> * tree-ssa-address.c (copy_ref_info): Check null TMR_STEP.


[gomp4] backport the new sincos pattern

2016-05-23 Thread Cesar Philippidis
I've committed this patch to backport the new sincos pattern, which is
enabled with -ffast-math, to gomp-4_0-branch.

Cesar

2016-05-23  Cesar Philippidis  

	gcc/
	* config/nvptx/nvptx.md (sincossf3): New pattern.

	gcc/testsuite/
	* gcc.target/nvptx/sincos.c: New test.


diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 33a4862..e48412d 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -794,6 +794,17 @@
   ""
   "%.\\tsqrt%#%t0\\t%0, %1;")
 
+(define_expand "sincossf3"
+  [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_operand:SF 2 "nvptx_register_operand" "R")]
+	   UNSPEC_COS))
+   (set (match_operand:SF 1 "nvptx_register_operand" "=R")
+	(unspec:SF [(match_dup 2)] UNSPEC_SIN))]
+  "flag_unsafe_math_optimizations"
+{
+  operands[2] = make_safe_from (operands[2], operands[0]);
+})
+
 (define_insn "sinsf2"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
 	(unspec:SF [(match_operand:SF 1 "nvptx_register_operand" "R")]
diff --git a/gcc/testsuite/gcc.target/nvptx/sincos.c b/gcc/testsuite/gcc.target/nvptx/sincos.c
new file mode 100644
index 000..921ec41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/sincos.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern float sinf (float);
+extern float cosf (float);
+
+float
+sincos_add (float x)
+{
+  float s = sinf (x);
+  float c = cosf (x);
+
+  return s + c;
+}
+
+/* { dg-final { scan-assembler-times "sin.approx.f32" 1 } } */
+/* { dg-final { scan-assembler-times "cos.approx.f32" 1 } } */


Re: [PATCH GCC]A latent alignment bug in tree-ssa-address.c

2016-05-23 Thread Bin.Cheng
On Mon, May 23, 2016 at 3:16 PM, Richard Biener
 wrote:
> On Mon, May 23, 2016 at 3:23 PM, Bin Cheng  wrote:
>> Hi,
>> When working on PR69710, I ran into this latent bug in which alignment 
>> information is wrongly updated for pointer variables.  It results in memory 
>> exceptions on x86_64 after patch for PR69710.  Scenario is that 
>> copy_ref_info tries to update base's alignment in TARGET_MEM_REF[base + 
>> index << step].  But case with NULL TMR_STEP (which implies the step is 1) 
>> is not handled here.  This patch fixes the bug by simply checking NULL 
>> TMR_STEP.  The conditions about TMR_STEP could be relaxed if TMR_INDEX is an 
>> induction variable which has aligned initial value and step.  But that needs 
>> non-trivial code refactoring since copy_ref_info is used by different parts 
>> of compiler.
>>
>> Bootstrap and test on x86_64.  Is it OK?
>
> I think it is ok but having !TMR_STEP when TMR_INDEX is non-NULL is
> IMHO bad and we should enforce
> that this doesn't happen in TMR verification in
> tree-cfg.c:verify_types_in_gimple_reference.  I also notice
> that if TMR_INDEX is NULL then TMR_STEP being NULL shouldn't block
Yeah, patch updated slightly to handle this case.

Thanks,
bin
> alignment transfer (in that case
> TMR_INDEX2 is likely non-NULL though).
>
> Thanks,
> Richard.
>
>> Thanks,
>> bin
>>
>> 2016-05-20 Bin Cheng  
>>
>> * tree-ssa-address.c (copy_ref_info): Check null TMR_STEP.
diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
index 9e49f3d..b04545c 100644
--- a/gcc/tree-ssa-address.c
+++ b/gcc/tree-ssa-address.c
@@ -877,6 +877,10 @@ copy_ref_info (tree new_ref, tree old_ref)
  && TREE_CODE (old_ref) == MEM_REF
  && !(TREE_CODE (new_ref) == TARGET_MEM_REF
   && (TMR_INDEX2 (new_ref)
+  /* TODO: Below conditions can be relaxed if TMR_INDEX
+ is an induction variable and its initial value and
+ step are aligned.  */
+  || (TMR_INDEX (new_ref) && !TMR_STEP (new_ref))
   || (TMR_STEP (new_ref)
   && (TREE_INT_CST_LOW (TMR_STEP (new_ref))
   < align)
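
The patched condition can be modelled outside the compiler roughly as follows. This is a toy sketch: TmrModel and may_transfer_alignment are invented names, and a missing step is represented by a has_step flag rather than a NULL tree.

```cpp
// Toy stand-in for the TARGET_MEM_REF fields the check looks at.
struct TmrModel {
  bool has_index2;      // TMR_INDEX2 present
  bool has_index;       // TMR_INDEX present
  bool has_step;        // TMR_STEP present (false models NULL)
  unsigned long step;   // value of TMR_STEP when has_step is true
};

// Mirrors the patched test: alignment info is not transferred when
// there is a second index, when an index has no explicit step (an
// implied step of 1), or when the step is smaller than the alignment.
bool may_transfer_alignment(const TmrModel &ref, unsigned long align)
{
  if (ref.has_index2)
    return false;
  if (ref.has_index && !ref.has_step)
    return false;
  if (ref.has_step && ref.step < align)
    return false;
  return true;
}
```

Note that a reference with neither index nor step still passes, which matches the remark that a NULL TMR_STEP should not block the transfer when TMR_INDEX is NULL as well.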


[PATCH] Introduce can_remove_lhs_p

2016-05-23 Thread Marek Polacek
As promised in ,
this is a simple clean-up which makes use of a new predicate.  Richi suggested
adding maybe_drop_lhs_from_noreturn_call which would be nicer, but I didn't
know how to do that, given the handling if lhs is an SSA_NAME.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-05-23  Marek Polacek  

* tree.h (can_remove_lhs_p): New predicate.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Use it.
* gimple-fold.c (gimple_fold_call): Likewise.
* gimplify.c (gimplify_modify_expr): Likewise.
* tree-cfgcleanup.c (fixup_noreturn_call): Likewise.

diff --git gcc/cgraph.c gcc/cgraph.c
index cf9192f..a9efa39 100644
--- gcc/cgraph.c
+++ gcc/cgraph.c
@@ -1513,10 +1513,7 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 }
 
   /* If the call becomes noreturn, remove the LHS if possible.  */
-  if (lhs
-  && (gimple_call_flags (new_stmt) & ECF_NORETURN)
-  && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST
-  && !TREE_ADDRESSABLE (TREE_TYPE (lhs)))
+  if (gimple_call_noreturn_p (new_stmt) && can_remove_lhs_p (lhs))
 {
   if (TREE_CODE (lhs) == SSA_NAME)
{
diff --git gcc/gimple-fold.c gcc/gimple-fold.c
index 858f484..37c7d4e 100644
--- gcc/gimple-fold.c
+++ gcc/gimple-fold.c
@@ -3052,12 +3052,9 @@ gimple_fold_call (gimple_stmt_iterator *gsi, bool 
inplace)
  == void_type_node))
gimple_call_set_fntype (stmt, TREE_TYPE (fndecl));
  /* If the call becomes noreturn, remove the lhs.  */
- if (lhs
- && (gimple_call_flags (stmt) & ECF_NORETURN)
+ if (gimple_call_noreturn_p (stmt)
  && (VOID_TYPE_P (TREE_TYPE (gimple_call_fntype (stmt)))
- || ((TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs)))
-  == INTEGER_CST)
- && !TREE_ADDRESSABLE (TREE_TYPE (lhs)))))
+ || can_remove_lhs_p (lhs)))
{
  if (TREE_CODE (lhs) == SSA_NAME)
{
diff --git gcc/gimplify.c gcc/gimplify.c
index 4a544e3..a5977f2 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -4847,9 +4847,7 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p,
}
}
   notice_special_calls (call_stmt);
-  if (!gimple_call_noreturn_p (call_stmt)
- || TREE_ADDRESSABLE (TREE_TYPE (*to_p))
- || TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (*to_p))) != INTEGER_CST)
+  if (!gimple_call_noreturn_p (call_stmt) || !can_remove_lhs_p (*to_p))
gimple_call_set_lhs (call_stmt, *to_p);
   else if (TREE_CODE (*to_p) == SSA_NAME)
/* The above is somewhat premature, avoid ICEing later for a
diff --git gcc/tree-cfgcleanup.c gcc/tree-cfgcleanup.c
index 46d0fa3..9ca4447 100644
--- gcc/tree-cfgcleanup.c
+++ gcc/tree-cfgcleanup.c
@@ -604,8 +604,7 @@ fixup_noreturn_call (gimple *stmt)
  temporaries of variable-sized types is not supported.  Also don't
  do this with TREE_ADDRESSABLE types, as assign_temp will abort.  */
   tree lhs = gimple_call_lhs (stmt);
-  if (lhs && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST
-  && !TREE_ADDRESSABLE (TREE_TYPE (lhs)))
+  if (can_remove_lhs_p (lhs))
 {
   gimple_call_set_lhs (stmt, NULL_TREE);
 
diff --git gcc/tree.h gcc/tree.h
index 2510d16..c217510 100644
--- gcc/tree.h
+++ gcc/tree.h
@@ -5471,4 +5471,14 @@ desired_pro_or_demotion_p (const_tree to_type, 
const_tree from_type)
   return to_type_precision <= TYPE_PRECISION (from_type);
 }
 
+/* Return true if the LHS of a call can be removed.  */
+
+inline bool
+can_remove_lhs_p (tree lhs)
+{
+  return (lhs
+ && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (lhs))) == INTEGER_CST
+ && !TREE_ADDRESSABLE (TREE_TYPE (lhs)));
+}
+
 #endif  /* GCC_TREE_H  */

Marek
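
For anyone skimming without a GCC tree at hand, the shape of the refactoring can be modelled with toy types. ToyType, ToyLhs, drop_lhs and the field names below are invented for this sketch; only the three-part condition follows the patch.

```cpp
// Toy stand-ins for the tree accessors used by the predicate.
struct ToyType {
  bool constant_size;   // models TYPE_SIZE_UNIT being an INTEGER_CST
  bool addressable;     // models TREE_ADDRESSABLE
};

struct ToyLhs {
  ToyType type;
};

// Same three conditions as the new predicate: there is an LHS, its
// type has a constant size, and the type is not addressable.
inline bool can_remove_lhs_p(const ToyLhs *lhs)
{
  return lhs && lhs->type.constant_size && !lhs->type.addressable;
}

// A caller then only combines the predicate with the noreturn test.
inline bool drop_lhs(bool call_is_noreturn, const ToyLhs *lhs)
{
  return call_is_noreturn && can_remove_lhs_p(lhs);
}
```

Folding the null check into the predicate is what lets the call sites drop their own "lhs &&" tests.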


[wwwdocs] Add another entry to gcc-6/porting_to.html

2016-05-23 Thread Jonathan Wakely

I found a couple of build failures in Fedora packages due to stinky
code like:

 if (file != NULL)

This fails due to the removal of basic_ios::operator void*() in C++11,
which is already documented on the porting_to page, but the symptoms
are sufficiently different to deserve documenting.
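
For illustration, a minimal sketch of the replacement idiom. The helper names `stream_ok` and `read_int` are made up for this example and do not come from the affected packages:

```cpp
#include <istream>
#include <sstream>

// Illustrative helper (hypothetical name): report whether a stream is
// usable by converting the stream itself to bool.
bool stream_ok (std::istream &in)
{
  // Older code often wrote `in != NULL` or `in != 0`, which relied on
  // basic_ios::operator void*().  With that conversion gone in C++11,
  // the comparison no longer compiles, but an explicit conversion does.
  return static_cast<bool> (in);
}

// Read one integer; returns false if the extraction failed.
bool read_int (std::istream &in, int &out)
{
  in >> out;
  return stream_ok (in);
}
```

Inside an `if` condition the cast is not needed, since `if (in)` performs the same contextual conversion.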

Committed to CVS.


Index: htdocs/gcc-6/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v
retrieving revision 1.22
diff -u -r1.22 porting_to.html
--- htdocs/gcc-6/porting_to.html	12 Apr 2016 04:46:19 -	1.22
+++ htdocs/gcc-6/porting_to.html	23 May 2016 14:22:37 -
@@ -111,6 +111,21 @@
 return static_cast<bool>(os);
 
 
+No match for 'operator!=' (operand types are 'std::ifstream' and 'int')
+
+
+The change to iostream classes also affects code that tries to check for stream
+errors by comparing to NULL or 0.
+Such code should be changed to simply test the stream directly, instead of
+comparing it to a null pointer:
+
+
+
+  if (file) {   // not if (file != NULL), or if (file != 0)
+...
+  }
+
+
 Lvalue required as left operand of assignment with complex numbers
 
 


Re: [PATCH] Fix PR70434, change FE IL for vector indexing

2016-05-23 Thread Richard Biener
On Mon, 23 May 2016, Jason Merrill wrote:

> On 05/23/2016 05:19 AM, Richard Biener wrote:
> > * c-common.c (convert_vector_to_pointer_for_subscript): Use a
> > VIEW_CONVERT_EXPR to an array type.
> 
> Since we aren't converting to pointer type anymore, the function name should
> probably change, or at least the comment.
> 
> OK with that adjustment.

Good point - I've renamed it to convert_vector_to_array_for_subscript,
see below.

Richard.

2016-05-23  Richard Biener  

PR middle-end/70434
PR c/69504
c-family/
* c-common.h (convert_vector_to_pointer_for_subscript): Rename to ...
(convert_vector_to_array_for_subscript): ... this.
* c-common.c (convert_vector_to_pointer_for_subscript): Use a
VIEW_CONVERT_EXPR to an array type.  Rename to ...
(convert_vector_to_array_for_subscript): ... this.

cp/
* expr.c (mark_exp_read): Handle VIEW_CONVERT_EXPR.
* constexpr.c (cxx_eval_array_reference): Handle indexed
vectors.
* typeck.c (cp_build_array_ref): Adjust.

c/
* c-typeck.c (build_array_ref): Do not complain about indexing
non-lvalue vectors.  Adjust for function name change.

* tree-ssa.c (non_rewritable_mem_ref_base): Make sure to mark
bases which are accessed with non-invariant indices.
* gimple-fold.c (maybe_canonicalize_mem_ref_addr): Re-write
constant index ARRAY_REFs of vectors into BIT_FIELD_REFs.

* c-c++-common/vector-subscript-4.c: New testcase.
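
As a rough illustration of the source-level construct this affects (this is not the content of the new testcase; the type and function names are invented, and a 4-byte int is assumed), vector subscripting with the GNU vector extension looks like:

```cpp
// GNU vector extension: four ints in a 16-byte vector (assumes 4-byte int).
typedef int v4si __attribute__ ((vector_size (16)));

// Subscript an lvalue vector with a variable index.
int lane (v4si v, int i)
{
  return v[i];
}

// Subscript a non-lvalue vector: the element-wise sum is a temporary.
int sum_lane (v4si a, v4si b, int i)
{
  return (a + b)[i];
}
```

The second function is the "indexing non-lvalue vectors" case mentioned in the c/ ChangeLog entry.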

Index: gcc/c-family/c-common.c
===
*** gcc/c-family/c-common.c.orig2016-05-23 11:18:50.966861914 +0200
--- gcc/c-family/c-common.c 2016-05-23 16:18:47.31760 +0200
*** build_userdef_literal (tree suffix_id, t
*** 12496,12561 
return literal;
  }
  
! /* For vector[index], convert the vector to a
!pointer of the underlying type.  Return true if the resulting
!ARRAY_REF should not be an lvalue.  */
  
  bool
! convert_vector_to_pointer_for_subscript (location_t loc,
!tree *vecp, tree index)
  {
bool ret = false;
if (VECTOR_TYPE_P (TREE_TYPE (*vecp)))
  {
tree type = TREE_TYPE (*vecp);
-   tree type1;
  
ret = !lvalue_p (*vecp);
if (TREE_CODE (index) == INTEGER_CST)
  if (!tree_fits_uhwi_p (index)
  || tree_to_uhwi (index) >= TYPE_VECTOR_SUBPARTS (type))
warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
  
!   if (ret)
!   {
! tree tmp = create_tmp_var_raw (type);
! DECL_SOURCE_LOCATION (tmp) = loc;
! *vecp = c_save_expr (*vecp);
! if (TREE_CODE (*vecp) == C_MAYBE_CONST_EXPR)
!   {
! bool non_const = C_MAYBE_CONST_EXPR_NON_CONST (*vecp);
! *vecp = C_MAYBE_CONST_EXPR_EXPR (*vecp);
! *vecp
!   = c_wrap_maybe_const (build4 (TARGET_EXPR, type, tmp,
! *vecp, NULL_TREE, NULL_TREE),
! non_const);
!   }
! else
!   *vecp = build4 (TARGET_EXPR, type, tmp, *vecp,
!   NULL_TREE, NULL_TREE);
! SET_EXPR_LOCATION (*vecp, loc);
! c_common_mark_addressable_vec (tmp);
!   }
!   else
!   c_common_mark_addressable_vec (*vecp);
!   type = build_qualified_type (TREE_TYPE (type), TYPE_QUALS (type));
!   type1 = build_pointer_type (TREE_TYPE (*vecp));
!   bool ref_all = TYPE_REF_CAN_ALIAS_ALL (type1);
!   if (!ref_all
! && !DECL_P (*vecp))
!   {
! /* If the original vector isn't declared may_alias and it
!isn't a bare vector look if the subscripting would
!alias the vector we subscript, and if not, force ref-all.  */
! alias_set_type vecset = get_alias_set (*vecp);
! alias_set_type sset = get_alias_set (type);
! if (!alias_sets_must_conflict_p (sset, vecset)
! && !alias_set_subset_of (sset, vecset))
!   ref_all = true;
!   }
!   type = build_pointer_type_for_mode (type, ptr_mode, ref_all);
!   *vecp = build1 (ADDR_EXPR, type1, *vecp);
!   *vecp = convert (type, *vecp);
  }
return ret;
  }
--- 12496,12529 
return literal;
  }
  
! /* For vector[index], convert the vector to an array of the underlying type.
!Return true if the resulting ARRAY_REF should not be an lvalue.  */
  
  bool
! convert_vector_to_array_for_subscript (location_t loc,
!  tree *vecp, tree index)
  {
bool ret = false;
if (VECTOR_TYPE_P (TREE_TYPE (*vecp)))
  {
tree type = TREE_TYPE (*vecp);
  
ret = !lvalue_p (*vecp);
+ 
if (TREE_CODE (index) == INTEGER_CST)
  if (!tree_fits_uhwi_p (index)
  || tree_to_uhwi (index) 

Re: [PATCH GCC]A latent alignment bug in tree-ssa-address.c

2016-05-23 Thread Richard Biener
On Mon, May 23, 2016 at 3:23 PM, Bin Cheng  wrote:
> Hi,
> When working on PR69710, I ran into this latent bug in which alignment 
> information is wrongly updated for pointer variables.  It results in memory 
> exceptions on x86_64 after patch for PR69710.  Scenario is that copy_ref_info 
> tries to update base's alignment in TARGET_MEM_REF[base + index << step].  
> But case with NULL TMR_STEP (which implies the step is 1) is not handled 
> here.  This patch fixes the bug by simply checking NULL TMR_STEP.  The 
> conditions about TMR_STEP could be relaxed if TMR_INDEX is an induction 
> variable which has aligned initial value and step.  But that needs 
> non-trivial code refactoring since copy_ref_info is used by different parts 
> of the compiler.
>
> Bootstrap and test on x86_64.  Is it OK?

I think it is ok but having !TMR_STEP when TMR_INDEX is non-NULL is
IMHO bad and we should enforce
that this doesn't happen in TMR verification in
tree-cfg.c:verify_types_in_gimple_reference.  I also notice
that if TMR_INDEX is NULL then TMR_STEP being NULL shouldn't block
alignment transfer (in that case
TMR_INDEX2 is likely non-NULL though).

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-05-20 Bin Cheng  
>
> * tree-ssa-address.c (copy_ref_info): Check null TMR_STEP.


Re: [PATCH 2/2][GCC] Add one more pattern to RTL if-conversion

2016-05-23 Thread Kyrill Tkachov

Hi Mikhail,

On 23/05/16 15:00, Mikhail Maltsev wrote:

This patch adds a new if-conversion pattern for the following case:

   if (test) x = A; else x = B;

   A and B are constants, abs(A - B) == 2^N, A != 0, B != 0


Bootstrapped and regtested on x86_64-linux. OK for trunk?



@@ -1453,6 +1460,19 @@ noce_try_store_flag_constants (struct noce_if_info 
*if_info)
gen_int_mode (ifalse, mode), if_info->x,
0, OPTAB_WIDEN);
  break;
+   case ST_SHIFT_ADD_FLAG:
+ {
+   /* if (test) x = 5; else x = 1;
+  =>   x = (test != 0) << 2 + 1;  */
+   HOST_WIDE_INT diff_log = exact_log2 (abs_hwi (diff));
+   rtx diff_rtx
+ = expand_simple_binop (mode, ASHIFT, target, GEN_INT (diff_log),
+if_info->x, 0, OPTAB_WIDEN);
+   target = expand_simple_binop (mode, (diff < 0) ? MINUS : PLUS,
+ gen_int_mode (ifalse, mode), diff_rtx,
+ if_info->x, 0, OPTAB_WIDEN);
+   break;
+ }

expand_simple_binop may fail. I think you should add a check that diff_rtx is
non-NULL and bail out early if it is not.

Kyrill



Re: [PATCH][RTL ifcvt] PR rtl-optimization/66940: Avoid signed overflow in noce_get_alt_condition

2016-05-23 Thread Richard Biener
On Mon, May 23, 2016 at 3:19 PM, Kyrill Tkachov
 wrote:
>
> On 23/05/16 13:46, Richard Biener wrote:
>>
>> On Mon, May 23, 2016 at 2:28 PM, Kyrill Tkachov
>>  wrote:
>>>
>>> On 23/05/16 12:27, Richard Biener wrote:

 On Mon, May 23, 2016 at 1:17 PM, Kyrill Tkachov
  wrote:
>
> Hi all,
>
> In this PR we end up hitting a signed overflow in
> noce_get_alt_condition
> when we try to
> increment or decrement a HOST_WIDE_INT that might be HOST_WIDE_INT_MAX
> or
> HOST_WIDE_INT_MIN.
>
> I've confirmed the overflow by adding an assert before the operation:
> gcc_assert (desired_val != HOST_WIDE_INT_MAX);
>
> This patch fixes those cases by catching the cases when desired_val has
> the
> extreme
> value and avoids the transformation that function is trying to make.
>
> Bootstrapped and tested on arm, aarch64, x86_64.
>
> I've added the testcase that I used to trigger the assert mentioned
> above
> as
> a compile test,
> though I'm not sure how much value it has...
>
> Ok for trunk?

 If this isn't also a wrong-code issue (runtime testcase?) then why not
 perform
 the operation in unsigned HOST_WIDE_INT instead?
>>>
>>>
>>> This part of the code transforms a comparison "x < CST" to "x <= CST - 1"
>>> and similar transformations. From what I understand the LT,LE,GT,GE RTL
>>> comparison
>>> operators operate on signed integers, so I'm not sure how valid it would
>>> be
>>> to do all this on unsigned HOST_WIDE_INT.
>>
>> But then this is a wrong-code issue and you should see miscompiles
>> and thus can add a dg-do run testcase instead?
>
>
> I couldn't get it to miscompile anything, because the check:
> "actual_val == desired_val + 1"  where desired_val + 1 has signed
> overflow doesn't return true, so the transformation doesn't happen anyway.
> I think whether a miscompilation can occur depends on whether the compiler
> used
> to compile GCC itself does anything funky with the undefined behaviour
> that's
> occurring, which is why we should fix it. I suppose the testcase in this
> patch only
> goes so far to show that GCC doesn't crash, but not much else. I can make it
> an execute
> testcase if you'd like, but I can't get it to fail on my setup (the
> generated assembly
> looks correct on inspection).

Please make it an execute testcase anyway.

Ok with that change.

Thanks,
Richard.

> Kyrill
>
>
> Kyrill
>
>
>> Richard.
>>
>>> Thanks,
>>> Kyrill
>>>
>>>
>>>
 Richard.

> Thanks,
> Kyrill
>
> 2016-05-23  Kyrylo Tkachov  
>
>   PR rtl-optimization/66940
>   * ifcvt.c (noce_get_alt_condition): Check that incrementing or
>   decrementing desired_val will not overflow before performing
> these
>   operations.
>
> 2016-05-23  Kyrylo Tkachov  
>
>   PR rtl-optimization/66940
>   * gcc.c-torture/compile/pr66940.c: New test.
>>>
>>>
>


[PATCH 2/2][GCC] Add one more pattern to RTL if-conversion

2016-05-23 Thread Mikhail Maltsev
This patch adds a new if-conversion pattern for the following case:

  if (test) x = A; else x = B;

  A and B are constants, abs(A - B) == 2^N, A != 0, B != 0
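
The arithmetic behind the pattern can be sketched in plain C++ as below. This only models the idea and is not the RTL expansion; `exact_log2_sketch` and `select_shift_add` are hypothetical names, and the sketch assumes A - B does not overflow:

```cpp
// Returns N if v == 2^N, otherwise -1 (stand-in for GCC's exact_log2).
int exact_log2_sketch (long long v)
{
  if (v <= 0 || (v & (v - 1)) != 0)
    return -1;
  int n = 0;
  while ((v >> n) != 1)
    ++n;
  return n;
}

// Computes `test ? a : b` as b +/- (flag << N) when |a - b| == 2^N.
long long select_shift_add (bool test, long long a, long long b)
{
  long long diff = a - b;                  // assumed not to overflow
  long long mag = diff < 0 ? -diff : diff;
  int n = exact_log2_sketch (mag);
  if (n < 0)
    return test ? a : b;                   // pattern does not apply
  long long flag = test ? 1 : 0;           // the store-flag value
  long long shifted = flag << n;           // 0 or |a - b|
  return diff < 0 ? b - shifted : b + shifted;
}
```

For A = 5 and B = 1 this yields 1 + ((test != 0) << 2), with the shift evaluated before the addition.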


Bootstrapped and regtested on x86_64-linux. OK for trunk?

-- 
Regards,
Mikhail Maltsev

gcc/testsuite/ChangeLog:

2016-05-23  Mikhail Maltsev  

* gcc.dg/ifcvt-6.c: New test.


gcc/ChangeLog:

2016-05-23  Mikhail Maltsev  

* ifcvt.c (noce_try_store_flag_constants): Add new pattern.
From 32ef17083d1ca6222e4befb1e1d8bae42d71db3b Mon Sep 17 00:00:00 2001
From: Mikhail Maltsev 
Date: Thu, 12 May 2016 15:23:03 +0300
Subject: [PATCH 2/2] Add ifcvt pattern


diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index a9c146b..f06b05d 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1260,6 +1260,7 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
   {
 ST_ADD_FLAG,
 ST_SHIFT_FLAG,
+ST_SHIFT_ADD_FLAG,
 ST_IOR_FLAG
   };
 
@@ -1384,6 +1385,12 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 	  normalize = -1;
 	  reversep = true;
 	}
+  else if (exact_log2 (abs_hwi (diff)) >= 0
+	   && (STORE_FLAG_VALUE == 1 || if_info->branch_cost >= 2))
+	{
+	  strategy = ST_SHIFT_ADD_FLAG;
+	  normalize = 1;
+	}
   else
 	return FALSE;
 
@@ -1453,6 +1460,19 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 	gen_int_mode (ifalse, mode), if_info->x,
 	0, OPTAB_WIDEN);
 	  break;
+	case ST_SHIFT_ADD_FLAG:
+	  {
+	/* if (test) x = 5; else x = 1;
+	   =>   x = (test != 0) << 2 + 1;  */
+	HOST_WIDE_INT diff_log = exact_log2 (abs_hwi (diff));
+	rtx diff_rtx
+	  = expand_simple_binop (mode, ASHIFT, target, GEN_INT (diff_log),
+ if_info->x, 0, OPTAB_WIDEN);
+	target = expand_simple_binop (mode, (diff < 0) ? MINUS : PLUS,
+	  gen_int_mode (ifalse, mode), diff_rtx,
+	  if_info->x, 0, OPTAB_WIDEN);
+	break;
+	  }
 	}
 
   if (! target)
diff --git a/gcc/testsuite/gcc.dg/ifcvt-6.c b/gcc/testsuite/gcc.dg/ifcvt-6.c
new file mode 100644
index 000..c2cfb17
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ifcvt-6.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-fdump-rtl-ce1 -O2" } */
+
+int
+test1 (int a)
+{
+  return a % 2 != 0 ? 7 : 3;
+}
+
+/* { dg-final { scan-rtl-dump "3 true changes made" "ce1" } } */
+/* { dg-final { scan-assembler-not "sbbl" } } */
-- 
2.1.4



[PATCH 1/2][GCC] Refactor noce_try_store_flag_constants

2016-05-23 Thread Mikhail Maltsev
This patch refactors 'noce_try_store_flag_constants' a bit to make it easier to
reason about.

The function contains two series of conditions, and each branch of the second
series corresponds to one or two branches of the first series. The patch
introduces a new enumeration strategy_t instead and uses it to select the
correct branch. Also, ISTM that the last 'else' branch is unreachable.

Bootstrapped and regtested on x86_64-linux. OK for trunk?

-- 
Regards,
Mikhail Maltsev

gcc/ChangeLog:

2016-05-23  Mikhail Maltsev  

* ifcvt.c (noce_try_store_flag_constants): Refactor.

From 847ba5ac9194273c8b51839cfda86bbc399847f4 Mon Sep 17 00:00:00 2001
From: Mikhail Maltsev 
Date: Tue, 10 May 2016 22:53:26 +0300
Subject: [PATCH 1/2] Refactor noce_try_store_flag_constants


diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 4949965..a9c146b 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1256,13 +1256,14 @@ noce_try_inverse_constants (struct noce_if_info *if_info)
 static int
 noce_try_store_flag_constants (struct noce_if_info *if_info)
 {
-  rtx target;
-  rtx_insn *seq;
-  bool reversep;
-  HOST_WIDE_INT itrue, ifalse, diff, tmp;
-  int normalize;
-  bool can_reverse;
-  machine_mode mode = GET_MODE (if_info->x);;
+  enum strategy_t
+  {
+ST_ADD_FLAG,
+ST_SHIFT_FLAG,
+ST_IOR_FLAG
+  };
+
+  machine_mode mode = GET_MODE (if_info->x);
   rtx common = NULL_RTX;
 
   rtx a = if_info->a;
@@ -1292,11 +1293,11 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
   if (CONST_INT_P (a)
   && CONST_INT_P (b))
 {
-  ifalse = INTVAL (a);
-  itrue = INTVAL (b);
+  HOST_WIDE_INT ifalse = INTVAL (a);
+  HOST_WIDE_INT itrue = INTVAL (b);
   bool subtract_flag_p = false;
 
-  diff = (unsigned HOST_WIDE_INT) itrue - ifalse;
+  HOST_WIDE_INT diff = (unsigned HOST_WIDE_INT) itrue - ifalse;
   /* Make sure we can represent the difference between the two values.  */
   if ((diff > 0)
 	  != ((ifalse < 0) != (itrue < 0) ? ifalse < 0 : ifalse < itrue))
@@ -1304,12 +1305,15 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 
   diff = trunc_int_for_mode (diff, mode);
 
-  can_reverse = (reversed_comparison_code (if_info->cond, if_info->jump)
-		 != UNKNOWN);
+  bool can_reverse
+	= (reversed_comparison_code (if_info->cond, if_info->jump) != UNKNOWN);
 
-  reversep = false;
+  bool reversep = false;
+  int normalize;
+  strategy_t strategy;
   if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE)
 	{
+	  strategy = ST_ADD_FLAG;
 	  normalize = 0;
 	  /* We could collapse these cases but it is easier to follow the
 	 diff/STORE_FLAG_VALUE combinations when they are listed
@@ -1355,20 +1359,28 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
   else if (ifalse == 0 && exact_log2 (itrue) >= 0
 	   && (STORE_FLAG_VALUE == 1
 		   || if_info->branch_cost >= 2))
-	normalize = 1;
+	{
+	  strategy = ST_SHIFT_FLAG;
+	  normalize = 1;
+	}
   else if (itrue == 0 && exact_log2 (ifalse) >= 0 && can_reverse
 	   && (STORE_FLAG_VALUE == 1 || if_info->branch_cost >= 2))
 	{
+	  strategy = ST_SHIFT_FLAG;
 	  normalize = 1;
 	  reversep = true;
 	}
   else if (itrue == -1
 	   && (STORE_FLAG_VALUE == -1
 		   || if_info->branch_cost >= 2))
-	normalize = -1;
+	{
+	  strategy = ST_IOR_FLAG;
+	  normalize = -1;
+	}
   else if (ifalse == -1 && can_reverse
 	   && (STORE_FLAG_VALUE == -1 || if_info->branch_cost >= 2))
 	{
+	  strategy = ST_IOR_FLAG;
 	  normalize = -1;
 	  reversep = true;
 	}
@@ -1391,59 +1403,56 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
 	  noce_emit_move_insn (common, if_info->x);
 	}
 
-  target = noce_emit_store_flag (if_info, if_info->x, reversep, normalize);
+  rtx target
+	= noce_emit_store_flag (if_info, if_info->x, reversep, normalize);
   if (! target)
 	{
 	  end_sequence ();
 	  return FALSE;
 	}
 
-  /* if (test) x = 3; else x = 4;
-	 =>   x = 3 + (test == 0);  */
-  if (diff == STORE_FLAG_VALUE || diff == -STORE_FLAG_VALUE)
+  if (common && strategy != ST_ADD_FLAG)
+	{
+	  /* Not beneficial when the original A and B are PLUS expressions.  */
+	  end_sequence ();
+	  return false;
+	}
+
+  switch (strategy)
 	{
+	case ST_ADD_FLAG:
+	  /* if (test) x = 3; else x = 4;
+	 =>   x = 3 + (test == 0);  */
+
 	  /* Add the common part now.  This may allow combine to merge this
 	 with the store flag operation earlier into some sort of conditional
 	 increment/decrement if the target allows it.  */
 	  if (common)
-	target = expand_simple_binop (mode, PLUS,
-	   target, common,
-	   target, 0, OPTAB_WIDEN);
+	target = expand_simple_binop (mode, PLUS, target, common, target, 0,
+	  OPTAB_WIDEN);
 
 	  /* Always use ifalse here.  It should have been swapped with itrue
 	 when appropriate when reversep is true.  */
 	  target = 

[PATCH 0/2][GCC] Add one more pattern to RTL if-conversion

2016-05-23 Thread Mikhail Maltsev
Hi all!

Currently GCC generates rather bad code for the following test case:

int test(int a)
{
  return a % 2 != 0 ? 4 : 2;
}

The code looks like this:

test:
    andl    $1, %edi
    cmpl    $1, %edi
    sbbl    %eax, %eax
    andl    $-2, %eax
    addl    $4, %eax
    ret

Clang seems to generate optimal code:

test:
    andl    $1, %edi
    leal    2(%rdi,%rdi), %eax
    retq

After applying this series of 2 patches GCC generates:

test:
    movl    %edi, %eax
    andl    $1, %eax
    leal    2(%rax,%rax), %eax
    ret

-- 
Regards,
Mikhail Maltsev


Re: [C++ Patch] PR 69095

2016-05-23 Thread Jason Merrill

On 05/22/2016 02:26 PM, Paolo Carlini wrote:

finally sending a patch for this issue. As noticed by submitter himself,
it appears to boil down to a rather straightforward case of not
rejecting unexpanded parameter packs in default arguments. In order to
handle all the combinations (in/out of class, template
parameter/function parameter) I added calls of
check_for_bare_parameter_packs both to cp_parser_default_argument and
cp_parser_late_parsing_default_args


Hmm, would it make sense to check in cp_parser_initializer?

Jason



Re: [C++ Patch] Improve check_for_bare_parameter_packs location

2016-05-23 Thread Jason Merrill

On 05/22/2016 11:28 AM, Paolo Carlini wrote:

+  location_t loc = EXPR_LOC_OR_LOC (t, input_location);


Hmm, we can get types here as well, but I guess that works fine.  OK.

Jason



[PATCH GCC]A latent alignment bug in tree-ssa-address.c

2016-05-23 Thread Bin Cheng
Hi,
When working on PR69710, I ran into this latent bug in which alignment 
information is wrongly updated for pointer variables.  It results in memory 
exceptions on x86_64 after patch for PR69710.  Scenario is that copy_ref_info 
tries to update base's alignment in TARGET_MEM_REF[base + index << step].  But 
case with NULL TMR_STEP (which implies the step is 1) is not handled here.  
This patch fixes the bug by simply checking NULL TMR_STEP.  The conditions 
about TMR_STEP could be relaxed if TMR_INDEX is an induction variable which has 
aligned initial value and step.  But that needs non-trivial code refactoring 
since copy_ref_info is used by different parts of the compiler.
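
A simplified model of the patched condition, for illustration only. `TmrModel` and `may_keep_base_alignment` are hypothetical names, and only the TARGET_MEM_REF part of the real test in copy_ref_info is modelled:

```cpp
// Simplified stand-in for the TARGET_MEM_REF fields the check inspects.
struct TmrModel
{
  bool has_index2;     // TMR_INDEX2 is non-NULL
  bool has_step;       // TMR_STEP is non-NULL
  unsigned long step;  // value of TMR_STEP, meaningful if has_step
};

// True if the base's alignment info may be kept for the new reference.
bool may_keep_base_alignment (const TmrModel &tmr, unsigned long align)
{
  if (tmr.has_index2)
    return false;
  if (!tmr.has_step)          // the new check: NULL step means step 1
    return false;
  return tmr.step >= align;   // existing check: step must not be < align
}
```

With a NULL step the index can move the address by any number of bytes, so nothing can be concluded about the base.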

Bootstrap and test on x86_64.  Is it OK?

Thanks,
bin

2016-05-20 Bin Cheng  

* tree-ssa-address.c (copy_ref_info): Check null TMR_STEP.

diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
index 9e49f3d..d4ff755 100644
--- a/gcc/tree-ssa-address.c
+++ b/gcc/tree-ssa-address.c
@@ -877,6 +877,10 @@ copy_ref_info (tree new_ref, tree old_ref)
  && TREE_CODE (old_ref) == MEM_REF
  && !(TREE_CODE (new_ref) == TARGET_MEM_REF
   && (TMR_INDEX2 (new_ref)
+  /* TODO: Below conditions can be relaxed if TMR_INDEX
+ is an indcution variable and its initial value and
+ step are aligned.  */
+  || !TMR_STEP (new_ref)
   || (TMR_STEP (new_ref)
   && (TREE_INT_CST_LOW (TMR_STEP (new_ref))
   < align)


Re: [PATCH][RTL ifcvt] PR rtl-optimization/66940: Avoid signed overflow in noce_get_alt_condition

2016-05-23 Thread Kyrill Tkachov


On 23/05/16 13:46, Richard Biener wrote:

On Mon, May 23, 2016 at 2:28 PM, Kyrill Tkachov
 wrote:

On 23/05/16 12:27, Richard Biener wrote:

On Mon, May 23, 2016 at 1:17 PM, Kyrill Tkachov
 wrote:

Hi all,

In this PR we end up hitting a signed overflow in noce_get_alt_condition
when we try to
increment or decrement a HOST_WIDE_INT that might be HOST_WIDE_INT_MAX or
HOST_WIDE_INT_MIN.

I've confirmed the overflow by adding an assert before the operation:
gcc_assert (desired_val != HOST_WIDE_INT_MAX);

This patch fixes those cases by catching the cases when desired_val has
the
extreme
value and avoids the transformation that function is trying to make.

Bootstrapped and tested on arm, aarch64, x86_64.

I've added the testcase that I used to trigger the assert mentioned above
as
a compile test,
though I'm not sure how much value it has...

Ok for trunk?

If this isn't also a wrong-code issue (runtime testcase?) then why not
perform
the operation in unsigned HOST_WIDE_INT instead?


This part of the code transforms a comparison "x < CST" to "x <= CST - 1"
and similar transformations. From what I understand the LT,LE,GT,GE RTL
comparison
operators operate on signed integers, so I'm not sure how valid it would be
to do all this on unsigned HOST_WIDE_INT.

But then this is a wrong-code issue and you should see miscompiles
and thus can add a dg-do run testcase instead?


I couldn't get it to miscompile anything, because the check:
"actual_val == desired_val + 1"  where desired_val + 1 has signed
overflow doesn't return true, so the transformation doesn't happen anyway.
I think whether a miscompilation can occur depends on whether the compiler used
to compile GCC itself does anything funky with the undefined behaviour that's
occurring, which is why we should fix it. I suppose the testcase in this patch 
only
goes so far to show that GCC doesn't crash, but not much else. I can make it an 
execute
testcase if you'd like, but I can't get it to fail on my setup (the generated 
assembly
looks correct on inspection).
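
To make the two options in this thread concrete, here is a sketch using long long in place of HOST_WIDE_INT. The function names are hypothetical and this is not the code in ifcvt.c:

```cpp
#include <climits>

// Comparison done in unsigned arithmetic: no undefined behaviour, but it
// wraps, so LLONG_MIN is reported as the successor of LLONG_MAX.
bool is_successor_wrapping (long long actual, long long desired)
{
  return static_cast<unsigned long long> (actual)
	 == static_cast<unsigned long long> (desired) + 1u;
}

// Guarded comparison: refuse the extreme value before adding, so the
// signed addition can never overflow.
bool is_successor_checked (long long actual, long long desired)
{
  if (desired == LLONG_MAX)
    return false;
  return actual == desired + 1;
}
```

The wrapping version avoids the undefined behaviour but answers "yes" at the boundary, which is not what a signed comparison rewrite wants; the guarded version simply declines there.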

Kyrill


Kyrill


Richard.


Thanks,
Kyrill




Richard.


Thanks,
Kyrill

2016-05-23  Kyrylo Tkachov  

  PR rtl-optimization/66940
  * ifcvt.c (noce_get_alt_condition): Check that incrementing or
  decrementing desired_val will not overflow before performing these
  operations.

2016-05-23  Kyrylo Tkachov  

  PR rtl-optimization/66940
  * gcc.c-torture/compile/pr66940.c: New test.






Re: [PATCH] Fix PR70434, change FE IL for vector indexing

2016-05-23 Thread Jason Merrill

On 05/23/2016 05:19 AM, Richard Biener wrote:

* c-common.c (convert_vector_to_pointer_for_subscript): Use a
VIEW_CONVERT_EXPR to an array type.


Since we aren't converting to pointer type anymore, the function name 
should probably change, or at least the comment.


OK with that adjustment.

Jason



Re: [PATCH] Vectorize inductions that are live after the loop.

2016-05-23 Thread Alan Hayward

Thanks for the review.

On 23/05/2016 11:35, "Richard Biener"  wrote:

>
>@@ -6332,79 +6324,81 @@ vectorizable_live_operation (gimple *stmt,
>   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>-  tree op;
>-  gimple *def_stmt;
>-  ssa_op_iter iter;
>+  imm_use_iterator imm_iter;
>+  tree lhs, lhs_type, vec_lhs;
>+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>+  int nunits = TYPE_VECTOR_SUBPARTS (vectype);
>+  int ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
>+  gimple *use_stmt;
>
>   gcc_assert (STMT_VINFO_LIVE_P (stmt_info));
>
>+  if (STMT_VINFO_TYPE (stmt_info) == reduc_vec_info_type)
>+return true;
>+
>
>This is an odd check - it says the stmt is handled by
>vectorizable_reduction.  And your
>return claims it is handled by vectorizable_live_operation ...

Previously this check was made to decide whether to call
vectorizable_live_operation,
So it made sense to put this check inside the function.

But, yes, I agree that the return value of the function no longer makes
sense.
I can revert this.

>
>You removed the SIMD lane handling?

The SIMD lane handling effectively checked for a special case, then added
code which would extract the final value of the vector.
The new code I’ve added does the exact same thing for more generic cases,
so the SIMD check can be removed and it’ll still be vectorized correctly.

>
>@@ -303,6 +335,16 @@ vect_stmt_relevant_p (gimple *stmt, loop_vec_info
>loop_vinfo,
>}
> }
>
>+  if (*live_p && *relevant == vect_unused_in_scope
>+  && !is_simple_and_all_uses_invariant (stmt, loop_vinfo))
>+{
>+  if (dump_enabled_p ())
>+   dump_printf_loc (MSG_NOTE, vect_location,
>+"vec_stmt_relevant_p: live and not all uses "
>+"invariant.\n");
>+  *relevant = vect_used_only_live;
>+}
>
>But that's a missed invariant motion / code sinking opportunity then.
>Did you have a
>testcase for this?

I don’t have a test case :(
It made sense that this was the correct action to do on the failure
(rather than assert).

>
>@@ -618,57 +660,31 @@ vect_mark_stmts_to_be_vectorized (loop_vec_info
>loop_vinfo)
>}
>
>   /* Examine the USEs of STMT. For each USE, mark the stmt that
>defines it
>-(DEF_STMT) as relevant/irrelevant and live/dead according to the
>-liveness and relevance properties of STMT.  */
>+(DEF_STMT) as relevant/irrelevant according to the relevance
>property
>+of STMT.  */
>   stmt_vinfo = vinfo_for_stmt (stmt);
>   relevant = STMT_VINFO_RELEVANT (stmt_vinfo);
>-  live_p = STMT_VINFO_LIVE_P (stmt_vinfo);
>-
>-  /* Generally, the liveness and relevance properties of STMT are
>-propagated as is to the DEF_STMTs of its USEs:
>- live_p <-- STMT_VINFO_LIVE_P (STMT_VINFO)
>- relevant <-- STMT_VINFO_RELEVANT (STMT_VINFO)
>-
>-One exception is when STMT has been identified as defining a
>reduction
>-variable; in this case we set the liveness/relevance as follows:
>-  live_p = false
>-  relevant = vect_used_by_reduction
>-This is because we distinguish between two kinds of relevant
>stmts -
>-those that are used by a reduction computation, and those that
>are
>-(also) used by a regular computation.  This allows us later on to
>-identify stmts that are used solely by a reduction, and
>therefore the
>-order of the results that they produce does not have to be kept.
> */
>-
>-  def_type = STMT_VINFO_DEF_TYPE (stmt_vinfo);
>-  tmp_relevant = relevant;
>-  switch (def_type)
>+
>+  switch (STMT_VINFO_DEF_TYPE (stmt_vinfo))
> {
>
>you removed this comment.  Is it no longer valid?  Can you please
>instead update it?
>This is a tricky area.

I’ll replace with a new comment.

>
>
>@@ -1310,17 +1325,14 @@ vect_init_vector (gimple *stmt, tree val, tree
>type, gimple_stmt_iterator *gsi)
>In case OP is an invariant or constant, a new stmt that creates a
>vector def
>needs to be introduced.  VECTYPE may be used to specify a required
>type for
>vector invariant.  */
>-
>-tree
>-vect_get_vec_def_for_operand (tree op, gimple *stmt, tree vectype)
>+static tree
>+vect_get_vec_def_for_operand_internal (tree op, gimple *stmt,
>+  loop_vec_info loop_vinfo, tree
>vectype)
> {
>   tree vec_oprnd;
>...
>
>+tree
>+vect_get_vec_def_for_operand (tree op, gimple *stmt, tree vectype)
>+{
>+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt);
>+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
>+  return vect_get_vec_def_for_operand_internal (op, stmt, loop_vinfo,
>vectype);
>+}
>+
>+tree
>+vect_get_vec_def_for_operand_outside (tree op, loop_vec_info loop_vinfo)
>+{
>+  return vect_get_vec_def_for_operand_internal (op, NULL, loop_vinfo,

Re: [PATCH][RTL ifcvt] PR rtl-optimization/66940: Avoid signed overflow in noce_get_alt_condition

2016-05-23 Thread Richard Biener
On Mon, May 23, 2016 at 2:28 PM, Kyrill Tkachov
 wrote:
>
> On 23/05/16 12:27, Richard Biener wrote:
>>
>> On Mon, May 23, 2016 at 1:17 PM, Kyrill Tkachov
>>  wrote:
>>>
>>> Hi all,
>>>
>>> In this PR we end up hitting a signed overflow in noce_get_alt_condition
>>> when we try to
>>> increment or decrement a HOST_WIDE_INT that might be HOST_WIDE_INT_MAX or
>>> HOST_WIDE_INT_MIN.
>>>
>>> I've confirmed the overflow by adding an assert before the operation:
>>> gcc_assert (desired_val != HOST_WIDE_INT_MAX);
>>>
>>> This patch fixes those cases by catching the cases when desired_val has
>>> the
>>> extreme
>>> value and avoids the transformation that function is trying to make.
>>>
>>> Bootstrapped and tested on arm, aarch64, x86_64.
>>>
>>> I've added the testcase that I used to trigger the assert mentioned above
>>> as
>>> a compile test,
>>> though I'm not sure how much value it has...
>>>
>>> Ok for trunk?
>>
>> If this isn't also a wrong-code issue (runtime testcase?) then why not
>> perform
>> the operation in unsigned HOST_WIDE_INT instead?
>
>
> This part of the code transforms a comparison "x < CST" to "x <= CST - 1"
> and similar transformations. From what I understand the LT,LE,GT,GE RTL
> comparison
> operators operate on signed integers, so I'm not sure how valid it would be
> to do all this on unsigned HOST_WIDE_INT.

But then this is a wrong-code issue and you should see miscompiles
and thus can add a dg-do run testcase instead?

Richard.

> Thanks,
> Kyrill
>
>
>
>> Richard.
>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2016-05-23  Kyrylo Tkachov  
>>>
>>>  PR rtl-optimization/66940
>>>  * ifcvt.c (noce_get_alt_condition): Check that incrementing or
>>>  decrementing desired_val will not overflow before performing these
>>>  operations.
>>>
>>> 2016-05-23  Kyrylo Tkachov  
>>>
>>>  PR rtl-optimization/66940
>>>  * gcc.c-torture/compile/pr66940.c: New test.
>
>


Re: [PATCH][RTL ifcvt] PR rtl-optimization/66940: Avoid signed overflow in noce_get_alt_condition

2016-05-23 Thread Kyrill Tkachov


On 23/05/16 12:27, Richard Biener wrote:

On Mon, May 23, 2016 at 1:17 PM, Kyrill Tkachov
 wrote:

Hi all,

In this PR we end up hitting a signed overflow in noce_get_alt_condition
when we try to
increment or decrement a HOST_WIDE_INT that might be HOST_WIDE_INT_MAX or
HOST_WIDE_INT_MIN.

I've confirmed the overflow by adding an assert before the operation:
gcc_assert (desired_val != HOST_WIDE_INT_MAX);

This patch fixes those cases by catching the cases when desired_val has the
extreme
value and avoids the transformation that function is trying to make.

Bootstrapped and tested on arm, aarch64, x86_64.

I've added the testcase that I used to trigger the assert mentioned above as
a compile test,
though I'm not sure how much value it has...

Ok for trunk?

If this isn't also a wrong-code issue (runtime testcase?) then why not perform
the operation in unsigned HOST_WIDE_INT instead?


This part of the code transforms a comparison "x < CST" to "x <= CST - 1"
and similar transformations. From what I understand the LT,LE,GT,GE RTL 
comparison
operators operate on signed integers, so I'm not sure how valid it would be
to do all this on unsigned HOST_WIDE_INT.
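
A sketch of the kind of rewrite meant here, with the overflow case rejected up front. long long stands in for HOST_WIDE_INT, and `lt_to_le` and `gt_to_ge` are made-up names:

```cpp
#include <climits>

// Rewrite `x < cst` as `x <= *bound`.  Returns false when cst - 1 would
// overflow, in which case no rewrite is attempted.
bool lt_to_le (long long cst, long long *bound)
{
  if (cst == LLONG_MIN)
    return false;
  *bound = cst - 1;
  return true;
}

// Rewrite `x > cst` as `x >= *bound`.  Returns false when cst + 1 would
// overflow.
bool gt_to_ge (long long cst, long long *bound)
{
  if (cst == LLONG_MAX)
    return false;
  *bound = cst + 1;
  return true;
}
```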

Thanks,
Kyrill



Richard.


Thanks,
Kyrill

2016-05-23  Kyrylo Tkachov  

 PR rtl-optimization/66940
 * ifcvt.c (noce_get_alt_condition): Check that incrementing or
 decrementing desired_val will not overflow before performing these
 operations.

2016-05-23  Kyrylo Tkachov  

 PR rtl-optimization/66940
 * gcc.c-torture/compile/pr66940.c: New test.




JIT patch: add gcc_jit_magic_int

2016-05-23 Thread Basile Starynkevitch

Hello All,

As I explained in https://gcc.gnu.org/ml/jit/2016-q2/msg00042.html it is 
difficult (or tricky without using dirty tricks involving the GCC plugin 
headers) to use GCCJIT to emit code equivalent to the following C file:


   extern int a;
   int get_atomic_a (void) {
     return __atomic_load_n (&a, __ATOMIC_SEQ_CST);
   }

The issue is that __ATOMIC_SEQ_CST is a magic preprocessor (but non-standard!) 
symbol which might not be available
(or might have a different value) in the C code for GCCJIT building such an AST.

So we need a function to retrieve some magic integral value from the GCCJIT 
compiler.

The attached patch (relative to trunk svn 236583) is a first attempt to solve 
that issue
 (and also give ability to query some other magic numbers).
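
The interface idea (look up an integer by the name of a predefined macro, reporting failure through a flag) can be sketched in standalone C as below. This is only an illustration under assumptions: the names magic_entry, magic_table and lookup_magic_int are hypothetical, and the values come from the compiler that builds this file, which is exactly the limitation described above; the proposed function would instead take them from the GCCJIT compiler itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a macro name and its integer value.  */
struct magic_entry
{
  const char *name;
  int value;
};

/* The __ATOMIC_* macros are predefined by GCC and compatible compilers.  */
static const struct magic_entry magic_table[] = {
  { "__ATOMIC_ACQUIRE", __ATOMIC_ACQUIRE },
  { "__ATOMIC_ACQ_REL", __ATOMIC_ACQ_REL },
  { "__ATOMIC_CONSUME", __ATOMIC_CONSUME },
  { "__ATOMIC_RELAXED", __ATOMIC_RELAXED },
  { "__ATOMIC_RELEASE", __ATOMIC_RELEASE },
  { "__ATOMIC_SEQ_CST", __ATOMIC_SEQ_CST },
};

/* Return the value registered for NAME.  On failure return 0 and, if
   ERRP is non-NULL, set *ERRP to true; on success set it to false.  */
int
lookup_magic_int (const char *name, bool *errp)
{
  size_t i;

  if (name != NULL)
    for (i = 0; i < sizeof magic_table / sizeof magic_table[0]; i++)
      if (strcmp (name, magic_table[i].name) == 0)
        {
          if (errp)
            *errp = false;
          return magic_table[i].value;
        }

  if (errp)
    *errp = true;
  return 0;
}
```

A caller would typically do the lookup once at initialization time and cache the result.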

Proposed ChangeLog entry (in gcc/jit/)

2016-05-23  Basile Starynkevitch  
* libgccjit.h (LIBGCCJIT_HAVE_gcc_jit_magic_int): New macro.
(gcc_jit_magic_int): New public function declaration.

* libgccjit.c: Include "cppbuiltin.h", "options.h", "flag-types.h"
(gcc_jit_magic_int): New function.

* libgccjit.map: Add gcc_jit_magic_int to LIBGCCJIT_ABI_6.

Comments (or an ok to commit) are welcome. (I am not sure that 
__SANITIZE_ADDRESS__ is correctly handled,
because I would believe that optimization flags are not globals in GCCJIT)

Regards.

--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***

Index: gcc/jit/libgccjit.h
===
--- gcc/jit/libgccjit.h	(revision 236583)
+++ gcc/jit/libgccjit.h	(working copy)
@@ -1387,6 +1387,27 @@
 gcc_jit_rvalue_set_bool_require_tail_call (gcc_jit_rvalue *call,
 	   int require_tail_call);
 
+
+  /* Magical integer values useful in the compiler; similar to
+ predefined C macros like __GNUC__, __GNUC_MINOR__,
+ __GNUC_PATCHLEVEL__, __ATOMIC_RELAXED, __ATOMIC_SEQ_CST,
+ __ATOMIC_ACQUIRE, __ATOMIC_RELEASE, __ATOMIC_ACQ_REL,
+ __ATOMIC_CONSUME, __PIC__, __PIE__, etc.  Typical usage would be:
+
+bool err=false;
+int mypic = gcc_jit_magic_int("__PIC__", &err);
+if (err) somethinggotwrong();
+
+This function is expected to be rarely called, typically once at
+initialization time. 
+
+   This API entrypoint was added in LIBGCCJIT_ABI_6; you can test for its
+   presence using
+ #ifdef LIBGCCJIT_HAVE_gcc_jit_magic_int
+  */
+#define LIBGCCJIT_HAVE_gcc_jit_magic_int
+extern int gcc_jit_magic_int(const char*name, bool*errp);
+
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */
Index: gcc/jit/libgccjit.c
===
--- gcc/jit/libgccjit.c	(revision 236583)
+++ gcc/jit/libgccjit.c	(working copy)
@@ -23,6 +23,9 @@
 #include "coretypes.h"
 #include "timevar.h"
 #include "typed-splay-tree.h"
+#include "cppbuiltin.h"
+#include "options.h"
+#include "flag-types.h"
 
 #include "libgccjit.h"
 #include "jit-recording.h"
@@ -2970,3 +2973,44 @@
 
   call->set_require_tail_call (require_tail_call);
 }
+
+
+/* Public entrypoint. See description in libgccjit.h. */
+
+int gcc_jit_magic_int(const char*name, bool*errp)
+{
+  static int major, minor, patchlevel;
+  if (!major) /* call once: */
+    parse_basever (&major, &minor, &patchlevel);
+  
+  RETURN_VAL_IF_FAIL (name,
+		  errp?((*errp=true),0):0,
+		  NULL, NULL,
+		  "NULL name");
+  RETURN_VAL_IF_FAIL (name[0] == '_' && name[1] == '_',
+		  errp?((*errp=true),0):0,
+		  NULL, NULL,
+		 "name should start with two underscores");
+#define HAVE_MAGIC_INT(NamStr,Val) do {		\
+  if (!strcmp(name, NamStr)) {			\
+  if (errp) *errp = false;			\
+  return Val; }} while(0)
+  // keep these in alphabetical order...
+  HAVE_MAGIC_INT("__ATOMIC_ACQUIRE", MEMMODEL_ACQUIRE);
+  HAVE_MAGIC_INT("__ATOMIC_ACQ_REL", MEMMODEL_ACQ_REL);
+  HAVE_MAGIC_INT("__ATOMIC_CONSUME", MEMMODEL_CONSUME);
+  HAVE_MAGIC_INT("__ATOMIC_RELAXED", MEMMODEL_RELAXED);
+  HAVE_MAGIC_INT("__ATOMIC_RELEASE", MEMMODEL_RELEASE);
+  HAVE_MAGIC_INT("__ATOMIC_SEQ_CST", MEMMODEL_SEQ_CST);
+  HAVE_MAGIC_INT("__GNUC_MINOR__", minor);
+  HAVE_MAGIC_INT("__GNUC_PATCHLEVEL__", patchlevel);
+  HAVE_MAGIC_INT("__GNUC__", major);
+  HAVE_MAGIC_INT("__PIC__", flag_pic);
+  HAVE_MAGIC_INT("__PIE__", flag_pie);
+  HAVE_MAGIC_INT("__SANITIZE_ADDRESS__", flag_sanitize & SANITIZE_ADDRESS);
+  HAVE_MAGIC_INT("__SANITIZE_THREAD__", flag_sanitize & SANITIZE_THREAD);
+#undef HAVE_MAGIC_INT
+  RETURN_VAL_IF_FAIL_PRINTF1 (false,  errp?((*errp=true),0):0,
+			  NULL, NULL,
+			  "unknown magic int name: %s", name);
+}
Index: gcc/jit/libgccjit.map
===
--- gcc/jit/libgccjit.map	(revision 236583)
+++ gcc/jit/libgccjit.map	(working copy)
@@ -149,4 +149,5 @@
 LIBGCCJIT_ABI_6 {
   global:
   

[PATCH] Fix PR71230

2016-05-23 Thread Richard Biener

The following fixes PR71230 - a missed single_use call when 
re-interpreting * (-x) as * x * -1.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2016-05-23  Richard Biener  

PR tree-optimization/71230
* tree-ssa-reassoc.c (acceptable_pow_call): Move initial condition...
(try_special_add_to_ops): ... here.  Always test for single-use.

* gfortran.dg/pr71230-1.f90: New testcase.
* gfortran.dg/pr71230-2.f90: Likewise.

Index: gcc/tree-ssa-reassoc.c
===
*** gcc/tree-ssa-reassoc.c  (revision 236575)
--- gcc/tree-ssa-reassoc.c  (working copy)
*** break_up_subtract (gimple *stmt, gimple_
*** 4271,4287 
 If any of these conditions does not hold, return FALSE.  */
  
  static bool
! acceptable_pow_call (gimple *stmt, tree *base, HOST_WIDE_INT *exponent)
  {
tree arg1;
REAL_VALUE_TYPE c, cint;
  
-   if (!reassoc_insert_powi_p
-   || !flag_unsafe_math_optimizations
-   || !is_gimple_call (stmt)
-   || !has_single_use (gimple_call_lhs (stmt)))
- return false;
- 
switch (gimple_call_combined_fn (stmt))
  {
  CASE_CFN_POW:
--- 4271,4281 
 If any of these conditions does not hold, return FALSE.  */
  
  static bool
! acceptable_pow_call (gcall *stmt, tree *base, HOST_WIDE_INT *exponent)
  {
tree arg1;
REAL_VALUE_TYPE c, cint;
  
switch (gimple_call_combined_fn (stmt))
  {
  CASE_CFN_POW:
*** try_special_add_to_ops (vec (def_stmt), , ))
  {
add_repeat_to_ops_vec (ops, base, exponent);
gimple_set_visited (def_stmt, true);
Index: gcc/testsuite/gfortran.dg/pr71230-1.f90
===
*** gcc/testsuite/gfortran.dg/pr71230-1.f90 (revision 0)
--- gcc/testsuite/gfortran.dg/pr71230-1.f90 (working copy)
***
*** 0 
--- 1,6 
+ ! { dg-do compile }
+ ! { dg-options "-O2 -fbounds-check -ffast-math" }
+   FUNCTION pw_integral_aa ( cc ) RESULT ( integral_value )
+ COMPLEX(KIND=8), DIMENSION(:), POINTER :: cc
+ integral_value = accurate_sum ( CONJG ( cc (:) ) * cc (:) )
+   END FUNCTION pw_integral_aa
Index: gcc/testsuite/gfortran.dg/pr71230-2.f90
===
*** gcc/testsuite/gfortran.dg/pr71230-2.f90 (revision 0)
--- gcc/testsuite/gfortran.dg/pr71230-2.f90 (working copy)
***
*** 0 
--- 1,67 
+ ! { dg-do compile }
+ ! { dg-options "-O2 -ffast-math" }
+ 
+ MODULE xc_b97
+   INTEGER, PARAMETER :: dp=8
+   PRIVATE
+   PUBLIC :: b97_lsd_eval
+ CONTAINS
+   SUBROUTINE b97_lsd_eval(rho_set,deriv_set,grad_deriv,b97_params)
+ INTEGER, INTENT(in)  :: grad_deriv
+ INTEGER  :: handle, npoints, param, stat
+ LOGICAL  :: failure
+ REAL(kind=dp):: epsilon_drho, epsilon_rho, &
+ scale_c, scale_x
+ REAL(kind=dp), DIMENSION(:, :, :), POINTER :: dummy, e_0, e_ndra, &
+   e_ndra_ndra, e_ndra_ndrb, e_ndra_ra, e_ndra_rb, e_ndrb, e_ndrb_ndrb, &
+   e_ndrb_ra, e_ndrb_rb, e_ra, e_ra_ra, e_ra_rb, e_rb, e_rb_rb, &
+   norm_drhoa, norm_drhob, rhoa, rhob
+ IF (.NOT. failure) THEN
+CALL b97_lsd_calc(&
+ rhoa=rhoa, rhob=rhob, norm_drhoa=norm_drhoa,&
+ norm_drhob=norm_drhob, e_0=e_0, &
+ e_ra=e_ra, e_rb=e_rb, &
+ e_ndra=e_ndra, e_ndrb=e_ndrb, &
+ e_ra_ra=e_ra_ra, e_ra_rb=e_ra_rb, e_rb_rb=e_rb_rb,&
+ e_ra_ndra=e_ndra_ra, e_ra_ndrb=e_ndrb_ra, &
+ e_rb_ndrb=e_ndrb_rb, e_rb_ndra=e_ndra_rb,&
+ e_ndra_ndra=e_ndra_ndra, e_ndrb_ndrb=e_ndrb_ndrb,&
+ e_ndra_ndrb=e_ndra_ndrb,&
+ grad_deriv=grad_deriv, npoints=npoints, &
+ epsilon_rho=epsilon_rho,epsilon_drho=epsilon_drho,&
+ param=param,scale_c_in=scale_c,scale_x_in=scale_x)
+ END IF
+   END SUBROUTINE b97_lsd_eval
+   SUBROUTINE b97_lsd_calc(rhoa, rhob, norm_drhoa, norm_drhob,&
+e_0, e_ra, e_rb, e_ndra, e_ndrb, &
+e_ra_ndra,e_ra_ndrb, e_rb_ndra, e_rb_ndrb,&
+e_ndra_ndra, e_ndrb_ndrb, e_ndra_ndrb, &
+e_ra_ra, e_ra_rb, e_rb_rb,&
+grad_deriv,npoints,epsilon_rho,epsilon_drho, &
+param, scale_c_in, scale_x_in)
+ REAL(kind=dp), DIMENSION(*), INTENT(in)  :: rhoa, rhob, norm_drhoa, &
+ norm_drhob
+ REAL(kind=dp), DIMENSION(*), INTENT(inout) :: e_0, e_ra, e_rb, e_ndra, &
+   e_ndrb, e_ra_ndra, e_ra_ndrb, e_rb_ndra, e_rb_ndrb, e_ndra_ndra, &
+   e_ndrb_ndrb, e_ndra_ndrb, e_ra_ra, e_ra_rb, e_rb_rb
+ INTEGER, INTENT(in)  :: grad_deriv, npoints
+ REAL(kind=dp), INTENT(in):: epsilon_rho, epsilon_drho

[PATCH] GCC 5 backports

2016-05-23 Thread Richard Biener

A few more.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2016-05-23  Richard Biener  

Backport from mainline
2016-02-11  Alexandre Oliva  

PR target/69634
* regstat.c (regstat_bb_compute_calls_crossed): Disregard
debug insns.

* gcc.dg/pr69634.c: New.

2016-03-23  Patrick Palka  
 
PR c++/70347
* typeck.c (process_init_constructor_union): If the initializer
is empty, use the union's NSDMI if it has one.
 
* g++.dg/cpp1y/nsdmi-union1.C: New test.

2015-10-30  Richard Biener  

PR middle-end/68142
* fold-const.c (extract_muldiv_1): Avoid introducing undefined
overflow.

* c-c++-common/ubsan/pr68142.c: New testcase.

2016-03-24  Richard Henderson  

PR middle-end/69845
* fold-const.c (extract_muldiv_1): Correct test for multiplication
overflow.

* gcc.dg/tree-ssa/pr69845-1.c: New test.
* gcc.dg/tree-ssa/pr69845-2.c: New test.


Index: gcc/fold-const.c
===
*** gcc/fold-const.c(revision 229517)
--- gcc/fold-const.c(working copy)
*** extract_muldiv_1 (tree t, tree c, enum t
*** 6008,6015 
 or (for divide and modulus) if it is a multiple of our constant.  */
if (code == MULT_EXPR
  || wi::multiple_of_p (t, c, TYPE_SIGN (type)))
!   return const_binop (code, fold_convert (ctype, t),
!   fold_convert (ctype, c));
break;
  
  CASE_CONVERT: case NON_LVALUE_EXPR:
--- 6015,6031 
 or (for divide and modulus) if it is a multiple of our constant.  */
if (code == MULT_EXPR
  || wi::multiple_of_p (t, c, TYPE_SIGN (type)))
!   {
! tree tem = const_binop (code, fold_convert (ctype, t),
! fold_convert (ctype, c));
! /* If the multiplication overflowed to INT_MIN then we lost sign
!information on it and a subsequent multiplication might
!spuriously overflow.  See PR68142.  */
! if (TREE_OVERFLOW (tem)
! && wi::eq_p (tem, wi::min_value (TYPE_PRECISION (ctype), SIGNED)))
!   return NULL_TREE;
! return tem;
!   }
break;
  
  CASE_CONVERT: case NON_LVALUE_EXPR:
Index: gcc/testsuite/c-c++-common/ubsan/pr68142.c
===
*** gcc/testsuite/c-c++-common/ubsan/pr68142.c  (revision 0)
--- gcc/testsuite/c-c++-common/ubsan/pr68142.c  (working copy)
***
*** 0 
--- 1,31 
+ /* { dg-do run } */
+ /* { dg-options "-fsanitize=undefined -fsanitize-undefined-trap-on-error" } */
+ 
+ int __attribute__((noinline,noclone))
+ h(int a)
+ {
+   return 2 * (a * (__INT_MAX__/2 + 1));
+ }
+ int __attribute__((noinline,noclone))
+ i(int a)
+ {
+   return (2 * a) * (__INT_MAX__/2 + 1);
+ }
+ int __attribute__((noinline,noclone))
+ j(int a, int b)
+ {
+   return (b * a) * (__INT_MAX__/2 + 1);
+ }
+ int __attribute__((noinline,noclone))
+ k(int a, int b)
+ {
+   return (2 * a) * b;
+ }
+ int main()
+ {
+   volatile int tem = h(-1);
+   tem = i(-1);
+   tem = j(-1, 2);
+   tem = k(-1, __INT_MAX__/2 + 1);
+   return 0;
+ }

Index: gcc/regstat.c
===
--- gcc/regstat.c   (revision 233249)
+++ gcc/regstat.c   (revision 233250)
@@ -444,7 +444,7 @@ regstat_bb_compute_calls_crossed (unsign
   struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
   unsigned int regno;
 
-  if (!INSN_P (insn))
+  if (!NONDEBUG_INSN_P (insn))
continue;
 
   /* Process the defs.  */
Index: gcc/testsuite/gcc.dg/pr69634.c
===
--- gcc/testsuite/gcc.dg/pr69634.c  (revision 0)
+++ gcc/testsuite/gcc.dg/pr69634.c  (revision 233250)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-dce -fschedule-insns -fno-tree-vrp -fcompare-debug" } */
+/* { dg-additional-options "-Wno-psabi -mno-sse" { target i?86-*-* x86_64-*-* } } */
+
+typedef unsigned short u16;
+typedef short v16u16 __attribute__ ((vector_size (16)));
+typedef unsigned v16u32 __attribute__ ((vector_size (16)));
+typedef unsigned long long v16u64 __attribute__ ((vector_size (16)));
+
+u16
+foo(u16 u16_1, v16u16 v16u16_0, v16u32 v16u64_0, v16u16 v16u16_1, v16u32 v16u32_1, v16u64 v16u64_1)
+{
+  v16u64_1 /= (v16u64){~v16u32_1[1]};
+  u16_1 = 0;
+  u16_1 /= v16u32_1[2];
+  v16u64_1 -= (v16u64) v16u16_1;
+  u16_1 >>= 1;
+  u16_1 -= ~0;
+  v16u16_1 /= (v16u16){~u16_1, 1 - v16u64_0[0], 0xffb6};
+  return u16_1 + v16u16_0[1] + v16u16_1[3] + v16u64_1[0] + v16u64_1[1];
+}
Index: gcc/testsuite/g++.dg/cpp1y/nsdmi-union1.C

Re: RFC [1/2] divmod transform

2016-05-23 Thread Richard Biener
On Mon, May 23, 2016 at 10:58 AM, Prathamesh Kulkarni
 wrote:
> Hi,
> I have updated my patch for divmod (attached), which was originally
> based on Kugan's patch.
> The patch transforms stmts with code TRUNC_DIV_EXPR and TRUNC_MOD_EXPR
> having same operands to divmod representation, so we can cse computation of 
> mod.
>
> t1 = a TRUNC_DIV_EXPR b;
> t2 = a TRUNC_MOD_EXPR b
> is transformed to:
> complex_tmp = DIVMOD (a, b);
> t1 = REALPART_EXPR (complex_tmp);
> t2 = IMAGPART_EXPR (complex_tmp);
>
> * New hook divmod_expand_libfunc
> The rationale for introducing the hook is that different targets have
> incompatible calling conventions for divmod libfunc.
> Currently three ports define divmod libfunc: c6x, spu and arm.
> c6x and spu follow the convention of libgcc2.c:__udivmoddi4:
> return quotient and store remainder in argument passed as pointer,
> while the arm version takes two arguments and returns both
> quotient and remainder having mode double the size of the operand mode.
> The port should hence override the hook expand_divmod_libfunc
> to generate call to target-specific divmod.
> Ports should define this hook if:
> a) The port does not have divmod or div insn for the given mode.
> b) The port defines divmod libfunc for the given mode.
> The default hook default_expand_divmod_libfunc() generates call
> to libgcc2.c:__udivmoddi4 provided the operands are unsigned and
> are of DImode.
>
> Patch passes bootstrap+test on x86_64-unknown-linux-gnu and
> cross-tested on arm*-*-*.
> Bootstrap+test in progress on arm-linux-gnueabihf.
> Does this patch look OK ?
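
For readers following along, the transform quoted above can be pictured in plain C roughly as follows. This is a sketch under assumptions, not code from the patch: udivmod_sketch is a hypothetical stand-in for a libgcc2-style routine that returns the quotient and stores the remainder through a pointer, and the divisor is assumed to be non-zero.

```c
#include <stdint.h>

/* Hypothetical stand-in for a libgcc2-style divmod routine: return the
   quotient and store the remainder through REM_P.  D must be non-zero.  */
static uint64_t
udivmod_sketch (uint64_t n, uint64_t d, uint64_t *rem_p)
{
  uint64_t q = n / d;
  *rem_p = n - q * d;
  return q;
}

/* Before the transform: two separate operations on the same operands.  */
static void
div_and_mod_separate (uint64_t a, uint64_t b, uint64_t *t1, uint64_t *t2)
{
  *t1 = a / b;
  *t2 = a % b;
}

/* After the transform: one combined call yields both results.  */
static void
div_and_mod_combined (uint64_t a, uint64_t b, uint64_t *t1, uint64_t *t2)
{
  *t1 = udivmod_sketch (a, b, t2);
}
```

Both functions produce the same pair of results; the combined form performs the division only once.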

diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 6b4601b..e4a021a 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1965,4 +1965,31 @@ default_optab_supported_p (int, machine_mode, machine_mode, optimization_type)
   return true;
 }

+void
+default_expand_divmod_libfunc (bool unsignedp, machine_mode mode,
+  rtx op0, rtx op1,
+  rtx *quot_p, rtx *rem_p)

functions need a comment.

ISTR it was suggested that ARM change to libgcc2.c:__udivmoddi4 style?  In that
case we could avoid the target hook.

+  /* If target overrides expand_divmod_libfunc hook
+then perform divmod by generating call to the target-specific divmod libfunc.  */
+  if (targetm.expand_divmod_libfunc != default_expand_divmod_libfunc)
+   return true;
+
+  /* Fall back to using libgcc2.c:__udivmoddi4.  */
+  return (mode == DImode && unsignedp);

I don't understand this - we know optab_libfunc returns non-NULL for 'mode'
but still restrict this to DImode && unsigned?  Also if
targetm.expand_divmod_libfunc
is not the default we expect the target to handle all modes?

That said - I expected the above piece to be simply a 'return true;' ;)

Usually we use some can_expand_XXX helper in optabs.c to query if the target
supports a specific operation (for example SImode divmod would use DImode
divmod by means of widening operands - for the unsigned case of course).

+  /* Disable the transform if either is a constant, since
+ division-by-constant may have specialized expansion.  */
+  if (TREE_CONSTANT (op1) || TREE_CONSTANT (op2))
+return false;

please use CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2)

+  if (TYPE_OVERFLOW_TRAPS (type))
+return false;

why's that?  Generally please first test cheap things (trapping, constant-ness)
before checking expensive stuff (target_supports_divmod_p).

+static bool
+convert_to_divmod (gassign *stmt)
+{
+  if (!divmod_candidate_p (stmt))
+return false;
+
+  tree op1 = gimple_assign_rhs1 (stmt);
+  tree op2 = gimple_assign_rhs2 (stmt);
+
+  vec<gimple *> stmts = vNULL;

use an auto_vec<gimple *> - you currently leak it in at least one place.

+  if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt))
+   cfg_changed = true;

note that this suggests you should check whether any of the stmts may throw
internally as you don't update / transfer EH info correctly.  So for 'stmt' and
all 'use_stmt' check stmt_can_throw_internal and if so do not add it to
the list of stmts to modify.

Btw, I think you should not add 'stmt' immediately but when iterating over
all uses also gather uses in TRUNC_MOD_EXPR.

Otherwise looks ok.

Thanks,
Richard.

> Thanks,
> Prathamesh


Re: [Patch] Implement is_[nothrow_]swappable (p0185r1)

2016-05-23 Thread Jonathan Wakely

On 17/05/16 20:39 +0200, Daniel Krügler wrote:

This is an implementation of the Standard is_swappable traits according to

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0185r1.html

During that work it has been found that std::array's member swap's exception
specification for zero-size arrays was incorrectly depending on the value_type
and that was fixed as well.


This looks good to me, I'll get it committed (with some adjustment to
the ChangeLog format) - thanks.



Re: [RFC] Type promotion pass and elimination of zext/sext

2016-05-23 Thread Richard Biener
On Mon, May 23, 2016 at 2:44 AM, Kugan Vivekanandarajah
 wrote:
> Hi Richard,
>
>> So what does this mean for this pass?  It means that we need to think
>> about the immediate goal we want to fulfil - which might be to just
>> promote things that we can fully promote, avoiding the necessity to
>> prevent passes from undoing our work.  That said - we need a set of
>> testcases the pass should enable to being optimized better than without it
>> (I myself see the idea of promoting on GIMPLE according to PROMOTE_MODE
>> as good design cleanup towards pushing GIMPLE farther out).
>
> I will appreciate any test-cases that you think should work (optimized).

You were coming from the ARM sign-/zero-elimination side and IIRC this
was previously
stuff you tried in VRP.  Didn't you have testcases for the desired
sign-/zero-extension elimination
side from that time?

> I will also try to gather test-cases based on testing/benchmarking.

That's great.

Thanks.
Richard.

> Thanks,
> Kugan


Re: match.pd: Relax some tree_nop_conversion_p

2016-05-23 Thread Richard Biener
On Sun, May 22, 2016 at 7:42 PM, Marc Glisse  wrote:
> Hello,
>
> this patch replaces some tree_nop_conversion_p tests with less restrictive
> conditions. In some cases I checked the transformation automatically (of
> course I could have messed up the checker, or the translation). I didn't
> always put the laxest possible check. For instance the transformation for
> (~x & ~y) is valid with sign extension, but the gain is less obvious in that
> case. ~(~X >> Y) also seems valid in some odd cases involving boolean types,
> not worth the complication. The bad case for a * (1 << b) is when 1 << b turns
> into INT_MIN (and then we sign-extend), which I think is valid even with the
> strictest overflow rules :-(
>
> I only did a few transforms because it isn't clear to me that this is worth
> it. It makes the validity of the transformation less obvious to the reader
> and probably seldom fires in regular code. True conversions also have a cost
> that can change if the transformation actually gains anything (may require
> extra :s). It could also interfere with a narrowing/promotion pass.
>
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
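
As a reading aid, the underlying identities can be checked in plain C for same-width unsigned operands. This is only a sketch with made-up function names; it does not exercise the conversions that the relaxed conditions are about, and it assumes shift counts below 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* a * (1 << b)  ->  a << b, for b < 32.  */
static bool
mult_by_shift_holds (uint32_t a, unsigned b)
{
  return a * ((uint32_t) 1 << b) == a << b;
}

/* ~x & ~y  ->  ~(x | y)  */
static bool
not_and_holds (uint32_t x, uint32_t y)
{
  return (~x & ~y) == ~(x | y);
}

/* ~x ^ ~y  ->  x ^ y  */
static bool
not_xor_holds (uint32_t x, uint32_t y)
{
  return (~x ^ ~y) == (x ^ y);
}

/* (x ^ y) ^ y  ->  x  */
static bool
xor_xor_holds (uint32_t x, uint32_t y)
{
  return ((x ^ y) ^ y) == x;
}
```

With unsigned operands of one width all four hold for every input; the patch is about when they still hold once a widening or narrowing conversion sits between the inner and outer operation.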

Ok.

Thanks,
Richard.

> 2016-05-23  Marc Glisse  
>
> gcc/
> * match.pd (a * (1 << b), ~x & ~y, ~X ^ ~Y, (X ^ Y) ^ Y, ~ (-A),
> ~ (A - 1), ~(~X >> Y), ~(~X >>r Y)): Relax constraints.
>
> gcc/testsuite/
> * gcc.dg/fold-notshift-2.c: Adjust.
>
> --
> Marc Glisse
> Index: gcc/match.pd
> ===
> --- gcc/match.pd(revision 236488)
> +++ gcc/match.pd(working copy)
> @@ -447,21 +447,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (for ops (conj negate)
>   (for cabss (CABS)
>(simplify
> (cabss (ops @0))
> (cabss @0
>
>  /* Fold (a * (1 << b)) into (a << b)  */
>  (simplify
>   (mult:c @0 (convert? (lshift integer_onep@1 @2)))
>(if (! FLOAT_TYPE_P (type)
> -   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
> +   && (element_precision (type) <= element_precision (TREE_TYPE (@1))
> +  || TYPE_UNSIGNED (TREE_TYPE (@1
> (lshift @0 @2)))
>
>  /* Fold (C1/X)*C2 into (C1*C2)/X.  */
>  (simplify
>   (mult (rdiv@3 REAL_CST@0 @1) REAL_CST@2)
>(if (flag_associative_math
> && single_use (@3))
> (with
>  { tree tem = const_binop (MULT_EXPR, type, @0, @2); }
>  (if (tem)
> @@ -648,22 +649,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (simplify
>   (bit_and:c (bit_ior:c @0 @1) (bit_xor:c @1 (bit_not @0)))
>   (bit_and @0 @1))
>
>  /* ~x & ~y -> ~(x | y)
> ~x | ~y -> ~(x & y) */
>  (for op (bit_and bit_ior)
>   rop (bit_ior bit_and)
>   (simplify
>(op (convert1? (bit_not @0)) (convert2? (bit_not @1)))
> -  (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
> -   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
> +  (if (element_precision (type) <= element_precision (TREE_TYPE (@0))
> +   && element_precision (type) <= element_precision (TREE_TYPE (@1)))
> (bit_not (rop (convert @0) (convert @1))
>
>  /* If we are XORing or adding two BIT_AND_EXPR's, both of which are and'ing
> with a constant, and the two constants have no bits in common,
> we should treat this as a BIT_IOR_EXPR since this may produce more
> simplifications.  */
>  (for op (bit_xor plus)
>   (simplify
>(op (convert1? (bit_and@4 @0 INTEGER_CST@1))
>(convert2? (bit_and@5 @2 INTEGER_CST@3)))
> @@ -674,22 +675,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>
>  /* (X | Y) ^ X -> Y & ~ X*/
>  (simplify
>   (bit_xor:c (convert? (bit_ior:c @0 @1)) (convert? @0))
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>(convert (bit_and @1 (bit_not @0)
>
>  /* Convert ~X ^ ~Y to X ^ Y.  */
>  (simplify
>   (bit_xor (convert1? (bit_not @0)) (convert2? (bit_not @1)))
> - (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
> -  && tree_nop_conversion_p (type, TREE_TYPE (@1)))
> + (if (element_precision (type) <= element_precision (TREE_TYPE (@0))
> +  && element_precision (type) <= element_precision (TREE_TYPE (@1)))
>(bit_xor (convert @0) (convert @1
>
>  /* Convert ~X ^ C to X ^ ~C.  */
>  (simplify
>   (bit_xor (convert? (bit_not @0)) INTEGER_CST@1)
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>(bit_xor (convert @0) (bit_not @1
>
>  /* Fold (X & Y) ^ Y and (X ^ Y) & Y as ~X & Y.  */
>  (for opo (bit_and bit_xor)
> @@ -715,22 +716,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* Some simple reassociation for bit operations, also handled in reassoc.
> */
>  /* (X & Y) & Y -> X & Y
> (X | Y) | Y -> X | Y  */
>  (for op (bit_and bit_ior)
>   (simplify
>(op:c (convert?@2 (op:c @0 @1)) (convert? @1))
>@2))
>  /* (X ^ Y) ^ Y -> X  */
>  (simplify
>   (bit_xor:c (convert? (bit_xor:c @0 @1)) (convert? @1))
> - (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> -  (convert @0)))
> + (convert @0))
>  /* (X & Y) & (X & Z) -> (X & Y) & Z
> (X | Y) 

Re: [PATCH] Fix PR tree-optimization/71170

2016-05-23 Thread Richard Biener
On Sat, May 21, 2016 at 8:08 AM, Kugan Vivekanandarajah
 wrote:
> On 20 May 2016 at 21:07, Richard Biener  wrote:
>> On Fri, May 20, 2016 at 1:51 AM, Kugan Vivekanandarajah
>>  wrote:
>>> Hi Richard,
>>>
>>>> I think it should have the same rank as op or op + 1 which is the current
>>>> behavior.  Sth else doesn't work correctly here I think, like inserting the
>>>> multiplication not near the definition of op.
>>>>
>>>> Well, the whole "clever insertion" logic is simply flawed.
>>>
>>> What I meant to say was that the simple logic we have now wouldn’t
>>> work. "clever logic" is knowing exactly where it is needed and
>>> inserting there.  I think that's what you are suggesting below in a
>>> simple to implement way.
>>>
>>>> I'd say that ideally we would delay inserting the multiplication to
>>>> rewrite_expr_tree time.  For example by adding a ops->stmt_to_insert
>>>> member.

>>>
>>> Here is an implementation based on above. Bootstrap on x86-linux-gnu
>>> is OK. regression testing is ongoing.
>>
>> I like it.  Please push the insertion code to a helper as I think you need
>> to post-pone setting the stmts UID to that point.
>>
>> Ideally we'd make use of the same machinery in attempt_builtin_powi,
>> removing the special-casing of powi_result.  (same as I said that ideally
>> the plus->mult stuff would use the repeat-ops machinery...)
>>
>> I'm not 100% convinced the place you insert the stmt is correct but I
>> haven't spent too much time to decipher reassoc in this area.
>
>
> Hi Richard,
>
> Thanks. Here is a tested version of the patch. I did miss one place,
> which I have fixed now (transform_stmt_to_copy). I also created a function
> to do the insertion.
>
>
> Bootstrap and regression testing on x86_64-linux-gnu are fine. Is this
> OK for trunk?

@@ -3798,6 +3805,7 @@ rewrite_expr_tree (gimple *stmt, unsigned int opindex,
   oe1 = ops[opindex];
   oe2 = ops[opindex + 1];

+
   if (rhs1 != oe1->op || rhs2 != oe2->op)
{
  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);

please remove this stray change.

Ok with that change.

Thanks,
Richard.

> Thanks,
> Kugan
>
>
> gcc/ChangeLog:
>
> 2016-05-21  Kugan Vivekanandarajah  
>
> PR middle-end/71170
> * tree-ssa-reassoc.c (struct operand_entry): Add field stmt_to_insert.
> (add_to_ops_vec): Add stmt_to_insert.
> (add_repeat_to_ops_vec): Init stmt_to_insert.
> (insert_stmt_before_use): New.
> (transform_add_to_multiply): Remove mult_stmt insertion and add it
> to ops vector.
> (get_ops): Init stmt_to_insert.
> (maybe_optimize_range_tests): Likewise.
> (rewrite_expr_tree): Insert  stmt_to_insert before use stmt.
> (rewrite_expr_tree_parallel): Likewise.
> (reassociate_bb): Likewise.


Re: New hashtable power 2 rehash policy

2016-05-23 Thread Jonathan Wakely

On 17/05/16 22:28 +0200, François Dumont wrote:

On 14/05/2016 19:06, Daniel Krügler wrote:

1) The function __clp2 is declared using _GLIBCXX14_CONSTEXPR, which
means that it is an inline function if and *only* if
_GLIBCXX14_CONSTEXPR really expands to constexpr, otherwise it is
*not* inline, which is probably not intended and could easily cause
ODR problems. I suggest to mark it unconditionally as inline,
regardless of _GLIBCXX14_CONSTEXPR.
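
For context, a "ceiling power of two" helper of the kind discussed here might look like the following in C. This is a hedged sketch, not the libstdc++ implementation: the name clp2_sketch and the bit-smearing approach are assumptions chosen for illustration. Marking it static inline is the C analogue of the point above, since a function defined in a header needs it to avoid multiple-definition problems.

```c
#include <stdint.h>

/* Hypothetical sketch: return the smallest power of two that is >= N.
   Values below 2 are returned unchanged.  If N exceeds 2^63 the result
   wraps around to 0.  */
static inline uint64_t
clp2_sketch (uint64_t n)
{
  if (n < 2)
    return n;

  /* Smear the highest set bit of N - 1 into every lower position,
     then add one to reach the next power of two.  */
  n--;
  n |= n >> 1;
  n |= n >> 2;
  n |= n >> 4;
  n |= n >> 8;
  n |= n >> 16;
  n |= n >> 32;
  return n + 1;
}
```

A bucket count that is always a power of two lets the hash-to-bucket mapping use a bit mask instead of a modulo operation, which is the usual motivation for such a rehash policy.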


Maybe _GLIBCXX14_CONSTEXPR should take inline value previous to C++14 mode.


That's probably a good idea.


For the moment I simply added the inline as done in other situations.


OK, thanks.



2) Furthermore I suggest to declare __clp2 as noexcept - this is
(intentionally) *not* implied by constexpr.

3) Is there any reason, why _Power2_rehash_policy::_M_next_bkt
shouldn't be noexcept?

4) Similar to (3) for _Power2_rehash_policy's member functions
_M_bkt_for_elements, _M_need_rehash, _M_state, _M_reset
For noexcept I thought we were only adding it if necessary. We might 
have to go through a lot of code to find all places where noexcept 
could be added. Jonathan will give his feedback.


I'm in favour of adding it anywhere that definitely can't throw.
We don't *need* to do that everywhere, but it doesn't hurt.


For the moment I have added it on all those methods.


Great.


Thanks for feedback, updated and tested patch attached.


OK for trunk - thanks!



Re: Debug iterator cleanup

2016-05-23 Thread Jonathan Wakely

On 22/05/16 17:21 +0200, François Dumont wrote:

Hi

   I just want to make sure that you agree that I can remove the 
@todo by implementing operator-> this way.


   * include/debug/safe_iterator.h
   (_Safe_iterator<>::operator->()): Implement using underlying iterator
   operator ->.
   * include/debug/safe_local_iterator.h
   (_Safe_local_iterator<>::operator->()): Likewise.


We never use _Safe_iterator to wrap a raw pointer, right? Raw pointers
don't have operator-> so that wouldn't work, but I think we only wrap
class types (the containers that might use raw pointers use
__normal_iterator instead).

So it looks good to me, thanks.


Re: [PATCH][RTL ifcvt] PR rtl-optimization/66940: Avoid signed overflow in noce_get_alt_condition

2016-05-23 Thread Richard Biener
On Mon, May 23, 2016 at 1:17 PM, Kyrill Tkachov
 wrote:
> Hi all,
>
> In this PR we end up hitting a signed overflow in noce_get_alt_condition
> when we try to
> increment or decrement a HOST_WIDE_INT that might be HOST_WIDE_INT_MAX or
> HOST_WIDE_INT_MIN.
>
> I've confirmed the overflow by adding an assert before the operation:
> gcc_assert (desired_val != HOST_WIDE_INT_MAX);
>
> This patch fixes those cases by catching the cases when desired_val has the
> extreme
> value and avoids the transformation that function is trying to make.
>
> Bootstrapped and tested on arm, aarch64, x86_64.
>
> I've added the testcase that I used to trigger the assert mentioned above as
> a compile test,
> though I'm not sure how much value it has...
>
> Ok for trunk?

If this isn't also a wrong-code issue (runtime testcase?) then why not perform
the operation in unsigned HOST_WIDE_INT instead?

Richard.

> Thanks,
> Kyrill
>
> 2016-05-23  Kyrylo Tkachov  
>
> PR rtl-optimization/66940
> * ifcvt.c (noce_get_alt_condition): Check that incrementing or
> decrementing desired_val will not overflow before performing these
> operations.
>
> 2016-05-23  Kyrylo Tkachov  
>
> PR rtl-optimization/66940
> * gcc.c-torture/compile/pr66940.c: New test.


[hsa] Avoid segfault in hsa switch expansion

2016-05-23 Thread Martin Jambor
Hi,

when we expand a switch statement to a SBR HSAIL instruction, we must
be careful when changing the CFG and avoid adding a new predecessor to
the default-value basic block if it has PHI nodes because we do not
provide values in the PHIs for the new edge.  We can avoid it by
splitting the edge to the default-value block and having the new edge
lead to this new block, which is what the patch below does.

Bootstrapped and tested on x86_64-linux on trunk and the gcc-6 branch.
I'll commit it to both momentarily.

Thanks,

Martin

2016-05-20  Martin Jambor  

* hsa-gen.c (gen_hsa_insns_for_switch_stmt): Create an empty
default block if a PHI node in the original one would be resized.

libgomp/
* testsuite/libgomp.hsa.c/switch-sbr-2.c: New test.
---
 gcc/hsa-gen.c  |  6 +++
 libgomp/testsuite/libgomp.hsa.c/switch-sbr-2.c | 59 ++
 2 files changed, 65 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.hsa.c/switch-sbr-2.c

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 697d599..cf7d434 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -3482,6 +3482,12 @@ gen_hsa_insns_for_switch_stmt (gswitch *s, hsa_bb *hbb)
   basic_block default_label_bb = label_to_block_fn (func,
CASE_LABEL (default_label));
 
+  if (!gimple_seq_empty_p (phi_nodes (default_label_bb)))
+{
+  default_label_bb = split_edge (find_edge (e->dest, default_label_bb));
+  hsa_init_new_bb (default_label_bb);
+}
+
   make_edge (e->src, default_label_bb, EDGE_FALSE_VALUE);
 
   hsa_cfun->m_modified_cfg = true;
diff --git a/libgomp/testsuite/libgomp.hsa.c/switch-sbr-2.c b/libgomp/testsuite/libgomp.hsa.c/switch-sbr-2.c
new file mode 100644
index 000..06990d1
--- /dev/null
+++ b/libgomp/testsuite/libgomp.hsa.c/switch-sbr-2.c
@@ -0,0 +1,59 @@
+/* { dg-additional-options "-fno-tree-switch-conversion" } */
+
+#pragma omp declare target
+int
+foo (unsigned a)
+{
+  switch (a)
+{
+case 1 ... 5:
+  return 1;
+case 9 ... 11:
+  return a + 3;
+case 12 ... 13:
+  return a + 3;
+default:
+  return 44;
+}
+}
+#pragma omp end declare target
+
+#define s 100
+
+void __attribute__((noinline, noclone))
+verify(int *a)
+{
+  if (a[0] != 44)
+__builtin_abort ();
+  
+  for (int i = 1; i <= 5; i++)
+if (a[i] != 1)
+  __builtin_abort ();
+
+  for (int i = 6; i <= 8; i++)
+if (a[i] != 44)
+  __builtin_abort ();
+
+  for (int i = 9; i <= 13; i++)
+if (a[i] != i + 3)
+  __builtin_abort ();
+
+  for (int i = 14; i < s; i++)
+if (a[i] != 44)
+  __builtin_abort ();
+}
+
+int main(int argc)
+{
+  int array[s];
+#pragma omp target
+  {
+for (int i = 0; i < s; i++)
+  {
+   int v = foo (i);
+   array[i] = v;
+  }
+  }
+  verify (array);
+  return 0;
+}
-- 
2.8.2



[PATCH][RTL ifcvt] PR rtl-optimization/66940: Avoid signed overflow in noce_get_alt_condition

2016-05-23 Thread Kyrill Tkachov

Hi all,

In this PR we end up hitting a signed overflow in noce_get_alt_condition when
we try to increment or decrement a HOST_WIDE_INT that might be
HOST_WIDE_INT_MAX or HOST_WIDE_INT_MIN.

I've confirmed the overflow by adding an assert before the operation:
gcc_assert (desired_val != HOST_WIDE_INT_MAX);

This patch fixes those cases by checking whether desired_val already holds the
extreme value and, if so, skipping the transformation that the function is
trying to make.
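
The guard can be sketched in isolation as below. This is a minimal illustration, not the GCC code: `wide_int_t` and the `WIDE_INT_*` macros are stand-ins for HOST_WIDE_INT and its limits, and the two function names are made up for the example. The point is only that the comparison against the extreme value comes first, so the addition or subtraction is never evaluated when it would overflow.

```c
#include <limits.h>

/* Stand-ins for HOST_WIDE_INT and its limits (assumption for this
   sketch; GCC's own definitions are not used).  */
typedef long long wide_int_t;
#define WIDE_INT_MAX LLONG_MAX
#define WIDE_INT_MIN LLONG_MIN

/* Nonzero if ACTUAL == DESIRED + 1.  Short-circuit evaluation means
   DESIRED + 1 is not computed when DESIRED is the maximum value.  */
int is_successor (wide_int_t actual, wide_int_t desired)
{
  return desired != WIDE_INT_MAX && actual == desired + 1;
}

/* Nonzero if ACTUAL == DESIRED - 1, with the mirrored guard for the
   minimum value.  */
int is_predecessor (wide_int_t actual, wide_int_t desired)
{
  return desired != WIDE_INT_MIN && actual == desired - 1;
}
```

For the extreme inputs both helpers simply report "no match", which corresponds to leaving the comparison code unchanged.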

Bootstrapped and tested on arm, aarch64, x86_64.

I've added the testcase that I used to trigger the assert mentioned above as a
compile test, though I'm not sure how much value it has...

Ok for trunk?

Thanks,
Kyrill

2016-05-23  Kyrylo Tkachov  

PR rtl-optimization/66940
* ifcvt.c (noce_get_alt_condition): Check that incrementing or
decrementing desired_val will not overflow before performing these
operations.

2016-05-23  Kyrylo Tkachov  

PR rtl-optimization/66940
* gcc.c-torture/compile/pr66940.c: New test.
diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 80af4a84363192879cc49ea45f777fc987fda555..05fac71409d401a08d01b7dc7cf164613f8477c4 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -2396,28 +2396,32 @@ noce_get_alt_condition (struct noce_if_info *if_info, rtx target,
 	  switch (code)
 	{
 	case LT:
-	  if (actual_val == desired_val + 1)
+	  if (desired_val != HOST_WIDE_INT_MAX
+		  && actual_val == desired_val + 1)
 		{
 		  code = LE;
 		  op_b = GEN_INT (desired_val);
 		}
 	  break;
 	case LE:
-	  if (actual_val == desired_val - 1)
+	  if (desired_val != HOST_WIDE_INT_MIN
+		  && actual_val == desired_val - 1)
 		{
 		  code = LT;
 		  op_b = GEN_INT (desired_val);
 		}
 	  break;
 	case GT:
-	  if (actual_val == desired_val - 1)
+	  if (desired_val != HOST_WIDE_INT_MIN
+		  && actual_val == desired_val - 1)
 		{
 		  code = GE;
 		  op_b = GEN_INT (desired_val);
 		}
 	  break;
 	case GE:
-	  if (actual_val == desired_val + 1)
+	  if (desired_val != HOST_WIDE_INT_MAX
+		  && actual_val == desired_val + 1)
 		{
 		  code = GT;
 		  op_b = GEN_INT (desired_val);
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr66940.c b/gcc/testsuite/gcc.c-torture/compile/pr66940.c
new file mode 100644
index ..1f3586b49f4389b4a506774cf550a984073f03e6
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr66940.c
@@ -0,0 +1,8 @@
+long
+foo (int cp, long ival)
+{
+ if (ival <= 0)
+return -0x7fffL - 1;
+
+ return 0x7fffL;
+}


[PATCH] GCC 5 backports

2016-05-23 Thread Richard Biener

I have bootstrapped / tested the following backports on 
x86_64-unknown-linux-gnu.

Richard.

2016-05-23  Richard Biener  

Backport from mainline
2015-12-11  Segher Boessenkool  

PR rtl-optimization/68814
* rtlanal.c (set_noop_p): Use BITS_BIG_ENDIAN instead of
BYTES_BIG_ENDIAN.

2016-01-12  Jan Hubicka  

PR lto/69003
* lto-partition.c (rename_statics): Fix pasto.

2016-01-13  Jan Hubicka  

PR ipa/66487
* ipa-polymorphic-call.c (inlined_polymorphic_ctor_dtor_block_p):
Use block_ultimate_origin.
(noncall_stmt_may_be_vtbl_ptr_store): Likewise.

2016-02-08  Jakub Jelinek  

PR ipa/69239
* g++.dg/ipa/pr69239.C: New test.

2016-01-21  Roman Zhuykov  

PR target/69252
* modulo-sched.c (optimize_sc): Allow branch-scheduling to add a new
first stage.

2016-01-21  Martin Sebor  

PR target/69252
* gcc.target/powerpc/pr69252.c: New test.


Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 231546)
+++ gcc/rtlanal.c   (revision 231547)
@@ -1534,7 +1534,7 @@ set_noop_p (const_rtx set)
 
   if (GET_CODE (dst) == ZERO_EXTRACT)
 return rtx_equal_p (XEXP (dst, 0), src)
-  && ! BYTES_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
+  && !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
   && !side_effects_p (src);
 
   if (GET_CODE (dst) == STRICT_LOW_PART)
Index: gcc/lto/lto-partition.c
===
--- gcc/lto/lto-partition.c (revision 232524)
+++ gcc/lto/lto-partition.c (revision 232525)
@@ -1077,8 +1077,8 @@ rename_statics (lto_symtab_encoder_t enc
  IDENTIFIER_POINTER
(DECL_ASSEMBLER_NAME (s->get_alias_target()->decl
&& ((s->real_symbol_p ()
- && !DECL_EXTERNAL (node->decl)
-&& !TREE_PUBLIC (node->decl))
+ && !DECL_EXTERNAL (s->decl)
+&& !TREE_PUBLIC (s->decl))
|| may_need_named_section_p (encoder, s))
&& (!encoder
|| lto_symtab_encoder_lookup (encoder, s) != LCC_NOT_FOUND))
Index: gcc/ipa-polymorphic-call.c
===
--- gcc/ipa-polymorphic-call.c  (revision 232355)
+++ gcc/ipa-polymorphic-call.c  (revision 232356)
@@ -484,7 +484,7 @@ contains_type_p (tree outer_type, HOST_W
 tree
 inlined_polymorphic_ctor_dtor_block_p (tree block, bool check_clones)
 {
-  tree fn = BLOCK_ABSTRACT_ORIGIN (block);
+  tree fn = block_ultimate_origin (block);
   if (fn == NULL || TREE_CODE (fn) != FUNCTION_DECL)
 return NULL_TREE;
 
@@ -1143,7 +1143,7 @@ noncall_stmt_may_be_vtbl_ptr_store (gimp
   for (tree block = gimple_block (stmt); block && TREE_CODE (block) == BLOCK;
block = BLOCK_SUPERCONTEXT (block))
 if (BLOCK_ABSTRACT_ORIGIN (block)
-   && TREE_CODE (BLOCK_ABSTRACT_ORIGIN (block)) == FUNCTION_DECL)
+   && TREE_CODE (block_ultimate_origin (block)) == FUNCTION_DECL)
   return inlined_polymorphic_ctor_dtor_block_p (block, false);
   return (TREE_CODE (TREE_TYPE (current_function_decl)) == METHOD_TYPE
  && (DECL_CXX_CONSTRUCTOR_P (current_function_decl)
Index: gcc/testsuite/g++.dg/ipa/pr69239.C
===
--- gcc/testsuite/g++.dg/ipa/pr69239.C  (revision 0)
+++ gcc/testsuite/g++.dg/ipa/pr69239.C  (revision 233224)
@@ -0,0 +1,71 @@
+// PR ipa/69239
+// { dg-do run }
+// { dg-options "-O2 --param=early-inlining-insns=196" }
+// { dg-additional-options "-fPIC" { target fpic } }
+
+struct D
+{
+  float f;
+  D () {}
+  virtual float bar (float z);
+};
+
+struct A
+{
+  A ();
+  virtual int foo (int i);
+};
+
+struct B : public D, public A
+{
+  virtual int foo (int i);
+};
+
+float
+D::bar (float)
+{
+  return f / 2;
+}
+
+int
+A::foo (int i)
+{
+  return i + 1;
+}
+
+int
+B::foo (int i)
+{
+  return i + 2;
+}
+
+int __attribute__ ((noinline,noclone))
+baz ()
+{
+  return 1;
+}
+
+static int __attribute__ ((noinline))
+fn (A *obj, int i)
+{
+  return obj->foo (i);
+}
+
+inline __attribute__ ((always_inline))
+A::A ()
+{
+  if (fn (this, baz ()) != 2)
+__builtin_abort ();
+}
+
+static void
+bah ()
+{
+  B b;
+}
+
+int
+main ()
+{
+  bah ();
+}
Index: gcc/testsuite/gcc.target/powerpc/pr69252.c
===
--- gcc/testsuite/gcc.target/powerpc/pr69252.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr69252.c  (revision 232712)
@@ -0,0 +1,28 @@
+/* PR target/69252 - [4.9/5/6 Regression] gcc.dg/vect/vect-iv-9.c FAILs
+   with -Os -fmodulo-sched -fmodulo-sched-allow-regmoves -fsched-pressure  */
+/* { dg-do run } */
+/* { 

Re: [PATCH] Vectorize inductions that are live after the loop.

2016-05-23 Thread Richard Biener
On Mon, May 23, 2016 at 11:28 AM, Alan Hayward  wrote:
> Vectorize inductions that are live after the loop.
>
> Stmts which are live (ie: defined inside a loop and then used after the
> loop)
> are not currently supported by the vectorizer.  In many cases
> vectorization can
> still occur because the SCEV cprop pass will hoist the definition of the
> stmt
> outside of the loop before the vectorizer pass. However, there are various
> cases SCEV cprop cannot hoist, for example:
>   for (i = 0; i < n; ++i)
> {
>   ret = x[i];
>   x[i] = i;
> }
>   return i;
>
> Currently stmts are marked live using a bool, and the relevant state using
> an
> enum. Both these states are propagated to the definition of all uses of the
> stmt. Also, a stmt can be live but not relevant.
>
> This patch vectorizes a live stmt definition normally within the loop and
> then
> after the loop uses BIT_FIELD_REF to extract the final scalar value from
> the
> vector.
>
> This patch adds a new relevant state (vect_used_only_live) for when a stmt
> is
> used only outside the loop. The relevant state is still propagated to all
> its
> uses, but the live bool is not (this ensures that
> vectorizable_live_operation
> is only called with stmts that really are live).
>
> Tested on x86 and aarch64.

@@ -6332,79 +6324,81 @@ vectorizable_live_operation (gimple *stmt,
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  tree op;
-  gimple *def_stmt;
-  ssa_op_iter iter;
+  imm_use_iterator imm_iter;
+  tree lhs, lhs_type, vec_lhs;
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  int nunits = TYPE_VECTOR_SUBPARTS (vectype);
+  int ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
+  gimple *use_stmt;

   gcc_assert (STMT_VINFO_LIVE_P (stmt_info));

+  if (STMT_VINFO_TYPE (stmt_info) == reduc_vec_info_type)
+return true;
+

This is an odd check - it says the stmt is handled by
vectorizable_reduction.  And your
return claims it is handled by vectorizable_live_operation ...

You removed the SIMD lane handling?

@@ -303,6 +335,16 @@ vect_stmt_relevant_p (gimple *stmt, loop_vec_info
loop_vinfo,
}
 }

+  if (*live_p && *relevant == vect_unused_in_scope
+  && !is_simple_and_all_uses_invariant (stmt, loop_vinfo))
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, vect_location,
+"vec_stmt_relevant_p: live and not all uses "
+"invariant.\n");
+  *relevant = vect_used_only_live;
+}

But that's a missed invariant motion / code sinking opportunity then.
Did you have a
testcase for this?

@@ -618,57 +660,31 @@ vect_mark_stmts_to_be_vectorized (loop_vec_info
loop_vinfo)
}

   /* Examine the USEs of STMT. For each USE, mark the stmt that defines it
-(DEF_STMT) as relevant/irrelevant and live/dead according to the
-liveness and relevance properties of STMT.  */
+(DEF_STMT) as relevant/irrelevant according to the relevance property
+of STMT.  */
   stmt_vinfo = vinfo_for_stmt (stmt);
   relevant = STMT_VINFO_RELEVANT (stmt_vinfo);
-  live_p = STMT_VINFO_LIVE_P (stmt_vinfo);
-
-  /* Generally, the liveness and relevance properties of STMT are
-propagated as is to the DEF_STMTs of its USEs:
- live_p <-- STMT_VINFO_LIVE_P (STMT_VINFO)
- relevant <-- STMT_VINFO_RELEVANT (STMT_VINFO)
-
-One exception is when STMT has been identified as defining a reduction
-variable; in this case we set the liveness/relevance as follows:
-  live_p = false
-  relevant = vect_used_by_reduction
-This is because we distinguish between two kinds of relevant stmts -
-those that are used by a reduction computation, and those that are
-(also) used by a regular computation.  This allows us later on to
-identify stmts that are used solely by a reduction, and therefore the
-order of the results that they produce does not have to be kept.  */
-
-  def_type = STMT_VINFO_DEF_TYPE (stmt_vinfo);
-  tmp_relevant = relevant;
-  switch (def_type)
+
+  switch (STMT_VINFO_DEF_TYPE (stmt_vinfo))
 {

You removed this comment.  Is it no longer valid?  If it still applies,
can you please update it instead of deleting it?
This is a tricky area.


@@ -1310,17 +1325,14 @@ vect_init_vector (gimple *stmt, tree val, tree
type, gimple_stmt_iterator *gsi)
In case OP is an invariant or constant, a new stmt that creates a vector def
needs to be introduced.  VECTYPE may be used to specify a required type for
vector invariant.  */
-
-tree
-vect_get_vec_def_for_operand (tree op, gimple *stmt, tree vectype)
+static tree
+vect_get_vec_def_for_operand_internal (tree op, gimple *stmt,
+  loop_vec_info loop_vinfo, tree vectype)
 {
   tree vec_oprnd;
...

+tree

Re: C PATCH to add -Wswitch-unreachable (PR c/49859)

2016-05-23 Thread Marek Polacek
On Mon, May 23, 2016 at 11:43:24AM +0200, Florian Weimer wrote:
> On 05/20/2016 06:36 PM, Marek Polacek wrote:
> > --- gcc/gimplify.c
> > +++ gcc/gimplify.c
> > @@ -1595,6 +1595,32 @@ gimplify_switch_expr (tree *expr_p, gimple_seq 
> > *pre_p)
> >gimplify_ctxp->case_labels.create (8);
> > 
> >gimplify_stmt (&SWITCH_BODY (switch_expr), &switch_body_seq);
> > +
> > +  /* Possibly warn about unreachable statements between switch's
> > +controlling expression and the first case.  */
> > +  if (warn_switch_unreachable
> > + /* This warning doesn't play well with Fortran when optimizations
> > +are on.  */
> > + && !lang_GNU_Fortran ()
> > + && switch_body_seq != NULL)
> 
> Does this still make the warning dependent on the optimization level and the
> target?

It's not dependent on the optimization level for C/C++ and I don't quite see
why it would be dependent on the target.  It's just that the Fortran FE with
-O optimizes a switch by removing the first case.  I don't know Fortran well
enough to judge whether the warning could be useful there.
 
> Can we document what a programmer needs to do to suppress the warning? My
> worry is that the warning will be too unpredictable to be useful.

Why would you suppress the warning?  Just delete the code that cannot be
executed anyway.  And what is unpredictable?  You don't want the compiler
to warn on
   switch(x) {
 found = 1;
 case 1:
 ...
?
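
To make the case under discussion concrete, here is a small self-contained C function of that shape. It is only an illustration (the function name is invented): the assignment sits between the controlling expression and the first case label, so control always jumps past it.

```c
/* The assignment before the first case label is never executed,
   because a switch transfers control directly to a label.  */
int first_case_value (int x)
{
  int found = 0;
  switch (x)
    {
      found = 1;   /* unreachable statement */
    case 1:
      return found;
    default:
      return -1;
    }
}
```

Calling first_case_value (1) yields 0, not 1, which is the kind of silently dead code the warning is meant to point out.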

Marek


Re: [RFA] Minor cleanup to allocate_dynamic_stack_space

2016-05-23 Thread Dominik Vogt
On Fri, May 20, 2016 at 03:23:49PM -0600, Jeff Law wrote:
> On 05/19/2016 05:11 PM, Jeff Law wrote:
> [ ... ]
> >This is a bit of a mess and I think the code
> >needs some TLC before we start hacking it up further.
> >
> >Let's start with clean up of dead code:
> >
> > /* We will need to ensure that the address we return is aligned to
> > REQUIRED_ALIGN.  If STACK_DYNAMIC_OFFSET is defined, we don't
> > always know its final value at this point in the compilation (it
> > might depend on the size of the outgoing parameter lists, for
> > example), so we must align the value to be returned in that case.
> > (Note that STACK_DYNAMIC_OFFSET will have a default nonzero value if
> > STACK_POINTER_OFFSET or ACCUMULATE_OUTGOING_ARGS are defined).
> > We must also do an alignment operation on the returned value if
> > the stack pointer alignment is less strict than REQUIRED_ALIGN.
> >
> > If we have to align, we must leave space in SIZE for the hole
> > that might result from the alignment operation.  */
> >
> >  must_align = (crtl->preferred_stack_boundary < required_align);
> >  if (must_align)
> >{
> >  if (required_align > PREFERRED_STACK_BOUNDARY)
> >extra_align = PREFERRED_STACK_BOUNDARY;
> >  else if (required_align > STACK_BOUNDARY)
> >extra_align = STACK_BOUNDARY;
> >  else
> >extra_align = BITS_PER_UNIT;
> >}
> >
> >  /* ??? STACK_POINTER_OFFSET is always defined now.  */
> >#if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
> >  must_align = true;
> >  extra_align = BITS_PER_UNIT;
> >#endif
> >
> >If we look at defaults.h, it always defines STACK_POINTER_OFFSET.  So
> >all the code above I think collapses to:
> >
> >  must_align = true;
> >  extra_align = BITS_PER_UNIT
> >
> >And the only other assignment to must_align assigns it the value "true".
> > There are two conditionals on must_align that look like
> >
> >if (must_align)
> >  {
> >CODE;
> >  }
> >
> >We should remove the conditional and pull CODE out an indentation level.
> > And remove all remnants of must_align.
> >
> >I don't think that changes your patch in any way.  Hopefully it makes
> >the whole function somewhat easier to grok.
> >
> >Thoughts?
> So here's that cleanup.  The diffs are larger than one might expect
> because of the reindentation that needs to happen.  So I've included
> a -b diff variant which shows how little actually changed here.
> 
> This should have no impact on any target.
> 
> Bootstrapped and regression tested on x86_64 linux.  Ok for the trunk?

The first part of the change (removing the dead code) is fine.
However, I suggest leaving "must_align" in the code for now
because I have another patch in the queue that assigns a
calculated value to must_align.  For that I'd have to revert this
part of your patch, so I think it's not worth the effort to remove
it in the first place.  See

  https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00445.html

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: C PATCH to add -Wswitch-unreachable (PR c/49859)

2016-05-23 Thread Marek Polacek
On Fri, May 20, 2016 at 08:17:50PM -0600, Sandra Loosemore wrote:
> On 05/20/2016 10:36 AM, Marek Polacek wrote:
> > diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
> > index f3d087f..5909b9d 100644
> > --- gcc/doc/invoke.texi
> > +++ gcc/doc/invoke.texi
> > @@ -297,7 +297,8 @@ Objective-C and Objective-C++ Dialects}.
> >   -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{]} @gol
> >   -Wsuggest-final-types @gol -Wsuggest-final-methods -Wsuggest-override @gol
> >   -Wmissing-format-attribute -Wsubobject-linkage @gol
> > --Wswitch  -Wswitch-default  -Wswitch-enum -Wswitch-bool -Wsync-nand @gol
> > +-Wswitch  -Wswitch-default  -Wswitch-enum -Wswitch-bool @gol
> > +-Wswitch-unreachable  -Wsync-nand @gol
> >   -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
> >   -Wtype-limits  -Wundef @gol
> >   -Wuninitialized  -Wunknown-pragmas  -Wunsafe-loop-optimizations @gol
> 
> I think this list is supposed to be alphabetized except with respect to
> -Wno-foo being sorted as if it were -Wfoo.  I realize there are other
> inconsistencies, but can you at least keep the -Wswitch* entries in proper
> order?
 
Sure, fixed.

> > @@ -4144,6 +4145,39 @@ switch ((int) (a == 4))
> >   @end smallexample
> >   This warning is enabled by default for C and C++ programs.
> > 
> > +@item -Wswitch-unreachable
> > +@opindex Wswitch-unreachable
> > +@opindex Wno-switch-unreachable
> > +Warn whenever a @code{switch} statement contains statements between the
> > +controlling expression and the first case label, which will never be
> > +executed.  For example:
> > +@smallexample
> > +@group
> > +switch (cond)
> > +  @{
> > +   i = 15;
> > +  @dots{}
> > +   case 5:
> > +  @dots{}
> > +  @}
> > +@end group
> > +@end smallexample
> > +@option{-Wswitch-unreachable} will not warn if the statement between the
> 
> s/will/does/

OK. 

> > +controlling expression and the first case label is just a declaration:
> > +@smallexample
> > +@group
> > +switch (cond)
> > +  @{
> > +   int i;
> > +  @dots{}
> > +   case 5:
> > +   i = 5;
> > +  @dots{}
> > +  @}
> > +@end group
> > +@end smallexample
> > +This warning is enabled by default for C and C++ programs.
> > +
> >   @item -Wsync-nand @r{(C and C++ only)}
> >   @opindex Wsync-nand
> >   @opindex Wno-sync-nand
> 
> The doc part of the patch is OK with those things fixed.

Thanks, I made the changes.

Marek


RE: [Patch V2] Fix SLP PR58135.

2016-05-23 Thread Kumar, Venkataramanan
Hi Richard,

> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Thursday, May 19, 2016 4:08 PM
> To: Kumar, Venkataramanan 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Patch V2] Fix SLP PR58135.
> 
> On Wed, May 18, 2016 at 5:29 PM, Kumar, Venkataramanan
>  wrote:
> > Hi Richard,
> >
> >> -Original Message-
> >> From: Richard Biener [mailto:richard.guent...@gmail.com]
> >> Sent: Tuesday, May 17, 2016 5:40 PM
> >> To: Kumar, Venkataramanan 
> >> Cc: gcc-patches@gcc.gnu.org
> >> Subject: Re: [Patch V2] Fix SLP PR58135.
> >>
> >> On Tue, May 17, 2016 at 1:56 PM, Kumar, Venkataramanan
> >>  wrote:
> >> > Hi Richard,
> >> >
> >> > I created the patch by passing -b option to git. Now the patch is
> >> > more
> >> readable.
> >> >
> >> > As per your suggestion I tried to fix the PR by splitting the SLP
> >> > store group at
> >> vector boundary after the SLP tree is built.
> >> >
> >> > Boot strap PASSED on x86_64.
> >> > Checked the patch with check_GNU_style.sh.
> >> >
> >> > The gfortran.dg/pr46519-1.f test now does SLP vectorization. Hence
> >> > it
> >> generated 2 more vzeroupper.
> >> > As recommended I adjusted the test case by adding
> >> > -fno-tree-slp-vectorize
> >> to make it as expected after loop vectorization.
> >> >
> >> > The following tests are now passing.
> >> >
> >> > -- Snip-
> >> > Tests that now work, but didn't before:
> >> >
> >> > gcc.dg/vect/bb-slp-19.c -flto -ffat-lto-objects
> >> > scan-tree-dump-times
> >> > slp2 "basic block vectorized" 1
> >> >
> >> > gcc.dg/vect/bb-slp-19.c scan-tree-dump-times slp2 "basic block
> >> > vectorized" 1
> >> >
> >> > New tests that PASS:
> >> >
> >> > gcc.dg/vect/pr58135.c (test for excess errors)
> >> > gcc.dg/vect/pr58135.c -flto -ffat-lto-objects (test for excess
> >> > errors)
> >> >
> >> > -- Snip-
> >> >
> >> > ChangeLog
> >> >
> >> > 2016-05-14  Venkataramanan Kumar
> >> 
> >> >  PR tree-optimization/58135
> >> > * tree-vect-slp.c:  When group size is not multiple of vector size,
> >> >  allow splitting of store group at vector boundary.
> >> >
> >> > Test suite  ChangeLog
> >> > 2016-05-14  Venkataramanan Kumar
> >> 
> >> > * gcc.dg/vect/bb-slp-19.c:  Remove XFAIL.
> >> > * gcc.dg/vect/pr58135.c:  Add new.
> >> > * gfortran.dg/pr46519-1.f: Adjust test case.
> >> >
> >> > The attached patch Ok for trunk?
> >>
> >>
> >> Please avoid the excessive vertical space around the vect_build_slp_tree
> call.
> > Yes fixed in the attached patch.
> >>
> >> +  /* Calculate the unrolling factor.  */
> >> +  unrolling_factor = least_common_multiple
> >> + (nunits, group_size) / group_size;
> >> ...
> >> +  else
> >> {
> >>   /* Calculate the unrolling factor based on the smallest type.  */
> >>   if (max_nunits > nunits)
> >> -unrolling_factor = least_common_multiple (max_nunits, group_size)
> >> -   / group_size;
> >> +   unrolling_factor
> >> +   = least_common_multiple (max_nunits,
> >> + group_size)/group_size;
> >>
> >> please compute the "correct" unroll factor immediately and move the
> >> "unrolling of BB required" error into the if() case by post-poning
> >> the nunits < group_size check (and use max_nunits here).
> >>
> > Yes fixed in the attached patch.
> >
> >> +  if (is_a <bb_vec_info> (vinfo)
> >> + && nunits < group_size
> >> + && unrolling_factor != 1
> >> + && is_a <bb_vec_info> (vinfo))
> >> +   {
> >> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> >> +  "Build SLP failed: store group "
> >> +  "size not a multiple of the vector size "
> >> +  "in basic block SLP\n");
> >> + /* Fatal mismatch.  */
> >> + matches[nunits] = false;
> >>
> >> this is too pessimistic - you want to add the extra 'false' at
> >> group_size / max_nunits * max_nunits.
> > Yes fixed in attached patch.
> >
> >>
> >> It looks like you leak 'node' in the if () path as well.  You need
> >>
> >>   vect_free_slp_tree (node);
> >>   loads.release ();
> >>
> >> thus treat it as a failure case.
> >
> > Yes fixed. I added an else part before scalar_stmts.release call for the 
> > case
> when SLP tree is not built. This avoids double freeing.
> > Bootstrapped and reg tested on X86_64.
> >
> > Ok for trunk ?
> 
> +  /*Calculate the unrolling factor based on the smallest type.  */
> +  unrolling_factor
> 
> Space after /* missing.
> 
> +  unrolling_factor
> +   = least_common_multiple (max_nunits, group_size)/group_size;
> 
> spaces before/after the '/'
> 
> +  if (max_nunits > nunits
> + && unrolling_factor != 1
> + && is_a  

Re: C PATCH to add -Wswitch-unreachable (PR c/49859)

2016-05-23 Thread Florian Weimer

On 05/20/2016 06:36 PM, Marek Polacek wrote:

--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -1595,6 +1595,32 @@ gimplify_switch_expr (tree *expr_p, gimple_seq *pre_p)
   gimplify_ctxp->case_labels.create (8);

   gimplify_stmt (&SWITCH_BODY (switch_expr), &switch_body_seq);
+
+  /* Possibly warn about unreachable statements between switch's
+controlling expression and the first case.  */
+  if (warn_switch_unreachable
+ /* This warning doesn't play well with Fortran when optimizations
+are on.  */
+ && !lang_GNU_Fortran ()
+ && switch_body_seq != NULL)


Does this still make the warning dependent on the optimization level and 
the target?


Can we document what a programmer needs to do to suppress the warning? 
My worry is that the warning will be too unpredictable to be useful.


Thanks,
Florian


Re: [ARM] implement division using vrecpe/vrecps with -funsafe-math-optimizations

2016-05-23 Thread Prathamesh Kulkarni
On 5 February 2016 at 18:40, Prathamesh Kulkarni
 wrote:
> On 4 February 2016 at 16:31, Ramana Radhakrishnan
>  wrote:
>> On Sun, Jan 17, 2016 at 9:06 AM, Prathamesh Kulkarni
>>  wrote:
>>> On 31 July 2015 at 15:04, Ramana Radhakrishnan
>>>  wrote:


 On 29/07/15 11:09, Prathamesh Kulkarni wrote:
> Hi,
> This patch tries to implement division with multiplication by
> reciprocal using vrecpe/vrecps
> with -funsafe-math-optimizations and -freciprocal-math enabled.
> Tested on arm-none-linux-gnueabihf using qemu.
> OK for trunk ?
>
> Thank you,
> Prathamesh
>

 I've tried this in the past and never been convinced that 2 iterations are 
 enough to get to stability with this given that the results are only 
 precise for 8 bits / iteration. Thus I've always believed you need 3 
 iterations rather than 2 at which point I've never been sure that it's 
 worth it. So the testing that you've done with this currently is not 
 enough for this to go into the tree.

 I'd like this to be tested on a couple of different AArch32 
 implementations with a wider range of inputs to verify that the results 
 are acceptable as well as running something like SPEC2k(6) with at least 
 one iteration to ensure correctness.
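
For readers unfamiliar with the refinement being discussed, the following is a scalar C model of it, offered as a sketch only. It assumes the documented arithmetic of the step instruction (2 - a*b) and a starting estimate that is off by a relative 1/256; the function names are invented, and the model runs in double precision, so it says nothing about whether two or three steps are enough on real single-precision hardware.

```c
/* Scalar model of the step operation: 2 - d*x.  */
double recip_step (double d, double x)
{
  return 2.0 - d * x;
}

/* One Newton-Raphson refinement of X as an estimate of 1/D.  */
double refine_reciprocal (double d, double x)
{
  return x * recip_step (d, x);
}

/* Start from a deliberately crude estimate (relative error 1/256,
   an assumption standing in for the hardware estimate) and refine
   it STEPS times.  */
double approx_reciprocal (double d, int steps)
{
  double x = (1.0 / d) * (1.0 - 1.0 / 256.0);
  for (int i = 0; i < steps; i++)
    x = refine_reciprocal (d, x);
  return x;
}

/* |1 - d*x|, a measure of how far X is from 1/D.  */
double reciprocal_residual (double d, double x)
{
  double r = 1.0 - d * x;
  return r < 0 ? -r : r;
}
```

In exact arithmetic each step squares the residual, since 1 - d*x*(2 - d*x) = (1 - d*x)^2; rounding in a narrower floating-point format is what the accuracy concern above is about.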
>>> Hi,
>>> I got results of SPEC2k6 fp benchmarks:
>>> a15: +0.64% overall, 481.wrf: +6.46%
>>> a53: +0.21% overall, 416.gamess: -1.39%, 481.wrf: +6.76%
>>> a57: +0.35% overall, 481.wrf: +3.84%
>>> The other benchmarks had (almost) identical results.
>>
>> Thanks for the benchmarking results -  Please repost the patch with
>> the changes that I had requested in my previous review - given it is
>> now stage4 , I would rather queue changes like this for stage1 now.
> Hi,
> Please find the updated patch attached.
> It passes testsuite for arm-none-linux-gnueabi, arm-none-linux-gnueabihf and
> arm-none-eabi.
> However the test-case added in the patch (neon-vect-div-1.c) fails to
> get vectorized at -O2
> for armeb-none-linux-gnueabihf.
> Charles suggested me to try with -O3, which worked.
> It appears the test-case fails to get vectorized with
> -fvect-cost-model=cheap (which is default enabled at -O2)
> and passes for -fno-vect-cost-model / -fvect-cost-model=dynamic
>
> I can't figure out why it fails -fvect-cost-model=cheap.
> From the vect dump (attached):
> neon-vect-div-1.c:12:3: note: Setting misalignment to -1.
> neon-vect-div-1.c:12:3: note: not vectorized: unsupported unaligned load.*_9
Hi,
I think I have some idea why the test-cases attached with the patch
fail to get vectorized on armeb with -O2.

Issue with big endian vectorizer:
The patch does not cause regressions on the big endian vectorizer, but the
test-cases attached with the patch fail to vectorize there, while they
get vectorized on little-endian.
On armeb this fails with the following message in the dump:
note: not vectorized: unsupported unaligned load.*_9

The behavior of big and little endian vectorizer seems to be different
in arm_builtin_support_vector_misalignment() which overrides the hook
targetm.vectorize.support_vector_misalignment().

targetm.vectorize.support_vector_misalignment is called by
vect_supportable_dr_alignment () which in turn is called
by verify_data_refs_alignment ().

Execution up to the following condition is common between arm and armeb
in vect_supportable_dr_alignment():

if ((TYPE_USER_ALIGN (type) && !is_packed)
  || targetm.vectorize.support_vector_misalignment (mode, type,
DR_MISALIGNMENT (dr), is_packed))
/* Can't software pipeline the loads, but can at least do them.  */
return dr_unaligned_supported;

For little endian case:
arm_builtin_support_vector_misalignment() is called with
V2SF mode and misalignment == -1, and the following condition
becomes true:
/* If the misalignment is unknown, we should be able to handle the access
 so long as it is not to a member of a packed data structure.  */
  if (misalignment == -1)
return true;

Since the hook returned true we enter the condition above in
vect_supportable_dr_alignment() and return dr_unaligned_supported;

For big-endian:
arm_builtin_support_vector_misalignment() is called with V2SF mode.
The following condition that gates the entire function body fails:
 if (TARGET_NEON && !BYTES_BIG_ENDIAN && unaligned_access)
and the default hook gets called with V2SF mode and the default hook
returns false because
movmisalign_optab does not exist for V2SF mode.

So the condition above in vect_supportable_dr_alignment() fails
and we come here:
 /* Unsupported.  */
return dr_unaligned_unsupported;
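
The two paths described above can be condensed into a toy decision model. This is a sketch of the behaviour as described in this message, not the real hook: all `toy_*` names are hypothetical, the target is reduced to four flags, and cases the message does not describe are simply rejected.

```c
#include <stdbool.h>

enum toy_dr_support { TOY_UNALIGNED_UNSUPPORTED, TOY_UNALIGNED_SUPPORTED };

/* Reduced view of the target for this sketch.  */
struct toy_target {
  bool neon;
  bool bytes_big_endian;
  bool unaligned_access;
  bool has_movmisalign;   /* is a movmisalign pattern available?  */
};

/* Model of the default hook: only OK with a movmisalign pattern.  */
bool toy_default_hook (const struct toy_target *t)
{
  return t->has_movmisalign;
}

/* Model of the ARM hook as described: the body is gated on NEON,
   little-endian and unaligned access being enabled.  */
bool toy_arm_hook (const struct toy_target *t, int misalignment)
{
  if (t->neon && !t->bytes_big_endian && t->unaligned_access)
    {
      if (misalignment == -1)
        return true;
      /* The remaining cases are not described in the message;
         this sketch simply rejects them.  */
      return false;
    }
  return toy_default_hook (t);
}

/* Model of the check in vect_supportable_dr_alignment.  */
enum toy_dr_support toy_supportable (const struct toy_target *t,
                                     bool user_align, bool is_packed,
                                     int misalignment)
{
  if ((user_align && !is_packed) || toy_arm_hook (t, misalignment))
    return TOY_UNALIGNED_SUPPORTED;
  return TOY_UNALIGNED_UNSUPPORTED;
}

/* The V2SF case from the dump: unknown misalignment, no movmisalign.  */
enum toy_dr_support toy_v2sf_case (bool big_endian)
{
  struct toy_target t = { true, big_endian, true, false };
  return toy_supportable (&t, false, false, -1);
}
```

With these assumptions the little-endian call reports the access as supported and the big-endian call does not, which matches the difference seen in the dumps.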

And hence we get the unaligned load not supported message in the dump
for armeb in verify_data_ref_alignment ():

static bool
verify_data_ref_alignment (data_reference_p dr)
{
  enum dr_alignment_support 

[PATCH] Vectorize inductions that are live after the loop.

2016-05-23 Thread Alan Hayward
Vectorize inductions that are live after the loop.

Stmts which are live (ie: defined inside a loop and then used after the
loop)
are not currently supported by the vectorizer.  In many cases
vectorization can
still occur because the SCEV cprop pass will hoist the definition of the
stmt
outside of the loop before the vectorizer pass. However, there are various
cases SCEV cprop cannot hoist, for example:
  for (i = 0; i < n; ++i)
{
  ret = x[i];
  x[i] = i;
}
  return i;

Currently stmts are marked live using a bool, and the relevant state using
an
enum. Both these states are propagated to the definition of all uses of the
stmt. Also, a stmt can be live but not relevant.

This patch vectorizes a live stmt definition normally within the loop and
then
after the loop uses BIT_FIELD_REF to extract the final scalar value from
the
vector.
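
One way to picture the idea is the lane-wise emulation below. It is only a sketch in plain C, not what the vectorizer emits: LANES, the function names, and the restriction that n is a multiple of LANES are assumptions of the example, and the load into ret from the loop above is left out to keep it short. The "vector" is a small array, and reading its last element after the loop plays the role of the BIT_FIELD_REF extraction.

```c
#define LANES 4

/* Scalar reference: the induction variable is live after the loop.  */
int scalar_count (int *x, int n)
{
  int i;
  for (i = 0; i < n; ++i)
    x[i] = i;
  return i;
}

/* Lane-wise emulation; N must be a multiple of LANES.  Each outer
   iteration handles LANES elements and keeps the incremented
   induction values in NEXT.  */
int lanewise_count (int *x, int n)
{
  int next[LANES] = { 0 };
  for (int base = 0; base < n; base += LANES)
    for (int l = 0; l < LANES; ++l)
      {
        x[base + l] = base + l;      /* "vector" store */
        next[l] = base + l + 1;      /* "vector" induction update */
      }
  /* After the loop, the live scalar is the final lane.  */
  return next[LANES - 1];
}
```

Both functions return the same value for such n and leave the same contents in x.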

This patch adds a new relevant state (vect_used_only_live) for when a stmt
is
used only outside the loop. The relevant state is still propagated to all
its
uses, but the live bool is not (this ensures that
vectorizable_live_operation
is only called with stmts that really are live).

Tested on x86 and aarch64.

gcc/
* tree-vect-loop.c (vect_analyze_loop_operations): Allow live stmts.
(vectorizable_reduction): Check for new relevant state.
(vectorizable_live_operation): Vectorize live stmts using
BIT_FIELD_REF.  Remove special case for gimple assigns stmts.
* tree-vect-stmts.c (is_simple_and_all_uses_invariant): New function.
(vect_stmt_relevant_p): Check for stmts which are only used outside the
loop.
(process_use): Use of a stmt does not inherit its live value.
(vect_mark_stmts_to_be_vectorized): Simplify relevance inheritance.
(vect_get_vec_def_for_operand): Split body into...
(vect_get_vec_def_for_operand_internal): ...new function.
(vect_get_vec_def_for_operand_outside): New. Same as above but for
stmts outside a loop.
(vect_analyze_stmt): Check for new relevant state.
* tree-vectorizer.h (vect_relevant): New entry for a stmt which is used
outside the loop, but not inside it.

testsuite/
* gcc.dg/tree-ssa/pr64183.c: Ensure test does not vectorize.
* testsuite/gcc.dg/vect/no-scevccp-vect-iv-2.c: Remove xfail.
* gcc.dg/vect/vect-live-1.c: New test.
* gcc.dg/vect/vect-live-2.c: New test.
* gcc.dg/vect/vect-live-3.c: New test.
* gcc.dg/vect/vect-live-4.c: New test.


Cheers,
Alan.



liveInductions.patch
Description: liveInductions.patch


Re: [Patch wwwdocs] Add aarch64-none-linux-gnu as a primary platform for GCC-7

2016-05-23 Thread Richard Biener
On Mon, May 23, 2016 at 11:16 AM, Ramana Radhakrishnan
 wrote:
> Hi,
>
> The Steering Committee has decided to add aarch64-none-linux-gnu as a primary 
> platform for GCC-7. This reflects the increasing popularity of the port and 
> the increased general availability of hardware. I also took the opportunity 
> of creating a GCC-7 criteria page at the same time.
>
> Applied.

Sorry to hijack the thread but I continue to notice that we have
i386-unknown-freebsd as a primary target.  I notice here
the 'i386' (the only primary target still explicitly listing that
sub-target) and the fact that freebsd switched to LLVM as
far as I know.

So I propose to demote -freebsd to secondary and use
i686-unknown-freebsd (or x86_64-unknown-freebsd?).

Gerald, Andreas, can you comment on both issues?  Esp. i386 is putting
quite some burden on libstdc++ and atomics support
for example.

Thanks,
Richard.

> Thanks,
> Ramana


Re: [PATCH] Fix PR70434, change FE IL for vector indexing

2016-05-23 Thread Richard Biener
On Mon, 23 May 2016, Richard Biener wrote:

> On Mon, 23 May 2016, Eric Botcazou wrote:
> 
> > > The following changes the IL the C family frontends emit for
> > > vector indexing from a mix of BIT_FIELD_REF (for constant indices)
> > > and pointer arithmetic + dereferences (for variable indices) to
> > > a simple ARRAY_REF of the vector view-converted to a corresponding
> > > array type.
> > 
> > FWIW the Ada front-end (gigi) has always done that for its vector_type.
> 
> Hmm, so it must have run into the bugfix part of this patch (the
> tree-ssa.c hunk).  And it will benefit from the gimple-fold.c hunk.

Marek noticed this fixes PR69504 as well.  Added testcase and adjusted
ChangeLog.

Richard.

2016-05-23  Richard Biener  

PR middle-end/70434
PR c/69504
c-family/
* c-common.c (convert_vector_to_pointer_for_subscript): Use a
VIEW_CONVERT_EXPR to an array type.

cp/
* expr.c (mark_exp_read): Handle VIEW_CONVERT_EXPR.
* constexpr.c (cxx_eval_array_reference): Handle indexed
vectors.

c/
* c-typeck.c (build_array_ref): Do not complain about indexing
non-lvalue vectors.

* tree-ssa.c (non_rewritable_mem_ref_base): Make sure to mark
bases which are accessed with non-invariant indices.
* gimple-fold.c (maybe_canonicalize_mem_ref_addr): Re-write
constant index ARRAY_REFs of vectors into BIT_FIELD_REFs.

* c-c++-common/vector-subscript-4.c: New testcase.

Index: gcc/c-family/c-common.c
===
*** gcc/c-family/c-common.c.orig 2016-05-20 13:55:56.047232610 +0200
--- gcc/c-family/c-common.c 2016-05-23 11:12:54.810910622 +0200
*** convert_vector_to_pointer_for_subscript
*** 12508,12561 
if (VECTOR_TYPE_P (TREE_TYPE (*vecp)))
  {
tree type = TREE_TYPE (*vecp);
-   tree type1;
  
ret = !lvalue_p (*vecp);
if (TREE_CODE (index) == INTEGER_CST)
  if (!tree_fits_uhwi_p (index)
  || tree_to_uhwi (index) >= TYPE_VECTOR_SUBPARTS (type))
warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
  
!   if (ret)
!   {
! tree tmp = create_tmp_var_raw (type);
! DECL_SOURCE_LOCATION (tmp) = loc;
! *vecp = c_save_expr (*vecp);
! if (TREE_CODE (*vecp) == C_MAYBE_CONST_EXPR)
!   {
! bool non_const = C_MAYBE_CONST_EXPR_NON_CONST (*vecp);
! *vecp = C_MAYBE_CONST_EXPR_EXPR (*vecp);
! *vecp
!   = c_wrap_maybe_const (build4 (TARGET_EXPR, type, tmp,
! *vecp, NULL_TREE, NULL_TREE),
! non_const);
!   }
! else
!   *vecp = build4 (TARGET_EXPR, type, tmp, *vecp,
!   NULL_TREE, NULL_TREE);
! SET_EXPR_LOCATION (*vecp, loc);
! c_common_mark_addressable_vec (tmp);
!   }
!   else
!   c_common_mark_addressable_vec (*vecp);
!   type = build_qualified_type (TREE_TYPE (type), TYPE_QUALS (type));
!   type1 = build_pointer_type (TREE_TYPE (*vecp));
!   bool ref_all = TYPE_REF_CAN_ALIAS_ALL (type1);
!   if (!ref_all
! && !DECL_P (*vecp))
!   {
! /* If the original vector isn't declared may_alias and it
!isn't a bare vector look if the subscripting would
!alias the vector we subscript, and if not, force ref-all.  */
! alias_set_type vecset = get_alias_set (*vecp);
! alias_set_type sset = get_alias_set (type);
! if (!alias_sets_must_conflict_p (sset, vecset)
! && !alias_set_subset_of (sset, vecset))
!   ref_all = true;
!   }
!   type = build_pointer_type_for_mode (type, ptr_mode, ref_all);
!   *vecp = build1 (ADDR_EXPR, type1, *vecp);
!   *vecp = convert (type, *vecp);
  }
return ret;
  }
--- 12508,12530 
if (VECTOR_TYPE_P (TREE_TYPE (*vecp)))
  {
tree type = TREE_TYPE (*vecp);
  
ret = !lvalue_p (*vecp);
+ 
if (TREE_CODE (index) == INTEGER_CST)
  if (!tree_fits_uhwi_p (index)
  || tree_to_uhwi (index) >= TYPE_VECTOR_SUBPARTS (type))
warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
  
!   /* We are building an ARRAY_REF so mark the vector as addressable
!  to not run into the gimplifiers premature setting of DECL_GIMPLE_REG_P
!for function parameters.  */
!   c_common_mark_addressable_vec (*vecp);
! 
!   *vecp = build1 (VIEW_CONVERT_EXPR,
! build_array_type_nelts (TREE_TYPE (type),
! TYPE_VECTOR_SUBPARTS (type)),
! *vecp);
  }
return ret;
  }
Index: gcc/tree-ssa.c
===
*** gcc/tree-ssa.c.orig 

[Patch wwwdocs] Add aarch64-none-linux-gnu as a primary platform for GCC-7

2016-05-23 Thread Ramana Radhakrishnan
Hi,

The Steering Committee has decided to add aarch64-none-linux-gnu as a primary 
platform for GCC-7. This reflects the increasing popularity of the port and the 
increased general availability of hardware. I also took the opportunity of 
creating a GCC-7 criteria page at the same time.

Applied.

Thanks,
Ramana
Index: htdocs/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.1008
diff -a -u -r1.1008 index.html
--- htdocs/index.html   18 May 2016 12:50:06 -  1.1008
+++ htdocs/index.html   20 May 2016 23:17:54 -
@@ -167,7 +167,7 @@
 
 
 Development:
-  GCC 7.0 (release criteria)
+  GCC 7.0 (release criteria)
 
   Status:
   
--- /dev/null   2016-05-06 10:43:58.436131027 +0100
+++ htdocs/gcc-7/criteria.html  2016-05-21 00:06:34.530921981 +0100
@@ -0,0 +1,149 @@
+
+
+
+GCC 7 Release Criteria 
+
+
+
+GCC 7 Release Criteria
+
+This page provides the release criteria for GCC 7.  
+
+The GCC team (and, in particular, the Release Managers) will attempt
+to meet these criteria before the release of GCC 7.
+
+In all cases, these criteria represent the minimum functionality
+required in order to make the release.  If this level of minimum
+functionality is not provided by a release candidate, then that
+candidate will probably not become the eventual release.  However, a
+release candidate that does meet these criteria may not necessarily
+become the official release; there may be other unforeseen issues that
+prevent release.  For example, if support for the Intel Pentium II is
+required by the release criteria, it is nevertheless unlikely that GCC
+would be released even though it did not support the Intel Pentium.
+
+Because the development of GCC is largely dependent on volunteers,
+the Release Managers and/or Steering Committee may eventually have to
+decide whether to make a release, even if the criteria here are not
+met.  For example, if no volunteer can be found to verify correct
+operation of a particular application program on a particular system,
+then that criterion may be abandoned.
+
+Languages
+
+GCC supports several programming languages, including Ada, C, C++,
+Fortran, Objective-C, Objective-C++, Go and Java.
+For the purposes of making releases,
+however, we will consider primarily C and C++, as those are the
+languages used by the vast majority of users.  Therefore, if, below,
+the criteria indicate, for example, that there should be no DejaGNU
+regressions on a particular platform, that criteria should be read as
+applying only to DejaGNU regressions within the C, C++, and C++
+runtime library testsuites.
+
+Primary and Secondary Platforms
+
+GCC targets a vast number of platforms.  We have classified these
+platforms into three categories: primary, secondary, and tertiary.
+Primary platforms are popular systems, both in the sense that there
+are many such systems in existence and in the sense that GCC is used
+frequently on those systems.  Secondary platforms are also popular
+systems, but are either somewhat less popular than the primary
+systems, or are considered duplicative from a testing perspective.
+All platforms that are neither primary nor secondary are tertiary
+platforms.
+
+Our release criteria for primary platforms is:
+
+
+
+All regressions open in Bugzilla have been analyzed, and all are
+deemed as either unlikely to affect most users, or are determined to
+have a minimal impact on affected users.  For example, a
+typographical error in a diagnostic might be relatively common, but
+also has minimal impact on users.
+
+In general, regressions where the compiler generates incorrect
+code, or refuses to compile a valid program, will be considered to
+be sufficiently severe to block the release, unless there are
+substantial mitigating factors.
+  
+
+The DejaGNU testsuite has been run, and compared with a run of
+the testsuite on the previous release of GCC, and no regressions are
+observed.
+
+
+Our release criteria for the secondary platforms is:
+
+The compiler bootstraps successfully, and the C++ runtime library
+builds.
+
+The DejaGNU testsuite has been run, and a substantial majority of
+the tests pass.
+
+
+There are no release criteria for tertiary platforms.
+
+In general bugs blocking the release are marked with priority P1
+(Maintaining the GCC Bugzilla database).
+
+In contrast to previous releases, we have removed all mention of
+explicit application testing.  It is our experience that, with the
+resources available, it is very difficult to methodically carry out
+such testing. However, we expect that interested users will submit
+bug reports for problems encountered building and using popular
+packages.  Therefore, we do not intend the elimination of application
+testing from our criteria to imply that we will not pay attention to
+application testing.
+
+Primary Platform List
+
+The primary platforms are:
+
+aarch64-none-linux-gnu
+arm-linux-gnueabi

Re: [Patch ARM/AArch64 10/11] Add missing tests for intrinsics operating on poly64 and poly128 types.

2016-05-23 Thread Christophe Lyon
On 13 May 2016 at 17:16, James Greenhalgh  wrote:
> On Wed, May 11, 2016 at 03:24:00PM +0200, Christophe Lyon wrote:
>> 2016-05-02  Christophe Lyon  
>>
>>   * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h (result):
>>   Add poly64x1_t and poly64x2_t cases if supported.
>>   * gcc.target/aarch64/advsimd-intrinsics/compute-ref-data.h
>>   (buffer, buffer_pad, buffer_dup, buffer_dup_pad): Likewise.
>>   * gcc.target/aarch64/advsimd-intrinsics/p64_p128.c: New file.
>>   * gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c: New file.
>>   * gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c: New file.
>>
>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
>> @@ -0,0 +1,665 @@
>> +/* This file contains tests for all the *p64 intrinsics, except for
>> +   vreinterpret which have their own testcase.  */
>> +
>> +/* { dg-require-effective-target arm_crypto_ok } */
>> +/* { dg-add-options arm_crypto } */
>> +
>> +#include 
>> +#include "arm-neon-ref.h"
>> +#include "compute-ref-data.h"
>> +
>> +/* Expected results: vbsl.  */
>> +VECT_VAR_DECL(vbsl_expected,poly,64,1) [] = { 0xfff1 };
>> +VECT_VAR_DECL(vbsl_expected,poly,64,2) [] = { 0xfff1,
>> +   0xfff1 };
>> +
>> +/* Expected results: vceq.  */
>> +VECT_VAR_DECL(vceq_expected,uint,64,1) [] = { 0x0 };
>
> vceqq_p64
> vceqz_p64
> vceqzq_p64
> vtst_p64
> vtstq_p64
>
> are missing, but will not be trivial to add. Could you raise a bug report
> (or fix it if you like :-) )?
>
> This is OK without a fix for those intrinsics with a suitable bug report
> opened.
>

OK, I've opened PR 71233 to track this.

Thanks,

Christophe.

> Thanks,
> James
>


Re: [PATCH] Fix PR70434, change FE IL for vector indexing

2016-05-23 Thread Richard Biener
On Mon, 23 May 2016, Eric Botcazou wrote:

> > The following changes the IL the C family frontends emit for
> > vector indexing from a mix of BIT_FIELD_REF (for constant indices)
> > and pointer arithmetic + dereferences (for variable indices) to
> > a simple ARRAY_REF of the vector view-converted to a corresponding
> > array type.
> 
> FWIW the Ada front-end (gigi) has always done that for its vector_type.

Hmm, so it must have run into the bugfix part of this patch (the
tree-ssa.c hunk).  And it will benefit from the gimple-fold.c hunk.

Richard.


[RFC] [2/2] divmod transform: override expand_divmod_libfunc for ARM and add test-cases

2016-05-23 Thread Prathamesh Kulkarni
Hi,
This patch overrides expand_divmod_libfunc for the ARM port and adds test-cases.
I separated the SImode tests into a separate file from the DImode tests
because certain arm configs (cortex-a15) have a hardware div insn for
SImode but not for DImode,
and for that config we want the SImode tests to be disabled but not the DImode tests.
The patch therefore has two effective-target checks: divmod and divmod_simode.
Cross-tested on arm*-*-*.
Bootstrap+test on arm-linux-gnueabihf in progress.
Does this patch look OK?

Thanks,
Prathamesh
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 12060ba..1310006 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -61,6 +61,7 @@
 #include "builtins.h"
 #include "tm-constrs.h"
 #include "rtl-iter.h"
+#include "optabs-libfuncs.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -300,6 +301,7 @@ static unsigned HOST_WIDE_INT arm_asan_shadow_offset (void);
 static void arm_sched_fusion_priority (rtx_insn *, int, int *, int*);
 static bool arm_can_output_mi_thunk (const_tree, HOST_WIDE_INT, HOST_WIDE_INT,
 const_tree);
+static void arm_expand_divmod_libfunc (bool, machine_mode, rtx, rtx, rtx *, rtx *);
 
 
 /* Table of machine attributes.  */
@@ -730,6 +732,9 @@ static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_SCHED_FUSION_PRIORITY
 #define TARGET_SCHED_FUSION_PRIORITY arm_sched_fusion_priority
 
+#undef TARGET_EXPAND_DIVMOD_LIBFUNC
+#define TARGET_EXPAND_DIVMOD_LIBFUNC arm_expand_divmod_libfunc
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Obstack for minipool constant handling.  */
@@ -30354,6 +30359,37 @@ arm_sched_fusion_priority (rtx_insn *insn, int max_pri,
   return;
 }
 
+/* Expand call to __aeabi_[mode]divmod (op0, op1).  */
+
+static void
+arm_expand_divmod_libfunc (bool unsignedp, machine_mode mode,
+  rtx op0, rtx op1,
+  rtx *quot_p, rtx *rem_p)
+{
+  if (mode == SImode)
+gcc_assert (!TARGET_IDIV);
+
+  optab tab = (unsignedp) ? udivmod_optab : sdivmod_optab;
+  rtx libfunc = optab_libfunc (tab, mode);
+  gcc_assert (libfunc);
+
+  machine_mode libval_mode = smallest_mode_for_size (2 * GET_MODE_BITSIZE (mode),
+MODE_INT);
+
+  rtx libval = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
+   libval_mode, 2,
+   op0, GET_MODE (op0),
+   op1, GET_MODE (op1));
+
+  rtx quotient = simplify_gen_subreg (mode, libval, libval_mode, 0);
+  rtx remainder = simplify_gen_subreg (mode, libval, libval_mode, GET_MODE_SIZE (mode));
+
+  gcc_assert (quotient);
+  gcc_assert (remainder);
+  
+  *quot_p = quotient;
+  *rem_p = remainder;
+}
 
 /* Construct and return a PARALLEL RTX vector with elements numbering the
lanes of either the high (HIGH == TRUE) or low (HIGH == FALSE) half of
diff --git a/gcc/testsuite/gcc.dg/divmod-1-simode.c b/gcc/testsuite/gcc.dg/divmod-1-simode.c
new file mode 100644
index 000..7405f66
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/divmod-1-simode.c
@@ -0,0 +1,22 @@
+/* { dg-require-effective-target divmod_simode } */
+/* { dg-options "-O2 -fdump-tree-widening_mul-details" } */
+/* div dominates mod.  */
+
+extern int cond;
+void foo(void);
+
+#define FOO(smalltype, bigtype, no)  \
+bigtype f_##no(smalltype x, bigtype y) \
+{   \
+  bigtype q = x / y; \
+  if (cond)  \
+foo ();  \
+  bigtype r = x % y; \
+  return q + r;  \
+}
+
+FOO(int, int, 1)
+FOO(int, unsigned, 2)
+FOO(unsigned, unsigned, 5)
+
+/* { dg-final { scan-tree-dump-times "DIVMOD" 3 "widening_mul" } } */
diff --git a/gcc/testsuite/gcc.dg/divmod-1.c b/gcc/testsuite/gcc.dg/divmod-1.c
new file mode 100644
index 000..40aec74
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/divmod-1.c
@@ -0,0 +1,26 @@
+/* { dg-require-effective-target divmod } */
+/* { dg-options "-O2 -fdump-tree-widening_mul-details" } */
+/* div dominates mod.  */
+
+extern int cond;
+void foo(void);
+
+#define FOO(smalltype, bigtype, no) \
+bigtype f_##no(smalltype x, bigtype y)   \
+{   \
+  bigtype q = x / y; \
+  if (cond)  \
+foo ();  \
+  bigtype r = x % y; \
+  return q + r;  \
+}
+
+FOO(int, long long, 3)
+FOO(int, unsigned long long, 4)
+FOO(unsigned, long long, 6)
+FOO(unsigned, unsigned long long, 7)
+FOO(long long, long long, 8)
+FOO(long long, unsigned long long, 9)
+FOO(unsigned long long, unsigned long long, 10)
+
+/* { dg-final { scan-tree-dump-times "DIVMOD" 7 "widening_mul" } } */
diff --git a/gcc/testsuite/gcc.dg/divmod-2-simode.c 

RFC [1/2] divmod transform

2016-05-23 Thread Prathamesh Kulkarni
Hi,
I have updated my patch for divmod (attached), which was originally
based on Kugan's patch.
The patch transforms stmts with code TRUNC_DIV_EXPR and TRUNC_MOD_EXPR
having the same operands into a divmod representation, so we can CSE the
computation of mod.

t1 = a TRUNC_DIV_EXPR b;
t2 = a TRUNC_MOD_EXPR b;
is transformed to:
complex_tmp = DIVMOD (a, b);
t1 = REALPART_EXPR (complex_tmp);
t2 = IMAGPART_EXPR (complex_tmp);
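
At the source level the effect can be sketched roughly as follows. This is
only an analogy for the GIMPLE transformation above: `struct divmod_pair` and
the helper `divmod` are hypothetical stand-ins for the complex-typed result
of DIVMOD, and the two fields play the roles of REALPART_EXPR and
IMAGPART_EXPR.

```c
#include <assert.h>

/* Hypothetical stand-in for the two-valued result of DIVMOD.  */
struct divmod_pair
{
  int quot;  /* Plays the role of REALPART_EXPR (complex_tmp).  */
  int rem;   /* Plays the role of IMAGPART_EXPR (complex_tmp).  */
};

/* Compute quotient and remainder in one place.  */
static struct divmod_pair
divmod (int a, int b)
{
  assert (b != 0);
  struct divmod_pair r = { a / b, a % b };
  return r;
}

/* Before: division and modulo are two separate operations.  */
int
sum_separate (int a, int b)
{
  int t1 = a / b;
  int t2 = a % b;
  return t1 + t2;
}

/* After: one combined operation, with both results read from it.  */
int
sum_combined (int a, int b)
{
  struct divmod_pair c = divmod (a, b);
  int t1 = c.quot;
  int t2 = c.rem;
  return t1 + t2;
}
```

The two `sum_*` functions are meant to be observably equivalent; only the
number of division operations differs.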

* New hook expand_divmod_libfunc
The rationale for introducing the hook is that different targets have
incompatible calling conventions for divmod libfunc.
Currently three ports define divmod libfunc: c6x, spu and arm.
c6x and spu follow the convention of libgcc2.c:__udivmoddi4:
return quotient and store remainder in argument passed as pointer,
while the arm version takes two arguments and returns both
quotient and remainder having mode double the size of the operand mode.
The port should hence override the hook expand_divmod_libfunc
to generate a call to the target-specific divmod.
Ports should define this hook if:
a) The port does not have divmod or div insn for the given mode.
b) The port defines divmod libfunc for the given mode.
The default hook default_expand_divmod_libfunc() generates call
to libgcc2.c:__udivmoddi4 provided the operands are unsigned and
are of DImode.
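
The two calling conventions described above can be modelled in plain C as
follows. This is a sketch under assumptions, not the real library routines:
the function names are invented, 32-bit operands are used for brevity, and
the packed layout (quotient in the low half, remainder in the high half) is
assumed to mirror the subreg offsets used in the ARM patch on a little-endian
target.

```c
#include <assert.h>
#include <stdint.h>

/* Convention 1 (in the style of libgcc2.c:__udivmoddi4): return the
   quotient and store the remainder through a pointer argument.  */
uint32_t
udivmod_via_pointer (uint32_t n, uint32_t d, uint32_t *rem)
{
  assert (d != 0);
  if (rem)
    *rem = n % d;
  return n / d;
}

/* Convention 2 (in the style described for arm): return both results
   in a value twice as wide as the operands.  Assumed layout: quotient
   in the low half, remainder in the high half.  */
uint64_t
udivmod_packed (uint32_t n, uint32_t d)
{
  assert (d != 0);
  return (uint64_t) (n / d) | ((uint64_t) (n % d) << 32);
}

/* Adapter in the spirit of the hook: hide the convention behind a
   common (quotient, remainder) interface.  */
void
expand_packed (uint32_t n, uint32_t d, uint32_t *quot, uint32_t *rem)
{
  uint64_t v = udivmod_packed (n, d);
  *quot = (uint32_t) v;
  *rem = (uint32_t) (v >> 32);
}
```

A caller that only sees `expand_packed` does not need to know which
convention the underlying routine follows, which is the reason a per-target
hook is proposed.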

Patch passes bootstrap+test on x86_64-unknown-linux-gnu and
cross-tested on arm*-*-*.
Bootstrap+test in progress on arm-linux-gnueabihf.
Does this patch look OK?

Thanks,
Prathamesh
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8c7f2a1..111f19f 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6963,6 +6963,12 @@ This is firstly introduced on ARM/AArch64 targets, please refer to
 the hook implementation for how different fusion types are supported.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_EXPAND_DIVMOD_LIBFUNC (bool @var{unsignedp}, machine_mode @var{mode}, @var{rtx}, @var{rtx}, rtx *@var{quot}, rtx *@var{rem})
+Define this hook if the port does not have hardware div and divmod insn for
+the given mode but has divmod libfunc, which is incompatible
+with libgcc2.c:__udivmoddi4
+@end deftypefn
+
 @node Sections
 @section Dividing the Output into Sections (Texts, Data, @dots{})
 @c the above section title is WAY too long.  maybe cut the part between
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index f963a58..2c9a800 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4848,6 +4848,8 @@ them: try the first ones in this list first.
 
 @hook TARGET_SCHED_FUSION_PRIORITY
 
+@hook TARGET_EXPAND_DIVMOD_LIBFUNC
+
 @node Sections
 @section Dividing the Output into Sections (Texts, Data, @dots{})
 @c the above section title is WAY too long.  maybe cut the part between
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index c867ddc..0cb59f7 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2276,6 +2276,48 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_mask_store_optab_supported_p direct_optab_supported_p
 #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
 
+/* Expand DIVMOD() using:
+ a) optab handler for udivmod/sdivmod if it is available.
+ b) If optab_handler doesn't exist, Generate call to
+optab_libfunc for udivmod/sdivmod.  */
+
+static void 
+expand_DIVMOD (internal_fn, gcall *stmt)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree arg0 = gimple_call_arg (stmt, 0);
+  tree arg1 = gimple_call_arg (stmt, 1);
+
+  gcc_assert (TREE_CODE (TREE_TYPE (lhs)) == COMPLEX_TYPE);
+  tree type = TREE_TYPE (TREE_TYPE (lhs));
+  machine_mode mode = TYPE_MODE (type);
+  bool unsignedp = TYPE_UNSIGNED (type);
+  optab tab = (unsignedp) ? udivmod_optab : sdivmod_optab;
+
+  rtx op0 = expand_normal (arg0);
+  rtx op1 = expand_normal (arg1);
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+  rtx quotient, remainder;
+
+  /* Check if optab handler exists for [u]divmod.  */
+  if (optab_handler (tab, mode) != CODE_FOR_nothing)
+{
+  quotient = gen_reg_rtx (mode);
+  remainder = gen_reg_rtx (mode);
+  expand_twoval_binop (tab, op0, op1, quotient, remainder, unsignedp);
+}
+  else
+targetm.expand_divmod_libfunc (unsignedp, mode, op0, op1,
+  &quotient, &remainder);
+
+  /* Wrap the return value (quotient, remainder) within COMPLEX_EXPR.  */
+  expand_expr (build2 (COMPLEX_EXPR, TREE_TYPE (lhs),
+  make_tree (TREE_TYPE (arg0), quotient),
+  make_tree (TREE_TYPE (arg1), remainder)),
+  target, VOIDmode, EXPAND_NORMAL);
+}
+
 /* Return true if FN is supported for the types in TYPES when the
optimization type is OPT_TYPE.  The types are those associated with
the "type0" and "type1" fields of FN's direct_internal_fn_info
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index e729d85..56a80f1 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -194,6 +194,9 @@ DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_SET, 

Re: Revert gcc r227962

2016-05-23 Thread JonY
On 5/20/2016 06:36, JonY wrote:
> On 5/20/2016 02:11, Jeff Law wrote:
>> So if we make this change (revert 227962), my understanding is that
>> cygwin bootstraps will fail because they won't find kernel32 and perhaps
>> other libraries.
>>
>> Jeff
>>
> 
> I'll need to double check with trunk but gcc-5.3.0 built OK without it.
> The other alternative is to search /usr/lib before w32api.
> 
> 

Yep, it reached stage 3 but failed with another error while building target
libraries (libcilkrts), meaning it was able to find the w32api libraries
even with this patch reverted.






Re: [PATCH][wwwdocs] Improve arm and aarch64-related info in readings.html

2016-05-23 Thread James Greenhalgh
On Thu, May 19, 2016 at 04:27:31PM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> I noticed that we have a readings.html page that has pointers to
> documentation of various backends that GCC supports.  The info on arm seems a
> bit out of date and somewhat confusing, and there is no entry for aarch64.
> This patch tries to address that.
> 
> The arm entry is updated to not mention armv2(?) and thumb and an aarch64
> entry is added with a link to the ARM documentation.
> 
> Ok to commit?

The AArch64 part looks fine to me, but a paragraph introducing what is
meant by AArch64 ("the 64-bit execution state introduced in the ARMv8-A
architecture profile" - or similar) would be a useful addition. Just
sending our users straight to the full repository of information provided
by ARM may give them too wide a set of readings :).

Thanks,
James

> ? readings.html~
> Index: readings.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v
> retrieving revision 1.242
> diff -U 3 -r1.242 readings.html
> --- readings.html 14 Nov 2015 23:40:21 -  1.242
> +++ readings.html 15 Feb 2016 13:29:03 -
> @@ -59,6 +59,11 @@
>  
>  
>  
> + AArch64
> +  http://infocenter.arm.com/help/index.jsp;>
> + ARM Documentation
> + 
> +
>   alpha
> Manufacturer: Compaq (DEC)
>  href="http://www.tru64unix.compaq.com/docs/base_doc/DOCUMENTATION/V51A_HTML/ARH9MBTE/TITLE.HTM;>Calling
> @@ -81,12 +86,12 @@
> href="http://www.synopsys.com/IP/PROCESSORIP/ARCPROCESSORS/Pages/default.aspx;>ARC
>  Documentation
>   
>  
> - arm (armv2, thumb)
> -  Manufacturer: Various, by license from ARM
> + ARM
> +  Manufacturer: Various, by license from ARM.
>CPUs include: ARM7 and ARM7T series (eg. ARM7TDMI), ARM9 and 
> StrongARM
>http://infocenter.arm.com/help/index.jsp;>ARM 
> Documentation
>   
> - 
> +
>   AVR
>Manufacturer: Atmel
>http://www.atmel.com/products/microcontrollers/avr/;>AVR 
> Documentation



Re: [PATCH] Fix PR70434, change FE IL for vector indexing

2016-05-23 Thread Eric Botcazou
> The following changes the IL the C family frontends emit for
> vector indexing from a mix of BIT_FIELD_REF (for constant indices)
> and pointer arithmetic + dereferences (for variable indices) to
> a simple ARRAY_REF of the vector view-converted to a corresponding
> array type.

FWIW the Ada front-end (gigi) has always done that for its vector_type.

-- 
Eric Botcazou
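
For readers unfamiliar with vector subscripting, the following sketch shows
the source-level construct under discussion, using the GCC/Clang
`vector_size` extension. The second function is only a source-level analogy
for "an ARRAY_REF of the vector view-converted to an array type": it copies
the vector's bytes into an `int[4]` and indexes that. The function names are
invented for the example, and a 4-byte `int` is assumed.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extension: four ints in a 16-byte vector
   (assumes a 4-byte int).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Direct subscripting with a variable index; this is the construct
   whose front-end representation the patch changes.  */
int
lane_by_subscript (v4si v, int i)
{
  assert (i >= 0 && i < 4);
  return v[i];
}

/* Analogy for indexing the vector viewed as an array: reinterpret
   the same bytes as int[4] and index the array.  */
int
lane_by_array_view (v4si v, int i)
{
  assert (i >= 0 && i < 4);
  int a[4];
  memcpy (a, &v, sizeof a);
  return a[i];
}
```

Both functions should return the same lane for any in-range index, whether
the index is a constant or a runtime value.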


Re: Clean up PURE_SLP_STMT handling

2016-05-23 Thread Richard Biener
On Fri, May 20, 2016 at 5:30 PM, Richard Sandiford
 wrote:
> The vectorizable_* routines had many instances of:
>
> slp_node || PURE_SLP_STMT (stmt_info)
>
> which gives the misleading impression that we can have
> !slp_node && PURE_SLP_STMT (stmt_info).  In this context
> it's really enough to test slp_node on its own.

Yeah, that's a cleanup opportunity since I changed the code last year
to always pass down slp_node in all phases.

> There are three cases:
>
>   loop vectorisation only:
> vectorizable_foo called only with !slp_node
>
>   pure SLP:
> vectorizable_foo called only with slp_node
>
>   hybrid SLP:
> (e.g. a vector that's used in SLP statements and also in a reduction)
> - vectorizable_foo called once with slp_node for the SLP uses.
> - vectorizable_foo called once with !slp_node for the non-SLP uses.
>
> Hybrid SLP isn't possible for stores, so I added an explicit assert
> for that.
>
> I also made vectorizable_comparison static, to make it obvious that
> no other callers outside tree-vect-stmts.c could use it with the
> !slp && PURE_SLP_STMT combination.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.
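
The reasoning behind the cleanup can be checked with a small truth-table
sketch. It models `slp_node` and PURE_SLP_STMT as plain booleans (an
assumption made purely for illustration; the function names are invented) and
relies on the invariant the patch asserts in vect_transform_stmt: a pure SLP
stmt always comes with an slp_node.

```c
#include <assert.h>
#include <stdbool.h>

/* Old form of the test, with the invariant made explicit.  */
bool
use_single_copy_old (bool have_slp_node, bool pure_slp)
{
  /* Invariant: PURE_SLP_STMT implies that an slp_node was passed.  */
  assert (!pure_slp || have_slp_node);
  return have_slp_node || pure_slp;
}

/* New form of the test: slp_node on its own.  */
bool
use_single_copy_new (bool have_slp_node)
{
  return have_slp_node;
}

/* Return true if both forms agree in every state the invariant allows.  */
bool
tests_agree_under_invariant (void)
{
  for (int node = 0; node <= 1; ++node)
    for (int pure = 0; pure <= 1; ++pure)
      {
	if (pure && !node)
	  continue;  /* Excluded: pure SLP always has an slp_node.  */
	if (use_single_copy_old (node, pure) != use_single_copy_new (node))
	  return false;
      }
  return true;
}
```

The only state in which the two forms could differ is the excluded one, which
is why dropping the PURE_SLP_STMT check does not change behaviour.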

> Thanks,
> Richard
>
>
> gcc/
> * tree-vectorizer.h (vectorizable_comparison): Delete.
> * tree-vect-loop.c (vectorizable_reduction): Remove redundant
> PURE_SLP_STMT check.
> * tree-vect-stmts.c (vectorizable_call): Likewise.
> (vectorizable_simd_clone_call): Likewise.
> (vectorizable_conversion): Likewise.
> (vectorizable_assignment): Likewise.
> (vectorizable_shift): Likewise.
> (vectorizable_operation): Likewise.
> (vectorizable_load): Likewise.
> (vectorizable_condition): Likewise.
> (vectorizable_store): Likewise.  Assert that we don't have
> hybrid SLP.
> (vectorizable_comparison): Make static.  Remove redundant
> PURE_SLP_STMT check.
> (vect_transform_stmt): Assert that we always have an slp_node
> if PURE_SLP_STMT.
>
> Index: gcc/tree-vectorizer.h
> ===
> --- gcc/tree-vectorizer.h
> +++ gcc/tree-vectorizer.h
> @@ -1004,8 +1004,6 @@ extern void vect_remove_stores (gimple *);
>  extern bool vect_analyze_stmt (gimple *, bool *, slp_tree);
>  extern bool vectorizable_condition (gimple *, gimple_stmt_iterator *,
> gimple **, tree, int, slp_tree);
> -extern bool vectorizable_comparison (gimple *, gimple_stmt_iterator *,
> -gimple **, tree, int, slp_tree);
>  extern void vect_get_load_cost (struct data_reference *, int, bool,
> unsigned int *, unsigned int *,
> stmt_vector_for_cost *,
> Index: gcc/tree-vect-loop.c
> ===
> --- gcc/tree-vect-loop.c
> +++ gcc/tree-vect-loop.c
> @@ -5594,7 +5594,7 @@ vectorizable_reduction (gimple *stmt, gimple_stmt_iterator *gsi,
>if (STMT_VINFO_LIVE_P (vinfo_for_stmt (reduc_def_stmt)))
>  return false;
>
> -  if (slp_node || PURE_SLP_STMT (stmt_info))
> +  if (slp_node)
>  ncopies = 1;
>else
>  ncopies = (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
> Index: gcc/tree-vect-stmts.c
> ===
> --- gcc/tree-vect-stmts.c
> +++ gcc/tree-vect-stmts.c
> @@ -2342,7 +2342,7 @@ vectorizable_call (gimple *gs, gimple_stmt_iterator *gsi, gimple **vec_stmt,
> }
>  }
>
> -  if (slp_node || PURE_SLP_STMT (stmt_info))
> +  if (slp_node)
>  ncopies = 1;
>else if (modifier == NARROW && ifn == IFN_LAST)
>  ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_out;
> @@ -2792,7 +2792,7 @@ vectorizable_simd_clone_call (gimple *stmt, gimple_stmt_iterator *gsi,
>  return false;
>
>/* FORNOW */
> -  if (slp_node || PURE_SLP_STMT (stmt_info))
> +  if (slp_node)
>  return false;
>
>/* Process function arguments.  */
> @@ -3761,7 +3761,7 @@ vectorizable_conversion (gimple *stmt, gimple_stmt_iterator *gsi,
>/* Multiple types in SLP are handled by creating the appropriate number of
>   vectorized stmts for each SLP node.  Hence, NCOPIES is always 1 in
>   case of SLP.  */
> -  if (slp_node || PURE_SLP_STMT (stmt_info))
> +  if (slp_node)
>  ncopies = 1;
>else if (modifier == NARROW)
>  ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_out;
> @@ -4242,7 +4242,7 @@ vectorizable_assignment (gimple *stmt, gimple_stmt_iterator *gsi,
>/* Multiple types in SLP are handled by creating the appropriate number of
>   vectorized stmts for each SLP node.  Hence, NCOPIES is always 1 in
>   case of SLP.  */
> -  if (slp_node || PURE_SLP_STMT (stmt_info))
> +  if (slp_node)
>  ncopies = 1;
>else
>  ncopies = LOOP_VINFO_VECT_FACTOR 
