Improve uncprop and coalescing

2013-06-06 Thread Jeff Law


I stumbled over this while looking at regressions triggered when moving 
certain branch-cost driven transformations from fold-const.c to a later 
point in the pipeline.


The coalescing we do as part of the out-of-ssa process restricts itself
to only coalescing when the types of the underlying objects are the
same, where "same" is currently defined as pointer equality on the type.


So if we have a copy or PHI node where the source and destination have
equivalent but distinct types, we will not currently coalesce the
source and destination.


We also have a pass which un-propagates certain equivalences appearing
as PHI args when the equivalence is derived from conditionals.  This
un-propagation pass is primarily meant to improve coalescing and
ultimately eliminate silly constant initializations and useless copies.
That pass also used pointer equality of types when determining if
unpropagation was profitable.
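
To make the un-propagation case concrete, here is a small C function of the kind such a pass targets. It is an illustrative sketch only: it is not taken from the patch or its testcase, and the name select_or_zero is made up.

```c
/* Illustrative input only (hypothetical, not from the patch).

   In SSA form the result is roughly  r = PHI <0, y>.  On the edge
   where the constant 0 flows in, the condition "x == 0" is known to
   hold, so the PHI argument 0 is equivalent to x there.  Rewriting
   the argument to x gives  r = PHI <x, y>, which lets the out-of-ssa
   coalescer try to put x and r in the same partition instead of
   emitting a separate "r = 0" initialization.  */
int
select_or_zero (int x, int y)
{
  int r;

  if (x == 0)
    r = 0;      /* Constant PHI argument, equivalent to x on this edge.  */
  else
    r = y;
  return r;
}
```

Whether the rewritten PHI argument actually coalesces still depends on the type check discussed in this message.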


Rather than using strict pointer equality, we can do better by looking
at TYPE_CANONICAL when it's available.  Thus objects of the following
two types (t1 and t2) become candidates for coalescing if they are tied
together by a copy or PHI node.


typedef int t1;
typedef int t2;
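
The difference between the two rules can be sketched with a toy model. This is not GCC code: struct toy_type and both helper functions are hypothetical stand-ins, and the body of the real gimple_can_coalesce_p is not visible in the portion of the patch quoted here, so it may check more than this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a type node; NOT GCC's real tree structure.  */
struct toy_type
{
  const char *name;
  const struct toy_type *canonical;   /* NULL when not available.  */
};

/* Old rule: both objects must share the very same type node.  */
static bool
same_type_strict (const struct toy_type *a, const struct toy_type *b)
{
  return a == b;
}

/* Relaxed rule: compare canonical types when both are available,
   otherwise fall back to pointer equality.  */
static bool
same_type_canonical (const struct toy_type *a, const struct toy_type *b)
{
  if (a == b)
    return true;
  if (a->canonical != NULL && b->canonical != NULL)
    return a->canonical == b->canonical;
  return false;
}

/* "int" is its own canonical type; t1 and t2 are distinct nodes that
   both point back to it, mirroring the two typedefs.  */
static const struct toy_type toy_int = { "int", &toy_int };
static const struct toy_type toy_t1 = { "t1", &toy_int };
static const struct toy_type toy_t2 = { "t2", &toy_int };
```

Under the strict rule toy_t1 and toy_t2 are different types; under the canonical rule they compare equal, which is the situation described for t1 and t2.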


This typically eliminates unnecessary copies and constant initializations,
which is good.  The only regression I saw when reviewing the generated
code & dumps was a case where we got different register allocation on
one path which in turn inhibited a tail merging opportunity.




Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?



* gimple.h (gimple_can_coalesce_p): Prototype.
* tree-ssa-coalesce.c (gimple_can_coalesce_p): New function.
(create_outofssa_var_map, coalesce_partitions): Use it.
* tree-ssa-uncprop.c (uncprop_into_successor_phis): Similarly. 
* tree-ssa-live.c (var_map_base_init): Use TYPE_CANONICAL
if it's available.

* gcc.dg/tree-ssa/coalesce-1.c: New test.


diff --git a/gcc/gimple.h b/gcc/gimple.h
index b4de403..8ae07c9 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1101,6 +1101,9 @@ extern tree tree_ssa_strip_useless_type_conversions (tree);
 extern bool useless_type_conversion_p (tree, tree);
 extern bool types_compatible_p (tree, tree);
 
+/* In tree-ssa-coalesce.c */
+extern bool gimple_can_coalesce_p (tree, tree);
+
 /* Return the first node in GIMPLE sequence S.  */
 
 static inline gimple_seq_node
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 354b5f1..42bea5d 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -943,8 +943,7 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
continue;
 
  register_ssa_partition (map, arg);
- if ((SSA_NAME_VAR (arg) == SSA_NAME_VAR (res)
-  && TREE_TYPE (arg) == TREE_TYPE (res))
+ if (gimple_can_coalesce_p (arg, res)
  || (e->flags & EDGE_ABNORMAL))
{
  saw_copy = true;
@@ -985,8 +984,7 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
if (gimple_assign_copy_p (stmt)
 && TREE_CODE (lhs) == SSA_NAME
&& TREE_CODE (rhs1) == SSA_NAME
-   && SSA_NAME_VAR (lhs) == SSA_NAME_VAR (rhs1)
-   && TREE_TYPE (lhs) == TREE_TYPE (rhs1))
+   && gimple_can_coalesce_p (lhs, rhs1))
  {
v1 = SSA_NAME_VERSION (lhs);
v2 = SSA_NAME_VERSION (rhs1);
@@ -1037,8 +1035,7 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
v1 = SSA_NAME_VERSION (outputs[match]);
v2 = SSA_NAME_VERSION (input);
 
-   if (SSA_NAME_VAR (outputs[match]) == SSA_NAME_VAR (input)
-   && TREE_TYPE (outputs[match]) == TREE_TYPE (input))
+   if (gimple_can_coalesce_p (outputs[match], input))
  {
cost = coalesce_cost (REG_BR_PROB_BASE,
  optimize_bb_for_size_p (bb));
@@ -1072,8 +1069,7 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
first = var;
  else
{
- gcc_assert (SSA_NAME_VAR (var) == SSA_NAME_VAR (first)
- && TREE_TYPE (var) == TREE_TYPE (first));
+ gcc_assert (gimple_can_coalesce_p (var, first));
  v1 = SSA_NAME_VERSION (first);
  v2 = SSA_NAME_VERSION (var);
  bitmap_set_bit (used_in_copy, v1);
@@ -1210,8 +1206,7 @@ coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl,
   var2 = ssa_name (y);
 
   /* Assert the coalesces have the same base variable.  */
-  gcc_assert (SSA_NAME_VAR (var1) == SSA_NAME_VAR (var2)
- && TREE_TYPE (var1) == TREE_TYPE (var2

Re: [GOOGLE] More strict checking for call args

2013-06-06 Thread Xinliang David Li
This one should be submitted and discussed in trunk.

thanks,

David

On Thu, Jun 6, 2013 at 9:39 PM, Dehao Chen wrote:
> I've prepared a patch to check args for indirect call value profiling.
>
> Testing on-going. Is it ok for trunk if testing is good?
>
> Thanks,
> Dehao


Re: [PATCH] PR55033

2013-06-06 Thread Alan Modra
On Tue, Apr 02, 2013 at 02:05:13PM +1030, Alan Modra wrote:
> suspicious.  For instance, you might wonder why it is correct to have
>   if (decl && !DECL_P (decl))
> decl = NULL_TREE;
> before calling get_section().  The answer is that get_section() is not
> prepared to handle !DECL_P trees when reporting errors.  Arguably it
> should be modified to do that.

This patch tidies error reporting in get_section(), removing
nonsensical messages like "foo causes a section type conflict with
foo", which was one of the things we hit prior to my pr55033 fix.
I'm moving the DECL_P check into the error handling code.  That seems like
a win since decl and sect->named.decl are only used on the error path.
Bootstrapped and regression tested powerpc64-linux.  OK for mainline?

* varasm.c (get_section): Don't die on !DECL_P decl.  Tidy error
reporting.
(get_named_section): Don't NULL !DECL_P decl.
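
The message selection that the patch introduces can be sketched as a small decision function. This is a toy model that mirrors the branch structure of the patched error path: struct toy_node, usable_decl, enum conflict_msg and pick_conflict_msg are hypothetical names, with is_decl standing in for DECL_P.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a tree node; is_decl models DECL_P.  */
struct toy_node
{
  bool is_decl;
  const char *name;
};

enum conflict_msg
{
  MSG_DECL_WITH_DECL,   /* "%+D causes a section type conflict with %D" */
  MSG_WITH_DECL,        /* "section type conflict with %D" */
  MSG_DECL_ONLY,        /* "%+D causes a section type conflict" */
  MSG_BARE              /* "section type conflict" */
};

/* True when N can safely be printed as a declaration.  */
static bool
usable_decl (const struct toy_node *n)
{
  return n != NULL && n->is_decl;
}

/* Pick the diagnostic for DECL conflicting with the section's
   recorded declaration SECT_DECL.  */
static enum conflict_msg
pick_conflict_msg (const struct toy_node *decl,
                   const struct toy_node *sect_decl)
{
  if (usable_decl (sect_decl) && decl != sect_decl)
    return usable_decl (decl) ? MSG_DECL_WITH_DECL : MSG_WITH_DECL;
  if (usable_decl (decl))
    return MSG_DECL_ONLY;
  return MSG_BARE;
}
```

In this model, passing the same node for both arguments yields MSG_DECL_ONLY, so the "foo ... conflict with foo" wording cannot be produced.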

Index: gcc/varasm.c
===
--- gcc/varasm.c(revision 199718)
+++ gcc/varasm.c(working copy)
@@ -307,19 +307,22 @@ get_section (const char *name, unsigned int flags,
  return sect;
}
  /* Sanity check user variables for flag changes.  */
- if (decl == 0)
-   decl = sect->named.decl;
- gcc_assert (decl);
- if (sect->named.decl == NULL)
+ if (sect->named.decl != NULL
+ && DECL_P (sect->named.decl)
+ && decl != sect->named.decl)
+   {
+ if (decl != NULL && DECL_P (decl))
+   error ("%+D causes a section type conflict with %D",
+  decl, sect->named.decl);
+ else
+   error ("section type conflict with %D", sect->named.decl);
+ inform (DECL_SOURCE_LOCATION (sect->named.decl),
+ "%qD was declared here", sect->named.decl);
+   }
+ else if (decl != NULL && DECL_P (decl))
error ("%+D causes a section type conflict", decl);
  else
-   {
- error ("%+D causes a section type conflict with %D",
-decl, sect->named.decl);
- if (decl != sect->named.decl)
-   inform (DECL_SOURCE_LOCATION (sect->named.decl),
-   "%qD was declared here", sect->named.decl);
-   }
+   error ("section type conflict");
  /* Make sure we don't error about one section multiple times.  */
  sect->common.flags |= SECTION_OVERRIDE;
}
@@ -409,9 +412,6 @@ get_named_section (tree decl, const char *name, in
 }
 
   flags = targetm.section_type_flags (decl, name, reloc);
-
-  if (decl && !DECL_P (decl))
-decl = NULL_TREE;
   return get_section (name, flags, decl);
 }
 


-- 
Alan Modra
Australia Development Lab, IBM


Re: [GOOGLE] More strict checking for call args

2013-06-06 Thread Dehao Chen
I've prepared a patch to check args for indirect call value profiling.

Testing on-going. Is it ok for trunk if testing is good?

Thanks,
Dehao

gcc/ChangeLog:
2013-06-06  Dehao Chen  

* tree-flow.h (gimple_check_call_matching_types): Add new argument.
* gimple-low.c (gimple_check_call_matching_types): Likewise.
(gimple_check_call_args): Likewise.
* value-prof.c (check_ic_target): Likewise.
* ipa-inline.c (early_inliner): Likewise.
* ipa-prop.c (update_indirect_edges_after_inlining): Likewise.
* cgraph.c (cgraph_create_edge_1): Likewise.
(cgraph_make_edge_direct): Likewise.
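
The effect of the new argument can be sketched with a toy checker. This is not the GCC function: toy_check_call_args is a hypothetical name, types are modelled as plain integers, and the tolerance for excess call arguments is an assumption, since that part of the function is not visible in the quoted hunk.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: compare NARGS call argument types against NPARMS
   declared parameter types.  When ARGS_COUNT_MATCH is true, declared
   parameters left over after the last argument make the call a
   mismatch; when it is false they are ignored.  */
static bool
toy_check_call_args (const int *arg_types, size_t nargs,
                     const int *parm_types, size_t nparms,
                     bool args_count_match)
{
  size_t i;

  for (i = 0; i < nargs; i++)
    {
      /* Assumption: excess arguments are tolerated here, because they
         cannot be told apart from a varargs call.  */
      if (i == nparms)
        break;
      if (arg_types[i] != parm_types[i])
        return false;
    }

  /* I parameters were consumed; anything left means too few args.  */
  if (args_count_match && i < nparms)
    return false;
  return true;
}
```

A caller such as the indirect-call promotion check would pass true for the last argument, while callers that only care about type compatibility would pass false.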



On Thu, Jun 6, 2013 at 9:14 AM, Xinliang David Li wrote:
> gimple_check_call_matching_types is called by check_ic_target. Instead
> of moving the check out of this guard function, maybe enhance the
> interface to allow it to guard with different strictness?
>
> David
>
> On Thu, Jun 6, 2013 at 8:10 AM, Dehao Chen  wrote:
>> Hi, Martin,
>>
>> Yes, your patch can fix my case. Thanks a lot for the fix.
>>
>> With the fix, value profiling will still promote the wrong indirect
>> call target. Though it will not be inlined, it still results in an
>> additional check. How about in check_ic_target, after calling
>> gimple_check_call_matching_types, we also check whether the number of
>> args matches the number of params in target->symbol.decl?
>>
>> Thanks,
>> Dehao
>>
>>
>> On Thu, Jun 6, 2013 at 7:11 AM, Martin Jambor  wrote:
>>>
>>> Hi,
>>>
>>> On Tue, Jun 04, 2013 at 05:19:02PM -0700, Dehao Chen wrote:
>>> > attached is a testcase that would cause problem when source has changed:
>>> >
>>> > $ g++ test.cc -O2 -fprofile-generate -DOLD
>>> > $ ./a.out
>>> > $ g++ test.cc -O2 -fprofile-use
>>> > test.cc:34:1: internal compiler error: in operator[], at vec.h:815
>>> >  }
>>> >  ^
>>> > 0x512740 vec::operator[](unsigned int)
>>> > ../../gcc/vec.h:815
>>> > 0x512740 vec::operator[](unsigned int)
>>> > ../../gcc/vec.h:1244
>>> > 0xf24464 vec::operator[](unsigned int)
>>> > ../../gcc/vec.h:815
>>> > 0xf24464 vec::operator[](unsigned int)
>>> > ../../gcc/vec.h:1244
>>> > 0xf24464 ipa_get_indirect_edge_target_1
>>> > ../../gcc/ipa-cp.c:1535
>>> > 0x971b9a estimate_edge_devirt_benefit
>>> > ../../gcc/ipa-inline-analysis.c:2757
>>>
>>> Hm, this seems rather like an omission in ipa_get_indirect_edge_target_1.
>>> Since it is called also from inlining, we can have parameter count
>>> mismatches... and in fact in non-virtual paths of that function we do
>>> check that we don't.  Because all callers have to pass known_vals
>>> describing all formal parameters of the inline tree root, we should
>>> apply the fix below (I've only just started running a bootstrap and
>>> testsuite on x86_64, though).
>>>
>>> OTOH, while I understand that FDO can change inlining sufficiently so
>>> that this error occurs, IMHO this should not be caused by outdated
>>> profiles but by a parameter mismatch somewhere in the source.
>>>
>>> Dehao, can you please check that this patch helps?
>>>
>>> Richi, if it does and the patch passes bootstrap and tests, is it OK
>>> for trunk and 4.8 branch?
>>>
>>> Thanks and sorry for the trouble,
>>>
>>> Martin
>>>
>>>
>>> 2013-06-06  Martin Jambor  
>>>
>>> * ipa-cp.c (ipa_get_indirect_edge_target_1): Check that param_index is
>>> within bounds at the beginning of the function.
>>>
>>> Index: src/gcc/ipa-cp.c
>>> ===
>>> --- src.orig/gcc/ipa-cp.c
>>> +++ src/gcc/ipa-cp.c
>>> @@ -1481,7 +1481,8 @@ ipa_get_indirect_edge_target_1 (struct c
>>>tree otr_type;
>>>tree t;
>>>
>>> -  if (param_index == -1)
>>> +  if (param_index == -1
>>> +  || known_vals.length () <= (unsigned int) param_index)
>>>  return NULL_TREE;
>>>
>>>if (!ie->indirect_info->polymorphic)
>>> @@ -1516,8 +1517,7 @@ ipa_get_indirect_edge_target_1 (struct c
>>> t = NULL;
>>> }
>>>else
>>> -   t = (known_vals.length () > (unsigned int) param_index
>>> -? known_vals[param_index] : NULL);
>>> +   t = NULL;
>>>
>>>if (t &&
>>>   TREE_CODE (t) == ADDR_EXPR
Index: gcc/gimple-low.c
===
--- gcc/gimple-low.c(revision 199780)
+++ gcc/gimple-low.c(working copy)
@@ -204,7 +204,7 @@ struct gimple_opt_pass pass_lower_cf =
return false.  */
 
 static bool
-gimple_check_call_args (gimple stmt, tree fndecl)
+gimple_check_call_args (gimple stmt, tree fndecl, bool args_count_match)
 {
   tree parms, p;
   unsigned int i, nargs;
@@ -243,6 +243,8 @@ static bool
  && !fold_convertible_p (DECL_ARG_TYPE (p), arg)))
 return false;
}
+  if (args_count_match && p)
+   return false;
 }
   else if (parms)
 {
@@ -271,11 +273,13 @@ static bool
 }
 
 /* Verify if the type of the argument and lhs of CALL_STMT matches
-   that of the function declaration CAL

Re: [PATCH, libcpp] Do not decrease highest_location if the included file has be included twice.

2013-06-06 Thread Dehao Chen
ping...

On Tue, Jun 4, 2013 at 10:02 AM, Dehao Chen  wrote:
> Hi, Dodji,
>
> Thanks for helping update the patch. The new patch passed all
> regression tests and fixes the problem in my huge source file. I
> added a ChangeLog entry to the patch. Could any libcpp maintainer help
> check if it is ok for trunk?
>
> Thanks,
> Dehao
>
> libcpp/ChangeLog:
>
> 2013-06-04  Dehao Chen  
>
>  * files.c (_cpp_stack_include): Fix the highest_location when header
>  file is guarded by #ifndef and is included twice.
>
> Index: libcpp/files.c
> ===
> --- libcpp/files.c (revision 199570)
> +++ libcpp/files.c (working copy)
> @@ -983,6 +983,7 @@ _cpp_stack_include (cpp_reader *pfile, const char
>  {
>struct cpp_dir *dir;
>_cpp_file *file;
> +  bool stacked;
>
>dir = search_path_head (pfile, fname, angle_brackets, type);
>if (!dir)
> @@ -993,19 +994,26 @@ _cpp_stack_include (cpp_reader *pfile, const char
>if (type == IT_DEFAULT && file == NULL)
>  return false;
>
> -  /* Compensate for the increment in linemap_add that occurs in
> - _cpp_stack_file.  In the case of a normal #include, we're
> - currently at the start of the line *following* the #include.  A
> - separate source_location for this location makes no sense (until
> - we do the LC_LEAVE), and complicates LAST_SOURCE_LINE_LOCATION.
> - This does not apply if we found a PCH file (in which case
> - linemap_add is not called) or we were included from the
> - command-line.  */
> +  /* Compensate for the increment in linemap_add that occurs if
> +  _cpp_stack_file actually stacks the file.  In the case of a
> + normal #include, we're currently at the start of the line
> + *following* the #include.  A separate source_location for this
> + location makes no sense (until we do the LC_LEAVE), and
> + complicates LAST_SOURCE_LINE_LOCATION.  This does not apply if we
> + found a PCH file (in which case linemap_add is not called) or we
> + were included from the command-line.  */
>if (file->pchname == NULL && file->err_no == 0
>&& type != IT_CMDLINE && type != IT_DEFAULT)
>  pfile->line_table->highest_location--;
>
> -  return _cpp_stack_file (pfile, file, type == IT_IMPORT);
> +  stacked = _cpp_stack_file (pfile, file, type == IT_IMPORT);
> +
> +  if (!stacked)
> +/* _cpp_stack_file didn't stack the file, so let's rollback the
> +   compensation dance we performed above.  */
> +pfile->line_table->highest_location++;
> +
> +  return stacked;
>  }
>
>  /* Could not open FILE.  The complication is dependency output.  */
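
The compensate-then-roll-back logic in the patch above can be sketched in isolation. This is a toy model: struct toy_line_table, toy_stack_file and toy_stack_include are hypothetical stand-ins, and the PCH, error and command-line conditions from the real function are left out.

```c
#include <stdbool.h>

/* Toy stand-in for the line table; only the counter matters here.  */
struct toy_line_table
{
  unsigned int highest_location;
};

/* Hypothetical stand-in for _cpp_stack_file.  When the file is
   skipped (for example because its include guard is already defined)
   nothing is stacked and the counter is left alone.  When the file is
   stacked, the line-map setup advances the counter by one.  */
static bool
toy_stack_file (struct toy_line_table *table, bool skip_file)
{
  if (skip_file)
    return false;
  table->highest_location++;
  return true;
}

/* Compensate for the expected increment up front, then undo the
   compensation if the file was not actually stacked.  */
static bool
toy_stack_include (struct toy_line_table *table, bool skip_file)
{
  bool stacked;

  table->highest_location--;
  stacked = toy_stack_file (table, skip_file);
  if (!stacked)
    table->highest_location++;
  return stacked;
}
```

In both outcomes the counter ends where it started, which is the invariant the patch restores for a guarded header that is included a second time.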


Re: [RS6000] libffi little-endian

2013-06-06 Thread David Edelsohn
On Thu, Jun 6, 2013 at 9:34 PM, Alan Modra  wrote:
> Bootstrapped and regression tested powerpc64-linux.  OK to apply?
>
> * src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Support
> little-endian.
> * src/powerpc/ppc_closure.S (ffi_closure_SYSV): Likewise.

This patch needs to be applied upstream in the libffi repository.

So none of the struct handling in ffi.c and ffi_darwin.c needs any
changes?  Cool.  I thought there might be a padding issue.

- David


C++ PATCH for c++/55520 (ICE with lambda capture of VLA)

2013-06-06 Thread Jason Merrill
I recently implemented capture of C++14 VLAs, but the VLA in this 
testcase is not valid C++14, so capturing it isn't supported.  But we 
still shouldn't ICE.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit c0ecc17471e7859ba36f3f8d095e5666e525f84f
Author: Jason Merrill 
Date:   Thu Jun 6 22:41:32 2013 -0400

	PR c++/55520
	* semantics.c (add_capture): Diagnose capture of variable-size
	type that is not a C++1y array of runtime bound.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 048a7db..b5c3b0a 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -9487,6 +9487,15 @@ add_capture (tree lambda, tree id, tree initializer, bool by_reference_p,
 	  nelts_field, array_type_nelts (type));
   type = ctype;
 }
+  else if (variably_modified_type_p (type, NULL_TREE))
+{
+  error ("capture of variable-size type %qT that is not a C++1y array "
+	 "of runtime bound", type);
+  if (TREE_CODE (type) == ARRAY_TYPE
+	  && variably_modified_type_p (TREE_TYPE (type), NULL_TREE))
+	inform (input_location, "because the array element type %qT has "
+		"variable size", TREE_TYPE (type));
+}
   else if (by_reference_p)
 {
   type = build_reference_type (type);
diff --git a/gcc/testsuite/g++.dg/cpp1y/vla7.C b/gcc/testsuite/g++.dg/cpp1y/vla7.C
new file mode 100644
index 000..df34c82
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/vla7.C
@@ -0,0 +1,12 @@
+// PR c++/55520
+// { dg-options "-Wno-vla" }
+// { dg-require-effective-target c++11 }
+
+int main(int argc, char** argv)
+{
+  int x[1][argc];
+
+  [&x](int i) {			// { dg-error "variable.size" }
+x[0][i] = 0;	 	// { dg-prune-output "assignment" }
+  }(5);
+}


C++ PATCH to handling of fields with incomplete type

2013-06-06 Thread Jason Merrill
A while back I noticed an error-recovery issue with fields of incomplete 
type: later references to such fields would give an error that the name 
was undeclared, which is not the case.  This patch improves that situation.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit cf2d44c5b95890d6b7de7f4cc48284cc9e528683
Author: Jason Merrill 
Date:   Thu Jun 6 15:19:09 2013 -0400

	* decl.c (grokdeclarator): Keep a decl with error type.
	(grokfield, grokbitfield): Likewise.
	* pt.c (instantiate_class_template_1): Likewise.
	(tsubst_decl): Drop redundant error.
	* class.c (walk_subobject_offsets): Handle erroneous fields.
	* typeck2.c (process_init_constructor_record): Likewise.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 40e6d3e..286164d 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -3779,7 +3779,9 @@ walk_subobject_offsets (tree type,
 
   /* Iterate through the fields of TYPE.  */
   for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
-	if (TREE_CODE (field) == FIELD_DECL && !DECL_ARTIFICIAL (field))
+	if (TREE_CODE (field) == FIELD_DECL
+	&& TREE_TYPE (field) != error_mark_node
+	&& !DECL_ARTIFICIAL (field))
 	  {
 	tree field_offset;
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index c37b4fe..7825c73 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -10530,21 +10530,13 @@ grokdeclarator (const cp_declarator *declarator,
 		 && (TREE_CODE (type) != ARRAY_TYPE || initialized == 0))
 	  {
 	if (unqualified_id)
-	  error ("field %qD has incomplete type", unqualified_id);
+	  error ("field %qD has incomplete type %qT",
+		 unqualified_id, type);
 	else
 	  error ("name %qT has incomplete type", type);
 
-	/* If we're instantiating a template, tell them which
-	   instantiation made the field's type be incomplete.  */
-	if (current_class_type
-		&& TYPE_NAME (current_class_type)
-		&& IDENTIFIER_TEMPLATE (current_class_name)
-		&& declspecs->type
-		&& declspecs->type == type)
-	  error ("  in instantiation of template %qT",
-		 current_class_type);
-
-	return error_mark_node;
+	type = error_mark_node;
+	decl = NULL_TREE;
 	  }
 	else
 	  {
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 373f883..1573ced 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -835,9 +835,11 @@ grokfield (const cp_declarator *declarator,
 init = NULL_TREE;
 
   value = grokdeclarator (declarator, declspecs, FIELD, init != 0, &attrlist);
-  if (! value || error_operand_p (value))
+  if (! value || value == error_mark_node)
 /* friend or constructor went bad.  */
 return error_mark_node;
+  if (TREE_TYPE (value) == error_mark_node)
+return value;
 
   if (TREE_CODE (value) == TYPE_DECL && init)
 {
@@ -1045,8 +1047,10 @@ grokbitfield (const cp_declarator *declarator,
 {
   tree value = grokdeclarator (declarator, declspecs, BITFIELD, 0, &attrlist);
 
-  if (value == error_mark_node) 
+  if (value == error_mark_node)
 return NULL_TREE; /* friends went bad.  */
+  if (TREE_TYPE (value) == error_mark_node)
+return value;
 
   /* Pass friendly classes back.  */
   if (VOID_TYPE_P (value))
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index d9a14cc..dcdde00 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8863,9 +8863,8 @@ instantiate_class_template_1 (tree type)
 		  else if (TREE_CODE (r) == FIELD_DECL)
 		{
 		  /* Determine whether R has a valid type and can be
-			 completed later.  If R is invalid, then it is
-			 replaced by error_mark_node so that it will not be
-			 added to TYPE_FIELDS.  */
+			 completed later.  If R is invalid, then its type is
+			 replaced by error_mark_node.  */
 		  tree rtype = TREE_TYPE (r);
 		  if (can_complete_type_without_circularity (rtype))
 			complete_type (rtype);
@@ -8873,7 +8872,7 @@ instantiate_class_template_1 (tree type)
 		  if (!COMPLETE_TYPE_P (rtype))
 			{
 			  cxx_incomplete_type_error (r, rtype);
-			  r = error_mark_node;
+			  TREE_TYPE (r) = error_mark_node;
 			}
 		}
 
@@ -10514,8 +10513,6 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain)
 	/* We don't have to set DECL_CONTEXT here; it is set by
 	   finish_member_declaration.  */
 	DECL_CHAIN (r) = NULL_TREE;
-	if (VOID_TYPE_P (type))
-	  error ("instantiation of %q+D as type %qT", r, type);
 
 	apply_late_template_attributes (&r, DECL_ATTRIBUTES (r), 0,
 	args, complain, in_decl);
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index e0ebae9..401904d 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1236,6 +1236,8 @@ process_init_constructor_record (tree type, tree init,
   type = TREE_TYPE (field);
   if (DECL_BIT_FIELD_TYPE (field))
 	type = DECL_BIT_FIELD_TYPE (field);
+  if (type == error_mark_node)
+	return PICFLAG_ERRONEOUS;
 
   if (idx < vec_safe_length (CONSTRUCTOR_ELTS (init)))
 	{
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept15.C b/gcc/testsuite/g++.dg/cpp0x/noexcept15.C
index c735b9e..db5b5c7 100644
--

Re: [C++ Patch] PR 53658

2013-06-06 Thread Jason Merrill

OK.

Jason


force_const_mem VOIDmode

2013-06-06 Thread Alan Modra
force_const_mem() isn't supposed to handle VOIDmode or BLKmode, so the
check for VOIDmode when aligning is needless.  If we ever did get one
of these modes in a constant pool, this

  pool->offset += GET_MODE_SIZE (mode);

won't add to the pool size, and output_constant_pool_2() will hit a
gcc_unreachable().  Bootstrapped etc. powerpc64-linux.

* varasm.c (force_const_mem): Assert mode is not VOID or BLK.
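
Why a zero-sized mode is a problem for the pool offset can be sketched with a toy model. The enum, toy_mode_size, struct toy_pool and toy_pool_add are hypothetical stand-ins, not GCC's machine-mode tables, and alignment is deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Toy subset of machine modes.  */
enum toy_mode { TOY_VOIDmode, TOY_BLKmode, TOY_SImode, TOY_DImode };

/* Size in bytes; the two "unsized" modes report 0.  */
static size_t
toy_mode_size (enum toy_mode mode)
{
  switch (mode)
    {
    case TOY_SImode:
      return 4;
    case TOY_DImode:
      return 8;
    default:
      return 0;
    }
}

struct toy_pool
{
  size_t offset;
};

/* Reserve room for one constant and return its offset.  A zero-sized
   mode would leave the pool offset unchanged, so it is rejected up
   front, in the spirit of the new assertion.  */
static size_t
toy_pool_add (struct toy_pool *pool, enum toy_mode mode)
{
  size_t at;

  assert (mode != TOY_VOIDmode && mode != TOY_BLKmode);
  at = pool->offset;
  pool->offset += toy_mode_size (mode);
  return at;
}
```

Without the assertion, adding a TOY_VOIDmode entry would return the same offset as the next entry, which is the overlap the message warns about.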

Index: gcc/varasm.c
===
--- gcc/varasm.c(revision 199718)
+++ gcc/varasm.c(working copy)
@@ -3567,7 +3575,8 @@
   *slot = desc;
 
   /* Align the location counter as required by EXP's data type.  */
-  align = GET_MODE_ALIGNMENT (mode == VOIDmode ? word_mode : mode);
+  gcc_checking_assert (mode != VOIDmode && mode != BLKmode);
+  align = GET_MODE_ALIGNMENT (mode);
 #ifdef CONSTANT_ALIGNMENT
   {
 tree type = lang_hooks.types.type_for_mode (mode, 0);

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH]FW: Added build_c_cast to c-family?

2013-06-06 Thread Iyer, Balaji V
Hello,
Is this OK for trunk? It involves very few changes (the patch is cut 
and pasted below) and does not cause any bootstrap issues on my x86_64 running 
SuSE.

Thanks,


-Balaji V. Iyer.

> -Original Message-
> From: Iyer, Balaji V
> Sent: Monday, June 03, 2013 8:44 PM
> To: gcc-patches@gcc.gnu.org
> Subject: Added build_c_cast to c-family?
> 
> Hello Everyone,
>   Is it OK to move the build_c_cast prototype into c-common.h? The reason
> is that I would like to share some of the code between array notation
> for C and C++, and this function is needed in both places. Also, the
> exact same call is available for both C and C++ with the same parameters
> in the same positions. The change involves removing the prototype from
> c-tree.h and cp-tree.h and moving it to c-common.h.
> 
>   Here are the ChangeLogs and the patch to accomplish what I am
> requesting. Please let me know if it is OK for the trunk.
> 
> gcc/c-family/ChangeLog
> 2013-06-03  Balaji V. Iyer  
> 
>   * c-common.h (build_c_cast): Added new extern prototype.
> 
> gcc/c/ChangeLog
>  2013-06-03  Balaji V. Iyer  
> 
>   * c-tree.h (build_c_cast): Remove prototype.
> 
> gcc/cp/ChangeLog
> 2013-06-03  Balaji V. Iyer  
> 
>   * cp-tree.h (build_c_cast): Remove prototype.
> 
> 
> Index: gcc/c-family/c-common.h
> ===
> --- gcc/c-family/c-common.h   (revision 199630)
> +++ gcc/c-family/c-common.h   (working copy)
> @@ -538,6 +538,7 @@
>  extern tree pushdecl (tree);
>  extern tree build_modify_expr (location_t, tree, tree, enum tree_code,
>  location_t, tree, tree);
> +extern tree build_c_cast (location_t, tree, tree);
>  extern tree build_array_notation_expr (location_t, tree, tree, enum tree_code,
>  location_t, tree, tree);
>  extern tree build_array_notation_ref (location_t, tree, tree, tree, tree, tree);
> 
> Index: gcc/c/c-tree.h
> ===
> --- gcc/c/c-tree.h(revision 199630)
> +++ gcc/c/c-tree.h(working copy)
> @@ -600,7 +600,6 @@
>   tree, tree);
>  extern tree build_compound_expr (location_t, tree, tree);
>  extern tree c_cast_expr (location_t, struct c_type_name *, tree);
> -extern tree build_c_cast (location_t, tree, tree);
>  extern void store_init_value (location_t, tree, tree, tree);
>  extern void error_init (const char *);
>  extern void pedwarn_init (location_t, int opt, const char *);
> 
> Index: gcc/cp/cp-tree.h
> ===
> --- gcc/cp/cp-tree.h  (revision 199630)
> +++ gcc/cp/cp-tree.h  (working copy)
> @@ -6000,7 +6000,6 @@
>  extern tree build_static_cast (tree, tree, tsubst_flags_t);
>  extern tree build_reinterpret_cast   (tree, tree, tsubst_flags_t);
>  extern tree build_const_cast (tree, tree, tsubst_flags_t);
> -extern tree build_c_cast (location_t, tree, tree);
>  extern tree cp_build_c_cast  (tree, tree, tsubst_flags_t);
>  extern tree build_x_modify_expr  (location_t, tree,
>enum tree_code, tree,
> Thanks,
> 
> Balaji V. Iyer.


[RS6000] libffi little-endian

2013-06-06 Thread Alan Modra
Bootstrapped and regression tested powerpc64-linux.  OK to apply?

* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Support
little-endian.
* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Likewise.
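
The offset changes in the patch below follow one rule, which can be sketched in C. The helper low_part_offset is a hypothetical name used only for illustration; it computes where the low-order bytes of a small integer sit inside a wider stack slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Byte offset, within a slot of SLOT_SIZE bytes, at which the
   low-order VALUE_SIZE bytes of an integer stored in that slot begin.
   On big-endian the low-order bytes are at the end of the slot; on
   little-endian they are at the start.  */
static size_t
low_part_offset (size_t slot_size, size_t value_size, bool little_endian)
{
  assert (value_size <= slot_size);
  return little_endian ? 0 : slot_size - value_size;
}
```

For an 8-byte slot this gives 4, 6 and 7 for 4-, 2- and 1-byte values on big-endian, matching the existing 112+4, 112+6 and 112+7 loads in linux64_closure.S, and 0 in every case on little-endian, matching the new 112+0 loads. For the 4-byte slots in ppc_closure.S the big-endian offsets are 3 and 2.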

Index: libffi/src/powerpc/linux64_closure.S
===
--- libffi/src/powerpc/linux64_closure.S(revision 199718)
+++ libffi/src/powerpc/linux64_closure.S(working copy)
@@ -132,7 +132,11 @@
blr
nop
 # case FFI_TYPE_INT
+#ifdef __LITTLE_ENDIAN__
+   lwa %r3, 112+0(%r1)
+#else
lwa %r3, 112+4(%r1)
+#endif
mtlr %r0
addi %r1, %r1, 240
blr
@@ -152,33 +156,57 @@
lfd %f2, 112+8(%r1)
b .Lfinish
 # case FFI_TYPE_UINT8
+#ifdef __LITTLE_ENDIAN__
+   lbz %r3, 112+0(%r1)
+#else
lbz %r3, 112+7(%r1)
+#endif
mtlr %r0
addi %r1, %r1, 240
blr
 # case FFI_TYPE_SINT8
+#ifdef __LITTLE_ENDIAN__
+   lbz %r3, 112+0(%r1)
+#else
lbz %r3, 112+7(%r1)
+#endif
extsb %r3,%r3
mtlr %r0
b .Lfinish
 # case FFI_TYPE_UINT16
+#ifdef __LITTLE_ENDIAN__
+   lhz %r3, 112+0(%r1)
+#else
lhz %r3, 112+6(%r1)
+#endif
mtlr %r0
 .Lfinish:
addi %r1, %r1, 240
blr
 # case FFI_TYPE_SINT16
+#ifdef __LITTLE_ENDIAN__
+   lha %r3, 112+0(%r1)
+#else
lha %r3, 112+6(%r1)
+#endif
mtlr %r0
addi %r1, %r1, 240
blr
 # case FFI_TYPE_UINT32
+#ifdef __LITTLE_ENDIAN__
+   lwz %r3, 112+0(%r1)
+#else
lwz %r3, 112+4(%r1)
+#endif
mtlr %r0
addi %r1, %r1, 240
blr
 # case FFI_TYPE_SINT32
+#ifdef __LITTLE_ENDIAN__
+   lwa %r3, 112+0(%r1)
+#else
lwa %r3, 112+4(%r1)
+#endif
mtlr %r0
addi %r1, %r1, 240
blr
Index: libffi/src/powerpc/ppc_closure.S
===
--- libffi/src/powerpc/ppc_closure.S(revision 199718)
+++ libffi/src/powerpc/ppc_closure.S(working copy)
@@ -159,25 +159,41 @@
 #endif
 
 # case FFI_TYPE_UINT8
+#ifdef __LITTLE_ENDIAN__
+   lbz %r3,112+0(%r1)
+#else
lbz %r3,112+3(%r1)
+#endif
mtlr %r0
addi %r1,%r1,144
blr
 
 # case FFI_TYPE_SINT8
+#ifdef __LITTLE_ENDIAN__
+   lbz %r3,112+0(%r1)
+#else
lbz %r3,112+3(%r1)
+#endif
extsb %r3,%r3
mtlr %r0
b .Lfinish
 
 # case FFI_TYPE_UINT16
+#ifdef __LITTLE_ENDIAN__
+   lhz %r3,112+0(%r1)
+#else
lhz %r3,112+2(%r1)
+#endif
mtlr %r0
addi %r1,%r1,144
blr
 
 # case FFI_TYPE_SINT16
+#ifdef __LITTLE_ENDIAN__
+   lha %r3,112+0(%r1)
+#else
lha %r3,112+2(%r1)
+#endif
mtlr %r0
addi %r1,%r1,144
blr
@@ -239,9 +255,15 @@
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 3. Three byte struct.
lwz %r3,112+0(%r1)
+#ifdef __LITTLE_ENDIAN__
+   mtlr %r0
+   addi %r1,%r1,144
+   blr
+#else
srwi %r3,%r3,8
mtlr %r0
b .Lfinish
+#endif
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 4. Four byte struct.
lwz %r3,112+0(%r1)
@@ -252,20 +274,35 @@
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 5. Five byte struct.
lwz %r3,112+0(%r1)
lwz %r4,112+4(%r1)
+#ifdef __LITTLE_ENDIAN__
+   mtlr %r0
+   b .Lfinish
+#else
li %r5,24
b .Lstruct567
+#endif
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 6. Six byte struct.
lwz %r3,112+0(%r1)
lwz %r4,112+4(%r1)
+#ifdef __LITTLE_ENDIAN__
+   mtlr %r0
+   b .Lfinish
+#else
li %r5,16
b .Lstruct567
+#endif
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 7. Seven byte struct.
lwz %r3,112+0(%r1)
lwz %r4,112+4(%r1)
+#ifdef __LITTLE_ENDIAN__
+   mtlr %r0
+   b .Lfinish
+#else
li %r5,8
b .Lstruct567
+#endif
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 8. Eight byte struct.
lwz %r3,112+0(%r1)
@@ -273,6 +310,7 @@
mtlr %r0
b .Lfinish
 
+#ifndef __LITTLE_ENDIAN__
 .Lstruct567:
subfic %r6,%r5,32
srw %r4,%r4,%r5
@@ -282,6 +320,7 @@
mtlr %r0
addi %r1,%r1,144
blr
+#endif
 
 .Luint128:
lwz %r6,112+12(%r1)

-- 
Alan Modra
Australia Development Lab, IBM


Re: patch to fix PR57468

2013-06-06 Thread David Edelsohn
> The patch actually restores the LRA behaviour for x86/x86-64 before rev. 
> 199298. The revision was added to generate PPC SDmode values correctly. So it 
> is really needed for PPC64 and badly hurts x86/x86-64 performance (by doing 
> secondary memory reloads when one pseudo is spilled).

Should the solution for PPC64 be further limited, even on
PPC64? Is this going to hurt more normal spilling code on PPC64 that
does not have the strange restrictions of SDmode?

Thanks, David


Re: [RS6000] -mfp-in-toc

2013-06-06 Thread David Edelsohn
On Thu, Jun 6, 2013 at 7:40 PM, Alan Modra  wrote:
> On Tue, Jun 04, 2013 at 11:45:12PM +0930, Alan Modra wrote:
>> This patch allows the user to specify -mfp-in-toc/-msum-in-toc options
>> without being overridden when -fsection-anchors or -mcmodel != small
>> is in effect.  I also change the default to -mno-fp-in-toc for
>> -mcmodel=medium, because -mcmodel=medium ought to be able to address
>> constants anywhere from the toc pointer, and putting them in their
>> usual constant sections (.rodata.cst4 and .rodata.cst8) allow them to
>> be merged at link time.  For -mcmodel=large we keep the default as
>> -mfp-in-toc because large code model requires a toc entry to address
>> any constant outside the TOC.
>>
>> The patch also allows -mcmodel=medium toc relative addressing for
>> CONSTANT_POOL_ADDRESS_P constants (the very sort we get from
>> force_const_mem when -mno-fp-in-toc), and allows combine to merge the
>> low-part of the address calculation with the load/store from memory.
>> I'm not sure now why I had this disabled, perhaps there was a problem
>> when we split toc refs early.  Bootstrapped and regression tested
>> powerpc64-linux.  OK to apply?
>
> Revised patch.  This one doesn't blindly trust CONSTANT_POOL_ADDRESS_P
> constants are always sufficiently aligned.  There are modes where
> alignment is less than size, CQI, CHI, CSI, CDI, CTI, SC, DC, TC, so
> it might be the case that the imaginary part of a complex constant
> crossed a 32k boundary.
>
> I've also changed offsettable_ok_by_alignment to only check that the
> particular access being considered doesn't cross a 32k boundary.
> We'll only be offsetting within that access, so that is all that is
> necessary.
>
> * config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
> override user -mfp-in-toc.
> (offsettable_ok_by_alignment): Consider just the current access
> rather than the whole object, unless BLKmode.  Handle
> CONSTANT_POOL_ADDRESS_P constants that lack a decl too.
> (use_toc_relative_ref): Allow CONSTANT_POOL_ADDRESS_P constants
> for -mcmodel=medium.
> * config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't
> override user -mfp-in-toc or -msum-in-toc.  Default to
> -mno-fp-in-toc for -mcmodel=medium.

Okay.

It would have been helpful for reviewing the patch to submit the patch
in two stages that separate the re-organization of the code from the
modifications to the alignment test.

Thanks, David


[C++ Patch] PR 53658

2013-06-06 Thread Paolo Carlini

Hi,

this issue seems just another case of:

http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02472.html

thus we are ICEing because TYPE_STUB_DECL is null and we want to use 
TYPE_MAIN_DECL.


Tested x86_64-linux.

Thanks,
Paolo.

///


/cp
2013-06-07  Paolo Carlini  

PR c++/53658
* pt.c (lookup_template_class_1): Consistently use TYPE_MAIN_DECL,
not TYPE_STUB_DECL, to access the _DECL for a _TYPE.

/testsuite
2013-06-07  Paolo Carlini  

PR c++/53658
* g++.dg/cpp0x/alias-decl-36.C: New.
Index: cp/pt.c
===
--- cp/pt.c (revision 199776)
+++ cp/pt.c (working copy)
@@ -7561,7 +7561,7 @@ lookup_template_class_1 (tree d1, tree arglist, tr
   if (CLASS_TYPE_P (template_type) && is_dependent_type)
/* If the type makes use of template parameters, the
   code that generates debugging information will crash.  */
-   DECL_IGNORED_P (TYPE_STUB_DECL (t)) = 1;
+   DECL_IGNORED_P (TYPE_MAIN_DECL (t)) = 1;
 
   /* Possibly limit visibility based on template args.  */
   TREE_PUBLIC (type_decl) = 1;
Index: testsuite/g++.dg/cpp0x/alias-decl-36.C
===
--- testsuite/g++.dg/cpp0x/alias-decl-36.C  (revision 0)
+++ testsuite/g++.dg/cpp0x/alias-decl-36.C  (working copy)
@@ -0,0 +1,6 @@
+// PR c++/53658
+// { dg-do compile { target c++11 } }
+
+struct A;
+template  using Foo = const A;
+template  Foo  bar();


[patch,rl78] Implement TARGET_VALID_POINTER_MODE

2013-06-06 Thread DJ Delorie

This fixes a bug where cfgexpand would ICE when using far pointers,
because the SImode pointers weren't "valid" with the default macro.
Committed.

* config/rl78/rl78.c (rl78_valid_pointer_mode): New, implements
TARGET_VALID_POINTER_MODE.

Index: gcc/config/rl78/rl78.c
===
--- gcc/config/rl78/rl78.c  (revision 199776)
+++ gcc/config/rl78/rl78.c  (working copy)
@@ -644,12 +644,21 @@ rl78_addr_space_pointer_mode (addr_space
   return SImode;
 default:
   gcc_unreachable ();
 }
 }
 
+/* Returns TRUE for valid addresses.  */
+#undef TARGET_VALID_POINTER_MODE
+#define TARGET_VALID_POINTER_MODE rl78_valid_pointer_mode
+static bool
+rl78_valid_pointer_mode (enum machine_mode m)
+{
+  return (m == HImode || m == SImode);
+}
+
 /* Return the appropriate mode for a named address address.  */
 #undef TARGET_ADDR_SPACE_ADDRESS_MODE
 #define TARGET_ADDR_SPACE_ADDRESS_MODE rl78_addr_space_address_mode
 static enum machine_mode
 rl78_addr_space_address_mode (addr_space_t addrspace)
 {


Re: Symtab cleanups 4/17

2013-06-06 Thread Hans-Peter Nilsson
> From: Jan Hubicka 
> Date: Wed, 5 Jun 2013 16:18:52 +0200

> * class.c (emit_register_classes_in_jcr_section): Use DECL_PRESERVE_P
> instead of mark_decl_referenced.
> 
> * decl2.c (maybe_make_one_only): Use forced_by_abi instead of
> mark_decl_referenced.
> (mark_needed): Likewise.
> 
> * cgraph.c (cgraph_remove_node): Clear forced_by_abi.
> (cgraph_node_cannot_be_local_p_1): Honor symbol.forced_by_abi
> and symtab_used_from_object_file_p.
> (cgraph_make_node_local_1): Clear forced_by_abi.
> (cgraph_can_remove_if_no_direct_calls_and): Use forced_by_abi
> * cgraph.h (symtab_node_base): Add forced_by_abi.
> (decide_is_variable_needed): Remove.
> (varpool_can_remove_if_no_refs): Honor symbol.forced_by_abi.
> * cgraphunit.c (cgraph_decide_is_function_needed): Rename to ..
> (decide_is_symbol_needed): ... this one; handle symbols in general;
> always analyze virtuals; honor forced_by_abi.
> (cgraph_finalize_function): Update.
> (varpool_finalize_decl): Update.
> (symbol_defined_and_needed): Remove.
> (analyze_functions): Update.
> * lto-cgraph.c (lto_output_node, lto_output_varpool_node,
> output_refs, input_overwrite_node): Handle forced_by_abi.
> * ipa.c (cgraph_address_taken_from_non_vtable_p): Rename to ...
> (address_taken_from_non_vtable_p): ... this one.
> (comdat_can_be_unshared_p_1): New function.
> (cgraph_comdat_can_be_unshared_p): Rename to ...
> (comdat_can_be_unshared_p): ... this one; handle symbols in general.
> (varpool_externally_visible_p): Use comdat_can_be_unshared_p.
> (function_and_variable_visibility): Clear forced_by_abi as needed.
> * trans-mem.c (ipa_tm_mark_forced_by_abi_node): New function.
> (ipa_tm_create_version_alias, ipa_tm_create_version): Update.
> * varasm.c (mark_decl_referenced): Remove.
> * symtab.c (dump_symtab_base): Dump forced_by_abi.
> * varpool.c (decide_is_variable_needed): Remove.

This caused a regression everywhere; PR57551 (all the code went
away, scan-assembler test failing).  I'm guessing you've already
fixed that test-case some way and just forgot to commit that
part.

brgds, H-P


Re: [RS6000] -mfp-in-toc

2013-06-06 Thread Alan Modra
On Tue, Jun 04, 2013 at 11:45:12PM +0930, Alan Modra wrote:
> This patch allows the user to specify -mfp-in-toc/-msum-in-toc options
> without being overridden when -fsection-anchors or -mcmodel != small
> is in effect.  I also change the default to -mno-fp-in-toc for
> -mcmodel=medium, because -mcmodel=medium ought to be able to address
> constants anywhere from the toc pointer, and putting them in their
> usual constant sections (.rodata.cst4 and .rodata.cst8) allow them to
> be merged at link time.  For -mcmodel=large we keep the default as
> -mfp-in-toc because large code model requires a toc entry to address
> any constant outside the TOC.
> 
> The patch also allows -mcmodel=medium toc relative addressing for
> CONSTANT_POOL_ADDRESS_P constants (the very sort we get from
> force_const_mem when -mno-fp-in-toc), and allows combine to merge the
> low-part of the address calculation with the load/store from memory.
> I'm not sure now why I had this disabled, perhaps there was a problem
> when we split toc refs early.  Bootstrapped and regression tested
> powerpc64-linux.  OK to apply?

Revised patch.  This one doesn't blindly trust CONSTANT_POOL_ADDRESS_P
constants are always sufficiently aligned.  There are modes where
alignment is less than size, CQI, CHI, CSI, CDI, CTI, SC, DC, TC, so
it might be the case that the imaginary part of a complex constant
crossed a 32k boundary.

I've also changed offsettable_ok_by_alignment to only check that the
particular access being considered doesn't cross a 32k boundary.
We'll only be offsetting within that access, so that is all that is
necessary.
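
The boundary condition described above can be sketched in isolation.  This
is only an illustration of the arithmetic, not the code in the patch: the
names fits_in_32k_block and aligned_access_cannot_straddle are hypothetical,
and the patch itself reasons from known alignment bits rather than from a
concrete address.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of rs6000.c.
constexpr std::uint64_t kBlock = 0x8000;  // 32 KiB

// True if the byte range [addr, addr + size) stays inside one 32 KiB block.
constexpr bool fits_in_32k_block(std::uint64_t addr, std::uint64_t size)
{
    return size != 0 && size <= kBlock
           && addr / kBlock == (addr + size - 1) / kBlock;
}

// When only the alignment is known: an access that starts on a
// power-of-two ALIGN boundary with ALIGN >= SIZE (and SIZE <= 32 KiB)
// cannot straddle a 32 KiB boundary.
constexpr bool aligned_access_cannot_straddle(std::uint64_t align,
                                              std::uint64_t size)
{
    return size != 0 && size <= kBlock
           && align != 0 && (align & (align - 1)) == 0
           && align >= size;
}
```

With these definitions, a 16-byte object at 0x7ff8 crosses the boundary as a
whole, while the 8-byte accesses to its two halves (at 0x7ff8 and 0x8000) do
not, which is the distinction between checking the whole object and checking
only the current access.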

* config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
override user -mfp-in-toc.
(offsettable_ok_by_alignment): Consider just the current access
rather than the whole object, unless BLKmode.  Handle
CONSTANT_POOL_ADDRESS_P constants that lack a decl too.
(use_toc_relative_ref): Allow CONSTANT_POOL_ADDRESS_P constants
for -mcmodel=medium.
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't
override user -mfp-in-toc or -msum-in-toc.  Default to
-mno-fp-in-toc for -mcmodel=medium.

Index: gcc/config/rs6000/linux64.h
===
--- gcc/config/rs6000/linux64.h (revision 199718)
+++ gcc/config/rs6000/linux64.h (working copy)
@@ -136,8 +136,11 @@ extern int dot_symbols;
SET_CMODEL (CMODEL_MEDIUM); \
  if (rs6000_current_cmodel != CMODEL_SMALL)\
{   \
- TARGET_NO_FP_IN_TOC = 0;  \
- TARGET_NO_SUM_IN_TOC = 0; \
+ if (!global_options_set.x_TARGET_NO_FP_IN_TOC) \
+   TARGET_NO_FP_IN_TOC \
+ = rs6000_current_cmodel == CMODEL_MEDIUM; \
+ if (!global_options_set.x_TARGET_NO_SUM_IN_TOC) \
+   TARGET_NO_SUM_IN_TOC = 0;   \
}   \
}   \
}   \
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 199718)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3042,7 +3042,8 @@ rs6000_option_override_internal (bool global_init_
 
   /* Place FP constants in the constant pool instead of TOC
  if section anchors enabled.  */
-  if (flag_section_anchors)
+  if (flag_section_anchors
+  && !global_options_set.x_TARGET_NO_FP_IN_TOC)
 TARGET_NO_FP_IN_TOC = 1;
 
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
@@ -5474,91 +5475,102 @@ virtual_stack_registers_memory_p (rtx op)
  && regnum <= LAST_VIRTUAL_POINTER_REGISTER);
 }
 
-/* Return true if memory accesses to OP are known to never straddle
-   a 32k boundary.  */
+/* Return true if a MODE sized memory accesses to OP plus OFFSET
+   is known to not straddle a 32k boundary.  */
 
 static bool
 offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT offset,
 enum machine_mode mode)
 {
   tree decl, type;
-  unsigned HOST_WIDE_INT dsize, dalign;
+  unsigned HOST_WIDE_INT dsize, dalign, lsb, mask;
 
   if (GET_CODE (op) != SYMBOL_REF)
 return false;
 
+  dsize = GET_MODE_SIZE (mode);
   decl = SYMBOL_REF_DECL (op);
   if (!decl)
 {
-  if (GET_MODE_SIZE (mode) == 0)
+  if (dsize == 0)
return false;
 
   /* -fsection-anchors loses the original SYMBOL_REF_DECL when
 replacing memory addresses with an anchor plus offset.  We
 could find the decl by rummaging around in the block->objects
 VEC for the given offset but that seems like too much work.  */
-  dalign = 1;
+  dalign = B

Re: [google gcc-4_8] Restore max peeled instructions to old default

2013-06-06 Thread Teresa Johnson
On Thu, Jun 6, 2013 at 1:33 PM, Xinliang David Li  wrote:
> ok.   What is the rationale for dropping the limit in trunk?  Ideally,
> the limit should be lifted to enable other heuristics to kick
> in.

Here is the message about it from Honza:

http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01193.html

Basically, it was to reduce code bloat, and it didn't show spec regressions.

Teresa

>
> David
>
> On Thu, Jun 6, 2013 at 1:22 PM, Teresa Johnson  wrote:
>> The default for the max instructions in peeled loops was reduced on
>> trunk in r193570. This is causing a performance regression on an internal
>> benchmark. This change will revert to the old higher limits.
>>
>> Google ref b/8839137.
>>
>> Bootstrapped and tested. Ok for google/4_8?
>>
>> Thanks,
>> Teresa
>>
>> 2013-06-06  Teresa Johnson  
>>
>> * params.def (PARAM_MAX_PEELED_INSNS): Revert to 400.
>> (PARAM_MAX_COMPLETELY_PEELED_INSNS): Ditto.
>>
>> Index: params.def
>> ===
>> --- params.def (revision 199753)
>> +++ params.def (working copy)
>> @@ -306,7 +306,7 @@ DEFPARAM(PARAM_MAX_UNROLL_TIMES,
>>  DEFPARAM(PARAM_MAX_PEELED_INSNS,
>>   "max-peeled-insns",
>>   "The maximum number of insns of a peeled loop",
>> - 100, 0, 0)
>> + 400, 0, 0)
>>  /* The maximum number of peelings of a single loop.  */
>>  DEFPARAM(PARAM_MAX_PEEL_TIMES,
>>   "max-peel-times",
>> @@ -321,7 +321,7 @@ DEFPARAM(PARAM_MAX_PEEL_BRANCHES,
>>  DEFPARAM(PARAM_MAX_COMPLETELY_PEELED_INSNS,
>>   "max-completely-peeled-insns",
>>   "The maximum number of insns of a completely peeled loop",
>> - 100, 0, 0)
>> + 400, 0, 0)
>>  /* The maximum number of peelings of a single loop that is peeled
>> completely.  */
>>  DEFPARAM(PARAM_MAX_COMPLETELY_PEEL_TIMES,
>>   "max-completely-peel-times",
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413



--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [PATCH, rs6000] power8 patches, patch #5, new vector tests

2013-06-06 Thread Michael Meissner
I checked in the tests that went with power8 patches #3 and #4 (which have been
committed) as subversion id 199768.

2013-06-06  Michael Meissner  
Pat Haugen 
Peter Bergner 

* gcc.target/powerpc/p8vector-builtin-1.c: New test to test
power8 builtin functions.
* gcc/testsuite/gcc.target/powerpc/p8vector-builtin-2.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/p8vector-builtin-3.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/p8vector-builtin-4.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/p8vector-builtin-5.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/p8vector-builtin-6.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/p8vector-builtin-7.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/p8vector-vectorize-1.c: New
tests to test power8 auto-vectorization.
* gcc/testsuite/gcc.target/powerpc/p8vector-vectorize-2.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/p8vector-vectorize-3.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/p8vector-vectorize-4.c: Likewise.
* gcc/testsuite/gcc.target/powerpc/p8vector-vectorize-5.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



patch to fix PR57468

2013-06-06 Thread Vladimir Makarov

The following patch fixes

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57468

The patch actually restores the LRA behaviour for x86/x86-64 before rev. 
199298.  The revision was added to generate PPC SDmode values 
correctly.  So it is really needed for PPC64 and badly hurts x86/x86-64 
performance (by doing secondary memory reloads when one pseudo is spilled).


The patch was successfully bootstrapped and tested on x86/x86-64 (with 
patch for pr57459).


  Although the change is in i386.c, it only concerns LRA, so I've 
decided to commit it without x86/x86-64 maintainer approval.  Maybe I 
am wrong in this situation.  If somebody objects, I am ready to revert 
the patch and wait for approval.


Committed as rev. 199764.

2013-06-06  Vladimir Makarov  

PR rtl-optimization/57468
* config/i386/i386.c (inline_secondary_memory_needed): Ignore
spilled pseudos.






[C++ testcase, committed] PR 43652

2013-06-06 Thread Paolo Carlini

Hi,

committed to mainline.

Thanks,
Paolo.

//
2013-06-06  Paolo Carlini  

PR c++/43652
* g++.dg/parse/error53.C: New.
Index: g++.dg/parse/error53.C
===
--- g++.dg/parse/error53.C  (revision 0)
+++ g++.dg/parse/error53.C  (working copy)
@@ -0,0 +1,3 @@
+// PR c++/43652
+
+static const char const * stdin_name = "";  // { dg-error "19:duplicate" }


patch to fix PR57459

2013-06-06 Thread Vladimir Makarov

The following patch fixes

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57459

The bug occurs in a rare combination of circumstances: an insn uses both 
the inherited and the original value, the pseudo is both input and output, 
and the pseudo value (as the insn result) is unused.
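
The one-line change in the patch below can be read as follows.  This is a
simplified stand-in, not LRA code: the enumerator names match those used in
the diff, but the two predicate functions are hypothetical and exist only to
contrast the old and new conditions.

```cpp
// Simplified stand-in for the operand kinds named in the diff.
enum OpType { OP_IN, OP_OUT, OP_INOUT };

// Old condition: only pure inputs were marked live.
constexpr bool marks_live_old(OpType t) { return t == OP_IN; }

// New condition: anything that is not a pure output is marked live,
// so an operand that is both read and written is now included.
constexpr bool marks_live_new(OpType t) { return t != OP_OUT; }
```

The two predicates differ only for OP_INOUT, which is the input-and-output
pseudo case described above.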


The patch was successfully bootstrapped and tested on x86/x86-64.

Committed as rev. 199762.

2013-06-06  Vladimir Makarov  

PR rtl-optimization/57459
* lra-constraints.c (update_ebb_live_info): Fix typo for operand
type when setting live regs.

2013-06-06  Vladimir Makarov  

PR rtl-optimization/57459
* gcc.target/i386/pr57459.c: New test.

Index: lra-constraints.c
===
--- lra-constraints.c	(revision 199753)
+++ lra-constraints.c	(working copy)
@@ -4545,7 +4545,7 @@ update_ebb_live_info (rtx head, rtx tail
 	  bitmap_clear_bit (&live_regs, reg->regno);
   /* Mark each used value as live.  */
   for (reg = curr_id->regs; reg != NULL; reg = reg->next)
-	if (reg->type == OP_IN
+	if (reg->type != OP_OUT
 	&& bitmap_bit_p (&check_only_regs, reg->regno))
 	  bitmap_set_bit (&live_regs, reg->regno);
   /* It is quite important to remove dead move insns because it
Index: testsuite/gcc.target/i386/pr57459.c
===
--- testsuite/gcc.target/i386/pr57459.c	(revision 0)
+++ testsuite/gcc.target/i386/pr57459.c	(working copy)
@@ -0,0 +1,60 @@
+/* PR rtl-optimization/57459 */
+/* { dg-do run } */
+/* { dg-options "-fno-inline -O2 -minline-all-stringops -fno-omit-frame-pointer" } */
+
+int total1[10], total2[10], total3[10], total4[10], total5[10], a[20];
+int len;
+
+void stackclean() {
+  void *ptr = __builtin_alloca(2);
+  __builtin_memset(ptr, 0, 2);
+}
+
+void foo(const char *s) {
+  int r1 = a[1];
+  int r2 = a[2];
+  int r3 = a[3];
+  int r4 = a[4];
+  int r5 = a[5];
+
+  len =  __builtin_strlen(s);
+
+  if (s != 0)
+return;
+
+  while (r1) {
+   total1[r1] = r1;
+   r1--;
+  }
+
+  while (r2) {
+   total2[r2] = r2;
+   r2--;
+  }
+
+  while (r3) {
+   total3[r3] = r3;
+   r3--;
+  }
+
+  while (r4) {
+   total4[r4] = r4;
+   r4--;
+  }
+
+  while (r5) {
+   total5[r5] = r5;
+   r5--;
+  }
+}
+
+extern void abort (void);
+
+int main() {
+  stackclean();
+  foo("abcdefgh");
+  if (len != 8)
+abort ();
+  return 0;
+}
+


Re: Unordered container insertion hints

2013-06-06 Thread François Dumont

On 05/24/2013 01:00 AM, Paolo Carlini wrote:

On 05/23/2013 10:01 PM, François Dumont wrote:

Some feedback regarding this patch ?
Two quick ones: what if the hint is wrong? I suppose the insertion 
succeeds anyway, it's only a little waste of time, right?


Right.

Is it possible that for instance something throws in that case and 
would not now (when the hint is simply ignored)? In case, check and 
re-check we are still conforming.
I consider the hint only if it is equivalent to the inserted element, so 
I invoke the equal_to functor for that. The equal_to functor is already 
invoked at the same location when no hint is given. So 
usage of the hint has no impact on exception safety.


In any case, I think it's quite easy to notice if an implementation is 
using the hint in this way or a similar one, based on some simple 
benchmarks, without looking of course at the actual implementation 
code. Do we have any idea what other implementations are doing? Like, 
eg, they invented something for unordered_set and map too? Or a better 
way to exploit the hint for the multi variants?


I only benchmarked the llvm/clang implementation and noticed no difference 
with or without a hint; I guess it is simply ignored. I don't plan to check 
or benchmark other implementations. The usage of the hint I am introducing is 
quite natural considering the new unordered containers data model. And 
if anyone has a better idea of how to deal with it, they are welcome to 
contribute!


Eventually I suppose we want to add a performance testcase to our 
testsuite.
Good request and the reason why it took me so long to answer. Writing 
such a benchmark has shown me that users should be very careful with hints 
because they can do more harm than good.


unordered_multiset_hint.cc  unordered_set 100 X 2 insertions w/o hint            120r  120u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with any hint       130r  130u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with good hint       54r   54u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with perfect hint    36r   36u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions w/o hint             40r   40u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with any hint        38r   38u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with bad hint        49r   50u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with perfect hint    34r   35u  0s  6416mem  0pf


The small number represents how many times the same element is 
inserted and the big one the number of different elements. 100 X 2 
means that we loop 100 times inserting the 2 elements during each 
loop. 2 X 100 means that the main loop is on the elements and we 
insert each 100 times. Being able to insert all the equivalent elements 
at the same time or not has a major impact on the performance needed to 
get the same result. This is because when a new element is inserted it will 
be first in its bucket and the following 99 insertions will benefit from 
it even without any hint.
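
The two loop orders described above can be sketched like this.  It is a
minimal illustration, not the benchmark from the patch: the function names
and the use of int elements are assumptions, and only the standard
std::unordered_multiset::insert overloads (with and without a hint) are used.

```cpp
#include <cassert>
#include <unordered_set>

// Outer loop over repetitions, inner loop over the distinct values, so
// equivalent elements arrive interleaved with other values.
inline std::unordered_multiset<int>
insert_interleaved(int reps, int distinct)
{
    std::unordered_multiset<int> s;
    for (int r = 0; r < reps; ++r)
        for (int v = 0; v < distinct; ++v)
            s.insert(v);
    return s;
}

// Outer loop over the distinct values; every further copy of a value is
// inserted with a hint that points at an equivalent element.
inline std::unordered_multiset<int>
insert_grouped_with_hint(int reps, int distinct)
{
    assert(reps >= 1);
    std::unordered_multiset<int> s;
    for (int v = 0; v < distinct; ++v)
    {
        auto hint = s.insert(v);        // first copy, no hint available yet
        for (int r = 1; r < reps; ++r)
            hint = s.insert(hint, v);   // hint is equivalent to v
    }
    return s;
}
```

Both functions produce the same container contents; only the insertion
order and the use of the hint differ, which is what the timings compare.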


The benchmark also shows that a bad hint can be worse than no hint. A 
bad hint is one that, once used, requires checking that the next bucket is 
not impacted by the insertion. Doing so requires a hash code computation 
(if it is not cached, as in my use case) and a check. I have added a note 
about checking performance before using hints. Here is the 
result using the default std::hash, where the hash code is cached.


unordered_multiset_hint.cc  unordered_set 100 X 2 insertions w/o hint             76r   76u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with any hint        83r   83u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with good hint       29r   29u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with perfect hint    24r   23u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions w/o hint             27r   26u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with any hint        24r   24u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with bad hint        27r   27u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with perfect hint    23r   23u  0s  6416mem  0pf


Almost no impact in this case when using a bad hint. I am considering adding 
another condition to the use of the hint, which is to have the element 
after the hint also equivalent to the inserted element. This way we are 
sure that next bucket won't be affected and do not need to compu

Re: [google gcc-4_8] Restore max peeled instructions to old default

2013-06-06 Thread Xinliang David Li
ok.   What is the rationale for dropping the limit in trunk?  Ideally,
the limit should be lifted to enable other heuristics to kick
in.

David

On Thu, Jun 6, 2013 at 1:22 PM, Teresa Johnson  wrote:
> The default for the max instructions in peeled loops was reduced on
> trunk in r193570. This is causing a performance regression on an internal
> benchmark. This change will revert to the old higher limits.
>
> Google ref b/8839137.
>
> Bootstrapped and tested. Ok for google/4_8?
>
> Thanks,
> Teresa
>
> 2013-06-06  Teresa Johnson  
>
> * params.def (PARAM_MAX_PEELED_INSNS): Revert to 400.
> (PARAM_MAX_COMPLETELY_PEELED_INSNS): Ditto.
>
> Index: params.def
> ===
> --- params.def (revision 199753)
> +++ params.def (working copy)
> @@ -306,7 +306,7 @@ DEFPARAM(PARAM_MAX_UNROLL_TIMES,
>  DEFPARAM(PARAM_MAX_PEELED_INSNS,
>   "max-peeled-insns",
>   "The maximum number of insns of a peeled loop",
> - 100, 0, 0)
> + 400, 0, 0)
>  /* The maximum number of peelings of a single loop.  */
>  DEFPARAM(PARAM_MAX_PEEL_TIMES,
>   "max-peel-times",
> @@ -321,7 +321,7 @@ DEFPARAM(PARAM_MAX_PEEL_BRANCHES,
>  DEFPARAM(PARAM_MAX_COMPLETELY_PEELED_INSNS,
>   "max-completely-peeled-insns",
>   "The maximum number of insns of a completely peeled loop",
> - 100, 0, 0)
> + 400, 0, 0)
>  /* The maximum number of peelings of a single loop that is peeled
> completely.  */
>  DEFPARAM(PARAM_MAX_COMPLETELY_PEEL_TIMES,
>   "max-completely-peel-times",
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [google gcc-4_8] Restore max peeled instructions to old default

2013-06-06 Thread Xinliang David Li
We should make the default setting right for our environment. The
patch is trivial to maintain.

thanks,

David

On Thu, Jun 6, 2013 at 1:26 PM, Diego Novillo  wrote:
> On 2013-06-06 16:22 , Teresa Johnson wrote:
>>
>> The default for the max instructions in peeled loops was reduced on
>> trunk in r193570. This is causing a performance regression on an internal
>> benchmark. This change will revert to the old higher limits.
>>
>> Google ref b/8839137.
>>
>> Bootstrapped and tested. Ok for google/4_8?
>
>
> I wonder if this isn't something we want to set in our internal build
> configuration files.  I don't have a strong opinion one way or the other,
> however.  I'm only saying this because it would mean 1 less patch for us to
> maintain.
>
>
> Diego.


Re: [c++-concepts] code review

2013-06-06 Thread Jason Merrill

On 06/06/2013 01:47 PM, Andrew Sutton wrote:

I never did understand why this happens. Compiling with GCC-4.6, I get
these errors originating in logic.cc from an include of .
This is what I get:

/usr/include/c++/4.6/cstdlib:76:8: error: attempt to use poisoned "calloc"


Ah, I see: adding the include gets the mentions of malloc in before the 
names are poisoned.  This change is OK.



+; Activate C++ concepts support.
+Variable
+bool flag_concepts


You don't need to declare this separately.


I'm not quite sure what you mean. Separately from what?


Separately from


+C++ ObjC++ Var(flag_concepts, true)


This line declares flag_concepts implicitly.


That's the long and short of it. Gaby suggested writing constraints
here so that, for any instantiation, you would have easy access to the
constraints for that declaration.


I'm not sure why you would care about the constraints for a particular 
instantiation; constraints only apply to the template, right?



branch_goal queues a copy of the current sub-goal, returning a
reference to that new branch. The insertion of the operands are done
on different sub-goals, giving the expected results.


Right, I suppose I should have paid more attention to "This does not 
update the current goal"...



+template
+  tree
+  extract_goals (proof_state& s, F proj)


Why is proj a function argument rather than a template argument, which would
allow inlining?


STL influence. Can you give an example of how this should look in
order to take advantage of inlining?


I was thinking something like

template
tree
extract_goals (proof_state& s)
...
 return extract_goals(s);

but I suppose STL style is OK, too.
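
For readers following along, the two styles being contrasted can be sketched
as below.  This is one possible reading of the suggestion, not code from the
patch: every name here is hypothetical, and plain int values stand in for
the proof_state and tree types.

```cpp
// Hypothetical projection, standing in for the "proj" argument.
constexpr int pick_first(int a, int) { return a; }

// Function object with the same behaviour, for the STL-style call.
struct first_of
{
    constexpr int operator()(int a, int) const { return a; }
};

// STL style: the projection is an ordinary function argument whose type
// is deduced.
template<typename F>
constexpr int apply_stl_style(int a, int b, F proj)
{
    return proj(a, b);
}

// Alternative: the projection is a non-type template argument, so the
// callee is fixed at compile time for each instantiation.
template<int (*Proj)(int, int)>
constexpr int apply_template_arg(int a, int b)
{
    return Proj(a, b);
}
```

A call then looks like apply_stl_style(1, 2, first_of{}) in the first style
and apply_template_arg<pick_first>(1, 2) in the second.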


It was used in a previous version, and I suspect it might be useful in
the future, but I'm not 100% sure. I felt it would be worthwhile to
keep it in the patch just in case.


Makes sense.


And why do it this way
rather than check and possibly return at the top of the function, as
elsewhere in the parser?  You already have cp_parser_requires_clause
checking for RID_REQUIRES.


I was trying to write the parsing code a little more modularly so I
could keep my parse functions as small as possible. I use the facility
more heavily in the requires/validexpr code that's not included here.


Hmm, to me it seems more modular to keep all of the code for handling 
e.g. "requires" in its own function rather than needing two different 
places to know how a requires clause starts.



Why don't you use 'release' and conjoin_requirements here?


Because there is no template parameter list that can provide
additional requirements in this declaration.


OK, please add a comment to that effect.


+// Try to substitute ARGS into PARMS, returning the actual list of
+// arguments that have been substituted. If ARGS cannot be substituted,
+// return error_mark_node.


The comment sounds more like tsubst_template_parms than
coerce_template_parms.


It might be... I'll have to look. What I actually want to get is the
set of actual arguments that will be substituted for template
parameters given an initial set of arguments (lookup default
arguments, generate pack arguments, etc).


Right, I think coerce_template_parms has the effect you want, I just 
don't think of it as doing substitution, so the comment and name could 
use a tweak.  If the function doesn't go away, that is.


Jason



Re: [google gcc-4_8] Restore max peeled instructions to old default

2013-06-06 Thread Diego Novillo

On 2013-06-06 16:22 , Teresa Johnson wrote:

The default for the max instructions in peeled loops was reduced on
trunk in r193570. This is causing a performance regression on an internal
benchmark. This change will revert to the old higher limits.

Google ref b/8839137.

Bootstrapped and tested. Ok for google/4_8?


I wonder if this isn't something we want to set in our internal build 
configuration files.  I don't have a strong opinion one way or the 
other, however.  I'm only saying this because it would mean 1 less patch 
for us to maintain.



Diego.


[google gcc-4_8] Restore max peeled instructions to old default

2013-06-06 Thread Teresa Johnson
The default for the max instructions in peeled loops was reduced on
trunk in r193570. This is causing a performance regression on an internal
benchmark. This change will revert to the old higher limits.

Google ref b/8839137.

Bootstrapped and tested. Ok for google/4_8?

Thanks,
Teresa

2013-06-06  Teresa Johnson  

* params.def (PARAM_MAX_PEELED_INSNS): Revert to 400.
(PARAM_MAX_COMPLETELY_PEELED_INSNS): Ditto.

Index: params.def
===
--- params.def (revision 199753)
+++ params.def (working copy)
@@ -306,7 +306,7 @@ DEFPARAM(PARAM_MAX_UNROLL_TIMES,
 DEFPARAM(PARAM_MAX_PEELED_INSNS,
  "max-peeled-insns",
  "The maximum number of insns of a peeled loop",
- 100, 0, 0)
+ 400, 0, 0)
 /* The maximum number of peelings of a single loop.  */
 DEFPARAM(PARAM_MAX_PEEL_TIMES,
  "max-peel-times",
@@ -321,7 +321,7 @@ DEFPARAM(PARAM_MAX_PEEL_BRANCHES,
 DEFPARAM(PARAM_MAX_COMPLETELY_PEELED_INSNS,
  "max-completely-peeled-insns",
  "The maximum number of insns of a completely peeled loop",
- 100, 0, 0)
+ 400, 0, 0)
 /* The maximum number of peelings of a single loop that is peeled
completely.  */
 DEFPARAM(PARAM_MAX_COMPLETELY_PEEL_TIMES,
  "max-completely-peel-times",


--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
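
As a rough illustration of why this limit matters: a size-limited peeler can be modelled as dividing the insn budget by the loop size and then capping the result at the maximum peel count. The sketch below shows only that arithmetic. `peel_budget` and its parameter names are hypothetical and are not GCC functions, and the cap of 16 used in the note afterwards is an assumed value for max-peel-times.

```c
#include <assert.h>

/* Toy model of a size-limited peel count: how many copies of a loop body
   of NINSNS instructions fit into a budget of MAX_PEELED_INSNS, capped
   at MAX_PEEL_TIMES.  Hypothetical helper, not a GCC function.  */
unsigned
peel_budget (unsigned max_peeled_insns, unsigned max_peel_times,
             unsigned ninsns)
{
  unsigned npeel;

  assert (ninsns > 0);          /* An empty loop body makes no sense here.  */
  npeel = max_peeled_insns / ninsns;
  if (npeel > max_peel_times)
    npeel = max_peel_times;
  return npeel;
}
```

Under this model, a 30-insn loop body with a cap of 16 gets 3 peeled copies with a budget of 100 and 13 with a budget of 400, which is the kind of difference that could plausibly show up in a benchmark.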


new mul* patterns "U" constraint in rl78

2013-06-06 Thread DJ Delorie

This patch:

2013-05-31  Kaushik Phatak  
* config/rl78/rl78.md (mulqi3,mulhi3): New define_expands.
(*mulqi3_rl78,*mulhi3_rl78,*mulhi3_g13): New define_insns.

Uses a "U" constraint which isn't defined in rl78/constraints.md.

What should that constraint do?  Could you post a patch to add it?

They're also missing the valloc attribute, which controls register
allocation and clobbers.


Re: [PATCH] ARM: Don't clobber CC reg when it is live after the peephole window

2013-06-06 Thread Meador Inge
On 06/06/2013 08:11 AM, Richard Earnshaw wrote:

> I understand (and agree with) this bit...
> 
>> +(define_peephole2
>> +  [(set (reg:CC CC_REGNUM)
>> +(compare:CC (match_operand:SI 0 "register_operand" "")
>> +(match_operand:SI 1 "arm_rhs_operand" "")))
>> +   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
>> +  (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
>> +   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
>> +  (set (match_dup 2) (const_int 1)))
>> +   (match_scratch:SI 3 "r")]
>> +  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
>> +  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
>> +   (parallel
>> +[(set (reg:CC CC_REGNUM)
>> +  (compare:CC (const_int 0) (match_dup 3)))
>> + (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
>> +   (set (match_dup 2)
>> +(plus:SI (plus:SI (match_dup 2) (match_dup 3))
>> + (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
>> +
> 
> ... but what's this bit about?

The original intent was to revert back to the original peephole pattern
(pre-PR 46975) when the CC reg is still live, but that doesn't properly
maintain the CC state either (it just happened to pass in the test
case I was looking at because I only cared about the Z flag, which is
maintained the same).

OK with the above bit left out?

-- 
Meador Inge
CodeSourcery / Mentor Embedded


Re: [c++-concepts] code review

2013-06-06 Thread Andrew Sutton
Hi Jason,

Thanks for the comments. I just went ahead and fixed all the editorial
issues. Comments and questions below:


>> * gcc/system.h (cstdlib): Include <cstdlib> to avoid poisoned
>> declaration errors.
>
> Poisoned declarations of what?  This seems redundant with the #include
>  just below.

I never did understand why this happens. Compiling with GCC-4.6, I get
these errors originating in logic.cc from an include of <algorithm>.
This is what I get:

In file included from /usr/include/c++/4.6/bits/stl_algo.h:61:0,
 from /usr/include/c++/4.6/algorithm:63,
 from ../../c++-concepts/gcc/cp/logic.cc:45:
/usr/include/c++/4.6/cstdlib:76:8: error: attempt to use poisoned "calloc"
/usr/include/c++/4.6/cstdlib:83:8: error: attempt to use poisoned "malloc"
/usr/include/c++/4.6/cstdlib:89:8: error: attempt to use poisoned "realloc"
/usr/include/c++/4.6/cstdlib:112:11: error: attempt to use poisoned "calloc"
/usr/include/c++/4.6/cstdlib:119:11: error: attempt to use poisoned "malloc"
/usr/include/c++/4.6/cstdlib:127:11: error: attempt to use poisoned "realloc"


>> +  /* Concepts-related keywords */
>> +  { "assume",  RID_ASSUME, D_CXXONLY | D_CXX0X | D_CXXWARN },
>> +  { "axiom",   RID_AXIOM,  D_CXXONLY | D_CXX0X | D_CXXWARN },
>> +  { "concept", RID_CONCEPT,D_CXXONLY | D_CXX0X | D_CXXWARN },
>> +  { "forall",  RID_FORALL, D_CXXONLY | D_CXX0X | D_CXXWARN },
>> +  { "requires",RID_REQUIRES,   D_CXXONLY | D_CXX0X | D_CXXWARN },
>
>
> I don't see anything that limits these keywords to when concepts are
> enabled.  You probably want to add an additional mask that applies to these.

Ok. I'll add D_CXX_CONCEPTS and set it for all of these reserved words.
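
A minimal sketch of how such a mask could gate the keywords, assuming a table where each reserved word carries a set of "disable" bits and a word is reserved only when none of its bits is in the currently disabled set. The bit values and the three table entries are hypothetical and chosen for illustration; only the flag names come from the patch and the reply above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits; the real values are defined by the front end.  */
#define D_CXXONLY       0x01
#define D_CXX0X         0x02
#define D_CXXWARN       0x04
#define D_CXX_CONCEPTS  0x08    /* The additional mask discussed above.  */

struct resword { const char *word; unsigned disable; };

/* Illustrative table: two concepts keywords and one ordinary keyword.  */
static const struct resword reswords[] = {
  { "concept",  D_CXXONLY | D_CXX0X | D_CXXWARN | D_CXX_CONCEPTS },
  { "requires", D_CXXONLY | D_CXX0X | D_CXXWARN | D_CXX_CONCEPTS },
  { "nullptr",  D_CXXONLY | D_CXX0X | D_CXXWARN },
};

/* Return 1 if WORD is reserved given the set of disabled feature bits.  */
int
is_reserved (const char *word, unsigned disabled_mask)
{
  size_t i;

  assert (word != NULL);
  for (i = 0; i < sizeof reswords / sizeof reswords[0]; i++)
    if (strcmp (reswords[i].word, word) == 0)
      return (reswords[i].disable & disabled_mask) == 0;
  return 0;
}
```

With concepts switched off, the caller would pass D_CXX_CONCEPTS in the disabled mask, so "concept" and "requires" stay ordinary identifiers while "nullptr" is unaffected.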


>> +; Activate C++ concepts support.
>> +Variable
>> +bool flag_concepts
>
> You don't need to declare this separately.

I'm not quite sure what you mean. Separately from what?


>> +static tree
>> +resolve_constraint_check (tree ovl, tree args)
>
> This function seems to be trying to model a subset of overload resolution,
> which seems fragile to me; better to use the actual overload resolution code
> to decide which function the constraint expression calls, or at least
> resolve_nondeduced_context which handles SFINAE.

It is. I was a little hesitant to use the actual overload resolution
facility because of the restrictions for concepts. I think I was also
doing something a little different in a previous version.

I'll take another look and see if either will work instead of my
homebrew solution.


>
>> +case CAST_EXPR:
>> +  return reduce_node (TREE_VALUE (TREE_OPERAND (t, 0)));
>
> Are we assuming that constraint expressions can't involve objects of literal
> class type?

For now, I think it's a reasonable restriction. We can relax this as
needed in the future.


>>  struct GTY(()) tree_template_info {
>>struct tree_common common;
>> +  tree constraint;
>>vec
>> *typedefs_needing_access_checking;
>>  };
>
>
> Why do we need constraint information in template_info?  I suppose this is
> the issue you raised in your mail last month:
>
>> I had expected there to be a template decl associated with underlying
>> class, but after print-bombing most of the instantiation, lookup, and
>> specialization processing routines, I couldn't find that one was ever
>> created for the type decl.
>
> This does seem like a shortcoming, that also led to the typedefs vec getting
> stuck into the template_info inappropriately.  I guess we should start
> building TEMPLATE_DECLs for partial specializations.

That's the long and short of it. Gaby suggested writing constraints
here so that, for any instantiation, you would have easy access to the
constraints for that declaration.


>> +struct GTY(()) tree_constraint_info {
>> +  struct tree_base base;
>> +  tree spelling;
>> +  tree requirements;
>> +  tree assumptions;
>> +};
>
>
> I'm confused by the relationship between the comment and the field names
> here.  Where do the conclusions come in?  Is "requirements (as a constant
> expression)" in the spelling or requirements field?

I must have forgotten to update the comments. I probably need to
re-think this structure a bit. The requirements field is the complete
set of requirements (shorthand constraints + requires clause). The
assumptions field is the analyzed requirements.

I was using the "spelling" field specifically for diagnostics, so I
could print exactly what was written. I think it might be better to
hang that off the template parameter list rather than the constraint
info.

I don't think "spelling" is used in the current patch other than initialization.


>> +  DECL_DECLARED_CONCEPT_P (decl) = true;
>> +  if (!check_concept_fn (decl))
>> +return NULL_TREE;
>> +}
>
>
> I think I'd rather deal with an invalid concept by not marking it as a
> concept, but still declaring it as a constexpr function.

Sounds reasonable.


>> +// Return the current list of assumed terms.
>>

[Patch, Fortran] GCC 4.7 backport: PR54370 - ICE with non-default-kind logicals

2013-06-06 Thread Tobias Burnus

Committed as Rev. 199746 to the GCC 4.7 branch.

Tobias
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 199745)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,12 @@
+2013-06-06  Tobias Burnus  
+
+	Backport from mainline
+	2012-08-27  Tobias Burnus  
+
+	PR fortran/54370
+	* trans-stmt.c (gfc_trans_do_while): Don't change the logical
+	kind for negation of the condition.
+
 2013-06-01  Janus Weil  
 	Tobias Burnus  
 
Index: gcc/fortran/trans-stmt.c
===
--- gcc/fortran/trans-stmt.c	(Revision 199745)
+++ gcc/fortran/trans-stmt.c	(Arbeitskopie)
@@ -1743,7 +1743,7 @@ gfc_trans_do_while (gfc_code * code)
   gfc_conv_expr_val (&cond, code->expr1);
   gfc_add_block_to_block (&block, &cond.pre);
   cond.expr = fold_build1_loc (code->expr1->where.lb->location,
-			   TRUTH_NOT_EXPR, boolean_type_node, cond.expr);
+			   TRUTH_NOT_EXPR, TREE_TYPE (cond.expr), cond.expr);
 
   /* Build "IF (! cond) GOTO exit_label".  */
   tmp = build1_v (GOTO_EXPR, exit_label);
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 199745)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,11 @@
+2013-06-06  Tobias Burnus  
+
+	Backport from mainline
+	2012-08-27  Tobias Burnus  
+
+	PR fortran/54370
+	* gfortran.dg/do_5.f90: New.
+
 2013-06-01  Janus Weil  
 	Tobias Burnus  
 
Index: gcc/testsuite/gfortran.dg/do_5.f90
===
--- gcc/testsuite/gfortran.dg/do_5.f90	(Revision 0)
+++ gcc/testsuite/gfortran.dg/do_5.f90	(Arbeitskopie)
@@ -0,0 +1,29 @@
+! { dg-do compile }
+!
+! PR fortran/54370
+!
+! The following program was ICEing at tree-check time
+! "L()" was regarded as default-kind logical.
+!
+! Contributed by Kirill Chilikin
+!
+  MODULE M
+  CONTAINS
+
+  LOGICAL(C_BOOL) FUNCTION L() BIND(C)
+  USE, INTRINSIC :: ISO_C_BINDING
+  L = .FALSE.
+  END FUNCTION
+
+  LOGICAL(8) FUNCTION L2() BIND(C) ! { dg-warning "may not be a C interoperable kind but it is bind" }
+  L2 = .FALSE._8
+  END FUNCTION
+
+  SUBROUTINE S()
+  DO WHILE (L())
+  ENDDO
+  DO WHILE (L2())
+  ENDDO
+  END
+
+  END


[PATCH, alpha]: Update baseline_symbols.txt

2013-06-06 Thread Uros Bizjak
Hello!

This patch avoids ABI check failure on alpha.

2013-06-06  Uros Bizjak  

* config/abi/post/alpha-linux-gnu/baseline_symbols.txt: Update.

Tested on alphaev68-pc-linux-gnu.

OK for mainline?

Uros.
Index: config/abi/post/alpha-linux-gnu/baseline_symbols.txt
===
--- config/abi/post/alpha-linux-gnu/baseline_symbols.txt (revision 199702)
+++ config/abi/post/alpha-linux-gnu/baseline_symbols.txt (working copy)
@@ -403,6 +403,7 @@
 FUNC:_ZNKSt15basic_streambufIwSt11char_traitsIwEE6getlocEv@@GLIBCXX_3.4
 FUNC:_ZNKSt15basic_stringbufIcSt11char_traitsIcESaIcEE3strEv@@GLIBCXX_3.4
 FUNC:_ZNKSt15basic_stringbufIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4
+FUNC:_ZNKSt16bad_array_length4whatEv@@CXXABI_1.3.8
 FUNC:_ZNKSt17__gnu_cxx_ldbl1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intIjEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_LDBL_3.4
 FUNC:_ZNKSt17__gnu_cxx_ldbl1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intIlEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_LDBL_3.4
 FUNC:_ZNKSt17__gnu_cxx_ldbl1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intImEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_LDBL_3.4
@@ -556,6 +557,7 @@
 FUNC:_ZNKSt19basic_ostringstreamIcSt11char_traitsIcESaIcEE5rdbufEv@@GLIBCXX_3.4
 FUNC:_ZNKSt19basic_ostringstreamIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4
 FUNC:_ZNKSt19basic_ostringstreamIwSt11char_traitsIwESaIwEE5rdbufEv@@GLIBCXX_3.4
+FUNC:_ZNKSt20bad_array_new_length4whatEv@@CXXABI_1.3.8
 FUNC:_ZNKSt3tr14hashIRKSbIwSt11char_traitsIwESaIwEEEclES6_@@GLIBCXX_3.4.10
 FUNC:_ZNKSt3tr14hashIRKSsEclES2_@@GLIBCXX_3.4.10
 FUNC:_ZNKSt3tr14hashISbIwSt11char_traitsIwESaIwEEEclES4_@@GLIBCXX_3.4.10
@@ -1934,6 +1936,9 @@
 FUNC:_ZNSt16__numpunct_cacheIwED0Ev@@GLIBCXX_3.4
 FUNC:_ZNSt16__numpunct_cacheIwED1Ev@@GLIBCXX_3.4
 FUNC:_ZNSt16__numpunct_cacheIwED2Ev@@GLIBCXX_3.4
+FUNC:_ZNSt16bad_array_lengthD0Ev@@CXXABI_1.3.8
+FUNC:_ZNSt16bad_array_lengthD1Ev@@CXXABI_1.3.8
+FUNC:_ZNSt16bad_array_lengthD2Ev@@CXXABI_1.3.8
 FUNC:_ZNSt16invalid_argumentC1ERKSs@@GLIBCXX_3.4
 FUNC:_ZNSt16invalid_argumentC2ERKSs@@GLIBCXX_3.4
 FUNC:_ZNSt16invalid_argumentD0Ev@@GLIBCXX_3.4
@@ -2098,6 +2103,9 @@
 FUNC:_ZNSt19istreambuf_iteratorIcSt11char_traitsIcEEppEv@GLIBCXX_3.4
 FUNC:_ZNSt19istreambuf_iteratorIwSt11char_traitsIwEEppEv@@GLIBCXX_3.4.5
 FUNC:_ZNSt19istreambuf_iteratorIwSt11char_traitsIwEEppEv@GLIBCXX_3.4
+FUNC:_ZNSt20bad_array_new_lengthD0Ev@@CXXABI_1.3.8
+FUNC:_ZNSt20bad_array_new_lengthD1Ev@@CXXABI_1.3.8
+FUNC:_ZNSt20bad_array_new_lengthD2Ev@@CXXABI_1.3.8
 FUNC:_ZNSt22condition_variable_anyC1Ev@@GLIBCXX_3.4.11
 FUNC:_ZNSt22condition_variable_anyC2Ev@@GLIBCXX_3.4.11
 FUNC:_ZNSt22condition_variable_anyD1Ev@@GLIBCXX_3.4.11
@@ -2128,6 +2136,8 @@
 FUNC:_ZNSt6__norm15_List_node_base8transferEPS0_S1_@@GLIBCXX_3.4.9
 FUNC:_ZNSt6__norm15_List_node_base9_M_unhookEv@@GLIBCXX_3.4.14
 FUNC:_ZNSt6chrono12system_clock3nowEv@@GLIBCXX_3.4.11
+FUNC:_ZNSt6chrono3_V212steady_clock3nowEv@@GLIBCXX_3.4.19
+FUNC:_ZNSt6chrono3_V212system_clock3nowEv@@GLIBCXX_3.4.19
 FUNC:_ZNSt6gslice8_IndexerC1EmRKSt8valarrayImES4_@@GLIBCXX_3.4
 FUNC:_ZNSt6gslice8_IndexerC2EmRKSt8valarrayImES4_@@GLIBCXX_3.4
 FUNC:_ZNSt6locale11_M_coalesceERKS_S1_i@@GLIBCXX_3.4
@@ -2402,17 +2412,17 @@
 FUNC:_ZNVSt9__atomic011atomic_flag5clearESt12memory_order@@GLIBCXX_3.4.11
 FUNC:_ZSt10unexpectedv@@GLIBCXX_3.4
 FUNC:_ZSt11_Hash_bytesPKvmm@@CXXABI_1.3.5
-FUNC:_ZSt13get_terminatev@@GLIBCXX_3.4.19
+FUNC:_ZSt13get_terminatev@@GLIBCXX_3.4.20
 FUNC:_ZSt13set_terminatePFvvE@@GLIBCXX_3.4
 FUNC:_ZSt14__convert_to_vIdEvPKcRT_RSt12_Ios_IostateRKP15__locale_struct@@GLIBCXX_3.4
 FUNC:_ZSt14__convert_to_vIeEvPKcRT_RSt12_Ios_IostateRKP15__locale_struct@@GLIBCXX_3.4
 FUNC:_ZSt14__convert_to_vIfEvPKcRT_RSt12_Ios_IostateRKP15__locale_struct@@GLIBCXX_3.4
 FUNC:_ZSt14__convert_to_vIgEvPKcRT_RSt12_Ios_IostateRKP15__locale_struct@@GLIBCXX_LDBL_3.4
-FUNC:_ZSt14get_unexpectedv@@GLIBCXX_3.4.19
+FUNC:_ZSt14get_unexpectedv@@GLIBCXX_3.4.20
 FUNC:_ZSt14set_unexpectedPFvvE@@GLIBCXX_3.4
 FUNC:_ZSt15_Fnv_hash_bytesPKvmm@@CXXABI_1.3.5
 FUNC:_ZSt15future_categoryv@@GLIBCXX_3.4.15
-FUNC:_ZSt15get_new_handlerv@@GLIBCXX_3.4.19
+FUNC:_ZSt15get_new_handlerv@@GLIBCXX_3.4.20
 FUNC:_ZSt15set_new_handlerPFvvE@@GLIBCXX_3.4
 FUNC:_ZSt15system_categoryv@@GLIBCXX_3.4.11
 FUNC:_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l@@GLIBCXX_3.4.9
@@ -2692,6 +2702,8 @@
 FUNC:__cxa_rethrow@@CXXABI_1.3
 FUNC:__cxa_thread_atexit@@CXXABI_1.3.7
 FUNC:__cxa_throw@@CXXABI_1.3
+FUNC:__cxa_throw_bad_array_length@@CXXABI_1.3.8
+FUNC:__cxa_throw_bad_array_new_length@@CXXABI_1.3.8
 FUNC:__cxa_tm_cleanup@@CXXABI_TM_1
 FUNC:__cxa_vec_cctor@@CXXABI_1.3
 FUNC:__cxa_vec_cleanup@@CXXABI_1.3
@@ -2738,6 +2750,7 @@
 OBJECT:0:CXXABI_1.3.5
 OBJECT:0:CXXABI_1.3.6
 OBJECT:0:CXXABI_1.3.7
+OBJECT:0:CXXABI_1.3.8
 OBJECT:0:CXXABI_LDBL_1.3
 OBJECT:0:C

Re: [PATCH][ARM] Fix iordi3_insn constraints

2013-06-06 Thread Richard Earnshaw

On 06/06/13 17:08, Kyrylo Tkachov wrote:

Hi all,

The constraint for iordi3_insn should take into account that we don't
have an orr instruction to deal with inverted immediate in ARM mode, but
we do in Thumb2 mode (orn). I had tried to reuse the De
constraint from the anddi3 case, but that's not appropriate in all
cases, causing an ICE in some cases. This patch adds a new constraint
for use by iordi3_insn to ensure that the appropriate immediates are
accepted.

Tested arm-none-eabi with and without -mthumb on qemu.

Ok for trunk?

Thanks,
Kyrill

2013-06-06  Kyrylo Tkachov  

* config/arm/constraints.md (Df): New constraint.
* config/arm/arm.md (iordi3_insn): Use Df constraint instead of
De.
Correct length attribute for last two alternatives.



OK.

R.




Re: [PATCH][ARM][5/n] Partial IT block deprecation in ARMv8 AArch32 - load/store multiple

2013-06-06 Thread Richard Earnshaw

On 06/06/13 17:26, Richard Henderson wrote:

On 06/06/2013 08:02 AM, Richard Earnshaw wrote:

  (define_insn "add<mode>3"
-  [(set (match_operand:FIXED 0 "s_register_operand" "=r")
-(plus:FIXED (match_operand:FIXED 1 "s_register_operand" "r")
-(match_operand:FIXED 2 "s_register_operand" "r")))]
+  [(set (match_operand:FIXED 0 "s_register_operand" "=r,l")
+(plus:FIXED (match_operand:FIXED 1 "s_register_operand" "r,l")
+(match_operand:FIXED 2 "s_register_operand" "r,l")))]

It would probably be better to put the 'l' variant first. This should encourage
register allocation to prefer low registers and that might lead to other
optimizations later on.  Similarly for sub<mode>3.


It's also 100% required in order to make the l alternative ever chosen.

When we compute which_alternative post-reload, we'll see that r matches
and always choose alternative 0.  If you've been examining asm dumps of
various test cases, in addition to your bootstrapping, you'll have seen
no IT predicated addition insns after this patch.


r~



Not quite.  In the cond-exec case the first alternative can be disabled, 
then only those cases that match the second set of constraints could be 
conditionalized.


R.
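
The disagreement above can be sketched with a toy model of "first matching alternative wins", extended with a per-alternative enable mask. This is not GCC's matching code. The register numbering, the two class letters, and the function names are assumptions made for illustration.

```c
#include <assert.h>

/* Toy register classes: 'l' = low registers r0-r7, 'r' = any of r0-r15.  */
static int
class_matches (char cls, int regno)
{
  if (cls == 'l')
    return regno >= 0 && regno <= 7;
  return regno >= 0 && regno <= 15;   /* 'r' */
}

/* Return the index of the first enabled alternative whose class accepts
   REGNO, or -1 if none does.  ENABLED is a bitmask over alternatives.  */
int
first_matching_alternative (const char *alts, int nalts,
                            unsigned enabled, int regno)
{
  int i;

  assert (nalts >= 0 && nalts <= 32);
  for (i = 0; i < nalts; i++)
    if ((enabled & (1u << i)) && class_matches (alts[i], regno))
      return i;
  return -1;
}
```

With the order "r,l" and both alternatives enabled, a low register always lands on alternative 0, which is the first point made above. If alternative 0 is disabled, as in the cond-exec case, the same low register selects the 'l' alternative, and a high register matches nothing.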



Re: [PATCH][ARM][5/n] Partial IT block deprecation in ARMv8 AArch32 - load/store multiple

2013-06-06 Thread Richard Henderson
On 06/06/2013 08:02 AM, Richard Earnshaw wrote:
>  (define_insn "add<mode>3"
> -  [(set (match_operand:FIXED 0 "s_register_operand" "=r")
> -(plus:FIXED (match_operand:FIXED 1 "s_register_operand" "r")
> -(match_operand:FIXED 2 "s_register_operand" "r")))]
> +  [(set (match_operand:FIXED 0 "s_register_operand" "=r,l")
> +(plus:FIXED (match_operand:FIXED 1 "s_register_operand" "r,l")
> +(match_operand:FIXED 2 "s_register_operand" "r,l")))]
> 
> It would probably be better to put the 'l' variant first. This should encourage
> register allocation to prefer low registers and that might lead to other
> optimizations later on.  Similarly for sub<mode>3.

It's also 100% required in order to make the l alternative ever chosen.

When we compute which_alternative post-reload, we'll see that r matches
and always choose alternative 0.  If you've been examining asm dumps of
various test cases, in addition to your bootstrapping, you'll have seen
no IT predicated addition insns after this patch.


r~


Re: [GOOGLE] More strict checking for call args

2013-06-06 Thread Xinliang David Li
On Thu, Jun 6, 2013 at 7:11 AM, Martin Jambor  wrote:
> Hi,
>
> On Tue, Jun 04, 2013 at 05:19:02PM -0700, Dehao Chen wrote:
>> attached is a testcase that would cause problem when source has changed:
>>
>> $ g++ test.cc -O2 -fprofile-generate -DOLD
>> $ ./a.out
>> $ g++ test.cc -O2 -fprofile-use
>> test.cc:34:1: internal compiler error: in operator[], at vec.h:815
>>  }
>>  ^
>> 0x512740 vec::operator[](unsigned int)
>> ../../gcc/vec.h:815
>> 0x512740 vec::operator[](unsigned int)
>> ../../gcc/vec.h:1244
>> 0xf24464 vec::operator[](unsigned int)
>> ../../gcc/vec.h:815
>> 0xf24464 vec::operator[](unsigned int)
>> ../../gcc/vec.h:1244
>> 0xf24464 ipa_get_indirect_edge_target_1
>> ../../gcc/ipa-cp.c:1535
>> 0x971b9a estimate_edge_devirt_benefit
>> ../../gcc/ipa-inline-analysis.c:2757
>
> Hm, this seems rather like an omission in ipa_get_indirect_edge_target_1.
> Since it is called also from inlining, we can have parameter count
> mismatches... and in fact in non-virtual paths of that function we do
> check that we don't.  Because all callers have to pass known_vals
> describing all formal parameters of the inline tree root, we should
> apply the fix below (I've only just started running a bootstrap and
> testsuite on x86_64, though).
>
> OTOH, while I understand that FDO can change inlining sufficiently so
> that this error occurs, IMHO this should not be caused by outdated
> profiles but there is somewhere a parameter mismatch in the source.

Martin, what do you mean by the above?

thanks,

David


>
> Dehao, can you please check that this patch helps?
>
> Richi, if it does and the patch passes bootstrap and tests, is it OK
> for trunk and 4.8 branch?
>
> Thanks and sorry for the trouble,
>
> Martin
>
>
> 2013-06-06  Martin Jambor  
>
> * ipa-cp.c (ipa_get_indirect_edge_target_1): Check that param_index is
> within bounds at the beginning of the function.
>
> Index: src/gcc/ipa-cp.c
> ===
> --- src.orig/gcc/ipa-cp.c
> +++ src/gcc/ipa-cp.c
> @@ -1481,7 +1481,8 @@ ipa_get_indirect_edge_target_1 (struct c
>tree otr_type;
>tree t;
>
> -  if (param_index == -1)
> +  if (param_index == -1
> +  || known_vals.length () <= (unsigned int) param_index)
>  return NULL_TREE;
>
>if (!ie->indirect_info->polymorphic)
> @@ -1516,8 +1517,7 @@ ipa_get_indirect_edge_target_1 (struct c
> t = NULL;
> }
>else
> -   t = (known_vals.length () > (unsigned int) param_index
> -? known_vals[param_index] : NULL);
> +   t = NULL;
>
>if (t &&
>   TREE_CODE (t) == ADDR_EXPR
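
The shape of the fix in the quoted patch, one bounds check at the top instead of a check at each use, can be sketched in isolation. The function and parameter names below are hypothetical stand-ins, and a plain array with an explicit length replaces the vec type.

```c
#include <assert.h>
#include <stddef.h>

/* Return the entry for PARAM_INDEX, or NULL when the index is the -1
   "unknown" marker or lies beyond the LEN entries actually supplied.  */
const char *
known_val_or_null (const char *const *known_vals, size_t len, int param_index)
{
  assert (known_vals != NULL || len == 0);
  if (param_index == -1 || len <= (size_t) (unsigned int) param_index)
    return NULL;
  return known_vals[param_index];
}
```

Once the early return is in place, every later access can index without re-checking, which is why the second hunk of the patch can drop its own length test.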


Re: [GOOGLE] More strict checking for call args

2013-06-06 Thread Xinliang David Li
gimple_check_call_matching_types is called by check_ic_target. Instead
of moving the check out of this guard function, maybe enhance the
interface so that it can guard with different levels of strictness?

David

On Thu, Jun 6, 2013 at 8:10 AM, Dehao Chen  wrote:
> Hi, Martin,
>
> Yes, your patch can fix my case. Thanks a lot for the fix.
>
> With the fix, value profiling will still promote the wrong indirect
> call target. Although it will not be inlined, it results in an
> gimple_check_call_matching_types, we also check if number of args
> match number of params in target->symbol.decl?
>
> Thanks,
> Dehao
>
>
> On Thu, Jun 6, 2013 at 7:11 AM, Martin Jambor  wrote:
>>
>> Hi,
>>
>> On Tue, Jun 04, 2013 at 05:19:02PM -0700, Dehao Chen wrote:
>> > attached is a testcase that would cause problem when source has changed:
>> >
>> > $ g++ test.cc -O2 -fprofile-generate -DOLD
>> > $ ./a.out
>> > $ g++ test.cc -O2 -fprofile-use
>> > test.cc:34:1: internal compiler error: in operator[], at vec.h:815
>> >  }
>> >  ^
>> > 0x512740 vec::operator[](unsigned int)
>> > ../../gcc/vec.h:815
>> > 0x512740 vec::operator[](unsigned int)
>> > ../../gcc/vec.h:1244
>> > 0xf24464 vec::operator[](unsigned int)
>> > ../../gcc/vec.h:815
>> > 0xf24464 vec::operator[](unsigned int)
>> > ../../gcc/vec.h:1244
>> > 0xf24464 ipa_get_indirect_edge_target_1
>> > ../../gcc/ipa-cp.c:1535
>> > 0x971b9a estimate_edge_devirt_benefit
>> > ../../gcc/ipa-inline-analysis.c:2757
>>
>> Hm, this seems rather like an omission in ipa_get_indirect_edge_target_1.
>> Since it is called also from inlining, we can have parameter count
>> mismatches... and in fact in non-virtual paths of that function we do
>> check that we don't.  Because all callers have to pass known_vals
>> describing all formal parameters of the inline tree root, we should
>> apply the fix below (I've only just started running a bootstrap and
>> testsuite on x86_64, though).
>>
>> OTOH, while I understand that FDO can change inlining sufficiently so
>> that this error occurs, IMHO this should not be caused by outdated
>> profiles but there is somewhere a parameter mismatch in the source.
>>
>> Dehao, can you please check that this patch helps?
>>
>> Richi, if it does and the patch passes bootstrap and tests, is it OK
>> for trunk and 4.8 branch?
>>
>> Thanks and sorry for the trouble,
>>
>> Martin
>>
>>
>> 2013-06-06  Martin Jambor  
>>
>> * ipa-cp.c (ipa_get_indirect_edge_target_1): Check that param_index is
>> within bounds at the beginning of the function.
>>
>> Index: src/gcc/ipa-cp.c
>> ===
>> --- src.orig/gcc/ipa-cp.c
>> +++ src/gcc/ipa-cp.c
>> @@ -1481,7 +1481,8 @@ ipa_get_indirect_edge_target_1 (struct c
>>tree otr_type;
>>tree t;
>>
>> -  if (param_index == -1)
>> +  if (param_index == -1
>> +  || known_vals.length () <= (unsigned int) param_index)
>>  return NULL_TREE;
>>
>>if (!ie->indirect_info->polymorphic)
>> @@ -1516,8 +1517,7 @@ ipa_get_indirect_edge_target_1 (struct c
>> t = NULL;
>> }
>>else
>> -   t = (known_vals.length () > (unsigned int) param_index
>> -? known_vals[param_index] : NULL);
>> +   t = NULL;
>>
>>if (t &&
>>   TREE_CODE (t) == ADDR_EXPR


Re: [ARM] Avoid spilling ip for nested APCS frames

2013-06-06 Thread Richard Earnshaw

On 31/05/13 18:59, Eric Botcazou wrote:

The ARM/VxWorks port uses APCS frames and therefore ip to establish frames
with a frame pointer.  Now, for nested functions, ip is also the static chain
register so it needs to be preserved when the frame is being established.

There is code to that effect trying to save ip into r3 if the latter register
is available but, unfortunately, it uses df_regs_ever_live_p (3) to detect the
availability and this returns true for any non-toy function.

Fixed by implementing an arm_r3_live_at_start_p modelled on the implementation
of an equivalent predicate for %eax in the x86 back-end.
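
The distinction between "ever live" and "live at start" can be sketched with a toy model. The structure and function names below are hypothetical and are not the DF API. Registers are modelled as bits in a 32-bit mask.

```c
#include <assert.h>

#define MAX_BLOCKS 32

/* Toy function summary: USED[i] is the set of registers touched anywhere
   in basic block i; LIVE_IN_ENTRY is the set of registers that hold a
   value on entry to the function (incoming arguments and the like).  */
struct toy_function
{
  unsigned used[MAX_BLOCKS];
  int nblocks;
  unsigned live_in_entry;
};

/* "Ever live": true if REGNO is touched in any block.  */
int
reg_ever_live_p (const struct toy_function *fn, int regno)
{
  int i;

  assert (regno >= 0 && regno < 32);
  for (i = 0; i < fn->nblocks; i++)
    if (fn->used[i] & (1u << regno))
      return 1;
  return 0;
}

/* "Live at start": true only if REGNO holds a value on function entry.  */
int
reg_live_at_start_p (const struct toy_function *fn, int regno)
{
  assert (regno >= 0 && regno < 32);
  return (fn->live_in_entry & (1u << regno)) != 0;
}
```

A function that uses r3 as a scratch register somewhere in its body, but does not receive an argument in it, is "ever live" for r3 yet not "live at start", so r3 is still free to hold ip while the frame is being set up.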

Tested on ARM/VxWorks, OK for the mainline?


2013-05-31  Eric Botcazou  

* config/arm/arm.c (arm_r3_live_at_start_p): New predicate.
(arm_compute_static_chain_stack_bytes): Use it.  Tidy up.
(arm_expand_prologue): Likewise.



OK

R.



[PATCH][ARM] Cleanup anddi3 constraints

2013-06-06 Thread Kyrylo Tkachov
Hi all,

This patch cleans up the anddi3_insn pattern by removing duplicate
alternatives and redundant attribute setters.
It also restricts the splitting to after reload and when we know that
we're not using the NEON versions.

Tested arm-none-eabi on qemu.

Ok for trunk?

Thanks,
Kyrill

2013-06-06  Kyrylo Tkachov  

* config/arm/arm.md (anddi3_insn): Remove duplicate
alternatives. Clean up.

cleanup_anddi3.patch
Description: Binary data


[PATCH][ARM] Fix iordi3_insn constraints

2013-06-06 Thread Kyrylo Tkachov
Hi all,

The constraint for iordi3_insn should take into account that we don't
have an orr instruction to deal with inverted immediate in ARM mode, but
we do in Thumb2 mode (orn). I had tried to reuse the De
constraint from the anddi3 case, but that's not appropriate in all
cases, causing an ICE in some cases. This patch adds a new constraint
for use by iordi3_insn to ensure that the appropriate immediates are
accepted.

Tested arm-none-eabi with and without -mthumb on qemu.

Ok for trunk?

Thanks,
Kyrill

2013-06-06  Kyrylo Tkachov  

* config/arm/constraints.md (Df): New constraint.
* config/arm/arm.md (iordi3_insn): Use Df constraint instead of
De.
Correct length attribute for last two alternatives.

fix_iordi.patch
Description: Binary data


Re: [ARM] Resurrect VxWorks port

2013-06-06 Thread Richard Earnshaw

On 31/05/13 18:57, Eric Botcazou wrote:

Hi,

as diagnosed by Doug, the VxWorks port cannot be built since:

2011-05-18  Joseph Myers  

which reorganized the ARM options and turned arm_fp16_format from a global
variable defined in arm.c into an option variable, leading to:

In file included from ../../.././gcc/tm.h:20:0,
  from /home/eric/svn/gcc/libgcc/fp-bit.c:38:
/home/eric/svn/gcc/libgcc/fp-bit.c: In function '__pack_f':
/home/eric/svn/gcc/libgcc/../gcc/config/arm/arm.h:426:22: error:
'arm_fp16_format' undeclared (first use in this function)
  ((bits) == 16 && arm_fp16_format == ARM_FP16_FORMAT_ALTERNATIVE)
   ^
/home/eric/svn/gcc/libgcc/fp-bit.c:205:7: note: in expansion of macro
'LARGEST_EXPONENT_IS_NORMAL'
if (LARGEST_EXPONENT_IS_NORMAL (FRAC_NBITS) && (isnan (src) || isinf
(src)))
^
/home/eric/svn/gcc/libgcc/../gcc/config/arm/arm.h:426:22: note: each
undeclared identifier is reported only once for each function it appears in
  ((bits) == 16 && arm_fp16_format == ARM_FP16_FORMAT_ALTERNATIVE)
   ^
/home/eric/svn/gcc/libgcc/fp-bit.c:205:7: note: in expansion of macro
'LARGEST_EXPONENT_IS_NORMAL'
if (LARGEST_EXPONENT_IS_NORMAL (FRAC_NBITS) && (isnan (src) || isinf
(src)))
^
make[3]: *** [_pack_sf.o] Error 1

Because fp-bit.c references macro
LARGEST_EXPONENT_IS_NORMAL (FRAC_NBITS)
which is defined in arm.h and references arm_fp16_format
which in turn is now a macro defined in options.h:
#define arm_fp16_format global_options.x_arm_fp16_format.

Furthermore

2011-08-05  Rainer Orth  

moved fp-bit.c into libgcc, making it more convoluted to fix it.
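
The macro chain described above can be reproduced in miniature. The enum, the struct, and the wrapper function below are simplified stand-ins for the generated option machinery, not the real declarations; only the two macro names follow the error output.

```c
#include <assert.h>

/* Stand-ins for the generated option machinery (assumed shapes).  */
enum toy_fp16_format { FP16_NONE, FP16_IEEE, FP16_ALTERNATIVE };
struct toy_options { enum toy_fp16_format x_arm_fp16_format; };
struct toy_options global_options = { FP16_NONE };

/* The option variable is only a macro over a struct field ...  */
#define arm_fp16_format global_options.x_arm_fp16_format

/* ... so any macro that mentions it needs that definition to be visible
   wherever the macro is expanded.  */
#define LARGEST_EXPONENT_IS_NORMAL(bits) \
  ((bits) == 16 && arm_fp16_format == FP16_ALTERNATIVE)

int
largest_exponent_is_normal (int bits)
{
  assert (bits > 0);
  return LARGEST_EXPONENT_IS_NORMAL (bits);
}
```

This compiles only because the `arm_fp16_format` macro and the struct behind it are in scope. Remove the `#define arm_fp16_format` line and the compiler reports an undeclared identifier at the point of expansion, which matches the shape of the fp-bit.c failure.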

So we are proposing to switch the VxWorks port over to the soft-fp emulation
instead of the fp-bit one, like most of the other ARM ports.

Tested on ARM/VxWorks, OK for all active branches (they are all broken)?


2013-05-31  Douglas B Rupp  

* config.host (arm-wrs-vxworks): Configure with other soft float.




OK everywhere.

R.




Re: [PATCH] ARMv6-M MI thunk fix

2013-06-06 Thread Richard Earnshaw

On 06/06/13 16:43, Cesar Philippidis wrote:

This patch addresses the following FAILs on armv6-m:

   FAIL: g++.sum:g++.old-deja/g++.jason/thunk2.C -std=gnu++11 execution test
   FAIL: g++.sum:g++.old-deja/g++.jason/thunk2.C -std=gnu++98 execution test

The source of the problem is the use of ARM thunk offsets for Thumb1.
This test is using multiple inheritance, and that triggered the
problem.

I tested this patch with the default ARM and THUMB multilibs in
addition to -march=armv6-m.

OK for trunk?

Cesar


2013-06-06  Julian Brown  
Cesar Philippidis  

gcc/
* config/arm/arm.c (arm_output_mi_thunk): Fix offset for
TARGET_THUMB1_ONLY.


ARM-fix-mi-thunks-TARGET_THUMB1_ONLY.patch


Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 405523)
+++ gcc/config/arm/arm.c(revision 405524)
@@ -23140,7 +23140,11 @@
{
  /* Output ".word .LTHUNKn-7-.LTHUNKPCn".  */
  rtx tem = XEXP (DECL_RTL (function), 0);
- tem = gen_rtx_PLUS (GET_MODE (tem), tem, GEN_INT (-7));
+ /* For TARGET_THUMB1_ONLY the thunk is in Thumb mode, so the PC
+pipeline offset is four rather than eight.  Adjust the offset
+accordingly.  */
+ tem = gen_rtx_PLUS (GET_MODE (tem), tem,
+ GEN_INT (TARGET_THUMB1_ONLY ? -3 : -7));
  tem = gen_rtx_MINUS (GET_MODE (tem),
   tem,
   gen_rtx_SYMBOL_REF (Pmode,



The pipeline offset is 4 for Thumb2 as well.  So at the very least you 
need to explain why your change doesn't apply then as well.


R.
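
The arithmetic behind the two constants can be sketched as follows. A read of PC yields the instruction address plus 8 in ARM state and plus 4 in Thumb state. The model below writes the literal's addend as 1 minus that offset, purely because this reproduces the -7 and -3 in the patch. Reading the leading 1 as the Thumb state bit of the target address is an assumption, and all names are hypothetical.

```c
#include <assert.h>

enum toy_isa { ISA_ARM, ISA_THUMB };

/* How far ahead of the current instruction a read of PC appears to be.  */
int
pc_read_offset (enum toy_isa isa)
{
  return isa == ISA_ARM ? 8 : 4;
}

/* Constant added to the target symbol in the thunk's literal word,
   modelled as 1 - pc_read_offset so that it reproduces the -7 and -3
   used in the patch.  */
int
thunk_literal_addend (enum toy_isa isa_of_pc_read)
{
  int off = pc_read_offset (isa_of_pc_read);

  assert (off == 4 || off == 8);
  return 1 - off;
}
```

Under this model the addend depends only on the state in which the PC-relative add executes, so any thunk whose add runs in Thumb state (Thumb-1 or Thumb-2) would get -3.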



Re: [PATCH, rs6000] power8 patches, patch #4 (revised), new power8 builtins

2013-06-06 Thread David Edelsohn
On Wed, Jun 5, 2013 at 12:13 PM, Michael Meissner
 wrote:
> On Wed, Jun 05, 2013 at 10:28:02AM -0400, David Edelsohn wrote:
>> +;; The canonical form is to have the negated elment first, so we need to
>> +;; reverse arguments.
>>
>> Please fix the typo in the comment: "element".
>
> Ok.  I need to proof-read the patches before sending them out.
>
>> +;; Like VSX_L, but don't support TImode for doing logical instructions in
>> +;; 32-bit
>> +(define_mode_iterator VSX_L2 [V16QI V8HI V4SI V2DI V4SF V2DF])
>> +
>>  ;; Iterator for memory move.  Handle TImode specially to allow
>>  ;; it to use gprs as well as vsx registers.
>>  (define_mode_iterator VSX_M [V16QI V8HI V4SI V2DI V4SF V2DF])
>>
>> +(define_mode_iterator VSX_M2 [V16QI
>> +  V8HI
>> +  V4SI
>> +  V2DI
>> +  V4SF
>> +  V2DF
>> +  (TI"TARGET_VSX_TIMODE")])
>>
>> The patch adds new iterators VSX_L2 and VSX_M2.  The original
>> ChangeLog only mentioned M2 and the new ChangeLog only mentions L2.
>> What's going on?
>
> I thought I had deleted VSX_M2 from this patch.  It will be needed in patch #8
> for the fusion peephole.  The difference is VSX_L2 avoids TImode altogether,
> and was used by the logical ops to prevent TImode operations in VSX registers
> in 32-bit.
>
> The problem is unless we have expanders/splitters for logical DImode, the
> compiler when it wants to do a logical DImode operation says, aha I have a
> TImode operation, and then it converts the DImode value to TImode, does the
> operation (which in turn may mean transfer between GPR and VSX registers).
>
> I can add splitters and such for 32-bit DImode to prevent this, but I don't
> know if you want me to do it in the context of this patch, or do it as a later
> patch.

Okay, the revised patch #4 is okay with the typos fixed and either the
ChangeLog or the patch adjusted for iterators VSX_L2 and VSX_M2 -- the
ChangeLog and patch need to match.

But I view this as a preliminary step.  The logical instructions need
an iterator and TImode needs to be cleaned up on 32 bit.

Thanks, David


[c++-concepts] code review

2013-06-06 Thread Jason Merrill
Hi, I'm finally going through the current code on the branch, sorry for 
the delay.



* gcc/system.h (cstdlib): Include <cstdlib> to avoid poisoned
declaration errors.


Poisoned declarations of what?  This seems redundant with the #include 
 just below.



+  /* Concepts-related keywords */
+  { "assume",  RID_ASSUME, D_CXXONLY | D_CXX0X | D_CXXWARN },
+  { "axiom",   RID_AXIOM,  D_CXXONLY | D_CXX0X | D_CXXWARN },
+  { "concept", RID_CONCEPT,D_CXXONLY | D_CXX0X | D_CXXWARN },
+  { "forall",  RID_FORALL, D_CXXONLY | D_CXX0X | D_CXXWARN },
+  { "requires",RID_REQUIRES,   D_CXXONLY | D_CXX0X | D_CXXWARN },


I don't see anything that limits these keywords to when concepts are 
enabled.  You probably want to add an additional mask that applies to these.



+; Activate C++ concepts support.
+Variable
+bool flag_concepts


You don't need to declare this separately.


Components for process constraints and evaluating constraints.


Should that be "processing"?


+// TODO: Simply assinging boolean_type_node to the result type of the expression


"assigning"


+// reduced terms in the constraints languaage. Returns NULL_TREE if either A or


"language"


+// a constexpr, nullary function teplate whose result can be converted


"template"


+  // A constraint is declared constexpr


Needs a period.


+// This function is not called for abitrary call expressions. In particul,


"particular"


+static tree
+resolve_constraint_check (tree ovl, tree args)


This function seems to be trying to model a subset of overload 
resolution, which seems fragile to me; better to use the actual overload 
resolution code to decide which function the constraint expression 
calls, or at least resolve_nondeduced_context which handles SFINAE.



+case CAST_EXPR:
+  return reduce_node (TREE_VALUE (TREE_OPERAND (t, 0)));


Are we assuming that constraint expressions can't involve objects of 
literal class type?



+// If T is a call to a constraint instantiate it's definition and


"its"


+  tree c = finish_call_expr (t, &args, true, false, 0);
+  error ("invalid requirement");
+  inform (input_location, "did you mean %qE", c);


For both of these diagnostics, let's use EXPR_LOC_OR_HERE (t) as the 
location.



+// Reduce the requirement T into a logical formula written in terms of
+// atomic propositions.
+tree
+reduce_requirements (tree reqs)


s/T/REQS/


 struct GTY(()) tree_template_info {
   struct tree_common common;
+  tree constraint;
   vec *typedefs_needing_access_checking;
 };


Why do we need constraint information in template_info?  I suppose this 
is the issue you raised in your mail last month:



In general constraints are directly associated with a template decl. For 
example:

template
  class complex;

The Arithmetic constraint is associated with the template decl. However, this 
doesn't seem to work with partial specializations:

template
  struct complex { ... };

I had expected there to be a template decl associated with underlying class, 
but after print-bombing most of the instantiation, lookup, and specialization 
processing routines, I couldn't find that one was ever created for the type 
decl.


This does seem like a shortcoming, that also led to the typedefs vec 
getting stuck into the template_info inappropriately.  I guess we should 
start building TEMPLATE_DECLs for partial specializations.



+/* Constraint information for a C++ declaration. This includes the
+   requirements (as a constant expression) and the decomposed assumptions
+   and conclusions. The assumptions and conclusions are cached for the
+   purposes of overlaod resolution and diagnostics. */
+struct GTY(()) tree_constraint_info {
+  struct tree_base base;
+  tree spelling;
+  tree requirements;
+  tree assumptions;
+};


I'm confused by the relationship between the comment and the field names 
here.  Where do the conclusions come in?  Is "requirements (as a 
constant expression)" in the spelling or requirements field?


Also, "overload".


+constraint_info_p (tree t)
+template_info_p (tree t)


Let's use check_* rather than *_p for these, too.


+// NODE must be a lang-decl.


Let's say "NODE must have DECL_LANG_SPECIFIC" to avoid confusion with 
struct lang_decl.



+  error ("concept %q#D declared with function arguments", fn);


s/arguments/parameters/.  Some of the gcc internals get this distinction 
wrong; but we don't need to expose that in diagnostics...



+  // If the concept declaration specifier was found, check
+  // that the declaration satisfies the necessary requirements.
+  if (inlinep & 4)
+{
+  DECL_DECLARED_CONCEPT_P (decl) = true;
+  if (!check_concept_fn (decl))
+return NULL_TREE;
+}


I think I'd rather deal with an invalid concept by not marking it as a 
concept, but still declaring it as a constexpr function.



+  flag_concepts = true;


This is redundant since c.opt specifies that it defaults to true.


+// Retur

[PATCH] ARMv6-M MI thunk fix

2013-06-06 Thread Cesar Philippidis
This patch addresses the following FAILs on armv6-m:

  FAIL: g++.sum:g++.old-deja/g++.jason/thunk2.C -std=gnu++11 execution test
  FAIL: g++.sum:g++.old-deja/g++.jason/thunk2.C -std=gnu++98 execution test

The source of the problem is the use of ARM thunk offsets for Thumb1. 
This test is using multiple inheritance, and that triggered the
problem. 
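
For readers wondering why the constant moves by exactly four, here is a minimal sketch of the arithmetic. It assumes the thunk adds the emitted literal word to the value read from pc, and that the leftover +1 is the Thumb bit of the target address; both are assumptions for illustration, and the addresses are made up.

```cpp
#include <cstdint>

// Address the thunk ends up with: the literal emitted as
// ".word target + addend - pc_label", plus what pc reads as at the add
// instruction (the label address plus the pipeline offset).
constexpr std::uint32_t
thunk_target (std::uint32_t target, std::uint32_t pc_label,
              std::int32_t addend, std::uint32_t pc_read_offset)
{
  return (target + static_cast<std::uint32_t> (addend) - pc_label)
         + (pc_label + pc_read_offset);
}

// Hypothetical addresses, chosen only for illustration.
constexpr std::uint32_t target = 0x8000;
constexpr std::uint32_t label = 0x9000;

// pc reads as label + 8: an addend of -7 yields target + 1.
static_assert (thunk_target (target, label, -7, 8) == target + 1, "");
// pc reads as label + 4 (the TARGET_THUMB1_ONLY case): -3 yields target + 1.
static_assert (thunk_target (target, label, -3, 4) == target + 1, "");
// Keeping -7 with a four-byte pc offset lands four bytes short.
static_assert (thunk_target (target, label, -7, 4) == target - 3, "");
```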

I tested this patch with the default ARM and THUMB multilibs in 
addition to -march=armv6-m.

OK for trunk?

Cesar


2013-06-06  Julian Brown  
Cesar Philippidis  

gcc/
* config/arm/arm.c (arm_output_mi_thunk): Fix offset for
TARGET_THUMB1_ONLY.
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 405523)
+++ gcc/config/arm/arm.c(revision 405524)
@@ -23140,7 +23140,11 @@
{
  /* Output ".word .LTHUNKn-7-.LTHUNKPCn".  */
  rtx tem = XEXP (DECL_RTL (function), 0);
- tem = gen_rtx_PLUS (GET_MODE (tem), tem, GEN_INT (-7));
+ /* For TARGET_THUMB1_ONLY the thunk is in Thumb mode, so the PC
+pipeline offset is four rather than eight.  Adjust the offset
+accordingly.  */
+ tem = gen_rtx_PLUS (GET_MODE (tem), tem,
+ GEN_INT (TARGET_THUMB1_ONLY ? -3 : -7));
  tem = gen_rtx_MINUS (GET_MODE (tem),
   tem,
   gen_rtx_SYMBOL_REF (Pmode,


Re: [google/gcc-4_8] Add new libitm failures to x86_64-grtev3-linux-gnu.xfail

2013-06-06 Thread Diego Novillo

On 2013-06-06 11:33 , Simon Baldwin wrote:

Add new libitm failures to x86_64-grtev3-linux-gnu.xfail.

Okay for google/gcc-4_8?  google/main?  Thanks.


OK.


Diego.


[google/gcc-4_8] Add new libitm failures to x86_64-grtev3-linux-gnu.xfail

2013-06-06 Thread Simon Baldwin
Add new libitm failures to x86_64-grtev3-linux-gnu.xfail.

Okay for google/gcc-4_8?  google/main?  Thanks.


Index: contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail
===
--- contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail  (revision 199736)
+++ contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail  (working copy)
@@ -66,6 +66,7 @@ flaky|FAIL: libmudflap.c/fail37-frag.c (
 FAIL: libatomic.c/atomic-generic.c (test for excess errors)
 FAIL: libatomic.c/generic-2.c (test for excess errors)
 FAIL: libitm.c++/dropref.C (test for excess errors)
+FAIL: libitm.c++/eh-1.C (test for excess errors)
 FAIL: libitm.c++/throwdown.C (test for excess errors)
 FAIL: libitm.c/cancel.c (test for excess errors)
 FAIL: libitm.c/clone-1.c (test for excess errors)
@@ -77,10 +78,12 @@ FAIL: libitm.c/notx.c (test for excess e
 FAIL: libitm.c/reentrant.c (test for excess errors)
 FAIL: libitm.c/simple-1.c (test for excess errors)
 FAIL: libitm.c/simple-2.c (test for excess errors)
+FAIL: libitm.c/stackundo.c (test for excess errors)
 FAIL: libitm.c/txrelease.c (test for excess errors)
 UNRESOLVED: libatomic.c/atomic-generic.c compilation failed to produce executable
 UNRESOLVED: libatomic.c/generic-2.c compilation failed to produce executable
 UNRESOLVED: libitm.c++/dropref.C compilation failed to produce executable
+UNRESOLVED: libitm.c++/eh-1.C compilation failed to produce executable
 UNRESOLVED: libitm.c/cancel.c compilation failed to produce executable
 UNRESOLVED: libitm.c/clone-1.c compilation failed to produce executable
 UNRESOLVED: libitm.c/dropref-2.c compilation failed to produce executable
@@ -91,6 +94,7 @@ UNRESOLVED: libitm.c/notx.c compilation
 UNRESOLVED: libitm.c/reentrant.c compilation failed to produce executable
 UNRESOLVED: libitm.c/simple-1.c compilation failed to produce executable
 UNRESOLVED: libitm.c/simple-2.c compilation failed to produce executable
+UNRESOLVED: libitm.c/stackundo.c compilation failed to produce executable
 UNRESOLVED: libitm.c/txrelease.c compilation failed to produce executable
 
 # These failures are likely due to misconfiguration during the


Re: Remove dead assignments to static local variables

2013-06-06 Thread Richard Biener
On Thu, Jun 6, 2013 at 5:10 PM, Bernd Schmidt  wrote:
> On 06/06/2013 04:52 PM, Richard Biener wrote:
>> +  /* We cannot optimize away a static used in multiple functions (as
>> +might happen in C++).  */
>> +  && !DECL_NONLOCAL(var)
>>
>> it may also happen trivially with inlining.  Which means a local pass can
>> never "remove" vars safely.
>
> This is why the pass isn't run if cgraph_function_possibly_inlined_p.
> Tested by remove-local-statics-14b.c.

I see (how ugly ;)).  Does that cover versioning via IPA CP as well for example?

>> In theory we have IPA reference which tries to figure out whether a local
>> static is read and/or written to (and from where).  It's of course quite
>> early analysis where FRE may not yet have optimized out all reads.
>>
>> But the trivial dead local static store elimination would simply eliminate
>> all write-only and !TREE_ADDRESSABLE vars (and statements storing
>> to it).
>>
>> For some reason this must be not enough so you write that local
>> analysis code.
>>
>> Thus - I'm asking you to double-check a trivial implementation using
>> the IPA reference result and double-check the issue with inlining
>> introducing out-of-current-function uses.
>
> I'm not sure what you're asking for here. The IPA passes seem to run
> much before PRE, and if you need an example why that's too early, try
> the remove-local-statics-7.c testcase.

Yes, that requires PRE.  I'm sure there are cases your pass doesn't
get either.  I was asking whether the particular benchmark would be
optimized by the simple IPA reference method.

Richard.

>
> Bernd
>


Re: [GOOGLE] More strict checking for call args

2013-06-06 Thread Dehao Chen
Hi, Martin,

Yes, your patch can fix my case. Thanks a lot for the fix.

With the fix, value profiling will still promote the wrong indirect
call target. It will not be inlined, but it does result in an
additional check. How about, in check_ic_target, after calling
gimple_check_call_matching_types, we also check whether the number of
args matches the number of params in target->symbol.decl?

Thanks,
Dehao


On Thu, Jun 6, 2013 at 7:11 AM, Martin Jambor  wrote:
>
> Hi,
>
> On Tue, Jun 04, 2013 at 05:19:02PM -0700, Dehao Chen wrote:
> > attached is a testcase that would cause problem when source has changed:
> >
> > $ g++ test.cc -O2 -fprofile-generate -DOLD
> > $ ./a.out
> > $ g++ test.cc -O2 -fprofile-use
> > test.cc:34:1: internal compiler error: in operator[], at vec.h:815
> >  }
> >  ^
> > 0x512740 vec::operator[](unsigned int)
> > ../../gcc/vec.h:815
> > 0x512740 vec::operator[](unsigned int)
> > ../../gcc/vec.h:1244
> > 0xf24464 vec::operator[](unsigned int)
> > ../../gcc/vec.h:815
> > 0xf24464 vec::operator[](unsigned int)
> > ../../gcc/vec.h:1244
> > 0xf24464 ipa_get_indirect_edge_target_1
> > ../../gcc/ipa-cp.c:1535
> > 0x971b9a estimate_edge_devirt_benefit
> > ../../gcc/ipa-inline-analysis.c:2757
>
> Hm, this seems rather like an omission in ipa_get_indirect_edge_target_1.
> Since it is called also from inlining, we can have parameter count
> mismatches... and in fact in non-virtual paths of that function we do
> check that we don't.  Because all callers have to pass known_vals
> describing all formal parameters of the inline tree root, we should
> apply the fix below (I've only just started running a bootstrap and
> testsuite on x86_64, though).
>
> OTOH, while I understand that FDO can change inlining sufficiently so
> that this error occurs, IMHO this should not be caused by outdated
> profiles but there is somewhere a parameter mismatch in the source.
>
> Dehao, can you please check that this patch helps?
>
> Richi, if it does and the patch passes bootstrap and tests, is it OK
> for trunk and 4.8 branch?
>
> Thanks and sorry for the trouble,
>
> Martin
>
>
> 2013-06-06  Martin Jambor  
>
> * ipa-cp.c (ipa_get_indirect_edge_target_1): Check that param_index is
> within bounds at the beginning of the function.
>
> Index: src/gcc/ipa-cp.c
> ===
> --- src.orig/gcc/ipa-cp.c
> +++ src/gcc/ipa-cp.c
> @@ -1481,7 +1481,8 @@ ipa_get_indirect_edge_target_1 (struct c
>tree otr_type;
>tree t;
>
> -  if (param_index == -1)
> +  if (param_index == -1
> +  || known_vals.length () <= (unsigned int) param_index)
>  return NULL_TREE;
>
>if (!ie->indirect_info->polymorphic)
> @@ -1516,8 +1517,7 @@ ipa_get_indirect_edge_target_1 (struct c
> t = NULL;
> }
>else
> -   t = (known_vals.length () > (unsigned int) param_index
> -? known_vals[param_index] : NULL);
> +   t = NULL;
>
>if (t &&
>   TREE_CODE (t) == ADDR_EXPR


[Patch, Fortran] PR57535 - Fix class-array handling for function result variables

2013-06-06 Thread Tobias Burnus
The attached test case failed with an ICE for function result variables 
as in that case the function decl was used.


Built and regtested on x86-64-gnu-linux.
OK for the trunk?


Other pending patches:
* 4.8/4.9 regression with defined assignment: 
http://gcc.gnu.org/ml/fortran/2013-06/msg00047.html
* Finalize nonallocatables with intent(out): 
http://gcc.gnu.org/ml/fortran/2013-06/msg00048.html


Tobias
2013-06-06  Tobias Burnus  

	PR fortran/57535
	* trans-array.c (build_class_array_ref): Fix ICE for
	function result variables.

2013-06-06  Tobias Burnus  

	PR fortran/57535
	* gfortran.dg/class_array_18.f90: New.

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 89f26d7..a4321cc 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -2991,7 +2991,13 @@ build_class_array_ref (gfc_se *se, tree base, tree index)
   if (ts == NULL)
 return false;
 
-  if (class_ref == NULL)
+  if (class_ref == NULL && expr->symtree->n.sym->attr.function
+  && expr->symtree->n.sym == expr->symtree->n.sym->result)
+{
+  gcc_assert (expr->symtree->n.sym->backend_decl == current_function_decl);
+  decl = gfc_get_fake_result_decl (expr->symtree->n.sym, 0);
+}
+  else if (class_ref == NULL)
 decl = expr->symtree->n.sym->backend_decl;
   else
 {
--- /dev/null	2013-06-06 09:52:08.544104880 +0200
+++ gcc/gcc/testsuite/gfortran.dg/class_array_18.f90	2013-06-06 16:57:46.838820509 +0200
@@ -0,0 +1,16 @@
+! { dg-do compile }
+!
+! PR fortran/57535
+!
+program test
+  implicit none
+  type t
+integer :: ii = 55
+  end type t
+contains
+  function func2()
+class(t), allocatable :: func2(:)
+allocate(func2(3))
+func2%ii = [111,222,333]
+  end function func2
+end program test


Re: Remove dead assignments to static local variables

2013-06-06 Thread Bernd Schmidt
On 06/06/2013 04:52 PM, Richard Biener wrote:
> +  /* We cannot optimize away a static used in multiple functions (as
> +might happen in C++).  */
> +  && !DECL_NONLOCAL(var)
> 
> it may also happen trivially with inlining.  Which means a local pass can
> never "remove" vars safely.

This is why the pass isn't run if cgraph_function_possibly_inlined_p.
Tested by remove-local-statics-14b.c.

> In theory we have IPA reference which tries to figure out whether a local
> static is read and/or written to (and from where).  It's of course quite
> early analysis where FRE may not yet have optimized out all reads.
> 
> But the trivial dead local static store elimination would simply eliminate
> all write-only and !TREE_ADDRESSABLE vars (and statements storing
> to it).
> 
> For some reason this must be not enough so you write that local
> analysis code.
> 
> Thus - I'm asking you to double-check a trivial implementation using
> the IPA reference result and double-check the issue with inlining
> introducing out-of-current-function uses.

I'm not sure what you're asking for here. The IPA passes seem to run
much before PRE, and if you need an example why that's too early, try
the remove-local-statics-7.c testcase.


Bernd



Re: [PATCH][ARM][5/n] Partial IT block deprecation in ARMv8 AArch32 - load/store multiple

2013-06-06 Thread Richard Earnshaw

On 06/06/13 14:36, Kyrylo Tkachov wrote:

Hi all,

This patch updates the fixed-point math patterns in arm-fixed.md to
conform to new IT block rules in ARMv8. The add/sub instructions can be
placed in an IT block if they use the low registers.
The other more exotic variants (ssub, uqadd etc) do not have 16-bit
encodings and can therefore not be conditionalised by the new rules.

Tested together with the other patches in the series by bootstrap on a
Cortex-A15 and regtest arm-none-eabi on qemu and model.

Ok for trunk?


Thanks,
Kyrill


2013-06-06  Kyrylo Tkachov  

* config/arm/arm-fixed.md (add3,usadd3,ssadd3,
sub3, ussub3, sssub3, arm_ssatsihi_shift,
arm_usatsihi):
Adjust alternatives for arm_restrict_it.




 (define_insn "add3"
-  [(set (match_operand:FIXED 0 "s_register_operand" "=r")
-   (plus:FIXED (match_operand:FIXED 1 "s_register_operand" "r")
-   (match_operand:FIXED 2 "s_register_operand" "r")))]
+  [(set (match_operand:FIXED 0 "s_register_operand" "=r,l")
+   (plus:FIXED (match_operand:FIXED 1 "s_register_operand" "r,l")
+   (match_operand:FIXED 2 "s_register_operand" "r,l")))]

It would probably be better to put the 'l' variant first. This should 
encourage register allocation to prefer low registers and that might 
lead to other optimizations later on.  Similarly for sub3.


OK with that change.

R.




Re: [libstdc++-v3][C++14] Implement N3654 - Quoted Strings

2013-06-06 Thread Ed Smith-Rowland

On 06/05/2013 04:01 PM, Jonathan Wakely wrote:

On 5 June 2013 20:18, Ed Smith-Rowland wrote:

Greetings,
This patch implements quoted string manipulators for C++14.

27.7.6 - Quoted manipulators[quoted.manip].

The idea is to allow round trip insert and extract of strings with spaces.

   std::stringstream ss;
   std::string original = "thing1  thing1";
   std::string round_trip;
   ss << std::quoted(original);
   ss >> std::quoted(round_trip);
   assert( original == round_trip );

Builds and tests clean on x86-64-linux.
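
A self-contained version of the round trip above, using std::quoted as it appears in C++14 <iomanip>. The helper names are made up for illustration; the third helper shows what a plain extraction does with the same text.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Write S through std::quoted, then read it back the same way.
std::string
quoted_round_trip (const std::string &s)
{
  std::stringstream ss;
  std::string result;
  ss << std::quoted (s);
  ss >> std::quoted (result);
  return result;
}

// The text the stream holds after insertion, delimiters and escapes included.
std::string
quoted_form (const std::string &s)
{
  std::ostringstream os;
  os << std::quoted (s);
  return os.str ();
}

// For contrast: plain insertion and extraction stops at the first space.
std::string
plain_round_trip (const std::string &s)
{
  std::stringstream ss;
  std::string result;
  ss << s;
  ss >> result;
  return result;
}
```

Embedded delimiters and escape characters are prefixed with the escape character on output and unescaped again on input, so strings containing quotes survive the trip as well.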

As I suggested for your literals patch, couldn't the test for:
#if __cplusplus > 201103L
go inside the existing one?

i.e.

#if __cplusplus >= 201103L
[...]
#if __cplusplus > 201103L
[...]
#endif
#endif

Certainly.  I forgot that in the last literals patch.  I'll fix that 
after I finish this one. (I just noticed junk comments in the testcases 
for literals also).



_Quoted_string appears to do two copies of the string, one for the
constructor argument and one for the member variable, do they
definitely get elided?

It looks that way.  But all uses of the template parameter String are either 
references or pointers, so these operations should be efficient.

_Quoted_string should be used as a non-owning string thing.


The members of _Quoted_string should be named _M_xxx not __xxx, to
follow the coding style guidelines.

Done.


What is __delim2 for?

What if the first extraction in the operator>> fails, is doing
__is.unget() the right thing to do?

Thanks. I'll return with __is rather than attempting to continue reading.


You could simplify the quoted() overloads by using auto return type
deduction, is it an intentional choice not to use that?

For some reason I forgot about auto return type in C++14.  It sure 
cleans things up nicely.  Done.

Rebuilt and retested on x86_64


2013-06-05  Ed Smith-Rowland  <3dw...@verizon.net>

Implement N3654 - Quoted Strings Library Proposal
* include/std/iomanip: Add quoted(String, Char delim, Char escape)
manipulators and supporting machinery in c++1y mode.
* testsuite/27_io/manipulators/standard/char/quoted.cc: New.
* testsuite/27_io/manipulators/standard/wchar_t/quoted.cc: New.
Index: include/std/iomanip
===
--- include/std/iomanip (revision 199730)
+++ include/std/iomanip (working copy)
@@ -334,8 +334,157 @@
   return __os; 
 }
 
-#endif
+#if __cplusplus > 201103L
 
+  namespace __detail {
+
+/**
+ * @brief Struct for delimited strings.
+ *The left and right delimiters can be different.
+ */
+template
+  struct _Quoted_string
+  {
+   _Quoted_string(_String __str, _CharT __del, _CharT __esc)
+   : _M_string(__str), _M_delim{__del}, _M_escape{__esc}
+   { }
+
+   _Quoted_string&
+   operator=(_Quoted_string&) = delete;
+
+   _String _M_string;
+   _CharT _M_delim;
+   _CharT _M_escape;
+  };
+
+/**
+ * @brief Inserter for delimited strings.
+ *The left and right delimiters can be different.
+ */
+template
+  auto&
+  operator<<(std::basic_ostream<_CharT, _Traits>& __os,
+const _Quoted_string& __str)
+  {
+   __os << __str._M_delim;
+   for (const _CharT* __c = __str._M_string; *__c; ++__c)
+ {
+   if (*__c == __str._M_delim || *__c == __str._M_escape)
+ __os << __str._M_escape;
+   __os << *__c;
+ }
+   __os << __str._M_delim;
+
+   return __os;
+  }
+
+/**
+ * @brief Inserter for delimited strings.
+ *The left and right delimiters can be different.
+ */
+template
+  auto&
+  operator<<(std::basic_ostream<_CharT, _Traits>& __os,
+const _Quoted_string<_String, _CharT>& __str)
+  {
+   __os << __str._M_delim;
+   for (auto& __c : __str._M_string)
+ {
+   if (__c == __str._M_delim || __c == __str._M_escape)
+ __os << __str._M_escape;
+   __os << __c;
+ }
+   __os << __str._M_delim;
+
+   return __os;
+  }
+
+/**
+ * @brief Extractor for delimited strings.
+ *The left and right delimiters can be different.
+ */
+template
+  auto&
+  operator>>(std::basic_istream<_CharT, _Traits>& __is,
+const _Quoted_string&,
+ _CharT>& __str)
+  {
+   __str._M_string.clear();
+
+   _CharT __c;
+   __is >> __c;
+   if (!__is.good())
+ return __is;
+   if (__c != __str._M_delim)
+ {
+   __is.unget();
+   __is >> __str._M_string;
+   return __is;
+ }
+   std::ios_base::fmtflags __flags
+ = __is.flags(__is.flags() & ~std::ios_base::skipws);
+   do
+ {
+   __is >> __c;
+   if (!__is.good())
+ break;
+   if (__c == __str._M_escape)
+ 

Re: Remove dead assignments to static local variables

2013-06-06 Thread Richard Biener
On Thu, Jun 6, 2013 at 3:42 PM, Bernd Schmidt  wrote:
> There's a well-known benchmark which uselessly likes to declare local
> variables as static. There exist at least two implementations to demote
> these to normal register variables. See the discussion thread here:
>   http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00982.html
> These days, however, we can skip most of the work as pointed out by
> Andrew Pinski:
>   http://gcc.gnu.org/ml/gcc-patches/2008-07/msg01035.html
> PRE usually manages to eliminate all references to the static variable
> except a final assignment. The only thing that's left to do is to
> enhance DSE to recognize these as dead. So, the following patch is a
> cut-down version of CodeSourcery's approach, originally written by
> Nathan Froyd, modified to do exactly that.
>
> Bootstrapped and tested on x86_64-linux, all languages except Ada. OK?

Just a few quick questions:

+  /* We cannot optimize away a static used in multiple functions (as
+might happen in C++).  */
+  && !DECL_NONLOCAL(var)

it may also happen trivially with inlining.  Which means a local pass can never
"remove" vars safely.

In theory we have IPA reference which tries to figure out whether a local static
is read and/or written to (and from where).  It's of course quite early analysis
where FRE may not yet have optimized out all reads.

But the trivial dead local static store elimination would simply eliminate
all write-only and !TREE_ADDRESSABLE vars (and statements storing
to it).
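
As a toy model of that "trivial" elimination (not GCC's actual IPA reference data structures; the struct and function names are hypothetical), the decision reduces to a filter over per-variable reference summaries:

```cpp
#include <string>
#include <vector>

// Toy summary of how one local static is referenced (hypothetical type).
struct static_var_refs
{
  std::string name;
  bool read;            // some statement loads from the variable
  bool written;         // some statement stores to the variable
  bool address_taken;   // plays the role of TREE_ADDRESSABLE here
};

// Stores to VAR are dead when nothing reads VAR and its address never
// escapes.
bool
stores_are_dead (const static_var_refs &var)
{
  return var.written && !var.read && !var.address_taken;
}

// Names of the variables whose stores could all be removed.
std::vector<std::string>
removable_statics (const std::vector<static_var_refs> &vars)
{
  std::vector<std::string> out;
  for (const static_var_refs &var : vars)
    if (stores_are_dead (var))
      out.push_back (var.name);
  return out;
}
```

The catch is that the summaries must still be accurate when the filter runs: a read that a later pass would have removed keeps the variable alive in this model.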

For some reason this must not be enough, so you wrote that local
analysis code.

Thus - I'm asking you to double-check a trivial implementation using
the IPA reference result and double-check the issue with inlining
introducing out-of-current-function uses.

Thanks,
Richard.

>
> Bernd


[Patch, Fortran, committed] PR57542 - Fix regression with the final call

2013-06-06 Thread Tobias Burnus
As the test case shows, it can happen (-fcheck=bounds) that se.pre has a 
value. Hence, the patch removes se.pre from the assert and adds the pre 
block to the block.


Built, regtested and committed (Rev. 199736) on x86-64-gnu-linux.

Tobias
2013-06-06  Tobias Burnus  

	PR fortran/57542
	* trans.c (gfc_build_final_call): Add se.pre to the block
	and modify the assert.

2013-06-06  Tobias Burnus  

	PR fortran/57542
	* gfortran.dg/finalize_16.f90: New.

diff --git a/gcc/fortran/trans.c b/gcc/fortran/trans.c
index a1ea300..dd608b7 100644
--- a/gcc/fortran/trans.c
+++ b/gcc/fortran/trans.c
@@ -895,7 +895,8 @@ gfc_build_final_call (gfc_typespec ts, gfc_expr *final_wrapper, gfc_expr *var,
   gcc_assert (class_size);
   gfc_init_se (&se, NULL);
   gfc_conv_expr (&se, class_size);
-  gcc_assert (se.pre.head == NULL_TREE && se.post.head == NULL_TREE);
+  gfc_add_block_to_block (&block, &se.pre);
+  gcc_assert (se.post.head == NULL_TREE);
   size = se.expr;
 
   array_expr = gfc_copy_expr (var);
@@ -912,7 +913,8 @@ gfc_build_final_call (gfc_typespec ts, gfc_expr *final_wrapper, gfc_expr *var,
 	{
 	  gfc_add_data_component (array_expr);
 	  gfc_conv_expr (&se, array_expr);
-	  gcc_assert (se.pre.head == NULL_TREE && se.post.head == NULL_TREE);
+	  gfc_add_block_to_block (&block, &se.pre);
+	  gcc_assert (se.post.head == NULL_TREE);
 	  array = se.expr;
 	  if (TREE_CODE (array) == ADDR_EXPR
 	  && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (array, 0
--- /dev/null	2013-06-06 09:52:08.544104880 +0200
+++ gcc/gcc/testsuite/gfortran.dg/finalize_16.f90	2013-06-06 16:14:13.478988916 +0200
@@ -0,0 +1,30 @@
+! { dg-do compile }
+! { dg-options "-fcheck=all" }
+!
+! PR fortran/57542
+!
+module type_mod
+  type inner
+  end type inner
+
+  type outer 
+class(inner), allocatable :: item
+  end type outer
+
+  type container 
+class(outer), allocatable :: item
+  end type container
+
+  type maintype
+type(container), allocatable :: v(:)
+  end type maintype
+
+end module type_mod
+
+subroutine testfinal(var)
+  use type_mod
+  type(maintype), intent(inout) :: var
+  ! A real code would obviously check
+  ! this is really allocated
+  deallocate(var%v(1)%item%item)
+end subroutine testfinal


Re: [GOOGLE] More strict checking for call args

2013-06-06 Thread Richard Biener
On Thu, Jun 6, 2013 at 4:11 PM, Martin Jambor  wrote:
> Hi,
>
> On Tue, Jun 04, 2013 at 05:19:02PM -0700, Dehao Chen wrote:
>> attached is a testcase that would cause problem when source has changed:
>>
>> $ g++ test.cc -O2 -fprofile-generate -DOLD
>> $ ./a.out
>> $ g++ test.cc -O2 -fprofile-use
>> test.cc:34:1: internal compiler error: in operator[], at vec.h:815
>>  }
>>  ^
>> 0x512740 vec::operator[](unsigned int)
>> ../../gcc/vec.h:815
>> 0x512740 vec::operator[](unsigned int)
>> ../../gcc/vec.h:1244
>> 0xf24464 vec::operator[](unsigned int)
>> ../../gcc/vec.h:815
>> 0xf24464 vec::operator[](unsigned int)
>> ../../gcc/vec.h:1244
>> 0xf24464 ipa_get_indirect_edge_target_1
>> ../../gcc/ipa-cp.c:1535
>> 0x971b9a estimate_edge_devirt_benefit
>> ../../gcc/ipa-inline-analysis.c:2757
>
> Hm, this seems rather like an omission in ipa_get_indirect_edge_target_1.
> Since it is called also from inlining, we can have parameter count
> mismatches... and in fact in non-virtual paths of that function we do
> check that we don't.  Because all callers have to pass known_vals
> describing all formal parameters of the inline tree root, we should
> apply the fix below (I've only just started running a bootstrap and
> testsuite on x86_64, though).
>
> OTOH, while I understand that FDO can change inlining sufficiently so
> that this error occurs, IMHO this should not be caused by outdated
> profiles but there is somewhere a parameter mismatch in the source.
>
> Dehao, can you please check that this patch helps?
>
> Richi, if it does and the patch passes bootstrap and tests, is it OK
> for trunk and 4.8 branch?

Yes.

Thanks,
Richard.

> Thanks and sorry for the trouble,
>
> Martin
>
>
> 2013-06-06  Martin Jambor  
>
> * ipa-cp.c (ipa_get_indirect_edge_target_1): Check that param_index is
> within bounds at the beginning of the function.
>
> Index: src/gcc/ipa-cp.c
> ===
> --- src.orig/gcc/ipa-cp.c
> +++ src/gcc/ipa-cp.c
> @@ -1481,7 +1481,8 @@ ipa_get_indirect_edge_target_1 (struct c
>tree otr_type;
>tree t;
>
> -  if (param_index == -1)
> +  if (param_index == -1
> +  || known_vals.length () <= (unsigned int) param_index)
>  return NULL_TREE;
>
>if (!ie->indirect_info->polymorphic)
> @@ -1516,8 +1517,7 @@ ipa_get_indirect_edge_target_1 (struct c
> t = NULL;
> }
>else
> -   t = (known_vals.length () > (unsigned int) param_index
> -? known_vals[param_index] : NULL);
> +   t = NULL;
>
>if (t &&
>   TREE_CODE (t) == ADDR_EXPR


Re: [GOOGLE] More strict checking for call args

2013-06-06 Thread Martin Jambor
Hi,

On Tue, Jun 04, 2013 at 05:19:02PM -0700, Dehao Chen wrote:
> attached is a testcase that would cause problem when source has changed:
> 
> $ g++ test.cc -O2 -fprofile-generate -DOLD
> $ ./a.out
> $ g++ test.cc -O2 -fprofile-use
> test.cc:34:1: internal compiler error: in operator[], at vec.h:815
>  }
>  ^
> 0x512740 vec::operator[](unsigned int)
> ../../gcc/vec.h:815
> 0x512740 vec::operator[](unsigned int)
> ../../gcc/vec.h:1244
> 0xf24464 vec::operator[](unsigned int)
> ../../gcc/vec.h:815
> 0xf24464 vec::operator[](unsigned int)
> ../../gcc/vec.h:1244
> 0xf24464 ipa_get_indirect_edge_target_1
> ../../gcc/ipa-cp.c:1535
> 0x971b9a estimate_edge_devirt_benefit
> ../../gcc/ipa-inline-analysis.c:2757

Hm, this seems rather like an omission in ipa_get_indirect_edge_target_1.
Since it is called also from inlining, we can have parameter count
mismatches... and in fact in non-virtual paths of that function we do
check that we don't.  Because all callers have to pass known_vals
describing all formal parameters of the inline tree root, we should
apply the fix below (I've only just started running a bootstrap and
testsuite on x86_64, though).
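
The shape of the check, reduced to a standalone sketch with std::vector standing in for known_vals. The function name and element type are hypothetical; the point is only that the index is validated once, up front, before any path indexes the vector.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for the lookup: return the known value for parameter
// PARAM_INDEX, or nullptr when the index is -1 or past the end of
// KNOWN_VALS.
const int *
known_value_for_param (const std::vector<const int *> &known_vals,
                       int param_index)
{
  // Reject the "no parameter" marker and any out-of-range index before
  // the vector is touched.
  if (param_index == -1
      || known_vals.size () <= static_cast<std::size_t> (param_index))
    return nullptr;

  return known_vals[static_cast<std::size_t> (param_index)];
}
```

Because the cast to an unsigned type turns any negative index into a very large value, the length comparison alone would also reject -1; the explicit test is kept here to mirror the patch.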

OTOH, while I understand that FDO can change inlining sufficiently so
that this error occurs, IMHO this should not be caused by outdated
profiles but there is somewhere a parameter mismatch in the source.

Dehao, can you please check that this patch helps?

Richi, if it does and the patch passes bootstrap and tests, is it OK
for trunk and 4.8 branch?

Thanks and sorry for the trouble,

Martin


2013-06-06  Martin Jambor  

* ipa-cp.c (ipa_get_indirect_edge_target_1): Check that param_index is
within bounds at the beginning of the function.

Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -1481,7 +1481,8 @@ ipa_get_indirect_edge_target_1 (struct c
   tree otr_type;
   tree t;
 
-  if (param_index == -1)
+  if (param_index == -1
+  || known_vals.length () <= (unsigned int) param_index)
 return NULL_TREE;
 
   if (!ie->indirect_info->polymorphic)
@@ -1516,8 +1517,7 @@ ipa_get_indirect_edge_target_1 (struct c
t = NULL;
}
   else
-   t = (known_vals.length () > (unsigned int) param_index
-? known_vals[param_index] : NULL);
+   t = NULL;
 
   if (t &&
  TREE_CODE (t) == ADDR_EXPR


Re: [patch] Fix parsing bug in validate_patches.py

2013-06-06 Thread Diego Novillo

On 2013-06-05 14:34 , Brooks Moses wrote:


I've tested the adjusted line-stripping parts of the refactored code 
by adding and removing spaces in lines in my xfails file and 
confirming that things are still correctly matched.


Is this refactoring also OK to commit?


Ah, thanks.  That's a better refactoring.

OK to commit.


Diego.


Remove dead assignments to static local variables

2013-06-06 Thread Bernd Schmidt
There's a well-known benchmark which uselessly likes to declare local
variables as static. There exist at least two implementations to demote
these to normal register variables. See the discussion thread here:
  http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00982.html
These days, however, we can skip most of the work as pointed out by
Andrew Pinski:
  http://gcc.gnu.org/ml/gcc-patches/2008-07/msg01035.html
PRE usually manages to eliminate all references to the static variable
except a final assignment. The only thing that's left to do is to
enhance DSE to recognize these as dead. So, the following patch is a
cut-down version of CodeSourcery's approach, originally written by
Nathan Froyd, modified to do exactly that.
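
A small illustration of the pattern in question; the function names are made up and this is not taken from the benchmark or the testcases. The static is overwritten on every call before it is read, so once the read has been replaced by the stored value, the only remaining reference is a store that nothing observes.

```cpp
// 'last' is static, but every call overwrites it before reading it, so
// no value ever survives from one call to the next.
int
scale_static (int x)
{
  static int last;
  last = x * 3;
  return last + 1;
}

// Same result with an ordinary automatic variable, which is what the
// static version behaves like once the dead store is gone.
int
scale_auto (int x)
{
  int last = x * 3;
  return last + 1;
}
```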

Bootstrapped and tested on x86_64-linux, all languages except Ada. OK?


Bernd
commit ce5d3fe1bf7934dd551b7bf091f113f396e15d64
Author: Bernd Schmidt 
Date:   Wed Jun 5 15:04:56 2013 +0200

Extend DSE to remove static local variables that are only set.

Based on an earlier patch by Nathan Froyd and Andrew Stubbs.

	gcc/
	* cgraph.c (cgraph_node): Set ever_was_nested in the node and
	its parent when creating a new node.
	* cgraph.h (struct cgraph_node): New field ever_was_nested.
	* tree-ssa-dse.c: Include "hashtab.h".
	(struct rls_decl_info, struct rls_stmt_info): New.
	(static_variables, defuse_statements, n_statics): New static
	variables.
	(rls_hash_decl_info, rls_eq_decl_info, rls_free_decl_info,
	rls_hash_use_info, rls_eq_use_info, rls_free_use_info, rls_init,
	rls_done, note_var_ref, mark_used, remove_local_statics,
	find_static_nonvolatile_declarations, maybe_remove_stmt): New static
	functions.
	(tree_ssa_dse): Call remove_local_statics if appropriate.
	* Makefile.in (tree-ssa-dse.o): Update dependencies.

	gcc/cp/
	* decl2.c (mark_used): Mark _DECLs as DECL_NONLOCAL if appropriate.

	gcc/testsuite/
	* g++.dg/remove-local-statics-1.C: New test.
	* g++.dg/remove-local-statics-2.C: New test.
	* gcc.dg/remove-local-statics-1.c: New file.
	* gcc.dg/remove-local-statics-2.c: New file.
	* gcc.dg/remove-local-statics-3.c: New file.
	* gcc.dg/remove-local-statics-4.c: New file.
	* gcc.dg/remove-local-statics-5.c: New file.
	* gcc.dg/remove-local-statics-6.c: New file.
	* gcc.dg/remove-local-statics-7.c: New file.
	* gcc.dg/remove-local-statics-8.c: New file.
	* gcc.dg/remove-local-statics-9.c: New file.
	* gcc.dg/remove-local-statics-10.c: New file.
	* gcc.dg/remove-local-statics-11.c: New file.
	* gcc.dg/remove-local-statics-12.c: New file.
	* gcc.dg/remove-local-statics-13.c: New test.
	* gcc.dg/remove-local-statics-14.c: New test.
	* gcc.dg/remove-local-statics-15.c: New test.
	* gcc.dg/remove-local-statics-16.c: New test.
	* gcc.dg/remove-local-statics-17.c: New test.
	* gcc.dg/remove-local-statics-18.c: New test.
	* gcc.dg/tree-ssa/ssa-dse-6.c: Ensure the local variables aren't
	optimized away.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index e95dd63..3acd9fb 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2306,7 +2306,7 @@ tree-outof-ssa.o : tree-outof-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
 tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(GGC_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
$(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h $(FLAGS_H) \
-   $(GIMPLE_PRETTY_PRINT_H) langhooks.h
+   $(GIMPLE_PRETTY_PRINT_H) $(HASHTAB_H) langhooks.h
 tree-ssa-forwprop.o : tree-ssa-forwprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) $(CFGLOOP_H) \
$(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 445282a..9343e4c 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -531,8 +531,10 @@ cgraph_create_node (tree decl)
   if (DECL_CONTEXT (decl) && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
 {
   node->origin = cgraph_get_create_node (DECL_CONTEXT (decl));
+  node->origin->ever_was_nested = 1;
   node->next_nested = node->origin->nested;
   node->origin->nested = node;
+  node->ever_was_nested = 1;
 }
   return node;
 }
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 276e568..a667f74 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -303,6 +303,8 @@ struct GTY(()) cgraph_node {
   /* Set once the function has been instantiated and its callee
  lists created.  */
   unsigned process : 1;
+  /* Set if the function is a nested function or has nested functions.  */
+  unsigned ever_was_nested : 1;
   /* How commonly executed the node is.  Initialized during branch
  probabilities pass.  */
   ENUM_BITFIELD (node_frequency) frequency : 2;
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 5e7dbcd..8b346cd 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4515,6 +4515,15 @@ mark_used (tree decl, tsubst_flags_t complain)
 
   /* Set TREE_USED for the ben

[PATCH][ARM][5/n] Partial IT block deprecation in ARMv8 AArch32 - load/store multiple

2013-06-06 Thread Kyrylo Tkachov
Hi all,

This patch updates the fixed-point math patterns in arm-fixed.md to
conform to new IT block rules in ARMv8. The add/sub instructions can be
placed in an IT block if they use the low registers.
The other more exotic variants (ssub, uqadd etc) do not have 16-bit
encodings and can therefore not be conditionalised by the new rules.

Tested together with the other patches in the series by bootstrap on a
Cortex-A15 and regtest arm-none-eabi on qemu and model.

Ok for trunk?


Thanks,
Kyrill


2013-06-06  Kyrylo Tkachov  

* config/arm/arm-fixed.md (add3,usadd3,ssadd3,
sub3, ussub3, sssub3, arm_ssatsihi_shift,
arm_usatsihi):
Adjust alternatives for arm_restrict_it.

07-fixedpoint.patch
Description: Binary data


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Jakub Jelinek
On Thu, Jun 06, 2013 at 03:26:19PM +0200, Segher Boessenkool wrote:
> >The C++11/C++14 undefined behavior of left signed shift can be tested
> >similarly, if ((unsigned type for op0's type) op0) >> (precm1 - y)
> >is greater than one, then it is undefined behavior.
> >Jason, does
> >http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3675.html#1457
> >apply just to C++11/C++14, or to C++03 too?
> 
> Doesn't DR1457 also leave
> 
>neg << 0
> 
> as undefined, where "neg" is a negative value?  That isn't caught by
> your "greater than one" expression.

Yeah, of course, it needs to be for any shift x << y or x >> y (signed or 
unsigned):
1) if ((unsigned) y > precm1) ub
plus for signed x << y:
2) for C99/C11 if ((unsigned) x >> (precm1 - y)) ub
3) for C++11/C++14 if (x < 0 || ((unsigned) x >> (precm1 - y)) > 1) ub
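
A minimal sketch of these three checks written out as plain C predicates, assuming a 32-bit style int without padding bits. The helper names are hypothetical and are not part of any ubsan runtime; PRECM1 mirrors "precm1" above.

```c
#include <limits.h>
#include <stdbool.h>

/* Precision of int minus one (31 for a 32-bit int).  Assumes int has
   no padding bits.  */
#define PRECM1 ((int) (sizeof (int) * CHAR_BIT) - 1)

/* Rule 1: the count is negative or not smaller than the precision.  */
static bool
shift_count_ub (int y)
{
  return (unsigned) y > (unsigned) PRECM1;
}

/* Rule 2 (C99/C11): signed x << y is undefined if x is negative or
   if any set bit would reach or pass the sign bit.  */
static bool
lshift_ub_c99 (int x, int y)
{
  if (shift_count_ub (y))
    return true;
  return ((unsigned) x >> (PRECM1 - y)) != 0;
}

/* Rule 3 (C++11/C++14 with DR1457): a negative x is always undefined,
   otherwise a set bit may land in the sign bit but not beyond it.  */
static bool
lshift_ub_cxx11 (int x, int y)
{
  if (shift_count_ub (y))
    return true;
  return x < 0 || ((unsigned) x >> (PRECM1 - y)) > 1;
}
```

With these, 1 << PRECM1 is flagged by the C99 check but not by the C++11 one, and neg << 0 is flagged by both.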

Jakub


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Segher Boessenkool

The C++11/C++14 undefined behavior of left signed shift can be tested
similarly, if ((unsigned type for op0's type) op0) >> (precm1 - y)
is greater than one, then it is undefined behavior.
Jason, does
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3675.html#1457

apply just to C++11/C++14, or to C++03 too?


Doesn't DR1457 also leave

   neg << 0

as undefined, where "neg" is a negative value?  That isn't caught by
your "greater than one" expression.


Segher



Re: [PATCH] ARM: Don't clobber CC reg when it is live after the peephole window

2013-06-06 Thread Richard Earnshaw

On 29/05/13 18:15, Meador Inge wrote:

Hi All,

This patch fixes a bug in one of the ARM peephole2 optimizations.  The
peephole2 optimization in question was changed to use the CC-updating
form for all of the instructions produced by the peephole so that the
encoding will be smaller when compiling for thumb [1].  However, I don't
think that is always safe.

For example, the CC register might be used by something *after* the
peephole window.  The current peephole will transform:


 (insn:TI 7 49 18 2 (set (reg:CC 24 cc)
 (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
 (const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn}
  (nil))

 (insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc)
 (const_int 0 [0]))
 (set (reg:SI 3 r3 [140])
 (const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp}
  (expr_list:REG_EQUIV (const_int 0 [0])
 (nil)))

 (insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc)
 (const_int 0 [0]))
 (set (reg:SI 3 r3 [138])
 (const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp}
  (expr_list:REG_EQUIV (const_int 1 [0x1])
 (nil)))

 (insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc)
 (const_int 0 [0]))
 (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
 (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
  (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
 (nil)))

 (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
 (const_int 0 [0]))
 (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
 (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
  (nil))

 (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
 (const_int 0 [0]))
 (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 
*endname_1(D)+0 S1 A8])
 (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
  (expr_list:REG_DEAD (reg:CC 24 cc)
 (expr_list:REG_DEAD (reg:QI 3 r3 [140])
 (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
 (nil)))))

into the following:


 (insn 59 49 60 2 (parallel [
 (set (reg:CC 24 cc)
 (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
 (const_int 0 [0])))
 (set (reg:SI 1 r1)
 (minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
 (const_int 0 [0])))
 ]) repro.c:6 -1
  (nil))

 (insn 60 59 61 2 (parallel [
 (set (reg:CC 24 cc)
 (compare:CC (const_int 0 [0])
 (reg:SI 1 r1)))
 (set (reg:SI 3 r3 [140])
 (minus:SI (const_int 0 [0])
 (reg:SI 1 r1)))
 ]) repro.c:6 -1
  (nil))

 (insn 61 60 19 2 (parallel [
 (set (reg:SI 3 r3 [140])
 (plus:SI (plus:SI (reg:SI 3 r3 [140])
 (reg:SI 1 r1))
 (geu:SI (reg:CC 24 cc)
 (const_int 0 [0]))))
 (clobber (reg:CC 24 cc))
 ]) repro.c:6 -1
  (nil))

 (insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc)
 (const_int 0 [0]))
 (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
 (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
  (nil))

 (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
 (const_int 0 [0]))
 (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
 (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
  (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
 (nil)))

 (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
 (const_int 0 [0]))
 (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 
*endname_1(D)+0 S1 A8])
 (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
  (expr_list:REG_DEAD (reg:CC 24 cc)
 (expr_list:REG_DEAD (reg:QI 3 r3 [140])
 (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
 (nil)))))


This gets compiled into the incorrect sequence:


 ldrb    r3, [r0, #0]
 ldr     r2, .L4
 subs    r1, r3, #0
 rsbs    r3, r1, #0
 adcs    r3, r3, r1
 strne   r3, [r2, #0]
 streq   r3, [r2, #0]
 strneb  r3, [r0, #0]


The conditional stores are now dealing with an incorrect condition state.

This patch fixes the problem by ensuring that the CC reg is dead after the
peephole window for the current peephole definition and falls back on the
original pre-PR46975 peephole when it is live.  Unfortunately I had trouble
coming up with

[PATCH] Shrink LTO bytecode

2013-06-06 Thread Richard Biener

This shrinks LTO_tree_pickle_reference and all LTO headers by
using uhwi streaming instead of fixed-size streaming for
streamer_write_hwi_in_range/streamer_read_hwi_in_range.
LTO_tree_pickle_reference records are the most frequently emitted, so it's
worth optimizing the tag's placement so it fits in a 1-byte record.
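
To illustrate why a small tag value matters, here is a rough sketch of a variable-length unsigned encoding. It assumes the uhwi stream uses a ULEB128-style layout (7 payload bits per byte plus a continuation bit); the function names below are stand-ins for illustration, not the actual streamer routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write VAL to OUT, 7 bits per byte, low bits first.  The high bit of
   each byte says whether more bytes follow.  Returns the byte count
   (at most 10 for a 64-bit value).  */
static size_t
write_uleb128 (uint8_t *out, uint64_t val)
{
  size_t n = 0;
  do
    {
      uint8_t byte = val & 0x7f;
      val >>= 7;
      if (val != 0)
	byte |= 0x80;
      out[n++] = byte;
    }
  while (val != 0);
  return n;
}

/* Read a value written by write_uleb128; store the byte count in *LEN.  */
static uint64_t
read_uleb128 (const uint8_t *in, size_t *len)
{
  uint64_t val = 0;
  unsigned shift = 0;
  size_t n = 0;
  uint8_t byte;
  do
    {
      byte = in[n++];
      val |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  *len = n;
  return val;
}

/* Stream VAL from the range MIN...MAX as the normalized value VAL - MIN.  */
static size_t
write_in_range (uint8_t *out, int64_t min, int64_t max, int64_t val)
{
  assert (val >= min && val <= max);
  return write_uleb128 (out, (uint64_t) (val - min));
}
```

Under that assumption the size of a record depends on the value written rather than on the width of the range, so any normalized value below 128 takes a single byte. That is why the most common tag is moved next to LTO_null.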

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2013-06-06  Richard Biener  

* lto-streamer.h (enum LTO_tags): Move LTO_tree_pickle_reference
after LTO_null.
(lto_tag_is_tree_code_p): Adjust.
(lto_tag_is_gimple_code_p): Likewise.
(lto_gimple_code_to_tag): Likewise.
(lto_tag_to_gimple_code): Likewise.
(lto_tree_code_to_tag): Likewise.
(lto_tag_to_tree_code): Likewise.
* data-streamer.h (streamer_write_hwi_in_range): Use
uhwi streaming to stream the normalized range.
(streamer_read_hwi_in_range): Likewise.

Index: gcc/lto-streamer.h
===================================================================
--- gcc/lto-streamer.h  (revision 199720)
+++ gcc/lto-streamer.h  (working copy)
@@ -155,6 +155,9 @@ enum LTO_tags
 {
   LTO_null = 0,
 
+  /* Special for streamer.  Reference to previously-streamed node.  */
+  LTO_tree_pickle_reference,
+
   /* Reserve enough entries to fit all the tree and gimple codes handled
  by the streamer.  This guarantees that:
 
@@ -196,9 +199,6 @@ enum LTO_tags
   /* EH try/catch node.  */
   LTO_eh_catch,
 
-  /* Special for global streamer. Reference to previously-streamed node.  */
-  LTO_tree_pickle_reference,
-
   /* References to indexable tree nodes.  These objects are stored in
  tables that are written separately from the function bodies that
  reference them.  This way they can be instantiated even when the
@@ -921,7 +921,7 @@ extern vec lto_f
 static inline bool
 lto_tag_is_tree_code_p (enum LTO_tags tag)
 {
-  return tag > LTO_null && (unsigned) tag <= MAX_TREE_CODES;
+  return tag > LTO_tree_pickle_reference && (unsigned) tag <= MAX_TREE_CODES;
 }
 
 
@@ -929,8 +929,8 @@ lto_tag_is_tree_code_p (enum LTO_tags ta
 static inline bool
 lto_tag_is_gimple_code_p (enum LTO_tags tag)
 {
-  return (unsigned) tag >= NUM_TREE_CODES + 1
-&& (unsigned) tag < 1 + NUM_TREE_CODES + LAST_AND_UNUSED_GIMPLE_CODE;
+  return (unsigned) tag >= NUM_TREE_CODES + 2
+&& (unsigned) tag < 2 + NUM_TREE_CODES + LAST_AND_UNUSED_GIMPLE_CODE;
 }
 
 
@@ -939,7 +939,7 @@ lto_tag_is_gimple_code_p (enum LTO_tags
 static inline enum LTO_tags
 lto_gimple_code_to_tag (enum gimple_code code)
 {
-  return (enum LTO_tags) ((unsigned) code + NUM_TREE_CODES + 1);
+  return (enum LTO_tags) ((unsigned) code + NUM_TREE_CODES + 2);
 }
 
 
@@ -949,7 +949,7 @@ static inline enum gimple_code
 lto_tag_to_gimple_code (enum LTO_tags tag)
 {
   gcc_assert (lto_tag_is_gimple_code_p (tag));
-  return (enum gimple_code) ((unsigned) tag - NUM_TREE_CODES - 1);
+  return (enum gimple_code) ((unsigned) tag - NUM_TREE_CODES - 2);
 }
 
 
@@ -958,7 +958,7 @@ lto_tag_to_gimple_code (enum LTO_tags ta
 static inline enum LTO_tags
 lto_tree_code_to_tag (enum tree_code code)
 {
-  return (enum LTO_tags) ((unsigned) code + 1);
+  return (enum LTO_tags) ((unsigned) code + 2);
 }
 
 
@@ -968,7 +968,7 @@ static inline enum tree_code
 lto_tag_to_tree_code (enum LTO_tags tag)
 {
   gcc_assert (lto_tag_is_tree_code_p (tag));
-  return (enum tree_code) ((unsigned) tag - 1);
+  return (enum tree_code) ((unsigned) tag - 2);
 }
 
 /* Check that tag ACTUAL == EXPECTED.  */
Index: gcc/data-streamer.h
===================================================================
--- gcc/data-streamer.h (revision 199720)
+++ gcc/data-streamer.h (working copy)
@@ -216,13 +216,7 @@ streamer_write_hwi_in_range (struct lto_
   && range < 0x7fffffff);
 
   val -= min;
-  streamer_write_char_stream (obs, val & 255);
-  if (range >= 0xff)
-streamer_write_char_stream (obs, (val >> 8) & 255);
-  if (range >= 0xffff)
-streamer_write_char_stream (obs, (val >> 16) & 255);
-  if (range >= 0xffffff)
-streamer_write_char_stream (obs, (val >> 24) & 255);
+  streamer_write_uhwi_stream (obs, (unsigned HOST_WIDE_INT) val);
 }
 
 /* Input VAL into OBS and verify it is in range MIN...MAX that is supposed
@@ -235,17 +229,11 @@ streamer_read_hwi_in_range (struct lto_i
 HOST_WIDE_INT max)
 {
   HOST_WIDE_INT range = max - min;
-  HOST_WIDE_INT val = streamer_read_uchar (ib);
+  unsigned HOST_WIDE_INT uval = streamer_read_uhwi (ib);
 
   gcc_checking_assert (range > 0 && range < 0x7fffffff);
 
-  if (range >= 0xff)
-val |= ((HOST_WIDE_INT)streamer_read_uchar (ib)) << 8;
-  if (range >= 0xffff)
-val |= ((HOST_WIDE_INT)streamer_read_uchar (ib)) << 16;
-  if (range >= 0xffffff)
-val |= ((HOST_WIDE_INT)streamer_read_uchar (ib)) << 24;
-  val += min;
+  HOST_WIDE_INT val = (HOST_WIDE_INT) (uval + (unsigned HOST_WIDE_INT) min);
   if (val < min || val > max)
 lto_value_ra

Re: [PATCH][ARM][4/n] Partial IT block deprecation in ARMv8 AArch32 - load/store multiple patterns

2013-06-06 Thread Richard Earnshaw

On 05/06/13 17:58, Kyrylo Tkachov wrote:

Hi all,

ARMv8-style IT blocks don't allow load/store multiple instructions (ldm,
stm), so this patch disables the predicable forms of the corresponding
patterns. The ldm/stm patterns are generated through an Ocaml script,
which is updated to reflect the new rules. The ldmstm.md file is
regenerated. The changes are quite straightforward.

Tested together with the other patches in the series by bootstrap on a
Cortex-A15 and regtest arm-none-eabi on qemu and model.

Ok for trunk?

Thanks,
Kyrill

2013-06-05  Kyrylo Tkachov  

* config/arm/arm-ldmstm.ml: Set "predicable_short_it" to "no"
where appropriate.
* config/arm/ldmstm.md: Regenerate.




OK

R.




Re: [PATCH][ARM][3/n] Partial IT block deprecation in ARMv8 AArch32 - atomics patterns

2013-06-06 Thread Richard Earnshaw

On 05/06/13 17:49, Kyrylo Tkachov wrote:

Hi all,

This patch restricts predication for the various atomics patterns in
sync.md by using the new predicable_short_it mechanism. The load/store
exclusive and the acquire/release instructions cannot be contained
inside IT blocks in ARMv8 so the logic behind disabling their predicable
versions for ARMv8 Thumb mode is pretty straightforward.

Tested together with the other patches in the series by bootstrap on
Cortex-A15 and regtest arm-none-eabi on qemu and model.

Ok for trunk?

Thanks,
Kyrill

2013-06-05  Kyrylo Tkachov  

* config/arm/sync.md (atomic_loaddi_1):
Disable predication for arm_restrict_it.
(arm_load_exclusive): Likewise.
(arm_load_exclusivesi): Likewise.
(arm_load_exclusivedi): Likewise.
(arm_load_acquire_exclusive): Likewise.
(arm_load_acquire_exclusivesi): Likewise.
(arm_load_acquire_exclusivedi): Likewise.
(arm_store_exclusive): Likewise.
(arm_store_exclusive): Likewise.
(arm_store_release_exclusivedi): Likewise.
(arm_store_release_exclusive): Likewise.




OK.

R.




Re: [PATCH] Enhance shrink-wrap

2013-06-06 Thread Steven Bosscher
On Thu, Jun 6, 2013 at 11:55 AM, Zhenqiang Chen wrote:
> The patch enhances prepare_shrink_wrap by doing copyprop for the entry
> block.  This exposes more opportunities for shrink-wrapping.  These
> kinds of copies often occur when incoming argument registers are moved
> to call-saved registers because their values are live across one or
> more calls during the function.

Hello,

This was discussed before, see the thread surrounding:
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01455.html

This is also http://gcc.gnu.org/PR10474

Ciao!
Steven


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Jason Merrill

On 06/06/2013 02:07 AM, Jakub Jelinek wrote:

Jason, does
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3675.html#1457
apply just to C++11/C++14, or to C++03 too?


The committee hasn't said anything about which DRs since C++03 apply to 
it.  I take the position that most do, but not this one, since it is a 
change to wording that doesn't exist in C++03.



In C++03 I see in [expr.shift]/2
"The value of E1 << E2 is E1 (interpreted as a bit pattern) left-shifted E2
bit positions; vacated bits are zero-filled. If E1 has an unsigned type,
the value of the result is E1 multiplied by the quantity 2 raised to
the power E2, reduced modulo ULONG_MAX+1 if E1 has type unsigned long,
UINT_MAX+1 otherwise."  Is that the same case as C90 then, the wording seems
to be pretty much the same?


Yes, that's the same as C90.


what the current -std= makes as undefined behavior (though, because of DRs
that is somewhat fuzzy, pre-DR1457 C++11 vs. post-DR1457 C++11)


In contrast to the C++03 situation, the committee has been clear about 
which DRs apply to C++11 and which to C++1y, and this one does apply to 
C++11.


It's unfortunate that C and C++ have different rules here.  I'm actually 
inclined to agree with comment 48 from 
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n834.htm that we should 
have left the C90/C++98 rules alone, but I guess that comment was rejected.


Jason



Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Richard Smith
[Resending with less text/html]

On Thu, Jun 6, 2013 at 1:55 AM, Konstantin Serebryany
 wrote:
> On Thu, Jun 6, 2013 at 12:44 PM, Jakub Jelinek  wrote:
> > On Thu, Jun 06, 2013 at 12:41:56PM +0400, Konstantin Serebryany wrote:
> >> As for libstdc++, I completely agree, we don't want to depend on it,
> >> and we don't.
> >
> > ubsan actually needs
> >  U _ZTIN10__cxxabiv117__class_type_infoE@@CXXABI_1.3
> >  U _ZTIN10__cxxabiv120__si_class_type_infoE@@CXXABI_1.3
> >  U _ZTIN10__cxxabiv121__vmi_class_type_infoE@@CXXABI_1.3
> >  U _ZTISt9type_info@@GLIBCXX_3.4
> >  U __dynamic_cast@@CXXABI_1.3
>
> These things are needed only for the C++-specific undefined behavior checking.
> At least, if I compile a C test using clang -fsanitize=undefined I
> don't see any of  these.
>
> Richard, am I right?

Yes. We build two different runtimes, one which needs these bits (for
C++) and one which doesn't (for C).

Adding __extension__ to the __int128 typedefs is fine too.


[COMMITTED] libgomp: typo fix, GNU/Hurd TLS configuration

2013-06-06 Thread Thomas Schwinge
Hi!

I pushed the following two obvious patches to trunk:

commit 6731634deb51e8aebd43b7fc34fe8330d6a0edba
Author: tschwinge 
Date:   Thu Jun 6 10:04:34 2013 +

libgomp/
* config/posix/ptrlock.h: Fix comment.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@199724 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git libgomp/ChangeLog libgomp/ChangeLog
index 1747cec..f4d6cd6 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,3 +1,7 @@
+2013-06-06  Thomas Schwinge  
+
+   * config/posix/ptrlock.h: Fix comment.
+
 2013-05-27  Tobias Burnus  
 
PR fortran/57423
diff --git libgomp/config/posix/ptrlock.h libgomp/config/posix/ptrlock.h
index eec4e19..76c2deb 100644
--- libgomp/config/posix/ptrlock.h
+++ libgomp/config/posix/ptrlock.h
@@ -22,9 +22,8 @@
see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
.  */
 
-/* This is a Linux specific implementation of a mutex synchronization
-   mechanism for libgomp.  This type is private to the library.  This
-   implementation uses atomic instructions and the futex syscall.  */
+/* This is a generic POSIX implementation of a mutex synchronization
+   mechanism for libgomp.  This type is private to the library.  */
 
 #ifndef GOMP_PTRLOCK_H
 #define GOMP_PTRLOCK_H 1

This has probably been copy'n'pasted from the Linux-specific file.


commit 3a930d3fc68785662f5f3f4af02474cb21a62056
Author: tschwinge 
Date:   Thu Jun 6 10:04:49 2013 +

libgomp/
* configure.tgt (XCFLAGS): Add -ftls-model=initial-exec for
GNU/Hurd, as done for Linux-based systems.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@199725 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git libgomp/ChangeLog libgomp/ChangeLog
index f4d6cd6..a78190f 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,5 +1,8 @@
 2013-06-06  Thomas Schwinge  
 
+   * configure.tgt (XCFLAGS): Add -ftls-model=initial-exec for
+   GNU/Hurd, as done for Linux-based systems.
+
* config/posix/ptrlock.h: Fix comment.
 
 2013-05-27  Tobias Burnus  
diff --git libgomp/configure.tgt libgomp/configure.tgt
index 2eecc93..8b18417 100644
--- libgomp/configure.tgt
+++ libgomp/configure.tgt
@@ -14,7 +14,10 @@
 if test $gcc_cv_have_tls = yes ; then
   case "${target}" in
 
-*-*-linux*)
+*-*-k*bsd*-gnu*)
+   ;;
+
+*-*-linux* | *-*-gnu*)
XCFLAGS="${XCFLAGS} -ftls-model=initial-exec"
;;
   esac

GNU/k*bsd not changed, because I don't know whether that'd be fine with
their LinuxThreads pthread implementation.


Grüße,
 Thomas




[PATCH] Enhance shrink-wrap

2013-06-06 Thread Zhenqiang Chen
Hi,

The patch enhances prepare_shrink_wrap by doing copyprop for the entry
block.  This exposes more opportunities for shrink-wrapping.  These
kinds of copies often occur when incoming argument registers are moved
to call-saved registers because their values are live across one or
more calls during the function.
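
A small sketch of the shape of function this is aimed at. The names are invented for illustration, and what the register allocator actually does depends on the target and options. Here "b" stays live across the first call, so it is typically copied from its incoming argument register into a call-saved register in the entry block. That copy touches a call-saved register before the early-exit test, which forces the prologue onto the fast path too. Propagating the copy lets the test and the early return use the incoming registers directly.

```c
/* Hypothetical out-of-line helper; noinline keeps the calls visible.  */
static int __attribute__ ((noinline))
helper (int x)
{
  return x + 1;
}

int
maybe_work (int a, int b)
{
  /* Fast path: needs no call and ideally no prologue/epilogue.  */
  if (a == 0)
    return b;

  /* Slow path: B is live across the first call to helper.  */
  return helper (a) + helper (b);
}
```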

* For SPECint2000 (-O3), the number of functions that can be
shrink-wrapped increases from 197 to 364 on ARM and from 364 to 618 on
X86-64 with the patch.
* No SPECint2000 performance regression for X86-64 and ARM.
* On X86-64 (-O3), 253.perlbmk is ~3% better.
* On ARM (A15, -O3), 453.povray is ~5% better.
* Bootstrapped and no make check regression for X86-64 and ARM A9.

Is it OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2013-06-06  Zhenqiang Chen  

* function.c (prepare_shrink_wrap): Do copy prop for entry block.
* function.h (copyprop_hardreg_forward_blocks): New.
* regcprop.c (copyprop_hardreg_forward_blocks): New.
(copyprop_hardreg_forward): Call copyprop_hardreg_forward_blocks.


Enhance-shrink-wrap.patch
Description: Binary data


Re: [Patch, Fortran] PR57530 - fix rejects valid with gfc_type_compatible

2013-06-06 Thread Mikael Morin
Le 05/06/2013 14:49, Tobias Burnus a écrit :
> Now with attached patch.
> 
> Tobias Burnus wrote:
>> I accidentally attached a slightly out-dated patch. The old patch
>> permitted CLASS<->TYPE differences in cases where the characteristic
>> had to match (e.g. dummy arguments in a proc-pointer assignment). -
>> Sorry for the confusion.
>>
>> Build and regtested on x86-64-gnu-linux.
>> OK for the trunk?
>>
>> Tobias
>>
>> Tobias Burnus wrote:
>>> A TYPE is type compatible with a CLASS if both have the same declared
>>> type.
>>>
>>> Or in words of the standard (cf. PR):
>>> "A nonpolymorphic entity is type compatible only with entities of the
>>> same declared type. A polymorphic entity that is not an unlimited
>>> polymorphic entity is type compatible with entities of the same
>>> declared type or any of its extensions." (F2008, 4.3.1.3).
>>>
>>> Build and regtested on x86-64-gnu-linux.
>>> OK for the trunk?
>>
OK


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Konstantin Serebryany
On Thu, Jun 6, 2013 at 12:59 PM, Jakub Jelinek  wrote:
> On Thu, Jun 06, 2013 at 12:55:17PM +0400, Konstantin Serebryany wrote:
>> > ubsan actually needs
>> >  U _ZTIN10__cxxabiv117__class_type_infoE@@CXXABI_1.3
>> >  U _ZTIN10__cxxabiv120__si_class_type_infoE@@CXXABI_1.3
>> >  U _ZTIN10__cxxabiv121__vmi_class_type_infoE@@CXXABI_1.3
>> >  U _ZTISt9type_info@@GLIBCXX_3.4
>> >  U __dynamic_cast@@CXXABI_1.3
>>
>> These things are needed only for the C++-specific undefined behavior 
>> checking.
>> At least, if I compile a C test using clang -fsanitize=undefined I
>> don't see any of  these.
>
> But that is only because of the statically linking everything approach.

Err. Yes, right.
We don't link either of the sanitizers dynamically.


> When libubsan is a shared library, when any part of the library needs
> libstdc++, you need it for everything, unless you do some weakref tricks and
> use it only conditionally (but that might be harder when you use C++
> dynamic_cast, you'd need to call the runtime routine through weak symbol
> instead by hand).
>
>> > plus all the libs have:
>> >  w __cxa_demangle@@CXXABI_1.3
>>
>> This beast is declared as weak:
>
> Sure, I know very well what w means ;), was listing this just for
> completeness.
>
> Jakub


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Jakub Jelinek
On Thu, Jun 06, 2013 at 12:55:17PM +0400, Konstantin Serebryany wrote:
> > ubsan actually needs
> >  U _ZTIN10__cxxabiv117__class_type_infoE@@CXXABI_1.3
> >  U _ZTIN10__cxxabiv120__si_class_type_infoE@@CXXABI_1.3
> >  U _ZTIN10__cxxabiv121__vmi_class_type_infoE@@CXXABI_1.3
> >  U _ZTISt9type_info@@GLIBCXX_3.4
> >  U __dynamic_cast@@CXXABI_1.3
> 
> These things are needed only for the C++-specific undefined behavior checking.
> At least, if I compile a C test using clang -fsanitize=undefined I
> don't see any of  these.

But that is only because of the statically linking everything approach.
When libubsan is a shared library, when any part of the library needs
libstdc++, you need it for everything, unless you do some weakref tricks and
use it only conditionally (but that might be harder when you use C++
dynamic_cast, you'd need to call the runtime routine through weak symbol
instead by hand).

> > plus all the libs have:
> >  w __cxa_demangle@@CXXABI_1.3
> 
> This beast is declared as weak:

Sure, I know very well what w means ;), was listing this just for
completeness.

Jakub


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Konstantin Serebryany
On Thu, Jun 6, 2013 at 12:44 PM, Jakub Jelinek  wrote:
> On Thu, Jun 06, 2013 at 12:41:56PM +0400, Konstantin Serebryany wrote:
>> As for libstdc++, I completely agree, we don't want to depend on it,
>> and we don't.
>
> ubsan actually needs
>  U _ZTIN10__cxxabiv117__class_type_infoE@@CXXABI_1.3
>  U _ZTIN10__cxxabiv120__si_class_type_infoE@@CXXABI_1.3
>  U _ZTIN10__cxxabiv121__vmi_class_type_infoE@@CXXABI_1.3
>  U _ZTISt9type_info@@GLIBCXX_3.4
>  U __dynamic_cast@@CXXABI_1.3

These things are needed only for the C++-specific undefined behavior checking.
At least, if I compile a C test using clang -fsanitize=undefined I
don't see any of  these.

Richard, am I right?


> plus all the libs have:
>  w __cxa_demangle@@CXXABI_1.3

This beast is declared as weak:
sanitizer_common/sanitizer_symbolizer_itanium.cc
  extern "C" char *__cxa_demangle(const char *mangled, char *buffer,
  size_t *length, int *status)
SANITIZER_WEAK_ATTRIBUTE;

If we have the C++ run-time linked-in, we can use __cxa_demangle.
If we don't have the C++ run-time, we most likely don't need
__cxa_demangle either.
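
As a rough illustration of the weak-reference idea (this is not the sanitizer code itself), the following sketch uses the GCC/Clang weak attribute on an ELF target. The wrapper name demangle_or_copy is made up. If no C++ runtime is linked in, the weak symbol resolves to a null address and the mangled name is returned unchanged.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Weak declaration: the address is null if nothing provides the symbol.  */
extern char *__cxa_demangle (const char *mangled, char *buffer,
			     size_t *length, int *status)
  __attribute__ ((weak));

/* Return a malloc'ed, demangled copy of NAME if the C++ runtime is
   present and NAME demangles; otherwise a malloc'ed copy of NAME.  */
static char *
demangle_or_copy (const char *name)
{
  char *copy;

  if (__cxa_demangle)
    {
      int status = 0;
      char *demangled = __cxa_demangle (name, NULL, NULL, &status);
      if (demangled && status == 0)
	return demangled;
    }

  copy = malloc (strlen (name) + 1);
  if (copy)
    strcpy (copy, name);
  return copy;
}
```

The caller frees the result in either case.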

You can confirm this by building some C program with "clang
-fsanitize=address" -- it will not depend on libc++.

--kcc

>
> Jakub


Re: Fix PR 53743 and other -freorder-blocks-and-partition failures

2013-06-06 Thread Richard Biener
On Wed, Jun 5, 2013 at 4:06 PM, Teresa Johnson  wrote:
> On Wed, May 29, 2013 at 7:57 AM, Teresa Johnson  wrote:
>> On Thu, May 23, 2013 at 6:18 AM, Teresa Johnson  wrote:
>>> On Wed, May 22, 2013 at 2:05 PM, Teresa Johnson  
>>> wrote:
>>>> Revised patch included below. The spacing of my pasted in patch text
>>>> looks funky again, let me know if you want the patch as an attachment
>>>> instead.
>>>>
>>>> I addressed all of Steven's comments, except for the suggestion to use
>>>> gcc_assert
>>>> instead of error() in verify_hot_cold_block_grouping() to keep this
>>>> consistent
>>>> with the rest of the verify_flow_info subroutines (let me know if this is
>>>> ok).
>>>
>>> I fixed this issue too, which was actually in
>>> insert_section_boundary_note(), so that it gcc_asserts more
>>> efficiently as suggested. Retested, latest patch below.
>>>
>>> Honza, would you be able to review the patch?
>>
>> Ping. Still needs a global maintainer to review and approve.
>
> Ping.

This is ok.  Please watch for fallout!

Thanks,
Richard.

> Thanks!
> Teresa
>
>>
>> Also, I submitted a PR for the debug range issue:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57451
>>
>> Thanks!
>> Teresa
>>
>>>
>>> Thanks!
>>> Teresa
>>>

>>>> The other main changes:
>>>> (1) Added several test cases (cloned from the torture subdirectories,
>>>> where I manually
>>>> built/ran with FDO and -freorder-blocks-and-partition with both the
>>>> current trunk and
>>>> my fixed trunk compiler, and was able to expose some failures I fixed.
>>>> (2) Changed existing tree-prof tests that used
>>>> -freorder-blocks-and-partition to be
>>>> built with -O2 instead of -O, so that partitioning actually kicks in.
>>>> (3) Fixed a couple of failures in the new
>>>> verify_hot_cold_block_grouping() checks
>>>> exposed by the torture tests I ran manually with splitting (2 of the
>>>> tests cloned
>>>> to tree-prof in this patch). One was in computed goto where we were
>>>> too aggressive
>>>> about cloning crossing edges, and the other was in rtl_split_edge
>>>> called from the "stack"
>>>> pass which was not correctly inserting the new bb in the correct partition
>>>> since
>>>> bb layout is complete at that point.
>>>>
>>>> Re-tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
>>>> builds and regression testing. Re-built/ran cpu2006int with profile
>>>> feedback and -freorder-blocks-and-partition enabled.
>>>>
>>>> Ok for trunk?
>>>>
>>>> Thanks!
>>>> Teresa
>>>
>>> 2013-05-23  Teresa Johnson  
>>>
>>> * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
>>> as this is now done by redirect_edge_and_branch_force.
>>> * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
>>> barriers, and fix interaction with splitting.
>>> * emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
>>> * cfgcleanup.c (try_forward_edges): Fix early return value to properly
>>> reflect changes made in the routine.
>>> * bb-reorder.c (emit_barrier_after_bb): Move to cfgrtl.c.
>>> (fix_up_fall_thru_edges): Remove incorrect check for bb layout order
>>> since this is called in cfglayout mode, and replace partition fixup
>>> with assert as that is now done by force_nonfallthru_and_redirect.
>>> (add_reg_crossing_jump_notes): Handle the fact that some jumps may
>>> already be marked with region crossing note.
>>> (insert_section_boundary_note): Make non-static, gate on flag
>>> has_bb_partition, rewrite to also check for multiple partitions.
>>> (rest_of_handle_reorder_blocks): Remove call to
>>> insert_section_boundary_note, now done later during free_cfg.
>>> (duplicate_computed_gotos): Don't duplicate partition crossing edge.
>>> * bb-reorder.h (insert_section_boundary_note): Declare.
>>> * Makefile.in (cfgrtl.o): Depend on bb-reorder.h
>>> * cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
>>> invoke insert_section_boundary_note.
>>> (try_redirect_by_replacing_jump): Remove unnecessary
>>> check for region crossing note.
>>> (fixup_partition_crossing): New function.
>>> (rtl_redirect_edge_and_branch): Fixup partition boundaries.
>>> (emit_barrier_after_bb): Move here from bb-reorder.c, handle insertion
>>> in non-cfglayout mode.
>>> (force_nonfallthru_and_redirect): Fixup partition boundaries,
>>> remove old code that tried to do this. Emit barrier correctly
>>> when we are in cfglayout mode.
>>> (last_bb_in_partition): New function.
>>> (rtl_split_edge): Correctly fixup partition boundaries.
>>> (commit_one_edge_insertion): Remove old code that tried to
>>> fixup region crossing edge since this is now handled in
>>> split_block, and set up insertion point correctly since
>>> block may now end in a jump.
>>> (verify_hot_cold_block_grouping): Guard against checking when not in
>>> linearized RTL mode.
>>> (rtl_verify_edges): Add checks for incorrect/missing

Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Konstantin Serebryany
+rich...@metafoo.co.uk

On Thu, Jun 6, 2013 at 12:21 PM, Jakub Jelinek  wrote:
> On Thu, Jun 06, 2013 at 11:46:19AM +0400, Konstantin Serebryany wrote:
>> If we are going to import the ubsan run-time from LLVM's
>> projects/compiler-rt/lib/ubsan,
>> we may also need to update the contents of
>> libsanitizer/sanitizer_common and keep them in sync afterwards.
>> (ubsan shares few bits of code with asan/tsan/msan)
>> The simplest way to do that is to extend libsanitizer/merge.sh
>
> Sure.  I've done so far just a partial merge by hand (only 3 changed files
> for the minimum of changes required to get ubsan to build), and have tested
> just that it compiles, not that libubsan actually works.
>
> P1 patch is the toplevel stuff to add ubsan into GCC libsanitizer, plus
> ubsan/Makefile* and ubsan/libtool-version (i.e. gcc owned files).

The trivial patch to merge.sh is ok.
The partial merge is ok if it doesn't break asan/tsan build.

> P2 is the actual merge of the ubsan files.

Ok too.

> P3 is something I'd propose for ubsan upstream, without it g++ warns about
> __int128 in -pedantic mode.

Looks good.

rich...@metafoo.co.uk, do you agree to apply this upstream?

ubsan/ubsan_value.h:
-typedef __int128 s128;
-typedef unsigned __int128 u128;
+__extension__ typedef __int128 s128;
+__extension__ typedef unsigned __int128 u128;
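
A minimal sketch of what the P3 change buys, assuming a 64-bit target where GCC provides __int128. The helpers mul_u64_wide and high64 are hypothetical and not part of ubsan. With -pedantic the un-prefixed typedefs are diagnosed because __int128 is not an ISO type; __extension__ marks the declaration as a deliberate use of a GNU extension, and later uses of the typedef names are not diagnosed:

```c
#include <stdint.h>

/* Without __extension__, -pedantic diagnoses __int128 as a non-ISO type. */
__extension__ typedef __int128 s128;
__extension__ typedef unsigned __int128 u128;

/* Hypothetical helper (not from ubsan): full 64x64 -> 128-bit product. */
static u128 mul_u64_wide(uint64_t a, uint64_t b)
{
    return (u128)a * (u128)b;
}

/* Hypothetical helper: upper 64 bits of a 128-bit value. */
static uint64_t high64(u128 v)
{
    return (uint64_t)(v >> 64);
}
```

Compiling this with and without the __extension__ prefix under -pedantic should show the difference in diagnostics.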

--kcc


>
> Jakub


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Jakub Jelinek
On Thu, Jun 06, 2013 at 12:41:56PM +0400, Konstantin Serebryany wrote:
> As for libstdc++, I completely agree, we don't want to depend on it,
> and we don't.

ubsan actually needs
 U _ZTIN10__cxxabiv117__class_type_infoE@@CXXABI_1.3
 U _ZTIN10__cxxabiv120__si_class_type_infoE@@CXXABI_1.3
 U _ZTIN10__cxxabiv121__vmi_class_type_infoE@@CXXABI_1.3
 U _ZTISt9type_info@@GLIBCXX_3.4
 U __dynamic_cast@@CXXABI_1.3
plus all the libs have:
 w __cxa_demangle@@CXXABI_1.3

Jakub


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Konstantin Serebryany
On Thu, Jun 6, 2013 at 12:26 PM, Andrew Pinski  wrote:
> On Thu, Jun 6, 2013 at 1:21 AM, Jakub Jelinek  wrote:
>> On Thu, Jun 06, 2013 at 11:46:19AM +0400, Konstantin Serebryany wrote:
>>> If we are going to import the ubsan run-time from LLVM's
>>> projects/compiler-rt/lib/ubsan,
>>> we may also need to update the contents of
>>> libsanitizer/sanitizer_common and keep them in sync afterwards.
>>> (ubsan shares few bits of code with asan/tsan/msan)
>>> The simplest way to do that is to extend libsanitizer/merge.sh
>>
>> Sure.  I've done so far just a partial merge by hand (only 3 changed files
>> for the minimum of changes required to get ubsan to build), and have tested
>> just that it compiles, not that libubsan actually works.
>>
>> P1 patch is the toplevel stuff to add ubsan into GCC libsanitizer, plus
>> ubsan/Makefile* and ubsan/libtool-version (i.e. gcc owned files).
>> P2 is the actual merge of the ubsan files.
>> P3 is something I'd propose for ubsan upstream, without it g++ warns about
>> __int128 in -pedantic mode.
>
> Is there a reason why ubsan runtime in C++?  That seems like a bad
> idea to require linking against libstdc++ when doing development of a
> C only program.

for asan/tsan/msan/lsan the reason is that C++ is a better language
(in the author's humble opinion :).
For ubsan, I think the reason is the same, plus ubsan shares some C++
code with asan/tsan/msan/lsan.

As for libstdc++, I completely agree, we don't want to depend on it,
and we don't.
None of the sanitizer run-times uses C++ features that require libstdc++.

--kcc

>
> Also it seems easy enough to write a GCC specific runtime that does
> not depend on the rest of libsanitizer stuff anyways.
>
> Thanks,
> Andrew Pinski


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Jakub Jelinek
On Thu, Jun 06, 2013 at 01:26:06AM -0700, Andrew Pinski wrote:
> On Thu, Jun 6, 2013 at 1:21 AM, Jakub Jelinek  wrote:
> > On Thu, Jun 06, 2013 at 11:46:19AM +0400, Konstantin Serebryany wrote:
> >> If we are going to import the ubsan run-time from LLVM's
> >> projects/compiler-rt/lib/ubsan,
> >> we may also need to update the contents of
> >> libsanitizer/sanitizer_common and keep them in sync afterwards.
> >> (ubsan shares few bits of code with asan/tsan/msan)
> >> The simplest way to do that is to extend libsanitizer/merge.sh
> >
> > Sure.  I've done so far just a partial merge by hand (only 3 changed files
> > for the minimum of changes required to get ubsan to build), and have tested
> > just that it compiles, not that libubsan actually works.
> >
> > P1 patch is the toplevel stuff to add ubsan into GCC libsanitizer, plus
> > ubsan/Makefile* and ubsan/libtool-version (i.e. gcc owned files).
> > P2 is the actual merge of the ubsan files.
> > P3 is something I'd propose for ubsan upstream, without it g++ warns about
> > __int128 in -pedantic mode.
> 
> Is there a reason why ubsan runtime in C++?  That seems like a bad
> idea to require linking against libstdc++ when doing development of a
> C only program.

-fsanitize=undefined etc. are debugging modes, not something meant for
release versions of programs, so I think it is not a big deal.
C++ is the implementation language for the libraries; why exactly we
link libasan and libtsan against -lstdc++ I don't really remember.  Maybe
we don't have to: it is compiled with -fno-exceptions and doesn't use any
libstdc++ symbols.  libubsan apparently has two files which actually use
some libstdc++ symbols and are for some C++ sanitization.
BTW, all the libs link against -ldl too (not that big a deal) and -lpthread
(IMHO a more serious problem than -lstdc++).

> Also it seems easy enough to write a GCC specific runtime that does
> not depend on the rest of libsanitizer stuff anyways.

We already have libasan and libtsan in gcc, and ubsan is just a thin layer on
top of the sanitizer_common infrastructure.  We'd have to write from scratch
not just the handlers, but some infrastructure too, for what gain?

Jakub


*PING* / Re: [Patch, Fortran] Finalize nonallocatables with INTENT(out)

2013-06-06 Thread Tobias Burnus

* PING *

Attached is a rediff - including the later posted additional test case 
(http://gcc.gnu.org/ml/fortran/2013-05/msg00141.html)



On May 31, 2013 18:39, Tobias Burnus wrote:
This patch adds finalization support for INTENT(out) for 
nonallocatable dummy arguments.


Additionally, it addresses a missed optimization: The previous code 
tried to deallocate allocatable components even if the dummy argument 
was already an allocatable. That's a missed optimization as gfortran 
deallocates allocatables in the caller.


OK for the trunk?

Note: This patch depends on 
http://gcc.gnu.org/ml/fortran/2013-05/msg00134.html


Tobias

PS: There are many more places where finalization should happen, e.g. 
intrinsic assignment (LHS + RHS func/constructor finalization), 
end-of-scope of nonallocatables. And some issues related to coarrays, 
elemental+optional, etc.
However, I stop here for the moment as I am running out of time, and 
writing patches on top of patches that are not yet reviewed/committed 
starts to become a chore.


2013-06-06  Tobias Burnus  

	PR fortran/37336
	* trans-decl.c (init_intent_out_dt): Call finalizer
	when appropriate.

2013-06-06  Tobias Burnus  

	PR fortran/37336
	* gfortran.dg/finalize_10.f90: New.
	* gfortran.dg/auto_dealloc_2.f90: Update tree-dump.
	* gfortran.dg/finalize_15.f90: New.

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index b0e3ffc..72bb23f 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -3501,38 +3503,57 @@ init_intent_out_dt (gfc_symbol * proc_sym, gfc_wrapped_block * block)
 	&& !f->sym->attr.pointer
 	&& f->sym->ts.type == BT_DERIVED)
   {
-	if (f->sym->ts.u.derived->attr.alloc_comp && !f->sym->value)
+	tmp = NULL_TREE;
+
+	/* Note: Allocatables are excluded as they are already handled
+	   by the caller.  */
+	if (!f->sym->attr.allocatable
+	&& gfc_is_finalizable (f->sym->ts.u.derived, NULL))
 	  {
-	tmp = gfc_deallocate_alloc_comp (f->sym->ts.u.derived,
-	 f->sym->backend_decl,
-	 f->sym->as ? f->sym->as->rank : 0);
+	stmtblock_t block;
+	gfc_expr *e;
+
+	gfc_init_block (&block);
+	f->sym->attr.referenced = 1;
+	e = gfc_lval_expr_from_sym (f->sym);
+	gfc_add_finalizer_call (&block, e);
+	gfc_free_expr (e);
+	tmp = gfc_finish_block (&block);
+	  }
 
-	if (f->sym->attr.optional
-		|| f->sym->ns->proc_name->attr.entry_master)
-	  {
-		present = gfc_conv_expr_present (f->sym);
-		tmp = build3_loc (input_location, COND_EXPR, TREE_TYPE (tmp),
-  present, tmp,
-  build_empty_stmt (input_location));
-	  }
+	if (tmp == NULL_TREE && !f->sym->attr.allocatable
+	&& f->sym->ts.u.derived->attr.alloc_comp && !f->sym->value)
+	  tmp = gfc_deallocate_alloc_comp (f->sym->ts.u.derived,
+	   f->sym->backend_decl,
+	   f->sym->as ? f->sym->as->rank : 0);
 
-	gfc_add_expr_to_block (&init, tmp);
+	if (tmp != NULL_TREE && (f->sym->attr.optional
+ || f->sym->ns->proc_name->attr.entry_master))
+	  {
+	present = gfc_conv_expr_present (f->sym);
+	tmp = build3_loc (input_location, COND_EXPR, TREE_TYPE (tmp),
+			  present, tmp, build_empty_stmt (input_location));
 	  }
-   else if (f->sym->value)
+
+	if (tmp != NULL_TREE)
+	  gfc_add_expr_to_block (&init, tmp);
+	else if (f->sym->value && !f->sym->attr.allocatable)
 	  gfc_init_default_dt (f->sym, &init, true);
   }
 else if (f->sym && f->sym->attr.intent == INTENT_OUT
 	 && f->sym->ts.type == BT_CLASS
 	 && !CLASS_DATA (f->sym)->attr.class_pointer
-	 && CLASS_DATA (f->sym)->ts.u.derived->attr.alloc_comp)
+	 && !CLASS_DATA (f->sym)->attr.allocatable)
   {
-	tmp = gfc_class_data_get (f->sym->backend_decl);
-	if (CLASS_DATA (f->sym)->as == NULL)
-	  tmp = build_fold_indirect_ref_loc (input_location, tmp);
-	tmp = gfc_deallocate_alloc_comp (CLASS_DATA (f->sym)->ts.u.derived,
-	 tmp,
-	 CLASS_DATA (f->sym)->as ?
-	 CLASS_DATA (f->sym)->as->rank : 0);
+	stmtblock_t block;
+	gfc_expr *e;
+
+	gfc_init_block (&block);
+	f->sym->attr.referenced = 1;
+	e = gfc_lval_expr_from_sym (f->sym);
+	gfc_add_finalizer_call (&block, e);
+	gfc_free_expr (e);
+	tmp = gfc_finish_block (&block);
 
 	if (f->sym->attr.optional || f->sym->ns->proc_name->attr.entry_master)
 	  {
--- /dev/null	2013-06-06 09:52:08.544104880 +0200
+++ gcc/gcc/testsuite/gfortran.dg/finalize_10.f90	2013-06-03 12:32:38.763008261 +0200
@@ -0,0 +1,39 @@
+! { dg-do compile }
+! { dg-options "-fdump-tree-original" }
+!
+! PR fortran/37336
+!
+! Finalize nonallocatable INTENT(OUT)
+!
+module m
+  type t
+  end type t
+  type t2
+  contains
+final :: fini
+  end type t2
+contains
+  elemental subroutine fini(var)
+type(t2), intent(inout) :: var
+  end subroutine fini
+end module m
+
+subroutine foo(x,y,aa,bb)
+  use m
+  class(t), intent(out) :: x(:),y
+  type(t2), intent(out) :: aa(:),bb
+end subroutine foo
+
+! Finalize CLASS + set default init
+! { dg-final { scan-tree-dump-times "y->_vptr->_final \\(&desc.\[0-9\

[linaro/gcc-4_8-branch] Backports from trunk and merge from gcc-4_8-branch

2013-06-06 Thread Christophe Lyon
Hi,

I have just backported the following revisions from trunk to linaro/gcc-4_8-branch:
r198970 (as r199696),
r199241 (as r199700),
r198497-198500 (as r199703),
r198680 (as r199710),
r198928,198973,199203 (as r199718)

I have also merged the gcc-4_8-branch into linaro/gcc-4_8-branch up to
revision r199609 (as r199719).

Thanks,

Christophe.


*ping* / Re: [Patch, Fortran] PR57508 - Fix ICE/Reject-valid issue with get_temp_from_expr (intrinsic assignment with defined assignment)

2013-06-06 Thread Tobias Burnus

Early *ping*.
http://gcc.gnu.org/ml/fortran/2013-06/msg00027.html

Tobias Burnus wrote:

Dear all,

Due to copying the attributes, the temporary variable could get marked 
as function (attr.function, attr.flavor == FL_PROCEDURE). This either 
led to those attributes leaking into the assembler file or caused 
an error due to the call to gfc_add_flavor. With this patch, I now 
explicitly unset those attributes.  (Found when building ForTrilinos.)
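
As a rough illustration only, the copy-then-reset idea can be modelled in a few lines of C. This is a toy and not gfortran's real symbol_attribute: the names toy_attr, toy_flavor and attr_for_temporary are hypothetical, and resetting the flavor to "unknown" is an assumption about what unsetting means here:

```c
/* Toy model, not gfortran's real attribute structure. */
enum toy_flavor { TOY_FL_UNKNOWN, TOY_FL_VARIABLE, TOY_FL_PROCEDURE };

struct toy_attr {
    unsigned function : 1;
    unsigned dimension : 1;
    enum toy_flavor flavor;
};

/* Copy the source expression's attributes for a temporary, then clear the
   ones that only make sense for procedures.  */
static struct toy_attr attr_for_temporary(struct toy_attr src)
{
    struct toy_attr tmp = src;      /* wholesale copy keeps e.g. dimension */
    tmp.function = 0;               /* a temporary is never a function */
    tmp.flavor = TOY_FL_UNKNOWN;    /* assumed: flavor is set afresh later */
    return tmp;
}
```

The point of the sketch is that a wholesale copy is convenient but carries over procedure-only flags, so they have to be cleared explicitly.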


Build and
OK for the trunk and GCC 4.8?

Tobias




Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Andrew Pinski
On Thu, Jun 6, 2013 at 1:21 AM, Jakub Jelinek  wrote:
> On Thu, Jun 06, 2013 at 11:46:19AM +0400, Konstantin Serebryany wrote:
>> If we are going to import the ubsan run-time from LLVM's
>> projects/compiler-rt/lib/ubsan,
>> we may also need to update the contents of
>> libsanitizer/sanitizer_common and keep them in sync afterwards.
>> (ubsan shares few bits of code with asan/tsan/msan)
>> The simplest way to do that is to extend libsanitizer/merge.sh
>
> Sure.  I've done so far just a partial merge by hand (only 3 changed files
> for the minimum of changes required to get ubsan to build), and have tested
> just that it compiles, not that libubsan actually works.
>
> P1 patch is the toplevel stuff to add ubsan into GCC libsanitizer, plus
> ubsan/Makefile* and ubsan/libtool-version (i.e. gcc owned files).
> P2 is the actual merge of the ubsan files.
> P3 is something I'd propose for ubsan upstream, without it g++ warns about
> __int128 in -pedantic mode.

Is there a reason why the ubsan runtime is in C++?  It seems like a bad
idea to require linking against libstdc++ when doing development of a
C-only program.

Also it seems easy enough to write a GCC specific runtime that does
not depend on the rest of libsanitizer stuff anyways.

Thanks,
Andrew Pinski


Re: [GOOGLE] More strict checking for call args

2013-06-06 Thread Richard Biener
On Wed, Jun 5, 2013 at 6:11 PM, Xinliang David Li  wrote:
> Right, except that in the context of FDO/autoFDO, where this happens
> the most (note in FDO case, it can happen with fresh profile too for
> multi-threaded programs), it is not that important to handle -- the
> mismatch path will never be executed, so why bother to inline and
> bloat the code for it?

It's about being able to IPA-CP through such calls.  Consider

void foo (int);
void bar (int i, float)
{
  foo (i);
}
void foobar ()
{
  bar (1);  // mismatched # of arguments but we can see foo() is
            // called with constant 1
}

That's important if, for example, we can prove that all calls to foo ()
pass a constant 1 argument and thus we can replace it without cloning.
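
To make the IPA-CP point concrete, here is a hedged sketch in plain C of the source-level effect. The function bodies and the names foo_body, foo_specialized and run_callers are made up for illustration, and the argument counts are kept matching so that the snippet is valid C. If every caller is known to pass 1, the parameter can be replaced by that constant inside the function itself, so no clone is needed:

```c
/* Hypothetical body for foo; the mail only declares it. */
static int foo_body(int i)
{
    return i * 10 + 2;
}

/* What the optimizer may effectively produce when every call passes 1:
   the parameter is replaced by the constant and folded.  */
static int foo_specialized(void)
{
    return 1 * 10 + 2;
}

/* bar forwards its first argument, so a constant at bar's call sites
   is also a constant at foo's call site.  */
static int bar(int i, float unused)
{
    (void)unused;
    return foo_body(i);
}

/* Every call passes the constant 1 as the first argument. */
static int run_callers(void)
{
    return bar(1, 0.0f) + bar(1, 2.5f);
}
```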

If the compatibility predicate guards even the IPA-CP case (not just the
inlining itself) then the predicate is used in a bogus way.

So your symptom is not properly fixed with the patch but papered over.
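
As a toy illustration (not GCC code) of guarding the use rather than the predicate: struct call_site, struct callee and known_arg_for_param below are hypothetical names, and the array stands in for the vectors indexed in the quoted backtrace. The only point is that a lookup keyed by the callee's parameter index has to tolerate a call that passes fewer or more arguments than the declaration has parameters:

```c
#include <stddef.h>

#define NO_KNOWN_VALUE (-1)

/* Toy stand-in for a call statement: values known at the call site. */
struct call_site {
    const int *known_args;  /* one entry per actual argument */
    size_t n_args;
};

/* Toy stand-in for the callee's declaration. */
struct callee {
    size_t n_params;
};

/* Return the value known for parameter PARAM, or NO_KNOWN_VALUE when the
   call does not supply that argument or the index is not a parameter.  */
static int known_arg_for_param(const struct call_site *cs,
                               const struct callee *fn, size_t param)
{
    if (param >= fn->n_params || param >= cs->n_args)
        return NO_KNOWN_VALUE;   /* mismatch: give up instead of indexing */
    return cs->known_args[param];
}
```

With a check of this shape the mismatched call simply yields no propagated value, while well-formed calls are handled as before.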

Richard.

> if (fptr_new == func_old) {
>   func_old (ptr); <--- do not want to inline.
> }
> else
>fptr_new(ptr);
>
> David
>
>
> On Wed, Jun 5, 2013 at 1:31 AM, Richard Biener
>  wrote:
>> On Wed, Jun 5, 2013 at 2:19 AM, Dehao Chen  wrote:
>>> attached is a testcase that would cause problem when source has changed:
>>>
>>> $ g++ test.cc -O2 -fprofile-generate -DOLD
>>> $ ./a.out
>>> $ g++ test.cc -O2 -fprofile-use
>>> test.cc:34:1: internal compiler error: in operator[], at vec.h:815
>>>  }
>>>  ^
>>> 0x512740 vec::operator[](unsigned int)
>>> ../../gcc/vec.h:815
>>> 0x512740 vec::operator[](unsigned int)
>>> ../../gcc/vec.h:1244
>>> 0xf24464 vec::operator[](unsigned int)
>>> ../../gcc/vec.h:815
>>> 0xf24464 vec::operator[](unsigned int)
>>> ../../gcc/vec.h:1244
>>> 0xf24464 ipa_get_indirect_edge_target_1
>>> ../../gcc/ipa-cp.c:1535
>>
>> This use needs to be properly guarded.  We can perfectly well have
>> mismatching fndecl nodes in gimple calls.  If we start with
>>
>> void fn(int, int, int);
>>
>> ...
>> void (*x)(float, double, struct X, int) = fn;
>> (*x)(1., 2., {}, 1);
>>
>> the GIMPLE_CALL receives the function type effective for the call
>> from the source (gimple_call_fntype).  Then CCP happily propagates the
>> 'fn' decl and we end up with
>>
>>   fn (1., 2., {}, 1);
>>
>> that is, gimple_call_fndecl is 'fn' but gimple_call_fntype is still
>> void (*x)(float, double, struct X, int)!
>>
>> So the solution is not to fix the argument verification predicate but to make
>> code aware of the fact that for the call statement gimple_call_fntype is
>> relevant for what is a valid call (that's also what is verified against in
>> verify_stmts) even though the ultimate called function-decl 'fn' has a
>> different prototype.  Thus any code propagating from a call site to
>> the callee has to deal with mismatches.
>>
>> Richard.
>>
>>> 0x971b9a estimate_edge_devirt_benefit
>>> ../../gcc/ipa-inline-analysis.c:2757
>>> 0x973f59 estimate_edge_size_and_time
>>> ../../gcc/ipa-inline-analysis.c:2789
>>> 0x973f59 estimate_calls_size_and_time
>>> ../../gcc/ipa-inline-analysis.c:2842
>>> 0x97429f estimate_node_size_and_time
>>> ../../gcc/ipa-inline-analysis.c:2929
>>> 0x976077 do_estimate_edge_size(cgraph_edge*)
>>> ../../gcc/ipa-inline-analysis.c:3472
>>> 0x97614f estimate_edge_size
>>> ../../gcc/ipa-inline.h:274
>>> 0x97614f estimate_edge_growth
>>> ../../gcc/ipa-inline.h:286
>>> 0x97614f do_estimate_growth_1
>>> ../../gcc/ipa-inline-analysis.c:3582
>>> 0x7e41df cgraph_for_node_and_aliases
>>> ../../gcc/cgraph.c:1777
>>> 0x976675 do_estimate_growth(cgraph_node*)
>>> ../../gcc/ipa-inline-analysis.c:3596
>>> 0xf314ea estimate_growth
>>> ../../gcc/ipa-inline.h:261
>>> 0xf314ea inline_small_functions
>>> ../../gcc/ipa-inline.c:1432
>>> 0xf314ea ipa_inline
>>> ../../gcc/ipa-inline.c:1797
>>> Please submit a full bug report,
>>> with preprocessed source if appropriate.
>>> Please include the complete backtrace with any bug report.
>>> See  for instructions.


Re: [RFC] Implement Undefined Behavior Sanitizer

2013-06-06 Thread Konstantin Serebryany
On Wed, Jun 5, 2013 at 11:40 PM, Andrew Pinski  wrote:
> On Wed, Jun 5, 2013 at 12:23 PM, Jakub Jelinek  wrote:
>> On Wed, Jun 05, 2013 at 11:44:07AM -0700, Andrew Pinski wrote:
>>> On Wed, Jun 5, 2013 at 10:57 AM, Marek Polacek  wrote:
>>> > Comments, please?
>>> I think it might be better to do handle this while gimplification
>>> happens rather than while parsing.  The main reason is that constexpr
>>> might fail due to the added function calls.
>>
>> Gimplification is too late, the FEs perform various operation shortenings
>> etc. in many cases, and what exactly is undefined behavior is apparently
>> heavily dependent on the particular language (C has different rules from
>> C++).  Yes, constexpr is something to consider in this light, but not
>> something that can't be handled (recognizing ubsan builtins and just
>> handling them specially).
>>
>>> Also please don't shorten file names like ubsan,  we already have file
>>> names which don't fit in the older POSIX tar format and needs extended
>>> length support.
>>
>> We already have asan.c and tsan.c, and that is how it is commonly called.
>
> Can we just move them to array-sanitizer and thread-sanitizer?  I

s/array-sanitizer/address-sanitizer/

If we are going to import the ubsan run-time from LLVM's
projects/compiler-rt/lib/ubsan,
we may also need to update the contents of
libsanitizer/sanitizer_common and keep them in sync afterwards.
(ubsan shares few bits of code with asan/tsan/msan)
The simplest way to do that is to extend libsanitizer/merge.sh

--kcc


> think those are better names than asan and tsan.  Shorten names are
> not useful when a new person is learning the code.
>
> Thanks,
> Andrew
>
>>
>> Jakub