Re: [PATCH, rs6000, testsuite, PR65456] Changes for unaligned vector load/store support on POWER8

2015-04-27 Thread Bin.Cheng
On Mon, Mar 30, 2015 at 1:42 AM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This is a follow-up to
 https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00103.html, which adds
 support for faster unaligned vector memory accesses on POWER8.  As
 pointed out in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65456, there
 was a piece missing here.  The target macro SLOW_UNALIGNED_ACCESS is
 still evaluating to 1 for unaligned vector accesses on POWER8, which
 causes some scalarization to occur during expand.  This version of the
 patch fixes this as well.

 The only changes from before are the update to config/rs6000/rs6000.h,
 and the new test case gcc.target/powerpc/pr65456.c.  Is this ok for
 trunk after 5 branches, and backports to 4.8, 4.9, 5 thereafter?

 Thanks,
 Bill


 [gcc]

 2015-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (rs6000_option_override_internal):  For
 VSX + POWER8, enable TARGET_ALLOW_MOVMISALIGN and
 TARGET_EFFICIENT_UNALIGNED_VSX if not selected by command line
 option.  However, for -mno-allow-movmisalign, be sure to disable
 TARGET_EFFICIENT_UNALIGNED_VSX to avoid an ICE.
 (rs6000_builtin_mask_for_load): Return 0 for targets with
 efficient unaligned VSX accesses so that the vectorizer will use
 direct unaligned loads.
 (rs6000_builtin_support_vector_misalignment): Always return true
 for targets with efficient unaligned VSX accesses.
 (rs6000_builtin_vectorization_cost): Cost of unaligned loads and
 stores on targets with efficient unaligned VSX accesses is almost
 always the same as the cost of an aligned load or store, so model
 it that way.
 * config/rs6000/rs6000.h (SLOW_UNALIGNED_ACCESS): Evaluate to
 zero for unaligned vector accesses on POWER8.
 * config/rs6000/rs6000.opt (mefficient-unaligned-vector): New
 undocumented option.

 [gcc/testsuite]

 2015-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * gcc.dg/vect/bb-slp-24.c: Exclude test for POWER8.
 * gcc.dg/vect/bb-slp-25.c: Likewise.
 * gcc.dg/vect/bb-slp-29.c: Likewise.
 * gcc.dg/vect/bb-slp-32.c: Replace vect_no_align with
 vect_no_align  { ! vect_hw_misalign }.
 * gcc.dg/vect/bb-slp-9.c: Likewise.
 * gcc.dg/vect/costmodel/ppc/costmodel-slp-33.c: Exclude test for
 vect_hw_misalign.
 * gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c: Likewise.
 * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c: Adjust tests to
 account for POWER8, where peeling for alignment is not needed.
 * gcc.dg/vect/costmodel/ppc/costmodel-vect-outer-fir.c: Replace
 vect_no_align with vect_no_align  { ! vect_hw_misalign }.
 * gcc.dg.vect.if-cvt-stores-vect-ifcvt-18.c: Likewise.
 * gcc.dg/vect/no-scevccp-outer-6-global.c: Likewise.
 * gcc.dg/vect/no-scevccp-outer-6.c: Likewise.
 * gcc.dg/vect/no-vfa-vect-43.c: Likewise.
 * gcc.dg/vect/no-vfa-vect-57.c: Likewise.
 * gcc.dg/vect/no-vfa-vect-61.c: Likewise.
 * gcc.dg/vect/no-vfa-vect-depend-1.c: Likewise.
 * gcc.dg/vect/no-vfa-vect-depend-2.c: Likewise.
 * gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise.
 * gcc.dg/vect/pr16105.c: Likewise.
 * gcc.dg/vect/pr20122.c: Likewise.
 * gcc.dg/vect/pr33804.c: Likewise.
 * gcc.dg/vect/pr33953.c: Likewise.
 * gcc.dg/vect/pr56787.c: Likewise.
 * gcc.dg/vect/pr58508.c: Likewise.
 * gcc.dg/vect/slp-25.c: Likewise.
 * gcc.dg/vect/vect-105-bit-array.c: Likewise.
 * gcc.dg/vect/vect-105.c: Likewise.
 * gcc.dg/vect/vect-27.c: Likewise.
 * gcc.dg/vect/vect-29.c: Likewise.
 * gcc.dg/vect/vect-33.c: Exclude unaligned access test for
 POWER8.
 * gcc.dg/vect/vect-42.c: Replace vect_no_align with vect_no_align
  { ! vect_hw_misalign }.
 * gcc.dg/vect/vect-44.c: Likewise.
 * gcc.dg/vect/vect-48.c: Likewise.
 * gcc.dg/vect/vect-50.c: Likewise.
 * gcc.dg/vect/vect-52.c: Likewise.
 * gcc.dg/vect/vect-56.c: Likewise.
 * gcc.dg/vect/vect-60.c: Likewise.
 * gcc.dg/vect/vect-72.c: Likewise.
 * gcc.dg/vect/vect-75-big-array.c: Likewise.
 * gcc.dg/vect/vect-75.c: Likewise.
 * gcc.dg/vect/vect-77-alignchecks.c: Likewise.
 * gcc.dg/vect/vect-77-global.c: Likewise.
 * gcc.dg/vect/vect-78-alignchecks.c: Likewise.
 * gcc.dg/vect/vect-78-global.c: Likewise.
 * gcc.dg/vect/vect-93.c: Likewise.
 * gcc.dg/vect/vect-95.c: Likewise.
 * gcc.dg/vect/vect-96.c: Likewise.
 * gcc.dg/vect/vect-cond-1.c: Likewise.
 * gcc.dg/vect/vect-cond-3.c: Likewise.
 * gcc.dg/vect/vect-cond-4.c: Likewise.
 * gcc.dg/vect/vect-cselim-1.c: Likewise.
 * gcc.dg/vect/vect-multitypes-1.c: 

Minor tweak to size functions

2015-04-27 Thread Eric Botcazou
The size functions are used by the Ada compiler to factor out the large size 
computations in self-referential types (dynamic types whose size depends on a 
discriminant in the object).  We generate useless leaf size functions for size 
expressions involving discriminants:

_Global.Sz4_Q 
(ada__streams__stream_element_count___XDLU_0__9223372036854775807 p0)
{
  return (bitsizetype) p0 * 8;


_Global.Sz5_Q 
(ada__streams__stream_element_count___XDLU_0__9223372036854775807 p0)
{
  return (sizetype) p0;


_Global.Sz6_Q 
(ada__streams__stream_element_count___XDLU_0__9223372036854775807 p0)
{
  return (bitsizetype) p0 * 8;


_Global.Sz7_Q 
(ada__streams__stream_element_count___XDLU_0__9223372036854775807 p0)
{
  return (sizetype) p0;


The wrapped expressions are simple arithmetic operations so there is no real 
point in creating a size function for them.  That's already ensured for non-
leaf size functions (size functions calling other size functions) and it's 
immediate to extend it down to the leaves.

Tested on x86_64-suse-linux, applied on the mainline (this only affects the 
Ada compiler).  The tree-cfg.c hunk adds the missing curly bracket at the end 
in the above dumps, applied as obvious.


2015-04-27  Eric Botcazou  ebotca...@adacore.com

* stor-layout.c (self_referential_component_ref_p): New predicate.
(copy_self_referential_tree_r): Use it.
(self_referential_size): Punt for simple operations directly involving
self-referential component references.
* tree-cfg.c (dump_function_to_file): Add missing final curly bracket.


-- 
Eric BotcazouIndex: stor-layout.c
===
--- stor-layout.c	(revision 222439)
+++ stor-layout.c	(working copy)
@@ -127,6 +127,20 @@ variable_size (tree size)
 /* An array of functions used for self-referential size computation.  */
 static GTY(()) vectree, va_gc *size_functions;
 
+/* Return true if T is a self-referential component reference.  */
+
+static bool
+self_referential_component_ref_p (tree t)
+{
+  if (TREE_CODE (t) != COMPONENT_REF)
+return false;
+
+  while (REFERENCE_CLASS_P (t))
+t = TREE_OPERAND (t, 0);
+
+  return (TREE_CODE (t) == PLACEHOLDER_EXPR);
+}
+
 /* Similar to copy_tree_r but do not copy component references involving
PLACEHOLDER_EXPRs.  These nodes are spotted in find_placeholder_in_expr
and substituted in substitute_in_expr.  */
@@ -154,19 +168,10 @@ copy_self_referential_tree_r (tree *tp,
 }
 
   /* Default case: the component reference.  */
-  else if (code == COMPONENT_REF)
+  else if (self_referential_component_ref_p (*tp))
 {
-  tree inner;
-  for (inner = TREE_OPERAND (*tp, 0);
-	   REFERENCE_CLASS_P (inner);
-	   inner = TREE_OPERAND (inner, 0))
-	;
-
-  if (TREE_CODE (inner) == PLACEHOLDER_EXPR)
-	{
-	  *walk_subtrees = 0;
-	  return NULL_TREE;
-	}
+  *walk_subtrees = 0;
+  return NULL_TREE;
 }
 
   /* We're not supposed to have them in self-referential size trees
@@ -199,7 +204,7 @@ self_referential_size (tree size)
 
   /* Do not factor out simple operations.  */
   t = skip_simple_constant_arithmetic (size);
-  if (TREE_CODE (t) == CALL_EXPR)
+  if (TREE_CODE (t) == CALL_EXPR || self_referential_component_ref_p (t))
 return size;
 
   /* Collect the list of self-references in the expression.  */
Index: tree-cfg.c
===
--- tree-cfg.c	(revision 222439)
+++ tree-cfg.c	(working copy)
@@ -7441,7 +7441,11 @@ dump_function_to_file (tree fndecl, FILE
   else
 	{
 	  if (!ignore_topmost_bind)
-	fprintf (file, {\n);
+	{
+	  fprintf (file, {\n);
+	  /* No topmost bind, pretend it's ignored for later.  */
+	  ignore_topmost_bind = true;
+	}
 	  indent = 2;
 	}
 


Re: [ARM] Fix RTX cost for vector SET

2015-04-27 Thread Kyrill Tkachov

Hi Kugan,

On 25/04/15 00:30, Kugan wrote:

Thanks for the review. I have updated the patch based on the comments
with some other minor changes. Bootstrapped and regression tested on
aarch64-none-linux-gnu with no-new regressions. Is this OK for trunk?


Thanks,
Kugan


gcc/ChangeLog:

2015-04-24  Kugan Vivekanandarajah  kug...@linaro.org
Jim Wilson  jim.wil...@linaro.org

* config/arm/aarch-common-protos.h (struct mem_cost_table): Added
new  fields loadv and storev.
* config/aarch64/aarch64-cost-tables.h (thunderx_extra_costs):
Initialize loadv and storev.
* config/arm/aarch-cost-tables.h (generic_extra_costs): Likewise.
(cortexa53_extra_costs): Likewise.
(cortexa57_extra_costs): Likewise.
(xgene1_extra_costs): Likewise.
* config/aarch64/aarch64.c (aarch64_rtx_costs): Update vector
rtx_costs.


Due to the struct mem_cost_table update for aarch64, arm cost tables
also need to be updated. Please find the patch that does this.
Regression tested on arm-none-linux-gnu with no-new regressions. Is this
OK for trunk?

Thanks,
Kugan

gcc/ChangeLog:

2015-04-25  Kugan Vivekanandarajah  kug...@linaro.org

* config/arm/arm.c (cortexa9_extra_costs): Initialize loadv and
 storev.
(cortexa8_extra_costs): Likewise.
(cortexa5_extra_costs): Likewise.
(cortexa7_extra_costs): Likewise.
(cortexa12_extra_costs): Likewise.
(cortexa15_extra_costs): Likewise.
(v7m_extra_costs): Likewise.


This arm part looks ok to me FWIW (if the approach in the aarch64
patch is deemed ok by the aarch64 maintainers), it just syncs the
fields of the common cost struct between arm and aarch64.

Please only commit this if the aarch64 patch gets approved and
committed at the same time, otherwise we'll get a build failure.
Having a look at the arm rtx costs and seeing if we can improve them
in the same way as the aarch64 ones would be good as a follow up at some
point too.

Thanks,
Kyrill





Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Richard Biener
On Fri, 24 Apr 2015, Jeff Law wrote:

 On 02/17/2015 07:58 AM, Richard Biener wrote:
 [ Restarting a old thread... ]
 
  On a closer look the record_const_or_copy_1 hunk is redundant
  (record_equality is really a bit obfuscated...).
 Agreed.  I'm not entirely sure how it got to this point.
 
  And record_equality is where the SSA_NAME_VALUEs might end up
  forming chains?  At least because we might record a SSA_NAME_VALUE
  for sth that might be the target of a SSA_NAME_VALUE, as in
  
a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
if (b_2 == i_3(D))
  ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
  
  thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
  SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
  can't easily avoid that because we don't keep track of a
  reverse mapping (nor would we want to update that).
 Right.  In general, the fact that we're in SSA form means that we ought not
 get loops in the SSA_NAME_VALUE chain since there's a single static assignment
 and we'll visit it once.
 
 That nice model breaks down in two ways.  First we derive equivalences from
 equality conditions like you've shown above.  Second, when threading we can
 thread through a loop latch and start reprocessing a block.  The interaction
 between those two can result in loops in the value chain.
 
 What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
 is independent of the dominator walk (and optimizer). Instead of walking
 forward through the dominator tree recording key nuggets, then removing those
 nuggets as we pop back up the tree, we instead we start with the conditional
 and walk up the use-def chains, simplifying as we go, with the goal of
 simplifying to a constant, of course.
 
 You can see the beginnings of that with the FSM bits from Sebastian.
 
 Obviously until those bits are in place, we should try to clean up any warts
 in the current implementation.
 
  
  Btw, in record_equality, the == part of the = check for the loop
  depth will just randomly swap x and y while we should prefer
  IL canonical order.  If we can't rely on the input being in IL
  canonical order we should prepend the whole function with
  
if (tree_swap_operands_p (x, y, false))
  std::swap (x, y);
  
  and change = to  for the loop-depth check.
 Agreed.  Makes perfect sense.

I'm now re-bootstrapping and testing the following.

Richard.

2015-04-27  Richard Biener  rguent...@suse.de

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p.  Do not swap operands for same loop depth.

Index: gcc/tree-ssa-dom.c
===
*** gcc/tree-ssa-dom.c  (revision 222360)
--- gcc/tree-ssa-dom.c  (working copy)
*** record_equivalences_from_phis (basic_blo
*** 1519,1524 
--- 1519,1531 
  if (lhs == t)
continue;
  
+ /* Valueize t.  */
+ if (TREE_CODE (t) == SSA_NAME)
+   {
+ tree tmp = SSA_NAME_VALUE (t);
+ t = tmp ? tmp : t;
+   }
+ 
  /* If we have not processed an alternative yet, then set
 RHS to this alternative.  */
  if (rhs == NULL)
*** record_equality (tree x, tree y)
*** 1752,1757 
--- 1759,1767 
  {
tree prev_x = NULL, prev_y = NULL;
  
+   if (tree_swap_operands_p (x, y, false))
+ std::swap (x, y);
+ 
if (TREE_CODE (x) == SSA_NAME)
  prev_x = SSA_NAME_VALUE (x);
if (TREE_CODE (y) == SSA_NAME)
*** record_equality (tree x, tree y)
*** 1766,1772 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x) = loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x  is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
--- 1776,1782 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x)  loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x  is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
*** record_equivalences_from_stmt (gimple st
*** 2128,2145 
if (may_optimize_p
   (TREE_CODE (rhs) == SSA_NAME
  || is_gimple_min_invariant (rhs)))
!   {
!   if (dump_file  (dump_flags  TDF_DETAILS))
! {
!   fprintf (dump_file,  ASGN );
!   print_generic_expr (dump_file, lhs, 0);
!   fprintf (dump_file,  = );
!   print_generic_expr (dump_file, rhs, 0);
!   fprintf 

Re: Improve LTO type checking during symtab merging

2015-04-27 Thread Richard Biener
On Mon, 27 Apr 2015, Jan Hubicka wrote:

 Hi,
 this patch strengten ODR checking for requiring match on declarations
 (if both of them are defined in C++ tranlsation unit).  Currently
 ipa-symtab.c will warn only if declarations are incompatible in 
 useless_type_conversion sense.  Now we warn more often, i.e. in the following
 example:
 
 t.C:
 char a;
 
 t2.C:
 #include stdtype.h
 extern int8_t a;
 void *b=a;
 
 Will lead to warning:
 t2.C:2:15: warning: type of ???a??? does not match original declaration
  extern int8_t a;
^
 built-in: note: type name ???char??? should match type name ???signed 
 char???
 /usr/include/stdint.h:37:22: note: the incompatible type is defined here
  typedef signed char  int8_t;
   ^
 /aux/hubicka/t.C:1:6: note: previously declared here
  char a;
   ^
 
 Funnilly enough we do not get warning when t2.C is just signed char
 or unsigned char because that is builtin type and thus non-ODR. Something
 I need to deal with incrementally.  
 
 I also added call to warn_types_mismatch that outputs extra note about
 what is wrong with the type: in C++ the mismatch is often carried over
 several levels on declarations and it is hard to work out the reason
 without extra info.
 
 Richard, According to comments, it seems that the code was tailored to not
 output warning when we can avoid wrong code issues on mismatches based on fact
 that linker would be also silent. I am particularly entertained by the
 following hunk which I killed:
 
 -  if (!types_compatible_p (TREE_TYPE (prevailing_decl),
 -TREE_TYPE (decl)))
 - /* If we don't have a merged type yet...sigh.  The linker
 -wouldn't complain if the types were mismatched, so we
 -probably shouldn't either.  Just use the type from
 -whichever decl appears to be associated with the
 -definition.  If for some odd reason neither decl is, the
 -older one wins.  */
 - (void) 0;
 
 It is fun we go through the trouble of comparing types and then do nothing ;)
 Type merging is also completed at this time, so at least the comment looks
 out of date.

 I think as QOI issue, we do want to warn in cases the code is 
 inconsistent (it is pretty much surely a bug), but perhaps we may want 
 to have the warnings with -Wodr only and use slightly different warning 
 and different -W switch for warnings that unavoidably leads to wrong 
 code surprises?  This should be easy to implement by adding two levels 
 into new warn_type_compatibility_p predicate. I am personaly also OK 
 however with warning always or making all the warnings -Wodr.

 What do you think?

Only emitting the warnings with -Wodr looks good to me.  I can't see
how we can decide what cases lead to wrong code surprises and what not
(other than using types_compatible_p ...).  Wrong-code can only(?) happen
if we happen to inline in a way that makes the incosistency visible in
a single function.
 
 Incrementally I am heading towards proper definition of decl 
 compatibility that I can plug into symtab merging and avoid merging 
 incompatible decls (so FORTIFY_SOURCE works).

 lto-symtab and ipa-icf both have some knowledge of the problem, I want to get
 both right and factor out common logic.

Sounds good.
 
 Other improvement is to preserve the ODR type info when non-ODR variant
 previals so one can diagnose clash in between C++ units even in mixed
 language linktimes.

Hmm, but then merging ODR with non-ODR variant is already pointing to
a ODR violation?  So why do you need to retain that info?

Btw, there is that old PR41227 which has both wrong-code and diagnostic
impact...

Richard.

 Bootstrapped/regtested x86_64-linux with no new ODR warnings hit by -Werror.
 
 Honza
 
   * ipa-devirt.c (register_odr_type): Be ready for non-odr main variant
   of odr type.
   * ipa-utils.h (warn_types_mismatch): New.
   * lto/lto-symtab.c (warn_type_compatibility_p): Break out from ...
   (lto_symtab_merge): ... here.
   (lto_symtab_merge_decls_2): Improve warnings.
 
 Index: ipa-devirt.c
 ===
 --- ipa-devirt.c  (revision 222391)
 +++ ipa-devirt.c  (working copy)
 @@ -2029,10 +2030,14 @@ register_odr_type (tree type)
if (in_lto_p)
  odr_vtable_hash = new odr_vtable_hash_type (23);
  }
 -  /* Arrange things to be nicer and insert main variants first.  */
 -  if (odr_type_p (TYPE_MAIN_VARIANT (type)))
 +  /* Arrange things to be nicer and insert main variants first.
 + ??? fundamental prerecorded types do not have mangled names; this
 + makes it possible that non-ODR type is main_odr_variant of ODR type.
 + Things may get smoother if LTO FE set mangled name of those types same
 + way as C++ FE does.  */
 +  if (odr_type_p (main_odr_variant (TYPE_MAIN_VARIANT (type
  get_odr_type (TYPE_MAIN_VARIANT (type), true);
 -  if (TYPE_MAIN_VARIANT (type) != 

[committed] Make arguments to vec::splice const

2015-04-27 Thread Richard Sandiford
vec::splice doesn't modify its source vector, so mark the argument as const.

Tested on x86_64-linux-gnu.  Applied as obvious.

Richard


gcc/
* vec.h (vec): Make splice arguments const.  Update definitions
accordingly.

Index: gcc/vec.h
===
--- gcc/vec.h   2015-04-27 10:38:48.0 +0100
+++ gcc/vec.h   2015-04-27 10:41:57.859920289 +0100
@@ -483,8 +483,8 @@ struct GTY((user)) vecT, A, vl_embed
   bool iterate (unsigned, T *) const;
   bool iterate (unsigned, T **) const;
   vec *copy (ALONE_CXX_MEM_STAT_INFO) const;
-  void splice (vec );
-  void splice (vec *src);
+  void splice (const vec );
+  void splice (const vec *src);
   T *quick_push (const T );
   T pop (void);
   void truncate (unsigned);
@@ -705,7 +705,7 @@ vec_safe_copy (vecT, A, vl_embed *src
Reallocate DST, if necessary.  */
 templatetypename T, typename A
 inline void
-vec_safe_splice (vecT, A, vl_embed *dst, vecT, A, vl_embed *src
+vec_safe_splice (vecT, A, vl_embed *dst, const vecT, A, vl_embed *src
 CXX_MEM_STAT_INFO)
 {
   unsigned src_len = vec_safe_length (src);
@@ -836,7 +836,7 @@ vecT, A, vl_embed::copy (ALONE_MEM_STA
 
 templatetypename T, typename A
 inline void
-vecT, A, vl_embed::splice (vecT, A, vl_embed src)
+vecT, A, vl_embed::splice (const vecT, A, vl_embed src)
 {
   unsigned len = src.length ();
   if (len)
@@ -849,7 +849,7 @@ vecT, A, vl_embed::splice (vecT, A, v
 
 templatetypename T, typename A
 inline void
-vecT, A, vl_embed::splice (vecT, A, vl_embed *src)
+vecT, A, vl_embed::splice (const vecT, A, vl_embed *src)
 {
   if (src)
 splice (*src);
@@ -1212,8 +1212,8 @@ struct vecT, va_heap, vl_ptr
   vec copy (ALONE_CXX_MEM_STAT_INFO) const;
   bool reserve (unsigned, bool = false CXX_MEM_STAT_INFO);
   bool reserve_exact (unsigned CXX_MEM_STAT_INFO);
-  void splice (vec );
-  void safe_splice (vec  CXX_MEM_STAT_INFO);
+  void splice (const vec );
+  void safe_splice (const vec  CXX_MEM_STAT_INFO);
   T *quick_push (const T );
   T *safe_push (const T CXX_MEM_STAT_INFO);
   T pop (void);
@@ -1489,7 +1489,7 @@ vecT, va_heap, vl_ptr::release (void)
 
 templatetypename T
 inline void
-vecT, va_heap, vl_ptr::splice (vecT, va_heap, vl_ptr src)
+vecT, va_heap, vl_ptr::splice (const vecT, va_heap, vl_ptr src)
 {
   if (src.m_vec)
 m_vec-splice (*(src.m_vec));
@@ -1503,7 +1503,7 @@ vecT, va_heap, vl_ptr::splice (vecT,
 
 templatetypename T
 inline void
-vecT, va_heap, vl_ptr::safe_splice (vecT, va_heap, vl_ptr src
+vecT, va_heap, vl_ptr::safe_splice (const vecT, va_heap, vl_ptr src
  MEM_STAT_DECL)
 {
   if (src.length ())



Fix xstormy16 handling of HImode predicates

2015-04-27 Thread Richard Sandiford
xstormy16 has:

(define_predicate xs_hi_general_operand
  (match_code const_int,reg,subreg,mem,symbol_ref,label_ref,const)
{
  if ((GET_CODE (op) == CONST_INT)
((INTVAL (op) = 32768) || (INTVAL (op)  -32768)))
{
  error (constant halfword load operand out of range);
  return false;
}

  return general_operand (op, mode);
})

I think this is wrong for two reasons:

- Predicates are supposed to accept things in different modes and weed out
  the invalid values.  That's why the predicates start with a check that
  GET_MODE (op) == VOIDmode || GET_MODE (op) == mode.

  IMO it doesn't make sense to handle (e.g.) SImode REGs correctly but
  raise an error for SImode CONST_INTs.  The generic predicates later
  reject CONST_INTs that aren't suitable for the mode, so invalid
  HImode CONST_INTs will fail to match rather than be silently accepted.

- If we did have a check, it should be an assert rather than a user error.

The port is effectively relying on the matching algorithm in genrecog to
ensure that the predicate is only called on SET_SRC rtxes once the mode
of the SET_DEST has been checked.

This patch removes the predicate and uses general_operand instead.

xs_hi_nonmemory_operand is similar except that it rejects symbol_ref
and label_ref constants.  The patch therefore removes the error check
but keeps the restricted code list.

Note that we have an assert in rtl.h that CONST_INTs fit into the
range appropriate for the wi:: precision.  This should catch more
cases than the predicate checks would have.  (It has flagged up
problems that were latent before.)

Tested by building xstormy16-elf and making sure that there were
no differences in assembly output for the GCC testsuite.
OK to install?

Thanks,
Richard


gcc/
* config/stormy16/predicates.md (xs_hi_general_operand): Delete.
(xs_hi_nonmemory_operand): Remove error.
* config/stormy16/stormy16.md (movhi, movhi_internal): Use
general_operand rather than xs_hi_general_operand.

Index: gcc/config/stormy16/predicates.md
===
--- gcc/config/stormy16/predicates.md   2015-04-27 10:38:48.0 +0100
+++ gcc/config/stormy16/predicates.md   2015-04-27 10:40:00.709344494 +0100
@@ -151,28 +151,8 @@ (define_predicate xstormy16_carry_plus_
   (INTVAL (XEXP (op, 1))  -4 || INTVAL (XEXP (op, 1))  4));
 })
 
-(define_predicate xs_hi_general_operand
-  (match_code const_int,reg,subreg,mem,symbol_ref,label_ref,const)
-{
-  if ((GET_CODE (op) == CONST_INT)
-((INTVAL (op) = 32768) || (INTVAL (op)  -32768)))
-{
-  error (constant halfword load operand out of range);
-  return false;
-}
-
-  return general_operand (op, mode);
-})
-
 (define_predicate xs_hi_nonmemory_operand
   (match_code const_int,reg,subreg,const)
 {
-  if ((GET_CODE (op) == CONST_INT) 
-((INTVAL (op) = 32768) || (INTVAL (op)  -32768)))
-{
-  error (constant arithmetic operand out of range);
-  return false;
-}
-
   return nonmemory_operand (op, mode);
 })
Index: gcc/config/stormy16/stormy16.md
===
--- gcc/config/stormy16/stormy16.md 2015-04-27 10:38:48.0 +0100
+++ gcc/config/stormy16/stormy16.md 2015-04-27 10:40:00.709344494 +0100
@@ -185,7 +185,7 @@ (define_insn pophi1
 
 (define_expand movhi
   [(set (match_operand:HI 0 nonimmediate_nonstack_operand )
-   (match_operand:HI 1 xs_hi_general_operand ))]
+   (match_operand:HI 1 general_operand ))]
   
   { xstormy16_expand_move (HImode, operands[0], operands[1]);
 DONE;
@@ -193,7 +193,7 @@ (define_expand movhi
 
 (define_insn movhi_internal
   [(set (match_operand:HI 0 nonimmediate_nonstack_operand 
=r,m,e,e,T,r,S,W,e)
-   (match_operand:HI 1 xs_hi_general_operand  
r,e,m,L,L,i,i,ie,W))]
+   (match_operand:HI 1 general_operand
r,e,m,L,L,i,i,ie,W))]
   
   @
mov %0,%1



Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-27 Thread Kyrill Tkachov


On 22/04/15 12:51, Kyrill Tkachov wrote:

On 21/04/15 18:33, Kyrill Tkachov wrote:

On 21/04/15 15:09, Jeff Law wrote:

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:

From reading config/stormy16/stormy-abi it seems to me that we don't
pass arguments partially in stormy16, so this code would never be called
there. That leaves pa as the potential problematic target.
I don't suppose there's an easy way to test on pa? My checkout of binutils
doesn't seem to include a sim target for it.

No simulator, no machines in the testfarm, the box I had access to via
parisc-linux.org seems dead and my ancient PA overheats well before a
bootstrap could complete.  I often regret knowing about the backwards
way many things were done on the PA because it makes me think about
cases that only matter on dead architectures.

So what should be the action plan here? I can't add an assert on
positive result as a negative result is valid.

We want to catch the case where this would cause trouble on
pa, or change the patch until we're confident that it's fine
for pa.

That being said, reading the documentation of STACK_GROWS_UPWARD
and ARGS_GROW_DOWNWARD I'm having a hard time visualising a case
where this would cause trouble on pa.

Is the problem that in the function:

+/* Add SIZE to X and check whether it's greater than Y.
+   If it is, return the constant amount by which it's greater or smaller.
+   If the two are not statically comparable (for example, X and Y contain
+   different registers) return -1.  This is used in expand_push_insn to
+   figure out if reading SIZE bytes from location X will end up reading from
+   location Y.  */
+static int
+memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
+{
+  rtx tmp = plus_constant (Pmode, x, size);
+  rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);
+
+  if (!CONST_INT_P (sub))
+return -1;
+
+  return INTVAL (sub);
+}

for ARGS_GROW_DOWNWARD we would be reading 'backwards' from x,
so the function should something like the following?

static int
memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
{
#ifdef ARGS_GROW_DOWNWARD
 rtx tmp = plus_constant (Pmode, x, -size);
#else
 rtx tmp = plus_constant (Pmode, x, size);
#endif
 rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);

 if (!CONST_INT_P (sub))
   return -1;

#ifdef ARGS_GROW_DOWNWARD
 return INTVAL (-sub);
#else
 return INTVAL (sub);
#endif
}

now, say for x == sp + 4,  y == sp + 8, size == 16:
This would be a problematic case for arm, so this code on arm
(where ARGS_GROW_DOWNWARD is *not* defined) would return
12, which is the number of bytes that overlap.

On a target where ARGS_GROW_DOWNWARD is defined this would return
-20, meaning that no overlap occurs (because we read in the descending
direction from x, IIUC).

Hi Jeff,

Here's an attempt to make this more concrete.
Only the memory_load_overlap function has changed.
This time I tried to take into account the case when
ARGS_GROW_DOWNWARD.

Take the case where x == sp, y == sp + 8, size == 16.
For arm, this would return 8 as that is the number of bytes
that overlap. On pa, since ARGS_GROW_DOWNWARD is defined it
would return -1 as we're reading down from x rather than up
towards y.

In the case when x == sp + 8, y == sp, size == 16
This would return -1 on arm since we're reading upwards from x
and thefore no overlap would happen.

On pa, this would return 8, which I think is the right thing.
But again, I don't have access to any pa means of testing.

What do you think of this approach?


Hi Dave,

Would it be possible for you to test this patch on a 64-bit hppa
or at least bootstrap it?
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01288.html

There is a concern that it may potentially affect the passing of
complex arguments partially on the stack and partially in regs
on pa because of the way the args and stack grow on that target.

Unfortunately I don't have access to any hardware or simulators.
It would help a lot with getting this patch in.

Thanks,
Kyrill




Thanks,
Kyrill

P.S. I've included the testcase from Honggyu in the patch.

2015-04-22  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  PR target/65358
  * expr.c (memory_load_overlap): New function.
  (emit_push_insn): When pushing partial args to the stack would
  clobber the register part load the overlapping part into a pseudo
  and put it into the hard reg after pushing.

2015-04-22  Honggyu Kim  hong.gyu@lge.com

  PR target/65358
  * gcc.dg/pr65358.c: New test.



Thanks,
Kyrill


Jeff





RE: [PATCH] [1/2] [ARM] [libgcc] Support RTABI half-precision conversion functions.

2015-04-27 Thread Hale Wang
 -Original Message-
 From: Ramana Radhakrishnan [mailto:ramana@googlemail.com]
 Sent: Wednesday, April 22, 2015 5:00 PM
 To: Hale Wang
 Cc: Ramana Radhakrishnan; Joseph Myers; GCC Patches
 Subject: Re: [PATCH] [1/2] [ARM] [libgcc] Support RTABI half-precision
 conversion functions.
 
 On Wed, Apr 22, 2015 at 9:32 AM, Hale Wang hale.w...@arm.com wrote:
  -Original Message-
  From: Ramana Radhakrishnan [mailto:ramana@googlemail.com]
  Sent: Wednesday, April 22, 2015 3:50 PM
  To: Joseph Myers
  Cc: Hale Wang; GCC Patches
  Subject: Re: [PATCH] [1/2] [ARM] [libgcc] Support RTABI
  half-precision conversion functions.
 
  On Mon, Apr 13, 2015 at 12:25 PM, Joseph Myers
  jos...@codesourcery.com wrote:
   On Mon, 13 Apr 2015, Hale Wang wrote:
  
   Yes, you are right. It's my fault to add the only here. Thank
   you to point out this.
   Beside this, is this patch OK for you?
  
   I don't think it's a good idea for libgcc to include large pieces
   of assembly code generated by a compiler.  Just compile the code
   with whatever options are needed at the time libgcc is built -
   possibly with #if conditionals to allow compiling different versions of 
   the
 code.
 
  Indeed, just compile the code with option '-mfloat-abi=soft' at the time
 libgcc is build which can solve this problem.
 
 Or why not conditionally use the ``pcs'' attribute on the ARM port ?
 That then means you don't need options magic on top ?

OK. I think your suggestion can solve this problem more clearly. I will 
resubmit a patch later. I think we can discard this patch this time.
Thanks a lot.

Hale
 
 Ramana





RE: [PATCH 1/2, combine] Try REG_EQUAL for nonzero_bits

2015-04-27 Thread Thomas Preud'homme
 From: Jeff Law [mailto:l...@redhat.com]
 Sent: Saturday, April 25, 2015 2:57 AM

  +static rtx
  +sign_extend_short_imm (rtx src, machine_mode mode, unsigned int
 prec)
  +{
  +  if (GET_MODE_PRECISION (mode)  prec  CONST_INT_P (src)
  +   INTVAL (src)  0  val_signbit_known_set_p (mode, INTVAL
 (src)))
  +src = GEN_INT (INTVAL (src) | ~GET_MODE_MASK (mode));
 Can you go ahead and put each condition of the  on a separate line.
 It uses more vertical space, but IMHO makes this easier to read.As I
 said, it was a nit :-)

You're perfectly right. Anything that can improve readability of source code
is a good thing.

 
 OK with that fix.

Committed.

Best regards,

Thomas





Add a build version of inchash.o

2015-04-27 Thread Richard Sandiford
The generator patch that I'm about to post wants to use inchash.c,
so this patch adds it to the list of files that are built for the
build system as well as the host.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* inchash.h, inchash.c: Include bconfig.h for build objects.
* Makefile.in (build/inchash.o): New rule.

Index: gcc/inchash.h
===
--- gcc/inchash.h   2015-04-27 10:38:48.0 +0100
+++ gcc/inchash.h   2015-04-27 10:42:57.783191573 +0100
@@ -20,7 +20,11 @@ Software Foundation; either version 3, o
 #ifndef INCHASH_H
 #define INCHASH_H 1
 
+#ifdef GENERATOR_FILE
+#include bconfig.h
+#else
 #include config.h
+#endif
 #include system.h
 #include coretypes.h
 #include hashtab.h
Index: gcc/inchash.c
===
--- gcc/inchash.c   2015-04-27 10:38:48.0 +0100
+++ gcc/inchash.c   2015-04-27 10:42:57.783191573 +0100
@@ -17,7 +17,11 @@ Software Foundation; either version 3, o
 along with GCC; see the file COPYING3.  If not see
 http://www.gnu.org/licenses/.  */
 
+#ifdef GENERATOR_FILE
+#include bconfig.h
+#else
 #include config.h
+#endif
 #include system.h
 #include coretypes.h
 #include hashtab.h
Index: gcc/Makefile.in
===
--- gcc/Makefile.in 2015-04-27 10:38:48.0 +0100
+++ gcc/Makefile.in 2015-04-27 10:42:57.783191573 +0100
@@ -2434,6 +2434,8 @@ build/vec.o : vec.c $(BCONFIG_H) $(SYSTE
$(GGC_H) toplev.h $(DIAGNOSTIC_CORE_H)
 build/hash-table.o : hash-table.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h  \
$(HASH_TABLE_H) $(GGC_H) toplev.h $(DIAGNOSTIC_CORE_H)
+build/inchash.o : inchash.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h  \
+   $(HASHTAB_H) inchash.h
 build/gencondmd.o : build/gencondmd.c $(BCONFIG_H) $(SYSTEM_H) \
   coretypes.h $(GTM_H) insn-constants.h
\
   $(filter-out insn-flags.h, $(RTL_H) $(TM_P_H) $(FUNCTION_H) $(REGS_H) \



Re: Add a build version of inchash.o

2015-04-27 Thread Richard Biener
On Mon, Apr 27, 2015 at 12:09 PM, Richard Sandiford
richard.sandif...@arm.com wrote:
 The generator patch that I'm about to post wants to use inchash.c,
 so this patch adds it to the list of files that are built for the
 build system as well as the host.

 Tested on x86_64-linux-gnu.  OK to install?

Ok.

THanks,
Richard.

 Thanks,
 Richard


 gcc/
 * inchash.h, inchash.c: Include bconfig.h for build objects.
 * Makefile.in (build/inchash.o): New rule.

 Index: gcc/inchash.h
 ===
 --- gcc/inchash.h   2015-04-27 10:38:48.0 +0100
 +++ gcc/inchash.h   2015-04-27 10:42:57.783191573 +0100
 @@ -20,7 +20,11 @@ Software Foundation; either version 3, o
  #ifndef INCHASH_H
  #define INCHASH_H 1

 +#ifdef GENERATOR_FILE
 +#include bconfig.h
 +#else
  #include config.h
 +#endif
  #include system.h
  #include coretypes.h
  #include hashtab.h
 Index: gcc/inchash.c
 ===
 --- gcc/inchash.c   2015-04-27 10:38:48.0 +0100
 +++ gcc/inchash.c   2015-04-27 10:42:57.783191573 +0100
 @@ -17,7 +17,11 @@ Software Foundation; either version 3, o
  along with GCC; see the file COPYING3.  If not see
  http://www.gnu.org/licenses/.  */

 +#ifdef GENERATOR_FILE
 +#include bconfig.h
 +#else
  #include config.h
 +#endif
  #include system.h
  #include coretypes.h
  #include hashtab.h
 Index: gcc/Makefile.in
 ===
 --- gcc/Makefile.in 2015-04-27 10:38:48.0 +0100
 +++ gcc/Makefile.in 2015-04-27 10:42:57.783191573 +0100
 @@ -2434,6 +2434,8 @@ build/vec.o : vec.c $(BCONFIG_H) $(SYSTE
 $(GGC_H) toplev.h $(DIAGNOSTIC_CORE_H)
  build/hash-table.o : hash-table.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h  \
 $(HASH_TABLE_H) $(GGC_H) toplev.h $(DIAGNOSTIC_CORE_H)
 +build/inchash.o : inchash.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h  \
 +   $(HASHTAB_H) inchash.h
  build/gencondmd.o : build/gencondmd.c $(BCONFIG_H) $(SYSTEM_H) \
coretypes.h $(GTM_H) insn-constants.h  
   \
$(filter-out insn-flags.h, $(RTL_H) $(TM_P_H) $(FUNCTION_H) $(REGS_H) \



[PATCH][combine][obvious] Use std::swap instead of manually swapping

2015-04-27 Thread Kyrill Tkachov

Hi all,

This is another one of those std::swap patches, this time I was in the combine 
neighbourhood

Bootstrapped on x86_64 and tested on aarch64.
Precedents suggest these changes are considered obvious.
So I'll commit this in a couple of days unless someone objects.

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* combine.c (simplify_if_then_else): Use std::swap instead
of manually swapping.
(known_cond): Likewise.
(simplify_comparison): Likewise.
commit e8aa8f2ec22cf548bc05f0f8e56a13a2bdd1c228
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Fri Apr 24 17:47:14 2015 +0100

[combine][obvious] Use std::swap instead of manually swapping

diff --git a/gcc/combine.c b/gcc/combine.c
index 9f840cb..5f7cbc0 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -6185,7 +6185,7 @@ simplify_if_then_else (rtx x)
   if (false_code == EQ)
 	{
 	  swapped = 1, true_code = EQ, false_code = NE;
-	  temp = true_rtx, true_rtx = false_rtx, false_rtx = temp;
+	  std::swap (true_rtx, false_rtx);
 	}
 
   /* If we are comparing against zero and the expression being tested has
@@ -6250,7 +6250,7 @@ simplify_if_then_else (rtx x)
   SUBST (XEXP (x, 1), false_rtx);
   SUBST (XEXP (x, 2), true_rtx);
 
-  temp = true_rtx, true_rtx = false_rtx, false_rtx = temp;
+  std::swap (true_rtx, false_rtx);
   cond = XEXP (x, 0);
 
   /* It is possible that the conditional has been simplified out.  */
@@ -9022,7 +9022,6 @@ static rtx
 known_cond (rtx x, enum rtx_code cond, rtx reg, rtx val)
 {
   enum rtx_code code = GET_CODE (x);
-  rtx temp;
   const char *fmt;
   int i, j;
 
@@ -9062,7 +9061,7 @@ known_cond (rtx x, enum rtx_code cond, rtx reg, rtx val)
   else if (COMPARISON_P (x) || COMMUTATIVE_ARITH_P (x))
 {
   if (rtx_equal_p (XEXP (x, 0), val))
-	cond = swap_condition (cond), temp = val, val = reg, reg = temp;
+	cond = swap_condition (cond), std::swap (val, reg);
 
   if (rtx_equal_p (XEXP (x, 0), reg)  rtx_equal_p (XEXP (x, 1), val))
 	{
@@ -11454,7 +11453,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
  is already a constant integer.  */
   if (swap_commutative_operands_p (op0, op1))
 {
-  tem = op0, op0 = op1, op1 = tem;
+  std::swap (op0, op1);
   code = swap_condition (code);
 }
 
@@ -12341,7 +12340,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
   /* We may have changed the comparison operands.  Re-canonicalize.  */
   if (swap_commutative_operands_p (op0, op1))
 {
-  tem = op0, op0 = op1, op1 = tem;
+  std::swap (op0, op1);
   code = swap_condition (code);
 }
 


[PATCH][simplify-rtx][trivial] Use std::swap instead of manually swapping

2015-04-27 Thread Kyrill Tkachov

Hi all,

This patch replaces in simplify-rtx.c the manual swapping of values with 
std::swap.
Precedents suggest these are considered obvious changes.

Bootstrapped and tested on aarch64, x86_64.

Will commit as obvious in a couple of days if no one objects

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* simplify-rtx.c (simplify_gen_binary): Use std::swap instead
of manually swapping.
(simplify_associative_operation): Likewise.
(simplify_binary_operation): Likewise.
(simplify_plus_minus): Likewise.
(simplify_relational_operation): Likewise.
(simplify_ternary_operation): Likewise.
commit dbbb2823461181853a5605adc3cbbcab04dd00b1
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Fri Apr 24 16:54:49 2015 +0100

[simplify-rtx] Use std::swap instead of manually swapping

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 4a8b56b..2dc2b58 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -216,7 +216,7 @@ simplify_gen_binary (enum rtx_code code, machine_mode mode, rtx op0,
   /* Put complex operands first and constants second if commutative.  */
   if (GET_RTX_CLASS (code) == RTX_COMM_ARITH
swap_commutative_operands_p (op0, op1))
-tem = op0, op0 = op1, op1 = tem;
+std::swap (op0, op1);
 
   return gen_rtx_fmt_ee (code, mode, op0, op1);
 }
@@ -1926,9 +1926,7 @@ simplify_associative_operation (enum rtx_code code, machine_mode mode,
   if (! swap_commutative_operands_p (op1, op0))
 	return simplify_gen_binary (code, mode, op1, op0);
 
-  tem = op0;
-  op0 = op1;
-  op1 = tem;
+  std::swap (op0, op1);
 }
 
   if (GET_CODE (op0) == code)
@@ -1977,9 +1975,7 @@ simplify_binary_operation (enum rtx_code code, machine_mode mode,
   /* Make sure the constant is second.  */
   if (GET_RTX_CLASS (code) == RTX_COMM_ARITH
swap_commutative_operands_p (op0, op1))
-{
-  tem = op0, op0 = op1, op1 = tem;
-}
+std::swap (op0, op1);
 
   trueop0 = avoid_constant_pool_reference (op0);
   trueop1 = avoid_constant_pool_reference (op1);
@@ -4274,10 +4270,10 @@ simplify_plus_minus (enum rtx_code code, machine_mode mode, rtx op0,
 		  {
 		ncode = MINUS;
 		if (lneg)
-		  tem = lhs, lhs = rhs, rhs = tem;
+		  std::swap (lhs, rhs);
 		  }
 		else if (swap_commutative_operands_p (lhs, rhs))
-		  tem = lhs, lhs = rhs, rhs = tem;
+		  std::swap (lhs, rhs);
 
 		if ((GET_CODE (lhs) == CONST || CONST_INT_P (lhs))
 		 (GET_CODE (rhs) == CONST || CONST_INT_P (rhs)))
@@ -4467,7 +4463,7 @@ simplify_relational_operation (enum rtx_code code, machine_mode mode,
   /* For the following tests, ensure const0_rtx is op1.  */
   if (swap_commutative_operands_p (op0, op1)
   || (op0 == const0_rtx  op1 != const0_rtx))
-tem = op0, op0 = op1, op1 = tem, code = swap_condition (code);
+std::swap (op0, op1), code = swap_condition (code);
 
   /* If op0 is a compare, extract the comparison arguments from it.  */
   if (GET_CODE (op0) == COMPARE  op1 == const0_rtx)
@@ -4828,7 +4824,7 @@ simplify_const_relational_operation (enum rtx_code code,
   /* Make sure the constant is second.  */
   if (swap_commutative_operands_p (op0, op1))
 {
-  tem = op0, op0 = op1, op1 = tem;
+  std::swap (op0, op1);
   code = swap_condition (code);
 }
 
@@ -5189,7 +5185,7 @@ simplify_ternary_operation (enum rtx_code code, machine_mode mode,
   /* Canonicalize the two multiplication operands.  */
   /* a * -b + c  =  -b * a + c.  */
   if (swap_commutative_operands_p (op0, op1))
-	tem = op0, op0 = op1, op1 = tem, any_change = true;
+	std::swap (op0, op1), any_change = true;
 
   if (any_change)
 	return gen_rtx_FMA (mode, op0, op1, op2);


[PATCH][AArch64] Add alternative 'extr' pattern, calculate rtx cost properly

2015-04-27 Thread Kyrill Tkachov

Hi all,

We currently have a pattern that will recognise a particular combination of 
shifts and bitwise-or
as an extr instruction. However, the order of the shifts inside the IOR doesn't 
have
canonicalisation rules (see rev16 pattern for similar stuff).
This means that for code like:

unsigned long
foo (unsigned long a, unsigned long b)
{
  return (a  16) |  (b  48);
}

we will recognise the extr, but for the equivalent:
unsigned long
foo (unsigned long a, unsigned long b)
{
  return (b  48) | (a  16);
}

we won't, and we'll emit three instructions.
This patch adds the pattern for the alternative order of shifts and allows us 
to generate
for the above the code:
foo:
extrx0, x0, x1, 48
ret

The zero-extended version is added as well and the rtx costs function is 
updated to handle
all of these cases.

I've seen this pattern trigger in the gcc code itself in expmed.c where it 
eliminated a sequence
of orrs and shifts into an extr instruction!

Bootstrapped and tested on aarch64-linux.

Ok for trunk?

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/aarch64.md (*extrmode5_insn_alt): New pattern.
(*extrsi5_insn_uxtw_alt): Likewise.
* config/aarch64/aarch64.c (aarch64_extr_rtx_p): New function.
(aarch64_rtx_costs, IOR case): Use above to properly cost extr
operations.
commit d45e92b3b8c5837328b7b10682565cacfb566e5b
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Mon Mar 2 17:26:38 2015 +

[AArch64] Add alternative 'extr' pattern, calculate rtx cost properly

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 860a1dd..ef5a1e4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5438,6 +5438,51 @@ aarch64_frint_unspec_p (unsigned int u)
 }
 }
 
+/* Return true iff X is an rtx that will match an extr instruction
+   i.e. as described in the *extrmode5_insn family of patterns.
+   OP0 and OP1 will be set to the operands of the shifts involved
+   on success and will be NULL_RTX otherwise.  */
+
+static bool
+aarch64_extr_rtx_p (rtx x, rtx *res_op0, rtx *res_op1)
+{
+  rtx op0, op1;
+  machine_mode mode = GET_MODE (x);
+
+  *res_op0 = NULL_RTX;
+  *res_op1 = NULL_RTX;
+
+  if (GET_CODE (x) != IOR)
+return false;
+
+  op0 = XEXP (x, 0);
+  op1 = XEXP (x, 1);
+
+  if ((GET_CODE (op0) == ASHIFT  GET_CODE (op1) == LSHIFTRT)
+  || (GET_CODE (op1) == ASHIFT  GET_CODE (op0) == LSHIFTRT))
+{
+ /* Canonicalise locally to ashift in op0, lshiftrt in op1.  */
+  if (GET_CODE (op1) == ASHIFT)
+std::swap (op0, op1);
+
+  if (!CONST_INT_P (XEXP (op0, 1)) || !CONST_INT_P (XEXP (op1, 1)))
+return false;
+
+  unsigned HOST_WIDE_INT shft_amnt_0 = UINTVAL (XEXP (op0, 1));
+  unsigned HOST_WIDE_INT shft_amnt_1 = UINTVAL (XEXP (op1, 1));
+
+  if (shft_amnt_0  GET_MODE_BITSIZE (mode)
+   shft_amnt_0 + shft_amnt_1 == GET_MODE_BITSIZE (mode))
+{
+  *res_op0 = XEXP (op0, 0);
+  *res_op1 = XEXP (op1, 0);
+  return true;
+}
+}
+
+  return false;
+}
+
 /* Calculate the cost of calculating (if_then_else (OP0) (OP1) (OP2)),
storing it in *COST.  Result is true if the total cost of the operation
has now been calculated.  */
@@ -5968,6 +6013,16 @@ cost_plus:
 
   return true;
 }
+
+  if (aarch64_extr_rtx_p (x, op0, op1))
+{
+  *cost += rtx_cost (op0, IOR, 0, speed)
+   + rtx_cost (op1, IOR, 1, speed);
+  if (speed)
+*cost += extra_cost-alu.shift;
+
+  return true;
+}
 /* Fall through.  */
 case XOR:
 case AND:
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1a7f888..17a8755 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3597,6 +3597,21 @@ (define_insn *extrmode5_insn
   [(set_attr type shift_imm)]
 )
 
+;; There are no canonicalisation rules for ashift and lshiftrt inside an ior
+;; so we have to match both orderings.
+(define_insn *extrmode5_insn_alt
+  [(set (match_operand:GPI 0 register_operand =r)
+	(ior:GPI  (lshiftrt:GPI (match_operand:GPI 2 register_operand r)
+			(match_operand 4 const_int_operand n))
+		  (ashift:GPI (match_operand:GPI 1 register_operand r)
+			  (match_operand 3 const_int_operand n]
+  UINTVAL (operands[3])  GET_MODE_BITSIZE (MODEmode)
+(UINTVAL (operands[3]) + UINTVAL (operands[4])
+   == GET_MODE_BITSIZE (MODEmode))
+  extr\\t%w0, %w1, %w2, %4
+  [(set_attr type shift_imm)]
+)
+
 ;; zero_extend version of the above
 (define_insn *extrsi5_insn_uxtw
   [(set (match_operand:DI 0 register_operand =r)
@@ -3611,6 +3626,19 @@ (define_insn *extrsi5_insn_uxtw
   [(set_attr type shift_imm)]
 )
 
+(define_insn *extrsi5_insn_uxtw_alt
+  [(set (match_operand:DI 0 register_operand =r)
+	(zero_extend:DI
+	 (ior:SI (lshiftrt:SI (match_operand:SI 2 register_operand r)
+			   

Remove redudant tests in icf_handled_component_p

2015-04-27 Thread Eric Botcazou
handled_component_p comprises both REALPART_EXPR and IMAGPART_EXPR.

Tested on x86_64-suse-linux, applied on the mainline as obvious.


2015-04-27  Eric Botcazou  ebotca...@adacore.com

* ipa-icf.c (icf_handled_component_p): Remove redundant tests.


-- 
Eric BotcazouIndex: ipa-icf.c
===
--- ipa-icf.c	(revision 222439)
+++ ipa-icf.c	(working copy)
@@ -1806,9 +1806,8 @@ sem_function::icf_handled_component_p (t
 {
   tree_code tc = TREE_CODE (t);
 
-  return ((handled_component_p (t))
-	  || tc == ADDR_EXPR || tc == MEM_REF || tc == REALPART_EXPR
-	  || tc == IMAGPART_EXPR || tc == OBJ_TYPE_REF);
+  return (handled_component_p (t)
+	  || tc == ADDR_EXPR || tc == MEM_REF || tc == OBJ_TYPE_REF);
 }
 
 /* Basic blocks dictionary BB_DICT returns true if SOURCE index BB

Mostly rewrite genrecog

2015-04-27 Thread Richard Sandiford
I think it's been the case for a while that parallel builds of GCC tend
to serialise around the compilation of insn-recog.c, especially with
higher --enable-checking settings.  This patch tries to speed that
up by replacing most of genrecog with a new algorithm.

I think the main problems with the current code are:

1. Vector architectures have added lots of new instructions that have
   a similar shape and differ only in mode, code or unspec number.
   The current algorithm doesn't have any way of factoring out those
   similarities.

2. When matching a particular instruction, the current code examines
   everything about a SET_DEST before moving on to the SET_SRC.  This has
   two subproblems:

   2a. The destination of a SET isn't very distinctive.  It's usually
   just a register_operand, a memory_operand, a nonimmediate_operand
   or a flags register.  We therefore tend to backtrack to the
   SET_DEST a lot, oscillating between groups of instructions with
   the same kind of destination.

   2b. Backtracking through predicate checks is relatively expensive.
   It would be good to narrow down the shape of the instruction
   first and only then check the predicates.  (The backtracking is
   expensive in terms of insn-recog.o compile time too, both because
   we need to copy into argument registers and out of the result
   register, and because it adds more sites where spills are needed.)

3. The code keeps one local variable per rtx depth, so it ends up
   loading the same rtx many times over (mostly when backtracking).
   This is very expensive in rtl-checking builds because each XEXP
   includes a code check and a line-specific failure call.

   In principle the idea of having one local variable per depth
   is good.  But it was originally written that way when all optimisations
   were done at the rtl level and I imagine each local variable mapped
   to one pseudo register.  These days the statements that reload the
   value needed on backtracking lead to many more SSA names and phi
   statements than you'd get with just a single variable per position
   (loaded once, so naturally SSA).  There is still the potential benefit
   of avoiding having sibling rtxes live at once, but fixing (2) above
   reduces that problem.

Also, the code is all goto-based, which makes it rather hard to step through.

The patch deals with these as follows:

1. Detect subpatterns that differ only by mode, code and/or integer
   (e.g. unspec number) and split them out into a common routine.

2. Match the shape of the instruction first, in terms of codes,
   integers and vector lengths, and only then check the modes, predicates
   and dups.  When checking the shape, handle SET_SRCs before SET_DESTs.
   In practice this seems to greatly reduce the amount of backtracking.

3. Have one local variable per rtx position.  I tested the patch with
   and without the change and it helped a lot with rtl-checking builds
   without seeming to affect release builds much either way.

As far as debuggability goes, the new code avoids gotos and just
uses natural control flow.

The headline stat is that a stage 3 --enable-checking=yes,rtl,df
build of insn-recog.c on my box goes from 7m43s to 2m2s (using the
same stage 2 compiler).  The corresponding --enable-checking=release
change is from 49s to 24s (less impressive, as expected).

The patch seems to speed up recog.  E.g. the time taken to build
fold-const.ii goes from 6.74s before the patch to 6.69s after it;
not a big speed-up, but reproducible.

Here's a comparison of the number of lines of code in insn-recog.c
before and after the patch on one target per config/ CPU:

aarch64-linux-gnueabi   11552638169 :   33.04%
alpha-linux-gnu  2447910740 :   43.87%
arm-linux-gnueabi   16920867759 :   40.04%
avr-rtems5564722127 :   39.76%
bfin-elf 13928 6498 :   46.65%
c6x-elf  2992813324 :   44.52%
cr16-elf  2650 1419 :   53.55%
cris-elf 18669 7257 :   38.87%
epiphany-elf 19308 6131 :   31.75%
fr30-elf  2204 1112 :   50.45%
frv-linux-gnu13541 5950 :   43.94%
h8300-elf19584 9327 :   47.63%
hppa64-hp-hpux11.23  18299 8549 :   46.72%
ia64-linux-gnu   3762917101 :   45.45%
iq2000-elf2752 1609 :   58.47%
lm32-elf  1536  872 :   56.77%
m32c-elf 10040 4145 :   41.28%
m32r-elf 

[Patch, fortran] Simplify lbound for array subcomponents

2015-04-27 Thread Mikael Morin
Hello,

while reviewing Thomas' bound simplification patch, I noticed that the
{l,u}bound simplification code wasn't handling array subcomponents.
Fixed by the attached patch, regression tested.  OK for trunk?

Mikael





2015-04-27  Mikael Morin  mik...@gcc.gnu.org

* simplify.c (simplify_bound_dim): Tighten the check for array fullness
by also checking for absence of subreference.
(simplify_bound): Don't skip simplification if the array
has subreferences.
(simplify_cobound): Same.

2015-04-27  Mikael Morin  mik...@gcc.gnu.org

* gfortran.dg/bound_simplify_4.f90: New.

Index: simplify.c
===
--- simplify.c	(révision 221972)
+++ simplify.c	(copie de travail)
@@ -3338,7 +3338,7 @@ simplify_bound_dim (gfc_expr *array, gfc_expr *kin
   result = gfc_get_constant_expr (BT_INTEGER, k, array-where);
 
   /* Then, we need to know the extent of the given dimension.  */
-  if (coarray || ref-u.ar.type == AR_FULL)
+  if (coarray || (ref-u.ar.type == AR_FULL  !ref-next))
 {
   l = as-lower[d-1];
   u = as-upper[d-1];
@@ -3417,11 +3417,8 @@ simplify_bound (gfc_expr *array, gfc_expr *dim, gf
 	case AR_FULL:
 	  /* We're done because 'as' has already been set in the
 		 previous iteration.  */
-	  if (!ref-next)
-		goto done;
+	  goto done;
 
-	/* Fall through.  */
-
 	case AR_UNKNOWN:
 	  return NULL;
 
@@ -3556,11 +3553,8 @@ simplify_cobound (gfc_expr *array, gfc_expr *dim,
 	case AR_FULL:
 	  /* We're done because 'as' has already been set in the
 		 previous iteration.  */
-	  if (!ref-next)
-	goto done;
+	  goto done;
 
-	/* Fall through.  */
-
 	case AR_UNKNOWN:
 	  return NULL;
 



! { dg-do run }
! { dg-additional-options -fcoarray=single -fdump-tree-original }
!
! Check that {L,U}{,CO}BOUND intrinsics are properly simplified.
!
  type :: t
integer :: c
  end type t

  type(t) :: d(3:8) = t(7)
  type(t) :: e[5:9,-1:*]

  if (lbound(d,  1) /= 3) call abort
  if (lbound(d(3:5), 1) /= 1) call abort
  if (lbound(d%c,1) /= 1) call abort  
  if (ubound(d,  1) /= 8) call abort  
  if (ubound(d(3:5), 1) /= 3) call abort
  if (ubound(d%c,1) /= 6) call abort  

  if (lcobound(e,   1) /=  5) call abort
  if (lcobound(e%c, 1) /=  5) call abort
  if (lcobound(e,   2) /= -1) call abort
  if (lcobound(e%c, 2) /= -1) call abort
  if (ucobound(e,   1) /=  9) call abort
  if (ucobound(e%c, 1) /=  9) call abort
  ! no simplification for ucobound(e{,%c}, dim=2)
end
! { dg-final { scan-tree-dump-not bound original } }
! { dg-final { scan-tree-dump-not abort original } }
! { dg-final { cleanup-tree-dump original } }



Extend list of codes accepted by address_operand

2015-04-27 Thread Richard Sandiford
While testing the patch that I'm about to post, I found that the list
of address_operand codes in gensupport.c was incomplete.  This patch
fills in some obvious missing entries.

It isn't exactly clear which codes are allowed at the top level of
an address.  decompose_address is known to be general enough for all LRA
targets, but it also technically accepts any unary operator as the outer
code, whereas I think only SIGN_EXTEND and ZERO_EXTEND are used at the
top level.  The routine accepts ZERO_EXTRACT and SIGN_EXTRACT, but I
think in practice they're only used for index values and not full addresses.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* gensupport.c (std_preds): Add missing codes to address_operand entry.

Index: gcc/gensupport.c
===
--- gcc/gensupport.c2015-04-27 10:38:48.0 +0100
+++ gcc/gensupport.c2015-04-27 10:40:59.044635383 +0100
@@ -2800,7 +2800,8 @@ struct std_pred_table
 
 static const struct std_pred_table std_preds[] = {
   {general_operand, false, true, {SUBREG, REG, MEM}},
-  {address_operand, true, true, {SUBREG, REG, MEM, PLUS, MINUS, MULT}},
+  {address_operand, true, true, {SUBREG, REG, MEM, PLUS, MINUS, MULT,
+  ZERO_EXTEND, SIGN_EXTEND, AND}},
   {register_operand, false, false, {SUBREG, REG}},
   {pmode_register_operand, true, false, {SUBREG, REG}},
   {scratch_operand, false, false, {SCRATCH, REG}},



RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits

2015-04-27 Thread Thomas Preud'homme
 From: Jeff Law [mailto:l...@redhat.com]
 Sent: Saturday, April 25, 2015 3:00 AM
 Do you have a testcase where this change can result in better generated
 code.  If so please add that testcase.  It's OK if it's ARM specific.

Hi Jeff,

Last time I tried I couldn't reduce the code to a small testcase but if I 
remember
well it was mostly due to the problem of finding a good test for creduce
(zero extension is not unique enough). I'll try again with a more manual 
approach
and get back to you.

Best regards,

Thomas





Re: [libstdc++ PATCH] Add support for std::uncaught_exceptions

2015-04-27 Thread Jonathan Wakely

On 13/04/15 02:24 +0300, Ville Voutilainen wrote:

The patch is a bit large since it does the baseline_symbols regeneration
and other new-version api-dance.
Hence attached gzipped.

Tested on Linux x86-64.

2015-04-13  Ville Voutilainen  ville.voutilai...@gmail.com
   Add support for std::uncaught_exceptions.
   * acinclude.m4: Bump libtool_VERSION.
   * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Regenerate.
   * config/abi/pre/gnu.ver: Export the new symbol.
   * configure: Bump libtool_VERSION.
   * libsupc++/eh_catch.cc(uncaught_exceptions): New.
   * libsupc++/exception(uncaught_exceptions): Ditto.
   * testsuite/18_support/uncaught_exceptions/uncaught_exceptions.cc: New.
   * testsuite/util/testsuite_abi.cc: Add 3.4.22 as the latest version.


Tested x86_64-linux and powerpc64le-linux. Committed to trunk.

The baseline_symbols changes aren't needed now and I tweaked the
gnu.ver file slightly.

I also added the feature-test macro recommended by N4440 and changed
the new function to only be declared for -std=gnu++?? and -std=c++1z.
It's useful and easy to provide it for pre-C++17 as an extension.
commit c37d380f6eea816886147001e0e85932f02b756f
Author: Jonathan Wakely jwak...@redhat.com
Date:   Mon Apr 27 20:46:03 2015 +0100

2015-04-27  Ville Voutilainen  ville.voutilai...@gmail.com

	Add support for std::uncaught_exceptions.
	* acinclude.m4: Bump libtool_VERSION.
	* config/abi/pre/gnu.ver: Export the new symbol.
	* configure: Regenerate.
	* libsupc++/eh_catch.cc (uncaught_exceptions): New.
	* libsupc++/exception (uncaught_exceptions): New.
	* testsuite/18_support/uncaught_exceptions/uncaught_exceptions.cc: New.
	* testsuite/util/testsuite_abi.cc: Add 3.4.22 as the latest version.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index a1e301f..196b4ff 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -3383,7 +3383,7 @@ changequote([,])dnl
 fi
 
 # For libtool versioning info, format is CURRENT:REVISION:AGE
-libtool_VERSION=6:21:0
+libtool_VERSION=6:22:0
 
 # Everything parsed; figure out what files and settings to use.
 case $enable_symvers in
diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 7b82ce8..4ed683c 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -178,7 +178,6 @@ GLIBCXX_3.4 {
 # std::[A-Zu-z]*;
 # std::underflow_error::u*;
 # std::underflow_error::~u*;
-  std::uncaught_exception*;
   std::unexpected*;
   std::[A-Zv-z]*;
   std::_List_node_base::hook*;
@@ -1024,6 +1023,9 @@ GLIBCXX_3.4 {
 _ZNSt6locale5_Impl19_M_replace_categoryEPKS0_PKPKNS_2idE;
 _ZNSt6locale5_Impl21_M_replace_categoriesEPKS0_i;
 
+# std::uncaught_exception()
+_ZSt18uncaught_exceptionv;
+
   # DO NOT DELETE THIS LINE.  Port-specific symbols, if any, will be here.
 
   local:
@@ -1860,6 +1862,12 @@ GLIBCXX_3.4.21 {
 
 } GLIBCXX_3.4.20;
 
+GLIBCXX_3.4.22 {
+
+# std::uncaught_exception()
+_ZSt19uncaught_exceptionsv;
+
+} GLIBCXX_3.4.21;
 
 # Symbols in the support library (libsupc++) have their own tag.
 CXXABI_1.3 {
diff --git a/libstdc++-v3/libsupc++/eh_catch.cc b/libstdc++-v3/libsupc++/eh_catch.cc
index 6302465..723ae56 100644
--- a/libstdc++-v3/libsupc++/eh_catch.cc
+++ b/libstdc++-v3/libsupc++/eh_catch.cc
@@ -139,3 +139,10 @@ std::uncaught_exception() throw()
   __cxa_eh_globals *globals = __cxa_get_globals ();
   return globals-uncaughtExceptions != 0;
 }
+
+int
+std::uncaught_exceptions() throw()
+{
+  __cxa_eh_globals *globals = __cxa_get_globals ();
+  return globals-uncaughtExceptions;
+}
diff --git a/libstdc++-v3/libsupc++/exception b/libstdc++-v3/libsupc++/exception
index d6c1855..069b33c 100644
--- a/libstdc++-v3/libsupc++/exception
+++ b/libstdc++-v3/libsupc++/exception
@@ -126,6 +126,11 @@ namespace std
*/
   bool uncaught_exception() _GLIBCXX_USE_NOEXCEPT __attribute__ ((__pure__));
 
+#if !defined(__STRICT_ANSI__) || __cplusplus  201402L
+#define __cpp_lib_uncaught_exceptions 201411
+  int uncaught_exceptions() _GLIBCXX_USE_NOEXCEPT __attribute__ ((__pure__));
+#endif
+
   // @} group exceptions
 } // namespace std
 
diff --git a/libstdc++-v3/testsuite/18_support/uncaught_exceptions/uncaught_exceptions.cc b/libstdc++-v3/testsuite/18_support/uncaught_exceptions/uncaught_exceptions.cc
new file mode 100644
index 000..001903e
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/uncaught_exceptions/uncaught_exceptions.cc
@@ -0,0 +1,162 @@
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY 

Re: [RS6000] pr65810, powerpc64 alignment of r2

2015-04-27 Thread David Edelsohn
On Thu, Apr 23, 2015 at 7:24 AM, Alan Modra amo...@gmail.com wrote:
 Revised patch, supporting linker that aligns the toc base.

 This fixes a thinko in offsettable_ok_by_alignment.  It's not the
 absolute placement that matters, but the toc-pointer relative offset.
 So alignment of r2 also needs to be taken into account.

 Changing offsettable_ok_by_alignment has a ripple effect into the 'm'
 constraint so we also need to ensure rs6000_legitimize_reload_address
 does not create invalid toc-relative addresses.  As found by
 gcc.dg/torture/builtin-math-2.c -Os.  That's the reason for the
 use_toc_relative_ref change.  I hope the size check along with
 reg_offset_p is sufficient here.  It seems so, but it's difficult to
 be certain due to how hard it is to get just the right combination of
 reload conditions to trigger.

 Bootstrapped and regression tested powerpc64-linux and
 powerpc64le-linux, both with a new and old linker.  OK for mainline?

 PR target/65810
 * config/rs6000/rs6000.c (POWERPC64_TOC_POINTER_ALIGNMENT): Define.
 (offsettable_ok_by_alignment): Use minimum of decl and toc
 pointer alignment.  Replace dead code with assertion.
 (use_toc_relative_ref): Add mode arg.  Return false in -mcmodel=medium
 case if size exceeds toc pointer alignment.
 (rs6000_legitimize_reload_address): Update use_toc_relative_ref call.
 (rs6000_emit_move): Likewise.
 * configure.ac: Add linker toc pointer alignment check.
 * configure: Regenerate.
 * config.in: Regenerate.

Okay.

Thanks, David


Add --enable-default-pie option to configure GCC to generate PIE by default.

2015-04-27 Thread Magnus Granberg
Hi
Can this work be commited to Gcc 6?
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=33cd3712cae4721121bc37aefd09fc5ed7cd146a
The work was posted to the patch liste even before Gcc 5 stage1 ended.
And diffrent versions of it have been posted to the list of nummer of times.

/Magnus G.
 




Re: [Patch, fortran] PR65792 - allocation of scalar elemental function with structure constructor fails

2015-04-27 Thread Paul Richard Thomas
Dear Steve,

Thanks for the review. I THINK that I know what I meant in the comment
:-) I will commit tomorrow night.

Cheers

Paul

On 26 April 2015 at 20:53, Steve Kargl s...@troutmask.apl.washington.edu 
wrote:
 On Sun, Apr 26, 2015 at 08:35:06PM +0200, Paul Richard Thomas wrote:

 --- 7062,7091 
   {
 if (expr-expr_type != EXPR_STRUCTURE)
   {
 +   tree dealloc = NULL_TREE;
 gfc_init_se (se, NULL);
 gfc_conv_expr (se, expr);
 gfc_add_block_to_block (block, se.pre);
 +   /* Prevent repeat evaluations in gfc_copy_alloc_comp by fixing the
 +  expression in  a temporary variable and deallocate is allocatable
 +  components the copy to the result.  */

 Can you take a second shot at this comment?  The and ... portions
 seems to be a little muddled.

 OK with after comment fix.

 --
 Steve



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx


Re: [libstdc++ PATCH] Add support for std::uncaught_exceptions

2015-04-27 Thread Jonathan Wakely

On 27/04/15 21:40 +0100, Jonathan Wakely wrote:

Tested x86_64-linux and powerpc64le-linux. Committed to trunk.

The baseline_symbols changes aren't needed now and I tweaked the
gnu.ver file slightly.


Ville noticed a typo in a comment I added, fixed like so.
commit c595cfa88c9d38f333b262635a1c32744e3ce054
Author: Jonathan Wakely jwak...@redhat.com
Date:   Mon Apr 27 21:51:53 2015 +0100

2015-04-27  Ville Voutilainen  ville.voutilai...@gmail.com

	* config/abi/pre/gnu.ver: Fix comment.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 4ed683c..2da04e4 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -1864,7 +1864,7 @@ GLIBCXX_3.4.21 {
 
 GLIBCXX_3.4.22 {
 
-# std::uncaught_exception()
+# std::uncaught_exceptions()
 _ZSt19uncaught_exceptionsv;
 
 } GLIBCXX_3.4.21;


Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-27 Thread Jeff Law

On 04/21/2015 11:33 AM, Kyrill Tkachov wrote:


On 21/04/15 15:09, Jeff Law wrote:

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:

  From reading config/stormy16/stormy-abi it seems to me that we don't
pass arguments partially in stormy16, so this code would never be called
there. That leaves pa as the potential problematic target.
I don't suppose there's an easy way to test on pa? My checkout of
binutils
doesn't seem to include a sim target for it.

No simulator, no machines in the testfarm, the box I had access to via
parisc-linux.org seems dead and my ancient PA overheats well before a
bootstrap could complete.  I often regret knowing about the backwards
way many things were done on the PA because it makes me think about
cases that only matter on dead architectures.


So what should be the action plan here? I can't add an assert on
positive result as a negative result is valid.

We want to catch the case where this would cause trouble on
pa, or change the patch until we're confident that it's fine
for pa.

That being said, reading the documentation of STACK_GROWS_UPWARD
and ARGS_GROW_DOWNWARD I'm having a hard time visualising a case
where this would cause trouble on pa.

Is the problem that in the function:

+/* Add SIZE to X and check whether it's greater than Y.
+   If it is, return the constant amount by which it's greater or smaller.
+   If the two are not statically comparable (for example, X and Y contain
+   different registers) return -1.  This is used in expand_push_insn to
+   figure out if reading SIZE bytes from location X will end up reading
from
+   location Y.  */
+static int
+memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
+{
+  rtx tmp = plus_constant (Pmode, x, size);
+  rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);
+
+  if (!CONST_INT_P (sub))
+return -1;
+
+  return INTVAL (sub);
+}

for ARGS_GROW_DOWNWARD we would be reading 'backwards' from x,
so the function should something like the following?

So I had to go back and compile some simple examples.

References to outgoing arguments will be SP relative.  References to the 
incoming arguments will be ARGP relative.  And that brings me to the 
another issue.  Isn't X in this context the incoming argument slot and 
the destination an outgoing argument slot?


If so, the approach of memory_load_overlap simply won't work on a target 
with calling conventions like the PA.  And you might really want to 
consider punting for these kind of calling conventions


If you hadn't already done the work, I'd suggest punting for any case 
where we have args partially in regs and partially in memory :-)


More thoughts when I can get an hour or two to remind myself how all 
this stuff works on the PA.


I will note that testing on the PA is unlikely to show anything simply 
because it uses 8 parameter passing registers.  So it's rare to pass 
anything in memory at all.  Even rarer to have something partially in 
memory and partially in registers.




Jeff




Re: [Patch/rtl-expand] Take tree range info into account to improve LSHIFT_EXP expanding

2015-04-27 Thread Jiong Wang

Jeff Law writes:

 On 04/16/2015 05:04 AM, Jiong Wang wrote:

 This is a rework of

https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01998.html

 After second thinking, I feel it's better to fix this in earlier stage
 during RTL expand which is more generic, and we also avoid making the
 already complex combine pass complexer.

 Currently gcc expand wide mode left shift to some generic complex
 instruction sequences, while if we have known the high part of wide mode
 all comes from sign extension, the expand logic could be simplifed.

 Given the following example,

 T A = (T) B   const_imm_shift

 We know the high part of A are all comes from sign extension, if

 * T is the next wider type of word_mode.

 For example, for aarch64, if type T is 128int (TImode), and B is with
 type SImode or DImode, then tree analyzer know that the high part of
 TImode result all comes from sign extension, and kept them in range info.

   |   T  |
   |   high |   low |
|- sizel -|

 For above example, we could simplify the expand logic into
   1. low = low  const_imm_shift;
   2. high = low  (sizel - const_imm_shift)  */

 We can utilize the arithmetic right shift to do the sign
 extension. Those reduntant instructions will be optimized out later.

 For actual .s improvement,

 AArch64
 ===

__int128_t
foo (int data)
{
  return (__int128_t) data  50;
}

old:
  sxtwx2, w0
  asr x1, x2, 63
  lsl x0, x2, 50
  lsl x1, x1, 50
  orr x1, x1, x2, lsr 14

new:
  sxtwx1, w0
  lsl x0, x1, 50
  asr x1, x1, 14


 ARM (.fpu softvfp)
 ===

long long
shift (int data)
{
  return (long long) data  20;
}

old:
  stmfd   sp!, {r4, r5}
  mov r5, r0, asr #31
  mov r3, r0
  mov r0, r0, asl #20
  mov r1, r5, asl #20
  orr r1, r1, r3, lsr #12
  ldmfd   sp!, {r4, r5}
  bx  lr

new:
  mov r1, r0
  mov r0, r0, asl #20
  mov r1, r1, asr #12
  bx  lr

 Test
 

x86 bootstrap OK, regression test OK.
AArch64 bootstrap OK, regression test on board OK.

 Regards,
 Jiong

 2015-04-116  Jiong.Wang  jiong.w...@arm.com

 gcc/
* expr.c (expand_expr_real_2): Take tree range info into account when
expanding LSHIFT_EXPR.

 gcc/testsuite
* gcc.dg/wide_shift_64_1.c: New testcase.
* gcc.dg/wide_shift_128_1.c: Ditto.
* gcc.target/aarch64/ashlti3_1.c: Ditto.
* gcc.target/arm/ashldisi_1.c: Ditto.
 Funny, I find myself wanting this transformation in both places :-) 
 Expansion time so that we generate efficient code from the start and 
 combine to catch those cases which are too complex to see at expansion, 
 but due to other optimizations become visible in the combiner.

 Sadly, it's been fairly common practice for targets to define 
 double-word shift patterns which catch a variety of special cases. 
 Ports will have to choose between using those patterns and exploiting 
 your work.   I'd be tempted to generate a double-word shift by the given 
 constant and check its cost vs the single word shifts.

 What happens if there's an overlap between t_low and low?  Won't the 
 lshift clobber low and thus we get the right value for the rshift in 
 that case?

Jeff,

  Sorry, I can't understand the meaning of overlap between t_low and low,
  assume right in right value means the opposite of left not
  correct.

  So what you mean is t_low and low share the same pseudo regiser?

  or you mean if we are shifting a value across the word boundary? like the 
following.
 
   |  double word  |
   |   t_high  |   t_low |
  |- low -|
 
  for above situation, the simplified two instruction sequence do works.
  t_low = low  const_imm_shift ; t_high = low  (sizel - const_imm_shift)

  I attached the expand result for a simple testcase below. I appreicate
  if you could comment on the RTL. 

  Thanks.
  
  __int128_t
  foo (int data)
  {
return (__int128_t) data  50;
  }

  foo.c.188t.optimized
  ===
  foo (int data)
  {
__int128 _2;
__int128 _3;

  bb 2:
_2 = (__int128) data_1(D);
_3 = _2  50;
return _3;
  }

  foo.c.189r.expand
  ===
  
  (insn 2 4 3 2 (set (reg/v:SI 76 [ data ])
(reg:SI 0 x0 [ data ])) foo.c:3 -1
  (nil))
  (insn 6 3 7 2 (set (reg:DI 79)
(sign_extend:DI (reg/v:SI 76 [ data ]))) foo.c:4 -1
  (nil))
  (insn 7 6 8 2 (set (subreg:DI (reg:TI 78 [ D.2677 ]) 0)
(reg:DI 79)) foo.c:4 -1
  (nil))
  (insn 8 7 9 2 (set (reg:DI 80)
(ashiftrt:DI (reg:DI 79) (const_int 63 [0x3f]))) foo.c:4 -1
  (nil))
  (insn 9 8 10 2 (set (subreg:DI (reg:TI 78 [ D.2677 ]) 8)
(reg:DI 80)) foo.c:4 -1
  (nil))
  
  ^
   ~ sign extend SImode data into TImode _2 (r78)

  (insn 10 9 11 2 (set (subreg:DI (reg:TI 77 [D.2677 ]) 0)
(ashift:DI (subreg:DI (reg:TI 78 [ D.2677 ]) 0)
   

Re: [PATCH] libstdc++: Fix exceptions being generated when compiling with -fno-exceptions

2015-04-27 Thread Jonathan Wakely

On 14/04/15 12:17 -0300, Federico Lenarduzzi wrote:

When the libstdc++ is compiled, the compiler sets the std::terminate_handler function 
with __verbose_terminate_handler() or std::abort() depending on _GLIBCXX_HOSTED 
 _GLIBCXX_VERBOSE being true or false.

However, even if we compile with -fno-exceptions, the compiler will use 
__verbose_terminate_handler(), which uses exceptions. Therefore, the library is 
not fully exception-free.

This patch adds a check for __EXCEPTIONS to the #if _GLIBCXX_HOSTED  
_GLIBCXX_VERBOSE condition. If __EXCEPTIONS is defined, the compiler will use 
__verbose_terminate_handler() as a termination function; otherwise it'll use 
std::abort() which doesn't have exceptions. It also makes std::uncaught_exception() 
throw() return false if __EXCEPTIONS is not defined.


I've committed this slightly revised version using __cpp_exceptions
instead, and changing the new std::uncaught_exceptions() function too.

Tested x86_64-linux and powerpc64le-linux, and also built with
--enable-libstdcxx-flags=-fno-exceptions on x86_64-linux.

commit 80bc538bd64c7f6b9a21ec5d1a3bc5b08bc8770b
Author: Jonathan Wakely jwak...@redhat.com
Date:   Mon Apr 27 14:04:50 2015 +0100

2015-04-27  Federico Lenarduzzi  federico.lenardu...@tallertechnologies.com
	Jonathan Wakely  jwak...@redhat.com

	* libsupc++/eh_catch.cc (uncaught_exception, uncaught_exceptions):
	Return false or zero if the library is built without exceptions.
	* libsupc++/eh_term_handler.cc: Disable verbose terminate handler if
	the library is built without exceptions.

diff --git a/libstdc++-v3/libsupc++/eh_catch.cc b/libstdc++-v3/libsupc++/eh_catch.cc
index 723ae56..44fde79 100644
--- a/libstdc++-v3/libsupc++/eh_catch.cc
+++ b/libstdc++-v3/libsupc++/eh_catch.cc
@@ -136,13 +136,21 @@ __cxxabiv1::__cxa_end_catch ()
 bool
 std::uncaught_exception() throw()
 {
+#if __cpp_exceptions
   __cxa_eh_globals *globals = __cxa_get_globals ();
   return globals-uncaughtExceptions != 0;
+#else
+  return false;
+#endif
 }
 
 int
 std::uncaught_exceptions() throw()
 {
+#if __cpp_exceptions
   __cxa_eh_globals *globals = __cxa_get_globals ();
   return globals-uncaughtExceptions;
+#else
+  return 0;
+#endif
 }
diff --git a/libstdc++-v3/libsupc++/eh_term_handler.cc b/libstdc++-v3/libsupc++/eh_term_handler.cc
index 46acee8..0d6ea2b 100644
--- a/libstdc++-v3/libsupc++/eh_term_handler.cc
+++ b/libstdc++-v3/libsupc++/eh_term_handler.cc
@@ -32,7 +32,7 @@
--disable-libstdcxx-verbose and rebuilding the library.
In a freestanding environment, we default to this latter approach.  */
 
-#if _GLIBCXX_HOSTED  _GLIBCXX_VERBOSE
+#if _GLIBCXX_HOSTED  _GLIBCXX_VERBOSE  __cpp_exceptions
 /* The current installed user handler.  */
 std::terminate_handler __cxxabiv1::__terminate_handler =
 	__gnu_cxx::__verbose_terminate_handler;


Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses

2015-04-27 Thread Mikhail Maltsev
I'm sending an updated patch (rebased to recent trunk, bootstrapped and
regtested on x86_64-unknown-linux-gnu).

On 04/25/2015 02:49 PM, Richard Sandiford wrote:
 FWIW I think the split between label_rtx and live_label_rtx is good,
 but I think we should give them different names.  The first one is
 returning only a position in the instruction stream, the second is
 returning a jump target.  I think we should rename both of them to
 make that distinction clearer.

I renamed live_label_rtx to jump_target_rtx. But I'm not sure if it is
appropriate (so, perhaps, you could give some advice about the right
names for these functions?)

 I think the eventual aim would be to have rtx_jump_insn member functions
 that get and set the jump label as an rtx_insn, with JUMP_LABEL_AS_INSN
 being a stepping stone towards that.  In cases like this it might make
 more sense to ensure old_jump has the right type (rtx_jump_insn) and go
 straight to the member functions, rather than switching to JUMP_LABEL_AS_INSN
 now and then having to rewrite it later.

I added the member functions. The problem is that JUMP_LABEL does not
always satisfy the current invariant of rtx_insn: it can also be an RTL
expression of type RETURN or SIMPLE_RETURN.

 Formatting nit, but the line break should be before ? rather than after.
Fixed.

 This preserves the behaviour of the original code but I'm not sure
 it's worth it.  I doubt the distinction between:
 
   gcc_assert (JUMP_P (x));
 
 and:
 
   gcc_checking_assert (JUMP_P (x));
 
 was ever very scientific.  Seems like we should take this refactoring as
 an opportunity to make the checking more consistent.
Fixed (removed assert_as_a).

 That seems pretty heavy-weight for LRA-local code.  Also, the long-term
 plan is for INSN_LIST and rtx_insn to be in separate hierarchies,
 at which point we'd have no alias-safe way to distinguish them.
 
 usage_insns isn't a GC structure and isn't live across a GC collection,
 so I don't think we need these lists to be rtxes at all.
OK, reverted changes in LRA code for now. I think this should be a
separate patch then.

-- 
Regards,
Mikhail Maltsev
diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index c2a3be3..ae726e5 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -1745,9 +1745,11 @@ set_edge_can_fallthru_flag (void)
 	continue;
   if (!any_condjump_p (BB_END (bb)))
 	continue;
-  if (!invert_jump (BB_END (bb), JUMP_LABEL (BB_END (bb)), 0))
+
+  rtx_jump_insn *bb_end_jump = as_a rtx_jump_insn * (BB_END (bb));
+  if (!invert_jump (bb_end_jump, JUMP_LABEL (bb_end_jump), 0))
 	continue;
-  invert_jump (BB_END (bb), JUMP_LABEL (BB_END (bb)), 0);
+  invert_jump (bb_end_jump, JUMP_LABEL (bb_end_jump), 0);
   EDGE_SUCC (bb, 0)-flags |= EDGE_CAN_FALLTHRU;
   EDGE_SUCC (bb, 1)-flags |= EDGE_CAN_FALLTHRU;
 }
@@ -1902,9 +1904,15 @@ fix_up_fall_thru_edges (void)
 
 		  fall_thru_label = block_label (fall_thru-dest);
 
-		  if (old_jump  JUMP_P (old_jump)  fall_thru_label)
-			invert_worked = invert_jump (old_jump,
-		 fall_thru_label,0);
+		  if (old_jump  fall_thru_label)
+{
+  rtx_jump_insn *old_jump_insn =
+  dyn_cast rtx_jump_insn * (old_jump);
+  if (old_jump_insn)
+invert_worked = invert_jump (old_jump_insn,
+		 fall_thru_label, 0);
+}
+
 		  if (invert_worked)
 			{
 			  fall_thru-flags = ~EDGE_FALLTHRU;
@@ -2021,10 +2029,9 @@ fix_crossing_conditional_branches (void)
   edge succ2;
   edge crossing_edge;
   edge new_edge;
-  rtx_insn *old_jump;
   rtx set_src;
   rtx old_label = NULL_RTX;
-  rtx new_label;
+  rtx_code_label *new_label;
 
   FOR_EACH_BB_FN (cur_bb, cfun)
 {
@@ -2049,7 +2056,7 @@ fix_crossing_conditional_branches (void)
 
   if (crossing_edge)
 	{
-	  old_jump = BB_END (cur_bb);
+	  rtx_jump_insn *old_jump = as_a rtx_jump_insn * (BB_END (cur_bb));
 
 	  /* Check to make sure the jump instruction is a
 	 conditional jump.  */
@@ -2088,7 +2095,8 @@ fix_crossing_conditional_branches (void)
 	  else
 		{
 		  basic_block last_bb;
-		  rtx_insn *new_jump;
+		  rtx_insn *old_label_insn;
+		  rtx_jump_insn *new_jump;
 
 		  /* Create new basic block to be dest for
 		 conditional jump.  */
@@ -2099,9 +2107,10 @@ fix_crossing_conditional_branches (void)
 		  emit_label (new_label);
 
 		  gcc_assert (GET_CODE (old_label) == LABEL_REF);
-		  old_label = JUMP_LABEL (old_jump);
-		  new_jump = emit_jump_insn (gen_jump (old_label));
-		  JUMP_LABEL (new_jump) = old_label;
+		  old_label_insn = old_jump-jump_target ();
+		  new_jump = as_a rtx_jump_insn *
+(emit_jump_insn (gen_jump (old_label_insn)));
+		  new_jump-set_jump_target (old_label_insn);
 
 		  last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)-prev_bb;
 		  new_bb = create_basic_block (new_label, new_jump, last_bb);
diff --git a/gcc/bt-load.c 

Re: [PATCH libstdc++] Fix for std::uncaught_exception (PR 62258)

2015-04-27 Thread Jonathan Wakely

On 02/02/15 03:37 +0100, Michael Hanselmann wrote:

Calls to `std::uncaught_exception` after calling
`std::rethrow_exception' always return `true' when
`std::uncaught_exception' should return `false' unless an exception is
in flight. `std::rethrow_exception' does not update
`__cxa_eh_globals::uncaughtExceptions' while the following call to
`__cxa_begin_catch' decrements it. This fixes PR 62258.

The original two-line patch was created by Dmitry Prokoptsev. Michael



Hanselmann implemented a testcase.


Committed to trunk - thanks for the patch and test.

commit 1cfba1875f21f485fe82e8c118b2355c4714bcdc
Author: Jonathan Wakely jwak...@redhat.com
Date:   Mon Apr 27 13:48:23 2015 +0100

2015-04-27  Dmitry Prokoptsev  dprokopt...@gmail.com
	Michael Hanselmann  pub...@hansmi.ch

	PR libstdc++/62258
	* libsupc++/eh_ptr.cc (rethrow_exception): Increment count of
	uncaught exceptions.
	* testsuite/18_support/exception_ptr/62258.cc: New.

diff --git a/libstdc++-v3/libsupc++/eh_ptr.cc b/libstdc++-v3/libsupc++/eh_ptr.cc
index 685c9e0..dae9246 100644
--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -245,6 +245,9 @@ std::rethrow_exception(std::exception_ptr ep)
   __GXX_INIT_DEPENDENT_EXCEPTION_CLASS(dep-unwindHeader.exception_class);
   dep-unwindHeader.exception_cleanup = __gxx_dependent_exception_cleanup;
 
+  __cxa_eh_globals *globals = __cxa_get_globals ();
+  globals-uncaughtExceptions += 1;
+
 #ifdef _GLIBCXX_SJLJ_EXCEPTIONS
   _Unwind_SjLj_RaiseException (dep-unwindHeader);
 #else
diff --git a/libstdc++-v3/testsuite/18_support/exception_ptr/62258.cc b/libstdc++-v3/testsuite/18_support/exception_ptr/62258.cc
new file mode 100644
index 000..3734f19
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/exception_ptr/62258.cc
@@ -0,0 +1,61 @@
+// { dg-options -std=gnu++11 }
+// { dg-require-atomic-builtins  }
+
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// http://www.gnu.org/licenses/.
+
+// PR libstdc++/62258
+
+#include exception
+#include testsuite_hooks.h
+
+struct check_on_destruct
+{
+  ~check_on_destruct();
+};
+
+check_on_destruct::~check_on_destruct()
+{
+  VERIFY(std::uncaught_exception());
+}
+
+int main ()
+{
+  VERIFY(!std::uncaught_exception());
+
+  try
+{
+  check_on_destruct check;
+
+  try
+{
+  throw 1;
+}
+  catch (...)
+{
+  VERIFY(!std::uncaught_exception());
+
+  std::rethrow_exception(std::current_exception());
+}
+}
+  catch (...)
+{
+  VERIFY(!std::uncaught_exception());
+}
+
+  VERIFY(!std::uncaught_exception());
+}


Re: [PATCH][PR65818][bootstrap,hppa] Return side-effect free result in gimplify_va_arg_internal

2015-04-27 Thread Richard Biener
On Sat, Apr 25, 2015 at 2:35 PM, Tom de Vries tom_devr...@mentor.com wrote:
 Hi,

 this patch fixes PR65818, and hppa bootstrap.

 When compiling gcc/libiberty/vprintf-support.c, the following va_arg is
 compiled:
 ...
  (void) __builtin_va_arg(ap, double);
 ...

 This results in the following ifn_va_arg at gimple level, with a NULL_TREE
 lhs:
 ...
   VA_ARG (ap, 0B);
 ...

 We start expanding the ifn_va_arg in expand_ifn_va_arg_1 by calling
 gimplify_va_arg_internal:
 ...
 expr = gimplify_va_arg_internal (ap, type, gimple_location (stmt),
  pre, post);
 ...

 Subsequently, because the lhs is NULL_TREE, we skip generating the assign to
 the lhs:
 ...
 lhs = gimple_call_lhs (stmt);
 if (lhs != NULL_TREE)
   {
 gcc_assert (useless_type_conversion_p (TREE_TYPE (lhs), type));

 if (gimple_call_num_args (stmt) == 3)
   {
 /* We've transported the size of with WITH_SIZE_EXPR here as
the 3rd argument of the internal fn call.  Now reinstate
it.  */
 tree size = gimple_call_arg (stmt, 2);
 expr = build2 (WITH_SIZE_EXPR, TREE_TYPE (expr), expr,
 size);
   }

 /* We use gimplify_assign here, rather than gimple_build_assign,
because gimple_assign knows how to deal with variable-sized
types.  */
 gimplify_assign (lhs, expr, pre);
   }
 ...

 We assume here that any side-effects related to updating ap have been
 generated into pre/post by gimplify_va_arg_internal, and that it's safe to
 ignore expr.

 Turns out, that's not the case for hppa. The target hook
 hppa_gimplify_va_arg_expr (called from gimplify_va_arg_internal) returns an
 expression that still contains a side-effect:
 ...
 *(double *) (ap = ap + 4294967288  4294967288B)
 ...

 This patch fixes that by gimplifying the address expression of the mem-ref
 returned by the target hook (borrowing code from gimplify_expr, case
 MEM_REF).

 Bootstrapped and reg-tested on x86_64.

 Bootstrapped and reg-tested on hppa2.0w-hp-hpux11.11.

 OK for trunk?

Hmm, that assert looks suspicious...

Can't you simply always do

  gimplify_expr (expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);

?

Richard.

 Thanks,
 - Tom



Improve LTO type checking during symtab merging

2015-04-27 Thread Jan Hubicka
Hi,
this patch strengten ODR checking for requiring match on declarations
(if both of them are defined in C++ tranlsation unit).  Currently
ipa-symtab.c will warn only if declarations are incompatible in 
useless_type_conversion sense.  Now we warn more often, i.e. in the following
example:

t.C:
char a;

t2.C:
#include stdtype.h
extern int8_t a;
void *b=a;

Will lead to warning:
t2.C:2:15: warning: type of �a� does not match original declaration
 extern int8_t a;
   ^
built-in: note: type name �char� should match type name �signed char�
/usr/include/stdint.h:37:22: note: the incompatible type is defined here
 typedef signed char  int8_t;
  ^
/aux/hubicka/t.C:1:6: note: previously declared here
 char a;
  ^

Funnilly enough we do not get warning when t2.C is just signed char
or unsigned char because that is builtin type and thus non-ODR. Something
I need to deal with incrementally.  

I also added call to warn_types_mismatch that outputs extra note about
what is wrong with the type: in C++ the mismatch is often carried over
several levels on declarations and it is hard to work out the reason
without extra info.

Richard, According to comments, it seems that the code was tailored to not
output warning when we can avoid wrong code issues on mismatches based on fact
that linker would be also silent. I am particularly entertained by the
following hunk which I killed:

-  if (!types_compatible_p (TREE_TYPE (prevailing_decl),
-  TREE_TYPE (decl)))
-   /* If we don't have a merged type yet...sigh.  The linker
-  wouldn't complain if the types were mismatched, so we
-  probably shouldn't either.  Just use the type from
-  whichever decl appears to be associated with the
-  definition.  If for some odd reason neither decl is, the
-  older one wins.  */
-   (void) 0;

It is fun we go through the trouble of comparing types and then do nothing ;)
Type merging is also completed at this time, so at least the comment looks
out of date.

I think as QOI issue, we do want to warn in cases the code is inconsistent (it
is pretty much surely a bug), but perhaps we may want to have the warnings with
-Wodr only and use slightly different warning and different -W switch for
warnings that unavoidably leads to wrong code surprises?   This should be easy
to implement by adding two levels into new warn_type_compatibility_p predicate.
I am personaly also OK however with warning always or making all the warnings
-Wodr.

What do you think?

Incrementally I am heading towards proper definition of decl compatibility that
I can plug into symtab merging and avoid merging incompatible decls (so
FORTIFY_SOURCE works).

lto-symtab and ipa-icf both have some knowledge of the problem, I want to get
both right and factor out common logic.

Other improvement is to preserve the ODR type info when non-ODR variant
previals so one can diagnose clash in between C++ units even in mixed
language linktimes.

Bootstrapped/regtested x86_64-linux with no new ODR warnings hit by -Werror.

Honza

* ipa-devirt.c (register_odr_type): Be ready for non-odr main variant
of odr type.
* ipa-utils.h (warn_types_mismatch): New.
* lto/lto-symtab.c (warn_type_compatibility_p): Break out from ...
(lto_symtab_merge): ... here.
(lto_symtab_merge_decls_2): Improve warnings.

Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 222391)
+++ ipa-devirt.c(working copy)
@@ -2029,10 +2030,14 @@ register_odr_type (tree type)
   if (in_lto_p)
 odr_vtable_hash = new odr_vtable_hash_type (23);
 }
-  /* Arrange things to be nicer and insert main variants first.  */
-  if (odr_type_p (TYPE_MAIN_VARIANT (type)))
+  /* Arrange things to be nicer and insert main variants first.
+ ??? fundamental prerecorded types do not have mangled names; this
+ makes it possible that non-ODR type is main_odr_variant of ODR type.
+ Things may get smoother if LTO FE set mangled name of those types same
+ way as C++ FE does.  */
+  if (odr_type_p (main_odr_variant (TYPE_MAIN_VARIANT (type
 get_odr_type (TYPE_MAIN_VARIANT (type), true);
-  if (TYPE_MAIN_VARIANT (type) != type)
+  if (TYPE_MAIN_VARIANT (type) != type  odr_type_p (main_odr_variant (type)))
 get_odr_type (type, true);
 }
 
Index: ipa-utils.h
===
--- ipa-utils.h (revision 222391)
+++ ipa-utils.h (working copy)
@@ -84,6 +84,7 @@ bool types_must_be_same_for_odr (tree, t
 bool types_odr_comparable (tree, tree, bool strict = false);
 cgraph_node *try_speculative_devirtualization (tree, HOST_WIDE_INT,
   ipa_polymorphic_call_context);
+void warn_types_mismatch (tree t1, tree t2);
 
 /* Return vector containing possible targets of polymorphic call E.
If 

ping*3: [PATCH, x86] [PR target/64835] Add TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook

2015-04-27 Thread Christian Bruel
Hi,

I'd like to re-ping the following patch for GCC 5.2. It fixes the
__attribute__ ((__optimize__ (...))) on x86. Testcase is in the patch.

thanks

Christian

On 04/13/2015 04:24 PM, Christian Bruel wrote:
 https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00243.html
 
 thanks,
 
 Christian
 


Re: [PATCH] Fix VRP update_value_range and caller (PR tree-optimization/65875)

2015-04-27 Thread Jakub Jelinek
On Mon, Apr 27, 2015 at 10:05:21AM +0200, Richard Biener wrote:
  --- gcc/tree-vrp.c.jj   2015-04-20 14:35:39.0 +0200
  +++ gcc/tree-vrp.c  2015-04-24 18:10:41.321367440 +0200
  @@ -892,7 +892,12 @@ update_value_range (const_tree var, valu
   UNDEFINED or from VARYING.  */
 if (new_vr-type == VR_UNDEFINED
|| old_vr-type == VR_VARYING)
  -   set_value_range_to_varying (old_vr);
  +   {
  + BITMAP_FREE (new_vr-equiv);
  + set_value_range_to_varying (old_vr);
  + set_value_range_to_varying (new_vr);
  + return true;
 
 Actually we didn't change anything here (old_vr-type is VARYING already,
 so we shouldn't even have visited the stmt defining the SSA name again...
 eventually your fix below fixes that.

On the testcase, old_vr wasn't actually VARYING, but new_vr was UNDEFINED
(a result of intersecting disjoint ranges).  While for old_vr-type ==
VR_VARYING I agree we shouldn't have been called on this.

And turning a VR_RANGE into VR_UNDEFINED is going in the wrong direction
in the lattice.

Jakub


Re: [PATCH] Fix VRP update_value_range and caller (PR tree-optimization/65875)

2015-04-27 Thread Richard Biener
On Mon, 27 Apr 2015, Jakub Jelinek wrote:

 On Mon, Apr 27, 2015 at 10:05:21AM +0200, Richard Biener wrote:
   --- gcc/tree-vrp.c.jj 2015-04-20 14:35:39.0 +0200
   +++ gcc/tree-vrp.c2015-04-24 18:10:41.321367440 +0200
   @@ -892,7 +892,12 @@ update_value_range (const_tree var, valu
  UNDEFINED or from VARYING.  */
  if (new_vr-type == VR_UNDEFINED
   || old_vr-type == VR_VARYING)
   - set_value_range_to_varying (old_vr);
   + {
   +   BITMAP_FREE (new_vr-equiv);
   +   set_value_range_to_varying (old_vr);
   +   set_value_range_to_varying (new_vr);
   +   return true;
  
  Actually we didn't change anything here (old_vr-type is VARYING already,
  so we shouldn't even have visited the stmt defining the SSA name again...
  eventually your fix below fixes that.
 
 On the testcase, old_vr wasn't actually VARYING, but new_vr was UNDEFINED
 (a result of intersecting disjoint ranges).  While for old_vr-type ==
 VR_VARYING I agree we shouldn't have been called on this.
 
 And turning a VR_RANGE into VR_UNDEFINED is going in the wrong direction
 in the lattice.

Ah, I misread the || for a .  The patch is ok with just dropping
the == VR_VARYING check.

Thanks,
Richard.


Fix librayr name of __builtin_allocal_with_align

2015-04-27 Thread Jan Hubicka
Hi,
build_common_builtin_nodes declares both __builtin_alloca and
__builtin_alloca_with_align to have library name alloca. This actually
triggers warning in an updated ODR violation detector on alloca being
declared twice.

__builtin_alloca_with_align IMO do not have library equivalent and I think this
is a pasto (__builtin_alloca_with_align is not documented in extend.texi). It
is not clear to me if there was some intention behind this oddity though.

I have bootstrapped/regtested x86_64 with the following. OK?

Honza

* tree.c (build_common_builtin_nodes): Do not build
__builtin_alloca_with_align as equivalent of library alloca.
Index: tree.c
===
--- tree.c  (revision 222391)
+++ tree.c  (working copy)
@@ -10088,7 +10098,8 @@ build_common_builtin_nodes (void)
   ftype = build_function_type_list (ptr_type_node, size_type_node,
size_type_node, NULL_TREE);
   local_define_builtin (__builtin_alloca_with_align, ftype,
-   BUILT_IN_ALLOCA_WITH_ALIGN, alloca,
+   BUILT_IN_ALLOCA_WITH_ALIGN,
+   __builtin_alloca_with_align,
ECF_MALLOC | ECF_NOTHROW | ECF_LEAF);
 
   /* If we're checking the stack, `alloca' can throw.  */


Re: [PATCH][PR65823] Fix va_arg ap_copy nop detection

2015-04-27 Thread Tom de Vries

On 22-04-15 15:50, Richard Biener wrote:

On Wed, Apr 22, 2015 at 3:38 PM, Tom de Vries tom_devr...@mentor.com wrote:

On 22-04-15 10:06, Richard Biener wrote:


On Wed, Apr 22, 2015 at 9:41 AM, Tom de Vries tom_devr...@mentor.com
wrote:


Hi,

this patch fixes PR65823.



SNIP



The patches fixes the problem by using operand_equal_p to do the equality
test.


Bootstrapped and reg-tested on x86_64.

Did minimal non-bootstrap build on arm and reg-tested.

OK for trunk?



Hmm, ok for now.



Committed.


But I wonder if we can't fix things to not require that
odd extra copy.



Agreed, that would be good.


In fact that we introduce ap.1 looks completely bogus to me



AFAICT, it's introduced by gimplify_arg ('argp') because argp (a PARM_DECL)
is not addressable.


(and we don't in this case for arm).  Note that the pointer compare
obviously
fails because we unshare the expression.

So ... what breaks if we simply remove this odd fixup?



[ Originally mentioned at https://gcc.gnu.org/ml/gcc/2015-02/msg00011.html .
]

I've committed gcc.target/x86_64/abi/callabi/vaarg-6.c specifically as a
minimal version of this problem.

If we remove the ap_copy fixup, at original we have:
...
;; Function do_cpy2 (null)
{
   char * e;

 char * e;
   e = VA_ARG_EXPR argp;
   e = VA_ARG_EXPR argp;
   if (e != b)
 {
   abort ();
 }
}
...

and after gimplify we have:
...
do_cpy2 (char * argp)
{
   char * argp.1;
   char * argp.2;
   char * b.3;
   char * e;

   argp.1 = argp;
   e = VA_ARG (argp.1, 0B);
   argp.2 = argp;
   e = VA_ARG (argp.2, 0B);
   b.3 = b;
   if (e != b.3) goto D.1373; else goto D.1374;
   D.1373:
   abort ();
   D.1374:
}
...

The second VA_ARG uses argp.2, which is a copy of argp, which is unmodified
by the first VA_ARG.


Using attached _proof-of-concept_ patch, I get callabi.exp working without
the ap_copy, still at the cost of one 'argp.1 = argp' copy though:
...
do_cpy2 (char * argp)
{
   char * argp.1;
   char * b.2;
   char * e;

   argp.1 = argp;
   e = VA_ARG (argp.1, 0B);
   e = VA_ARG (argp.1, 0B);
   b.2 = b;
   if (e != b.2) goto D.1372; else goto D.1373;
   D.1372:
   abort ();
   D.1373:
}
...

But perhaps there's an easier way?


Hum, simply

Index: gcc/gimplify.c
===
--- gcc/gimplify.c  (revision 222320)
+++ gcc/gimplify.c  (working copy)
@@ -9419,6 +9419,7 @@ gimplify_va_arg_expr (tree *expr_p, gimp
  }

/* Transform a VA_ARG_EXPR into an VA_ARG internal function.  */
+  mark_addressable (valist);
ap = build_fold_addr_expr_loc (loc, valist);
tag = build_int_cst (build_pointer_type (type), 0);
*expr_p = build_call_expr_internal_loc (loc, IFN_VA_ARG, type, 2, ap, tag);



That sort of works, but causes other problems. Filed PR65887 - 'remove va_arg ap 
copies' to track this issue.


Thanks,
- Tom



Re: Fix librayr name of __builtin_allocal_with_align

2015-04-27 Thread Richard Biener
On Mon, 27 Apr 2015, Jan Hubicka wrote:

 Hi,
 build_common_builtin_nodes declares both __builtin_alloca and
 __builtin_alloca_with_align to have library name alloca. This actually
 triggers warning in an updated ODR violation detector on alloca being
 declared twice.
 
 __builtin_alloca_with_align IMO do not have library equivalent and I think 
 this
 is a pasto (__builtin_alloca_with_align is not documented in extend.texi). It
 is not clear to me if there was some intention behind this oddity though.
 
 I have bootstrapped/regtested x86_64 with the following. OK?

Yes.

Thanks,
Richard.

 Honza
 
   * tree.c (build_common_builtin_nodes): Do not build
   __builtin_alloca_with_align as equivalent of library alloca.
 Index: tree.c
 ===
 --- tree.c(revision 222391)
 +++ tree.c(working copy)
 @@ -10088,7 +10098,8 @@ build_common_builtin_nodes (void)
ftype = build_function_type_list (ptr_type_node, size_type_node,
   size_type_node, NULL_TREE);
local_define_builtin (__builtin_alloca_with_align, ftype,
 - BUILT_IN_ALLOCA_WITH_ALIGN, alloca,
 + BUILT_IN_ALLOCA_WITH_ALIGN,
 + __builtin_alloca_with_align,
   ECF_MALLOC | ECF_NOTHROW | ECF_LEAF);
  
/* If we're checking the stack, `alloca' can throw.  */
 


Re: [PATCH] Fix VRP update_value_range and caller (PR tree-optimization/65875)

2015-04-27 Thread Richard Biener
On Fri, 24 Apr 2015, Jakub Jelinek wrote:

 Hi!
 
 In vrp_visit_assignment_or_call we try to return SSA_PROP_VARYING
 if update_value_range returned true and the new range is VR_VARYING,
 but vrp_visit_phi_node fails to do that.
 Another thing is that if update_value_range decides to
 set_value_range_to_varying (old_vr);
 it doesn't update new_vr, so the caller doesn't know the new vr
 is varying (and calling get_value_range again sounds unnecessary).
 
 The following patch fixes it, bootstrapped/regtested on x86_64-linux and
 i686-linux, ok for trunk and 5.2?
 
 2015-04-24  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/65875
   * tree-vrp.c (update_value_range): If in is_new case setting
   old_vr to VR_VARYING, also set new_vr to it.
   (vrp_visit_phi_node): Return SSA_PROP_VARYING instead of
   SSA_PROP_INTERESTING if update_value_range returned true,
   but new range is VR_VARYING.
 
   * gcc.c-torture/compile/pr65875.c: New test.
 
 --- gcc/tree-vrp.c.jj 2015-04-20 14:35:39.0 +0200
 +++ gcc/tree-vrp.c2015-04-24 18:10:41.321367440 +0200
 @@ -892,7 +892,12 @@ update_value_range (const_tree var, valu
UNDEFINED or from VARYING.  */
if (new_vr-type == VR_UNDEFINED
 || old_vr-type == VR_VARYING)
 - set_value_range_to_varying (old_vr);
 + {
 +   BITMAP_FREE (new_vr-equiv);
 +   set_value_range_to_varying (old_vr);
 +   set_value_range_to_varying (new_vr);
 +   return true;

Actually we didn't change anything here (old_vr-type is VARYING already,
so we shouldn't even have visited the stmt defining the SSA name again...
eventually your fix below fixes that.

So I'd prefer to simply drop the hunk... (you might want to do some
digging which rev. introduced it and if it came with a testcase...).

Otherwise ok.

Thanks,
Richard.

 + }
else
   set_value_range (old_vr, new_vr-type, new_vr-min, new_vr-max,
new_vr-equiv);
 @@ -8941,6 +8946,9 @@ update_range:
 fprintf (dump_file, \n);
   }
  
 +  if (vr_result.type == VR_VARYING)
 + return SSA_PROP_VARYING;
 +
return SSA_PROP_INTERESTING;
  }
  
 --- gcc/testsuite/gcc.c-torture/compile/pr65875.c.jj  2015-04-24 
 18:20:47.650595581 +0200
 +++ gcc/testsuite/gcc.c-torture/compile/pr65875.c 2015-04-24 
 18:20:30.0 +0200
 @@ -0,0 +1,24 @@
 +/* PR tree-optimization/65875 */
 +
 +int a, b, c, d, e, f, g;
 +
 +void
 +foo (void)
 +{
 +  long h = 0, i;
 +  if (g  0)
 +i = -g;
 +  for (; b;)
 +for (; c;)
 +  if (e)
 + h = 1;
 +  for (; f;)
 +if (a)
 +  break;
 +  if (h  i)
 +while (h  i)
 +  {
 + d = 0;
 + h--;
 +  }
 +}
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH, x86] Add TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook

2015-04-27 Thread Uros Bizjak
On Wed, Feb 4, 2015 at 2:21 PM, Christian Bruel christian.br...@st.com wrote:
 While trying to reduce the PR64835 case for ARM and x86, I noticed that the
 alignment flags are cleared for x86 when attribute optimized is used.

 With the attached testcases, the visible effects are twofold :

 1) Functions compiled in with attribute optimize (-O2) are not aligned as if
 they were with the -O2 flag.

 2) can_inline_edge_p fails because opts_for_fn (caller-decl) != opts_for_fn
 (callee-decl)) even-though they are compiled with the same optimization
 level.

2015-02-06  Christian Bruel  christian.br...@st.com

PR target/64835
* config/i386/i386.c (ix86_default_align): New function.
(ix86_override_options_after_change): Call ix86_default_align.
(TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): New hook.
(ix86_override_options_after_change): New function.

2015-02-06  Christian Bruel  christian.br...@st.com

PR target/64835
* gcc.dg/ipa/iinline-attr.c: New test.
* gcc.target/i386/iinline-attr-2.c: New test.

OK for mainline.

Thanks,
Uros


Re: Fix xstormy16 handling of HImode predicates

2015-04-27 Thread Nicholas Clifton

Hi Richard,


gcc/
* config/stormy16/predicates.md (xs_hi_general_operand): Delete.
(xs_hi_nonmemory_operand): Remove error.
* config/stormy16/stormy16.md (movhi, movhi_internal): Use
general_operand rather than xs_hi_general_operand.


Approved - please apply.

Cheers
  Nick



Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-27 Thread John David Anglin

On 2015-04-27 6:12 AM, Kyrill Tkachov wrote:


On 22/04/15 12:51, Kyrill Tkachov wrote:

On 21/04/15 18:33, Kyrill Tkachov wrote:

On 21/04/15 15:09, Jeff Law wrote:

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:
From reading config/stormy16/stormy-abi it seems to me that we 
don't
pass arguments partially in stormy16, so this code would never be 
called

there. That leaves pa as the potential problematic target.
I don't suppose there's an easy way to test on pa? My checkout of 
binutils

doesn't seem to include a sim target for it.

No simulator, no machines in the testfarm, the box I had access to via
parisc-linux.org seems dead and my ancient PA overheats well before a
bootstrap could complete.  I often regret knowing about the backwards
way many things were done on the PA because it makes me think about
cases that only matter on dead architectures.

So what should be the action plan here? I can't add an assert on
positive result as a negative result is valid.

We want to catch the case where this would cause trouble on
pa, or change the patch until we're confident that it's fine
for pa.

That being said, reading the documentation of STACK_GROWS_UPWARD
and ARGS_GROW_DOWNWARD I'm having a hard time visualising a case
where this would cause trouble on pa.

Is the problem that in the function:

+/* Add SIZE to X and check whether it's greater than Y.
+   If it is, return the constant amount by which it's greater or 
smaller.
+   If the two are not statically comparable (for example, X and Y 
contain
+   different registers) return -1.  This is used in 
expand_push_insn to
+   figure out if reading SIZE bytes from location X will end up 
reading from

+   location Y.  */
+static int
+memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
+{
+  rtx tmp = plus_constant (Pmode, x, size);
+  rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);
+
+  if (!CONST_INT_P (sub))
+return -1;
+
+  return INTVAL (sub);
+}

for ARGS_GROW_DOWNWARD we would be reading 'backwards' from x,
so the function should something like the following?

static int
memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
{
#ifdef ARGS_GROW_DOWNWARD
 rtx tmp = plus_constant (Pmode, x, -size);
#else
 rtx tmp = plus_constant (Pmode, x, size);
#endif
 rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);

 if (!CONST_INT_P (sub))
   return -1;

#ifdef ARGS_GROW_DOWNWARD
 return INTVAL (-sub);
#else
 return INTVAL (sub);
#endif
}

now, say for x == sp + 4,  y == sp + 8, size == 16:
This would be a problematic case for arm, so this code on arm
(where ARGS_GROW_DOWNWARD is *not* defined) would return
12, which is the number of bytes that overlap.

On a target where ARGS_GROW_DOWNWARD is defined this would return
-20, meaning that no overlap occurs (because we read in the descending
direction from x, IIUC).

Hi Jeff,

Here's an attempt to make this more concrete.
Only the memory_load_overlap function has changed.
This time I tried to take into account the case when
ARGS_GROW_DOWNWARD.

Take the case where x == sp, y == sp + 8, size == 16.
For arm, this would return 8 as that is the number of bytes
that overlap. On pa, since ARGS_GROW_DOWNWARD is defined it
would return -1 as we're reading down from x rather than up
towards y.

In the case when x == sp + 8, y == sp, size == 16
This would return -1 on arm since we're reading upwards from x
and thefore no overlap would happen.

On pa, this would return 8, which I think is the right thing.
But again, I don't have access to any pa means of testing.

What do you think of this approach?


Hi Dave,

Would it be possible for you to test this patch on a 64-bit hppa
or at least bootstrap it?
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01288.html
I started a build and test with your patch on hppa64-hp-hpux11.11 this 
morning.




There is a concern that it may potentially affect the passing of
complex arguments partially on the stack and partially in regs
on pa because of the way the args and stack grow on that target.

Unfortunately I don't have access to any hardware or simulators.
It would help a lot with getting this patch in.

If you write to linux-par...@vger.kernel.org, arrangements can be made for
an account on a Debian parisc linux machine for development testing.

Helge Deller has arranged for some new machines since we took over the
Debian buildd infrastructure  for parisc.  More info is here:
https://parisc.wiki.kernel.org/index.php/Main_Page



Thanks,
Kyrill




Thanks,
Kyrill

P.S. I've included the testcase from Honggyu in the patch.

2015-04-22  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  PR target/65358
  * expr.c (memory_load_overlap): New function.
  (emit_push_insn): When pushing partial args to the stack would
  clobber the register part load the overlapping part into a pseudo
  and put it into the hard reg after pushing.

2015-04-22  Honggyu Kim  hong.gyu@lge.com

  PR target/65358
  * gcc.dg/pr65358.c: New test.




[PATCH][AArch64] Fix operand costing logic for MINUS

2015-04-27 Thread Kyrill Tkachov

Hi all,

This fixes a case in aarch64 costs where we forgot
to account for one of the operands in the MINUS case.

This is important to get right as the mult synthesis code
can ask for the cost of a shift+sub operation and put the
shift part in any of the MINUS operands, expecting the cost function to
break it down into a separate shift operation if the
two cannot be combined.

Bootstrapped and tested on aarch64.
Ok for trunk?

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/aarch64.c (aarch64_rtx_costs, MINUS):
Properly account for both operand costs in simple case.
commit 4859566692e4b9195b975632ed105b9c4b6ab765
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Mon Mar 2 10:18:22 2015 +

[AArch64] Properly cost both operands of MINUS.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 7ffa7ee..5a3f887 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5844,7 +5844,8 @@ cost_minus:
 	return true;
 	  }
 
-	*cost += rtx_cost (new_op1, MINUS, 1, speed);
+	*cost += rtx_cost (new_op1, MINUS, 1, speed)
+	 + rtx_cost (op0, MINUS, 0, speed);
 
 	if (speed)
 	  {


Re: [PATCH] Fix LTO option streaming

2015-04-27 Thread Richard Biener
On Thu, 23 Apr 2015, Richard Biener wrote:

 On Thu, 23 Apr 2015, Jan Hubicka wrote:
 
   
   It looks like when transitioning to using target and optimization
   option nodes for compile-time to link-time option streaming you
   didn't adjust lto-opts.c nor lto-wrapper.c.  The following fixes
  
  Yep, I assumed that lto-wrapper's merging is now redundant for optimization
  options, while it is still needed for global functions (-fPIC) that was
  not covered by my changes and since this was all bit late, it seemed I can
  just keep it for next stage1.
  
   the target option side (for SWITCHABLE_TARGET).  Do not record
   any target options in the lto_opts section.
   
   Honza - I suppose we don't have any testcase that this works, I'll
   try to come up with sth.  This also looks like a correctness issue
   to me.
  
  Thanks, that would be a good idea. So with SWITCHABLE_TARGET
  we have problem that after switching the options are misinterpretted?

Ok, so with

t.c
---
extern int have_fma;
extern double x, y, z;
void do_fma (void);
int main()
{
  if (have_fma)
do_fma ();
  else
x = x + y * z;
}

t2.c
---
int have_fma;
volatile double x, y, z;
void do_fma (void)
{
  x = x + y * z;
}

and doing

gcc -c -Ofast -flto t.c
gcc -c -Ofast -mfma -flto t2.c
gcc t.o t2.o

I get -mfma on the lto1 command-line for the link.  But I also get for 
main()

 target_option_node 0x76ac40d8
ix86_isa_flags (0x180022012)
ix86_fpmath (0x2)
target_flags (0xa0004c3)
arch = 18 (k8)
tune = 0 (generic)
branch_cost = 3
-m64 -msse2 -msse -mmmx -mfxsr -m128bit-long-double -m80387 
-mfp-ret-in-387 -mtls-direct-seg-refs -mvzeroupper 
-mavx256-split-unaligned-load -mavx256-split-unaligned-store -mfpmath=sse


while for do_fma()

 target_option_node 0x76ac2e58
ix86_isa_flags (0x117809224000112)
ix86_fpmath (0x2)
target_flags (0xa0004c3)
arch = 18 (k8)
tune = 0 (generic)
branch_cost = 3
-m64 -mfma -msse4.2 -msse4.1 -mssse3 -msse3 -msse2 -msse -mmmx -mfxsr 
-mpopcnt -mxsave -m128bit-long-double -m80387 -mfp-ret-in-387 
-mtls-direct-seg-refs -mvzeroupper -mavx256-split-unaligned-load 
-mavx256-split-unaligned-store -mfpmath=sse


and for some reason main () ends up _not_ having -mfma enabled.

I'm not sure what disables it or how we arrive at the above set of
options in the first place...  does the TARGET_OPTION_NODE store
the complete state of all target options?  I had the impression
it only stores modification?

So it seems to work (and simply ignore the global -mfma flag).  Not
sure on whether by design or by luck ;)

Richard.


  I think Jakub made patch to simply skip target-option-node streaming here,
  how this is supposed to work.
 
 Not all targets support target attributes and I think those that
 support them are those that are SWITCHABLE_TARGET - but maybe I am
 mistaken?
 
   
   We can do similar changes for optimize options, now, for all targets, no?
  
  I am not quite sure about this; not all options are per-function nature.
  I need to audit all the target options for LTO safety, I did some work
  on this for optimization options, but the target options are still very 
  wild.
  For example I am not quite convinced mixing -malign-double 
  -mno-align-double units
  will work as expected.
 
 Well, but once types are laid out specifying sth else at link time
 won't help.  But yes, if we don't stream the option we have no chance
 to complain in lto-wrapper.  It's just that we seem to not enter
 the merging machinery for single-file testcases so I chose to fix
 lto option streaming rather than lto-wrapper ... I'll have a closer
 look to lto-wrapper.  But to complain about target option mismatches
 we'd have to call a target hook from lto-wrapper (or finally re-structure
 all this and merge options in lto1 itself).
 
 Richard.
 
  Honza
   
   Thanks,
   Richard.
   
   2015-04-23  Richard Biener  rguent...@suse.de
   
 * lto-opts.c (lto_write_options): Do not record target options
 if we are targeting a SWITCHABLE_TARGET target.  We use target
 options on functions to transfer target flags from compile to
 link time.
   
   Index: gcc/lto-opts.c
   ===
   *** gcc/lto-opts.c(revision 222360)
   --- gcc/lto-opts.c(working copy)
   *** lto_write_options (void)
   *** 219,224 
   --- 219,230 
lto_stream_offload_p)
continue;
 
   +   /* Do not store target-specific options if we target a
   +  SWITCHABLE_TARGET target.  */
   +   if ((cl_options[option-opt_index].flags  CL_TARGET)
   +SWITCHABLE_TARGET)
   + continue;
   + 
   /* Drop options created from the gcc driver that will be rejected
  when passed on to the driver again.  */
   if (cl_options[option-opt_index].cl_reject_driver)
  
  
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX 

Re: [PATCH, rs6000, testsuite, PR65456] Changes for unaligned vector load/store support on POWER8

2015-04-27 Thread Bill Schmidt
On Mon, 2015-04-27 at 14:23 +0800, Bin.Cheng wrote:
 On Mon, Mar 30, 2015 at 1:42 AM, Bill Schmidt
 wschm...@linux.vnet.ibm.com wrote:

 
  Index: gcc/testsuite/gcc.dg/vect/vect-33.c
  ===
  --- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 221118)
  +++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy)
  @@ -36,9 +36,10 @@ int main (void)
 return main1 ();
   }
 
  +/* vect_hw_misalign  { ! vect64 } */
 
   /* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect  } } */
  -/* { dg-final { scan-tree-dump Vectorizing an unaligned access vect { 
  target { vect_hw_misalign  { {! vect64} || vect_multiple_sizes } } } } } 
  */
  +/* { dg-final { scan-tree-dump Vectorizing an unaligned access vect { 
  target { { { ! powerpc*-*-* }  vect_hw_misalign }  { { ! vect64 } || 
  vect_multiple_sizes } } } } }  */
   /* { dg-final { scan-tree-dump Alignment of access forced using peeling 
  vect { target { vector_alignment_reachable  { vect64  {! 
  vect_multiple_sizes} } } } } } */
   /* { dg-final { scan-tree-dump-times Alignment of access forced using 
  versioning 1 vect { target { { {! vector_alignment_reachable} || {! 
  vect64} }  {! vect_hw_misalign} } } } } */
   /* { dg-final { cleanup-tree-dump vect } } */
 
 Hi Bill,
 With this change, the test case is skipped on aarch64 now.  Since it
 passed before, Is it expected to act like this on 64bit platforms?

Hi Bin,

No, that's a mistake on my part -- thanks for the report!  That first
added line was not intended to be part of the patch:

+/* vect_hw_misalign  { ! vect64 } */

Please try removing that line and verify that the patch succeeds again
for ARM.  Assuming so, I'll prepare a patch to fix this.

It looks like this mistake was introduced only in this particular test,
but please let me know if you see any other anomalies.

Thanks very much!

Bill
 
 PASS-NA: gcc.dg/vect/vect-33.c -flto -ffat-lto-objects
 scan-tree-dump-times vect Vectorizing an unaligned access 0
 PASS-NA: gcc.dg/vect/vect-33.c scan-tree-dump-times vect Vectorizing
 an unaligned access 0
 
 Thanks,
 bin
 




Re: [PR64164] drop copyrename, integrate into expand

2015-04-27 Thread Richard Biener
On Fri, Apr 24, 2015 at 3:56 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Apr  6, 2015, Jeff Law l...@redhat.com wrote:

 So the bulk of the changes into this routine are really about picking
 a good leader, which presumably is how we're able to get the desired
 effects on debuginfo that we used to get from tree-ssa-copyrename.c?

 This has nothing to do with debuginfo, I'm afraid.  We just had to keep
 track of parm and result decls to avoid coalescing them together, and
 parm decl default defs to promote them to leaders, because expand copies
 incoming REGs to pseudos in PARM_DECL's DECL_RTL.  We should fill that
 in with the RTL created for the default def for the PARM_DECL.  At the
 end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
 return register or rtl.  I didn't want to tackle the reworking of these
 expanders to avoid problems out of copying incoming parms to one pseudo
 and then reading it from another, as I observed before I made this
 change.  I'm now tackling that, so that we can refrain from touching the
 base vars altogether, and not refrain from coalescing parms and results.

 Hmmm, so the real issue here is the expansion setup of parms and
 results.  I hadn't pondered that aspect.  I'd encourage fixing the
 expansion code too if you can see a path for that.

 That was the trickiest bit of the patch: getting assign_parms to use the
 out-of-SSA-chosen RTL for the (default def of the) param instead of
 creating a pseudo or stack slot of its own, especially when we create a
 .result_ptr decl and there is an incoming by-ref result_decl, in which
 case we ought to use the same SSA-assigned pseudo for both.  Another
 case worth mentioning is that in which a param is unused: there is no
 default def for it, but in the non-optimized case, we have to assign it
 to the same location.  I've used the DECL_RTL itself to carry the
 information in this case, at least in the non-optimized case, in which
 we know all SSA_NAMEs associated with each param *will* be assigned to
 the same partition, and use the same RTL.  If we do optimize, the param
 may get multiple locations, and DECL_RTL will be cleared.  That's fine:
 the incoming value of the param will end up being copied to a separate
 pseudo, so that there's no risk of messing with any other default def
 (there's a new testcase for this), and the copy is likely to be
 optimized out.

 The other tricky bit was to fix all expander bits that required
 SSA_NAMEs to have a associated decl.  I've removed all such cases, so we
 can now expand anonymous SSA decls directly, without having to create an
 ignored decl.  Doing that, we can coalesce variables and expand each
 partition without worrying about choosing a preferred partition leader.
 We just have to make sure we don't consider pairs of variables eligible
 for coalescing if they should get different promoted modes, or a
 different register-or-stack choice, and then expansion of partitions is
 streamlined: we just expand each leader, and then adjust all SSA_NAMEs
 to associate the RTL with their base variables, if any.


 In this revision of the patch, I have retained -ftree-coalesce-vars, so
 that its negated form can be used in testcases that formerly expected no
 coalescing across user variables, but that explicitly disabled VTA.

 As for testcases, while investigating test regressions, I found out
 various guality failures had to do with VT's lack of awareness of custom
 calling conventions.  Caller's variables saved in registers that are
 normally call-clobbered, but that are call-saved in custom conventions
 set up for a callee, would end up invalidating the entry-point location
 associations.  I've arranged for var-tracking to use custom calling
 conventions for register invalidation at call insns, and this fixed not
 only a few guality regressions due to changes in register assignment,
 but a number of other long-standing guality failures.  Yay!  This could
 be split out into a standalone patch.


 On Mar 31, 2015, Richard Biener richard.guent...@gmail.com wrote:

 Did you do any statistics on how the number of basevars changes with your 
 patch
 compared to trunk?

 In this version of the patch, we no longer touch the base vars at all.
 We just associate the piece of RTL generated for the partition with a
 list of decls, if needed.  (I've just realized that I never noticed a
 list of decls show up anywhere, and looking into this, I saw a bug in
 the leader_merge function, that causes it to fail to go from a single
 entry to a list: it creates the list, but then returns the original
 non-list entry; that's why I never saw them!  I won't delay posting the
 patch just because of this; I'm not even sure we want decl lists in REG
 or MEM attrs begin with)

 I have collected some statistics on the effects of the patch in
 compiling stage3-gcc/, before and after the patch, with and without
 -fno-tree-coalesce-vars.  I counted, per function:

 b/a: before the patch, or after the 

Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Richard Biener
On Mon, 27 Apr 2015, Richard Biener wrote:

 On Fri, 24 Apr 2015, Jeff Law wrote:
 
  On 02/17/2015 07:58 AM, Richard Biener wrote:
  [ Restarting a old thread... ]
  
   On a closer look the record_const_or_copy_1 hunk is redundant
   (record_equality is really a bit obfuscated...).
  Agreed.  I'm not entirely sure how it got to this point.
  
   And record_equality is where the SSA_NAME_VALUEs might end up
   forming chains?  At least because we might record a SSA_NAME_VALUE
   for sth that might be the target of a SSA_NAME_VALUE, as in
   
 a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
 if (b_2 == i_3(D))
   ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
   
   thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
   SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
   can't easily avoid that because we don't keep track of a
   reverse mapping (nor would we want to update that).
  Right.  In general, the fact that we're in SSA form means that we ought not
  get loops in the SSA_NAME_VALUE chain since there's a single static 
  assignment
  and we'll visit it once.
  
  That nice model breaks down in two ways.  First we derive equivalences from
  equality conditions like you've shown above.  Second, when threading we can
  thread through a loop latch and start reprocessing a block.  The interaction
  between those two can result in loops in the value chain.
  
  What I'm hoping to do in GCC6 is eliminate all this stuff with a threader 
  that
  is independent of the dominator walk (and optimizer). Instead of walking
  forward through the dominator tree recording key nuggets, then removing 
  those
  nuggets as we pop back up the tree, we instead we start with the conditional
  and walk up the use-def chains, simplifying as we go, with the goal of
  simplifying to a constant, of course.
  
  You can see the beginnings of that with the FSM bits from Sebastian.
  
  Obviously until those bits are in place, we should try to clean up any warts
  in the current implementation.
  
   
   Btw, in record_equality, the == part of the = check for the loop
   depth will just randomly swap x and y while we should prefer
   IL canonical order.  If we can't rely on the input being in IL
   canonical order we should prepend the whole function with
   
 if (tree_swap_operands_p (x, y, false))
   std::swap (x, y);
   
   and change = to  for the loop-depth check.
  Agreed.  Makes perfect sense.
 
 I'm now re-bootstrapping and testing the following.

Committed as follows, with XFAILing and re-opening PR65217.

Richard.

2015-04-27  Richard Biener  rguent...@suse.de

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p.  Do not swap operands for same loop depth.

* gcc.target/i386/pr65217.c: XFAIL.

Index: gcc/tree-ssa-dom.c
===
*** gcc/tree-ssa-dom.c  (revision 222360)
--- gcc/tree-ssa-dom.c  (working copy)
*** record_equivalences_from_phis (basic_blo
*** 1519,1524 
--- 1519,1531 
  if (lhs == t)
continue;
  
+ /* Valueize t.  */
+ if (TREE_CODE (t) == SSA_NAME)
+   {
+ tree tmp = SSA_NAME_VALUE (t);
+ t = tmp ? tmp : t;
+   }
+ 
  /* If we have not processed an alternative yet, then set
 RHS to this alternative.  */
  if (rhs == NULL)
*** record_equality (tree x, tree y)
*** 1752,1757 
--- 1759,1767 
  {
tree prev_x = NULL, prev_y = NULL;
  
+   if (tree_swap_operands_p (x, y, false))
+ std::swap (x, y);
+ 
if (TREE_CODE (x) == SSA_NAME)
  prev_x = SSA_NAME_VALUE (x);
if (TREE_CODE (y) == SSA_NAME)
*** record_equality (tree x, tree y)
*** 1766,1772 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x) = loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x  is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
--- 1776,1782 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x)  loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x  is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
*** record_equivalences_from_stmt (gimple st
*** 2128,2145 
if (may_optimize_p
   (TREE_CODE (rhs) == SSA_NAME
  || is_gimple_min_invariant (rhs)))
!   {
!   if (dump_file  (dump_flags  TDF_DETAILS))
! {
!   

Re: [PATCH][PR65818][bootstrap,hppa] Return side-effect free result in gimplify_va_arg_internal

2015-04-27 Thread Tom de Vries

On 27-04-15 10:17, Richard Biener wrote:

This patch fixes that by gimplifying the address expression of the mem-ref
returned by the target hook (borrowing code from gimplify_expr, case
MEM_REF).

Bootstrapped and reg-tested on x86_64.

Bootstrapped and reg-tested on hppa2.0w-hp-hpux11.11.

OK for trunk?

Hmm, that assert looks suspicious...

Can't you simply always do

   gimplify_expr (expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);


It's a bit counter-intuitive for me, using is_gimple_lvalue for something (the 
result of va_arg) we use as rvalue.


But, it seems to work (with  in front of expr).

OK if bootstrap and reg-test on x86_64 succeeds?

Thanks,
- Tom

Return side-effect free result in gimplify_va_arg_internal

2015-04-27  Tom de Vries  t...@codesourcery.com

	PR tree-optimization/65818
	* gimplify.c (gimplify_va_arg_internal): Ensure that only side-effect
	free values are returned.
---
 gcc/gimplify.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index c68bd47..4a68c87 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -9352,7 +9352,9 @@ gimplify_va_arg_internal (tree valist, tree type, location_t loc,
   else
 gimplify_expr (valist, pre_p, post_p, is_gimple_min_lval, fb_lvalue);
 
-  return targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
+  tree expr = targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
+  gimplify_expr (expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);
+  return expr;
 }
 
 /* Gimplify __builtin_va_arg, aka VA_ARG_EXPR, which is not really a
-- 
1.9.1



Re: [PATCH, ARM] Alternatives and type attributes fixes.

2015-04-27 Thread Kyrill Tkachov

Hi Yvan,

On 27/04/15 13:25, Yvan Roux wrote:

Hi,

This is a follow-up of PR64208 where an LRA loop was due to redundancy
in insn's alternatives.  I've checked all the insns in the arm backend
to avoid latent problems and this patch fixes the issues I've spotted.

Only thumb2_movsicc_insn contained duplicated alternatives, and the
rest of the changes are related to the type attribute, which were not
accurate or used accordingly to their definition.  Notice that the
modifications have only a limited impact as in most of the pipeline
descriptions, multiple/mov_reg/alu_reg/.. types are used the same way,
only cortex a8 seems to have a real difference between alu or multiple
instruction and mov.

Bootstrapped and regtested on arm-linux-gnueabihf.  Ok for trunk ?

Thanks,
Yvan

2015-04-27  Yvan Rouxyvan.r...@linaro.org

 * config/arm/arm.mb (*arm_movt): Fix type attribute.
 (*movsi_compare0): Likewise.
 (*cmpsi_shiftsi): Likewise.
 (*cmpsi_shiftsi_swp): Likewise.
 (*movsicc_insn): Likewise.
 (*cond_move): Likewise.
 (*if_plus_move): Likewise.
 (*if_move_plus): Likewise.
 (*if_arith_move): Likewise.
 (*if_move_arith): Likewise.
 (*if_shift_move): Likewise.
 (*if_move_shift): Likewise.
 * config/arm/thumb2.md (*thumb2_movsi_insn): Fix type attribute.
 (*thumb2_movsicc_insn): Fix alternative redundancy.
 (*thumb2_addsi_short): Split register and immediate alternatives.
 (thumb2_addsi3_compare0): Likewise.

insn.diff


diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 164ac13..ee23deb 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5631,7 +5631,7 @@
[(set_attr predicable yes)
 (set_attr predicable_short_it no)
 (set_attr length 4)
-   (set_attr type mov_imm)]
+   (set_attr type alu_sreg)]
  )


For some context, this is the *arm_movt pattern that generates
a movt instruction. The documentation in types.md says:
This excludes MOV and MVN but includes MOVT. So using alu_sreg
is correct according to that logic.
However, don't you want to also update the arm_movtas_ze pattern
that generates movt but erroneously has the type 'mov_imm'?

  
  (define_insn *arm_movsi_insn

@@ -5919,7 +5919,7 @@
 cmp%?\\t%0, #0
 sub%.\\t%0, %1, #0
[(set_attr conds set)
-   (set_attr type alus_imm,alus_imm)]
+   (set_attr type alus_sreg,alus_sreg)]
  )


Why change that? They are instructions with immediate operands
so alus_imm should be gorrect?


@@ -486,12 +486,12 @@
  )
  
  (define_insn_and_split *thumb2_movsicc_insn

-  [(set (match_operand:SI 0 s_register_operand =l,l,r,r,r,r,r,r,r,r,r)
+  [(set (match_operand:SI 0 s_register_operand =l,l,r,r,r,r,r,r,r,r,r,r)
(if_then_else:SI
 (match_operator 3 arm_comparison_operator
  [(match_operand 4 cc_register ) (const_int 0)])
-(match_operand:SI 1 arm_not_operand 0 ,lPy,0 ,0,rI,K,rI,rI,K ,K,r)
-(match_operand:SI 2 arm_not_operand lPy,0 ,rI,K,0 ,0,rI,K 
,rI,K,r)))]
+(match_operand:SI 1 arm_not_operand 0 ,lPy,0 ,0,rI,K,I ,r,rI,K 
,K,r)
+(match_operand:SI 2 arm_not_operand lPy,0 ,rI,K,0 ,0,rI,I,K 
,rI,K,r)))]
TARGET_THUMB2
@
 it\\t%D3\;mov%D3\\t%0, %2
@@ -504,12 +504,14 @@
 #
 #
 #
+   #
 #
 ; alt 6: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
-   ; alt 7: ite\\t%d3\;mov%d3\\t%0, %1\;mvn%D3\\t%0, #%B2
-   ; alt 8: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mov%D3\\t%0, %2
-   ; alt 9: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mvn%D3\\t%0, #%B2
-   ; alt 10: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
+   ; alt 7: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
+   ; alt 8: ite\\t%d3\;mov%d3\\t%0, %1\;mvn%D3\\t%0, #%B2
+   ; alt 9: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mov%D3\\t%0, %2
+   ; alt 10: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mvn%D3\\t%0, #%B2
+   ; alt 11: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
 reload_completed
[(const_int 0)]
{
@@ -540,8 +542,8 @@
 operands[2])));
  DONE;
}
-  [(set_attr length 4,4,6,6,6,6,10,10,10,10,6)
-   (set_attr enabled_for_depr_it yes,yes,no,no,no,no,no,no,no,no,yes)
+  [(set_attr length 4,4,6,6,6,6,10,8,10,10,10,6)
+   (set_attr enabled_for_depr_it yes,yes,no,no,no,no,no,no,no,no,no,yes)
 (set_attr conds use)
 (set_attr type multiple)]
  )


So, I think here for the first 6 alternatives we can give it a more specific 
type,
like mov_immm or mov_reg, since you're cleaning this stuff up. The instructions 
in
those alternatives are just a mov instruction enclosed in an IT block, so they 
always
have to stick together anyway.


@@ -1161,9 +1163,9 @@
  )
  
  (define_insn *thumb2_addsi_short

-  [(set (match_operand:SI 0 low_register_operand =l,l)
-   (plus:SI (match_operand:SI 1 low_register_operand l,0)
-(match_operand:SI 2 low_reg_or_int_operand lPt,Ps)))
+  [(set (match_operand:SI 0 low_register_operand =l,l,l)
+   (plus:SI (match_operand:SI 1 low_register_operand l,l,0)
+

Re: Hide _S_n_primes from user code

2015-04-27 Thread Jonathan Wakely

On 22/04/15 22:11 +0200, François Dumont wrote:

+constexpr auto __n_primes
+  = sizeof(__prime_list) / sizeof(unsigned long) - 1;


Normally I'd say

   sizeof(__prime_list) / sizeof(*__prime_list) - 1

would be better, but since it's very unlikely we'll change the element
type in the array what you have should be safe.

OK for trunk.



Re: [Patch] Add regex_constants::__polynomial

2015-04-27 Thread Jonathan Wakely

On 26/04/15 17:55 -0700, Tim Shen wrote:

I didn't test with _GLIBCXX_DEBUG though. Updated the patch for
removing DFS restriction for ECMAScript.


OK for trunk.



[patch] Add a cross-reference link to libstdc++ manual

2015-04-27 Thread Jonathan Wakely

Committed to trunk.


commit 85b6429ce3410f9e204f7d926f67707556c6a4bf
Author: Jonathan Wakely jwak...@redhat.com
Date:   Mon Apr 27 13:27:27 2015 +0100

	* doc/xml/manual/extensions.xml: Add cross-reference.
	* doc/html/manual/ext_compile_checks.html: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/extensions.xml b/libstdc++-v3/doc/xml/manual/extensions.xml
index c4120c9..41b1a80 100644
--- a/libstdc++-v3/doc/xml/manual/extensions.xml
+++ b/libstdc++-v3/doc/xml/manual/extensions.xml
@@ -82,7 +82,8 @@ extensions, be aware of two things:
   They can be enabled at configure time with
   link linkend=manual.intro.setup.configureliteral--enable-concept-checks/literal/link.
   You can enable them on a per-translation-unit basis with
-  code#define _GLIBCXX_CONCEPT_CHECKS/code for GCC 3.4 and higher
+  link linkend=manual.intro.using.macroscode#define
+  _GLIBCXX_CONCEPT_CHECKS/code/link for GCC 3.4 and higher
   (or with code#define _GLIBCPP_CONCEPT_CHECKS/code for versions
   3.1, 3.2 and 3.3).
/para


Re: niter_base simplification

2015-04-27 Thread Jonathan Wakely

On 22/04/15 22:10 +0200, François Dumont wrote:

Hello

   I don't know if I am missing something but I think __niter_base 
could be simplified to remove usage of _Iter_base. Additionally I 
overload it to also remove __normal_iterator layer even if behind a 
reverse_iterator or move_iterator, might help compiler to optimize 
code, no ? If not, might allow other algo optimization in the 
future...


   I prefered to provide a __make_reverse_iterator to allow the 
latter in C++11 and not only in C++14. Is it fine to do it this way or 
do you prefer to simply get rid of all this part ?


It's fine to add __make_reverse_iterator but see my comment below.

   * include/bits/cpp_type_traits.h (__gnu_cxx::__normal_iterator): 
Delete.


You're removing __is_normal_iterator not __normal_iterator.


   * include/bits/stl_algobase.h (std::__niter_base): Adapt.
   * include/bits/stl_iterator.h (__make_reverse_iterator): New in C++11.
   (std::__niter_base): Overloads for std::reverse_iterator,
   __gnu_cxx::__normal_iterator and std::move_iterator.

Tested under Linux x86_64. I checked that std::copy still ends up 
calling __builtin_memmove when used on vector iterators.


François




diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h
index 0bcb133..73eea6b 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -270,17 +270,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  return __a;
}

-  // If _Iterator is a __normal_iterator return its base (a plain pointer,
-  // normally) otherwise return it untouched.  See copy, fill, ... 
+  // Fallback implementation of the function used to remove the

+  // __normal_iterator wrapper. See copy, fill, ...


It's a bit strange to have a function with no other overloads visible
described as a fallback. It would be good to say that the other
definition is in bits/stl_iterator.h


  templatetypename _Iterator
-struct _Niter_base
-: _Iter_base_Iterator, __is_normal_iterator_Iterator::__value
-{ };
-
-  templatetypename _Iterator
-inline typename _Niter_base_Iterator::iterator_type
+inline _Iterator
__niter_base(_Iterator __it)
-{ return std::_Niter_base_Iterator::_S_base(__it); }
+{ return __it; }

  // Likewise, for move_iterator.


This comment no longer makes sense, because you've removed the comment
on _Niter_base that it referred to. Please restore the original text
of the _Niter_base comment for _Miter_base.

(Alternatively, could the same simplification be made for
__miter_base? Do we need _Miter_base or just two overloads of
__miter_base()?)



  templatetypename _Iterator
diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h
index 4a9189e..3aad9f3 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -390,7 +390,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{ return __y.base() - __x.base(); }
  //@}

-#if __cplusplus  201103L
+#if __cplusplus == 201103L
+  templatetypename _Iterator
+inline reverse_iterator_Iterator
+__make_reverse_iterator(_Iterator __i)
+{ return reverse_iterator_Iterator(__i); }
+
+# define _GLIBCXX_MAKE_REVERSE_ITERATOR(_Iter) \
+  std::__make_reverse_iterator(_Iter)
+#elif __cplusplus  201103L
#define __cpp_lib_make_reverse_iterator 201402

  // _GLIBCXX_RESOLVE_LIB_DEFECTS
@@ -400,6 +408,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
inline reverse_iterator_Iterator
make_reverse_iterator(_Iterator __i)
{ return reverse_iterator_Iterator(__i); }
+
+# define _GLIBCXX_MAKE_REVERSE_ITERATOR(_Iter) \
+  std::make_reverse_iterator(_Iter)
+#endif
+
+#if __cplusplus = 201103L
+  templatetypename _Iterator
+auto
+__niter_base(reverse_iterator_Iterator __it)
+- decltype(_GLIBCXX_MAKE_REVERSE_ITERATOR(__niter_base(__it.base(
+{ return _GLIBCXX_MAKE_REVERSE_ITERATOR(__niter_base(__it.base())); }
#endif



It might be simpler to just add __make_reverse_iterator for = 201103L
and then always use std::__make_reverse_iterator instead of a macro.

That's similar to what we do for std:__addressof and std:addressof.


[PATCH, ARM] Alternatives and type attributes fixes.

2015-04-27 Thread Yvan Roux
Hi,

This is a follow-up of PR64208 where an LRA loop was due to redundancy
in insn's alternatives.  I've checked all the insns in the arm backend
to avoid latent problems and this patch fixes the issues I've spotted.

Only thumb2_movsicc_insn contained duplicated alternatives, and the
rest of the changes are related to the type attribute, which were not
accurate or used accordingly to their definition.  Notice that the
modifications have only a limited impact as in most of the pipeline
descriptions, multiple/mov_reg/alu_reg/.. types are used the same way,
only cortex a8 seems to have a real difference between alu or multiple
instruction and mov.

Bootstrapped and regtested on arm-linux-gnueabihf.  Ok for trunk ?

Thanks,
Yvan

2015-04-27  Yvan Roux  yvan.r...@linaro.org

* config/arm/arm.mb (*arm_movt): Fix type attribute.
(*movsi_compare0): Likewise.
(*cmpsi_shiftsi): Likewise.
(*cmpsi_shiftsi_swp): Likewise.
(*movsicc_insn): Likewise.
(*cond_move): Likewise.
(*if_plus_move): Likewise.
(*if_move_plus): Likewise.
(*if_arith_move): Likewise.
(*if_move_arith): Likewise.
(*if_shift_move): Likewise.
(*if_move_shift): Likewise.
* config/arm/thumb2.md (*thumb2_movsi_insn): Fix type attribute.
(*thumb2_movsicc_insn): Fix alternative redundancy.
(*thumb2_addsi_short): Split register and immediate alternatives.
(thumb2_addsi3_compare0): Likewise.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 164ac13..ee23deb 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5631,7 +5631,7 @@
   [(set_attr predicable yes)
(set_attr predicable_short_it no)
(set_attr length 4)
-   (set_attr type mov_imm)]
+   (set_attr type alu_sreg)]
 )
 
 (define_insn *arm_movsi_insn
@@ -5919,7 +5919,7 @@
cmp%?\\t%0, #0
sub%.\\t%0, %1, #0
   [(set_attr conds set)
-   (set_attr type alus_imm,alus_imm)]
+   (set_attr type alus_sreg,alus_sreg)]
 )
 
 ;; Subroutine to store a half word from a register into memory.
@@ -6886,7 +6886,7 @@
   [(set_attr conds set)
(set_attr shift 1)
(set_attr arch 32,a,a)
-   (set_attr type alus_shift_imm,alu_shift_reg,alus_shift_imm)])
+   (set_attr type alus_shift_imm,alus_shift_reg,alus_shift_imm)])
 
 (define_insn *cmpsi_shiftsi_swp
   [(set (reg:CC_SWP CC_REGNUM)
@@ -6899,7 +6899,7 @@
   [(set_attr conds set)
(set_attr shift 1)
(set_attr arch 32,a,a)
-   (set_attr type alus_shift_imm,alu_shift_reg,alus_shift_imm)])
+   (set_attr type alus_shift_imm,alus_shift_reg,alus_shift_imm)])
 
 (define_insn *arm_cmpsi_negshiftsi_si
   [(set (reg:CC_Z CC_REGNUM)
@@ -7492,10 +7492,10 @@
 (const_string mov_imm)
 (const_string mov_reg))
   (const_string mvn_imm)
-  (const_string mov_reg)
-  (const_string mov_reg)
-  (const_string mov_reg)
-  (const_string mov_reg)])]
+  (const_string multiple)
+  (const_string multiple)
+  (const_string multiple)
+  (const_string multiple)])]
 )
 
 (define_insn *movsfcc_soft_insn
@@ -8653,7 +8653,14 @@
 return \\;
   
   [(set_attr conds use)
-   (set_attr type mov_reg,mov_reg,multiple)
+   (set_attr_alternative type
+ [(if_then_else (match_operand 2 const_int_operand 
)
+(const_string mov_imm)
+(const_string mov_reg))
+  (if_then_else (match_operand 1 const_int_operand 
)
+(const_string mov_imm)
+(const_string mov_reg))
+  (const_string multiple)])
(set_attr length 4,4,8)]
 )
 
@@ -9449,8 +9456,8 @@
 (const_string alu_imm )
 (const_string alu_sreg))
   (const_string alu_imm)
-  (const_string alu_sreg)
-  (const_string alu_sreg)])]
+  (const_string multiple)
+  (const_string multiple)])]
 )
 
 (define_insn *ifcompare_move_plus
@@ -9487,7 +9494,13 @@
sub%D4\\t%0, %2, #%n3\;mov%d4\\t%0, %1
   [(set_attr conds use)
(set_attr length 4,4,8,8)
-   (set_attr type alu_sreg,alu_imm,multiple,multiple)]
+   (set_attr_alternative type
+ [(if_then_else (match_operand 3 const_int_operand 
)
+(const_string alu_imm )
+(const_string alu_sreg))
+  (const_string alu_imm)
+  (const_string multiple)
+  (const_string multiple)])]
 )
 
 (define_insn *ifcompare_arith_arith
@@ -9582,7 

[PATCH][AArch64] Improve spill code - swap order in shl pattern

2015-04-27 Thread Wilco Dijkstra
Various instructions are supported as integer operations as well as SIMD on 
AArch64. When register
pressure is high, lra-constraints inserts spill code without taking the 
allocation class into
account, and basically chooses the first available pattern that matches. Since 
this instruction has
the SIMD version first it is usually chosen eventhough some of the operands are 
eventually allocated
to integer registers. The result is inefficient code not only due to the higher 
latency of SIMD
instructions but also due to the extra int-FP moves. Placing the integer 
variant first in the shl
pattern generates far more optimal spill code. A few more patterns are the 
wrong way around, which
I'll address in a separate patch. I'm also looking into fixing lra-constraints 
to generate the
expected code by taking the allocno class into account in the cost calculations 
during spilling.

2015-04-27  Wilco Dijkstra  wdijk...@arm.com

* gcc/config/aarch64/aarch64.md (aarch64_ashl_sisd_or_int_mode3):
Place integer variant first.

---
 gcc/config/aarch64/aarch64.md | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 7163025..baef56a 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3334,17 +3334,17 @@
 
 ;; Logical left shift using SISD or Integer instruction
 (define_insn *aarch64_ashl_sisd_or_int_mode3
-  [(set (match_operand:GPI 0 register_operand =w,w,r)
+  [(set (match_operand:GPI 0 register_operand =r,w,w)
 (ashift:GPI
-  (match_operand:GPI 1 register_operand w,w,r)
-  (match_operand:QI 2 aarch64_reg_or_shift_imm_mode 
Uscmode,w,rUscmode)))]
+  (match_operand:GPI 1 register_operand r,w,w)
+  (match_operand:QI 2 aarch64_reg_or_shift_imm_mode 
rUscmode,Uscmode,w)))]
   
   @
+   lsl\t%w0, %w1, %w2
shl\t%rtn0vas, %rtn1vas, %2
-   ushl\t%rtn0vas, %rtn1vas, %rtn2vas
-   lsl\t%w0, %w1, %w2
-  [(set_attr simd yes,yes,no)
-   (set_attr type neon_shift_immq, neon_shift_regq,shift_reg)]
+   ushl\t%rtn0vas, %rtn1vas, %rtn2vas
+  [(set_attr simd no,yes,yes)
+   (set_attr type shift_reg,neon_shift_immq, neon_shift_regq)]
 )
 
 ;; Logical right shift using SISD or Integer instruction
-- 
1.9.1




Re: [PATCH][PR65818][bootstrap,hppa] Return side-effect free result in gimplify_va_arg_internal

2015-04-27 Thread Richard Biener
On Mon, Apr 27, 2015 at 3:06 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 27-04-15 10:17, Richard Biener wrote:

 This patch fixes that by gimplifying the address expression of the
 mem-ref
 returned by the target hook (borrowing code from gimplify_expr, case
 MEM_REF).
 
 Bootstrapped and reg-tested on x86_64.
 
 Bootstrapped and reg-tested on hppa2.0w-hp-hpux11.11.
 
 OK for trunk?

 Hmm, that assert looks suspicious...

 Can't you simply always do

gimplify_expr (expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);


 It's a bit counter-intuitive for me, using is_gimple_lvalue for something
 (the result of va_arg) we use as rvalue.

Yeah, choosing that was done because you could assert it's a MEM_REF
and tree-stdarg eventually builds a WITH_SIZE_EXPR around it.

It would possibly be easier to simply inline gimplify_va_arg_internal
at its use and only gimplify the assignment.  Though in that case the
original code probably worked - just in the lhs == NULL case it didn't,
which hints at a better place for the fix - in expand_ifn_va_arg_1 do

 if (lhs != NULL_TREE)
   {
...
   }
 else
   gimplify_expr (expr, pre, post, is_gimple_val, fb_either);

So ... if you can re-try that one it's pre-approved.

Thanks,
Richard.

 But, it seems to work (with  in front of expr).

 OK if bootstrap and reg-test on x86_64 succeeds?

 Thanks,
 - Tom



RE: [PATCH][AArch64] Fix Cortex-A53 shift costs

2015-04-27 Thread Wilco Dijkstra
ping

 -Original Message-
 From: Wilco Dijkstra [mailto:wdijk...@arm.com]
 Sent: 05 March 2015 14:49
 To: gcc-patches@gcc.gnu.org
 Subject: [PATCH][AArch64] Fix Cortex-A53 shift costs
 
 This patch fixes the shift costs for Cortex-A53 so they are more accurate - 
 immediate shifts
 use SBFM/UBFM which takes 2 cycles, register controlled shifts take 1 cycle. 
 Bootstrap and
 regression OK.
 
 ChangeLog:
 2015-03-05  Wilco Dijkstra  wdijk...@arm.com
 
   * gcc/config/arm/aarch-cost-tables.h (cortexa53_extra_costs):
   Make Cortex-A53 shift costs more accurate.
 
 ---
  gcc/config/arm/aarch-cost-tables.h | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/gcc/config/arm/aarch-cost-tables.h 
 b/gcc/config/arm/aarch-cost-tables.h
 index 05e96a9..6bb8ede 100644
 --- a/gcc/config/arm/aarch-cost-tables.h
 +++ b/gcc/config/arm/aarch-cost-tables.h
 @@ -130,12 +130,12 @@ const struct cpu_cost_table cortexa53_extra_costs =
  0,   /* arith.  */
  0,   /* logical.  */
  COSTS_N_INSNS (1),   /* shift.  */
 -COSTS_N_INSNS (2),   /* shift_reg.  */
 +0,   /* shift_reg.  */
  COSTS_N_INSNS (1),   /* arith_shift.  */
 -COSTS_N_INSNS (2),   /* arith_shift_reg.  */
 +COSTS_N_INSNS (1),   /* arith_shift_reg.  */
  COSTS_N_INSNS (1),   /* log_shift.  */
 -COSTS_N_INSNS (2),   /* log_shift_reg.  */
 -0,   /* extend.  */
 +COSTS_N_INSNS (1),   /* log_shift_reg.  */
 +COSTS_N_INSNS (1),   /* extend.  */
  COSTS_N_INSNS (1),   /* extend_arith.  */
  COSTS_N_INSNS (1),   /* bfi.  */
  COSTS_N_INSNS (1),   /* bfx.  */
 --
 1.9.1




Re: [PATCH 5/13] microblaze musl support

2015-04-27 Thread Szabolcs Nagy

On 20/04/15 19:54, Szabolcs Nagy wrote:
 Set up dynamic linker name for microblaze.
 

Patch v2.
(undef MUSL_DYNAMIC_LINKER that comes from config/linux.h)

gcc/Changelog:

2015-04-24  Gregor Richards  gregor.richa...@uwaterloo.ca

* config/microblaze/linux.h (MUSL_DYNAMIC_LINKER): Define.
(DYNAMIC_LINKER): Change.
diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h
index a7faa7d..3e08138 100644
--- a/gcc/config/microblaze/linux.h
+++ b/gcc/config/microblaze/linux.h
@@ -25,7 +25,22 @@
 #undef TLS_NEEDS_GOT
 #define TLS_NEEDS_GOT 1
 
-#define DYNAMIC_LINKER /lib/ld.so.1
+#if TARGET_BIG_ENDIAN_DEFAULT == 0 /* LE */
+#define MUSL_DYNAMIC_LINKER_E %{mbig-endian:;:el}
+#else
+#define MUSL_DYNAMIC_LINKER_E %{mlittle-endian:el}
+#endif
+
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER /lib/ld-musl-microblaze MUSL_DYNAMIC_LINKER_E .so.1
+#define GLIBC_DYNAMIC_LINKER /lib/ld.so.1
+
+#if DEFAULT_LIBC == LIBC_MUSL
+#define DYNAMIC_LINKER MUSL_DYNAMIC_LINKER
+#else
+#define DYNAMIC_LINKER GLIBC_DYNAMIC_LINKER
+#endif
+
 #undef  SUBTARGET_EXTRA_SPECS
 #define SUBTARGET_EXTRA_SPECS \
   { dynamic_linker, DYNAMIC_LINKER }


Re: [PATCH 6/13] mips musl support

2015-04-27 Thread Szabolcs Nagy


On 21/04/15 15:59, Matthew Fortune wrote:
 Rich Felker dal...@libc.org writes:
 On Tue, Apr 21, 2015 at 01:58:02PM +, Matthew Fortune wrote:
 There does however appear to be both soft and hard float variants

Patch v2.

Now all the ABI variants musl plans to support are represented.

gcc/Changelog:

2015-04-27  Gregor Richards  gregor.richa...@uwaterloo.ca
Szabolcs Nagy  szabolcs.n...@arm.com

* config/mips/linux.h (MUSL_DYNAMIC_LINKER32): Define.
(MUSL_DYNAMIC_LINKER64, MUSL_DYNAMIC_LINKERN32): Define.
(GNU_USER_DYNAMIC_LINKERN32): Update.
diff --git a/gcc/config/mips/linux.h b/gcc/config/mips/linux.h
index 91df261..6038498 100644
--- a/gcc/config/mips/linux.h
+++ b/gcc/config/mips/linux.h
@@ -37,7 +37,13 @@ along with GCC; see the file COPYING3.  If not see
 #define UCLIBC_DYNAMIC_LINKERN32 \
   %{mnan=2008:/lib32/ld-uClibc-mipsn8.so.0;:/lib32/ld-uClibc.so.0}
 
+#undef MUSL_DYNAMIC_LINKER32
+#define MUSL_DYNAMIC_LINKER32 /lib/ld-musl-mips%{EL:el}%{msoft-float:-sf}.so.1
+#undef MUSL_DYNAMIC_LINKER64
+#define MUSL_DYNAMIC_LINKER64 /lib/ld-musl-mips64%{EL:el}.so.1
+#define MUSL_DYNAMIC_LINKERN32 /lib/ld-musl-mipsn32%{EL:el}.so.1
+
 #define BIONIC_DYNAMIC_LINKERN32 /system/bin/linker32
 #define GNU_USER_DYNAMIC_LINKERN32 \
   CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKERN32, UCLIBC_DYNAMIC_LINKERN32, \
- BIONIC_DYNAMIC_LINKERN32)
+ BIONIC_DYNAMIC_LINKERN32, MUSL_DYNAMIC_LINKERN32)


Re: [PATCH][PR65818][bootstrap,hppa] Return side-effect free result in gimplify_va_arg_internal

2015-04-27 Thread Tom de Vries

On 27-04-15 15:40, Richard Biener wrote:

On Mon, Apr 27, 2015 at 3:06 PM, Tom de Vries tom_devr...@mentor.com wrote:

On 27-04-15 10:17, Richard Biener wrote:


This patch fixes that by gimplifying the address expression of the
mem-ref

returned by the target hook (borrowing code from gimplify_expr, case
MEM_REF).

Bootstrapped and reg-tested on x86_64.

Bootstrapped and reg-tested on hppa2.0w-hp-hpux11.11.

OK for trunk?


Hmm, that assert looks suspicious...

Can't you simply always do

gimplify_expr (expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);



It's a bit counter-intuitive for me, using is_gimple_lvalue for something
(the result of va_arg) we use as rvalue.


Yeah, choosing that was done because you could assert it's a MEM_REF
and tree-stdarg eventually builds a WITH_SIZE_EXPR around it.

It would possibly be easier to simply inline gimplify_va_arg_internal
at its use and only gimplify the assignment.  Though in that case the
original code probably worked - just in the lhs == NULL case it didn't,
which hints at a better place for the fix - in expand_ifn_va_arg_1 do

  if (lhs != NULL_TREE)
{
...
}
  else
gimplify_expr (expr, pre, post, is_gimple_val, fb_either);

So ... if you can re-try that one it's pre-approved.



Using that proposed patch, we run into the following problem.

Consider this testcase with ignored-result-va_arg:
...
#include stdarg.h

void
f2 (int i, ...)
{
  va_list ap;
  va_start (ap, i);
  va_arg (ap, long);
  va_end (ap);
}
...

after gimplify_va_arg_internal we have:
...
(gdb) call debug_generic_expr (expr)
*(long int * {ref-all}) addr.2
...

In a bit more detail:
...
(gdb) call debug_tree (expr)
 mem_ref 0x765ea1b8
type integer_type 0x764a77e0 long int DI
size integer_cst 0x764a3ca8 constant 64
unit size integer_cst 0x764a3cc0 constant 8
align 64 symtab 0 alias set -1 canonical type 0x764a77e0 precision 
64 min integer_cst 0x764a3f30 -9223372036854775808 max integer_cst 
0x764a3f48 9223372036854775807

pointer_to_this pointer_type 0x765e0498

arg 0 nop_expr 0x765cbb60
type pointer_type 0x765e0498 type integer_type 0x764a77e0 
long int
public static unsigned DI size integer_cst 0x764a3ca8 64 unit 
size integer_cst 0x764a3cc0 8

align 64 symtab 0 alias set -1 canonical type 0x765e0498

arg 0 var_decl 0x765e32d0 addr.2 type pointer_type 0x764c2150
used unsigned ignored DI file stdarg-1.c line 4 col 1 size 
integer_cst 0x764a3ca8 64 unit size integer_cst 0x764a3cc0 8

align 64 context function_decl 0x765c21b0 f2
arg 1 integer_cst 0x765e64e0 type pointer_type 0x765e0498 
constant 0

...

During 'gimplify_expr (expr, pre, post, is_gimple_val, fb_either)' we ICE 
here:
...
8886  gcc_assert ((*gimple_test_f) (*expr_p));
...

At this point expr is:
...
(gdb) call debug_generic_expr (*expr_p)
*addr.2
...

In more detail:
...
(gdb) call debug_tree (*expr_p)
 mem_ref 0x765ea1e0
type void_type 0x764c2000 void VOID
align 8 symtab 0 alias set -1 canonical type 0x764c2000
pointer_to_this pointer_type 0x764c2150

arg 0 var_decl 0x765e32d0 addr.2
type pointer_type 0x764c2150 type void_type 0x764c2000 void
sizes-gimplified public unsigned DI
size integer_cst 0x764a3ca8 constant 64
unit size integer_cst 0x764a3cc0 constant 8
align 64 symtab 0 alias set -1 canonical type 0x764c2150
pointer_to_this pointer_type 0x764c9bd0
used unsigned ignored DI file stdarg-1.c line 4 col 1 size integer_cst 
0x764a3ca8 64 unit size integer_cst 0x764a3cc0 8

align 64 context function_decl 0x765c21b0 f2
arg 1 integer_cst 0x764c10c0 type pointer_type 0x764c2150 
constant 0

...

The memref is now VOID type. And that's not a gimple_val:
...
(gdb) p is_gimple_val (*expr_p)
$1 = false
...

Attached patch uses is_gimple_lvalue, but in expand_ifn_va_arg_1.

I'll bootstrap and reg-test on x86_64 and commit, unless further comments of 
course.

Thanks,
- Tom
Evaluate side-effects in expand_ifn_va_arg_1

2015-04-27  Tom de Vries  t...@codesourcery.com

	PR tree-optimization/65818
	* tree-stdarg.c (expand_ifn_va_arg_1): Ensure that side-effects are
	evaluated.
---
 gcc/tree-stdarg.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/tree-stdarg.c b/gcc/tree-stdarg.c
index 16a9e2c..1356374 100644
--- a/gcc/tree-stdarg.c
+++ b/gcc/tree-stdarg.c
@@ -1078,6 +1078,8 @@ expand_ifn_va_arg_1 (function *fun)
 	   types.  */
 	gimplify_assign (lhs, expr, pre);
 	  }
+	else
+	  gimplify_expr (expr, pre, post, is_gimple_lvalue, fb_lvalue);
 
 	pop_gimplify_context (NULL);
 
-- 
1.9.1



Re: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

2015-04-27 Thread Kyrill Tkachov


On 27/04/15 14:53, Kyrill Tkachov wrote:

Hi Wilco,

On 27/04/15 14:43, Wilco Dijkstra wrote:

ping


-Original Message-
From: Wilco Dijkstra [mailto:wdijk...@arm.com]
Sent: 04 March 2015 15:38
To: GCC Patches
Subject: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

Include the cost of op0 and op1 in all cases in PLUS and MINUS in 
aarch64_rtx_costs.
Bootstrap  regression OK.

ChangeLog:
2015-03-04  Wilco Dijkstra  wdijk...@arm.com

  * gcc/config/aarch64/aarch64.c (aarch64_rtx_costs):
  Calculate cost of op0 and op1 in PLUS and MINUS cases.

---
   gcc/config/aarch64/aarch64.c | 12 +---
   1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 39921a7..e22d72e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5794,6 +5794,8 @@ aarch64_rtx_costs (rtx x, int code, int outer 
ATTRIBUTE_UNUSED,
op1 = XEXP (x, 1);

   cost_minus:
+   *cost += rtx_cost (op0, MINUS, 0, speed);
+
/* Detect valid immediates.  */
if ((GET_MODE_CLASS (mode) == MODE_INT
 || (GET_MODE_CLASS (mode) == MODE_CC
@@ -5801,13 +5803,10 @@ cost_minus:
 CONST_INT_P (op1)
 aarch64_uimm12_shift (INTVAL (op1)))
  {
-   *cost += rtx_cost (op0, MINUS, 0, speed);
-
if (speed)
  /* SUB(S) (immediate).  */
  *cost += extra_cost-alu.arith;
return true;
-
  }

/* Look for SUB (extended register).  */
@@ -5832,7 +5831,6 @@ cost_minus:
*cost += aarch64_rtx_mult_cost (new_op1, MULT,
(enum rtx_code) code,
speed);
-   *cost += rtx_cost (op0, MINUS, 0, speed);
return true;
  }

@@ -5879,6 +5877,8 @@ cost_plus:
return true;
  }

+   *cost += rtx_cost (op1, PLUS, 1, speed);
+

I don't think this is correct. In the code directly below
(when the aarch64_rtx_arith_op_extract_p condition is true)
we have a shift/extend operation by a constant, so we don't
want to take into account the cost of operand 1 (which is
the extend+shift rtx).

So, I looked at this code myself recently and found that the
wrong logic is addressed by my patch at:
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01617.html
which fixes the case where we forget to cost operand0 in the
MINUS case.


Never mind, I misread the code. The case is indeed
missing the costing of operand 1, so your patch is
a superset of mine. I think it's correct, but I can't
approve. If this goes in, then I withdraw my patch at:

https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01617.html


Sorry for the confusion,
Kyrill




What do you think?
Kyrill


/* Look for ADD (extended register).  */
   if (aarch64_rtx_arith_op_extract_p (op0, mode))
  {
@@ -5900,12 +5900,10 @@ cost_plus:
  {
*cost += aarch64_rtx_mult_cost (new_op0, MULT, PLUS,
speed);
-   *cost += rtx_cost (op1, PLUS, 1, speed);
return true;
  }

-   *cost += (rtx_cost (new_op0, PLUS, 0, speed)
- + rtx_cost (op1, PLUS, 1, speed));
+   *cost += rtx_cost (new_op0, PLUS, 0, speed);

if (speed)
  {
--
1.9.1




Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-27 Thread Marek Polacek
On Mon, Apr 27, 2015 at 04:47:12PM +0200, Marek Polacek wrote:
 On Sat, Apr 25, 2015 at 10:18:59PM +0200, Gerald Pfeifer wrote:
  In case this example feels too contrived (even though it is an
  excerpt of Wine code), we now also warn about the following where
  the two types and variables are defined in different places and
  the size of one is set implicitly:
  
typedef int r_fun_t (int);
  
r_fun_t * text_funcs[] = {0,0,0};
  
int report (unsigned t)
{
  typedef int s_fun_t (long, char);
  
  static s_fun_t * GUI_funcs[3];
  
  return (t  sizeof text_funcs / sizeof text_funcs[0] 
  t  sizeof GUI_funcs / sizeof GUI_funcs[0]);
}
  
  (I also filed this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65891
  so that we keep track.)
 
 I'm afraid there isn't an easy solution to this; the problem is that we fold
 sizeof early, so the warning sees 
 
   t  sizeof 4  t  4
 
 and warns.  Maybe using SIZEOF_EXPR would help...

That said, this isn't something that regressed with my patch.  Consider e.g.:

struct S { int i; };
struct T { int i; };
int
foo (unsigned int t)
{
  return t  sizeof (struct S)  t  sizeof (struct T);
}

with -Wlogical-op:
w2.c: In function ‘foo’:
w2.c:6:32: warning: logical ‘and’ of mutually exclusive tests is always false 
[-Wlogical-op]
   return t  sizeof (struct S)  t  sizeof (struct T);
^


Re: [PATCH, ARM] Alternatives and type attributes fixes.

2015-04-27 Thread Yvan Roux
On 27 April 2015 at 15:20, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi Yvan,


 On 27/04/15 13:25, Yvan Roux wrote:

 Hi,

 This is a follow-up of PR64208 where an LRA loop was due to redundancy
 in insn's alternatives.  I've checked all the insns in the arm backend
 to avoid latent problems and this patch fixes the issues I've spotted.

 Only thumb2_movsicc_insn contained duplicated alternatives, and the
 rest of the changes are related to the type attribute, which were not
 accurate or used accordingly to their definition.  Notice that the
 modifications have only a limited impact as in most of the pipeline
 descriptions, multiple/mov_reg/alu_reg/.. types are used the same way,
 only cortex a8 seems to have a real difference between alu or multiple
 instruction and mov.

 Bootstrapped and regtested on arm-linux-gnueabihf.  Ok for trunk ?

 Thanks,
 Yvan

 2015-04-27  Yvan Rouxyvan.r...@linaro.org

  * config/arm/arm.mb (*arm_movt): Fix type attribute.
  (*movsi_compare0): Likewise.
  (*cmpsi_shiftsi): Likewise.
  (*cmpsi_shiftsi_swp): Likewise.
  (*movsicc_insn): Likewise.
  (*cond_move): Likewise.
  (*if_plus_move): Likewise.
  (*if_move_plus): Likewise.
  (*if_arith_move): Likewise.
  (*if_move_arith): Likewise.
  (*if_shift_move): Likewise.
  (*if_move_shift): Likewise.
  * config/arm/thumb2.md (*thumb2_movsi_insn): Fix type attribute.
  (*thumb2_movsicc_insn): Fix alternative redundancy.
  (*thumb2_addsi_short): Split register and immediate alternatives.
  (thumb2_addsi3_compare0): Likewise.

 insn.diff


 diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
 index 164ac13..ee23deb 100644
 --- a/gcc/config/arm/arm.md
 +++ b/gcc/config/arm/arm.md
 @@ -5631,7 +5631,7 @@
 [(set_attr predicable yes)
  (set_attr predicable_short_it no)
  (set_attr length 4)
 -   (set_attr type mov_imm)]
 +   (set_attr type alu_sreg)]
   )


 For some context, this is the *arm_movt pattern that generates
 a movt instruction. The documentation in types.md says:
 This excludes MOV and MVN but includes MOVT. So using alu_sreg
 is correct according to that logic.
 However, don't you want to also update the arm_movtas_ze pattern
 that generates movt but erroneously has the type 'mov_imm'?

Ha, yes I missed this one ! I'll will update it.

 (define_insn *arm_movsi_insn
 @@ -5919,7 +5919,7 @@
  cmp%?\\t%0, #0
  sub%.\\t%0, %1, #0
 [(set_attr conds set)
 -   (set_attr type alus_imm,alus_imm)]
 +   (set_attr type alus_sreg,alus_sreg)]
   )


 Why change that? They are instructions with immediate operands
 so alus_imm should be gorrect?

Oups, I certainly misread #0 and %0 this one is correct.

 @@ -486,12 +486,12 @@
   )
 (define_insn_and_split *thumb2_movsicc_insn
 -  [(set (match_operand:SI 0 s_register_operand
 =l,l,r,r,r,r,r,r,r,r,r)
 +  [(set (match_operand:SI 0 s_register_operand
 =l,l,r,r,r,r,r,r,r,r,r,r)
 (if_then_else:SI
  (match_operator 3 arm_comparison_operator
   [(match_operand 4 cc_register ) (const_int 0)])
 -(match_operand:SI 1 arm_not_operand 0 ,lPy,0 ,0,rI,K,rI,rI,K
 ,K,r)
 -(match_operand:SI 2 arm_not_operand lPy,0 ,rI,K,0 ,0,rI,K
 ,rI,K,r)))]
 +(match_operand:SI 1 arm_not_operand 0 ,lPy,0 ,0,rI,K,I ,r,rI,K
 ,K,r)
 +(match_operand:SI 2 arm_not_operand lPy,0 ,rI,K,0 ,0,rI,I,K
 ,rI,K,r)))]
 TARGET_THUMB2
 @
  it\\t%D3\;mov%D3\\t%0, %2
 @@ -504,12 +504,14 @@
  #
  #
  #
 +   #
  #
  ; alt 6: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
 -   ; alt 7: ite\\t%d3\;mov%d3\\t%0, %1\;mvn%D3\\t%0, #%B2
 -   ; alt 8: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mov%D3\\t%0, %2
 -   ; alt 9: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mvn%D3\\t%0, #%B2
 -   ; alt 10: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
 +   ; alt 7: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
 +   ; alt 8: ite\\t%d3\;mov%d3\\t%0, %1\;mvn%D3\\t%0, #%B2
 +   ; alt 9: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mov%D3\\t%0, %2
 +   ; alt 10: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mvn%D3\\t%0, #%B2
 +   ; alt 11: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
  reload_completed
 [(const_int 0)]
 {
 @@ -540,8 +542,8 @@
  operands[2])));
   DONE;
 }
 -  [(set_attr length 4,4,6,6,6,6,10,10,10,10,6)
 -   (set_attr enabled_for_depr_it yes,yes,no,no,no,no,no,no,no,no,yes)
 +  [(set_attr length 4,4,6,6,6,6,10,8,10,10,10,6)
 +   (set_attr enabled_for_depr_it
 yes,yes,no,no,no,no,no,no,no,no,no,yes)
  (set_attr conds use)
  (set_attr type multiple)]
   )


 So, I think here for the first 6 alternatives we can give it a more specific
 type,
 like mov_immm or mov_reg, since you're cleaning this stuff up. The
 instructions in
 those alternatives are just a mov instruction enclosed in an IT block, so
 they always
 have to stick together anyway.

OK I'll change that.

 @@ -1161,9 +1163,9 @@
   )
 (define_insn *thumb2_addsi_short
 -  [(set 

Re: [PATCH] [AArch32] Additional bics patterns.

2015-04-27 Thread Alex Velenko

On 24/04/15 16:41, Alex Velenko wrote:

Hi,

This patch adds rtl patterns to generate bics instructions with shift.

Added attribute predicable_short_it since last respin.

Done full regression run on arm-none-eabi and arm-none-gnueabihf.
Bootstrapped on arm-none-gnueabihf.

Is this patch ok?

gcc/config

2015-04-24  Alex Velenko  alex.vele...@arm.com

  * arm/arm.md (andsi_not_shiftsi_si_scc): New pattern.
  * (andsi_not_shiftsi_si_scc_no_reuse): New pattern.

gcc/testsuite

2015-04-24  Alex Velenko alex.vele...@arm.com

  * gcc.target/arm/bics_1.c : New testcase.
  * gcc.target/arm/bics_2.c : New testcase.
  * gcc.target/arm/bics_3.c : New testcase.
  * gcc.target/arm/bics_4.c : New testcase.
---
 gcc/config/arm/arm.md | 44 +++
 gcc/testsuite/gcc.target/arm/bics_1.c | 54 
+
 gcc/testsuite/gcc.target/arm/bics_2.c | 57 
+++

 gcc/testsuite/gcc.target/arm/bics_3.c | 41 +
 gcc/testsuite/gcc.target/arm/bics_4.c | 49 
++

 5 files changed, 245 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/bics_1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/bics_2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/bics_3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/bics_4.c

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 164ac13..9e774c1 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2768,6 +2768,50 @@
   (const_string logic_shift_reg)))]
 )

+(define_insn andsi_not_shiftsi_si_scc_no_reuse
+  [(set (reg:CC_NOOV CC_REGNUM)
+(compare:CC_NOOV
+(and:SI (not:SI (match_operator:SI 0 shift_operator
+[(match_operand:SI 1 s_register_operand r)
+ (match_operand:SI 2 arm_rhs_operand rM)]))
+(match_operand:SI 3 s_register_operand r))
+(const_int 0)))
+   (clobber (match_scratch:SI 4 =r))]
+  TARGET_32BIT
+  bic%.%?\\t%4, %3, %1%S0
+  [(set_attr predicable yes)
+   (set_attr predicable_short_it no)
+   (set_attr conds set)
+   (set_attr shift 1)
+   (set (attr type) (if_then_else (match_operand 2 
const_int_operand )

+  (const_string logic_shift_imm)
+  (const_string logic_shift_reg)))]
+)
+
+(define_insn andsi_not_shiftsi_si_scc
+  [(parallel [(set (reg:CC_NOOV CC_REGNUM)
+(compare:CC_NOOV
+(and:SI (not:SI (match_operator:SI 0 shift_operator
+[(match_operand:SI 1 s_register_operand r)
+ (match_operand:SI 2 arm_rhs_operand rM)]))
+(match_operand:SI 3 s_register_operand r))
+(const_int 0)))
+(set (match_operand:SI 4 s_register_operand =r)
+ (and:SI (not:SI (match_op_dup 0
+ [(match_dup 1)
+  (match_dup 2)]))
+ (match_dup 3)))])]
+  TARGET_32BIT
+  bic%.%?\\t%4, %3, %1%S0
+  [(set_attr predicable yes)
+   (set_attr predicable_short_it no)
+   (set_attr conds set)
+   (set_attr shift 1)
+   (set (attr type) (if_then_else (match_operand 2 
const_int_operand )

+  (const_string logic_shift_imm)
+  (const_string logic_shift_reg)))]
+)
+
 (define_insn *andsi_notsi_si_compare0
   [(set (reg:CC_NOOV CC_REGNUM)
 (compare:CC_NOOV
diff --git a/gcc/testsuite/gcc.target/arm/bics_1.c 
b/gcc/testsuite/gcc.target/arm/bics_1.c

new file mode 100644
index 000..173eb89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/bics_1.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-options -O2 --save-temps -fno-inline } */
+/* { dg-require-effective-target arm32 } */
+
+extern void abort (void);
+
+int
+bics_si_test1 (int a, int b, int c)
+{
+  int d = a  ~b;
+
+  /* { dg-final { scan-assembler-times bics\tr\[0-9\]+, r\[0-9\]+, 
r\[0-9\]+ 2 } } */

+  if (d == 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+bics_si_test2 (int a, int b, int c)
+{
+  int d = a  ~(b  3);
+
+  /* { dg-final { scan-assembler-times bics\tr\[0-9\]+, r\[0-9\]+, 
r\[0-9\]+, .sl \#3 1 } } */

+  if (d == 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+main ()
+{
+  int x;
+
+  x = bics_si_test1 (29, ~4, 5);
+  if (x != ((29  4) + ~4 + 5))
+abort ();
+
+  x = bics_si_test1 (5, ~2, 20);
+  if (x != 25)
+abort ();
+
+x = bics_si_test2 (35, ~4, 5);
+  if (x != ((35  ~(~4  3)) + ~4 + 5))
+abort ();
+
+  x = bics_si_test2 (96, ~2, 20);
+  if (x != 116)
+  abort ();
+
+  return 0;
+}
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/arm/bics_2.c 
b/gcc/testsuite/gcc.target/arm/bics_2.c

new file mode 100644
index 000..740d7c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/bics_2.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-options -O2 --save-temps -fno-inline } */
+/* { dg-require-effective-target arm32 } */
+
+extern void abort (void);
+
+int
+bics_si_test1 (int a, int b, int c)
+{
+  int d = a  ~b;
+
+  /* { dg-final { scan-assembler-not bics\tr\[0-9\]+, r\[0-9\]+, 

Re: [PATCH 2/13] musl libc config

2015-04-27 Thread Szabolcs Nagy

On 22/04/15 16:42, Szabolcs Nagy wrote:
 yes, i didn't realize that this depends on the target specific parts
 
 i will prepare an updated patch that works if the target has no musl
 dynamic linker name defined
 

Updated the patch, this time I tested it separately.
(The test revealed that -mmusl was not wired up properly
in linux.opt through the Negative directive, now fixed).

A default dynamic linker name is defined for musl: /dev/null.
(The arch specific headers have to be updated to undef the
MUSL_DYNAMIC_LINKER* macros).
(Other options were considered such as print an error with %e,
but then the user cannot override it with -Wl,-dynamic-linker)

The other parts of the patch (config, include path reordering)
are not changed.

gcc/Changelog

2015-04-27  Gregor Richards  gregor.richa...@uwaterloo.ca
Szabolcs Nagy  szabolcs.n...@arm.com

* config.gcc (LIBC_MUSL): New tm_defines macro.
* config/linux.h (OPTION_MUSL): Define.
(MUSL_DYNAMIC_LINKER, MUSL_DYNAMIC_LINKER32,)
(MUSL_DYNAMIC_LINKER64, MUSL_DYNAMIC_LINKERX32,)
(INCLUDE_DEFAULTS_MUSL_GPP, INCLUDE_DEFAULTS_MUSL_LOCAL,)
(INCLUDE_DEFAULTS_MUSL_PREFIX, INCLUDE_DEFAULTS_MUSL_CROSS,)
(INCLUDE_DEFAULTS_MUSL_TOOL, INCLUDE_DEFAULTS_MUSL_NATIVE): Define.

* config/linux.opt (mmusl): New option.
* configure.ac (gcc_cv_libc_provides_ssp): Add *-*-musl*.
(gcc_cv_target_dl_iterate_phdr): Add *-linux-musl*.

* configure: Regenerate.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index a1df043..0176ca2 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -575,7 +575,7 @@ case ${target} in
 esac
 
 # Common C libraries.
-tm_defines=$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3
+tm_defines=$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3 LIBC_MUSL=4
 
 # 32-bit x86 processors supported by --with-arch=.  Each processor
 # MUST be separated by exactly one space.
@@ -725,6 +725,9 @@ case ${target} in
 *-*-*uclibc*)
   tm_defines=$tm_defines DEFAULT_LIBC=LIBC_UCLIBC
   ;;
+*-*-*musl*)
+  tm_defines=$tm_defines DEFAULT_LIBC=LIBC_MUSL
+  ;;
 *)
   tm_defines=$tm_defines DEFAULT_LIBC=LIBC_GLIBC
   ;;
diff --git a/gcc/config/linux.h b/gcc/config/linux.h
index 857389a..b551c56 100644
--- a/gcc/config/linux.h
+++ b/gcc/config/linux.h
@@ -32,10 +32,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define OPTION_GLIBC  (DEFAULT_LIBC == LIBC_GLIBC)
 #define OPTION_UCLIBC (DEFAULT_LIBC == LIBC_UCLIBC)
 #define OPTION_BIONIC (DEFAULT_LIBC == LIBC_BIONIC)
+#define OPTION_MUSL   (DEFAULT_LIBC == LIBC_MUSL)
 #else
 #define OPTION_GLIBC  (linux_libc == LIBC_GLIBC)
 #define OPTION_UCLIBC (linux_libc == LIBC_UCLIBC)
 #define OPTION_BIONIC (linux_libc == LIBC_BIONIC)
+#define OPTION_MUSL   (linux_libc == LIBC_MUSL)
 #endif
 
 #define GNU_USER_TARGET_OS_CPP_BUILTINS()			\
@@ -50,21 +52,25 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 } while (0)
 
 /* Determine which dynamic linker to use depending on whether GLIBC or
-   uClibc or Bionic is the default C library and whether
-   -muclibc or -mglibc or -mbionic has been passed to change the default.  */
+   uClibc or Bionic or musl is the default C library and whether
+   -muclibc or -mglibc or -mbionic or -mmusl has been passed to change
+   the default.  */
 
-#define CHOOSE_DYNAMIC_LINKER1(LIBC1, LIBC2, LIBC3, LD1, LD2, LD3)	\
-  %{ LIBC2 : LD2 ;:%{ LIBC3 : LD3 ;: LD1 }}
+#define CHOOSE_DYNAMIC_LINKER1(LIBC1, LIBC2, LIBC3, LIBC4, LD1, LD2, LD3, LD4)	\
+  %{ LIBC2 : LD2 ;:%{ LIBC3 : LD3 ;:%{ LIBC4 : LD4 ;: LD1 }}}
 
 #if DEFAULT_LIBC == LIBC_GLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U, B) \
-  CHOOSE_DYNAMIC_LINKER1 (mglibc, muclibc, mbionic, G, U, B)
+#define CHOOSE_DYNAMIC_LINKER(G, U, B, M) \
+  CHOOSE_DYNAMIC_LINKER1 (mglibc, muclibc, mbionic, mmusl, G, U, B, M)
 #elif DEFAULT_LIBC == LIBC_UCLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U, B) \
-  CHOOSE_DYNAMIC_LINKER1 (muclibc, mglibc, mbionic, U, G, B)
+#define CHOOSE_DYNAMIC_LINKER(G, U, B, M) \
+  CHOOSE_DYNAMIC_LINKER1 (muclibc, mglibc, mbionic, mmusl, U, G, B, M)
 #elif DEFAULT_LIBC == LIBC_BIONIC
-#define CHOOSE_DYNAMIC_LINKER(G, U, B) \
-  CHOOSE_DYNAMIC_LINKER1 (mbionic, mglibc, muclibc, B, G, U)
+#define CHOOSE_DYNAMIC_LINKER(G, U, B, M) \
+  CHOOSE_DYNAMIC_LINKER1 (mbionic, mglibc, muclibc, mmusl, B, G, U, M)
+#elif DEFAULT_LIBC == LIBC_MUSL
+#define CHOOSE_DYNAMIC_LINKER(G, U, B, M) \
+  CHOOSE_DYNAMIC_LINKER1 (mmusl, mglibc, muclibc, mbionic, M, G, U, B)
 #else
 #error Unsupported DEFAULT_LIBC
 #endif /* DEFAULT_LIBC */
@@ -81,24 +87,100 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define BIONIC_DYNAMIC_LINKER32 /system/bin/linker
 #define BIONIC_DYNAMIC_LINKER64 /system/bin/linker64
 #define BIONIC_DYNAMIC_LINKERX32 /system/bin/linkerx32
+/* Should be redefined for each target that supports musl.  */
+#define MUSL_DYNAMIC_LINKER /dev/null

Re: [PATCH][combine][obvious] Use std::swap instead of manually swapping

2015-04-27 Thread Segher Boessenkool
On Mon, Apr 27, 2015 at 10:55:17AM +0100, Kyrill Tkachov wrote:
 Precedents suggest these changes are considered obvious.
 So I'll commit this in a couple of days unless someone objects.

Yes it's obvious.  One tiny thing...

 @@ -9062,7 +9061,7 @@ known_cond (rtx x, enum rtx_code cond, rtx reg, rtx val)
else if (COMPARISON_P (x) || COMMUTATIVE_ARITH_P (x))
  {
if (rtx_equal_p (XEXP (x, 0), val))
 - cond = swap_condition (cond), temp = val, val = reg, reg = temp;
 + cond = swap_condition (cond), std::swap (val, reg);
  
if (rtx_equal_p (XEXP (x, 0), reg)  rtx_equal_p (XEXP (x, 1), val))
   {

Might as well write this as two statements, like everywhere else, e.g.

 @@ -11454,7 +11453,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, 
 rtx *pop1)
   is already a constant integer.  */
if (swap_commutative_operands_p (op0, op1))
  {
 -  tem = op0, op0 = op1, op1 = tem;
 +  std::swap (op0, op1);
code = swap_condition (code);
  }

Thanks,


Segher


Re: [PATCH 7/13] powerpc musl support

2015-04-27 Thread Szabolcs Nagy
On 20/04/15 19:55, Szabolcs Nagy wrote:
 Set up dynamic linker name for powerpc.  Musl only supports powerpc
 with secure plt, so appropriate options are passed to the linker by
 default.

Patch v2.

Powerpc does not include the top level linux.h, so I had to
repeat the include order fixes from there in rs6000/sysv4.h.

I corrected the endianness handling (the le suffix should
be added correctly now).

gcc/Changelog:

2015-04-24  Gregor Richards  gregor.richa...@uwaterloo.ca
Szabolcs Nagy  szabolcs.n...@arm.com

* config.gcc (secure_plt): Add *-linux*-musl*.
* config/rs6000/linux64.h (MUSL_DYNAMIC_LINKER32): Define.
(MUSL_DYNAMIC_LINKER64): Define.
(GNU_USER_DYNAMIC_LINKER32): Update.
(GNU_USER_DYNAMIC_LINKER64): Update.
(CHOOSE_DYNAMIC_LINKER): Update.

* config/rs6000/secureplt.h (LINK_SECURE_PLT_DEFAULT_SPEC): Define.
* config/rs6000/sysv4.h (GNU_USER_DYNAMIC_LINKER): Update.
(MUSL_DYNAMIC_LINKER, MUSL_DYNAMIC_LINKER_E,)
(INCLUDE_DEFAULTS_MUSL_GPP, INCLUDE_DEFAULTS_MUSL_LOCAL,)
(INCLUDE_DEFAULTS_MUSL_PREFIX, INCLUDE_DEFAULTS_MUSL_CROSS,)
(INCLUDE_DEFAULTS_MUSL_TOOL, INCLUDE_DEFAULTS_MUSL_NATIVE): Define.
(LINK_SECURE_PLT_DEFAULT_SPEC): Define.
(CHOOSE_DYNAMIC_LINKER, LINK_TARGET_SPEC, LINK_OS_LINUX_SPEC): Update.

* config/rs6000/sysv4le.h (MUSL_DYNAMIC_LINKER_E): Define.
diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 0879e7e..a6ad842 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -365,17 +365,21 @@ extern int dot_symbols;
 #endif
 #define UCLIBC_DYNAMIC_LINKER32 /lib/ld-uClibc.so.0
 #define UCLIBC_DYNAMIC_LINKER64 /lib/ld64-uClibc.so.0
+#define MUSL_DYNAMIC_LINKER32 /lib/ld-musl-powerpc MUSL_DYNAMIC_LINKER_E .so.1
+#define MUSL_DYNAMIC_LINKER64 /lib/ld-musl-powerpc64 MUSL_DYNAMIC_LINKER_E .so.1
 #if DEFAULT_LIBC == LIBC_UCLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) %{mglibc: G ;: U }
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) %{mglibc: G ;:%{mmusl: M ;: U }}
 #elif DEFAULT_LIBC == LIBC_GLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) %{muclibc: U ;: G }
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) %{muclibc: U ;:%{mmusl: M ;: G }}
+#elif DEFAULT_LIBC == LIBC_MUSL
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) %{mglibc: G ;:%{muclibc: U ;: M }}
 #else
 #error Unsupported DEFAULT_LIBC
 #endif
 #define GNU_USER_DYNAMIC_LINKER32 \
-  CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKER32, UCLIBC_DYNAMIC_LINKER32)
+  CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKER32, UCLIBC_DYNAMIC_LINKER32, MUSL_DYNAMIC_LINKER32)
 #define GNU_USER_DYNAMIC_LINKER64 \
-  CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKER64, UCLIBC_DYNAMIC_LINKER64)
+  CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKER64, UCLIBC_DYNAMIC_LINKER64, MUSL_DYNAMIC_LINKER64)
 
 #undef  DEFAULT_ASM_ENDIAN
 #if (TARGET_DEFAULT  MASK_LITTLE_ENDIAN)
diff --git a/gcc/config/rs6000/secureplt.h b/gcc/config/rs6000/secureplt.h
index b463463..77edf2a 100644
--- a/gcc/config/rs6000/secureplt.h
+++ b/gcc/config/rs6000/secureplt.h
@@ -18,3 +18,4 @@ along with GCC; see the file COPYING3.  If not see
 http://www.gnu.org/licenses/.  */
 
 #define CC1_SECURE_PLT_DEFAULT_SPEC -msecure-plt
+#define LINK_SECURE_PLT_DEFAULT_SPEC --secure-plt
diff --git a/gcc/config/rs6000/sysv4.h b/gcc/config/rs6000/sysv4.h
index 9917c2f..94c4263 100644
--- a/gcc/config/rs6000/sysv4.h
+++ b/gcc/config/rs6000/sysv4.h
@@ -537,6 +537,9 @@ ENDIAN_SELECT( -mbig,  -mlittle, DEFAULT_ASM_ENDIAN)
 #ifndef CC1_SECURE_PLT_DEFAULT_SPEC
 #define CC1_SECURE_PLT_DEFAULT_SPEC 
 #endif
+#ifndef LINK_SECURE_PLT_DEFAULT_SPEC
+#define LINK_SECURE_PLT_DEFAULT_SPEC 
+#endif
 
 /* Pass -G xxx to the compiler.  */
 #undef CC1_SPEC
@@ -586,7 +589,8 @@ ENDIAN_SELECT( -mbig,  -mlittle, DEFAULT_ASM_ENDIAN)
 
 /* Override the default target of the linker.  */
 #define	LINK_TARGET_SPEC \
-  ENDIAN_SELECT(,  --oformat elf32-powerpcle, )
+  ENDIAN_SELECT(,  --oformat elf32-powerpcle, ) \
+  %{!mbss-plt: %{!msecure-plt: %(link_secure_plt_default)}}
 
 /* Any specific OS flags.  */
 #define LINK_OS_SPEC \
@@ -762,17 +766,22 @@ ENDIAN_SELECT( -mbig,  -mlittle, DEFAULT_ASM_ENDIAN)
 
 #define LINK_START_LINUX_SPEC 
 
+#define MUSL_DYNAMIC_LINKER_E ENDIAN_SELECT(,le,)
+
 #define GLIBC_DYNAMIC_LINKER /lib/ld.so.1
 #define UCLIBC_DYNAMIC_LINKER /lib/ld-uClibc.so.0
+#define MUSL_DYNAMIC_LINKER /lib/ld-musl-powerpc MUSL_DYNAMIC_LINKER_E .so.1
 #if DEFAULT_LIBC == LIBC_UCLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) %{mglibc: G ;: U }
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) %{mglibc: G ;:%{mmusl: M ;: U }}
+#elif DEFAULT_LIBC == LIBC_MUSL
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) %{mglibc: G ;:%{muclibc: U ;: M }}
 #elif !defined (DEFAULT_LIBC) || DEFAULT_LIBC == LIBC_GLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) %{muclibc: U ;: G }
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) %{muclibc: U ;:%{mmusl: M ;: G }}
 #else
 #error Unsupported DEFAULT_LIBC
 #endif
 #define GNU_USER_DYNAMIC_LINKER \
-  

RE: [PATCH][combine][obvious] Use std::swap instead of manually swapping

2015-04-27 Thread Kyrill Tkachov
On 27/04/15 15:37, Segher Boessenkool wrote:
 On Mon, Apr 27, 2015 at 10:55:17AM +0100, Kyrill Tkachov wrote:
 Precedents suggest these changes are considered obvious.
 So I'll commit this in a couple of days unless someone objects.

 Yes it's obvious.  One tiny thing...

 @@ -9062,7 +9061,7 @@ known_cond (rtx x, enum rtx_code cond, rtx reg, rtx
val)
else if (COMPARISON_P (x) || COMMUTATIVE_ARITH_P (x))
  {
if (rtx_equal_p (XEXP (x, 0), val))
 -cond = swap_condition (cond), temp = val, val = reg, reg = temp;
 +cond = swap_condition (cond), std::swap (val, reg);
  
if (rtx_equal_p (XEXP (x, 0), reg)  rtx_equal_p (XEXP (x, 1),
val))
  {

 Might as well write this as two statements, like everywhere else, e.g.

 @@ -11454,7 +11453,7 @@ simplify_comparison (enum rtx_code code, rtx
*pop0, rtx *pop1)
   is already a constant integer.  */
if (swap_commutative_operands_p (op0, op1))
  {
 -  tem = op0, op0 = op1, op1 = tem;
 +  std::swap (op0, op1);
code = swap_condition (code);
  }

 Thanks,

Thanks, here's what I committed with r222468.

Kyrill




 Segher



combine-swap.patch
Description: Binary data


Re: [PATCH][simplify-rtx][trivial] Use std::swap instead of manually swapping

2015-04-27 Thread Jeff Law

On 04/27/2015 03:55 AM, Kyrill Tkachov wrote:

Hi all,

This patch replaces in simplify-rtx.c the manual swapping of values with
std::swap.
Precedents suggest these are considered obvious changes.

Bootstrapped and tested on aarch64, x86_64.

Will commit as obvious in a couple of days if no one objects

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * simplify-rtx.c (simplify_gen_binary): Use std::swap instead
 of manually swapping.
 (simplify_associative_operation): Likewise.
 (simplify_binary_operation): Likewise.
 (simplify_plus_minus): Likewise.
 (simplify_relational_operation): Likewise.
 (simplify_ternary_operation): Likewise.
OK.  And just to be explicit, please consider any patch which just 
converts manual swapping to std::swap as pre-approved.  Just post them 
for archival purposes.



jeff



RE: [PATCH][AArch64] Make aarch64_min_divisions_for_recip_mul configurable

2015-04-27 Thread Wilco Dijkstra
ping

 -Original Message-
 From: Wilco Dijkstra [mailto:wdijk...@arm.com]
 Sent: 03 March 2015 18:06
 To: GCC Patches
 Subject: [PATCH][AArch64] Make aarch64_min_divisions_for_recip_mul 
 configurable
 
 This patch makes aarch64_min_divisions_for_recip_mul configurable for float 
 and double. This
 allows CPUs with really fast or multiple dividers to return 3 (or even 4) if 
 that happens to
 be faster overall. No code generation change - bootstrap  regression OK.
 
 ChangeLog:
 2015-03-03  Wilco Dijkstra  wdijk...@arm.com
 
   * gcc/config/aarch64/aarch64-protos.h (tune_params):
   Add min_div_recip_mul_sf and min_div_recip_mul_df fields.
   * gcc/config/aarch64/aarch64.c (aarch64_min_divisions_for_recip_mul):
   Return value depending on target.
   (generic_tunings): Initialize new target settings.
   (cortexa53_tunings): Likewise.
   (cortexa57_tunings): Likewise.
   (thunderx_tunings): Likewise.
   (xgene1_tunings): Likewise.
 
 ---
  gcc/config/aarch64/aarch64-protos.h |  2 ++
  gcc/config/aarch64/aarch64.c| 26 +++---
  2 files changed, 21 insertions(+), 7 deletions(-)
 
 diff --git a/gcc/config/aarch64/aarch64-protos.h 
 b/gcc/config/aarch64/aarch64-protos.h
 index 59c5824..4331e5c 100644
 --- a/gcc/config/aarch64/aarch64-protos.h
 +++ b/gcc/config/aarch64/aarch64-protos.h
 @@ -177,6 +177,8 @@ struct tune_params
const int int_reassoc_width;
const int fp_reassoc_width;
const int vec_reassoc_width;
 +  const int min_div_recip_mul_sf;
 +  const int min_div_recip_mul_df;
  };
 
  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
 diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
 index e22d72e..42a96f6 100644
 --- a/gcc/config/aarch64/aarch64.c
 +++ b/gcc/config/aarch64/aarch64.c
 @@ -353,7 +353,9 @@ static const struct tune_params generic_tunings =
4, /* loop_align.  */
2, /* int_reassoc_width.  */
4, /* fp_reassoc_width.  */
 -  1  /* vec_reassoc_width.  */
 +  1, /* vec_reassoc_width.  */
 +  2, /* min_div_recip_mul_sf.  */
 +  2  /* min_div_recip_mul_df.  */
  };
 
  static const struct tune_params cortexa53_tunings =
 @@ -371,7 +373,9 @@ static const struct tune_params cortexa53_tunings =
4, /* loop_align.  */
2, /* int_reassoc_width.  */
4, /* fp_reassoc_width.  */
 -  1  /* vec_reassoc_width.  */
 +  1, /* vec_reassoc_width.  */
 +  2, /* min_div_recip_mul_sf.  */
 +  2  /* min_div_recip_mul_df.  */
  };
 
  static const struct tune_params cortexa57_tunings =
 @@ -389,7 +393,9 @@ static const struct tune_params cortexa57_tunings =
4, /* loop_align.  */
2, /* int_reassoc_width.  */
4, /* fp_reassoc_width.  */
 -  1  /* vec_reassoc_width.  */
 +  1, /* vec_reassoc_width.  */
 +  2, /* min_div_recip_mul_sf.  */
 +  2  /* min_div_recip_mul_df.  */
  };
 
  static const struct tune_params thunderx_tunings =
 @@ -406,7 +412,9 @@ static const struct tune_params thunderx_tunings =
8, /* loop_align.  */
2, /* int_reassoc_width.  */
4, /* fp_reassoc_width.  */
 -  1  /* vec_reassoc_width.  */
 +  1, /* vec_reassoc_width.  */
 +  2, /* min_div_recip_mul_sf.  */
 +  2  /* min_div_recip_mul_df.  */
  };
 
  static const struct tune_params xgene1_tunings =
 @@ -423,7 +431,9 @@ static const struct tune_params xgene1_tunings =
16,/* loop_align.  */
2, /* int_reassoc_width.  */
4, /* fp_reassoc_width.  */
 -  1  /* vec_reassoc_width.  */
 +  1, /* vec_reassoc_width.  */
 +  2, /* min_div_recip_mul_sf.  */
 +  2  /* min_div_recip_mul_df.  */
  };
 
  /* A processor implementing AArch64.  */
 @@ -512,9 +522,11 @@ static const char * const aarch64_condition_codes[] =
  };
 
  static unsigned int
 -aarch64_min_divisions_for_recip_mul (enum machine_mode mode ATTRIBUTE_UNUSED)
 +aarch64_min_divisions_for_recip_mul (enum machine_mode mode)
  {
 -  return 2;
 +  if (GET_MODE_UNIT_SIZE (mode) == 4)
 +return aarch64_tune_params-min_div_recip_mul_sf;
 +  return aarch64_tune_params-min_div_recip_mul_df;
  }
 
  static int
 --
 1.9.1





RE: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

2015-04-27 Thread Wilco Dijkstra
ping

 -Original Message-
 From: Wilco Dijkstra [mailto:wdijk...@arm.com]
 Sent: 04 March 2015 15:38
 To: GCC Patches
 Subject: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS
 
 Include the cost of op0 and op1 in all cases in PLUS and MINUS in 
 aarch64_rtx_costs.
 Bootstrap  regression OK.
 
 ChangeLog:
 2015-03-04  Wilco Dijkstra  wdijk...@arm.com
 
 * gcc/config/aarch64/aarch64.c (aarch64_rtx_costs):
 Calculate cost of op0 and op1 in PLUS and MINUS cases.
 
 ---
  gcc/config/aarch64/aarch64.c | 12 +---
  1 file changed, 5 insertions(+), 7 deletions(-)
 
 diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
 index 39921a7..e22d72e 100644
 --- a/gcc/config/aarch64/aarch64.c
 +++ b/gcc/config/aarch64/aarch64.c
 @@ -5794,6 +5794,8 @@ aarch64_rtx_costs (rtx x, int code, int outer 
 ATTRIBUTE_UNUSED,
   op1 = XEXP (x, 1);
 
  cost_minus:
 + *cost += rtx_cost (op0, MINUS, 0, speed);
 +
   /* Detect valid immediates.  */
   if ((GET_MODE_CLASS (mode) == MODE_INT
|| (GET_MODE_CLASS (mode) == MODE_CC
 @@ -5801,13 +5803,10 @@ cost_minus:
CONST_INT_P (op1)
aarch64_uimm12_shift (INTVAL (op1)))
 {
 - *cost += rtx_cost (op0, MINUS, 0, speed);
 -
   if (speed)
 /* SUB(S) (immediate).  */
 *cost += extra_cost-alu.arith;
   return true;
 -
 }
 
   /* Look for SUB (extended register).  */
 @@ -5832,7 +5831,6 @@ cost_minus:
   *cost += aarch64_rtx_mult_cost (new_op1, MULT,
   (enum rtx_code) code,
   speed);
 - *cost += rtx_cost (op0, MINUS, 0, speed);
   return true;
 }
 
 @@ -5879,6 +5877,8 @@ cost_plus:
   return true;
 }
 
 + *cost += rtx_cost (op1, PLUS, 1, speed);
 +
   /* Look for ADD (extended register).  */
  if (aarch64_rtx_arith_op_extract_p (op0, mode))
 {
 @@ -5900,12 +5900,10 @@ cost_plus:
 {
   *cost += aarch64_rtx_mult_cost (new_op0, MULT, PLUS,
   speed);
 - *cost += rtx_cost (op1, PLUS, 1, speed);
   return true;
 }
 
 - *cost += (rtx_cost (new_op0, PLUS, 0, speed)
 -   + rtx_cost (op1, PLUS, 1, speed));
 + *cost += rtx_cost (new_op0, PLUS, 0, speed);
 
   if (speed)
 {
 --
 1.9.1




Re: [PATCH 8/13] sh musl support

2015-04-27 Thread Szabolcs Nagy


On 20/04/15 21:29, Oleg Endo wrote:
 On Mon, 2015-04-20 at 19:56 +0100, Szabolcs Nagy wrote:
 Set up dynamic linker name for sh.

 
 The SH parts are OK for trunk.
 

Patch v2

It turns out we have to distinguish between hard float and nofpu ABI.
I simply listed the options that turn on nofpu and hardfloat code
generation (the single precision modes are not relevant to musl,
currently musl only supports sh4 and up because it assumes either
gusa or llsc atomics is available and requires an mmu, but I listed
all the options for the nofpu check for completeness).

gcc/Changelog:

2015-04-27  Gregor Richards  gregor.richa...@uwaterloo.ca
Szabolcs Nagy  szabolcs.n...@arm.com

* config/sh/linux.h (MUSL_DYNAMIC_LINKER): Define.
(MUSL_DYNAMIC_LINKER_E, MUSL_DYNAMIC_LINKER_FP): Define.
diff --git a/gcc/config/sh/linux.h b/gcc/config/sh/linux.h
index 0f5d614..4c167c6 100644
--- a/gcc/config/sh/linux.h
+++ b/gcc/config/sh/linux.h
@@ -43,6 +43,28 @@ along with GCC; see the file COPYING3.  If not see
 
 #define TARGET_ASM_FILE_END file_end_indicate_exec_stack
 
+#if TARGET_ENDIAN_DEFAULT == MASK_LITTLE_ENDIAN
+#define MUSL_DYNAMIC_LINKER_E %{mb:eb}
+#else
+#define MUSL_DYNAMIC_LINKER_E %{!ml:eb}
+#endif
+
+#if TARGET_CPU_DEFAULT  ( MASK_HARD_SH2A_DOUBLE | MASK_SH4 )
+/* -nofpu if any nofpu option is specified */
+#define MUSL_DYNAMIC_LINKER_FP \
+  %{m1|m2|m2a-nofpu|m3|m4-nofpu|m4-100-nofpu|m4-200-nofpu|m4-300-nofpu| \
+  m4-340|m4-400|m4-500|m4al|m5-32media-nofpu|m5-64media-nofpu| \
+  m5-compact-nofpu:-nofpu}
+#else
+/* -nofpu if none of the hard fpu options are specified */
+#define MUSL_DYNAMIC_LINKER_FP \
+  %{m2a|m4|m4-100|m4-200|m4-300|m4a|m5-32media|m5-64media|m5-compact:;:-nofpu}
+#endif
+
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER \
+  /lib/ld-musl-sh MUSL_DYNAMIC_LINKER_E MUSL_DYNAMIC_LINKER_FP .so.1
+
 #define GLIBC_DYNAMIC_LINKER /lib/ld-linux.so.2
 
 #undef SUBTARGET_LINK_EMUL_SUFFIX


Re: [PATCH 9/13] x86 musl support

2015-04-27 Thread Szabolcs Nagy
On 20/04/15 19:57, Szabolcs Nagy wrote:
 Set up dynamic linker name for x86.
 

Patch v2.

(undef MUSL_DYNAMIC_LINKER* before defining it).

gcc/Changelog:

2015-04-27  Gregor Richards  gregor.richa...@uwaterloo.ca

* config/i386/linux.h (MUSL_DYNAMIC_LINKER): Define.
* config/i386/linux64.h (MUSL_DYNAMIC_LINKER32): Define.
(MUSL_DYNAMIC_LINKER64, MUSL_DYNAMIC_LINKERX32): Define.
diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h
index a100963..9c47fcb 100644
--- a/gcc/config/i386/linux.h
+++ b/gcc/config/i386/linux.h
@@ -21,3 +21,5 @@ along with GCC; see the file COPYING3.  If not see
 
 #define GNU_USER_LINK_EMULATION elf_i386
 #define GLIBC_DYNAMIC_LINKER /lib/ld-linux.so.2
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER /lib/ld-musl-i386.so.1
diff --git a/gcc/config/i386/linux64.h b/gcc/config/i386/linux64.h
index a27d3be..e300480 100644
--- a/gcc/config/i386/linux64.h
+++ b/gcc/config/i386/linux64.h
@@ -30,3 +30,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define GLIBC_DYNAMIC_LINKER32 /lib/ld-linux.so.2
 #define GLIBC_DYNAMIC_LINKER64 /lib64/ld-linux-x86-64.so.2
 #define GLIBC_DYNAMIC_LINKERX32 /libx32/ld-linux-x32.so.2
+
+#undef MUSL_DYNAMIC_LINKER32
+#define MUSL_DYNAMIC_LINKER32 /lib/ld-musl-i386.so.1
+#undef MUSL_DYNAMIC_LINKER64
+#define MUSL_DYNAMIC_LINKER64 /lib/ld-musl-x86_64.so.1
+#undef MUSL_DYNAMIC_LINKERX32
+#define MUSL_DYNAMIC_LINKERX32 /lib/ld-musl-x32.so.1


RE: [PATCH][AArch64] Use conditional negate for abs expansion

2015-04-27 Thread Wilco Dijkstra
ping

 -Original Message-
 From: Wilco Dijkstra [mailto:wdijk...@arm.com]
 Sent: 03 March 2015 16:19
 To: GCC Patches
 Subject: [PATCH][AArch64] Use conditional negate for abs expansion
 
 Expand abs into a compare and conditional negate. This is the most obvious 
 expansion, enables
 merging of the comparison into ALU instructions and is faster on all 
 implementations.
 Bootstrapped  regression tested.
 
 int f(int x) { return abs (x + 1); }
 
 Before:
 add w0, w0, 1
 sxtwx0, w0
 eor x1, x0, x0, asr 63
 sub x1, x1, x0, asr 63
 mov x0, x1
 ret
 
 After:
 addsw0, w0, 1
 csneg   w0, w0, w0, pl
 ret
 
 ChangeLog:
 
 2015-03-03  Wilco Dijkstra  wdijk...@arm.com
 
   * gcc/config/aarch64/aarch64.md (absdi2): optimize abs expansion.
   (csneg3mode_insn): enable expansion of pattern.
   * gcc/testsuite/gcc.target/aarch64/abs_1.c (abs64): update test
   for new abs expansion.  (abs64_in_dreg): likewise.
 
 ---
  gcc/config/aarch64/aarch64.md| 33 
 +++-
  gcc/testsuite/gcc.target/aarch64/abs_1.c |  5 ++---
  2 files changed, 9 insertions(+), 29 deletions(-)
 
 diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
 index 1f4169e..46b7a63 100644
 --- a/gcc/config/aarch64/aarch64.md
 +++ b/gcc/config/aarch64/aarch64.md
 @@ -2172,35 +2172,16 @@
[(set_attr type alu_ext)]
  )
 
 -(define_insn_and_split absdi2
 -  [(set (match_operand:DI 0 register_operand =r,w)
 - (abs:DI (match_operand:DI 1 register_operand r,w)))]
 +(define_expand absmode2
 +  [(match_operand:GPI 0 register_operand )
 +   (match_operand:GPI 1 register_operand )]

 -  @
 -   #
 -   abs\\t%d0, %d1
 -  reload_completed
 -GP_REGNUM_P (REGNO (operands[0]))
 -GP_REGNUM_P (REGNO (operands[1]))
 -  [(const_int 0)]
{
 -emit_insn (gen_rtx_SET (VOIDmode, operands[0],
 - gen_rtx_XOR (DImode,
 -  gen_rtx_ASHIFTRT (DImode,
 -operands[1],
 -GEN_INT (63)),
 -  operands[1])));
 -emit_insn (gen_rtx_SET (VOIDmode,
 - operands[0],
 - gen_rtx_MINUS (DImode,
 -operands[0],
 -gen_rtx_ASHIFTRT (DImode,
 -  operands[1],
 -  GEN_INT (63);
 +rtx ccreg = aarch64_gen_compare_reg (LT, operands[1], const0_rtx);
 +rtx x = gen_rtx_LT (VOIDmode, ccreg, const0_rtx);
 +emit_insn (gen_csneg3mode_insn (operands[0], x, operands[1], 
 operands[1]));
  DONE;
}
 -  [(set_attr type alu_sreg)
 -   (set_attr simd no,yes)]
  )
 
  (define_insn negmode2
 @@ -2879,7 +2860,7 @@
[(set_attr type csel)]
  )
 
 -(define_insn *csneg3mode_insn
 +(define_insn csneg3mode_insn
[(set (match_operand:GPI 0 register_operand =r)
  (if_then_else:GPI
 (match_operand 1 aarch64_comparison_operation )
 diff --git a/gcc/testsuite/gcc.target/aarch64/abs_1.c
 b/gcc/testsuite/gcc.target/aarch64/abs_1.c
 index 938bc84..11f1095 100644
 --- a/gcc/testsuite/gcc.target/aarch64/abs_1.c
 +++ b/gcc/testsuite/gcc.target/aarch64/abs_1.c
 @@ -7,15 +7,14 @@ extern void abort (void);
  long long
  abs64 (long long a)
  {
 -  /* { dg-final { scan-assembler eor\t } } */
 -  /* { dg-final { scan-assembler sub\t } } */
 +  /* { dg-final { scan-assembler csneg\t } } */
return llabs (a);
  }
 
  long long
  abs64_in_dreg (long long a)
  {
 -  /* { dg-final { scan-assembler abs\td\[0-9\]+, d\[0-9\]+ } } */
 +  /* { dg-final { scan-assembler csneg\t } } */
register long long x asm (d8) = a;
register long long y asm (d9);
asm volatile ( : : w (x));
 --
 1.9.1




Re: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

2015-04-27 Thread Kyrill Tkachov

Hi Wilco,

On 27/04/15 14:43, Wilco Dijkstra wrote:

ping


-Original Message-
From: Wilco Dijkstra [mailto:wdijk...@arm.com]
Sent: 04 March 2015 15:38
To: GCC Patches
Subject: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

Include the cost of op0 and op1 in all cases in PLUS and MINUS in 
aarch64_rtx_costs.
Bootstrap  regression OK.

ChangeLog:
2015-03-04  Wilco Dijkstra  wdijk...@arm.com

 * gcc/config/aarch64/aarch64.c (aarch64_rtx_costs):
 Calculate cost of op0 and op1 in PLUS and MINUS cases.

---
  gcc/config/aarch64/aarch64.c | 12 +---
  1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 39921a7..e22d72e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5794,6 +5794,8 @@ aarch64_rtx_costs (rtx x, int code, int outer 
ATTRIBUTE_UNUSED,
op1 = XEXP (x, 1);

  cost_minus:
+   *cost += rtx_cost (op0, MINUS, 0, speed);
+
/* Detect valid immediates.  */
if ((GET_MODE_CLASS (mode) == MODE_INT
 || (GET_MODE_CLASS (mode) == MODE_CC
@@ -5801,13 +5803,10 @@ cost_minus:
 CONST_INT_P (op1)
 aarch64_uimm12_shift (INTVAL (op1)))
  {
-   *cost += rtx_cost (op0, MINUS, 0, speed);
-
if (speed)
  /* SUB(S) (immediate).  */
  *cost += extra_cost-alu.arith;
return true;
-
  }

/* Look for SUB (extended register).  */
@@ -5832,7 +5831,6 @@ cost_minus:
*cost += aarch64_rtx_mult_cost (new_op1, MULT,
(enum rtx_code) code,
speed);
-   *cost += rtx_cost (op0, MINUS, 0, speed);
return true;
  }

@@ -5879,6 +5877,8 @@ cost_plus:
return true;
  }

+   *cost += rtx_cost (op1, PLUS, 1, speed);
+


I don't think this is correct. In the code directly below
(when the aarch64_rtx_arith_op_extract_p condition is true)
we have a shift/extend operation by a constant, so we don't
want to take into account the cost of operand 1 (which is
the extend+shift rtx).

So, I looked at this code myself recently and found that the
wrong logic is addressed by my patch at:
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01617.html
which fixes the case where we forget to cost operand0 in the
MINUS case.

What do you think?
Kyrill


/* Look for ADD (extended register).  */
  if (aarch64_rtx_arith_op_extract_p (op0, mode))
  {
@@ -5900,12 +5900,10 @@ cost_plus:
  {
*cost += aarch64_rtx_mult_cost (new_op0, MULT, PLUS,
speed);
-   *cost += rtx_cost (op1, PLUS, 1, speed);
return true;
  }

-   *cost += (rtx_cost (new_op0, PLUS, 0, speed)
- + rtx_cost (op1, PLUS, 1, speed));
+   *cost += rtx_cost (new_op0, PLUS, 0, speed);

if (speed)
  {
--
1.9.1






Re: [PATCH 3/13] aarch64 musl support

2015-04-27 Thread Szabolcs Nagy

On 21/04/15 15:16, pins...@gmail.com wrote:
 
 I don't think you need to check if defaulting to little or big-endian here 
 are the specs always have one or the other passing through. 
 
 Also if musl does not support ilp32, you might want to error out. Or even 
 define the dynamic linker name even before support goes into musl. 
 

Patch v2.

(now _ilp32 abi variant is included too)

gcc/Changelog:

2015-04-27  Gregor Richards  gregor.richa...@uwaterloo.ca
Szabolcs Nagy  szabolcs.n...@arm.com

* config/aarch64/aarch64-linux.h (MUSL_DYNAMIC_LINKER): Define.
diff --git a/gcc/config/aarch64/aarch64-linux.h b/gcc/config/aarch64/aarch64-linux.h
index 9abb252..2d647c3 100644
--- a/gcc/config/aarch64/aarch64-linux.h
+++ b/gcc/config/aarch64/aarch64-linux.h
@@ -23,6 +23,9 @@
 
 #define GLIBC_DYNAMIC_LINKER /lib/ld-linux-aarch64%{mbig-endian:_be}%{mabi=ilp32:_ilp32}.so.1
 
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER /lib/ld-musl-aarch64%{mbig-endian:_be}%{mabi=ilp32:_ilp32}.so.1
+
 #undef  ASAN_CC1_SPEC
 #define ASAN_CC1_SPEC %{%:sanitize(address):-funwind-tables}
 


Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-27 Thread Marek Polacek
On Sat, Apr 25, 2015 at 10:18:59PM +0200, Gerald Pfeifer wrote:
 In case this example feels too contrived (even though it is an
 excerpt of Wine code), we now also warn about the following where
 the two types and variables are defined in different places and
 the size of one is set implicitly:
 
   typedef int r_fun_t (int);
 
   r_fun_t * text_funcs[] = {0,0,0};
 
   int report (unsigned t)
   {
 typedef int s_fun_t (long, char);
 
 static s_fun_t * GUI_funcs[3];
 
 return (t  sizeof text_funcs / sizeof text_funcs[0] 
 t  sizeof GUI_funcs / sizeof GUI_funcs[0]);
   }
 
 (I also filed this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65891
 so that we keep track.)

I'm afraid there isn't an easy solution to this; the problem is that we fold
sizeof early, so the warning sees 

  t  sizeof 4  t  4

and warns.  Maybe using SIZEOF_EXPR would help...

Marek


RE: [PATCH] Fix IRA register preferencing

2015-04-27 Thread Wilco Dijkstra
 Jeff Law wrote:
 On 12/10/14 06:26, Wilco Dijkstra wrote:
 
  If recomputing is best does that mean that record_reg_classes should not
  give a boost to the preferred class in the 2nd pass?
 Perhaps.  I haven't looked deeply at this part of IRA.  I was relaying
 my experiences with (ab)using the ira-reload callbacks to handle
 allocation after splitting -- where getting the costs and classes
 updated in a reasonable manner was clearly important to getting good
 code.  One could probably argue I should have kept testcases from that
 work :-)
 
 
 I don't understand
  what purpose this has - if the preferred class is from the first pass, it
  is already correct, so all it does is boost the preferred class further.
  And if the preferred class is wrong (eg. after live range splitting), it
  will boost the incorrect class even harder, so in the end you never get
  a different class.
 It may be historical from the old regclass code, not really sure.
 
   From what you're saying, recomputing seems best, and I'd be happy to submit
  a patch to remove all the preferred class code from record_reg_classes.
 Recomputing certainly helped the cases I was looking at.
 
  However there is still the possibility the preferred class is queried before
  the recomputation happens (I think that is a case Renlin fixed). Either 
  these
  should be faulted and fixed by forcing recomputation, or we need to provide 
  a
  correct preferred class. That should be a copy of the original class.
 I believe I had copied the original classes, then recomputed them to
 avoid any uninitialized memory reads and the like.  But looking at the
 old branch, I don't see the recomputation for classes (though I do see
 it for costs).  Of course all the backwards walk stuff isn't there
 either, so there's clearly code I worked on extensively, but never
 committed...

Well, looking into this further it does look like the preferences are improved
during the 2nd pass (in particular fewer cases of the bad ALL_REGS preference
that causes all the inefficient code), so it looks like recomputing preferences
is not beneficial and my original patch is the right fix for now.

https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00829.html

Wilco





update docs for --enable-languages

2015-04-27 Thread Jim Wilson
I noticed this while working on my mostlyclean patch.  The list of
languages in the docs for --enable-languages is incomplete.  It is
missing jit and lto.  I also noticed that the grep command matches
boot_language= in addition to language= which is a little confusing,
so I added the ^.

The sentence I added for lto is awkward.  It isn't a default language,
but it is built by default.  Maybe this would make more sense if we
talked about boot languages, but then that gets us into another mess
describing exactly when languages are boot languages.  C is always a
boot language.  C++ is a boot language if bootstrapping.  And lto is a
boot language if --enable-lto which is the default.

Jim
Index: ChangeLog
===
--- ChangeLog	(revision 222491)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2015-04-27  Jim Wilson  jim.wil...@linaro.org
+
+	* install.texi (--enable-languages): Add missing jit and lto info.
+	Add ^ to grep command.
+
 2015-04-27  Yoshinori Sato ys...@users.sourceforge.jp
 
 	* config.gcc: Add h8300-*-linux.
Index: doc/install.texi
===
--- doc/install.texi	(revision 222491)
+++ doc/install.texi	(working copy)
@@ -1544,15 +1544,17 @@
 @var{langN} you can issue the following command in the
 @file{gcc} directory of your GCC source tree:@*
 @smallexample
-grep language= */config-lang.in
+grep ^language= */config-lang.in
 @end smallexample
 Currently, you can use any of the following:
 @code{all}, @code{ada}, @code{c}, @code{c++}, @code{fortran},
-@code{go}, @code{java}, @code{objc}, @code{obj-c++}.
+@code{go}, @code{java}, @code{jit}, @code{lto}, @code{objc}, @code{obj-c++}.
 Building the Ada compiler has special requirements, see below.
 If you do not pass this flag, or specify the option @code{all}, then all
 default languages available in the @file{gcc} sub-tree will be configured.
-Ada, Go and Objective-C++ are not default languages; the rest are.
+Ada, Go, Jit, and Objective-C++ are not default languages.  LTO is not a
+default language, but is built by default because @option{--enable-lto} is
+enabled by default.  The other languages are default languages.
 
 @item --enable-stage1-languages=@var{lang1},@var{lang2},@dots{}
 Specify that a particular subset of compilers and their runtime


[PATCH] libstdc++: Fix list.cc xmethods test.

2015-04-27 Thread Doug Evans
Hi.

While we should eventually get the xmethods to handle cxx11,
this patch fixes the current failure.
The xmethod matcher doesn't currently handle __cxx11 in the type name.

Adding cxx11 support can be a follow up patch.

Ok to commit?

2015-04-27  Doug Evans  d...@google.com

* testsuite/libstdc++-xmethods/list.cc (_GLIBCXX_USE_CXX11_ABI):
Define to zero.

Index: testsuite/libstdc++-xmethods/list.cc
===
--- testsuite/libstdc++-xmethods/list.cc(revision 222477)
+++ testsuite/libstdc++-xmethods/list.cc(working copy)
@@ -18,6 +18,9 @@
 // with this library; see the file COPYING3.  If not see
 // http://www.gnu.org/licenses/.
 
+// List xmethods only recognize the non cxx11 std::list for now.
+#define _GLIBCXX_USE_CXX11_ABI 0
+
 #include list
 
 int


[PATCH] [libstdc++/65839] whatis support for xmethods

2015-04-27 Thread Doug Evans
Hi.

This patch is the counterpart to this patch to fix
libstdc++/65839, gdb/18285.
https://sourceware.org/ml/gdb-patches/2015-04/msg00947.html

Regression tested on amd64-linux with/without a patched gdb.
Without a patched gdb the new tests fail, but that's good.

2015-04-27  Doug Evans  d...@google.com

PR libstdc++/65839
* python/libstdcxx/v6/xmethods.py (get_bool_type): New function.
Replace all lookups of bool with this.
(get_std_size_type): New function.  Replace all lookups of std::size_t
with this.
(ArrayWorkerBase): Rename arg valtype to elem_type for consistency,
and save it in self._elem_type.
(DequeWorkerBase): Save elemtype.
(ListWorkerBase): New arg elem_type.
(ListMethodsMatcher): Update call to method.worker_class.
(UniquePtrGetWorker.__init__): New arg ptr_type.  Delete setting of
name, enabled.
(UniquePtrDerefWorker.__init__): New arg ptr_type.  Delete setting of
name.
(UniquePtrMethodsMatcher): Rewrite for consistency with all other
libstdc++ xmethod matchers.
(*Worker): New method get_result_type.
* testsuite/libstdc++-xmethods/array.cc: Add whatis tests.
* testsuite/libstdc++-xmethods/associative-containers.cc: Ditto.
* testsuite/libstdc++-xmethods/deque.cc: Ditto.
* testsuite/libstdc++-xmethods/forwardlist.cc: Ditto.
* testsuite/libstdc++-xmethods/unique_ptr.cc: Ditto.
* testsuite/libstdc++-xmethods/vector.cc: Ditto.
* testsuite/libstdc++-xmethods/list.cc: Ditto.

Index: python/libstdcxx/v6/xmethods.py
===
--- python/libstdcxx/v6/xmethods.py (revision 222477)
+++ python/libstdcxx/v6/xmethods.py (working copy)
@@ -21,6 +21,12 @@ import re
 
 matcher_name_prefix = 'libstdc++::'
 
+def get_bool_type():
+return gdb.lookup_type('bool')
+
+def get_std_size_type():
+return gdb.lookup_type('std::size_t')
+
 class LibStdCxxXMethod(gdb.xmethod.XMethod):
 def __init__(self, name, worker_class):
 gdb.xmethod.XMethod.__init__(self, name)
@@ -29,13 +35,13 @@ class LibStdCxxXMethod(gdb.xmethod.XMethod):
 # Xmethods for std::array
 
 class ArrayWorkerBase(gdb.xmethod.XMethodWorker):
-def __init__(self, valtype, size):
-self._valtype = valtype
+def __init__(self, elem_type, size):
+self._elem_type = elem_type
 self._size = size
 
 def null_value(self):
 nullptr = gdb.parse_and_eval('(void *) 0')
-return nullptr.cast(self._valtype.pointer()).dereference()
+return nullptr.cast(self._elem_type.pointer()).dereference()
 
 class ArraySizeWorker(ArrayWorkerBase):
 def __init__(self, valtype, size):
@@ -44,6 +50,9 @@ class ArraySizeWorker(ArrayWorkerBase):
 def get_arg_types(self):
 return None
 
+def get_result_type(self, obj):
+return get_std_size_type()
+
 def __call__(self, obj):
 return self._size
 
@@ -54,6 +63,9 @@ class ArrayEmptyWorker(ArrayWorkerBase):
 def get_arg_types(self):
 return None
 
+def get_result_type(self, obj):
+return get_bool_type()
+
 def __call__(self, obj):
 return (int(self._size) == 0)
 
@@ -64,6 +76,9 @@ class ArrayFrontWorker(ArrayWorkerBase):
 def get_arg_types(self):
 return None
 
+def get_result_type(self, obj):
+return self._elem_type
+
 def __call__(self, obj):
 if int(self._size)  0:
 return obj['_M_elems'][0]
@@ -77,6 +92,9 @@ class ArrayBackWorker(ArrayWorkerBase):
 def get_arg_types(self):
 return None
 
+def get_result_type(self, obj):
+return self._elem_type
+
 def __call__(self, obj):
 if int(self._size)  0:
 return obj['_M_elems'][self._size - 1]
@@ -88,8 +106,11 @@ class ArrayAtWorker(ArrayWorkerBase):
 ArrayWorkerBase.__init__(self, valtype, size)
 
 def get_arg_types(self):
-return gdb.lookup_type('std::size_t')
+return get_std_size_type()
 
+def get_result_type(self, obj, index):
+return self._elem_type
+
 def __call__(self, obj, index):
 if int(index) = int(self._size):
 raise IndexError('Array index %d should not be = %d.' %
@@ -101,8 +122,11 @@ class ArraySubscriptWorker(ArrayWorkerBase):
 ArrayWorkerBase.__init__(self, valtype, size)
 
 def get_arg_types(self):
-return gdb.lookup_type('std::size_t')
+return get_std_size_type()
 
+def get_result_type(self, obj, index):
+return self._elem_type
+
 def __call__(self, obj, index):
 if int(self._size)  0:
 return obj['_M_elems'][index]
@@ -140,6 +164,7 @@ class ArrayMethodsMatcher(gdb.xmethod.XMethodMatch
 
 class DequeWorkerBase(gdb.xmethod.XMethodWorker):
 def __init__(self, elemtype):
+self._elemtype = elemtype
 self._bufsize = (512 / elemtype.sizeof) or 

Re: [PATCH] libstdc++: Fix list.cc xmethods test.

2015-04-27 Thread Jonathan Wakely
On 27 April 2015 at 23:33, Doug Evans wrote:
 Hi.

 While we should eventually get the xmethods to handle cxx11,
 this patch fixes the current failure.
 The xmethod matcher doesn't currently handle __cxx11 in the type name.

 Adding cxx11 support can be a follow up patch.

Agreed. And when that's done we should backport it to the gcc-5-branch too.

 Ok to commit?

OK for trunk, thanks


[libstdc++, committed] fix PR65909

2015-04-27 Thread Sandra Loosemore
This patch works around a libstdc++ testsuite issue:  the generated 
program that tests for the availability of named locales dereferences 
invalid pointers on targets that don't allow command-line arguments to 
be passed.  (In particular, I ran into this with a simulator for an 
embedded target that produces a ton of verbose error messages when it 
hits an invalid memory reference.)  Jonathan already approved the fix in 
the issue, so I've gone ahead and checked it in.


-Sandra

2015-04-27  Sandra Loosemore  san...@codesourcery.com

	PR libstdc++/65909

	libstdc++-v3/
	* testsuite/lib/libstdc++.exp (check_v3_target_namedlocale):
	Make the generated test program fail gracefully if the target
	doesn't	support passing command-line arguments.
Index: libstdc++-v3/testsuite/lib/libstdc++.exp
===
--- libstdc++-v3/testsuite/lib/libstdc++.exp	(revision 222496)
+++ libstdc++-v3/testsuite/lib/libstdc++.exp	(working copy)
@@ -901,6 +901,11 @@ proc check_v3_target_namedlocale { args 
   puts $f using namespace std;	
   puts $f int main (int argc, char** argv)
   puts $f {
+  puts $f   if (argc  2)
+  puts $f   {
+  puts $f printf(\locale support test not supported\\n\);
+  puts $f return 1;
+  puts $f   }	
   puts $f   try	
   puts $f   {
   puts $f locale(*(argv + 1));


Re: [Patch] Add regex_constants::__polynomial

2015-04-27 Thread Tim Shen
On Mon, Apr 27, 2015 at 4:57 AM, Jonathan Wakely jwak...@redhat.com wrote:
 OK for trunk.

Committed.


-- 
Regards,
Tim Shen


PR65217.c, improve canonicalization of implied copy from equality comparison

2015-04-27 Thread Jeff Law


Richi recently changed tree-ssa-dom.c::record_equality to use 
tree_swap_operands_p to canonicalize the implied copy we record for 
equality comparisons.  This is a good thing.


However, there is a case where tree_swap_operands_p gives us operands in 
an undesirable order for this routine.  Specifically when both operands 
are SSA_NAMEs, but just one has a single use (in the equality test).  In 
that case, we want the single use SSA_NAME on the LHS of the recorded copy.


That, in effect, prevents DOM from destroying the single use nature of 
that SSA_NAME.  And if we're ultimately able to remove the comparison, 
the statements that fed the single use SSA_NAME can be removed as 
they'll be dead.


Given this issue is so closely tied to DOM, Richi and I agreed to fix 
this in DOM rather than in tree_swap_operands_p.  Had we tried to 
address this in tree_swap_operands_p, we'll end up regressing elsewhere, 
which is clearly not good.


Anyway, bootstrapped and regression tested on x86_64-linux-gnu.  Fixes 
the recently XFAILed 65217 test.  Installed on the trunk.


Jeff
commit a787980e8012128b408ca8b716028e71d1b21373
Author: root root@localhost.localdomain
Date:   Mon Apr 27 21:58:52 2015 -0600

PR tree-optimization/65217
* tree-ssa-dom.c (record_equality): Given two SSA_NAMEs, if just one
of them has a single use, make sure it is the LHS of the implied
copy.

PR tree-optimization/65217
* gcc.target/i386/pr65217.c: Remove XFAIL.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index fff0015..1a82f17 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2015-04-27  Jeff Law  l...@redhat.com
+
+   PR tree-optimization/65217
+   * tree-ssa-dom.c (record_equality): Given two SSA_NAMEs, if just one
+   of them has a single use, make sure it is the LHS of the implied
+   copy.
+
 2015-04-28  Alan Modra  amo...@gmail.com
 
PR target/65810
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 67bdd69..0fc2384 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2015-04-27  Jeff Law l...@redhat.com
+
+PR tree-optimization/65217
+   * gcc.target/i386/pr65217.c: Remove XFAIL.
+
 2015-04-27  Andre Vehreschild  ve...@gmx.de
 
PR fortran/60322
diff --git a/gcc/testsuite/gcc.target/i386/pr65217.c 
b/gcc/testsuite/gcc.target/i386/pr65217.c
index 3f495b2..d5c9be5 100644
--- a/gcc/testsuite/gcc.target/i386/pr65217.c
+++ b/gcc/testsuite/gcc.target/i386/pr65217.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options -O } */
-/* { dg-final { scan-assembler-not negl { xfail *-*-* } } } */
-/* { dg-final { scan-assembler-not andl { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not negl } } */
+/* { dg-final { scan-assembler-not andl } } */
 
 int 
 test(int n)
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index a67b4e4..c7d427b 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1762,6 +1762,20 @@ record_equality (tree x, tree y)
   if (tree_swap_operands_p (x, y, false))
 std::swap (x, y);
 
+  /* Most of the time tree_swap_operands_p does what we want.  But there's
+ cases where we we know one operand is better for copy propagation than
+ the other.  Given no other code cares about ordering of equality
+ comparison operators for that purpose, we just handle the special cases
+ here.  */
+  if (TREE_CODE (x) == SSA_NAME  TREE_CODE (y) == SSA_NAME)
+{
+  /* If one operand is a single use operand, then make it
+X.  This will preserve its single use properly and if this
+conditional is eliminated, the computation of X can be
+eliminated as well.  */
+  if (has_single_use (y)  ! has_single_use (x))
+   std::swap (x, y);
+}
   if (TREE_CODE (x) == SSA_NAME)
 prev_x = SSA_NAME_VALUE (x);
   if (TREE_CODE (y) == SSA_NAME)


Re: Improve LTO type checking during symtab merging

2015-04-27 Thread Jan Hubicka
Hi,
actually I bootstrapped/regtested without fortran (as I frogot the setting from
LTO bootstrap).  There are several new warnings in the fortran's testsuite.
As far as I can tell, they represent real mismatches.  I wonder, do we want
to fix the testcases (some fortran-fu would be needed), disable warnings on 
those
or explicitely allow excess warnings (bcause we still have no way to match
link-time warnings)?

Honza

spawn /aux/hubicka/trunk-8/build5/gcc/testsuite/gfortran3/../../gfortran 
-B/aux/hubicka/trunk-8/build5/gcc/testsuite/gfortran3/../../ 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/ 
f_lto_20091028-1_0.o f_lto_20091028-1_1.o -fno-diagnostics-show-caret 
-fdiagnostics-color=never -O0 -flto -flto-partition=none -fuse-linker-plugin -r 
-nostdlib -finline-functions 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libatomic/.libs 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libquadmath/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libquadmath/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libquadmath/.libs -o 
gfortran-dg-lto-20091028-1-01.exe
/aux/hubicka/trunk-8/gcc/testsuite/gfortran.dg/lto/20091028-1_0.f90:7:0: 
warning: type of 'int_gen_ti_header_c' does not match original declaration
lto1: note: return value type mismatch
lto1: note: type 'int' should match type 'void'
/aux/hubicka/trunk-8/gcc/testsuite/gfortran.dg/lto/20091028-1_1.c:3:5: note: 
previously declared here

spawn /aux/hubicka/trunk-8/build5/gcc/testsuite/gfortran3/../../gfortran 
-B/aux/hubicka/trunk-8/build5/gcc/testsuite/gfortran3/../../ 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/ 
f_lto_pr40724_0.o f_lto_pr40724_1.o -fno-diagnostics-show-caret 
-fdiagnostics-color=never -O0 -flto -flto-partition=none -fuse-linker-plugin 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libatomic/.libs 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libquadmath/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libquadmath/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libquadmath/.libs -o 
gfortran-dg-lto-pr40724-01.exe
/aux/hubicka/trunk-8/gcc/testsuite/gfortran.dg/lto/pr40724_1.f:2:0: warning: 
type of 'f' does not match original declaration
lto1: note: types have different parameter counts
/aux/hubicka/trunk-8/gcc/testsuite/gfortran.dg/lto/pr40724_0.f:1:0: note: 
previously declared here

spawn /aux/hubicka/trunk-8/build5/gcc/testsuite/gfortran3/../../gfortran 
-B/aux/hubicka/trunk-8/build5/gcc/testsuite/gfortran3/../../ 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/ 
f_lto_pr41069_0.o f_lto_pr41069_1.o f_lto_pr41069_2.o 
-fno-diagnostics-show-caret -fdiagnostics-color=never -O0 -flto 
-flto-partition=none -fuse-linker-plugin 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libatomic/.libs 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libquadmath/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libquadmath/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libquadmath/.libs -o 
gfortran-dg-lto-pr41069-01.exe
/aux/hubicka/trunk-8/gcc/testsuite/gfortran.dg/lto/pr41069_1.f90:4:0: warning: 
type of 'mltfftsg' does not match original declaration
lto1: note: types have different parameter counts
/aux/hubicka/trunk-8/gcc/testsuite/gfortran.dg/lto/pr41069_1.f90:4:0: warning: 
type of 'mltfftsg' does not match original declaration
lto1: note: types have different parameter counts
/aux/hubicka/trunk-8/gcc/testsuite/gfortran.dg/lto/pr41069_0.f90:2:0: note: 
previously declared here

spawn /aux/hubicka/trunk-8/build5/gcc/testsuite/gfortran3/../../gfortran 
-B/aux/hubicka/trunk-8/build5/gcc/testsuite/gfortran3/../../ 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/ 
f_lto_pr41521_0.o f_lto_pr41521_1.o -fno-diagnostics-show-caret 
-fdiagnostics-color=never -g -O -flto 
-B/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libgfortran/.libs 
-L/aux/hubicka/trunk-8/build5/x86_64-unknown-linux-gnu/./libatomic/.libs 

Re: [PATCH] libmpx: remove AC_PROG_CXX/GCC_LIBSTDCXX_RAW_CXX_FLAGS from configure.ac

2015-04-27 Thread Ilya Enkovich
2015-04-23 12:53 GMT+03:00 Steven Noonan ste...@uplinklabs.net:
 This causes libmpx to fail the configure stage on my build hosts because 
 'xg++'
 was invoked with the bogus '-funconfigured-libstdc++-v3' flag, and the 
 fallback
 preprocessor '/lib/cpp' did not exist on my systems.

 Since libmpx has no C++ code, requiring a C++ preprocessor and compiler at
 configure time doesn't make any sense. Simply remove the directives from
 configure.ac.

 I tested this change on my build environments and the libmpx configure 
 failures
 I was seeing went away. Recommend this be applied to trunk and the gcc-5
 branch.

Thank you for the patch.  I applied it to trunk and will apply to
gcc-5 branch later.

Ilya


 ChangeLog

 2015-04-23  Steven Noonan ste...@uplinklabs.net

* configure.ac: Drop AC_PROG_CXX and friends, since libmpx has no C++
  sources.
* configure: Regenerate.
 ---
  libmpx/configure.ac | 2 --
  1 file changed, 2 deletions(-)

 diff --git a/libmpx/configure.ac b/libmpx/configure.ac
 index 3f8b50f..463e855 100644
 --- a/libmpx/configure.ac
 +++ b/libmpx/configure.ac
 @@ -23,7 +23,6 @@ AC_MSG_RESULT($version_specific_libs)
  AC_CANONICAL_SYSTEM
  target_alias=${target_alias-$host_alias}
  AC_SUBST(target_alias)
 -GCC_LIBSTDCXX_RAW_CXX_FLAGS

  # See if supported.
  unset LIBMPX_SUPPORTED
 @@ -94,7 +93,6 @@ AC_SUBST(toolexeclibdir)
  m4_rename([_AC_ARG_VAR_PRECIOUS],[real_PRECIOUS])
  m4_define([_AC_ARG_VAR_PRECIOUS],[])
  AC_PROG_CC
 -AC_PROG_CXX
  m4_rename_force([real_PRECIOUS],[_AC_ARG_VAR_PRECIOUS])

  AM_PROG_CC_C_O
 --
 2.3.6



Re: [PATCH v2 3/4] libcc1: Add 'set compile-gcc'

2015-04-27 Thread Pedro Alves
On 04/23/2015 09:38 PM, Jan Kratochvil wrote:
 - configury triplet prefix to the compiler.
 + configury triplet prefix to the compiler; TRIPLET_REGEXP can be
 + also absolute filename  to the computer.

to the computer is probably a typo for to the compiler.
Also, double space after filename.

   The arguments are copied by GCC.  ARGV need not be


Thanks,
Pedro Alves


Re: [PATCH] Make wider use of v constraint in i386.md

2015-04-27 Thread Ilya Tocar
On 17 Apr 10:09, Uros Bizjak wrote:
 On Thu, Mar 19, 2015 at 10:24 AM, Ilya Tocar tocarip.in...@gmail.com wrote:
  Hi,
 
  There were some discussion about x constraints being too conservative
  for some patterns in i386.md.
  Patch below fixes it. This is probably stage1 material.
 
  ChangeLog:
 
  gcc/
 
  2015-03-19  Ilya Tocar  ilya.to...@intel.com
 
  * config/i386/i386.h (EXT_SSE_REG_P): New.
  * config/i386/i386.md (*cmpiFPCMP:unordMODEF:mode_mixed): Use 
  v
  constraint.
  (*cmpiFPCMP:unordMODEF:mode_sse): Ditto.
  (*movxi_internal_avx512f): Ditto.
  (define_split): Check for xmm16+, when splitting scalar 
  float_extend.
  (*extendsfdf2_mixed): Use v constraint.
  (*extendsfdf2_sse): Ditto.
  (define_split): Check for xmm16+, when splitting scalar 
  float_truncate.
  (*truncdfsf_fast_sse): Use v constraint.
  (fix_truncMODEF:modeSWI48:mode_sse): Ditto.
  (*floatSWI48:modeMODEF:mode2_sse): Ditto.
  (define_peephole2): Check for xmm16+, when converting scalar
  float_truncate.
  (define_peephole2): Check for xmm16+, when converting scalar
  float_extend.
  (*fop_mode_comm_mixed): Use v constraint.
  (*fop_mode_comm_sse): Ditto.
  (*fop_mode_1_mixed): Ditto.
  (*sqrtmode2_sse): Ditto.
  (*ieee_sieee_maxminmode3): Ditto.
 
 I wonder if there are also changes needed in mmx.md. There are a
 couple of patterns that operate on xmm registers, so they should be
 reviewed if they need to change their constraint to v to accept
 extended xmm register set.


Doesn't look like it. At first glance non-v stuff in mmx.md isn't even
avx enabled, so v is not relevant. Moreover we don't allow mmx modes
(v2sf etc.) in xmm16+ via  ix86_hard_regno_mode_ok.


Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-27 Thread Jeff Law

On 04/27/2015 08:47 AM, Marek Polacek wrote:

On Sat, Apr 25, 2015 at 10:18:59PM +0200, Gerald Pfeifer wrote:

In case this example feels too contrived (even though it is an
excerpt of Wine code), we now also warn about the following where
the two types and variables are defined in different places and
the size of one is set implicitly:

   typedef int r_fun_t (int);

   r_fun_t * text_funcs[] = {0,0,0};

   int report (unsigned t)
   {
 typedef int s_fun_t (long, char);

 static s_fun_t * GUI_funcs[3];

 return (t  sizeof text_funcs / sizeof text_funcs[0] 
 t  sizeof GUI_funcs / sizeof GUI_funcs[0]);
   }

(I also filed this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65891
so that we keep track.)


I'm afraid there isn't an easy solution to this; the problem is that we fold
sizeof early, so the warning sees

   t  sizeof 4  t  4

and warns.  Maybe using SIZEOF_EXPR would help...
Can you file a bug for this for future reference?  We may not tackle 
this specific issue in the current iteration of delayed folding, but it 
certainly helps to have BZs for specific issues that we'd like to fix 
once we can delay folding.


jeff


Re: [C/C++ PATCH] Implement -Wshift-negative-value (PR c/65179)

2015-04-27 Thread Marek Polacek
On Sat, Apr 25, 2015 at 08:13:08PM +, Joseph Myers wrote:
 On Sat, 25 Apr 2015, Marek Polacek wrote:
 
  +   pedwarn (location, OPT_Wshift_negative_value,
  +left shift of negative value);
 
 Use of pedwarn is always suspect for something only undefined at runtime; 
 it must not produce an error with -pedantic-errors in any context where a 
 constant expression is not required.

Makes sense.  So how about moving the warning into -Wextra?  This way it won't
trigger by default.  One change is that we reject programs that use shift with
undefined behavior in a context where a constant expression is required, thus
e.g. enum E { A = -1  0 };
But I hope that's reasonable.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-04-27  Marek Polacek  pola...@redhat.com

PR c/65179
* c-common.c (c_fully_fold_internal): Warn when left shifting a
negative value.
* c.opt (Wshift-negative-value): New option.

* c-typeck.c (build_binary_op): Warn when left shifting a negative
value.

* typeck.c (cp_build_binary_op): Warn when left shifting a negative
value.

* doc/invoke.texi: Document -Wshift-negative-value.

* c-c++-common/Wshift-negative-value-1.c: New test.
* c-c++-common/Wshift-negative-value-2.c: New test.
* c-c++-common/Wshift-negative-value-3.c: New test.
* c-c++-common/Wshift-negative-value-4.c: New test.
* gcc.dg/c99-const-expr-7.c: Add dg-error.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 9797e17..f2fe65e 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -1361,6 +1361,15 @@ c_fully_fold_internal (tree expr, bool in_init, bool 
*maybe_const_operands,
   !TREE_OVERFLOW_P (op0)
   !TREE_OVERFLOW_P (op1))
overflow_warning (EXPR_LOCATION (expr), ret);
+  if (code == LSHIFT_EXPR
+  TREE_CODE (orig_op0) != INTEGER_CST
+  TREE_CODE (TREE_TYPE (orig_op0)) == INTEGER_TYPE
+  TREE_CODE (op0) == INTEGER_CST
+  c_inhibit_evaluation_warnings == 0
+  tree_int_cst_sgn (op0)  0
+  flag_isoc99)
+   warning_at (loc, OPT_Wshift_negative_value,
+   left shift of negative value);
   if ((code == LSHIFT_EXPR || code == RSHIFT_EXPR)
   TREE_CODE (orig_op1) != INTEGER_CST
   TREE_CODE (op1) == INTEGER_CST
diff --git gcc/c-family/c.opt gcc/c-family/c.opt
index 983f4a8..c61dfb1 100644
--- gcc/c-family/c.opt
+++ gcc/c-family/c.opt
@@ -781,6 +781,10 @@ Wshift-count-overflow
 C ObjC C++ ObjC++ Var(warn_shift_count_overflow) Init(1) Warning
 Warn if shift count = width of type
 
+Wshift-negative-value
+C ObjC C++ ObjC++ Var(warn_shift_negative_value) Warning EnabledBy(Wextra)
+Warn if left shifting a negative value
+
 Wsign-compare
 C ObjC C++ ObjC++ Var(warn_sign_compare) Warning LangEnabledBy(C++ ObjC++,Wall)
 Warn about signed-unsigned comparisons
diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 91735b5..36cebc6 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -10707,6 +10707,15 @@ build_binary_op (location_t location, enum tree_code 
code,
   code1 == INTEGER_TYPE)
{
  doing_shift = true;
+ if (TREE_CODE (op0) == INTEGER_CST
+  tree_int_cst_sgn (op0)  0
+  flag_isoc99)
+   {
+ int_const = false;
+ if (c_inhibit_evaluation_warnings == 0)
+   warning_at (location, OPT_Wshift_negative_value,
+   left shift of negative value);
+   }
  if (TREE_CODE (op1) == INTEGER_CST)
{
  if (tree_int_cst_sgn (op1)  0)
diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 91db32a..3cb5a8a 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -4326,11 +4326,21 @@ cp_build_binary_op (location_t location,
}
   else if (code0 == INTEGER_TYPE  code1 == INTEGER_TYPE)
{
+ tree const_op0 = fold_non_dependent_expr (op0);
+ if (TREE_CODE (const_op0) != INTEGER_CST)
+   const_op0 = op0;
  tree const_op1 = fold_non_dependent_expr (op1);
  if (TREE_CODE (const_op1) != INTEGER_CST)
const_op1 = op1;
  result_type = type0;
  doing_shift = true;
+ if (TREE_CODE (const_op0) == INTEGER_CST
+  tree_int_cst_sgn (const_op0)  0
+  (complain  tf_warning)
+  c_inhibit_evaluation_warnings == 0
+  cxx_dialect = cxx11)
+   warning (OPT_Wshift_negative_value,
+left shift of negative value);
  if (TREE_CODE (const_op1) == INTEGER_CST)
{
  if (tree_int_cst_lt (const_op1, integer_zero_node))
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index 7d2f6e5..dcfe4cf 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -271,7 +271,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
 

Re: [PATCH][combine][obvious] Use std::swap instead of manually swapping

2015-04-27 Thread Jeff Law

On 04/27/2015 08:37 AM, Segher Boessenkool wrote:

On Mon, Apr 27, 2015 at 10:55:17AM +0100, Kyrill Tkachov wrote:

Precedents suggest these changes are considered obvious.
So I'll commit this in a couple of days unless someone objects.


Yes it's obvious.  One tiny thing...


@@ -9062,7 +9061,7 @@ known_cond (rtx x, enum rtx_code cond, rtx reg, rtx val)
else if (COMPARISON_P (x) || COMMUTATIVE_ARITH_P (x))
  {
if (rtx_equal_p (XEXP (x, 0), val))
-   cond = swap_condition (cond), temp = val, val = reg, reg = temp;
+   cond = swap_condition (cond), std::swap (val, reg);

if (rtx_equal_p (XEXP (x, 0), reg)  rtx_equal_p (XEXP (x, 1), val))
{


Might as well write this as two statements, like everywhere else, e.g.
Agreed.  I really dislike using ',' like is shown above from a 
readability standpoint.


jeff



Re: [PATCH, rs6000, testsuite] Fix PR target/64579, __TM_end __builtin_tend failed to return transactional state

2015-04-27 Thread Peter Bergner
On Fri, 2015-04-24 at 11:40 -0400, David Edelsohn wrote:
 On Fri, Apr 24, 2015 at 11:34 AM, Segher Boessenkool
  All looks great to me now, thanks for the changes,
 
 Okay for me.

In back porting my patch, I see that the patch doesn't apply
cleanly to the 4.8/4.9 branch code.  Looking at why, I see that it's
because the 4.8/4.9 code doesn't have the following change from
Segher.  I think we want this patch back ported too, don't we?

2014-09-11  Segher Boessenkool  seg...@kernel.crashing.org

   * config/rs6000/htm.md (tabort, tabortdc, tabortdci, tabortwc,
   tabortwci, tbegin, tcheck, tend, trechkpt, treclaim, tsr): Use xor
   instead of minus.
   * config/rs6000/vector.md (cr6_test_for_zero_reverse,
   cr6_test_for_lt_reverse): Ditto.

If so, I can back port it and test it along with mine.  Otherwise, I'll
just work around it.  Let me know what you want me to do.

Peter




Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2015-04-27 Thread Thomas Schwinge
Hi!

Ping.

On Tue, 10 Mar 2015 13:37:47 +0100, I wrote:
 Ping.
 
 On Thu, 19 Feb 2015 09:39:52 +0100, I wrote:
  On Wed, 18 Feb 2015 18:04:21 +0100, I wrote:
   On Mon, 13 Oct 2014 14:33:11 +0400, Ilya Verbin iver...@gmail.com wrote:
On 13 Oct 12:19, Jakub Jelinek wrote:
 On Sat, Oct 11, 2014 at 06:49:00PM +0400, Ilya Verbin wrote:
  2. -foffload-abi=[lp64|ilp32]
 This option is supposed to tell mkoffload (and offload compiler) 
  which ABI is
  used in streamed GIMPLE.  This option is desirable, because host 
  and offload
  compilers must have the same ABI.  The option is generated by the 
  host compiler
  automatically, it should not be specified by user.
 
 But I'd like to understand why is this one needed.
 Why should the compilers care?  Aggregates layout and alignment of
 integral/floating types must match between host and offload 
 compilers, sure,
 but isn't that something streamed already in the LTO bytecode?
 Or is LTO streamer not streaming some types like long_type_node?
 I'd expect if host and offload compiler disagree on long type size 
 that
 you'd just use a different integral type with the same size as long 
 on the
 host.
 Different sized pointers are of course a bigger problem, but can't 
 you just
 error out on that during reading of the LTO, or even handle it (just 
 use
 some integral type for when is the pointer stored in memory, and just
 convert to pointer after reads from memory, and convert back before 
 storing
 to memory).  Erroring out during LTO streaming in sounds just fine to 
 me
 though.

Actually this option was developed by Bernd, so I think PTX team is 
going to use
it somehow.  In MIC's case we're planning just to check in mkoffload 
that host
and target compiler's ABI are the same.  Without this check we will 
crash in LTO
streamer with ICE, so I'd like to issue an error message, rather than 
crashing.
   
   In gcc/config/i386/intelmic-mkoffload.c, this option is now parsed to
   initialize the target_ilp32 variable, which will then be used
   (target_ilp32 ? -m32 : -m64) when invoking different tools.
   
   In nvptx, we've been using the following approach:
   
   --- gcc/config/nvptx/nvptx.h
   +++ gcc/config/nvptx/nvptx.h
   @@ -54,24 +54,28 @@

/* Type Layout.  */

   +#define TARGET_64BIT \
   +  (flag_offload_abi == OFFLOAD_ABI_UNSET ? TARGET_ABI64 \
   +   : flag_offload_abi == OFFLOAD_ABI_LP64 ? true : false)
   +
#define DEFAULT_SIGNED_CHAR 1

#define SHORT_TYPE_SIZE 16
#define INT_TYPE_SIZE 32
   -#define LONG_TYPE_SIZE (TARGET_ABI64 ? 64 : 32)
   +#define LONG_TYPE_SIZE (TARGET_64BIT ? 64 : 32)
#define LONG_LONG_TYPE_SIZE 64
#define FLOAT_TYPE_SIZE 32
#define DOUBLE_TYPE_SIZE 64
#define LONG_DOUBLE_TYPE_SIZE 64

#undef SIZE_TYPE
   -#define SIZE_TYPE (TARGET_ABI64 ? long unsigned int : unsigned int)
   +#define SIZE_TYPE (TARGET_64BIT ? long unsigned int : unsigned int)
#undef PTRDIFF_TYPE
   -#define PTRDIFF_TYPE (TARGET_ABI64 ? long int : int)
   +#define PTRDIFF_TYPE (TARGET_64BIT ? long int : int)

   -#define POINTER_SIZE (TARGET_ABI64 ? 64 : 32)
   +#define POINTER_SIZE (TARGET_64BIT ? 64 : 32)

   -#define Pmode (TARGET_ABI64 ? DImode : SImode)
   +#define Pmode (TARGET_64BIT ? DImode : SImode)

/* Registers.  Since ptx is a virtual target, we just define a few
   hard registers for special purposes and leave pseudos unallocated.  */
   
   Should we settle on one of the two, that is, either pass -m[...] from
   mkoffload, or handle flag_offload_abi in the respective backend?  I think
   I prefer the intelmic-mkoffload.c approach; this seems cleaner to me:
   mkoffload configures the offloading compiler.  (Also, the flag 32-bit
   vs. 64-bit flag may in fact be needed for tools other than the offloading
   compiler).
  
  Well, for instance, when in -m32 mode, we also need to use it in
  gcc/config/nvptx/mkoffload.c:compile_native -- otherwise, later on, the
  linker will not be happy with the request to link a 64-bit object file
  with 32-bit object files...
  
   Bernd, is there any specific reason for the approach you had
   chosen?
  
  OK to do the following instead?  (Coding style/code copied from
  gcc/config/i386/intelmic-mkoffload.c for consistency.)
  
  --- gcc/config/nvptx/mkoffload.c
  +++ gcc/config/nvptx/mkoffload.c
  @@ -126,6 +126,9 @@ static id_map *var_ids, **vars_tail = var_ids;
   static const char *ptx_name;
   static const char *ptx_cfile_name;
   
  +/* Shows if we should compile binaries for i386 instead of x86-64.  */
  +bool target_ilp32 = false;
  +
   /* Delete tempfiles.  */
   
   /* Unlink a temporary file unless requested otherwise.  */
  @@ -883,6 +886,7 @@ compile_native (const char *infile, const char 
  *outfile, const char *compiler)
 struct obstack argv_obstack;

Re: [PATCH][AArch64] Improve spill code - swap order in shl pattern

2015-04-27 Thread James Greenhalgh
On Mon, Apr 27, 2015 at 02:37:12PM +0100, Wilco Dijkstra wrote:
 Various instructions are supported as integer operations as well as SIMD on
 AArch64. When register pressure is high, lra-constraints inserts spill code
 without taking the allocation class into account, and basically chooses the
 first available pattern that matches. Since this instruction has the SIMD
 version first it is usually chosen eventhough some of the operands are
 eventually allocated to integer registers. The result is inefficient code not
 only due to the higher latency of SIMD instructions but also due to the extra
 int-FP moves. Placing the integer variant first in the shl pattern
 generates far more optimal spill code. A few more patterns are the wrong way
 around, which I'll address in a separate patch. I'm also looking into fixing
 lra-constraints to generate the expected code by taking the allocno class
 into account in the cost calculations during spilling.
 
 2015-04-27  Wilco Dijkstra  wdijk...@arm.com
 
 * gcc/config/aarch64/aarch64.md (aarch64_ashl_sisd_or_int_mode3):
 Place integer variant first.

OK, thanks for the fix.

Cheers,
James



Re: Offloading compilers' libgcc installation

2015-04-27 Thread Jakub Jelinek
On Mon, Apr 27, 2015 at 06:14:36PM +0200, Thomas Schwinge wrote:
 Hi!
 
 Ping.  (Or can Bernd just commit this patch,
 http://news.gmane.org/find-root.php?message_id=%3C54E5ACCE.7080502%40codesourcery.com%3E
 (with my review comment addressed,
 http://news.gmane.org/find-root.php?message_id=%3C87h9uhlypt.fsf%40schwinge.name%3E),
 given his nvptx architecture maintainership?)

Ok for trunk.

Jakub


RE: [PATCH][AArch64] Use conditional negate for abs expansion

2015-04-27 Thread Wilco Dijkstra
 James Greenhalgh wrote:
 On Mon, Apr 27, 2015 at 02:42:36PM +0100, Wilco Dijkstra wrote:
   -Original Message-
   From: Wilco Dijkstra [mailto:wdijk...@arm.com]
   Sent: 03 March 2015 16:19
   To: GCC Patches
   Subject: [PATCH][AArch64] Use conditional negate for abs expansion
  
   Expand abs into a compare and conditional negate. This is the most 
   obvious expansion,
 enables
   merging of the comparison into ALU instructions and is faster on all 
   implementations.
   Bootstrapped  regression tested.
  
   int f(int x) { return abs (x + 1); }
  
   Before:
   add w0, w0, 1
   sxtwx0, w0
   eor x1, x0, x0, asr 63
   sub x1, x1, x0, asr 63
   mov x0, x1
   ret
  
   After:
   addsw0, w0, 1
   csneg   w0, w0, w0, pl
   ret
  
   ChangeLog:
  
   2015-03-03  Wilco Dijkstra  wdijk...@arm.com
  
 * gcc/config/aarch64/aarch64.md (absdi2): optimize abs expansion.
 (csneg3mode_insn): enable expansion of pattern.
 * gcc/testsuite/gcc.target/aarch64/abs_1.c (abs64): update test
 for new abs expansion.  (abs64_in_dreg): likewise.
 
 
 This looks like it breaks support for abs in a D register (for example
 at the end of a loop, or extracted from Neon Intrinsics, etc).
 
 e.g. (totally contrived...)
 
   int64x1_t
   abs_max (int64x2_t x, int64_t *wb)
   {
 int64_t y = vgetq_lane_s64 (x, 0);
 if (y  0)
   y = -y;
 return vdup_n_s64 (y);
   }
 
 Which currently generates:
 
   abs_max:
   abs d0, d0
   ret
 
 I suppose we don't need to worry too much about that (and the current
 implementation doesn't seem to catch it reliably anyway), but it would be
 good if we could keep the support - even if it is rarely used.

Well it may be possible to write a peephole for this rare case, but I hope
people would use a vabs if that's what they wanted...

Wilco




Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Jeff Law

On 04/27/2015 06:46 AM, Richard Biener wrote:

On Mon, 27 Apr 2015, Richard Biener wrote:


On Fri, 24 Apr 2015, Jeff Law wrote:


On 02/17/2015 07:58 AM, Richard Biener wrote:
[ Restarting a old thread... ]


On a closer look the record_const_or_copy_1 hunk is redundant
(record_equality is really a bit obfuscated...).

Agreed.  I'm not entirely sure how it got to this point.


And record_equality is where the SSA_NAME_VALUEs might end up
forming chains?  At least because we might record a SSA_NAME_VALUE
for sth that might be the target of a SSA_NAME_VALUE, as in

   a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
   if (b_2 == i_3(D))
 ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)

thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
can't easily avoid that because we don't keep track of a
reverse mapping (nor would we want to update that).

Right.  In general, the fact that we're in SSA form means that we ought not
get loops in the SSA_NAME_VALUE chain since there's a single static assignment
and we'll visit it once.

That nice model breaks down in two ways.  First we derive equivalences from
equality conditions like you've shown above.  Second, when threading we can
thread through a loop latch and start reprocessing a block.  The interaction
between those two can result in loops in the value chain.

What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
is independent of the dominator walk (and optimizer). Instead of walking
forward through the dominator tree recording key nuggets, then removing those
nuggets as we pop back up the tree, we instead we start with the conditional
and walk up the use-def chains, simplifying as we go, with the goal of
simplifying to a constant, of course.

You can see the beginnings of that with the FSM bits from Sebastian.

Obviously until those bits are in place, we should try to clean up any warts
in the current implementation.



Btw, in record_equality, the == part of the = check for the loop
depth will just randomly swap x and y while we should prefer
IL canonical order.  If we can't rely on the input being in IL
canonical order we should prepend the whole function with

   if (tree_swap_operands_p (x, y, false))
 std::swap (x, y);

and change = to  for the loop-depth check.

Agreed.  Makes perfect sense.


I'm now re-bootstrapping and testing the following.


Committed as follows, with XFAILing and re-opening PR65217.

Richard.

2015-04-27  Richard Biener  rguent...@suse.de

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p.  Do not swap operands for same loop depth.

* gcc.target/i386/pr65217.c: XFAIL.
Looks good.  I'd spun the same record_equality changes over the weekend 
and have state on regressing 65217.


65217 is an interesting test.


  if ((n  -n) != n)
__builtin_unreachable();
  return n;

n  -n will (of course) get computed into an SSA_NAME.  We then 
propagate that name for the use of n in the return statement rather 
than using the effectively zero cost n.


If we put some smarts into tree_swap_operands_p to order sensibly in 
this kind of case, we end up regressing a different case that I'll be 
looking at today.


jeff




Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses

2015-04-27 Thread Jeff Law

On 04/25/2015 05:49 AM, Richard Sandiford wrote:



@@ -2099,9 +2107,9 @@ fix_crossing_conditional_branches (void)
  emit_label (new_label);

  gcc_assert (GET_CODE (old_label) == LABEL_REF);
- old_label = JUMP_LABEL (old_jump);
- new_jump = emit_jump_insn (gen_jump (old_label));
- JUMP_LABEL (new_jump) = old_label;
+ old_label_insn = JUMP_LABEL_AS_INSN (old_jump);
+ new_jump = emit_jump_insn (gen_jump (old_label_insn));
+ JUMP_LABEL (new_jump) = old_label_insn;

  last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)-prev_bb;
  new_bb = create_basic_block (new_label, new_jump, last_bb);


I think the eventual aim would be to have rtx_jump_insn member functions
that get and set the jump label as an rtx_insn, with JUMP_LABEL_AS_INSN
being a stepping stone towards that.  In cases like this it might make
more sense to ensure old_jump has the right type (rtx_jump_insn) and go
straight to the member functions, rather than switching to JUMP_LABEL_AS_INSN
now and then having to rewrite it later.
I'm comfortable with either way, so long as we get there.  I know that 
David certainly found it easier to introduce scaffolding early in this 
patch series, then exploit it, then tear down the scaffolding near the 
end of a patch series.



diff --git a/gcc/is-a.h b/gcc/is-a.h
index 58917eb..4fb9dde 100644
--- a/gcc/is-a.h
+++ b/gcc/is-a.h
@@ -46,6 +46,11 @@ TYPE as_a TYPE (pointer)

do_something_with (as_a cgraph_node * *ptr);

+TYPE assert_as_a TYPE (pointer)
+
+Like as_a TYPE (pointer), but uses assertion, which is enabled even in
+non-checking (release) build.
+
  TYPE safe_as_a TYPE (pointer)

  Like as_a TYPE (pointer), but where pointer could be NULL.  This
@@ -193,6 +198,17 @@ as_a (U *p)
return is_a_helper T::cast (p);
  }

+/* Same as above, but checks the condition even in release build.  */
+
+template typename T, typename U
+inline T
+assert_as_a (U *p)
+{
+  gcc_assert (is_a T (p));
+  return is_a_helper T::cast (p);
+}
+
+
  /* Similar to as_a, but where the pointer can be NULL, even if
 is_a_helperT doesn't check for NULL.  */


This preserves the behaviour of the original code but I'm not sure
it's worth it.  I doubt the distinction between:

   gcc_assert (JUMP_P (x));

and:

   gcc_checking_assert (JUMP_P (x));

was ever very scientific.  Seems like we should take this refactoring as
an opportunity to make the checking more consistent.
Without some guidelines I suspect usage of gcc_check_assert would be 
highly inconsistent.


And ultimately we want to get away from the helpers as much as possible, 
instead relying on the static typesystem to detect errors at compile 
time.  So unless there's a compelling reason, I'd prefer not to add more 
support for these helpers.


[ snip]



That seems pretty heavy-weight for LRA-local code.  Also, the long-term
plan is for INSN_LIST and rtx_insn to be in separate hierarchies,
at which point we'd have no alias-safe way to distinguish them.
That's certainly what I think ought to happen.  INSN_LIST should turn 
into a standard vector or forward list.  For the use cases in GCC, 
either ought to be acceptable.


Jeff


Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-27 Thread Marek Polacek
On Mon, Apr 27, 2015 at 10:06:26AM -0600, Jeff Law wrote:
 On 04/27/2015 08:47 AM, Marek Polacek wrote:
 On Sat, Apr 25, 2015 at 10:18:59PM +0200, Gerald Pfeifer wrote:
 In case this example feels too contrived (even though it is an
 excerpt of Wine code), we now also warn about the following where
 the two types and variables are defined in different places and
 the size of one is set implicitly:
 
typedef int r_fun_t (int);
 
r_fun_t * text_funcs[] = {0,0,0};
 
int report (unsigned t)
{
  typedef int s_fun_t (long, char);
 
  static s_fun_t * GUI_funcs[3];
 
  return (t  sizeof text_funcs / sizeof text_funcs[0] 
  t  sizeof GUI_funcs / sizeof GUI_funcs[0]);
}
 
 (I also filed this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65891
 so that we keep track.)
 
 I'm afraid there isn't an easy solution to this; the problem is that we fold
 sizeof early, so the warning sees
 
t  sizeof 4  t  4
 
 and warns.  Maybe using SIZEOF_EXPR would help...
 Can you file a bug for this for future reference?  We may not tackle this
 specific issue in the current iteration of delayed folding, but it certainly
 helps to have BZs for specific issues that we'd like to fix once we can
 delay folding.

Gerald already opened PR65891 for this :).

Marek


  1   2   >