Re: [PATCH] Fix PR52298

2012-02-24 Thread Richard Guenther
On Thu, 23 Feb 2012, Ulrich Weigand wrote:

 Richard Guenther wrote:
 
  PR tree-optimization/52298
  * tree-vect-stmts.c (vectorizable_store): Properly use
  STMT_VINFO_DR_STEP instead of DR_STEP when vectorizing
  outer loops.
  (vectorizable_load): Likewise.
  * tree-vect-data-refs.c (vect_analyze_data_ref_access):
  Access DR_STEP after ensuring it is not NULL.
 
 This causes a bunch of regressions on SPU:
 
 FAIL: gcc.dg/vect/vect-outer-fir-big-array.c (internal compiler error)
 FAIL: gcc.dg/vect/vect-outer-fir-big-array.c (test for excess errors)
 WARNING: gcc.dg/vect/vect-outer-fir-big-array.c compilation failed to produce executable
 FAIL: gcc.dg/vect/vect-outer-fir-big-array.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 2
 FAIL: gcc.dg/vect/vect-outer-fir-lb-big-array.c (internal compiler error)
 FAIL: gcc.dg/vect/vect-outer-fir-lb-big-array.c (test for excess errors)
 WARNING: gcc.dg/vect/vect-outer-fir-lb-big-array.c compilation failed to produce executable
 FAIL: gcc.dg/vect/vect-outer-fir-lb-big-array.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 2
 FAIL: gcc.dg/vect/vect-outer-fir-lb.c (internal compiler error)
 FAIL: gcc.dg/vect/vect-outer-fir-lb.c (test for excess errors)
 WARNING: gcc.dg/vect/vect-outer-fir-lb.c compilation failed to produce executable
 FAIL: gcc.dg/vect/vect-outer-fir-lb.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 2
 FAIL: gcc.dg/vect/vect-outer-fir.c (internal compiler error)
 FAIL: gcc.dg/vect/vect-outer-fir.c (test for excess errors)
 WARNING: gcc.dg/vect/vect-outer-fir.c compilation failed to produce executable
 FAIL: gcc.dg/vect/vect-outer-fir.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 2
 
 all due to ICEs of the same type:
 
  internal compiler error: in vectorizable_load, at tree-vect-stmts.c:4665
 
 The assert in question looks like:
 
   if (nested_in_vect_loop
       && (TREE_INT_CST_LOW (STMT_VINFO_DR_STEP (stmt_info))
           % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
     {
       gcc_assert (alignment_support_scheme != dr_explicit_realign_optimized);
       compute_in_loop = true;
     }
 
 where your patch changed DR_STEP to STMT_VINFO_DR_STEP (reverting just this
 one change makes the ICEs go away).
 
 However, at the place where the decision to use the
 dr_explicit_realign_optimized strategy is made
 (tree-vect-data-refs.c:vect_supportable_dr_alignment), we still have:
 
   if ((nested_in_vect_loop
        && (TREE_INT_CST_LOW (DR_STEP (dr))
            != GET_MODE_SIZE (TYPE_MODE (vectype))))
       || !loop_vinfo)
     return dr_explicit_realign;
   else
     return dr_explicit_realign_optimized;
 
 Should this now also use STMT_VINFO_DR_STEP?

Yes, I think so.

Richard.

 Bye,
 Ulrich
 
 

-- 
Richard Guenther rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Re: [PATCH 5/5] dump_file whitespace nitpicks

2012-02-24 Thread Richard Guenther
On Thu, Feb 23, 2012 at 7:21 PM, Bernhard Reutner-Fischer
rep.dot@gmail.com wrote:
 gcc/ChangeLog:

 2012-02-23  Bernhard Reutner-Fischer  al...@gcc.gnu.org

        * tree-into-ssa.c (update_ssa): Avoid trailing whitespace in
        dump_file.
        * tree-ssa-sccvn.c (print_scc): Ditto.

Ok.

Thanks,
Richard.

 Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
 ---
  gcc/tree-into-ssa.c  |    4 ++--
  gcc/tree-ssa-sccvn.c |    4 ++--
  2 files changed, 4 insertions(+), 4 deletions(-)

 diff --git a/gcc/tree-into-ssa.c b/gcc/tree-into-ssa.c
 index 7eaed2a..6ca52c1 100644
 --- a/gcc/tree-into-ssa.c
 +++ b/gcc/tree-into-ssa.c
 @@ -3519,9 +3519,9 @@ update_ssa (unsigned update_flags)

       if (dump_flags & TDF_DETAILS)
        {
 -         fprintf (dump_file, "Affected blocks: ");
 +         fprintf (dump_file, "Affected blocks:");
          EXECUTE_IF_SET_IN_BITMAP (blocks_to_update, 0, i, bi)
 -           fprintf (dump_file, "%u ", i);
 +           fprintf (dump_file, " %u", i);
          fprintf (dump_file, "\n");
        }

 diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
 index fdebe47..ddb1ba6 100644
 --- a/gcc/tree-ssa-sccvn.c
 +++ b/gcc/tree-ssa-sccvn.c
 @@ -2462,11 +2462,11 @@ print_scc (FILE *out, VEC (tree, heap) *scc)
   tree var;
   unsigned int i;

 -  fprintf (out, "SCC consists of: ");
 +  fprintf (out, "SCC consists of:");
   FOR_EACH_VEC_ELT (tree, scc, i, var)
     {
 -      print_generic_expr (out, var, 0);
       fprintf (out, " ");
 +      print_generic_expr (out, var, 0);
     }
   fprintf (out, "\n");
  }
 --
 1.7.9



Re: [PATCH 4/5] make_phi_node can be static

2012-02-24 Thread Richard Guenther
On Thu, Feb 23, 2012 at 7:21 PM, Bernhard Reutner-Fischer
rep.dot@gmail.com wrote:
 gcc/ChangeLog:

 2012-02-23  Bernhard Reutner-Fischer  al...@gcc.gnu.org

        * tree-phinodes.c (make_phi_node): Mark static.
        * tree-flow.h (make_phi_node): Remove extern decl.
        * doc/gimple.texi (make_phi_node): Remove documentation.

Ok.

Thanks,
Richard.

 Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
 ---
  gcc/doc/gimple.texi |    4 ----
  gcc/tree-flow.h     |    1 -
  gcc/tree-phinodes.c |    2 +-
  3 files changed, 1 insertions(+), 6 deletions(-)

 diff --git a/gcc/doc/gimple.texi b/gcc/doc/gimple.texi
 index b75dc72..fa31eb0 100644
 --- a/gcc/doc/gimple.texi
 +++ b/gcc/doc/gimple.texi
 @@ -1963,10 +1963,6 @@ Set @code{CLAUSES} to be the clauses associated with @code{OMP_SINGLE} @code{G}.
  @subsection @code{GIMPLE_PHI}
  @cindex @code{GIMPLE_PHI}

 -@deftypefn {GIMPLE function} gimple make_phi_node (tree var, int len)
 -Build a @code{PHI} node with len argument slots for variable var.
 -@end deftypefn
 -
  @deftypefn {GIMPLE function} unsigned gimple_phi_capacity (gimple g)
  Return the maximum number of arguments supported by @code{GIMPLE_PHI} @code{G}.
  @end deftypefn
 diff --git a/gcc/tree-flow.h b/gcc/tree-flow.h
 index f4c4d5c..319be2b 100644
 --- a/gcc/tree-flow.h
 +++ b/gcc/tree-flow.h
 @@ -504,7 +504,6 @@ extern void find_referenced_vars_in (gimple);
  /* In tree-phinodes.c  */
  extern void reserve_phi_args_for_new_edge (basic_block);
  extern void add_phi_node_to_bb (gimple phi, basic_block bb);
 -extern gimple make_phi_node (tree var, int len);
  extern gimple create_phi_node (tree, basic_block);
  extern void add_phi_arg (gimple, tree, edge, source_location);
  extern void remove_phi_args (edge);
 diff --git a/gcc/tree-phinodes.c b/gcc/tree-phinodes.c
 index 1d7e5c2..218a551 100644
 --- a/gcc/tree-phinodes.c
 +++ b/gcc/tree-phinodes.c
 @@ -204,7 +204,7 @@ ideal_phi_node_len (int len)

  /* Return a PHI node with LEN argument slots for variable VAR.  */

 -gimple
 +static gimple
  make_phi_node (tree var, int len)
  {
   gimple phi;
 --
 1.7.9



Re: [PR51752] publication safety violations in loop invariant motion pass

2012-02-24 Thread Richard Guenther
On Thu, Feb 23, 2012 at 10:11 PM, Aldy Hernandez al...@redhat.com wrote:
 On 02/23/12 12:19, Aldy Hernandez wrote:

 about hit me. Instead now I save all loads in a function and iterate
 through them in a brute force way. I'd like to rewrite this into a hash
 of some sort, but before I go any further I'm interested to know if the
 main idea is ok.


 For the record, it may be ideal to reuse some of the iterations we already
 do over the function's basic blocks, so we don't have to iterate yet again
 over the IL stream.  Though it may complicate the pass unnecessarily.

It's definitely ugly that you need to do this.  And yes, I guess you would
have to look at a great many passes; PRE comes to mind at least.

I still do not understand the situation very well, at least why for

 transaction { for () if (b) load; } load;

it should be safe to hoist load out of the transaction while for

 load; transaction { for () if (b) load; }

it is not.  Neither do I understand why it's not ok for

 transaction { for () if (b) load; }

to hoist the load out of the transaction.

I assume the whole issue is about hoisting things over the transaction start.
So - why does the transaction start not simply clobber all global memory?
That would avoid the hoisting.  I assume that transforming the above to

 transaction { tem = load; for () if (b) = tem; }

is ok?

Thanks,
Richard.


RE: spill failure after IF-CASE-2 transformation

2012-02-24 Thread Henderson, Stuart
I think this is a fairly reasonable minimal fix. For 4.8 we could
experiment whether to always do this, regardless of s_r_c_f_m_p.

Ok if bootstrapped and tested on a primary target (i.e. linux) and
tested on Blackfin.


Bernd

Thanks again, Bernd.

Forwarding to gcc-patches to give people a couple of days to object.

Tested on linux and bfin.

Stu


diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 8d81c89..e4e13ab 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -2295,7 +2295,9 @@ noce_get_condition (rtx jump, rtx *earliest, bool then_else_reversed)

   cond = XEXP (SET_SRC (set), 0);
   tmp = XEXP (cond, 0);
-  if (REG_P (tmp) && GET_MODE_CLASS (GET_MODE (tmp)) == MODE_INT)
+  if (REG_P (tmp) && GET_MODE_CLASS (GET_MODE (tmp)) == MODE_INT
+      && (GET_MODE (tmp) != BImode
+          || !targetm.small_register_classes_for_mode_p (BImode)))
 {
   *earliest = jump;







Re: [PATCH 3/5] tree-if-conv: Commentary typo fix

2012-02-24 Thread Bernhard Reutner-Fischer
On Thu, Feb 23, 2012 at 07:21:29PM +0100, Bernhard Reutner-Fischer wrote:
gcc/ChangeLog:

2012-02-23  Bernhard Reutner-Fischer  al...@gcc.gnu.org

   * tree-if-conv.c (predicate_scalar_phi): Commentary typo fix.

Applied to trunk as obvious as r184546.

Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
---
 gcc/tree-if-conv.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index cdbbe5b..ca9503f 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1262,7 +1262,7 @@ find_phi_replacement_condition (struct loop *loop,
arguments.
 
For example,
- S1: A = PHI <x1(1), x2(5)
+ S1: A = PHI <x1(1), x2(5)>
is converted into,
  S2: A = cond ? x1 : x2;
 
-- 
1.7.9



Re: [PATCH, i386, Android] Enable __ANDROID__ macro for Android i386 target

2012-02-24 Thread Ilya Enkovich
 On Wed, Feb 22, 2012 at 6:59 AM, Ilya Enkovich enkovich@gmail.com wrote:
 Hello,

 Here is a one-line fix to enable __ANDROID__ macro on i386 Android
 target. OK for trunk?

 Thanks,
 Ilya
 --

 2012-02-22  Enkovich Ilya  ilya.enkov...@intel.com

        * gcc/config/i386/gnu-user.h (TARGET_OS_CPP_BUILTINS): Add
        ANDROID_TARGET_OS_CPP_BUILTINS.


 diff --git a/gcc/config/i386/gnu-user.h b/gcc/config/i386/gnu-user.h
 index 98d0a25..d317229 100644
 --- a/gcc/config/i386/gnu-user.h
 +++ b/gcc/config/i386/gnu-user.h
 @@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.  If not see
   do                                           \
     {                                          \
        GNU_USER_TARGET_OS_CPP_BUILTINS();      \
 +       ANDROID_TARGET_OS_CPP_BUILTINS();       \
     }                                          \
   while (0)

 I think this should be done in linux.h, not gnu-user.h.

I am fixing a macro that is defined in gnu-user.h. How do you suggest I
do that in linux.h?

Ilya

 --
 H.J.


Re: [PATCH, i386, Android] Enable exceptions and RTTI by default for Android

2012-02-24 Thread Ilya Enkovich
 On Wed, Feb 22, 2012 at 3:57 PM, Ilya Enkovich enkovich@gmail.com wrote:
 Hello,

 Here is a simple patch which enables exceptions and RTTI by default
 for Android target. OK for trunk?

 Err - isn't that the default?  Thus, simply delete the bogus spec?

 Richard.


Hi,

Is the following patch OK, or is it better to remove the whole macro and its usages?

Thanks,
Ilya
--
2012-02-22  Enkovich Ilya  ilya.enkov...@intel.com

* gcc/config/linux-android.h (ANDROID_CC1PLUS_SPEC): Enable
exceptions and rtti by default.


diff --git a/gcc/config/linux-android.h b/gcc/config/linux-android.h
index 94c5274..180b62b 100644
--- a/gcc/config/linux-android.h
+++ b/gcc/config/linux-android.h
@@ -45,9 +45,7 @@
   "%{!mglibc:%{!muclibc:%{!mbionic: -mbionic}}} "   \
   "%{!fno-pic:%{!fno-PIC:%{!fpic:%{!fPIC: -fPIC}}}}"

-#define ANDROID_CC1PLUS_SPEC   \
-  "%{!fexceptions:%{!fno-exceptions: -fno-exceptions}} "\
-  "%{!frtti:%{!fno-rtti: -fno-rtti}}"
+#define ANDROID_CC1PLUS_SPEC ""

 #define ANDROID_LIB_SPEC \
   "%{!static: -ldl}"


[PATCH] Fix PR52361 (a bit)

2012-02-24 Thread Richard Guenther

This avoids redundant verify_gimple_in_cfg calls (the original idea
was to always verify gimple when we verify SSA form).
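A toy model of the reordering may help when reading the diff below. It is only a sketch: the flag values and function names are stand-ins rather than GCC's, the loop-closed-SSA condition is left out, and the counter simply records how often the (expensive) GIMPLE walk would run for a given flag set.

```c
/* Stand-in flag values; the real TODO_* flags are defined inside GCC.  */
enum { MODEL_VERIFY_SSA = 1, MODEL_VERIFY_STMTS = 2 };

static int gimple_walks;

/* Stand-in for the GIMPLE verifier: just count invocations.  */
static void
model_verify_gimple (void)
{
  gimple_walks++;
}

/* Before the patch: the SSA verifier walked the GIMPLE itself, and the
   pass manager walked it again when statement verification was requested.  */
static int
walks_before (unsigned flags)
{
  gimple_walks = 0;
  if (flags & MODEL_VERIFY_SSA)
    model_verify_gimple ();	/* done inside the SSA verifier */
  if (flags & MODEL_VERIFY_STMTS)
    model_verify_gimple ();
  return gimple_walks;
}

/* After the patch: the pass manager walks the GIMPLE once, either ahead
   of SSA verification or on its own.  */
static int
walks_after (unsigned flags)
{
  gimple_walks = 0;
  if (flags & MODEL_VERIFY_SSA)
    model_verify_gimple ();	/* the SSA verifier no longer does this */
  else if (flags & MODEL_VERIFY_STMTS)
    model_verify_gimple ();
  return gimple_walks;
}
```

With both flags set the model walks the GIMPLE twice before the change and once after it; with a single flag the count stays at one either way.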

Scheduled for a bootstrap/regtest.

Richard.

2012-02-24  Richard Guenther  rguent...@suse.de

PR middle-end/52361
* passes.c (execute_function_todo): When verifying SSA form
verify gimple form first.
* tree-ssa.c (verify_ssa): Do not verify gimple form here.

Index: gcc/passes.c
===
--- gcc/passes.c (revision 184541)
+++ gcc/passes.c (working copy)
@@ -1724,11 +1724,14 @@ execute_function_todo (void *data)
 #if defined ENABLE_CHECKING
   if (flags & TODO_verify_ssa
       || (current_loops && loops_state_satisfies_p (LOOP_CLOSED_SSA)))
-    verify_ssa (true);
+    {
+      verify_gimple_in_cfg (cfun);
+      verify_ssa (true);
+    }
+  else if (flags & TODO_verify_stmts)
+    verify_gimple_in_cfg (cfun);
   if (flags & TODO_verify_flow)
     verify_flow_info ();
-  if (flags & TODO_verify_stmts)
-    verify_gimple_in_cfg (cfun);
   if (current_loops && loops_state_satisfies_p (LOOP_CLOSED_SSA))
     verify_loop_closed_ssa (false);
   if (flags & TODO_verify_rtl_sharing)
Index: gcc/tree-ssa.c
===
--- gcc/tree-ssa.c  (revision 184541)
+++ gcc/tree-ssa.c  (working copy)
@@ -925,8 +925,6 @@ verify_ssa (bool check_modified_stmt)
 
   gcc_assert (!need_ssa_update_p (cfun));
 
-  verify_gimple_in_cfg (cfun);
-
   timevar_push (TV_TREE_SSA_VERIFY);
 
   /* Keep track of SSA names present in the IL.  */


Re: [PATCH, i386, Android] Enable exceptions and RTTI by default for Android

2012-02-24 Thread Richard Guenther
On Fri, Feb 24, 2012 at 11:22 AM, Ilya Enkovich enkovich@gmail.com wrote:
 On Wed, Feb 22, 2012 at 3:57 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 Hello,

 Here is a simple patch which enables exceptions and RTTI by default
 for Android target. OK for trunk?

 Err - isn't that the default?  Thus, simply delete the bogus spec?

 Richard.


 Hi,

 Is the following patch OK, or is it better to remove the whole macro and its usages?

The latter.

Richard.

 Thanks,
 Ilya
 --
 2012-02-22  Enkovich Ilya  ilya.enkov...@intel.com

        * gcc/config/linux-android.h (ANDROID_CC1PLUS_SPEC): Enable
        exceptions and rtti by default.


 diff --git a/gcc/config/linux-android.h b/gcc/config/linux-android.h
 index 94c5274..180b62b 100644
 --- a/gcc/config/linux-android.h
 +++ b/gcc/config/linux-android.h
 @@ -45,9 +45,7 @@
   "%{!mglibc:%{!muclibc:%{!mbionic: -mbionic}}} "                      \
   "%{!fno-pic:%{!fno-PIC:%{!fpic:%{!fPIC: -fPIC}}}}"

 -#define ANDROID_CC1PLUS_SPEC                                           \
 -  "%{!fexceptions:%{!fno-exceptions: -fno-exceptions}} "               \
 -  "%{!frtti:%{!fno-rtti: -fno-rtti}}"
 +#define ANDROID_CC1PLUS_SPEC ""

  #define ANDROID_LIB_SPEC \
   "%{!static: -ldl}"


[PATCH] Fix PR52355

2012-02-24 Thread Richard Guenther

This fixes PR52355 by extending the existing folding of &a[i] - &a[j]
to cover multi-dimensional array access and non-equal base.
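As a rough illustration of what the extended folding has to compute, the sketch below works out the difference for the 16x16x16 char array from the testcase by summing (index difference) * (element size) over the dimensions, outermost first. The helper names are made up for this note and are not part of the patch; the reference value goes through intptr_t so that no pointers into different sub-arrays are subtracted directly.

```c
#include <stddef.h>
#include <stdint.h>

enum { DIM = 16 };

static char cube[DIM][DIM][DIM];

/* Hypothetical helper: the difference between element [i0][j0][k0] and
   element [i1][j1][k1], built up one dimension at a time.  */
static ptrdiff_t
folded_difference (int i0, int j0, int k0, int i1, int j1, int k1)
{
  ptrdiff_t outer = (ptrdiff_t) (i0 - i1) * (ptrdiff_t) sizeof (char[DIM][DIM]);
  ptrdiff_t middle = (ptrdiff_t) (j0 - j1) * (ptrdiff_t) sizeof (char[DIM]);
  ptrdiff_t inner = (ptrdiff_t) (k0 - k1) * (ptrdiff_t) sizeof (char);
  return outer + middle + inner;
}

/* Reference value taken from the actual addresses.  */
static ptrdiff_t
address_difference (int i0, int j0, int k0, int i1, int j1, int k1)
{
  intptr_t p0 = (intptr_t) &cube[i0][j0][k0];
  intptr_t p1 = (intptr_t) &cube[i1][j1][k1];
  return (ptrdiff_t) (p0 - p1);
}
```

For the expression in the new testcase, folded_difference (1, 0, 0, 0, 0, 0) is 16 * 16 = 256, which is the constant the "i" constraint needs to see.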

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2012-02-24  Richard Guenther  rguent...@suse.de

PR middle-end/52355
* fold-const.c (fold_addr_of_array_ref_difference): New function.
(fold_binary_loc): Use it to extend the existing &a[i] - &a[j]
folding.

* gcc.dg/pr52355.c: New testcase.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 184541)
+++ gcc/fold-const.c(working copy)
@@ -9671,6 +9674,44 @@ fold_vec_perm (tree type, tree arg0, tre
 }
 }
 
+/* Try to fold a pointer difference of type TYPE two address expressions of
+   array references AREF0 and AREF1 using location LOC.  Return a
+   simplified expression for the difference or NULL_TREE.  */
+
+static tree
+fold_addr_of_array_ref_difference (location_t loc, tree type,
+                                   tree aref0, tree aref1)
+{
+  tree base0 = TREE_OPERAND (aref0, 0);
+  tree base1 = TREE_OPERAND (aref1, 0);
+  tree base_offset = build_int_cst (type, 0);
+
+  /* If the bases are array references as well, recurse.  If the bases
+     are pointer indirections compute the difference of the pointers.
+     If the bases are equal, we are set.  */
+  if ((TREE_CODE (base0) == ARRAY_REF
+       && TREE_CODE (base1) == ARRAY_REF
+       && (base_offset
+           = fold_addr_of_array_ref_difference (loc, type, base0, base1)))
+      || (INDIRECT_REF_P (base0)
+          && INDIRECT_REF_P (base1)
+          && (base_offset = fold_binary_loc (loc, MINUS_EXPR, type,
+                                             TREE_OPERAND (base0, 0),
+                                             TREE_OPERAND (base1, 0))))
+      || operand_equal_p (base0, base1, 0))
+    {
+      tree op0 = fold_convert_loc (loc, type, TREE_OPERAND (aref0, 1));
+      tree op1 = fold_convert_loc (loc, type, TREE_OPERAND (aref1, 1));
+      tree esz = fold_convert_loc (loc, type, array_ref_element_size (aref0));
+      tree diff = build2 (MINUS_EXPR, type, op0, op1);
+      return fold_build2_loc (loc, PLUS_EXPR, type,
+                              base_offset,
+                              fold_build2_loc (loc, MULT_EXPR, type,
+                                               diff, esz));
+    }
+  return NULL_TREE;
+}
+
 /* Fold a binary expression of code CODE and type TYPE with operands
OP0 and OP1.  LOC is the location of the resulting expression.
Return the folded expression if folding is successful.  Otherwise,
@@ -10582,19 +10623,11 @@ fold_binary_loc (location_t loc,
          && TREE_CODE (arg1) == ADDR_EXPR
          && TREE_CODE (TREE_OPERAND (arg1, 0)) == ARRAY_REF)
        {
-         tree aref0 = TREE_OPERAND (arg0, 0);
-         tree aref1 = TREE_OPERAND (arg1, 0);
-         if (operand_equal_p (TREE_OPERAND (aref0, 0),
-                              TREE_OPERAND (aref1, 0), 0))
-           {
-             tree op0 = fold_convert_loc (loc, type, TREE_OPERAND (aref0, 1));
-             tree op1 = fold_convert_loc (loc, type, TREE_OPERAND (aref1, 1));
-             tree esz = array_ref_element_size (aref0);
-             tree diff = build2 (MINUS_EXPR, type, op0, op1);
-             return fold_build2_loc (loc, MULT_EXPR, type, diff,
-                                     fold_convert_loc (loc, type, esz));
-
-           }
+         tree tem = fold_addr_of_array_ref_difference (loc, type,
+                                                       TREE_OPERAND (arg0, 0),
+                                                       TREE_OPERAND (arg1, 0));
+         if (tem)
+           return tem;
        }
 
   if (FLOAT_TYPE_P (type)
Index: gcc/testsuite/gcc.dg/pr52355.c
===
--- gcc/testsuite/gcc.dg/pr52355.c  (revision 0)
+++ gcc/testsuite/gcc.dg/pr52355.c  (revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+
+void f(char a[16][16][16])
+{
+  asm volatile ("" : : "i" (&a[1][0][0] - &a[0][0][0]));
+}
+
+int main(void)
+{
+  char a[16][16][16];
+  f(a);
+  return 0;
+}


[PATCH] Some more speedups for PR52361

2012-02-24 Thread Richard Guenther

I noticed that checking time is dominated by walk_gimple_op and
walk_tree, not so much by the core worker.  walk_gimple_op can
be micro-optimized a bit and the simple predicate is_gimple_reg_type
can be trivially inlined.

Bootstrap & testing in progress.

That was the low-hanging fruit.

Richard.

2012-02-24  Richard Guenther  rguent...@suse.de

PR middle-end/52361
* gimple.c (walk_gimple_op): Use predicates with less redundant
tests.
(is_gimple_reg_type): Move inline ...
* gimple.h (is_gimple_reg_type): ... here.

Index: gcc/gimple.c
===
*** gcc/gimple.c(revision 184551)
--- gcc/gimple.c(working copy)
*** walk_gimple_op (gimple stmt, walk_tree_f
*** 1481,1487 
            tree lhs = gimple_assign_lhs (stmt);
            wi->val_only
              = (is_gimple_reg_type (TREE_TYPE (lhs)) && !is_gimple_reg (lhs))
!               || !gimple_assign_single_p (stmt);
          }
  
        for (i = 1; i < gimple_num_ops (stmt); i++)
--- 1481,1487 
            tree lhs = gimple_assign_lhs (stmt);
            wi->val_only
              = (is_gimple_reg_type (TREE_TYPE (lhs)) && !is_gimple_reg (lhs))
!               || gimple_assign_rhs_class (stmt) != GIMPLE_SINGLE_RHS;
          }
  
        for (i = 1; i < gimple_num_ops (stmt); i++)
*** walk_gimple_op (gimple stmt, walk_tree_f
*** 1497,1507 
        if (wi)
          {
            /* If the RHS has more than 1 operand, it is not appropriate
!              for the memory.  */
!           wi->val_only = !(is_gimple_mem_rhs (gimple_assign_rhs1 (stmt))
!                            || TREE_CODE (gimple_assign_rhs1 (stmt))
!                               == CONSTRUCTOR)
!                          || !gimple_assign_single_p (stmt);
            wi->is_lhs = true;
          }
  
--- 1497,1510 
        if (wi)
          {
            /* If the RHS has more than 1 operand, it is not appropriate
!              for the memory.
!              ???  A lhs always requires an lvalue, checking the val_only flag
!              does not make any sense, so we should be able to avoid computing
!              it here.  */
!           tree rhs1 = gimple_assign_rhs1 (stmt);
!           wi->val_only = !(is_gimple_mem_rhs (rhs1)
!                            || TREE_CODE (rhs1) == CONSTRUCTOR)
!                          || gimple_assign_rhs_class (stmt) != GIMPLE_SINGLE_RHS;
            wi->is_lhs = true;
          }
  
*** is_gimple_id (tree t)
*** 2908,2921 
  || TREE_CODE (t) == STRING_CST);
  }
  
- /* Return true if TYPE is a suitable type for a scalar register variable.  */
- 
- bool
- is_gimple_reg_type (tree type)
- {
-   return !AGGREGATE_TYPE_P (type);
- }
- 
  /* Return true if T is a non-aggregate register variable.  */
  
  bool
--- 2911,2916 
Index: gcc/gimple.h
===
*** gcc/gimple.h(revision 184551)
--- gcc/gimple.h(working copy)
*** tree gimple_extract_devirt_binfo_from_cs
*** 963,970 
  /* Returns true iff T is a valid GIMPLE statement.  */
  extern bool is_gimple_stmt (tree);
  
- /* Returns true iff TYPE is a valid type for a scalar register variable.  */
- extern bool is_gimple_reg_type (tree);
  /* Returns true iff T is a scalar register variable.  */
  extern bool is_gimple_reg (tree);
  /* Returns true iff T is any sort of variable.  */
--- 963,968 
*** gimple_expr_type (const_gimple stmt)
*** 4838,4843 
--- 4836,4848 
  return void_type_node;
  }
  
+ /* Return true if TYPE is a suitable type for a scalar register variable.  */
+ 
+ static inline bool
+ is_gimple_reg_type (tree type)
+ {
+   return !AGGREGATE_TYPE_P (type);
+ }
  
  /* Return a new iterator pointing to GIMPLE_SEQ's first statement.  */
  


Re: [PATCH] Adjust 'malloc' attribute documentation to match implementation

2012-02-24 Thread Richard Guenther
On Tue, Feb 21, 2012 at 4:02 PM, Tijl Coosemans t...@coosemans.org wrote:
 On Tuesday 21 February 2012 10:19:15 Richard Guenther wrote:
 On Mon, Feb 20, 2012 at 8:55 PM, Tijl Coosemans t...@coosemans.org wrote:
 On Monday 9 January 2012 10:05:08 Richard Guenther wrote:
 Since GCC 4.4 applying the malloc attribute to realloc-like
 functions does not work under the documented constraints because
 the contents of the memory pointed to are not properly transferred
 from the realloc argument (or treated as pointing to anything,
 as 4.3 behaved).

 The following adjusts documentation to reflect implementation
 reality (we do have an implementation detail that treats the
 memory blob returned for non-builtins as pointing to any global
 variable, but that is neither documented nor do I plan to do
 so - I presume it is to allow allocation + initialization
 routines to be marked with malloc, but even that area looks
 susceptible to misinterpretation to me).

 Any comments?

 The new text says the memory must be undefined, but gives calloc as an
 example for which the memory is defined to be zero. Also, GCC has
 built-ins for strdup and strndup with the malloc attribute and GLIBC
 further adds it to wcsdup (wchar_t version of strdup) and tempnam. In
 all of these cases the memory is defined.

 Isn't the reason the attribute doesn't apply to realloc simply because
 the returned pointer may alias the one given as argument, rather than
 having defined memory content?

 The question is really what the alias-analysis code can derive from a
 function that is declared with the malloc attribute.  The most useful
 property for alias analysis would be that the non-aliasing holds
 transitively, thus reading (with any level of indirection) from the returned
 pointer does not produce memory that is aliased by any other pointer.
 That's what happens for 'malloc' (also for 'calloc' - you can't do any
 further indirections through the NULL pointers the memory holds).  It
 does not happen for realloc.  Currently the alias-analysis code does
 assume exactly this property (only very slightly weakened, possibly
 because we broke some code I guess).

 Internally, all builtins with interesting allocation properties are handled
 explicitly, so we probably should not rely on the malloc attribute present
 on those (and maybe simply drop it there).

 The question is really what is useful for users, and what's the most natural
 behavior?  For example

 int **my_initialized_malloc (int *p)
 {
   int **q = malloc (sizeof (int *));
   *q = p;
   return q;
 }

 would not qualify for the 'malloc' attribute (but we've taken measures to not
 miscompile this kind of code, as it seems to be a very common misconception
 to annotate these with 'malloc').

 I'm not sure how to exactly constrain the documentation for 'malloc' better.
 Maybe

 The @code{malloc} attribute is used to tell the compiler that a function
 may be treated as if any non-@code{NULL} pointer it returns cannot
 alias any other pointer valid when the function returns and that the memory
 does not contain any pointer value.

 ?  Because that is what is relevant.  That you can in no way extract
 a pointer value from the memory pointed to by the return value.  Because
 alias analysis will assume any such extracted pointer value points
 nowhere (so, extracting a NULL pointer is ok).

 The reasoning why the string functions have the malloc attribute was
 probably that strings do not contain pointer values.  Of course they
 can, you can store a character encoding of a pointer, copy the
 string and decode it from the copy again.  We'd miscompile then

  int i = 1;
  int *p = &i;
  char ptr[16];
  ... inline encode p into ptr ...
  char *x = strdup (ptr);
  int *q = ... inline decode x to q
  *q = 2;
  return i;

 to return 1 because we do not see that q may point to i.  Of course
 we properly handle the transfer of pointers for str[n]dup, so the
 'malloc' attribute on it is a lie...
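To make the distinction above concrete, here is a small sketch of one allocator that fits the proposed wording and one that does not. The function names are invented for this note; the attribute syntax is the usual GCC one. The first function returns fresh memory holding no pointer values, the second stores a caller-supplied pointer into the new object and is therefore deliberately left unannotated.

```c
#include <stdlib.h>

/* Fits the wording discussed above: the returned block is fresh and
   zero-filled, so no pointer value can be extracted from it.  */
static int *alloc_zeroed_ints (size_t n) __attribute__ ((malloc));

static int *
alloc_zeroed_ints (size_t n)
{
  return calloc (n, sizeof (int));
}

/* Does not fit: the new object holds a copy of P, so a pointer to
   existing memory can be read back through the return value.
   Intentionally declared without the attribute.  */
static int **
box_pointer (int *p)
{
  int **q = malloc (sizeof *q);
  if (q)
    *q = p;
  return q;
}
```

A caller that writes through `*box_pointer (&x)` modifies `x`, which is exactly the kind of access the attribute would allow the compiler to assume away.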

 Thanks, that was very informative.

 Is it correct to say that the attribute applies to deep copies, but not to
 shallow ones?

No, see below


 How about the following text:

 @item malloc
 @cindex @code{malloc} attribute
 The @code{malloc} attribute is used to tell the compiler that a pointer
 returned by a function is either @code{NULL} or points to a newly
 allocated object and that any pointer within that object is either
 uninitialised, @code{NULL} or pointing to a newly allocated object for
 which the same conditions hold recursively.

The '.. or pointing to a newly allocated object for which the same
conditions hold recursively' is not what is implemented.  What is
implemented is '.. or pointing to global memory', but I don't really
want to document this as this implementation detail may change
(and what is considered 'global memory' would deserve its own
complicated description).

  The compiler assumes that
 existing variables and memory cannot be accessed through the returned
 pointer which will 

Re: [PR51752] publication safety violations in loop invariant motion pass

2012-02-24 Thread Torvald Riegel
On Fri, 2012-02-24 at 09:58 +0100, Richard Guenther wrote:
 On Thu, Feb 23, 2012 at 10:11 PM, Aldy Hernandez al...@redhat.com wrote:
  On 02/23/12 12:19, Aldy Hernandez wrote:
 
  about hit me. Instead now I save all loads in a function and iterate
  through them in a brute force way. I'd like to rewrite this into a hash
  of some sort, but before I go any further I'm interested to know if the
  main idea is ok.
 
 
  For the record, it may be ideal to reuse some of the iterations we already
  do over the function's basic blocks, so we don't have to iterate yet again
  over the IL stream.  Though it may complicate the pass unnecessarily.
 
 It's definitely ugly that you need to do this.

Indeed.  But that's simply the price we have to pay for making
publication with transactions easier for the programmer yet still at
acceptable performance.

Also note that what we primarily have to care about for this PR is to
not hoist loads _within_ transactions if they would violate publication
safety.  I didn't have time to look at Aldy's patch yet, but a first
safe and conservative way would be to treat transactions as full
transformation barriers, and prevent publication-safety-violating
transformations _within_ transactions.  Which I would prefer until we're
confident that we understood all of it.

For hoisting out of or across transactions, we have to reason about more
than just publication safety.

 And yes, you have to look at
 very many passes I guess, PRE comes to my mind at least.
 
 I still do not understand the situation very well, at least why for
 
  transaction { for () if (b) load; } load;
 
 it should be safe to hoist load out of the transaction

This one is funny.  *Iff* this is an atomic txn, we can assume that the
transaction does not wait for a concurrent event. If it is a relaxed
txn, then we must not have a loop which terminates depending on an
atomic load (which we don't in that example); otherwise, we cannot hoist
load to before the txn (same reason as why we must not hoist to before
an atomic load with memory_order_acquire).
Now, if these things hold, then the load will always be executed after
the txn.  Thus, we can assume that it will happen anyway and nothing can
stop it (irrespective of which position the txn gets in the
Transactional Synchronization Order (see the C++ TM spec)).  It is a
nonatomic load, and we can rely on the program being data-race-free, so
we cannot have other threads storing to the same location, so we can
hoist it across because in that part of the program, the location is
guaranteed to be thread-local data (or immutable).

As I said, this isn't related to just pub safety anymore.  And this is
tricky enough that I'd rather not try to optimize this currently but
instead wait until we have more confidence in our understanding of the
matter.

 while for
 
  load; transaction { for () if (b) load; }
 
 it is not.

If the same assumptions as above hold, I think you can hoist it out,
because again you can assume that it targets thread-local/immutable
data: the nontransactional (+nonatomic!) load can happen at any time,
essentially, irrespective of b's value or how/when other threads modify
it.  Thus, it cannot have been changed between the two loads in a
data-race-free program.

 Neither do I understand why it's not ok for
 
  transaction { for () if (b) load; }
 
 to hoist the load out of the transaction.

You would violate publication safety.

Also, you don't have any reason to believe that load targets
thread-local/immutable data, so you must not make it nontransactional
(otherwise, you could introduce a race condition).

 
 I assume the whole issue is about hoisting things over the transaction start.
 So - why does the transaction start not simply clobber all global memory?
 That would avoid the hoisting.  I assume that transforming the above to
 
  transaction { tem = load; for () if (b) = tem; }
 
 is ok?

No, it is not.  Actually, this is precisely the transformation that we
need to prevent from happening.
As Aldy said, please see the explanations in the PR, or have a look at
the C++ TM specification and the C++11 memory model alternatively.  We
could also discuss this on IRC, if this would be easier.

Torvald



Re: [PATCH] Fix PR52298

2012-02-24 Thread Ulrich Weigand
Richard Guenther wrote:
 On Thu, 23 Feb 2012, Ulrich Weigand wrote:
  The assert in question looks like:
  
    if (nested_in_vect_loop
        && (TREE_INT_CST_LOW (STMT_VINFO_DR_STEP (stmt_info))
            % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
      {
        gcc_assert (alignment_support_scheme != dr_explicit_realign_optimized);
        compute_in_loop = true;
      }
  
  where your patch changed DR_STEP to STMT_VINFO_DR_STEP (reverting just this
  one change makes the ICEs go away).
  
  However, at the place where the decision to use the
  dr_explicit_realign_optimized strategy is made
  (tree-vect-data-refs.c:vect_supportable_dr_alignment), we still have:
  
    if ((nested_in_vect_loop
         && (TREE_INT_CST_LOW (DR_STEP (dr))
             != GET_MODE_SIZE (TYPE_MODE (vectype))))
        || !loop_vinfo)
      return dr_explicit_realign;
    else
      return dr_explicit_realign_optimized;
  
  Should this now also use STMT_VINFO_DR_STEP?
 
 Yes, I think so.

Hmmm.  Reading the comment in vect_supportable_dr_alignment:

 However, in the case of outer-loop vectorization, when vectorizing a
 memory access in the inner-loop nested within the LOOP that is now being
 vectorized, while it is guaranteed that the misalignment of the
 vectorized memory access will remain the same in different outer-loop
 iterations, it is *not* guaranteed that is will remain the same throughout
 the execution of the inner-loop.  This is because the inner-loop advances
 with the original scalar step (and not in steps of VS).  If the inner-loop
 step happens to be a multiple of VS, then the misalignment remains fixed
 and we can use the optimized realignment scheme. 

it would appear that in this case, checking the inner-loop step is deliberate.

Given the comment in vectorizable_load:

  /* If the misalignment remains the same throughout the execution of the
 loop, we can create the init_addr and permutation mask at the loop
 preheader.  Otherwise, it needs to be created inside the loop.
 This can only occur when vectorizing memory accesses in the inner-loop
 nested within an outer-loop that is being vectorized.  */

this suggests to me that, since the check is intended to verify that the
misalignment remains the same throughout the execution of the loop,
we actually want to check the inner-loop step here as well, i.e. revert
this chunk of your patch ...
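
The arithmetic behind that argument can be checked with a small standalone sketch. It is plain C, not vectorizer code; misalignment_is_invariant and its parameters are hypothetical names, with vs standing for the vector size in bytes and step for the inner-loop step.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if the addresses base, base + step, base + 2*step, ... all have
   the same misalignment relative to the vector size VS over ITERS
   iterations, and 0 otherwise.  */
static int
misalignment_is_invariant (uintptr_t base, uintptr_t step, uintptr_t vs,
                           unsigned iters)
{
  assert (vs != 0);
  uintptr_t first = base % vs;
  for (unsigned i = 1; i < iters; i++)
    if ((base + i * step) % vs != first)
      return 0;
  return 1;
}
```

With vs = 16, a step of 16 or 32 keeps the misalignment fixed, while a step of 4 (the scalar element size, say) changes it on every iteration, so the realignment data would have to be recomputed inside the loop.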

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com



Re: [PATCH, i386, Android] -mandroid support for i386 target

2012-02-24 Thread Ilya Enkovich
 On Wed, Feb 22, 2012 at 6:54 AM, Ilya Enkovich enkovich@gmail.com wrote:
 Hello,

 This patch adds -mandroid support to i386 target. OK for trunk?

 Thanks,
 Ilya
 --

 2012-02-22  Enkovich Ilya  ilya.enkov...@intel.com

        * config/i386/gnu-user.h (LINUX_TARGET_CC1_SPEC): New.

 I don't think you should define LINUX_* in gnu-user.h.

        (CC1_SPEC): Use LINUX_OR_ANDROID_CC.
        (CC1PLUS_SPEC): Likewise.
        (LINUX_TARGET_LINK_SPEC): New.
        (LINK_SPEC): Support LINUX_OR_ANDROID_LD.
        (LIB_SPEC): New.
        (STARTFILE_SPEC): New.
        (LINUX_TARGET_ENDFILE_SPEC): New.
        (ENDFILE_SPEC): Support LINUX_OR_ANDROID_LD.

 There is feedback at

 http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01283.html

 to my earlier patch to define GNU_USER_TARGET_* in gnu-user.h
 and use them in linux.h.


Thanks for the link. I fixed the patch according to this feedback.

        * config/linux-android.h (ANDROID_STARTFILE_SPEC): Use
        crtbegin_so%O%s for -shared.
        (ANDROID_ENDFILE_SPEC): Use crtend_so%O%s for -shared.



 I think you should separate this part similar to

 http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01109.html

I removed this part from the patch.



 --
 H.J.

Here is a new patch version. Does it look better?

Thanks,
Ilya
--

2012-02-24  Enkovich Ilya  ilya.enkov...@intel.com

* gcc/config/i386/gnu-user.h (CC1_SPEC): Rename to ...
(GNU_USER_TARGET_CC1_SPEC): ... this.
(LINK_SPEC): Rename to ...
(GNU_USER_TARGET_LINK_SPEC): ... this.
(ENDFILE_SPEC): Delete.
(GNU_USER_TARGET_MATHFILE_SPEC): New.

* gcc/config/i386/linux.h (CC1_SPEC): New.
(LINK_SPEC): New.
(LIB_SPEC): New.
(STARTFILE_SPEC): New.
(ENDFILE_SPEC): New.


diff --git a/gcc/config/i386/gnu-user.h b/gcc/config/i386/gnu-user.h
index 98d0a25..59d7062 100644
--- a/gcc/config/i386/gnu-user.h
+++ b/gcc/config/i386/gnu-user.h
@@ -77,8 +77,8 @@ along with GCC; see the file COPYING3.  If not see
 #undef CPP_SPEC
 #define CPP_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"

-#undef CC1_SPEC
-#define CC1_SPEC "%(cc1_cpu) %{profile:-p}"
+#undef GNU_USER_TARGET_CC1_SPEC
+#define GNU_USER_TARGET_CC1_SPEC "%(cc1_cpu) %{profile:-p}"

 /* Provide a LINK_SPEC appropriate for GNU userspace.  Here we provide support
for the special GCC options -static and -shared, which allow us to
@@ -97,8 +97,8 @@ along with GCC; see the file COPYING3.  If not see
   { "link_emulation", GNU_USER_LINK_EMULATION },\
   { "dynamic_linker", GNU_USER_DYNAMIC_LINKER }

-#undef LINK_SPEC
-#define LINK_SPEC "-m %(link_emulation) %{shared:-shared} \
+#define GNU_USER_TARGET_LINK_SPEC \
+  "-m %(link_emulation) %{shared:-shared} \
   %{!shared: \
 %{!static: \
   %{rdynamic:-export-dynamic} \
@@ -106,13 +106,11 @@ along with GCC; see the file COPYING3.  If not see
   %{static:-static}}"

 /* Similar to standard GNU userspace, but adding -ffast-math support.  */
-#undef  ENDFILE_SPEC
-#define ENDFILE_SPEC \
+#define GNU_USER_TARGET_MATHFILE_SPEC \
   "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
 %{mpc32:crtprec32.o%s} \
 %{mpc64:crtprec64.o%s} \
-   %{mpc80:crtprec80.o%s} \
-   %{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s"
+   %{mpc80:crtprec80.o%s}"

 /* A C statement (sans semicolon) to output to the stdio stream
FILE the assembler definition of uninitialized global DECL named
diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h
index 73681fe..a832ddc 100644
--- a/gcc/config/i386/linux.h
+++ b/gcc/config/i386/linux.h
@@ -22,3 +22,30 @@ along with GCC; see the file COPYING3.  If not see

 #define GNU_USER_LINK_EMULATION "elf_i386"
 #define GLIBC_DYNAMIC_LINKER "/lib/ld-linux.so.2"
+
+#undef CC1_SPEC
+#define CC1_SPEC \
+  LINUX_OR_ANDROID_CC (GNU_USER_TARGET_CC1_SPEC, \
+  GNU_USER_TARGET_CC1_SPEC " " ANDROID_CC1_SPEC)
+
+#undef LINK_SPEC
+#define LINK_SPEC \
+  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_LINK_SPEC, \
+  GNU_USER_TARGET_LINK_SPEC " " ANDROID_LINK_SPEC)
+
+#undef  LIB_SPEC
+#define LIB_SPEC \
+  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_LIB_SPEC, \
+  GNU_USER_TARGET_LIB_SPEC " " ANDROID_LIB_SPEC)
+
+#undef  STARTFILE_SPEC
+#define STARTFILE_SPEC \
+  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_STARTFILE_SPEC, \
+  ANDROID_STARTFILE_SPEC)
+
+#undef  ENDFILE_SPEC
+#define ENDFILE_SPEC \
+  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_MATHFILE_SPEC " " \
+  GNU_USER_TARGET_ENDFILE_SPEC, \
+  GNU_USER_TARGET_MATHFILE_SPEC " " \
+  ANDROID_ENDFILE_SPEC)


Re: [PATCH, i386, Android] Enable __ANDROID__ macro for Android i386 target

2012-02-24 Thread H.J. Lu
On Fri, Feb 24, 2012 at 1:51 AM, Ilya Enkovich enkovich@gmail.com wrote:
 On Wed, Feb 22, 2012 at 6:59 AM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 Hello,

 Here is a one-line fix to enable __ANDROID__ macro on i386 Android
 target. OK for trunk?

 Thanks,
 Ilya
 --

 2012-02-22  Enkovich Ilya  ilya.enkov...@intel.com

        * gcc/config/i386/gnu-user.h (TARGET_OS_CPP_BUILTINS): Add
        ANDROID_TARGET_OS_CPP_BUILTINS.


 diff --git a/gcc/config/i386/gnu-user.h b/gcc/config/i386/gnu-user.h
 index 98d0a25..d317229 100644
 --- a/gcc/config/i386/gnu-user.h
 +++ b/gcc/config/i386/gnu-user.h
 @@ -71,6 +71,7 @@ along with GCC; see the file COPYING3.  If not see
   do                                           \
     {                                          \
        GNU_USER_TARGET_OS_CPP_BUILTINS();      \
 +       ANDROID_TARGET_OS_CPP_BUILTINS();       \
     }                                          \
   while (0)

 I think this should be done in linux.h, not gnu-user.h.

 I am fixing a macro that is defined in gnu-user.h. How do you suggest I
 do that in linux.h?


Undef TARGET_OS_CPP_BUILTINS and define TARGET_OS_CPP_BUILTINS
in linux.h with GNU_USER_TARGET_OS_CPP_BUILTINS and
ANDROID_TARGET_OS_CPP_BUILTINS.
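
A minimal sketch of that undef-and-redefine pattern, written so it compiles outside of GCC. The three macro names are the ones from this thread; builtin_define, target_android, the recorded macro names and count_builtins are hypothetical stand-ins, and the macro bodies are placeholders rather than the real definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins so the sketch is self-contained.  */
static const char *defined_macros[8];
static int n_defined;
static int target_android;   /* pretend state of an Android target */

static void
builtin_define (const char *name)
{
  assert (n_defined < 8);
  defined_macros[n_defined++] = name;
}

/* Placeholder bodies; the real macros do more than this.  */
#define GNU_USER_TARGET_OS_CPP_BUILTINS() \
  do { builtin_define ("__gnu_linux__"); } while (0)

#define ANDROID_TARGET_OS_CPP_BUILTINS() \
  do { if (target_android) builtin_define ("__ANDROID__"); } while (0)

/* gnu-user.h side: shared by all GNU userspace targets, left untouched.  */
#define TARGET_OS_CPP_BUILTINS() \
  do { GNU_USER_TARGET_OS_CPP_BUILTINS (); } while (0)

/* linux.h side: override only for Linux, adding the Android builtins.  */
#undef TARGET_OS_CPP_BUILTINS
#define TARGET_OS_CPP_BUILTINS() \
  do \
    { \
      GNU_USER_TARGET_OS_CPP_BUILTINS (); \
      ANDROID_TARGET_OS_CPP_BUILTINS (); \
    } \
  while (0)

/* Run the (overridden) hook and report how many macros it defined.  */
static int
count_builtins (int android)
{
  n_defined = 0;
  target_android = android;
  TARGET_OS_CPP_BUILTINS ();
  return n_defined;
}
```

The point of the layout is that other users of gnu-user.h never see the Android hook; only the header that comes later in the include order changes the behaviour.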


-- 
H.J.


Re: [PATCH, i386, Android] -mandroid support for i386 target

2012-02-24 Thread H.J. Lu
On Fri, Feb 24, 2012 at 7:17 AM, Ilya Enkovich enkovich@gmail.com wrote:
 On Wed, Feb 22, 2012 at 6:54 AM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 Hello,

 This patch adds -mandroid support to i386 target. OK for trunk?

 Thanks,
 Ilya
 --

 2012-02-22  Enkovich Ilya  ilya.enkov...@intel.com

        * config/i386/gnu-user.h (LINUX_TARGET_CC1_SPEC): New.

 I don't think you should define LINUX_* in gnu-user.h.

        (CC1_SPEC): Use LINUX_OR_ANDROID_CC.
        (CC1PLUS_SPEC): Likewise.
        (LINUX_TARGET_LINK_SPEC): New.
        (LINK_SPEC): Support LINUX_OR_ANDROID_LD.
        (LIB_SPEC): New.
        (STARTFILE_SPEC): New.
        (LINUX_TARGET_ENDFILE_SPEC): New.
        (ENDFILE_SPEC): Support LINUX_OR_ANDROID_LD.

 There is feedback at

 http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01283.html

 to my earlier patch to define GNU_USER_TARGET_* in gnu-user.h
 and use them in linux.h.


 Thanks for the link. I fixed patch according to this feedback.

        * config/linux-android.h (ANDROID_STARTFILE_SPEC): Use
        crtbegin_so%O%s for -shared.
        (ANDROID_ENDFILE_SPEC): Use crtend_so%O%s for -shared.



 I think you should separate this part similar to

 http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01109.html

 I removed this part from the patch.



 --
 H.J.

 Here is a new patch version. Does it look better?

 Thanks,
 Ilya
 --

 2012-02-24  Enkovich Ilya  ilya.enkov...@intel.com

        * gcc/config/i386/gnu-user.h (CC1_SPEC): Rename to ...
        (GNU_USER_TARGET_CC1_SPEC): ... this.
        (LINK_SPEC): Rename to ...
        (GNU_USER_TARGET_LINK_SPEC): ... this.
        (ENDFILE_SPEC): Delete.
        (GNU_USER_TARGET_MATHFILE_SPEC): New.


You should keep those *_SPEC and define them with new
GNU_*_SPEC in gnu-user.h since gnu-user.h is also used
by other non-linux targets.  In linux.h, you undef *_SPEC
before defining them.
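
A sketch of the layering suggested here, reduced to CC1_SPEC and compilable on its own. GNU_USER_TARGET_CC1_SPEC and its value come from the patch above; LINUX_OR_ANDROID_CC_SKETCH and ANDROID_CC1_SPEC_SKETCH are simplified, hypothetical placeholders and not the definitions from linux-android.h, and the helper functions exist only to capture each macro value at its point of definition.

```c
#include <assert.h>
#include <string.h>

/* gnu-user.h side: provide the new building block and keep CC1_SPEC
   defined, because non-Linux targets include this header too.  */
#define GNU_USER_TARGET_CC1_SPEC "%(cc1_cpu) %{profile:-p}"
#undef CC1_SPEC
#define CC1_SPEC GNU_USER_TARGET_CC1_SPEC

/* What a target that stops at gnu-user.h would see.  */
static const char *gnu_user_cc1 (void) { return CC1_SPEC; }

/* Hypothetical, simplified placeholders for the Android pieces.  */
#define ANDROID_CC1_SPEC_SKETCH "-fPIC"
#define LINUX_OR_ANDROID_CC_SKETCH(LINUX_SPEC, ANDROID_SPEC) \
  "%{mandroid:" ANDROID_SPEC ";:" LINUX_SPEC "}"

/* linux.h side: undef first, then redefine on top of the building block.  */
#undef CC1_SPEC
#define CC1_SPEC \
  LINUX_OR_ANDROID_CC_SKETCH (GNU_USER_TARGET_CC1_SPEC, \
                              GNU_USER_TARGET_CC1_SPEC " " \
                              ANDROID_CC1_SPEC_SKETCH)

/* What a Linux target would see after the override.  */
static const char *linux_cc1 (void) { return CC1_SPEC; }

/* Small helper: does SPEC contain NEEDLE?  */
static int
spec_mentions (const char *spec, const char *needle)
{
  return strstr (spec, needle) != NULL;
}
```

Only the Linux view picks up the Android alternative; the default left behind in gnu-user.h is unchanged for everyone else.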


-- 
H.J.


[testsuite] Skip gcc.target/mips/interrupt_handler-[23].c on IRIX (PR target/50580)

2012-02-24 Thread Rainer Orth
Looking closer into the gcc.target/mips/interrupt_handler-[23].c
failures

FAIL: gcc.target/mips/interrupt_handler-2.c scan-assembler \\t.cfi_restore 
64\\n
FAIL: gcc.target/mips/interrupt_handler-2.c scan-assembler \\t.cfi_restore 
65\\n

and many more for interrupt_handler-3.c, I now understand what's going
on: .cfi* directives are only emitted if dwarf2cfi.c
(dwarf2out_do_cfi_asm) returns true, but if MIPS_DEBUGGING_INFO is
defined (as is the case in iris6.h), it returns false.  Thus, the tests
are meaningless since even the scan-assembler-not tests only succeed
because no .cfi* directives whatsoever are present in the assembler
output.  XFAILing them with dg-xfail-if doesn't work:

XPASS: gcc.target/mips/interrupt_handler-2.c (test for excess errors)
FAIL: gcc.target/mips/interrupt_handler-2.c scan-assembler \t\\.cfi_restore 64\n
FAIL: gcc.target/mips/interrupt_handler-2.c scan-assembler \t\\.cfi_restore 65\n

and adding xfail's to each and every dg-final seems pointless,
especially given that the scan-assembler-not tests only pass by
accident.  I therefore just skip the tests on IRIX.

Tested with the appropriate runtest invocation on mips-sgi-irix6.5,
installed on mainline.

Rainer


2012-02-24  Rainer Orth  r...@cebitec.uni-bielefeld.de

PR target/50580
* gcc.target/mips/interrupt_handler-2.c: Skip on mips-sgi-irix6*.
* gcc.target/mips/interrupt_handler-3.c: Likewise.

# HG changeset patch
# Parent 5002952adc32431a8c1084a301aa929640d70870
[testsuite] Skip gcc.target/mips/interrupt_handler-[23].c on IRIX (PR target/50580)

diff --git a/gcc/testsuite/gcc.target/mips/interrupt_handler-2.c b/gcc/testsuite/gcc.target/mips/interrupt_handler-2.c
--- a/gcc/testsuite/gcc.target/mips/interrupt_handler-2.c
+++ b/gcc/testsuite/gcc.target/mips/interrupt_handler-2.c
@@ -4,6 +4,7 @@
 /* { dg-final { scan-assembler "\t\\\.cfi_restore 65\n" } } */
 /* { dg-final { scan-assembler-not "\\\.cfi_def_cfa( |\t)" } } */
 /* { dg-final { scan-assembler-not "\\\.cfi_def_cfa_register( |\t)" } } */
+/* { dg-skip-if "PR target/50580" { mips-sgi-irix6* } } */
 
 extern void f (void);
 
diff --git a/gcc/testsuite/gcc.target/mips/interrupt_handler-3.c b/gcc/testsuite/gcc.target/mips/interrupt_handler-3.c
--- a/gcc/testsuite/gcc.target/mips/interrupt_handler-3.c
+++ b/gcc/testsuite/gcc.target/mips/interrupt_handler-3.c
@@ -23,6 +23,7 @@
 /* { dg-final { scan-assembler "\t\\\.cfi_def_cfa_offset 0\n" } } */
 /* { dg-final { scan-assembler-not "\\\.cfi_def_cfa( |\t)" } } */
 /* { dg-final { scan-assembler-not "\\\.cfi_def_cfa_register( |\t)" } } */
+/* { dg-skip-if "PR target/50580" { mips-sgi-irix6* } } */
 
 extern void f (void);
 


-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: PATCH: PR target/52364: The unnecessary second form in *movabsmode_[12]

2012-02-24 Thread Uros Bizjak
On Fri, Feb 24, 2012 at 5:58 AM, H.J. Lu hongjiu...@intel.com wrote:

 The second form is redundant in

 ;; Stores and loads of ax to arbitrary constant address.
 ;; We fake an second form of instruction to force reload to load address
 ;; into register when rax is not available
 (define_insn *movabsmode_1
  [(set (mem:SWI1248x (match_operand:DI 0 x86_64_movabs_operand i,r))
        (match_operand:SWI1248x 1 nonmemory_operand a,er))]
  TARGET_64BIT  ix86_check_movabs (insn, 0)
  @
   movabs{imodesuffix}\t{%1, %P0|%P0, %1}
   mov{imodesuffix}\t{%1, %a0|%a0, %1}
  [(set_attr type imov)
   (set_attr modrm 0,*)
   (set_attr length_address 8,0)
   (set_attr length_immediate 0,*)
   (set_attr memory store)
   (set_attr mode MODE)])

 since it is just normal movmode.  Tested on Linux/x86-64.  OK for stage1?

I am a bit scared by ... well ... the scary comment that mentions reload.
This second form predates IRA - are we sure that IRA is clever enough
not to break due to this change?

Jan, Vladimir, do you have any comments?

Uros.


Re: [patch] Fix cygwin ada install [was Re: Yet another issue with gcc current trunk with ada on cygwin]

2012-02-24 Thread Dave Korn
On 22/02/2012 16:25, Pascal Obry wrote:
 Dave,
 
   Pascal, ping?
 
 Sorry for the delay, this message fell through the cracks!

  No problem, I had plenty to be getting on with in the meantime :)

 Anyway, with these explanations I'm ok with the patch.

  Thanks.  Committed revision 184558.

cheers,
  DaveK



Re: [patch i386]: Add support of delegitimize of UNSPEC_PCREL plus displacement

2012-02-24 Thread Richard Henderson
On 02/23/12 09:12, Kai Tietz wrote:
 2012-02-23  Kai Tietz  kti...@redhat.com
 
   * config/i386/i386.c (ix86_delegitimize_address): Handle
   UNSPEC_PCREL plus displacement.

Ok.


r~


[PATCH] genattrtab: avoid NULL-deref on error

2012-02-24 Thread Jim Meyering
This fixes a coverity-spotted issue.
A NULL et could be dereferenced after the diagnostic is issued.


2012-02-24  Jim Meyering  meyer...@redhat.com

genattrtab: avoid NULL-deref on error
* genattrtab.c (gen_attr): Avoid NULL-deref after diagnosing
absence of a define_enum call.

diff --git a/gcc/genattrtab.c b/gcc/genattrtab.c
index 4a4c2a2..bfbe3e8 100644
--- a/gcc/genattrtab.c
+++ b/gcc/genattrtab.c
@@ -1,6 +1,6 @@
 /* Generate code from machine description to compute values of attributes.
Copyright (C) 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
-   2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+   2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2012
Free Software Foundation, Inc.
Contributed by Richard Kenner (ken...@vlsi1.ultra.nyu.edu)

@@ -2993,8 +2993,9 @@ gen_attr (rtx exp, int lineno)
           if (!et || !et->md_p)
             error_with_line (lineno, "No define_enum called `%s' defined",
                              attr->name);
-          for (ev = et->values; ev; ev = ev->next)
-            add_attr_value (attr, ev->name);
+          if (et)
+            for (ev = et->values; ev; ev = ev->next)
+              add_attr_value (attr, ev->name);
         }
       else if (*XSTR (exp, 1) == '\0')
         attr->is_numeric = 1;
--
1.7.9.2.263.g9be8b7


Re: [PR51752] publication safety violations in loop invariant motion pass

2012-02-24 Thread Aldy Hernandez

On 02/24/12 07:10, Torvald Riegel wrote:

On Fri, 2012-02-24 at 09:58 +0100, Richard Guenther wrote:

On Thu, Feb 23, 2012 at 10:11 PM, Aldy Hernandezal...@redhat.com  wrote:

On 02/23/12 12:19, Aldy Hernandez wrote:


about hit me. Instead now I save all loads in a function and iterate
through them in a brute force way. I'd like to rewrite this into a hash
of some sort, but before I go any further I'm interested to know if the
main idea is ok.



For the record, it may be ideal to reuse some of the iterations we already
do over the function's basic blocks, so we don't have to iterate yet again
over the IL stream.  Though it may complicate the pass unnecessarily.


It's definitely ugly that you need to do this.


Indeed.  But that's simply the price we have to pay for making
publication with transactions easier for the programmer yet still at
acceptable performance.

Also note that what we primarily have to care about for this PR is to
not hoist loads _within_ transactions if they would violate publication


For that matter, didn't rth add a memory barrier at the beginning of 
transactions last week?  That would mean that we can't hoist anything 
outside of a transaction anyhow.  Or was it not a full memory barrier?



safety.  I didn't have time to look at Aldy's patch yet, but a first
safe and conservative way would be to treat transactions as full
transformation barriers, and prevent publication-safety-violating
transformations _within_ transactions.  Which I would prefer until we're
confident that we understood all of it.


Do you mean disallow hoisting of *any* loads that happen inside of a 
transaction (regardless of whether a subsequent load happens on every 
path out of the loop)?  This would definitely be safe and quite easily 
doable, simply by checking if loads to be hoisted are within a transaction.




For hoisting out of or across transactions, we have to reason about more
than just publication safety.


Again, __transactions being barriers and all, I don't think we should 
complicate things unnecessarily at this point, since it doesn't happen.


Aldy


Re: PATCH: PR target/52364: The unnecessary second form in *movabsmode_[12]

2012-02-24 Thread H.J. Lu
On Fri, Feb 24, 2012 at 8:11 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Feb 24, 2012 at 5:58 AM, H.J. Lu hongjiu...@intel.com wrote:

 The second form is redundant in

 ;; Stores and loads of ax to arbitrary constant address.
 ;; We fake an second form of instruction to force reload to load address
 ;; into register when rax is not available
 (define_insn *movabsmode_1
  [(set (mem:SWI1248x (match_operand:DI 0 x86_64_movabs_operand i,r))
        (match_operand:SWI1248x 1 nonmemory_operand a,er))]
  TARGET_64BIT  ix86_check_movabs (insn, 0)
  @
   movabs{imodesuffix}\t{%1, %P0|%P0, %1}
   mov{imodesuffix}\t{%1, %a0|%a0, %1}
  [(set_attr type imov)
   (set_attr modrm 0,*)
   (set_attr length_address 8,0)
   (set_attr length_immediate 0,*)
   (set_attr memory store)
   (set_attr mode MODE)])

 since it is just normal movmode.  Tested on Linux/x86-64.  OK for stage1?

 I am a bit scared by ... well ... the scary comment that mentions reload.
 This second form predates IRA - are we sure that IRA is clever enough
 not to break due to this change?


I am afraid reload can't deal with it.  I withdrew this patch.

-- 
H.J.


[lra] patch for S390 bootstrap.

2012-02-24 Thread Vladimir Makarov
The following patch fixes some bugs preventing S390x bootstrap.  The 
patch does not yet fix the S390 bootstrap, but I am working on it.


The patch was successfully bootstrapped on x86/x86-64.

Committed as rev. 184561.

2012-02-24  Vladimir Makarov vmaka...@redhat.com

* lra-assigns.c (improve_inheritance): Add an argument.  Set it
up.  Don't change allocation of a reload pseudo.
(assign_by_spills): Pass parameter to improve_inheritance.

* lra-constraints.c (extract_loc_address_regs): Process memory as
a register.
(process_addr_reg): Reload memory.
(process_alt_operands): Use get_op_mode instead of GET_MODE.
(process_address): Use find_reg_note instead of find_regno_note.

Index: lra-assigns.c
===
--- lra-assigns.c	(revision 184177)
+++ lra-assigns.c	(working copy)
@@ -935,10 +935,10 @@ setup_live_pseudos_and_spill_after_risky
pseudos to the connected pseudos.  We need this because inheritance
pseudos are allocated after reload pseudos in the thread and when
we assign a hard register to a reload pseudo we don't know yet that
-   the connected inheritance pseudos can get the same hard
-   register.  */
+   the connected inheritance pseudos can get the same hard register.
+   Add pseudos with changed allocation to bitmap CHANGED_PSEUDOS.  */
 static void
-improve_inheritance (void)
+improve_inheritance (bitmap changed_pseudos)
 {
   unsigned int k;
   int regno, another_regno, hard_regno, another_hard_regno, cost, i, n;
@@ -983,6 +983,7 @@ improve_inheritance (void)
 		assign_hard_regno (hard_regno, another_regno);
 	  else
 		assign_hard_regno (another_hard_regno, another_regno);
+	  bitmap_set_bit (changed_pseudos, another_regno);
 	}
 	}
 }
@@ -1104,7 +1105,7 @@ assign_by_spills (void)
 	  }
   n = nfails;
 }
-  improve_inheritance ();
+  improve_inheritance (changed_pseudo_bitmap);
   bitmap_clear (changed_insns);
   /* We can not assign to inherited pseudos if any its inheritance
  pseudo did not get hard register because undo inheritance pass
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 184524)
+++ lra-constraints.c	(working copy)
@@ -295,7 +295,9 @@ get_reload_reg (enum op_type type, enum 
 
 /* The page contains code to extract memory address parts.  */
 
-/* Info about base and index regs of an address.  */
+/* Info about base and index regs of an address.  In some rare cases,
+   base/index register can be actually memory.  In this case we will
+   reload it.  */
 struct address
 {
   rtx *base_reg_loc;  /* NULL if there is no a base register.  */
@@ -519,6 +521,13 @@ extract_loc_address_regs (bool top_p, en
 SCRATCH, true, ad);
   break;
 
+  /* We process memory as a register.  That means we flatten
+	 addresses.  In other words, the final code will never
+	 contains memory in an address even if the target supports
+	 such addresses (it is too rare these days).  Memory also can
+	 occur in address as a result some previous transformations
+	 like equivalence substitution.  */
+case MEM:
 case REG:
   if (context_p)
 	ad->index_reg_loc = loc;
@@ -1228,6 +1237,26 @@ process_addr_reg (rtx *loc, rtx *before,
   enum machine_mode mode;
   bool change_p = false;
 
+  mode = GET_MODE (reg);
+  if (MEM_P (reg))
+{
+  /* Always reload memory in an address even if the target
+	 supports such addresses.  */
+  new_reg = lra_create_new_reg_with_unique_value (mode, reg, cl, "address");
+  push_to_sequence (*before);
+  lra_emit_move (new_reg, reg);
+  *before = get_insns ();
+  end_sequence ();
+  *loc = new_reg;
+  if (after != NULL)
+	{
+	  push_to_sequence (*after);
+	  lra_emit_move (reg, new_reg);
+	  *after = get_insns ();
+	  end_sequence ();
+	}
+  return true;
+}
   gcc_assert (REG_P (reg));
   final_regno = regno = REGNO (reg);
   if (regno < FIRST_PSEUDO_REGISTER)
@@ -1257,7 +1286,6 @@ process_addr_reg (rtx *loc, rtx *before,
 }
   if (*loc != reg || ! in_class_p (final_regno, cl, new_class))
 {
-  mode = GET_MODE (reg);
   reg = *loc;
   if (get_reload_reg (OP_IN, mode, reg, cl, "address", &new_reg))
 	{
@@ -1564,7 +1592,7 @@ process_alt_operands (int only_alternati
 	}
   
 	  op = no_subreg_operand[nop];
-	  mode = GET_MODE (*curr_id->operand_loc[nop]);
+	  mode = get_op_mode (nop);
 
 	  win = did_match = winreg = offmemok = constmemok = false;
 	  badop = true;
@@ -2474,8 +2502,8 @@ process_address (int nop, rtx *before, r
 {
   if (process_addr_reg (ad.base_reg_loc, before,
 			(ad.base_modify_p
-			 && find_regno_note (curr_insn, REG_DEAD,
-		 REGNO (*ad.base_reg_loc)) == NULL
+			 && find_reg_note (curr_insn, REG_DEAD,
+	   *ad.base_reg_loc) == NULL
 			 ? after : NULL),
 			base_reg_class (mode, as, 

[PATCH] simulate-thread tweaks and fixes

2012-02-24 Thread Andrew MacLeod
I've been toying with the simulate-thread framework a bit.  With the 
timeout threshold back to where it originally was, the spectre of 
huge log files when an infinite loop happens is back.


This patch has the following modifications:

1) An instruction count threshold has been added.
  This will truly prevent infinite loops, which were the original 
issue.  If the count reaches the threshold value, the test case fails.   
I've set the current limit to 10,000, which I would expect to be 
sufficient. The highest I've seen so far is 877, but I'm sure someone 
will find something higher :-).  A testcase can override this value if 
it wishes.  If we encounter targets where this is not high enough, we 
can raise it further.  The main point is to avoid generating tens of 
gigabytes of log files when a fast machine hits an infinite loop and the 
time-based timeout doesn't kick in quickly enough.
  If a test does fail for this reason, the log file will contain a FAIL 
message indicating the instruction count threshold was exceeded.


2) I tweaked the atomic-load-int128.c testcase to avoid an inadvertent 
hostile thread situation that was not intended.


3) The speculative-store.c testcase had a bug in it where the verify 
function was not returning a value when successful. This resulted in an 
UNSUPPORTED result occasionally because the testcase didn't run far 
enough to indicate GDB had successfully run, yet no FAIL was issued.


4) I lowered the hostile thread threshold by an order of magnitude so 
that if a hostile thread is encountered frequently, it won't rapidly 
approach the new instruction count threshold.  This could happen on 
compare_and_swap loop targets.
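
A rough sketch of how the instruction count limit from item 1 interacts with the driver loop, as standalone C rather than the actual simulate-thread.h code. INSN_COUNT_THRESHOLD_SKETCH, step_wrapper and run_steps are hypothetical names, and treating "reaches the threshold" as >= is an assumption.

```c
#include <assert.h>

/* Hypothetical limit, using the 10,000 figure mentioned above.  */
#define INSN_COUNT_THRESHOLD_SKETCH 10000

static unsigned insn_count;

/* Called once per single-stepped instruction.  Returns nonzero once the
   instruction count reaches the limit, so the driver loop can stop.  */
static int
step_wrapper (void)
{
  if (++insn_count >= INSN_COUNT_THRESHOLD_SKETCH)
    return 1;
  return 0;
}

/* Driver loop modeled loosely on the gdb script: keep stepping until the
   test finishes (STEPS_NEEDED steps) or any check reports failure.  */
static int
run_steps (unsigned steps_needed)
{
  int ret = 0;
  unsigned done = 0;
  insn_count = 0;
  while (done < steps_needed && !ret)
    {
      ret |= step_wrapper ();
      done++;
    }
  return ret;
}
```

A test needing 877 steps finishes normally, while a runaway loop is cut off after 10,000 steps instead of filling the log until the time-based timeout fires.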


I've tried this on x86_64-unknown-linux-gnu and everything seems good.   
If anyone wants to try it on their more stressed arch, all you need to 
do is apply the patch to the testsuite and run 'make check-gcc 
RUNTESTFLAGS=simulate-thread.exp' to see if the results are as expected.


OK for mainline?

Andrew





	* gcc.dg/simulate-thread/simulate-thread.gdb: Use return value from
	simulate_thread_wrapper_other_threads
	* gcc.dg/simulate-thread/atomic-load-int128.c (simulate_thread_main):
	Move initialization of 'value' to main().
	(main): Initialize 'value';
	* gcc.dg/simulate-thread/speculative-store.c
	(simulate_thread_step_verify): Return 0 when successful.
	* gcc.dg/simulate-thread/simulate-thread.h (HOSTILE_THREAD_THRESHOLD):
	Reduce threshold.
	(INSN_COUNT_THRESHOLD): New.  Instruction limit to terminate test.
	(simulate_thread_wrapper_other_threads): Return a success/fail value
	and issue an error if the instruction count threshold is exceeded.

Index: testsuite/gcc.dg/simulate-thread/simulate-thread.gdb
===
*** testsuite/gcc.dg/simulate-thread/simulate-thread.gdb	(revision 184447)
--- testsuite/gcc.dg/simulate-thread/simulate-thread.gdb	(working copy)
*** run
*** 5,11 
  
  set $ret = 0
  while (simulate_thread_fini != 1) && (! $ret)
!   call simulate_thread_wrapper_other_threads()
stepi
set $ret |= simulate_thread_step_verify()
  end
--- 5,11 
  
  set $ret = 0
  while (simulate_thread_fini != 1) && (! $ret)
!   set $ret |= simulate_thread_wrapper_other_threads()
stepi
set $ret |= simulate_thread_step_verify()
  end
Index: testsuite/gcc.dg/simulate-thread/atomic-load-int128.c
===
*** testsuite/gcc.dg/simulate-thread/atomic-load-int128.c	(revision 184447)
--- testsuite/gcc.dg/simulate-thread/atomic-load-int128.c	(working copy)
*** void simulate_thread_main()
*** 105,113 
  {
int x;
  
-   /* Make sure value starts with an atomic value now.  */
-   __atomic_store_n (&value, ret, __ATOMIC_SEQ_CST);
- 
/* Execute loads with value changing at various cyclic values.  */
for (table_cycle_size = 16; table_cycle_size > 4 ; table_cycle_size--)
  {
--- 105,110 
*** void simulate_thread_main()
*** 126,131 
--- 123,132 
  main()
  {
fill_table ();
+ 
+   /* Make sure value starts with an atomic value from the table.  */
+   __atomic_store_n (&value, table[0], __ATOMIC_SEQ_CST);
+ 
simulate_thread_main ();
simulate_thread_done ();
return 0;
Index: testsuite/gcc.dg/simulate-thread/speculative-store.c
===
*** testsuite/gcc.dg/simulate-thread/speculative-store.c	(revision 184447)
--- testsuite/gcc.dg/simulate-thread/speculative-store.c	(working copy)
*** int simulate_thread_step_verify()
*** 24,29 
--- 24,30 
printf("FAIL: global variable was assigned to.  \n");
return 1;
  }
+   return 0;
  }
  
  int simulate_thread_final_verify()
Index: testsuite/gcc.dg/simulate-thread/simulate-thread.h
===
*** testsuite/gcc.dg/simulate-thread/simulate-thread.h	(revision 

[libgo] Fix typo in libgo/runtime/go-nosys.c

2012-02-24 Thread Rainer Orth
Mainline bootstrap on x86_64-unknown-linux-gnu (CentOS 5.6) was failing
due to a trivial typo.  Fixed as follows.

Rainer


diff --git a/libgo/runtime/go-nosys.c b/libgo/runtime/go-nosys.c
--- a/libgo/runtime/go-nosys.c
+++ b/libgo/runtime/go-nosys.c
@@ -52,7 +52,7 @@ faccessat (int fd __attribute__ ((unused
 int
 fallocate (int fd __attribute__ ((unused)),
 	   int mode __attribute__ ((unused)),
-	   off_t offset __attribute __ ((unused)),
+	   off_t offset __attribute__ ((unused)),
 	   off_t len __attribute__ ((unused)))
 {
   errno = ENOSYS;

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] simulate-thread tweaks and fixes

2012-02-24 Thread Jack Howarth
On Fri, Feb 24, 2012 at 12:38:56PM -0500, Andrew MacLeod wrote:
 I've been toying with the simulate-thread framework a bit.  With the  
 timeout threshold is back to where it originally was, the spectre of  
 huge log files when an infinite loop happens is back.

 This patch has the following modifications:

 1) An instruction count threshold has been added.
   This will truly prevent infinite loops which were the original  
 issue.  If the count reaches the threshold value, the test case fails.
 I've set the current limit to 10,000, which I would expect to be  
 sufficient. The highest Ive seen so far is 877, but I'm sure someone  
 will find something higher :-).  A testcase can override this value if  
 it wishes.  If we encounter targets where this is not high enough, we  
 can raise it further.  The main point is to avoid generating 10's of  
 gigabytes of log files when a fast machine hits an infinite loop and the  
 time based timeout doesnt kick in quickly enough.
   If a test does fail for this reason, the log file will issue a FAIL  
 message indicating the instruction count threshold was exceeded.

 2) I tweaked the atomic-load-int128.c testcase to avoid an inadvertent  
 hostile thread situation that was not intended.

 3) The  speculative-store.c testcase had a bug in it where the verify  
 function was not returning a value when successful. ThIs resulted in an  
 UNSUPPORTED result occasionally because the testcase didn't run far  
 enough to indicate GDB had successfully run, yet no FAIL was issued.

 4) I lowered the hostile thread threshold by an order of magnitude so  
 that if a hostile thread is encountered frequently, it wont rapidly  
 approach the new instruction count threshold.  This could happen on  
 compare_and_swap loop targets.

 I've tried this on x86_64-unknown-linux-gnu and everything seems good.
 If anyone wants to try it on their more stressed arch, all you need to  
 do is apply the patch to the testsuite and run 'make check-gcc  
 RUNTESTFLAGS=simulate-thread.exp' to see if the results are as expected.

Andrew,
Works fine on x86_64 darwin when applied to current gcc trunk...

Native configuration is x86_64-apple-darwin11.3.0

=== gcc tests ===

Schedule of variations:
unix/-m32
unix/-m64

Running target unix/-m32
Using /sw/share/dejagnu/baseboards/unix.exp as board description file for 
target.
Using /sw/share/dejagnu/config/unix.exp as generic interface file for target.
Using 
/sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20120224/gcc/testsuite/config/default.exp
 as tool-and-target-specific interface file.
Running 
/sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20120224/gcc/testsuite/gcc.dg/simulate-thread/simulate-thread.exp
 ...

=== gcc Summary for unix/-m32 ===

# of expected passes		76
# of unsupported tests		8
Running target unix/-m64
Using /sw/share/dejagnu/baseboards/unix.exp as board description file for 
target.
Using /sw/share/dejagnu/config/unix.exp as generic interface file for target.
Using 
/sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20120224/gcc/testsuite/config/default.exp
 as tool-and-target-specific interface file.
Running 
/sw/src/fink.build/gcc47-4.7.0-1/gcc-4.7-20120224/gcc/testsuite/gcc.dg/simulate-thread/simulate-thread.exp
 ...

=== gcc Summary for unix/-m64 ===

# of expected passes    88
# of unsupported tests  2

=== gcc Summary ===

# of expected passes    164
# of unsupported tests  10
/sw/src/fink.build/gcc47-4.7.0-1/darwin_objdir/gcc/xgcc  version 4.7.0 20120224 
(experimental) (GCC


 OK for mainline?

 Andrew





 
   * gcc.dg/simulate-thread/simulate-thread.gdb: Use return value from
   simulate_thread_wrapper_other_threads
   * gcc.dg/simulate-thread/atomic-load-int128.c (simulate_thread_main):
   Move initialization of 'value' to main().
   (main): Initialize 'value'.
   * gcc.dg/simulate-thread/speculative-store.c
   (simulate_thread_step_verify): Return 0 when successful.
   * gcc.dg/simulate-thread/simulate-thread.h (HOSTILE_THREAD_THRESHOLD):
   Reduce threshold.
   (INSN_COUNT_THRESHOLD): New.  Instruction limit to terminate test.
   (simulate_thread_wrapper_other_threads): Return a success/fail value
   and issue an error if the instruction count threshold is exceeded.
 
 Index: testsuite/gcc.dg/simulate-thread/simulate-thread.gdb
 ===
 *** testsuite/gcc.dg/simulate-thread/simulate-thread.gdb  (revision 
 184447)
 --- testsuite/gcc.dg/simulate-thread/simulate-thread.gdb  (working copy)
 *** run
 *** 5,11 
   
   set $ret = 0
   while (simulate_thread_fini != 1) && (! $ret)
 !   call simulate_thread_wrapper_other_threads()
 stepi
 set $ret |= simulate_thread_step_verify()
   end
 --- 5,11 
   
   set $ret = 0
   while (simulate_thread_fini != 1) && (! $ret

Ping #1: [Patch,AVR] Fix/hack around spill fail ICE PR52148

2012-02-24 Thread Georg-Johann Lay
http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00956.html

Georg-Johann Lay wrote:
 Spill failure PR52148 occurs for a movmem insn that allocates 2 of AVR's 3
 pointer registers. The register allocator is at its limits, and the patch
 tries to cure the situation by replacing
 
 (match_operand:HI 0 "register_operand" "x")
 
 with explicit
 
 (reg:HI REG_X)
 
 and similarly for Z. Register classes "x" and "z" each contain only one HI
 register.
 
 This PR and PR50925 show that the register allocator has some problems.
 Even though this patch does not fix the root cause, it allows the PR's test
 case to compile.
 
 Anyway, the patch simplifies the backend and replaces an insn with 11(!)
 operands with an insn with only 2 operands, so the patch is an improvement
 to the backend.
 
 The hard registers are already known at expand time so there is no need for
 match_operands.
 
 Passes without regression.
 
 Ok for trunk?
 
 Johann
 
   PR target/52148
   * config/avr/avr.md (movmem_<mode>): Replace match_operand that
   match only one single hard register with respective hard reg rtx.
   (movmemx_<mode>): Ditto.
   * config/avr/avr.c (avr_emit_movmemhi): Adapt expanding to new
   insn anatomy of movmem[x]_<mode>.
   (avr_out_movmem): Same for printing assembler and operand usage.


Ping #1: [Patch,AVR]: Add builtins.def and fix some ICE, add tests

2012-02-24 Thread Georg-Johann Lay
http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00843.html

Georg-Johann Lay wrote:
 This patch introduces a new file builtins.def that is used as central registry
 to hold built-ins' information.
 
 The file is used by defining the DEF_BUILTIN macro and then including the file as
 described in the head comment of builtins.def.
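
The "define a macro, then include the registry" scheme is the classic X-macro pattern. The C sketch below shows the general shape only. To stay self-contained it uses a list macro instead of a separate .def file, and the entries, field names and every `DEMO_`/`demo_` identifier are hypothetical; the actual layout of builtins.def is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the contents of a .def file: one line per built-in.
   The three entries are invented for this example.  */
#define DEMO_BUILTINS(DEF_BUILTIN)            \
  DEF_BUILTIN (DEMO_NOP,  0, "__demo_nop")    \
  DEF_BUILTIN (DEMO_SWAP, 1, "__demo_swap")   \
  DEF_BUILTIN (DEMO_FMUL, 2, "__demo_fmul")

/* First expansion: generate the enum of IDs.  */
#define DEMO_AS_ENUM(ID, N_ARGS, NAME) ID,
enum demo_builtin_id
{
  DEMO_BUILTINS (DEMO_AS_ENUM)
  DEMO_BUILTIN_COUNT
};

struct demo_builtin_description
{
  enum demo_builtin_id id;
  int n_args;
  const char *name;
};

/* Second expansion: generate the description table from the same list,
   so the enum and the table cannot get out of sync.  */
#define DEMO_AS_DESC(ID, N_ARGS, NAME) { ID, N_ARGS, NAME },
static const struct demo_builtin_description demo_bdesc[] =
{
  DEMO_BUILTINS (DEMO_AS_DESC)
};

/* Look a built-in up by name; returns NULL if it is not registered.  */
const struct demo_builtin_description *demo_find_builtin (const char *name)
{
  assert (name != NULL);
  for (int i = 0; i < DEMO_BUILTIN_COUNT; i++)
    if (strcmp (demo_bdesc[i].name, name) == 0)
      return &demo_bdesc[i];
  return NULL;
}
```

Adding a built-in then means adding one line to the registry; the enum, the table and anything else generated from it follow automatically.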
 
 Up to here it's all code clean-up and no functional change.
 
 Moreover there are some minor changes and ICE fixes:
 
 * Fold __builtin_avr_swap to rotate  4
 * Don't fold __builtin_avr_insert_bits if first arg is non-const (was ICE)
 * Don't expand __builtin_avr_delay_cycles if arg is non-const (was ICE)
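
The first fold rests on a simple identity: swapping the two nibbles of an 8-bit value is the same as rotating it by 4 bits. A small C sketch of that identity follows, assuming the built-in swaps nibbles of an 8-bit operand; the `demo_` helpers are illustrative and are not part of the AVR backend.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the high and low nibble of an 8-bit value.  */
uint8_t demo_swap_nibbles (uint8_t x)
{
  return (uint8_t) (((x & 0x0Fu) << 4) | ((x & 0xF0u) >> 4));
}

/* Rotate an 8-bit value left by N bits, 0 <= N < 8.  */
uint8_t demo_rotl8 (uint8_t x, unsigned n)
{
  assert (n < 8);
  if (n == 0)
    return x;
  return (uint8_t) ((x << n) | (x >> (8 - n)));
}

/* Returns 1 if the two operations agree for all 256 inputs.  */
int demo_swap_matches_rotate (void)
{
  for (unsigned v = 0; v < 256; v++)
    if (demo_swap_nibbles ((uint8_t) v) != demo_rotl8 ((uint8_t) v, 4))
      return 0;
  return 1;
}
```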
 
 Ok for trunk?
 
 Johann
 
 gcc/testsuite/
   * gcc.target/avr/torture/builtins-1.c: New test.
   * gcc.target/avr/torture/builtins-error.c: New test.
 gcc/
   * config/avr/builtins.def: New file.
   * config/avr/t-avr (avr.o, avr-c.o): Depend on it.
   * config/avr/avr.c (enum avr_builtin_id): Use it.
   (avr_init_builtins): Use it. And use avr_bdesc.
   (bdesc_1arg): Remove.
   (bdesc_2arg): Remove.
   (bdesc_3arg): Remove.
   (struct avr_builtin_description): Add field n_args.
   (avr_bdesc): New static variable using builtins.def.
   (avr_expand_builtin): Use it.
   Don't call avr_expand_delay_cycles if op0 is not CONST_INT.
   (avr_fold_builtin): Fold AVR_BUILTIN_SWAP.
   Don't fold AVR_BUILTIN_INSERT_BITS if arg0 is not INTEGER_CST.




Re: [PATCH] simulate-thread tweaks and fixes

2012-02-24 Thread Mike Stump
On Feb 24, 2012, at 9:38 AM, Andrew MacLeod wrote:
 I've been toying with the simulate-thread framework a bit.

 OK for mainline?

Ok.

Don't know why you ask...  I'd ask you if I wanted to make a change to the 
file...  Anyway, I reviewed it, and didn't spot anything bad.


Re: [PATCH] simulate-thread tweaks and fixes

2012-02-24 Thread Andrew MacLeod

On 02/24/2012 02:10 PM, Mike Stump wrote:

On Feb 24, 2012, at 9:38 AM, Andrew MacLeod wrote:

I've been toying with the simulate-thread framework a bit.
OK for mainline?

Ok.

Don't know why you ask...  I'd ask you if I wanted to make a change to the 
file...  Anyway, I reviewed it, and didn't spot anything bad.

Just dotting my i's :-)   I figured I'd give anyone who had issues 
earlier with these tests a bit of a chance to make sure it works if they 
wanted :-)


Thanks.
Andrew


PATCH: PR target/52352: [x32] - Wrong code to access addresses 0x80000000 to 0xFFFFFFFF using registers

2012-02-24 Thread H.J. Lu
Hi,

This patch enables *movabs<mode>_1 and *movabs<mode>_2 only for
TARGET_LP64, since x32 doesn't need 64-bit addresses.  OK for trunk?
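
The address range in the subject, 0x80000000 to 0xFFFFFFFF, is exactly where a 32-bit value changes meaning depending on how it is widened to 64 bits. The C sketch below illustrates only that arithmetic. Whether the movabs patterns fail on x32 in precisely this way is my reading and is not stated in the message; the `demo_` functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 32-bit address by filling the upper half with zeros.  */
uint64_t demo_zero_extend (uint32_t addr)
{
  return (uint64_t) addr;
}

/* Widen a 32-bit address by copying bit 31 into the upper half.
   Written with explicit masking to avoid implementation-defined casts.  */
uint64_t demo_sign_extend (uint32_t addr)
{
  uint64_t wide = addr;
  if (addr & 0x80000000u)
    wide |= 0xFFFFFFFF00000000ull;
  return wide;
}

/* Returns 1 if both widenings give the same 64-bit address.  */
int demo_widenings_agree (uint32_t addr)
{
  assert (sizeof (uint64_t) == 8);
  return demo_zero_extend (addr) == demo_sign_extend (addr);
}
```

Below 0x80000000 the two widenings agree; from 0x80000000 upward they name different 64-bit addresses.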

Thanks.

H.J.
---
2012-02-24  H.J. Lu  hongjiu...@intel.com

PR target/52352
* config/i386/i386.md (*movabs<mode>_1): Enable only for
TARGET_LP64.
(*movabs<mode>_2): Likewise.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6e2c123..0291e60 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2362,7 +2362,7 @@
 (define_insn "*movabs<mode>_1"
   [(set (mem:SWI1248x (match_operand:DI 0 "x86_64_movabs_operand" "i,r"))
         (match_operand:SWI1248x 1 "nonmemory_operand" "a,er"))]
-  "TARGET_64BIT && ix86_check_movabs (insn, 0)"
+  "TARGET_LP64 && ix86_check_movabs (insn, 0)"
   "@
    movabs{<imodesuffix>}\t{%1, %P0|%P0, %1}
    mov{<imodesuffix>}\t{%1, %a0|%a0, %1}"
@@ -2376,7 +2376,7 @@
 (define_insn "*movabs<mode>_2"
   [(set (match_operand:SWI1248x 0 "register_operand" "=a,r")
         (mem:SWI1248x (match_operand:DI 1 "x86_64_movabs_operand" "i,r")))]
-  "TARGET_64BIT && ix86_check_movabs (insn, 1)"
+  "TARGET_LP64 && ix86_check_movabs (insn, 1)"
   "@
    movabs{<imodesuffix>}\t{%P1, %0|%0, %P1}
    mov{<imodesuffix>}\t{%a1, %0|%0, %a1}"


Re: Simulator testing for sh and sh64

2012-02-24 Thread Thomas Schwinge
Hi!

On Fri, 24 Feb 2012 00:08:00 +0900, Kaz Kojima kkoj...@rr.iij4u.or.jp wrote:
 Thomas Schwinge tho...@codesourcery.com wrote:
  /scratch/tschwing/FM_sh64-elf/src/gcc-mainline/libgcc/libgcc2.c: In 
  function '__powisf2':
  /scratch/tschwing/FM_sh64-elf/src/gcc-mainline/libgcc/libgcc2.c:1779:1: 
  error: unrecognizable insn:
  (insn 10 9 11 3 (set (reg:SI 162 [ D.2769 ])
  (abs:SI (reg/v:SI 168 [ m ]))) 
  /scratch/tschwing/FM_sh64-elf/src/gcc-mainline/libgcc/libgcc2.c:1770 -1
   (nil))
  /scratch/tschwing/FM_sh64-elf/src/gcc-mainline/libgcc/libgcc2.c:1779:1: 
  internal compiler error: in extract_insn, at recog.c:2123
 
 BTW, I have a patch below which restores sh64-elf build on trunk.
 The hunks for sh_dwarf_register_span and abssi2 are almost obvious.
 Those for sh_register_move_cost and CASE_USE_BIT_TESTS would be
 suspicious, though.
 
 Regards,
   kaz
 --
 diff -up ORIG/trunk/gcc/config/sh/sh.c trunk/gcc/config/sh/sh.c
 --- ORIG/trunk/gcc/config/sh/sh.c 2011-12-30 09:22:01.0 +0900
 +++ trunk/gcc/config/sh/sh.c  2012-02-23 21:23:44.0 +0900

Confirming that this patch makes GCC trunk buildable again.

Comparing to the 4.6 testsuite results, with trunk there are about 700
new execution failures in g++, gcc, libstdc++, about 100 ``compilation
failed to produce executable'' in g++, there is ``FAIL:
gcc.target/sh/pr21255-2-ml.c scan-assembler mov @\\(4,r.\\),r.; mov
@r.,r.'', ``FAIL: gcc.target/sh/pr49468-si.c scan-assembler-times neg
2'', several tests took a suspiciously long time to compile (with 0 % CPU
usage) so that I killed the GCC processes, and about 12 GCC ICEs.


Grüße,
 Thomas




Re: [PATCH] [RFC, GCC 4.8] Optimize conditional moves from adjacent memory locations

2012-02-24 Thread William J. Schmidt
On Fri, 2012-02-10 at 15:46 -0500, Michael Meissner wrote:
 I was looking at the routelookup EEMBC benchmark and it has code of the form:
 
   while ( this_node->cmpbit > next_node->cmpbit )
     {
       this_node = next_node;
 
       if ( proute->dst_addr & (0x1 << this_node->cmpbit) )
          next_node = this_node->rlink;
       else
          next_node = this_node->llink;
     }
 
 This is where you have a binary tree/trie and you are iterating going down
 either the right link or left link until you find a stopping condition.  The
 code in ifcvt.c does not handle optimizing these cases for conditional move
 since the load might trap, and generates code that does if-then-else with 
 loads
 and jumps.
 
 However, since the two elements are next to each other in memory, they are
 likely in the same cache line, particularly with aligned stacks and malloc
 returning aligned pointers.  Except in unusual circumstances where the pointer
 is not aligned, this means it is much faster to optimize it as:
 
   while ( this_node->cmpbit > next_node->cmpbit )
     {
       this_node = next_node;
 
       rtmp = this_node->rlink;
       ltmp = this_node->llink;
       if ( proute->dst_addr & (0x1 << this_node->cmpbit) )
          next_node = rtmp;
       else
          next_node = ltmp;
     }
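
For readers who want to try the two loop shapes side by side, here is a compilable C sketch. The node layout, the three-node trie and all `demo_` names are assumptions for illustration (the benchmark source is not reproduced here), and the loop condition is written with `>` and the bit test with `&` and `<<`. Both functions should return the same node for every input, since the second only performs both loads before choosing.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node
{
  const struct demo_node *llink;  /* adjacent pointer fields */
  const struct demo_node *rlink;
  unsigned cmpbit;
};

/* Tiny trie with upward/self links, so the walk always terminates.  */
static const struct demo_node demo_trie[3] = {
  { &demo_trie[1], &demo_trie[2], 3 },
  { &demo_trie[1], &demo_trie[0], 1 },
  { &demo_trie[2], &demo_trie[0], 2 },
};

/* Original shape: one load, chosen by a branch.  */
const struct demo_node *
demo_walk_branchy (const struct demo_node *this_node,
                   const struct demo_node *next_node, unsigned dst_addr)
{
  assert (this_node != NULL && next_node != NULL);
  while (this_node->cmpbit > next_node->cmpbit)
    {
      this_node = next_node;
      if (dst_addr & (0x1u << this_node->cmpbit))
        next_node = this_node->rlink;
      else
        next_node = this_node->llink;
    }
  return next_node;
}

/* Hoisted shape: both loads first, then a select a compiler may turn
   into a conditional move.  */
const struct demo_node *
demo_walk_hoisted (const struct demo_node *this_node,
                   const struct demo_node *next_node, unsigned dst_addr)
{
  assert (this_node != NULL && next_node != NULL);
  while (this_node->cmpbit > next_node->cmpbit)
    {
      this_node = next_node;
      const struct demo_node *rtmp = this_node->rlink;
      const struct demo_node *ltmp = this_node->llink;
      next_node = (dst_addr & (0x1u << this_node->cmpbit)) ? rtmp : ltmp;
    }
  return next_node;
}

/* Returns 1 if both walks agree for every 4-bit address and both starts.  */
int demo_walks_agree (void)
{
  for (unsigned dst = 0; dst < 16; dst++)
    for (int start = 1; start <= 2; start++)
      if (demo_walk_branchy (&demo_trie[0], &demo_trie[start], dst)
          != demo_walk_hoisted (&demo_trie[0], &demo_trie[start], dst))
        return 0;
  return 1;
}
```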
 
snip

Andrew and Richard both suggested this would be better handled as a tree
optimization.  Here is a proposed patch to do that.  

I've tried to be as conservative as possible in this first attempt, to
reduce any downside of speculative loads.  In particular, I'm only
hoisting aligned pointers that are adjacent and that will reside in the
same page.  The latter restriction is enforced with a parameter
(defaulting to 16 bytes) intended to represent the smaller of two
alignments:  the alignment imposed on the stack, and the alignment
guaranteed by heap allocation interfaces.  Since the latter is probably
impossible to obtain within GCC, I went with a parameter.
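
The restriction described above can be pictured as a check on field offsets. The C sketch below is only an approximation of that idea, not the test used in the patch: it assumes the containing object is aligned to `align` bytes and asks whether two adjacent fields fall into the same `align`-sized block. The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if fields A and B are adjacent and both lie inside one
   ALIGN-sized block, assuming the enclosing object starts on an ALIGN
   boundary.  Offsets and sizes are in bytes.  */
bool demo_same_aligned_block (size_t off_a, size_t size_a,
                              size_t off_b, size_t size_b, size_t align)
{
  assert (align != 0 && size_a != 0 && size_b != 0);

  size_t end_a = off_a + size_a;
  size_t end_b = off_b + size_b;
  size_t first = off_a < off_b ? off_a : off_b;
  size_t last = (end_a > end_b ? end_a : end_b) - 1;

  bool adjacent = (end_a == off_b) || (end_b == off_a);
  return adjacent && (first / align == last / align);
}
```

With 8-byte pointers and the 16-byte default, two link fields at offsets 0 and 8 qualify, while fields at offsets 8 and 16 straddle a block boundary and do not.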

This would again be targeted for 4.8.  I'd appreciate any comments on
the code.

Thanks,
Bill


2012-02-24  Bill Schmidt  wschm...@linux.vnet.ibm.com

* tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Add argument to forward
declaration.
(hoist_adjacent_loads, gate_hoist_loads): New forward declarations.
(tree_ssa_phiopt): Call gate_hoist_loads.
(tree_ssa_cs_elim): Add parm to tree_ssa_phiopt_worker call.
(tree_ssa_phiopt_worker): Add do_hoist_loads to formal arg list; call
hoist_adjacent_loads.
(local_reg_dependence): New function.
(local_mem_dependence): Likewise.
(hoist_adjacent_loads): Likewise.
(gate_hoist_loads): Likewise.
* common.opt (fhoist-adjacent-loads): New switch.
* Makefile.in (tree-ssa-phiopt.o): Added dependencies.
* params.def (PARAM_MIN_CMOVE_STRUCT_ALIGN): New param.


Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 184419)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -36,9 +36,17 @@ along with GCC; see the file COPYING3.  If not see
 #include "domwalk.h"
 #include "cfgloop.h"
 #include "tree-data-ref.h"
+#include "gimple-pretty-print.h"
+#include "insn-config.h"
+#include "expr.h"
+#include "optabs.h"
 
+#ifndef HAVE_conditional_move
+#define HAVE_conditional_move (0)
+#endif
+
 static unsigned int tree_ssa_phiopt (void);
-static unsigned int tree_ssa_phiopt_worker (bool);
+static unsigned int tree_ssa_phiopt_worker (bool, bool);
 static bool conditional_replacement (basic_block, basic_block,
 edge, edge, gimple, tree, tree);
 static bool value_replacement (basic_block, basic_block,
@@ -52,6 +60,9 @@ static bool cond_store_replacement (basic_block, b
 static bool cond_if_else_store_replacement (basic_block, basic_block, 
basic_block);
 static struct pointer_set_t * get_non_trapping (void);
 static void replace_phi_edge_with_variable (basic_block, edge, gimple, tree);
+static void hoist_adjacent_loads (basic_block, basic_block,
+ basic_block, basic_block);
+static bool gate_hoist_loads (void);
 
 /* This pass tries to replaces an if-then-else block with an
assignment.  We have four kinds of transformations.  Some of these
@@ -137,12 +148,56 @@ static void replace_phi_edge_with_variable (basic_
  bb2:
x = PHI <x' (bb0), ...>;
 
-   A similar transformation is done for MAX_EXPR.  */
+   A similar transformation is done for MAX_EXPR.
 
+
+   This pass also performs a fifth transformation of a slightly different
+   flavor.
+
+   Adjacent Load Hoisting
+   --
+   
+   This transformation replaces
+
+ bb0:
+   if (...) goto bb2; else goto bb1;
+ bb1:
+   x1 = (expr).field1;
+   goto bb3;
+ bb2:
+   x2 = (expr).field2;
+ bb3:
+   # x = PHI <x1, x2>;
+
+   with
+
+ bb0:
+   x1 = (expr).field1;
+   x2 = (expr).field2;
+   if (...) 

Re: [PATCH, i386, Android] Enable exceptions and RTTI by default for Android

2012-02-24 Thread Jing Yu
My comment is (was) not about the format of the patch. Instead, I am
wondering whether the Android toolchain's customer, which is Android AOSP,
wants this patch.

I don't know the scenario behind this patch. I think the question
behind it is: if RTTI and exceptions are enabled by default, who is
supposed to handle them? For now, there is no answer.

The Android AOSP tree provides very limited C++ support. The Android NDK
provides four options for C++ support. Some of the options support
both exceptions and RTTI; some support only RTTI.

Therefore I guess Android AOSP probably would not like to enable
exceptions and RTTI by default.

Questions, complaints, and requests about Android's limited C++ support
should go to the Android forum.
Questions about license concerns should go to the Android AOSP lawyer.

Thanks,
Jing

On Fri, Feb 24, 2012 at 2:36 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Fri, Feb 24, 2012 at 11:22 AM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 On Wed, Feb 22, 2012 at 3:57 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 Hello,

 Here is a simple patch which enables exceptions and RTTI by default
 for Android target. OK for trunk?

 Err - isn't that the default?  Thus, simply delete the bogus spec?

 Richard.


 Hi,

 Is following patch OK or it's better to remove whole macro and its usages?

 The latter.

 Richard.

 Thanks,
 Ilya
 --
 2012-02-22  Enkovich Ilya  ilya.enkov...@intel.com

        * gcc/config/linux-android.h (ANDROID_CC1PLUS_SPEC): Enable
        exceptions and rtti by default.


 diff --git a/gcc/config/linux-android.h b/gcc/config/linux-android.h
 index 94c5274..180b62b 100644
 --- a/gcc/config/linux-android.h
 +++ b/gcc/config/linux-android.h
 @@ -45,9 +45,7 @@
   %{!mglibc:%{!muclibc:%{!mbionic: -mbionic}}}                       \
   %{!fno-pic:%{!fno-PIC:%{!fpic:%{!fPIC: -fPIC

 -#define ANDROID_CC1PLUS_SPEC                                           \
 -  %{!fexceptions:%{!fno-exceptions: -fno-exceptions}}                \
 -  %{!frtti:%{!fno-rtti: -fno-rtti}}
 +#define ANDROID_CC1PLUS_SPEC 

  #define ANDROID_LIB_SPEC \
   %{!static: -ldl}
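
For readers unfamiliar with spec strings: in GCC spec syntax, `%{!S:X}` substitutes X when the switch `-S` was not given, so the nested form in the removed ANDROID_CC1PLUS_SPEC adds `-fno-exceptions` only when the user passed neither `-fexceptions` nor `-fno-exceptions`. The C sketch below mimics that rule for one flag pair; the function names and the sample argument vectors are hypothetical and unrelated to the driver's real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Sample command lines used only by this sketch.  */
static const char *const demo_args_none[] = { "-O2" };
static const char *const demo_args_on[] = { "-O2", "-fexceptions" };
static const char *const demo_args_off[] = { "-fno-exceptions", "-O2" };

/* Returns true if FLAG appears in ARGV.  */
static bool demo_has_flag (int argc, const char *const *argv,
                           const char *flag)
{
  assert (argc >= 0 && argv != NULL && flag != NULL);
  for (int i = 0; i < argc; i++)
    if (strcmp (argv[i], flag) == 0)
      return true;
  return false;
}

/* Mirrors %{!fexceptions:%{!fno-exceptions: -fno-exceptions}}:
   supply the default only if the user chose neither setting.  */
const char *demo_default_exceptions_flag (int argc, const char *const *argv)
{
  if (!demo_has_flag (argc, argv, "-fexceptions")
      && !demo_has_flag (argc, argv, "-fno-exceptions"))
    return "-fno-exceptions";
  return "";
}
```

Emptying the spec, as the diff does, removes this defaulting step, so the compiler's own defaults apply.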


Re: Simulator testing for sh and sh64

2012-02-24 Thread Oleg Endo
Hello,

On Fri, 2012-02-24 at 22:23 +0100, Thomas Schwinge wrote:

 Confirming that this patch makes GCC trunk buildable again.
 
 Comparing to the 4.6 testsuite results, with trunk there are about 700
 new execution failures in g++, gcc, libstdc++, about 100 ``compilation
 failed to produce executable'' in g++, there is ``FAIL:
 gcc.target/sh/pr21255-2-ml.c scan-assembler mov @\\(4,r.\\),r.; mov
 @r.,r.''

The test case should actually be skipped, shouldn't it?
At least it tries to:
/* { dg-skip-if "" { "sh*-*-*" } { "-mb" && "-m5*" } { "" } }  */

I guess this could fail to skip depending on the compiler's default
configuration.  E.g. if it is configured to emit big endian by default,
-mb will not necessarily be passed and the test would be executed,
although it is not supposed to be.

 ``FAIL: gcc.target/sh/pr49468-si.c scan-assembler-times neg
 2'', 

This test case should be disabled for SH64 if the abs:SI insns are
disabled in sh.md, otherwise it will most likely fail.
Adding the following line to pr49468-si.c 

/* { dg-skip-if "" { "sh*-*-*" } { "-m5*" } { "" } }  */

should skip it on SH64 (if an -m5* arg is actually passed).

Cheers,
Oleg



[google][4.6]Bug fix to function reordering plugin to check presence of elf.h

2012-02-24 Thread Sriraman Tallam
function_reordering_plugin.c includes elf.h, which is not available
on non-ELF platforms when building a cross-compiler. This patch checks for
elf.h before including it. Otherwise, it defines the macros it uses itself.
This is safe because the macro values will not change.
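
A minimal, self-contained C sketch of that guard-and-fallback pattern follows. The values 0 and 1 for SHT_NULL and SHT_PROGBITS are the ones fixed by the ELF specification; the two `demo_` functions are hypothetical helpers added only to make the example checkable.

```c
#include <assert.h>

/* Pull in the system header only where the host toolchain is ELF.  */
#if defined (__ELF__)
# include <elf.h>
#endif

/* Fallback definitions for hosts without elf.h.  */
#ifndef SHT_NULL
# define SHT_NULL 0
#endif
#ifndef SHT_PROGBITS
# define SHT_PROGBITS 1
#endif

/* Confirms the constants have the expected values, whether they came
   from elf.h or from the fallbacks above.  */
void demo_check_constants (void)
{
  assert (SHT_NULL == 0);
  assert (SHT_PROGBITS == 1);
}

/* Returns nonzero if SECTION_TYPE denotes program-defined contents.  */
int demo_is_progbits (unsigned section_type)
{
  return section_type == SHT_PROGBITS;
}
```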

For context, this linker plugin itself is only available in the google
4_6 branch and I will port it to other branches and make it available
for review for trunk soon.


2012-02-24  Sriraman Tallam  tmsri...@google.com

* function_reordering_plugin.c: Check for presence of elf.h.
Otherwise, redefine the elf macros used.

Ok to commit?

Thanks,
-Sri.
Index: function_reordering_plugin/function_reordering_plugin.c
===
--- function_reordering_plugin/function_reordering_plugin.c (revision 
184564)
+++ function_reordering_plugin/function_reordering_plugin.c (working copy)
@@ -43,11 +43,23 @@ along with this program; see the file COPYING3.  I
 #include <stdlib.h>
 #include <assert.h>
 #include <string.h>
-#include <elf.h>
+#if  defined (__ELF__)
+  #include <elf.h>
+#endif
 #include config.h
 #include plugin-api.h
 #include callgraph.h
 
+/* #include <elf.h>   Not available on Darwin. 
+   Rather than dealing with cross-compilation includes, hard code the
+   values we need, as these will not change.  */
+#ifndef SHT_NULL
+ #define SHT_NULL 0
+#endif
+#ifndef SHT_PROGBITS
+ #define SHT_PROGBITS 1
+#endif
+
 enum ld_plugin_status claim_file_hook (const struct ld_plugin_input_file *file,
int *claimed);
 enum ld_plugin_status all_symbols_read_hook ();


Re: [libgo] Fix typo in libgo/runtime/go-nosys.c

2012-02-24 Thread Ian Lance Taylor
Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 Mainline bootstrap on x86_64-unknown-linux-gnu (CentOS 5.6) was failing
 due to a trivial typo.  Fixed as follows.

Whoops.  Thanks.  Committed.

Ian


[PATCH] backport r184555 to gcc-4_6-branch

2012-02-24 Thread Jack Howarth
   The attached patch is a backport of r184555 to the gcc-4_6-branch to fix
PR52179 so that boehm-gc functions properly with the default use of -pie in
the linker on darwin11 and later. This fixes PR49461 properly and allows the
previous hack of passing -no_pie to SYSTEMSPEC to be removed. Bootstrapped and
regression tested on
x86_64-apple-darwin11...
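
The core of the fix is to ask the threading library for the stack base at run time where that is possible, instead of relying on a hard-coded address. A hedged C sketch of the pattern: HAVE_PTHREAD_GET_STACKADDR_NP is the configure macro named in the ChangeLog below, while `demo_stack_bottom` and the fallback constant are made up for this example and do not match the values in gcconfig.h.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_PTHREAD_GET_STACKADDR_NP
# include <pthread.h>
#endif

/* Hypothetical placeholder; NOT the constant used by boehm-gc.  */
#define DEMO_FALLBACK_STACKBOTTOM ((void *) 0x1000)

/* Returns the base of the current thread's stack, queried at run time
   when the Darwin-specific call is available, else a fixed fallback.  */
void *demo_stack_bottom (void)
{
  void *bottom;
#ifdef HAVE_PTHREAD_GET_STACKADDR_NP
  bottom = pthread_get_stackaddr_np (pthread_self ());
#else
  bottom = DEMO_FALLBACK_STACKBOTTOM;
#endif
  assert (bottom != NULL);
  return bottom;
}
```

A run-time query presumably stays correct when the stack location is no longer fixed, which is the situation -pie creates; a constant cannot.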

http://gcc.gnu.org/ml/gcc-testresults/2012-02/msg02427.html

Okay for gcc-4_6-branch when it reopens for gcc 4.6.4?
   Jack


boehm-gc/

2012-02-25  Jack Howarth  howa...@bromo.med.uc.edu

Backport from mainline
2012-02-23  Patrick Marlier  patrick.marl...@gmail.com
Jack Howarth  howa...@bromo.med.uc.edu

PR boehm-gc/52179
* include/gc_config.h.in: Undefine HAVE_PTHREAD_GET_STACKADDR_NP.
* include/private/gcconfig.h (DARWIN): Define STACKBOTTOM with
pthread_get_stackaddr_np when available.
* configure.ac (THREADS): Check availability of 
pthread_get_stackaddr_np.
* configure: Regenerate.

libjava/

2012-02-25  Jack Howarth  howa...@bromo.med.uc.edu

Backport from mainline
2012-02-23  Patrick Marlier  patrick.marl...@gmail.com
Jack Howarth  howa...@bromo.med.uc.edu

PR target/49461
* configure.ac (SYSTEMSPEC): No longer pass -no_pie for darwin11.
* configure: Regenerate.


Index: boehm-gc/configure.ac
===
--- boehm-gc/configure.ac   (revision 184553)
+++ boehm-gc/configure.ac   (working copy)
@@ -392,6 +392,7 @@ esac
 oldLIBS=$LIBS
 LIBS=$LIBS $THREADLIBS
 AC_CHECK_FUNCS([pthread_getattr_np])
+AC_CHECK_FUNCS([pthread_get_stackaddr_np])
 LIBS=$oldLIBS
 
 # Configuration of machine-dependent code
Index: boehm-gc/include/gc_config.h.in
===
--- boehm-gc/include/gc_config.h.in (revision 184553)
+++ boehm-gc/include/gc_config.h.in (working copy)
@@ -87,6 +87,9 @@
 /* Define to 1 if you have the `pthread_getattr_np' function. */
 #undef HAVE_PTHREAD_GETATTR_NP
 
+/* Define to 1 if you have the `pthread_get_stackaddr_np' function. */
+#undef HAVE_PTHREAD_GET_STACKADDR_NP
+
 /* Define to 1 if you have the stdint.h header file. */
 #undef HAVE_STDINT_H
 
Index: boehm-gc/include/private/gcconfig.h
===
--- boehm-gc/include/private/gcconfig.h (revision 184553)
+++ boehm-gc/include/private/gcconfig.h (working copy)
@@ -1331,7 +1331,11 @@
 These aren't used when dyld support is enabled (it is by default) */
 # define DATASTART ((ptr_t) get_etext())
 # define DATAEND   ((ptr_t) get_end())
-# define STACKBOTTOM ((ptr_t) 0xc000)
+# ifdef HAVE_PTHREAD_GET_STACKADDR_NP
+#   define STACKBOTTOM (ptr_t)pthread_get_stackaddr_np(pthread_self())
+# else
+#   define STACKBOTTOM ((ptr_t) 0xc000)
+# endif
 # define USE_MMAP
 # define USE_MMAP_ANON
 # define USE_ASM_PUSH_REGS
@@ -2011,7 +2015,11 @@
 These aren't used when dyld support is enabled (it is by default) */
 # define DATASTART ((ptr_t) get_etext())
 # define DATAEND   ((ptr_t) get_end())
-# define STACKBOTTOM ((ptr_t) 0x7fff5fc0)
+# ifdef HAVE_PTHREAD_GET_STACKADDR_NP
+#   define STACKBOTTOM (ptr_t)pthread_get_stackaddr_np(pthread_self())
+# else
+#   define STACKBOTTOM ((ptr_t) 0x7fff5fc0)
+# endif
 # define USE_MMAP
 # define USE_MMAP_ANON
 # ifdef GC_DARWIN_THREADS
Index: libjava/configure.ac
===
--- libjava/configure.ac(revision 184553)
+++ libjava/configure.ac(working copy)
@@ -886,14 +886,9 @@ case ${host} in
 SYSTEMSPEC=-lunicows $SYSTEMSPEC
   fi
 ;;
-*-*-darwin9*)
+*-*-darwin[[912]]*)
   SYSTEMSPEC=%{!Zdynamiclib:%{!Zbundle:-allow_stack_execute}}
 ;;
-*-*-darwin[[12]]*)
-  # Something is incompatible with pie, would be nice to fix it and
-  # remove -no_pie.  PR49461
-  SYSTEMSPEC=-no_pie %{!Zdynamiclib:%{!Zbundle:-allow_stack_execute}}
-;;
 *)
   SYSTEMSPEC=
 ;;


[PATCH 1/2] mips: Add R4600 scheduling support for imul and idiv

2012-02-24 Thread Matt Turner
The r4600_imul and r4600_idiv reservations were correct for si, but
there were no *_di reservations.

See page 4 of
http://www.sgistuff.net/hardware/other/documents/R4600_Prod_OV.pdf

2012-02-24  Matt Turner  matts...@gmail.com

* config/mips/4600.md (r4600_imul_si): Rename from r4600_imul.
(r4600_imul_di): New.
(r4600_idiv_si): Rename from r4600_idiv.
(r4600_idiv_di): New.
---
 gcc/config/mips/4600.md |   24 +++-
 1 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/gcc/config/mips/4600.md b/gcc/config/mips/4600.md
index c645cbc..fcdbf00 100644
--- a/gcc/config/mips/4600.md
+++ b/gcc/config/mips/4600.md
@@ -1,5 +1,5 @@
 ;; R4600 and R4650 pipeline description.
-;;   Copyright (C) 2004, 2005, 2007 Free Software Foundation, Inc.
+;;   Copyright (C) 2004, 2005, 2007, 2012 Free Software Foundation, Inc.
 ;;
 ;; This file is part of GCC.
 
@@ -24,16 +24,30 @@
 ;; We handle the R4600 and R4650 in much the same way.  The only difference
 ;; is in the integer multiplication and division costs.
 
-(define_insn_reservation r4600_imul 10
+(define_insn_reservation r4600_imul_si 10
   (and (eq_attr cpu r4600)
-   (eq_attr type imul,imul3,imadd))
+   (eq_attr type imul,imul3,imadd)
+   (eq_attr mode SI))
   imuldiv*10)
 
-(define_insn_reservation r4600_idiv 42
+(define_insn_reservation r4600_imul_di 12
   (and (eq_attr cpu r4600)
-   (eq_attr type idiv))
+   (eq_attr type imul,imul3,imadd)
+   (eq_attr mode DI))
+  imuldiv*12)
+
+(define_insn_reservation r4600_idiv_si 42
+  (and (eq_attr cpu r4600)
+   (eq_attr type idiv)
+   (eq_attr mode SI))
   imuldiv*42)
 
+(define_insn_reservation r4600_idiv_di 74
+  (and (eq_attr cpu r4600)
+   (eq_attr type idiv)
+   (eq_attr mode DI))
+  imuldiv*74)
+
 
 (define_insn_reservation r4650_imul 4
   (and (eq_attr cpu r4650)
-- 
1.7.3.4



Miscellaneous mips, arm, and alpha patches

2012-02-24 Thread Matt Turner
Hi,

Following this email are five rather trivial patches that I've had
sitting around while waiting for my grad school and the Free Software
Foundation to decide it's okay for me to contribute. I don't have
copyright assignment for gcc yet, but I thought I would pipeline this
process and try to get the patches at least reviewed before the
paperwork is completed. If they're trivial enough to be committed
without copyright assignment, I'd love for them to be committed for
gcc 4.8.

The patches are

[PATCH 1/2] mips: Add R4600 scheduling support for imul and idiv
[PATCH 2/2] mips: Add R4700 scheduling support
[PATCH] arm: Fix iwmmxt shift and logical intrinsics (PR 35294)
[PATCH] arm: add _mm_empty to mmintrin.h for source compatibility
[PATCH] alpha: add bypasses for fmul/fadd/fcmov -> fst/ftoi

I have not contributed to gcc before, so please tell me if I've missed
a step or didn't format the ChangeLog entries properly, and so forth.
Please CC me on replies.

Thanks,
Matt Turner


[PATCH] alpha: add bypasses for fmul/fadd/fcmov -> fst/ftoi

2012-02-24 Thread Matt Turner
See section 2.5.3 (page 28) of
http://download.majix.org/dec/comp_guide_v2.pdf
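
A define_bypass gives a producer/consumer pair its own latency, overriding the latency declared on the producer's reservation. The C sketch below models the two bypasses in this patch as a lookup: 6 cycles from fmul/fadd to fst/ftoi and 10 cycles from fcmov to fst/ftoi. The enum and function names are hypothetical, and the default latency is supplied by the caller rather than taken from ev6.md.

```c
#include <assert.h>

enum demo_producer { DEMO_FMUL, DEMO_FADD, DEMO_FCMOV, DEMO_OTHER_PRODUCER };
enum demo_consumer { DEMO_FST, DEMO_FTOI, DEMO_OTHER_CONSUMER };

/* Returns the latency between producer P and consumer C: the bypass
   value if one applies, otherwise DEFAULT_LATENCY.  */
int demo_latency (enum demo_producer p, enum demo_consumer c,
                  int default_latency)
{
  assert (default_latency >= 0);

  int store_or_move = (c == DEMO_FST || c == DEMO_FTOI);

  if (store_or_move && (p == DEMO_FMUL || p == DEMO_FADD))
    return 6;
  if (store_or_move && p == DEMO_FCMOV)
    return 10;
  return default_latency;
}
```

For example, with the fcmov reservation's latency of 8, a dependent fst sees 10 cycles while any other consumer still sees 8.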

2012-02-24  Matt Turner  matts...@gmail.com

* config/alpha/ev6.md: (define_bypass ev6_fmul,ev6_fadd): New.
(define_bypass ev6_fcmov): New.
---
 gcc/config/alpha/ev6.md |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/gcc/config/alpha/ev6.md b/gcc/config/alpha/ev6.md
index adfe504..a16535a 100644
--- a/gcc/config/alpha/ev6.md
+++ b/gcc/config/alpha/ev6.md
@@ -147,11 +147,15 @@
(eq_attr type fadd,fcpys,fbr))
   ev6_fa)
 
+(define_bypass 6 ev6_fmul,ev6_fadd ev6_fst,ev6_ftoi)
+
 (define_insn_reservation ev6_fcmov 8
   (and (eq_attr tune ev6)
(eq_attr type fcmov))
   ev6_fa,nothing*3,ev6_fa)
 
+(define_bypass 10 ev6_fcmov ev6_fst,ev6_ftoi)
+
 (define_insn_reservation ev6_fdivsf 12
   (and (eq_attr tune ev6)
(and (eq_attr type fdiv)
-- 
1.7.3.4



[PATCH 2/2] mips: Add R4700 scheduling support

2012-02-24 Thread Matt Turner
The R4700 is identical to the R4600 except for the integer and
floating-point multiplication costs.

See page 4 of http://datasheets.chipdb.org/IDT/MIPS/79RV4700.pdf
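
As a compact summary of what this two-patch series encodes, the integer multiply and divide latencies can be written as a small table. The numbers are the ones used in the reservations of the two patches (R4600 imul 10/12, R4700 imul 8/10, idiv 42/74 shared); the C types and the `demo_int_latency` function are hypothetical and exist only for illustration.

```c
#include <assert.h>

enum demo_cpu { DEMO_R4600, DEMO_R4700 };
enum demo_op { DEMO_IMUL, DEMO_IDIV };
enum demo_mode { DEMO_SI, DEMO_DI };

/* Returns the scheduling latency in cycles for an integer multiply or
   divide on the given CPU in the given mode.  */
int demo_int_latency (enum demo_cpu cpu, enum demo_op op,
                      enum demo_mode mode)
{
  /* Multiply costs differ between the two CPUs: [cpu][mode].  */
  static const int imul[2][2] = {
    { 10, 12 },  /* R4600: SI, DI */
    {  8, 10 },  /* R4700: SI, DI */
  };
  /* Divide costs are shared by both CPUs: [mode].  */
  static const int idiv[2] = { 42, 74 };

  assert (cpu == DEMO_R4600 || cpu == DEMO_R4700);
  assert (mode == DEMO_SI || mode == DEMO_DI);

  return op == DEMO_IMUL ? imul[cpu][mode] : idiv[mode];
}
```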

2012-02-24  Matt Turner  matts...@gmail.com

* config/mips/4600.md (r4700_imul_si): New.
(r4700_imul_di): New.
(r4700_fmul_single): New.
(r4700_fmul_double): New.
* config/mips/driver-native.c (cpu_types): Add r4700.
* config/mips/mips-cpus.def: Likewise.
* config/mips/mips.c: Likewise.
* config/mips/mips.md: Likewise.
---
 gcc/config/mips/4600.md |   51 ++
 gcc/config/mips/driver-native.c |2 +-
 gcc/config/mips/mips-cpus.def   |1 +
 gcc/config/mips/mips.c  |3 ++
 gcc/config/mips/mips.md |1 +
 5 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/gcc/config/mips/4600.md b/gcc/config/mips/4600.md
index fcdbf00..ef74fd3 100644
--- a/gcc/config/mips/4600.md
+++ b/gcc/config/mips/4600.md
@@ -1,4 +1,4 @@
-;; R4600 and R4650 pipeline description.
+;; R4600, R4650, and R4700 pipeline description.
 ;;   Copyright (C) 2004, 2005, 2007, 2012 Free Software Foundation, Inc.
 ;;
 ;; This file is part of GCC.
@@ -21,8 +21,10 @@
 ;; This file overrides parts of generic.md.  It is derived from the
 ;; old define_function_unit description.
 ;;
-;; We handle the R4600 and R4650 in much the same way.  The only difference
-;; is in the integer multiplication and division costs.
+;; We handle the R4600, R4650, and R4700 in much the same way.  The only
+;; differences between R4600 and R4650 are the integer multiplication and
+;; division costs. The only differences between R4600 and R4700 are the
+;; integer and floating-point multiplication costs.
 
 (define_insn_reservation r4600_imul_si 10
   (and (eq_attr cpu r4600)
@@ -37,13 +39,13 @@
   imuldiv*12)
 
 (define_insn_reservation r4600_idiv_si 42
-  (and (eq_attr cpu r4600)
+  (and (eq_attr cpu r4600,r4700)
(eq_attr type idiv)
(eq_attr mode SI))
   imuldiv*42)
 
 (define_insn_reservation r4600_idiv_di 74
-  (and (eq_attr cpu r4600)
+  (and (eq_attr cpu r4600,r4700)
(eq_attr type idiv)
(eq_attr mode DI))
   imuldiv*74)
@@ -60,13 +62,26 @@
   imuldiv*36)
 
 
+(define_insn_reservation r4700_imul_si 8
+  (and (eq_attr cpu r4700)
+   (eq_attr type imul,imul3,imadd)
+   (eq_attr mode SI))
+  imuldiv*8)
+
+(define_insn_reservation r4700_imul_di 10
+  (and (eq_attr cpu r4700)
+   (eq_attr type imul,imul3,imadd)
+   (eq_attr mode DI))
+  imuldiv*10)
+
+
 (define_insn_reservation r4600_load 2
-  (and (eq_attr cpu r4600,r4650)
+  (and (eq_attr cpu r4600,r4650,r4700)
(eq_attr type load,fpload,fpidxload))
   alu)
 
 (define_insn_reservation r4600_fmove 1
-  (and (eq_attr cpu r4600,r4650)
+  (and (eq_attr cpu r4600,r4650,r4700)
(eq_attr type fabs,fneg,fmove))
   alu)
 
@@ -76,26 +91,40 @@
(eq_attr mode SF)))
   alu)
 
+
+(define_insn_reservation r4700_fmul_single 4
+  (and (eq_attr cpu r4700)
+   (and (eq_attr type fmul,fmadd)
+   (eq_attr mode SF)))
+  alu)
+
+(define_insn_reservation r4700_fmul_double 5
+  (and (eq_attr cpu r4700)
+   (and (eq_attr type fmul,fmadd)
+   (eq_attr mode DF)))
+  alu)
+
+
 (define_insn_reservation r4600_fdiv_single 32
-  (and (eq_attr cpu r4600,r4650)
+  (and (eq_attr cpu r4600,r4650,r4700)
(and (eq_attr type fdiv,frdiv)
(eq_attr mode SF)))
   alu)
 
 (define_insn_reservation r4600_fdiv_double 61
-  (and (eq_attr cpu r4600,r4650)
+  (and (eq_attr cpu r4600,r4650,r4700)
(and (eq_attr type fdiv,frdiv)
(eq_attr mode DF)))
   alu)
 
 (define_insn_reservation r4600_fsqrt_single 31
-  (and (eq_attr cpu r4600,r4650)
+  (and (eq_attr cpu r4600,r4650,r4700)
(and (eq_attr type fsqrt,frsqrt)
(eq_attr mode SF)))
   alu)
 
 (define_insn_reservation r4600_fsqrt_double 60
-  (and (eq_attr cpu r4600,r4650)
+  (and (eq_attr cpu r4600,r4650,r4700)
(and (eq_attr type fsqrt,frsqrt)
(eq_attr mode DF)))
   alu)
diff --git a/gcc/config/mips/driver-native.c b/gcc/config/mips/driver-native.c
index f565c57..580bca2 100644
--- a/gcc/config/mips/driver-native.c
+++ b/gcc/config/mips/driver-native.c
@@ -45,7 +45,7 @@ static const struct cpu_types {
   { C0_IMP_R14000, r14000 },
   { C0_IMP_R8000,  r8000 },
   { C0_IMP_R4600,  r4600 },
-  { C0_IMP_R4700,  r4600 },
+  { C0_IMP_R4700,  r4700 },
   { C0_IMP_R4650,  r4650 },
   { C0_IMP_R5000,  vr5000 },
   { C0_IMP_RM7000, rm7000 },
diff --git a/gcc/config/mips/mips-cpus.def b/gcc/config/mips/mips-cpus.def
index 98b915a..d4631b0 100644
--- a/gcc/config/mips/mips-cpus.def
+++ b/gcc/config/mips/mips-cpus.def
@@ -70,6 +70,7 @@ MIPS_CPU (r4400, PROCESSOR_R4000, 3, 0)
 MIPS_CPU (r4600, PROCESSOR_R4600, 3, 0)
 MIPS_CPU (orion, PROCESSOR_R4600, 3, 0)
 MIPS_CPU (r4650, PROCESSOR_R4650, 3, 0)
+MIPS_CPU (r4700, PROCESSOR_R4700, 3, 0)
 /* ST Loongson 2E/2F 

[PATCH] arm: Fix iwmmxt shift and logical intrinsics (PR 35294).

2012-02-24 Thread Matt Turner
PR 36798 and 36966 are duplicates.

2012-02-24  Matt Turner  matts...@gmail.com

PR target/35294
* config/arm/arm.c (arm_expand_builtin): Wire up missing
intrinsics.
---
 gcc/config/arm/arm.c |   62 +-
 1 files changed, 61 insertions(+), 1 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7f0dc6b..f5935d6 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -20502,7 +20502,8 @@ arm_expand_binop_builtin (enum insn_code icode,
   || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
 target = gen_reg_rtx (tmode);
 
-  gcc_assert (GET_MODE (op0) == mode0 && GET_MODE (op1) == mode1);
+  gcc_assert ((GET_MODE (op0) == mode0 || GET_MODE (op0) == VOIDmode)
+              && (GET_MODE (op1) == mode1 || GET_MODE (op1) == VOIDmode));
 
   if (! (*insn_data[icode].operand[1].predicate) (op0, mode0))
 op0 = copy_to_mode_reg (mode0, op0);
@@ -21181,6 +21182,65 @@ arm_expand_builtin (tree exp,
   emit_insn (pat);
   return target;
 
+case ARM_BUILTIN_WSLLH:
+case ARM_BUILTIN_WSLLHI:
+case ARM_BUILTIN_WSLLW:
+case ARM_BUILTIN_WSLLWI:
+case ARM_BUILTIN_WSLLD:
+case ARM_BUILTIN_WSLLDI:
+case ARM_BUILTIN_WSRAH:
+case ARM_BUILTIN_WSRAHI:
+case ARM_BUILTIN_WSRAW:
+case ARM_BUILTIN_WSRAWI:
+case ARM_BUILTIN_WSRAD:
+case ARM_BUILTIN_WSRADI:
+case ARM_BUILTIN_WSRLH:
+case ARM_BUILTIN_WSRLHI:
+case ARM_BUILTIN_WSRLW:
+case ARM_BUILTIN_WSRLWI:
+case ARM_BUILTIN_WSRLD:
+case ARM_BUILTIN_WSRLDI:
+case ARM_BUILTIN_WRORH:
+case ARM_BUILTIN_WRORHI:
+case ARM_BUILTIN_WRORW:
+case ARM_BUILTIN_WRORWI:
+case ARM_BUILTIN_WRORD:
+case ARM_BUILTIN_WRORDI:
+case ARM_BUILTIN_WAND:
+case ARM_BUILTIN_WANDN:
+case ARM_BUILTIN_WOR:
+case ARM_BUILTIN_WXOR:
+  icode = (fcode == ARM_BUILTIN_WSLLH ? CODE_FOR_ashlv4hi3_di
+  : fcode == ARM_BUILTIN_WSLLHI ? CODE_FOR_ashlv4hi3_iwmmxt
+  : fcode == ARM_BUILTIN_WSLLW  ? CODE_FOR_ashlv2si3_di
+  : fcode == ARM_BUILTIN_WSLLWI ? CODE_FOR_ashlv2si3_iwmmxt
+  : fcode == ARM_BUILTIN_WSLLD  ? CODE_FOR_ashldi3_di
+  : fcode == ARM_BUILTIN_WSLLDI ? CODE_FOR_ashldi3_iwmmxt
+  : fcode == ARM_BUILTIN_WSRAH  ? CODE_FOR_ashrv4hi3_di
+  : fcode == ARM_BUILTIN_WSRAHI ? CODE_FOR_ashrv4hi3_iwmmxt
+  : fcode == ARM_BUILTIN_WSRAW  ? CODE_FOR_ashrv2si3_di
+  : fcode == ARM_BUILTIN_WSRAWI ? CODE_FOR_ashrv2si3_iwmmxt
+  : fcode == ARM_BUILTIN_WSRAD  ? CODE_FOR_ashrdi3_di
+  : fcode == ARM_BUILTIN_WSRADI ? CODE_FOR_ashrdi3_iwmmxt
+  : fcode == ARM_BUILTIN_WSRLH  ? CODE_FOR_lshrv4hi3_di
+  : fcode == ARM_BUILTIN_WSRLHI ? CODE_FOR_lshrv4hi3_iwmmxt
+  : fcode == ARM_BUILTIN_WSRLW  ? CODE_FOR_lshrv2si3_di
+  : fcode == ARM_BUILTIN_WSRLWI ? CODE_FOR_lshrv2si3_iwmmxt
+  : fcode == ARM_BUILTIN_WSRLD  ? CODE_FOR_lshrdi3_di
+  : fcode == ARM_BUILTIN_WSRLDI ? CODE_FOR_lshrdi3_iwmmxt
+  : fcode == ARM_BUILTIN_WRORH  ? CODE_FOR_rorv4hi3_di
+  : fcode == ARM_BUILTIN_WRORHI ? CODE_FOR_rorv4hi3
+  : fcode == ARM_BUILTIN_WRORW  ? CODE_FOR_rorv2si3_di
+  : fcode == ARM_BUILTIN_WRORWI ? CODE_FOR_rorv2si3
+  : fcode == ARM_BUILTIN_WRORD  ? CODE_FOR_rordi3_di
+  : fcode == ARM_BUILTIN_WRORDI ? CODE_FOR_rordi3
+  : fcode == ARM_BUILTIN_WAND   ? CODE_FOR_iwmmxt_anddi3
+  : fcode == ARM_BUILTIN_WANDN  ? CODE_FOR_iwmmxt_nanddi3
+  : fcode == ARM_BUILTIN_WOR    ? CODE_FOR_iwmmxt_iordi3
+  : fcode == ARM_BUILTIN_WXOR   ? CODE_FOR_iwmmxt_xordi3
+  : CODE_FOR_rordi3);
+  return arm_expand_binop_builtin (icode, exp, target);
+
 case ARM_BUILTIN_WZERO:
   target = gen_reg_rtx (DImode);
   emit_insn (gen_iwmmxt_clrdi (target));
-- 
1.7.3.4



[PATCH] arm: add _mm_empty to mmintrin.h for source compatibility

2012-02-24 Thread Matt Turner
The x86/amd64 mmintrin.h provides the _mm_empty intrinsic for the 'emms'
MMX instruction. Although ARM does not need such an instruction, we
should provide an empty _mm_empty function nonetheless for source
compatibility.
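A minimal sketch of the kind of portable caller this is meant to help, assuming the target's mmintrin.h provides _mm_cvtsi32_si64, _mm_add_pi16 and _mm_cvtsi64_si32. The helper name add_halfwords, the HAVE_MMINTRIN macro and the scalar fallback are hypothetical and not part of the patch:

```c
#include <assert.h>

/* Use the MMX-style header when the compiler targets x86 MMX or
   ARM iWMMXt; otherwise fall back to plain C.  */
#if defined(__MMX__) || defined(__IWMMXT__)
# include <mmintrin.h>
# define HAVE_MMINTRIN 1
#endif

/* Hypothetical helper, not part of the patch: adds the two 16-bit
   lanes of A and B independently, with wraparound in each lane.  */
static int
add_halfwords (int a, int b)
{
  assert (sizeof (int) == 4);   /* The sketch assumes a 32-bit int.  */
#ifdef HAVE_MMINTRIN
  __m64 va = _mm_cvtsi32_si64 (a);
  __m64 vb = _mm_cvtsi32_si64 (b);
  int r = _mm_cvtsi64_si32 (_mm_add_pi16 (va, vb));
  /* Required on x86 before later x87 code; a no-op on ARM.  Without
     an ARM definition this line would need its own #ifdef.  */
  _mm_empty ();
  return r;
#else
  /* Scalar equivalent of the lane-wise add above.  */
  unsigned lo = ((unsigned) a + (unsigned) b) & 0xffffu;
  unsigned hi = (((unsigned) a >> 16) + ((unsigned) b >> 16)) & 0xffffu;
  return (int) ((hi << 16) | lo);
#endif
}
```

With an empty _mm_empty on ARM, the same source builds for both targets without a per-architecture guard around the call.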

2012-02-24  Matt Turner  matts...@gmail.com

* config/arm/mmintrin.h (_mm_empty): New.
---
 gcc/config/arm/mmintrin.h |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/gcc/config/arm/mmintrin.h b/gcc/config/arm/mmintrin.h
index 2cc500d..ea73bf1 100644
--- a/gcc/config/arm/mmintrin.h
+++ b/gcc/config/arm/mmintrin.h
@@ -32,6 +32,12 @@ typedef int __v2si __attribute__ ((vector_size (8)));
 typedef short __v4hi __attribute__ ((vector_size (8)));
 typedef char __v8qi __attribute__ ((vector_size (8)));
 
+/* Provided for source compatibility with MMX.  */
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_mm_empty (void)
+{
+}
+
 /* Convert __m64 and __int64 into each other.  */
 static __inline __m64 
 _mm_cvtsi64_m64 (__int64 __i)
@@ -1248,6 +1254,7 @@ _m_from_int (int __a)
 #define _m_psadzbw _mm_sadz_pu8
 #define _m_psadzwd _mm_sadz_pu16
 #define _m_paligniq _mm_align_si64
+#define _m_empty _mm_empty
 #define _m_cvt_si2pi _mm_cvtsi64_m64
 #define _m_cvt_pi2si _mm_cvtm64_si64
 
-- 
1.7.3.4



Re: [PATCH] backport r184555 to gcc-4_6-branch

2012-02-24 Thread Mike Stump
On Feb 24, 2012, at 6:45 PM, Jack Howarth howa...@bromo.med.uc.edu wrote:
 The attached patch is a backport of r184555 to the gcc-4_6-branch to fix
 PR52179 so that boehm-gc functions properly with darwin11 and later's
 default usage of -pie in the linker. This fixes PR49461 properly and allows
 the previous hack of passing -no_pie to SYSTEMSPEC to be removed. Bootstrap
 and regression tested on x86_64-apple-darwin11...
 
 http://gcc.gnu.org/ml/gcc-testresults/2012-02/msg02427.html
 
 Okay for gcc-4_6-branch when it reopens for gcc 4.6.4?

Ok after it reopens.


Re: [google][4.6]Bug fix to function reordering plugin to check presence of elf.h

2012-02-24 Thread Xinliang David Li
ok.

David

On Fri, Feb 24, 2012 at 4:19 PM, Sriraman Tallam tmsri...@google.com wrote:
 function_reordering_plugin.c includes elf.h which is not available
 on non-ELF platforms building a cross-compiler. This patch checks for
 elf.h before including it. Otherwise, it redefines the macros used.
 This is safe because the macros will not change.

 For context, this linker plugin itself is only available in the google
 4_6 branch and I will port it to other branches and make it available
 for review for trunk soon.


 2012-02-24  Sriraman Tallam  tmsri...@google.com

        * function_reordering_plugin.c: Check for presence of elf.h.
        Otherwise, redefine the elf macros used.

 Ok to commit?

 Thanks,
 -Sri.
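
The approach described above (include elf.h when it exists, otherwise redefine the macros used) might look roughly like the following sketch. It is not the committed patch: the HAVE_ELF_H guard follows the usual autoconf convention and is assumed here, the set of macros the plugin actually needs is not shown in this thread, and count_progbits is a hypothetical helper. The fallback values are the section-type constants fixed by the ELF specification, which is why redefining them is safe.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the build system defines HAVE_ELF_H when <elf.h> exists.  */
#ifdef HAVE_ELF_H
# include <elf.h>
#endif

/* Fallback definitions for hosts without <elf.h>.  The values are
   fixed by the ELF specification, so they cannot drift.  */
#ifndef SHT_NULL
# define SHT_NULL 0
#endif
#ifndef SHT_PROGBITS
# define SHT_PROGBITS 1
#endif
#ifndef SHT_SYMTAB
# define SHT_SYMTAB 2
#endif
#ifndef SHT_STRTAB
# define SHT_STRTAB 3
#endif

/* Hypothetical helper: counts the entries of SH_TYPES that are
   SHT_PROGBITS (sections holding program-defined code or data).  */
static size_t
count_progbits (const unsigned *sh_types, size_t n)
{
  size_t i, count = 0;

  assert (sh_types != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (sh_types[i] == SHT_PROGBITS)
      count++;
  return count;
}
```

Guarding each macro with its own #ifndef keeps the fallback harmless even if some other header has already defined it.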