date:20130510

[PATCH, PR preprocessor/42014] Added LAST_SOURCE_COLUMN in while loop

2013-05-10 Thread Shakthi Kannan

Hi,

The attached patch adds LAST_SOURCE_COLUMN to pp_verbatim
function in the while loop present in
diagnostic_report_current_module(). This makes the output
consistent for any error parsing program as stated in the bug.

2013-05-10 Shakthi Kannan skan...@redhat.com

PR preprocessor/42014
* gcc/diagnostic.c: Added LAST_SOURCE_COLUMN in while loop.

---
 gcc/diagnostic.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index f9a236b..2addbf0 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -528,8 +528,9 @@ diagnostic_report_current_module (diagnostic_context 
*context, location_t where)
{
  map = INCLUDED_FROM (line_table, map);
  pp_verbatim (context-printer,
-  ,\n from %s:%d,
-  LINEMAP_FILE (map), LAST_SOURCE_LINE (map));
+  ,\n from %s:%d:%d,
+  LINEMAP_FILE (map),
+  LAST_SOURCE_LINE (map), LAST_SOURCE_COLUMN (map));
}
  pp_verbatim (context-printer, :);
  pp_newline (context-printer);
-- 
1.7.11.7

Re: [PATCH, PR preprocessor/42014] Added LAST_SOURCE_COLUMN in while loop

2013-05-10 Thread Marek Polacek

On Fri, May 10, 2013 at 02:55:07AM -0400, Shakthi Kannan wrote:
 Hi,
 
 The attached patch adds LAST_SOURCE_COLUMN to pp_verbatim
 function in the while loop present in
 diagnostic_report_current_module(). This makes the output
 consistent for any error parsing program as stated in the bug.
 
 2013-05-10 Shakthi Kannan skan...@redhat.com
 
 PR preprocessor/42014
 * gcc/diagnostic.c: Added LAST_SOURCE_COLUMN in while loop.

Maybe
PR preprocessor/42014
* diagnostic.c: Print LAST_SOURCE_COLUMN as well.
?

Marek

Re: [PATCH] PR55033: Fix

2013-05-10 Thread Sebastian Huber


On 05/10/2013 03:55 AM, Alan Modra wrote:

On Thu, May 09, 2013 at 08:47:19AM -0700, Richard Henderson wrote:

On 02/22/2013 02:51 AM, Alan Modra wrote:

Richard, can I bother you with a review request?
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00544.html


Sorry for the truly excessive delay here.

This patch is ok for mainline and 4.8.1.


Thanks.  Committed revision 198762 and 198763.



Thanks a lot, and sorry for being so obtrusive.

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.

[PATCH] Test case for PR55033

2013-05-10 Thread Sebastian Huber

gcc/testsuite/ChangeLog
2013-05-10  Sebastian Huber  sebastian.hu...@embedded-brains.de

PR target/55033
* gcc.target/powerpc/pr55033.c: New.
---
 gcc/testsuite/gcc.target/powerpc/pr55033.c |   21 +
 1 files changed, 21 insertions(+), 0 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr55033.c

diff --git a/gcc/testsuite/gcc.target/powerpc/pr55033.c 
b/gcc/testsuite/gcc.target/powerpc/pr55033.c
new file mode 100644
index 000..cd8abcb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr55033.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_eabi_ok } */
+/* { dg-options -mcpu=8540 -msoft-float -msdata=eabi -G 8 -fno-common } */
+
+void f(void);
+
+struct s {
+  int *p;
+  int *q;
+};
+
+extern int a;
+
+extern const struct s c;
+
+const struct s c = { a, 0 };
+
+void f(void)
+{
+  char buf[4] = { 0, 1, 2, 3 };
+}
-- 
1.7.7

Re: [PATCH] Rotate pattern recognition in forwprop (PR tree-optimization/57157)

2013-05-10 Thread Richard Biener

On Thu, 9 May 2013, Jakub Jelinek wrote:

 Hi!
 
 The bit_rotate: code in fold-const.c handles only a subset of rotates,
 and e.g. with C++ FE when sizeof isn't folded early fold_binary often
 even isn't called with something that could recognize the rotate pattern,
 even if bit_rotate: has been extended.  Furthermore, not all the rotate
 patterns need to be written in a single statement.
 
 So, this patch adds a rotate pattern recognizer into forwprop, and handles
 several different patterns, the comment above the function shows which ones.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux.
 
 On the Botan KASUMI distilled testcase (PR55278 #c6 at -O3) I get 6.3%
 speedup on Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz and 9.9% speedup on
 Quad-Core AMD Opteron(tm) Processor 8354.
 
 Ok for trunk?

Ok.

Thanks,
Richard.

 2013-05-09  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/45216
   PR tree-optimization/57157
   * tree-ssa-forwprop.c (simplify_rotate): New function.
   (ssa_forward_propagate_and_combine): Call it.
 
   * c-c++-common/rotate-1.c: New test.
   * c-c++-common/rotate-1a.c: New test.
   * c-c++-common/rotate-2.c: New test.
   * c-c++-common/rotate-2a.c: New test.
   * c-c++-common/rotate-3.c: New test.
   * c-c++-common/rotate-3a.c: New test.
   * c-c++-common/rotate-4.c: New test.
   * c-c++-common/rotate-4a.c: New test.
   * gcc.target/i386/rotate-1.c: Accept rolb or rorb instruction.
 
 --- gcc/tree-ssa-forwprop.c.jj2013-04-22 08:06:41.0 +0200
 +++ gcc/tree-ssa-forwprop.c   2013-05-09 15:31:49.35512 +0200
 @@ -2124,6 +2124,242 @@ simplify_bitwise_binary (gimple_stmt_ite
  }
  
  
 +/* Recognize rotation patterns.  Return true if a transformation
 +   applied, otherwise return false.
 +
 +   We are looking for X with unsigned type T with bitsize B, OP being
 +   +, | or ^, some type T2 wider than T and
 +   (X  CNT1) OP (X  CNT2)iff CNT1 + CNT2 
 == B
 +   ((T) ((T2) X  CNT1)) OP ((T) ((T2) X  CNT2))  iff CNT1 + CNT2 == B
 +   (X  Y) OP (X  (B - Y))
 +   (X  (int) Y) OP (X  (int) (B - Y))
 +   ((T) ((T2) X  Y)) OP ((T) ((T2) X  (B - Y)))
 +   ((T) ((T2) X  (int) Y)) OP ((T) ((T2) X  (int) (B - Y)))
 +   (X  Y) OP (X  ((-Y)  (B - 1)))
 +   (X  (int) Y) OP (X  (int) ((-Y)  (B - 1)))
 +   ((T) ((T2) X  Y)) OP ((T) ((T2) X  ((-Y)  (B - 1
 +   ((T) ((T2) X  (int) Y)) OP ((T) ((T2) X  (int) ((-Y)  (B - 1
 +
 +   and transform these into:
 +   X r CNT1
 +   X r Y
 +
 +   Note, in the patterns with T2 type, the type of OP operands
 +   might be even a signed type, but should have precision B.  */
 +
 +static bool
 +simplify_rotate (gimple_stmt_iterator *gsi)
 +{
 +  gimple stmt = gsi_stmt (*gsi);
 +  tree arg[2], rtype, rotcnt = NULL_TREE;
 +  tree def_arg1[2], def_arg2[2];
 +  enum tree_code def_code[2];
 +  tree lhs;
 +  int i;
 +  bool swapped_p = false;
 +  gimple g;
 +
 +  arg[0] = gimple_assign_rhs1 (stmt);
 +  arg[1] = gimple_assign_rhs2 (stmt);
 +  rtype = TREE_TYPE (arg[0]);
 +
 +  /* Only create rotates in complete modes.  Other cases are not
 + expanded properly.  */
 +  if (!INTEGRAL_TYPE_P (rtype)
 +  || TYPE_PRECISION (rtype) != GET_MODE_PRECISION (TYPE_MODE (rtype)))
 +return false;
 +
 +  for (i = 0; i  2; i++)
 +defcodefor_name (arg[i], def_code[i], def_arg1[i], def_arg2[i]);
 +
 +  /* Look through narrowing conversions.  */
 +  if (CONVERT_EXPR_CODE_P (def_code[0])
 +   CONVERT_EXPR_CODE_P (def_code[1])
 +   INTEGRAL_TYPE_P (TREE_TYPE (def_arg1[0]))
 +   INTEGRAL_TYPE_P (TREE_TYPE (def_arg1[1]))
 +   TYPE_PRECISION (TREE_TYPE (def_arg1[0]))
 +  == TYPE_PRECISION (TREE_TYPE (def_arg1[1]))
 +   TYPE_PRECISION (TREE_TYPE (def_arg1[0]))  TYPE_PRECISION (rtype)
 +   has_single_use (arg[0])
 +   has_single_use (arg[1]))
 +{
 +  for (i = 0; i  2; i++)
 + {
 +   arg[i] = def_arg1[i];
 +   defcodefor_name (arg[i], def_code[i], def_arg1[i], def_arg2[i]);
 + }
 +}
 +
 +  /* One operand has to be LSHIFT_EXPR and one RSHIFT_EXPR.  */
 +  for (i = 0; i  2; i++)
 +if (def_code[i] != LSHIFT_EXPR  def_code[i] != RSHIFT_EXPR)
 +  return false;
 +else if (!has_single_use (arg[i]))
 +  return false;
 +  if (def_code[0] == def_code[1])
 +return false;
 +
 +  /* If we've looked through narrowing conversions before, look through
 + widening conversions from unsigned type with the same precision
 + as rtype here.  */
 +  if (TYPE_PRECISION (TREE_TYPE (def_arg1[0])) != TYPE_PRECISION (rtype))
 +for (i = 0; i  2; i++)
 +  {
 + tree tem;
 + enum tree_code code;
 + defcodefor_name (def_arg1[i], code, tem, NULL);
 + if (!CONVERT_EXPR_CODE_P (code)
 + || !INTEGRAL_TYPE_P (TREE_TYPE (tem))
 + || TYPE_PRECISION (TREE_TYPE (tem)) != TYPE_PRECISION (rtype))
 +   return false;
 + def_arg1[i] = tem;
 +  }
 +  /* Both shifts have to

Re: [PATCH][1/n] Re-organize -fvect-cost-model, enable basic vectorization at -O2

2013-05-10 Thread Richard Biener

On Tue, 7 May 2013, Richard Biener wrote:

 
 The following patch is a first step towards being able to enable
 vectorizing of a subset of all vectorizable functions at -O2 by
 default.  Analysis of Polyhedron (loop heavy code) shows that
 the cost of doing vectorizer analysis is in the noise but compile-time
 (and binary size) grows only with the number of loops we emit
 vectorized code for (because we generate up to 3 extra loops for
 each vectorizable loop that we have to move down the pass pipeline).
 
 This very first patch makes sure that a runtime cost model check
 comes first - and not after alias or alignment versioning checks.
 That part of the patch would be a no-op without the rest because
 currently peeling for alignment and versioning cannot coexist
 (well - they could not, before I introduced
 STMT_VINFO_LOOP_PHI_EVOLUTION_PART a few releases ago ...).  Thus
 the patch enables doing both which may eventually speedup things.
 
 Sofar tested on the testsuite and polyhedron, full bootstrap and
 regtest is underway.
 
 Patches in this series will transform -fvect-cost-model to get
 an additional parameter (not sure how that will end up looking like)
 to be able to control vectorization in the following 'steps'
 
  1 vectorize loops that do not require a runtime cost check to be
profitable and that do not require versioning (to be enabled at -O2)
  2 vectorize loops like we do now
  3 vectorize loops like we do now but assume the runtime cost check
will always succeed (thus, omit it)  [-fno-vect-cost-model]
 
 I'm not sure yet if restricting the versioning makes sense
 (it's supposed to reduce code bloat and compile-time of course),
 esp. considering that peeling for alignment could be disabled as well
 (but then HW without unaligned accesses will likely vectorize nothing).
 Thus the complication seems to be the code size considerations
 (the cost model is currently set up to compare runtime cost only).

 Richard.
 
 2013-05-07  Richard Biener  rguent...@suse.de
 
   * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Do not
   disable peeling when we version for aliasing.
   * tree-vect-loop-manip.c (vect_can_advance_ivs_p): Use
   STMT_VINFO_LOOP_PHI_EVOLUTION_PART instead of recomputing it.
   * tree-vect-loop.c (vect_transform_loop): First apply versioning,
   then peeling to arrange for the cost-model check to come first.

Runs into issues with gcc.target/i386/l_fma_double_1.c and friends
because that now does peeling for alignment instead of just versioning
for alias.  And peeling for alignment is only applied on x86_64 for
double because on i686 alignment might not be reachable with peeling.
Trying to fixup in the testcase with a properly aligned double type
shows that we don't honor user alignment in the vectorizer for this
purpose - thus, fixed, to allow not too ugly adjustments of
the fma testcases.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2013-05-10  Richard Biener  rguent...@suse.de

* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Do not
disable peeling when we version for aliasing.
(vector_alignment_reachable_p): Honor explicit user alignment.
(vect_supportable_dr_alignment): Likewise.
* tree-vect-loop-manip.c (vect_can_advance_ivs_p): Use
STMT_VINFO_LOOP_PHI_EVOLUTION_PART instead of recomputing it.
* tree-vect-loop.c (vect_transform_loop): First apply versioning,
then peeling to arrange for the cost-model check to come first.

* gcc.target/i386/avx256-unaligned-load-2.c: Make well-defined.
* gcc.target/i386/l_fma_double_1.c: Adjust.
* gcc.target/i386/l_fma_double_2.c: Likewise.
* gcc.target/i386/l_fma_double_3.c: Likewise.
* gcc.target/i386/l_fma_double_4.c: Likewise.
* gcc.target/i386/l_fma_double_5.c: Likewise.
* gcc.target/i386/l_fma_double_6.c: Likewise.
* gcc.target/i386/l_fma_float_1.c: Likewise.
* gcc.target/i386/l_fma_float_2.c: Likewise.
* gcc.target/i386/l_fma_float_3.c: Likewise.
* gcc.target/i386/l_fma_float_4.c: Likewise.
* gcc.target/i386/l_fma_float_5.c: Likewise.
* gcc.target/i386/l_fma_float_6.c: Likewise.

Index: gcc/tree-vect-data-refs.c
===
*** gcc/tree-vect-data-refs.c.orig  2013-05-08 13:26:16.0 +0200
--- gcc/tree-vect-data-refs.c   2013-05-08 14:23:42.937265437 +0200
*** vector_alignment_reachable_p (struct dat
*** 1024,1030 
if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, 
   Unknown misalignment, is_packed = %d,is_packed);
!   if (targetm.vectorize.vector_alignment_reachable (type, is_packed))
return true;
else
return false;
--- 1024,1031 
if (dump_enabled_p ())
dump_printf_loc

[PATCH] Fix PR57214

2013-05-10 Thread Richard Biener


This fixes PR57214 - the simple constant propagation pass we perform
over unrolled loop bodies does not properly avoid propagating across
abnormal edges.

Bootstrap  regtest running on x86_64-unknown-linux-gnu, I'll install
this on trunk and the 4.8 branch where the bug is latent.

Richard.

2013-05-10  Richard Biener  rguent...@suse.de

PR tree-optimization/57214
* tree-ssa-loop-ivcanon.c (propagate_constants_for_unrolling): Do
not propagate from SSA names that occur in abnormal PHI nodes.

Index: gcc/tree-ssa-loop-ivcanon.c
===
*** gcc/tree-ssa-loop-ivcanon.c (revision 198768)
--- gcc/tree-ssa-loop-ivcanon.c (working copy)
*** propagate_constants_for_unrolling (basic
*** 1085,1092 
tree lhs;
  
if (is_gimple_assign (stmt)
   (lhs = gimple_assign_lhs (stmt), TREE_CODE (lhs) == SSA_NAME)
!  gimple_assign_rhs_code (stmt) == INTEGER_CST)
{
  propagate_into_all_uses (lhs, gimple_assign_rhs1 (stmt));
  gsi_remove (gsi, true);
--- 1085,1093 
tree lhs;
  
if (is_gimple_assign (stmt)
+  gimple_assign_rhs_code (stmt) == INTEGER_CST
   (lhs = gimple_assign_lhs (stmt), TREE_CODE (lhs) == SSA_NAME)
!  !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
{
  propagate_into_all_uses (lhs, gimple_assign_rhs1 (stmt));
  gsi_remove (gsi, true);
Index: gcc/testsuite/gcc.dg/torture/pr57214.c
===
*** gcc/testsuite/gcc.dg/torture/pr57214.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr57214.c  (working copy)
***
*** 0 
--- 1,25 
+ /* { dg-do compile } */
+ 
+ extern int baz (void);
+ extern int foo (void) __attribute__ ((returns_twice));
+ 
+ void
+ bar (_Bool b)
+ {
+   int buf[1];
+   while (1)
+ {
+   _Bool x = 1;
+   if (b)
+   baz ();
+   b = 1;
+   baz ();
+   x = 0;
+   int i;
+   while (buf[i]  i)
+   i++;
+   foo ();
+   if (!x)
+   b = 0;
+ }
+ }

RFC patch for fold-const.c's fold_indirect_ref_1 / avoid type-checking ICE with TYPE_MIN_VALUE vs. offset signess mismatches

2013-05-10 Thread Tobias Burnus

I run into the following issue on the Fortran-Dev branch. I get an ICE due to a 
tree-checking assert for fold-const.c's fold_indirect_ref_1, which has

  tree min_val = size_zero_node;
  if (type_domain  TYPE_MIN_VALUE (type_domain))
min_val = TYPE_MIN_VALUE (type_domain);
  op01 = size_binop_loc (loc, EXACT_DIV_EXPR, op01,
 TYPE_SIZE_UNIT (type));
  op01 = size_binop_loc (loc, PLUS_EXPR, op01, min_val);

The problem is that op01 is unsigned (size_type_node) while min_val alias 
TYPE_MIN_VALUE is signed (gfc_array_index_type which is PTRDIFF_TYPE).


TYPE_MIN_VALUE gets set via tree.c's build_range_type_1, which has:

  TYPE_MIN_VALUE (itype) = fold_convert (type, lowval);
  TYPE_MAX_VALUE (itype) = highval ? fold_convert (type, highval) : NULL;


And op01 becomes a size_type_node via tree.h's

fold_build_pointer_plus_loc (location_t loc, tree ptr, tree off)
{
  return fold_build2_loc (loc, POINTER_PLUS_EXPR, TREE_TYPE (ptr),
  ptr, fold_convert_loc (loc, sizetype, off));


I am not completely sure what's the best way to solve that, but the attached 
patch avoids the ICE for me.

I did an all-all language bootstrap (all,ada,go,objc,obj-c++) and had no new 
failures with regresting.
OK? Or do you have a better idea how to solve this?

Tobias
2013-05-10  Tobias Burnus  bur...@net-b.de

	* fold-const.c (fold_indirect_ref_1): Fold offset
	to TREE_TYPE (min_val) before converting to array syntax.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 74b451e..8ac873a 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -16591,10 +16591,11 @@ fold_indirect_ref_1 (location_t loc, tree type, tree op0)
 	  tree min_val = size_zero_node;
 	  if (type_domain  TYPE_MIN_VALUE (type_domain))
 		min_val = TYPE_MIN_VALUE (type_domain);
 	  op01 = size_binop_loc (loc, EXACT_DIV_EXPR, op01,
  TYPE_SIZE_UNIT (type));
+ op01 = fold_convert (TREE_TYPE (min_val), op01);
 	  op01 = size_binop_loc (loc, PLUS_EXPR, op01, min_val);
 	  return build4_loc (loc, ARRAY_REF, type, op00, op01,
  NULL_TREE, NULL_TREE);
 	}
 	}

Re: [PATCH] Improve rotation by mode bitsize - 1

2013-05-10 Thread Richard Biener

On Thu, May 9, 2013 at 10:24 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, May 09, 2013 at 09:47:49PM +0200, Uros Bizjak wrote:
 On Thu, May 9, 2013 at 8:45 PM, Jakub Jelinek ja...@redhat.com wrote:
  2013-05-09  Jakub Jelinek  ja...@redhat.com
 
  * config/i386/i386.md (rotateinv): New code attr.
  (*rotate_insnmode3_1, *rotate_insnsi3_1_zext,
  *rotate_insnqi3_1_slp): Emit rorl %eax instead of
  roll $31, %eax, etc.

 OK.

 Thanks.  While talking about rotates, seems they are quite expensive
 on SandyBridge if the destination (== source1) is memory.

 With -O3 -mavx:

 unsigned int a[1024] = { 0, 1, 2, 3, 4, 5, 6 };

 __attribute__((noinline, noclone)) void
 foo (void)
 {
   int i;
   for (i = 0; i  1024; i++)
 {
   int j = i  31;
   a[i] = (a[i]  j) ^ (a[i]  ((-j)  31));
 }
 }

 int
 main ()
 {
   int i;
   for (i = 0; i  100; i++)
 foo ();
   return 0;
 }

 the timing for vanilla gcc, gcc with the rotate recognizer patch and
 the latter hand editted to use rotate on register gives:
 Intel i7-2600   0m1.421s0m1.914s0m0.826s
 AMD 83540m2.353s0m0.988s0m1.407s

 The vanilla gcc loop is long:
 .L2:
 movla(,%rax,4), %edi
 movl%eax, %esi
 andl$31, %esi
 movl%esi, %ecx
 negl%ecx
 movl%edi, %edx
 shrl%cl, %edx
 movl%esi, %ecx
 sall%cl, %edi
 xorl%edi, %edx
 movl%edx, a(,%rax,4)
 addq$1, %rax
 cmpq$1024, %rax
 jne .L2
 With rotate using mem:
 .L2:
 roll%cl, a(,%rcx,4)
 addq$1, %rcx
 cmpq$1024, %rcx
 jne .L2
 and with the hand tweaked rotate:
 .L2:
 movla(,%rcx,4), %eax
 roll%cl, %eax
 addq$1, %rcx
 cmpq$1024, %rcx
 movl%eax, a-4(,%rcx,4)
 jne .L2

 So, on SandyBridge (haven't tried other Intel CPUs) it perhaps might be
 beneficial to disallow rotate with =m , 0 constraints, while for the above
 mentioned AMD it is best to use it.  Guess the above isn't sufficient data
 for that, we'd need info on more CPUs and more rotation sizes.

 BTW, with -mavx2 we vectorize the loop without the rotate recognition patch
 and don't vectorize anymore, I'll need to adjust the pattern recognizer to
 handle those.  The question is if for missing vector rotate optab
 we can replace it with the probably faster and shorter
 x r y  - (x  y) | (x  (32 - y)) - which is invalid for y = 0,
 but perhaps if the only thing the vector shift would do for shift by 32
 would be either return 0, or x, it would still work, or if we want to go
 for the (x  y) | (x  ((-y)  31)).  The i?86/x86_64 vector shifts don't
 truncate the shift count, so perhaps best would be a vector rotate expander
 that expands it into some some special insn like vectorized y ? x  (32 - y) 
 : 0

 One question is whether our LROTATE_EXPR or RROTATE_EXPR is only defined for
 rotation counts 0 through bitsize - 1 (as is the case when we pattern detect
 it, do we create *ROTATE_EXPR other way?), or whether we allow arbitrary
 rotation counts (would assume the rotation be done always with modulo
 bitsize).  Because if the rotation count could be arbitrary, we'd need
 to vectorize it as (x  (y  31)) | (x  ((-y)  31))

The rotation count should be 0 to bitsize - 1 similar to shifts.

Richard.

 Jakub

Re: [PATCH, PR preprocessor/42014] Added LAST_SOURCE_COLUMN in while loop

2013-05-10 Thread Shakthi Kannan

Hi,

- Original Message -
| From: Marek Polacek pola...@redhat.com
|
| Maybe
|   PR preprocessor/42014
|   * diagnostic.c: Print LAST_SOURCE_COLUMN as well.
| ?
\--

Sure!

SK

-- 
Shakthi Kannan 
skannan at redhat dot com

Re: More vector folding

2013-05-10 Thread Richard Biener

On Thu, May 9, 2013 at 12:55 AM, Marc Glisse marc.gli...@inria.fr wrote:
 Hello,

 here are a few more changes to fold-const.c so vectors can use the existing
 optimizations. Note that I made fold_truth_not_expr safe for use with vector
 BIT_NOT_EXPR.

 Passes bootstrap+testsuite on x86_64-linux-gnu.

 2013-05-09  Marc Glisse  marc.gli...@inria.fr

 gcc/
 * fold-const.c (fold_negate_expr): Handle vectors.
 (fold_truth_not_expr): Likewise.
 (invert_truthvalue_loc): Likewise.
 (fold_single_bit_test): Likewise.
 (fold_comparison): Likewise.
 (fold_ternary_loc): Likewise.

 gcc/testsuite/
 * g++.dg/ext/vector22.C: New testcase.

 --
 Marc Glisse
 Index: gcc/testsuite/g++.dg/ext/vector22.C
 ===
 --- gcc/testsuite/g++.dg/ext/vector22.C (revision 0)
 +++ gcc/testsuite/g++.dg/ext/vector22.C (revision 0)
 @@ -0,0 +1,20 @@
 +/* { dg-do compile } */
 +/* { dg-options -O -fdump-tree-gimple } */
 +
 +typedef unsigned vec __attribute__((vector_size(4*sizeof(int;
 +
 +void f(vec*a,vec*b){
 +  *a=(*a)?-1:(*b10);
 +  *b=(*b)?(*a10):0;
 +}
 +void g(vec*a,vec*b){
 +  *a=(*a)?(*a*a):-1;
 +  *b=(*b)?-1:(*b*b);
 +}
 +void h(vec*a){
 +  *a=(~*a==5);
 +}
 +
 +/* { dg-final { scan-tree-dump-not ~ gimple } } */
 +/* { dg-final { scan-tree-dump-not VEC_COND_EXPR gimple } } */
 +/* { dg-final { cleanup-tree-dump gimple } } */

 Property changes on: gcc/testsuite/g++.dg/ext/vector22.C
 ___
 Added: svn:eol-style
+ native
 Added: svn:keywords
+ Author Date Id Revision URL

 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c(revision 198726)
 +++ gcc/fold-const.c(working copy)
 @@ -519,21 +519,21 @@ fold_negate_expr (location_t loc, tree t
  {
tree type = TREE_TYPE (t);
tree tem;

switch (TREE_CODE (t))
  {
  /* Convert - (~A) to A + 1.  */
  case BIT_NOT_EXPR:
if (INTEGRAL_TYPE_P (type))
  return fold_build2_loc (loc, PLUS_EXPR, type, TREE_OPERAND (t, 0),
 -build_int_cst (type, 1));
 +build_one_cst (type));
break;

  case INTEGER_CST:
tem = fold_negate_const (t, type);
if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t)
   || !TYPE_OVERFLOW_TRAPS (type))
 return tem;
break;

  case REAL_CST:
 @@ -3110,20 +3110,23 @@ fold_truth_not_expr (location_t loc, tre

return build2_loc (loc, code, type, TREE_OPERAND (arg, 0),
  TREE_OPERAND (arg, 1));
  }

switch (code)
  {
  case INTEGER_CST:
return constant_boolean_node (integer_zerop (arg), type);

 +case VECTOR_CST:
 +  return fold_unary_loc (loc, BIT_NOT_EXPR, type, arg);
 +
  case TRUTH_AND_EXPR:
loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc);
return build2_loc (loc, TRUTH_OR_EXPR, type,
  invert_truthvalue_loc (loc1, TREE_OPERAND (arg,
 0)),
  invert_truthvalue_loc (loc2, TREE_OPERAND (arg,
 1)));

  case TRUTH_OR_EXPR:
loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc);
 @@ -3195,26 +3198,28 @@ fold_truth_not_expr (location_t loc, tre
 return build1_loc (loc, TRUTH_NOT_EXPR, type, arg);

/* ... fall through ...  */

  case FLOAT_EXPR:
loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
return build1_loc (loc, TREE_CODE (arg), type,
  invert_truthvalue_loc (loc1, TREE_OPERAND (arg,
 0)));

  case BIT_AND_EXPR:
 -  if (!integer_onep (TREE_OPERAND (arg, 1)))
 +  if (VECTOR_TYPE_P (type) || !integer_onep (TREE_OPERAND (arg, 1)))
 return NULL_TREE;
return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type, 0));

  case SAVE_EXPR:
 -  return build1_loc (loc, TRUTH_NOT_EXPR, type, arg);
 +  return build1_loc (loc, VECTOR_TYPE_P (type) ? BIT_NOT_EXPR
 +  : TRUTH_NOT_EXPR,
 +type, arg);

  case CLEANUP_POINT_EXPR:
loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
return build1_loc (loc, CLEANUP_POINT_EXPR, type,
  invert_truthvalue_loc (loc1, TREE_OPERAND (arg,
 0)));

  default:
return NULL_TREE;
  }
  }
 @@ -3222,28 +3227,28 @@ fold_truth_not_expr (location_t loc, tre
  /* Return a simplified tree node for the truth-negation of ARG.  This
 never alters ARG itself.  We assume that ARG is an operation that
 returns a truth value (0 or 1).

 FIXME: one would think we would fold the result, but it causes
 problems with the dominator optimizer.  */

  tree

Re: TYPE_PRECISION for vectors

2013-05-10 Thread Richard Biener

On Tue, May 7, 2013 at 11:26 PM, Marc Glisse marc.gli...@inria.fr wrote:
 Hello,

 this patch is about the use of TYPE_PRECISION for non-scalar types. For
 complex types, it is unused (always 0) and for vectors, the field is used to
 store the log of the number of elements, so in both cases we shouldn't use
 the macro. I tried to enforce it, see the SCALAR_TYPE_CHECK in tree.h (I
 assume I should remove it and the FIXME before commit), and started fixing
 the crashes, but there were too many and I gave up. Those fixes I did write
 are included in the patch, since I believe we eventually want to get there.

 For hashing in gimple.c, including TYPE_UNSIGNED for complex and vector
 seems questionable: I think that it is the same as for the inner type, which
 is already hashed. I didn't change that in the patch.

 Passes bootstrap+testsuite on x86_64-linux-gnu.

Ok.

Thanks,
Richard.


 2013-05-07  Marc Glisse  marc.gli...@inria.fr

 gcc/
 * stor-layout.c (element_precision): New function.
 * machmode.h (element_precision): Declare it.
 * tree.c (build_minus_one_cst): New function.
 (element_precision): Likewise.
 * tree.h (SCALAR_TYPE_CHECK): New macro.
 (build_minus_one_cst): Declare new function.
 (element_precision): Likewise.
 * fold-const.c (operand_equal_p): Use element_precision.
 (fold_binary_loc): Handle vector types.
 * convert.c (convert_to_integer): Use element_precision.
 * gimple.c (iterative_hash_canonical_type): Handle complex and
 vectors
 separately.

 gcc/c-family/
 * c-common.c (vector_types_convertible_p): No TYPE_PRECISION for
 vectors.

 gcc/testsuite/
 * gcc.dg/vector-shift.c: New testcase.

 --
 Marc Glisse
 Index: gcc/convert.c
 ===
 --- gcc/convert.c   (revision 198685)
 +++ gcc/convert.c   (working copy)
 @@ -348,22 +348,22 @@ convert_to_real (tree type, tree expr)
 fixed-point or vector; in other cases error is called.

 The result of this is always supposed to be a newly created tree node
 not in use in any existing structure.  */

  tree
  convert_to_integer (tree type, tree expr)
  {
enum tree_code ex_form = TREE_CODE (expr);
tree intype = TREE_TYPE (expr);
 -  unsigned int inprec = TYPE_PRECISION (intype);
 -  unsigned int outprec = TYPE_PRECISION (type);
 +  unsigned int inprec = element_precision (intype);
 +  unsigned int outprec = element_precision (type);

/* An INTEGER_TYPE cannot be incomplete, but an ENUMERAL_TYPE can
   be.  Consider `enum E = { a, b = (enum E) 3 };'.  */
if (!COMPLETE_TYPE_P (type))
  {
error (conversion to incomplete type);
return error_mark_node;
  }

/* Convert e.g. (long)round(d) - lround(d).  */
 Index: gcc/tree.c
 ===
 --- gcc/tree.c  (revision 198685)
 +++ gcc/tree.c  (working copy)
 @@ -1615,20 +1615,59 @@ build_one_cst (tree type)
  case COMPLEX_TYPE:
return build_complex (type,
 build_one_cst (TREE_TYPE (type)),
 build_zero_cst (TREE_TYPE (type)));

  default:
gcc_unreachable ();
  }
  }

 +/* Return a constant of arithmetic type TYPE which is the
 +   opposite of the multiplicative identity of the set TYPE.  */
 +
 +tree
 +build_minus_one_cst (tree type)
 +{
 +  switch (TREE_CODE (type))
 +{
 +case INTEGER_TYPE: case ENUMERAL_TYPE: case BOOLEAN_TYPE:
 +case POINTER_TYPE: case REFERENCE_TYPE:
 +case OFFSET_TYPE:
 +  return build_int_cst (type, -1);
 +
 +case REAL_TYPE:
 +  return build_real (type, dconstm1);
 +
 +case FIXED_POINT_TYPE:
 +  /* We can only generate 1 for accum types.  */
 +  gcc_assert (ALL_SCALAR_ACCUM_MODE_P (TYPE_MODE (type)));
 +  return build_fixed (type, fixed_from_double_int
 (double_int_minus_one,
 +  TYPE_MODE (type)));
 +
 +case VECTOR_TYPE:
 +  {
 +   tree scalar = build_minus_one_cst (TREE_TYPE (type));
 +
 +   return build_vector_from_val (type, scalar);
 +  }
 +
 +case COMPLEX_TYPE:
 +  return build_complex (type,
 +   build_minus_one_cst (TREE_TYPE (type)),
 +   build_zero_cst (TREE_TYPE (type)));
 +
 +default:
 +  gcc_unreachable ();
 +}
 +}
 +
  /* Build 0 constant of type TYPE.  This is used by constructor folding
 and thus the constant should be represented in memory by
 zero(es).  */

  tree
  build_zero_cst (tree type)
  {
switch (TREE_CODE (type))
  {
  case INTEGER_TYPE: case ENUMERAL_TYPE: case BOOLEAN_TYPE:
 @@ -6921,20 +6960,33 @@ compare_tree_int (const_tree t, unsigned
  bool
  valid_constant_size_p (const_tree size)
  {
if (! host_integerp (size, 1)
|| TREE_OVERFLOW (size)
||

Re: More vector folding

2013-05-10 Thread Marc Glisse


On Fri, 10 May 2013, Richard Biener wrote:


On Thu, May 9, 2013 at 12:55 AM, Marc Glisse marc.gli...@inria.fr wrote:

Hello,

here are a few more changes to fold-const.c so vectors can use the existing
optimizations. Note that I made fold_truth_not_expr safe for use with vector
BIT_NOT_EXPR.

Passes bootstrap+testsuite on x86_64-linux-gnu.

2013-05-09  Marc Glisse  marc.gli...@inria.fr

gcc/
* fold-const.c (fold_negate_expr): Handle vectors.
(fold_truth_not_expr): Likewise.
(invert_truthvalue_loc): Likewise.
(fold_single_bit_test): Likewise.
(fold_comparison): Likewise.
(fold_ternary_loc): Likewise.

gcc/testsuite/
* g++.dg/ext/vector22.C: New testcase.

--
Marc Glisse
Index: gcc/testsuite/g++.dg/ext/vector22.C
===
--- gcc/testsuite/g++.dg/ext/vector22.C (revision 0)
+++ gcc/testsuite/g++.dg/ext/vector22.C (revision 0)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options -O -fdump-tree-gimple } */
+
+typedef unsigned vec __attribute__((vector_size(4*sizeof(int;
+
+void f(vec*a,vec*b){
+  *a=(*a)?-1:(*b10);
+  *b=(*b)?(*a10):0;
+}
+void g(vec*a,vec*b){
+  *a=(*a)?(*a*a):-1;
+  *b=(*b)?-1:(*b*b);
+}
+void h(vec*a){
+  *a=(~*a==5);
+}
+
+/* { dg-final { scan-tree-dump-not ~ gimple } } */
+/* { dg-final { scan-tree-dump-not VEC_COND_EXPR gimple } } */
+/* { dg-final { cleanup-tree-dump gimple } } */

Property changes on: gcc/testsuite/g++.dg/ext/vector22.C
___
Added: svn:eol-style
   + native
Added: svn:keywords
   + Author Date Id Revision URL

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 198726)
+++ gcc/fold-const.c(working copy)
@@ -519,21 +519,21 @@ fold_negate_expr (location_t loc, tree t
 {
   tree type = TREE_TYPE (t);
   tree tem;

   switch (TREE_CODE (t))
 {
 /* Convert - (~A) to A + 1.  */
 case BIT_NOT_EXPR:
   if (INTEGRAL_TYPE_P (type))
 return fold_build2_loc (loc, PLUS_EXPR, type, TREE_OPERAND (t, 0),
-build_int_cst (type, 1));
+build_one_cst (type));
   break;

 case INTEGER_CST:
   tem = fold_negate_const (t, type);
   if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t)
  || !TYPE_OVERFLOW_TRAPS (type))
return tem;
   break;

 case REAL_CST:
@@ -3110,20 +3110,23 @@ fold_truth_not_expr (location_t loc, tre

   return build2_loc (loc, code, type, TREE_OPERAND (arg, 0),
 TREE_OPERAND (arg, 1));
 }

   switch (code)
 {
 case INTEGER_CST:
   return constant_boolean_node (integer_zerop (arg), type);

+case VECTOR_CST:
+  return fold_unary_loc (loc, BIT_NOT_EXPR, type, arg);
+
 case TRUTH_AND_EXPR:
   loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
   loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc);
   return build2_loc (loc, TRUTH_OR_EXPR, type,
 invert_truthvalue_loc (loc1, TREE_OPERAND (arg,
0)),
 invert_truthvalue_loc (loc2, TREE_OPERAND (arg,
1)));

 case TRUTH_OR_EXPR:
   loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
   loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc);
@@ -3195,26 +3198,28 @@ fold_truth_not_expr (location_t loc, tre
return build1_loc (loc, TRUTH_NOT_EXPR, type, arg);

   /* ... fall through ...  */

 case FLOAT_EXPR:
   loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
   return build1_loc (loc, TREE_CODE (arg), type,
 invert_truthvalue_loc (loc1, TREE_OPERAND (arg,
0)));

 case BIT_AND_EXPR:
-  if (!integer_onep (TREE_OPERAND (arg, 1)))
+  if (VECTOR_TYPE_P (type) || !integer_onep (TREE_OPERAND (arg, 1)))
return NULL_TREE;
   return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type, 0));

 case SAVE_EXPR:
-  return build1_loc (loc, TRUTH_NOT_EXPR, type, arg);
+  return build1_loc (loc, VECTOR_TYPE_P (type) ? BIT_NOT_EXPR
+  : TRUTH_NOT_EXPR,
+type, arg);

 case CLEANUP_POINT_EXPR:
   loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
   return build1_loc (loc, CLEANUP_POINT_EXPR, type,
 invert_truthvalue_loc (loc1, TREE_OPERAND (arg,
0)));

 default:
   return NULL_TREE;
 }
 }
@@ -3222,28 +3227,28 @@ fold_truth_not_expr (location_t loc, tre
 /* Return a simplified tree node for the truth-negation of ARG.  This
never alters ARG itself.  We assume that ARG is an operation that
returns a truth value (0 or 1).

FIXME: one would think we would fold the result, but it causes
problems with the dominator optimizer.  */

 tree
 invert_truthvalue_loc (location_t loc, tree arg)
 {
-  tree tem;
-
   if (TREE_CODE (arg) ==

Re: [RFA/ARM/4.7] Fix PR54974: Thumb literal pools don't handle PC rounding

2013-05-10 Thread Ramana Radhakrishnan




OK for 4.7?




Ok - I did say ok if no fallout in my original review :)


regards
Ramana

Thanks,

Matt

2013-01-04  Matthew Gretton-Dann  matthew.gretton-d...@linaro.org

Backport from mainline.
2012-11-29  Matthew Gretton-Dann  matthew.gretton-d...@linaro.org

PR target/54974
* config/arm/arm.md (thumb2_pool_range, pool_range): Add
comment on Thumb pool ranges.
(thumb1_extendhisi2): Reduce Thumb pool range.
(arm_movdi): Likewise.
(thumb1_movdi_insn): Likewise.
(thumb1_movsi_insn): Likewise.
(pic_load_addr_unified): Likewise.
(pic_load_addr_32bit): Likewise.
(pic_load_addr_thumb1): Likewise.
(thumb1_movhf): Likewise.
(arm_movsf_soft_insn): Likewise.
(thumb1_movsf_soft_insn): Likewise.
(movdf_soft_insn): Likewise.
(thumb1_movdf_soft_insn): Likewise.
* config/arm/neon.md (*neon_movmode): Likewise.
(*neon_movmode): Likwise.
* config/arm/thumb2.md: (*thumb2_movsi_insn): Likewise.
(*thumb2_movhi_insn): Likewise.
(*thumb2_extendqisi_v6): Likewise.
(*thumb2_zero_extendqisi_v6): Likewise.
(*thumb2_zero_extendqisi2_v6): Likewise.
* config/arm/vfp.md: (*thumb2_movsi_vfp): Likewise.
(*movdi_vfp): Likewise.
(*movdi_vfp_cortexa8): Likewise.
(*thumb2_movsf_vfp): Likewise.
(*thumb2_movdf_vfp): Likewise.

[Patch, Fortran] PR57217 - re-add type checks for TBP overriding

2013-05-10 Thread Tobias Burnus

GCC 4.7 added some additional checks for type-bound procedure 
overriding. However, doing so it weakened the check whether the nonpass 
argument has the same type.


While for normal arguments, passing the parent type to an extended type 
is fine, for overriding the type (of nonpass arguments) must be exactly 
the same as in the original type.


The attached patch re-adds this check.

Build and regtested on x86-64-gnu-linux.
OK for the trunk and the 4.7/4.8 branches?

Tobias
2013-05-10  Tobias Burnus  bur...@net-b.de

	PR fortran/57217
	* interface.c (gfc_check_typebound_override): Check whether
	nonpass types are identical same.

2013-05-10  Tobias Burnus  bur...@net-b.de

	PR fortran/57217
	* gfortran.dg/typebound_override_4.f90: New.
	* gfortran.dg/typebound_proc_6.f03: Update dg-error.

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index 1b967fa..8f22e4c 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -4128,6 +4128,17 @@ gfc_check_typebound_override (gfc_symtree* proc, gfc_symtree* old)
 	}
 
   check_type = proc_pass_arg != argpos  old_pass_arg != argpos;
+  if (check_type
+   (!gfc_compare_types (proc_formal-sym-ts, old_formal-sym-ts)
+	  || !gfc_compare_types (old_formal-sym-ts,
+ proc_formal-sym-ts)))
+	{
+	  gfc_error (Argument type mismatch for the overriding procedure 
+		 '%s' at %L: %s shall be %s, proc-name, where,
+		 gfc_typename (proc_formal-sym-ts),
+		 gfc_typename (old_formal-sym-ts));
+	  return false;
+	}
   if (!check_dummy_characteristics (proc_formal-sym, old_formal-sym, 
 	check_type, err, sizeof(err)))
 	{
diff --git a/gcc/testsuite/gfortran.dg/typebound_override_4.f90 b/gcc/testsuite/gfortran.dg/typebound_override_4.f90
new file mode 100644
index 000..e6f9805
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/typebound_override_4.f90
@@ -0,0 +1,36 @@
+! { dg-do compile }
+!
+! PR fortran/57217
+!
+! Contributed Salvatore Filippone
+!
+module base_mod
+  type base_type
+integer :: kind
+  contains
+procedure, pass(map)  :: clone= base_clone
+  end type base_type
+contains
+  subroutine  base_clone(map,mapout,info)
+implicit none
+class(base_type), intent(inout) :: map
+class(base_type), intent(inout) :: mapout
+integer :: info
+  end subroutine base_clone
+end module base_mod
+
+module r_mod
+  use base_mod
+  type, extends(base_type) :: r_type
+real  :: dat
+  contains
+procedure, pass(map)  :: clone= r_clone ! { dg-error Argument type mismatch for the overriding procedure 'clone' at .1.: CLASS.r_type. shall be CLASS.base_type. }
+  end type r_type
+contains
+  subroutine  r_clone(map,mapout,info)
+implicit none
+class(r_type), intent(inout) :: map
+class(r_type), intent(inout) :: mapout
+integer :: info
+  end subroutine r_clone
+end module r_mod
diff --git a/gcc/testsuite/gfortran.dg/typebound_proc_6.f03 b/gcc/testsuite/gfortran.dg/typebound_proc_6.f03
index 3a32cbc..1fe2580 100644
--- a/gcc/testsuite/gfortran.dg/typebound_proc_6.f03
+++ b/gcc/testsuite/gfortran.dg/typebound_proc_6.f03
@@ -89,7 +89,7 @@ MODULE testmod
 ! For corresponding dummy arguments.
 PROCEDURE, PASS :: corresp1 = proc_tmeint ! Ok.
 PROCEDURE, PASS :: corresp2 = proc_tmeintx ! { dg-error should be named 'a' }
-PROCEDURE, PASS :: corresp3 = proc_tmereal ! { dg-error Type/rank mismatch in argument 'a' }
+PROCEDURE, PASS :: corresp3 = proc_tmereal ! { dg-error Argument type mismatch for the overriding procedure 'corresp3' at .1.: REAL.4. shall be INTEGER.4. }
 
   END TYPE t

Re: [PATCH] PR32219, weak hidden reference segfault

2013-05-10 Thread Richard Sandiford

Chung-Lin Tang clt...@codesourcery.com writes:
 +case UNSPEC:
 +  /* Reach for a contained symbol.  */
 +  return nonzero_address_p (XVECEXP (x, 0, 0));

I don't think this is safe.  UNSPECs really are unspecified :-),
so we can't assume that (unspec X) is nonzero simply because X is.

Richard

[MIPS, committed] Use microMIPS-style constraints for MIPS16

2013-05-10 Thread Richard Sandiford

This patch makes the MIPS16 length checks use the style of predicates
and constraints that were added for microMIPS.  Some constraints and
predicates are shared between the two ISA encodings, while others are
only useful for one or the other.

This is supposed to be a 1:1 change.  I haven't tried to catch more cases,
since that should be a separate patch.

For now I've kept using extended_mips16 as the attribute of choice.
We could look at extending the compression attribute to MIPS16
(as Catherine's patch originally did), but that's likewise a separate
patch.

Tested on mips-sde-elf (with micromips multilibs locally patched out).
Applied.

Richard


gcc/
* config/mips/mips-protos.h (m16_uimm3_b, m16_simm4_1, m16_nsimm4_1)
(m16_simm5_1, m16_nsimm5_1, m16_uimm5_4, m16_nuimm5_4, m16_simm8_1)
(m16_nsimm8_1, m16_uimm8_1, m16_nuimm8_1, m16_uimm8_m1_1, m16_uimm8_4)
(m16_nuimm8_4, m16_simm8_8, m16_nsimm8_8): Delete.
* config/mips/mips.c (m16_check_op, m16_uimm3_b, m16_simm4_1)
(m16_nsimm4_1, m16_simm5_1, m16_nsimm5_1, m16_uimm5_4, m16_nuimm5_4)
(m16_simm8_1, m16_nsimm8_1, m16_uimm8_1, m16_nuimm8_1, m16_uimm8_m1_1)
(m16_uimm8_4, m16_nuimm8_4, m16_simm8_8, m16_nsimm8_8): Delete.
* config/mips/constraints.md (Udb8, Usb5, Usb8, Usd8, Uub8, Uuw5)
(Uuw8): New constraints.
(Usb4): Move into alphabetical order.
* config/mips/predicates.md (db8_operand, sb5_operand, sb8_operand)
(sd8_operand, ub8_operand, uw8_operand): New predicates.
* config/mips/mips.md (*xormode3, *xormode3_mips16): Name
previously unnamed patterns.
(*addmode3_mips16, *xormode3_mips16, *optabsi3_mips16)
(*ashldi3_mips16, *ashrdi3_mips16, *lshrdi3_mips16)
(*sltu_GPR:modeGPR2:mode_mips16)
(*sleu_GPR:modeGPR2:mode_mips16): Use constraints instead
of set_attr_alternative/if_then_else.  Use extended_mips16 instead
of specific lengths.

Index: gcc/config/mips/mips-protos.h
===
--- gcc/config/mips/mips-protos.h   2013-05-05 10:07:31.808505873 +0100
+++ gcc/config/mips/mips-protos.h   2013-05-05 10:08:54.009030051 +0100
@@ -211,23 +211,6 @@ extern rtx mips_strip_unspec_address (rt
 extern void mips_move_integer (rtx, rtx, unsigned HOST_WIDE_INT);
 extern bool mips_legitimize_move (enum machine_mode, rtx, rtx);
 
-extern int m16_uimm3_b (rtx, enum machine_mode);
-extern int m16_simm4_1 (rtx, enum machine_mode);
-extern int m16_nsimm4_1 (rtx, enum machine_mode);
-extern int m16_simm5_1 (rtx, enum machine_mode);
-extern int m16_nsimm5_1 (rtx, enum machine_mode);
-extern int m16_uimm5_4 (rtx, enum machine_mode);
-extern int m16_nuimm5_4 (rtx, enum machine_mode);
-extern int m16_simm8_1 (rtx, enum machine_mode);
-extern int m16_nsimm8_1 (rtx, enum machine_mode);
-extern int m16_uimm8_1 (rtx, enum machine_mode);
-extern int m16_nuimm8_1 (rtx, enum machine_mode);
-extern int m16_uimm8_m1_1 (rtx, enum machine_mode);
-extern int m16_uimm8_4 (rtx, enum machine_mode);
-extern int m16_nuimm8_4 (rtx, enum machine_mode);
-extern int m16_simm8_8 (rtx, enum machine_mode);
-extern int m16_nsimm8_8 (rtx, enum machine_mode);
-
 extern rtx mips_subword (rtx, bool);
 extern bool mips_split_move_p (rtx, rtx, enum mips_split_type);
 extern void mips_split_move (rtx, rtx, enum mips_split_type);
Index: gcc/config/mips/mips.c
===
--- gcc/config/mips/mips.c  2013-05-05 10:08:50.629997872 +0100
+++ gcc/config/mips/mips.c  2013-05-05 10:08:54.015030107 +0100
@@ -3357,113 +3357,6 @@ mips_rewrite_small_data (rtx pattern)
   return pattern;
 }
 
-/* We need a lot of little routines to check the range of MIPS16 immediate
-   operands.  */
-
-static int
-m16_check_op (rtx op, int low, int high, int mask)
-{
-  return (CONST_INT_P (op)
-  IN_RANGE (INTVAL (op), low, high)
-  (INTVAL (op)  mask) == 0);
-}
-
-int
-m16_uimm3_b (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return m16_check_op (op, 0x1, 0x8, 0);
-}
-
-int
-m16_simm4_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return m16_check_op (op, -0x8, 0x7, 0);
-}
-
-int
-m16_nsimm4_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return m16_check_op (op, -0x7, 0x8, 0);
-}
-
-int
-m16_simm5_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return m16_check_op (op, -0x10, 0xf, 0);
-}
-
-int
-m16_nsimm5_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return m16_check_op (op, -0xf, 0x10, 0);
-}
-
-int
-m16_uimm5_4 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return m16_check_op (op, -0x10  2, 0xf  2, 3);
-}
-
-int
-m16_nuimm5_4 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return m16_check_op (op, -0xf  2, 0x10  2, 3);
-}
-
-int
-m16_simm8_1 (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return m16_check_op (op, -0x80, 0x7f, 0);
-}
-
-int

[Patch:RL78] Fix hardware multiply on G13 target

2013-05-10 Thread Kaushik Phatak

Hi,
The following patch fixes an issue with the inline hardware multiply used in
the RL78 G13 target.
The G13 target does not have a multiply instruction, but achieves this using
set of data and control registers.

This patch adds an additional 'nop' after control and data registers are loaded
in the multiply-accumulation operation. Currently, there is only 1 'nop'. 
As per the data sheet, 
regular multiply operation: 1 nop
multiply-accumulate operation: 2 nops

Section 17.4.2 of G13 hardware manual:
After 6, the multiply-accumulation operation finishes in one additional 
clock cycle. (There is a wait of at least two clock cycles after specifying 
the initial settings is finished (5).)

Note: The multiplication fails on hardware (for certain testcases), but always
works as expected using rl78-elf-run simulator. I do not think the simulator
has any cycle check for nop's.
Please let me know if this is OK to commit.

Regards,
Kaushik

2013-05-10  Kaushik Phatak  kaushik.pha...@kpitcummins.com

* config/rl78/rl78.md (mulsi3_g13): Add additional 'nop' required
in multiply-accumulate mode

diff -uprN /home/kpit/fsfsrc/gcc-4.8.0-20121219/gcc/config/rl78/rl78.md 
gcc/config/rl78/rl78.md
--- /home/kpit/fsfsrc/gcc-4.8.0-20121219/gcc/config/rl78/rl78.md2013-01-25 
16:26:27.0 +0530
+++ gcc/config/rl78/rl78.md 2013-05-10 16:57:26.0 +0530
@@ -381,6 +381,7 @@
movwax, %h2
movw0x2, ax ; MDAH
nop ; mdc += mdal * mdah
+   nop ; Additional nop for MAC
 
mov a, #0x40
mov !0xf00e8, a ; MDUC
@@ -389,6 +390,7 @@
movwax, %H2
movw0x2, ax ; MDAH
nop ; mdc += mdal * mdah
+   nop ; Additional nop for MAC
movwax, !0xf00e0; MDCL
movw%H0, ax
; end of mulsi macro

Re: [PATCH, PR 57084] Create real cgraph node during late intraprocedural devirtualization

2013-05-10 Thread Jan Hubicka

 On Thu, May 09, 2013 at 03:52:20PM +0200, Martin Jambor wrote:
  On Tue, May 07, 2013 at 11:43:33PM +0200, Jan Hubicka wrote:
Hi,

the problem in PR 57084 is that late PRE devirtualization creates a
direct call to a decl fro which we only have an inlined call graph
mode in the given partition.  I tried to find a most universal place
where to fix it because this problem is not special to type-based
devirtualization and in theory can be caused by any call to a decl
that is grabbed from a constructor.  I think the best place is the
following one-liner, because all such decls should go through
canonicalize_constructor_val.

Bootstrapped and tested on x86_64-linux, fixes the testcase (at -m32)
and I have happened to also LTO build Mozilla Firefox with it.  OK for
trunk?
   
   Is this change needed for 4.8, too? It would be OK there.
  
  Although the testcase shows the problem relies on code that is new in
  4.9, I think that yes, even the current devirtualization by looking
  into constructors can hit the bug.  So I am going to commit it to 4.8
  after testing on that branch.
 
 Except that there is no such thing as
 cgraph_get_create_real_symbol_node in 4.8.  I suppose it's not worth
 backporting it without a bug triggering there.  However, this makes me

Yep, it was same situation with the PR fix I introduced 
cgraph_get_create_real_symbol_node for.
In theory the bug exists for 4.8, too, but I do not have testcase.

Lets go with bugfixes first and then I can do the renaming.

Honza

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-05-10 Thread Jan Hubicka

 On 05/07/13 23:13, Teresa Johnson wrote:
 --
 Revised patch that fixes failures encountered when enabling
 -freorder-blocks-and-partition, including the failure reported in PR 53743.
 
 This includes new verification code to ensure no cold blocks dominate hot
 blocks contributed by Steven Bosscher.
 Seems like a reasonable verification; presumably if we have a cold
 block dominating a hot block, then the block/edge frequencies are
 badly broken.  Ah, just saw the comments for the other case where
 this happens.  cold entry, but hot loop inside pushing over the
 barrier. Arguably given a cold block in the dominator graph, all its

Yep, also note that sanity checking anything about frequencies is really hard.
There are very many places in compiler that necesarilly need to invalidate
frequencies in weird ways (at least short of rebuilding the whole profile
from probabilities again).

 I can't really comment on the cfglayout and related stuff -- it was
 added at a time when I wasn't doing much with GCC and thus I don't
 know much about it.

I think I can take a look at the cfglayout stuff. Splitting the patch would be 
great.

Honza
 
 However, I like the changes to record if we've done partitioning and
 checking those instead of flag_reorder_blocks_and_partition.  That's
 simple enough that I'd support pulling it out as a separate patch
 and installing immediately if that can be done so without major
 headaches.
 
 I think we could do something similar with the code to verify the
 idom of a hot block is also hot.  Though looking at the
 implementation I wonder if it could be simplified by walking the
 dominator tree?  I can't look at it real closely tonight though.
 
 Could you pull those two logical hunks of work out into individual
 patches.
 
 jeff

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-05-10 Thread Steven Bosscher

On Fri, May 10, 2013 at 12:57 AM, Xinliang David Li wrote:
 On Thu, May 9, 2013 at 3:40 PM, Steven Bosscher wrote:

 This patch mixes up things badly from the point of
 what-depends-on-what, the whole approach looks wrong to me.


 Do you mean the 'source file dependency' or 'logical dependency'?

 If the former, the code can be easily refactored to remove the
 dependency. I don't see how the latter can be avoided as bb-partition
 etc does change cfg states and leads to different actions in cfg
 layout finalize.

I mean logical dependency. cfglayout is just a representation of the
CFG, and bb-partition is a code transformation. By making
fixup_reorder_chain emit the note, you' ve put part of the
transformation into out-of-cfglayout which is just bogus. You also
don't put GCSE or loop unrolling in out-of-cfglayout, and this change
is IMHO in the same category: mixing transformations into internal
representations. That may be a short-term fix but it is a long-term
maintenance/cleanups nightmare.

Although I've only skimmed the patch, I have noted several issues with it:

* 3 different changes put into a single patch: the
crtl-has_bb_partition change (which looks good to me), the
verification stuff, and various fixes. The patch should be submitted
in 3 parts to make/testing review easier.

* Emitting barriers in cfglayout mode. That's non-sense.

* Fixup redirected edges that did not cross partitions before but
apparently do after redirection. This is not supposed to happen in the
first place, so fixing up any of this just papers over an error
elsewhere in the compiler.

* The fixup_reorder_chain changes I've mentioned above.


Ciao!
Steven

Re: RFC patch for fold-const.c's fold_indirect_ref_1 / avoid type-checking ICE with TYPE_MIN_VALUE vs. offset signess mismatches

2013-05-10 Thread Richard Biener

On Fri, May 10, 2013 at 10:44 AM, Tobias Burnus
tobias.bur...@physik.fu-berlin.de wrote:
 I run into the following issue on the Fortran-Dev branch. I get an ICE due to 
 a tree-checking assert for fold-const.c's fold_indirect_ref_1, which has

   tree min_val = size_zero_node;
   if (type_domain  TYPE_MIN_VALUE (type_domain))
 min_val = TYPE_MIN_VALUE (type_domain);
   op01 = size_binop_loc (loc, EXACT_DIV_EXPR, op01,
  TYPE_SIZE_UNIT (type));
   op01 = size_binop_loc (loc, PLUS_EXPR, op01, min_val);

 The problem is that op01 is unsigned (size_type_node) while min_val alias 
 TYPE_MIN_VALUE is signed (gfc_array_index_type which is PTRDIFF_TYPE).


 TYPE_MIN_VALUE gets set via tree.c's build_range_type_1, which has:

   TYPE_MIN_VALUE (itype) = fold_convert (type, lowval);
   TYPE_MAX_VALUE (itype) = highval ? fold_convert (type, highval) : NULL;


 And op01 becomes a size_type_node via tree.h's

 fold_build_pointer_plus_loc (location_t loc, tree ptr, tree off)
 {
   return fold_build2_loc (loc, POINTER_PLUS_EXPR, TREE_TYPE (ptr),
   ptr, fold_convert_loc (loc, sizetype, off));


 I am not completely sure what's the best way to solve that, but the attached 
 patch avoids the ICE for me.

 I did an all-all language bootstrap (all,ada,go,objc,obj-c++) and had no new 
 failures with regresting.
 OK? Or do you have a better idea how to solve this?

I think the middle-end expects unsigned sizetype in many places, so this
isn't the only point where you may hit such an issue.  The code has the
related issue that it happily does an unsigned division on negative offsets
because it fails to interpret the POINTER_PLUS_EXPR offset as signed

Note that those foldings are highly suspicious (I've removed most similar
foldings from elsewhere) because they are susceptible to generate
out-of-bound array accesses for multi-dimensional arrays.  This one seems
to be bogus this way, too.

That said, the function needs some major TLC :/  (I'd simply try removing
that array case)

Richard.

 Tobias

Re: More vector folding

2013-05-10 Thread Richard Biener

On Fri, May 10, 2013 at 12:03 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Fri, 10 May 2013, Richard Biener wrote:

 On Thu, May 9, 2013 at 12:55 AM, Marc Glisse marc.gli...@inria.fr wrote:

 Hello,

 here are a few more changes to fold-const.c so vectors can use the
 existing
 optimizations. Note that I made fold_truth_not_expr safe for use with
 vector
 BIT_NOT_EXPR.

 Passes bootstrap+testsuite on x86_64-linux-gnu.

 2013-05-09  Marc Glisse  marc.gli...@inria.fr

 gcc/
 * fold-const.c (fold_negate_expr): Handle vectors.
 (fold_truth_not_expr): Likewise.
 (invert_truthvalue_loc): Likewise.
 (fold_single_bit_test): Likewise.
 (fold_comparison): Likewise.
 (fold_ternary_loc): Likewise.

 gcc/testsuite/
 * g++.dg/ext/vector22.C: New testcase.

 --
 Marc Glisse
 Index: gcc/testsuite/g++.dg/ext/vector22.C
 ===
 --- gcc/testsuite/g++.dg/ext/vector22.C (revision 0)
 +++ gcc/testsuite/g++.dg/ext/vector22.C (revision 0)
 @@ -0,0 +1,20 @@
 +/* { dg-do compile } */
 +/* { dg-options -O -fdump-tree-gimple } */
 +
 +typedef unsigned vec __attribute__((vector_size(4*sizeof(int;
 +
 +void f(vec*a,vec*b){
 +  *a=(*a)?-1:(*b10);
 +  *b=(*b)?(*a10):0;
 +}
 +void g(vec*a,vec*b){
 +  *a=(*a)?(*a*a):-1;
 +  *b=(*b)?-1:(*b*b);
 +}
 +void h(vec*a){
 +  *a=(~*a==5);
 +}
 +
 +/* { dg-final { scan-tree-dump-not ~ gimple } } */
 +/* { dg-final { scan-tree-dump-not VEC_COND_EXPR gimple } } */
 +/* { dg-final { cleanup-tree-dump gimple } } */

 Property changes on: gcc/testsuite/g++.dg/ext/vector22.C
 ___
 Added: svn:eol-style
+ native
 Added: svn:keywords
+ Author Date Id Revision URL

 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c(revision 198726)
 +++ gcc/fold-const.c(working copy)
 @@ -519,21 +519,21 @@ fold_negate_expr (location_t loc, tree t
  {
tree type = TREE_TYPE (t);
tree tem;

switch (TREE_CODE (t))
  {
  /* Convert - (~A) to A + 1.  */
  case BIT_NOT_EXPR:
if (INTEGRAL_TYPE_P (type))
  return fold_build2_loc (loc, PLUS_EXPR, type, TREE_OPERAND (t,
 0),
 -build_int_cst (type, 1));
 +build_one_cst (type));
break;

  case INTEGER_CST:
tem = fold_negate_const (t, type);
if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t)
   || !TYPE_OVERFLOW_TRAPS (type))
 return tem;
break;

  case REAL_CST:
 @@ -3110,20 +3110,23 @@ fold_truth_not_expr (location_t loc, tre

return build2_loc (loc, code, type, TREE_OPERAND (arg, 0),
  TREE_OPERAND (arg, 1));
  }

switch (code)
  {
  case INTEGER_CST:
return constant_boolean_node (integer_zerop (arg), type);

 +case VECTOR_CST:
 +  return fold_unary_loc (loc, BIT_NOT_EXPR, type, arg);
 +
  case TRUTH_AND_EXPR:
loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc);
return build2_loc (loc, TRUTH_OR_EXPR, type,
  invert_truthvalue_loc (loc1, TREE_OPERAND (arg,
 0)),
  invert_truthvalue_loc (loc2, TREE_OPERAND (arg,
 1)));

  case TRUTH_OR_EXPR:
loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
loc2 = expr_location_or (TREE_OPERAND (arg, 1), loc);
 @@ -3195,26 +3198,28 @@ fold_truth_not_expr (location_t loc, tre
 return build1_loc (loc, TRUTH_NOT_EXPR, type, arg);

/* ... fall through ...  */

  case FLOAT_EXPR:
loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
return build1_loc (loc, TREE_CODE (arg), type,
  invert_truthvalue_loc (loc1, TREE_OPERAND (arg,
 0)));

  case BIT_AND_EXPR:
 -  if (!integer_onep (TREE_OPERAND (arg, 1)))
 +  if (VECTOR_TYPE_P (type) || !integer_onep (TREE_OPERAND (arg, 1)))
 return NULL_TREE;
return build2_loc (loc, EQ_EXPR, type, arg, build_int_cst (type,
 0));

  case SAVE_EXPR:
 -  return build1_loc (loc, TRUTH_NOT_EXPR, type, arg);
 +  return build1_loc (loc, VECTOR_TYPE_P (type) ? BIT_NOT_EXPR
 +  : TRUTH_NOT_EXPR,
 +type, arg);

  case CLEANUP_POINT_EXPR:
loc1 = expr_location_or (TREE_OPERAND (arg, 0), loc);
return build1_loc (loc, CLEANUP_POINT_EXPR, type,
  invert_truthvalue_loc (loc1, TREE_OPERAND (arg,
 0)));

  default:
return NULL_TREE;
  }
  }
 @@ -3222,28 +3227,28 @@ fold_truth_not_expr (location_t loc, tre
  /* Return a simplified tree node for the truth-negation of ARG.  This
 never alters ARG itself.  We assume that ARG is an operation that
 returns a truth value (0 or 1).

 FIXME:

Re: [PATCH] Dynamic dispatch of multiversioned functions and CPU mocks for code coverage.

2013-05-10 Thread Joseph S. Myers

On Thu, 9 May 2013, Sriraman Tallam wrote:

 Then, I plan to add the following hooks to libgcc (in a different patch) :
 
 int set_mock_cpu_is (const char *cpu);
 int set_mock_cpu_supports (const char *isa);
 int init_mock_cpu (); // Clear the values of the mock cpu.

Those names are in the user's namespace; I think libgcc should only 
provide or use symbols in the implementation namespace.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Improve rotation by mode bitsize - 1

2013-05-10 Thread Jan Hubicka

 Hi!
 
 This is something I've noticed while working on the rotate recognizer
 patch I've just posted.  We emit say
   roll %eax
 instead of
   roll $1, %eax
 because the former is shorter, but emit
   roll $31, %eax
 instead of the equivalent, but shorter
   rorl %eax
 The following patch let us optimize even those.  Bootstrapped/regtested
 on x86_64-linux and i686-linux, ok for trunk?
 
 2013-05-09  Jakub Jelinek  ja...@redhat.com
 
   * config/i386/i386.md (rotateinv): New code attr.
   (*rotate_insnmode3_1, *rotate_insnsi3_1_zext,
   *rotate_insnqi3_1_slp): Emit rorl %eax instead of
   roll $31, %eax, etc.

Going this route you will also need to update

  [(set_attr type rotate1)
   (set (attr length_immediate)
 (if_then_else
   (and (match_operand 1 const1_operand)
(ior (match_test TARGET_SHIFT1)
 (match_test optimize_function_for_size_p (cfun
   (const_string 0)
   (const_string *)))
   (set_attr mode QI)])

computing the immediate size.  Why we can't canonicalize this in
folding/simplify_rtx/combiner? I do not see much benefit allowing both
forms of instructions in our insn streams...

Honza

Re: [PATCH] Improve rotation by mode bitsize - 1

2013-05-10 Thread Jakub Jelinek

Hi!

Note, the patch has been already committed.

On Fri, May 10, 2013 at 03:48:50PM +0200, Jan Hubicka wrote:
 Going this route you will also need to update
 
   [(set_attr type rotate1)
(set (attr length_immediate)
  (if_then_else
(and (match_operand 1 const1_operand)
 (ior (match_test TARGET_SHIFT1)
  (match_test optimize_function_for_size_p (cfun
(const_string 0)
(const_string *)))
(set_attr mode QI)])
 
 computing the immediate size.  Why we can't canonicalize this in
 folding/simplify_rtx/combiner? I do not see much benefit allowing both
 forms of instructions in our insn streams...

Both directions of rotation make sense for variable length rotation, and
why would the generic simplify-rtx.c code want to prefer one rotate over
the other direction for constant rotation counts?  That is clearly target
specific preference.

What we could do instead of the patch I've committed just tweak expansion
of the patterns instead, assuming it is unlikely RTL optimizations
materialize a constant rotation out of non-constant rotations from expansion
time.

Or the length_immediate attributes could be adjusted, shouldn't be too hard
(just replace that (match_operand 1 const1_operand) in there with
(and (match_operand 1 const_int_operand)
 (ior (match_operand 1 const1_operand)
  (match_test INTVAL (operands[1]) == GET_MODE_BITSIZE (MODEmode) - 
1)))
or similar (untested).

Jakub

Re: [PATCH] Test case for PR55033

2013-05-10 Thread Chung-Ju Wu

2013/5/10 Sebastian Huber sebastian.hu...@embedded-brains.de:
 gcc/testsuite/ChangeLog
 2013-05-10  Sebastian Huber  sebastian.hu...@embedded-brains.de

 PR target/55033
 * gcc.target/powerpc/pr55033.c: New.
 ---
  gcc/testsuite/gcc.target/powerpc/pr55033.c |   21 +
  1 files changed, 21 insertions(+), 0 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr55033.c

 diff --git a/gcc/testsuite/gcc.target/powerpc/pr55033.c 
 b/gcc/testsuite/gcc.target/powerpc/pr55033.c
 new file mode 100644
 index 000..cd8abcb
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/powerpc/pr55033.c
 @@ -0,0 +1,21 @@
 +/* { dg-do compile } */
 +/* { dg-require-effective-target powerpc_eabi_ok } */
 +/* { dg-options -mcpu=8540 -msoft-float -msdata=eabi -G 8 -fno-common } */
 +
 +void f(void);
   ^
I think there is missing space. :p

 +
 +struct s {
 +  int *p;
 +  int *q;
 +};
 +
 +extern int a;
 +
 +extern const struct s c;
 +
 +const struct s c = { a, 0 };
 +
 +void f(void)
   ^
Likewise. :)

 +{
 +  char buf[4] = { 0, 1, 2, 3 };
 +}
 --
 1.7.7


Best regards,
jasonwucj

[v3] Fix libstdc++/54577

2013-05-10 Thread Paolo Carlini


Hi,

this is the issue about the signatures of the erase member functions of 
the sequence containers. Mostly rather straightfoward stuff within the 
limits of the current infrastructure: the various _M_const_case are 
normally simple enough, I only mention the rather ugly std::vector one, 
required otherwise an ext_pointer testcase fails: I suppose that 
handling these issues in a proper way will happen together with changing 
the vector::pointer typedef to the conforming:


typedef typename _Alloc_traits::pointer  pointer;

and I suppose Jonathan has the issue under control. Something similar 
is true for the other containers.


Tested x86_64-linux.

Thanks,
Paolo.


2013-05-10  Paolo Carlini  paolo.carl...@oracle.com

PR libstdc++/54577
* include/bits/stl_deque.h (erase): Fix signatures in C++11 mode.
(_Deque_iterator::_M_const_cast): Add.
(deque::_M_erase): Declare.
* include/bits/deque.tcc: ... and define.
* include/bits/stl_list.h (erase): Fix signatures in C++11 mode.
(_List_const_iterator::_M_const_cast): Add.
* include/bits/list.tcc (erase (iterator)): Fix.
* include/bits/stl_iterator.h (__normal_iterator::_M_const_cast):
Add; include ext/cast.h.
* include/bits/stl_vector.h (erase): Fix signatures in C++11 mode.
(vector::_M_erase): Declare.
* include/bits/stl_bvector.h: Likewise.
* include/bits/vector.tcc (vector::_M_erase): Define.
* include/ext/vstring.h (erase): Fix signatures in C++11 mode.
* include/debug/deque: Adjust.
* include/debug/list: Likewise.
* include/debug/vector: Likewise.
* include/profile/deque: Likewise.
* include/profile/list: Likewise.
* include/profile/vector: Likewise.
* testsuite/util/exception/safety.h (erase_basedeque,
erase_basedeque, erase_basevector): Remove.
(erase_base__versa_string): Update.
* testsuite/ext/vstring/modifiers/char/54577.cc: New.
* testsuite/ext/vstring/modifiers/wchar_t/54577.cc: Likewise.
* testsuite/23_containers/deque/modifiers/erase/54577.cc: Likewise.
* testsuite/23_containers/list/modifiers/erase/54577.cc: Likewise.
* testsuite/23_containers/vector/bool/modifiers/erase/54577.cc:
Likewise.
* testsuite/23_containers/vector/modifiers/erase/54577.cc: Likewise.
* testsuite/23_containers/deque/requirements/dr438/assign_neg.cc:
Adjust dg-error line numbers.
* testsuite/23_containers/deque/requirements/dr438/
constructor_1_neg.cc: Likewise.
* testsuite/23_containers/deque/requirements/dr438/
constructor_2_neg.cc: Likewise.
* testsuite/23_containers/deque/requirements/dr438/insert_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/assign_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/
constructor_1_neg.cc: Likewise.
* testsuite/23_containers/list/requirements/dr438/
constructor_2_neg.cc: Likewise.
* testsuite/23_containers/list/requirements/dr438/
insert_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/assign_neg.cc:
Likewise.
* testsuite/23_containers/vector/requirements/dr438/
constructor_1_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/
constructor_2_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/insert_neg.cc:
Likewise.
Index: include/bits/deque.tcc
===
--- include/bits/deque.tcc  (revision 198770)
+++ include/bits/deque.tcc  (working copy)
@@ -191,7 +191,7 @@
   template typename _Tp, typename _Alloc
 typename deque_Tp, _Alloc::iterator
 deque_Tp, _Alloc::
-erase(iterator __position)
+_M_erase(iterator __position)
 {
   iterator __next = __position;
   ++__next;
@@ -214,7 +214,7 @@
   template typename _Tp, typename _Alloc
 typename deque_Tp, _Alloc::iterator
 deque_Tp, _Alloc::
-erase(iterator __first, iterator __last)
+_M_erase(iterator __first, iterator __last)
 {
   if (__first == __last)
return __first;
Index: include/bits/list.tcc
===
--- include/bits/list.tcc   (revision 198770)
+++ include/bits/list.tcc   (working copy)
@@ -106,10 +106,18 @@
   templatetypename _Tp, typename _Alloc
 typename list_Tp, _Alloc::iterator
 list_Tp, _Alloc::
+#if __cplusplus = 201103L
+erase(const_iterator __position)
+#else
 erase(iterator __position)
+#endif
 {
   iterator __ret = iterator(__position._M_node-_M_next);
+#if __cplusplus = 201103L
+  _M_erase(__position._M_const_cast());
+#else
   _M_erase(__position);
+#endif
   return __ret;
 }
 
Index:

C++ PATCH for c++/55149 (VLA lambda capture)

2013-05-10 Thread Jason Merrill

This bug talks about ICEs trying to capture a VLA by value.  We don't 
support that, but we can give a more helpful error.  In the comments 
Paolo also noticed that with the new support for capture by reference, 
we wrongly warn about returning a reference to the captured variable.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit e88823b454079098df61d1701ac0c7b59a96b758
Author: Jason Merrill ja...@redhat.com
Date:   Thu May 9 22:15:22 2013 -0400

	PR c++/55149
	* semantics.c (add_capture): Error rather than abort on copy
	capture of VLA.
	* typeck.c (maybe_warn_about_returning_address_of_local): Don't
	warn about capture proxy.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 3e1a0bf..d0db10a 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -9463,9 +9463,12 @@ add_capture (tree lambda, tree id, tree initializer, bool by_reference_p,
   type = lambda_capture_field_type (initializer, explicit_init_p);
   if (array_of_runtime_bound_p (type))
 {
+  if (!by_reference_p)
+	error (array of runtime bound cannot be captured by copy, 
+	   only by reference);
+
   /* For a VLA, we capture the address of the first element and the
 	 maximum index, and then reconstruct the VLA for the proxy.  */
-  gcc_assert (by_reference_p);
   tree elt = cp_build_array_ref (input_location, initializer,
  integer_zero_node, tf_warning_or_error);
   tree ctype = vla_capture_type (type);
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index df5fc4a..b8ea555 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -8140,6 +8140,7 @@ maybe_warn_about_returning_address_of_local (tree retval)
   if (DECL_P (whats_returned)
DECL_NAME (whats_returned)
DECL_FUNCTION_SCOPE_P (whats_returned)
+   !is_capture_proxy (whats_returned)
!(TREE_STATIC (whats_returned)
 	   || TREE_PUBLIC (whats_returned)))
 {
diff --git a/gcc/testsuite/g++.dg/cpp1y/vla5.C b/gcc/testsuite/g++.dg/cpp1y/vla5.C
new file mode 100644
index 000..1f6da29
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/vla5.C
@@ -0,0 +1,8 @@
+// PR c++/55149
+// { dg-options -std=c++1y }
+
+void test(int n) {
+  int r[n];
+  [r]() { return r + 0; };
+  [r]() { return r + 0; };	// { dg-error captured by copy }
+}

C++ PATCH for c++/57047 (ICE with constexpr)

2013-05-10 Thread Jason Merrill

Oddly, after you assign to a variable, you can't use its old value 
anymore...


Tested x86_64-pc-linux-gnu, applying to trunk, 4.8, 4.7.
commit 16ee0ee2c495bc475e0ba015c6b9fd4707692196
Author: jason jason@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Fri May 10 14:17:37 2013 +

	PR c++/57047
	* semantics.c (cxx_fold_indirect_ref): Fix thinko.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@198777 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index d0db10a..3e78887 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7643,15 +7643,17 @@ cxx_fold_indirect_ref (location_t loc, tree type, tree op0, bool *empty_base)
 	}
 	}
 }
-  /* *(foo *)fooarrptreturn (*fooarrptr)[0] */
+  /* *(foo *)fooarrptr = (*fooarrptr)[0] */
   else if (TREE_CODE (TREE_TYPE (subtype)) == ARRAY_TYPE
 	(same_type_ignoring_top_level_qualifiers_p
 	   (type, TREE_TYPE (TREE_TYPE (subtype)
 {
   tree type_domain;
   tree min_val = size_zero_node;
-  sub = cxx_fold_indirect_ref (loc, TREE_TYPE (subtype), sub, NULL);
-  if (!sub)
+  tree newsub = cxx_fold_indirect_ref (loc, TREE_TYPE (subtype), sub, NULL);
+  if (newsub)
+	sub = newsub;
+  else
 	sub = build1_loc (loc, INDIRECT_REF, TREE_TYPE (subtype), sub);
   type_domain = TYPE_DOMAIN (TREE_TYPE (sub));
   if (type_domain  TYPE_MIN_VALUE (type_domain))
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-array-ptr8.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-array-ptr8.C
new file mode 100644
index 000..ee425ea
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-array-ptr8.C
@@ -0,0 +1,54 @@
+// PR c++/57047
+// { dg-require-effective-target c++11 }
+
+template typename
+struct A;
+template typename T
+struct A T 
+{
+  typedef T type;
+};
+template typename T
+constexpr T  foo (typename A T::type  __t) noexcept
+{
+  return static_cast T (__t);
+}
+template class T1, class T2
+struct B
+{
+  T1 t1;
+  T2 t2;
+  template class U
+  constexpr B (U  __x, const T2  __y) : t1 (foo U (__x)), t2 (__y) {}
+};
+static inline constexpr bool
+fn1 (const char c)
+{
+  return ('0' = c)  (c = '9');
+}
+static inline constexpr bool
+fn2 (const char c)
+{
+  return (('A' = c)  (c = 'Z')) || (('a' = c)  (c = 'z'));
+}
+static constexpr bool
+fn3 (const char *const x)
+{
+  return (x[1] == '\0'  x[0] == ']') ? true : (!fn1 (x[0])) ? false : fn3 (x[1]);
+}
+static constexpr bool
+fn4 (const char *const x)
+{
+  return (x[0] == '\0') ? fn3 (x[1]) : fn4 (x[1]);
+}
+static inline constexpr bool
+fn5 (const char *const x)
+{
+  return fn2 (x[0]) ? fn4 (x) : false;
+}
+struct C final
+{
+  constexpr C (const char *const t1) : c (fn5 (t1) ? 199 : 69) {}
+  unsigned c;
+};
+B C, C p (a, b);

[PATCH v2] Test case for PR55033

2013-05-10 Thread Sebastian Huber

v2: Format changes

gcc/testsuite/ChangeLog
2013-05-10  Sebastian Huber  sebastian.hu...@embedded-brains.de

PR target/55033
* gcc.target/powerpc/pr55033.c: New.
---
 gcc/testsuite/gcc.target/powerpc/pr55033.c |   23 +++
 1 files changed, 23 insertions(+), 0 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr55033.c

diff --git a/gcc/testsuite/gcc.target/powerpc/pr55033.c 
b/gcc/testsuite/gcc.target/powerpc/pr55033.c
new file mode 100644
index 000..2c1835e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr55033.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_eabi_ok } */
+/* { dg-options -mcpu=8540 -msoft-float -msdata=eabi -G 8 -fno-common } */
+
+extern void f (void);
+
+struct s
+{
+  int *p;
+  int *q;
+};
+
+extern int a;
+
+extern const struct s c;
+
+const struct s c = { a, 0 };
+
+void
+f (void)
+{
+  char buf[4] = { 0, 1, 2, 3 };
+}
-- 
1.7.7

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-05-10 Thread Teresa Johnson

On Thu, May 9, 2013 at 2:42 PM, Diego Novillo dnovi...@google.com wrote:
 On 2013-05-08 01:13 , Teresa Johnson wrote:

 Somehow Rietveld didn't upload the patch properly. I've attached the
 patch to this email instead. Here is the description:


 Rietveld has turned out to be far less useful that I had hoped.  If you are
 running ubuntu precise, the upload script is having some bad interaction
 with the server, which makes it to constantly reject your password.

 I do not recommend using Rietveld anymore.  I don't really have the cycles
 to invest in fixing the various usability warts we've found. Sorry.

Thanks for the note. The main reason I have tried to keep using
Rietveld is that it sends out the patch inline in the email with the
formatting preserved. I have found that cut-n-paste into a gmail
window messes up the spacing. Do you know of a good way to work around
this issue?



 -static void
 +void
  emit_barrier_after_bb (basic_block bb)
  {
rtx barrier = emit_barrier_after (BB_END (bb));
 -  BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
 +  if (current_ir_type () == IR_RTL_CFGLAYOUT)
 +BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);


 What if the current IR is not RTL?  Should we fail here?  It doesn't seem
 like it makes sense to call this from gimple, for instance.

This is called only from bb-reorder and cfgrtl, so we should only be
in IR_RTL, I can add an assert to this effect.

More on this change when I respond to Steven's comments.



 + several different possibilities. One is that there are edge weight
 insanities
 + due to optimization phases that do not properly update basic block
 profile
 + counts. The second is that the entry of the function may not be hot,
 because
 + it is entered fewer times than the number of profile training runs,
 but there
 + is a loop inside the function that causes blocks within the function
 to be
 + above the threshold for hotness.  */
 +  if (cold_bb_count)
 +{
 +  bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS);
 +


 Move this out into its own function?

Will do.


 +  if (dom_calculated_here)
 +calculate_dominance_info (CDI_DOMINATORS);
 +
 +  /* Keep examining hot bbs until we have either checked them all, or
 + re-marked all cold bbs hot.  */
 +  while (! bbs_in_hot_partition.is_empty ()
 +  cold_bb_count)
 +{
 +  basic_block dom_bb;
 +
 +  bb = bbs_in_hot_partition.pop ();
 +  dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb);
 +
 +  /* If bb's immediate dominator is also hot then it is ok.  */
 +  if (BB_PARTITION (dom_bb) != BB_COLD_PARTITION)
 +continue;
 +
 +  /* We have a hot bb with an immediate dominator that is cold.
 + The dominator needs to be re-marked to hot.  */


 s/to hot/hot/

ok. Actually, I think s/to hot/as hot/ might sound better.


 Index: cfgrtl.c
 ===
 --- cfgrtl.c(revision 198686)
 +++ cfgrtl.c(working copy)
 @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.  If not see
  #include tree.h
  #include hard-reg-set.h
  #include basic-block.h
 +#include bb-reorder.h


 You may need to modify Makefile.in to declare this new dependency.

 +/* Called when edge E has been redirected to a new destination,
 +   in order to update the region crossing flag on the edge and
 +   jump.  */
 +
 +static void
 +fixup_partition_crossing (edge e, basic_block target)
 +{
 +  rtx note;
 +
 +  gcc_assert (e-dest == target);


 Then, why not just take a single argument E?

Good idea, will do.


 +fixup_bb_partition (basic_block bb)
 +{
 +  edge e;
 +  edge_iterator ei;
 +
 +  /* Now need to make bb's pred edges non-region crossing.  */


 This is hard to parse.

Ok, how about:

/* Ensure edges to bb reflect its new partition assignment with the appropriate
   region-crossing flag setting.  */


 +  /* Delete any blocks that became unreachable and weren't
 + already cleaned up, for example during edge forwarding
 + and convert_jumps_to_returns. This will expose more
 + opportunities for fixing the partition boundaries here.
 + Also, the calculation of the dominance graph during verification
 + will assert if there are unreachable nodes.  */
 +  delete_unreachable_blocks ();


 Why not just schedule a CFG cleanup as a prerequisite to this pass?

Which pass? This is called right after we try to optimize the cfg
during cleanup_cfg, which is invoked numerous places. try_optimize_cfg
performs a number of cfg optimizations, some of which can create
unreachable blocks. I found it was much easier to clean this up in one
pass at the end rather that try to detect and fix this up
incrementally.



 A minor formatting nit.  References to locals and function arguments should
 be done in capitals.

Ok, will clean this up.

Thanks!
Teresa



 Diego.



--
Teresa Johnson |

C++ PATCH for c++/57196 (wrong error with initializer-list constructor)

2013-05-10 Thread Jason Merrill

When I added the check for instantiation-dependent expressions, I 
treated statement-expressions and some qualified-ids as 
instantiation-dependent even when they don't involve template 
parameters.  In this PR, that causes problems because array 
initialization expands into a statement-expression via build_vec_init, 
and a sizeof-expression is not immediately folded, and then when we go 
to convert the sizeof to the appropriate type for the template non-type 
parameter, we call uses_template_parms, which pretends we're in a 
template, we look inside the sizeof and see the statement-expression, 
and decide that it's dependent.


The solution is not to pretend that we're in a template when deciding 
whether a template argument is dependent.


Tested x86_64-pc-linux-gnu, applying to trunk.  If no problems turn up, 
I'll apply it to 4.8 later as well.
commit 32bf8cd06a9d5e2ad4abd35ce3571457a071b27d
Author: jason jason@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Fri May 10 14:17:45 2013 +

	PR c++/57196
	* pt.c (convert_template_argument): Use dependent_template_arg_p,
	not uses_template_parms.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@198778 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 0747de6..8f88b10 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -6368,7 +6368,8 @@ convert_template_argument (tree parm,
 	  val = error_mark_node;
 	}
 	}
-  else if (!uses_template_parms (orig_arg)  !uses_template_parms (t))
+  else if (!dependent_template_arg_p (orig_arg)
+	!uses_template_parms (t))
 	/* We used to call digest_init here.  However, digest_init
 	   will report errors, which we don't want when complain
 	   is zero.  More importantly, digest_init will try too
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist70.C b/gcc/testsuite/g++.dg/cpp0x/initlist70.C
new file mode 100644
index 000..f215b9d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist70.C
@@ -0,0 +1,27 @@
+// PR c++/57196
+// { dg-require-effective-target c++11 }
+
+#include initializer_list
+
+templateclass T
+struct set {
+  set() = default;
+  set(std::initializer_listT){}
+};
+
+struct string {
+  string(const char*){}
+  ~string(){}
+};
+
+typedef decltype(sizeof(0)) size_t;
+
+template size_t struct EqHelper { };
+
+int IsNullLiteralHelper(...);
+
+void Fn() {
+  EqHelpersizeof IsNullLiteralHelper(setint{1})eq1;  // ok
+  EqHelpersizeof IsNullLiteralHelper(setstring())  eq2;  // ok
+  EqHelpersizeof IsNullLiteralHelper(setstring{foo}) eq3;  // error
+}
diff --git a/gcc/testsuite/g++.dg/template/crash106.C b/gcc/testsuite/g++.dg/template/crash106.C
index c2d117e..ebd47bc 100644
--- a/gcc/testsuite/g++.dg/template/crash106.C
+++ b/gcc/testsuite/g++.dg/template/crash106.C
@@ -10,3 +10,5 @@ struct A
 templateT N = 0, void (A::*)() = A::fooN  struct B {}; // { dg-error type|declared }
 
 B b; // { dg-error type|declaration }
+
+// { dg-prune-output could not convert }
diff --git a/gcc/testsuite/g++.dg/template/crash112.C b/gcc/testsuite/g++.dg/template/crash112.C
index 919c887..ff35764 100644
--- a/gcc/testsuite/g++.dg/template/crash112.C
+++ b/gcc/testsuite/g++.dg/template/crash112.C
@@ -5,7 +5,7 @@ struct A
   templatetypename void foo() {}
 };
 
-templatevoid (A::*)() struct B {}; // { dg-error declaration }
+templatevoid (A::*)() struct B {};
 
 templateint struct C
 {
@@ -13,3 +13,5 @@ templateint struct C
 };
 
 C0 c;
+
+// { dg-prune-output could not convert }
diff --git a/gcc/testsuite/g++.dg/template/dependent-args1.C b/gcc/testsuite/g++.dg/template/dependent-args1.C
index 0b197cf..a540e55 100644
--- a/gcc/testsuite/g++.dg/template/dependent-args1.C
+++ b/gcc/testsuite/g++.dg/template/dependent-args1.C
@@ -9,3 +9,5 @@ struct A
 templateint N, void (A::*)() = A::fooN  struct B {};
 
 Bint b; // { dg-error type/value mismatch|expected a constant|invalid type }
+
+// { dg-prune-output could not convert }

[PATCH] Pattern recognizer for rotates

2013-05-10 Thread Jakub Jelinek

Hi!

This patch allows us to vectorize attached testcases again, rot1.c
and rot3.c e.g. with -O3 -mavx, rot2.c only with -O3 -mavx2.

The problem with rot2.c is that the rotation count is there
vect_external_def, but when we insert some stmts into the pattern
def seq (the (-j)  31), we turn it into vec_internal_def, we have right now
no easy way to say, if you want to vectorize this, just push these statements
before the loop as scalar and use those as external def.
Well, perhaps the pattern recognizer could in that case emit the stmts
before the loop directly, and if vectorization won't be successful, just
let it be DCEd again.  Perhaps something to be handled as a follow-up.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Not sure about if something should go into the testsuite/, and in what form
(just runtime testcases with verification, or vect_shift target tests?
Note I couldn't find any target supports stuff that would check if vector by
vector shifts are supported).

2013-05-10  Jakub Jelinek  ja...@redhat.com

* tree-vectorizer.h (NUM_PATTERNS): Increment.
* tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
vect_recog_rotate_pattern.
(vect_recog_rotate_pattern): New function.

--- gcc/tree-vectorizer.h.jj2013-04-24 12:07:12.0 +0200
+++ gcc/tree-vectorizer.h   2013-05-10 13:38:14.658985201 +0200
@@ -1005,7 +1005,7 @@ extern void vect_slp_transform_bb (basic
Additional pattern recognition functions can (and will) be added
in the future.  */
 typedef gimple (* vect_recog_func_ptr) (vecgimple *, tree *, tree *);
-#define NUM_PATTERNS 10
+#define NUM_PATTERNS 11
 void vect_pattern_recog (loop_vec_info, bb_vec_info);
 
 /* In tree-vectorizer.c.  */
--- gcc/tree-vect-patterns.c.jj 2013-01-11 09:03:01.0 +0100
+++ gcc/tree-vect-patterns.c2013-05-10 14:59:35.756019753 +0200
@@ -50,6 +50,7 @@ static gimple vect_recog_over_widening_p
  tree *);
 static gimple vect_recog_widen_shift_pattern (vecgimple *,
tree *, tree *);
+static gimple vect_recog_rotate_pattern (vecgimple *, tree *, tree *);
 static gimple vect_recog_vector_vector_shift_pattern (vecgimple *,
  tree *, tree *);
 static gimple vect_recog_divmod_pattern (vecgimple *,
@@ -64,6 +65,7 @@ static vect_recog_func_ptr vect_vect_rec
vect_recog_pow_pattern,
vect_recog_widen_shift_pattern,
vect_recog_over_widening_pattern,
+   vect_recog_rotate_pattern,
vect_recog_vector_vector_shift_pattern,
vect_recog_divmod_pattern,
vect_recog_mixed_size_cond_pattern,
@@ -1446,6 +1448,218 @@ vect_recog_widen_shift_pattern (vecgimp
 
   if (dump_enabled_p ())
 dump_gimple_stmt_loc (MSG_NOTE, vect_location, TDF_SLIM, pattern_stmt, 0);
+
+  stmts-safe_push (last_stmt);
+  return pattern_stmt;
+}
+
+/* Detect a rotate pattern wouldn't be otherwise vectorized:
+
+   type a_t, b_t, c_t;
+
+   S0 a_t = b_t r c_t;
+
+  Input/Output:
+
+  * STMTS: Contains a stmt from which the pattern search begins,
+i.e. the shift/rotate stmt.  The original stmt (S0) is replaced
+with a sequence:
+
+   S1 d_t = -c_t;
+   S2 e_t = d_t  (B - 1);
+   S3 f_t = b_t  c_t;
+   S4 g_t = b_t  e_t;
+   S0 a_t = f_t | g_t;
+
+where B is element bitsize of type.
+
+  Output:
+
+  * TYPE_IN: The type of the input arguments to the pattern.
+
+  * TYPE_OUT: The type of the output of this pattern.
+
+  * Return value: A new stmt that will be used to replace the rotate
+S0 stmt.  */
+
+static gimple
+vect_recog_rotate_pattern (vecgimple *stmts, tree *type_in, tree *type_out)
+{
+  gimple last_stmt = stmts-pop ();
+  tree oprnd0, oprnd1, lhs, var, var1, var2, vectype, type, stype, def, def2;
+  gimple pattern_stmt, def_stmt;
+  enum tree_code rhs_code;
+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_vinfo);
+  enum vect_def_type dt;
+  optab optab1, optab2;
+
+  if (!is_gimple_assign (last_stmt))
+return NULL;
+
+  rhs_code = gimple_assign_rhs_code (last_stmt);
+  switch (rhs_code)
+{
+case LROTATE_EXPR:
+case RROTATE_EXPR:
+  break;
+default:
+  return NULL;
+}
+
+  if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo))
+return NULL;
+
+  lhs = gimple_assign_lhs (last_stmt);
+  oprnd0 = gimple_assign_rhs1 (last_stmt);
+  type = TREE_TYPE (oprnd0);
+  oprnd1 = gimple_assign_rhs2 (last_stmt);
+  if (TREE_CODE (oprnd0) != SSA_NAME
+  || TYPE_PRECISION (TREE_TYPE (lhs)) != TYPE_PRECISION (type)
+  || !INTEGRAL_TYPE_P (type)
+  || !TYPE_UNSIGNED (type))
+return NULL;
+
+  if (!vect_is_simple_use (oprnd1, last_stmt, loop_vinfo, bb_vinfo, def_stmt,
+  def, dt))
+return NULL;
+
+  if (dt != vect_internal_def
+   dt !=

[PATCH] Fix handle_char_store in strlen pass (PR tree-optimization/57230)

2013-05-10 Thread Jakub Jelinek

Hi!

I've apparently missed one spot, where store of zero value was assumed,
but not actually verified.  Fixed thusly, bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk/4.8/4.7?

2013-05-10  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/57230
* tree-ssa-strlen.c (handle_char_store): Add missing integer_zerop
check.

* gcc.dg/strlenopt-23.c: New test.

--- gcc/tree-ssa-strlen.c.jj2013-04-26 08:49:53.0 +0200
+++ gcc/tree-ssa-strlen.c   2013-05-10 08:57:20.654523288 +0200
@@ -1703,7 +1703,7 @@ handle_char_store (gimple_stmt_iterator
   its length may be decreased.  */
adjust_last_stmt (si, stmt, false);
}
-  else if (si != NULL)
+  else if (si != NULL  integer_zerop (gimple_assign_rhs1 (stmt)))
{
  si = unshare_strinfo (si);
  si-length = build_int_cst (size_type_node, 0);
--- gcc/testsuite/gcc.dg/strlenopt-23.c.jj  2013-05-10 09:01:27.808152595 
+0200
+++ gcc/testsuite/gcc.dg/strlenopt-23.c 2013-05-10 09:02:08.042931124 +0200
@@ -0,0 +1,15 @@
+/* PR tree-optimization/57230 */
+/* { dg-do run } */
+/* { dg-options -O2 } */
+
+#include strlenopt.h
+
+int
+main ()
+{
+  char p[] = hello world;
+  p[0] = (char) (strlen (p) - 1);
+  if (strlen (p) != 11)
+abort ();
+  return 0;
+}

Jakub

[PATCH] Improve handle_char_store in strlen pass (PR tree-optimization/57230)

2013-05-10 Thread Jakub Jelinek

Hi!

Related to the previously posted patch, I've noticed that
on the attached testcase we don't optimize the strlen into
constant, even when we easily could.

Implemented thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2013-05-10  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/57230
* tree-ssa-strlen.c (handle_char_store): Record length for
array store from STRING_CST.

* gcc.dg/strlenopt-24.c: New test.

--- gcc/tree-ssa-strlen.c.jj2013-05-10 08:57:20.0 +0200
+++ gcc/tree-ssa-strlen.c   2013-05-10 10:22:03.166458997 +0200
@@ -1740,6 +1740,25 @@ handle_char_store (gimple_stmt_iterator
   if (si != NULL)
si-writable = true;
 }
+  else if (idx == 0
+   TREE_CODE (gimple_assign_rhs1 (stmt)) == STRING_CST
+   ssaname == NULL_TREE
+   TREE_CODE (TREE_TYPE (lhs)) == ARRAY_TYPE)
+{
+  size_t l = strlen (TREE_STRING_POINTER (gimple_assign_rhs1 (stmt)));
+  HOST_WIDE_INT a = int_size_in_bytes (TREE_TYPE (lhs));
+  if (a  0  (unsigned HOST_WIDE_INT) a  l)
+   {
+ int idx = new_addr_stridx (lhs);
+ if (idx != 0)
+   {
+ si = new_strinfo (build_fold_addr_expr (lhs), idx,
+   build_int_cst (size_type_node, l));
+ set_strinfo (idx, si);
+ si-dont_invalidate = true;
+   }
+   }
+}
 
   if (si != NULL  initializer_zerop (gimple_assign_rhs1 (stmt)))
 {
--- gcc/testsuite/gcc.dg/strlenopt-24.c.jj  2013-05-10 10:30:01.343821849 
+0200
+++ gcc/testsuite/gcc.dg/strlenopt-24.c 2013-05-10 10:30:37.736621582 +0200
@@ -0,0 +1,17 @@
+/* PR tree-optimization/57230 */
+/* { dg-do run } */
+/* { dg-options -O2 -fdump-tree-strlen } */
+
+#include strlenopt.h
+
+int
+main ()
+{
+  char p[] = hello world;
+  if (strlen (p) != 11)
+abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times strlen \\( 0 strlen } } *
+/* { dg-final { cleanup-tree-dump strlen } } */

Jakub

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-05-10 Thread Teresa Johnson

On Fri, May 10, 2013 at 4:52 AM, Jan Hubicka hubi...@ucw.cz wrote:
 On 05/07/13 23:13, Teresa Johnson wrote:
 --
 Revised patch that fixes failures encountered when enabling
 -freorder-blocks-and-partition, including the failure reported in PR 53743.
 
 This includes new verification code to ensure no cold blocks dominate hot
 blocks contributed by Steven Bosscher.
 Seems like a reasonable verification; presumably if we have a cold
 block dominating a hot block, then the block/edge frequencies are
 badly broken.  Ah, just saw the comments for the other case where
 this happens.  cold entry, but hot loop inside pushing over the
 barrier. Arguably given a cold block in the dominator graph, all its
 children should have their frequences scaled down to avoid that situation.

 Yep, also note that sanity checking anything about frequencies is really hard.
 There are very many places in compiler that necesarilly need to invalidate
 frequencies in weird ways (at least short of rebuilding the whole profile
 from probabilities again).

Yes, as noted in the comments this was in part due to several places
where counts/frequencies were not kept in sync. Rather than try to fix
all of these, or do any scaling of frequencies, the partitioning code
now just enforces that the partitioning is sane w.r.t. the given
counts. This is done during bb partitioning. The sanity checking
routine was also useful for finding places where optimization passes
were splitting edges and causing hot blocks previously reached by both
hot and cold blocks to become dominated by cold blocks (see comments
in commit_edge_insertions in my patch), and making sure they got fixed
up.

But there is the issue of what we should do in the case of an
infrequent but non-zero entry (marked cold by maybe_hot_count_p
because its count is less than the number of training runs) that leads
to a hot loop. The code I added to the partitioning routine
(find_rarely_executed_basic_blocks_and_crossing_edges) will cause the
entry to also be placed in the hot partition. I would argue this is
the desired behavior - if the routine contains code that is very hot
for, say, 1/2 its training runs, the entry and hot loop (and
everything on the path in between) should be in the hot partition.


 I can't really comment on the cfglayout and related stuff -- it was
 added at a time when I wasn't doing much with GCC and thus I don't
 know much about it.

 I think I can take a look at the cfglayout stuff. Splitting the patch would 
 be great.

Thanks, that would be great. I can split the patch first.


 Honza

 However, I like the changes to record if we've done partitioning and
 checking those instead of flag_reorder_blocks_and_partition.  That's
 simple enough that I'd support pulling it out as a separate patch
 and installing immediately if that can be done so without major
 headaches.

Ok, thanks, will do.


 I think we could do something similar with the code to verify the
 idom of a hot block is also hot.  Though looking at the
 implementation I wonder if it could be simplified by walking the
 dominator tree?  I can't look at it real closely tonight though.

Looking at this code again, I agree with you. It looks like it is
going to walk cold bb's more than once and O(n^2) in the worst case. I
will fix this (there are a couple places in the patch that do a walk
to ensure that this is not violated).


 Could you pull those two logical hunks of work out into individual
 patches.

Will do. The only complication with splitting out the dominance
checking stuff is that there are a number of changes in the patch to
ensure that we don't violate this (hot block can't be dominated by
cold block). I am not sure it makes sense or will be easy to split all
of these out. I think what I will do is try to pull the big related
chunks of them out to a separate patch (the new verification code, the
code to prevent this in the partitioning routine, and
fixup_partitions), but there are going to be a few places in the other
patch that do some fixups related to this (e.g. in rtl_split_edge)
that I would like to leave in the larger correctness patch.

Thanks,
Teresa


 jeff



--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

[PATCH] Fix up rotate expansion

2013-05-10 Thread Jakub Jelinek

Hi!

Our rotate expansion if rotate optab isn't present for the respective
mode looks unsafe for rotations by variable count if that count could
be 0, because then it invokes right or left shift by bitsize.
While on most targets the hw will probably do the right thing
(it is fine if x  32 will either yield x or 0, in both cases
that ored together with x  0 aka x will still yield x), perhaps gcc
could try to optimize based on the fact that undefined behavior can't
happen, so IMHO it is better to generate a safer sequence.

Ok for trunk?

2013-05-10  Jakub Jelinek  ja...@redhat.com

* expmed.c (expand_shift_1): For rotations by const0_rtx just
return shifted.  Use (-op1)  (prec - 1) as other_amount
instead of prec - op1.

--- gcc/expmed.c.jj 2013-05-07 10:26:46.0 +0200
+++ gcc/expmed.c2013-05-10 11:54:34.659987038 +0200
@@ -2166,7 +2166,8 @@ expand_shift_1 (enum tree_code code, enu
{
  /* If we have been unable to open-code this by a rotation,
 do it as the IOR of two shifts.  I.e., to rotate A
-by N bits, compute (A  N) | ((unsigned) A  (C - N))
+by N bits, compute
+(A  N) | ((unsigned) A  ((-N)  (C - 1)))
 where C is the bitsize of A.
 
 It is theoretically possible that the target machine might
@@ -2181,14 +2182,22 @@ expand_shift_1 (enum tree_code code, enu
  rtx temp1;
 
  new_amount = op1;
- if (CONST_INT_P (op1))
+ if (op1 == const0_rtx)
+   return shifted;
+ else if (CONST_INT_P (op1))
other_amount = GEN_INT (GET_MODE_BITSIZE (mode)
- INTVAL (op1));
  else
-   other_amount
- = simplify_gen_binary (MINUS, GET_MODE (op1),
-GEN_INT (GET_MODE_PRECISION (mode)),
-op1);
+   {
+ other_amount
+   = simplify_gen_unary (NEG, GET_MODE (op1),
+ op1, GET_MODE (op1));
+ other_amount
+   = simplify_gen_binary (AND, GET_MODE (op1),
+  other_amount,
+  GEN_INT (GET_MODE_PRECISION (mode)
+   - 1));
+   }
 
  shifted = force_reg (mode, shifted);
 

Jakub

Re: [PATCH] Multilib changes ARM RTEMS

2013-05-10 Thread Joel Sherrill


Committed to 4.7, 4.8 and head.

On 5/8/2013 5:06 AM, Sebastian Huber wrote:

This patch should be applied to all open GCC branches.

This patch removes the mthumb/march=armv7 since it is virtually equal to
the mthumb/march=armv7-m multilib.  Add variants for ARMv7-A and
ARMv7-R.

I tried to use the MULTILIB_REQUIRED option, but this didn't work for
patterns with more than two options.

gcc/ChangeLog
2013-05-08  Sebastian Huber sebastian.hu...@embedded-brains.de

* config/arm/t-rtems-eabi: Remove mthumb/march=armv7 multilib.
Add mthumb/march=armv7-a multilib.
Add mthumb/march=armv7-r multilib.
Add mthumb/march=armv7-a/mfpu=neon/mfloat-abi=hard multilib.
---
  gcc/config/arm/t-rtems-eabi |   51 +-
  1 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/gcc/config/arm/t-rtems-eabi b/gcc/config/arm/t-rtems-eabi
index f0e714a..d81fbf7 100644
--- a/gcc/config/arm/t-rtems-eabi
+++ b/gcc/config/arm/t-rtems-eabi
@@ -1,8 +1,47 @@
  # Custom RTEMS EABI multilibs
  
-MULTILIB_OPTIONS= mthumb march=armv6-m/march=armv7/march=armv7-m

-MULTILIB_DIRNAMES   = thumb armv6-m armv7 armv7-m
-MULTILIB_EXCEPTIONS = march=armv6-m march=armv7 march=armv7-m
-MULTILIB_MATCHES=
-MULTILIB_EXCLUSIONS =
-MULTILIB_OSDIRNAMES =
+MULTILIB_OPTIONS  = mthumb 
march=armv6-m/march=armv7-a/march=armv7-r/march=armv7-m mfpu=neon 
mfloat-abi=hard
+MULTILIB_DIRNAMES = thumb armv6-m armv7-a armv7-r armv7-m neon hard
+
+# Enumeration of multilibs
+
+MULTILIB_EXCEPTIONS =
+MULTILIB_EXCEPTIONS += mthumb/march=armv6-m/mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += mthumb/march=armv6-m/mfpu=neon
+MULTILIB_EXCEPTIONS += mthumb/march=armv6-m/mfloat-abi=hard
+# MULTILIB_EXCEPTIONS += mthumb/march=armv6-m
+# MULTILIB_EXCEPTIONS += mthumb/march=armv7-a/mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += mthumb/march=armv7-a/mfpu=neon
+MULTILIB_EXCEPTIONS += mthumb/march=armv7-a/mfloat-abi=hard
+# MULTILIB_EXCEPTIONS += mthumb/march=armv7-a
+MULTILIB_EXCEPTIONS += mthumb/march=armv7-r/mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += mthumb/march=armv7-r/mfpu=neon
+MULTILIB_EXCEPTIONS += mthumb/march=armv7-r/mfloat-abi=hard
+# MULTILIB_EXCEPTIONS += mthumb/march=armv7-r
+MULTILIB_EXCEPTIONS += mthumb/march=armv7-m/mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += mthumb/march=armv7-m/mfpu=neon
+MULTILIB_EXCEPTIONS += mthumb/march=armv7-m/mfloat-abi=hard
+# MULTILIB_EXCEPTIONS += mthumb/march=armv7-m
+MULTILIB_EXCEPTIONS += mthumb/mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += mthumb/mfpu=neon
+MULTILIB_EXCEPTIONS += mthumb/mfloat-abi=hard
+# MULTILIB_EXCEPTIONS += mthumb
+MULTILIB_EXCEPTIONS += march=armv6-m/mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += march=armv6-m/mfpu=neon
+MULTILIB_EXCEPTIONS += march=armv6-m/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += march=armv6-m
+MULTILIB_EXCEPTIONS += march=armv7-a/mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += march=armv7-a/mfpu=neon
+MULTILIB_EXCEPTIONS += march=armv7-a/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += march=armv7-a
+MULTILIB_EXCEPTIONS += march=armv7-r/mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += march=armv7-r/mfpu=neon
+MULTILIB_EXCEPTIONS += march=armv7-r/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += march=armv7-r
+MULTILIB_EXCEPTIONS += march=armv7-m/mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += march=armv7-m/mfpu=neon
+MULTILIB_EXCEPTIONS += march=armv7-m/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += march=armv7-m
+MULTILIB_EXCEPTIONS += mfpu=neon/mfloat-abi=hard
+MULTILIB_EXCEPTIONS += mfpu=neon
+MULTILIB_EXCEPTIONS += mfloat-abi=hard



--
Joel Sherrill, Ph.D. Director of Research  Development
joel.sherr...@oarcorp.comOn-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available(256) 722-9985

RE: [RFA/ARM/4.7] Fix PR54974: Thumb literal pools don't handle PC rounding

2013-05-10 Thread Joey Ye

 -Original Message-
 From: Ramana Radhakrishnan
 Sent: Friday, May 10, 2013 18:13
 To: Matthew Gretton-Dann
 Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw; d...@canonical.com; Patch
 Tracking; Richard Biener; Joey Ye
 Subject: Re: [RFA/ARM/4.7] Fix PR54974: Thumb literal pools don't handle
PC
 rounding
 
 
  OK for 4.7?
 
 
 
 Ok - I did say ok if no fallout in my original review :)
Tested with Qemu Cortex-M3. No regression.

Committed to 4.7.

- Joey

a tiny LRA patch

2013-05-10 Thread Vladimir Makarov

Analyzing code generated for PPC with LRA, I found that sometimes 
call-saved registers are overused.  It creates long prologs/epilogs.  
The following patch fixes it.


The problem was in that BB frequency might be zero.

Committed to trunk as rev. 198792.

The same patch was committed into lra-mike branch.

2013-05-10  Vladimir Makarov  vmaka...@redhat.com

* lra-assigns.c (find_hard_regno_for): Add 1 to the cost of call
saved registers.

Index: lra-assigns.c
===
--- lra-assigns.c   (revision 198780)
+++ lra-assigns.c   (working copy)
@@ -598,7 +598,7 @@ find_hard_regno_for (int regno, int *cos
 ! df_regs_ever_live_p (hard_regno + j))
  /* It needs save restore.  */
  hard_regno_costs[hard_regno]
-   += 2 * ENTRY_BLOCK_PTR-next_bb-frequency;
+   += 2 * ENTRY_BLOCK_PTR-next_bb-frequency + 1;
  priority = targetm.register_priority (hard_regno);
  if (best_hard_regno  0 || hard_regno_costs[hard_regno]  
best_cost

  || (hard_regno_costs[hard_regno] == best_cost

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-05-10 Thread Teresa Johnson

On Thu, May 9, 2013 at 3:40 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Thu, May 9, 2013 at 11:42 PM, Diego Novillo wrote:
 On 2013-05-08 01:13 , Teresa Johnson wrote:
 -static void
 +void
  emit_barrier_after_bb (basic_block bb)
  {
rtx barrier = emit_barrier_after (BB_END (bb));
 -  BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
 +  if (current_ir_type () == IR_RTL_CFGLAYOUT)
 +BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);


 What if the current IR is not RTL?  Should we fail here?  It doesn't seem
 like it makes sense to call this from gimple, for instance.

 It also makes no sense calling it in IR_RTL_CFGLAYOUT mode. Barriers
 are meaningless in cfglayout mode.

Actually, the change above ensures we can call this routine when *not*
in CFGLAYOUT mode. It was previously only called when we were in
CFGLAYOUT mode (while in bbpart).

(This relates to a conversation we had on an earlier version of the
patch back in the fall, where we discussed this issue of bbpart adding
barriers while in cfglayout mode:
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02996.html).


 -  emit_barrier_after (BB_END (jump_block));
 +  /* We might be in cfg layout mode, and if so, the following routine will
 + insert the barrier correctly.  */
 +  emit_barrier_after_bb (jump_block);

 We're practically always in cfglayout mode, but oh well...

Right, not always, which is why I needed to make this change.


On Fri, May 10, 2013 at 5:05 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Fri, May 10, 2013 at 12:57 AM, Xinliang David Li wrote:
 On Thu, May 9, 2013 at 3:40 PM, Steven Bosscher wrote:

 This patch mixes up things badly from the point of
 what-depends-on-what, the whole approach looks wrong to me.


 Do you mean the 'source file dependency' or 'logical dependency'?

 If the former, the code can be easily refactored to remove the
 dependency. I don't see how the latter can be avoided as bb-partition
 etc does change cfg states and leads to different actions in cfg
 layout finalize.

 I mean logical dependency. cfglayout is just a representation of the
 CFG, and bb-partition is a code transformation. By making
 fixup_reorder_chain emit the note, you' ve put part of the
 transformation into out-of-cfglayout which is just bogus. You also
 don't put GCSE or loop unrolling in out-of-cfglayout, and this change
 is IMHO in the same category: mixing transformations into internal
 representations. That may be a short-term fix but it is a long-term
 maintenance/cleanups nightmare.

The main issue I had here, and why I made this change, is that we go
in and out of cfglayout mode several times after bb partitioning and
then out_of_cfglayout. The problem was that when we subsequently went
in and out of cfglayout mode, the switch text section notes that had
been inserted by bbpart were getting messed up (they were moved into
the bb header when we enter cfglayout mode and then not being
transferred to the correct location upon exit). I investigated trying
to keep those in sync, but it is really difficult/impossible to do
during cfglayout mode when they are in the header. So I simply strip
them out completely on entry to cfglayout mode, and if there were any
there on entry, this change ensures that they are restored in the
appropriate location upon exit. I'm not sure what is a good
alternative?

I found a more detailed description of why I made this change in our
email exchange from the fall:

--
I triggered the same error in 445.gobmk once I applied the
thread_prologue_and_epilogue_insns fixes. This is an assert in the
dwarf CFI code that complains about a NOTE_INSN_SWITCH_TEXTS_SECTION
note not being preceeded by a barrier:

gcc -c -o engine/utils.o -DSPEC_CPU -DNDEBUG -DHAVE_CONFIG_H -I. -I..
-I../include -I./include   -fprofile-use
-freorder-blocks-and-partition -freorder-blocks -ffunction-sections
 -O2  engine/utils.c

engine/utils.c: In function ‘visible_along_edge’:
engine/utils.c:274:1: internal compiler error: in create_pseudo_cfg,
at dwarf2cfi.c:2742
 }
 ^

In this case the switch section note was inside a BB. What I found was
that this was due to several phases going into and back out of
cfglayout mode again. In this case it was the compgotos phase. There
aren't any computed gotos, but this change occurs during
cfg_layout_initialize (in try_optimize_cfg called via cleanup_cfg).
Here it merged two (non-contiguous) blocks that had a
single-successor/single-predecessor relationship. However, the source
block was previously on the section boundary and had a SWITCH note
prior. This note is put into the header of the bb when we go into
cfglayout mode, and ended up inside the new merged block, which was in
any case not on the new border between the hot and cold sections.

The correct solution in my opinion is to strip out the SWITCH note
every time we enter cfglayout mode after bbro, and then invoke
insert_section_boundary_note when leaving cfglayout (if one was found
on entry

Re: [Patch, Fortran] PR57217 - re-add type checks for TBP overriding

2013-05-10 Thread Janus Weil

Hi Tobias,

 GCC 4.7 added some additional checks for type-bound procedure overriding.
 However, doing so it weakened the check whether the nonpass argument has the
 same type.

 While for normal arguments, passing the parent type to an extended type is
 fine, for overriding the type (of nonpass arguments) must be exactly the
 same as in the original type.

in principle I think your idea to tighten up the type check by
symmetrizing it is ok.

However, I'm not sure if the place where you do it is the most
preferable: I think it should rather go inside of
check_dummy_characteristics (since any check for the dummy
characteristics should include the 'strict' type check, and not only a
check of type compatibility, cf. F08:12.3.2.2).

Anyway, there already is a call to 'compare_type_rank' in
'check_dummy_characteristics'. So, you could either just symmetrize
this, or alternatively use a double 'gfc_compare_types' and a separate
check for the rank.

Moreover, I think your current patch misses e.g. the assumed-type
handling that is present in compare_type_rank.

Cheers,
Janus

Re: [PATCH, PR preprocessor/42014] Added LAST_SOURCE_COLUMN in while loop

2013-05-10 Thread Tom Tromey

 Shakthi == Shakthi Kannan skan...@redhat.com writes:

Shakthi 2013-05-10 Shakthi Kannan skan...@redhat.com
Shakthi PR preprocessor/42014
Shakthi * gcc/diagnostic.c: Added LAST_SOURCE_COLUMN in while loop.

You should mention the function name in there.
See the GNU Coding Standards for the format.

Shakthi - ,\n from %s:%d,
Shakthi - LINEMAP_FILE (map), LAST_SOURCE_LINE (map));
Shakthi + ,\n from %s:%d:%d,
Shakthi + LINEMAP_FILE (map),
Shakthi + LAST_SOURCE_LINE (map), LAST_SOURCE_COLUMN 
(map));

Does this cause test suite regressions?

If so then the patch needs fixes there.
If not then the patch needs a new test.

Also, is the column number actually correct?
IIRC some things in cpp still don't get the right column number.

Tom

Re: [Patch, Fortran] PR57217 - re-add type checks for TBP overriding

2013-05-10 Thread Tobias Burnus


Hi Janus,

Janus Weil wrote:

in principle I think your idea to tighten up the type check by
symmetrizing it is ok.


I actually kind of copied it from the proc-pointer assignment check, 
which does the same.



However, I'm not sure if the place where you do it is the most
preferable: I think it should rather go inside of
check_dummy_characteristics (since any check for the dummy
characteristics should include the 'strict' type check, and not only a
check of type compatibility, cf. F08:12.3.2.2).


Okay. Maybe one can at the same time replace the rank+type check by a 
type and a rank check to provide a better error message, i.e. show the 
type for a type mismatch (showing both types)* and the rank for a rank 
mismatch (showing both ranks).


(* best with special handling of no_arg_check)


Moreover, I think your current patch misses e.g. the assumed-type
handling that is present in compare_type_rank.


Good point. Another reason to think about splitting the two.


Would you interested updating this patch? Or should I put it again on my 
growing to-do list?


Tobias

Re: Ping: [PATCH, PR 42371] Remove references to functions from symbol table during inlining

2013-05-10 Thread Martin Jambor

Hi,

I thought I was hitting the ipa-inline-transform.c:263 assert with
this patch when LTO-building Mozilla Firefox but it turned out it was
caused by a different patch so I'm handling that there.

On Thu, May 02, 2013 at 02:37:26PM +0200, Jan Hubicka wrote:
  2013-03-22  Martin Jambor  mjam...@suse.cz
  
  PR middle-end/42371
  * ipa-prop.h (IPA_UNDESCRIBED_USE): New macro.
  (ipa_constant_data): New type.
  (ipa_jump_func): Use ipa_constant_data to hold information about
  constant jump functions.
  (ipa_get_jf_constant): Adjust to jump function type changes.
  (ipa_get_jf_constant_rdesc): New function.
  (ipa_param_descriptor): New field controlled_uses.
  (ipa_get_controlled_uses): New function.
  (ipa_set_controlled_uses): Likewise.
  * ipa-ref.h (ipa_find_reference): Declare.
  * ipa-prop.c (ipa_cst_ref_desc): New type.
  (ipa_print_node_jump_functions_for_edge): Adjust for jump function type
  changes.
  (ipa_set_jf_constant): Likewise.  Also create reference descriptions.
  New parameter cs.  Adjust all callers.
  (ipa_analyze_params_uses): Detect uncontrolled and controlled uses.
  (remove_described_reference): New function.
  (jfunc_rdesc_usable): Likewise.
  (try_make_edge_direct_simple_call): Decrement controlled use count,
  attempt to remove reference if it hits zero.
  (combine_controlled_uses_counters): New function.
  (propagate_controlled_uses): Likewise.
  (ipa_propagate_indirect_call_infos): Call propagate_controlled_uses.
  (ipa_edge_duplication_hook): Duplicate reference descriptions.
  (ipa_print_node_params): Print described use counter.
  (ipa_write_jump_function): Adjust to jump function type changes.
  (ipa_read_jump_function): New parameter CS, pass it to
  ipa_set_jf_constant.  Adjust caller.
  (ipa_write_node_info): Stream controlled use count
  (ipa_read_node_info): Likewise.
  * cgraph.c (cgraph_mark_address_taken_node): Bail out instead of
  asserting.
  * ipa-cp.c (ipcp_discover_new_direct_edges): Decrement controlled use
  count.  Remove cloning-added reference if it reaches zero.
  * ipa-ref.c (ipa_find_reference): New function.
 
 As mentioned offline, I think the patch should be extended to work on 
 references in general,
 so we can do more of promoting to !TREE_ADDRESSABLE. But it can be handled 
 incrementally.

Yeah, adding that is easy.  But since I add no use for that, I
refrained from allocating data for them.

 
  Index: src/gcc/cgraph.c
  ===
  *** src.orig/gcc/cgraph.c
  --- src/gcc/cgraph.c
  *** cgraph_remove_node (struct cgraph_node *
  *** 1409,1415 
void
cgraph_mark_address_taken_node (struct cgraph_node *node)
{
  !   gcc_assert (!node-global.inlined_to);
  /* FIXME: address_taken flag is used both as a shortcut for testing 
  whether
 IPA_REF_ADDR reference exists (and thus it should be set on node
 representing alias we take address of) and as a test whether address
  --- 1409,1418 
void
cgraph_mark_address_taken_node (struct cgraph_node *node)
{
  !   /* Indirect inlining can figure out that all uses of the address are
  !  inlined.  */
  !   if (node-global.inlined_to)
  ! return;
  /* FIXME: address_taken flag is used both as a shortcut for testing 
  whether
 IPA_REF_ADDR reference exists (and thus it should be set on node
 representing alias we take address of) and as a test whether address
 
 Perhaps add assert that after_inlining is set?

I added asserting:
  gcc_assert (cfun-after_inlining);
  gcc_assert (node-callers-indirect_inlining_edge);

 
 Paths is OK,

OK, the updated patch is below (the only change is the extra asserts),
re-bootstrapped and re-tested on x86_64-linux.  I will commit it on
Monday if there are no objections.

Thanks,

Martin


2013-05-09  Martin Jambor  mjam...@suse.cz

PR middle-end/42371
* ipa-prop.h (IPA_UNDESCRIBED_USE): New macro.
(ipa_constant_data): New type.
(ipa_jump_func): Use ipa_constant_data to hold information about
constant jump functions.
(ipa_get_jf_constant): Adjust to jump function type changes.
(ipa_get_jf_constant_rdesc): New function.
(ipa_param_descriptor): New field controlled_uses.
(ipa_get_controlled_uses): New function.
(ipa_set_controlled_uses): Likewise.
* ipa-ref.h (ipa_find_reference): Declare.
* ipa-prop.c (ipa_cst_ref_desc): New type.
(ipa_print_node_jump_functions_for_edge): Adjust for jump function type
changes.
(ipa_set_jf_constant): Likewise.  Also create reference descriptions.
New parameter cs.  Adjust all callers.
(ipa_analyze_params_uses): Detect uncontrolled and controlled uses.
(remove_described_reference): New function.

Re: Add std::unordered_* C++11 allocator support

2013-05-10 Thread Paolo Carlini


.. looks like the prettyprinting code needs again adjustments:

FAIL: libstdc++-prettyprinters/cxx11.cc print uom
FAIL: libstdc++-prettyprinters/cxx11.cc print uomm
FAIL: libstdc++-prettyprinters/cxx11.cc print uos
FAIL: libstdc++-prettyprinters/cxx11.cc print uoms

Thanks,
Paolo.

[PATCH] Consistent and slightly better IPA dumps

2013-05-10 Thread Martin Jambor

Hi,

I have noticed that whereas all of the dumps in ipa-inline*, ipa-cp.c,
ipa-prop.c and elsewhere dump node UIDs, function dump_cgraph_node,
which is the core of call graph dumping, dumps node-symbol.uid.  This
unnecessarily makes it more difficult to see what has happened to a
node in the graph dump or how did a node from an action ended up in
the dump.

The patch below changes this and dumps symbol.uid everywhere where we
currently dump node.uid for the sake of consistency.  There is only
one exception, and that is input_node in lto-cgraph.c where we assign
to the order, so I just made it explicit that we were dumping UIDs.

Apart from this, the patch also dumps the symbol.order in three places
where we currently only print the name, because I reckon it is useful
ands some extra info about indirect edges in jump function dumping.

Bootstrapped and tested on x86_64-linux without any problems.  OK for
trunk?

Thanks,

Martin


2013-05-07  Martin Jambor  mjam...@suse.cz

* ipa-prop.c (ipa_print_node_jump_functions): Print symbol order in
header, print symbol order instead of node uid, print more information
about indirect edge targets.
(ipa_make_edge_direct_to_target): Print symbol order instead of node
uids.
(ipa_make_edge_direct_to_target): Likewise.
(remove_described_reference): Likewise.
(propagate_controlled_uses): Likewise.
(ipa_print_node_params): Also print symbol order.
(ipcp_transform_function): Print symbol order instead of node uids.
* cgraph.c (cgraph_redirect_edge_call_stmt_to_callee): Likewise.
(cgraph_get_create_real_symbol_node): Likewise.
* ipa-cp.c (print_lattice): Likewise.
(print_all_lattices): Likewise.
(determine_versionability): Likewise.
(initialize_node_lattices): Likewise.
(estimate_local_effects): Likewise.
(update_profiling_info): Likewise.
(create_specialized_node): Likewise.
(perhaps_add_new_callers): Likewise.
(decide_about_value): Likewise.
(decide_whether_version_node): Likewise.
(identify_dead_nodes): Likewise.
* ipa-inline-analysis.c (dump_inline_edge_summary): Likewise.
(dump_inline_summary): Likewise.
(estimate_node_size_and_time): Likewise.
(inline_analyze_function): Likewise.
* ipa-inline.c (report_inline_failed_reason): Likewise.
(want_early_inline_function_p): Likewise.
(edge_badness): Likewise.
(update_edge_key): Likewise.
(inline_small_functions): Likewise.  Add dumping of order to two other
dumps.
* ipa-pure-const.c (pure_const_read_summary): Print symbol order
instead of node uids.
(propagate_pure_const): Likewise.
(propagate_pure_const): Likewise.
* ipa-utils.c (dump_cgraph_node_set): Likewise.
* lto-cgraph.c (input_node): Explicitly specify we dump uid.
* lto-symtab.c (lto_cgraph_replace_node): Print symbol order instead
of node uids.
* tree-pretty-print.c (dump_function_header): Likewise.
* tree-sra.c (convert_callers_for_node): Dump in traditional format.
Print symbol order instead of node uids.

lto/
* lto-partition.c (lto_balanced_map): Print symbol order instead
of node uids.

Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -265,34 +265,45 @@ void
 ipa_print_node_jump_functions (FILE *f, struct cgraph_node *node)
 {
   struct cgraph_edge *cs;
-  int i;
 
-  fprintf (f,   Jump functions of caller  %s:\n, cgraph_node_name (node));
+  fprintf (f,   Jump functions of caller  %s/%i:\n, cgraph_node_name (node),
+  node-symbol.order);
   for (cs = node-callees; cs; cs = cs-next_callee)
 {
   if (!ipa_edge_args_info_available_for_edge_p (cs))
continue;
 
   fprintf (f, callsite  %s/%i - %s/%i : \n,
-  xstrdup (cgraph_node_name (node)), node-uid,
-  xstrdup (cgraph_node_name (cs-callee)), cs-callee-uid);
+  xstrdup (cgraph_node_name (node)), node-symbol.order,
+  xstrdup (cgraph_node_name (cs-callee)),
+  cs-callee-symbol.order);
   ipa_print_node_jump_functions_for_edge (f, cs);
 }
 
-  for (cs = node-indirect_calls, i = 0; cs; cs = cs-next_callee, i++)
+  for (cs = node-indirect_calls; cs; cs = cs-next_callee)
 {
+  struct cgraph_indirect_call_info *ii;
   if (!ipa_edge_args_info_available_for_edge_p (cs))
continue;
 
+  ii = cs-indirect_info;
+  if (ii-agg_contents)
+   fprintf (f, indirect aggregate callsite, calling param %i, 
+offset  HOST_WIDE_INT_PRINT_DEC , %s,
+ii-param_index, ii-offset,
+ii-by_ref ? by reference : by_value);
+  else
+   fprintf (f, indirect %s callsite, calling param %i,
+

Re: [Patch, Fortran] PR57217 - re-add type checks for TBP overriding

2013-05-10 Thread Janus Weil

Hi,

 However, I'm not sure if the place where you do it is the most
 preferable: I think it should rather go inside of
 check_dummy_characteristics (since any check for the dummy
 characteristics should include the 'strict' type check, and not only a
 check of type compatibility, cf. F08:12.3.2.2).

 Okay. Maybe one can at the same time replace the rank+type check by a type
 and a rank check to provide a better error message, i.e. show the type for a
 type mismatch (showing both types)* and the rank for a rank mismatch
 (showing both ranks).

yes, actually I also thought that splitting it up might be a good idea.


 Moreover, I think your current patch misses e.g. the assumed-type
 handling that is present in compare_type_rank.

 Good point. Another reason to think about splitting the two.

Right.

However, the question is whether the patch will not become too
involved for backporting.


 Would you interested updating this patch? Or should I put it again on my
 growing to-do list?

My weekend is extremely tight, but I might find a quite hour in the
evenings during the next week (unless you can beat me to it) ...

Cheers,
Janus

[PATCH] Use types_compatible_p in get_binfo_at_offset

2013-05-10 Thread Martin Jambor

Hi,

PR 57084 has alerted me that IN LTO we do not devirtualize calls
during ipa-cp and ipa-inline that we do in non-LTO compilations.  The
reason is that the type comparison in get_binfo_at_offset is failing,
because the two types do not have the same TYPE_MAIN_VARIANT but
TYPE_CANONICAL should be checked instead.  Or better yet, call
types_compatible_p which hopefully will also know what to do in these
cases, which is what the patch below does.

The change increases the number of type-based devirtualizations in
ipa-cp and ipa-inline from 137 to 179 in 483.xalancbmk when compiles
with -O2 -flto -m32 (-m32 is not significant here, it was only
required to trigger the bug and I made the measurements in the same
environment).

Bootstrapped and tested on x86_64-linux without any issues.  OK for
trunk?

Thanks,

Martin


2013-05-10  Martin Jambor  mjam...@suse.cz

* tree.c (get_binfo_at_offset): Use types_compatible_p to compare
types.

Index: src/gcc/tree.c
===
--- src.orig/gcc/tree.c
+++ src/gcc/tree.c
@@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI
   tree fld;
   int i;
 
-  if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type))
+  if (types_compatible_p (type, expected_type))
  return binfo;
   if (offset  0)
return NULL_TREE;

Re: [PATCH] Improve rotation by mode bitsize - 1

2013-05-10 Thread Jan Hubicka

 Hi!
 
 Note, the patch has been already committed.
 
 On Fri, May 10, 2013 at 03:48:50PM +0200, Jan Hubicka wrote:
  Going this route you will also need to update
  
[(set_attr type rotate1)
 (set (attr length_immediate)
   (if_then_else
 (and (match_operand 1 const1_operand)
  (ior (match_test TARGET_SHIFT1)
   (match_test optimize_function_for_size_p (cfun
 (const_string 0)
 (const_string *)))
 (set_attr mode QI)])
  
  computing the immediate size.  Why we can't canonicalize this in
  folding/simplify_rtx/combiner? I do not see much benefit allowing both
  forms of instructions in our insn streams...
 
 Both directions of rotation make sense for variable length rotation, and
 why would the generic simplify-rtx.c code want to prefer one rotate over
 the other direction for constant rotation counts?  That is clearly target
 specific preference.

It seems to me that it is not different from normalizing reg-10 into reg+(-10)
we do for years (and for good reason).  It is still target preference when use
add and when sub to perform the arithmetic, but it makes sense to keep single
canonical form of the expression both in RTL and Gimple.

For example we may want to be able to prove that 
  (rotate reg 31) == (rotatert reg 1)
is true or
  (rotate reg 30) == (rotatert reg 2)
is also true or cross jump both variants into one instruction.

So it seems to me that canonicalizing constant rotations to get the operand
into range 0bitsize/2 makes sense in general and will make the extra
special case in i386.md unnecesary.
 
 What we could do instead of the patch I've committed just tweak expansion
 of the patterns instead, assuming it is unlikely RTL optimizations
 materialize a constant rotation out of non-constant rotations from expansion
 time.
 
 Or the length_immediate attributes could be adjusted, shouldn't be too hard
 (just replace that (match_operand 1 const1_operand) in there with
 (and (match_operand 1 const_int_operand)
  (ior (match_operand 1 const1_operand)
 (match_test INTVAL (operands[1]) == GET_MODE_BITSIZE (MODEmode) - 
 1)))
 or similar (untested).

Yep, we should either update the pattern or add canonicalization.  I still think
the second variant is better...

Honza

Re: Ping: [PATCH, PR 42371] Remove references to functions from symbol table during inlining

2013-05-10 Thread Jan Hubicka

 OK, the updated patch is below (the only change is the extra asserts),
 re-bootstrapped and re-tested on x86_64-linux.  I will commit it on
 Monday if there are no objections.

No objections from my side. Path looks OK.

Thanks,
Honza

Re: [PATCH] Use types_compatible_p in get_binfo_at_offset

2013-05-10 Thread Jan Hubicka

 2013-05-10  Martin Jambor  mjam...@suse.cz
 
   * tree.c (get_binfo_at_offset): Use types_compatible_p to compare
   types.
 
 Index: src/gcc/tree.c
 ===
 --- src.orig/gcc/tree.c
 +++ src/gcc/tree.c
 @@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI
tree fld;
int i;
  
 -  if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type))
 +  if (types_compatible_p (type, expected_type))
As discussed on lunch, it seems fine to me, but I am not an expert ;)  I wonder 
about
the following test few lines bellow:
  /* Offset 0 indicates the primary base, whose vtable contents are
 represented in the binfo for the derived class.  */
  else if (offset != 0)
{
  tree base_binfo, found_binfo = NULL_TREE;
  for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
if (TREE_TYPE (base_binfo) == TREE_TYPE (fld))

don't we want also more relaxed testing?

grep TYPE.* ==.*TYPE  *.c seems to show some extra dubious cases i.e.
in fold-const, tree-ssa-coaelesce and other places where we most probably care
only about semantical equivalence of the type...

Thanks for looking into this!
Honza

[PATCH] Redirect calls to non-functions to builtin_unreachable

2013-05-10 Thread Martin Jambor

Hi,

as we discover targets of previously indirect calls in ipa-inline and
ipa-cp, we sometimes figure out that the targets are not a function.
One typical example is when NULL is passed as a function pointer
parameter, another is when C++ member-pointer points to a virtual
function and the overloaded field of the structure which can also hold
pointers to non-virtual methods contains odd integer constants.

I have gone through all the cases when this happens when LTO building
Mozilla Firefox and verified all of them were OK and legal.  Because
no such call to a non-function can ever occur in a legal program, I
have decided to turn them into calls of builting_unreachable, which
then could help further optimizations.

Because calls to builtin_unreachable is not cgraph_maybe_hot_edge_p,
whereas an indirect call with an unknown destination is, this change
often triggered assertion tree-ipa-inline-transform.c:263.  I have
discussed this with Honza and he has agreed to switch the check off
when there are newly discovered direct edges.

The patch has passed bootstrap and is undergoing testing now.  OK for
trunk if it passes?

Thanks,

Martin


2013-05-09  Martin Jambor  mjam...@suse.cz

* ipa-prop.c (ipa_make_edge_direct_to_target): Redirect calls to
non-functions to builtin_unreachable.
* ipa-inline-transform.c (inline_call): Do not assert estimates were
correct when new direct edges were discovered.

Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -2200,6 +2200,7 @@ ipa_make_edge_direct_to_target (struct c
 {
   struct cgraph_node *callee;
   struct inline_edge_summary *es = inline_edge_summary (ie);
+  bool unreachable = false;
 
   if (TREE_CODE (target) == ADDR_EXPR)
 target = TREE_OPERAND (target, 0);
@@ -2210,12 +2211,17 @@ ipa_make_edge_direct_to_target (struct c
{
  if (dump_file)
fprintf (dump_file, ipa-prop: Discovered direct call to 
non-function
-in %s/%i.\n,
+in %s/%i, making it unreachable.\n,
 cgraph_node_name (ie-caller), ie-caller-symbol.order);
- return NULL;
+ target = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
+ callee = cgraph_get_create_node (target);
+ unreachable = true;
}
+  else
+   callee = cgraph_get_node (target);
 }
-  callee = cgraph_get_node (target);
+  else
+callee = cgraph_get_node (target);
 
   /* Because may-edges are not explicitely represented and vtable may be 
external,
  we may create the first reference to the object in the unit.  */
@@ -2252,7 +2258,7 @@ ipa_make_edge_direct_to_target (struct c
 - eni_size_weights.call_cost);
   es-call_stmt_time -= (eni_time_weights.indirect_call_cost
 - eni_time_weights.call_cost);
-  if (dump_file)
+  if (dump_file  !unreachable)
 {
   fprintf (dump_file, ipa-prop: Discovered %s call to a known target 
   (%s/%i - %s/%i), for stmt ,
Index: src/gcc/ipa-inline-transform.c
===
--- src.orig/gcc/ipa-inline-transform.c
+++ src/gcc/ipa-inline-transform.c
@@ -260,7 +260,7 @@ inline_call (struct cgraph_edge *e, bool
 #ifdef ENABLE_CHECKING
   /* Verify that estimated growth match real growth.  Allow off-by-one
  error due to INLINE_SIZE_SCALE roudoff errors.  */
-  gcc_assert (!update_overall_summary || !overall_size
+  gcc_assert (!update_overall_summary || !overall_size || new_edges_found
  || abs (estimated_growth - (new_size - old_size)) = 1
  /* FIXME: a hack.  Edges with false predicate are accounted
 wrong, we should remove them from callgraph.  */

[Google] Suppress message when primary module entry cannot found

2013-05-10 Thread Dehao Chen

Now we don't store the module info if the module is not exported or
has any aux module (to compress the profile data size). Thus it's
normal that a primary module entry cannot be found. This patch
suppresses the messages printed when the primary module is not found.

Bootstrapped and passed regression test.

OK for google branch?

Thanks,
Dehao

Index: auto-profile.c
===
--- auto-profile.c (revision 198751)
+++ auto-profile.c (working copy)
@@ -497,10 +497,7 @@ read_aux_modules (void)
   module.name = xstrdup (in_fnames[0]);
   entry = (struct afdo_module *) htab_find (module_htab, module);
   if (!entry)
-{
-  inform (0, primary module %s cannot be found., in_fnames[0]);
-  return;
-}
+return;
   module_infos = XCNEWVEC (struct gcov_module_info *,
entry-num_aux_modules + 1);
   afdo_add_module (module_infos, entry, true);

Re: [PATCH] Use types_compatible_p in get_binfo_at_offset

2013-05-10 Thread Martin Jambor

Hi,

On Fri, May 10, 2013 at 07:24:06PM +0200, Jan Hubicka wrote:
  2013-05-10  Martin Jambor  mjam...@suse.cz
  
  * tree.c (get_binfo_at_offset): Use types_compatible_p to compare
  types.
  
  Index: src/gcc/tree.c
  ===
  --- src.orig/gcc/tree.c
  +++ src/gcc/tree.c
  @@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI
 tree fld;
 int i;
   
  -  if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type))
  +  if (types_compatible_p (type, expected_type))
 As discussed on lunch, it seems fine to me, but I am not an expert ;)  I 
 wonder about
 the following test few lines bellow:
   /* Offset 0 indicates the primary base, whose vtable contents are
  represented in the binfo for the derived class.  */
   else if (offset != 0)
 {
   tree base_binfo, found_binfo = NULL_TREE;
   for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
 if (TREE_TYPE (base_binfo) == TREE_TYPE (fld))
 
 don't we want also more relaxed testing?

we most probably do, good catch.  I'll update the patch, re-test and
commit it early next week unless someone objects.

 
 grep TYPE.* ==.*TYPE  *.c seems to show some extra dubious cases i.e.
 in fold-const, tree-ssa-coaelesce and other places where we most probably care
 only about semantical equivalence of the type...
 

Yeah, examinig those is also probably a good idea.

Thanks,

Martin

Re: [PATCH] Redirect calls to non-functions to builtin_unreachable

2013-05-10 Thread Jeff Law


On 05/10/2013 11:27 AM, Martin Jambor wrote:

Hi,

as we discover targets of previously indirect calls in ipa-inline and
ipa-cp, we sometimes figure out that the targets are not a function.
One typical example is when NULL is passed as a function pointer
parameter, another is when C++ member-pointer points to a virtual
function and the overloaded field of the structure which can also hold
pointers to non-virtual methods contains odd integer constants.

I have gone through all the cases when this happens when LTO building
Mozilla Firefox and verified all of them were OK and legal.  Because
no such call to a non-function can ever occur in a legal program, I
have decided to turn them into calls of builting_unreachable, which
then could help further optimizations.

Because calls to builtin_unreachable is not cgraph_maybe_hot_edge_p,
whereas an indirect call with an unknown destination is, this change
often triggered assertion tree-ipa-inline-transform.c:263.  I have
discussed this with Honza and he has agreed to switch the check off
when there are newly discovered direct edges.

The patch has passed bootstrap and is undergoing testing now.  OK for
trunk if it passes?

Thanks,

Martin


2013-05-09  Martin Jambor  mjam...@suse.cz

* ipa-prop.c (ipa_make_edge_direct_to_target): Redirect calls to
non-functions to builtin_unreachable.
* ipa-inline-transform.c (inline_call): Do not assert estimates were
correct when new direct edges were discovered.

This is good.  Please install.

Thanks,
Jeff

Re: [Google] Suppress message when primary module entry cannot found

2013-05-10 Thread Teresa Johnson

Is it only auto fdo that doesn't store the module info if the module
is not exported or has aux modules? Note that this will prevent usage
of my script that enables annotation of gcov dump info with function
names, which relies on accessing the primary module info to obtain the
module name (which is then used to locate the module's associated func
id to func name mapping optionally emitted into the build output).

Teresa

On Fri, May 10, 2013 at 10:37 AM, Dehao Chen de...@google.com wrote:
 Now we don't store the module info if the module is not exported or
 has any aux module (to compress the profile data size). Thus it's
 normal that a primary module entry cannot be found. This patch
 suppresses the messages printed when the primary module is not found.

 Bootstrapped and passed regression test.

 OK for google branch?

 Thanks,
 Dehao

 Index: auto-profile.c
 ===
 --- auto-profile.c (revision 198751)
 +++ auto-profile.c (working copy)
 @@ -497,10 +497,7 @@ read_aux_modules (void)
module.name = xstrdup (in_fnames[0]);
entry = (struct afdo_module *) htab_find (module_htab, module);
if (!entry)
 -{
 -  inform (0, primary module %s cannot be found., in_fnames[0]);
 -  return;
 -}
 +return;
module_infos = XCNEWVEC (struct gcov_module_info *,
 entry-num_aux_modules + 1);
afdo_add_module (module_infos, entry, true);



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [PATCH] Dynamic dispatch of multiversioned functions and CPU mocks for code coverage.

2013-05-10 Thread Sriraman Tallam

On Fri, May 10, 2013 at 6:34 AM, Joseph S. Myers
jos...@codesourcery.com wrote:
 On Thu, 9 May 2013, Sriraman Tallam wrote:

 Then, I plan to add the following hooks to libgcc (in a different patch) :

 int set_mock_cpu_is (const char *cpu);
 int set_mock_cpu_supports (const char *isa);
 int init_mock_cpu (); // Clear the values of the mock cpu.

 Those names are in the user's namespace; I think libgcc should only
 provide or use symbols in the implementation namespace.

Shall I just use __builtin prefixes for these too?, would that work?

Thanks
Sri


 --
 Joseph S. Myers
 jos...@codesourcery.com

Re: [Google] Suppress message when primary module entry cannot found

2013-05-10 Thread Dehao Chen

On Fri, May 10, 2013 at 10:47 AM, Teresa Johnson tejohn...@google.com wrote:
 Is it only auto fdo that doesn't store the module info if the module
 is not exported or has aux modules? Note that this will prevent usage
 of my script that enables annotation of gcov dump info with function
 names, which relies on accessing the primary module info to obtain the
 module name (which is then used to locate the module's associated func
 id to func name mapping optionally emitted into the build output).

Yes, this is auto fdo only. And gcov dump will not work on auto fdo
profile because it does not use the id at all in the profile (instead,
it stores the function name directly).

Thanks,
Dehao


 Teresa

 On Fri, May 10, 2013 at 10:37 AM, Dehao Chen de...@google.com wrote:
 Now we don't store the module info if the module is not exported or
 has any aux module (to compress the profile data size). Thus it's
 normal that a primary module entry cannot be found. This patch
 suppresses the messages printed when the primary module is not found.

 Bootstrapped and passed regression test.

 OK for google branch?

 Thanks,
 Dehao

 Index: auto-profile.c
 ===
 --- auto-profile.c (revision 198751)
 +++ auto-profile.c (working copy)
 @@ -497,10 +497,7 @@ read_aux_modules (void)
module.name = xstrdup (in_fnames[0]);
entry = (struct afdo_module *) htab_find (module_htab, module);
if (!entry)
 -{
 -  inform (0, primary module %s cannot be found., in_fnames[0]);
 -  return;
 -}
 +return;
module_infos = XCNEWVEC (struct gcov_module_info *,
 entry-num_aux_modules + 1);
afdo_add_module (module_infos, entry, true);



 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [Google] Suppress message when primary module entry cannot found

2013-05-10 Thread Xinliang David Li

ok. Would be nicer if there is a way to tell this from other error cases though.

David

On Fri, May 10, 2013 at 11:00 AM, Dehao Chen de...@google.com wrote:
 On Fri, May 10, 2013 at 10:47 AM, Teresa Johnson tejohn...@google.com wrote:
 Is it only auto fdo that doesn't store the module info if the module
 is not exported or has aux modules? Note that this will prevent usage
 of my script that enables annotation of gcov dump info with function
 names, which relies on accessing the primary module info to obtain the
 module name (which is then used to locate the module's associated func
 id to func name mapping optionally emitted into the build output).

 Yes, this is auto fdo only. And gcov dump will not work on auto fdo
 profile because it does not use the id at all in the profile (instead,
 it stores the function name directly).

 Thanks,
 Dehao


 Teresa

 On Fri, May 10, 2013 at 10:37 AM, Dehao Chen de...@google.com wrote:
 Now we don't store the module info if the module is not exported or
 has any aux module (to compress the profile data size). Thus it's
 normal that a primary module entry cannot be found. This patch
 suppresses the messages printed when the primary module is not found.

 Bootstrapped and passed regression test.

 OK for google branch?

 Thanks,
 Dehao

 Index: auto-profile.c
 ===
 --- auto-profile.c (revision 198751)
 +++ auto-profile.c (working copy)
 @@ -497,10 +497,7 @@ read_aux_modules (void)
module.name = xstrdup (in_fnames[0]);
entry = (struct afdo_module *) htab_find (module_htab, module);
if (!entry)
 -{
 -  inform (0, primary module %s cannot be found., in_fnames[0]);
 -  return;
 -}
 +return;
module_infos = XCNEWVEC (struct gcov_module_info *,
 entry-num_aux_modules + 1);
afdo_add_module (module_infos, entry, true);



 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

[RFC] Make the new var decl STATIC in function dw2_output_indirect_constant_1

2013-05-10 Thread Carrot Wei

Hi

In function dw2_output_indirect_constant_1 a new var decl is created. Only
When the variable is not PUBLIC it is allocated static storage. Does anybody
know why the variable is not allocated static storage by marking TREE_STATIC
when it is PUBLIC?

The following patch marks the STATIC flag in all cases. It can pass bootstrap
and regression test on x86-64.

Any comments?

thanks
Carrot


2013-05-09  Guozhi Wei  car...@google.com

* dwarf2asm.c (dw2_output_indirect_constant_1): Mark new decl STATIC.


Index: dwarf2asm.c
===
--- dwarf2asm.c (revision 198794)
+++ dwarf2asm.c (working copy)
@@ -906,6 +906,7 @@
   DECL_IGNORED_P (decl) = 1;
   DECL_INITIAL (decl) = decl;
   TREE_READONLY (decl) = 1;
+  TREE_STATIC (decl) = 1;

   if (TREE_PUBLIC (id))
 {
@@ -914,8 +915,6 @@
   if (USE_LINKONCE_INDIRECT)
  DECL_VISIBILITY (decl) = VISIBILITY_HIDDEN;
 }
-  else
-TREE_STATIC (decl) = 1;

   sym_ref = gen_rtx_SYMBOL_REF (Pmode, sym);
   assemble_variable (decl, 1, 1, 1);

Re: [Google] Suppress message when primary module entry cannot found

2013-05-10 Thread Dehao Chen

On Fri, May 10, 2013 at 11:32 AM, Teresa Johnson tejohn...@google.com wrote:
 On Fri, May 10, 2013 at 11:00 AM, Dehao Chen de...@google.com wrote:
 On Fri, May 10, 2013 at 10:47 AM, Teresa Johnson tejohn...@google.com 
 wrote:
 Is it only auto fdo that doesn't store the module info if the module
 is not exported or has aux modules? Note that this will prevent usage
 of my script that enables annotation of gcov dump info with function
 names, which relies on accessing the primary module info to obtain the
 module name (which is then used to locate the module's associated func
 id to func name mapping optionally emitted into the build output).

 Yes, this is auto fdo only. And gcov dump will not work on auto fdo
 profile because it does not use the id at all in the profile (instead,
 it stores the function name directly).

 If you are trying to reduce profile sizes, it might be worthwhile to
 try to use the id instead of the name. Before implementing the wrapper
 script solution to getting function names in the gcov-info dump I
 first made a patch to include the function names in the gcda file.
 This caused a huge bloat in gcda and instrumented file sizes.

In AutoFDO, we use names (file names, function names) to match the
binary level profile back to source. We do this because:

1. We don't do instrumentation, thus there is no id info in the binary/profile
2. This makes the profile loosely coupled with profile collection runs

For the size issue, we only stores profile for those functions with
sample counts larger than a threshold. Right now, a typical profile
file is around 10M~20M.

Thanks,
Dehao


 Teresa


 Thanks,
 Dehao


 Teresa

 On Fri, May 10, 2013 at 10:37 AM, Dehao Chen de...@google.com wrote:
 Now we don't store the module info if the module is not exported or
 has any aux module (to compress the profile data size). Thus it's
 normal that a primary module entry cannot be found. This patch
 suppresses the messages printed when the primary module is not found.

 Bootstrapped and passed regression test.

 OK for google branch?

 Thanks,
 Dehao

 Index: auto-profile.c
 ===
 --- auto-profile.c (revision 198751)
 +++ auto-profile.c (working copy)
 @@ -497,10 +497,7 @@ read_aux_modules (void)
module.name = xstrdup (in_fnames[0]);
entry = (struct afdo_module *) htab_find (module_htab, module);
if (!entry)
 -{
 -  inform (0, primary module %s cannot be found., in_fnames[0]);
 -  return;
 -}
 +return;
module_infos = XCNEWVEC (struct gcov_module_info *,
 entry-num_aux_modules + 1);
afdo_add_module (module_infos, entry, true);



 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413



 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

[PATCH 7/n, i386]: Improve memory attribute handling in vec_extract patterns

2013-05-10 Thread Uros Bizjak

Hello!

2013-05-10  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.md (memory): Handle sseishft1.
* config/i386/sse.md (*vec_extractv4si): Remove memory attribute.
(*vec_extractv2di_1): Ditto.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 198780)
+++ i386.md (working copy)
@@ -615,8 +615,9 @@
 !alu1,negnot,ishift1,
   imov,imovx,icmp,test,bitmanip,
   fmov,fcmp,fsgn,
-  sse,ssemov,ssecmp,ssecomi,ssecvt,ssecvt1,sseicvt,sselog1,
-  sseshuf1,sseadd1,sseiadd1,mmx,mmxmov,mmxcmp,mmxcvt)
+  sse,ssemov,ssecmp,ssecomi,ssecvt,ssecvt1,sseicvt,
+  sselog1,sseshuf1,sseadd1,sseiadd1,sseishft1,
+  mmx,mmxmov,mmxcmp,mmxcvt)
  (match_operand 2 memory_operand))
   (const_string load)
 (and (eq_attr type icmov,ssemuladd,sse4arg)
Index: sse.md
===
--- sse.md  (revision 198780)
+++ sse.md  (working copy)
@@ -7387,7 +7387,6 @@
 }
   [(set_attr isa *,noavx,avx)
(set_attr type sselog1,sseishft1,sseishft1)
-   (set_attr memory *,none,none)
(set_attr prefix_extra 1,*,*)
(set_attr length_immediate 1)
(set_attr prefix maybe_vex,orig,vex)
@@ -7446,7 +7445,6 @@
   [(set_attr isa x64_sse4,*,sse2_noavx,avx,noavx,*,x64)
(set_attr type sselog1,ssemov,sseishft1,sseishft1,ssemov,ssemov,imov)
(set_attr length_immediate 1,*,1,1,*,*,*)
-   (set_attr memory *,*,none,none,*,*,*)
(set_attr prefix_rex 1,*,*,*,*,*,*)
(set_attr prefix_extra 1,*,*,*,*,*,*)
(set_attr prefix maybe_vex,maybe_vex,orig,vex,orig,*,*)

Re: [PATCH] Use types_compatible_p in get_binfo_at_offset

2013-05-10 Thread Jan Hubicka

 Martin Jambor mjam...@suse.cz wrote:
 
 Hi,
 
 On Fri, May 10, 2013 at 07:24:06PM +0200, Jan Hubicka wrote:
   2013-05-10  Martin Jambor  mjam...@suse.cz
   
* tree.c (get_binfo_at_offset): Use types_compatible_p to compare
types.
   
   Index: src/gcc/tree.c
   ===
   --- src.orig/gcc/tree.c
   +++ src/gcc/tree.c
   @@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI
  tree fld;
  int i;

   -  if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT
 (expected_type))
   +  if (types_compatible_p (type, expected_type))
  As discussed on lunch, it seems fine to me, but I am not an expert ;)
 
 This doesn't look good to me. With lto all types that are structurally 
 equivalent will compare compatible this way. Probably not what you want.

Yep, that was my concern, too.  I wonder how to recognize the two types
represent the same object.  Perhaps by adding an assiciated vtable (via binfos)
into the structural match while merging?

Honza

Re: wide-int.[ch] review

2013-05-10 Thread Mike Stump

On May 10, 2013, at 5:12 AM, Richard Biener rguent...@suse.de wrote:
 Somewhere we talked about naming new C++ files wide-int.cc
 (or was it .C or .cxx?)

for the C++ library:

$ find . -name \*.cc | wc -l
5605
$ find . -name \*.C | wc -l
1

:-)

Go also prefers .cc.

Re: [Patch:RL78] Fix hardware multiply on G13 target

2013-05-10 Thread DJ Delorie


   movwax, %h2
   movw0x2, ax ; MDAH
   nop ; mdc += mdal * mdah
 + nop ; Additional nop for MAC
  
   mov a, #0x40
   mov !0xf00e8, a ; MDUC

Shouldn't the MOV after the nop add the extra cycle here?

   movwax, %H2
   movw0x2, ax ; MDAH
   nop ; mdc += mdal * mdah
 + nop ; Additional nop for MAC
   movwax, !0xf00e0; MDCL
   movw%H0, ax

Yeah, this one is probably needed.

Re: [PATCH] Use types_compatible_p in get_binfo_at_offset

2013-05-10 Thread Richard Biener

Jan Hubicka hubi...@ucw.cz wrote:

 Martin Jambor mjam...@suse.cz wrote:
 
 Hi,
 
 On Fri, May 10, 2013 at 07:24:06PM +0200, Jan Hubicka wrote:
   2013-05-10  Martin Jambor  mjam...@suse.cz
   
   * tree.c (get_binfo_at_offset): Use types_compatible_p to
compare
   types.
   
   Index: src/gcc/tree.c
  
===
   --- src.orig/gcc/tree.c
   +++ src/gcc/tree.c
   @@ -11483,7 +11483,7 @@ get_binfo_at_offset (tree binfo, HOST_WI
  tree fld;
  int i;

   -  if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT
 (expected_type))
   +  if (types_compatible_p (type, expected_type))
  As discussed on lunch, it seems fine to me, but I am not an expert
;)
 
 This doesn't look good to me. With lto all types that are
structurally equivalent will compare compatible this way. Probably not
what you want.

Yep, that was my concern, too.  I wonder how to recognize the two types
represent the same object.  Perhaps by adding an assiciated vtable (via
binfos)
into the structural match while merging?

The existing check should work ok with lto. If not then we should figure out 
why we do not merge the main variants properly.

Richard.

Honza

sparc64--rtems* should not define svr4

2013-05-10 Thread Joel Sherrill


Hi

sparc64*-*-rtems* ends up with __svr4__ defined. The attached
patch corrects that.

OK to apply?

013-05-10  Joel Sherrill  joel.sherr...@oarcorp.com

* config.gcc (sparc64*-*-rtems*): Use sp64-rtemself.h.
RTEMS target should not have -D__svr4__ in CPP_SUBTARGET_SPEC.
* config/sparc/sp64-rtemself.h: New file.

--
Joel Sherrill, Ph.D. Director of Research  Development
joel.sherr...@oarcorp.comOn-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available(256) 722-9985

2013-05-10  Joel Sherrill  joel.sherr...@oarcorp.com

* config.gcc (sparc64*-*-rtems*): Use sp64-rtemself.h
RTEMS target should not have -D__svr4__ in CPP_SUBTARGET_SPEC.
* config/sparc/sp64-rtemself.h: New file.

Index: gcc/config/sparc/sp64-rtemself.h
===
--- gcc/config/sparc/sp64-rtemself.h(revision 0)
+++ gcc/config/sparc/sp64-rtemself.h(working copy)
@@ -0,0 +1,37 @@
+/* Definitions for rtems targeting a SPARC64 using ELF.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Joel Sherrill (joel.sherr...@oarcorp.com).
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+/* Target OS builtins.  */
+#undef TARGET_OS_CPP_BUILTINS
+#define TARGET_OS_CPP_BUILTINS()   \
+  do   \
+{  \
+   builtin_define (__rtems__);   \
+   builtin_define (__USE_INIT_FINI__);   \
+   builtin_assert (system=rtems);\
+}  \
+  while (0)
+
+/* Use the default */
+#undef LINK_GCC_C_SEQUENCE_SPEC
+
+/* we are not svr4 */
+#undef CPP_SUBTARGET_SPEC
+#define CPP_SUBTARGET_SPEC 
Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 198775)
+++ gcc/config.gcc  (working copy)
@@ -2516,7 +2516,7 @@
tmake_file=${tmake_file} sparc/t-sparc
;;
 sparc64-*-rtems*)
-   tm_file=${tm_file} dbxelf.h elfos.h newlib-stdint.h sparc/sysv4.h 
sparc/sp64-elf.h sparc/rtemself.h rtems.h
+   tm_file=${tm_file} dbxelf.h elfos.h newlib-stdint.h sparc/sysv4.h 
sparc/sp64-elf.h sparc/sp64-rtemself.h rtems.h
extra_options=${extra_options}
tmake_file=${tmake_file} sparc/t-sparc sparc/t-rtems-64 t-rtems
;;

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-05-10 Thread Steven Bosscher

On Fri, May 10, 2013 at 5:54 PM, Teresa Johnson wrote:
 The main issue I had here, and why I made this change, is that we go
 in and out of cfglayout mode several times after bb partitioning and
 then out_of_cfglayout. The problem was that when we subsequently went
 in and out of cfglayout mode, the switch text section notes that had
 been inserted by bbpart were getting messed up (they were moved into
 the bb header when we enter cfglayout mode and then not being
 transferred to the correct location upon exit).

 I investigated trying
 to keep those in sync, but it is really difficult/impossible to do
 during cfglayout mode when they are in the header. So I simply strip
 them out completely on entry to cfglayout mode, and if there were any
 there on entry, this change ensures that they are restored in the
 appropriate location upon exit. I'm not sure what is a good
 alternative?

The problem is that the note exists at all. I'd like to see the note
go away completely eventually (when we have a CFG all the way through
pass_final). As a stop-gap, we should not emit the note until
pass_free_cfg. Up to that point the basic blocks tell what partition
they're in, and all hot block and cold blocks should be in a sequence.
So during pass_free_cfg, just walk the basic blocks chain until
there's a partition change, and emit the note between those two
blocks.


 I triggered the same error in 445.gobmk once I applied the
 thread_prologue_and_epilogue_insns fixes. This is an assert in the
 dwarf CFI code that complains about a NOTE_INSN_SWITCH_TEXTS_SECTION
 note not being preceeded by a barrier:

The problem here is that fixup_reorder_chain should force a jump for
basic blocks in one partition falling through to bb-next. That should
be dealt with in can_fallthru, which should return false if
BB_PARTITION (target) != BB_PARTITION (src).


 The correct solution in my opinion is to strip out the SWITCH note
 every time we enter cfglayout mode after bbro, and then invoke
 insert_section_boundary_note when leaving cfglayout (if one was found
 on entry to that cfglayout mode) to reapply it.

Please make the note go away completely before pass_free_cfg, and earn
greater admiration than Zeus. The note always was wrong, and now
you've shown it's also a problem.


 * Fixup redirected edges that did not cross partitions before but
 apparently do after redirection. This is not supposed to happen in the
 first place, so fixing up any of this just papers over an error
 elsewhere in the compiler.

 Looking back through the earlier email exchanges from last fall I
 found some discussion on this where I had found a couple places
 causing this to happen. Two were in
 thread_prologue_and_epilogue_insns: when we duplicated tail blocks or
 created a new block to hold a simple return, and redirected some of
 the edges to the new copy. In some cases this caused edges that were
 previously region crossing to become non-region crossing, and in other
 cases the reverse happened. I think there are a couple of other cases
 mentioned in the emails too, but I would need to do some more digging
 to find them all.

 There were a few places in the code that tried to detect this type of
 issue, and fix them up, but they weren't consistent and there were
 other places that had the same issue. I've centralized all that
 handling in this patch (fixup_partitions and fixup_partition_crossing,
 called from a few places where we redirect edges) so that it is more
 consistent and comprehensive.

Right, I think it's good that you've centralized this code. But it
seems to me that we should make sure that the hot blocks and cold
blocks are still grouped (i.e. there is only one basic block B such
that BB_PARTITION(B) != BB_PARTITION(B-next_bb). That is something
your code doesn't handle, AFAICT. It's just one thing that's difficult
to maintain, probably there are others. It's also something the
partitioning verifier should check, i.e. that the basic blocks in hot
and cold partitions are properly grouped.


BTW1: I don't understand this comment:

 +  /* Invoke the cleanup again once we are out of cfg layout mode
 + after committing the final bb layout above. This enables removal
 + of forwarding blocks across the hot/cold section boundary when
 + splitting is enabled that were necessary while in cfg layout
 + mode.  */
 +  if (crtl-has_bb_partition)
 +cleanup_cfg (CLEANUP_EXPENSIVE);

There shouldn't be any forwarder blocks in cfg layout mode. What did
you need this for?

BTW2: We badly need to figure out a way to create test cases for FDO... :-(

Ciao!
Steven

Re: [PATCH] Dynamic dispatch of multiversioned functions and CPU mocks for code coverage.

2013-05-10 Thread Joseph S. Myers

On Fri, 10 May 2013, Sriraman Tallam wrote:

 On Fri, May 10, 2013 at 6:34 AM, Joseph S. Myers
 jos...@codesourcery.com wrote:
  On Thu, 9 May 2013, Sriraman Tallam wrote:
 
  Then, I plan to add the following hooks to libgcc (in a different patch) :
 
  int set_mock_cpu_is (const char *cpu);
  int set_mock_cpu_supports (const char *isa);
  int init_mock_cpu (); // Clear the values of the mock cpu.
 
  Those names are in the user's namespace; I think libgcc should only
  provide or use symbols in the implementation namespace.
 
 Shall I just use __builtin prefixes for these too?, would that work?

I'm not sure if that's a good idea for something that's actually a library 
function (we've previously discussed rejecting explicit declarations of 
__builtin_* identifiers to some extent - see bug 32455 - which would be an 
issue for defining library functions with such a name if we do decide in 
future to reject such declarations), but use __ prefixes in some form, 
certainly.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-05-10 Thread Jan Hubicka

 There shouldn't be any forwarder blocks in cfg layout mode. What did
 you need this for?
 
 BTW2: We badly need to figure out a way to create test cases for FDO... :-(

We have gcc.dg/tree-prof and friends.  What do you need to add?

Honza

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-05-10 Thread Steven Bosscher

On Fri, May 10, 2013 at 11:10 PM, Jan Hubicka hubi...@ucw.cz wrote:
 There shouldn't be any forwarder blocks in cfg layout mode. What did
 you need this for?

 BTW2: We badly need to figure out a way to create test cases for FDO... :-(

 We have gcc.dg/tree-prof and friends.  What do you need to add?

Something to more easily reproduce bugs and derive test cases from them.

Ciao!
Steven

Re: web ICEs on subreg

2013-05-10 Thread Steven Bosscher

On Fri, May 10, 2013 at 11:51 PM, Mike Stump wrote:

 -   if (DF_REF_LOC (*ref) == recog_data.operand_loc[op])
 - break;
 +   {
 + if (DF_REF_LOC (*ref) == recog_data.operand_loc[op])
 +   break;
 + /* DF_REF_LOC can be (subreg:DI (reg:TI 5) 8) and
 +recog_data.operand_loc[op] can be (reg:TI 5), and the above
 +won't find it.  */
 + if (GET_CODE (*DF_REF_LOC (*ref)) == SUBREG 
 + SUBREG_REG (*DF_REF_LOC (*ref)) == recog_data.operand_loc[op])

Congrats, you've re-invented DF_REF_REAL_LOC.

Ciao!
Steven

Re: web ICEs on subreg

2013-05-10 Thread Steven Bosscher

On Fri, May 10, 2013 at 11:51 PM, Mike Stump wrote:
 I have a instruction pattern for my port that reads in part:

   [(parallel [(set (subreg:DI (match_operand:TI 0 register_operand =r) 0)
(mem:DI (plus:DI (match_operand:DI 1 register_operand 
 r)
 (match_operand:DI 2 const_int_operand 
 
   (set (subreg:DI (match_dup 0) 8)
(mem:DI (plus:DI (match_dup 1)
 (match_operand:DI 3 
 const_int_operand])]

 The subreg seems to give web.c fits.  web is looking for the match_dup and 
 failing to find it, due to the fact that DF_REF_LOC is of the form (subreg:DI 
 (reg:TI 5) 8), not (reg:TI 5).  If I add a case for SUBREG like in the below, 
 we can then find and process the subreg.

 So, the question is, is the port completely wrong for trying to use subreg in 
 this fashion?  If not, is processing subregs in this way the right way?  If 
 yes, Ok?

I don't think this port is completely wrong. I've looked at some other
ports and there are comparable patterns, e.g. in m68k:

;; We need a separate DEFINE_EXPAND for u?mulsidi3 to be able to use the
;; proper matching constraint.  This is because the matching is between
;; the high-numbered word of the DImode operand[0] and operand[1].
(define_expand umulsidi3
  [(parallel
[(set (subreg:SI (match_operand:DI 0 register_operand ) 4)
  (mult:SI (match_operand:SI 1 register_operand )
   (match_operand:SI 2 register_operand )))
 (set (subreg:SI (match_dup 0) 0)
  (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (match_dup 1))
 (zero_extend:DI (match_dup 2)))
(const_int 32])]
  TARGET_68020  !TUNE_68060  !TARGET_COLDFIRE
  )

Your web.c patch looks correct to me, but I can't approve it.

Ciao!
Steven

Re: web ICEs on subreg

2013-05-10 Thread Mike Stump

On May 10, 2013, at 3:29 PM, Steven Bosscher stevenb@gmail.com wrote:
 Your web.c patch looks correct to me, but I can't approve it.

Thanks.  Now that you point out the DF accessor, it all makes perfect sense.  
:-)  I've fixed one other instance that has to be as wrong for all the same 
reasons.  Though my port doesn't trip the other case from the test suite, if my 
port used an IN/OUT, I bet I could make it trip it.

Ok?

2013-05-10  Mike Stump  mikest...@comcast.net

* web.c (union_match_dups): Use DF_REF_REAL_LOC instead.

Doing diffs in web.c.~1~:
--- web.c.~1~   2013-01-14 10:39:36.0 -0800
+++ web.c   2013-05-10 15:45:02.0 -0700
@@ -132,13 +132,13 @@ union_match_dups (rtx insn, struct web_e
   ref = type == OP_IN ? use_link : def_link;
   entry = type == OP_IN ? use_entry : def_entry;
   for (; *ref; ref++)
-   if (DF_REF_LOC (*ref) == recog_data.operand_loc[op])
+   if (DF_REF_REAL_LOC (*ref) == recog_data.operand_loc[op])
  break;
 
   if (!*ref  type == OP_INOUT)
{
  for (ref = use_link, entry = use_entry; *ref; ref++)
-   if (DF_REF_LOC (*ref) == recog_data.operand_loc[op])
+   if (DF_REF_REAL_LOC (*ref) == recog_data.operand_loc[op])
  break;
}
 
--

Re: [PATCH] Improve handle_char_store in strlen pass (PR tree-optimization/57230)

2013-05-10 Thread Jeff Law


On 05/10/2013 08:49 AM, Jakub Jelinek wrote:

Hi!

Related to the previously posted patch, I've noticed that
on the attached testcase we don't optimize the strlen into
constant, even when we easily could.

Implemented thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2013-05-10  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/57230
* tree-ssa-strlen.c (handle_char_store): Record length for
array store from STRING_CST.

* gcc.dg/strlenopt-24.c: New test.

OK.

A bit surprised we weren't already handling this...

jeff

Re: [PATCH] Fix up rotate expansion

2013-05-10 Thread Jeff Law


On 05/10/2013 08:53 AM, Jakub Jelinek wrote:

Hi!

Our rotate expansion if rotate optab isn't present for the respective
mode looks unsafe for rotations by variable count if that count could
be 0, because then it invokes right or left shift by bitsize.
While on most targets the hw will probably do the right thing
(it is fine if x  32 will either yield x or 0, in both cases
that ored together with x  0 aka x will still yield x), perhaps gcc
could try to optimize based on the fact that undefined behavior can't
happen, so IMHO it is better to generate a safer sequence.

Ok for trunk?

2013-05-10  Jakub Jelinek  ja...@redhat.com

* expmed.c (expand_shift_1): For rotations by const0_rtx just
return shifted.  Use (-op1)  (prec - 1) as other_amount
instead of prec - op1.
Found by inspection?  Presumably the rotate was synthesized by GCC from 
some other set of operations.  To be optimizable, we'd have to prove the 
original sequence triggered undefined behaviour.


Seems that we ought to have a testcase, even though it probably means 
scanning the tree dumps to pick up the undefined behaviour.  Approved 
with a testcase.



Out of curiosity, do we recognize the original sequence created by 
expand_shift_1 as a rotate if they appeared in the original source?




jeff

[patch] struct resources: remove unch_memory member

2013-05-10 Thread Steven Bosscher

Hello,

This unch_memory in struct resources is a left-over from
RTX_UNCHANGING_P, but it looks like the change-over to MEM_READONLY_P
was done incorrectly: The resource_conflicts_p code now reports
conflicts for insns reading readonly memory and insns reading normal
memory or no memory at all.

Spotted by checking why reorg.c failed to fill some slots that my
sched-deps based delay slot scheduler managed to fill.

Bootstrappedtested on sparc64-unknown-linux-gnu. OK for trunk?

Ciao!
Steven
* resource.h (struct resources): Remove unch_memory member.
(CLEAR_RESOURCE): Don't clear unch_memory.
* resource.c (mark_referenced_resources): Don't set it.
(mark_set_resources): Likewise.
(mark_target_live_regs): Don't clear it.
(init_resource_info): Likewise.
* reorg.c (resource_conflicts_p): Don't compare it.
(redundant_insn): Don't set it.

Index: resource.h
===
--- resource.h  (revision 198737)
+++ resource.h  (working copy)
@@ -25,14 +25,13 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Macro to clear all resources.  */
 #define CLEAR_RESOURCE(RES)\
- do { (RES)-memory = (RES)-unch_memory = (RES)-volatil = (RES)-cc = 0; \
+ do { (RES)-memory = (RES)-volatil = (RES)-cc = 0; \
   CLEAR_HARD_REG_SET ((RES)-regs); } while (0)
 
 /* The resources used by a given insn.  */
 struct resources
 {
   char memory; /* Insn sets or needs a memory location.  */
-  char unch_memory;/* Insn sets or needs an unchanging MEM.  */
   char volatil;/* Insn sets or needs a volatile memory loc.  */
   char cc; /* Insn sets or needs the condition codes.  */
   HARD_REG_SET regs;   /* Which registers are set or needed.  */
Index: resource.c
===
--- resource.c  (revision 198737)
+++ resource.c  (working copy)
@@ -240,9 +240,7 @@ mark_referenced_resources (rtx x, struct resources
 case MEM:
   /* If this memory shouldn't change, it really isn't referencing
 memory.  */
-  if (MEM_READONLY_P (x))
-   res-unch_memory = 1;
-  else
+  if (! MEM_READONLY_P (x))
res-memory = 1;
   res-volatil |= MEM_VOLATILE_P (x);
 
@@ -740,7 +738,6 @@ mark_set_resources (rtx x, struct resources *res,
   if (in_dest)
{
  res-memory = 1;
- res-unch_memory |= MEM_READONLY_P (x);
  res-volatil |= MEM_VOLATILE_P (x);
}
 
@@ -896,7 +893,7 @@ mark_target_live_regs (rtx insns, rtx target, stru
 
   /* We have to assume memory is needed, but the CC isn't.  */
   res-memory = 1;
-  res-volatil = res-unch_memory = 0;
+  res-volatil = 0;
   res-cc = 0;
 
   /* See if we have computed this value already.  */
@@ -1145,7 +1142,6 @@ init_resource_info (rtx epilogue_insn)
 
   end_of_function_needs.cc = 0;
   end_of_function_needs.memory = 1;
-  end_of_function_needs.unch_memory = 0;
   CLEAR_HARD_REG_SET (end_of_function_needs.regs);
 
   if (frame_pointer_needed)
Index: reorg.c
===
--- reorg.c (revision 198737)
+++ reorg.c (working copy)
@@ -276,7 +276,6 @@ static int
 resource_conflicts_p (struct resources *res1, struct resources *res2)
 {
   if ((res1-cc  res2-cc) || (res1-memory  res2-memory)
-  || (res1-unch_memory  res2-unch_memory)
   || res1-volatil || res2-volatil)
 return 1;
 
@@ -1542,7 +1541,6 @@ redundant_insn (rtx insn, rtx target, rtx delay_li
   /* Insns we pass may not set either NEEDED or SET, so merge them for
  simpler tests.  */
   needed.memory |= set.memory;
-  needed.unch_memory |= set.unch_memory;
   IOR_HARD_REG_SET (needed.regs, set.regs);
 
   /* This insn isn't redundant if it conflicts with an insn that either is

Re: cfgexpand.c patch for [was new port: msp430-elf]

2013-05-10 Thread DJ Delorie


  Note that I had to make a few changes (fixes?)  in the MI portions of
  gcc to avoid problems I encountered, I don't know if these changes are
  correct or if there are better ways to avoid those cases.  Those
 
 In any case, they should best be posted in separate messages, each one 
 with its own rationale.

Here's the first of those...  The patch assumes that, by definition,
a partial int mode has fewer bits than an int mode of the same size,
and thus truncation should be used to go from the int mode to the
partial int mode.

Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c (revision 198591)
+++ gcc/cfgexpand.c (working copy)
@@ -3090,13 +3090,17 @@ expand_debug_expr (tree exp)
 size_t, we need to check for mis-matched modes and correct
 the addend.  */
   if (op0  op1
   GET_MODE (op0) != VOIDmode  GET_MODE (op1) != VOIDmode
   GET_MODE (op0) != GET_MODE (op1))
{
- if (GET_MODE_BITSIZE (GET_MODE (op0))  GET_MODE_BITSIZE (GET_MODE 
(op1)))
+ if (GET_MODE_BITSIZE (GET_MODE (op0))  GET_MODE_BITSIZE (GET_MODE 
(op1))
+ /* Don't try to sign-extend SImode to PSImode, for example.  */
+ || (GET_MODE_BITSIZE (GET_MODE (op0)) == GET_MODE_BITSIZE 
(GET_MODE (op1))
+  GET_MODE_CLASS (GET_MODE (op0)) == MODE_PARTIAL_INT
+  GET_MODE_CLASS (GET_MODE (op1)) == MODE_INT))
op1 = simplify_gen_unary (TRUNCATE, GET_MODE (op0), op1,
  GET_MODE (op1));
  else
/* We always sign-extend, regardless of the signedness of
   the operand, because the operand is always unsigned
   here even if the original C expression is signed.  */

Re: dwarf2cfi.c patch for [was new port: msp430-elf]

2013-05-10 Thread DJ Delorie


  Note that I had to make a few changes (fixes?)  in the MI portions of
  gcc to avoid problems I encountered, I don't know if these changes are
  correct or if there are better ways to avoid those cases.  Those
 
 In any case, they should best be posted in separate messages, each one 
 with its own rationale.

Here's the second of those...

Index: dwarf2cfi.c
===
--- dwarf2cfi.c (revision 198591)
+++ dwarf2cfi.c (working copy)
@@ -277,12 +277,19 @@ expand_builtin_init_dwarf_reg_sizes (tre
{
  if (save_mode == VOIDmode)
continue;
  wrote_return_column = true;
}
  size = GET_MODE_SIZE (save_mode);
+
+  /* Entries in the dwarf_reg_size_table must be big enough to hold an 
_Unwind_Word
+ even if this is bigger than reg_raw_mode.  This can happen on 
targets where the
+ pointer size is larger than the integer size, and not a 
power-of-two.  (Eg MSP430).  */
+  if (size  GET_MODE_SIZE (targetm.unwind_word_mode ()))
+size = GET_MODE_SIZE (targetm.unwind_word_mode ());
+
  if (offset  0)
continue;
 
  emit_move_insn (adjust_address (mem, mode, offset),
  gen_int_mode (size, mode));
}

Re: [PATCH] Use types_compatible_p in get_binfo_at_offset

2013-05-10 Thread Jan Hubicka

 
 The existing check should work ok with lto. If not then we should figure out 
 why we do not merge the main variants properly.
Hmm, adding:
Index: tree.c
===
--- tree.c  (revision 198796)
+++ tree.c  (working copy)
@@ -11572,6 +11572,12 @@ get_binfo_at_offset (tree binfo, HOST_WI
 
   if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type))
  return binfo;
+  else
+   if (types_compatible_p (type, expected_type))
+ {
+   debug_tree (TYPE_MAIN_VARIANT (type));
+   debug_tree (TYPE_MAIN_VARIANT (expected_type));
+ }
   if (offset  0)
return NULL_TREE;
 
@@ -11605,6 +11611,12 @@ get_binfo_at_offset (tree binfo, HOST_WI
found_binfo = base_binfo;
break;
  }
+ else
+   if (types_compatible_p (TREE_TYPE (base_binfo), TREE_TYPE 
(fld)))
+ {
+   debug_tree (TREE_TYPE (base_binfo));
+   debug_tree (TREE_TYPE (fld));
+ }
  if (!found_binfo)
return NULL_TREE;
  binfo = found_binfo;

seems to bring a lot of positives and by quick inspection most of them seem
like same classes.  I will try to find some time to debug this more. But
glancing over the the dumps, I see many of them just have different name
spaces.  Do we even attempt to merge namespace_decl? How types from same
namespaces in different units are supposed to match?

Honza
 
 Richard.
 
 Honza

Re: [PATCH] Use types_compatible_p in get_binfo_at_offset

2013-05-10 Thread Jan Hubicka

 seems to bring a lot of positives and by quick inspection most of them seem
 like same classes.  I will try to find some time to debug this more. But
 glancing over the the dumps, I see many of them just have different name
 spaces.  Do we even attempt to merge namespace_decl? How types from same
 namespaces in different units are supposed to match?
Just as example
 record_type 0x7ff129ff6bd0 X86Assembler addressable needs-constructing BLK
size integer_cst 0x7ff136ee2760 type integer_type 0x7ff1370af0a8 
bitsizetype constant 2368
unit size integer_cst 0x7ff136ee2780 type integer_type 0x7ff1370af000 
sizetype constant 296
align 64 symtab 0 alias set -1 canonical type 0x7ff136efc498
fields field_decl 0x7ff129fe2850 D.752229
type record_type 0x7ff129ff6150 GenericAssembler needs-constructing TI
size integer_cst 0x7ff13709eda0 constant 128
unit size integer_cst 0x7ff13709edc0 constant 16
align 64 symtab 0 alias set -1 canonical type 0x7ff136ec6b28 fields 
field_decl 0x7ff129fe2098 printer context namespace_decl 0x7ff129e45e60 JSC
pointer_to_this pointer_type 0x7ff129c7f888 chain type_decl 
0x7ff129fd7e60 GenericAssembler
ignored TI file 
/abuild/jh/mozilla-central2/mozilla-central/js/src/assembler/assembler/X86Assembler.h
 line 140 col 7
size integer_cst 0x7ff137227400 constant 72
unit size integer_cst 0x7ff136eb50a0 constant 9
align 64 offset_align 128
offset integer_cst 0x7ff13709ed60 constant 0
bit offset integer_cst 0x7ff13709ede0 constant 0 context record_type 
0x7ff129ff6bd0 X86Assembler
chain field_decl 0x7ff129fe28e8 m_formatter type record_type 
0x7ff129ff6b28 X86InstructionFormatter
used private nonlocal BLK file 
/abuild/jh/mozilla-central2/mozilla-central/js/src/assembler/assembler/X86Assembler.h
 line 3207 col 7
size integer_cst 0x7ff136e64d00 constant 2240
unit size integer_cst 0x7ff136e64d20 constant 280
align 64 offset_align 128 offset integer_cst 0x7ff13709edc0 16 
bit offset integer_cst 0x7ff13709ede0 0 context record_type 0x7ff129ff6bd0 
X86Assembler chain type_decl 0x7ff129ff75c0 X86Assembler context 
namespace_decl 0x7ff129e45e60 JSC
pointer_to_this pointer_type 0x7ff129c7fa80 chain type_decl 
0x7ff129ff7508 X86Assembler
 record_type 0x7ff12c9e7690 X86Assembler addressable needs-constructing BLK
size integer_cst 0x7ff136ee2760 type integer_type 0x7ff1370af0a8 
bitsizetype constant 2368
unit size integer_cst 0x7ff136ee2780 type integer_type 0x7ff1370af000 
sizetype constant 296
align 64 symtab 0 alias set -1 canonical type 0x7ff136efc498
fields field_decl 0x7ff12db56000 D.789661
type record_type 0x7ff12e6b4000 GenericAssembler needs-constructing TI
size integer_cst 0x7ff13709eda0 constant 128
unit size integer_cst 0x7ff13709edc0 constant 16
align 64 symtab 0 alias set -1 canonical type 0x7ff136ec6b28 fields 
field_decl 0x7ff12c72e8e8 printer context namespace_decl 0x7ff1315587e8 JSC
pointer_to_this pointer_type 0x7ff12c6cb738 chain type_decl 
0x7ff13155f228 GenericAssembler
ignored TI file 
/abuild/jh/mozilla-central2/mozilla-central/js/src/assembler/assembler/X86Assembler.h
 line 140 col 0
size integer_cst 0x7ff137227400 constant 72
unit size integer_cst 0x7ff136eb50a0 constant 9
align 64 offset_align 128
offset integer_cst 0x7ff13709ed60 constant 0
bit offset integer_cst 0x7ff13709ede0 constant 0 context record_type 
0x7ff12c9e7690 X86Assembler
chain field_decl 0x7ff12db56098 m_formatter type record_type 
0x7ff12c9e75e8 X86InstructionFormatter
used private nonlocal BLK file 
/abuild/jh/mozilla-central2/mozilla-central/js/src/assembler/assembler/X86Assembler.h
 line 3207 col 0
size integer_cst 0x7ff136e64d00 constant 2240
unit size integer_cst 0x7ff136e64d20 constant 280
align 64 offset_align 128 offset integer_cst 0x7ff13709edc0 16 
bit offset integer_cst 0x7ff13709ede0 0 context record_type 0x7ff12c9e7690 
X86Assembler chain type_decl 0x7ff12c673678 X86Assembler context 
namespace_decl 0x7ff1315587e8 JSC
pointer_to_this pointer_type 0x7ff12c6cbf18 chain type_decl 
0x7ff12c6732e0 X86Assembler

Honza
 
 Honza
  
  Richard.
  
  Honza

Re: web ICEs on subreg

2013-05-10 Thread Kenneth Zadeck


Assuming the patch has been tested on a public port, it is ok for commit.

kenny
On 05/10/2013 06:52 PM, Mike Stump wrote:

On May 10, 2013, at 3:29 PM, Steven Bosscher stevenb@gmail.com wrote:

Your web.c patch looks correct to me, but I can't approve it.

Thanks.  Now that you point out the DF accessor, it all makes perfect sense.  
:-)  I've fixed one other instance that has to be as wrong for all the same 
reasons.  Though my port doesn't trip the other case from the test suite, if my 
port used an IN/OUT, I bet I could make it trip it.

Ok?

Re: wide-int.[ch] review

2013-05-10 Thread Ian Lance Taylor

On Fri, May 10, 2013 at 5:12 AM, Richard Biener rguent...@suse.de wrote:

 Somewhere we talked about naming new C++ files wide-int.cc
 (or was it .C or .cxx?)

Definitely not .C.

I prefer .cc.

Ian

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2013-05-10 Thread Teresa Johnson

On Fri, May 10, 2013 at 2:00 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Fri, May 10, 2013 at 5:54 PM, Teresa Johnson wrote:
 The main issue I had here, and why I made this change, is that we go
 in and out of cfglayout mode several times after bb partitioning and
 then out_of_cfglayout. The problem was that when we subsequently went
 in and out of cfglayout mode, the switch text section notes that had
 been inserted by bbpart were getting messed up (they were moved into
 the bb header when we enter cfglayout mode and then not being
 transferred to the correct location upon exit).

 I investigated trying
 to keep those in sync, but it is really difficult/impossible to do
 during cfglayout mode when they are in the header. So I simply strip
 them out completely on entry to cfglayout mode, and if there were any
 there on entry, this change ensures that they are restored in the
 appropriate location upon exit. I'm not sure what is a good
 alternative?

 The problem is that the note exists at all. I'd like to see the note
 go away completely eventually (when we have a CFG all the way through
 pass_final). As a stop-gap, we should not emit the note until
 pass_free_cfg. Up to that point the basic blocks tell what partition
 they're in, and all hot block and cold blocks should be in a sequence.
 So during pass_free_cfg, just walk the basic blocks chain until
 there's a partition change, and emit the note between those two
 blocks.

Ok, let me take a stab at that.



 I triggered the same error in 445.gobmk once I applied the
 thread_prologue_and_epilogue_insns fixes. This is an assert in the
 dwarf CFI code that complains about a NOTE_INSN_SWITCH_TEXTS_SECTION
 note not being preceeded by a barrier:

 The problem here is that fixup_reorder_chain should force a jump for
 basic blocks in one partition falling through to bb-next. That should
 be dealt with in can_fallthru, which should return false if
 BB_PARTITION (target) != BB_PARTITION (src).

That wasn't the issue in this case.

Here there was a block that happened to be laid out at the very start
of the cold section (it was jumped to from elsewhere, not reached via
fall through from its layout predecessor). Thus it was preceded by a
switch section note, which was put into the bb header when we entered
cfglayout mode for compgoto. The note ended up in the middle of the
block when we did some block combining with its cfg predecessor (not
the block that preceded it in the layout chain, which was the last hot
block in the reorder chain).

So the issue here wasn't making sure there is no fall-through across
section boundaries. There was already code to prevent this, but my
patch contains a few fixes to make sure that is always happening.



 The correct solution in my opinion is to strip out the SWITCH note
 every time we enter cfglayout mode after bbro, and then invoke
 insert_section_boundary_note when leaving cfglayout (if one was found
 on entry to that cfglayout mode) to reapply it.

 Please make the note go away completely before pass_free_cfg, and earn
 greater admiration than Zeus. The note always was wrong, and now
 you've shown it's also a problem.

Ok, I will try. =)



 * Fixup redirected edges that did not cross partitions before but
 apparently do after redirection. This is not supposed to happen in the
 first place, so fixing up any of this just papers over an error
 elsewhere in the compiler.

 Looking back through the earlier email exchanges from last fall I
 found some discussion on this where I had found a couple places
 causing this to happen. Two were in
 thread_prologue_and_epilogue_insns: when we duplicated tail blocks or
 created a new block to hold a simple return, and redirected some of
 the edges to the new copy. In some cases this caused edges that were
 previously region crossing to become non-region crossing, and in other
 cases the reverse happened. I think there are a couple of other cases
 mentioned in the emails too, but I would need to do some more digging
 to find them all.

 There were a few places in the code that tried to detect this type of
 issue, and fix them up, but they weren't consistent and there were
 other places that had the same issue. I've centralized all that
 handling in this patch (fixup_partitions and fixup_partition_crossing,
 called from a few places where we redirect edges) so that it is more
 consistent and comprehensive.

 Right, I think it's good that you've centralized this code. But it
 seems to me that we should make sure that the hot blocks and cold
 blocks are still grouped (i.e. there is only one basic block B such
 that BB_PARTITION(B) != BB_PARTITION(B-next_bb). That is something
 your code doesn't handle, AFAICT. It's just one thing that's difficult
 to maintain, probably there are others. It's also something the
 partitioning verifier should check, i.e. that the basic blocks in hot
 and cold partitions are properly grouped.

Actually, there is already code that verifies this at the

Re: [PATCH v2] Test case for PR55033

2013-05-10 Thread Chung-Ju Wu

2013/5/10 Sebastian Huber sebastian.hu...@embedded-brains.de:
 v2: Format changes

 gcc/testsuite/ChangeLog
 2013-05-10  Sebastian Huber  sebastian.hu...@embedded-brains.de

 PR target/55033
 * gcc.target/powerpc/pr55033.c: New.
 ---
  gcc/testsuite/gcc.target/powerpc/pr55033.c |   23 +++
  1 files changed, 23 insertions(+), 0 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr55033.c

 diff --git a/gcc/testsuite/gcc.target/powerpc/pr55033.c 
 b/gcc/testsuite/gcc.target/powerpc/pr55033.c
 new file mode 100644
 index 000..2c1835e
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/powerpc/pr55033.c
 @@ -0,0 +1,23 @@
 +/* { dg-do compile } */
 +/* { dg-require-effective-target powerpc_eabi_ok } */
 +/* { dg-options -mcpu=8540 -msoft-float -msdata=eabi -G 8 -fno-common } */
 +
 +extern void f (void);
 +
 +struct s
 +{
 +  int *p;
 +  int *q;
 +};
 +
 +extern int a;
 +
 +extern const struct s c;
 +
 +const struct s c = { a, 0 };
 +
 +void
 +f (void)
 +{
 +  char buf[4] = { 0, 1, 2, 3 };
 +}
 --
 1.7.7


It seems that you forgot to include powerpc maintainer in the CC list.
Let me do this for you. :)


Best regards,
jasonwucj

87 matches

Mail list logo