date:20130305

Re: __sdivsi3_i4i and __udivsi3_i4i called for sh2 variant.

2013-03-05 Thread Yoshinori Sato

  Uses a lookup table for divisors in the range -128 .. +128, and
 
 The code that you have enabled in lib1funcs.S will utilize dynamic shift
 instructions, which are not available on SH1 or SH2.  Maybe your target
 HW is SH2A which has dynamic shift instructions and you haven't noticed
 a problem?
 Adding __SH2A__ instead of __SH2__ should be fine though.
 
 If I'm not mistaken, the __sdivsi3_i4i and __udivsi3_i4i division
 functions will be used by the compiler if the -mdiv=call-table option is
 used.  The compiler should reject 'call-table' for SH targets that don't
 have dynamic shifts ... in sh.c there is a check...
 
   else if (! strcmp (sh_div_str, "call-table") && TARGET_SH2)
   sh_div_strategy = SH_DIV_CALL_TABLE;
 
 ... which is not quite complete.
 I will prepare a patch for this.
 
 Cheers,
 Oleg
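
As a rough illustration of the stricter check discussed above, here is a standalone sketch. It is not the code from sh.c: the enum, the struct and the has_dynamic_shift field are hypothetical names invented for this example, and the fallback strategy is chosen arbitrarily.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the SH option handling; these names are
   illustrative and are not the real definitions from sh.c.  */
enum sh_div_strategy_sketch { DIV_CALL_DIV1, DIV_CALL_TABLE };

struct sh_target_sketch
{
  bool is_sh2_or_later;     /* roughly what TARGET_SH2 tests */
  bool has_dynamic_shift;   /* assumed true for e.g. SH2A */
};

/* Accept "call-table" only when the target has dynamic shifts, since the
   library routines it relies on use them.  Otherwise fall back.  */
static enum sh_div_strategy_sketch
select_div_strategy (const char *div_str, const struct sh_target_sketch *t)
{
  assert (div_str != NULL && t != NULL);
  if (strcmp (div_str, "call-table") == 0
      && t->is_sh2_or_later && t->has_dynamic_shift)
    return DIV_CALL_TABLE;
  return DIV_CALL_DIV1;
}
```

With this shape, a plain SH2 target asking for call-table would silently get the fallback; a real patch might prefer to emit a diagnostic instead.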

Re: [PATCH] Fix vect_create_epilog_for_reduction memory leaks (PR middle-end/56461)

2013-03-05 Thread Richard Biener

On Mon, 4 Mar 2013, Jakub Jelinek wrote:

 Hi!
 
 vect_create_epilog_for_reduction leaks memory both from the inner_phis
 vector not being released for double_reduc, and also for stmt_vec_info
 it creates (because those are added for stmts added into exit_bb, i.e.
 after loop, which destroy_loop_vec_info doesn't free).  Fixed thusly,
 bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

 2013-03-04  Jakub Jelinek  ja...@redhat.com
 
   PR middle-end/56461
   * tree-vect-stmts.c (free_stmt_vec_info_vec): Call
   free_stmt_vec_info on any left-over stmt_vec_info in the vector.
   * tree-vect-loop.c (vect_create_epilog_for_reduction): Release
   inner_phis vector.
 
 --- gcc/tree-vect-stmts.c.jj  2013-03-04 11:07:33.0 +0100
 +++ gcc/tree-vect-stmts.c 2013-03-04 12:14:16.111393716 +0100
 @@ -5969,6 +5969,11 @@ init_stmt_vec_info_vec (void)
  void
  free_stmt_vec_info_vec (void)
  {
 +  unsigned int i;
 +  vec_void_p info;
 +  FOR_EACH_VEC_ELT (stmt_vec_info_vec, i, info)
 +if (info != NULL)
 +  free_stmt_vec_info (STMT_VINFO_STMT ((stmt_vec_info) info));
gcc_assert (stmt_vec_info_vec.exists ());
stmt_vec_info_vec.release ();
  }
 --- gcc/tree-vect-loop.c.jj   2013-03-04 11:01:48.0 +0100
 +++ gcc/tree-vect-loop.c  2013-03-04 12:17:09.934351015 +0100
 @@ -4487,8 +4487,9 @@ vect_finalize_reduction:
  }
  
scalar_results.release ();
 +  inner_phis.release ();
new_phis.release ();
 -} 
 +}
  
  
  /* Function vectorizable_reduction.
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

Re: [PATCH] Fix discover_iteration_bound_by_body_walk memory leaks (PR middle-end/56461)

2013-03-05 Thread Richard Biener

On Mon, 4 Mar 2013, Jakub Jelinek wrote:

 Hi!
 
 This function was releasing only some vectors pushed into queues vector, not
 all, and wasn't releasing bounds vector.  Fixed thusly.  There is no need to
 use a typedef for the C++ish vec.h vectors, and the code can be tiny bit
 simplified.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

 2013-03-04  Jakub Jelinek  ja...@redhat.com
 
   PR middle-end/56461
   * tree-ssa-loop-niter.c (bb_queue): Remove typedef.
   (discover_iteration_bound_by_body_walk): Change queues to
   vec<vec<basic_block> > and queue to vec<basic_block>.  Fix up
   spelling in comment.  Call safe_push on queues[bound_index] directly.
   Release queues[queue_index] in every iteration unconditionally.
   Release bounds vector.
 
 --- gcc/tree-ssa-loop-niter.c.jj  2013-02-27 23:05:07.0 +0100
 +++ gcc/tree-ssa-loop-niter.c 2013-03-04 14:57:37.380872029 +0100
 @@ -3007,9 +3007,6 @@ bound_index (vec<double_int> bounds, dou
gcc_unreachable ();
  }
  
 -/* Used to hold vector of queues of basic blocks bellow.  */
 -typedef vec<basic_block> bb_queue;
 -
  /* We recorded loop bounds only for statements dominating loop latch (and 
 thus
 executed each loop iteration).  If there are any bounds on statements not
 dominating the loop latch we can improve the estimate by walking the loop
 @@ -3022,8 +3019,8 @@ discover_iteration_bound_by_body_walk (s
pointer_map_t *bb_bounds;
struct nb_iter_bound *elt;
vec<double_int> bounds = vNULL;
 -  vec<bb_queue> queues = vNULL;
 -  bb_queue queue = bb_queue();
 +  vec<vec<basic_block> > queues = vNULL;
 +  vec<basic_block> queue = vNULL;
ptrdiff_t queue_index;
ptrdiff_t latch_index = 0;
pointer_map_t *block_priority;
 @@ -3096,7 +3093,7 @@ discover_iteration_bound_by_body_walk (s
   present in the path and we look for path with largest smallest bound
   on it.
  
 - To avoid the need for fibonaci heap on double ints we simply compress
 + To avoid the need for fibonacci heap on double ints we simply compress
   double ints into indexes to BOUNDS array and then represent the queue
   as arrays of queues for every index.
   Index of BOUNDS.length() means that the execution of given BB has
 @@ -3162,16 +3159,11 @@ discover_iteration_bound_by_body_walk (s
   }
   
 if (insert)
 - {
 -   bb_queue queue2 = queues[bound_index];
 -   queue2.safe_push (e->dest);
 -   queues[bound_index] = queue2;
 - }
 + queues[bound_index].safe_push (e->dest);
   }
   }
   }
 -  else
 - queues[queue_index].release ();
 +  queues[queue_index].release ();
  }
  
gcc_assert (latch_index >= 0);
 @@ -3187,6 +3179,7 @@ discover_iteration_bound_by_body_walk (s
  }
  
queues.release ();
 +  bounds.release ();
pointer_map_destroy (bb_bounds);
pointer_map_destroy (block_priority);
  }
 
   Jakub
 
 


Re: [PATCH] Fix vect_supported_load_permutation_p memory leak (PR middle-end/56461)

2013-03-05 Thread Richard Biener

On Mon, 4 Mar 2013, Jakub Jelinek wrote:

 Hi!
 
 When returning true, load_index sbitmap is released, but not when returning
 false.  Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
 ok for trunk?

Ok.

Thanks,
Richard.

 2013-03-04  Jakub Jelinek  ja...@redhat.com
 
   PR middle-end/56461
   * tree-vect-slp.c (vect_supported_load_permutation_p): Free
   load_index sbitmap even if some bit in it isn't set.
 
 --- gcc/tree-vect-slp.c.jj    2013-02-28 22:19:57.0 +0100
 +++ gcc/tree-vect-slp.c   2013-03-04 15:01:48.441490311 +0100
 @@ -1429,7 +1429,10 @@ vect_supported_load_permutation_p (slp_i
   
for (j = 0; j < group_size; j++)
  if (!bitmap_bit_p (load_index, j))
 -  return false;
 +  {
 + sbitmap_free (load_index);
 + return false;
 +  }
  
sbitmap_free (load_index);
  
 
   Jakub
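
The leak pattern fixed above (a scratch bitmap freed on the success path but not on the early "return false" path) can be avoided by routing both outcomes through one release point. The following is only a sketch of that idea: a byte array stands in for the sbitmap, and all function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Counts live scratch buffers so the sketch can show nothing leaks.  */
static int live_buffers = 0;

static unsigned char *
scratch_alloc (size_t n)
{
  unsigned char *p = calloc (n ? n : 1, 1);
  assert (p != NULL);
  live_buffers++;
  return p;
}

static void
scratch_free (unsigned char *p)
{
  free (p);
  live_buffers--;
}

/* Returns true if every index in [0, group_size) occurs in LOADS.
   The scratch array is released on both the failing and the
   succeeding path.  */
static bool
all_indices_seen (const unsigned *loads, size_t n_loads, size_t group_size)
{
  unsigned char *seen = scratch_alloc (group_size);
  bool ok = true;

  for (size_t i = 0; i < n_loads; i++)
    if (loads[i] < group_size)
      seen[loads[i]] = 1;

  for (size_t j = 0; j < group_size; j++)
    if (!seen[j])
      {
        ok = false;
        break;
      }

  scratch_free (seen);   /* single release point covers both outcomes */
  return ok;
}
```

The patch itself keeps the early return and frees before it; the single-exit form here is just an alternative that makes the pairing easier to audit.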
 
 


Re: [PATCH] Fix PR56478

2013-03-05 Thread Marek Polacek

On Fri, Mar 01, 2013 at 11:10:40AM +0100, Richard Biener wrote:
 Don't use NULL_TREE build_int_cst - doing so hints at that you want to
 use double_ints.  Generally doing computation with trees is expensive.
 You want to avoid that at all cost.  Use double-ints (yeah, you have to
 use the clunky divmod_with_overflow interface).

So this is a WIP patch, which uses double_ints.  I apologize for my
dumbness, but I haven't figured out how to do the normalization here.
It's probably something simple, but...  The point of the normalization
would be that when multiplying the normalized number with 10000 (aka
REG_BR_PROB_BASE), the result fits into plain int, right?
If you could suggest what to do with that, that would be appreciated.

Thanks.
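
One possible shape for the normalization asked about above, sketched on plain 64-bit integers rather than double_int: shift the count and the total right by the same amount until the count times 10000 fits into an int, then do the usual multiply and divide. This is an assumption about what "normalize" is meant to achieve, not the approach of the final patch; PROB_BASE and scaled_probability are names invented here.

```c
#include <assert.h>
#include <stdint.h>

#define PROB_BASE 10000   /* value of REG_BR_PROB_BASE in GCC */

/* Returns count * PROB_BASE / total, clamped to [0, PROB_BASE].
   Both operands are shifted right by the same amount first so that
   count * PROB_BASE cannot exceed INT32_MAX.  */
static int
scaled_probability (uint64_t count, uint64_t total)
{
  assert (total != 0);
  if (count >= total)
    return PROB_BASE;

  /* Largest numerator for which the multiplication still fits.  */
  const uint64_t limit = INT32_MAX / PROB_BASE;
  while (count > limit)
    {
      count >>= 1;
      total >>= 1;
    }
  /* total >= count still holds and total is nonzero here.  */
  return (int) (count * PROB_BASE / total);
}
```

Shifting both operands together keeps their ratio approximately intact, at the cost of dropping low-order bits that cannot influence a result with only four decimal digits of resolution anyway.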

--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -1028,13 +1028,13 @@ static bool
 is_comparison_with_loop_invariant_p (gimple stmt, struct loop *loop,
 tree *loop_invariant,
 enum tree_code *compare_code,
-int *loop_step,
+tree *loop_step,
 tree *loop_iv_base)
 {
   tree op0, op1, bound, base;
   affine_iv iv0, iv1;
   enum tree_code code;
-  int step;
+  tree step;
 
   code = gimple_cond_code (stmt);
   *loop_invariant = NULL;
@@ -1077,7 +1077,7 @@ is_comparison_with_loop_invariant_p (gimple stmt, struct 
loop *loop,
   bound = iv0.base;
   base = iv1.base;
   if (host_integerp (iv1.step, 0))
-   step = tree_low_cst (iv1.step, 0);
+   step = iv1.step;
   else
return false;
 }
@@ -1086,7 +1086,7 @@ is_comparison_with_loop_invariant_p (gimple stmt, struct 
loop *loop,
   bound = iv1.base;
   base = iv0.base;
   if (host_integerp (iv0.step, 0))
-   step = tree_low_cst (iv0.step, 0);  
+   step = iv0.step;
   else
return false;
 }
@@ -1154,6 +1154,16 @@ expr_coherent_p (tree t1, tree t2)
 return false;
 }
 
+static double_int
+normalize (double_int n)
+{
+  int msb = HOST_BITS_PER_WIDE_INT - clz_hwi (n.to_shwi ());
+  if (msb > HOST_BITS_PER_INT - 16)
+{}
+// ??? n = n.rshift (?, ?, ?);
+  return n;
+}
+
 /* Predict branch probability of BB when BB contains a branch that compares
an induction variable in LOOP with LOOP_IV_BASE_VAR to LOOP_BOUND_VAR. The
loop exit is compared using LOOP_BOUND_CODE, with step of LOOP_BOUND_STEP.
@@ -1178,7 +1188,7 @@ predict_iv_comparison (struct loop *loop, basic_block bb,
   gimple stmt;
   tree compare_var, compare_base;
   enum tree_code compare_code;
-  int compare_step;
+  tree compare_step;
   edge then_edge;
   edge_iterator ei;
 
@@ -1224,34 +1234,68 @@ predict_iv_comparison (struct loop *loop, basic_block 
bb,
 && host_integerp (compare_base, 0))
 {
   int probability;
-  HOST_WIDE_INT compare_count;
-  HOST_WIDE_INT loop_bound = tree_low_cst (loop_bound_var, 0);
-  HOST_WIDE_INT compare_bound = tree_low_cst (compare_var, 0);
-  HOST_WIDE_INT base = tree_low_cst (compare_base, 0);
-  HOST_WIDE_INT loop_count = (loop_bound - base) / compare_step;
-
-  if ((compare_step > 0)
+  bool of, overflow = false;
+  double_int mod, compare_count, tem, loop_count;
+
+  double_int loop_bound = tree_to_double_int (loop_bound_var);
+  double_int compare_bound = tree_to_double_int (compare_var);
+  double_int base = tree_to_double_int (compare_base);
+  double_int compare_step = tree_to_double_int (compare_step);
+
+  /* (loop_bound - base) / compare_step */
+  tem = loop_bound.sub_with_overflow (base, of);
+  overflow |= of;
+  loop_count = tem.divmod_with_overflow (compare_step,
+ 0, TRUNC_DIV_EXPR,
+ mod, of);
+  overflow |= of;
+
+  if ((compare_step.scmp (double_int_zero) == 1)
   ^ (compare_code == LT_EXPR || compare_code == LE_EXPR))
-   compare_count = (loop_bound - compare_bound) / compare_step;
+   {
+ /* (loop_bound - compare_bound) / compare_step */
+ tem = loop_bound.sub_with_overflow (compare_bound, of);
+ overflow |= of;
+ compare_count = tem.divmod_with_overflow (compare_step,
+0, TRUNC_DIV_EXPR,
+mod, of);
+ overflow |= of;
+   }
   else
-   compare_count = (compare_bound - base) / compare_step;
-
+{
+ /* (compare_bound - base) / compare_step */
+ tem = compare_bound.sub_with_overflow (base, of);
+ overflow |= of;
+  compare_count = tem.divmod_with_overflow (compare_step,
+0, TRUNC_DIV_EXPR,
+mod, of);
+ overflow |= of;
+   }
   if (compare_code == LE_EXPR || compare_code ==

Re: [PATCH] Fix inlining of calls with NULL block (PR56515)

2013-03-05 Thread Richard Biener

On Mon, 4 Mar 2013, Jan Hubicka wrote:

  On Mon, Mar 04, 2013 at 01:42:51PM +0100, Richard Biener wrote:
   When inlining call stmts with a NULL gimple_block we still remap
   all the callee blocks into a block tree copy but we'll end up
   not referencing it from anywhere.  This causes verification failures
   because then we have nothing referring to the inline stmt blocks.
  
  Ugh, best would be to set proper block even for the artificially added calls
  (whether to look around for surrounding statement blocks or similar).
 
 As I added to the PR log, I do not really think we can LTO reliably libraries,
 like libgcov, where backend invents its own calls.  For safe LTO of runtime
 bits (libgcc/libgcov/libc) we will need a mechanism to explicitly mark
 implementations of runtime calls and do not remove them from callgraph until
 it is clear that no new calls are invented.

Indeed.  Still the issue exists.

  Because clearing the block tree will have the nasty effect that nothing
  in the inlined routine will be debuggable.  Attaching the block tree to
  DECL_INITIAL isn't very good either, because while the inline fn itself
  will be debuggable, the parent function's variables won't be accessible.
  
  But I guess your patch is fine as a hack to avoid ICEs, and we just should
  try harder and harder to avoid ever hitting that situation.
 
 Well, what would you recommend to do when adding an instrumentation calls on
 random places of the callgraph?
 We do
 1) adding the counter increments that are associated with edges in callgraph.
 2) adding VPT calls that are associated with an instruction (like divide)
 3) adding calls to prologue handling indirect call.
 
 I suppose 3) can be handled same way as prologue code and 2) can copy block
 from the associated instruction. But what should we do for 1?

Note that copy block from surrounding code is equivalent to
a NULL block (just inherit the currently active block).  What's
more interesting is whether we want debug information for the
inlined code at all - after all the callees are all DECL_ARTIFICIAL
and definitely middle-end introduced memcpy calls diving into
inlined memcpy implementation would be odd - after all the source
doesn't contain the memcpy call and so I think there is no way to
step over the call (after all we don't want to report it!).

Thus I think for inlined artificial calls we want to drop debug
information from its inlined body.  The patch does that for
BLOCKs (and thus variables - which effectively means everything
but line information).

What we eventually want to have is a BLOCK marked as
inlined outer scope of artificial function FOO.  Not sure
if dwarf can express that though.  Then gdb could do the
right thing and other debug consumers like systemtap could
still see the implementation.

I've now LTO profilebootstrapped and bootstrapped and tested
the patch on x86_64-unknown-linux-gnu and plan to install it
with a big fat comment added.

Thanks,
Richard.

Re: [patch] PR c++/55135

2013-03-05 Thread Richard Biener

On Mon, Mar 4, 2013 at 8:13 PM, Steven Bosscher stevenb@gmail.com wrote:
 Hello,

 Bug c++/55135 is another one of those almost-insane large test cases
 that triggers some of the worst time complexity behavior GCC has to
 offer. The attached patch doesn't actually fix anything the bug poster
 complained about, but something I ran into myself while trying to
 compile the file at -O0. It's a regression from older GCC releases and
 a test case for which clang kicks our butts.

 What happens at -O0 for this test case, is that there are 179972 EH
 regions and all but 3 of them are removed in
 remove_unreachable_handlers, which calls remove_eh_handler one region
 at a time in a loop. Because the EH tree is almost flat (almost a
 linked list), and remove_eh_handler has to look up the dead region in
 the tree, this results in O(N_EH_regions^2) run time in
 pass_cleanup_eh.

 The solution I propose in the attached patch, is to remove all
 unreachable regions in a single walk over the EH tree. This makes
 remove_unreachable_handlers run in no worse than O(N_EH_regions) time.
 If there are only a few regions to be removed, then this is
 potentially slower than the existing algorithm, but there is already a
 complete function walk in remove_unreachable_handlers and in the
 non-O0 case the EH tree is usually relatively small even for large
 functions. In any case, I have measured compile time on some C++ and
 Java cases and there were no measurable compile time regressions at
 -O1+, and a few improvements at -O0.

 Bootstrapped&tested on x86_64-unknown-linux-gnu. OK for trunk?

Ok.

Thanks,
Richard.

 Ciao!
 Steven


 gcc/
 PR c++/55135
 * except.h (remove_unreachable_eh_regions): New prototype.
 * except.c (remove_eh_handler_splicer): New function, split out
 of remove_eh_handler.
 (remove_eh_handler): Use remove_eh_handler_splicer.  Add comment
 warning about running it on many EH regions one at a time.
 (remove_unreachable_eh_regions_worker): New function, walk the
 EH tree in depth-first order and remove non-marked regions.
 (remove_unreachable_eh_regions): New function.
 * tree-eh.c (mark_reachable_handlers): New function, split out
 from remove_unreachable_handlers.
 (remove_unreachable_handlers): Use mark_reachable_handlers and
 remove_unreachable_eh_regions.
 (remove_unreachable_handlers_no_lp): Use mark_reachable_handlers
 and remove_unreachable_eh_regions.
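
To make the single-walk idea described above concrete, here is a small standalone sketch. It does not use GCC's eh_region type; struct region and prune_list are hypothetical, and removed nodes are only flagged rather than freed. The point it tries to show is that splicing a dead region's children into its place while walking means no region has to be looked up again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region node; loosely mirrors an outer/inner/next_peer tree.  */
struct region
{
  struct region *outer, *inner, *next_peer;
  bool reachable;
  bool removed;    /* set instead of freeing, to keep the sketch simple */
};

/* Prune the sibling list starting at *LINK, whose parent is OUTER.
   Unreachable regions are unlinked and their children are spliced into
   their place, so each node is visited once.  Returns the address of
   the final next_peer slot of the pruned list.  */
static struct region **
prune_list (struct region **link, struct region *outer)
{
  while (*link)
    {
      struct region *r = *link;
      if (r->reachable)
        {
          r->outer = outer;
          prune_list (&r->inner, r);
          link = &r->next_peer;
        }
      else
        {
          struct region *rest = r->next_peer;
          /* Replace R by its (pruned) children, then continue with REST.  */
          *link = r->inner;
          r->removed = true;
          r->inner = r->next_peer = r->outer = NULL;
          link = prune_list (link, outer);
          *link = rest;
        }
    }
  return link;
}
```

The recursion depth follows the nesting depth of the tree, so this form suits the almost flat trees mentioned above; a very deeply nested tree would call for an explicit stack.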

Re: [PATCH] Fix lots of uninitialized memory uses in sched_analyze_reg

2013-03-05 Thread Richard Biener

On Mon, Mar 4, 2013 at 10:17 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 Something that again hits lots of testcases during valgrind checking
 bootstrap.  init_alias_analysis apparently does
   vec_safe_grow_cleared (reg_known_value, maxreg - FIRST_PSEUDO_REGISTER);
   reg_known_equiv_p = sbitmap_alloc (maxreg - FIRST_PSEUDO_REGISTER);
 but doesn't bitmap_clear (reg_known_equiv_p), perhaps as an optimization?
 If set_reg_known_value is called (and not to the reg itself),
 set_reg_known_equiv_p is called too though.
 Right now get_reg_known_equiv_p is only called in one place, and we are only
 interested in MEM_P known values there, so the following works fine.
 Though perhaps if in the future we use the reg_known_equiv_p bitmap more,
 we should bitmap_clear (reg_known_equiv_p) it instead.
 Bootstrapped/regtested on x86_64-linux and i686-linux.

 Ok for trunk (or do you prefer to slow down init_alias_analysis and just
 clear the bitmap)?

Looks ok, but also clear the sbitmap as per Steven's comment.

Thanks,
Richard.

 2013-03-04  Jakub Jelinek  ja...@redhat.com

 * sched-deps.c (sched_analyze_reg): Only call get_reg_known_equiv_p
 if get_reg_known_value returned non-NULL.

 --- gcc/sched-deps.c.jj 2013-03-04 12:21:09.0 +0100
 +++ gcc/sched-deps.c    2013-03-04 17:29:03.478944157 +0100
 @@ -2351,10 +2351,10 @@ sched_analyze_reg (struct deps_desc *dep
/* Pseudos that are REG_EQUIV to something may be replaced
  by that during reloading.  We need only add dependencies for
 the address in the REG_EQUIV note.  */
 -  if (!reload_completed && get_reg_known_equiv_p (regno))
 +  if (!reload_completed)
 {
   rtx t = get_reg_known_value (regno);
 - if (MEM_P (t))
 + if (t && MEM_P (t) && get_reg_known_equiv_p (regno))
 sched_analyze_2 (deps, XEXP (t, 0), insn);
 }


 Jakub
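
The ordering in the patch above matters because the flag table is only meaningful where a value has been recorded. A minimal model of that invariant follows, with invented names (struct known_regs and friends) and strings standing in for rtx values; it is a sketch under those assumptions, not alias.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the two tables: VALUE is zero-initialized, EQUIV
   is deliberately left uncleared, as described for reg_known_equiv_p.  */
struct known_regs
{
  const char **value;     /* NULL means "nothing known" */
  unsigned char *equiv;   /* only meaningful where value[i] != NULL */
  size_t n;
};

static struct known_regs
known_regs_create (size_t n)
{
  struct known_regs k;
  assert (n > 0);
  k.value = calloc (n, sizeof *k.value);
  k.equiv = malloc (n);               /* contents indeterminate */
  k.n = n;
  assert (k.value != NULL && k.equiv != NULL);
  return k;
}

/* Setting a value always sets the matching flag as well.  */
static void
known_regs_set (struct known_regs *k, size_t i, const char *v, bool equiv)
{
  assert (i < k->n && v != NULL);
  k->value[i] = v;
  k->equiv[i] = equiv;
}

/* Consult the flag only after the value proved non-NULL, so an
   uninitialized flag is never read.  */
static bool
known_regs_is_equiv (const struct known_regs *k, size_t i)
{
  assert (i < k->n);
  return k->value[i] != NULL && k->equiv[i] != 0;
}

static void
known_regs_destroy (struct known_regs *k)
{
  free (k->value);
  free (k->equiv);
}
```

Clearing the flag table at allocation time, as suggested in the reply, removes the need for callers to remember this ordering at all.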

Re: [PATCH, ARM, RFC] Fix vect.exp failures for NEON in big-endian mode

2013-03-05 Thread Richard Biener

On Tue, Mar 5, 2013 at 12:47 AM, Paul Brook p...@codesourcery.com wrote:
 I somehow missed the Appendix A: Support for Advanced SIMD Extensions
 in the AAPCS document (it's not in the TOC!). It looks like the
 builtin vector types are indeed defined to be stored in memory in
 vldm/vstm order -- I think that means we're back to square one.

 There's still the possibility of making gcc generic vector types different
 from the ABI specified types[1], but that feels like it's probably a really
 bad idea.

 Having a distinct set of types just for the vectorizer may be a more viable
 option. IIRC the type selection hooks are more flexible than when we first
 looked at this problem.

 Paul

 [1] e.g. int gcc __attribute__((vector_size(8)));  v.s. int32x2_t eabi;

I think int32x2_t should not be a GCC vector type (thus not have a vector mode).
The ABI specified types should map to an integer mode of the right size
instead.  The vectorizer would then still use internal GCC vector types
and modes and the backend needs to provide instruction patterns that
do the right thing with the element ordering the vectorizer expects.

How are the int32x2_t types used?  I suppose they are arguments to
the intrinsics.  Which means that for _most_ operations element order
does not matter, thus a plus32x2 (int32x2_t x, int32x2_t y) can simply
use the equivalent of return (int32x2_t)((gcc_int32x2_t)x + (gcc_int32x2_t)y).
In intrinsics where order matters you'd insert appropriate __builtin_shuffle()s.

Oh, of course do the above only for big-endian mode ...

The other way around, mapping intrinsics and ABI vectors to vector modes
will have issues ... you'd have to guard all optab queries in the middle-end
to fail for arm big-endian as they expect instruction patterns that deal with
the GCC vector ordering.

Thus: model the backend after GCC's expectations and fix up the rest
by fixing the ABI types and intrinsics.

Richard.
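
A small sketch of the plus32x2 idea above, for illustration only. It assumes a compiler with the GNU vector_size extension; abi_int32x2_t is a hypothetical stand-in that models the ABI type as a plain 64-bit integer, and memcpy is used for the reinterpretation instead of a cast.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GCC generic vector of two 32-bit ints (a GNU extension).  */
typedef int32_t gcc_int32x2_t __attribute__ ((vector_size (8)));

/* Hypothetical ABI-level type modeled as a plain 64-bit integer, i.e.
   something that would get an integer mode rather than a vector mode.  */
typedef uint64_t abi_int32x2_t;

static_assert (sizeof (gcc_int32x2_t) == sizeof (abi_int32x2_t),
               "both types must have the same size");

/* Lane-wise add: lane order is irrelevant, so no shuffle is needed.  */
static abi_int32x2_t
plus32x2 (abi_int32x2_t x, abi_int32x2_t y)
{
  gcc_int32x2_t vx, vy, vr;
  abi_int32x2_t r;

  memcpy (&vx, &x, sizeof vx);   /* reinterpret the bits as a vector */
  memcpy (&vy, &y, sizeof vy);
  vr = vx + vy;
  memcpy (&r, &vr, sizeof r);
  return r;
}
```

An operation where lane order does matter (extracting lane 0, say) would additionally need a permutation on big-endian, which this sketch does not attempt.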

Re: [PATCH] Tiny make check-gcc parallelization improvement

2013-03-05 Thread Richard Biener

On Tue, Mar 5, 2013 at 7:16 AM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 This patch syncs the list of target exp files (a few have been added in the
 last few years).  Also, in my testing, usually vect.exp, guality.exp,
 struct-layout-1.exp and i386.exp take quite a lot of time, so it is
 undesirable to have them in pairs anymore, so the patch allows running all 4
 of them in parallel.

 This gained a minute in make -j48 -k check testing on my box.

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

 2013-03-05  Jakub Jelinek  ja...@redhat.com

 * Makefile.in (dg_target_exps): Add aarch64.exp, epiphany.exp and
 tic6x.exp.
 (check_gcc_parallelize): Run guality.exp as a separate job from
 vect.exp with unsorted.exp and $(dg_target_exps) separately from
 struct-layout-1.exp with stackalign.exp.

 --- gcc/Makefile.in.jj  2013-02-27 08:27:26.0 +0100
 +++ gcc/Makefile.in 2013-03-04 13:11:48.002638910 +0100
 @@ -494,10 +494,11 @@ xm_include_list=@xm_include_list@
  xm_defines=@xm_defines@
  lang_checks=
  lang_checks_parallelized=
 -dg_target_exps:=alpha.exp,arm.exp,avr.exp,bfin.exp,cris.exp,frv.exp
 -dg_target_exps:=$(dg_target_exps),i386.exp,ia64.exp,m68k.exp,microblaze.exp
 -dg_target_exps:=$(dg_target_exps),mips.exp,powerpc.exp,rx.exp,s390.exp,sh.exp
 -dg_target_exps:=$(dg_target_exps),sparc.exp,spu.exp,xstormy16.exp
 +dg_target_exps:=aarch64.exp,alpha.exp,arm.exp,avr.exp,bfin.exp,cris.exp
 +dg_target_exps:=$(dg_target_exps),epiphany.exp,frv.exp,i386.exp,ia64.exp
 +dg_target_exps:=$(dg_target_exps),m68k.exp,microblaze.exp,mips.exp,powerpc.exp
 +dg_target_exps:=$(dg_target_exps),rx.exp,s390.exp,sh.exp,sparc.exp,spu.exp
 +dg_target_exps:=$(dg_target_exps),tic6x.exp,xstormy16.exp
  # This lists a couple of test files that take most time during check-gcc.
  # When doing parallelized check-gcc, these can run in parallel with the
  # remaining tests.  Each word in this variable stands for work for one
 @@ -517,8 +518,10 @@ check_gcc_parallelize=execute.exp=execut
   compile.exp=compile/\[9pP\]*,builtins.exp \
   compile.exp=compile/\[013-8a-oq-zA-OQ-Z\]* \
   dg-torture.exp,ieee.exp \
 - vect.exp,guality.exp,unsorted.exp \
 - struct-layout-1.exp,stackalign.exp,$(dg_target_exps)
 + vect.exp,unsorted.exp \
 + guality.exp \
 + struct-layout-1.exp,stackalign.exp \
 + $(dg_target_exps)
  lang_opt_files=@lang_opt_files@ $(srcdir)/c-family/c.opt $(srcdir)/common.opt
  lang_specs_files=@lang_specs_files@
  lang_tree_files=@lang_tree_files@

 Jakub

[PATCH] Silence up a false positive warning in libiberty (PR middle-end/56526)

2013-03-05 Thread Jakub Jelinek

Hi!

While wrapper_sect_offset is always initialized if
(gnu_sections_found & SOMO_WRAPPING) != 0 and used only guarded with that
same condition, as the PR says apparently we get a false positive maybe
uninitialized warning for it still.  I'd say it is a good programming style
to just initialize such vars, especially in performance non-critical code.

Ok for trunk?

2013-03-05  Jakub Jelinek  ja...@redhat.com

PR middle-end/56526
* simple-object-mach-o.c (simple_object_mach_o_segment): Initialize
wrapper_sect_offset to avoid a warning.

--- libiberty/simple-object-mach-o.c.jj 2013-01-07 14:14:46.0 +0100
+++ libiberty/simple-object-mach-o.c    2013-03-05 11:46:19.574157009 +0100
@@ -432,7 +432,7 @@ simple_object_mach_o_segment (simple_obj
   size_t index_size;
   unsigned int n_wrapped_sects;
   size_t wrapper_sect_size;
-  off_t wrapper_sect_offset;
+  off_t wrapper_sect_offset = 0;
 
   fetch_32 = (omr->is_big_endian
  ? simple_object_fetch_big_32

Jakub

[C++ testcase, committed] PR 56530

2013-03-05 Thread Paolo Carlini


Hi,

I added the testcase and closed the PR as fixed in 4.7.3 and mainline.

Paolo.

2013-03-05  Paolo Carlini  paolo.carl...@oracle.com

PR c++/56530
* g++.dg/warn/Wsign-conversion-2.C: New.
Index: g++.dg/warn/Wsign-conversion-2.C
===
--- g++.dg/warn/Wsign-conversion-2.C    (revision 0)
+++ g++.dg/warn/Wsign-conversion-2.C    (working copy)
@@ -0,0 +1,11 @@
+// PR c++/56530
+// { dg-options "-Wsign-conversion" }
+
+struct string
+{
+  string () {};
+  ~string () {};
+};
+
+string foo[1];  // okay
+string bar[1][1];   // gives bogus warning

[PATCH] Fix PR56525

2013-03-05 Thread Richard Biener


This should fix PR56525, we reference ggc_freed loop structures
from bb->loop_father when fix_loop_structure removes a loop
and then calls flow_loops_find.  Fixed by delaying the ggc_free
part of loop removal until after that (I thought about other
ways to fix the reference but they are way more intrusive).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2013-03-05  Richard Biener  rguent...@suse.de

PR middle-end/56525
* loop-init.c (fix_loop_structure): Remove loops in two stages,
not freeing them until the end.

Index: gcc/loop-init.c
===
*** gcc/loop-init.c (revision 196451)
--- gcc/loop-init.c (working copy)
*** fix_loop_structure (bitmap changed_bbs)
*** 186,192 
int record_exits = 0;
loop_iterator li;
struct loop *loop;
!   unsigned old_nloops;
  
timevar_push (TV_LOOP_INIT);
  
--- 186,192 
int record_exits = 0;
loop_iterator li;
struct loop *loop;
!   unsigned old_nloops, i;
  
timevar_push (TV_LOOP_INIT);
  
*** fix_loop_structure (bitmap changed_bbs)
*** 230,237 
  flow_loop_tree_node_add (loop_outer (loop), ploop);
}
  
!   /* Remove the loop and free its data.  */
!   delete_loop (loop);
  }
  
/* Remember the number of loops so we can return how many new loops
--- 230,238 
  flow_loop_tree_node_add (loop_outer (loop), ploop);
}
  
!   /* Remove the loop.  */
!   loop->header = NULL;
!   flow_loop_tree_node_remove (loop);
  }
  
/* Remember the number of loops so we can return how many new loops
*** fix_loop_structure (bitmap changed_bbs)
*** 253,258 
--- 254,267 
}
  }
  
+   /* Finally free deleted loops.  */
+   FOR_EACH_VEC_ELT (*get_loops (), i, loop)
+ if (loop && loop->header == NULL)
+   {
+   (*get_loops ())[i] = NULL;
+   flow_loop_free (loop);
+   }
+ 
loops_state_clear (LOOPS_NEED_FIXUP);
  
/* Apply flags to loops.  */
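
The two-stage removal in the patch above can be pictured with a tiny standalone model. The names below (struct loop_rec, mark_removed, free_removed, a fixed-size loops table) are invented for this sketch and are not the cfgloop API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical loop record; HEADER == NULL marks it as removed.  */
struct loop_rec
{
  int *header;
};

#define MAX_LOOPS 8
static struct loop_rec *loops[MAX_LOOPS];

/* Stage 1: unlink the loop but keep the memory alive, so that stale
   pointers held elsewhere still point at valid (if dead) storage.  */
static void
mark_removed (struct loop_rec *l)
{
  l->header = NULL;
}

/* Stage 2: once nothing references removed loops anymore, free them and
   clear their slots.  Returns the number of freed records.  */
static size_t
free_removed (void)
{
  size_t n = 0;
  for (size_t i = 0; i < MAX_LOOPS; i++)
    if (loops[i] && loops[i]->header == NULL)
      {
        free (loops[i]);
        loops[i] = NULL;
        n++;
      }
  return n;
}
```

Whatever work runs between the two stages may still dereference a removed record safely; it only has to treat a NULL header as "dead".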

[PATCH] Fix PR56521

2013-03-05 Thread Richard Biener


VN now inserts all sorts of calls into the references hashtable,
not only those which produce a value.  This results in missing
initializations of ->value_id which eventually PRE ends up
accessing.

The following fixes that.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2013-03-05  Richard Biener  rguent...@suse.de

* tree-ssa-sccvn.c (set_value_id_for_result): For a NULL
result set a new value-id.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c    (revision 196451)
+++ gcc/tree-ssa-sccvn.c    (working copy)
@@ -3954,7 +3962,7 @@ free_scc_vn (void)
   XDELETE (optimistic_info);
 }
 
-/* Set *ID if we computed something useful in RESULT.  */
+/* Set *ID according to RESULT.  */
 
 static void
 set_value_id_for_result (tree result, unsigned int *id)
@@ -3966,6 +3974,8 @@ set_value_id_for_result (tree result, un
   else if (is_gimple_min_invariant (result))
*id = get_or_alloc_constant_value_id (result);
 }
+  else
+*id = get_next_value_id ();
 }
 
 /* Set the value ids in the valid hash tables.  */

Re: [PATCH] Fix PR56521

2013-03-05 Thread Jakub Jelinek

On Tue, Mar 05, 2013 at 12:51:09PM +0100, Richard Biener wrote:
 VN now inserts all sorts of calls into the references hashtable,
 not only those which produce a value.  This results in missing
 initializations of ->value_id which eventually PRE ends up
 accessing.
 
 The following fixes that.
 
 Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
 
 Richard.
 
 2013-03-05  Richard Biener  rguent...@suse.de
 
   * tree-ssa-sccvn.c (set_value_id_for_result): For a NULL
   result set a new value-id.
 
 --- gcc/tree-ssa-sccvn.c  (revision 196451)
 +++ gcc/tree-ssa-sccvn.c  (working copy)
 @@ -3954,7 +3962,7 @@ free_scc_vn (void)
XDELETE (optimistic_info);
  }
  
 -/* Set *ID if we computed something useful in RESULT.  */
 +/* Set *ID according to RESULT.  */
  
  static void
  set_value_id_for_result (tree result, unsigned int *id)
 @@ -3966,6 +3974,8 @@ set_value_id_for_result (tree result, un
else if (is_gimple_min_invariant (result))
   *id = get_or_alloc_constant_value_id (result);

This still won't initialize *id if result is non-NULL, but isn't
SSA_NAME nor is_gimple_min_invariant.  Can't you do the same for that case
too, just in case (perhaps we can't trigger that right now, but still
it would make me feel safer about that).

  }
 +  else
 +*id = get_next_value_id ();
  }
  
  /* Set the value ids in the valid hash tables.  */

Otherwise looks good to me, thanks.

Jakub

Re: [PATCH] Fix PR56521

2013-03-05 Thread Richard Biener

On Tue, 5 Mar 2013, Jakub Jelinek wrote:

 On Tue, Mar 05, 2013 at 12:51:09PM +0100, Richard Biener wrote:
  VN now inserts all sorts of calls into the references hashtable,
  not only those which produce a value.  This results in missing
  initializations of ->value_id which eventually PRE ends up
  accessing.
  
  The following fixes that.
  
  Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
  
  Richard.
  
  2013-03-05  Richard Biener  rguent...@suse.de
  
  * tree-ssa-sccvn.c (set_value_id_for_result): For a NULL
  result set a new value-id.
  
  --- gcc/tree-ssa-sccvn.c    (revision 196451)
  +++ gcc/tree-ssa-sccvn.c    (working copy)
  @@ -3954,7 +3962,7 @@ free_scc_vn (void)
 XDELETE (optimistic_info);
   }
   
  -/* Set *ID if we computed something useful in RESULT.  */
  +/* Set *ID according to RESULT.  */
   
   static void
   set_value_id_for_result (tree result, unsigned int *id)
  @@ -3966,6 +3974,8 @@ set_value_id_for_result (tree result, un
 else if (is_gimple_min_invariant (result))
  *id = get_or_alloc_constant_value_id (result);
 
 This still won't initialize *id if result is non-NULL, but isn't
 SSA_NAME nor is_gimple_min_invariant.  Can't you do the same for that case
 too, just in case (perhaps we can't trigger that right now, but still
 it would make me feel safer about that).

Yeah.  Can happen from aggregate stores I guess.

   }
  +  else
  +*id = get_next_value_id ();
   }
   
   /* Set the value ids in the valid hash tables.  */
 
 Otherwise looks good to me, thanks.

As follows.

Richard.

2013-03-05  Richard Biener  rguent...@suse.de

* tree-ssa-sccvn.c (set_value_id_for_result): For a NULL
result set a new value-id.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c    (revision 196451)
+++ gcc/tree-ssa-sccvn.c    (working copy)
@@ -3954,18 +3962,17 @@ free_scc_vn (void)
   XDELETE (optimistic_info);
 }
 
-/* Set *ID if we computed something useful in RESULT.  */
+/* Set *ID according to RESULT.  */
 
 static void
 set_value_id_for_result (tree result, unsigned int *id)
 {
-  if (result)
-{
-  if (TREE_CODE (result) == SSA_NAME)
-   *id = VN_INFO (result)->value_id;
-  else if (is_gimple_min_invariant (result))
-   *id = get_or_alloc_constant_value_id (result);
-}
+  if (result && TREE_CODE (result) == SSA_NAME)
+    *id = VN_INFO (result)->value_id;
+  else if (result && is_gimple_min_invariant (result))
+*id = get_or_alloc_constant_value_id (result);
+  else
+*id = get_next_value_id ();
 }
 
 /* Set the value ids in the valid hash tables.  */

Re: [PATCH, combine] Fix host-specific behavior in simplify_compare_const()

2013-03-05 Thread Eric Botcazou

 In other words, any 32-bit target with 'need_64bit_hwint=yes' in config.gcc
 cannot benefit from this optimization because it never passes the
 condition test.
 
 
 My solution is to use GET_MODE_MASK(mode) to filter out all bits not
 in target mode. The following is my patch:

The patch is OK for 4.9 once stage #1 is open if it passes bootstrap/regtest.

 gcc/ChangeLog:
 
 * gcc/combine.c: Use GET_MODE_MASK() to filter out
 unnecessary bits in simplify_compare_const().

This should be

* combine.c (simplify_compare_const): Use GET_MODE_MASK to filter out
unnecessary bits in the constant power of two case.

 
 diff --git a/gcc/combine.c b/gcc/combine.c
 index 67bd776..8c8cb92 100644
 --- a/gcc/combine.c
 +++ b/gcc/combine.c
 @@ -10917,8 +10917,8 @@ simplify_compare_const (enum rtx_code code, rtx op0, rtx *pop1)
        && (code == EQ || code == NE || code == GE || code == GEU
            || code == LT || code == LTU)
        && mode_width <= HOST_BITS_PER_WIDE_INT
 -      && exact_log2 (const_op) >= 0
 -      && nonzero_bits (op0, mode) == (unsigned HOST_WIDE_INT) const_op)
 +      && exact_log2 (const_op & GET_MODE_MASK (mode)) >= 0
 +      && nonzero_bits (op0, mode) == (unsigned HOST_WIDE_INT) (const_op & GET_MODE_MASK (mode)))
  {
code = (code == EQ || code == GE || code == GEU ? NE : EQ);
const_op = 0;

The line is too long, write

  && nonzero_bits (op0, mode)
     == (unsigned HOST_WIDE_INT) (const_op & GET_MODE_MASK (mode)))

instead.

-- 
Eric Botcazou
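
To see why the mask matters, here is a small sketch under the assumption that a 32-bit constant such as 0x80000000 is held sign-extended in a 64-bit host integer. The `toy_` helpers are hypothetical models of `exact_log2` and `GET_MODE_MASK`, not the GCC definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Return log2 of X if X is a power of two, otherwise -1.  */
static int
toy_exact_log2 (uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Mask with the low WIDTH bits set, for 1 <= WIDTH <= 64.  */
static uint64_t
toy_mode_mask (unsigned int width)
{
  assert (width >= 1 && width <= 64);
  return width == 64 ? UINT64_MAX : ((uint64_t) 1 << width) - 1;
}

/* Power-of-two test on CONST_OP after truncating it to WIDTH bits.  */
static int
toy_masked_log2 (int64_t const_op, unsigned int width)
{
  return toy_exact_log2 ((uint64_t) const_op & toy_mode_mask (width));
}
```

Without the mask, the sign-extended value has its upper 32 bits set and is not a power of two on the 64-bit host; with the mask, only bit 31 remains and the test succeeds.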

[Committed] S/390: Define DWARF2_ASM_LINE_DEBUG_INFO

2013-03-05 Thread Andreas Krebbel

Hi,

the attached patch enables debug line info to be generated by the
assembler for s390.  With the patch, two testsuite failures
disappear.

 FAIL: gcc.dg/debug/dwarf2/asm-line1.c scan-assembler is_stmt 1
 FAIL: gnat.dg/return3.adb scan-assembler loc 1 6

Committed to mainline.

Bye,

-Andreas-

2013-03-05  Andreas Krebbel  andreas.kreb...@de.ibm.com

* config/s390/s390.h: Define DWARF2_ASM_LINE_DEBUG_INFO.

---
 gcc/config/s390/s390.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: gcc/config/s390/s390.h
===
*** gcc/config/s390/s390.h.orig
--- gcc/config/s390/s390.h
*** extern const enum reg_class regclass_map
*** 591,596 
--- 591,599 
  /* Register save slot alignment.  */
  #define DWARF_CIE_DATA_ALIGNMENT (-UNITS_PER_LONG)
  
+ /* Let the assembler generate debug line info.  */
+ #define DWARF2_ASM_LINE_DEBUG_INFO 1
+ 
  
  /* Frame registers.  */

Re: [PATCH] Fix PR56521

2013-03-05 Thread Jakub Jelinek

On Tue, Mar 05, 2013 at 12:57:41PM +0100, Richard Biener wrote:
 As follows.
 
 Richard.
 
 2013-03-05  Richard Biener  rguent...@suse.de
 
   * tree-ssa-sccvn.c (set_value_id_for_result): For a NULL
   result set a new value-id.

Looks much better.
You forgot to adjust the ChangeLog entry, and PR line is missing,
if it passes bootstrap, please check it in.

 --- gcc/tree-ssa-sccvn.c  (revision 196451)
 +++ gcc/tree-ssa-sccvn.c  (working copy)
 @@ -3954,18 +3962,17 @@ free_scc_vn (void)
XDELETE (optimistic_info);
  }
  
 -/* Set *ID if we computed something useful in RESULT.  */
 +/* Set *ID according to RESULT.  */
  
  static void
  set_value_id_for_result (tree result, unsigned int *id)
  {
 -  if (result)
 -{
 -  if (TREE_CODE (result) == SSA_NAME)
 - *id = VN_INFO (result)->value_id;
 -  else if (is_gimple_min_invariant (result))
 - *id = get_or_alloc_constant_value_id (result);
 -}
 +  if (result && TREE_CODE (result) == SSA_NAME)
 +*id = VN_INFO (result)->value_id;
 +  else if (result && is_gimple_min_invariant (result))
 +*id = get_or_alloc_constant_value_id (result);
 +  else
 +*id = get_next_value_id ();
  }
  
  /* Set the value ids in the valid hash tables.  */

Jakub

Re: [PATCH, ARM, RFC] Fix vect.exp failures for NEON in big-endian mode

2013-03-05 Thread Julian Brown

On Tue, 5 Mar 2013 10:42:59 +0100
Richard Biener richard.guent...@gmail.com wrote:

 On Tue, Mar 5, 2013 at 12:47 AM, Paul Brook p...@codesourcery.com
 wrote:
  I somehow missed the Appendix A: Support for Advanced SIMD
  Extensions in the AAPCS document (it's not in the TOC!). It looks
  like the builtin vector types are indeed defined to be stored in
  memory in vldm/vstm order -- I think that means we're back to
  square one.
 
  There's still the possibility of making gcc generic vector types
  different from the ABI specified types[1], but that feels like it's
  probably a really bad idea.
 
  Having a distinct set of types just for the vectorizer may be a
  more viable option. IIRC the type selection hooks are more flexible
  than when we first looked at this problem.
 
  Paul
 
  [1] e.g. int gcc __attribute__((vector_size(8)));  v.s. int32x2_t
  eabi;
 
 I think int32x2_t should not be a GCC vector type (thus not have a
 vector mode). The ABI specified types should map to an integer mode
 of the right size instead.  The vectorizer would then still use
 internal GCC vector types and modes and the backend needs to provide
 instruction patterns that do the right thing with the element
 ordering the vectorizer expects.
 
 How are the int32x2_t types used?  I suppose they are arguments to
 the intrinsics.  Which means that for _most_ operations element order
 does not matter, thus a plus32x2 (int32x2_t x, int32x2_t y) can simply
 use the equivalent of return (int32x2_t)((gcc_int32x2_t)x +
 (gcc_int32x2_t)y). In intrinsics where order matters you'd insert
 appropriate __builtin_shuffle()s.

Maybe there's no need to interpret the vector layout for any of the
intrinsics -- just treat all inputs & outputs as opaque (there are
intrinsics for getting/setting lanes -- IMO these shouldn't attempt to
convert lane numbers at all, though they do at present). Several
intrinsics are currently implemented using __builtin_shuffle, e.g.:

__extension__ static __inline int8x8_t __attribute__ ((__always_inline__))
vrev64_s8 (int8x8_t __a)
{
  return (int8x8_t) __builtin_shuffle (__a, (uint8x8_t) { 7, 6, 5, 4, 3, 2, 1, 0 });
}

I'd imagine that if int8x8_t are not actual vector types, we could
invent extra builtins to convert them to and from such types to be able
to still do this kind of thing (in arm_neon.h, not necessarily for
direct use by users), i.e.:

typedef char gcc_int8x8_t __attribute__((vector_size(8)));

int8x8_t
vrev64_s8 (int8x8_t __a)
{
  gcc_int8x8_t tmp = __builtin_neon2generic (__a);
  tmp = __builtin_shuffle (tmp, (gcc_int8x8_t) { 7, 6, 5, 4, ... });
  return __builtin_generic2neon (tmp);
}

(On re-reading, that's basically the same as what you suggested, I
think.)

 Oh, of course do the above only for big-endian mode ...
 
 The other way around, mapping intrinsics and ABI vectors to vector
 modes will have issues ... you'd have to guard all optab queries in
 the middle-end to fail for arm big-endian as they expect instruction
 patterns that deal with the GCC vector ordering.
 
 Thus: model the backend after GCC's expectations and fix up the rest
 by fixing the ABI types and intrinsics.

I think this plan will work fine -- it has the added advantage (which
looks like a disadvantage, but really isn't) that generic vector
operations like:

void foo (void)
{
  int8x8_t x = { 0, 1, 2, 3, 4, 5, 6, 7 };
}

will *not* work -- nor will e.g. subscripting ABI-defined vectors using
[]s. At the moment using these features can lead to surprising results.

Unfortunately NEON's pretty complicated, and the ARM backend currently
uses vector modes quite heavily implementing it, so just using integer
modes for intrinsics is going to be tough. It might work to create a
shadow set of vector modes for use only by the intrinsics (O*mode for
opaque instead of V*mode, say), if the middle end won't barf at that.

Thanks,

Julian

Re: [PATCH] Fix PR56525

2013-03-05 Thread Jakub Jelinek

On Tue, Mar 05, 2013 at 12:27:20PM +0100, Richard Biener wrote:
 This should fix PR56525, we reference ggc_freed loop structures
 from bb->loop_father when fix_loop_structure removes a loop
 and then calls flow_loops_find.  Fixed by delaying the ggc_free
 part of loop removal until after that (I thought about other
 ways to fix the reference but they are way more intrusive).
 
 Bootstrap and regtest running on x86_64-unknown-linux-gnu.
 
 Richard.
 
 2013-03-05  Richard Biener  rguent...@suse.de
 
   PR middle-end/56525
   * loop-init.c (fix_loop_structure): Remove loops in two stages,
   not freeing them until the end.

Looks good to me (when reporting the bug, I actually thought about
deferring the removal for the duration of the fixup too).

Jakub

Re: [patch][RFC] bitmaps as lists or trees

2013-03-05 Thread Richard Biener

On Tue, Mar 5, 2013 at 1:00 PM, Steven Bosscher stevenb@gmail.com wrote:
 Hello,

 A recurring problem with GCC's sparse bitmap data structure is that it
 performs poorly for random access patterns. Such patterns result in
 linked-list walks, and can trigger behavior quadratic in the number of
 linked-list member elements in the set.

 The attached patch is a first stab at an idea I've had for a while:
 Implement a change of view for bitmaps, such that a bitmap can be
 either a linked list, or a binary tree. I've implemented this idea
 with top-down splay trees because splay tree nodes do not need
 meta-data (unlike e.g. color for RB-trees, rank for AVL trees,
 etc.) and top-down splay tree operations are very simple to implement
 (less than 200 lines of code). As far as I'm aware, this is the first
 attempt at allowing different views on bitmaps. The idea came from
 Andrew Macleod's tree-ssa-live implementation.

 The idea is to convert the bitmap to a tree view if the set
 represented by the bitmap is mostly used for membership testing, and
 not for iterations over the items (as e.g. for bitmap dataflow). A
 typical example of this is e.g. invalid_mode_changes, which just
 explodes for the test case of PR55135 at -O0.

 I haven't tested this patch at all, except making sure that it
 compiles. Just posting this for discussion, and for feedback on the
 idea. I know there have been many others before me who've tried
 different data structures for bitmaps, perhaps someone has already
 tried this before.

Definitely a nice idea.  Iteration should be easy to implement (without
actually splaying for each visited bit), the bit operations can use the
iteration as building block as well then.

Now, an instrumented bitmap to identify bitmaps that would benefit
from the tree view would be nice ;)  [points-to sets are never modified
after being computed, but they are both random-tested and intersected]

What I missed often as well is a reference counted shared bitmap
implementation (we have various special case implementations).
I wonder if that could even use shared sub-trees/lists of bitmap_elts.

Richard.

 Ciao!
 Steven
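
For readers unfamiliar with the list representation being discussed, the following is a minimal sketch of a sparse bitmap kept as a sorted linked list of fixed-size chunks. The struct layout and all `toy_` names are assumptions for illustration and do not reproduce GCC's bitmap element.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BITS_PER_ELT 128u

/* One chunk of a sparse bitmap; chunks are kept sorted by INDX.  */
struct toy_elt
{
  struct toy_elt *next;
  unsigned int indx;           /* Chunk number: bit / TOY_BITS_PER_ELT.  */
  unsigned long long bits[2];  /* 128 bits of payload.  */
};

/* Return nonzero if BIT is set.  *STEPS receives the number of list
   nodes visited, which is the cost that grows with random access.  */
static int
toy_bit_p (const struct toy_elt *head, unsigned int bit, unsigned int *steps)
{
  unsigned int indx = bit / TOY_BITS_PER_ELT;
  unsigned int off = bit % TOY_BITS_PER_ELT;

  assert (steps != NULL);
  *steps = 0;
  for (const struct toy_elt *e = head; e && e->indx <= indx; e = e->next)
    {
      ++*steps;
      if (e->indx == indx)
        return (e->bits[off / 64] >> (off % 64)) & 1;
    }
  return 0;
}
```

Each membership test walks from the head, so a sequence of tests at unrelated positions costs time proportional to the list length per test; a tree view would aim to bring that down to a logarithmic (amortized) number of steps.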

[PATCH] Simplify -fwhole-program documentation

2013-03-05 Thread Richard Biener


This removes all encouragement to use -fwhole-program with -flto
from the documentation.  As can be seen in PR56533 it can be
most confusing ... instead advise to rely on a linker plugin.

Ok?

Thanks,
Richard.

2013-03-05  Richard Biener  rguent...@suse.de

* doc/invoke.texi (fwhole-program): Discourage use in combination
with -flto.

Index: gcc/doc/invoke.texi
===
*** gcc/doc/invoke.texi (revision 196451)
--- gcc/doc/invoke.texi (working copy)
*** Enabled by default with @option{-funroll
*** 8168,8182 
  Assume that the current compilation unit represents the whole program being
  compiled.  All public functions and variables with the exception of @code{main}
  and those merged by attribute @code{externally_visible} become static functions
! and in effect are optimized more aggressively by interprocedural optimizers. If @command{gold} is used as the linker plugin, @code{externally_visible} attributes are automatically added to functions (not variable yet due to a current @command{gold} issue) that are accessed outside of LTO objects according to resolution file produced by @command{gold}.  For other linkers that cannot generate resolution file, explicit @code{externally_visible} attributes are still necessary.
! While this option is equivalent to proper use of the @code{static} keyword for
! programs consisting of a single file, in combination with option
! @option{-flto} this flag can be used to
! compile many smaller scale programs since the functions and variables become
! local for the whole combined compilation unit, not for the single source file
! itself.
  
! This option implies @option{-fwhole-file} for Fortran programs.
  
  @item -flto[=@var{n}]
  @opindex flto
--- 8168,8178 
  Assume that the current compilation unit represents the whole program being
  compiled.  All public functions and variables with the exception of @code{main}
  and those merged by attribute @code{externally_visible} become static functions
! and in effect are optimized more aggressively by interprocedural optimizers.
  
! In combination with @code{-flto} using this option should not be used.
! Instead relying on a linker plugin should provide safer and more precise
! information.
  
  @item -flto[=@var{n}]
  @opindex flto

Patch ping

2013-03-05 Thread Jakub Jelinek

Hi!

Thanks for all the recent reviews of memory leak plugging patches,
there are 4 still unreviewed from last week though.

- sched-deps leak fix:
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html

- LRA leak fix:
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html

- libcpp leak fix:
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01341.html

- PCH leak fix + --enable-checking=valgrind changes to allow
  --enable-checking=yes,valgrind bootstrap to succeed:
http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00044.html

Jakub

Re: Patch ping

2013-03-05 Thread Richard Biener

On Tue, 5 Mar 2013, Jakub Jelinek wrote:

 Hi!
 
 Thanks for all the recent reviews of memory leak plugging patches,
 there are 4 still unreviewed from last week though.
 
 - sched-deps leak fix:
 http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html
 
 - LRA leak fix:
 http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html
 
 - libcpp leak fix:
 http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01341.html
 
 - PCH leak fix + --enable-checking=valgrind changes to allow
   --enable-checking=yes,valgrind bootstrap to succeed:
 http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00044.html

That looks awkward ... isn't there a simple valgrind_disable () /
valgrind_enable () way of disabling checking around this code?

Richard.

Re: Patch ping

2013-03-05 Thread Jakub Jelinek

On Tue, Mar 05, 2013 at 02:26:03PM +0100, Richard Biener wrote:
 On Tue, 5 Mar 2013, Jakub Jelinek wrote:
  Thanks for all the recent reviews of memory leak plugging patches,
  there are 4 still unreviewed from last week though.
  
  - sched-deps leak fix:
  http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html
  
  - LRA leak fix:
  http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html
  
  - libcpp leak fix:
  http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01341.html
  
  - PCH leak fix + --enable-checking=valgrind changes to allow
--enable-checking=yes,valgrind bootstrap to succeed:
  http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00044.html
 
 That looks awkward ... isn't there a simple valgrind_disable () /
 valgrind_enable () way of disabling checking around this code?

Unfortunately not.  I went through all valgrind.h and memcheck.h
client calls.  If at least there was a VALGRIND_GET_VBITS variant
that allowed getting all vbits, (i.e. whether something is unaddressable
vs. undefined vs. defined), rather than just if any of the vbits are
unaddressable, give up, otherwise return undefined vs. defined bits,
it would simplify the code.  I hope perhaps future valgrind version
could add that, so it would be just VALGRIND_GET_VBITS2,
VALGRIND_MAKE_MEM_DEFINED before and VALGRIND_SET_VBITS2 at the end
(restore previous state).  I've at least added __builtin_expect,
so the binary search code isn't in hot path.  It isn't that slow,
during binary search I'm always testing just a single byte, and
say if we don't have any single memory allocations > 4GB, it will be
at most 37 valgrind client calls per object, usually much smaller
number than that.

Jakub
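
The patch itself is not quoted in this thread, so the following is only a sketch of the general technique described above: a binary search that probes one byte at a time to find how much of an object is addressable. The probe callback is a hypothetical stand-in for a checker client request, and the search assumes every byte before the first unaddressable one is addressable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe: nonzero if the single byte at OFFSET is addressable.  */
typedef int (*toy_probe_fn) (size_t offset, void *ctx);

/* Return the number of addressable bytes at the start of an object of
   at most LIMIT bytes.  *CALLS counts the probes issued.  */
static size_t
toy_addressable_prefix (size_t limit, toy_probe_fn probe, void *ctx,
                        unsigned int *calls)
{
  /* Invariant: bytes [0, lo) are addressable and the answer is <= hi.  */
  size_t lo = 0, hi = limit;

  assert (probe != NULL && calls != NULL);
  *calls = 0;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      ++*calls;
      if (probe (mid, ctx))
        lo = mid + 1;
      else
        hi = mid;
    }
  return lo;
}

/* Example probe: CTX points at the true addressable size.  */
static int
toy_probe_below (size_t offset, void *ctx)
{
  return offset < *(const size_t *) ctx;
}
```

The number of probes grows with the logarithm of the limit, which is consistent with the small per-object call counts mentioned in the message.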

Re: Patch ping

2013-03-05 Thread Richard Biener

On Tue, 5 Mar 2013, Jakub Jelinek wrote:

 On Tue, Mar 05, 2013 at 02:26:03PM +0100, Richard Biener wrote:
  On Tue, 5 Mar 2013, Jakub Jelinek wrote:
   Thanks for all the recent reviews of memory leak plugging patches,
   there are 4 still unreviewed from last week though.
   
   - sched-deps leak fix:
   http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html
   
   - LRA leak fix:
   http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html
   
   - libcpp leak fix:
   http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01341.html
   
   - PCH leak fix + --enable-checking=valgrind changes to allow
 --enable-checking=yes,valgrind bootstrap to succeed:
   http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00044.html
  
  That looks awkward ... isn't there a simple valgrind_disable () /
  valgrind_enable () way of disabling checking around this code?
 
 Unfortunately not.  I went through all valgrind.h and memcheck.h
 client calls.  If at least there was a VALGRIND_GET_VBITS variant
 that allowed getting all vbits, (i.e. whether something is unaddressable
 vs. undefined vs. defined), rather than just if any of the vbits are
 unaddressable, give up, otherwise return undefined vs. defined bits,
 it would simplify the code.  I hope perhaps future valgrind version
 could add that, so it would be just VALGRIND_GET_VBITS2,
 VALGRIND_MAKE_MEM_DEFINED before and VALGRIND_SET_VBITS2 at the end
 (restore previous state).  I've at least added __builtin_expect,
 so the binary search code isn't in hot path.  It isn't that slow,
 during binary search I'm always testing just a single byte, and
 say if we don't have any single memory allocations > 4GB, it will be
 at most 37 valgrind client calls per object, usually much smaller
 number than that.

Alternatively using a suppressions file during bootstrap might
be possible ... maybe also useful for general valgrind
debugging use?

Richard.

Re: [PATCH] Fix cp_parser_braced_list

2013-03-05 Thread Jason Merrill


OK.

Jason

Re: [patch][RFC] bitmaps as lists or trees

2013-03-05 Thread Michael Matz

Hi,

On Tue, 5 Mar 2013, Richard Biener wrote:

  I haven't tested this patch at all, except making sure that it 
  compiles. Just posting this for discussion, and for feedback on the 
  idea. I know there have been many others before me who've tried 
  different data structures for bitmaps, perhaps someone has already 
  tried this before.
 
 Definitely a nice idea.  Iteration should be easy to implement (without 
 actually splaying for each visited bit), the bit operations can use the 
 iteration as building block as well then.

Iteration isn't easy on trees without a pointer to the parent (i.e. 
enlarging each node), you need to remember variably sized context in the 
iterator (e.g. the current stack of nodes).

I do like the idea of reusing the same internal data structure to 
implement the tree.  And I'm wondering about performance impact, I 
wouldn't be surprised either way (i.e. that it brings about a large 
improvement, or none at all), most bitmap membership tests in GCC are 
surprisingly clustered so that the bitmaps cache of last accessed 
element can work its magic (not all of them, as the testcase shows of 
course :) ).


Ciao,
Michael.
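
The remark about iteration can be illustrated with a sketch of in-order traversal over a binary tree whose nodes have no parent pointer: the iterator has to carry the chain of pending ancestors itself. All `toy_` names are hypothetical, and the fixed stack bound is an assumption that a real splay tree, which can become very deep, would not satisfy in general.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 64

struct toy_node
{
  struct toy_node *left, *right;
  unsigned int indx;
};

/* Iterator state: the ancestors that still have to be visited.  */
struct toy_iter
{
  const struct toy_node *stack[TOY_MAX_DEPTH];
  size_t depth;
};

/* Push NODE and its entire left spine.  */
static void
toy_push_left (struct toy_iter *it, const struct toy_node *node)
{
  for (; node; node = node->left)
    {
      assert (it->depth < TOY_MAX_DEPTH);
      it->stack[it->depth++] = node;
    }
}

static void
toy_iter_init (struct toy_iter *it, const struct toy_node *root)
{
  it->depth = 0;
  toy_push_left (it, root);
}

/* Return the next node in increasing INDX order, or NULL when done.  */
static const struct toy_node *
toy_iter_next (struct toy_iter *it)
{
  if (it->depth == 0)
    return NULL;
  const struct toy_node *node = it->stack[--it->depth];
  toy_push_left (it, node->right);
  return node;
}
```

The `stack` array is the variably sized context in question; adding a parent pointer to each node would remove it at the cost of a larger node.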

Re: [patch][RFC] bitmaps as lists or trees

2013-03-05 Thread Steven Bosscher

On Tue, Mar 5, 2013 at 1:32 PM, Richard Biener wrote:
 The attached patch is a first stab at an idea I've had for a while:
 Implement a change of view for bitmaps, such that a bitmap can be
 either a linked list, or a binary tree.
...
 Definitely a nice idea.  Iteration should be easy to implement (without
 actually splaying for each visited bit), the bit operations can use the
 iteration as building block as well then.

It is really easy, you only have to listify the splay tree such that
the root is the element with the lowest index. AFAICT the iterators
only look at the next member of each bitmap_element, and a list is
also a valid splay tree.


 Now, an instrumented bitmap to identify bitmaps that would benefit
 from the tree view would be nice ;)  [points-to sets are never modified
 after being computed, but they are both random-tested and intersected]

I have no idea how to create that kind of instrumentation.

 What I missed often as well is a reference counted shared bitmap
 implementation (we have various special case implementations).
 I wonder if that could even use shared sub-trees/lists of bitmap_elts.

And this idea, I don't even understand :-)
reference counted shared bitmaps as in, the same bitmap element
shared between different bitmaps? How would you link such elements
together in a tree or a list? It could be done with array bitmaps, but
those have other downsides (insert/delete is near impossible without a
lot of mem-moving around).

Ciao!
Steven

RE: [Patch, microblaze]: Add support for swap instructions and reorder option

2013-03-05 Thread David Holsgrove

Hi Michael,

 -Original Message-
 From: Michael Eager [mailto:ea...@eagerm.com]
 Sent: Monday, 4 March 2013 3:37 am
 To: David Holsgrove
 Cc: Michael Eager; gcc-patches@gcc.gnu.org; John Williams; Edgar E. Iglesias
 (edgar.igles...@gmail.com); Vinod Kathail; Vidhumouli Hunsigida; Nagaraju
 Mekala; Tom Shui
 Subject: Re: [Patch, microblaze]: Add support for swap instructions and
 reorder option
 
 Committed revision 196415.

Thanks for committing.

 
 Please submit a patch to update gcc/doc/invoke.texi with -mxl-reorder
 description.
 

Please find patch attached to this mail which updates the MicroBlaze section of
documentation to include -mxl-reorder. I also added -mbig-endian and
-mlittle-endian as they were missed in previous patch.

thanks again,
David

 
 --
 Michael Eager  ea...@eagercon.com
 1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
 





0001-Patch-microblaze-Update-gcc-doc-invoke.texi-for-Micr.patch
Description: 0001-Patch-microblaze-Update-gcc-doc-invoke.texi-for-Micr.patch

Re: [PATCH] Silence up a false positive warning in libiberty (PR middle-end/56526)

2013-03-05 Thread Ian Lance Taylor

On Tue, Mar 5, 2013 at 2:52 AM, Jakub Jelinek ja...@redhat.com wrote:

 2013-03-05  Jakub Jelinek  ja...@redhat.com

 PR middle-end/56526
 * simple-object-mach-o.c (simple_object_mach_o_segment): Initialize
 wrapper_sect_offset to avoid a warning.

This is OK.

Thanks.

Ian

RE: [Patch, microblaze]: Added fast_interrupt controller

2013-03-05 Thread David Holsgrove

Hi Michael,

 -Original Message-
 From: Michael Eager [mailto:ea...@eagerm.com]
 Sent: Wednesday, 27 February 2013 4:12 am
 To: David Holsgrove
 Cc: gcc-patches@gcc.gnu.org; Michael Eager (ea...@eagercon.com); John
 Williams; Edgar E. Iglesias (edgar.igles...@gmail.com); Vinod Kathail;
 Vidhumouli Hunsigida; Nagaraju Mekala; Tom Shui
 Subject: Re: [Patch, microblaze]: Added fast_interrupt controller

 On 02/10/2013 10:39 PM, David Holsgrove wrote:
  Added fast_interrupt controller

  Changelog

  2013-02-11  Nagaraju Mekala nmek...@xilinx.com

 * config/microblaze/microblaze-protos.h: microblaze_is_fast_interrupt.
 * config/microblaze/microblaze.c (microblaze_attribute_table): Add
microblaze_is_fast_interrupt.
(microblaze_fast_interrupt_function_p): New function.
(microblaze_is_fast_interrupt check): New function.
(microblaze_must_save_register): Account for fast_interrupt.
(save_restore_insns): Likewise.
(compute_frame_size): Likewise.
(microblaze_globalize_label): Add FAST_INTERRUPT_NAME.
 * config/microblaze/microblaze.h: Define FAST_INTERRUPT_NAME as
fast_interrupt.
 * config/microblaze/microblaze.md (movsi_status): Can be
fast_interrupt
(return): Add microblaze_is_fast_interrupt.
(return_internal): Likewise.

 +int
 +microblaze_is_fast_interrupt (void)
 +{
 +  return fast_interrupt;
 +}

 +  if (fast_interrupt)
 +{

 Use wrapper functions consistently.  Either reference the flag everywhere
 or use the wrapper everywhere.

I've repurposed the existing 'microblaze_is_interrupt_handler' wrapper (which
was only used in the machine description) to be
'microblaze_is_interrupt_variant' - true if the function's attribute is
either interrupt_handler or fast_interrupt.

 +  if (interrupt_handler || fast_interrupt)

 +  if (microblaze_is_interrupt_handler () || microblaze_is_fast_interrupt())

 There are many places in the patch where both interrupt_handler and
 fast_interrupt are tested.  These can be eliminated by setting the
 interrupt_handler flag when you see fast_interrupt and checking for the
 correct registers to be saved in microblaze_must_save_register().

I've used this microblaze_is_interrupt_variant wrapper throughout, checking
specifically for the interrupt_handler or fast_interrupt flag only where it was
necessary to handle them differently.

Please let me know if the patch attached is acceptable, or if you would prefer
I refactor all the existing interrupt_handler functionality to accommodate the
fast_interrupt.

Updated Changelog;

2013-03-05  David Holsgrove david.holsgr...@xilinx.com

  *  gcc/config/microblaze/microblaze-protos.h: Rename
 microblaze_is_interrupt_handler to microblaze_is_interrupt_variant.
  *  gcc/config/microblaze/microblaze.c (microblaze_attribute_table): Add
 fast_interrupt.
 (microblaze_fast_interrupt_function_p): New function.
 (microblaze_is_interrupt_handler): Rename to
 microblaze_is_interrupt_variant and add fast_interrupt check.
 (microblaze_must_save_register): Use microblaze_is_interrupt_variant.
 (save_restore_insns): Likewise.
 (compute_frame_size): Likewise.
 (microblaze_function_prologue): Add FAST_INTERRUPT_NAME.
 (microblaze_globalize_label): Likewise.
  *  gcc/config/microblaze/microblaze.h: Define FAST_INTERRUPT_NAME.
  *  gcc/config/microblaze/microblaze.md: Use wrapper
 microblaze_is_interrupt_variant.

thanks again for the reviews,
David

 +  if ((interrupt_handler && !prologue) ||( fast_interrupt && !prologue) )

 +  if ((interrupt_handler && prologue) || (fast_interrupt && prologue))

 Refactor.  Fix spacing around parens.

 --
 Michael Eager  ea...@eagercon.com
 1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

0002-Gcc-Added-fast_interrupt-controller.patch
Description: 0002-Gcc-Added-fast_interrupt-controller.patch

Re: Patch ping

2013-03-05 Thread Vladimir Makarov


On 03/05/2013 08:12 AM, Jakub Jelinek wrote:

Hi!

Thanks for all the recent reviews of memory leak plugging patches,
there are 4 still unreviewed from last week though.

- sched-deps leak fix:
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html



This patch is ok.  Thanks for working on this, Jakub.

Re: Patch ping

2013-03-05 Thread Vladimir Makarov


On 03/05/2013 08:12 AM, Jakub Jelinek wrote:

Hi!

Thanks for all the recent reviews of memory leak plugging patches,
there are 4 still unreviewed from last week though.

  LRA leak fix:
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html


This patch is ok too.

[PATCH] Fix PR50494 in a different way

2013-03-05 Thread Richard Biener


This fixes PR50494 by preventing the vectorizer from increasing the
alignment of decls that are in the constant pool.

Bootstrap & regtest pending on powerpc64-linux-gnu, with
the older fix reverted.

Richard.

2013-03-05  Richard Biener  rguent...@suse.de

PR middle-end/50494
* tree-vect-data-refs.c (vect_can_force_dr_alignment_p):
Do not adjust alignment of DECL_IN_CONSTANT_POOL decls.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 196466)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -4829,9 +4829,12 @@ vect_can_force_dr_alignment_p (const_tre
   /* We cannot change alignment of common or external symbols as another
  translation unit may contain a definition with lower alignment.  
  The rules of common symbol linking mean that the definition
- will override the common symbol.  */
+ will override the common symbol.  The same is true for constant
+ pool entries which may be shared and are not properly merged
+ by LTO.  */
   if (DECL_EXTERNAL (decl)
-  || DECL_COMMON (decl))
+  || DECL_COMMON (decl)
+  || DECL_IN_CONSTANT_POOL (decl))
 return false;
 
   if (TREE_ASM_WRITTEN (decl))

Re: [Patch, microblaze]: Add support for swap instructions and reorder option

2013-03-05 Thread Michael Eager


On 03/05/2013 06:54 AM, David Holsgrove wrote:

Hi Michael,


-Original Message-
From: Michael Eager [mailto:ea...@eagerm.com]
Sent: Monday, 4 March 2013 3:37 am
To: David Holsgrove
Cc: Michael Eager; gcc-patches@gcc.gnu.org; John Williams; Edgar E. Iglesias
(edgar.igles...@gmail.com); Vinod Kathail; Vidhumouli Hunsigida; Nagaraju
Mekala; Tom Shui
Subject: Re: [Patch, microblaze]: Add support for swap instructions and reorder
option

Committed revision 196415.


Thanks for committing.



Please submit a patch to update gcc/doc/invoke.texi with -mxl-reorder
description.



Please find patch attached to this mail which updates the MicroBlaze section of
documentation to include -mxl-reorder. I also added -mbig-endian and
-mlittle-endian as they were missed in previous patch.


Thanks.

Committed revision 196470.

gcc/ChangeLog:

2013-03-05  David Holsgrove david.holsgr...@xilinx.com

* doc/invoke.texi (MicroBlaze): Add -mbig-endian, -mlittle-endian,
-mxl-reorder.


Please remember to submit a ChangeLog with patches.


--
Michael Eager  ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [patch][RFC] bitmaps as lists or trees

2013-03-05 Thread Richard Biener

On Tue, Mar 5, 2013 at 3:50 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Tue, Mar 5, 2013 at 1:32 PM, Richard Biener wrote:
 The attached patch is a first stab at an idea I've had for a while:
 Implement a change of view for bitmaps, such that a bitmap can be
 either a linked list, or a binary tree.
 ...
 Definitely a nice idea.  Iteration should be easy to implement (without
 actually splaying for each visited bit), the bit operations can use the
 iteration as building block as well then.

 It is really easy, you only have to listify the splay tree such that
 the root is the element with the lowest index. AFAICT the iterators
 only look at the next member of each bitmap_element, and a list is
 also a valid splay tree.

You'd have a fat iterator object with a (sorted) array of bitmap elements to
iterate over, similar to how loop iterators work.

 Now, an instrumented bitmap to identify bitmaps that would benefit
 from the tree view would be nice ;)  [points-to sets are never modified
 after being computed, but they are both random-tested and intersected]

 I have no idea how to create that kind of instrumentation.

 What I missed often as well is a reference counted shared bitmap
 implementation (we have various special case implementations).
 I wonder if that could even use shared sub-trees/lists of bitmap_elts.

 And this idea, I don't even understand :-)
 reference counted shared bitmaps as in, the same bitmap element
 shared between different bitmaps? How would you link such elements
 together in a tree or a list? It could be done with array bitmaps, but
 those have other downsides (insert/delete is near impossible without a
 lot of mem-moving around).

You can share leaves of trees (not of lists due to the back pointer),
splaying of course destroys the shared properties ...

At the moment shared bitmaps (where used) are simply using hashtables
and bitmap_hash.  The propagation parts of the points-to solver could
benefit from copy-on-write shared bitmaps.

Richard.

 Ciao!
 Steven

Re: [PATCH] Fix PR56344

2013-03-05 Thread Marek Polacek

On Fri, Mar 01, 2013 at 09:41:27AM +0100, Richard Biener wrote:
 On Wed, Feb 27, 2013 at 6:38 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
  On Wed, 27 Feb 2013, Richard Biener wrote:
 
  Wouldn't it be better to simply pass this using the variable size handling
  code?  Thus, initialize args_size.var for too large constant size instead?
 
  Would that be compatible with the ABI definition of how a large (constant
  size) argument should be passed?
 
 I'm not sure.  Another alternative is to expand to __builtin_trap (), but
 that's probably not easy at this very point.
 
 Or simply fix the size calculation to not overflow (either don't count bits
 or use a double-int).

I don't think double_int will help us here.  We won't detect overflow,
because we overflowed here (when lower_bound is an int):
  lower_bound = INTVAL (XEXP (XEXP (arg->stack_slot, 0), 1));
The value from INTVAL () fits when lower_bound is a double_int, but
then:
  i = lower_bound;
  ...
  stack_usage_map[i]
the size of stack_usage_map is stored in highest_outgoing_arg_in_use,
which is an int, so we're limited by an int size here.
Changing the type of highest_outgoing_arg_in_use from an int to a
double_int isn't worth the trouble, IMHO.
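
The point above can be sketched in isolation: a wide offset may be perfectly
representable, yet still unusable once it has to become an int index.  This is
an illustration only, assuming a 32-bit int; the helper names
(offset_fits_int_index, mark_stack_usage) are hypothetical and are not part of
calls.c.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: nonzero when a 64-bit stack offset can be stored in
   an int index without changing its value.  */
static int
offset_fits_int_index (long long offset)
{
  return offset >= 0 && offset <= INT_MAX;
}

/* Hypothetical helper: mark bytes [lower, upper) of MAP as in use, refusing
   any range that an int index cannot address.  Returns 0 on success and -1
   if the range is rejected.  */
static int
mark_stack_usage (char *map, int map_size, long long lower, long long upper)
{
  assert (map != NULL);
  if (!offset_fits_int_index (lower) || !offset_fits_int_index (upper)
      || lower > upper || upper > map_size)
    return -1;
  for (int i = (int) lower; i < (int) upper; i++)
    map[i] = 1;
  return 0;
}
```

Rejecting the range up front corresponds to the sorry ()/error () approach:
the oversized argument is diagnosed instead of being truncated into an index.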

Maybe the original approach, only with sorry () instead of error ()
and e.g. HOST_BITS_PER_INT - 1 instead of 30 would be appropriate
after all.  Dunno.

Marek

[PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)

2013-03-05 Thread Jakub Jelinek

Hi!

Without this patch, ifcvt extends lifetime of %eax hard register,
which causes reload/LRA ICE later on.  Combiner and other passes try hard
not to do that, even ifcvt has code for it if x is a hard register a few
lines below it, but in this case the hard register is SET_SRC (set_b).

With this patch we just use the pseudo (x) which has been initialized
from the hard register before the conditional.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-03-05  Jakub Jelinek  ja...@redhat.com

PR rtl-optimization/56484
* ifcvt.c (noce_process_if_block): Before reload if else_bb
is NULL, avoid extending lifetimes of hard registers in
likely spilled or small register classes.

--- gcc/ifcvt.c.jj  2013-01-11 09:02:48.0 +0100
+++ gcc/ifcvt.c 2013-03-05 12:36:19.217251997 +0100
@@ -2491,6 +2491,15 @@ noce_process_if_block (struct noce_if_in
  || ! noce_operand_ok (SET_SRC (set_b))
  || reg_overlap_mentioned_p (x, SET_SRC (set_b))
  || modified_between_p (SET_SRC (set_b), insn_b, jump)
+ /* Avoid extending the lifetime of hard registers on small
+    register class machines before reload.  */
+ || (!reload_completed
+     && REG_P (SET_SRC (set_b))
+     && HARD_REGISTER_P (SET_SRC (set_b))
+     && (targetm.class_likely_spilled_p
+         (REGNO_REG_CLASS (REGNO (SET_SRC (set_b))))
+         || targetm.small_register_classes_for_mode_p
+              (GET_MODE (SET_SRC (set_b)))))
  /* Likewise with X.  In particular this can happen when
 noce_get_condition looks farther back in the instruction
 stream than one might expect.  */
--- gcc/testsuite/gcc.c-torture/compile/pr56484.c.jj  2013-03-05 12:42:24.972220034 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr56484.c     2013-03-05 12:41:59.0 +0100
@@ -0,0 +1,17 @@
+/* PR rtl-optimization/56484 */
+
+unsigned char b[4096];
+int bar (void);
+
+int
+foo (void)
+{
+  int a = 0;
+  while (bar ())
+{
+  int c = bar ();
+  a = a  0 ? a : c;
+  __builtin_memset (b, 0, sizeof b);
+}
+  return a;
+}

Jakub

[PATCH] Avoid too complex debug insns during expansion (PR debug/56510)

2013-03-05 Thread Jakub Jelinek

Hi!

cselib (probably among others) isn't prepared to handle arbitrarily
complex debug insns.  The debug insns are usually created from debug stmts
which shouldn't have unbound complexity, but with TER we can actually end up
with arbitrarily large debug insns.

This patch fixes that up during expansion, by splitting subexpressions of
too large debug insn expressions into their own debug temporaries.

So far bootstrapped/regtested on x86_64-linux and i686-linux without the
first two hunks (it caused one failure on the latter because of invalid RTL
sharing), I'm going to bootstrap/regtest it again, ok for trunk if it
passes?
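
The splitting idea can be sketched on a toy expression tree.  This is not
GCC's rtx machinery: struct expr, struct bindings, limit_depth and split_all
are hypothetical names, and a "binding" here merely records the subexpression
that a temporary stands for.  The nesting limit of 4 is the one named in the
patch comment.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_DEPTH 4   /* nesting limit named in the patch comment */

/* Toy expression node; a node with no operands is a leaf.  temp_id > 0
   marks a leaf that stands for a previously bound temporary.  */
struct expr
{
  struct expr *op[2];
  int temp_id;
};

/* Bindings emitted so far: bound[k] is what temporary k + 1 names.  */
struct bindings
{
  struct expr *bound[64];
  int count;
};

static struct expr *
make_expr (struct expr *a, struct expr *b)
{
  struct expr *e = calloc (1, sizeof *e);
  if (e == NULL)
    abort ();
  e->op[0] = a;
  e->op[1] = b;
  return e;
}

static int
is_leaf (const struct expr *e)
{
  return e->op[0] == NULL && e->op[1] == NULL;
}

/* Replace every non-leaf subexpression found at MAX_DEPTH with a fresh
   temporary and record the binding in OUT.  */
static void
limit_depth (struct expr **exp_p, int depth, struct bindings *out)
{
  struct expr *exp = *exp_p;
  if (exp == NULL || is_leaf (exp))
    return;

  if (depth == MAX_DEPTH)
    {
      assert (out->count < 64);
      out->bound[out->count++] = exp;
      struct expr *temp = make_expr (NULL, NULL);
      temp->temp_id = out->count;
      *exp_p = temp;
      return;
    }

  for (int i = 0; i < 2; i++)
    limit_depth (&exp->op[i], depth + 1, out);
}

/* A bound subexpression may itself be too deep, so revisit each new
   binding, much as the patch walks the newly emitted debug insns.  */
static void
split_all (struct expr **root, struct bindings *out)
{
  limit_depth (root, 0, out);
  for (int k = 0; k < out->count; k++)
    limit_depth (&out->bound[k], 0, out);
}

/* Number of nested non-leaf levels in E.  */
static int
depth_of (const struct expr *e)
{
  if (e == NULL || is_leaf (e))
    return 0;
  int l = depth_of (e->op[0]);
  int r = depth_of (e->op[1]);
  return 1 + (l > r ? l : r);
}
```

For a chain of ten nested unary operations this yields two temporaries, and
no remaining expression nests more than four levels deep.  The sketch never
frees its nodes, which is acceptable only because it is a toy.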

2013-03-05  Jakub Jelinek  ja...@redhat.com

PR debug/56510
* cfgexpand.c (expand_debug_parm_decl): Call copy_rtx on incoming.
(avoid_complex_debug_insns): New function.
(expand_debug_locations): Call it.

* gcc.dg/pr56510.c: New test.

--- gcc/cfgexpand.c.jj  2013-03-05 15:12:15.071565689 +0100
+++ gcc/cfgexpand.c 2013-03-05 17:21:55.683602432 +0100
@@ -2622,6 +2622,8 @@ expand_debug_parm_decl (tree decl)
  reg = gen_raw_REG (GET_MODE (reg), OUTGOING_REGNO (REGNO (reg)));
  incoming = replace_equiv_address_nv (incoming, reg);
}
+ else
+   incoming = copy_rtx (incoming);
}
 #endif
 
@@ -2637,7 +2639,7 @@ expand_debug_parm_decl (tree decl)
  || (GET_CODE (XEXP (incoming, 0)) == PLUS
   XEXP (XEXP (incoming, 0), 0) == virtual_incoming_args_rtx
   CONST_INT_P (XEXP (XEXP (incoming, 0), 1)
-return incoming;
+return copy_rtx (incoming);
 
   return NULL_RTX;
 }
@@ -3704,6 +3706,54 @@ expand_debug_source_expr (tree exp)
   return op0;
 }
 
+/* Ensure INSN_VAR_LOCATION_LOC (insn) doesn't have unbound complexity.
+   Allow 4 levels of rtl nesting for most rtl codes, and if we see anything
+   deeper than that, create DEBUG_EXPRs and emit DEBUG_INSNs before INSN.  */
+
+static void
+avoid_complex_debug_insns (rtx insn, rtx *exp_p, int depth)
+{
+  rtx exp = *exp_p;
+  if (exp == NULL_RTX)
+return;
+  if ((OBJECT_P (exp) && !MEM_P (exp)) || GET_CODE (exp) == CLOBBER)
+return;
+
+  if (depth == 4)
+{
+  /* Create DEBUG_EXPR (and DEBUG_EXPR_DECL).  */
+  rtx dval = make_debug_expr_from_rtl (exp);
+
+  /* Emit a debug bind insn before INSN.  */
+  rtx bind = gen_rtx_VAR_LOCATION (GET_MODE (exp),
+  DEBUG_EXPR_TREE_DECL (dval), exp,
+  VAR_INIT_STATUS_INITIALIZED);
+
+  emit_debug_insn_before (bind, insn);
+  *exp_p = dval;
+  return;
+}
+
+  const char *format_ptr = GET_RTX_FORMAT (GET_CODE (exp));
+  int i, j;
+  for (i = 0; i < GET_RTX_LENGTH (GET_CODE (exp)); i++)
+switch (*format_ptr++)
+  {
+  case 'e':
+   avoid_complex_debug_insns (insn, &XEXP (exp, i), depth + 1);
+   break;
+
+  case 'E':
+  case 'V':
+   for (j = 0; j < XVECLEN (exp, i); j++)
+ avoid_complex_debug_insns (insn, &XVECEXP (exp, i, j), depth + 1);
+   break;
+
+  default:
+   break;
+  }
+}
+
 /* Expand the _LOCs in debug insns.  We run this after expanding all
regular insns, so that any variables referenced in the function
will have their DECL_RTLs set.  */
@@ -3724,7 +3774,7 @@ expand_debug_locations (void)
 if (DEBUG_INSN_P (insn))
   {
tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
-   rtx val;
+   rtx val, prev_insn, insn2;
enum machine_mode mode;
 
if (value == NULL_TREE)
@@ -3753,6 +3803,9 @@ expand_debug_locations (void)
  }
 
INSN_VAR_LOCATION_LOC (insn) = val;
+   prev_insn = PREV_INSN (insn);
+   for (insn2 = insn; insn2 != prev_insn; insn2 = PREV_INSN (insn2))
+ avoid_complex_debug_insns (insn2, &INSN_VAR_LOCATION_LOC (insn2), 0);
   }
 
   flag_strict_aliasing = save_strict_alias;
--- gcc/testsuite/gcc.dg/pr56510.c.jj   2013-03-05 16:57:54.498939220 +0100
+++ gcc/testsuite/gcc.dg/pr56510.c  2013-03-05 16:57:54.499939214 +0100
@@ -0,0 +1,37 @@
+/* PR debug/56510 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g" } */
+
+struct S { unsigned long s1; void **s2[0]; };
+void **a, **b, **c, **d, **e, **f;
+
+static void **
+baz (long x, long y)
+{
+  void **s = f;
+  *f = (void **) (y << 8 | (x & 0xff));
+  f += y + 1;
+  return s;
+}
+
+void bar (void);
+void
+foo (void)
+{
+  void **g = b[4];
+  a = b[2];
+  b = b[1];
+  g[2] = e;
+  void **h
+    = ((void **)
+       a)[1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][66];
+  void **i = ((struct S *) h)->s2[4];
+  d = baz (4, 3);
+  d[1] = b;
+  d[2] = a;
+  d[3] = bar;
+  b = d;
+  g[1] = i[2];
+  a = g;
+  ((void (*) (void)) (i[1])) ();
+}

Jakub

[C++ Patch] PR 56534

2013-03-05 Thread Paolo Carlini


Hi,

this (and 55786, which is a duplicate) is an ICE-on-invalid regression in
4.7/4.8. The problem is that for such broken input,
check_elaborated_type_specifier is called by
cp_parser_elaborated_type_specifier with a DECL which has a null
TREE_TYPE, a TEMPLATE_ID_EXPR actually, and therefore immediately
crashes on TREE_CODE (type) == TEMPLATE_TYPE_PARM.


In comparison, 4_6-branch, instead of calling
check_elaborated_type_specifier, has cp_parser_elaborated_type_specifier
simply doing type = TREE_TYPE (decl), thus it seems we can cure the
regression in a straightforward and safe way by simply checking that
TREE_TYPE (decl) is not null at the beginning of
check_elaborated_type_specifier. In this way the error messages are also
exactly the same as those produced by 4_6.


Tested x86_64-linux.

Thanks,
Paolo.

//
/cp
2013-03-05  Paolo Carlini  paolo.carl...@oracle.com

PR c++/56534
* decl.c (check_elaborated_type_specifier): Check for NULL_TREE
as TREE_TYPE (decl).

/testsuite
2013-03-05  Paolo Carlini  paolo.carl...@oracle.com

PR c++/56534
* g++.dg/template/crash115.C: New.
Index: cp/decl.c
===
--- cp/decl.c   (revision 196465)
+++ cp/decl.c   (working copy)
@@ -11725,6 +11725,8 @@ check_elaborated_type_specifier (enum tag_types ta
 decl = TYPE_NAME (TREE_TYPE (decl));
 
   type = TREE_TYPE (decl);
+  if (!type)
+return NULL_TREE;
 
   /* Check TEMPLATE_TYPE_PARM first because DECL_IMPLICIT_TYPEDEF_P
  is false for this case as well.  */
Index: testsuite/g++.dg/template/crash115.C
===
--- testsuite/g++.dg/template/crash115.C(revision 0)
+++ testsuite/g++.dg/template/crash115.C(working copy)
@@ -0,0 +1,3 @@
+// PR c++/56534
+
+template  struct template rebind   // { dg-error expected }

Re: [PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)

2013-03-05 Thread Jeff Law


On 03/05/2013 09:26 AM, Jakub Jelinek wrote:

Hi!

Without this patch, ifcvt extends lifetime of %eax hard register,
which causes reload/LRA ICE later on.  Combiner and other passes try hard
not to do that, even ifcvt has code for it if x is a hard register a few
lines below it, but in this case the hard register is SET_SRC (set_b).

With this patch we just use the pseudo (x) which has been initialized
from the hard register before the conditional.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-03-05  Jakub Jelinek  ja...@redhat.com

PR rtl-optimization/56484
* ifcvt.c (noce_process_if_block): Before reload if else_bb
is NULL, avoid extending lifetimes of hard registers in
likely spilled or small register classes.

OK.
Jeff

Re: [PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)

2013-03-05 Thread Eric Botcazou

 Without this patch, ifcvt extends lifetime of %eax hard register,
 which causes reload/LRA ICE later on.  Combiner and other passes try hard
 not to do that, even ifcvt has code for it if x is a hard register a few
 lines below it, but in this case the hard register is SET_SRC (set_b).
 
 With this patch we just use the pseudo (x) which has been initialized
 from the hard register before the conditional.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 2013-03-05  Jakub Jelinek  ja...@redhat.com
 
   PR rtl-optimization/56484
   * ifcvt.c (noce_process_if_block): Before reload if else_bb
   is NULL, avoid extending lifetimes of hard registers in
   likely spilled or small register classes.

ifcvt.c tests only small_register_classes_for_mode_p in the other places, so 
do you really need class_likely_spilled_p here?

-- 
Eric Botcazou

[patch sdbout]: Fix regression in sdbout.c

2013-03-05 Thread Kai Tietz

Hello,

this patch fixes a regression in gcc.dg/debug/tls-1.c testcase for -gcoffn.

ChangeLog

2013-03-05  Kai Tietz  kti...@redhat.com

* sdbout.c (sdbout_one_type): Switch to current function's section
supporting cold/hot.

Tested for x86_64-w64-mingw32.  OK to apply?

Index: sdbout.c
===
--- sdbout.c    (revision 196451)
+++ sdbout.c    (working copy)
@@ -1017,7 +1017,7 @@ sdbout_one_type (tree type)
       && DECL_SECTION_NAME (current_function_decl) != NULL_TREE)
 ; /* Don't change section amid function.  */
   else
-switch_to_section (text_section);
+switch_to_section (current_function_section ());

   switch (TREE_CODE (type))
 {

Re: [patch sdbout]: Fix regression in sdbout.c

2013-03-05 Thread Richard Henderson

On 03/05/2013 09:31 AM, Kai Tietz wrote:
 2013-03-05  Kai Tietz  kti...@redhat.com
 
   * sdbout.c (sdbout_one_type): Switch to current function's section
   supporting cold/hot.

Ok.


r~

[PATCH] libgcc: Add DWARF info to aeabi_ldivmod and aeabi_uldivmod

2013-03-05 Thread Meador Inge

Hi All,

This patch fixes a minor annoyance that causes backtraces to disappear
inside of aeabi_ldivmod and aeabi_uldivmod due to the lack of appropriate
DWARF information.  I fixed the problem by adding the necessary cfi_*
macros in these functions.

OK?

2013-03-05  Meador Inge  mead...@codesourcery.com

* config/arm/bpabi.S (aeabi_ldivmod): Add DWARF information for
computing the location of the link register.
(aeabi_uldivmod): Ditto.

Index: libgcc/config/arm/bpabi.S
===
--- libgcc/config/arm/bpabi.S   (revision 196470)
+++ libgcc/config/arm/bpabi.S   (working copy)
@@ -123,6 +123,7 @@ ARM_FUNC_START aeabi_ulcmp
 #ifdef L_aeabi_ldivmod
 
 ARM_FUNC_START aeabi_ldivmod
+   cfi_start   __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
test_div_by_zero signed
 
sub sp, sp, #8
@@ -132,17 +133,20 @@ ARM_FUNC_START aeabi_ldivmod
 #else
do_push {sp, lr}
 #endif
+98:cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
bl SYM(__gnu_ldivmod_helper) __PLT__
ldr lr, [sp, #4]
add sp, sp, #8
do_pop {r2, r3}
RET
+   cfi_end LSYM(Lend_aeabi_ldivmod)

 #endif /* L_aeabi_ldivmod */
 
 #ifdef L_aeabi_uldivmod
 
 ARM_FUNC_START aeabi_uldivmod
+   cfi_start   __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
test_div_by_zero unsigned
 
sub sp, sp, #8
@@ -152,11 +156,13 @@ ARM_FUNC_START aeabi_uldivmod
 #else
do_push {sp, lr}
 #endif
+98:cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10
bl SYM(__gnu_uldivmod_helper) __PLT__
ldr lr, [sp, #4]
add sp, sp, #8
do_pop {r2, r3}
RET
-   
+   cfi_end LSYM(Lend_aeabi_uldivmod)
+
 #endif /* L_aeabi_divmod */

[google gcc-4_7] change LIPO default module grouping algorithm (issue7490043)

2013-03-05 Thread Rong Xu

Hi,

This patch changes the default lipo module grouping
algorithm from algorithm 0 (eager propagation algorithm)
to algorithm 1 (inclusion_based priority algorithm).

It also changes the name __gcov_lipo_strict_inclusion
to __gcov_lipo_weak_inclusion and the default is 0.
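
The rename also inverts the sense of the flag.  A small sketch of the guard in
modu_edge_add_auxiliary before and after the change, assuming the stripped
operator in the diff is a logical AND; the function names below are
hypothetical wrappers around that one condition.

```c
#include <assert.h>

/* Old guard: the insertion is rejected when the memory check failed and
   strict inclusion is enabled.  */
static int
rejects_with_strict_flag (int fail, int flag_strict_inclusion)
{
  return fail && flag_strict_inclusion;
}

/* New guard: the insertion is rejected when the memory check failed
   unless weak inclusion is enabled.  */
static int
rejects_with_weak_flag (int fail, int flag_weak_inclusion)
{
  return fail && !flag_weak_inclusion;
}

/* The two guards agree whenever weak == !strict.  */
static void
check_flag_equivalence (void)
{
  for (int fail = 0; fail <= 1; fail++)
    for (int weak = 0; weak <= 1; weak++)
      assert (rejects_with_weak_flag (fail, weak)
              == rejects_with_strict_flag (fail, !weak));
}
```

With the new default of 0 for the weak flag, a failed memory check rejects
the insertion, which is what the old flag did when it was set.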

Tested with google internal benchmarks.

-Rong

2013-03-05  Rong Xu  x...@google.com

* libgcc/dyn-ipa.c (__gcov_lipo_weak_inclusion):
changed from __gcov_lipo_strict_inclusion.
(init_dyn_call_graph): Ditto.
(ps_add_auxiliary): Ditto.
(modu_edge_add_auxiliary): Ditto.
* gcc/tree-profile.c (tree_init_dyn_ipa_parameters): Ditto.
* gcc/params.def (PARAM_LIPO_GROUPING_ALGORITHM): Changed
default value from 0 to 1.

Index: libgcc/dyn-ipa.c
===
--- libgcc/dyn-ipa.c(revision 196405)
+++ libgcc/dyn-ipa.c(working copy)
@@ -157,7 +157,7 @@ extern gcov_unsigned_t __gcov_lipo_dump_cgraph;
 extern gcov_unsigned_t __gcov_lipo_max_mem;
 extern gcov_unsigned_t __gcov_lipo_grouping_algorithm;
 extern gcov_unsigned_t __gcov_lipo_merge_modu_edges;
-extern gcov_unsigned_t __gcov_lipo_strict_inclusion;
+extern gcov_unsigned_t __gcov_lipo_weak_inclusion;
 
 #if defined(inhibit_libc)
 __gcov_build_callgraph (void) {}
@@ -195,7 +195,7 @@ enum GROUPING_ALGORITHM
 };
 static int flag_alg_mode;
 static int flag_modu_merge_edges;
-static int flag_strict_inclusion;
+static int flag_weak_inclusion;
 static gcov_unsigned_t mem_threshold;
 
 /* Returns 0 if no dump is enabled. Returns 1 if text form graph
@@ -387,7 +387,7 @@ init_dyn_call_graph (void)
 
   flag_alg_mode = __gcov_lipo_grouping_algorithm;
   flag_modu_merge_edges = __gcov_lipo_merge_modu_edges;
-  flag_strict_inclusion = __gcov_lipo_strict_inclusion;
+  flag_weak_inclusion = __gcov_lipo_weak_inclusion;
   mem_threshold = __gcov_lipo_max_mem * 1.25;
 
   gi_ptr = __gcov_list;
@@ -417,13 +417,13 @@ init_dyn_call_graph (void)
   if ((env_str = getenv (GCOV_DYN_MERGE_EDGES)))
 flag_modu_merge_edges = atoi (env_str);
 
-  if ((env_str = getenv (GCOV_DYN_STRICT_INCLUSION)))
-flag_strict_inclusion = atoi (env_str);
+  if ((env_str = getenv (GCOV_DYN_WEAK_INCLUSION)))
+flag_weak_inclusion = atoi (env_str);
 
   if (do_dump)
fprintf (stderr, 
-        "Using ALG=%d merge_edges=%d strict_inclusion=%d. \n",
-        flag_alg_mode, flag_modu_merge_edges, flag_strict_inclusion);
+        "Using ALG=%d merge_edges=%d weak_inclusion=%d. \n",
+        flag_alg_mode, flag_modu_merge_edges, flag_weak_inclusion);
 }
 
   if (do_dump)
@@ -1809,7 +1809,7 @@ ps_add_auxiliary (const void *value,
   int not_safe_to_insert = *(int *) data3;
   gcov_unsigned_t new_ggc_size;
 
-  /* For strict incluesion, we know it's safe to insert.  */
+  /* For strict inclusion, we know it's safe to insert.  */
   if (!not_safe_to_insert)
 {
   modu_add_auxiliary (m_id, s_m_id, *(gcov_type*)data2);
@@ -1825,7 +1825,8 @@ ps_add_auxiliary (const void *value,
   return 1;
 }
 
-/* return 1 if insertion happened, otherwise 0.  */
+/* Return 1 if insertion happened, otherwise 0.  */
+
 static int
 modu_edge_add_auxiliary (struct modu_edge *edge)
 {
@@ -1871,7 +1872,7 @@ modu_edge_add_auxiliary (struct modu_edge *edge)
 {
   pointer_set_traverse (node_exported_to, ps_check_ggc_mem,
 callee_m_id, fail, 0);
-  if (fail && flag_strict_inclusion)
+  if (fail && !flag_weak_inclusion)
 return 0;
 }
 
Index: gcc/tree-profile.c
===
--- gcc/tree-profile.c  (revision 196471)
+++ gcc/tree-profile.c  (working copy)
@@ -389,10 +389,10 @@ tree_init_dyn_ipa_parameters (void)
   gcov_lipo_strict_inclusion = build_decl (
   UNKNOWN_LOCATION,
   VAR_DECL,
-  get_identifier ("__gcov_lipo_strict_inclusion"),
+  get_identifier ("__gcov_lipo_weak_inclusion"),
   get_gcov_unsigned_t ());
   init_comdat_decl (gcov_lipo_strict_inclusion,
-PARAM_LIPO_STRICT_INCLUSION);
+PARAM_LIPO_WEAK_INCLUSION);
 }
 }
 
Index: gcc/params.def
===
--- gcc/params.def  (revision 196471)
+++ gcc/params.def  (working copy)
@@ -1018,25 +1018,26 @@ DEFPARAM (PARAM_INLINE_DUMP_MODULE_ID,
LIPO profile-gen.  */
 DEFPARAM (PARAM_LIPO_GROUPING_ALGORITHM,
  lipo-grouping-algorithm,
- Default is 0 which is the eager propagation algorithm.
-  If the value is 1, use the inclusion_based priority algorithm.,
- 0, 0, 1)
+ Algorithm 0 uses the eager propagation algorithm.
+ Algorithm 1 uses the inclusion_based priority algorithm.
+ The default algorithm is 1.,
+ 1, 0, 1)
 
 /* In the inclusion_based_priority grouping algorithm, specify if we combine

Re: [google gcc-4_7] change LIPO default module grouping algorithm (issue7490043)

2013-03-05 Thread Xinliang David Li

Looks good.

thanks,

David

On Tue, Mar 5, 2013 at 11:06 AM, Rong Xu x...@google.com wrote:
 Hi,

 This patch changes the default lipo module grouping
 algorithm from algorithm 0 (eager propagation algorithm)
 to algorithm 1 (inclusion_based priority algorithm).

 It also changes the name __gcov_lipo_strict_inclusion
 to __gcov_lipo_weak_inclusion and the default is 0.

 Tested with google internal benchmarks.

 -Rong

 2013-03-05  Rong Xu  x...@google.com

 * libgcc/dyn-ipa.c (__gcov_lipo_weak_inclusion):
 changed from __gcov_lipo_strict_inclusion.
 (init_dyn_call_graph): Ditto.
 (ps_add_auxiliary): Ditto.
 (modu_edge_add_auxiliary): Ditto.
 * gcc/tree-profile.c (tree_init_dyn_ipa_parameters): Ditto.
 * gcc/params.def (PARAM_LIPO_GROUPING_ALGORITHM): Changed
 default value from 0 to 1.

 Index: libgcc/dyn-ipa.c
 ===
 --- libgcc/dyn-ipa.c(revision 196405)
 +++ libgcc/dyn-ipa.c(working copy)
 @@ -157,7 +157,7 @@ extern gcov_unsigned_t __gcov_lipo_dump_cgraph;
  extern gcov_unsigned_t __gcov_lipo_max_mem;
  extern gcov_unsigned_t __gcov_lipo_grouping_algorithm;
  extern gcov_unsigned_t __gcov_lipo_merge_modu_edges;
 -extern gcov_unsigned_t __gcov_lipo_strict_inclusion;
 +extern gcov_unsigned_t __gcov_lipo_weak_inclusion;

  #if defined(inhibit_libc)
  __gcov_build_callgraph (void) {}
 @@ -195,7 +195,7 @@ enum GROUPING_ALGORITHM
  };
  static int flag_alg_mode;
  static int flag_modu_merge_edges;
 -static int flag_strict_inclusion;
 +static int flag_weak_inclusion;
  static gcov_unsigned_t mem_threshold;

  /* Returns 0 if no dump is enabled. Returns 1 if text form graph
 @@ -387,7 +387,7 @@ init_dyn_call_graph (void)

flag_alg_mode = __gcov_lipo_grouping_algorithm;
flag_modu_merge_edges = __gcov_lipo_merge_modu_edges;
 -  flag_strict_inclusion = __gcov_lipo_strict_inclusion;
 +  flag_weak_inclusion = __gcov_lipo_weak_inclusion;
mem_threshold = __gcov_lipo_max_mem * 1.25;

gi_ptr = __gcov_list;
 @@ -417,13 +417,13 @@ init_dyn_call_graph (void)
if ((env_str = getenv (GCOV_DYN_MERGE_EDGES)))
  flag_modu_merge_edges = atoi (env_str);

 -  if ((env_str = getenv (GCOV_DYN_STRICT_INCLUSION)))
 -flag_strict_inclusion = atoi (env_str);
 +  if ((env_str = getenv (GCOV_DYN_WEAK_INCLUSION)))
 +flag_weak_inclusion = atoi (env_str);

if (do_dump)
 fprintf (stderr,
 - Using ALG=%d merge_edges=%d strict_inclusion=%d. \n,
 -flag_alg_mode, flag_modu_merge_edges, flag_strict_inclusion);
 + Using ALG=%d merge_edges=%d weak_inclusion=%d. \n,
 +flag_alg_mode, flag_modu_merge_edges, flag_weak_inclusion);
  }

if (do_dump)
 @@ -1809,7 +1809,7 @@ ps_add_auxiliary (const void *value,
int not_safe_to_insert = *(int *) data3;
gcov_unsigned_t new_ggc_size;

 -  /* For strict incluesion, we know it's safe to insert.  */
 +  /* For strict inclusion, we know it's safe to insert.  */
if (!not_safe_to_insert)
  {
modu_add_auxiliary (m_id, s_m_id, *(gcov_type*)data2);
 @@ -1825,7 +1825,8 @@ ps_add_auxiliary (const void *value,
return 1;
  }

 -/* return 1 if insertion happened, otherwise 0.  */
 +/* Return 1 if insertion happened, otherwise 0.  */
 +
  static int
  modu_edge_add_auxiliary (struct modu_edge *edge)
  {
 @@ -1871,7 +1872,7 @@ modu_edge_add_auxiliary (struct modu_edge *edge)
  {
pointer_set_traverse (node_exported_to, ps_check_ggc_mem,
  callee_m_id, fail, 0);
 -  if (fail && flag_strict_inclusion)
 +  if (fail && !flag_weak_inclusion)
  return 0;
  }

 Index: gcc/tree-profile.c
 ===
 --- gcc/tree-profile.c  (revision 196471)
 +++ gcc/tree-profile.c  (working copy)
 @@ -389,10 +389,10 @@ tree_init_dyn_ipa_parameters (void)
gcov_lipo_strict_inclusion = build_decl (
UNKNOWN_LOCATION,
VAR_DECL,
 -  get_identifier (__gcov_lipo_strict_inclusion),
 +  get_identifier (__gcov_lipo_weak_inclusion),
get_gcov_unsigned_t ());
init_comdat_decl (gcov_lipo_strict_inclusion,
 -PARAM_LIPO_STRICT_INCLUSION);
 +PARAM_LIPO_WEAK_INCLUSION);
  }
  }

 Index: gcc/params.def
 ===
 --- gcc/params.def  (revision 196471)
 +++ gcc/params.def  (working copy)
 @@ -1018,25 +1018,26 @@ DEFPARAM (PARAM_INLINE_DUMP_MODULE_ID,
 LIPO profile-gen.  */
  DEFPARAM (PARAM_LIPO_GROUPING_ALGORITHM,
   lipo-grouping-algorithm,
 - Default is 0 which is the eager propagation algorithm.
 -  If the value is 1, use the inclusion_based priority algorithm.,
 - 0, 0, 1)
 + Algorithm 0 uses the eager propagation algorithm.
 +

Re: [Patch, microblaze]: Added fast_interrupt controller

2013-03-05 Thread Michael Eager

On 03/05/2013 07:09 AM, David Holsgrove wrote:

Hi Michael,

-Original Message-
From: Michael Eager [mailto:ea...@eagerm.com]
Sent: Wednesday, 27 February 2013 4:12 am
To: David Holsgrove
Cc: gcc-patches@gcc.gnu.org; Michael Eager (ea...@eagercon.com); John
Williams; Edgar E. Iglesias (edgar.igles...@gmail.com); Vinod Kathail; Vidhumouli
Hunsigida; Nagaraju Mekala; Tom Shui
Subject: Re: [Patch, microblaze]: Added fast_interrupt controller

On 02/10/2013 10:39 PM, David Holsgrove wrote:

Added fast_interrupt controller

Changelog

2013-02-11  Nagaraju Mekala nmek...@xilinx.com

* config/microblaze/microblaze-protos.h: microblaze_is_fast_interrupt.
* config/microblaze/microblaze.c (microblaze_attribute_table): Add
   microblaze_is_fast_interrupt.
   (microblaze_fast_interrupt_function_p): New function.
   (microblaze_is_fast_interrupt check): New function.
   (microblaze_must_save_register): Account for fast_interrupt.
   (save_restore_insns): Likewise.
   (compute_frame_size): Likewise.
   (microblaze_globalize_label): Add FAST_INTERRUPT_NAME.
* config/microblaze/microblaze.h: Define FAST_INTERRUPT_NAME as
   fast_interrupt.
* config/microblaze/microblaze.md (movsi_status): Can be
   fast_interrupt
   (return): Add microblaze_is_fast_interrupt.
   (return_internal): Likewise.

+int
+microblaze_is_fast_interrupt (void)
+{
+  return fast_interrupt;
+}

+  if (fast_interrupt)
+{

Use wrapper functions consistently.  Either reference the flag everywhere
or use the wrapper everywhere.

I've repurposed the existing 'microblaze_is_interrupt_handler' wrapper (which was
only used in the machine description) to be 'microblaze_is_interrupt_variant':
true if the function's attribute is either interrupt_handler or fast_interrupt.

+  if (interrupt_handler || fast_interrupt)

+  if (microblaze_is_interrupt_handler () || microblaze_is_fast_interrupt())

There are many places in the patch where both interrupt_handler and fast_interrupt
are tested.  These can be eliminated by setting the interrupt_handler flag when
you see fast_interrupt and checking for the correct registers to be saved in
microblaze_must_save_register().

I've used this microblaze_is_interrupt_variant wrapper throughout, checking
specifically for the interrupt_handler or fast_interrupt flag only where it was
necessary to handle them differently.
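
A sketch of the wrapper pattern described above, assuming the two attribute
flags are plain ints.  Only microblaze_is_interrupt_variant and its meaning
come from the description; enum handler_kind and classify_current_function
are hypothetical and show where a caller would still need to tell the two
variants apart.

```c
#include <assert.h>

/* Stand-ins for the per-function attribute flags kept by the back end.  */
static int interrupt_handler;
static int fast_interrupt;

/* True if the function's attribute is either interrupt_handler or
   fast_interrupt.  */
static int
microblaze_is_interrupt_variant (void)
{
  return interrupt_handler || fast_interrupt;
}

/* Hypothetical classification used only for this illustration.  */
enum handler_kind { NOT_INTERRUPT, NORMAL_INTERRUPT, FAST_INTERRUPT };

/* Common handling goes through the wrapper; the individual flag is
   consulted only where the two variants differ.  */
static enum handler_kind
classify_current_function (void)
{
  if (!microblaze_is_interrupt_variant ())
    return NOT_INTERRUPT;
  assert (interrupt_handler || fast_interrupt);
  return fast_interrupt ? FAST_INTERRUPT : NORMAL_INTERRUPT;
}
```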

Please let me know if the patch attached is acceptable, or if you would prefer
I refactor all the existing interrupt_handler functionality to accommodate the
fast_interrupt.

Updated Changelog;

2013-03-05  David Holsgrove david.holsgr...@xilinx.com

   *  gcc/config/microblaze/microblaze-protos.h: Rename
  microblaze_is_interrupt_handler to microblaze_is_interrupt_variant.
   *  gcc/config/microblaze/microblaze.c (microblaze_attribute_table): Add
  fast_interrupt.
  (microblaze_fast_interrupt_function_p): New function.
  (microblaze_is_interrupt_handler): Rename to
  microblaze_is_interrupt_variant and add fast_interrupt check.
  (microblaze_must_save_register): Use microblaze_is_interrupt_variant.
  (save_restore_insns): Likewise.
  (compute_frame_size): Likewise.
  (microblaze_function_prologue): Add FAST_INTERRUPT_NAME.
  (microblaze_globalize_label): Likewise.
   *  gcc/config/microblaze/microblaze.h: Define FAST_INTERRUPT_NAME.
   *  gcc/config/microblaze/microblaze.md: Use wrapper
  microblaze_is_interrupt_variant.

thanks again for the reviews,
David

+  if ((interrupt_handler && !prologue) ||( fast_interrupt && !prologue) )

+  if ((interrupt_handler && prologue) || (fast_interrupt && prologue))

Refactor.  Fix spacing around parens.

Committed revision 196474.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

[SH] PR 55303 - Add basic support for SH2A clip insns

2013-03-05 Thread Oleg Endo

Hi,

This adds basic support for the SH2A clips and clipu instructions.
Tested on rev 196406 with
make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

and no new failures.

OK for trunk or 4.9?

Cheers,
Oleg

gcc/ChangeLog:

PR target/55303
* config/sh/sh.c (sh_rtx_costs): Handle SMIN and SMAX cases.
* config/sh/sh.md (*clips, uminsi3, *clipu, clipu_one): New insns and
related expanders.
* config/sh/iterators.md (SMIN_SMAX): New code iterator.
* config/sh/predicates.md (arith_reg_or_0_or_1_operand,
clips_min_const_int, clips_max_const_int, clipu_max_const_int):
New predicates.

testsuite/ChangeLog:

PR target/55303
* gcc.target/sh/pr55303-1.c: New.
* gcc.target/sh/pr55303-2.c: New.
* gcc.target/sh/pr55303-3.c: New.
Index: gcc/testsuite/gcc.target/sh/pr55303-1.c
===
--- gcc/testsuite/gcc.target/sh/pr55303-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/pr55303-1.c	(revision 0)
@@ -0,0 +1,87 @@
+/* Verify that the SH2A clips and clipu instructions are generated as
+   expected.  */
+/* { dg-do compile { target "sh*-*-*" } } */
+/* { dg-options "-O2" } */
+/* { dg-skip-if "" { "sh*-*-*" } { "*" } { "-m2a*" } } */
+/* { dg-final { scan-assembler-times "clips.b" 2 } } */
+/* { dg-final { scan-assembler-times "clips.w" 2 } } */
+/* { dg-final { scan-assembler-times "clipu.b" 2 } } */
+/* { dg-final { scan-assembler-times "clipu.w" 2 } } */
+
+static inline int
+min (int a, int b)
+{
+  return a < b ? a : b;
+}
+
+static inline int
+max (int a, int b)
+{
+  return a < b ? b : a;
+}
+
+int
+test_00 (int a)
+{
+  /* 1x clips.b  */
+  return max (-128, min (127, a));
+}
+
+int
+test_01 (int a)
+{
+  /* 1x clips.b  */
+  return min (127, max (-128, a));
+}
+
+int
+test_02 (int a)
+{
+  /* 1x clips.w  */
+  return max (-32768, min (32767, a));
+}
+
+int
+test_03 (int a)
+{
+  /* 1x clips.w  */
+  return min (32767, max (-32768, a));
+}
+
+unsigned int
+test_04 (unsigned int a)
+{
+  /* 1x clipu.b  */
+  return a > 255 ? 255 : a;
+}
+
+unsigned int
+test_05 (unsigned int a)
+{
+  /* 1x clipu.b  */
+  return a >= 255 ? 255 : a;
+}
+
+unsigned int
+test_06 (unsigned int a)
+{
+  /* 1x clipu.w  */
+  return a > 65535 ? 65535 : a;
+}
+
+unsigned int
+test_07 (unsigned int a)
+{
+  /* 1x clipu.w  */
+  return a >= 65535 ? 65535 : a;
+}
+
+void
+test_08 (unsigned short a, unsigned short b, unsigned int* r)
+{
+  /* Must not see a clip insn here -- it is not needed.  */
+  unsigned short x = a + b;
+  if (x > 65535)
+x = 65535;
+  *r = x;
+}
Index: gcc/testsuite/gcc.target/sh/pr55303-3.c
===
--- gcc/testsuite/gcc.target/sh/pr55303-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/pr55303-3.c	(revision 0)
@@ -0,0 +1,15 @@
+/* Verify that the special case (umin (reg const_int 1)) results in the
+   expected instruction sequence on SH2A.  */
+/* { dg-do compile { target "sh*-*-*" } } */
+/* { dg-options "-O2" } */
+/* { dg-skip-if "" { "sh*-*-*" } { "*" } { "-m2a*" } } */
+/* { dg-final { scan-assembler-times "tst" 1 } } */
+/* { dg-final { scan-assembler-times "movrt" 1 } } */
+
+unsigned int
+test_00 (unsigned int a)
+{
+  /* 1x tst
+ 1x movrt  */
+  return a > 1 ? 1 : a;
+}
Index: gcc/testsuite/gcc.target/sh/pr55303-2.c
===
--- gcc/testsuite/gcc.target/sh/pr55303-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/pr55303-2.c	(revision 0)
@@ -0,0 +1,35 @@
+/* Verify that for SH2A smax/smin - cbranch conversion is done properly
+   if the clips insn is not used and the expected comparison insns are
+   generated.  */
+/* { dg-do compile { target "sh*-*-*" } } */
+/* { dg-options "-O2" } */
+/* { dg-skip-if "" { "sh*-*-*" } { "*" } { "-m2a*" } } */
+/* { dg-final { scan-assembler-times "cmp/pl" 4 } } */
+
+int
+test_00 (int a)
+{
+  /* 1x cmp/pl  */
+  return a = 0 ? a : 0;
+}
+
+int
+test_01 (int a)
+{
+  /* 1x cmp/pl  */
+  return a = 0 ? a : 0;
+}
+
+int
+test_02 (int a)
+{
+  /* 1x cmp/pl  */
+  return a  1 ? 1 : a;
+}
+
+int
+test_03 (int a)
+{
+  /* 1x cmp/pl  */
+  return a  1 ? a : 1;
+}
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 196091)
+++ gcc/config/sh/sh.c	(working copy)
@@ -3507,6 +3507,22 @@
   else
 	return false;
 
+case SMIN:
+case SMAX:
+  /* This is most likely a clips.b or clips.w insn that is being made up
+	 by combine.  */
+  if (TARGET_SH2A
+	  && (GET_CODE (XEXP (x, 0)) == SMAX || GET_CODE (XEXP (x, 0)) == SMIN)
+	  && CONST_INT_P (XEXP (XEXP (x, 0), 1))
+	  && REG_P (XEXP (XEXP (x, 0), 0))
+	  && CONST_INT_P (XEXP (x, 1)))
+	{
+	  *total = COSTS_N_INSNS (1);
+	  return true;
+	}
+  else
+	return false;
+
 case CONST:
 case LABEL_REF:
 case SYMBOL_REF:
Index: gcc/config/sh/sh.md

Re: FW: [PATCH] [MIPS] microMIPS gcc support

2013-03-05 Thread Richard Sandiford

Moore, Catherine catherine_mo...@mentor.com writes:
 -Original Message-
 From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
 Sent: Monday, March 04, 2013 3:54 PM
 To: Moore, Catherine
 Cc: gcc-patches@gcc.gnu.org; Rozycki, Maciej
 Subject: Re: FW: [PATCH] [MIPS] microMIPS gcc support

 Moore, Catherine catherine_mo...@mentor.com writes:
  Hi Richard,
 - Predicates should always check the code though.  E.g.:

   (define_predicate "umips_addius5_imm"
     (and (match_code "const_int")
          (match_test "IN_RANGE (INTVAL (op), -8, 7)")))

 - In general, please try to make the names of the predicates as generic
   as possible.  There's nothing really add-specific about the predicate
   above.  Or microMIPS-specific either really: some of these predicates
   are probably going to be useful for MIPS16 too.

   The existing MIPS16 functions follow the convention:

   "n" if negated (optional)
 + "s" or "u" for signed vs. unsigned
 + "imm"
 + number of significant bits
 + "_"
 + multiplication factor or, er, "b" for +1...

   It might be nice to have a similar convention for microMIPS.
   The choices there are a bit more exotic, so please feel free to
   diverge from the MIPS16 one above; we can switch MIPS16 over once
   the microMIPS one is settled.  In fact, a new convention that's
   compact enough to be used in both predicate and constraint names
   would be great.  E.g. for the umips_addius5_imm predicate above,
   a name like Ys5 would be easier to remember than Zo/Yo.

 How compact would you consider compact enough?  I would need to change
 the existing Y constraints as well.

Argh, sorry, I'd forgotten about that restriction.

We have a few internal-only undocumented constraints that aren't used much,
so we should be able to move them to the Y space instead.  The patch
below does this for T and U.  Then we could use U for new, longer
constraints.

 I think trying to invent some convention with fewer than four letters will
 be difficult and even with four, I doubt it could be uniformly followed.
 I think we could get descriptive with four, however.
 Let me know what you think.

Four sounds good.  Here's one idea:

U<type><factor><bits>

where <type> is:

  s for signed
  u for unsigned
  d for decremented unsigned (-1 ... N)
  i for incremented unsigned (1 ... N)

where <factor> is:

  b for byte (*1)
  h for halfwords (*2)
  w for words (*4)
  d for doublewords (*8) -- useful for 64-bit MIPS16 but probably not
  needed for 32-bit microMIPS

and where <bits> is the number of bits.  <type> and <factor> could be
replaced with an ad-hoc two-letter combination for special cases.
E.g. "Uas9" ("add stack") for ADDIUSP.

Just a suggestion though.  I'm not saying these names are totally intuitive
or anything, but they should at least be better than arbitrary letters.

Also, bits could be two digits if necessary, or we could just use
hex digits.
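
To make the proposed naming scheme concrete, here is a minimal sketch of how a name of the form U<type><factor><bits> could be decoded.  It is only an illustration: `struct imm_spec` and `decode_constraint_name` are hypothetical names that do not exist in GCC, <bits> is assumed to be one or two decimal digits, and the ad-hoc two-letter special cases are simply rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of an immediate operand class.  */
struct imm_spec
{
  char type;	/* 's', 'u', 'd' or 'i' */
  int shift;	/* log2 of the multiplication factor */
  int bits;	/* number of significant bits */
};

/* Decode NAME of the form U<type><factor><bits> into SPEC.  Return false
   for anything else, including ad-hoc special cases such as "Uas9".  */
static bool
decode_constraint_name (const char *name, struct imm_spec *spec)
{
  assert (name != NULL && spec != NULL);
  if (name[0] != 'U')
    return false;

  switch (name[1])
    {
    case 's': case 'u': case 'd': case 'i':
      spec->type = name[1];
      break;
    default:
      return false;
    }

  switch (name[2])
    {
    case 'b': spec->shift = 0; break;	/* bytes, *1 */
    case 'h': spec->shift = 1; break;	/* halfwords, *2 */
    case 'w': spec->shift = 2; break;	/* words, *4 */
    case 'd': spec->shift = 3; break;	/* doublewords, *8 */
    default:
      return false;
    }

  /* Accept one or two decimal digits and nothing after them.  */
  int bits = 0;
  size_t i = 3;
  for (; i < 5 && name[i] >= '0' && name[i] <= '9'; i++)
    bits = bits * 10 + (name[i] - '0');
  if (i == 3 || name[i] != '\0' || bits == 0)
    return false;

  spec->bits = bits;
  return true;
}
```

With this sketch, "Uuw5" decodes to an unsigned 5-bit field scaled by 4, while "Uas9" and "Yd" are rejected.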

We could have:

/* Return true if X fits within an unsigned field of BITS bits that is
   shifted left SHIFT bits before being used.  */

static inline bool
mips_unsigned_immediate_p (unsigned HOST_WIDE_INT x, int bits, int shift = 0)
{
  return (x & ((1 << shift) - 1)) == 0 && x < (1 << (shift + bits));
}

/* Return true if X fits within a signed field of BITS bits that is
   shifted left SHIFT bits before being used.  */

static inline bool
mips_signed_immediate_p (unsigned HOST_WIDE_INT x, int bits, int shift = 0)
{
  x += 1 << (bits + shift - 1);
  return mips_unsigned_immediate_p (x, bits, shift);
}

The 'd' and 'i' cases would pass a biased X to mips_unsigned_immediate_p.
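
As a rough standalone illustration of the biasing idea, the sketch below restates the two helpers in plain C and adds wrappers for the 'd' and 'i' cases.  Several things here are assumptions rather than part of the proposal: `uhwi` stands in for unsigned HOST_WIDE_INT, the SHIFT argument is explicit because C has no default arguments, the wrapper names are made up, and the bias is taken to be one scaled unit (1 << shift).

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long long uhwi;	/* stand-in for unsigned HOST_WIDE_INT */

/* Return true if X fits within an unsigned field of BITS bits that is
   shifted left SHIFT bits before being used.  */
static bool
unsigned_immediate_p (uhwi x, int bits, int shift)
{
  assert (bits > 0 && shift >= 0 && bits + shift < 64);
  return (x & (((uhwi) 1 << shift) - 1)) == 0
	 && x < ((uhwi) 1 << (shift + bits));
}

/* Likewise for a signed field: bias X so that the most negative value
   maps to zero, then reuse the unsigned check.  */
static bool
signed_immediate_p (uhwi x, int bits, int shift)
{
  assert (bits > 0 && shift >= 0 && bits + shift < 64);
  x += (uhwi) 1 << (bits + shift - 1);
  return unsigned_immediate_p (x, bits, shift);
}

/* 'd' (decremented unsigned, -1 ... N): assumed to bias X up by one
   scaled unit before the unsigned check.  */
static bool
decremented_immediate_p (uhwi x, int bits, int shift)
{
  return unsigned_immediate_p (x + ((uhwi) 1 << shift), bits, shift);
}

/* 'i' (incremented unsigned, 1 ... N): assumed to bias X down by one
   scaled unit before the unsigned check.  */
static bool
incremented_immediate_p (uhwi x, int bits, int shift)
{
  return unsigned_immediate_p (x - ((uhwi) 1 << shift), bits, shift);
}
```

Under these assumptions a signed 4-bit field accepts -8 ... 7, matching the IN_RANGE test in the predicate quoted earlier, and a 3-bit 'd' field accepts -1 ... 6 while a 3-bit 'i' field accepts 1 ... 8.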

I'll apply the patch below once 4.9 starts.

Thanks,
Richard

gcc/
* config/mips/constraints.md (T): Rename to...
(Yf): ...this.
(U): Rename to...
(Yd): ...this.
* config/mips/mips.md (*movdi_64bit, *movdi_64bit_mips16)
(*mov<mode>_internal, *mov<mode>_mips16): Update accordingly.

Index: gcc/config/mips/constraints.md
===
--- gcc/config/mips/constraints.md	2013-02-25 21:45:10.0 +
+++ gcc/config/mips/constraints.md	2013-03-05 08:22:36.687354771 +
@@ -170,22 +170,6 @@ (define_constraint "S"
   (and (match_operand 0 "call_insn_operand")
        (match_test "CONSTANT_P (op)")))
 
-(define_constraint "T"
-  "@internal
-   A constant @code{move_operand} that cannot be safely loaded into @code{$25}
-   using @code{la}."
-  (and (match_operand 0 "move_operand")
-       (match_test "CONSTANT_P (op)")
-       (match_test "mips_dangerous_for_la25_p (op)")))
-
-(define_constraint "U"
-  "@internal
-   A constant @code{move_operand} that can be safely loaded into @code{$25}
-   using @code{la}."
-  (and (match_operand 0 "move_operand")
-       (match_test "CONSTANT_P (op)")
-       (not (match_test "mips_dangerous_for_la25_p (op)"))))
-
 (define_memory_constraint "W"
   "@internal
    A memory address based on a member of @code{BASE_REG_CLASS}.  This is
@@ -220,6 +204,22 @@ (define_constraint "Yb"
@internal

Re: [Patch] Add microMIPS jraddiusp support

2013-03-05 Thread Richard Sandiford

Moore, Catherine catherine_mo...@mentor.com writes:
 Index: config/mips/micromips.md
 ===
 --- config/mips/micromips.md  (revision 196341)
 +++ config/mips/micromips.md  (working copy)
 @@ -95,6 +95,19 @@
 (set_attr "mode" "SI")
 (set_attr "can_delay" "no")])
  
 +;; For JRADDIUSP.
 +(define_insn "jraddiusp"
 +  [(parallel [(return)
 +  (use (reg:SI 31))
 +   (set (reg:SI 29)
 +(plus:SI (reg:SI 29)
 + (match_operand 0 "const_int_operand")))])]

Since this is a generic pattern (not depending on UNSPECs, etc.),
I think we should use a specific predicate instead of const_int_operand.
From the suggestion in the thread about addition, this would be a uw5,
i.e. uw5_operand.

 Index: config/mips/mips.c
 ===
 --- config/mips/mips.c(revision 196341)
 +++ config/mips/mips.c(working copy)
 @@ -11364,6 +11364,7 @@
const struct mips_frame_info *frame;
HOST_WIDE_INT step1, step2;
rtx base, adjust, insn;
 +  bool use_jraddiusp_p = false;
  
if (!sibcall_p && mips_can_use_return_insn ())
  {
 @@ -11453,6 +11454,14 @@
mips_for_each_saved_gpr_and_fpr (frame->total_size - step2,
  mips_restore_reg);
  
 +  /* Check if we can use JRADDIUSP.  */
 +  use_jraddiusp_p = (TARGET_MICROMIPS
 +                     && !crtl->calls_eh_return
 +                     && !sibcall_p
 +                     && step2 > 0
 +                     && (step2 & 3) == 0
 +                     && step2 <= (31 << 2));
 +
if (cfun->machine->interrupt_handler_p)
   {
 HOST_WIDE_INT offset;
 @@ -11480,8 +11489,9 @@
 mips_emit_move (gen_rtx_REG (word_mode, K0_REG_NUM), mem);
 offset -= UNITS_PER_WORD;
  
 -   /* If we don't use shadow register set, we need to update SP.  */
 -   if (!cfun->machine->use_shadow_register_set_p)
 +   /* If we don't use shadow register set or the microMIPS
 + JRADDIUSP insn, we need to update SP.  */
 +   if (!cfun->machine->use_shadow_register_set_p && !use_jraddiusp_p)
   mips_deallocate_stack (stack_pointer_rtx, GEN_INT (step2), 0);
 else
   /* The choice of position is somewhat arbitrary in this case.  */

We shouldn't use JRADDIUSP in an interrupt handler, so I think it would
be better to move the use_jraddiusp_p condition into the else branch and
drop the hunk above.

 @@ -11492,11 +11502,14 @@
   gen_rtx_REG (SImode, K0_REG_NUM)));
   }
else
 - /* Deallocate the final bit of the frame.  */
 - mips_deallocate_stack (stack_pointer_rtx, GEN_INT (step2), 0);
 + /* Deallocate the final bit of the frame unless using the microMIPS
 +   JRADDIUSP insn.  */
 + if (!use_jraddiusp_p)
 +   mips_deallocate_stack (stack_pointer_rtx, GEN_INT (step2), 0);
  }
  
 -  gcc_assert (!mips_epilogue.cfa_restores);
 +  if (!use_jraddiusp_p)
 +gcc_assert (!mips_epilogue.cfa_restores);

We still need to emit the CFA restores somewhere.  Something like:

    else if (TARGET_MICROMIPS
             && !crtl->calls_eh_return
             && !sibcall_p
             && step2 > 0
             && mips_unsigned_immediate_p (step2, 5, 2))
      {
        /* We can deallocate the stack and jump to $31 using JRADDIUSP.
           Emit the CFA restores immediately before the deallocation.  */
        use_jraddiusp_p = true;
        mips_epilogue_emit_cfa_restores ();
      }
    else
      /* Deallocate the final bit of the frame.  */
      mips_deallocate_stack (stack_pointer_rtx, GEN_INT (step2), 0);

where mips_unsigned_immediate_p comes from the other thread.
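
For what it's worth, the explicit range test in the quoted patch and the helper-based test appear to accept the same offsets.  The sketch below checks only the offset part of the condition (the TARGET_MICROMIPS, eh_return and sibcall tests are left out), and the function names are hypothetical stand-ins rather than GCC functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Offset test as written out in the quoted patch: positive, a multiple
   of four, and no larger than 31 words.  */
static bool
jraddiusp_offset_explicit_p (long step2)
{
  return step2 > 0 && (step2 & 3) == 0 && step2 <= (31 << 2);
}

/* Standalone stand-in for the proposed mips_unsigned_immediate_p.  */
static bool
unsigned_immediate_p (unsigned long x, int bits, int shift)
{
  assert (bits > 0 && shift >= 0 && bits + shift < 32);
  return (x & ((1UL << shift) - 1)) == 0 && x < (1UL << (shift + bits));
}

/* The same offset test expressed as a 5-bit unsigned field scaled by 4.  */
static bool
jraddiusp_offset_helper_p (long step2)
{
  return step2 > 0 && unsigned_immediate_p ((unsigned long) step2, 5, 2);
}
```

Both forms accept 4, 8, ..., 124 and reject 0, 6 and 128.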

Thanks,
Richard

Re: [PATCH] Avoid too complex debug insns during expansion (PR debug/56510)

2013-03-05 Thread Jeff Law


On 03/05/2013 09:30 AM, Jakub Jelinek wrote:

Hi!

cselib (probably among others) isn't prepared to handle arbitrarily
complex debug insns.  The debug insns are usually created from debug stmts
which shouldn't have unbound complexity, but with TER we can actually end up
with arbitrarily large debug insns.

This patch fixes that up during expansion, by splitting subexpressions of
too large debug insn expressions into their own debug temporaries.

So far bootstrapped/regtested on x86_64-linux and i686-linux without the
first two hunks (it caused one failure on the latter because of invalid RTL
sharing), I'm going to bootstrap/regtest it again, ok for trunk if it
passes?

2013-03-05  Jakub Jelinek  ja...@redhat.com

PR debug/56510
* cfgexpand.c (expand_debug_parm_decl): Call copy_rtx on incoming.
(avoid_complex_debug_insns): New function.
(expand_debug_locations): Call it.

* gcc.dg/pr56510.c: New test.

So it's not that cselib (and possibly others) can't handle these complex
RTL expressions, it's just unbearably slow.  Right?

  }

+/* Ensure INSN_VAR_LOCATION_LOC (insn) doesn't have unbound complexity.
+   Allow 4 levels of rtl nesting for most rtl codes, and if we see anything
+   deeper than that, create DEBUG_EXPRs and emit DEBUG_INSNs before INSN.  */

:-)  Similar to a comment I made in someone else's patch, I don't like
the magic number 4, but I don't think this is worth creating a PARAM
for controlling its behaviour.

+
+static void
+avoid_complex_debug_insns (rtx insn, rtx *exp_p, int depth)
+{
+  rtx exp = *exp_p;
+  if (exp == NULL_RTX)
+return;
+  if ((OBJECT_P (exp) && !MEM_P (exp)) || GET_CODE (exp) == CLOBBER)
+return;

A blank line or two seems to be missing above.


Fine with the trivial formatting fix assuming your bootstrap/regtest is OK.


Jeff

Re: [PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)

2013-03-05 Thread Jakub Jelinek

On Tue, Mar 05, 2013 at 06:28:13PM +0100, Eric Botcazou wrote:
  Without this patch, ifcvt extends lifetime of %eax hard register,
  which causes reload/LRA ICE later on.  Combiner and other passes try hard
  not to do that, even ifcvt has code for it if x is a hard register a few
  lines below it, but in this case the hard register is SET_SRC (set_b).
  
  With this patch we just use the pseudo (x) which has been initialized
  from the hard register before the conditional.
  
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
  
  2013-03-05  Jakub Jelinek  ja...@redhat.com
  
  PR rtl-optimization/56484
  * ifcvt.c (noce_process_if_block): Before reload if else_bb
  is NULL, avoid extending lifetimes of hard registers in
  likely spilled or small register classes.
 
 ifcvt.c tests only small_register_classes_for_mode_p in the other places, so 
 do you really need class_likely_spilled_p here?

I guess I don't.  I've grepped for small_register_classes_for_mode_p and
didn't see anything in i386, so I figured out that it would be using a
default (which is false).  But apparently it uses hook_bool_mode_true, so
it is a superset of class_likely_spilled_p, guess I can leave that out.

Jakub

[PATCH] Fix g++.dg/debug/dwarf2/thunk1.C on darwin

2013-03-05 Thread Jack Howarth

Darwin does PIC differently from ELF, so the scan-assembler-times check
fails for g++.dg/debug/dwarf2/thunk1.C. The attached patch skips the
scan-assembler-times check for *-*-darwin*. Tested on x86_64-apple-darwin12.
Okay for gcc trunk?
  Jack

gcc/testsuite/

2013-03-05  Jack Howarth  howa...@bromo.med.uc.edu

PR debug/53363
* g++.dg/debug/dwarf2/thunk1.C: Skip final scan on darwin.

Index: gcc/testsuite/g++.dg/debug/dwarf2/thunk1.C
===
--- gcc/testsuite/g++.dg/debug/dwarf2/thunk1.C  (revision 196462)
+++ gcc/testsuite/g++.dg/debug/dwarf2/thunk1.C  (working copy)
@@ -1,7 +1,7 @@
 // Test that we don't add the x86 PC thunk to .debug_ranges
 // { dg-do compile { target { { i?86-*-* x86_64-*-* } && ia32 } } }
 // { dg-options "-g -fpic -fno-dwarf2-cfi-asm" }
-// { dg-final { scan-assembler-times "LFB3" 5 } }
+// { dg-final { scan-assembler-times "LFB3" 5 { target { ! *-*-darwin* } } } }
 
 template <class T> void f(T t) { }

Re: [PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)

2013-03-05 Thread Jakub Jelinek

On Tue, Mar 05, 2013 at 11:03:13PM +0100, Jakub Jelinek wrote:
  ifcvt.c tests only small_register_classes_for_mode_p in the other places,
  so do you really need class_likely_spilled_p here?
 
 I guess I don't.  I've grepped for small_register_classes_for_mode_p and
 didn't see anything in i386, so I figured out that it would be using a
 default (which is false).  But apparently it uses hook_bool_mode_true, so
 it is a superset of class_likely_spilled_p, guess I can leave that out.

Here is what I've actually committed (I've also removed the
!reload_completed &&, because noce_process_if_block is only called when
!reload_completed (the only caller asserts it)).

2013-03-05  Jakub Jelinek  ja...@redhat.com

PR rtl-optimization/56484
* ifcvt.c (noce_process_if_block): If else_bb is NULL, avoid extending
lifetimes of hard registers on small register class machines.

* gcc.c-torture/compile/pr56484.c: New test.

--- gcc/ifcvt.c.jj  2013-03-05 15:12:15.284564443 +0100
+++ gcc/ifcvt.c 2013-03-05 23:11:25.751625601 +0100
@@ -2491,6 +2491,12 @@ noce_process_if_block (struct noce_if_in
  || ! noce_operand_ok (SET_SRC (set_b))
  || reg_overlap_mentioned_p (x, SET_SRC (set_b))
  || modified_between_p (SET_SRC (set_b), insn_b, jump)
+ /* Avoid extending the lifetime of hard registers on small
+register class machines.  */
+ || (REG_P (SET_SRC (set_b))
+     && HARD_REGISTER_P (SET_SRC (set_b))
+     && targetm.small_register_classes_for_mode_p
+          (GET_MODE (SET_SRC (set_b))))
  /* Likewise with X.  In particular this can happen when
 noce_get_condition looks farther back in the instruction
 stream than one might expect.  */
--- gcc/testsuite/gcc.c-torture/compile/pr56484.c.jj	2013-03-05 16:57:50.416961638 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr56484.c	2013-03-05 16:57:50.417961672 +0100
@@ -0,0 +1,17 @@
+/* PR rtl-optimization/56484 */
+
+unsigned char b[4096];
+int bar (void);
+
+int
+foo (void)
+{
+  int a = 0;
+  while (bar ())
+{
+  int c = bar ();
+  a = a  0 ? a : c;
+  __builtin_memset (b, 0, sizeof b);
+}
+  return a;
+}


Jakub

Re: [SH] PR 55303 - Add basic support for SH2A clip insns

2013-03-05 Thread Kaz Kojima

Oleg Endo oleg.e...@t-online.de wrote:
 This adds basic support for the SH2A clips and clipu instructions.
 Tested on rev 196406 with
 make -k check RUNTESTFLAGS=--target_board=sh-sim
 \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}
 
 and no new failures.
 
 OK for trunk or 4.9?

OK.

Regards,
kaz

Re: [PATCH] Avoid too complex debug insns during expansion (PR debug/56510)

2013-03-05 Thread Jakub Jelinek

On Tue, Mar 05, 2013 at 02:40:34PM -0700, Jeff Law wrote:
 So it's not that cselib (and possibly others) can't handle these
 complex RTL expressions, it's just unbearably slow.  Right?

They handle it, but with bad compile time complexity, so on some testcases
it might take years or centuries etc.
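
To illustrate the depth-limiting idea on something much smaller than RTL, here is a toy sketch.  It is not the committed code: `struct expr`, `limit_depth` and the `temps` array are hypothetical, the tree is a plain binary tree, and a "temporary" is just a leaf that refers to a separately stored subexpression, standing in for a DEBUG_EXPR bound by its own debug insn.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_DEPTH 4	/* same nesting limit as in the patch comment */
#define MAX_TEMPS 64	/* arbitrary bound for this toy example */

struct expr
{
  struct expr *op[2];	/* both NULL for a leaf */
  int temp;		/* >= 0 if this leaf refers to temps[temp] */
};

/* Subexpressions that were split out into their own temporaries.  */
static struct expr *temps[MAX_TEMPS];
static int n_temps;

static struct expr *
make_expr (struct expr *a, struct expr *b)
{
  struct expr *e = calloc (1, sizeof *e);
  assert (e != NULL);
  e->op[0] = a;
  e->op[1] = b;
  e->temp = -1;
  return e;
}

static int
is_leaf (const struct expr *e)
{
  return e->op[0] == NULL && e->op[1] == NULL;
}

/* Replace every non-leaf subexpression found at nesting level MAX_DEPTH
   with a reference to a new temporary, then limit the temporary too.  */
static void
limit_depth (struct expr **exp_p, int depth)
{
  struct expr *e = *exp_p;
  if (e == NULL || is_leaf (e))
    return;
  if (depth == MAX_DEPTH)
    {
      assert (n_temps < MAX_TEMPS);
      int id = n_temps++;
      temps[id] = e;
      *exp_p = make_expr (NULL, NULL);
      (*exp_p)->temp = id;
      limit_depth (&temps[id], 0);
      return;
    }
  for (int i = 0; i < 2; i++)
    limit_depth (&e->op[i], depth + 1);
}

/* Number of operator levels in E; a leaf has depth 0.  */
static int
expr_depth (const struct expr *e)
{
  if (e == NULL || is_leaf (e))
    return 0;
  int l = expr_depth (e->op[0]);
  int r = expr_depth (e->op[1]);
  return 1 + (l > r ? l : r);
}
```

For a left-leaning chain ten operators deep, this leaves a main expression of depth 4 plus two temporaries of depth 4 and 2, so no single expression stays arbitrarily deep.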

 Fine with the trivial formatting fix assuming your bootstrap/regtest is OK.

Thanks, bootstraps/regtests finished fine, I've added the two blank lines
and committed.

Jakub

Re: [SH] PR 55303 - Add basic support for SH2A clip insns

2013-03-05 Thread Oleg Endo

On Wed, 2013-03-06 at 07:37 +0900, Kaz Kojima wrote:
 Oleg Endo oleg.e...@t-online.de wrote:
  This adds basic support for the SH2A clips and clipu instructions.
  Tested on rev 196406 with
  make -k check RUNTESTFLAGS=--target_board=sh-sim
  \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}
  
  and no new failures.
  
  OK for trunk or 4.9?
 
 OK.

OK for 4.8 trunk or 4.9? :)

Cheers,
Oleg

[patch] Fix PR 55364: ICE in remove_addr_table_entry with -gsplit-dwarf

2013-03-05 Thread Cary Coutant

This patch fixes an ICE in remove_addr_table_entry, where we try to
remove the .debug_addr entries for an expression where they've already
been removed earlier in the loop.

-cary


2013-03-05   Sterling Augustine  saugust...@google.com
 Cary Coutant  ccout...@google.com

PR debug/55364
* gcc/dwarf2out.c (resolve_addr): Don't call
remove_loc_list_addr_table_entries a second time for the same
expression.


Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 196479)
+++ gcc/dwarf2out.c (working copy)
@@ -22691,8 +22691,6 @@ resolve_addr (dw_die_ref die)
             else
               {
                 loc->replaced = 1;
-                if (dwarf_split_debug_info)
-                  remove_loc_list_addr_table_entries (loc->expr);
                 loc->dw_loc_next = *start;
               }
           }

Re: [SH] PR 55303 - Add basic support for SH2A clip insns

2013-03-05 Thread Kaz Kojima

Oleg Endo oleg.e...@t-online.de wrote:
 OK for 4.8 trunk or 4.9? :)

Sorry, I've missed the trunk part.  OK for 4.9.

Regards,
kaz

[SH, committed] PR 56529 - Calls to __sdivsi3_i4i and __udivsi3_i4i are generated on SH2

2013-03-05 Thread Oleg Endo

Hi,

This is the patch that I posted in the PR and that was pre-approved by
Kaz, with some documentation bits added.

Tested with 'make info dvi pdf' and 'make all'.
Applied as revision 196484.
Will backport it to 4.7 branch.

Cheers,
Oleg

gcc/ChangeLog:

PR target/56529
* config/sh/sh.c (sh_option_override): Check for TARGET_DYNSHIFT
instead of TARGET_SH2 for call-table case.  Do not set 
sh_div_strategy to SH_DIV_CALL_TABLE for TARGET_SH2.
* config.gcc (sh_multilibs): Add m2 and m2a to sh*-*-linux* 
multilib list.
* doc/invoke.texi (SH options): Document mdiv= call-div1, 
call-fp, call-table options.

libgcc/ChangeLog:

PR target/56529
* config/sh/lib1funcs.S (udivsi3_i4i, sdivsi3_i4i): Add __SH2A__
to inclusion list.
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 196483)
+++ gcc/config/sh/sh.c	(working copy)
@@ -820,7 +820,7 @@
 		   || (TARGET_HARD_SH4 && TARGET_SH2E)
 		   || (TARGET_SHCOMPACT && TARGET_FPU_ANY)))
 	sh_div_strategy = SH_DIV_CALL_FP;
-  else if (! strcmp (sh_div_str, "call-table") && TARGET_SH2)
+  else if (! strcmp (sh_div_str, "call-table") && TARGET_DYNSHIFT)
 	sh_div_strategy = SH_DIV_CALL_TABLE;
   else
 	/* Pick one that makes most sense for the target in general.
@@ -840,8 +840,6 @@
 	  sh_div_strategy = SH_DIV_CALL_FP;
 	/* SH1 .. SH3 cores often go into small-footprint systems, so
 	   default to the smallest implementation available.  */
-	else if (TARGET_SH2)	/* ??? EXPERIMENTAL */
-	  sh_div_strategy = SH_DIV_CALL_TABLE;
 	else
 	  sh_div_strategy = SH_DIV_CALL_DIV1;
 }
Index: gcc/config.gcc
===
--- gcc/config.gcc	(revision 196483)
+++ gcc/config.gcc	(working copy)
@@ -2371,7 +2371,7 @@
 		sh[1234]*)	sh_multilibs=${sh_cpu_target} ;;
 		sh64* | sh5*)	sh_multilibs=m5-32media,m5-32media-nofpu,m5-compact,m5-compact-nofpu,m5-64media,m5-64media-nofpu ;;
 		sh-superh-*)	sh_multilibs=m4,m4-single,m4-single-only,m4-nofpu ;;
-		sh*-*-linux*)	sh_multilibs=m1,m3e,m4 ;;
+		sh*-*-linux*)	sh_multilibs=m1,m2,m2a,m3e,m4 ;;
 		sh*-*-netbsd*)	sh_multilibs=m3,m3e,m4 ;;
 		*) sh_multilibs=m1,m2,m2e,m4,m4-single,m4-single-only,m2a,m2a-single ;;
 		esac
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 196483)
+++ gcc/doc/invoke.texi	(working copy)
@@ -18749,8 +18749,8 @@
 
 @item -mdiv=@var{strategy}
 @opindex mdiv=@var{strategy}
-Set the division strategy to use for SHmedia code.  @var{strategy} must be
-one of: 
+Set the division strategy to be used for integer division operations.
+For SHmedia @var{strategy} can be one of: 
 
 @table @samp
 
@@ -18808,6 +18808,36 @@
 
 @end table
 
+For targets other than SHmedia @var{strategy} can be one of:
+
+@table @samp
+
+@item call-div1
+Calls a library function that uses the single-step division instruction
+@code{div1} to perform the operation.  Division by zero calculates an
+unspecified result and does not trap.  This is the default except for SH4,
+SH2A and SHcompact.
+
+@item call-fp
+Calls a library function that performs the operation in double precision
+floating point.  Division by zero causes a floating-point exception.  This is
+the default for SHcompact with FPU.  Specifying this for targets that do not
+have a double precision FPU will default to @code{call-div1}.
+
+@item call-table
+Calls a library function that uses a lookup table for small divisors and
+the @code{div1} instruction with case distinction for larger divisors.  Division
+by zero calculates an unspecified result and does not trap.  This is the default
+for SH4.  Specifying this for targets that do not have dynamic shift
+instructions will default to @code{call-div1}.
+
+@end table
+
+When a division strategy has not been specified the default strategy will be
+selected based on the current target.  For SH2A the default strategy is to
+use the @code{divs} and @code{divu} instructions instead of library function
+calls.
+
 @item -maccumulate-outgoing-args
 @opindex maccumulate-outgoing-args
 Reserve space once for outgoing arguments in the function prologue rather
Index: libgcc/config/sh/lib1funcs.S
===
--- libgcc/config/sh/lib1funcs.S	(revision 196483)
+++ libgcc/config/sh/lib1funcs.S	(working copy)
@@ -3288,8 +3288,8 @@
 	.word	17136
 	.word	16639
 
-#elif defined (__SH3__) || defined (__SH3E__) || defined (__SH4__) || defined (__SH4_SINGLE__) || defined (__SH4_SINGLE_ONLY__) || defined (__SH4_NOFPU__)
-/* This code used shld, thus is not suitable for SH1 / SH2.  */
+#elif defined (__SH2A__) || defined (__SH3__) || defined (__SH3E__) || defined (__SH4__) || defined (__SH4_SINGLE__) || defined (__SH4_SINGLE_ONLY__) || defined (__SH4_NOFPU__)
+/* This code uses shld, thus is not suitable for SH1 /

Re: ping - Re: Fix some texinfo 5.0 warnings in gcc/doc + libiberty

2013-03-05 Thread Joseph S. Myers

On Fri, 1 Mar 2013, Tobias Burnus wrote:

 Joseph S. Myers wrote:
  OK, though for the libiberty patch it would be good if someone can find
  the make-obstacks-texi.sh script referred to in libiberty.texi, check it
  in and get obstacks.texi exactly in sync with the output of that script
  run on current glibc sources.
 
 I couldn't find it, but I created a Perl version of the unknown script.
 
 Is the attached patch OK? (I tested it with make info html pdf using (only)
 texinfo-4.13a.)

OK, with 2013 used as copyright date instead of 2012.

-- 
Joseph S. Myers
jos...@codesourcery.com

RE: [PATCH] Fix PR50293 - LTO plugin with space in path

2013-03-05 Thread Joseph S. Myers

On Mon, 4 Mar 2013, Joey Ye wrote:

 +  char *new_spec = (char *)xmalloc (len + number_of_space + 1);

Space in cast between (char *) and xmalloc.  OK with that change.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Fix lots of uninitialized memory uses in sched_analyze_reg

2013-03-05 Thread Vladimir Makarov


On 13-03-04 4:17 PM, Jakub Jelinek wrote:

Hi!

Something that again hits lots of testcases during valgrind checking
bootstrap.  init_alias_analysis apparently does
   vec_safe_grow_cleared (reg_known_value, maxreg - FIRST_PSEUDO_REGISTER);
   reg_known_equiv_p = sbitmap_alloc (maxreg - FIRST_PSEUDO_REGISTER);
but doesn't bitmap_clear (reg_known_equiv_p), perhaps as an optimization?

Sorry, I don't know the current state of alias.c well enough to say anything
definite about this.  But I believe it should be cleared.

If set_reg_known_value is called (and not to the reg itself),
set_reg_known_equiv_p is called too though.
Right now get_reg_known_equiv_p is only called in one place, and we are only
interested in MEM_P known values there, so the following works fine.
Though perhaps if in the future we use the reg_known_equiv_p bitmap more,
we should bitmap_clear (reg_known_equiv_p) it instead.
Bootstrapped/regtested on x86_64-linux and i686-linux.

Ok for trunk (or do you prefer to slow down init_alias_analysis and just
clear the bitmap)?

I don't see any harm from your patch but I guess it should be fixed by
clearing reg_known_equiv_p.  I think you need Steven's opinion on this as
he is an author of the code.


2013-03-04  Jakub Jelinek  ja...@redhat.com

* sched-deps.c (sched_analyze_reg): Only call get_reg_known_equiv_p
if get_reg_known_value returned non-NULL.

--- gcc/sched-deps.c.jj 2013-03-04 12:21:09.0 +0100
+++ gcc/sched-deps.c2013-03-04 17:29:03.478944157 +0100
@@ -2351,10 +2351,10 @@ sched_analyze_reg (struct deps_desc *dep
/* Pseudos that are REG_EQUIV to something may be replaced
 by that during reloading.  We need only add dependencies for
the address in the REG_EQUIV note.  */
-  if (!reload_completed && get_reg_known_equiv_p (regno))
+  if (!reload_completed)
{
  rtx t = get_reg_known_value (regno);
- if (MEM_P (t))
+ if (t && MEM_P (t) && get_reg_known_equiv_p (regno))
sched_analyze_2 (deps, XEXP (t, 0), insn);
}

Re: [PATCH] Fix lots of uninitialized memory uses in sched_analyze_reg

2013-03-05 Thread Jakub Jelinek

On Tue, Mar 05, 2013 at 11:58:09PM -0500, Vladimir Makarov wrote:
 I don't see any harm from your patch but I guess it should be fixed
 by clearing reg_known_equiv_p.  I think you need Steven's opinion on
 this as he is an author of the code.

Yeah, I've already committed the clearing of the sbitmap in alias.c
instead of this sched-deps.c patch, which doesn't make sense after the
alias.c change.

Jakub

[PATCH] Fix PR 55473

2013-03-05 Thread Shakthi Kannan

The libquadmath/quadmath.h header cannot currently be used from C++. The
following patch allows the quadmath.h header to be included and used
from C++ code.

2013-03-06 Shakthi Kannan shakthim...@gmail.com

PR libquadmath/55473
* quadmath.h: Add ifdef __cplusplus macros.

---
 libquadmath/quadmath.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/libquadmath/quadmath.h b/libquadmath/quadmath.h
index 863fe44..aa9ef51 100644
--- a/libquadmath/quadmath.h
+++ b/libquadmath/quadmath.h
@@ -23,6 +23,10 @@ Boston, MA 02110-1301, USA.  */

 #include <stdlib.h>

+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /* Define the complex type corresponding to __float128
(_Complex __float128 is not allowed) */
 typedef _Complex float __attribute__((mode(TC))) __complex128;
@@ -189,4 +193,8 @@ __quadmath_nth (conjq (__complex128 __z))
   return __extension__ ~__z;
 }

+#ifdef __cplusplus
+}
+#endif
+
 #endif
-- 
1.7.11.7
