Re: __sdivsi3_i4i and __udivsi3_i4i called for sh2 variant.
Uses a lookup table for divisors in the range -128 .. +128. The code that you have enabled in lib1funcs.S will utilize dynamic shift instructions, which are not available on SH1 or SH2. Maybe your target HW is SH2A, which has dynamic shift instructions, and you haven't noticed a problem? Adding __SH2A__ instead of __SH2__ should be fine though. If I'm not mistaken, the __sdivsi3_i4i and __udivsi3_i4i division functions will be used by the compiler if the -mdiv=call-table option is used. The compiler should reject 'call-table' for SH targets that don't have dynamic shifts ... in sh.c there is a check... else if (! strcmp (sh_div_str, "call-table") && TARGET_SH2) sh_div_strategy = SH_DIV_CALL_TABLE; ... which is not quite complete. I will prepare a patch for this. Cheers, Oleg
Re: [PATCH] Fix vect_create_epilog_for_reduction memory leaks (PR middle-end/56461)
On Mon, 4 Mar 2013, Jakub Jelinek wrote: Hi! vect_create_epilog_for_reduction leaks memory both from the inner_phis vector not being released for double_reduc, and also for stmt_vec_info it creates (because those are added for stmts added into exit_bb, i.e. after loop, which destroy_loop_vec_info doesn't free). Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. 2013-03-04 Jakub Jelinek ja...@redhat.com PR middle-end/56461 * tree-vect-stmts.c (free_stmt_vec_info_vec): Call free_stmt_vec_info on any left-over stmt_vec_info in the vector. * tree-vect-loop.c (vect_create_epilog_for_reduction): Release inner_phis vector. --- gcc/tree-vect-stmts.c.jj 2013-03-04 11:07:33.0 +0100 +++ gcc/tree-vect-stmts.c 2013-03-04 12:14:16.111393716 +0100 @@ -5969,6 +5969,11 @@ init_stmt_vec_info_vec (void) void free_stmt_vec_info_vec (void) { + unsigned int i; + vec_void_p info; + FOR_EACH_VEC_ELT (stmt_vec_info_vec, i, info) +if (info != NULL) + free_stmt_vec_info (STMT_VINFO_STMT ((stmt_vec_info) info)); gcc_assert (stmt_vec_info_vec.exists ()); stmt_vec_info_vec.release (); } --- gcc/tree-vect-loop.c.jj 2013-03-04 11:01:48.0 +0100 +++ gcc/tree-vect-loop.c 2013-03-04 12:17:09.934351015 +0100 @@ -4487,8 +4487,9 @@ vect_finalize_reduction: } scalar_results.release (); + inner_phis.release (); new_phis.release (); -} +} /* Function vectorizable_reduction. Jakub -- Richard Biener rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [PATCH] Fix discover_iteration_bound_by_body_walk memory leaks (PR middle-end/56461)
On Mon, 4 Mar 2013, Jakub Jelinek wrote: Hi! This function was releasing only some vectors pushed into queues vector, not all, and wasn't releasing bounds vector. Fixed thusly. There is no need to use a typedef for the C++ish vec.h vectors, and the code can be tiny bit simplified. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. 2013-03-04 Jakub Jelinek ja...@redhat.com PR middle-end/56461 * tree-ssa-loop-niter.c (bb_queue): Remove typedef. (discover_iteration_bound_by_body_walk): Change queues to vec<vec<basic_block> > and queue to vec<basic_block>. Fix up spelling in comment. Call safe_push on queues[bound_index] directly. Release queues[queue_index] in every iteration unconditionally. Release bounds vector. --- gcc/tree-ssa-loop-niter.c.jj 2013-02-27 23:05:07.0 +0100 +++ gcc/tree-ssa-loop-niter.c 2013-03-04 14:57:37.380872029 +0100 @@ -3007,9 +3007,6 @@ bound_index (vec<double_int> bounds, dou gcc_unreachable (); } -/* Used to hold vector of queues of basic blocks bellow. */ -typedef vec<basic_block> bb_queue; - /* We recorded loop bounds only for statements dominating loop latch (and thus executed each loop iteration). If there are any bounds on statements not dominating the loop latch we can improve the estimate by walking the loop @@ -3022,8 +3019,8 @@ discover_iteration_bound_by_body_walk (s pointer_map_t *bb_bounds; struct nb_iter_bound *elt; vec<double_int> bounds = vNULL; - vec<bb_queue> queues = vNULL; - bb_queue queue = bb_queue(); + vec<vec<basic_block> > queues = vNULL; + vec<basic_block> queue = vNULL; ptrdiff_t queue_index; ptrdiff_t latch_index = 0; pointer_map_t *block_priority; @@ -3096,7 +3093,7 @@ discover_iteration_bound_by_body_walk (s present in the path and we look for path with largest smallest bound on it. 
- To avoid the need for fibonaci heap on double ints we simply compress + To avoid the need for fibonacci heap on double ints we simply compress double ints into indexes to BOUNDS array and then represent the queue as arrays of queues for every index. Index of BOUNDS.length() means that the execution of given BB has @@ -3162,16 +3159,11 @@ discover_iteration_bound_by_body_walk (s } if (insert) - { - bb_queue queue2 = queues[bound_index]; - queue2.safe_push (e->dest); - queues[bound_index] = queue2; - } + queues[bound_index].safe_push (e->dest); } } } - else - queues[queue_index].release (); + queues[queue_index].release (); } gcc_assert (latch_index >= 0); @@ -3187,6 +3179,7 @@ discover_iteration_bound_by_body_walk (s } queues.release (); + bounds.release (); pointer_map_destroy (bb_bounds); pointer_map_destroy (block_priority); } Jakub -- Richard Biener rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [PATCH] Fix vect_supported_load_permutation_p memory leak (PR middle-end/56461)
On Mon, 4 Mar 2013, Jakub Jelinek wrote: Hi! When returning true, load_index sbitmap is released, but not when returning false. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. 2013-03-04 Jakub Jelinek ja...@redhat.com PR middle-end/56461 * tree-vect-slp.c (vect_supported_load_permutation_p): Free load_index sbitmap even if some bit in it isn't set. --- gcc/tree-vect-slp.c.jj2013-02-28 22:19:57.0 +0100 +++ gcc/tree-vect-slp.c 2013-03-04 15:01:48.441490311 +0100 @@ -1429,7 +1429,10 @@ vect_supported_load_permutation_p (slp_i for (j = 0; j < group_size; j++) if (!bitmap_bit_p (load_index, j)) - return false; + { + sbitmap_free (load_index); + return false; + } sbitmap_free (load_index); Jakub -- Richard Biener rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [PATCH] Fix PR56478
On Fri, Mar 01, 2013 at 11:10:40AM +0100, Richard Biener wrote: Don't use NULL_TREE built_int_cst - doing so hints at that you want to use double_ints. Generally doing computation with trees is expensive. You want to avoid that at all cost. Use double-ints (yeah, you have to use the clunky divmod_with_overflow interface). So this is a WIP patch, which uses double_ints. I apologize for my dumbness, but I haven't figured out how to do the normalization here. It's probably something simple, but... The point of the normalization would be that when multiplying the normalized number with 10000 (aka REG_BR_PROB_BASE), the result fits into plain int, right? If you could suggest what to do with that, that would be appreciated. Thanks. --- a/gcc/predict.c +++ b/gcc/predict.c @@ -1028,13 +1028,13 @@ static bool is_comparison_with_loop_invariant_p (gimple stmt, struct loop *loop, tree *loop_invariant, enum tree_code *compare_code, -int *loop_step, +tree *loop_step, tree *loop_iv_base) { tree op0, op1, bound, base; affine_iv iv0, iv1; enum tree_code code; - int step; + tree step; code = gimple_cond_code (stmt); *loop_invariant = NULL; @@ -1077,7 +1077,7 @@ is_comparison_with_loop_invariant_p (gimple stmt, struct loop *loop, bound = iv0.base; base = iv1.base; if (host_integerp (iv1.step, 0)) - step = tree_low_cst (iv1.step, 0); + step = iv1.step; else return false; } @@ -1086,7 +1086,7 @@ is_comparison_with_loop_invariant_p (gimple stmt, struct loop *loop, bound = iv1.base; base = iv0.base; if (host_integerp (iv0.step, 0)) - step = tree_low_cst (iv0.step, 0); + step = iv0.step; else return false; } @@ -1154,6 +1154,16 @@ expr_coherent_p (tree t1, tree t2) return false; } +static double_int +normalize (double_int n) +{ + int msb = HOST_BITS_PER_WIDE_INT - clz_hwi (n.to_shwi ()); + if (msb > HOST_BITS_PER_INT - 16) +{} +// ??? 
n = n.rshift (?, ?, ?); + return n; +} + /* Predict branch probability of BB when BB contains a branch that compares an induction variable in LOOP with LOOP_IV_BASE_VAR to LOOP_BOUND_VAR. The loop exit is compared using LOOP_BOUND_CODE, with step of LOOP_BOUND_STEP. @@ -1178,7 +1188,7 @@ predict_iv_comparison (struct loop *loop, basic_block bb, gimple stmt; tree compare_var, compare_base; enum tree_code compare_code; - int compare_step; + tree compare_step; edge then_edge; edge_iterator ei; @@ -1224,34 +1234,68 @@ predict_iv_comparison (struct loop *loop, basic_block bb, host_integerp (compare_base, 0)) { int probability; - HOST_WIDE_INT compare_count; - HOST_WIDE_INT loop_bound = tree_low_cst (loop_bound_var, 0); - HOST_WIDE_INT compare_bound = tree_low_cst (compare_var, 0); - HOST_WIDE_INT base = tree_low_cst (compare_base, 0); - HOST_WIDE_INT loop_count = (loop_bound - base) / compare_step; - - if ((compare_step > 0) + bool of, overflow = false; + double_int mod, compare_count, tem, loop_count; + + double_int loop_bound = tree_to_double_int (loop_bound_var); + double_int compare_bound = tree_to_double_int (compare_var); + double_int base = tree_to_double_int (compare_base); + double_int compare_step = tree_to_double_int (compare_step); + + /* (loop_bound - base) / compare_step */ + tem = loop_bound.sub_with_overflow (base, &of); + overflow |= of; + loop_count = tem.divmod_with_overflow (compare_step, + 0, TRUNC_DIV_EXPR, + &mod, &of); + overflow |= of; + + if ((compare_step.scmp (double_int_zero) == 1) ^ (compare_code == LT_EXPR || compare_code == LE_EXPR)) - compare_count = (loop_bound - compare_bound) / compare_step; + { + /* (loop_bound - compare_bound) / compare_step */ + tem = loop_bound.sub_with_overflow (compare_bound, &of); + overflow |= of; + compare_count = tem.divmod_with_overflow (compare_step, +0, TRUNC_DIV_EXPR, +&mod, &of); + overflow |= of; + } else - compare_count = (compare_bound - base) / compare_step; +{ + /* (compare_bound - base) / compare_step 
*/ + tem = compare_bound.sub_with_overflow (base, &of); + overflow |= of; + compare_count = tem.divmod_with_overflow (compare_step, +0, TRUNC_DIV_EXPR, +&mod, &of); + overflow |= of; + } if (compare_code == LE_EXPR || compare_code ==
Re: [PATCH] Fix inlining of calls with NULL block (PR56515)
On Mon, 4 Mar 2013, Jan Hubicka wrote: On Mon, Mar 04, 2013 at 01:42:51PM +0100, Richard Biener wrote: When inlining call stmts with a NULL gimple_block we still remap all the callee blocks into a block tree copy but we'll end up not referencing it from anywhere. This causes verification failures because then we have nothing referring to the inline stmt blocks. Ugh, best would be to set proper block even for the artificially added calls (whether to look around for surrounding statement blocks or similar). As I added to the PR log, I do not really think we can LTO reliably libraries, like libgcov, where backend invents its own calls. For safe LTO of runtime bits (libgcc/libgcov/libc) we will need a mechanism to explicitly mark implementations of runtime calls and do not remove them from callgraph until it is clear that no new calls are invented. Indeed. Still the issue exists. Because clearing the block tree will have the nasty effect that nothing in the inlined routine will be debuggable. Attaching the block tree to DECL_INITIAL isn't very good either, because while the inline fn itself will be debuggable, the parent function's variables won't be accessible. But I guess your patch is fine as a hack to avoid ICEs, and we just should try harder and harder to avoid ever hitting that situation. Well, what would you recommend to do when adding instrumentation calls on random places of the callgraph? We do 1) adding the counter increments that are associated with edges in callgraph. 2) adding VPT calls that are associated with an instruction (like divide) 3) adding calls to prologue handling indirect call. I suppose 3) can be handled same way as prologue code and 2) can copy block from the associated instruction. But what should we do for 1? Note that copy block from surrounding code is equivalent to a NULL block (just inherit the currently active block). 
What's more interesting is whether we want debug information for the inlined code at all - after all the callees are all DECL_ARTIFICIAL and definitely middle-end introduced memcpy calls diving into inlined memcpy implementation would be odd - after all the source doesn't contain the memcpy call and so I think there is no way to step over the call (after all we don't want to report it!). Thus I think for inlined artificial calls we want to drop debug information from its inlined body. The patch does that for BLOCKs (and thus variables - which effectively means everything but line information). What we eventually want to have is a BLOCK marked as inlined outer scope of artificial function FOO. Not sure if dwarf can express that though. Then gdb could do the right thing and other debug consumers like systemtap could still see the implementation. I've now LTO profilebootstrapped and bootstrapped and tested the patch on x86_64-unknown-linux-gnu and plan to install it with a big fat comment added. Thanks, Richard.
Re: [patch] PR c++/55135
On Mon, Mar 4, 2013 at 8:13 PM, Steven Bosscher stevenb@gmail.com wrote: Hello, Bug c++/55135 is another one of those almost-insane large test cases that triggers some of the worst time complexity behavior GCC has to offer. The attached patch doesn't actually fix anything the bug poster complained about, but something I ran into myself while trying to compile the file at -O0. It's a regression from older GCC releases and a test case for which clang kicks our butts. What happens at -O0 for this test case, is that there are 179972 EH regions and all but 3 of them are removed in remove_unreachable_handlers, which calls remove_eh_handler one region at a time in a loop. Because the EH tree is almost flat (almost a linked list), and remove_eh_handler has to look up the dead region in the tree, this results in O(N_EH_regions^2) run time in pass_cleanup_eh. The solution I propose in the attached patch, is to remove all unreachable regions in a single walk over the EH tree. This makes remove_unreachable_handlers run in no worse than O(N_EH_regions) time. If there are only a few regions to be removed, then this is potentially slower than the existing algorithm, but there is already a complete function walk in remove_unreachable_handlers and in the non-O0 case the EH tree is usually relatively small even for large functions. In any case, I have measured compile time on some C++ and Java cases and there were no measurable compile time regressions at -O1+, and a few improvements at -O0. Bootstrapped & tested on x86_64-unknown-linux-gnu. OK for trunk? Ok. Thanks, Richard. Ciao! Steven gcc/ PR c++/55135 * except.h (remove_unreachable_eh_regions): New prototype. * except.c (remove_eh_handler_splicer): New function, split out of remove_eh_handler. (remove_eh_handler): Use remove_eh_handler_splicer. Add comment warning about running it on many EH regions one at a time. 
(remove_unreachable_eh_regions_worker): New function, walk the EH tree in depth-first order and remove non-marked regions. (remove_unreachable_eh_regions): New function. * tree-eh.c (mark_reachable_handlers): New function, split out from remove_unreachable_handlers. (remove_unreachable_handlers): Use mark_reachable_handlers and remove_unreachable_eh_regions. (remove_unreachable_handlers_no_lp): Use mark_reachable_handlers and remove_unreachable_eh_regions.
Re: [PATCH] Fix lots of uninitialized memory uses in sched_analyze_reg
On Mon, Mar 4, 2013 at 10:17 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! Something that again hits lots of testcases during valgrind checking bootstrap. init_alias_analysis apparently does vec_safe_grow_cleared (reg_known_value, maxreg - FIRST_PSEUDO_REGISTER); reg_known_equiv_p = sbitmap_alloc (maxreg - FIRST_PSEUDO_REGISTER); but doesn't bitmap_clear (reg_known_equiv_p), perhaps as an optimization? If set_reg_known_value is called (and not to the reg itself), set_reg_known_equiv_p is called too though. Right now get_reg_known_equiv_p is only called in one place, and we are only interested in MEM_P known values there, so the following works fine. Though perhaps if in the future we use the reg_known_equiv_p bitmap more, we should bitmap_clear (reg_known_equiv_p) it instead. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk (or do you prefer to slow down init_alias_analysis and just clear the bitmap)? Looks ok, also clear the sbitmap as per Steven's comment. Thanks, Richard. 2013-03-04 Jakub Jelinek ja...@redhat.com * sched-deps.c (sched_analyze_reg): Only call get_reg_known_equiv_p if get_reg_known_value returned non-NULL. --- gcc/sched-deps.c.jj 2013-03-04 12:21:09.0 +0100 +++ gcc/sched-deps.c2013-03-04 17:29:03.478944157 +0100 @@ -2351,10 +2351,10 @@ sched_analyze_reg (struct deps_desc *dep /* Pseudos that are REG_EQUIV to something may be replaced by that during reloading. We need only add dependencies for the address in the REG_EQUIV note. */ - if (!reload_completed && get_reg_known_equiv_p (regno)) + if (!reload_completed) { rtx t = get_reg_known_value (regno); - if (MEM_P (t)) + if (t && MEM_P (t) && get_reg_known_equiv_p (regno)) sched_analyze_2 (deps, XEXP (t, 0), insn); } Jakub
Re: [PATCH, ARM, RFC] Fix vect.exp failures for NEON in big-endian mode
On Tue, Mar 5, 2013 at 12:47 AM, Paul Brook p...@codesourcery.com wrote: I somehow missed the Appendix A: Support for Advanced SIMD Extensions in the AAPCS document (it's not in the TOC!). It looks like the builtin vector types are indeed defined to be stored in memory in vldm/vstm order -- I think that means we're back to square one. There's still the possibility of making gcc generic vector types different from the ABI specified types[1], but that feels like it's probably a really bad idea. Having a distinct set of types just for the vectorizer may be a more viable option. IIRC the type selection hooks are more flexible than when we first looked at this problem. Paul [1] e.g. int gcc __attribute__((vector_size(8))); v.s. int32x2_t eabi; I think int32x2_t should not be a GCC vector type (thus not have a vector mode). The ABI specified types should map to an integer mode of the right size instead. The vectorizer would then still use internal GCC vector types and modes and the backend needs to provide instruction patterns that do the right thing with the element ordering the vectorizer expects. How are the int32x2_t types used? I suppose they are arguments to the intrinsics. Which means that for _most_ operations element order does not matter, thus a plus32x2 (int32x2_t x, int32x2_t y) can simply use the equivalent of return (int32x2_t)((gcc_int32x2_t)x + (gcc_int32x2_t)y). In intrinsics where order matters you'd insert appropriate __builtin_shuffle()s. Oh, of course do the above only for big-endian mode ... The other way around, mapping intrinsics and ABI vectors to vector modes will have issues ... you'd have to guard all optab queries in the middle-end to fail for arm big-endian as they expect instruction patterns that deal with the GCC vector ordering. Thus: model the backend after GCCs expectations and fixup the rest by fixing the ABI types and intrinsics. Richard.
Re: [PATCH] Tiny make check-gcc parallelization improvement
On Tue, Mar 5, 2013 at 7:16 AM, Jakub Jelinek ja...@redhat.com wrote: Hi! This patch syncs the list of target exp files (a few have been added in the last few years). Also, in my testing, usually vect.exp, guality.exp, struct-layout-1.exp and i386.exp take quite a lot of time, so it is undesirable to have them in pairs anymore, so the patch allows running all 4 of them in parallel. This gained a minute in make -j48 -k check testing on my box. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. 2013-03-05 Jakub Jelinek ja...@redhat.com * Makefile.in (dg_target_exps): Add aarch64.exp, epiphany.exp and tic6x.exp. (check_gcc_parallelize): Run guality.exp as a separate job from vect.exp with unsorted.exp and $(dg_target_exps) separately from struct-layout-1.exp with stackalign.exp. --- gcc/Makefile.in.jj 2013-02-27 08:27:26.0 +0100 +++ gcc/Makefile.in 2013-03-04 13:11:48.002638910 +0100 @@ -494,10 +494,11 @@ xm_include_list=@xm_include_list@ xm_defines=@xm_defines@ lang_checks= lang_checks_parallelized= -dg_target_exps:=alpha.exp,arm.exp,avr.exp,bfin.exp,cris.exp,frv.exp -dg_target_exps:=$(dg_target_exps),i386.exp,ia64.exp,m68k.exp,microblaze.exp -dg_target_exps:=$(dg_target_exps),mips.exp,powerpc.exp,rx.exp,s390.exp,sh.exp -dg_target_exps:=$(dg_target_exps),sparc.exp,spu.exp,xstormy16.exp +dg_target_exps:=aarch64.exp,alpha.exp,arm.exp,avr.exp,bfin.exp,cris.exp +dg_target_exps:=$(dg_target_exps),epiphany.exp,frv.exp,i386.exp,ia64.exp +dg_target_exps:=$(dg_target_exps),m68k.exp,microblaze.exp,mips.exp,powerpc.exp +dg_target_exps:=$(dg_target_exps),rx.exp,s390.exp,sh.exp,sparc.exp,spu.exp +dg_target_exps:=$(dg_target_exps),tic6x.exp,xstormy16.exp # This lists a couple of test files that take most time during check-gcc. # When doing parallelized check-gcc, these can run in parallel with the # remaining tests. 
Each word in this variable stands for work for one @@ -517,8 +518,10 @@ check_gcc_parallelize=execute.exp=execut compile.exp=compile/\[9pP\]*,builtins.exp \ compile.exp=compile/\[013-8a-oq-zA-OQ-Z\]* \ dg-torture.exp,ieee.exp \ - vect.exp,guality.exp,unsorted.exp \ - struct-layout-1.exp,stackalign.exp,$(dg_target_exps) + vect.exp,unsorted.exp \ + guality.exp \ + struct-layout-1.exp,stackalign.exp \ + $(dg_target_exps) lang_opt_files=@lang_opt_files@ $(srcdir)/c-family/c.opt $(srcdir)/common.opt lang_specs_files=@lang_specs_files@ lang_tree_files=@lang_tree_files@ Jakub
[PATCH] Silence up a false positive warning in libiberty (PR middle-end/56526)
Hi! While wrapper_sect_offset is always initialized if (gnu_sections_found & SOMO_WRAPPING) != 0 and used only guarded with that same condition, as the PR says apparently we get a false positive maybe uninitialized warning for it still. I'd say it is a good programming style to just initialize such vars, especially in performance non-critical code. Ok for trunk? 2013-03-05 Jakub Jelinek ja...@redhat.com PR middle-end/56526 * simple-object-mach-o.c (simple_object_mach_o_segment): Initialize wrapper_sect_offset to avoid a warning. --- libiberty/simple-object-mach-o.c.jj 2013-01-07 14:14:46.0 +0100 +++ libiberty/simple-object-mach-o.c2013-03-05 11:46:19.574157009 +0100 @@ -432,7 +432,7 @@ simple_object_mach_o_segment (simple_obj size_t index_size; unsigned int n_wrapped_sects; size_t wrapper_sect_size; - off_t wrapper_sect_offset; + off_t wrapper_sect_offset = 0; fetch_32 = (omr->is_big_endian ? simple_object_fetch_big_32 Jakub
[C++ testcase, committed] PR 56530
Hi, I added the testcase and closed the PR as fixed in 4.7.3 and mainline. Paolo. / 2013-03-05 Paolo Carlini paolo.carl...@oracle.com PR c++/56530 * g++.dg/warn/Wsign-conversion-2.C: New. Index: g++.dg/warn/Wsign-conversion-2.C === --- g++.dg/warn/Wsign-conversion-2.C(revision 0) +++ g++.dg/warn/Wsign-conversion-2.C(working copy) @@ -0,0 +1,11 @@ +// PR c++/56530 +// { dg-options "-Wsign-conversion" } + +struct string +{ + string () {}; + ~string () {}; +}; + +string foo[1]; // okay +string bar[1][1]; // gives bogus warning
[PATCH] Fix PR56525
This should fix PR56525, we reference ggc_freed loop structures from bb->loop_father when fix_loop_structure removes a loop and then calls flow_loops_find. Fixed by delaying the ggc_free part of loop removal until after that (I thought about other ways to fix the reference but they are way more intrusive). Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2013-03-05 Richard Biener rguent...@suse.de PR middle-end/56525 * loop-init.c (fix_loop_structure): Remove loops in two stages, not freeing them until the end. Index: gcc/loop-init.c === *** gcc/loop-init.c (revision 196451) --- gcc/loop-init.c (working copy) *** fix_loop_structure (bitmap changed_bbs) *** 186,192 int record_exits = 0; loop_iterator li; struct loop *loop; ! unsigned old_nloops; timevar_push (TV_LOOP_INIT); --- 186,192 int record_exits = 0; loop_iterator li; struct loop *loop; ! unsigned old_nloops, i; timevar_push (TV_LOOP_INIT); *** fix_loop_structure (bitmap changed_bbs) *** 230,237 flow_loop_tree_node_add (loop_outer (loop), ploop); } ! /* Remove the loop and free its data. */ ! delete_loop (loop); } /* Remember the number of loops so we can return how many new loops --- 230,238 flow_loop_tree_node_add (loop_outer (loop), ploop); } ! /* Remove the loop. */ ! loop->header = NULL; ! flow_loop_tree_node_remove (loop); } *** fix_loop_structure (bitmap changed_bbs) *** 253,258 --- 254,267 } } + /* Finally free deleted loops. */ + FOR_EACH_VEC_ELT (*get_loops (), i, loop) + if (loop && loop->header == NULL) + { + (*get_loops ())[i] = NULL; + flow_loop_free (loop); + } + loops_state_clear (LOOPS_NEED_FIXUP); /* Apply flags to loops. */
[PATCH] Fix PR56521
VN now inserts all sorts of calls into the references hashtable, not only those which produce a value. This results in missing initializations of ->value_id which eventually PRE ends up accessing. The following fixes that. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. Richard. 2013-03-05 Richard Biener rguent...@suse.de * tree-ssa-sccvn.c (set_value_id_for_result): For a NULL result set a new value-id. Index: gcc/tree-ssa-sccvn.c === --- gcc/tree-ssa-sccvn.c(revision 196451) +++ gcc/tree-ssa-sccvn.c(working copy) @@ -3954,7 +3962,7 @@ free_scc_vn (void) XDELETE (optimistic_info); } -/* Set *ID if we computed something useful in RESULT. */ +/* Set *ID according to RESULT. */ static void set_value_id_for_result (tree result, unsigned int *id) @@ -3966,6 +3974,8 @@ set_value_id_for_result (tree result, un else if (is_gimple_min_invariant (result)) *id = get_or_alloc_constant_value_id (result); } + else +*id = get_next_value_id (); } /* Set the value ids in the valid hash tables. */
Re: [PATCH] Fix PR56521
On Tue, Mar 05, 2013 at 12:51:09PM +0100, Richard Biener wrote: VN now inserts all sorts of calls into the references hashtable, not only those which produce a value. This results in missing initializations of ->value_id which eventually PRE ends up accessing. The following fixes that. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. Richard. 2013-03-05 Richard Biener rguent...@suse.de * tree-ssa-sccvn.c (set_value_id_for_result): For a NULL result set a new value-id. --- gcc/tree-ssa-sccvn.c (revision 196451) +++ gcc/tree-ssa-sccvn.c (working copy) @@ -3954,7 +3962,7 @@ free_scc_vn (void) XDELETE (optimistic_info); } -/* Set *ID if we computed something useful in RESULT. */ +/* Set *ID according to RESULT. */ static void set_value_id_for_result (tree result, unsigned int *id) @@ -3966,6 +3974,8 @@ set_value_id_for_result (tree result, un else if (is_gimple_min_invariant (result)) *id = get_or_alloc_constant_value_id (result); This still won't initialize *id if result is non-NULL, but isn't SSA_NAME nor is_gimple_min_invariant. Can't you do the same for that case too, just in case (perhaps we can't trigger that right now, but still it would make me feel safer about that). } + else +*id = get_next_value_id (); } /* Set the value ids in the valid hash tables. */ Otherwise looks good to me, thanks. Jakub
Re: [PATCH] Fix PR56521
On Tue, 5 Mar 2013, Jakub Jelinek wrote: On Tue, Mar 05, 2013 at 12:51:09PM +0100, Richard Biener wrote: VN now inserts all sorts of calls into the references hashtable, not only those which produce a value. This results in missing initializations of ->value_id which eventually PRE ends up accessing. The following fixes that. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. Richard. 2013-03-05 Richard Biener rguent...@suse.de * tree-ssa-sccvn.c (set_value_id_for_result): For a NULL result set a new value-id. --- gcc/tree-ssa-sccvn.c(revision 196451) +++ gcc/tree-ssa-sccvn.c(working copy) @@ -3954,7 +3962,7 @@ free_scc_vn (void) XDELETE (optimistic_info); } -/* Set *ID if we computed something useful in RESULT. */ +/* Set *ID according to RESULT. */ static void set_value_id_for_result (tree result, unsigned int *id) @@ -3966,6 +3974,8 @@ set_value_id_for_result (tree result, un else if (is_gimple_min_invariant (result)) *id = get_or_alloc_constant_value_id (result); This still won't initialize *id if result is non-NULL, but isn't SSA_NAME nor is_gimple_min_invariant. Can't you do the same for that case too, just in case (perhaps we can't trigger that right now, but still it would make me feel safer about that). Yeah. Can happen from aggregate stores I gues. } + else +*id = get_next_value_id (); } /* Set the value ids in the valid hash tables. */ Otherwise looks good to me, thanks. As follows. Richard. 2013-03-05 Richard Biener rguent...@suse.de * tree-ssa-sccvn.c (set_value_id_for_result): For a NULL result set a new value-id. Index: gcc/tree-ssa-sccvn.c === --- gcc/tree-ssa-sccvn.c(revision 196451) +++ gcc/tree-ssa-sccvn.c(working copy) @@ -3954,18 +3962,17 @@ free_scc_vn (void) XDELETE (optimistic_info); } -/* Set *ID if we computed something useful in RESULT. */ +/* Set *ID according to RESULT. 
*/ static void set_value_id_for_result (tree result, unsigned int *id) { - if (result) -{ - if (TREE_CODE (result) == SSA_NAME) - *id = VN_INFO (result)->value_id; - else if (is_gimple_min_invariant (result)) - *id = get_or_alloc_constant_value_id (result); -} + if (result && TREE_CODE (result) == SSA_NAME) +*id = VN_INFO (result)->value_id; + else if (result && is_gimple_min_invariant (result)) +*id = get_or_alloc_constant_value_id (result); + else +*id = get_next_value_id (); } /* Set the value ids in the valid hash tables. */
Re: [PATCH, combine] Fix host-specific behavior in simplify_compare_const()
In other words, any 32-bit target with 'need_64bit_hwint=yes' in config.gcc is not able to have benefit from this optimization because it never passes the condition test. My solution is to use GET_MODE_MASK(mode) to filter out all bits not in target mode. The following is my patch: The patch is OK for 4.9 once stage #1 is open if it passes bootstrap/regtest. gcc/ChangeLog: * gcc/combine.c: Use GET_MODE_MASK() to filter out unnecessary bits in simplify_compare_const(). This should be * combine.c (simplify_compare_const): Use GET_MODE_MASK to filter out unnecessary bits in the constant power of two case. diff --git a/gcc/combine.c b/gcc/combine.c index 67bd776..8c8cb92 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -10917,8 +10917,8 @@ simplify_compare_const (enum rtx_code code, rtx op0, rtx *pop1) && (code == EQ || code == NE || code == GE || code == GEU || code == LT || code == LTU) && mode_width <= HOST_BITS_PER_WIDE_INT - && exact_log2 (const_op) >= 0 - && nonzero_bits (op0, mode) == (unsigned HOST_WIDE_INT) const_op) + && exact_log2 (const_op & GET_MODE_MASK (mode)) >= 0 + && nonzero_bits (op0, mode) == (unsigned HOST_WIDE_INT) (const_op & GET_MODE_MASK (mode))) { code = (code == EQ || code == GE || code == GEU ? NE : EQ); const_op = 0; The line is too long, write nonzero_bits (op0, mode) == (unsigned HOST_WIDE_INT) (const_op & GET_MODE_MASK (mode))) instead. -- Eric Botcazou
[Committed] S/390: Define DWARF2_ASM_LINE_DEBUG_INFO
Hi, the attached patch enables the debug line infos to be generated from the asm listing for s390. With the patch two testsuite failures disappear. FAIL: gcc.dg/debug/dwarf2/asm-line1.c scan-assembler is_stmt 1 FAIL: gnat.dg/return3.adb scan-assembler loc 1 6 Committed to mainline. Bye, -Andreas- 2013-03-05 Andreas Krebbel andreas.kreb...@de.ibm.com * config/s390/s390.h: Define DWARF2_ASM_LINE_DEBUG_INFO. --- gcc/config/s390/s390.h |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) Index: gcc/config/s390/s390.h === *** gcc/config/s390/s390.h.orig --- gcc/config/s390/s390.h *** extern const enum reg_class regclass_map *** 591,596 --- 591,599 /* Register save slot alignment. */ #define DWARF_CIE_DATA_ALIGNMENT (-UNITS_PER_LONG) + /* Let the assembler generate debug line info. */ + #define DWARF2_ASM_LINE_DEBUG_INFO 1 + /* Frame registers. */
Re: [PATCH] Fix PR56521
On Tue, Mar 05, 2013 at 12:57:41PM +0100, Richard Biener wrote: As follows. Richard. 2013-03-05 Richard Biener rguent...@suse.de * tree-ssa-sccvn.c (set_value_id_for_result): For a NULL result set a new value-id. Looks much better. You forgot to adjust the ChangeLog entry, and PR line is missing, if it passes bootstrap, please check it in. --- gcc/tree-ssa-sccvn.c (revision 196451) +++ gcc/tree-ssa-sccvn.c (working copy) @@ -3954,18 +3962,17 @@ free_scc_vn (void) XDELETE (optimistic_info); } -/* Set *ID if we computed something useful in RESULT. */ +/* Set *ID according to RESULT. */ static void set_value_id_for_result (tree result, unsigned int *id) { - if (result) -{ - if (TREE_CODE (result) == SSA_NAME) - *id = VN_INFO (result)->value_id; - else if (is_gimple_min_invariant (result)) - *id = get_or_alloc_constant_value_id (result); -} + if (result && TREE_CODE (result) == SSA_NAME) +*id = VN_INFO (result)->value_id; + else if (result && is_gimple_min_invariant (result)) +*id = get_or_alloc_constant_value_id (result); + else +*id = get_next_value_id (); } /* Set the value ids in the valid hash tables. */ Jakub
Re: [PATCH, ARM, RFC] Fix vect.exp failures for NEON in big-endian mode
On Tue, 5 Mar 2013 10:42:59 +0100 Richard Biener richard.guent...@gmail.com wrote: On Tue, Mar 5, 2013 at 12:47 AM, Paul Brook p...@codesourcery.com wrote: I somehow missed the Appendix A: Support for Advanced SIMD Extensions in the AAPCS document (it's not in the TOC!). It looks like the builtin vector types are indeed defined to be stored in memory in vldm/vstm order -- I think that means we're back to square one. There's still the possibility of making gcc generic vector types different from the ABI specified types[1], but that feels like it's probably a really bad idea. Having a distinct set of types just for the vectorizer may be a more viable option. IIRC the type selection hooks are more flexible than when we first looked at this problem. Paul [1] e.g. int gcc __attribute__((vector_size(8))); v.s. int32x2_t eabi; I think int32x2_t should not be a GCC vector type (thus not have a vector mode). The ABI specified types should map to an integer mode of the right size instead. The vectorizer would then still use internal GCC vector types and modes and the backend needs to provide instruction patterns that do the right thing with the element ordering the vectorizer expects. How are the int32x2_t types used? I suppose they are arguments to the intrinsics. Which means that for _most_ operations element order does not matter, thus a plus32x2 (int32x2_t x, int32x2_t y) can simply use the equivalent of return (int32x2_t)((gcc_int32x2_t)x + (gcc_int32x2_t)y). In intrinsics where order matters you'd insert appropriate __builtin_shuffle()s. Maybe there's no need to interpret the vector layout for any of the intrinsics -- just treat all inputs/outputs as opaque (there are intrinsics for getting/setting lanes -- IMO these shouldn't attempt to convert lane numbers at all, though they do at present). 
Several intrinsics are currently implemented using __builtin_shuffle, e.g.: __extension__ static __inline int8x8_t __attribute__ ((__always_inline__)) vrev64_s8 (int8x8_t __a) { return (int8x8_t) __builtin_shuffle (__a, (uint8x8_t) { 7, 6, 5, 4, 3, 2, 1, 0 }); } I'd imagine that if int8x8_t are not actual vector types, we could invent extra builtins to convert them to and from such types to be able to still do this kind of thing (in arm_neon.h, not necessarily for direct use by users), i.e.: typedef char gcc_int8x8_t __attribute__((vector_size(8))); int8x8_t vrev64_s8 (int8x8_t __a) { gcc_int8x8_t tmp = __builtin_neon2generic (__a); tmp = __builtin_shuffle (tmp, (gcc_int8x8_t) { 7, 6, 5, 4, ... }); return __builtin_generic2neon (tmp); } (On re-reading, that's basically the same as what you suggested, I think.) Oh, of course do the above only for big-endian mode ... The other way around, mapping intrinsics and ABI vectors to vector modes will have issues ... you'd have to guard all optab queries in the middle-end to fail for arm big-endian as they expect instruction patterns that deal with the GCC vector ordering. Thus: model the backend after GCCs expectations and fixup the rest by fixing the ABI types and intrinsics. I think this plan will work fine -- it has the added advantage (which looks like a disadvantage, but really isn't) that generic vector operations like: void foo (void) { int8x8_t x = { 0, 1, 2, 3, 4, 5, 6, 7 }; } will *not* work -- nor will e.g. subscripting ABI-defined vectors using []s. At the moment using these features can lead to surprising results. Unfortunately NEON's pretty complicated, and the ARM backend currently uses vector modes quite heavily implementing it, so just using integer modes for intrinsics is going to be tough. It might work to create a shadow set of vector modes for use only by the intrinsics (O*mode for opaque instead of V*mode, say), if the middle end won't barf at that. Thanks, Julian
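A portable scalar model of the lane reorder under discussion may help: the sketch below (`vrev64_u8_model` is a made-up name, not NEON or arm_neon.h code) reproduces on plain integers what a vrev64-style shuffle is supposed to do to the byte lanes of a 64-bit register, independent of how the types are represented in the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the 8 byte lanes of a 64-bit value -- the effect vrev64_s8
   has on a d-register -- modelled with plain integers and memcpy.  */
static uint64_t
vrev64_u8_model (uint64_t v)
{
  uint8_t in[8], out[8];
  memcpy (in, &v, sizeof in);
  for (int i = 0; i < 8; i++)
    out[i] = in[7 - i];      /* lane i takes lane 7-i, like the
                                { 7, 6, 5, 4, 3, 2, 1, 0 } shuffle */
  uint64_t r;
  memcpy (&r, out, sizeof r);
  return r;
}
```

Because the reversal is done on the in-memory byte array, the result is the same on little- and big-endian hosts, which is exactly the lane-numbering property the thread is trying to pin down for the real intrinsics.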
Re: [PATCH] Fix PR56525
On Tue, Mar 05, 2013 at 12:27:20PM +0100, Richard Biener wrote: This should fix PR56525, we reference ggc_freed loop structures from bb->loop_father when fix_loop_structure removes a loop and then calls flow_loops_find. Fixed by delaying the ggc_free part of loop removal until after that (I thought about other ways to fix the reference but they are way more intrusive). Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2013-03-05 Richard Biener rguent...@suse.de PR middle-end/56525 * loop-init.c (fix_loop_structure): Remove loops in two stages, not freeing them until the end. Looks good to me (when reporting the bug, I actually thought about deferring the removal for the duration of the fixup too). Jakub
Re: [patch][RFC] bitmaps as lists *or* trees
On Tue, Mar 5, 2013 at 1:00 PM, Steven Bosscher stevenb@gmail.com wrote: Hello, A recurring problem with GCC's sparse bitmap data structure is that it performs poorly for random access patterns. Such patterns result in linked-list walks, and can trigger behavior quadratic in the number of linked-list member elements in the set. The attached patch is a first stab at an idea I've had for a while: Implement a change of view for bitmaps, such that a bitmap can be either a linked list, or a binary tree. I've implemented this idea with top-down splay trees because splay tree nodes do not need meta-data on them (unlike e.g. color for RB-trees, rank for AVL trees, etc.) and top-down splay tree operations are very simple to implement (less than 200 lines of code). As far as I'm aware, this is the first attempt at allowing different views on bitmaps. The idea came from Andrew Macleod's tree-ssa-live implementation. The idea is to convert the bitmap to a tree view if the set represented by the bitmap is mostly used for membership testing, and not for iterations over the items (as e.g. for bitmap dataflow). A typical example of this is e.g. invalid_mode_changes, which just explodes for the test case of PR55135 at -O0. I haven't tested this patch at all, except making sure that it compiles. Just posting this for discussion, and for feedback on the idea. I know there have been many others before me who've tried different data structures for bitmaps, perhaps someone has already tried this before. Definitely a nice idea. Iteration should be easy to implement (without actually splaying for each visited bit), the bit operations can use the iteration as a building block as well then. 
Now, an instrumented bitmap to identify bitmaps that would benefit from the tree view would be nice ;) [points-to sets are never modified after being computed, but they are both random-tested and intersected] What I missed often as well is a reference counted shared bitmap implementation (we have various special case implementations). I wonder if that could even use shared sub-trees/lists of bitmap_elts. Richard. Ciao! Steven
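For reference, the top-down splay operation Steven mentions really is small. The sketch below (hypothetical names; integer keys stand in for bit indices, and the node layout is not the patch's actual `bitmap_element`) is the classic Sleator/Tarjan top-down splay plus insert and membership test:

```c
#include <stdlib.h>

/* Minimal top-down splay tree keyed by an integer (e.g. a bit index).  */
struct node { int key; struct node *l, *r; };

/* Top-down splay: bring the node closest to KEY to the root.  */
static struct node *
splay (struct node *t, int key)
{
  struct node N = { 0, NULL, NULL }, *l = &N, *r = &N;
  if (!t)
    return NULL;
  for (;;)
    {
      if (key < t->key)
        {
          if (t->l && key < t->l->key)            /* rotate right */
            { struct node *y = t->l; t->l = y->r; y->r = t; t = y; }
          if (!t->l)
            break;
          r->l = t; r = t; t = t->l;              /* link right */
        }
      else if (key > t->key)
        {
          if (t->r && key > t->r->key)            /* rotate left */
            { struct node *y = t->r; t->r = y->l; y->l = t; t = y; }
          if (!t->r)
            break;
          l->r = t; l = t; t = t->r;              /* link left */
        }
      else
        break;
    }
  l->r = t->l;                                    /* reassemble */
  r->l = t->r;
  t->l = N.r;
  t->r = N.l;
  return t;
}

/* Insert KEY (no-op if present); returns the new root.
   Nodes are heap-allocated and never freed -- fine for a sketch.  */
static struct node *
tree_insert (struct node *t, int key)
{
  t = splay (t, key);
  if (t && t->key == key)
    return t;
  struct node *n = malloc (sizeof *n);
  n->key = key;
  if (!t)
    n->l = n->r = NULL;
  else if (key < t->key)
    { n->l = t->l; n->r = t; t->l = NULL; }
  else
    { n->r = t->r; n->l = t; t->r = NULL; }
  return n;
}

/* Membership test; splays, so repeated queries for nearby keys are cheap.  */
static int
tree_member (struct node **t, int key)
{
  *t = splay (*t, key);
  return *t && (*t)->key == key;
}
```

Note that membership testing mutates the tree (the splay), which is exactly why repeated clustered queries are amortized cheap -- and why the list view remains preferable for pure iteration.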
[PATCH] Simplify -fwhole-program documentation
This removes all encouragement to use -fwhole-program with -flto from the documentation. As can be seen in PR56533 it can be most confusing ... instead advise to rely on a linker plugin. Ok? Thanks, Richard. 2013-03-05 Richard Biener rguent...@suse.de * doc/invoke.texi (fwhole-program): Discourage use in combination with -flto. Index: gcc/doc/invoke.texi === *** gcc/doc/invoke.texi (revision 196451) --- gcc/doc/invoke.texi (working copy) *** Enabled by default with @option{-funroll *** 8168,8182 Assume that the current compilation unit represents the whole program being compiled. All public functions and variables with the exception of @code{main} and those merged by attribute @code{externally_visible} become static functions ! and in effect are optimized more aggressively by interprocedural optimizers. If @command{gold} is used as the linker plugin, @code{externally_visible} attributes are automatically added to functions (not variable yet due to a current @command{gold} issue) that are accessed outside of LTO objects according to resolution file produced by @command{gold}. For other linkers that cannot generate resolution file, explicit @code{externally_visible} attributes are still necessary. ! While this option is equivalent to proper use of the @code{static} keyword for ! programs consisting of a single file, in combination with option ! @option{-flto} this flag can be used to ! compile many smaller scale programs since the functions and variables become ! local for the whole combined compilation unit, not for the single source file ! itself. ! This option implies @option{-fwhole-file} for Fortran programs. @item -flto[=@var{n}] @opindex flto --- 8168,8178 Assume that the current compilation unit represents the whole program being compiled. All public functions and variables with the exception of @code{main} and those merged by attribute @code{externally_visible} become static functions ! 
and in effect are optimized more aggressively by interprocedural optimizers. ! In combination with @code{-flto} using this option should not be used. ! Instead relying on a linker plugin should provide safer and more precise ! information. @item -flto[=@var{n}] @opindex flto
Patch ping
Hi! Thanks for all the recent reviews of memory leak plugging patches, there are 4 still unreviewed from last week though. - sched-deps leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html - LRA leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html - libcpp leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01341.html - PCH leak fix + --enable-checking=valgrind changes to allow --enable-checking=yes,valgrind bootstrap to succeed: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00044.html Jakub
Re: Patch ping
On Tue, 5 Mar 2013, Jakub Jelinek wrote: Hi! Thanks for all the recent reviews of memory leak plugging patches, there are 4 still unreviewed from last week though. - sched-deps leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html - LRA leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html - libcpp leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01341.html - PCH leak fix + --enable-checking=valgrind changes to allow --enable-checking=yes,valgrind bootstrap to succeed: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00044.html That looks awkward ... isn't there a simple valgrind_disable () / valgrind_enable () way of disabling checking around this code? Richard.
Re: Patch ping
On Tue, Mar 05, 2013 at 02:26:03PM +0100, Richard Biener wrote: On Tue, 5 Mar 2013, Jakub Jelinek wrote: Thanks for all the recent reviews of memory leak plugging patches, there are 4 still unreviewed from last week though. - sched-deps leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html - LRA leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html - libcpp leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01341.html - PCH leak fix + --enable-checking=valgrind changes to allow --enable-checking=yes,valgrind bootstrap to succeed: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00044.html That looks awkward ... isn't there a simple valgrind_disable () / valgrind_enable () way of disabling checking around this code? Unfortunately not. I went through all valgrind.h and memcheck.h client calls. If at least there was a VALGRIND_GET_VBITS variant that allowed getting all vbits (i.e. whether something is unaddressable vs. undefined vs. defined), rather than just "if any of the vbits are unaddressable, give up, otherwise return undefined vs. defined bits", it would simplify the code. I hope perhaps a future valgrind version could add that, so it would be just VALGRIND_GET_VBITS2, VALGRIND_MAKE_MEM_DEFINED before and VALGRIND_SET_VBITS2 at the end (restore previous state). I've at least added __builtin_expect, so the binary search code isn't in the hot path. It isn't that slow, during binary search I'm always testing just a single byte, and say if we don't have any single memory allocation > 4GB, it will be at most 37 valgrind client calls per object, usually a much smaller number than that. Jakub
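A toy model of the binary search Jakub describes (made-up names; `any_undefined` stands in for a coarse valgrind client request that only reports whether *some* byte in a range is undefined): locating the first undefined byte takes O(log n) queries rather than one per byte, which matches the "at most ~37 client calls" bound for allocations under 4GB.

```c
#include <assert.h>
#include <stddef.h>

/* Coarse query: does [lo, lo+len) contain any undefined byte?
   Stand-in for a valgrind client request; def[i] == 1 means defined.  */
static int
any_undefined (const unsigned char *def, size_t lo, size_t len)
{
  for (size_t i = lo; i < lo + len; i++)
    if (!def[i])
      return 1;
  return 0;
}

/* Binary-search for the first undefined byte in N bytes; assumes the
   caller already knows at least one exists.  */
static size_t
first_undefined (const unsigned char *def, size_t n)
{
  size_t lo = 0, len = n;
  while (len > 1)
    {
      size_t half = len / 2;
      if (any_undefined (def, lo, half))
        len = half;          /* it is in the lower half */
      else
        {
          lo += half;        /* the lower half is fully defined; skip it */
          len -= half;
        }
    }
  return lo;
}
```

For a 4GB range this is at most 32 range queries plus a few byte probes, in the same ballpark as the 37 client calls mentioned above.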
Re: Patch ping
On Tue, 5 Mar 2013, Jakub Jelinek wrote: On Tue, Mar 05, 2013 at 02:26:03PM +0100, Richard Biener wrote: On Tue, 5 Mar 2013, Jakub Jelinek wrote: Thanks for all the recent reviews of memory leak plugging patches, there are 4 still unreviewed from last week though. - sched-deps leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html - LRA leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html - libcpp leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01341.html - PCH leak fix + --enable-checking=valgrind changes to allow --enable-checking=yes,valgrind bootstrap to succeed: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00044.html That looks awkward ... isn't there a simple valgrind_disable () / valgrind_enable () way of disabling checking around this code? Unfortunately not. I went through all valgrind.h and memcheck.h client calls. If at least there was a VALGRIND_GET_VBITS variant that allowed getting all vbits (i.e. whether something is unaddressable vs. undefined vs. defined), rather than just "if any of the vbits are unaddressable, give up, otherwise return undefined vs. defined bits", it would simplify the code. I hope perhaps a future valgrind version could add that, so it would be just VALGRIND_GET_VBITS2, VALGRIND_MAKE_MEM_DEFINED before and VALGRIND_SET_VBITS2 at the end (restore previous state). I've at least added __builtin_expect, so the binary search code isn't in the hot path. It isn't that slow, during binary search I'm always testing just a single byte, and say if we don't have any single memory allocation > 4GB, it will be at most 37 valgrind client calls per object, usually a much smaller number than that. Alternatively using a suppressions file during bootstrap might be possible ... maybe also useful for general valgrind debugging use? Richard.
Re: [PATCH] Fix cp_parser_braced_list
OK. Jason
Re: [patch][RFC] bitmaps as lists *or* trees
Hi, On Tue, 5 Mar 2013, Richard Biener wrote: I haven't tested this patch at all, except making sure that it compiles. Just posting this for discussion, and for feedback on the idea. I know there have been many others before me who've tried different data structures for bitmaps, perhaps someone has already tried this before. Definitely a nice idea. Iteration should be easy to implement (without actually splaying for each visited bit), the bit operations can use the iteration as a building block as well then. Iteration isn't easy on trees without a pointer to the parent (i.e. enlarging each node); you need to remember variably sized context in the iterator (e.g. the current stack of nodes). I do like the idea of reusing the same internal data structure to implement the tree. And I'm wondering about the performance impact; I wouldn't be surprised either way (i.e. that it brings about a large improvement, or none at all). Most bitmap membership tests in GCC are surprisingly clustered, so that the bitmap's cache of the last accessed element can work its magic (not all of them, as the testcase shows of course :) ). Ciao, Michael.
Re: [patch][RFC] bitmaps as lists *or* trees
On Tue, Mar 5, 2013 at 1:32 PM, Richard Biener wrote: The attached patch is a first stab at an idea I've had for a while: Implement a change of view for bitmaps, such that a bitmap can be either a linked list, or a binary tree. ... Definitely a nice idea. Iteration should be easy to implement (without actually splaying for each visited bit), the bit operations can use the iteration as building block as well then. It is really easy, you only have to listify the splay tree such that the root is the element with the lowest index. AFAICT the iterators only look at the next member of each bitmap_element, and a list is also a valid splay tree. Now, an instrumented bitmap to identify bitmaps that would benefit from the tree view would be nice ;) [points-to sets are never modified after being computed, but they are both random-tested and intersected] I have no idea how to create that kind of instrumentation. What I missed often as well is a reference counted shared bitmap implementation (we have various special case implementations). I wonder if that could even use shared sub-trees/lists of bitmap_elts. And this idea, I don't even understand :-) reference counted shared bitmaps as in, the same bitmap element shared between different bitmaps? How would you link such elements together in a tree or a list? It could be done with array bitmaps, but those have other downsides (insert/delete is near impossible without a lot of mem-moving around). Ciao! Steven
RE: [Patch, microblaze]: Add support for swap instructions and reorder option
Hi Michael, -Original Message- From: Michael Eager [mailto:ea...@eagerm.com] Sent: Monday, 4 March 2013 3:37 am To: David Holsgrove Cc: Michael Eager; gcc-patches@gcc.gnu.org; John Williams; Edgar E. Iglesias (edgar.igles...@gmail.com); Vinod Kathail; Vidhumouli Hunsigida; Nagaraju Mekala; Tom Shui Subject: Re: [Patch, microblaze]: Add support for swap instructions and reorder option Committed revision 196415. Thanks for committing. Please submit a patch to update gcc/doc/invoke.texi with -mxl-reorder description. Please find patch attached to this mail which updates the MicroBlaze section of documentation to include -mxl-reorder. I also added -mbig-endian and -mlittle-endian as they were missed in previous patch. thanks again, David -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077 0001-Patch-microblaze-Update-gcc-doc-invoke.texi-for-Micr.patch Description: 0001-Patch-microblaze-Update-gcc-doc-invoke.texi-for-Micr.patch
Re: [PATCH] Silence up a false positive warning in libiberty (PR middle-end/56526)
On Tue, Mar 5, 2013 at 2:52 AM, Jakub Jelinek ja...@redhat.com wrote: 2013-03-05 Jakub Jelinek ja...@redhat.com PR middle-end/56526 * simple-object-mach-o.c (simple_object_mach_o_segment): Initialize wrapper_sect_offset to avoid a warning. This is OK. Thanks. Ian
RE: [Patch, microblaze]: Added fast_interrupt controller
Hi Michael, -Original Message- From: Michael Eager [mailto:ea...@eagerm.com] Sent: Wednesday, 27 February 2013 4:12 am To: David Holsgrove Cc: gcc-patches@gcc.gnu.org; Michael Eager (ea...@eagercon.com); John Williams; Edgar E. Iglesias (edgar.igles...@gmail.com); Vinod Kathail; Vidhumouli Hunsigida; Nagaraju Mekala; Tom Shui Subject: Re: [Patch, microblaze]: Added fast_interrupt controller On 02/10/2013 10:39 PM, David Holsgrove wrote: Added fast_interrupt controller Changelog 2013-02-11 Nagaraju Mekala nmek...@xilinx.com * config/microblaze/microblaze-protos.h: microblaze_is_fast_interrupt. * config/microblaze/microblaze.c (microblaze_attribute_table): Add microblaze_is_fast_interrupt. (microblaze_fast_interrupt_function_p): New function. (microblaze_is_fast_interrupt check): New function. (microblaze_must_save_register): Account for fast_interrupt. (save_restore_insns): Likewise. (compute_frame_size): Likewise. (microblaze_globalize_label): Add FAST_INTERRUPT_NAME. * config/microblaze/microblaze.h: Define FAST_INTERRUPT_NAME as fast_interrupt. * config/microblaze/microblaze.md (movsi_status): Can be fast_interrupt (return): Add microblaze_is_fast_interrupt. (return_internal): Likewise. +int +microblaze_is_fast_interrupt (void) +{ + return fast_interrupt; +} + if (fast_interrupt) +{ Use wrapper functions consistently. Either reference the flag everywhere or use the wrapper everywhere. I've repurposed the existing 'microblaze_is_interrupt_handler' wrapper, (which was only used in the machine description), to be 'microblaze_is_interrupt_variant' - true if the function's attribute is either interrupt_handler or fast_interrupt. + if (interrupt_handler || fast_interrupt) + if (microblaze_is_interrupt_handler () || microblaze_is_fast_interrupt()) There are many places in the patch where both interrupt_handler and fast_interrupt are tested. 
These can be eliminated by setting the interrupt_handler flag when you see fast_interrupt and checking for the correct registers to be saved in microblaze_must_save_register(). I've used this microblaze_is_interrupt_variant wrapper throughout, checking specifically for the interrupt_handler or fast_interrupt flag only where it was necessary to handle them differently. Please let me know if the patch attached is acceptable, or if you would prefer I refactor all the existing interrupt_handler functionality to accommodate the fast_interrupt. Updated Changelog; 2013-03-05 David Holsgrove david.holsgr...@xilinx.com * gcc/config/microblaze/microblaze-protos.h: Rename microblaze_is_interrupt_handler to microblaze_is_interrupt_variant. * gcc/config/microblaze/microblaze.c (microblaze_attribute_table): Add fast_interrupt. (microblaze_fast_interrupt_function_p): New function. (microblaze_is_interrupt_handler): Rename to microblaze_is_interrupt_variant and add fast_interrupt check. (microblaze_must_save_register): Use microblaze_is_interrupt_variant. (save_restore_insns): Likewise. (compute_frame_size): Likewise. (microblaze_function_prologue): Add FAST_INTERRUPT_NAME. (microblaze_globalize_label): Likewise. * gcc/config/microblaze/microblaze.h: Define FAST_INTERRUPT_NAME. * gcc/config/microblaze/microblaze.md: Use wrapper microblaze_is_interrupt_variant. thanks again for the reviews, David + if ((interrupt_handler && !prologue) ||( fast_interrupt && !prologue) ) + if ((interrupt_handler && prologue) || (fast_interrupt && prologue)) Refactor. Fix spacing around parens. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077 0002-Gcc-Added-fast_interrupt-controller.patch Description: 0002-Gcc-Added-fast_interrupt-controller.patch
Re: Patch ping
On 03/05/2013 08:12 AM, Jakub Jelinek wrote: Hi! Thanks for all the recent reviews of memory leak plugging patches, there are 4 still unreviewed from last week though. - sched-deps leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01197.html This patch is ok. Thanks for working on this, Jakub.
Re: Patch ping
On 03/05/2013 08:12 AM, Jakub Jelinek wrote: Hi! Thanks for all the recent reviews of memory leak plugging patches, there are 4 still unreviewed from last week though. LRA leak fix: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01239.html This patch is ok too.
[PATCH] Fix PR50494 in a different way
This fixes PR50494 by avoiding to increase alignment of decls that are in the constant pool by the vectorizer. Bootstrap regtest pending on powerpc64-linux-gnu, with the older fix reverted. Richard. 2013-03-05 Richard Biener rguent...@suse.de PR middle-end/50494 * tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Do not adjust alignment of DECL_IN_CONSTANT_POOL decls. Index: gcc/tree-vect-data-refs.c === --- gcc/tree-vect-data-refs.c (revision 196466) +++ gcc/tree-vect-data-refs.c (working copy) @@ -4829,9 +4829,12 @@ vect_can_force_dr_alignment_p (const_tre /* We cannot change alignment of common or external symbols as another translation unit may contain a definition with lower alignment. The rules of common symbol linking mean that the definition - will override the common symbol. */ + will override the common symbol. The same is true for constant + pool entries which may be shared and are not properly merged + by LTO. */ if (DECL_EXTERNAL (decl) - || DECL_COMMON (decl)) + || DECL_COMMON (decl) + || DECL_IN_CONSTANT_POOL (decl)) return false; if (TREE_ASM_WRITTEN (decl))
Re: [Patch, microblaze]: Add support for swap instructions and reorder option
On 03/05/2013 06:54 AM, David Holsgrove wrote: Hi Michal, -Original Message- From: Michael Eager [mailto:ea...@eagerm.com] Sent: Monday, 4 March 2013 3:37 am To: David Holsgrove Cc: Michael Eager; gcc-patches@gcc.gnu.org; John Williams; Edgar E. Iglesias (edgar.igles...@gmail.com); Vinod Kathail; Vidhumouli Hunsigida; Nagaraju Mekala; Tom Shui Subject: Re: [Patch, microblaze]: Add support for swap instructions and reorder option Committed revision 196415. Thanks for committing. Please submit a patch to update gcc/doc/invoke.texi with -mxl-reorder description. Please find patch attached to this mail which updates the MicroBlaze section of documentation to include -mxl-reorder. I also added -mbig-endian and -mlittle-endian as they were missed in previous patch. Thanks. Committed revision 196470. gcc/ChangeLog: 2013-03-05 David Holsgrove david.holsgr...@xilinx.com * doc/invoke.texi (MicroBlaze): Add -mbig-endian, -mlittle-endian, -mxl-reorder. Please remember to submit a ChangeLog with patches. -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [patch][RFC] bitmaps as lists *or* trees
On Tue, Mar 5, 2013 at 3:50 PM, Steven Bosscher stevenb@gmail.com wrote: On Tue, Mar 5, 2013 at 1:32 PM, Richard Biener wrote: The attached patch is a first stab at an idea I've had for a while: Implement a change of view for bitmaps, such that a bitmap can be either a linked list, or a binary tree. ... Definitely a nice idea. Iteration should be easy to implement (without actually splaying for each visited bit), the bit operations can use the iteration as building block as well then. It is really easy, you only have to listify the splay tree such that the root is the element with the lowest index. AFAICT the iterators only look at the next member of each bitmap_element, and a list is also a valid splay tree. You'd have a fat iterator object with a (sorted) array of bitmap elements to iterate over, similar to how loop iterators work. Now, an instrumented bitmap to identify bitmaps that would benefit from the tree view would be nice ;) [points-to sets are never modified after being computed, but they are both random-tested and intersected] I have no idea how to create that kind of instrumentation. What I missed often as well is a reference counted shared bitmap implementation (we have various special case implementations). I wonder if that could even use shared sub-trees/lists of bitmap_elts. And this idea, I don't even understand :-) reference counted shared bitmaps as in, the same bitmap element shared between different bitmaps? How would you link such elements together in a tree or a list? It could be done with array bitmaps, but those have other downsides (insert/delete is near impossible without a lot of mem-moving around). You can share leafs of trees (not of lists due to the back pointer), splaying of course destroys the shared properties ... At the moment shared bitmaps (where used) are simply using hashtables and bitmap_hash. The propagation parts of the points-to solver could benefit from copy-on-write shared bitmaps. Richard. Ciao! Steven
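The copy-on-write sharing Richard asks about can be sketched independently of the element representation: a refcount on the payload, an O(1) copy, and unsharing on the first write. All names below are hypothetical (`cow_bitmap`, `shared_bits`, etc.), and a toy fixed-size word array stands in for the linked/tree'd bitmap_elts; error handling is omitted (allocations are assumed to succeed).

```c
#include <stdlib.h>
#include <string.h>

#define NWORDS 4                      /* toy fixed-size payload: 256 bits */
#define WBITS (8 * sizeof (unsigned long))

struct shared_bits
{
  int refcount;
  unsigned long words[NWORDS];
};

typedef struct { struct shared_bits *b; } cow_bitmap;

static cow_bitmap
cow_create (void)
{
  cow_bitmap m = { calloc (1, sizeof (struct shared_bits)) };
  m.b->refcount = 1;
  return m;
}

/* O(1) copy: just bump the refcount and share the payload.  */
static cow_bitmap
cow_copy (cow_bitmap src)
{
  src.b->refcount++;
  return src;
}

/* Write access: unshare first if the payload has other owners.  */
static void
cow_set_bit (cow_bitmap *m, unsigned bit)
{
  if (m->b->refcount > 1)
    {
      struct shared_bits *n = malloc (sizeof *n);
      memcpy (n, m->b, sizeof *n);
      n->refcount = 1;
      m->b->refcount--;
      m->b = n;
    }
  m->b->words[bit / WBITS] |= 1ul << (bit % WBITS);
}

static int
cow_test_bit (cow_bitmap m, unsigned bit)
{
  return (m.b->words[bit / WBITS] >> (bit % WBITS)) & 1;
}
```

The same scheme extends to per-element (sub-tree) sharing, as suggested above, by putting the refcount on each element instead of on the whole payload -- at the cost of unsharing paths rather than one block.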
Re: [PATCH] Fix PR56344
On Fri, Mar 01, 2013 at 09:41:27AM +0100, Richard Biener wrote: On Wed, Feb 27, 2013 at 6:38 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Wed, 27 Feb 2013, Richard Biener wrote: Wouldn't it be better to simply pass this using the variable size handling code? Thus, initialize args_size.var for too large constant size instead? Would that be compatible with the ABI definition of how a large (constant size) argument should be passed? I'm not sure. Another alternative is to expand to __builtin_trap (), but that's probably not easy at this very point. Or simply fix the size calculation to not overflow (either don't count bits or use a double-int). I don't think double_int will help us here. We won't detect overflow, because we overflowed here (when lower_bound is an int): lower_bound = INTVAL (XEXP (XEXP (arg-stack_slot, 0), 1)); The value from INTVAL () fits when lower_bound is a double_int, but then: i = lower_bound; ... stack_usage_map[i] the size of stack_usage_map is stored in highest_outgoing_arg_in_use, which is an int, so we're limited by an int size here. Changing the type of highest_outgoing_arg_in_use from an int to a double_int isn't worth the trouble, IMHO. Maybe the original approach, only with sorry () instead of error () and e.g. HOST_BITS_PER_INT - 1 instead of 30 would be appropriate after all. Dunno. Marek
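The wrap-around Marek describes is easy to guard against explicitly: with `int` arithmetic an offset past INT_MAX silently wraps, while a checked addition lets the caller issue sorry () instead. The sketch below is a made-up helper, not the calls.c code, and it relies on GCC/clang's `__builtin_add_overflow` (GCC 5+):

```c
#include <limits.h>

/* Toy stand-in for the lower_bound / stack_usage_map computation:
   reject any offset that would not fit the int-sized map index.
   Returns 1 on success, 0 if the caller should sorry () / error out.  */
static int
checked_offset (long long base, long long size, long long *out)
{
  long long r;
  if (__builtin_add_overflow (base, size, &r) || r > INT_MAX)
    return 0;
  *out = r;
  return 1;
}
```

This mirrors the suggestion in the thread of a sorry () on sizes near HOST_BITS_PER_INT - 1 bits, without widening highest_outgoing_arg_in_use itself.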
[PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)
Hi! Without this patch, ifcvt extends the lifetime of the %eax hard register, which causes a reload/LRA ICE later on. The combiner and other passes try hard not to do that, even ifcvt has code for it if x is a hard register a few lines below it, but in this case the hard register is SET_SRC (set_b). With this patch we just use the pseudo (x) which has been initialized from the hard register before the conditional. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-03-05 Jakub Jelinek ja...@redhat.com PR rtl-optimization/56484 * ifcvt.c (noce_process_if_block): Before reload if else_bb is NULL, avoid extending lifetimes of hard registers in likely to be spilled or small register classes. --- gcc/ifcvt.c.jj 2013-01-11 09:02:48.0 +0100 +++ gcc/ifcvt.c 2013-03-05 12:36:19.217251997 +0100 @@ -2491,6 +2491,15 @@ noce_process_if_block (struct noce_if_in || ! noce_operand_ok (SET_SRC (set_b)) || reg_overlap_mentioned_p (x, SET_SRC (set_b)) || modified_between_p (SET_SRC (set_b), insn_b, jump) + /* Avoid extending the lifetime of hard registers on small +register class machines before reload. */ + || (!reload_completed + && REG_P (SET_SRC (set_b)) + && HARD_REGISTER_P (SET_SRC (set_b)) + && (targetm.class_likely_spilled_p + (REGNO_REG_CLASS (REGNO (SET_SRC (set_b)))) + || targetm.small_register_classes_for_mode_p + (GET_MODE (SET_SRC (set_b))))) /* Likewise with X. In particular this can happen when noce_get_condition looks farther back in the instruction stream than one might expect. */ --- gcc/testsuite/gcc.c-torture/compile/pr56484.c.jj2013-03-05 12:42:24.972220034 +0100 +++ gcc/testsuite/gcc.c-torture/compile/pr56484.c 2013-03-05 12:41:59.0 +0100 @@ -0,0 +1,17 @@ +/* PR rtl-optimization/56484 */ + +unsigned char b[4096]; +int bar (void); + +int +foo (void) +{ + int a = 0; + while (bar ()) +{ + int c = bar (); + a = a < 0 ? a : c; + __builtin_memset (b, 0, sizeof b); +} + return a; +} Jakub
[PATCH] Avoid too complex debug insns during expansion (PR debug/56510)
Hi! cselib (probably among others) isn't prepared to handle arbitrarily complex debug insns. The debug insns are usually created from debug stmts which shouldn't have unbound complexity, but with TER we can actually end up with arbitrarily large debug insns. This patch fixes that up during expansion, by splitting subexpressions of too large debug insn expressions into their own debug temporaries. So far bootstrapped/regtested on x86_64-linux and i686-linux without the first two hunks (it caused one failure on the latter because of invalid RTL sharing), I'm going to bootstrap/regtest it again, ok for trunk if it passes? 2013-03-05 Jakub Jelinek ja...@redhat.com PR debug/56510 * cfgexpand.c (expand_debug_parm_decl): Call copy_rtx on incoming. (avoid_complex_debug_insns): New function. (expand_debug_locations): Call it. * gcc.dg/pr56510.c: New test. --- gcc/cfgexpand.c.jj 2013-03-05 15:12:15.071565689 +0100 +++ gcc/cfgexpand.c 2013-03-05 17:21:55.683602432 +0100 @@ -2622,6 +2622,8 @@ expand_debug_parm_decl (tree decl) reg = gen_raw_REG (GET_MODE (reg), OUTGOING_REGNO (REGNO (reg))); incoming = replace_equiv_address_nv (incoming, reg); } + else + incoming = copy_rtx (incoming); } #endif @@ -2637,7 +2639,7 @@ expand_debug_parm_decl (tree decl) || (GET_CODE (XEXP (incoming, 0)) == PLUS && XEXP (XEXP (incoming, 0), 0) == virtual_incoming_args_rtx && CONST_INT_P (XEXP (XEXP (incoming, 0), 1))))) -return incoming; +return copy_rtx (incoming); return NULL_RTX; } @@ -3704,6 +3706,54 @@ expand_debug_source_expr (tree exp) return op0; } +/* Ensure INSN_VAR_LOCATION_LOC (insn) doesn't have unbound complexity. + Allow 4 levels of rtl nesting for most rtl codes, and if we see anything + deeper than that, create DEBUG_EXPRs and emit DEBUG_INSNs before INSN. 
   */
+
+static void
+avoid_complex_debug_insns (rtx insn, rtx *exp_p, int depth)
+{
+  rtx exp = *exp_p;
+
+  if (exp == NULL_RTX)
+    return;
+
+  if ((OBJECT_P (exp) && !MEM_P (exp)) || GET_CODE (exp) == CLOBBER)
+    return;
+
+  if (depth == 4)
+    {
+      /* Create DEBUG_EXPR (and DEBUG_EXPR_DECL).  */
+      rtx dval = make_debug_expr_from_rtl (exp);
+
+      /* Emit a debug bind insn before INSN.  */
+      rtx bind = gen_rtx_VAR_LOCATION (GET_MODE (exp),
+				       DEBUG_EXPR_TREE_DECL (dval), exp,
+				       VAR_INIT_STATUS_INITIALIZED);
+
+      emit_debug_insn_before (bind, insn);
+      *exp_p = dval;
+      return;
+    }
+
+  const char *format_ptr = GET_RTX_FORMAT (GET_CODE (exp));
+  int i, j;
+  for (i = 0; i < GET_RTX_LENGTH (GET_CODE (exp)); i++)
+    switch (*format_ptr++)
+      {
+      case 'e':
+	avoid_complex_debug_insns (insn, &XEXP (exp, i), depth + 1);
+	break;
+
+      case 'E':
+      case 'V':
+	for (j = 0; j < XVECLEN (exp, i); j++)
+	  avoid_complex_debug_insns (insn, &XVECEXP (exp, i, j), depth + 1);
+	break;
+
+      default:
+	break;
+      }
+}
+
 /* Expand the _LOCs in debug insns.  We run this after expanding all
    regular insns, so that any variables referenced in the function
    will have their DECL_RTLs set.
   */

@@ -3724,7 +3774,7 @@ expand_debug_locations (void)
       if (DEBUG_INSN_P (insn))
	{
	  tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
-	  rtx val;
+	  rtx val, prev_insn, insn2;
	  enum machine_mode mode;

	  if (value == NULL_TREE)
@@ -3753,6 +3803,9 @@ expand_debug_locations (void)
	    }

	  INSN_VAR_LOCATION_LOC (insn) = val;
+	  prev_insn = PREV_INSN (insn);
+	  for (insn2 = insn; insn2 != prev_insn; insn2 = PREV_INSN (insn2))
+	    avoid_complex_debug_insns (insn2, &INSN_VAR_LOCATION_LOC (insn2), 0);
	}

   flag_strict_aliasing = save_strict_alias;
--- gcc/testsuite/gcc.dg/pr56510.c.jj	2013-03-05 16:57:54.498939220 +0100
+++ gcc/testsuite/gcc.dg/pr56510.c	2013-03-05 16:57:54.499939214 +0100
@@ -0,0 +1,37 @@
+/* PR debug/56510 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g" } */
+
+struct S { unsigned long s1; void **s2[0]; };
+void **a, **b, **c, **d, **e, **f;
+
+static void **
+baz (long x, long y)
+{
+  void **s = f;
+  *f = (void **) (y << 8 | (x & 0xff));
+  f += y + 1;
+  return s;
+}
+
+void bar (void);
+
+void
+foo (void)
+{
+  void **g = b[4];
+  a = b[2];
+  b = b[1];
+  g[2] = e;
+  void **h
+    = ((void **) a)[1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][1][66];
+  void **i = ((struct S *) h)->s2[4];
+  d = baz (4, 3);
+  d[1] = b;
+  d[2] = a;
+  d[3] = bar;
+  b = d;
+  g[1] = i[2];
+  a = g;
+  ((void (*) (void)) (i[1])) ();
+}

	Jakub
[C++ Patch] PR 56534
Hi, this (and 55786, which is a dup) is an ICE-on-invalid regression in 4.7/4.8. The problem is that for such broken input, check_elaborated_type_specifier is called by cp_parser_elaborated_type_specifier with a DECL which has a null TREE_TYPE (a TEMPLATE_ID_EXPR, actually), and therefore immediately crashes on TREE_CODE (type) == TEMPLATE_TYPE_PARM. In comparison, the 4_6-branch, instead of calling check_elaborated_type_specifier, has cp_parser_elaborated_type_specifier simply doing type = TREE_TYPE (decl); thus it seems we can cure the regression in a straightforward and safe way by simply checking that TREE_TYPE (decl) is not null at the beginning of check_elaborated_type_specifier. In this way the error messages are also exactly the same as those produced by 4_6. Tested x86_64-linux.

Thanks, Paolo.

//

/cp
2013-03-05  Paolo Carlini  paolo.carl...@oracle.com

	PR c++/56534
	* decl.c (check_elaborated_type_specifier): Check for NULL_TREE
	as TREE_TYPE (decl).

/testsuite
2013-03-05  Paolo Carlini  paolo.carl...@oracle.com

	PR c++/56534
	* g++.dg/template/crash115.C: New.

Index: cp/decl.c
===
--- cp/decl.c	(revision 196465)
+++ cp/decl.c	(working copy)
@@ -11725,6 +11725,8 @@ check_elaborated_type_specifier (enum tag_types ta
     decl = TYPE_NAME (TREE_TYPE (decl));

   type = TREE_TYPE (decl);
+  if (!type)
+    return NULL_TREE;

   /* Check TEMPLATE_TYPE_PARM first because DECL_IMPLICIT_TYPEDEF_P
      is false for this case as well.  */
Index: testsuite/g++.dg/template/crash115.C
===
--- testsuite/g++.dg/template/crash115.C	(revision 0)
+++ testsuite/g++.dg/template/crash115.C	(working copy)
@@ -0,0 +1,3 @@
+// PR c++/56534
+
+template struct template rebind // { dg-error expected }
Re: [PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)
On 03/05/2013 09:26 AM, Jakub Jelinek wrote:
[...]
> 2013-03-05  Jakub Jelinek  ja...@redhat.com
>
> 	PR rtl-optimization/56484
> 	* ifcvt.c (noce_process_if_block): Before reload if else_bb
> 	is NULL, avoid extending lifetimes of hard registers in likely
> 	to be spilled or small register classes.

OK.

Jeff
Re: [PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)
> Without this patch, ifcvt extends the lifetime of the %eax hard register,
> which causes a reload/LRA ICE later on.  [...]
>
> 	PR rtl-optimization/56484
> 	* ifcvt.c (noce_process_if_block): Before reload if else_bb
> 	is NULL, avoid extending lifetimes of hard registers in likely
> 	to be spilled or small register classes.

ifcvt.c tests only small_register_classes_for_mode_p in the other places, so do you really need class_likely_spilled_p here?

--
Eric Botcazou
[patch sdbout]: Fix regression in sdbout.c
Hello, this patch fixes a regression in the gcc.dg/debug/tls-1.c testcase for -gcoffn.

ChangeLog

2013-03-05  Kai Tietz  kti...@redhat.com

	* sdbout.c (sdbout_one_type): Switch to current function's section
	supporting cold/hot.

Tested for x86_64-w64-mingw32. Ok to apply?

Index: sdbout.c
===
--- sdbout.c	(Revision 196451)
+++ sdbout.c	(Arbeitskopie)
@@ -1017,7 +1017,7 @@ sdbout_one_type (tree type)
	   && DECL_SECTION_NAME (current_function_decl) != NULL_TREE)
     ; /* Don't change section amid function.  */
   else
-    switch_to_section (text_section);
+    switch_to_section (current_function_section ());

   switch (TREE_CODE (type))
     {
Re: [patch sdbout]: Fix regression in sdbout.c
On 03/05/2013 09:31 AM, Kai Tietz wrote: 2013-03-05 Kai Tietz kti...@redhat.com * sdbout.c (sdbout_one_type): Switch to current function's section supporting cold/hot. Ok. r~
[PATCH] libgcc: Add DWARF info to aeabi_ldivmod and aeabi_uldivmod
Hi All, This patch fixes a minor annoyance that causes backtraces to disappear inside of aeabi_ldivmod and aeabi_uldivmod due to the lack of appropriate DWARF information. I fixed the problem by adding the necessary cfi_* macros in these functions. OK? 2013-03-05 Meador Inge mead...@codesourcery.com * config/arm/bpabi.S (aeabi_ldivmod): Add DWARF information for computing the location of the link register. (aeabi_uldivmod): Ditto. Index: libgcc/config/arm/bpabi.S === --- libgcc/config/arm/bpabi.S (revision 196470) +++ libgcc/config/arm/bpabi.S (working copy) @@ -123,6 +123,7 @@ ARM_FUNC_START aeabi_ulcmp #ifdef L_aeabi_ldivmod ARM_FUNC_START aeabi_ldivmod + cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod) test_div_by_zero signed sub sp, sp, #8 @@ -132,17 +133,20 @@ ARM_FUNC_START aeabi_ldivmod #else do_push {sp, lr} #endif +98:cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10 bl SYM(__gnu_ldivmod_helper) __PLT__ ldr lr, [sp, #4] add sp, sp, #8 do_pop {r2, r3} RET + cfi_end LSYM(Lend_aeabi_ldivmod) #endif /* L_aeabi_ldivmod */ #ifdef L_aeabi_uldivmod ARM_FUNC_START aeabi_uldivmod + cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod) test_div_by_zero unsigned sub sp, sp, #8 @@ -152,11 +156,13 @@ ARM_FUNC_START aeabi_uldivmod #else do_push {sp, lr} #endif +98:cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10 bl SYM(__gnu_uldivmod_helper) __PLT__ ldr lr, [sp, #4] add sp, sp, #8 do_pop {r2, r3} RET - + cfi_end LSYM(Lend_aeabi_uldivmod) + #endif /* L_aeabi_divmod */
[google gcc-4_7] change LIPO default module grouping algorithm (issue7490043)
Hi,

This patch changes the default LIPO module grouping algorithm from algorithm 0 (the eager propagation algorithm) to algorithm 1 (the inclusion_based priority algorithm). It also renames __gcov_lipo_strict_inclusion to __gcov_lipo_weak_inclusion, whose default is 0. Tested with Google internal benchmarks.

-Rong

2013-03-05  Rong Xu  x...@google.com

	* libgcc/dyn-ipa.c (__gcov_lipo_weak_inclusion): changed from
	__gcov_lipo_strict_inclusion.
	(init_dyn_call_graph): Ditto.
	(ps_add_auxiliary): Ditto.
	(modu_edge_add_auxiliary): Ditto.
	* gcc/tree-profile.c (tree_init_dyn_ipa_parameters): Ditto.
	* gcc/params.def (PARAM_LIPO_GROUPING_ALGORITHM): Changed default
	value from 0 to 1.

Index: libgcc/dyn-ipa.c
===
--- libgcc/dyn-ipa.c	(revision 196405)
+++ libgcc/dyn-ipa.c	(working copy)
@@ -157,7 +157,7 @@
 extern gcov_unsigned_t __gcov_lipo_dump_cgraph;
 extern gcov_unsigned_t __gcov_lipo_max_mem;
 extern gcov_unsigned_t __gcov_lipo_grouping_algorithm;
 extern gcov_unsigned_t __gcov_lipo_merge_modu_edges;
-extern gcov_unsigned_t __gcov_lipo_strict_inclusion;
+extern gcov_unsigned_t __gcov_lipo_weak_inclusion;

 #if defined(inhibit_libc)
 __gcov_build_callgraph (void) {}
@@ -195,7 +195,7 @@ enum GROUPING_ALGORITHM
 };
 static int flag_alg_mode;
 static int flag_modu_merge_edges;
-static int flag_strict_inclusion;
+static int flag_weak_inclusion;
 static gcov_unsigned_t mem_threshold;

 /* Returns 0 if no dump is enabled.
    Returns 1 if text form graph
@@ -387,7 +387,7 @@ init_dyn_call_graph (void)

   flag_alg_mode = __gcov_lipo_grouping_algorithm;
   flag_modu_merge_edges = __gcov_lipo_merge_modu_edges;
-  flag_strict_inclusion = __gcov_lipo_strict_inclusion;
+  flag_weak_inclusion = __gcov_lipo_weak_inclusion;
   mem_threshold = __gcov_lipo_max_mem * 1.25;

   gi_ptr = __gcov_list;
@@ -417,13 +417,13 @@ init_dyn_call_graph (void)
       if ((env_str = getenv ("GCOV_DYN_MERGE_EDGES")))
	flag_modu_merge_edges = atoi (env_str);

-      if ((env_str = getenv ("GCOV_DYN_STRICT_INCLUSION")))
-	flag_strict_inclusion = atoi (env_str);
+      if ((env_str = getenv ("GCOV_DYN_WEAK_INCLUSION")))
+	flag_weak_inclusion = atoi (env_str);

       if (do_dump)
	fprintf (stderr,
-		 "Using ALG=%d merge_edges=%d strict_inclusion=%d.\n",
-		 flag_alg_mode, flag_modu_merge_edges, flag_strict_inclusion);
+		 "Using ALG=%d merge_edges=%d weak_inclusion=%d.\n",
+		 flag_alg_mode, flag_modu_merge_edges, flag_weak_inclusion);
     }

   if (do_dump)
@@ -1809,7 +1809,7 @@ ps_add_auxiliary (const void *value,
   int not_safe_to_insert = *(int *) data3;
   gcov_unsigned_t new_ggc_size;

-  /* For strict incluesion, we know it's safe to insert.  */
+  /* For strict inclusion, we know it's safe to insert.  */
   if (!not_safe_to_insert)
     {
       modu_add_auxiliary (m_id, s_m_id, *(gcov_type*)data2);
@@ -1825,7 +1825,8 @@ ps_add_auxiliary (const void *value,
   return 1;
 }

-/* return 1 if insertion happened, otherwise 0. */
+/* Return 1 if insertion happened, otherwise 0.
   */
+
 static int
 modu_edge_add_auxiliary (struct modu_edge *edge)
 {
@@ -1871,7 +1872,7 @@ modu_edge_add_auxiliary (struct modu_edge *edge)
     {
       pointer_set_traverse (node_exported_to, ps_check_ggc_mem,
			     callee_m_id, &fail, 0);
-      if (fail && flag_strict_inclusion)
+      if (fail && !flag_weak_inclusion)
	return 0;
     }
Index: gcc/tree-profile.c
===
--- gcc/tree-profile.c	(revision 196471)
+++ gcc/tree-profile.c	(working copy)
@@ -389,10 +389,10 @@ tree_init_dyn_ipa_parameters (void)
       gcov_lipo_strict_inclusion = build_decl (
	  UNKNOWN_LOCATION, VAR_DECL,
-	  get_identifier ("__gcov_lipo_strict_inclusion"),
+	  get_identifier ("__gcov_lipo_weak_inclusion"),
	  get_gcov_unsigned_t ());
       init_comdat_decl (gcov_lipo_strict_inclusion,
-			PARAM_LIPO_STRICT_INCLUSION);
+			PARAM_LIPO_WEAK_INCLUSION);
     }
 }
Index: gcc/params.def
===
--- gcc/params.def	(revision 196471)
+++ gcc/params.def	(working copy)
@@ -1018,25 +1018,26 @@ DEFPARAM (PARAM_INLINE_DUMP_MODULE_ID,
    LIPO profile-gen.  */
 DEFPARAM (PARAM_LIPO_GROUPING_ALGORITHM,
	  "lipo-grouping-algorithm",
-	  "Default is 0 which is the eager propagation algorithm. "
-	  "If the value is 1, use the inclusion_based priority algorithm.",
-	  0, 0, 1)
+	  "Algorithm 0 uses the eager propagation algorithm. "
+	  "Algorithm 1 uses the inclusion_based priority algorithm. "
+	  "The default algorithm is 1.",
+	  1, 0, 1)

 /* In the inclusion_based_priority grouping algorithm, specify if we combine
Re: [google gcc-4_7] change LIPO default module grouping algorithm (issue7490043)
Looks good.

thanks,

David

On Tue, Mar 5, 2013 at 11:06 AM, Rong Xu x...@google.com wrote:
[patch quoted in full; see the original message above]
Re: [Patch, microblaze]: Added fast_interrupt controller
On 03/05/2013 07:09 AM, David Holsgrove wrote: Hi Michael, -Original Message- From: Michael Eager [mailto:ea...@eagerm.com] Sent: Wednesday, 27 February 2013 4:12 am To: David Holsgrove Cc: gcc-patches@gcc.gnu.org; Michael Eager (ea...@eagercon.com); John Williams; Edgar E. Iglesias (edgar.igles...@gmail.com); Vinod Kathail; Vidhumouli Hunsigida; Nagaraju Mekala; Tom Shui Subject: Re: [Patch, microblaze]: Added fast_interrupt controller On 02/10/2013 10:39 PM, David Holsgrove wrote: Added fast_interrupt controller Changelog 2013-02-11 Nagaraju Mekala nmek...@xilinx.com * config/microblaze/microblaze-protos.h: microblaze_is_fast_interrupt. * config/microblaze/microblaze.c (microblaze_attribute_table): Add microblaze_is_fast_interrupt. (microblaze_fast_interrupt_function_p): New function. (microblaze_is_fast_interrupt check): New function. (microblaze_must_save_register): Account for fast_interrupt. (save_restore_insns): Likewise. (compute_frame_size): Likewise. (microblaze_globalize_label): Add FAST_INTERRUPT_NAME. * config/microblaze/microblaze.h: Define FAST_INTERRUPT_NAME as fast_interrupt. * config/microblaze/microblaze.md (movsi_status): Can be fast_interrupt (return): Add microblaze_is_fast_interrupt. (return_internal): Likewise. +int +microblaze_is_fast_interrupt (void) +{ + return fast_interrupt; +} + if (fast_interrupt) +{ Use wrapper functions consistently. Either reference the flag everywhere or use the wrapper everywhere. I've repurposed the existing 'microblaze_is_interrupt_handler' wrapper, (which was only used in the machine description), to be 'microblaze_is_interrupt_variant' - true if the function's attribute is either interrupt_handler or fast_interrupt. + if (interrupt_handler || fast_interrupt) + if (microblaze_is_interrupt_handler () || microblaze_is_fast_interrupt()) There are many places in the patch where both interrupt_handler and fast_interrupt are tested. 
These can be eliminated by setting the interrupt_handler flag when you see fast_interrupt and checking for the correct registers to be saved in microblaze_must_save_register().

I've used this microblaze_is_interrupt_variant wrapper throughout, checking specifically for the interrupt_handler or fast_interrupt flag only where it was necessary to handle them differently. Please let me know if the patch attached is acceptable, or if you would prefer I refactor all the existing interrupt_handler functionality to accommodate the fast_interrupt.

Updated ChangeLog:

2013-03-05  David Holsgrove  david.holsgr...@xilinx.com

	* gcc/config/microblaze/microblaze-protos.h: Rename
	microblaze_is_interrupt_handler to microblaze_is_interrupt_variant.
	* gcc/config/microblaze/microblaze.c (microblaze_attribute_table):
	Add fast_interrupt.
	(microblaze_fast_interrupt_function_p): New function.
	(microblaze_is_interrupt_handler): Rename to
	microblaze_is_interrupt_variant and add fast_interrupt check.
	(microblaze_must_save_register): Use microblaze_is_interrupt_variant.
	(save_restore_insns): Likewise.
	(compute_frame_size): Likewise.
	(microblaze_function_prologue): Add FAST_INTERRUPT_NAME.
	(microblaze_globalize_label): Likewise.
	* gcc/config/microblaze/microblaze.h: Define FAST_INTERRUPT_NAME.
	* gcc/config/microblaze/microblaze.md: Use wrapper
	microblaze_is_interrupt_variant.

thanks again for the reviews,
David

>> +  if ((interrupt_handler && !prologue) ||( fast_interrupt && !prologue) )
>> +  if ((interrupt_handler && prologue) || (fast_interrupt && prologue))
>
> Refactor.  Fix spacing around parens.

Committed revision 196474.

--
Michael Eager	ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
[SH] PR 55303 - Add basic support for SH2A clip insns
Hi,

This adds basic support for the SH2A clips and clipu instructions. Tested on rev 196406 with

make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

and no new failures. OK for trunk or 4.9?

Cheers,
Oleg

gcc/ChangeLog:

	PR target/55303
	* config/sh/sh.c (sh_rtx_costs): Handle SMIN and SMAX cases.
	* config/sh/sh.md (*clips, uminsi3, *clipu, clipu_one): New insns
	and related expanders.
	* config/sh/iterators.md (SMIN_SMAX): New code iterator.
	* config/sh/predicates.md (arith_reg_or_0_or_1_operand,
	clips_min_const_int, clips_max_const_int, clipu_max_const_int):
	New predicates.

testsuite/ChangeLog:

	PR target/55303
	* gcc.target/sh/pr55303-1.c: New.
	* gcc.target/sh/pr55303-2.c: New.
	* gcc.target/sh/pr55303-3.c: New.

Index: gcc/testsuite/gcc.target/sh/pr55303-1.c
===
--- gcc/testsuite/gcc.target/sh/pr55303-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/pr55303-1.c	(revision 0)
@@ -0,0 +1,87 @@
+/* Verify that the SH2A clips and clipu instructions are generated as
+   expected.  */
+/* { dg-do compile { target sh*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-skip-if "" { "sh*-*-*" } { "*" } { "-m2a*" } } */
+/* { dg-final { scan-assembler-times "clips.b" 2 } } */
+/* { dg-final { scan-assembler-times "clips.w" 2 } } */
+/* { dg-final { scan-assembler-times "clipu.b" 2 } } */
+/* { dg-final { scan-assembler-times "clipu.w" 2 } } */
+
+static inline int
+min (int a, int b)
+{
+  return a < b ? a : b;
+}
+
+static inline int
+max (int a, int b)
+{
+  return a < b ? b : a;
+}
+
+int
+test_00 (int a)
+{
+  /* 1x clips.b */
+  return max (-128, min (127, a));
+}
+
+int
+test_01 (int a)
+{
+  /* 1x clips.b */
+  return min (127, max (-128, a));
+}
+
+int
+test_02 (int a)
+{
+  /* 1x clips.w */
+  return max (-32768, min (32767, a));
+}
+
+int
+test_03 (int a)
+{
+  /* 1x clips.w */
+  return min (32767, max (-32768, a));
+}
+
+unsigned int
+test_04 (unsigned int a)
+{
+  /* 1x clipu.b */
+  return a > 255 ?
255 : a;
+}
+
+unsigned int
+test_05 (unsigned int a)
+{
+  /* 1x clipu.b */
+  return a >= 255 ? 255 : a;
+}
+
+unsigned int
+test_06 (unsigned int a)
+{
+  /* 1x clipu.w */
+  return a > 65535 ? 65535 : a;
+}
+
+unsigned int
+test_07 (unsigned int a)
+{
+  /* 1x clipu.w */
+  return a >= 65535 ? 65535 : a;
+}
+
+void
+test_08 (unsigned short a, unsigned short b, unsigned int* r)
+{
+  /* Must not see a clip insn here -- it is not needed.  */
+  unsigned short x = a + b;
+  if (x > 65535)
+    x = 65535;
+  *r = x;
+}
Index: gcc/testsuite/gcc.target/sh/pr55303-3.c
===
--- gcc/testsuite/gcc.target/sh/pr55303-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/pr55303-3.c	(revision 0)
@@ -0,0 +1,15 @@
+/* Verify that the special case (umin (reg const_int 1)) results in the
+   expected instruction sequence on SH2A.  */
+/* { dg-do compile { target sh*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-skip-if "" { "sh*-*-*" } { "*" } { "-m2a*" } } */
+/* { dg-final { scan-assembler-times "tst" 1 } } */
+/* { dg-final { scan-assembler-times "movrt" 1 } } */
+
+unsigned int
+test_00 (unsigned int a)
+{
+  /* 1x tst
+     1x movrt */
+  return a > 1 ? 1 : a;
+}
Index: gcc/testsuite/gcc.target/sh/pr55303-2.c
===
--- gcc/testsuite/gcc.target/sh/pr55303-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/pr55303-2.c	(revision 0)
@@ -0,0 +1,35 @@
+/* Verify that for SH2A smax/smin -> cbranch conversion is done properly
+   if the clips insn is not used and the expected comparison insns are
+   generated.  */
+/* { dg-do compile { target sh*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-skip-if "" { "sh*-*-*" } { "*" } { "-m2a*" } } */
+/* { dg-final { scan-assembler-times "cmp/pl" 4 } } */
+
+int
+test_00 (int a)
+{
+  /* 1x cmp/pl */
+  return a >= 0 ? a : 0;
+}
+
+int
+test_01 (int a)
+{
+  /* 1x cmp/pl */
+  return a <= 0 ? a : 0;
+}
+
+int
+test_02 (int a)
+{
+  /* 1x cmp/pl */
+  return a > 1 ? 1 : a;
+}
+
+int
+test_03 (int a)
+{
+  /* 1x cmp/pl */
+  return a > 1 ?
a : 1;
+}
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 196091)
+++ gcc/config/sh/sh.c	(working copy)
@@ -3507,6 +3507,22 @@
       else
	return false;

+    case SMIN:
+    case SMAX:
+      /* This is most likely a clips.b or clips.w insn that is being made up
+	 by combine.  */
+      if (TARGET_SH2A
+	  && (GET_CODE (XEXP (x, 0)) == SMAX || GET_CODE (XEXP (x, 0)) == SMIN)
+	  && CONST_INT_P (XEXP (XEXP (x, 0), 1))
+	  && REG_P (XEXP (XEXP (x, 0), 0))
+	  && CONST_INT_P (XEXP (x, 1)))
+	{
+	  *total = COSTS_N_INSNS (1);
+	  return true;
+	}
+      else
+	return false;
+
     case CONST:
     case LABEL_REF:
     case SYMBOL_REF:
Index: gcc/config/sh/sh.md
Re: FW: [PATCH] [MIPS] microMIPS gcc support
Moore, Catherine catherine_mo...@mentor.com writes:
> -----Original Message-----
> From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
> Sent: Monday, March 04, 2013 3:54 PM
> To: Moore, Catherine
> Cc: gcc-patches@gcc.gnu.org; Rozycki, Maciej
> Subject: Re: FW: [PATCH] [MIPS] microMIPS gcc support
>
> Moore, Catherine catherine_mo...@mentor.com writes:
>> Hi Richard,
>
> - Predicates should always check the code though.  E.g.:
>
> (define_predicate "umips_addius5_imm"
>   (and (match_code "const_int")
>        (match_test "IN_RANGE (INTVAL (op), -8, 7)")))
>
> - In general, please try to make the names of the predicates as generic
>   as possible.  There's nothing really add-specific about the predicate
>   above.  Or microMIPS-specific either really: some of these predicates
>   are probably going to be useful for MIPS16 too.
>
>   The existing MIPS16 functions follow the convention:
>
>     n if negated (optional) + s or u for signed vs. unsigned + imm
>     + number of significant bits + _ + multiplication factor
>
>   or, er, b for +1...  It might be nice to have a similar convention for
>   microMIPS.  The choices there are a bit more exotic, so please feel free
>   to diverge from the MIPS16 one above; we can switch MIPS16 over once
>   the microMIPS one is settled.  In fact, a new convention that's compact
>   enough to be used in both predicate and constraint names would be great.
>   E.g. for the umips_addius5_imm predicate above, a name like Ys5 would be
>   easier to remember than Zo/Yo.
>
> How compact would you consider compact enough?  I would need to change
> the existing Y constraints as well.

Argh, sorry, I'd forgotten about that restriction. We have a few internal-only undocumented constraints that aren't used much, so we should be able to move them to the Y space instead. The patch below does this for T and U. Then we could use U for new, longer constraints.

> I think trying to invent some convention with fewer than four letters will
> be difficult and even with four, I doubt it could be uniformly followed.
> I think we could get descriptive with four, however.
> Let me know what you think.

Four sounds good. Here's one idea:

    U<type><factor><bits>

where type is:

    s for signed
    u for unsigned
    d for decremented unsigned (-1 ... N)
    i for incremented unsigned (1 ... N)

where factor is:

    b for byte (*1)
    h for halfwords (*2)
    w for words (*4)
    d for doublewords (*8) -- useful for 64-bit MIPS16 but probably
      not needed for 32-bit microMIPS

and where bits is the number of bits. type and factor could be replaced with an ad-hoc two-letter combination for special cases. E.g. Uas9 (add stack) for ADDISUP. Just a suggestion though. I'm not saying these names are totally intuitive or anything, but they should at least be better than arbitrary letters. Also, bits could be two digits if necessary, or we could just use hex digits.

We could have:

/* Return true if X fits within an unsigned field of BITS bits that is
   shifted left SHIFT bits before being used.  */

static inline bool
mips_unsigned_immediate_p (unsigned HOST_WIDE_INT x, int bits, int shift = 0)
{
  return (x & ((1 << shift) - 1)) == 0 && x < (1 << (shift + bits));
}

/* Return true if X fits within a signed field of BITS bits that is
   shifted left SHIFT bits before being used.  */

static inline bool
mips_signed_immediate_p (unsigned HOST_WIDE_INT x, int bits, int shift = 0)
{
  x += 1 << (bits + shift - 1);
  return mips_unsigned_immediate_p (x, bits, shift);
}

The 'd' and 'i' cases would pass a biased X to mips_unsigned_immediate_p.

I'll apply the patch below once 4.9 starts.

Thanks,
Richard

gcc/
	* config/mips/constraints.md (T): Rename to...
	(Yf): ...this.
	(U): Rename to...
	(Yd): ...this.
	* config/mips/mips.md (*movdi_64bit, *movdi_64bit_mips16)
	(*movmode_internal, *movmode_mips16): Update accordingly.
Index: gcc/config/mips/constraints.md
===
--- gcc/config/mips/constraints.md	2013-02-25 21:45:10.000000000 +0000
+++ gcc/config/mips/constraints.md	2013-03-05 08:22:36.687354771 +0000
@@ -170,22 +170,6 @@ (define_constraint "S"
   (and (match_operand 0 "call_insn_operand")
        (match_test "CONSTANT_P (op)")))

-(define_constraint "T"
-  "@internal
-   A constant @code{move_operand} that cannot be safely loaded into @code{$25}
-   using @code{la}."
-  (and (match_operand 0 "move_operand")
-       (match_test "CONSTANT_P (op)")
-       (match_test "mips_dangerous_for_la25_p (op)")))
-
-(define_constraint "U"
-  "@internal
-   A constant @code{move_operand} that can be safely loaded into @code{$25}
-   using @code{la}."
-  (and (match_operand 0 "move_operand")
-       (match_test "CONSTANT_P (op)")
-       (not (match_test "mips_dangerous_for_la25_p (op)"))))
-
 (define_memory_constraint "W"
   "@internal
    A memory address based on a member of @code{BASE_REG_CLASS}.  This is
@@ -220,6 +204,22 @@ (define_constraint "Yb"
   "@internal
Re: [Patch] Add microMIPS jraddiusp support
Moore, Catherine catherine_mo...@mentor.com writes:
> Index: config/mips/micromips.md
> ===
> --- config/mips/micromips.md	(revision 196341)
> +++ config/mips/micromips.md	(working copy)
> @@ -95,6 +95,19 @@
>     (set_attr "mode" "SI")
>     (set_attr "can_delay" "no")])
>
> +;; For JRADDIUSP.
> +(define_insn "jraddiusp"
> +  [(parallel [(return)
> +	      (use (reg:SI 31))
> +	      (set (reg:SI 29)
> +		   (plus:SI (reg:SI 29)
> +			    (match_operand 0 "const_int_operand")))])]

Since this is a generic pattern (not depending on UNSPECs, etc.), I think we should use a specific predicate instead of const_int_operand. From the suggestion in the thread about addition, this would be a uw5, i.e. uw5_operand.

> Index: config/mips/mips.c
> ===
> --- config/mips/mips.c	(revision 196341)
> +++ config/mips/mips.c	(working copy)
> @@ -11364,6 +11364,7 @@
>    const struct mips_frame_info *frame;
>    HOST_WIDE_INT step1, step2;
>    rtx base, adjust, insn;
> +  bool use_jraddiusp_p = false;
>
>    if (!sibcall_p && mips_can_use_return_insn ())
>      {
> @@ -11453,6 +11454,14 @@
>    mips_for_each_saved_gpr_and_fpr (frame->total_size - step2,
>				     mips_restore_reg);
>
> +  /* Check if we can use JRADDIUSP.  */
> +  use_jraddiusp_p = (TARGET_MICROMIPS
> +		     && !crtl->calls_eh_return
> +		     && !sibcall_p
> +		     && step2 > 0
> +		     && (step2 & 3) == 0
> +		     && step2 <= (31 << 2));
> +
>    if (cfun->machine->interrupt_handler_p)
>      {
>        HOST_WIDE_INT offset;
> @@ -11480,8 +11489,9 @@
>	  mips_emit_move (gen_rtx_REG (word_mode, K0_REG_NUM), mem);
>	  offset -= UNITS_PER_WORD;
>
> -	  /* If we don't use shadow register set, we need to update SP.  */
> -	  if (!cfun->machine->use_shadow_register_set_p)
> +	  /* If we don't use shadow register set or the microMIPS
> +	     JRADDIUSP insn, we need to update SP.  */
> +	  if (!cfun->machine->use_shadow_register_set_p && !use_jraddiusp_p)
>	    mips_deallocate_stack (stack_pointer_rtx, GEN_INT (step2), 0);
>	  else
>	    /* The choice of position is somewhat arbitrary in this case.  */

We shouldn't use JRADDIUSP in an interrupt handler, so I think it would be better to move the use_jraddiusp_p condition into the else branch and drop the hunk above.
> @@ -11492,11 +11502,14 @@
>			     gen_rtx_REG (SImode, K0_REG_NUM)));
>      }
>    else
> -    /* Deallocate the final bit of the frame.  */
> -    mips_deallocate_stack (stack_pointer_rtx, GEN_INT (step2), 0);
> +    /* Deallocate the final bit of the frame unless using the microMIPS
> +       JRADDIUSP insn.  */
> +    if (!use_jraddiusp_p)
> +      mips_deallocate_stack (stack_pointer_rtx, GEN_INT (step2), 0);
>  }
>
> -  gcc_assert (!mips_epilogue.cfa_restores);
> +  if (!use_jraddiusp_p)
> +    gcc_assert (!mips_epilogue.cfa_restores);

We still need to emit the CFA restores somewhere. Something like:

  else if (TARGET_MICROMIPS
	   && !crtl->calls_eh_return
	   && !sibcall_p
	   && step2 > 0
	   && mips_unsigned_immediate_p (step2, 5, 2))
    {
      /* We can deallocate the stack and jump to $31 using JRADDIUSP.
	 Emit the CFA restores immediately before the deallocation.  */
      use_jraddiusp_p = true;
      mips_epilogue_emit_cfa_restores ();
    }
  else
    /* Deallocate the final bit of the frame.  */
    mips_deallocate_stack (stack_pointer_rtx, GEN_INT (step2), 0);

where mips_unsigned_immediate_p comes from the other thread.

Thanks,
Richard
Re: [PATCH] Avoid too complex debug insns during expansion (PR debug/56510)
On 03/05/2013 09:30 AM, Jakub Jelinek wrote:

Hi! cselib (probably among others) isn't prepared to handle arbitrarily complex debug insns. The debug insns are usually created from debug stmts, which shouldn't have unbound complexity, but with TER we can actually end up with arbitrarily large debug insns. This patch fixes that up during expansion, by splitting subexpressions of too-large debug insn expressions into their own debug temporaries. So far bootstrapped/regtested on x86_64-linux and i686-linux without the first two hunks (it caused one failure on the latter because of invalid RTL sharing); I'm going to bootstrap/regtest it again, ok for trunk if it passes?

2013-03-05  Jakub Jelinek  <ja...@redhat.com>

	PR debug/56510
	* cfgexpand.c (expand_debug_parm_decl): Call copy_rtx on incoming.
	(avoid_complex_debug_insns): New function.
	(expand_debug_locations): Call it.

	* gcc.dg/pr56510.c: New test.

So it's not that cselib (and possibly others) can't handle these complex RTL expressions, it's just unbearably slow. Right?

 }
+/* Ensure INSN_VAR_LOCATION_LOC (insn) doesn't have unbound complexity.
+   Allow 4 levels of rtl nesting for most rtl codes, and if we see anything
+   deeper than that, create DEBUG_EXPRs and emit DEBUG_INSNs before INSN.  */

:-) Similar to a comment I made in someone else's patch, I don't like the magic number 4, but I don't think this is worth creating a PARAM for controlling its behaviour.

+
+static void
+avoid_complex_debug_insns (rtx insn, rtx *exp_p, int depth)
+{
+  rtx exp = *exp_p;
+  if (exp == NULL_RTX)
+    return;
+  if ((OBJECT_P (exp) && !MEM_P (exp)) || GET_CODE (exp) == CLOBBER)
+    return;

A blank line or two seems to be missing above.

Fine with the trivial formatting fix assuming your bootstrap/regtest is OK.

Jeff
Re: [PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)
On Tue, Mar 05, 2013 at 06:28:13PM +0100, Eric Botcazou wrote:

Without this patch, ifcvt extends the lifetime of the %eax hard register, which causes a reload/LRA ICE later on. The combiner and other passes try hard not to do that; even ifcvt has code for it, if x is a hard register, a few lines below, but in this case the hard register is SET_SRC (set_b). With this patch we just use the pseudo (x), which has been initialized from the hard register before the conditional. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-03-05  Jakub Jelinek  <ja...@redhat.com>

	PR rtl-optimization/56484
	* ifcvt.c (noce_process_if_block): Before reload, if else_bb
	is NULL, avoid extending lifetimes of hard registers in
	likely-spilled or small register classes.

ifcvt.c tests only small_register_classes_for_mode_p in the other places, so do you really need class_likely_spilled_p here?

I guess I don't. I've grepped for small_register_classes_for_mode_p and didn't see anything in i386, so I figured it would be using a default (which is false). But apparently it uses hook_bool_mode_true, so it is a superset of class_likely_spilled_p; guess I can leave that out.

	Jakub
[PATCH] Fix g++.dg/debug/dwarf2/thunk1.C on darwin
Darwin does PIC differently than ELF, so the scan-assembler-times check fails for g++.dg/debug/dwarf2/thunk1.C. The attached patch skips the scan-assembler for *-*-darwin*. Tested on x86_64-apple-darwin12. Okay for gcc trunk?

	Jack

gcc/testsuite/

2013-03-05  Jack Howarth  <howa...@bromo.med.uc.edu>

	PR debug/53363
	* g++.dg/debug/dwarf2/thunk1.C: Skip final scan on darwin.

Index: gcc/testsuite/g++.dg/debug/dwarf2/thunk1.C
===================================================================
--- gcc/testsuite/g++.dg/debug/dwarf2/thunk1.C (revision 196462)
+++ gcc/testsuite/g++.dg/debug/dwarf2/thunk1.C (working copy)
@@ -1,7 +1,7 @@
 // Test that we don't add the x86 PC thunk to .debug_ranges
 // { dg-do compile { target { { i?86-*-* x86_64-*-* } && ia32 } } }
 // { dg-options "-g -fpic -fno-dwarf2-cfi-asm" }
-// { dg-final { scan-assembler-times "LFB3" 5 } }
+// { dg-final { scan-assembler-times "LFB3" 5 { target { ! *-*-darwin* } } } }
 template <class T> void f (T t) { }
Re: [PATCH] Avoid extending lifetime of likely spilled hard regs in ifcvt before reload (PR rtl-optimization/56484)
On Tue, Mar 05, 2013 at 11:03:13PM +0100, Jakub Jelinek wrote:

ifcvt.c tests only small_register_classes_for_mode_p in the other places, so do you really need class_likely_spilled_p here?

I guess I don't. I've grepped for small_register_classes_for_mode_p and didn't see anything in i386, so I figured it would be using a default (which is false). But apparently it uses hook_bool_mode_true, so it is a superset of class_likely_spilled_p; guess I can leave that out.

Here is what I've actually committed (I've also removed the !reload_completed && test, because noce_process_if_block is only called for !reload_completed; the only caller asserts it).

2013-03-05  Jakub Jelinek  <ja...@redhat.com>

	PR rtl-optimization/56484
	* ifcvt.c (noce_process_if_block): If else_bb is NULL, avoid
	extending lifetimes of hard registers on small register class
	machines.

	* gcc.c-torture/compile/pr56484.c: New test.

--- gcc/ifcvt.c.jj	2013-03-05 15:12:15.284564443 +0100
+++ gcc/ifcvt.c	2013-03-05 23:11:25.751625601 +0100
@@ -2491,6 +2491,12 @@ noce_process_if_block (struct noce_if_in
	  || ! noce_operand_ok (SET_SRC (set_b))
	  || reg_overlap_mentioned_p (x, SET_SRC (set_b))
	  || modified_between_p (SET_SRC (set_b), insn_b, jump)
+	  /* Avoid extending the lifetime of hard registers on small
+	     register class machines.  */
+	  || (REG_P (SET_SRC (set_b))
+	      && HARD_REGISTER_P (SET_SRC (set_b))
+	      && targetm.small_register_classes_for_mode_p
+		   (GET_MODE (SET_SRC (set_b))))
	  /* Likewise with X.  In particular this can happen when
	     noce_get_condition looks farther back in the instruction
	     stream than one might expect.  */
--- gcc/testsuite/gcc.c-torture/compile/pr56484.c.jj	2013-03-05 16:57:50.416961638 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr56484.c	2013-03-05 16:57:50.417961672 +0100
@@ -0,0 +1,17 @@
+/* PR rtl-optimization/56484 */
+
+unsigned char b[4096];
+int bar (void);
+
+int
+foo (void)
+{
+  int a = 0;
+  while (bar ())
+    {
+      int c = bar ();
+      a = a < 0 ? a : c;
+      __builtin_memset (b, 0, sizeof b);
+    }
+  return a;
+}

	Jakub
Re: [SH] PR 55303 - Add basic support for SH2A clip insns
Oleg Endo oleg.e...@t-online.de wrote: This adds basic support for the SH2A clips and clipu instructions. Tested on rev 196406 with make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} and no new failures. OK for trunk or 4.9? OK. Regards, kaz
Re: [PATCH] Avoid too complex debug insns during expansion (PR debug/56510)
On Tue, Mar 05, 2013 at 02:40:34PM -0700, Jeff Law wrote: So it's not that cselib (and possibly others) can't handle these complex RTL expressions, it's just unbearably slow. Right? They handle it, but with bad compile time complexity, so on some testcases it might take years or centuries etc. Fine with the trivial formatting fix assuming your bootstrap/regtest is OK. Thanks, bootstraps/regtests finished fine, I've added the two blank lines and committed. Jakub
Re: [SH] PR 55303 - Add basic support for SH2A clip insns
On Wed, 2013-03-06 at 07:37 +0900, Kaz Kojima wrote: Oleg Endo oleg.e...@t-online.de wrote: This adds basic support for the SH2A clips and clipu instructions. Tested on rev 196406 with make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} and no new failures. OK for trunk or 4.9? OK. OK for 4.8 trunk or 4.9? :) Cheers, Oleg
[patch] Fix PR 55364: ICE in remove_addr_table_entry with -gsplit-dwarf
This patch fixes an ICE in remove_addr_table_entry, where we try to remove the .debug_addr entries for an expression whose entries have already been removed earlier in the loop.

-cary

2013-03-05  Sterling Augustine  <saugust...@google.com>
	    Cary Coutant  <ccout...@google.com>

	PR debug/55364
	* gcc/dwarf2out.c (resolve_addr): Don't call
	remove_loc_list_addr_table_entries a second time for the same
	expression.

Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c (revision 196479)
+++ gcc/dwarf2out.c (working copy)
@@ -22691,8 +22691,6 @@ resolve_addr (dw_die_ref die)
       else
	 {
	   loc->replaced = 1;
-	  if (dwarf_split_debug_info)
-	    remove_loc_list_addr_table_entries (loc->expr);
	   loc->dw_loc_next = *start;
	 }
     }
Re: [SH] PR 55303 - Add basic support for SH2A clip insns
Oleg Endo oleg.e...@t-online.de wrote: OK for 4.8 trunk or 4.9? :) Sorry, I've missed the trunk part. OK for 4.9. Regards, kaz
[SH, committed] PR 56529 - Calls to __sdivsi3_i4i and __udivsi3_i4i are generated on SH2
Hi, This is the patch that I posted in the PR and that was pre-approved by Kaz, with some documentation bits added. Tested with 'make info dvi pdf' and 'make all'. Applied as revision 196484. Will backport it to the 4.7 branch.

Cheers,
Oleg

gcc/ChangeLog:

	PR target/56529
	* config/sh/sh.c (sh_option_override): Check for TARGET_DYNSHIFT
	instead of TARGET_SH2 for call-table case.  Do not set
	sh_div_strategy to SH_DIV_CALL_TABLE for TARGET_SH2.
	* config.gcc (sh_multilibs): Add m2 and m2a to sh*-*-linux*
	multilib list.
	* doc/invoke.texi (SH options): Document mdiv= call-div1, call-fp,
	call-table options.

libgcc/ChangeLog:

	PR target/56529
	* config/sh/lib1funcs.S (udivsi3_i4i, sdivsi3_i4i): Add __SH2A__
	to inclusion list.

Index: gcc/config/sh/sh.c
===================================================================
--- gcc/config/sh/sh.c (revision 196483)
+++ gcc/config/sh/sh.c (working copy)
@@ -820,7 +820,7 @@
	   || (TARGET_HARD_SH4 && TARGET_SH2E)
	   || (TARGET_SHCOMPACT && TARGET_FPU_ANY)))
     sh_div_strategy = SH_DIV_CALL_FP;
-  else if (! strcmp (sh_div_str, "call-table") && TARGET_SH2)
+  else if (! strcmp (sh_div_str, "call-table") && TARGET_DYNSHIFT)
     sh_div_strategy = SH_DIV_CALL_TABLE;
   else
     /* Pick one that makes most sense for the target in general.
@@ -840,8 +840,6 @@
	 sh_div_strategy = SH_DIV_CALL_FP;
       /* SH1 .. SH3 cores often go into small-footprint systems, so
	  default to the smallest implementation available.  */
-      else if (TARGET_SH2)	/* ??? EXPERIMENTAL */
-	sh_div_strategy = SH_DIV_CALL_TABLE;
       else
	 sh_div_strategy = SH_DIV_CALL_DIV1;
     }

Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc (revision 196483)
+++ gcc/config.gcc (working copy)
@@ -2371,7 +2371,7 @@
	 sh[1234]*) sh_multilibs=${sh_cpu_target} ;;
	 sh64* | sh5*) sh_multilibs=m5-32media,m5-32media-nofpu,m5-compact,m5-compact-nofpu,m5-64media,m5-64media-nofpu ;;
	 sh-superh-*) sh_multilibs=m4,m4-single,m4-single-only,m4-nofpu ;;
-	sh*-*-linux*) sh_multilibs=m1,m3e,m4 ;;
+	sh*-*-linux*) sh_multilibs=m1,m2,m2a,m3e,m4 ;;
	 sh*-*-netbsd*) sh_multilibs=m3,m3e,m4 ;;
	 *) sh_multilibs=m1,m2,m2e,m4,m4-single,m4-single-only,m2a,m2a-single ;;
	 esac

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 196483)
+++ gcc/doc/invoke.texi (working copy)
@@ -18749,8 +18749,8 @@
 @item -mdiv=@var{strategy}
 @opindex mdiv=@var{strategy}
-Set the division strategy to use for SHmedia code.  @var{strategy} must be
-one of:
+Set the division strategy to be used for integer division operations.
+For SHmedia @var{strategy} can be one of:
 
 @table @samp
 
@@ -18808,6 +18808,36 @@
 
 @end table
 
+For targets other than SHmedia @var{strategy} can be one of:
+
+@table @samp
+
+@item call-div1
+Calls a library function that uses the single-step division instruction
+@code{div1} to perform the operation.  Division by zero calculates an
+unspecified result and does not trap.  This is the default except for SH4,
+SH2A and SHcompact.
+
+@item call-fp
+Calls a library function that performs the operation in double precision
+floating point.  Division by zero causes a floating-point exception.  This is
+the default for SHcompact with FPU.  Specifying this for targets that do not
+have a double precision FPU will default to @code{call-div1}.
+
+@item call-table
+Calls a library function that uses a lookup table for small divisors and
+the @code{div1} instruction with case distinction for larger divisors.
+Division by zero calculates an unspecified result and does not trap.  This
+is the default for SH4.  Specifying this for targets that do not have
+dynamic shift instructions will default to @code{call-div1}.
+
+@end table
+
+When a division strategy has not been specified the default strategy will be
+selected based on the current target.  For SH2A the default strategy is to
+use the @code{divs} and @code{divu} instructions instead of library function
+calls.
+
 @item -maccumulate-outgoing-args
 @opindex maccumulate-outgoing-args
 Reserve space once for outgoing arguments in the function prologue rather

Index: libgcc/config/sh/lib1funcs.S
===================================================================
--- libgcc/config/sh/lib1funcs.S (revision 196483)
+++ libgcc/config/sh/lib1funcs.S (working copy)
@@ -3288,8 +3288,8 @@
	 .word	17136
	 .word	16639
 
-#elif defined (__SH3__) || defined (__SH3E__) || defined (__SH4__) || defined (__SH4_SINGLE__) || defined (__SH4_SINGLE_ONLY__) || defined (__SH4_NOFPU__)
-/* This code used shld, thus is not suitable for SH1 / SH2.  */
+#elif defined (__SH2A__) || defined (__SH3__) || defined (__SH3E__) || defined (__SH4__) || defined (__SH4_SINGLE__) || defined (__SH4_SINGLE_ONLY__) || defined (__SH4_NOFPU__)
+/* This code uses shld, thus is not suitable for SH1 / SH2.  */
Re: *ping* - Re: Fix some texinfo 5.0 warnings in gcc/doc + libiberty
On Fri, 1 Mar 2013, Tobias Burnus wrote: Joseph S. Myers wrote: OK, though for the libiberty patch it would be good if someone can find the make-obstacks-texi.sh script referred to in libiberty.texi, check it in and get obstacks.texi exactly in sync with the output of that script run on current glibc sources. I couldn't find it, but I created a Perl version of the unknown script. Is the attached patch OK? (I tested it with make info html pdf using (only) texinfo-4.13a.) OK, with 2013 used as copyright date instead of 2012. -- Joseph S. Myers jos...@codesourcery.com
RE: [PATCH] Fix PR50293 - LTO plugin with space in path
On Mon, 4 Mar 2013, Joey Ye wrote: + char *new_spec = (char *)xmalloc (len + number_of_space + 1); Space in cast between (char *) and xmalloc. OK with that change. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Fix lots of uninitialized memory uses in sched_analyze_reg
On 13-03-04 4:17 PM, Jakub Jelinek wrote:

Hi! Something that again hits lots of testcases during valgrind checking bootstrap. init_alias_analysis apparently does

  vec_safe_grow_cleared (reg_known_value, maxreg - FIRST_PSEUDO_REGISTER);
  reg_known_equiv_p = sbitmap_alloc (maxreg - FIRST_PSEUDO_REGISTER);

but doesn't bitmap_clear (reg_known_equiv_p), perhaps as an optimization?

Sorry, I don't know the current state of alias.c well enough to say something definite about this. But I believe it should be cleared.

If set_reg_known_value is called (and not to the reg itself), set_reg_known_equiv_p is called too, though. Right now get_reg_known_equiv_p is only called in one place, and we are only interested in MEM_P known values there, so the following works fine. Though perhaps if in the future we use the reg_known_equiv_p bitmap more, we should bitmap_clear (reg_known_equiv_p) it instead. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk (or do you prefer to slow down init_alias_analysis and just clear the bitmap)?

I don't see any harm from your patch, but I guess it should be fixed by clearing reg_known_equiv_p. I think you need Steven's opinion on this as he is the author of the code.

2013-03-04  Jakub Jelinek  <ja...@redhat.com>

	* sched-deps.c (sched_analyze_reg): Only call get_reg_known_equiv_p
	if get_reg_known_value returned non-NULL.

--- gcc/sched-deps.c.jj	2013-03-04 12:21:09.0 +0100
+++ gcc/sched-deps.c	2013-03-04 17:29:03.478944157 +0100
@@ -2351,10 +2351,10 @@ sched_analyze_reg (struct deps_desc *dep
	   /* Pseudos that are REG_EQUIV to something may be replaced
	      by that during reloading.  We need only add dependencies for
	      the address in the REG_EQUIV note.  */
-	  if (!reload_completed && get_reg_known_equiv_p (regno))
+	  if (!reload_completed)
	     {
	       rtx t = get_reg_known_value (regno);
-	      if (MEM_P (t))
+	      if (t && MEM_P (t) && get_reg_known_equiv_p (regno))
		 sched_analyze_2 (deps, XEXP (t, 0), insn);
	     }
Re: [PATCH] Fix lots of uninitialized memory uses in sched_analyze_reg
On Tue, Mar 05, 2013 at 11:58:09PM -0500, Vladimir Makarov wrote: I don't see any harm from your patch but I guess it should be fixed by clearing reg_know_equiv_p. I think you need Steven's opinion on this as he is an author of the code. Yeah, I've already committed the clearing of the sbitmap in alias.c instead of this sched-deps.c patch, which doesn't make sense after the alias.c change. Jakub
[PATCH] Fix PR 55473
The libquadmath/quadmath.h file cannot be used with C++. The following patch allows inclusion and use of the quadmath.h header file.

2013-03-06  Shakthi Kannan  <shakthim...@gmail.com>

	PR libquadmath/55473
	* quadmath.h: Add ifdef __cplusplus macros.

---
 libquadmath/quadmath.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/libquadmath/quadmath.h b/libquadmath/quadmath.h
index 863fe44..aa9ef51 100644
--- a/libquadmath/quadmath.h
+++ b/libquadmath/quadmath.h
@@ -23,6 +23,10 @@ Boston, MA 02110-1301, USA.  */
 
 #include <stdlib.h>
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /* Define the complex type corresponding to __float128
    ("_Complex __float128" is not allowed) */
 typedef _Complex float __attribute__((mode(TC))) __complex128;
@@ -189,4 +193,8 @@ __quadmath_nth (conjq (__complex128 __z))
   return __extension__ ~__z;
 }
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif
-- 
1.7.11.7