Re: [PATCH] Fix CDDCE miscompilation (PR tree-optimization/55018)
On Tue, Oct 23, 2012 at 12:49:44AM +0200, Steven Bosscher wrote:
> On Mon, Oct 22, 2012 at 11:09 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> > Wouldn't it be way cheaper to just export dfs_find_deadend from
> > cfganal.c and call it in calc_dfs_tree on each unconnected bb?
> > I.e. (untested with the exception of the testcase):
>
> FWIW, dfs_find_deadend looks broken to me for this use case. It could
> return a self-loop block with more than one successor. For a pre-order
> search like dominance.c needs, you'd have to look as deep as possible,
> something like this:

I've bootstrapped overnight the patch I posted (without your
dfs_find_deadend change (next_bb is an unused var there, btw)), and there
is a new FAIL with it - ssa-dce-3.c (and your dfs_find_deadend change
doesn't change anything on it). Before cddce1 we have:

  <bb 2>:
    goto <bb 6>;

  <bb 3>:
    j_8 = j_3 + 501;
    goto <bb 5>;

  <bb 4>:
    j_9 = j_3 + 499;

  <bb 5>:
    # j_2 = PHI <j_8(3), j_9(4)>
    i_10 = i_1 + 2;

  <bb 6>:
    # i_1 = PHI <1(2), i_10(5)>
    # j_3 = PHI <0(2), j_2(5)>
    j_6 = j_3 + 500;
    _7 = j_6 % 7;
    if (_7 != 0)
      goto <bb 3>;
    else
      goto <bb 4>;

and before the dominance.c change bb6 has a fake edge to exit, bb3 and bb4
are immediately post-dominated by bb5 and bb5 is immediately post-dominated
by bb6, thus when mark_control_dependent_edges_necessary is called on the
latch (bb5) of the infinite loop, it marks the _7 != 0 statement as
necessary and the j_N and _7 assignments stay, as the 6->3 and 6->4 edges
are recorded as control parents for bb3, bb4 and bb5.

With the patch bb5 is instead the bb with the fake edge to exit, and bb6,
bb3 and bb4 are all immediately post-dominated by bb5 and edge 5->6 is the
control parent of bb5. So with the patch this is optimized into just:

  <bb 2>:

  <bb 3>:
    goto <bb 3>;

I guess it is fine that way and the testcase needs adjustment, just wanted
to point out the differences.
(And the (EDGE_COUNT (bb->succs) == 0) check is unnecessary for
inverted_post_order_compute, because it already puts all such blocks on the
initial work list :-)  And so does dominance.c:

  FOR_EACH_BB_REVERSE (b)
    {
      if (EDGE_COUNT (b->succs) > 0)
	{
	  if (di->dfs_order[b->index] == 0)
	    saw_unconnected = true;
	  continue;
	}
      bitmap_set_bit (di->fake_exit_edge, b->index);

	Jakub
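For readers following along, here is a bounded, source-level C sketch of the kind of loop the GIMPLE dump above describes (a reconstruction for illustration only; the actual ssa-dce-3.c testcase may differ). The real loop never terminates and `j` is never used, which is why, once the fake exit edge moves to the latch, cddce is free to delete all of the `j` arithmetic:

```c
#include <assert.h>

/* Bounded variant of the infinite loop in the dump above: each iteration
   either adds 501 (bb 3, when (j + 500) % 7 != 0) or 499 (bb 4).  The
   result j is returned here only so the branch behaviour is observable;
   in the real testcase j is dead, so the whole body is removable.  */
int
simulate_loop (int iterations)
{
  int j = 0;
  for (int i = 1, n = 0; n < iterations; i += 2, n++)
    {
      /* Mirrors "_7 = (j_3 + 500) % 7; if (_7 != 0)" from the dump.  */
      if ((j + 500) % 7 != 0)
	j += 501;	/* bb 3 */
      else
	j += 499;	/* bb 4 */
    }
  return j;
}
```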
Re: gcc 4.7 libgo patch committed: Set libgo version number
Hello Ian,

On Tue, 23 Oct 2012 06:55:01 +0200, Ian Lance Taylor wrote:
> PR 54918 points out that libgo is not using version numbers as it should.
> At present none of libgo in 4.6, 4.7 and mainline are compatible with
> each other. This patch to the 4.7 branch sets the version number for
> libgo there. Bootstrapped and ran Go testsuite on
> x86_64-unknown-linux-gnu. Committed to 4.7 branch.

It has regressed the GDB testsuite:

  -PASS: gdb.go/handcall.exp: print add (1, 2)
  +FAIL: gdb.go/handcall.exp: print add (1, 2)

GNU gdb (GDB) 7.5.50.20121022-cvs

before:
  (gdb) print add (1, 2)
  $1 = 3
  (gdb) ptype add
  type = int32 (int, int)
  (gdb) info line add
  Line 219 of ../../../libgo/runtime/cpuprof.c starts at address
  0x755c0884 <tick+52> and ends at 0x755c0898 <tick+72>.

now:
  (gdb) print add (1, 2)
  Too few arguments in function call.
  (gdb) ptype add
  type = void (Profile *, uintptr *, int32)
  (gdb) info line add
  Line 212 of ../../../gcc47/libgo/runtime/cpuprof.c starts at address
  0x755b05fe <add> and ends at 0x755b0609 <add+11>.

Regards,
Jan
Re: [PATCH] Fix PR55011
On Mon, 22 Oct 2012, Michael Matz wrote:

Hi,

On Mon, 22 Oct 2012, Richard Biener wrote:

On Mon, 22 Oct 2012, Michael Matz wrote:

Hi,

On Mon, 22 Oct 2012, Richard Biener wrote:

This fixes PR55011, it seems nothing checks for invalid lattice
transitions in VRP,

That makes sense, because the individual parts of VRP that produce new
ranges are supposed to not generate invalid transitions. So if anything,
such checking should be an assert and the causes be fixed.

No, the checking should be done in update_value_range

Exactly. And that's the routine you're changing, but you aren't adding
checking, you silently fix invalid transitions. What I tried to say is
that the one calling update_value_range with new_vr being UNDEFINED is
wrong, and update_value_range shouldn't fix it, but assert, so that this
wrong caller may be fixed.

Callers do not look at the lattice value they change and it would not be
convenient to do so in the various places. I re-word your complaint to
"the function has a wrong name" then, which even I would not agree with.
update_value_range is told to "update the lattice value-range from lhs
with the information from the given range", which copies the new VR over
to the lattice.

The job of that function is also to detect lattice changes.

Sure, but not to fix invalid input.

The input isn't invalid. The input cannot be put into the lattice just
because that would be an invalid transition.

so the following adds that

It's a work around ...

No.

since we now can produce a lot more UNDEFINED than before ...

for this.

We should never produce UNDEFINED when the input wasn't UNDEFINED already.

Why?

Because doing so _always_ means an invalid lattice transition. UNDEFINED
is TOP, anything not UNDEFINED is not TOP. So going from something to
UNDEFINED is always going upward in the lattice and hence in the wrong
direction.

Um, what do you mean by input then? Certainly intersecting [2, 4] and
[6, 8] yields UNDEFINED. And the inputs are not UNDEFINED.
We shouldn't update the lattice this way, yes, but that is what the patch
ensures.

An assert ensures. A work around works around a problem. I say that the
problem is in those routines that produced the new UNDEFINED range in the
first place, and it's not update_value_range's job to fix that after the
fact.

It is. See how CCP's set_lattice_value adjusts the input as well. It's
just not convenient to repeat the adjustments everywhere. The workers
only compute a new value-range for a stmt based on input value ranges.

And if they produce UNDEFINED when the input wasn't so, then _that's_
where the bug is.

See above. Not doing so triggers issues.

Hmm?

It oscillates and thus never finishes.

I'm not sure I understand. You claim that the workers have to produce
UNDEFINED from non-UNDEFINED in some cases, otherwise we oscillate? That
sounds strange. Or do you mean that we oscillate without your patch to
update_value_range? That I believe, it's the natural result of walking
the lattice in the wrong direction, but I say that update_value_range is
not the place to silently fix invalid transitions.

No, I mean that going up the lattice results in oscillation.

richard.
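To make the point about the intersection concrete, here is a deliberately simplified model in C (this is not GCC's actual `value_range` API; the guard at the end is only a miniature of the kind of fix being debated, shown here as "keep the old range"):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a VRP value range: either UNDEFINED (the lattice TOP) or
   a closed integer interval [lo, hi].  It illustrates the point above:
   intersecting two perfectly defined ranges, e.g. [2, 4] and [6, 8],
   yields the empty set, which VRP represents as UNDEFINED.  Storing that
   into the lattice would move the value *up*, an invalid transition.  */
typedef struct
{
  bool undefined;
  int lo, hi;
} toy_range;

toy_range
toy_intersect (toy_range a, toy_range b)
{
  toy_range r = { false,
		  a.lo > b.lo ? a.lo : b.lo,
		  a.hi < b.hi ? a.hi : b.hi };
  if (a.undefined || b.undefined || r.lo > r.hi)
    r.undefined = true;		/* Empty intersection: UNDEFINED.  */
  return r;
}

/* Miniature of a guard against invalid lattice transitions: refuse to
   replace a defined lattice value with UNDEFINED (a sketch, not the
   actual update_value_range logic).  */
toy_range
toy_update_lattice (toy_range old, toy_range new_vr)
{
  if (new_vr.undefined && !old.undefined)
    return old;			/* Keep the old (lower) lattice value.  */
  return new_vr;
}
```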
Re: Fix array bound niter estimate (PR middle-end/54937)
On Mon, 22 Oct 2012, Jan Hubicka wrote:
> Hi,
> here is the updated patch with the comments. The fortran failures turned
> out to be a funny interaction between this patch and my other change
> that hoped that loop closed SSA is closed on VOPs, but it is not.
>
> Regtested x86_64-linux, bootstrap in progress, OK?

Ok for trunk.

Thanks,
Richard.

> Honza
>
> 	* tree-ssa-loop-niter.c (record_estimate): Do not try to lower
> 	the bound of non-is_exit statements.
> 	(maybe_lower_iteration_bound): Do it here.
> 	(estimate_numbers_of_iterations_loop): Call it.
>
> 	* gcc.c-torture/execute/pr54937.c: New testcase.
> 	* gcc.dg/tree-ssa/cunroll-2.c: Update.

Index: tree-ssa-loop-niter.c
===================================================================
--- tree-ssa-loop-niter.c	(revision 192632)
+++ tree-ssa-loop-niter.c	(working copy)
@@ -2535,7 +2541,6 @@ record_estimate (struct loop *loop, tree
      gimple at_stmt, bool is_exit, bool realistic, bool upper)
 {
   double_int delta;
-  edge exit;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -2570,14 +2577,10 @@ record_estimate (struct loop *loop, tree
     }
 
   /* Update the number of iteration estimates according to the bound.
-     If at_stmt is an exit or dominates the single exit from the loop,
-     then the loop latch is executed at most BOUND times, otherwise
-     it can be executed BOUND + 1 times.  */
-  exit = single_exit (loop);
-  if (is_exit
-      || (exit != NULL
-	  && dominated_by_p (CDI_DOMINATORS,
-			     exit->src, gimple_bb (at_stmt))))
+     If at_stmt is an exit then the loop latch is executed at most BOUND
+     times, otherwise it can be executed BOUND + 1 times.  We will lower
+     the estimate later if such a statement must be executed on the last
+     iteration.  */
+  if (is_exit)
     delta = double_int_zero;
   else
     delta = double_int_one;
@@ -2953,6 +2956,110 @@ gcov_type_to_double_int (gcov_type val)
   return ret;
 }
 
+/* See if every path crossing the loop goes through a statement that is
+   known to not execute at the last iteration.  In that case we can
+   decrease the iteration count by 1.
+ */
+
+static void
+maybe_lower_iteration_bound (struct loop *loop)
+{
+  pointer_set_t *not_executed_last_iteration = NULL;
+  struct nb_iter_bound *elt;
+  bool found_exit = false;
+  VEC (basic_block, heap) *queue = NULL;
+  bitmap visited;
+
+  /* Collect all statements with interesting (i.e. lower than
+     nb_iterations_upper_bound) bound on them.
+
+     TODO: Due to the way record_estimate chooses estimates to store, the
+     bounds will always be nb_iterations_upper_bound-1.  We can change this
+     to also record statements not dominating the loop latch and update
+     the walk below to the shortest path algorithm.  */
+  for (elt = loop->bounds; elt; elt = elt->next)
+    {
+      if (!elt->is_exit
+	  && elt->bound.ult (loop->nb_iterations_upper_bound))
+	{
+	  if (!not_executed_last_iteration)
+	    not_executed_last_iteration = pointer_set_create ();
+	  pointer_set_insert (not_executed_last_iteration, elt->stmt);
+	}
+    }
+  if (!not_executed_last_iteration)
+    return;
+
+  /* Start DFS walk in the loop header and see if we can reach the
+     loop latch or any of the exits (including statements with side
+     effects that may terminate the loop otherwise) without visiting
+     any of the statements known to have undefined effect on the last
+     iteration.  */
+  VEC_safe_push (basic_block, heap, queue, loop->header);
+  visited = BITMAP_ALLOC (NULL);
+  bitmap_set_bit (visited, loop->header->index);
+  found_exit = false;
+
+  do
+    {
+      basic_block bb = VEC_pop (basic_block, queue);
+      gimple_stmt_iterator gsi;
+      bool stmt_found = false;
+
+      /* Loop for possible exits and statements bounding the execution.  */
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  if (pointer_set_contains (not_executed_last_iteration, stmt))
+	    {
+	      stmt_found = true;
+	      break;
+	    }
+	  if (gimple_has_side_effects (stmt))
+	    {
+	      found_exit = true;
+	      break;
+	    }
+	}
+      if (found_exit)
+	break;
+
+      /* If no bounding statement is found, continue the walk.
+	 */
+      if (!stmt_found)
+	{
+	  edge e;
+	  edge_iterator ei;
+
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    {
+	      if (loop_exit_edge_p (loop, e)
+		  || e == loop_latch_edge (loop))
+		{
+		  found_exit = true;
+		  break;
+		}
+	      if (bitmap_set_bit (visited, e->dest->index))
+		VEC_safe_push (basic_block, heap, queue, e->dest);
+	    }
+	}
+    }
+  while
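Stripped of GCC's VEC/bitmap/pointer-set machinery, the walk above amounts to the following reachability check (a toy model over a hypothetical adjacency-matrix CFG, not the actual implementation): if the latch is reachable from the header without crossing a block containing a bounding statement, the iteration count cannot be lowered.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 8

/* Toy version of the walk maybe_lower_iteration_bound performs: breadth-
   first from the loop header, cutting paths at blocks that contain a
   statement known to not execute on the last iteration.  Returns true if
   the latch is reachable without crossing such a block, in which case the
   bound cannot be lowered.  */
bool
latch_reachable_without_bound (bool succ[NBLOCKS][NBLOCKS], int header,
			       int latch, const bool has_bound[NBLOCKS])
{
  int queue[NBLOCKS], head = 0, tail = 0;
  bool visited[NBLOCKS];
  memset (visited, 0, sizeof visited);

  queue[tail++] = header;
  visited[header] = true;
  while (head < tail)
    {
      int bb = queue[head++];
      if (has_bound[bb])
	continue;		/* Path is cut by a bounding statement.  */
      if (bb == latch)
	return true;		/* Reached the latch: cannot lower.  */
      for (int s = 0; s < NBLOCKS; s++)
	if (succ[bb][s] && !visited[s])
	  {
	    visited[s] = true;
	    queue[tail++] = s;
	  }
    }
  return false;			/* Every path hit a bound first.  */
}
```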
Re: Minor record_upper_bound tweek
On Mon, 22 Oct 2012, Jan Hubicka wrote:
> Hi,
> with profile feedback we may misupdate the profile and start to believe
> that loops iterate more times than they do. This patch makes at least
> nb_iterations_estimate no greater than nb_iterations_upper_bound. This
> makes the unrolling/peeling/unswitching heuristics behave more
> consistently.
>
> Bootstrapped/regtested x86_64-linux, OK?

Ok with ...

> Honza
>
> 	* tree-ssa-loop-niter.c (record_niter_bound): Be sure that the
> 	realistic estimate is not bigger than the upper bound.

Index: tree-ssa-loop-niter.c
===================================================================
--- tree-ssa-loop-niter.c	(revision 192632)
+++ tree-ssa-loop-niter.c	(working copy)
@@ -2506,13 +2506,20 @@ record_niter_bound (struct loop *loop, d
     {
       loop->any_upper_bound = true;
       loop->nb_iterations_upper_bound = i_bound;
+      if (loop->any_estimate
+	  && i_bound.ult (loop->nb_iterations_estimate))
+	loop->nb_iterations_estimate = i_bound;
     }
   if (realistic
       && (!loop->any_estimate
	   || i_bound.ult (loop->nb_iterations_estimate)))
     {
       loop->any_estimate = true;
-      loop->nb_iterations_estimate = i_bound;
+      if (loop->nb_iterations_upper_bound.ult (i_bound)
+	  && loop->any_upper_bound)

... testing any_upper_bound before accessing loop->nb_iterations_upper_bound

+	loop->nb_iterations_estimate = loop->nb_iterations_upper_bound;
+      else
+	loop->nb_iterations_estimate = i_bound;
     }
 
   /* If an upper bound is smaller than the realistic estimate of the

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
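In miniature, the invariant the patch (plus the reviewer's any_upper_bound fix) enforces looks like this; the types are simplified stand-ins for GCC's double_int and struct loop, so treat it as a sketch rather than the committed code:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the relevant struct loop fields.  */
typedef struct
{
  bool any_upper_bound, any_estimate;
  unsigned long nb_iterations_upper_bound, nb_iterations_estimate;
} loop_info;

/* The invariant: whenever either the proven upper bound or the realistic
   estimate is recorded, clamp the estimate so it never exceeds the upper
   bound; an "estimate" above a proven maximum only misleads the
   unrolling/peeling/unswitching heuristics.  */
void
record_niter_bound_sketch (loop_info *loop, unsigned long i_bound,
			   bool realistic, bool upper)
{
  if (upper && (!loop->any_upper_bound
		|| i_bound < loop->nb_iterations_upper_bound))
    {
      loop->any_upper_bound = true;
      loop->nb_iterations_upper_bound = i_bound;
      /* First hunk of the patch: pull an existing estimate down.  */
      if (loop->any_estimate && i_bound < loop->nb_iterations_estimate)
	loop->nb_iterations_estimate = i_bound;
    }
  if (realistic && (!loop->any_estimate
		    || i_bound < loop->nb_iterations_estimate))
    {
      loop->any_estimate = true;
      /* Second hunk, with the reviewer's fix applied: test
	 any_upper_bound before reading nb_iterations_upper_bound.  */
      if (loop->any_upper_bound
	  && loop->nb_iterations_upper_bound < i_bound)
	loop->nb_iterations_estimate = loop->nb_iterations_upper_bound;
      else
	loop->nb_iterations_estimate = i_bound;
    }
}
```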
Re: Loop closed SSA loop update
On Mon, 22 Oct 2012, Jan Hubicka wrote:
> Hi,
> this patch updates tree_unroll_loops_completely to update loop closed
> SSA. When unlooping the loop some basic blocks may move out of the
> other loops, and that makes it necessary to check their uses and add
> PHIs. Fortunately update_loop_close_ssa already supports local updates
> and thus this can be done quite cheaply by recording the blocks in
> fix_bb_placements and passing it along.
>
> I tried the patch with TODO_update_ssa_no_phi but that causes a weird
> bug in 3 fortran testcases because VOPs seem to not be in loop closed
> form. We can track this incrementally I suppose.

Yeah, we need to update the checking code to verify loop-closedness of
VOPs and see why we don't properly rewrite into it.

> Bootstrapped/regtested x86_64-linux, OK?

Ok.

Thanks,
Richard.

> Honza
>
> 	PR middle-end/54967
> 	* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Take
> 	loop_closed_ssa_invalidated parameter; pass it along.
> 	(canonicalize_loop_induction_variables): Update loop closed SSA.
> 	(tree_unroll_loops_completely): Likewise.
> 	* cfgloop.h (unloop): Update prototype.
> 	* cfgloopmanip.c (fix_bb_placements): Record BBs updated into
> 	optional bitmap.
> 	(unloop): Update to pass along loop_closed_ssa_invalidated.
>
> 	* gfortran.dg/pr54967.f90: New testcase.

Index: tree-ssa-loop-ivcanon.c
===================================================================
--- tree-ssa-loop-ivcanon.c	(revision 192632)
+++ tree-ssa-loop-ivcanon.c	(working copy)
@@ -390,13 +390,16 @@ loop_edge_to_cancel (struct loop *loop)
    EXIT is the exit of the loop that should be eliminated.
    IRRED_INVALIDATED is used to bookkeep if information about
    irreducible regions may become invalid as a result
-   of the transformation.  */
+   of the transformation.
+   LOOP_CLOSED_SSA_INVALIDATED is used to bookkeep the case
+   when we need to go into loop closed SSA form.
 */
 
 static bool
 try_unroll_loop_completely (struct loop *loop,
 			    edge exit, tree niter,
 			    enum unroll_level ul,
-			    bool *irred_invalidated)
+			    bool *irred_invalidated,
+			    bitmap loop_closed_ssa_invalidated)
 {
   unsigned HOST_WIDE_INT n_unroll, ninsns, max_unroll, unr_insns;
   gimple cond;
@@ -562,7 +565,7 @@ try_unroll_loop_completely (struct loop
       locus = latch_edge->goto_locus;
 
       /* Unloop destroys the latch edge.  */
-      unloop (loop, irred_invalidated);
+      unloop (loop, irred_invalidated, loop_closed_ssa_invalidated);
 
       /* Create new basic block for the latch edge destination and wire
	 it in.  */
@@ -615,7 +618,8 @@ static bool
 canonicalize_loop_induction_variables (struct loop *loop,
				       bool create_iv, enum unroll_level ul,
-				       bool try_eval, bool *irred_invalidated)
+				       bool try_eval, bool *irred_invalidated,
+				       bitmap loop_closed_ssa_invalidated)
 {
   edge exit = NULL;
   tree niter;
@@ -663,7 +667,8 @@ canonicalize_loop_induction_variables (s
	       (int) max_loop_iterations_int (loop));
     }
 
-  if (try_unroll_loop_completely (loop, exit, niter, ul, irred_invalidated))
+  if (try_unroll_loop_completely (loop, exit, niter, ul, irred_invalidated,
+				  loop_closed_ssa_invalidated))
     return true;
 
   if (create_iv
@@ -683,13 +688,15 @@ canonicalize_induction_variables (void)
   struct loop *loop;
   bool changed = false;
   bool irred_invalidated = false;
+  bitmap loop_closed_ssa_invalidated = BITMAP_ALLOC (NULL);
 
   FOR_EACH_LOOP (li, loop, 0)
     {
       changed |= canonicalize_loop_induction_variables (loop, true,
							UL_SINGLE_ITER,
							true,
-							&irred_invalidated);
+							&irred_invalidated,
+							loop_closed_ssa_invalidated);
     }
   gcc_assert (!need_ssa_update_p (cfun));
 
@@ -701,6 +708,13 @@ canonicalize_induction_variables (void)
      evaluation could reveal new information.
 */
   scev_reset ();
 
+  if (!bitmap_empty_p (loop_closed_ssa_invalidated))
+    {
+      gcc_checking_assert (loops_state_satisfies_p (LOOP_CLOSED_SSA));
+      rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
+    }
+  BITMAP_FREE (loop_closed_ssa_invalidated);
+
   if (changed)
     return TODO_cleanup_cfg;
   return 0;
@@ -794,11 +808,15 @@ tree_unroll_loops_completely (bool may_i
   bool changed;
   enum unroll_level ul;
   int iteration = 0;
+  bool irred_invalidated = false;
 
   do
     {
[PATCH] [0/10] AArch64 Port
Folks,

We would like to request the merge of aarch64-branch into trunk. This
series of patches represents the delta from gcc trunk @r192445 to
aarch64-branch @r192535. The patch set is broken down as follows:

[1/10] gcc configury
This patch contains the adjustments to top level gcc configury required
to enable the AArch64 port.

[2/10] gcc doc updates
This patch contains the additions to the gcc/doc files to document the
AArch64 port.

[3/10] gcc AArch64 target new files
This patch contains all of the new files for the target port itself; the
patch does not modify any existing file.

[4/10] gcc test suite adjustments
This patch contains the adjustments to the existing test suite to support
AArch64.

[5/10] gcc AArch64 test suite new files
This patch contains all of the new files added to the test suite for
AArch64; the patch does not modify any existing file.

[6/10] libatomic adjustments
This patch adjusts the libatomic configury for AArch64.

[7/10] libcpp adjustments
This patch adjusts the libcpp configury for AArch64.

[8/10] libgcc adjustments
This patch provides the AArch64 libgcc port; it contains both the
required configury adjustment to config.host and the new files introduced
by the AArch64 port.

[9/10] libgomp adjustments
This patch adjusts the libgomp configury for AArch64.

[10/10] libstdc++ adjustments
This patch provides the AArch64 libstdc++-v3 port; it contains both the
required configury adjustment to config.host and the new file introduced
by the AArch64 port.

In addition to these patches, the config.guess file will also need to be
copied down from upstream; no patch provided.

OK to commit?

Thanks
/Marcus
[PATCH] [1/10] AArch64 Port
This patch contains the adjustments to top level gcc configury required to
enable the AArch64 port.

Proposed ChangeLog:

	* config.gcc: Add AArch64.
	* configure.ac: Add AArch64 TLS support detection.
	* configure: Regenerate.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index ed7474ad68c4ae7234072d508b697a9a2218d18d..75ca21756ebca80479d69c38ff8d3c4142d822f3 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -310,6 +310,13 @@ m32c*-*-*)
 	tmake_file=m32c/t-m32c
 	target_has_targetm_common=no
 	;;
+aarch64*-*-*)
+	cpu_type=aarch64
+	need_64bit_hwint=yes
+	extra_headers=arm_neon.h
+	extra_objs=aarch64-builtins.o
+	target_has_targetm_common=yes
+	;;
 alpha*-*-*)
 	cpu_type=alpha
 	need_64bit_hwint=yes
@@ -796,6 +803,27 @@ case ${target} in
 esac
 
 case ${target} in
+aarch64*-*-elf)
+	tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"
+	tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-elf-raw.h"
+	tmake_file="${tmake_file} aarch64/t-aarch64"
+	use_gcc_stdint=wrap
+	case $target in
+	aarch64_be-*)
+		tm_defines="${tm_defines} TARGET_BIG_ENDIAN_DEFAULT=1"
+		;;
+	esac
+	;;
+aarch64*-*-linux*)
+	tm_file="${tm_file} dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h"
+	tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-linux.h"
+	tmake_file="${tmake_file} aarch64/t-aarch64 aarch64/t-aarch64-linux"
+	case $target in
+	aarch64_be-*)
+		tm_defines="${tm_defines} TARGET_BIG_ENDIAN_DEFAULT=1"
+		;;
+	esac
+	;;
 alpha*-*-linux*)
 	tm_file="elfos.h ${tm_file} alpha/elf.h alpha/linux.h alpha/linux-elf.h glibc-stdint.h"
 	extra_options="${extra_options} alpha/elf.opt"
@@ -2995,6 +3023,92 @@ fi
 	supported_defaults=
 	case ${target} in
+	aarch64*-*-*)
+		supported_defaults="cpu arch"
+		for which in cpu arch; do
+
+			eval "val=\$with_$which"
+			base_val=`echo $val | sed -e 's/\+.*//'`
+			ext_val=`echo $val | sed -e 's/[a-z0-9\-]\+//'`
+
+			if [ $which = arch ]; then
+			  def=aarch64-arches.def
+			  pattern=AARCH64_ARCH
+			else
+			  def=aarch64-cores.def
+			  pattern=AARCH64_CORE
+			fi
+
+			ext_mask=AARCH64_CPU_DEFAULT_FLAGS
+
+			# Find
+			# the base CPU or ARCH id in aarch64-cores.def or
+			# aarch64-arches.def
+			if [ x$base_val = x ] \
+			    || grep "^$pattern(\"$base_val\"," \
+				    ${srcdir}/config/aarch64/$def \
+				    > /dev/null; then
+
+			  if [ $which = arch ]; then
+			    base_id=`grep "^$pattern(\"$base_val\"," \
+				      ${srcdir}/config/aarch64/$def | \
+				      sed -e 's/^[^,]*,[ ]*//' | \
+				      sed -e 's/,.*$//'`
+			  else
+			    base_id=`grep "^$pattern(\"$base_val\"," \
+				      ${srcdir}/config/aarch64/$def | \
+				      sed -e 's/^[^,]*,[ ]*//' | \
+				      sed -e 's/,.*$//'`
+			  fi
+
+			  while [ x$ext_val != x ]
+			  do
+			    ext_val=`echo $ext_val | sed -e 's/\+//'`
+			    ext=`echo $ext_val | sed -e 's/\+.*//'`
+			    base_ext=`echo $ext | sed -e 's/^no//'`
+
+			    if [ x$base_ext = x ] \
+				|| grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
+					${srcdir}/config/aarch64/aarch64-option-extensions.def \
+					> /dev/null; then
+
+			      ext_on=`grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
+				       ${srcdir}/config/aarch64/aarch64-option-extensions.def | \
+				       sed -e 's/^[^,]*,[ ]*//' | \
+				       sed -e 's/,.*$//'`
+			      ext_off=`grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
+					${srcdir}/config/aarch64/aarch64-option-extensions.def | \
+					sed -e 's/^[^,]*,[ ]*[^,]*,[ ]*//' | \
+					sed -e 's/,.*$//' | \
+					sed -e 's/).*$//'`
+
+			      if [ $ext = $base_ext ]; then
+				# Adding extension
+
[PATCH] [2/10] AArch64 Port
This patch contains the additions to the gcc/doc files to document the
AArch64 port.

Proposed ChangeLog:

	* doc/invoke.texi (AArch64 Options): New.
	* doc/md.texi (Machine Constraints): Add AArch64.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a9a79343985bdc6bcd070453446a40e996199612..cb5de9e1993eabef512cbbcbe79de6588c6b666a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -467,6 +467,15 @@ Objective-C and Objective-C++ Dialects}.
 @c Try and put the significant identifier (CPU or system) first,
 @c so users have a clue at guessing where the ones they want will be.
 
+@emph{AArch64 Options}
+@gccoptlist{-mbig-endian -mlittle-endian @gol
+-mgeneral-regs-only @gol
+-mcmodel=tiny -mcmodel=small -mcmodel=large @gol
+-mstrict-align @gol
+-momit-leaf-frame-pointer -mno-omit-leaf-frame-pointer @gol
+-mtls-dialect=desc -mtls-dialect=traditional @gol
+-march=@var{name} -mcpu=@var{name} -mtune=@var{name}}
+
 @emph{Adapteva Epiphany Options}
 @gccoptlist{-mhalf-reg-file -mprefer-short-insn-regs @gol
 -mbranch-cost=@var{num} -mcmove -mnops=@var{num} -msoft-cmpsf @gol
@@ -10611,6 +10620,7 @@ platform.
 @c in Machine Dependent Options
 
 @menu
+* AArch64 Options::
 * Adapteva Epiphany Options::
 * ARM Options::
 * AVR Options::
@@ -10820,6 +10830,125 @@ purpose.  The default is @option{-m1reg-
 
 @end table
 
+@node AArch64 Options
+@subsection AArch64 Options
+@cindex AArch64 Options
+
+These options are defined for AArch64 implementations:
+
+@table @gcctabopt
+
+@item -mbig-endian
+@opindex mbig-endian
+Generate big-endian code.  This is the default when GCC is configured for an
+@samp{aarch64_be-*-*} target.
+
+@item -mgeneral-regs-only
+@opindex mgeneral-regs-only
+Generate code which uses only the general registers.
+
+@item -mlittle-endian
+@opindex mlittle-endian
+Generate little-endian code.  This is the default when GCC is configured for an
+@samp{aarch64-*-*} but not an @samp{aarch64_be-*-*} target.
+
+@item -mcmodel=tiny
+@opindex mcmodel=tiny
+Generate code for the tiny code model.  The program and its statically defined
+symbols must be within 1GB of each other.  Pointers are 64 bits.  Programs can
+be statically or dynamically linked.  This model is not fully implemented and
+mostly treated as small.
+
+@item -mcmodel=small
+@opindex mcmodel=small
+Generate code for the small code model.  The program and its statically defined
+symbols must be within 4GB of each other.  Pointers are 64 bits.  Programs can
+be statically or dynamically linked.  This is the default code model.
+
+@item -mcmodel=large
+@opindex mcmodel=large
+Generate code for the large code model.  This makes no assumptions about
+addresses and sizes of sections.  Pointers are 64 bits.  Programs can be
+statically linked only.
+
+@item -mstrict-align
+@opindex mstrict-align
+Do not assume that unaligned memory references will be handled by the system.
+
+@item -momit-leaf-frame-pointer
+@item -mno-omit-leaf-frame-pointer
+@opindex momit-leaf-frame-pointer
+@opindex mno-omit-leaf-frame-pointer
+Omit or keep the frame pointer in leaf functions.  The former behaviour is the
+default.
+
+@item -mtls-dialect=desc
+@opindex mtls-dialect=desc
+Use TLS descriptors as the thread-local storage mechanism for dynamic accesses
+of TLS variables.  This is the default.
+
+@item -mtls-dialect=traditional
+@opindex mtls-dialect=traditional
+Use traditional TLS as the thread-local storage mechanism for dynamic accesses
+of TLS variables.
+
+@item -march=@var{name}
+@opindex march
+Specify the name of the target architecture, optionally suffixed by one or
+more feature modifiers.  This option has the form
+@option{-march=@var{arch}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}, where the
+only value for @var{arch} is @samp{armv8-a}.  The possible values for
+@var{feature} are documented in the sub-section below.
+
+Where conflicting feature modifiers are specified, the right-most feature is
+used.
+
+GCC uses this name to determine what kind of instructions it can emit when
+generating assembly code.  This option can be used in conjunction with or
+instead of the @option{-mcpu=} option.
+
+@item -mcpu=@var{name}
+@opindex mcpu
+Specify the name of the target processor, optionally suffixed by one or more
+feature modifiers.  This option has the form
+@option{-mcpu=@var{cpu}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}, where the
+possible values for @var{cpu} are @samp{generic}, @samp{large}.  The
+possible values for @var{feature} are documented in the sub-section below.
+
+Where conflicting feature modifiers are specified, the right-most feature is
+used.
+
+GCC uses this name to determine what kind of instructions it can emit when
+generating assembly code.
+
+@item -mtune=@var{name}
+@opindex mtune
+Specify the name of the processor to tune the performance for.  The code will
+be tuned as if the target processor were of the type specified in this option,
+but still using instructions compatible with the
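The "right-most feature modifier wins" rule described above for -march/-mcpu can be sketched as follows (hypothetical parsing code written for illustration; this is not GCC's actual option handling):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Scan the "+feature" / "+nofeature" suffixes of an -march/-mcpu value
   left to right; each occurrence overrides the previous state, so the
   right-most modifier for a given feature is the one that takes effect.
   The feature names used in the tests are examples only.  */
bool
feature_enabled (const char *opts, const char *feature, bool dflt)
{
  bool on = dflt;
  size_t flen = strlen (feature);
  for (const char *p = strchr (opts, '+'); p; p = strchr (p + 1, '+'))
    {
      const char *mod = p + 1;
      bool negated = strncmp (mod, "no", 2) == 0;
      if (negated)
	mod += 2;
      /* The modifier ends at the next '+' or at end of string.  */
      size_t len = strcspn (mod, "+");
      if (len == flen && strncmp (mod, feature, flen) == 0)
	on = !negated;
    }
  return on;
}
```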
[PATCH] [6/10] AArch64 Port
This patch adjusts the libatomic configury for AArch64.

Proposed ChangeLog:

	* configure.tgt: Mark libatomic unsupported.

diff --git a/libatomic/configure.tgt b/libatomic/configure.tgt
index 847ac41ebed81efff601fcb966d76f35d228dda2..0caa0f42ff99766d1020acd8d966509d0f3447ce 100644
--- a/libatomic/configure.tgt
+++ b/libatomic/configure.tgt
@@ -95,6 +95,11 @@ fi
 # Other system configury
 case "${target}" in
 
+  aarch64*)
+	# This is currently not supported in AArch64.
+	UNSUPPORTED=1
+	;;
+
   arm*-*-linux*)
	# OS support for atomic primitives.
	config_path="${config_path} linux/arm posix"
[PATCH] [4/10] AArch64 Port
This patch contains the adjustments to the existing test suite to support
AArch64.

Proposed ChangeLog:

	* lib/target-supports.exp (check_profiling_available): Add AArch64.
	(check_effective_target_vect_int): Likewise.
	(check_effective_target_vect_shift): Likewise.
	(check_effective_target_vect_float): Likewise.
	(check_effective_target_vect_double): Likewise.
	(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
	(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
	(check_effective_target_vect_pack_trunc): Likewise.
	(check_effective_target_vect_unpack): Likewise.
	(check_effective_target_vect_hw_misalign): Likewise.
	(check_effective_target_vect_short_mult): Likewise.
	(check_effective_target_vect_int_mult): Likewise.
	(check_effective_target_vect_stridedN): Likewise.
	(check_effective_target_sync_int_long): Likewise.
	(check_effective_target_sync_char_short): Likewise.
	(check_vect_support_and_set_flags): Likewise.
	(check_effective_target_aarch64_tiny): New.
	(check_effective_target_aarch64_small): New.
	(check_effective_target_aarch64_large): New.
	* g++.dg/other/PR23205.C: Enable aarch64.
	* g++.dg/other/pr23205-2.C: Likewise.
	* g++.old-deja/g++.abi/ptrmem.C: Likewise.
	* gcc.c-torture/execute/20101011-1.c: Likewise.
	* gcc.dg/20020312-2.c: Likewise.
	* gcc.dg/20040813-1.c: Likewise.
	* gcc.dg/builtin-apply2.c: Likewise.
	* gcc.dg/stack-usage-1.c: Likewise.

diff --git a/gcc/testsuite/g++.dg/abi/aarch64_guard1.C b/gcc/testsuite/g++.dg/abi/aarch64_guard1.C
index ...af82ad2ec36998135e67a25f47d19b4e977fd8d2 100644
--- a/gcc/testsuite/g++.dg/abi/aarch64_guard1.C
+++ b/gcc/testsuite/g++.dg/abi/aarch64_guard1.C
@@ -0,0 +1,17 @@
+// Check that the initialization guard variable is an 8-byte aligned,
+// 8-byte doubleword and that only the least significant bit is used
+// for initialization guard variables.
+// { dg-do compile { target aarch64*-*-* } }
+// { dg-options "-O -fdump-tree-original" }
+
+int bar();
+
+int *foo ()
+{
+  static int x = bar ();
+  return &x;
+}
+
+// { dg-final { scan-assembler "_ZGVZ3foovE1x,8,8" } }
+// { dg-final { scan-tree-dump "_ZGVZ3foovE1x" 1 "original" } }
+// { dg-final { cleanup-tree-dump "original" } }
diff --git a/gcc/testsuite/g++.dg/other/PR23205.C b/gcc/testsuite/g++.dg/other/PR23205.C
index a31fc1d773ddf0b21bdb219be2646c574923d7a5..e55710b40f0a0a69528ca4e27facff742ff2e4ad 100644
--- a/gcc/testsuite/g++.dg/other/PR23205.C
+++ b/gcc/testsuite/g++.dg/other/PR23205.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-skip-if "No stabs" { mmix-*-* *-*-aix* alpha*-*-* hppa*64*-*-* ia64-*-* tile*-*-* *-*-vxworks } { "*" } { "" } } */
+/* { dg-skip-if "No stabs" { aarch64*-*-* mmix-*-* *-*-aix* alpha*-*-* hppa*64*-*-* ia64-*-* tile*-*-* *-*-vxworks } { "*" } { "" } } */
 /* { dg-options "-gstabs+ -fno-eliminate-unused-debug-types" } */
 
 const int foobar = 4;
diff --git a/gcc/testsuite/g++.dg/other/pr23205-2.C b/gcc/testsuite/g++.dg/other/pr23205-2.C
index fbd16dfab5836e4f0ceb987cbf42271d3728c63f..607e5a2b4e433a0fec79d3fda4dc265f1f8a39ae 100644
--- a/gcc/testsuite/g++.dg/other/pr23205-2.C
+++ b/gcc/testsuite/g++.dg/other/pr23205-2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-skip-if "No stabs" { mmix-*-* *-*-aix* alpha*-*-* hppa*64*-*-* ia64-*-* tile*-*-* } { "*" } { "" } } */
+/* { dg-skip-if "No stabs" { aarch64*-*-* mmix-*-* *-*-aix* alpha*-*-* hppa*64*-*-* ia64-*-* tile*-*-* } { "*" } { "" } } */
 /* { dg-options "-gstabs+ -fno-eliminate-unused-debug-types -ftoplevel-reorder" } */
 
 const int foobar = 4;
diff --git a/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C b/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C
index 077fa50840c978f9c0dda8c0e7071eda514395b5..341735879c59d517edb1fc49edfb78c6e2e01846 100644
--- a/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C
+++ b/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C
@@ -7,7 +7,7 @@ function.
    However, some platforms use all bits to encode a
    function pointer.  Such platforms use the lowest bit of the delta,
    that is shifted left by one bit.  */
-#if defined __MN10300__ || defined __SH5__ || defined __arm__ || defined __thumb__ || defined __mips__
+#if defined __MN10300__ || defined __SH5__ || defined __arm__ || defined __thumb__ || defined __mips__ || defined __aarch64__
 #define ADJUST_PTRFN(func, virt) ((void (*)())(func))
 #define ADJUST_DELTA(delta, virt) (((delta) << 1) + !!(virt))
 #else
diff --git a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
index b98454e253ef074b6219a83f0f9473f9dbc0188d..76b9f068723994dd3f0543a9a4ece4538cb676de 100644
--- a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
@@ -12,6 +12,10 @@
 #elif defined (__sh__)
   /* On SH division by zero does not trap.  */
#
[PATCH] [7/10] AArch64 Port
This patch adjusts the libcpp configury for AArch64.

Proposed ChangeLog:

	* configure.ac: Enable AArch64.
	* configure: Regenerate.

diff --git a/libcpp/configure.ac b/libcpp/configure.ac
index 29bd8c5e6f1a7bddb628f415f3138dfeaa69a483..e62da06ce278f832084ff2080d694c99e24f8532 100644
--- a/libcpp/configure.ac
+++ b/libcpp/configure.ac
@@ -134,6 +134,7 @@ fi
 m4_changequote(,)
 case $target in
+	aarch64*-*-* | \
 	alpha*-*-* | \
 	arm*-*-*eabi* | \
 	arm*-*-symbianelf* | \
diff --git a/libcpp/configure b/libcpp/configure
index 01e4462307f7ae6aa1b563133746fb45e41af74e..d33969b2b2d5f692ed39a78abd8a94c0385d071e 100755
--- a/libcpp/configure
+++ b/libcpp/configure
@@ -7096,6 +7096,7 @@ fi
 case $target in
+	aarch64*-*-* | \
 	alpha*-*-* | \
 	arm*-*-*eabi* | \
 	arm*-*-symbianelf* | \
[PATCH] [9/10] AArch64 Port
This patch adjusts the libgomp configury for AArch64. Proposed ChangeLog: * configure.tgt: Add AArch64.diff --git a/libgomp/configure.tgt b/libgomp/configure.tgt index d5a1480e4812634ae280238684cb2187b2c618f8..2eecc93a349f3afe9e0afbbc2e98194065873498 100644 --- a/libgomp/configure.tgt +++ b/libgomp/configure.tgt @@ -27,6 +27,10 @@ config_path=posix if test $enable_linux_futex = yes; then case ${target} in +aarch64*-*-linux*) + config_path=linux posix + ;; + alpha*-*-linux*) config_path=linux/alpha linux posix ;;
[PATCH] [10/10] AArch64 Port
This patch provides the AArch64 libstdc++-v3 port, it contains both the required configury adjustment to config.host and the new file introduced by the AArch64 port. Proposed ChangeLog: * config/cpu/aarch64/cxxabi_tweaks.h: New file. * configure.host: Enable aarch64.diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host index ed9e72109d41774c179190d9546b53d0dd4feef1..af5d3ffbff48eb82dd85ef55c290a3e5d2be9f89 100644 --- a/libstdc++-v3/configure.host +++ b/libstdc++-v3/configure.host @@ -99,6 +99,9 @@ error_constants_dir=os/generic # variants into the established source config/cpu/* sub-directories. # THIS TABLE IS SORTED. KEEP IT THAT WAY. case ${host_cpu} in + aarch64*) +try_cpu=aarch64 +;; alpha*) try_cpu=alpha ;; diff --git a/libstdc++-v3/config/cpu/aarch64/cxxabi_tweaks.h b/libstdc++-v3/config/cpu/aarch64/cxxabi_tweaks.h index ...31a423f4fd56bffaead1d1f2b0057cdb80cda1fb 100644 --- a/libstdc++-v3/config/cpu/aarch64/cxxabi_tweaks.h +++ b/libstdc++-v3/config/cpu/aarch64/cxxabi_tweaks.h @@ -0,0 +1,60 @@ +// Control various target specific ABI tweaks. AArch64 version. + +// Copyright (C) 2004, 2006, 2008, 2009, 2011, 2012 +// Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// Under Section 7 of GPL version 3, you are granted additional +// permissions described in the GCC Runtime Library Exception, version +// 3.1, as published by the Free Software Foundation. 
+ +// You should have received a copy of the GNU General Public License and +// a copy of the GCC Runtime Library Exception along with this program; +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +// http://www.gnu.org/licenses/. + +/** @file cxxabi_tweaks.h + * The header provides a CPU-variable interface to the C++ ABI. + */ + +#ifndef _CXXABI_TWEAKS_H +#define _CXXABI_TWEAKS_H 1 + +#ifdef __cplusplus +namespace __cxxabiv1 +{ + extern "C" + { +#endif + + // The AArch64 ABI uses the least significant bit of a 64-bit + // guard variable. +#define _GLIBCXX_GUARD_TEST(x) ((*(x) & 1) != 0) +#define _GLIBCXX_GUARD_SET(x) *(x) = 1 +#define _GLIBCXX_GUARD_BIT 1 +#define _GLIBCXX_GUARD_PENDING_BIT __guard_test_bit (1, 1) +#define _GLIBCXX_GUARD_WAITING_BIT __guard_test_bit (2, 1) + __extension__ typedef int __guard __attribute__((mode (__DI__))); + + // __cxa_vec_ctor has void return type. + typedef void __cxa_vec_ctor_return_type; +#define _GLIBCXX_CXA_VEC_CTOR_RETURN(x) return + // Constructors and destructors do not return a value. + typedef void __cxa_cdtor_return_type; + +#ifdef __cplusplus + } +} // namespace __cxxabiv1 +#endif + +#endif
[PATCH] [8/10] AArch64 Port
This patch provides the AArch64 libgcc port, it contains both the required configury adjustment to config.host and the new files introduced by the AArch64 port. Proposed ChangeLog: * config.host (aarch64*-*-elf, aarch64*-*-linux*): New. * config/aarch64/crti.S: New file. * config/aarch64/crtn.S: New file. * config/aarch64/linux-unwind.h: New file. * config/aarch64/sfp-machine.h: New file. * config/aarch64/sync-cache.c: New file. * config/aarch64/t-aarch64: New file. * config/aarch64/t-softfp: New file.diff --git a/libgcc/config.host b/libgcc/config.host index 763f6c3a252223e149eb9b995679f455051bfe7a..96c93a4e6a04f63eb7ab629b822a55c19e6d5f97 100644 --- a/libgcc/config.host +++ b/libgcc/config.host @@ -83,6 +83,9 @@ m32c*-*-*) cpu_type=m32c tmake_file=t-fdpbit ;; +aarch64*-*-*) + cpu_type=aarch64 + ;; alpha*-*-*) cpu_type=alpha ;; @@ -278,6 +281,16 @@ i[34567]86-*-mingw* | x86_64-*-mingw*) esac case ${host} in +aarch64*-*-elf) + extra_parts=$extra_parts crtbegin.o crtend.o crti.o crtn.o + tmake_file=${tmake_file} ${cpu_type}/t-aarch64 + tmake_file=${tmake_file} ${cpu_type}/t-softfp t-softfp + ;; +aarch64*-*-linux*) + md_unwind_header=aarch64/linux-unwind.h + tmake_file=${tmake_file} ${cpu_type}/t-aarch64 + tmake_file=${tmake_file} ${cpu_type}/t-softfp t-softfp + ;; alpha*-*-linux*) tmake_file=${tmake_file} alpha/t-alpha alpha/t-ieee t-crtfm alpha/t-linux extra_parts=$extra_parts crtfastmath.o diff --git a/libgcc/config/aarch64/crti.S b/libgcc/config/aarch64/crti.S index ...49611303b023206cd9cd72511e49fe4aadca340c 100644 --- a/libgcc/config/aarch64/crti.S +++ b/libgcc/config/aarch64/crti.S @@ -0,0 +1,68 @@ +# Machine description for AArch64 architecture. +# Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc. +# Contributed by ARM Ltd. 
+# +# This file is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation; either version 3, or (at your option) any +# later version. +# +# This file is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# Under Section 7 of GPL version 3, you are granted additional +# permissions described in the GCC Runtime Library Exception, version +# 3.1, as published by the Free Software Foundation. +# +# You should have received a copy of the GNU General Public License and +# a copy of the GCC Runtime Library Exception along with this program; +# see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +# http://www.gnu.org/licenses/. + +/* An executable stack is *not* required for these functions. */ +#if defined(__ELF__) && defined(__linux__) +.section .note.GNU-stack,"",%progbits +.previous +#endif + +# This file creates a stack frame for the contents of the .fini and +# .init sections. Users may put any desired instructions in those +# sections. + +#ifdef __ELF__ +#define TYPE(x) .type x,function +#else +#define TYPE(x) +#endif + + # Note - this macro is complemented by the FUNC_END macro + # in crtn.S. If you change this macro you must also change + # that macro to match. +.macro FUNC_START + # Create a stack frame and save any call-preserved registers + stp x29, x30, [sp, #-16]! + stp x27, x28, [sp, #-16]! + stp x25, x26, [sp, #-16]! + stp x23, x24, [sp, #-16]! + stp x21, x22, [sp, #-16]! + stp x19, x20, [sp, #-16]! 
+.endm + + .section ".init" + .align 2 + .global _init + TYPE(_init) +_init: + FUNC_START + + + .section ".fini" + .align 2 + .global _fini + TYPE(_fini) +_fini: + FUNC_START + +# end of crti.S diff --git a/libgcc/config/aarch64/crtn.S b/libgcc/config/aarch64/crtn.S index ...70dbc19c59275ce591025de1fc4b39596628730b 100644 --- a/libgcc/config/aarch64/crtn.S +++ b/libgcc/config/aarch64/crtn.S @@ -0,0 +1,61 @@ +# Machine description for AArch64 architecture. +# Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc. +# Contributed by ARM Ltd. +# +# This file is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation; either version 3, or (at your option) any +# later version. +# +# This file is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# Under Section 7 of GPL version 3, you are granted additional +# permissions described in the
Re: [PATCH] [9/10] AArch64 Port
On Tue, Oct 23, 2012 at 10:42:57AM +0100, Marcus Shawcroft wrote: This patch adjusts the libgomp configury for AArch64. Proposed ChangeLog: * configure.tgt: Add AArch64. This is ok. diff --git a/libgomp/configure.tgt b/libgomp/configure.tgt index d5a1480e4812634ae280238684cb2187b2c618f8..2eecc93a349f3afe9e0afbbc2e98194065873498 100644 --- a/libgomp/configure.tgt +++ b/libgomp/configure.tgt @@ -27,6 +27,10 @@ config_path=posix if test $enable_linux_futex = yes; then case ${target} in +aarch64*-*-linux*) + config_path=linux posix + ;; + alpha*-*-linux*) config_path=linux/alpha linux posix ;; Jakub
Re: [wwwdocs,Java] Replace sources.redhat.com by sourceware.org
- Original Message - ...and some other simplifications and improvements I noticed on the way. This was triggered by a note that the sources.redhat.com DNS entry is going to go away at some point in the future that I got yesterday. Applied. Gerald 2012-10-21 Gerald Pfeifer ger...@pfeifer.com * news.html: Replace references to sources.redhat.com by sourceware.org. Avoid a reference to CVS. Some style adjustments to the February 8, 2001 entry. Index: news.html === RCS file: /cvs/gcc/wwwdocs/htdocs/java/news.html,v retrieving revision 1.12 diff -u -3 -p -r1.12 news.html --- news.html 19 Sep 2010 20:35:03 - 1.12 +++ news.html 21 Oct 2012 02:02:51 - @@ -153,7 +153,7 @@ code size heuristics. It is enabled by dd Gary Benson from Red Hat has released a href=http://people.redhat.com/gbenson/naoko/;Naoko/a: a subset -of the a href=http://sources.redhat.com/rhug/;rhug/a packages +of the a href=http://sourceware.org/rhug/;rhug/a packages that have been repackaged for eventual inclusion in Red Hat Linux. Naoko basically comprises binary RPMS of Ant, Tomcat, and their dependencies built with gcj. @@ -172,8 +172,8 @@ A team of hackers from Red Hat has relea of a href=http://www.eclipse.org/;Eclipse/a, a free software IDE written in Java, that has been compiled with a modified gcj. You can find more information -a href=http://sources.redhat.com/eclipse/;here/a. We'll be -integrating the required gcj patches into cvs in the near future. +a href=http://sourceware.org/eclipse/;here/a. We'll be +integrating the required gcj patches in the near future. /dd dtJuly 31, 2003/dt @@ -426,7 +426,7 @@ find bugs! dtFebruary 8, 2001/dt dd Made use of Warren Levy's change to the -a href=http://sources.redhat.com/mauve/;Mauve test suite/a to handle +a href=http://sourceware.org/mauve/;Mauve test suite/a to handle regressions. 
Modifications have been made to ttmauve.exp/tt to copy the newly created ttxfails/tt file of known library failures from the source tree @@ -434,9 +434,9 @@ to the directory where the libjava tt' This allows the testsuite to ignore ttXFAIL/tts and thus highlight true regressions in the library. The Mauve tests are automatically run as part of a libjava -tt'make check'/tt as long as the Mauve suite is accessible -and the env var ttMAUVEDIR/tt is set to point to the top-level -of the a href=http://sources.redhat.com/mauve/download.html;Mauve source/a. +codemake check/code as long as the Mauve suite is accessible and the +environment variable codeMAUVEDIR/code is set to point to the top-level +of the Mauve sources. /dd dtJanuary 28, 2001/dt It's never been obvious to me how the web material gets updated. GCJ regularly misses out on being mentioned in changes too, despite fixes going in. -- Andrew :) Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: 248BDC07 (https://keys.indymedia.org/) Fingerprint = EC5A 1F5E C0AD 1D15 8F1F 8F91 3B96 A578 248B DC07
Re: Minimize downward code motion during reassociation
On Mon, Oct 22, 2012 at 8:31 PM, Easwaran Raman era...@google.com wrote: On Mon, Oct 22, 2012 at 12:59 AM, Richard Biener richard.guent...@gmail.com wrote: On Fri, Oct 19, 2012 at 12:36 AM, Easwaran Raman era...@google.com wrote: Hi, During expression reassociation, statements are conservatively moved downwards to ensure that dependences are correctly satisfied after reassocation. This could lead to lengthening of live ranges. This patch moves statements only to the extent necessary. Bootstraps and no test regression on x86_64/linux. OK for trunk? Thanks, Easwaran 2012-10-18 Easwaran Raman era...@google.com * tree-ssa-reassoc.c(assign_uids): New function. (assign_uids_in_relevant_bbs): Likewise. (ensure_ops_are_available): Likewise. (rewrite_expr_tree): Do not move statements beyond what is necessary. Remove call to swap_ops_for_binary_stmt... (reassociate_bb): ... and move it here. Index: gcc/tree-ssa-reassoc.c === --- gcc/tree-ssa-reassoc.c (revision 192487) +++ gcc/tree-ssa-reassoc.c (working copy) @@ -2250,6 +2250,128 @@ swap_ops_for_binary_stmt (VEC(operand_entry_t, hea } } +/* Assign UIDs to statements in basic block BB. */ + +static void +assign_uids (basic_block bb) +{ + unsigned uid = 0; + gimple_stmt_iterator gsi; + /* First assign uids to phis. */ + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (gsi)) +{ + gimple stmt = gsi_stmt (gsi); + gimple_set_uid (stmt, uid++); +} + + /* Then assign uids to stmts. */ + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi)) +{ + gimple stmt = gsi_stmt (gsi); + gimple_set_uid (stmt, uid++); +} +} + +/* For each operand in OPS, find the basic block that contains the statement + which defines the operand. For all such basic blocks, assign UIDs. 
*/ + +static void +assign_uids_in_relevant_bbs (VEC(operand_entry_t, heap) * ops) +{ + operand_entry_t oe; + int i; + struct pointer_set_t *seen_bbs = pointer_set_create (); + + for (i = 0; VEC_iterate (operand_entry_t, ops, i, oe); i++) +{ + gimple def_stmt; + basic_block bb; + if (TREE_CODE (oe-op) != SSA_NAME) +continue; + def_stmt = SSA_NAME_DEF_STMT (oe-op); + bb = gimple_bb (def_stmt); + if (!pointer_set_contains (seen_bbs, bb)) +{ + assign_uids (bb); + pointer_set_insert (seen_bbs, bb); +} +} + pointer_set_destroy (seen_bbs); +} Please assign UIDs once using the existing renumber_gimple_stmt_uids (). You seem to call the above multiple times and thus do work bigger than O(number of basic blocks). The reason I call the above multiple times is that gsi_move_before might get called between two calls to the above. For instance, after rewrite_expr_tree is called once, the following sequence of calls could happen: reassociate_bb - linearize_expr_tree - linearize_expr - gsi_move_before. So it is not sufficient to call renumber_gimple_stmt_uids once per do_reassoc. Or do you want me to use renumber_gimple_stmt_uids_in_blocks instead of assign_uids_in_relevant_bbs? It's sufficient to call it once if you conservatively update UIDs at stmt move / insert time (assign the same UID as the stmt before/after). +/* Ensure that operands in the OPS vector starting from OPINDEXth entry are live + at STMT. This is accomplished by moving STMT if needed. */ + +static void +ensure_ops_are_available (gimple stmt, VEC(operand_entry_t, heap) * ops, int opindex) +{ + int i; + int len = VEC_length (operand_entry_t, ops); + gimple insert_stmt = stmt; + basic_block insert_bb = gimple_bb (stmt); + gimple_stmt_iterator gsi_insert, gsistmt; + for (i = opindex; i len; i++) +{ Likewise you call this for each call to rewrite_expr_tree, so it seems to me this is quadratic in the number of ops in the op vector. 
The call to ensure_ops_are_available inside rewrite_expr_tree is guarded by if (!moved) and I am setting moved = true there to ensure that ensure_ops_are_available inside is called once per reassociation of an expression tree. It would be a lot easier to understand this function if you call it once after rewrite_expr_tree finished. Why make this all so complicated? It seems to me that we should fixup stmt order only after the whole ops vector has been materialized. + operand_entry_t oe = VEC_index (operand_entry_t, ops, i); + gimple def_stmt; + basic_block def_bb; + /* Ignore constants and operands with default definitions. */ + if (TREE_CODE (oe->op) != SSA_NAME + || SSA_NAME_IS_DEFAULT_DEF (oe->op)) +continue; + def_stmt = SSA_NAME_DEF_STMT (oe->op); + def_bb = gimple_bb (def_stmt); + if (def_bb != insert_bb + && !dominated_by_p (CDI_DOMINATORS, insert_bb, def_bb)) +{ + insert_bb =
Re: [wwwdocs,Java] Replace sources.redhat.com by sourceware.org
On 10/23/2012 10:47 AM, Andrew Hughes wrote: It's never been obvious to me how the web material gets updated. GCJ regularly misses out on being mentioned in changes too, despite fixes going in. Web material gets updated with patches through the same process as the software. Andrew.
Re: Constant-fold vector comparisons
On Mon, Oct 22, 2012 at 10:31 PM, Marc Glisse marc.gli...@inria.fr wrote: On Mon, 15 Oct 2012, Richard Biener wrote: On Fri, Oct 12, 2012 at 4:07 PM, Marc Glisse marc.gli...@inria.fr wrote: On Sat, 29 Sep 2012, Marc Glisse wrote: 1) it handles constant folding of vector comparisons, 2) it fixes another place where vectors are not expected Here is a new version of this patch. In a first try, I got bitten by the operator priorities in a b?c:d, which g++ doesn't warn about. 2012-10-12 Marc Glisse marc.gli...@inria.fr gcc/ * tree-ssa-forwprop.c (forward_propagate_into_cond): Handle vectors. * fold-const.c (fold_relational_const): Handle VECTOR_CST. gcc/testsuite/ * gcc.dg/tree-ssa/foldconst-6.c: New testcase. Here is a new version, with the same ChangeLog plus * doc/generic.texi (VEC_COND_EXPR): Document current policy. Which means I'd prefer if you simply condition the existing ~ and ^ handling on COND_EXPR. Done. - if (integer_onep (tmp)) + if ((gimple_assign_rhs_code (stmt) == VEC_COND_EXPR) + ? integer_all_onesp (tmp) : integer_onep (tmp)) and cache gimple_assign_rhs_code as a 'code' variable at the beginning of the function. Done. + if (TREE_CODE (op0) == VECTOR_CST TREE_CODE (op1) == VECTOR_CST) +{ + int count = VECTOR_CST_NELTS (op0); + tree *elts = XALLOCAVEC (tree, count); + gcc_assert (TREE_CODE (type) == VECTOR_TYPE); A better check would be that VECTOR_CST_NELTS of type is the same as that of op0. I wasn't sure which check you meant, so I added both possibilities. I am fine with removing either or both, actually. Ok with these changes. A few too many changes, I prefer to re-post, in case. On Tue, 16 Oct 2012, Richard Biener wrote: I liked your idea of the signed boolean vector, as a way to express that we know some vector can only have values 0 and -1, but I am not sure how to use it. 
Ah no, I didn't mean to suggest that ;) Maybe you didn't, but I still took the idea from your words ;-) Thus, as we defined true to -1 and false to 0 we cannot, unless relaxing what VEC_COND_EXPR treats as true or false, optimize any of ~ or ^ -1 away. It seems to me that what prevents us from optimizing is if we want to keep the door open for a future relaxation of what VEC_COND_EXPR accepts as its first argument. Which means: produce only -1 and 0, but don't assume we are only reading -1 and 0 (unless we have a reason to know it, for instance because it is the result of a comparison), and don't assume any specific interpretation of those other values. Not sure how much that limits possible optimizations. I'm not sure either - I'd rather leave the possibility open until we see a compelling reason to go either way (read: a testcase where it matters in practice). Ok, I implemented the safe way. My current opinion is that we should go with a VEC_COND_EXPR that only accepts 0 and -1 (it is easy to pass a LT_EXPR or NE_EXPR as first argument if that is what one wants), but it can wait. Ok with ... 
-- Marc Glisse Index: gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c === --- gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c (revision 0) @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-ccp1" } */ + +typedef long vec __attribute__ ((vector_size (2 * sizeof (long)))); + +vec f () +{ + vec a = { -2, 666 }; + vec b = { 3, 2 }; + return a < b; +} + +/* { dg-final { scan-tree-dump-not "666" "ccp1" } } */ +/* { dg-final { cleanup-tree-dump "ccp1" } } */ Property changes on: gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 192695) +++ gcc/fold-const.c(working copy) @@ -16123,20 +16123,45 @@ fold_relational_const (enum tree_code co TREE_IMAGPART (op0), TREE_IMAGPART (op1)); if (code == EQ_EXPR) return fold_build2 (TRUTH_ANDIF_EXPR, type, rcond, icond); else if (code == NE_EXPR) return fold_build2 (TRUTH_ORIF_EXPR, type, rcond, icond); else return NULL_TREE; } + if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST) +{ + unsigned count = VECTOR_CST_NELTS (op0); + tree *elts = XALLOCAVEC (tree, count); + gcc_assert (VECTOR_CST_NELTS (op1) == count); + gcc_assert (TYPE_VECTOR_SUBPARTS (type) == count); gcc_assert (VECTOR_CST_NELTS (op1) == count && TYPE_VECTOR_SUBPARTS (type) == count); Thanks, Richard. +
Re: Fix bugs introduced by switch-case profile propagation
Ping. On Wed, Oct 17, 2012 at 1:48 PM, Easwaran Raman era...@google.com wrote: Hi, This patch fixes bugs introduced by my previous patch to propagate profiles during switch expansion. Bootstrap and profiledbootstrap successful on x86_64. Confirmed that it fixes the crashes reported in PR middle-end/54957. OK for trunk? - Easwaran 2012-10-17 Easwaran Raman era...@google.com PR target/54938 PR middle-end/54957 * optabs.c (emit_cmp_and_jump_insn_1): Add REG_BR_PROB note only if it doesn't already exist. * except.c (sjlj_emit_function_enter): Remove unused variable. * stmt.c (get_outgoing_edge_probs): Return 0 if BB is NULL. Seems fine, but under what conditions you get NULL here? Honza (emit_case_dispatch_table): Handle the case where STMT_BB is NULL. (expand_sjlj_dispatch_table): Pass BB containing before_case to emit_case_dispatch_table. Index: gcc/optabs.c === --- gcc/optabs.c (revision 192488) +++ gcc/optabs.c (working copy) @@ -4268,11 +4268,9 @@ emit_cmp_and_jump_insn_1 (rtx test, enum machine_m profile_status != PROFILE_ABSENT insn JUMP_P (insn) - any_condjump_p (insn)) -{ - gcc_assert (!find_reg_note (insn, REG_BR_PROB, 0)); - add_reg_note (insn, REG_BR_PROB, GEN_INT (prob)); -} + any_condjump_p (insn) + !find_reg_note (insn, REG_BR_PROB, 0)) +add_reg_note (insn, REG_BR_PROB, GEN_INT (prob)); } /* Generate code to compare X with Y so that the condition codes are Index: gcc/except.c === --- gcc/except.c (revision 192488) +++ gcc/except.c (working copy) @@ -1153,7 +1153,7 @@ sjlj_emit_function_enter (rtx dispatch_label) if (dispatch_label) { #ifdef DONT_USE_BUILTIN_SETJMP - rtx x, last; + rtx x; x = emit_library_call_value (setjmp_libfunc, NULL_RTX, LCT_RETURNS_TWICE, TYPE_MODE (integer_type_node), 1, plus_constant (Pmode, XEXP (fc, 0), Index: gcc/stmt.c === --- gcc/stmt.c (revision 192488) +++ gcc/stmt.c (working copy) @@ -1867,6 +1867,8 @@ get_outgoing_edge_probs (basic_block bb) edge e; edge_iterator ei; int prob_sum = 0; + if (!bb) +return 0; 
FOR_EACH_EDGE (e, ei, bb->succs) prob_sum += e->probability; return prob_sum; @@ -1916,8 +1918,8 @@ emit_case_dispatch_table (tree index_expr, tree in rtx fallback_label = label_rtx (case_list->code_label); rtx table_label = gen_label_rtx (); bool has_gaps = false; - edge default_edge = EDGE_SUCC (stmt_bb, 0); - int default_prob = default_edge->probability; + edge default_edge = stmt_bb ? EDGE_SUCC (stmt_bb, 0) : NULL; + int default_prob = default_edge ? default_edge->probability : 0; int base = get_outgoing_edge_probs (stmt_bb); bool try_with_tablejump = false; @@ -1997,7 +1999,8 @@ emit_case_dispatch_table (tree index_expr, tree in default_prob = 0; } - default_edge->probability = default_prob; + if (default_edge) +default_edge->probability = default_prob; /* We have altered the probability of the default edge. So the probabilities of all other edges need to be adjusted so that it sums up to @@ -2289,7 +2292,8 @@ expand_sjlj_dispatch_table (rtx dispatch_index, emit_case_dispatch_table (index_expr, index_type, case_list, default_label, - minval, maxval, range, +BLOCK_FOR_INSN (before_case)); emit_label (default_label); free_alloc_pool (case_node_pool); }
Re: Remove def operands cache, try 2
On Mon, Oct 22, 2012 at 4:12 PM, Michael Matz m...@suse.de wrote: Hi, On Tue, 11 Sep 2012, Michael Matz wrote: the operands cache is ugly. This patch removes it at least for the def operands, saving three pointers for roughly each normal statement (the pointer in gsbase, and two pointers from def_optype_d). This is relatively easy to do, because all statements except ASMs have at most one def (and one vdef), which themself aren't pointed to by something else, unlike the use operands which have more structure for the SSA web. Performance wise the patch is a slight improvement (1% for some C++ testcases, but relatively noisy, but at least not slower), bootstrap time is unaffected. As the iterator is a bit larger code size increases by 1 promille. The patch is regstrapped on x86_64-linux. If it's approved I'll adjust the WORD count markers in gimple.h, I left it out in this submission as it's just verbose noise in comments. So, 2nd try after some internal feedback. This version changes the operand order of asms to also have the defs at the beginning, which makes the iterators slightly nicer, and joins some more fields of the iterator, though not all that we could merge. Again, if approved I'll adjust the word count markers. Regstrapping on x86_64-linux in progress, speed similar as before. Okay for trunk? Ok. Thanks, Richard. Ciao, Michael. -- * tree-ssa-operands.h (struct def_optype_d, def_optype_p): Remove. (ssa_operands.free_defs): Remove. (DEF_OP_PTR, DEF_OP): Remove. (struct ssa_operand_iterator_d): Remove 'defs', add 'flags' members, rename 'phi_stmt' to 'stmt', 'phi_i' to 'i' and 'num_phi' to 'numops'. * gimple.h (gimple_statement_with_ops.def_ops): Remove. (gimple_def_ops, gimple_set_def_ops): Remove. (gimple_vdef_op): Don't take const gimple, adjust. (gimple_asm_input_op, gimple_asm_input_op_ptr, gimple_asm_set_input_op, gimple_asm_output_op, gimple_asm_output_op_ptr, gimple_asm_set_output_op): Adjust asserts, and rewrite to move def operands to front. 
(gimple_asm_clobber_op, gimple_asm_set_clobber_op, gimple_asm_label_op, gimple_asm_set_label_op): Correct asserts. * tree-ssa-operands.c (build_defs): Remove. (init_ssa_operands): Don't initialize it. (fini_ssa_operands): Don't free it. (cleanup_build_arrays): Don't truncate it. (finalize_ssa_stmt_operands): Don't assert on it. (alloc_def, add_def_op, append_def): Remove. (finalize_ssa_defs): Remove building of def_ops list. (finalize_ssa_uses): Don't mark for SSA renaming here, ... (add_stmt_operand): ... but here, don't call append_def. (get_indirect_ref_operands): Remove recurse_on_base argument. (get_expr_operands): Adjust call to get_indirect_ref_operands. (verify_ssa_operands): Don't check def operands. (free_stmt_operands): Don't free def operands. * gimple.c (gimple_copy): Don't clear def operands. * tree-flow-inline.h (op_iter_next_use): Adjust to explicitely handle def operand. (op_iter_next_tree, op_iter_next_def): Ditto. (clear_and_done_ssa_iter): Clear new fields. (op_iter_init): Adjust to setup new iterator structure. (op_iter_init_phiuse): Adjust. Index: tree-ssa-operands.h === --- tree-ssa-operands.h.orig2012-09-24 15:24:52.0 +0200 +++ tree-ssa-operands.h 2012-10-22 15:12:30.0 +0200 @@ -34,14 +34,6 @@ typedef ssa_use_operand_t *use_operand_p #define NULL_USE_OPERAND_P ((use_operand_p)NULL) #define NULL_DEF_OPERAND_P ((def_operand_p)NULL) -/* This represents the DEF operands of a stmt. */ -struct def_optype_d -{ - struct def_optype_d *next; - tree *def_ptr; -}; -typedef struct def_optype_d *def_optype_p; - /* This represents the USE operands of a stmt. 
*/ struct use_optype_d { @@ -68,7 +60,6 @@ struct GTY(()) ssa_operands { bool ops_active; - struct def_optype_d * GTY ((skip ())) free_defs; struct use_optype_d * GTY ((skip ())) free_uses; }; @@ -82,9 +73,6 @@ struct GTY(()) ssa_operands { #define USE_OP_PTR(OP) (((OP)-use_ptr)) #define USE_OP(OP) (USE_FROM_PTR (USE_OP_PTR (OP))) -#define DEF_OP_PTR(OP) ((OP)-def_ptr) -#define DEF_OP(OP) (DEF_FROM_PTR (DEF_OP_PTR (OP))) - #define PHI_RESULT_PTR(PHI)gimple_phi_result_ptr (PHI) #define PHI_RESULT(PHI)DEF_FROM_PTR (PHI_RESULT_PTR (PHI)) #define SET_PHI_RESULT(PHI, V) SET_DEF (PHI_RESULT_PTR (PHI), (V)) @@ -133,13 +121,13 @@ enum ssa_op_iter_type { typedef struct ssa_operand_iterator_d { - bool done; enum ssa_op_iter_type iter_type; - def_optype_p defs; + bool done; + int flags; + unsigned i; +
Re: [PATCH] Intrinsics for fxsave[,64], xsave[,64], xsaveopt[,64]
On Tue, Oct 23, 2012 at 12:14 PM, Alexander Ivchenko aivch...@gmail.com wrote: Please take a look at the attached patch. I changed the asm-pattern implementation according to your recomendation. Changed the name of feature option from -mfxsave to -mfxsr, as it is in Intel SDM. Corrected the arguments name in the headers. Bootstrap passes +case OPT_mxsaveopt: + if (value) + { + opts-x_ix86_isa_flags |= OPTION_MASK_ISA_XSAVEOPT_SET; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_XSAVEOPT_SET; + } + else + { + opts-x_ix86_isa_flags = ~OPTION_MASK_ISA_XSAVEOPT_UNSET; + opts-x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_XSAVEOPT_UNSET; + } + return true; I think that -mxsaveopt should include -mxsave, in the same way that -msse2 includes -msse. Please add this change (also remove -mxsave from sse-X.[c,C] tests). The patch is OK with this change. Please wait for H.J to look if everything is OK w.r.t ABI. Thanks, Uros.
Re: [wwwdocs,Java] Replace sources.redhat.com by sourceware.org
On Tue, Oct 23, 2012 at 10:52:41AM +0100, Andrew Haley wrote: On 10/23/2012 10:47 AM, Andrew Hughes wrote: It's never been obvious to me how the web material gets updated. GCJ regularly misses out on being mentioned in changes too, despite fixes going in. Web material gets updated with patches through the same process as the software. Only thing to realize is that docs are still in CVS, not SVN: http://gcc.gnu.org/cvs.html For more background see also: http://gcc.gnu.org/contribute.html#webchanges http://gcc.gnu.org/projects/web.html Cheers, Mark
Additional fix for pre-reload schedule on x86 targets.
Hi All, This fix is aimed to remove stability issues with using pre-reload scheduler for x86 targets caused by cross-block motion of function arguments passed in likely-spilled HW registers. We found one more issue in a process of more detail testing pre-reload scheduler for all x86 platforms. The fix was fully tested on all acceptable suites and gcc bootstrapping with turned on pre-reload scheduler. Tested for i386 and x86-64, ok for trunk? ChangeLog: 2012-10-23 Yuri Rumyantsev ysrum...@gmail.com * config/i386/i386.c (insn_is_function_arg) : Add check on CALL instruction. (ix86_dependencies_evaluation_hook): Insert dependencies in all predecessors of call block for non-trivial region avoiding creation of loop-carried dependency to avoid cross-block motion of HW registers. i386-pre-reload-scheduler-fix.diff Description: Binary data
Re: Additional fix for pre-reload schedule on x86 targets.
On Tue, Oct 23, 2012 at 1:38 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: This fix is aimed to remove stability issues with using pre-reload scheduler for x86 targets caused by cross-block motion of function arguments passed in likely-spilled HW registers. We found one more issue in a process of more detail testing pre-reload scheduler for all x86 platforms. The fix was fully tested on all acceptable suites and gcc bootstrapping with turned on pre-reload scheduler. Tested for i386 and x86-64, ok for trunk? ChangeLog: 2012-10-23 Yuri Rumyantsev ysrum...@gmail.com * config/i386/i386.c (insn_is_function_arg) : Add check on CALL instruction. (ix86_dependencies_evaluation_hook): Insert dependencies in all predecessors of call block for non-trivial region avoiding creation of loop-carried dependency to avoid cross-block motion of HW registers. Please Cc Vladimir on scheduler patches. If he agrees on proposed approach, I'll just rubberstamp the patch as OK for mainline. Uros.
[PATCH 0/3][asan] Instrument memory access builtins calls
Hello, The three patches following up this message implement instrumenting memory access builtins calls in AddressSanitizer, like what the llvm implementation does. The first two patches do some factorizing that is used by the third one. I have split them up like this to ease the review and to ensure that applying them one by one keeps the tree in a build-able state. As we don't yet seem to have a test harness on the tree, I have tested the patches by staring at the gimple outputs and by bootstrapping the tree for basic sanity. This is my first foray into gimple and middle-end land, so I guess some apologies are due in advance for the barbarisms you might find in these patches. Below is the summary of the changes. Thank you in advance. [asan] Make build_check_stmt accept an SSA_NAME for its base [asan] Factorize condition insertion code out of build_check_stmt [asan] Instrument built-in memory access function calls gcc/asan.c | 482 + 1 file changed, 419 insertions(+), 63 deletions(-) -- Dodji
[PATCH 1/3] [asan] Make build_check_stmt accept an SSA_NAME for its base
This patch makes build_check_stmt accept its memory access parameter to be an SSA name. This is useful for a subsequent patch that will re-use it. Tested by running cc1 -fasan on the program below with and without the patch and inspecting the gimple output to see that there is no change. void foo () { char foo[1] = {0}; foo[0] = 1; } gcc/ * asan.c (build_check_stmt): Accept the memory access to be represented by an SSA_NAME. --- gcc/asan.c | 36 +++- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/gcc/asan.c b/gcc/asan.c index 9464836..e201f75 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -397,16 +397,18 @@ asan_init_func (void) #define PROB_VERY_UNLIKELY (REG_BR_PROB_BASE / 2000 - 1) #define PROB_ALWAYS (REG_BR_PROB_BASE) -/* Instrument the memory access instruction BASE. - Insert new statements before ITER. - LOCATION is source code location. - IS_STORE is either 1 (for a store) or 0 (for a load). +/* Instrument the memory access instruction BASE. Insert new + statements before ITER. + + Note that the memory access represented by BASE can be either an + SSA_NAME, or a non-SSA expression. LOCATION is the source code + location. IS_STORE is TRUE for a store, FALSE for a load. SIZE_IN_BYTES is one of 1, 2, 4, 8, 16. */ static void -build_check_stmt (tree base, - gimple_stmt_iterator *iter, - location_t location, bool is_store, int size_in_bytes) +build_check_stmt (tree base, gimple_stmt_iterator *iter, + location_t location, bool is_store, + int size_in_bytes) { gimple_stmt_iterator gsi; basic_block cond_bb, then_bb, else_bb; @@ -417,6 +419,7 @@ build_check_stmt (tree base, tree shadow_type = TREE_TYPE (shadow_ptr_type); tree uintptr_type = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1); + tree base_ssa = base; /* We first need to split the current basic block, and start altering the CFG.
This allows us to insert the statements we're about to @@ -462,15 +465,22 @@ build_check_stmt, base = unshare_expr (base); gsi = gsi_last_bb (cond_bb); - g = gimple_build_assign_with_ops (TREE_CODE (base), - make_ssa_name (TREE_TYPE (base), NULL), - base, NULL_TREE); - gimple_set_location (g, location); - gsi_insert_after (gsi, g, GSI_NEW_STMT); + + /* BASE can already be an SSA_NAME; in that case, do not create a + new SSA_NAME for it. */ + if (TREE_CODE (base) != SSA_NAME) +{ + g = gimple_build_assign_with_ops (TREE_CODE (base), + make_ssa_name (TREE_TYPE (base), NULL), + base, NULL_TREE); + gimple_set_location (g, location); + gsi_insert_after (gsi, g, GSI_NEW_STMT); + base_ssa = gimple_assign_lhs (g); +} g = gimple_build_assign_with_ops (NOP_EXPR, make_ssa_name (uintptr_type, NULL), - gimple_assign_lhs (g), NULL_TREE); + base_ssa, NULL_TREE); gimple_set_location (g, location); gsi_insert_after (gsi, g, GSI_NEW_STMT); base_addr = gimple_assign_lhs (g); -- Dodji
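For readers who have not seen asan's generated checks before, the condition that build_check_stmt materializes can be modeled in plain C. This is a hand-written sketch, not the GCC implementation: it assumes the usual asan shadow layout (one shadow byte per 8-byte granule, where shadow == 0 means the whole granule is addressable and shadow == k in 1..7 means only the first k bytes are), and the function name is invented.

```c
#include <stdint.h>

/* Model of the address check asan emits before a memory access:
   nonzero return means the access should be reported as invalid.
   SHADOW is a toy stand-in for the process-wide shadow mapping.  */
static int
asan_check_model (uintptr_t addr, int size_in_bytes, const int8_t *shadow)
{
  int8_t s = shadow[addr >> 3];     /* one shadow byte per 8-byte granule */
  if (s == 0)
    return 0;                       /* whole granule addressable */
  int last_byte = (int) (addr & 7) + size_in_bytes - 1;
  return last_byte >= s;            /* access reaches a poisoned byte */
}
```

The branch structure this induces (a likely fall-through and an unlikely "crash block") is exactly what the basic-block splitting in the patch above sets up.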
[PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt
This patch splits a new create_cond_insert_point_before_iter function out of build_check_stmt, to be used by a later patch. Tested by running cc1 -fasan on the test program below with and without the patch and by inspecting the gimple output to see that there is no change. void foo () { char foo[1] = {0}; foo[0] = 1; } gcc/ * asan.c (create_cond_insert_point_before_iter): Factorize out of ... (build_check_stmt): ... here. --- gcc/asan.c | 122 +++-- 1 file changed, 78 insertions(+), 44 deletions(-) diff --git a/gcc/asan.c b/gcc/asan.c index e201f75..aed1a60 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -397,6 +397,77 @@ asan_init_func (void) #define PROB_VERY_UNLIKELY (REG_BR_PROB_BASE / 2000 - 1) #define PROB_ALWAYS (REG_BR_PROB_BASE) +/* Split the current basic block and create a condition statement + insertion point right before the statement pointed to by ITER. + Return an iterator to the point at which the caller might safely + insert the condition statement. + + THEN_BLOCK must be set to the address of an uninitialized instance + of basic_block. The function will then set *THEN_BLOCK to the + 'then block' of the condition statement to be inserted by the + caller. + + Similarly, the function will set *FALLTHROUGH_BLOCK to the 'else + block' of the condition statement to be inserted by the caller. + + Note that *FALLTHROUGH_BLOCK is a new block that contains the + statements starting from *ITER, and *THEN_BLOCK is a new empty + block. + + *ITER is adjusted to still point to the same statement it was + pointing to initially.
*/ + +static gimple_stmt_iterator +create_cond_insert_point_before_iter (gimple_stmt_iterator *iter, + bool then_more_likely_p, + basic_block *then_block, + basic_block *fallthrough_block) +{ + gcc_assert (then_block != NULL && fallthrough_block != NULL); + + gimple_stmt_iterator gsi = *iter; + + if (!gsi_end_p (gsi)) +gsi_prev (&gsi); + + basic_block cur_bb = gsi_bb (*iter); + + edge e = split_block (cur_bb, gsi_stmt (gsi)); + + /* Get a hold on the 'condition block', the 'then block' and the + 'else block'. */ + basic_block cond_bb = e->src; + basic_block fallthru_bb = e->dest; + basic_block then_bb = create_empty_bb (cond_bb); + + /* Set up the newly created 'then block'. */ + e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE); + int fallthrough_probability = +then_more_likely_p +? PROB_VERY_UNLIKELY +: PROB_ALWAYS - PROB_VERY_UNLIKELY; + e->probability = PROB_ALWAYS - fallthrough_probability; + make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU); + + /* Set up the fallthrough basic block. */ + e = find_edge (cond_bb, fallthru_bb); + e->flags = EDGE_FALSE_VALUE; + e->count = cond_bb->count; + e->probability = fallthrough_probability; + + /* Update dominance info for the newly created then_bb; note that + fallthru_bb's dominance info has already been updated by + split_block. */ + if (dom_info_available_p (CDI_DOMINATORS)) +set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb); + + *then_block = then_bb; + *fallthrough_block = fallthru_bb; + *iter = gsi_start_bb (fallthru_bb); + + return gsi_last_bb (cond_bb); +} + /* Instrument the memory access instruction BASE. Insert new statements before ITER. @@ -411,8 +482,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter, int size_in_bytes) { gimple_stmt_iterator gsi; - basic_block cond_bb, then_bb, else_bb; - edge e; + basic_block then_bb, else_bb; tree t, base_addr, shadow; gimple g; tree shadow_ptr_type = shadow_ptr_types[size_in_bytes == 16 ?
1 : 0]; @@ -421,51 +491,15 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter, = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1); tree base_ssa = base; - /* We first need to split the current basic block, and start altering - the CFG. This allows us to insert the statements we're about to - construct into the right basic blocks. */ - - cond_bb = gimple_bb (gsi_stmt (*iter)); - gsi = *iter; - gsi_prev (&gsi); - if (!gsi_end_p (gsi)) -e = split_block (cond_bb, gsi_stmt (gsi)); - else -e = split_block_after_labels (cond_bb); - cond_bb = e->src; - else_bb = e->dest; - - /* A recap at this point: else_bb is the basic block at whose head - is the gimple statement for which this check expression is being - built. cond_bb is the (possibly new, synthetic) basic block the - end of which will contain the cache-lookup code, and a - conditional that jumps to the cache-miss code or, much more - likely, over to else_bb. */ - - /* Create the bb that contains the crash block. */ - then_bb = create_empty_bb (cond_bb); - e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE); -
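The edge-probability bookkeeping in the factored-out function reduces to a two-way split of REG_BR_PROB_BASE. A standalone model, assuming REG_BR_PROB_BASE is 10000 as in GCC of this era, with the PROB_* macros copied from the patch:

```c
/* Model of the probability split in create_cond_insert_point_before_iter.
   The two outgoing edges of the condition block always sum to PROB_ALWAYS;
   the 'then block' (the asan crash path) is normally very unlikely.  */
#define REG_BR_PROB_BASE   10000
#define PROB_VERY_UNLIKELY (REG_BR_PROB_BASE / 2000 - 1)
#define PROB_ALWAYS        REG_BR_PROB_BASE

static void
split_probabilities (int then_more_likely_p,
                     int *then_prob, int *fallthrough_prob)
{
  int fallthrough = then_more_likely_p
                    ? PROB_VERY_UNLIKELY
                    : PROB_ALWAYS - PROB_VERY_UNLIKELY;
  *fallthrough_prob = fallthrough;
  *then_prob = PROB_ALWAYS - fallthrough;  /* complement of the other edge */
}
```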
Re: [PATCH 1/3] [asan] Make build_check_stmt accept an SSA_NAME for its base
On Tue, Oct 23, 2012 at 03:07:14PM +0200, Dodji Seketeli wrote: * asan.c (build_check_stmt): Accept the memory access to be represented by an SSA_NAME. This is ok for asan, thanks. Jakub
[PATCH 3/3] [asan] Instrument built-in memory access function calls
This patch instruments many memory access patterns through builtins. Basically, for a call like: __builtin_memset (from, 0, n_bytes); the patch would only instrument the accesses at the beginning and at the end of the memory region [from, from + n_bytes]. This is the strategy used by the llvm implementation of asan. This is done for all the memory access builtin functions that expose a well specified memory region -- one that explicitly states the number of bytes accessed in the region. Tested by running cc1 -fasan on variations of simple programs like: void foo () { char foo[1] = {0}; __builtin_memset (foo, 0, 2); } and by staring at the gimple output. gcc/ * asan.c (insert_if_then_before_iter) (instrument_mem_region_access) (maybe_instrument_builtin_call, maybe_instrument_call): New static functions. (instrument_assignment): Factorize from ... (transform_statements): ... here. Use maybe_instrument_call to instrument (builtin) function calls as well. --- gcc/asan.c | 328 +++-- 1 file changed, 320 insertions(+), 8 deletions(-) diff --git a/gcc/asan.c b/gcc/asan.c index aed1a60..a8e3827 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -468,6 +468,40 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter, return gsi_last_bb (cond_bb); } +/* Insert an if condition followed by a 'then block' right before the + statement pointed to by ITER. The fallthrough block -- which is the + else block of the condition as well as the destination of the + outgoing edge of the 'then block' -- starts with the statement + pointed to by ITER. + + COND is the condition of the if. + + If THEN_MORE_LIKELY_P is true, + the probability of the edge to the 'then block' is higher than + the probability of the edge to the fallthrough block. + + Upon completion of the function, *THEN_BB is set to the newly + inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the + fallthrough block.
+ + *ITER is adjusted to still point to the same statement it was + pointing to initially. */ + +static void +insert_if_then_before_iter (gimple cond, + gimple_stmt_iterator *iter, + bool then_more_likely_p, + basic_block *then_bb, + basic_block *fallthrough_bb) +{ + gimple_stmt_iterator cond_insert_point = +create_cond_insert_point_before_iter (iter, + then_more_likely_p, + then_bb, + fallthrough_bb); + gsi_insert_after (cond_insert_point, cond, GSI_NEW_STMT); +} + /* Instrument the memory access instruction BASE. Insert new statements before ITER. @@ -628,7 +662,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter, /* If T represents a memory access, add instrumentation code before ITER. LOCATION is source code location. - IS_STORE is either 1 (for a store) or 0 (for a load). + IS_STORE is either TRUE (for a store) or FALSE (for a load). */ static void instrument_derefs (gimple_stmt_iterator *iter, tree t, @@ -670,6 +704,285 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t, build_check_stmt (base, iter, location, is_store, size_in_bytes); } +/* Instrument an access to a contiguous memory region that starts at + the address pointed to by BASE, over a length of LEN (expressed in + the sizeof (*BASE) bytes). ITER points to the instruction + before which the instrumentation instructions must be inserted. + LOCATION is the source location that the instrumentation + instructions must have. If IS_STORE is true, then the memory + access is a store; otherwise, it's a load. */ + +static void +instrument_mem_region_access (tree base, tree len, + gimple_stmt_iterator *iter, + location_t location, bool is_store) +{ + if (integer_zerop (len)) +return; + + gimple_stmt_iterator gsi = *iter; + tree pointed_to_type = TREE_TYPE (TREE_TYPE (base)); + + if (!is_gimple_constant (len)) +{ + /* So, the length of the memory area to asan-protect is +non-constant.
Let's guard the generated instrumentation code +like: + +if (len != 0) + { +//asan instrumentation code goes here. + } + // fallthrough instructions, starting with *ITER. */ + + basic_block fallthrough_bb, then_bb; + gimple g = gimple_build_cond (NE_EXPR, + len, + build_int_cst (TREE_TYPE (len), 0), + NULL_TREE, NULL_TREE); + gimple_set_location (g, location); + insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true, +
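The strategy the patch describes -- check only the two extreme addresses of the accessed region -- can be modeled with a small C helper. This is an illustrative sketch with invented names, not the GCC code; CHECK_BYTE stands in for the real shadow-memory check:

```c
#include <stddef.h>

/* Sketch: instrument a call such as __builtin_memset (base, 0, len)
   by checking only the first and the last byte of [base, base + len).
   If both granules are addressable and the region is contiguous, the
   whole region is assumed valid -- the trade-off llvm's asan makes.  */
static void
instrument_region (const char *base, size_t len,
                   void (*check_byte) (const char *))
{
  if (len == 0)
    return;                     /* empty region: nothing is accessed */
  check_byte (base);            /* beginning of the region */
  check_byte (base + len - 1);  /* end of the region */
}
```

Note this is exactly why the `if (len != 0)` guard in the patch is needed: with a non-constant length, the end-of-region check would otherwise dereference `base + len - 1` for `len == 0`.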
Re: [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt
On Tue, Oct 23, 2012 at 03:08:07PM +0200, Dodji Seketeli wrote: +static gimple_stmt_iterator +create_cond_insert_point_before_iter (gimple_stmt_iterator *iter, + bool then_more_likely_p, + basic_block *then_block, + basic_block *fallthrough_block) +{ + gcc_assert (then_block != NULL && fallthrough_block != NULL); I think this assert is useless, if they are NULL + *then_block = then_bb; + *fallthrough_block = fallthru_bb; the above two stmts will just crash and be as useful for debugging as the assert. Jakub
Re: gcc 4.7 libgo patch committed: Set libgo version number
On Tue, Oct 23, 2012 at 12:37 AM, Jan Kratochvil jan.kratoch...@redhat.com wrote: On Tue, 23 Oct 2012 06:55:01 +0200, Ian Lance Taylor wrote: PR 54918 points out that libgo is not using version numbers as it should. At present none of libgo in 4.6, 4.7 and mainline are compatible with each other. This patch to the 4.7 branch sets the version number for libgo there. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to 4.7 branch. it has regressed GDB testsuite: -PASS: gdb.go/handcall.exp: print add (1, 2) +FAIL: gdb.go/handcall.exp: print add (1, 2) GNU gdb (GDB) 7.5.50.20121022-cvs before: (gdb) print add (1, 2) $1 = 3 (gdb) ptype add type = int32 (int, int) (gdb) info line add Line 219 of ../../../libgo/runtime/cpuprof.c starts at address 0x755c0884 <tick+52> and ends at 0x755c0898 <tick+72>. now: (gdb) print add (1, 2) Too few arguments in function call. (gdb) ptype add type = void (Profile *, uintptr *, int32) (gdb) info line add Line 212 of ../../../gcc47/libgo/runtime/cpuprof.c starts at address 0x755b05fe <add> and ends at 0x755b0609 <add+11>. In both the before and after, gdb seems to be picking up the wrong version of add, according to the line number information. It's picking up the static function add defined in the libgo library, not the function defined in the user code. The type information is different but in fact the type of the static function in the libgo library has not changed since it was introduced in March, 2011. So somehow in the before case gdb is displaying the line number of the static function add but the type of the Go function add, but in the after case gdb is displaying the line number and the type of the static function. I don't have any explanation for this difference but it's hard for me to believe that the root cause is in libgo. In effect there are two functions named add: one is in libgo with C mangling, and one is in the user code with Go mangling. gdb is not picking the one that the testsuite expects it to pick.
Ian
Re: [PATCH] Fix PR55011
Hi, On Tue, 23 Oct 2012, Richard Biener wrote: ... for this. We should never produce UNDEFINED when the input wasn't UNDEFINED already. Why? Because doing so _always_ means an invalid lattice transition. UNDEFINED is TOP, anything not UNDEFINED is not TOP. So going from something to UNDEFINED is always going upward the lattice and hence in the wrong direction. Um, what do you mean with input then? Certainly intersecting [2, 4] and [6, 8] yields UNDEFINED. Huh? It should yield VARYING, i.e. BOTTOM, not UNDEFINED, aka TOP. That's the whole point I'm trying to make. You're fixing up this very bug. We shouldn't update the lattice this way, yes, but that is what the patch ensures. An assert ensures. A work around works around a problem. I say that the problem is in those routines that produced the new UNDEFINED range in the first place, and it's not update_value_range's job to fix that after the fact. It is. See how CCP's set_lattice_value adjusts the input as well. It actually does what I say update_value_range should do. It _asserts_ a valid transition, and the fixup before is correctly marked with a ??? comment about the impropriety of the place of such fixing up. It's just not convenient to repeat the adjustments everywhere. Sure. If the workers were correct there would be no need to do any adjustments. The workers only compute a new value-range for a stmt based on input value ranges. And if they produce UNDEFINED when the input wasn't so, then _that's_ where the bug is. See above. Hmm? See above. I'm not sure I understand. You claim that the workers have to produce UNDEFINED from non-UNDEFINED in some cases, otherwise we oscillate? That sounds strange. Or do you mean that we oscillate without your patch to update_value_range? That I believe, it's the natural result of going a lattice the wrong way, but I say that update_value_range is not the place to silently fix invalid transitions. No, I mean that going up the lattice results in oscillation.
And that's why the workers are not supposed to do that. They can't willy-nilly create new UNDEFINED/TOP lattice values. Somehow we have a disconnect in this discussion over some very basic property, let's try to clean this up first. So, one question, are you claiming that a VRP worker like this: VR derive_new_range_from_operation (VR a, VR b) is _ever_ allowed to return UNDEFINED when a or b is something else than UNDEFINED? You seem to claim so AFAIU, but at the same time admit that this results in oscillation, and hence needs fixing up in the routine that uses the above result to install the lattice value in the map. I'm obviously saying that the above worker is not allowed to return UNDEFINED. Ciao, Michael.
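The disagreement can be made concrete with a toy range type. This is illustrative code, not tree-vrp.c; it only handles VR_RANGE inputs, and the names merely echo the real ones. The point at issue: intersecting the disjoint ranges [2, 4] and [6, 8] yields the empty set, and on Michael's reading the worker must degrade that to VARYING (bottom), never "improve" it to UNDEFINED (top):

```c
/* Toy value-range lattice: UNDEFINED is top, VARYING is bottom.  */
enum vr_kind { VR_UNDEFINED, VR_RANGE, VR_VARYING };
struct vr { enum vr_kind kind; int min, max; };

/* Intersect two VR_RANGE values.  Disjoint inputs yield VR_VARYING;
   returning VR_UNDEFINED here would move up the lattice and let the
   propagator oscillate.  */
static struct vr
vr_intersect (struct vr a, struct vr b)
{
  struct vr r;
  r.kind = VR_RANGE;
  r.min = a.min > b.min ? a.min : b.min;
  r.max = a.max < b.max ? a.max : b.max;
  if (r.min > r.max)
    {
      r.kind = VR_VARYING;   /* empty intersection: go down, not up */
      r.min = r.max = 0;
    }
  return r;
}
```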
[Patch] Fix the test libgomp.graphite/force-parallel-6.c
The test libgomp.graphite/force-parallel-6.c is not valid as it tries to write Y[2*N] for Y defined as int X[2*N], Y[2*N], B[2*N]; This patch fixes the bounds of the loops in order to make the test valid. Since I don't have write access, could someone commit the patch if it is approved? Dominique libgomp/ChangeLog 2012-10-23 Dominique d'Humieres domi...@lps.ens.fr * testsuite/libgomp.graphite/force-parallel-6.c: Adjust the loops. --- ../_clean/libgomp/testsuite/libgomp.graphite/force-parallel-6.c 2011-12-06 15:36:08.0 +0100 +++ libgomp/testsuite/libgomp.graphite/force-parallel-6.c 2012-10-23 14:55:12.0 +0200 @@ -7,13 +7,13 @@ int foo(void) { int i, j, k; - for (i = 1; i <= N; i++) + for (i = 0; i < N; i++) { X[i] = Y[i] + 10; - for (j = 1; j <= N; j++) + for (j = 0; j < N; j++) { B[j] = A[j][N]; - for (k = 1; k <= N; k++) + for (k = 0; k < N; k++) { A[j+1][k] = B[j] + C[j][k]; }
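The bug being fixed is the classic inclusive-bound off-by-one: for an array of LEN elements the valid indices are 0 .. LEN-1, but a `<=` loop's last iteration touches index LEN. A minimal self-contained illustration (the array size is hypothetical; in the real test the arrays have 2*N elements and the loops reached Y[2*N]):

```c
#define LEN 16

/* Old form of the loop: the final iteration produces index LEN,
   one element past the end of an array of LEN elements.  */
static int
last_index_buggy (void)
{
  int j, last = 0;
  for (j = 1; j <= LEN; j++)
    last = j;
  return last;               /* == LEN: out of bounds as a subscript */
}

/* Form the patch switches to: last index is LEN - 1, in bounds.  */
static int
last_index_fixed (void)
{
  int j, last = 0;
  for (j = 0; j < LEN; j++)
    last = j;
  return last;               /* == LEN - 1 */
}
```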
Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
On Tue, Oct 23, 2012 at 03:11:29PM +0200, Dodji Seketeli wrote: * asan.c (insert_if_then_before_iter) (instrument_mem_region_access) (maybe_instrument_builtin_call, maybe_instrument_call): New static Why not just write it: * asan.c (insert_if_then_before_iter, instrument_mem_region_access, maybe_instrument_builtin_call, maybe_instrument_call): New static ? functions. + tree pointed_to_type = TREE_TYPE (TREE_TYPE (base)); Shouldn't pointed_to_type be always char_type_node? I mean it shouldn't be VOID_TYPE, even when the argument is (void *), etc. + /* The 'then block' of the 'if (len != 0)' condition is where + we'll generate the asan instrumentation code now. */ + gsi = gsi_start_bb (then_bb); + + /* Instrument the beginning of the memory region to be accessed, + and arrange for the rest of the instrumentation code to be + inserted in the then block *after* the current gsi. */ + build_check_stmt (base, &gsi, location, is_store, + int_size_in_bytes (pointed_to_type)); + gsi = gsi_last_bb (then_bb); +} + else +{ + /* Instrument the beginning of the memory region to be + accessed. */ + build_check_stmt (base, iter, location, is_store, + int_size_in_bytes (pointed_to_type)); + gsi = *iter; +} Is there any reason why you can't call build_check_stmt just once, after the conditional? I.e. do ... gsi = gsi_start_bb (then_bb); } else gsi = *iter; build_check_stmt (base, &gsi, location, is_store, int_size_in_bytes (pointed_to_type)); + /* instrument access at _2; */ + gsi_next (&gsi); + tree end = gimple_assign_lhs (region_end); + build_check_stmt (end, &gsi, location, is_store, Can't you just pass gimple_assign_lhs (region_end) as first argument to build_check_stmt? And again, I think you want to test a single byte there, not more. + int_size_in_bytes (TREE_TYPE (end))); + switch (DECL_FUNCTION_CODE (callee)) +{ + /* (s, s, n) style memops. */ +case BUILT_IN_BCMP: +case BUILT_IN_MEMCMP: + /* These cannot be safely instrumented as their length parameter + is just a mere limit.
+ +case BUILT_IN_STRNCASECMP: +case BUILT_IN_STRNCMP: */ I think these comments make the code less readable instead of more readable, I'd move the comments why something can't be instrumented to the default: case. On the other side, you IMHO want to handle here also __atomic_* and __sync_* builtins (not by using instrument_mem_region_access, but just instrument_derefs (if the argument is ADDR_EXPR, on what it points to, otherwise if it is SSA_NAME, on MEM_REF created for it). Jakub
[Patch] Fix the tests gcc.dg/vect/vect-8[23]_64.c
Following the changes in [PATCH] Add option for dumping to stderr (issue6190057) the tests gcc.dg/vect/vect-8[23]_64.c fail on powerpc*-*-*. This patch adjusts the dump files and has been tested on powerpc-apple-darwin9. If approved could someone commit it for me (no write access). Note that these tests use both dg-do run and dg-do compile which is not supported (see http://gcc.gnu.org/ml/gcc/2012-10/msg00226.html and the rest of the thread). TIA Dominique gcc/testsuite/ChangeLog 2012-10-23 Dominique d'Humieres domi...@lps.ens.fr * gcc.dg/vect/vect-82_64.c: Adjust the dump file. * gcc.dg/vect/vect-83_64.c: Likewise. diff -up gcc/testsuite/gcc.dg/vect/vect-82_64.c ../work/gcc/testsuite/gcc.dg/vect/vect-82_64.c --- gcc/testsuite/gcc.dg/vect/vect-82_64.c 2007-11-21 20:18:48.0 +0100 +++ ../work/gcc/testsuite/gcc.dg/vect/vect-82_64.c 2012-10-08 13:52:25.0 +0200 @@ -1,6 +1,6 @@ /* { dg-do run { target { { powerpc*-*-* && lp64 } && powerpc_altivec_ok } } } */ /* { dg-do compile { target { { powerpc*-*-* && ilp32 } && powerpc_altivec_ok } } } */ -/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-stats -maltivec" } */ +/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-details -maltivec" } */ #include <stdarg.h> #include "tree-vect.h" diff -up gcc/testsuite/gcc.dg/vect/vect-83_64.c ../work/gcc/testsuite/gcc.dg/vect/vect-83_64.c --- gcc/testsuite/gcc.dg/vect/vect-83_64.c 2007-11-21 20:18:48.0 +0100 +++ ../work/gcc/testsuite/gcc.dg/vect/vect-83_64.c 2012-10-08 13:52:42.0 +0200 @@ -1,6 +1,6 @@ /* { dg-do run { target { { powerpc*-*-* && lp64 } && powerpc_altivec_ok } } } */ /* { dg-do compile { target { { powerpc*-*-* && ilp32 } && powerpc_altivec_ok } } } */ -/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-stats -maltivec" } */ +/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-details -maltivec" } */ #include <stdarg.h> #include "tree-vect.h"
Re: patch to fix constant math - 4th patch - the wide-int class.
On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: This patch implements the wide-int class. This is a more general version of the double-int class and is meant to be the eventual replacement for that class. The use of this class removes all dependencies of the host from the target compiler's integer math. I have made all of the changes I agreed to in the earlier emails. In particular, this class internally maintains a bitsize and precision but not a mode. The class now is neutral about modes and tree-types. The functions that take modes or tree-types are just convenience functions that translate the parameters into bitsize and precision and wherever there is a call that takes a mode, there is a corresponding call that takes a tree-type. All of the little changes that richi suggested have also been made. The buffer size is now twice the size needed by the largest integer mode. This gives enough room for tree-vrp to do full multiplies on any type that the target supports. Tested on x86-64. This patch depends on the first three patches. I am still waiting on final approval on the hwint.h patch. Ok to commit? diff --git a/gcc/wide-int.h b/gcc/wide-int.h new file mode 100644 index 000..efd2c01 --- /dev/null +++ b/gcc/wide-int.h ... +#ifndef GENERATOR_FILE The whole file is guarded with that ... why? That is bound to be fragile once use of wide-int spreads? How do generator programs end up including this file if they don't need it at all? +#include "tree.h" +#include "hwint.h" +#include "options.h" +#include "tm.h" +#include "insn-modes.h" +#include "machmode.h" +#include "double-int.h" +#include "gmp.h" +#include "insn-modes.h" + That's a lot of tree and rtl dependencies. double-int.h avoids these by placing conversion routines in different headers or by only resorting to types in coretypes.h. Please try to reduce the above to a minimum. + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly?
Consider a port with max byte mode size 4 on a 64bit host. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int<2> should be identical to double_int, thus we should be able to do typedef wide_int<2> double_int; in double-int.h and replace its implementation with a specialization of wide_int. Due to a number of divergences (double_int is not a subset of wide_int) that doesn't seem easily possible (one reason is the ShiftOp and related enums you use). Of course wide_int is not a template either. For the hypothetical embedded target above we'd end up using wide_int<1>, an even more trivial specialization. I realize again this wide-int is not what your wide-int is (because you add a precision member). Still factoring out the commons of wide-int and double-int into a wide_int_raw template should be possible. +class wide_int { + /* Internal representation. */ + + /* VAL is set to a size that is capable of computing a full + multiplication on the largest mode that is represented on the + target. The full multiplication is used by tree-vrp. If + operations are added that require larger buffers, then VAL needs + to be changed. */ + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; + unsigned short len; + unsigned int bitsize; + unsigned int precision; The len, bitsize and precision members need documentation. At least one sounds redundant. + public: + enum ShiftOp { +NONE, NONE is never a descriptive name ... I suppose this is for arithmetic vs. logical shifts? +/* There are two uses for the wide-int shifting functions. The + first use is as an emulation of the target hardware. The + second use is as service routines for other optimizations. The + first case needs to be identified by passing TRUNC as the value + of ShiftOp so that shift amount is properly handled according to the + SHIFT_COUNT_TRUNCATED flag.
For the second case, the shift + amount is always truncated by the bytesize of the mode of + THIS. */ +TRUNC ah, no, it's for SHIFT_COUNT_TRUNCATED. mode of THIS? Now it's precision I suppose. That said, handling SHIFT_COUNT_TRUNCATED in wide-int sounds over-engineered, the caller should be responsible of applying SHIFT_COUNT_TRUNCATED when needed. + enum SignOp { +/* Many of the math functions produce different results depending + on if they are SIGNED or UNSIGNED. In general, there are two + different functions, whose names are prefixed with an 'S' and + or an 'U'. However, for some math functions there is also a + routine that does not have the prefix and takes a SignOp + parameter of SIGNED or UNSIGNED. */ +SIGNED, +UNSIGNED + }; double-int and _all_ of the rest of the middle-end uses a
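Richard's rounding question is ordinary integer arithmetic: `2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT` truncates, so it under-allocates whenever the numerator is not a multiple of the HWI width. A quick check with hypothetical values (not taken from any real port):

```c
/* The array length as the patch computes it: truncating division.  */
static int
buf_len_trunc (int max_bitsize, int hwi_bits)
{
  return 2 * max_bitsize / hwi_bits;
}

/* The safe form: round up to the next whole HOST_WIDE_INT.  */
static int
buf_len_ceil (int max_bitsize, int hwi_bits)
{
  return (2 * max_bitsize + hwi_bits - 1) / hwi_bits;
}
```

For a 32-bit max mode on a 64-bit host the two happen to agree (64 / 64 == 1), but for, say, a hypothetical 24-bit max mode the truncating form yields a zero-length array.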
Re: [MIPS] Implement static stack checking
On Mon, 22 Oct 2012, Richard Sandiford wrote: The loop probes at FIRST + N * PROBE_INTERVAL for values of N from 1 until it is equal to ROUNDED_SIZE, inclusive, so FIRST + SIZE is always probed. Doh! But in that case, rather than:

1:	beq	r1,r2,2f
	addiu	r1,r1,interval
	b	1b
	sw	$0,0(r1)
2:

why not just:

1:	addiu	r1,r1,interval
	bne	r1,r2,1b
	sw	$0,0(r1)

? For the record that can be easily rewritten to support the MIPS16 mode, e.g.:

	move	$1,r2
1:	d/addiu	r1,interval
	li	r2,0
	sw/sd	r2,0(r1)
	move	r2,$1
	cmp	r1,r2
	btnez	1b

with $2 and $3 used as temporaries (in addition to $1) as these are I believe available in MIPS16 prologues ($1 obviously is). The juggling with $1 can be avoided if the probe used need not be zero (need it?). The range of interval supported by the machine instruction encodings available is the same as for the standard MIPS or microMIPS mode. If we care about MIPS16 support, that is. Maciej
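The probe loop under discussion can be modeled in C: starting from FIRST, each iteration advances by INTERVAL and stores zero, so every multiple of INTERVAL in (FIRST, FIRST + ROUNDED_SIZE] gets touched, with the store in the branch delay slot as in the shorter sequence. This is an illustrative sketch with an invented name; real prologue code probes below the stack pointer, whereas this writes into a caller-supplied buffer, and it assumes ROUNDED_SIZE is a nonzero multiple of INTERVAL (as the name implies).

```c
/* C model of the bottom-tested MIPS probe loop (addiu; bne; sw).
   Returns the number of probes performed.  */
static int
probe_loop (char *first, long rounded_size, long interval)
{
  char *r1 = first;
  char *r2 = first + rounded_size;
  int probes = 0;
  do
    {
      r1 += interval;   /* addiu r1,r1,interval */
      *r1 = 0;          /* sw $0,0(r1) -- delay slot, executes every time */
      probes++;
    }
  while (r1 != r2);     /* bne r1,r2,1b */
  return probes;
}
```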
Re: [PATCH] Fix PR55011
On Tue, 23 Oct 2012, Michael Matz wrote: Hi, On Tue, 23 Oct 2012, Richard Biener wrote: ... for this. We should never produce UNDEFINED when the input wasn't UNDEFINED already. Why? Because doing so _always_ means an invalid lattice transition. UNDEFINED is TOP, anything not UNDEFINED is not TOP. So going from something to UNDEFINED is always going upward the lattice and hence in the wrong direction. Um, what do you mean with input then? Certainly intersecting [2, 4] and [6, 8] yields UNDEFINED. Huh? It should yield VARYING, i.e. BOTTOM, not UNDEFINED, aka TOP. That's the whole point I'm trying to make. You're fixing up this very bug. I suppose we can argue about this one. When using VARYING here this VARYING can leak out from unreachable paths in the CFG. We shouldn't update the lattice this way, yes, but that is what the patch ensures. An assert ensures. A work around works around a problem. I say that the problem is in those routines that produced the new UNDEFINED range in the first place, and it's not update_value_range's job to fix that after the fact. It is. See how CCP's set_lattice_value adjusts the input as well. It actually does what I say update_value_range should do. It _asserts_ a valid transition, and the fixup before is correctly marked with a ??? comment about the impropriety of the place of such fixing up. It's just not convenient to repeat the adjustments everywhere. Sure. If the workers were correct there would be no need to do any adjustments. The workers only compute a new value-range for a stmt based on input value ranges. And if they produce UNDEFINED when the input wasn't so, then _that's_ where the bug is. See above. Hmm? See above. I'm not sure I understand. You claim that the workers have to produce UNDEFINED from non-UNDEFINED in some cases, otherwise we oscillate? That sounds strange. Or do you mean that we oscillate without your patch to update_value_range?
That I believe, it's the natural result of going a lattice the wrong way, but I say that update_value_range is not the place to silently fix invalid transitions. No, I mean that going up the lattice results in oscillation. And that's why the workers are not supposed to do that. They can't willy-nilly create new UNDEFINED/TOP lattice values. Somehow we have a disconnect in this discussion over some very basic property, let's try to clean this up first. Well, consider 0 * VARYING. That's still [0, 0], not VARYING. Workers should compute the best approximation for a range of an operation, otherwise we cannot use them to build up operations by pieces (like we do for -b). So, one question, are you claiming that a VRP worker like this: VR derive_new_range_from_operation (VR a, VR b) is _ever_ allowed to return UNDEFINED when a or b is something else than UNDEFINED? You seem to claim so AFAIU, but at the same time admit that this results in oscillation, and hence needs fixing up in the routine that uses the above result to install the lattice value in the map. I'm obviously saying that the above worker is not allowed to return UNDEFINED. Yes. Do you say it never may return a RANGE if either a or b is VARYING? The whole point is that we have both derive_new_range_from_operation and update_lattice_value. Only the latter knows how we iterate and that this iteration restricts the values we may update the lattice with. derive_new_range_from_operation is supposed to be generally useful. Richard. -- Richard Biener rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
Ping^3 Re: Defining C99 predefined macros for whole translation unit
Ping^3. This patch http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01907.html (non-C parts) is still pending review. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] [2/10] AArch64 Port
On Tue, 23 Oct 2012, Marcus Shawcroft wrote: +@item -mcmodel=tiny +@opindex mcmodel=tiny +Generate code for the tiny code model. The program and its statically defined +symbols must be within 1GB of each other. Pointers are 64 bits. Programs can +be statically or dynamically linked. This model is not fully implemented and +mostly treated as small. Say @samp{small} instead of using quotes in Texinfo sources. -- Joseph S. Myers jos...@codesourcery.com
Re: Ping^3 Re: Defining C99 predefined macros for whole translation unit
On Tue, Oct 23, 2012 at 02:35:53PM +, Joseph S. Myers wrote: Ping^3. This patch http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01907.html (non-C parts) is still pending review. Looks ok to me. Jakub
Re: Tidy extract_bit_field_1 co.
Here's a version with the corresponding fixes from Eric's review of the store_bit_field_1 patch. Tested as before. gcc/ * expmed.c (store_split_bit_field): Update the calls to extract_fixed_bit_field. In the big-endian case, always use the mode of OP0 to count the number of significant bits. (extract_bit_field_1): Remove unit, offset, bitpos and byte_offset from the outermost scope. Express conditions in terms of bitnum rather than offset, bitpos and byte_offset. Move the computation of MODE1 to the block that needs it. Use MODE unless the TMODE-based mode_for_size calculation succeeds. Split the plain move cases into two, one for memory accesses and one for register accesses. Generalize the memory case, freeing it from the old register-based endian checks. Move the INT_MODE calculation above the code that needs it. Use simplify_gen_subreg to handle multiword OP0s. If the field still spans several words, pass it directly to extract_split_bit_field. Assume after that point that both targets and register sources fit within a word. Replace x-prefixed variables with non-prefixed forms. Compute the bitpos for ext(z)v register operands directly in the chosen unit size, rather than going through an intermediate BITS_PER_WORD unit size. Simplify the containment check used when forcing OP0 into a register. Update the call to extract_fixed_bit_field. (extract_fixed_bit_field): Replace the bitpos and offset parameters with a single bitnum parameter, of the same form as extract_bit_field. Assume that OP0 contains the full field. Simplify the memory offset calculation and containment check for volatile bitfields. Make the offset explicit when volatile bitfields force a misaligned access. Remove WARNED and fix long lines. Assert that the processed OP0 has an integral mode. (store_split_bit_field): Update the call to store_fixed_bit_field. 
This looks good to me (modulo the counterpart of the corresponding change in the first patch for which I cannot really tell whether it's right or wrong). -- Eric Botcazou
Re: [PATCH] [4/10] AArch64 Port
On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: This patch contains the adjustments to the existing test suite to support AArch64. Proposed ChangeLog: * lib/target-supports.exp (check_profiling_available): Add AArch64. (check_effective_target_vect_int): Likewise. (check_effective_target_vect_shift): Likewise. (check_effective_target_vect_float): Likewise. (check_effective_target_vect_double): Likewise. (check_effective_target_vect_widen_mult_qi_to_hi): Likewise. (check_effective_target_vect_widen_mult_hi_to_si): Likewise. (check_effective_target_vect_pack_trunc): Likewise. (check_effective_target_vect_unpack): Likewise. (check_effective_target_vect_hw_misalign): Likewise. (check_effective_target_vect_short_mult): Likewise. (check_effective_target_vect_int_mult): Likewise. (check_effective_target_vect_stridedN): Likewise. (check_effective_target_sync_int_long): Likewise. (check_effective_target_sync_char_short): Likewise. (check_vect_support_and_set_flags): Likewise. (check_effective_target_aarch64_tiny): New. (check_effective_target_aarch64_small): New. (check_effective_target_aarch64_large): New. * g++.dg/other/PR23205.C: Enable aarch64. * g++.dg/other/pr23205-2.C: Likewise. * g++.old-deja/g++.abi/ptrmem.C: Likewise. * gcc.c-torture/execute/20101011-1.c: Likewise. * gcc.dg/20020312-2.c: Likewise. * gcc.dg/20040813-1.c: Likewise. * gcc.dg/builtin-apply2.c: Likewise. * gcc.dg/stack-usage-1.c: Likewise. This is good. Please install. jeff
Re: [PATCH] [7/10] AArch64 Port
On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: This patch adjusts the libcpp configury for AArch64. Proposed ChangeLog: * configure.ac: Enable AArch64. * configure: Regenerate. This is fine. Please install. Jeff
Re: [PATCH] [6/10] AArch64 Port
On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: This patch adjusts the libatomic configury for AArch64. Proposed ChangeLog: * configure.tgt: Mark libatomic unsupported. This is good. Please install. Presumably at some point in the not too distant future, aarch support will be added to libatomic? jeff
loop-unroll.c TLC 3/4 simple peeling heuristic fix
Hi, the simple peeling heuristic thinks it makes no sense to peel loops with a known iteration count (because they will be runtime unrolled instead). This is not true, because the known iteration count is only an upper bound. Fixed this. To make the testcase possible I had to reduce an overactive heuristic on the number of branches in the loop. It looks a bit more like a thinko copied from simple unrolling, where it makes sort of more sense. Peeling the first iterations when the loop is known to execute few times makes sense for branch prediction quality. Bootstrapped/regtested x86_64-linux, committed. Honza Index: ChangeLog === --- ChangeLog (revision 192717) +++ ChangeLog (working copy) @@ -1,3 +1,11 @@ +2012-10-23 Jan Hubicka j...@suse.cz + + * loop-unroll.c (decide_peel_simple): Simple peeling makes sense even + with simple loops; bound number of branches only when FDO is not + available. + (decide_unroll_stupid): Mention that num_loop_branches heuristics + is off. + 2012-10-23 Nick Clifton ni...@redhat.com PR target/54660 Index: loop-unroll.c === --- loop-unroll.c (revision 192717) +++ loop-unroll.c (working copy) @@ -1228,7 +1228,6 @@ static void decide_peel_simple (struct loop *loop, int flags) { unsigned npeel; - struct niter_desc *desc; double_int iterations; if (!(flags & UAP_PEEL)) @@ -1253,20 +1252,17 @@ decide_peel_simple (struct loop *loop, i return; } - /* Check for simple loops. */ - desc = get_simple_loop_desc (loop); - - /* Check number of iterations. */ - if (desc->simple_p && !desc->assumptions && desc->const_iter) -{ - if (dump_file) - fprintf (dump_file, ";; Loop iterates constant times\n"); - return; -} - /* Do not simply peel loops with branches inside -- it increases number - of mispredicts. + of mispredicts. + Exception is when we do have profile and we however have good chance + to peel proper number of iterations loop will iterate in practice. 
+ TODO: this heuristic needs tuning; while for complete unrolling + the branch inside loop mostly eliminates any improvements, for + peeling it is not the case. Also a function call inside loop is + also branch from branch prediction POV (and probably better reason + to not unroll/peel). */ + if (num_loop_branches (loop) > 1 + && profile_status != PROFILE_READ) { if (dump_file) fprintf (dump_file, ";; Not peeling, contains branches\n"); @@ -1435,7 +1431,9 @@ decide_unroll_stupid (struct loop *loop, } /* Do not unroll loops with branches inside -- it increases number - of mispredicts. */ + of mispredicts. + TODO: this heuristic needs tuning; call inside the loop body + is also relatively good reason to not unroll. */ if (num_loop_branches (loop) > 1) { if (dump_file) Index: testsuite/gcc.dg/tree-prof/peel-1.c === --- testsuite/gcc.dg/tree-prof/peel-1.c (revision 0) +++ testsuite/gcc.dg/tree-prof/peel-1.c (revision 0) @@ -0,0 +1,25 @@ +/* { dg-options "-O3 -fdump-rtl-loop2_unroll -fno-unroll-loops -fpeel-loops" } */ +void abort(); + +int a[1000]; +int +__attribute__ ((noinline)) +t() +{ + int i; + for (i=0;i<1000;i++) +if (!a[i]) + return 1; + abort (); +} +main() +{ + int i; + for (i=0;i<1000;i++) +t(); + return 0; +} +/* { dg-final-use { scan-rtl-dump "Considering simply peeling loop" "loop2_unroll" } } */ +/* In fact one peeling is enough; we however mispredict number of iterations of the loop + at least until loop_ch is schedule ahead of profiling pass. */ +/* { dg-final-use { cleanup-rtl-dump "Decided to simply peel the loop 2 times" } } */ Index: testsuite/ChangeLog === --- testsuite/ChangeLog (revision 192717) +++ testsuite/ChangeLog (working copy) @@ -1,7 +1,11 @@ +2012-10-23 Jan Hubicka j...@suse.cz + + * gcc.dg/tree-prof/peel-1.c: New testcase. + 2012-10-23 Dominique d'Humieres domi...@lps.ens.fr PR gcc/52945 - * testsuite/gcc.dg/lto/pr52634_0.c: skip the test on Darwin. + * gcc.dg/lto/pr52634_0.c: skip the test on Darwin. 2012-10-23 Joseph Myers jos...@codesourcery.com
Re: [PATCH] [5/10] AArch64 Port
On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: This patch contains all of the new files added to the test suite for AArch64, the patch does not modify any existing file. Proposed ChangeLog: * gcc.target/aarch64/aapcs/aapcs64.exp: New file. * gcc.target/aarch64/aapcs/abitest-2.h: New file. * gcc.target/aarch64/aapcs/abitest-common.h: New file. * gcc.target/aarch64/aapcs/abitest.S: New file. * gcc.target/aarch64/aapcs/abitest.h: New file. * gcc.target/aarch64/aapcs/func-ret-1.c: New file. * gcc.target/aarch64/aapcs/func-ret-2.c: New file. * gcc.target/aarch64/aapcs/func-ret-3.c: New file. * gcc.target/aarch64/aapcs/func-ret-3.x: New file. * gcc.target/aarch64/aapcs/func-ret-4.c: New file. * gcc.target/aarch64/aapcs/func-ret-4.x: New file. * gcc.target/aarch64/aapcs/ice_1.c: New file. * gcc.target/aarch64/aapcs/ice_2.c: New file. * gcc.target/aarch64/aapcs/ice_3.c: New file. * gcc.target/aarch64/aapcs/ice_4.c: New file. * gcc.target/aarch64/aapcs/ice_5.c: New file. * gcc.target/aarch64/aapcs/macro-def.h: New file. * gcc.target/aarch64/aapcs/test_1.c: New file. * gcc.target/aarch64/aapcs/test_10.c: New file. * gcc.target/aarch64/aapcs/test_11.c: New file. * gcc.target/aarch64/aapcs/test_12.c: New file. * gcc.target/aarch64/aapcs/test_13.c: New file. * gcc.target/aarch64/aapcs/test_14.c: New file. * gcc.target/aarch64/aapcs/test_15.c: New file. * gcc.target/aarch64/aapcs/test_16.c: New file. * gcc.target/aarch64/aapcs/test_17.c: New file. * gcc.target/aarch64/aapcs/test_18.c: New file. * gcc.target/aarch64/aapcs/test_19.c: New file. * gcc.target/aarch64/aapcs/test_2.c: New file. * gcc.target/aarch64/aapcs/test_20.c: New file. * gcc.target/aarch64/aapcs/test_21.c: New file. * gcc.target/aarch64/aapcs/test_22.c: New file. * gcc.target/aarch64/aapcs/test_23.c: New file. * gcc.target/aarch64/aapcs/test_24.c: New file. * gcc.target/aarch64/aapcs/test_25.c: New file. * gcc.target/aarch64/aapcs/test_26.c: New file. * gcc.target/aarch64/aapcs/test_3.c: New file. 
* gcc.target/aarch64/aapcs/test_4.c: New file. * gcc.target/aarch64/aapcs/test_5.c: New file. * gcc.target/aarch64/aapcs/test_6.c: New file. * gcc.target/aarch64/aapcs/test_7.c: New file. * gcc.target/aarch64/aapcs/test_8.c: New file. * gcc.target/aarch64/aapcs/test_9.c: New file. * gcc.target/aarch64/aapcs/test_align-1.c: New file. * gcc.target/aarch64/aapcs/test_align-2.c: New file. * gcc.target/aarch64/aapcs/test_align-3.c: New file. * gcc.target/aarch64/aapcs/test_align-4.c: New file. * gcc.target/aarch64/aapcs/test_complex.c: New file. * gcc.target/aarch64/aapcs/test_int128.c: New file. * gcc.target/aarch64/aapcs/test_quad_double.c: New file. * gcc.target/aarch64/aapcs/type-def.h: New file. * gcc.target/aarch64/aapcs/va_arg-1.c: New file. * gcc.target/aarch64/aapcs/va_arg-10.c: New file. * gcc.target/aarch64/aapcs/va_arg-11.c: New file. * gcc.target/aarch64/aapcs/va_arg-12.c: New file. * gcc.target/aarch64/aapcs/va_arg-2.c: New file. * gcc.target/aarch64/aapcs/va_arg-3.c: New file. * gcc.target/aarch64/aapcs/va_arg-4.c: New file. * gcc.target/aarch64/aapcs/va_arg-5.c: New file. * gcc.target/aarch64/aapcs/va_arg-6.c: New file. * gcc.target/aarch64/aapcs/va_arg-7.c: New file. * gcc.target/aarch64/aapcs/va_arg-8.c: New file. * gcc.target/aarch64/aapcs/va_arg-9.c: New file. * gcc.target/aarch64/aapcs/validate_memory.h: New file. * gcc.target/aarch64/aarch64.exp: New file. * gcc.target/aarch64/adc-1.c: New file. * gcc.target/aarch64/adc-2.c: New file. * gcc.target/aarch64/asm-1.c: New file. * gcc.target/aarch64/clrsb.c: New file. * gcc.target/aarch64/clz.c: New file. * gcc.target/aarch64/ctz.c: New file. * gcc.target/aarch64/csinc-1.c: New file. * gcc.target/aarch64/csinv-1.c: New file. * gcc.target/aarch64/csneg-1.c: New file. * gcc.target/aarch64/extend.c: New file. * gcc.target/aarch64/fcvt.x: New file. * gcc.target/aarch64/fcvt_double_int.c: New file. * gcc.target/aarch64/fcvt_double_long.c: New file. * gcc.target/aarch64/fcvt_double_uint.c: New file. 
* gcc.target/aarch64/fcvt_double_ulong.c: New file. * gcc.target/aarch64/fcvt_float_int.c: New file. * gcc.target/aarch64/fcvt_float_long.c: New file. * gcc.target/aarch64/fcvt_float_uint.c: New file. * gcc.target/aarch64/fcvt_float_ulong.c: New file. *
Re: [PATCH] [8/10] AArch64 Port
On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: This patch provides the AArch64 libgcc port, it contains both the required configury adjustment to config.host and the new files introduced by the AArch64 port. Proposed ChangeLog: * config.host (aarch64*-*-elf, aarch64*-*-linux*): New. * config/aarch64/crti.S: New file. * config/aarch64/crtn.S: New file. * config/aarch64/linux-unwind.h: New file. * config/aarch64/sfp-machine.h: New file. * config/aarch64/sync-cache.c: New file. * config/aarch64/t-aarch64: New file. * config/aarch64/t-softfp: New file. This is fine. Please install. Jeff
Re: [PATCH] [2/10] AArch64 Port
On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: This patch contains the additions to the gcc/doc files to document the AArch64 port. Proposed ChangeLog: * doc/invoke.texi (AArch64 Options): New. * doc/md.texi (Machine Constraints): Add AArch64. This is fine. Please install. jeff
Re: [PATCH] Fix PR55011
Hi, On Tue, 23 Oct 2012, Richard Biener wrote: So, one question, are you claiming that a VRP worker like this: VR derive_new_range_from_operation (VR a, VR b) is _ever_ allowed to return UNDEFINED when a or b is something else than UNDEFINED? You seem to claim so AFAIU, but at the same time admit that this results in oscillation, and hence needs fixing up in the routine that uses the above result to install the lattice value in the map. I'm obviously saying that the above worker is not allowed to return UNDEFINED. Yes. Do you say it never may return a RANGE if either a or b is VARYING? It may not do so if the recorded lattice value for the return ssa name was already BOTTOM. But I think I see where you're getting at; you say the old lattice value is not available to derive_new_range_from_operation, so that one simply isn't capable of doing the right thing? But this is about BOTTOM vs. RANGE, not about TOP; I think the workers should indeed not be allowed to return TOP, which would create an asymmetry between TOP and BOTTOM, but there we are. OTOH, when the old lattice value isn't available, I indeed also see no other way than your current patch, so let's transform my complaint into just a general unhappiness with how VRP is implemented :) Ciao, Michael.
Re: [PATCH] [3/10] AArch64 Port
On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: This patch contains all of the new files for the target port itself, the patch does not modify any existing file. Proposed ChangeLog: * common/config/aarch64/aarch64-common.c: New file. * config/aarch64/aarch64-arches.def: New file. * config/aarch64/aarch64-builtins.c: New file. * config/aarch64/aarch64-cores.def: New file. * config/aarch64/aarch64-elf-raw.h: New file. * config/aarch64/aarch64-elf.h: New file. * config/aarch64/aarch64-generic.md: New file. * config/aarch64/aarch64-linux.h: New file. * config/aarch64/aarch64-modes.def: New file. * config/aarch64/aarch64-option-extensions.def: New file. * config/aarch64/aarch64-opts.h: New file. * config/aarch64/aarch64-protos.h: New file. * config/aarch64/aarch64-simd.md: New file. * config/aarch64/aarch64-tune.md: New file. * config/aarch64/aarch64.c: New file. * config/aarch64/aarch64.h: New file. * config/aarch64/aarch64.md: New file. * config/aarch64/aarch64.opt: New file. * config/aarch64/arm_neon.h: New file. * config/aarch64/constraints.md: New file. * config/aarch64/gentune.sh: New file. * config/aarch64/iterators.md: New file. * config/aarch64/large.md: New file. * config/aarch64/predicates.md: New file. * config/aarch64/small.md: New file. * config/aarch64/sync.md: New file. * config/aarch64/t-aarch64-linux: New file. * config/aarch64/t-aarch64: New file. Given that you and Richard Earnshaw are the approved maintainers for the AArch64 port, I'm going to give this an OK without diving into it. I'm going to assume you and Richard will iterate with anyone who does dive deeply into the port and has comments/suggestions. The one question in the back of my mind is whether or not this uses the new iterator support we discussed a few months ago? I can't recall if that was integrated into the trunk or not. Jeff
[PATCH] Invalidate in cselib sp after processing frame_pointer_needed fp setter (PR rtl-optimization/54921)
Hi! This is an attempt to hopefully end the endless stream of aliasing miscompilations where alias.c assumes that hfp based accesses can't alias sp based accesses, but both sp and fp can appear in VALUEs pretty much randomly. As detailed in the PR, we have the r variable at rbp - 48 and rsp is also rbp - 48, some stores use direct rbp - 44 based addresses which then have find_base_term of the value equal to (address:DI -4), other stores use rdi == r9 based address, where r9 is r8 + 4 and r8 is rbp - 48, but value for r8 has r8 as well as rsp registers in its locs and therefore find_base_term returns (address:DI -1) - i.e. sp based term, and alias.c says those two can't alias even when they actually do. The fix is what I've done recently in var-tracking.c, invalidate sp if frame_pointer_needed right after the fp setter insn, so we then have a set of VALUEs which are based on rbp and a different set of values which are based on rsp, but those two sets are now disjoint. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-10-23 Jakub Jelinek ja...@redhat.com PR rtl-optimization/54921 * cselib.c (cselib_process_insn): If frame_pointer_needed, call cselib_invalidate_rtx (stack_pointer_rtx) after processing a frame pointer setter in the prologue. * gcc.dg/pr54921.c: New test. --- gcc/cselib.c.jj 2012-10-16 13:20:25.0 +0200 +++ gcc/cselib.c2012-10-23 14:22:17.694861625 +0200 @@ -2655,6 +2655,34 @@ cselib_process_insn (rtx insn) if (GET_CODE (XEXP (x, 0)) == CLOBBER) cselib_invalidate_rtx (XEXP (XEXP (x, 0), 0)); + /* On setter of the hard frame pointer if frame_pointer_needed, + invalidate stack_pointer_rtx, so that sp and {,h}fp based + VALUEs are distinct. 
*/ + if (reload_completed + && frame_pointer_needed + && RTX_FRAME_RELATED_P (insn) + && BLOCK_FOR_INSN (insn) == single_succ (ENTRY_BLOCK_PTR)) +{ + rtx pat = PATTERN (insn); + rtx expr = find_reg_note (insn, REG_FRAME_RELATED_EXPR, NULL_RTX); + if (expr) + pat = XEXP (expr, 0); + if (GET_CODE (pat) == SET + && SET_DEST (pat) == hard_frame_pointer_rtx) + cselib_invalidate_rtx (stack_pointer_rtx); + else if (GET_CODE (pat) == PARALLEL) + { + int i; + for (i = XVECLEN (pat, 0) - 1; i >= 0; i--) + if (GET_CODE (XVECEXP (pat, 0, i)) == SET + && SET_DEST (XVECEXP (pat, 0, i)) == hard_frame_pointer_rtx) + { + cselib_invalidate_rtx (stack_pointer_rtx); + break; + } + } +} + cselib_current_insn = NULL_RTX; if (n_useless_values > MAX_USELESS_VALUES --- gcc/testsuite/gcc.dg/pr54921.c.jj 2012-10-23 14:17:49.811466882 +0200 +++ gcc/testsuite/gcc.dg/pr54921.c 2012-10-23 14:17:32.0 +0200 @@ -0,0 +1,32 @@ +/* PR rtl-optimization/54921 */ +/* { dg-do run } */ +/* { dg-options "-Os -fno-omit-frame-pointer -fsched2-use-superblocks -ftree-slp-vectorize" } */ +/* { dg-additional-options "-fstack-protector" { target fstack_protector } } */ + +struct A +{ + int a; + char b[32]; +} a, b; + +__attribute__((noinline, noclone)) +struct A +bar (int x) +{ + struct A r; + static int n; + r.a = ++n; + __builtin_memset (r.b, 0, sizeof (r.b)); + r.b[0] = x; + return r; +} + +int +main () +{ + a = bar (3); + b = bar (4); + if (a.a != 1 || a.b[0] != 3 || b.a != 2 || b.b[0] != 4) +__builtin_abort (); + return 0; +} Jakub
Re: [PATCH] [1/10] AArch64 Port
On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: This patch contains the adjustments to top level gcc configury required to enable the AArch64 port. Proposed ChangeLog: * config.gcc: Add AArch64. * configure.ac: Add AArch64 TLS support detection. * configure: Regenerate. OK. Jeff
Re: [PATCH] [10/10] AArch64 Port
On 10/23/2012 03:43 AM, Marcus Shawcroft wrote: This patch provides the AArch64 libstdc++-v3 port, it contains both the required configury adjustment to config.host and the new file introduced by the AArch64 port. Proposed ChangeLog: * config/cpu/aarch64/cxxabi_tweaks.h: New file. * configure.host: Enable aarch64. This is fine. Thanks. Jeff
Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
On Tue, Oct 23, 2012 at 6:11 AM, Dodji Seketeli do...@redhat.com wrote: This patch instruments many memory access patterns through builtins. Basically, for a call like: __builtin_memset (from, 0, n_bytes); the patch would only instrument the accesses at the beginning and at the end of the memory region [from, from + n_bytes]. This is the strategy used by the llvm implementation of asan. This instrumentation is done for all the memory access builtin functions that expose a well specified memory region -- one that explicitly states the number of bytes accessed in the region. Tested by running cc1 -fasan on variations of simple programs like: void foo () { char foo[1] = {0}; __builtin_memset (foo, 0, 2); } and by staring at the gimple output. gcc/ * asan.c (insert_if_then_before_iter) (instrument_mem_region_access) (maybe_instrument_builtin_call, maybe_instrument_call): New static functions. (instrument_assignment): Factorize from ... (transform_statements): ... here. Use maybe_instrument_call to instrument (builtin) function calls as well. --- gcc/asan.c | 328 +++-- 1 file changed, 320 insertions(+), 8 deletions(-) diff --git a/gcc/asan.c b/gcc/asan.c index aed1a60..a8e3827 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -468,6 +468,40 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter, return gsi_last_bb (cond_bb); } +/* Insert an if condition followed by a 'then block' right before the + statement pointed to by ITER. The fallthrough block -- which is the + else block of the condition as well as the destination of the + outgoing edge of the 'then block' -- starts with the statement + pointed to by ITER. + + COND is the condition of the if. + + If THEN_MORE_LIKELY_P is true, + the probability of the edge to the 'then block' is higher than + the probability of the edge to the fallthrough block. + + Upon completion of the function, *THEN_BB is set to the newly + inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the + fallthrough block. 
+ + *ITER is adjusted to still point to the same statement it was + pointing to initially. */ + +static void +insert_if_then_before_iter (gimple cond, + gimple_stmt_iterator *iter, + bool then_more_likely_p, + basic_block *then_bb, + basic_block *fallthrough_bb) +{ + gimple_stmt_iterator cond_insert_point = +create_cond_insert_point_before_iter (iter, + then_more_likely_p, + then_bb, + fallthrough_bb); + gsi_insert_after (cond_insert_point, cond, GSI_NEW_STMT); +} + /* Instrument the memory access instruction BASE. Insert new statements before ITER. @@ -628,7 +662,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter, /* If T represents a memory access, add instrumentation code before ITER. LOCATION is source code location. - IS_STORE is either 1 (for a store) or 0 (for a load). */ + IS_STORE is either TRUE (for a store) or FALSE (for a load). */ static void instrument_derefs (gimple_stmt_iterator *iter, tree t, @@ -670,6 +704,285 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t, build_check_stmt (base, iter, location, is_store, size_in_bytes); } +/* Instrument an access to a contiguous memory region that starts at + the address pointed to by BASE, over a length of LEN (expressed in + the sizeof (*BASE) bytes). ITER points to the instruction + before which the instrumentation instructions must be inserted. + LOCATION is the source location that the instrumentation + instructions must have. If IS_STORE is true, then the memory + access is a store; otherwise, it's a load. */ + +static void +instrument_mem_region_access (tree base, tree len, + gimple_stmt_iterator *iter, + location_t location, bool is_store) +{ + if (integer_zerop (len)) +return; + + gimple_stmt_iterator gsi = *iter; + tree pointed_to_type = TREE_TYPE (TREE_TYPE (base)); + + if (!is_gimple_constant (len)) +{ + /* So, the length of the memory area to asan-protect is +non-constant. 
Let's guard the generated instrumentation code +like: + +if (len != 0) + { +//asan instrumentation code goes here. + } + // fallthrough instructions, starting with *ITER. */ + + basic_block fallthrough_bb, then_bb; + gimple g = gimple_build_cond (NE_EXPR, + len, + build_int_cst (TREE_TYPE (len), 0), +
[Patch] libitm: Ask dispatch whether it requires serial mode.
This patch cleans up an implicit assumption about which TM methods actually need to be run in serial mode. Instead, the transaction begin code now asks a TM method's dispatch what it needs. OK for trunk? Torvald commit 12170ba5013e855bf4ea784823961f63e3e2de4c Author: Torvald Riegel trie...@redhat.com Date: Tue Oct 23 14:09:22 2012 +0200 Ask dispatch whether it requires serial mode. * retry.cc (gtm_thread::decide_begin_dispatch): Ask dispatch whether it requires serial mode instead of assuming that for certain dispatches. * dispatch.h (abi_dispatch::requires_serial): New. (abi_dispatch::abi_dispatch): Adapt. * method-gl.cc (gl_wt_dispatch::gl_wt_dispatch): Adapt. * method-ml.cc (ml_wt_dispatch::ml_wt_dispatch): Same. * method-serial.cc (serialirr_dispatch::serialirr_dispatch, serial_dispatch::serial_dispatch, serialirr_onwrite_dispatch::serialirr_onwrite_dispatch): Same. diff --git a/libitm/dispatch.h b/libitm/dispatch.h index 6a9e62e..200138b 100644 --- a/libitm/dispatch.h +++ b/libitm/dispatch.h @@ -311,6 +311,9 @@ public: } // Returns true iff this TM method supports closed nesting. bool closed_nesting() const { return m_closed_nesting; } + // Returns STATE_SERIAL or STATE_SERIAL | STATE_IRREVOCABLE iff the TM + // method only works for serial-mode transactions. 
+ uint32_t requires_serial() const { return m_requires_serial; } method_group* get_method_group() const { return m_method_group; } static void *operator new(size_t s) { return xmalloc (s); } @@ -332,12 +335,14 @@ protected: const bool m_write_through; const bool m_can_run_uninstrumented_code; const bool m_closed_nesting; + const uint32_t m_requires_serial; method_group* const m_method_group; abi_dispatch(bool ro, bool wt, bool uninstrumented, bool closed_nesting, - method_group* mg) : + uint32_t requires_serial, method_group* mg) : m_read_only(ro), m_write_through(wt), m_can_run_uninstrumented_code(uninstrumented), -m_closed_nesting(closed_nesting), m_method_group(mg) +m_closed_nesting(closed_nesting), m_requires_serial(requires_serial), +m_method_group(mg) { } }; diff --git a/libitm/method-gl.cc b/libitm/method-gl.cc index 4b6769b..be8f36c 100644 --- a/libitm/method-gl.cc +++ b/libitm/method-gl.cc @@ -341,7 +341,7 @@ public: CREATE_DISPATCH_METHODS(virtual, ) CREATE_DISPATCH_METHODS_MEM() - gl_wt_dispatch() : abi_dispatch(false, true, false, false, o_gl_mg) + gl_wt_dispatch() : abi_dispatch(false, true, false, false, 0, o_gl_mg) { } }; diff --git a/libitm/method-ml.cc b/libitm/method-ml.cc index 88455e8..80278f5 100644 --- a/libitm/method-ml.cc +++ b/libitm/method-ml.cc @@ -590,7 +590,7 @@ public: CREATE_DISPATCH_METHODS(virtual, ) CREATE_DISPATCH_METHODS_MEM() - ml_wt_dispatch() : abi_dispatch(false, true, false, false, o_ml_mg) + ml_wt_dispatch() : abi_dispatch(false, true, false, false, 0, o_ml_mg) { } }; diff --git a/libitm/method-serial.cc b/libitm/method-serial.cc index bdecd7b..09cfdd4 100644 --- a/libitm/method-serial.cc +++ b/libitm/method-serial.cc @@ -50,13 +50,15 @@ static serial_mg o_serial_mg; class serialirr_dispatch : public abi_dispatch { public: - serialirr_dispatch() : abi_dispatch(false, true, true, false, o_serial_mg) + serialirr_dispatch() : abi_dispatch(false, true, true, false, + gtm_thread::STATE_SERIAL | gtm_thread::STATE_IRREVOCABLE, 
o_serial_mg) { } protected: serialirr_dispatch(bool ro, bool wt, bool uninstrumented, - bool closed_nesting, method_group* mg) : -abi_dispatch(ro, wt, uninstrumented, closed_nesting, mg) { } + bool closed_nesting, uint32_t requires_serial, method_group* mg) : +abi_dispatch(ro, wt, uninstrumented, closed_nesting, requires_serial, mg) + { } // Transactional loads and stores simply access memory directly. // These methods are static to avoid indirect calls, and will be used by the @@ -151,7 +153,9 @@ public: CREATE_DISPATCH_METHODS(virtual, ) CREATE_DISPATCH_METHODS_MEM() - serial_dispatch() : abi_dispatch(false, true, false, true, o_serial_mg) { } + serial_dispatch() : abi_dispatch(false, true, false, true, + gtm_thread::STATE_SERIAL, o_serial_mg) + { } }; @@ -162,7 +166,7 @@ class serialirr_onwrite_dispatch : public serialirr_dispatch { public: serialirr_onwrite_dispatch() : -serialirr_dispatch(false, true, false, false, o_serial_mg) { } +serialirr_dispatch(false, true, false, false, 0, o_serial_mg) { } protected: static void pre_write() diff --git a/libitm/retry.cc b/libitm/retry.cc index 660bf52..172419b 100644 --- a/libitm/retry.cc +++ b/libitm/retry.cc @@ -173,7 +173,7 @@ GTM::gtm_thread::decide_begin_dispatch (uint32_t prop) dd->closed_nesting_alternative()) dd = dd->closed_nesting_alternative(); - if (dd != dispatch_serial() && dd != dispatch_serialirr()) + if
[patch] libitm: Clarify ABI requirements for data-logging functions.
This patch clarifies the ABI requirements for data-logging functions in libitm's documentation. Thanks to Luke Dalessandro for pointing this out. OK for trunk? Torvald commit b9cbb260f958f53afbea69675458f3f15a04b812 Author: Torvald Riegel trie...@redhat.com Date: Tue Oct 23 14:56:32 2012 +0200 Clarify ABI requirements for data-logging functions. * libitm.texi: Clarify ABI requirements for data-logging functions. diff --git a/libitm/libitm.texi b/libitm/libitm.texi index 6cfcaf9..7e5c413 100644 --- a/libitm/libitm.texi +++ b/libitm/libitm.texi @@ -156,6 +156,13 @@ about which memory locations are shared and which are not shared with other threads (i.e., data must be accessed either transactionally or nontransactionally). Otherwise, non-write-through TM algorithms would not work. +For memory locations on the stack, this requirement extends to only the +lifetime of the stack frame that the memory location belongs to (or the +lifetime of the transaction, whichever is shorter). Thus, memory that is +reused for several stack frames could be target of both data logging and +transactional accesses; however, this is harmless because these stack frames' +lifetimes will end before the transaction finishes. + @subsection [No changes] Scatter/gather calls @subsection [No changes] Serial and irrevocable mode @subsection [No changes] Transaction descriptor
[PATCH, ARM] Fix offset_ok_for_ldrd_strd in Thumb1
The function offset_ok_for_ldrd_strd should return false for Thumb1, because TARGET_LDRD and Thumb1 can both be enabled (for example, the default for cortex-m0). This patch fixes an ICE that is caused by gcc r192678 and occurs when building gcc with newlib for arm-none-eabi cortex-m0. Ok for trunk? Thanks, Greta ChangeLog gcc/ 2012-10-23 Greta Yorsh greta.yo...@arm.com * config/arm/arm.c (offset_ok_for_ldrd_strd): Return false for Thumb1. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index e9b9463..a94e537 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -12209,7 +12209,7 @@ offset_ok_for_ldrd_strd (HOST_WIDE_INT offset) else if (TARGET_ARM) max_offset = 255; else -gcc_unreachable (); +return false; return ((offset <= max_offset) && (offset >= -max_offset)); }
Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
On Tue, Oct 23, 2012 at 08:47:48AM -0700, Xinliang David Li wrote: + /* The builtins below cannot be safely instrumented as their + length parameter is just a mere limit. + Why can't the following be instrumented? The length is min (n, strlen (str)). Because that would be too expensive, and libasan intercepts (most of the) str* functions anyway, both so that it can check this and test argument overlap. The memory builtin instrumentation is done primarily for the cases where the builtins are expanded inline, without calling the library routine, because then nothing is verified in libasan. For 'strlen', can the memory check be done at the end of the string using the returned length? Guess strlen is commonly expanded inline, so it would be worthwhile to check the shadow memory after the call (well, we could check the first byte before the call and the last one after the call). Jakub
Re: [PATCH] [3/10] AArch64 Port
On 10/23/12 16:38, Jeff Law wrote: On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: The one question in the back of my mind is whether or not this uses the new iterator support we discussed a few months ago? I can't recall if that was integrated into the trunk or not. Generic support for int-iterators went in around June. So I'd expect the port to be using it quite aggressively, especially as the feature was first developed as part of the aarch64 port. Ramana
Re: [PATCH, ARM] Fix offset_ok_for_ldrd_strd in Thumb1
Ok for trunk? Ok. ramana
Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
On Tue, Oct 23, 2012 at 03:11:29PM +0200, Dodji Seketeli wrote: + /* (src, n) style memops. */ +case BUILT_IN_STRNDUP: + source0 = gimple_call_arg (call, 0); + len = gimple_call_arg (call, 1); + break; I think you can't instrument strndup either, the length is just a limit there, it can copy fewer characters than that if strlen (source0) is shorter. libasan intercepts strndup I think. + /* (src, x, n) style memops. */ +case BUILT_IN_MEMCHR: + source0 = gimple_call_arg (call, 0); + len = gimple_call_arg (call, 2); And similarly for memchr, you could call p = malloc (4096); p[4095] = 1; x = memchr (p, 1, 8192); and it shouldn't read anything past the end of the allocated area. Jakub
Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
On Tue, Oct 23, 2012 at 8:58 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Oct 23, 2012 at 08:47:48AM -0700, Xinliang David Li wrote: + /* The builtins below cannot be safely instrumented as their + length parameter is just a mere limit. + Why can't the following be instrumented? The length is min (n, strlen (str)). Because that would be too expensive, and libasan intercepts (most of the) str* functions anyway, both so that it can check this and test argument overlap. The memory builtin instrumentation is done primarily for the cases where the builtins are expanded inline, without calling the library routine, because then nothing is verified in libasan. Ok that makes sense. thanks, David For 'strlen', can the memory check be done at the end of the string using the returned length? Guess strlen is commonly expanded inline, so it would be worthwhile to check the shadow memory after the call (well, we could check the first byte before the call and the last one after the call). Jakub
Re: [PATCH, ARM] Fix offset_ok_for_ldrd_strd in Thumb1
On 23/10/12 16:54, Greta Yorsh wrote: The function offset_ok_for_ldrd_strd should return false for Thumb1, because TARGET_LDRD and Thumb1 can both be enabled (for example, the default for cortex-m0). This patch fixes an ICE that is caused by gcc r192678 and occurs when building gcc with newlib for arm-none-eabi cortex-m0. Ok for trunk? Thanks, Greta ChangeLog gcc/ 2012-10-23 Greta Yorsh greta.yo...@arm.com * config/arm/arm.c (offset_ok_for_ldrd_strd): Return false for Thumb1. I think we should fix TARGET_LDRD to reject Thumb1. R. m0-ice.v2.patch.txt diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index e9b9463..a94e537 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -12209,7 +12209,7 @@ offset_ok_for_ldrd_strd (HOST_WIDE_INT offset) else if (TARGET_ARM) max_offset = 255; else -gcc_unreachable (); +return false; return ((offset <= max_offset) && (offset >= -max_offset)); }
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/2012 10:12 AM, Richard Biener wrote: On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: This patch implements the wide-int class. This is a more general version of the double-int class and is meant to be the eventual replacement for that class. The use of this class removes all dependencies of the host from the target compiler's integer math. I have made all of the changes i agreed to in the earlier emails. In particular, this class internally maintains a bitsize and precision but not a mode. The class now is neutral about modes and tree-types. The functions that take modes or tree-types are just convenience functions that translate the parameters into bitsize and precision, and wherever there is a call that takes a mode, there is a corresponding call that takes a tree-type. All of the little changes that richi suggested have also been made. The buffer size is now twice the size needed by the largest integer mode. This gives enough room for tree-vrp to do full multiplies on any type that the target supports. Tested on x86-64. This patch depends on the first three patches. I am still waiting on final approval on the hwint.h patch. Ok to commit? diff --git a/gcc/wide-int.h b/gcc/wide-int.h new file mode 100644 index 000..efd2c01 --- /dev/null +++ b/gcc/wide-int.h ... +#ifndef GENERATOR_FILE The whole file is guarded with that ... why? That is bound to be fragile once use of wide-int spreads? How do generator programs end up including this file if they don't need it at all? This is so that wide-int can be included at the level of the generators. There is some stuff that needs to see this type during the build phase that cannot see the types that are included in wide-int.h. +#include "tree.h" +#include "hwint.h" +#include "options.h" +#include "tm.h" +#include "insn-modes.h" +#include "machmode.h" +#include "double-int.h" +#include <gmp.h> +#include "insn-modes.h" + That's a lot of tree and rtl dependencies.
double-int.h avoids these by placing conversion routines in different headers or by only resorting to types in coretypes.h. Please try to reduce the above to a minimum. + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; are we sure this rounds properly? Consider a port with max byte mode size 4 on a 64bit host. I do not believe that this can happen. The core compiler includes all modes up to TI mode, so by default we are already up to 128 bits. I still would like to have the ability to provide specializations of wide_int for small sizes, thus ideally wide_int would be a template templated on the number of HWIs in val. Interface-wise wide_int<2> should be identical to double_int, thus we should be able to do typedef wide_int<2> double_int; If you want to go down this path after the patches get in, go for it. I see no use at all for this. This was not meant to be a plug-in replacement for double int. The goal of this patch is to get the compiler to do the constant math the way that the target does it. Any such instantiation is by definition placing some predefined limit that some target may not want. in double-int.h and replace its implementation with a specialization of wide_int. Due to a number of divergences (double_int is not a subset of wide_int) that doesn't seem easily possible (one reason is the ShiftOp and related enums you use). Of course wide_int is not a template either. For the hypothetical embedded target above we'd end up using wide_int<1>, an even more trivial specialization. I realize again this wide-int is not what your wide-int is (because you add a precision member). Still, factoring out the commons of wide-int and double-int into a wide_int_raw template should be possible. +class wide_int { + /* Internal representation. */ + + /* VAL is set to a size that is capable of computing a full + multiplication on the largest mode that is represented on the + target. The full multiplication is used by tree-vrp.
If + operations are added that require larger buffers, then VAL needs + to be changed. */ + HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; + unsigned short len; + unsigned int bitsize; + unsigned int precision; The len, bitsize and precision members need documentation. At least one sounds redundant. + public: + enum ShiftOp { +NONE, NONE is never a descriptive name ... I suppose this is for arithmetic vs. logical shifts? suggest something +/* There are two uses for the wide-int shifting functions. The + first use is as an emulation of the target hardware. The + second use is as service routines for other optimizations. The + first case needs to be identified by passing TRUNC as the value + of ShiftOp so that shift amount is properly handled according to the + SHIFT_COUNT_TRUNCATED flag. For the second case, the shift + amount is always truncated by the bytesize of
Re: gcc 4.7 libgo patch committed: Set libgo version number
On 23.10.2012 06:55, Ian Lance Taylor wrote: PR 54918 points out that libgo is not using version numbers as it should. At present none of libgo in 4.6, 4.7 and mainline are compatible with each other. This patch to the 4.7 branch sets the version number for libgo there. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to 4.7 branch. changing the soname of a runtime library on the branch? I don't like this idea at all. Matthias
[Patch] Potential fix for PR55033
-- Sebastian Huber, embedded brains GmbH Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany Phone : +49 89 18 90 80 79-6 Fax : +49 89 18 90 80 79-9 E-Mail : sebastian.hu...@embedded-brains.de PGP : Public key available on request. This message is not a business communication within the meaning of the EHUG. From 7770cb04ee95666e745f96b779edec10560203e6 Mon Sep 17 00:00:00 2001 From: Sebastian Huber sebastian.hu...@embedded-brains.de Date: Tue, 23 Oct 2012 18:06:25 +0200 Subject: [PATCH] Potential fix for PR55033 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55033 This patch fixes my problem, but I am absolutely not sure if this is the right way. We have in gcc/varasm.c: [...] static bool decl_readonly_section_1 (enum section_category category) { switch (category) { case SECCAT_RODATA: case SECCAT_RODATA_MERGE_STR: case SECCAT_RODATA_MERGE_STR_INIT: case SECCAT_RODATA_MERGE_CONST: case SECCAT_SRODATA: return true; default: return false; } } [...] section * default_elf_select_section (tree decl, int reloc, unsigned HOST_WIDE_INT align) { const char *sname; switch (categorize_decl_for_section (decl, reloc)) { case SECCAT_TEXT: /* We're not supposed to be called on FUNCTION_DECLs. */ gcc_unreachable (); case SECCAT_RODATA: return readonly_data_section; case SECCAT_RODATA_MERGE_STR: return mergeable_string_section (decl, align, 0); case SECCAT_RODATA_MERGE_STR_INIT: return mergeable_string_section (DECL_INITIAL (decl), align, 0); case SECCAT_RODATA_MERGE_CONST: return mergeable_constant_section (DECL_MODE (decl), align, 0); case SECCAT_SRODATA: sname = ".sdata2"; break; [...] All read-only sections have a special section object except SECCAT_SRODATA. Thus it is created with get_named_section() and potentially decl == NULL. The patch adds another special case to default_section_type_flags(). 2012-10-23 Sebastian Huber sebastian.hu...@embedded-brains.de PR middle-end/55033 * varasm.c (default_section_type_flags): If decl is NULL and name is ".sdata2", set flags to 0.
--- gcc/varasm.c | 13 + 1 files changed, 9 insertions(+), 4 deletions(-) diff --git a/gcc/varasm.c b/gcc/varasm.c index a587c80..d0941f3 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -5937,10 +5937,15 @@ default_section_type_flags (tree decl, const char *name, int reloc) } else { - flags = SECTION_WRITE; - if (strcmp (name, ".data.rel.ro") == 0 - || strcmp (name, ".data.rel.ro.local") == 0) - flags |= SECTION_RELRO; + if (strcmp (name, ".sdata2") != 0) +{ + flags = SECTION_WRITE; + if (strcmp (name, ".data.rel.ro") == 0 + || strcmp (name, ".data.rel.ro.local") == 0) +flags |= SECTION_RELRO; +} + else +flags = 0; } if (decl && DECL_ONE_ONLY (decl)) -- 1.7.7
Re: gcc 4.7 libgo patch committed: Set libgo version number
On Tue, Oct 23, 2012 at 06:16:25PM +0200, Matthias Klose wrote: On 23.10.2012 06:55, Ian Lance Taylor wrote: PR 54918 points out that libgo is not using version numbers as it should. At present none of libgo in 4.6, 4.7 and mainline are compatible with each other. This patch to the 4.7 branch sets the version number for libgo there. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to 4.7 branch. changing the soname of a runtime library on the branch? I don't like this idea at all. Me neither. You should just make sure you increment the soname on the trunk whenever a new major release has ABI changes, and don't do ABI changes on the release branches. Jakub
Re: [C++ PATCH] Fix cplus_decl_attributes (PR c++/54988)
OK. Jason
Re: [PATCH] [3/10] AArch64 Port
On 23/10/12 16:38, Jeff Law wrote: Given that you and Richard Earnshaw are the approved maintainers for the AAarch64 port, I'm going to give this an OK without diving into it. I'm going to assume you and Richard will iterate with anyone who does dive deeply into the port and has comments/suggestions. We will iterate as required with any further comments and suggestions raised. The one question in the back of my mind is whether or not this uses the new iterator support we discussed a few months ago? I can't recall if that was integrated into the trunk or not. The int-iterator support was accepted on trunk ~ 12th June. The AArch64 port does make extensive use of them. /Marcus
[PATCH] variably_modified_type_p tweak for cdtor cloning (PR debug/54828)
Hi! The following testcase ICEs, because the VLA ARRAY_TYPE in the ctor isn't considered variably_modified_type_p during cloning of the ctor when the types weren't gimplified yet. The size is a non-constant expression that has a SAVE_EXPR of a global VAR_DECL in it, so when variably_modified_type_p is called with a non-NULL second argument, it returns false; we share the ARRAY_TYPE between the abstract ctor origin and the base_ctor and comp_ctor clones. Later on, when gimplifying the first of the clones, gimplify_one_sizepos actually replaces the expression with a local temporary VAR_DECL, and from that point it is variably_modified_type_p in the base ctor. But it is a var from a different function in the other ctor and a var with a location in the abstract origin, which causes the dwarf2out.c ICE. This patch fixes it by returning true also for non-gimplified types where gimplify_one_sizepos is expected to turn the expression into a local temporary. It seems frontends usually call variably_modified_type_p with NULL as the last argument; with a non-NULL argument it is only used during tree-nested.c/omp-low.c (which happens after gimplification) and tree-inline.c (which usually happens after gimplification, with the exception of cdtor cloning; at least I can't find anything else). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-10-23 Jakub Jelinek ja...@redhat.com PR debug/54828 * tree.c (RETURN_TRUE_IF_VAR): Return true also if !TYPE_SIZES_GIMPLIFIED (type) and _t is going to be gimplified into a local temporary. * g++.dg/debug/pr54828.C: New test. --- gcc/tree.c.jj 2012-10-19 11:01:07.0 +0200 +++ gcc/tree.c 2012-10-23 14:46:24.846195605 +0200 @@ -8467,14 +8467,21 @@ variably_modified_type_p (tree type, tre tree t; /* Test if T is either variable (if FN is zero) or an expression containing - a variable in FN. */ + a variable in FN. If TYPE isn't gimplified, return true also if + gimplify_one_sizepos would gimplify the expression into a local + variable.
*/ #define RETURN_TRUE_IF_VAR(T) \ do { tree _t = (T); \ if (_t != NULL_TREE \ && _t != error_mark_node \ && TREE_CODE (_t) != INTEGER_CST \ && TREE_CODE (_t) != PLACEHOLDER_EXPR \ -&& (!fn || walk_tree (&_t, find_var_from_fn, fn, NULL))) \ +&& (!fn \ + || (!TYPE_SIZES_GIMPLIFIED (type) \ + && !TREE_CONSTANT (_t) \ + && TREE_CODE (_t) != VAR_DECL \ + && !CONTAINS_PLACEHOLDER_P (_t)) \ + || walk_tree (&_t, find_var_from_fn, fn, NULL))) \ return true; } while (0) if (type == error_mark_node) --- gcc/testsuite/g++.dg/debug/pr54828.C.jj 2012-10-23 14:30:13.194012566 +0200 +++ gcc/testsuite/g++.dg/debug/pr54828.C 2012-10-23 14:30:07.0 +0200 @@ -0,0 +1,14 @@ +// PR debug/54828 +// { dg-do compile } +// { dg-options -g } + +struct T { T (); virtual ~T (); }; +struct S : public virtual T { S (); virtual ~S (); }; +int v; +void foo (char *); + +S::S () +{ + char s[v]; + foo (s); +} Jakub
Re: gcc 4.7 libgo patch committed: Set libgo version number
On Tue, Oct 23, 2012 at 9:27 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Oct 23, 2012 at 06:16:25PM +0200, Matthias Klose wrote: On 23.10.2012 06:55, Ian Lance Taylor wrote: PR 54918 points out that libgo is not using version numbers as it should. At present none of libgo in 4.6, 4.7 and mainline are compatible with each other. This patch to the 4.7 branch sets the version number for libgo there. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to 4.7 branch. changing the soname of a runtime library on the branch? I don't like this idea at all. Me neither. You should just make sure you increment the soname on the trunk whenever a new major release has ABI changes, and don't do ABI changes on the release branches. The problem is that I forgot to do that when the 4.7 branch was created. So the 4.7 branch and the 4.6 branch were using the same SONAME although they had completely different ABIs. That is, there is no ABI change on the 4.7 branch. I'm setting the SONAME to distinguish it from the 4.6 branch. I agree it is not ideal but it seems like the best approach under the circumstances. I'll roll it back if y'all continue to think it is a bad idea. Ian
Re: [PATCH] [6/10] AArch64 Port
On 23/10/12 16:14, Jeff Law wrote: On 10/23/2012 03:42 AM, Marcus Shawcroft wrote: This patch adjusts the libatomic configury for AArch64. Proposed ChangeLog: * configure.tgt: Mark libatomic unsupported. This is good. Please install. Presumably at some point in the not too distant future, aarch support will be added to libatomic? jeff We have support for atomic optab coming real soon now at which point I think we can simply revert this patch. /Marcus
Re: libgo patch committed: Update to current Go library
Hello! I have committed a patch to update the mainline version of libgo to the current master Go library sources. At this point I will only be updating the gcc 4.7 branch for bug fixes. This is a substantial patch that brings in several months of work. As usual I am not posting the complete patch here, as it is mostly simply copies of changes to the upstream repository. I have attached the changes to gccgo-specific files and files with lots of gccgo-specific changes. There is a decent chance that this will break something on non-x86 systems. I will do what testing I am able to do after the commit. On my x86_64-linux-gnu (Fedora 18) the libgo testsuite fails the following test: --- FAIL: TestCgoCrashHandler (0.01 seconds) testing.go:377: program exited with error: exec: go: executable file not found in $PATH --- FAIL: TestCrashHandler (0.00 seconds) testing.go:377: program exited with error: exec: go: executable file not found in $PATH FAIL FAIL: runtime Probably some trivial test issue. An additional test fails on alphaev68-linux-gnu: --- FAIL: TestPassFD (0.15 seconds) passfd_test.go:62: FileConn: dup: Bad file descriptor FAIL FAIL: syscall I didn't debug this one yet, will do soon. Otherwise all other libgo tests pass on these two systems. Uros.
Re: PING^2: [patch] pr/54508: fix incomplete debug information for class
On Oct 5, 2012, at 6:05 PM, Cary Coutant wrote: There certainly is a fair amount of code in dwarf2read.c in gdb to handle DW_AT_declaration and do things differently for declarations. Should I rework this patch to use that mechanism instead? If so, how? If the class is marked only by prune_unused_types_mark visiting it as a parent, but hasn't been marked by ??? that visits all its children, then emit it with a DW_AT_declaration marking? One question I'd consider is what do you want to see in the debugger if this truly is the only debug info you have for the class? (For example, in the test case you added, a DW_AT_declaration attribute won't help if there's no full definition of the class anywhere else.) Is it reasonable to just show a truncated class definition in that case, or do you want the full definition available? My tentative answer would be that we do the pruning here because we expect there to be a full definition somewhere else, and that the lack of a DW_AT_declaration attribute is the bug. The answer appears to be: 1. Until the full symbols have been read (via gdb -r, or by reference to another symbol in that compilation unit) such declarations are not visible. Once the full symbols have been read, a class marked as DW_AT_declaration is shown as an incomplete type, which makes sense. I think this is reasonable behavior. I confirmed that if you do have a definition elsewhere, gdb does the correct thing. That's what you would expect, given that the DW_AT_declaration flag was already being generated (for incomplete types). As you've discovered, however, it's not straightforward. You'll want to add the declaration attribute if you mark the DIE from below, but not from any reference where dokids is true. Alternatively, add the declaration attribute if any of its children are pruned. Perhaps that could be done in prune_unused_types_prune(). If you're willing to rework the patch this way (assuming GDB does the right thing with it), I think that would be better.
Thanks. -cary Attached is the revised patch. It marks classes as declaration if they weren't marked by one of the mark functions that visits children, and something was actually pruned. That second check is needed, otherwise the class Executor in testcase nested-3.C gets marked as a declaration. The testcase has been reworked to check both aspects. Struct s gets defined (because a variable of that type is defined), while class c and union u are not, so they are marked as declaration while struct s is not marked. The testcase verifies that. Tested by build and check RUNTESTFLAGS=dwarf2.exp on Linux and Darwin. Ok to commit? paul ChangeLog: 2012-10-23 Paul Koning n...@arrl.net * dwarf2out.c (prune_unused_types_prune): If pruning a class and not all its children were marked, add DW_AT_declaration flag. testcases/ChangeLog: 2012-10-23 Paul Koning n...@arrl.net * g++.dg/debug/dwarf2/pr54508.C: New. Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c (revision 192405) +++ gcc/dwarf2out.c (working copy) @@ -21218,6 +21218,7 @@ prune_unused_types_prune (dw_die_ref die) { dw_die_ref c; + int pruned = 0; gcc_assert (die->die_mark); prune_unused_types_update_strings (die); @@ -21240,13 +21241,24 @@ prev->die_sib = c->die_sib; die->die_child = prev; } - return; + pruned = 1; + goto finished; } if (c != prev->die_sib) - prev->die_sib = c; + { + prev->die_sib = c; + pruned = 1; + } prune_unused_types_prune (c); } while (c != die->die_child); + + finished: + /* If we pruned children, and this is a class, mark it as a + declaration to inform debuggers that this is not a complete + class definition. */ + if (pruned && die->die_mark == 1 && class_scope_p (die)) +add_AT_flag (die, DW_AT_declaration, 1); } /* Remove dies representing declarations that we never use.
*/ Index: gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C === --- gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C (revision 0) +++ gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C (revision 0) @@ -0,0 +1,72 @@ +// PR debug/54508 +// { dg-do compile } +// { dg-options -g2 -dA -fno-merge-debug-strings } + +// { dg-final { scan-assembler-not \cbase0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// { dg-final { scan-assembler \c0\\[ \t\]+\[#;/!|@\]+ DW_AT_name\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ DW_AT_decl_line\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ DW_AT_declaration } } +// { dg-final { scan-assembler-not \OPCODE0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// { dg-final { scan-assembler-not \bi0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// { dg-final { scan-assembler-not \si0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } +// {
Re: gcc 4.7 libgo patch committed: Set libgo version number
On Tue, Oct 23, 2012 at 09:57:21AM -0700, Ian Lance Taylor wrote: The problem is that I forgot to do that when the 4.7 branch was created. So the 4.7 branch and the 4.6 branch were using the same SONAME although they had completely different ABIs. That is, there is no ABI change on the 4.7 branch. I'm setting the SONAME to distinguish it from the 4.6 branch. I agree is not ideal but it seems like the best approach under the circumstances. I think it is too late for such a change on the 4.7 branch, better just say that the go support in 4.6 is experimental, without stable ABI, otherwise there will be ABI incompatibility also on the 4.7 branch between patchlevel versions thereof. Jakub
Re: [C++ Patch] PR 54922
OK. Jason
Re: [PATCH] Fix sizeof related pt.c ICE (PR c++/54844)
OK. Jason
Re: LRA has been merged into trunk.
Hello! Hi, I was going to merge LRA into trunk last Sunday. It did not happen. LRA was actively changed over the last 4 weeks by implementing reviewers' proposals, which resulted in a lot of new LRA regressions on the GCC testsuite in comparison with reload. Finally, they were fixed and everything looks ok to me. So I've committed the patch into trunk as rev. 192719. The final patch is in the attachment. This commit introduced the following testsuite failures on x86_64-pc-linux-gnu with -m32: FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 (internal compiler error) FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 -fomit-frame-pointer -finline-functions (internal compiler error) FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 -fomit-frame-pointer -finline-functions -funroll-loops (internal compiler error) FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 -fbounds-check (internal compiler error) FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O3 -g (internal compiler error) FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -Os (internal compiler error) FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 -ftree-vectorize -msse2 (internal compiler error) The error is: /home/uros/gcc-svn/trunk/gcc/testsuite/gfortran.fortran-torture/execute/intrinsic_nearest.f90: In function 'test_n': /home/uros/gcc-svn/trunk/gcc/testsuite/gfortran.fortran-torture/execute/intrinsic_nearest.f90:77:0: internal compiler error: in lra_assign, at lra-assigns.c:1361 0x8557e3 lra_assign() ../../gcc-svn/trunk/gcc/lra-assigns.c:1361 0x8538bc lra(_IO_FILE*) ../../gcc-svn/trunk/gcc/lra.c:2310 0x81bc46 do_reload ../../gcc-svn/trunk/gcc/ira.c:4613 0x81bc46 rest_of_handle_reload ../../gcc-svn/trunk/gcc/ira.c:4719 Please submit a full bug report, Uros.
Re: [PATCH] [2/10] AArch64 Port
On 23/10/12 15:39, Joseph S. Myers wrote: On Tue, 23 Oct 2012, Marcus Shawcroft wrote: +@item -mcmodel=tiny +@opindex mcmodel=tiny +Generate code for the tiny code model. The program and its statically defined +symbols must be within 1GB of each other. Pointers are 64 bits. Programs can +be statically or dynamically linked. This model is not fully implemented and +mostly treated as small. Say @samp{small} instead of using quotes in Texinfo sources. Committed with this correction. Thankyou /Marcus
Re: Fix bugs introduced by switch-case profile propagation
On Tue, Oct 23, 2012 at 3:03 AM, Jan Hubicka hubi...@ucw.cz wrote: Ping. On Wed, Oct 17, 2012 at 1:48 PM, Easwaran Raman era...@google.com wrote: Hi, This patch fixes bugs introduced by my previous patch to propagate profiles during switch expansion. Bootstrap and profiledbootstrap successful on x86_64. Confirmed that it fixes the crashes reported in PR middle-end/54957. OK for trunk? - Easwaran 2012-10-17 Easwaran Raman era...@google.com PR target/54938 PR middle-end/54957 * optabs.c (emit_cmp_and_jump_insn_1): Add REG_BR_PROB note only if it doesn't already exist. * except.c (sjlj_emit_function_enter): Remove unused variable. * stmt.c (get_outgoing_edge_probs): Return 0 if BB is NULL. Seems fine, but under what conditions do you get NULL here? When expand_sjlj_dispatch_table calls emit_case_dispatch_table, stmt_bb is NULL. - Easwaran Honza (emit_case_dispatch_table): Handle the case where STMT_BB is NULL. (expand_sjlj_dispatch_table): Pass BB containing before_case to emit_case_dispatch_table.
Index: gcc/optabs.c === --- gcc/optabs.c (revision 192488) +++ gcc/optabs.c (working copy) @@ -4268,11 +4268,9 @@ emit_cmp_and_jump_insn_1 (rtx test, enum machine_m && profile_status != PROFILE_ABSENT && insn && JUMP_P (insn) - && any_condjump_p (insn)) -{ - gcc_assert (!find_reg_note (insn, REG_BR_PROB, 0)); - add_reg_note (insn, REG_BR_PROB, GEN_INT (prob)); -} + && any_condjump_p (insn) + && !find_reg_note (insn, REG_BR_PROB, 0)) +add_reg_note (insn, REG_BR_PROB, GEN_INT (prob)); } /* Generate code to compare X with Y so that the condition codes are Index: gcc/except.c === --- gcc/except.c (revision 192488) +++ gcc/except.c (working copy) @@ -1153,7 +1153,7 @@ sjlj_emit_function_enter (rtx dispatch_label) if (dispatch_label) { #ifdef DONT_USE_BUILTIN_SETJMP - rtx x, last; + rtx x; x = emit_library_call_value (setjmp_libfunc, NULL_RTX, LCT_RETURNS_TWICE, TYPE_MODE (integer_type_node), 1, plus_constant (Pmode, XEXP (fc, 0), Index: gcc/stmt.c === --- gcc/stmt.c (revision 192488) +++ gcc/stmt.c (working copy) @@ -1867,6 +1867,8 @@ get_outgoing_edge_probs (basic_block bb) edge e; edge_iterator ei; int prob_sum = 0; + if (!bb) +return 0; FOR_EACH_EDGE (e, ei, bb->succs) prob_sum += e->probability; return prob_sum; @@ -1916,8 +1918,8 @@ emit_case_dispatch_table (tree index_expr, tree in rtx fallback_label = label_rtx (case_list->code_label); rtx table_label = gen_label_rtx (); bool has_gaps = false; - edge default_edge = EDGE_SUCC (stmt_bb, 0); - int default_prob = default_edge->probability; + edge default_edge = stmt_bb ? EDGE_SUCC (stmt_bb, 0) : NULL; + int default_prob = default_edge ? default_edge->probability : 0; int base = get_outgoing_edge_probs (stmt_bb); bool try_with_tablejump = false; @@ -1997,7 +1999,8 @@ emit_case_dispatch_table (tree index_expr, tree in default_prob = 0; } - default_edge->probability = default_prob; + if (default_edge) +default_edge->probability = default_prob; /* We have altered the probability of the default edge.
So the probabilities of all other edges need to be adjusted so that it sums up to @@ -2289,7 +2292,8 @@ expand_sjlj_dispatch_table (rtx dispatch_index, emit_case_dispatch_table (index_expr, index_type, case_list, default_label, - minval, maxval, range, NULL); + minval, maxval, range, +BLOCK_FOR_INSN (before_case)); emit_label (default_label); free_alloc_pool (case_node_pool); }
Re: [PATCH] [0/10] AArch64 Port
On 23/10/12 10:42, Marcus Shawcroft wrote: Folks, We would like to request the merge of aarch64-branch into trunk. All of the patches approved by Jeff and Jakub are now committed, with the documentation correction requested by Joseph. /Marcus
Re: PING^2: [patch] pr/54508: fix incomplete debug information for class
OK. Jason
Re: libgo patch committed: Update to current Go library
On Tue, Oct 23, 2012 at 10:47 AM, Uros Bizjak ubiz...@gmail.com wrote: On my x86_64-linux-gnu (Fedora 18) libgo testsuite fails following test: --- FAIL: TestCgoCrashHandler (0.01 seconds) testing.go:377: program exited with error: exec: go: executable file not found in $PATH --- FAIL: TestCrashHandler (0.00 seconds) testing.go:377: program exited with error: exec: go: executable file not found in $PATH FAIL FAIL: runtime Thanks. Turns out this test is currently meaningless with gccgo, and was only working for me because I have the other Go compiler on my PATH as well. I committed this patch to mainline to disable it for gccgo. Ian
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/12, Richard Biener richard.guent...@gmail.com wrote: I wonder if for the various ways to specify precision/len there is a nice C++ way of moving this detail out of wide-int. I can think only of one:

struct WIntSpec {
  WIntSpec (unsigned int len, unsigned int precision);
  WIntSpec (const_tree);
  WIntSpec (enum machine_mode);
  unsigned int len;
  unsigned int precision;
};

and then (sorry to pick one of the less useful functions):

  inline static wide_int zero (WIntSpec);

which you should be able to call like

  wide_int::zero (SImode)
  wide_int::zero (integer_type_node)

and (ugly)

  wide_int::zero (WIntSpec (32, 32))

with C++0x

  wide_int::zero ({32, 32})

should be possible? Or we keep the precision overload. At least providing the WIntSpec abstraction allows custom ways of specifying required bits to not pollute wide-int itself too much. Lawrence?

Yes, in C++11, wide_int::zero ({32, 32}) is possible using an implicit conversion to WIntSpec from an initializer_list. However, at present we are limited to C++03 to enable older compilers as boot compilers. -- Lawrence Crowl
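The WIntSpec idea above can be sketched with toy stand-ins. This is only an illustration of the converting-constructor mechanism being proposed, not GCC code: toy_mode, toy_tree, and toy_wide_int are invented here, standing in for machine_mode, const_tree, and wide_int.

```cpp
#include <cassert>

// Toy stand-ins for GCC's machine_mode and const_tree.
enum toy_mode { TOY_SImode, TOY_DImode };
struct toy_tree { unsigned int precision; };

// One spec type with converting constructors, so a caller can pass a
// mode, a tree, or an explicit (len, precision) pair and all three
// routes collapse into the same parameter type.
struct WIntSpec {
  unsigned int len;
  unsigned int precision;
  WIntSpec (unsigned int l, unsigned int p) : len (l), precision (p) {}
  WIntSpec (const toy_tree &t) : len (1), precision (t.precision) {}
  WIntSpec (toy_mode m) : len (1), precision (m == TOY_SImode ? 32 : 64) {}
};

struct toy_wide_int {
  unsigned int precision;
  // One overload serves every way of naming a precision.
  static toy_wide_int zero (WIntSpec spec)
  {
    toy_wide_int w;
    w.precision = spec.precision;
    return w;
  }
};
```

With a C++11 compiler, `toy_wide_int::zero ({32, 32})` also resolves through the two-argument constructor via copy-list-initialization, which is exactly the `wide_int::zero ({32, 32})` form discussed above; under C++03 only the first three call forms are available.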
[m68k] Fix option handling for -m68020-40 and -m68020-60
Hello,

While working with GCC 4.7, I noticed that the -m68020-40 and -m68020-60 options are broken. This bug was introduced in May 2011 with the patch at http://gcc.gnu.org/ml/gcc-patches/2011-04/msg02278.html. The code in m68k_option_override to set cpu and tune doesn't trigger when using -m68020-40 and -m68020-60, since global_options_set is not touched by the evaluation code for -m68020-40 and -m68020-60 in m68k_handle_option.

This patch was tested by checking the -dM output of a patched cc1 for the present __mc680X0__ macros.

Regards,
Gunther Nikl

-- cut --
2012-10-23  Gunther Nikl  gn...@users.sourceforge.net

	* common/config/m68k/m68k-common.c (m68k_handle_option): Set
	gcc_options fields of opts_set for -m68020-40 and -m68020-60.
-- cut --
Index: common/config/m68k/m68k-common.c
===================================================================
--- common/config/m68k/m68k-common.c	(revision 192718)
+++ common/config/m68k/m68k-common.c	(working copy)
@@ -33,7 +33,7 @@
 
 static bool
 m68k_handle_option (struct gcc_options *opts,
-		    struct gcc_options *opts_set ATTRIBUTE_UNUSED,
+		    struct gcc_options *opts_set,
 		    const struct cl_decoded_option *decoded,
 		    location_t loc)
 {
@@ -45,12 +45,16 @@
     {
     case OPT_m68020_40:
       opts->x_m68k_tune_option = u68020_40;
+      opts_set->x_m68k_tune_option = (uarch_type) 1;
       opts->x_m68k_cpu_option = m68020;
+      opts_set->x_m68k_cpu_option = (target_device) 1;
       return true;
 
     case OPT_m68020_60:
       opts->x_m68k_tune_option = u68020_60;
+      opts_set->x_m68k_tune_option = (uarch_type) 1;
       opts->x_m68k_cpu_option = m68020;
+      opts_set->x_m68k_cpu_option = (target_device) 1;
       return true;
 
     case OPT_mshared_library_id_:
-- cut --
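The bug the patch fixes comes from GCC's two-struct option convention: one struct holds option values, a parallel struct records which options the user set explicitly, and the later override hook only fills in defaults for unset options. A hypothetical miniature of that pattern (all names here are invented for illustration, not GCC's real declarations):

```cpp
#include <cassert>

// Parallel structs: one for option values, one recording whether the
// user explicitly set each option (0 = not set, 1 = set).
struct toy_options {
  int cpu;   // selected CPU model
  int tune;  // selected tuning model
};

enum { CPU_DEFAULT = 0, CPU_68020 = 1, TUNE_DEFAULT = 0, TUNE_68020_40 = 1 };

// Handle a combined option like -m68020-40: set both values AND mark
// them as explicitly set.  The broken code updated only opts, so the
// override hook below clobbered the user's choice with a default.
void handle_m68020_40 (toy_options *opts, toy_options *opts_set)
{
  opts->cpu = CPU_68020;
  opts_set->cpu = 1;
  opts->tune = TUNE_68020_40;
  opts_set->tune = 1;
}

// Later override hook: only fill in values the user did not set.
void option_override (toy_options *opts, const toy_options *opts_set)
{
  if (!opts_set->cpu)
    opts->cpu = CPU_DEFAULT;
  if (!opts_set->tune)
    opts->tune = TUNE_DEFAULT;
}
```

Without the `opts_set` assignments, `option_override` would see both fields as unset and reset them, which is exactly the failure mode described for m68k_option_override.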
Re: libgo patch committed: Update to current Go library
On Tue, Oct 23, 2012 at 10:47 AM, Uros Bizjak ubiz...@gmail.com wrote: An additional test fails on alphaev68-linux-gnu:

--- FAIL: TestPassFD (0.15 seconds)
passfd_test.go:62: FileConn: dup: Bad file descriptor
FAIL
FAIL: syscall

As far as I can see this error message occurs when calling dup on the second file descriptor returned by socketpair. But I can't see why that would fail on Alpha.

Ian
Re: Tidy store_bit_field_1 & co.
Eric Botcazou ebotca...@adacore.com writes:

+  enum machine_mode op_mode = mode_for_extraction (EP_insv, 3);
   if (HAVE_insv
       && GET_MODE (value) != BLKmode
       && bitsize > 0
@@ -690,25 +670,34 @@ store_bit_field_1 (rtx str_rtx, unsigned
 	 -fstrict-volatile-bitfields is in effect.  */
       && !(MEM_P (op0) && MEM_VOLATILE_P (op0)
 	   && flag_strict_volatile_bitfields > 0)
-      && ! ((REG_P (op0) || GET_CODE (op0) == SUBREG)
-	    && (bitsize + bitpos > GET_MODE_BITSIZE (op_mode)))
       /* Do not use insv if the bit region is restricted and
 	 op_mode integer at offset doesn't fit into the restricted region.  */
       && !(MEM_P (op0) && bitregion_end
-	   && bitnum - bitpos + GET_MODE_BITSIZE (op_mode)
+	   && bitnum - (bitnum % BITS_PER_UNIT) + GET_MODE_BITSIZE (op_mode)
	      > bitregion_end + 1))
     {
       struct expand_operand ops[4];
-      int xbitpos = bitpos;
+      unsigned HOST_WIDE_INT bitpos = bitnum;
       rtx value1;
       rtx xop0 = op0;
       rtx last = get_last_insn ();
       bool copy_back = false;

-      /* Add OFFSET into OP0's address.  */
+      unsigned int unit = GET_MODE_BITSIZE (op_mode);
       if (MEM_P (xop0))
-	xop0 = adjust_bitfield_address (xop0, byte_mode, offset);
+	{
+	  /* Get a reference to the first byte of the field.  */
+	  xop0 = adjust_bitfield_address (xop0, byte_mode,
+					  bitpos / BITS_PER_UNIT);
+	  bitpos %= BITS_PER_UNIT;
+	}
+      else
+	{
+	  /* Convert from counting within OP0 to counting in OP_MODE.  */
+	  if (BYTES_BIG_ENDIAN)
+	    bitpos += unit - GET_MODE_BITSIZE (GET_MODE (op0));
+	}

       /* If xop0 is a register, we need it in OP_MODE
	 to make it acceptable to the format of insv.  */
@@ -735,20 +724,13 @@ store_bit_field_1 (rtx str_rtx, unsigned
	  copy_back = true;
	}

-      /* We have been counting XBITPOS within UNIT.
-	 Count instead within the size of the register.  */
-      if (BYTES_BIG_ENDIAN && !MEM_P (xop0))
-	xbitpos += GET_MODE_BITSIZE (op_mode) - unit;
-
-      unit = GET_MODE_BITSIZE (op_mode);
-
       /* If BITS_BIG_ENDIAN is zero on a BYTES_BIG_ENDIAN machine, we
	 count backwards from the size of the unit we are inserting into.
	 Otherwise, we count bits from the most significant on a
	 BYTES/BITS_BIG_ENDIAN machine.  */
       if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
-	xbitpos = unit - bitsize - xbitpos;
+	bitpos = unit - bitsize - bitpos;

       /* Convert VALUE to op_mode (which insv insn wants) in VALUE1.  */
       value1 = value;

I guess I see the reasoning, but I cannot say whether it's right or wrong...

I should probably have responded to this earlier, sorry. I'm not sure which part you mean, so here's an attempt at justifying the whole block:

1) WORDS_BIG_ENDIAN is deliberately ignored:

     /* The following line once was done only if WORDS_BIG_ENDIAN,
	but I think that is a mistake.  WORDS_BIG_ENDIAN is
	meaningful at a much higher level; when structures are copied
	between memory and regs, the higher-numbered regs
	always get higher addresses.  */

2) For MEM: The old code reached this if statement with:

     unsigned int unit = (MEM_P (str_rtx)) ? BITS_PER_UNIT : BITS_PER_WORD;
     offset = bitnum / unit;
     bitpos = bitnum % unit;

I.e.:

     offset = bitnum / BITS_PER_UNIT;
     bitpos = bitnum % BITS_PER_UNIT;

which the new code does explicitly with:

     unsigned HOST_WIDE_INT bitpos = bitnum;
     if (MEM_P (xop0))
       {
	 /* Get a reference to the first byte of the field.  */
	 xop0 = adjust_bitfield_address (xop0, byte_mode,
					 bitpos / BITS_PER_UNIT);  <-- offset
	 bitpos %= BITS_PER_UNIT;
       }

The following:

     unit = GET_MODE_BITSIZE (op_mode);
     if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
       xbitpos = unit - bitsize - xbitpos;

code is the same as before.

3) For REG, !BYTES_BIG_ENDIAN, !BITS_BIG_ENDIAN: The easy case. The old code reached this if statement with:

     unsigned int unit = (MEM_P (str_rtx)) ? BITS_PER_UNIT : BITS_PER_WORD;
     offset = bitnum / unit;
     bitpos = bitnum % unit;
     ...
     if (!MEM_P (op0) ...)
       {
	 ...set op0 to the word containing the field...
	 offset = 0;
       }

where the awkward thing is that OFFSET and BITPOS are now out of sync with BITNUM. So before the if statement:

     offset = 0;
     bitpos = bitnum % BITS_PER_WORD;

which, if the !MEM_P block above had updated BITNUM too, would just have been:

     offset = 0;
     bitpos = bitnum;

The new code does update BITNUM:

     if (!MEM_P (op0) ...)
       {
	 ...set op0 to the word containing the field...
	 bitnum %= BITS_PER_WORD;
       }
     ...
     unsigned HOST_WIDE_INT bitpos = bitnum;

so both the following hold:

     offset = 0;
     bitpos = bitnum %
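The position arithmetic being justified above can be sketched standalone. The constants and helpers below are invented for illustration; they mimic the byte-offset/bit-position split for the MEM case and the bit-numbering flip applied when BITS_BIG_ENDIAN differs from BYTES_BIG_ENDIAN, without being GCC's actual code.

```cpp
#include <cassert>

// Stand-in for GCC's BITS_PER_UNIT (bits per addressable byte).
const unsigned int TOY_BITS_PER_UNIT = 8;

// Split a bit number counted from the start of a memory object into a
// byte offset plus a residual bit position within that byte -- the
// MEM_P case: the field is addressed via its first byte.
void split_bitnum (unsigned int bitnum,
		   unsigned int *byte_offset, unsigned int *bitpos)
{
  *byte_offset = bitnum / TOY_BITS_PER_UNIT;
  *bitpos = bitnum % TOY_BITS_PER_UNIT;
}

// When bit 0 names the most significant bit of the insertion unit
// (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN in the patch), the position of
// a BITSIZE-wide field is counted back from the other end of the unit.
unsigned int flip_bitpos (unsigned int unit, unsigned int bitsize,
			  unsigned int bitpos)
{
  return unit - bitsize - bitpos;
}
```

For example, bit 19 of a memory object is bit 3 of byte 2, and an 8-bit field at little-endian position 4 within a 32-bit unit sits at big-endian position 20.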
Re: gcc 4.7 libgo patch committed: Set libgo version number
On Tue, Oct 23, 2012 at 9:59 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Oct 23, 2012 at 09:57:21AM -0700, Ian Lance Taylor wrote: The problem is that I forgot to do that when the 4.7 branch was created. So the 4.7 branch and the 4.6 branch were using the same SONAME although they had completely different ABIs. That is, there is no ABI change on the 4.7 branch. I'm setting the SONAME to distinguish it from the 4.6 branch. I agree it is not ideal, but it seems like the best approach under the circumstances. I think it is too late for such a change on the 4.7 branch; better just say that the Go support in 4.6 is experimental, without stable ABI, otherwise there will be ABI incompatibility also on the 4.7 branch between patchlevel versions thereof. OK, I reverted the patch on the 4.7 branch. Ian
Re: [Patch] Fix the tests gcc.dg/vect/vect-8[23]_64.c
+cc richard.guent...@gmail.com

If it is approved, I will be happy to commit it for you.

Thanks,
Sharad

On Tue, Oct 23, 2012 at 6:52 AM, Dominique Dhumieres domi...@lps.ens.fr wrote: Following the changes in [PATCH] Add option for dumping to stderr (issue6190057), the tests gcc.dg/vect/vect-8[23]_64.c fail on powerpc*-*-*. This patch adjusts the dump files and has been tested on powerpc-apple-darwin9. If approved, could someone commit it for me (no write access)? Note that these tests use both dg-do run and dg-do compile, which is not supported (see http://gcc.gnu.org/ml/gcc/2012-10/msg00226.html and the rest of the thread).

TIA

Dominique

gcc/testsuite/ChangeLog
2012-10-23  Dominique d'Humieres  domi...@lps.ens.fr

	* gcc.dg/vect/vect-82_64.c: Adjust the dump file.
	* gcc.dg/vect/vect-83_64.c: Likewise.

diff -up gcc/testsuite/gcc.dg/vect/vect-82_64.c ../work/gcc/testsuite/gcc.dg/vect/vect-82_64.c
--- gcc/testsuite/gcc.dg/vect/vect-82_64.c	2007-11-21 20:18:48.0 +0100
+++ ../work/gcc/testsuite/gcc.dg/vect/vect-82_64.c	2012-10-08 13:52:25.0 +0200
@@ -1,6 +1,6 @@
 /* { dg-do run { target { { powerpc*-*-* && lp64 } && powerpc_altivec_ok } } } */
 /* { dg-do compile { target { { powerpc*-*-* && ilp32 } && powerpc_altivec_ok } } } */
-/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-stats -maltivec" } */
+/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-details -maltivec" } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
diff -up gcc/testsuite/gcc.dg/vect/vect-83_64.c ../work/gcc/testsuite/gcc.dg/vect/vect-83_64.c
--- gcc/testsuite/gcc.dg/vect/vect-83_64.c	2007-11-21 20:18:48.0 +0100
+++ ../work/gcc/testsuite/gcc.dg/vect/vect-83_64.c	2012-10-08 13:52:42.0 +0200
@@ -1,6 +1,6 @@
 /* { dg-do run { target { { powerpc*-*-* && lp64 } && powerpc_altivec_ok } } } */
 /* { dg-do compile { target { { powerpc*-*-* && ilp32 } && powerpc_altivec_ok } } } */
-/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-stats -maltivec" } */
+/* { dg-options "-O2 -ftree-vectorize -mpowerpc64 -fdump-tree-vect-details -maltivec" } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
Re: LRA has been merged into trunk.
On 10/23/2012 01:57 PM, Uros Bizjak wrote: Hello! Hi, I was going to merge LRA into trunk last Sunday. It did not happen. LRA was actively changed over the last 4 weeks by implementing reviewers' proposals, which resulted in a lot of new LRA regressions on the GCC testsuite in comparison with reload. Finally, they were fixed and everything looks OK to me. So I've committed the patch into trunk as rev. 192719. The final patch is in the attachment.

This commit introduced the following testsuite failures on x86_64-pc-linux-gnu with -m32:

FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 (internal compiler error)
FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 -fomit-frame-pointer -finline-functions (internal compiler error)
FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 -fomit-frame-pointer -finline-functions -funroll-loops (internal compiler error)
FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 -fbounds-check (internal compiler error)
FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O3 -g (internal compiler error)
FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -Os (internal compiler error)
FAIL: gfortran.fortran-torture/execute/intrinsic_nearest.f90, -O2 -ftree-vectorize -msse2 (internal compiler error)

Thanks, Uros.

I'll be working on this. I tested with -march=corei7 and -mtune=corei7 (that is what H.J. uses), therefore I did not see it.
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote:

+  inline bool minus_one_p () const;
+  inline bool zero_p () const;
+  inline bool one_p () const;
+  inline bool neg_p () const;

what's wrong with w == -1, w == 0, w == 1, etc.?

I would love to do this, and you seem to be somewhat knowledgeable of C++. But I cannot for the life of me figure out how to do it.

Starting from the simple case, you write an operator ==.

As a global operator:  bool operator == (wide_int w, int i);
As a member operator:  bool wide_int::operator == (int i);

In the simple case,

bool operator == (wide_int w, int i)
{
  switch (i)
    {
    case -1: return w.minus_one_p ();
    case 0: return w.zero_p ();
    case 1: return w.one_p ();
    default: unexpected
    }
}

Say I have a TImode number, which must be represented in 4 ints on a 32-bit host (the same issue happens on 64-bit hosts, but the examples are simpler on 32-bit hosts), and I compare it to -1. The value that I am going to see as the argument of the function is going to have the value 0x. but the value that I have internally is 128 bits. Do I take this and zero- or sign-extend it? What would you have done with w.minus_one_p ()? In particular, if someone wants to compare a number to 0xdeadbeef I have no idea what to do. I tried defining two different functions, one that took a signed and one that took an unsigned number, but then I wanted a cast in front of all the positive numbers.

This is where it does get tricky. For signed arguments, you should sign extend. For unsigned arguments, you should not. At present, we need multiple overloads to avoid type ambiguities.
bool operator == (wide_int w, long long int i);
bool operator == (wide_int w, unsigned long long int i);

inline bool operator == (wide_int w, long int i)
{ return w == (long long int) i; }

inline bool operator == (wide_int w, unsigned long int i)
{ return w == (unsigned long long int) i; }

inline bool operator == (wide_int w, int i)
{ return w == (long long int) i; }

inline bool operator == (wide_int w, unsigned int i)
{ return w == (unsigned long long int) i; }

(There is a proposal before the C++ committee to fix this problem.) Even so, there is room for potential bugs when wide_int does not carry around whether or not it is signed. The problem is that regardless of what the programmer thinks of the sign of the wide int, the comparison will use the sign of the int.

If there is a way to do this, then I will do it, but it is going to have to work properly for things larger than a HOST_WIDE_INT.

The long-term solution, IMHO, is to carry the sign information around in either the type or the class data. (I prefer type, but with a mechanism to carry it as data when needed.) Such comparisons would then require consistency in signedness between the wide int and the plain int.

I know that double-int does some of this and it does not carry around a notion of signedness either. Is this just code that has not been fully tested, or is there a trick in C++ that I am missing?

The double int class only provides == and !=, and only with other double ints. Otherwise, it has the same value query functions that you do above. In the case of double int, the goal was to simplify use of the existing semantics. If you are changing the semantics, consider incorporating sign explicitly. -- Lawrence Crowl
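The sign-extension semantics under discussion can be shown with a toy model. This is an illustration only: toy_wide is an invented 128-bit value stored as four 32-bit host words, not the real wide_int, and the single `long long` overload stands in for the overload set above.

```cpp
#include <cassert>
#include <cstdint>

// Toy 128-bit value; val[0] is the least significant 32-bit word.
struct toy_wide {
  uint32_t val[4];
};

// For a signed right-hand side, logically extend it to the full 128
// bits before comparing -- so comparing against -1 asks whether every
// word is all-ones, matching the "build a -1 with enough bits" idea.
bool operator== (const toy_wide &w, long long i)
{
  uint64_t lo = (uint64_t) i;
  // The high 64 bits are the sign extension of i.
  uint64_t hi = i < 0 ? ~(uint64_t) 0 : 0;
  return w.val[0] == (uint32_t) lo
	 && w.val[1] == (uint32_t) (lo >> 32)
	 && w.val[2] == (uint32_t) hi
	 && w.val[3] == (uint32_t) (hi >> 32);
}
```

An unsigned overload would instead zero-extend the high words, which is exactly why the signedness of the plain-int argument matters to the result.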
Re: PING^2: [patch] pr/54508: fix incomplete debug information for class
On Oct 23, 2012, at 2:02 PM, Jason Merrill wrote: OK. Jason Thanks. Committed. paul
Re: patch to fix constant math - 4th patch - the wide-int class.
On 10/23/2012 02:38 PM, Lawrence Crowl wrote: On 10/23/12, Kenneth Zadeck zad...@naturalbridge.com wrote: On 10/23/2012 10:12 AM, Richard Biener wrote:

+  inline bool minus_one_p () const;
+  inline bool zero_p () const;
+  inline bool one_p () const;
+  inline bool neg_p () const;

what's wrong with w == -1, w == 0, w == 1, etc.? I would love to do this and you seem to be somewhat knowledgeable of C++. But I cannot for the life of me figure out how to do it. Starting from the simple case, you write an operator ==.

As a global operator:  bool operator == (wide_int w, int i);
As a member operator:  bool wide_int::operator == (int i);

In the simple case,

bool operator == (wide_int w, int i)
{
  switch (i)
    {
    case -1: return w.minus_one_p ();
    case 0: return w.zero_p ();
    case 1: return w.one_p ();
    default: unexpected
    }
}

No, this seems wrong. You do not want to write code that can only fail at runtime unless there is a damn good reason to do that.

Say I have a TImode number, which must be represented in 4 ints on a 32-bit host (the same issue happens on 64-bit hosts, but the examples are simpler on 32-bit hosts), and I compare it to -1. The value that I am going to see as the argument of the function is going to have the value 0x. but the value that I have internally is 128 bits. Do I take this and zero- or sign-extend it? What would you have done with w.minus_one_p ()?

The code knows that -1 is a negative number and it knows the precision of w. That is enough information. So it logically builds a -1 that has enough bits to do the conversion.

In particular, if someone wants to compare a number to 0xdeadbeef I have no idea what to do. I tried defining two different functions, one that took a signed and one that took an unsigned number, but then I wanted a cast in front of all the positive numbers. This is where it does get tricky. For signed arguments, you should sign extend. For unsigned arguments, you should not. At present, we need multiple overloads to avoid type ambiguities.
bool operator == (wide_int w, long long int i);
bool operator == (wide_int w, unsigned long long int i);

inline bool operator == (wide_int w, long int i)
{ return w == (long long int) i; }

inline bool operator == (wide_int w, unsigned long int i)
{ return w == (unsigned long long int) i; }

inline bool operator == (wide_int w, int i)
{ return w == (long long int) i; }

inline bool operator == (wide_int w, unsigned int i)
{ return w == (unsigned long long int) i; }

(There is a proposal before the C++ committee to fix this problem.) Even so, there is room for potential bugs when wide_int does not carry around whether or not it is signed. The problem is that regardless of what the programmer thinks of the sign of the wide int, the comparison will use the sign of the int.

When they do, we can revisit this. But I looked at this and I said the potential bugs were not worth the effort.

If there is a way to do this, then I will do it, but it is going to have to work properly for things larger than a HOST_WIDE_INT.

The long-term solution, IMHO, is to carry the sign information around in either the type or the class data. (I prefer type, but with a mechanism to carry it as data when needed.) Such comparisons would then require consistency in signedness between the wide int and the plain int.

Carrying the sign information is a non-starter. The rtl level does not have it and the middle end violates it more often than not. My view was to design this having looked at all of the usage. I had basically converted the whole compiler before I released the ABI. I am still getting out the errors and breaking it up into reviewable-sized patches, but I knew very, very well who my clients were before I wrote the ABI.

I know that double-int does some of this and it does not carry around a notion of signedness either. Is this just code that has not been fully tested, or is there a trick in C++ that I am missing?

The double int class only provides == and !=, and only with other double ints.
Otherwise, it has the same value query functions that you do above. In the case of double int, the goal was to simplify use of the existing semantics. If you are changing the semantics, consider incorporating sign explicitly.

I have, and it does not work.
Re: PR c/53063 Handle Wformat with LangEnabledBy
On 19 October 2012 18:17, Joseph S. Myers jos...@codesourcery.com wrote: On Wed, 17 Oct 2012, Manuel López-Ibáñez wrote: documentation but I can also implement -Wformat=0 being an alias for -Wno-format and -Wformat=1 an alias for -Wformat and simply reject -Wno-format=.

I think that's what's wanted; -Wno-format= should be rejected, and -Wformat= should take an arbitrary integer level (of which at present all those above 2 are equivalent to 2, just as -On for n > 3 is equivalent to -O3).

The problem is how to represent that Wformat-y2k is enabled by -Wformat=X with X >= 2, while Wformat-zero-length is enabled by X >= 1. One possibility is to allow a condition to be specified directly:

Wformat-y2k
C ObjC C++ ObjC++ Var(warn_format_y2k) Warning LangEnabledByCond(C ObjC C++ ObjC++,Wformat=,warn_format >= 2)
Warn about strftime formats yielding 2-digit years

Wformat-zero-length
C ObjC C++ ObjC++ Var(warn_format_zero_length) Warning LangEnabledByCond(C ObjC C++ ObjC++,Wformat=,warn_format >= 1)
Warn about zero-length formats

Wformat=
C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_format) Warning

I think this is both flexible and easy to implement given the current infrastructure. But I wanted to get your approval before. What do you think?
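The proposed LangEnabledByCond semantics can be sketched as plain code. This is a hypothetical miniature, not GCC's option machinery: a sub-warning defaults to "on" once the parent -Wformat= level reaches its threshold, unless the user set the sub-warning explicitly.

```cpp
#include <cassert>

// -1 in a sub-warning field means "not set explicitly by the user".
struct warning_state {
  int warn_format;             // level from -Wformat=N
  int warn_format_y2k;         // -Wformat-y2k / -Wno-format-y2k
  int warn_format_zero_length; // -Wformat-zero-length / -Wno-...
};

// Resolve defaults after option parsing, mirroring the conditions in
// the proposal: y2k needs level >= 2, zero-length needs level >= 1.
// Explicit user settings always win over the level-derived default.
void resolve_defaults (warning_state *w)
{
  if (w->warn_format_y2k == -1)
    w->warn_format_y2k = w->warn_format >= 2;
  if (w->warn_format_zero_length == -1)
    w->warn_format_zero_length = w->warn_format >= 1;
}
```

So -Wformat=2 turns both sub-warnings on, -Wformat=1 turns on only zero-length, and -Wformat=2 -Wno-format-y2k keeps y2k off, which is the interaction the LangEnabledByCond syntax is trying to express declaratively.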
Re: [PATCH] [7/10] AArch64 Port
Marcus == Marcus Shawcroft marcus.shawcr...@arm.com writes: Marcus This patch adjusts the libcpp configury for AArch64. Marcus Proposed ChangeLog: Marcus * configure.ac: Enable AArch64. Marcus * configure: Regenerate. This is ok. Thanks. Tom
Re: Fourth ping: Re: Add a configure option to disable system header canonicalizations (issue6495088)
Steven Probably you mean the revised patch here: Steven http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00459.html Steven The patch look OK to me but I can't approve it. I'm sorry about the delay on this. The libcpp bits are ok. I can't approve the other parts. I think new configure options should be documented in install.texi. Tom
Re: Fourth ping: Re: Add a configure option to disable system header canonicalizations (issue6495088)
On Tue, Oct 23, 2012 at 12:38 PM, Tom Tromey tro...@redhat.com wrote: Steven Probably you mean the revised patch here: Steven http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00459.html Steven The patch look OK to me but I can't approve it. I'm sorry about the delay on this. The libcpp bits are ok. I can't approve the other parts. The rest of the patch is fine. I think new configure options should be documented in install.texi. Looks to me like it is. Thanks. Ian