[ping] Fix wrong code with boolean negation
It's a regression (albeit an old one):
  http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01044.html

Thanks in advance.

-- 
Eric Botcazou
[PATCH][RFC] Fix PR56113 more
This reduces compile-time of the testcase in PR56113 (with n = 4) from 575s to 353s.  It does so by replacing the quadratic algorithm used to impose an order on visiting dominator sons during a domwalk.

Steven raises the issue that there exist domwalk users that modify the CFG during the walk, and thus the new scheme does not work (at least not optimally, as the current quadratic scheme does).  As we are using a fixed-size sbitmap to track visited blocks, existing domwalks cannot add new blocks to the CFG, so the worst thing that can happen is that the order of dominator sons is no longer optimal (I suppose with the right CFG manipulations even the domwalk itself does not work - so I'd be hesitant to try to support such domwalk users) - back to the state before any ordering was imposed on the dom children visits (see rev. 159100).

Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?  (it's a regression from 4.5)

Thanks,
Richard.

2013-02-01  Richard Biener  rguent...@suse.de

	PR middle-end/56113
	* domwalk.c (bb_postorder): New global static.
	(cmp_bb_postorder): New function.
	(walk_dominator_tree): Replace scheme imposing an order for
	visiting dominator sons by one sorting them at the time they
	are pushed on the stack.

Index: gcc/domwalk.c
===================================================================
*** gcc/domwalk.c	(revision 195615)
--- gcc/domwalk.c	(working copy)
*************** along with GCC; see the file COPYING3.
*** 128,133 ****
--- 128,148 ----
     which is currently an abstraction over walking tree statements.  Thus
     the dominator walker is currently only useful for trees.  */
  
+ static int *bb_postorder;
+ 
+ static int
+ cmp_bb_postorder (const void *a, const void *b)
+ {
+   basic_block bb1 = *(basic_block *)const_cast<void *> (a);
+   basic_block bb2 = *(basic_block *)const_cast<void *> (b);
+   if (bb1->index == bb2->index)
+     return 0;
+   /* Place higher completion number first (pop off lower number first).  */
+   if (bb_postorder[bb1->index] > bb_postorder[bb2->index])
+     return -1;
+   return 1;
+ }
+ 
  /* Recursively walk the dominator tree.
     WALK_DATA contains a set of callbacks to perform pass-specific

*************** walk_dominator_tree (struct dom_walk_dat
*** 143,151 ****
    basic_block dest;
    basic_block *worklist = XNEWVEC (basic_block, n_basic_blocks * 2);
    int sp = 0;
!   sbitmap visited = sbitmap_alloc (last_basic_block + 1);
!   bitmap_clear (visited);
!   bitmap_set_bit (visited, ENTRY_BLOCK_PTR->index);
  
    while (true)
      {
--- 158,174 ----
    basic_block dest;
    basic_block *worklist = XNEWVEC (basic_block, n_basic_blocks * 2);
    int sp = 0;
!   int *postorder, postorder_num;
! 
!   if (walk_data->dom_direction == CDI_DOMINATORS)
!     {
!       postorder = XNEWVEC (int, n_basic_blocks);
!       postorder_num = inverted_post_order_compute (postorder);
!       bb_postorder = XNEWVEC (int, last_basic_block);
!       for (int i = 0; i < postorder_num; ++i)
!         bb_postorder[postorder[i]] = i;
!       free (postorder);
!     }
  
    while (true)
      {
*************** walk_dominator_tree (struct dom_walk_dat
*** 186,201 ****
	  if (walk_data->before_dom_children)
	    (*walk_data->before_dom_children) (walk_data, bb);
  
-	  bitmap_set_bit (visited, bb->index);
- 
	  /* Mark the current BB to be popped out of the recursion stack
	     once children are processed.  */
	  worklist[sp++] = bb;
	  worklist[sp++] = NULL;
  
	  for (dest = first_dom_son (walk_data->dom_direction, bb);
	       dest; dest = next_dom_son (walk_data->dom_direction, dest))
	    worklist[sp++] = dest;
	}
      /* NULL is used to mark pop operations in the recursion stack.  */
      while (sp > 0 && !worklist[sp - 1])
--- 209,233 ----
	  if (walk_data->before_dom_children)
	    (*walk_data->before_dom_children) (walk_data, bb);
  
	  /* Mark the current BB to be popped out of the recursion stack
	     once children are processed.
	     */
	  worklist[sp++] = bb;
	  worklist[sp++] = NULL;
  
+	  int saved_sp = sp;
	  for (dest = first_dom_son (walk_data->dom_direction, bb);
	       dest; dest = next_dom_son (walk_data->dom_direction, dest))
	    worklist[sp++] = dest;
+	  if (walk_data->dom_direction == CDI_DOMINATORS)
+	    switch (sp - saved_sp)
+	      {
+	      case 0:
+	      case 1:
+		break;
+	      default:
+		qsort (&worklist[saved_sp], sp - saved_sp,
+		       sizeof (basic_block), cmp_bb_postorder);
+	      }
	}
      /* NULL is used to mark pop operations in the recursion stack.  */
      while (sp > 0 && !worklist[sp - 1])
*************** walk_dominator_tree (struct dom_walk_dat
*** 217,260 ****
	}
      }
  
    if (sp)
!     {
!       int spp;
!
Re: [patch] Fix wrong code with boolean negation
On Mon, Jan 21, 2013 at 9:41 AM, Eric Botcazou ebotca...@adacore.com wrote:
> Hi,
> 
> this is a regression present in the Ada compiler since 4.5: the issue had
> been latent for ages, but an unrelated streamlining of the IR made it
> appear.  When make_range_step is invoked on:
> 
>   (integer)!b > 0
> 
> where b is a boolean, it returns always true instead of always false.
> The sequence is as follows:
> 
>   (integer)!b > 0   is_true_if not in [0:0]
>   (integer)!b       is_true_if not in [0;+inf[
>   !b                is_true_if not in [0;255]
>   b                 is_true_if in [0;255]
> 
> The wrong step is the last one: when TRUTH_NOT_EXPR is seen in
> make_range_step, the "in" value is unconditionally toggled.  Of course
> that doesn't work in the general case, just if the range is the boolean
> range.  As a matter of fact, this is explained just below for the
> comparison operators:
> 
>   /* We can only do something if the range is testing for zero
>      and if the second operand is an integer constant.  Note that
>      saying something is "in" the range we make is done by
>      complementing IN_P since it will set in the initial case of
>      being not equal to zero; "out" is leaving it alone.  */
> 
> so the fix is to use the zero range condition in the TRUTH_NOT_EXPR case.
> 
> Tested on x86_64-suse-linux, OK for mainline?  And for branch(es)?

Ok everywhere.

Thanks,
Richard.

> 2013-01-21  Eric Botcazou  ebotca...@adacore.com
> 
> 	* fold-const.c (make_range_step) <TRUTH_NOT_EXPR>: Bail out if the
> 	range isn't testing for zero.
> 
> 2013-01-21  Eric Botcazou  ebotca...@adacore.com
> 
> 	* gnat.dg/opt26.adb: New test.
> 
> -- 
> Eric Botcazou
Re: [PATCH][RFC] Fix PR56113 more
On Fri, Feb 01, 2013 at 10:00:00AM +0100, Richard Biener wrote:
> This reduces compile-time of the testcase in PR56113 (with n = 4) from
> 575s to 353s.  It does so by replacing the quadratic algorithm used to
> impose an order on visiting dominator sons during a domwalk.
> 
> Steven raises the issue that there exist domwalk users that modify the
> CFG during the walk and thus the new scheme does not work (at least not
> optimally, as the current quadratic scheme does).  As we are using a
> fixed-size sbitmap to track visited blocks, existing domwalks cannot add
> new blocks to the CFG, so the worst thing that can happen is that the
> order of dominator sons is no longer optimal (I suppose with the right
> CFG manipulations even the domwalk itself does not work - so I'd be
> hesitant to try to support such domwalk users) - back to the state
> before any ordering was imposed on the dom children visits (see rev.
> 159100).

I think it would be desirable to first analyze the failures Steven saw, if any.  As you said, asan doesn't use domwalk, so it is a mystery to me.

	Jakub
[PATCH] More PR56113 PTA speedups
This reduces the work done for single-predecessor nodes when assigning pointer equivalence classes in label_visit.  For the PR56113 testcase with n = 4 this reduces PTA time from

  tree PTA               : 119.59 (34%) usr

to

  tree PTA               :  51.62 (18%) usr

(the percentages are with the domwalk fix applied).  An easy optimization with a surprisingly big effect.  On the way I also reduce the work done for non-direct nodes.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2013-02-01  Richard Biener  rguent...@suse.de

	* tree-ssa-structalias.c (label_visit): Reduce work for
	single-predecessor nodes.

Index: gcc/tree-ssa-structalias.c
===================================================================
*** gcc/tree-ssa-structalias.c	(revision 195641)
--- gcc/tree-ssa-structalias.c	(working copy)
*************** condense_visit (constraint_graph_t graph
*** 2107,2120 ****
  static void
  label_visit (constraint_graph_t graph, struct scc_info *si, unsigned int n)
  {
!   unsigned int i;
    bitmap_iterator bi;
-   bitmap_set_bit (si->visited, n);
!   if (!graph->points_to[n])
!     graph->points_to[n] = BITMAP_ALLOC (&predbitmap_obstack);
  
    /* Label and union our incoming edges's points to sets.  */
    EXECUTE_IF_IN_NONNULL_BITMAP (graph->preds[n], 0, i, bi)
      {
        unsigned int w = si->node_mapping[i];
--- 2108,2120 ----
  static void
  label_visit (constraint_graph_t graph, struct scc_info *si, unsigned int n)
  {
!   unsigned int i, first_pred;
    bitmap_iterator bi;
! 
    bitmap_set_bit (si->visited, n);
  
    /* Label and union our incoming edges's points to sets.  */
+   first_pred = -1U;
    EXECUTE_IF_IN_NONNULL_BITMAP (graph->preds[n], 0, i, bi)
      {
        unsigned int w = si->node_mapping[i];
*************** label_visit (constraint_graph_t graph, s
*** 2126,2136 ****
	  continue;
  
        if (graph->points_to[w])
!	  bitmap_ior_into (graph->points_to[n], graph->points_to[w]);
      }
! 
    /* Indirect nodes get fresh variables.  */
    if (!bitmap_bit_p (graph->direct_nodes, n))
!     bitmap_set_bit (graph->points_to[n], FIRST_REF_NODE + n);
  
    if (!bitmap_empty_p (graph->points_to[n]))
      {
--- 2126,2170 ----
	  continue;
  
        if (graph->points_to[w])
!	  {
!	    if (first_pred == -1U)
!	      first_pred = w;
!	    else if (!graph->points_to[n])
!	      {
!		graph->points_to[n] = BITMAP_ALLOC (&predbitmap_obstack);
!		bitmap_ior (graph->points_to[n],
!			    graph->points_to[first_pred],
!			    graph->points_to[w]);
!	      }
!	    else
!	      bitmap_ior_into (graph->points_to[n], graph->points_to[w]);
!	  }
      }
! 
!   /* Indirect nodes get fresh variables and a new pointer equiv class.  */
    if (!bitmap_bit_p (graph->direct_nodes, n))
!     {
!       if (!graph->points_to[n])
!	  {
!	    graph->points_to[n] = BITMAP_ALLOC (&predbitmap_obstack);
!	    if (first_pred != -1U)
!	      bitmap_copy (graph->points_to[n], graph->points_to[first_pred]);
!	  }
!       bitmap_set_bit (graph->points_to[n], FIRST_REF_NODE + n);
!       graph->pointer_label[n] = pointer_equiv_class++;
!       return;
!     }
! 
!   /* If there was only a single non-empty predecessor the pointer equiv
!      class is the same.  */
!   if (!graph->points_to[n])
!     {
!       if (first_pred != -1U)
!	  {
!	    graph->pointer_label[n] = graph->pointer_label[first_pred];
!	    graph->points_to[n] = graph->points_to[first_pred];
!	  }
!       return;
!     }
  
    if (!bitmap_empty_p (graph->points_to[n]))
      {
Re: [PATCH][RFC] Fix PR56113 more
On Fri, 1 Feb 2013, Jakub Jelinek wrote:
> On Fri, Feb 01, 2013 at 10:00:00AM +0100, Richard Biener wrote:
>> This reduces compile-time of the testcase in PR56113 (with n = 4) from
>> 575s to 353s.  It does so by replacing the quadratic algorithm used to
>> impose an order on visiting dominator sons during a domwalk.
>> 
>> Steven raises the issue that there exist domwalk users that modify the
>> CFG during the walk and thus the new scheme does not work (at least not
>> optimally, as the current quadratic scheme does).  As we are using a
>> fixed-size sbitmap to track visited blocks, existing domwalks cannot
>> add new blocks to the CFG, so the worst thing that can happen is that
>> the order of dominator sons is no longer optimal (I suppose with the
>> right CFG manipulations even the domwalk itself does not work - so I'd
>> be hesitant to try to support such domwalk users) - back to the state
>> before any ordering was imposed on the dom children visits (see rev.
>> 159100).
> 
> I think it would be desirable to first analyze the failures Steven saw,
> if any.  As you said, asan doesn't use domwalk, so it is a mystery to me.

Yeah.  Now, fortunately domwalk.h is only directly included, and thus the set of optimizers using it are:

compare-elim.c:#include "domwalk.h"
domwalk.c:#include "domwalk.h"
fwprop.c:#include "domwalk.h"
gimple-ssa-strength-reduction.c:#include "domwalk.h"
graphite-sese-to-poly.c:#include "domwalk.h"
tree-into-ssa.c:#include "domwalk.h"
tree-ssa-dom.c:#include "domwalk.h"
tree-ssa-dse.c:#include "domwalk.h"
tree-ssa-loop-im.c:#include "domwalk.h"
tree-ssa-math-opts.c:   If we did this using domwalk.c, an efficient implementation would have
tree-ssa-phiopt.c:#include "domwalk.h"
tree-ssa-pre.c:#include "domwalk.h"
tree-ssa-pre.c:/* Local state for the eliminate domwalk.  */
tree-ssa-pre.c:   eliminate domwalk.  */
tree-ssa-pre.c:/* At the current point of the eliminate domwalk make OP available.  */
tree-ssa-pre.c:/* Perform elimination for the basic-block B during the domwalk.
   */
tree-ssa-strlen.c:#include "domwalk.h"
tree-ssa-uncprop.c:#include "domwalk.h"

I don't see any target-specific ones that do not have coverage with x86_64 multilib testing (maybe compare-elim.c? though that doesn't really require a domwalk as it is only using the before_dom_children hook).

That said, arbitrary CFG manipulations during a domwalk certainly will not preserve the properties of that domwalk.

Steven - can you reproduce your failures (and on which target)?

Thanks,
Richard.
[PATCH, AArch64] AND operation should use CC_NZ mode
The mode for AND should really be CC_NZ, so I fixed that up, and also the TST patterns that (erroneously) expected it to be CC mode.

It has been tested on linux and bare-metal.  OK to commit to trunk (as bug fix)?

Thanks.
Ian

2013-02-01  Ian Bolton  ian.bol...@arm.com

	* config/aarch64/aarch64.c (aarch64_select_cc_mode): Return
	correct CC mode for AND.
	* config/aarch64/aarch64.md (*and<mode>3nr_compare0): Fixed
	to use CC_NZ.
	(*and_<SHIFT:optab><mode>3nr_compare0): Likewise.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 03b1361..2b09669 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3076,7 +3076,7 @@ aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y)
   if ((GET_MODE (x) == SImode || GET_MODE (x) == DImode)
       && y == const0_rtx
       && (code == EQ || code == NE || code == LT || code == GE)
-      && (GET_CODE (x) == PLUS || GET_CODE (x) == MINUS))
+      && (GET_CODE (x) == PLUS || GET_CODE (x) == MINUS
+	  || GET_CODE (x) == AND))
     return CC_NZmode;
 
   /* A compare with a shifted operand.  Because of canonicalization,
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 36267c9..c4c152f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2470,8 +2470,8 @@
 )
 
 (define_insn "*and<mode>3nr_compare0"
-  [(set (reg:CC CC_REGNUM)
-	(compare:CC
+  [(set (reg:CC_NZ CC_REGNUM)
+	(compare:CC_NZ
	 (and:GPI (match_operand:GPI 0 "register_operand" "%r,r")
		  (match_operand:GPI 1 "aarch64_logical_operand" "r,<lconst>"))
	 (const_int 0)))]
@@ -2481,8 +2481,8 @@
   [(set_attr "mode" "<MODE>")])
 
 (define_insn "*and_<SHIFT:optab><mode>3nr_compare0"
-  [(set (reg:CC CC_REGNUM)
-	(compare:CC
+  [(set (reg:CC_NZ CC_REGNUM)
+	(compare:CC_NZ
	 (and:GPI (SHIFT:GPI
		   (match_operand:GPI 0 "register_operand" "r")
		   (match_operand:QI 1 "aarch64_shift_imm_<mode>" "n"))
Re: [PATCH, AArch64] AND operation should use CC_NZ mode
On 01/02/13 11:05, Ian Bolton wrote:
> The mode for AND should really be CC_NZ, so I fixed that up, and also
> the TST patterns that (erroneously) expected it to be CC mode.
> 
> It has been tested on linux and bare-metal.  OK to commit to trunk (as
> bug fix)?
> 
> Thanks.
> Ian
> 
> 2013-02-01  Ian Bolton  ian.bol...@arm.com
> 
> 	* config/aarch64/aarch64.c (aarch64_select_cc_mode): Return
> 	correct CC mode for AND.
> 	* config/aarch64/aarch64.md (*and<mode>3nr_compare0): Fixed
> 	to use CC_NZ.
> 	(*and_<SHIFT:optab><mode>3nr_compare0): Likewise.

OK, and backport to ARM/aarch64-4.7-branch please.

Thanks
/Marcus
[PATCH,x86] Workaround for 55970
Hi All,

This is a simple fix aimed at helping users port to x86 platforms those applications that rely on an order of function argument evaluation.  To preserve direct order of argument evaluation they need to add the additional option '-mno-push-args' to the compile, which looks like a reasonable price for non-conformance with the C/C++ Standards.  I checked that adding this option does not affect performance on Core i7 and Atom platforms.  Note also that '-mpush-args' is on by default on all x86 platforms, which means that changing this macro will likely not affect almost all gcc users.

Is it OK for trunk?

2013-02-01  Yuri Rumyantsev  ysrum...@gmail.com

	* config/i386/i386.h (PUSH_ARGS_REVERSED): Change macro.

patch
Description: Binary data
[PATCH] Fix PR56168
So - back to PR55848 - this testcase shows that we still handle builtins vs. non-builtins in a wrong way.  If at compile-time we chose to use a non-builtin variant we have to preserve that (-fno-builtin) - easy to do at WPA stage by adjusting symbol merging.  Now, at LTRANS stage somebody clever decided to skip symtab merging - but forgot to disable decl fixup.  In this case decl fixup with un-merged builtin and non-builtin symtab entries chooses the builtin if it happens to be first in the list of asm aliases.

The following patch fixes both issues, preserving runtime behavior of the weird testcase and avoiding the folding to cbrt as in the original report.

LTO bootstrap and testing running on x86_64-unknown-linux-gnu (in stage3 already).

Honza, does this look sane?

Thanks,
Richard.

2013-02-01  Richard Guenther  rguent...@suse.de

	PR lto/56168
	* lto-symtab.c (lto_symtab_merge_decls_1): Make non-builtin
	node prevail as last resort.
	(lto_symtab_merge_decls): Remove guard on LTRANS here.
	(lto_symtab_prevailing_decl): Builtins are their own
	prevailing decl.

	lto/
	* lto.c (read_cgraph_and_symbols): Do not call
	lto_symtab_merge_decls or lto_fixup_decls at LTRANS time.

	* gcc.dg/lto/pr56168_0.c: New testcase.
	* gcc.dg/lto/pr56168_1.c: Likewise.

Index: gcc/lto-symtab.c
===================================================================
*** gcc/lto-symtab.c	(revision 195641)
--- gcc/lto-symtab.c	(working copy)
*************** lto_symtab_merge_decls_1 (symtab_node fi
*** 439,450 ****
	      && COMPLETE_TYPE_P (TREE_TYPE (e->symbol.decl)))
	    prevailing = e;
	}
!     /* For variables prefer the builtin if one is available.  */
      else if (TREE_CODE (prevailing->symbol.decl) == FUNCTION_DECL)
	{
	  for (e = first; e; e = e->symbol.next_sharing_asm_name)
	    if (TREE_CODE (e->symbol.decl) == FUNCTION_DECL
!		&& DECL_BUILT_IN (e->symbol.decl))
	      {
		prevailing = e;
		break;
--- 439,450 ----
	      && COMPLETE_TYPE_P (TREE_TYPE (e->symbol.decl)))
	    prevailing = e;
	}
!     /* For variables prefer the non-builtin if one is available.
	*/
      else if (TREE_CODE (prevailing->symbol.decl) == FUNCTION_DECL)
	{
	  for (e = first; e; e = e->symbol.next_sharing_asm_name)
	    if (TREE_CODE (e->symbol.decl) == FUNCTION_DECL
!		&& !DECL_BUILT_IN (e->symbol.decl))
	      {
		prevailing = e;
		break;
*************** lto_symtab_merge_decls (void)
*** 507,518 ****
  {
    symtab_node node;
  
-   /* In ltrans mode we read merged cgraph, we do not really need to care
-      about resolving symbols again, we only need to replace duplicated
-      declarations read from the callgraph and from function sections.  */
-   if (flag_ltrans)
-     return;
- 
    /* Populate assembler name hash.  */
    symtab_initialize_asm_name_hash ();
--- 507,512 ----
*************** lto_symtab_prevailing_decl (tree decl)
*** 598,603 ****
--- 592,602 ----
    if (TREE_CODE (decl) == FUNCTION_DECL && DECL_ABSTRACT (decl))
      return decl;
  
+   /* Likewise builtins are their own prevailing decl.  This preserves
+      non-builtin vs. builtin uses from compile-time.  */
+   if (TREE_CODE (decl) == FUNCTION_DECL && DECL_BUILT_IN (decl))
+     return decl;
+ 
    /* Ensure DECL_ASSEMBLER_NAME will not set assembler name.  */
    gcc_assert (DECL_ASSEMBLER_NAME_SET_P (decl));

Index: gcc/lto/lto.c
===================================================================
*** gcc/lto/lto.c	(revision 195641)
--- gcc/lto/lto.c	(working copy)
*************** read_cgraph_and_symbols (unsigned nfiles
*** 3033,3048 ****
      fprintf (stderr, "Merging declarations\n");
  
    timevar_push (TV_IPA_LTO_DECL_MERGE);
!   /* Merge global decls.  */
!   lto_symtab_merge_decls ();
! 
!   /* If there were errors during symbol merging bail out, we have no
!      good way to recover here.  */
!   if (seen_error ())
!     fatal_error ("errors during merging of translation units");
! 
!   /* Fixup all decls.  */
!   lto_fixup_decls (all_file_decl_data);
    htab_delete (tree_with_vars);
    tree_with_vars = NULL;
    ggc_collect ();
--- 3033,3054 ----
      fprintf (stderr, "Merging declarations\n");
  
    timevar_push (TV_IPA_LTO_DECL_MERGE);
!   /* Merge global decls.  In ltrans mode we read merged cgraph, we do not
!      need to care about resolving symbols again, we only need to replace
!      duplicated declarations read from the callgraph and from function
!      sections.  */
!   if (!flag_ltrans)
!     {
!       lto_symtab_merge_decls ();
! 
!       /* If there were errors during symbol merging bail out, we have no
!	   good way to recover here.  */
!       if (seen_error ())
!	  fatal_error ("errors during merging of translation units");
! 
!       /* Fixup all decls.  */
Re: [Patch, AArch64, AArch64-4.7] Backport Optimize cmp in some cases patch
On 27/01/13 08:46, Venkataramanan Kumar wrote:
> Hi Maintainers,
> 
> The attached patch backports the gcc trunk patch
> http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00143.html to the
> ARM/aarch64-4.7-branch branch.
> 
> ChangeLog.aarch64
> 
> 2013-01-27  Venkataramanan Kumar  venkataramanan.ku...@linaro.org
> 
> 	Backport from mainline.
> 	2013-01-04  Andrew Pinski  apin...@cavium.com
> 
> 	* config/aarch64/aarch64.c (aarch64_fixed_condition_code_regs):
> 	New function.
> 	(TARGET_FIXED_CONDITION_CODE_REGS): Define.
> 
> Patch is attached.  Please let me know if I can change -1 to
> INVALID_REGNUM and commit.
> 
> Built gcc and tested the gcc testsuites for the aarch64-none-elf target
> with the ARMv8 Foundation model.  No new regressions.
> 
> Ok for the ARM/aarch64-4.7-branch?

This is fine.  Thank you.  Please commit.

/Marcus
[PATCH,committed] Define ASM_OUTPUT_ALIGNED_LOCAL for AIX
AIX 6.1 added an alignment argument to the .lcomm pseudo-op.  This fixes many of the remaining Altivec failures on AIX where GCC was generating a zero vector in BSS, but the block was not appropriately aligned.  I also took the opportunity to change the ASM_OUTPUT_ALIGNED_COMMON use of exact_log2 to floor_log2.

	* config/rs6000/xcoff.h (ASM_OUTPUT_ALIGNED_COMMON): Use floor_log2.
	(ASM_OUTPUT_ALIGNED_LOCAL): New.

Bootstrapped on powerpc-ibm-aix7.1.0.0.

Thanks, David

Index: xcoff.h
===================================================================
--- xcoff.h	(revision 195639)
+++ xcoff.h	(working copy)
@@ -283,7 +283,7 @@
        RS6000_OUTPUT_BASENAME ((FILE), (NAME));				\
        if ((ALIGN) > 32)						\
	 fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",%u\n", (SIZE), \
-		  exact_log2 ((ALIGN) / BITS_PER_UNIT));		\
+		  floor_log2 ((ALIGN) / BITS_PER_UNIT));		\
        else if ((SIZE) > 4)						\
	 fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",3\n", (SIZE)); \
        else								\
@@ -292,12 +292,30 @@
 
 /* This says how to output an assembler line
    to define a local common symbol.
-   Alignment cannot be specified, but we can try to maintain
+   The assembler in AIX 6.1 and later supports an alignment argument.
+   For earlier releases of AIX, we try to maintain
    alignment after preceding TOC section if it was aligned
    for 64-bit mode.  */
 
 #define LOCAL_COMMON_ASM_OP	"\t.lcomm "
 
+#if TARGET_AIX_VERSION >= 61
+#define ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGN)	\
+  do { fputs (LOCAL_COMMON_ASM_OP, (FILE));			\
+       RS6000_OUTPUT_BASENAME ((FILE), (NAME));			\
+       if ((ALIGN) > 32)					\
+	 fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",%s,%u\n", \
+		  (SIZE), xcoff_bss_section_name,		\
+		  floor_log2 ((ALIGN) / BITS_PER_UNIT));	\
+       else if ((SIZE) > 4)					\
+	 fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",%s,3\n", \
+		  (SIZE), xcoff_bss_section_name);		\
+       else							\
+	 fprintf ((FILE), ","HOST_WIDE_INT_PRINT_UNSIGNED",%s\n", \
+		  (SIZE), xcoff_bss_section_name);		\
+  } while (0)
+#endif
+
 #define ASM_OUTPUT_LOCAL(FILE, NAME, SIZE, ROUNDED)	\
   do { fputs (LOCAL_COMMON_ASM_OP, (FILE));		\
        RS6000_OUTPUT_BASENAME ((FILE), (NAME));		\
[committed] Backports from trunk to 4.7 branch
Hi!

I've committed the following backports from trunk to the 4.7 branch, after bootstrapping/regtesting them on x86_64-linux and i686-linux.

	Jakub

2013-02-01  Jakub Jelinek  ja...@redhat.com

	Backported from mainline
	2012-11-13  Jakub Jelinek  ja...@redhat.com

	PR rtl-optimization/54127
	* cfgrtl.c (force_nonfallthru_and_redirect): When redirecting
	asm goto labels from BB_HEAD (e->dest) to target bb, decrement
	LABEL_NUSES of BB_HEAD (e->dest) and increment LABEL_NUSES of
	BB_HEAD (target) appropriately and adjust JUMP_LABEL and/or
	REG_LABEL_TARGET and REG_LABEL_OPERAND.

	* gcc.dg/torture/pr54127.c: New test.

--- gcc/cfgrtl.c	(revision 193469)
+++ gcc/cfgrtl.c	(revision 193470)
@@ -1424,14 +1424,46 @@ force_nonfallthru_and_redirect (edge e, 
       && (note = extract_asm_operands (PATTERN (BB_END (e->src)))))
     {
       int i, n = ASM_OPERANDS_LABEL_LENGTH (note);
+      bool adjust_jump_target = false;
 
       for (i = 0; i < n; ++i)
	{
	  if (XEXP (ASM_OPERANDS_LABEL (note, i), 0) == BB_HEAD (e->dest))
-	    XEXP (ASM_OPERANDS_LABEL (note, i), 0) = block_label (target);
+	    {
+	      LABEL_NUSES (XEXP (ASM_OPERANDS_LABEL (note, i), 0))--;
+	      XEXP (ASM_OPERANDS_LABEL (note, i), 0) = block_label (target);
+	      LABEL_NUSES (XEXP (ASM_OPERANDS_LABEL (note, i), 0))++;
+	      adjust_jump_target = true;
+	    }
	  if (XEXP (ASM_OPERANDS_LABEL (note, i), 0) == BB_HEAD (target))
	    asm_goto_edge = true;
	}
+      if (adjust_jump_target)
+	{
+	  rtx insn = BB_END (e->src), note;
+	  rtx old_label = BB_HEAD (e->dest);
+	  rtx new_label = BB_HEAD (target);
+
+	  if (JUMP_LABEL (insn) == old_label)
+	    {
+	      JUMP_LABEL (insn) = new_label;
+	      note = find_reg_note (insn, REG_LABEL_TARGET, new_label);
+	      if (note)
+		remove_note (insn, note);
+	    }
+	  else
+	    {
+	      note = find_reg_note (insn, REG_LABEL_TARGET, old_label);
+	      if (note)
+		remove_note (insn, note);
+	      if (JUMP_LABEL (insn) != new_label
+		  && !find_reg_note (insn, REG_LABEL_TARGET, new_label))
+		add_reg_note (insn, REG_LABEL_TARGET, new_label);
+	    }
+	  while ((note = find_reg_note (insn, REG_LABEL_OPERAND, old_label))
+		 != NULL_RTX)
+	    XEXP (note, 0) =
 new_label;
+	}
     }
 
   if (EDGE_COUNT (e->src->succs) >= 2 || abnormal_edge_flags || asm_goto_edge)
--- gcc/testsuite/gcc.dg/torture/pr54127.c	(revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr54127.c	(revision 193470)
@@ -0,0 +1,16 @@
+/* PR rtl-optimization/54127 */
+/* { dg-do compile } */
+
+extern void foo (void) __attribute__ ((__noreturn__));
+
+void
+bar (int x)
+{
+  if (x < 0)
+    foo ();
+  if (x == 0)
+    return;
+  __asm goto ("# %l[lab] %l[lab2]" : : : : lab, lab2);
+lab:;
+lab2:;
+}

2013-02-01  Jakub Jelinek  ja...@redhat.com

	Backported from mainline
	2012-11-17  Jakub Jelinek  ja...@redhat.com

	PR tree-optimization/55236
	* fold-const.c (make_range_step) <case NEGATE_EXPR>: For -fwrapv
	and signed ARG0_TYPE, force low and high to be non-NULL.

	* gcc.dg/pr55236.c: New test.

--- gcc/fold-const.c	(revision 193590)
+++ gcc/fold-const.c	(revision 193591)
@@ -3880,6 +3880,17 @@ make_range_step (location_t loc, enum tr
       return arg0;
 
     case NEGATE_EXPR:
+      /* If flag_wrapv and ARG0_TYPE is signed, make sure
+	 low and high are non-NULL, then normalize will DTRT.
	 */
+      if (!TYPE_UNSIGNED (arg0_type)
+	  && !TYPE_OVERFLOW_UNDEFINED (arg0_type))
+	{
+	  if (low == NULL_TREE)
+	    low = TYPE_MIN_VALUE (arg0_type);
+	  if (high == NULL_TREE)
+	    high = TYPE_MAX_VALUE (arg0_type);
+	}
+
       /* (-x) IN [a,b] => x in [-b, -a]  */
       n_low = range_binop (MINUS_EXPR, exp_type,
			    build_int_cst (exp_type, 0),
--- gcc/testsuite/gcc.dg/pr55236.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr55236.c	(revision 193591)
@@ -0,0 +1,31 @@
+/* PR tree-optimization/55236 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fwrapv" } */
+
+extern void abort ();
+
+__attribute__((noinline, noclone)) void
+foo (int i)
+{
+  if (i > 0)
+    abort ();
+  i = -i;
+  if (i < 0)
+    return;
+  abort ();
+}
+
+__attribute__((noinline, noclone)) void
+bar (int i)
+{
+  if (i > 0 || (-i) >= 0)
+    abort ();
+}
+
+int
+main ()
+{
+  foo (-__INT_MAX__ - 1);
+  bar (-__INT_MAX__ - 1);
+  return 0;
+}

2013-02-01  Jakub Jelinek  ja...@redhat.com

	Backported from mainline
	2012-11-20  Jakub Jelinek  ja...@redhat.com

	PR middle-end/55094
	* builtins.c (expand_builtin_trap): Add REG_ARGS_SIZE note
	on the trap insn for !ACCUMULATE_OUTGOING_ARGS.
	* cfgcleanup.c
Re: [PATCH,x86] Workaround for 55970
On Fri, Feb 1, 2013 at 5:10 AM, Yuri Rumyantsev ysrum...@gmail.com wrote:
> This is a simple fix aimed at helping users port to x86 platforms those
> applications that rely on an order of function argument evaluation.  To
> preserve direct order of argument evaluation they need to add the
> additional option '-mno-push-args' to the compile, which looks like a
> reasonable price for non-conformance with the C/C++ Standards.  I
> checked that adding this option does not affect performance on Core i7
> and Atom platforms.  Note also that '-mpush-args' is on by default on
> all x86 platforms, which means that changing this macro will likely not
> affect almost all gcc users.

If your goal is to preserve the order in which function arguments are evaluated, this patch is not going to be reliable.  It only affects the conversion from GIMPLE to RTL.  The GIMPLE optimizers will have already had plenty of opportunity to change the function argument evaluation order.

I don't think we should change the compiler to generate less efficient code in order to help non-standard-conforming programs when the change won't work reliably anyhow.

Ian
[RFC,PATCH] __cxa_atexit support for AIX (v2)
Richard Stallman has given permission to include code derived from the GNU C Library in libgcc for AIX using the GCC Runtime Exception license.  The updated patch is appended.  The GNU C Library code (cxa_atexit.c, cxa_finalize.c, exit.h) is modified, so I am not exactly certain if my reference to the GNU C Library origin is correct.

This has been bootstrapped on powerpc-ibm-aix7.1.0.0 using --enable-cxa_atexit.

Any comments, especially about the header for the files derived from the GNU C Library?

Thanks, David

libgcc/
	* config.host (powerpc-ibm-aix[56789]): Add t-aix-cxa to tmake_file.
	Add crtcxa to extra_parts.
	* config/rs6000/exit.h: New file.
	* config/rs6000/cxa_atexit.c: New file.
	* config/rs6000/cxa_finalize.c: New file.
	* config/rs6000/crtcxa.c: New file.
	* config/rs6000/t-aix-cxa: New file.
	* config/rs6000/libgcc-aix-cxa.ver: New file.

gcc/
	* config/rs6000/aix61.h (STARTFILE_SPEC): Add crtcxa.

Index: libgcc/config.host
===================================================================
--- libgcc/config.host	(revision 195639)
+++ libgcc/config.host	(working copy)
@@ -899,7 +899,8 @@
 	;;
 rs6000-ibm-aix[56789].* | powerpc-ibm-aix[56789].*)
 	md_unwind_header=rs6000/aix-unwind.h
-	tmake_file="t-fdpbit rs6000/t-ppc64-fp rs6000/t-slibgcc-aix rs6000/t-ibm-ldouble"
+	tmake_file="t-fdpbit rs6000/t-ppc64-fp rs6000/t-slibgcc-aix rs6000/t-ibm-ldouble rs6000/t-aix-cxa"
+	extra_parts="crtcxa.o crtcxa_s.o"
 	;;
 rl78-*-elf)
 	tmake_file="$tm_file t-fdpbit rl78/t-rl78"
Index: libgcc/config/rs6000/exit.h
===================================================================
--- libgcc/config/rs6000/exit.h	(revision 0)
+++ libgcc/config/rs6000/exit.h	(revision 0)
@@ -0,0 +1,92 @@
+/* Copyright (C) 1991-2013 Free Software Foundation, Inc.
+
+Derived from exit.h in GNU C Library.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef _EXIT_H
+#define _EXIT_H 1
+
+#define attribute_hidden
+#define INTDEF(name)
+
+#include <stdbool.h>
+#include <stdint.h>
+
+enum
+{
+  ef_free, /* `ef_free' MUST be zero!  */
+  ef_us,
+  ef_on,
+  ef_at,
+  ef_cxa
+};
+
+struct exit_function
+  {
+    /* `flavour' should be of type of the `enum' above but since we need
+       this element in an atomic operation we have to use `long int'.
+     */
+    long int flavor;
+    union
+      {
+	void (*at) (void);
+	struct
+	  {
+	    void (*fn) (int status, void *arg);
+	    void *arg;
+	  } on;
+	struct
+	  {
+	    void (*fn) (void *arg, int status);
+	    void *arg;
+	    void *dso_handle;
+	  } cxa;
+      } func;
+  };
+
+struct exit_function_list
+  {
+    struct exit_function_list *next;
+    size_t idx;
+    struct exit_function fns[32];
+  };
+
+extern struct exit_function_list *__exit_funcs attribute_hidden;
+extern struct exit_function_list *__quick_exit_funcs attribute_hidden;
+
+extern struct exit_function *__new_exitfn (struct exit_function_list **listp);
+extern uint64_t __new_exitfn_called attribute_hidden;
+
+extern void __run_exit_handlers (int status, struct exit_function_list **listp,
+				 bool run_list_atexit)
+  attribute_hidden __attribute__ ((__noreturn__));
+
+extern int __internal_atexit (void (*func) (void *), void *arg, void *d,
+			      struct exit_function_list **listp)
+  attribute_hidden;
+extern int __cxa_at_quick_exit (void (*func) (void *), void *d);
+
+extern int __cxa_atexit (void (*func) (void *), void *arg, void *d);
+extern int __cxa_atexit_internal (void (*func) (void *), void *arg, void *d)
+  attribute_hidden;
+
+extern void __cxa_finalize (void *d);
+
+#endif /* exit.h  */
Index: libgcc/config/rs6000/cxa_atexit.c
===================================================================
--- libgcc/config/rs6000/cxa_atexit.c	(revision 0)
+++ libgcc/config/rs6000/cxa_atexit.c	(revision 0)
@@ -0,0 +1,130 @@
+/* Copyright (C) 1999-2013 Free Software Foundation, Inc.
+
+Derived from cxa_atexit.c in GNU C Library.
+
+This file
Re: [PATCH] If possible, include range of profile hunk before prologue in .debug_loc ranges (PR debug/54793)
On 01/31/2013 02:02 AM, Jakub Jelinek wrote: 2013-01-31 Jakub Jelinek ja...@redhat.com PR debug/54793 * final.c (need_profile_function): New variable. (final_start_function): Drop ATTRIBUTE_UNUSED from first argument. If first of NOTE_INSN_BASIC_BLOCK or NOTE_INSN_FUNCTION_BEG is only preceded by NOTE_INSN_VAR_LOCATION or NOTE_INSN_DELETED notes, targetm.asm_out.function_prologue doesn't emit anything, HAVE_prologue and profiler should be emitted before prologue, set need_profile_function instead of emitting it. (final_scan_insn): If need_profile_function, emit profile_function on the first NOTE_INSN_BASIC_BLOCK or NOTE_INSN_FUNCTION_BEG note. Ok. r~
Re: [PATCH] Vtable pointer verification, C++ front end changes (patch 1 of 3)
On 01/31/2013 07:24 PM, Caroline Tice wrote: On Wed, Jan 30, 2013 at 9:26 AM, Jason Merrill ja...@redhat.com wrote: @@ -17954,6 +17954,10 @@ mark_class_instantiated (tree t, int ext + if (flag_vtable_verify) +vtv_save_class_info (t); Why do you need this here as well as in finish_struct_1? If we don't have this in both places, then we miss getting vtable pointers for instantiated templates. Why? instantiated templates also go through finish_struct_1. And we only hit this function for explicit instantiations, not implicit. + base_id = DECL_ASSEMBLER_NAME (TREE_CHAIN (base_class)); I think you want TYPE_LINKAGE_IDENTIFIER here. I don't know the difference between DECL_ASSEMBLER_NAME and TYPE_LINKAGE_IDENTIFIER. We are just trying to get the mangled name for the class. Ah, I guess you don't want TYPE_LINKAGE_IDENTIFIER, as that's the simple name rather than the mangled one. But for the external name you always want to look at TYPE_NAME, not TREE_CHAIN (which corresponds to TYPE_STUB_DECL); in the case of an anonymous class that gets a name for linkage purposes from a typedef, the latter will have the original placeholder name, while the former will have the name used in mangling. I don't understand what the qualifier business is trying to accomplish, especially since you never use type_decl_type. You do this in several places, but it should never be necessary; classes don't have any qualifiers. We used to not have the qualifier business, assuming that classes did not have any type qualifiers. This turned out not to be a true assumption. Occasionally we were getting a case where a class had a const qualifier attached to it *sometimes*. Why? You are getting a qualified variant of the class somehow. Where is it coming from? Here you're doing two hash table lookups when one would be enough. 
As written the insert function doesn't return anything to let you know whether the item was already there or not, which we need to know (we use the results here to avoid generating redundant calls to __VLTRegisterPair). I suppose we could modify the insert function to return a boolean indicating if the item was already in the hashtable, and then we could get by with just one call here... Yep, that's what I was thinking. For that matter, you don't need the array, either; you can just use TYPE_UID for a bitmap key and use htab_traverse to iterate over all elements. I don't understand how this would work. I think we need the vec, at least, to have direct access based on TYPE_UID (which is also the vec index). TYPE_UID is already a property of the type, different from the class_uid in your patch. But yes, I guess you do need some way to get from your index back to the type, so never mind. +guess_num_vtable_pointers (struct vtv_graph_node *class_node) I would think it would be better to pass the unrounded count to the library, and let the library decide how to adjust that number for allocation. If there is any computation we can do at compile-time rather than run-time, we would rather do it at compile time. I guess that makes sense. + var_name = ACONCAT (("_ZN4_VTVI", IDENTIFIER_POINTER (base_id), + "E12__vtable_mapE", NULL)); $ c++filt _ZN4_VTVISt13bad_exceptionE12__vtable_mapE _VTV<std::bad_exception>::__vtable_map Interesting. Does this _VTV template appear anywhere else? Even if we stay with this approach to producing the name, I'd like it to happen in a (new) function in mangle.c. +reset_type_qualifiers (unsigned int new_quals, tree type_node) This function is not safe and should be removed; as mentioned above, it shouldn't be needed anyway. As I explained above, we originally didn't have it and then found we really needed it. If you know of a safer or better way to accomplish the same thing we would be happy to hear about it. 
TYPE_MAIN_VARIANT will give you an unqualified variant of any qualified type. Jason
[PATCH][ARM][2/2] Load-acquire, store-release atomics in AArch32 ARMv8
Hi all, This patch adds the tests for the ARMv8 AArch32 implementation of atomics. It refactors some aarch64 tests and reuses them. Ok for trunk or for the next stage 1 (together with part 1 at http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01441.html)? Thanks, Kyrill gcc/testsuite/ChangeLog 2013-01-25 Kyrylo Tkachov kyrylo.tkachov at arm.com * gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Move test body from here... * gcc.target/aarch64/atomic-comp-swap-release-acquire.x: ... to here. * gcc.target/aarch64/atomic-op-acq_rel.c: Move test body from here... * gcc.target/aarch64/atomic-op-acq_rel.x: ... to here. * gcc.target/aarch64/atomic-op-acquire.c: Move test body from here... * gcc.target/aarch64/atomic-op-acquire.x: ... to here. * gcc.target/aarch64/atomic-op-char.c: Move test body from here... * gcc.target/aarch64/atomic-op-char.x: ... to here. * gcc.target/aarch64/atomic-op-consume.c: Move test body from here... * gcc.target/aarch64/atomic-op-consume.x: ... to here. * gcc.target/aarch64/atomic-op-int.c: Move test body from here... * gcc.target/aarch64/atomic-op-int.x: ... to here. * gcc.target/aarch64/atomic-op-relaxed.c: Move test body from here... * gcc.target/aarch64/atomic-op-relaxed.x: ... to here. * gcc.target/aarch64/atomic-op-release.c: Move test body from here... * gcc.target/aarch64/atomic-op-release.x: ... to here. * gcc.target/aarch64/atomic-op-seq_cst.c: Move test body from here... * gcc.target/aarch64/atomic-op-seq_cst.x: ... to here. * gcc.target/aarch64/atomic-op-short.c: Move test body from here... * gcc.target/aarch64/atomic-op-short.x: ... to here. * gcc.target/arm/atomic-comp-swap-release-acquire.c: New test. * gcc.target/arm/atomic-op-acq_rel.c: Likewise. * gcc.target/arm/atomic-op-acquire.c: Likewise. * gcc.target/arm/atomic-op-char.c: Likewise. * gcc.target/arm/atomic-op-consume.c: Likewise. * gcc.target/arm/atomic-op-int.c: Likewise. * gcc.target/arm/atomic-op-relaxed.c: Likewise. * gcc.target/arm/atomic-op-release.c: Likewise. 
* gcc.target/arm/atomic-op-seq_cst.c: Likewise. * gcc.target/arm/atomic-op-short.c: Likewise.
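The refactoring pattern behind this ChangeLog is simple: the shared function bodies move into a `.x` file carrying no dg- directives, and each target-specific `.c` wrapper supplies its own options and scan patterns around an `#include`. A hypothetical arm wrapper might look like the sketch below — the include path and the scan patterns are assumptions for illustration, not the committed test:

```c
/* { dg-do compile } */
/* { dg-options "-O2" } */

/* Reuse the target-independent test body shared with aarch64.  */
#include "../aarch64/atomic-op-int.x"

/* ldaex/stlex are the AArch32 ARMv8 load-acquire/store-release
   exclusives; the real test's regex and count may differ.  */
/* { dg-final { scan-assembler "ldaex" } } */
/* { dg-final { scan-assembler "stlex" } } */
```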
RE: [PATCH][ARM][2/2] Load-acquire, store-release atomics in AArch32 ARMv8
Ummm... forgot the patch, sorry! -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Kyrylo Tkachov Sent: 01 February 2013 17:37 To: gcc-patches@gcc.gnu.org Cc: Ramana Radhakrishnan; Richard Earnshaw; Marcus Shawcroft Subject: [PATCH][ARM][2/2] Load-acquire, store-release atomics in AArch32 ARMv8 Hi all, This patch adds the tests for the ARMv8 AArch32 implementation of atomics. It refactors some aarch64 tests and reuses them. Ok for trunk or for the next stage 1 (together with part 1 at http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01441.html)? Thanks, Kyrill gcc/testsuite/ChangeLog 2013-01-25 Kyrylo Tkachov kyrylo.tkachov at arm.com * gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Move test body from here... * gcc.target/aarch64/atomic-comp-swap-release-acquire.x: ... to here. * gcc.target/aarch64/atomic-op-acq_rel.c: Move test body from here... * gcc.target/aarch64/atomic-op-acq_rel.x: ... to here. * gcc.target/aarch64/atomic-op-acquire.c: Move test body from here... * gcc.target/aarch64/atomic-op-acquire.x: ... to here. * gcc.target/aarch64/atomic-op-char.c: Move test body from here... * gcc.target/aarch64/atomic-op-char.x: ... to here. * gcc.target/aarch64/atomic-op-consume.c: Move test body from here... * gcc.target/aarch64/atomic-op-consume.x: ... to here. * gcc.target/aarch64/atomic-op-int.c: Move test body from here... * gcc.target/aarch64/atomic-op-int.x: ... to here. * gcc.target/aarch64/atomic-op-relaxed.c: Move test body from here... * gcc.target/aarch64/atomic-op-relaxed.x: ... to here. * gcc.target/aarch64/atomic-op-release.c: Move test body from here... * gcc.target/aarch64/atomic-op-release.x: ... to here. * gcc.target/aarch64/atomic-op-seq_cst.c: Move test body from here... * gcc.target/aarch64/atomic-op-seq_cst.x: ... to here. * gcc.target/aarch64/atomic-op-short.c: Move test body from here... * gcc.target/aarch64/atomic-op-short.x: ... to here. 
* gcc.target/arm/atomic-comp-swap-release-acquire.c: New test. * gcc.target/arm/atomic-op-acq_rel.c: Likewise. * gcc.target/arm/atomic-op-acquire.c: Likewise. * gcc.target/arm/atomic-op-char.c: Likewise. * gcc.target/arm/atomic-op-consume.c: Likewise. * gcc.target/arm/atomic-op-int.c: Likewise. * gcc.target/arm/atomic-op-relaxed.c: Likewise. * gcc.target/arm/atomic-op-release.c: Likewise. * gcc.target/arm/atomic-op-seq_cst.c: Likewise. * gcc.target/arm/atomic-op-short.c: Likewise. diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c index 1492e25..9785bca 100644 --- a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c +++ b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c @@ -1,41 +1,7 @@ /* { dg-do compile } */ /* { dg-options "-O2" } */ -#define STRONG 0 -#define WEAK 1 -int v = 0; - -int -atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (int a, int b) -{ - return __atomic_compare_exchange (&v, &a, &b, - STRONG, __ATOMIC_RELEASE, - __ATOMIC_ACQUIRE); -} - -int -atomic_compare_exchange_WEAK_RELEASE_ACQUIRE (int a, int b) -{ - return __atomic_compare_exchange (&v, &a, &b, - WEAK, __ATOMIC_RELEASE, - __ATOMIC_ACQUIRE); -} - -int -atomic_compare_exchange_n_STRONG_RELEASE_ACQUIRE (int a, int b) -{ - return __atomic_compare_exchange_n (&v, &a, b, - STRONG, __ATOMIC_RELEASE, - __ATOMIC_ACQUIRE); -} - -int -atomic_compare_exchange_n_WEAK_RELEASE_ACQUIRE (int a, int b) -{ - return __atomic_compare_exchange_n (&v, &a, b, - WEAK, __ATOMIC_RELEASE, - __ATOMIC_ACQUIRE); -} +#include "atomic-comp-swap-release-acquire.x" /* { dg-final { scan-assembler-times "ldaxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 4 } } */ /* { dg-final { scan-assembler-times "stlxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 4 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.x b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.x new file mode 100644 
index 000..4403afd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.x @@ -0,0 +1,36 @@ + +#define STRONG 0 +#define WEAK 1 +int v = 0; + +int +atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (int a, int b) +{ + return __atomic_compare_exchange (&v, &a, &b, + STRONG, __ATOMIC_RELEASE, + __ATOMIC_ACQUIRE); +} + +int
[PATCH 1/6] [AArch64-4.7] Fix warning - Initialise generic_tunings.
Hi, This patch moves the various tuning parameter data structures further up config/aarch64/aarch64.c and then uses them to initialise the generic_tunings variable. This mirrors their position on trunk. This fixes the warning: config/aarch64/aarch64.c:129:33: warning: uninitialised const ‘generic_tunings’ is invalid in C++ [-Wc++-compat] Regression tested on aarch64-none-elf with no regressions. OK for aarch64-4.7-branch? Thanks, James --- gcc/ 2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (generic_tunings): Initialise. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 40f438d..59124eb 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -125,8 +125,70 @@ unsigned long aarch64_isa_flags = 0; /* Mask to specify which instruction scheduling options should be used. */ unsigned long aarch64_tune_flags = 0; -/* Tuning models. */ -static const struct tune_params generic_tunings; +/* Tuning parameters. */ + +#if HAVE_DESIGNATED_INITIALIZERS +#define NAMED_PARAM(NAME, VAL) .NAME = (VAL) +#else +#define NAMED_PARAM(NAME, VAL) (VAL) +#endif + +#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007 +__extension__ +#endif +static const struct cpu_rtx_cost_table generic_rtx_cost_table = +{ + NAMED_PARAM (memory_load, COSTS_N_INSNS (1)), + NAMED_PARAM (memory_store, COSTS_N_INSNS (0)), + NAMED_PARAM (register_shift, COSTS_N_INSNS (1)), + NAMED_PARAM (int_divide, COSTS_N_INSNS (6)), + NAMED_PARAM (float_divide, COSTS_N_INSNS (2)), + NAMED_PARAM (double_divide, COSTS_N_INSNS (6)), + NAMED_PARAM (int_multiply, COSTS_N_INSNS (1)), + NAMED_PARAM (int_multiply_extend, COSTS_N_INSNS (1)), + NAMED_PARAM (int_multiply_add, COSTS_N_INSNS (1)), + NAMED_PARAM (int_multiply_extend_add, COSTS_N_INSNS (1)), + NAMED_PARAM (float_multiply, COSTS_N_INSNS (0)), + NAMED_PARAM (double_multiply, COSTS_N_INSNS (1)) +}; + +#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007 +__extension__ +#endif +static const struct 
cpu_addrcost_table generic_addrcost_table = +{ + NAMED_PARAM (pre_modify, 0), + NAMED_PARAM (post_modify, 0), + NAMED_PARAM (register_offset, 0), + NAMED_PARAM (register_extend, 0), + NAMED_PARAM (imm_offset, 0) +}; + +#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007 +__extension__ +#endif +static const struct cpu_regmove_cost generic_regmove_cost = +{ + NAMED_PARAM (GP2GP, 1), + NAMED_PARAM (GP2FP, 2), + NAMED_PARAM (FP2GP, 2), + /* We currently do not provide direct support for TFmode Q-Q move. + Therefore we need to raise the cost above 2 in order to have + reload handle the situation. */ + NAMED_PARAM (FP2FP, 4) +}; + +#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007 +__extension__ +#endif + +static const struct tune_params generic_tunings = +{ + generic_rtx_cost_table, + generic_addrcost_table, + generic_regmove_cost, + NAMED_PARAM (memmov_cost, 4) +}; /* A processor implementing AArch64. */ struct processor @@ -4504,71 +4566,6 @@ aarch64_memory_move_cost (enum machine_mode mode ATTRIBUTE_UNUSED, static void initialize_aarch64_code_model (void); -/* Tuning parameters. 
*/ - -#if HAVE_DESIGNATED_INITIALIZERS -#define NAMED_PARAM(NAME, VAL) .NAME = (VAL) -#else -#define NAMED_PARAM(NAME, VAL) (VAL) -#endif - -#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007 -__extension__ -#endif -static const struct cpu_rtx_cost_table generic_rtx_cost_table = -{ - NAMED_PARAM (memory_load, COSTS_N_INSNS (1)), - NAMED_PARAM (memory_store, COSTS_N_INSNS (0)), - NAMED_PARAM (register_shift, COSTS_N_INSNS (1)), - NAMED_PARAM (int_divide, COSTS_N_INSNS (6)), - NAMED_PARAM (float_divide, COSTS_N_INSNS (2)), - NAMED_PARAM (double_divide, COSTS_N_INSNS (6)), - NAMED_PARAM (int_multiply, COSTS_N_INSNS (1)), - NAMED_PARAM (int_multiply_extend, COSTS_N_INSNS (1)), - NAMED_PARAM (int_multiply_add, COSTS_N_INSNS (1)), - NAMED_PARAM (int_multiply_extend_add, COSTS_N_INSNS (1)), - NAMED_PARAM (float_multiply, COSTS_N_INSNS (0)), - NAMED_PARAM (double_multiply, COSTS_N_INSNS (1)) -}; - -#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007 -__extension__ -#endif -static const struct cpu_addrcost_table generic_addrcost_table = -{ - NAMED_PARAM (pre_modify, 0), - NAMED_PARAM (post_modify, 0), - NAMED_PARAM (register_offset, 0), - NAMED_PARAM (register_extend, 0), - NAMED_PARAM (imm_offset, 0) -}; - -#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007 -__extension__ -#endif -static const struct cpu_regmove_cost generic_regmove_cost = -{ - NAMED_PARAM (GP2GP, 1), - NAMED_PARAM (GP2FP, 2), - NAMED_PARAM (FP2GP, 2), - /* We currently do not provide direct support for TFmode Q-Q move. - Therefore we need to raise the cost above 2 in order to have - reload handle the situation. */ - NAMED_PARAM (FP2FP, 4) -}; - -#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007 -__extension__ -#endif -static const struct tune_params generic_tunings = -{ -
[Patch 0/6][AArch64-4.7] Fix warnings.
Hi, This patch series fixes a number of warnings in the AArch64 port on the aarch64-4.7-branch. The warnings fixed are: --- [AArch64-4.7] Fix warning - Initialise generic_tunings. config/aarch64/aarch64.c:129:33: warning: uninitialised const ‘generic_tunings’ is invalid in C++ [-Wc++-compat] --- [AArch64-4.7] Fix warning - aarch64_add_constant mixed code and declarations. config/aarch64/aarch64.c: In function ‘aarch64_add_constant’: config/aarch64/aarch64.c:2249:4: warning: ISO C90 forbids mixed declarations and code [-pedantic] --- [AArch64-4.7] Fix warning - aarch64_legitimize_reload_address passes the wrong type to push_reload. config/aarch64/aarch64.c: In function ‘aarch64_legitimize_reload_address’: config/aarch64/aarch64.c:3641:6: warning: enum conversion when passing argument 11 of ‘push_reload’ is invalid in C++ [-Wc++-compat] --- [AArch64-4.7] Fix warning - aarch64_trampoline_init passes the wrong type to emit_library_call. config/aarch64/aarch64.c: In function ‘aarch64_trampoline_init’: config/aarch64/aarch64.c:3893:8: warning: enum conversion when passing argument 2 of ‘emit_library_call’ is invalid in C++ [-Wc++-compat] --- [AArch64-4.7] Fix warning - Mixed code and declarations in aarch64_simd_const_bounds. config/aarch64/aarch64.c: In function ‘aarch64_simd_const_bounds’: config/aarch64/aarch64.c:6412:3: warning: ISO C90 forbids mixed declarations and code [-pedantic] --- [AArch64-4.7] Backport: Fix warning in aarch64.md config/aarch64/aarch64.md:840: warning: source missing a mode? --- The patch series as a whole has been regression tested against aarch64-none-elf with no regressions. Are these patches OK to commit to aarch64-4.7-branch? Thanks, James Greenhalgh --- gcc/ 2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (generic_tunings): Initialise. 2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (aarch64_add_constant): Move declaration of 'shift' above code. 
2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (aarch64_legitimize_reload_address): Cast 'type' before passing to push_reload. 2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (aarch64_trampoline_init): Pass 'LCT_NORMAL' rather than '0' to emit_library_call. 2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (aarch64_simd_const_bounds): Move declaration of 'lane' above code. 2013-02-01 James Greenhalgh james.greenha...@arm.com Backport from mainline. 2012-12-18 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.md (insv_immmode): Add modes for source operands.
[PATCH 2/6] [AArch64-4.7] Fix warning - aarch64_add_constant mixed code and declarations.
Hi, In config/aarch64/aarch64.c::aarch64_add_constant `shift' was declared after we started writing code. C90 doesn't like this, so split the declaration and the assignment. This fixes the warning: config/aarch64/aarch64.c: In function ‘aarch64_add_constant’: config/aarch64/aarch64.c:2249:4: warning: ISO C90 forbids mixed declarations and code [-pedantic] Regression tested on aarch64-none-elf with no regressions. OK for aarch64-4.7-branch? Thanks, James --- gcc/ 2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (aarch64_add_constant): Move declaration of 'shift' above code. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 59124eb..62d0a12 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -2307,8 +2307,9 @@ aarch64_add_constant (int regnum, int scratchreg, HOST_WIDE_INT delta) { if (mdelta >= 4096) { + rtx shift; emit_insn (gen_rtx_SET (Pmode, scratch_rtx, GEN_INT (mdelta / 4096))); - rtx shift = gen_rtx_ASHIFT (Pmode, scratch_rtx, GEN_INT (12)); + shift = gen_rtx_ASHIFT (Pmode, scratch_rtx, GEN_INT (12)); if (delta < 0) emit_insn (gen_rtx_SET (Pmode, this_rtx, gen_rtx_MINUS (Pmode, this_rtx, shift)));
[PATCH 3/6] [AArch64-4.7] Fix warning - aarch64_legitimize_reload_address passes the wrong type to push_reload.
Hi, push_reload takes an `enum reload_type' as its final argument. On trunk we just cast the int we have to the correct type, so we do that here to mirror trunk and correct the warning. We can't fix this by changing the type of the argument we take as we would then need to forward declare the enum when giving the prototype, which is illegal. This fixes the warning: config/aarch64/aarch64.c: In function ‘aarch64_legitimize_reload_address’: config/aarch64/aarch64.c:3641:6: warning: enum conversion when passing argument 11 of ‘push_reload’ is invalid in C++ [-Wc++-compat] Regression tested aarch64-none-elf with no regressions. OK for aarch64-4.7-branch? Thanks, James --- gcc/ 2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (aarch64_legitimize_reload_address): Cast 'type' before passing to push_reload. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 62d0a12..fef2983 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -3701,7 +3701,7 @@ aarch64_legitimize_reload_address (rtx *x_p, x = copy_rtx (x); push_reload (orig_rtx, NULL_RTX, x_p, NULL, BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, - opnum, type); + opnum, (enum reload_type) type); return x; } @@ -3714,7 +3714,7 @@ aarch64_legitimize_reload_address (rtx *x_p, { push_reload (XEXP (x, 0), NULL_RTX, XEXP (x, 0), NULL, BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0, - opnum, type); + opnum, (enum reload_type) type); return x; } @@ -3778,7 +3778,7 @@ aarch64_legitimize_reload_address (rtx *x_p, push_reload (XEXP (x, 0), NULL_RTX, XEXP (x, 0), NULL, BASE_REG_CLASS, Pmode, VOIDmode, 0, 0, - opnum, type); + opnum, (enum reload_type) type); return x; }
Re: [ARM] Turning off 64bits ops in Neon and gfortran/modulo-scheduling problem
Here is a new version of my patch, with the cleanup you requested. 2012-12-18 Christophe Lyon christophe.l...@linaro.org gcc/ * config/arm/arm-protos.h (tune_params): Add prefer_neon_for_64bits field. * config/arm/arm.c (prefer_neon_for_64bits): New variable. (arm_slowmul_tune): Default prefer_neon_for_64bits to false. (arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune): Ditto. (arm_9e_tune, arm_v6t2_tune, arm_cortex_tune): Ditto. (arm_cortex_a5_tune, arm_cortex_a15_tune): Ditto. (arm_cortex_a9_tune, arm_fa726te_tune): Ditto. (arm_option_override): Handle -mneon-for-64bits new option. * config/arm/arm.h (TARGET_PREFER_NEON_64BITS): New macro. (prefer_neon_for_64bits): Declare new variable. * config/arm/arm.md (arch): Rename neon_onlya8 and neon_nota8 to avoid_neon_for_64bits and neon_for_64bits. Remove onlya8 and nota8. (arch_enabled): Handle new arch types. Remove support for onlya8 and nota8. (one_cmpldi2): Use new arch names. * config/arm/arm.opt (mneon-for-64bits): Add option. * config/arm/neon.md (adddi3_neon, subdi3_neon, iordi3_neon) (anddi3_neon, xordi3_neon, ashldi3_neon, shiftdi3_neon): Use neon_for_64bits instead of nota8 and avoid_neon_for_64bits instead of onlya8. * doc/invoke.texi (-mneon-for-64bits): Document. gcc/testsuite/ * gcc.target/arm/neon-for-64bits-1.c: New tests. * gcc.target/arm/neon-for-64bits-2.c: Likewise. Ok for 4.9 stage1 now. regards Ramana
[PATCH 5/6] [AArch64-4.7] Fix warning - Mixed code and declarations in aarch64_simd_const_bounds.
Hi, aarch64_simd_const_bounds declares `lane' after an assert. This patch moves the declaration above the assert. This patch fixes the warning: config/aarch64/aarch64.c: In function ‘aarch64_simd_const_bounds’: config/aarch64/aarch64.c:6412:3: warning: ISO C90 forbids mixed declarations and code [-pedantic] Regression tested on aarch64-none-elf with no regressions. OK for aarch64-4.7-branch? Thanks, James --- gcc/ 2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (aarch64_simd_const_bounds): Move declaration of 'lane' above code. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 434ccd7..a3c482b 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -6406,8 +6406,9 @@ aarch64_simd_lane_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high) void aarch64_simd_const_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high) { + HOST_WIDE_INT lane; gcc_assert (GET_CODE (operand) == CONST_INT); - HOST_WIDE_INT lane = INTVAL (operand); + lane = INTVAL (operand); if (lane < low || lane >= high) error ("constant out of range");
[PATCH 4/6] [AArch64-4.7] Fix warning - aarch64_trampoline_init passes the wrong type to emit_library_call.
Hi, emit_library_call takes an `enum libcall_type' as its second argument. Currently aarch64-4.7-branch passes it an int 0. This patch fixes this, mirroring trunk, by passing LCT_NORMAL instead. This patch fixes the warning: config/aarch64/aarch64.c: In function ‘aarch64_trampoline_init’: config/aarch64/aarch64.c:3893:8: warning: enum conversion when passing argument 2 of ‘emit_library_call’ is invalid in C++ [-Wc++-compat] Regression tested on aarch64-none-elf with no regressions. OK for aarch64-4.7-branch? Thanks, James --- gcc/ 2013-02-01 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.c (aarch64_trampoline_init): Pass 'LCT_NORMAL' rather than '0' to emit_library_call. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index fef2983..434ccd7 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -3952,7 +3952,7 @@ aarch64_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value) gen_clear_cache(). */ a_tramp = XEXP (m_tramp, 0); emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"), - 0, VOIDmode, 2, a_tramp, Pmode, + LCT_NORMAL, VOIDmode, 2, a_tramp, Pmode, plus_constant (a_tramp, TRAMPOLINE_SIZE), Pmode); }
[PATCH 6/6] [AArch64-4.7] Backport: Fix warning in aarch64.md
Hi, This patch is a backport of one approved here: http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01135.html The patch fixes the warning: config/aarch64/aarch64.md:840: warning: source missing a mode? Regression tested with no regressions on aarch64-none-elf. OK for aarch64-4.7-branch? Thanks, James --- gcc/ 2013-02-01 James Greenhalgh james.greenha...@arm.com Backport from mainline. 2012-12-18 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64.md (insv_imm<mode>): Add modes for source operands. diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 6f51469..9bb95e0 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -840,8 +840,8 @@ (define_insn "insv_imm<mode>" [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r") (const_int 16) - (match_operand 1 "const_int_operand" "n")) - (match_operand 2 "const_int_operand" "n"))] + (match_operand:GPI 1 "const_int_operand" "n")) + (match_operand:GPI 2 "const_int_operand" "n"))] INTVAL (operands[1]) < GET_MODE_BITSIZE (<MODE>mode) && INTVAL (operands[1]) % 16 == 0 && INTVAL (operands[2]) <= 0x
Re: Fix for PR55561 race condition in libgomp
LGTM On Thu, Jan 31, 2013 at 8:54 PM, VandeVondele Joost joost.vandevond...@mat.ethz.ch wrote: The updated changelog entry is below, but somebody with write access should do the commit, please. 2013-01-31 Dmitry Vyukov dvyu...@gcc.gnu.org Joost VandeVondele joost.vandevond...@mat.ethz.ch PR libgomp/55561 * config/linux/wait.h (do_spin): Use atomic load for addr. * config/linux/ptrlock.c (gomp_ptrlock_get_slow): Use atomic for intptr and ptrlock. * config/linux/ptrlock.h (gomp_ptrlock_get): Use atomic load for ptrlock.
[PATCH, RFC] GCC 4.9, powerpc, allow TImode in VSX registers
When I did the initial power7 port, I punted on allowing TImode in the VSX registers because I couldn't get it to work. I am now revisiting it, and these patches are my current effort, and I was wondering if people had comments on them. In terms of performance, there are two benchmarks in the Spec 2006 suite that have minor regressions (perlbench and gamess), and 3 that have minor improvements (hmmer, h264ref, and gromacs), so overall it looks like a wash. I do want to look at the regressions, and see if there is something simple to tweak. Some issues I ran into include: I needed to set CANNOT_CHANGE_MODE so that TImode won't overlap with smaller data types, due to the scalar portion of the register being in the upper 64-bits of the VSX register. I limited the available address formats for TImode to be REG+REG needed for VSX instructions. I discovered that setjmp/longjmp and exception handling needed to create TImode values with the STACK_SAVEAREA_MODE macro. However, the implementation of this needs REG+OFFSET addressing. So, I added a new type PTImode, which is only used for STACK_SAVEAREA_MODE, and PTImode is limited to the GPRs. If I enable logical operations in TImode (and, xor, etc.), the compiler will convert DImode logical operations to TImode for 32-bit programs. In the future, I think I will tune this and/or provide insn splitters for DImode logical operations. For now, I just disallow logical operations on TImode if 32-bit. I added a debug switch (-mvsx-timode) to disable putting TImode into VSX registers. 2013-01-31 Michael Meissner meiss...@linux.vnet.ibm.com * config/rs6000/vector.md (mul<mode>3): Use the combined macro VECTOR_UNIT_ALTIVEC_OR_VSX_P instead of separate calls to VECTOR_UNIT_ALTIVEC_P and VECTOR_UNIT_VSX_P. (vcond<mode><mode>): Likewise. (vcondu<mode><mode>): Likewise. (vector_gtu<mode>): Likewise. (vector_gte<mode>): Likewise. 
(xor<mode>3): Don't allow logical operations on TImode in 32-bit to prevent the compiler from converting DImode operations to TImode. (ior<mode>3): Likewise. (and<mode>3): Likewise. (one_cmpl<mode>2): Likewise. (nor<mode>3): Likewise. (andc<mode>3): Likewise. * config/rs6000/constraints.md (wt constraint): New constraint that returns VSX_REGS if TImode is allowed in VSX registers. * config/rs6000/predicates.md (easy_fp_constant): 0.0f is an easy constant under VSX. * config/rs6000/rs6000-modes.def (PTImode): Define, PTImode is similar to TImode, but it is restricted to being in the GPRs. * config/rs6000/rs6000.opt (-mvsx-timode): New switch to allow TImode to occupy a single VSX register. * config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Default to -mvsx-timode for power7/power8. (power7 cpu): Likewise. (power8 cpu): Likewise. * config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Make sure that TFmode/TDmode take up two registers if they are ever allowed in the upper VSX registers. (rs6000_hard_regno_mode_ok): If -mvsx-timode, allow TImode in VSX registers. (rs6000_init_hard_regno_mode_ok): Likewise. (rs6000_debug_reg_global): Add debugging for PTImode and wt constraint. Print if LRA is turned on. (rs6000_option_override_internal): Give an error if -mvsx-timode and VSX is not enabled. (invalid_e500_subreg): Handle PTImode, restricting it to GPRs. If -mvsx-timode, restrict TImode to reg+reg addressing, and PTImode to reg+offset addressing. Use PTImode when checking offset addresses for validity. (reg_offset_addressing_ok_p): Likewise. (rs6000_legitimate_offset_address_p): Likewise. (rs6000_legitimize_address): Likewise. (rs6000_legitimize_reload_address): Likewise. (rs6000_legitimate_address_p): Likewise. (rs6000_eliminate_indexed_memrefs): Likewise. (rs6000_emit_move): Likewise. (rs6000_secondary_reload): Likewise. (rs6000_secondary_reload_inner): Handle PTImode. 
Allow 64-bit reloads to fpr registers to continue to use reg+offset addressing, but 64-bit reloads to altivec registers need reg+reg addressing. Drop test for PRE_MODIFY, since VSX loads/stores no longer support it. Treat LO_SUM like a PLUS operation. (rs6000_secondary_reload_class): If type is 64-bit, prefer to use FLOAT_REGS instead of VSX_REGS to allow use of reg+offset addressing. (rs6000_cannot_change_mode_class): Do not allow TImode in VSX registers to share a register with a smaller sized type, since VSX puts scalars in the upper 64-bits. (print_operand): Add support for PTImode. (rs6000_register_move_cost): Use VECTOR_MEM_VSX_P instead of VECTOR_UNIT_VSX_P to catch types that
Fwd: Re: Export _Prime_rehash_policy symbols
Test successful, so attached patch applied.

2013-02-01  François Dumont  <fdum...@gcc.gnu.org>

	* include/bits/hashtable_policy.h (_Prime_rehash_policy::_M_next_bkt)
	(_Prime_rehash_policy::_M_need_rehash): Move definition...
	* src/c++11/hashtable_c++0x.cc: ... here.
	* src/shared/hashtable-aux.cc: Remove c++config.h include.
	* config/abi/gnu.ver (GLIBCXX_3.4.18): Export _Prime_rehash_policy
	symbols.

François

On 01/30/2013 11:12 AM, Paolo Carlini wrote:
> ... before committing, please double check that we aren't breaking
> --enable-symvers=gnu-versioned-namespace, wouldn't be the first time
> that we do that and we notice only much later. At minimum build with
> it and run the testsuite.
>
> Paolo.

Index: include/bits/hashtable_policy.h
===
--- include/bits/hashtable_policy.h	(revision 195557)
+++ include/bits/hashtable_policy.h	(working copy)
@@ -369,7 +369,8 @@
     // Return a bucket count appropriate for n elements
     std::size_t
-    _M_bkt_for_elements(std::size_t __n) const;
+    _M_bkt_for_elements(std::size_t __n) const
+    { return __builtin_ceil(__n / (long double)_M_max_load_factor); }
 
     // __n_bkt is current bucket count, __n_elt is current element count,
     // and __n_ins is number of elements to be inserted.  Do we need to
@@ -397,77 +398,6 @@
     mutable std::size_t	_M_next_resize;
   };
 
-  extern const unsigned long __prime_list[];
-
-  // XXX This is a hack.  There's no good reason for any of
-  // _Prime_rehash_policy's member functions to be inline.
-
-  // Return a prime no smaller than n.
-  inline std::size_t
-  _Prime_rehash_policy::
-  _M_next_bkt(std::size_t __n) const
-  {
-    // Optimize lookups involving the first elements of __prime_list.
-    // (useful to speed-up, eg, constructors)
-    static const unsigned char __fast_bkt[12]
-      = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
-
-    if (__n <= 11)
-      {
-	_M_next_resize
-	  = __builtin_ceil(__fast_bkt[__n]
-			   * (long double)_M_max_load_factor);
-	return __fast_bkt[__n];
-      }
-
-    const unsigned long* __next_bkt
-      = std::lower_bound(__prime_list + 5, __prime_list + _S_n_primes,
-			 __n);
-    _M_next_resize
-      = __builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
-    return *__next_bkt;
-  }
-
-  // Return the smallest integer p such that alpha p >= n, where alpha
-  // is the load factor.
-  inline std::size_t
-  _Prime_rehash_policy::
-  _M_bkt_for_elements(std::size_t __n) const
-  { return __builtin_ceil(__n / (long double)_M_max_load_factor); }
-
-  // Finds the smallest prime p such that alpha p > __n_elt + __n_ins.
-  // If p > __n_bkt, return make_pair(true, p); otherwise return
-  // make_pair(false, 0).  In principle this isn't very different from
-  // _M_bkt_for_elements.
-
-  // The only tricky part is that we're caching the element count at
-  // which we need to rehash, so we don't have to do a floating-point
-  // multiply for every insertion.
-
-  inline std::pair<bool, std::size_t>
-  _Prime_rehash_policy::
-  _M_need_rehash(std::size_t __n_bkt, std::size_t __n_elt,
-		 std::size_t __n_ins) const
-  {
-    if (__n_elt + __n_ins >= _M_next_resize)
-      {
-	long double __min_bkts = (__n_elt + __n_ins)
-				 / (long double)_M_max_load_factor;
-	if (__min_bkts >= __n_bkt)
-	  return std::make_pair(true,
-	    _M_next_bkt(std::max<std::size_t>(__builtin_floor(__min_bkts) + 1,
-					      __n_bkt * _S_growth_factor)));
-	else
-	  {
-	    _M_next_resize
-	      = __builtin_floor(__n_bkt * (long double)_M_max_load_factor);
-	    return std::make_pair(false, 0);
-	  }
-      }
-    else
-      return std::make_pair(false, 0);
-  }
-
   // Base classes for std::_Hashtable.  We define these base classes
   // because in some cases we want to do different things depending on
   // the value of a policy class.  In some cases the policy class

Index: src/shared/hashtable-aux.cc
===
--- src/shared/hashtable-aux.cc	(revision 195557)
+++ src/shared/hashtable-aux.cc	(working copy)
@@ -1,6 +1,6 @@
 // std::__detail and std::tr1::__detail definitions -*- C++ -*-
 
-// Copyright (C) 2007, 2009, 2011 Free Software Foundation, Inc.
+// Copyright (C) 2007-2013 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -22,8 +22,6 @@
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // <http://www.gnu.org/licenses/>.
 
-#include <bits/c++config.h>
-
 namespace __detail
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION

Index: src/c++11/hashtable_c++0x.cc
===
--- src/c++11/hashtable_c++0x.cc	(revision 195557)
+++ src/c++11/hashtable_c++0x.cc	(working copy)
@@ -1,6 +1,6 @@
 // std::__detail definitions -*- C++ -*-
 
-// Copyright (C) 2007, 2008, 2009,
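For readers following the patch above: the point of `_M_need_rehash` is that it caches the element count at which the next rehash becomes necessary, so ordinary insertions compare two integers instead of doing a floating-point multiply. The following is a minimal sketch of that caching scheme, not the libstdc++ implementation: all names are hypothetical, the prime table is a tiny stand-in, and a growth factor of 2 and default load factor of 1.0 are assumed.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <utility>

// Hypothetical sketch of a prime-bucket rehash policy with a cached
// element-count threshold (in the spirit of _Prime_rehash_policy).
struct prime_rehash_sketch
{
  long double max_load_factor = 1.0L;
  mutable std::size_t next_resize = 0;  // cached: rehash when count reaches this

  // Smallest prime in the (stand-in) table that is >= n.
  std::size_t next_bkt(std::size_t n) const
  {
    static const unsigned long primes[]
      = { 2, 3, 5, 7, 11, 13, 17, 23, 29, 37, 47, 59 };
    const unsigned long* p
      = std::lower_bound(std::begin(primes), std::end(primes), n);
    if (p == std::end(primes))
      --p;  // sketch only: clamp instead of growing the table
    // Cache the element count that will force the next rehash.
    next_resize = (std::size_t)std::ceil(*p * max_load_factor);
    return *p;
  }

  // Do we need to rehash before inserting n_ins more elements?
  std::pair<bool, std::size_t>
  need_rehash(std::size_t n_bkt, std::size_t n_elt, std::size_t n_ins) const
  {
    if (n_elt + n_ins >= next_resize)  // cheap integer check per insert
      {
        long double min_bkts = (n_elt + n_ins) / max_load_factor;
        if (min_bkts >= n_bkt)
          return { true,
                   next_bkt(std::max<std::size_t>(
                     (std::size_t)std::floor(min_bkts) + 1, n_bkt * 2)) };
        // Enough buckets already: just refresh the cached threshold.
        next_resize = (std::size_t)std::floor(n_bkt * max_load_factor);
        return { false, 0 };
      }
    return { false, 0 };
  }
};
```

With load factor 1.0 and 11 buckets, inserting a 12th element forces a rehash; the policy asks for at least max(13, 22) buckets and the stand-in table answers with the next prime.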
[lra] merged with trunk
The branch was merged with trunk @ 195676. The branch was successfully bootstrapped on x86/x86-64. Committed as rev. 195679.
Re: patch to fix PR56144
On 13-01-31 6:36 PM, Steven Bosscher wrote:
> On Wed, Jan 30, 2013 at 6:24 PM, Vladimir Makarov wrote:
>> The following patch fixes
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56144
>> The patch was successfully bootstrapped and tested on x86/x86-64.
>
> Hello Vlad,
>
> Can you please put this patch on the lra-branch too, so that the
> auto-testers can pick it up?

I've merged the branch with trunk instead.
[Google 4.7 Split Dwarf] Use .debug_str for some strings. (issue7241067)
The enclosed patch for Google 4.7 is an optimization for debug strings
under -gsplit-dwarf.  Currently under -gsplit-dwarf, all strings with
DW_FORM_strp end up in the .debug_str.dwo section, which requires any
string not destined for the .dwo to use DW_FORM_string, disallowing any
duplication removal.  With this patch, gcc creates a normal .debug_str
section even under -gsplit-dwarf, and puts any DW_FORM_strp string
destined for the .o file into that section.

Tested with full bootstrap and the gdb test suite.

OK for Google 4.7?  When stage 1 opens again, I expect I will port it
there as well.

Sterling

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c	(revision 195673)
+++ gcc/dwarf2out.c	(working copy)
@@ -166,6 +166,7 @@
 static GTY(()) section *debug_pubnames_section;
 static GTY(()) section *debug_pubtypes_section;
 static GTY(()) section *debug_str_section;
+static GTY(()) section *debug_str_dwo_section;
 static GTY(()) section *debug_str_offsets_section;
 static GTY(()) section *debug_ranges_section;
 static GTY(()) section *debug_frame_section;
@@ -217,6 +218,28 @@
 static GTY ((param_is (struct indirect_string_node))) htab_t debug_str_hash;
 
+/* With split_debug_info, both the comp_dir and dwo_name go in the
+   main object file, rather than the dwo, similar to the force_direct
+   parameter elsewhere but with additional complications:
+
+   1) The string is needed in both the main object file and the dwo.
+   That is, the comp_dir and dwo_name will appear in both places.
+
+   2) Strings can use three forms: DW_FORM_string, DW_FORM_strp or
+   DW_FORM_GNU_str_index.
+
+   3) GCC chooses the form to use late, depending on the size and
+   reference count.
+
+   Rather than forcing all the debug string handling functions and
+   callers to deal with these complications, simply use a separate,
+   special-cased string table for any attribute that should go in the
+   main object file.  This limits the complexity to just the places
+   that need it.  */
+
+static GTY ((param_is (struct indirect_string_node)))
+  htab_t skeleton_debug_str_hash;
+
 static GTY(()) int dw2_string_counter;
 
 /* True if the compilation unit places functions in more than one
    section.  */
@@ -3593,6 +3616,8 @@
 static void schedule_generic_params_dies_gen (tree t);
 static void gen_scheduled_generic_parms_dies (void);
 
+static const char *comp_dir_string (void);
+
 /* enum for tracking thread-local variables whose address is really an offset
    relative to the TLS pointer, which will need link-time relocation, but will
    not need relocation by the DWARF consumer.  */
@@ -3710,11 +3735,11 @@
   (!dwarf_split_debug_info \
    ? (DEBUG_NORM_STR_OFFSETS_SECTION) : (DEBUG_DWO_STR_OFFSETS_SECTION))
 #endif
-#define DEBUG_DWO_STR_SECTION   ".debug_str.dwo"
-#define DEBUG_NORM_STR_SECTION  ".debug_str"
+#ifndef DEBUG_STR_DWO_SECTION
+#define DEBUG_STR_DWO_SECTION  ".debug_str.dwo"
+#endif
 #ifndef DEBUG_STR_SECTION
-#define DEBUG_STR_SECTION \
-  (!dwarf_split_debug_info ? (DEBUG_NORM_STR_SECTION) : (DEBUG_DWO_STR_SECTION))
+#define DEBUG_STR_SECTION  ".debug_str"
 #endif
 #ifndef DEBUG_RANGES_SECTION
 #define DEBUG_RANGES_SECTION	".debug_ranges"
@@ -3726,17 +3751,18 @@
 #endif
 
 /* Section flags for .debug_macinfo/.debug_macro section.  */
-#define DEBUG_MACRO_SECTION_FLAGS \
+#define DEBUG_MACRO_SECTION_FLAGS                        \
   (dwarf_split_debug_info ? SECTION_DEBUG | SECTION_EXCLUDE : SECTION_DEBUG)
 
 /* Section flags for .debug_str section.  */
-#define DEBUG_STR_SECTION_FLAGS \
-  (dwarf_split_debug_info \
-   ? SECTION_DEBUG | SECTION_EXCLUDE \
-   : (HAVE_GAS_SHF_MERGE && flag_merge_debug_strings \
-      ? SECTION_DEBUG | SECTION_MERGE | SECTION_STRINGS | 1 \
-      : SECTION_DEBUG))
+#define DEBUG_STR_SECTION_FLAGS                          \
+  (HAVE_GAS_SHF_MERGE && flag_merge_debug_strings        \
+   ? SECTION_DEBUG | SECTION_MERGE | SECTION_STRINGS | 1 \
+   : SECTION_DEBUG)
 
+/* Section flags for .debug_str.dwo section.  */
+#define DEBUG_STR_DWO_SECTION_FLAGS  (SECTION_DEBUG | SECTION_EXCLUDE)
+
 /* Labels we insert at beginning sections we can reference instead of the
    section names themselves.  */
@@ -4658,19 +4684,15 @@
 	  (const char *)x2) == 0;
 }
 
-/* Add STR to the indirect string hash table.  */
+/* Add STR to the given string hash table.  */
 
 static struct indirect_string_node *
-find_AT_string (const char *str)
+find_AT_string_in_table (const char *str, htab_t table)
 {
   struct indirect_string_node *node;
   void **slot;
 
-  if (! debug_str_hash)
-    debug_str_hash = htab_create_ggc (10, debug_str_do_hash,
-				      debug_str_eq, NULL);
-
-  slot = htab_find_slot_with_hash (debug_str_hash, str,
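The payoff of keeping a real .debug_str section is that DW_FORM_strp references strings by offset, so each distinct string is emitted once no matter how often it is referenced. A minimal sketch of such a deduplicating, offset-addressed string section follows; the class and method names are hypothetical and this is not GCC's htab-based code, just an illustration of the duplication removal the patch enables.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical sketch of an offset-addressed string section in the
// spirit of .debug_str: each distinct string is stored once, and
// consumers reference it by its byte offset (as DW_FORM_strp does).
class str_section
{
  std::string data_;                                  // raw section bytes
  std::unordered_map<std::string, std::size_t> off_;  // string -> offset
public:
  // Return the offset of s, appending it only if not seen before.
  std::size_t intern(const std::string& s)
  {
    auto it = off_.find(s);
    if (it != off_.end())
      return it->second;       // duplicate: reuse the existing offset
    std::size_t off = data_.size();
    data_.append(s);
    data_.push_back('\0');     // section strings are NUL-terminated
    off_.emplace(s, off);
    return off;
  }
  std::size_t size() const { return data_.size(); }
};
```

Interning the same string twice yields the same offset and adds no bytes, which is exactly the merging that DW_FORM_string (inline strings) cannot provide.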