Re: PR middle-end/52141: ICE due to asm statement
On Thu, Feb 16, 2012 at 7:26 PM, Aldy Hernandez al...@redhat.com wrote:

On 02/16/12 12:04, Jakub Jelinek wrote:

On Thu, Feb 16, 2012 at 11:46:33AM -0600, Aldy Hernandez wrote:

#GOOD
houston:/build/t2/gcc$ ./cc1 a.c -fgnu-tm -O0 -quiet -w
In function 'asmfunc', inlined from 'f' at a.c:13:10:
a.c:7:3: error: asm not allowed in 'transaction_safe' function

#BAD
houston:/build/t2/gcc$ ./cc1 a.c -fgnu-tm -O1 -quiet -w
a.c: In function 'f':
a.c:7:3: error: asm not allowed in 'transaction_safe' function
houston:/build/t2/gcc$

Even with -O1 -g?  With -g0 we prune BLOCKs, so the backtraces don't work well.

Well, with -O1 -g the backtrace works for this particular error.  But with -O2 -g we trigger another error, "asm not allowed in atomic transaction", and no amount of %K fiddling will get me a backtrace.  But even so, it doesn't seem right to expect users to get a correct error message only with -g, does it?

Well, you can't expect diagnostics to be perfect with inlining in place.  But you can try improving things by looking at what blocks tree-ssa-live.c prunes.  Disabling early inlining for TM is a hack; I don't think you want to make people pay the optimization penalty just because of asm diagnostics.

Richard.
Re: PR middle-end/52141: ICE due to asm statement
On Mon, Feb 20, 2012 at 10:56:30AM +0100, Richard Guenther wrote: Disabling early inlining for TM is a hack, I don't think you want to make people pay the optimizing penalty just because of asm diagnostics. Yeah, it is unfortunate that without -g the diagnostic will be less verbose, but even from the -g0 diagnostics IMHO anybody can figure that issue out. Jakub
Re: [PATCH] Fix (X C1) | C2 folding (PR tree-optimization/52286)
On Fri, Feb 17, 2012 at 2:50 PM, Jakub Jelinek ja...@redhat.com wrote:

Hi!

The following testcase is miscompiled (by CCP, or, with -fno-tree-ccp, by VRP), because the (X & C1) | C2 folding changes the C1 constant into an integer constant with the most significant bit set, but not sign-extended into the double_int. Fixed by using double_int_to_tree, which extends it properly, instead of build_int_cst_wide. Instead of creating a double_int from the hi3/lo3 pair I've changed the code to actually work on double_ints.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.  Thanks, Richard.

2012-02-17  Jakub Jelinek  ja...@redhat.com

	PR tree-optimization/52286
	* fold-const.c (fold_binary_loc): For (X & C1) | C2 optimization
	use double_int_to_tree instead of build_int_cst_wide, rewrite
	to use double_int vars.

	* gcc.c-torture/execute/pr52286.c: New test.

--- gcc/fold-const.c.jj	2012-02-16 20:04:34.0 +0100
+++ gcc/fold-const.c	2012-02-17 12:05:46.559700551 +0100
@@ -10959,66 +10959,50 @@ fold_binary_loc (location_t loc,
 	  && TREE_CODE (arg1) == INTEGER_CST
 	  && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST)
 	{
-	  unsigned HOST_WIDE_INT hi1, lo1, hi2, lo2, hi3, lo3, mlo, mhi;
+	  double_int c1, c2, c3, msk;
 	  int width = TYPE_PRECISION (type), w;
-	  hi1 = TREE_INT_CST_HIGH (TREE_OPERAND (arg0, 1));
-	  lo1 = TREE_INT_CST_LOW (TREE_OPERAND (arg0, 1));
-	  hi2 = TREE_INT_CST_HIGH (arg1);
-	  lo2 = TREE_INT_CST_LOW (arg1);
+	  c1 = tree_to_double_int (TREE_OPERAND (arg0, 1));
+	  c2 = tree_to_double_int (arg1);
 
 	  /* If (C1&C2) == C1, then (X&C1)|C2 becomes (X,C2).  */
-	  if ((hi1 & hi2) == hi1 && (lo1 & lo2) == lo1)
+	  if (double_int_equal_p (double_int_and (c1, c2), c1))
 	    return omit_one_operand_loc (loc, type, arg1,
-					 TREE_OPERAND (arg0, 0));
+					 TREE_OPERAND (arg0, 0));
 
-	  if (width > HOST_BITS_PER_WIDE_INT)
-	    {
-	      mhi = (unsigned HOST_WIDE_INT) -1
-		    >> (2 * HOST_BITS_PER_WIDE_INT - width);
-	      mlo = -1;
-	    }
-	  else
-	    {
-	      mhi = 0;
-	      mlo = (unsigned HOST_WIDE_INT) -1
-		    >> (HOST_BITS_PER_WIDE_INT - width);
-	    }
+	  msk = double_int_mask (width);
 
 	  /* If (C1|C2) == ~0 then (X&C1)|C2 becomes X|C2.  */
-	  if ((~(hi1 | hi2) & mhi) == 0 && (~(lo1 | lo2) & mlo) == 0)
+	  if (double_int_zero_p (double_int_and_not (msk,
+						     double_int_ior (c1, c2))))
 	    return fold_build2_loc (loc, BIT_IOR_EXPR, type,
-				    TREE_OPERAND (arg0, 0), arg1);
+				    TREE_OPERAND (arg0, 0), arg1);
 
 	  /* Minimize the number of bits set in C1, i.e. C1 := C1 & ~C2,
	     unless (C1 & ~C2) | (C2 & C3) for some C3 is a mask of some
	     mode which allows further optimizations.  */
-	  hi1 &= mhi;
-	  lo1 &= mlo;
-	  hi2 &= mhi;
-	  lo2 &= mlo;
-	  hi3 = hi1 & ~hi2;
-	  lo3 = lo1 & ~lo2;
+	  c1 = double_int_and (c1, msk);
+	  c2 = double_int_and (c2, msk);
+	  c3 = double_int_and_not (c1, c2);
 	  for (w = BITS_PER_UNIT; w <= width && w <= HOST_BITS_PER_WIDE_INT;
	       w <<= 1)
	    {
	      unsigned HOST_WIDE_INT mask
		= (unsigned HOST_WIDE_INT) -1 >> (HOST_BITS_PER_WIDE_INT - w);
-	      if (((lo1 | lo2) & mask) == mask
-		  && (lo1 & ~mask) == 0 && hi1 == 0)
+	      if (((c1.low | c2.low) & mask) == mask
+		  && (c1.low & ~mask) == 0 && c1.high == 0)
		{
-		  hi3 = 0;
-		  lo3 = mask;
+		  c3 = uhwi_to_double_int (mask);
		  break;
		}
	    }
-	  if (hi3 != hi1 || lo3 != lo1)
+	  if (!double_int_equal_p (c3, c1))
	    return fold_build2_loc (loc, BIT_IOR_EXPR, type,
-				    fold_build2_loc (loc, BIT_AND_EXPR, type,
-						     TREE_OPERAND (arg0, 0),
-						     build_int_cst_wide (type,
-									 lo3, hi3)),
-				    arg1);
+				    fold_build2_loc (loc, BIT_AND_EXPR, type,
+						     TREE_OPERAND (arg0, 0),
+						     double_int_to_tree (type,
+									 c3)),
+				    arg1);
	}
 
       /* (X & Y) | Y is (X, Y).  */
---
Re: [PATCH] Fix up --enable-initfini-array autodetection in configure (PR bootstrap/50237)
On Fri, Feb 17, 2012 at 2:51 PM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

Richard Guenther richard.guent...@gmail.com writes:

I'm not sure about the varasm.c change - it's definitely not a no-op (callback will be not set, and the flags will be different).

As I've demonstrated in my response to H.J., the effect with gas is none.

Certainly the current code is inconsistent wrt the priority != DEFAULT_INIT_PRIORITY case, not sure why, but you don't make it consistent either because you don't change that case to SECTION_NOTYPE either.

I didn't find a need for that, but agree that the inconsistency is disturbing. default_section_type_flags suggests that SECTION_NOTYPE should be set for .init_array and friends, but it's only used by get_named_section.

I'd be fine with it with both calls using SECTION_NOTYPE, but would like to know what the callback difference is about.

I don't think the callback is needed; it is effectively a printf ("\t.section\t.init_array"). Looking at it, this is e.g. wrong for Sun as on SPARC, which requires the section name to be in double quotes.

Here's the revised patch which consistently sets SECTION_NOTYPE. Bootstrapped without regressions on i386-pc-solaris2.11 with gas/gld and x86_64-unknown-linux-gnu, ok for mainline?

Ok.  Thanks, Richard.

Rainer

2012-01-20  Rainer Orth  r...@cebitec.uni-bielefeld.de

	PR target/50166
	* acinclude.m4 (gcc_AC_INITFINI_ARRAY): Require gcc_SUN_LD_VERSION.
	Define _start.
	Remove -e 0 from $gcc_cv_ld invocation.
	Only use __GLIBC_PREREQ if defined.
	Enable on Solaris since Solaris 8 patch.
	(gcc_SUN_LD_VERSION): New macro.
	* configure.ac (ld_ver) <*-*-solaris2*>: Refer to gcc_SUN_LD_VERSION
	for version number format.
	* configure: Regenerate.
	* varasm.c (get_elf_initfini_array_priority_section): Set
	SECTION_NOTYPE for non-default priority.
	Use get_section instead of get_unnamed_section to emit
	.init_array/.fini_array with default priority.
-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PR52001] too many cse reverse equiv exprs (take2)
On Fri, Feb 17, 2012 at 02:01:36AM -0200, Alexandre Oliva wrote: for gcc/ChangeLog from Alexandre Oliva aol...@redhat.com PR debug/52001 * cselib.c (preserve_only_constants): Rename to... (preserve_constants_and_equivs): ... this. Split out... (invariant_or_equiv_p): ... this. Preserve plus expressions of other preserved expressions too. (cselib_reset_table): Adjust. * var-tracking.c (reverse_op): Use canonical value to build reverse operation. This patch is ok for the trunk, provided testing was successful. Jakub
Re: Continue strict-volatile-bitfields fixes
On Fri, Feb 17, 2012 at 9:51 PM, Thomas Schwinge tho...@codesourcery.com wrote:

Hi!

How do we move this issue forward?

On Mon, 23 Jan 2012 15:46:34 +0100, Bernd Schmidt ber...@codesourcery.com wrote:

On 11/29/2011 05:35 PM, Mitchell, Mark wrote:

So, I still think this patch is the best way to go forward, and it does fix incorrect code generation. Would appreciate an OK.

Ping.

If you don't hear any objections within a week, please proceed.

That was committed a while ago. The part in stor-layout.c that stops us from promoting bitfields to normal fields apparently caused some testsuite regressions in sh tests, where some optimization dump scans show that we can't perform the optimizations if there are BIT_FIELD_REFs rather than a normal member access. The testcases use things like "enum something field:8;" and I think applying strict-volatile-bitfields to enums is probably meaningless. Should we adjust semantics so that the compiler is free to optimize enum bitfields? The patch would look something like the below. Thomas has tested this on arm and sh with our internal tree.

Bernd

Index: gcc/stor-layout.c
===================================================================
--- gcc/stor-layout.c	(revision 355696)
+++ gcc/stor-layout.c	(working copy)
@@ -665,8 +665,7 @@
 	 may make a volatile object later.  */
       if (TYPE_SIZE (type) != 0
 	  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
-	  && GET_MODE_CLASS (TYPE_MODE (type)) == MODE_INT
-	  && flag_strict_volatile_bitfields <= 0)
+	  && GET_MODE_CLASS (TYPE_MODE (type)) == MODE_INT)
 	{
 	  enum machine_mode xmode
 	    = mode_for_size_tree (DECL_SIZE (decl), MODE_INT, 1);
@@ -674,7 +673,12 @@
 	  if (xmode != BLKmode
 	      && !(xalign > BITS_PER_UNIT && DECL_PACKED (decl))
-	      && (known_align == 0 || known_align >= xalign))
+	      && (known_align == 0 || known_align >= xalign)
+	      && (flag_strict_volatile_bitfields <= 0
+		  /* Same size makes strict volatile bitfields meaningless.  */
+		  || GET_MODE_SIZE (xmode) == GET_MODE_SIZE (TYPE_MODE (type))
+		  /* Strict volatile bitfields shouldn't apply to enums.  */
+		  || TREE_CODE (type) == ENUMERAL_TYPE))

What about BOOLEAN_TYPE bitfields? Thus, why not explicitly spell out TREE_CODE (type) == INTEGER_TYPE?

 	    {
 	      DECL_ALIGN (decl) = MAX (xalign, DECL_ALIGN (decl));
 	      DECL_MODE (decl) = xmode;

Grüße,
 Thomas
RE: [Ping] RE: CR16 Port addition
Hello Gerald, Thank you for reviewing the earlier patch. PFA, the patch modified as per your suggestion. I have also attached here another patch for contrib.texi file changes. Please let me know if any modifications are required in it. Thanks and Regards, Jayant Sonar [KPIT Cummins, Pune] cr16-htdocs3.diff Description: cr16-htdocs3.diff cr16-contrib.patch Description: cr16-contrib.patch
[testsuite,committed]: Fix FAIL on 16-bit int platforms.
http://gcc.gnu.org/viewcvs?view=revision&revision=184393
http://gcc.gnu.org/viewcvs?view=revision&revision=184394

Index: gcc.c-torture/execute/pr52286.c
===================================================================
--- gcc.c-torture/execute/pr52286.c	(revision 184393)
+++ gcc.c-torture/execute/pr52286.c	(working copy)
@@ -5,9 +5,15 @@ extern void abort (void);
 int
 main ()
 {
+#if __SIZEOF_INT__ > 2
   int a, b;
   asm ("" : "=r" (a) : "0" (0));
   b = (~a | 1) & -2038094497;
+#else
+  long a, b;
+  asm ("" : "=r" (a) : "0" (0));
+  b = (~a | 1) & -2038094497L;
+#endif
   if (b >= 0)
     abort ();
   return 0;

Index: gcc.dg/pr52132.c
===================================================================
--- gcc.dg/pr52132.c	(revision 184392)
+++ gcc.dg/pr52132.c	(working copy)
@@ -2,6 +2,11 @@
 /* { dg-do compile } */
 /* { dg-options "-std=c99 -O2 -g" } */
 
+#if (__SIZEOF_INT__ < __SIZEOF_FLOAT__) \
+    && (__SIZEOF_LONG__ == __SIZEOF_FLOAT__)
+#define int long
+#endif
+
 int l;
 
 void bar (void);
@@ -10,7 +15,11 @@ foo (int *x, float y)
 {
   float b;
   union { float f; int i; } u = { .f = y };
+#if (__SIZEOF_INT__ < __SIZEOF_FLOAT__)
+  u.i += 127L << 23;
+#else
   u.i += 127 << 23;
+#endif
   u.f = ((-1.0f / 3) * u.f + 2) * u.f - 2.0f / 3;
   b = 0.5 * (u.f + l);
   if (b >= *x)
[PATCH] Fix PR52298
This fixes PR52298: we need to use the proper DR step for outer loop vectorization. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-02-20  Richard Guenther  rguent...@suse.de

	PR tree-optimization/52298
	* tree-vect-stmts.c (vectorizable_store): Properly use
	STMT_VINFO_DR_STEP instead of DR_STEP when vectorizing
	outer loops.
	(vectorizable_load): Likewise.
	* tree-vect-data-refs.c (vect_analyze_data_ref_access):
	Access DR_STEP after ensuring it is not NULL.

	* gcc.dg/torture/pr52298.c: New testcase.
	* gcc.dg/vect/pr52298.c: Likewise.

Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c	(revision 184390)
--- gcc/tree-vect-stmts.c	(working copy)
*************** vectorizable_store (gimple stmt, gimple_
*** 3753,3759 ****
    if (!STMT_VINFO_DATA_REF (stmt_info))
      return false;
  
!   if (tree_int_cst_compare (DR_STEP (dr), size_zero_node) < 0)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
          fprintf (vect_dump, "negative step for store.");
--- 3753,3761 ----
    if (!STMT_VINFO_DATA_REF (stmt_info))
      return false;
  
!   if (tree_int_cst_compare (loop && nested_in_vect_loop_p (loop, stmt)
!			      ? STMT_VINFO_DR_STEP (stmt_info) : DR_STEP (dr),
!			      size_zero_node) < 0)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
          fprintf (vect_dump, "negative step for store.");
*************** vectorizable_load (gimple stmt, gimple_s
*** 4266,4272 ****
    if (!STMT_VINFO_DATA_REF (stmt_info))
      return false;
  
!   negative = tree_int_cst_compare (DR_STEP (dr), size_zero_node) < 0;
    if (negative && ncopies > 1)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
--- 4268,4277 ----
    if (!STMT_VINFO_DATA_REF (stmt_info))
      return false;
  
!   negative = tree_int_cst_compare (loop && nested_in_vect_loop_p (loop, stmt)
!				     ? STMT_VINFO_DR_STEP (stmt_info)
!				     : DR_STEP (dr),
!				     size_zero_node) < 0;
    if (negative && ncopies > 1)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
*************** vectorizable_load (gimple stmt, gimple_s
*** 4654,4660 ****
       nested within an outer-loop that is being vectorized.  */
  
    if (loop && nested_in_vect_loop_p (loop, stmt)
!       && (TREE_INT_CST_LOW (DR_STEP (dr))
  	  % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
      {
        gcc_assert (alignment_support_scheme != dr_explicit_realign_optimized);
--- 4659,4665 ----
       nested within an outer-loop that is being vectorized.  */
  
    if (loop && nested_in_vect_loop_p (loop, stmt)
!       && (TREE_INT_CST_LOW (STMT_VINFO_DR_STEP (stmt_info))
  	  % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
      {
        gcc_assert (alignment_support_scheme != dr_explicit_realign_optimized);
Index: gcc/tree-vect-data-refs.c
===================================================================
*** gcc/tree-vect-data-refs.c	(revision 184390)
--- gcc/tree-vect-data-refs.c	(working copy)
*************** vect_analyze_data_ref_access (struct dat
*** 2319,2325 ****
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
    struct loop *loop = NULL;
!   HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step);
  
    if (loop_vinfo)
      loop = LOOP_VINFO_LOOP (loop_vinfo);
--- 2319,2325 ----
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
    struct loop *loop = NULL;
!   HOST_WIDE_INT dr_step;
  
    if (loop_vinfo)
      loop = LOOP_VINFO_LOOP (loop_vinfo);
*************** vect_analyze_data_ref_access (struct dat
*** 2332,2337 ****
--- 2332,2338 ----
      }
  
    /* Allow invariant loads in loops.  */
+   dr_step = TREE_INT_CST_LOW (step);
    if (loop_vinfo && dr_step == 0)
      {
        GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) = NULL;
Index: gcc/testsuite/gcc.dg/torture/pr52298.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr52298.c	(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr52298.c	(revision 0)
*** 0 ****
--- 1,16 ----
+ /* { dg-do compile } */
+ 
+ int a, b, c, h;
+ 
+ int i[5][5];
+ 
+ void
+ fn1 ()
+ {
+   int l = 0;
+ 
+   for (a = 0; a <= 3; a++)
+     for (b = 1; b >= 0; b -= 1)
+       l |= i[0][b];
+   c = l;
+ }
Index: gcc/testsuite/gcc.dg/vect/pr52298.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/pr52298.c	(revision 0)
--- gcc/testsuite/gcc.dg/vect/pr52298.c	(revision 0)
*** 0 ****
--- 1,31 ----
+ /* { dg-do run } */
+ /* { dg-options "-O1 -ftree-vectorize -fno-tree-pre -fno-tree-loop-im" } */
+ 
+ extern void abort (void);
+ int c[80];
+ 
+ 
Re: [google/gcc-4_6_2-mobile] PATCH: PR other/46770: Replace .ctors/.dtors with .init_array/.fini_array on targets supporting them
Hello,

Hey, Jing, you broke the google/gcc-4_6 branch by checking the new header file into the wrong directory. Fixed via r184386.

The google/gcc-4_6_2-mobile branch still has the same problem. Could someone please fix it?

Thanks,
Ilya

Ollie

On Fri, Feb 17, 2012 at 10:25 PM, Jing Yu jin...@google.com wrote:

OK. Thanks for porting the patch. I will commit the patch into google/gcc-4_6_2-mobile for you. I would also like to commit it into the google/gcc-4_6 branch if all tests pass. This patch is almost the same as Google Ref 47894.

Thanks,
Jing

On Fri, Feb 17, 2012 at 5:20 PM, H.J. Lu hongjiu...@intel.com wrote:

Hi,

This patch backports the fix from trunk:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46770

for the google/gcc-4_6_2-mobile branch. This is needed to support C++ global constructors/destructors on Android/x86. OK for google/gcc-4_6_2-mobile branch?

Thanks.

H.J.
---
2011-08-20  H.J. Lu  hongjiu...@intel.com

	PR other/46770
	* config.gcc (tm_file): Add initfini-array.h if
	.init_arrary/.fini_array are supported.

	* crtstuff.c: Don't generate .ctors nor .dtors sections if
	USE_INITFINI_ARRAY is defined.

	* output.h (default_elf_init_array_asm_out_constructor): New.
	(default_elf_fini_array_asm_out_destructor): Likewise.
	* varasm.c (elf_init_array_section): Likewise.
	(elf_fini_array_section): Likewise.
	(get_elf_initfini_array_priority_section): Likewise.
	(default_elf_init_array_asm_out_constructor): Likewise.
	(default_elf_fini_array_asm_out_destructor): Likewise.

	* config/initfini-array.h: New.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@177933 138bc75d-0d04-0410-961f-82ee72b054a4 Conflicts: gcc/ChangeLog --- gcc/ChangeLog.hjl | 21 +++ gcc/config.gcc | 5 +++ gcc/config/initfini-array.h | 37 +++ gcc/crtstuff.c | 11 +++- gcc/output.h | 2 + gcc/varasm.c | 58 +++ 6 files changed, 133 insertions(+), 1 deletions(-) create mode 100644 gcc/ChangeLog.hjl create mode 100644 gcc/config/initfini-array.h diff --git a/gcc/ChangeLog.hjl b/gcc/ChangeLog.hjl new file mode 100644 index 000..3527b27 --- /dev/null +++ b/gcc/ChangeLog.hjl @@ -0,0 +1,21 @@ +2011-12-07 H.J. Lu hongjiu...@intel.com + + Backport from mainline + 2011-08-20 H.J. Lu hongjiu...@intel.com + + PR other/46770 + * config.gcc (tm_file): Add initfini-array.h if + .init_arrary/.fini_array are supported. + + * crtstuff.c: Don't generate .ctors nor .dtors sections if + USE_INITFINI_ARRAY is defined. + + * output.h (default_elf_init_array_asm_out_constructor): New. + (default_elf_fini_array_asm_out_destructor): Likewise. + * varasm.c (elf_init_array_section): Likewise. + (elf_fini_array_section): Likewise. + (get_elf_initfini_array_priority_section): Likewise. + (default_elf_init_array_asm_out_constructor): Likewise. + (default_elf_fini_array_asm_out_destructor): Likewise. + + * config/initfini-array.h: New. diff --git a/gcc/config.gcc b/gcc/config.gcc index d9ac0fa..b386424 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -3176,6 +3176,11 @@ if test x$with_schedule = x; then esac fi +# Support --enable-initfini-array. +if test x$enable_initfini_array = xyes; then + tm_file=${tm_file} initfini-array.h +fi + # Validate and mark as valid any --with options supported # by this target. 
In order to use a particular --with option # you must list it in supported_defaults; validating the value diff --git a/gcc/config/initfini-array.h b/gcc/config/initfini-array.h new file mode 100644 index 000..8aaadf6 --- /dev/null +++ b/gcc/config/initfini-array.h @@ -0,0 +1,37 @@ +/* Definitions for ELF systems with .init_array/.fini_array section + support. + Copyright (C) 2011 + Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + http://www.gnu.org/licenses/. */ + +#define USE_INITFINI_ARRAY + +#undef INIT_SECTION_ASM_OP
Re: [PATCH] Fix PR52298
On Mon, Feb 20, 2012 at 04:11:13PM +0100, Richard Guenther wrote:

This fixes PR52298: we need to use the proper DR step for outer loop vectorization. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Thanks.

*************** vectorizable_load (gimple stmt, gimple_s
*** 4266,4272 ****
    if (!STMT_VINFO_DATA_REF (stmt_info))
      return false;
  
!   negative = tree_int_cst_compare (DR_STEP (dr), size_zero_node) < 0;
    if (negative && ncopies > 1)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
--- 4268,4277 ----
    if (!STMT_VINFO_DATA_REF (stmt_info))
      return false;
  
!   negative = tree_int_cst_compare (loop && nested_in_vect_loop_p (loop, stmt)
!				     ? STMT_VINFO_DR_STEP (stmt_info)
!				     : DR_STEP (dr),
!				     size_zero_node) < 0;

vectorizable_load, unlike vectorizable_store, has a nested_in_vect_loop bool flag, so you could just test that instead of loop && nested_in_vect_loop_p (loop, stmt) above and below.

    if (negative && ncopies > 1)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
*************** vectorizable_load (gimple stmt, gimple_s
*** 4654,4660 ****
       nested within an outer-loop that is being vectorized.  */
  
    if (loop && nested_in_vect_loop_p (loop, stmt)
!       && (TREE_INT_CST_LOW (DR_STEP (dr))
  	  % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
      {
        gcc_assert (alignment_support_scheme != dr_explicit_realign_optimized);
--- 4659,4665 ----
       nested within an outer-loop that is being vectorized.  */
  
    if (loop && nested_in_vect_loop_p (loop, stmt)
!       && (TREE_INT_CST_LOW (STMT_VINFO_DR_STEP (stmt_info))
  	  % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
      {
        gcc_assert (alignment_support_scheme != dr_explicit_realign_optimized);

Jakub
RE: spill failure after IF-CASE-2 transformation
Ping.

looks like an invalid transformation, but I suspect rather than setting the CC register, the (*) insn is setting a pseudo (more accurate RTL would be useful). There are some cases in ifcvt.c which check targetm.small_register_classes_for_mode already; this is probably what should be done to prevent this transformation.

You suspect correctly: cc=x sets CC, whereas cc=y is a pseudo which can only match CC. Presumably I must check all instructions in the else_bb for modifications to registers for which small_register_classes_for_mode_p holds? E.g. see below. Does this seem reasonable?

Patch here: http://gcc.gnu.org/ml/gcc/2012-02/msg00296.html

Thanks,
Stu
libitm: fixes and optimizations in gl_wt
The following patches don't depend on each other but are all for the gl_wt TM method.

patch1: For memory transfers from source regions with the read-for-write modifier, undo logging of these regions was missing. Also optimize the number of accesses to gtm_thr().

patch2: Similar to the ml_wt code, we don't need to use acq_rel but just acquire memory order when acquiring the orec. The following release fence is still necessary.

patch3: Recently, I changed how gtm_rwlock handles upgrades to serial mode. This change was to fix privatization safety and basically consists of not modifying gtm_thread::shared_state until trycommit() and/or rollback() have finished after the upgrade (and before the switch to serialirr dispatch). With this in place, we can avoid the previous workarounds for the commit-in-serial-mode corner case.

patch4: Privatization safety in gl_wt rollback was previously enforced with something that probably worked, but for which the precise C++11 memory-model reasoning was unclear. The patch fixes that (so it's clear why it works on the basis of the language's memory model, not on the HW models), and it is less costly than the previously used seq_cst barrier.

patch5: Just put gl_wt's global orec on a separate cacheline, same as we do in ml_wt.

OK for trunk?

commit 61ffbfa5610636556335595cc4c28c9e8cd45e23
Author: Torvald Riegel trie...@redhat.com
Date:   Mon Feb 20 14:04:17 2012 +0100

    libitm: Add missing undo-logging of RfW src regions in gl_wt memtransfer.

    libitm/
    	* method-gl.cc (gl_wt_dispatch::validate): New.
    	(gl_wt_dispatch::pre_write): New.
    	(gl_wt_dispatch::memtransfer_static): Add missing undo for RfW src.
    	Optimize number of calls to gtm_thr.
diff --git a/libitm/method-gl.cc b/libitm/method-gl.cc
index d6d050d..8df3d30 100644
--- a/libitm/method-gl.cc
+++ b/libitm/method-gl.cc
@@ -85,9 +85,8 @@ static gl_mg o_gl_mg;
 class gl_wt_dispatch : public abi_dispatch
 {
 protected:
-  static void pre_write(const void *addr, size_t len)
+  static void pre_write(const void *addr, size_t len, gtm_thread *tx)
   {
-    gtm_thread *tx = gtm_thr();
     gtm_word v = tx->shared_state.load(memory_order_relaxed);
     if (unlikely(!gl_mg::is_locked(v)))
       {
@@ -123,7 +122,13 @@ protected:
     tx->undolog.log(addr, len);
   }
 
-  static void validate()
+  static void pre_write(const void *addr, size_t len)
+  {
+    gtm_thread *tx = gtm_thr();
+    pre_write(addr, len, tx);
+  }
+
+  static void validate(gtm_thread *tx)
   {
     // Check that snapshot is consistent.  We expect the previous data load to
     // have acquire memory order, or be atomic and followed by an acquire
@@ -137,12 +142,17 @@ protected:
     // or read an orec value that was written after the data had been written.
     // Either will allow us to detect inconsistent reads because it will have
     // a higher/different value.
-    gtm_thread *tx = gtm_thr();
     gtm_word l = o_gl_mg.orec.load(memory_order_relaxed);
     if (l != tx->shared_state.load(memory_order_relaxed))
       tx->restart(RESTART_VALIDATE_READ);
   }
 
+  static void validate()
+  {
+    gtm_thread *tx = gtm_thr();
+    validate(tx);
+  }
+
   template <typename V> static V load(const V* addr, ls_modifier mod)
   {
     // Read-for-write should be unlikely, but we need to handle it or will
@@ -198,9 +208,20 @@ public:
   static void memtransfer_static(void *dst, const void* src, size_t size,
       bool may_overlap, ls_modifier dst_mod, ls_modifier src_mod)
   {
-    if ((dst_mod != WaW && src_mod != RaW)
-	&& (dst_mod != NONTXNAL || src_mod == RfW))
-      pre_write(dst, size);
+    gtm_thread *tx = 0;
+    if (dst_mod != WaW && dst_mod != NONTXNAL)
+      {
+	tx = gtm_thr();
+	pre_write(dst, size, tx);
+      }
+    // We need at least undo-logging for an RfW src region because we might
+    // subsequently write there with WaW.
+    if (src_mod == RfW)
+      {
+	if (dst_mod == WaW || dst_mod == NONTXNAL)
+	  tx = gtm_thr();
+	pre_write(src, size, tx);
+      }
 
     // FIXME We should use atomics here (see store()).  Let's just hope that
     // memcpy/memmove are good enough.
@@ -211,7 +232,11 @@ public:
     if (src_mod != RfW && src_mod != RaW && src_mod != NONTXNAL
 	&& dst_mod != WaW)
-      validate();
+      {
+	if ((dst_mod == WaW || dst_mod == NONTXNAL) && src_mod != RfW)
+	  tx = gtm_thr();
+	validate(tx);
+      }
   }
 
   static void memset_static(void *dst, int c, size_t size, ls_modifier mod)

commit 609458878f63aafd77902466dae76b59a6c2ba68
Author: Torvald Riegel trie...@redhat.com
Date:   Mon Feb 20 14:27:55 2012 +0100

    libitm: Optimize memory order requirements in gl_wt pre_write.

    libitm/
    	* method-gl.cc (gl_wt_dispatch::pre_write): Optimize memory orders.

diff --git a/libitm/method-gl.cc b/libitm/method-gl.cc
index 8df3d30..958e81f 100644
--- a/libitm/method-gl.cc
+++ b/libitm/method-gl.cc
@@ -103,16 +103,27 @@
Re: [PATCH, i386] RTM support
On Thu, Feb 16, 2012 at 5:06 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:

Hello guys, Here is a patch which adds support of first part of Intel TSX extensions. Could you please have a look?

As the first remark, you don't have to add BLKmode memory clobbers. An unspec_volatile RTX is considered to use and clobber all memory:

(define_expand "xbegin"
  [(set (match_operand:SI 0 "register_operand" "")
	(unspec_volatile:SI [(match_dup 1)] UNSPECV_XBEGIN))]
  "TARGET_RTM"
  "operands[1] = force_reg (SImode, constm1_rtx);")

(define_insn "*xbegin_1"
  [(set (match_operand:SI 0 "register_operand" "=a")
	(unspec_volatile:SI [(match_operand:SI 1 "register_operand" "0")]
			    UNSPECV_XBEGIN))]
  "TARGET_RTM"
  "xbegin\t.+6"
  [(set_attr "type" "other")
   (set_attr "length" "3")])

But, I think that we want to use the code label from the top of the false branch instead of .+6. The question is how to get it ...

(define_expand "xtest"
  [(set (match_operand:QI 0 "register_operand" "")
	(unspec_volatile:QI [(const_int 0)] UNSPECV_XTEST))]
  "TARGET_RTM"
{
  rtx insn, op0, op1;

  op0 = gen_reg_rtx (QImode);
  emit_move_insn (op0, const0_rtx);
  op1 = gen_reg_rtx (QImode);
  emit_move_insn (op1, const1_rtx);
  emit_insn (gen_xtest_1 ());

  insn = gen_rtx_EQ (VOIDmode, gen_rtx_REG (CCZmode, FLAGS_REG), const0_rtx);
  emit_insn (gen_rtx_SET (VOIDmode, operands[0],
			  gen_rtx_IF_THEN_ELSE (QImode, insn, op1, op0)));
  DONE;
})

Please use ix86_expand_setcc instead of open-coding the sete insn.

Uros.
[v3] libstdc++/52241
Hi,

thus, this is what I'm going to commit to resolve the issue (see audit trail for details). Tested x86_64-linux.

Thanks,
Paolo.

///////////////

2012-02-20  Paolo Carlini  paolo.carl...@oracle.com

	PR libstdc++/52241
	* src/c++98/tree.cc (local_Rb_tree_increment,
	local_Rb_tree_decrement): Add.
	(_Rb_tree_increment(const _Rb_tree_node_base*),
	_Rb_tree_decrement(const _Rb_tree_node_base*)): Use the latter.
	(_Rb_tree_increment(_Rb_tree_node_base*),
	_Rb_tree_decrement(_Rb_tree_node_base*)): New.

Index: src/c++98/tree.cc
===================================================================
--- src/c++98/tree.cc	(revision 184388)
+++ src/c++98/tree.cc	(working copy)
@@ -1,6 +1,6 @@
 // RB tree utilities implementation -*- C++ -*-
 
-// Copyright (C) 2003, 2005, 2009 Free Software Foundation, Inc.
+// Copyright (C) 2003, 2005, 2009, 2012 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -56,8 +56,8 @@
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
-  _Rb_tree_node_base*
-  _Rb_tree_increment(_Rb_tree_node_base* __x) throw ()
+  static _Rb_tree_node_base*
+  local_Rb_tree_increment(_Rb_tree_node_base* __x) throw ()
   {
     if (__x->_M_right != 0)
       {
@@ -79,14 +79,20 @@
     return __x;
   }
 
+  _Rb_tree_node_base*
+  _Rb_tree_increment(_Rb_tree_node_base* __x) throw ()
+  {
+    return local_Rb_tree_increment(__x);
+  }
+
   const _Rb_tree_node_base*
   _Rb_tree_increment(const _Rb_tree_node_base* __x) throw ()
   {
-    return _Rb_tree_increment(const_cast<_Rb_tree_node_base*>(__x));
+    return local_Rb_tree_increment(const_cast<_Rb_tree_node_base*>(__x));
   }
 
-  _Rb_tree_node_base*
-  _Rb_tree_decrement(_Rb_tree_node_base* __x) throw ()
+  static _Rb_tree_node_base*
+  local_Rb_tree_decrement(_Rb_tree_node_base* __x) throw ()
   {
     if (__x->_M_color == _S_red
         && __x->_M_parent->_M_parent == __x)
@@ -111,10 +117,16 @@
     return __x;
   }
 
+  _Rb_tree_node_base*
+  _Rb_tree_decrement(_Rb_tree_node_base* __x) throw ()
+  {
+    return local_Rb_tree_decrement(__x);
+  }
+
   const _Rb_tree_node_base*
   _Rb_tree_decrement(const _Rb_tree_node_base* __x) throw ()
   {
-    return _Rb_tree_decrement(const_cast<_Rb_tree_node_base*>(__x));
+    return local_Rb_tree_decrement(const_cast<_Rb_tree_node_base*>(__x));
   }
 
   static void
Re: Continue strict-volatile-bitfields fixes
On 20/02/12 17:28, Bernd Schmidt wrote:

On 02/20/2012 12:14 PM, Richard Guenther wrote:

On Fri, Feb 17, 2012 at 9:51 PM, Thomas Schwinge tho...@codesourcery.com wrote:

Hi!

How do we move this issue forward?

On Mon, 23 Jan 2012 15:46:34 +0100, Bernd Schmidt ber...@codesourcery.com wrote:

That was committed a while ago. The part in stor-layout.c that stops us from promoting bitfields to normal fields apparently caused some testsuite regressions in sh tests, where some optimization dump scans show that we can't perform the optimizations if there are BIT_FIELD_REFs rather than a normal member access. The testcases use things like "enum something field:8;" and I think applying strict-volatile-bitfields to enums is probably meaningless. Should we adjust semantics so that the compiler is free to optimize enum bitfields? The patch would look something like the below.

What about BOOLEAN_TYPE bitfields? Thus, why not explicitly spell out TREE_CODE (type) == INTEGER_TYPE?

That would work for me, if we can all agree that -fstrict-volatile-bitfields should be restricted to INTEGER_TYPE.

Bernd

I'm not sure why it should be. Can't a user write

#ifdef __cplusplus
#define BOOL bool
#else
#define BOOL _Bool
#endif

struct x {
  volatile BOOL a : 1;
  volatile BOOL b : 1;
  volatile unsigned char c : 6;
  volatile BOOL d : 1;
} y;

? If you've got strict volatile bitfields, then the concept here is that the access uses the declared type for accessing the member. Since in the ABI bool has a defined size, it should access the member using that size. On ARM, sizeof(bool) is 1, so I'd take the above to mean that accessing y.a is a read of a, b and c, but not d.

R.
Re: [PATCH, i386] RTM support
On Mon, Feb 20, 2012 at 07:10:26PM +0100, Uros Bizjak wrote: So the above is right and needed, though perhaps we might want a combine pattern or peephole to turn the movl $-1, %eax xbegin .+6 cmpl %eax, $-1 jne 1f The compiler can reverse the condition and exchange arms of if expression. This will void the assumption that false arm (and abort sequence) is at pc+6. The xbegin .+6 instruction doesn't (from the compiler's POV) act as any JUMP_INSN, it just returns either -1 or some other value in %eax (when coupled with setting %eax to -1 before it). That is the semantics of the _xbegin () intrinsic. We can (and at least in the combiner should) model it as a JUMP_INSN with UNSPEC_VOLATILE in it, which will work similarly to asm goto. The question is how happy cfgrtl.c etc. will be on a conditional JUMP_INSN with UNSPEC_VOLATILE in it. Jakub
Re: [PATCH, i386] RTM support
On Mon, Feb 20, 2012 at 7:18 PM, Jakub Jelinek ja...@redhat.com wrote: On Mon, Feb 20, 2012 at 07:10:26PM +0100, Uros Bizjak wrote: So the above is right and needed, though perhaps we might want a combine pattern or peephole to turn the movl $-1, %eax xbegin .+6 cmpl %eax, $-1 jne 1f The compiler can reverse the condition and exchange arms of if expression. This will void the assumption that false arm (and abort sequence) is at pc+6. The xbegin .+6 instruction doesn't (from the compiler's POV) act as any JUMP_INSN, it just returns either -1 or some other value in %eax (when coupled with setting %eax to -1 before it). That is the semantics of the _xbegin () intrinsic. We can (and at least in the combiner should) model it as a JUMP_INSN with UNSPEC_VOLATILE in it, which will work similarly to asm goto. The question is how happy cfgrtl.c etc. will be on a conditional JUMP_INSN with UNSPEC_VOLATILE in it. IIUC the documentation, the fallback label is a parameter to xbegin insn, but the insn itself doesn't jump anywhere - it just records the parameter as a fallback address. However, there is no guarantee that the fallback code is exactly at (pc)+6, so we have to use asm labels here. As written in your example in this thread, the label is at the top of the false branch. Instead of .+6, we have to pass this label to xbegin insn somehow. Uros.
Re: [PATCH, i386] RTM support
On Mon, Feb 20, 2012 at 07:27:48PM +0100, Uros Bizjak wrote: IIUC the documentation, the fallback label is a parameter to xbegin insn, but the insn itself doesn't jump anywhere - it just records the parameter as a fallback address. However, there is no guarantee that the fallback code is exactly at (pc)+6, so we have to use asm labels here. 6 bytes is the length of the xbegin instruction, so xbegin .+6 says that it has the fallback address at the immediately next insn after xbegin. Which is the _xbegin () semantics. One would use it as if (_xbegin () != 0xffffffff) { fallback_txn_aborted; } else { txn_code; _xend (); } Jakub
Re: [PATCH, i386] RTM support
On Mon, Feb 20, 2012 at 7:31 PM, Jakub Jelinek ja...@redhat.com wrote: On Mon, Feb 20, 2012 at 07:27:48PM +0100, Uros Bizjak wrote: IIUC the documentation, the fallback label is a parameter to xbegin insn, but the insn itself doesn't jump anywhere - it just records the parameter as a fallback address. However, there is no guarantee that the fallback code is exactly at (pc)+6, so we have to use asm labels here. 6 bytes is the length of the xbegin instruction, so xbegin .+6 says that it has the fallback address at the immediately next insn after xbegin. Which is the _xbegin () semantics. No! From 319433-012a.pdf, page 523, it says that tempRIP = RIP + SignExtend (IMM), where RIP is instruction following XBEGIN instruction. cmp and jcc that follows xbegin are 5 bytes long... Uros.
Re: libitm: fixes and optimizations in gl_wt
On 02/20/12 08:54, Torvald Riegel wrote: The following patches don't depend on each other but are all for the gl_wt TM method. patch1: For memory transfers from source regions with the read-for-write modifier, undo logging of these regions was missing. Also optimize the number of accesses to gtm_thr(). Out of curiosity, does an optional argument as in static void pre_write(const void *addr, size_t len, gtm_thread *tx = gtm_thr()) work just as well as + static void pre_write(const void *addr, size_t len) + { +gtm_thread *tx = gtm_thr(); +pre_write(addr, len, tx); + } + if (dst_mod == WaW || dst_mod == NONTXNAL) + tx = gtm_thr(); This sort of micro-optimization can't be good. If we are, for some reason, not unifying the two loads, it seems to me that it'd be better to simply unconditionally load the thread data here: +gtm_thread *tx = 0; The actual bug fix, calling pre_write for RfW, is of course ok. patch2: Similar to the ml_wt code, we don't need to use acq_rel but just acquire memory order when acquiring the orec. The following release fence is still necessary. Ok. patch3: Recently, I changed how gtm_rwlock handles upgrades to serial mode. This change was to fix privatization safety and basically consists of not modifying gtm_thread::shared_state until trycommit() and/or rollback() have finished after the upgrade (and before the switch to serialirr dispatch). With this in place, we can avoid the previous workarounds for the commit-in-serial-mode corner case. Ok. patch4: Privatization safety in gl_wt rollback was previously enforced with something that probably worked but for which the precise C++11 mem model reasoning was unclear. The patch fixes that (so it's clear why it works on the basis of the language's memory model, not on the HW models), and it is less costly than the previously used seq_cst barrier. Ok. patch5: Just put gl_wt's global orec on a separate cacheline, same as we do in ml_wt. 
+ atomic<gtm_word> orec __attribute__((aligned(HW_CACHELINE_SIZE))); + char tailpadding[HW_CACHELINE_SIZE - sizeof(atomic<gtm_word>)]; The tailpadding structure should not be required. Since an element of the structure is aligned, the whole class will be aligned. And the compiler automatically pads aligned structures so that arrays of them work, and satisfy the alignment. Or is the orec overlapping with the vtable or something? r~
Re: [PATCH, i386] RTM support
On 02/20/12 10:51, Uros Bizjak wrote: On Mon, Feb 20, 2012 at 7:43 PM, Richard Henderson r...@redhat.com wrote: IIUC the documentation, the fallback label is a parameter to xbegin insn, but the insn itself doesn't jump anywhere - it just records the parameter as a fallback address. However, there is no guarantee that the fallback code is exactly at (pc)+6, so we have to use asm labels here. 6 bytes is the length of the xbegin instruction, so xbegin .+6 says that it has the fallback address at the immediately next insn after xbegin. Which is the _xbegin () semantics. No! From 319433-012a.pdf, page 523, it says that tempRIP = RIP + SignExtend (IMM), where RIP is instruction following XBEGIN instruction. So? .+N is generic assembler syntax, not specifying IMM=6. With xbegin .+6 the assembler will of course encode IMM=0, because it knows that the xbegin insn is 6 bytes. Is the fallback code the insn just after the xbegin insn? Yes. If you write everything in pure assembly (or if the compiler gets really smart) then you can use the branch-like semantics and arrange for the fallback code to be completely out of line. Otherwise we ignore the branch-like feature and rely entirely on the data characteristics of the insn, in that %rax is modified. Thus the initial set of -1 and subsequent compare vs the same. r~
Re: [PATCH, i386] RTM support
On Mon, Feb 20, 2012 at 7:57 PM, Richard Henderson r...@redhat.com wrote: tempRIP = RIP + SignExtend (IMM), where RIP is instruction following XBEGIN instruction. So? .+N is generic assembler syntax, not specifying IMM=6. With xbegin .+6 the assembler will of course encode IMM=0, because it knows that the xbegin insn is 6 bytes. Is the fallback code the insn just after the xbegin insn? Yes. If you write everything in pure assembly (or if the compiler gets really smart) then you can use the branch-like semantics and arrange for the fallback code to be completely out of line. Otherwise we ignore the branch-like feature and rely entirely on the data characteristics of the insn, in that %rax is modified. Thus the initial set of -1 and subsequent compare vs the same. I see the logic now, somehow I have assumed the really smart way ;) Other than that, testing of new headers should be added to gcc.target/i386/sse-[12,13,14].c and g++.dg/other/i386-[2,3].c. Thanks, Uros.
[patch libgcc]: Avoid specifying w32-unwind.h for 32-bit mingw if sjlj-threading is used
Hi, this patch fixes an issue about libgcc config.host for SjLj variant build. ChangeLog 2012-02-20 Kai Tietz kti...@redhat.com * config.host (i686-*-mingw*): Set md_unwind_header only for dw2-mode to w32-unwind.h header. Tested for i686-w64-mingw32. Ok for apply? Regards, Kai Index: config.host === --- config.host (revision 184333) +++ config.host (working copy) @@ -606,6 +606,7 @@ tmake_eh_file=i386/t-sjlj-eh else tmake_eh_file=i386/t-dw2-eh + md_unwind_header=i386/w32-unwind.h fi # Shared libgcc DLL install dir depends on cross/native build. if test x${build} = x${host} ; then @@ -614,7 +615,6 @@ tmake_dlldir_file=i386/t-dlldir-x fi tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-mingw32 i386/t-crtfm i386/t-chkstk t-dfprules" - md_unwind_header=i386/w32-unwind.h ;; x86_64-*-mingw*) case ${target_thread_file} in
Re: libitm: fixes and optimizations in gl_wt
On Mon, 2012-02-20 at 10:36 -0800, Richard Henderson wrote: On 02/20/12 08:54, Torvald Riegel wrote: The following patches don't depend on each other but are all for the gl_wt TM method. patch1: For memory transfers from source regions with the read-for-write modifier, undo logging of these regions was missing. Also optimize the number of accesses to gtm_thr(). Out of curiosity, does an optional argument as in static void pre_write(const void *addr, size_t len, gtm_thread *tx = gtm_thr()) work just as well as + static void pre_write(const void *addr, size_t len) + { +gtm_thread *tx = gtm_thr(); +pre_write(addr, len, tx); + } + if (dst_mod == WaW || dst_mod == NONTXNAL) + tx = gtm_thr(); This sort of micro-optimization can't be good. If we are, for some reason, not unifying the two loads, it seems to me that it'd be better to simply unconditionally load the thread data here: +gtm_thread *tx = 0; The actual bug fix, calling pre_write for RfW, is of course ok. patch2: Similar to the ml_wt code, we don't need to use acq_rel but just acquire memory order when acquiring the orec. The following release fence is still necessary. Ok. patch3: Recently, I changed how gtm_rwlock handles upgrades to serial mode. This change was to fix privatization safety and basically consists of not modifying gtm_thread::shared_state until trycommit() and/or rollback() have finished after the upgrade (and before the switch to serialirr dispatch). With this in place, we can avoid the previous workarounds for the commit-in-serial-mode corner case. Ok. patch4: Privatization safety in gl_wt rollback was previously enforced with something that probably worked but for which the precise C++11 mem model reasoning was unclear. The patch fixes that (so it's clear why it works on the basis of the language's memory model, not on the HW models), and it less costly than the previously used seq_cst barrier. Ok. patch5: Just put gl_wt's global orec on a separate cacheline, same as we do in ml_wt. 
+ atomic<gtm_word> orec __attribute__((aligned(HW_CACHELINE_SIZE))); + char tailpadding[HW_CACHELINE_SIZE - sizeof(atomic<gtm_word>)]; The tailpadding structure should not be required. Since an element of the structure is aligned, the whole class will be aligned. And the compiler automatically pads aligned structures so that arrays of them work, and satisfy the alignment. Or is the orec overlapping with the vtable or something? It is without the tail-padding, but currently the virtual functions are called very infrequently. Removed the tailpadding and added a comment instead. Committed with your changes applied.
Re: [patch stor-layout]: Fix PR 52238 - -mms-bitfields: __attribute__ ((aligned (n))) ignored for struct members
On 02/17/12 04:15, Kai Tietz wrote: 2012-02-17 Kai Tietz kti...@redhat.com PR target/52238 * stor-layout.c (place_field): Handle desired_align for ms-bitfields, too. 2012-02-17 Kai Tietz kti...@redhat.com * gcc.dg/bf-ms-layout-3.c: New testcase. As I mentioned on IRC, please use the attribute instead of -mms-bitfields. Otherwise ok. r~
Re: [patch stor-layout]: Fix PR 52238 - -mms-bitfields: __attribute__ ((aligned (n))) ignored for struct members
2012/2/20 Richard Henderson r...@redhat.com: On 02/17/12 04:15, Kai Tietz wrote: 2012-02-17 Kai Tietz kti...@redhat.com PR target/52238 * stor-layout.c (place_field): Handle desired_align for ms-bitfields, too. 2012-02-17 Kai Tietz kti...@redhat.com * gcc.dg/bf-ms-layout-3.c: New testcase. As I mentioned on IRC, please use the attribute instead of -mms-bitfields. Otherwise ok. r~ Ok applied with adjusted testcase at rev. 184409 to trunk, and back-merged change to 4.6.x branch at rev 184410. This bug reaches back at least to 4.2, but I think a back-merge to 4.6 is enough here. Thanks, Kai
[wwwdocs] Janitor stuff
Hello, Two changes in the attached patch: 1. Remove a broken link for XScale documentation 2. Change some links to the C++ ABI (CodeSourcery -> Mentor) Is this OK? Ciao! Steven Index: readings.html === RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v retrieving revision 1.217 diff -u -r1.217 readings.html --- readings.html 17 Feb 2012 18:44:55 - 1.217 +++ readings.html 20 Feb 2012 22:22:59 - @@ -37,9 +37,7 @@ by Joachim Nadler and Tim Josling &lt;<a href="mailto:t...@melbpc.org.au">t...@melbpc.org.au</a>&gt;.</li> - <li><a href="http://www4.in.tum.de/~pizka/">GNU INSEL Compiler gic</a>.</li> - - <li><a href="http://www.codesourcery.com/public/cxx-abi/"> + <li><a href="http://sourcery.mentor.com/public/cxx-abi/"> The V3 multi-vendor standard C++ ABI</a> is used in GCC releases 3.0 and above.</li> @@ -78,8 +76,6 @@ <br />Manufacturer: Various, by license from ARM <br />CPUs include: ARM7 and ARM7T series (eg. ARM7TDMI), ARM9 and StrongARM <br /><a href="http://infocenter.arm.com/help/index.jsp">ARM Documentation</a> - <br /><a href="http://www.intel.com/design/intelxscale/273473.htm">Intel -XScale Core Developer's Manual</a> </li> <li>AVR Index: gcc-3.2/c++-abi.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.2/c++-abi.html,v retrieving revision 1.5 diff -u -r1.5 c++-abi.html --- gcc-3.2/c++-abi.html1 Jan 2009 20:15:25 - 1.5 +++ gcc-3.2/c++-abi.html20 Feb 2012 22:22:59 - @@ -10,7 +10,7 @@ The main point of the GCC 3.2 release is to have a relatively stable and common C++ ABI for GNU/Linux and BSD usage, following the documentation at -<a href="http://www.codesourcery.com/public/cxx-abi/">http://www.codesourcery.com/public/cxx-abi/</a>. +<a href="http://sourcery.mentor.com/public/cxx-abi/">http://sourcery.mentor.com/public/cxx-abi/</a>. 
Unfortunately this means that GCC 3.2 is incompatible with GCC 3.0 and GCC 3.1 releases.</p> Index: gcc-4.0/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.0/changes.html,v retrieving revision 1.59 diff -u -r1.59 changes.html --- gcc-4.0/changes.html30 Jan 2011 11:39:36 - 1.59 +++ gcc-4.0/changes.html20 Feb 2012 22:22:59 - @@ -182,7 +182,7 @@ <a href="#visibility"><code>-fvisibility</code> option</a>.</li> <li>The compiler now uses the library interface specified by the <a -href="http://www.codesourcery.com/public/cxx-abi/">C++ ABI</a> for +href="http://sourcery.mentor.com/public/cxx-abi/">C++ ABI</a> for thread-safe initialization of function-scope static variables. Most users should leave this alone, but embedded programmers may want to disable this by specifying
Re: [wwwdocs] Janitor stuff
On Mon, Feb 20, 2012 at 11:26 PM, Steven Bosscher stevenb@gmail.com wrote: Hello, Two changes in the attached patch: 1. Remove a broken link for XScale documentation 2. Change some links to the C++ ABI (CodeSourcery -> Mentor) One more: MeP is also gone... Index: readings.html === RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v retrieving revision 1.217 diff -u -r1.217 readings.html --- readings.html 17 Feb 2012 18:44:55 - 1.217 +++ readings.html 20 Feb 2012 22:28:29 - @@ -181,7 +177,6 @@ <li>MeP <br />Manufacturer: Toshiba - <br /><a href="http://www.semicon.toshiba.co.jp/eng/product/micro/selection/mep/">Toshiba MeP Site</a> <br />SID includes a MeP simulator. </li>
Re: PR middle-end/52141: ICE due to asm statement
On 02/20/12 13:16, Aldy Hernandez wrote: PR middle-end/52141 * trans-mem.c (ipa_tm_scan_irr_block): Error out on GIMPLE_ASM's in a transaction safe function. Ok. r~
Re: [wwwdocs] Janitor stuff
On Mon, 20 Feb 2012, Steven Bosscher wrote: Two changes in the attached patch: 1. Remove a broken link for XScale documentation 2. Change some links to the C++ ABI (CodeSourcery - Mentor) Is this OK? Sure thing. Just go for changes like this -- for web pages we can interpret the obvious rule veeery broadly, and in general I am happy for anyone with write access to make changes like these. Thanks for doing this! Gerald
[patch lto-plugin]: Fix pr 50616
Hi, this patch replaces use of llx for printf/scanf by inttypes.h PRIxMAX/SCNxMAX macros. If those macros aren't present it defines them as default to llx. ChangeLog 2012-02-20 Kai Tietz kti...@redhat.com PR lto/50616 * lto-plugin.c (PRIxMAX,SCNxMAX): Use inttypes.h header if present, otherwise define them as llx. Tested for i686-w64-mingw32, x86_64-w64-mingw32, and x86_64-unknown-linux-gnu. Ok for apply? Regards, Kai Index: lto-plugin.c === --- lto-plugin.c(revision 184414) +++ lto-plugin.c(working copy) @@ -38,6 +38,9 @@ #if HAVE_STDINT_H #include <stdint.h> #endif +#if HAVE_INTTYPES_H +#include <inttypes.h> +#endif #include <assert.h> #include <string.h> #include <stdlib.h> @@ -62,6 +65,14 @@ #include "simple-object.h" #include "plugin-api.h" +#ifndef PRIxMAX +#define PRIxMAX "llx" +#endif + +#ifndef SCNxMAX +#define SCNxMAX "llx" +#endif + /* Handle opening elf files on hosts, such as Windows, that may use text file handling that will break binary access. */ #ifndef O_BINARY @@ -360,7 +371,7 @@ assert (resolution != LDPR_UNKNOWN); - fprintf (f, "%u %llx %s %s\n", + fprintf (f, "%u %" PRIxMAX " %s %s\n", (unsigned int) slot, symtab->aux[j].id, lto_resolution_str[resolution], symtab->syms[j].name); @@ -816,7 +827,7 @@ s = strrchr (name, '.'); if (s) -sscanf (s, ".%llx", &obj->out->id); +sscanf (s, ".%" SCNxMAX, &obj->out->id); secdata = xmalloc (length); offset += obj->file->offset; if (offset != lseek (obj->file->fd, offset, SEEK_SET)
Re: [google/gcc-4_6_2-mobile] PATCH: PR other/46770: Replace .ctors/.dtors with .init_array/.fini_array on targets supporting them
Sorry -- I will fix it in google/gcc-4_6_2-mobile. Thanks, Jing On Mon, Feb 20, 2012 at 7:15 AM, Ilya Enkovich enkovich@gmail.com wrote: Hello, Hey, Jing, you broke the google/gcc-4_6 branch by checking the new header file into the wrong directory. Fixed via r184386. google/gcc-4_6_2-mobile branch still has the same problem. Could please someone fix it? Thanks Ilya Ollie On Fri, Feb 17, 2012 at 10:25 PM, Jing Yu jin...@google.com wrote: OK. Thanks for porting the patch. I will commit the patch into google/gcc-4_6_2-mobile for you. I would also like to commit it into google/gcc-4_6 branch if all tests pass. This patch is almost the same as Google Ref 47894. Thanks, Jing On Fri, Feb 17, 2012 at 5:20 PM, H.J. Lu hongjiu...@intel.com wrote: Hi, This patch backports the fix from trunk: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46770 for google/gcc-4_6_2-mobile branch. This is needed to support C++ global constructors/destructiors on Android/x86. OK for google/gcc-4_6_2-mobile branch? Thanks. H.J. --- 2011-08-20 H.J. Lu hongjiu...@intel.com PR other/46770 * config.gcc (tm_file): Add initfini-array.h if .init_arrary/.fini_array are supported. * crtstuff.c: Don't generate .ctors nor .dtors sections if USE_INITFINI_ARRAY is defined. * output.h (default_elf_init_array_asm_out_constructor): New. (default_elf_fini_array_asm_out_destructor): Likewise. * varasm.c (elf_init_array_section): Likewise. (elf_fini_array_section): Likewise. (get_elf_initfini_array_priority_section): Likewise. (default_elf_init_array_asm_out_constructor): Likewise. (default_elf_fini_array_asm_out_destructor): Likewise. * config/initfini-array.h: New. 
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@177933 138bc75d-0d04-0410-961f-82ee72b054a4 Conflicts: gcc/ChangeLog --- gcc/ChangeLog.hjl | 21 +++ gcc/config.gcc | 5 +++ gcc/config/initfini-array.h | 37 +++ gcc/crtstuff.c | 11 +++- gcc/output.h | 2 + gcc/varasm.c | 58 +++ 6 files changed, 133 insertions(+), 1 deletions(-) create mode 100644 gcc/ChangeLog.hjl create mode 100644 gcc/config/initfini-array.h diff --git a/gcc/ChangeLog.hjl b/gcc/ChangeLog.hjl new file mode 100644 index 000..3527b27 --- /dev/null +++ b/gcc/ChangeLog.hjl @@ -0,0 +1,21 @@ +2011-12-07 H.J. Lu hongjiu...@intel.com + + Backport from mainline + 2011-08-20 H.J. Lu hongjiu...@intel.com + + PR other/46770 + * config.gcc (tm_file): Add initfini-array.h if + .init_arrary/.fini_array are supported. + + * crtstuff.c: Don't generate .ctors nor .dtors sections if + USE_INITFINI_ARRAY is defined. + + * output.h (default_elf_init_array_asm_out_constructor): New. + (default_elf_fini_array_asm_out_destructor): Likewise. + * varasm.c (elf_init_array_section): Likewise. + (elf_fini_array_section): Likewise. + (get_elf_initfini_array_priority_section): Likewise. + (default_elf_init_array_asm_out_constructor): Likewise. + (default_elf_fini_array_asm_out_destructor): Likewise. + + * config/initfini-array.h: New. diff --git a/gcc/config.gcc b/gcc/config.gcc index d9ac0fa..b386424 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -3176,6 +3176,11 @@ if test x$with_schedule = x; then esac fi +# Support --enable-initfini-array. +if test x$enable_initfini_array = xyes; then + tm_file=${tm_file} initfini-array.h +fi + # Validate and mark as valid any --with options supported # by this target. 
In order to use a particular --with option # you must list it in supported_defaults; validating the value diff --git a/gcc/config/initfini-array.h b/gcc/config/initfini-array.h new file mode 100644 index 000..8aaadf6 --- /dev/null +++ b/gcc/config/initfini-array.h @@ -0,0 +1,37 @@ +/* Definitions for ELF systems with .init_array/.fini_array section + support. + Copyright (C) 2011 + Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + You should have received a copy of the GNU General Public License + along with GCC;
Re: [PATCH, i386] RTM support
IIUC the documentation, the fallback label is a parameter to xbegin insn, but the insn itself doesn't jump anywhere - it just records the From the point of view of the program XBEGIN behaves like a conditional jump (with a very complicated and unpredictable condition) -Andi -- a...@linux.intel.com -- Speaking for myself only.
Re: [wwwdocs] Janitor stuff
This is DJ's baby, let's see whether he has an alternate site? (Google did not find me an immediate candidate.) Gerald On Mon, 20 Feb 2012, Steven Bosscher wrote: One more: MeP is also gone... Index: readings.html === RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v retrieving revision 1.217 diff -u -r1.217 readings.html --- readings.html 17 Feb 2012 18:44:55 - 1.217 +++ readings.html 20 Feb 2012 22:28:29 - @@ -181,7 +177,6 @@ liMeP br /Manufacturer: Toshiba - br /a href=http://www.semicon.toshiba.co.jp/eng/product/micro/selection/mep/;Toshiba MeP Site/a br /SID includes a MeP simulator. /li
[PATCH] Fix up gimplify_and_update_call_from_tree (PR tree-optimization/52318)
Hi! This function spends a significant amount of code on updating the virtual uses/defs in the new sequence, but it only handles stores that way, not non-pure/const calls, so we ICE during tree DSE on this testcase, because the vop has been marked for renaming. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-02-20 Jakub Jelinek ja...@redhat.com PR tree-optimization/52318 * gimple-fold.c (gimplify_and_update_call_from_tree): Add vdef also to non-pure/const call stmts in the sequence. * gcc.dg/pr52318.c: New test. --- gcc/gimple-fold.c.jj 2011-12-16 08:37:45.0 +0100 +++ gcc/gimple-fold.c 2012-02-20 21:51:20.853133362 +0100 @@ -591,8 +591,11 @@ gimplify_and_update_call_from_tree (gimp for (i = gsi_last (stmts); !gsi_end_p (i); gsi_prev (&i)) { new_stmt = gsi_stmt (i); - if (gimple_assign_single_p (new_stmt) - && !is_gimple_reg (gimple_assign_lhs (new_stmt))) + if ((gimple_assign_single_p (new_stmt) + && !is_gimple_reg (gimple_assign_lhs (new_stmt))) + || (is_gimple_call (new_stmt) + && (gimple_call_flags (new_stmt) + & (ECF_NOVOPS | ECF_PURE | ECF_CONST | ECF_NORETURN)) == 0)) { tree vdef; if (!laststore) --- gcc/testsuite/gcc.dg/pr52318.c.jj 2012-02-20 22:10:09.211778710 +0100 +++ gcc/testsuite/gcc.dg/pr52318.c 2012-02-20 22:09:10.0 +0100 @@ -0,0 +1,17 @@ +/* PR tree-optimization/52318 */ +/* { dg-do compile } */ +/* { dg-options "-O3 -ftracer -fno-tree-ccp -fno-tree-copy-prop -fno-tree-dce" } */ + +int c; +char *p; + +void +foo (int i) +{ + char a[2]; + char b[20]; + p = __builtin___stpcpy_chk (a, "", 2); + p = __builtin___stpcpy_chk (&b[16], i ? "e" : "jkl", 4); + if (c) +foo (i); +} Jakub
Re: [wwwdocs] Janitor stuff
This is DJ's baby, let's see whether he has an alternate site? Sorry, I got nothin'
[PATCH] Explain why we don't use RDPC for sparc PIC register setup.
Richard Henderson asked me to add an explanatory comment to explain this decision. * config/sparc/sparc.md (load_pcrel_sym<P:mode>): Explain why we don't use the "rd %pc" instruction on v9 for PIC register loads. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@184422 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog |5 + gcc/config/sparc/sparc.md |4 2 files changed, 9 insertions(+), 0 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index e12e596..df2419a 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2012-02-20 David S. Miller da...@davemloft.net + + * config/sparc/sparc.md (load_pcrel_sym<P:mode>): Explain why we + don't use the "rd %pc" instruction on v9 for PIC register loads. + 2012-02-20 Aldy Hernandez al...@redhat.com PR middle-end/52141 diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md index f70acd3..c0c1ef8 100644 --- a/gcc/config/sparc/sparc.md +++ b/gcc/config/sparc/sparc.md @@ -1352,6 +1352,10 @@ ;; Load in operand 0 the (absolute) address of operand 1, which is a symbolic ;; value subject to a PC-relative relocation. Operand 2 is a helper function ;; that adds the PC value at the call point to register #(operand 3). +;; +;; Even on V9 we use this call sequence with a stub, instead of "rd %pc, ..." +;; because the RDPC instruction is extremely expensive and incurs a complete +;; instruction pipeline flush. (define_insn "load_pcrel_sym<P:mode>" [(set (match_operand:P 0 "register_operand" "=r") -- 1.7.6.401.g6a319
Re: [google/gcc-4_6_2-mobile] PATCH: PR other/46770: Replace .ctors/.dtors with .init_array/.fini_array on targets supporting them
Hi H.J., I think the patch itself is not enough. I compared AC_DEFUN([gcc_AC_INITFINI_ARRAY] part (in acinclude.m4) of gcc trunk and google/gcc-4_6_2-mobile, and found how enable_initfini_array is configured is different. The patch breaks some of our tests. enable_initfini_array should be disabled for cross compile by default. But it is not true in our branch. Could you please point us all related patches? Thanks, Jing On Mon, Feb 20, 2012 at 8:30 AM, H.J. Lu hjl.to...@gmail.com wrote: On Sat, Feb 18, 2012 at 1:54 AM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Feb 17, 2012 at 05:20:02PM -0800, H.J. Lu wrote: This patch backports the fix from trunk: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46770 for google/gcc-4_6_2-mobile branch. This is needed to support C++ global constructors/destructiors on Android/x86. OK for google/gcc-4_6_2-mobile branch? Don't you want to backport also all the follow-ups on this? The original patch works with cross compile since it is disabled by default. For native compile, it works with the default GNU assembler and linker. If there are additional failure for native compile, it should be disabled at configure time. -- H.J.
Re: [PATCH, i386] RTM support
On Tue, Feb 21, 2012 at 12:26 AM, Andi Kleen a...@firstfloor.org wrote: IIUC the documentation, the fallback label is a parameter to xbegin insn, but the insn itself doesn't jump anywhere - it just records the From the point of view of the program XBEGIN behaves like a conditional jump (with a very complicated and unpredictable condition) Yes, but as it is written ATM, it conditionally jumps to the next insn. BTW: Looking a bit more to the spec, we can simply write xbegin $0 as the spec says that offset is relative to the _NEXT_ instruction. Uros.