[PATCH] Fix PR49603
This fixes PR49603. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2011-07-01 Richard Guenther rguent...@suse.de PR tree-optimization/49603 * tree-vect-stmts.c (vectorizable_load): Remove unnecessary assert. * gcc.dg/torture/pr49603.c: New testcase. Index: gcc/tree-vect-stmts.c === --- gcc/tree-vect-stmts.c (revision 175709) +++ gcc/tree-vect-stmts.c (working copy) @@ -4574,19 +4574,14 @@ vectorizable_load (gimple stmt, gimple_s /* 4. Handle invariant-load. */ if (inv_p !bb_vinfo) { + tree vec_inv; + gimple_stmt_iterator gsi2 = *gsi; gcc_assert (!strided_load); - if (j == 0) - { - tree vec_inv; - gimple_stmt_iterator gsi2 = *gsi; - gsi_next (gsi2); - vec_inv = build_vector_from_val (vectype, scalar_dest); - new_temp = vect_init_vector (stmt, vec_inv, - vectype, gsi2); - new_stmt = SSA_NAME_DEF_STMT (new_temp); - } - else - gcc_unreachable (); /* FORNOW. */ + gsi_next (gsi2); + vec_inv = build_vector_from_val (vectype, scalar_dest); + new_temp = vect_init_vector (stmt, vec_inv, + vectype, gsi2); + new_stmt = SSA_NAME_DEF_STMT (new_temp); } if (negative) Index: gcc/testsuite/gcc.dg/torture/pr49603.c === --- gcc/testsuite/gcc.dg/torture/pr49603.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr49603.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ + +struct gl_visual { +float AlphaScale; +}; +struct gl_context { +struct gl_visual *Visual; +}; +void quickdraw_rgb( struct gl_context * ctx, + int width, int height) +{ + int i, j; + unsigned char alpha[1600]; + for (j=0; jwidth; j++) +alpha[j] = (int) ctx-Visual-AlphaScale; + for (i=0; iheight; i++) +foo( alpha); +} +
Re: [PATCH, PR 49094] Refrain from creating misaligned accesses in SRA
On Thu, 30 Jun 2011, Martin Jambor wrote: Hi, On Thu, Jun 30, 2011 at 03:39:55PM +0200, Martin Jambor wrote: Hi, I had to add a test that the analyzed expression is not an SSA name. This has been approved by Rchi on IRC yesterday. Thus, I have committed the following as revision 175703 after successful run of c and c++ testsuite on sparc64 (and a bootstrap and test with another patch on x86_64-linux). I also tested fortran on sparc64 but missed a regression there (gfortran.dg/pr25923.f90). The problem is that *arg_1(D) is also scrutinized, get_object_alignment returns 8 for it and that is obviously not enough for SImode required alignment. I assume such loads have to be aligned for the mode on strict aligned targets and therefore they are OK. The question is, is that true for all MEM_REFs or only for those with zero offset? Since the original problem was that the expander expanded MEM[(struct ip *)ip_3 + 7B].s_addr; as if it was aligned, I suppose that MEM_REFs are generally fine and we can avoid this issue by skipping them just like the SSA_NAMEs. Is my reasoning correct? Not really. This just highlights the issue that alignment on strict-align targets is broken for any non-component-ref style access. As we happily fold component-refs into the MEM_REFs offset the issue is now more appearant. For MEM_REFs the alignment is supposed to be at least that of TYPE_ALIGN (TREE_TYPE (mem-ref)), even though the expanders would ignore that. Richard. Thanks, Martin Thanks, Martin 2011-06-30 Martin Jambor mjam...@suse.cz PR tree-optimization/49094 * tree-sra.c (tree_non_mode_aligned_mem_p): New function. (build_accesses_from_assign): Use it. * testsuite/gcc.dg/tree-ssa/pr49094.c: New test. Index: src/gcc/tree-sra.c === --- src.orig/gcc/tree-sra.c +++ src/gcc/tree-sra.c @@ -1050,6 +1050,26 @@ disqualify_ops_if_throwing_stmt (gimple return false; } +/* Return true iff type of EXP is not sufficiently aligned. */ + +static bool +tree_non_mode_aligned_mem_p (tree exp) +{ + enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + unsigned int align; + + if (TREE_CODE (exp) == SSA_NAME + || mode == BLKmode + || !STRICT_ALIGNMENT) +return false; + + align = get_object_alignment (exp, BIGGEST_ALIGNMENT); + if (GET_MODE_ALIGNMENT (mode) align) +return true; + + return false; +} + /* Scan expressions occuring in STMT, create access structures for all accesses to candidates for scalarization and remove those candidates which occur in statements or expressions that prevent them from being split apart. Return @@ -1074,7 +1094,10 @@ build_accesses_from_assign (gimple stmt) lacc = build_access_from_expr_1 (lhs, stmt, true); if (lacc) -lacc-grp_assignment_write = 1; +{ + lacc-grp_assignment_write = 1; + lacc-grp_unscalarizable_region |= tree_non_mode_aligned_mem_p (rhs); +} if (racc) { @@ -1082,6 +1105,7 @@ build_accesses_from_assign (gimple stmt) if (should_scalarize_away_bitmap !gimple_has_volatile_ops (stmt) !is_gimple_reg_type (racc-type)) bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc-base)); + racc-grp_unscalarizable_region |= tree_non_mode_aligned_mem_p (lhs); } if (lacc racc Index: src/gcc/testsuite/gcc.dg/tree-ssa/pr49094.c === --- /dev/null +++ src/gcc/testsuite/gcc.dg/tree-ssa/pr49094.c @@ -0,0 +1,38 @@ +/* { dg-do run } */ +/* { dg-options -O } */ + +struct in_addr { + unsigned int s_addr; +}; + +struct ip { + unsigned char ip_p; + unsigned short ip_sum; + struct in_addr ip_src,ip_dst; +} __attribute__ ((aligned(1), packed)); + +struct ip ip_fw_fwd_addr; + +int test_alignment( char *m ) +{ + struct ip *ip = (struct ip *) m; + struct in_addr pkt_dst; + pkt_dst = ip-ip_dst ; + if( pkt_dst.s_addr == 0 ) +return 1; + else +return 0; +} + +int __attribute__ ((noinline, noclone)) +intermediary (char *p) +{ + return test_alignment (p); +} + +int +main (int argc, char *argv[]) +{ + ip_fw_fwd_addr.ip_dst.s_addr = 1; + return intermediary ((void *) ip_fw_fwd_addr); +} -- Richard Guenther rguent...@suse.de Novell / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
Re: C++ PATCH to avoid duplicate overload resolution (related to c++/48481)
On Thu, Jun 30, 2011 at 05:08:49PM -0400, Jason Merrill wrote: Author: Jason Merrill ja...@redhat.com Date: Thu Jun 30 00:03:40 2011 -0400 PR c++/48481 * name-lookup.c (struct arg_lookup): Add fn_set. (add_function): Check it. (lookup_arg_dependent_1): Initialize it. diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c index 8bf5f5f..615e177 100644 --- a/gcc/cp/name-lookup.c +++ b/gcc/cp/name-lookup.c @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3. If not see #include debug.h #include c-family/c-pragma.h #include params.h +#include pointer-set.h Then cp/name-lookup.o should depend on pointer-set.h in cp/Make-lang.in. Fixed thusly, so far untested, I've noticed a couple of other spots. Ok for trunk after testing? OT, if we don't switch to on the fly dependency generation, I think it would be nice to at least a write a maintainer-script that would regenerate GCC style rules for dependences in Makefile.in and */Make-lang.in - dependencies on just immediately mentioned headers, and for headers with dependencies find a matching SPLAY_TREE_H etc. variable, which we could run once a month or so to make sure our Makefiles are up to date. 2011-07-01 Jakub Jelinek ja...@redhat.com * Make-lang.in (cp/decl.o): Depend on pointer-set.h. (cp/class.o): Likewise. (cp/error.o): Likewise. (cp/name-lookup.o): Likewise. (cp/decl2.o): Likewise. Don't depend on $(POINTER_SET_H). --- gcc/cp/Make-lang.in.jj 2011-06-17 11:01:54.0 +0200 +++ gcc/cp/Make-lang.in 2011-07-01 09:19:29.0 +0200 @@ -266,11 +266,11 @@ cp/decl.o: cp/decl.c $(CXX_TREE_H) $(TM_ output.h toplev.h $(HASHTAB_H) $(RTL_H) \ cp/operators.def $(TM_P_H) $(TREE_INLINE_H) $(DIAGNOSTIC_H) $(C_PRAGMA_H) \ debug.h gt-cp-decl.h $(TIMEVAR_H) $(TARGET_H) $(PLUGIN_H) \ - intl.h tree-iterator.h $(SPLAY_TREE_H) \ + intl.h tree-iterator.h pointer-set.h $(SPLAY_TREE_H) \ c-family/c-objc.h cp/decl2.o: cp/decl2.c $(CXX_TREE_H) $(TM_H) $(FLAGS_H) cp/decl.h \ output.h toplev.h $(C_COMMON_H) gt-cp-decl2.h $(CGRAPH_H) \ - $(C_PRAGMA_H) $(TREE_DUMP_H) intl.h $(TARGET_H) $(GIMPLE_H) $(POINTER_SET_H) \ + $(C_PRAGMA_H) $(TREE_DUMP_H) intl.h $(TARGET_H) $(GIMPLE_H) pointer-set.h \ $(SPLAY_TREE_H) c-family/c-ada-spec.h \ c-family/c-objc.h cp/cp-objcp-common.o : cp/cp-objcp-common.c $(CONFIG_H) $(SYSTEM_H) \ @@ -284,7 +284,7 @@ cp/typeck.o: cp/typeck.c $(CXX_TREE_H) $ output.h c-family/c-objc.h cp/class.o: cp/class.c $(CXX_TREE_H) $(TM_H) $(FLAGS_H) toplev.h \ $(TARGET_H) convert.h $(CGRAPH_H) $(TREE_DUMP_H) gt-cp-class.h \ - $(SPLAY_TREE_H) + $(SPLAY_TREE_H) pointer-set.h cp/call.o: cp/call.c $(CXX_TREE_H) $(TM_H) $(FLAGS_H) toplev.h \ $(DIAGNOSTIC_CORE_H) intl.h gt-cp-call.h convert.h $(TARGET_H) langhooks.h \ $(TIMEVAR_H) c-family/c-objc.h @@ -312,7 +312,7 @@ cp/pt.o: cp/pt.c $(CXX_TREE_H) $(TM_H) c c-family/c-objc.h cp/error.o: cp/error.c $(CXX_TREE_H) $(TM_H) $(DIAGNOSTIC_H) \ $(FLAGS_H) $(REAL_H) $(LANGHOOKS_DEF_H) $(CXX_PRETTY_PRINT_H) \ - tree-diagnostic.h tree-pretty-print.h c-family/c-objc.h + tree-diagnostic.h tree-pretty-print.h pointer-set.h c-family/c-objc.h cp/repo.o: cp/repo.c $(CXX_TREE_H) $(TM_H) toplev.h $(DIAGNOSTIC_CORE_H) \ gt-cp-repo.h cp/semantics.o: cp/semantics.c $(CXX_TREE_H) $(TM_H) toplev.h \ @@ -333,7 +333,7 @@ cp/cp-gimplify.o: cp/cp-gimplify.c $(CXX cp/name-lookup.o: cp/name-lookup.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(CXX_TREE_H) $(TIMEVAR_H) gt-cp-name-lookup.h \ - $(DIAGNOSTIC_CORE_H) $(FLAGS_H) debug.h + $(DIAGNOSTIC_CORE_H) $(FLAGS_H) debug.h pointer-set.h cp/cxx-pretty-print.o: cp/cxx-pretty-print.c $(CXX_PRETTY_PRINT_H) \ $(CONFIG_H) $(SYSTEM_H) $(TM_H) coretypes.h $(CXX_TREE_H) tree-pretty-print.h Jakub
Re: [PATCH] Add support on powerpc to change CASE_VALUES_THRESHOLD
On Thu, Jun 30, 2011 at 7:06 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: On Thu, Jun 30, 2011 at 12:31:44PM +0200, Richard Guenther wrote: On Thu, Jun 30, 2011 at 12:34 AM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: On the powerpc, switch statements can be expensive, and we would like to be able to tune the threshold of when the compiler generates if statements vs. using a table jump operation (and different processors within the powerpc have different limits). This patch adds a powerpc tuning option to control this. I've done bootstraps and make checks with no regressions. Is this ok to apply to the trunk? At this time, I am not changing the default value (4). With the option, I've seen a few spec 2006 benchmarks run faster, and a few run slower. Hmm. This hook looks like it could be turned into a --param. In fact it doesn't get whatever context information, so I wonder if any target does something fancy. I've done it also as a --param. I tend to think it is better as a param, and then other people can tune their code. One slight problem is the default for CASE_VALUES_THRESHOLD depends on whether you have a named casesi pattern that is active (HAVE_casesi ? 4 : 5). Either we need two separate parameters, or we have just one parameter, and if that is 0, fall back to the 4 or 5 value. This patch swtiches to use --param to set the value. Is it acceptable to check in? I've done a bootstrap and I'm about to do a make check. Looks good to me. Please leave others 24h to comment on the special zero value (no idea if we have precedent for it, but it sounds ok to me). Thanks, Richard. [gcc] 2011-06-30 Michael Meissner meiss...@linux.vnet.ibm.com * params.def (PARAM_CASE_VALUES_THRESHOLD): New parameter to override CASE_VALUES_THRESHOLD. * stmt.c (toplevel): Include params.h. (case_values_threshold): Use the --param case-values-threshold value if non-zero, otherwise use machine dependent value. (expand_case): Use case_values_threshold. * Makefile.in (stmt.o): Add $(PARAMS_H) dependency. * doc/invoke.texi (--param case-values-threshold): Document. [gcc/testsuite] 2011-06-30 Michael Meissner meiss...@linux.vnet.ibm.com * gcc.target/powerpc/ppc-switch-1.c: New test for -mcase-values-threshold. * gcc.target/powerpc/ppc-switch-2.c: Ditto. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: [patch tree-optimization]: Fix typo in X bitwise-binary-op CST
On Thu, Jun 30, 2011 at 8:01 PM, Kai Tietz kti...@redhat.com wrote: Hello, this patch fixes a typo in type-sinking for bitwise-binary operation X op CST. ChangeLog gcc/ 2011-06-30 Kai Tietz kti...@redhat.com * tree-ssa-forwprop.c (simplify_bitwise_binary): Fix typo. ChangeLog gcc/testsuite 2011-06-30 Kai Tietz kti...@redhat.com * gcc.dg/tree-ssa/bitwise-sink.c: New test. Bootstrapped and regression tested for all standard languages plus Ada and Obj-C++ on x86_64-pc-linux-gnu. Ok for apply? Ok. Thanks, Richard. Regards, Kai Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/bitwise-sink.c === --- /dev/null +++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/bitwise-sink.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +int +foo (_Bool x) +{ + return (x ^ 1); +} + +/* { dg-final { scan-tree-dump-times x\[^ \]* \\^ 1 1 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc-head/gcc/tree-ssa-forwprop.c === --- gcc-head.orig/gcc/tree-ssa-forwprop.c +++ gcc-head/gcc/tree-ssa-forwprop.c @@ -1946,7 +1946,7 @@ simplify_bitwise_binary (gimple_stmt_ite /* Try to fold (type) X op CST - (type) (X op ((type-x) CST)). */ if (TREE_CODE (arg2) == INTEGER_CST CONVERT_EXPR_CODE_P (def1_code) - INTEGRAL_TYPE_P (def1_arg1) + INTEGRAL_TYPE_P (TREE_TYPE (def1_arg1)) int_fits_type_p (arg2, TREE_TYPE (def1_arg1))) { gimple newop;
Re: PATCH [5/n]: Prepare x32: PR middle-end/48016: Inconsistency in non-local goto save area
On Thu, Jun 30, 2011 at 8:05 PM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, Jun 29, 2011 at 9:16 AM, Michael Matz m...@suse.de wrote: Hi, On Wed, 29 Jun 2011, H.J. Lu wrote: diff --git a/gcc/function.c b/gcc/function.c index 81c4d39..131bc09 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -4780,7 +4780,7 @@ expand_function_start (tree subr) cfun-nonlocal_goto_save_area, integer_zero_node, NULL_TREE, NULL_TREE); r_save = expand_expr (t_save, NULL_RTX, VOIDmode, EXPAND_WRITE); - r_save = convert_memory_address (Pmode, r_save); + r_save = adjust_address (r_save, Pmode, 0); This is actually the same problem as in explow.c. t_save is built with ptr_type_node, where it should have been using TREE_TYPE (TREE_TYPE (cfun-nonlocal_goto_save_area)) Then r_save should have the correct mode already, possibly this could be asserted. You are right that r_save needs to correspond to the nonlocal_goto_save_area[0] array-ref, hence pseudos aren't okay, therefore convert_memory_address isn't. Actually I think we might even want to assert that indeed the expanded r_save is of Pmode already. This patch works for me. OK for trunk if there are no regressions? Ok. Thanks, Richard. Thanks. -- H.J. --- 2011-06-30 H.J. Lu hongjiu...@intel.com PR middle-end/48016 * explow.c (update_nonlocal_goto_save_area): Use proper mode for stack save area. * function.c (expand_function_start): Likewise.
Re: [PATCH] Improve CONSTRUCTOR printing in tree-pretty-print.c
On Thu, Jun 30, 2011 at 8:42 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! As reporteed by Tobias, when printing array ctors the pretty printer would never print indexes or ranges, which means that e.g. {[2 ... 71]=7} ctor was printed as {7} and {[3]=1, [7]=2} ctor was printed as {1, 2} The following patch prints the index (if different from the last index + 1 resp. min value for the first element) or range (always). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. 2011-06-30 Jakub Jelinek ja...@redhat.com * tree-pretty-print.c (dump_generic_code) case CONSTRUCTOR: Print [idx]= and [idx1 ... idx2]= before initializers if needed for array initializers. --- gcc/tree-pretty-print.c.jj 2011-06-06 19:07:08.0 +0200 +++ gcc/tree-pretty-print.c 2011-06-30 11:51:04.0 +0200 @@ -1250,19 +1250,58 @@ dump_generic_node (pretty_printer *buffe { unsigned HOST_WIDE_INT ix; tree field, val; - bool is_struct_init = FALSE; + bool is_struct_init = false; + bool is_array_init = false; + double_int curidx = double_int_zero; pp_character (buffer, '{'); if (TREE_CODE (TREE_TYPE (node)) == RECORD_TYPE || TREE_CODE (TREE_TYPE (node)) == UNION_TYPE) - is_struct_init = TRUE; + is_struct_init = true; + else if (TREE_CODE (TREE_TYPE (node)) == ARRAY_TYPE + TYPE_DOMAIN (TREE_TYPE (node)) + TYPE_MIN_VALUE (TYPE_DOMAIN (TREE_TYPE (node))) + TREE_CODE (TYPE_MIN_VALUE (TYPE_DOMAIN (TREE_TYPE (node + == INTEGER_CST) + { + tree minv = TYPE_MIN_VALUE (TYPE_DOMAIN (TREE_TYPE (node))); + is_array_init = true; + curidx = tree_to_double_int (minv); + } FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (node), ix, field, val) { - if (field is_struct_init) + if (field) { - pp_character (buffer, '.'); - dump_generic_node (buffer, field, spc, flags, false); - pp_string (buffer, =); + if (is_struct_init) + { + pp_character (buffer, '.'); + dump_generic_node (buffer, field, spc, flags, false); + pp_character (buffer, '='); + } + else if (is_array_init + (TREE_CODE (field) != INTEGER_CST + || !double_int_equal_p (tree_to_double_int (field), + curidx))) + { + pp_character (buffer, '['); + if (TREE_CODE (field) == RANGE_EXPR) + { + dump_generic_node (buffer, TREE_OPERAND (field, 0), spc, + flags, false); + pp_string (buffer, ... ); + dump_generic_node (buffer, TREE_OPERAND (field, 1), spc, + flags, false); + if (TREE_CODE (TREE_OPERAND (field, 1)) == INTEGER_CST) + curidx = tree_to_double_int (TREE_OPERAND (field, 1)); + } + else + dump_generic_node (buffer, field, spc, flags, false); + if (TREE_CODE (field) == INTEGER_CST) + curidx = tree_to_double_int (field); + pp_string (buffer, ]=); + } } + if (is_array_init) + curidx = double_int_add (curidx, double_int_one); if (val TREE_CODE (val) == ADDR_EXPR) if (TREE_CODE (TREE_OPERAND (val, 0)) == FUNCTION_DECL) val = TREE_OPERAND (val, 0); Jakub
Re: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD sizeof (void *)
H.J. Lu hjl.to...@gmail.com writes: Here is the updated patch. It works on simple tests. I am running full tests. I kept config/i386/value-unwind.h since libgcc/md-unwind-support.h is included too late in unwind-dw2.c and I don't want to move it to be on the safe side. Oh please, don't pile hack upon hack to avoid proper testing. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [testsuite] ARM test pr42093.c: thumb2 or thumb1
On 24/06/11 14:18, Ramana Radhakrishnan wrote: On 24/06/11 01:40, Janis Johnson wrote: Test gcc.target/arm/pr42093.c, added by Ramana, requires support for arm_thumb2 but fails for those targets. The patch for which it was added modified support for thumb1. Should the test instead require arm_thumb1_ok, as in this patch? No this is for a Thumb2 defect so the test is valid for Thumb2 - we shouldn't be generating a tbb / tbh with signed offsets and that's what was happening there. This test I think ends up being fragile because the generation of tbb / tbh depends on how the blocks have been laid out . It would be interesting to try and get a test that works reliably in T2 . cheers Ramana Janis Perhaps -fno-reorder-blocks could be used to make it less fragile. R.
Re: Merge score7 to score.c and remove forwarding functions(with score_handle_option merged).
2011/6/14 Wei Qin wei@gmail.com Merge score7 to score.c and remove forwarding functions. Merge score_handle_option. Delete score7.c and score7.h. 2011-06-13 Wei Qinwei@gmail.com * config.gcc (score-*-elf): Remove score7.o. * config/score/t-score-elf: Likewise. * config/score/score.c: Merge score7 to score.c and remove forwarding functions. * config/score/score7.c Deleted. * config/score/score7.h Deleted. Thanks, patched had commited. --liqin
[build, doc] Obsolete IRIX 6.5, Tru64 UNIX V5.1
I think the time has come to obsolete the IRIX 6.5 and Tru64 UNIX V5.1 ports: my test machines are falling apart one by one, bootstraps take between 24 (osf) and 36 (irix) hours, and both platform's EOSL is approaching: * IRIX 6.5 until December 2013: http://www.sgi.com/support/services/irix_mips_support.html * Tru64 UNIX V5.1B until December 2012: http://tru64unix.compaq.com/tru64roadmap.pdf http://h30097.www3.hp.com/pdf/FINAL_Tru64UNIX_Policy_Page_24March2009.pdf http://h30097.www3.hp.com/external_letter_2010.pdf Given that there is not to be much user demand, this seems reasonable. I'm currently making a pass over the remaining testsuite failures and PRs to get 4.7 into as good shape as possible on both platforms. Tested by configuring without and with --enable-obsolete on alpha-dec-osf5.1b, installed on mainline. I will update gcc-4.7/changes.html with this change and the removal of -compat-bsd on Solaris soon. Rainer 2011-04-08 Rainer Orth r...@cebitec.uni-bielefeld.de * config.gcc: Obsolete alpha*-dec-osf5.1, mips-sgi-irix6.5. * doc/install.texi (Specific, alpha*-dec-osf5.1): Document it. (Specific, mips-sgi-irix6): Likewise. diff --git a/gcc/config.gcc b/gcc/config.gcc --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -237,7 +237,9 @@ md_file= # Obsolete configurations. case ${target} in - i[34567]86-*-interix3* \ + alpha*-dec-osf5.1* \ + | i[34567]86-*-interix3* \ + | mips-sgi-irix6.5\ | mips*-*-openbsd*\ | score-* \ | *-*-solaris2.8* \ diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -3086,10 +3086,12 @@ Systems using processors that implement are running the DEC/Compaq/HP Unix (DEC OSF/1, Digital UNIX, or Compaq/HP Tru64 UNIX) operating system, for example the DEC Alpha AXP systems. -As of GCC 3.2, versions before @code{alpha*-dec-osf4} are no longer -supported. (These are the versions which identify themselves as DEC -OSF/1.) As of GCC 4.6, support for Tru64 UNIX V4.0 and V5.0 has been -removed. +Support for Tru64 UNIX V5.1 has been obsoleted in GCC 4.7, but can still +be enabled by configuring with @option{--enable-obsolete}. Support will +be removed in GCC 4.8. As of GCC 4.6, support for Tru64 UNIX V4.0 and +V5.0 has been removed. As of GCC 3.2, versions before +@code{alpha*-dec-osf4} are no longer supported. (These are the versions +which identify themselves as DEC OSF/1.) On Tru64 UNIX, virtual memory exhausted bootstrap failures may be fixed by reconfiguring Kernel Virtual Memory and Swap parameters @@ -3868,11 +3870,13 @@ Support for IRIX 5 has been removed in G @end html @heading @anchor{mips-sgi-irix6}mips-sgi-irix6 -Support for IRIX 6 releases before 6.5 has been removed in GCC 4.6, as -well as support for -the O32 ABI. It is @emph{strongly} recommended to upgrade to at least -IRIX 6.5.18. This release introduced full ISO C99 support, though for -the N32 and N64 ABIs only. +Support for IRIX 6.5 has been obsoleted in GCC 4.7, but can still be +enabled by configuring with @option{--enable-obsolete}. Support will be +removed in GCC 4.8. Support for IRIX 6 releases before 6.5 has been +removed in GCC 4.6, as well as support for the O32 ABI. It is +@emph{strongly} recommended to upgrade to at least IRIX 6.5.18. This +release introduced full ISO C99 support, though for the N32 and N64 ABIs +only. To build and use GCC on IRIX 6.5, you need the IRIX Development Foundation (IDF) and IRIX Development Libraries (IDL). They are included with the -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[wwwdocs] Document IRIX 6.5, Tru64 UNIX V5.1 obsoletion
As just announced, here's the patch to the GCC 4.7 changes.html to document the IRIX and Tru64 UNIX obsoletions, together with the -compat-bsd removal. I don't need approval for the patch, but would be grateful for improvements to wording. Thanks. Rainer 2011-07-01 Rainer Orth r...@cebitec.uni-bielefeld.de * htdocs/gcc-4.7/changes.html: Document IRIX 6.5, Tru64 UNIX V5.1 obsoletion Document -compat-bsd removal. Index: htdocs/gcc-4.7/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v retrieving revision 1.18 diff -u -p -r1.18 changes.html --- htdocs/gcc-4.7/changes.html 28 May 2011 08:05:36 - 1.18 +++ htdocs/gcc-4.7/changes.html 1 Jul 2011 10:13:37 - @@ -24,10 +24,12 @@ particular architectures have been obsoleted:/p ul + liIRIX 6.5 (mips-sgi-irix6.5)/li + liMIPS OpenBSD (mips*-*-openbsd*)/li liSolaris 8 (*-*-solaris2.8). Details can be found in the a href=http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01263.html; announcement/a./li - liMIPS OpenBSD (mips*-*-openbsd*)./li + liTru64 UNIX V5.1 (alpha*-dec-osf5.1*)/li /ul /li @@ -35,6 +37,12 @@ liSupport has been removed for Unix International threads on Solaris 2, so the code--enable-threads=solaris/code configure option and the code-threads/code compiler option don't work any longer./li +liSupport has been removed for the Solaris BSD Compatibility Package, +which lives in code/usr/ucbinclude/code and +code/usr/ucblib/code. It has been removed from Solaris 11, and was +only intended as a migration aid from SunOS 4 to SunOS 5. The +code-compat-bsd/code compiler option isn't recognized any +longer./li /ul h2General Optimizer Improvements/h2 -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH] Fix PR49596
This fixes the fallout of the IPA PTA patch. The varpool_all_refs_explicit_p predicate does not properly give answers for external variables. Fixed as follows, pre-approved by Honza on IRC. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2011-07-01 Richard Guenther rguent...@suse.de PR middle-end/49596 * cgraph.h (varpool_all_refs_explicit_p): Not analyzed nodes may have unknown refs. Index: gcc/cgraph.h === --- gcc/cgraph.h(revision 175749) +++ gcc/cgraph.h(working copy) @@ -947,7 +947,8 @@ varpool_can_remove_if_no_refs (struct va static inline bool varpool_all_refs_explicit_p (struct varpool_node *vnode) { - return (!vnode-externally_visible + return (vnode-analyzed + !vnode-externally_visible !vnode-used_from_other_partition !vnode-force_output); }
Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns
- Original Message - From: Richard Guenther richard.guent...@gmail.com To: Kai Tietz kti...@redhat.com Cc: gcc-patches@gcc.gnu.org Sent: Thursday, June 30, 2011 1:36:13 PM Subject: Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns On Wed, Jun 29, 2011 at 3:00 PM, Kai Tietz kti...@redhat.com wrote: - Original Message - From: Kai Tietz kti...@redhat.com To: Richard Guenther richard.guent...@gmail.com Cc: gcc-patches@gcc.gnu.org Sent: Wednesday, June 29, 2011 1:33:30 PM Subject: Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns - Original Message - From: Richard Guenther richard.guent...@gmail.com To: Kai Tietz kti...@redhat.com Cc: gcc-patches@gcc.gnu.org Sent: Wednesday, June 29, 2011 12:14:10 PM Subject: Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns On Tue, Jun 28, 2011 at 5:05 PM, Kai Tietz kti...@redhat.com wrote: Hello, this patch implements the X op !X patterns within tree-ssa-forwprop.c without using here const-fold routines. Additionally it does some trivial folding for X op X. Implementation also looks through [(type)] X op [(type)] !X, if type of X is integral and precision is suitable for operation. ChangeLog gcc/ 2011-06-28 Kai Tietz kti...@redhat.com * tree-ssa-forwprop.c (operand_precision_onep): New function. (find_possible_not_expr_argument): Likewise. (simplify_bitwise_binary_1): Likewise. (simplify_bitwise_binary): Use simplify_bitwise_binary_1 for detecting various X op !X optimizations. ChangeLog gcc/testsuite 2011-06-28 Kai Tietz kti...@redhat.com * gcc.dg/binop-notand1a.c: New test. * gcc.dg/binop-notand2a.c: New test. * gcc.dg/binop-notand3a.c: New test. * gcc.dg/binop-notand4a.c: New test. * gcc.dg/binop-notand5a.c: New test. * gcc.dg/binop-notand6a.c: New test. * gcc.dg/binop-notor1.c: New test. * gcc.dg/binop-notor2.c: New test. * gcc.dg/binop-notxor1.c: New test. * gcc.dg/binop-notxor2.c: New test. Bootstrapped and regression tested for all languages plus Ada and Obj-C for x86_64-pc-linux-gnu. Ok for apply? I can't follow the code in find_possible_not_expr_argument or its uses at all. Please try to produce patches that look more obvious in what they are doing - don't try to solve every testcase you can come up with in a single patch. Especially don't write functions like find_possible_not_expr_argument which seems to have evolved a lot after you wrote the overall function comment. Thanks, Richard. Regards, Kai Well, I added some comments to these functions and renamed the find_possible_not_expr_argument function to detect_not_expr_operand, which hits its use better. The cause for this function is, that there are more then one variant of expressing a logical-not and all of them are used. This routine simply tries to detect different variants used for not. Eg ~X == !X and (X ^ 1) == !X for integral type of X with precision one. For X with integral type, (X == 0) == !X. The folding for the three different bitwise-operations is pretty easy and it makes sense to implement them at once. I see here no good point to separate them into different patches. To separate them might even lead to questions about abstracting some code-pieces out of the main function. I didn't added testcases for all variants I am aware now. Just those, which are now handled. So hope you can read and understand logic of patch better by updated patch. Regards, Kai I found that in version I've sent there is an unclosed comment. So here is updated patch, which additionally simplify some code to ease reading. Ok, I'm going to comment on a few things in the patch. +/* Checks if expression has type of one-bit precision, or is result of + a known boolean expression. */ +static bool +operand_precision_onep (tree expr) +{ + enum tree_code code; + gimple def_stmt; + + do +{ + code = TREE_CODE (expr); + if (!INTEGRAL_TYPE_P (TREE_TYPE (expr))) + return false; + if (TYPE_PRECISION (TREE_TYPE (expr)) == 1) + return true; + if (code != SSA_NAME) + break; + def_stmt = SSA_NAME_DEF_STMT (expr); + if (!def_stmt || !is_gimple_assign (def_stmt)) + break; + code = gimple_assign_rhs_code (def_stmt); + if (!CONVERT_EXPR_CODE_P (code)) + break; + expr = gimple_assign_rhs1 (def_stmt); +} + while (CONVERT_EXPR_CODE_P (code)); + + if (code == TRUTH_NOT_EXPR || TREE_CODE_CLASS (code) == tcc_comparison) +return true; + return false; +} Please don't do arbitrary deep lookups like this. Instead this function should be bool truth_valued_ssa_name_p (tree name) { tree type = TREE_TYPE (name); gimple def_stmt; if (!INTEGRAL_TYPE_P (type)) return false; if (TREE_CODE (type)
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 28/06/11 17:37, Michael Matz wrote: What I want (and I'm not totally clear on what this actually means) is to be able to optimize all the cases where the end result will be the same as the compiler produces now (using multiple multiply, shift, and add operations). Okay, then you really want to look through value-preserving conversions. Ok, so that's an obvious statement, but the point is that, right now, the compiler does nothing special when you cast from int - unsigned int, or vice-versa, and I want to capture that somehow. There are some exceptions, I'm sure, but what are they? Same-sized signed- unsigned conversions aren't value preserving: unsigned char c = 255; (signed char)c == -1; 255 != -1 unsigned - larger sized signed is value preserving unsigned char c = 255; (signed short)c == 255; signed - unsigned never is value preserving OK, so I've tried implementing this, and I find I hit against a problem: Given this test case: unsigned long long foo (unsigned long long a, signed char *b, signed char *c) { return a + *b * *c; } Those rules say that it should not be suitable for optimization because there's an implicit cast from signed int to unsigned long long. Without any widening multiplications allowed, GCC gives this code (for ARM): ldrsb r2, [r2, #0] ldrsb r3, [r3, #0] mul r2, r2, r3 addsr0, r0, r2 adc r1, r1, r2, asr #31 This is exactly what a signed widening multiply-and-accumulate with smlalbb would have done! OK, so the types in the testcase are a bit contrived, but my point is that I want to be able to use the widening-mult instructions everywhere that they would produce the same output and gcc would otherwise, and gcc just doesn't seem that interested in signed-unsigned conversions. So, I'm happy to put in checks to ensure that truncations are not ignore, but I'm really not sure what's the right thing to do with the extends and signedness switches. Any suggestions? Andrew
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On Fri, Jul 1, 2011 at 1:58 PM, Stubbs, Andrew andrew_stu...@mentor.com wrote: On 28/06/11 17:37, Michael Matz wrote: What I want (and I'm not totally clear on what this actually means) is to be able to optimize all the cases where the end result will be the same as the compiler produces now (using multiple multiply, shift, and add operations). Okay, then you really want to look through value-preserving conversions. Ok, so that's an obvious statement, but the point is that, right now, the compiler does nothing special when you cast from int - unsigned int, or vice-versa, and I want to capture that somehow. There are some exceptions, I'm sure, but what are they? Same-sized signed- unsigned conversions aren't value preserving: unsigned char c = 255; (signed char)c == -1; 255 != -1 unsigned - larger sized signed is value preserving unsigned char c = 255; (signed short)c == 255; signed - unsigned never is value preserving OK, so I've tried implementing this, and I find I hit against a problem: Given this test case: unsigned long long foo (unsigned long long a, signed char *b, signed char *c) { return a + *b * *c; } Those rules say that it should not be suitable for optimization because there's an implicit cast from signed int to unsigned long long. Without any widening multiplications allowed, GCC gives this code (for ARM): ldrsb r2, [r2, #0] ldrsb r3, [r3, #0] mul r2, r2, r3 adds r0, r0, r2 adc r1, r1, r2, asr #31 This is exactly what a signed widening multiply-and-accumulate with smlalbb would have done! OK, so the types in the testcase are a bit contrived, but my point is that I want to be able to use the widening-mult instructions everywhere that they would produce the same output and gcc would otherwise, and gcc just doesn't seem that interested in signed-unsigned conversions. So, I'm happy to put in checks to ensure that truncations are not ignore, but I'm really not sure what's the right thing to do with the extends and signedness switches. Any suggestions? Well - some operations work the same on both signedness if you just care about the twos-complement result. This includes multiplication (but not for example division). For this special case I suggest to not bother trying to invent a generic predicate but do something local in tree-ssa-math-opts.c. Richard. Andrew
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 07/01/2011 01:58 PM, Stubbs, Andrew wrote: Given this test case: unsigned long long foo (unsigned long long a, signed char *b, signed char *c) { return a + *b * *c; } Those rules say that it should not be suitable for optimization because there's an implicit cast from signed int to unsigned long long. Got it now! Casts from signed to unsigned are not value-preserving, but they are bit-preserving: s32-s64 obviously is, and s32-u64 has the same result bit-by-bit as the s64 result. The fact that s64 has an implicit ... in front, while an u64 has an implicit ... does not matter. Is this the meaning of the predicate you want? I think so, based on the discussion, but it's hard to say without seeing the cases enumerated (i.e. a patch). However, perhaps there is a catch. We can do the following thought experiment. What would happen if you had multiple widening multiplies? Like 8-bit signed to 64-bit unsigned and then 64-bit unsigned to 128-bit unsigned? I believe in this case you couldn't optimize 8-bit signed to 128-bit unsigned. Would your code do it? Paolo
Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns
On Fri, Jul 1, 2011 at 1:44 PM, Kai Tietz kti...@redhat.com wrote: Ok, here is reworked patch with adjusted testcase. ChangeLog gcc/ 2011-07-01 Kai Tietz kti...@redhat.com * tree-ssa-forwprop.c (truth_valued_ssa): New function. (detect_not_expr_operand): New function. (simplify_bitwise_binary_1): New function. (simplify_bitwise_binary): Use simplify_bitwise_binary_1. ChangeLog gcc/ 2011-07-01 Kai Tietz kti...@redhat.com * gcc.dg/binop-notand1a.c: New test. * gcc.dg/binop-notand2a.c: New test. * gcc.dg/binop-notand3a.c: New test. * gcc.dg/binop-notand4a.c: New test. * gcc.dg/binop-notand5a.c: New test. * gcc.dg/binop-notand6a.c: New test. * gcc.dg/binop-notor1.c: New test. * gcc.dg/binop-notor2.c: New test. * gcc.dg/binop-notxor1.c: New test. * gcc.dg/binop-notxor2.c: New test. Bootstrapped and regression tested for all standard languages plus Ada and Obj-C++ for x86_64-pc-linux-gnu. Ok for apply? (please post patches inline) +/* Checks if expression has type of one-bit precision, or is a known + truth-value pression. */ +static bool +truth_valued_ssa_name (tree name) The function comment should refer to each parameter in capital letters. The comment is also odd, if you consider the function name. Better would be Return true if the SSA name NAME is of truth-value. + /* Don't check here for BOOLEAN_TYPE as the precision isn't + necessarily one and so ~X is not equal to !X. */ + if (TYPE_PRECISION (type) == 1) +return true; Technically correct, but did you run into actual problems without this? +/* Helper routine for simplify_bitwise_binary_1 function. + If a NOT-expression is found, the operand of the NOT-expression is + returned. Othewise NULL_TREE is returned. + Detected not-patterns are !X or X == 0 for X with integral type, and + X ^ 1 or ~X for X with integral type with precision of one. + The value of CNT_CASTS is either zero, or one. */ +static tree +detect_not_expr_operand (tree name) What's a NOT-expression? I'd suggest /* For the SSA name NAME return an expression X so that NAME = !X. If there is no such X, return NULL_TREE. */ Then a better name for the function would be lookup_inverted_value. + def = SSA_NAME_DEF_STMT (name); + if (!def || !is_gimple_assign (def)) +return NULL_TREE; + def is never NULL. + code = gimple_assign_rhs_code (def); + op1 = gimple_assign_rhs1 (def); + op2 = NULL_TREE; + + /* Get for EQ_EXPR or BIT_XOR_EXPR operation the second operand. + If CODE isn't an EQ_EXPR, BIT_XOR_EXPR, TRUTH_NOT_EXPR, + or BIT_NOT_EXPR, then return. */ + if (code == EQ_EXPR || code == BIT_XOR_EXPR) +op2 = gimple_assign_rhs2 (def); + + switch (code) +{ +case TRUTH_NOT_EXPR: + return op1; +case BIT_NOT_EXPR: + if (truth_valued_ssa_name (name)) op1, not name + return op1; + break; +case EQ_EXPR: + /* Check if we have X == 0 and X has an integral type. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) + break; I think you want this test generally, before the switch. + if (integer_zerop (op2)) + return op1; + else if (integer_zerop (op1)) + return op2; It's always op2 that is 0, no need to test op1. What about NE_EXPR? If you allow EQ/NE_EXPRs then what this function returns is not something for which NAME = !X holds but something for which NAME = X == 0 holds. Otherwise you have to make sure op1 is a truth value. There is also EQ/NE_EXPR with op2 == 1, which at least for truth-valued op1 can be handled as well. + break; +case BIT_XOR_EXPR: + /* Check if we have X ^ 1 and X is truth valued. */ + if (integer_onep (op2) truth_valued_ssa_name (op1)) + return op1; + break; +default: + break; +} + /* First check if operands ARG1 and ARG2 are equal, if so we + won't have a NOT-pattern match. Fold these patterns, as + we have detected it already. */ + if (operand_equal_p (arg1, arg2, 0)) +{ + /* X X - X, and X | X - X. */ + if (code == BIT_AND_EXPR || code == BIT_IOR_EXPR) +return arg1; + /* X ^ X - 0. */ + return integer_zero_node; +} gimple_fold catches this already, no reason to do that here. + /* Do we have case not(X) op not(X)? */ + if (a1not a2not) +{ CSE would have handled this, so no reason to check this - you've done this with the previous operand_equal_p test already. + /* Get for each operation operand its optional by one integral typed + cast stripped argument. And get the not-expression's operand, if + argument represents an not-expression. */ + a1not = detect_not_expr_operand (arg1); + a2not = detect_not_expr_operand (arg2); + + /* If there are no not-expressions found, return NULL_TREE. */ + if (!a1not !a2not) +return NULL_TREE; ... + if (a2not) +{ + /* X equal to Y for X
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 01/07/11 13:33, Paolo Bonzini wrote: Got it now! Casts from signed to unsigned are not value-preserving, but they are bit-preserving: s32-s64 obviously is, and s32-u64 has the same result bit-by-bit as the s64 result. The fact that s64 has an implicit ... in front, while an u64 has an implicit ... does not matter. But, the ... and ... are not implicit. They are very real, and if applied incorrectly will change the result, I think. Is this the meaning of the predicate you want? I think so, based on the discussion, but it's hard to say without seeing the cases enumerated (i.e. a patch). The purpose of this predicate is to determine whether any type conversions that occur between the output of a widening multiply, and the input of an addition have any bearing on the end result. We know what the effective output type of the multiply is (the size is 2x the input type, and the signed if either one of the inputs in signed), and we know what the input type of the addition is, but any amount of junk can lie in between. The problem is determining if it *is* junk. In an ideal world there would only be two cases to consider: 1. No conversion needed. 2. A single sign-extend or zero-extend (according to the type of the inputs) to match the input size of the addition. Anything else would be unsuitable for optimization. Of course, it's never that simple, but it should still be possible to boil down a list of conversions to one of these cases, if it's valid. The signedness of the input to the addition is not significant - the code would be the same either way. But, I is important not to try to zero-extend something that started out signed, and not to sign-extend something that started out unsigned. However, perhaps there is a catch. We can do the following thought experiment. What would happen if you had multiple widening multiplies? Like 8-bit signed to 64-bit unsigned and then 64-bit unsigned to 128-bit unsigned? I believe in this case you couldn't optimize 8-bit signed to 128-bit unsigned. Would your code do it? My code does not attempt to combine multiple multiplies. In any case, if you have two multiplications, surely you have at least three input values, so they can't be combined? It does attempt to combine a multiply and an addition, where a suitable madd* insn is available. (This is not new; I'm just trying to do it in more cases.) I have considered the case where you have (a * b) + (c * d), but have not yet coded anything for it. At present, the code will simply choose whichever multiply happens to find itself the first input operand of the plus, and ignores the other, even if the first turns out not to be a suitable candidate. Andrew
C++ PATCH for c++/49085 (ICE with offsetof in template)
We can't take the offsetof a field until after we've laid out the class. Tested x86_64-pc-linux-gnu, applying to trunk. commit 92f52fccaaf25b5791c0f8bff930fe75d25bd90b Author: Jason Merrill ja...@redhat.com Date: Thu Jun 30 23:05:49 2011 -0400 PR c++/49085 * semantics.c (finish_offsetof): Complain about incomplete type. diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index ad68a01..a704aa3 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -3422,6 +3422,12 @@ finish_offsetof (tree expr) } if (REFERENCE_REF_P (expr)) expr = TREE_OPERAND (expr, 0); + if (TREE_CODE (expr) == COMPONENT_REF) +{ + tree object = TREE_OPERAND (expr, 0); + if (!complete_type_or_else (TREE_TYPE (object), object)) + return error_mark_node; +} return fold_offsetof (expr, NULL_TREE); } diff --git a/gcc/testsuite/g++.dg/template/offsetof2.C b/gcc/testsuite/g++.dg/template/offsetof2.C new file mode 100644 index 000..da888f7 --- /dev/null +++ b/gcc/testsuite/g++.dg/template/offsetof2.C @@ -0,0 +1,10 @@ +// PR c++/49085 + +template class T +struct A // { dg-error declaration } +{ + int i, j; + int ar[__builtin_offsetof(A,j)]; // { dg-error incomplete type } +}; + +Aint a;
Re: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD sizeof (void *)
H.J. Lu hjl.to...@gmail.com writes: On Fri, Jul 1, 2011 at 2:02 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: Here is the updated patch. It works on simple tests. I am running full tests. I kept config/i386/value-unwind.h since libgcc/md-unwind-support.h is included too late in unwind-dw2.c and I don't want to move it to be on the safe side. Oh please, don't pile hack upon hack to avoid proper testing. What is your suggestion? How about moving the md-unwind-support.h include up to the rest of the includes? The headers usually only define MD_FALLBACK_FRAME_STATE_FOR and perhaps MD_FROB_UPDATE_CONTEXT, everything else is an internal helper macro, so order shouldn't matter. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD sizeof (void *)
On Fri, Jul 1, 2011 at 6:37 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: On Fri, Jul 1, 2011 at 2:02 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: Here is the updated patch. It works on simple tests. I am running full tests. I kept config/i386/value-unwind.h since libgcc/md-unwind-support.h is included too late in unwind-dw2.c and I don't want to move it to be on the safe side. Oh please, don't pile hack upon hack to avoid proper testing. What is your suggestion? How about moving the md-unwind-support.h include up to the rest of the includes? The headers usually only define MD_FALLBACK_FRAME_STATE_FOR and perhaps MD_FROB_UPDATE_CONTEXT, everything else is an internal helper macro, so order shouldn't matter. It doesn't work on Linux/x86-64: In file included from /export/gnu/import/git/gcc/libgcc/../gcc/unwind-dw2.c:39:0: ./md-unwind-support.h: In function \u2018x86_fallback_frame_state\u2019: ./md-unwind-support.h:121:30: error: dereferencing pointer to incomplete type ./md-unwind-support.h:129:17: error: dereferencing pointer to incomplete type ./md-unwind-support.h:141:23: error: dereferencing pointer to incomplete type ./md-unwind-support.h:153:49: error: dereferencing pointer to incomplete type ./md-unwind-support.h: In function \u2018x86_frob_update_context\u2019: ./md-unwind-support.h:185:30: error: dereferencing pointer to incomplete type ./md-unwind-support.h:192:5: warning: implicit declaration of function \u2018_Unwind_SetSignalFrame\u2019 [-Wimplicit-function-declaration] /export/gnu/import/git/gcc/libgcc/../gcc/unwind-dw2.c: At top level: /export/gnu/import/git/gcc/libgcc/../gcc/unwind-dw2.c:140:1: warning: conflicting types for \u2018_Unwind_SetSignalFrame\u2019 [enabled by default] /export/gnu/import/git/gcc/libgcc/../gcc/unwind-dw2.c:140:1: error: static declaration of \u2018_Unwind_SetSignalFrame\u2019 follows non-static declaration ./md-unwind-support.h:192:5: note: previous implicit declaration of \u2018_Unwind_SetSignalFrame\u2019 was here -- H.J.
Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns
2011/7/1 Richard Guenther richard.guent...@gmail.com: On Fri, Jul 1, 2011 at 1:44 PM, Kai Tietz kti...@redhat.com wrote: Ok, here is reworked patch with adjusted testcase. ChangeLog gcc/ 2011-07-01 Kai Tietz kti...@redhat.com * tree-ssa-forwprop.c (truth_valued_ssa): New function. (detect_not_expr_operand): New function. (simplify_bitwise_binary_1): New function. (simplify_bitwise_binary): Use simplify_bitwise_binary_1. ChangeLog gcc/ 2011-07-01 Kai Tietz kti...@redhat.com * gcc.dg/binop-notand1a.c: New test. * gcc.dg/binop-notand2a.c: New test. * gcc.dg/binop-notand3a.c: New test. * gcc.dg/binop-notand4a.c: New test. * gcc.dg/binop-notand5a.c: New test. * gcc.dg/binop-notand6a.c: New test. * gcc.dg/binop-notor1.c: New test. * gcc.dg/binop-notor2.c: New test. * gcc.dg/binop-notxor1.c: New test. * gcc.dg/binop-notxor2.c: New test. Bootstrapped and regression tested for all standard languages plus Ada and Obj-C++ for x86_64-pc-linux-gnu. Ok for apply? (please post patches inline) +/* Checks if expression has type of one-bit precision, or is a known + truth-value pression. */ +static bool +truth_valued_ssa_name (tree name) The function comment should refer to each parameter in capital letters. The comment is also odd, if you consider the function name. Better would be Return true if the SSA name NAME is of truth-value. Ok + /* Don't check here for BOOLEAN_TYPE as the precision isn't + necessarily one and so ~X is not equal to !X. */ + if (TYPE_PRECISION (type) == 1) + return true; Technically correct, but did you run into actual problems without this? Yes, this makes issues. See BIT_NOT_EXPR in fold-const. It uses LHS type for ~0. [*] +/* Helper routine for simplify_bitwise_binary_1 function. + If a NOT-expression is found, the operand of the NOT-expression is + returned. Othewise NULL_TREE is returned. + Detected not-patterns are !X or X == 0 for X with integral type, and + X ^ 1 or ~X for X with integral type with precision of one. + The value of CNT_CASTS is either zero, or one. */ +static tree +detect_not_expr_operand (tree name) What's a NOT-expression? I'd suggest /* For the SSA name NAME return an expression X so that NAME = !X. If there is no such X, return NULL_TREE. */ Then a better name for the function would be lookup_inverted_value. Hmm, we don't look up inverted values in general. May lookup_inverted_truth_value? + def = SSA_NAME_DEF_STMT (name); + if (!def || !is_gimple_assign (def)) + return NULL_TREE; + def is never NULL. Ok + code = gimple_assign_rhs_code (def); + op1 = gimple_assign_rhs1 (def); + op2 = NULL_TREE; + + /* Get for EQ_EXPR or BIT_XOR_EXPR operation the second operand. + If CODE isn't an EQ_EXPR, BIT_XOR_EXPR, TRUTH_NOT_EXPR, + or BIT_NOT_EXPR, then return. */ + if (code == EQ_EXPR || code == BIT_XOR_EXPR) + op2 = gimple_assign_rhs2 (def); + + switch (code) + { + case TRUTH_NOT_EXPR: + return op1; + case BIT_NOT_EXPR: + if (truth_valued_ssa_name (name)) op1, not name No, name is right. see [*] + return op1; + break; + case EQ_EXPR: + /* Check if we have X == 0 and X has an integral type. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) + break; I think you want this test generally, before the switch. No, no need for this. Just for comparisons I need to check that operands are equal. The type of NAME is always an integral value. + if (integer_zerop (op2)) + return op1; + else if (integer_zerop (op1)) + return op2; It's always op2 that is 0, no need to test op1. So for comparison constant will be moved always right-hand? Ok fine by this. What about NE_EXPR? Maybe for X != 1 for an truth-valued X. But I never saw this pattern generated. All other cases related to NE_EXPR, which might be an inverted variant aren't trivial and not sure if it is worth checking them here. Eg. (X | Y) != 0 - (X != 0 | Y != 0) would be the inverted variant of (X == 0 Y == 0). But those things might be better placed in and/or_comparison folding in gimple-fold.c, isn't it? If you allow EQ/NE_EXPRs then what this function returns is not something for which NAME = !X holds but something for which NAME = X == 0 holds. Otherwise you have to make sure op1 is a truth value. !X is (for integral types) X == (type-x) 0. And this transformation is bijective AFAICS. I don't see the point you mean here. There is also EQ/NE_EXPR with op2 == 1, which at least for truth-valued op1 can be handled as well. See comment above. It is true that X != 1 (for truth-valued X) is X == 0. This might be a special case worth to add, but for X != 0 (for truth-valued X) it isn't. Same as for X == 1 cases. We want to return the argument of the not expression here. So we would need to
Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns
2011/7/1 Kai Tietz ktiet...@googlemail.com: 2011/7/1 Richard Guenther richard.guent...@gmail.com: On Fri, Jul 1, 2011 at 1:44 PM, Kai Tietz kti...@redhat.com wrote: Ok, here is reworked patch with adjusted testcase. ChangeLog gcc/ 2011-07-01 Kai Tietz kti...@redhat.com * tree-ssa-forwprop.c (truth_valued_ssa): New function. (detect_not_expr_operand): New function. (simplify_bitwise_binary_1): New function. (simplify_bitwise_binary): Use simplify_bitwise_binary_1. ChangeLog gcc/ 2011-07-01 Kai Tietz kti...@redhat.com * gcc.dg/binop-notand1a.c: New test. * gcc.dg/binop-notand2a.c: New test. * gcc.dg/binop-notand3a.c: New test. * gcc.dg/binop-notand4a.c: New test. * gcc.dg/binop-notand5a.c: New test. * gcc.dg/binop-notand6a.c: New test. * gcc.dg/binop-notor1.c: New test. * gcc.dg/binop-notor2.c: New test. * gcc.dg/binop-notxor1.c: New test. * gcc.dg/binop-notxor2.c: New test. Bootstrapped and regression tested for all standard languages plus Ada and Obj-C++ for x86_64-pc-linux-gnu. Ok for apply? (please post patches inline) +/* Checks if expression has type of one-bit precision, or is a known + truth-value pression. */ +static bool +truth_valued_ssa_name (tree name) The function comment should refer to each parameter in capital letters. The comment is also odd, if you consider the function name. Better would be Return true if the SSA name NAME is of truth-value. Ok + /* Don't check here for BOOLEAN_TYPE as the precision isn't + necessarily one and so ~X is not equal to !X. */ + if (TYPE_PRECISION (type) == 1) + return true; Technically correct, but did you run into actual problems without this? Yes, this makes issues. See BIT_NOT_EXPR in fold-const. It uses LHS type for ~0. [*] +/* Helper routine for simplify_bitwise_binary_1 function. + If a NOT-expression is found, the operand of the NOT-expression is + returned. Othewise NULL_TREE is returned. + Detected not-patterns are !X or X == 0 for X with integral type, and + X ^ 1 or ~X for X with integral type with precision of one. + The value of CNT_CASTS is either zero, or one. */ +static tree +detect_not_expr_operand (tree name) What's a NOT-expression? I'd suggest /* For the SSA name NAME return an expression X so that NAME = !X. If there is no such X, return NULL_TREE. */ Then a better name for the function would be lookup_inverted_value. Hmm, we don't look up inverted values in general. May lookup_inverted_truth_value? + def = SSA_NAME_DEF_STMT (name); + if (!def || !is_gimple_assign (def)) + return NULL_TREE; + def is never NULL. Ok + code = gimple_assign_rhs_code (def); + op1 = gimple_assign_rhs1 (def); + op2 = NULL_TREE; + + /* Get for EQ_EXPR or BIT_XOR_EXPR operation the second operand. + If CODE isn't an EQ_EXPR, BIT_XOR_EXPR, TRUTH_NOT_EXPR, + or BIT_NOT_EXPR, then return. */ + if (code == EQ_EXPR || code == BIT_XOR_EXPR) + op2 = gimple_assign_rhs2 (def); + + switch (code) + { + case TRUTH_NOT_EXPR: + return op1; + case BIT_NOT_EXPR: + if (truth_valued_ssa_name (name)) op1, not name No, name is right. see [*] + return op1; + break; + case EQ_EXPR: + /* Check if we have X == 0 and X has an integral type. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) + break; I think you want this test generally, before the switch. No, no need for this. Just for comparisons I need to check that operands are equal. The type of NAME is always an integral value. + if (integer_zerop (op2)) + return op1; + else if (integer_zerop (op1)) + return op2; It's always op2 that is 0, no need to test op1. So for comparison constant will be moved always right-hand? Ok fine by this. What about NE_EXPR? Maybe for X != 1 for an truth-valued X. But I never saw this pattern generated. All other cases related to NE_EXPR, which might be an inverted variant aren't trivial and not sure if it is worth checking them here. Eg. (X | Y) != 0 - (X != 0 | Y != 0) would be the inverted variant of (X == 0 Y == 0). But those things might be better placed in and/or_comparison folding in gimple-fold.c, isn't it? If you allow EQ/NE_EXPRs then what this function returns is not something for which NAME = !X holds but something for which NAME = X == 0 holds. Otherwise you have to make sure op1 is a truth value. !X is (for integral types) X == (type-x) 0. And this transformation is bijective AFAICS. I don't see the point you mean here. There is also EQ/NE_EXPR with op2 == 1, which at least for truth-valued op1 can be handled as well. See comment above. It is true that X != 1 (for truth-valued X) is X == 0. This might be a special case worth to add, but for X != 0 (for truth-valued X) it isn't. Same as for X == 1 cases. We want to
Re: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD sizeof (void *)
H.J. Lu hjl.to...@gmail.com writes: What is your suggestion? How about moving the md-unwind-support.h include up to the rest of the includes? The headers usually only define MD_FALLBACK_FRAME_STATE_FOR and perhaps MD_FROB_UPDATE_CONTEXT, everything else is an internal helper macro, so order shouldn't matter. It doesn't work on Linux/x86-64: In file included from /export/gnu/import/git/gcc/libgcc/../gcc/unwind-dw2.c:39:0: ./md-unwind-support.h: In function \u2018x86_fallback_frame_state\u2019: ./md-unwind-support.h:121:30: error: dereferencing pointer to incomplete type Then move it below the definition of struct _Unwind_Context with a comment explaining why it has to be there. I thought you aspired to become Linux maintainer? Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD sizeof (void *)
On Fri, Jul 1, 2011 at 7:02 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: What is your suggestion? How about moving the md-unwind-support.h include up to the rest of the includes? The headers usually only define MD_FALLBACK_FRAME_STATE_FOR and perhaps MD_FROB_UPDATE_CONTEXT, everything else is an internal helper macro, so order shouldn't matter. It doesn't work on Linux/x86-64: In file included from /export/gnu/import/git/gcc/libgcc/../gcc/unwind-dw2.c:39:0: ./md-unwind-support.h: In function \u2018x86_fallback_frame_state\u2019: ./md-unwind-support.h:121:30: error: dereferencing pointer to incomplete type Then move it below the definition of struct _Unwind_Context with a It won't work since I need to define a macro before struct _Unwind_Context. comment explaining why it has to be there. I thought you aspired to become Linux maintainer? Yes, not by breaking working codes. -- H.J.
Re: PING: PATCH: PR target/46770: Use .init_array/.fini_array sections
On Sun, Jun 19, 2011 at 11:39 AM, H.J. Lu hjl.to...@gmail.com wrote: On Fri, Jun 3, 2011 at 5:51 AM, H.J. Lu hjl.to...@gmail.com wrote: On Fri, Jun 3, 2011 at 5:31 AM, Richard Guenther richard.guent...@gmail.com wrote: On Fri, Jun 3, 2011 at 6:31 AM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, May 18, 2011 at 8:57 AM, H.J. Lu hjl.to...@gmail.com wrote: On Tue, Apr 26, 2011 at 6:05 AM, H.J. Lu hjl.to...@gmail.com wrote: On Thu, Mar 31, 2011 at 7:57 AM, H.J. Lu hjl.to...@gmail.com wrote: On Mon, Mar 21, 2011 at 11:40 AM, H.J. Lu hjl.to...@gmail.com wrote: On Mon, Mar 14, 2011 at 12:28 PM, H.J. Lu hjl.to...@gmail.com wrote: On Thu, Jan 27, 2011 at 2:40 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Jan 27, 2011 at 12:12 AM, H.J. Lu hongjiu...@intel.com wrote: On Tue, Dec 14, 2010 at 05:20:48PM -0800, H.J. Lu wrote: This patch uses .init_array/.fini_array sections instead of .ctors/.dtors sections if mixing .init_array/.fini_array and .ctors/.dtors sections with init_priority works. It removes .ctors/.ctors sections from executables and DSOes, which will remove one function call at startup time from each executable and DSO. It should reduce image size and improve system startup time. If a platform with a working .init_array/.fini_array support needs a different .init_array/.fini_array implementation, it can set use_initfini_array to no. Since .init_array/.fini_array is a target feature. --enable-initfini-array is default to no unless the native run-time test is passed. To pass the native run-time test, a linker with SORT_BY_INIT_PRIORITY support is required. The binutils patch is available at http://sourceware.org/ml/binutils/2010-12/msg00466.html Linker patch has been checked in. This patch passed 32bit/64bit regression test on Linux/x86-64. Any comments? This updated patch fixes build on Linux/ia64 and should work on others. Any comments? Yes. This is stage1 material. Here is the updated patch. OK for trunk? Thanks. -- H.J. 2011-03-14 H.J. Lu hongjiu...@intel.com PR target/46770 * acinclude.m4 (gcc_AC_INITFINI_ARRAY): Removed. * config.gcc (use_initfini_array): New variable. Use initfini-array.o if supported. * crtstuff.c: Don't generate .ctors nor .dtors sections if NO_CTORS_DTORS_SECTIONS is defined. * configure.ac: Remove gcc_AC_INITFINI_ARRAY. Add --enable-initfini-array and check if .init_array can be used with .ctors. * configure: Regenerated. * config/initfini-array.c: New. * config/initfini-array.h: Likewise. * config/t-initfini-array: Likewise. * config/arm/arm.c (arm_asm_init_sections): Call elf_initfini_array_init_sections if NO_CTORS_DTORS_SECTIONS is defined. * config/avr/avr.c (avr_asm_init_sections): Likewise. * config/ia64/ia64.c (ia64_asm_init_sections): Likewise. * config/mep/mep.c (mep_asm_init_sections): Likewise. * config/microblaze/microblaze.c (microblaze_elf_asm_init_sections): Likewise. * config/rs6000/rs6000.c (rs6000_elf_asm_init_sections): Likewise. * config/stormy16/stormy16.c (xstormy16_asm_init_sections): Likewise. * config/v850/v850.c (v850_asm_init_sections): Likewise. PING: http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00760.html Any comments? Any objections? Here is the patch updated for the current trunk. OK for trunk? PING,. Hi Richard, You commented my patch was stage 1 material: http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01989.html Is my patch: http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00760.html OK for trunk? I can't approve the configury changes and would like to defer to target maintainers for the target specific changes. That said, I'm not familiar enough with the area of the patch. But yes, it's stage1 now - so if anyone else wants to approve this patch... My first attempt: http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00589.html only affects x86. I changed it to generic based on the feedbacks. But other target maintainers show no interests. Should I make it x86 only first? Each target can enable it if needed. I am enclosing 2 patches here. One only affects Linux/x86 and the other covers all targets. I tested both versions on Linux/x86 without any regressions. Since I only got OK from one target maintainer and I have been pinging on this patch for more than 6 months, I'd like to get it enabled for Linux/x86 soon. Hi Ian, I'd like to get this issue resolved, at least for Linux/x86. Can you recommend how I should proceed? Thanks. -- H.J.
Re: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD sizeof (void *)
H.J. Lu hjl.to...@gmail.com writes: Then move it below the definition of struct _Unwind_Context with a It won't work since I need to define a macro before struct _Unwind_Context. Then this does seem to be a case for libgcc_tm_file indeed. Ugly that the unwinder configuration has to be split between two different files this way, but unavoidable, it seems. At least when libgcc_tm_file moves to libgcc/config.host, it won't any longer be split between gcc and libgcc. comment explaining why it has to be there. I thought you aspired to become Linux maintainer? Yes, not by breaking working codes. Anyway, thanks for trying. I'd really have liked to avoid this split. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [Patch, Fortran, F08] PR 49562: [4.6/4.7 Regression] [OOP] assigning value to type-bound function
Ping ... ! 2011/6/28 Janus Weil ja...@gcc.gnu.org: Hi all, here is a patch for a problem which was originally reported as an ICE-on-invalid regression (assigning to a type-bound function). In the course of fixing it, I noticed that it becomes valid according to F08 if the function is pointer-valued, and modified the patch such that it will accept this variant. I also adapted the original test case to be a run-time test of this F08 feature (in fact it is just a very complicated way of performing an increment from 0 to 1, and would still segfault without the patch). The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk and 4.6.2? Cheers, Janus 2011-06-28 Janus Weil ja...@gcc.gnu.org PR fortran/49562 * expr.c (gfc_check_vardef_context): Handle type-bound procedures. 2011-06-28 Janus Weil ja...@gcc.gnu.org PR fortran/49562 * gfortran.dg/typebound_proc_23.f90: New.
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 07/01/2011 03:30 PM, Stubbs, Andrew wrote: However, perhaps there is a catch. We can do the following thought experiment. What would happen if you had multiple widening multiplies? Like 8-bit signed to 64-bit unsigned and then 64-bit unsigned to 128-bit unsigned? I believe in this case you couldn't optimize 8-bit signed to 128-bit unsigned. Would your code do it? My code does not attempt to combine multiple multiplies. In any case, if you have two multiplications, surely you have at least three input values, so they can't be combined? What about (u128)c + (u64)((s8)a * (s8)b)? You cannot convert this to (u128)c + (u128)((s8)a * (s8)b). Paolo
Re: [PATCH] [Annotalysis] Fixes virtual method calls when type is unknown
Hi, On Thu, Jun 30, 2011 at 10:01:58AM -0700, Delesley Hutchins wrote: This bug fixes a failure in annotalysis that is caused when gcc does not return the correct static type for the callee of a virtual method. Bootstrapped and passed GCC regression testsuite on x86_64-unknown-linux-gnu. Okay for branches/annotalysis and google/main? -DeLesley 2011-06-30 DeLesley Hutchins deles...@google.com * tree-threadsafe-analyze.c (handle_call_gs): Fixes case where the virtual method callee has unknown type. Index: gcc/testsuite/g++.dg/thread-ann/thread_annot_lock-79.C === --- gcc/testsuite/g++.dg/thread-ann/thread_annot_lock-79.C(revision 0) +++ gcc/testsuite/g++.dg/thread-ann/thread_annot_lock-79.C(revision 0) @@ -0,0 +1,54 @@ +// Test virtual method calls in cases where the static type is unknown +// This is a good test case that should not incur any thread safety warning. +// { dg-do compile } +// { dg-options -Wthread-safety -O } + +#include thread_annot_common.h + +class Foo { +public: + Mutex m; + + Foo(); + virtual ~Foo(); + virtual void doSomething() EXCLUSIVE_LOCKS_REQUIRED(m) = 0; +}; + + +class FooDerived : public Foo { +public: + FooDerived(); + virtual ~FooDerived(); + virtual void doSomething(); +}; + + +/** + * This is a test case for a bug wherein annotalysis would crash when analyzing + * a virtual method call. + * + * The following three functions represent cases where gcc loses the static + * type of foo in the expression foo-doSomething() when it drops down to + * gimple. Annotalysis should technically issue a thread-safe warning in these + * cases, but because the type is missing, it should silently accept them + * instead. See tree-threadsafe-analyze.c::handle_call_gs. + */ + +// reinterpret_cast +void foo1(void* ptr) +{ + reinterpret_castFoo*(ptr)-doSomething(); +} + +// C-style cast +void foo2(int* buf) +{ + ((Foo*) buf)-doSomething(); +} + +// new expression, with no local variable +void foo3() +{ + (new FooDerived)-doSomething(); +} + Index: gcc/tree-threadsafe-analyze.c === --- gcc/tree-threadsafe-analyze.c (revision 175385) +++ gcc/tree-threadsafe-analyze.c (working copy) @@ -2446,7 +2446,18 @@ if (TREE_CODE (callee) == OBJ_TYPE_REF) { tree objtype = TREE_TYPE (TREE_TYPE (OBJ_TYPE_REF_OBJECT (callee))); - fdecl = lang_hooks.get_virtual_function_decl (callee, objtype); + /* Check to make sure objtype is a valid type. + * OBJ_TYPE_REF_OBJECT does not always return the correct static type of the callee. + * For example: Given foo(void* ptr) { ((Foo*) ptr)-doSomething(); } + * objtype will be void, not Foo. Whether or not this happens depends on the details + * of how a particular call is lowered to GIMPLE, and there is no easy fix that works + * in all cases. For now, we simply rely on gcc's type information; if that information + * is not accurate, then the analysis will be less precise. + */ + if (TREE_CODE(objtype) == RECORD_TYPE) /* if objtype is an object type... */ +{ + fdecl = lang_hooks.get_virtual_function_decl (callee, objtype); +} } } If you in some way rely on static types of those pointers, you may be in for a number of unpleasant surprises, because these types do not really have any meaning at all. Just out of curiosity, I does your analysis crash also on the testcase below (add the Mutex field if it is necessary)? Also, I have just very briefly glanced at the code in handle_call_gs but still it seems to me that the code dealing with virtual calls assumes virtual functions are never overridden...? And pretty please, use -p diff option when creating patches to be posted here. Thanks, Martin namespace std { typedef __SIZE_TYPE__ size_t; } inline void* operator new(std::size_t, void* __p) throw() { return __p; } extern C void abort (void); struct Foo { public: char stuff[1024]; }; class Bar { public: virtual int test (void) { return 1; } }; Foo ft; Foo *f = ft; int main() { Bar *b; b = new (f) Bar(); return b-test(); }
Re: Fix PR 49014
On 26.05.2011 17:32, Andrey Belevantsev wrote: On 25.05.2011 19:31, Bernd Schmidt wrote: On 05/25/2011 03:29 PM, Andrey Belevantsev wrote: I think the hook is a better idea than the attribute because nobody will care to mark all offending insns with an attribute. I don't know. IIRC when I looked at sh or whatever the broken port was, it was only two insns - there would still be some value in being able to assert that all other insns have a reservation. OK, I will take a look on x86-64 and will get back with more information. Andrey So, I have made an attempt to bootstrap on x86-64 with the extra assert in selective scheduling that assumes the DFA state always changes when issuing a recog_memoized =0 insn (patch attached). Indeed, there are just a few general insns that don't have proper reservations. However, it was a surprise to me to see that almost any insn with SSE registers fails this assert and thus does not get properly scheduled. Overall, the work on fixing those seems doable, it took just a day to get the compiler bootstrapped (of course, the testsuite may bring much more issues). So, if there is an agreement on marking a few offending insns with the new attribute, we can proceed with the help of somebody from the x86 land on fixing those and researching for other targets. Andrey Bernd Index: gcc/sel-sched.c === *** gcc/sel-sched.c (revision 175757) --- gcc/sel-sched.c (working copy) *** advance_state_on_fence (fence_t fence, i *** 5305,5310 --- 5305,5315 if (FENCE_ISSUED_INSNS (fence) issue_rate) gcc_unreachable (); } + else if (get_attr_nondfa_insn (insn) == 0) + { + error (found an insn that does not modify DFA state); + fatal_insn (this is the insn:, insn); + } } else { Index: gcc/common.opt === *** gcc/common.opt (revision 175757) --- gcc/common.opt (working copy) *** Common Report Var(flag_selective_schedul *** 1687,1697 Schedule instructions using selective scheduling algorithm fselective-scheduling2 ! Common Report Var(flag_selective_scheduling2) Optimization Run selective scheduling after reload fsel-sched-pipelining ! Common Report Var(flag_sel_sched_pipelining) Init(0) Optimization Perform software pipelining of inner loops during selective scheduling fsel-sched-pipelining-outer-loops --- 1687,1697 Schedule instructions using selective scheduling algorithm fselective-scheduling2 ! Common Report Var(flag_selective_scheduling2) Optimization Init(1) Run selective scheduling after reload fsel-sched-pipelining ! Common Report Var(flag_sel_sched_pipelining) Init(1) Optimization Perform software pipelining of inner loops during selective scheduling fsel-sched-pipelining-outer-loops Index: gcc/config/i386/i386.md === *** gcc/config/i386/i386.md (revision 175757) --- gcc/config/i386/i386.md (working copy) *** (define_attr prefix orig,vex,maybe_ve *** 518,523 --- 518,526 ;; VEX W bit is used. (define_attr prefix_vex_w (const_int 0)) + + (define_attr nondfa_insn (const_int 0)) + ;; The length of VEX prefix ;; Only instructions with 0f prefix can have 2 byte VEX prefix, ;; 0f38/0f3a prefixes can't. In i386.md 0f3[8a] is *** (define_insn x86_fnstsw_1 *** 1467,1472 --- 1470,1476 fnstsw\t%0 [(set (attr length) (symbol_ref ix86_attr_length_address_default (insn) + 2)) (set_attr mode SI) +(set (attr nondfa_insn) (const_int 1)) (set_attr unit i387)]) ;; FP compares, step 3 *** (define_insn *movti_internal_sse *** 1962,1967 --- 1966,1972 } [(set_attr type sselog1,ssemov,ssemov) (set_attr prefix maybe_vex) +(set (attr nondfa_insn) (const_int 1)) (set (attr mode) (cond [(ior (eq (symbol_ref TARGET_SSE2) (const_int 0)) (ne (symbol_ref optimize_function_for_size_p (cfun)) *** (define_insn *movdi_internal *** 2152,2157 --- 2157,2163 (if_then_else (eq_attr alternative 9,10,11,12) (const_string noavx) (const_string *))) +(set (attr nondfa_insn) (if_then_else (eq_attr alternative 2,3,4,5,6,7,8,9,10,11,12) (const_int 1) (const_int 0))) (set (attr type) (cond [(eq_attr alternative 0,1) (const_string multi) *** (define_insn *movsi_internal *** 2244,2249 --- 2250,2256 (if_then_else (and (eq_attr type ssemov) (eq_attr mode SI)) (const_string 1) (const_string *))) +(set (attr nondfa_insn) (if_then_else (eq_attr alternative 6,7,9,11) (const_int 1) (const_int 0))) (set (attr mode) (cond [(eq_attr alternative 2,3) (const_string DI) *** (define_insn
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 01/07/11 15:40, Paolo Bonzini wrote: On 07/01/2011 03:30 PM, Stubbs, Andrew wrote: However, perhaps there is a catch. We can do the following thought experiment. What would happen if you had multiple widening multiplies? Like 8-bit signed to 64-bit unsigned and then 64-bit unsigned to 128-bit unsigned? I believe in this case you couldn't optimize 8-bit signed to 128-bit unsigned. Would your code do it? My code does not attempt to combine multiple multiplies. In any case, if you have two multiplications, surely you have at least three input values, so they can't be combined? What about (u128)c + (u64)((s8)a * (s8)b)? You cannot convert this to (u128)c + (u128)((s8)a * (s8)b). Oh I see, sorry. Yes, that's exactly what I'm trying to do here. No, wait, I don't see. Where are these multiple widening multiplies you're talking about? I only see one multiply? Andrew
Re: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD sizeof (void *)
On Fri, Jul 1, 2011 at 7:25 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: Then move it below the definition of struct _Unwind_Context with a It won't work since I need to define a macro before struct _Unwind_Context. Then this does seem to be a case for libgcc_tm_file indeed. Ugly that the unwinder configuration has to be split between two different files this way, but unavoidable, it seems. At least when libgcc_tm_file moves to libgcc/config.host, it won't any longer be split between gcc and libgcc. We need target support for unwind: 1. Target configuration. 2. Target implementation. Unfortunately, they can't be put in the same file. If we really want to use the same file, we can add some macros like: UNWIND_CONFIGURE UNWIND_IMPLEMENT and we can include the same file twice. -- H.J.
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 01/07/11 14:30, Stubbs, Andrew wrote: Got it now! Casts from signed to unsigned are not value-preserving, but they are bit-preserving: s32-s64 obviously is, and s32-u64 has the same result bit-by-bit as the s64 result. The fact that s64 has an implicit ... in front, while an u64 has an implicit ... does not matter. But, the ... and ... are not implicit. They are very real, and if applied incorrectly will change the result, I think. Wait, I'm clearly confused When I try a s32-u64 conversion, the expand pass generates a sign_extend insn. Clearly it's the source type that determines the extension type, not the destination type ... and I'm a dunce! Thanks :) Andrew
Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns
So updated patch (bootstrapped and tested for all standard languages plus Ada and Obj-C++) on x86_64-pc-linux-gnu host. Index: gcc-head/gcc/tree-ssa-forwprop.c === --- gcc-head.orig/gcc/tree-ssa-forwprop.c +++ gcc-head/gcc/tree-ssa-forwprop.c @@ -1602,6 +1602,156 @@ simplify_builtin_call (gimple_stmt_itera return false; } +/* Checks if expression has type of one-bit precision, or is a known + truth-valued expression. */ +static bool +truth_valued_ssa_name (tree name) +{ + gimple def; + tree type = TREE_TYPE (name); + + if (!INTEGRAL_TYPE_P (type)) +return false; + /* Don't check here for BOOLEAN_TYPE as the precision isn't + necessarily one and so ~X is not equal to !X. */ + if (TYPE_PRECISION (type) == 1) +return true; + def = SSA_NAME_DEF_STMT (name); + if (is_gimple_assign (def)) +return truth_value_p (gimple_assign_rhs_code (def)); + return false; +} + +/* Helper routine for simplify_bitwise_binary_1 function. + Return for the SSA name NAME the expression X if it mets condition + NAME = !X. Otherwise return NULL_TREE. + Detected patterns for NAME = !X are: + !X and X == 0 for X with integral type. + X ^ 1, X != 1,or ~X for X with integral type with precision of one. */ +static tree +lookup_logical_inverted_value (tree name) +{ + tree op1, op2; + enum tree_code code; + gimple def; + + /* If name has none-intergal type, or isn't a SSA_NAME, then + return. */ + if (TREE_CODE (name) != SSA_NAME + || !INTEGRAL_TYPE_P (TREE_TYPE (name))) +return NULL_TREE; + def = SSA_NAME_DEF_STMT (name); + if (!is_gimple_assign (def)) +return NULL_TREE; + + code = gimple_assign_rhs_code (def); + op1 = gimple_assign_rhs1 (def); + op2 = NULL_TREE; + + /* Get for EQ_EXPR or BIT_XOR_EXPR operation the second operand. + If CODE isn't an EQ_EXPR, BIT_XOR_EXPR, TRUTH_NOT_EXPR, + or BIT_NOT_EXPR, then return. */ + if (code == EQ_EXPR || code == NE_EXPR + || code == BIT_XOR_EXPR) +op2 = gimple_assign_rhs2 (def); + + switch (code) +{ +case TRUTH_NOT_EXPR: + return op1; +case BIT_NOT_EXPR: + if (truth_valued_ssa_name (name)) + return op1; + break; +case EQ_EXPR: + /* Check if we have X == 0 and X has an integral type. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) + break; + if (integer_zerop (op2)) + return op1; + break; +case NE_EXPR: + /* Check if we have X != 1 and X is a truth-valued. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) + break; + if (integer_onep (op2) truth_valued_ssa_name (op1)) + return op1; + break; +case BIT_XOR_EXPR: + /* Check if we have X ^ 1 and X is truth valued. */ + if (integer_onep (op2) truth_valued_ssa_name (op1)) + return op1; + break; +default: + break; +} + + return NULL_TREE; +} + +/* Try to optimize patterns X !X - zero, X | !X - one, and + X ^ !X - one, if type of X is valid for this. + + See for list of detected logical-not patterns the + lookup_logical_inverted_value function. */ +static tree +simplify_bitwise_binary_1 (enum tree_code code, tree arg1, + tree arg2) +{ + tree a1not, a2not; + tree op = NULL_TREE; + + /* If CODE isn't a bitwise binary operation, return NULL_TREE. */ + if (code != BIT_AND_EXPR code != BIT_IOR_EXPR + code != BIT_XOR_EXPR) +return NULL_TREE; + + /* First check if operands ARG1 and ARG2 are equal. */ + if (operand_equal_p (arg1, arg2, 0)) +return NULL_TREE; + /* See if we have in arguments logical-not patterns. */ + a1not = lookup_logical_inverted_value (arg1); + a2not = lookup_logical_inverted_value (arg2); + + /* If there are no logical-not in arguments, return NULL_TREE. */ + if (!a1not !a2not) +return NULL_TREE; + + /* If both arguments are logical-not patterns, then try to fold + them or return NULL_TREE. */ + if (a1not a2not) +{ + /* If logical-not operands of ARG1 and ARG2 are equal, then fold +them.. */ + if (operand_equal_p (a1not, a2not, 0)) + { + /* !X !X - !X, and !X | !X - !X. */ + if (code == BIT_AND_EXPR || code == BIT_IOR_EXPR) + return arg1; + /* !X ^ !X - 0. */ + return integer_zero_node; + } + return NULL_TREE; +} + + if (a2not operand_equal_p (arg1, a2not, 0)) +op = arg1; + else if (a1not operand_equal_p (arg2, a1not, 0)) +op = arg2; + else +return NULL_TREE; + + /* X !X - 0. */ + if (code == BIT_AND_EXPR) +return integer_zero_node; + /* X | !X - 1 and X ^ !X - 1, if X is truth-valued. */ + if (truth_valued_ssa_name (op)) +return integer_one_node; + + /* ??? Otherwise result is (X != 0 ? X : 1. not handled. */ + return NULL_TREE; +} + /* Simplify bitwise binary operations. Return true if a transformation applied, otherwise return false. */ @@
[Patch 0/3] ARM 64 bit atomic operations
Hi, This is a series of 3 patches relating to ARM atomic operations. 1) Provide 64 bit atomic operations using the new ldrexd/strexd in ARMv6k and above. 2) Provide fallbacks so that when compiled for earlier CPUs a Linux kernel asssist is called (as per 32bit and smaller ops) 3) Fix pr48126 which is a misplaced barrier in the atomic generation Many thanks to Richard Sandiford for pointing me in the right direction and reviewing it. This work is part of Linaro blueprint: https://blueprints.launchpad.net/linaro-toolchain-misc/+spec/64-bit-sync-primitives The patch was generated from the gcc git tree from about 2 weeks back but applies happily on top of the head. It's been tested cross to ARM from x86 and also a native x86 build test. Dave
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 07/01/2011 04:55 PM, Stubbs, Andrew wrote: What about (u128)c + (u64)((s8)a * (s8)b)? You cannot convert this to (u128)c + (u128)((s8)a * (s8)b). Oh I see, sorry. Yes, that's exactly what I'm trying to do here. No, wait, I don't see. Where are these multiple widening multiplies you're talking about? I only see one multiply? I meant one multiplication with multiple widening steps. Not clear at all, sorry. Paolo
Re: [PATCH, PR 49094] Refrain from creating misaligned accesses in SRA
Hi, On Thu, Jun 30, 2011 at 06:36:01PM +0200, Martin Jambor wrote: Hi, On Thu, Jun 30, 2011 at 03:39:55PM +0200, Martin Jambor wrote: Hi, I had to add a test that the analyzed expression is not an SSA name. This has been approved by Rchi on IRC yesterday. Thus, I have committed the following as revision 175703 after successful run of c and c++ testsuite on sparc64 (and a bootstrap and test with another patch on x86_64-linux). I also tested fortran on sparc64 but missed a regression there (gfortran.dg/pr25923.f90). The problem is that *arg_1(D) is also scrutinized, get_object_alignment returns 8 for it and that is obviously not enough for SImode required alignment. The following patch resolves this issue, has been approved on IRC by Richi and I have just committed it. Thanks, Martin 2011-07-01 Martin Jambor mjam...@suse.cz * tree-sra.c (tree_non_mode_aligned_mem_p): Also ignore MEM_REFs. Index: src/gcc/tree-sra.c === --- src.orig/gcc/tree-sra.c +++ src/gcc/tree-sra.c @@ -1076,6 +1076,7 @@ tree_non_mode_aligned_mem_p (tree exp) unsigned int align; if (TREE_CODE (exp) == SSA_NAME + || TREE_CODE (exp) == MEM_REF || mode == BLKmode || !STRICT_ALIGNMENT) return false;
[Patch 2/3] ARM 64 bit atomic operations
Provide fallbacks for 64bit atomics that call Linux commpage helpers when compiling for older machines. The code is based on the existing linux-atomic.c for other sizes, however it performs an init time check that the kernel is new enough to provide the helper. This relies on Nicolas Pitre's kernel patch here: https://patchwork.kernel.org/patch/894932/ diff --git a/gcc/config/arm/linux-atomic-64bit.c b/gcc/config/arm/linux-atomic-64bit.c new file mode 100644 index 000..140cc2f --- /dev/null +++ b/gcc/config/arm/linux-atomic-64bit.c @@ -0,0 +1,165 @@ +/* 64bit Linux-specific atomic operations for ARM EABI. + Copyright (C) 2008, 2009, 2010, 2011 Free Software Foundation, Inc. + Based on linux-atomic.c + + 64 bit additions david.gilb...@linaro.org + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +Under Section 7 of GPL version 3, you are granted additional +permissions described in the GCC Runtime Library Exception, version +3.1, as published by the Free Software Foundation. + +You should have received a copy of the GNU General Public License and +a copy of the GCC Runtime Library Exception along with this program; +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +http://www.gnu.org/licenses/. */ + +/* 64bit helper functions for atomic operations; the compiler will + call these when the code is compiled for a CPU without ldrexd/strexd. + (If the CPU had those then the compiler inlines the operation). + + These helpers require a kernel helper that's only present on newer + kernels; we check for that in an init section and bail out rather + unceremoneously. */ + +/* For write */ +#include unistd.h +/* For abort */ +#include stdlib.h + +/* Kernel helper for compare-and-exchange. */ +typedef int (__kernel_cmpxchg64_t) (const long long* oldval, + const long long* newval, + long long *ptr); +#define __kernel_cmpxchg64 (*(__kernel_cmpxchg64_t *) 0x0f60) + +/* Kernel helper page version number */ + +#define __kernel_helper_version (*(unsigned int *)0x0ffc) + +/* Check that the kernel has a new enough version at load */ +void __check_for_sync8_kernelhelper (void) +{ + if (__kernel_helper_version 5) +{ + const char err[] = A newer kernel is required to run this binary. (__kernel_cmpxchg64 helper)\n; + /* At this point we need a way to crash with some information +for the user - I'm not sure I can rely on much else being +available at this point, so do the same as generic-morestack.c +write() and abort(). */ + write (2 /* stderr */, err, sizeof(err)); + abort (); +} +}; + +void (*__sync8_kernelhelper_inithook[]) (void) __attribute__ ((section (.init_array))) = { + __check_for_sync8_kernelhelper +}; + +#define HIDDEN __attribute__ ((visibility (hidden))) + +#define FETCH_AND_OP_WORD64(OP, PFX_OP, INF_OP)\ + long long HIDDEN \ + __sync_fetch_and_##OP##_8 (long long *ptr, long long val)\ + {\ +int failure; \ +long long tmp,tmp2;\ + \ +do { \ + tmp = *ptr; \ + tmp2 = PFX_OP (tmp INF_OP val); \ + failure = __kernel_cmpxchg64 (tmp, tmp2, ptr); \ +} while (failure != 0);\ + \ +return tmp;\ + } + +FETCH_AND_OP_WORD64 (add, , +) +FETCH_AND_OP_WORD64 (sub, , -) +FETCH_AND_OP_WORD64 (or,, |) +FETCH_AND_OP_WORD64 (and, , ) +FETCH_AND_OP_WORD64 (xor, , ^) +FETCH_AND_OP_WORD64 (nand, ~, ) + +#define NAME_oldval(OP, WIDTH) __sync_fetch_and_##OP##_##WIDTH +#define NAME_newval(OP, WIDTH) __sync_##OP##_and_fetch_##WIDTH + +/* Implement both __sync_op_and_fetch and __sync_fetch_and_op for + subword-sized quantities. */ + +#define OP_AND_FETCH_WORD64(OP, PFX_OP, INF_OP)\ + long long HIDDEN \ + __sync_##OP##_and_fetch_8 (long long *ptr, long long val)\ + {\ +int
[Patch 3/3] ARM 64 bit atomic operations
As per pr/48126 Michael Edwards spotted that in the case where the compare fails in the cmpxchg, the barrier at the end wasn't taken theoretically allowing a following load to float up above the load value compared. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 057f9ba..39057d2 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -23531,8 +23626,8 @@ arm_output_sync_loop (emit_f emit, } } - arm_process_output_memory_barrier (emit, NULL); arm_output_asm_insn (emit, 1, operands, %sLSYB%%=:, LOCAL_LABEL_PREFIX); + arm_process_output_memory_barrier (emit, NULL); } static rtx
Re: [Patch 2/3] ARM 64 bit atomic operations
On 07/01/2011 08:55 AM, Dr. David Alan Gilbert wrote: +/* Check that the kernel has a new enough version at load */ +void __check_for_sync8_kernelhelper (void) +{ + if (__kernel_helper_version 5) +{ + const char err[] = A newer kernel is required to run this binary. (__kernel_cmpxchg64 helper)\n; + /* At this point we need a way to crash with some information + for the user - I'm not sure I can rely on much else being + available at this point, so do the same as generic-morestack.c + write() and abort(). */ + write (2 /* stderr */, err, sizeof(err)); + abort (); +} +}; Wouldn't it be better to convert the arm kernel to use a proper VDSO, so that this error actually comes from the dynamic linker? That's the beauty of a true VDSO -- proper symbol resolution. r~
Re: [PATCH] [Annotalysis] Fixes virtual method calls when type is unknown
If you in some way rely on static types of those pointers, you may be in for a number of unpleasant surprises, because these types do not really have any meaning at all. Annotalysis is just a static analyzer, so if the types are somehow inaccurate (as they are in certain cases), then the only ill effect will be that the analysis issues a warning where it shouldn't, or fails to issue a warning where it should. Of course, no analysis that's based on types will ever be sound for C++, because C++ is not type-safe. We merely assume that the declared types represent the intentions of the programmer, and hope that the declared types are preserved in gimple with some degree of fidelity. Just out of curiosity, I does your analysis crash also on the testcase below (add the Mutex field if it is necessary)? No, the static type of b (in b-test()) remains Bar* in gimple, so the analysis works fine (i.e. the analyzer will not crash). Also, I have just very briefly glanced at the code in handle_call_gs but still it seems to me that the code dealing with virtual calls assumes virtual functions are never overridden...? It doesn't matter whether it's overridden or not, since any overriding versions are required to have the same type signature as the original. Annotalysis is merely an extended type-checker. And pretty please, use -p diff option when creating patches to be posted here. My apologies; will do. -DeLesley
Re: PATCH [2/n]: Prepare x32: Convert pointer to TLS symbol if needed
H.J. Lu hjl.to...@gmail.com writes: On Wed, Jun 29, 2011 at 7:06 AM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, Jun 29, 2011 at 1:45 AM, Richard Sandiford richard.sandif...@linaro.org wrote: H.J. Lu hongjiu...@intel.com writes: @@ -706,7 +706,13 @@ precompute_register_parameters (int num_actuals, struct arg_data *args, pseudo now. TLS symbols sometimes need a call to resolve. */ if (CONSTANT_P (args[i].value) !targetm.legitimate_constant_p (args[i].mode, args[i].value)) - args[i].value = force_reg (args[i].mode, args[i].value); + { + if (GET_MODE (args[i].value) != args[i].mode) + args[i].value = convert_to_mode (args[i].mode, + args[i].value, + args[i].unsignedp); + args[i].value = force_reg (args[i].mode, args[i].value); + } But if GET_MODE (args[i].value) != args[i].mode, then the call to targetm.legitimate_constant_p looks wrong. The mode passed in the first argument is supposed to the mode of the second argument. Is there any reason why this and the following: /* If we are to promote the function arg to a wider mode, do it now. */ if (args[i].mode != TYPE_MODE (TREE_TYPE (args[i].tree_value))) args[i].value = convert_modes (args[i].mode, TYPE_MODE (TREE_TYPE (args[i].tree_value)), args[i].value, args[i].unsignedp); need to be done in the current order? I can't think of any off-hand. If not, would swapping them also fix the bug? (I can't review this either way, of course.) It works on the testcase. I will do a full test. It works. There are no regressions on Linux/x86-64. Great! I can't approve it, but FWIW, it looks good to me. The new order seems to make more conceptual sense: coerce the value into the right mode, then coerce it into the right type of rtx. Richard
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 06/28/11 18:14, Andrew Stubbs wrote: unsigned long long foo (unsigned long long a, unsigned char b, unsigned char c) { return a + b * c; } This appears to be entirely unsigned maths with plenty of spare precision, and therefore a dead cert for any SI-DI multiply-and-accumulate instruction, but not so - it is represented internally as: signed int tmp = (signed int)a * (signed int)b; unsigned long long result = a + (unsigned long long)tmp; Notice the unexpected signed int in the middle! I need to be able to get past that to optimize this properly. Since both inputs are positive in a signed int (they must be, being cast from a smaller unsigned value), you can infer that it does not matter whether you treat the result of the multiplication as a signed or an unsigned value. It is positive in any case. So, I think the thing to test is: if the accumulate step requires widening the result of the multiplication, either the cast must be value preserving (widening unsigned to signed), or you must be able to prove that the multiplication produces a positive result. If the accumulate step just casts the multiplication result from signed to unsigned, keeping the precision the same, you can ignore the cast since the addition is unaffected by it. Bernd
RE: [PING] [PATCH, libstdc++-v3] Add newlib specific ctype_members.cc
Hi Paolo, Thank you for the review. Ping for: http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00440.html personally, I have only minor comments, like only 2011 as Copyright year for new files, and please also regtest on a gnu-linux system. I've updated the Copyright year and removed the regenerated file from the patch. The updated patch is attached. I've also run the regtest on x86_64-linux-gnu. Before going ahead, however, you should send the patch also to the libstdc++ mailing list (this is the rule for v3 patches) and make sure the other maintainers don't have further comments. Wait, say, at least a couple of days. Sorry for missing the rule previously. I sent the patch to the libstdc++ mailing list yesterday; see http://gcc.gnu.org/ml/libstdc++/2011-06/msg00091.html I've also added the mailing list to the CC list. I'll wait a few days to see if there is any more comment. Thanks, Yufengdiff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 index ed8b129..7f9231c 100644 --- a/libstdc++-v3/acinclude.m4 +++ b/libstdc++-v3/acinclude.m4 @@ -1753,7 +1753,7 @@ dnl AC_DEFUN([GLIBCXX_ENABLE_CLOCALE], [ GLIBCXX_ENABLE(clocale,auto,[[[=MODEL]]], [use MODEL for target locale package], -[permit generic|gnu|ieee_1003.1-2001|yes|no|auto]) +[permit generic|gnu|ieee_1003.1-2001|newlib|yes|no|auto]) # Deal with gettext issues. Default to not using it (=no) until we detect # support for it later. Let the user turn it off via --e/d, but let that @@ -1764,7 +1764,7 @@ AC_DEFUN([GLIBCXX_ENABLE_CLOCALE], [ [], [enable_nls=yes]) - # Either a known packaage, or auto + # Either a known package, or auto if test $enable_clocale = no || test $enable_clocale = yes; then enable_clocale=auto fi @@ -1781,7 +1781,11 @@ AC_DEFUN([GLIBCXX_ENABLE_CLOCALE], [ enable_clocale_flag=darwin ;; *) - enable_clocale_flag=generic + if test x$with_newlib = xyes; then + enable_clocale_flag=newlib + else + enable_clocale_flag=generic + fi ;; esac fi @@ -1915,6 +1919,22 @@ AC_DEFUN([GLIBCXX_ENABLE_CLOCALE], [ CTIME_CC=config/locale/generic/time_members.cc CLOCALE_INTERNAL_H=config/locale/generic/c++locale_internal.h ;; +newlib) + AC_MSG_RESULT(newlib) + + CLOCALE_H=config/locale/generic/c_locale.h + CLOCALE_CC=config/locale/generic/c_locale.cc + CCODECVT_CC=config/locale/generic/codecvt_members.cc + CCOLLATE_CC=config/locale/generic/collate_members.cc + CCTYPE_CC=config/locale/newlib/ctype_members.cc + CMESSAGES_H=config/locale/generic/messages_members.h + CMESSAGES_CC=config/locale/generic/messages_members.cc + CMONEY_CC=config/locale/generic/monetary_members.cc + CNUMERIC_CC=config/locale/generic/numeric_members.cc + CTIME_H=config/locale/generic/time_members.h + CTIME_CC=config/locale/generic/time_members.cc + CLOCALE_INTERNAL_H=config/locale/generic/c++locale_internal.h + ;; esac # This is where the testsuite looks for locale catalogs, using the diff --git a/libstdc++-v3/config/locale/newlib/ctype_members.cc b/libstdc++-v3/config/locale/newlib/ctype_members.cc new file mode 100644 index 000..ee91baf --- /dev/null +++ b/libstdc++-v3/config/locale/newlib/ctype_members.cc @@ -0,0 +1,280 @@ +// std::ctype implementation details, newlib version -*- C++ -*- + +// Copyright (C) 2011 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// Under Section 7 of GPL version 3, you are granted additional +// permissions described in the GCC Runtime Library Exception, version +// 3.1, as published by the Free Software Foundation. + +// You should have received a copy of the GNU General Public License and +// a copy of the GCC Runtime Library Exception along with this program; +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +// http://www.gnu.org/licenses/. + +// +// ISO C++ 14882: 22.2.1.1.2 ctype virtual functions. +// + +#include locale +#include cstdlib +#include cstring +#include cstdio + +namespace std _GLIBCXX_VISIBILITY(default) +{ +_GLIBCXX_BEGIN_NAMESPACE_VERSION + + // NB: The other ctypechar specializations are in src/locale.cc and + // various /config/os/* files. + ctype_bynamechar::ctype_byname(const char* __s, size_t __refs) + : ctypechar(0, false, __refs) + { +if (std::strcmp(__s, C) != 0 std::strcmp(__s, POSIX) != 0) + { +
Re: [gcc patch] Re: C++ member function template id not matching linkage name (PR debug/49408)
OK. Jason
Re: [Patch 2/3] ARM 64 bit atomic operations
On 1 July 2011 17:03, Richard Henderson r...@redhat.com wrote: On 07/01/2011 08:55 AM, Dr. David Alan Gilbert wrote: +/* Check that the kernel has a new enough version at load */ +void __check_for_sync8_kernelhelper (void) +{ + if (__kernel_helper_version 5) + { + const char err[] = A newer kernel is required to run this binary. (__kernel_cmpxchg64 helper)\n; + /* At this point we need a way to crash with some information + for the user - I'm not sure I can rely on much else being + available at this point, so do the same as generic-morestack.c + write() and abort(). */ + write (2 /* stderr */, err, sizeof(err)); + abort (); + } +}; Wouldn't it be better to convert the arm kernel to use a proper VDSO, so that this error actually comes from the dynamic linker? That's the beauty of a true VDSO -- proper symbol resolution. Well I can't say I like the need for this check/exit - so yes if something else could do it for me that would be great. Having said that, I don't know the history of the ARM commpage/lack of VDSO , neither do I know how big a change that would be. Let me have a chat to some of the kernel guys who might know the history. Dave
Re: [gcc patch] Re: C++ member function template id not matching linkage name (PR debug/49408)
On Fri, 01 Jul 2011 18:27:36 +0200, Jason Merrill wrote: OK. Checked in: http://gcc.gnu.org/viewcvs?view=revisionrevision=175761 No regressions on gcc-4.6.1-1.fc15.x86_64 (it is not trunk but hopefully similar enough). Thanks, Jan
[0/11] GET_MODE_PRECISION vs GET_MODE_BITSIZE
I'm working on a patch to support __int40_t for the C6X target. This will involve a new integer mode with bitsize 64, and precision 40. A lot of the existing code doesn't make a distinction between the two values, since at the moment they are identical for all integer modes (except BImode). This patch set tries to address that problem. Roughly speaking, these are the categories where we should use GET_MODE_SIZE/GET_MODE_BITSIZE: * computing subreg words * accessing memory For the following, we should use GET_MODE_PRECISION: * shift counts * sign bit positions, sign/zero-extending and all other arithmetic * testing for paradoxical subregs (or generally, whether we're extending or truncating) * testing TRULY_NOOP_TRUNCATION * testing whether a value can be represented in HOST_WIDE_INT Undoubtedly there are spots I've missed, but it doesn't all have to be fixed in one go. Existing targets should be unaffected by any of these changes, it only becomes important once a new fractional integer mode is added. Testing was done with all 11 patches applied, not for each of them individually. Bootstrapped and regression tested on i686-linux, all languages except Go. Regression tested on cris-elf. An earlier version, including support for int40_t, was tested in a 4.5 c6x-elf tree. I've built at least cc1 for the following compilers, and verified that generated code is identical on a large set of input files. i686-linux i686-linux x cris-elf i686-linux x ia64-hppa-hpux x86_64-linux x mips64-linux x86_64-linux x m68k-elf All of these tests, except the ia64-linux cross, were with a slightly earlier version that did not have the BImode special case in patch 11; the need for that was shown only with ia64 testing. Bernd
[1/11] Use targetm.shift_truncation_mask more consistently
At some point we've grown a shift_truncation_mask hook, but we're not using it everywhere we're masking shift counts. This patch changes the instances I found. Bernd * simplify-rtx.c (simplify_const_binary_operation): Use the shift_truncation_mask hook instead of performing modulo by width. Compare against mode precision, not bitsize. * combine.c (combine_simplify_rtx, simplify_shift_const_1): Use shift_truncation_mask instead of constructing the value manually. Index: gcc/simplify-rtx.c === --- gcc/simplify-rtx.c.orig +++ gcc/simplify-rtx.c @@ -3704,8 +3704,8 @@ simplify_const_binary_operation (enum rt shift_truncation_mask, since the shift might not be part of an ashlM3, lshrM3 or ashrM3 instruction. */ if (SHIFT_COUNT_TRUNCATED) - arg1 = (unsigned HOST_WIDE_INT) arg1 % width; - else if (arg1 0 || arg1 = GET_MODE_BITSIZE (mode)) + arg1 = targetm.shift_truncation_mask (mode); + else if (arg1 0 || arg1 = GET_MODE_PRECISION (mode)) return 0; val = (code == ASHIFT Index: gcc/combine.c === --- gcc/combine.c.orig +++ gcc/combine.c @@ -5941,9 +5941,7 @@ combine_simplify_rtx (rtx x, enum machin else if (SHIFT_COUNT_TRUNCATED !REG_P (XEXP (x, 1))) SUBST (XEXP (x, 1), force_to_mode (XEXP (x, 1), GET_MODE (XEXP (x, 1)), - ((unsigned HOST_WIDE_INT) 1 - exact_log2 (GET_MODE_BITSIZE (GET_MODE (x - - 1, + targetm.shift_truncation_mask (GET_MODE (x)), 0)); break; @@ -9896,7 +9894,7 @@ simplify_shift_const_1 (enum rtx_code co want to do this inside the loop as it makes it more difficult to combine shifts. */ if (SHIFT_COUNT_TRUNCATED) -orig_count = GET_MODE_BITSIZE (mode) - 1; +orig_count = targetm.shift_truncation_mask (mode); /* If we were given an invalid count, don't do anything except exactly what was requested. */
[2/11] Neater tests for signbits
We have a function mode_signbit_p, which tests whether a given rtx is equal to the sign bit of a given mode. This patch adds some similar helper functions and uses them to simplify tests. Also, in some instances of zero- and sign-extending, I've changed some bit shifting to uses of GET_MODE_MASK. Bernd * cse.c (find_comparison_args): Use val_mode_signbit_set_p. * simplify-rtx.c (mode_signbit_p): Use GET_MODE_PRECISION. (val_mode_signbit_p, val_mode_signbit_set_p): New functions. (simplify_const_unary_operation, simplify_binary_operation_1, simplify_const_binary_operation, simplify_const_relational_operation): Use them. Use GET_MODE_MASK for masking and sign-extensions. * combine.c (set_nonzero_bits_and_sign_copies, simplify_set, combine_simplify_rtx, force_to_mode, reg_nonzero_bits_for_combine, simplify_shift_const_1, simplify_comparison): Likewise. * expr.c (convert_modes): Likewise. * rtlanal.c (nonzero_bits1, canonicalize_condition): Likewise. * expmed.c (emit_cstore, emit_store_flag_1, emit_store_flag): Likewise. * rtl.h (val_mode_signbit_p, val_mode_signbit_set_p): Declare. Index: gcc/cse.c === --- gcc/cse.c.orig +++ gcc/cse.c @@ -3063,12 +3063,8 @@ find_comparison_args (enum rtx_code code for STORE_FLAG_VALUE, also look at LT and GE operations. */ || ((code == NE || (code == LT - GET_MODE_CLASS (inner_mode) == MODE_INT - (GET_MODE_BITSIZE (inner_mode) - = HOST_BITS_PER_WIDE_INT) - (STORE_FLAG_VALUE - ((HOST_WIDE_INT) 1 - (GET_MODE_BITSIZE (inner_mode) - 1 + val_signbit_known_set_p (inner_mode, + STORE_FLAG_VALUE)) #ifdef FLOAT_STORE_FLAG_VALUE || (code == LT SCALAR_FLOAT_MODE_P (inner_mode) @@ -3083,12 +3079,8 @@ find_comparison_args (enum rtx_code code } else if ((code == EQ || (code == GE -GET_MODE_CLASS (inner_mode) == MODE_INT -(GET_MODE_BITSIZE (inner_mode) - = HOST_BITS_PER_WIDE_INT) -(STORE_FLAG_VALUE -((HOST_WIDE_INT) 1 - (GET_MODE_BITSIZE (inner_mode) - 1 +val_signbit_known_set_p (inner_mode, + STORE_FLAG_VALUE)) #ifdef FLOAT_STORE_FLAG_VALUE || (code == GE SCALAR_FLOAT_MODE_P (inner_mode) Index: gcc/simplify-rtx.c === --- gcc/simplify-rtx.c.orig +++ gcc/simplify-rtx.c @@ -82,7 +82,7 @@ mode_signbit_p (enum machine_mode mode, if (GET_MODE_CLASS (mode) != MODE_INT) return false; - width = GET_MODE_BITSIZE (mode); + width = GET_MODE_PRECISION (mode); if (width == 0) return false; @@ -103,6 +103,62 @@ mode_signbit_p (enum machine_mode mode, val = ((unsigned HOST_WIDE_INT) 1 width) - 1; return val == ((unsigned HOST_WIDE_INT) 1 (width - 1)); } + +/* Test whether VAL is equal to the most significant bit of mode MODE + (after masking with the mode mask of MODE). Returns false if the + precision of MODE is too large to handle. */ + +bool +val_signbit_p (enum machine_mode mode, unsigned HOST_WIDE_INT val) +{ + unsigned int width; + + if (GET_MODE_CLASS (mode) != MODE_INT) +return false; + + width = GET_MODE_PRECISION (mode); + if (width == 0 || width HOST_BITS_PER_WIDE_INT) +return false; + + val = GET_MODE_MASK (mode); + return val == ((unsigned HOST_WIDE_INT) 1 (width - 1)); +} + +/* Test whether the most significant bit of mode MODE is set in VAL. + Returns false if the precision of MODE is too large to handle. */ +bool +val_signbit_known_set_p (enum machine_mode mode, unsigned HOST_WIDE_INT val) +{ + unsigned int width; + + if (GET_MODE_CLASS (mode) != MODE_INT) +return false; + + width = GET_MODE_PRECISION (mode); + if (width == 0 || width HOST_BITS_PER_WIDE_INT) +return false; + + val = (unsigned HOST_WIDE_INT) 1 (width - 1); + return val != 0; +} + +/* Test whether the most significant bit of mode MODE is clear in VAL. + Returns false if the precision of MODE is too large to handle. */ +bool +val_signbit_known_clear_p (enum machine_mode mode, unsigned HOST_WIDE_INT val) +{ + unsigned int width; + + if (GET_MODE_CLASS (mode) != MODE_INT) +return false; + + width = GET_MODE_PRECISION (mode); + if (width == 0 || width HOST_BITS_PER_WIDE_INT) +return false; + + val = (unsigned HOST_WIDE_INT) 1 (width - 1); + return val == 0; +}
[6/11] Tests for HOST_WIDE_INT representability
A lot of code tests GET_MODE_BITSIZE (mode) = HOST_BITS_PER_WIDE_INT to determine whether it can operate on values in the mode using HOST_WIDE_INT. This patch hides that behind a new macro, which now uses GET_MODE_PRECISION. Bernd * machmode.h (HWI_COMPUTABLE_MODE_P): New macro. * combine.c (set_nonzero_bits_and_sign_copies): Use it. (find_split-point, combine_simplify_rtx, simplify_if_then_else, simplify_set, simplify_logical, expand_compound_operation, make_extraction, force_to_mode, if_then_else_cond, extended_count, try_widen_shift_mode, simplify_shift_const_1, simplify_comparison, record_value_for_reg): Likewise. * expmed.c (expand_widening_mult, expand_mult_highpart): Likewise. * simplify-rtx. c (simplify_unary_operation_1, simplify_binary_operation_1, simplify_const_relational_operation): Likewise. Index: baseline-trunk/gcc/combine.c === --- baseline-trunk.orig/gcc/combine.c +++ baseline-trunk/gcc/combine.c @@ -1560,7 +1560,7 @@ set_nonzero_bits_and_sign_copies (rtx x, say what its contents were. */ ! REGNO_REG_SET_P (DF_LR_IN (ENTRY_BLOCK_PTR-next_bb), REGNO (x)) - GET_MODE_BITSIZE (GET_MODE (x)) = HOST_BITS_PER_WIDE_INT) + HWI_COMPUTABLE_MODE_P (GET_MODE (x))) { reg_stat_type *rsp = VEC_index (reg_stat_type, reg_stat, REGNO (x)); @@ -4679,8 +4679,7 @@ find_split_point (rtx *loc, rtx insn, bo /* See if this is a bitfield assignment with everything constant. If so, this is an IOR of an AND, so split it into that. */ if (GET_CODE (SET_DEST (x)) == ZERO_EXTRACT - (GET_MODE_BITSIZE (GET_MODE (XEXP (SET_DEST (x), 0))) - = HOST_BITS_PER_WIDE_INT) + HWI_COMPUTABLE_MODE_P (GET_MODE (XEXP (SET_DEST (x), 0))) CONST_INT_P (XEXP (SET_DEST (x), 1)) CONST_INT_P (XEXP (SET_DEST (x), 2)) CONST_INT_P (SET_SRC (x)) @@ -5584,7 +5583,7 @@ combine_simplify_rtx (rtx x, enum machin if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT) break; - if (GET_MODE_BITSIZE (mode) = HOST_BITS_PER_WIDE_INT) + if (HWI_COMPUTABLE_MODE_P (mode)) SUBST (XEXP (x, 0), force_to_mode (XEXP (x, 0), GET_MODE (XEXP (x, 0)), GET_MODE_MASK (mode), 0)); @@ -5596,7 +5595,7 @@ combine_simplify_rtx (rtx x, enum machin /* Similarly to what we do in simplify-rtx.c, a truncate of a register whose value is a comparison can be replaced with a subreg if STORE_FLAG_VALUE permits. */ - if (GET_MODE_BITSIZE (mode) = HOST_BITS_PER_WIDE_INT + if (HWI_COMPUTABLE_MODE_P (mode) (STORE_FLAG_VALUE ~GET_MODE_MASK (mode)) == 0 (temp = get_last_value (XEXP (x, 0))) COMPARISON_P (temp)) @@ -5634,7 +5633,7 @@ combine_simplify_rtx (rtx x, enum machin INTVAL (XEXP (x, 1)) == -INTVAL (XEXP (XEXP (x, 0), 1)) ((i = exact_log2 (UINTVAL (XEXP (XEXP (x, 0), 1 = 0 || (i = exact_log2 (UINTVAL (XEXP (x, 1 = 0) - GET_MODE_BITSIZE (mode) = HOST_BITS_PER_WIDE_INT + HWI_COMPUTABLE_MODE_P (mode) ((GET_CODE (XEXP (XEXP (x, 0), 0)) == AND CONST_INT_P (XEXP (XEXP (XEXP (x, 0), 0), 1)) (UINTVAL (XEXP (XEXP (XEXP (x, 0), 0), 1)) @@ -5669,7 +5668,7 @@ combine_simplify_rtx (rtx x, enum machin for example in cases like ((a 1) + (a 2)), which can become a 3. */ - if (GET_MODE_BITSIZE (mode) = HOST_BITS_PER_WIDE_INT + if (HWI_COMPUTABLE_MODE_P (mode) (nonzero_bits (XEXP (x, 0), mode) nonzero_bits (XEXP (x, 1), mode)) == 0) { @@ -5875,7 +5874,7 @@ combine_simplify_rtx (rtx x, enum machin AND with STORE_FLAG_VALUE when we are done, since we are only going to test the sign bit. */ if (new_code == NE GET_MODE_CLASS (mode) == MODE_INT - GET_MODE_BITSIZE (mode) = HOST_BITS_PER_WIDE_INT + HWI_COMPUTABLE_MODE_P (mode) val_signbit_p (mode, STORE_FLAG_VALUE) op1 == const0_rtx mode == GET_MODE (op0) @@ -6209,7 +6208,7 @@ simplify_if_then_else (rtx x) || GET_CODE (XEXP (t, 0)) == LSHIFTRT || GET_CODE (XEXP (t, 0)) == ASHIFTRT) GET_CODE (XEXP (XEXP (t, 0), 0)) == SUBREG - GET_MODE_BITSIZE (mode) = HOST_BITS_PER_WIDE_INT + HWI_COMPUTABLE_MODE_P (mode) subreg_lowpart_p (XEXP (XEXP (t, 0), 0)) rtx_equal_p (SUBREG_REG (XEXP (XEXP (t, 0), 0)), f) ((nonzero_bits (f, GET_MODE (f)) @@ -6225,7 +6224,7 @@ simplify_if_then_else (rtx x) || GET_CODE (XEXP (t, 0)) == IOR || GET_CODE (XEXP (t, 0)) == XOR) GET_CODE (XEXP (XEXP (t,
[8/11] Expander changes
This replaces remaining uses of GET_MODE_BITSIZE with GET_MODE_PRECISION where doing so seems relatively obviously correct. The patch is intended to cover the expander. Bernd * optabs.c (expand_binop): Use GET_MODE_PRECISION instead of GET_MODE_BITSIZE where appropriate. (widen_leading, expand_parity, expand_ctz, expand_ffs, expand_unop, expand_abs_nojump, expand_one_cmpl_abs_nojump, expand_float, expand_fix): Likewise. * expr.c (convert_move, convert_modes, expand_expr_real_2, expand_expr_real_1, reduce_to_bit_field_precision): Likewise. * stor-layout.c (get_mode_bounds): Likewise. * cfgexpand.c (convert_debug_memory_address, expand_debug_expr): Likewise. * convert.c (convert_to_integer): Likewise. * expmed.c (expand_shift_1): Likewise. Index: gcc/optabs.c === --- gcc/optabs.c.orig +++ gcc/optabs.c @@ -1407,7 +1407,7 @@ expand_binop (enum machine_mode mode, op { optab otheroptab = (binoptab == rotl_optab ? rotr_optab : rotl_optab); rtx newop1; - unsigned int bits = GET_MODE_BITSIZE (mode); + unsigned int bits = GET_MODE_PRECISION (mode); if (CONST_INT_P (op1)) newop1 = GEN_INT (bits - INTVAL (op1)); @@ -2353,8 +2353,8 @@ widen_leading (enum machine_mode mode, r unoptab != clrsb_optab); if (temp != 0) temp = expand_binop (wider_mode, sub_optab, temp, -GEN_INT (GET_MODE_BITSIZE (wider_mode) - - GET_MODE_BITSIZE (mode)), +GEN_INT (GET_MODE_PRECISION (wider_mode) + - GET_MODE_PRECISION (mode)), target, true, OPTAB_DIRECT); if (temp == 0) delete_insns_since (last); @@ -2540,7 +2540,7 @@ expand_parity (enum machine_mode mode, r } /* Try calculating ctz(x) as K - clz(x -x) , - where K is GET_MODE_BITSIZE(mode) - 1. + where K is GET_MODE_PRECISION(mode) - 1. Both __builtin_ctz and __builtin_clz are undefined at zero, so we don't have to worry about what the hardware does in that case. (If @@ -2568,7 +2568,7 @@ expand_ctz (enum machine_mode mode, rtx if (temp) temp = expand_unop_direct (mode, clz_optab, temp, NULL_RTX, true); if (temp) -temp = expand_binop (mode, sub_optab, GEN_INT (GET_MODE_BITSIZE (mode) - 1), +temp = expand_binop (mode, sub_optab, GEN_INT (GET_MODE_PRECISION (mode) - 1), temp, target, true, OPTAB_DIRECT); if (temp == 0) @@ -2619,7 +2619,7 @@ expand_ffs (enum machine_mode mode, rtx if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2) { defined_at_zero = true; - val = (GET_MODE_BITSIZE (mode) - 1) - val; + val = (GET_MODE_PRECISION (mode) - 1) - val; } } else @@ -3077,8 +3077,8 @@ expand_unop (enum machine_mode mode, opt if ((unoptab == clz_optab || unoptab == clrsb_optab) temp != 0) temp = expand_binop (wider_mode, sub_optab, temp, -GEN_INT (GET_MODE_BITSIZE (wider_mode) - - GET_MODE_BITSIZE (mode)), +GEN_INT (GET_MODE_PRECISION (wider_mode) + - GET_MODE_PRECISION (mode)), target, true, OPTAB_DIRECT); if (temp) @@ -3173,7 +3173,7 @@ expand_abs_nojump (enum machine_mode mod false) = 2) { rtx extended = expand_shift (RSHIFT_EXPR, mode, op0, - GET_MODE_BITSIZE (mode) - 1, + GET_MODE_PRECISION (mode) - 1, NULL_RTX, 0); temp = expand_binop (mode, xor_optab, extended, op0, target, 0, @@ -3274,7 +3274,7 @@ expand_one_cmpl_abs_nojump (enum machine false) = 2) { rtx extended = expand_shift (RSHIFT_EXPR, mode, op0, - GET_MODE_BITSIZE (mode) - 1, + GET_MODE_PRECISION (mode) - 1, NULL_RTX, 0); temp = expand_binop (mode, xor_optab, extended, op0, target, 0, @@ -4663,7 +4663,7 @@ expand_float (rtx to, rtx from, int unsi int doing_unsigned = unsignedp; if (fmode != GET_MODE (to) -significand_size (fmode) GET_MODE_BITSIZE (GET_MODE (from))) +significand_size (fmode) GET_MODE_PRECISION (GET_MODE (from))) continue; icode = can_float_p (fmode, imode, unsignedp); @@ -4707,7 +4707,7 @@ expand_float (rtx to, rtx from, int unsi for (fmode = GET_MODE (to); fmode !=
[9/11] Fix units mismatch in comparison
A bug fix discovered while working on the other patches. Previously, this was a comparison of a GET_MODE_BITSIZE vs a GET_MODE_SIZE value. After the other patches, it's GET_MODE_PRECISION vs GET_MODE_SIZE, which is just as wrong, so change it. Bernd * rtlanal.c (nonzero_bits1): Don't compare GET_MODE_SIZE against a bitsize. Index: baseline-trunk/gcc/rtlanal.c === --- baseline-trunk.orig/gcc/rtlanal.c +++ baseline-trunk/gcc/rtlanal.c @@ -3993,7 +3993,7 @@ nonzero_bits1 (const_rtx x, enum machine nonzero = 1; #endif - if (GET_MODE_SIZE (GET_MODE (x)) mode_width) + if (GET_MODE_PRECISION (GET_MODE (x)) mode_width) nonzero |= (GET_MODE_MASK (mode) ~GET_MODE_MASK (GET_MODE (x))); break;
[10/11] Expander fixes for 40-bit integers
This fixes a few random problems that occur when you add a new fractional integer mode - for example, trying to expand doubleword shifts normally for them, or trying to generate 40-64 bit widening multiply. In some cases where it seems we can only deal with modes where precision == bitsisze, I've added asserts. Bernd * optabs.c (expand_binop): Tighten conditions for doubleword expansions. (widen_bswap): Assert that mode bitsize and precision are the same. * stor-layout.c (get_best_mode): Skip modes that have lower precision than bitsize. * recog.c (simplify_while_replacing): Assert that bitsize and precision are the same. Index: gcc/optabs.c === --- gcc/optabs.c.orig +++ gcc/optabs.c @@ -1428,12 +1428,12 @@ expand_binop (enum machine_mode mode, op takes operands of this mode and makes a wider mode. */ if (binoptab == smul_optab - GET_MODE_WIDER_MODE (mode) != VOIDmode + GET_MODE_2XWIDER_MODE (mode) != VOIDmode (optab_handler ((unsignedp ? umul_widen_optab : smul_widen_optab), -GET_MODE_WIDER_MODE (mode)) +GET_MODE_2XWIDER_MODE (mode)) != CODE_FOR_nothing)) { - temp = expand_binop (GET_MODE_WIDER_MODE (mode), + temp = expand_binop (GET_MODE_2XWIDER_MODE (mode), unsignedp ? umul_widen_optab : smul_widen_optab, op0, op1, NULL_RTX, unsignedp, OPTAB_DIRECT); @@ -1575,6 +1575,7 @@ expand_binop (enum machine_mode mode, op mclass == MODE_INT (CONST_INT_P (op1) || optimize_insn_for_speed_p ()) GET_MODE_SIZE (mode) == 2 * UNITS_PER_WORD + GET_MODE_PRECISION (mode) == GET_MODE_BITSIZE (mode) optab_handler (binoptab, word_mode) != CODE_FOR_nothing optab_handler (ashl_optab, word_mode) != CODE_FOR_nothing optab_handler (lshr_optab, word_mode) != CODE_FOR_nothing) @@ -1647,7 +1648,7 @@ expand_binop (enum machine_mode mode, op if ((binoptab == rotl_optab || binoptab == rotr_optab) mclass == MODE_INT CONST_INT_P (op1) - GET_MODE_SIZE (mode) == 2 * UNITS_PER_WORD + GET_MODE_PRECISION (mode) == 2 * BITS_PER_WORD optab_handler (ashl_optab, word_mode) != CODE_FOR_nothing optab_handler (lshr_optab, word_mode) != CODE_FOR_nothing) { @@ -2463,6 +2464,8 @@ widen_bswap (enum machine_mode mode, rtx x = widen_operand (op0, wider_mode, mode, true, true); x = expand_unop (wider_mode, bswap_optab, x, NULL_RTX, true); + gcc_assert (GET_MODE_PRECISION (wider_mode) == GET_MODE_BITSIZE (wider_mode) + GET_MODE_PRECISION (mode) == GET_MODE_BITSIZE (mode)); if (x != 0) x = expand_shift (RSHIFT_EXPR, wider_mode, x, GET_MODE_BITSIZE (wider_mode) Index: gcc/stor-layout.c === --- gcc/stor-layout.c.orig +++ gcc/stor-layout.c @@ -2389,7 +2389,8 @@ get_best_mode (int bitsize, int bitpos, mode = GET_MODE_WIDER_MODE (mode)) { unit = GET_MODE_BITSIZE (mode); - if ((bitpos % unit) + bitsize = unit) + if (unit == GET_MODE_PRECISION (mode) + (bitpos % unit) + bitsize = unit) break; } @@ -2414,7 +2415,8 @@ get_best_mode (int bitsize, int bitpos, tmode = GET_MODE_WIDER_MODE (tmode)) { unit = GET_MODE_BITSIZE (tmode); - if (bitpos / unit == (bitpos + bitsize - 1) / unit + if (unit == GET_MODE_PRECISION (tmode) + bitpos / unit == (bitpos + bitsize - 1) / unit unit = BITS_PER_WORD unit = MIN (align, BIGGEST_ALIGNMENT) (largest_mode == VOIDmode Index: gcc/recog.c === --- gcc/recog.c.orig +++ gcc/recog.c @@ -638,6 +638,8 @@ simplify_while_replacing (rtx *loc, rtx (GET_MODE_SIZE (is_mode) - GET_MODE_SIZE (wanted_mode) - offset); + gcc_assert (GET_MODE_PRECISION (wanted_mode) + == GET_MODE_BITSIZE (wanted_mode)); pos %= GET_MODE_BITSIZE (wanted_mode); newmem = adjust_address_nv (XEXP (x, 0), wanted_mode, offset);
[11/11] Fix get_mode_bounds
get_mode_bounds should also use GET_MODE_PRECISION, but this exposes a problem on ia64 - BImode needs to be handled specially here to work around another preexisting special case in gen_int_mode. Bernd * stor-layout.c (get_mode_bounds): Use GET_MODE_PRECISION. Special case BImode. Index: gcc/stor-layout.c === --- gcc/stor-layout.c.orig +++ gcc/stor-layout.c @@ -2439,11 +2439,26 @@ get_mode_bounds (enum machine_mode mode, enum machine_mode target_mode, rtx *mmin, rtx *mmax) { - unsigned size = GET_MODE_BITSIZE (mode); + unsigned size = GET_MODE_PRECISION (mode); unsigned HOST_WIDE_INT min_val, max_val; gcc_assert (size = HOST_BITS_PER_WIDE_INT); + /* gen_int_mode performs an unwanted canonicalization for BImode. */ + if (mode == BImode) +{ + if (sign) + { + *mmin = constm1_rtx; + *mmax = const0_rtx; + } + else + { + *mmin = const0_rtx; + *mmax = const1_rtx; + } + return; +} if (sign) { min_val = -((unsigned HOST_WIDE_INT) 1 (size - 1));
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
Looks OK to me. Thanks, DJ. I've just checked the patch in on the GCC side. I will push it on the src/GDB CVS momentarily. -- Joel
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 01/07/11 16:54, Paolo Bonzini wrote: On 07/01/2011 04:55 PM, Stubbs, Andrew wrote: What about (u128)c + (u64)((s8)a * (s8)b)? You cannot convert this to (u128)c + (u128)((s8)a * (s8)b). Oh I see, sorry. Yes, that's exactly what I'm trying to do here. No, wait, I don't see. Where are these multiple widening multiplies you're talking about? I only see one multiply? I meant one multiplication with multiple widening steps. Not clear at all, sorry. Yes, I see now, the whole purpose of my patch set is widening by more than one mode. The case of the multiply-and-accumulate is the only way there can be more than one step though. Widening multiplies themselves are always handled as one unit. Andrew
Re: [patch, fortran] Always return malloc(1) for empty arrays in the library
Am 01.07.2011 14:34, schrieb Janne Blomqvist: On Thu, Jun 30, 2011 at 21:09, Thomas Koenigtkoe...@netcologne.de wrote: Good point. I have done so in the attached patch Seems you forgot to attach it.. ;) Oops, I hadn't realized your crystal ball was broken :-) Is this better? Thomas 2011-06-30 Thomas Koenig tkoe...@gcc.gnu.org * runtime/memory.c (internal_malloc_size): If size is zero or less, allocate a single byte. * m4/pack.m4 (pack_'rtype_code`): Don't check for zero size for the argument of internal_malloc_size. * m4/spread.m4 (spread_'rtype_code`): Likewise. * m4/eoshift1.m4 (eoshift1): Don't allocate twice. Don't check for zero size for the argument of internal_malloc_size. * m4/eoshift3.m4: Don't check for zero size for the argument of internal_malloc_size. * intrinsics/pack_generic.c (pack_internal): Likewise. (pack_s_internal): Likewise. * intrinsics/cshift0.c (cshift0): Likewise. * intrinsics/spread_generic.c (spread_internal): Likewise. * intrinsics/eoshift0.c (eoshift0): Likewise. * intrinsics/eoshift2.c (eoshift2): Likewise. * generated/eoshift1_16.c: Regenerated. * generated/eoshift1_4.c: Regenerated. * generated/eoshift1_8.c: Regenerated. * generated/eoshift3_16.c: Regenerated. * generated/eoshift3_4.c: Regenerated. * generated/eoshift3_8.c: Regenerated. * generated/pack_c10.c: Regenerated. * generated/pack_c16.c: Regenerated. * generated/pack_c4.c: Regenerated. * generated/pack_c8.c: Regenerated. * generated/pack_i16.c: Regenerated. * generated/pack_i1.c: Regenerated. * generated/pack_i2.c: Regenerated. * generated/pack_i4.c: Regenerated. * generated/pack_i8.c: Regenerated. * generated/pack_r10.c: Regenerated. * generated/pack_r16.c: Regenerated. * generated/pack_r4.c: Regenerated. * generated/pack_r8.c: Regenerated. * generated/spread_c10.c: Regenerated. * generated/spread_c16.c: Regenerated. * generated/spread_c4.c: Regenerated. * generated/spread_c8.c: Regenerated. * generated/spread_i16.c: Regenerated. * generated/spread_i1.c: Regenerated. * generated/spread_i2.c: Regenerated. * generated/spread_i4.c: Regenerated. * generated/spread_i8.c: Regenerated. * generated/spread_r10.c: Regenerated. * generated/spread_r16.c: Regenerated. * generated/spread_r4.c: Regenerated. * generated/spread_r8.c: Regenerated. Index: runtime/memory.c === --- runtime/memory.c (Revision 175598) +++ runtime/memory.c (Arbeitskopie) @@ -54,8 +54,8 @@ get_mem (size_t n) void * internal_malloc_size (size_t size) { - if (size == 0) -return NULL; + if (unlikely (size = 0)) +size = 1; return get_mem (size); } Index: m4/pack.m4 === --- m4/pack.m4 (Revision 175598) +++ m4/pack.m4 (Arbeitskopie) @@ -167,14 +167,12 @@ pack_'rtype_code` ('rtype` *ret, const 'rtype` *ar GFC_DIMENSION_SET(ret-dim[0], 0, total-1, 1); ret-offset = 0; + + /* internal_malloc_size allocates a single byte for zero size. */ + ret-data = internal_malloc_size (sizeof ('rtype_name`) * total); + if (total == 0) - { - /* In this case, nothing remains to be done. */ - ret-data = internal_malloc_size (1); - return; - } - else - ret-data = internal_malloc_size (sizeof ('rtype_name`) * total); + return; } else { Index: m4/spread.m4 === --- m4/spread.m4 (Revision 175598) +++ m4/spread.m4 (Arbeitskopie) @@ -101,13 +101,11 @@ spread_'rtype_code` ('rtype` *ret, const 'rtype` * GFC_DIMENSION_SET(ret-dim[n], 0, ub, stride); } ret-offset = 0; - if (rs 0) -ret-data = internal_malloc_size (rs * sizeof('rtype_name`)); - else - { - ret-data = internal_malloc_size (1); - return; - } + + /* internal_malloc_size allocates a single byte for zero size. */ + ret-data = internal_malloc_size (rs * sizeof('rtype_name`)); + if (rs = 0) +return; } else { Index: m4/eoshift1.m4 === --- m4/eoshift1.m4 (Revision 175598) +++ m4/eoshift1.m4 (Arbeitskopie) @@ -89,7 +89,6 @@ eoshift1 (gfc_array_char * const restrict ret, { int i; - ret-data = internal_malloc_size (size * arraysize); ret-offset = 0; ret-dtype = array-dtype; for (i = 0; i GFC_DESCRIPTOR_RANK (array); i++) @@ -107,10 +106,8 @@ eoshift1 (gfc_array_char * const restrict ret, GFC_DIMENSION_SET(ret-dim[i], 0, ub, str); } - if (arraysize 0) - ret-data =
Re: [PATCH] [Annotalysis] Fixes virtual method calls when type is unknown
Hi, On Fri, Jul 01, 2011 at 09:31:30AM -0700, Delesley Hutchins wrote: Just out of curiosity, I does your analysis crash also on the testcase below (add the Mutex field if it is necessary)? No, the static type of b (in b-test()) remains Bar* in gimple, so the analysis works fine (i.e. the analyzer will not crash). Sorry, I've had a look only just now and you do the analysis before early inlining and as long as you do that, the type is really Bar and you are fine as far as far as placement new is concerned. However, on the second thought, I still think you need to handle the case when BINFO_VIRTUALS are NULL in cp_get_virtual_function_decl (BTW, there is a non-langhook variant in gimple-fold.c called gimple_get_virt_method_for_binfo) because of the case below. Martin // { dg-do compile } // { dg-options -Wthread-safety -O } //#include thread_annot_common.h class Anc { public: int a; }; class Foo : public Anc { public: // Mutex m; Foo(); virtual ~Foo(); virtual void doSomething() = 0;// EXCLUSIVE_LOCKS_REQUIRED(m) = 0; }; class FooDerived : public Foo { public: FooDerived(); virtual ~FooDerived(); virtual void doSomething(); }; // reinterpret_cast void foo1(Anc* ptr) { reinterpret_castFoo*(ptr)-doSomething(); } // C-style cast void foo2(Anc* buf) { ((Foo*) buf)-doSomething(); }
Re: RFC: [GUPC] UPC-related front-end changes
On Fri, 1 Jul 2011, Gary Funck wrote: * Most of the #ifdef conditionals have been removed. Some target macros have been defined and documented in tm.texi. We still have some questions For each target macro, what is the justification requiring it to be a macro rather than a hook documented in target.def with just an @hook line in tm.texi.in? The justification should be one of: * Macro is used in a case label, defining the values of an enum or some other such context that requires a constant value and the code would be substantially obfuscated by changing it to avoid this. (Normally if statements are preferred to #if conditionals. Simple use in #if is not sufficient justification for a target macro, you need to explain why it is necessary to use a preprocessor conditional - for example, if the #if is there to control values in an enum.) * Macro is very closely related to an existing set of macros, such that it does not make sense for some members of the set to be hooks while others are macros. * Benchmarking with provided figures has shown significant performance regressions from use of a hook. I looked at the first of the documented macros, and it doesn't seem to be a target macro at all, being defined by configure rather than in tm.h. Configure-defined macros don't go in tm.texi. But any that are really target macros need such a justification, and it's still preferable to define configure-defined macros to 1 or 0 (rather than defined or undefined) and test them in if statements not #if. -- Joseph S. Myers jos...@codesourcery.com
Re: [Patch 2/3] ARM 64 bit atomic operations
On Fri, 1 Jul 2011, Dr. David Alan Gilbert wrote: +/* For write */ +#include unistd.h +/* For abort */ +#include stdlib.h Please don't include system headers in libgcc without appropriate inhibit_libc checks for bootstrap purposes. In this case, it would seem better just to declare the functions you need. +/* Check that the kernel has a new enough version at load */ +void __check_for_sync8_kernelhelper (void) Shouldn't this function be static? +{ + if (__kernel_helper_version 5) +{ + const char err[] = A newer kernel is required to run this binary. (__kernel_cmpxchg64 helper)\n; + /* At this point we need a way to crash with some information + for the user - I'm not sure I can rely on much else being + available at this point, so do the same as generic-morestack.c + write() and abort(). */ + write (2 /* stderr */, err, sizeof(err)); write is in the user's namespace in ISO C so it's not ideal to have a call to it. If there isn't a reserved-namespace version, using the syscall directly (hardcoding the syscall number) might be better. +void (*__sync8_kernelhelper_inithook[]) (void) __attribute__ ((section (.init_array))) = { + __check_for_sync8_kernelhelper +}; Shouldn't this also be static (marked used if needed)? Though I'd have thought simply marking the function as a constructor would be simpler and better -- Joseph S. Myers jos...@codesourcery.com
Re: RFC: 40 bit integer support
On Fri, 1 Jul 2011, Bernd Schmidt wrote: * Should we add an __int40_t keyword, or just do a pushdecl for it? The patch currently does the latter to match __int128_t, but decimal float and fixed-point support uses keywords. This may make a difference for (existing) code using unsigned __int40_t. My advice is neither, initially; leave it to the user to define typedefs using the mode (possibly providing a header that the user can include with -include). __int128_t is legacy and undocumented, with the __int128 keyword (from the x86_64 ABI) being preferred. I also don't recommend adding any types to stdint.h (so eliminating a lot more of the complexity); there is no requirement to have those types there, and in any case libc's stdint.h is used in most cases instead of GCC's. * The __INT40_C macros use a cast, since using a new suffix is likely to conflict with something defined by a future standard. This means they are not usable in CPP tests. The TI compiler defines them in the same way, however, and mentions DR209. The link they give is defunct: http://wwwold.dkuug.dk/JTC1/SC22/WG14/www/docs/dr_209.htm Don't try to understand anything about stdint.h with reference to C99 versions before TC3; there were several defects fixed in the TCs. DR#209 was fixed by correcting the definitions of the types produced from certain macros. Yes, you need a way of producing something usable in #if to define such a macro. This is evidence that the type does not belong in stdint.h -- Joseph S. Myers jos...@codesourcery.com
Re: RFC: 40 bit integer support
I should add: make the type, the new mode, the testcases etc. entirely target-specific; target-independent GCC should not need to know or care about the specifics of this type. It's bad enough target-independent GCC knowing about HImode, SImode, DImode and TImode outside default target hook implementations for targets that use those modes. And is there anything wrong with the existing PDImode name? (It's not wrong to provide a PDImode-support file for libgcc in libgcc/config/, say, to be used by any targets that want to use such a type, but I think it's best kept separate from the main libgcc2.c and libgcc2.h. And you won't get the PDImode changes into upstream glibc's soft-fp.h, which dictates defining things in sfp-machine.h just as is done for TImode types - the TImode soft-fp files are local to GCC.) -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] [Annotalysis] Fixes virtual method calls when type is unknown
This is indeed a problem. Good catch; thanks! -DeLesley On Fri, Jul 1, 2011 at 11:50 AM, Martin Jambor mjam...@suse.cz wrote: Hi, On Fri, Jul 01, 2011 at 09:31:30AM -0700, Delesley Hutchins wrote: Just out of curiosity, I does your analysis crash also on the testcase below (add the Mutex field if it is necessary)? No, the static type of b (in b-test()) remains Bar* in gimple, so the analysis works fine (i.e. the analyzer will not crash). Sorry, I've had a look only just now and you do the analysis before early inlining and as long as you do that, the type is really Bar and you are fine as far as far as placement new is concerned. However, on the second thought, I still think you need to handle the case when BINFO_VIRTUALS are NULL in cp_get_virtual_function_decl (BTW, there is a non-langhook variant in gimple-fold.c called gimple_get_virt_method_for_binfo) because of the case below. Martin // { dg-do compile } // { dg-options -Wthread-safety -O } //#include thread_annot_common.h class Anc { public: int a; }; class Foo : public Anc { public: // Mutex m; Foo(); virtual ~Foo(); virtual void doSomething() = 0;// EXCLUSIVE_LOCKS_REQUIRED(m) = 0; }; class FooDerived : public Foo { public: FooDerived(); virtual ~FooDerived(); virtual void doSomething(); }; // reinterpret_cast void foo1(Anc* ptr) { reinterpret_castFoo*(ptr)-doSomething(); } // C-style cast void foo2(Anc* buf) { ((Foo*) buf)-doSomething(); } -- DeLesley Hutchins | Software Engineer | deles...@google.com | 505-206-0315
Re: [patch] fix c++/49605: -Wdelete-non-virtual-dtor is not picky enough
This patch is OK, I think I have a slight preference for the warning where it is. Jason
Re: RFC: 40 bit integer support
On 07/01/11 22:04, Joseph S. Myers wrote: I should add: make the type, the new mode, the testcases etc. entirely target-specific; target-independent GCC should not need to know or care about the specifics of this type. It's bad enough target-independent GCC knowing about HImode, SImode, DImode and TImode outside default target hook implementations for targets that use those modes. The idea here is that there is more than one target that supports 40 bit operations, so why shouldn't we have support for it in target-independent code and libgcc? It differs from QI/HI/SImode etc. in that the precision is known and not target-specific. And is there anything wrong with the existing PDImode name? PDImode is so far always defined as MODE_PARTIAL_INT which is handled quite differently (i.e. by not handling it very much at all). IMO it would be a bad idea to overload the name. Bernd
Re: RFC: 40 bit integer support
On Jul 1, 2011, at 4:14 PM, Bernd Schmidt wrote: On 07/01/11 22:04, Joseph S. Myers wrote: I should add: make the type, the new mode, the testcases etc. entirely target-specific; target-independent GCC should not need to know or care about the specifics of this type. It's bad enough target-independent GCC knowing about HImode, SImode, DImode and TImode outside default target hook implementations for targets that use those modes. The idea here is that there is more than one target that supports 40 bit operations, so why shouldn't we have support for it in target-independent code and libgcc? It differs from QI/HI/SImode etc. in that the precision is known and not target-specific. And is there anything wrong with the existing PDImode name? PDImode is so far always defined as MODE_PARTIAL_INT which is handled quite differently (i.e. by not handling it very much at all). IMO it would be a bad idea to overload the name. Would it make sense to fix the not much at all problem? paul
Re: RFC: 40 bit integer support
On 07/01/11 21:49, Joseph S. Myers wrote: On Fri, 1 Jul 2011, Bernd Schmidt wrote: * Should we add an __int40_t keyword, or just do a pushdecl for it? The patch currently does the latter to match __int128_t, but decimal float and fixed-point support uses keywords. This may make a difference for (existing) code using unsigned __int40_t. My advice is neither, initially; leave it to the user to define typedefs using the mode (possibly providing a header that the user can include with -include). __int128_t is legacy and undocumented, with the __int128 keyword (from the x86_64 ABI) being preferred. I also don't recommend adding any types to stdint.h (so eliminating a lot more of the complexity); there is no requirement to have those types there, Well, this was done for compatibility with an existing toolchain (which unfortunately also allows unsigned __int40_t). I guess I could be persuaded that we don't care enough and people must include c6x_intrinsics.h when using gcc. and in any case libc's stdint.h is used in most cases instead of GCC's That just means we'd have to modify newlib and uClibc as well if we wanted to support this. Don't try to understand anything about stdint.h with reference to C99 versions before TC3; there were several defects fixed in the TCs. Is this available anywhere (even if only inside CS?) Bernd
C++ PATCH for c++/48883, c++/49609 (not instantiating function selected with a template-id)
Since DR 115, people have been able to designate a single function specialization with a template-id. When it's then used in a call it gets marked used, but uses in other situations weren't getting marked, leading to undefined symbol errors. Tested x86_64-pc-linux-gnu, applying to trunk. commit 7b1b0825bcc1531d35b4c25a38392105da7c1c95 Author: Jason Merrill ja...@redhat.com Date: Fri Jul 1 12:42:09 2011 -0400 PR c++/48883 PR c++/49609 * pt.c (resolve_nondeduced_context): Call mark_used. diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 4903044..947e19e 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -14679,6 +14679,7 @@ resolve_nondeduced_context (tree orig_expr) } if (good == 1) { + mark_used (goodfn); expr = goodfn; if (baselink) expr = build_baselink (BASELINK_BINFO (baselink), diff --git a/gcc/testsuite/g++.dg/template/explicit-args4.C b/gcc/testsuite/g++.dg/template/explicit-args4.C new file mode 100644 index 000..c64a085 --- /dev/null +++ b/gcc/testsuite/g++.dg/template/explicit-args4.C @@ -0,0 +1,14 @@ +// PR c++/48883 +// { dg-do link } + +templatetypename T +T myMax(T a, T b) { + if(a b) return a; + return b; +} + +int main() { + bool even = true; + int (*fp)(int, int); + fp = even ? myMaxint : myMaxint; /* yields link error */ +}
Re: RFC: 40 bit integer support
On 07/01/11 22:18, Paul Koning wrote: PDImode is so far always defined as MODE_PARTIAL_INT which is handled quite differently (i.e. by not handling it very much at all). IMO it would be a bad idea to overload the name. Would it make sense to fix the not much at all problem? Ideally once I'm done with all this, people could change their PDImode definitions to FRACTIONAL_INT_MODE with whatever precision they need. It certainly would be nice if a suitable target maintainer applied my 11 patches and tried to test such a change :) Bernd
C++ PATCH for c++/48593 ((T::m) being treated as pointer to member in template)
We weren't preserving the parenthesis in the template context. Oops. Tested x86_64-pc-linux-gnu, applying to trunk. commit 9108c3486dd1a408fdd7c64a0393aa5eec1e7a14 Author: Jason Merrill ja...@redhat.com Date: Fri Jul 1 13:43:22 2011 -0400 PR c++/48593 * pt.c (tsubst_qualified_id): Check PTRMEM_OK_P. * tree.c (build_qualified_name): Set PTRMEM_OK_P. * semantics.c (finish_parenthesized_expr): Clear PTRMEM_OK_P on SCOPE_REF, too. * cp-tree.h (PTRMEM_OK_P): Apply to SCOPE_REF, too. (QUALIFIED_NAME_IS_TEMPLATE): Switch to lang flag 1. diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index ef25c97..357295c 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -53,7 +53,7 @@ c-common.h, not after. TREE_INDIRECT_USING (in NAMESPACE_DECL). CLEANUP_P (in TRY_BLOCK) AGGR_INIT_VIA_CTOR_P (in AGGR_INIT_EXPR) - PTRMEM_OK_P (in ADDR_EXPR, OFFSET_REF) + PTRMEM_OK_P (in ADDR_EXPR, OFFSET_REF, SCOPE_REF) PAREN_STRING_LITERAL (in STRING_CST) DECL_PRETTY_FUNCTION_P (in VAR_DECL) KOENIG_LOOKUP_P (in CALL_EXPR) @@ -62,7 +62,6 @@ c-common.h, not after. STMT_EXPR_NO_SCOPE (in STMT_EXPR) BIND_EXPR_TRY_BLOCK (in BIND_EXPR) TYPENAME_IS_ENUM_P (in TYPENAME_TYPE) - QUALIFIED_NAME_IS_TEMPLATE (in SCOPE_REF) OMP_FOR_GIMPLIFYING_P (in OMP_FOR) BASELINK_QUALIFIED_P (in BASELINK) TARGET_EXPR_IMPLICIT_P (in TARGET_EXPR) @@ -86,6 +85,7 @@ c-common.h, not after. LAMBDA_EXPR_MUTABLE_P (in LAMBDA_EXPR) DECLTYPE_FOR_LAMBDA_RETURN (in DECLTYPE_TYPE) DECL_FINAL_P (in FUNCTION_DECL) + QUALIFIED_NAME_IS_TEMPLATE (in SCOPE_REF) 2: IDENTIFIER_OPNAME_P (in IDENTIFIER_NODE) ICS_THIS_FLAG (in _CONV) DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (in VAR_DECL) @@ -3354,7 +3354,7 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter) /* Indicates when overload resolution may resolve to a pointer to member function. [expr.unary.op]/3 */ #define PTRMEM_OK_P(NODE) \ - TREE_LANG_FLAG_0 (TREE_CHECK2 ((NODE), ADDR_EXPR, OFFSET_REF)) + TREE_LANG_FLAG_0 (TREE_CHECK3 ((NODE), ADDR_EXPR, OFFSET_REF, SCOPE_REF)) /* Get the POINTER_TYPE to the METHOD_TYPE associated with this pointer to member function. TYPE_PTRMEMFUNC_P _must_ be true, @@ -3801,7 +3801,7 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter) /* True for a SCOPE_REF iff the template keyword was used to indicate that the qualified name denotes a template. */ #define QUALIFIED_NAME_IS_TEMPLATE(NODE) \ - (TREE_LANG_FLAG_0 (SCOPE_REF_CHECK (NODE))) + (TREE_LANG_FLAG_1 (SCOPE_REF_CHECK (NODE))) /* True for an OMP_ATOMIC that has dependent parameters. These are stored as an expr in operand 1, and integer_zero_node in operand 0. */ diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 947e19e..5743159 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -11344,7 +11344,7 @@ tsubst_qualified_id (tree qualified_id, tree args, expr = (adjust_result_of_qualified_name_lookup (expr, scope, current_class_type)); expr = (finish_qualified_id_expr - (scope, expr, done, address_p, + (scope, expr, done, address_p PTRMEM_OK_P (qualified_id), QUALIFIED_NAME_IS_TEMPLATE (qualified_id), /*template_arg_p=*/false)); } diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index a704aa3..e29705c 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -1504,7 +1504,8 @@ finish_parenthesized_expr (tree expr) /* This inhibits warnings in c_common_truthvalue_conversion. */ TREE_NO_WARNING (expr) = 1; - if (TREE_CODE (expr) == OFFSET_REF) + if (TREE_CODE (expr) == OFFSET_REF + || TREE_CODE (expr) == SCOPE_REF) /* [expr.unary.op]/3 The qualified id of a pointer-to-member must not be enclosed in parentheses. */ PTRMEM_OK_P (expr) = 0; diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c index 678c7ef..dcd85e4 100644 --- a/gcc/cp/tree.c +++ b/gcc/cp/tree.c @@ -1411,6 +1411,7 @@ build_qualified_name (tree type, tree scope, tree name, bool template_p) return error_mark_node; t = build2 (SCOPE_REF, type, scope, name); QUALIFIED_NAME_IS_TEMPLATE (t) = template_p; + PTRMEM_OK_P (t) = true; if (type) t = convert_from_reference (t); return t; diff --git a/gcc/testsuite/g++.dg/template/qualified-id4.C b/gcc/testsuite/g++.dg/template/qualified-id4.C new file mode 100644 index 000..0d97cb8 --- /dev/null +++ b/gcc/testsuite/g++.dg/template/qualified-id4.C @@ -0,0 +1,20 @@ +// PR c++/48593 + +template typename T struct foo +{ + T data; +}; + +templatetypename T struct bar: public fooT +{ + void some_func() + { +T* ptr = (fooT::data); + } +}; + +int main() +{ + barint().some_func(); +} +
Re: [trans-mem] Beginning of refactoring
Attached patch adds a vector-like container implementation and uses it for a transaction's undo log. Also, adds a flag to xmalloc/xrealloc that requests the allocated data to be on cache lines that are not shared with data accessed by other threads (this will get an actual implementation in a later patch). Ok for branch? commit f8953dec8a791e691938cd2abc96eff3b9712024 Author: Torvald Riegel trie...@redhat.com Date: Fri Jul 1 22:18:33 2011 +0200 Add vector-like container, use it for gtm_transaction::undo_log. * containers.h: New file. * util.cc (xmalloc, xrealloc): Accept cacheline-alloc flag. * libitm_i.h (xmalloc, xrealloc): Moved declarations from here ... * common.h: ... to here. (local_undo): Use GTM::vector for gtm_transaction::local_undo. * local.cc: Same. diff --git a/libitm/common.h b/libitm/common.h index d94ca94..9db566a 100644 --- a/libitm/common.h +++ b/libitm/common.h @@ -40,4 +40,24 @@ #define likely(X) __builtin_expect((X) != 0, 1) #define unlikely(X)__builtin_expect((X), 0) +namespace GTM HIDDEN { + +// Locally defined protected allocation functions. +// +// To avoid dependency on libstdc++ new/delete, as well as to not +// interfere with the wrapping of the global new/delete we wrap for +// the user in alloc_cpp.cc, use class-local versions that defer +// to malloc/free. Recall that operator new/delete does not go through +// normal lookup and so we cannot simply inject a version into the +// GTM namespace. +// If separate_cl is true, the allocator will try to return memory that is on +// cache lines that are not shared with any object used by another thread. +extern void * xmalloc (size_t s, bool separate_cl = false) + __attribute__((malloc, nothrow)); +extern void * xrealloc (void *p, size_t s, bool separate_cl = false) + __attribute__((malloc, nothrow)); + +} // namespace GTM + + #endif // COMMON_H diff --git a/libitm/containers.h b/libitm/containers.h new file mode 100644 index 000..13a6cbc --- /dev/null +++ b/libitm/containers.h @@ -0,0 +1,110 @@ +/* Copyright (C) 2011 Free Software Foundation, Inc. + Contributed by Torvald Riegel trie...@redhat.com. + + This file is part of the GNU Transactional Memory Library (libitm). + + Libitm is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + Libitm is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + http://www.gnu.org/licenses/. */ + +#ifndef LIBITM_CONTAINERS_H +#define LIBITM_CONTAINERS_H 1 + +#include common.h +#include stdio.h + +namespace GTM HIDDEN { + +// A simple vector-like container. +// If alloc_seperate_cl is true, allocations will happen on separate cache +// lines. +template typename T, bool alloc_separate_cl = true +class vector +{ + private: + size_t m_capacity; + size_t m_size; + T* entries; + + // Initial capacity of the vector. + static const size_t default_initial_capacity = 32; + // Above that capacity, grow vector by that size for each call. + static const size_t default_resize_max = 2048; + // Resize vector to at least this capacity. + static const size_t default_resize_min = 32; + + // Don't try to copy this vector. + vectorT, alloc_separate_cl(const vectorT, alloc_separate_cl x); + + public: + typedef T datatype; + typedef T* iterator; + + iterator begin() const { return entries; } + iterator end() const { return entries + m_size; } + T operator[] (size_t pos) { return entries[pos]; } + const T operator[] (size_t pos) const { return entries[pos]; } + + vectorT, alloc_separate_cl(size_t initial_size = default_initial_capacity) +: m_capacity(initial_size), + m_size(0) + { +if (m_capacity 0) + entries = (T*) xmalloc(sizeof(T) * m_capacity, alloc_separate_cl); +else + entries = 0; + } + ~vectorT, alloc_separate_cl() { if (m_capacity) free(entries); } + + void resize() + { +if (m_capacity = default_resize_max) + m_capacity = m_capacity + default_resize_max; +else + m_capacity = m_capacity * 2; +if (m_capacity default_resize_min) + m_capacity = default_resize_min; +entries = (T*) xrealloc(entries, sizeof(T) * m_capacity, alloc_separate_cl); + } + void resize_noinline()
C++ PATCH for c++/48261 (ICE with non-template used as template)
Replacing assert with error. Tested x86_64-pc-linux-gnu, applying to trunk. commit e42789795f05e88b0b55c5da0670aa827ad046b2 Author: Jason Merrill ja...@redhat.com Date: Fri Jul 1 14:06:46 2011 -0400 PR c++/48261 * pt.c (lookup_template_function): Handle non-function. diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 5743159..7236e7e 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -6622,8 +6622,12 @@ lookup_template_function (tree fns, tree arglist) return error_mark_node; gcc_assert (!arglist || TREE_CODE (arglist) == TREE_VEC); - gcc_assert (fns (is_overloaded_fn (fns) - || TREE_CODE (fns) == IDENTIFIER_NODE)); + + if (!is_overloaded_fn (fns) TREE_CODE (fns) != IDENTIFIER_NODE) +{ + error (%q#D is not a function template, fns); + return error_mark_node; +} if (BASELINK_P (fns)) { diff --git a/gcc/testsuite/g++.dg/template/template-id-3.C b/gcc/testsuite/g++.dg/template/template-id-3.C new file mode 100644 index 000..e0753ab --- /dev/null +++ b/gcc/testsuite/g++.dg/template/template-id-3.C @@ -0,0 +1,22 @@ +// PR c++/48261 + +typedef double (*gaddType)(double,double); +struct Foo2 +{ + static gaddType add; +}; + +templatetypename T +struct Something +{ + void work() + { +double x=T::template adddouble(5.0,6.0); // { dg-error add } + } +}; + +int main() +{ + SomethingFoo2 s2; + s2.work(); +}
Re: [trans-mem] Beginning of refactoring
On 07/01/2011 01:23 PM, Torvald Riegel wrote: Add vector-like container, use it for gtm_transaction::undo_log. * containers.h: New file. * util.cc (xmalloc, xrealloc): Accept cacheline-alloc flag. * libitm_i.h (xmalloc, xrealloc): Moved declarations from here ... * common.h: ... to here. (local_undo): Use GTM::vector for gtm_transaction::local_undo. * local.cc: Same. Ok. r~
Re: RFC: [GUPC] UPC-related front-end changes
On 07/01/11 19:28:34, Joseph S. Myers wrote: On Fri, 1 Jul 2011, Gary Funck wrote: GF: * Most of the #ifdef conditionals have been removed. Some target macros GF: have been defined and documented in tm.texi. We still have some questions For each target macro, what is the justification requiring it to be a macro rather than a hook documented in target.def with just an @hook line in tm.texi.in? The justification should be one of: [...] Joseph, thanks for the follow up. I have several questions on the best method of documenting and packaging UPC's configuration (and target) specific definition. I will describe those questions and some potential solutions in an email, to follow. - Gary
Re: RFC: 40 bit integer support
On Fri, 1 Jul 2011, Bernd Schmidt wrote: On 07/01/11 22:04, Joseph S. Myers wrote: I should add: make the type, the new mode, the testcases etc. entirely target-specific; target-independent GCC should not need to know or care about the specifics of this type. It's bad enough target-independent GCC knowing about HImode, SImode, DImode and TImode outside default target hook implementations for targets that use those modes. The idea here is that there is more than one target that supports 40 bit operations, so why shouldn't we have support for it in target-independent code and libgcc? It differs from QI/HI/SImode etc. in that the precision is known and not target-specific. Well, the idea of an integer mode with non-target-specific precision is pretty unusual in GCC; normally they are defined as a multiple of QImode (BITS_PER_UNIT). And the existence of this mode is apparently target-specific even if its properties aren't. Apart from the stdint.h code (inappropriate given the lack of a suffix, and where the proliferation of target macros etc. could be avoided once those macros are converted to hooks in some suitable way): * The global tree nodes for various modes are suspicious. Why are they needed at all? * The c_common_type_for_size code using those nodes is suspicious. Front ends shouldn't care about modes like that. Check standard types, otherwise defer to something generic that loops over available types or modes or builds a type as needed. Targets should be able to define integer types and modes as needed - but changing target-independent code for a particular type indicates something is wrong; I wouldn't expect any more target-independent changes than have been associated with floating-point types such as __fp16, __float80 or __float128. There's the odd target hook, and it's necessary to tell libgcc what modes to build for (but in general you have a libgcc function implementation that can be used for more than one mode, depending on the properties of the types, rather than separate implementations per mode). Just as __float80 and __float128 are target-specific types defined by small amounts of target hook code on a couple of targets, I think 40-bit integers should also be like that. And is there anything wrong with the existing PDImode name? PDImode is so far always defined as MODE_PARTIAL_INT which is handled quite differently (i.e. by not handling it very much at all). IMO it would be a bad idea to overload the name. What is the function of having both PARTIAL_INT_MODE and FRACTIONAL_INT_MODE? -- Joseph S. Myers jos...@codesourcery.com
Re: RFC: [GUPC] UPC-related front-end changes
On Fri, Jul 01, 2011 at 11:31:45AM -0700, Gary Funck wrote: @@ -2405,6 +2469,9 @@ struct GTY(()) tree_type_common { alias_set_type alias_set; tree pointer_to; tree reference_to; + /* UPC: for block-distributed arrays */ + tree block_factor; I think this is undesirable. Using a single bit and looking it up in a hash table might be a better solution, the vast majority of the types aren't going to be UPC block distributed arrays. Jakub
Re: RFC: 40 bit integer support
On Fri, 1 Jul 2011, Bernd Schmidt wrote: On 07/01/11 21:49, Joseph S. Myers wrote: On Fri, 1 Jul 2011, Bernd Schmidt wrote: * Should we add an __int40_t keyword, or just do a pushdecl for it? The patch currently does the latter to match __int128_t, but decimal float and fixed-point support uses keywords. This may make a difference for (existing) code using unsigned __int40_t. My advice is neither, initially; leave it to the user to define typedefs using the mode (possibly providing a header that the user can include with -include). __int128_t is legacy and undocumented, with the __int128 keyword (from the x86_64 ABI) being preferred. I also don't recommend adding any types to stdint.h (so eliminating a lot more of the complexity); there is no requirement to have those types there, Well, this was done for compatibility with an existing toolchain (which unfortunately also allows unsigned __int40_t). I guess I could be persuaded that we don't care enough and people must include c6x_intrinsics.h when using gcc. Maybe there's a case for an __int40 keyword (and a built-in macro for __int40_t), but the point at which a second such keyword is added is the point to figure out a generic approach that also covers __int128, __float80 and __float128 - and doesn't need a load of code in the target-independent compiler for each type. And I think that should really be kept separate from simply making the types work in the same way that TImode types worked for years (i.e. using mode attributes to access them, but having the required libgcc functions). This patch does too much. Once the series to make 40-bit types work is in, the next natural step would be adding such a type as target-specific. (FWIW, I wonder if all the new libgcc functions are really useful, or if the core compiler should just handle calling 64-bit functions and truncating as needed for most of the functions.) When there are multiple targets, then share things between only the targets that need them. Don't try to understand anything about stdint.h with reference to C99 versions before TC3; there were several defects fixed in the TCs. Is this available anywhere (even if only inside CS?) C99+TC1+TC2+TC3 is WG14 document N1256, more readily available than the original C99. -- Joseph S. Myers jos...@codesourcery.com
Re: RFC: 40 bit integer support
One more general point: There are further issues around what we might call extended types that behave much like integer and floating-point types, especially for C++; see my comment in PR 43622, and the references therein. How to fix these (again, while avoiding hardcoding references to such types in target-independent code) should also be considered separately. -- Joseph S. Myers jos...@codesourcery.com
[committed] Handle BUILT_IN_ASSUME_ALIGNED in tree-object-size.c
Hi! __builtin_assume_aligned is a pass thru call, preserves object size. Bootstrapped/regtested on x86_64-linux and i686-linux, committed as obvious. 2011-07-01 Jakub Jelinek ja...@redhat.com * tree-object-size.c (pass_through_call): Handle BUILT_IN_ASSUME_ALIGNED. --- gcc/tree-object-size.c.jj 2011-06-15 11:54:43.0 +0200 +++ gcc/tree-object-size.c 2011-07-01 10:23:53.361527213 +0200 @@ -1,5 +1,5 @@ /* __builtin_object_size (ptr, object_size_type) computation - Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010 + Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. Contributed by Jakub Jelinek ja...@redhat.com @@ -464,6 +464,7 @@ pass_through_call (const_gimple call) case BUILT_IN_STRNCPY_CHK: case BUILT_IN_STRCAT_CHK: case BUILT_IN_STRNCAT_CHK: + case BUILT_IN_ASSUME_ALIGNED: if (gimple_call_num_args (call) = 1) return gimple_call_arg (call, 0); break; Jakub
Re: RFC: 40 bit integer support
On 07/01/11 22:36, Joseph S. Myers wrote: On Fri, 1 Jul 2011, Bernd Schmidt wrote: The idea here is that there is more than one target that supports 40 bit operations, so why shouldn't we have support for it in target-independent code and libgcc? It differs from QI/HI/SImode etc. in that the precision is known and not target-specific. Well, the idea of an integer mode with non-target-specific precision is pretty unusual in GCC; normally they are defined as a multiple of QImode (BITS_PER_UNIT). It's unusual only in terms of integer modes. It's analogous to decimal float (where we also check targetm.scalar_mode_supported_p in the frontend to determine whether to support it) or fixed-point support (checked with targetm.fixed_point_supported_p, and the fixed-point modes are all defined with specified precisions). And even for integer modes, there's precedent with TI/__int128_t. And the existence of this mode is apparently target-specific even if its properties aren't. Yes. That's probably a good way to think about it. Again, the analogy would be decimal float or fixed-point. * The global tree nodes for various modes are suspicious. Why are they needed at all? Do you mean only the PImode ones or also intQI_type_node etc.? These are used to pick a suitable type in c_common_type_for_size. * The c_common_type_for_size code using those nodes is suspicious. Front ends shouldn't care about modes like that. It doesn't care about the modes, it just picks a suitable one for a given precision. Note that my patch does not introduce this mechanism, it just extends it. Check standard types, otherwise defer to something generic that loops over available types or modes or builds a type as needed. Please look at the code; this is exactly what is being done. The function checks the standard types, and if it does not find one with an exact match for the precision, it examines the available modes (exposed through intQI_type_node etc.). Targets should be able to define integer types and modes as needed - but changing target-independent code for a particular type indicates something is wrong; I wouldn't expect any more target-independent changes than have been associated with floating-point types such as __fp16, __float80 or __float128. That's because these all fall into the standard C type system of float, double, long double. There's the odd target hook, and it's necessary to tell libgcc what modes to build for (but in general you have a libgcc function implementation that can be used for more than one mode, depending on the properties of the types, rather than separate implementations per mode). Just as __float80 and __float128 are target-specific types defined by small amounts of target hook code on a couple of targets, I think 40-bit integers should also be like that. Floating point types tend not to be defined as bitfields, so naturally there's no support in c_common_type_for_size. This is something that's only useful for integer modes, and I don't think you can avoid this. We could define such a type in each target that supports it, but since I know about at least three machines where this particular feature exists, I think it makes more sense to expose it in target-independent code. I think doing this would also require uglier changes in c_common_type_for_size. If you subtract out all the INT_LEAST40_TYPE etc. support, there really aren't very many changes in the C frontend, and I think the remaining ones are just the same ones we have for TImode - pretty much unavoidable for properly supporting a new integer type that is exposed to programmers. I'll resubmit such a patch in the hope that it'll look more palatable. The fact that the libgcc implementations of typical functions can be used with more than one mode depends on the fact that while we don't know the exact size, we have a nice ladder of types where one is twice as big as the next. int40_t falls outside of that. What is the function of having both PARTIAL_INT_MODE and FRACTIONAL_INT_MODE? Not having to change all the targets using PARTIAL_INT_MODE immediately to use the better mechanism. Bernd
[PATCH] Fix tree-into-ssa.c for debug stmts (PR debug/49602)
Hi! For debug stmt uses, we don't want any PHI nodes to be created just because of them, so the debug uses need to be ignored during decisions which PHI nodes to add. Unfortunately that means get_current_def can't be always trusted. The following patch trusts it in a few cases where I'm reasonably sure it is trustable. Perhaps some more cases could be added in the future if anyone has ideas... Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2011-07-01 Jakub Jelinek ja...@redhat.com PR debug/49602 * tree-into-ssa.c (rewrite_debug_stmt_uses): Disregard get_current_def return value if it can't be trusted to be the current value of the variable in the current bb. * gcc.dg/pr49602.c: New test. --- gcc/tree-into-ssa.c.jj 2011-06-23 10:13:58.0 +0200 +++ gcc/tree-into-ssa.c 2011-07-01 20:39:40.0 +0200 @@ -1343,7 +1343,41 @@ rewrite_debug_stmt_uses (gimple stmt) } } else - def = get_current_def (var); + { + def = get_current_def (var); + /* Check if get_current_def can be trusted. */ + if (def) + { + basic_block bb = gimple_bb (stmt); + basic_block def_bb + = SSA_NAME_IS_DEFAULT_DEF (def) + ? NULL : gimple_bb (SSA_NAME_DEF_STMT (def)); + + /* If definition is in current bb, it is fine. */ + if (bb == def_bb) + ; + /* If definition bb doesn't dominate the current bb, +it can't be used. */ + else if (def_bb !dominated_by_p (CDI_DOMINATORS, bb, def_bb)) + def = NULL; + /* If there is just one definition and dominates the current +bb, it is fine. */ + else if (get_phi_state (var) == NEED_PHI_STATE_NO) + ; + else + { + struct def_blocks_d *db_p = get_def_blocks_for (var); + + /* If there are some non-debug uses in the current bb, +it is fine. */ + if (bitmap_bit_p (db_p-livein_blocks, bb-index)) + ; + /* Otherwise give up for now. */ + else + def = NULL; + } + } + } if (def == NULL) { gimple_debug_bind_reset_value (stmt); --- gcc/testsuite/gcc.dg/pr49602.c.jj 2011-07-01 20:42:44.0 +0200 +++ gcc/testsuite/gcc.dg/pr49602.c 2011-07-01 18:49:04.0 +0200 @@ -0,0 +1,17 @@ +/* PR debug/49602 */ +/* { dg-do compile } */ +/* { dg-options -g -O2 } */ + +static void +foo (int *x) +{ +} + +void +bar (int *x) +{ + int i; + for (i = 0; i == 1; ++i) +x = 0; + foo (x); +} Jakub
Re: RFC: [GUPC] UPC-related front-end changes
On 07/01/11 22:42:55, Jakub Jelinek wrote: On Fri, Jul 01, 2011 at 11:31:45AM -0700, Gary Funck wrote: @@ -2405,6 +2469,9 @@ struct GTY(()) tree_type_common { alias_set_type alias_set; tree pointer_to; tree reference_to; + /* UPC: for block-distributed arrays */ + tree block_factor; I think this is undesirable. Using a single bit and looking it up in a hash table might be a better solution, the vast majority of the types aren't going to be UPC block distributed arrays. Jakub, it is true that the vast majority of types will not have a UPC blocking factor. However, building a hash table will be complicated by questions of the scope and lifetime of the type that references the blocking factor in a hash table. Implementing the hash table sounds like it might be quite a bit of work. Is there precedent for this technique being used in other GCC front-ends? Some other fields in a tree_type_common node are also likely not used by very many types (or in the case of the symtab info. are only used, for example, if debug info. is being generated?): * attributes * reference_to * tree_type_symtab * name Is there some way to use the language specific information hook? struct GTY(()) tree_type_with_lang_specific { struct tree_type_common common; /* Points to a structure whose details depend on the language in use. */ struct lang_type *lang_specific; }; - Gary
Re: RFC: 40 bit integer support
On 07/01/11 23:18, Bernd Schmidt wrote: What is the function of having both PARTIAL_INT_MODE and FRACTIONAL_INT_MODE? Not having to change all the targets using PARTIAL_INT_MODE immediately to use the better mechanism. Also, come to think of it, preventing the rest of the compiler from trying to use such a mode in case the target only supports some very specific operations on it. A port could choose to use PImode, defined in machmode.def (and get __int40_t support), or it could add its own private PDImode to use in specific situations only. Bernd
[Ada] Fix parallel LTO bootstrap
Not clear why this never showed up on the 4.6 branch, but this now prevents a parallel LTO bootstrap with Ada enabled from completing on the mainline. Parallel LTO-bootstrapped, applied on the mainline and 4.6 branch. 2011-07-01 Eric Botcazou ebotca...@adacore.com * gcc-interface/Make-lang.in (gnat1): Prepend '+' to the command. (gnatbind): Likewise. -- Eric Botcazou Index: gcc-interface/Make-lang.in === --- gcc-interface/Make-lang.in (revision 175747) +++ gcc-interface/Make-lang.in (working copy) @@ -476,11 +476,11 @@ TARGET_ADA_SRCS = # Since the RTL should be built with the latest compiler, remove the # stamp target in the parent directory whenever gnat1 is rebuilt gnat1$(exeext): $(TARGET_ADA_SRCS) $(GNAT1_OBJS) $(ADA_BACKEND) libcommon-target.a $(LIBDEPS) - $(GCC_LINK) -o $@ $(GNAT1_OBJS) $(ADA_BACKEND) libcommon-target.a $(LIBS) $(SYSLIBS) $(BACKENDLIBS) $(CFLAGS) + +$(GCC_LINK) -o $@ $(GNAT1_OBJS) $(ADA_BACKEND) libcommon-target.a $(LIBS) $(SYSLIBS) $(BACKENDLIBS) $(CFLAGS) $(RM) stamp-gnatlib2-rts stamp-tools gnatbind$(exeext): ada/b_gnatb.o $(CONFIG_H) $(GNATBIND_OBJS) ggc-none.o libcommon-target.a $(LIBDEPS) - $(GCC_LINK) -o $@ ada/b_gnatb.o $(GNATBIND_OBJS) ggc-none.o libcommon-target.a $(LIBS) $(SYSLIBS) $(CFLAGS) + +$(GCC_LINK) -o $@ ada/b_gnatb.o $(GNATBIND_OBJS) ggc-none.o libcommon-target.a $(LIBS) $(SYSLIBS) $(CFLAGS) # use cross-gcc gnat-cross: force
Re: RFC: [GUPC] UPC-related front-end changes
On Fri, Jul 01, 2011 at 02:34:41PM -0700, Gary Funck wrote: Is there precedent for this technique being used in other GCC front-ends? Yes, look at DECL_VALUE_EXPR/SET_DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P or DECL_DEBUG_EXPR/SET_DECL_DEBUG_EXPR/DEBUG_EXPR_IS_FROM. Jakub
Re: RFC: 40 bit integer support
On Fri, 1 Jul 2011, Bernd Schmidt wrote: On 07/01/11 23:18, Bernd Schmidt wrote: What is the function of having both PARTIAL_INT_MODE and FRACTIONAL_INT_MODE? Not having to change all the targets using PARTIAL_INT_MODE immediately to use the better mechanism. Also, come to think of it, preventing the rest of the compiler from trying to use such a mode in case the target only supports some very specific operations on it. A port could choose to use PImode, defined in We have well-established ways of indicating supported operations with hooks (used for example to limit operations on __fpreg and __fp16 in the front ends) and optabs. -- Joseph S. Myers jos...@codesourcery.com
Re: [pph] Test cleanup (issue4572050)
Hey, so I really like the new clean testing system, so that we always quickly see what we fixed/broked with a local change. One problem now though: `// pph asm xdiff`, only flags for asm diffs, but those could be different diffs after a change (for the better or worse) and this won't be caught. It's probably hard to get something precise on this, but maybe we could simply add the # of lines of diff expected, e.g. `// pph asm xdiff 32`. Then we XFAIL if the number of expected lines in the diff match, but actually fail if the number of lines in the diff is now different. I'm not very familiar with dg.. Is that doable? Would be very helpful at this stage. Gab On Wed, Jun 8, 2011 at 1:44 PM, Lawrence Crowl cr...@google.com wrote: Update the test suite to reflect recent fixes and merges from trunk. The following changes happened in this patch. Conversion of a compilation failure to an assembly diff failure: remove the dg comments and add the comment // pph asm xdiff Conversion of a ICE to a BOGUS change the dg-xfail-if line from ICE to BOGUS change the dg-bogus line from the ICE message to the BOGUS message remove the dg-prune-output lines corresponding to the ICE (dg knows to remove them for errors, but not for ICEs) Update line numbers from a merge (or other edits to regular sources) change the line number from old to new as reflected in the test log The output of make check filtered through the following sed command should yield no FAIL and no XPASS. sed -e ' /-fpph-map=pph.map/ ! { /^XPASS: .*test for bogus messages/ d /^XPASS: .*test for excess errors/ d } /^#/ p /^ERROR: / p /^XFAIL: / p /^XPASS: / p /^FAIL: / p d ' The same command works to filter the log file to its essential record of pass/fail. In particular, it enables listing XFAILs that need work. Index: gcc/testsuite/ChangeLog.pph 2011-06-08 Lawrence Crowl cr...@google.com * lib/dg-pph.exp (dg-pph-pos): Stop redundantly reporting a missing pph assembly as an xfail when its compile xfails. * g++.dg/pph/x1typerefs.cc: Replace ICE xfail with a bogus error xfail. * g++.dg/pph/x1tmplclass.cc: Likewise. * g++.dg/pph/x1autometh.cc: Replace ICE xfail with an assembly diff xfail. * g++.dg/pph/x1special.cc: Remove ICE xfail. * g++.dg/pph/x1functions.cc: Correct xfail to BOGUS rather than ERROR. Remove leftover pruning from previous ICE. * g++.dg/pph/x1template.cc: Update line number of ICE. * g++.dg/pph/x1variables.cc: Update line number of BOGUS. Index: gcc/testsuite/lib/dg-pph.exp === --- gcc/testsuite/lib/dg-pph.exp (revision 174789) +++ gcc/testsuite/lib/dg-pph.exp (working copy) @@ -94,9 +94,7 @@ proc dg-pph-pos { subdir test options ma # Quit if it did not compile successfully. if { ![file_on_host exists $bname.s] } { # Expect assembly to be missing when the compile is an expected fail. - if { [llength [grep $test dg-xfail-if.*-fpph-map]] } { - xfail $nshort $options (pph assembly missing) - } else { + if { ![llength [grep $test dg-xfail-if.*-fpph-map]] } { fail $nshort $options (pph assembly missing) } return Index: gcc/testsuite/g++.dg/pph/x1typerefs.cc === --- gcc/testsuite/g++.dg/pph/x1typerefs.cc (revision 174789) +++ gcc/testsuite/g++.dg/pph/x1typerefs.cc (working copy) @@ -1,6 +1,5 @@ -// { dg-xfail-if ICE { *-*-* } { -fpph-map=pph.map } } -// { dg-bogus c1typerefs.h:11:18: internal compiler error: Segmentation fault { xfail *-*-* } 0 } -// { dg-prune-output In file included from } +// { dg-xfail-if BOGUS { *-*-* } { -fpph-map=pph.map } } +// { dg-bogus c1typerefs.h:11:18: error: cannot convert 'const std::type_info.' to 'const std::type_info.' in initialization { xfail *-*-* } 0 } #include x1typerefs.h Index: gcc/testsuite/g++.dg/pph/x1autometh.cc === --- gcc/testsuite/g++.dg/pph/x1autometh.cc (revision 174789) +++ gcc/testsuite/g++.dg/pph/x1autometh.cc (working copy) @@ -1,6 +1,4 @@ -// { dg-xfail-if ICE { *-*-* } { -fpph-map=pph.map } } -// { dg-bogus x1autometh.h:8:1: internal compiler error: Segmentation fault { xfail *-*-* } 0 } -// { dg-prune-output In file included from } +// pph asm xdiff #include x1autometh.h Index: gcc/testsuite/g++.dg/pph/x1special.cc === --- gcc/testsuite/g++.dg/pph/x1special.cc (revision 174789) +++ gcc/testsuite/g++.dg/pph/x1special.cc (working copy) @@ -1,7 +1,3 @@ -// { dg-xfail-if ICE { *-*-* } {
Re: [pph] Fix executable test detection (issue4635087)
LGTM On 7/1/11, Gabriel Charette gch...@google.com wrote: [string compare dg-do-what run] which was used before would always return true. Thus the tests would no longer even get to the asm diff section... Me and Lawrence tried to find a way to get the content of the dg-do-what variable, but couldn't. We decided to revert to this quick hack fix for now (better then not running the asm diffs...) (I also added an unrelated re-ordering to the order of the pph asm xdiff comment in c1varoder.cc) 2011-07-01 Gabriel Charette gch...@google.com * g++.dg/pph/c1varorder.cc: Moved pph asm xdiff comment to top. * lib/dg-pph.exp (proc): Fixed executable test detection. diff --git a/gcc/testsuite/g++.dg/pph/c1varorder.cc b/gcc/testsuite/g++.dg/pph/c1varorder.cc index 2db8209..a7a65ec 100644 --- a/gcc/testsuite/g++.dg/pph/c1varorder.cc +++ b/gcc/testsuite/g++.dg/pph/c1varorder.cc @@ -1,6 +1,7 @@ -#include c1varorder.h // pph asm xdiff +#include c1varorder.h + int foo(void) { return var1 - var2; diff --git a/gcc/testsuite/lib/dg-pph.exp b/gcc/testsuite/lib/dg-pph.exp index b701ce2..e34bd63 100644 --- a/gcc/testsuite/lib/dg-pph.exp +++ b/gcc/testsuite/lib/dg-pph.exp @@ -74,8 +74,11 @@ proc dg-pph-pos { subdir test options mapflag suffix } { set dg-do-what-default compile dg-test -keep-output $test $options -I. +# Determine whether this is an executable test +set is_exec [llength [grep $test dg-do run]] + # Executables do not generate assembly. -if { ![string compare dg-do-what run] } { +if { !$is_exec } { # Not executable, so quit if it did not compile successfully. if { ![file_on_host exists $bname.s] } { fail $nshort $options (regular assembly missing) @@ -93,7 +96,7 @@ proc dg-pph-pos { subdir test options mapflag suffix } { dg-test -keep-output $test $options $mapflag -I. # Executables do not generate assembly, -if { [string compare dg-do-what run] } { +if { $is_exec } { # and so we are done testing. return } -- This patch is available for review at http://codereview.appspot.com/4635087 -- Lawrence Crowl
Re: [pph] Test cleanup (issue4572050)
On 7/1/11, Gabriel Charette gch...@google.com wrote: One problem now though: `// pph asm xdiff`, only flags for asm diffs, but those could be different diffs after a change (for the better or worse) and this won't be caught. It's probably hard to get something precise on this, but maybe we could simply add the # of lines of diff expected, e.g. `// pph asm xdiff 32`. Then we XFAIL if the number of expected lines in the diff match, but actually fail if the number of lines in the diff is now different. I'm not very familiar with dg.. Is that doable? Would be very helpful at this stage. That looks easy enough. I need to finish the current test stuff before I get to that though. -- Lawrence Crowl
Re: [pph] Fix executable test detection (issue4635087)
I should add that this exposed two assembly comparison failures which slipped in a previous check-in when this problem was still up. Namely: FAIL: g++.dg/pph/x1dynarray0.cc (assembly comparison) FAIL: g++.dg/pph/x1dynarray1.cc (assembly comparison) Lawrence is reorganizing those tests as we speak and told me not to bother marking them as `pph asm xdiff` as part of this check-in. http://codereview.appspot.com/4635087/
[pph] Fix global variable assembly ordering (issue4627087)
As variables are discovered (while parsing the header) they are added to the varpool and their RTL is built. We do not stream, nor the varpool, nor the RTL (and I don't think we want to + that wouldn't work with multiple pph). We want to rebuild the varpool when streaming the global variables of the pph in so as to redefine them in the varpool in the same order they would have been found in a regular #include style parse. I'm not sure whether global variables, not externals is specific enough or too broad (I can't reuse the caller of varpool_finalize_decl (rest_of_decl_compilation) to take care of this logic because it needs some parser state which we no longer have). I will create more tests next week with different orderings for functions, structs, etc. coming in from the pph. For now, this fixes 8 tests :). Tested with bootstrap and pph regression testing. PS: Just realized I didn't put a comment in the code for that fix, should I add one? 2011-07-01 Gabriel Charette gch...@google.com * gcc/cp/pph-streamer-in.c (pph_add_bindings_to_namespace): Rebuild the varpool for global variables coming in from the pph. * gcc/testsuite/g++.dg/pph/c120060625-1.cc: Expect no asm difference. * gcc/testsuite/g++.dg/pph/c1struct.cc: Likewise. * gcc/testsuite/g++.dg/pph/c1variables.cc: Likewise. * gcc/testsuite/g++.dg/pph/c1varorder.cc: Likewise. * gcc/testsuite/g++.dg/pph/c2builtin1.cc: Likewise. * gcc/testsuite/g++.dg/pph/c2builtin2.cc: Likewise. * gcc/testsuite/g++.dg/pph/x1struct2.cc: Likewise. * gcc/testsuite/g++.dg/pph/x1variables.cc: Likewise. diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c index 3ac5243..0ddcc75 100644 --- a/gcc/cp/pph-streamer-in.c +++ b/gcc/cp/pph-streamer-in.c @@ -1157,6 +1157,9 @@ pph_add_bindings_to_namespace (struct cp_binding_level *bl, tree ns) Preserve it. */ chain = DECL_CHAIN (t); pushdecl_into_namespace (t, ns); + + if (TREE_CODE (t) == VAR_DECL TREE_STATIC (t) !DECL_EXTERNAL (t)) + varpool_finalize_decl (t); } for (t = bl-namespaces; t; t = chain) diff --git a/gcc/testsuite/g++.dg/pph/c120060625-1.cc b/gcc/testsuite/g++.dg/pph/c120060625-1.cc index d09be39..05c7929 100644 --- a/gcc/testsuite/g++.dg/pph/c120060625-1.cc +++ b/gcc/testsuite/g++.dg/pph/c120060625-1.cc @@ -1,2 +1 @@ -// pph asm xdiff #include c120060625-1.h diff --git a/gcc/testsuite/g++.dg/pph/c1struct.cc b/gcc/testsuite/g++.dg/pph/c1struct.cc index 88c215b..8de9166 100644 --- a/gcc/testsuite/g++.dg/pph/c1struct.cc +++ b/gcc/testsuite/g++.dg/pph/c1struct.cc @@ -1,5 +1,3 @@ -// pph asm xdiff - #include c1struct.h thing var1; diff --git a/gcc/testsuite/g++.dg/pph/c1variables.cc b/gcc/testsuite/g++.dg/pph/c1variables.cc index 12a5f30..4d3c5a7 100644 --- a/gcc/testsuite/g++.dg/pph/c1variables.cc +++ b/gcc/testsuite/g++.dg/pph/c1variables.cc @@ -1,5 +1,3 @@ -// pph asm xdiff - #include c1variables.h int gbl_initial = 1; diff --git a/gcc/testsuite/g++.dg/pph/c1varorder.cc b/gcc/testsuite/g++.dg/pph/c1varorder.cc index a7a65ec..2791427 100644 --- a/gcc/testsuite/g++.dg/pph/c1varorder.cc +++ b/gcc/testsuite/g++.dg/pph/c1varorder.cc @@ -1,5 +1,3 @@ -// pph asm xdiff - #include c1varorder.h int foo(void) diff --git a/gcc/testsuite/g++.dg/pph/c2builtin1.cc b/gcc/testsuite/g++.dg/pph/c2builtin1.cc index 9a3a7a1..67327cb 100644 --- a/gcc/testsuite/g++.dg/pph/c2builtin1.cc +++ b/gcc/testsuite/g++.dg/pph/c2builtin1.cc @@ -1,5 +1,3 @@ -// pph asm xdiff - #include c2builtin1.h #include c2builtin2.h #include c2builtin3.h diff --git a/gcc/testsuite/g++.dg/pph/c2builtin2.cc b/gcc/testsuite/g++.dg/pph/c2builtin2.cc index 186ebcf..0ccae57 100644 --- a/gcc/testsuite/g++.dg/pph/c2builtin2.cc +++ b/gcc/testsuite/g++.dg/pph/c2builtin2.cc @@ -1,4 +1,3 @@ -// pph asm xdiff #include a2builtin4.h #include c2builtin5.h #include c2builtin6.h diff --git a/gcc/testsuite/g++.dg/pph/x1struct2.cc b/gcc/testsuite/g++.dg/pph/x1struct2.cc index be1af39..f31fb8c 100644 --- a/gcc/testsuite/g++.dg/pph/x1struct2.cc +++ b/gcc/testsuite/g++.dg/pph/x1struct2.cc @@ -1,5 +1,3 @@ -// pph asm xdiff - #include x1struct2.h type D::method() diff --git a/gcc/testsuite/g++.dg/pph/x1variables.cc b/gcc/testsuite/g++.dg/pph/x1variables.cc index 0f0814f..f9fd76e 100644 --- a/gcc/testsuite/g++.dg/pph/x1variables.cc +++ b/gcc/testsuite/g++.dg/pph/x1variables.cc @@ -1,5 +1,3 @@ -// pph asm xdiff - #include x1variables.h int D::mbr_init_plain = 4; -- This patch is available for review at http://codereview.appspot.com/4627087
Re: [pph] Stream first/weak_object_global_name (issue4641086)
So I did find a better way of doing this, see Issue #4627087. I'll go ahead and close this issue. On 2011/07/01 01:27:26, Gabriel Charette wrote: notice_global_symbol is actually called in the parser (before we stream out). I just confirmed this by setting a break point on the function in the pph generation of c1varoder.pph and hitting the function twice for the two variables defined in c1varorder.h It looks like rtl and assembler_name are set BEFORE we stream out... so that we would need to stream them (looks like assembler_name is already streamed, but not rtl...). Not streaming rtl causes it to be recreated later (which i think might be the cause of the bad ordering). Now another problem is that when two pph are generated, they are not aware of each other (and of their respective first/weak_global_object_name...) so even if we stream those variables, their could still be conflicts (and/or asm diffs..) However, I'm not sure about how make_decl_rtl works, but if it can regenerate the same rtl we had before (and even potentially solve the conflicts mentioned in the previous paragraph), we may not need to actually stream the rtl, we would just need to fix the ordering as the pph rtl's are being regenerated. (they are regenerated because DECL_RTL(NODE) either returns (NODE)-decl_with_rtl.rtl or calls make_decl_rtl(NODE) if it's NULL...) If we do this, I think we wanna have the first/weak_object_global_name set to what it should be by streaming it in from the first pph, to ensure all the rtl and assembler_name choices are made correctly (I'm not sure I understand all of this, but it seems to rely on those fields). Gab On Thu, Jun 30, 2011 at 4:33 PM, Diego Novillo mailto:dnovi...@google.com wrote: On Thu, Jun 30, 2011 at 16:24, Gabriel Charette mailto:gch...@google.com wrote: first/weak_global_object_name are part of the global state. This seems to be used to produce the assembler names. They are set only once as early as possible in the parsing; thus we should define it to be whatever it was in the first pph (or even in the C file if it was set before any pph was even loaded. I'm not sure exactly what this does, but trying to find a fix for the asm diffs this is the first thing that I hit in the compiler logic that was different in the good compiler vs the pph compiler. With this fix this bit of logic is now correct. However, this doesn't fix any pph test (nor does it even change any pph asm (I diff'ed the old bad assemblies (*.s+pph) with the new ones)). This does not look right to me. nbsp;These two symbols are set when calling notice_global_symbol, which is done during code generation (you will see it called from cgraph_finalize_function, cgraph_mark_reachable_node, etc). You are probably close to the cause of this relative ordering problem, but streaming these two globals is not the solution. nbsp;They both should be set in the same order by both compilers, but not by forcing them this way. One thing that may be happening here is that the order in which we call cgraph_finalize_function is different in the two compilers. That's what needs to change. nbsp;One thing you could do is set a breakpoint in notice_global_symbol. nbsp;The two compilers should be calling it in the same order. Diego. http://codereview.appspot.com/4641086/